JSBSim for SPEC CPUv8 #834
Hi @heshpdx, sounds interesting. Having a quick glance it looks like the closing date for the first step submission is 13 Mar 2023.
I'm a bit confused about the reference link to my pull request for - "Add support for forces in stability axes and moments in wind and stability axes" |
Ah, just realised you probably just typed "hash 2" and Github auto-completed that to reference a pull request, with the 2nd pull request in the Github repo happening to be my pull request at the time. |
Yes, the first step is submission of entry, which takes five minutes. That is what the 13 Mar 2023 deadline is for. Sorry about referencing PR 2! I edited and removed the link. |
I agree with Sean – this is very interesting and I think it is appropriate to use a JSBSim case as a potential benchmark.
|
@jonsberndt do you want to go ahead and fill in the submission entry? You have the right initials 😉 |
I just received the first portability report, from gcc/mingw running on Windows server 2019. As an example, this is the kind of problems we solve and then upstream the fix if needed. Do we expect issues with mingw or with MSVC?
|
No we don't expect any issues with mingw and MSVC. Each commit results in builds for all the supported platforms, e.g. see https://github.com/JSBSim-Team/jsbsim/pull/836/checks Plus a number of tests are run each time as well. In terms of the particular error report above, how is JSBSim being built? Is it using our CMake setup? In particular I'm wondering whether there is some missing define to do with JSBSIM_API and dll import? |
Thanks, your info provides proof that I am missing something. The SPEC CPU harness builds code differently than CMake. I had to break humpty-dumpty apart and put him back together again, and perhaps we are missing a flag. We can't support dll import so that functionality is turned off; that is probably it. |
Take a look at: Lines 97 to 111 in ec99f79
And: Lines 34 to 47 in ec99f79
|
LoL – sorry I didn’t see this until just now. I might not be able to get to it until this weekend. If there’s no rush and no one else has addressed it, I’d be glad to.
|
FYI, I am reviewing the benchmark entry form at https://www.spec.org/cpuv8/entry_form.html and intend to submit the entry this evening or tomorrow.
|
Yep, I was going to say, no immediate rush, just have to make sure it's done by the submission deadline of 13 March 2023. |
I submitted JSBSim for consideration this evening. |
Thanks! The committee voted yesterday and accepted it into the search program. I will continue to be your project contact. 😄 I am curious about your opinions on building with
We began testing with There are some nuances in SPEC rules, to separate FP benchmarks from INT benchmarks. The rules state that benchmarks that run more than 10% FP instructions are designated as FP, and less than 1% are considered INT. The ones in between 1% and 10% are in no-mans-land and are requested to fix their profile if they can. It turns out that JSBSim comes in at 6% FP instructions. Now, the rules can be changed and I know there will be discussion as to what it means to be an FP application because I am working with multiple benchmarks that come in around 5% FP. I predict that the committee will categorize JSBSim as integer. If this is the case, then the issue above is a moot point because in general, the INT suite does not support verification with |
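For readers following along, the FP/INT thresholds quoted above can be written down as a tiny sketch. The function name and the "undecided" label are made up for illustration; only the 10% and 1% cut-offs and the 6% figure come from the comment above, and the exact handling of the boundary values is an assumption.

```cpp
#include <string>

// Classify a benchmark by its share of floating-point instructions.
// fp_fraction is in [0, 1], e.g. 0.06 means 6% of instructions are FP.
// Thresholds as quoted in the thread; boundary handling is assumed.
std::string classify_benchmark(double fp_fraction) {
    if (fp_fraction > 0.10) return "FP";   // more than 10% FP instructions
    if (fp_fraction < 0.01) return "INT";  // less than 1% FP instructions
    return "undecided";                    // the gap between 1% and 10%
}
```

With the measured 6% FP instructions, `classify_benchmark(0.06)` lands in the undecided band, which is why the committee has to discuss it.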
Also, can someone help craft cmdlines? I naively ran all the examples and selected the ones which were running the longest. I came up with these:
But when running these back-to-back we just clocked 50 seconds on our reference machine. We need a runtime on the order of 180 to 300 seconds of interesting work, which motivates this post. I would love suggestions, especially scripts that exercise different parts of the code, and use different aircraft or engine models. It doesn't have to be in the repository; if you have large models of your own I welcome those with open arms. Thanks! |
I will try and post some scripts within the next week or so if that's OK. |
Can you elaborate on this? JSBSim is not an integer application (edit: at least I don't think of it as one). Is there some more context that can be provided? |
new cmdlines: sure thing, we are not in a super-rush. We have about 7 weeks to get this right. FP vs INT: Step 3f at the rules: https://spec.org/cpuv8/#step3f
The current profile shows that 6% of total instructions are FP. Your opinion matters! If you think it is an FP app then we can take that into account. When I look at the perf profile, I see JSBSim has a much higher than average number of icache and itlb misses (commensurate for a simulator), which is not usually found in "traditional FP" apps. Or maybe the "traditional FP" apps in SPEC CPU are more focused on HPC and not simulation. Let's see what happens; this is not a deal-breaker by any means. Actually it forces the committee to have a discussion. |
It's a very interesting discussion! When I think of JSBSim code I think of the equations of motion, which are a very math-intensive set of operations (matrix and vector operations). In cases where there is data logging going on there would be some file IO. Depending on which aircraft is being modeled there are varying numbers and complexities of what I think of as math models: atmosphere, gravity, engines, aerodynamics, etc. I'm surprised that the actual floating point operations are a relatively small portion of what the CPU sees. I'm considering some scenarios that might lead to a higher percentage of FP operations. I'll try and supply some new scripts over the weekend. Also, I can't say that I ever coded in a way that considered optimization for the CPU. I'm curious if there are [non-draconian] things we can do to the code that might improve performance. |
@heshpdx can you share a link to the profile results? |
@seanmcleod I ran cachegrind this evening and that was interesting. Not entirely sure how to interpret all the output or to use it for anything yet, but I ran this:
This can be turned into something more readable like this:
Seems that find_child() (a property system call) represents over 3% of the calls in a JSBSim run - at least for the example I ran. |
I also see
|
Attaching some details. Prof is just the info I pasted above. |
I ran cppcheck on the source tree to check for any crazy errors. (This was more out of curiosity than anything else, since I have proposed cppcheck as another candidate for CPUv8.) In case it is helpful I put the output below. Some of the issues tend to be innocuous or irrelevant, so your mileage may vary.
|
Hi! Any luck in making longer scripts? Thanks! |
Hi, Mahesh, Sorry it has taken me a while to get back to you. Define "interesting work". ;-) There is one script that can pretty easily be modified to run for any length of time. That is the scripts/ball_orbit.xml script. I went in and changed that as follows:
That is, the end time was increased to 86400 seconds. It ran for about ten minutes. I'm thinking if you set that to 30000 then the sim should run for about 4 minutes. But, is that "interesting"? I think I could also modify one of the C310 scripts to fly a route for that long. Jon |
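As a rough aid for picking an end time, here is a minimal sketch that assumes wall-clock time scales linearly with the simulated end time, which seems plausible for a fixed-step orbit run but is not something measured in this thread. The function name is hypothetical, and the reference numbers (86400 simulated seconds taking about ten minutes) are from the comment above on that particular machine, not the SPEC reference machine.

```cpp
// Estimate the script end time (in simulated seconds) needed to hit a
// target wall-clock runtime, assuming runtime is proportional to end time.
// ref_end_time_s / ref_wall_s describe one measured run.
double end_time_for_runtime(double ref_end_time_s,
                            double ref_wall_s,
                            double target_wall_s) {
    return ref_end_time_s * (target_wall_s / ref_wall_s);
}
```

Under that assumption, `end_time_for_runtime(86400, 600, 240)` suggests an end time of roughly 34560 for a four-minute run, in the same ballpark as the 30000 guessed above.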
If the work is repetitive (the same code is being executed over and over with same/similar data), then it is not interesting. If a script exercises different parts of the code with different data, that would be ideal. If I were to use knowledge of my own CPU simulator to make an analogy to your simulator (even though it is a different domain), it would be something like the following. Not interesting:
Interesting:
When I pitched the benchmark to the committee, I had the aircraft directory open, and they saw the model for a pterodactyl and got excited that we would have that in one of the cmdlines. I don't see any scripts with pterodactyl though. Can we use that model somehow? Do others have more complex scripts which simulate other kinds of phenomena? I saw people on the forum asking for Martian atmosphere for the NASA helicopter. Also, some of the scripts where the aircraft crashes - those are good too. |
I'd define it in terms of good code coverage 😉 It looks like multiple scripts are being run, i.e. we don't have to get everything done in a single script. In terms of coverage I'd suggest the following as a starting point:
|
Thank you for a much better description. Yes, different types of engines and crafts, and timed events. |
If it works to have multiple scripts involved to interestingly exercise more of the code instead of a single, mythical "super-interesting" script ;-) then I think Sean's suggestion is a great way to go. I will note, though, that at the lowest levels much of the same code is being continuously exercised, just with different data. The C310 script that takes off and flies a route is pretty interesting for a bit at least. I think it still works. Also, would perhaps running a couple of instances of JSBSim simultaneously be helpful? I'll see if I can come up with a batch of runs to make. |
I work on the SPEC CPU committee as a colleague of @heshpdx. I work on porting and profiling workloads on Windows. One task is to ensure that the work done and the workload outputs are consistent between Windows and other OSes. I am investigating a case where the output produced for the script ball_orbit.xml is different when I compile JSBSim with the Microsoft MSVC compiler, versus when I compile it with the gcc compiler in a MinGW environment. The discrepancy is so precise & consistent that I hoped an expert in JSBSim might recognize what variable or function might be going awry under MSVC. To be clear, I don't think this is an issue in JSBSim. In a few specific and consistent cases, the MSVC-compiled version reports a value of "Argument of perigee" that is (360 - X) compared to the gcc-compiled version. I noticed that this occurs when (and only when) the matching "True anomaly" value is ~ 360. An output example when compiled with MinGW/gcc:
The same output when compiled with MSVC:
I can "correct" this difference in the output by inserting this into src/models/FGPropagate.cpp#533:
|
To augment @bowers's post, we recently taught SPEC CPU's verification harness to tolerate coterminal angles. Therefore, if the MSVC output for Argument of perigee were reported as -158.5° instead of +158.5°, that would be coterminal with +201.5° (which is the "golden reference output") and we would not be in this predicament because the output would verify. So perhaps that gives a clue as to what is happening. |
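To make "coterminal" concrete, here is a minimal sketch of a tolerance check that treats angles differing by a multiple of 360 degrees as equal. This is not SPEC's actual harness code, which is not shown in this thread; the function name and tolerance parameter are hypothetical.

```cpp
#include <cmath>

// Return true if two angles (in degrees) agree up to a multiple of 360,
// within tol_deg. std::remainder folds the difference into [-180, 180],
// so any whole number of turns collapses to (nearly) zero.
bool angles_match(double expected_deg, double actual_deg, double tol_deg) {
    double diff = std::remainder(actual_deg - expected_deg, 360.0);
    return std::fabs(diff) <= tol_deg;
}
```

With this check -158.5 would verify against a reference of 201.5, while +158.5 would not, which matches the predicament described above.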
Okay, just double-checking, so it looks like the only assignments to And in the example output above, the jsbsim/src/models/FGPropagate.cpp Lines 518 to 529 in ae4652a
And so the difference between gcc and MSVC is due to a different result between them with regards to line 523 and this conditional: if (eVector(eZ) < 0) |
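To illustrate why that conditional is so sensitive, here is a sketch using the textbook formula for the argument of perigee with its quadrant correction. It is not a copy of the FGPropagate.cpp lines referenced above; the function name, the `Vec3` alias, and the choice of the x axis as the node vector for an equatorial orbit are all assumptions made for the example.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Textbook argument of perigee: the angle between the node vector and the
// eccentricity vector, reflected to 360 - omega when the eccentricity
// vector points below the equatorial plane (negative z component).
double argument_of_perigee_deg(const Vec3& node, const Vec3& ecc) {
    const double kRadToDeg = 180.0 / std::acos(-1.0);
    double dot = node[0] * ecc[0] + node[1] * ecc[1] + node[2] * ecc[2];
    double n = std::sqrt(node[0] * node[0] + node[1] * node[1] + node[2] * node[2]);
    double e = std::sqrt(ecc[0] * ecc[0] + ecc[1] * ecc[1] + ecc[2] * ecc[2]);
    double omega = std::acos(dot / (n * e)) * kRadToDeg;
    if (ecc[2] < 0.0) omega = 360.0 - omega;  // the sign test in question
    return omega;
}
```

Feeding in an eccentricity vector of roughly (-0.0014013, 0.00055310, 0) gives about 158.5 degrees, while changing the z component to a tiny negative value such as -1e-20 flips the answer to about 201.5. For an equatorial orbit the z component is pure rounding noise, so two compilers can legitimately land on different sides of the test.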
Okay, so doing a quick lookup on And taking a look at the output for MSVC I see the following pattern:
Which looks incorrect, i.e. between time 2400 and 3000 it implies the ball has moved in reverse, a value of 201.5 would make more sense. Taking a look in the debugger for time 3000,
VState.vInertialVelocity - {-9352.1465544806997, -23694.549595047396, 0.0000000000000000}
angularMomentum - {0.0000000000000000, 0.0000000000000000, 553426172137.42932}
eVector - {-0.0014013482342756234, 0.00055309720560003939, 0.0000000000000000} @jonsberndt, @bcoconni can you spot anything obvious going wrong? |
Hmm, trying to picture the orbit I took a look at the reset file used by the script. jsbsim/aircraft/ball/reset00_v2.xml Lines 8 to 19 in ae4652a
So it appears that the ball starts off at (lat, lon) of (0, 0) and then given the yaw angle specified and the x-body velocity it starts an equatorial orbit? Which seems to match in terms of the initial state report looking at the ECEF coordinates and ECEF velocity. ------------------------------------------------------------------
State Report at sim time: 0.000000 seconds
Position
ECI: 21725646.32546, 0, 0 (x,y,z, in ft)
ECEF: 21725646.325460 , 0.000000 , 0.000000 (x,y,z, in ft)
Local: 0.000000, 0.000000, 800000.000000 (geodetic lat, lon, alt ASL in deg and ft)
Orientation
ECI: -89.99999999999999, -0, 90 (phi, theta, psi in deg)
Local: 1.272221872585406e-14, -1.412450015376051e-30, 90.00000000000001 (phi, theta, psi in deg)
Velocity
ECI: 0, 25473.40428154583, 0 (x,y,z in ft/s)
ECEF: 0, 23889.14516700002, 0 (x,y,z in ft/s)
Local: 0.000000 , 23889.145167 , 0.000000 (n,e,d in ft/sec)
Body: 23889.145167 , 0.000000 , 0.000000 (u,v,w in ft/sec)
Body Rates (relative to given frame, expressed in body frame)
ECI: 0, 0, 9.277188200408133e-19 (p,q,r in deg/s)
ECEF: 0, 0.004178074132240404, 0 (p,q,r in deg/s)
---- JSBSim Execution beginning ... --------------------------------------------
Time Notify (Event 0) executed at time: 0.005000
Height AGL (ft): = 800000.000000
Altitude (Geod, ft): = 800000.000000
Latitude (Geod, deg): = 0.000000
Vehicle Radius (ft): = 21725646.325460
Inertial Vel Mag (ft/s): = 25473.404282
Body X Velocity (ft/s): = 23889.145167
Simulation Frame: = 1.000000
Density: = 0.000000
Angular momentum (ft^2/s) = 553426172126.323120
Inclination (deg) = 0.000000
Right ascension (deg) = 0.000000
Argument of perigee (deg) = 0.000000
Period (s) = 5374.952766
Eccentricity = 0.001507
Apoapsis radius (ft) = 21791206.646590
Periapsis radius (ft) = 21725646.325460
True anomaly (deg) = 0.000000
Start: Monday October 23 2023 22:17:03 (HH:MM:SS) So as a complete orbital mechanics layman, shouldn't the |
In case it matters, here is the change I made in the
|
I'm still probing. Two observations: I can get gcc to report same values as MSVC by replacing |
So in my mind jsbsim/src/models/FGPropagate.cpp Lines 507 to 515 in ae4652a
|
https://oer.pressbooks.pub/lynnanegeorge/chapter/chapter-3-the-classical-orbital-elements-coes/
So we should be detecting all these cases and setting the argument of perigee to 0. I'll submit a pull request to fix this. |
Another way to maybe get around the argument of perigee problem is to consider using the new script I committed that has an orbit with a non-zero inclination:
Note that this script requires an update to the code and a recompile. |
I noticed that JSBSim runs have an unexpectedly large number of system calls (as measured by strace). I looked further and it turns out the vast majority are write syscalls. The big culprits are scripts Short_S23 and ah1s_flight_test. I debugged and figured out the reason. Every time we print a line to stdout, we also flush it. Each flush invokes
I went through and ran The reason why this is important: when SPEC CPU users run the benchmark, they will typically spawn one copy per core, so for a 192-core system you will have 192 app instances all on top of a single operating system handling the I/O. So if we have a large storm of small writes, the operating system would become the bottleneck for the benchmark. We want to keep the OS out of the way as much as possible, so we can measure system hardware scalability. That is why I did this conversion. This brings us closer to proposing Step 4 completion for the benchmark candidate. Anyway, just sharing this tidbit with the community in case it helps. I could provide a PR if you are interested, but it is super easy to do yourself with the sed cmdline. |
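For anyone who wants to see the flush behaviour in isolation, here is a small self-contained sketch. It does not touch JSBSim code: `CountingBuf` and `count_flushes` are hypothetical names, and the buffer simply counts how often the stream asks it to synchronise, which is what a flush on stdout would turn into a write.

```cpp
#include <ostream>
#include <sstream>

// A string buffer that counts how many times the stream flushes it.
class CountingBuf : public std::stringbuf {
public:
    int syncs = 0;
protected:
    int sync() override { ++syncs; return 0; }
};

// Write `lines` lines, ending each with std::endl or with '\n',
// and report how many flushes the buffer saw.
int count_flushes(bool use_endl, int lines) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        out << "line " << i;
        if (use_endl) out << std::endl;  // newline plus flush
        else          out << '\n';       // newline only
    }
    return buf.syncs;
}
```

Writing 1000 lines with `std::endl` triggers 1000 flushes here, while the `'\n'` variant triggers none and leaves the buffering policy to the stream.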
Coincidentally @bcoconni and I have just been looking at the performance impact of The new logging implementation is still in progress and we'll take this into account. Also in terms of launching hundreds of instances of JSBSim like you mention I had made the following mention in #1094 (comment) when the initial logging design was being proposed:
So it is something we are aware of handling, i.e. large scale high performance cases and will continue to work on. For example, in this sort of case: (base) C:\source\jsbsim>$env:JSBSIM_DEBUG=0
(base) C:\source\jsbsim>measure-command { .\Release\JSBSim.exe --script=scripts\ah1s_flight_test.xml }
9.107s Should there be a way to suppress all output except say exception messages, i.e. don't even build the strings even if they're going to be well buffered. |
When you say "ten command lines" do you mean a script that runs 10 different JSBSim script examples? I had a quick look at The majority of the output from State report (Event 26) executed at time: 2700.000000
velocities/vc-kts = 151.830680
fcs/SFD/trim/v_dir_kts = 163.383721
position/h-agl-ft = 2731.074868
position/h-sl-ft = 5014.598617
attitude/heading-true-rad = 3.141579
aero/alpha-deg = -5.744751
aero/beta-deg = -0.141843
aero/mu = 0.367578
aero/qbar-psf = 77.841712
aero/force/D0_fus = 809.164597
velocities/u-aero-fps = 274.374860
velocities/v-aero-fps = -0.682678
velocities/w-aero-fps = -27.602704
velocities/h-dot-fps = -0.000002
propulsion/engine/rotor-rpm = 324.000000
position/distance-from-start-mag-mt = 203791.502611
propulsion/engine/torque-lbsft = 13780.905888
propulsion/engine/bell-torque-sensor-psi = 49.998568 We haven't updated the script code yet to make use of the new logging functionality, which does include a level of buffering before issuing a flush. With the property output coming from this code: jsbsim/src/input_output/FGScript.cpp Lines 508 to 513 in 4d93022
I changed the |
Yes, I have ten different cmdlines that are run in order, and ah1s is one of them. I modified some of the scripts, e.g. decreasing Thanks for sharing the info about what is happening on the mainline. The modern logging and buffering is the way to go. I just cloned your latest and ran it against my ah1s script using strace:
My copy has the replacements of "\n" for std::endl, and we can see the delta in |
Okay, so if not explicitly set the debug level is set to 1. Line 78 in 4d93022
And with it set to 1, that includes the echoing of the config files, so the redirected output from the ah1s script now totals 2,889 lines, up from the 1,145 lines I'd seen previously with a debug level of 0. So assuming a flush/write for every output line, then the 2,889 lines makes up the vast majority of the 3,413 writes your trace shows above. |
I ran up coverage metrics using gcov. It may be useful to this team, so I am attaching it here: fdm.zip Coverage reports have helped us in various efforts including portability as well as performance analysis. On the latter, I found some quick optimizations that I will share in a PR shortly. |
Hmm, so if you look at the Lines 86 to 96 in 8e4803b
Now in your output you can see that both the non-PowerPC and PowerPC versions trigger this crash event at the same time So is there some logic in the code handling the terminate request that is buggy due to big-endian mode? Or is it some compiler issue/difference not specific to endianness? Hmm, so looking at the relevant code we have Line 640 in 8e4803b
Line 171 in 8e4803b
Line 93 in 8e4803b
Lines 442 to 445 in 8e4803b
|
Wow, that cast is likely the problem. We could change it to |
Or, the third option which I just tried on x64 MSVC successfully running bool Terminate;
instance->Tie("simulation/terminate", (bool *)&Terminate); |
Hmm, so noticed the following while glancing at this. // FGFDMExec.h
bool holding;
unsigned short Terminate;
bool HoldDown;
void SetHoldDown(bool hd);
bool GetHoldDown(void) const {return HoldDown;}
// FGFDMExec.cpp
holding = false;
Terminate = false;
HoldDown = false;
instance->Tie("simulation/pause", (int *)&holding);
instance->Tie("simulation/terminate", (int *)&Terminate);
instance->Tie("forces/hold-down", this, &FGFDMExec::GetHoldDown, &FGFDMExec::SetHoldDown); So I'm thinking we should look at getting this more type consistent. bool Terminate;
instance->Tie("simulation/pause", (bool *)&holding);
instance->Tie("simulation/terminate", (bool *)&Terminate); I'll test it out on x64 locally, can you test the proposed changes out on PowerPC? |
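One plausible mechanism for the big-endian failure, sketched without invoking the undefined behaviour of the real cast: if the property system writes a full 4-byte int through a pointer that really addresses a 2-byte `unsigned short`, the low-order byte lands inside the variable on little-endian machines but outside it on big-endian ones. The helper below is purely a model with a hypothetical name; it assumes a 4-byte int and a 2-byte short, and that the write goes through the tied `int *`.

```cpp
#include <cstdint>

// Model a 4-byte int write landing at the address of a 2-byte variable.
// Returns the value the 2-byte variable holds afterwards, for either
// byte order. Bytes 2 and 3 would belong to neighbouring members.
std::uint16_t short_after_int_write(std::uint32_t value, bool big_endian) {
    unsigned char bytes[4];
    for (int i = 0; i < 4; ++i) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        bytes[i] = static_cast<unsigned char>((value >> shift) & 0xFFu);
    }
    // The 2-byte variable occupies bytes[0] and bytes[1] only.
    if (big_endian) {
        return static_cast<std::uint16_t>((bytes[0] << 8) | bytes[1]);
    }
    return static_cast<std::uint16_t>((bytes[1] << 8) | bytes[0]);
}
```

In this model, writing 1 leaves the short equal to 1 on little-endian but 0 on big-endian, so a terminate request would never be seen there, and the stray bytes would overwrite whatever member follows. Tying a variable of the matching type avoids both problems.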
The variable has been defined as a @seanmcleod may be the member |
Oh ! You already did that. Sorry... |
I was the one that originally reported the BE problem to @heshpdx and he pushed the fix in our internal repo. Tried it and it worked on AIX Power10 big endian. Thanks very much for the quick investigation and fix, this was great. |
@ZarkoT out of interest what are you using JSBSim for on AIX Power10? Or were you just using it for SPEC benchmarking? |
@seanmcleod I'm just using it through SPEC. |
Hi @bcoconni and friends,
I’m a CPU architect at Ampere Computing where I do performance analysis and workload characterization. I also serve on the SPEC CPU committee, searching for benchmarks for the next version of SPEC CPU. We try to find computationally intensive workloads in diverse fields, to help measure performance across a wide variety of behaviors and application domains. Based on the longevity of JSBSim, its large active community in aeronautics, education, and game play, I am proposing the JSBSim flight dynamics models be included in the next set of marquee benchmarks in SPEC CPU. The committee is running a “search program” or contest, where people can submit their benchmark ideas. I am humbly asking if you can please submit JSBSim for consideration. You can see all the information here, along with the submission form: https://spec.org/cpuv8/ There are monetary rewards for the submitter if the entry is chosen to move forward.
Full disclosure: Since I am on the committee, I cannot submit entries for winning prizes. But you can, and I encourage you to do so, because:
Full disclosure 2: I have already done 90% of the work. I took the JSBSim mainline source, placed it into the SPEC harness, and wrangled it to build and run on multiple platforms with command lines that conform to memory footprints required by the benchmark rules. I started the process of verification of results across multiple platforms, which led me to file PR #830. The part that is remaining is the documentation, fixing cases where the simulation script output is slightly different between optimization levels, and adjusting the scripts to satisfy the run-length requirements for CPUv8 rules. There is also opportunity to use different scripts/aircraft/engines other than the examples that I dug up while browsing the repo.
There may be other things that we find that could be pushed back to the JSBSim mainline, as we work through portability issues. So that is also part of the journey. If this sounds interesting, please let me know and/or take five minutes to submit an entry at this website: https://www.spec.org/cpuv8/entry_form.html. This is a tremendous opportunity to have a lasting impact on the CPU industry.
Cheers,
Mahesh