JSBSim for SPEC CPUv8 #834

Open
heshpdx opened this issue Feb 10, 2023 · 68 comments

@heshpdx
Contributor

heshpdx commented Feb 10, 2023

Hi @bcoconni and friends,

I’m a CPU architect at Ampere Computing, where I do performance analysis and workload characterization. I also serve on the SPEC CPU committee, searching for benchmarks for the next version of SPEC CPU. We try to find computationally intensive workloads in diverse fields, to help measure performance across a wide variety of behaviors and application domains. Based on JSBSim's longevity and its large, active community in aeronautics, education, and gameplay, I am proposing that the JSBSim flight dynamics models be included in the next set of marquee benchmarks in SPEC CPU. The committee is running a “search program”, or contest, where people can submit their benchmark ideas. I am humbly asking if you can please submit JSBSim for consideration. You can see all the information here, along with the submission form: https://spec.org/cpuv8/ There are monetary rewards for the submitter if the entry is chosen to move forward.

Full disclosure: Since I am on the committee, I cannot submit entries for winning prizes. But you can, and I encourage you to do so, because:

Full disclosure 2: I have already done 90% of the work. I took the JSBSim mainline source, placed it into the SPEC harness, and wrangled it to build and run on multiple platforms with command lines that conform to the memory footprints required by the benchmark rules. I started verifying results across multiple platforms, which led me to file PR #830. What remains is the documentation, fixing cases where the simulation script output differs slightly between optimization levels, and adjusting the scripts to satisfy the run-length requirements of the CPUv8 rules. There is also an opportunity to use scripts, aircraft, and engines other than the examples I dug up while browsing the repo.

There may be other things that we find that could be pushed back to the JSBSim mainline, as we work through portability issues. So that is also part of the journey. If this sounds interesting, please let me know and/or take five minutes to submit an entry at this website: https://www.spec.org/cpuv8/entry_form.html. This is a tremendous opportunity to have a lasting impact on the CPU industry.

Cheers,
Mahesh

@seanmcleod
Member

Hi @heshpdx, sounds interesting.

At a quick glance, it looks like the closing date for the first-step submission is 13 Mar 2023.

> Full disclosure #2: I have already done 90% of the work.

I'm a bit confused about the reference link to my pull request for - "Add support for forces in stability axes and moments in wind and stability axes"

@seanmcleod
Member

Ah, just realised you probably typed "hash 2" and GitHub auto-completed that into a pull request reference; the 2nd pull request in the GitHub repo happens to be mine.

@heshpdx
Contributor Author

heshpdx commented Feb 10, 2023

Yes, the first step is submission of entry, which takes five minutes. That is what the 13 Mar 2023 deadline is for.

Sorry about referencing PR 2! I edited and removed the link.

@jonsberndt
Contributor

jonsberndt commented Feb 10, 2023 via email

@seanmcleod
Member

@jonsberndt do you want to go ahead and fill in the submission entry? You have the right initials 😉

@heshpdx
Contributor Author

heshpdx commented Feb 15, 2023

I just received the first portability report, from gcc/mingw running on Windows Server 2019. This is an example of the kind of problem we solve, upstreaming the fix if needed. Do we expect issues with mingw or with MSVC?

g++      -std=c++17   -m64 -c -oJSB-FlightSim/src/FGJSBBase.obj -DSPEC -DNDEBUG -DSPEC_WINDOWS -DSPEC_AUTO_BYTEORDER=0x12345678-I. -IJSB-FlightSim/src -DXML_DEV_URANDOM -DSPEC_AUTO_SUPPRESS_THREADING  -g -O3 -march=native    -DSPEC_LP64 -DWIN32       JSB-FlightSim/src/FGJSBBase.cpp
In member function 'std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::pointer std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_data() const [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]',
    inlined from 'bool std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_is_local() const [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at D:/Users/specdev3/mingw-w64/mingw64/lib/gcc/x86_64-w64-mingw32/12.1.0/include/c++/bits/basic_string.h:275:23,
    inlined from 'void std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::_M_dispose() [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at D:/Users/specdev3/mingw-w64/mingw64/lib/gcc/x86_64-w64-mingw32/12.1.0/include/c++/bits/basic_string.h:286:18,
    inlined from 'std::__cxx11::basic_string<_CharT, _Traits, _Alloc>::~basic_string() [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]' at D:/Users/specdev3/mingw-w64/mingw64/lib/gcc/x86_64-w64-mingw32/12.1.0/include/c++/bits/basic_string.h:795:19,
    inlined from 'void __tcf_1()' at JSB-FlightSim/src/FGJSBBase.cpp:77:14:
JSB-FlightSim/src/FGJSBBase.cpp:77:14: error: definition of static data member 'JSBSim::FGJSBBase::needed_cfg_version' of dllimport'd class
   77 | const string FGJSBBase::needed_cfg_version = "2.0";
      |              ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp: In member function 'void JSBSim::FGJSBBase::disableHighLighting()':
JSB-FlightSim/src/FGJSBBase.cpp:52:10: error: definition of static data member 'JSBSim::FGJSBBase::highint' of dllimport'd class
   52 |     char FGJSBBase::highint[5]  = {27, '[', '1', 'm', '\0'      };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:53:10: error: definition of static data member 'JSBSim::FGJSBBase::halfint' of dllimport'd class
   53 |     char FGJSBBase::halfint[5]  = {27, '[', '2', 'm', '\0'      };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:54:10: error: definition of static data member 'JSBSim::FGJSBBase::normint' of dllimport'd class
   54 |     char FGJSBBase::normint[6]  = {27, '[', '2', '2', 'm', '\0' };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:55:10: error: definition of static data member 'JSBSim::FGJSBBase::reset' of dllimport'd class
   55 |     char FGJSBBase::reset[5]    = {27, '[', '0', 'm', '\0'      };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:56:10: error: definition of static data member 'JSBSim::FGJSBBase::underon' of dllimport'd class
   56 |     char FGJSBBase::underon[5]  = {27, '[', '4', 'm', '\0'      };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:57:10: error: definition of static data member 'JSBSim::FGJSBBase::underoff' of dllimport'd class
   57 |     char FGJSBBase::underoff[6] = {27, '[', '2', '4', 'm', '\0' };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:58:10: error: definition of static data member 'JSBSim::FGJSBBase::fgblue' of dllimport'd class
   58 |     char FGJSBBase::fgblue[6]   = {27, '[', '3', '4', 'm', '\0' };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:59:10: error: definition of static data member 'JSBSim::FGJSBBase::fgcyan' of dllimport'd class
   59 |     char FGJSBBase::fgcyan[6]   = {27, '[', '3', '6', 'm', '\0' };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:60:10: error: definition of static data member 'JSBSim::FGJSBBase::fgred' of dllimport'd class
   60 |     char FGJSBBase::fgred[6]    = {27, '[', '3', '1', 'm', '\0' };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:61:10: error: definition of static data member 'JSBSim::FGJSBBase::fggreen' of dllimport'd class
   61 |     char FGJSBBase::fggreen[6]  = {27, '[', '3', '2', 'm', '\0' };
      |          ^~~~~~~~~
JSB-FlightSim/src/FGJSBBase.cpp:62:10: error: definition of static data member 'JSBSim::FGJSBBase::fgdef' of dllimport'd class
   62 |     char FGJSBBase::fgdef[6]    = {27, '[', '3', '9', 'm', '\0' };
      |          ^~~~~~~~~

@seanmcleod
Member

> Do we expect issues with mingw or with MSVC?

No, we don't expect any issues with mingw or MSVC.

Each commit results in builds for all the supported platforms, e.g. see

https://github.com/JSBSim-Team/jsbsim/pull/836/checks

Plus a number of tests are run each time as well.

In terms of the particular error report above, how is JSBSim being built? Is it using our CMake setup? In particular I'm wondering whether there is some missing define to do with JSBSIM_API and dll import?

@heshpdx
Contributor Author

heshpdx commented Feb 15, 2023

Thanks, your info provides proof that I am missing something. The SPEC CPU harness builds code differently than CMake. I had to break humpty-dumpty apart and put him back together again, and perhaps we are missing a flag. We can't support dll import so that functionality is turned off; that is probably it.

@seanmcleod
Member

Take a look at:

jsbsim/src/CMakeLists.txt

Lines 97 to 111 in ec99f79

if(WIN32)
  if(BUILD_SHARED_LIBS)
    # The flag JSBSIM_EXPORT must be declared PRIVATE to avoid propagating it
    # to other targets such as the executable, the Python module and the unit
    # tests. Otherwise the linking of these targets will fail.
    target_compile_definitions(libJSBSim PRIVATE JSBSIM_EXPORT)
    foreach(TARGET_OBJECT ${TARGET_OBJECTS_LIST})
      target_compile_definitions(${TARGET_OBJECT} PRIVATE JSBSIM_EXPORT)
    endforeach(TARGET_OBJECT)
  else()
    list(APPEND MSVC_COMPILE_DEFINITIONS JSBSIM_STATIC_LINK)
  endif(BUILD_SHARED_LIBS)
  list(APPEND COMPILE_DEFINITIONS ${MSVC_COMPILE_DEFINITIONS})
endif(WIN32)

And:

jsbsim/src/JSBSim_API.h

Lines 34 to 47 in ec99f79

#ifndef JSBSIM_API_H
#define JSBSIM_API_H
#if defined(_WIN32) && !defined(JSBSIM_STATIC_LINK)
  #ifdef JSBSIM_EXPORT
    #define JSBSIM_API __declspec(dllexport)
  #else
    #define JSBSIM_API __declspec(dllimport)
  #endif
#else
  #define JSBSIM_API
#endif
#endif /* JSBSIM_API_H */
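
The MinGW errors earlier in this thread look consistent with `JSBSIM_API` expanding to `__declspec(dllimport)`, which happens on Windows when neither `JSBSIM_EXPORT` nor `JSBSIM_STATIC_LINK` is defined. A minimal sketch of the static-link case follows, assuming the SPEC harness can pass an extra define. The `Demo` class and `version_matches` helper are hypothetical and not part of JSBSim.

```cpp
#include <cassert>
#include <string>

// Assumption: a static, non-DLL build passes -DJSBSIM_STATIC_LINK on the
// command line; it is defined here only to keep the sketch self-contained.
#define JSBSIM_STATIC_LINK

// Same selection logic as the JSBSim_API.h excerpt above.
#if defined(_WIN32) && !defined(JSBSIM_STATIC_LINK)
  #ifdef JSBSIM_EXPORT
    #define JSBSIM_API __declspec(dllexport)
  #else
    #define JSBSIM_API __declspec(dllimport)
  #endif
#else
  #define JSBSIM_API
#endif

// Hypothetical stand-in for a JSBSim class with a static data member.
class JSBSIM_API Demo {
public:
  static const std::string needed_cfg_version;
};

// With JSBSIM_API empty this definition compiles everywhere. Under
// __declspec(dllimport) it is the kind of definition the MinGW log rejects.
const std::string Demo::needed_cfg_version = "2.0";

// Hypothetical helper: checks a config version against the required one.
bool version_matches(const std::string& version) {
  assert(!version.empty());  // sketch-level precondition
  return version == Demo::needed_cfg_version;
}
```

In a real build the define would more likely come from the compiler command line than from the source file, which mirrors what the CMake snippet above does for non-shared builds.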

@jonsberndt
Contributor

jonsberndt commented Feb 15, 2023 via email

@jonsberndt
Contributor

jonsberndt commented Feb 19, 2023 via email

@seanmcleod
Member

> I might not be able to get to it until this weekend.

Yep, I was going to say, no immediate rush, just have to make sure it's done by the submission deadline of 13 March 2023.

@jonsberndt
Contributor

I submitted JSBSim for consideration this evening.

@heshpdx
Contributor Author

heshpdx commented Feb 22, 2023

Thanks! The committee voted yesterday and accepted it into the search program. I will continue to be your project contact. 😄

I am curious about your opinions on building with -Ofast, based on the code here:
https://github.com/JSBSim-Team/jsbsim/blob/master/src/math/FGLocation.h#L53

/* Setting the -ffast-math compilation flag is highly discouraged */
#ifdef __FAST_MATH__
#error Usage of -ffast-math is strongly discouraged
#endif

We began testing with -Ofast since some users will do that for floating point applications, and of course we see some variance, such as the left and right landing gears touching down in a different order compared to the reference non-optimized baseline. Eyeballing the output of the flight script, everything looks ok; so there are methods we can employ to relax our verification. However I wanted to make sure with you all that this is acceptable, or if I should push back and insist on -fno-fast-math.
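
One way to relax verification, as mentioned above, is to compare numeric outputs within a tolerance instead of requiring exact matches. The following is a minimal sketch under that assumption. The function name and the idea of combining an absolute and a relative tolerance are illustrative and not taken from the SPEC harness.

```cpp
#include <cassert>
#include <cmath>

// Returns true when `value` matches `reference` within either an absolute
// or a relative tolerance. Hypothetical helper, not part of SPEC or JSBSim.
bool within_tolerance(double reference, double value,
                      double abs_tol, double rel_tol) {
  assert(abs_tol >= 0.0 && rel_tol >= 0.0);  // tolerances must be non-negative
  const double diff = std::fabs(reference - value);
  return diff <= abs_tol || diff <= rel_tol * std::fabs(reference);
}
```

A tolerance like this only helps for continuous quantities. Discrete differences, such as which landing gear touches down first, would still show up as a changed event order in the output.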

There are some nuances in the SPEC rules that separate FP benchmarks from INT benchmarks. The rules state that benchmarks that execute 10% or more FP instructions are designated as FP, and those with less than 1% are considered INT. The ones between 1% and 10% are in no-man's-land, and their submitters are asked to fix the profile if they can. It turns out that JSBSim comes in at 6% FP instructions. Now, the rules can be changed, and I know there will be discussion about what it means to be an FP application, because I am working with multiple benchmarks that come in around 5% FP. I predict that the committee will categorize JSBSim as integer. If so, the issue above is moot because, in general, the INT suite does not support verification with -Ofast (though your mileage may vary). But I wanted to get your collective thoughts so I can advocate properly.

@heshpdx
Contributor Author

heshpdx commented Feb 22, 2023

Also, can someone help craft cmdlines? I naively ran all the examples and selected the ones which were running the longest. I came up with these:

--nohighlight --property=simulation/jsbsim-debug=0 scripts/B747_script1.xml
--nohighlight --property=simulation/jsbsim-debug=0 scripts/ah1s_flight_test.xml
--nohighlight --property=simulation/jsbsim-debug=0 scripts/Short_S23_3.xml
--nohighlight --property=simulation/jsbsim-debug=0 scripts/Short_S23_4.xml
--nohighlight --property=simulation/jsbsim-debug=0 scripts/c3104.xml
--nohighlight --property=simulation/jsbsim-debug=0 scripts/c3105.xml
--nohighlight --property=simulation/jsbsim-debug=0 scripts/ball_orbit_g_torque.xml
--nohighlight --property=simulation/jsbsim-debug=0 scripts/ball_orbit.xml

But when running these back-to-back we clocked only 50 seconds on our reference machine. We need a runtime on the order of 180 to 300 seconds of interesting work, which motivates this post. I would love suggestions, especially scripts that exercise different parts of the code and use different aircraft or engine models. They don't have to be in the repository; if you have large models of your own, I welcome those with open arms. Thanks!
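
For reference, the gap is simple arithmetic: going from 50 seconds to the 180 to 300 second window means roughly 3.6 to 6 times more work. The sketch below assumes runtime scales about linearly with the amount of simulated work, which has not been measured here, and the function name is hypothetical.

```cpp
#include <cassert>

// Factor by which the workload must grow to move from `current_s` seconds
// to `target_s` seconds, assuming runtime scales linearly with the work.
double scale_factor(double current_s, double target_s) {
  assert(current_s > 0.0);  // a zero or negative runtime is meaningless
  return target_s / current_s;
}
```

For example, `scale_factor(50.0, 180.0)` gives the lower bound and `scale_factor(50.0, 300.0)` the upper bound of the needed growth.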

@jonsberndt
Contributor

> Also, can someone help craft cmdlines?
> ...
> We need a runtime on the order of 180 to 300 seconds of interesting work, which motivates this post.

I will try and post some scripts within the next week or so if that's OK.

@jonsberndt
Contributor

jonsberndt commented Feb 22, 2023

> Thanks! The committee voted yesterday and accepted it into the search program. I will continue to be your project contact. 😄
>
> ...
>
> There are some nuances in the SPEC rules that separate FP benchmarks from INT benchmarks. The rules state that benchmarks that execute 10% or more FP instructions are designated as FP, and those with less than 1% are considered INT. The ones between 1% and 10% are in no-man's-land, and their submitters are asked to fix the profile if they can. It turns out that JSBSim comes in at 6% FP instructions. ...

Can you elaborate on this? JSBSim is not an integer application (edit: at least I don't think of it as one). Is there some more context that can be provided?

@heshpdx
Contributor Author

heshpdx commented Feb 22, 2023

new cmdlines: sure thing, we are not in a super-rush. We have about 7 weeks to get this right.

FP vs INT: see step 3f of the rules: https://spec.org/cpuv8/#step3f

  • The executed instruction mix is either:
    • Less than 1% floating point instructions, in which case it will be considered an integer benchmark, or
    • 10% or more floating point instructions, in which case it will be considered a floating point benchmark
  • Notes:
    • If a benchmark falls in between 1% and 10%, the submitter will be asked whether modifications can be made so that the benchmark would be more clearly integer or floating point.
    • It is common for profiles to vary depending on optimization level.
    • SPEC's profiling will be conducted using moderately advanced optimization, with inlining disabled so that individual functions may be observed (for example: gcc -O2 -fno-inline -fno-builtin)

The current profile shows that 6% of total instructions are FP. Your opinion matters! If you think it is an FP app then we can take that into account.
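
The thresholds quoted above can be written down as a small classifier. This is only a sketch of the rule as stated in step 3f; the enum and function names are hypothetical, and the committee's actual decision process involves discussion beyond these numbers.

```cpp
#include <cassert>

enum class Category { Integer, FloatingPoint, Undetermined };

// Classifies a benchmark from its share of floating point instructions
// (0.0 to 1.0), following the thresholds quoted above from step 3f.
Category classify(double fp_fraction) {
  assert(fp_fraction >= 0.0 && fp_fraction <= 1.0);
  if (fp_fraction < 0.01) return Category::Integer;
  if (fp_fraction >= 0.10) return Category::FloatingPoint;
  return Category::Undetermined;  // submitter is asked about modifications
}
```

With the 6% figure from the current profile, `classify(0.06)` lands in the undetermined band between the two thresholds.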

When I look at the perf profile, I see JSBSim has a much higher than average number of icache and itlb misses (commensurate for a simulator), which is not usually found in "traditional FP" apps. Or maybe the "traditional FP" apps in SPEC CPU are more focused on HPC and not simulation. Let's see what happens; this is not a deal-breaker by any means. Actually it forces the committee to have a discussion.

@jonsberndt
Contributor

jonsberndt commented Feb 23, 2023

> When I look at the perf profile, I see JSBSim has a much higher than average number of icache and itlb misses (commensurate for a simulator), which is not usually found in "traditional FP" apps. Or maybe the "traditional FP" apps in SPEC CPU are more focused on HPC and not simulation. Let's see what happens; this is not a deal-breaker by any means. Actually it forces the committee to have a discussion.

It's a very interesting discussion!

When I think of JSBSim code I think of the equations of motion, which are a very math-intensive set of operations (matrix and vector operations). In cases where there is data logging going on there would be some file IO. Depending on which aircraft is being modeled there are varying numbers and complexities of what I think of as math models: atmosphere, gravity, engines, aerodynamics, etc. I'm surprised that the actual floating point operations are a relatively small portion of what the CPU sees. I'm considering some scenarios that might lead to a higher percentage of FP operations. I'll try and supply some new scripts over the weekend.

Also, I can't say that I ever coded in a way that considered optimization for the CPU. I'm curious if there are [non-draconian] things we can do to the code that might improve performance.

@seanmcleod
Member

> The current profile shows that 6% of total instructions are FP.

@heshpdx can you share a link to the profile results?

@jonsberndt
Contributor

@seanmcleod I ran cachegrind this evening and that was interesting. Not entirely sure how to interpret all the output or to use it for anything yet, but I ran this:

valgrind --tool=cachegrind build/src/JSBSim scripts/ball_orbit

This can be turned into something more readable like this:

cg_annotate cachegrind.out.76375 | more

Seems that find_child() (a property system call) represents over 3% of the calls in a JSBSim run - at least for the example I ran.

@heshpdx
Contributor Author

heshpdx commented Feb 23, 2023

I also see find_child() around 2%. Here is a profile that was generated by one of my colleagues:

-------------------------------------------------------------------------------
SPEC CPU Events                                        Sun Feb 12 08:53:47 2023
748.flightdm_r  ref  feb08a_d-gcc12-lessopt  
-------------------------------------------------------------------------------

                        %         % flops    %    L2   llc                    icach  icach   itlbm   itlb
         CPU          bra    % load   /fp   br Miss/ Miss/    L2   llc  dtlbm Miss/  Miss/   stlbh  walkc
 %time   Sec Instr    nch   fp stor  inst miss kInst kInst %Miss %Miss /mInst kInst   kRef  /mInst /mInst      size Function             Object
====== ===== ======= ==== ==== ==== ===== ==== ===== ===== ===== ===== ====== ===== ====== ======= ====== ========= ==================== =====================
100.0% 130.2 572.0 B  26%   6%  47%  1.01   1%   3.1         35%          8.8  10.9   43.7     804     29           <Total>              <Total>
  8.5%  11.1  49.2 B  20%  21%  33%  1.01        1.9         25%               15.2   34.9     687     26 1,087,000 <static>@0x76a1c     <libm-2.28.so>
  4.8%   6.2  40.6 B  34%       37%  1.00   1%   1.0         26%                 .5   28.2       8            1,320 __strncmp_avx2       <libc-2.28.so>
  2.7%   3.4   8.9 B  25%       48%  1.00   1%  17.7         66%                5.4   43.3     142     27       559 JSBSim::FGCondition: exe
  2.3%   3.0   9.7 B  37%       55%              3.4         56%                2.3   21.6     170     25        36 SGReferenced::get    exe
  2.2%   2.9  24.7 B  10%  15%  38%  1.00         .1         10%                2.2    7.4     121     19     2,655 sincos               <libm-2.28.so>
  2.1%   2.7  11.1 B  36%       27%              1.6         24%                3.4   51.4      81              158 int find_child<char  exe
  2.0%   2.6   5.7 B  23%       61%  1.00   1%  14.2         40%                9.3   53.3     838    100       262 SGPropertyNode::getD exe
  2.0%   2.5   3.9 B  32%   1%  69%  1.00       18.5         32%                5.9   36.1   1,591     15       119 JSBSim::FGFunction:: exe
  1.6%   2.0   6.0 B  33%   1%  45%  1.00   3%  11.6         49%                3.0   25.6     304      5        24 JSBSim::FGPropertyVa exe
  1.4%   1.8  11.8 B  17%   9%  48%  1.05         .3         15%               12.2   40.6     613              603 JSBSim::FGMatrix33:: exe
  1.4%   1.8   8.9 B  33%       58%         2%                                  1.5   13.1      30              400 parse_name           exe
  1.4%   1.8   4.9 B  33%   4%  39%  1.00   1%  15.0         62%                4.8   24.9      97    925     1,189 JSBSim::FGTable::Get exe
  1.3%   1.7  10.0 B  18%  25%  44%  1.00         .2          7%                5.7   13.4      90            1,987 JSBSim::FGLocation:: exe
  1.3%   1.7   5.3 B  27%   3%  59%  1.00        1.3         13%               29.1   76.0   1,898      6     3,913 JSBSim::FGFDMExec::L exe
  1.2%   1.6   6.5 B  37%   2%  60%  1.00   1%   3.7         29%                 .6   11.8      14      5       161 JSBSim::FGFunction:: exe
  1.2%   1.5   8.6 B  16%  12%  49%  1.02        1.0         32%               12.6   45.3     556     28       585 JSBSim::FGMatrix33:: exe
  1.1%   1.5  11.3 B  22%       36%               .4         29%                2.0   28.0     191               30 isalpha              <libc-2.28.so>
  1.1%   1.4   6.9 B  29%       28%  1.00        1.6         33%               10.1   33.1   1,034      9       762 malloc               <libc-2.28.so>
  1.1%   1.4   3.7 B  25%       48%         2%  11.6         35%                3.4   20.9     760      8       278 JSBSim::FGPropertyVa exe
  1.0%   1.3   5.4 B  20%  16%  55%  1.00        1.1         19%               21.5   52.5     887     11       378 JSBSim::FGColumnVect exe
  1.0%   1.2   5.1 B  25%       46%  1.00   1%   9.2         25%                1.5   37.0     111              149 SGPropertyNode::fire exe
   .9%   1.1   4.4 B  42%   1%  69%  1.00        3.0         40%                4.1   60.4     193     14        22 SGSharedPtr<JSBSim:: exe
   .9%   1.1   1.9 B  36%   2%  87%  1.00       30.2         66%                3.9   22.9     242     16        29 JSBSim::FGParameterV exe
   .8%   1.1   4.3 B  22%  23%  55%  1.01        1.9         33%                9.3   19.6     826      7        17 JSBSim::FGColumnVect exe
   .8%   1.0   5.3 B  28%       37%  1.00         .9         24%               12.1   47.7   1,763      6       414 __strlen_avx2        <libc-2.28.so>
   .8%   1.0   4.4 B  20%  12%  50%  1.01         .2          4%               10.6   36.5     313     14        47 JSBSim::FGColumnVect exe
   .8%   1.0   4.0 B  42%       52%  1.00   1%   2.2         32%                2.6   15.4     446              197 JSBSim::FGFCSChannel exe
   .8%   1.0   2.8 B  41%       30%               .7          9%                 .2    3.9      65               24 compare_strings      exe
   .7%   1.0   2.0 B  19%       63%  1.00   3%  13.6         63%                8.2   36.6     922              265 JSBSim::FGSwitch::Ru exe
   .7%    .9   9.2 B  23%       58%                                              .1    3.1                      119 std::__cxx11::basic_ <libstdc++.so.6.0.25>
   .7%    .9   4.5 B  12%   3%  54%  1.00         .7         32%                8.4   23.7     141               71 JSBSim::FGMatrix33:: exe
   .7%    .9   3.3 B  11%  24%  59%  1.00                                      15.2   41.1     349            1,027 JSBSim::FGQuaternion exe
   .7%    .9   2.9 B  27%   5%  66%  1.00        1.7         28%               13.2   35.4   2,059            2,550 JSBSim::FGAerodynami exe
   .7%    .8   1.9 B  40%   1%  79%  1.00        5.2         21%               54.3  126.1  10,113     78       325 JSBSim::FGFDMExec::R exe
   .6%    .8   4.5 B  29%       34%  1.00        1.1         21%                7.9   24.5      73     13     2,492 _int_free            <libc-2.28.so>
   .6%    .8   4.5 B  26%   4%  38%  1.00        4.2         28%                 .6   13.2       7               38 bool __gnu_cxx::oper exe
   .6%    .8   3.7 B  20%   7%  51%  1.03        1.1         38%                9.1   34.5     113              219 JSBSim::FGMassBalanc exe
   .6%    .7   4.3 B  32%       50%  1.00        1.9         38%                 .6    8.6      42              151 JSBSim::FGFCSCompone exe
   .6%    .7   3.8 B  19%       47%         4%                                  5.4   21.5                       89 PathComponent* std:: exe
   .6%    .7   2.0 B  26%   4%  51%  1.00        2.0         29%               24.5   56.2   3,388     91     2,076 JSBSim::FGLGear::Get exe
   .5%    .7   2.9 B  19%   5%  59%  1.00        1.7         61%               15.3   36.0     781              504 JSBSim::FGLocation:: exe
   .5%    .7   2.4 B  20%  16%  49%  1.01        1.2         16%               16.2   49.2     210     12     2,385 JSBSim::FGAuxiliary: exe
   .5%    .7   1.7 B  32%       65%  1.00   1%  10.4         45%                2.5   30.3     330     35       116 JSBSim::FGFCSFunctio exe
   .5%    .6   3.2 B  17%  14%  54%  1.01        2.5         71%                6.6   26.3      94    141       126 JSBSim::FGColumnVect exe
   .5%    .6   3.1 B  22%       49%               .3         24%                5.1   21.4      10              235 void std::vector<Pat exe
   .5%    .6   2.7 B  26%       43%  1.00                                      10.5   26.4   3,406     11       268 free                 <libc-2.28.so>
   .5%    .6   2.4 B  19%  17%  51%  1.00                                      27.6   50.6   1,711              108 JSBSim::FGColumnVect exe
   .5%    .6   2.3 B  23%       58%  1.00        1.3         32%               19.0   92.2   1,589               25 std::__deque_buf_siz exe
   .5%    .6   1.9 B  31%       58%              5.9         30%                3.9   24.1     210     32       441 SGPropertyNode::setD exe
   .5%    .6   1.7 B  26%       42%         1%                                 18.5   76.9   1,623    126       984 __memcpy_avx_unalign <libc-2.28.so>
   .5%    .6   1.0 B  69%       86%         1%  38.6         56%               17.5   72.5     688    125        41 SGPropertyNode::get_ exe
   .4%    .6   3.3 B  10%  43%  44%  1.00                                       8.8   30.3     754              403 JSBSim::FGLocation:: exe
   .4%    .6   2.0 B  16%       35%              2.0         29%               10.4   53.0     862              128 std::__cxx11::basic_ <libstdc++.so.6.0.25>
   .4%    .6   1.9 B  25%   3%  79%  1.00        4.9         32%                3.9   62.8      32               82 SGSharedPtr<JSBSim:: exe
   .4%    .6   1.3 B  23%       54%         1%   1.5         14%               35.4   81.2   2,902     45       629 JSBSim::FGFCS::Run   exe
   .4%    .5   2.6 B  29%   1%  49%  1.00        1.5         15%                                34               38 bool __gnu_cxx::oper exe
   .4%    .5   2.3 B  17%       13%               .4        100%                2.3   41.2      65     13         8 std::__cxx11::basic_ <libstdc++.so.6.0.25>
   .4%    .5   2.2 B  41%       43%               .9         24%                4.9   23.4     442              426 void std::__cxx11::b exe
   .4%    .5   2.2 B  26%       61%               .5        100%                6.8   40.4   1,724               98 std::__cxx11::basic_ <libstdc++.so.6.0.25>
   .4%    .5   2.2 B   9%       31%               .9         24%                8.2   23.9     497              252 std::__cxx11::basic_ <libstdc++.so.6.0.25>
   .4%    .5   2.1 B  21%       40%              1.4         48%                4.8   12.6      57              329 find_node            exe
   .4%    .5   2.1 B  19%       38%               .9         17%               12.5   43.3     966              192 std::__cxx11::basic_ <libstdc++.so.6.0.25>
   .4%    .5   1.7 B  35%       44%  1.00        1.2         24%                6.4  107.0   4,394               33 JSBSim::FGFunction:: exe
   .4%    .5   1.7 B  30%       42%                                             4.3   85.7     198               22 SGSharedPtr<JSBSim:: exe
   .4%    .5   1.7 B  19%  12%  61%  1.16                                      12.7   28.9     144              279 JSBSim::FGColumnVect exe
   .4%    .5   1.7 B  12%  11%  47%  1.14         .6          9%               31.4   85.0     973     17     1,512 JSBSim::FGMassBalanc exe
   .4%    .5   1.6 B  14%   9%  48%  1.00                                      11.4   43.7                       42 JSBSim::FGColumnVect exe
   .4%    .5   1.3 B  58%  11%  75%  1.00        3.1         17%                 .8   30.3      23                4 __gnu_cxx::__normal_ exe
   .4%    .5   1.1 B  40%      119%                                            24.3   44.4   1,379               28 JSBSim::FGMatrix33:: exe
   .4%    .5    .8 B  46%       80%         2%   9.6         71%               57.2  110.1   1,659    108     2,636 JSBSim::FGScript::Ru exe
   .3%    .5   2.8 B  17%  16%  34%  1.00                                       5.1   19.7     240              296 JSBSim::FGDefaultGro exe
   .3%    .5   1.1 B  25%   1%  45%  1.00        2.8        100%               12.4   48.9     496              114 std::operator+       exe
   .3%    .5    .8 B  68%       90%                                            34.6   60.1     508              138 parse_component      exe
   .3%    .5    .6 B  43%       68%         4%  45.3         35%               20.3  144.4     610               25 JSBSim::aFunc<JSBSim exe
   .3%    .4   2.4 B  23%       36%  1.00        5.1         48%                8.5   43.3   3,194              170 std::vector<bool, st exe
   .3%    .4   2.3 B  22%       37%                                             9.6   36.4     104               49 std::vector<PathComp exe
   .3%    .4   2.0 B  22%       37%                                             4.3   20.4     372     15       366 parse_path           exe
   .3%    .4   2.0 B  13%   6%  41%  1.02         .5         24%                 .5    1.7                      341 JSBSim::FGMassBalanc exe
   .3%    .4   1.8 B  32%       45%  1.00                                      15.8   63.9   2,763               73 std::__cxx11::basic_ exe
   .3%    .4   1.6 B  34%   2%  47%  1.00   2%   5.6         41%                 .9   11.6                      161 JSBSim::FGFunction:: exe
   .3%    .4   1.6 B  17%       41%  1.00        4.3         69%               24.1  103.3   3,678     37    27,744 <static>@0x8c6a0     <libstdc++.so.6.0.25>
   .3%    .4   1.5 B  19%   8%  39%  1.00         .7        100%               14.0   37.0     204     20        62 JSBSim::FGMatrix33:: exe
   .3%    .4   1.5 B  15%       44%  2.00                                      20.6   57.2   1,954               16 JSBSim::FGColumnVect exe
   .3%    .4   1.3 B  19%  21%  43%  1.09                                       4.5   10.6                       12 JSBSim::FGMatrix33:: exe
   .3%    .4   1.3 B  16%  17%  51%  1.00         .7         14%               36.5  126.7   2,881              130 JSBSim::FGLocation:: exe
   .3%    .4   1.1 B  27%   3%  47%  1.00   1%   9.2         76%               11.9   65.7      28               40 std::vector<double,  exe
   .3%    .4   1.0 B  35%  21% 103%  1.00        1.0         24%               24.5   27.6                    1,307 JSBSim::FGMatrix33:: exe
   .3%    .4   1.0 B  30%       90%         3%  11.5         22%                1.0    4.4     437                6 JSBSim::FGRealValue: exe
   .3%    .4   1.0 B  24%   1%  42%  1.13   1%   2.1         38%               28.5   74.9   1,750              587 JSBSim::FGPropulsion exe
   .3%    .4   1.0 B  21%       62%  1.00                                      27.0   58.8     437              410 std::vector<double,  exe
   .3%    .4    .9 B  36%       57%  1.00   1%   6.7         38%                3.2   16.8     301              163 JSBSim::FGFunction:: exe
   .3%    .4    .7 B  35%  15%  99%  1.04        7.1         18%               35.5   87.2   5,158              752 JSBSim::FGPropagate: exe
   .3%    .4    .7 B  28%  27%  48%  1.01                                      64.1  155.1     895            1,047 JSBSim::FGPropagate: exe
   .3%    .3   1.8 B  41%   1%  30%  1.00        1.7         24%                 .5   10.3                       22 SGReferenced::put    exe
   .3%    .3   1.3 B  10%  23%  11%  1.01         .7         24%                8.9   14.8     268              198 atan2                <libm-2.28.so>
   .3%    .3   1.1 B  35%       57%                                             3.5   11.2                       35 PathComponent::PathC exe
   .3%    .3   1.0 B  20%   4%  52%  1.07        1.0         24%               28.2   67.3   2,785            3,243 JSBSim::FGPropagate: exe
   .3%    .3    .9 B  36%       77%                                             3.8    9.0                       96 PathComponent& std:: exe
   .3%    .3    .9 B  34%  15%  61%  1.00        1.1         24%               42.9   60.2      33              369 JSBSim::FGStandardAt exe
   .3%    .3    .6 B  17%   2%  79%  1.00   2%  10.9         42%                1.5   10.3     234              121 JSBSim::FGFunction:: exe
   .2%    .3   1.8 B  11%  16%  44%  1.00                                       7.2   19.2     335               81 JSBSim::FGLocation:: exe
   .2%    .3   1.7 B  12%  12%  54%  1.00        3.5         27%               11.4   63.2      52               42 JSBSim::FGColumnVect exe
   .2%    .3   1.4 B  28%       75%                                              .7    2.4                       58 PathComponent* std:: exe
   .2%    .3   1.2 B  33%       41%              1.7        100%               15.4   39.7     104    130        19 bool std::operator== exe
   .2%    .3   1.1 B  33%       61%  1.00   2%   7.4         45%                6.2   98.6      28     55        45 SGRawValueMethods<JS exe
   .2%    .3    .3 B  10%       41%         2%   3.1         24%                3.0   10.3   1,125               27 std::vector<SGShared exe

@heshpdx
Contributor Author

heshpdx commented Feb 23, 2023

Attaching some details. Prof is just the info I pasted above.
prof-func.748.flightdm_r.ref.feb08a_d-gcc12-lessopt.txt
cache-func.748.flightdm_r.ref.feb08a_d-gcc12-lessopt.txt
itlb-func.748.flightdm_r.ref.feb08a_d-gcc12-lessopt.txt

@heshpdx
Contributor Author

heshpdx commented Feb 24, 2023

I ran cppcheck on the source tree to check for any crazy errors. (This was more out of curiosity than anything else, since I have proposed cppcheck as another candidate for CPUv8.) In case it is helpful, I put the output below. Some of the issues tend to be innocuous or irrelevant, so your mileage may vary.

JSB-FlightSim/src/initialization/FGTrim.cpp:535:10: warning: Uninitialized variable: rParam.contactRef [uninitvar]
  return rParam;
         ^
JSB-FlightSim/src/initialization/FGTrim.cpp:497:38: note: Assuming condition is false
  for (iter = contacts.begin(); iter != contacts.end(); ++iter) {
                                     ^
JSB-FlightSim/src/initialization/FGTrim.cpp:535:10: note: Uninitialized variable: rParam.contactRef
  return rParam;
         ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:272:23: error: Array 'P[13][13]' accessed at index P[-1][*], which is out of bounds. [negativeIndex]
          P[n][m] = (P[n-1][m] * c * (2.0*n-1) -
                      ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:273:21: error: Array 'P[13][13]' accessed at index P[-2][*], which is out of bounds. [negativeIndex]
                   P[n-2][m] * roots[m][n][0]) *
                    ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:276:26: error: Array 'DP[13][13]' accessed at index DP[-1][*], which is out of bounds. [negativeIndex]
          DP[n][m] = ((DP[n-1][m] * c - P[n-1][m] * s) *
                         ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:276:42: error: Array 'P[13][13]' accessed at index P[-1][*], which is out of bounds. [negativeIndex]
          DP[n][m] = ((DP[n-1][m] * c - P[n-1][m] * s) *
                                         ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:277:33: error: Array 'DP[13][13]' accessed at index DP[-2][*], which is out of bounds. [negativeIndex]
                  (2.0*n-1) - DP[n-2][m] * roots[m][n][0]) *
                                ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:252:47: warning: Invalid sqrt() argument nr 1. The value is -23 but the valid values are '0.0:'. [invalidFunctionArg]
            roots[m][n][0] = sqrt((n-1)*(n-1) - mm);
                                              ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:251:38: note: Assuming that condition 'n<=nmax' is not redundant
          for ( n = MAX(m + 1, 2); n <= nmax; n++ ) {
                                     ^
JSB-FlightSim/src/simgear/magvar/coremag.cxx:252:47: note: Invalid argument
            roots[m][n][0] = sqrt((n-1)*(n-1) - mm);
                                              ^
JSB-FlightSim/src/simgear/props/props.cxx:462:9: warning: Iterators to containers from different expressions 'name' and 'token' are used together. [mismatchingContainerExpression]
    if (name.end() != token.end()) {
        ^
JSB-FlightSim/src/JSBSim.cpp:502:15: error: Uninitialized variable: &local [uninitvar]
  localtime_s(&local, &tod);
              ^
JSB-FlightSim/src/simgear/xml/xmlparse.c:4288:21: error: Uninitialized variable: &versionend [uninitvar]
          &version, &versionend, &encodingName, &newEncoding, &standalone)) {
                    ^
JSB-FlightSim/src/simgear/xml/xmlparse.c:4279:7: note: Assuming condition is false
  if (! accountingDiffTolerated(parser, XML_TOK_XML_DECL, s, next, __LINE__,
      ^
JSB-FlightSim/src/simgear/xml/xmlparse.c:4288:21: note: Uninitialized variable: &versionend
          &version, &versionend, &encodingName, &newEncoding, &standalone)) {
                    ^

@heshpdx
Contributor Author

heshpdx commented Mar 5, 2023

Also, can someone help craft cmdlines?
...
We need a runtime on the order of 180 to 300 seconds of interesting work, which motivates this post.

I will try and post some scripts within the next week or so if that's OK.

Hi! Any luck in making longer scripts? Thanks!

@jonsberndt
Contributor

jonsberndt commented Mar 7, 2023

Hi, Mahesh,

Sorry it has taken me a while to get back to you.

Define "interesting work". ;-)

There is one script that can pretty easily be modified to run for any length of time. That is the scripts/ball_orbit.xml script. I went in and changed that as follows:

<run start="0.0" end="86400" dt="0.005">

That is, the end time was increased to 86400 seconds. It ran for about ten minutes. I'm thinking if you set that to 30000 then the sim should run for about 4 minutes. But, is that "interesting"?

I think I could also modify one of the C310 scripts to fly a route for that long.

Jon
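As a rough way to size the `end` attribute for a target run length, the sketch below scales linearly from the one timing quoted above (end=86400 taking about ten minutes at dt=0.005). The helper names are hypothetical, and the constant cost per frame is an assumption; actual wall-clock time depends on the machine, compiler, and script.

```cpp
// Number of integration frames for `end_s` simulated seconds at step `dt_s`.
long long frames(double end_s, double dt_s) {
    return static_cast<long long>(end_s / dt_s + 0.5);  // round to nearest
}

// Wall-clock estimate scaled linearly from one measured reference run.
// Assumption: every frame costs about the same.
double estimate_wall_minutes(double end_s, double ref_end_s, double ref_wall_min) {
    return ref_wall_min * end_s / ref_end_s;
}
```

For example, `estimate_wall_minutes(30000.0, 86400.0, 10.0)` comes out near 3.5, in the same ballpark as the "about 4 minutes" guess, and `frames(3000.0, 0.005)` gives the frame count for a 3000-second run.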

@heshpdx
Contributor Author

heshpdx commented Mar 7, 2023

If the work is repetitive (the same code is being executed over and over with same/similar data), then it is not interesting. If a script exercises different parts of the code with different data, that would be ideal. If I were to use knowledge of my own CPU simulator to make an analogy to your simulator (even though it is a different domain), it would be something like the following.

Not interesting:

  1. aircraft takes off and climbs to altitude
  2. flies for 86400 seconds, with no change in external conditions
  3. aircraft lands

Interesting:

  1. aircraft takes off facing East and climbs to altitude. Wind is coming from the South.
  2. flies for 1000 seconds
  3. aircraft turns to south and travels into the headwind
  4. flies for 1000 seconds
  5. aircraft turns to north and travels with tailwind
  6. flies for 1000 seconds
  7. aircraft engine 2 catches fire
  8. engine 2 breaks off; aircraft has a new weight and is lopsided.
  9. pilot decides to leak most of the fuel to reduce the weight further for better stabilization
  10. aircraft turns to west.
  11. flies for 2000 seconds at half velocity back to airport,
  12. aircraft lands

When I pitched the benchmark to the committee, I had the aircraft directory open, and they saw the model for a pterodactyl and got excited that we would have that in one of the cmdlines. I don't see any scripts with pterodactyl though. Can we use that model somehow? Do others have more complex scripts which simulate other kinds of phenomena? I saw people on the forum asking for Martian atmosphere for the NASA helicopter. Also, some of the scripts where the aircraft crashes - those are good too.

@seanmcleod
Member

Define "interesting work". ;-)

I'd define it in terms of good code coverage 😉

It looks like multiple scripts are being run, i.e. we don't have to get everything done in a single script. In terms of coverage I'd suggest the following as a starting point:

  • Different engine types, piston, turbine, rocket
  • Trim code being exercised
  • Landing gear code being exercised, so take-off and landing
  • Actuators being used
  • Timed script events
  • High altitude/space
  • Autopilot control
  • FBW (e.g. F-16)
  • Wind and turbulence
  • External force

@heshpdx
Contributor Author

heshpdx commented Mar 7, 2023

Thank you for a much better description. Yes, different types of engines and crafts, and timed events.

@jonsberndt
Contributor

If it works to have multiple scripts involved to interestingly exercise more of the code instead of a single, mythical "super-interesting" script ;-) then I think Sean's suggestion is a great way to go. I will note, though, that at the lowest levels much of the same code is being continuously exercised, just with different data.

The C310 script that takes off and flies a route is pretty interesting for a bit at least. I think it still works. Also, would perhaps running a couple of instances of JSBSim simultaneously be helpful?

I'll see if I can come up with a batch of runs to make.

@bowers

bowers commented Oct 23, 2023

I work on the SPEC CPU committee as a colleague of @heshpdx. I work on porting and profiling workloads on Windows. One task is to ensure that the work done and the workload outputs are consistent between Windows and other OSes.

I am investigating a case where the output produced for the script ball_orbit.xml is different when I compile JSBSim with the Microsoft MSVC compiler, versus when I compile it with the gcc compiler in a MinGW environment.

The discrepancy is so precise & consistent that I hoped an expert in JSBSim might recognize what variable or function might be going awry under MSVC. To be clear, I don't think this is an issue in JSBSim.

In a few specific and consistent cases, the MSVC-compiled version reports a value of "Argument of perigee" that is (360 - X) compared to the gcc-compiled version. I noticed that this occurs when (and only when) the matching "True anomaly" value is ~ 360.

An output example when compiled with MinGW/gcc:

Time Notify (Event 0) executed at time: 3000.000000
Height AGL (ft): = 800000.000478
Altitude (Geod, ft): = 800000.000478
Latitude (Geod, deg): = -0.000000
Vehicle Radius (ft): = 21725646.325938
Inertial Vel Mag (ft/s): = 25473.404281
Body X Velocity (ft/s): = -22221.033016
Simulation Frame: = 600000.000000
Density: = 0.000000
Angular momentum (ft^2/s) = 553426172137.411865
Inclination (deg) = 0.000000
Right ascension (deg) = 0.000000
Argument of perigee (deg) = 201.538233
Period (s) = 5374.952767
Eccentricity = 0.001507
Apoapsis radius (ft) = 21791206.647858
Periapsis radius (ft) = 21725646.325938
True anomaly (deg) = 360.000000

The same output when compiled with MSVC:

Time Notify (Event 0) executed at time: 3000.000000
Height AGL (ft): = 800000.000478
Altitude (Geod, ft): = 800000.000478
Latitude (Geod, deg): = 0.000000
Vehicle Radius (ft): = 21725646.325938
Inertial Vel Mag (ft/s): = 25473.404281
Body X Velocity (ft/s): = -22221.033016
Simulation Frame: = 600000.000000
Density: = 0.000000
Angular momentum (ft^2/s) = 553426172137.411133
Inclination (deg) = 0.000000
Right ascension (deg) = 0.000000
Argument of perigee (deg) = 158.461767
Period (s) = 5374.952767
Eccentricity = 0.001507
Apoapsis radius (ft) = 21791206.647858
Periapsis radius (ft) = 21725646.325938
True anomaly (deg) = 360.000000

I can "correct" this difference in the output by inserting this into src/models/FGPropagate.cpp#533:

#if defined(_MSC_VER)
  if (TrueAnomaly > 1) PerigeeArgument = 360. - PerigeeArgument;
#endif

@heshpdx
Contributor Author

heshpdx commented Oct 23, 2023

To augment @bowers's post, we recently taught SPEC CPU's verification harness to tolerate coterminal angles. Therefore, if the MSVC output for Argument of Perigee were reported as -158.5° instead of +158.5°, that would be coterminal with +201.5° (which is the "golden reference output") and we would not be in this predicament because the output would verify. So perhaps that gives a clue as to what is happening.
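A minimal sketch of what a coterminal-angle comparison could look like. This is not the SPEC CPU harness code; the function names and the default tolerance are assumptions made for illustration.

```cpp
#include <cmath>

// Smallest absolute difference between two angles in degrees, in [0, 180].
double angular_distance_deg(double a, double b) {
    double d = std::fmod(a - b, 360.0);  // d in (-360, 360)
    if (d < 0.0) d += 360.0;             // shift into [0, 360]
    return d > 180.0 ? 360.0 - d : d;    // fold onto the shorter arc
}

// True when a and b point the same way modulo 360, within tol_deg.
bool coterminal(double a, double b, double tol_deg = 1e-6) {
    return angular_distance_deg(a, b) <= tol_deg;
}
```

With this check, -158.461767 and 201.538233 compare as equal, while +158.461767 and 201.538233 do not, which is why the MSVC output still fails verification.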

@seanmcleod
Member

seanmcleod commented Oct 23, 2023

Okay, just double-checking, so it looks like the only assignments to PerigeeArgument are on lines 522, 523 and 528 in the JSBSim source code?

And in the example output above, the Eccentricity > 1E-8, so the difference between gcc and MSVC is on lines 522, 523?

FGColumnVector3 eVector = (VState.vInertialVelocity*angularMomentum/in.GM - R);
Eccentricity = eVector.Magnitude();
if (Eccentricity > 1E-8) {
  eVector /= Eccentricity;
  PerigeeArgument = acos(DotProduct(N, eVector))*radtodeg;
  if (eVector(eZ) < 0) PerigeeArgument = 360. - PerigeeArgument;
}
else
{
  eVector = {1., 0., 0.};
  PerigeeArgument = 0.0;
}

And so the difference between gcc and MSVC is due to a different result between them with regards to line 523 and this conditional:

if (eVector(eZ) < 0)

@seanmcleod
Member

Okay, so doing a quick lookup on Argument of Perigee - http://www.castor2.ca/03_Mechanics/02_Elements/05_Arg/index.html

And taking a look at the output for MSVC I see the following pattern:

Time Argument of Perigee
0.005 0.0
600 40.3
1200 80.6
1800 120.9
2400 161.2
3000 158.5
3600 118.2

Which looks incorrect, i.e. between time 2400 and 3000 it implies the ball has moved in reverse, a value of 201.5 would make more sense.

Taking a look in the debugger for time 3000, eVector(eZ) == 0.0 which is based on the cross product between:

VState.vInertialVelocity * angularMomentum on line 518.

VState.vInertialVelocity - {-9352.1465544806997, -23694.549595047396, 0.0000000000000000}

angularMomentum - {0.0000000000000000, 0.0000000000000000, 553426172137.42932} 

eVector - {-0.0014013482342756234, 0.00055309720560003939, 0.0000000000000000}

@jonsberndt, @bcoconni can you spot anything obvious going wrong?

@seanmcleod
Member

seanmcleod commented Oct 23, 2023

Hmm, trying to picture the orbit I took a look at the reset file used by the script.

<position frame="ECEF">
  <altitudeMSL unit="FT"> 800000.0 </altitudeMSL>
</position>
<orientation unit="DEG" frame="LOCAL">
  <yaw> 90.0 </yaw>
</orientation>
<velocity unit="FT/SEC" frame="BODY">
  <!--x> 23869.978340 </x--> <!-- For use with standard grav model -->
  <x> 23889.145167 </x> <!-- For use with WGS84 grav model -->
</velocity>

So it appears that the ball starts off at (lat, lon) of (0, 0) and then given the yaw angle specified and the x-body velocity it starts an equatorial orbit?

Which seems to match in terms of the initial state report looking at the ECEF coordinates and ECEF velocity.

------------------------------------------------------------------
State Report at sim time: 0.000000 seconds
  Position
    ECI:   21725646.32546, 0, 0 (x,y,z, in ft)
    ECEF:  21725646.325460 , 0.000000 , 0.000000 (x,y,z, in ft)
    Local: 0.000000, 0.000000, 800000.000000 (geodetic lat, lon, alt ASL in deg and ft)

  Orientation
    ECI:   -89.99999999999999, -0, 90 (phi, theta, psi in deg)
    Local: 1.272221872585406e-14, -1.412450015376051e-30, 90.00000000000001 (phi, theta, psi in deg)

  Velocity
    ECI:   0, 25473.40428154583, 0 (x,y,z in ft/s)
    ECEF:  0, 23889.14516700002, 0 (x,y,z in ft/s)
    Local: 0.000000 , 23889.145167 , 0.000000 (n,e,d in ft/sec)
    Body:  23889.145167 , 0.000000 , 0.000000 (u,v,w in ft/sec)

  Body Rates (relative to given frame, expressed in body frame)
    ECI:   0, 0, 9.277188200408133e-19 (p,q,r in deg/s)
    ECEF:  0, 0.004178074132240404, 0 (p,q,r in deg/s)

---- JSBSim Execution beginning ... --------------------------------------------


Time Notify (Event 0) executed at time: 0.005000
    Height AGL (ft):          = 800000.000000
    Altitude (Geod, ft):      = 800000.000000
    Latitude (Geod, deg):     = 0.000000
    Vehicle Radius (ft):      = 21725646.325460
    Inertial Vel Mag (ft/s):  = 25473.404282
    Body X Velocity (ft/s):   = 23889.145167
    Simulation Frame:         = 1.000000
    Density:                  = 0.000000
    Angular momentum (ft^2/s) = 553426172126.323120
    Inclination (deg)         = 0.000000
    Right ascension (deg)     = 0.000000
    Argument of perigee (deg) = 0.000000
    Period (s)                = 5374.952766
    Eccentricity              = 0.001507
    Apoapsis radius (ft)      = 21791206.646590
    Periapsis radius (ft)     = 21725646.325460
    True anomaly (deg)        = 0.000000

Start: Monday October 23 2023 22:17:03 (HH:MM:SS)

So as a complete orbital mechanics layman, shouldn't the Argument of Perigee be 0 all the time, given the equatorial orbit with an inclination of 0?

@heshpdx
Contributor Author

heshpdx commented Oct 23, 2023

In case it matters, here is the change I made in the ball_orbit.xml script, to get more runtime and to verify the final location. The aircraft did not change.

$ diff -u upstream/ball_orbit.xml spec/ball_orbit.xml
--- upstream/ball_orbit.xml     2023-10-23 21:19:45.527498845 +0000
+++ spec/ball_orbit.xml 2023-10-23 21:22:33.763959308 +0000
@@ -4,7 +4,7 @@
     xsi:noNamespaceSchemaLocation="http://jsbsim.sf.net/JSBSimScript.xsd"
     name="cannon ball test">
   <use aircraft="ball" initialize="reset00_v2"/>
-  <run start="0.0" end="10800" dt="0.005">
+  <run start="0.0" end="30200" dt="0.005">
     <description>
       Integrators
       
@@ -53,7 +53,7 @@
       <set name="simulation/notify-time-trigger" value="600" type="FG_DELTA"/>
     </event>
 
-    <!--event name="Final Data Point">
+    <event name="Final Data Point">
       <description> Log data on command </description>
       <notify>
         <property caption="Simulation Time:         "> simulation/sim-time-sec         </property>
@@ -65,9 +65,9 @@
         <property caption="Body X Velocity (ft/s):  "> velocities/u-fps                </property>
         <property caption="Simulation Frame:        "> simulation/frame                </property>
       </notify>
-      <condition> simulation/sim-time-sec >= 86400.0 </condition>
+      <condition> simulation/sim-time-sec >= 30000.0 </condition>
       <set name="simulation/force-output" value="0"/>
-    </event-->
+    </event>
 
   </run>
 </runscript>

@bowers

bowers commented Oct 30, 2023

I'm still probing. Two observations:
With gcc, the reported Latitude value is -0 (negative zero) in frames where gcc and MSVC report different Perigee Arguments. In other frames, it's positive zero. Looking under the debugger, with gcc the value is a varying but minuscule negative number in these cases (e.g. -8.9772542999769379e-17), whereas with MSVC the value appears to always be positive 0.

I can get gcc to report same values as MSVC by replacing
if (eVector(eZ) < 0) PerigeeArgument = 360. - PerigeeArgument;
with
if (eVector(eZ) < -1E-8) PerigeeArgument = 360. - PerigeeArgument;
...because under gcc the Z component of eVector, like the Latitude, has a negative sign but a tiny absolute value.
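To make the effect of that sign test concrete, here is a small standalone sketch. It is not the JSBSim code: `perigee_argument_deg` and its `deadband` parameter are hypothetical names, and the inputs are plain doubles rather than `FGColumnVector3` values.

```cpp
#include <cmath>

const double kRadToDeg = 180.0 / std::acos(-1.0);

// Argument of perigee in degrees, given the cosine of the angle between the
// node vector and the unit eccentricity vector, plus the eccentricity
// vector's Z component. deadband = 0 reproduces the strict `< 0` test;
// deadband = 1e-8 corresponds to the `< -1E-8` variant.
double perigee_argument_deg(double cos_angle, double e_z, double deadband = 0.0) {
    double arg = std::acos(cos_angle) * kRadToDeg;
    if (e_z < -deadband) arg = 360.0 - arg;  // reflect when e points "south"
    return arg;
}
```

With a cosine corresponding to 158.461767 degrees, an `e_z` of exactly 0.0 leaves the result at about 158.46, while an `e_z` on the order of -1e-16 (an illustrative magnitude, like the Latitude value quoted above) flips it to about 201.54 under the strict test. With a deadband of 1e-8 both inputs give about 158.46.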

@seanmcleod
Member

So as a complete orbital mechanics layman, shouldn't the Argument of Perigee be 0 all the time, given the equatorial orbit with an inclination of 0?

So in my mind PerigeeArgument should be set to 0 in the else statement below, and in this case shouldn't be computed lower down.

if (abs(Inclination) > 1E-8) {
  N = Z * angularMomentum;
  RightAscension = atan2(N(eY), N(eX))*radtodeg;
  N.Normalize();
}
else {
  RightAscension = 0.0;
  N = {1., 0., 0.};
}

@seanmcleod
Member

seanmcleod commented Oct 30, 2023

https://oer.pressbooks.pub/lynnanegeorge/chapter/chapter-3-the-classical-orbital-elements-coes/

Note the argument of perigee is not defined if there is no ascending node, in other words if the orbit is equatorial with an inclination of 0 or 180 degrees. It is also not defined if the orbit is circular, or has an eccentricity of 0, since there is no perigee in a circular orbit.

So we should be detecting all these cases and setting the argument of perigee to 0.
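One way such a guard could be written, as a sketch only. This is not the change from the pull request; the function name is hypothetical, and the 1e-8 threshold is borrowed from the existing Eccentricity and Inclination checks quoted earlier.

```cpp
#include <cmath>

// Returns the argument of perigee in degrees, or 0 when the element is
// undefined: equatorial orbit (inclination 0 or 180 deg) or circular orbit.
// `raw_deg` is whatever the general formula produced.
double guarded_perigee_argument_deg(double raw_deg, double inclination_deg,
                                    double eccentricity, double eps = 1e-8) {
    const bool equatorial = std::fabs(inclination_deg) < eps ||
                            std::fabs(inclination_deg - 180.0) < eps;
    const bool circular = eccentricity < eps;
    return (equatorial || circular) ? 0.0 : raw_deg;
}
```

For the ball_orbit case (inclination 0, eccentricity 0.001507) this returns 0 regardless of which branch the sign test took, so the gcc and MSVC outputs would agree.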

I'll submit a pull request to fix this.

@jonsberndt
Contributor

jonsberndt commented Nov 10, 2023

To augment @bowers's post, we recently taught SPEC CPU's verification harness to tolerate coterminal angles. Therefore, if the MSVC output for Argument of Perigee were reported as -158.5° instead of +158.5°, that would be coterminal with +201.5° (which is the "golden reference output") and we would not be in this predicament because the output would verify. So perhaps that gives a clue as to what is happening.

Another way to maybe get around the argument of perigee problem is to consider using the new script I committed that has an orbit with a non-zero inclination:

./JSBSim scripts/ball_orbit_phase.xml

Note that this script requires an update to the code and a recompile.

@seanmcleod
Member

@bowers @heshpdx I've just merged in the pull request with regards to the argument of perigee.

@heshpdx
Contributor Author

heshpdx commented Aug 21, 2024

I noticed that JSBSim runs have an unusually large number of system calls (as measured by strace). I looked further and it turns out the vast majority are write syscalls. The big culprits are scripts Short_S23 and ah1s_flight_test.

I debugged and figured out the reason. Every time we print a line to stdout, we also flush it. Each flush invokes write() and that is not scalable when running hundreds of copies of jsbsim. It turns out that we can mitigate this easily by replacing all the endl with "\n". This way we get the newline into the I/O buffer but it isn't flushed and write() is only called in a lazy way when the system buffer fills up.

       for (auto param: OutputParameters)
-        cout << "      - " << param->GetName() << endl;
+        cout << "      - " << param->GetName() << "\n";

I went through and ran sed -i 's/\ endl/\ \"\\n\"/g' * in all the directories, and this reduced the number of writes to be just 4% of the baseline! The 37000 total write calls for my ten command lines went down to 1400, and the output still matched exactly. However since the text is buffered, my screen was not updating as fast; so depending on the user experience you need (embedded/online versus background/offline), you can decide what is better. JSBSim does a lot of small printouts, which is why it became an outlier to investigate.

The reason why this is important: when SPEC CPU users run the benchmark, they will typically spawn one copy per core, so for a 192-core system you will have 192 app instances all on top of a single operating system handling the I/O. So if we have a large storm of small writes, the operating system would become the bottleneck for the benchmark. We want to keep the OS out of the way as much as possible, so we can measure system hardware scalability. That is why I did this conversion. This brings us closer to proposing Step 4 completion for the benchmark candidate.

Anyway, just sharing this tidbit with the community in case it helps. I could provide a PR if you are interested, but it is super easy to do yourself with the sed cmdline.
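The difference between `std::endl` and "\n" can be seen without strace. The sketch below uses a hypothetical `CountingBuf` (not part of JSBSim) that counts how often the stream asks its buffer to flush; on a real stdout each of those requests can turn into a write() call.

```cpp
#include <ostream>
#include <sstream>

// String buffer that counts flush requests from the owning stream.
class CountingBuf : public std::stringbuf {
public:
    int flushes = 0;
protected:
    int sync() override {
        ++flushes;                        // std::endl and flush() end up here
        return std::stringbuf::sync();
    }
};

// Writes `lines` lines ending in std::endl or "\n" and returns the number
// of flush requests the buffer received.
int count_flushes(int lines, bool use_endl) {
    CountingBuf buf;
    std::ostream out(&buf);
    for (int i = 0; i < lines; ++i) {
        out << "      - param" << i;
        if (use_endl) out << std::endl;   // newline plus flush
        else out << "\n";                 // newline only
    }
    return buf.flushes;
}
```

`count_flushes(1000, true)` reports one flush per line, while `count_flushes(1000, false)` reports none, since nothing forces the buffer out until it fills or the stream is flushed explicitly.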

@seanmcleod
Member

Coincidentally @bcoconni and myself have just been looking at the performance impact of flush(), see - #1136 (comment) and the recent pull request #1142 that was merged.

The new logging implementation is still in progress and we'll take this into account.

Also in terms of launching hundreds of instances of JSBSim like you mention I had made the following mention in #1094 (comment) when the initial logging design was being proposed:

Thinking particularly for the use case of using JSBSim not in real-time mode, but while being used for example by reinforcement learning setups etc.

So it is something we are aware of handling, i.e. large scale high performance cases and will continue to work on.

For example, in this sort of case:

(base) C:\source\jsbsim>$env:JSBSIM_DEBUG=0
(base) C:\source\jsbsim>measure-command { .\Release\JSBSim.exe --script=scripts\ah1s_flight_test.xml }
9.107s

Should there be a way to suppress all output except, say, exception messages, i.e. not even build the strings, even if they're going to be well buffered?

@seanmcleod
Member

The 37000 total write calls for my ten command lines went down to 1400

When you say "ten command lines" do you mean a script that runs 10 different JSBSim script examples?

I had a quick look at ah1s_flight_test.xml, ran it and re-directed the output. Total of 1,145 lines, so assuming a worst case of a flush for every single line that would be 1,145 flush calls. So if ah1s_flight_test.xml is one of 10 script runs making up 37,000 write calls there must be other scripts with a lot more output, or you're also running with some level of debug output as well?

The majority of the output from ah1s_flight_test.xml are these state reports every 60s.

State report (Event 26) executed at time: 2700.000000
    velocities/vc-kts = 151.830680
    fcs/SFD/trim/v_dir_kts = 163.383721
    position/h-agl-ft = 2731.074868
    position/h-sl-ft = 5014.598617
    attitude/heading-true-rad = 3.141579
    aero/alpha-deg = -5.744751
    aero/beta-deg = -0.141843
    aero/mu = 0.367578
    aero/qbar-psf = 77.841712
    aero/force/D0_fus = 809.164597
    velocities/u-aero-fps = 274.374860
    velocities/v-aero-fps = -0.682678
    velocities/w-aero-fps = -27.602704
    velocities/h-dot-fps = -0.000002
    propulsion/engine/rotor-rpm = 324.000000
    position/distance-from-start-mag-mt = 203791.502611
    propulsion/engine/torque-lbsft = 13780.905888
    propulsion/engine/bell-torque-sensor-psi = 49.998568

We haven't updated the script code yet to make use of the new logging functionality, which does include a level of buffering before issuing a flush.

With the property output coming from this code:

for (j=0; j<thisEvent.NotifyProperties.size();j++) {
  cout << " " << thisEvent.DisplayString[j] << " = "
       << thisEvent.NotifyProperties[j]->getDoubleValue();
  if (thisEvent.NotifyKML) cout << " <br/>";
  cout << endl;
}

I changed the endl on line 512 to "\n" and it made virtually no difference in runtime.

@heshpdx
Contributor Author

heshpdx commented Aug 22, 2024

Yes, I have ten different cmdlines that are run in order, and ah1s is one of them. I modified some of the scripts, e.g. decreasing dt, so the total runtime is around 3 minutes for the total set. The 37000 write syscalls was the aggregate for the ten scripts. I am just running with the default logging settings, nothing extra; but I do need to keep it enabled since that facilitates benchmark validation.

Thanks for sharing the info about what is happening on the mainline. The modern logging and buffering is the way to go. I just cloned your latest and ran it against my ah1s script using strace:

$ strace -f -o ah1s_upstream.log ../../jsbsim/build/src/JSBSim --nohighlight scripts/ah1s_flight_test.xml
$ strace -f -o ah1s_spec.log  ./JSBSim_base.gcc-O3-profiling-64 --nohighlight scripts/ah1s_flight_test.xml
$ awk '{print $2}' ah1s_upstream.log | sort | uniq -c | sort -nr | head
   3413 write(1,
     75 newfstatat(AT_FDCWD,
     57 openat(AT_FDCWD,
     37 read(3,
     19 close(3)
     13 read(4,
     13 close(4)
      8 mmap(NULL,
      7 newfstatat(3,
      2 brk(NULL)
$ awk '{print $2}' ah1s_spec.log | sort | uniq -c | sort -nr | head
    103 write(1,
     74 newfstatat(AT_FDCWD,
     44 openat(AT_FDCWD,
     37 read(3,
     19 close(3)
     13 writev(1,
     13 getpid()
      8 mmap(NULL,
      7 newfstatat(3,
      2 brk(NULL)

My copy has the replacements of "\n" for std::endl, and we can see the delta in write while the rest is the same. I will share that I haven't rebased to upstream since starting in Feb 2023. If there are big algorithm changes which impact performance or other behaviors, that would motivate a rebase. But I'm hesitant just because "everything works" right now across all the different platforms and compilers so the ROI would be low. SPEC CPUv8 does use C++17 as well so at least that should not be a problem. Let me know your thoughts.

@seanmcleod
Member

seanmcleod commented Aug 22, 2024

Okay, so if not explicitly set the debug level is set to 1.

short FGJSBBase::debug_lvl = 1;

And with it set to 1, that includes the echoing of the config files, so the redirected output from the ah1s script now totals 2,889 lines, up from the 1,145 lines I'd seen previously with a debug level of 0.

So assuming a flush/write for every output line, then the 2,889 lines makes up the vast majority of the 3,413 writes your trace shows above.

@heshpdx
Contributor Author

heshpdx commented Sep 13, 2024

I ran up coverage metrics using gcov. It may be useful to this team, so I am attaching it here: fdm.zip
Just unzip and point your browser at the index.html file. The source tree is annotated with hit counts for each line of code. The source I measured is the SPEC CPU drop, which started from the upstream sometime last year and added some cherry picked items as we worked through the issues above... but the core is basically the same.

Coverage reports have helped us in various efforts including portability as well as performance analysis. On the latter, I found some quick optimizations that I will share in a PR shortly.

@heshpdx
Contributor Author

heshpdx commented Sep 27, 2024

Hi, another crazy behavior divergence was found when building and running on an IBM PowerPC machine with the OpenXL compiler in big-endian mode (all previous testing we have done has been on little-endian).

The program builds and runs, and most of the time it provides the same answers within tolerance. But in a couple of cases, the simulation does not terminate properly. For the case of c3105.xml, we want the simulation to stop after the plane crashes. In the case of IBM, it appears that it does not exit after the termination event, and instead continues to simulate until the time N specified in the <run end="N"> directive in c3105.xml. We also see this case in weather-balloon.xml, where IBM keeps simulating even after the balloon is on the ground and deflated, up until the script timeout much later.

Here is a vimdiff between the c3105 golden output on the left, and the IBM output on the right. We offer tolerance on the FP number difference, but the extra lines are what triggered the verification error.
[image: vimdiff of the c3105 golden output (left) against the IBM output (right)]

Can you provide any hints on how we can debug this? I know we can increase dbg_level to output more information, but this is a case of "not getting a signal". Which module takes care of recognizing when to tear down the simulation? Thanks!

@seanmcleod
Member

Hmm, so if you look at the c3105.xml script it has a Crash! event, which has a simulation/terminate property set to terminate the simulation.

jsbsim/scripts/c3105.xml

Lines 86 to 96 in 8e4803b

<event name="Crash!">
  <description>
    We hit the ground
  </description>
  <condition logic="AND">
    simulation/sim-time-sec ge 100
    position/h-agl-ft le 10
  </condition>
  <set name="simulation/terminate" value="1"/>
  <notify/>
</event>

Now in your output you can see that both the non-PowerPC and PowerPC versions trigger this crash event at the same time 2248.072, but if the PowerPC version isn't quitting the simulation then that means the code handling the simulation/terminate property set is failing for some reason, or isn't actually being executed.
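
To make the symptom concrete, here is a toy loop (not JSBSim code; every name and number in it is made up for illustration) where the crash condition fires on schedule but the run only stops early if the flag write is actually seen by the loop. Otherwise it keeps going until the end step, which is roughly what the IBM output looks like.

```cpp
// Toy model only: names, step counts and altitudes are invented.
// Returns the step at which the loop stops.
int steps_until_stop(bool flag_write_is_seen) {
  const int end_step = 300;   // stands in for <run end="N">
  int h_agl_ft = 1000;        // start altitude, dropping 5 ft per step
  bool terminate = false;     // stands in for simulation/terminate
  int step = 0;

  while (step < end_step) {
    ++step;
    h_agl_ft -= 5;
    if (h_agl_ft < 0) h_agl_ft = 0;

    // Same shape as the Crash! event: both conditions must hold (AND).
    const bool crash_event = (step >= 100) && (h_agl_ft <= 10);

    // The event always fires, but the flag only changes if the write
    // lands where the loop reads it.
    if (crash_event && flag_write_is_seen) terminate = true;

    if (terminate) break;     // early exit requested by the script
  }
  return step;
}
```

With these made-up numbers, steps_until_stop(true) stops at the crash and steps_until_stop(false) runs all the way to end_step.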

So is there some logic in the code handling the terminate request that is buggy due to big-endian mode? Or is it some compiler issue/difference not specific to endianness?

Hmm, so looking at the relevant code we have Terminate as an unsigned short and the property tie assumes a pointer to an int. Not sure why there was a specific choice of unsigned short in this case.

unsigned short Terminate;

instance->Tie("simulation/terminate", (int *)&Terminate);

Terminate = false;

jsbsim/src/FGFDMExec.cpp

Lines 442 to 445 in 8e4803b

if (Terminate) success = false;
return success;
}
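
A minimal sketch of why that cast could behave differently on a big-endian target, assuming a 32-bit int and a 16-bit short. The function names are hypothetical and this is not JSBSim code. It stores a 32-bit value into a 4-byte slot and then reads back only the first two bytes, which is where a 16-bit member at that address would live. memcpy is used so the demo stays well-defined; the real cast also writes two bytes past the member.

```cpp
#include <cstdint>
#include <cstring>

// True when the lowest-addressed byte of an integer holds its least
// significant bits (the layout used in all the earlier little-endian testing).
bool is_little_endian() {
  const std::uint32_t probe = 1;
  unsigned char first_byte;
  std::memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

// Models a 16-bit flag sitting at the start of a 32-bit slot: do a 32-bit
// store of `value` to the slot, then read the flag back as 16 bits.
std::uint16_t flag_seen_after_int_store(std::int32_t value) {
  unsigned char slot[sizeof(std::int32_t)];
  std::memcpy(slot, &value, sizeof value);   // what a write through int* does
  std::uint16_t flag;
  std::memcpy(&flag, slot, sizeof flag);     // what reading the short sees
  return flag;
}
```

On a little-endian machine flag_seen_after_int_store(1) gives 1, because the low-order bytes come first. On a big-endian machine the first two bytes are the high-order half, so the flag reads 0 even though 1 was stored, which would match Terminate never appearing to be set.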

@heshpdx
Contributor Author

heshpdx commented Sep 27, 2024

Wow, that cast is likely the problem. We could change it to (short *) or just make Terminate be an int. Thank you for the quick insight!

@seanmcleod
Member

Or a third option, which I just tried successfully on x64 MSVC running c3105.xml, is to change it to a bool.

bool Terminate;

instance->Tie("simulation/terminate", (bool *)&Terminate);

@seanmcleod
Member

Hmm, so I noticed the following while glancing at this.

// FGFDMExec.h
bool holding;
unsigned short Terminate;
bool HoldDown;

void SetHoldDown(bool hd);
bool GetHoldDown(void) const {return HoldDown;}

// FGFDMExec.cpp
holding = false;
Terminate = false;
HoldDown = false;

instance->Tie("simulation/pause", (int *)&holding);
instance->Tie("simulation/terminate", (int *)&Terminate);
instance->Tie("forces/hold-down", this, &FGFDMExec::GetHoldDown, &FGFDMExec::SetHoldDown);

So I'm thinking we should look at making this more type-consistent.

bool Terminate;

instance->Tie("simulation/pause", (bool *)&holding);
instance->Tie("simulation/terminate", (bool *)&Terminate);

I'll test it out on x64 locally, can you test the proposed changes out on PowerPC?
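
As a rough illustration of what type consistency buys here, below is a sketch of a tiny stand-in property manager. MiniPropertyManager and its Set method are hypothetical and are not JSBSim's actual classes. The idea is that when the member type matches a Tie overload, the call site needs no cast, and a mismatched pointer type fails at compile time instead of misbehaving on one platform at run time.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-in, for illustration only; not the JSBSim API.
class MiniPropertyManager {
public:
  // One overload per supported storage type, so no cast is needed.
  void Tie(const std::string& name, bool* p) {
    setters_[name] = [p](int v) { *p = (v != 0); };
  }
  void Tie(const std::string& name, int* p) {
    setters_[name] = [p](int v) { *p = v; };
  }
  // Any other pointer type (e.g. unsigned short*) is a compile error.
  template <typename T>
  void Tie(const std::string&, T*) = delete;

  // Writes `value` through the tied pointer; false if the name is unknown.
  bool Set(const std::string& name, int value) {
    auto it = setters_.find(name);
    if (it == setters_.end()) return false;
    it->second(value);
    return true;
  }

private:
  std::map<std::string, std::function<void(int)>> setters_;
};
```

With bool members, pm.Tie("simulation/terminate", &Terminate) picks the bool overload directly, while an unsigned short member would be rejected by the deleted template unless someone adds a cast back in.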

@bcoconni
Member

The variable has been defined as a short ever since the terminate feature was introduced (commit cb1cdc4), and honestly I'm struggling to see what the point of using a short is.

@seanmcleod maybe the member Terminate should also be changed to bool in addition to the changes you're suggesting?

@bcoconni
Member

@seanmcleod maybe the member Terminate should also be changed to bool in addition to the changes you're suggesting?

Oh! You already did that. Sorry...

@ZarkoT

ZarkoT commented Sep 30, 2024

I was the one who originally reported the BE problem to @heshpdx, and he pushed the fix to our internal repo. I tried it and it worked on AIX Power10 big-endian. Thanks very much for the quick investigation and fix, this was great.

@seanmcleod
Member

@ZarkoT out of interest what are you using JSBSim for on AIX Power10? Or were you just using it for SPEC benchmarking?

@ZarkoT

ZarkoT commented Sep 30, 2024

@seanmcleod I'm just using it through SPEC.
