CHANGELOG
OpenSpeedShop-2.3 release
We have moved from using GNU autotools to build OpenSpeedShop
to the cmake build tool. There were additional refinements and
fixes to the initial deployment in version 2.2.
Many changes to the Component Based Tool Framework
(CBTF) software to support the usage of CBTF in the OpenSpeedShop
tools.
A new experiment (omptp) that targets displaying OpenMP specific
information for applications that use OpenMP. Details are:
* Report task idle, barrier, and barrier wait times per OpenMP thread
and attribute those times to the OpenMP parallel regions.
Refinements to the memory analysis (mem) experiment. Details are:
* Tracks memory allocation calls whose memory is never later freed (potential leaks).
* Records any memory allocation event that sets a new high-water mark of allocated
memory for the current thread or process.
* Creates an event for each unique call path to a traced memory call and
records the total number of times this call path was followed, the
max allocation size, the min allocation size, the total allocation,
the total time spent in the call path, and the start time for the first call.
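The bookkeeping described in the three items above can be illustrated with a minimal Python sketch. This is not the mem collector's code: the names MemTracker, PathStats, on_alloc, and on_free are hypothetical, and a call path is assumed to be a tuple of function names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

CallPath = Tuple[str, ...]  # assumption: a call path is a tuple of frame names


@dataclass
class PathStats:
    """Aggregate record for one unique call path (hypothetical layout)."""
    count: int = 0
    max_size: int = 0
    min_size: int = 0
    total_size: int = 0
    total_time: float = 0.0
    first_start: float = 0.0


class MemTracker:
    """Illustrative tracker for leaks, high-water marks, and per-path totals."""

    def __init__(self) -> None:
        self.paths: Dict[CallPath, PathStats] = {}
        self.live: Dict[int, int] = {}  # address -> size of live allocation
        self.current = 0                # bytes currently allocated
        self.high_water = 0             # largest value of self.current seen
        self.high_water_events: List[Tuple[CallPath, int, int]] = []

    def on_alloc(self, path: CallPath, addr: int, size: int,
                 start: float, duration: float) -> None:
        stats = self.paths.get(path)
        if stats is None:
            # First call along this path: seed min size and first start time.
            stats = PathStats(min_size=size, first_start=start)
            self.paths[path] = stats
        stats.count += 1
        stats.max_size = max(stats.max_size, size)
        stats.min_size = min(stats.min_size, size)
        stats.total_size += size
        stats.total_time += duration

        self.live[addr] = size
        self.current += size
        if self.current > self.high_water:
            # This allocation set a new high-water mark, so record it.
            self.high_water = self.current
            self.high_water_events.append((path, size, self.current))

    def on_free(self, addr: int) -> None:
        # Unknown addresses are ignored in this sketch.
        self.current -= self.live.pop(addr, 0)

    def potential_leaks(self) -> Dict[int, int]:
        # Allocations that were never freed.
        return dict(self.live)
```

A tool would call on_alloc and on_free from its allocation wrappers and read paths, high_water_events, and potential_leaks() when the run ends.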
Updates to the cuda experiment:
* Periodically samples both CPU and GPU hardware performance counter events.
* Traces all NVIDIA CUDA kernel executions and the data transfers between main
memory and the GPU.
* Records the call sites, time spent, and data transfer sizes.
* New graphical user interface focused on the cuda experiment.
Fixes to numerous issues and bugs. Some of note:
* Fix issue where clicking on StatsPanel stats lines would fail to show the
view because the CLI cannot handle STL names or names containing ::.
This mod creates a set of mangled and demangled name pairs, and then when a
user clicks on a stats line we look for a mangled name and send that name
to the CLI instead of the demangled name. This avoids the issue and allows
more success for users when trying to use a view where a function name
is required.
* The original code for determining whether an executable is an MPI executable
depended on the Dyninst symtabAPI interface, but it did not find libmpi.so in some
applications, like simpleFoam from OpenFOAM. Dyninst sets the list of
dependencies from the DT_NEEDED entries in the binary. Dyninst does not
follow transitive dependencies, which ldd does. If you run readelf -d,
you should see NEEDED entries that correspond to the dependencies Dyninst
finds; the Symtab model is (at present) to represent what is present in the
binary, not anything that is contingent on where it runs. We have replaced
the Dyninst usage with code that runs the ldd command and parses the library
list to see if libmpi is present. We also use this same routine to find
libgomp and/or libomp5, the OpenMP runtime libraries (GNU and Intel
respectively).
* On some platforms the PBS supplied node names are host.dns.domain and the
front-end node name is only a host name. This causes problems when
using the two differing node name forms. So, here we reduce the PBS
supplied names to match the host only form so we have a successful topology
created.
* Fix issue that prevented cuda experiment from working with mpi
applications. Essentially, for mpi and mrnet, if any thread starts
before the main thread has attained the mpi rank value used
for ltwt mrnet connections, we must wait until the main thread
connects to mrnet before attempting to send the attached thread
message.
* Adds a --offline mode to OSS when configured for "cbtf"
instrumentor. This will deprecate the OSS collectors and runtimes
in the future as all collectors will be supported by the cbtf-krell
collection infrastructure. This hardens the dependency on cbtf-krell
for OSS regardless of the configured instrumentor.
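The ldd-based library check described in the fixes above can be sketched roughly as follows. This is an illustration in Python, not the OpenSpeedShop routine: the helper names parse_ldd_libraries, has_library, and run_ldd are hypothetical, and the parsing assumes the usual "name => path (address)" line layout that ldd prints on Linux.

```python
import subprocess
from typing import List


def parse_ldd_libraries(ldd_output: str) -> List[str]:
    """Return the base names of the libraries listed in ldd output text."""
    names = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Typical lines look like:
        #   libmpi.so.12 => /opt/mpi/lib/libmpi.so.12 (0x...)
        #   /lib64/ld-linux-x86-64.so.2 (0x...)
        # The first field is the library name (or a path to it).
        names.append(fields[0].rsplit("/", 1)[-1])
    return names


def has_library(ldd_output: str, stem: str) -> bool:
    """True if a library such as 'libmpi' appears as stem.so or stem.so.N."""
    for name in parse_ldd_libraries(ldd_output):
        if name == stem + ".so" or name.startswith(stem + ".so."):
            return True
    return False


def run_ldd(executable: str) -> str:
    """Run ldd on an executable and return its standard output."""
    # ldd resolves transitive dependencies, unlike the DT_NEEDED entries alone.
    result = subprocess.run(["ldd", executable],
                            capture_output=True, text=True, check=False)
    return result.stdout
```

With these helpers, has_library(run_ldd(path), "libmpi") asks whether libmpi appears anywhere in the resolved dependency list, including libraries pulled in indirectly.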
OpenSpeedShop-2.2 release
This is the initial release (2.2.0) of OpenSpeedShop version 2.2.
There were three beta versions tested on the Tri-lab systems and other sites prior to making the new release generally available.
The release consists of two versions of OpenSpeedShop. The first is the traditional "offline" version, which has been the default version for a few years now.
The second version is based on the Component Based Tool Framework (CBTF) project, which is used as the underlying mechanism for gathering the performance data and sending it to the client.
The interface for both versions is the same. Gather the data using the OpenSpeedShop convenience scripts:
For offline: ossusertime osspcsamp ossmpit ossmpiotf ossmpi ossiot ossio osshwctime osshwcsamp osshwc ossfpe
Additional experiment convenience scripts in OSS/CBTF: ossmpip, ossiop, osscuda, ossmem, osspthreads
Both the offline and CBTF versions of OpenSpeedShop have been tested on Linux x86_64 clusters.
NOTE: This version is for early adopters. There are some issues that need investigation:
a) Blue Gene version is getting warning messages about data blobs (the performance information data structures) being truncated.
b) Cray cbtf version has only been tested on one Cray platform/machine
c) ARM and Intel MIC platform support is with the offline version. We are still debugging the cbtf version.
d) The cuda experiment is getting revamped to work with the latest CUDA version interfaces. It was originally developed to work with CUDA-4.2.
Installation is still done via the "install-tool" as it was in the previous release. However, when building cbtf, please use "--build-cbtf-all" now in place of the previous release's "--build-cbtf" build phrase.
We are now using cmake as our build mechanism and that has altered how we build cbtf now.
We are now dependent on cmake, version 2.12 or higher.
The major changes in OpenSpeedShop 2.2 are outlined below in two sections: development tasks and maintenance tasks.
DEVELOPMENT TASKS:
Milestone 1: Add the ability to build CBTF using cmake technology (Generic Linux clusters)
Update and/or create the CBTF build mechanisms to use cmake technology to build and install all of the necessary CBTF components on generic Linux cluster platforms.
Milestone 2: Add the ability to build CBTF using cmake technology on specialty platforms (Cray, Intel MIC, GPGPU)
Update the CBTF build mechanisms to use cmake technology to build and install all of the necessary CBTF components on specialty platforms such as Cray and heterogeneous processor (Intel MIC, GPGPU) platforms.
Milestone 3: Add the ability to build Open|SpeedShop using cmake technology (Generic Linux clusters)
Update the Open|SpeedShop build mechanisms to use cmake technology to build and install all of the necessary Open|SpeedShop components on generic Linux clusters.
Milestone 4: Add the ability to build Open|SpeedShop using cmake technology on specialty platforms (Cray, Intel MIC, GPGPU)
Update the Open|SpeedShop build mechanisms to use cmake technology to build and install all of the necessary Open|SpeedShop components on specialty platforms such as Cray and heterogeneous processor (Intel MIC, GPGPU) platforms.
Milestone 5: Add the ability to build CBTF through the SPACK build package management tool on Generic Linux clusters.
Update the CBTF build mechanisms to use SPACK as the build and install mechanism to build and install all of the CBTF components.
Milestone 6: Add the ability to build Open|SpeedShop through the SPACK build package management tool on Generic Linux clusters.
Update the Open|SpeedShop build mechanisms to use SPACK as the build and install mechanism to build and install all of the Open|SpeedShop components.
Milestone 7: Open|SpeedShop CBTF version port to ARM
This is an ARM-specific milestone which consists of porting the CBTF version of Open|SpeedShop to the ARM (aarch64) processor system and committing any source code changes to the Open|SpeedShop repositories in order to build an Open|SpeedShop version that runs on the ARM platform through the Open|SpeedShop infrastructure and sends data blobs from the application to the client tool. This represents the work that needs to be done on the Open|SpeedShop side to support running on ARM systems.
NOTE (Work in progress) Milestone 8: Run a sampling experiment using the CBTF version of Open|SpeedShop (Execution task: ARM)
This is the execution task for one representative of the sampling experiments: pcsamp, usertime, hwc, hwctime, hwcsamp experiments. For this task, we chose the pcsamp experiment as the representative of the sampling experiment group. This sampling experiment (pcsamp) task shall consist of designing, developing, and committing code to the Open|SpeedShop repositories in order to build and execute the pcsamp experiment using the CBTF version of Open|SpeedShop on the ARM platform.
NOTE (Work in progress) Milestone 9: Run a call path unwinding experiment using the CBTF version of Open|SpeedShop (Execution task: ARM)
This is the execution task for one representative of the call path unwinding related experiments: usertime, io, mem, mpi, pthreads experiments. For this task, we chose the usertime experiment as the representative of the call path unwinding experiment group. This call path unwinding experiment (usertime) task shall consist of designing, developing, and committing code to the Open|SpeedShop repositories in order to build and execute the usertime experiment using both the offline and CBTF versions of Open|SpeedShop on the ARM platform.
Milestone 13: Deploy the CBTF based version of Open|SpeedShop on the TLCC cluster platforms at the Tri-labs.
Build, install, and test the latest CBTF based version of Open|SpeedShop on the Tri-labs cluster based systems. Verify that the applications of interest run and produce valid performance information when run with this version of Open|SpeedShop.
NASA related work:
Q1. Deliverables:
* Create initial CLI views for initial base GPU experiment
* Create initial CLI views for CPU/GPU Utilization experiment
* Enhance Intel MIC non-offload model support to include I/O and MPI experiments.
* Port Open|SpeedShop CBTF version infrastructure to NASA platforms of interest
Q2. Deliverables:
* Create initial CLI views for GPU Data Transfer / Compute Balance Experiment
* Create initial CLI views for Advanced GPU Internals Analysis.
* Enhance Intel MIC non-offload model support to include all offline mode experiments.
* Initial Intel MIC non-offload model support in the CBTF version for the pcsamp experiment.
* Continue to port Open|SpeedShop CBTF version infrastructure to NASA platforms of interest
MAINTENANCE TASKS:
Sep 2015:
* Fix an issue with generating an empty version of the OpenSS_Papi_Events.h file when building the front-end version of OSS for a targeted build. The PAPI on the front-end node should not be used to set up the available counters.
* Additional changes for building cbtf and oss with cmake. Also, fixes for incorrect, unprotected setting of LD_LIBRARY_PATH and PATH that have perhaps caused issues in the past.
* Add latest version of libunwind and patch to match previous patches.
* Update the libunwind version and clean-up.
* Add changes to allow the cray targeted build to use cray as the platform designator. Leaving in cray-xe and cray-xk for a transition period.
* Changes required to build targeted compute node components for use with cray based OSS/CBTF version.
* Remove old target specific install scripts to help eliminate install confusion.
* Remove old versions of papi from OSS root tree.
* Remove old versions of dyninst from the OSS root tree.
* Remove unsupported GUI auxiliary add on package from OSS root tree.
* Remove old MRNet packages from the OSS root tree.
* Remove old version of cmake from root.
* Update the install scripts for building cbtf and oss with cmake. Also, update the scripts for Intel MIC and ARM builds.
* Add new cbtf tarballs to root directory. Remove the old cbtf tarballs from the root directory.
* Add cmake build only OSS version 2.2 beta tarball.
* Prepare for LANL and SNL training trips by building and testing on Tri-lab tlcc2 platforms.
* Worked on three issues regarding moving to the latest Dyninst-9.0.0 official release:
* Not getting the head loop anymore via an API call with Dyninst version 9. I've been working on changing our interface to jump through hoops to get the same information that was available via a simple call in previous Dyninst versions.
* The dyninst versioning interface has also changed and at least one component is not available now. I'm working on those changes as well. Haven't checked this in because I didn't want to destabilize the software until the training trips are over.
* Dyninst will not build DyninstAPI component when targeting for the compute nodes on muzia at SNL.
* Worked on building OSS on the SC15 tutorial cluster at CreativeC which is partnering with our Tri-labs team to provide a cluster for the SC tutorial and the OSS booth at SC15.
* Worked with the Krell IT team to fix an issue with the OSS website:
* Every time I access the OpenSpeedShop website, I see that the website url has a caution sign by it and when I hover over this it says that "This website has no identity information".
* Worked with Bill Hachfeld on compile issues in cbtf-argonavis CUDA code when compiled on skybridge.
* I went through and deleted some of the # commented out lines in the *.in files in core/scripts. Then I removed the set for unused variables and commented the remaining set statements. Then checked in this code after a review by Don M.
* Working with Ben Welton at the University of Wisconsin on an issue with MRNet-5.0.0:
* We are seeing some funny behaviour with the mrnet-5.0.0 version of libxplat_lightweight_r.so. I found the case where if I move the BUILD directory away cbtf fails. The installed libxplat_lightweight_r.so doesn't seem to be recognized even though it is in the install directory. There is no issue like this with mrnet-4.1.0, so that is why I'm emailing you for advice. Here is what I'm doing: (details available on request).
* Add boost qualifier to shared_ptr reference in daeamonTool source file to avert compile error on muzia cray build.
* Add qualifiers to the equivalent function call in ArgoNavis::Base. The compilers on SNL are not able to resolve the function to ArgoNavis::Base on their own. wdh directed fix.
* Clean-up and add comments outlining how the statement in the CMakeLists.txt file will be used in the .in files.
* Allow dyninst-9.0.x to be successfully found by the cmake find_package calls. The version information is in a new version.h file and symLite is no longer packaged. We can still find
Aug 2015:
* Add a missing Boost Include Dir line to the libopenss-queries-cuda/CMakeLists.txt file.
* Continued interfacing with the Dyninst team over a couple of issues related to their new git tree version of Dyninst 9.0.0.
* Loop API has changed and we need to rewrite code in the loop detection section of OSS.
* Elf detection for new platforms that we will be supporting is missing. Dyninst team has said they will add those so we can continue to use Dyninst for processing symbols.
* Tested changes they added for versioning and symLite changes they added
* Add a fix for the recent FindBinutils.cmake change I made. Change the if DEFINED check to if and add comments to explain why.
* Update the autotools m4 files to return <component_name>_LIBDIR so it can be used to create cbtf-krell related scripts based on the actual library suffix being used. It turns out there are multiple versions of each m4 file. I updated the files so that all the m4 files match across the subdirectories. That is why there are several files being checked in. I tested the autotools build and executed several experiments with the autotools build to verify.
* Reported problems with the 5.0.0 version of MRNet to the University of Wisconsin MRNet team: When I try to use MRNet 5.0.0, I see these issues:
The Types.h:# define MRNET_VERSION_MAJOR 4 - This is minor, but our FindMRNet.cmake file is doing some sanity checking and tripping up because we think this should be a 5.
include/mrnet/Network.h is asking for libi/libi_api.h but the default MRNet build is not installing libi/libi_api.h. Is this supposed to be installed? Or is Network.h not supposed to reference libi_api.h if I don't ask for libi startup?
* Reported dangling link on LLNL website to Blaise Barney: I was walking through your tutorial on https://computing.llnl.gov and came across a reference to OpenSpeedShop that has changed slightly. The Krell IT team has changed vendors for supporting the OpenSpeedShop website. There is now no "/wp" at the end of the URL. So, in the https://computing.llnl.gov/tutorials/parallel_comp/#DesignPerformance section, http://www.openspeedshop.org/wp/ should be changed to: https://openspeedshop.org/
* Review and enhance Quick Start Guide and Flyer for OpenSpeedShop, work on tutorial slides for SC15 joint tutorial on OpenSpeedShop with laboratory personnel (Martin S, Mahesh R).
* Add latest mrnet release that is announced to support ARM. Also add patch that contains fixes for problems found while trying to build and use with CBTF and OSS.
* Worked with Dyninst team on Dyninst versioning. This is important because the API changed from 8.2.1 to 9.0.0 and we need a way to distinguish between those versions in our OSS/CBTF code. This was fixed by the Dyninst team and tested by me.
* If Dyninst is installed in its own install prefix then we need to export its library path in the cbtf mrnet commnode script. This is tested on my laptop and on NASA pfe platform.
* Add find cbtf argonavis cmake file to follow the pattern we are using for cbtf and cbtf-krell.
* Add cmake changes that allow the build of the cuda view, cuda client collector portion, cuda convenience scripts, and changes to use FindCBTF-Argo.cmake.
Jul 2015:
* Change the find binutils configuration routine to favor finding our libiberty_pic.a library. Also, if that doesn't exist then favor the normal libiberty.a library when searching for the Iberty static library.
* Change configuration for OpenSpeedShop to use the library directory variable instead of using the LIBDIR setting for the hardware platform. Some build systems do not follow the convention, as they probably should.
* Interfaced with the Dyninst team over a couple of issues related to their new git tree version of Dyninst 9.0.0.
* Loop API has changed and we need to rewrite code in the loop detection section of OSS.
* Elf detection for new platforms that we will be supporting is missing. Dyninst team has said they will add those so we can continue to use Dyninst for processing symbols.
* Add documentation in the osslink help per a request from Koushik Ghosh. The dash-i option was not in the help output.
* Add missing LTDL variables discovered when libltdl libs and includes were not in their normal default locations on the babbage Intel MIC testbed.
* Fix issues that caused the selection of bfd symbol resolution to be negated by add_definitions calls that set the OPENSS_USE_SYMTABAPI=1 define globally. Now we use the set_target_properties COMPILE_DEFINITIONS argument to keep the options separated better. Also, set the valid boost version back to 1.41. I had inadvertently changed it to 1.42 by checking in that change by accident.
* Intel MPI implementation has a libmpi.so file but not libmpich.so. I added recognition for libmpi.so and removed the PATH line.
* If papi is targeted then do not create the OpenSS_Papi_Events.h file populated via papi_avail because the papi_avail will not run on the front-end node. So, be more restrictive on the file creation. Make decisions based on the RUNTIME_DIR and RUNTIME_ONLY variables. This may need to be refined in order to give users information about the compute node papi events that are available.
* Add recognition of Intel MIC MPI implementation to the offline.py.in file. Look for Intel MIC MPI symbol and then set up the mpi driver variable etc. similar to the other MPI implementations OSS supports.
* Examined the memory experiment at length, looking for more meaningful information to present to the user.
June 2015:
* Fix for supporting hwc experiments on static binaries. Problem was undefined references to __wrap_dlopen and __wrap_dlclose when osslink was used to link the OSS collectors into the static binary. Problem found on the Cray muzia at Sandia as well as Crays on DOD systems.
* Took the default cbtf-argonavis directory and added and changed a few files in order to work in the separated install directory environment. The summary of the changes:
* Create cmake subdirectory in CUDA to hold the cmake files & update CMakeLists.txt file to reflect the cmake subdirectory change. This is mainly to match what we've done in OSS, cbtf, and cbtf-krell.
* Use the cmake provided find_package file to find CUDA and then infer the CUPTI location using CUDA_TOOLKIT_ROOT_DIR variable
* We now have a FindCBTF-Krell.cmake file instead of finding the individual messages, services, and core. So, we only call find_package(CBTF-Krell REQUIRED) now.
* Fix for a bug found at INL where the iot experiment asserted due to buffer size issues. This mod adjusts the path buffer to hold 16 paths of length 256 characters. Fix via dpm.
* Add changes to install the offline_monitor object file which gets used for supporting performance analysis of static binaries via the osslink script.
* Fix missing and incorrect substitution parameters and fix bad bash conditional syntax. Issues found on muzia.sandia.gov while testing the cmake build for a Cray target.
* Met a couple of times with Krell IT to help with the conversion of the Open|SpeedShop WordPress website to WPEngine (a managed WordPress provider) to have higher availability and easier management.
* Fix bad syntax in build script found while building on muzia at Sandia.
* Look for cbtf-krell root instead of cbtf-messages and cbtf-services when building OSS/CBTF in the install-tool build scripts.
* Updates to allow cbtf instrumentor to work with cbtf-krell installed in its own location. Update handling of --enable-debug to be default and be defined as -g -O2. Enable build and install of all collector plugins excluding runtime for instrumentor=cbtf.
* Add missing target includes for the "plugin" part of this target. Not sure if that is a valid plugin to install even if autotools builds were installing it.
May 2015:
* Fix an issue with the osscompare script that will allow it to work better on a SLURM partition. Another fix is necessary to allow osscompare to work under a SLURM or PBS partition.
* Comment out code for a no longer used mrnet version, particularly the mrnet.py code in this file (init.py). This was causing osscompare to fail when executed on SLURM and PBS partitions because the mrnet.py code was trying to read (unsuccessfully) from the nodes in the partition list.
* Don and I fixed a couple of issues but there is still a problem with openMPI doing a “free” memory call, which is wrapped by the memory experiment. This causes a recursive loop. More investigation is needed.
* Investigated why the cmake build needs the skip_frame count to be 5 frames and the autotools based OSS build is only successful with the skip_frame count at 4 frames. No final determination for this issue.
* Continued to build and test the cbtf version of OpenSpeedShop in order to deploy this version as the default version of OpenSpeedShop, replacing the offline version.
* Update the offline specific section of the autoconf configure.ac file to set five cbtf-krell configure variables to false to appease autotools. Without setting these false the configure fails.
* Change the default build setting for the automated build process to implicit tls.
* Changes to the autotools build system to support cbtf and cbtf-krell being in separate installation directories. This change properly sets up the directory paths for the usage of cbtf and cbtf-krell as individual installations in scripts and in makefiles that used to reference a combined installation directory. It also makes the tls usage default to implicit instead of explicit in services and core.
* Allow OSS to configure and use a cbtf-krell install that is in a separate location from cbtf itself. This introduces ax_cbtf_krell.m4 which handles all the needs of messages, services, and core.
* Allow saved view code views to use the command found in the views table view_commad as is and not parse it for -v. Add a simple addView method to Experiment class for use with the cbtf instrumentor to write computed metrics into the views table.
Apr 2015:
* Add new 1.5 version tarballs for cbtf and remove 1.1 versions, update the support for install script to build 1.5 versions. Add back in the libdwarf patch application code.
* Update the openspeedshop tarball with latest changes, including m4 files fixing the unwanted creation of /usr/lib64 references after configure.
* Did a large check-in of cmake changes for the OpenSpeedShop build
* Contacted the Dyninst support team about build issues with building Dyninst on laboratory machines.
* Worked on Cray specific static binary experiment issues where the libdl.so library was not being included in the osslink line causing dlopen and dlclose undefined symbol errors.
* Worked on a compile issue with osscollect.cxx that used to compile prior to the system update on rzmerl at LLNL. Eventually, Don and I decided to add a cast to the two NULL references that were failing to compile.
* Participated with LANL in testing scalability of the current OSS/CBTF version at LANL on moonlight with 4096 processors using the mpip experiment.
* Worked on a fix for a bad usertime bug in the offline version. Don M was able to figure it out and suggest a fix.
* Investigated TLS issues with the CBTF version of OpenSpeedShop. There were some duplicate TLSKey values that were causing failures.
* Fix an issue with explicit tls that is a result of previous refactoring code that split common code from the collectors into a monitor file and collector file, but did not unify the TLS key that was being used to manage the explicit tls use. Thanks to Don for his help on finding and fixing this one. Also, per Don's suggestion, change the TLS keys in the io, mpi, mem, and pthreads collectors to be unique and consistent across all the collectors.
* Fix an abort that occurs when there is no tls from the CBTF_GetTLS call in the explicit TLS model. We were checking for a debug flag using tls when there is no tls available.
* Remove fork used to run the user program under cbtfrun control. This is unneeded; simply calling system with a modified command line is sufficient. Much code cleanup and fixing formatting of some code additions to match the existing tab and space usage. Add more comments where needed to clarify the launching of the cbtf network with the user's program. Set proper defaults for some of the command line options. By Don Maghrak.
* For mpi jobs collectors should defer sending threads attached message until just before first data blob is sent. by Don Maghrak
* Do not set a default for topology, as that means you have a topology file already generated...
* Client must ensure that the connections file is written before allowing the ltwt BEs to be run and attempt to connect.
* If connection file does not exist, call abort.
* Add note to code that pads node names with 0 characters and disable that behaviour. If the need for padding is observed on any system then this must only be done under the conditions that require it.
Mar 2015:
* Add the latest papi release tarball (version 5.4.1).
* Add the ability to force the build of dyninst, if desired. We are finding it in /usr on some systems but it is an old version. This allows us to use --force-dyninst-build to force it to build our version.
* If messages-cuda is not specified do not default to specifying /usr in LDFLAGS and CPPFLAGS.
* If symtabapi is not specified or is found in /usr do not specify /usr in LDFLAGS and CPPFLAGS.
* Set the proper environment variable after building the krell externals version of boost. An incorrect variable was set, because of that MRNet did not use the krell externals version causing problems when executing cbtf.
* The detect install script was detecting dyninst proper but not detecting the dyninst libdir properly. This mod fixes that by adding a couple more checks for the libdir case.
* When creating the install directories for root, cbtf, and OSS, change the permission of the directories to 755 instead of letting the mkdir default to the umask settings.
* Add code to separate the ld flags and lib specification portions of the python m4 file. PYTHON_LIBS was created to contain -l<python lib and version>. The lib specification was removed from PYTHON_LDFLAGS. Additional change to not set PYTHON_LDFLAGS if python is in /usr/<lib or lib64>. Then use the new PYTHON_LIBS when building the cli.
* Add new 1.5 version tarballs for cbtf and remove 1.1 versions, update the support for install script to build 1.5 versions. Add back in the libdwarf patch application code.
* Work with the MRNet development team, Mahesh R and Don Maghrak to try to understand why our OpenSpeedShop CBTF version does not work at higher scale on chama at Sandia. We are continuing the process with no definitive answer at this time.
* Rebuilt the OpenSpeedShop CBTF versions at all the laboratories in order to debug the scaling issue seen on chama at Sandia.
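The install directory permission change listed above (755 rather than whatever the umask yields) can be illustrated with a short Python sketch. The install scripts themselves are shell scripts; the helper name make_install_dir is hypothetical and only shows the idea of setting the mode explicitly after creation.

```python
import os


def make_install_dir(path: str, mode: int = 0o755) -> str:
    """Create an install directory and force its permissions to `mode`."""
    # The mode passed to mkdir is filtered through the process umask, so a
    # restrictive umask (for example 077) would yield a private directory.
    os.makedirs(path, exist_ok=True)
    # Setting the mode explicitly makes the result independent of the umask.
    # Only the final directory is adjusted here; any intermediate directories
    # created by makedirs keep their umask-derived permissions.
    os.chmod(path, mode)
    return path
```

A shell script achieves the same effect by running chmod 755 on the directory after mkdir.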
Feb 2015:
* Updated the CBTF release on sourceforge from 1.1 to 1.5. Updated documents and build instructions. Tested installation and execution of new version.
* Rearrange the ordering to check if an openMP thread id exists before looking for existence of a POSIX thread id. This fixes the loadbalance view when looking under a rank for the load balance of the underlying threads.
* Update the PTGF and PTGF OpenSpeedShop tarballs with fixes from Dane. Fix for double-free crash on close. We are now (selectively) truncating file paths to 16 chars (unless the filename itself is longer). This is a feature used in the standard path display, and the stack trace. ToolTips are not truncated, so the user can see the full path, if desired. These mods also include a solution to the expansion of the stack trace into the tree views.
* Update code to build from OS-installed (rpm) versions of dyninst, libdwarf, papi, and xerces-c without having to build the so-called root versions. The issues were discovered when installing OSS and CBTF on a new Fedora 21 system installation.
* A few tweaks to the existing build scripts for finding components in their default installation locations.
* Fix compile issue found on BG/Q Vulcan platform at LLNL. Fix thanks to Don M (dpm). The define OPENSS_USE_DL_ITERATE was not being set for the Blue Gene platform.
* The proper library suffix was not being set in offline.py and ossrun when libmonitor is installed into /lib. Found while working with the spack install mechanism.
* Allow the cbtf build to find dyninst installed via OS distribution installation.
* Update the find symtabapi cmake routine to look in the subdirectory dyninst for the libsymtabAPI.so file.
* Update the build script for cbtf-krell to find papi and xercesc in /usr if they exist there.
* xplat_config.h was not being found when mrnet is installed in /lib. Found while working with the spack install tool.
* Fix issues with building cbtf-lanl that spack building exposed. There were references to cbtf-krell/core without the proper CPPFLAGS and LDFLAGS being set. Also, could not find mrnet when installed into lib instead of lib64.
* Fix some issues with the autotools based cbtf-krell build mechanisms found while doing spack testing. We were missing some CPPFLAGS settings that were masked by installing into the same prefix. Also, improve the finding of MRNet.
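The thread-id ordering change in the Feb 2015 list above (check for an OpenMP thread id before a POSIX thread id) can be sketched as follows. This is a minimal illustration of the lookup order only, not the OpenSpeedShop view code; the function name `thread_key` and the `"openmp"`/`"posix"` labels are hypothetical.

```python
from typing import Optional, Tuple


def thread_key(openmp_thread_id: Optional[int],
               posix_thread_id: Optional[int]) -> Optional[Tuple[str, int]]:
    """Return (kind, id), preferring the OpenMP thread id when one exists."""
    # Compare against None rather than testing truthiness:
    # 0 is a valid OpenMP thread id and must not fall through.
    if openmp_thread_id is not None:
        return ("openmp", openmp_thread_id)
    if posix_thread_id is not None:
        return ("posix", posix_thread_id)
    return None  # no thread information recorded


if __name__ == "__main__":
    print(thread_key(0, 140234))     # OpenMP id wins even when it is 0
    print(thread_key(None, 140234))  # falls back to the POSIX id
```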
OpenSpeedShop-2.1 release
1) MPI-3 standard support in the MPI wrappers.
2) Experimental CBTF instrumentor support, if built for CBTF instrumentation
a) 4 new experiments (under development) mem, pthread, iop, and cuda
b) MRNet based performance data transport mechanism.
3) Many bug fixes for the offline version.
4) Different installation mechanism based on install-tool.
Changes in 2.0.0 since 1.9.3.4
1) More general support for PPC platforms
2) More general support for Cray-XT/XE and Blue Gene platforms.
3) Updated man pages and convenience scripts
4) More general support for the online/dynamic version of OpenSpeedShop
5) TBD
Changes in 1.9.3.4 since the 1.9.3.3 release:
1) TBD
2) Incremental fixes toward support for PPC platforms
3) Incremental fixes toward support for the Cray-XT5 and BG/P platforms.
Changes in 1.9.3.3 since the 1.9.3.2 release:
1) Fix call stack unwinding bugs by patching the libunwind component
2) Incremental fixes toward support for PPC platforms
3) Incremental fixes toward support for the Cray-XT5 and BG/P platforms.
4) Fixes for better mpich2 and mvapich2 support for OpenSpeedShop experiments
on Fortran applications.
5) Make configuration changes so that MRNet 2.2 and Dyninst 6.1 are the default
versions for OpenSpeedShop.
Changes in 1.9.3.2 since the 1.9.3.1 release:
1) Removed all autogen created files. Now use libtool 2.2x,
automake 1.11, and autoconf 2.65 for development builds from cvs source.
2) Updates to better support mvapich, mpt, and openmpi.
Support for wrapping Fortran MPI calls was needed.
3) Add osslink command to support building static applications
with static collectors.
4) Additional updates to better support mvapich, mpt, openmpi, and Intel mpi (mpich2).
For Intel mpi, when building, set the OPENSS_MPI_MPICH2 environment variable to the path
to the Intel mpi installation directory.
Changes in 1.9.3.1 since the 1.9.3 release:
1) Many fixes to support OpenSpeedShop usage on FC11.
Changes in 1.9.3 since the 1.9.2 release:
1) Configuration changes to support creating target static collectors and a static runtime library for inclusion into static application binaries to gather performance data.
2) Improvements to the online version, where support for non-threaded MPI implementations is working now at higher processor counts.
3) Several bug fixes. A key bug that was fixed was a problem in performance gathering that occurred when hundreds of DSOs were loaded and unloaded.
4) Configuration improvements in a number of areas, including improved recognition of PAPI, MRNET, and DYNINST installations.
Changes in 1.9.2 since the 1.9.1 release:
1) Add a scrollview bar to the preference panel General page
2) Add toolbar and icons to the Customize Stats Panel for more ease of use.
3) Add an option to enable and disable the showing of the advanced (more robust) feature icons in the StatsPanel toolbar.
4) Use the above option to reduce the toolbar to a manageable number for normal users.
5) Add a feature to allow users to get at the various metrics (specific pieces of performance information) for each experiment by selecting check boxes corresponding to the metrics. The StatsPanel icon "OV", representing Optional View, will raise a dialog panel corresponding to the particular experiment. Check the boxes, and when "OK" is clicked a new StatsPanel view reflecting the choices will be generated.
6) Add the "TS" icon in the StatsPanel to bring up the Time Segment dialog box. This lets the user choose the part of the program's execution for which to view data: once a segment is chosen and the user clicks the "OK" button, only that portion of the program's performance data is shown.
7) Improvements to the MPICH2 and MVAPICH configuration. Add support for configuration of MVAPICH2.
8) Enhance the StatsPanel for comparisons to show the general toolbar and also to allow users to focus the StatsPanel on one of the experiments being compared. A dialog box will allow the selection of one of the experiments when the user selects a performance data view they would like to see.
9) Bug fix for an address range problem where for some programs the performance data was lost for the main application.
10) Performance improvements to generating call tree views.
11) Numerous improvements to the collector runtimes for speed and efficiency.
12) Changes to the openss command syntax accepted:
a) -offline is now optional on the openss command for running in offline mode. Offline is default.
b) openss -cli now brings up the CLI in offline mode
c) openss -cli -online brings up the CLI in online/dynamic mode
d) openss -f "executable" experiment_type now loads the executable, creates, and runs the experiment in offline mode.
e) openss -gui -f "executable" experiment_type now loads the executable and creates the experiment in offline mode, ready to run in the GUI.
f) openss -gui -online -f "executable" experiment_type now loads the executable and creates the experiment in online mode, ready to run in the GUI.
g) openss -f <openss_database_file_name>.openss still brings up the GUI with the performance data displayed in the GUI.
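The openss command-line rules listed above can be summarized in a small Python sketch. It is only a simplified reading of those rules, not the real openss option parser; the function name `describe_openss_args`, the returned labels, and the example arguments `./app`, `experiment_type`, and `results.openss` are hypothetical.

```python
def describe_openss_args(args):
    """Return (interface, mode) for an openss argument list.

    Encodes only the combinations described in the list above.
    """
    # Offline is the default; -offline is optional, -online switches modes.
    mode = "online" if "-online" in args else "offline"

    # The argument following -f is either an executable or a .openss database.
    target = None
    if "-f" in args:
        i = args.index("-f")
        if i + 1 < len(args):
            target = args[i + 1]

    if target is not None and target.endswith(".openss"):
        return ("gui", "view saved data")  # database file opens in the GUI
    if "-cli" in args:
        return ("cli", mode)
    if "-gui" in args:
        return ("gui", mode)  # experiment created, ready to run in the GUI
    if target is not None:
        return ("run", mode)  # load, create, and run the experiment
    return (None, mode)  # combination not covered by the list above


if __name__ == "__main__":
    print(describe_openss_args(["-cli"]))
    print(describe_openss_args(["-gui", "-online", "-f", "./app", "experiment_type"]))
```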
Changes in 1.9.1 since the 1.9 release:
1) Fix a misspelling in the Stats Panel regarding communicator metric settings, which caused the communicator to not be available for display in the Stats Panel.
2) Fix misnamed mpit metric references in the GUI and support more optional metric names in mpit view.
3) Fix a mismatch in the default setting for less restrictive compares. It is off by default, but it didn't honor the true setting until the preference was reset to true.
4) Remove import of tempfile. It is not currently used and was causing a runtime undefined error on Ubuntu 8.10 (intrepid).
5) Update the Makefile.am files for all the wizards to include -libopenss-guiobjects where the ArgumentObject dso is located. This was changed to support Ubuntu-8.10.
6) Modify the openssd shutdown process to work properly with MRNet 2.0. This is related to the online mode of operation. The daemons were not able to shut down cleanly before this change.
7) When no preference file has been written, return the default value for instrumentorIsOffline, so that the OpenSpeedShop wizards have offline as the default instrumentation mechanism.
8) Speedups for ossutil processing when processing symbols from the offline experiment. ossutil is the utility that converts raw data files written by running the offline version of OpenSpeedShop.
9) Fixes to the offline script to invoke openss to support mpiexec for mpich2 and SGI MPT MPI implementations.