Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align print format string for column names and units #126

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dmitrygx
Copy link

Format for column names and units started to be mismatched after changing "error" to "#wrong" in NCCL 2.13 (merged to master as a part of 51af557).

@AddyLaddy
Copy link
Collaborator

I can't see any misalignment in my tests:

#
#                                                              out-of-place                       in-place          
#       size         count      type   redop    root     time   algbw   busbw #wrong     time   algbw   busbw #wrong
#        (B)    (elements)                               (us)  (GB/s)  (GB/s)            (us)  (GB/s)  (GB/s)       
  1073741824     268435456     float     sum      -1   8160.7  131.58  230.26      0   8158.3  131.61  230.32      0
  2147483648     536870912     float     sum      -1    16191  132.64  232.11      0    16192  132.63  232.10      0
  4294967296    1073741824     float     sum      -1    32115  133.74  234.04      0    32106  133.77  234.10      0
  8589934592    2147483648     float     sum      -1    64086  134.04  234.57      0    64070  134.07  234.62      0
 17179869184    4294967296     float     sum      -1   128070  134.14  234.75      0   128042  134.17  234.80      0

@sjeaugey
Copy link
Member

That's because you don't have any data corruption. In the case of corruption, we could have up to 6 chars I guess hence the change.

If that is truly the idea, I agree it would be better. It's annoying to widen the line for something that should never happen, but if it happened, it would break any automatic parsing, including parsing supposed to report errors.

@dmitrygx
Copy link
Author

dmitrygx commented Jan 20, 2023

this change aligns format string:

  • for column names: "# %10s %12s %8s %6s %6s %7s %6s %6s %6s %7s %6s %6s %6s\n"
  • for units................:"# %10s %12s %8s %6s %6s %7s %6s %6s %5s %7s %6s %6s %5s\n"

to make them identical: "# %10s %12s %8s %6s %6s %7s %6s %6s %5s %7s %6s %6s %5s\n" and have the same width for #wrong column in units and measurements printed:

PRINT(" %7s %6.2f %6.2f %5g", timeStr, algBw, busBw, (double)wrongElts);
and
PRINT(" %7s %6.2f %6.2f %5s", timeStr, algBw, busBw, "N/A");

@dmitrygx dmitrygx force-pushed the topic/align_test_result_format branch from 22de050 to fd03a19 Compare January 20, 2023 07:33
@dmitrygx
Copy link
Author

dmitrygx commented Feb 2, 2023

@AddyLaddy @sjeaugey could you review pls?

@sjeaugey
Copy link
Member

sjeaugey commented Feb 2, 2023

The reason we replaced %5s by %6s (removing one space and adding one char to the column name) was because we needed that extra char given #wrong is 6 chars. If you unify everything on %5s then #wrong will be too large and cause a shift of the columns by one char.

So I'm still not sure what problem we're trying to solve here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants