coll: add coll_attr and comm subgroups #6590

Closed
hzhou wants to merge 15 commits

hzhou
Contributor

@hzhou hzhou commented Jul 11, 2023

Pull Request Description

Add a coll_attr parameter to replace the errflag parameter in internal collective interfaces. Make the lower 8 bits of coll_attr compatible with the lower 8 bits of the pt2pt attr, which avoids extra code to translate bits such as errflags when passing from collective to point-to-point. The next 8 bits are used for subgroup indexes, enabling group collectives without extra subcomms (which are expensive to maintain). In the future we may extend coll_attr to pass hints such as memory alloc kinds and algorithm choices.
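
As a rough illustration of the bit layout described above, here is a minimal sketch. The macro and function names (`COLL_ATTR_ERR_MASK`, `coll_attr_set_subgroup`, and so on) are hypothetical and are not taken from the MPICH source; the sketch also assumes that subgroup index 0 stands for the whole communicator.

```c
#include <assert.h>

/* Hypothetical layout, following the PR description:
 *   bits 0-7  : error flags, shared with the pt2pt attr
 *   bits 8-15 : subgroup index (0 is assumed to mean the whole communicator)
 * Names and values are illustrative only, not MPICH's actual definitions. */
#define COLL_ATTR_ERR_MASK        0xff
#define COLL_ATTR_SUBGROUP_SHIFT  8
#define COLL_ATTR_SUBGROUP_MASK   0xff

/* Extract the error-flag bits; these could be handed to pt2pt unchanged. */
static inline int coll_attr_get_errflag(int coll_attr)
{
    return coll_attr & COLL_ATTR_ERR_MASK;
}

/* Extract the subgroup index stored in bits 8-15. */
static inline int coll_attr_get_subgroup(int coll_attr)
{
    return (coll_attr >> COLL_ATTR_SUBGROUP_SHIFT) & COLL_ATTR_SUBGROUP_MASK;
}

/* Return coll_attr with its subgroup index replaced; other bits are kept. */
static inline int coll_attr_set_subgroup(int coll_attr, int subgroup)
{
    assert(subgroup >= 0 && subgroup <= COLL_ATTR_SUBGROUP_MASK);
    coll_attr &= ~(COLL_ATTR_SUBGROUP_MASK << COLL_ATTR_SUBGROUP_SHIFT);
    return coll_attr | (subgroup << COLL_ATTR_SUBGROUP_SHIFT);
}
```

Because the error bits occupy the same positions in both attribute words, a collective can mask off the lower byte and pass it straight to a point-to-point call without any translation step.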

Add a bcast smp_new algorithm that is similar to bcast smp but uses comm subgroups instead. Because we can construct lightweight custom subgroups, we can avoid the extra local send/recv or bcast step when root is not one of the "node roots". Instead of the node_roots_comm, we can construct an inter_group made of the actual local roots.
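
The idea of an inter_group made of the actual local roots can be sketched as follows. This is not the code from the PR: `build_inter_group` and the `node_of` table are hypothetical, and node ids are assumed to run from 0 to num_nodes-1.

```c
#include <assert.h>

/* Build the list of "local roots", one per node.
 * node_of[r] is the node id of communicator rank r (hypothetical input).
 * Each node is led by its lowest rank, except the root's node, which is
 * led by the root itself. leaders[] must hold num_nodes entries. */
static void build_inter_group(const int *node_of, int comm_size, int num_nodes,
                              int root, int *leaders)
{
    assert(root >= 0 && root < comm_size);

    for (int n = 0; n < num_nodes; n++) {
        leaders[n] = -1;
    }

    /* The root leads its own node, so no extra intra-node hop is needed. */
    leaders[node_of[root]] = root;

    for (int r = 0; r < comm_size; r++) {
        if (leaders[node_of[r]] == -1) {
            leaders[node_of[r]] = r;    /* first (lowest) rank seen on this node */
        }
    }
}
```

With six ranks on two nodes and root 4, the leaders would be ranks 0 and 4, so the inter-node broadcast can start directly at the root rather than first moving the data to rank 3.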

NOTE: the bcast smp_new is covered in the collective cvar tests.

[skip warnings]

Author Checklist

  • Provide Description
    Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
  • Commits Follow Good Practice
    Commits are self-contained and do not do two things at once.
    Commit message is of the form: module: short description
    Commit message explains what's in the commit.
  • Passes All Tests
    Whitespace checker. Warnings test. Additional tests via comments.
  • Contribution Agreement
    For non-Argonne authors, check contribution agreement.
    If necessary, request an explicit comment from your company's PR approval manager.

@hzhou hzhou changed the title coll: add collattr parameter coll: add coll_attr parameter Aug 11, 2024
@hzhou hzhou force-pushed the 2301_collattr branch 2 times, most recently from c63d810 to 7ad8696 Compare August 13, 2024 00:56
@hzhou
Contributor Author

hzhou commented Aug 13, 2024

test:mpich/ch3/tcp
test:mpich/ch4/ofi

All ✔️

@hzhou hzhou force-pushed the 2301_collattr branch 2 times, most recently from 8366cd2 to 48cff65 Compare August 13, 2024 14:09
@hzhou hzhou marked this pull request as ready for review August 13, 2024 14:09
@hzhou hzhou changed the title coll: add coll_attr parameter coll: add coll_attr and comm subgroups Aug 13, 2024
@hzhou hzhou requested a review from raffenet August 13, 2024 15:16
hzhou added 15 commits August 13, 2024 16:55
My desktop caught more spelling errors than the Jenkins check, likely
due to a newer version of the codespell package.
Make the bit patterns of these two err flags independent of the actual
values of MPI_ERR_OTHER and MPIX_ERR_PROC_FAILED. This allows the errflags
to fit easily into attribute bits. We'll fix the few usages in the next commit.
They are no longer the same as MPI_ERR_OTHER and MPI_ERR_PROC_FAILED.
Define MPIR_ERR_NONE, MPIR_ERR_PROC_FAILED, and MPIR_ERR_OTHER as
macros and remove the definition of MPIR_Errflag_t enum. We will replace
MPIR_Errflag_t with "int coll_attr" in the next commit. Using coll_attr
gives us more flexibility in extending the implementation with
additional attributes such as sub-group, memory kinds, and algorithm
hints.

A sub-group can be an index pointing to a group list in comm.
As the title says, trivial but messy.

Collectives use coll_attr, but pt2pt APIs use "int errflag".

Notably, MPIR_ERR_COLL_CHECKANDCONT works directly since it only does
a bitwise OR.

Reviewers: pay attention to changes to files outside src/mpi/coll/.
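
To see why a macro that only ORs bits keeps working once the error flag lives inside coll_attr, consider this sketch. The `ERRFLAG_*` values and the helper names are illustrative assumptions; the actual values of the MPIR_ERR_* macros are not given in this PR description.

```c
#include <assert.h>

/* Illustrative bit values; the real MPIR_ERR_* macro values may differ. */
#define ERRFLAG_NONE         0x0
#define ERRFLAG_PROC_FAILED  0x1
#define ERRFLAG_OTHER        0x2
#define ERRFLAG_MASK         0xff   /* lower 8 bits of coll_attr */

/* Record a failure from one step of a collective and keep going.
 * Since the flags are plain bits, OR-ing them in never disturbs the
 * non-error bits (such as a subgroup index) stored higher up. */
static int record_error(int coll_attr, int step_errflag)
{
    assert((step_errflag & ~ERRFLAG_MASK) == 0);
    return coll_attr | step_errflag;
}

/* True if any error bit has been recorded so far. */
static int has_error(int coll_attr)
{
    return (coll_attr & ERRFLAG_MASK) != ERRFLAG_NONE;
}
```

This only works because the flags are independent bit patterns; if they were still equal to error class numbers, OR-ing two different errors could produce an unrelated value.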
The multi-lead algorithm not only assumes the same number of ranks per
node, it also requires the ranks to be placed in exact round-robin order.
In particular, MPII_Comm_is_node_balanced is not a sufficient check.
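
A stricter check than "balanced" might look like the sketch below. It reads "round-robin" as cyclic placement (rank r on node r % num_nodes), which is an assumption; if the algorithm expects a different ordering, the comparison would change accordingly. The function name and the `node_of` table are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if ranks are laid out cyclically across nodes, i.e. rank r
 * lives on node (r % num_nodes) and every node holds the same number of
 * ranks. node_of[] is a hypothetical rank -> node id table with ids
 * 0..num_nodes-1. */
static bool is_round_robin(const int *node_of, int comm_size, int num_nodes)
{
    assert(num_nodes > 0);

    if (comm_size % num_nodes != 0) {
        return false;           /* not even balanced */
    }
    for (int r = 0; r < comm_size; r++) {
        if (node_of[r] != r % num_nodes) {
            return false;       /* balanced, perhaps, but in the wrong order */
        }
    }
    return true;
}
```

A layout such as {0, 0, 1, 1, 2, 2} has the same number of ranks on every node, so a balance check passes, yet it fails this ordering test.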
Store num_local and num_external in MPIR_Comm. Along with
internode_table, they help construct internode subgroups.
This is the same as num_external.
Lightweight struct to describe sub-groups of a communicator. They are
intended to replace the subcomms.
Add a coll_attr parameter to MPIC_Recv and MPIC_Irecv so that we can
enable subgroup collectives later.
Let MPIC Send/Recv routines check coll_attr for potential subgroup
attributes, effectively enabling group collectives.
Enhance the macro MPIR_THREADCOMM_RANK_SIZE to check coll_attr for rank
and size.
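
The subgroup commits above fit together roughly as in this sketch: a small descriptor per subgroup, a rank/size lookup driven by coll_attr, and a rank translation applied before the underlying send or receive. All struct, field, and function names here are hypothetical and do not reproduce MPIR_Subgroup or the MPIC routines; the sketch assumes the subgroup index sits in bits 8-15 and that index 0 means the whole communicator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subgroup descriptor; field names are illustrative. */
struct subgroup {
    int size;               /* number of processes in the subgroup */
    int rank;               /* this process's rank within the subgroup */
    const int *proc_table;  /* subgroup rank -> comm rank; NULL means identity */
};

/* Hypothetical communicator holding a small table of subgroups. */
struct comm {
    int size;
    int rank;
    int num_subgroups;
    struct subgroup subgroups[4];
};

#define SUBGROUP_INDEX(coll_attr) (((coll_attr) >> 8) & 0xff)

/* Rank and size as seen by a collective running with this coll_attr. */
static void coll_rank_size(const struct comm *c, int coll_attr, int *rank, int *size)
{
    int idx = SUBGROUP_INDEX(coll_attr);
    if (idx == 0) {
        *rank = c->rank;
        *size = c->size;
    } else {
        assert(idx < c->num_subgroups);
        *rank = c->subgroups[idx].rank;
        *size = c->subgroups[idx].size;
    }
}

/* Translate a rank used inside the collective into a communicator rank,
 * as a send/recv wrapper would do before calling the real pt2pt routine. */
static int to_comm_rank(const struct comm *c, int coll_attr, int rank)
{
    int idx = SUBGROUP_INDEX(coll_attr);
    if (idx == 0) {
        return rank;
    }
    assert(idx < c->num_subgroups);
    const struct subgroup *g = &c->subgroups[idx];
    assert(rank >= 0 && rank < g->size);
    return g->proc_table != NULL ? g->proc_table[rank] : rank;
}
```

The collective algorithm itself stays unchanged: it asks for rank and size, addresses peers by subgroup rank, and the translation happens at the point-to-point boundary.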
A copy of MPIR_Bcast_intra_smp (to MPIR_Bcast_intra_smp_new) that uses
MPIR_Subgroup instead of subcomms.
When root is not local rank 0, instead of adding an extra intra-node
send/recv or bcast, construct an inter group that includes the root
process.
In MPIR_nodeid_init use MPIR_Allgather_fallback and MPIR_Bcast_fallback
to avoid the complication of collective algorithm selection.

It causes an issue here because the bcast smp_new algorithm does not have
a proper CVAR fallback check yet. The proper fix needs to add coll_attr to
most communicator checking routines, and will need coll_attr to be
universally added to all collective interfaces, including nonblocking and
persistent collectives. Let's postpone that big change for now.
@hzhou
Contributor Author

hzhou commented Aug 13, 2024

test:mpich/ch3/most
test:mpich/ch4/most

@hzhou
Contributor Author

hzhou commented Aug 29, 2024

Superseded by #7103

@hzhou hzhou closed this Aug 29, 2024
@hzhou hzhou deleted the 2301_collattr branch August 29, 2024 13:45