[moe] merge moe into main #4978
Commits on Oct 26, 2023
-
[moe] support moe fwd and bwd with low level zero (hpcaitech#4421)
* fix test files * new file * add new * fix zero * update moe tests for forward and backward * remove useless test * remove print * moe * code style * code style * rename * rename * remove useless func * update param check * update utils and config
Commit 75fa0b6
-
[moe] support low level zero optim (hpcaitech#4429)
* update optim * update grad handler * update moe param interface * update doc * move moe tensor
Commit 4373d06
-
[moe] refactor code to better adapt to llm (hpcaitech#4469)
* polish code * rename * refactor code * fix test * refactor code * update flash attention version * Support TP (#6) * add tp test * update tp test * update * remove fa dependency * update dependency * update softmax * update checkpointio * update processgroupmesh * update name * update param * add keep vars
Commit 8240463
-
[moe] support local moe and fix bugs (hpcaitech#4574)
* add local moe * update moe layer
Commit 75fdcc2
-
[moe] support openmoe inference (hpcaitech#4616)
* init * update moe ckpt * update config * support openmoe infernece * update config * remove pdb * update ci * update requirement * add build ffn experts * update requirement * update ci * update ci * update require * update ci
Commit 61995f8
-
[moe] support openmoe train (hpcaitech#4637)
* init * update moe ckpt * update config * support openmoe infernece * update config * remove pdb * support train * add ckpt download * update ckpt loading * use general ckpt
Commit bf53487
-
[moe] align train settings and losses (hpcaitech#4655)
* init * update moe ckpt * update config * support openmoe infernece * update config * remove pdb * support train * add ckpt download * update ckpt loading * use general ckpt * add loss and optim * update ci * update require
Commit 55a81a6
-
[moe] move to moe and remove legacy (hpcaitech#4672)
* init * update moe ckpt * update config * support openmoe infernece * update config * remove pdb * support train * add ckpt download * update ckpt loading * use general ckpt * add loss and optim * update ci * update require * move * move * remove legacy * update file name and restore moe context * update module * update build_ffn_experts * update init * add ctx
Commit 84f05b1
-
[moe]: add top k router (hpcaitech#4597)
* docs: add shape spec * docs: add doc * feat: add top_k router * feat: update init * test: add moe router tests * fix: reorder return values
Commit d1d0de8
-
[moe]: modify router loss, polish code (hpcaitech#4693)
* feat: check z_loss and add doc * style: rename misleading variable * feat: modify auxiliary loss * feat: add aux_loss in topk router and modify doc * docs: add fn doc
Commit 708bf6f
-
[moe] speed up embed and mlp (hpcaitech#4701)
* update triton * update kernel * add init * add version check * update precision * update precision * update kernel in experts * update test arg * update settings
Commit fde57bf
-
Commit adb8ebe
-
[moe]: add flash attention & optimize top2 router (hpcaitech#4712)
* feat: add benchmark train * perf: use flash_attn * fix: modify benchmark config * fix: check flash attn installation * fix: update config with args * perf: optimize top2 router
Commit 3f02e57
-
[moe] support hybrid parallel (hpcaitech#4748)
* init policy * renam,e * update pp * finish pp * update script * update plugin * finish pp * update setup for different plugin * update ci * update ci * update ci * support ep inside or dp inside * update arg for kernel * disable ci * update train script * update plugin
Commit d12bbe7
-
[moe] update benchmark (hpcaitech#4770)
* init policy * renam,e * update pp * finish pp * update script * update plugin * finish pp * update setup for different plugin * update ci * update ci * update ci * support ep inside or dp inside * update arg for kernel * disable ci * update train script * fsdp * update train * update train * fsdp benchmark * rename * update fsdp bench * fix plugin * update benchmark
Commit b72fa37
-
* init policy * renam,e * update pp * finish pp * update script * update plugin * finish pp * update setup for different plugin * update ci * update ci * update ci * support ep inside or dp inside * update arg for kernel * disable ci * update train script * fsdp * update train * update train * fsdp benchmark * rename * update fsdp bench * fix plugin * update benchmark * fix ci * fix ci * rename * update ci * update test * update vocab * update chunk head
Commit 5c97a96
-
[moe] update benchmark scripts and ckpt io (hpcaitech#4804)
* update benchmark script * update pp strategy * update plugin * update bench script * optimize * update pp layers * update zero ep * ep * update ckpt * update test
Commit c68303b
-
[moe] support overlap for expert tp (hpcaitech#4851)
* overlap comm * fix typo * update bench script * add option * update script * update bench
Commit 4d74f83
-
[moe] support hybrid zero strategy. (hpcaitech#4877)
* overlap comm * fix typo * update bench script * add option * update script * update bench * param init * support dp zero * fix zero dp * fxi bug * update pg bug * update experts * fix optim bug * update config * kaishen niubi * fix bug * embed * Merge branch 'feature/MoE' of https://github.com/hpcaitech/ColossalAI into bench * update bench * update optim * update doc * update sync * fix test * fix arg * update ckpt * update test * fix * remove print * polish code * update hybrid zero optim * update print
Commit 2481b83
-
Commit 7441a1f
-
[moe] support load balance (hpcaitech#4914)
* add load balance * update test * update param exchange * pass test * update test * update test * update test * update test * fix ranks * update
Commit 5844f34
-
Commit 5f20878
-
[moe]: add overlap ep, and fix overlap tp (hpcaitech#4925)
* test: add more ep/tp test case * to: add TPOverlap fn * fix: fix tp overlap * fix: remove useless variables * feat: add async all to all * feat: add overlap ep * fix: fix import error * fix: fix ep/tp tests * perf: optimize overlap * fix: add world_size check
Commit b0e277b
-
[moe] polish code (hpcaitech#4952)
* doc * update script * update experts * update optim in fsdp * update kernel in sparse * empty cache * update script * update bench * update script * remove epzero2 * fix * update print * update test script * update script * update manager * update host * update script
Commit 4a7bf29
-
[moe] update train script (hpcaitech#4959)
* update * update ckpt * update train * update train
Commit c644b47
-
Commit 5cc3ad0
-
Commit 713446b
-
Commit 1b19a5f
-
Commit ca42bf4
-
Commit c381e4c
-
Commit b19fb91
Commits on Oct 28, 2023
-
Commit 61df786
-
Commit 685c80a
-
Commit 9586f61
Commits on Oct 30, 2023
-
Commit 6c0094c
-
Commit b732ab0
Commits on Oct 31, 2023
-
Commit e85122b
-
Commit 25c329f
-
Commit 6b03bd4
-
Commit 659c9b1
-
Commit caece56
-
Commit 9fe7680
-
Commit 0eb5623
Commits on Nov 1, 2023
-
Commit 4be194a
-
Commit 7e92e7b
-
Commit da6392f