Issues: HomebrewNLP/Olmax
Issues list
Multiple forward per backward
core
Improves core model while keeping core idea intact
engineering
Software-engineering problems that don't require ML-Expertise
research
Creative project that might fail but could give high returns
#81
opened Sep 11, 2022 by
ClashLuke
Staged batchsize training
core
Improves core model while keeping core idea intact
engineering
Software-engineering problems that don't require ML-Expertise
research
Creative project that might fail but could give high returns
#80
opened Sep 11, 2022 by
ClashLuke
Compact Loss
core
Improves core model while keeping core idea intact
research
Creative project that might fail but could give high returns
#79
opened Sep 11, 2022 by
ClashLuke
Causality Test
core
Improves core model while keeping core idea intact
engineering
Software-engineering problems that don't require ML-Expertise
#78
opened Sep 11, 2022 by
ClashLuke
Square LR-Schedule
core
Improves core model while keeping core idea intact
engineering
Software-engineering problems that don't require ML-Expertise
#70
opened Aug 13, 2022 by
ClashLuke
Long-Range-Arena Evaluation
downstream
Changes code wrapping the core model
engineering
Software-engineering problems that don't require ML-Expertise
ML
Requires machine-learning knowledge (can be built up on the fly)
#49
opened May 18, 2022 by
ClashLuke
Gradient Noise
core
Improves core model while keeping core idea intact
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#46
opened May 17, 2022 by
ClashLuke
Retrieval Augmented Causal Generation
core
Improves core model while keeping core idea intact
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#45
opened May 17, 2022 by
ClashLuke
Encoder-Decoder Architecture
core
Improves core model while keeping core idea intact
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#44
opened May 17, 2022 by
ClashLuke
Automated Eval-Demo Update
engineering
Software-engineering problems that don't require ML-Expertise
mlops
#32
opened May 9, 2022 by
ClashLuke
Automated Long-Running Experiments
engineering
Software-engineering problems that don't require ML-Expertise
mlops
#31
opened May 9, 2022 by
ClashLuke
Automated Integration Tests
engineering
Software-engineering problems that don't require ML-Expertise
mlops
#30
opened May 9, 2022 by
ClashLuke
Long-Context Experiments
engineering
Software-engineering problems that don't require ML-Expertise
ML
Requires machine-learning knowledge (can be built up on the fly)
"Resume" option for tokenizers
downstream
Changes code wrapping the core model
engineering
Software-engineering problems that don't require ML-Expertise
#23
opened Apr 30, 2022 by
ClashLuke
Language-Model Evaluation
downstream
Changes code wrapping the core model
engineering
Software-engineering problems that don't require ML-Expertise
ML
Requires machine-learning knowledge (can be built up on the fly)
Stabilize MoE
core
Improves core model while keeping core idea intact
engineering
Software-engineering problems that don't require ML-Expertise
ML
Requires machine-learning knowledge (can be built up on the fly)
#16
opened Apr 30, 2022 by
ClashLuke
Non-Autoregressive Generation
downstream
Changes code wrapping the core model
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#12
opened Apr 30, 2022 by
ClashLuke
Image Classification
downstream
Changes code wrapping the core model
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#11
opened Apr 30, 2022 by
ClashLuke
Tokenizing Phonetics
downstream
Changes code wrapping the core model
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#10
opened Apr 30, 2022 by
ClashLuke
Audio Modelling
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#9
opened Apr 30, 2022 by
ClashLuke
Explicit Memory
core
Improves core model while keeping core idea intact
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#8
opened Apr 30, 2022 by
ClashLuke
MoE + Weight Sharing
core
Improves core model while keeping core idea intact
ML
Requires machine-learning knowledge (can be built up on the fly)
research
Creative project that might fail but could give high returns
#6
opened Apr 30, 2022 by
ClashLuke