Tags: classicvalues/DeepSpeed
DeepSpeed MoE (deepspeedai#1310) Co-authored-by: Alex Muzio <[email protected]> Co-authored-by: Ammar Ahmad Awan <[email protected]> Co-authored-by: Conglong Li <[email protected]> Co-authored-by: Felipe Cruz Salinas <[email protected]> Co-authored-by: Jeff Rasley <[email protected]> Co-authored-by: Reza Yazdani <[email protected]> Co-authored-by: Samyam Rajbhandari <[email protected]> Co-authored-by: Shaden Smith <[email protected]> Co-authored-by: Young Jin Kim <[email protected]> Co-authored-by: bapatra <[email protected]>
Use correct input size for splits (deepspeedai#1284) * Use correct input size for splits * Use smarter partitioning
[Doc] round_robin_gradients (deepspeedai#1261) * Fix docstring * Make screenshots clickable for easier viewing * Navigation menu in alphabetical order; More clickable screenshots * Rename 1Cycle doc * Tweak naming * Remove no longer used flag * ZeRO3 Offload release * Single GPU results * Rearrange figures * Single GPU text * tweak intro * zero3-offload section * Add asynchronous i/o docs * Fix print_per_steps doc * Document round_robin_gradients * Tweak description * Trigger CI
revert part of deepspeedai#1220 (deepspeedai#1221) deepspeedai#1220 fixed the leak, but led to another problem. Reverting that part so that we can do the release; we will work on it after the release. @jeffra
clean up logging (deepspeedai#1190) Co-authored-by: Jeff Rasley <[email protected]>