Tags: spancer/DeepSpeed
use HF NeoX (deepspeedai#2087) Co-authored-by: Olatunji Ruwase <[email protected]> Co-authored-by: Jeff Rasley <[email protected]>
Improving memory utilization of Z2+MoE (deepspeedai#2079)
* Shards expert parameter groups
* Do upscaling, optimizer and deletion of fp32 grads one-by-one on each parameter group in zero-2
Co-authored-by: Olatunji Ruwase <[email protected]>
Fixing several bugs in the inference-api and the kernels (deepspeedai#1951) Co-authored-by: Jeff Rasley <[email protected]>
Improve z3 trace management (deepspeedai#1916)
* Fix OOM and type mismatch
* Toggle prefetching
* Disable z3 prefetching for inference (temp workaround)
* Fix zero3 tracing issues
* Remove debug prints
* Enable prefetch for inference
* Code clarity
* Invalidate trace cache
* Trace cache invalidation when needed. Separate nvme prefetch from all-gather prefetch
* Track last used step id
* Use debug name in error message
* Construct param trace from module trace
Co-authored-by: Jeff Rasley <[email protected]>
Fix OOM and type mismatch (deepspeedai#1884) Co-authored-by: Jeff Rasley <[email protected]>
qkv_out can be a single tensor or a list. Handling these cases separately. (deepspeedai#1850) Co-authored-by: Jeff Rasley <[email protected]>
[ZeRO] Default disable elastic ckpt in stage 1+2 and reduce CPU memory overhead during ckpt load (deepspeedai#1525) Co-authored-by: Olatunji Ruwase <[email protected]>
Various small documentation text improvements (deepspeedai#1665) Co-authored-by: Jeff Rasley <[email protected]>