Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta
#9 opened 19 hours ago in kyegomez/MoE-Mamba
#6 opened 2 months ago in kyegomez/MoE-Mamba