Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
#14 opened 1 month ago in kyegomez/MambaTransformer