Describe the solution you'd like
Ideally you would add a true Infini-attention implementation here, but I suppose you won't.
Additional context
https://arxiv.org/pdf/2404.07143 - if you look at the Infini-attention paper, it describes something totally different from what this repo does. Yes, the paper uses both full (local) attention and linear attention, but that does not mean you just concatenate the two: the paper retrieves from a compressive linear-attention memory carried across segments and blends that retrieval with local dot-product attention through a learned gate. How did you arrive at concatenation and decide to call it an implementation of Infini-attention? A sketch of the paper's actual mechanism follows below.
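
For reference, here is a minimal, unbatched, single-head PyTorch sketch of the mechanism the paper describes: a compressive memory updated with linear-attention statistics, causal dot-product attention within the current segment, and a learned sigmoid gate mixing the two. All names here are mine for illustration, not this repo's API, and this follows the paper's plain "linear" memory update rather than its delta-rule variant.

```python
import torch

def elu_plus_one(x):
    # sigma(.) in the paper: a non-negative feature map for linear attention.
    return torch.nn.functional.elu(x) + 1.0

def infini_attention_segment(q, k, v, mem, z, beta):
    """One segment of Infini-attention (illustrative, unbatched, single head).

    q, k, v: (L, d) query/key/value projections for the current segment
    mem:     (d, d) compressive memory M_{s-1} from all previous segments
    z:       (d,)   normalization term z_{s-1}
    beta:    learned scalar tensor gating the memory/local blend
    """
    L, d = q.shape
    sig_q, sig_k = elu_plus_one(q), elu_plus_one(k)

    # 1) Retrieve from the compressive memory (linear attention over the past):
    #    A_mem = sigma(Q) M_{s-1} / (sigma(Q) z_{s-1})
    a_mem = (sig_q @ mem) / (sig_q @ z).clamp_min(1e-6).unsqueeze(-1)

    # 2) Ordinary causal dot-product attention within the current segment.
    scores = (q @ k.T) / d ** 0.5
    causal = torch.triu(torch.ones(L, L, dtype=torch.bool), diagonal=1)
    a_dot = torch.softmax(scores.masked_fill(causal, float("-inf")), dim=-1) @ v

    # 3) Blend the two with a learned sigmoid gate -- not concatenation:
    #    A = sigmoid(beta) * A_mem + (1 - sigmoid(beta)) * A_dot
    g = torch.sigmoid(beta)
    out = g * a_mem + (1.0 - g) * a_dot

    # 4) Update the memory and normalizer for the next segment
    #    (simple linear update; the paper also describes a delta-rule variant).
    new_mem = mem + sig_k.T @ v
    new_z = z + sig_k.sum(dim=0)
    return out, new_mem, new_z
```

Processing a long sequence then just means iterating this over fixed-length segments, carrying `mem` and `z` forward (initialized to zeros, with `beta` a learnable parameter), so per-segment cost stays bounded while older context remains retrievable through the memory term. That recurrence is the whole point of the paper, and it is exactly what concatenating full-attention and linear-attention outputs does not give you.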