Kye Gomez/Hedgehog

quadratic_linear_attn implementation

3 months ago

I think you should add an epsilon to the denominator of the output of the quadratic_linear_attn function to prevent NaN values when training the Hedgehog MLP. If a row of qk sums to zero, the division is 0 / 0, which produces NaN:
qk / (qk.sum(dim=-1, keepdim=True) + epsilon)
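
For illustration, here is a minimal sketch of the suggested change. It is not the repository's actual code: it assumes q and k are already non-negative feature maps of shape (batch, heads, seq_len, feature_dim), that the scores are formed with an einsum over the feature dimension, and that eps = 1e-12 is a reasonable default. These are all assumptions on my part.

```python
import torch


def quadratic_linear_attn(q, k, eps=1e-12):
    """Row-normalized attention weights with a guarded denominator.

    Sketch only. Assumes q and k are non-negative feature maps shaped
    (batch, heads, seq_len, feature_dim); eps is a hypothetical default.
    """
    qk = torch.einsum("bhmd,bhnd->bhmn", q, k)
    return qk / (qk.sum(dim=-1, keepdim=True) + eps)


# All-zero query features make every row of qk sum to zero.
q = torch.zeros(1, 1, 2, 4)
k = torch.rand(1, 1, 3, 4)

qk = torch.einsum("bhmd,bhnd->bhmn", q, k)
unsafe = qk / qk.sum(dim=-1, keepdim=True)  # 0 / 0 gives NaN
safe = quadratic_linear_attn(q, k)          # 0 / eps gives 0

print(torch.isnan(unsafe).any().item(), torch.isnan(safe).any().item())
```

With non-degenerate inputs the epsilon is far smaller than the row sums, so each row of weights should still sum to approximately 1.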

kyegomez / Hedgehog

Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"

Contributors get 40% of received funds after fees