Is your feature request related to a problem? Please describe.
Yes, the current implementations of the DilatedAttention and FlashAttention modules in the Zeta repository do not support multi-GPU configurations effectively; in particular, they lack both model parallelism and data parallelism. FlashAttention is optimized for A100 GPUs, but my machine has 8 A10 GPUs, and I would like to use all of them efficiently. This limitation restricts the scalability and speed of my deep learning tasks, especially large-scale sequence processing and attention mechanisms.
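For context, here is a minimal sketch of how the available hardware can be enumerated before choosing an attention backend. These are all standard PyTorch calls, nothing Zeta-specific; the compute-capability values in the comment are from NVIDIA's published specs.

```python
# Sketch: list the visible GPUs and their compute capabilities.
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        name = torch.cuda.get_device_name(i)
        major, minor = torch.cuda.get_device_capability(i)
        # A10s report compute capability 8.6; A100s report 8.0.
        print(f"cuda:{i}: {name} (sm_{major}{minor})")
```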
Describe the solution you'd like
I propose enhancing the DilatedAttention and FlashAttention classes to support both model parallelism and data parallelism. At a minimum, this update should include (a rough sketch of the data-parallel usage follows this list):

- Data parallelism: compatibility with torch.nn.parallel.DistributedDataParallel, so each GPU processes its own shard of the batch.
- Model parallelism: an option to shard attention heads or sequence segments across devices for models too large for a single GPU.
- A tuned fallback path so FlashAttention performs well on non-A100 hardware such as the A10.
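As a rough illustration of the data-parallel half of the request, the sketch below wraps a Zeta attention module in PyTorch's DistributedDataParallel across 8 GPUs. The DilatedAttention import path and constructor arguments shown here are assumptions based on the class name, not a confirmed signature; only the torch.distributed wiring is standard PyTorch.

```python
# Minimal data-parallel sketch. The DilatedAttention import path and
# constructor arguments are assumptions about the Zeta API, not a confirmed
# signature; the DDP wiring itself is standard PyTorch.
# Launch with: torchrun --nproc_per_node=8 this_script.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

from zeta.nn.attention import DilatedAttention  # assumed import path


def main() -> None:
    dist.init_process_group(backend="nccl")  # NCCL for GPU-to-GPU comms
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(local_rank)

    # Hypothetical constructor arguments; adjust to the real signature.
    attn = DilatedAttention(dim=512, heads=8, dilation_rate=2, segment_size=64)
    attn = DDP(attn.cuda(local_rank), device_ids=[local_rank])

    # Each rank processes its own shard of the batch.
    x = torch.randn(4, 1024, 512, device=f"cuda:{local_rank}")
    out = attn(x)
    print(f"rank {dist.get_rank()}: output shape {tuple(out.shape)}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```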
Describe alternatives you've considered
An alternative would be manually partitioning the work and managing CUDA devices at the application level, but that approach is less efficient and harder to scale (a sketch of what it looks like follows). Building directly on NVIDIA's NCCL for inter-GPU communication could also be considered if native support in the framework proves too complex to implement in the initial stages.
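For reference, the manual alternative looks roughly like the following: pinning two attention blocks to different devices and moving activations between them by hand. The AttentionBlock here is a stand-in module built on PyTorch's MultiheadAttention, not part of Zeta; the point is only that all the device bookkeeping ends up in application code.

```python
# Sketch of the manual model-parallel alternative: the application code owns
# all device placement and transfers. AttentionBlock is a stand-in, not Zeta.
import torch
import torch.nn as nn


class AttentionBlock(nn.Module):
    def __init__(self, dim: int, heads: int) -> None:
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x)
        return out


# Pin each block to its own GPU and shuttle activations manually.
block0 = AttentionBlock(512, 8).to("cuda:0")
block1 = AttentionBlock(512, 8).to("cuda:1")

x = torch.randn(4, 1024, 512, device="cuda:0")
h = block0(x)
out = block1(h.to("cuda:1"))  # explicit transfer: easy to get wrong at scale
print(out.shape)
```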