I ran the example program below and got the following error.
import torch
from long_net.model import LongNetTransformer

longnet = LongNetTransformer(
    num_tokens=20000,
    dim=512,
    depth=6,
    dim_head=64,
    heads=8,
    ff_mult=4,
).to("cuda:0")

tokens = torch.randint(0, 20000, (1, 512)).to("cuda:0")
logits = longnet(tokens)
print(logits)
It looks like there's something wrong internally?
2024-07-08 01:43:03.002114: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-07-08 01:43:03.048251: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-07-08 01:43:03.679049: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-07-08 01:43:04,742 - numexpr.utils - INFO - Note: detected 96 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
2024-07-08 01:43:04,742 - numexpr.utils - INFO - Note: NumExpr detected 96 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-07-08 01:43:04,742 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
Non-A100 GPU detected, using math or mem efficient attention if input tensor is on cuda
Traceback (most recent call last):
  File "/workspace/DeepVQ/model/LongNetGPT.py", line 20, in <module>
    logits = longnet(tokens)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/long_net/model.py", line 302, in forward
    x = self.transformer(x)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/long_net/model.py", line 271, in forward
    x = block(x) + x
RuntimeError: The size of tensor a (256) must match the size of tensor b (512) at non-singleton dimension 1
Process finished with exit code 1
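In case it helps with debugging: the failure at x = block(x) + x suggests the block is returning a tensor whose sequence length has been halved (256) while the residual input still has the full 512 tokens. Below is a minimal sketch of that failure mode, assuming shapes of (batch, seq_len, dim). This is only my guess at the cause, not the library's actual internals.

import torch

# Assumed shapes: residual input keeps the full 512-token sequence,
# while the (hypothetical) block output comes back with seq_len halved to 256.
x = torch.randn(1, 512, 512)          # (batch, seq_len, dim)
block_out = torch.randn(1, 256, 512)  # block output with a halved sequence length

try:
    y = block_out + x  # dim 1 mismatch: 256 vs. 512, same error as in the traceback
except RuntimeError as e:
    print(e)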