In vectordb.py, the LlamaIndex UpstageEmbedding class is an instance of OpenAIEmbedding.
So the code tries to truncate the input to 8000 tokens, but this fails because tiktoken cannot tokenize for the Solar embedding model.
Add a model_name condition to the OpenAI token truncation as well.
Or simply change the Colab demo to use a BAAI embedding model.
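
A minimal sketch of what the model_name condition could look like. Everything here is an assumption for illustration, not the actual code in vectordb.py: the allow-list `OPENAI_EMBEDDING_MODELS`, the helper names `should_truncate_with_tiktoken` and `maybe_truncate`, and the stand-in class `FakeEmbedding` are all hypothetical. The idea is to check the model name in addition to the isinstance check, so that subclasses of OpenAIEmbedding with a non-OpenAI model skip the tiktoken truncation.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical allow-list: model names assumed to be tokenizable by tiktoken.
OPENAI_EMBEDDING_MODELS = {
    "text-embedding-ada-002",
    "text-embedding-3-small",
    "text-embedding-3-large",
}


@dataclass
class FakeEmbedding:
    """Stand-in for an embedding object that exposes a model_name attribute."""

    model_name: str


def should_truncate_with_tiktoken(embedding: object) -> bool:
    """Return True only when the embedding's model_name is in the allow-list."""
    return getattr(embedding, "model_name", None) in OPENAI_EMBEDDING_MODELS


def maybe_truncate(
    texts: List[str],
    embedding: object,
    truncate_fn: Callable[[List[str]], List[str]],
) -> List[str]:
    """Apply truncate_fn only for allow-listed models; otherwise pass texts through."""
    if should_truncate_with_tiktoken(embedding):
        return truncate_fn(texts)
    # Non-OpenAI models (e.g. a subclass of OpenAIEmbedding) are left untouched.
    return list(texts)
```

In the real code, `truncate_fn` would be the existing OpenAI token truncation, and the embedding object would be whatever vectordb.py already holds.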