Is your feature request related to a problem? Please describe.
Anthropic extracts the core information needed for lexical search from each chunk and prepends it to that chunk.
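To make the idea above concrete, here is a minimal Python sketch of prepending generated context to each chunk. The names `prepend_context`, `make_context`, and `first_words` are hypothetical and not part of any existing API; `first_words` is only a cheap placeholder for whatever model call would actually produce the context.

```python
from typing import Callable, List


def prepend_context(
    chunks: List[str],
    make_context: Callable[[str], str],
    sep: str = "\n\n",
) -> List[str]:
    """Return new chunks with generated context placed in front of each chunk."""
    enriched = []
    for chunk in chunks:
        context = make_context(chunk).strip()
        # Leave the chunk unchanged when no context could be produced.
        enriched.append(f"{context}{sep}{chunk}" if context else chunk)
    return enriched


def first_words(chunk: str, n: int = 5) -> str:
    # Hypothetical placeholder: a real setup would call an LLM here.
    return "Keywords: " + " ".join(chunk.split()[:n])


enriched_chunks = prepend_context(["alpha beta gamma"], first_words)
```

Because `make_context` is called once per chunk, the cost grows with the number of chunks, which is the concern raised in this issue.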
Describe the solution you'd like
I think generating new information for every chunk while running the Chunker would be very expensive.
It would be better to apply it to only a single chunk or to a Corpus instance.
We could also compare results with and without the metadata.
Describe alternatives you've considered
Is there any method to create metadata from the raw document?
Additional context
https://www.linkedin.com/feed/update/urn:li:activity:7245896802107826176/
https://lnkd.in/gaHby5ZT