I built Chonkie because I was tired of rewriting chunking code for RAG applications. Existing libraries were either too bloated (80MB+) or too basic, with no middle ground.
Core features:
- 21MB default install vs 80-171MB alternatives
- 33x faster token chunking than popular alternatives
- Supports multiple chunking strategies: token, word, sentence, and semantic
- Works with all major tokenizers (transformers, tokenizers, tiktoken)
- Zero external dependencies for basic functionality
Technical optimizations:
- Uses tiktoken with multi-threading for faster tokenization
- Implements aggressive caching and precomputation
- Running mean pooling for efficient semantic chunking
- Modular dependency system (install only what you need)
Benchmarks and code: https://github.com/bhavnicksm/chonkie
Looking for feedback on the architecture and performance optimizations. What other chunking strategies would be useful for RAG applications?
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...