Dear learner,
We're excited to introduce Retrieval Optimization: From Tokenization to Vector Quantization, a short course made in collaboration with Qdrant and taught by Kacper Łukawski, Developer Relations Lead at Qdrant.
In this course, you'll learn about tokenization and vector search optimization for large-scale, customer-facing RAG applications. You'll explore the technical details of how vector search works and how to optimize it for better performance.
By the end of this course, you'll have a solid understanding of how tokenization is done and how to optimize vector search in your RAG systems.
Here's what you'll learn, in detail:
- Understand the internal workings of embedding models and how your text turns into vectors.
- Explore how different tokenization techniques, such as Byte-Pair Encoding, WordPiece, and Unigram, work and affect search relevance.
- Learn how to measure the quality of your search across several quality metrics.
- Understand how the main parameters of HNSW, a graph-based algorithm, affect the relevance and speed of vector search, and how to tune them.
- Experiment with the three major quantization methods (product, scalar, and binary) and learn how they impact memory requirements, search quality, and speed.
Join in and take your RAG applications to the next level!
- Learn how tokenization works in large language and embedding models and how the tokenizer can affect the quality of your search.
- Explore how different tokenization techniques, including Byte-Pair Encoding, WordPiece, and Unigram, are trained and how they work.
- Understand how to measure the quality of your retrieval and how to optimize your search by adjusting HNSW parameters and vector quantization.
Lesson | Video | Code |
---|---|---|
Introduction | video | |
Embedding models | video | code |
Role of the tokenizers | video | code |
Practical implications of the tokenization | video | code |
Measuring Search Relevance | video | code |
Optimizing HNSW search | video | code |
Vector quantization | video | code |
Conclusion | video | |
Appendix - Tips and Help | | code |