diff --git a/qdrant-landing/content/articles/binary-quantization-openai.md b/qdrant-landing/content/articles/binary-quantization-openai.md index 1189252c2..6d5c168a3 100644 --- a/qdrant-landing/content/articles/binary-quantization-openai.md +++ b/qdrant-landing/content/articles/binary-quantization-openai.md @@ -40,7 +40,7 @@ You can also try out these techniques as described in [Binary Quantization OpenA ## New OpenAI embeddings: performance and changes -As the technology of embedding models has advanced, demand has grown. Users are looking more for powerful and efficient text-embedding models. OpenAI's Ada-003 embeddings offer state-of-the-art performance on a wide range of NLP tasks, including those noted in [MTEB](https://huggingface.co/spaces/mteb/leaderboard) and [MIRACL](https://openai.com/blog/new-embedding-models-and-api-updates). +As the technology of [embedding models](/articles/fastembed/) has advanced, demand has grown. Users are looking more for powerful and efficient text-embedding models. OpenAI's Ada-003 embeddings offer state-of-the-art performance on a wide range of NLP tasks, including those noted in [MTEB](https://huggingface.co/spaces/mteb/leaderboard) and [MIRACL](https://openai.com/blog/new-embedding-models-and-api-updates). These models include multilingual support in over 100 languages. The transition from text-embedding-ada-002 to text-embedding-3-large has led to a significant jump in performance scores (from 31.4% to 54.9% on MIRACL). @@ -118,7 +118,7 @@ For those exploring the integration of text embedding models with Qdrant, it's c 1. **Model Name**: Signifying the specific text embedding model variant, such as "text-embedding-3-large" or "text-embedding-3-small". This distinction correlates with the model's capacity, with "large" models offering more detailed embeddings at the cost of increased computational resources. -2. **Dimensions**: This refers to the size of the vector embeddings produced by the model. 
Options range from 512 to 3072 dimensions. Higher dimensions could lead to more precise embeddings but might also increase the search time and memory usage in Qdrant. +2. **Dimensions**: This refers to the size of the [vector embeddings](/articles/what-are-embeddings/) produced by the model. Options range from 512 to 3072 dimensions. Higher dimensions could lead to more precise embeddings but might also increase the search time and memory usage in Qdrant. Optimizing these parameters is a balancing act between search accuracy and resource efficiency. Testing across these combinations allows users to identify the configuration that best meets their specific needs, considering the trade-offs between computational resources and the quality of search results. diff --git a/qdrant-landing/content/articles/binary-quantization.md b/qdrant-landing/content/articles/binary-quantization.md index aba50e8fd..8f73b17b8 100644 --- a/qdrant-landing/content/articles/binary-quantization.md +++ b/qdrant-landing/content/articles/binary-quantization.md @@ -30,7 +30,7 @@ The rest of this article will cover: 3. Benchmark analysis and usage recommendations ## What is Binary Quantization? -Binary quantization (BQ) converts any vector embedding of floating point numbers into a vector of binary or boolean values. This feature is an extension of our past work on [scalar quantization](/articles/scalar-quantization/) where we convert `float32` to `uint8` and then leverage a specific SIMD CPU instruction to perform fast vector comparison. +Binary quantization (BQ) converts any [vector embedding](/articles/what-are-embeddings/) of floating point numbers into a vector of binary or boolean values. This feature is an extension of our past work on [scalar quantization](/articles/scalar-quantization/) where we convert `float32` to `uint8` and then leverage a specific SIMD CPU instruction to perform fast vector comparison. 
![What is binary quantization](/articles_data/binary-quantization/bq-2.png) diff --git a/qdrant-landing/content/articles/chatgpt-plugin.md b/qdrant-landing/content/articles/chatgpt-plugin.md index 34355890a..19192358a 100644 --- a/qdrant-landing/content/articles/chatgpt-plugin.md +++ b/qdrant-landing/content/articles/chatgpt-plugin.md @@ -36,7 +36,7 @@ These plugins, designed to enhance the model's performance, serve as modular ext that seamlessly interface with the core system. By adding a knowledge base plugin to ChatGPT, we can effectively provide the AI with a curated, trustworthy source of information, ensuring that the generated content is more accurate and relevant. Qdrant -may act as a vector database where all the facts will be stored and served to the model +may act as a [vector database](/qdrant-vector-database/) where all the facts will be stored and served to the model upon request. If you’d like to ask ChatGPT questions about your data sources, such as files, notes, or diff --git a/qdrant-landing/content/articles/data-privacy.md b/qdrant-landing/content/articles/data-privacy.md index acde872e2..4e737ae25 100644 --- a/qdrant-landing/content/articles/data-privacy.md +++ b/qdrant-landing/content/articles/data-privacy.md @@ -17,7 +17,7 @@ keywords: # Keywords for SEO - Enterprise Data Compliance --- -Data stored in vector databases is often proprietary to the enterprise and may include sensitive information like customer records, legal contracts, electronic health records (EHR), financial data, and intellectual property. Moreover, strong security measures become critical to safeguarding this data. If the data stored in a vector database is not secured, it may open a vulnerability known as "[embedding inversion attack](https://arxiv.org/abs/2004.00053)," where malicious actors could potentially [reconstruct the original data from the embeddings](https://arxiv.org/pdf/2305.03010) themselves. 
+Data stored in vector databases is often proprietary to the enterprise and may include sensitive information like customer records, legal contracts, electronic health records (EHR), financial data, and intellectual property. Moreover, strong security measures become critical to safeguarding this data. If the data stored in a [vector database](/qdrant-vector-database/) is not secured, it may open a vulnerability known as "[embedding inversion attack](https://arxiv.org/abs/2004.00053)," where malicious actors could potentially [reconstruct the original data from the embeddings](https://arxiv.org/pdf/2305.03010) themselves. Strict compliance regulations govern data stored in vector databases across various industries. For instance, healthcare must comply with HIPAA, which dictates how protected health information (PHI) is stored, transmitted, and secured. Similarly, the financial services industry follows PCI DSS to safeguard sensitive financial data. These regulations require developers to ensure data storage and transmission comply with industry-specific legal frameworks across different regions. **As a result, features that enable data privacy, security and sovereignty are deciding factors when choosing the right vector database.** @@ -234,7 +234,7 @@ Data governance varies by country, especially for global organizations dealing w To address these needs, the vector database you choose should support deployment and scaling within your controlled infrastructure. [Qdrant Hybrid Cloud](/documentation/hybrid-cloud/) offers this flexibility, along with features like sharding, replicas, JWT authentication, and monitoring. -Qdrant Hybrid Cloud integrates Kubernetes clusters from various environments—cloud, on-premises, or edge—into a unified managed service. This allows organizations to manage Qdrant databases through the Qdrant Cloud UI while keeping the databases within their infrastructure. 
+Qdrant Hybrid Cloud integrates Kubernetes clusters from various environments—cloud, on-premises, or edge—into a unified managed service. This allows organizations to manage Qdrant databases through the [Qdrant Cloud](/cloud/) UI while keeping the databases within their infrastructure. With JWT and RBAC, Qdrant Hybrid Cloud provides a secure, private, and sovereign vector store. Enterprises can scale their AI applications geographically, comply with local laws, and maintain strict data control. diff --git a/qdrant-landing/content/articles/dedicated-service.md b/qdrant-landing/content/articles/dedicated-service.md index 37014b06f..b5422acfd 100644 --- a/qdrant-landing/content/articles/dedicated-service.md +++ b/qdrant-landing/content/articles/dedicated-service.md @@ -21,7 +21,7 @@ keywords: Ever since the data science community discovered that vector search significantly improves LLM answers, various vendors and enthusiasts have been arguing over the proper solutions to store embeddings. -Some say storing them in a specialized engine (aka vector database) is better. Others say that it's enough to use plugins for existing databases. +Some say storing them in a specialized engine (aka [vector database](/qdrant-vector-database/)) is better. Others say that it's enough to use plugins for existing databases. Here are [just](https://nextword.substack.com/p/vector-database-is-not-a-separate) a [few](https://stackoverflow.blog/2023/09/20/do-you-need-a-specialized-vector-database-to-implement-vector-search-well/) of [them](https://www.singlestore.com/blog/why-your-vector-database-should-not-be-a-vector-database/). @@ -72,7 +72,7 @@ Those priorities lead to different architectural decisions that are not reproduc ###### Having a dedicated vector database requires duplication of data. -By their very nature, vector embeddings are derivatives of the primary source data. 
+By their very nature, [vector embeddings](/articles/what-are-embeddings/) are derivatives of the primary source data. In the vast majority of cases, embeddings are derived from some other data, such as text, images, or additional information stored in your system. So, in fact, all embeddings you have in your system can be considered transformations of some original source. diff --git a/qdrant-landing/content/articles/discovery-search.md b/qdrant-landing/content/articles/discovery-search.md index 64dbbe1bb..313f7131c 100644 --- a/qdrant-landing/content/articles/discovery-search.md +++ b/qdrant-landing/content/articles/discovery-search.md @@ -103,4 +103,4 @@ This way you can give refreshing recommendations, while still being in control b - Discovery search is a powerful tool for controlled exploration in vector spaces. Context, positive, and negative vectors guide search parameters and refine results. - Real-world applications include multimodal search, diverse recommendations, and context-driven exploration. -- Ready to experience the power of Qdrant's Discovery search for yourself? [Try a free demo](https://qdrant.tech/contact-us/) now and unlock the full potential of controlled exploration in vector spaces! \ No newline at end of file +- Ready to experience the power of Qdrant's Discovery search for yourself? [Try a free demo](/contact-us/) now and unlock the full potential of controlled exploration in vector spaces! 
\ No newline at end of file diff --git a/qdrant-landing/content/articles/fastembed.md b/qdrant-landing/content/articles/fastembed.md index d70572f73..ce7244045 100644 --- a/qdrant-landing/content/articles/fastembed.md +++ b/qdrant-landing/content/articles/fastembed.md @@ -139,7 +139,7 @@ If anything changes, you'll see a new version number pop up, like going from 0.0 ## Using FastEmbed with Qdrant -Qdrant is a Vector Store, offering comprehensive, efficient, and scalable [enterprise solutions](https://qdrant.tech/enterprise-solutions/) for modern machine learning and AI applications. Whether you are dealing with billions of data points, require a low latency performant [vector database solution](https://qdrant.tech/qdrant-vector-database/), or specialized quantization methods – [Qdrant is engineered](/documentation/overview/) to meet those demands head-on. +Qdrant is a Vector Store, offering comprehensive, efficient, and scalable [enterprise solutions](/enterprise-solutions/) for modern machine learning and AI applications. Whether you are dealing with billions of data points, require a low latency performant [vector database solution](/qdrant-vector-database/), or specialized quantization methods – [Qdrant is engineered](/documentation/overview/) to meet those demands head-on. The fusion of FastEmbed with Qdrant’s vector store capabilities enables a transparent workflow for seamless embedding generation, storage, and retrieval. This simplifies the API design — while still giving you the flexibility to make significant changes e.g. you can use FastEmbed to make your own embedding other than the DefaultEmbedding and use that with Qdrant. @@ -229,7 +229,7 @@ Behind the scenes, we first convert the query_text to the embedding and use tha By following these steps, you effectively utilize the combined capabilities of FastEmbed and Qdrant, thereby streamlining your embedding generation and retrieval tasks. 
-Qdrant is designed to handle large-scale datasets with billions of data points. Its architecture employs techniques like [binary quantization](https://qdrant.tech/articles/binary-quantization/) and [scalar quantization](https://qdrant.tech/articles/scalar-quantization/) for efficient storage and retrieval. When you inject FastEmbed’s CPU-first design and lightweight nature into this equation, you end up with a system that can scale seamlessly while maintaining low latency. +Qdrant is designed to handle large-scale datasets with billions of data points. Its architecture employs techniques like [binary quantization](/articles/binary-quantization/) and [scalar quantization](/articles/scalar-quantization/) for efficient storage and retrieval. When you inject FastEmbed’s CPU-first design and lightweight nature into this equation, you end up with a system that can scale seamlessly while maintaining low latency. ## Summary diff --git a/qdrant-landing/content/articles/langchain-integration.md b/qdrant-landing/content/articles/langchain-integration.md index fc05c2c49..a0669727e 100644 --- a/qdrant-landing/content/articles/langchain-integration.md +++ b/qdrant-landing/content/articles/langchain-integration.md @@ -31,8 +31,8 @@ provides unified interfaces to different libraries, so you can avoid writing boi It has been reported millions of times recently, but let's say that again. ChatGPT-like models struggle with generating factual statements if no context is provided. They have some general knowledge but cannot guarantee to produce a valid answer consistently. Thus, it is better to provide some facts we know are actual, so it can just choose the valid parts and extract them from all the provided contextual data to give a comprehensive answer. 
[Vector database, -such as Qdrant](https://qdrant.tech/), is of great help here, as their ability to perform a [semantic search](https://qdrant.tech/documentation/tutorials/search-beginners/) over a huge knowledge base is crucial to preselect some possibly valid -documents, so they can be provided into the LLM. That's also one of the **chains** implemented in [LangChain](https://qdrant.tech/documentation/frameworks/langchain/), which is called `VectorDBQA`. And Qdrant got +such as Qdrant](https://qdrant.tech/), is of great help here, as their ability to perform a [semantic search](/documentation/tutorials/search-beginners/) over a huge knowledge base is crucial to preselect some possibly valid +documents, so they can be provided into the LLM. That's also one of the **chains** implemented in [LangChain](/documentation/frameworks/langchain/), which is called `VectorDBQA`. And Qdrant got integrated with the library, so it might be used to build it effortlessly. ### The Two-Model Approach diff --git a/qdrant-landing/content/articles/memory-consumption.md b/qdrant-landing/content/articles/memory-consumption.md index a36ce3f59..737d7c135 100644 --- a/qdrant-landing/content/articles/memory-consumption.md +++ b/qdrant-landing/content/articles/memory-consumption.md @@ -215,7 +215,7 @@ But let's first see how much RAM we need to serve 1 million vectors and then we ### Vectors and HNSW graph stored using MMAP -In the third experiment, we tested how well our system performs when vectors and [HNSW](https://qdrant.tech/articles/filtrable-hnsw/) graph are stored using the memory-mapped files. +In the third experiment, we tested how well our system performs when vectors and [HNSW](/articles/filtrable-hnsw/) graph are stored using the memory-mapped files. 
Create collection with: ```http @@ -355,7 +355,7 @@ Which might be an interesting option to serve large datasets with low search lat ## Conclusion -In this article, we showed that Qdrant has flexibility in terms of RAM usage and can be used to serve large datasets. It provides configurable trade-offs between RAM usage and search speed. If you’re interested to learn more about Qdrant, [book a demo today](https://qdrant.tech/contact-us/)! +In this article, we showed that Qdrant has flexibility in terms of RAM usage and can be used to serve large datasets. It provides configurable trade-offs between RAM usage and search speed. If you’re interested to learn more about Qdrant, [book a demo today](/contact-us/)! We are eager to learn more about how you use Qdrant in your projects, what challenges you face, and how we can help you solve them. Please feel free to join our [Discord](https://qdrant.to/discord) and share your experience with us! diff --git a/qdrant-landing/content/articles/multitenancy.md b/qdrant-landing/content/articles/multitenancy.md index 0a920b2ef..657069347 100644 --- a/qdrant-landing/content/articles/multitenancy.md +++ b/qdrant-landing/content/articles/multitenancy.md @@ -20,7 +20,7 @@ keywords: We are seeing the topics of [multitenancy](/documentation/guides/multiple-partitions/) and [distributed deployment](/documentation/guides/distributed_deployment/#sharding) pop-up daily on our [Discord support channel](https://qdrant.to/discord). This tells us that many of you are looking to scale Qdrant along with the rest of your machine learning setup. -Whether you are building a bank fraud-detection system, [RAG](https://qdrant.tech/articles/what-is-rag-in-ai/) for e-commerce, or services for the federal government - you will need to leverage a multitenant architecture to scale your product. 
+Whether you are building a bank fraud-detection system, [RAG](/articles/what-is-rag-in-ai/) for e-commerce, or services for the federal government - you will need to leverage a multitenant architecture to scale your product. In the world of SaaS and enterprise apps, this setup is the norm. It will considerably increase your application's performance and lower your hosting costs. ## Multitenancy & custom sharding with Qdrant diff --git a/qdrant-landing/content/articles/neural-search-tutorial.md b/qdrant-landing/content/articles/neural-search-tutorial.md index 0b8515191..6f6428f79 100644 --- a/qdrant-landing/content/articles/neural-search-tutorial.md +++ b/qdrant-landing/content/articles/neural-search-tutorial.md @@ -92,7 +92,7 @@ Transformers is not the only architecture suitable for neural search, but for ou We will use a model called `all-MiniLM-L6-v2`. This model is an all-round model tuned for many use-cases. Trained on a large and diverse dataset of over 1 billion training pairs. -It is optimized for low memory consumption and fast inference. +It is optimized for low [memory consumption](/articles/memory-consumption/) and fast inference. The complete code for data preparation with detailed comments can be found and run in [Colab Notebook](https://colab.research.google.com/drive/1kPktoudAP8Tu8n8l-iVMOQhVmHkWV_L9?usp=sharing). diff --git a/qdrant-landing/content/articles/new-recommendation-api.md b/qdrant-landing/content/articles/new-recommendation-api.md index 4f2711d3f..9564ffc03 100644 --- a/qdrant-landing/content/articles/new-recommendation-api.md +++ b/qdrant-landing/content/articles/new-recommendation-api.md @@ -24,7 +24,7 @@ Here, we'll discuss some internals and show how they may be used in practice. ### Recap of the old recommendations API The previous [Recommendation API](/documentation/concepts/search/#recommendation-api) in Qdrant came with some limitations. 
First of all, it was required to pass vector IDs for -both positive and negative example points. If you wanted to use vector embeddings directly, you had to either create a new point +both positive and negative example points. If you wanted to use [vector embeddings](/articles/what-are-embeddings/) directly, you had to either create a new point in a collection or mimic the behaviour of the Recommendation API by using the [Search API](/documentation/concepts/search/#search-api). Moreover, in the previous releases of Qdrant, you were always asked to provide at least one positive example. This requirement was based on the algorithm used to combine multiple samples into a single query vector. It was a simple, yet effective approach. diff --git a/qdrant-landing/content/articles/product-quantization.md b/qdrant-landing/content/articles/product-quantization.md index f3ecf864c..5ec008108 100644 --- a/qdrant-landing/content/articles/product-quantization.md +++ b/qdrant-landing/content/articles/product-quantization.md @@ -23,13 +23,13 @@ Qdrant 1.1.0 brought the support of [Scalar Quantization](/articles/scalar-quant a technique of reducing the memory footprint by even four times, by using `int8` to represent the values that would be normally represented by `float32`. -The memory usage in [vector search](https://qdrant.tech/solutions/) might be reduced even further! Please welcome **Product +The memory usage in [vector search](/solutions/) might be reduced even further! Please welcome **Product Quantization**, a brand-new feature of Qdrant 1.2.0! ## What is Product Quantization? Product Quantization converts floating-point numbers into integers like every other quantization -method. However, the process is slightly more complicated than [Scalar Quantization](https://qdrant.tech/articles/scalar-quantization/) and is more customizable, so you can find the sweet spot between memory usage and search precision. This article +method. 
However, the process is slightly more complicated than [Scalar Quantization](/articles/scalar-quantization/) and is more customizable, so you can find the sweet spot between memory usage and search precision. This article covers all the steps required to perform Product Quantization and the way it's implemented in Qdrant. ## How Does Product Quantization Work? @@ -210,7 +210,7 @@ but also the search time. ## Product Quantization vs Scalar Quantization -Compared to [Scalar Quantization](https://qdrant.tech/articles/scalar-quantization/), Product Quantization offers a higher compression rate. However, this comes with considerable trade-offs in accuracy, and at times, in-RAM search speed. +Compared to [Scalar Quantization](/articles/scalar-quantization/), Product Quantization offers a higher compression rate. However, this comes with considerable trade-offs in accuracy, and at times, in-RAM search speed. Product Quantization tends to be favored in certain specific scenarios: diff --git a/qdrant-landing/content/articles/qdrant-1.7.x.md b/qdrant-landing/content/articles/qdrant-1.7.x.md index 3fe9c617d..5daf4e29e 100644 --- a/qdrant-landing/content/articles/qdrant-1.7.x.md +++ b/qdrant-landing/content/articles/qdrant-1.7.x.md @@ -25,7 +25,7 @@ keywords: --- Please welcome the long-awaited [Qdrant 1.7.0 release](https://github.com/qdrant/qdrant/releases/tag/v1.7.0). Except for a handful of minor fixes and improvements, this release brings some cool brand-new features that we are excited to share! -The latest version of your favorite vector search engine finally supports **sparse vectors**. That's the feature many of you requested, so why should we ignore it? +The latest version of your favorite vector search engine finally supports [**sparse vectors**](/articles/sparse-vectors/). That's the feature many of you requested, so why should we ignore it? We also decided to continue our journey with [vector similarity beyond search](/articles/vector-similarity-beyond-search/). 
The new Discovery API covers some utterly new use cases. We're more than excited to see what you will build with it! But there is more to it! Check out what's new in **Qdrant 1.7.0**! diff --git a/qdrant-landing/content/articles/qdrant-1.8.x.md b/qdrant-landing/content/articles/qdrant-1.8.x.md index 67a1f8279..17d6910ec 100644 --- a/qdrant-landing/content/articles/qdrant-1.8.x.md +++ b/qdrant-landing/content/articles/qdrant-1.8.x.md @@ -25,9 +25,9 @@ tags: [Qdrant 1.8.0 is out!](https://github.com/qdrant/qdrant/releases/tag/v1.8.0). This time around, we have focused on Qdrant's internals. Our goal was to optimize performance so that your existing setup can run faster and save on compute. Here is what we've been up to: -- **Faster [sparse vectors](https://qdrant.tech/articles/sparse-vectors/):** [Hybrid search](https://qdrant.tech/articles/hybrid-search/) is up to 16x faster now! +- **Faster [sparse vectors](/articles/sparse-vectors/):** [Hybrid search](/articles/hybrid-search/) is up to 16x faster now! - **CPU resource management:** You can allocate CPU threads for faster indexing. -- **Better indexing performance:** We optimized text [indexing](https://qdrant.tech/documentation/concepts/indexing/) on the backend. +- **Better indexing performance:** We optimized text [indexing](/documentation/concepts/indexing/) on the backend. ## Faster search with sparse vectors @@ -49,7 +49,7 @@ A real-life simulation of sparse vector queries was run against the [NeurIPS 202 Latency (y-axis) has dropped significantly for queries. You can see the before/after here: ![dropping latency](/articles_data/qdrant-1.8.x/benchmark.png) -**Figure 1:** Dropping latency in sparse vector search queries across versions 1.7-1.8. +**Figure 1:** Dropping latency in [sparse vector search](/articles/sparse-vectors/) queries across versions 1.7-1.8. The colors within both scatter plots show the frequency of results. 
The red dots show that the highest concentration is around 2200ms (before) and 135ms (after). This tells us that latency for sparse vector queries dropped by about a factor of 16. Therefore, the time it takes to retrieve an answer with Qdrant is that much shorter. @@ -86,13 +86,13 @@ This configuration can be done at any time, but it requires a restart of Qdrant. ## Better indexing for text data -In order to [minimize your RAM expenditure](https://qdrant.tech/articles/memory-consumption/), we have developed a new way to index specific types of data. Please keep in mind that this is a backend improvement, and you won't need to configure anything. +In order to [minimize your RAM expenditure](/articles/memory-consumption/), we have developed a new way to index specific types of data. Please keep in mind that this is a backend improvement, and you won't need to configure anything. > Going forward, if you are indexing immutable text fields, we estimate a 10% reduction in RAM loads. Our benchmark result is based on a system that uses 64GB of RAM. If you are using less RAM, this reduction might be higher than 10%. Immutable text fields are static and do not change once they are added to Qdrant. These entries usually represent some type of attribute, description or tag. Vectors associated with them can be indexed more efficiently, since you don’t need to re-index them anymore. Conversely, mutable fields are dynamic and can be modified after their initial creation. Please keep in mind that they will continue to require additional RAM. -This approach ensures stability in the [vector search](https://qdrant.tech/documentation/overview/vector-search/) index, with faster and more consistent operations. We achieved this by setting up a field index which helps minimize what is stored. To improve search performance we have also optimized the way we load documents for searches with a text field index. Now our backend loads documents mostly sequentially and in increasing order. 
+This approach ensures stability in the [vector search](/documentation/overview/vector-search/) index, with faster and more consistent operations. We achieved this by setting up a field index which helps minimize what is stored. To improve search performance we have also optimized the way we load documents for searches with a text field index. Now our backend loads documents mostly sequentially and in increasing order. ## Minor improvements and new features @@ -107,7 +107,7 @@ Beyond these enhancements, [Qdrant v1.8.0](https://github.com/qdrant/qdrant/rele ## Experience the Power of Qdrant 1.8.0 -Ready to experience the enhanced performance of Qdrant 1.8.0? Upgrade now and explore the major improvements, from faster sparse vectors to optimized CPU resource management and better indexing for text data. Take your search capabilities to the next level with Qdrant's latest version. [Try a demo today](https://qdrant.tech/demo/) and see the difference firsthand! +Ready to experience the enhanced performance of Qdrant 1.8.0? Upgrade now and explore the major improvements, from faster sparse vectors to optimized CPU resource management and better indexing for text data. Take your search capabilities to the next level with Qdrant's latest version. [Try a demo today](/demo/) and see the difference firsthand! ## Release notes diff --git a/qdrant-landing/content/articles/rag-is-dead.md b/qdrant-landing/content/articles/rag-is-dead.md index 737d7d779..49a61b011 100644 --- a/qdrant-landing/content/articles/rag-is-dead.md +++ b/qdrant-landing/content/articles/rag-is-dead.md @@ -19,11 +19,11 @@ keywords: # Is RAG Dead? The Role of Vector Databases in AI Efficiency and Vector Search -When Anthropic came out with a context window of 100K tokens, they said: “*[Vector search](https://qdrant.tech/solutions/) is dead. 
LLMs are getting more accurate and won’t need RAG anymore.*” +When Anthropic came out with a context window of 100K tokens, they said: “*[Vector search](/solutions/) is dead. LLMs are getting more accurate and won’t need RAG anymore.*” Google’s Gemini 1.5 now offers a context window of 10 million tokens. [Their supporting paper](https://storage.googleapis.com/deepmind-media/gemini/gemini_v1_5_report.pdf) claims victory over accuracy issues, even when applying Greg Kamradt’s [NIAH methodology](https://twitter.com/GregKamradt/status/1722386725635580292). -*It’s over. [RAG](https://qdrant.tech/articles/what-is-rag-in-ai/) (Retrieval Augmented Generation) must be completely obsolete now. Right?* +*It’s over. [RAG](/articles/what-is-rag-in-ai/) (Retrieval Augmented Generation) must be completely obsolete now. Right?* No. @@ -37,9 +37,9 @@ This is not surprising. LLMs require massive amounts of compute and memory to ru ## Context stuffing is not the solution -> Relying on context is expensive, and it doesn’t improve response quality in real-world applications. Retrieval based on [vector search](https://qdrant.tech/solutions/) offers much higher precision. +> Relying on context is expensive, and it doesn’t improve response quality in real-world applications. Retrieval based on [vector search](/solutions/) offers much higher precision. -If you solely rely on an [LLM](https://qdrant.tech/articles/what-is-rag-in-ai/) to perfect retrieval and precision, you are doing it wrong. +If you solely rely on an [LLM](/articles/what-is-rag-in-ai/) to perfect retrieval and precision, you are doing it wrong. A large context window makes it harder to focus on relevant information. This increases the risk of errors or hallucinations in its responses. @@ -91,4 +91,4 @@ Our customers remind us of this fact every day. As a product, [our vector databa We want to keep Qdrant compact, efficient and with a focused purpose. This purpose is to empower our customers to use it however they see fit. 
-When large enterprises release their generative AI into production, they need to keep costs under control, while retaining the best possible quality of responses. Qdrant has the [vector search solutions](https://qdrant.tech/solutions/) to do just that. Revolutionize your vector search capabilities and get started with [a Qdrant demo](https://qdrant.tech/contact-us/). \ No newline at end of file +When large enterprises release their generative AI into production, they need to keep costs under control, while retaining the best possible quality of responses. Qdrant has the [vector search solutions](/solutions/) to do just that. Revolutionize your vector search capabilities and get started with [a Qdrant demo](/contact-us/). \ No newline at end of file diff --git a/qdrant-landing/content/articles/rapid-rag-optimization-with-qdrant-and-quotient.md b/qdrant-landing/content/articles/rapid-rag-optimization-with-qdrant-and-quotient.md index b1afa0ad7..d559e56ae 100755 --- a/qdrant-landing/content/articles/rapid-rag-optimization-with-qdrant-and-quotient.md +++ b/qdrant-landing/content/articles/rapid-rag-optimization-with-qdrant-and-quotient.md @@ -21,7 +21,7 @@ keywords: In today's fast-paced, information-rich world, AI is revolutionizing knowledge management. The systematic process of capturing, distributing, and effectively using knowledge within an organization is one of the fields in which AI provides exceptional value today. -> The potential for AI-powered knowledge management increases when leveraging Retrieval Augmented Generation (RAG), a methodology that enables LLMs to access a vast, diverse repository of factual information from knowledge stores, such as vector databases. +> The potential for AI-powered knowledge management increases when leveraging Retrieval Augmented Generation (RAG), a methodology that enables LLMs to access a vast, diverse repository of factual information from knowledge stores, such as [vector databases](/qdrant-vector-database/). 
This process enhances the accuracy, relevance, and reliability of generated text, thereby mitigating the risk of faulty, incorrect, or nonsensical results sometimes associated with traditional LLMs. This method not only ensures that the answers are contextually relevant but also up-to-date, reflecting the latest insights and data available. diff --git a/qdrant-landing/content/articles/scalar-quantization.md b/qdrant-landing/content/articles/scalar-quantization.md index c7b411ab2..039675139 100644 --- a/qdrant-landing/content/articles/scalar-quantization.md +++ b/qdrant-landing/content/articles/scalar-quantization.md @@ -17,7 +17,7 @@ keywords: --- # Efficiency Unleashed: The Power of Scalar Quantization -High-dimensional vector embeddings can be memory-intensive, especially when working with +High-dimensional [vector embeddings](/articles/what-are-embeddings/) can be memory-intensive, especially when working with large datasets consisting of millions of vectors. Memory footprint really starts being a concern when we scale things up. A simple choice of the data type used to store a single number impacts even billions of numbers and can drive the memory requirements crazy. The diff --git a/qdrant-landing/content/articles/seed-round.md b/qdrant-landing/content/articles/seed-round.md index b77f8a196..3edc66cff 100644 --- a/qdrant-landing/content/articles/seed-round.md +++ b/qdrant-landing/content/articles/seed-round.md @@ -13,7 +13,7 @@ date: 2023-04-19T00:42:00.000Z --- -> Vector databases are here to stay. The New Age of AI is powered by vector embeddings, and vector databases are a foundational part of the stack. At Qdrant, we are working on cutting-edge open-source vector similarity search solutions to power fantastic AI applications with the best possible performance and excellent developer experience. +> [Vector databases](/qdrant-vector-database/) are here to stay. 
The New Age of AI is powered by [vector embeddings](/articles/what-are-embeddings/), and vector databases are a foundational part of the stack. At Qdrant, we are working on cutting-edge open-source [vector similarity](/articles/vector-similarity-beyond-search/) search solutions to power fantastic AI applications with the best possible performance and excellent developer experience. > > Our 7.5M seed funding – led by [Unusual Ventures](https://www.unusual.vc/), awesome angels, and existing investors – will help us bring these innovations to engineers and empower them to make the most of their unstructured data and the awesome power of LLMs at any scale. @@ -41,7 +41,7 @@ A new AI product category, “Co-Pilot for X,” was born and is already affecti At the same time, adoption has only begun. Vector Search Databases are replacing VSS libraries like FAISS, etc., which, despite their disadvantages, are still used by ~90% of projects out there They’re hard-coupled to the application code, lack of production-ready features like basic CRUD operations or advanced filtering, are a nightmare to maintain and scale and have many other difficulties that make life hard for developers. -The current Qdrant ecosystem consists of excellent products to work with vector embeddings. We launched our managed vector database solution, Qdrant Cloud, early this year, and it is already serving more than 1,000 Qdrant clusters. We are extending our offering now with managed on-premise solutions for enterprise customers. +The current Qdrant ecosystem consists of excellent products to work with vector embeddings. We launched our managed [vector database solution](/qdrant-vector-database/), Qdrant Cloud, early this year, and it is already serving more than 1,000 Qdrant clusters. We are extending our offering now with managed on-premise solutions for enterprise customers. 
{{< figure src=/articles_data/seed-round/ecosystem.png caption="Qdrant Ecosystem" alt="Qdrant Vector Database Ecosystem" >}} diff --git a/qdrant-landing/content/articles/sparse-vectors.md b/qdrant-landing/content/articles/sparse-vectors.md index 5f0c8a31e..8d70ca505 100644 --- a/qdrant-landing/content/articles/sparse-vectors.md +++ b/qdrant-landing/content/articles/sparse-vectors.md @@ -60,7 +60,7 @@ For example, in the medical domain, many rare terms are not present in the gener | **Data Representation** | Majority of elements are zero | All elements are non-zero | | **Computational Efficiency** | Generally higher, especially in operations involving zero elements | Lower, as operations are performed on all elements | | **Information Density** | Less dense, focuses on key features | Highly dense, capturing nuanced relationships | -| **Example Applications** | Text search, Hybrid search | [RAG](https://qdrant.tech/articles/what-is-rag-in-ai/), many general machine learning tasks | +| **Example Applications** | Text search, Hybrid search | [RAG](/articles/what-is-rag-in-ai/), many general machine learning tasks | Where do sparse vectors fail though? They're not great at capturing nuanced relationships between words. For example, they can't capture the relationship between "king" and "queen" as well as dense vectors. @@ -407,7 +407,7 @@ This formula calculates the similarity score by multiplying corresponding elemen ## Hybrid search: combining sparse and dense vectors -By combining search results from both dense and sparse vectors, you can achieve a hybrid search that is both efficient and accurate. +By combining search results from both dense and sparse vectors, you can achieve a [hybrid search](/articles/hybrid-search/) that is both efficient and accurate. Results from sparse vectors will guarantee, that all results with the required keywords are returned, while dense vectors will cover the semantically similar results. 
diff --git a/qdrant-landing/content/articles/vector-similarity-beyond-search.md b/qdrant-landing/content/articles/vector-similarity-beyond-search.md index 76fe4ab43..c75b6139f 100644 --- a/qdrant-landing/content/articles/vector-similarity-beyond-search.md +++ b/qdrant-landing/content/articles/vector-similarity-beyond-search.md @@ -24,7 +24,7 @@ keywords: When making use of unstructured data, there are traditional go-to solutions that are well-known for developers: - **Full-text search** when you need to find documents that contain a particular word or phrase. -- **[Vector search](https://qdrant.tech/documentation/overview/vector-search/)** when you need to find documents that are semantically similar to a given query. +- **[Vector search](/documentation/overview/vector-search/)** when you need to find documents that are semantically similar to a given query. Sometimes people mix those two approaches, so it might look like the vector similarity is just an extension of full-text search. However, in this article, we will explore some promising new techniques that can be used to expand the use-case of unstructured data and demonstrate that vector similarity creates its own stack of data exploration tools. @@ -204,5 +204,5 @@ We believe that this is the future of vector databases, and we are excited to se - Practical applications of vector similarity include improving data quality through mislabeling detection and anomaly identification. - Enhanced user experiences are achieved by leveraging advanced search techniques, providing users with intuitive data exploration, and improving decision-making processes. -Ready to unlock the full potential of your data? [Try a free demo](https://qdrant.tech/contact-us/) to explore how vector similarity can revolutionize your data insights and drive smarter decision-making. +Ready to unlock the full potential of your data? 
[Try a free demo](/contact-us/) to explore how vector similarity can revolutionize your data insights and drive smarter decision-making. diff --git a/qdrant-landing/content/articles/what-are-embeddings.md b/qdrant-landing/content/articles/what-are-embeddings.md index 575b9e0e5..afe6a9f23 100644 --- a/qdrant-landing/content/articles/what-are-embeddings.md +++ b/qdrant-landing/content/articles/what-are-embeddings.md @@ -54,7 +54,7 @@ At their core, vector embeddings are about semantics. They take the idea that "a ![Example of how synonyms are placed closer together in the embeddings space](/articles_data/what-are-embeddings/Similar-Embeddings.jpg) -This capability is crucial for creating search systems, recommendation engines, retrieval augmented generation (RAG) and any application that benefits from a deep understanding of content. +This capability is crucial for creating search systems, recommendation engines, [retrieval augmented generation (RAG)](/rag/) and any application that benefits from a deep understanding of content. ## How do embeddings work? @@ -134,7 +134,7 @@ Selecting the right embedding model for your use case is crucial to your applica If you’re looking for NLP and rapid prototyping, including language translation, question-answering, and text generation, OpenAI is a great choice. Gemini is ideal for image search, duplicate detection, and clustering tasks. -Fastembed, which we’ll use on the example below, is designed for efficiency and speed, great for applications needing low-latency responses, such as autocomplete and instant content recommendations. +[Fastembed](/articles/fastembed/), which we’ll use on the example below, is designed for efficiency and speed, great for applications needing low-latency responses, such as autocomplete and instant content recommendations. We plan to go deeper into selecting the best model based on performance, cost, integration ease, and scalability in a future post. 
diff --git a/qdrant-landing/content/articles/what-is-a-vector-database.md b/qdrant-landing/content/articles/what-is-a-vector-database.md index f74fcbbfc..d00f003cf 100644 --- a/qdrant-landing/content/articles/what-is-a-vector-database.md +++ b/qdrant-landing/content/articles/what-is-a-vector-database.md @@ -21,11 +21,11 @@ aliases: [ /blog/what-is-a-vector-database/ ] # Why use a Vector Database & How Does it Work? -In the ever-evolving landscape of data management and artificial intelligence, [vector databases](https://qdrant.tech/qdrant-vector-database/) have emerged as a revolutionary tool for efficiently handling complex, high-dimensional data. But what exactly is a vector database? This comprehensive guide delves into the fundamentals of vector databases, exploring their unique capabilities, core functionalities, and real-world applications. +In the ever-evolving landscape of data management and artificial intelligence, [vector databases](/qdrant-vector-database/) have emerged as a revolutionary tool for efficiently handling complex, high-dimensional data. But what exactly is a vector database? This comprehensive guide delves into the fundamentals of vector databases, exploring their unique capabilities, core functionalities, and real-world applications. ## What is a Vector Database? -A [Vector Database](https://qdrant.tech/qdrant-vector-database/) is a specialized database system designed for efficiently indexing, querying, and retrieving high-dimensional vector data. Those systems enable advanced data analysis and similarity-search operations that extend well beyond the traditional, structured query approach of conventional databases. +A [Vector Database](/qdrant-vector-database/) is a specialized database system designed for efficiently indexing, querying, and retrieving high-dimensional vector data. 
Those systems enable advanced data analysis and similarity-search operations that extend well beyond the traditional, structured query approach of conventional databases. ## Why use a Vector Database? @@ -70,7 +70,7 @@ The **creation** of vector data (so we can store this high-dimensional data on o ### How do Embeddings Work? -[Embeddings](https://qdrant.tech/articles/what-are-embeddings/) translate this high-dimensional data into a more manageable, **lower-dimensional** vector form that's more suitable for machine learning and data processing applications, typically through **neural network models**. +[Embeddings](/articles/what-are-embeddings/) translate this high-dimensional data into a more manageable, **lower-dimensional** vector form that's more suitable for machine learning and data processing applications, typically through **neural network models**. In creating dimensions for text, for example, the process involves analyzing the text to capture its linguistic elements. @@ -134,9 +134,9 @@ Once the closest vectors are identified at the bottom layer, these points transl ### Scalability -[Vector databases](https://qdrant.tech/qdrant-vector-database/) often deal with datasets that comprise billions of high-dimensional vectors. This data isn't just large in volume but also complex in nature, requiring more computing power and memory to process. Scalable systems can handle this increased complexity without performance degradation. This is achieved through a combination of a **distributed architecture**, **dynamic resource allocation**, **data partitioning**, **load balancing**, and **optimization techniques**. +[Vector databases](/qdrant-vector-database/) often deal with datasets that comprise billions of high-dimensional vectors. This data isn't just large in volume but also complex in nature, requiring more computing power and memory to process. Scalable systems can handle this increased complexity without performance degradation. 
This is achieved through a combination of a **distributed architecture**, **dynamic resource allocation**, **data partitioning**, **load balancing**, and **optimization techniques**. -Systems like Qdrant exemplify scalability in vector databases. It [leverages Rust's efficiency](https://qdrant.tech/articles/why-rust/) in **memory management** and **performance**, which allows the handling of large-scale data with optimized resource usage. +Systems like Qdrant exemplify scalability in vector databases. It [leverages Rust's efficiency](/articles/why-rust/) in **memory management** and **performance**, which allows the handling of large-scale data with optimized resource usage. ### Efficient Query Processing @@ -197,13 +197,13 @@ Alternatively, the Memmap storage option creates a virtual address space linked ## Vector Database Use Cases -If we had to summarize the [use cases for vector databases](https://qdrant.tech/use-cases/) into a single word, it would be "match". They are great at finding non-obvious ways to correspond or “match” data with a given query. Whether it's through similarity in images, text, user preferences, or patterns in data. +If we had to summarize the [use cases for vector databases](/use-cases/) into a single word, it would be "match". They are great at finding non-obvious ways to correspond or “match” data with a given query. Whether it's through similarity in images, text, user preferences, or patterns in data. Here are some examples of how to take advantage of using vector databases: -[Personalized recommendation systems](https://qdrant.tech/recommendations/) to analyze and interpret complex user data, such as preferences, behaviors, and interactions. For example, on Spotify, if a user frequently listens to the same song or skips it, the recommendation engine takes note of this to personalize future suggestions. 
+[Personalized recommendation systems](/recommendations/) to analyze and interpret complex user data, such as preferences, behaviors, and interactions. For example, on Spotify, if a user frequently listens to the same song or skips it, the recommendation engine takes note of this to personalize future suggestions. -[Semantic search](https://qdrant.tech/documentation/tutorials/search-beginners/) allows for systems to be able to capture the deeper semantic meaning of words and text. In modern search engines, if someone searches for "tips for planting in spring," it tries to understand the intent and contextual meaning behind the query. It doesn’t try just matching the words themselves. +[Semantic search](/documentation/tutorials/search-beginners/) allows for systems to be able to capture the deeper semantic meaning of words and text. In modern search engines, if someone searches for "tips for planting in spring," it tries to understand the intent and contextual meaning behind the query. It doesn’t try just matching the words themselves. Here’s an example of a [vector search engine for Startups](https://demo.qdrant.tech/) made with Qdrant: diff --git a/qdrant-landing/content/articles/what-is-rag-in-ai.md b/qdrant-landing/content/articles/what-is-rag-in-ai.md index 0a0730bc6..0f50eb370 100644 --- a/qdrant-landing/content/articles/what-is-rag-in-ai.md +++ b/qdrant-landing/content/articles/what-is-rag-in-ai.md @@ -22,7 +22,7 @@ tags: --- -> Retrieval-augmented generation (RAG) integrates external information retrieval into the process of generating responses by Large Language Models (LLMs). It searches a database for information beyond its pre-trained knowledge base, significantly improving the accuracy and relevance of the generated responses. +> [Retrieval-augmented generation (RAG)](/rag/) integrates external information retrieval into the process of generating responses by Large Language Models (LLMs). 
It searches a database for information beyond its pre-trained knowledge base, significantly improving the accuracy and relevance of the generated responses. Language models have exploded on the internet ever since ChatGPT came out, and rightfully so. They can write essays, code entire programs, and even make memes (though we’re still deciding on whether that's a good thing). @@ -36,7 +36,7 @@ The image above shows how a basic RAG system works. Before forwarding the questi As your data grows, you’ll need efficient ways to identify the most relevant information for your LLM's limited memory. This is where you’ll want a proper way to store and retrieve the specific data you’ll need for your query, without needing the LLM to remember it. -**Vector databases** store information as **vector embeddings**. This format supports efficient similarity searches to retrieve relevant data for your query. For example, Qdrant is specifically designed to perform fast, even in scenarios dealing with billions of vectors. +[**Vector databases**](/qdrant-vector-database/) store information as **vector embeddings**. This format supports efficient similarity searches to retrieve relevant data for your query. For example, Qdrant is specifically designed to perform fast, even in scenarios dealing with billions of vectors. This article will focus on RAG systems and architecture. If you’re interested in learning more about vector search, we recommend the following articles: [What is a Vector Database?](/articles/what-is-a-vector-database/) and [What are Vector Embeddings?](/articles/what-are-embeddings/). @@ -77,14 +77,14 @@ Once you have vectorized your knowledge base you can do the same to the user que #### Retrieval of relevant documents -When the system needs to find the most relevant documents or passages to answer a query, it utilizes vector similarity techniques. 
**Vector similarity** is a fundamental concept in machine learning and natural language processing (NLP) that quantifies the resemblance between vectors, which are mathematical representations of data points. +When the system needs to find the most relevant documents or passages to answer a query, it utilizes vector similarity techniques. [**Vector similarity**](/articles/vector-similarity-beyond-search/) is a fundamental concept in machine learning and natural language processing (NLP) that quantifies the resemblance between vectors, which are mathematical representations of data points. The system can employ different vector similarity strategies depending on the type of vectors used to represent the data: ##### Sparse vector representations -A sparse vector is characterized by a high dimensionality, with most of its elements being zero. +A [sparse vector](/articles/sparse-vectors/) is characterized by a high dimensionality, with most of its elements being zero. The classic approach is **keyword search**, which scans documents for the exact words or phrases in the query. The search creates sparse vector representations of documents by counting word occurrences and inversely weighting common words. Queries with rarer words get prioritized. diff --git a/qdrant-landing/content/articles/why-rust.md b/qdrant-landing/content/articles/why-rust.md index dda52b5c0..67fb396a7 100644 --- a/qdrant-landing/content/articles/why-rust.md +++ b/qdrant-landing/content/articles/why-rust.md @@ -19,7 +19,7 @@ Looking at the [github repository](https://github.com/qdrant/qdrant), you can se **Scala** also builds on the JVM, although there is a native compiler, there was the question of compatibility. So Scala shared the limitations of Java, and although it has some nice high-level amenities (of which Java only recently copied a subset), it still doesn’t offer the same level of control over memory layout as, say, C++, so it is similarly disqualified. 
-**Python**, being just a bit younger than Java, is ubiquitous in ML projects, mostly owing to its tooling (notably jupyter notebooks), being easy to learn and integration in most ML stacks. It doesn’t have a traditional garbage collector, opting for ubiquitous reference counting instead, which somewhat helps memory consumption. With that said, unless you only use it as glue code over high-perf modules, you may find yourself waiting for results. Also getting complex python services to perform stably under load is a serious technical challenge. +**Python**, being just a bit younger than Java, is ubiquitous in ML projects, mostly owing to its tooling (notably jupyter notebooks), being easy to learn and integration in most ML stacks. It doesn’t have a traditional garbage collector, opting for ubiquitous reference counting instead, which somewhat helps [memory consumption](/articles/memory-consumption/). With that said, unless you only use it as glue code over high-perf modules, you may find yourself waiting for results. Also getting complex python services to perform stably under load is a serious technical challenge. ## Into the Unknown diff --git a/qdrant-landing/content/benchmarks/single-node-speed-benchmark.md b/qdrant-landing/content/benchmarks/single-node-speed-benchmark.md index ae48ca0fa..11656090f 100644 --- a/qdrant-landing/content/benchmarks/single-node-speed-benchmark.md +++ b/qdrant-landing/content/benchmarks/single-node-speed-benchmark.md @@ -3,7 +3,7 @@ draft: false id: 1 title: Single node benchmarks description: | - We benchmarked several vector databases using various configurations of them on different datasets to check how the results may vary. Those datasets may have different vector dimensionality but also vary in terms of the distance function being used. We also tried to capture the difference we can expect while using some different configuration parameters, for both the engine itself and the search operation separately.

Updated: January/June 2024 + We benchmarked several vector databases using various configurations of them on different datasets to check how the results may vary. Those datasets may have different vector dimensionality but also vary in terms of the distance function being used. We also tried to capture the difference we can expect while using some different configuration parameters, for both the engine itself and the search operation separately.

Updated: January/June 2024 single_node_title: Single node benchmarks single_node_data: /benchmarks/results-1-100-thread-2024-06-15.json preview_image: /benchmarks/benchmark-1.png diff --git a/qdrant-landing/content/blog/are-you-vendor-locked.md b/qdrant-landing/content/blog/are-you-vendor-locked.md index bd65c84d6..0f03d6de3 100644 --- a/qdrant-landing/content/blog/are-you-vendor-locked.md +++ b/qdrant-landing/content/blog/are-you-vendor-locked.md @@ -89,7 +89,7 @@ This is how you stay dynamic and move vendors whenever it suits you best. The key to freedom is to building your applications and infrastructure to run on any cloud. By leveraging containerization and service abstraction using Kubernetes or Docker, software vendors can exercise good faith in helping their customers transition to other cloud providers. -We designed the architecture of Qdrant Hybrid Cloud to meet the evolving needs of businesses seeking unparalleled flexibility, control, and privacy. +We designed the architecture of [Qdrant Hybrid Cloud](/hybrid-cloud/) to meet the evolving needs of businesses seeking unparalleled flexibility, control, and privacy. This technology integrates Kubernetes clusters from any setting - cloud, on-premises, or edge - into a unified, enterprise-grade managed service. diff --git a/qdrant-landing/content/blog/azure-marketplace.md b/qdrant-landing/content/blog/azure-marketplace.md index 72f261584..7d2a26085 100644 --- a/qdrant-landing/content/blog/azure-marketplace.md +++ b/qdrant-landing/content/blog/azure-marketplace.md @@ -27,7 +27,7 @@ We're thrilled to announce that Qdrant is now [officially available on Azure Mar ## Key Benefits for Users: -- **Rapid Application Development:** Deploying a cluster on Microsoft Azure via the Qdrant Cloud console only takes a few seconds and can scale up as needed, giving developers maximal flexibility for their production deployments. 
+- **Rapid Application Development:** Deploying a cluster on Microsoft Azure via the [Qdrant Cloud](/cloud/) console only takes a few seconds and can scale up as needed, giving developers maximal flexibility for their production deployments. - **Billion Vector Scale:** Seamlessly grow and handle large-scale datasets with billions of vectors by leveraging Qdrant's features like vertical and horizontal scaling or binary quantization with Microsoft Azure's scalable infrastructure. diff --git a/qdrant-landing/content/blog/batch-vector-search-with-qdrant.md b/qdrant-landing/content/blog/batch-vector-search-with-qdrant.md index a5b397376..515745c70 100644 --- a/qdrant-landing/content/blog/batch-vector-search-with-qdrant.md +++ b/qdrant-landing/content/blog/batch-vector-search-with-qdrant.md @@ -18,7 +18,7 @@ tags: # How to Optimize Vector Search Using Batch Search in Qdrant 0.10.0 -The latest release of Qdrant 0.10.0 has introduced a lot of functionalities that simplify some common tasks. Those new possibilities come with some slightly modified interfaces of the client library. One of the recently introduced features is the possibility to query the collection with [multiple vectors](https://qdrant.tech/blog/storing-multiple-vectors-per-object-in-qdrant/) at once — a batch search mechanism. +The latest release of Qdrant 0.10.0 has introduced a lot of functionalities that simplify some common tasks. Those new possibilities come with some slightly modified interfaces of the client library. One of the recently introduced features is the possibility to query the collection with [multiple vectors](/blog/storing-multiple-vectors-per-object-in-qdrant/) at once — a batch search mechanism. There are a lot of scenarios in which you may need to perform multiple non-related tasks at the same time. Previously, you only could send several requests to Qdrant API on your own. 
But multiple parallel requests may cause significant network overhead and slow down the process, especially in case of poor connection speed. diff --git a/qdrant-landing/content/blog/binary-quantization-andrey-vasnetsov-vector-space-talk-001.md b/qdrant-landing/content/blog/binary-quantization-andrey-vasnetsov-vector-space-talk-001.md index 5c097d0d1..9d08fc0ea 100644 --- a/qdrant-landing/content/blog/binary-quantization-andrey-vasnetsov-vector-space-talk-001.md +++ b/qdrant-landing/content/blog/binary-quantization-andrey-vasnetsov-vector-space-talk-001.md @@ -22,7 +22,7 @@ tags: Ever wonder why we need quantization for vector indexes? Andrey Vasnetsov explains the complexities and challenges of searching through proximity graphs. Binary quantization reduces storage size and boosts speed by 30x, but not all models are compatible. -Andrey worked as a Machine Learning Engineer most of his career. He prefers practical over theoretical, working demo over arXiv paper. He is currently working as the CTO at Qdrant a Vector Similarity Search Engine, which can be used for semantic search, similarity matching of text, images or even videos, and also recommendations. +Andrey worked as a Machine Learning Engineer most of his career. He prefers practical over theoretical, working demo over arXiv paper. He is currently working as the CTO at Qdrant a [Vector Similarity Search Engine](/articles/vector-similarity-beyond-search/), which can be used for semantic search, similarity matching of text, images or even videos, and also recommendations. ***Listen to the episode on [Spotify](https://open.spotify.com/episode/7dPOm3x4rDBwSFkGZuwaMq?si=Ip77WCa_RCCYebeHX6DTMQ), Apple Podcast, Podcast addicts, Castbox. 
You can also watch this episode on [YouTube](https://youtu.be/4aUq5VnR_VI).*** diff --git a/qdrant-landing/content/blog/building-a-high-performance-entity-matching-solution-with-qdrant-rishabh-bhardwaj-vector-space-talks-005.md b/qdrant-landing/content/blog/building-a-high-performance-entity-matching-solution-with-qdrant-rishabh-bhardwaj-vector-space-talks-005.md index 19ea5e6ad..1f638a2c3 100644 --- a/qdrant-landing/content/blog/building-a-high-performance-entity-matching-solution-with-qdrant-rishabh-bhardwaj-vector-space-talks-005.md +++ b/qdrant-landing/content/blog/building-a-high-performance-entity-matching-solution-with-qdrant-rishabh-bhardwaj-vector-space-talks-005.md @@ -21,7 +21,7 @@ tags: > -- Rishabh Bhardwaj > -How does the HNSW (Hierarchical Navigable Small World) algorithm benefit the solution built by Rishabh? +How does the [HNSW (Hierarchical Navigable Small World)](/articles/filtrable-hnsw/) algorithm benefit the solution built by Rishabh? Rhishabh, a Data Engineer at HRS Group, excels in designing, developing, and maintaining data pipelines and infrastructure crucial for data-driven decision-making processes. With extensive experience, Rhishabh brings a profound understanding of data engineering principles and best practices to the role. Proficient in SQL, Python, Airflow, ETL tools, and cloud platforms like AWS and Azure, Rhishabh has a proven track record of delivering high-quality data solutions that align with business needs. Collaborating closely with data analysts, scientists, and stakeholders at HRS Group, Rhishabh ensures the provision of valuable data and insights for informed decision-making. 
diff --git a/qdrant-landing/content/blog/building-search-rag-for-an-openapi-spec-nick-khami-vector-space-talks.md b/qdrant-landing/content/blog/building-search-rag-for-an-openapi-spec-nick-khami-vector-space-talks.md index 0a8648388..3e91ed0e7 100644 --- a/qdrant-landing/content/blog/building-search-rag-for-an-openapi-spec-nick-khami-vector-space-talks.md +++ b/qdrant-landing/content/blog/building-search-rag-for-an-openapi-spec-nick-khami-vector-space-talks.md @@ -57,7 +57,7 @@ Here are five key takeaways from this episode: 17:00 Trieve wrapped up YC W24 batch.\ 21:45 Revolutionizing company search.\ 23:30 Next update: user tracking, analytics, and cross-encoders.\ -27:39 Quadruple supported sparse vectors.\ +27:39 Quadruple supported [sparse vectors](/articles/sparse-vectors/).\ 30:09 Final questions and wrap up. ## More Quotes from Nick: diff --git a/qdrant-landing/content/blog/case-study-bloop.md b/qdrant-landing/content/blog/case-study-bloop.md index b587bc58c..4c07e23dd 100644 --- a/qdrant-landing/content/blog/case-study-bloop.md +++ b/qdrant-landing/content/blog/case-study-bloop.md @@ -51,7 +51,7 @@ to make the most of unstructured data. It is easy to use, deploy and scale, blaz simultaneously. Qdrant was founded in 2021 in Berlin by Andre Zayarni and Andrey Vasnestov with the mission to power the -next generation of AI applications with advanced and high-performant vector similarity search technology. +next generation of AI applications with advanced and high-performant [vector similarity](/articles/vector-similarity-beyond-search/) search technology. Their flagship product is the vector search database which is available as an open source https://github.com/qdrant/qdrant or managed cloud solution https://cloud.qdrant.io/. 
diff --git a/qdrant-landing/content/blog/case-study-dailymotion.md b/qdrant-landing/content/blog/case-study-dailymotion.md index 77875cc21..56cdabad1 100644 --- a/qdrant-landing/content/blog/case-study-dailymotion.md +++ b/qdrant-landing/content/blog/case-study-dailymotion.md @@ -75,7 +75,7 @@ Title , Tags , Description , Transcript (generated by [OpenAI whisper](https://o ![quote-from-Samuel](/case-studies/dailymotion/Dailymotion-Quote.jpg) -Looking at the complexity, scale and adaptability of the desired solution, the team decided to leverage Qdrant’s vector database to implement a content-based video recommendation that undoubtedly offered several advantages over other methods: +Looking at the complexity, scale and adaptability of the desired solution, the team decided to leverage [Qdrant’s vector database](/qdrant-vector-database/) to implement a content-based video recommendation that undoubtedly offered several advantages over other methods: **1. Efficiency in High-Dimensional Data Handling:** diff --git a/qdrant-landing/content/blog/case-study-dust.md b/qdrant-landing/content/blog/case-study-dust.md index 87f7b3614..4095c9ce4 100644 --- a/qdrant-landing/content/blog/case-study-dust.md +++ b/qdrant-landing/content/blog/case-study-dust.md @@ -59,7 +59,7 @@ strategy with the embeddings models and performs retrieval augmented generation. ![solution-laptop-screen](/case-studies/dust/laptop-solutions.jpg) -For this, Dust required a vector database and evaluated different options +For this, Dust required a [vector database](/articles/what-is-a-vector-database/) and evaluated different options including Pinecone and Weaviate, but ultimately decided on Qdrant as the solution of choice. “We particularly liked Qdrant because it is open-source, written in Rust, and it has a well-designed API,” Polu says. 
For example, Dust diff --git a/qdrant-landing/content/blog/case-study-visua.md b/qdrant-landing/content/blog/case-study-visua.md index d0a8503da..1c2af3d5f 100644 --- a/qdrant-landing/content/blog/case-study-visua.md +++ b/qdrant-landing/content/blog/case-study-visua.md @@ -31,7 +31,7 @@ The accuracy of object detection within images is critical for VISUA ensuring th The challenge was twofold. First, VISUA needed a method to rapidly and accurately identify images and the objects within them that were similar, to identify false negatives, or unclear outcomes and use them as inputs for reinforcement learning. -Second, the rapid growth in data volume challenged their previous quality control processes, which relied on a sampling method based on meta-information (like analyzing lower-confidence, smaller, or blurry images), which involved more manual reviews and was not as scalable as needed. In response, the team at VISUA explored vector databases as a solution. +Second, the rapid growth in data volume challenged their previous quality control processes, which relied on a sampling method based on meta-information (like analyzing lower-confidence, smaller, or blurry images), which involved more manual reviews and was not as scalable as needed. In response, the team at VISUA explored [vector databases as a solution](/qdrant-vector-database/). ## The Solution diff --git a/qdrant-landing/content/blog/comparing-qdrant-vs-pinecone-vector-databases.md b/qdrant-landing/content/blog/comparing-qdrant-vs-pinecone-vector-databases.md index faeb9f1c6..a465f5033 100644 --- a/qdrant-landing/content/blog/comparing-qdrant-vs-pinecone-vector-databases.md +++ b/qdrant-landing/content/blog/comparing-qdrant-vs-pinecone-vector-databases.md @@ -20,7 +20,7 @@ tags: # Comparing Qdrant vs Pinecone: Vector Database Showdown -Data forms the foundation upon which AI applications are built. Data can exist in both structured and unstructured formats. 
Structured data typically has well-defined schemas or inherent relationships. However, unstructured data, such as text, image, audio, or video, must first be converted into numerical representations known as [vector embeddings](https://qdrant.tech/articles/what-are-embeddings/). These embeddings encapsulate the semantic meaning or features of unstructured data and are in the form of high-dimensional vectors. +Data forms the foundation upon which AI applications are built. Data can exist in both structured and unstructured formats. Structured data typically has well-defined schemas or inherent relationships. However, unstructured data, such as text, image, audio, or video, must first be converted into numerical representations known as [vector embeddings](/articles/what-are-embeddings/). These embeddings encapsulate the semantic meaning or features of unstructured data and are in the form of high-dimensional vectors. Traditional databases, while effective at handling structured data, fall short when dealing with high-dimensional unstructured data, which are increasingly the focal point of modern AI applications. Key reasons include: @@ -40,18 +40,18 @@ Over the past few years, several vector database solutions have emerged – the Qdrant is a high-performance, open-source vector similarity search engine built with Rust, designed to handle the demands of large-scale AI applications with exceptional speed and reliability. Founded in 2021, Qdrant's mission is to "build the most efficient, scalable, and high-performance vector database in the market." This mission is reflected in its architecture and feature set. -Qdrant is highly scalable and performant: it can handle billions of vectors efficiently and with [minimal latency](https://qdrant.tech/benchmarks/). Its advanced vector indexing, search, and retrieval capabilities make it ideal for applications that require fast and accurate search results. 
It supports vertical and horizontal scaling, advanced compression techniques, highly flexible deployment options – including cloud-native, hybrid cloud, and private cloud solutions – and powerful security features. +Qdrant is highly scalable and performant: it can handle billions of vectors efficiently and with [minimal latency](/benchmarks/). Its advanced vector indexing, search, and retrieval capabilities make it ideal for applications that require fast and accurate search results. It supports vertical and horizontal scaling, advanced compression techniques, highly flexible deployment options – including cloud-native, hybrid cloud, and private cloud solutions – and powerful security features. Let’s look at some of its key features. -- **Advanced Similarity Search:** Qdrant supports various similarity [search](https://qdrant.tech/documentation/concepts/search/) metrics like dot product, cosine similarity, Euclidean distance, and Manhattan distance. You can store additional information along with vectors, known as [payload](https://qdrant.tech/documentation/concepts/payload/) in Qdrant terminology. A payload is any JSON formatted data. +- **Advanced Similarity Search:** Qdrant supports various similarity [search](/documentation/concepts/search/) metrics like dot product, cosine similarity, Euclidean distance, and Manhattan distance. You can store additional information along with vectors, known as [payload](/documentation/concepts/payload/) in Qdrant terminology. A payload is any JSON formatted data. - **Built Using Rust:** Qdrant is built with Rust, and leverages its performance and efficiency. Rust is famed for its [memory safety](https://arxiv.org/abs/2206.05503) without the overhead of a garbage collector, and rivals C and C++ in speed. -- **Scaling and Multitenancy**: Qdrant supports both vertical and horizontal scaling and uses the Raft consensus protocol for [distributed deployments](https://qdrant.tech/documentation/guides/distributed_deployment/). 
Developers can run Qdrant clusters with replicas and shards, and seamlessly scale to handle large datasets. Qdrant also supports [multitenancy](https://qdrant.tech/documentation/guides/multiple-partitions/) where developers can create single collections and partition them using payload. -- **Payload Indexing and Filtering:** Just as Qdrant allows attaching any JSON payload to vectors, it also supports payload indexing and [filtering](https://qdrant.tech/documentation/concepts/filtering/) with a wide range of data types and query conditions, including keyword matching, full-text filtering, numerical ranges, nested object filters, and [geo](https://qdrant.tech/documentation/concepts/filtering/#geo)filtering. -- **Hybrid Search with Sparse Vectors:** Qdrant supports both dense and [sparse vectors](https://qdrant.tech/articles/sparse-vectors/), thereby enabling hybrid search capabilities. Sparse vectors are numerical representations of data where most of the elements are zero. Developers can combine search results from dense and sparse vectors, where sparse vectors ensure that results containing the specific keywords are returned and dense vectors identify semantically similar results. -- **Built-In Vector Quantization:** Qdrant offers three different [quantization](https://qdrant.tech/documentation/guides/quantization/) options to developers to optimize resource usage. Scalar quantization balances accuracy, speed, and compression by converting 32-bit floats to 8-bit integers. Binary quantization, the fastest method, significantly reduces memory usage. Product quantization offers the highest compression, and is perfect for memory-constrained scenarios. -- **Flexible Deployment Options:** Qdrant offers a range of deployment options. Developers can easily set up Qdrant (or Qdrant cluster) [locally](https://qdrant.tech/documentation/quick-start/#download-and-run) using Docker for free. 
[Qdrant Cloud](https://qdrant.tech/cloud/), on the other hand, is a scalable, managed solution that provides easy access with flexible pricing. Additionally, Qdrant offers [Hybrid Cloud](https://qdrant.tech/hybrid-cloud/) which integrates Kubernetes clusters from cloud, on-premises, or edge, into an enterprise-grade managed service. -- **Security through API Keys, JWT and RBAC:** Qdrant offers developers various ways to [secure](https://qdrant.tech/documentation/guides/security/) their instances. For simple authentication, developers can use API keys (including Read Only API keys). For more granular access control, it offers JSON Web Tokens (JWT) and the ability to build Role-Based Access Control (RBAC). TLS can be enabled to secure connections. Qdrant is also [SOC 2 Type II](https://qdrant.tech/blog/qdrant-soc2-type2-audit/) certified. +- **Scaling and Multitenancy**: Qdrant supports both vertical and horizontal scaling and uses the Raft consensus protocol for [distributed deployments](/documentation/guides/distributed_deployment/). Developers can run Qdrant clusters with replicas and shards, and seamlessly scale to handle large datasets. Qdrant also supports [multitenancy](/documentation/guides/multiple-partitions/) where developers can create single collections and partition them using payload. +- **Payload Indexing and Filtering:** Just as Qdrant allows attaching any JSON payload to vectors, it also supports payload indexing and [filtering](/documentation/concepts/filtering/) with a wide range of data types and query conditions, including keyword matching, full-text filtering, numerical ranges, nested object filters, and [geo](/documentation/concepts/filtering/#geo)filtering. +- **Hybrid Search with Sparse Vectors:** Qdrant supports both dense and [sparse vectors](/articles/sparse-vectors/), thereby enabling hybrid search capabilities. Sparse vectors are numerical representations of data where most of the elements are zero. 
Developers can combine search results from dense and sparse vectors, where sparse vectors ensure that results containing the specific keywords are returned and dense vectors identify semantically similar results. +- **Built-In Vector Quantization:** Qdrant offers three different [quantization](/documentation/guides/quantization/) options to developers to optimize resource usage. Scalar quantization balances accuracy, speed, and compression by converting 32-bit floats to 8-bit integers. Binary quantization, the fastest method, significantly reduces memory usage. Product quantization offers the highest compression, and is perfect for memory-constrained scenarios. +- **Flexible Deployment Options:** Qdrant offers a range of deployment options. Developers can easily set up Qdrant (or Qdrant cluster) [locally](/documentation/quick-start/#download-and-run) using Docker for free. [Qdrant Cloud](/cloud/), on the other hand, is a scalable, managed solution that provides easy access with flexible pricing. Additionally, Qdrant offers [Hybrid Cloud](/hybrid-cloud/) which integrates Kubernetes clusters from cloud, on-premises, or edge, into an enterprise-grade managed service. +- **Security through API Keys, JWT and RBAC:** Qdrant offers developers various ways to [secure](/documentation/guides/security/) their instances. For simple authentication, developers can use API keys (including Read Only API keys). For more granular access control, it offers JSON Web Tokens (JWT) and the ability to build Role-Based Access Control (RBAC). TLS can be enabled to secure connections. Qdrant is also [SOC 2 Type II](/blog/qdrant-soc2-type2-audit/) certified. Additionally, Qdrant integrates seamlessly with popular machine learning frameworks such as LangChain, LlamaIndex, and Haystack; and Qdrant Hybrid Cloud integrates seamlessly with AWS, DigitalOcean, Google Cloud, Linode, Oracle Cloud, OpenShift, and Azure, among others. 
@@ -127,7 +127,7 @@ When choosing between Qdrant and Pinecone, you need to consider some key factors ### **5. Cost** -**Qdrant** can be self-hosted locally (single node or a cluster) with a single Docker command. With its SaaS option, it offers a free tier in Qdrant Cloud sufficient for around 1M 768-dimensional vectors, without any limitation on the number of collections it is used for. This allows developers to build multiple demos without limitations. For more pricing information, check [here](https://qdrant.tech/pricing/). +**Qdrant** can be self-hosted locally (single node or a cluster) with a single Docker command. With its SaaS option, it offers a free tier in Qdrant Cloud sufficient for around 1M 768-dimensional vectors, without any limitation on the number of collections it is used for. This allows developers to build multiple demos without limitations. For more pricing information, check [here](/pricing/). **Pinecone** cannot be self-hosted, and signing up for the SaaS solution is the only option. Pinecone has a free tier that supports approximately 300K 1536-dimensional embeddings. For Pinecone’s pricing details, check their pricing page. @@ -162,12 +162,12 @@ For maximum control, security, and cost-efficiency, choose Qdrant. It offers fle Qdrant is one of the leading Pinecone alternatives in the market. For developers who seek control of their vector database, Qdrant offers the highest level of customization, flexible deployment options, and advanced security features. -To get started with Qdrant, explore our [documentation](https://qdrant.tech/documentation/), hop on to our [Discord](https://qdrant.to/discord) channel, sign up for [Qdrant cloud](https://cloud.qdrant.io/) (or [Hybrid cloud](https://qdrant.tech/hybrid-cloud/)), or [get in touch](https://qdrant.tech/contact-us/) with us today. 
+To get started with Qdrant, explore our [documentation](/documentation/), hop on to our [Discord](https://qdrant.to/discord) channel, sign up for [Qdrant cloud](https://cloud.qdrant.io/) (or [Hybrid cloud](/hybrid-cloud/)), or [get in touch](/contact-us/) with us today. References: - [Pinecone Documentation](https://docs.pinecone.io/) -- [Qdrant Documentation](https://qdrant.tech/documentation/) +- [Qdrant Documentation](/documentation/) - If you aren't ready yet, [try out Qdrant locally](/documentation/quick-start/) or sign up for [Qdrant Cloud](https://cloud.qdrant.io/). diff --git a/qdrant-landing/content/blog/cve-2024-3829-response.md b/qdrant-landing/content/blog/cve-2024-3829-response.md index fd66ee12d..1e1ce648f 100644 --- a/qdrant-landing/content/blog/cve-2024-3829-response.md +++ b/qdrant-landing/content/blog/cve-2024-3829-response.md @@ -41,17 +41,17 @@ is not at least v1.9.0. To confirm the version of your Qdrant deployment in the cloud or on your local or cloud system, run an API GET call, as described in the [Qdrant Quickstart -guide](https://qdrant.tech/documentation/cloud/quickstart-cloud/#step-2-test-cluster-access). +guide](/documentation/cloud/quickstart-cloud/#step-2-test-cluster-access). If your Qdrant deployment is local, you do not need an API key. Your next step depends on how you installed Qdrant. For details, read the -[Qdrant Installation](https://qdrant.tech/documentation/guides/installation/) +[Qdrant Installation](/documentation/guides/installation/) guide. #### If you use the Qdrant container or binary Upgrade your deployment. Run the commands in the applicable section of the -[Qdrant Installation](https://qdrant.tech/documentation/guides/installation/) +[Qdrant Installation](/documentation/guides/installation/) guide. The default commands automatically pull the latest version of Qdrant. 
#### If you use the Qdrant helm chart diff --git a/qdrant-landing/content/blog/datatalk-club-podcast-plug.md b/qdrant-landing/content/blog/datatalk-club-podcast-plug.md index ac0297562..9790d13ad 100644 --- a/qdrant-landing/content/blog/datatalk-club-podcast-plug.md +++ b/qdrant-landing/content/blog/datatalk-club-podcast-plug.md @@ -18,7 +18,7 @@ tags: ## Navigating challenges and innovations in search technologies -We participated in a [podcast](#podcast-discussion-recap) on search technologies, specifically with retrieval-augmented generation (RAG) in language models. +We participated in a [podcast](#podcast-discussion-recap) on search technologies, specifically with [retrieval-augmented generation (RAG)](/rag/) in language models. RAG is a cutting-edge approach in natural language processing (NLP). It uses information retrieval and language generation models. We describe how it can enhance what AI can do to understand, retrieve, and generate human-like text. diff --git a/qdrant-landing/content/blog/dspy-vs-langchain.md b/qdrant-landing/content/blog/dspy-vs-langchain.md index e008c9dd6..d57ea33dc 100644 --- a/qdrant-landing/content/blog/dspy-vs-langchain.md +++ b/qdrant-landing/content/blog/dspy-vs-langchain.md @@ -18,7 +18,7 @@ keywords: # Keywords for SEO - chatbots --- -As Large Language Models (LLMs) and vector stores have become steadily more powerful, a new generation of frameworks has appeared which can streamline the development of AI applications by leveraging LLMs and vector search technology. These frameworks simplify the process of building everything from Retrieval Augmented Generation (RAG) applications to complex chatbots with advanced conversational abilities, and even sophisticated reasoning-driven AI applications. +As Large Language Models (LLMs) and vector stores have become steadily more powerful, a new generation of frameworks has appeared which can streamline the development of AI applications by leveraging LLMs and vector search technology. 
These frameworks simplify the process of building everything from [Retrieval Augmented Generation (RAG)](/rag/) applications to complex chatbots with advanced conversational abilities, and even sophisticated reasoning-driven AI applications. The most well-known of these frameworks is possibly [LangChain](https://github.com/langchain-ai/langchain). [Launched in October 2022](https://en.wikipedia.org/wiki/LangChain) as an open-source project by Harrison Chase, the project quickly gained popularity, attracting contributions from hundreds of developers on GitHub. LangChain excels in its broad support for documents, data sources, and APIs. This, along with seamless integration with vector stores like Qdrant and the ability to chain multiple LLMs, has allowed developers to build complex AI applications without reinventing the wheel. diff --git a/qdrant-landing/content/blog/fastembed-fast-lightweight-embedding-generation-nirant-kasliwal-vector-space-talks-004.md b/qdrant-landing/content/blog/fastembed-fast-lightweight-embedding-generation-nirant-kasliwal-vector-space-talks-004.md index 4e9fb5137..b0b7e823f 100644 --- a/qdrant-landing/content/blog/fastembed-fast-lightweight-embedding-generation-nirant-kasliwal-vector-space-talks-004.md +++ b/qdrant-landing/content/blog/fastembed-fast-lightweight-embedding-generation-nirant-kasliwal-vector-space-talks-004.md @@ -20,7 +20,7 @@ tags: > *"When things are actually similar or how we define similarity. They are close to each other and if they are not, they're far from each other. This is what a model or embedding model tries to do.”*\ >-- Nirant Kasliwal -Heard about FastEmbed? It's a game-changer. Nirant shares tricks on how to improve your embedding models. You might want to give it a shot! +Heard about [FastEmbed](/articles/fastembed/)? It's a game-changer. Nirant shares tricks on how to improve your embedding models. You might want to give it a shot! 
Nirant Kasliwal, the creator and maintainer of FastEmbed, has made notable contributions to the Finetuning Cookbook at OpenAI Cookbook. His contributions extend to the field of Natural Language Processing (NLP), with over 5,000 copies of the NLP book sold. diff --git a/qdrant-landing/content/blog/gen-ai-and-vector-search-iveta-lohovska-vector-space-talks.md b/qdrant-landing/content/blog/gen-ai-and-vector-search-iveta-lohovska-vector-space-talks.md index a7fa9d837..4f561a8d1 100644 --- a/qdrant-landing/content/blog/gen-ai-and-vector-search-iveta-lohovska-vector-space-talks.md +++ b/qdrant-landing/content/blog/gen-ai-and-vector-search-iveta-lohovska-vector-space-talks.md @@ -100,7 +100,7 @@ Sabrina Aquino: That's amazing. And can you talk a little bit more about the importance of the transparency of these models and what can happen if we don't know exactly what kind of data they are being trained on? Iveta Lohovska: -I mean, this is especially relevant under our context of [vector databases](https://qdrant.tech/articles/what-is-a-vector-database/) and vector search. Because in the generative AI context of AI, all foundational models have been trained on some foundational data sets that are distributed in different ways. Some are very conversational, some are very technical, some are on, let's say very strict taxonomy like healthcare or chemical structures. We call them modalities, and they have different representations. So, so when it comes to implementing vector search or [vector database](https://qdrant.tech/articles/what-is-a-vector-database/) and knowing the distribution of the foundational data sets, you have better control if you introduce additional layers or additional components to have the control in your hands of where the information is coming from, where it's stored, [what are the embeddings](https://qdrant.tech/articles/what-are-embeddings/). 
So that helps, but it is actually quite important that you know what the foundational data sets are, so that you can predict any kind of weaknesses or vulnerabilities or penetrations that the solution or the use case of the model will face when it lands at the end user. Because we know with generative AI that is unpredictable, we know we can implement guardrails. They're already solutions. +I mean, this is especially relevant under our context of [vector databases](/articles/what-is-a-vector-database/) and vector search. Because in the generative AI context of AI, all foundational models have been trained on some foundational data sets that are distributed in different ways. Some are very conversational, some are very technical, some are on, let's say very strict taxonomy like healthcare or chemical structures. We call them modalities, and they have different representations. So, so when it comes to implementing vector search or [vector database](/articles/what-is-a-vector-database/) and knowing the distribution of the foundational data sets, you have better control if you introduce additional layers or additional components to have the control in your hands of where the information is coming from, where it's stored, [what are the embeddings](/articles/what-are-embeddings/). So that helps, but it is actually quite important that you know what the foundational data sets are, so that you can predict any kind of weaknesses or vulnerabilities or penetrations that the solution or the use case of the model will face when it lands at the end user. Because we know with generative AI that is unpredictable, we know we can implement guardrails. They're already solutions. Iveta Lohovska: We know they're not 100, they don't give you 100% certainty, but they are definitely use cases and work where you need to hit the hundred percent certainty, especially intelligence, cybersecurity and healthcare. @@ -118,10 +118,10 @@ Iveta Lohovska: It has the public. 
You can go and benchmark your carbon footprint as an individual living in one country comparing to an individual living in another. But if you are a policymaker, which is the other interface of this application, who will write the policy recommendation of a country in their own country, or a country they're advising on, you might want to make sure that the scientific citations and the policy recommendations that you're making are correct and they are retrieved from the proper data sources. Because there will be a huge implication when you go public with those numbers or when you actually design a law that is reinforceable with legal terms and law enforcement. Sabrina Aquino: -That's very interesting, Iveta, and I think this is one of the great use cases for [RAG](https://qdrant.tech/articles/what-is-rag-in-ai/), for example. And I think if you can talk a little bit more about how vector search is playing into all of this, how it's helping organizations do this, this. +That's very interesting, Iveta, and I think this is one of the great use cases for [RAG](/articles/what-is-rag-in-ai/), for example. And I think if you can talk a little bit more about how vector search is playing into all of this, how it's helping organizations do this, this. Iveta Lohovska: -Would be amazing in such specific use cases. I think the main differentiator is the traceability component, the first that you have full control on which data it will refer to, because if you deal with open source models, most of them are open, but the data it has been trained on has not been opened or given public so with vector database you introduce a step of control and explainability. Explainability means if you receive a certain answer based on your prompt, you can trace it back to the exact source where the embedding has been stored or the source of where the information is coming from and things. 
So this is a major use case for us for those kind of high stake solution is that you have the explainability and traceability. Explainability. It could be as simple as a semantical similarity to the text, but also the traceability of where it's coming from and the exact link of where it's coming from. So it should be, it shouldn't be referred. You can close and you can cut the line of the model referring to its previous knowledge by introducing a [vector database](https://qdrant.tech/articles/what-is-a-vector-database/), for example. +Would be amazing in such specific use cases. I think the main differentiator is the traceability component, the first that you have full control on which data it will refer to, because if you deal with open source models, most of them are open, but the data it has been trained on has not been opened or given public so with vector database you introduce a step of control and explainability. Explainability means if you receive a certain answer based on your prompt, you can trace it back to the exact source where the embedding has been stored or the source of where the information is coming from and things. So this is a major use case for us for those kind of high stake solution is that you have the explainability and traceability. Explainability. It could be as simple as a semantical similarity to the text, but also the traceability of where it's coming from and the exact link of where it's coming from. So it should be, it shouldn't be referred. You can close and you can cut the line of the model referring to its previous knowledge by introducing a [vector database](/articles/what-is-a-vector-database/), for example. Iveta Lohovska: So there could be many other implications and improvements in terms of speed and just handling huge amounts of data, yet also nice to have that come with this kind of technique, but the prior use case is actually not incentivized around those. 
@@ -139,7 +139,7 @@ Iveta Lohovska: Yeah, so most of the cases, I would say 99% of the cases, is that if you have such a high requirements around security and explainability, security of the data, but those security of the whole use case and environment, and the explainability and trustworthiness of the answer, then it's very natural to have expectations that will be on prem and not in the cloud, because only on prem you have a full control of where your data sits, where your model sits, the full ownership of your IP, and then the full ownership of having less question marks of the implementation and architecture, but mainly the full ownership of the end to end solution. So when it comes to those use cases, RAG on Prem, with the whole infrastructure, with the whole software and platform layers, including models on Prem, not accessible through an API, through a service somewhere where you don't know where the guardrails is, who designed the guardrails, what are the guardrails? And we see those, this a lot with, for example, copilot, a lot of question marks around that. So it's a huge part of my work is just talking of it, just sorting out that. Sabrina Aquino: -Exactly. You don't want to just give away your data to a cloud provider, because there's many implications that that comes with. And I think even your clients, they need certain certifications, then they need to make sure that nobody can access that data, something that you cannot. Exactly. I think ensure if you're just using a cloud provider somewhere, which is, I think something that's very important when you're thinking about these high stakes solutions. But also I think if you're going to maybe outsource some of the infrastructure, you also need to think about something that's similar to a [hybrid cloud solution](https://qdrant.tech/documentation/hybrid-cloud/) where you can keep your data and outsource the kind of management of infrastructure. So that's also a nice use case for that, right? +Exactly. 
You don't want to just give away your data to a cloud provider, because there's many implications that that comes with. And I think even your clients, they need certain certifications, then they need to make sure that nobody can access that data, something that you cannot. Exactly. I think ensure if you're just using a cloud provider somewhere, which is, I think something that's very important when you're thinking about these high stakes solutions. But also I think if you're going to maybe outsource some of the infrastructure, you also need to think about something that's similar to a [hybrid cloud solution](/documentation/hybrid-cloud/) where you can keep your data and outsource the kind of management of infrastructure. So that's also a nice use case for that, right? Iveta Lohovska: I mean, I work for HPE, so hybrid is like one of our biggest sacred words. Yeah, exactly. But actually like if you see the trends and if you see how expensive is to work to run some of those workloads in the cloud, either for training for national model or fine tuning. And no one talks about inference, inference not in ten users, but inference in hundred users with big organizations. This itself is not sustainable. Honestly, when you do the simple Linux, algebra or math of the exponential cost around this. That's why everything is hybrid. And there are use cases that make sense to be fast and speedy and easy to play with, low risk in the cloud to try. diff --git a/qdrant-landing/content/blog/hybrid-cloud-airbyte.md b/qdrant-landing/content/blog/hybrid-cloud-airbyte.md index a3bd6cac7..c7f69bd0e 100644 --- a/qdrant-landing/content/blog/hybrid-cloud-airbyte.md +++ b/qdrant-landing/content/blog/hybrid-cloud-airbyte.md @@ -17,7 +17,7 @@ In their mission to support large-scale AI innovation, [Airbyte](https://airbyte This is a major step forward in offering enterprise customers incredible synergy for maximizing the potential of their AI data. 
Qdrant's new Kubernetes-native design, coupled with Airbyte’s powerful data ingestion pipelines meet the needs of developers who are both prototyping and building production-level apps. Airbyte simplifies the process of data integration by providing a platform that connects to various sources and destinations effortlessly. Moreover, Qdrant Hybrid Cloud leverages advanced indexing and search capabilities to empower users to explore and analyze their data efficiently. -In a major benefit to Generative AI, businesses can leverage Airbyte's data replication capabilities to ensure that their data in Qdrant Hybrid Cloud is always up to date. This empowers all users of Retrieval Augmented Generation (RAG) applications with effective analysis and decision-making potential, all based on the latest information. Furthermore, by combining Airbyte's platform and Qdrant's hybrid cloud infrastructure, users can optimize their data operations while keeping costs under control via flexible pricing models tailored to individual usage requirements. +In a major benefit to Generative AI, businesses can leverage Airbyte's data replication capabilities to ensure that their data in Qdrant Hybrid Cloud is always up to date. This empowers all users of [Retrieval Augmented Generation (RAG) applications](/rag/) with effective analysis and decision-making potential, all based on the latest information. Furthermore, by combining Airbyte's platform and Qdrant's hybrid cloud infrastructure, users can optimize their data operations while keeping costs under control via flexible pricing models tailored to individual usage requirements. > *“The new Qdrant Hybrid Cloud is an exciting addition that offers peace of mind and flexibility, aligning perfectly with the needs of Airbyte Enterprise users who value the same balance. 
Being open-source at our core, both Qdrant and Airbyte prioritize giving users the flexibility to build and test locally—a significant advantage for data engineers and AI practitioners. We're enthusiastic about the Hybrid Cloud launch, as it mirrors our vision of enabling users to confidently transition from local development and local deployments to a managed solution, with both cloud and hybrid cloud deployment options.”* AJ Steers, Staff Engineer for AI, Airbyte diff --git a/qdrant-landing/content/blog/hybrid-cloud-jinaai.md b/qdrant-landing/content/blog/hybrid-cloud-jinaai.md index 54a5fe71a..36b7a16b0 100644 --- a/qdrant-landing/content/blog/hybrid-cloud-jinaai.md +++ b/qdrant-landing/content/blog/hybrid-cloud-jinaai.md @@ -39,7 +39,7 @@ To get you started, we created a comprehensive tutorial that shows how to build #### Tutorial: Hybrid Search for Household Appliance Manuals -Learn how to build an app that retrieves information from PDF user manuals to enhance user experience for companies that sell household appliances. The system will leverage Jina AI embeddings and Qdrant Hybrid Cloud for enhanced generative AI capabilities, while the RAG pipeline will be tied together using the LlamaIndex framework. This example demonstrates how complex tables in PDF documentation can be processed as high quality embeddings with no extra configuration. By introducing Hybrid Search from Qdrant, the RAG functionality is highly accurate. +Learn how to build an app that retrieves information from PDF user manuals to enhance user experience for companies that sell household appliances. The system will leverage Jina AI embeddings and Qdrant Hybrid Cloud for enhanced generative AI capabilities, while the RAG pipeline will be tied together using the LlamaIndex framework. This example demonstrates how complex tables in PDF documentation can be processed as high quality embeddings with no extra configuration. 
By introducing [Hybrid Search](/articles/hybrid-search/) from Qdrant, the RAG functionality is highly accurate. [Try the Tutorial](/documentation/tutorials/hybrid-search-llamaindex-jinaai/) diff --git a/qdrant-landing/content/blog/hybrid-cloud-launch-partners.md b/qdrant-landing/content/blog/hybrid-cloud-launch-partners.md index 5a862e767..8aad3c425 100644 --- a/qdrant-landing/content/blog/hybrid-cloud-launch-partners.md +++ b/qdrant-landing/content/blog/hybrid-cloud-launch-partners.md @@ -14,7 +14,7 @@ tags: - launch partners --- -With the launch of [Qdrant Hybrid Cloud](/hybrid-cloud/) we provide developers the ability to deploy Qdrant as a managed vector database in any desired environment, be it *in the cloud, on premise, or on the edge*. +With the launch of [Qdrant Hybrid Cloud](/hybrid-cloud/) we provide developers the ability to deploy Qdrant as a managed [vector database](/articles/what-is-a-vector-database/) in any desired environment, be it *in the cloud, on premise, or on the edge*. We are excited to have trusted industry players support the launch of Qdrant Hybrid Cloud, allowing developers to unlock best-in-class advantages for building production-ready AI applications: diff --git a/qdrant-landing/content/blog/hybrid-cloud-llamaindex.md b/qdrant-landing/content/blog/hybrid-cloud-llamaindex.md index 757b4aa49..919bf2885 100644 --- a/qdrant-landing/content/blog/hybrid-cloud-llamaindex.md +++ b/qdrant-landing/content/blog/hybrid-cloud-llamaindex.md @@ -29,7 +29,7 @@ Building apps with Qdrant Hybrid Cloud and LlamaIndex comes with several key adv **Open-Source Compatibility:** LlamaIndex and Qdrant pride themselves on maintaining a reliable and mature integration that brings peace of mind to those prototyping and deploying large-scale AI solutions. Extensive documentation, code samples and tutorials support users of all skill levels in leveraging highly advanced features of data ingestion and vector search. 
-**Advanced Search Features:** LlamaIndex comes with built-in Qdrant Hybrid Search functionality, which combines search results from sparse and dense vectors. As a highly sought-after use case, hybrid search is easily accessible from within the LlamaIndex ecosystem. Deploying this particular type vector search on Hybrid Cloud is a matter of a few lines of code. +**Advanced Search Features:** LlamaIndex comes with built-in [Qdrant Hybrid Search](/articles/hybrid-search/) functionality, which combines search results from sparse and dense vectors. As a highly sought-after use case, hybrid search is easily accessible from within the LlamaIndex ecosystem. Deploying this particular type of vector search on Hybrid Cloud is a matter of a few lines of code. #### Start Building With LlamaIndex and Qdrant Hybrid Cloud: Hybrid Search in Complex PDF Documentation Use Cases diff --git a/qdrant-landing/content/blog/hybrid-cloud-red-hat-openshift.md b/qdrant-landing/content/blog/hybrid-cloud-red-hat-openshift.md index 4b72b0d24..0eec3393a 100644 --- a/qdrant-landing/content/blog/hybrid-cloud-red-hat-openshift.md +++ b/qdrant-landing/content/blog/hybrid-cloud-red-hat-openshift.md @@ -13,7 +13,7 @@ tags: - Vector Database --- -We’re excited about our collaboration with Red Hat to bring the Qdrant vector database to [Red Hat OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift) customers! With the release of [Qdrant Hybrid Cloud](/hybrid-cloud/), developers can now deploy and run the Qdrant vector database directly in their Red Hat OpenShift environment. This collaboration enables developers to scale more seamlessly, operate more consistently across hybrid cloud environments, and maintain complete control over their vector data. This is a big step forward in simplifying AI infrastructure and empowering data-driven projects, like retrieval augmented generation (RAG) use cases, advanced search scenarios, or recommendations systems.
+We’re excited about our collaboration with Red Hat to bring the Qdrant vector database to [Red Hat OpenShift](https://www.redhat.com/en/technologies/cloud-computing/openshift) customers! With the release of [Qdrant Hybrid Cloud](/hybrid-cloud/), developers can now deploy and run the Qdrant [vector database](/qdrant-vector-database/) directly in their Red Hat OpenShift environment. This collaboration enables developers to scale more seamlessly, operate more consistently across hybrid cloud environments, and maintain complete control over their vector data. This is a big step forward in simplifying AI infrastructure and empowering data-driven projects, like retrieval augmented generation (RAG) use cases, advanced search scenarios, or recommendations systems. In the rapidly evolving field of Artificial Intelligence and Machine Learning, the demand for being able to manage the modern AI stack within the existing infrastructure becomes increasingly relevant for businesses. As enterprises are launching new AI applications and use cases into production, they require the ability to maintain complete control over their data, since these new apps often work with sensitive internal and customer-centric data that needs to remain within the owned premises. This is why enterprises are increasingly looking for maximum deployment flexibility for their AI workloads. diff --git a/qdrant-landing/content/blog/hybrid-cloud-scaleway.md b/qdrant-landing/content/blog/hybrid-cloud-scaleway.md index bd9c8d56b..b408f6b37 100644 --- a/qdrant-landing/content/blog/hybrid-cloud-scaleway.md +++ b/qdrant-landing/content/blog/hybrid-cloud-scaleway.md @@ -13,7 +13,7 @@ tags: - Vector Database --- -In a move to empower the next wave of AI innovation, Qdrant and [Scaleway](https://www.scaleway.com/en/) collaborate to introduce [Qdrant Hybrid Cloud](/hybrid-cloud/), a fully managed vector database that can be deployed on existing Scaleway environments. 
This collaboration is set to democratize access to advanced AI capabilities, enabling developers to easily deploy and scale vector search technologies within Scaleway's robust and developer-friendly cloud infrastructure. By focusing on the unique needs of startups and the developer community, Qdrant and Scaleway are providing access to intuitive and easy to use tools, making cutting-edge AI more accessible than ever before. +In a move to empower the next wave of AI innovation, Qdrant and [Scaleway](https://www.scaleway.com/en/) collaborate to introduce [Qdrant Hybrid Cloud](/hybrid-cloud/), a fully managed [vector database](/qdrant-vector-database/) that can be deployed on existing Scaleway environments. This collaboration is set to democratize access to advanced AI capabilities, enabling developers to easily deploy and scale vector search technologies within Scaleway's robust and developer-friendly cloud infrastructure. By focusing on the unique needs of startups and the developer community, Qdrant and Scaleway are providing access to intuitive and easy to use tools, making cutting-edge AI more accessible than ever before. Building on this vision, the integration between Scaleway and Qdrant Hybrid Cloud leverages the strengths of both Qdrant, with its leading open-source vector database, and Scaleway, known for its innovative and scalable cloud solutions. This integration means startups and developers can now harness the power of vector search - essential for AI applications like recommendation systems, image recognition, and natural language processing - within their existing environment without the complexity of maintaining such advanced setups. 
@@ -21,7 +21,7 @@ Building on this vision, the integration between Scaleway and Qdrant Hybrid Clou #### Developing a Retrieval Augmented Generation (RAG) Application with Qdrant Hybrid Cloud, Scaleway, and LangChain -Retrieval Augmented Generation (RAG) enhances Large Language Models (LLMs) by integrating vector search to provide precise, context-rich responses. This combination allows LLMs to access and incorporate specific data in real-time, vastly improving the quality of AI-generated content. +[Retrieval Augmented Generation (RAG)](/rag/) enhances Large Language Models (LLMs) by integrating vector search to provide precise, context-rich responses. This combination allows LLMs to access and incorporate specific data in real-time, vastly improving the quality of AI-generated content. RAG applications often rely on sensitive or proprietary internal data, emphasizing the importance of data sovereignty. Running the entire stack within your own environment becomes crucial for maintaining control over this data. Qdrant Hybrid Cloud deployed on Scaleway addresses this need perfectly, offering a secure, scalable platform that respects data sovereignty requirements while leveraging the full potential of RAG for sophisticated AI solutions. diff --git a/qdrant-landing/content/blog/hybrid-cloud-stackit.md b/qdrant-landing/content/blog/hybrid-cloud-stackit.md index b2f45e117..92b578288 100644 --- a/qdrant-landing/content/blog/hybrid-cloud-stackit.md +++ b/qdrant-landing/content/blog/hybrid-cloud-stackit.md @@ -13,7 +13,7 @@ tags: - Vector Database --- -Qdrant and [STACKIT](https://www.stackit.de/en/) are thrilled to announce that developers are now able to deploy a fully managed vector database to their STACKIT environment with the introduction of [Qdrant Hybrid Cloud](/hybrid-cloud/). 
This is a great step forward for the German AI ecosystem as it enables developers and businesses to build cutting edge AI applications that run on German data centers with full control over their data. +Qdrant and [STACKIT](https://www.stackit.de/en/) are thrilled to announce that developers are now able to deploy a fully managed [vector database](/qdrant-vector-database/) to their STACKIT environment with the introduction of [Qdrant Hybrid Cloud](/hybrid-cloud/). This is a great step forward for the German AI ecosystem as it enables developers and businesses to build cutting edge AI applications that run on German data centers with full control over their data. Vector databases are an essential component of the modern AI stack. They enable rapid and accurate retrieval of high-dimensional data, crucial for powering search, recommendation systems, and augmenting machine learning models. In the rising field of GenAI, vector databases power retrieval-augmented-generation (RAG) scenarios as they are able to enhance the output of large language models (LLMs) by injecting relevant contextual information. However, this contextual information is often rooted in confidential internal or customer-related information, which is why enterprises are in pursuit of solutions that allow them to make this data available for their AI applications without compromising data privacy, losing data control, or letting data exit the company's secure environment. diff --git a/qdrant-landing/content/blog/hybrid-cloud.md b/qdrant-landing/content/blog/hybrid-cloud.md index 2f76c68b9..95379224b 100644 --- a/qdrant-landing/content/blog/hybrid-cloud.md +++ b/qdrant-landing/content/blog/hybrid-cloud.md @@ -13,7 +13,7 @@ tags: - Hybrid Cloud --- -We are excited to announce the official launch of [Qdrant Hybrid Cloud](/hybrid-cloud/) today, a significant leap forward in the field of vector search and enterprise AI. 
Rooted in our open-source origin, we are committed to offering our users and customers unparalleled control and sovereignty over their data and vector search workloads. Qdrant Hybrid Cloud stands as **the industry's first managed vector database that can be deployed in any environment** - be it cloud, on-premise, or the edge. +We are excited to announce the official launch of [Qdrant Hybrid Cloud](/hybrid-cloud/) today, a significant leap forward in the field of vector search and enterprise AI. Rooted in our open-source origin, we are committed to offering our users and customers unparalleled control and sovereignty over their data and vector search workloads. Qdrant Hybrid Cloud stands as **the industry's first managed [vector database](/qdrant-vector-database/) that can be deployed in any environment** - be it cloud, on-premise, or the edge.

@@ -37,7 +37,7 @@ Let’s explore these aspects in more detail: Qdrant Hybrid Cloud, powered by our seamless Kubernetes-native architecture, is the first managed vector database engineered for unparalleled deployment flexibility. This means that regardless of where you run your AI applications, you can now enjoy the benefits of a fully managed Qdrant vector database, simplifying operations across any cloud, on-premise, or edge locations. -For this launch of Qdrant Hybrid Cloud, we are proud to collaborate with key cloud providers, including [Oracle Cloud Infrastructure (OCI)](https://blogs.oracle.com/cloud-infrastructure/post/qdrant-hybrid-cloud-now-available-oci-customers), [Red Hat OpenShift](/blog/hybrid-cloud-red-hat-openshift/), [Vultr](/blog/hybrid-cloud-vultr/), [DigitalOcean](/blog/hybrid-cloud-digitalocean/), [OVHcloud](/blog/hybrid-cloud-ovhcloud/), [Scaleway](/blog/hybrid-cloud-scaleway/), [Civo](/documentation/hybrid-cloud/platform-deployment-options/#civo), and [STACKIT](/blog/hybrid-cloud-stackit/). These partnerships underscore our commitment to delivering a versatile and robust vector database solution that meets the complex deployment requirements of today's AI applications. +For this launch of Qdrant Hybrid Cloud, we are proud to collaborate with key cloud providers, including [Oracle Cloud Infrastructure (OCI)](https://blogs.oracle.com/cloud-infrastructure/post/qdrant-hybrid-cloud-now-available-oci-customers), [Red Hat OpenShift](/blog/hybrid-cloud-red-hat-openshift/), [Vultr](/blog/hybrid-cloud-vultr/), [DigitalOcean](/blog/hybrid-cloud-digitalocean/), [OVHcloud](/blog/hybrid-cloud-ovhcloud/), [Scaleway](/blog/hybrid-cloud-scaleway/), [Civo](/documentation/hybrid-cloud/platform-deployment-options/#civo), and [STACKIT](/blog/hybrid-cloud-stackit/). 
These partnerships underscore our commitment to delivering a versatile and robust [vector database solution](/qdrant-vector-database/) that meets the complex deployment requirements of today's AI applications. In addition to our partnerships with key cloud providers, we are also launching in collaboration with renowned AI development tools and framework leaders, including [LlamaIndex](/blog/hybrid-cloud-llamaindex/), [LangChain](/blog/hybrid-cloud-langchain/), [Airbyte](/blog/hybrid-cloud-airbyte/), [JinaAI](/blog/hybrid-cloud-jinaai/), [Haystack by deepset](/blog/hybrid-cloud-haystack/), and [Aleph Alpha](/blog/hybrid-cloud-aleph-alpha/). These launch partners are instrumental in ensuring our users can seamlessly integrate with essential technologies for their AI applications, enriching our offering and reinforcing our commitment to versatile and comprehensive deployment environments. diff --git a/qdrant-landing/content/blog/insight-generation-platform-for-lifescience-corporation-hooman-sedghamiz-vector-space-talks-014.md b/qdrant-landing/content/blog/insight-generation-platform-for-lifescience-corporation-hooman-sedghamiz-vector-space-talks-014.md index 841e2834d..f9d468e22 100644 --- a/qdrant-landing/content/blog/insight-generation-platform-for-lifescience-corporation-hooman-sedghamiz-vector-space-talks-014.md +++ b/qdrant-landing/content/blog/insight-generation-platform-for-lifescience-corporation-hooman-sedghamiz-vector-space-talks-014.md @@ -103,7 +103,7 @@ Hooman Sedghamiz: At the same time, drug discovery is making really big strides when it comes to identifying new compounds. You can essentially describe these compounds using formats like smiles, which could be represented as real text. And these large language models can be trained on them and they can predict the sequences. At the same time, you have this clinical trial outcome prediction, which is huge for pharmaceutical companies.
If you could predict what will be the outcome of a trial, it would be a huge time and resource saving for a lot of companies. And of course, a lot of us already see in the market a lot of medical virtual assistants using large language models that can answer medical inquiries and give consultations around them. And there is really, I believe the biggest potential here is around real world data, like most of us nowadays, have some sort of sensor or watch that's measuring our health maybe at a minute by minute level, or it's measuring our heart rate. You go to the hospital, you have all your medical records recorded there, and these large language models have their capacity to process this complex data, and you will be able to drive better insights for individualized insights for patients. Hooman Sedghamiz: -And our company is also in crop science, as I mentioned, and crop yield prediction. If you could help farmers improve their crop yield, it means that they can produce better products faster with higher quality. So maybe I could start with maybe a history in 2023, what happened? How companies like ours were looking at large language models and opportunities. They bring, I think in 2023, everyone was excited to bring these efficiency games, right? Everyone wanted to use them for creating content, drafting emails, all these really low hanging fruit use cases. That was around. And one of the earlier really nice architectures that came up that I really like was from a 16 z enterprise that was, I think, back in really, really early 2023. LangChain was new, we had land chain and we had all this. Of course, Qdrant been there for a long time, but it was the first time that you could see vector store products could be integrated into applications. +And our company is also in crop science, as I mentioned, and crop yield prediction. If you could help farmers improve their crop yield, it means that they can produce better products faster with higher quality. 
So maybe I could start with maybe a history in 2023, what happened? How companies like ours were looking at large language models and opportunities. They bring, I think in 2023, everyone was excited to bring these efficiency games, right? Everyone wanted to use them for creating content, drafting emails, all these really low hanging fruit use cases. That was around. And one of the earlier really nice architectures that came up that I really like was from a 16 z enterprise that was, I think, back in really, really early 2023. [LangChain](/articles/langchain-integration/) was new, we had land chain and we had all this. Of course, Qdrant been there for a long time, but it was the first time that you could see vector store products could be integrated into applications. Hooman Sedghamiz: Really at large scale. There are different components. It's quite complex architecture. So on the right side you see how you can host large language models. On the top you see how you can augment them using external data. Of course, we had these plugins, right? So you can connect these large language models with Google search APIs, all those sort of things, and some validation that are in the middle that you could use to validate the responses fast forward. Maybe I can kind of spend, let me check out the time. Maybe I can spend a few minutes about the components of LLM APIs and hosting because that I think has a lot of potential in terms of applications that need to be really scalable. 
diff --git a/qdrant-landing/content/blog/open-source-vector-search-engine-and-vector-database.md b/qdrant-landing/content/blog/open-source-vector-search-engine-and-vector-database.md index 0d3c114f9..6d244ef7e 100644 --- a/qdrant-landing/content/blog/open-source-vector-search-engine-and-vector-database.md +++ b/qdrant-landing/content/blog/open-source-vector-search-engine-and-vector-database.md @@ -24,7 +24,7 @@ tags: Discussing core differences between search engines and databases, Andrey underlined the importance of application needs and scalability in database selection for vector search tasks. -Andrey Vasnetsov, CTO at Qdrant is an enthusiast of [Open Source](https://qdrant.tech/), machine learning, and vector search. He works on Open Source projects related to [Vector Similarity Search](https://qdrant.tech/articles/vector-similarity-beyond-search/) and Similarity Learning. He prefers practical over theoretical, working demo over arXiv paper. +Andrey Vasnetsov, CTO at Qdrant is an enthusiast of [Open Source](https://qdrant.tech/), machine learning, and vector search. He works on Open Source projects related to [Vector Similarity Search](/articles/vector-similarity-beyond-search/) and Similarity Learning. He prefers practical over theoretical, working demo over arXiv paper. ***You can watch this episode on [YouTube](https://www.youtube.com/watch?v=bU38Ovdh3NY).*** @@ -34,7 +34,7 @@ Andrey Vasnetsov, CTO at Qdrant is an enthusiast of [Open Source](https://qdrant ## **Top Takeaways:** -Dive into the intricacies of [vector databases](https://qdrant.tech/articles/what-is-a-vector-database/) with Andrey as he unpacks Qdrant's approach to combining filtering and vector search, revealing how in-place filtering during graph traversal optimizes precision without sacrificing search exactness, even when scaling to billions of vectors. 
+Dive into the intricacies of [vector databases](/articles/what-is-a-vector-database/) with Andrey as he unpacks Qdrant's approach to combining filtering and vector search, revealing how in-place filtering during graph traversal optimizes precision without sacrificing search exactness, even when scaling to billions of vectors. 5 key insights you’ll learn: @@ -48,7 +48,7 @@ Dive into the intricacies of [vector databases](https://qdrant.tech/articles/wha - 🔗 **Connected Graph Challenges:** Learn about navigating the difficulties of maintaining a connected graph while filtering during search operations. -> Fun Fact: [The Qdrant system](https://qdrant.tech/) is capable of in-place filtering during graph traversal, which is a novel approach compared to traditional post-filtering methods, ensuring the correct quantity of results that meet the filtering conditions. +> Fun Fact: [The Qdrant system](/) is capable of in-place filtering during graph traversal, which is a novel approach compared to traditional post-filtering methods, ensuring the correct quantity of results that meet the filtering conditions. > ## Timestamps: diff --git a/qdrant-landing/content/blog/qdrant-cpu-intel-benchmark.md b/qdrant-landing/content/blog/qdrant-cpu-intel-benchmark.md index 01fc6e1e7..74ef678e0 100644 --- a/qdrant-landing/content/blog/qdrant-cpu-intel-benchmark.md +++ b/qdrant-landing/content/blog/qdrant-cpu-intel-benchmark.md @@ -21,7 +21,7 @@ tags: > *Intel’s 5th gen Xeon processor is made for enterprise-scale operations in vector space.* -Vector search is surging in popularity with institutional customers, and Intel is ready to support the emerging industry. Their latest generation CPU performed exceptionally with Qdrant, a leading vector database used for enterprise AI applications. +Vector search is surging in popularity with institutional customers, and Intel is ready to support the emerging industry. 
Their latest generation CPU performed exceptionally with Qdrant, a leading [vector database](/qdrant-vector-database/) used for enterprise AI applications. Intel just released the latest Xeon processor (**codename: Emerald Rapids**) for data centers, a market which is expected to grow to $45 billion. Emerald Rapids offers higher-performance computing and significant energy efficiency over previous generations. Compared to the 4th generation Sapphire Rapids, Emerald boosts AI inference performance by up to 42% and makes vector search 38% faster. diff --git a/qdrant-landing/content/blog/qdrant-x-dust-how-vector-search-helps-make-work-work-better-stan-polu-vector-space-talk-010.md b/qdrant-landing/content/blog/qdrant-x-dust-how-vector-search-helps-make-work-work-better-stan-polu-vector-space-talk-010.md index 1f6feaa3c..e26b4e55e 100644 --- a/qdrant-landing/content/blog/qdrant-x-dust-how-vector-search-helps-make-work-work-better-stan-polu-vector-space-talk-010.md +++ b/qdrant-landing/content/blog/qdrant-x-dust-how-vector-search-helps-make-work-work-better-stan-polu-vector-space-talk-010.md @@ -237,13 +237,13 @@ Okay, the next question that I had is you talked about how benchmarking with the Stanislas Polu: Yeah -I think the benchmarking was really about quality of models, answers in the context of [retrieval augmented generation](https://qdrant.tech/articles/what-is-rag-in-ai/). +I think the benchmarking was really about quality of models, answers in the context of [retrieval augmented generation](/articles/what-is-rag-in-ai/). So it's not as much as performance, but obviously performance matters, and that's why we love using Qdrants. But I think the main idea of. Stanislas Polu: What I mentioned is that it's interesting because today the retrieval is noisy, because the embedders are not perfect, which is an interesting point. -Sorry, I'm double clicking, but I'll come back. The embedded are really not perfect. Are really not perfect. So that's interesting. 
When Qdrant release kind of optimization for [storage of vectors](https://qdrant.tech/documentation/concepts/storage/), they come with obviously warnings that you may have a loss. +Sorry, I'm double clicking, but I'll come back. The embedded are really not perfect. Are really not perfect. So that's interesting. When Qdrant release kind of optimization for [storage of vectors](/documentation/concepts/storage/), they come with obviously warnings that you may have a loss. Of precision because of the compression, et cetera, et cetera. And that's funny, like in all kind of retrieval and mental generation world, it really doesn't matter. We take all the performance we can because the loss of precision coming from compression of those vectors at the vector DB level are completely negligible compared to. The holon fuckness of the embedders in. diff --git a/qdrant-landing/content/blog/semantic-cache-ai-data-retrieval.md b/qdrant-landing/content/blog/semantic-cache-ai-data-retrieval.md index bf2497f47..3a00c43d7 100644 --- a/qdrant-landing/content/blog/semantic-cache-ai-data-retrieval.md +++ b/qdrant-landing/content/blog/semantic-cache-ai-data-retrieval.md @@ -77,7 +77,7 @@ The first part of this video explains how caching works. In the second part, you ## Embrace the Future of AI Data Retrieval -[Qdrant](https://github.com/qdrant/qdrant) offers the most flexible way to implement vector search for your RAG and AI applications. You can test out semantic cache on your free Qdrant Cloud instance today! Simply sign up for or log into your [Qdrant Cloud account](https://cloud.qdrant.io/login) and follow our [documentation](/documentation/cloud/). +[Qdrant](https://github.com/qdrant/qdrant) offers the most flexible way to implement vector search for your RAG and AI applications. You can test out semantic cache on your free [Qdrant Cloud](/cloud/) instance today! 
Simply sign up for or log into your [Qdrant Cloud account](https://cloud.qdrant.io/login) and follow our [documentation](/documentation/cloud/). You can also deploy Qdrant locally and manage via our UI. To do this, check our [Hybrid Cloud](/blog/hybrid-cloud/)! diff --git a/qdrant-landing/content/blog/series-A-funding-round.md b/qdrant-landing/content/blog/series-A-funding-round.md index 342688cc2..25c95d083 100644 --- a/qdrant-landing/content/blog/series-A-funding-round.md +++ b/qdrant-landing/content/blog/series-A-funding-round.md @@ -29,7 +29,7 @@ The rise of generative AI in the last few years has shone a spotlight on vector To meet the needs of the next generation of AI applications, Qdrant has always been built with four keys in mind: efficiency, scalability, performance, and flexibility. Our goal is to give our users unmatched speed and reliability, even when they are building massive-scale AI applications requiring the handling of billions of vectors. We did so by building Qdrant on Rust for performance, memory safety, and scale. Additionally, [our custom HNSW search algorithm](/articles/filtrable-hnsw/) and unique [filtering](/documentation/concepts/filtering/) capabilities consistently lead to [highest RPS](/benchmarks/), minimal latency, and high control with accuracy when running large-scale, high-dimensional operations. -Beyond performance, we provide our users with the most flexibility in cost savings and deployment options. A combination of cutting-edge efficiency features, like [built-in compression options](/documentation/guides/quantization/), [multitenancy](/documentation/guides/multiple-partitions/) and the ability to [offload data to disk](/documentation/concepts/storage/), dramatically reduce memory consumption. Committed to privacy and security, crucial for modern AI applications, Qdrant now also offers on-premise and hybrid SaaS solutions, meeting diverse enterprise needs in a data-sensitive world. 
This approach, coupled with our open-source foundation, builds trust and reliability with engineers and developers, making Qdrant a game-changer in the vector database domain. +Beyond performance, we provide our users with the most flexibility in cost savings and deployment options. A combination of cutting-edge efficiency features, like [built-in compression options](/documentation/guides/quantization/), [multitenancy](/documentation/guides/multiple-partitions/) and the ability to [offload data to disk](/documentation/concepts/storage/), dramatically reduce [memory consumption](/articles/memory-consumption/). Committed to privacy and security, crucial for modern AI applications, Qdrant now also offers on-premise and hybrid SaaS solutions, meeting diverse enterprise needs in a data-sensitive world. This approach, coupled with our open-source foundation, builds trust and reliability with engineers and developers, making Qdrant a game-changer in the vector database domain. ## What's next? diff --git a/qdrant-landing/content/blog/soc2-type2-report.md b/qdrant-landing/content/blog/soc2-type2-report.md index b638ad304..504a5a9ec 100644 --- a/qdrant-landing/content/blog/soc2-type2-report.md +++ b/qdrant-landing/content/blog/soc2-type2-report.md @@ -53,4 +53,4 @@ Recognizing the critical importance of data security and the trust our clients p Qdrant is a vector database designed to handle large-scale, high-dimensional data efficiently. It allows for fast and accurate similarity searches in complex datasets. Qdrant strives to achieve seamless and scalable vector search capabilities for various applications. -For more information about Qdrant and our security practices, please visit our [website](http://qdrant.tech) or [reach out to our team directly](https://qdrant.tech/contact-us/). +For more information about Qdrant and our security practices, please visit our [website](http://qdrant.tech) or [reach out to our team directly](/contact-us/). 
diff --git a/qdrant-landing/content/blog/storing-multiple-vectors-per-object-in-qdrant.md b/qdrant-landing/content/blog/storing-multiple-vectors-per-object-in-qdrant.md index 54f64adef..1d1f42569 100644 --- a/qdrant-landing/content/blog/storing-multiple-vectors-per-object-in-qdrant.md +++ b/qdrant-landing/content/blog/storing-multiple-vectors-per-object-in-qdrant.md @@ -19,7 +19,7 @@ tags: # How to Optimize Vector Storage by Storing Multiple Vectors Per Object -In a real case scenario, a single object might be described in several different ways. If you run an e-commerce business, then your items will typically have a name, longer textual description and also a bunch of photos. While cooking, you may care about the list of ingredients, and description of the taste but also the recipe and the way your meal is going to look. Up till now, if you wanted to enable [semantic search](https://qdrant.tech/documentation/tutorials/search-beginners/) with multiple vectors per object, Qdrant would require you to create separate collections for each vector type, even though they could share some other attributes in a payload. However, since Qdrant 0.10 you are able to store all those vectors together in the same collection and share a single copy of the payload! +In a real case scenario, a single object might be described in several different ways. If you run an e-commerce business, then your items will typically have a name, longer textual description and also a bunch of photos. While cooking, you may care about the list of ingredients, and description of the taste but also the recipe and the way your meal is going to look. Up till now, if you wanted to enable [semantic search](/documentation/tutorials/search-beginners/) with multiple vectors per object, Qdrant would require you to create separate collections for each vector type, even though they could share some other attributes in a payload. 
However, since Qdrant 0.10 you are able to store all those vectors together in the same collection and share a single copy of the payload! Running the new version of Qdrant is as simple as it always was. By running the following command, you are able to set up a single instance that will also expose the HTTP API: @@ -176,7 +176,7 @@ The created vectors might be easily put into Qdrant. For the sake of simplicity, ## Searching with multiple vectors -If you decided to describe each object with several [neural embeddings](https://qdrant.tech/articles/neural-search-tutorial/), then at each search operation you need to provide the vector name along with the [vector embedding](https://qdrant.tech/articles/what-are-embeddings/), so the engine knows which one to use. The interface of the search operation is pretty straightforward and requires an instance of NamedVector. +If you decided to describe each object with several neural embeddings, then at each search operation you need to provide the vector name along with the [vector embedding](/articles/what-are-embeddings/), so the engine knows which one to use. The interface of the search operation is pretty straightforward and requires an instance of NamedVector. 
```python from qdrant_client.http.models import NamedVector diff --git a/qdrant-landing/content/blog/superpower-your-semantic-search-using-vector-database-nicolas-mauti-vector-space-talk-007.md b/qdrant-landing/content/blog/superpower-your-semantic-search-using-vector-database-nicolas-mauti-vector-space-talk-007.md index 03a71d594..ca4a00c84 100644 --- a/qdrant-landing/content/blog/superpower-your-semantic-search-using-vector-database-nicolas-mauti-vector-space-talk-007.md +++ b/qdrant-landing/content/blog/superpower-your-semantic-search-using-vector-database-nicolas-mauti-vector-space-talk-007.md @@ -33,7 +33,7 @@ Nicolas Mauti, a computer science graduate from INSA Lyon Engineering School, tr ## **Top Takeaways:** -Dive into the intricacies of [semantic search](https://qdrant.tech/documentation/tutorials/search-beginners/) enhancement with Nicolas Mauti, MLOps Engineer at Malt. Discover how Nicolas and his team at Malt revolutionize the way freelancers connect with projects. +Dive into the intricacies of [semantic search](/documentation/tutorials/search-beginners/) enhancement with Nicolas Mauti, MLOps Engineer at Malt. Discover how Nicolas and his team at Malt revolutionize the way freelancers connect with projects. In this episode, Nicolas delves into enhancing semantics search at Malt by implementing a retriever-ranker architecture with multilingual transformer-based models, improving freelancer-project matching through a transition to [Qdrant](https://qdrant.tech/) that reduced latency from 10 seconds to 1 second and bolstering the platform's overall performance and scaling capabilities. @@ -133,13 +133,13 @@ Nicolas Mauti: So I think I already talked about this ponds, but yeah, we needed performances. The second ones was about inn quality. As I said before, we cannot do a KnN search, brute force search each time. And so we have to find a way to approximate but to be close enough and to be good enough on these points. 
And so otherwise we won't be leveraged the performance of our model. And the last one, and I didn't talk a lot about this before, is filtering. Filtering is a big problem for us because we have a lot of filters, of art filters, as I said before. And so if we think about my architecture, we can say, okay, so filtering is not a problem. Nicolas Mauti: -You can just have a three step process and do filtering, semantic search and then ranking, or do semantic search, filtering and then ranking. But in both cases, you will have some troubles if you do that. The first one is if you want to apply prefiltering. So filtering, semantic search, ranking. If you do that, in fact, you will have, so we'll have this kind of architecture. And if you do that, you will have, in fact, to flag each freelancers before asking the [vector database](https://qdrant.tech/articles/what-is-a-vector-database/) and performing a search, you will have to flag each freelancer whether there could be selected or not. And so with that, you will basically create a binary mask on your freelancers pool. And as the number of freelancers you have will grow, your binary namask will also grow. +You can just have a three step process and do filtering, semantic search and then ranking, or do semantic search, filtering and then ranking. But in both cases, you will have some troubles if you do that. The first one is if you want to apply prefiltering. So filtering, semantic search, ranking. If you do that, in fact, you will have, so we'll have this kind of architecture. And if you do that, you will have, in fact, to flag each freelancers before asking the [vector database](/articles/what-is-a-vector-database/) and performing a search, you will have to flag each freelancer whether there could be selected or not. And so with that, you will basically create a binary mask on your freelancers pool. And as the number of freelancers you have will grow, your binary namask will also grow. 
Nicolas Mauti: -And so it's not very scalable. And regarding the performance, it will be degraded as your freelancer base grow. And also you will have another problem. A lot of [vector database](https://qdrant.tech/articles/what-is-a-vector-database/) and Qdrants is one of them using hash NSW algorithm to do your inn search. And this kind of algorithm is based on graph. And so if you do that, you will deactivate some nodes in your graph, and so your graph will become disconnected and you won't be able to navigate in your graph. And so your quality of your matching will degrade. So it's definitely not a good idea to apply prefiltering. +And so it's not very scalable. And regarding the performance, it will be degraded as your freelancer base grow. And also you will have another problem. A lot of [vector database](/articles/what-is-a-vector-database/) and Qdrants is one of them using hash NSW algorithm to do your inn search. And this kind of algorithm is based on graph. And so if you do that, you will deactivate some nodes in your graph, and so your graph will become disconnected and you won't be able to navigate in your graph. And so your quality of your matching will degrade. So it's definitely not a good idea to apply prefiltering. Nicolas Mauti: -So, no, if we go to post filtering here, I think the issue is more clear. You will have this kind of architecture. And so, in fact, if you do that, you will have to retrieve a lot of freelancer for your [vector database](https://qdrant.tech/articles/what-is-a-vector-database/). If you apply some very aggressive filtering and you exclude a lot of freelancer with your filtering, you will have to ask for a lot of freelancer in your vector database and so your performances will be impacted. So filtering is a problem. So we cannot do pre filtering or post filtering. So we had to find a database that do filtering and matching and semantic matching and search at the same time. 
And so Qdrant is one of them, you have other one in the market. +So, no, if we go to post filtering here, I think the issue is more clear. You will have this kind of architecture. And so, in fact, if you do that, you will have to retrieve a lot of freelancer for your [vector database](/articles/what-is-a-vector-database/). If you apply some very aggressive filtering and you exclude a lot of freelancer with your filtering, you will have to ask for a lot of freelancer in your vector database and so your performances will be impacted. So filtering is a problem. So we cannot do pre filtering or post filtering. So we had to find a database that do filtering and matching and semantic matching and search at the same time. And so Qdrant is one of them, you have other one in the market. Nicolas Mauti: But in our case, we had one filter that caused us a lot of troubles. And this filter is the geospatial filtering and a few of databases under this filtering, and I think Qdrant is one of them that supports it. But there is not a lot of databases that support them. And we absolutely needed that because we have a local approach and we want to be sure that we recommend freelancer next to the project. And so now that I said all of that, we had three candidates that we tested and we benchmarked them. We had elasticsearch PG vector, that is an extension of PostgreSQL and Qdrants. And on this slide you can see Pycon for example, and Pycon was excluded because of the lack of geospatial filtering. And so we benchmark them regarding the qps. @@ -175,7 +175,7 @@ Demetrios: All right, first off, I want to give a shout out in case there are freelancers that are watching this or looking at this, now is a great time to just join Malt, I think. It seems like it's getting better every day. So I know there's questions that will come through and trickle in, but we've already got one from Luis. What's happening, Luis? 
He's asking what library or service were you using for Ann before considering Qdrant, in fact. Nicolas Mauti: -So before that we didn't add any library or service or we were not doing any ann search or [semantic searc](https://qdrant.tech/documentation/tutorials/search-beginners/) in the way we are doing it right now. We just had one model when we passed the freelancers and the project at the same time in the model, and we got relevancy scoring at the end. And so that's why it was also so slow because you had to constrict each pair and send each pair to your model. And so right now we don't have to do that and so it's much better. +So before that we didn't add any library or service or we were not doing any ann search or [semantic searc](/documentation/tutorials/search-beginners/) in the way we are doing it right now. We just had one model when we passed the freelancers and the project at the same time in the model, and we got relevancy scoring at the end. And so that's why it was also so slow because you had to constrict each pair and send each pair to your model. And so right now we don't have to do that and so it's much better. Demetrios: Yeah, that makes sense. One question from my side is it took you, I think you said in October you started with the A B test and then in December you rolled it out. What was that last slide that you had? @@ -196,7 +196,7 @@ Nicolas Mauti: Thanks. Demetrios: -All right, everyone. By the way, in case you want to join us and talk about what you're working on and how you're using Qdrant or what you're doing in the semantic space or [semantic search](https://qdrant.tech/documentation/tutorials/search-beginners/) or vector space, all that fun stuff, hit us up. We would love to have you on here. One last question for you, Nicola. Something came through. What indexing method do you use? Is it good for using OpenAI embeddings? +All right, everyone. 
By the way, in case you want to join us and talk about what you're working on and how you're using Qdrant or what you're doing in the semantic space or [semantic search](/documentation/tutorials/search-beginners/) or vector space, all that fun stuff, hit us up. We would love to have you on here. One last question for you, Nicola. Something came through. What indexing method do you use? Is it good for using OpenAI embeddings? Nicolas Mauti: So in our case, we have our own model to build the embeddings. diff --git a/qdrant-landing/content/blog/teaching-vector-databases-at-scale-alfredo-deza-vector-space-talks-019-2.md b/qdrant-landing/content/blog/teaching-vector-databases-at-scale-alfredo-deza-vector-space-talks-019-2.md index e77619ef2..20ea3c2b5 100644 --- a/qdrant-landing/content/blog/teaching-vector-databases-at-scale-alfredo-deza-vector-space-talks-019-2.md +++ b/qdrant-landing/content/blog/teaching-vector-databases-at-scale-alfredo-deza-vector-space-talks-019-2.md @@ -40,7 +40,7 @@ How does a former athlete such as Alfredo Deza end up in this AI and Machine Lea Here are some things you’ll discover from this episode: -1. **The Intersection of Teaching and Tech:** Alfredo discusses on how to effectively bridge the gap between technical concepts and student understanding, especially when dealing with complex topics like vector databases. +1. **The Intersection of Teaching and Tech:** Alfredo discusses on how to effectively bridge the gap between technical concepts and student understanding, especially when dealing with complex topics like [vector databases](/qdrant-vector-database/). 2. **Simplified Learning:** Dive into Alfredo's advocacy for simplicity in teaching methods, mirroring his approach with Qdrant and the potential for a Rust in-memory implementation aimed at enhancing learning experiences. 3. 
**Beyond the Titanic Dataset:** Discover why Alfredo prefers to teach with a wine dataset he developed himself, underscoring the importance of using engaging subject matter in education. 4. **AI Learning Acceleration:** Alfredo discusses the struggle universities face to keep pace with AI advancements and how online platforms can offer a more up-to-date curriculum. diff --git "a/qdrant-landing/content/blog/the-bitter-lesson-of-retrieval-in-generative-language-model-workflows-mikko-lehtim\303\244ki-vector-space-talks.md" "b/qdrant-landing/content/blog/the-bitter-lesson-of-retrieval-in-generative-language-model-workflows-mikko-lehtim\303\244ki-vector-space-talks.md" index 7940588ff..17a5570a2 100644 --- "a/qdrant-landing/content/blog/the-bitter-lesson-of-retrieval-in-generative-language-model-workflows-mikko-lehtim\303\244ki-vector-space-talks.md" +++ "b/qdrant-landing/content/blog/the-bitter-lesson-of-retrieval-in-generative-language-model-workflows-mikko-lehtim\303\244ki-vector-space-talks.md" @@ -37,7 +37,7 @@ Recently, Mikko has contributed software to Llama-index and Guardrails-AI, two l Aren’t you curious about what the bitter lesson is and how it plays out in generative language model workflows? -Check it out as Mikko delves into the intricate world of retrieval-augmented generation, discussing how Yokot AI manages vast diverse data inputs and how focusing on re-ranking can massively improve LLM workflows and output quality. +Check it out as Mikko delves into the intricate world of [retrieval-augmented generation](/rag/), discussing how Yokot AI manages vast diverse data inputs and how focusing on re-ranking can massively improve LLM workflows and output quality. 5 key takeaways you’ll get from this episode: @@ -113,7 +113,7 @@ Mikko Lehtimäki: You don't really even know when. So we need to give them access to more recent data, and we need a method for doing that. And the other thing is problems like hallucinations. 
We found that if you just ask the model a question that is in the training data, you won't get always reliable results. But if you can crown the model's answers with data, you will get more factual results. So this is what can be done with the rack as well. And the final thing is that we just cannot give a book, for example, in one go the language model, because even if theoretically it could read the input in one go, the result quality that you get from the language model is going to suffer if you feed it too much data at once. So this is why we have designed retrieval augmented generation architectures. Mikko Lehtimäki: -And if we look at this system on the bottom, you see the typical data ingestion. So the user gives a document, we slice it to small chunks, and we compute a numerical representation with vector embeddings and store those in a vector database. Why a vector database? Because it's really efficient to retrieve vectors from it when we get users query. So that is also embedded and it's used to look up relevant sources from the data that was previously uploaded efficiently directly on the database, and then we can fit the resulting text, the language model, to synthesize an answer. And this is how the RHe works in very basic form. Now you can see that if you have only a single document that you work with, it's nice if the problem set that you want to solve is very constrained, but the more data you can bring to your system, the more workflows you can build on that data. So if you have, for example, access to a complete book or many books, it's easy to see you can also generate higher quality content from that data. So this architecture really must be such that it can also make use of those larger amounts of data. +And if we look at this system on the bottom, you see the typical data ingestion. 
So the user gives a document, we slice it to small chunks, and we compute a numerical representation with [vector embeddings](/articles/what-are-embeddings/) and store those in a [vector database](/articles/what-is-a-vector-database/). Why a vector database? Because it's really efficient to retrieve vectors from it when we get users query. So that is also embedded and it's used to look up relevant sources from the data that was previously uploaded efficiently directly on the database, and then we can fit the resulting text, the language model, to synthesize an answer. And this is how the RHe works in very basic form. Now you can see that if you have only a single document that you work with, it's nice if the problem set that you want to solve is very constrained, but the more data you can bring to your system, the more workflows you can build on that data. So if you have, for example, access to a complete book or many books, it's easy to see you can also generate higher quality content from that data. So this architecture really must be such that it can also make use of those larger amounts of data. Mikko Lehtimäki: Anyway, once you implement this for the first time, it really feels like magic. It tends to work quite nicely, but soon you'll notice that it's not suitable for all kinds of tasks. Like you will see sometimes that, for example, the lists. If you retrieve lists, they may be broken. If you ask questions that are document comparisons, you may not get complete results. If you run summarization tasks without thinking about it anymore, then that will most likely lead to super results. So we'll have to extend the architecture quite a bit to take into account all the use cases that we want to enable with bigger amounts of data that the users upload. And this is what it may look like once you've gone through a few design iterations. 
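Reviewer note on the passage above: the ingestion and retrieval flow Mikko describes (slice a document into chunks, embed them, store the vectors, embed the query, look up the closest chunks, pass them to the language model) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the speaker's system and not Qdrant's API: `embed` is a toy bag-of-words counter standing in for a real embedding model, and `InMemoryVectorStore` is a hypothetical brute-force stand-in for a vector database.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=8):
    """Slice a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model: a sparse term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database (brute-force search)."""

    def __init__(self):
        self.items = []  # list of (vector, chunk) pairs

    def add(self, chunk):
        self.items.append((embed(chunk), chunk))

    def search(self, query, limit=2):
        query_vector = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vector, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:limit]]


def build_prompt(question, sources):
    """Assemble the text that would be handed to the language model."""
    context = "\n".join(f"- {source}" for source in sources)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"


document = (
    "Qdrant stores vectors and payloads in collections. "
    "Retrieval finds the chunks closest to the query vector. "
    "The language model then synthesizes an answer from those chunks."
)

# Ingestion: chunk, embed, store.
store = InMemoryVectorStore()
for chunk in chunk_text(document):
    store.add(chunk)

# Query time: embed the question, retrieve sources, build the prompt.
question = "Where are vectors and payloads stored?"
sources = store.search(question, limit=1)
prompt = build_prompt(question, sources)
```

A real deployment would replace `embed` with a trained model and the store with an approximate nearest-neighbour index; the brute-force scan here only keeps the sketch dependency-free.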
diff --git a/qdrant-landing/content/blog/using-qdrant-and-langchain.md b/qdrant-landing/content/blog/using-qdrant-and-langchain.md index ce805855b..adc508062 100644 --- a/qdrant-landing/content/blog/using-qdrant-and-langchain.md +++ b/qdrant-landing/content/blog/using-qdrant-and-langchain.md @@ -21,9 +21,9 @@ tags: ## Long-Term Memory for Your GenAI App -Qdrant's vector database quickly grew due to its ability to make Generative AI more effective. On its own, an LLM can be used to build a process-altering invention. With Qdrant, you can turn this invention into a production-level app that brings real business value. +Qdrant's [vector database](/qdrant-vector-database/) quickly grew due to its ability to make Generative AI more effective. On its own, an LLM can be used to build a process-altering invention. With Qdrant, you can turn this invention into a production-level app that brings real business value. -The use of vector search in GenAI now has a name: **Retrieval Augmented Generation (RAG)**. [In our previous article](/articles/rag-is-dead/), we argued why RAG is an essential component of AI setups, and why large-scale AI can't operate without it. Numerous case studies explain that AI applications are simply too costly and resource-intensive to run using only LLMs. +The use of vector search in GenAI now has a name: [**Retrieval Augmented Generation (RAG)**](/rag/). [In our previous article](/articles/rag-is-dead/), we argued why RAG is an essential component of AI setups, and why large-scale AI can't operate without it. Numerous case studies explain that AI applications are simply too costly and resource-intensive to run using only LLMs. > Going forward, the solution is to leverage composite systems that use models and vector databases. @@ -37,7 +37,7 @@ Qdrant streamlines this process of retrieval augmentation, making it faster, eas Retrieval Augmented Generation is not without its challenges and limitations. 
One of the main setbacks for app developers is managing the entire setup. The integration of a retriever and a generator into a single model can lead to a raised level of complexity, thus increasing the computational resources required. -[LangChain](https://www.langchain.com/) is a framework that makes developing RAG-based applications much easier. It unifies interfaces to different libraries, including major embedding providers like OpenAI or Cohere and vector stores like Qdrant. With LangChain, you can focus on creating tangible GenAI applications instead of writing your logic from the ground up. +[LangChain](https://www.langchain.com/) is a framework that makes developing RAG-based applications much easier. It unifies interfaces to different libraries, including major embedding providers like OpenAI or Cohere and vector stores like Qdrant. With [LangChain](/articles/langchain-integration/), you can focus on creating tangible GenAI applications instead of writing your logic from the ground up. > Qdrant is one of the **top supported vector stores** on LangChain, with [extensive documentation](https://python.langchain.com/docs/integrations/vectorstores/qdrant) and [examples](https://python.langchain.com/docs/integrations/retrievers/self_query/qdrant_self_query). @@ -72,7 +72,7 @@ Here is what this basic tutorial will teach you: **2. Preprocess and format data for use by the chatbot:** First, you will download a sample dataset based on some academic journals. Then, you will process this data into embeddings and store it as vectors inside of Qdrant. -**3. Implement vector similarity search algorithms:** Second, you will create and test a chatbot that only uses the LLM. Then, you will enable the memory component offered by Qdrant. This will allow your chatbot to be modified and updated, giving it long-term memory. +**3. 
Implement [vector similarity](/articles/vector-similarity-beyond-search/) search algorithms:** Second, you will create and test a chatbot that only uses the LLM. Then, you will enable the memory component offered by Qdrant. This will allow your chatbot to be modified and updated, giving it long-term memory. **4. Optimize the chatbot's performance:** In the last step, you will query the chatbot in two ways. First query will retrieve parametric data from the LLM, while the second one will get contexual data via Qdrant. diff --git a/qdrant-landing/content/blog/vector-image-search-rag-vector-space-talk-008.md b/qdrant-landing/content/blog/vector-image-search-rag-vector-space-talk-008.md index bfeff817b..688d4ff47 100644 --- a/qdrant-landing/content/blog/vector-image-search-rag-vector-space-talk-008.md +++ b/qdrant-landing/content/blog/vector-image-search-rag-vector-space-talk-008.md @@ -33,9 +33,9 @@ Noé Achache is a Lead Data Scientist at Sicara, where he worked on a wide range ## **Top Takeaways:** -Discover the efficacy of Dino V2 in image representation and the complexities of deploying vector databases, while navigating the challenges of fine-tuning and data safety in sensitive fields. +Discover the efficacy of Dino V2 in image representation and the complexities of deploying [vector databases](/qdrant-vector-database/), while navigating the challenges of fine-tuning and data safety in sensitive fields. -In this episode, Noe, shares insights on vector search from image search to retrieval augmented generation, emphasizing practical application in complex projects. +In this episode, Noe, shares insights on vector search from image search to [retrieval augmented generation](/rag/), emphasizing practical application in complex projects. 
5 key insights you’ll learn: diff --git a/qdrant-landing/content/blog/vector-search-for-content-based-video-recommendation-gladys-and-sam-vector-space-talk-012.md b/qdrant-landing/content/blog/vector-search-for-content-based-video-recommendation-gladys-and-sam-vector-space-talk-012.md index 82399b8c3..206a7cc6f 100644 --- a/qdrant-landing/content/blog/vector-search-for-content-based-video-recommendation-gladys-and-sam-vector-space-talk-012.md +++ b/qdrant-landing/content/blog/vector-search-for-content-based-video-recommendation-gladys-and-sam-vector-space-talk-012.md @@ -80,7 +80,7 @@ Sourabh Agrawal: Yeah. First of all, thanks a lot for inviting me and no worries for hiccup. I guess I have never seen a demo or a talk which goes without any technical hiccups. It is bound to happen. Really excited to be here. Really excited to talk about LLM evaluations. And as you rightly pointed right, it's really a hot topic and rightly so. Right. Sourabh Agrawal: -The way things have been panning out with LLMs and chat, GPT and GPT four and so on, is that people started building all these prototypes, right? And the way to evaluate them was just like eyeball them, just trust your gut feeling, go with the vibe. I guess they truly adopted the startup methodology, push things out to production and break things. But what people have been realizing is that it's not scalable, right? I mean, rightly so. It's highly subjective. It's a developer, it's a human who is looking at all the responses, someday he might like this, someday he might like something else. And it's not possible for them to kind of go over, just read through more than ten responses. And now the unique thing about production use cases is that they need continuous refinement. You need to keep on improving them, you need to keep on improving your prompt or your retrieval, your embedding model, your retrieval mechanisms and so on. 
+The way things have been panning out with LLMs and chat, GPT and GPT four and so on, is that people started building all these prototypes, right? And the way to evaluate them was just like eyeball them, just trust your gut feeling, go with the vibe. I guess they truly adopted the startup methodology, push things out to production and break things. But what people have been realizing is that it's not scalable, right? I mean, rightly so. It's highly subjective. It's a developer, it's a human who is looking at all the responses, someday he might like this, someday he might like something else. And it's not possible for them to kind of go over, just read through more than ten responses. And now the unique thing about production use cases is that they need continuous refinement. You need to keep on improving them, you need to keep on improving your prompt or your retrieval, your [embedding model](/articles/fastembed/), your retrieval mechanisms and so on. Sourabh Agrawal: So that presents a case like you have to use a more scalable technique, you have to use LLMs as a judge because that's scalable. You can have an API call, and if that API call gives good quality results, it's a way you can mimic whatever your human is doing or in a way augment them which can truly act as their copilot. diff --git a/qdrant-landing/content/blog/what-is-vector-similarity.md b/qdrant-landing/content/blog/what-is-vector-similarity.md index d3447cf7b..805bc08b0 100644 --- a/qdrant-landing/content/blog/what-is-vector-similarity.md +++ b/qdrant-landing/content/blog/what-is-vector-similarity.md @@ -91,7 +91,7 @@ Vector similarity in text analysis helps in understanding and processing languag **Retrieval Augmented Generation (RAG)** -Vector similarity can help in representing and comparing linguistic features, from single words to entire documents. This can help build retrieval augmented generation (RAG) applications, where the data is retrieved based on user intent. 
It also enables nuanced language tasks such as sentiment analysis, synonym detection, language translation, and more. +Vector similarity can help in representing and comparing linguistic features, from single words to entire documents. This can help build [retrieval augmented generation (RAG)](/rag/) applications, where the data is retrieved based on user intent. It also enables nuanced language tasks such as sentiment analysis, synonym detection, language translation, and more. **Recommender Systems** @@ -155,7 +155,7 @@ There are three quantization strategies you can choose from - scalar quantizatio Qdrant offers several [security features](/documentation/guides/security/) to help protect data and access to the vector store: -- API Key Authentication: This helps secure API access to Qdrant Cloud with static or read-only API keys. +- API Key Authentication: This helps secure API access to [Qdrant Cloud](/cloud/) with static or read-only API keys. - JWT-Based Access Control: You can also enable more granular access control through JSON Web Tokens (JWT), and opt for restricted access to specific parts of the stored data while building Role-Based Access Control (RBAC). - TLS Encryption: Additionally, you can enable TLS Encryption on data transmission to ensure security of data in transit. diff --git a/qdrant-landing/content/documentation/_index.md b/qdrant-landing/content/documentation/_index.md index 1f08993bb..654fc782d 100644 --- a/qdrant-landing/content/documentation/_index.md +++ b/qdrant-landing/content/documentation/_index.md @@ -4,7 +4,7 @@ weight: 10 --- # Documentation -**Qdrant (read: quadrant)** is a vector similarity search engine. Use our documentation to develop a production-ready service with a convenient API to store, search, and manage vectors with an additional payload. Qdrant's expanding features allow for all sorts of neural network or semantic-based matching, faceted search, and other applications. 
+**Qdrant (read: quadrant)** is a [vector similarity](/articles/vector-similarity-beyond-search/) search engine. Use our documentation to develop a production-ready service with a convenient API to store, search, and manage vectors with an additional payload. Qdrant's expanding features allow for all sorts of neural network or semantic-based matching, faceted search, and other applications. ## Product Release: Announcing Qdrant Hybrid Cloud! ***

Now you can attach your own infrastructure to Qdrant Cloud!

*** diff --git a/qdrant-landing/content/documentation/cloud/backups.md b/qdrant-landing/content/documentation/cloud/backups.md index 29fec9e31..543fe0fa9 100644 --- a/qdrant-landing/content/documentation/cloud/backups.md +++ b/qdrant-landing/content/documentation/cloud/backups.md @@ -19,7 +19,7 @@ self-service backups. ## Prerequisites -You can back up your Qdrant clusters though the Qdrant Cloud +You can back up your Qdrant clusters through the [Qdrant Cloud](/cloud/) Dashboard at https://cloud.qdrant.io. This section assumes that you've already set up your cluster, as described in the following sections: diff --git a/qdrant-landing/content/documentation/cloud/capacity-sizing.md b/qdrant-landing/content/documentation/cloud/capacity-sizing.md index 1f2d79581..4c1941bb1 100644 --- a/qdrant-landing/content/documentation/cloud/capacity-sizing.md +++ b/qdrant-landing/content/documentation/cloud/capacity-sizing.md @@ -66,6 +66,6 @@ If you're running low on disk space, consider the following advantages: - Larger Datasets: Supports larger datasets. With vector search, larger datasets can improve the relevance and quality of search results. - Improved Indexing: Supports the use of indexing strategies such as -HNSW (Hierarchical Navigable Small World). +[HNSW](/articles/filtrable-hnsw/) (Hierarchical Navigable Small World). - Caching: Improves speed when you cache frequently accessed data on disk. - Backups and Redundancy: Allows more frequent backups. Perhaps the most important advantage. 
diff --git a/qdrant-landing/content/documentation/concepts/explore.md b/qdrant-landing/content/documentation/concepts/explore.md index a3c4e074c..fc583253e 100644 --- a/qdrant-landing/content/documentation/concepts/explore.md +++ b/qdrant-landing/content/documentation/concepts/explore.md @@ -884,7 +884,7 @@ Notes about discovery search: ### Context search -Conversely, in the absence of a target, a rigid integer-by-integer function doesn't provide much guidance for the search when utilizing a proximity graph like HNSW. Instead, context search employs a function derived from the [triplet-loss](/articles/triplet-loss/) concept, which is usually applied during model training. For context search, this function is adapted to steer the search towards areas with fewer negative examples. +Conversely, in the absence of a target, a rigid integer-by-integer function doesn't provide much guidance for the search when utilizing a proximity graph like [HNSW](/articles/filtrable-hnsw/). Instead, context search employs a function derived from the [triplet-loss](/articles/triplet-loss/) concept, which is usually applied during model training. For context search, this function is adapted to steer the search towards areas with fewer negative examples. ![Context search](/docs/context-search.png) diff --git a/qdrant-landing/content/documentation/concepts/search.md b/qdrant-landing/content/documentation/concepts/search.md index aa32e66a0..c72a85ffa 100644 --- a/qdrant-landing/content/documentation/concepts/search.md +++ b/qdrant-landing/content/documentation/concepts/search.md @@ -349,7 +349,7 @@ Parameter `limit` (or its alias - `top`) specifies the amount of most similar re Values under the key `params` specify custom parameters for the search. Currently, it could be: -* `hnsw_ef` - value that specifies `ef` parameter of the HNSW algorithm. +* `hnsw_ef` - value that specifies `ef` parameter of the [HNSW](/articles/filtrable-hnsw/) algorithm. 
* `exact` - option to not use the approximate search (ANN). If set to true, the search may run for a long as it performs a full scan to retrieve exact results. * `indexed_only` - With this option you can disable the search in those segments where vector index is not built yet. This may be useful if you want to minimize the impact to the search performance whilst the collection is also being updated. Using this option may lead to a partial result if the collection is not fully indexed yet, consider using it only if eventual consistency is acceptable for your use case. @@ -475,7 +475,7 @@ If the collection was created with sparse vectors, the name of the sparse vector You can still use payload filtering and other features of the search API with sparse vectors. -There are however important differences between dense and sparse vector search: +There are however important differences between dense and [sparse vector search](/articles/sparse-vectors/): | Index| Sparse Query | Dense Query | | --- | --- | --- | diff --git a/qdrant-landing/content/documentation/concepts/storage.md b/qdrant-landing/content/documentation/concepts/storage.md index a956ef4f4..f5ffd8a14 100644 --- a/qdrant-landing/content/documentation/concepts/storage.md +++ b/qdrant-landing/content/documentation/concepts/storage.md @@ -271,7 +271,7 @@ The rule of thumb to set the memmap threshold parameter is simple: - if you have a balanced use scenario - set memmap threshold the same as `indexing_threshold` (default is 20000). In this case the optimizer will not make any extra runs and will optimize all thresholds at once. - if you have a high write load and low RAM - set memmap threshold lower than `indexing_threshold` to e.g. 10000. In this case the optimizer will convert the segments to memmap storage first and will only apply indexing after that. -In addition, you can use memmap storage not only for vectors, but also for HNSW index. 
+In addition, you can use memmap storage not only for vectors, but also for [HNSW](/articles/filtrable-hnsw/) index. To enable this, you need to set the `hnsw_config.on_disk` parameter to `true` during collection [creation](../collections/#create-a-collection) or [updating](../collections/#update-collection-parameters). ```http diff --git a/qdrant-landing/content/documentation/examples/aleph-alpha-search.md b/qdrant-landing/content/documentation/examples/aleph-alpha-search.md index df0c076cf..a6ea5af13 100644 --- a/qdrant-landing/content/documentation/examples/aleph-alpha-search.md +++ b/qdrant-landing/content/documentation/examples/aleph-alpha-search.md @@ -14,7 +14,7 @@ This tutorial shows you how to run a proper multimodal semantic search system wi In most cases, semantic search is limited to homogenous data types for both documents and queries (text-text, image-image, audio-audio, etc.). With the recent growth of multimodal architectures, it is now possible to encode different data types into the same latent space. That opens up some great possibilities, as you can finally explore non-textual data, for example visual, with text queries. -In the past, this would require labelling every image with a description of what it presents. Right now, you can rely on vector embeddings, which can represent all +In the past, this would require labelling every image with a description of what it presents. Right now, you can rely on [vector embeddings](/articles/what-are-embeddings/), which can represent all the inputs in the same space. 
*Figure 1: Two examples of text-image pairs presenting a similar object, encoded by a multimodal network into the same diff --git a/qdrant-landing/content/documentation/examples/cohere-rag-connector.md b/qdrant-landing/content/documentation/examples/cohere-rag-connector.md index 2fbc49b1a..96ca81638 100644 --- a/qdrant-landing/content/documentation/examples/cohere-rag-connector.md +++ b/qdrant-landing/content/documentation/examples/cohere-rag-connector.md @@ -30,8 +30,8 @@ This tutorial guides you step by step on building such a service around Qdrant. ## Qdrant connector You probably already have some collections you would like to bring to the LLM. Maybe your pipeline was set up using some -of the popular libraries such as Langchain, Llama Index, or Haystack. Cohere connectors may implement even more complex -logic, e.g. hybrid search. In our case, we are going to start with a fresh Qdrant collection, index data using Cohere +of the popular libraries such as [Langchain](/articles/langchain-integration/), Llama Index, or Haystack. Cohere connectors may implement even more complex +logic, e.g. [hybrid search](/articles/hybrid-search/). In our case, we are going to start with a fresh Qdrant collection, index data using Cohere Embed v3, build the connector, and finally connect it with the [Command-R model](https://txt.cohere.com/command-r/). 
### Building the collection diff --git a/qdrant-landing/content/documentation/examples/hybrid-search-llamaindex-jinaai.md b/qdrant-landing/content/documentation/examples/hybrid-search-llamaindex-jinaai.md index cf2a5e625..9db98716c 100644 --- a/qdrant-landing/content/documentation/examples/hybrid-search-llamaindex-jinaai.md +++ b/qdrant-landing/content/documentation/examples/hybrid-search-llamaindex-jinaai.md @@ -138,7 +138,7 @@ The code below does the following: - combines `sparse` and `dense` vectors for hybrid search; - stores all data into Qdrant; -Hybrid search with Qdrant must be enabled from the beginning - we can simply set `enable_hybrid=True`. +[Hybrid search with Qdrant](/articles/hybrid-search/) must be enabled from the beginning - we can simply set `enable_hybrid=True`. ```python # By default llamaindex uses OpenAI models diff --git a/qdrant-landing/content/documentation/examples/natural-language-search-oracle-cloud-infrastructure-cohere-langchain.md b/qdrant-landing/content/documentation/examples/natural-language-search-oracle-cloud-infrastructure-cohere-langchain.md index 3cddaa6f5..de7cefacc 100644 --- a/qdrant-landing/content/documentation/examples/natural-language-search-oracle-cloud-infrastructure-cohere-langchain.md +++ b/qdrant-landing/content/documentation/examples/natural-language-search-oracle-cloud-infrastructure-cohere-langchain.md @@ -85,7 +85,7 @@ export QDRANT_URL="https://qdrant.example.com" export QDRANT_API_KEY="your-api-key" ``` -*Optional:* Whenever you use LangChain, you can also [configure LangSmith](https://docs.smith.langchain.com/), which will help us trace, monitor and debug LangChain applications. You can sign up for LangSmith [here](https://smith.langchain.com/). +*Optional:* Whenever you use [LangChain](/articles/langchain-integration/), you can also [configure LangSmith](https://docs.smith.langchain.com/), which will help us trace, monitor and debug LangChain applications. 
You can sign up for LangSmith [here](https://smith.langchain.com/). ```shell export LANGCHAIN_TRACING_V2=true diff --git a/qdrant-landing/content/documentation/examples/rag-chatbot-scaleway.md b/qdrant-landing/content/documentation/examples/rag-chatbot-scaleway.md index d073ee013..ef785f297 100644 --- a/qdrant-landing/content/documentation/examples/rag-chatbot-scaleway.md +++ b/qdrant-landing/content/documentation/examples/rag-chatbot-scaleway.md @@ -128,7 +128,7 @@ vectorstore = Qdrant.from_documents( ## Retrieve and generate content -The `vectorstore` is used as a retriever to fetch relevant documents based on vector similarity. The `hub.pull("rlm/rag-prompt")` function is used to pull a specific prompt from a repository, which is designed to work with retrieved documents and a question to generate a response. +The `vectorstore` is used as a retriever to fetch relevant documents based on [vector similarity](/articles/vector-similarity-beyond-search/). The `hub.pull("rlm/rag-prompt")` function is used to pull a specific prompt from a repository, which is designed to work with retrieved documents and a question to generate a response. The `format_docs` function formats the retrieved documents into a single string, preparing them for further processing. This formatted string, along with a question, is passed through a chain of operations. Firstly, the context (formatted documents) and the question are processed by the retriever and the prompt. Then, the result is fed into a large language model (`llm`) for content generation. Finally, the output is parsed into a string format using `StrOutputParser()`. 
diff --git a/qdrant-landing/content/documentation/examples/rag-customer-support-cohere-airbyte-aws.md b/qdrant-landing/content/documentation/examples/rag-customer-support-cohere-airbyte-aws.md index e023d2877..c1fc8700f 100644 --- a/qdrant-landing/content/documentation/examples/rag-customer-support-cohere-airbyte-aws.md +++ b/qdrant-landing/content/documentation/examples/rag-customer-support-cohere-airbyte-aws.md @@ -14,7 +14,7 @@ aliases: Maintaining top-notch customer service is vital to business success. As your operation expands, so does the influx of customer queries. Many of these queries are repetitive, making automation a time-saving solution. Your support team's expertise is typically kept private, but you can still use AI to automate responses securely. -In this tutorial we will setup a private AI service that answers customer support queries with high accuracy and effectiveness. By leveraging Cohere's powerful models (deployed to [AWS](https://cohere.com/deployment-options/aws)) with Qdrant Hybrid Cloud, you can create a fully private customer support system. Data synchronization, facilitated by [Airbyte](https://airbyte.com/), will complete the setup. +In this tutorial we will setup a private AI service that answers customer support queries with high accuracy and effectiveness. By leveraging Cohere's powerful models (deployed to [AWS](https://cohere.com/deployment-options/aws)) with [Qdrant Hybrid Cloud](/hybrid-cloud/), you can create a fully private customer support system. Data synchronization, facilitated by [Airbyte](https://airbyte.com/), will complete the setup. 
![Architecture diagram](/documentation/examples/customer-support-cohere-airbyte/architecture-diagram.png) diff --git a/qdrant-landing/content/documentation/examples/recommendation-system-ovhcloud.md b/qdrant-landing/content/documentation/examples/recommendation-system-ovhcloud.md index 2809c2705..a8b4f483d 100644 --- a/qdrant-landing/content/documentation/examples/recommendation-system-ovhcloud.md +++ b/qdrant-landing/content/documentation/examples/recommendation-system-ovhcloud.md @@ -11,7 +11,7 @@ aliases: | Time: 120 min | Level: Advanced | Output: [GitHub](https://github.com/infoslack/qdrant-example/blob/main/HC-demo/HC-OVH.ipynb) | | --- | ----------- | ----------- |----------- | -In this tutorial, you will build a mechanism that recommends movies based on defined preferences. Vector databases like Qdrant are good for storing high-dimensional data, such as user and item embeddings. They can enable personalized recommendations by quickly retrieving similar entries based on advanced indexing techniques. In this specific case, we will use [sparse vectors](/articles/sparse-vectors/) to create an efficient and accurate recommendation system. +In this tutorial, you will build a mechanism that recommends movies based on defined preferences. [Vector databases](/articles/what-is-a-vector-database/) like Qdrant are good for storing high-dimensional data, such as user and item embeddings. They can enable personalized recommendations by quickly retrieving similar entries based on advanced indexing techniques. In this specific case, we will use [sparse vectors](/articles/sparse-vectors/) to create an efficient and accurate recommendation system. **Privacy and Sovereignty:** Since preference data is proprietary, it should be stored in a secure and controlled environment. Our vector database can easily be hosted on [OVHcloud](https://ovhcloud.com/), our trusted [Qdrant Hybrid Cloud](/documentation/hybrid-cloud/) partner. 
This means that Qdrant can be run from your OVHcloud region, but the database itself can still be managed from within Qdrant Cloud's interface. Both products have been tested for compatibility and scalability, and we recommend their [managed Kubernetes](https://www.ovhcloud.com/en/public-cloud/kubernetes/) service. diff --git a/qdrant-landing/content/documentation/faq/qdrant-fundamentals.md b/qdrant-landing/content/documentation/faq/qdrant-fundamentals.md index 4cf01b072..2640c7eb9 100644 --- a/qdrant-landing/content/documentation/faq/qdrant-fundamentals.md +++ b/qdrant-landing/content/documentation/faq/qdrant-fundamentals.md @@ -35,7 +35,7 @@ What Qdrant can do: - Search with full-text filters - Apply full-text filters to the vector search (i.e., perform vector search among the records with specific words or phrases) - Do prefix search and semantic [search-as-you-type](../../../articles/search-as-you-type/) -- Sparse vectors, as used in [SPLADE](https://github.com/naver/splade) or similar models +- [Sparse vectors](/articles/sparse-vectors/), as used in [SPLADE](https://github.com/naver/splade) or similar models What Qdrant plans to introduce in the future: diff --git a/qdrant-landing/content/documentation/frameworks/langchain4j.md b/qdrant-landing/content/documentation/frameworks/langchain4j.md index e226fe271..46981596f 100644 --- a/qdrant-landing/content/documentation/frameworks/langchain4j.md +++ b/qdrant-landing/content/documentation/frameworks/langchain4j.md @@ -5,7 +5,7 @@ weight: 2110 # LangChain for Java -LangChain for Java, also known as [Langchain4J](https://github.com/langchain4j/langchain4j), is a community port of [Langchain](https://www.langchain.com/) for building context-aware AI applications in Java +[LangChain](/articles/langchain-integration/) for Java, also known as [Langchain4J](https://github.com/langchain4j/langchain4j), is a community port of [Langchain](https://www.langchain.com/) for building context-aware AI applications in Java You can 
use Qdrant as a vector store in Langchain4J through the [`langchain4j-qdrant`](https://central.sonatype.com/artifact/dev.langchain4j/langchain4j-qdrant) module. diff --git a/qdrant-landing/content/documentation/frameworks/pipedream.md b/qdrant-landing/content/documentation/frameworks/pipedream.md index 07f45f5d1..af2983f18 100644 --- a/qdrant-landing/content/documentation/frameworks/pipedream.md +++ b/qdrant-landing/content/documentation/frameworks/pipedream.md @@ -39,5 +39,5 @@ Once a connection is set up, you can use the app to build workflows with the [20 ## Further Reading - [Pipedream Documentation](https://pipedream.com/docs). -- [Qdrant Cloud Authentication](https://qdrant.tech/documentation/cloud/authentication/). +- [Qdrant Cloud Authentication](/documentation/cloud/authentication/). - [Source Code](https://github.com/PipedreamHQ/pipedream/tree/master/components/qdrant) diff --git a/qdrant-landing/content/documentation/frameworks/spring-ai.md b/qdrant-landing/content/documentation/frameworks/spring-ai.md index 1d6095737..fd9ff45b9 100644 --- a/qdrant-landing/content/documentation/frameworks/spring-ai.md +++ b/qdrant-landing/content/documentation/frameworks/spring-ai.md @@ -7,7 +7,7 @@ weight: 2200 [Spring AI](https://docs.spring.io/spring-ai/reference/) is a Java framework that provides a [Spring-friendly](https://spring.io/) API and abstractions for developing AI applications. -Qdrant is available as supported vector database for use within your Spring AI projects. +Qdrant is available as supported [vector database](/articles/what-is-a-vector-database/) for use within your Spring AI projects. 
## Installation diff --git a/qdrant-landing/content/documentation/guides/security.md b/qdrant-landing/content/documentation/guides/security.md index 58ae371b6..e2f4503e2 100644 --- a/qdrant-landing/content/documentation/guides/security.md +++ b/qdrant-landing/content/documentation/guides/security.md @@ -40,7 +40,7 @@ export QDRANT__SERVICE__API_KEY=your_secret_api_key_here -For using API key based authentication in Qdrant Cloud see the cloud +For using API key based authentication in [Qdrant Cloud](/cloud/) see the cloud [Authentication](/documentation/cloud/authentication/) section. diff --git a/qdrant-landing/content/documentation/hybrid-cloud/hybrid-cloud-setup.md b/qdrant-landing/content/documentation/hybrid-cloud/hybrid-cloud-setup.md index edb9002b9..046c3a783 100644 --- a/qdrant-landing/content/documentation/hybrid-cloud/hybrid-cloud-setup.md +++ b/qdrant-landing/content/documentation/hybrid-cloud/hybrid-cloud-setup.md @@ -11,7 +11,7 @@ To learn how Hybrid Cloud works, [read the overview document](/documentation/hyb ## Prerequisites -- **Kubernetes cluster:** To create a Hybrid Cloud Environment, you need a [standard compliant](https://www.cncf.io/training/certification/software-conformance/) Kubernetes cluster. You can run this cluster in any cloud, on-premise or edge environment, with distributions that range from AWS EKS to VMWare vSphere. +- **Kubernetes cluster:** To create a [Hybrid Cloud](/hybrid-cloud/) Environment, you need a [standard compliant](https://www.cncf.io/training/certification/software-conformance/) Kubernetes cluster. You can run this cluster in any cloud, on-premise or edge environment, with distributions that range from AWS EKS to VMWare vSphere. - **Storage:** For storage, you need to set up the Kubernetes cluster with a Container Storage Interface (CSI) driver that provides block storage. For vertical scaling, the CSI driver needs to support volume expansion. 
For backups and restores, the driver needs to support CSI snapshots and restores. diff --git a/qdrant-landing/content/documentation/hybrid-cloud/platform-deployment-options.md b/qdrant-landing/content/documentation/hybrid-cloud/platform-deployment-options.md index 16d661768..72614e829 100644 --- a/qdrant-landing/content/documentation/hybrid-cloud/platform-deployment-options.md +++ b/qdrant-landing/content/documentation/hybrid-cloud/platform-deployment-options.md @@ -5,7 +5,7 @@ weight: 4 # Platform Deployment Options -This page provides an overview of how to deploy Qdrant Hybrid Cloud on various managed Kubernetes platforms. +This page provides an overview of how to deploy [Qdrant Hybrid Cloud](/hybrid-cloud/) on various managed Kubernetes platforms. For a general list of prerequisites and installation steps, see our [Hybrid Cloud setup guide](/documentation/hybrid-cloud/hybrid-cloud-setup/). diff --git a/qdrant-landing/content/documentation/overview/_index.md b/qdrant-landing/content/documentation/overview/_index.md index 89701c2cf..13f961d6c 100644 --- a/qdrant-landing/content/documentation/overview/_index.md +++ b/qdrant-landing/content/documentation/overview/_index.md @@ -9,7 +9,7 @@ aliases: ![qdrant](/images/logo_with_text.png) -Vector databases are a relatively new way for interacting with abstract data representations +[Vector databases](/qdrant-vector-database/) are a relatively new way for interacting with abstract data representations derived from opaque machine learning models such as deep learning architectures. These representations are often called vectors or embeddings and they are a compressed version of the data used to train a machine learning model to accomplish a task like sentiment analysis, @@ -21,7 +21,7 @@ learn about one of the most popular and fastest growing vector databases in the ## What is Qdrant? 
-[Qdrant](https://github.com/qdrant/qdrant) "is a vector similarity search engine that provides a production-ready +[Qdrant](https://github.com/qdrant/qdrant) "is a [vector similarity search engine](/articles/vector-similarity-beyond-search/) that provides a production-ready service with a convenient API to store, search, and manage points (i.e. vectors) with an additional payload." You can think of the payloads as additional pieces of information that can help you hone in on your search and also receive useful information that you can give to your users. diff --git a/qdrant-landing/content/documentation/overview/vector-search.md b/qdrant-landing/content/documentation/overview/vector-search.md index 0d219e152..39a519b53 100644 --- a/qdrant-landing/content/documentation/overview/vector-search.md +++ b/qdrant-landing/content/documentation/overview/vector-search.md @@ -22,7 +22,7 @@ Time passed, and we haven’t had much change in that area for quite a long time {{< figure src=/docs/gettingstarted/tokenization.png caption="The process of tokenization with an additional stopwords removal and converstion to root form of a word." >}} -Technically speaking, we encode the documents and queries into so-called sparse vectors where each position has a corresponding word from the whole dictionary. If the input text contains a specific word, it gets a non-zero value at that position. But in reality, none of the texts will contain more than hundreds of different words. So the majority of vectors will have thousands of zeros and a few non-zero values. That’s why we call them sparse. And they might be already used to calculate some word-based similarity by finding the documents which have the biggest overlap. +Technically speaking, we encode the documents and queries into so-called [sparse vectors](/articles/sparse-vectors/) where each position has a corresponding word from the whole dictionary. If the input text contains a specific word, it gets a non-zero value at that position. 
But in reality, none of the texts will contain more than hundreds of different words. So the majority of vectors will have thousands of zeros and a few non-zero values. That’s why we call them sparse. And they might be already used to calculate some word-based similarity by finding the documents which have the biggest overlap. {{< figure src=/docs/gettingstarted/query.png caption="An example of a query vectorized to sparse format." >}} @@ -50,9 +50,9 @@ Dense vectors can capture the meaning, not the words used in a text. That being ## Why Qdrant? -The challenge with vector search arises when we need to find similar documents in a big set of objects. If we want to find the closest examples, the naive approach would require calculating the distance to every document. That might work with dozens or even hundreds of examples but may become a bottleneck if we have more than that. When we work with relational data, we set up database indexes to speed things up and avoid full table scans. And the same is true for vector search. Qdrant is a fully-fledged vector database that speeds up the search process by using a graph-like structure to find the closest objects in sublinear time. So you don’t calculate the distance to every object from the database, but some candidates only. +The challenge with vector search arises when we need to find similar documents in a big set of objects. If we want to find the closest examples, the naive approach would require calculating the distance to every document. That might work with dozens or even hundreds of examples but may become a bottleneck if we have more than that. When we work with relational data, we set up database indexes to speed things up and avoid full table scans. And the same is true for vector search. Qdrant is a fully-fledged [vector database](/articles/what-is-a-vector-database/) that speeds up the search process by using a graph-like structure to find the closest objects in sublinear time. 
So you don’t calculate the distance to every object from the database, but some candidates only. -{{< figure src=/docs/gettingstarted/vector-search.png caption="Vector search with Qdrant. Thanks to HNSW graph we are able to compare the distance to some of the objects from the database, not to all of them." >}} +{{< figure src=/docs/gettingstarted/vector-search.png caption="Vector search with Qdrant. Thanks to [HNSW](/articles/filtrable-hnsw/) graph we are able to compare the distance to some of the objects from the database, not to all of them." >}} While doing a semantic search at scale, because this is what we sometimes call the vector search done on texts, we need a specialized tool to do it effectively — a tool like Qdrant. @@ -66,7 +66,7 @@ Despite its complicated background, vectors search is extraordinarily simple to [**Tutorial 2 - Question and Answer System**](/articles/qa-with-cohere-and-qdrant/) However, you can also choose SaaS tools to generate them and avoid building your model. Setting up a vector search project with Qdrant Cloud and Cohere co.embed API is fairly easy if you follow the [Question and Answer system tutorial](/articles/qa-with-cohere-and-qdrant/). -There is another exciting thing about vector search. You can search for any kind of data as long as there is a neural network that would vectorize your data type. Do you think about a reverse image search? That’s also possible with vector embeddings. +There is another exciting thing about vector search. You can search for any kind of data as long as there is a neural network that would vectorize your data type. Do you think about a reverse image search? That’s also possible with [vector embeddings](/articles/what-are-embeddings/). 
diff --git a/qdrant-landing/content/documentation/tutorials/collaborative-filtering.md b/qdrant-landing/content/documentation/tutorials/collaborative-filtering.md index 9eb217a55..e3cd0f022 100644 --- a/qdrant-landing/content/documentation/tutorials/collaborative-filtering.md +++ b/qdrant-landing/content/documentation/tutorials/collaborative-filtering.md @@ -26,7 +26,7 @@ Fortunately, there is a way to build collaborative filtering systems without any ## Implementation -To implement this, you will use a simple yet powerful resource: [Qdrant with Sparse Vectors](https://qdrant.tech/articles/sparse-vectors/). +To implement this, you will use a simple yet powerful resource: [Qdrant with Sparse Vectors](/articles/sparse-vectors/). Notebook: [You can try this code here](https://githubtocolab.com/qdrant/examples/blob/master/collaborative-filtering/collaborative-filtering.ipynb) diff --git a/qdrant-landing/content/documentation/tutorials/create-snapshot.md b/qdrant-landing/content/documentation/tutorials/create-snapshot.md index 38860ed50..d00467611 100644 --- a/qdrant-landing/content/documentation/tutorials/create-snapshot.md +++ b/qdrant-landing/content/documentation/tutorials/create-snapshot.md @@ -71,7 +71,7 @@ dataset = load_dataset( ) ``` -We used the streaming mode, so the dataset is not loaded into memory. Instead, we can iterate through it and extract the id and vector embedding: +We used the streaming mode, so the dataset is not loaded into memory. 
Instead, we can iterate through it and extract the id and [vector embedding](/articles/what-are-embeddings/): ```python for payload in dataset: diff --git a/qdrant-landing/content/documentation/tutorials/hybrid-search-fastembed.md b/qdrant-landing/content/documentation/tutorials/hybrid-search-fastembed.md index cfb887212..49d46b8bc 100644 --- a/qdrant-landing/content/documentation/tutorials/hybrid-search-fastembed.md +++ b/qdrant-landing/content/documentation/tutorials/hybrid-search-fastembed.md @@ -11,7 +11,7 @@ aliases: | Time: 20 min | Level: Beginner | Output: [GitHub](https://github.com/qdrant/qdrant_demo/) | | --- | ----------- | ----------- |----------- | -This tutorial shows you how to build and deploy your own hybrid search service to look through descriptions of companies from [startups-list.com](https://www.startups-list.com/) and pick the most similar ones to your query. +This tutorial shows you how to build and deploy your own [hybrid search](/articles/hybrid-search/) service to look through descriptions of companies from [startups-list.com](https://www.startups-list.com/) and pick the most similar ones to your query. The website contains the company names, descriptions, locations, and a picture for each entry. As we have already written on our [blog](/articles/hybrid-search/), there is no single definition of hybrid search. @@ -19,7 +19,7 @@ In this tutorial we are covering the case with a combination of dense and [spars The former ones refer to the embeddings generated by such well-known neural networks as BERT, while the latter ones are more related to a traditional full-text search approach. Our hybrid search service will use [Fastembed](https://github.com/qdrant/fastembed) package to generate embeddings of text descriptions and [FastAPI](https://fastapi.tiangolo.com/) to serve the search API. -Fastembed natively integrates with Qdrant client, so you can easily upload the data into Qdrant and perform search queries. 
+[Fastembed](/articles/fastembed/) natively integrates with Qdrant client, so you can easily upload the data into Qdrant and perform search queries. ![Hybrid Search Schema](/documentation/tutorials/hybrid-search-with-fastembed/hybrid-search-schema.png)