Releases: Marker-Inc-Korea/AutoRAG
v0.3.2
v0.3.1
What's Changed
- Add toctree by @bwook00 in #745
- Fix minor errors at the documentations by @vkehfdl1 in #747
- add effective_order at bleu as True by @vkehfdl1 in #748
- add passage dependency filter at data creation by @vkehfdl1 in #751
- Add Passage Dependency at README.md by @bwook00 in #761
- docs: update data_format.md by @eltociear in #772
- change the README and tutorial of deploying the result. by @vkehfdl1 in #769
- Windows support (partially) AutoRAG by @vkehfdl1 in #766
- Feature/hongsw/671 dockerfile Add Dockerfile and Docker configuration for AutoRAG production environment by @hongsw in #763
- Add total three evolving methods to QA creation by @vkehfdl1 in #767
- Possible error when the QA retrieval_gt shape will be different by @vkehfdl1 in #774
- dump version 0.3.1 by @vkehfdl1 in #776
New Contributors
- @eltociear made their first contribution in #772
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- Refactoring to v3.0 for efficient deployment by @vkehfdl1 in #727
- resolve vllm error by @vkehfdl1 in #735
- Change data creation package names to v0.3 by @vkehfdl1 in #740
- Add more yaml file by @bwook00 in #743
- Update README for v 0.3.0 by @bwook00 in #739
- dump version 0.3.0 by @vkehfdl1 in #741
Full Changelog: v0.2.18...v0.3.0
🚀 AutoRAG v0.3.0 is Here! 🚀
We're thrilled to introduce AutoRAG v0.3.0, packed with new features and key improvements. Here’s what’s new:
1. Improved Response Time for Deployment
In earlier versions, the response time during deployment was slow, making it difficult to use an optimized RAG pipeline. With v0.3.0, we've significantly reduced the response time, making deployment much more efficient for user-facing services.
2. Re-designed Data Creation Process
Data creation is an essential part of optimizing RAG pipelines, and we've made the process much smoother. In earlier versions, this feature was still in its early stages. Now, in v0.3.0, you can build the data creation process within AutoRAG.
We’ve added AutoParse and AutoChunk, allowing you to configure, parse, and chunk your data using a single YAML file. You can also easily compare different methods to refine your pipeline. Whether you build QA datasets with LLMs or manually, this structure offers a human-in-the-loop process to help you create and manage your data.
Check out the detailed guide on data creation.
3. Python & Library Support Updates
- Python 3.9 is no longer supported. Please upgrade to Python 3.10.
- AutoRAG now works with LangChain 0.3, LlamaIndex 0.11, Pydantic v2, and OpenAI o1 models.
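If your project depends on AutoRAG, a small guard like the one below can fail fast on an unsupported interpreter. This is a minimal sketch, not part of AutoRAG: `MIN_PYTHON`, `is_supported`, and `require_supported` are hypothetical names, and the sketch assumes that versions newer than 3.10 are also acceptable.

```python
import sys

# Assumption: 3.10 is the minimum supported version, per the note above.
MIN_PYTHON = (3, 10)


def is_supported(version_info=sys.version_info):
    """Return True when the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def require_supported(version_info=sys.version_info):
    """Raise RuntimeError with a readable message on older interpreters."""
    if not is_supported(version_info):
        found = ".".join(str(part) for part in version_info[:2])
        needed = ".".join(str(part) for part in MIN_PYTHON)
        raise RuntimeError(
            f"Python {needed} or newer is required, found {found}"
        )
```

Calling `require_supported()` at the top of an entry-point script stops execution before any import that relies on newer language features.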
Share Your Feedback
Your insights help us improve AutoRAG! Let us know how these updates impact your workflow and what you’d like to see in future versions.
Join our Discord server now!
Thank you for being part of the AutoRAG journey!
v0.2.18
What's Changed
- change add_file_name language notation by @bwook00 in #717
- Ingest bm25_tokenizer and embedding only in the strategy of other modules by @vkehfdl1 in #716
- OpenAI o1 model compatibility by @vkehfdl1 in #719
- Compatible with Langchain version 0.3.0 by @bwook00 in #724
- Release/v0.2.18 by @vkehfdl1 in #726
Full Changelog: v0.2.17...v0.2.18
v0.2.17
What's Changed
- Add update corpus feature for chunking optimization by @vkehfdl1 in #706
- Add func annotation about parse module by @bwook00 in #708
- Add baseline beta docs by @bwook00 in #710
- Finish new data creation documentation by @vkehfdl1 in #711
- Finish Chunk and Parse documentation by @bwook00 in #712
- fix vectordb score bug by @vkehfdl1 in #713
Full Changelog: v0.2.16...v0.2.17
v0.2.16
What's Changed
- Replace FastAPI with Flask by @rjwharry in #657
- Mock all OpenAI Embeddings at the test code for outside contributors by @vkehfdl1 in #659
- Add basic dataset schema for new 'beta' version of data creation by @vkehfdl1 in #663
- Add AutoParse baseline and module 'langchain_parse' and 'clova' by @bwook00 in #660
- Add llamaparse module by @bwook00 in #666
- replace yaml.dump with yaml.safe_dump by @rjwharry in #669
- Add table hybrid parse module by @bwook00 in #668
- [Data Creation Refactoring] Add generate qa set features by @vkehfdl1 in #678
- Add more data creation methods by @vkehfdl1 in #680
- add (auto)chunk and its first module llama_index_chunk by @bwook00 in #681
- [Data Creation Refactoring] Add don't know filter at data creation and its docs by @vkehfdl1 in #686
- [Chunk] Add "path" and "start_end_idx" at chunk return by @bwook00 in #685
- add override at Raw and Chunker from_parquet classmethod by @vkehfdl1 in #692
- [Chunk] Add langchain chunk module by @bwook00 in #693
- Fix bug when use vllm in multi-gpu environment by @vkehfdl1 in #697
- Add chunk method at Raw schema and test whole pipeline to generate initial dataset. by @vkehfdl1 in #698
- fix an issue with loading HuggingfaceLLM models by @jis478 in #652
- [Bug] Modify to kiwipiepy version 0.18.0 or higher by @bwook00 in #704
- refactor existing metric python files with input schema by @Eastsidegunn in #667
- dump version 0.2.16 by @vkehfdl1 in #705
Full Changelog: v0.2.15...v0.2.16
v0.2.15
What's Changed
- Update ragas.md to fix typo by @cd80 in #628
- Add optional parameter 'exist_gen_gt' at make_qa_with_existing_queries function. by @bwook00 in #620
- Refactor contributing guide to real informative one by @vkehfdl1 in #625
- Add ruff Linter and reformat all codes by @vkehfdl1 in #622
- Update tutorial.md to fix typo by @cd80 in #629
- Delete empty contents at corpus df for ingesting all document (avoid error at OpenAI embeddings) by @vkehfdl1 in #634
- Fix most of the grammar and typo at the docs. by @vkehfdl1 in #633
- Add running web interface in Runner with Gradio by @rjwharry in #632
- Update LlamaIndex version to 0.11 by @vkehfdl1 in #644
- fix error at datacreation tutorial.md documentation by @vkehfdl1 in #641
- dump version 0.2.15 by @vkehfdl1 in #642
- add documentation about ollama request time out by @vkehfdl1 in #646
Full Changelog: v0.2.14...v0.2.15
v0.2.14
What's Changed
- Refactor code for removing few warnings by @vkehfdl1 in #605
- Resolve asyncio error at FastAPI server execution by @vkehfdl1 in #611
- update cohere version to the latest (which is okay) by @vkehfdl1 in #612
- Update issue templates by @vkehfdl1 in #617
- Add New feature: existing_query_df and local parameters by @skmanzg in #607
- dump version v0.2.14 by @vkehfdl1 in #619
Full Changelog: v0.2.13...v0.2.14
v0.2.13
What's Changed
- Add few description for better understanding of AutoRAG by @vkehfdl1 in #596
- Add ko_okt and ko_kkma bm25 tokenizer by @vkehfdl1 in #597
- Add autorag validate for validating system setup easily by @vkehfdl1 in #599
- update modules pictures at README.md and delete tutorial step 2 by @bwook00 in #600
- dump version v0.2.13 by @vkehfdl1 in #601
Full Changelog: v0.2.12...v0.2.13