[R-263] Roadmap - v0.2 #1009

Open
7 of 26 tasks
jjmachan opened this issue Jun 5, 2024 · 9 comments
Labels: enhancement (New feature or request)

Comments

jjmachan commented Jun 5, 2024

[image: v0.2 roadmap task list]

From SyncLinear.com | R-263

jjmachan changed the title from "Roadmap - v0.2" to "[R-263] Roadmap - v0.2" on Jun 5, 2024
jjmachan added the enhancement (New feature or request) label on Jun 5, 2024
jjmachan modified the milestone: v0.1.9 on Jun 5, 2024
dosubot (bot) added the module-metrics (this is part of metrics module) label on Jun 5, 2024
jjmachan pinned this issue on Jun 5, 2024
jjmachan removed the module-metrics (this is part of metrics module) label on Jun 6, 2024
rajib76 commented Jun 6, 2024

We lack chunk quality metrics today; it would be good to see some chunk quality evaluation metrics added.

jjmachan commented Jun 8, 2024

hey @rajib76, thanks for chipping in 🙂

could you explain a bit more about how you're measuring quality here? maybe an example too if possible?

rajib76 commented Jun 8, 2024

One of the hard problems in RAG today is determining the right chunk size. If a chunk covers multiple concepts, it becomes very difficult to find the most relevant chunk for a question. I was looking for a metric that tells me whether a chunk is atomic, i.e. whether it covers only one concept. The semantic chunking approach did not work because the embedding model itself suffers from semantic dissonance.
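One way such an atomicity check could be prototyped (a minimal sketch, not an existing ragas metric; it assumes the sentence-transformers package and a naive sentence splitter): split the chunk into sentences, embed them, and score how tightly the sentence embeddings cluster around their centroid. A single-concept chunk should score close to 1.0, while a chunk mixing topics should score lower.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def chunk_atomicity(chunk: str) -> float:
    """Mean cosine similarity of each sentence to the chunk's centroid embedding."""
    # naive sentence split; a proper sentence splitter would be better in practice
    sentences = [s.strip() for s in chunk.split(".") if s.strip()]
    if len(sentences) < 2:
        return 1.0  # a single sentence is trivially about one concept
    emb = model.encode(sentences, normalize_embeddings=True)
    centroid = emb.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    sims = emb @ centroid  # cosine similarities, since embeddings are normalized
    return float(sims.mean())
```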

sky-2002 (Contributor) commented:

@jjmachan @rajib76 metrics like chunk_attribution and chunk_utilization (as referenced here) could help quantify chunk quality. We already have relevance scores (from vector DBs or keyword search engines) to measure chunk relevance with respect to the query, but metrics that quantify how much of a chunk was actually used would be helpful. I can take this up if you find these useful - I found it interesting, and it could help decide how many chunks to retrieve.
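For illustration, a rough token-overlap version of such metrics could look like the sketch below (an assumption about how chunk_utilization / chunk_attribution might be defined, not the definition from the linked reference): utilization is the fraction of a chunk's tokens that also appear in the generated answer, and attribution is a simple threshold on top of it.

```python
import re

def chunk_utilization(chunk: str, answer: str) -> float:
    """Fraction of the chunk's tokens that also appear in the generated answer."""
    tokenize = lambda text: set(re.findall(r"[a-z0-9]+", text.lower()))
    chunk_tokens = tokenize(chunk)
    if not chunk_tokens:
        return 0.0
    return len(chunk_tokens & tokenize(answer)) / len(chunk_tokens)

def chunk_attribution(chunk: str, answer: str, threshold: float = 0.2) -> bool:
    """Did this chunk contribute to the answer at all (under a crude overlap test)?"""
    return chunk_utilization(chunk, answer) >= threshold
```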

rajib76 commented Aug 18, 2024 via email

alexander-zuev commented:

@jjmachan hey, thanks for the superb library and effort!

I want to use RAGAS to evaluate my open-source RAG application, which has its own custom chunker and retriever. Would you consider it feasible to add support for custom chunks in the synthetic data generator?

Right now I can't really use ragas fully, because I have to rely on the chunks generated by ragas instead of my own chunker.
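A possible stopgap (a sketch only: the generator call below is illustrative, the exact constructor and argument names differ between ragas versions, and the generator may still split documents internally) is to wrap the output of a custom chunker as LangChain Document objects and pass those to the test set generator:

```python
from langchain_core.documents import Document

my_chunks = ["chunk text 1 ...", "chunk text 2 ..."]  # output of your own chunker

docs = [
    Document(page_content=chunk, metadata={"source": "my_chunker", "chunk_id": i})
    for i, chunk in enumerate(my_chunks)
]

# Illustrative only - check the ragas docs for the version you install:
# from ragas.testset import TestsetGenerator
# generator = TestsetGenerator(llm=generator_llm, embedding_model=embeddings)
# testset = generator.generate_with_langchain_docs(docs, testset_size=10)
```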

jjmachan commented Oct 5, 2024

hey @Twist333d - thanks for the kind words ❤️
we just revamped the testset generation piece for v0.2 - a beta version is already released, and the main release should be out next week

do you want to give that a go?

alexander-zuev commented:

yep @jjmachan, of course - I'll give it a shot!

Btw, I've just set up RAGAS to be used with Weave, and another feature request came up: it would be great if you supported much easier integration with tracing & eval suites such as Weave by W&B.
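Until a first-class integration exists, a minimal way to get evaluation calls into Weave (a sketch assuming only weave.init and the @weave.op decorator from the weave package; the project name and the run_eval body are placeholders) is to wrap the evaluation in an op so inputs and scores are logged per call:

```python
import weave

weave.init("my-rag-evals")  # hypothetical project name

@weave.op()
def run_eval(question: str, answer: str, contexts: list[str]) -> dict:
    # Call ragas (or any scorer) here and return the scores;
    # Weave records the inputs and the returned dict for each call.
    scores = {"faithfulness": 0.0}  # placeholder value
    return scores

run_eval("What is X?", "X is ...", ["context 1", "context 2"])
```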

alexander-zuev commented:

Several more feature requests:

  1. Allow random or 'smart' sampling of the documents used to generate questions. For example, when generating a test dataset for a set of documents, I want to control how and where the questions come from, depending on the volume (1 page vs. 10,000 pages) - see the sampling sketch after this list.
  2. Async / parallel generation of embeddings.
  3. Control over how many embeddings are generated for the input to the dataset generator. As I understand it, right now it converts all nodes to embeddings, which might be too costly or unnecessary. For example, if I set test_size == 1, why does it convert all nodes to embeddings?
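Point 1 can be approximated on the user side today with plain Python (a sketch; no ragas API is assumed): sample a subset of documents before handing them to the generator, so generation cost scales with the sample rather than the whole corpus.

```python
import random

def sample_documents(docs: list, k: int, seed: int = 42) -> list:
    """Randomly pick at most k documents to drive question generation."""
    rng = random.Random(seed)
    return docs if len(docs) <= k else rng.sample(docs, k)

# sampled = sample_documents(all_docs, k=50)
# testset = generator.generate_with_langchain_docs(sampled, ...)  # illustrative
```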

shahules786 unpinned this issue on Oct 25, 2024