Task:
Build a system/app that answers questions with a segment, or a list of segments, from a collection of transcribed videos.
Current state:
I've already made a prototype based on the retrieval example. It processes and indexes the video transcriptions, then plays back the video segment whose timestamps correspond to the best-matching chunk.
Problem:
My adaptation of the basic retrieval example lacks the "intelligence" of an LLM, and its answers are constrained to the usable statements found in the video transcripts.
My idea is to feed the best-matching responses to an LLM and ask it to arrange them into a complete answer. Is this the correct path? If so, how should I properly feed the data into it? Are there good and bad ways to do it? Or is there a completely different and better way?
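To make the idea concrete, here is a rough sketch of what I mean by feeding the best matches to an LLM. The `Chunk` fields and the `build_prompt` helper are hypothetical names made up for illustration; they do not come from the retrieval example or any library. The sketch only builds the prompt text, and the actual LLM call is left out.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One retrieved transcript chunk (hypothetical structure, for illustration only)."""
    video_id: str
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds
    text: str
    score: float    # retrieval similarity, higher is better


def build_prompt(question: str, chunks: list[Chunk], top_k: int = 5) -> str:
    """Format the top_k best-matching chunks as numbered segments for an LLM."""
    # Keep only the highest-scoring chunks, best first.
    best = sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]
    # Number each segment so the model can refer to it by index.
    segments = "\n".join(
        f"[{i}] ({c.video_id} {c.start_s:.0f}s-{c.end_s:.0f}s) {c.text}"
        for i, c in enumerate(best, start=1)
    )
    return (
        "Answer the question using only the numbered transcript segments.\n"
        "Cite the segment numbers you used, in the order they should be played.\n\n"
        f"Segments:\n{segments}\n\n"
        f"Question: {question}\n"
    )
```

If the model cites segment numbers in its answer, those numbers could be mapped back to the `video_id` and timestamps for playback, assuming the model follows the instruction.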