I need to build a RAG-based document retrieval LLM application with the following features. I am having difficulty coding it because of the multiple dependencies and package versions involved, so if anyone has reference code that I can use, it would be very helpful.
Features:
- It must answer strictly from the documents and not from the internet.
- It must cite its sources, such as the document and page number, at the end of each response.
- It must be able to accept text, audio, and images as input.
- It should be very accurate in its responses, drawing its information from the document chunks.
- The document knowledge base should support PDF, audio, and video formats as well.
- This is the one I find very tricky: the bot must prompt the user to explore more topics based on their question, and ask clarifying questions if the query is very vague or hard to understand.
- Other features, such as chat history and search through past chats, must also be there.
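To make the requirements concrete, here is a rough, dependency-free Python sketch of the behaviour I have in mind for the grounding, source-citation, and probing features. It is only an illustration under assumptions: the `Chunk` class, the `retrieve` and `answer` functions, the bag-of-words cosine scoring, the `min_score` threshold, and the three-word vagueness check are all hypothetical stand-ins, not a real embedding model or LLM call.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    doc: str   # source document name (hypothetical field)
    page: int  # page number within that document
    text: str  # chunk text


def tokenize(text: str) -> list[str]:
    # Lowercase and split into alphanumeric word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 2, min_score: float = 0.25):
    # Score every chunk and keep the top k that clear the threshold.
    # A real system would use embeddings; word overlap is an assumption here.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c.text))), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return scored[:k]


def answer(query: str, chunks: list[Chunk]) -> str:
    # Probe the user when the query is too short to be meaningful.
    if len(tokenize(query)) < 3:
        return "Could you tell me more about what you want to know?"
    hits = retrieve(query, chunks)
    # Refuse rather than guess when nothing in the documents matches.
    if not hits:
        return "I could not find this in the provided documents."
    # In a real pipeline this context would go to an LLM with an instruction
    # to answer only from it; here the matching chunks are returned directly.
    context = "\n".join(c.text for _, c in hits)
    sources = "; ".join(f"{c.doc}, p. {c.page}" for _, c in hits)
    return f"{context}\n\nSources: {sources}"


if __name__ == "__main__":
    kb = [
        Chunk("manual.pdf", 3, "The pump must be serviced every six months."),
        Chunk("manual.pdf", 7, "Warranty covers parts for two years."),
    ]
    print(answer("How often must the pump be serviced?", kb))
    print(answer("pump?", kb))
    print(answer("Who won the football match yesterday?", kb))
```

The idea is that every chunk keeps its document name and page number from ingestion onward, so the citation can be built from whatever was actually retrieved, and that the bot refuses or asks a follow-up question instead of falling back on outside knowledge.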
I would appreciate any help!