I have a question about handling data that exceeds the context window limit. I once worked on an application that generates unit tests. As you know, writing a good unit test for a function requires access to the definitions of all the dependencies that function uses. While I, as a human, can browse the entire codebase, an LLM cannot, because the codebase doesn't fit in its context window.
If I don’t provide the definitions of all the dependencies, the LLM makes assumptions about their interfaces and behavior, which can result in incorrect mocks for the other classes or external tools the function under test relies on.
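To make the failure mode concrete, here is a minimal, self-contained sketch; every name in it (`PaymentGateway`, `ChargeResult`, `charge_customer`) is invented for illustration, not taken from my actual project. The generated test only comes out right if the model knows that `charge()` returns a result object with a `.status` attribute, and that knowledge lives in a file that may never have made it into the prompt:

```python
from unittest.mock import MagicMock, patch

# --- dependency, defined elsewhere in the codebase (often NOT in the prompt) ---
class ChargeResult:
    def __init__(self, status: str) -> None:
        self.status = status

class PaymentGateway:
    def charge(self, customer_id: str, amount_cents: int, currency: str) -> ChargeResult:
        # In the real codebase this would call an external payment service.
        return ChargeResult("ok")

# --- the function I want a unit test for (this part IS in the prompt) ---
def charge_customer(customer_id: str, amount_cents: int) -> bool:
    gateway = PaymentGateway()
    result = gateway.charge(customer_id, amount_cents, currency="USD")
    return result.status == "ok"

# Without seeing PaymentGateway's definition, a model will often guess that
# charge() returns a plain bool and write
#     mock_cls.return_value.charge.return_value = True
# which breaks, because the real code reads .status. The correct mock has
# to mirror the real return type:
def test_charge_customer() -> None:
    with patch(f"{__name__}.PaymentGateway") as mock_cls:
        mock_result = MagicMock()
        mock_result.status = "ok"
        mock_cls.return_value.charge.return_value = mock_result
        assert charge_customer("cust_42", 1999) is True

if __name__ == "__main__":
    test_charge_customer()
    print("test passed")
```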
Do you have any suggestions for addressing this? Summarization and advanced chunking won't work here, because the LLM needs the complete definition of every dependency to generate accurate unit tests.