Log Analysis Using LLMs · AI Developer Accelerator

Request for help building a chatbot for log analysis using LLMs

The motivation is as follows:

Many industrial applications rely on various systems (devices), each of which generates logs. Engineers analyse these logs to study how the devices function, how they perform, and what maintenance they need. This exercise is resource intensive, but it is unavoidable: the analysis involves deep reasoning, depends on industry-specific knowledge and domain expertise, and is currently done by humans.

For standard devices like our computers, we have system logs, event logs and application logs. Because these logs follow a standard format, there are many commercial and open source applications for analysing them.

However, for proprietary systems like the one I am building for, the logs follow a custom format, and these formats may have custom entities and descriptions. Of course, there is a document that captures the business logic and knowledge embedded in the logs and explains these custom logs in detail (event codes, alarms, descriptions, etc.). This, coupled with the "How to perform log analysis" document and the product design document, helps explain why a device behaved in a certain way.

My main goal is to provide a chatbot application that can use an LLM to extract key insights from log files. Essentially, it should provide a simple user interface where upper management (non-technical staff) can ask questions and get a response, for example:

When was the last time Device A generated the following events/alarms?
In the last 30 days, how many times did Device B log the following error: "<error description>"?
When the <error> was logged by Device A, was Device B in working condition?
Why did Device C generate the alarm at <time> on <date>?
Generate an error report for Device D in tabular form, with the date, time and event, along with its cause and its consequences for other devices.
Generate a summary report for the entire system for the month of <month>.

Possible solution I have in mind:

1. Design a suitable schema for the database. Use meaningful, full table and column names.
2. Stream all log data to the database after cleaning, preprocessing, etc.
3. Build a vector DB from the business logic, knowledge base, product design and log format documents.
4. Build the following pipeline:

a. SQL agent: take the user question > convert it to SQL > get the response from the SQL DB

b. RAG agent: take the same user question > send it to the vector DB > get the context from the vector DB

c. Combine both (SQL agent response + RAG agent response)

d. Feed the combined result to an LLM (a distilled model that has been trained on the domain-specific knowledge base) to generate a final answer.
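
To make steps 1, 2 and 4a more concrete, here is a minimal sketch of the SQL side, assuming SQLite purely for illustration. The table name `device_event_log`, its columns, the sample rows and the helper functions are all hypothetical; the real schema would follow the custom log format. It shows the kind of query the SQL agent would need to produce for "In the last 30 days, how many times did Device B log the following error?".

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative.
SCHEMA = """
CREATE TABLE device_event_log (
    event_id          INTEGER PRIMARY KEY,
    device_name       TEXT NOT NULL,
    event_timestamp   TEXT NOT NULL,  -- 'YYYY-MM-DD HH:MM:SS', same format as SQLite datetime()
    event_code        TEXT NOT NULL,
    event_severity    TEXT NOT NULL,  -- e.g. INFO, ALARM, ERROR
    event_description TEXT NOT NULL
);
"""

# Made-up rows standing in for cleaned, preprocessed log lines.
SAMPLE_ROWS = [
    ("Device B", "2024-04-01 12:00:00", "E042", "ERROR", "Pump pressure low"),
    ("Device B", "2024-05-20 08:00:00", "E042", "ERROR", "Pump pressure low"),
    ("Device B", "2024-05-28 09:30:00", "E042", "ERROR", "Pump pressure low"),
    ("Device A", "2024-05-21 10:00:00", "E042", "ERROR", "Pump pressure low"),
]


def build_demo_db():
    """Create an in-memory database and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO device_event_log "
        "(device_name, event_timestamp, event_code, event_severity, event_description) "
        "VALUES (?, ?, ?, ?, ?)",
        SAMPLE_ROWS,
    )
    return conn


def count_errors(conn, device_name, description, as_of, days=30):
    """Count matching events for one device in the `days` before `as_of`."""
    query = """
        SELECT COUNT(*) FROM device_event_log
        WHERE device_name = ?
          AND event_description = ?
          AND event_timestamp > datetime(?, ?)
          AND event_timestamp <= ?
    """
    modifier = f"-{days} days"
    row = conn.execute(
        query, (device_name, description, as_of, modifier, as_of)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    print(count_errors(conn, "Device B", "Pump pressure low", "2024-06-01 00:00:00"))
```

Passing the reference date in as `as_of` instead of relying on the current time keeps the "last 30 days" window reproducible, which should make it easier to check whatever SQL the agent generates against known data.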

So my question is about step 4: is it possible to combine the RAG and SQL responses and feed them to a distilled LLM to generate a final response?
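
As a rough illustration of what I mean by combining the two responses, here is a minimal sketch. It is only an assumption about how steps 4c and 4d could look: `retrieve_context` is a toy word-overlap ranker standing in for a real vector DB lookup, and `run_sql` and `call_llm` are hypothetical callables for the SQL agent and the distilled model, so no real LLM or vector DB API is used here.

```python
import re
from dataclasses import dataclass


@dataclass
class KnowledgeChunk:
    source: str  # hypothetical document name, e.g. "alarm_codes"
    text: str


def _words(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve_context(question, chunks, top_k=2):
    """Toy stand-in for a vector DB query: rank chunks by shared words."""
    question_words = _words(question)
    scored = [(len(question_words & _words(chunk.text)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_final_prompt(question, sql_rows, context_chunks):
    """Step 4c: merge the SQL rows and the retrieved context into one prompt."""
    facts = "\n".join(f"- {row}" for row in sql_rows) or "- (no rows returned)"
    notes = (
        "\n".join(f"- [{c.source}] {c.text}" for c in context_chunks)
        or "- (no context found)"
    )
    return (
        "Answer the question using only the log facts and reference notes below.\n\n"
        f"Question: {question}\n\n"
        f"Log facts (from the SQL agent):\n{facts}\n\n"
        f"Reference notes (from the knowledge base):\n{notes}\n"
    )


def answer_question(question, run_sql, chunks, call_llm):
    """Steps 4a to 4d: query both sources, combine, then ask the final LLM."""
    sql_rows = run_sql(question)                         # 4a: SQL agent (hypothetical callable)
    context_chunks = retrieve_context(question, chunks)  # 4b: RAG lookup (toy version)
    prompt = build_final_prompt(question, sql_rows, context_chunks)  # 4c
    return call_llm(prompt)                              # 4d: final model (hypothetical callable)
```

Keeping the log facts and the reference notes in separately labelled sections of the prompt is one possible design choice; the idea is that the final model can tell what actually happened (SQL) apart from the domain explanation (knowledge base).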

Can you point me to some online resources where responses from multiple agents (SQL and RAG) have been combined and fed to a main LLM to produce the answer?

Thank you for sparing a few minutes to read this lengthy post.
