Chat History & Memory
Learn how Neuron AI manages multi-turn conversations.
Neuron AI has a built-in system to manage the memory of the chat sessions you run with your agents.
In many Q&A applications you can have a back-and-forth conversation with the LLM, meaning the application needs some sort of "memory" of past questions and answers, and some logic for incorporating those into its current thinking.
For example, if you ask a follow-up question like "Can you elaborate on the second point?", it cannot be understood without the context of the previous messages. Therefore the agent can't effectively handle a question like this on its own.
In the example below you can see how this works: without any previous messages the LLM doesn't know our name and has no context about us, but once we introduce ourselves in the first message and then ask for our name in a follow-up, the agent can answer.
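Here is a minimal sketch of that conversation. MyAgent is a hypothetical agent class defined in your application, and the UserMessage class, the chat() method, and getContent() are assumptions about the Neuron AI API, so double-check them against the repository.

```php
<?php

use NeuronAI\Chat\Messages\UserMessage;

// MyAgent is a hypothetical agent class defined in your application.
$agent = MyAgent::make();

// First turn: we introduce ourselves.
$response = $agent->chat(new UserMessage("Hi, my name is Valerio."));
echo $response->getContent();

// Follow-up turn: thanks to the chat history the previous messages are
// still in context, so the agent can answer with our name.
$response = $agent->chat(new UserMessage("Do you remember my name?"));
echo $response->getContent();
```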
How Chat History works
Neuron Agents collect the list of messages exchanged between your application and the LLM into an object called Chat History. It's a crucial part of the framework, because the chat history needs to be managed based on the context window of the underlying LLM.
It's important to send past messages back to the LLM to keep the context of the conversation, but if the list of messages grows enough to exceed the context window of the model, the request will be rejected by the AI provider.
The chat history automatically truncates the list of messages so it never exceeds the context window, avoiding unexpected errors.
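To make the idea concrete, here is a purely conceptual sketch of message truncation. This is not Neuron AI's internal code, just an illustration of dropping the oldest messages until the conversation fits a given context window, using a naive character-based token estimate.

```php
<?php

// Conceptual sketch only: not Neuron AI's internal implementation.
// Drops the oldest messages until the estimated size fits the context window.
function truncateMessages(array $messages, int $contextWindow): array
{
    // Very rough token estimate: ~4 characters per token.
    $estimateTokens = fn (string $text): int => (int) ceil(strlen($text) / 4);

    $total = array_sum(array_map($estimateTokens, $messages));

    while ($total > $contextWindow && count($messages) > 1) {
        $oldest = array_shift($messages);
        $total -= $estimateTokens($oldest);
    }

    return $messages;
}

// Example usage with plain strings in place of real message objects.
$trimmed = truncateMessages(['first question', 'first answer', 'follow-up question'], 10);
```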
How to register a chat history
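Assuming your agent class extends the framework's Agent, you can return the chat history you want to use from a dedicated method. The chatHistory() method name, the ChatHistoryInterface return type, and the contextWindow constructor argument in the sketch below are assumptions based on the class names mentioned on this page, so verify them against the repository.

```php
<?php

use NeuronAI\Agent;
use NeuronAI\Chat\History\ChatHistoryInterface;
use NeuronAI\Chat\History\InMemoryChatHistory;

class MyAgent extends Agent
{
    // provider() and the rest of the agent configuration omitted for brevity.

    protected function chatHistory(): ChatHistoryInterface
    {
        // The context window (in tokens) drives the automatic truncation
        // described in the previous section.
        return new InMemoryChatHistory(contextWindow: 50000);
    }
}
```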
InMemoryChatHistory simply stores the list of messages in an array, so it is kept in memory only during the current execution. If you need to persist the chat history to resume the conversation later, you need to build a new chat history implementation.
We plan to develop file storage as well, but feel free to send us your proposals via Pull Request on the GitHub repository.
How to create a new chat history
To create a new implementation of the chat history you have to extend the AbstractChatHistory class:
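A custom implementation could look like the sketch below. The storage logic and the method names are illustrative placeholders: the exact abstract methods you have to implement are defined by AbstractChatHistory in the repository.

```php
<?php

use NeuronAI\Chat\History\AbstractChatHistory;
use NeuronAI\Chat\Messages\Message;

class DatabaseChatHistory extends AbstractChatHistory
{
    // Illustrative placeholders: the real abstract methods to implement
    // are declared by AbstractChatHistory.

    public function addMessage(Message $message): self
    {
        // Persist the new message in your storage of choice (database, cache, etc.).
        return $this;
    }

    public function getMessages(): array
    {
        // Load the stored messages for the current conversation.
        return [];
    }

    public function clear(): self
    {
        // Delete all the stored messages to start a fresh conversation.
        return $this;
    }
}
```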
The abstract class already implements the calculateTotalUsage() method, which automatically calculates the total usage based on the usage data collected from the AI provider responses.
We strongly suggest looking at the InMemoryChatHistory implementation to replicate the same behavior.