Structured Output
Enforce the Agent output based on the provided schema.
There are many use cases where we need Agents to understand natural language but produce output in a structured format.
One common use-case is extracting data from text to insert into a database or use with some other downstream system. This guide covers how Neuron allows you to enforce structured outputs from the agent.
The central concept is that the output structure of LLM responses needs to be represented in some way. The schema that Neuron validates against is defined by PHP type hints. Basically you have to define a class with strictly typed properties:
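A minimal sketch of such a class, assuming a hypothetical Person model with two string properties. The App\Agent\Models namespace follows the one used for the Tag class later in this guide; the property names are illustrative.

```php
<?php

namespace App\Agent\Models;

// Hypothetical output class: every property carries a strict PHP type hint,
// which is what the schema is derived from.
class Person
{
    public string $name;

    public string $preference;
}
```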
Neuron generates the corresponding JSON schema from the PHP class to instruct the underlying model about your required data format. The framework then parses the LLM output to extract the data and returns an object instance filled with the appropriate values.
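To make the idea of deriving a schema from type hints concrete, here is a simplified standalone illustration built on PHP reflection. It is not Neuron's implementation: sketchSchema and PersonSketch are hypothetical names, and only a few scalar types are mapped.

```php
<?php

// Hypothetical model used only for this illustration.
class PersonSketch
{
    public string $name;
    public string $preference;
    public int $age;
}

/**
 * Hypothetical helper (not part of Neuron): build a minimal
 * JSON-schema-like array from the public typed properties of a class.
 */
function sketchSchema(string $class): array
{
    // Map PHP type names to JSON schema type names.
    $map = [
        'string' => 'string',
        'int' => 'integer',
        'float' => 'number',
        'bool' => 'boolean',
        'array' => 'array',
    ];

    $properties = [];

    foreach ((new ReflectionClass($class))->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        $type = $property->getType();

        // Untyped and union-typed properties are skipped in this sketch.
        if (!$type instanceof ReflectionNamedType) {
            continue;
        }

        // Anything that is not a known scalar or array is treated as a nested object.
        $properties[$property->getName()] = ['type' => $map[$type->getName()] ?? 'object'];
    }

    return ['type' => 'object', 'properties' => $properties];
}

echo json_encode(sketchSchema(PersonSketch::class), JSON_PRETTY_PRINT), PHP_EOL;
```

The real schema also carries the descriptions and required flags covered below; this sketch only shows how type hints alone can already describe the shape of the data.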
You can also encapsulate the output format in the Agent implementation, so that it becomes the Agent's standard output format. You always need to call the structured() method to require strict output.
Neuron requires you to define rules for two layers to create the structured output class.
The first is the SchemaProperty attribute, which lets you control the JSON schema sent to the LLM so that it understands the required data format.
We strongly recommend using the SchemaProperty attribute to define at least the description, which helps the LLM understand the purpose of a property, and the required flag.
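A sketch of how the attribute might be applied. It assumes the attribute class lives at NeuronAI\StructuredOutput\SchemaProperty and accepts description and required as named arguments; check both against your installed Neuron version. The descriptions are illustrative.

```php
<?php

namespace App\Agent\Models;

// Assumed location of the attribute class.
use NeuronAI\StructuredOutput\SchemaProperty;

class Person
{
    // The description tells the LLM what the property is for;
    // the required flag marks it as mandatory in the schema.
    #[SchemaProperty(description: 'The user name.', required: true)]
    public string $name;

    #[SchemaProperty(description: 'What the user loves to eat.', required: false)]
    public string $preference;
}
```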
The Validation component already contains many validation rules that you can apply to the output class properties. The example below shows you how to mark the name property as required (NotBlank):
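A sketch of the name property marked with NotBlank. It assumes the rule is the Symfony Validator constraint (Symfony\Component\Validator\Constraints\NotBlank), which can be used as a PHP attribute; the SchemaProperty location is likewise an assumption.

```php
<?php

namespace App\Agent\Models;

// Assumed location of the schema attribute.
use NeuronAI\StructuredOutput\SchemaProperty;
// Assumption: the validation rule is the Symfony Validator constraint.
use Symfony\Component\Validator\Constraints\NotBlank;

class Person
{
    // NotBlank rejects a missing or empty name during validation.
    #[SchemaProperty(description: 'The user name.', required: true)]
    #[NotBlank]
    public string $name;

    #[SchemaProperty(description: 'What the user loves to eat.', required: false)]
    public string $preference;
}
```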
You can construct complex output structures by using another PHP object as a property type. Continuing the example of the Person class, we can add an address property whose type is another structured class.
In the Address definition we require only the street and zip code properties, and allow city to be empty.
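The nested structure described above could look like the following sketch. The attribute and constraint namespaces are assumptions, as are the exact property names (street, city, zip) and descriptions.

```php
<?php

namespace App\Agent\Models;

// Assumed locations of the attribute and of the validation rule.
use NeuronAI\StructuredOutput\SchemaProperty;
use Symfony\Component\Validator\Constraints\NotBlank;

class Address
{
    #[SchemaProperty(description: 'The street name and number.', required: true)]
    #[NotBlank]
    public string $street;

    // No NotBlank here: the city is allowed to be empty.
    #[SchemaProperty(description: 'The city name.', required: false)]
    public string $city;

    #[SchemaProperty(description: 'The zip code.', required: true)]
    #[NotBlank]
    public string $zip;
}

class Person
{
    #[SchemaProperty(description: 'The user name.', required: true)]
    #[NotBlank]
    public string $name;

    // Another structured class used as the property type.
    #[SchemaProperty(description: 'The user address.', required: true)]
    public Address $address;
}
```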
Now when you ask the agent for the structured output you will get the filled instance back:
If you declare a property as an array, Neuron assumes the items to be a list of strings. Assume we want to add a list of tags to the Person object:
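A sketch of the Person class with a plain array property. The SchemaProperty location and the description text are assumptions.

```php
<?php

namespace App\Agent\Models;

// Assumed location of the attribute class.
use NeuronAI\StructuredOutput\SchemaProperty;

class Person
{
    #[SchemaProperty(description: 'The user name.', required: true)]
    public string $name;

    // A bare array type hint, with no further information about its items.
    #[SchemaProperty(description: 'The list of tags for the user profile.', required: true)]
    public array $tags;
}
```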
In the example above we assume that the tags property is an array of strings by default.
You may need to populate the list of tags with a structured data type. To do this, specify the related class in the property doc-block:
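A sketch of the doc-block annotation, assuming the standard @var syntax with the fully qualified class name followed by []. The SchemaProperty location is an assumption.

```php
<?php

namespace App\Agent\Models;

// Assumed location of the attribute class.
use NeuronAI\StructuredOutput\SchemaProperty;

class Person
{
    #[SchemaProperty(description: 'The user name.', required: true)]
    public string $name;

    /**
     * @var \App\Agent\Models\Tag[]
     */
    #[SchemaProperty(description: 'The list of tags for the user profile.', required: true)]
    public array $tags;
}
```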
Notice how we use the absolute namespace (\App\Agent\Models\Tag) to inform Neuron about the class that should be instantiated.
And here is the hypothetical implementation of the Tag class with its own validation rules and property info:
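A sketch of what such a Tag class might contain, assuming a single name property. The attribute and constraint namespaces are assumptions.

```php
<?php

namespace App\Agent\Models;

// Assumed locations of the attribute and of the validation rule.
use NeuronAI\StructuredOutput\SchemaProperty;
use Symfony\Component\Validator\Constraints\NotBlank;

class Tag
{
    // Each tag in the list is validated on its own.
    #[SchemaProperty(description: 'The name of the tag.', required: true)]
    #[NotBlank]
    public string $name;
}
```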
Since LLMs are not perfectly deterministic, a retry mechanism is needed for the cases where something is missing from the LLM response.
By default Neuron extracts and validates the data from the LLM response. If there are one or more validation errors, it automatically retries the request once, informing the LLM about what went wrong and for which properties.
You can also customize the number of times the agent retries to get a correct answer from the LLM.
If you work with a less capable LLM, choose a number of retries that balances the probability of getting a valid answer against the potential token consumption.
You can disable retries by passing zero, which makes it a one-shot attempt.
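The retry behavior described above can be pictured with a small standalone sketch. It is not Neuron's implementation: askWithRetry, $ask, and $validate are hypothetical names, and a stub closure stands in for the LLM.

```php
<?php

/**
 * Hypothetical helper (not Neuron's code): ask, validate, and retry while
 * feeding the validation errors back into the next attempt.
 *
 * $ask receives the errors of the previous attempt and returns the extracted data.
 * $validate returns a list of error messages, empty when the data is valid.
 */
function askWithRetry(callable $ask, callable $validate, int $maxRetries = 1): array
{
    $errors = [];

    // One initial attempt plus up to $maxRetries further attempts.
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $data = $ask($errors);
        $errors = $validate($data);

        if ($errors === []) {
            return $data;
        }
    }

    throw new RuntimeException('Invalid structured output: ' . implode('; ', $errors));
}

// Stub "model": omits the name on the first call and fixes it once told about the error.
$calls = 0;
$ask = function (array $previousErrors) use (&$calls): array {
    $calls++;

    return $previousErrors === [] ? ['name' => ''] : ['name' => 'John'];
};

$validate = fn (array $data): array => $data['name'] === '' ? ['name must not be blank'] : [];

$result = askWithRetry($ask, $validate);

echo $result['name'], ' after ', $calls, ' calls', PHP_EOL; // prints: John after 2 calls
```

With a retry count of zero the loop body runs exactly once, so a first invalid answer immediately raises the exception. A higher count gives more chances to recover at the price of more requests.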
You can rely on the Inspector dashboard to get full visibility on the internal steps executed by the Agent to provide you with a structured output instance:
Each segment brings its own debug information so you can follow the agent execution in real time:
The second layer is validation, built on top of the Symfony Validator component. It ensures that the data gathered from the LLM response is consistent with your requirements.
You can find the available validation constraints on the official Symfony component page.
Learn how to enable Inspector monitoring in the next section.