Managing Large Language Model Outputs with Parsers

This article covers the different types of output parsers used with LLMs and how to troubleshoot parsing errors
natural-language-processing
deep-learning
langchain
activeloop
openai
prompt-engineering
Author

Pranath Fernando

Published

August 4, 2023

1 Introduction

In a production setting, a predictable data structure is always desirable, yet language models can only produce text. Imagine, for instance, that you are developing a thesaurus application and want to provide a list of potential synonyms based on the context. LLMs are capable of producing many suggestions quickly.

The problem is the lack of a dynamic mechanism for extracting the relevant data from the returned string. You might argue that we could skip the first two lines and split the response on newlines. However, there is no assurance that the response will always have the same format. An introductory line may or may not precede the list.
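To see why splitting on newlines is brittle, consider the following minimal sketch. Both response strings are invented for illustration and are not real model output, and naive_extract is a hypothetical helper.

```python
from typing import List

# Two hypothetical responses to the same request; the text is made up
# for illustration and is not real model output.
response_with_intro = (
    "Here are some words you could use instead:\n"
    "\n"
    "1. conduct\n"
    "2. manner\n"
    "3. demeanor"
)
response_without_intro = "1. conduct\n2. manner\n3. demeanor"


def naive_extract(response: str) -> List[str]:
    """Skip the first two lines, then treat every remaining line as one item."""
    return response.split("\n")[2:]


with_intro = naive_extract(response_with_intro)
without_intro = naive_extract(response_without_intro)

print(with_intro)     # all three items survive
print(without_intro)  # the first two items are silently dropped
```

The same function gives a correct list for one response and loses data for the other, without raising any error.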

Output parsers help define a data structure that specifies exactly what to expect from the output. For the word-suggestion application, we can request a list of words, or a combination of several fields such as a word and an explanation of why it fits. The parser then extracts the expected data for you.

This article covers the different types of output parsers and how to troubleshoot parsing errors.

2 Import Libs & Setup

!echo "OPENAI_API_KEY='<API_KEY>'" > .env
from dotenv import load_dotenv

load_dotenv()
True

3 Output Parsers

This section introduces three classes. Although the Pydantic parser is the most powerful and flexible wrapper, it is still useful to know the alternatives for simpler problems. To better understand the specifics of each approach, we will implement the thesaurus application in each subsection.

3.1 PydanticOutputParser

This class instructs the model to generate its output in JSON format and then extracts the information from the response. The parser returns an instance of the schema you define, so a list field can be indexed directly without worrying about formatting.

It is important to note that not all models have the same capability in terms of generating JSON outputs. So, it would be best to use a more powerful model (like OpenAI’s DaVinci instead of Curie) to get the most satisfactory result.

This class relies on the Pydantic library, which helps define and validate data structures in Python. It lets us give each expected output a name, a type, and a description. For the thesaurus, we need a variable that can hold multiple suggestions, which is easy to do by creating a class that inherits from Pydantic’s BaseModel class. Install the necessary packages with the following command: pip install langchain==0.0.208 deeplake openai tiktoken

from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List

# Define your desired data structure.
class Suggestions(BaseModel):
    words: List[str] = Field(description="list of substitute words based on context")

    # Throw error in case of receiving a numbered-list from API
    @validator('words')
    def not_start_with_number(cls, field):
        for item in field:
            if item[0].isnumeric():
                raise ValueError("The word can not start with numbers!")
        return field

parser = PydanticOutputParser(pydantic_object=Suggestions)

We start by importing the required libraries and then create the Suggestions schema class. This class has two essential parts:

  1. Expected Outputs: Each output is defined by declaring a variable with the desired type, such as a list of strings (: List[str]) in the sample code, or a single string (: str) if you expect just one word or sentence in response. You must also write a short explanation in the description attribute of the Field function to guide the model during inference. (We will see an example with multiple outputs later in the lesson.)
  2. Validators: You can declare functions that validate the formatting. In the sample code, we make sure that the first character is not a number. The function’s name is unimportant, but the @validator decorator must receive the name of the variable you want to validate, as in @validator('words'). Note that the field argument inside the validator function will be a list if you declared the variable as one.

We will pass the created class to the PydanticOutputParser wrapper to make it a LangChain parser object. The next step is to prepare the prompt.
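Conceptually, the parser has two jobs: decode the JSON in the model’s reply, then check the decoded data against the schema. The sketch below imitates that flow using only the standard library. It is not how PydanticOutputParser is implemented, and SuggestionsSketch and parse_suggestions are hypothetical names used only here.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SuggestionsSketch:
    """Standard-library stand-in for the Pydantic Suggestions model."""
    words: List[str]

    def __post_init__(self):
        # Same check as the not_start_with_number validator,
        # plus a guard so that empty strings do not raise IndexError.
        for item in self.words:
            if item and item[0].isnumeric():
                raise ValueError("The word can not start with numbers!")


def parse_suggestions(text: str) -> SuggestionsSketch:
    """Decode a JSON reply and validate it; raise ValueError on any problem."""
    data = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or "words" not in data:
        raise ValueError("missing required key: words")
    return SuggestionsSketch(words=list(data["words"]))


good = parse_suggestions('{"words": ["conduct", "manner"]}')
print(good.words[0])

try:
    parse_suggestions('{"words": ["1. conduct", "2. manner"]}')
except ValueError as err:
    print("rejected:", err)
```

The numbered reply is rejected at parse time, which is the behaviour the validator in the Suggestions class is meant to provide.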

from langchain.prompts import PromptTemplate

template = """
Offer a list of suggestions to substitute the specified target_word based on the presented context.
{format_instructions}
target_word={target_word}
context={context}
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["target_word", "context"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

model_input = prompt.format_prompt(
            target_word="behaviour",
            context="The behaviour of the students in the classroom was disruptive and made it difficult for the teacher to conduct the lesson."
)

The template variable is a string that can contain named placeholders written in the {variable_name} format, as covered in earlier articles. The template describes what we expect from the model, including the parser’s formatting instructions and the inputs. The PromptTemplate receives the template string along with the type of each placeholder: either partial_variables, which are initialised immediately, or input_variables, whose values are supplied later through the .format_prompt() method.
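The difference between the two kinds of placeholder can be sketched with plain Python string formatting. This is only an analogy for what PromptTemplate does, not its implementation, and the format_instructions text is a made-up stand-in for the value returned by parser.get_format_instructions().

```python
from functools import partial

template = (
    "Offer a list of suggestions to substitute the specified target_word "
    "based on the presented context.\n"
    "{format_instructions}\n"
    "target_word={target_word}\n"
    "context={context}\n"
)

# Placeholder text standing in for parser.get_format_instructions().
format_instructions = "Return a JSON object with a single key named words."

# Fix the partial variable now...
fill_prompt = partial(template.format, format_instructions=format_instructions)

# ...and supply the input variables later, once the request is known.
prompt_text = fill_prompt(
    target_word="behaviour",
    context="The behaviour of the students in the classroom was disruptive.",
)
print(prompt_text)
```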

We can send the prompt to models like GPT through LangChain’s OpenAI wrapper. (Remember to set the OPENAI_API_KEY environment variable to your OpenAI API key.) To obtain the best results, we use the Davinci model, one of the most capable options, and set the temperature to 0, which makes the results largely repeatable.

The temperature is usually set between 0 and 1 (the OpenAI API accepts values up to 2), with higher values producing more varied, creative output. For projects that call for creative output, a larger value can be a sensible choice in production.
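For intuition about the temperature, here is a minimal sketch of the usual textbook formulation, in which the model’s raw scores are divided by the temperature before being turned into probabilities. The scores are made up, and treating a temperature of 0 as always picking the top score is an assumption of this sketch, not a statement about how any particular API handles it.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy decoding, all probability on the top score.
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
greedy = softmax_with_temperature(logits, 0.0)
cool = softmax_with_temperature(logits, 0.5)
warm = softmax_with_temperature(logits, 1.0)
print(greedy, cool, warm)
```

The lower the temperature, the more probability mass sits on the top candidate, which is why low values give more repeatable output.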

from langchain.llms import OpenAI

# Before executing the following code, make sure to have
# your OpenAI key saved in the “OPENAI_API_KEY” environment variable.
model = OpenAI(model_name='text-davinci-003', temperature=0.0)

output = model(model_input.to_string())

parser.parse(output)

The parser object’s parse() method converts the model’s string response into the format we specified. The result is a Suggestions object whose words field is a list you can index and use in your applications.

Multiple Outputs Example

Here is sample Python code that handles multiple outputs. It asks the model for a list of words along with the reasoning behind each suggestion.

To run this example, replace the template variable and the Suggestions class with the following code. The modified template asks the model to explain its reasoning, and the Suggestions class declares a new output called reasons. This also shows that a validator can modify the output rather than only check it: here it ensures that each reason ends with a full stop.

template = """
Offer a list of suggestions to substitute the specified target_word based on the presented context and the reasoning for each word.
{format_instructions}
target_word={target_word}
context={context}
"""
class Suggestions(BaseModel):
    words: List[str] = Field(description="list of substitute words based on context")
    reasons: List[str] = Field(description="the reasoning of why this word fits the context")

    # Reject numbered-list items such as "1. conduct"
    @validator('words')
    def not_start_with_number(cls, field):
        for item in field:
            if item[0].isnumeric():
                raise ValueError("The word can not start with numbers!")
        return field

    # Append a full stop to any reason that lacks one
    @validator('reasons')
    def end_with_dot(cls, field):
        for idx, item in enumerate(field):
            if not item.endswith("."):
                field[idx] += "."
        return field

parser = PydanticOutputParser(pydantic_object=Suggestions)
prompt = PromptTemplate(
    template=template,
    input_variables=["target_word", "context"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

model_input = prompt.format_prompt(target_word="behaviour", context="The behaviour of the students in the classroom was disruptive and made it difficult for the teacher to conduct the lesson.")
output = model(model_input.to_string())
parser.parse(output)
Suggestions(words=['conduct', 'manner', 'demeanor', 'comportment'], reasons=['refers to the way someone acts in a particular situation.', 'refers to the way someone behaves in a particular situation.', 'refers to the way someone behaves in a particular situation.', 'refers to the way someone behaves in a particular situation.'])

3.2 CommaSeparatedListOutputParser

As the name suggests, this class handles comma-separated outputs. It covers one specific case: whenever you want the model to return a list of items. The first step is to import the relevant modules.

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.output_parsers import CommaSeparatedListOutputParser

This parser requires no setup step, which also makes it less flexible. We create the object by simply instantiating the class. The rest of the process (writing the prompt, initialising the model, and parsing the output) is the same as before.

parser = CommaSeparatedListOutputParser()
template = """
Offer a list of suggestions to substitute the word '{target_word}' based on the following text: {context}.
{format_instructions}
"""
prompt = PromptTemplate(
    template=template,
    input_variables=["target_word", "context"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

model_input = prompt.format(
  target_word="behaviour",
  context="The behaviour of the students in the classroom was disruptive and made it difficult for the teacher to conduct the lesson."
)
model_name = 'text-davinci-003'
temperature = 0.0

model = OpenAI(model_name=model_name, temperature=temperature)
output = model(model_input)
parser.parse(output)
['Conduct',
 'Actions',
 'Demeanor',
 'Mannerisms',
 'Attitude',
 'Performance',
 'Reactions',
 'Interactions',
 'Habits',
 'Repertoire',
 'Disposition',
 'Bearing',
 'Posture',
 'Deportment',
 'Comportment']

Most of this sample code was covered in the preceding subsection, but two parts deserve attention. First, we tried a different format for the prompt template to demonstrate another way of writing a prompt. Second, the model’s input is produced with .format() rather than .format_prompt(). Since the result is already a string, we no longer need to call the .to_string() method.

As you can see, the result is a plain list of words, with some overlap with the PydanticOutputParser results and more variety. However, it is not possible to request the accompanying reasoning with the CommaSeparatedListOutputParser class.
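The idea behind a comma-separated parser can be sketched in a few lines. The function below is a hypothetical stand-in written for this article, not LangChain’s implementation, and the reply strings are made up.

```python
from typing import List


def parse_comma_separated(text: str) -> List[str]:
    """Split on commas and trim whitespace, skipping empty pieces."""
    return [piece.strip() for piece in text.split(",") if piece.strip()]


# Made-up reply in the format the parser asks the model for.
reply = "Conduct, Actions, Demeanor, Mannerisms"
items = parse_comma_separated(reply)
print(items)

# There is no schema, so a chatty reply is split just the same.
chatty = parse_comma_separated("Sure! Here you go: Conduct, Actions")
print(chatty)
```

Because nothing is validated, an introductory phrase ends up inside the first item, which illustrates the trade-off between simplicity and robustness.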

3.3 StructuredOutputParser

This is the first output parser the LangChain team implemented. Although it can handle multiple outputs, it only supports text and offers no options for other data types such as lists or integers. It is useful when you want a single response from the model. For instance, the thesaurus application would return only one alternative word.

from langchain.output_parsers import StructuredOutputParser, ResponseSchema

response_schemas = [
    ResponseSchema(name="words", description="A substitute word based on context"),
    ResponseSchema(name="reasons", description="the reasoning of why this word fits the context.")
]

parser = StructuredOutputParser.from_response_schemas(response_schemas)

The code above shows how to define a schema, but we won’t go into the details here. The PydanticOutputParser class offers validation and more flexibility for complex tasks, and the CommaSeparatedListOutputParser covers simpler applications, so this class has no particular advantage over them.

4 Fixing Errors

Parsers are effective tools for dynamically extracting data from the model’s response and partially validating it. They cannot, however, guarantee a usable response. Imagine that after you have launched your application, the parser throws an error because the model’s answer to a user’s request is incomplete. That is far from ideal. The following subsections present two fail-safe classes that add a layer on top of the model’s response to help correct such errors.

The following approaches work with the PydanticOutputParser class, since it is the only one with a validation method.

4.1 OutputFixingParser

This approach tries to correct a parsing error by looking at the model’s response and the original parser. It uses a Large Language Model (LLM) to solve the problem. To stay consistent with the rest of the tutorial, we’ll use GPT-3, although you can pass any supported model. Let’s begin by defining the Pydantic data schema and then demonstrate a typical error.

from langchain.llms import OpenAI
from langchain.output_parsers import PydanticOutputParser
from langchain.output_parsers import OutputFixingParser
from pydantic import BaseModel, Field
from typing import List
model_name = 'text-davinci-003'
temperature = 0.0
model = OpenAI(model_name=model_name, temperature=temperature)
# Define your desired data structure.
class Suggestions(BaseModel):
    words: List[str] = Field(description="list of substitute words based on context")
    reasons: List[str] = Field(description="the reasoning of why this word fits the context")

parser = PydanticOutputParser(pydantic_object=Suggestions)

Example can fix

missformatted_output = '{"words": ["conduct", "manner"], "reasoning": ["refers to the way someone acts in a particular situation.", "refers to the way someone behaves in a particular situation."]}'
parser.parse(missformatted_output)
OutputParserException: ignored

The parser correctly flagged an error in our sample response (missformatted_output): we used the key reasoning instead of the expected reasons. The OutputFixingParser class can correct this kind of problem easily.

outputfixing_parser = OutputFixingParser.from_llm(parser=parser, llm=model)
outputfixing_parser.parse(missformatted_output)
Suggestions(words=['conduct', 'manner'], reasons=['refers to the way someone acts in a particular situation.', 'refers to the way someone behaves in a particular situation.'])

The from_llm() method takes the old parser and a language model as input arguments. It then creates a new parser that can correct output errors. In this instance, it recognised the incorrect key and changed it to the one we defined.

Example can NOT fix

However, this class cannot always resolve the problem. Here is an example in which the OutputFixingParser class tries to fix a missing-key error.

missformatted_output = '{"words": ["conduct", "manner"]}'
parser.parse(missformatted_output)
OutputParserException: ignored
outputfixing_parser = OutputFixingParser.from_llm(parser=parser, llm=model)
outputfixing_parser.parse(missformatted_output)
Suggestions(words=['conduct', 'manner'], reasons=["The word 'conduct' implies a certain behavior or action, while 'manner' implies a polite or respectful way of behaving."])

The output shows that the model understood which key was missing from the response but lacked the context of what we wanted. We expect one reason per word, yet it produced a list with a single entry. This is why we sometimes need a retry parser that also sees the original prompt.
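One way to catch this kind of mismatch before it reaches the application is to check that both lists have the same length. The sketch below does so with a plain function. In the article’s setup the same check could live in a Pydantic validator, but that wiring is left out here, and check_one_reason_per_word is a hypothetical helper.

```python
from typing import List


def check_one_reason_per_word(words: List[str], reasons: List[str]) -> None:
    """Raise ValueError unless every word has exactly one matching reason."""
    if len(words) != len(reasons):
        raise ValueError(
            f"expected {len(words)} reasons, one per word, got {len(reasons)}"
        )


# Values copied from the OutputFixingParser result above.
words = ["conduct", "manner"]
reasons = [
    "The word 'conduct' implies a certain behavior or action, while "
    "'manner' implies a polite or respectful way of behaving."
]

try:
    check_one_reason_per_word(words, reasons)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

print(mismatch)
```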

4.2 RetryOutputParser

As the previous section showed, the parser sometimes needs access to both the output and the prompt to take the full context into account. We first need to define these variables. The following code initialises the LLM, the parser, and the prompt, all of which were described in more detail earlier.

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.output_parsers import PydanticOutputParser
from langchain.output_parsers import RetryWithErrorOutputParser
from pydantic import BaseModel, Field, validator
from typing import List
model_name = 'text-davinci-003'
temperature = 0.0
model = OpenAI(model_name=model_name, temperature=temperature)
# Define your desired data structure.
class Suggestions(BaseModel):
    words: List[str] = Field(description="list of substitute words based on context")
    reasons: List[str] = Field(description="the reasoning of why this word fits the context")

parser = PydanticOutputParser(pydantic_object=Suggestions)
template = """
Offer a list of suggestions to substitute the specified target_word based on the presented context and the reasoning for each word.
{format_instructions}
target_word={target_word}
context={context}
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["target_word", "context"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

model_input = prompt.format_prompt(target_word="behaviour", context="The behaviour of the students in the classroom was disruptive and made it difficult for the teacher to conduct the lesson.")
missformatted_output = '{"words": ["conduct", "manner"]}'
parser.parse(missformatted_output)
OutputParserException: ignored

With the RetryWithErrorOutputParser class, we can now fix the same missformatted_output. As in the previous section, it takes the old parser and a model to create the new parser object. The parse_with_prompt() method, which requires both the output and the prompt, is responsible for fixing the parsing problem.

retry_parser = RetryWithErrorOutputParser.from_llm(parser=parser, llm=model)
retry_parser.parse_with_prompt(missformatted_output, model_input)
Suggestions(words=['conduct', 'manner'], reasons=["The behaviour of the students in the classroom was disruptive and made it difficult for the teacher to conduct the lesson, so 'conduct' is a suitable substitute.", "The students' behaviour was inappropriate, so 'manner' is a suitable substitute."])

The results demonstrate that the RetryWithErrorOutputParser can resolve problems that the OutputFixingParser could not. Because the retry parser also passed along the original prompt, the model generated one rationale for each word.

In production, the best way to use these techniques is to catch the parsing error with a try: ... except: ... block. We first attempt to parse with the ordinary parser and only call the classes above, inside the except section, when that fails. This limits the number of API calls and avoids the extra cost that comes with them.
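The pattern can be sketched as follows, using only the standard library so that it runs without an API key. Here json.loads stands in for the ordinary parser and fake_fixer is a hypothetical stand-in for an LLM-backed fixing parser. In a LangChain application the exception to catch would be OutputParserException; ValueError is used here only because that is what json.loads raises.

```python
import json
from typing import Any, Callable


def parse_with_fallback(
    text: str,
    primary: Callable[[str], Any],
    fallback: Callable[[str], Any],
) -> Any:
    """Try the cheap parser first; call the costly fallback only on failure."""
    try:
        return primary(text)
    except ValueError:
        # In the article's setup this is where outputfixing_parser.parse or
        # retry_parser.parse_with_prompt would run, at the cost of an API call.
        return fallback(text)


fallback_calls = []


def fake_fixer(text: str) -> dict:
    """Hypothetical stand-in for an LLM-backed fixing parser."""
    fallback_calls.append(text)
    return {"words": [], "reasons": []}


ok = parse_with_fallback('{"words": ["conduct"]}', json.loads, fake_fixer)
fixed = parse_with_fallback("not valid json", json.loads, fake_fixer)
print(ok, fixed, len(fallback_calls))
```

The fallback runs only for the second input, so well-formed responses never trigger an extra model call.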

5 Conclusion

We learned how to validate the language models’ responses, which are always strings, and extract information from them in an easy-to-use format. We also reviewed LangChain’s fail-safe procedures for making the output more consistent. Combining these approaches will help us write more reliable applications in production environments. In the next lesson, we will learn how to build a knowledge graph and capture useful information or entities from texts.

6 Acknowledgements

I’d like to express my thanks to the wonderful LangChain & Vector Databases in Production Course by Activeloop, which I completed, and acknowledge the use of some images and other materials from the course in this article.
