OpenAI Function Calling In LangChain

This article focuses on the integration of OpenAI functions with LangChain's expression language and how this makes applications quicker to produce. We will also delve into the utility of Pydantic, a Python library that simplifies the construction of OpenAI functions.
natural-language-processing
langchain
openai
Author

Pranath Fernando

Published

November 5, 2023

1 Introduction

In our ongoing exploration of artificial intelligence tools, this article synthesizes insights from our previous articles, focusing on the integration of OpenAI functions with LangChain's expression language. We will also delve into the utility of Pydantic, a Python library that simplifies the construction of OpenAI functions.

2 Understanding Pydantic

What is Pydantic?

Pydantic is a robust data validation library for Python that builds on Python's data classes. It lets us define data structures with strict type enforcement and validation, giving us an efficient way to maintain data integrity. Pydantic is also especially useful for exporting those structures to JSON, which is exactly what we need when constructing OpenAI function descriptions: we get a concise way to define a data structure, a guarantee that the data adheres to the specified types and constraints, and an easy path to a JSON schema.

That will be useful because we can use Pydantic objects to generate OpenAI function descriptions. Remember how those function descriptions were a large chunk of JSON with several nested fields? We can use Pydantic to avoid having to write all of that by hand. We'll accomplish this by creating a Pydantic class.
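
For reference, here is roughly the kind of hand-written definition Pydantic saves us from writing. This is only an illustrative sketch of the format the OpenAI chat completions API expects; the function name and description text here are made up for the example.

# A hand-written OpenAI function definition for a weather lookup.
# Every name, description, and parameter must be spelled out in this nested structure.
weather_function_manual = {
    "name": "get_current_weather",
    "description": "Get the current weather for a given airport",
    "parameters": {
        "type": "object",
        "properties": {
            "airport_code": {
                "type": "string",
                "description": "airport code to get weather for",
            },
        },
        "required": ["airport_code"],
    },
}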

Implementing Pydantic Classes

Implementing Pydantic involves defining classes with typed attributes instead of the traditional __init__ method. These classes serve as templates for generating JSON schemas for OpenAI functions, bypassing the intricacies of manual JSON crafting.

A Pydantic class looks fairly similar to a standard Python class, and you can compare the two below. The primary difference is that instead of an __init__ method, we simply declare the attributes and their types in the class body. We're not going to do anything else with these classes; we'll just use them to generate the JSON for the OpenAI function definition.

3 Practical Application of Pydantic

Setting Up the Environment

We begin by setting up our working environment, importing the necessary Pydantic classes, and preparing a standard Python class for comparison.

import os
import warnings
from typing import List

import openai
from dotenv import load_dotenv, find_dotenv
from pydantic import BaseModel, Field

_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']

warnings.filterwarnings('ignore')

4 Pydantic Syntax

Python vs. Pydantic Classes

A conventional Python class, equipped with an __init__ method, allows us to instantiate objects with attributes like name, age, and email. However, this approach lacks built-in validation. A Pydantic class, by contrast, automatically validates the input data, raising errors when invalid data is supplied and thereby ensuring data integrity.

We'll begin with the most basic Python class: a standard User class whose __init__ method takes a name, age, and email. If we create an instance, we get a normal Python object and can access its attributes as expected. However, if we pass an invalid value for age, such as the string "bar", the __init__ method accepts it without complaint, and foo.age simply returns "bar". That's not good, because we want some validation of that data.

class User:
    def __init__(self, name: str, age: int, email: str):
        self.name = name
        self.age = age
        self.email = email
foo = User(name="Joe", age=32, email="joe@gmail.com")
foo.name
'Joe'
# No validation: the string "bar" is silently accepted as an age
foo = User(name="Joe", age="bar", email="joe@gmail.com")
foo.age
'bar'

Now we can define the Pydantic version of the class, inheriting from BaseModel, which is imported from pydantic, and declare the attributes with their types directly under the class definition: name is a string, age is an integer, and email is a string. We can construct an object as usual, and if we inspect it, everything looks as expected; we can also access individual attributes in the same way.

# Pydantic Version
class pUser(BaseModel):
    name: str
    age: int
    email: str
foo_p = pUser(name="Jane", age=32, email="jane@gmail.com")
foo_p.name
'Jane'

Now, if we try to pass in an invalid age argument, such as the string "bar", we can see that it raises a validation error. Pydantic is doing this behind the scenes: it validates the data we pass in against the declared types. This is yet another advantageous feature of Pydantic.

# Should throw an error as pydantic detects age is a string rather than an int
foo_p = pUser(name="Jane", age="bar", email="jane@gmail.com")
ValidationError: 1 validation error for pUser
age
  value is not a valid integer (type=type_error.integer)

By using Pydantic, validation is done for us on the data types.
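
As a small aside, Pydantic (version 1, which produced the error message above) will also coerce values that are compatible with the declared type rather than rejecting them; this minimal sketch assumes that default coercion behaviour.

# Pydantic v1 coerces compatible values: the string "32" is converted to the integer 32
foo_coerced = pUser(name="Jane", age="32", email="jane@gmail.com")
foo_coerced.age  # 32 (an int, not the string "32")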

Nested Data Structures with Pydantic

Pydantic's capabilities extend to constructing nested data structures. By defining a class whose attributes are lists of other Pydantic classes, we can create complex, validated, JSON-compatible objects.

So we'll define a class called Class, and since this is a Pydantic model, it inherits from BaseModel. Its only attribute is students, a list of the pUser objects defined above. We can now create an object with this nested structure by passing in a list of students, and we get back a validated object, which we print. The point here is that Pydantic objects can be nested.

class Class(BaseModel):
    students: List[pUser]
obj = Class(
    students=[pUser(name="Jane", age=32, email="jane@gmail.com")]
)
obj
Class(students=[pUser(name='Jane', age=32, email='jane@gmail.com')])
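
As a minimal sketch relying on Pydantic's standard parsing behaviour, we can also pass plain dictionaries for the nested objects and have them validated into pUser instances, while invalid nested data is rejected.

# Plain dicts are parsed and validated into pUser objects
obj2 = Class(students=[{"name": "Joe", "age": 32, "email": "joe@gmail.com"}])
obj2.students[0].name  # 'Joe'

# Invalid nested data raises a ValidationError at construction time
Class(students=[{"name": "Joe", "age": "bar", "email": "joe@gmail.com"}])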

5 Creating OpenAI Function Definitions with Pydantic

Designing OpenAI Function Definitions

The transition from Pydantic objects to OpenAI function definitions involves creating a Pydantic class that encapsulates the desired function schema. We use Pydantic's BaseModel to define the parameters and a docstring to describe what the function does.

Converting Pydantic to the OpenAI JSON Schema

We then convert our Pydantic class into an OpenAI-compatible JSON schema, which includes the name, description, and parameter details. This schema matches the function definitions we wrote by hand earlier and includes a mandatory description, ensuring clarity and usability.

So what we're going to do is create a Pydantic class that we can then convert to the schema we discussed earlier. Importantly, this is a Pydantic class that will not do anything by itself; we're only using it to build the schema. We'll make a class called WeatherSearch, corresponding to the weather function we made earlier, which inherits from BaseModel. Notice that we include a docstring. There is a single attribute named airport_code, typed as a string, and we add a Field description explaining what airport_code is for.

class WeatherSearch(BaseModel):
    """Call this with an airport code to get the weather at that airport"""
    airport_code: str = Field(description="airport code to get weather for")

The convert_pydantic_to_openai_function helper does exactly what it says on the tin: it converts a Pydantic class into the JSON structure required for an OpenAI function. Notice that we pass in the class itself, not an instance of it, and we get back the weather function definition. When we look at what weather_function contains, we can see it is the same JSON schema that we previously passed to OpenAI by hand.

from langchain.utils.openai_functions import convert_pydantic_to_openai_function
weather_function = convert_pydantic_to_openai_function(WeatherSearch)
weather_function
{'name': 'WeatherSearch',
 'description': 'Call this with an airport code to get the weather at that airport',
 'parameters': {'title': 'WeatherSearch',
  'description': 'Call this with an airport code to get the weather at that airport',
  'type': 'object',
  'properties': {'airport_code': {'title': 'Airport Code',
    'description': 'airport code to get weather for',
    'type': 'string'}},
  'required': ['airport_code']}}

We can see that it has the name WeatherSearch, taken from the name of the Python class, and a description, extracted from the docstring. Within the parameters it has a single property, airport_code, derived from the class attribute; airport_code has a description, taken from the Field description above, and a type of string.

One thing LangChain has done deliberately is make the docstring mandatory, because it becomes the function description. As previously said, function definitions are essentially part of the prompt, so if you're sending in a function, you should include a description of what it does. LangChain performs checks to make sure you have supplied that description, although it doesn't enforce descriptions everywhere; for example, field descriptions for individual attributes remain optional.

class WeatherSearch1(BaseModel):
    airport_code: str = Field(description="airport code to get weather for")

Note: The next cell is expected to generate an error.

convert_pydantic_to_openai_function(WeatherSearch1)
KeyError: 'description'
class WeatherSearch2(BaseModel):
    """Call this with an airport code to get the weather at that airport"""
    airport_code: str
convert_pydantic_to_openai_function(WeatherSearch2)
{'name': 'WeatherSearch2',
 'description': 'Call this with an airport code to get the weather at that airport',
 'parameters': {'title': 'WeatherSearch2',
  'description': 'Call this with an airport code to get the weather at that airport',
  'type': 'object',
  'properties': {'airport_code': {'title': 'Airport Code', 'type': 'string'}},
  'required': ['airport_code']}}
from langchain.chat_models import ChatOpenAI

# Create a ChatOpenAI model and pass the function definition in directly as a keyword argument
model = ChatOpenAI()
model.invoke("what is the weather in SF today?", functions=[weather_function])
AIMessage(content='', additional_kwargs={'function_call': {'name': 'WeatherSearch', 'arguments': '{\n  "airport_code": "SFO"\n}'}}, example=False)

# Alternatively, bind the function to the model so it doesn't have to be passed on every call
model_with_function = model.bind(functions=[weather_function])
model_with_function.invoke("what is the weather in sf?")
AIMessage(content='', additional_kwargs={'function_call': {'name': 'WeatherSearch', 'arguments': '{\n  "airport_code": "SFO"\n}'}}, example=False)
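
The arguments come back as a JSON string inside additional_kwargs, so if we want to use them in our own code we have to pull them out and parse them. A small illustrative snippet (the model's exact arguments may of course vary from run to run):

import json

response = model_with_function.invoke("what is the weather in sf?")
function_call = response.additional_kwargs["function_call"]
function_call["name"]                   # 'WeatherSearch'
json.loads(function_call["arguments"])  # {'airport_code': 'SFO'}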

6 Forcing it to use a function

We can force the model to use a specific function by setting the function_call argument when binding. Note in the second example below that even for an unrelated input like "hi!", the forced model still returns a function call, inventing an airport code.

model_with_forced_function = model.bind(functions=[weather_function], function_call={"name":"WeatherSearch"})
model_with_forced_function.invoke("what is the weather in sf?")
AIMessage(content='', additional_kwargs={'function_call': {'name': 'WeatherSearch', 'arguments': '{\n  "airport_code": "SFO"\n}'}}, example=False)
model_with_forced_function.invoke("hi!")
AIMessage(content='', additional_kwargs={'function_call': {'name': 'WeatherSearch', 'arguments': '{\n  "airport_code": "JFK"\n}'}}, example=False)

7 Integrating OpenAI Functions with LangChain Expression Language

Direct Interaction with LangChain

By importing LangChain's chat model, we can interact with OpenAI functions directly. This was demonstrated above by instantiating a model and executing queries that require the defined functions.

Model Binding and Function Invocation

Binding functions to a model streamlines function invocation and enables straightforward integration into a chain. This technique allows the model to recognize and use the relevant functions based on the input context.

Earlier we asked a question that necessitates the weather function, "what is the weather in SF today?", and passed the function definition in as a keyword argument. The response was an AIMessage with empty content and, in the additional_kwargs field, a function_call naming WeatherSearch with the argument airport_code set to "SFO". Because we bound the function to the model, we can now combine that bound model with a prompt template using the pipe operator to form a chain and invoke it in the same way.

from langchain.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("user", "{input}")
])
chain = prompt | model_with_function
chain.invoke({"input": "what is the weather in sf?"})
AIMessage(content='', additional_kwargs={'function_call': {'name': 'WeatherSearch', 'arguments': '{\n  "airport_code": "SFO"\n}'}}, example=False)
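
If we want the parsed arguments rather than the raw AIMessage, LangChain also provides output parsers for OpenAI functions. A minimal sketch, assuming the JsonOutputFunctionsParser available in the langchain version used here:

from langchain.output_parsers.openai_functions import JsonOutputFunctionsParser

# Append an output parser to the chain so the result is the parsed arguments dict
parsing_chain = prompt | model_with_function | JsonOutputFunctionsParser()
parsing_chain.invoke({"input": "what is the weather in sf?"})
# {'airport_code': 'SFO'}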

8 Using multiple functions

Even better, we can pass in a set of functions and let the LLM decide which one to use based on the context of the question. Binding the functions to the model is convenient here: once we call model.bind(functions=...), we can pass the resulting model around without worrying about supplying the functions keyword argument on every call. We only need to pass in the input, such as "what is the weather in sf?", and the model still responds with a function call because the bound functions travel with it.

The next step is to pass in a list of functions and let the model pick which one to use based on the context of the query. We define another Pydantic class called ArtistSearch, whose docstring describes it as returning the names of songs by a particular artist. It has two attributes: artist_name, the name of the artist to look up, and n, an integer for the number of results to return. We then build a fresh list containing both functions.

class ArtistSearch(BaseModel):
    """Call this to get the names of songs by a particular artist"""
    artist_name: str = Field(description="name of artist to look up")
    n: int = Field(description="number of results")
functions = [
    convert_pydantic_to_openai_function(WeatherSearch),
    convert_pydantic_to_openai_function(ArtistSearch),
]

We convert both WeatherSearch and ArtistSearch to OpenAI functions and bind them to the model with model.bind, creating a new object called model_with_functions. Now let's check what happens when we invoke it with different inputs.

model_with_functions = model.bind(functions=functions)
model_with_functions.invoke("what is the weather in sf?")
AIMessage(content='', additional_kwargs={'function_call': {'name': 'WeatherSearch', 'arguments': '{\n  "airport_code": "SFO"\n}'}}, example=False)
model_with_functions.invoke("what are three songs by taylor swift?")
AIMessage(content='', additional_kwargs={'function_call': {'name': 'ArtistSearch', 'arguments': '{\n  "artist_name": "Taylor Swift",\n  "n": 3\n}'}}, example=False)
model_with_functions.invoke("hi!")
AIMessage(content='Hello! How can I assist you today?', additional_kwargs={}, example=False)
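
Once the model has chosen a function, our own code still has to execute it. Here is a hypothetical routing sketch; get_weather and get_artist_songs are placeholder implementations invented for the example, not part of LangChain or OpenAI.

import json

# Placeholder implementations - in a real application these would call actual APIs
def get_weather(airport_code: str) -> str:
    return f"(stub) weather report for {airport_code}"

def get_artist_songs(artist_name: str, n: int) -> str:
    return f"(stub) {n} songs by {artist_name}"

def route(message):
    """Dispatch the model's chosen function call to the matching implementation."""
    function_call = message.additional_kwargs.get("function_call")
    if function_call is None:
        return message.content  # plain conversational reply, e.g. for "hi!"
    args = json.loads(function_call["arguments"])
    if function_call["name"] == "WeatherSearch":
        return get_weather(**args)
    return get_artist_songs(**args)

route(model_with_functions.invoke("what are three songs by taylor swift?"))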

9 Conclusion: Advancing with OpenAI and LangChain

We've walked through the process of using Pydantic to structure OpenAI functions and integrating these with LangChain's expression language. The power of Pydantic to enforce data integrity and generate JSON schemas is evident, as is the flexibility of LangChain in handling dynamic function calls.

10 Acknowledgements

I'd like to express my thanks for the wonderful Functions, Tools and Agents with LangChain course by DeepLearning.AI, which I completed, and to acknowledge the use of some images and other materials from the course in this article.
