Chains and why they are used in LangChain

In this post we delve deeper into the concept of chains, which provide an end-to-end pipeline for using language models. Chains seamlessly integrate models, prompts, memory, output parsing, and debugging capabilities behind a user-friendly interface.
Tags: natural-language-processing, deep-learning, langchain, activeloop, openai

Author: Pranath Fernando

Published: August 12, 2023

1 Introduction

Because it allows for natural language querying, prompting is regarded as the most effective way of communicating with language models. We covered prompting tactics and briefly used chains earlier; in this post, chains are explained in greater depth.

Chains are responsible for building an end-to-end pipeline around a language model. They integrate the model, prompt, memory, output parsing, and debugging capabilities into a user-friendly interface. A chain will 1) take the user’s query as input, 2) process the LLM’s response, and 3) return the result to the user.

By inheriting from the Chain class, you can create your own pipeline. LLMChain, for example, is the most basic type of chain in LangChain and descends from the Chain parent class. We’ll start by looking at how to invoke this class and then move on to adding other functionality.

2 Import Libs & Setup

import os
import openai
import sys
sys.path.append('../..')
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.environ['OPENAI_API_KEY']
True

3 LLMChain

There are several ways to call a chain, each with a different output format. The example in this section builds a bot that suggests a replacement word based on context. The code below initialises the GPT-3 model through the OpenAI API, constructs a prompt using LangChain’s PromptTemplate, and ties everything together with the LLMChain class. It is also essential to set the OPENAI_API_KEY environment variable to your OpenAI API key.

The most straightforward approach is the chain’s call method: pass the input directly to the initialised chain object. It returns both the input variable and the model’s response under the text key.

from langchain import PromptTemplate, OpenAI, LLMChain

prompt_template = "What is a word to replace the following: {word}?"

# Set the "OPENAI_API_KEY" environment variable before running the following line.
llm = OpenAI(model_name="text-davinci-003", temperature=0)

llm_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template(prompt_template)
)
llm_chain("artificial")
{'word': 'artificial', 'text': '\n\nSynthetic'}

It is also possible to pass multiple inputs at once and receive a list of responses by using the .apply() method. The only difference is that the inputs themselves are not included in the returned list; the responses are, however, in the same order as the inputs.

input_list = [
    {"word": "artificial"},
    {"word": "intelligence"},
    {"word": "robot"}
]

llm_chain.apply(input_list)
[{'text': '\n\nSynthetic'}, {'text': '\n\nWisdom'}, {'text': '\n\nAutomaton'}]

The .generate() method returns an instance of LLMResult, which carries more information. For example, the finish_reason key shows why the generation process terminated: stop means the model chose to finish, while length means it hit the token limit. Other self-explanatory fields include the total number of tokens used and the model name.

llm_chain.generate(input_list)
LLMResult(generations=[[Generation(text='\n\nSynthetic', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nWisdom', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\n\nAutomaton', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'prompt_tokens': 33, 'completion_tokens': 13, 'total_tokens': 46}, 'model_name': 'text-davinci-003'})
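If you need these fields programmatically, they can be read straight off the returned object; a minimal sketch, reusing the result of the call above:

result = llm_chain.generate(input_list)

# Each inner list corresponds to one input from input_list.
first = result.generations[0][0]
print(first.text)                              # '\n\nSynthetic'
print(first.generation_info["finish_reason"])  # 'stop'

# Aggregate token usage and the model name live under llm_output.
print(result.llm_output["token_usage"]["total_tokens"])  # 46
print(result.llm_output["model_name"])                    # 'text-davinci-003'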

The next method we’ll look at is .predict() (which is interchangeable with .run()). Its most common use is to pass several input variables for a single prompt, although it also works with a single input variable. The following command sends both the word to be replaced and the context for the model to consider.

prompt_template = "Looking at the context of '{context}'. What is a approapriate word to replace the following: {word}?"

llm_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(template=prompt_template, input_variables=["word", "context"]))

llm_chain.predict(word="fan", context="object")
'\n\nVentilator'

In the context of objects, the model correctly suggested that Ventilator would be an appropriate replacement for the word fan. When we repeat the experiment with a different context, humans, the output changes.

llm_chain.predict(word="fan", context="humans")
'\n\nAdmirer'
# llm_chain.run(word="fan", context="object")
'\n\nVentilator'

The code samples above demonstrate how to feed single or multiple inputs to a chain and retrieve the outputs. However, as we discussed in the “Managing Outputs with Output Parsers” course, in most circumstances we prefer to receive formatted output.

As shown below, we can also pass the prompt as a plain string and initialise the chain with the .from_string() class method: LLMChain.from_string(llm=llm, template=template).

template = """Looking at the context of '{context}'. What is a approapriate word to replace the following: {word}?"""
llm_chain = LLMChain.from_string(llm=llm, template=template)
llm_chain.predict(word="fan", context="object")
'\n\nVentilator'

4 Parsers

As previously stated, output parsers can define a data schema so that replies come back suitably structured. It wouldn’t be an end-to-end pipeline unless a parser was used to extract information from the LLM’s textual output. The following example shows how to use the CommaSeparatedListOutputParser class in conjunction with PromptTemplate to ensure the results come back as a list.

from langchain.output_parsers import CommaSeparatedListOutputParser

output_parser = CommaSeparatedListOutputParser()
template = """List all possible words as substitute for 'artificial' as comma separated."""

llm_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(template=template, input_variables=[], output_parser=output_parser))

llm_chain.predict()
'\n\nSynthetic, Manufactured, Imitation, Fabricated, Fake, Simulated, Artificial Intelligence, Automated, Constructed, Programmed, Mechanical, Processed, Algorithmic, Generated.'
llm_chain.predict_and_parse()
['Synthetic',
 'Manufactured',
 'Imitation',
 'Fabricated',
 'Fake',
 'Simulated',
 'Artificial Intelligence',
 'Automated',
 'Constructed',
 'Programmed',
 'Processed',
 'Mechanical',
 'Man-Made',
 'Lab-Created',
 'Artificial Neural Network.']
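Here .predict_and_parse() simply runs the chain and then feeds the raw text through the parser’s parse() method, which splits the comma-separated string into a Python list; a minimal sketch of the parsing step on its own:

# The parser turns a comma-separated string into a list of strings.
output_parser.parse("Synthetic, Manufactured, Imitation")
# ['Synthetic', 'Manufactured', 'Imitation']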

5 Conversational Chain (Memory)

Memory is the next component that completes a chain, depending on the application. Using the ConversationBufferMemory class, LangChain provides a ConversationChain that tracks previous prompts and responses.

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

template = """List all possible words as substitute for 'artificial' as comma separated.

Current conversation:
{history}

{input}"""

conversation = ConversationChain(
    llm=llm,
    prompt=PromptTemplate(template=template, input_variables=["history", "input"], output_parser=output_parser),
    memory=ConversationBufferMemory())

conversation.predict_and_parse(input="Answer briefly. write the first 3 options.")
['Synthetic', 'Manufactured', 'Imitation']

Now we can ask it to return the next four replacement words; it uses the memory to work out which options have already been suggested.

conversation.predict_and_parse(input="And the next 4?")
['Fabricated', 'Simulated', 'Automated', 'Constructed']
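To check what the chain actually remembers between calls, you can inspect the buffer kept by ConversationBufferMemory; a minimal sketch:

# The buffer holds the running transcript of inputs and outputs,
# which is injected into the {history} placeholder on each call.
print(conversation.memory.buffer)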

6 Debug

By setting the verbose argument to True, you can inspect the inner workings of any chain. As shown below, the chain prints the fully formatted prompt along with the output. What gets logged depends on the application; chains with more steps may report more information.

conversation = ConversationChain(
    llm=llm,
    prompt=PromptTemplate(template=template, input_variables=["history", "input"], output_parser=output_parser),
    memory=ConversationBufferMemory(),
    verbose=True)

conversation.predict_and_parse(input="Answer briefly. write the first 3 options.")


> Entering new ConversationChain chain...
Prompt after formatting:
List all possible words as substitute for 'artificial' as comma separated.

Current conversation:


Answer briefly. write the first 3 options.

> Finished chain.
['Synthetic', 'Manufactured', 'Imitation']
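Beyond the per-chain verbose flag, LangChain also exposes a global debug switch that logs every chain and LLM call; a minimal sketch, assuming a LangChain version that supports it:

import langchain

# Print full inputs, outputs and intermediate LLM calls for every chain.
langchain.debug = True

conversation.predict_and_parse(input="Answer briefly. write the first 3 options.")

# Switch the global logging off again when finished.
langchain.debug = False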

7 Sequential Chain

Another helpful feature is the sequential chain, which composes multiple chains into one. The following code shows a sample usage.

from langchain.chains import SimpleSequentialChain

overall_chain = SimpleSequentialChain(chains=[chain_one, chain_two], verbose=True)

The SimpleSequentialChain runs the chains in order, starting from the first, and passes each chain’s response as the input to the next one in the list.
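The snippet above assumes chain_one and chain_two already exist. As a concrete, hypothetical example, the two chains below first suggest a replacement word and then use that suggestion in a sentence:

prompt_one = PromptTemplate(
    input_variables=["word"],
    template="What is a word to replace the following: {word}?",
)
chain_one = LLMChain(llm=llm, prompt=prompt_one)

prompt_two = PromptTemplate(
    input_variables=["suggestion"],
    template="Write a short sentence using the word '{suggestion}'.",
)
chain_two = LLMChain(llm=llm, prompt=prompt_two)

# Each chain has a single input and a single output, which is what
# SimpleSequentialChain expects; chain_one's answer feeds chain_two.
overall_chain = SimpleSequentialChain(chains=[chain_one, chain_two], verbose=True)
overall_chain.run("artificial")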

8 Custom Chain

The LangChain library includes multiple preconfigured chains for various purposes, such as Transformation Chain, LLMCheckerChain, LLMSummarizationCheckerChain, and OpenAPI Chain, all of which share the properties covered in previous sections. You can also define your own chain for any custom task. In this section, we will build a chain that returns the meaning of a word and then offers a substitute.

We begin by defining a class that inherits most of its functionality from the Chain class. Then, depending on the use case, three members must be defined: the input_keys and output_keys properties tell the chain what inputs to expect and what outputs it will produce, and the _call method runs each inner chain and merges their outputs.

from langchain.chains import LLMChain
from langchain.chains.base import Chain

from typing import Dict, List


class ConcatenateChain(Chain):
    chain_1: LLMChain
    chain_2: LLMChain

    @property
    def input_keys(self) -> List[str]:
        # Union of the input keys of the two chains.
        all_input_vars = set(self.chain_1.input_keys).union(set(self.chain_2.input_keys))
        return list(all_input_vars)

    @property
    def output_keys(self) -> List[str]:
        return ['concat_output']

    def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
        output_1 = self.chain_1.run(inputs)
        output_2 = self.chain_2.run(inputs)
        return {'concat_output': output_1 + output_2}

Then we declare each chain individually using the LLMChain class. Lastly, we call our custom ConcatenateChain to merge the results of chain_1 and chain_2.

prompt_1 = PromptTemplate(
    input_variables=["word"],
    template="What is the meaning of the following word '{word}'?",
)
chain_1 = LLMChain(llm=llm, prompt=prompt_1)

prompt_2 = PromptTemplate(
    input_variables=["word"],
    template="What is a word to replace the following: {word}?",
)
chain_2 = LLMChain(llm=llm, prompt=prompt_2)

concat_chain = ConcatenateChain(chain_1=chain_1, chain_2=chain_2)
concat_output = concat_chain.run("artificial")
print(f"Concatenated output:\n{concat_output}")
Concatenated output:


Artificial means something that is not natural or made by humans, but rather created or produced by artificial means.

Synthetic

9 Conclusion

This post introduced us to LangChain and its powerful feature, chains, which integrate several components to form a cohesive application. The post began by demonstrating the use of numerous premade chains from the LangChain package. Then we added more functionality like parsers, memory, and debugging. Finally, the technique of creating bespoke chains was described.

More articles on LangChain can be found here.

Further Reading:

https://python.langchain.com/docs/modules/chains/

10 Acknowledgements

I’d like to express my thanks to the wonderful LangChain & Vector Databases in Production course by Activeloop, which I completed, and to acknowledge the use of some images and other materials from the course in this article.
