Inferring with Text Prompts for Large Language Models

Here we look at how to use Large Language Models such as ChatGPT to infer sentiment and topics from product reviews and news articles
natural-language-processing
deep-learning
openai
prompt-engineering
Author

Pranath Fernando

Published

May 4, 2023

1 Introduction

Large language models such as ChatGPT can generate text responses based on a given prompt or input. Writing prompts allow users to guide the language model’s output by providing a specific context or topic for the response. This feature has many practical applications, such as generating creative writing prompts, assisting in content creation, and even aiding in customer service chatbots.

For example, a writing prompt such as “Write a short story about a time traveler who goes back to the medieval period” could lead the language model to generate a variety of unique and creative responses. Additionally, prompts can be used to generate more specific and relevant responses for tasks such as language translation or summarization. In these cases, the prompt would provide information about the desired output, such as the language to be translated or the key points to be included in the summary. Overall, prompts provide a way to harness the power of large language models for a wide range of practical applications.

However, creating effective prompts for large language models remains a significant challenge, as even prompts that seem similar can produce vastly different outputs.

In my previous article, we looked at how to use prompts to summarize text with a focus on specific topics.

In this article, we will look at how to infer sentiment and topics from product reviews and news articles.

2 Setup

2.1 Load the API key and relevant Python libraries.

First we need to load some Python libraries and connect to the OpenAI API.

The OpenAI API library needs to be configured with an account's secret key, which is available on the OpenAI website.

You can either set it as the OPENAI_API_KEY environment variable before using the library:

!export OPENAI_API_KEY='sk-...'

Or, set openai.api_key to its value:

import openai
openai.api_key = "sk-..."
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_key  = os.getenv('OPENAI_API_KEY')

2.2 Helper function

We will use OpenAI's gpt-3.5-turbo model and the chat completions endpoint.

To make it easier to use prompts and examine the generated outputs, we'll define a helper function, get_completion, which simply takes a prompt and returns the completion for that prompt:

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]
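
As a quick sanity check that the helper works (the prompt here is just an arbitrary example), we can call it directly:

# Quick sanity check of the helper function (assumes a valid API key is configured).
print(get_completion("Reply with one word: hello"))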

3 Inferring using Large Language Models

We will now examine inferring, which can be thought of as tasks where the model takes a text as input and performs some kind of analysis, such as extracting names, extracting labels, or interpreting the sentiment of the text. If you wanted to extract the sentiment, positive or negative, of a piece of text using the traditional machine learning approach, you'd have to collect a labelled dataset, train a model, and then figure out how to deploy the model somewhere in the cloud and make inferences. That can work well, but going through the process is time-consuming.

Moreover, for every task, such as sentiment analysis versus extracting names versus something else, you would have to train and deploy a separate model. A large language model has the benefit of allowing you to write a prompt for many of these tasks and have it start producing results almost immediately, which brings amazing speed to application development. You can also use just one model and one API to handle many different tasks, rather than figuring out how to train and deploy a bunch of different models.

4 Product review text

So let's begin with a lamp review as an example. We want to write a prompt to classify the sentiment of this review. To have the system tell us the sentiment, we can simply ask for it in the prompt, along with the usual delimiters around the review text.

lamp_review = """
Needed a nice lamp for my bedroom, and this one had \
additional storage and not too high of a price point. \
Got it fast.  The string to our lamp broke during the \
transit and the company happily sent over a new one. \
Came within a few days as well. It was easy to put \
together.  I had a missing part, so I contacted their \
support and they very quickly got me the missing piece! \
Lumina seems to me to be a great company that cares \
about their customers and products!!
"""

5 Sentiment (positive/negative)

prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple backticks?

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Output

The sentiment of the product review is positive.

This indicates a positive attitude towards the product, which seems about right. Although the lamp wasn't perfect (the string broke in transit and a part was missing), the buyer seems content with it.

If you wanted a more succinct answer to make post-processing easier, you can take this prompt and add another instruction asking the model to respond with a single word, either positive or negative.

prompt = f"""
What is the sentiment of the following product review, 
which is delimited with triple backticks?

Give your answer as a single word, either "positive" \
or "negative".

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Output

positive
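
As a rough sketch of the kind of post-processing this single-word format enables (the routing logic here is purely illustrative):

# Illustrative post-processing of the single-word sentiment answer.
# Normalise the reply in case of stray whitespace or capitalisation.
sentiment = response.strip().lower()

if sentiment == "negative":
    print("Flagging review for follow-up.")
elif sentiment == "positive":
    print("No action needed.")
else:
    print(f"Unexpected answer from the model: {sentiment!r}")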

6 Identify types of emotions

Let's imagine we wish to list the emotions the author of the review is expressing, with a maximum of five items in the list. Large language models can be rather effective at identifying specific information inside a text; in this case, the emotions being expressed. Knowing this can help you understand how a particular product's customers feel about it.

prompt = f"""
Identify a list of emotions that the writer of the \
following review is expressing. Include no more than \
five items in the list. Format your answer as a list of \
lower-case words separated by commas.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Output

happy, satisfied, grateful, impressed, content

7 Identify anger

For many customer support organisations, it's important to know whether a particular user is severely upset. This gives us a different classification problem: is the reviewer angry?

If a person is truly upset, it can be worth paying extra attention to that review and having customer support or customer success reach out to find out what's wrong and make things right for the customer. In this instance the customer is not angry. Note also that with supervised learning, there would have been no way to build all of these classifiers in such a short period of time.

prompt = f"""
Is the writer of the following review expressing anger?\
The review is delimited with triple backticks. \
Give your answer as either yes or no.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Output

No
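
Here is a minimal sketch of how you might act on this yes/no answer; the escalation message is just a placeholder:

# Convert the model's yes/no answer into a boolean and escalate if needed.
is_angry = response.strip().lower().startswith("yes")

if is_angry:
    print("ALERT: angry customer - route this review to customer support.")
else:
    print("No escalation needed.")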

8 Extract product and company name from customer reviews

Let's examine a different task: extracting more detailed information from customer reviews.

Information extraction is the area of NLP (natural language processing) concerned with taking a text and pulling specific pieces of information out of it. In this prompt we ask the model to identify the item purchased and the name of the company that made it. If you were trying to summarise many reviews from an online store, it would be helpful to identify the products, the manufacturers, the positive and negative feedback, and any trends in positive or negative sentiment for particular products or manufacturers.

prompt = f"""
Identify the following items from the review text: 
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. \
Format your response as a JSON object with \
"Item" and "Brand" as the keys. 
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
  
Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Output

{ "Item": "lamp", "Brand": "Lumina" }
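
Because we asked for a JSON object, the response can usually be parsed directly with Python's json module. This is a sketch that assumes the model returned valid JSON, which is not guaranteed:

import json

# Parse the model's JSON response into a Python dictionary.
try:
    review_info = json.loads(response)
    print(review_info["Item"], "-", review_info["Brand"])
except json.JSONDecodeError:
    print("Model response was not valid JSON:", response)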

9 Doing multiple tasks at once

In the examples so far, you saw how to write a prompt to identify the sentiment, determine whether someone is upset, and extract the product and brand. Rather than using three or four separate prompts and calling get_completion repeatedly to extract these fields one at a time, you can actually write a single prompt to extract all of this information at once.

So let's say we want to identify the item and brand, extract the sentiment, and have the model format the anger value as a boolean, all returned as a single JSON object. In the output below, the sentiment is positive, the Anger value is the boolean false (note there are no quotes around it), and the item is extracted as "lamp with additional storage" rather than just "lamp", which seems reasonable. This approach lets you extract multiple fields from a piece of text with just one prompt.

prompt = f"""
Identify the following items from the review text: 
- Sentiment (positive or negative)
- Is the reviewer expressing anger? (true or false)
- Item purchased by reviewer
- Company that made the item

The review is delimited with triple backticks. \
Format your response as a JSON object with \
"Sentiment", "Anger", "Item" and "Brand" as the keys.
If the information isn't present, use "unknown" \
as the value.
Make your response as short as possible.
Format the Anger value as a boolean.

Review text: '''{lamp_review}'''
"""
response = get_completion(prompt)
print(response)
Output

{ "Sentiment": "positive", "Anger": false, "Item": "lamp with additional storage", "Brand": "Lumina" }
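
If you had many reviews to analyse, the same kind of combined prompt could be applied in a loop. This is a hypothetical sketch; the reviews list is a placeholder standing in for your own data:

# Hypothetical batch example: run the same style of combined prompt over several reviews.
# `reviews` is a placeholder list; in practice it would come from your own data.
reviews = [lamp_review]

for i, review in enumerate(reviews):
    batch_prompt = f"""
    Identify the sentiment (positive or negative), whether the reviewer is \
    expressing anger (true or false), the item purchased, and the company that \
    made the item, from the review delimited by triple backticks. Format your \
    response as a JSON object with "Sentiment", "Anger", "Item" and "Brand" as \
    the keys, using "unknown" if the information isn't present.

    Review text: '''{review}'''
    """
    print(i, get_completion(batch_prompt))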

10 Inferring topics

Inferring topics is another great use for large language models. What is a lengthy passage of text about? What subjects does it cover? Below is a made-up newspaper story about how government employees feel about the department they work for: in a recent government survey, NASA came out as the most popular department, with a high satisfaction rating. With the prompt that follows, we ask the model to identify five topics discussed in an article like this one, and to format the response as a list with each item one or two words long.

story = """
In a recent survey conducted by the government, 
public sector employees were asked to rate their level 
of satisfaction with the department they work at. 
The results revealed that NASA was the most popular 
department with a satisfaction rating of 95%.

One NASA employee, John Smith, commented on the findings, 
stating, "I'm not surprised that NASA came out on top. 
It's a great place to work with amazing people and 
incredible opportunities. I'm proud to be a part of 
such an innovative organization."

The results were also welcomed by NASA's management team, 
with Director Tom Johnson stating, "We are thrilled to 
hear that our employees are satisfied with their work at NASA. 
We have a talented and dedicated team who work tirelessly 
to achieve our goals, and it's fantastic to see that their 
hard work is paying off."

The survey also revealed that the 
Social Security Administration had the lowest satisfaction 
rating, with only 45% of employees indicating they were 
satisfied with their job. The government has pledged to 
address the concerns raised by employees in the survey and 
work towards improving job satisfaction across all departments.
"""
prompt = f"""
Determine five topics that are being discussed in the \
following text, which is delimited by triple backticks.

Make each item one or two words long. 

Format your response as a list of items separated by commas.

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)
Output

government survey, job satisfaction, NASA, Social Security Administration, employee concerns
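
Since we asked for a comma-separated list, the response can be turned into a Python list with a simple split (this assumes the model followed the requested format):

# Split the comma-separated response into a list of topic strings.
topics_found = [t.strip() for t in response.split(",")]
print(topics_found)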

11 Make a news alert for certain topics

If you have a collection of articles from which you have extracted topics, you can use a large language model to help you index the articles into categories. Here we'll use a slightly different topic list. Let's imagine we run a news website or something similar, and these are the topics we track: NASA, local government, engineering, employee satisfaction, and the federal government.

Let's say we want to determine which of these topics are covered in a particular news story. We can use the following prompt, which asks the model to decide whether each item in the list of topics is a topic in the text below, and to give its answer as a list of 0s and 1s for each topic.

The story text is the same as before. It is about NASA; it is not about local government or engineering; and it does concern employee satisfaction and the federal government. Because no labelled training data was needed, this is sometimes referred to as "zero-shot" learning in machine learning: with just a prompt, the model was able to detect which of these topics are covered in the news story.

topic_list = [
    "nasa", "local government", "engineering", 
    "employee satisfaction", "federal government"
]
prompt = f"""
Determine whether each item in the following list of \
topics is a topic in the text below, which
is delimited with triple backticks.

Give your answer as a list with 0 or 1 for each topic.\

List of topics: {", ".join(topic_list)}

Text sample: '''{story}'''
"""
response = get_completion(prompt)
print(response)
Output

nasa: 1
local government: 0
engineering: 0
employee satisfaction: 1
federal government: 1

# Parse the model's answer (one "topic: 0/1" pair per line) into a dictionary
topic_dict = {i.split(': ')[0]: int(i.split(': ')[1]) for i in response.split(sep='\n')}
if topic_dict['nasa'] == 1:
    print("ALERT: New NASA story!")
Output

ALERT: New NASA story!
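
As a purely illustrative variation, you could raise an alert for every tracked topic that appears in the story, not just NASA:

# Illustrative extension: alert on every tracked topic present in the story.
for topic, present in topic_dict.items():
    if present == 1:
        print(f"ALERT: new story about {topic}!")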

So that's it for inferring: in contrast to the days or even weeks it would previously have taken an experienced machine learning engineer, you can now build a number of systems for inferring information from text in just a few minutes.

I find this quite exciting, because prompting can now be used to quickly build systems and start drawing conclusions on quite challenging natural language processing problems like these, both for experienced machine learning developers and for those who are newer to machine learning.

12 Acknowledgements

I'd like to express my thanks to the wonderful ChatGPT Prompt Engineering for Developers course by DeepLearning.AI and OpenAI, which I completed, and acknowledge the use of some images and other materials from the course in this article.
