Some of the most valuable information to make LLMs useful is in structured data such as databases. In this article we show how we can use LangChain to help LLMs answer questions based on information stored in a SQL database.
In this article we show how to use labeled preference scoring to help compare two versions of a system and choose the preferred outputs
In many real-world settings, the correct answer to a question may change over time. For example, if you’re designing a Q&A system on top of a database, or one that connects to an API, the underlying data may be updated regularly. Instead of storing labels directly as values, in this post we’ll use LangSmith to store labels as references that are used to look up the relevant values.
Evaluating a question-answering system can help you improve its system design as well as the prompt and model quality. We tend to improve what we can measure, so verifying correctness is a key focus. In this post, we will use LangSmith to test the accuracy of a Q&A system against an example dataset.
Developing LLM-based applications is now possible using libraries such as LangChain, but taking these applications into production can involve many challenges, such as evaluation and monitoring. LangSmith is a new tool that can help with these challenges of taking LLMs into production.
While language models have remarkable capabilities, they can occasionally generate undesirable outputs. Here we address this issue by introducing the self-critique chain, which acts as a mechanism to ensure model responses are appropriate in a production environment.
In this article we are going to create a voice assistant for your knowledge base! We will outline how you can develop your very own voice assistant using state-of-the-art artificial intelligence tools.
In this post we dive into the challenge of summarizing YouTube videos efficiently. We will introduce two cutting-edge tools, Whisper and LangChain, that can help tackle this issue, and discuss the “stuff”, “map-reduce” and “refine” strategies for handling large amounts of text and extracting valuable information.
In this post we delve deeper into the concept of chains, which provide an end-to-end pipeline for utilizing language models. These chains seamlessly integrate models, prompts, memory, parsing output, and debugging capabilities, offering a user-friendly interface.
Here, we show how to use website material as additional context to help a chatbot efficiently reply to user queries. The code implementation uses data loaders, stores the associated embeddings in a Deep Lake dataset, and then retrieves the documents that are most relevant to the user’s query.
Embeddings are high-dimensional vectors used to store semantic data. Large language models can transform textual data into embedding space, enabling flexible representations across languages. These embeddings act as useful tools for identifying relevant information.
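As a rough illustration of the idea, this sketch embeds a few texts and compares them with cosine similarity; it assumes the classic langchain package and an OPENAI_API_KEY in the environment, and the example texts are purely illustrative.

```python
# A minimal sketch of comparing texts in embedding space.
import numpy as np
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = embeddings.embed_query("How do I reset my password?")
doc = embeddings.embed_query("Instructions for recovering account access")
unrelated = embeddings.embed_query("Recipe for banana bread")

print(cosine_similarity(query, doc))        # expected: relatively high
print(cosine_similarity(query, unrelated))  # expected: lower
```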
Giving documents to the LLM as information sources and asking it to produce an answer based on the information it extracts from them is one strategy for reducing hallucinations. In this article we will look at how text splitters can help with this.
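For a taste of what this involves, here is a minimal text-splitter sketch, assuming the classic langchain package; the chunk sizes and the report.txt file are illustrative.

```python
# A minimal sketch of chunking a document before passing it to an LLM.
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # max characters per chunk
    chunk_overlap=100,  # overlap preserves context across chunk boundaries
)

with open("report.txt") as f:  # hypothetical source document
    text = f.read()

chunks = splitter.split_text(text)
print(f"{len(chunks)} chunks; first chunk:\n{chunks[0][:200]}")
```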
The LangChain library provides a number of helper classes that are intended to make it easier to load and extract data from various sources, which we will cover in this post.
With an emphasis on the role of indexes and retrievers, here we will examine some of the benefits and drawbacks of document-based LLM applications that use them.
Here we walk through a simple workflow for creating a knowledge graph from textual data, making complex information more accessible and easier to understand
Our goal in this post is to improve a news summariser’s ability to extract the most important information from lengthy news articles and display it in an easy-to-read bulleted list format.
This article covers the different types of parsing objects used for LLMs and the troubleshooting process.
In this article, we’ll examine how example selectors and few-shot prompts can improve LangChain’s language model performance.
This article explores the subtleties of PromptTemplates and efficient ways to use them. A PromptTemplate is a pre-established pattern or framework used to create efficient and dependable prompts for large language models - it serves as a guide to make sure the input text or prompt is formatted correctly.
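For a flavour of the API, here is a minimal PromptTemplate sketch, assuming the classic langchain package; the template and variable names are illustrative.

```python
# A minimal sketch of defining and filling a PromptTemplate.
from langchain.prompts import PromptTemplate

template = PromptTemplate(
    input_variables=["product", "tone"],  # placeholders the caller must fill
    template=(
        "You are a marketing assistant. Write a {tone} one-sentence "
        "description of the following product: {product}"
    ),
)

# format() validates the variables and returns the finished prompt string.
prompt = template.format(product="solar-powered phone charger", tone="playful")
print(prompt)
```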
The aim of this post is to provide a strong basis in the knowledge and techniques required to develop effective prompts that empower LLMs to provide precise, contextually relevant, and insightful responses.
We will examine the integration of various LLMs in LangChain in this article.
There are many Large Language Models, but many are not fully accessible: access to their weights and architecture is limited, and even with access, carrying out any activities with them requires a large amount of resources, and building on top of their APIs is not free. Open-source models like GPT4All overcome these limitations and increase everyone’s access to LLMs.
In this project we create a News Articles Summarizer application using ChatGPT and LangChain, to help save time staying current on news and information in today’s fast-paced world.
In LangChain, LLMs and Chat Models are two different kinds of models used for various natural language processing tasks - the distinctions between LLMs and Chat Models, as well as their distinctive applications and implementation strategies within LangChain, will be covered in this article.
Activeloop Deep Lake provides storage for embeddings and their corresponding metadata in the context of LLM apps, enables hybrid searches on these embeddings and their attributes for efficient data retrieval, and integrates with LangChain and Agents.
In this article we will look at how we can use the open-source Llama-70b-chat model in both Hugging Face transformers and LangChain.
In this article we are going to give a chatbot memory, to help it better answer questions about data using LangChain.
In this article we look at how you can split documents, extract the relevant data, then take a question and pass both to a language model, asking it to answer the question using LangChain.
In this article we look at how you can retrieve content from a vectorstore using state-of-the-art methods to ensure only the most relevant content is made available for Large Language Models.
In this article we look at how to convert documents into embeddings and store them in vector stores, an important step in making content available for Large Language Models.
In this article we look at how you can split documents as an important step in making content available for Large Language Models
In this post we look at several aspects to consider when deploying a Large Language Model (LLM) into an application, such as chain-of-thought reasoning, program-aided language models (PAL), the ReAct framework combining reasoning and acting, application architectures, and responsible AI.
In this post we look at several aspects to consider when deploying a Large Language Model (LLM) into an application, such as model optimizations, a Generative AI project lifecycle cheat sheet, and how LLMs can be turned into useful applications using external data sources and services.
In this project we will fine-tune a FLAN-T5 model to generate less toxic content with Meta AI’s hate speech reward model
In this post we will look at Proximal Policy Optimization which is a powerful algorithm for solving reinforcement learning problems
Here we look at more advanced aspects of Reinforcement learning from human feedback (RLHF), in particular the reward model, the use of chain-of-thought prompting, and the challenges LLMs face with knowledge cut-offs.
In this post we will introduce Reinforcement learning from human feedback (RLHF), an important method used to improve the performance and alignment of modern large language models.
In this project I will fine-tune an existing Large Language Model from Hugging Face for enhanced dialogue summarization
Training large language models can be computationally and financially expensive. Parameter-efficient fine-tuning (PEFT) techniques modify only a restricted number of parameters and can drastically reduce costs and training time.
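As a rough sketch of the idea, the example below wraps a base model with a LoRA adapter using Hugging Face’s peft library; the model name and LoRA hyperparameters are illustrative, not a prescribed setup.

```python
# A minimal sketch of parameter-efficient fine-tuning with LoRA.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,               # rank of the low-rank update matrices
    lora_alpha=32,     # scaling factor for the update
    lora_dropout=0.05,
)

model = get_peft_model(base_model, lora_config)
# Only the small LoRA adapter matrices are trainable; the base weights stay frozen.
model.print_trainable_parameters()  # typically well under 1% of all parameters
```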
In this article we explore several metrics used by developers of large language models that you can use to assess the performance of your own models and compare them to other models.
In this post, we’ll look at techniques you might employ to make an existing large language model more effective for your particular use case using a method called instruction fine-tuning, and in particular see how this can be used to optimise for multiple tasks at the same time.
In this article we will look at methods that you can use to improve the performance of an existing large language model for your specific use case using instruction fine-tuning
Here we will examine particular use cases where it might make sense to train a large language model from scratch. These use cases are often characterised by situations that use language in a unique way, such as legal or medical text.
In this article we look at research that has looked at the relationship between model size, training, configuration, and performance to try to pinpoint the optimal size for large language models
Running out of memory is one of the most frequent problems you still encounter when trying to train large language models. In this article we look at strategies used to help train these models more efficiently.
In this article we will look at different types of pre-trained models and see how these are suited for different tasks - this can help you choose the best model for your LLM use-case
Here I will explore dialogue summarization using generative AI and will look at how the input text affects the output of the model and use prompt engineering to direct it towards the task we need
In this article I will present a high-level architecture for building Generative AI projects, proposed by DeepLearning.AI and AWS in their Generative AI with Large Language Models course, that could be applied to any project.
In this article we will take a high level non-technical view of what generative configuration options for Large language models allow you to do
Here we will take a high level non-technical view of what prompting is all about and introduce in-context learning
In this article we will take a high level non-technical view of key aspects of the Transformer Model - the technology behind recent advances in AI
Here we look at some best practices for evaluating the outputs of an LLM application when you do not have a clear sense of the right output, or it is ambiguous - to help us know before and after deployment how well it’s working.
Here we look at some best practices for evaluating the outputs of an LLM application when you do have a clear sense of the right output - to help us know before and after deployment how well it’s working.
Here, we will put together chained prompts, moderation and other quality checks to create a better customer service chatbot using ChatGPT.
In this article we will focus on checking outputs generated by an LLM before showing them to users - which can be important for ensuring the quality, relevance, and safety of the responses provided to them or used in automation flows
Here using ChatGPT we will see how to split complex tasks into a series of simpler subtasks by chaining multiple prompts together which can help provide better results than trying to perform a task using just one prompt
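A minimal sketch of the chaining idea follows, assuming the pre-1.0 openai Python SDK and an OPENAI_API_KEY in the environment; the prompts and review text are illustrative.

```python
# A minimal sketch of chaining two prompts: the output of the first
# subtask becomes part of the input to the second.
import openai

def complete(prompt):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message["content"]

review = "The battery died after two days and support never replied."

# Step 1: a simple subtask -- extract the issues mentioned.
issues = complete(f"List the product issues in this review:\n{review}")

# Step 2: feed step 1's output into a second, focused prompt.
reply = complete("Write a short, apologetic customer-service reply "
                 f"addressing these issues:\n{issues}")
print(reply)
```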
In this article we will focus on large language model tasks that process a series of inputs, i.e. tasks that take an input and generate a useful output, often through a series of steps - using ChatGPT.
In this article we look at how you evaluate and moderate inputs to large language models, which is important when creating LLM applications that involve chains of multiple inputs and outputs to LLMs, to ensure that users are behaving responsibly and aren’t trying to exploit the system in any manner.
Here we look at how you evaluate and classify inputs to large language models, which is important when creating LLM applications that involve chains of multiple inputs and outputs to LLMs.
Here we give a brief overview of how LLMs work, how they are trained, what a tokeniser is, and how the choice of tokeniser can affect the output of the LLM. We will also look at what the ‘chat format’ for LLMs is all about.
Creating useful applications with AI & Large Language Models involves many aspects; here I highlight key considerations when building these applications & describe how I built & deployed 6 LLM applications with LangChain to summarise or chat with documents, web pages or YouTube videos.
In this project we will use LangChain to create LLM based agents which can help answer questions, reason through content or even to decide what to do next based on various information sources or tools you can give it access to
In this article we look at how LangChain can help evaluate LLM performance for a specific Application
In this article we look at how LangChain can perform question answering over documents using embeddings and vector stores.
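As a rough sketch of the pattern, the example below indexes a document in a vector store and runs a retrieval QA chain over it; it assumes the classic langchain package with FAISS installed and an OPENAI_API_KEY set, and the file name and question are illustrative.

```python
# A minimal sketch of question answering over a document.
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

docs = TextLoader("handbook.txt").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(docs)

# Embed the chunks and index them in a vector store for similarity search.
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())

qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(temperature=0),
    chain_type="stuff",              # put retrieved chunks straight into the prompt
    retriever=vectorstore.as_retriever(),
)
print(qa.run("What is the annual leave policy?"))
```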
Here we will look at the Chains component of LangChain and see how this can help us combine different sequences of events using LLMs.
Here we look at how LangChain can give useful memory to improve LLM model responses.
LangChain is an intuitive open-source Python framework created to simplify the development of useful applications using LLMs. In this article we introduce the framework, then look at the Models, Prompts and Parsers components of LangChain.
In this project we will use ChatGPT’s chat format to have extended conversations with chatbots personalized or specialized for specific tasks or behaviors.
We will use ChatGPT to generate customer service emails that are tailored to each customer’s review.
In this article we will explore how to use Large Language Models for text transformation tasks such as language translation, spelling and grammar checking, tone adjustment, and format conversion.
Here we look at how to use Large Language Models such as ChatGPT to infer sentiment and topics from product reviews and news articles
In this article we look at how to use Large Language Models such as ChatGPT to summarize text with a focus on specific topics
Here we look at how to develop prompts for large language models iteratively
In this article we look at two prompting principles and their related tactics in order to write effective prompts for large language models.
In this project we fine-tune a pre-trained model for sentiment analysis using Hugging Face.
In this article we will look in a bit more detail at what you might need to do to fine-tune a pre-trained model for text similarity using Hugging Face
In this article we will look in a bit more detail at what you might need to do to prepare your data for fine-tuning a pre-trained model for text similarity using Hugging Face
In this non-technical article we describe the basics of how transformer models work, the underlying technology behind ChatGPT and most of the recent advances in AI.
Here we are going to use the Reformer aka the efficient Transformer to create a more advanced conversational chatbot. It will learn how to understand context to better answer questions and it will also know how to ask questions if it needs more info, which could be useful for customer service applications.
In this post we will explore Reversible Residual Networks and see how they can be used to improve Transformer models.
Here we look at how to make transformers more efficient using Reversible Layers and Locality Sensitive Hashing (LSH).
In this article, we will fine-tune a model using Hugging Face transformers to create a better chatbot for question answering.
We will use Hugging Face transformers to download and use the DistilBERT model to create a chatbot for question answering.
We implement the Text-To-Text Transfer Transformer model (better known as T5), a versatile model which can perform a wide variety of NLP tasks.
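The article implements the model itself; as a quick point of comparison, this sketch runs a pre-trained T5 via Hugging Face transformers, where task prefixes select the behaviour.

```python
# A minimal sketch of T5's text-to-text interface: different task
# prefixes drive different NLP tasks with the same model.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

for prompt in [
    "translate English to German: The house is wonderful.",
    "summarize: The tower is 324 metres tall, about the same height as an "
    "81-storey building, and the tallest structure in Paris.",
]:
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```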
Text summarization is an important task in natural language processing. In this article we will create a transformer decoder model to perform text summarization.
In this article we’ll explore the transformer decoder which is the architecture behind GPT-2 and see how to implement it with trax.
In this article we explore the three ways of attention (encoder-decoder attention, causal attention, and bi-directional self-attention) used in transformer NLP models and introduced in the 2017 paper Attention Is All You Need, and see how to implement the latter two with dot-product attention.
The 2017 paper Attention Is All You Need introduced the Transformer model and scaled dot-product attention, sometimes also called QKV (Queries, Keys, Values) attention. In this article we’ll implement a simplified version of scaled dot-product attention and replicate word alignment between English and French, as shown in the earlier paper by Bahdanau et al. (2014).
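The core computation is compact enough to sketch in a few lines of numpy: the attention weights are a row-wise softmax of QKᵀ/√d_k, applied to V; the toy shapes below are illustrative.

```python
# A minimal numpy sketch of scaled dot-product attention:
# Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V, weights      # weighted sum of the values

# Toy example: 2 queries attending over 3 key/value pairs of dimension 4.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.round(2))  # each row sums to 1
```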
The attention mechanism is behind some of the recent advances in deep learning using the Transformer model architecture. In this article we look at the first attention mechanism, proposed in a paper by Bahdanau et al. (2014), used to improve seq2seq models for language translation.
In this project we will create our own human workforce, a human task UI, and then define the human review workflow to perform data labeling for an ML task.
AWS SageMaker offers many options for deploying models. In this project we will create an endpoint with two variants of a text classification model, splitting the traffic between them. Then, after testing and reviewing the endpoint performance metrics, we will shift the traffic to one variant and configure it to autoscale.
When training ML models, hyperparameter tuning is a step taken to find the best-performing model. In this article we will apply a random search algorithm for automated hyperparameter tuning to train a BERT-based natural language processing (NLP) classifier. The model analyzes customer feedback and classifies the messages into positive, neutral, and negative sentiments.
In this project we train and deploy a BERT-based text classifier using AWS SageMaker Pipelines, and describe how this can help with MLOps to provide the most efficient path to production for training, deploying, and maintaining machine learning models at scale.
We train a text classifier using a variant of the BERT deep learning model architecture called RoBERTa (a Robustly Optimized BERT Pretraining Approach), within a PyTorch model run as a SageMaker Training Job.
We will prepare to train a BERT-based natural language processing (NLP) model by converting review text into machine-readable features used by BERT. We will configure an Amazon SageMaker processing job to perform this feature transformation.
In this article we will use the AWS SageMaker BlazingText built-in deep learning model to predict the sentiment for customer text reviews. BlazingText is a variant of FastText which is based on word2vec.
We will use Amazon Sagemaker Autopilot to automatically train a natural language processing (NLP) model. The model will analyze customer feedback and classify the messages into positive (1), neutral (0) and negative (-1) sentiment.
In Data Science and machine learning, bias can be present in data before any model training occurs. In this article we will analyze bias on a dataset, generate and analyze bias reports, and prepare the dataset for the model training.
In this project we will explore text reviews for clothing products using tools from the cloud data science service AWS Sagemaker to load and visualise the data and to gain key insights from it.
In this project we will be using a deep learning model to classify satellite images of the Amazon rainforest. Here the main objective is not to get the best results for this task, but rather to use this dataset to illustrate the use of the fastai deep learning library.
Deep learning and AI are powering some of the most amazing recent advances in text and natural language processing (NLP) applications, such as GPT-3, ChatGPT and DALL-E, but these often require specialist deep learning resources. With machine learning (ML) it’s possible to create useful NLP applications for businesses without using AI and deep learning.
What’s the difference between machine learning and deep learning? In this article we will explain the differences between machine learning & deep learning, and will illustrate this by building a machine learning and a deep learning model from scratch.
In this project I will create a model that can associate short text phrases with the correct US patent classification.
This article covers lesson 1 of the fastai 2022 course, where I will create a model that can identify different types of galaxies. I will also highlight some notable differences from earlier versions of the fastai course and library.
In this project we will build a model to predict the 10-year risk of death of individuals from the NHANES I epidemiology dataset
In this project we will build a Prognostic risk score model for retinopathy in diabetes patients using logistic regression
In this project we will be working with the results of the X-ray classification model for diseases we developed in the previous article, and evaluate the model performance on each of these classes using various classification metrics.
In this project, I will explore medical image diagnosis by building a state-of-the-art deep learning chest X-ray classifier using Keras that can classify 14 different medical conditions.
In this article we will look at the history of the International Classification of Diseases (ICD) system, which has been developed collaboratively so that the medical terms and information in death certificates can be grouped together for statistical purposes. In practical examples we will look at how to extract ICD-9 codes from the MIMIC-III database and visualise them.
In this article we will further explore the MIMIC-III critical care Electronic Health Record Dataset, looking at how we examine clinical outcomes as well as extracting individual patient data.
In this article we will look at the MIMIC-III Electronic Health Record (EHR) database. In particular, we will learn about the design of this relational database, and what tools are available to query, extract and visualise descriptive analytics.
In this article we will look at MIMIC-III, which is the largest publicly available Electronic Health Record (EHR) database for benchmarking machine learning algorithms.
Epidemiological studies can provide valuable insights about a disease; however, a study can yield biased results for many different reasons. In this article we explore some of these factors and provide guidance on how to deal with bias in epidemiological research.
In this article, we will learn about the main epidemiological study designs, including cross-sectional and ecological studies, case-control and cohort studies, as well as the more complex nested case-control, case-cohort designs, and randomised controlled trials.
In this article we look at the fundamental tools of Epidemiology (the study of disease) essential to conduct studies such as measures to describe the frequency of disease, how to quantify the strength of an association, how to describe different strategies for prevention, how to identify strengths and weaknesses of diagnostic tests, and when a screening programme may be appropriate.
In this project I develop a deep learning CNN model to predict Alzheimer’s disease using 3D MRI medical images of the Hippocampus region of the brain.
Utilizing a synthetic Diabetes patient dataset, we will create a deep learning model trained on EHR data (Electronic Health Records) to find suitable patients for testing a new Diabetes drug.
In this project, I will analyze data from the NIH Chest X-ray 2D Medical image dataset and train a deep learning model to classify a given chest x-ray for the presence or absence of pneumonia.
In Python Power Tools for Data Science articles I look at Python tools that help automate or simplify common tasks a Data Scientist would need to perform. In this article I look at the PyCaret Anomaly Detection module and see how this can help automate this process.
Singular Value Decomposition (SVD) is a method from linear algebra widely used across science and engineering. In this article we will introduce the concept and show how it can be used for Topic Modelling in Natural Language Processing (NLP).
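As a rough sketch of the technique, the example below factorises a tiny TF-IDF term-document matrix with truncated SVD to recover two latent topics; it uses scikit-learn rather than the article’s exact code, and the corpus is a toy.

```python
# A minimal sketch of SVD-based topic modelling (latent semantic analysis).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

corpus = [
    "doctors treat patients in the hospital",
    "the hospital hired new nurses and doctors",
    "the striker scored a goal in the football match",
    "fans cheered as the team won the match",
]

# Build a term-document matrix, then use truncated SVD to find 2 latent topics.
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(corpus)
svd = TruncatedSVD(n_components=2, random_state=0)
doc_topics = svd.fit_transform(X)

terms = tfidf.get_feature_names_out()
for i, component in enumerate(svd.components_):
    top_terms = [terms[j] for j in component.argsort()[-3:][::-1]]
    print(f"topic {i}: {top_terms}")  # one medical topic, one football topic
```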
In Python Power Tools for Data Science articles I look at Python tools that help automate or simplify common tasks a Data Scientist would need to perform. In this article I look at how PyCaret can help automate the machine learning workflow.
In this article we will introduce Network Analysis, and use it to study the structure and relationships within a Karate Club.
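As a taste, networkx ships Zachary’s Karate Club as a built-in example graph, so the basic analysis can be sketched in a few lines; the measures shown are illustrative.

```python
# A minimal sketch of exploring the Karate Club network with networkx.
import networkx as nx

G = nx.karate_club_graph()
print(G.number_of_nodes(), "members,", G.number_of_edges(), "friendships")

# Degree centrality: members with the most direct connections.
centrality = nx.degree_centrality(G)
hubs = sorted(centrality, key=centrality.get, reverse=True)[:3]
print("most connected members:", hubs)  # nodes 33 and 0 are the two leaders

# Each node records which faction the member joined after the club split.
print({G.nodes[n]["club"] for n in G})  # {'Mr. Hi', 'Officer'}
```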
In this article we will look at how Class Activation Maps (CAMs) can be used to understand and interpret the decisions that Convolutional Neural Networks (CNNs) make.
In this article we will cover building a basic neural network from the most basic elements.
In this article we will look at methods to improve gradient descent optimisation for training neural networks beyond SGD, including momentum, RMSProp and Adam.
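For a feel of how the update rules differ, here is a minimal numpy sketch of momentum, RMSProp and Adam on a toy quadratic loss; the hyperparameters are common textbook defaults, not the article’s exact code.

```python
# A minimal sketch contrasting three optimiser update rules on f(w) = w**2.
import numpy as np

def grad(w):  # gradient of the toy loss f(w) = w**2
    return 2 * w

# SGD with momentum: accumulate a velocity term that smooths the updates.
w, v = 5.0, 0.0
for _ in range(100):
    v = 0.9 * v + grad(w)
    w -= 0.1 * v
print("momentum:", round(w, 4))

# RMSProp: scale each step by a running average of recent gradient magnitude.
w, s = 5.0, 0.0
for _ in range(100):
    g = grad(w)
    s = 0.9 * s + 0.1 * g ** 2
    w -= 0.1 * g / (np.sqrt(s) + 1e-8)
print("rmsprop:", round(w, 4))

# Adam: combine momentum and RMSProp, with bias correction for the averages.
w, m, s = 5.0, 0.0, 0.0
for t in range(1, 101):
    g = grad(w)
    m = 0.9 * m + 0.1 * g
    s = 0.999 * s + 0.001 * g ** 2
    m_hat, s_hat = m / (1 - 0.9 ** t), s / (1 - 0.999 ** t)
    w -= 0.1 * m_hat / (np.sqrt(s_hat) + 1e-8)
print("adam:", round(w, 4))  # all three drive w toward the minimum at 0
```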
In this article we will build ResNet-style convolutional image networks from scratch using PyTorch, and see why they are key to building deeper neural networks.
In this article we will look at how to build custom applications in the fastai library, by looking at how current fastai image model applications are actually built.
In this article we are going to look at building a convolutional neural network from scratch using PyTorch, as well as one-cycle training and batch normalisation.
In this article we will look at how we build an LSTM language model from scratch that is able to predict the next word in a sequence of words. This covers all the details of how to build the AWD-LSTM architecture.
In this article we will introduce and explore the fastai mid-level API, in particular its data preparation features.
In this article we are going to create a deep learning text classifier using the fastai library and the ULMFiT approach.
In this article we will build a collaborative filtering model from scratch, using pure PyTorch and some support from the fastai deep learning library.
In this article we are going to look at some of the most advanced techniques available in 2021 for training deep learning vision models.
In this project I look at applying AI to recognising buildings, woodlands & water areas from satellite images
Many of the greatest challenges the world faces today are global in nature; AI combined with satellite imagery is a powerful technology that holds huge potential for helping us solve many of the problems we face.
AI systems are being used everywhere, but often little work is done to gain a deeper understanding of how and why they work. We have so much to gain from trying to look deeper inside these AI systems to understand them better.