9 min read 7 July 2023

Benefit from generative AI and large language models

Large language models such as ChatGPT are disrupting most markets. What do they offer and how can you benefit from them?

Tomasz Dudek, AWS Machine Learning Hero, Head of Data & AI

There is a surge of interest in generative AI, especially in large language models (LLMs), as they continue to advance the field of artificial intelligence. Major players are making strides with models such as Anthropic’s Claude, Google’s Bard and Baidu’s Ernie, while OpenAI’s GPT-4 and ChatGPT are gaining significant traction almost everywhere. The boom in LLMs can be attributed to their human-like language understanding and generation capabilities. Suddenly, we can create machine learning models that answer questions and write text that sounds as if a person wrote it.

This progress has sparked attention as businesses and researchers realize the potential applications across various industries. What’s even more impressive is that, unlike most current ML algorithms, LLMs often don’t need any further tuning for a particular task. You just use the capabilities they have already accumulated. The only thing they need is a well-crafted prompt explaining in detail what you want to achieve. For even better results, you may provide (literally) two or three examples of the output you want to get.
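The idea of a prompt with a few examples can be sketched in a few lines of Python. This is a minimal illustration, not a vendor API: the function name `build_few_shot_prompt`, the `Input:`/`Output:` layout and the sample slogans are all assumptions made for the example. The resulting string is what you would send to whichever model you use.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a prompt: task description, a few solved examples, then the new input."""
    parts = [instruction.strip(), ""]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}")
        parts.append(f"Output: {example_output}")
        parts.append("")
    # End with the unanswered case so the model continues with its own output.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical task and examples, invented for illustration.
prompt = build_few_shot_prompt(
    instruction="Rewrite each product name as a short, friendly slogan.",
    examples=[
        ("Noise-cancelling headphones", "Silence the world, hear your music."),
        ("Insulated water bottle", "Cold for hours, wherever you go."),
    ],
    query="Ergonomic office chair",
)
print(prompt)
```

Two or three examples are usually enough to show the model the format and tone you expect; the last, unanswered `Input:` is the case you actually want solved.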

From automating content creation to improving customer support with AI-powered chatbots, LLMs have the power to streamline processes, reduce costs and unlock new opportunities. Their transformative potential has made them a hot topic among technology enthusiasts, business leaders and the general public, fueling the excitement and anticipation around their continued development and adoption. While the future of LLMs is unclear, the technology seems so powerful that it would be unwise to ignore it or dismiss it as a short-term fad.

How large language models (LLMs) work: the basics

LLMs follow the standard machine learning lifecycle: you train a very large neural network on clusters of GPUs using massive training datasets, often containing terabytes of text from the internet, books, magazines, articles and other sources. The primary goal of LLMs is to generate human-like language.

However, you can trick them into solving various language-related tasks by crafting a well-designed prompt (the textual input for the model). If the prompt asks for a snippet of programming code, the LLM will act as a programmer and write it. If the prompt asks whether a given sentence is offensive or not, the LLM will judge accordingly. If the prompt is a logical text puzzle, the LLM will attempt to solve it. If the prompt says that the LLM should generate an ad, it will write slick, flashy headlines for your article. The possibilities are endless.

You can also extend LLMs’ capabilities by letting them use tools such as internet search, a calculator, external APIs or databases. Typically, the model requests a tool call, and the surrounding application executes it and passes the result back to the model.
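A rough sketch of how an application might wire a tool to a model is shown here. Everything in it is an assumption for illustration: `fake_llm` is a hard-coded stand-in for a real model call, and the `TOOL <name>: <argument>` / `FINAL <answer>` text convention is invented for this example rather than taken from any specific product.

```python
import ast
import operator

# Arithmetic operators the calculator tool is allowed to evaluate.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression):
    """Evaluate a basic arithmetic expression without using eval()."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError(f"Unsupported expression: {expression!r}")
    return evaluate(ast.parse(expression, mode="eval").body)


TOOLS = {"calculator": calculator}


def fake_llm(conversation):
    """Hypothetical stand-in for a real model call, hard-coded for this demo."""
    if not any(line.startswith("TOOL_RESULT") for line in conversation):
        return "TOOL calculator: 1234 * 5678"
    result = conversation[-1].split(": ", 1)[1]
    return f"FINAL The product is {result}."


def answer_with_tools(question, llm=fake_llm, max_steps=5):
    conversation = [f"USER: {question}"]
    for _ in range(max_steps):
        reply = llm(conversation)
        if reply.startswith("FINAL "):
            return reply[len("FINAL "):]
        # Parse "TOOL <name>: <argument>", run the tool, and report the result back.
        name, argument = reply[len("TOOL "):].split(": ", 1)
        conversation.append(f"TOOL_RESULT {name}: {TOOLS[name](argument)}")
    raise RuntimeError("No final answer within the step limit")


print(answer_with_tools("What is 1234 * 5678?"))
```

The model never does the arithmetic itself; it only decides that the calculator is needed and then phrases the final answer around the tool's result.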

What can you do with large language models?

Let’s explore some key applications in greater detail. These use cases have long been seen as “partially solvable” with various Natural Language Processing (NLP) algorithms, but the difference between the previous generation of NLP applications and the recent one, powered by LLMs, is astonishing. On many tasks, we went from mediocre results to near-human performance, seemingly overnight!

Content generation

LLMs can automatically generate high-quality content, from blog posts and marketing copy to product descriptions and email templates. Given just a brief description of what you’d like to get, the model generates contextually relevant content, saving businesses time and resources. The advertising and media industries can benefit significantly from this capability, using it to create engaging content that resonates with their target audience.

Example Scenario: A marketing agency could use an LLM to draft persuasive ad copy for a new product launch, reducing the time spent on brainstorming and writing, while maintaining consistent messaging.

A diagram showcasing the example above. A person or an app asks LLM to create an ad for LinkedIn and LLM generates it perfectly.

Sentiment analysis

Sentiment analysis determines the sentiment behind a piece of text, classifying it as positive, negative, or neutral. This capability is useful for analyzing customer feedback, social media posts, or product reviews. Industries like retail, hospitality and customer service can leverage sentiment analysis to gauge customer satisfaction and adapt their strategies accordingly. And, since LLMs understand the nuances of human language, they don’t need any fine-tuning or large datasets of examples for most use cases.

Example Scenario: A hotel chain could use sentiment analysis to process customer reviews, identifying trends and areas for improvement, ultimately enhancing the guest experience.

A diagram showcasing the example above. A person or an app asks LLM to judge whether a given hotel review (“I love this hotel”) is positive, negative or mixed and LLM correctly answers “positive”.
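The hotel review example could look roughly like this in code. It is a sketch under assumptions: the prompt wording, the label set and both function names are made up for illustration, and the actual call to a model is left out. The parsing step matters because a model replies with free text, so the application should check that the reply is one of the labels it expects.

```python
ALLOWED_LABELS = ("positive", "negative", "neutral")


def build_sentiment_prompt(review):
    """Zero-shot prompt: no training data, just an instruction and the text."""
    return (
        "Classify the sentiment of the hotel review below.\n"
        f"Answer with exactly one word: {', '.join(ALLOWED_LABELS)}.\n\n"
        f"Review: {review}\n"
        "Sentiment:"
    )


def parse_sentiment(model_reply):
    """Normalize the model's free-text reply to an allowed label, else None."""
    words = model_reply.strip().lower().split()
    if not words:
        return None
    first_word = words[0].strip(".,!:;\"'")
    return first_word if first_word in ALLOWED_LABELS else None


print(build_sentiment_prompt("I love this hotel"))
print(parse_sentiment(" Positive."))
```

Returning `None` for an unexpected reply lets the caller retry or route the review to a human instead of silently storing a wrong label.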

Summarization

LLMs can condense lengthy documents into concise summaries, extracting the most critical information without losing context. This capability benefits industries that deal with large volumes of text, such as legal, finance and research.

Example Scenario: A law firm could use an LLM to summarize lengthy contracts or case documents, allowing lawyers to quickly grasp key points without reading the entire text, thus saving time and increasing efficiency.

A diagram showcasing the example above. A person or an app asks LLM to summarize a Framework Agreement between Chaos Gears and Polar Bears Hotel Chain and LLM does so.

Chatbots

Chatbots powered by LLMs can understand and respond to user queries with human-like proficiency. Industries such as customer support, healthcare and e-commerce can use chatbots to streamline communication, reduce response times and provide personalized experiences.

Example Scenario: An e-commerce company could deploy an LLM-powered chatbot to assist customers with order tracking, product inquiries and returns, reducing the burden on human customer support agents and increasing customer satisfaction.

A diagram showcasing the example above. An app asks LLM to hold a conversation with a customer of an ecommerce website. The customer asks “where is my package” and LLM responds “may I get your package number”.

Retrieval-augmented generation and semantic search

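LLMs can be used to create systems that efficiently retrieve specific information from a vast knowledge base. In retrieval-augmented generation, the system first searches the knowledge base for passages relevant to the question and then passes them to the LLM as part of the prompt, so the answer is grounded in your own data. Industries like healthcare, finance and education can benefit from this capability, offering users quick access to accurate, factual information about their customers, available via natural language!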

Example Scenario: A medical information platform could use an LLM-powered question-answering system to help users find relevant information about medications, symptoms and treatments, providing a valuable resource for patients and healthcare professionals alike.

A diagram showcasing the example above. A person or an app asks LLM to show all patients attacked by a polar bear and the LLM queries the databases and uses other tools to find all 10 cases relevant to the query.
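A toy version of retrieval-augmented generation can be sketched as follows. It is deliberately simplified: production systems usually rank passages with learned text embeddings and a vector database, while this sketch swaps in plain word-count cosine similarity so that it runs with the standard library only. The function names and the tiny knowledge base are assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, documents, top_k=2):
    """Return the top_k documents most similar to the question."""
    question_vector = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(question_vector, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(question, documents, top_k=2):
    """Put the retrieved passages in front of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer the question using only the context below. "
        "If the context is not enough, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


# Tiny illustrative knowledge base.
KNOWLEDGE_BASE = [
    "Ibuprofen is a nonsteroidal anti-inflammatory drug used to reduce pain and fever.",
    "The hotel breakfast is served from 7 to 10 in the morning.",
    "Paracetamol is commonly used to treat mild pain and fever.",
]

print(build_rag_prompt("Which drug can reduce fever and pain?", KNOWLEDGE_BASE))
```

Only the two medication passages end up in the prompt; the unrelated breakfast note is filtered out before the model ever sees it.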

How to get an LLM: leveraging models from third-party vendors

Using an existing, already trained LLM is the most accessible approach for most companies. Several established vendors like Amazon, OpenAI or Anthropic let you use their LLMs in the form of APIs. You don’t need to fine-tune existing models for most cases. All you need to do is craft a prompt that explains the problem you’d like to solve. A pre-trained LLM will do the rest. Prompt engineering is a newly established discipline that tackles the question “How to write an excellent prompt for the model”.

By integrating with these APIs, businesses can harness the power of advanced models without building and maintaining their own infrastructure. This approach is simple to use and quick to develop. Of course, you’re then at the mercy of a 3rd party API, you pay for every token or character sent to the model, and you sometimes share your inputs and outputs with the vendor, which poses a risk of data leaks.

A diagram showcasing that 3rd party vendors of LLMs are just APIs that your existing system calls. The vendor may also collect your data, such as the history of your prompts.

Note that some 3rd party commercial APIs allow you to fine-tune an LLM for your particular use case. Some of them also ensure that your data is not shared with anyone else. Read the Terms of Service before you send anything confidential though!

How to train and deploy an LLM: current limitations

Of course, some companies might want to create and/or host their own LLM entirely on their own infrastructure. This approach lets them ensure the privacy and security of their data. Their product or service is also not dependent on a 3rd party API. Those are both valid arguments for why a company would want to build everything itself.

Unfortunately, training a custom LLM entirely from scratch requires enormous resources in terms of computing power, data and talent. Only the largest, specialized companies can afford to do it at the moment.

What you can do, however, is use open source LLMs like Falcon, Open Assistant or FLAN-T5 and deploy them using technologies such as Amazon SageMaker. While most open source models offer lower quality than the commercial ones (at the time of writing) and not all of them are licensed for commercial use, this option is developing rapidly. Open source models are catching up in terms of quality, and they are getting smaller in terms of the compute required.

A diagram showcasing that if you self-host an LLM then you’re no longer at the mercy of a 3rd party API and you remain fully in control of your data.

Unlike proprietary API-based models, open source models more often require fine-tuning to a particular task or to the domain language of your company. Thankfully, parameter-efficient methods such as LoRA or Prompt Tuning have emerged and are quite simple to use, as long as you have a proper dataset to fine-tune the model.
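To see why a method like LoRA is cheap, it helps to count weights. LoRA keeps a pretrained weight matrix of size d x k frozen and trains only a low-rank update made of two small matrices, B (d x r) and A (r x k). The arithmetic below compares the two; the layer size and rank are illustrative assumptions, not figures from a specific model.

```python
def lora_trainable_parameters(d, k, r):
    """Trainable weights when a d x k matrix is adapted with a rank-r update B @ A."""
    return d * r + r * k  # B is d x r, A is r x k


def full_finetune_parameters(d, k):
    """Trainable weights when the whole d x k matrix is updated."""
    return d * k


d, k, r = 4096, 4096, 8  # illustrative sizes, not taken from a specific model
full = full_finetune_parameters(d, k)
lora = lora_trainable_parameters(d, k, r)
print(f"full fine-tuning: {full:,} weights")
print(f"LoRA (rank {r}):   {lora:,} weights ({full // lora}x fewer)")
```

For this single layer the trainable part shrinks from about 16.8 million weights to 65,536, which is why fine-tuning with such methods fits on far more modest hardware.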

This self-hosted approach allows you to build products and features on top of LLMs while ensuring that your data remains private and you’re not at the mercy of 3rd party vendors.

Potential dangers of large language models

Despite their numerous benefits, LLMs also come with potential drawbacks and risks. These include hallucinations, prompt injections, inaccurate results, a lack of concrete reasoning, black-box decision-making, privacy concerns and training-related challenges. There’s even a proposal for an OWASP Top 10 list for LLMs! As businesses adopt LLMs, it is essential to understand and address these issues to ensure responsible and ethical AI deployment.

While LLMs can generate human-like content, they are fundamentally statistical models that produce the most probable response for a given query or task. Since they are trained on internet data, how often something appears in that data influences the model’s output. Furthermore, LLMs are not updated in real time, which can lead to outdated information.
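What "the most probable response" means can be shown with a toy example. A model assigns a score to every candidate next word, the scores are turned into probabilities, and a word is picked. The scores below are invented for illustration, and always choosing the top word is only one strategy; deployed systems often sample from the distribution instead.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits.values())
    # Subtracting the maximum keeps exp() numerically stable.
    exps = {token: math.exp(score - highest) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def most_probable_token(logits):
    """Greedy choice: return the candidate with the highest probability."""
    probabilities = softmax(logits)
    return max(probabilities, key=probabilities.get)


# Invented scores for the word following "The capital of France is".
next_token_logits = {"Paris": 5.0, "Lyon": 2.0, "London": 1.0}

for token, probability in softmax(next_token_logits).items():
    print(f"{token}: {probability:.3f}")
print("chosen:", most_probable_token(next_token_logits))
```

Nothing in this procedure checks whether the chosen word is true; it is only the likeliest continuation given the scores, which is how a fluent but wrong answer can come out.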

Another limitation is that LLMs do not reason as effectively as humans, which can impact their ability to handle certain tasks. For instance, on their own they may struggle with complex mathematics, looking up current information on the internet, or other tasks requiring deeper understanding: they just don’t think! To address these challenges, tools like LangChain or ChatGPT Plugins are emerging to extend the capabilities of LLMs.

Finally, they’re often confidently wrong: an LLM rarely answers “I don’t know”. Instead, it gives you a plausible-sounding but incorrect answer!

A diagram showcasing that LLMs can be confidently wrong. A person or an app asks how many polar bears are on Mars and LLM responds “there are 42 polar bears on Mars”.

Large language models: unlocking new possibilities for businesses

The rapid rise of LLMs is revolutionizing artificial intelligence and bringing near-human levels of language understanding to various industries. With applications such as content generation, sentiment analysis, summarization, chatbots and question-answering systems, LLMs have the potential to streamline business processes, reduce costs and uncover new opportunities. However, it is crucial to be aware of their limitations, such as inaccurate results, privacy concerns and the lack of concrete reasoning to ensure responsible AI deployment.

As businesses explore the possibilities of LLMs, partnering with an experienced technology provider can help navigate the complexities and maximize the benefits. Chaos Gears, with its expertise in Data Engineering, Machine Learning and generative AI, is well-positioned to help your organization harness the power of LLMs to drive innovation and growth. To learn more about how large language models can benefit your business, or to discuss potential use cases and strategies, get in touch with Chaos Gears today.