April 27, 2023

Introducing tinyChat, the instruction-tuned LLM that’s less than 1% the size of GPT-3.5

Generative AI has seen rapid growth thanks to recent developments in open source machine learning. One major factor is the creation of models trained on large internet corpora, such as the Pile, along with the availability of new instruction datasets, such as Alpaca and Databricks 15k. Additionally, researchers have developed new training techniques like Low-Rank Adaptation of Large Language Models (LoRA) from Microsoft, which allows previously open-sourced models to be fine-tuned on domain-specific tasks, reducing the need for expensive computational resources when adapting models to new tasks.

Thanks in part to these advancements, we are introducing tinyChat, an instruction-tuned large language model with under 1B parameters, released as open source under the Apache 2.0 license. To put this into context, tinyChat (770M parameters plus a 2.4M-parameter LoRA adapter) is less than 1% the size of GPT-3.5 (176B parameters). tinyChat is based on Google’s Flan-T5-Large, a 770M-parameter model.

This model was fine-tuned on the Databricks 15k dataset recently open sourced by Databricks, which contains instruction-following records generated by thousands of Databricks employees across categories such as brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization. By fine-tuning Flan-T5 on this dataset, we were able to demonstrate new capabilities, such as summarization and creative text generation, that were previously not possible with Flan-T5 alone.
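
To make the approach concrete, here is a minimal sketch of LoRA fine-tuning with Hugging Face transformers, peft, and datasets. It assumes the Databricks 15k data is the databricks/databricks-dolly-15k dataset on the Hugging Face Hub; the prompt format, LoRA settings (r=8 yields an adapter on the order of the 2.4M parameters mentioned above), and training hyperparameters are illustrative assumptions, not the exact tinyChat recipe.

```python
# A minimal LoRA fine-tuning sketch, not the exact tinyChat training configuration.
from datasets import load_dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

base = "google/flan-t5-large"  # 770M-parameter base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(base)

# Attach a small LoRA adapter instead of updating all base weights.
lora = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=8, lora_alpha=16,
                  lora_dropout=0.05, target_modules=["q", "v"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a few million trainable parameters

def preprocess(example):
    # Assumed prompt format: instruction plus optional context, response as the target.
    prompt = example["instruction"]
    if example["context"]:
        prompt += "\n\nInput: " + example["context"]
    tokens = tokenizer(prompt, truncation=True, max_length=512)
    tokens["labels"] = tokenizer(text_target=example["response"],
                                 truncation=True, max_length=256)["input_ids"]
    return tokens

train_data = load_dataset("databricks/databricks-dolly-15k", split="train").map(
    preprocess, remove_columns=["instruction", "context", "response", "category"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="tinychat-lora",
                                  per_device_train_batch_size=8,
                                  learning_rate=1e-4, num_train_epochs=3,
                                  logging_steps=50),
    train_dataset=train_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
model.save_pretrained("tinychat-lora")  # writes only the adapter weights
```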

While not as performant as GPT-3.5 given its size, tinyChat shows ChatGPT-like qualities and performs reasonably well on a variety of NLP tasks such as summarization, question answering, and sentiment analysis.

Links to the repo and model:

  • GitHub Repo
  • Huggingface Model Hub

Why make a smaller LLM?

There are several reasons we believe that smaller LLMs, or small language models (SLMs) like tinyChat, can play a crucial role in the advancement of generative AI.

  1. Smaller models can function as co-agents alongside larger models. While models like ChatGPT can handle sophisticated tasks, they can delegate repetitive tasks to smaller models, leading to increased computational efficiency and cost savings. Projects like LangChain and HuggingGPT are already exploring this approach, and we believe tinyChat can contribute to these efforts.
  2. Smaller architectures serve as a catalyst for exploring innovative architectures and datasets. Although tinyChat v1 utilized an existing open-source model and data, we aim to develop future versions of tinyChat that incorporate unique architectures and datasets specifically tailored for smaller models.
  3. The development of small, high-performance models can promote the democratization of generative AI and enhance accessibility. Our vision is a world where AI is transparent, open-source, and designed for public benefit. Smaller models could enable tasks to be executed on local devices like mobile phones rather than in centralized cloud infrastructure.
  4. Smaller models offer increased energy efficiency, possibly making them more environmentally friendly. By developing and employing smaller models for a variety of tasks, we can reduce energy consumption for both training and inference, thereby lowering the environmental impact of AI while still delivering powerful capabilities.

Potential Use Cases

Small language models like tinyChat offer valuable advantages in various applications, particularly in complementing larger LLMs, mobile and IoT environments, and large-scale text extraction. Some key use cases for tinyChat include:

  1. Task Distribution: Developers can employ tinyChat for simpler tasks such as text completion or sentiment analysis, while reserving larger LLMs for more intricate undertakings, such as advanced language generation or in-depth textual analysis. This balanced approach allows for efficient resource utilization and cost savings.
  2. Mobile and IoT Applications: Small language models like tinyChat can be integrated into mobile apps, IoT devices, or embedded systems for tasks such as grammar checking, simple translations, or personalized content recommendations. In resource-constrained environments, tinyChat can enable natural language processing without the need for a constant internet connection or powerful hardware, broadening the scope of interaction capabilities across devices.
  3. Large-Scale Text Extraction: tinyChat can be employed to analyze and extract information from vast volumes of text data, such as news articles, research papers, or social media posts. By generating summaries, identifying trends, or extracting key insights, tinyChat can facilitate the efficient processing of text data, allowing users to quickly grasp essential information without having to sift through an overwhelming amount of content.

By focusing on the unique advantages that small language models like tinyChat provide, developers can unlock new possibilities for AI applications, driving progress in natural language processing while empowering users.

What can tinyChat do?

tinyChat improves on Flan-T5’s capabilities in tasks like summarization and creative writing while retaining its question-answering ability. Like Flan-T5, it performs well on a variety of NLP tasks; the examples below compare the two models on question answering, summarization, and creative text generation.

Question Answering

In the example below, the prompt asks which country the event took place in, even though the country’s name never appears in the input. Both models correctly infer that the country is Morocco.

Prompt: What country did the event take place? Provide only the name of the COUNTRY.

Input: The event took place in Casablanca.

Model responses:
Flan-T5: Morocco
tinyChat: Morocco
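
For reference, a minimal inference sketch is shown below. It assumes a LoRA adapter saved locally as "tinychat-lora" (as in the fine-tuning sketch above); the exact prompt template tinyChat expects is an assumption.

```python
# A minimal inference sketch; "tinychat-lora" is the locally saved adapter
# from the fine-tuning sketch above, and the prompt format is an assumption.
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(base)
model = PeftModel.from_pretrained(model, "tinychat-lora")  # attach the LoRA adapter

prompt = ("What country did the event take place? Provide only the name of the COUNTRY.\n\n"
          "Input: The event took place in Casablanca.")
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))  # expected: Morocco
```

The same loading and generation setup applies to the summarization and creative writing examples that follow.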

Summarization

In the following example, the models are prompted to summarize the Wikipedia entry for the video game The Witcher 3: Wild Hunt. Flan-T5 simply repeats the title, while tinyChat provides a reasonable summary.

Model responses:
Flan-T5: The Witcher 3: Wild Hunt
tinyChat: The Witcher 3: Wild Hunt is an action role-playing game developed and published by CD Projekt Red. The game is the third in the series and takes place in a magical world, where Geralt is the first witcher character. Players battle monsters, build powerful spells and use a series of spells to protect himself and his adopted daughter from the Wild Hunt. The game has set the bar for an open world game series.

Creative Text Generation

Both models were asked to write a poem. Compared to Flan-T5, tinyChat produced longer and more creative output.

Prompt: Write a poem

Model responses:
Flan-T5: i love chocolate so i got 2 bags of them
tinyChat: This poem is so beautiful - the beauty of the nature, the richness of the trees, the silence of the birds - everything is so peaceful, the way nature flows and the time always reaches such a happy destination. I think it's beautiful in its simplicity and beauty, the beauty of life, the way nature flows. As we look out across the bright fields and trees, it's so simple and yet such a powerful force that changes the whole environment.

Benchmarking tinyChat

While benchmarking LLMs remains an active area of research, we can report metrics on basic NLP tasks. We used EleutherAI’s lm-evaluation-harness to benchmark tinyChat and compare it against other open-source models; see the Huggingface model card for the metrics.
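
As a rough illustration, the harness can be driven from Python along the lines below. The task list is illustrative, and backend names such as "hf-seq2seq" vary between harness versions, so treat this as an assumption rather than the exact setup behind the reported metrics.

```python
# A rough benchmarking sketch using EleutherAI's lm-evaluation-harness.
# Backend name and task selection are illustrative assumptions.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-seq2seq",                            # Hugging Face seq2seq (T5-style) backend
    model_args="pretrained=google/flan-t5-large",  # swap in the tinyChat checkpoint to compare
    tasks=["boolq", "piqa", "hellaswag"],          # illustrative task selection
    batch_size=8,
)
print(results["results"])
```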

Where we go from here

The release of tinyChat is just the beginning of our efforts to democratize access to NLP capabilities and promote responsible use of AI. In the future, we aim to focus on several key areas of development:

  1. Exploring Efficient Model Architectures: While tinyChat has demonstrated promising results, its performance may not be on par with larger models. To enhance the accuracy of smaller models like tinyChat, we will continue to investigate new efficient model architectures and improve training techniques.
  2. Addressing Biases and Toxicity: As with any language model, tinyChat may exhibit biases and produce toxic outputs. We are committed to mitigating these issues by developing new datasets and training methods that emphasize fairness and inclusivity.
  3. Creating New Datasets: The availability of high-quality, large-scale datasets is crucial for training accurate language models. We will continue working on generating new datasets that capture the complexity and diversity of human language, enriching the resources available for NLP development.
  4. Collaborating with the Open-Source Community: We believe that open-source development is vital to advancing the field of AI responsibly and transparently. By collaborating with the open-source community, we can leverage collective expertise to enhance tinyChat and other open-source models.
  5. Expanding Application Domains: We will explore new domains in which tinyChat can be applied, such as finance, legal, and government. By creating models tailored to specific domains, we can improve their accuracy and usefulness in real-world applications.

Our overarching goal is to develop efficient and responsible AI models that can be readily utilized by developers and organizations. We are confident that by working in tandem with the open-source community, we can drive progress in the field of AI in a way that benefits everyone.

Responsible Use

While tinyChat could offer many promising applications, it is essential to recognize that the model is far from perfect and is prone to hallucinations like many LLMs currently available today. As a result, tinyChat is primarily intended for research and experimentation purposes at this stage. We advocate for the responsible use of AI and encourage users to follow Google’s Responsible AI Practices when working with tinyChat or any other language models.

Acknowledgements

We would like to express our gratitude to the open source community, Databricks, and Huggingface. It is through their contributions that we were able to create tinyChat.
