Chinchilla AI
Chinchilla AI is a large language model developed by the research team at DeepMind and released in March 2022. It is claimed to outperform GPT-3.[1]
It considerably simplifies downstream use because it requires much less compute for inference and fine-tuning. Based on an analysis of previously trained language models, DeepMind determined that doubling the model size should be accompanied by doubling the number of training tokens, and this hypothesis guided the training of Chinchilla AI. For roughly the same training cost as Gopher, Chinchilla AI has 70B parameters and was trained on four times as much data.[1]
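A rough sketch of why this configuration is comparable to Gopher in cost, using the common back-of-the-envelope approximation that training compute is about 6 × parameters × tokens. Gopher's 280B parameters are cited in the references; its roughly 300B-token training set is a commonly reported figure assumed here, not a detail from this article.

```python
# Sketch: compare approximate training compute for Gopher and Chinchilla
# using the common approximation C ~ 6 * N * D (N = parameters, D = tokens).
# Gopher's ~300B training tokens is an assumed, commonly reported figure.

def approx_train_flops(params: float, tokens: float) -> float:
    """Approximate training compute in FLOPs as 6 * N * D."""
    return 6 * params * tokens

gopher_tokens = 300e9                        # assumed figure for Gopher
gopher_flops = approx_train_flops(280e9, gopher_tokens)
chinchilla_flops = approx_train_flops(70e9, 4 * gopher_tokens)  # 4x the data

print(f"Gopher:     {gopher_flops:.2e} FLOPs")
print(f"Chinchilla: {chinchilla_flops:.2e} FLOPs")
# A quarter of the parameters trained on four times the data costs roughly
# the same amount of compute under this approximation.
```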
Chinchilla AI achieves an average accuracy of 67.5% on the MMLU benchmark (Measuring Massive Multitask Language Understanding), about 7 percentage points higher than Gopher's score. As of January 12, 2023, Chinchilla AI was still in the testing phase.[2]
Chinchilla AI contributes to an effective training paradigm for large autoregressive language models under a limited compute budget. The Chinchilla team recommends doubling the number of training tokens for every doubling of model size, meaning that larger, higher-quality training datasets can lead to better results on downstream tasks, as in the sketch below.[3][4]
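A minimal sketch of that recommendation, assuming the token budget grows in direct proportion to the parameter count. The tokens-per-parameter ratio used here is an illustrative placeholder, not a value taken from this article.

```python
# Minimal sketch of the Chinchilla-style recommendation: the training-token
# budget scales linearly with model size, so doubling the parameter count
# doubles the recommended number of training tokens.
# TOKENS_PER_PARAM is an assumed illustrative ratio, not from this article.

TOKENS_PER_PARAM = 20

def recommended_tokens(params: float) -> float:
    """Token budget that scales in proportion to the parameter count."""
    return TOKENS_PER_PARAM * params

for params in (10e9, 20e9, 40e9):
    print(f"{params / 1e9:.0f}B params -> "
          f"{recommended_tokens(params) / 1e12:.1f}T tokens")
# Each doubling of parameters doubles the recommended token budget.
```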
References
- "What Is Chinchilla AI: Chatbot Language Model Rival By Deepmind To GPT-3 - Dataconomy". January 12, 2023.
- Hendrycks, Dan (2023-03-14), Measuring Massive Multitask Language Understanding, retrieved 2023-03-15
- Chaithali, G. (April 9, 2022). "Check Out This DeepMind's New Language Model, Chinchilla (70B Parameters), Which Significantly Outperforms Gopher (280B) and GPT-3 (175B) on a Large Range of Downstream Evaluation Tasks".
- Wali, Kartik (April 12, 2022). "DeepMind launches GPT-3 rival, Chinchilla". Analytics India Magazine.