Rakuten Releases High-Performance Open Large Language Models Optimized for the Japanese Language
Rakuten Group, Inc. has released high-performance open Japanese large language models (LLMs) to the open-source community: the foundation model Rakuten AI 7B and the instruct model Rakuten AI 7B Instruct. Rakuten has also released Rakuten AI 7B Chat, a chat model based on a version of the instruct model and fine-tuned on chat data for conversational text generation.
Rakuten AI 7B is a 7 billion parameter Japanese language foundation model that has been developed by continually training Mistral-7B-v0.1, an open LLM from France-based AI startup Mistral AI. Training of the model was carried out on an in-house multi-node GPU cluster engineered by Rakuten to enable the rapid and scalable training of models on large, complex datasets. Rakuten AI 7B Instruct is an instruct model fine-tuned from the Rakuten AI 7B foundation model.
All the models are released under the Apache 2.0 license and are available from the official Rakuten Group Hugging Face repository.
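For readers who want to try the models, a minimal sketch of loading the foundation model with the Hugging Face transformers library is shown below. The repository id Rakuten/RakutenAI-7B is an assumption made for illustration; the official Rakuten Group Hugging Face page lists the exact ids of the three released models.

```python
# Minimal sketch: loading the foundation model from Hugging Face with transformers.
# The repository id "Rakuten/RakutenAI-7B" is an assumption; check the official
# Rakuten Group Hugging Face page for the exact ids of the released models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rakuten/RakutenAI-7B"  # assumed id for the foundation model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 7B-parameter model fits on a single modern GPU in bf16
    device_map="auto",
)

# Plain next-token continuation (the foundation model is not instruction-tuned).
prompt = "楽天グループは、"  # "Rakuten Group is ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```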
Characteristics of the LLMs
- High performance through training on high-quality data
The models have been continually trained from the Mistral-7B-v0.1 model on English and Japanese internet-scale datasets. These datasets have been carefully curated and cleaned through an in-house multi-stage data filtering and annotation process to ensure data quality, contributing to the high performance of the models.
- High efficiency through extended tokenizer
The models also use an extended tokenizer optimized for Japanese characters. With the extended tokenizer, a single token represents more characters on average than before, enabling more cost-efficient text processing during both model training and inference (a token-count comparison sketch follows this list).
- Top performance among open Japanese LLMs
The foundation and instruct models were evaluated on both Japanese and English language performance using the LM Evaluation Harness. For Japanese, the foundation model achieves an average score of 69.8 points and the instruct model 77.3 points. For English, the scores are 60.5 for the foundation model and 61.3 for the instruct model. These scores place the models at the top among open Japanese LLMs in their respective categories.
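One rough way to see the effect of the extended tokenizer described above is to compare how many tokens the base Mistral tokenizer and the extended tokenizer produce for the same Japanese text; fewer tokens per character means cheaper training and inference. A minimal sketch follows, assuming the repository id Rakuten/RakutenAI-7B for the extended tokenizer (the Mistral id is the base model named above).

```python
# Sketch: compare token counts of the same Japanese text under the base Mistral
# tokenizer and Rakuten's extended tokenizer. Fewer tokens for the same text
# means more characters per token, i.e. cheaper training and inference.
# "Rakuten/RakutenAI-7B" is an assumed repository id; verify on Hugging Face.
from transformers import AutoTokenizer

text = "楽天グループは日本語に最適化された大規模言語モデルを公開しました。"

base = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
extended = AutoTokenizer.from_pretrained("Rakuten/RakutenAI-7B")  # assumed id

for name, tok in [("Mistral-7B-v0.1", base), ("RakutenAI-7B", extended)]:
    n_tokens = len(tok.encode(text, add_special_tokens=False))
    print(f"{name}: {n_tokens} tokens ({len(text) / n_tokens:.2f} chars/token)")
```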
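The announcement does not specify the exact evaluation configuration. As an illustration only, the sketch below runs the EleutherAI LM Evaluation Harness (v0.4+) through its Python API on two of its standard English tasks; the Japanese scores quoted above come from a Japanese task suite that is not reproduced here, and the repository id is again an assumption.

```python
# Illustrative only: scoring the model with the EleutherAI LM Evaluation Harness
# (v0.4+) via its Python API. The tasks shown are standard English harness tasks;
# the Japanese evaluation mentioned above uses a separate Japanese task suite
# whose exact configuration is not given here.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Rakuten/RakutenAI-7B,dtype=bfloat16",  # assumed repo id
    tasks=["arc_challenge", "hellaswag"],
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```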
“At Rakuten, we want to leverage the best tools to solve our customers’ problems. We have a broad portfolio of tools, including proprietary models and our own data science and machine learning models developed over the years. This enables us to provide the most suitable tool for each use case in terms of cost, quality and performance,” commented Ting Cai, Chief Data Officer of Rakuten Group. “With Rakuten AI 7B, we have reached an important milestone in performance and are excited to share our learnings with the open-source community and accelerate the development of Japanese language LLMs.”
All models can be used commercially for various text generation tasks such as summarizing content, answering questions, general text understanding, and building dialogue systems. In addition, the models can be used as a base for building other models.
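For dialogue systems in particular, the chat model is the natural starting point. The sketch below is a hedged example: the repository id Rakuten/RakutenAI-7B-chat and the presence of a chat template in the tokenizer are assumptions, and the model card on Hugging Face documents the exact prompt format to use.

```python
# Hedged sketch of a single-turn dialogue with the chat model. The repository id
# and the presence of a tokenizer chat template are assumptions; consult the
# model card on Hugging Face for the exact prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rakuten/RakutenAI-7B-chat"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "楽天市場について簡単に説明してください。"}]
# apply_chat_template formats the conversation into the model's expected prompt.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```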
Large language models are the core technology powering the Generative AI services that have sparked the recent revolution in AI. Rakuten’s current models have been developed for research purposes, and the company will continue to evaluate various options to deliver the best service to its customers. By developing models in-house, Rakuten can build up its knowledge and expertise in LLMs and create models that are optimized to support the Rakuten Ecosystem. By making the models open, Rakuten aims to contribute to the open source community and promote the further development of Japanese language LLMs.
With breakthroughs in AI triggering transformations across industries, Rakuten has launched its AI-nization initiative, implementing AI in every aspect of its business to drive further growth. Rakuten is committed to making AI a force for good that augments humanity, drives productivity and fosters prosperity.
Source: Rakuten