A new large language model (LLM) for Arabic has been unveiled in Abu Dhabi.
The model, named Jais, is an open-source bilingual Arabic-English model developed by Inception, a unit of Abu Dhabi AI company G42, together with Mohammed bin Zayed University of Artificial Intelligence and Silicon Valley-based Cerebras Systems. It is available to download from the machine learning platform Hugging Face.
More Accurate than Other LLMs for Arabic
The developers of Jais claim that it is more accurate than existing LLMs for Arabic. They say it captures the linguistic nuances of various Arabic dialects and comprehends language, context and cultural references, making its output more contextually relevant than that of other models.
Encouraging Focus on Non-English LLMs
The launch of Jais is a further step towards encouraging the scientific and computing communities to focus more on non-English LLMs, similar to efforts made in Japan and India. Andrew Jackson, chief executive of Inception, told The National that Jais will be useful for generative tasks such as answering questions, producing documents, translations and emails, and even providing advice and recommendations.
Developed for Government Use and Various Sectors
Jais has been developed for government use and the financial, energy, climate and healthcare sectors. Several public and private organizations in the UAE have signed on as Jais launch partners, including the Ministry of Foreign Affairs, the Ministry of Industry and Advanced Technology, the Department of Health – Abu Dhabi, ADNOC, Etihad Airways, FAB and e&, the technology conglomerate formerly known as Etisalat.
Trained on the World’s Largest AI Supercomputer
Jais was trained on Condor Galaxy, the “world’s largest AI supercomputer”, launched by G42 and Cerebras in July. Its training data comprises 116 billion Arabic tokens and 279 billion English tokens. The model is being continuously expanded as more Arabic content is collected to generate new instruction sets.
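To put the two token counts in proportion, the short Python sketch below works out the combined total and each language's share of the training data. It uses only the two figures reported above; the function and variable names are illustrative and are not taken from the Jais developers.

```python
# Token counts as reported in the article, in billions.
ARABIC_TOKENS_B = 116
ENGLISH_TOKENS_B = 279


def token_mix(counts):
    """Return each language's share of the total token count, as a percentage."""
    total = sum(counts.values())
    return {lang: 100 * n / total for lang, n in counts.items()}


counts = {"Arabic": ARABIC_TOKENS_B, "English": ENGLISH_TOKENS_B}
total_b = sum(counts.values())
shares = token_mix(counts)

print(f"Total: {total_b} billion tokens")
for lang, pct in shares.items():
    print(f"{lang}: {pct:.1f}%")
```

On these figures the total comes to 395 billion tokens, with Arabic making up roughly 29 per cent of the mix and English about 71 per cent.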
Boosting Arabic Content Online
Arabic is one of the most widespread languages worldwide, spoken by more than 400 million people. However, its online presence is minuscule, with about 1 per cent of Arabic content available online. Mr Jackson said that Jais would help to boost this figure through an initiative to collect more Arabic data from offline sources.
A New Battlefront in the Tech Sector
The advent of generative AI has created a new battlefront in the tech sector, with companies vying to get a head start and broaden their scope in the field. The availability of LLMs would help companies in these efforts as developers continuously improve AI capabilities. Mr Jackson said that speed is important to developers because it allows them to quickly bring up and iterate on different models.