Dubai: An Arabic large language model (LLM) which can power generative AI applications?
Yes, it is possible. A new and homegrown open-source Arabic LLM, Jais, was unveiled on Wednesday.
Touted as the world's highest-quality open-source Arabic LLM, Jais was developed by Inception, a subsidiary of Abu Dhabi's G42, in collaboration with Mohammed bin Zayed University of Artificial Intelligence and Silicon Valley-based Cerebras Systems.
How was Jais developed?
The Jais model, with 13 billion parameters, was created using a specialised dataset. This dataset consisted of 116 billion Arabic tokens to capture the richness of the Arabic language and 279 billion English word tokens to enhance its performance across languages.
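To put those figures in proportion, the short Python sketch below works out the language mix of the training data. It assumes the two quoted token counts together make up the whole dataset; the function and variable names are illustrative and do not come from the Jais developers.

```python
# Figures quoted in the article, in billions.
ARABIC_TOKENS_B = 116
ENGLISH_TOKENS_B = 279
PARAMETERS_B = 13


def dataset_mix(arabic_b, english_b):
    """Return the total token count and each language's share of it.

    Assumption: the two counts are the entire training dataset.
    """
    total = arabic_b + english_b
    return {
        "total_b": total,
        "arabic_share": arabic_b / total,
        "english_share": english_b / total,
    }


mix = dataset_mix(ARABIC_TOKENS_B, ENGLISH_TOKENS_B)
# Training tokens available per model parameter.
tokens_per_parameter = mix["total_b"] / PARAMETERS_B

print(f"Total tokens: {mix['total_b']} billion")
print(f"Arabic share: {mix['arabic_share']:.1%}")
print(f"English share: {mix['english_share']:.1%}")
print(f"Tokens per parameter: {tokens_per_parameter:.1f}")
```

On these numbers the dataset comes to 395 billion tokens, of which about 29 per cent are Arabic and about 71 per cent English, or roughly 30 training tokens for each of the model's 13 billion parameters.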
It was trained on Condor Galaxy, the recently announced multi-exaFLOP AI supercomputer built by G42 and Cerebras, explained Dr Andrew Jackson, the CEO of Inception.
Named after the tallest peak in the UAE, Jais was developed by academics and engineers who believed that few large language models were bilingual.
"Existing LLMs are English-focused. But Arabic is one of the largest languages in the world, with over 400 million speakers. We asked ourselves why the Arabic community shouldn't have an Arabic LLM to represent the population," he said.
However, Dr Jackson said the scarcity of Arabic data posed a challenge. "We believe that only 1 per cent of Arabic content is online," he added.
The CEO of Inception said Jais outperforms existing Arabic models like Falcon (Abu Dhabi's Technology Innovation Institute), Llama 2 (Meta Platforms) and Bloom by a sizable margin. "It is also competitive with English models of similar size despite being trained on significantly less English data," he stated.
How will Jais be used?
"We will focus on bringing Jais into government, financial, energy/climate, and healthcare domains," said Dr Jackson.
The UAE Ministry of Foreign Affairs, Ministry of Industry and Advanced Technology, Department of Health – Abu Dhabi, ADNOC, Etihad Airways, First Abu Dhabi Bank, e&, and Mubadala Investment Company will work with Inception to use Jais.
Jais is available for download on Hugging Face, said its developers. Users can also try Jais online upon registering interest on Jais' website and receiving an invite to access the playground environment.
Jais' launch marks a major step for AI in the Arab world. Developed in Abu Dhabi, it gives Arabic speakers worldwide access to generative AI in their own language.