UAE: How Nanda will help Hindi speakers harness power of generative artificial intelligence
Abu Dhabi: Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) recently released Nanda, the world’s most sophisticated open-source Hindi large language model (LLM).
An LLM is an artificial intelligence (AI) programme that can recognise and generate text. Perhaps the most familiar example is ChatGPT, which allows users to have human-like conversations with a chatbot.
Now Nanda, whose name is inspired by one of India’s highest peaks, has given more than half a billion Hindi speakers the chance to use the power of generative AI.
Gulf News interviewed Monojit Choudhury, a professor of natural language processing (NLP) at MBZUAI, to gain exclusive insights into Nanda.
How will Nanda help Hindi speakers?
Nanda is poised to drive significant advancements in India’s digital and economic spheres by fostering inclusion and stimulating innovation. The model will make educational resources more accessible, enhance public services, and support job creation in sectors critical to Hindi-speaking communities. Its deployment could transform various sectors, including education and healthcare, by personalising learning and improving service delivery.
What about safety and ethics?
MBZUAI prioritises high-quality, diverse, and unbiased data sets while embedding ethical principles into the model’s training phase. We’ve taken extensive measures to ensure Nanda is not only effective but also ethical and accessible to all.
What are Nanda’s availability and future prospects?
Nanda has been made available as an open model on Hugging Face, allowing widespread access and letting users run it locally on modest hardware. This strategic move aligns with MBZUAI’s commitment to democratising AI technology, ensuring that advancements in AI are accessible and beneficial to a broad audience globally.
How is Nanda different from similar projects?
Nanda, which has 10 billion parameters, offers advanced knowledge and reasoning capabilities in Hindi, setting it apart from other Hindi models of comparable size. It delivers this performance while remaining accessible even to users with basic hardware setups.
Training on the Condor Galaxy, one of the world’s most powerful AI supercomputers, has notably enhanced Nanda’s performance. This supercomputer, developed by G42 and Cerebras Systems, facilitated training on a bespoke dataset, enabling the model to capture the full complexity and richness of the Hindi language.
The development of Nanda benefited significantly from the experience gained during the creation of Jais, MBZUAI’s Arabic LLM. Techniques used in Jais to blend Arabic and English during training were adapted to enhance Nanda’s effectiveness.
With the release of Nanda, MBZUAI continues to push the boundaries of AI, supporting linguistic diversity and promoting technological inclusivity across the globe.
What challenges were involved in developing Nanda?
The main challenges were the limited availability of Hindi data compared to English and the unique morphological features of the Hindi language. These challenges required a model that could effectively process and generate natural language while accommodating the language’s many dialects. The goal was to develop a high-quality Hindi LLM that could perform on par with the best English models, making it a practical and effective tool for Hindi users.