publish time

06/10/2023

author name Arab Times

UAE introduces 'Jais': A game-changer in Arabic-language AI.

DUBAI, UAE, Oct 6, (Agencies): Large language model AI platforms like ChatGPT have garnered widespread attention in recent times. However, one area where AI development has lagged is the creation of Arabic-language models. In response to this gap, a team of academics, researchers, and engineers in the United Arab Emirates (UAE) has unveiled a tool tailored specifically for Arabic speakers. Named "Jais" after the UAE's highest mountain, it is poised to pave the way for large language models (LLMs) in other languages that are underrepresented in mainstream AI.

Jais is the product of a collaborative effort involving Abu Dhabi's Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Silicon Valley-based Cerebras Systems, and Inception, a subsidiary of UAE-based AI company G42. While models like ChatGPT and Meta's LLaMA do offer Arabic-language capabilities, they primarily rely on English data from the internet. In contrast, Jais utilizes both English and Arabic datasets, with a specific focus on content originating from the Middle East. This unique approach allows Jais to achieve what no other AI model has done for Arabic speakers, according to Timothy Baldwin, acting provost and professor of natural language processing at MBZUAI.

Languages that use the Latin alphabet, particularly English, dominate the internet, which results in larger datasets for these languages. Mohammed Soliman, director of strategic technologies and the cybersecurity program at the Middle East Institute, emphasizes that limiting access to AI tools based on language proficiency could deprive disadvantaged segments of society of the benefits of AI.

Language models trained primarily on English text typically draw on Western-centric datasets, which means they often lack cultural awareness, leading to a less inclusive user experience. Jais, on the other hand, has been trained on a diverse dataset, enabling it to understand cultural nuances and dialects and making it versatile across different industries. Future releases of Jais aim to expand its capabilities beyond text, allowing it to work with images, graphs, or tabular data, potentially revolutionizing fields like medical diagnostics, financial analysis, and satellite data interpretation.

Arabic, as the sixth most spoken language globally, presents a unique challenge due to its diverse array of dialects. Modern Standard Arabic is typically used for formal writing, but local dialects are prevalent in online communication. Jais is designed to seamlessly switch between these dialects, enhancing its usability.

Google's Bard has also been updated to understand questions in various Arabic dialects, further improving accessibility for Arabic speakers. Jais has 13 billion parameters, with a 30-billion-parameter update in the pipeline. While the size of a language model is important, parameter count alone does not determine accuracy. For comparison, GPT-3, the OpenAI model family on which ChatGPT's GPT-3.5 is based, has around 175 billion parameters, according to OpenAI.

To ensure responsible AI usage, Jais, like other generative AI models, uses instruction tuning to prevent the generation of harmful or toxic content. It adheres to local customs and regulations, including on sensitive topics such as homosexuality and drugs.

MBZUAI collaborated with the UAE government and other institutions to develop Jais with responsible AI practices in mind, reflecting the growing emphasis on ethics in AI development. The UAE has been at the forefront of AI development, appointing the world's first minister of AI in 2017 and unveiling Falcon, the region's largest generative AI model.

While Falcon currently outperforms Jais in English with its 180 billion parameters, both models are open-source, allowing for greater accessibility and collaboration. The Middle East is poised to benefit significantly from AI, with estimates suggesting potential benefits of up to $320 billion by 2030, according to a 2018 report by PwC.

The creators of Jais are optimistic that this innovative model will propel the development of generative AI in the Middle East and beyond. "This is step one of many future steps," says Timothy Baldwin, emphasizing the model's potential impact not just for Arabic but for other underrepresented languages in the world of large language models.