Google has released new details about the Universal Speech Model (USM), a system the company describes as a “critical first step” towards its goal of building an AI language model that supports 1,000 different languages, as it looks to compete with ChatGPT. In November of last year, the company announced USM along with its plan to build a language model that supports the world’s 1,000 most spoken languages.
The tech giant describes USM as a family of cutting-edge speech models with 2 billion parameters, trained on 12 million hours of speech and 28 billion sentences of text spanning 300+ languages.
In a blog post, Google stated that USM, which is intended for use in YouTube, can perform automatic speech recognition not only on widely spoken languages like English and Mandarin, but also on under-resourced languages like Amharic, Cebuano, Assamese, and Azerbaijani, to name a few. According to Google, USM currently supports over 100 languages and will act as the “basis” for a much larger system. Meanwhile, Google is expected to release a slew of AI capabilities for its products in the coming months, including in Gboard for Android, which is set to incorporate the Imagen text-to-image generator.