Udio is the latest AI music tool to hit the market, coming out of stealth with a bang as it reveals an incredible ability to capture emotion in synthetic vocals.
Created by former Google DeepMind engineers, the platform has already attracted both investment and attention from the music community, including will.i.am and Common.
Several tracks were leaked on X and other platforms ahead of the big launch, leading to speculation about just how good this new AI tool could be. I’ve been trying it for a little over a week now, and I think it’s a Sora-like moment for AI music.
It has the same ability to create a full song from a text prompt as Suno, which is still an impressive tool, but has much better vocals and a more natural sound.
The ability not only to capture the emotion of a song but also to generate the strange and the unexpected while maintaining musical fidelity and cohesion is astounding. For example, I generated all the songs in this story, merging unusual genres with ease.
What is Udio?
I had the chance to speak with founders David Ding and Andrew Sanchez about Udio, and they told me it was inspired by a desire to make it easier to create and share music.
“It’s a magical moment,” Sanchez said. “It’s really magic for people to go from nothing to something.” That’s why they decided to focus, at least initially, on being able to create a complete song from lyrics, giving people a “wow” moment.
Future updates will include more musician-focused tools, including the ability to add reference vocals, more detailed creation options, and easy import of external tracks. For now, the focus is on building a library of amazing songs created by people with little or no musical ability.
The pair would not discuss the model’s underlying architecture or training data, but said they have strong copyright protections in place. As with Suno, you can’t name a specific artist in a prompt, and Udio also blocks a song if it sounds like that artist.
How does Udio work?
Like any AI tool, it starts with text. Enter a prompt, click generate, and it will make two completely different tracks on that theme. You can also give it your own lyrics, make the track instrumental, or add more specific genre tags to steer the generation.
After playing with it for a week, I’ve found that you get the most accurate generation by giving it a rough one-line theme and story to set the direction of the lyrics, then a descriptive genre to set the direction of the music.
When a song is generated, it splits the task, first creating lyrics using a traditional large language model, and then creating music using what I assume is a diffusion transformer model similar to those found in OpenAI’s Sora or Stable Diffusion 3 — although this has not been confirmed by the Udio team.
Users can then publish the recording for the community to enjoy, download an audio or video file to share on other social media platforms, or embed it into another project.
One of the use cases the team and some of the artists they’ve worked with have pointed to is the potential to use Udio as a songwriting aid. That means being able to take a set of lyrics, define a melody and create an instant demo to send to artists to record in a real studio.
“It’s a brand new Renaissance and Udio is the creativity tool of this era – with Udio you can create songs through AI and your imagination,” said will.i.am.
How well does Udio work?
In less than a minute I was able to create a haunting yet stomping goth bluegrass piece for a haunted hoedown. I was then able to select one of the generated tracks and extend it, with detailed controls for adding an intro, a section before or after, or an ending.
The resulting tune should have been a mixed-genre mess, but it was surprisingly effective. The AI model managed to create something fascinating, original and somewhat strange – all from text.
The team continues to find new skills they never knew Udio possessed. “I recently found out that it can perform traditional Chinese folk music,” Ding said. “I have heard good Korean, Japanese and other languages.”
“There’s nothing out there that comes close to the ease of use, voice quality and musicality of what we’ve achieved with Udio – it’s a real testament to the people we’ve got involved,” he said.
The team is working on adding support for more languages, the ability to split stems from individual songs, and potentially even the ability to specify the vocalist, but for now the focus is on building a community around Udio.
One thing we could see is Udio being used as an alternative to sending GIFs, or as a way for people to express themselves to a loved one or share an emotion in the form of a song. You could send a 30-second birthday song to a loved one instead of sending a card.