Hello Developers,
We've heard that many of you are interested in using text-to-speech in your playlists, but were confused about how it would work.
As part of our partnership with ElevenLabs, we've decided to make this process a lot easier to help you experiment during the Yoto Developer Challenge (we'll probably discontinue this when the challenge ends).
The previous workflow for creating an AI-narrated playlist looked like this:
1. Create a web service
2. When the service receives a request, call a text-to-speech API with your text
3. Get the audio file from the API response and return it from your web service
4. Create a Yoto streaming card that points at your service's URL
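For context, the first three steps of that older workflow might look roughly like the sketch below. It is only an illustration, not code from Yoto: the function names and environment variable names are hypothetical, and the ElevenLabs URL and `xi-api-key` header reflect the public ElevenLabs text-to-speech API as we understand it, so check their documentation before relying on them. Creating the streaming card that points at the service is not shown.

```javascript
// Hypothetical placeholders: supply your own ElevenLabs key and voice ID.
const ELEVENLABS_API_KEY = process.env.ELEVENLABS_API_KEY;
const VOICE_ID = process.env.ELEVENLABS_VOICE_ID;

// Call a text-to-speech API with the text and return the audio bytes.
// Assumption: the endpoint and header follow the public ElevenLabs API.
async function synthesize(text) {
  const res = await fetch(
    `https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "xi-api-key": ELEVENLABS_API_KEY,
      },
      body: JSON.stringify({ text }),
    }
  );
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);
  return Buffer.from(await res.arrayBuffer());
}

// Build a request handler that answers every request with the narrated audio.
// The tts parameter is injectable so the handler can be tried without a network.
function createNarrationHandler(text, tts = synthesize) {
  return async (req, res) => {
    try {
      const audio = await tts(text);
      res.writeHead(200, {
        "Content-Type": "audio/mpeg",
        "Content-Length": audio.length,
      });
      res.end(audio);
    } catch (err) {
      res.writeHead(502, { "Content-Type": "text/plain" });
      res.end("Text-to-speech request failed");
    }
  };
}

// Start the web service; a streaming card would then point at this server's URL.
async function startServer(text, port = 3000) {
  const http = await import("node:http");
  return http.createServer(createNarrationHandler(text)).listen(port);
}
```

Calling `startServer("Once upon a time...")` would serve the narration on port 3000. Every request triggers a fresh text-to-speech call in this sketch, so a real service would likely cache the audio.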
Now, we've created a new API endpoint that lets you do all of the above in a single API call, and we handle the rest (including paying for ElevenLabs credits 💸).
The API endpoint is available at
https://labs.api.yotoplay.com/content
It lets you specify your text in the trackUrl property of your track. The full API call looks like this:
const chapters = [
  {
    key: "01",
    title: "Test ElevenLabs card",
    overlayLabel: "1",
    tracks: [
      {
        key: "01",
        format: "mp3",
        title: "This will be read by ElevenLabs",
        trackUrl:
          "text:This is a test of the ElevenLabs text-to-speech service. It will read this text aloud.", // important
        display: {
          icon16x16: "yoto:#4PcvvM5CYc1nmeEHsWJcYcQW1jAbqePuQ97ccWGmPnA",
        },
        type: "elevenlabs", // important
      },
    ],
    display: {
      icon16x16: "yoto:#4PcvvM5CYc1nmeEHsWJcYcQW1jAbqePuQ97ccWGmPnA",
    },
  },
];
const card = {
  content: { chapters },
  metadata: { description: "Test ElevenLabs card" },
  title: "Test ElevenLabs card",
};
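const response = await fetch("https://labs.api.yotoplay.com/content", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: "Bearer [YOUR_ACCESS_TOKEN]",
  },
  body: JSON.stringify(card),
});
// Stop early on a failed request instead of logging an error body as a card.
if (!response.ok) {
  throw new Error(`Card creation failed with status ${response.status}`);
}
const json = await response.json();
console.log("Card created:", json);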
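If you want to narrate several passages, the payload above can be generated rather than written by hand. The sketch below is one possible way to do that; `buildSpokenTrack` and `buildSpokenCard` are hypothetical helper names, not part of the Yoto API, and the one-chapter-per-passage layout is our assumption. The field names, the `text:` prefix, the `elevenlabs` type, and the icon value are copied from the example above.

```javascript
// Icon reference copied from the example request above.
const DEFAULT_ICON = "yoto:#4PcvvM5CYc1nmeEHsWJcYcQW1jAbqePuQ97ccWGmPnA";

// Hypothetical helper: wrap a piece of text as a track to be narrated.
function buildSpokenTrack(key, title, text, icon = DEFAULT_ICON) {
  return {
    key,
    format: "mp3",
    title,
    trackUrl: `text:${text}`, // the "text:" prefix carries the text to read
    display: { icon16x16: icon },
    type: "elevenlabs", // marks the track for text-to-speech
  };
}

// Hypothetical helper: build a card with one chapter (and one track) per passage.
// Assumption: this layout is a choice made for the sketch, not a requirement.
function buildSpokenCard(cardTitle, passages, icon = DEFAULT_ICON) {
  const chapters = passages.map((passage, index) => ({
    key: String(index + 1).padStart(2, "0"), // "01", "02", ...
    title: passage.title,
    overlayLabel: String(index + 1),
    tracks: [buildSpokenTrack("01", passage.title, passage.text, icon)],
    display: { icon16x16: icon },
  }));
  return {
    content: { chapters },
    metadata: { description: cardTitle },
    title: cardTitle,
  };
}
```

The object returned by `buildSpokenCard("Bedtime stories", [{ title: "Part one", text: "Once upon a time..." }])` has the same shape as `card` in the request above and can be passed to `JSON.stringify` in the same way.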
Obviously, we're asking you to be reasonable in your usage; otherwise we'll have to turn this off.