Have you ever started watching an interesting video and felt let down because you had to read on-screen text instead of listening to a voice over? Usually, it’s constraints in time, budget or experience that force video makers to take the easy way out. And it’s true: adding sound to your video is no easy task. Or is it?
What used to take weeks and a considerable budget can now be done for little money in less than an hour. Great, isn't it? Platforms like Powtoon, Animaker or Scribely make it easy for anyone to create a high-quality video without any video editing expertise.
However, once you are done and compare the result with the professionally produced competition, you will quickly realize that one key ingredient is missing: a professional-sounding voice over to support your message. What if I told you that you can create one even faster than the actual video? In different versions (e.g. multiple languages), too? And that it will sound amazing? Now you can, with the help of Artificial Intelligence.

The perfect voice over is a combination of...
Your video might look great - it feels like your brand, captures the message well and conveys the emotions you were going for. But is the audio track doing it justice as well? Here are some tips for the perfect audio track:
… the perfect script:
You might have big letters with a clear message on the screen, but it is hard to get the details across without a good script. A good starter rule is to summarize on screen and flesh out the details in audio. Are you trying to score points with a high-stakes client? You might even want to go the extra mile and personalize the script accordingly. Including their company name, product, industry or city in the audio track can do just that without the need to produce new visuals.
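To make the personalization idea concrete, here is a minimal sketch using only Python's standard library. The script text, the placeholder names (`company`, `city`, `industry`) and the client details are made up for illustration; the point is simply that one script with placeholders can produce one audio script per client while the visuals stay the same.

```python
from string import Template

# Hypothetical script text; the placeholder names are illustrative only.
SCRIPT = Template(
    "Hello $company! See how teams in $city use our product "
    "to speed up their work in the $industry industry."
)


def personalize(script: Template, **details: str) -> str:
    """Fill in the placeholders; raises KeyError if a detail is missing."""
    return script.substitute(**details)


# Made-up client details, one dictionary per personalized version.
clients = [
    {"company": "Acme", "city": "London", "industry": "retail"},
    {"company": "Globex", "city": "Barcelona", "industry": "logistics"},
]

versions = [personalize(SCRIPT, **client) for client in clients]
for text in versions:
    print(text)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a forgotten detail fails loudly instead of ending up as a spoken "$city" in the finished voice over.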
… the perfect voice:
Your brand has a personality, and the voice over should capture it. Do not make this choice lightly: the voice is the main thing your viewers will (or will not) connect to. Upbeat and joyful or generous and reflective, male or female, young or mature, with an accent or without? Whatever you choose, make sure the voice is clear, of high quality and speaks at the right pace. Most importantly, make sure it sounds and feels right for what you are trying to communicate.
… an “audio arc” that supports your story arc:
Your viewers will only remember a few parts of your masterpiece, so make sure they are the right ones. Your audio track should accentuate and support them. My tip: keep it simple! Start with an engaging intro, make the main part sound different and leave a bit more room for the voice, then finish strong with an outro that sounds active and supports your call to action.
… the perfect sound design:
This is hard to get right. We all have our own favorite music genre, and sometimes it can be hard to take a step back and recognize what will make your video shine: do you want a jazzy vibe for a mature audience, or do you want to attract young customers with an EDM sound? What is almost always true is that it should not be a “background music” track (remember your last time in an elevator?). To support your narrative and the progression of your story arc, there should be well-executed transitions in the sound design as well.
… the right audio production:
Even if you get all of the above right, it can still be ruined by an audio production that does not match the quality of your Powtoon, Animaker or Scribely video. Nothing is more frustrating than watching your video air right after a professionally produced one (e.g. the ad your hosting platform makes your viewers watch before starting yours) and realizing that yours sounds (and feels) much less exciting: not quite as big and bright, with the voice either buried in the mix or, even worse, standing out so awkwardly that it is all the viewer can focus on.
To get started, just try to get two things right. First, adjust the balance between the voice and the rest of the audio so that it holds up when you compare it to a professionally produced video. Then apply some basic “mastering” settings in your audio editor to make it sound as big as it can get without distorting.
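If you are curious what those two steps mean in numbers, here is a rough sketch in plain Python. It assumes audio is a list of float samples between -1.0 and 1.0, picks a voice-over-music margin of 12 dB purely as an illustrative starting point, and uses simple peak normalization as a stand-in for "mastering" (real mastering usually also involves compression and limiting). All function names are my own.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def music_gain_for_headroom(voice, music, target_db=12.0):
    """Linear gain for the music so the voice RMS sits target_db above it.

    Assumes neither signal is silent. The 12 dB default is an
    illustrative starting point, not a value from the article.
    """
    current_db = 20 * math.log10(rms(voice) / rms(music))
    return 10 ** ((current_db - target_db) / 20)


def mix(voice, music, music_gain):
    """Sum the voice and the gain-adjusted music sample by sample."""
    return [v + m * music_gain for v, m in zip(voice, music)]


def peak_normalize(samples, ceiling=0.89):
    """Scale so the loudest sample hits the ceiling (0.89 is roughly -1 dBFS)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    return [s * ceiling / peak for s in samples]


# Demo: two synthetic tones standing in for real voice and music recordings.
RATE = 8000
voice = [0.5 * math.sin(2 * math.pi * 220 * n / RATE) for n in range(RATE)]
music = [0.5 * math.sin(2 * math.pi * 110 * n / RATE) for n in range(RATE)]

gain = music_gain_for_headroom(voice, music)
final = peak_normalize(mix(voice, music, gain))
print(f"music gain: {gain:.3f}")
print(f"peak after normalizing: {max(abs(s) for s in final):.2f}")
```

The order matters: balance first, then normalize the finished mix, so that making the result louder does not change the relationship between voice and music.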
Sounds like a lot of work. How do I do this quickly?
The above might sound intimidating at first. If you want to become a pro at making great-sounding audio for video, it definitely takes a fair bit of expertise and a lot of practice. It can also be a lot of fun: you can get started with an inexpensive podcasting microphone and a Digital Audio Workstation that might already be installed on your computer (e.g. GarageBand is preinstalled on every Mac).
However, since you have chosen Powtoon, Animaker or Scribely to create your video, my guess is that you will want to produce a fantastic-sounding voice over with the same ease and without a massive learning curve. With api.audio you can make this happen with only a few lines of code:
- Write a script to support your video and organize it in 3 sections: Intro - Main - Outro. You can include flexible parameters if you want to create different versions of your voice over (and your video).
- Choose from almost 200 AI voices. These voices use text to speech to turn your written words into a voice over in seconds, and it will sound amazing. You can use multiple voices (i.e. changing speakers) in the same voice over, and you can also ask us to clone a specific voice, such as your CEO’s voice, a voice-over actor, your own voice, or a celebrity who has agreed to represent your brand.
- Choose a sound design: this is a set of sounds, music and audio layers that will make your audio sound great. AI algorithms blend these together so that the result can compete with a professional human sound engineer mixing your voice over.
- Master it: AI puts the finishing touches on it and makes it sound as classy and professional as a voice over coming out of a professional audio production studio.
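To show how the four steps above fit together, here is a small planning sketch in plain Python. It is not the api.audio SDK and makes no calls to it: the `VoiceOverPlan` class, its fields, and the voice and sound design labels are all hypothetical names I made up. It only illustrates one way to organize the inputs (a three-section script with parameters, a voice per section, a sound design and a mastering switch) before handing them to whatever production tool you use.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The three script sections named in the steps above.
SECTIONS = ("intro", "main", "outro")


@dataclass
class VoiceOverPlan:
    """Hypothetical container for the four choices; not an api.audio type."""

    script: Dict[str, str]   # section name -> text with {placeholders}
    default_voice: str       # made-up voice label used for every section
    sound_design: str        # made-up sound design label
    voice_overrides: Dict[str, str] = field(default_factory=dict)
    mastering: bool = True

    def render(self, **params: str) -> List[Dict[str, str]]:
        """Return one segment per section with its voice and final text."""
        missing = [name for name in SECTIONS if name not in self.script]
        if missing:
            raise ValueError(f"script is missing sections: {missing}")
        return [
            {
                "section": name,
                "voice": self.voice_overrides.get(name, self.default_voice),
                "text": self.script[name].format(**params),
            }
            for name in SECTIONS
        ]


# Made-up example in the spirit of an apartment walkthrough.
plan = VoiceOverPlan(
    script={
        "intro": "Welcome to {address}.",
        "main": "This apartment has {rooms} bright rooms and a balcony.",
        "outro": "Book your viewing today.",
    },
    default_voice="narrator_a",
    sound_design="calm_piano",
    voice_overrides={"outro": "narrator_b"},  # change speakers for the outro
)

segments = plan.render(address="12 Example Street", rooms="three")
for seg in segments:
    print(f'{seg["section"]:>5} | {seg["voice"]} | {seg["text"]}')
```

Calling `plan.render(...)` again with a different address gives you the next version of the voice over without touching the voice, sound design or mastering choices.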
Now you'll want to know what it sounds and feels like, right?
This is a video walkthrough of an apartment for sale. All of the audio, including the voice over, was created using api.audio:
And here is how to do it step by step, creating a new voice over with a different script, speaker and sound design, all in under 2 minutes:
Try it. It is easy and only takes 10 minutes. And the best thing is: creating different versions, trying different copy ideas, tweaking your script and running different versions by your colleagues can all be done with a bit of writing and a few minutes of automated audio production.
About:
Aflorithmic is a London/Barcelona-based technology company. Its api.audio platform enables fully automated, scalable audio production using synthetic media, voice cloning, and audio mastering, and delivers the result on any device, such as websites, mobile apps, or smart speakers.
With this Audio-As-A-Service, anybody can create beautiful-sounding audio, from simple text to music and complex audio engineering, without any previous experience.
The team consists of highly skilled specialists in machine learning, software development, voice synthesis, AI research, audio engineering, and product development.