Automating a podcast to the top

How I automated podcast publishing and marketing using AWS

Maxim Makatchev
9 min read · Oct 25, 2021

As I wake up, today’s episode of the Japanese podcast I help produce has already been published on multiple platforms and promoted on Instagram. I glance at the post’s engagement and give the episode a quick listen on Apple Podcasts just to make sure. Everything seems fine. The podcast’s ranking on Chartable has reached 5th place in its category. What satisfies me most is that all of this happens with me barely touching the content. In fact, no human has touched it, aside from the podcast creator, Ayako-san, who recorded the episode’s audio, typed the transcript, and sent both via LINE messenger using just her phone, and me, when I lightly edited a week’s worth of her recordings.

Things weren’t always this way.

Ayako-san does not own or use a computer. The apps she uses on her smartphone are the few vital ones, including LINE messenger. So when we were just starting our weekly podcast, we decided that Ayako-san would send me her audio and transcripts via LINE. I would confirm and save the files, edit the audio, publish the podcast via Anchor, publish an Alexa Flash Briefing by directly editing its JSON, and occasionally tweak the podcast’s RSS as well. Below is our manual pipeline with the three roles that I played: Creator Liaison, Sound Editor, and Publisher.

There are mobile apps that handle podcast publishing, including Anchor’s own app. Ayako-san could learn to use such an app and publish directly. I decided against that for the following two reasons:

  • There was a need to do some basic editing on the audio before it could be published: adjust gain, remove occasional noise, add calls to action, intro and outro music.
  • Crucially, it was clear that any future automation would need to be custom to some extent, for example, to promote on social media and to publish on non-podcast audio platforms like Amazon Alexa. Since some custom work would need to be done anyway, the creator might as well keep using the app she was already familiar with.

In other words, the sound editing role was too hard to automate or delegate to an app, and the publishing role was too complex for any single existing service to handle.

Editing and manually publishing each weekly episode was taking about 30 minutes of my time, which looked sustainable for the first couple of weeks.

Then, Ayako-san decided to do her show daily.

The idea of more quality content sounded great. However, a bottleneck quickly became obvious: unsurprisingly, it was me. At first, spending a few minutes daily on editing and publishing episodes did not seem like a huge burden, so what could go wrong? To begin with, here are a few things:

  • Life gets in the way: travel and personal schedule fluctuations interfere with the timing of episode publishing. This is bad because listeners like consistency. Social media platforms like consistency too: Instagram accounts that publish daily get higher exposure (see here).
  • Humans make mistakes. Hand-editing a podcast’s RSS file and an Alexa Flash Briefing’s JSON file will eventually produce typos. If you are busy, it may take days before you discover that the podcast is down because of a malformed feed. It may take weeks to recover from dropped ratings and lost followers.

In fact, this is exactly what happened. As I traveled, supposedly daily shows started getting published after a multi-day delay. And then, a malformed RSS feed got our show unlisted from Google Podcasts.

It was time for a change.

What I needed to keep in mind

Knowing the creator personally, and having published the podcast manually for a while, gave me just enough empathy for both the creator’s and the producer’s jobs to realize that any new pipeline should meet the following two requirements:

  1. As little disturbance to the creator’s process as possible. We had already developed a routine where Ayako-san submitted the audio and transcripts via LINE and I acknowledged and processed the data, so making her learn a whole new process seemed counter to the point of the automation.
  2. A possibility of manual intervention. First, fully automating audio editing did not look feasible, or at least was not a low-hanging automation target. Second, I wanted to deploy and test parts of the automation as they were completed, rather than waiting until the whole system was complete.

Automating the Creator Liaison and Publisher roles would satisfy these requirements.

Automating the creator’s interface

I started by automating Ayako-san’s interaction with a dedicated business LINE account, which meant building a kind of conversational UI. The data that needs to be collected to publish an episode includes:

  • Episode category: daily or weekly, since Ayako-san hosts both.
  • Intended publishing date.
  • Episode’s topic and the rest of the transcript (for the episode’s Instagram post).
  • Audio file (an m4a file Ayako-san records on her iPhone).

A typical transcript includes the publishing date, the podcast category, and a topic, so all I needed was to parse those values from the text. If we could agree that Ayako-san sends a transcript first, and if that transcript can be successfully parsed for the desired pieces of data, an episode context is established and any number of subsequent audio files are associated with that episode.
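As a sketch, the parsing could look like the following. The exact markers are my assumptions for illustration: a date written as YYYY/MM/DD, the category as the literal word “daily” or “weekly”, and the topic inside square brackets.

    // Hypothetical transcript parser; the date format, category keywords,
    // and bracket convention are assumptions, not the production rules.
    function parseTranscript(text) {
      const date = text.match(/\d{4}\/\d{1,2}\/\d{1,2}/);   // e.g. 2021/10/25
      const category = text.match(/\b(daily|weekly)\b/i);   // episode category
      const topic = text.match(/\[([^\]]+)\]/);             // topic in brackets
      if (!date || !category || !topic) return null;        // context cannot be established
      return {
        publishDate: date[0],
        category: category[1].toLowerCase(),
        topic: topic[1],
        transcript: text,
      };
    }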

How does the creator know that the message is well-formed and the episode context is established? Conversational UI, like other types of UI, benefits greatly from providing a user with feedback.

Here is an example of the feedback to the first message that establishes the context.

Responses to the file uploads that follow confirm the current context as well.

If a context cannot be parsed from a message, there needs to be error feedback, so that the user can correct the message and resubmit. Here is an example where missing brackets make the topic unparseable.

I implemented the UI using an HTTP endpoint on AWS, a Lambda function controlling the interaction, and DynamoDB and S3 for storage. The keys and access tokens required by the LINE Messaging API are stored outside of the code base, in AWS Secrets Manager.
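Here is a minimal sketch of the webhook handler, assuming the AWS SDK for JavaScript v2; the table name and reply logic are illustrative, not the production code.

    // Sketch of the LINE webhook Lambda behind the HTTP endpoint.
    const AWS = require('aws-sdk');
    const dynamo = new AWS.DynamoDB.DocumentClient();

    exports.handler = async (event) => {
      const body = JSON.parse(event.body);
      for (const ev of body.events) {
        if (ev.type === 'message' && ev.message.type === 'text') {
          const context = parseTranscript(ev.message.text); // from the sketch above
          if (context) {
            // Persist the episode context so later audio uploads can be matched to it.
            await dynamo.put({
              TableName: 'EpisodeContext',                  // illustrative table name
              Item: { userId: ev.source.userId, ...context },
            }).promise();
            // ...reply with a confirmation via the LINE Messaging API reply endpoint,
            // using the channel access token fetched from Secrets Manager.
          } else {
            // ...reply with an error prompt so the creator can correct and resubmit.
          }
        }
      }
      return { statusCode: 200, body: 'OK' };
    };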

Since LINE messenger is intended for relatively short-delay communication (never mind your ex), file attachments expire within a week. A nice side effect of automating a LINE channel is that no submitted content is ever lost to expiration: everything is stored in your cloud as soon as it is received.
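Saving an attachment amounts to fetching it from LINE’s content endpoint and writing it to S3. A sketch, with node-fetch assumed to be bundled and the bucket layout illustrative:

    // Copy a LINE audio attachment into S3 before it expires.
    const fetch = require('node-fetch');
    const AWS = require('aws-sdk');
    const s3 = new AWS.S3();

    async function saveAttachment(messageId, episodeKey, channelAccessToken) {
      const res = await fetch(
        `https://api-data.line.me/v2/bot/message/${messageId}/content`,
        { headers: { Authorization: `Bearer ${channelAccessToken}` } }
      );
      const audio = await res.buffer();               // the m4a bytes
      await s3.putObject({
        Bucket: 'podcast-episodes',                   // illustrative bucket name
        Key: `${episodeKey}/${messageId}.m4a`,        // per-episode folder
        Body: audio,
      }).promise();
    }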

What I still have to do manually

Now that the m4a audio files are saved neatly with timestamps inside folders corresponding to the episodes, I can process them manually using Audacity. The processing involves tasks that are harder to automate, such as gain control, noise reduction, and insertion of calls to action in the middle of the recordings. Once I am satisfied with the audio, I upload the resulting mp3 files back into their corresponding episode folders on S3, making them available for the second step of the automation: publishing.

How I automated publishing

Now that we’ve obtained, stored, and processed the audio files and transcripts for the upcoming episodes, we have everything we need to publish and promote an episode.

Publishing a podcast involves updating an RSS feed file with an episode record that points to a publicly accessible mp3 file. Similarly, publishing an Alexa Flash Briefing involves updating a JSON file. We can store the RSS, the JSON, and the corresponding mp3 files in a public S3 bucket.
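For the Flash Briefing side, the update is a matter of reading the JSON feed from S3, prepending a record in Amazon’s Flash Briefing format, and writing it back. A sketch, with bucket and file names illustrative:

    // Append a new episode record to the Flash Briefing JSON feed on S3.
    async function publishFlashBriefing(episode) {
      const obj = await s3.getObject({
        Bucket: 'podcast-public', Key: 'flashbriefing.json',  // illustrative names
      }).promise();
      const feed = JSON.parse(obj.Body.toString());
      feed.unshift({
        uid: `urn:uuid:${episode.id}`,              // unique per feed item
        updateDate: new Date().toISOString(),       // ISO 8601 timestamp
        titleText: episode.topic,
        mainText: '',                               // empty for audio items
        streamUrl: episode.mp3Url,                  // the public mp3 in the same bucket
      });
      await s3.putObject({
        Bucket: 'podcast-public',
        Key: 'flashbriefing.json',
        Body: JSON.stringify(feed.slice(0, 5)),     // JSON feeds are limited to 5 items
        ContentType: 'application/json',
      }).promise();
    }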

An episode record in RSS should include the audio file’s size and duration. While a file’s size can be obtained with an S3 API call such as s3.listObjects, getting an audio file’s duration requires running FFmpeg or something similar. This can be done by creating a Lambda Layer and uploading an FFmpeg binary compiled for the Lambda function’s architecture (ARM in our case).

Since we have to run FFmpeg to get the audio file info anyway, we might as well verify that the file’s bit rate and sample rate conform to the podcast and Flash Briefing requirements. In case they don’t, we can emit an error to CloudWatch and send an email to the podcast producer (more on that below).
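Here is a sketch of the probe, assuming an ffprobe binary shipped in the Layer (Layers are mounted under /opt); the quality thresholds are hypothetical.

    // Probe a downloaded mp3 with ffprobe from the Lambda Layer.
    const { execFileSync } = require('child_process');

    function probeAudio(localPath) {
      const out = execFileSync('/opt/bin/ffprobe', [
        '-v', 'error',
        '-show_entries', 'format=duration,bit_rate:stream=sample_rate',
        '-of', 'json',
        localPath,
      ]).toString();
      const info = JSON.parse(out);
      const duration = Math.round(parseFloat(info.format.duration)); // seconds, for the RSS record
      const bitRate = parseInt(info.format.bit_rate, 10);
      const sampleRate = parseInt(info.streams[0].sample_rate, 10);
      if (bitRate < 48000 || sampleRate < 16000) {   // hypothetical thresholds
        console.error(`Bad audio: bit rate ${bitRate}, sample rate ${sampleRate}`); // lands in CloudWatch
      }
      return { duration, bitRate, sampleRate };
    }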

Finally, we can use EventBridge to schedule the Lambda function to publish a due episode at set times. For our daily podcast, I set it to 7:30 am.
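The schedule itself is a one-time setup, something like the following. EventBridge cron expressions are in UTC, so a 7:30 am trigger, assuming JST as the target time zone, becomes 22:30 UTC of the previous day; the rule name and target ARN are placeholders.

    // Fire the publishing Lambda every day; 7:30 JST = 22:30 UTC the previous day.
    const events = new AWS.EventBridge();

    await events.putRule({
      Name: 'publish-daily-episode',
      ScheduleExpression: 'cron(30 22 * * ? *)',
    }).promise();
    await events.putTargets({
      Rule: 'publish-daily-episode',
      Targets: [{ Id: 'publisher', Arn: publisherLambdaArn }], // ARN of the publishing Lambda
    }).promise();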

Marketing your podcast, automatically

It would be nice to publish an Instagram post that promotes the new podcast episode. A Lambda function can generate an image with the text from the episode transcript laid over a random color background, store the image in a public S3 bucket, and publish it as a post using the Instagram Graph API.
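A sketch of the two Graph API calls involved, with the user ID, token, and API version as placeholders: first create a media container pointing at the image, then publish it.

    // Publish an image post via the Instagram Graph API (two-step flow).
    const fetch = require('node-fetch');

    async function postToInstagram(imageUrl, caption, igUserId, accessToken) {
      // Step 1: create a media container for the public image on S3.
      const container = await fetch(
        `https://graph.facebook.com/v12.0/${igUserId}/media`,
        {
          method: 'POST',
          body: new URLSearchParams({ image_url: imageUrl, caption, access_token: accessToken }),
        }
      ).then((r) => r.json());
      // Step 2: publish the container as a feed post.
      await fetch(
        `https://graph.facebook.com/v12.0/${igUserId}/media_publish`,
        {
          method: 'POST',
          body: new URLSearchParams({ creation_id: container.id, access_token: accessToken }),
        }
      );
    }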

I found sharp.js to be a simple to use yet powerful JavaScript graphics library. There was a slight hassle, as I had to install it as a Lambda Layer. Once that was done, composite images like this one could easily be created programmatically in Lambda.
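As a sketch of the approach (sharp renders text via an SVG overlay; the sizes and layout here are illustrative, and a real caption needs line wrapping):

    // Render transcript text over a random color background with sharp.
    const sharp = require('sharp');

    async function makePostImage(text) {
      const [r, g, b] = [0, 0, 0].map(() => Math.floor(Math.random() * 200)); // random background color
      const svg = `<svg width="1080" height="1080">
        <text x="60" y="200" font-size="48" fill="white">${text}</text>
      </svg>`;
      return sharp({
        create: { width: 1080, height: 1080, channels: 3, background: { r, g, b } },
      })
        .composite([{ input: Buffer.from(svg), top: 0, left: 0 }])  // lay the text over the background
        .jpeg()
        .toBuffer();
    }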

Knowing when things go wrong

It’s better to have an automated procedure that sometimes makes mistakes, than a manual procedure that sometimes makes mistakes.

— Roberto Pieraccini

The pipeline I described involves a combination of manual and automated steps, third-party services, and network APIs, so errors are inevitable. For example, manually edited mp3 files could have wrong bit or sample rates, or may not even be supplied in time for publishing; microservices can occasionally fail; networks can go down; Facebook can have an outage that affects the Instagram API; and so on.

Alerts can help the podcast producer (me) intervene before the delay is noticed by listeners. One simple way to alert a human about errors on AWS is to have a Lambda function triggered by errors in CloudWatch logs. That function, in turn, can compose a message and send an email using the Simple Notification Service.
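A sketch of such an alerting function, subscribed to the pipeline’s log group; the topic ARN is a placeholder, and CloudWatch Logs delivers the payload base64-encoded and gzipped.

    // Forward CloudWatch log errors to the producer's email via SNS.
    const zlib = require('zlib');
    const AWS = require('aws-sdk');
    const sns = new AWS.SNS();

    exports.handler = async (event) => {
      const payload = JSON.parse(
        zlib.gunzipSync(Buffer.from(event.awslogs.data, 'base64')).toString()
      );
      const message = payload.logEvents.map((e) => e.message).join('\n');
      await sns.publish({
        TopicArn: process.env.ALERT_TOPIC_ARN,    // topic with the producer's email subscribed
        Subject: `Podcast pipeline error in ${payload.logGroup}`,
        Message: message,
      }).promise();
    };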

To summarize, the podcast publishing, marketing, and error alert pipeline looks like this:

Results

As I mentioned in the beginning, things got quite dire before the automation: malformed feeds and irregular publishing dropped the podcast off the charts and out of the first pages of search results on the Apple, Google, and Spotify podcast platforms.

Within about three weeks of regular automated publishing, the podcast got back into the top 5 on Chartable, and it consistently ranks at the top of keyword searches on the major podcast platforms.

Within a month of daily automated Instagram posts, combined with periodic manual follow-unfollow routines, Instagram followers grew from 80 to 800 (10-fold).

More recently, the podcast’s rating on Chartable has been less consistent, so more aggressive marketing may be necessary to bring in new podcast subscribers, one of the variables affecting the rating.

Acknowledgements

I’d like to thank Ayako-san for consistently providing top-quality content, a prerequisite for, and a motivation behind, any other kind of optimization.

The Instagram automation was heavily inspired by this legendary post.

This work is supported by susuROBO, Inc.
