Over the last six months we’ve received a lot of interest from clients wondering how to win and where to play in the audio media space. We’ve taken a look at the landscape and summarised the emerging areas of early-stage opportunity to spark new conversations.

There is no doubt that audio media is having a resurgence. Audio has been termed the “hottest media trend”, the “most immersive media” and the “nation’s constant companion”, and during the pandemic audio streaming spiked by 45% and connected listening by 41%.

Listeners’ dependence on audio for entertainment, information and connection laid the groundwork for the advent of social audio – a term that was pretty uncommon until Clubhouse entered the scene in May 2020. Clubhouse went from a $100M valuation at beta to $4B in 11 months, and its rise has caused two main buckets of activity to emerge in the audio media investment space:

1. Funding the ‘new’ form of social media: social audio

2. Facilitating superior audio content creation

We’ve dived into the two spaces to analyse emerging trends, interesting players and the opportunity for media powerhouses moving forward.

1. Social Audio

It’s not surprising that audio chat rooms like Clubhouse have seen the most attention and investment in the audio space to date. Facebook has announced the imminent launch of its Live Audio Rooms feature, Twitter Spaces is in beta testing, and LinkedIn, Reddit and Slack have all announced intentions to follow suit. Spotify acquired Betty Labs to launch Spotify Greenroom, and Mark Cuban has invested in Fireside to get a piece of the action. It’s safe to say the audio chat room segment is saturated, so where is the opportunity?


Stepping away from celebrity-endorsed streams and CEO-sponsored chats, the wellbeing space harnesses audio’s foundational element of connection. With the mental wellness market growing at a 20% CAGR, Quilt has recently raised $8.3M in seed-stage funding as an audio network for supportive conversations. It hosts ‘sitting room’ chats to foster better mental health, spanning topics such as spirituality and healthier relationships. With TooDeep tackling loneliness and MyWave promoting self-growth, it’s easy to see why the anonymity and emotional relief that social audio can offer make it an interesting play for investors.

Short form audio

Looking beyond live audio streams, attention is starting to shift towards short, shareable audio clips. Short-form audio offers a rich opportunity to emulate the success of TikTok with viral audio by tapping into a whole new audience of people who want to share content but retain an element of privacy.

Companies like Swell support audio-only group chats, private conversations, bite-sized podcasts or live conversation that’s structured more like a comment thread. For all four formats, users post standalone audio clips that listeners can browse and respond to in their own time. Think of it as an audio-only Reddit. Despite Facebook announcing its own feature, Soundbites, the short-form space is seeing a lot of early-stage activity, with companies like Beams and Quest leading the charge.

Social music collaboration

The new form of mixtape. Social music apps such as Roadtrip and Cappuccino Fm, which allow you to create playlists with friends, seem to have been superseded by start-ups merging music collaboration, commentary and personalised playlists. Multiple players in this space have gained Series A and B funding in the last 12 months. Seoul-based Spoon Radio raised $59M for its social radio station, where creators collaborate with listeners and are incentivised by “gifts” sent from fans. Stationhead received $12M for a similar model that lets users present their own radio shows from personal music playlists hosted by Spotify and Apple Music. However, the future of the category seems to lie with emerging start-ups, such as Headfone, that harness machine learning algorithms to build personalised audio feeds for listeners. The platform aggregates playlists, stories and talk shows to curate an audio content channel that’s totally unique to the individual.

2. Audio content creation platforms

Behind the three social audio opportunities above is the technology that makes them happen. Even in saturated categories such as audio chat rooms, there’s an opportunity to spearhead the behind-the-scenes magic that facilitates the creator economy.

Augmented audio

This category is all about taking sound quality to the next level by recreating space and sound virtually. Companies such as High Fidelity, which has secured $72M in funding, use spatial audio mapping to recreate virtually how we hear sound in real life. By using a 2D map to define sound sources, users can ‘move around’, which changes the volume of speech and music depending on distance. Imagine joining a video call and being able to have multiple conversations at the same time in the same room, and actually hear what the person ‘nearest’ you is saying. Following suit, Spatial, Sonar and Riff have all obtained early-stage funding to make audio more immersive and interactive. This could even include the emergence of bots using natural language processing to engage in ‘human’ conversation – a real-life version of the movie Her.
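To make the distance-based volume idea concrete, here is a minimal sketch of how a 2D map could drive playback levels. It is an illustration only, not how High Fidelity or any other vendor actually works: the inverse-distance falloff, the function names (`gain_for_distance`, `mix_levels`) and the room layout are all assumptions made for the example.

```python
import math


def gain_for_distance(distance, reference_distance=1.0):
    """Assumed inverse-distance falloff: full volume inside the
    reference distance, then gain shrinks as reference / distance."""
    if distance <= reference_distance:
        return 1.0
    return reference_distance / distance


def mix_levels(listener, sources):
    """Return a {name: gain} mapping for a listener at (x, y)."""
    levels = {}
    for name, (x, y) in sources.items():
        distance = math.hypot(x - listener[0], y - listener[1])
        levels[name] = gain_for_distance(distance)
    return levels


# Hypothetical room; positions are in arbitrary map units.
room = {"alice": (1.0, 0.0), "bob": (4.0, 3.0), "music": (0.0, 10.0)}

at_door = mix_levels((0.0, 0.0), room)    # standing next to alice
near_bob = mix_levels((4.0, 3.0), room)   # after 'moving' over to bob
```

In this sketch, the listener at the door hears alice at full volume and bob at a fifth of it. After moving to bob's position the balance flips, which is the effect the paragraph above describes.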

Audio analytics and advice

Audio analytics platforms are emerging with the aim of improving audio marketing or social audio reach. Start-ups such as Oto use AI to analyse social network behaviours and sentiment to optimise content. They can also adapt content to different languages and cultural nuances to make sure the audio resonates appropriately with multiple audiences. We will also begin to see marketing agencies enter this space to offer audio content advertising, as well as voice talent sourcing and social audio brand building.

Audio publishing tools

Social audio platforms are expected to eventually provide recording capabilities to help users generate and memorialise their own content more easily and in better quality. Podcast hosting and publishing platform Acast has reached $170M in total funding and quickly diversified into offering audio analytics, advertising and app building. Acast now counts over 125 million listens each month on its platform and app, with a 40% increase in advertising in 2020. The future of audio publishing will be an aggregation of features: APIs for customisation, synthetic media features such as sound effects and voice filters, and the ability to publish to many social audio networks simultaneously.

Text to speech

Last but not least, there has been a flurry of early-stage investment activity in the text-to-speech space. This category has been around for a while, but investment is focusing on start-ups that have the right technology to compete with human accuracy. Noa, the biggest app for audio journalism, provides narrated versions of articles from The New York Times, Bloomberg, Financial Times and The Economist. SpeechKit takes this a step further by allowing news outlets to turn written articles into audio podcasts in 16 different languages. Going the other way, from speech to text, individuals can even use Otter, which just raised $50 million in Series B funding, for its AI transcription interface that will minute your Zoom meetings. The future of this space will be determined by the growing sophistication of technology that will ultimately negate the need for a human eye to sense-check the transcription.


The next year will begin to demonstrate the results of the tech giants’ investment in the social audio space. Despite saturation in the pure B2C audio chat room category, there is still plenty of opportunity to leverage the demand and desire for audio content. The most interesting areas to play in will be short-form audio and audio publishing: both are best placed to meet the expected demand and provide the sound enhancement needed to gain traction in the category.