The Duality of Social Audio: Leaning In While Leaning Back

April 22, 2021 by Andrew Cohen

WHAT’s Happening?

Over the past six months, there’s been a boom in social audio. Billions of dollars of capital are flowing into the space, and incumbent tech platforms are announcing new audio-focused product launches.

In December 2020, Clubhouse had 600,000 users. As of April 2021, it has 10 million. But just as it’s beginning to achieve critical mass, its core “Social Audio” experience is being replicated by a slew of well-capitalized incumbent platforms, as well as a few ambitious upstarts.

WHY Is This Happening? 

Social Audio combines the intimacy of podcasts with the interactivity of livestreaming to create a deeply engaging content experience. It also merges the flexible ambience of podcast consumption with the frictionless “drop-in” consumption of live / linear programming to create a broadly accessible content experience. As a result, incumbent platforms see Social Audio as a massive opportunity to deepen engagement and expand time spent on-platform, for both users and creators. 

Clubhouse ushered in the Social Audio revolution by fusing key elements of spoken-word audio and live / linear video to develop a new content experience that’s uniquely accessible and engaging. Incumbent platforms view this feature as a tool to expand the time spent in their ecosystems by users and creators, and to deepen overall engagement.

We’ve seen this before. After Snapchat introduced “Stories”, we soon saw the format adopted by Instagram, Facebook, and even LinkedIn. After TikTok broke through, Instagram launched a TikTok clone called “Reels”, and other platforms followed suit. Twitch introduced livestream gaming, and soon after, YouTube, Facebook, and Microsoft launched similar products.

So will Clubhouse cease to be relevant as a destination platform once its core experience becomes integrated into platforms with more entrenched user bases? Or — like Snap, TikTok, and Twitch — will Clubhouse establish its own identity and carve out a distinct role in the increasingly crowded social platform ecosystem?

In Part 2 of this report, we’ll outline how we believe the Social Audio space will evolve. But first, we’ll explore what makes this medium so powerful in the first place.

Why do so many platforms see so much potential in Social Audio?

Social Audio combines the intimacy of spoken-word audio with the communal interactivity of livestreaming to drive deeper levels of engagement. And it combines the flexible ambience of podcasting with the frictionless “drop-in” nature of live and linear video to provide a content experience that’s broadly accessible, which will increase the time spent on platforms that offer the emerging medium.

Together, this unique intersection of podcasting and livestreaming, two mediums that have ascended in today’s “Passion Economy”, simultaneously provides “Lean-Back” and “Lean-In” experiences for both creators and fans.

“Lean-Back” Experience

Combining the flexibility of podcast consumption with the frictionless “drop-in” ambience of live and linear video lowers the barrier-to-entry for consumption and expands the amount of time spent on-platform.

Nextgen video services like TikTok and Twitch make the most of the visual medium, leveraging the power of video to supercharge engagement, thereby demanding and rewarding the undivided attention of their viewers. These platforms are creating a new, elevated standard for visual attention.

Spoken-word audio content has thrived in this overly stimulating consumption environment because it’s a complementary experience to all the visual platforms vying for our attention. Spoken-word audio’s role as portable “background programming” makes it a uniquely flexible and accessible medium for users.

Between 2014 and 2020, the share of spoken-word audio consumption by Americans aged 13 – 34 increased by 84%. And when asked why they’ve increased their intake of this content, 79% said the fact that “You can do other things while also listening to spoken word audio” has been a key factor.

Mark Zuckerberg, founder and CEO of Facebook, which is investing heavily in incorporating Social Audio features into its platform, shared his perspective on how the flexible nature of social audio is uniquely valuable for both consumers and creators:

You can walk around a lot more easily. You can consume it without having to look at the screen and kind of do that in the background while doing something else.

The flexible portability of audio makes it a powerful tool for platforms looking to expand consumption time, as it can function in the background, filling in the cracks in one’s day where visual attention is not an option.

However, the on-demand format also introduces a source of friction that limits the overall accessibility of the medium. Podcasts require a heightened commitment from the listener, as users must proactively discover and subscribe to podcast programs (and discovery is a sub-par experience for the user, across any existing platform). Although this commitment helps breed the audience intimacy that makes podcast listenership so valuable, it also creates a barrier-to-entry that stifles the volume of total consumption.

Livestream and linear video platforms, on the other hand, are accessible in the exact way that podcasts aren’t: they provide frictionless, low-intent, “drop-in” consumption. As much as viewers love the convenience of unlimited on-demand viewing options, sometimes they just wanna flip through the channels and “see what’s on”. This demand for low-commitment tune-in helps explain the growing usage of platforms like Pluto TV, the Roku Channel, and Tubi.

Further, it’s no coincidence that the ambient verticals that work best in “drop-in” video environments, like FAST (free ad-supported streaming TV) platforms, are the same ones that are most successful in spoken-word audio. Both mediums fulfil the role of ambient background programming. For example, the most popular spoken-word audio category is News, which is often cited as one of the only programming categories that people prefer to watch in linear environments because they can easily drop in and out, consuming it in the background while they attend to other tasks.

But linear and livestream video lack the portable consumption flexibility that has enabled the mass adoption of podcasting. I’ve personally tried to treat ambient video like I would a podcast, by tuning in to the CNN live channel on the YouTube or Roku mobile app. I just wanted to drop into the news to catch up on what was happening that day. Because the content consisted of hosts conversing around a table, which is not something that demanded my full ocular attention, I put my headphones on and walked around my apartment listening to it while I attempted to multitask and consume other pieces of visual media on my phone, as I would when listening to a podcast.

But the consumption experience on livestream video platforms isn’t nearly as flexible as that of podcasts. The video window monopolized my phone, and I had to choose between listening and responding to a text or checking Instagram. I ended up dropping out of the stream, as it wasn’t engaging enough to warrant my full attention, yet not flexible enough to integrate into the background of my routine.

And that’s where audio comes in…

Social Audio bridges the gap between live and linear video, which demands little opt-in but offers little portability, and podcasts, which are highly portable but demand a high opt-in. In doing so, it maximizes the accessibility of ambient background programming.

Social Audio therefore has the potential to be the go-to source for flexible and ambient content consumption. For example, shortly after waking up, you may just want to drop into “whatever’s on” while you’re getting ready, without making the conscious effort to sort through limitless on-demand options to find the perfect podcast for the moment. So you just drop into a live show on Clubhouse (or soon, on Spotify, Facebook, or elsewhere). Unlike with a live linear FAST channel, you can walk around the house and use your phone without interrupting the ambient stream of background programming. And if you’re still enjoying the show when you’ve wrapped up your morning routine and it’s time to leave the house, you can take it with you on your commute, while continuing to use your phone for GPS, responding to emails, or whatever else you’re doing.

For creators, the flexibility of social audio has the potential to be just as significant.

As we’ve written about in the past, production and distribution are much simpler in podcasting than in almost any other creative medium. That includes livestream video, which, as we’ve outlined in another recent piece, is a more complex production process than it’s given credit for. But podcast production is still not completely frictionless. That’s why Spotify paid a reported $150 million to acquire Anchor. At its simplest, recording a podcast still requires equipment, time, and know-how (we’re learning this firsthand through building our own podcast network!).

But Clubhouse has made recording and distributing an audio show as simple as having a phone call.

Ben Thompson comments:

It’s much easier to get a group of people together for an informal conversation that requires nothing more than the tap of a button than a formal podcast recording. Convenience matters! It matters more than anything.

This ease of creation doesn’t just appeal to professional creators, but also to UGC creators who may be interested in podcasting but don’t have the resources to get started. A user review on the app store page for a new Social Audio app called Stereo commented:

“This app is great for an impromptu conversation that you want to share with others. If you’ve ever wanted to create a podcast, but don’t have any of the gear, this app is definitely for you.”

It’s through this broadened accessibility, for both consumers and creators, that incumbent platforms see the opportunity to use Social Audio to extend time spent in their ecosystems. For example, users already turn to Facebook to interact with their social networks through visual formats. But what about all the instances where they’re unable to be visually attentive, whether they’re driving, cooking, or folding laundry?

With Social Audio, users will still be able to access the core functionality of the platform through the powers of audio and voice. And while most audio creators may be too busy to produce the quality required for a Spotify Original podcast, with the ease of production and distribution enabled by Locker Room, the Social Audio app Spotify recently acquired, these creators can always hop into a live room to engage with users on the platform. The end result? Creators and users spending more time and engaging more deeply on the platform. That’s a win, win, win.

“Lean-In” Experience

Infusing the inherent intimacy of podcasting with the interactive capabilities of livestreaming supercharges user engagement.  

As we’ve discussed, podcasts are renowned for the intimate relationships they enable between creators and fans.

However, podcasting is inherently a one-way medium. And despite the intense connection that podcasts enable between creators and fans, a one-way dynamic is not enough to truly maximize engagement.

Tom Webster, SVP at Edison Research, explains:

Podcasting has become the greatest companion medium because you can listen to a podcast and it can be a friend. Now is a good time to be friends with our listeners and make them feel like they are part of a community, and not an audience…Podcast audiences have changed. We have had six months in quarantine and now in 2020 and the foreseeable future, we want connection. And it is providing that connection that is incumbent upon every podcast creator and producer to find ways to connect people.

Fans want to interact with the audio creators they feel are close personal friends. They want to participate in the shows, rather than just observe them. But because of the one-way nature of podcasting, the only outlets for this real-time fan engagement have typically been the live event tours, AMAs on Reddit, or supplemental livestreams on Twitch, YouTube, or Instagram that some podcasts also run.

This dynamic hurts platforms as well, especially audio-native platforms like Spotify, where the one-way creator-fan relationship narrows the expanse of potential user engagement. For example, Spotify owns The Ringer, and many of The Ringer’s podcasts are Spotify Exclusives. Yet if one of these properties wants to interact with its community in real time, it must move that audience to another platform, as when the hosts of The Ringer’s Game of Thrones podcast hosted live aftershows on Twitter. Spotify has invested nearly a billion dollars to expand into podcasting, in part to grow its share of total platform engagement and listening hours by eliminating the need for users to listen to music on Spotify and podcasts on Apple. Given that investment, the lack of live and social programming represents a blind spot that threatens its ambition to be the preeminent home for all things audio.

Livestream video, on the other hand, thrives because most social livestream products offer the real-time audience interactivity that podcasts lack. Younger audiences have demonstrated a strong affinity for these intimate and interactive formats. In Q1 2021, viewership doubled for both Twitch and YouTube Gaming YoY, culminating in 7.7 billion combined viewership hours for the quarter. And this consumption isn’t just confined to gamers. “Just Chatting” is actually the most popular content vertical on Twitch, where, instead of playing video games, creators just hang out and chat with fans. This trend signals a powerful consumer demand for live creator and community interaction.

In China, the creator intimacy enabled by livestreaming has produced a $170+ billion Livestream E-Commerce industry.  We recently conducted a survey with Chinese consumers, and one avid livestream shopper told us that the interactive nature of the broadcasts makes it feel like the creator is “having a one-on-one conversation with me”. And in addition to the hyperactive engagement facilitated by live chatting, the authentic “hangout” feel of livestreams makes creators on these platforms feel like intimate friends to their fan communities — just like podcasts. So Social Audio uses the power of live to double down on the intimacy of audio while adding a much-needed element of interactivity.

Creators are demanding more outlets for fan interactivity as well. For creators, fan intimacy and interactivity represent a new pathway to audience engagement, which, in today’s “passion economy”, is the surest pathway to sustainable and diversified revenues.

Spotify’s founder and CEO Daniel Ek commented:

My fundamental view is that Clubhouse is…a creative format and it’s super-engaging for creators. It’s very interesting with the interactivity, so we obviously pay a lot of attention to all social and interactive features.

For one-way content platforms like Spotify, Social Audio thus presents an opportunity to evolve into a community hub for audio lovers. And in turn, for community hubs like Reddit, Twitter, Facebook, and LinkedIn, Social Audio presents an opportunity to leverage the power of audio to create new outlets for audience intimacy and engagement, and to reinforce communal ties while expanding time spent on-platform.

By merging the most engaging components of spoken-word audio and livestream video, Social Audio represents a uniquely engaging communications format that benefits fans, creators, and platforms alike. Like the best podcasts, the vibrancy of an audio-only conversation is potent, cultivating deeper levels of audience intimacy. Like the best livestreams, each show is a communal experience that fans participate in, rather than merely observe.


In Part 2 of this Report on Social Audio, we’ll explore…

HOW will the Social Audio Landscape Evolve Going Forward?

  • Social Audio will NOT be a winner-take-all market

WHAT innovations need to happen for Social Audio to reach its full potential?

  • Monetization

  • Social Audio native formats and IP

  • Personalized discovery

  • Verticalized feature sets

  • Live + on-demand editing (creator tools like those that increasingly exist for livestreaming)

  • On-demand recording / listening + RSS distribution

Ping us here at any time. We love to hear from our readers.
