a16z
The Present Future of Audio: Talk, Music, Video, Interactivity

Connie Chan, Gustav Söderström, Sonal Chokshi
Oct 14, 2020
Episode Transcript
0:00
Hi everyone, welcome to the a16z Podcast. I'm Sonal, and today we are talking about one of my many, but actually probably most favorite, topics: the future of audio. Our special guest is Gustav Söderström, the chief R&D officer of Spotify, which is the world's most popular audio streaming subscription service. As a reminder, none of the following should be taken as investment advice. Please see a16z.com/disclosures for more information. Also
0:30
joining this episode is a16z general partner Connie Chan, who covers consumer and writes a lot about tech trends and product in China and beyond, alternative monetization models, and more. She and I have actually done a couple of podcasts on podcasting: one, a podcast about podcasting with Nick Quah, and the other on how we at a16z podcast. You can find both of those episodes, as well as other resources on the topic, at a16z.com/podcasting. Note also
1:00
that Spotify actually got into podcasting in 2015. We were actually included as one of their launch partners for that, among select others, since they were huge fans of the pod.
1:10
We still are, so it's still
1:11
true. Thank you. Anyway, in this episode we actually go beyond podcasting to talk about the broader category of audio: past, present, and future. So we chat about the parallels and differences in audio and video, including referencing an episode I recently did with Eugene Wei on TikTok, which you
1:30
can also catch in this feed. We discuss the trend of interactivity as well as augmented audio, and where we are right now: what's possible, and what are the challenges. We talk about where podcasting and music converge and diverge, both on user experience and design as well as technically in machine learning. And finally, we go deep on recommender systems, the idea of, quote, "hearing like an algorithm," and where subscription models come into machine learning. But we also talk throughout this episode about the
2:00
trade-offs of full-stack approaches, regardless of what kind of company you are, and the topic of super apps as well. And we're also really talking about how innovation happens in practice, whether it's having an opinionated point of view about the future or listening to users, disrupting oneself, how to change an organization, and much more. But we begin, however, with a super quick debate on how much things have or haven't changed in the podcasting world, at least since we did our last podcasting episode over a year and a half ago.
2:30
I actually personally think that audio hasn't changed that much yet. A lot of things are still, I don't know if broken's the right word, but just problems that are not solved yet. Discovery is still difficult. Search is still difficult. It's really like a one-way listening experience. You aren't interacting with other listeners. You aren't interacting with the creators. Creators still have to rely on very old business models for monetization that ultimately don't work for a lot of long-tail creators. A lot of those big problems still
2:59
exist, but I do have this optimistic feel that we're on the cusp of change that's going to come to the broader audio market. You're right. Those things actually haven't changed very much. I was thinking of the fact that the content landscape in podcasting has super exploded in the last year, two years alone. Spotify itself has led a number of content acquisitions, which is such an interesting evolution.
3:24
Yes, it's both very much the same but very much more of the same, right? So like the forklifting of
3:30
your time into your AirPods, that just keeps increasing, right? There's certainly been shifting listening behavior due to COVID: a lot of listening was in the car, and that shifted to speakers in the home. There overall is much more listening. And to your point, certainly we've invested aggressively in content and exclusives, so the creator side of this landscape has changed in a direction that we wanted it to change. But I would also agree that we're on the cusp on the consumer experience. What's so interesting about audio is it feels like you have this cheat sheet, which is what happened in video.
3:59
We just haven't done monetization in a 21st-century way yet. We have no interactivity. You can really just look at the other media industries and see what's missing, in a
4:08
sense. So Edison Research, which publishes a lot of the leading work in studying podcasting behavior, argued a few things last year. One, that one of the major inflection points in podcasting interestingly came to Spotify because of the streaming, and that brought in kind of a new generation of users. Two, the other argument they made, and this of course previewed a lot of
4:29
content acquisitions, is that for a new generation, the medium of audio is really not that different than video; that in fact, for a lot of people, their default podcast player is often a video app, just turning off the visuals and listening. And so I'm curious for your guys' thoughts on where audio and video, which is another big trend, do and don't intersect, both from a trend perspective and a product development perspective, and then we can dig in deeper on other aspects.
4:59
I mean, video is really just the combination of using your ears and your eyes. It's the audio plus the visual, which means the stakes are actually higher for audio, because I can't have like a 20-second gap of silence in a podcast and expect you to be okay with it, but in a video you can go quiet and there might just be some visual distraction, and you don't have to be on as much every second. And so it's still a different medium, but I do think that the stakes in audio are higher.
5:27
So I think that when you talk about audio, it's different
5:30
depending on the type of audio, actually. So you have kind of foreground audio, which is more similar to video. It is a main activity you're doing, you're really concentrating, and it requires most of your attention. Then you have background audio, like you're listening to music and you're actually paying attention to something completely different: you're working out or you're studying or something, right? So there are these different modes of audio that don't really exist in video. Video is mostly all your attention, or you're doing something else, right? This is also the benefit of audio. That's why it has so much
5:59
engagement, because you have both foreground moments and background moments. But even in the foreground moment, when you're paying full attention, you can still do other things: you can drive, you can do dishes, you can walk around the house, right? So it is this other mode that video doesn't cover. That's why we think it has almost as much engagement as foreground video, but it's not nearly valued the same yet. And that's not because it's less valuable; we think that's because it's undervalued. You can think about it the other way as well: you have some
6:29
video that actually works quite well as audio, that you can background and watch every now and then. Joe Rogan, for example, certainly has video, right? And that actually does help the user experience. But it is what we call backgroundable video, or foregroundable audio, if you want to call it
6:45
that. Just to comment, Gustav, on your point about the modes: that's a phrase that I use when I think about describing people's behaviors, and I actually describe it less as foreground and background and more as passive versus active mode. And so I really believe strongly that
6:59
audio has different modes. Sometimes you're just hanging out in chill mode. Sometimes I'm in passive mode, which means I just want to listen to other people. Other times I'm in active mode, which means I want to talk, or super active mode, which means I want to lead a discussion. So I just think it's really interesting to think in terms of modes. I'd love to hear your initial thoughts on just the medium differences between audio and video. What do you make of the differences and similarities with TikTok, and what we can and can't learn from TikTok when it comes to product in
7:30
audio? Do you guys have any thoughts on that? I mean, Connie, you've written so many posts about TikTok since very early on. Yeah, like TikTok's an extreme example: if you don't look at the screen and you just listen, none of the videos make sense. You'll miss the punchline, like the whole... Exactly. All your props, they're visual for
7:46
TikTok. So I think there are at least two similarities. What they do really well is they take, to Connie's point, commodity music that, if you just listen to it in the background, you miss the whole point, but then they let their users
7:59
unique-ify that commodity music, right, by adding uniqueness to it with their video.
8:03
I think you just made up a word, by the way: unique-ify. Keep going. Yeah,
8:07
and I think that's a great pattern, right? You have something that is commodity, and you can use your user base to turn that into something that is not commodity. It gets this engine that takes these clips and creates unique content around it. So I think that's a really interesting pattern that you could probably copy to other businesses that have commodity content: let your audience do something with it to make it unique. The other analogy that I see to audio is specifically music. If you think
8:29
about Eugene Wei's post on "seeing like an algorithm," what he said was that the medium itself is built to be understood by an algorithm: you're presented with one item at a time, and you either consume or you swipe. So it's built for the algorithm to understand what you're paying attention to, versus, for example, a scrolling feed where the algorithm has no idea which item your eyes are actually
8:50
looking at. Right, isolating the specific variable so that the product developer knows what is working or not working, essentially, for the user.
8:57
Exactly. And if you think about
8:59
music, actually, it's the exact same thing. You present one audio track at a time; you either listen to it or you skip. So in that sense, you can say it's a similar sort of UI. But in
9:09
audio the tricky part is actually just the length of the song versus the length of the TikTok video, because you get to a very quick decision on whether you like that TikTok video or not, literally within like two or three seconds. For a song, as many of you know, the first couple seconds of a song don't sound anything like the chorus or the ending, so you just have to go
9:29
further into the song before you can really gauge if someone truly likes it or not. But to me, that's the only difference.
9:35
Yeah, in TikTok you have more evaluations per minute because they're shorter clips, but it's also more direct. But it is interesting that you mention this, because this is what is happening in the label industry. It is super clear that the intro matters more and more. So you do have the TikTok effect in music. You know, songs used to start slow; they don't anymore, because people skip within the first 10 seconds.
9:54
That's so fascinating. So the TikTok effect, where people are now creating a different kind of music.
9:58
I would say one more thing on
9:59
TikTok. So while there are some similarities between evaluating audio one track at a time and evaluating video one track at a time, there is a big difference, which is that TikTok has your full attention. If you're full screen and you're paying full attention, then it's a pretty good signal. But if you're washing dishes and listening on a speaker, you get a very poor signal. So it depends on the context, and you have to take that into account when you look at the
10:21
signal. I'd love to probe briefly on this part, which is, you both have talked a lot, and Connie, you in particular have written so much, about how mobile is
10:29
literally the thing that made a lot of China's apps work the way they do, because everything was mobile first. And we talked about mobile leapfrogging in our post from, what, now five years ago? Wow, it's been a long time. So where does that come in when you think about innovation in audio? And then Gustav, I'd love your thoughts on this as well, because when you said that in the pandemic a lot of the listening behavior has shifted to home speakers, I'm curious how that changes your views, given the initially mobile-default interface. So if
10:59
I just break down what a phone is and the different components of it: you have the touch screen, which means whatever you're doing on the phone, you can have more interactivity, ideally. But you also have camera and GPS. And you know, the camera is the unlock for TikTok, and the microphone could be the unlock for a bunch of audio platforms, because now it means that I don't just have to be listening. I'm not just leveraging the speaker on the phone, but I'm leveraging the microphone and I'm giving back. The microphone in particular, for audio and video, I think is dramatic.
11:29
That is one of the sensors that is super interesting and underleveraged for audio, I would say. So one of the benefits of being a streaming service is that we understand the consumption situation. We understand if you're listening on a speaker or on an Apple Watch or a phone; we understand if you're in your car, for example, because the phone is connected, and so forth. So we actually think that's a very important signal, and we try to think of them as kind of different jobs to be done. And what we want to try to understand is the situation that you're in, and it's always a combination
11:59
of your play history, your time, and your taste, but the device is actually a really good signal. So there are two levels. One is the UI and the hardware that you can leverage, and that changes when you go from a phone to a connected speaker, for example: you have much less control. You actually still do have a feedback channel in terms of a microphone, as Connie mentioned, but you have less UI, right? So we're thinking about multimodal consumption quite a lot, where you have some devices that are really good for input, on your body, but they're not that good for output. You actually want the sound in your speakers.
12:29
That's why we built this remote control protocol, so that you don't have to interact in the same place that you're listening. You can interact on one device, and so forth. The other way to think about it is on the content level. So one of the things that happened during COVID, when a lot of consumption shifted from the car to the home, was that we have this very successful playlist called the Daily Drive, where we mix music and talk and create literally your daily drive. Now people stopped driving, right? So then we tried to pivot and create the same job to be done, but not while driving. It's different. So
12:59
the two levels: kind of the content level and the pure UX interactivity
13:03
level. Okay, so we can shift into discovery and recommendations in a bit. But before we close this thread, what do you guys think of this trend and phrase "augmented audio," which means different things to different people, but the idea that you can actually, to your point, Connie, much like video has many layers, bring more and more layers into audio as well? You guys have any quick thoughts on that? So many, but it really just leads me to the
13:29
belief that audio today is still this more sit-back experience. It's very much like a one-way consumption experience, the same way that we consume television or the same way that we consume movies. And kind of like more YouTube live streaming, that kind of format hasn't really arrived mainstream in audio yet. And so even just capturing the comments, the feedback to podcasts, that kind of content is not well harnessed today. So there's so many more layers
13:59
around the listener feedback, or interacting with other listeners, or interacting with the creator. A lot of fun should be added on and layered into audio that, right now at least, doesn't exist. It doesn't necessarily have to be fun. I mean, as a creator, I found the news when you guys rolled out your polls feature to be quite interesting, because we just had the debates here in the United States, and I literally was like, I wonder if a lot of the political news shows should do, like, their own polling as part of their audio experience.
14:29
So it's not just fun; it's instant
14:32
feedback. Yeah, I agree. We started with polls, which is both a safe and constructive way to bring feedback. You mentioned the consumers, the listeners, talking to each other. You mentioned the creator talking to the listener. We try to focus on the creator: what tools does the creator want, and actually not just for having fun but, to your point, Sonal, to be a better creator? What information do you want from your fans, and what would make it easier for a creator to produce another episode, for example?
14:59
We started with polls, which is one way to get clear answers on questions you have, and we want to continue in this way, focusing not really on listener-to-listener conversations. I mean, you have Instagram, Facebook, Twitter; there's lots of places to go and talk to other users, but there aren't a lot of places to have good conversations with
15:17
creators. And I think if you focus on creators, there's also a huge opportunity to expand the funnel of creators. If you look at trends in video, lots of the top trending YouTube videos are actually
15:29
reaction videos, where people are watching a video and showcasing a reaction, and TikTok is all about remixing. There's a lot of great audio content out there today, so if you talk about augmented audio, you could take a podcast and then have another person share their thoughts directly, just like a sports broadcaster, even commenting directly on what's happening in the audio, whether it's music or even another
15:53
podcast. Yeah, you have these two extremes: the old-world broadcast, one-way media, and then on the other extreme you've
15:59
got gaming, where the interactivity is the experience. You're not being broadcasted anything; you're actually creating it. And then you have this thing in between, and I think audio needs to move towards interactivity. And like I said, there is basically a cheat sheet where you can look at other types of media. And as soon as you add a feedback loop, the creator gets a chance to improve. So I think that's
16:19
vital. Tell me more about some of your thinking behind polls. When you guys design a product, do you actually have an opinionated philosophy that this is how we think people are going to use it, or are you just giving them the
16:29
bare minimum and then unlocking your community to kind of let loose? A simplified way of asking this also is: is it a Steve Jobs point of view or a Bezos point of view?
16:38
That's a great question, great way to put it, and it's a tough question to answer. It's definitely not a Steve Jobs point of view in the sense that we know how people are going to use it, but we try to be slightly more opinionated. We don't do the complete bottoms-up, throw-stuff-at-the-wall approach. I think it's due to our history. So when we developed products in music, it usually
16:59
involved, once you came up with the idea, a three-year roadmap to go and license that idea from four majors. And if you licensed the wrong thing, you lost four years. So you needed to be right, and you needed to be more sure, because the cost of being wrong used to be so high for us. And I don't know if it's good or bad. I think if we had grown up in a world where the cost of being wrong was just engineering time put into it, or something you can just pull back, maybe we would be different. But we have a pretty specific culture where we actually do plan quite a lot more. I wouldn't say Steve Jobs, for sure.
17:29
And Daniel himself actually talks all the time about distributing decisions, but it is more opinionated. And then for polls, we're lucky enough to have Gimlet and all these studios in-house with lots of fantastic creators. So we get to test this internally, and we use them as an internal inspiration. And sometimes they are the product owners, because they represent the user
17:47
needs. That's fantastic. Connie, more thoughts on interactivity? I feel like you live in this world, and you've talked so much about China apps and what's possible when it comes to interactive audio. So another interesting
17:59
thing about creators that comes from looking at what's working in China is not just giving them feedback on what the audience wants to hear next or what the audience is thinking, but also separating your average listener from your super listener, the person who really wants to even pay you directly for your work, and helping you identify who your real true fans are. Right, you think about the creator economy: a very clear trend that's already been in Asia for a while now. So something like QQ Music, which is the
18:29
main music app that people are using in China: if you have someone who's hosting a radio show or kind of a listen-together type of group chat, there's the option to basically be part of their paid fan club. And then if you're part of their paid fan club, you get a different badge on your own profile, you get access to exclusive virtual gifts that you can send that host, so everyone knows that you're part of that paid fan club, you can get a different announcement when you enter the room, and different kinds of bonus
18:59
tasks. There's a bunch of new features that get unlocked if you're part of this creator's fan club. And ultimately what that allows the creator to do is monetize better than just a traditional advertising route, because in addition to receiving normal virtual gifts from their listeners, from anyone who drops in and participates, you're also cultivating your small following of superfans who really, really love you. I love that you're pointing that out, because it's basically making this link
19:29
that these tools and features are not just about getting more information or data, but actually they're a path to monetization as well, which is super interesting. Well, it helps you create your own empire in a different way. Like one feature I love is this battle feature, where you can almost battle another radio station at the same time and compare how many gifts each of you are able to aggregate in a certain period of time. It's like duets with an audio challenge. It's really focused on how to help creators
19:59
motivate their community and build that core fan base.
20:02
So one of the things I think is really interesting is that these things you mentioned are dependent on actually having a logged-in service, so that the creator can understand their audience. That wasn't really possible over the previous protocols. You got download numbers, but you can't really understand your audience: who is your superfan, what they look like, who they are, where they live, and so forth. The RSS protocol doesn't actually support feedback to the creator; it's a one-way broadcast protocol. But because we're now
20:29
sort of full stack, we can start doing these things that have happened in other industries, the thing that happened in video and in many of these other things. Like, you take text messaging, for example: it used to be standardized, and innovating on that text messaging protocol needed a ton of the carriers to sit in different forums and agree, right? So the benefit was ubiquity and reach, but innovation was really slow. And then at some point something like Snapchat happened that verticalized the whole thing, and, you know, WhatsApp and so forth, and innovation just
20:59
ran away. One day you had disappearing messages, the next day you had stories, the third day you had lenses, because they didn't really have to wait. So I'm really excited about that happening to audio.
21:08
Yeah. This is what we mean when we say it's, like, very early innings in audio.
21:13
Exactly. But there was like a foundation that needed to exist, that does exist in China, to your point. They're all
21:18
verticals. Yeah, I'm very obsessed with, and a student of, the history of innovation, and to me this is a classic arc from when you go from a utility layer to, like, a value-add layer.
21:29
Of course, there's a lot of debate around what platforms should and shouldn't have control over, and that's something that's playing out a lot with crypto and a lot of other discussions. That said, I think the point you're making, Gustav, which makes it less academic and more interesting to users, is that it really comes down to: you are giving me something I can't get right now. Yeah. If you have one app that can give you a vertical position, basically give you everything you want, that app's true understanding of you is very strong, and its ability to personalize things
21:59
for you is higher. Your ability to create a profile that you then are proud to share with other people, or that you want to build upon, whether it's earning different levels or different points, that also increases. I mean, I love what Gustav is saying about how things are more vertical, and there's a lot of benefits when you take kind of the super app mentality. And a super app is basically a product or a platform that focuses on all the different needs a particular customer has, versus giving a single-feature solution,
22:29
and recognizing that, oh, this person loves listening to these kinds of music, but this person also probably loves listening to all these other things. So why not offer this all-in-one package? We now better understand that listener, and we can solve more of their problems.
22:45
So we were actually quite inspired by the super apps of China when we thought about podcasting. The easiest solution, if you're going to build a podcasting app, if you come from a pure design angle, is to build a standalone app, but the
22:59
problem then is distribution. And so we looked at it more from a super app point of view. Then we realized that what users actually wanted was all of their audio, the way they used to have it on radio: music and talk and so forth, mixed. We had a zero user base in podcasting, so we'd be starting from scratch, yet hundreds of millions of music users, and that's an advantage in itself. But more importantly, we understood these users. They were logged in, and so we could just augment their moments. And one of the interesting things we found was that it turns out that your music listening is actually very predictive of your podcast listening.
23:28
You can probably guess
23:29
a person's age range from their music listening alone, right? Yes, you can, for sure. So you're saying people's music listening predicted their podcast taste?
23:38
Yeah, when you want to cold-start a podcast listener, it turns out that your music listening is actually a really good signal for that, for which podcasts to recommend.
23:46
That is incredible to me. I just think people's music listening is so much more visceral and less intellectual that I'm just so shocked by that
23:53
fact. I would not say it was obvious to me either, but it's, like, a very clear result. It also supports the idea of
23:59
the audience: you should think of them as one person, right, and try to serve them in the different needs they have. Yes, think of
24:05
the customer as one person. Right, what you're basically both really saying is, when you think of the super app mindset, it's a cohesive identity of a user's needs. And in fact, if I were to visualize it, I think of that classic da Vinci Renaissance man, where you have like this person at the center and then you have multiple spokes of interests kind of radiating around them. And then you think of each of these moments in their day: it can be time, it could be interest, it could
24:29
be need, it could be whatever job to be done, to use the Clayton Christensen framework that you referenced a few times, Gustav. But what you're both also essentially saying is that a super app, once you have one, is built-in distribution. And so you'd be silly not to use that base to do the cold start.
24:45
Yeah, it's much easier to say, let's put a competing team over there and let evolution take care of it: they build their own app and they compete. But it's at the cost of the user to do it that way. And so the first thing we did was we figured out that instead of having the apps
24:59
as different as possible, you actually wanted to have them be the same thing. And you can say that radio has always done this; people are mixing these mediums. So it didn't seem that far-fetched, but it wasn't clear. And then, if you optimize for ease of implementation, you have small things such as just the fact that the UI has to change from skipping a whole song when you're listening to music to all of a sudden skipping 15 seconds back and forth and scrubbing within a podcast. That's a big challenge to solve dynamically in the same UI. It would have been much
25:29
easier to just maximize the two different hypotheses.
25:32
Yeah. So basically what I'm hearing is, even something as seemingly mundane to the user as the ability to scrub forward 15 or 10 seconds, which I do all the time in my podcasts, whereas in music you can just skip an entire song forward, even that kind of trade-off is actually really complex when you're doing it in the same UI. That's super fascinating.
25:51
Exactly. So the UI has to be much more
25:53
dynamic. I mean, even how you show a track versus an album cover, right, or a podcast episode versus
25:59
the podcast cover, like, it's a very different thing. It's not easy to pull off, and it gets harder and harder the bigger the company is, because it requires real changes that are top-down, that have to come from leadership. It's a change in your org structure. It's a change in your release cycle. It's a massive change, and that's very hard to pull off.
26:19
It was painful. We needed to, quote unquote, force it. It's not like people didn't want to do it, but you needed to get people to work with each other instead of putting it in a different team, and it certainly did
26:29
need a global prioritization from Daniel down. And we have this system to prioritize things globally, called bets, that was born in Spotify, which was very helpful to get these things through the company. And I don't think, if we hadn't had that bets process, that we could have gotten this through the company. It's very hard to do, but this is the benefit of software, right? And this is one of the benefits of being full stack: we can actually try to solve these problems and actually improve the consumer experience.
26:54
So let me ask you a quick question, especially you, given Spotify worked within the existing UI
26:59
to blend from music to podcasting. Where do you stand on the definition of podcast, music, audio? I always talk about how audio is a huge category. Like, I honestly think trying to homogenize audio is like trying to homogenize text. It's like saying a word is the same thing as a book is the same thing as an article, as a blog post, as a tweet. That's ridiculous. However, Connie, you made the argument in our podcast about podcasting with Nick Quah about podcasting and music, and I agreed with
27:29
you as well then, that there's a big difference between the spoken word and the sung word. And so I'd love to hear your guys' thoughts on where we are today. Radio is the integration of both talk and music; they live very symbiotically together. And if you look at most podcasts, they have a music introduction already. There are sound effects in a bunch of them too. So this combination, or this belief that normal talking can be improved with music, or music can be improved with talking breaks, has been
27:59
around forever. But even then, where does and doesn't the blending of music and podcasting actually work, and where does it fall
28:08
apart? Right? So we had this intuition that people wanted their music and their podcast in the same app and that certainly turned out to work but there's a category where they're actually related. It is the same session, right? So this is the thing that we just released. So now we are going to let creators do this new type of session where they
28:29
can mix talk with licensed music in a seamless session. So you see these two user needs. If you take the Clay Christensen approach, you see podcasters really wanting to use and talk about music, but they can't, because the creators do not get paid for some burnt-in song in a podcast, and then you see the music creators that would like to talk about the music. So you have both of these sides at the same time, and it's been really hard to solve, especially if they were two different apps, but now
28:59
It feels very natural that you should be able to have this new type of show. So you've seen us play around with things like Daily Drive, for example, for a long time, where we mix talk and music, and we've seen a lot of success. People love hearing their news and then their new music in the same session instead of, especially when they're driving, trying to switch to the music session and hear the new releases as well. So what we were thinking now is we want to enable anyone to do that, and on the consumer side it is neither a podcast nor a playlist. It's just, yeah, the best
29:29
of podcasting and the best of playlisting, but it is neither, because podcasting has the problem that you actually aren't allowed to feature music in it, and playlisting has the problem that you actually can't comment between the tracks. So we created this new format where you can do some talk, then you can add a Spotify track in there, then you can do some more talking. And so the user can then listen to the talk part as if it was a podcast, they can listen to the track, they can skip the track, but they can also save the track if they like it, one of the things that radio missed. So it's a new
29:59
format, but hopefully it's not new in the bad sense that they have to learn anything new. It should be just like listening, just that it works the way you kind of always wanted
30:05
it to work. What would you call this new format? I think very broadly of it. Again, I mentioned how audio is as heterogeneous as text, so it's ridiculous to use one word for everything, but it is a new kind of audio experience. It's not a podcast. It's not music or a song. I think of this as going back to radio. For me, this is the new radio station. Yeah. This is the new way you can listen
30:27
together. In a sense, a very
30:29
obvious innovation, but also an innovation that requires tons and tons of licensing work over many years and a big investment in podcasting and creator tools and so forth.
30:41
I'm smiling because it's going to open the door for a whole batch of brand new creators, people who don't want to host a podcast and talk the whole way through but now can use music as their passion, as their content, as the thing they're kind of anchoring their talk around, and then this also brings about curation, social
30:59
Right. I mean, I can even think of several a16z colleagues myself that I think would be really good.
31:05
That's what that's what I'm hoping for. I'm hoping for you Connie
31:09
because I niches aside DJ my stuff or I'll be probably Chinese music. We want that tip. Yes, but the point is it really opens the door to new batches of creators and it brings in social discovery and it brings in the idea of curation. It's back to kind of the thought of my playlist but with more color, right and
31:29
And with more storytelling, augmenting, I might even argue. And the interaction that you can have with a mirror, right? In Asia, you can have people order different songs and pay to try and see what's already on the playlist and change that playlist even in real time. So the kind of interaction you can build on top of this is also
31:49
exciting. And you spoke about augmenting there, and I think that's a great point. So we spoke about TikTok, and I mentioned this pattern of taking sort of commodity licensed music and letting the users
31:59
make it unique. So one way to think about this: it's a similar pattern. We've had tremendous success by letting our users work with the music catalog and playlist it. You know, they created billions and billions of playlists that have helped them, it has helped other users, but it has also helped all our algorithms to learn, right? So you can think of this as a similar pattern where you take the commodity catalog but you let any creator, through Anchor, work with it and make it more unique and uniquify it,
32:25
right? I love it, uniquify. Again, well, the other interesting point
32:29
is when Eugene and I talked about TikTok on this podcast, he did bring up that one of the big unlocks, as minor as it might seem, for the remix culture as well was the ability to quickly license, combined with the creator tools, combined with the distribution, so that you do then get, quote, this creativity network effects flywheel, which sort of then reinforces. Yeah. It's a big way that people are interacting with music on the QQ Music app. When you tap into radio stations or listen together,
32:59
you see all these different hosts, and you can listen to them live. When you're listening together with other people, you can choose different topics or categories like friendship, music, emotions, talk shows, and the interactions that you already see happening on these radio stations or listen together: there's a chat that's usually going on while people are listening to music. There are different leaderboards for these different creators. You can have different tasks that the creator asks you to do. You can order songs. You can see what's next on the
33:30
You can gift the creator and thank them for curating this kind of music, and you can even subscribe to their fan club, right? Like, if they always have great music choices, you can make sure that you're always able to know when they release something new and when they go on. So it does unlock a brand-new batch of creators that today don't live on YouTube, today they're not podcasters, but they have a lot of things to say and they love music. So a lot more people will be able to participate.
33:59
And be creators themselves, build a following and eventually monetize.
34:03
I agree. The increased participation of new types of creators is really interesting, because there are all of these creators who clearly want to talk about music, and there are all of these artists who, you know, they've always wanted to be on radio, like they want to be featured by someone, but the business model is often a problem. No one has been able to solve it so that both parties actually get paid. We solved what I think is the harder part, actually, of licensing all the music in the world and paying royalties to organizations. We already solved that, so it feels like it.
34:29
That's what product for us to play
34:30
with. Yeah, when I was growing up, I used to listen to radio shows. You know, I used to listen to Delilah and she would have stories in between and then she would have audience people call in and then she'd have a nice soft music to go with that story. And actually it was
34:45
fantastic. And then you probably recorded the tracks, right? Because you really wanted the
34:49
music. And that's how I discovered music too, right? And that's how she could also resurface music from the past rather than having us listen to only stuff that was released in
34:59
the last 18 months. Let's resurface some of these oldies, and this is potentially a great way to do that. What's really fascinating to me about this is it's almost like a vector to social, because there's nothing more inherently social than music listening and music sharing, as you're noting, of playlist music curating. And to your earlier point about unlocking creators, one of my favorite podcasts actually is Song Exploder by Hrishikesh Hirway. I actually think I heard about this podcast from Eugene, like a year ago, and now it's going to be a Netflix show, and, you know, he really
35:29
Roxy's songs on air, but imagine all the people, like all the kids, all the adults who just lie around listening to music, talking music with their friends, bonding over music. So to me what's really fascinating here is there is a social vector, both socially and parasocially, with acquaintances and strangers, when you think about connecting with fellow fans of those playlists and other people. So I think there's actually a really interesting vector to all that. TikTok is not a social network, but this theoretically could be
35:59
It'd be so
36:00
this is an interesting point. We think about Spotify more like YouTube and TikTok than Facebook and Twitter. It's actually not about following your friends. But I think you're right. I think there are so many creators out there who would love to tell a story about a specific piece of music, right? Their own story, or some story or something, and we'll see how it gets used. I'm hoping, obviously, that many artists would like to tell the story of their own album that they released, for example. Yeah, things that could happen
36:27
even in that great example where the
36:29
artist is telling the story, the artist doesn't have to sign up and say, okay, I'm going to start a brand new podcast. That is such a big responsibility and commitment to take on, and now you kind of have this kind of a Trojan horse to starting a podcast. This really lowers the bar of commitment for creating a show, and you can try it with no real consequence and get that distribution too. Okay. So now let's then talk about how do you solve?
36:59
This is like the big elephant in the room, and potentially the big exciting thing in the room: recommendation and discovery. How do you then think about that side of this, both in the context of Spotify shows and also beyond? We opened this conversation about what has and hasn't changed. This has been a broken problem, quote, in podcasting. It might not be as broken in music. We've talked about TikTok. We've talked about the parallels and differences between video. Let's bring it all back together around this theme and topic of recommendation and
37:29
discovery. For music, there is a commitment of more than two or three seconds to figure out if you like a song, right? So the bar for who you trust as your source for who's giving you that recommendation is higher. And so you either have to have a system that builds trust, showing that their algorithm has given you enough hits, like TikTok can't be wrong five times in a row. The stakes are really high. So you either have an algorithm that is so good, that knows enough about you already, that the majority of the time when they
37:59
give you something, you like it, or you have a creator that also has that same kind of hit rate, that you realize, hey, most of the stuff that that person likes, I also like. That is also a great way to kind of get that discovery element. It's all about giving the user the sense of trust that they're willing to test your recommendation because, say, 80 to 90 percent of the time you're going to be right.
38:20
So I think you're completely right. That was a success with user playlists. There are literally many billions of different curations of the Spotify catalog. So you
38:29
have something for everyone. You just need to find that playlist, or you can use machine learning to learn from that to be able to serve users. Then you have the UI elements themselves, and I think that's different between music and podcasts. Music is easier in a sense because it is three-minute items and you can skip through, and what we see in music is that it's like the investment of how much time you spend versus finding one gem. So it is actually okay if even most of the songs theoretically are not that good if they're easy to skip through and, like, the seventh
38:59
song is like your dream song, because that can make your entire week or maybe month, right? So I try to think about it as, I think Chris Dixon calls this a fault-tolerant UI: if your machine learning is perfect, you only need to show one item; if your machine learning is one out of ten, you probably need to show 10 items, because then there's always one gem on the screen. You have to adapt your user interface to your kind of level of recommendation. And so these playlist formats we try to think of as kind of a GTD, getting things done. You quickly go through and, like, yeah, that was perfect, save that to my library. It's like
39:29
a productivity flow in the discovery moment, which is very different from the consumption moment, when you may be on a speaker, and then it's not okay that you have three bad songs in a row, but it's okay if the fourth one is good, if that makes sense.
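The fault-tolerant UI rule of thumb above (a perfect recommender needs one slot, a one-in-ten recommender needs about ten) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything Spotify has described: the function names are hypothetical, and the second function additionally assumes each recommendation is an independent hit with the same probability.

```python
import math


def slots_for_one_expected_gem(hit_rate: float) -> int:
    """Rule of thumb from the conversation: show about 1 / hit_rate items,
    so that on average one of them is a gem."""
    if not 0.0 < hit_rate <= 1.0:
        raise ValueError("hit_rate must be in (0, 1]")
    # The small epsilon guards against floating-point noise in the division.
    return max(1, math.ceil(1.0 / hit_rate - 1e-9))


def slots_for_confident_gem(hit_rate: float, confidence: float = 0.9) -> int:
    """Smallest n such that P(at least one gem among n items) >= confidence.

    Assumption (not from the source): hits are independent, so
    P(at least one gem) = 1 - (1 - hit_rate) ** n.
    """
    if not 0.0 < hit_rate <= 1.0:
        raise ValueError("hit_rate must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    if hit_rate == 1.0:
        return 1
    n = math.log(1.0 - confidence) / math.log(1.0 - hit_rate)
    return max(1, math.ceil(n))
```

With a one-in-ten hit rate, the first function gives the ten slots mentioned in the conversation; the second shows that guaranteeing a gem with 90 percent confidence would take noticeably more, which is one way to see why easy skipping matters in the discovery mode.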
39:40
That goes back to modes, actually, thinking about the mode the user is in. Yeah. I also think if there are good mechanisms in there for the creators to have potential financial payoff from participating, the creators are actually going to be incented to have discovery. That incentive is actually built in, because you cannot have thousands of
39:59
concurrent Spotify shows all showcasing the same music. No one's going to want to listen to that. And so all these creators are naturally going to be incented to showcase you something brand-new, because what they're really being valued for is their ability to curate and then match that with the storytelling. Let me give you a concrete example. When I go to the gym and someone is trying to do a workout and they're talking through and they have music spliced in between, or just think about a yoga class: they want that variety of music. They don't want you to be listening
40:29
to the same thing time and time again, and now even that gym workout, that yoga class, could exist as a Spotify show where they're making you do push-ups and counting down and then there's music right there in the background. You have to really think what this can unlock.
40:42
I'm definitely hoping for that yoga and push up workout to happen. You have to make it
40:47
happen. Okay, Connie. So either you make a yoga show or you do like a Chinese song playlist? No, but the point is like there's so much context that can now be wrapped around recommendations, like even the time of day.
40:59
Say what are the right kinds of shows that work for the morning? What are the right kind of shows that you want to wind down to those creators will have the incentive to naturally pick what they think makes sense for you
41:10
Exactly. So I think there are two things that are really interesting here. So one is when we think about machine learning overall and recommendations from a product point of view, and this is completely borrowed from Andrew Ng, by the way, so it's nothing that we came up with, but we try to use this. If you think about what algorithms do really well,
41:29
they tend to scale really well, they tend to be able to personalize at an okay level to hundreds of millions of people. Humans don't do that really well. Humans are incredibly smart and creative, though, but they don't scale so well. So one way to think about this, that I think Andrew Ng coined, was to let the editor, for example, or the creator if we're talking about a Spotify show, or an editorial playlist, this algotorial principle that we use
41:52
algorithm plus editorial
41:54
Exactly, algorithms plus editorial, that we call algotorial. You literally think of the
41:59
editor as the product owner. This is the product person that has the idea and the hypothesis, and they come up with what the job to be done is, or what the hypothesis is, or what the use case is. So for example, you take something like songs to sing in the car. No machine came up with that idea. It was a human who sat and said, like, I think there's a user need here, people want to scream their lungs out when they're driving to work. So how do you teach a machine this? The algorithm doesn't understand what songs to sing in the car means. Is that like a bit of 80s music? Is it a bit of movie.
42:29
But for a human it's super clear. Like, this is a song to sing in the car, this is not. So what the editor does is they literally create, like, a playlist of a few thousand tracks, and then the algorithm can understand it, and they can personalize it to 300 million people and scale it, right? So the job of a product owner is to create this data example, this data wireframe. I think that loop has been very useful for
42:49
us. So basically bonding the best of human creativity with the best of algorithmic scaling in order to deliver on the personalization and recommendations to
42:59
Mass of
42:59
users. Exactly. Humans have to come up with the ideas. They have to show the ML system what that idea actually looks like for the ML system to understand it, because the ML systems are great at scaling but not great at coming up with new
43:11
ideas. Can you give me a little bit more color on some of the challenges here? I'd love to hear about how you have to think about solving them. What's hard about algotorial, but then more specifically about how you had to negotiate that when you transitioned from music to podcasting and then now in blending the two.
43:29
Want to hear a little bit more color about it, basically.
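The editor-plus-algorithm loop described earlier (a human defines the idea as a pool of tracks, the algorithm personalizes within it) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Spotify's implementation: `personalize`, the track IDs, and the per-user affinity scores are all hypothetical, and a real system would learn those scores rather than take them as a dictionary.

```python
from typing import Dict, List


def personalize(editor_pool: List[str],
                user_affinity: Dict[str, float],
                k: int = 3) -> List[str]:
    """Pick the k tracks from the editor's pool that this user likes most.

    The editor decides what is eligible (the "data wireframe");
    the algorithm only decides the order for each listener.
    """
    # Tracks with no known affinity default to 0.0. Python's sort is stable,
    # so ties keep the editor's original ordering.
    ranked = sorted(editor_pool,
                    key=lambda track: user_affinity.get(track, 0.0),
                    reverse=True)
    return ranked[:k]


# Hypothetical example: an editor's "songs to sing in the car" pool.
pool = ["track_a", "track_b", "track_c", "track_d"]
# "track_z" scores highest for this user but is not in the pool,
# so it can never be recommended under this concept.
affinity = {"track_c": 0.9, "track_a": 0.5, "track_z": 1.0}
top_two = personalize(pool, affinity, k=2)
```

The design point is the division of labor: the pool encodes the human hypothesis, so the same ranking function can serve very different concepts without the algorithm needing to understand what the concept means.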
43:31
So in music, we have really two sources traditionally of recommendation information. One big source is the playlists, the other is editors, but then we have the third way, obviously, which is the engagement from the users: listens and skips and so forth. Those are the signals in music, but music is different because the items are three minutes long, like we spoke about. It's more like TikTok. Then you go to podcasts, and it's maybe one and a half hours, and then
43:59
You get one
43:59
skip. It's just
44:03
you know, feed the machine, right? It's very low signal. So we have to think about it completely differently. But not only is it much further between the skips, we don't have anything equivalent to a billion playlists. So we had to go back and start working with, quote unquote, more old tech, like knowledge graphs. You have other advantages in podcasts, which is there's actually information in the audio. You have other signals too: you have show notes and you have the transcripts of the shows. So we started
44:29
with those technologies instead to get some understanding. So actually these two stacks are quite different. We certainly share a lot of learnings, but they're not the same thing because they're such different objects,
44:39
especially because podcasts are usually multiple people on a podcast. There's oftentimes a host and a guest. You actually don't know who people are following. Sometimes you don't know, if there's like a Joe Rogan talking to Elon Musk, you don't know if it's because I like Elon Musk or if I like Joe Rogan. That's quite different than music, where there's a bunch of artists, and any song they put out
44:59
I'm going to like, I'll listen to. It's like a cult of personality show, because you're following the host in that case, and in this case you're following the artist. But one thing that I think is really interesting when talking about the knowledge graph is the mood graph, which I always talk about. I coined the phrase when I assigned an op-ed on it a number of years ago at Wired, because I actually think we're missing a huge opportunity in optimizing things. Frankly, my playlists are all organized by mood and emotion. They're not organized by any other criteria.
45:25
That's a great point. And in music that is one of our biggest vectors like one of the biggest
45:29
sections of editorial playlists are the mood playlists. You're completely
45:32
right. That's great. It's interesting you bring up a knowledge graph, Gustav, because it's tough to know: is it a book author, and they're just listening to every single podcast they're on? Is it a content thing? It's so complex and multi-dimensional.
45:44
Exactly, and the answer, as far as we can see, is all of the above. There's personality cult. There is you're following a certain guest around all the podcasts that they visit. There's interest: it's just about computing, I don't care who's talking, right? So you really need this knowledge graph to go
45:59
with all of those dimensions, and then you need to be able to let the user kind of traverse along these different dimensions, and then you can lead them to some discovery. You remember this debate around music: everyone had a music friend that influenced them, and for a while in early Spotify we invested heavily in social to try to replicate that, but it turned out that most of your friends on Facebook, they don't inspire you so much musically. If you average them, it's just the US Billboard. So we take the same approach in podcasts. I mean, we have a core belief that if Spotify can make you discover
46:29
something that you wouldn't otherwise have discovered, it will be more important in your life. So we really try to make sure that we measure and understand how many discoveries we generate for
46:38
you. It's almost like a new metric, a return on discovery instead of return on investment or return on energy. If I think about every app, what is my return on discovery, or ROD, on that particular platform? I'll borrow that for me.
46:52
But another difference from these things is that we are, revenue-wise, mostly a subscription service. So in machine learning in the
46:59
practical world there has been a lot of deep learning and so forth. But in the academic world, for a long time, there's been a lot of focus and discovery and exciting results around reinforcement learning, with AlphaGo and all these
47:09
things. Yeah, we've actually talked about on this podcast quite a bit too
47:12
And not to go through it, but the main idea is just you look for some long-term reward and you backpropagate it through time instead of looking at what is the most likely next click. And so I think if you have a service that is free only and, you know, you have an average engagement, the same every day, it's going to be really hard to, like, back
47:29
propagate signal. It's going to be noisy. But if you have an event four months down the line that is, you know, I went from just consuming ads to paying $120 per year, you have this massive amount of sort of gradient you can backprop through time.
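One simple way to picture a late, large reward being credited back to earlier moments is the standard discounted return from reinforcement learning. The sketch below is only that textbook calculation, not a description of Spotify's system: the function name, the monthly time steps, and the discount factor are illustrative assumptions, and the 120 stands in for the yearly subscription figure mentioned above.

```python
from typing import List


def discounted_returns(rewards: List[float], gamma: float = 0.9) -> List[float]:
    """Compute the return-to-go G_t = r_t + gamma * G_{t+1} for every step.

    Working backwards from the last step spreads a late reward
    (for example, a subscription event) over the earlier steps.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical timeline: three months with no reward, then a conversion
# worth 120. gamma = 0.5 is chosen only to keep the arithmetic easy to read.
credit = discounted_returns([0.0, 0.0, 0.0, 120.0], gamma=0.5)
```

With an engagement-only service every step would carry a small, similar reward, so the returns would look nearly flat; a single large event makes the earlier steps clearly distinguishable, which is the intuition behind the "massive amount of gradient" remark.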
47:42
Oh, I love
47:43
this. And the thing that is different versus, for example, YouTube or TikTok is that every month all the paying users, hundreds of millions of them, they go and they evaluate, like, should I still pay, and they vote with their wallet regardless of how much they actually consume. So we have a different signal
47:59
that is not just engagement and consumption and attention. We can see you keep paying. And obviously, as you know, it's not really possible to do the real reinforcement learning; you basically need a perfect simulator of the world, but you can approximate it quite well. And so that's something that is happening in the rest of the industry as well, slowly. You need enough signal for that to really be valuable. So that's something I'm excited about in the recommendation
48:21
space what you're basically saying I talked about this quite often on the podcast about how subscription models changed so much but what you're saying, which is so
48:29
And to me is that it's also a way to get much better signal into your system. Also, basically saying you're essentially weighting higher people with more skin in the game, which is exactly how you want to design something.
48:42
Exactly. Everyone has saves and likes, but you can think of, like, paying $10 as a super big
48:46
like. Exactly, you're weighting it higher, and you have that data because people are logged in and they're streaming. One of my favorite books is James Carse's Finite and Infinite Games, and he just died, actually. Rest in peace,
48:59
James Carse. But the idea, what you're saying, is you're playing a repeated game with your users, which then gives them an even better game board to play on versus a transactional game only.
49:09
that's exactly it which is a big problem that is important to solve. I think you can try to understand what the user actually values long-term versus just in the
49:16
moment. Yeah. Subscription is a fantastic business model, but also I can see how that would allow new revenue streams for these creators, and I'm not just talking about the people who create the music, but I'm talking also about
49:29
the people who are going to create and deliver a brand-new experience that lives on top of the music. If those people can find some kind of financial payoff in participating, that's a brand new revenue stream. And then think about the possibilities: the kind of interaction you have with that listener at that moment is another area you can charge more. I also love that while we've talked so much about putting the power back for creators, it really does actually most empower the listener. Just one quick question, Gustav. How do you think about the tension between data?
49:59
And all the data you're getting and all the signals and where it goes too far. Like, is there a risk that sometimes in listening to your users you're missing out on what they don't tell you, and how do you think about that as a head of R&D at a company where you're not just abstract R&D, you're actually building
50:15
product. Yeah. I think that's a fantastic question and really hard to answer. It is an age-old problem. I think one way to think about it is, to simplify a little bit, algorithms kind of look in the rearview mirror and draw a straight line into the future. And so
50:29
That's great for a while, but product development, usually good product development, is based on some sort of ideally contrarian hypothesis. And your machine learning is not going to come up with a contrarian hypothesis, right? So you need some mechanism for that to happen. And so we try to think of this in different ways. I mentioned algotorial, where the editor actually has the ability to say, like, no, I believe in something different. So we try to build in this mechanism where humans can go in and, you know, they have the steering wheel, they can take a left turn or something, and then the algorithms follow. And, you know, there are incentives
50:59
to not do it; it is always going to be safer to keep going straight for a while more. Why take risk? All of these things, right? But back to playing infinite games: if you play the game, you know, many times, think about it as game theory, you have to end up in a place where the optimal thing is to try new things every now and then, to try to cover as much space as possible. And as I said, we have a culture of being quite specific in the hypotheses we have, and we try to think about it, as do many companies, as sort of a portfolio. I want to have some things that are
51:29
quite contrarian and have a pretty high chance of failing, whereas I want a bunch of things that are obvious. But that balance, I mean, no one has the perfect solution, but everyone at some scale has to start thinking about it. And so we found a few mechanisms that were useful for product development. One was to take the concept of simple prioritization and the kanban board all the way to the C-suite. You know, everyone thinks they're good at prioritizing, but they're not, and I bet that in most companies the C-suite is the worst at prioritizing. They actually want to do everything. And so we
51:59
have something like five to seven things that the company needs to do, and Daniel owns that, but the one rule is two things cannot have the same priority.
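The one rule stated above, that no two company bets may share a priority, is easy to express as a check. The sketch below is purely illustrative: `ranked_bets` and the bet names are hypothetical, and nothing here reflects how Spotify actually records its bets.

```python
from typing import Dict, List


def ranked_bets(bets: Dict[str, int]) -> List[str]:
    """Return bet names ordered by priority (1 = highest).

    Raises ValueError if two bets share a priority, which forces the
    tie to be resolved at the top instead of by individual managers.
    """
    owner_of: Dict[int, str] = {}
    for name, priority in bets.items():
        if priority in owner_of:
            raise ValueError(
                f"'{name}' and '{owner_of[priority]}' both have priority {priority}"
            )
        owner_of[priority] = name
    return [owner_of[p] for p in sorted(owner_of)]


# Hypothetical bets list; only "podcasts" as the number one bet comes from the conversation.
order = ranked_bets({"podcasts": 1, "bet_b": 3, "bet_c": 2})
```

Because the ordering is strict, any resource clash between two teams has a mechanical answer: the bet that appears earlier in the returned list wins.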
52:07
It reminds me of the Steve Jobs bio anecdote where at one of their offsites they put up a whole list of things and he literally crossed everything off the list and only did the first four. What you're describing, though, is not just siphoning off what to do versus not to do, but what to order the priority from the top, so that the managers don't have this friction and they don't waste in terms of building things. And that's the trick. Yes, I agree.
52:29
And the other thing I think is fascinating about that is that when you say that Daniel kind of owns that, too, when you are disrupting yourself, so to speak, when you went from music to podcasting, putting that higher up on the bets board in his office is like, hey, no complaints, guys. This is it.
52:43
So that's exactly what happened. Podcasts was the number one company bet for two years, and everyone in the company knew it. And so what happens if you don't have that? You push that decision to managers and you create conflict in the org. The truth is Daniel can't have any idea in a
52:59
company of thousands of people what is going to clash with what resources where. The only thing he can do is, like, when you clash, this is the priority.
53:05
I love that as a management thing.
53:07
It's so simple, and one thinks it's complicated. It's actually very simple. It's the discussion that is hard; actually prioritizing is very
53:12
hard. Okay. So we started with talking about where podcasting has been we've gone through what shifted the parallels and differences between video music. We've talked about the trend of interactivity and augmenting audio and different ways. We've talked about recommendations and hearing like an algorithm even Anna
53:29
Order what do you guys think is sort of the future of a lot of these like, where do you think the future is kind of going
53:36
My guess is that if we use the cheat sheet of other media, I think audio is going to increase on the creator side, just like the other mediums. I think it's going to increase the numbers of
53:47
creators. The market for audio is bigger than I think people realize, and as Connie said earlier too, we're still in the very early innings. So my obsession is this two-word phrase that I use all the time of world-building, and to me
53:59
one of the missed opportunities in audio for a long time, and you know, Gustav, you painted this range from gaming models all the way to music models, different things. I actually think we're starting to increasingly see more game-like behavior in audio, and I'm so excited for that kind of world-building, but it's a very different kind of world-building, because audio has an immersiveness that's very different from the visual-based world-building of other worlds. And so I'm super excited for what we can do. I mean, I already think about our expanding podcast network as a form of world-building and
54:29
you mentioned Spotify shows; that to me is another form of world-building, because you're essentially bridging different worlds and creating new experiences. And so to me, that's actually the thing that I'm most excited
54:40
about. So I think that's a great way to think about it, and you think of the music world, the podcast world, and now you can think of this new world where you can mix them, and then you can have other worlds. The thing that I think is going to happen is you look at something like audio and it's so easy to create, it's even easier to create than video. So as we both
54:59
make it even easier and lower the friction for everyone, we let creators make more money, and we add these new formats, what I'm hoping is that that market is going to grow as well, just like we've seen the market for creators growing in other
55:10
media. I think audio will be further optimized in the sense that you can almost peel apart the different nuggets of a podcast, right? You can take certain segments now, you can make a commentary around it now, and you're going to be able to do new things when you break apart a song, when you break apart a podcast.
55:29
See what that will unlock. TikTok is breaking apart a song and kind of getting to a specific 5- or 10-second slice of it, right, a snippet, and then this idea of now taking something that used to be, you know, one piece of content and chunking it down into different things now gives you new building blocks to build new kinds of shows, new kinds of interactions, which means things will get much more participatory. More people can become creators, more people can probably become listeners, more listeners will find each other, listeners
55:58
will become stronger fans of their creators. So I think there's a very hopeful, very optimistic future where now technology actually can help everyone win. That's fantastic. I love that. Gustav, Connie, thank you so much, you guys. Thank you for joining the a16z Podcast. This was super fun. Super fun. Yeah. I wish we could all talk for hours. Take care, everyone. Bye. I should put in a plug.
56:24
The China song show Connie my guys have a really good day or evening for you. Take care of you.