Music recognition was one of the earliest applications of artificial intelligence in the music industry. Back in the early 2000s, AI was being used to identify songs and categorise genres. The pioneers in this field were the open-source project MusicBrainz and the companies Gracenote, Shazam and SoundHound. Here is the story of these four pioneers of music recognition, from simple audio fingerprinting to sophisticated AI applications.

AI in the Music Industry – Part 3: The Rise of Music Recognition

In 1993, Ti Kan, a 30-year-old computer scientist from Taiwan who had studied in the US, had the idea of developing a computer program that could read the contents of music CDs and play them back. This was at a time when the CD was just beginning to take off commercially and was seen as a sound carrier to be played on a CD player. In the early 1990s, the idea of putting a CD into a computer drive and listening to music sounded absurd. Not so for Ti Kan. He wrote Xmcd, an open-source UNIX-based program that enabled his computer to play the music on a CD and, in later versions, to extract it and store it in high audio quality in various digital formats such as WAV, MP3, Ogg and FLAC. Xmcd was therefore not only music playback software, but also a tool for CD ripping.[1]

However, music recognition also requires a database, which was provided by Steve Scherf, who originally studied mathematics and biology at the University of California. He came up with the idea of reading the table of contents (TOC) stored digitally on each CD and using it as the key to a music database. In 1995, he teamed up with former fellow student Ti Kan to link his database project to the Xmcd software and create the Compact Disc Database (CDDB). It was now possible to insert a CD into a computer drive and, as it played, the song title, playing time, artist and any other information held in the database for that disc would be displayed.[2] As the CDDB was designed as an open-source project without any commercial intentions, many sympathisers made their CD collections available to expand the database. In the beginning, information was even sent by e-mail.[3]
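How a disc can be recognised from its table of contents alone can be sketched in a few lines of Python. The sketch follows the commonly documented CDDB disc ID scheme (a digit-sum checksum of the track start times, the total playing time and the number of tracks packed into a 32-bit number). It is an illustration rather than the original Xmcd code, and the sample TOC and the database entry are made up for the example.

```python
FRAMES_PER_SECOND = 75  # CD audio positions are counted in 1/75-second frames


def digit_sum(n: int) -> int:
    """Return the sum of the decimal digits of a non-negative integer."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def cddb_disc_id(track_offsets, leadout_offset):
    """Compute a CDDB-style disc ID from a CD's table of contents.

    track_offsets  -- start of each track in frames (including the
                      150-frame lead-in), in playing order
    leadout_offset -- start of the lead-out area in frames
    """
    starts = [offset // FRAMES_PER_SECOND for offset in track_offsets]
    checksum = sum(digit_sum(s) for s in starts) % 0xFF
    total_seconds = leadout_offset // FRAMES_PER_SECOND - starts[0]
    return (checksum << 24) | (total_seconds << 8) | len(starts)


# Hypothetical three-track disc (offsets invented for illustration).
toc = [150, 15000, 30000]
leadout = 45000
disc_id = cddb_disc_id(toc, leadout)
print(f"{disc_id:08x}")  # prints 08025603

# Hypothetical database entry keyed by the disc ID.
database = {disc_id: {"artist": "Example Artist", "title": "Example Album"}}
print(database.get(disc_id))
```

Because the ID is derived only from track positions, two different pressings with identical track lengths would collide, which is one reason later services moved on to fingerprinting the audio itself.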

An early user and fan of CDDB was Graham Toal, who ran an Internet service provider in Texas. In 1997, he offered to host the service for Kan and Scherf and placed banner ads on the homepage, which quickly gave them an idea of CDDB’s commercial potential. CDNow, an online CD retailer, also got involved, paying CDDB a small fee for each CD found on the site. This established a business model that led to the formation of CDDB LLC in Emeryville, California in 1998.[4] The database was now accessible via the Internet, and anyone who had the software installed could view all the information on a CD by inserting it into their computer drive.

As the CDDB founders lacked the capital to expand their service and didn’t want to waste time running their business, they looked for a buyer for CDDB. They found one in Escient, an Indiana-based manufacturer of hi-fi equipment, including Tunebase, a CD changer that could be loaded with 200 CDs. Escient paid less than a million US dollars for CDDB and renamed the service Gracenote in 2000.[5]

Gracenote was no longer an open-source project but became a business partner for music industry and tech companies. The range of music recognition software was expanded by MusicID and integrated into numerous devices from Philips, Sony and Apple’s iPod to retrieve music information from the web.[6] In 2008, Escient sold Gracenote to Sony Corporation for US $260 million, a substantial profit on the original purchase price.[7] The synergies that the Japanese electronics and entertainment conglomerate had hoped to achieve with the acquisition of Gracenote did not materialise, and as Sony slipped into the red in 2013, it began selling off parts of the company, including Gracenote. In February 2014, Sony sold Gracenote for US $170 million, a discount of 35 per cent on the original purchase price, to the Tribune media group, which operated 28 TV stations and 8 daily newspapers, as well as the Tribune Media Services division.[8] Tribune Media Services had a focus on TV and film metadata. Gracenote appeared to be a good addition in the music sector and was merged with Media Services.[9] In 2014 and 2015, Tribune Media Company acquired a number of data analytics companies in the TV & film,[10] music, video and sports[11] sectors, and incorporated these acquisitions into Gracenote.

Gracenote was no longer just a provider of music recognition software but covered the entire spectrum of data analysis in the entertainment industry. The company had grown to more than 1,700 employees and was generating revenues of approximately US $100 million, when Nielsen Holdings made an offer to acquire Gracenote for US $540 million in cash, an offer Tribune Media could not refuse. In February 2017, the acquisition was completed and Gracenote became part of the world’s largest market research company.[12] At the time of the acquisition, Gracenote had already expanded its business beyond music recognition to include music recommendations and market research. If you visit the Gracenote homepage today, you will still see “Music Recognition” as a business segment alongside “Global Music Data”, “Music Discovery” and “Audio on Demand”. The core of the business is still the MusicID technology used to identify CD and digital content, augmented by speech recognition AI. Gracenote promotes its services by claiming that its technology is integrated into 250 million car sound systems and can be used to organise music playlists. Of course, AI is at the heart of this.[13]

MusicBrainz, which was founded in 1998 by Robert Kaye as an open-source response to the commercialisation of Gracenote, has taken a different path. Unlike Gracenote, MusicBrainz defines itself as an online music encyclopaedia, run by a foundation, through which anyone can research the metadata of music recordings and cover art free of charge. Registered users can not only access music information, but also contribute to it. Currently (as of 1 February 2024), the MusicBrainz database contains 30.8 million recordings by 2.3 million artists and about 262,000 labels.[14] Unlike other music databases such as Discogs.com, MusicBrainz has also used fingerprinting technology to identify recordings. TRM audio fingerprinting was introduced in 2000 and replaced by MusicDNS in 2008. After MusicDNS was sold in 2009, it was in turn replaced by AcoustID, which is still used today by the MusicBrainz Picard application. This fingerprinting software allows users to identify audio files even if they have no metadata.[15]

Before AI conquered music recognition, digital fingerprinting was the gold standard for identifying music. Shazam is the best-known company in this field. It was founded in 1999 by Dhiraj Mukherjee, Chris Barton and Philip Inghelbrecht in London. The idea was to identify the music you were listening to on the radio over the phone. All you had to do was dial 2580 on your mobile phone and the title of the song playing on the radio would appear on the display if it was in the database.[16] However, ambient noise often interfered with the recording and the result was inaccurate or not delivered at all. This changed when Avery Wang, a PhD candidate at Stanford University, joined the founding team. He programmed an algorithm that was able to filter out the background noise and identify the song using fingerprinting technology. This led to a technological breakthrough in 2002. However, Wang was the only one to experience this, as the original three founders had left the company in 2003. Although the company raised about US $9.5 million in 2000 and 2001, it was too little for a real breakthrough and too much to let the company die.[17]
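The principle behind noise-tolerant audio fingerprinting can be illustrated with a minimal Python sketch. It is not Shazam's actual code. It assumes that the prominent spectrogram peaks of each recording have already been extracted as (time frame, frequency bin) pairs, and it uses a simplified version of the widely described peak-pair approach: nearby peaks are combined into hashes, and a sample is matched to the track whose hashes line up at one consistent time offset. All function names, parameters and the toy peak data are invented for the example.

```python
from collections import Counter, defaultdict


def peak_pair_hashes(peaks, fan_out=3, max_dt=10):
    """Yield ((f1, f2, dt), t1) for each anchor peak paired with up to
    `fan_out` later peaks that follow within `max_dt` time frames."""
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt <= 0:
                continue  # skip peaks in the same time frame
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def build_index(tracks):
    """Map every hash to the tracks and track times where it occurs."""
    index = defaultdict(list)
    for name, peaks in tracks.items():
        for h, t in peak_pair_hashes(peaks):
            index[h].append((name, t))
    return index


def identify(index, sample_peaks):
    """Return (track, offset, votes) for the best match, or None."""
    votes = Counter()
    for h, t_sample in peak_pair_hashes(sample_peaks):
        for name, t_track in index.get(h, ()):
            # A true match yields the same offset again and again.
            votes[(name, t_track - t_sample)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


# Toy "database" of two tracks (peak data invented for illustration).
tracks = {
    "song_a": [(0, 10), (2, 25), (3, 40), (5, 12), (7, 33), (9, 18),
               (11, 27), (13, 45), (15, 14), (17, 38)],
    "song_b": [(0, 30), (1, 11), (4, 22), (6, 44), (8, 16), (10, 35),
               (12, 20), (14, 41), (16, 13), (18, 29)],
}
index = build_index(tracks)

# Excerpt of song_a starting at frame 5, plus one spurious noise peak.
sample = [(0, 12), (2, 33), (3, 50), (4, 18), (6, 27), (8, 45), (10, 14)]
result = identify(index, sample)
print(result)  # prints ('song_a', 5, 10)
```

The noise peak at frame 3 produces hashes that match nothing in the index, so it costs a few votes but does not change the outcome. This is the sense in which such a scheme tolerates background noise.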

The rescue from economic ruin was a co-operation with the US telephone provider AT&T, which installed the Shazam software in its devices in 2004 and charged US $0.99 for each use.[18] However, commercial success came with the launch of the smartphone. In 2008, Shazam was made available on the iPhone and the app could be downloaded from the iTunes store.[19] An Android version followed shortly afterwards.[20]

Shazam became one of the most popular and downloaded apps of its time. It was cool to hold up your smartphone in a restaurant or disco to see what song was playing. This success caught the attention of investors. In 2009, Silicon Valley-based venture capital firm Kleiner Perkins Caufield & Byers led a funding round that provided Shazam with much-needed cash to expand.[21]

By the end of 2010, Shazam reported more than 100 million users and expanded its collaboration with TV networks and advertising agencies to enable music recognition for commercials and TV movies.[22] In 2011, a consortium led by Kleiner Perkins Caufield & Byers raised a further US $32 million for Shazam, although the London-based company continued to make losses.[23] Nevertheless, Shazam was able to expand its partnerships with music streaming services such as rdio[24] and the Indian Saavn.[25] In 2013, Mexican telecoms group America Movil pumped a further US $40 million into the music recognition service.[26] After a further financing round, which brought in US $20 million,[27] rumours of an IPO for Shazam intensified. According to a report in the Wall Street Journal, the three music majors Universal, Warner and Sony then secured shares in Shazam worth US $3 million each, after the company had been valued at US $500 million.[28] The IPO never materialised, but Shazam was able to grow its active user base to 120 million in 2015, although the annual loss for 2014 was still nearly GBP 15 million.[29]

Then came the bombshell at the end of 2017: Apple announced its intention to buy Shazam for US $400 million.[30] To observers, this did not come as a surprise. Shazam had been kept alive by venture capital over the years, attracting a total of US $143 million in investment before being acquired by Apple, yet the company never managed to make a profit in all those years. Writing on his music industry blog, Mark Mulligan summarised the situation as follows: “Cool tech without a business model.”[31] For Apple, Shazam was a good addition at that time to catch up with the market leader Spotify, which had acquired The Echo Nest for EUR 50 million in 2014.[32] The deal drew the attention of the European Commission, which was concerned that competition would be restricted due to Shazam’s strong position in the music recognition market and Apple’s position in the music streaming market.[33] The investigation took only a few months, and the Directorate General for Competition found that Apple’s acquisition of Shazam would not harm competition in the digital music market.[34] Shazam’s music recognition has been further developed by Apple using AI, and the app has become an integral part of Apple’s iPhone. In July 2023, Apple announced that the Shazam app could now recognise music used on TikTok, Instagram and YouTube.[35]

Shazam’s direct competitor is the Santa Clara, California-based company SoundHound AI Inc., which was founded in December 2004 under the name Midomi by the Iranian-born mathematician Keyvan Mohajer, when he was still a doctoral student at Stanford University.[36] In 2009, Midomi was renamed SoundHound and entered into a co-operation with the US music streaming service Pandora.[37] SoundHound was already able to convince investors of its business model during this period, and by June 2015 the company had raised around US $40 million, according to an article in the Wall Street Journal.[38] The same article reported the unveiling of the company’s Hound voice assistant, a direct competitor to Apple’s Siri, Microsoft’s Cortana and Amazon’s Alexa. The AI application combined SoundHound’s speech and music recognition software with natural language processing, allowing more complex requests to be made to the voice assistant.[39] As a result, SoundHound convinced Korean car manufacturer Hyundai to integrate its Houndify music recognition system into the Genesis model. In 2017, Hyundai and SoundHound announced that an AI-based voice controller would be integrated into all of the car manufacturer’s new models from 2019.[40] In the same year, SoundHound succeeded in raising a further US $75 million from a group of investors led by Kleiner Perkins Caufield & Byers.[41] A year later, SoundHound succeeded in attracting further strategic investors such as Hyundai, Daimler, Midea Group and the Chinese Tencent Group, which together invested US $100 million in the company.[42] Tencent also opened up the Chinese market for SoundHound. The big bang followed in November 2021, when SoundHound announced that it would go public on NASDAQ in New York with the help of the special purpose acquisition company Archimedes Tech SPAC Partners Co., under the new name SoundHound AI Inc. and at a valuation of US $2.1 billion.[43] SoundHound focused on artificial intelligence in music recognition earlier than its closest competitor, Shazam, which allowed it to grow massively in value, while Shazam was acquired by Apple for US $400 million. However, the stock market listing also meant that SoundHound AI had to cut more than 40 per cent of its workforce by the end of 2022 to remain attractive to investors. This was also the price SoundHound paid for being able to issue a further US $25 million in preferred shares to undisclosed investors.[44]

SoundHound is a good example of how the early use of AI in speech and music recognition has not only enabled the company to become a technology leader in the field but has also significantly increased the value of the company.


Endnotes

[1] A detailed explanation of how Xmcd works can be found on Ti Kan’s homepage: http://www.tikan.org/xmcd/, accessed: 2024-02-01.

[2] Wikipedia, “CDDB”, version of March 4, 2023, accessed: 2024-02-01.

[3] Ibid.

[4] Wall Street Journal, “Three Veterans Advise the Next Tech Wave. It’s All About Business”, December 31, 2001, accessed: 2024-02-01.

[5] Ibid.

[6] Wired, “The House That Music Fans Built”, July 7, 2004, accessed: 2024-02-01.

[7] Wired, “Sony Buys Gracenote for $260 Million”, April 23, 2008, accessed: 2024-02-01.

[8] Billboard, “Tribune to Buy Gracenote from Sony for $170 Million”, December 23, 2013, accessed: 2024-02-01.

[9] Ibid.

[10] Variety, “Tribune Media’s Gracenote Acquires Baseline for $50 Million Cash”, September 3, 2014, accessed: 2024-02-01.

[11] Multichannel News, “Gracenote Puts Up $54M for Two Sports Data Firms”, May 28, 2015, accessed: 2024-02-01.

[12] New York Times, “Nielsen Acquires Gracenote, Highlighting the Value of Data”, December 20, 2016, accessed: 2024-02-01.

[13] Gracenote, “Music Recognition for CDs and Digital Files”, n.d., accessed: 2024-02-01.

[14] MusicBrainz, “Database statistics”, February 1, 2024, accessed: 2024-02-01.

[15] Wikipedia, “MusicBrainz Picard”, version of October 17, 2023, accessed: 2024-02-01.

[16] The Guardian, “Shazam co-founder: ‘We were growing a business in a collapsing market’”, December 7, 2016, accessed: 2024-02-01.

[17] Ibid.

[18] CNET News, “Dial-that-tune comes to U.S.”, April 15, 2004, accessed: 2024-02-01.

[19] CNET News, “Shazam on iPhone could change music discovery”, July 10, 2008, accessed: 2024-02-01.

[20] CNET News, “Shazam moves to Android, works with Amazon MP3 Store”, October 21, 2008, accessed: 2024-02-01.

[21] Business Insider, “Shazam Draws Investment, Is Already Profitable”, October 15, 2009, accessed: 2024-02-01.

[22] Mashable, “Shazam Helps 100 Million Users Identify Tunes”, December 6, 2010, accessed: 2024-02-01.

[23] Billboard, “Shazam Raises $32 Million to Expand Music, TV Services”, June 22, 2011, accessed: 2024-02-01.

[24] TechCrunch, “Rdio And Shazam Expand Full Music Track Streaming Partnership To UK, Canada, Australia, Brazil And Mexico”, May 9, 2013, accessed: 2024-02-01.

[25] Musicweek, “Shazam forms partnership with Indian streaming service Saavn”, April 3, 2013, accessed: 2024-02-01.

[26] Billboard, “Shazam Gets $40 Million Investment from Carlos Slim’s America Movil”, July 7, 2013, accessed: 2024-02-01.

[27] Billboard, “Shazam Aims to Raise $20 Million, Valuation at Half a Billion Dollars”, February 11, 2014, accessed: 2024-02-01.

[28] Wall Street Journal, “Warner, Universal, Sony Buy Stakes in Music App Shazam”, May 14, 2014, accessed: 2024-02-01.

[29] Music Business Worldwide, “Shazam has lost £25m in past three years, but it’s close to 120m users”, August 31, 2015, accessed: 2024-02-01.

[30] TechCrunch, “Sources: Apple is acquiring music recognition app Shazam”, December 8, 2017, accessed: 2024-02-01.

[31] Mark Mulligan, “Shazam Is Apple’s Echo Nest”, Music Industry Blog, December 11, 2017, accessed: 2024-02-01.

[32] Music Business Worldwide, “Turns out Spotify acquired The Echo Nest for just €50m”, May 10, 2015, accessed: 2024-02-01.

[33] European Commission press release, “Mergers: Commission opens in-depth investigation into Apple’s proposed acquisition of Shazam”, April 23, 2018, accessed: 2024-02-01.

[34] European Commission, DG Competition, Case M.8788 - Apple/Shazam, decision C(2018) 5748 final, September 6, 2018, accessed: 2024-02-01.

[35] Heise.de, “Shazam für iPhone erkennt Songs aus Instagram, YouTube & Co”, July 7, 2023, accessed: 2024-02-01.

[36] Crunchbase, “Keyvan Mohajer”, n.d., accessed: 2024-02-01.

[37] VentureBeat, “Tune identifier SoundHound announces new version with Pandora, tour dates”, January 27, 2010, accessed: 2024-02-01.

[38] Wall Street Journal, “SoundHound App Emerges to Take On Apple and Google in Voice Search”, June 2, 2015, accessed: 2024-02-01.

[39] Ibid.

[40] Hyundai press release, “Hyundai Collaborates with SoundHound Inc. to Develop ‘Intelligent Personal Agent’ Voice-Control Technology”, December 21, 2017, accessed: 2024-02-01.

[41] VentureBeat, “SoundHound raises $75 million to expand access to its Houndify voice-powered platform”, January 31, 2017, accessed: 2024-02-01.

[42] TechCrunch, “SoundHound has raised a big $100M round to take on Alexa and Google Assistant”, May 3, 2018, accessed: 2024-02-01.

[43] Music Business Worldwide, “Voice and music recognition firm SoundHound to list on NASDAQ with $2.1bn valuation via SPAC merger”, November 22, 2021, accessed: 2024-02-01.

[44] Music Business Worldwide, “SoundHound raises $25m, just 2 weeks after axing 40% of its workforce”, January 25, 2023, accessed: 2024-02-01.
