> Unfortunately, the resolution of the data that we collect in AcousticBrainz is not enough to be used in this type of machine learning, and so we were unable to try these new techniques using the data that we had available in the database.
Reading between the lines here -- AcousticBrainz can't store copies of any actual audio for copyright reasons. Instead they store various fingerprints and derived properties of the audio signal (example[1]). When the project started they didn't anticipate the right kind of input data required for these newest AI techniques, so to make progress they would have to throw away 7 years of data and start over their data acquisition from scratch. Instead they decided to shut it down.
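For a sense of what "derived properties" means in practice, here is a rough sketch of pulling one of those per-recording feature documents from the API. The `/low-level` path and the JSON field names are my assumptions based on the public data dumps, not something confirmed by the post; only the `/api/v1/<mbid>/...` pattern appears in the linked example, and the MBID below is a placeholder.

```python
# Rough sketch: fetch the derived features AcousticBrainz serves for one recording.
# Assumptions: the /low-level endpoint path and the field names; the MBID is a
# hypothetical placeholder, not a real recording ID.
import requests

MBID = "your-musicbrainz-recording-id"  # hypothetical placeholder

resp = requests.get(f"https://acousticbrainz.org/api/v1/{MBID}/low-level", timeout=10)
resp.raise_for_status()
features = resp.json()

# Everything in here is a summary statistic computed from the signal
# (BPM, key estimates, MFCC means, ...) rather than the audio itself,
# which is what you'd want for a modern spectrogram- or waveform-based model.
print(features["rhythm"]["bpm"])
print(features["tonal"]["key_key"], features["tonal"]["key_scale"])
```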
Just another way the RIAA is making the world a worse place to live.
The hardest thing you can do when you're knee deep in a project is to pull the plug. I've been there a few times and it takes a lot of courage to do, especially when other people have put a ton of time and energy into it.
I'm sorry it didn't work out.
Is there a possible pivot to something else non-music related, or adapting your technology to another industry?
According to TFA, even if they had copies of every song they'd ever analyzed they'd still need to re-start from scratch. The reason they're not is:
"…we don’t have the resources and developer availability to perform this kind of research ourselves, and so we rely on the assistance of other researchers and volunteers to help us integrate new tools into AcousticBrainz, which is a relationship that we haven’t managed to build over the last few years."
If the RIAA was to blame, Pandora's Music Genome Project and other music analysis/recommendation systems wouldn't exist.
I think your main point is correct, but the Music Genome Project had expert humans listening to each and every song. It originally started out to build music-store recommendation kiosks (yes kids, music used to be sold in physical stores), and it took them 5 years to pivot to streaming audio. Given when they started, I expect that they just bought a lot of CDs to start. By the time they got into streaming, they of course had licenses. So Pandora didn't defy the RIAA; they spent money and built business relationships, two things unavailable here.
I know Spotify does a lot of analysis and clever things with the music on their platform, but there's the kicker, they actually have licenses and whatnot to the music, which I'm sure includes a clause for statistical / data-driven analysis.
I wonder if that license extends to deriving a ML model from it. I know there's some ML models out there already that can produce music based on a prompt, but that'll be limited to what music the authors have available to them. Spotify (and Apple Music, and the others) have millions upon millions of tracks available to them.
> I know Spotify does a lot of analysis and clever things with the music on their platform
Do they?
I have ‘melancholic’ and ‘I want to hype myself up’ playlists. From the tone of the music, the voices, and the lyrics (even the titles, I’d say), it is extremely obvious what the overarching vibe of each playlist is. Yet the radio mode of those playlists, and the recommendations below them, are fairly obviously just “other people who played the songs in this playlist also liked these songs”.
As far as I can tell there are no deep analysis smarts at work, or even something like EveryNoiseAtOnce.
I suspect the RIAA might even be in favor of supporting recommendation services: after all, matching people to music they like promotes music sales. RIAA members have a strong vested interest in selling more music.
It is worth noting that AcousticBrainz was based on an open-source audio analysis library Essentia [1] that has been gradually improving since the launch of the AcousticBrainz project. It now has better ML models, with higher accuracy and generalization, than those used in AcousticBrainz in the pre-deep learning age of handcrafted signal processing features and SVMs.
See [2] for the list of models currently available. It includes updated ML models for the classifiers used in AcousticBrainz and many new models (e.g., a 400-style music classifier, models for arousal-valence emotion recognition, and feature embeddings that can be used for audio similarity or training new classifiers).
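To give a flavour of what those Essentia algorithms look like from Python, here is a minimal sketch assuming the `essentia` pip package. This uses a stock rhythm algorithm, not the exact AcousticBrainz extractor, and the newer TensorFlow models mentioned above would additionally need essentia-tensorflow plus a downloaded model file.

```python
# Minimal sketch with Essentia's Python bindings (assumes `pip install essentia`).
import essentia.standard as es

# Load the track as mono float samples.
audio = es.MonoLoader(filename="track.mp3")()

# One of the handcrafted signal-processing algorithms of the kind the
# original AcousticBrainz feature extractor relied on.
bpm, beats, beats_conf, _estimates, beat_intervals = es.RhythmExtractor2013(
    method="multifeature"
)(audio)

print(f"estimated BPM: {bpm:.1f} (confidence {beats_conf:.2f})")
```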
This is unfortunate. A decade or so ago I had a beta for something that IIRC was called MusicMatch, which was just an executable that could create iTunes playlists based on the characteristics of a song in iTunes, and it felt like magic. Then Apple bought them, supposedly built it into iTunes but it seemed to simply disappear (DarkSky is experiencing this same phenomenon right now). Does anyone have a recommendation for things like this? Starting a radio station off a song in Spotify works well enough, but it feels like it just selects "People who listen to X also listened to Y", whereas MusicMatch worked off a fingerprint technique that was more serendipitous.
> Then Apple bought them, supposedly built it into iTunes but it seemed to simply disappear (DarkSky is experiencing this same phenomenon right now)
The Weather apps in the latest versions of iOS, iPadOS, and macOS are very obviously re-styled versions of DarkSky. They even rebranded and released the API.
It is unfortunate, and it looks like it was a very interesting project, but I have so many questions. How come they only looked at the data now that they'd “collected enough”, and not along the way? Quality-of-data and mis-labeling issues are something that would be apparent right away, no? What kind of “data quality” are we talking about? Is it raw waveform data? That has a ceiling of 44.1 kHz and pretty much everything is very close to it. The quality of the existing labeling?
> We finally got around to doing this recently, and realised that the data simply isn’t of high enough quality to be useful for much at all.
I suspect that they just didn't have the right people on the team, or those people were busy and haven't gotten around to it.
This is sad, but it's an important cautionary tale for businesses that plan to collect data first, and later hire a data person to turn that data into gold. Many, many businesses in the 2010s probably failed because they thought they could do this.
The truth is that you need data people involved from the beginning, and continuously throughout the project, to monitor and evaluate the data being collected, build proof-of-concept models along the way, and adjust the data collection process as problems are discovered and new techniques are developed.
Oh god yes. The "collect data" was such an "underpants gnomes" business model thing. Just thinking about it gives me flashbacks to industry events where insufferable people droned at me about this.
And yes, another vote here for early proofs of concept and rigorous testing. Not everybody gets what really matters in data quality, but the surest way to find out is to try to build something and see if it really works.
> Quality of data and mis-labeling issues is something that would be apparent right away, no?
You are correct, those issues were apparent right from the beginning, and they never really got better. The acoustic fingerprinting worked sorta OK for very popular albums -- but even for that, it was never 100% accurate. It never worked well for live performances, imports, classical music, jazz, or jam bands.
Using this software always required a lot of manual intervention, which at least for me negates the whole point of using it in the first place.
Understanding it and building useful models of it are really really hard.
Even building a functional content-driven also-bot is hard, although you can always solve that problem by cheating and aggregating playlist preferences.
Supposedly simple concepts like 'track BPM' just don't work reliably in the real world. (What's the tempo of a recording of a symphony, or even just a folk album that wasn't recorded to a click track? Or an EDM track with subtle tempo shifts - which quite a few tracks have?)
If they'd known more about music when they started they'd have understood this.
>Supposedly simple concepts like 'track BPM' just don't work reliably in the real world.
Sure it does, you just have to give up on the idea of there only being a single value. A song could have multiple different BPM values, so in your database you'd just record a set (or a tuple in Python parlance). For subtle tempo shifts, you'd run the music through some kind of filter algorithm which would isolate sections with significantly different tempos, then for each section would find the average, then would make a set with those values and store it. The only place this would fall down is some weird music with constantly-changing tempo, where you'd end up with an average over the whole song.
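As a concrete illustration of that segment-then-estimate idea, here is a minimal sketch using librosa for the per-window tempo estimate; the window length and the rounding are arbitrary choices of mine, not anything the project actually did.

```python
# Sketch: estimate a *set* of tempos for a track by splitting it into windows,
# estimating BPM per window, and collecting the distinct (rounded) values.
import librosa
import numpy as np

def bpm_set(path, window_s=30.0):
    """Return the set of rounded per-window tempo estimates for a track."""
    y, sr = librosa.load(path, mono=True)
    hop = int(window_s * sr)
    tempos = set()
    for start in range(0, len(y), hop):
        chunk = y[start:start + hop]
        if len(chunk) < sr * 5:            # skip fragments too short to estimate
            continue
        tempo, _beats = librosa.beat.beat_track(y=chunk, sr=sr)
        tempo = float(np.atleast_1d(tempo)[0])   # scalar in older librosa, array in newer
        tempos.add(round(tempo))           # coarse rounding merges near-identical estimates
    return tempos

# e.g. {120, 128} for a track with one tempo change; a single value for most pop tracks
print(bpm_set("track.mp3"))
```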
This is based on the old Echonest scanner. Echonest was bought out by Spotify. Echonest had the same issue as AcousticBrainz: their first scanner sucked, and midway through they had to ask people to rescan using a new version.
Echonest used to provide source code, but after being bought by Spotify, they stopped updating their code, disabled the fingerprint server, etc.:
https://github.com/spotify/echoprint-codegen
I wonder what kind of data they have. Why do they say the quality of the data is not good enough for DL?
Without concrete information, no one can provide meaningful feedback for them.
Looks like they stored computed outputs from the “Essentia” tool, and the values are not accurate, so training a model on top of that will yield equally inaccurate results.
To improve it you’re [pun not intended] essentially starting from scratch.
Why one would store secondary information instead of the original is beyond my comprehension. If I were going to study human voices, I'd store samples of the voices themselves, compressed maybe, but not just the output of some processing program.
[1] https://acousticbrainz.org/api/v1/6df2d3b5-25e1-4d4c-a196-04...