One Step Further Towards the Diploma Thesis

While once again searching for companies or institutes, i.e. a potential advisor, related to what I’m looking for (automated music similarity analysis), I got one big step forward: I found projects at the Fraunhofer IDMT (Institute for Digital Media Technology) that sound really interesting. What I’m interested in is doing some sort of waveform analysis to find different characteristics, different descriptive measures, that make two music pieces “sound similar” independent of genre, artist, or whatever, and those that make two other pieces different. Most interesting would be to derive them from how we humans would decide, which, of course, is not always deterministic, i.e. it is fuzzy. The long-term dream would be an automaton that finds the right music for a given emotional atmosphere, e.g. a candlelight dinner, a BBQ, a reception event…

  • SoundsLike — sounds like it comes close to what I’m interested in; derived from the AudioID fingerprinting mechanism.
  • GenreID — more of a category-based approach, similar to acoustic fingerprinting. Still interesting, though.
  • Query by Humming — finding music tracks by humming the characteristic melody. But what exactly is characteristic of a certain track?
  • Semantic HiFi — an interesting project; it combines multiple tools to have the stereo generate personalized playlists on demand via voice commands and to interact with other media devices. It reads very promisingly. The project itself ran from 2003 to 2006. And what’s really interesting is a cooperation with, among others, Pompeu Fabra University, Barcelona, Spain.
    I could also imagine automated adjustment of the volume level based on the sound level in the room, provided that this is actually wise and no feedback effect takes place, e.g. at a cocktail party: conversation gets louder -> music gets louder -> conversation gets louder…
  • Retrieval and Recommendation Tools — exactly the kind of approach I’m looking for.
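The volume-adjustment idea above is basically a small control loop. A minimal sketch of how the feedback spiral could be damped (the function, thresholds, and units are entirely my own invention, not from any of the projects listed): only react outside a dead band, move partway towards the target, and enforce a hard cap.

```python
def adjust_volume(music_volume, ambient_level, target_gap=10.0,
                  dead_band=3.0, max_volume=70.0):
    """Follow the room's ambient level, but damp the
    'conversation louder -> music louder -> ...' feedback loop:
    - ignore small deviations (dead band),
    - only move halfway towards the target (damping),
    - never exceed a hard volume cap."""
    desired = ambient_level + target_gap   # stay audible above the room
    error = desired - music_volume
    if abs(error) <= dead_band:            # small changes: do nothing
        return music_volume
    new_volume = music_volume + 0.5 * error
    return min(new_volume, max_volume)     # cap breaks the upward spiral
```

The cap is the crucial part: even if conversation keeps getting louder, the music stops participating in the arms race at `max_volume`.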

I also stumbled upon news that the MP3 format has been enhanced to encode surround sound information while enlarging the file size by only about 10% (see also mpegsurround.com). And secondly, an attempt called Freebies to use P2P to legally distribute music content and to encourage people to buy the better-quality version.

Basics of MusicIP’s MusicDNS and MusicAnalysis

Drawing on the MusicBrainz wiki, the system MusicIP created finds similar music, in short, in three steps:

  1. analyse the music audio signal (up to 10 minutes of a track) locally with MusicIP Mixer, generating an ID called a PUID (closed source!)
  2. the PUID is sent to MusicDNS, a web service by MusicIP (also closed source!), which does fuzzy matching
  3. some magic happens by which the Mixer calculates a playlist. It would not be sufficient for the DNS (Music Digital Naming Service, not to be confused with the Domain Name System) server to just return a list of PUIDs, since the server (hopefully!) doesn’t know about all the other tracks in my library, i.e. the ones that could potentially be used to generate playlists.
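The first two steps could be sketched as a toy mock-up like this. To be clear: `analyze_track`, the fake MusicDNS dictionary, and the use of a hash are all my own stand-ins, since the real analysis and the MusicDNS protocol are closed source; the only point illustrated is “local analysis produces a fixed-size 128-bit ID, a server maps that ID to metadata”.

```python
import hashlib

def analyze_track(samples):
    """Step 1 (stand-in): reduce the beginning of an audio signal to a
    128-bit ID. An MD5 hash merely illustrates the fixed-size result;
    the real MusicIP analysis is undisclosed."""
    data = bytes(int(s) % 256 for s in samples[:600])
    return hashlib.md5(data).hexdigest()   # 128 bits, like a PUID

# Step 2 (stand-in): a toy 'MusicDNS' mapping known PUIDs to metadata.
FAKE_MUSICDNS = {}

def lookup(puid):
    """Return stored tags for a PUID, or placeholders if unknown."""
    return FAKE_MUSICDNS.get(puid, {"artist": "unknown", "title": "unknown"})
```

Step 3 is deliberately missing here; as noted below, how the Mixer turns matches into a playlist is exactly the part I haven’t found documented.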

PUIDs

A PUID is a 128-bit Portable Unique IDentifier that represents the analysis result from MusicIP Mixer; it is therefore not a fingerprint of the music piece that identifies a song in one particular version. PUIDs are just the IDs used in the proprietary fingerprinting system operated by MusicIP. They provide a lightweight PUID generator called genpuid that performs steps 1 and 2. PUIDs can be used to map track information such as artist, title, etc. to a fingerprint. The ID itself carries no acoustic information.
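To illustrate the “no acoustic information” point: a PUID behaves like an opaque 128-bit dictionary key to which metadata is attached by lookup, nothing more. The UUID value and the tags below are of course made up.

```python
import uuid

# A PUID is formatted like a UUID: 128 bits, with no audio content inside.
puid = uuid.UUID("12345678-1234-5678-1234-567812345678")
assert puid.int.bit_length() <= 128   # fits in 128 bits

# Metadata is associated externally; it is not encoded in the ID itself.
tags = {puid: {"artist": "Some Artist", "title": "Some Title"}}
```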

Acoustic Fingerprinting

Referring, again, to the MusicBrainz wiki, acoustic fingerprinting is a different process that uses only 2 minutes of a track. This fingerprint is then sent to a MusicDNS server, which in turn matches it against stored fingerprints. If a close enough match is found, a PUID is returned that unambiguously identifies the matching fingerprint (also see a list of fingerprinting systems; there is also a scientific review of algorithms). This is necessary since the source code to generate PUIDs or submit new ones is closed.
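A hedged sketch of what “close enough match” could mean: treat the fingerprint as a feature vector and return the stored PUID whose fingerprint lies within some distance threshold. The vectors, the Euclidean distance, and the threshold are purely illustrative; the actual OFA matching is different and undisclosed.

```python
def match_fingerprint(query, stored, threshold=1.0):
    """Return the PUID of the closest stored fingerprint, or None if
    no candidate is close enough. Fingerprints are plain float vectors
    here, purely for illustration."""
    best_puid, best_dist = None, float("inf")
    for puid, fp in stored.items():
        # Euclidean distance between query and candidate fingerprint
        dist = sum((a - b) ** 2 for a, b in zip(query, fp)) ** 0.5
        if dist < best_dist:
            best_puid, best_dist = puid, dist
    return best_puid if best_dist <= threshold else None
```

The threshold is what makes the matching fuzzy rather than exact: two slightly different encodings of the same track can still map to the same PUID.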

On the other hand, Wikipedia defines an acoustic fingerprint as follows:

An acoustic fingerprint is a unique code generated from an audio waveform. Depending upon the particular algorithm, acoustic fingerprints can be used to automatically categorize or identify an audio sample.

This definition is even quoted in MusicIP’s Open Fingerprint™ Architecture Whitepaper (page 3).

MusicDNS

The web service’s main purpose is to match a PUID to a given acoustic fingerprint and to look up track metadata such as artist, title, album, year, etc. (aka tags), as done by the fingerprinting client library libofa, which was developed by Predixis Corporation (now MusicIP) from 2000 to 2005. Only the query code is public via the MusicDNS SDK; the music analysis and PUID submission routines are closed source!
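Since only the query side is public, a client’s job reduces to building a lookup request. A sketch of what such a request could look like, with an invented endpoint and invented parameter names (I haven’t verified the real MusicDNS wire format, so nothing here should be taken as the actual protocol):

```python
from urllib.parse import urlencode

def build_lookup_request(puid, client_id,
                         base_url="http://example-musicdns.invalid/lookup"):
    """Build a hypothetical metadata-lookup URL. The endpoint and the
    parameter names ('pid', 'cid', 'fmt') are invented placeholders,
    not the real MusicDNS API."""
    params = {"pid": puid, "cid": client_id, "fmt": "xml"}
    return base_url + "?" + urlencode(params)
```

The `cid` placeholder reflects that such services typically require a registered client identifier alongside the ID being looked up.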

Getting the Playlist

So far I haven’t been able to figure out, or find sources describing, how this is actually done by the MusicIP Mixer. I’ll keep you posted as I find out.

Other sources / Directions