Lately I’ve been thinking about the possibilities of Songbird (see screencast below), which collects music distributed under a common licence or the like from blogs. Wouldn’t it be great to search for music precisely that way? You could say something like “search for music that sounds like what I’m listening to right now”. That means the method I might be working on has to be efficient enough to calculate the relevant vectors for different pieces of music quickly. Otherwise there would have to be some central service fingerprinting all the tracks around, yet again. But, as said before, that’s not what I’m after. I’d rather have something that gives a mutual, pairwise measure between two tracks.
As you can see in the screencast, Songbird can already find related music by using various search services. But, as far as I can judge right now, that is user- or expert-based. That means the music has to be reasonably well known before there is any data on what a newly found song is like. It is also text-based.
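To make the “pairwise measure” idea a bit more concrete, here is a minimal sketch, assuming each track has already been reduced to a feature vector of equal length. How those vectors are computed is exactly the open question; the function name and the use of cosine similarity are my own assumptions for illustration, not a method taken from any of the tools mentioned here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors.

    1.0 means the vectors point in the same direction, 0.0 means they
    are orthogonal. The vectors themselves are hypothetical track features.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # An all-zero vector carries no information; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)
```

The nice property for my purposes is that the comparison only needs the two vectors at hand, so no central service has to know about every track.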
Even though I’m into last.fm, if any of those, I do understand that if some Pandora maniac cheats on it, the alternative has to be worth a glimpse. So, it was Musicovery.com that also impressed me … at first. And I’ll admit it was mostly because of the blinky-blinky. But there is more to it than attention-grabbing effects. From an HMI design point of view, Musicovery has really made an effort. It is easy to start listening to what you want, without any or, if you really want to, very little reading. In one word: I’d call it intuitive.
You are presented with those, and only those, selections you need to make and combine (where you can) to make your choice distinct enough to gather fitting songs. The other direction of “communication”, machine to human, also has some promising approaches, like the “neighbourhood map” and colours for genres. One can even drag that map around. The playlist is shown as a path through the graph of audio tracks.
But then, of course, the hacker in me came to the surface and I had to test that stuff. After a few clicks I was presented with Shakira’s “Objection” after hitting the “dark” mood. Sure, there is no accounting for taste, but I wouldn’t call “Objection” a dark mood song. And there was also Black Eyed Peas’ “Shut up” to come… I don’t know about you; I couldn’t keep my feet still while listening, and there were absolutely no “I hate the world” or “Where is my gun to get a rampage going” feelings (just being sarcastic here). While the “energetic” direction worked fine for a while, “dark” more and more seems to be a bad label.
To conclude, Musicovery.com nevertheless sounds very promising. I’d really like to know the “music selection techniques” behind it, though, since the more I listen, the less the tracks picked for a selected mood satisfy me, just like with the other lot.
Edit: I just caught myself letting my imagination drift away: Wouldn’t it be possible to have, in a few years’ time, some HMI stuff so one brachiates through a playlist just like the one displayed at Musicovery, but as some sort of hologram, or purely imaginary (not directly visible) and more like that Wii stuff? So if you want to fast-forward to a track on the playlist (displayed as a ball in some sort of 3D neighbourhood map/grid, e.g.), you grab it and drag it to the middle of the cube or punch it to play it, pet it to have information about it displayed, …
While searching yet again for companies or institutes, i.e. an advisor, possibly related to what I’m looking for (automated music similarity analysis), I got one big step forward by finding projects at the Fraunhofer IDMT (Institute for Digital Media Technology) that sound really interesting. What I’m interested in is doing some sort of waveform analysis to find different characteristics, different descriptive measures, that make two pieces of music “sound similar” independent of genre, artist or whatever, and those that make two other pieces different. Most interesting would be to derive them from how we humans would decide it, which, of course, is not always deterministic, i.e. it is fuzzy. The long-term dream would be to have an automaton find the right music for a given emotional atmosphere, e.g. candlelight dinner, BBQ, reception event…
SoundsLike: sounds like it comes close to what I’m interested in; derived from the AudioID fingerprinting mechanism.
GenreID: more the category-based approach, similar to acoustic fingerprinting. Still interesting, though.
Query by Humming: finding music tracks by humming the characteristic melody. But what exactly is characteristic of a certain track?
Semantic HiFi: interesting project; combines multiple tools to have the stereo generate personalized playlists on demand via voice commands, and interacts with other media devices. Reads very promising. The project itself ran from 2003 to 2006. And what’s really interesting is a cooperation with, among others, Pompeu Fabra University, Barcelona, Spain.
I could also imagine automated adjustment of the volume level according to the sound level in the room, provided it is actually wise and no feedback effect takes place, e.g. at a cocktail party: conversation louder -> music louder -> conversation louder…
I also stumbled upon news that the mp3 format has been enhanced to encode surround sound information while only enlarging the file size by about 10% (see also mpegsurround.com). And secondly, an attempt called Freebies to use P2P to legally ship music content and to utilize it to encourage people to buy the better quality version.
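The cocktail party loop can be played through with a toy model. This is only a sketch under assumptions I made up: conversation level is a base level plus some fraction of the music level, and the music level is set to a fixed fraction of the conversation level. All parameter names and the linear model are hypothetical.

```python
def simulate_levels(base_talk, music_gain, talk_response, steps=50):
    """Iterate the loop 'talk louder -> music louder -> talk louder'.

    base_talk:     conversation level without any music (arbitrary units)
    music_gain:    how strongly the stereo follows the room level
    talk_response: how strongly people raise their voice over the music
    Returns the music level after each step.
    """
    music = 0.0
    history = []
    for _ in range(steps):
        talk = base_talk + talk_response * music  # people talk over the music
        music = music_gain * talk                 # stereo follows the room
        history.append(music)
    return history
```

In this toy model everything depends on the product music_gain * talk_response: below 1 the levels settle at music_gain * base_talk / (1 - music_gain * talk_response), at 1 or above they keep climbing, which is the feedback effect I was worried about.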
According to MusicBrainz, the system MusicIP created for finding similar music works, in short, in three steps:
1. The music audio signal (up to 10 min of a track) is analysed locally by MusicIP Mixer, generating an id called PUID (closed source!).
2. The PUID is sent to MusicDNS, a web service by MusicIP (closed source, too!), which does fuzzy matching.
3. Some magic happens by which the Mixer calculates a playlist. It would not be sufficient for the MusicDNS (Music Digital Naming Service, not to be mistaken for the Domain Name System) server to just return a list of PUIDs, since the server (hopefully!) doesn’t know about all the other tracks I have in my library, i.e. those that could potentially be used to generate playlists.
PUIDs
PUID is a 128-bit Portable Unique IDentifier that represents the analysis result from MusicIP Mixer and therefore is not a fingerprint identifying a song in some particular version. PUIDs are just the ids used in the proprietary fingerprinting system operated by MusicIP. They provide a lightweight PUID generator called genpuid that performs steps 1 and 2 above. PUIDs can be used to map track information such as artist, title, etc. to a fingerprint. The id itself carries no acoustic information.
Acoustic Fingerprinting
Referring again to MusicBrainz’s wiki, acoustic fingerprinting here is a different process that uses only 2 minutes of a track. This fingerprint is then sent to a MusicDNS server, which in turn matches it against stored fingerprints. If a close enough match is found, a PUID is returned which unambiguously identifies the matching fingerprint (also see a list of fingerprinting systems; there is also a scientific review of algorithms). This is necessary since the code to generate PUIDs or submit new ones is closed source.
An acoustic fingerprint is a unique code generated from an audio waveform. Depending upon the particular algorithm, acoustic fingerprints can be used to automatically categorize or identify an audio sample.
The web service mainly serves to match a PUID to a given acoustic fingerprint and to look up track metadata such as artist, title, album, year, etc. (aka tags), as done by the fingerprinting client library libofa, which was developed by Predixis Corporation (now MusicIP) between 2000 and 2005. Only the query code is public via the MusicDNS SDK; the music analysis and PUID submission routines are closed source!
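Since the real matching is closed source, I can only guess what “close enough” means. The following is a minimal sketch of the general idea of fuzzy matching, assuming fingerprints are plain bit patterns compared by Hamming distance. The functions, the threshold and the ids are all made up for illustration and are not the MusicDNS algorithm or API.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bits in which two fingerprints (as integers) differ."""
    return bin(a ^ b).count("1")


def lookup_id(query: int, known: dict, max_distance: int = 4):
    """Return the id of the closest stored fingerprint, or None.

    known maps a hypothetical id string to a stored fingerprint.
    A match only counts if it differs in at most max_distance bits.
    """
    best_id, best_distance = None, max_distance + 1
    for track_id, fingerprint in known.items():
        distance = hamming_distance(query, fingerprint)
        if distance < best_distance:
            best_id, best_distance = track_id, distance
    return best_id
```

The point is that the returned id is only a key: a slightly different rip of the same song can still land on the same id, while the id itself says nothing about how the track sounds.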
Getting the Playlist
Up to now I couldn’t figure out, or find sources on, how this is actually done by MusicIP Mixer. I’ll keep you posted as I find out.
Other sources / Directions
There has been an attempt by Microsoft with the so-called MongoMusic to do patented “sounds like” searches for music files. I have never seen it, though.
If you are interested in archiving your audio CD collection in a really safe manner, so you can get rid of the discs after the encoding (or, I should say, transcoding) process, you might want to have a look at Neil Popham’s guide. Neil Popham, by the way, wrote some features for the APEv2 tagging tool wapet. His guide utilizes Windows batch scripts, adds tags and employs PAR2 for parity information. The latter is needed to overcome rare but possible bit errors, which would compromise an ape file. The ripping is done with EAC, the by now well-known accurate audio CD ripping tool by Andre Wiethoff.
Using one single image file per CD has the benefit over multiple files that it includes, among other things, audio information from before the first track, lead-in silence and TOC information. On the other hand, for audio library handling and daily playback, where tagging plays a more important role, one file per track is much more comfy. That way all tags can be written directly to the audio file. This, however, is severely limited with cue sheets. For example, with foobar2000 it’s transparent whether your playlist is made up from an ape image with a cue sheet or from multiple audio files with tags. It’s all the same when randomizing the list, seeking forwards and backwards and the like. But once it comes to playback statistics or ReplayGain, these will not be written to any file besides fb2k’s library database.
For my daily usage I copied the file MAC.exe (the actual encoder) from the Monkey’s Audio directory to foobar2000’s one and set up an encoding preset in its converter preferences using one of the following parameters taken from hydrogenaudio:
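To illustrate why parity information helps at all, here is a deliberately simplified sketch using a single XOR parity block. Real PAR2 uses Reed-Solomon codes and can repair several damaged blocks; this toy version can only rebuild one lost block, and the function names are my own.

```python
def xor_blocks(blocks):
    """XOR equally sized byte blocks together into one block."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(result):
            raise ValueError("all blocks must have the same size")
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

The parity block is created once with xor_blocks over all data blocks and stored next to them; if exactly one data block goes bad later, XOR-ing the remaining blocks with the parity gives it back.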
-c1000: Fast / large file
-c2000: Normal
-c3000: High / medium file
-c4000: Extra High
-c5000: Insane / small file
Each parameter is prefixed by the source and destination file variables %s %d. I mainly chose ape over flac because of ape’s flexible tagging feature, and I passed on OptimFROG, the far better codec in terms of compression, because it needs heaps of CPU power for decoding (while listening). When CPU power is no longer a constraint I will most likely switch to ofr (if at that time it still has the best compression around). Don’t forget to enable the secure mode for your CD drive in fb2k under file -> open audio cd… -> select your drive and then drive settings. You can also access the rip wizard from there. Or go via “add to playlist” and convert the tracks manually.
Update: I have found out about WavPack, which IMHO is a much better choice than MAC (APE) and FLAC because of better support, tagging and hybrid mode.
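If you would rather script the encoder than go through fb2k, the compression switches listed above can be put together like this. It is a minimal sketch that only builds the argument list in the order source, destination, switch, mirroring the %s %d prefix; the preset names and the helper function are my own, and nothing is executed here.

```python
# Compression switches as listed above; the dictionary keys are my own labels.
PRESETS = {
    "fast": "-c1000",
    "normal": "-c2000",
    "high": "-c3000",
    "extra high": "-c4000",
    "insane": "-c5000",
}


def build_mac_args(source, destination, level="normal", encoder="MAC.exe"):
    """Return the argument list: encoder, source, destination, switch."""
    try:
        switch = PRESETS[level]
    except KeyError:
        raise ValueError(f"unknown compression level: {level!r}") from None
    return [encoder, source, destination, switch]
```

The resulting list could be handed to subprocess.run on a machine where MAC.exe is on the path; I have left that step out on purpose.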