Why the heck didn’t I trip over WavPack (wv) earlier? It’s been around for some time now and it’s astonishingly close to ultimate. To keep it short and obvious (see hydrogenaudio.org for the complete list):
Pros
Open source
Good efficiency (fast-forwarding is even better than with mp3 in foobar2000)
Hybrid/lossy mode (see below)
Tagging support (ID3v1, APEv2 tags)
Replay Gain compatible (which is no big deal with fb2k anyway, but still)
Cons
Limited hardware player support
Takes long to encode with optimal settings (not really a con, since encoding is a one-time procedure)
Other features
Supports embedded CUE sheets
Includes MD5 hashes for quick integrity checking
Can encode in both symmetrical and asymmetrical modes
Supports multichannel audio and high resolutions
Fits the Matroska container
Streaming support
Error robustness
So it’s open source! Most distros even come with WavPack preinstalled. I know of only two other lossless formats that are open source: FLAC, which has bad tagging support, and Shorten (to put it short: outdated). MAC (Monkey’s Audio Codec, ape) has an open source version, but it’s not developed any longer.
The next great feature is hybrid mode, which means you encode to a small lossy file plus an additional file containing “the rest” of the information. Only one other format is capable of this: OptimFROG (ofr) in DualStream mode. While encoding, a second file, called the correction file (.wvc), stores the compressed difference between the lossy version and the original. The lossy file can be used entirely on its own, and putting both files together gives you 100% of the original back. So you don’t have to convert your files each time you want to shove them onto your portable. The only bad thing is that you need to ensure the device can decode (read: play) them.
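The idea of “lossy file plus correction file” can be sketched in a few lines of Python. This is only a toy model under my own assumptions (coarse quantization as the lossy step, a hypothetical step size of 16); it is not how WavPack actually encodes audio.

```python
# Toy model of a hybrid codec. NOT WavPack's real algorithm: the lossy part
# here is just coarse quantization, and the step size is a made-up value.

def encode_hybrid(samples, step=16):
    """Split integer samples into a coarse lossy part and a correction part."""
    lossy = [(s // step) * step for s in samples]          # usable on its own
    correction = [s - l for s, l in zip(samples, lossy)]   # "the rest"
    return lossy, correction

def decode_lossless(lossy, correction):
    """Putting both parts together gives the original back."""
    return [l + c for l, c in zip(lossy, correction)]

original = [0, 1037, -2048, 512, 32767, -32768, 7]
lossy, correction = encode_hybrid(original)
restored = decode_lossless(lossy, correction)
print(restored == original)  # True
```

The correction values stay small (here always between 0 and 15), which is why a real correction file can be compressed well.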
To give an example: if you convert a 27.5 MB ofr file to wv with hybrid mode enabled and the lossy bitrate set to 192 kbps, you’ll get one 6.31 MB .wv file and a second 22.8 MB .wvc file. It took 12 min 15 s (playtime 4:25). The insane settings were: Compression Mode “high”, Processing Mode “6” (best encoding quality), Hybrid Lossy Mode “192kbps”. When you open only the .wv file in fb2k, it’ll handle the correction file automatically (see picture). However, moving the file with foobar2000’s dialog “” will only move .wv files. I cannot speak for other software players, but this way it’s just easy to handle two qualities of one file: one hi-fi and one “to go”! Now that I have converted an entire album, here are the approximate file sizes comparing MAC, OptimFROG and WavPack, with MP3/Ogg Vorbis as the lossy counterpart to the “portable wv” (to be added to .ape/.ofr):
.ape: 311 MB
.ofr: 306 MB
.wv: 70 MB
.wvc: 255 MB
.wv + .wvc: 325 MB
.mp3 (V2, ~190kbps): 74 MB
.ogg (q5, ~160kbps): 61 MB
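To make the comparison explicit, here is the arithmetic on the sizes above as a small Python sketch. The numbers are the approximate ones from this post; the helper name percent_more is just something I made up for illustration.

```python
# Approximate album sizes in MB, taken from the list above.
sizes_mb = {".ape": 311, ".ofr": 306, ".wv": 70, ".wvc": 255, ".mp3": 74, ".ogg": 61}

def percent_more(a, b):
    """How many percent larger a is than b (hypothetical helper)."""
    return 100.0 * (a - b) / b

hybrid_total = sizes_mb[".wv"] + sizes_mb[".wvc"]
print(hybrid_total)                                         # 325
print(round(percent_more(hybrid_total, sizes_mb[".ofr"]), 1))  # 6.2

# Lossless archive plus a separate lossy copy for the portable:
ofr_plus_mp3 = sizes_mb[".ofr"] + sizes_mb[".mp3"]
print(ofr_plus_mp3)                 # 380
print(ofr_plus_mp3 - hybrid_total)  # 55
```

So the hybrid pair is about 6% bigger than the .ofr files alone, but about 55 MB smaller than keeping .ofr plus a separate set of MP3s.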
Tagging: unlike FLAC, it uses APEv2 (or ID3v1), so the tags work with most players, both software and portable devices, without intervention.
While I ran encoding tests with foobar2000 (which has WavPack decoding built in, by the way), I noticed that when converting from, say, OptimFROG to WavPack, fb2k went right at it. No temporary wav files as with OptimFROG to MAC, for example! But mind you, it does take a long time if you optimize for file size and quality. It seems to be somewhere around 0.7x (slightly slower than plain play time). I don’t see why this really is an issue, because in most cases you’ll only encode once, as is true for all lossless formats anyway.
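For reference, the speed factor is simply play time divided by encoding time. A minimal sketch using the single-track numbers from the hybrid example further up (4:25 play time, 12 min 15 s encoding time); note that this particular track, encoded with the heaviest settings, comes out below the rough 0.7x figure.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to plain seconds."""
    return minutes * 60 + seconds

play_time = to_seconds(4, 25)      # 265 s
encode_time = to_seconds(12, 15)   # 735 s

# Values below 1.0 mean encoding is slower than real-time playback.
speed = play_time / encode_time
print(f"{speed:.2f}x")  # 0.36x
```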
Lately I’ve been thinking about the possibilities of Songbird (see screencast below) for collecting music distributed under a common licence or the like from blogs. Wouldn’t it be great to precisely search for music that way? You could say something like “search for music that sounds like what I’m listening to right now”. That means the method I might be working on should be efficient enough to calculate relevant vectors for different pieces of music fast. Otherwise there would have to be some central service doing some fingerprinting again for all tracks around. But, as said before, that’s not what I’m after. What I’m after is rather a mutual measure between two tracks, pairwise.
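To make “a mutual measure between two tracks” a bit more concrete, here is a minimal sketch. Cosine similarity is my own pick for illustration, not a method I have settled on, and the feature vectors (say tempo, brightness, loudness) are entirely made-up placeholder values.

```python
import math

def cosine_similarity(a, b):
    """Pairwise similarity of two equally long feature vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no information in an all-zero vector
    return dot / (norm_a * norm_b)

# Hypothetical feature vectors for three tracks (values invented).
track_a = [0.8, 0.3, 0.5]
track_b = [0.7, 0.35, 0.55]
track_c = [0.1, 0.9, 0.2]

# Expect track_b to score closer to track_a than track_c does.
print(cosine_similarity(track_a, track_b) > cosine_similarity(track_a, track_c))
```

The nice part of a pairwise measure like this is that it needs only the two vectors at hand, no central service.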
As you can see in the screencast, Songbird can already find related music by using diverse search services. But, as far as I can judge right now, that’s user- or expert-based. That means the music has to be known well enough for data to exist on what a newly found song is like. And it’s also text-based.
While doing yet another search for companies or institutes, i.e. a supervisor, possibly related to what I’m looking for (automated music similarity analysis), I took one big step forward by finding projects at the Fraunhofer IDMT (Institute for Digital Media Technology) that sound really interesting. What I’m interested in is doing some sort of waveform analysis to find the characteristics, the descriptive measures, that make two pieces of music “sound similar” independent of genre, artist or whatever, and those that make two other pieces different. Most interesting would be to derive them from how we humans would decide, which, of course, is not always deterministic, i.e. it’s fuzzy. The long-term dream would be to have an automaton find the right music for a given emotional atmosphere, e.g. candlelight dinner, BBQ, reception event…
SoundsLike: sounds like it comes close to what I’m interested in; derived from the AudioID fingerprinting mechanism.
GenreID: more of a category-based approach, similar to acoustic fingerprinting. Still interesting, though.
Query by Humming: finding music tracks by humming the characteristic melody. But what exactly is characteristic of a certain track?
Semantic HiFi: interesting project; combines multiple tools to have the stereo generate personalized playlists on demand by voice command and interact with other media devices. Reads very promising. The project itself ran from 2003 to 2006. What’s really interesting is a cooperation with, among others, Pompeu Fabra University in Barcelona, Spain.
I could also imagine automatically adjusting the volume level to the sound level in the room, if that’s actually wise and no feedback effect takes place, e.g. at a cocktail party: conversation louder -> music louder -> conversation louder…
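Whether that feedback effect runs away can be played through with a toy model. Everything here is an assumption on my part: levels are in arbitrary linear units, the music follows the talk level by a factor music_gain, and people raise their voices over the music by a factor talk_gain.

```python
def simulate_levels(music_gain, talk_gain, base_talk=1.0, steps=50):
    """Iterate the loop: conversation -> music volume -> conversation.

    Toy model with made-up gains; returns a list of (music, talk) pairs.
    """
    talk = base_talk
    history = []
    for _ in range(steps):
        music = music_gain * talk             # volume follows the room level
        talk = base_talk + talk_gain * music  # people talk over the music
        history.append((music, talk))
    return history

calm = simulate_levels(music_gain=0.5, talk_gain=0.5)     # loop gain 0.25
runaway = simulate_levels(music_gain=1.5, talk_gain=1.0)  # loop gain 1.5

print(round(calm[-1][1], 3))   # 1.333
print(runaway[-1][1] > 1e6)    # True
```

In this model the levels settle as long as the product of both gains stays below 1; above that, the cocktail party escalates.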
I also stumbled upon news that the mp3 format has been enhanced to encode surround sound information while enlarging the file size by only about 10% (see also mpegsurround.com). And secondly, an attempt called Freebies to use P2P to legally ship music content and to encourage people to buy the better-quality version.
According to MusicBrainz, the system MusicIP created for finding similar music works, in short, in three steps:
1. The MusicIP Mixer analyses the music audio signal (up to 10 min of a track) locally, generating an id called PUID (closed source!)
2. The PUID is sent to MusicDNS, a web service by MusicIP (closed source, too!), which does fuzzy matching
3. Some magic happens by which the Mixer calculates a playlist. It would not be sufficient for the DNS (Music Digital Naming Service, not to be confused with Domain Name System) server to just return a list of PUIDs, since the server (hopefully!) doesn’t know about all the other tracks I have in my library, i.e. those that could potentially be used to generate playlists.
PUIDs
A PUID is a 128-bit Portable Unique IDentifier that represents the analysis result from MusicIP Mixer and is therefore not a fingerprint identifying a song in some particular version. PUIDs are just the ids used in the proprietary fingerprinting system operated by MusicIP. MusicIP provides a lightweight PUID generator called genpuid that does steps 1 and 2. PUIDs can be used to map track information such as artist, title, etc. to a fingerprint. The id itself carries no acoustic information.
Acoustic Fingerprinting
Referring, again, to MusicBrainz’s wiki, acoustic fingerprinting is a different process that uses only 2 minutes of a track. The fingerprint is then sent to a MusicDNS server, which in turn matches it against stored fingerprints. If a close enough match is made, a PUID is returned which unambiguously identifies the matching fingerprint (also see a list of fingerprinting systems; there is also a scientific review of algorithms). This is necessary since the code to generate PUIDs or submit new ones is closed source.
An acoustic fingerprint is a unique code generated from an audio waveform. Depending upon the particular algorithm, acoustic fingerprints can be used to automatically categorize or identify an audio sample.
The web service mainly serves to match a PUID to a given acoustic fingerprint and look up track metadata such as artist, title, album, year, etc. (aka tags), as done by the fingerprinting client library libofa, which was developed by Predixis Corporation (now MusicIP) during 2000-2005. Only the query code is public via the MusicDNS SDK; the music analysis and PUID submission routines are closed source!
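Since the real thing is closed source, here is only a toy sketch of the flow as I understand it: fingerprint in, fuzzy match, opaque 128-bit id out, metadata lookup by that id. Everything in it is my own assumption: the class ToyFingerprintServer, fingerprints as plain bit patterns, Hamming distance as the “close enough” test, and random UUIDs standing in for PUIDs. It is not the MusicDNS or libofa API.

```python
import uuid

def hamming_distance(a, b):
    """Number of differing bits between two integer fingerprints."""
    return bin(a ^ b).count("1")

class ToyFingerprintServer:
    """Hypothetical stand-in for a fingerprint matching service."""

    def __init__(self, max_distance=4):
        self.max_distance = max_distance
        self._id_by_fingerprint = {}
        self._metadata_by_id = {}

    def register(self, fingerprint, metadata):
        track_id = uuid.uuid4()  # 128-bit id with no acoustic information
        self._id_by_fingerprint[fingerprint] = track_id
        self._metadata_by_id[track_id] = metadata
        return track_id

    def match(self, fingerprint):
        """Return the id of the closest stored fingerprint, or None."""
        best = min(
            self._id_by_fingerprint.items(),
            key=lambda item: hamming_distance(item[0], fingerprint),
            default=None,
        )
        if best is None or hamming_distance(best[0], fingerprint) > self.max_distance:
            return None
        return best[1]

    def lookup(self, track_id):
        return self._metadata_by_id.get(track_id)

server = ToyFingerprintServer()
known = 0b10110110110000111010010101111000
track_id = server.register(known, {"artist": "Example Artist", "title": "Example Title"})

noisy = known ^ 0b101       # same track, two bits differ
unrelated = known ^ 0xFFFF  # sixteen bits differ

print(server.match(noisy) == track_id)  # True
print(server.match(unrelated))          # None
```

The point of the sketch: the id is just a key into the metadata, and all the acoustic knowledge sits in the fingerprint matching on the server side.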
Getting the Playlist
Up to now I couldn’t figure out, or find sources on, how this is actually done by MusicIP Mixer. I’ll keep you posted as I find out.
Other sources / Directions
There has been an attempt by Microsoft, with the so-called MongoMusic, to do patented “sounds like” searches for music files. I’ve never seen it, though.