Have you ever wondered about the magic of the Wii remote? It can do more than blow out candles.
While doing yet another search for companies or institutes possibly related to what I’m looking for (automated music similarity analysis), I got one big step forward: finding projects at the Fraunhofer IDMT (Institute for Digital Media Technology) that sound really interesting. What I’m interested in is doing some sort of waveform analysis to find different characteristics, different descriptive measures, that make two music pieces “sound similar” independent of genre, artist or whatever, and those that make two other pieces different. Most interesting would be to derive them from how we humans would decide it, which, of course, is not always deterministic, i.e. fuzzy. The long-term dream would be to have an automaton find the right music for a given emotional atmosphere, e.g. a candlelit dinner, a BBQ, a reception event…
- SoundsLike — sounds like it comes close to what I’m interested in; derived from the AudioID fingerprinting mechanism.
- GenreID — more of a category-based approach, similar to acoustic fingerprinting. Still interesting, though.
- Query by Humming — finding music tracks by humming the characteristic melody. But what exactly is characteristic of a certain track?
- Semantic HiFi — an interesting project; it combines multiple tools to have the stereo generate personalized playlists on demand via voice commands and interacts with other media devices. It reads as very promising. The project itself ran from 2003 to 2006. And what’s really interesting is a cooperation with, among others, Pompeu Fabra University, Barcelona, Spain.
I could also imagine automated adjustment of the volume level based on the sound level in the room, if that is actually wise and no feedback loop develops, e.g. at a cocktail party: conversation gets louder -> music gets louder -> conversation gets louder…
- Retrieval and Recommendation Tools — just the approach I’m looking for.
I also stumbled upon news that the mp3 format has been enhanced to encode surround sound information while enlarging the file size by only about 10% (see also mpegsurround.com). Secondly, there is an attempt called Freebies to use P2P to legally distribute music and to encourage people to buy the better-quality version.
- analyse the music audio signal (up to 10 min of a track) locally with MusicIP Mixer, generating an ID called a PUID (closed source!)
- the PUID is sent to MusicDNS, a web service by MusicIP (also closed source!), which does fuzzy matching
- Some magic happens by which the Mixer calculates a playlist. It would not be sufficient for the DNS (Music Digital Naming Service — don’t mistake it for the Domain Name System) server to just return a list of PUIDs, since the server (hopefully!) doesn’t know about all the other tracks I have in my library, i.e. those that could potentially be used to generate playlists.
A PUID is a 128-bit Portable Unique IDentifier that represents the analysis result from MusicIP Mixer; it is therefore not an acoustic fingerprint identifying a song in some particular version. PUIDs are just the IDs used in the proprietary fingerprinting system operated by MusicIP. They provide a lightweight PUID generator called genpuid that performs the first two steps above. PUIDs can be used to map track information such as artist, title, etc. to a fingerprint. The ID itself carries no acoustic information.
Referring, again, to MusicBrainz’s wiki on acoustic fingerprinting, here is a different process using only 2 minutes of a track. This fingerprint is then sent to a MusicDNS server, which in turn matches it against stored fingerprints. If a close enough match is found, a PUID is returned that unambiguously identifies the matching fingerprint. (Also see a list of fingerprinting systems; there is also a scientific review of algorithms.) This is necessary since the source to generate PUIDs or submit new ones is closed.
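That server-side matching step can be pictured roughly like this. To be clear: the real MusicDNS matching is closed source, so this is a purely hypothetical sketch; the bit-string fingerprints, PUID values and the Hamming-distance criterion below are all my own invention for illustration.

```python
def lookup_puid(fingerprint, database, max_distance=1):
    """Return the PUID of the closest stored fingerprint, or None.

    `fingerprint` and the keys of `database` are equal-length bit
    strings; `database` maps fingerprint -> PUID. This only mimics
    the behaviour described above, not the actual algorithm.
    """
    def hamming(a, b):
        # Number of bit positions in which the two codes differ.
        return sum(x != y for x, y in zip(a, b))

    best_puid, best_dist = None, max_distance + 1
    for stored_fp, puid in database.items():
        d = hamming(fingerprint, stored_fp)
        if d < best_dist:
            best_puid, best_dist = puid, d
    return best_puid  # None means "no match close enough"
```

An exact or near match (within `max_distance` bits) yields a PUID; anything further away is treated as an unknown track.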
On the other hand, Wikipedia defines acoustic fingerprinting as follows:
An acoustic fingerprint is a unique code generated from an audio waveform. Depending upon the particular algorithm, acoustic fingerprints can be used to automatically categorize or identify an audio sample.
This definition is even quoted by MusicIP’s Open Fingerprint™ Architecture Whitepaper (page 3).
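To make that definition a bit more tangible, here is a deliberately naive sketch of reducing a waveform to a short comparable code. It has nothing to do with the actual AudioID or Open Fingerprint algorithms (which are closed or far more robust); the band-energy scheme is just my own toy illustration.

```python
import numpy as np

def toy_fingerprint(signal, n_bands=32):
    # Split the magnitude spectrum into bands and record, for each
    # pair of neighbouring bands, whether the energy rises or falls.
    # The result is a short bit string: a (very crude) acoustic code.
    spectrum = np.abs(np.fft.rfft(signal))
    bands = np.array_split(spectrum, n_bands)
    energies = np.array([band.sum() for band in bands])
    bits = (np.diff(energies) > 0).astype(int)  # n_bands - 1 bits
    return "".join(map(str, bits))
```

The same waveform always yields the same code, while spectrally different material yields different codes, which is the property a fingerprint needs for lookup.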
The web service mainly matches a PUID to a given acoustic fingerprint and looks up track metadata such as artist, title, album, year, etc. (aka tags), as done by the fingerprinting client library libofa, which was developed by Predixis Corporation (now MusicIP) from 2000 to 2005. Only the query code is public via the MusicDNS SDK; the music analysis and PUID submission routines are closed source!
Getting the Playlist
Up to now I couldn’t figure out or find sources on how this is actually done by MusicIP Mixer. I’ll keep you posted as I find out.
Other sources / Directions
Have you ever wondered why you cannot find a webmin package via apt-cache search webmin any longer, like I did? Or is it just the question why webmin has been gone since Debian Etch and is only available in oldstable, aka Sarge? I have never really used it myself beyond the obligatory try-out, but I was going to test it a little more these days. So I was surprised not to find anything. After a little googling, Debian’s own wiki gave a hint to bug #343897. Well, the answer is as simple as this: the package’s former maintainer, Jaldhar H. Vyas, seems to have always ended up doing all the work on his own and finally couldn’t motivate himself any longer. Sad, but I guess that’s how life goes! If it’s worth it to someone, package support will be reintroduced to Debian. Right in terms of “survival of the fittest”.
Well, what I’m trying to do I thought would be very simple: add a new tag, holding the current system date, to each file that’s added to fb2k’s media library — hence an “added to library” tag. The tagz script to achieve this is not even the problem.
Or, what seems to be semantically equivalent but more readable (refer to the tagz reference to understand the commands):
This even checks whether the file already has the tag. Using the tagz parser in the preferences dialog (Ctrl+P -> Display -> Title Formatting) confirms it works correctly, both when playing a song without the tag and one with the time stamp set.
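The original script listings have gone missing from this copy of the post, but in foobar2000’s title-formatting (tagz) syntax the two variants would have looked something like the following. Note this is my reconstruction, not the post’s original code, and the field name and %cwb_systemdate% (from foo_cwb_hooks) are assumptions:

```
// Variant 1: always produces a value; an empty one when the tag already exists
$if(%added_to_foobar%,,%cwb_systemdate%)

// Variant 2: no else branch, so it produces nothing when the tag already exists
$if($not(%added_to_foobar%),%cwb_systemdate%)
```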
The set-up is this:
- with the masstagger (right-click on a song -> Tagging -> Manage Scripts, if you haven’t changed the default context menu structure) add “Format value from other fields…”, select the destination field name (ADDED_TO_FOOBAR) and use the stated script as the formatting pattern. Hit Return, name your masstagger script and (important:) click the save button. Note: using “Set value…” or the like will not work since, despite intuitive guesses, it does not evaluate tagz scripts but outputs them as plain strings.
- in fb2k’s preferences dialog select Tools -> New File Tagger and from the drop-down list select Tagging/Scripts/your name (do this only after extensive testing on single files with the file’s properties box open!)
Now each time a file is added to fb2k’s library this script is run on it. BUT: it doesn’t do what it’s supposed to! What ends up in the file’s tags is
- a ‘?’ for files with no time stamp
- the existing field being deleted when the value is ‘?’
That’s when I noticed the two scripts are not equivalent: the first adds an empty string to the requested field if it is already present, whereas the latter simply does nothing in that case because there is no else branch. But still, even the latter does not do the desired job.
Then I came up with this script:
But once again, the only result is frustration, not the desired time stamp. At least it leaves existing fields untouched.
I could work around this issue with the “Stamp current Time and Date…” bit, but since I reinstalled my OS and used fb2k before, my music files are partly stamped already. Side note: probably because of this, the field name should rather be something like “ADDED_TO_LIBRARY”. Moving on, though…
A working workaround I figured out is to
- add a “Stamp current Time and Date…”,
- use the following script
- add “Remove Field…” with the “TIMESTAMP” field selected.
So that leaves me speculating about a bug in either the masstagger or in foo_cwb_hooks. By the way, the time-stamping option might come from foo_masstag_addons. You might want to include this masstagger script from a file.
In case you wonder — like I did — how to employ foobar2000 (fb2k) to handle the lack of ReplayGain (RG) info in a file’s tags nicely, so it doesn’t blast away your eardrums, I have a set of very useful links. At hydrogenaudio.org I found the Intermediate User Guide for fb2k explaining, among other things, the options one has when setting up fb2k for ReplayGain. Most importantly, one should slide the bar in the playback preferences pane for “without RG info” to a value that reflects the average sound level of all tracks. This, of course, would mean scanning all your files. For my couple of weeks’ worth of playback time, fb2k estimates just over 24h for that job. So it was an easy choice to just stick with the suggested value of -8 dB.
Last but not least, I will mention the two preamps in the Playback preferences. Unless you know exactly what you are doing, it is not recommended to raise the output of the preamps above the default 0.0 dB in any way. However, you can use these to roughly compensate for the difference between replaygained and non-replaygained tracks: simply estimate your average ReplayGain level and lower the preamp for files without ReplayGain info by that value. I found -8 dB to work quite well for me. This obviously should not be used as a substitute for properly replaygaining your tracks, but it will definitely protect your ears and your equipment when you come across tracks that miss ReplayGain info.
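That rule of thumb is nothing more than an average. As a sanity check, here is a tiny sketch; the gain values are made up, not from my actual library:

```python
def preamp_for_untagged(track_gains_db):
    # Average of the dB adjustments ReplayGain applied to the tagged
    # tracks; using it as the "without RG info" preamp makes untagged
    # files play at roughly the same loudness as tagged ones.
    return sum(track_gains_db) / len(track_gains_db)

# Made-up scan results for three tracks:
print(preamp_for_untagged([-7.5, -8.5, -8.0]))  # -8.0
```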
Secondly, there is a more detailed description of the Playback settings in the same wiki. Besides some mathematics on RG, it also points out some interesting knowledge about pre-buffering and DSP settings.
For a small pharmacy business looking to present itself with a small website, I’m aiming at a web CMS that integrates the following:
- Content should be edited via a highly configurable yet minimal rich text editor, so writing feels no different from writing text in Word, only with fewer functions offered (just predefined CSS styles for title, paragraph, etc.). No need for a code view.
- The possibility to save as a draft or to publish directly, and therefore some sort of (email) notification system.
- Access to that editor via the website’s “frontend” (FE), i.e. in-line editing — showing a pen marker at each position the logged-in user can edit right in place, similar to wikis — so there is no backend for the ordinary user.
- For the administrator (the web designer, or someone more technically minded) there should be some sort of backend for user management, reviewing drafts to be published and configuring the rich text editor (this could be done “externally” at the file-system level).
- News handling: basically the ability to set a publish and an expiry date per entry/article, so news articles can be written ahead of time. Something like tt_news (see also the manual) from TYPO3, but it could be far less complex.
- Roles per user, or more accurately, a per-document definition of who may do what with it.
- Separation of content from formatting: use e.g. CSS for formatting and store pure content in the database with “meta” formats like “header of CSS class xy”.
- Multi language support per document.
- Versioning of articles.
- Easy to use with public hosting services (I don’t want some old, noisy fan sitting in my living room; and buying new hardware just for 2.0 MB of website? No way).
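The per-document rights mentioned above could be modelled quite simply. This is a hypothetical sketch with invented names, not a reference to any existing CMS:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    # Rights hang off the document itself rather than a global role
    # table: user -> set of allowed actions.
    acl: dict = field(default_factory=dict)

    def can(self, user: str, action: str) -> bool:
        return action in self.acl.get(user, set())

doc = Document("Opening hours", acl={
    "pharmacist": {"read", "edit", "publish"},
    "assistant":  {"read", "edit"},  # may draft, but not publish
})
```

The notification/draft workflow would then boil down to checking `doc.can(user, "publish")` before anything goes live.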
Friday, 20th Apr 2007 at 18:30 (technical stuff)
Nice little helpers for administrating Apache 2 servers:
- a2ensite sitefilename will create the correct symlinks in sites-enabled to allow the site configured in sitefilename to be served
- a2dissite sitefilename will remove the symlinks from sites-enabled so that the site configured in sitefilename will not be served
- a2enmod modulename will create the correct symlinks in mods-enabled to allow the module to be used; e.g. a2enmod php4 will link both php4.conf and php4.load for you
- a2dismod modulename will remove the symlinks from mods-enabled so that the module cannot be used
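Under the hood these helpers are just symlink management. Here is the same idea replayed in a scratch directory instead of /etc/apache2 (the paths and site name are purely illustrative):

```shell
# Set up a miniature sites-available / sites-enabled layout.
APACHE=$(mktemp -d)
mkdir -p "$APACHE/sites-available" "$APACHE/sites-enabled"
printf '<VirtualHost *:80>\n</VirtualHost>\n' > "$APACHE/sites-available/example"

# "a2ensite example" boils down to creating the symlink:
ln -s "$APACHE/sites-available/example" "$APACHE/sites-enabled/example"

# "a2dissite example" would simply remove it again:
#   rm "$APACHE/sites-enabled/example"
```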
Blogged with Flock
… a daydream that was. I was walking through my flat listening to music of my favourite kind. Well, tell me something new, you might think. Here it comes: each time I walked from one room into another, the speakers in the next room would be activated by the N95 (or any other UPnP-aware device) playing the music. So the music would only be played in the surroundings I was in. Of course, the music was not stored locally on the N95 but came via UPnP (or whatever) from my file storage over the ether.
That’s not as unrealistic as I thought at first. I vaguely remember reading about a sound system that can even address all present speakers individually via remote control. That, of course, must have been centrally controlled, and presumably proprietary, i.e. only working if all hardware comes from the same manufacturer. But what if that worked via UPnP (or anything the like)? I guess the tricky bit would be the hand-over of the signals transporting the sound information, i.e. managing gapless playback as if everything were wired, and feeding the speakers by broadcasting the sound to them. Of course, the speakers would most likely have to support some kind of wireless technique for the signal to be transmitted.
Also, some mode that addresses each speaker individually should be implemented, regulating the sound volume for each box (or at least each room), so the toilet is not blasted away… But I guess that would be rather easy and does not even need the modulated infrastructure required for the “sound hand-over” scenario described above. What a cockaigne world it would be to have an abstraction layer every manufacturer sticks to and supports, to handle the kind of “cross-platform interaction” needed here. Well, I’m looking forward to building that cockaigne and living in it. How about you?
What would be even greater: to have a stationary player, say in the living room, controlled via e.g. UPnP, but still have the hand-over working, i.e. have the system notice where the sound should be played and at which volume. Another approach could be some kind of tracker that knows where the person listening to the music is staying. I don’t think that would be the best approach, since it most likely needs complex structures. Plus, I can’t think of a way to keep such a tracker “inter-platformly” scalable for many scenarios and systems.
To step onwards in finding a subject for my diploma thesis, I’ve googled a little and found the following:
First of all, I looked at what topics are being worked on at my university, to maybe narrow it down that way. Our Institute for Digital Media seemed the best guess, showing a seminar by Dr. Dieter Trüstedt called “Elektronische Musik in Theorie und Praxis” (“electronic music in theory and practice”). Only after a while did I notice that its emphasis is on, or I should say it is, making music, not analysing it. Nevertheless, I was pointed to a book by Miller Puckette (Dept. of Music, University of California, San Diego) called “The Theory and Technique of Electronic Music”, including some parts about wave analysis in general, digital music, etc.
The issues I’m looking into are as described before, more precisely finding similar music as a starting point. I also found a few (not yet reviewed) papers:
- Music Database Retrieval Based on Spectral Similarity by Cheng Yang
- Pattern Discovery Techniques for Music Audio by Roger B. Dannenberg and Ning Hu
- Toward Automatic Music Audio Summary Generation from Signal Analysis by Geoffroy Peeters, Amaury La Burthe and Xavier Rodet
- Audio Retrieval by Rhythmic Similarity by Jonathan Foote, Matthew Cooper and Unjung Nam
Also, it came to my mind to maybe take into account how humans (mammals) distinguish music (or complex sounds), and thus also learn more about the brain.
Another thought that hit my mind concerning the use of such an analysis: use it in, say, meeting-recording scenarios as some kind of search algorithm. Imagine you have some 3 hours of a recorded meeting (possibly a conference call) and need a certain part of it, but cannot find the time position by any means. Maybe with the analysis spread out above, one could search audio just as we nowadays search text: speak the word or phrase one is looking for (in a different voice — your own) and find its position in the audio file.
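One classic way to do such query-by-example audio search is dynamic time warping over feature sequences. Here is a crude sketch; the 1-D values stand in for real audio features (e.g. MFCC frames), and the sliding-window approach is just the simplest possible strategy, not a statement about how any product does it:

```python
import numpy as np

def dtw_distance(query, track):
    # Classic dynamic-time-warping cost between two 1-D feature
    # sequences; tolerates the query being spoken faster or slower.
    n, m = len(query), len(track)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - track[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def find_best_window(query, track, hop=1):
    # Slide the query over the recording and keep the offset with
    # the lowest warping cost: the most likely time position.
    best_pos, best_cost = 0, float("inf")
    for start in range(0, len(track) - len(query) + 1, hop):
        cost = dtw_distance(query, track[start:start + len(query)])
        if cost < best_cost:
            best_pos, best_cost = start, cost
    return best_pos, best_cost
```

For a 3-hour recording one would of course need coarser features and a larger hop, but the principle is the same: the spoken query becomes a feature sequence, and the best-matching window gives the time position.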