Thursday, July 31, 2008

July Listening Test Results

I have not posted updated charts for my station tests since September of last year. As you can see below, my satisfaction peaked toward the beginning of the year. There was a marked drop in May and June, but the ratings have since recovered quite well. The player seemed to deliver fewer previously thumbed-up songs in June, as can be seen in the second chart.

Once it became clear that the stations were exploring more than usual, I wrote to Tim and Tom. Unusually, neither replied (normally both are quite responsive, but my e-mail would have arrived during their crunch time in creating the iPhone app). However, as part of Tom's reply to my review of the iPhone app, he confirmed my impression. The chart below shows the number of previously thumbed-up tracks (out of 40 possible tracks in each test). As you can see, the player has recovered substantially and was exploring even less than usual this past month.

Of course, the right amount of exploration for the player is hard to determine. If the player switched to playing only previously thumbed-up tracks, then by these tests my satisfaction ratings would go to 10 for each station, and the previously thumbed-up counts would all go to 40. There's no doubt that is exactly the behavior that some people want and even expect from the player. However, I do like hearing new music.

For me, the ideal player would maximize my satisfaction while minimizing the number of previously thumbed-up tracks. It would take an infinite amount of music for the player to be perfect under that criterion. However, from the drop in performance in May and June, it's clear that the selection algorithm cannot tolerate much more exploration at this point. Additional exploration is particularly hard on stations that try to do something other than deliver genres which are easily identified by the genome (like my novelty song station, "The Best Medicine", for instance). Thus, given the current genome, pushing the player towards less exploration is probably better (though unsatisfying). Of course, if we could tailor the exploration to each station, that would be best of all.
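
To make that trade-off concrete, here is a rough sketch of one way the two numbers from each test could be folded into a single score. This is purely my own illustration, not anything the player actually computes: the function name, the formula, and the assumption that ratings run from 0 to 10 are all hypothetical. It simply scales the satisfaction rating by the share of the 40 tracks that were not previously thumbed up.

```python
TRACKS_PER_TEST = 40   # each test covers 40 tracks
MAX_RATING = 10.0      # top satisfaction rating; a 0 floor is assumed here


def exploration_adjusted_score(satisfaction, previously_thumbed_up):
    """Hypothetical score: satisfaction scaled by the share of new tracks.

    This formula is an illustrative assumption, not the player's own metric.
    """
    if not 0 <= satisfaction <= MAX_RATING:
        raise ValueError("satisfaction must be between 0 and 10")
    if not 0 <= previously_thumbed_up <= TRACKS_PER_TEST:
        raise ValueError("previously_thumbed_up must be between 0 and 40")
    # Fraction of the test that was music not already thumbed up.
    new_share = 1 - previously_thumbed_up / TRACKS_PER_TEST
    return satisfaction * new_share
```

Under this made-up score, a station that plays nothing but previously thumbed-up tracks gets a satisfaction of 10 and a count of 40, which works out to zero. The only way to reach the top score of 10 is full satisfaction with no repeats at all, which is the "infinite amount of music" case.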

