Friday, September 7, 2007

Station Building 101: Listening Test

The best way to develop a station is to spend time listening to it with intention and focus. I've developed a method for doing this that lets me track a station's development over time and spend some uninterrupted time with each station I'm actively curating. I came up with this procedure about a year after I started listening, and it's obviously more work than many might be interested in doing. However, I've found it to be quite fun, and helpful both for understanding what works well in Pandora and for defining what I want from particular stations.

The basic idea of my listening test is to listen to ten sets of songs generated by a station and score each song: 1 for thumbs up, 0 for thumbs down and 0.5 for neither. Please note that it is important when doing a listening test not to click on a thumbs-down until after the song is over, since the player will start an entirely new set at that point and you could miss any remaining songs in the set.

The first time you click on a station after starting up the player seems to start a new set (so you generally do not have to worry about a partial set at the beginning). To help me identify whether a set is three or four songs, I click on the song page for each song and copy and paste the focus traits into a column of the spreadsheet. I then shade the traits that are common among the songs I believe to be in the same set. On particularly homogeneous stations this step can be hard, and you may need to change your mind and readjust your assessments occasionally. I will often wait on the fourth song of a set to see if it's more similar to the next song.
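The "wait on the fourth song" judgment can be roughed out in code. This is only a sketch of the comparison I do by eye in the spreadsheet: the function name, the tie-breaking rule, and the trait names are all made up for illustration, and nothing here reflects how Pandora actually builds its sets.

```python
def fourth_song_in_current_set(current_set, fourth, next_song):
    """Guess whether a fourth song belongs with the three songs before it.

    Each song is a collection of focus-trait strings copied from its song page.
    Ties go to the current set, which is an arbitrary choice for this sketch.
    """
    # Traits shared by every song already assigned to the current set
    common = set.intersection(*(set(traits) for traits in current_set))
    overlap_with_current = len(common & set(fourth))
    overlap_with_next = len(set(fourth) & set(next_song))
    return overlap_with_current >= overlap_with_next


# Hypothetical trait lists, not real Pandora data
current = [
    {"acoustic guitar", "folk roots", "male vocal"},
    {"acoustic guitar", "folk roots", "mellow"},
    {"acoustic guitar", "folk roots", "minor key"},
]
fourth = {"acoustic guitar", "folk roots", "piano"}
following = {"piano", "electronica"}
print(fourth_song_in_current_set(current, fourth, following))
```

A simple count of shared traits is crude, and on a homogeneous station it will be wrong about as often as I am, so treat the answer as a prompt to look again rather than a verdict.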

Once I've identified the sets and scored each song, I calculate a score for each set as the average score for the songs in the set. For instance, on a three-song set, if I had a thumbs up, a thumbs down and a neither, I'd score that set as a 0.5 (= (1 + 0 + 0.5)/3). The final score for the station is the sum of the scores across the ten sets, giving a number from 0 to 10. A good score is anything above a 5. An 8 or higher is a great score and was very rare prior to the change in the selection algorithm this summer. Now, you can get at least some of your stations into that range with diligent development.
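For anyone who would rather not do the arithmetic in a spreadsheet, here is a minimal sketch of the scoring. The rating labels ("up", "down", "neither") and the function names are my own choices for the example.

```python
from statistics import mean

# Score for each rating, as described above
RATING_SCORES = {"up": 1.0, "down": 0.0, "neither": 0.5}


def set_score(ratings):
    """Average score of the songs in one set (three or four ratings)."""
    return mean(RATING_SCORES[r] for r in ratings)


def station_score(sets):
    """Sum of the set scores; with ten sets this runs from 0 to 10."""
    return sum(set_score(ratings) for ratings in sets)


# The three-song example from the text: one up, one down, one neither
print(set_score(["up", "down", "neither"]))  # 0.5
```

Because each set is averaged before summing, a four-song set counts no more toward the station score than a three-song set.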

The scores tend to be pretty volatile, but that is largely because a session of ten sets is not long enough statistically for the average song score to settle down. Nevertheless, a ten-set session is about as long as can be done comfortably at a single sitting, and it is long enough to give you some read of the quality of the station.
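To put a rough number on that volatility, here is a back-of-the-envelope sketch. It rests on simplifying assumptions that are mine, not measurements: every song is an independent thumbs up with the same probability, there are no "neither" ratings, and every set has three songs.

```python
import math


def station_score_sd(p_up, sets=10, songs_per_set=3):
    """Standard deviation of the station score under the assumptions above.

    Each set score is the mean of `songs_per_set` independent 0/1 ratings,
    so its variance is p(1 - p) / songs_per_set; the station score sums
    `sets` such set scores, so the variances add.
    """
    per_set_variance = p_up * (1 - p_up) / songs_per_set
    return math.sqrt(sets * per_set_variance)


# A station whose songs earn a thumbs up 60% of the time averages a score of 6
print(round(station_score_sd(0.6), 2))  # 0.89
```

Under these assumptions, a station that "really" deserves a 6 will routinely come in anywhere from about 5 to about 7 on a single session, which matches the swings I see from one test to the next.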
