Today I converted the validation set (I had not noticed its existence before), so let’s redo some of the numbers and statistics I gave earlier. The ratings in the training set are 47% tracks, 29% artists, 19% albums and 5% genres. The ratings in the validation set are 27% tracks, 52% artists, 11% albums […]

Today I discovered that I had completely missed the existence of a validation set, so I had created my own from the training set. Now I have to redo some stuff! I also checked the consistency of genres between the track and album data. I compared the genres of a track to the genres of the […]

I have been playing with the KDD Cup 2011 data for about a week now. So far, the results are not impressive, with a best score of 36.2247. Today, I looked at the data some more and found some interesting facts. First, 97.5% of the ratings (in track 1) are in one of the categories […]