Tuesday 26 November 2013

Entered a few Kaggle Competitions

I entered a few Kaggle competitions for fun and so I could put them on my resume.

On all of them I scored somewhat below the middle of the leaderboard.

I did very little feature engineering. I simply loaded the data into Postgres (except for digit recognition, which I just formatted for vw using Python), wrote it out to text files in vw format, and used some shell scripts to combine the files with Unix paste and run vw on them. VW automatically treats text as a bag of words, so it gave reasonable results as is. Overfitting is real: running 1000 passes appeared to give better results at the vw console but scored worse on the Kaggle submission.
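As a rough illustration of the formatting step, here is a minimal Python sketch of turning a row into a line of vw input. These are not my original scripts: the function names, the namespace letters and the `px` feature prefix are made up for the example, and shifting the digit labels to 1..10 assumes the data is going into --oaa, which expects labels starting at 1.

```python
import re


def to_vw_text_line(label, text, namespace="t"):
    """Format a label and free text as one vw line (bag of words)."""
    # '|' and ':' have special meaning in vw input, so replace them;
    # split() also collapses newlines and repeated spaces.
    tokens = re.sub(r"[|:]", " ", text.lower()).split()
    return f"{label} |{namespace} {' '.join(tokens)}"


def to_vw_pixel_line(digit, pixels, namespace="p"):
    """Format a digit image as one vw line with sparse pixel features."""
    # Assumption: labels are shifted from 0..9 to 1..10 for --oaa.
    # Zero-valued pixels are skipped because vw input is sparse.
    feats = " ".join(f"px{i}:{v}" for i, v in enumerate(pixels) if v != 0)
    return f"{digit + 1} |{namespace} {feats}"


if __name__ == "__main__":
    print(to_vw_text_line(1, "Pot hole: big | deep"))
    print(to_vw_pixel_line(0, [0, 255, 0, 12]))
```

Each output line can be appended to a text file and fed to vw directly.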

For See Click Predict Fix I first ran a multiclass model with --oaa, did some feature engineering on the timestamps in Postgres (date truncs and date parts), and took the ceiling of the lat/longs. The competition had only 4 days left and this was my first competition, so some time went to learning the Kaggle site and the evaluation metric (log). I then redid the competition as a regression on log-transformed outputs, but got worse results than with the multiclass model.
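The feature engineering itself was done in Postgres, but the same ideas can be sketched in Python. This is only an approximation: the timestamp format, the function names and the choice of date parts are assumptions for the example, and using log(1 + x) for the target is one common way to log-transform counts that may be zero.

```python
import math
from datetime import datetime


def timestamp_features(ts):
    """Date parts and a date trunc, similar to date_part/date_trunc."""
    # Assumed format: 'YYYY-MM-DD HH:MM:SS'
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return {
        "hour": dt.hour,
        "dow": dt.weekday(),            # Monday is 0
        "month": dt.month,
        "day": dt.strftime("%Y-%m-%d"),  # timestamp truncated to the day
    }


def coarse_location(lat, lon):
    """Round lat/long up to whole degrees so nearby points share a value."""
    return math.ceil(lat), math.ceil(lon)


def to_log_target(count):
    """Log-transform a non-negative count for use as a regression target."""
    return math.log1p(count)


def from_log_prediction(pred):
    """Undo the log transform, clipping negative results to zero."""
    return max(0.0, math.expm1(pred))


if __name__ == "__main__":
    print(timestamp_features("2013-11-26 14:30:00"))
    print(coarse_location(37.54, -77.43))
    print(from_log_prediction(to_log_target(5)))
```

With this approach the regression is trained on the transformed target, and its predictions are converted back before writing the submission file.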

Partly sunny with a chance of hashtags, digit recognition, see click predict fix.
