Thursday 14 November 2013

How I approach machine learning

I have a simplified conception of machine learning, built around a few basic families of algorithms:

decision trees and ensembles of them (boosting / random forests) - I haven't looked into these much, but I might test them in the future (probably just using waffles)
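I haven't dug into these yet, but the core ensemble idea is simple enough to sketch. Here's a toy illustration (all names made up, not how waffles or any real library does it): many weak depth-1 trees ("stumps"), each trained on a bootstrap resample, voting on the answer.

```python
import random

def majority(labels):
    """Most common label, defaulting to 0 for an empty side."""
    return max(set(labels), key=labels.count) if labels else 0

def train_stump(sample):
    """Fit a depth-1 tree: choose the threshold on x that makes the
    two sides of the split as pure as possible."""
    best = None
    for thresh, _ in sample:
        left = [y for x, y in sample if x <= thresh]
        right = [y for x, y in sample if x > thresh]
        purity = max(left.count(0), left.count(1)) + max(right.count(0), right.count(1))
        if best is None or purity > best[0]:
            best = (purity, thresh, majority(left), majority(right))
    _, thresh, left_label, right_label = best
    return lambda x: left_label if x <= thresh else right_label

def random_forest(sample, n_trees=25):
    """Train each stump on a bootstrap resample, predict by majority vote."""
    trees = [train_stump([random.choice(sample) for _ in sample]) for _ in range(n_trees)]
    return lambda x: round(sum(t(x) for t in trees) / n_trees)

random.seed(0)
data = [(x, int(x > 5)) for x in range(11)]  # label is 1 exactly when x > 5
forest = random_forest(data)
```

Each individual stump is a weak rule of thumb; the bootstrap resampling decorrelates them so the vote is better than any one tree.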

perceptron (linear regression etc.) - the loss has only one global minimum, so the derivative is useful. I just use Vowpal Wabbit for this (get as much data and engineer as many features as possible, and let vw sort it out)
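The "one global minimum" point is worth making concrete: the squared-error loss of linear regression is convex, so plain gradient descent reaches the optimum from any starting point. A minimal from-scratch sketch (this is the general idea, not what vw actually does internally; vw uses online updates and hashed features):

```python
# Least-squares linear regression by gradient descent.
# The loss is convex: one global minimum, so following the
# derivative from any starting point converges to it.
def fit(points, lr=0.01, steps=3000):
    w, b = 5.0, -3.0  # deliberately bad starting point
    n = len(points)
    for _ in range(steps):
        dw = sum(2 * (w * x + b - y) * x for x, y in points) / n
        db = sum(2 * (w * x + b - y) for x, y in points) / n
        w -= lr * dw
        b -= lr * db
    return w, b

points = [(x, 2 * x + 1) for x in range(10)]  # data from y = 2x + 1, no noise
w, b = fit(points)
```

Despite starting far from the answer, `w` and `b` end up at roughly 2 and 1 - there is no bad basin to get stuck in.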

neural nets - perceptrons wired together - many local minima, so the derivative may or may not be useful. Academic researchers try out different optimizers (SGD is the most common). I'd think simulated annealing would do well on them. I tried vw's neural net, but it took too long to run and gave poor results (researchers use GPUs to get performance).
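To illustrate the simulated-annealing idea on a non-convex surface: here's a toy run on a bumpy 1-D loss standing in for a network's loss landscape (the function and all parameters are made up for illustration). Unlike pure gradient descent, annealing sometimes accepts uphill moves, with a probability that shrinks as the "temperature" cools, so it can escape local minima.

```python
import math, random

def anneal(f, x0, steps=20000, t0=2.0):
    """Minimize f by simulated annealing: propose a random step, accept
    worse points with probability exp(-delta / temperature)."""
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for i in range(steps):
        t = t0 * (1 - i / steps) + 1e-9  # temperature cools toward zero
        x_new = x + random.gauss(0, 0.5)
        f_new = f(x_new)
        if f_new < fx or random.random() < math.exp((fx - f_new) / t):
            x, fx = x_new, f_new
            if fx < fbest:
                best, fbest = x, fx
    return best

# A bumpy loss with many local minima; the global minimum is at x = 0.
bumpy = lambda x: x * x + 3 * math.sin(5 * x) ** 2

random.seed(1)
x = anneal(bumpy, x0=8.0)
```

Started at x = 8, gradient descent would stall in the nearest dip; the annealer wanders its way down to the central basin. Whether this scales to the millions of weights in a real net is exactly the open question.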

autoencoders, topic models - unsupervised learning. I just use gensim's implementations. It has tf-idf, LSI, RP, LDA, and HDP. I've only tested tf-idf, LSI, and RP, and might only use RP in the future.
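Of these, tf-idf is the one that's easy to show from scratch (this is the standard weighting scheme, not gensim's exact normalization): a term counts for more when it's frequent in its own document but rare across the corpus.

```python
import math
from collections import Counter

# Tiny made-up corpus for illustration.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "dogs and cats are pets".split(),
]

def tfidf(docs):
    """Weight each term by frequency within its document, discounted by
    how many documents contain it (corpus-wide words score near zero)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))  # document frequency
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return out

vectors = tfidf(docs)
```

In the first document, "mat" (unique to that document) outweighs "cat" (shared with another), which is the whole point of the scheme.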

naive bayes - count stuff up.
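"Count stuff up" really is the whole algorithm, which a short sketch makes clear (the corpus here is made up; add-one smoothing is the standard fix for unseen words):

```python
import math
from collections import Counter, defaultdict

# Tiny labeled corpus, invented for illustration.
train = [
    ("spam", "buy cheap pills now"),
    ("spam", "cheap pills cheap deals"),
    ("ham", "meeting schedule for monday"),
    ("ham", "lunch meeting moved to monday"),
]

def train_nb(examples):
    """Training is just counting: labels, and words per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, text in examples:
        label_counts[label] += 1
        for w in text.split():
            word_counts[label][w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab

def classify(text, label_counts, word_counts, vocab):
    """Score each label by log prior + log likelihood of every word,
    with add-one smoothing so unseen words don't zero things out."""
    total = sum(label_counts.values())
    best, best_score = None, float("-inf")
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

model = train_nb(train)
```

Since the counts reduce to a handful of group-by aggregations, this is also why it ports so naturally to SQL or awk.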

For my recommender system projects I plan on sticking to the vector spaces (gensim), the perceptron (vw), and naive bayes (probably done in SQL or awk). I don't have much computing power, so things like neural nets are too much for now.
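The vector-space half of that plan can be sketched in a few lines: represent items as vectors, then recommend by cosine similarity. Everything below is hypothetical (item names, raw word counts as vectors); in practice the vectors would come from gensim's tf-idf or RP models instead.

```python
import math

# Toy item descriptions, invented for illustration.
items = {
    "article_a": "python machine learning tutorial",
    "article_b": "deep learning with python",
    "article_c": "gardening tips for spring",
}

def vectorize(text):
    """Raw word counts as a sparse vector (a stand-in for tf-idf / RP)."""
    v = {}
    for w in text.split():
        v[w] = v.get(w, 0) + 1
    return v

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb)

def recommend(liked, items):
    """Rank every other item by cosine similarity to the liked item."""
    target = vectorize(items[liked])
    scores = {k: cosine(target, vectorize(t)) for k, t in items.items() if k != liked}
    return max(scores, key=scores.get)

rec = recommend("article_a", items)
```

Nothing here needs more than counting and dot products, which is the appeal when computing power is the constraint.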
