Quantitative Analysis Competition: Kaggle


Kaggle (http://www.Kaggle.com) is an online competition platform where companies post large data sets and invite people around the world to build algorithms that make sense of the data. I will be using R, Octave, and Python to compete in the AMS 2013-2014 Solar Energy Prediction Contest. The goal of this contest is to use NOAA/ESRL weather data to better predict solar output from various solar panels in Oklahoma. I am currently learning Machine Learning through the Stanford Coursera course taught by Andrew Ng. I am learning Machine Learning because in data analysis we often want to answer questions such as “Is there hidden structure in this large data set?” and “Are there patterns and correlations buried in the data?” Machine Learning becomes especially useful when the number of variables or features is so large that pre-processing the data and selectively testing for correlations by hand would take a lifetime. Through this solar competition, I hope to improve my ability to quickly prototype statistical models and algorithms using Machine Learning.