Applied Machine Learning (Support Vector Machines and Random Forest) for Energy Efficiency Asset Rating in Collaboration with the Dept. of Energy, SRA International, and Lawrence Berkeley National Laboratory
One of the prediction projects I recently had the privilege of working on is helping the Department of Energy and Lawrence Berkeley National Laboratory improve their Home Energy Score (HES). This single-value metric, which ranges from 1 to 10, rates the energy efficiency of a given home independent of occupant behavior. In other words, the score lets prospective home buyers compare household energy performance without having to account for the prior occupants’ behavior. HES is a relatively simple scoring system compared to the California Home Energy Rating System (HERS). I conducted a predictive analysis that used the HES features to predict HERS outcomes, to find out whether the simpler HES model captures the variability of the more complex HERS model. I applied both Support Vector Machine and Random Forest machine learning algorithms during this project.
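One way to quantify whether predictions built from HES features "capture the variability" of HERS is the coefficient of determination (R²) on held-out homes. The sketch below is only an illustration of that metric: the report may have used a different measure, and the HERS values shown are made-up placeholders, not project data.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that is explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the observed values around their mean.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation left unexplained by the predictions.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total


# Hypothetical held-out HERS values and model predictions (illustrative only).
hers_actual = [62.0, 85.0, 110.0, 97.0, 74.0]
hers_predicted = [65.0, 80.0, 105.0, 100.0, 78.0]

score = r_squared(hers_actual, hers_predicted)
print(f"R^2 on held-out homes: {score:.3f}")
```

A value near 1 would suggest the simpler feature set reproduces most of the HERS variation, while a value near 0 would suggest it does little better than predicting the mean.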
Click below to see the Home Energy Score Viability Report
Home Energy Rating Predictive Analysis Report
Created Spatial/Statistical Model for targeted Home Energy Efficiency Upgrades – 2013
Energy Upgrade California is a statewide multimillion-dollar initiative to reduce greenhouse gas emissions via rebated home energy efficiency audits and subsequent improvements. Utilizing County Assessor data, SDG&E residential energy use data, weather station data, and other parameters, I created a statistical model that maps which homes in San Diego are most likely to participate in and benefit from home energy efficiency upgrades. I used R to run multivariate regressions. Advanced model selection methods, such as the Bayesian Information Criterion, the Akaike Information Criterion, and Mallows’s Cp, were used to extract features with strong explanatory power. The results of the model and mapping are helping to guide future targeted growth for the program. Click here to explore the optimization map.
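The three selection criteria named above can be sketched for ordinary least squares as follows. This is a minimal illustration and not the project's R code: it uses Python's standard library only, scores every feature subset exhaustively (which is practical only for a handful of features), and runs on synthetic data. The feature names `floor_area`, `year_built`, and `unrelated` are hypothetical.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(rows, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on `cols`."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def score_subsets(rows, y, names):
    """Compute AIC, BIC, and Mallows's Cp for every non-empty feature subset."""
    n, k = len(rows), len(names)
    # Error variance estimated from the full model, as Mallows's Cp requires.
    sigma2_full = rss(rows, y, range(k)) / (n - k - 1)
    results = []
    for size in range(1, k + 1):
        for cols in itertools.combinations(range(k), size):
            r = rss(rows, y, cols)
            p = size + 1  # number of coefficients, including the intercept
            results.append({
                "features": [names[c] for c in cols],
                # Gaussian AIC and BIC, up to a constant shared by all subsets.
                "aic": n * math.log(r / n) + 2 * p,
                "bic": n * math.log(r / n) + p * math.log(n),
                "cp": r / sigma2_full - n + 2 * p,
            })
    return results


# Synthetic data: the response depends on the first two features only.
random.seed(0)
names = ["floor_area", "year_built", "unrelated"]
rows = [[random.gauss(0, 1) for _ in names] for _ in range(200)]
y = [2.0 * r[0] - 1.5 * r[1] + random.gauss(0, 1) for r in rows]

results = score_subsets(rows, y, names)
best_bic = min(results, key=lambda r: r["bic"])
print("Lowest-BIC subset:", best_bic["features"])
```

Lower AIC and BIC indicate a better trade-off between fit and complexity, and a Cp close to the number of coefficients suggests a subset with little bias. BIC penalizes extra features more heavily than AIC once the sample has more than about eight observations, so it tends to choose smaller models.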
Analyzed and Modeled Solar Panel Performance for the California Solar Initiative
The California Solar Initiative is a $2 billion program with the target of installing 1.9 gigawatts of new solar capacity in California. I created a statistical model to investigate past installations for trends in solar panel performance vs. azimuth, tilt, shading, and mounting height, using a combination of ArcMap GIS, NREL PVWatts, and statistical modeling in R.
Urban Lawn and Turf Irrigation Monitoring and Assessment (ULTIMA)
UC Irvine 2013
Analyzed hundreds of thousands of customer water billing records from the Municipal Water District of Orange County to train a multivariate regression model that predicts the amount of water used to irrigate outdoor lawns and landscaping. Other features used in the model include demographic income data and the NIR, red, and thermal bands from the Landsat 5 and 7 satellites.
Link: Remote Sensing Statistical Model Full Report
Map of Wind Farm Success Potential Based on Multivariate Factors
Stanford University 2011 – 2012
Professional Reference: Patricia Carbajales, Geospatial Manager
(650) 725-9179 firstname.lastname@example.org
Using the R statistical software package in combination with ArcMap GIS, assessed the optimal wind farm locations in the continental United States using transmission line, road, wind speed, state renewable incentive, endangered species, and topographic data. The most important model features were selected using the Bayesian Information Criterion.
This wind energy investment optimization map uses variables such as distance to roads, distance to transmission lines, wind energy density, presence of endangered birds, state utility rates, state parks, and state renewable portfolio standards.