
Category Archives: Data Science
Analysis of 2012 Presidential Election Polls
Here, my goal is to predict the 2012 US Presidential election results based on multiple polls. I used online data for polls and Electoral College votes. As long as the links do not change, this code should work on any machine. This … Continue reading
Posted in Data Science, Programming
Tagged data science, Obama, US Elections
Predict 2012 Presidential Elections Based on Occupation and Employer
My code and description can be found here: http://nbviewer.ipython.org/github/sergulaydore/DataScienceProjects/blob/master/Predict2012Elections.ipynb
Posted in Data Science, Programming
Tagged 2008, 2012, Predict, prediction modeling, Presidential Elections
Data Loading, Storage and file formats using Pandas
Pandas is a very useful Python library for data analysis. Here is my Python notebook to get started with pandas. I used Wes McKinney’s book “Python for Data Analysis”. Chapter 6 – Data Loading, Storage and file formats
Posted in Data Science, Programming
Tagged data science, ipython notebook, pandas, python
Getting Started with pandas
Pandas is a very useful Python library for data analysis. Here is my Python notebook to get started with pandas. I used Wes McKinney’s book “Python for Data Analysis”. Chapter 5 – Getting started with pandas
Posted in Data Science, Programming
Tagged ipython notebook, pandas, python
Evaluation metrics for binary classification
Say you train several models for a binary classification task and you want to find out which one yields the best results. The right evaluation depends on the application, but there are some common frameworks for this task. Accuracy is probably … Continue reading
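To make the comparison concrete, here is a minimal sketch of a few common binary classification metrics computed from confusion-matrix counts. The labels and the two "models" are made up for illustration, and the helper name `binary_metrics` is hypothetical; this is not necessarily the set of metrics the full post covers.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a model never predicts positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up data: 2 positives among 10 examples (an imbalanced problem).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
model_a = [0] * 10                        # always predicts the majority class
model_b = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # finds one of the two positives

metrics_a = binary_metrics(y_true, model_a)
metrics_b = binary_metrics(y_true, model_b)
print(metrics_a)
print(metrics_b)
```

In this toy case both models reach the same accuracy, yet only the second ever finds a positive example, which is why accuracy alone can be a misleading basis for choosing a model on imbalanced data.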
Posted in Data Science
Tagged classification, data science, error metrics
Covariance shift
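Covariance shift (more commonly called covariate shift), where the input distribution differs between training and test data, is an important problem in data science. Here, I illustrate how weighted maximum likelihood can improve the prediction results.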
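The idea can be sketched with a toy example. This is not the code from the post: it assumes Gaussian noise, so that weighted maximum likelihood reduces to weighted least squares, and it assumes the training and test input densities are known. The quadratic target, the evenly spread training inputs, and the Gaussian test density are all made up for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_line(xs, ys, weights):
    """Weighted least squares for y = slope * x + intercept."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    cov_xy = sum(w * (x - mean_x) * (y - mean_y)
                 for w, x, y in zip(weights, xs, ys))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared error of a (slope, intercept) model."""
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)


# The true relation is y = x^2, but we fit a straight line (misspecified).
train_x = [i * 0.05 for i in range(41)]  # evenly spread over [0, 2]
train_y = [x * x for x in train_x]
test_x = [1.3, 1.4, 1.5, 1.6, 1.7]       # test inputs cluster near 1.5
test_y = [x * x for x in test_x]

# Importance weight = assumed test density / assumed training density.
# Training inputs are treated as uniform on [0, 2] (density 0.5);
# test inputs are assumed to follow N(1.5, 0.2^2).
weights = [gaussian_pdf(x, 1.5, 0.2) / 0.5 for x in train_x]

unweighted = fit_line(train_x, train_y, [1.0] * len(train_x))
weighted = fit_line(train_x, train_y, weights)

unweighted_mse = mse(unweighted, test_x, test_y)
weighted_mse = mse(weighted, test_x, test_y)
print("unweighted test MSE:", unweighted_mse)
print("weighted test MSE:  ", weighted_mse)
```

The weighted fit concentrates on the region where test inputs actually fall, so the line it finds approximates the curve there rather than over the whole training range. In practice the two densities are unknown and the weights have to be estimated.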
Posted in Data Science
Tagged covariance shift, data science, prediction modeling
Overfitting and Its Avoidance
Assume you work at a company and your boss asks you to build a model that predicts customers’ tendency to accept a special offer. You build a model and come up with a result which is almost … Continue reading
Posted in Data Science
Predictive Modeling and Supervised Segmentation
Predictive modeling is useful for understanding or predicting a target quantity. In business, this quantity might be something we want to avoid. For example, you may want to predict if a customer will leave the company when … Continue reading
Posted in Data Science
Tagged business, data mining, data science
Data Science terms in Business
Recently, I came across several engineering students applying for data scientist positions. They all have great analytical and programming skills, but they mentioned that it was hard to understand the jargon used in the business world for the same problems … Continue reading
The Perceptron
The perceptron is one of the earliest supervised classification algorithms in machine learning. It was introduced by Frank Rosenblatt in 1958 [1]. The idea of the perceptron was inspired by the neurons in the brain. The inputs of the perceptron that … Continue reading
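As a rough illustration, here is a minimal pure-Python sketch of the classic perceptron learning rule trained on the OR function. It is not the Theano implementation the tags refer to; the function names, the learning rate, and the epoch limit are arbitrary choices made for this example.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, learning_rate=1.0, max_epochs=20):
    """Perceptron rule: nudge weights by (target - prediction) * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:  # every sample classified correctly
            break
    return weights, bias


# Truth table of logical OR, which is linearly separable.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_perceptron(or_samples)
predictions = [predict(weights, bias, x) for x, _ in or_samples]
print(weights, bias, predictions)
```

The same loop never reaches zero mistakes on XOR, because no single line separates its two classes; that case needs more than one layer.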
Posted in Data Science, Programming
Tagged deep learning, OR, perceptron, python, theano, XOR