How to run linear regression in Python


Scikit-learn is a powerful Python module for machine learning. It contains functions for regression, classification, clustering, model selection and dimensionality reduction. Today, I will explore the sklearn.linear_model module, which contains “methods intended for regression in which the target value is expected to be a linear combination of the input variables”.
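
To make the quoted description concrete, here is the usual way of writing a linear combination. The symbols (the prediction \hat{y}, the weights w, the inputs x and the number of features p) are notation introduced here for illustration:

```latex
\hat{y}(w, x) = w_0 + w_1 x_1 + \dots + w_p x_p
```

On a fitted scikit-learn linear model, w_0 is exposed as intercept_ and the weights w_1, ..., w_p as coef_.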

  • The first step is to import the required Python libraries into the IPython Notebook.
  • Import the Boston data set into the notebook and store it in a variable called boston.
  • Convert boston.data into a pandas data frame called bos.
  • Add the target prices (boston.target) to the bos data frame as a price column.
  • Import linear regression from the scikit-learn module. Then drop the price column, since I want only the parameters as my X values, and store the linear regression object in a variable called lm.
  • Fit the model, then print the intercept and the number of coefficients.
  • Construct a data frame that contains the features and their estimated coefficients.
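
The steps above can be sketched roughly as follows. This is a minimal sketch, not the code from the original notebook: load_boston was removed from scikit-learn in version 1.2, so the example builds a small synthetic stand-in instead. The three column names are borrowed from the Boston data, but the values, the PRICE column name, the generating coefficients and the coef_df variable are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the Boston data (assumption: borrowed column
# names and invented values, not the real measurements).
rng = np.random.default_rng(0)
n_samples = 200
bos = pd.DataFrame({
    "CRIM": rng.exponential(3.0, n_samples),      # stand-in crime rate
    "RM": rng.normal(6.3, 0.7, n_samples),        # stand-in rooms per dwelling
    "LSTAT": rng.uniform(2.0, 35.0, n_samples),   # stand-in lower-status share
})

# Add the target as a PRICE column (invented linear rule plus noise).
bos["PRICE"] = (
    -10.0
    - 0.2 * bos["CRIM"]
    + 6.0 * bos["RM"]
    - 0.5 * bos["LSTAT"]
    + rng.normal(0.0, 1.0, n_samples)
)

# Drop the price column so that X holds only the features.
X = bos.drop("PRICE", axis=1)

# Create the linear regression object and fit it to the data.
lm = LinearRegression()
lm.fit(X, bos["PRICE"])

print("Estimated intercept:", lm.intercept_)
print("Number of coefficients:", len(lm.coef_))

# Pair each feature name with its estimated coefficient.
coef_df = pd.DataFrame({
    "features": X.columns,
    "estimatedCoefficients": lm.coef_,
})
print(coef_df)
```

With the real data set, bos would be built from boston.data and boston.target instead of the random draws; the calls from X = bos.drop(...) onward would stay the same.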

As you can see from the data frame, there is a high correlation between RM (the average number of rooms per dwelling) and prices. Let's draw a scatter plot of the true housing prices against the true RM values.
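
A scatter plot like this one could be drawn as in the sketch below. The helper plot_rm_vs_price and the demo data frame are hypothetical names, and the data is again a synthetic stand-in rather than the real Boston values, so the printed correlation only illustrates the idea.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_rm_vs_price(df):
    """Scatter RM against PRICE and return the matplotlib Axes."""
    fig, ax = plt.subplots()
    ax.scatter(df["RM"], df["PRICE"], alpha=0.6)
    ax.set_xlabel("Average number of rooms per dwelling (RM)")
    ax.set_ylabel("Housing price")
    ax.set_title("Relationship between RM and price")
    return ax


# Synthetic stand-in data (assumption: invented values, not the real data set).
rng = np.random.default_rng(0)
rm = rng.normal(6.3, 0.7, 200)
demo = pd.DataFrame({
    "RM": rm,
    "PRICE": -15.0 + 6.0 * rm + rng.normal(0.0, 2.0, 200),
})

ax = plot_rm_vs_price(demo)

# Pearson correlation between the two columns, as a numeric check on the plot.
print("Correlation between RM and PRICE:", demo["RM"].corr(demo["PRICE"]))
```

In a notebook the figure is rendered inline; in a plain script, call plt.show() after plot_rm_vs_price to display it.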