Thursday 24 August 2017

Introduction to predictive modelling

What is predictive modeling?



  • A set of methods for arriving at quantitative solutions to problems of business interest
  • It is a part of data science or statistical learning
  • Examples
  1. Predict whether a patient hospitalized due to a heart attack will have a second heart attack. The prediction is based on demographic, diet, and clinical measurements for that patient.
  2. Predict the price of a stock 6 months from now, on the basis of company performance measures and economic data.
  3. Identify the numbers in a handwritten ZIP code from a digitized image.
  4. Estimate the amount of glucose in the blood of a diabetic person from the infrared absorption spectrum of that person’s blood.



Predictive modeling process





Types of predictive model learning


The learning problems that we consider can be roughly categorized as either supervised or unsupervised.



  • Supervised learning

         In supervised learning, the goal is to predict the value of an outcome measure based on a number of input measures.

         Examples

            Predict whether a patient hospitalized due to a heart attack will have a second heart attack. The prediction is based on demographic, diet, and clinical measurements for that patient.

            Predict the price of a stock 6 months from now, on the basis of company performance measures and economic data.

            Identify the numbers in a handwritten ZIP code from a digitized image.

            Estimate the amount of glucose in the blood of a diabetic person from the infrared absorption spectrum of that person’s blood.


                
  • Unsupervised learning
           In unsupervised learning, there is no outcome measure, and the goal is to describe the associations and patterns among a set of input measures.

         Examples

           Identifying the products that are usually sold together

           Identifying the typical profile of employees who quit quickly
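
To make the "products that are usually sold together" example concrete, here is a minimal sketch of one simple unsupervised approach: counting how often each pair of products appears in the same basket. The baskets and the helper name count_pairs are hypothetical, invented only for illustration. Note that no outcome measure appears anywhere; the code only describes patterns among the inputs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (illustrative data, not from a real store).
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]

def count_pairs(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair has one canonical ordering.
        items = sorted(set(basket))
        counts.update(combinations(items, 2))
    return counts

pair_counts = count_pairs(baskets)
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)
```

On this toy data the most frequent pair is bread and butter, which appear together in three of the four baskets. Real market-basket analysis typically uses richer measures than raw counts, but the idea of summarizing associations without a target variable is the same.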



Variable Types and Terminology

  • X → Set of inputs / Independent variables / Predictors
  • Y → Set of outputs / Dependent variables / Responses
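
The X and Y notation maps directly onto how data is usually laid out in code. The following is a minimal sketch, assuming made-up toy data with a single input column and using ordinary least squares as one possible supervised method. The function names fit_simple_linear and predict are hypothetical helpers written for this illustration, not part of any library.

```python
# Hypothetical toy data: each row of X holds the inputs for one observation,
# and the matching entry of y holds the observed output for that observation.
X = [[1.0], [2.0], [3.0], [4.0]]  # inputs / independent variables / predictors
y = [3.0, 5.0, 7.0, 9.0]          # outputs / dependent variable / response

def fit_simple_linear(X, y):
    """Fit y = intercept + slope * x by ordinary least squares (one input column)."""
    xs = [row[0] for row in X]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(y) / n
    # Slope is the ratio of the x-y co-variation to the variation in x.
    sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(xs, y))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x_new):
    """Predict the response for a new input value using the fitted line."""
    intercept, slope = model
    return intercept + slope * x_new

model = fit_simple_linear(X, y)
prediction = predict(model, 5.0)
print(model, prediction)
```

Because the toy data lies exactly on the line y = 1 + 2x, the fit recovers an intercept of 1 and a slope of 2, and the prediction for a new input of 5 is 11. With real data the points would scatter around the line and the fitted values would only approximate the responses.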