Predictive analytics applications come in a variety of forms, but most business applications fall under
one of the four categories described below:
Decision Trees
Decision Trees are versatile tools that can be used both to classify objects into pre-defined groups (classification)
and to predict the outcome of a target variable that takes on numeric values (prediction). These techniques successively
partition the data to predict the desired outcome. The resulting tree contains a set of nodes, where observations
within a node are similar with respect to a specified measure, whereas observations in different nodes are dissimilar with
respect to the same measure.
The objective of a decision tree is to identify subsets of the data that differ as much as possible
on the outcome variable.
For example, Decision Trees can be used to segment a set of prospects based on their propensity to respond to a particular
campaign, so that each resulting segment has a distinctly different response propensity.
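The partitioning step described above can be sketched as a search for the single split that best separates responders from non-responders. This is a minimal illustration, not a full tree-building algorithm: it assumes Gini impurity as the similarity measure (one common choice, not one named in the text), a single numeric feature, and invented prospect data. A complete tree would apply the same search recursively to each resulting segment.

```python
# Hypothetical prospect data: (age, responded) pairs, invented for illustration.
prospects = [(22, 0), (25, 0), (31, 0), (38, 1), (44, 1), (52, 1), (29, 0), (47, 1)]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows):
    """Return (threshold, weighted_impurity) for the best single split.

    Candidate thresholds are midpoints between consecutive distinct
    feature values; the winner minimizes the size-weighted impurity
    of the two resulting segments.
    """
    values = sorted({x for x, _ in rows})
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


threshold, impurity = best_split(prospects)
print(f"Split prospects at age <= {threshold} (weighted impurity {impurity:.2f})")
```

In this toy data set every prospect older than the chosen threshold responded and every younger prospect did not, so the two segments are as dissimilar as possible on the outcome.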
Scorecards
Scorecards are commonly used to rank-order customers based on their likelihood of exhibiting
a specific behavior. Companies can then devise business rules that use the
scorecards to automate a business process. For instance, threshold scores can be set to
automate approval of credit limit increases.
For example, a credit card company can create a scorecard that rank-orders its customers
based on a Probability of Payment score (POP score), where a lower score reflects a higher risk of non-payment. Customers scoring
below a pre-set threshold can be referred to the collections team for proactive action.
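The threshold rule in this example can be sketched in a few lines. The customer identifiers, POP scores, and cut-off value below are all invented for illustration; an actual scorecard would derive the scores from a fitted model and set the threshold from business policy.

```python
# Hypothetical POP scores keyed by customer id (higher = more likely to pay).
pop_scores = {"A": 720, "B": 540, "C": 610, "D": 480}

# Hypothetical cut-off; in practice this is set by business policy.
COLLECTIONS_THRESHOLD = 550


def rank_order(scores):
    """Return (customer, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def refer_to_collections(scores, threshold):
    """Return the customers whose score falls below the threshold, riskiest last."""
    return [customer for customer, score in rank_order(scores) if score < threshold]


ranked = rank_order(pop_scores)
referred = refer_to_collections(pop_scores, COLLECTIONS_THRESHOLD)
print("Rank order:", ranked)
print("Refer to collections:", referred)
```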
Clustering Models
Clustering models are used to group objects into clusters so that objects
in the same cluster are more similar to each other than to objects in other clusters.
These models are useful when there are
competing patterns in data, making it hard to spot any single pattern. They can also be used as a precursor to
developing other models. Data can be segmented into clusters of similar records, thus
reducing complexity within each cluster so that other data mining techniques are more likely to succeed.
Clustering is typically used to segment customer data into groups of
similar customers, based on their purchase patterns, demographics or attitudes.
Customized scorecards can then be built for each of these segments.
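As a rough illustration of how such segments can be formed, the sketch below runs k-means, one widely used clustering method, on a single hypothetical attribute (annual spend). The choice of k-means, the data values, and the starting centroids are assumptions made for the example; the text does not prescribe a particular algorithm, and real segmentations usually use several attributes at once.

```python
# Hypothetical annual spend per customer, invented for illustration.
annual_spend = [120, 150, 130, 900, 950, 1000]


def kmeans_1d(values, centroids, iterations=10):
    """Cluster one-dimensional values with a basic k-means loop.

    Each pass assigns every value to its nearest centroid, then moves
    each centroid to the mean of the values assigned to it.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Starting centroids are fixed here so the example is reproducible.
centroids, clusters = kmeans_1d(annual_spend, centroids=[120, 1000])
for center, members in zip(centroids, clusters):
    print(f"Segment around {center:.0f}: {members}")
```

Each resulting segment could then be handed to a separate model, such as a scorecard built only on the low-spend or the high-spend customers.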
Neural Networks
Neural Networks have their origins in the Machine Learning domain. They are "learning" models that, once trained, can
make predictions very quickly by
quantifying and replicating highly complex, non-linear patterns in the data. They have predominantly been used
in situations that require a real-time or near real-time response in identifying subtle data patterns,
where less emphasis is placed on what the model is doing than on how well it is doing it.
Neural Networks are often used in applications that require a real-time
response to unknown behavioral situations. Credit card fraud models, which detect suspicious,
potentially fraudulent transactions (while minimizing the number of legitimate transactions incorrectly flagged as fraudulent),
are commonly built using this technique.
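The idea of capturing a non-linear pattern can be illustrated with a forward pass through a very small network. This is only a sketch under stated assumptions: the weights are chosen by hand (a real model would learn them from historical data), and the two binary inputs are hypothetical transaction flags, not features from any actual fraud system. The network flags a transaction only when exactly one of the two flags is set, a pattern that no single linear rule on the inputs can reproduce.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(flag_a, flag_b):
    """Forward pass of a 2-2-1 network with hand-set (not learned) weights.

    flag_a and flag_b are hypothetical 0/1 transaction indicators.
    The output is close to 1 when exactly one flag is set.
    """
    # Hidden unit 1 fires when at least one flag is set.
    hidden_any = sigmoid(20 * flag_a + 20 * flag_b - 10)
    # Hidden unit 2 fires unless both flags are set.
    hidden_not_both = sigmoid(-20 * flag_a - 20 * flag_b + 30)
    # Output fires only when both hidden units fire.
    return sigmoid(20 * hidden_any + 20 * hidden_not_both - 30)


for a in (0, 1):
    for b in (0, 1):
        score = predict(a, b)
        label = "suspicious" if score > 0.5 else "ok"
        print(f"flags=({a}, {b}) -> score {score:.4f} ({label})")
```

Scoring a transaction requires only a handful of multiplications and additions, which is why a trained network can respond in real time even though the weights themselves offer little explanation of why a transaction was flagged.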