Programming Assignment 4
In the first programming assignment, we used a network of TLUs to classify
card indices according to rank and suit. For example, given a card described
by the index 13, the network would output the suit class (Spade) and the rank
class (Ace). Another way to classify data is via Bayes rule. One very simple
example of a Bayesian approach to classification is the so-called "Naive Bayes Classifier".
For this programming assignment, we want you to implement a Naive Bayes Classifier
and use it to predict whether an individual is likely to play tennis given
a set of weather conditions. (Data borrowed from Tom Mitchell).
Data
The data for this assignment consists of a set of training instances, each consisting of
values for the following five features:
- Outlook may take one of three values: sunny, overcast, rain
- Temperature may take one of three values: hot, mild, cool
- Humidity may take one of two values: high, normal
- Wind may take one of two values: strong, weak
- Play_Tennis may take one of two values: yes, no
The 14 training instances are organized in this file such
that each line represents one instance. The first item on each line corresponds
to the Outlook feature, the second item corresponds to the
Temperature feature and so on. The last item on each line represents
the Play_Tennis feature, whose value we are regarding as the class label of the instance.
You want to use these instances to form a decision rule which you can use to make predictions of
Play_Tennis for any possible combination of the other four features. So the classes you
want to try to distinguish are those for which someone will play tennis, and those for which
someone will not play tennis. We call the first four features of an instance the input features,
and the last feature the output feature.
Naive Bayes Classifier
Using Bayes rule to reason about a domain is often quite complex. Large tables
of probabilities are often generated as the relationship between the various
domain variables are taken into account. However, the Naive Bayes Classifier
relies on the simplifying assumption that all the input features are conditionally
independent of given the desired output value, which in this case is the value of
Play_Tennis. Instead of performing complex calculations to find the necessary probability
values, Naive Bayes uses only a simple product. (This approach is called "naive" since this
conditional independence assumption is rarely actually satisfied in practice. Nevertheless, this
method can work surprisingly well in some problems).
To predict the class value of an instance, Naive Bayes must first calculate
two sets of probabilities.
- Class Priors The prior probability of each class represents how
frequently each class occurs in the data. For this assignment, there are
two classes depending on the value of Play_Tennis. The prior for
Play_Tennis = yes is estimated by counting the number of training instances with
Play_Tennis = yes, and dividing it by the total number of training instances (14 in this
case). Thus, the prior is a ratio of the number of instances in each class to the total
number of instances. Note that the sum of all class prior probabilities should be 1.
- Input Conditionals The conditional probabilities reflect the
relationship between values of a particular input feature and the output feature. For example,
in the data for this assignment, we would need to calculate the conditional
probability P[Outlook = sunny | Play_Tennis = yes]. This conditional probability
may be estimated by counting the number of instances in which both Outlook = sunny
and Play_Tennis = yes and dividing this by the total number of instances having
Play_Tennis = yes. So from the data given, the estimate of
P[Outlook = sunny | Play_Tennis = yes] is 0.22. Note that the conditionals for a given
input feature and a given value of Play_Tennis will always sum to 1. Thus, P[Outlook =
sunny | Play_Tennis = yes] + P[Outlook = overcast | Play_Tennis = yes] + P[Outlook = rain |
Play_Tennis = yes] = 1.
After the probabilities have been calculated, Naive Bayes can be used to predict
the class of a new instance by calculating which class is most likely given
the input. The likelihood of each class is found by multiplying together the conditional
probabilities corresponding to each input feature and then multiplying by the class prior. For
example, suppose we want to predict the class of an instance with
Outlook = rain, Temperature = cool, Humidity = high, Wind = weak
Then we multiply
P[Outlook = rain | Play_Tennis = yes] * P[Temperature = cool | Play_Tennis = yes] * P[Humidity =
high | Play_Tennis = yes] * P[Wind = weak | Play_Tennis = yes] * P[Play_Tennis = yes] =
Liklihood_yes
and
P[Outlook = rain | Play_Tennis = no] * P[Temperature = cool | Play_Tennis = no] * P[Humidity = high
| Play_Tennis = no] * P[Wind = weak | Play_Tennis = no] * P[ Play_Tennis = no] = Liklihood_no
Now simply choose the outcome whose likelihood has the highest value. Note that the values
Likelihood_yes and Likelihood_no are not probabilities. They may be turned into probabilities
by normalizing. For example, P_yes = Likelihood_yes / (Likelihood_yes + Likelihood_no) and
P_no = Likelihood_Lno / (Likelihood_yes + Likelihood_no).
Assignment
As usual, all materials should be formatted by the a2ps program.
Turn in:
- Your code.
- A printout of the class prior probabilities, P[Play_Tennis = yes] and P[Play_Tennis =
no] calculated from the data given above.
- A printout of the input conditional probabilities calculated from the
data above. For each input feature Outlook, Temperature, Humidity, Wind, you
should calculate the conditional probabilities for each possible feature value and
output value. For example, for the Wind feature, you must calculate
P[Wind = weak | Play_Tennis = yes], P[Wind = strong | Play_Tennis = yes], P[Wind =
weak | Play_Tennis = no], and P[Wind = strong | Play_Tennis = no]. Make sure you label
your output so that I can understand what each probability means!
- After you have calculated all of the prior and conditional probabilities
from the data, use Naive Bayes to predict the value of Play_Tennis from the input
features (the first four features) of each instance. Turn in a printout of the predicted
probability of each class for all of given data. For example, given the instance Outlook
= rain, Temperature = cool, Humidity = high, Wind = weak, print P_yes and P_no. Again, make
sure each probability is labeled!
- Answers the following questions: (1) Look at the probabilities produced by
Naive Bayes and compare the results with the actual class of the instances. How well did Naive
Bayes predict the class of the data? (2) Suppose we had left some of the data
out when we calculated the conditional probabilities. How would this affect
Naive Bayes? Do you see any potential problems?
David J. Stracuzzi
Last modified: Tue May 1 10:56:14 EDT 2001