Project description


Many multivariate analyses require variable selection and ordering. Stepwise procedures provide a step-by-step method through which variables are selected and ordered, usually for discrimination and classification purposes. In discriminant analysis, stepwise procedures retain only the important variables and discard redundant ones (variables that contribute little in the presence of other variables). The aim is to obtain a classification rule with a low error rate. In this work, variables are selected based on Wilks’ lambda (Λ) and the partial F. In forward selection, the variable with the minimum Λ and maximum F enters the model first, followed by the next most important variable. Backward elimination deletes the variable with the smallest F and the largest Λ at each step. SPSS is used to illustrate how stepwise procedures can identify the most important variables to include in the model based on Wilks’ Λ and partial F. The analysis revealed that only X1 (head width at the widest dimension) and X4 (eye-to-top-of-head measurement) are important enough to be included in the discriminant function.




 Discriminant analysis (DA) is a multivariate technique used to classify cases into distinct groups. It separates distinct sets of objects (or observations) and allocates new objects (or observations) to previously defined groups. Discriminant analysis is concerned with the problem of classification, which arises when a researcher, having made a number of measurements on an individual, wishes to classify the individual into one of several categories on the basis of these multivariate measurements (Onyeagu, 2003).

 Discriminant analysis will help us analyze the differences between groups and provide us with a means to assign or classify any case into the groups which it most closely resembles. 

          There are two aspects of discriminant analysis:

1.                 Predictive Discriminant Analysis (PDA) or Classification, which is concerned with classifying objects into one of several groups, and

2.                 Descriptive Discriminant Analysis (DDA), which focuses on revealing major differences among the groups (Stevens 1996).

According to Huberty (1994), descriptive discriminant analysis includes the collection of techniques involving two or more criterion variables and a set of one or more grouping variables, each with two or more levels. “Whereas in predictive discriminant analysis (PDA) the multiple response variables play the role of predictor variables, in descriptive discriminant analysis (DDA) they are viewed as outcome variables and the grouping variable(s) as the explanatory variable(s). That is, the roles of the two types of variables involved in a multivariate multigroup setting in DDA are reversed from the role in PDA.”


A researcher may wish to discard variables that are redundant (in the presence of other variables) when a large number of variables is available for group separation. In discriminant analysis, the variables (say, the y’s) are selected but the basic model does not change, unlike regression, where selecting independent variables alters the model. Stepwise selection is a combination of the forward and backward variable selection methods. In forward selection, the variable entered at each step is the one that maximizes the partial F-statistic based on Wilks’ Λ.

This yields the maximal additional separation of groups above and beyond the separation already attained by the other variables. Because the largest of several partial F’s is chosen at each step, the proportion of these F’s that exceed Fα is greater than α. In backward selection (elimination), the variable that contributes least, as shown by the partial F, is deleted at each step.

 In stepwise selection, variables are entered one at a time, and at each step the variables already entered are re-examined to see whether any of them has become redundant in the presence of recently added variables. The procedure stops when the largest partial F among the variables available for entry fails to exceed a preset threshold value.

 Stepwise discriminant analysis is a form of discriminant analysis. During the selection process no discriminant functions are calculated. Once the subset selection is complete, however, the discriminant function is calculated for the selected variables. These variables can also be used in the construction of classification functions.
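The forward part of the selection described above can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure: it assumes Wilks’ Λ is computed as |E| / |E + H| (within-groups over total sums of squares and cross-products) and that the partial F for a candidate variable, given p variables already entered, is ((1 − Λ_partial) / Λ_partial) × (νE − p) / νH with νE = N − k and νH = k − 1. The function names, the toy data and the F-to-enter value of 4.0 are all hypothetical, and the backward re-examination step is left out.

```python
from itertools import chain

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def sscp(rows, cols):
    """Sums of squares and cross-products about the mean for the chosen columns."""
    means = [sum(r[c] for r in rows) / len(rows) for c in cols]
    return [[sum((r[ci] - mi) * (r[cj] - mj) for r in rows)
             for cj, mj in zip(cols, means)]
            for ci, mi in zip(cols, means)]

def wilks_lambda(groups, cols):
    """Lambda = |within SSCP| / |total SSCP|; defined as 1.0 for an empty set."""
    if not cols:
        return 1.0
    within = [[0.0] * len(cols) for _ in cols]
    for g in groups:
        s = sscp(g, cols)
        within = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(within, s)]
    total = sscp(list(chain.from_iterable(groups)), cols)
    return det(within) / det(total)

def forward_select(groups, n_vars, f_to_enter):
    """Enter, one at a time, the variable with the largest partial F."""
    n = sum(len(g) for g in groups)
    v_h, v_e = len(groups) - 1, n - len(groups)
    selected, history = [], []
    while len(selected) < n_vars:
        base = wilks_lambda(groups, selected)
        best = None
        for v in range(n_vars):
            if v in selected:
                continue
            partial = wilks_lambda(groups, selected + [v]) / base
            f = (1 - partial) / partial * (v_e - len(selected)) / v_h
            if best is None or f > best[1]:
                best = (v, f, partial)
        if best[1] < f_to_enter:   # stop: no candidate exceeds the threshold
            break
        selected.append(best[0])
        history.append(best)       # (variable index, partial F, partial Lambda)
    return selected, history

# Hypothetical toy data: column 0 separates the groups, columns 1 and 2 do not.
groups = [
    [[2.0, 5.0, 1.0], [3.0, 4.0, 3.0], [2.5, 6.0, 2.0], [3.5, 5.5, 2.5], [2.8, 4.5, 1.5]],
    [[6.0, 5.2, 2.2], [7.0, 4.1, 1.4], [6.5, 5.9, 2.9], [7.5, 5.4, 1.8], [6.8, 4.6, 2.4]],
]
selected, history = forward_select(groups, n_vars=3, f_to_enter=4.0)
for var, f, lam in history:
    print(f"entered variable {var}: partial F = {f:.2f}, partial Lambda = {lam:.4f}")
```

At the first step the partial F reduces to the ordinary one-way ANOVA F for each variable, which is why the strongly separating column enters first in this toy example.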


1.                 Construct the discriminant function. 

2.                 Evaluate the discriminant function for population one (1) by substituting the mean values of X1, X2, …, Xp into Y = L1X1 + L2X2 + … + LpXp, and label the value obtained Y1.

3.                 Repeat step 2 for population two (2) and label the value obtained, Y2.

4.                 Since one is usually greater than the other, assume Y2 > Y1

5.                 Compute the critical value as the midpoint of the two values, YC = (Y1 + Y2)/2.


6.                 Then state the discriminating procedure as: assign the new individual to population one (1) if Y < YC and to population two (2) if Y > YC.
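The six steps can be written out as a short sketch. It assumes the critical value is the midpoint of Y1 and Y2, which is the equal-group-size cutoff given in the definition of terms. The coefficients, group means and new observations are hypothetical numbers chosen only for illustration, and a score exactly equal to YC is arbitrarily assigned to the population with the larger mean score.

```python
def discriminant_score(coeffs, x):
    """Y = L1*X1 + L2*X2 + ... + Lp*Xp for one observation."""
    return sum(l * xi for l, xi in zip(coeffs, x))

def classify(coeffs, mean1, mean2, x):
    """Assign x to population 1 or 2 by comparing its score with the cutoff."""
    y1 = discriminant_score(coeffs, mean1)   # step 2
    y2 = discriminant_score(coeffs, mean2)   # step 3
    yc = (y1 + y2) / 2                       # step 5 (midpoint, an assumption)
    y = discriminant_score(coeffs, x)
    # Step 4 assumes Y2 > Y1; this handles either ordering.
    low, high = (1, 2) if y1 < y2 else (2, 1)
    return low if y < yc else high           # a tie goes to `high` (assumption)

coeffs = [0.5, -0.25]            # hypothetical discriminant coefficients
mean1, mean2 = [2.0, 4.0], [6.0, 4.0]
print(classify(coeffs, mean1, mean2, [3.0, 4.0]))  # score 0.5 against cutoff 1.0
print(classify(coeffs, mean1, mean2, [5.0, 2.0]))  # score 2.0 against cutoff 1.0
```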


               Johnson and Wichern (1992) defined two goals of discriminant analysis as:

1.                 To describe either graphically (in at most three dimensions) or algebraically the differential features of objects (or observations) from several known collections (populations). We try to find discriminants such that the collections are separated as much as possible.

2.                 To sort objects (observations) into two or more labeled classes. The emphasis is on deriving a rule that can be used to optimally assign a new object to the labeled classes.

Johnson and Wichern (1992) used the term discrimination to refer to goal 1 and classification or allocation to refer to goal 2.

The goals of discriminant analysis include identifying the relative contribution of the p variables to separation of the groups and finding the optimal plane on which the points can be projected to illustrate the configuration of the groups.


1.                 A geologist might wish to classify fossils into their respective fossil groups on the basis of measurements of the sizes, shapes and ages of the fossils.

2.                 A doctor may intend to classify new born babies into different categories of blood groups, based on measurement obtained from the blood samples of the babies.

3.                 Students applying for admission into a university are given a Common Entrance Examination (CEE). The vector of a student's scores in the entrance examination is a set of measurements, X. The problem is to classify a student on the basis of his or her scores on the entrance examination.

4.                 An automobile Engineer might decide to classify an automobile engine into one of several categories of engine on the basis of measurement of its power output, size and shape.

5.                 A nutritionist might classify food substances into categories of food nutrient such as carbohydrate, minerals, water, protein, fat and oil, and vitamin on the basis of measurements of the comparative amounts of different nutrients in the food.

As we have seen in the examples above, individuals are assigned to groups taking cognizance of data related to the groups.


This study is necessary for the following purposes:

1.                 For classification of cases into groups using the stepwise methodologies of discriminant analysis;

2.                 To identify and discard or remove redundant variables or variables which are little related to group distinction;

3.                 To compare the probabilities of misclassification and the hit ratios obtained with discriminant analysis (all independent variables) to that obtained with stepwise procedures.

1.7   DEFINITION OF TERMS

1.7.1     Discriminant Function

This is a latent variable which is created as a linear combination of discriminating variables, such that 

            Y     =        L1x1 + L2x2 + …..+ Lp xp

where the L’s are the discriminant coefficients, the x’s are the discriminating variables. 

1.7.2     The Eigenvalue

 The eigenvalue of a discriminant function indicates the relative importance of the dimension that classifies cases of the dependent variable. There is one eigenvalue for each discriminant function. With more than one discriminant function, the first eigenvalue will be the largest and the most important in explanatory power, while the last eigenvalue will be the smallest and the least important in explanatory power.

 Relative importance is assessed by eigenvalues since they reflect the percentages of variance explained in the dependent variable, cumulating to 100% for all functions. Eigenvalues are part of the default output in SPSS (Analyze, Classify, Discriminant).

1.7.3     The Discriminant Score    

 This is the value obtained from applying a discriminant function formula to the data for a given case. For standardized data, Z score is the discriminant score. 

1.7.4     Cutoff 

 When group sizes are equal, the cutoff for two-group discriminant analysis is the mean of the two centroids. If the groups are unequal in size, the cutoff is the weighted mean. A case is classed as 0 if its discriminant score is less than or equal to the cutoff, or as 1 if the score is above it.

1.7.5     The Relative Percentage 

 This is equal to the eigenvalue of a function divided by the sum of all eigenvalues of all discriminant functions in the model. It is the percent of discriminating power for the model associated with a particular discriminant function. It tells us how many functions are important. The ratio of eigenvalues indicates the relative discriminating power of the discriminant functions.
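As a small worked illustration of this definition, the sketch below divides each eigenvalue by the sum of all eigenvalues. The function name and the two eigenvalues are hypothetical.

```python
def relative_percentages(eigenvalues):
    """Percent of total discriminating power carried by each function."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]

# Hypothetical eigenvalues for two discriminant functions.
print(relative_percentages([3.0, 1.0]))  # → [75.0, 25.0]
```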

1.7.6     The Canonical Correlation, R* 

This measures the association between the groups formed by the dependent variable and the given discriminant function. A large canonical correlation indicates a high correlation between the discriminant function and the groups. An R* of 1.0 shows that all of the variability in the discriminant scores can be accounted for by that dimension. The relative percentage and R* do not have to be correlated. The canonical correlation, R*, also shows how useful each function is in determining group differences.
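The text does not give a formula for R*. A common way to obtain it, used here as an assumption, is the standard relation between a function's eigenvalue λ and its canonical correlation, R* = sqrt(λ / (1 + λ)). The function name and the eigenvalues are hypothetical.

```python
import math

def canonical_correlation(eigenvalue):
    """R* = sqrt(lambda / (1 + lambda)) for one discriminant function."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))

# Hypothetical eigenvalues: larger eigenvalues give R* closer to 1.
for ev in (0.25, 1.0, 3.0):
    print(ev, round(canonical_correlation(ev), 4))
```

Under this relation R* approaches 1.0 only as the eigenvalue grows without bound, which matches the reading that an R* near 1.0 means the function accounts for almost all variability in the scores.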

1.7.7     Mahalanobis Distances

 This is the distance between a case and the centroid of each group (of the dependent variable) in attribute space (an n-dimensional space defined by n variables). A case has one Mahalanobis distance for each group, and it is classified as belonging to the group for which its Mahalanobis distance is smallest. That is, the closer the case is to a group centroid, the smaller the Mahalanobis distance. Mahalanobis distance is measured in terms of standard deviations from the centroid.
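The classification rule above can be sketched for the two-variable case. The squared distance is computed as D² = (x − m)' S⁻¹ (x − m), where m is a group centroid and S is a covariance matrix assumed to be shared by the groups. The function names, the covariance matrix, the centroids and the case are hypothetical.

```python
def mahalanobis_sq(x, centroid, cov):
    """Squared Mahalanobis distance for two variables, using the 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [xi - mi for xi, mi in zip(x, centroid)]
    return sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))

def nearest_group(x, centroids, cov):
    """Assign the case to the group with the smallest distance."""
    return min(centroids, key=lambda g: mahalanobis_sq(x, centroids[g], cov))

cov = [[2.0, 1.0], [1.0, 2.0]]                   # hypothetical pooled covariance
centroids = {"A": [0.0, 0.0], "B": [4.0, 0.0]}   # hypothetical group centroids
case = [1.0, 2.0]
for g in centroids:
    print(g, round(mahalanobis_sq(case, centroids[g], cov), 3))
print("assigned to", nearest_group(case, centroids, cov))
```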

1.7.8     The Classification Table 

 This is a table in which the rows are the observed categories of the dependent variable and the columns are the predicted categories. With perfect prediction, all cases lie on the diagonal.

1.7.9     Hit Ratio

 This is the percentage of cases on the diagonal of a confusion matrix. It is the percentage of correct classifications. The higher the hit ratio, the lower the error of misclassification; the lower the hit ratio, the higher the error rate.
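A short sketch of this calculation follows, with rows as observed groups and columns as predicted groups. The function name and the counts in the table are hypothetical.

```python
def hit_ratio(table):
    """Percentage of cases on the diagonal of a classification table."""
    correct = sum(table[i][i] for i in range(len(table)))
    total = sum(sum(row) for row in table)
    return 100.0 * correct / total

# Hypothetical two-group table: 18 + 17 of 40 cases are classified correctly.
table = [[18, 2],
         [3, 17]]
print(hit_ratio(table))          # → 87.5
print(100.0 - hit_ratio(table))  # apparent error rate, → 12.5
```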

1.7.10 Tolerance 

This is the proportion of the variation in an independent variable that is not explained by the variables already in the model. Zero tolerance means that the independent variable under consideration is a perfect linear combination of other variables already in the model. A tolerance of 1 implies that the predictor variable is completely independent of the other predictor variables already in the model. Most computer packages set the minimum tolerance at 0.01 as the default option.
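Tolerance can be computed as 1 − R², where R² comes from regressing the candidate variable on the variables already in the model. The sketch below covers only the simplest case, in which one variable is already in the model, so that R² is the squared Pearson correlation. The function name and the data are hypothetical.

```python
def tolerance_one_predictor(candidate, in_model):
    """1 - r^2 between a candidate variable and the single variable in the model."""
    n = len(candidate)
    mc, mm = sum(candidate) / n, sum(in_model) / n
    sxy = sum((c - mc) * (m - mm) for c, m in zip(candidate, in_model))
    sxx = sum((c - mc) ** 2 for c in candidate)
    syy = sum((m - mm) ** 2 for m in in_model)
    return 1.0 - (sxy * sxy) / (sxx * syy)

x1 = [1.0, 2.0, 3.0, 4.0]              # variable already in the model
print(tolerance_one_predictor([2.0, 4.0, 6.0, 8.0], x1))    # exact multiple of x1
print(tolerance_one_predictor([1.0, -1.0, -1.0, 1.0], x1))  # uncorrelated with x1
```

The first candidate is a perfect linear function of x1, so its tolerance is essentially zero and it would be rejected under any minimum-tolerance rule. The second is uncorrelated with x1, so its tolerance is 1.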
