A Web-based Computer Program for Determining Group Classification, page 2 of 4

Berger and Tsujimoto (1983) presented a classification program that takes into account population base rates and allows the researcher to specify the positive utility of a correct classification relative to the negative utility of an incorrect classification. If population base rates differ, then a case is more likely to belong to the group with the larger base rate. If the benefit of correctly identifying a case as belonging to Group 1 is greater than the cost of an incorrect decision, then one should be liberal in classifying cases into Group 1. Similarly, if classifying a Group 2 case into Group 1 is highly costly, then one should be more conservative in classifying cases into Group 1.
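The effect of base rates and utilities on a single decision can be sketched in a few lines of Python. This is a minimal illustration of a standard expected-utility rule, not the applet's actual code: it assumes each group's normal density is weighted by its base rate times the difference between the utility of a correct and an incorrect classification. The dictionary keys and the parameter values are hypothetical.

```python
from statistics import NormalDist

def classify(score, g1, g2):
    """Return 1 or 2: the group with the larger weighted density at `score`.

    Each group is a dict with the (hypothetical) keys
    mean, sd, base_rate, u_correct, u_incorrect.
    """
    def weighted_density(g):
        # What is at stake when a case from this group is classified
        gain = g["u_correct"] - g["u_incorrect"]
        return g["base_rate"] * gain * NormalDist(g["mean"], g["sd"]).pdf(score)

    return 1 if weighted_density(g1) >= weighted_density(g2) else 2

# Illustrative values only: equal base rates and equal utilities
g1 = dict(mean=10, sd=5, base_rate=50, u_correct=1, u_incorrect=0)
g2 = dict(mean=0, sd=5, base_rate=50, u_correct=1, u_incorrect=0)
print(classify(6, g1, g2))         # score 6 is closer to Group 1's mean

# Make it costly to call a Group 2 case "Group 1"
g2_costly = dict(g2, u_incorrect=-4)
print(classify(6, g1, g2_costly))  # the same score is now assigned to Group 2
```

With equal weights the score of 6 goes to Group 1. Once a misclassified Group 2 case carries a large negative utility, the rule becomes more conservative about Group 1 and the same score goes to Group 2.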

Classification in this model is determined by ‘cut scores,’ the points at which the weighted likelihood of a case coming from one group equals that of coming from the other group. The three examples illustrate situations where there are zero, one, or two cut points.
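One way to see why there can be zero, one, or two cut points: setting two weighted normal curves equal and taking logarithms gives a quadratic equation in the score. The sketch below solves that quadratic. It is an illustration under stated assumptions, not the applet's code: the function name is hypothetical, and gain1 and gain2 are assumed utility weights (utility of a correct minus utility of an incorrect classification) that must be positive.

```python
import math

def cut_scores(mean1, sd1, base1, mean2, sd2, base2, gain1=1.0, gain2=1.0):
    """Scores at which the two weighted normal curves cross, in ascending order.

    gain1 and gain2 are hypothetical utility weights and must be positive.
    """
    w1, w2 = base1 * gain1, base2 * gain2
    # log(w1 * f1(x)) - log(w2 * f2(x)) simplifies to a*x**2 + b*x + c
    a = 1 / (2 * sd2**2) - 1 / (2 * sd1**2)
    b = mean1 / sd1**2 - mean2 / sd2**2
    c = (mean2**2 / (2 * sd2**2) - mean1**2 / (2 * sd1**2)
         + math.log(w1 / sd1) - math.log(w2 / sd2))
    if a == 0:                      # equal SDs: the equation is linear
        return [] if b == 0 else [-c / b]
    disc = b * b - 4 * a * c
    if disc <= 0:                   # curves never cross (or only touch)
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# Illustrative parameter values, not taken from the tutorial's examples
print(cut_scores(10, 5, 50, 0, 5, 50))   # equal SDs: one cut score
print(cut_scores(0, 10, 50, 0, 5, 50))   # unequal SDs: two cut scores
```

When the standard deviations are equal the quadratic term vanishes and at most one cut score exists. When they differ, the curves cross either twice or not at all.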

The WISE Classification Applet

The applet is a Java implementation of Berger and Tsujimoto’s (1983) classification program. A downloadable Excel workbook with a comparable classification applet called UTIL is available at Utility 2.1.

Input Parameters

The classification applet allows users to input parameter values based upon either observed or theoretically expected values. The following parameters can be set by the user:

Population means
Population standard deviations
Population base rates
Utility values of each of the four possible outcomes of classification decisions (members of each of two groups may be classified correctly or incorrectly)

Classification Statistics

The program interactively displays the underlying normal distributions and the distributions weighted by base rates and utility of the four possible decision outcomes.
The program also displays the cut scores and decision rule for designating group membership. The decision rule states whether there are zero, one, or two cut points, as well as how the classification groups are defined by the cut point boundaries.
Statistical results include the predicted counts of correct and incorrect classifications for Group 1 and Group 2, the proportion of correct classifications, and the relative utility of classification based on the computed cut points.
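The predicted counts and the proportion of correct classifications follow from the normal cumulative distribution once the decision regions are known. The sketch below is a minimal illustration, not the applet's code. It assumes the caller supplies the score intervals that are classified as Group 1, and the function name, dictionary keys, and parameter values are hypothetical. The relative utility statistic is left out because its formula is not given here.

```python
import math
from statistics import NormalDist

def classification_stats(mean1, sd1, n1, mean2, sd2, n2, g1_intervals):
    """Expected classification counts when scores inside g1_intervals
    (a list of (low, high) pairs) are classified as Group 1."""
    d1, d2 = NormalDist(mean1, sd1), NormalDist(mean2, sd2)
    # Probability that a case from each group lands in the Group 1 region
    p1 = sum(d1.cdf(hi) - d1.cdf(lo) for lo, hi in g1_intervals)
    p2 = sum(d2.cdf(hi) - d2.cdf(lo) for lo, hi in g1_intervals)
    stats = {
        "g1_correct": n1 * p1,          # Group 1 cases called Group 1
        "g1_incorrect": n1 * (1 - p1),  # Group 1 cases called Group 2
        "g2_correct": n2 * (1 - p2),    # Group 2 cases called Group 2
        "g2_incorrect": n2 * p2,        # Group 2 cases called Group 1
    }
    stats["proportion_correct"] = (
        (stats["g1_correct"] + stats["g2_correct"]) / (n1 + n2)
    )
    return stats

# Illustrative case: one cut score at 5, scores above it called Group 1
result = classification_stats(10, 5, 50, 0, 5, 50, [(5, math.inf)])
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

A rule with two cut scores is handled by passing either one bounded interval or two unbounded ones, depending on which side of the cut points is assigned to Group 1.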

Important Assumption

The populations of test scores are assumed to be normally distributed. Transformations may be useful to obtain normal distributions. Accuracy of classification also depends on accuracy of the parameters provided by the user (means, standard deviations, base rates, and the four classification utilities).

Examples

In Example 1, compared to Group 2 (shown in red), Group 1 (shown in blue) has a larger mean (10 vs. 0), greater variability of scores in the population (SD = 10 vs. 5), and a much greater base rate (100:10). The classification applet shows that in this situation, the best choice is to classify all cases as Group 1, which yields 91% diagnostic accuracy (100 of 110 cases are in Group 1, so we expect to be correct 100/110 ≈ .91 of the time).

Example 1. Zero cut-points.
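The Example 1 result can be checked numerically with a simple scan over scores. The sketch below uses the means, standard deviations, and base rates stated for Example 1. It assumes equal utilities for both groups, which the example does not state, so each curve is weighted by its base rate only. The scan range and step size are arbitrary choices.

```python
from statistics import NormalDist

# Example 1 parameters from the text; equal utilities are an assumption
g1 = NormalDist(10, 10)   # Group 1: mean 10, SD 10
g2 = NormalDist(0, 5)     # Group 2: mean 0, SD 5
n1, n2 = 100, 10          # base rates 100:10

# Scan scores from -100 to 100 in steps of 0.1 and collect any score
# where Group 2's weighted density exceeds Group 1's.
scores = [i / 10 for i in range(-1000, 1001)]
group2_wins = [x for x in scores if n2 * g2.pdf(x) > n1 * g1.pdf(x)]

print(len(group2_wins))           # prints 0: no score is assigned to Group 2
print(round(n1 / (n1 + n2), 2))   # prints 0.91: accuracy when all are called Group 1
```

Because Group 1's weighted curve lies above Group 2's at every scanned score, the curves never cross, which is what zero cut points means.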
