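Kyoto University, Graduate School of Informatics, Department of Intelligence Science and Technology, Entrance Examination Held in August 2023, Specialized Subjects S-3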

Author

祭音Myyura

Description

Suppose that samples are given in a 2-dimensional feature space as shown in Table 1. The samples are denoted by , , where represents the transpose of a vector or a matrix. Answer the following questions.

(1) Suppose that a classification of the samples in Table 1 (a) is given by

where and represent two classes. Find the linear classifier function that classifies a new sample into the class whose center is closer to the sample in Euclidean distance than the other class’s center.

(2) Suppose that another classification of the samples in Table 1 (a) is given by

where and represent two classes. Explain how we can compare the discriminability of the classifications given in (1) and (2) by using the within-class variance concerning the distribution of the samples within a class and the between-class variance concerning the distribution of the classes.

(3) For each class given in (1), give the formula for calculating Mahalanobis distance for .

(4) Explain which class given in (1) is suitable for the sample in Table 1 (b). The assumption on the sample distribution required for the discussion must be clearly stated.

Kai

(1)

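Let $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ denote the centers of the first and second class given in (1), respectively. Each center is the mean of the samples assigned to that class:

$$\boldsymbol{\mu}_k = \frac{1}{n_k} \sum_{\boldsymbol{x}_i \in \text{class } k} \boldsymbol{x}_i, \quad k = 1, 2,$$

where $n_k$ is the number of samples in class $k$.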

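Let $\boldsymbol{x}$ be a point whose Euclidean distances from the two class centers $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ are equal, that is, $\|\boldsymbol{x} - \boldsymbol{\mu}_1\|^2 = \|\boldsymbol{x} - \boldsymbol{\mu}_2\|^2$. Then we define a linear classifier as follows:

$$f(\boldsymbol{x}) = \|\boldsymbol{x} - \boldsymbol{\mu}_2\|^2 - \|\boldsymbol{x} - \boldsymbol{\mu}_1\|^2,$$

i.e.,

$$f(\boldsymbol{x}) = 2(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^\top \boldsymbol{x} - \left(\|\boldsymbol{\mu}_1\|^2 - \|\boldsymbol{\mu}_2\|^2\right),$$

so that a new sample $\boldsymbol{x}$ is assigned to the first class when $f(\boldsymbol{x}) > 0$ and to the second class when $f(\boldsymbol{x}) < 0$.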
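The nearest-center rule above can be sketched in a few lines of Python. This is only an illustration: the sample coordinates are hypothetical placeholders rather than the values of Table 1 (a), and the helper names `center`, `linear_classifier`, and `classify` are introduced here for the example.

```python
# Hypothetical 2-D samples; these are placeholders and NOT the exam's Table 1 (a).
class_1 = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
class_2 = [(4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]


def center(samples):
    """Return the mean vector of a list of 2-D points."""
    n = len(samples)
    return (sum(p[0] for p in samples) / n, sum(p[1] for p in samples) / n)


def linear_classifier(mu_1, mu_2):
    """Return (w, b) such that f(x) = w . x + b is positive iff x is closer to mu_1."""
    w = (2 * (mu_1[0] - mu_2[0]), 2 * (mu_1[1] - mu_2[1]))
    b = (mu_2[0] ** 2 + mu_2[1] ** 2) - (mu_1[0] ** 2 + mu_1[1] ** 2)
    return w, b


def classify(x, w, b):
    """Return 1 or 2 according to the sign of f(x); ties go to class 2 in this sketch."""
    f = w[0] * x[0] + w[1] * x[1] + b
    return 1 if f > 0 else 2


mu_1, mu_2 = center(class_1), center(class_2)
w, b = linear_classifier(mu_1, mu_2)
print(classify((2.0, 2.0), w, b))  # prints 1
```

Because the quadratic term of the squared distances cancels, the decision depends on the sample only through a dot product with `w`, which is why the classifier is linear.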

(2)

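Let $\boldsymbol{\mu}$ denote the mean of all the samples, and let $\boldsymbol{\mu}_k$ and $n_k$ denote the center and the number of samples of class $k$, with $n = n_1 + n_2$. The within-class variance $\sigma_W^2$ and the between-class variance $\sigma_B^2$ are

$$\sigma_W^2 = \frac{1}{n} \sum_{k=1}^{2} \sum_{\boldsymbol{x}_i \in \text{class } k} \|\boldsymbol{x}_i - \boldsymbol{\mu}_k\|^2, \qquad \sigma_B^2 = \frac{1}{n} \sum_{k=1}^{2} n_k \|\boldsymbol{\mu}_k - \boldsymbol{\mu}\|^2.$$

Since a larger ratio of between-class variance to within-class variance indicates better discriminability, we compare the ratios $J_1$ and $J_2$ of $\sigma_B^2 / \sigma_W^2$ obtained for the classifications in Q1 and Q2.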

Let denote the center of class . Then we have

Similarly,

Since the ratio for Q1 is larger than the ratio for Q2 ($J_1 > J_2$), the classification in Q1 has better discriminability than the one in Q2.
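As a rough illustration of the comparison, the sketch below computes the ratio of between-class to within-class variance for two different splits of the same point set. The four points and both splits are hypothetical and do not come from Table 1 (a); the function name `variance_ratio` and the $1/n$ normalization are assumptions made for this example.

```python
def mean(points):
    """Mean vector of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def variance_ratio(classes):
    """Between-class variance divided by within-class variance."""
    all_points = [p for c in classes for p in c]
    n = len(all_points)
    mu = mean(all_points)
    within = sum(sq_dist(p, mean(c)) for c in classes for p in c) / n
    between = sum(len(c) * sq_dist(mean(c), mu) for c in classes) / n
    return between / within


# Hypothetical samples (NOT the exam data), split in two different ways.
samples = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
split_a = [[samples[0], samples[1]], [samples[2], samples[3]]]  # left vs right
split_b = [[samples[0], samples[2]], [samples[1], samples[3]]]  # bottom vs top

print(variance_ratio(split_a), variance_ratio(split_b))  # prints 4.0 0.25
```

In this toy case the left/right split separates the points along the axis with the larger spread, so its ratio is higher and it would be judged the more discriminable classification.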

(3)

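For $k = 1, 2$, let $\boldsymbol{\mu}_k$ and $\Sigma_k$ denote the center and the covariance matrix of class $k$ given in (1). Then the Mahalanobis distance of a sample $\boldsymbol{x}$ from class $k$ is

$$d_k(\boldsymbol{x}) = \sqrt{(\boldsymbol{x} - \boldsymbol{\mu}_k)^\top \Sigma_k^{-1} (\boldsymbol{x} - \boldsymbol{\mu}_k)},$$

where $\boldsymbol{\mu}_k$ and $\Sigma_k$ are estimated from the samples of class $k$ in Table 1 (a).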
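A minimal sketch of the Mahalanobis distance computation in two dimensions follows. The samples are hypothetical and are not the exam data; the covariance is estimated with a $1/n$ normalization, which is an assumption of this example, and the helper names are introduced here.

```python
import math


def mean_and_cov(samples):
    """Return the mean vector and the 2x2 covariance matrix (1/n normalization)."""
    n = len(samples)
    mx = sum(p[0] for p in samples) / n
    my = sum(p[1] for p in samples) / n
    sxx = sum((p[0] - mx) ** 2 for p in samples) / n
    syy = sum((p[1] - my) ** 2 for p in samples) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in samples) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T cov^{-1} (x - mu)) using the closed-form 2x2 inverse."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # cov^{-1} = (1 / det) * [[d, -b], [-b, a]]
    q = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


# Hypothetical class samples (NOT the exam's Table 1 (a)).
samples = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
mu, cov = mean_and_cov(samples)
print(mahalanobis((3.0, 0.5), mu, cov))  # prints 2.0
print(mahalanobis((1.0, 1.5), mu, cov))  # prints 2.0
```

The two query points lie at Euclidean distances 2 and 1 from the center, yet their Mahalanobis distances coincide because the class is spread more widely along the first axis.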

(4)

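Assume that the samples of each class follow a multivariate normal distribution whose mean and covariance matrix are those of the class. The probability density function of a multivariate normal distribution decreases monotonically as the Mahalanobis distance from its mean grows, so, treating the class priors and the normalization factors of the two densities as comparable, it is better to classify the sample into the class with the smaller Mahalanobis distance calculated using the method in Q3.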

Following the formula given in Q3, the Mahalanobis distances and for are

Hence class is suitable for .
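The link between the normality assumption and the Mahalanobis rule can be checked numerically. In the sketch below the class means and covariance matrices are hypothetical (they are not estimated from Table 1), and both covariance matrices are chosen with determinant 1 so that the log-density of each class differs only through the squared Mahalanobis distance.

```python
import math


def sq_mahalanobis(x, mu, cov):
    """Squared Mahalanobis distance for a 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    return (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det


def log_density(x, mu, cov):
    """Log of the 2-D normal density N(x; mu, cov)."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    return -0.5 * sq_mahalanobis(x, mu, cov) - 0.5 * math.log(det) - math.log(2 * math.pi)


# Hypothetical class parameters (NOT the exam data): class -> (mean, covariance).
# Both covariances have determinant 1, so only the distance term differs.
params = {
    1: ((0.0, 0.0), ((2.0, 0.0), (0.0, 0.5))),
    2: ((3.0, 0.0), ((0.5, 0.0), (0.0, 2.0))),
}
x = (1.5, 1.0)

by_distance = min(params, key=lambda k: sq_mahalanobis(x, *params[k]))
by_density = max(params, key=lambda k: log_density(x, *params[k]))
print(by_distance, by_density)  # prints 1 1
```

When the determinants of the covariance matrices differ noticeably, the extra log-determinant term in `log_density` can change the ranking, which is why the assumption behind the distance-only rule should be stated.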