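Kyoto University, Graduate School of Informatics, Department of Intelligence Science and Technology, entrance examination held August 2020, Specialized Subject S-2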

Author

祭音Myyura

Description

Q.1

Suppose the number of times an event occurs in one second, , follows the Poisson distribution with . Note that , and .

(1) Choose the probability density function of the Poisson distribution.

  • (a)
  • (b)
  • (c)
  • (d)

(2) Find the minimum integer satisfying , and explain why.

(3) Let be the total number of times this event occurs in seconds. A theorem says that the distribution of approaches a normal distribution as increases. Write the name of this theorem, and find the values of the mean and the variance of the normal distribution approaches.

Q.2

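Let $X_1, \dots, X_n$ be a random sample of size $n$ from the normal distribution with the mean $\mu$ and the variance $\sigma^2$. If $\sigma^2$ is known, the 95% confidence interval of $\mu$ can be computed as follows: "Let $\bar{X}$ be the sample mean.
$\frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ follows the standard normal distribution. By using $z$ that satisfies $P(|Z| \le z) = 0.95$ for random variable $Z$ following the standard normal distribution, the 95% confidence interval of $\mu$ is computed as $\bar{X} - z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z \frac{\sigma}{\sqrt{n}}$."

Explain similarly the procedure to compute the 95% confidence interval of $\mu$ when $\sigma^2$ is unknown.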

Q.3

Specify the errors in the following statistical arguments.

(1) "One thousand participants performed the task A and the task B. We computed the correlation coefficient of the task scores, and found no significant correlation across individuals between the two tasks. This suggests that the human abilities measured by the tasks A and B are independent."

(2) "We found an interesting statistical difference between Kyoto and Tokyo — the experiment in Kyoto showed a statistically significant difference between the test and control conditions, while that in Tokyo did not. In the next study, we will examine the reason of this regional difference."

Q.4

Briefly explain the meanings of the two statistical terms chosen from the following list.

  • Bonferroni correction
  • $\eta^2$ in ANOVA
  • Bootstrap method
  • Statistical power of a test

Kai

Q.1

(1)

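The Poisson distribution is discrete, so its "density" is the probability mass function $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$ for $k = 0, 1, 2, \dots$. Hence the answer is (b).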

(2)

i.e., we need to find the minimum integer that satisfies the above inequality.

Since

01234

we have , hence the minimum is .
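A search of this kind can be sketched in Python. This is only an illustration: it assumes the inequality has the form $P(X \le k) \ge p$, and the rate `lam = 2.0` and threshold `p = 0.95` are placeholder values, not the values from the problem statement. The helper names `poisson_pmf` and `min_k_with_cdf_at_least` are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def min_k_with_cdf_at_least(p: float, lam: float) -> int:
    """Smallest integer k with P(X <= k) >= p (assumes 0 < p < 1)."""
    cdf = 0.0
    k = 0
    while True:
        cdf += poisson_pmf(k, lam)  # accumulate the pmf into the CDF
        if cdf >= p:
            return k
        k += 1


if __name__ == "__main__":
    lam, p = 2.0, 0.95  # placeholder values, NOT taken from the exam
    for k in range(5):
        print(k, round(poisson_pmf(k, lam), 4))
    print("minimum k:", min_k_with_cdf_at_least(p, lam))
```

Tabulating the pmf for small $k$ and accumulating it until the threshold is crossed mirrors the hand calculation.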

(3)

Central limit theorem.

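Note that the mean and the variance of the Poisson distribution are both equal to its parameter $\lambda$. Writing $n$ for the number of seconds, the total count is the sum of $n$ independent one-second counts, so by the central limit theorem the normal distribution it approaches has mean $n\lambda$ and variance $n\lambda$.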
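The quality of the normal approximation can be checked numerically. The sketch below relies on the fact that a sum of $n$ independent Poisson($\lambda$) counts is itself Poisson with mean $n\lambda$, and compares its exact CDF with a normal CDF of matching mean and variance (with a continuity correction). The value `lam = 2.0` is a placeholder, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_cdf(k: int, mu: float) -> float:
    """P(Y <= k) for Y ~ Poisson(mu), using the recurrence pmf(i) = pmf(i-1) * mu / i."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def normal_approx_cdf(k: int, mu: float) -> float:
    """Normal approximation N(mu, mu) to P(Y <= k), with continuity correction."""
    return NormalDist(mu=mu, sigma=math.sqrt(mu)).cdf(k + 0.5)


if __name__ == "__main__":
    lam = 2.0  # placeholder rate, NOT taken from the exam
    for n in (1, 10, 100):
        mu = n * lam
        k = int(mu)
        print(n, poisson_cdf(k, mu), normal_approx_cdf(k, mu))
```

The gap between the two columns shrinks as `n` grows, which is what the theorem predicts.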

Q.2

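Let $\bar{X}$ denote the sample mean and $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ denote the unbiased variance.

Then, $\frac{\bar{X} - \mu}{s / \sqrt{n}}$ follows a $t$ distribution with $n - 1$ degrees of freedom. By using $t$ that satisfies $P(|T| \le t) = 0.95$ for random variable $T$ following the $t$ distribution with $n - 1$ degrees of freedom, the 95% confidence interval is computed as $\bar{X} - t \frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t \frac{s}{\sqrt{n}}$.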
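The procedure can be sketched with the Python standard library alone. Because the standard library has no $t$ quantile function, the sketch obtains one by integrating the $t$ density with Simpson's rule and bisecting; this is an illustrative shortcut, not a production-grade routine, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """CDF via Simpson's rule on [0, |x|] plus symmetry about 0 (steps must be even)."""
    a = abs(x)
    h = a / steps
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = s * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p: float, df: int) -> float:
    """Quantile by bisection; assumes 0.5 < p < 1 and a quantile below 1000."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_confidence_interval(sample, level: float = 0.95):
    """Two-sided confidence interval for the mean when the variance is unknown."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # square root of the unbiased variance (divisor n - 1)
    t = t_quantile(0.5 + level / 2, n - 1)
    half_width = t * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    print(t_confidence_interval([1, 2, 3, 4, 5]))
```

For small samples the interval is noticeably wider than the known-variance interval, because the $t$ quantile exceeds the corresponding normal quantile.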

Q.3

(1)

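Being uncorrelated is a necessary condition for independence, but not a sufficient one: two variables can have zero correlation and still be dependent (for example, $Y = X^2$ with $X$ symmetric about zero). Moreover, failing to find a significant correlation is not evidence that the correlation is zero.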
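A small exact example makes the gap between "uncorrelated" and "independent" concrete. The sketch below uses a toy distribution chosen for illustration (it is not part of the exam): $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$.

```python
from fractions import Fraction

# Toy distribution (an illustrative assumption): X uniform on {-1, 0, 1}, Y = X**2.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(p * x for x in support)            # E[X]
e_y = sum(p * x ** 2 for x in support)       # E[Y] = E[X^2]
e_xy = sum(p * x * x ** 2 for x in support)  # E[XY] = E[X^3]
covariance = e_xy - e_x * e_y

# Y = 0 happens exactly when X = 0, so P(X=0, Y=0) = P(X=0) = P(Y=0) = 1/3.
p_joint = p
p_product = p * p

if __name__ == "__main__":
    print("Cov(X, Y) =", covariance)
    print("P(X=0, Y=0) =", p_joint, "but P(X=0) * P(Y=0) =", p_product)
```

The covariance is exactly zero, yet the joint probability differs from the product of the marginals, so the variables are dependent.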

(2)

The fact that the null hypothesis was rejected in Kyoto but not in Tokyo does not imply a statistically significant difference between the two regions. Instead of testing the difference between the test and control conditions separately in each city, the proper approach is to include region (Kyoto or Tokyo) as a factor alongside condition in a single analysis and test the region-by-condition interaction directly.
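The point can be illustrated with made-up numbers. In the sketch below, the effect estimates and standard errors are hypothetical (they do not come from any real study), the two samples are assumed independent, and a simple two-sided z-test stands in for whatever test the experiments actually used.

```python
import math
from statistics import NormalDist


def two_sided_p(estimate: float, se: float) -> float:
    """Two-sided p-value of a z-test for H0: true effect = 0."""
    z = estimate / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical effect estimates (test minus control) and standard errors.
kyoto_effect, kyoto_se = 2.5, 1.0
tokyo_effect, tokyo_se = 1.5, 1.0

p_kyoto = two_sided_p(kyoto_effect, kyoto_se)
p_tokyo = two_sided_p(tokyo_effect, tokyo_se)

# Direct test of the regional difference (samples assumed independent).
diff = kyoto_effect - tokyo_effect
diff_se = math.sqrt(kyoto_se ** 2 + tokyo_se ** 2)
p_diff = two_sided_p(diff, diff_se)

if __name__ == "__main__":
    print("Kyoto p:", p_kyoto)
    print("Tokyo p:", p_tokyo)
    print("Kyoto vs. Tokyo p:", p_diff)
```

With these numbers one city is significant at the 5% level and the other is not, yet the direct comparison between the cities is far from significant.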

Q.4

Note: the answers are generated by GPT-4o

Bonferroni correction

The Bonferroni correction is a method used to address the problem of multiple comparisons. It adjusts the significance level ($\alpha$) to reduce the risk of Type I errors (false positives). Specifically, if you perform $m$ tests, the corrected significance level for each test is $\alpha / m$, ensuring that the overall (family-wise) Type I error rate is at most $\alpha$.
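As a minimal sketch, the correction amounts to comparing each p-value with the level divided by the number of tests. The function name `bonferroni_reject` and the example p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha: float = 0.05):
    """Return, for each p-value, whether its null hypothesis is rejected
    at per-test level alpha / m, where m is the number of tests."""
    m = len(p_values)
    threshold = alpha / m
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # Hypothetical p-values from four tests; the per-test threshold is 0.05 / 4.
    print(bonferroni_reject([0.001, 0.02, 0.04, 0.3]))
```

Note that p-values of 0.02 and 0.04 would pass an uncorrected 5% test but do not survive the correction here.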

$\eta^2$ in ANOVA

$\eta^2$ (eta squared) is a measure of effect size in ANOVA, representing the proportion of total variance in the dependent variable that is attributed to a specific factor or independent variable. It is calculated as the ratio of the sum of squares for the effect to the total sum of squares, $\eta^2 = SS_{\text{effect}} / SS_{\text{total}}$. Larger values indicate a greater effect.
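For a one-way layout the ratio can be computed directly from the group data. The sketch below assumes a single factor whose levels are given as lists of observations; the function name and the sample data are hypothetical.

```python
def eta_squared(groups) -> float:
    """Eta squared for a one-way design: between-group sum of squares
    divided by the total sum of squares."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


if __name__ == "__main__":
    # Hypothetical data: three groups of three observations each.
    print(eta_squared([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```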

Bootstrap method

The bootstrap method is a resampling technique used to estimate the sampling distribution of a statistic. By repeatedly sampling with replacement from the observed data and recalculating the statistic for each sample, the method provides estimates of measures like confidence intervals or standard errors, even when the theoretical distribution is unknown.
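A percentile bootstrap interval for the mean can be sketched as follows. The percentile interval is only one of several bootstrap interval constructions; the function name, the default of 2000 resamples, and the data are illustrative assumptions.

```python
import random
from statistics import mean


def bootstrap_percentile_ci(data, stat=mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement and recompute the statistic each time.
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return stats[lo_idx], stats[hi_idx]


if __name__ == "__main__":
    # Hypothetical observations.
    data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
    print(bootstrap_percentile_ci(data))
```

Passing a different `stat` (for example `statistics.median`) gives an interval for another statistic without any new theory, which is the main appeal of the method.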

Statistical power of a test

Statistical power is the probability that a test correctly rejects the null hypothesis ($H_0$) when the alternative hypothesis ($H_1$) is true, i.e., one minus the probability of a Type II error ($1 - \beta$). Higher power indicates a greater likelihood of detecting an effect when it exists, and it depends on factors such as sample size, effect size, and significance level ($\alpha$).
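For a concrete case, the power of a one-sided z-test with known standard deviation has a closed form. The sketch below assumes that setting ($H_0$: mean shift $= 0$ against a positive shift `effect`); the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(effect: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test of H0: shift = 0 against a true shift `effect`,
    with known standard deviation sigma and sample size n."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)    # rejection threshold under H0
    shift = effect * math.sqrt(n) / sigma     # mean of the z statistic under H1
    return 1 - std_normal.cdf(z_crit - shift)


if __name__ == "__main__":
    for n in (16, 25, 100):
        print(n, z_test_power(effect=0.5, sigma=1.0, n=n))
```

With zero effect the function returns the significance level itself, and for a fixed positive effect the power rises with the sample size.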