跳到主要内容

京都大学 情報学研究科 知能情報学専攻 2025年8月実施 専門科目 S-3

Author

itsuitsuki

Description (English)

Answer the following questions. All logarithms are base 2 (). For calculations, you must use and . Also, round to two decimal places if necessary.

Q.1 Consider a memoryless stationary source defined by the alphabet , where each symbol occurs independently. Their probabilities of occurrence are .

(1) Calculate the entropy .

(2) Design a binary Huffman code based on the above probabilities. Draw a Huffman tree (code tree for an instantaneous code).

(3) Calculate the average code length of the binary Huffman code in (2).

Q.2 Consider a first-order Markov source defined by the alphabet , where the symbols themselves represent the states, and the following transition probability matrix . For example, if the current symbol is , the probabilities that the next symbol will be are , respectively.

(1) Calculate the stationary distribution of this Markov process. Considering a memoryless source based on the stationary distribution, calculate its entropy .

(2) Calculate the entropy rate (entropy per symbol) of the Markov source .

(3) Compare the magnitudes of and , and explain the reason for their difference.

Q.3 Variable-length blocks are created by concatenating one or more characters from the set . Let the set of five variable-length blocks be an alphabet in which each element represents a single symbol. For this set , consider a memoryless stationary source in which each symbol occurs independently. The probabilities of occurrence for each variable-length block are .

(1) Evaluate the coding efficiency of the source at the character level. First, design a binary Huffman code for , and calculate the average code length per block (). Furthermore, calculate the average code length per character for this code. Here, the average code length per character is defined as , where is the average number of characters in per source symbol of the variable-length block information source. For example, the block consists of two characters in , so its number of characters is 2.

(2) After determining the probability distribution of the characters generated from , consider the average code length of a (binary) Huffman code for a memoryless source that follows this distribution. Discuss the magnitude difference (i.e., which is larger) between the average code length obtained here and the average code length per character found in (1), and explain its cause.