Information Theory
While information can be measured in characters (as when describing the length of an email message) or in digits (as in the length of a phone number), the convention in information theory is to measure information in bits.
A "bit" (the term is a contraction of binary digit) is either a zero or a one. Because there are 8 possible configurations of three bits (000, 001, 010, 011, 100, 101, 110, and 111), we can use three bits to encode any integer from 1 to 8. So when we refer to a "3-bit number", what we mean is an integer in the range 1 through 8.Suppose you flip a coin one million times and write down the sequence of results.
If you want to communicate this sequence to another person, how many bits will it take? If it's a fair coin, the two possible outcomes, heads and tails, occur with equal probability.
Therefore each flip requires 1 bit of information to transmit. To send the entire sequence will require one million bits.
Tossing a biased coin
Suppose the coin is biased so that heads occur only 1/4 of the time, and tails occur 3/4. Then the entire sequence can be sent in 811,300 bits, on average. This would seem to imply that each flip of the coin requires just 0.8113 bits to transmit.
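The figure 0.8113 is the Shannon entropy of a single biased flip, H(p) = -p log2 p - (1-p) log2 (1-p) with p = 1/4. A minimal Python sketch of the calculation (the helper name binary_entropy is ours, for illustration):

    import math

    def binary_entropy(p):
        """Entropy in bits of one flip of a coin with P(heads) = p."""
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

    print(binary_entropy(0.5))   # 1.0 bit per flip for the fair coin
    print(binary_entropy(0.25))  # 0.81127... bits per flip for the biased coin
    print(round(1_000_000 * binary_entropy(0.25)))  # 811278, i.e. ~811,300 bits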
How can you transmit a coin flip in less than one bit, when the only language available is that of zeros and ones? Obviously, you can't. But if the goal is to transmit an entire sequence of flips, and the distribution is biased in some way, then you can use your knowledge of the distribution to select a more efficient code.
Another way to look at it is: a sequence of biased coin flips contains less "information" than a sequence of unbiased flips, so it should take fewer bits to transmit.
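One concrete way to exploit the bias is to encode flips in blocks and give shorter codewords to likelier blocks. A small sketch using a hand-built Huffman code for pairs of flips (the codeword table is our illustration, not taken from the text):

    # P(heads) = 1/4, P(tails) = 3/4; code pairs of flips, shortest
    # codeword to the likeliest pair.
    p = {'H': 0.25, 'T': 0.75}
    code = {'TT': '0', 'TH': '10', 'HT': '110', 'HH': '111'}

    expected_bits_per_pair = sum(
        p[a] * p[b] * len(code[a + b]) for a in 'HT' for b in 'HT'
    )
    print(expected_bits_per_pair / 2)  # 0.84375 bits per flip, below 1

At 0.84375 bits per flip this already beats one bit per flip; coding longer blocks approaches the 0.8113-bit entropy limit.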
Transmitting the head positions
Suppose the coin is very heavily biased, so that the probability of getting heads is only 1/1000, and tails is 999/1000. In a million tosses of this coin we would expect to see only about 1,000 heads. Rather than transmitting the results of each toss, we could just transmit the numbers of the tosses that came up heads; the rest of the tosses can be assumed to be tails.
Each toss has a position in the sequence: a number between 1 and 1,000,000. A number in that range can be encoded using just 20 bits. So, if we transmit 1,000 20-bit numbers, we will have transmitted all the information content of the original one-million-toss sequence, using only around 20,000 bits. Some sequences will contain more than 1,000 heads, and some will contain fewer, so to be perfectly correct we should say that we expect to need 20,000 bits on average to transmit a sequence this way.
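A quick arithmetic check of the 20-bit claim (variable names are illustrative):

    import math

    n_tosses = 1_000_000
    bits_per_position = math.ceil(math.log2(n_tosses))  # 20, since 2**20 = 1,048,576
    expected_heads = n_tosses // 1000                   # about 1,000 heads
    print(bits_per_position, bits_per_position * expected_heads)  # 20 20000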
Even Better Encoding
Encoding the absolute positions of the heads in the sequence takes 20 bits per head, but this allows us to transmit the heads in any order. If we agree to transmit the heads systematically, by going through the sequence from beginning to end, then instead of encoding their absolute positions we can just encode the distance to the next head, which takes fewer bits.
For example, if the first four heads occurred at positions 502, 1609, 2454, and 2607, then their encoding as "distance to the next head" would be 502, 1107, 845, 153. On average, the distance between two heads will be around 1,000 flips; only rarely will the distance exceed 4,000 flips.
Using this more sophisticated encoding convention, a sequence of one million coin tosses containing about 1,000 heads can be transmitted in just 12,000 bits, on average.
Thus a single coin toss takes just 0.012 bits to transmit. Again, this claim only makes sense because we're actually transmitting a whole sequence of tosses.
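The 12,000-bit figure can be checked by simulation. The sketch below draws a million biased flips and encodes each distance with a Rice code (unary quotient plus a 10-bit remainder); Rice coding is one simple stand-in for the "more sophisticated encoding convention" above, not necessarily the scheme the author had in mind:

    import random

    random.seed(0)
    n = 1_000_000
    p_heads = 1 / 1000
    flips = [random.random() < p_heads for _ in range(n)]

    # Distances to the next head, as in the running example above.
    gaps, last = [], 0
    for pos in range(1, n + 1):
        if flips[pos - 1]:
            gaps.append(pos - last)
            last = pos

    def rice_bits(gap, k=10):
        """Bit length of a Rice code: unary quotient, then k remainder bits.
        k = 10 suits gaps that are typically around 1,000."""
        return (gap >> k) + 1 + k

    total_bits = sum(rice_bits(g) for g in gaps)
    print(len(gaps), total_bits)  # about 1,000 heads, about 12,000 bits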
Measuring Information Content – Entropy
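The quantities computed above are instances of Shannon entropy: for a source that emits symbol i with probability p_i, the information content per symbol is H = -Σ p_i log2 p_i bits. The fair coin gives H = 1, the 1/4-biased coin gives H ≈ 0.8113, and the 1/1000-biased coin gives H ≈ 0.0114, consistent with the ~0.012 bits per toss achieved above.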
Secure System Model
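In Shannon's model of a secrecy system, a sender encrypts a plaintext message M under a secret key K to produce a ciphertext C = E_K(M); the attacker intercepts C and tries to infer M or K.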
Perfect Secrecy
- According to Bayes' formula, P(M)P(C|M) = P(C)P(M|C), so P(M|C) = P(M)P(C|M)/P(C)
- P(M) – probability of message M being selected
- P(C|M) – conditional probability of ciphertext C being generated by M (summed over all keys that transform M into C)
- P(C) – probability of obtaining C from any cause
- P(M|C) – probability of message M given that C is intercepted
- If P(M|C) = P(M) for all C and M, then the attacker gains no information by intercepting the ciphertext: "perfect" secrecy is attained, as the sketch below illustrates
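A minimal sketch of this condition, assuming a hypothetical 2-bit message space with a non-uniform prior and a one-time pad (XOR with a uniformly random 2-bit key); the prior and all names are chosen for illustration:

    from collections import Counter
    from fractions import Fraction

    prior = {0b00: Fraction(1, 2), 0b01: Fraction(1, 4),
             0b10: Fraction(1, 8), 0b11: Fraction(1, 8)}
    key_prob = Fraction(1, 4)                 # four equally likely keys

    joint = Counter()                         # P(M = m, C = c) with c = m XOR k
    for m, pm in prior.items():
        for k in range(4):
            joint[(m, m ^ k)] += pm * key_prob

    for c in range(4):                        # every possible ciphertext
        p_c = sum(pr for (_, c2), pr in joint.items() if c2 == c)
        for m, pm in prior.items():
            assert joint[(m, c)] / p_c == pm  # posterior equals prior
    print("P(M|C) = P(M) for every M and C")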
Necessary and Sufficient Condition
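By the Bayes relation above, P(M|C) = P(M) for all M and C holds exactly when P(C|M) = P(C) for all M and C: perfect secrecy is attained if and only if the distribution of the ciphertext does not depend on which message was sent.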
Equivocation
Message equivocation ≤ key equivocation: H(M|C) ≤ H(K|C). Because the key and ciphertext together determine the message by decryption (H(M|K,C) = 0), we have H(M|C) ≤ H(K|C) + H(M|K,C) = H(K|C): the attacker's uncertainty about the message is never larger than the uncertainty about the key.
Measurement of Equivocation
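The key equivocation H(K|C) and message equivocation H(M|C) are conditional entropies, e.g. H(K|C) = -Σ over (c, k) of P(c, k) log2 P(k|c): the attacker's average remaining uncertainty about the key (or message) after intercepting the ciphertext.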
Properties of Equivocation
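Equivocation satisfies 0 ≤ H(M|C) ≤ H(M), with H(M|C) = H(M) exactly when M and C are independent; intercepting more ciphertext can only decrease the equivocation on average, never increase it.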
Mutual Information
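I(M;C) = H(M) - H(M|C): the number of bits of information about the message that the ciphertext reveals, i.e. the reduction in the attacker's uncertainty.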
Perfect Secrecy
• To achieve perfect secrecy, we wish to make C and M statistically independent
• I(M;C) = 0; that is, the cryptanalyst can do no better than guess
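A minimal sketch computing I(M;C) from a joint distribution; the one-time-pad and fixed-key examples are illustrative toy ciphers over 1-bit messages:

    import math

    def mutual_information(joint):
        """I(M;C) in bits from a dict {(m, c): probability}."""
        pm, pc = {}, {}
        for (m, c), pr in joint.items():
            pm[m] = pm.get(m, 0) + pr
            pc[c] = pc.get(c, 0) + pr
        return sum(pr * math.log2(pr / (pm[m] * pc[c]))
                   for (m, c), pr in joint.items() if pr)

    otp = {(m, m ^ k): 0.25 for m in (0, 1) for k in (0, 1)}  # random key
    fixed_key = {(m, m): 0.5 for m in (0, 1)}                 # key always 0
    print(mutual_information(otp))        # 0.0 -> the attacker learns nothing
    print(mutual_information(fixed_key))  # 1.0 -> C determines M completely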
Perfect Secrecy and Key Property
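Shannon showed that perfect secrecy requires H(K) ≥ H(M): the key must contain at least as much uncertainty as the message, which is why the one-time pad needs a fresh random key as long as the message itself.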
Key Entropy and Mutual Information
Since H(M|C) ≤ H(K|C) ≤ H(K), the ciphertext leaks at least I(M;C) = H(M) - H(M|C) ≥ H(M) - H(K) bits; if H(K) is small, the cryptanalyst can obtain a lot of information about the plaintext
Redundancy
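The redundancy of a source is D = R - r, where R = log2 of the alphabet size is the absolute rate and r is the true rate (entropy per symbol). For English, R = log2 26 ≈ 4.7 bits per letter while estimates put r at roughly 1 to 1.5 bits per letter, so most of each letter is redundant; it is this redundancy that ciphertext-only attacks exploit.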
Index of Coincidence (IC)
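The index of coincidence is the probability that two letters drawn at random from a text are equal: IC = Σ f_i(f_i - 1) / (N(N - 1)), where f_i is the frequency of letter i and N is the number of letters. English plaintext gives IC ≈ 0.066, while uniformly random letters give about 1/26 ≈ 0.038, so the IC distinguishes ciphers that preserve letter frequencies (such as simple substitution) from those that flatten them. A minimal Python sketch (the function name is illustrative):

    from collections import Counter

    def index_of_coincidence(text):
        """IC = sum of f*(f-1) over letter frequencies, divided by N*(N-1)."""
        letters = [ch for ch in text.upper() if ch.isalpha()]
        n = len(letters)
        counts = Counter(letters)
        return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))

    # Short samples are noisy; real texts of a few hundred letters
    # give roughly 0.066 for English.
    print(index_of_coincidence("the quick brown fox jumps over the lazy dog"))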