"This notebook has no pages left."
Yuki stared at the last page of the thick notebook.
"Just buy a new one?" Riku said.
"But there should still be space to write..."
Aoi became interested. "Are you thinking about compression?"
"Compression?"
"Source coding. Methods for efficiently representing data."
Riku peeked over. "Yuki's notebook has lots of 'desu' and 'masu.'"
"Because it's polite language."
Aoi flipped through the notebook. "If you write frequent words shorter, you can save space."
"But won't it become unreadable?"
"That's where code design becomes important. For example, Huffman coding."
Yuki asked, "Huffman?"
Aoi started drawing diagrams on the whiteboard.
"Check character frequency and assign short codes to high-frequency characters, long codes to low-frequency ones."
"I see."
"What's important is being prefix-free. No code is a prefix of another code."
Riku looked confused. "What does that mean?"
"For example, if you encode A as '0' and B as '01', then after reading a '0' you can't tell whether it's a complete A or just the start of B."
"True."
"So you design codes like 'A=0, B=10, C=11'. The moment you read a 0, you know it's an A."
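Aoi's two ideas, frequency-based code lengths and the prefix-free property, are exactly what Huffman's algorithm produces. A minimal Python sketch (not from the story; the function name and structure are my own):

```python
# Minimal Huffman code construction: frequent symbols get short codes,
# and the result is prefix-free by construction.
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a {symbol: bitstring} prefix-free code for the given text."""
    freq = Counter(text)
    if len(freq) == 1:
        return {ch: "0" for ch in freq}  # single-symbol edge case
    # Heap entries: (frequency, tiebreaker, partial code table).
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        # Prepend a bit to each side's codes, then merge the tables.
        merged = {ch: "0" + code for ch, code in c1.items()}
        merged.update({ch: "1" + code for ch, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))
        i += 1
    return heap[0][2]

codes = huffman_codes("aaaabbc")
# The most frequent symbol gets the shortest code.
assert len(codes["a"]) <= len(codes["b"]) <= len(codes["c"])
```

Because codes are built by descending a binary tree whose leaves are the symbols, no code can be a prefix of another, which is the property Aoi describes.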
Yuki started calculating. "The most common word in my notebook is 'information.' If I write this shorter..."
"Good idea. But you'll need a decoding dictionary."
Riku had an idea. "Then write the dictionary on the first page?"
"Yes. That's header information. It's included in compressed files too."
Yuki calculated concretely. "'Information' appears 100 times. Five characters each, so 500 characters. Writing it as 'J' saves 400 characters."
"But," Aoi pointed out, "you need to write 'J=information' in the dictionary. Consider that cost too."
"A tradeoff."
"Yes. With small data, compression overhead becomes relatively large. This is a compression limit."
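Yuki's calculation, including Aoi's dictionary cost, fits in a few lines of Python (the numbers come from the story; the dictionary-entry format "J=word" is an illustrative assumption):

```python
# Net savings from abbreviating a frequent word, minus the header cost.
occurrences = 100   # how often the word appears in the notebook
word_len = 5        # characters per occurrence (per the story)
abbrev_len = 1      # the abbreviation "J"
dict_entry = len("J=") + word_len  # hypothetical dictionary line, e.g. "J=<word>"

saved = occurrences * (word_len - abbrev_len) - dict_entry
print(saved)  # 400 characters saved, minus the 7-character dictionary entry: 393
```

Note how the fixed dictionary cost matters little at 100 occurrences but would wipe out the gain if the word appeared only once or twice, which is the small-data limit Aoi mentions.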
Riku made another suggestion. "How about symbols instead of abbreviations? Like 😊?"
Aoi said, intrigued, "Emojis are also a kind of encoding. Expressing emotions briefly."
"But," Yuki thought, "emojis are context-dependent, right? The same 😊 can have different meanings."
"Sharp. That's the problem of conditional entropy. If you know the previous context, the required information decreases."
Aoi wrote an equation in the notebook. "H(X) ≤ L̄ < H(X) + 1. The average code length L̄ can never be shorter than the entropy H(X), and an optimal code gets within one bit of it."
"That's Shannon's source coding theorem."
Yuki was moved. "So entropy determines the compression limit."
"Precisely, conditional entropy. With context, you can compress further."
Riku looked at the notebook. "Yuki's writing has patterns, so it's compressible."
"But," Yuki said, "I don't want to omit important things. I want to preserve information content."
Aoi nodded. "That's lossless compression. Fully recoverable. Images and audio, on the other hand, use lossy compression: discarding some information in exchange for much higher compression ratios."
"Choosing which information to discard is also a technique."
"Yes. Preserve perceptually important information, remove unnecessary parts."
Yuki opened a new notebook. "In the next notebook, I'll try encoding frequent words."
"Interesting experiment," Aoi acknowledged.
Riku laughed. "But you might read it a year later and not understand."
"That's losing the decoding key," Aoi said.
"Then I'll carefully preserve the dictionary," Yuki smiled.
The three gazed at the notebook filled with messages.
Information theory is the technology of conveying maximum information with limited resources.
It applies to notebook pages and to life.