"KL-senpai is shaking her head again," Riku whispered. At the Information Theory Club presentation, the senior wore a dissatisfied expression every time someone explained something.
"She dislikes disappointment," Aoi answered.
Yuki became interested. "Disappointment?"
"More precisely, she dislikes large KL divergence. Everyone calls her KL-senpai rather than her real name."
The presentation ended, and KL-senpai approached them.
"Your hypothesis has too much deviation from the actual measured data."
Riku echoed her. "Deviation?"
KL-senpai opened her notebook. "Kullback-Leibler divergence. A metric that measures how different two probability distributions are."
Aoi added, "It measures the gap between the actual distribution P and the predicted distribution Q."
KL-senpai wrote an equation.
"D_KL(P||Q) = Σ P(x) log(P(x)/Q(x))"
"The larger this is, the further expectation is from reality."
Yuki tried to understand. "So it means the prediction is off?"
"Exactly. It's an important concept in machine learning too. The goal is to minimize the KL divergence between the model's predicted distribution and the true distribution."
Riku asked for a concrete example. "Like what?"
KL-senpai continued explaining. "Consider weather forecasting. The meteorological model predicts 'tomorrow will be 80% sunny, 20% rain.' But actually it was 30% sunny, 70% rain."
"Way off."
"Computing the KL divergence puts a number on exactly how bad that prediction was."
Aoi started calculating in her notebook. "With P=(0.3, 0.7) and Q=(0.8, 0.2)..."
"0.3×log₂(0.3/0.8) + 0.7×log₂(0.7/0.2) ≈ 0.84 bits"
KL-senpai nodded. "This is the informational loss. If you could predict perfectly, KL divergence would be zero."
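Aoi's arithmetic can be checked with a short Python sketch, using the distributions from the story (log base 2 gives the answer in bits):

```python
import math

def kl_divergence(p, q):
    """D_KL(P||Q) in bits: how much distribution Q diverges
    from the reference distribution P."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = (0.3, 0.7)  # actual weather frequencies (sunny, rain)
q = (0.8, 0.2)  # the model's forecast

print(round(kl_divergence(p, q), 2))  # 0.84
```

A perfect forecast (q equal to p) would make every log term zero, giving the zero divergence KL-senpai describes.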
Yuki asked. "But perfect prediction is impossible, right?"
"That's why continuous improvement is necessary. With Bayesian updating, you modify the model every time new data arrives."
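The Bayesian updating KL-senpai mentions can be sketched in a few lines. This is a hypothetical illustration, not part of the story: a Beta prior over the rain probability, updated one observation at a time.

```python
# A minimal sketch of Bayesian updating for a rain probability,
# using a Beta(alpha, beta) prior over the Bernoulli parameter.
# The numbers are hypothetical, chosen to echo the weather example.

def beta_update(alpha, beta, rained):
    """One Bayesian update step: fold the new observation into the counts."""
    return (alpha + 1, beta) if rained else (alpha, beta + 1)

alpha, beta = 2, 8  # prior mean rain probability 0.2 (the old forecast)
observations = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]  # 7 rainy days out of 10

for rained in observations:
    alpha, beta = beta_update(alpha, beta, rained)

print(alpha / (alpha + beta))  # 0.45
```

Each rainy day drags the posterior mean from the prior's 0.2 toward the observed 0.7: the model's expectation moves closer to reality with every data point.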
Riku remembered. "On my last test, I also had large divergence between expectation and reality."
"It's a concept that applies to life, too," KL-senpai said with a rare smile. "Either bring your expectations closer to reality, or bring reality closer to your expectations."
Aoi continued. "What's interesting is that KL divergence is asymmetric. D_KL(P||Q) and D_KL(Q||P) are different."
"Order matters?"
"Yes. The value depends on which distribution you take as the reference. D_KL(P||Q) weights every difference by P, so it reads as 'how poorly Q explains data that actually follows P.'"
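The asymmetry Aoi describes is easy to see numerically with the same weather distributions, swapping the argument order:

```python
import math

def kl_divergence(p, q):
    """D_KL(P||Q) in bits; note the arguments are not interchangeable."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p, q = (0.3, 0.7), (0.8, 0.2)

print(round(kl_divergence(p, q), 3))  # 0.841
print(round(kl_divergence(q, p), 3))  # 0.771 -- a different value
```

Because the two directions disagree, KL divergence is not a true distance metric, which is why the story speaks of it as a gap seen "from P's perspective."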
KL-senpai added explanation. "It's also deeply related to cross-entropy. There's a relationship H(P,Q) = H(P) + D_KL(P||Q)."
"Cross-entropy?"
"The average code length when data following distribution P is encoded with a code optimized for distribution Q. It's often used as a loss function in machine learning."
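The identity KL-senpai wrote, H(P,Q) = H(P) + D_KL(P||Q), can be verified directly on the weather distributions from earlier:

```python
import math

def entropy(p):
    """H(P) in bits: average code length under the optimal code for P."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(P,Q) in bits: average code length when P-distributed data
    is encoded with a code optimized for Q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p, q = (0.3, 0.7), (0.8, 0.2)

# H(P,Q) = H(P) + D_KL(P||Q): both sides agree
print(round(cross_entropy(p, q), 3))               # 1.722
print(round(entropy(p) + kl_divergence(p, q), 3))  # 1.722
```

Since H(P) is fixed by the data, minimizing cross-entropy in training is the same as minimizing the KL divergence, which is why the two losses are used interchangeably.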
Yuki summarized. "So KL divergence is a tool to measure the gap between expectation and reality."
"Correct," KL-senpai acknowledged. "And minimizing that gap is the learning process."
Riku laughed. "Senpai, you dislike disappointment because you want to minimize KL divergence?"
"Humans are a type of predictor. We model the world, predict, and correct errors. Disappointment is an opportunity to improve the model."
Aoi nodded deeply. "Disappointment is information."
"Exactly. The greater the surprise, the stronger the learning signal. So I dislike disappointment but also think it's necessary."
The presentation resumed. As KL-senpai returned to her seat, she said,
"Next time, I expect a better model."
Yuki whispered. "She's actually a kind person."
"Probably good at updating expected values," Aoi answered.
The three watched the next presenter. There's always a gap between expectation and reality. But that gap itself is the driving force of growth. KL divergence is the ruler that measures that gap.