Graveyard

I wrote all kinds of additional text and widgets that didn't quite fit the main story; hence this bonus content. If you like the mini-course so far, or some of the widgets dumped below catch your interest, check them out!

Bonus chapters

I wrote three chapters that show some nice stuff connected to KL divergence & entropy.

Kolmogorov Complexity

The Mandelbrot set looks incredibly complex, yet it's generated by a very short formula.

[Interactive widget: Mandelbrot set explorer (drag, scroll to zoom, double-click). The picture is generated by iterating z₀ = 0, zₙ₊₁ = zₙ² + c, where c belongs to the set if |zₙ| remains bounded; here capped at 50 iterations, with initial viewport [-2.5, 1.5] × [-1.5, 1.5].]
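To see just how short, here's a complete sketch that prints the set as ASCII art. The grid size and the 50-iteration cap mirror the widget but are otherwise arbitrary choices; the point is that the whole infinitely detailed picture fits in a dozen lines:

```python
# A complete Mandelbrot renderer: the infinitely intricate picture
# is produced by this short program, which is the sense in which
# its Kolmogorov complexity is tiny.
WIDTH, HEIGHT, MAX_ITER = 80, 24, 50   # illustrative choices

for row in range(HEIGHT):
    line = ""
    for col in range(WIDTH):
        # Map the pixel into the viewport [-2.5, 1.5] x [-1.5, 1.5].
        c = complex(-2.5 + 4.0 * col / WIDTH, -1.5 + 3.0 * row / HEIGHT)
        z = 0
        for _ in range(MAX_ITER):
            z = z * z + c
            if abs(z) > 2:   # once |z| > 2, the orbit provably escapes
                break
        line += "#" if abs(z) <= 2 else " "
    print(line)
```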

Kolmogorov complexity gives us the language to talk about all this, and much more.

For example, it tells us a lot about pseudorandomness.

[Interactive widget: coin-flip randomness tester. Click coins (or type text) to build a sequence, then see whether it passes statistical tests for randomness.]
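A practical proxy for Kolmogorov complexity is a general-purpose compressor: a sequence that deserves to be called random should be nearly incompressible. Here's a rough sketch of that test, with zlib standing in for the ideal compressor (the bit-packing and sequence lengths are just illustrative choices):

```python
import random
import zlib

def compression_ratio(bits: str) -> float:
    """Pack a 0/1 string into bytes, compress, and return
    compressed size / packed size. Close to 1.0 means 'incompressible'."""
    packed = int(bits, 2).to_bytes((len(bits) + 7) // 8, "big")
    return len(zlib.compress(packed, level=9)) / len(packed)

# A periodic sequence has a short description, so it compresses massively...
print(compression_ratio("01" * 4000))    # far below 1
# ...while the output of a decent RNG is essentially incompressible.
print(compression_ratio("".join(random.choice("01") for _ in range(8000))))
```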

It's also the ultimate answer to the question of how similar two files are.

[Interactive widget: three-categories file-similarity analysis.]
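One way to cash this out in runnable code is the normalized compression distance, which swaps the uncomputable Kolmogorov complexity for an ordinary compressor (zlib here; any compressor will do):

```python
import zlib

def C(data: bytes) -> int:
    """Compressed length: a computable upper bound on Kolmogorov complexity."""
    return len(zlib.compress(data, level=9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for very similar files,
    near 1 for unrelated ones."""
    cx, cy = C(x), C(y)
    return (C(x + y) - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog. " * 20
b = b"the quick brown fox leaps over the lazy cat. " * 20
c = bytes(range(256)) * 4
print(ncd(a, b))  # noticeably smaller: the texts share most of their structure
print(ncd(a, c))  # close to 1: nothing in common for the compressor to exploit
```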

More in the Kolmogorov chapter.

Multiplicative Weights Update

Here's a riddle that's behind a ton of recent algorithms in CS and ML. Say you wanna get rich trading stocks. Lucky you: $n$ investors share their tips every day. Each day $t$, they give advice, and afterward you find out how they did. For investor $i$, you get $g_i^{(t)}$ - how many dollars she gained that day.

Your strategy: Start with equal trust in everyone: $p_i^{(0)} = 1/n$ for all investors. Each morning, randomly pick an investor based on your trust distribution and follow their advice. After seeing how everyone did, update your distribution to $p^{(t+1)}$.

How should you update? What's best?

Try it yourself with the widget below:

[Interactive widget: Multiplicative Weights Update in Action. Select algorithms and choose a scenario to compare how they fare.]
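Here's the standard answer as a minimal sketch: multiply each investor's trust by an exponential in her gain, then renormalize. The learning rate eps = 0.1 and the toy data are illustrative choices:

```python
import numpy as np

def multiplicative_weights(gains, eps=0.1):
    """gains[t, i] = dollars investor i gained on day t.
    Returns the trust distribution after all days."""
    T, n = gains.shape
    p = np.full(n, 1.0 / n)              # equal trust in everyone at the start
    for t in range(T):
        p = p * np.exp(eps * gains[t])   # reward good advice exponentially
        p /= p.sum()                     # renormalize to a distribution
    return p

rng = np.random.default_rng(0)
gains = rng.normal(size=(100, 5))        # 5 investors, 100 days of pure luck...
gains[:, 2] += 0.5                       # ...except investor 2 has real skill
print(multiplicative_weights(gains).round(3))  # trust piles onto investor 2
```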

Also, here's how we can use it in algorithms, for the noisy binary search problem.

[Interactive widget: noisy binary search over 15 positions, each starting at a uniform 6.7% belief, with a counter tracking the number of steps taken.]
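And here's a sketch of the flavor of algorithm the widget runs, under the assumption that each answer is correct independently with probability 2/3 (the widget's actual noise model may differ): keep a belief distribution over positions, query the median, and multiplicatively reweight whichever half the possibly-lying answer points to.

```python
import numpy as np

def noisy_binary_search(n=15, secret=11, p_correct=2/3, steps=100, seed=0):
    """Find `secret` in {1..n} when every comparison lies with prob 1 - p_correct."""
    rng = np.random.default_rng(seed)
    w = np.full(n, 1.0 / n)          # uniform belief over positions, as in the widget
    positions = np.arange(1, n + 1)
    for _ in range(steps):
        # Ask about the median of the current belief: "is the secret > m?"
        m = positions[np.searchsorted(np.cumsum(w), 0.5)]
        truth = secret > m
        answer = truth if rng.random() < p_correct else not truth
        # Multiplicative weights: scale each half by how consistent it is
        # with the (noisy) answer, then renormalize.
        agrees = (positions > m) == answer
        w *= np.where(agrees, p_correct, 1 - p_correct)
        w /= w.sum()
    return positions[np.argmax(w)]   # most probable position

print(noisy_binary_search())         # recovers 11 with high probability
```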

If you got interested, continue to the multiplicative weights chapter.

Fisher Information

A typical US election poll asks 1000-2000 random people. Do the math and you'll find such polls are usually within 2-3% of the truth. Pretty wild: it doesn't even matter how many people live in the country, 1000 random ones get you within a few percent!

But here's the thing: US elections are super close. We already know both parties will get around 50%. So maybe we should poll more people (or combine polls) to get within 0.1%. How many people would that take?
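Back-of-the-envelope, assuming a plain 95% normal-approximation interval: the margin of error is about 1.96·√(p(1-p)/n), so it shrinks only like 1/√n and the required sample size grows quadratically.

```python
from math import ceil

def sample_size(margin, p=0.5, z=1.96):
    """People needed so a 95% confidence interval is +/- `margin`."""
    return ceil(z * z * p * (1 - p) / margin ** 2)

print(sample_size(0.03))    # 1068: the usual poll size
print(sample_size(0.001))   # 960400: a 30x smaller margin costs 900x the people
```

That inverse-quadratic law is the kind of statement the Fisher information chapter makes precise.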

Explore the relationship in the widget below.

[Interactive widget: polling error calculator. Choose a target margin (0.1%-10%) and a scenario (45/55, 50/50, 55/45) to see the required sample size, estimated cost, and share of US voters needed; e.g., 1K people at roughly $2K, under 0.001% of voters.]

More in the Fisher information chapter.

Bonus random content

More applications of entropy and KL divergence, cut from the third chapter.

Chain rule & uniqueness

[Interactive widget: Mutual Information Explorer. Drag bars to adjust the probabilities of the joint distribution P(Weather, Transport). The configuration shown is independent:]

                🚶‍♀️      🚲      🚌    P(Weather)
  ☀️           14.0%   21.0%   35.0%     70.0%
  ☁️            6.0%    9.0%   15.0%     30.0%
  P(Transport) 20.0%   30.0%   50.0%

I(Weather; Transport) = 0.0000 bits
I(X;Y) = Σ P(x,y) log₂ [P(x,y) / (P(x)P(y))]
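To double-check the widget's arithmetic, here's a short sketch that computes I(X;Y) directly from the joint table above:

```python
import numpy as np

def mutual_information(joint):
    """I(X;Y) in bits from a joint probability table P(x, y)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)   # marginal over rows
    py = joint.sum(axis=0, keepdims=True)   # marginal over columns
    mask = joint > 0                        # 0 log 0 = 0 by convention
    return float((joint[mask] * np.log2(joint[mask] / (px * py)[mask])).sum())

# Weather (rows: sunny, cloudy) x Transport (columns: walk, bike, bus)
joint = [[0.14, 0.21, 0.35],
         [0.06, 0.09, 0.15]]
print(mutual_information(joint))  # ~0.0: the table factorizes as P(x)P(y)
```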
Conditional entropy and mutual information

More max entropy distributions, cut from the max-entropy chapter.

Beta Distribution

Exponential Family