Earth-scale log n vs Cosmological log n - Part 2 of Binary Tree of the Universe

In the previous post, we worked through the enormous power of logarithmic functions for reducing search spaces. For our Binary Tree of the Universe, it would take us only 266 steps to locate any atom in the observable universe. Tree data structures with wider fan-outs, such as B-trees, require even fewer steps.
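
To make the fan-out remark concrete, here is a minimal sketch that counts how many levels an idealised, completely full tree needs to cover 10^80 items. The function name `tree_height` and the chosen fan-outs are my own illustrative assumptions. Real B-trees keep nodes only partly full and still compare keys inside each node, so treat the numbers as a lower bound on node visits rather than a benchmark.

```python
def tree_height(n: int, fan_out: int) -> int:
    """Smallest h with fan_out**h >= n, using exact integer arithmetic.

    Hypothetical helper: models a perfectly full tree where each level
    multiplies the number of reachable items by `fan_out`.
    """
    if n < 1 or fan_out < 2:
        raise ValueError("need n >= 1 and fan_out >= 2")
    height, capacity = 0, 1
    while capacity < n:
        capacity *= fan_out
        height += 1
    return height


ATOMS_IN_UNIVERSE = 10**80  # estimate used throughout this series

for fan_out in (2, 16, 256, 1024):
    levels = tree_height(ATOMS_IN_UNIVERSE, fan_out)
    print(f"fan-out {fan_out:>4}: {levels} levels")
```

With a fan-out of 2 this reproduces the 266 steps of the binary tree; widening to 256 children per node brings it down to 34 levels.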

I find this kind of miraculous. But that is the most extreme case, cosmic in scope. While searching for individual atoms in the universe may become relevant for humans millennia from now, today we grapple with more Earthly confines. Let’s bring this down to Earth (huhu):

The Earth has an estimated 10^50 atoms. Converting that to a power of 2:

log₂(10) = ln 10 / ln 2 ≈ 3.322
10^50 = 2^(50 × 3.322) ≈ 2^166.1
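
The same conversion can be checked in a few lines of Python. This is just a sketch of the arithmetic above; the helper name `pow10_to_pow2` is made up for illustration.

```python
import math


def pow10_to_pow2(exponent10: float) -> float:
    """Return x such that 10**exponent10 == 2**x (hypothetical helper)."""
    return exponent10 * math.log2(10)


print(round(math.log2(10), 3))      # 3.322
print(round(pow10_to_pow2(50), 1))  # 166.1

# Exact integer sanity check: 10^50 sits between 2^166 and 2^167.
print(2**166 < 10**50 < 2**167)     # True
```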

Rounding 2^166.1 to 2^166, our Binary Tree of the Universe would handle searching every atom on Earth in about 166 steps. A mildly pleasing coincidence: a factor of 2^100 separates Earth from the observable universe.

Perhaps every atom on Earth is still too ambitious. Let’s further ground this within the context of humanity:

  • 1 million - typical token context window for LLMs today: ~2^19.93 ≈ 20 steps
  • 1 billion - favorite valuation goal of startups: ~2^29.90 ≈ 30 steps
  • 8 billion - all humans alive today: ~2^32.90 ≈ 33 steps
  • 100 billion - all humans that have ever lived: ~2^36.54 ≈ 37 steps
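
For anyone who wants to reproduce the list, here is a small sketch. The dictionary `SCALES` and the helper `search_steps` are illustrative names of mine; the counts are the rough estimates quoted above. The step count is rounded up, since a fractional step still costs a whole comparison.

```python
import math

# Rough counts quoted in the list above.
SCALES = {
    "LLM context window (tokens)": 10**6,
    "Unicorn valuation (dollars)": 10**9,
    "Humans alive today": 8 * 10**9,
    "Humans who have ever lived": 10**11,
}


def search_steps(n: int) -> int:
    """ceil(log2(n)) for n >= 1, computed exactly via integer bit length."""
    return (n - 1).bit_length()


for label, n in SCALES.items():
    print(f"{label}: ~2^{math.log2(n):.2f} -> {search_steps(n)} steps")
```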

Just 37 steps to reach any person who has ever lived. That is only a little over a quarter of the 135 Spanish Steps!

But one could say, that’s only the people! What about all the content and information they’re creating? If we indexed all the internet data ever created, currently estimated at ~100 zettabytes (10^23 bytes, or ~2^76.4 bytes), our search for an individual byte would still require only 77 steps! We can then define Earth-scale log n as 166 and the current Human-scale log n as 77. It'd be nice to have some breathing room for human data's future growth, so let's make Human-scale log n a nice round 80.
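
To see the 77 in action, here is a hedged sketch that runs a binary search over byte offsets 0 to 10^23 - 1 without storing anything, and simply counts the halvings. It assumes an idealised, perfectly sorted index; the function `binary_search_steps` is hypothetical and says nothing about how any real system lays out its data.

```python
def binary_search_steps(n: int, target: int) -> int:
    """Count the halvings needed to pin down `target` within range(n)."""
    if not 0 <= target < n:
        raise ValueError("target must lie in range(n)")
    lo, hi = 0, n  # invariant: lo <= target < hi
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if target < mid:
            hi = mid  # target is in the lower half
        else:
            lo = mid  # target is in the upper half
        steps += 1
    return steps


INTERNET_BYTES = 100 * 10**21  # ~100 zettabytes, the estimate quoted above

for offset in (0, INTERNET_BYTES // 3, INTERNET_BYTES - 1):
    print(offset, binary_search_steps(INTERNET_BYTES, offset))
```

Whichever offset you pick, the count never exceeds 77, because each pass through the loop discards half of the remaining candidates.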

This explains part of the magic of YouTube, Meta, TikTok, and the rest of the social media players having reasonable access times for their massive info and content libraries. It’s possible to reasonably store, index, and retrieve social media content for every human alive. After all, all human data ever created is only a little more than halfway up the Spanish Steps.

As a final exercise to give us a sense of what logarithmically sits between 1 and 2^266 (all the atoms in the observable universe), I'll compile it all into two handy reference tables, starting with Human-scale:

Scale | Count | log₂(n), rounded up | Spanish Steps
LLM context window (1M tokens) | ~10^6 | 20 | 15% up the stairs
Unicorn valuation ($1B) | ~10^9 | 30 | 22% up the stairs
All humans alive today | ~8×10^9 | 33 | 24% up the stairs
All humans ever lived | ~10^11 | 37 | 27% up the stairs
All internet data (bytes) | ~10^23 | 77 | 57% up the stairs

Extending beyond humanity, I'll add in cosmological structures:

Scale | Estimated Atoms | log₂(n), rounded | Spanish Steps
Earth | ~10^50 | 166 | Up and down 31 steps
Solar System | ~10^57 | 189 | Up and down 54 steps
Nebula (typical) | ~10^60 | 199 | Up and down 64 steps
Milky Way Galaxy | ~10^68 | 226 | Up and down 91 steps
Local Group | ~10^72 | 239 | Up and down 104 steps
Virgo Supercluster | ~10^75 | 249 | Up and down 114 steps
Observable Universe | ~10^80 | 266 | Up and down (~2 sets of stairs)
Universe of Universes | ~10^160 | 532 | Up and down twice (~4 sets of stairs)
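
The Spanish Steps column is just division by the staircase's length. A small sketch, assuming one flight is 135 steps (the count that the percentages and remainders in the tables are consistent with); `stair_position` is an illustrative name of mine.

```python
SPANISH_STEPS = 135  # assumed length of one flight; consistent with the tables


def stair_position(steps: int) -> str:
    """Describe where `steps` lands on a staircase of SPANISH_STEPS steps."""
    flights, remainder = divmod(steps, SPANISH_STEPS)
    if flights == 0:
        return f"{steps / SPANISH_STEPS:.0%} up the stairs"
    return f"{flights} full flight(s) plus {remainder} steps"


for label, steps in [
    ("All internet data", 77),
    ("Earth", 166),
    ("Observable Universe", 266),
    ("Universe of Universes", 532),
]:
    print(f"{label}: {stair_position(steps)}")
```

Note that 266 comes out as one full flight plus 131 steps, four short of two complete flights (270), which is why the table rounds it to two sets of stairs.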

I find something about this comforting. The universe is so unimaginably, incomprehensibly vast, and yet there are paths for us to structure it, make sense of it, and explore its immense depth.

-----

Part 3 of this series: What the cosmos teaches us about quadratic growth and LLM context windows