Engineering · May 19, 2026 · 9 min read

What actually happens when your code runs.

Source code is not the same as the thing the machine executes. Every line you write descends through five layers of translation, ending in electrons in a transistor. Here's the staircase, layer by layer, in the most concrete language I can manage.

Atakan Özalan

Co-founder & engineering lead, GOGOGO LLC

I get this question more than I'd expect, usually from people who write code daily but never zoom in on what happens after they hit run. The honest answer takes five layers. Each layer hides the one below it. Mostly that's a kindness, but knowing how the lower layers work changes how you write the top one. This is the staircase, in the most concrete language I can manage.

Layer 1 — Source code

The top layer is what you type. if x > 5: print('big') in Python, or console.log("hi") in JavaScript. It looks like English with some punctuation. It is not what the machine runs. It is a description of what you want, in a notation humans designed to be readable. The CPU has never read your source code and never will.

What this layer is good at: matching the way humans think. Loops, variables, named functions — these correspond to mental models we already have. The compiler / interpreter's job is to demolish that match cleanly into something the machine can use.
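To make "description, not execution" concrete, here is a minimal sketch using Python's standard `ast` module. It takes the one-liner from above as plain text and shows the tree the parser builds from it, which is the first step in taking the human-friendly form apart. The variable names are mine, chosen for illustration.

```python
import ast

# The source is just a string until something parses it.
source = "if x > 5: print('big')"

# Parse the text into an abstract syntax tree (no code runs here).
tree = ast.parse(source)
if_node = tree.body[0]

# The tree records structure: an If node whose test is a comparison.
print(type(if_node).__name__)              # If
print(type(if_node.test.ops[0]).__name__)  # Gt
print(ast.dump(tree, indent=2))
```

Nothing in this snippet executes the `if`; `x` does not even need to exist. The parser only records what you asked for.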

Layer 2 — Bytecode / intermediate representation

Most modern languages don't compile source straight to machine code. They first translate it to an intermediate representation: bytecode, IR, AST-but-flatter. Python's .pyc files hold this. Java's .class files hold this. V8, the JavaScript engine in Chrome and Node.js, compiles source to bytecode for its Ignition interpreter. Even C usually passes through a compiler IR on its way to machine code: LLVM IR under Clang, GIMPLE and RTL under GCC.

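Bytecode is a low-level virtual instruction set: simpler operations than your source, but not yet specific to a real CPU. In CPython that means opcodes like LOAD_FAST, BINARY_OP, COMPARE_OP, CALL, and POP_JUMP_IF_FALSE (the exact names shift between versions; older releases had BINARY_ADD and CALL_FUNCTION). There are on the order of a hundred distinct operations. Whitespace and comments are gone, and local variables are referred to by slot number, with their names kept only in a side table. This is what the Python interpreter actually walks over.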
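You can look at this layer yourself with the standard `dis` module. The sketch below disassembles a small function (`check` is a name I made up for the example) and prints its instructions. Opcode names differ between CPython versions, so treat whatever your interpreter prints as the ground truth.

```python
import dis

def check(x):
    if x > 5:
        print('big')

# Collect the bytecode instructions CPython generated for the function.
instructions = list(dis.get_instructions(check))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

# The instructions refer to locals by slot index; the names themselves
# are kept in a side table on the code object.
print(check.__code__.co_varnames)
```

The listing is short: load the local, load the constant, compare, jump or fall through, call `print`, return.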

Why this layer exists: it's portable. The same bytecode runs on every machine that has the right virtual machine installed. The compiler/interpreter has only one job — translate source to bytecode — and the VM does the rest.
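A small sketch of that separation, using only the standard library: compile a source string to a code object, serialize it with `marshal` (the same format CPython uses inside .pyc files), then load and run it without the source text. The snippet and variable names are illustrative. One caveat I'm adding: CPython bytecode is tied to the interpreter version, so "the right virtual machine" means a matching Python release.

```python
import marshal

source = "result = sum(n * n for n in range(4))"

# Step 1: the compiler's one job, source text to bytecode.
code = compile(source, "<example>", "exec")

# Step 2: bytecode is just bytes; it can be stored or shipped.
blob = marshal.dumps(code)

# Step 3: a VM can run it with no access to the original source text.
restored = marshal.loads(blob)
namespace = {}
exec(restored, namespace)
print(namespace["result"])  # 14
```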

Layer 3 — Machine code

Bytecode is portable but slow to interpret, because the VM spends many real CPU instructions fetching, decoding, and dispatching each bytecode instruction. To go faster, many VMs include a JIT (just-in-time) compiler that translates hot bytecode paths down into machine code, the binary instructions your specific CPU architecture understands: MOV, ADD, CMP, JMP on x86-64, and their equivalents on ARM64 in your phone or M-series Mac.

Machine code is what's actually fetched from memory by the CPU and executed. It's a stream of bytes — the assembler mnemonic is a human-readable version, but the CPU sees something like 48 89 e5 (which is x86-64 for mov rbp, rsp). The CPU has no idea this came from your Python script. It just executes bytes.
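To show that those three bytes really do spell out an instruction, here is a deliberately tiny decoder. It is a sketch that handles exactly one encoding form (REX.W prefix, opcode 0x89, register-to-register ModRM) and nothing else; the function name and structure are mine, not part of any real disassembler.

```python
# x86-64 register numbers 0-7 for 64-bit operands (no REX extension bits).
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def decode_mov_rr(code: bytes) -> str:
    """Decode only the 3-byte form: REX.W, opcode 0x89, register-direct ModRM."""
    rex, opcode, modrm = code
    if rex != 0x48:
        raise ValueError("expected REX.W prefix 0x48 (64-bit operands)")
    if opcode != 0x89:
        raise ValueError("expected opcode 0x89 (MOV r/m64, r64)")
    mod = modrm >> 6            # top two bits: addressing mode
    reg = (modrm >> 3) & 0b111  # middle three bits: source register
    rm = modrm & 0b111          # low three bits: destination register
    if mod != 0b11:
        raise ValueError("only register-direct addressing is handled")
    return f"mov {REGS64[rm]}, {REGS64[reg]}"

print(decode_mov_rr(bytes.fromhex("4889e5")))  # mov rbp, rsp
```

The byte 0xe5 is 11 100 101 in binary: mode 11 (register direct), source register 4 (rsp), destination register 5 (rbp). The CPU's decoder does this same field extraction in hardware.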

Layer 4 — CPU cycles

Machine code instructions don't execute instantly. A modern CPU runs at ~3-4 billion cycles per second. An instruction can take as little as 1 cycle (a simple ADD) or 100+ cycles (a load that misses every cache level and has to fetch from RAM). The CPU is pipelined: multiple instructions are in flight at once, in different stages of fetch / decode / execute / write-back. It does out-of-order execution: instructions get reordered for throughput when their dependencies allow. It does branch prediction: it guesses which way an if statement will go and starts executing speculatively, rolling back if the guess was wrong.
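Branch prediction is easier to picture with a toy model. The sketch below simulates a single 2-bit saturating counter, a textbook scheme and far simpler than what real CPUs use (real predictors track branch history and would learn the alternating pattern below). The function name and starting state are my assumptions.

```python
def mispredictions(outcomes, state=2):
    """Count wrong guesses made by one 2-bit saturating counter.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Nudge the counter toward what actually happened, clamped to 0..3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses

# A loop branch: taken 999 times, then falls through once.
loop_branch = [True] * 999 + [False]
# A branch that flips every time, which this toy predictor handles badly.
flipping_branch = [True, False] * 500

print(mispredictions(loop_branch))      # 1
print(mispredictions(flipping_branch))  # 500
```

Each misprediction means throwing away speculative work and refilling the pipeline, so the second pattern pays that penalty on half of its branches in this model.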

Almost no software engineer thinks about this layer day to day. But it's why your code is fast or slow. The pipeline only stays full if branches are predictable. Memory locality matters because a cache miss that goes all the way to RAM costs on the order of 100× an L1 cache hit. SIMD vector instructions process 4-8 numbers (or more, on wider vector units) in parallel when the data layout is right. When someone tells you to write cache-friendly code, this is the layer they're talking about.
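Here is a sketch of why traversal order matters, using a simulated cache so the result is deterministic. The cache is a toy (direct-mapped, 64 lines of 64 bytes) and the array shape is hypothetical; real caches are larger and set-associative, so the exact numbers will not match real hardware, but the shape of the effect is the same.

```python
def count_misses(addresses, line_size=64, num_lines=64):
    """Count misses in a toy direct-mapped cache for a stream of byte addresses."""
    slots = [None] * num_lines  # each slot remembers which memory line it holds
    misses = 0
    for addr in addresses:
        line = addr // line_size
        slot = line % num_lines
        if slots[slot] != line:
            slots[slot] = line  # evict whatever was there and load this line
            misses += 1
    return misses

ROWS, COLS, ELEM_BYTES = 256, 256, 8  # 256x256 array of 8-byte values, stored row by row

# Same elements, two visiting orders.
by_rows = [(r * COLS + c) * ELEM_BYTES for r in range(ROWS) for c in range(COLS)]
by_cols = [(r * COLS + c) * ELEM_BYTES for c in range(COLS) for r in range(ROWS)]

print(count_misses(by_rows))  # 8192
print(count_misses(by_cols))  # 65536
```

Walking along rows uses all eight values in each 64-byte line before moving on. Walking down columns jumps 2,048 bytes per step, so in this model every single access misses.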

Layer 5 — Physics

Underneath the CPU cycle is the actual physical event. A CPU register is a small set of bistable circuits made of transistors, and a modern chip carries billions of transistors in total. A transistor is a switch: current flows or doesn't, depending on the voltage at its gate. ADD at the machine-code level corresponds to electrons moving through a network of switches arranged in a binary adder, settling into a new pattern that represents the sum. Each transistor flips in less than a nanosecond. The clock signal, the heartbeat of the chip, coordinates billions of these flips per second.
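The "network of switches arranged in a binary adder" can be sketched in a few lines. This models a ripple-carry adder, the simplest textbook design, using only AND, OR, and XOR on single bits. Real CPUs use faster adder designs; the function names and the 8-bit width are assumptions for illustration.

```python
def full_adder(a, b, carry_in):
    """Add three single bits using only gate-level operations."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out

def ripple_add(x, y, width=8):
    """Chain full adders bit by bit, passing the carry along."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        sum_bit, carry = full_adder(a, b, carry)
        result |= sum_bit << i
    return result, carry  # final carry signals overflow past `width` bits

print(ripple_add(5, 3))    # (8, 0)
print(ripple_add(255, 1))  # (0, 1)
```

Each `&`, `|`, and `^` here stands in for a handful of transistors. The second call shows 8-bit overflow: the sum wraps to 0 and the carry-out bit is set.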

Below transistors is quantum mechanics. Electrons tunneling through gate oxides is one of the things that limits how small a transistor can get. Heat is what limits how fast the chip can run: every switch dissipates energy, and power density climbs with frequency. (Landauer's principle sets a theoretical minimum thermodynamic cost for erasing a bit, but real chips dissipate far more than that per operation.) These limits are why CPU clock speeds plateaued at a few GHz in the mid-2000s, and why progress since then has come mostly from parallelism, not raw frequency. Chips on today's 3nm-class processes are still pushing against the same floors.
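For a sense of scale, Landauer's bound is easy to compute: the minimum energy to erase one bit is k·T·ln 2, where k is the Boltzmann constant and T is the temperature in kelvin. The sketch below evaluates it at 300 K, a temperature I picked as roughly room temperature.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact by SI definition

def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy to erase one bit at the given temperature (k * T * ln 2)."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)

energy = landauer_limit_joules(300.0)
print(f"{energy:.3e} J per erased bit")
```

That works out to roughly 2.9e-21 joules per bit at room temperature, a floor set by thermodynamics and not by any particular chip design.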

Why the staircase matters

You can write working software without knowing any of this. Tens of millions of engineers do every day. But the engineers I trust most can walk down the staircase when they need to. A weird performance bug? Drop to Layer 4: what's the cache pattern, what's the branch misprediction rate. A bytecode-level question? Layer 2. A weird floating-point result? Layers 3 and 4: IEEE 754, rounding modes, denormals, all implemented in the CPU's floating-point hardware. The staircase is not exotic knowledge; it's the substrate that makes the top layer behave the way it does.
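The floating-point case is the easiest one to reproduce from the top layer. This short sketch shows three IEEE 754 effects in plain Python, whose `float` is a 64-bit binary double on essentially every platform: a decimal that has no exact binary form, a subnormal (denormal) value, and underflow to zero.

```python
import sys

# 0.1 has no exact binary representation, so the error shows up in sums.
total = 0.1 + 0.2
print(total == 0.3)   # False
print((0.1).hex())    # 0x1.999999999999ap-4

# Below the smallest normal double, values become subnormal: still nonzero,
# but with fewer bits of precision.
smallest_normal = sys.float_info.min
subnormal = smallest_normal / 2
print(0.0 < subnormal < smallest_normal)  # True

# Halving the smallest subnormal rounds to exactly zero.
tiniest = 5e-324
print(tiniest / 2 == 0.0)  # True
```

None of this is a Python quirk. The same bit patterns and rounding rules apply in any language that uses the hardware's double-precision arithmetic.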

At GOGOGO LLC, I spend most of my time at Layer 1, with TypeScript and Python. The multi-agent runtime is Layer 1 code through and through. But I think about Layer 4 every time I see a cost-per-call number that doesn't match my mental model. I think about Layer 5 every time someone asks why our agent latency floor is ~30ms even when the model returns in ~5ms (answer: network round-trip is physics; speed of light through fiber is a real constraint).
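The speed-of-light floor is simple arithmetic. Light in optical fiber travels at roughly two thirds of its vacuum speed, about 200,000 km/s. The sketch below uses that approximation and hypothetical distances (not our actual network topology) to compute the minimum round-trip time, ignoring routing, queuing, and processing delays.

```python
SPEED_IN_FIBER_KM_PER_S = 200_000.0  # assumed: about two thirds of c in vacuum

def min_round_trip_ms(one_way_km: float) -> float:
    """Lower bound on round-trip time over fiber, in milliseconds."""
    return 2 * one_way_km / SPEED_IN_FIBER_KM_PER_S * 1000.0

# Hypothetical one-way distances between client and server.
for distance_km in (100, 1000, 3000):
    print(distance_km, "km ->", round(min_round_trip_ms(distance_km), 2), "ms")
```

Under these assumptions a server 3,000 km away cannot answer in under about 30 ms, no matter how fast the code on either end runs.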

“Code is description. The CPU is physics. Every abstraction layer between them is a debt you'll pay when something at the top behaves weirdly. Knowing the staircase is the difference between guessing and reasoning.”

What I'd tell a younger engineer

Don't stay on Layer 1 forever. You don't need to write assembly to be a good engineer (I haven't written x86 voluntarily in over a decade), but you should be able to read it for ten minutes and recognize what's happening. Same for bytecode. Same for the basic physics of a transistor. Each of these layers will eventually save you a debugging week. The investment is small. The compounding is large.

If you want to walk further down any of the layers — pipelines, IRs, cache hierarchies, the I-Ching of CPU prefetchers — I'm easy to reach. atakanozalan.com.