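Threads can observe memory states you would not expect from reading the source. CPUs reorder instructions, caches and store buffers delay visibility, and compilers optimize aggressively. The language's memory model (Java's JMM, C++'s memory_order rules, Go's happens-before) defines which guarantees you actually get. Misunderstanding it produces some of the hardest concurrency bugs to track down: they are intermittent and often show up only on particular hardware or at particular optimization levels.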
Why memory looks weird across threads
Modern CPUs have per-core caches and store buffers. A store can sit in a buffer and not be visible to other cores immediately. Compilers and JITs also reorder reads and writes, or keep a value in a register, whenever that does not change single-threaded behavior. Without explicit synchronization, two threads have no agreed-on view of memory.
Happens-before — the formal contract
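If event A happens-before event B, then B is guaranteed to see A's effects. Sources of happens-before (in Java's terms): program order within one thread; an unlock of a monitor followed by a later lock of the same monitor; a volatile write followed by a later read of the same variable; a call to Thread.start() before anything the started thread does; and everything a thread does before another thread's join() on it returns. The relation is transitive, so these edges chain together. Outside these, there are no guarantees.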
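The thread-start and thread-join edges can be sketched in Java as follows. This is a minimal illustration, not a recommended pattern: the class HappensBeforeJoin and the method doubleOnWorker are hypothetical names invented for the example, and the fields are deliberately plain (non-volatile) to show that the two edges alone are enough here.

```java
final class HappensBeforeJoin {
    // Plain (non-volatile) fields. They are safe in this sketch only because
    // of the start() and join() happens-before edges.
    static int input;
    static int result;

    // Hypothetical helper: doubles a value on a worker thread.
    static int doubleOnWorker(int value) {
        input = value;                      // written before start()
        Thread worker = new Thread(() -> {
            // start() happens-before this body, so the worker sees input.
            result = input * 2;
        });
        worker.start();
        try {
            worker.join();                  // the worker's writes happen-before join() returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return result;                      // guaranteed to see the worker's write
    }

    public static void main(String[] args) {
        System.out.println(doubleOnWorker(21));
    }
}
```

If the main thread read result without calling join(), none of the listed edges would connect the worker's write to that read, and the value seen would be unspecified.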
Visibility — the practical concern
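A non-volatile field written by thread 1 may never become visible to thread 2. In practice the cause is usually the compiler or JIT rather than the cache: mainstream hardware keeps caches coherent, but the reader's load may be hoisted out of a loop or kept in a register, so the thread never re-reads memory. volatile (Java), memory_order_release/memory_order_acquire (C++), and channel send/receive (Go) create the needed happens-before.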
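A common place this shows up is a stop flag polled in a loop. The sketch below assumes Java; StopFlag, requestStop, and workerStopsWithin are hypothetical names, and the 10 ms sleep is an arbitrary delay chosen for the example.

```java
final class StopFlag {
    // volatile: the write in requestStop() happens-before any later read
    // that observes true, so the polling loop is guaranteed to see it.
    private volatile boolean stopRequested;

    void requestStop() {
        stopRequested = true;
    }

    // Starts a polling worker, requests a stop, and reports whether the
    // worker exited within the timeout.
    boolean workerStopsWithin(long timeoutMillis) {
        Thread worker = new Thread(() -> {
            long spins = 0;
            while (!stopRequested) {        // volatile read on every iteration
                spins++;                    // stand-in for real work
            }
        });
        worker.setDaemon(true);             // do not keep the JVM alive if it never stops
        worker.start();
        try {
            Thread.sleep(10);               // let the worker spin briefly
            requestStop();
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(new StopFlag().workerStopsWithin(5_000));
    }
}
```

Without the volatile modifier the language gives no guarantee that the loop ever exits; whether it does in a given run depends on the JIT and the hardware.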
Reordering — the surprising one
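The CPU and the compiler can reorder reads and writes within a thread, as long as that thread's own single-threaded behavior is preserved. Other threads may therefore see the writes in a different order than the source shows. Acquire-release semantics restrict this: reads and writes before a release store cannot move after it, and reads and writes after an acquire load cannot move before it. In Java a volatile write and a volatile read provide at least these guarantees; in C++ use memory_order_release and memory_order_acquire.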
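The classic case is publishing data behind a flag. In the Java sketch below, the volatile flag plays the release/acquire role; Publication, publish, awaitPayload, and demo are hypothetical names, and the busy-wait loop is only there to keep the example short.

```java
final class Publication {
    private int payload;               // plain field, guarded by the flag below
    private volatile boolean ready;    // volatile write = release, volatile read = acquire

    void publish(int value) {
        payload = value;   // (1) plain write
        ready = true;      // (2) volatile write: (1) cannot be reordered after it
    }

    int awaitPayload() {
        while (!ready) {   // (3) volatile read, repeated until it sees true
            // busy-wait; real code would block instead of spinning
        }
        return payload;    // (4) cannot be reordered before (3), so it sees (1)
    }

    // Publishes from a second thread and reads from the calling thread.
    static int demo(int value) {
        Publication box = new Publication();
        Thread writer = new Thread(() -> box.publish(value));
        writer.start();
        int seen = box.awaitPayload();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo(42));
    }
}
```

If ready were a plain field, steps (1) and (2) could become visible to the reader in the opposite order, and awaitPayload could return the default value 0 (or never return at all).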
Practical rules
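Use higher-level primitives (channels, mutexes, atomics) where possible. An uncontended volatile or atomic read is typically cheaper than taking a lock, so use one on hot paths where the shared state is a single variable; once an invariant spans two or more variables, go back to a lock. Don't roll your own lock-free code; the memory model traps will get you.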
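As one illustration of the single-variable case, a shared counter can rely on java.util.concurrent.atomic.AtomicLong instead of a lock. This is a sketch under that assumption; HitCounter, record, total, and countConcurrently are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

final class HitCounter {
    // All shared state is one variable, so an atomic is enough.
    private final AtomicLong hits = new AtomicLong();

    void record() {
        hits.incrementAndGet();         // atomic read-modify-write, no lock
    }

    long total() {
        return hits.get();              // has volatile-read semantics
    }

    // Runs several threads that each record perThread hits, then returns the total.
    static long countConcurrently(int threads, int perThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.record();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();          // join() makes each worker's updates visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 10_000));
    }
}
```

A plain long with hits++ would lose updates under contention, because the increment is a separate read and write. If the counter had to stay consistent with a second field, a lock would be the simpler and safer choice.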