Solutions: 23 — Index maps

Exercise 1 — Build the map

import numpy as np
INVALID = np.iinfo(np.uint32).max

class World:
    def __init__(self, capacity: int, n_ids: int):
        self.capacity = capacity
        self.n_active = 0
        self.id           = np.zeros(capacity, dtype=np.uint32)
        # ... other columns ...
        self.id_to_slot   = np.full(n_ids, INVALID, dtype=np.uint32)

    def append(self, new_id: int, **fields):
        slot = self.n_active
        self.id[slot] = new_id
        for k, v in fields.items():
            getattr(self, k)[slot] = v
        self.id_to_slot[new_id] = slot
        self.n_active += 1

Adding the map is one extra column and one extra line in append. Removal updates happen in cleanup (next exercise).

Exercise 2 — O(1) presence query

class World:
    hungry: np.ndarray = np.empty(0, dtype=np.uint32)
    hungry_member: np.ndarray = np.zeros(N_max, dtype=bool)

def become_hungry(world, creature_id: int):
    world.hungry = np.concatenate([world.hungry, [creature_id]])
    world.hungry_member[creature_id] = True

def stop_being_hungry(world, creature_id: int):
    world.hungry = world.hungry[world.hungry != creature_id]
    world.hungry_member[creature_id] = False

def is_hungry(world, creature_id: int) -> bool:
    return bool(world.hungry_member[creature_id])

Two parallel structures: hungry (the iteration list — O(K) walk) and hungry_member (the membership map — O(1) check). The list is for iterating; the bool array is for asking “is this id in the table?”. Both updated together; one read for each access pattern.

The cost is one byte per id ever issued (~1 MB at 1M ids), which is the memory price of constant-time membership.

Exercise 3 — Maintain on bulk-filter cleanup

def cleanup(world, buffer):
    if buffer.to_remove:
        ids = np.unique(np.array(buffer.to_remove, dtype=np.uint32))
        slots = world.id_to_slot[ids]
        keep_mask = np.ones(world.n_active, dtype=bool)
        keep_mask[slots] = False

        # mark the removed ids as no longer in the table
        world.id_to_slot[ids] = INVALID

        # compress every column
        n_keep = int(keep_mask.sum())
        for col_name in world.column_names:
            col = getattr(world, col_name)
            col[:n_keep] = col[: world.n_active][keep_mask]
        world.n_active = n_keep
        # rewrite id_to_slot for survivors — one bulk numpy assignment
        world.id_to_slot[world.id[:n_keep]] = np.arange(n_keep, dtype=np.uint32)
        buffer.to_remove.clear()

    # ... insertions: append new ids and write id_to_slot[new_id] = slot ...

The id_to_slot[ids[:n_keep]] = np.arange(n_keep) line is the keystone. It rewrites every surviving id’s slot in one bulk numpy assignment — exactly the same shape as the column compress, applied to the index map.

Exercise 4 — Time the difference

import time, numpy as np

world = build_world(n=1_000_000, hungry_count=100_000)
ids = np.random.default_rng(0).choice(1_000_000, size=100_000)

# Linear scan version (§17 ex 6)
def is_hungry_scan(hungry, target):
    return bool(np.any(hungry == target))

t = time.perf_counter()
for cid in ids:
    is_hungry_scan(world.hungry, int(cid))
print(f"linear scan × 100K: {time.perf_counter()-t:.2f} s")

# Indexed version
t = time.perf_counter()
for cid in ids:
    bool(world.hungry_member[int(cid)])
print(f"indexed × 100K: {time.perf_counter()-t:.3f} s")

Typical: linear scan ~5-10 minutes (10⁵ × 10⁵ = 10¹⁰ ops). Indexed: ~30 ms (one C-level read per call, plus Python loop overhead). Ratio: ~10⁵-10⁶×.

For a real simulator that does many membership queries per tick, the index map is the difference between workable and unsalvageable. Without it, presence-replaces-flags would only be defensible for whole-table operations, not individual queries.

Exercise 5 — Run the exhibit (honestly)

uv run "code/measurement/csr_matrix or python dict.py"

Benchmarking with a 1000x1000 matrix, 1.0% density (9954 non-zero elements).
Performing 10000 random lookups.

CSR Matrix lookup time:        0.0616 s
Python Dictionary lookup time: 0.00072 s

Python Dictionary is faster for lookups by approximately 85.62 times.

The headline (“Dict is 86× faster”) is true for the access pattern in the file (random scalar lookups). The right reading is that scipy gave you a sparse matrix, not a sparse map. CSR is excellent at:

import numpy as np
from scipy.sparse import csr_matrix

mat = csr_matrix((1000, 1000))
# ... populate ...
v = np.zeros(1000)
result = mat @ v               # SpMV — what CSR is actually for

For SpMV at 1000×1000 with 1% density, CSR is dramatically faster than naive dense or dict-based approaches — nine thousand multiplications instead of a million. That’s the operation it’s optimised for.

The lesson: pick the structure that matches your access pattern. A dict is a sparse point-lookup map. CSR is a sparse matrix. They share the word “sparse” and almost nothing else.

Exercise 6 — The bandwidth cost

1M id_to_slot entries × 4 bytes = 4 MB total
1500 cleanup writes per tick × 4 bytes = 6 KB written
At ~10 GB/s memory bandwidth: ~0.6 µs to write 6 KB
30 Hz tick budget: 33 ms

The cleanup map-update cost is 0.002% of the tick budget at typical mutation rates. The id_to_slot maintenance is invisible against the rest of the work. The 4 MB total memory cost is the dominant concern at scale, not the bandwidth — which mitigates to 400 KB once recycling caps the high-water id count.

Exercise 7 — Sort-for-locality compatibility

def sort_for_locality(world, key_col_name: str):
    """Sort the table in-place by some key (e.g., spatial bucket).
       Updates id_to_slot to reflect the new positions."""
    key = getattr(world, key_col_name)[: world.n_active]
    order = np.argsort(key, kind="stable")

    for col_name in world.column_names:
        col = getattr(world, col_name)
        col[: world.n_active] = col[: world.n_active][order]

    # the keystone again — one bulk update
    world.id_to_slot[world.id[: world.n_active]] = np.arange(world.n_active,
                                                              dtype=np.uint32)

After the sort, world.id[k] is some new id, and id_to_slot[world.id[k]] == k. External code holding a reference to id 42 looks up id_to_slot[42], gets the new slot, reads the (now-relocated) row.

The sort changed every slot. The map update changed every entry of id_to_slot. Both are O(N) bulk numpy operations — fast enough to do every tick if needed.

Exercise 8 — A from-scratch generational arena (stretch)

import numpy as np
from typing import NamedTuple

class CreatureRef(NamedTuple):
    id:  int
    gen: int

INVALID = np.iinfo(np.uint32).max

class SlotMap:
    """Generational arena: stable handles, O(1) lookup, slot recycling, generation checks."""

    def __init__(self, capacity: int = 65536, n_ids: int = 1_000_000):
        self.capacity = capacity
        self.n_active = 0
        self.id    = np.zeros(capacity, dtype=np.uint32)
        self.gens  = np.zeros(capacity, dtype=np.uint32)
        self.value = np.zeros(capacity, dtype=np.float32)
        self.id_to_slot = np.full(n_ids, INVALID, dtype=np.uint32)
        self.next_id = 0

    def insert(self, value: float) -> CreatureRef:
        if self.n_active >= self.capacity:
            raise MemoryError("SlotMap full")
        slot = self.n_active
        new_id = self.next_id
        self.next_id += 1
        self.id[slot]    = new_id
        self.gens[slot]  = 0
        self.value[slot] = value
        self.id_to_slot[new_id] = slot
        self.n_active += 1
        return CreatureRef(id=new_id, gen=0)

    def remove(self, ref: CreatureRef) -> bool:
        slot = self._slot_of(ref)
        if slot is None: return False
        last = self.n_active - 1
        moved_id = int(self.id[last])
        if slot != last:
            self.id[slot]    = self.id[last]
            self.gens[slot]  = self.gens[last]
            self.value[slot] = self.value[last]
            self.id_to_slot[moved_id] = slot
        self.id_to_slot[ref.id] = INVALID
        self.gens[last] += 1                      # bump generation for next reuse
        self.n_active -= 1
        return True

    def get(self, ref: CreatureRef) -> float | None:
        slot = self._slot_of(ref)
        return None if slot is None else float(self.value[slot])

    def _slot_of(self, ref: CreatureRef) -> int | None:
        slot = int(self.id_to_slot[ref.id])
        if slot == INVALID: return None
        if int(self.gens[slot]) != ref.gen: return None
        return slot

Compare with slotmap::SlotMap (Rust): same machinery, different organisation. Rust packs (index, generation) into one Key (a u64); we use a NamedTuple. Rust uses Vec<Slot> with an internal free list; we use an active counter and bump generations on remove. The structural pieces — id allocator, generation array, id_to_slot map, swap_remove on delete — are identical.

Combined with §22’s deferred cleanup, this SlotMap is the simulator’s table primitive. Once you have it, every variable-quantity table in the book reuses the shape — creatures, food, pending events, transition log entries — each one a SlotMap with different columns.

Keyboard shortcuts

An Introduction to Programming, using ECS & EBP in Python