Walking through your actual live data from this morning. Every number, every step.
This is your Mac Mini right now: Monday 10:39 AM. CPU 51%, Memory 72%, Disk 54%, Swap 74%, Load 4.3. You're working. The system has been up for 35 hours and has logged 19,925 observations in total.
Here's exactly what happens every 30 seconds.
The system reads your Mac's vital signs using stdlib calls only — subprocess to top, vm_stat, sysctl, df. Zero external dependencies. On an ESP32 it would be sensor reads.
| Metric | Value | Source |
|---|---|---|
| CPU | 50.51% | top -l 1 (parsed) |
| Memory | 72.27% | vm_stat + sysctl hw.memsize |
| Disk | 54.0% | df / (parsed) |
| Swap | 73.59% | sysctl vm.swapusage |
| Load | 4.25 | sysctl vm.loadavg |
| Network | 1 (up) | socket connect test |
| Idle Time | 2.5s | IOKit HIDIdleTime |
| Activity | 0.24/min | key/mouse events per minute |
| Hour | 10 | datetime.now().hour |
| Weekday | 0 (Mon) | datetime.now().weekday() |
This is the raw truth. Just numbers. No interpretation yet.
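To make the collection step concrete, here is a minimal sketch of the parsing side, run against a captured `vm_stat`-style sample (the page counts and the active+wired accounting are illustrative assumptions, not KIRI's exact code):

```python
import re

# Captured `vm_stat`-style output (illustrative numbers, 16 KiB pages).
VM_STAT_SAMPLE = """\
Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pages free:                               12345.
Pages active:                            400000.
Pages inactive:                          300000.
Pages speculative:                        10000.
Pages wired down:                        150000.
"""

def parse_vm_stat(text):
    """Parse lines like 'Pages active: 400000.' into a dict of page counts."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"(Pages [a-z ]+):\s+(\d+)\.", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats

def memory_percent(stats, page_size, total_bytes):
    """Used % = (active + wired) pages over hw.memsize (an assumed formula)."""
    used = (stats["Pages active"] + stats["Pages wired down"]) * page_size
    return round(100.0 * used / total_bytes, 2)
```

In a live run the text would come from `subprocess.run(["vm_stat"], capture_output=True, text=True)` and `total_bytes` from `sysctl -n hw.memsize`.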
Each number gets placed into a bucket. CPU 50.51% falls in the 50-60% range = bucket 5. Memory 72.27% = bucket 7. This is language.py — the same idea as a tokenizer in GPT, but for numbers instead of words.
Why buckets? The transformer predicts discrete tokens, not continuous numbers. "CPU went from bucket 3 to bucket 5" is a learnable pattern. "CPU went from 31.247% to 50.512%" is noise.
| Metric (raw value) | Rule | Bucket | Token |
|---|---|---|---|
| CPU 50.51% | ÷10, floor | 50-59% → 5 | C5 |
| Memory 72.27% | ÷10, floor | 70-79% → 7 | M7 |
| Disk 54.0% | ÷10, floor | 50-59% → 5 | D5 |
| Swap 73.59% | ÷20, floor | 60-79% → 3 | S3 |
| Load 4.25 | ÷4, floor | 4.0-7.99 → 1 | L1 |
| Network up | boolean | up → 1 | N1 |
Swap has 5 buckets (0-100%, each 20%). Load has 5 buckets (0-20, each 4 units).
Pulse sequence: C5 M7 D5 S3 L1 N1
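The bucket rules above fit in a few lines. A simplified stand-in for language.py (the clamping of out-of-range values into the top bucket is an assumption):

```python
def tokenize_pulse(cpu, mem, disk, swap, load, net_up):
    """Map raw metrics to discrete tokens per the bucket rules above.
    Clamping at the top bucket is an assumed edge-case policy."""
    return [
        f"C{min(int(cpu // 10), 9)}",   # CPU/Memory/Disk: 10 buckets of 10%
        f"M{min(int(mem // 10), 9)}",
        f"D{min(int(disk // 10), 9)}",
        f"S{min(int(swap // 20), 4)}",  # Swap: 5 buckets of 20%
        f"L{min(int(load // 4), 4)}",   # Load: 5 buckets of 4 units
        f"N{1 if net_up else 0}",       # Network: boolean
    ]

print(" ".join(tokenize_pulse(50.51, 72.27, 54.0, 73.59, 4.25, True)))
# → C5 M7 D5 S3 L1 N1
```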
| Metric (raw value) | Rule | Token |
|---|---|---|
| Idle 2.5s | bucket 0-7 by duration | I0 (very short idle = active) |
| Activity 0.24/min | bucket 0-5 by rate | A0 (low activity) |
| Hour 10 | ÷3, floor | H3 (morning) |
| Weekday 0 | direct | W0 (Monday) |
Rhythm sequence: I0 A0 H3 W0
Your entire Mac state is now 10 tokens. This is the vocabulary the model speaks.
The Pulse atom is a tiny GPT. It was trained on 15,000+ observations of your Mac. It learned patterns like: "after C2 M7, D3 is the most likely next token."
Now it processes your current sequence token by token:
| Step | Input | Model Predicts | Actually Got | Surprise |
|---|---|---|---|---|
| 1 | BOS | C2 (38%), C3 (25%) | C5 | 4.386 — has rarely seen C5 |
| 2 | C5 | M7 (74%), M6 (10%) | M7 | 0.296 — expected this |
| 3 | M7 | D3 (45%), D4 (30%) | D5 | 2.936 — disk higher than usual |
| 4 | D5 | S1 (40%), S2 (30%) | S3 | 4.878 — swap much higher than expected |
| 5 | S3 | L0 (50%), L1 (26%) | L1 | 1.366 — slightly surprising |
| 6 | L1 | N1 (99%) | N1 | 0.017 — network always up |
How "surprise" works: The model outputs a probability for every possible next token. If it predicted M7 with 85% probability and M7 actually appeared, surprise = -log(0.85) = 0.16 (low). If it predicted C2 but got C5 (maybe 1% probability), surprise = -log(0.01) = 4.6 (high).
That's the entire mechanism. Surprise = -log(probability the model assigned to what actually happened). High surprise = the model didn't expect this = anomaly.
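That mechanism fits in one function. A sketch, where the small probability floor is an assumption to avoid log(0) for tokens the model has never seen:

```python
import math

def surprise(probs, actual, floor=1e-6):
    """Surprise = -log of the probability the model assigned to the token
    that actually appeared. The floor (an assumption) avoids log(0)."""
    return -math.log(max(probs.get(actual, 0.0), floor))

print(round(surprise({"M7": 0.85, "M6": 0.10}, "M7"), 2))  # → 0.16
print(round(surprise({"C2": 0.38, "C3": 0.25}, "C5"), 2))  # C5 unseen → floor → 13.82
```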
Same process, different vocabulary:
| Step | Input | Predicts | Got | Surprise |
|---|---|---|---|---|
| 1 | BOS | I0 (17%) | I0 | 1.747 |
| 2 | I0 | A0 (99%) | A0 | 0.011 |
| 3 | A0 | H5 (25%), H6 (25%) | H3 | 1.884 — usually sees evening |
| 4 | H3 | W5 (60%), W6 (20%) | W0 | 4.229 — never seen Monday! |
Notice: W0 (Monday) scores 4.229 — the highest surprise. The model was trained mostly on Saturday and Sunday data. It's never seen a Monday before. That's why W0 is surprising. After a week of weekday data, this score will drop.
Pulse score = average of all per-token surprises:
(4.386 + 0.296 + 2.936 + 4.878 + 1.366 + 0.017) ÷ 6 = 2.313
Rhythm score = (1.747 + 0.011 + 1.884 + 4.229) ÷ 4 = 1.968
The threshold is 2.0. Pulse (2.31) is above it → "elevated." Rhythm (1.97) is just below → "normal."
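The scoring arithmetic, spelled out (the threshold and rounding are taken from the text; the labels are a sketch):

```python
def domain_score(surprises):
    """Domain score = mean of the per-token surprises (rounded for display)."""
    return round(sum(surprises) / len(surprises), 3)

THRESHOLD = 2.0  # fixed threshold from the text

pulse = domain_score([4.386, 0.296, 2.936, 4.878, 1.366, 0.017])
rhythm = domain_score([1.747, 0.011, 1.884, 4.229])

for name, score in (("pulse", pulse), ("rhythm", rhythm)):
    print(name, score, "elevated" if score > THRESHOLD else "normal")
# pulse 2.313 elevated
# rhythm 1.968 normal
```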
The molecule sees EVERYTHING the atoms see, plus their scores, plus time context. Its input is a unified sequence of all domains:
BOS p.C5 p.M7 p.D5 p.S3 p.L1 p.N1 r.I0 r.A0 r.H3 r.W0 PS2 RS1 DS0 H3 W0
Notice the prefixes: p.C5 means "Pulse CPU bucket 5." r.I0 means "Rhythm Idle bucket 0." PS2 means "Pulse Score bucket 2." This prevents collisions — Pulse's C and Drift's C are different tokens.
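Assembling that unified sequence can be sketched as follows. Floor-bucketing the scores is an assumption that happens to reproduce PS2 / RS1 / DS0, and the drift score of 0.4 is a hypothetical value:

```python
def molecule_input(pulse_tokens, rhythm_tokens, scores, hour, weekday):
    """Build the molecule's unified sequence: prefixed atom tokens, bucketed
    domain scores, and time context. Score bucketing by integer floor is an
    assumed detail, not confirmed from KIRI's source."""
    return (["BOS"]
            + [f"p.{t}" for t in pulse_tokens]
            + [f"r.{t}" for t in rhythm_tokens]
            + [f"PS{int(scores['pulse'])}",
               f"RS{int(scores['rhythm'])}",
               f"DS{int(scores['drift'])}"]
            + [f"H{hour // 3}", f"W{weekday}"])

seq = molecule_input(["C5", "M7", "D5", "S3", "L1", "N1"],
                     ["I0", "A0", "H3", "W0"],
                     {"pulse": 2.313, "rhythm": 1.968, "drift": 0.4},
                     hour=10, weekday=0)
print(" ".join(seq))
# → BOS p.C5 p.M7 p.D5 p.S3 p.L1 p.N1 r.I0 r.A0 r.H3 r.W0 PS2 RS1 DS0 H3 W0
```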
The molecule has 4 expert FFN networks. A gating network looks at each token and decides which 2 experts (out of 4) should process it. This is Mixture of Experts — different experts specialize in different patterns.
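The routing decision itself is small. Here is a sketch of top-2 gating for a single token: in the real model the gate logits come from a learned linear layer over the token's embedding; the values below are purely illustrative:

```python
import numpy as np

def top2_route(gate_logits):
    """Top-2 MoE routing: keep the 2 highest-scoring experts and softmax
    their logits so the pair's mixing weights sum to 1."""
    idx = np.argsort(gate_logits)[::-1][:2]        # indices of the top-2 experts
    w = np.exp(gate_logits[idx] - gate_logits[idx].max())
    return idx, w / w.sum()

# Illustrative gate logits for one token over 4 experts:
idx, w = top2_route(np.array([0.1, 2.0, -0.5, 1.2]))
print(idx)  # → [1 3]
```

The token's output is then the weighted sum of those two experts' FFN outputs; the other two experts do no work for this token, which is why MoE adds capacity without adding much compute.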
The forward pass:
| Layer | What Happens |
|---|---|
| Embed | Each input token → 48-dimensional vector. Position is also encoded. |
| Layer 1 | Attention: tokens attend to each other. p.S3 (high swap) attends to p.M7 (high mem) — they're related. Gate routes to experts 1,3. |
| Layer 2 | Deeper patterns: PS2 (elevated pulse score) attends to p.C5 + p.S3. Gate routes to experts 2,4. |
| Layer 3 | Decision layer: combines all evidence. W0 (Monday) + H3 (morning) + elevated scores → routes to experts 1,2. |
| Output | Predict next token: A:alert (100%), A:ok (0%), A:suppress (0%) |
After predicting the action, the model continues generating: EXP → explanation tokens:
A:alert EXP elevated above at high anomaly END
The explanation vocabulary is ~50 diagnostic words. The model picks words autoregressively — same as GPT generating text, but constrained to domain-specific vocabulary.
"elevated above at high anomaly"
This is the molecule's best attempt at explaining why it chose "alert." It's saying: scores are elevated, something is above normal, high anomaly detected.
The explanation quality depends on training data. Right now the molecule trained mostly on synthetic templates. As it retrains on more real observations (it's at 4,300+ sequences and climbing), the explanations will become more specific — eventually something like "cpu high swap elevated during morning" instead of generic "elevated above at high."
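The constrained decoding described above is ordinary greedy generation over a tiny vocabulary. A sketch, where `next_token_probs` is a hypothetical stand-in for the molecule's forward pass:

```python
def generate_explanation(next_token_probs, prefix, max_len=8):
    """Greedy constrained decoding: keep taking the most likely token from
    the ~50-word diagnostic vocabulary until END or a length cap."""
    out = []
    while len(out) < max_len:
        probs = next_token_probs(prefix + out)  # dict: token -> probability
        tok = max(probs, key=probs.get)
        if tok == "END":
            break
        out.append(tok)
    return out
```

The real model may sample rather than take the argmax, but the constraint is the same: every step's distribution only covers the explanation vocabulary.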
Based on the molecule's decision:
| Action | What Happens |
|---|---|
| ok | Log the observation. Do nothing. Everything is normal. |
| alert ← this time | Log + flag as anomaly. In daemon mode, could send Telegram/email. |
| suppress | Log but don't alert. The model recognizes this pattern as "known unusual" — like high CPU during builds. |
| retrain | The model thinks it's seeing patterns outside its training. Triggers a retrain cycle. |
This observation was logged and flagged as an anomaly; in daemon mode the alert would go out. 30 seconds from now, the whole cycle repeats.
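The action table above amounts to a small dispatcher. A sketch; the handler names and observation shape are hypothetical, not KIRI's actual API:

```python
def dispatch(action, observation, log, notify, retrain):
    """Route the molecule's action token to a handler (names are hypothetical)."""
    log(observation)                   # every cycle logs the observation
    if action == "alert":
        observation["anomaly"] = True
        notify(observation)            # daemon mode: Telegram/email hook
    elif action == "suppress":
        observation["anomaly"] = True  # known-unusual: flagged but not sent
    elif action == "retrain":
        retrain()                      # pattern outside training distribution
```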
| Component | Params | Sees | Learns | Outputs |
|---|---|---|---|---|
| Pulse Atom | 27,840 | CPU, MEM, Disk, Swap, Load, Net | "After M7, D3 is normal" | Per-token surprise scores |
| Rhythm Atom | 27,008 | Idle time, Activity, Hour, Weekday | "At H5 W5, I3 is normal" | Per-token surprise scores |
| Drift Atom | ~27K | Tasks added/completed/switched | "3 added 2 completed is normal" | Per-token surprise scores |
| Molecule | 157,824 | All atom tokens + scores + time | "Elevated pulse + night = alert, elevated pulse + weekday = suppress" | Action + explanation |
KIRI — an Eryx Labs project