Conventional AI inference engines operate on one-dimensional token sequences. Memory is stored as text, embeddings, or binary blobs in a heap that is serially traversable and subject to inspection, modification, or brute-force analysis by adversaries with direct device access.
The ZDX Pyxel VM departs from this model entirely. It treats a PNG image as a two-dimensional instruction grid: the X-axis encodes time (clock cycles), and the Y-axis encodes space (parallel execution threads). A program is a picture. Execution is reading that picture column by column while running each row as a concurrent logical unit.
This approach yields three properties that are difficult to achieve simultaneously in conventional VMs: spatial parallelism, state opacity, and physically grounded authentication.
The system is composed of four interoperating components:
ParallelPyxelVM. The core execution engine. Reads a PNG texture and dispatches opcodes to per-row thread registers. Manages daisy-chain linkage to successor frames.
ThoughtCompiler. Translates structured parallel instruction lists into PNG textures using the ZDX opcode-to-RGB mapping. Produces machine-readable brain frames from high-level instruction schedules.
VisualMemoryManager. Tracks the relevance score and age of every brain frame on disk. Applies a decay formula and prunes low-value frames to bound storage growth.
FrequencyChallenge. Generates and validates authentication tokens embedded as RGB patterns in a hidden thread row. No password string ever exists in memory.
Every opcode is a 24-bit RGB triple. The ISA is intentionally minimal; complexity emerges from spatial composition rather than opcode count.
| Mnemonic | RGB Value | Category | Semantics |
|---|---|---|---|
| SET_REG | (10, 0, B) | Core | Load blue channel value B into thread register A |
| ADD_THREAD | (20, 0, 0) | Core | OUT = A + B within the current thread's register file |
| SAVE_STATE | (0, 255, 0) | Memory | Trigger daisy-chain; encode current registers into next frame header |
| VERIFY_FREQ | (50, 50, 50) | Security | Compare pixel(x, y) to pixel(W−1−x, y); match authorizes thread |
| HALT | (255, 255, 255) | Core | Terminate current thread; other threads continue |
| NOP | (0, 0, 0) | Core | No operation; empty pixel space |
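The ISA assigns only a handful of RGB values, so almost all of the 24-bit color space is unassigned. Unassigned colors fall through the dispatcher as NOPs, which lets ordinary texture content serve as padding. Two caveats apply: several opcodes are common image colors (black, white, pure green), and the dispatcher matches SET_REG and ADD_THREAD on the red channel alone (R = 10 or R = 20), so padding textures must avoid those values to prevent accidental execution.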
```python
# Canonical ZDX Pyxel ISA
MAP = {
    "SET_REG": (10, 0, 0),  # R=10; value in B channel
    "ADD_THREAD": (20, 0, 0),
    "SAVE_STATE": (0, 255, 0),
    "VERIFY_FREQ": (50, 50, 50),
    "HALT": (255, 255, 255),
}
REVERSE_MAP = {v: k for k, v in MAP.items()}
```
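The mapping above can be exercised without any image library by treating a texture as a grid of RGB tuples. The following is a minimal sketch, not the ThoughtCompiler itself: `encode`, `decode`, and `compile_schedule` are hypothetical helper names, the schedule format `{(x, y): (mnemonic, value)}` is an assumption, and the NOP entry is taken from the ISA table. Writing the grid out as a PNG is omitted.

```python
# Opcode values copied from the ISA table; NOP is included so padding decodes cleanly.
MAP = {
    "SET_REG": (10, 0, 0),
    "ADD_THREAD": (20, 0, 0),
    "SAVE_STATE": (0, 255, 0),
    "VERIFY_FREQ": (50, 50, 50),
    "HALT": (255, 255, 255),
    "NOP": (0, 0, 0),
}
REVERSE_MAP = {v: k for k, v in MAP.items()}


def encode(mnemonic, value=0):
    """Return the RGB triple for one instruction (hypothetical helper)."""
    if mnemonic == "SET_REG":
        if not 0 <= value <= 255:
            raise ValueError("SET_REG operand must fit in one 8-bit channel")
        return (10, 0, value)  # the operand travels in the blue channel
    return MAP[mnemonic]


def decode(rgb):
    """Return (mnemonic, operand); any unassigned color is treated as NOP.

    This matches the table values exactly, which is stricter than a
    dispatcher that tests the red channel alone.
    """
    if rgb[0] == 10 and rgb[1] == 0:
        return ("SET_REG", rgb[2])
    return (REVERSE_MAP.get(rgb, "NOP"), None)


def compile_schedule(schedule, width, height):
    """Place instructions on a NOP-filled grid indexed as grid[y][x].

    y is the thread row and x is the clock cycle.
    """
    grid = [[MAP["NOP"]] * width for _ in range(height)]
    for (x, y), (mnemonic, value) in schedule.items():
        grid[y][x] = encode(mnemonic, value)
    return grid


schedule = {
    (0, 0): ("SET_REG", 42),
    (3, 0): ("ADD_THREAD", 0),
    (4, 0): ("HALT", 0),
}
grid = compile_schedule(schedule, width=5, height=2)
print([decode(pixel) for pixel in grid[0]])
```

Special-casing SET_REG in `decode` is needed because a plain `REVERSE_MAP` lookup only recognizes the operand value 0.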
The VM iterates over the texture in column-major order: for each X position (clock cycle), it dispatches the opcode at (X, Y) to the thread in row Y, for every active row, before advancing X. All threads therefore execute cycle-synchronously, analogous to SIMD lanes but with heterogeneous instruction streams.
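```python
from PIL import Image

def execute_texture(self, image_path):
    # Normalize to RGB so every pixel is a 3-tuple, even for RGBA or palette PNGs.
    img = Image.open(image_path).convert("RGB")
    pixels = img.load()
    width, height = img.size
    halted = set()  # rows that have already executed HALT
    for x in range(width):  # clock cycle
        for y in range(min(height, len(self.registers))):  # thread
            if y in halted:
                continue
            rgb = pixels[x, y]
            thread = self.registers[f"T{y}"]
            if rgb[0] == 10:  # SET_REG: blue channel carries the value
                thread["A"] = rgb[2]
            elif rgb[0] == 20:  # ADD_THREAD
                thread["OUT"] = thread["A"] + thread["B"]
            elif rgb == (50, 50, 50):  # VERIFY_FREQ: compare with mirrored pixel
                if pixels[x, y] == pixels[width - 1 - x, y]:
                    print(f"[!] T{y} Frequency Match - Authorized")
            elif rgb == (0, 255, 0):  # SAVE_STATE: link to the successor frame
                self.next_frame = "whisperframe_B.png"
            elif rgb == (255, 255, 255):  # HALT: stop this thread only
                halted.add(y)
    return self.registers
```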
Key properties of this model:
Thread isolation. Each row owns its own register file {A, B, OUT}. A HALT in row 3 does not affect row 5. Threads can independently complete, stall, or trigger side-effects without coordination primitives.
Spatial instruction scheduling. The program author controls execution timing by placing opcodes at specific (x, y) coordinates. A SET at (0, 0) and ADD at (3, 0) implicitly serializes those operations within thread 0 without any branch or jump instruction.
Qubit-like superposition (logical). This is an analogy only. The pixels of a texture are fixed on disk, but the VM has no knowledge of the opcodes at column X until it reads that column, and reading it resolves the next step of every thread at once. The framing can be useful for AI systems that reason about multiple hypotheses in parallel.
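Thread isolation and spatial scheduling can be demonstrated on a small in-memory grid. This is a minimal sketch under stated assumptions: `run_grid` is a hypothetical name, the grid is a list of rows of RGB tuples rather than a PNG, registers start at zero, and only SET_REG, ADD_THREAD, and HALT are handled.

```python
def run_grid(grid):
    """Execute grid[y][x] column by column; return (registers, halted rows)."""
    height, width = len(grid), len(grid[0])
    registers = [{"A": 0, "B": 0, "OUT": 0} for _ in range(height)]
    halted = set()
    for x in range(width):        # clock cycle
        for y in range(height):   # thread
            if y in halted:
                continue          # a halted row ignores its remaining pixels
            r, g, b = grid[y][x]
            if (r, g, b) == (255, 255, 255):   # HALT
                halted.add(y)
            elif r == 10:                      # SET_REG
                registers[y]["A"] = b
            elif r == 20:                      # ADD_THREAD
                registers[y]["OUT"] = registers[y]["A"] + registers[y]["B"]
    return registers, halted


NOP = (0, 0, 0)
grid = [
    # Thread 0: SET at x=0, ADD at x=3; the NOP gap serializes them in time.
    [(10, 0, 7), NOP, NOP, (20, 0, 0)],
    # Thread 1: HALT at x=1, so the SET at x=2 never runs.
    [(10, 0, 9), (255, 255, 255), (10, 0, 1), NOP],
]
registers, halted = run_grid(grid)
print(registers, halted)
```

Thread 1 halting at cycle 1 leaves thread 0 free to reach its ADD at cycle 3, with no lock or barrier involved.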
Traditional VMs pass state through a heap or stack that persists in RAM. The Pyxel VM externalizes state into the next PNG in a named sequence. When a SAVE_STATE opcode fires, the VM serializes its current register file into the header pixels of the next frame before terminating.
This approach yields two security properties. First, there is no in-memory heap to dump between frames: state exists only in transit, encoded as pixels. Second, the frame sequence creates an ordered dependency chain: frame B cannot be correctly interpreted until frame A has executed, because B's initial register values are written into its header pixels only when A's SAVE_STATE fires.
Because frames are ordinary PNG files, they survive process restarts, device reboots, and OOM kills without any explicit persistence layer. The AI's "working memory" is the directory of brain frames. Deleting a frame is equivalent to targeted forgetting.
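The header layout is not specified in this document, so the following sketch assumes one: column 0 of the successor frame holds one pixel per thread, with (A, B, OUT) stored in the (R, G, B) channels. `write_header` and `read_header` are hypothetical names, and register values are assumed to fit in 8 bits.

```python
def write_header(grid, registers):
    """Return a copy of grid whose column 0 stores one (A, B, OUT) pixel per thread."""
    out = [list(row) for row in grid]  # copy rows so the input grid is untouched
    for y, regs in enumerate(registers):
        pixel = (regs["A"], regs["B"], regs["OUT"])
        if any(not 0 <= v <= 255 for v in pixel):
            raise ValueError(f"thread {y}: register value does not fit in 8 bits")
        out[y][0] = pixel
    return out


def read_header(grid, thread_count):
    """Recover the register files that write_header stored in column 0."""
    return [dict(zip(("A", "B", "OUT"), grid[y][0])) for y in range(thread_count)]


NOP = (0, 0, 0)
registers = [{"A": 7, "B": 0, "OUT": 7}, {"A": 9, "B": 0, "OUT": 0}]
blank_frame = [[NOP] * 3 for _ in range(2)]
next_frame = write_header(blank_frame, registers)
restored = read_header(next_frame, thread_count=2)
print(restored)
```

Under this layout the VM would have to skip the header column during execution, since a stored register value such as A = 10 would otherwise be read as a SET_REG opcode.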
Unbounded frame accumulation would exhaust device storage. The Visual Memory Manager applies a biologically inspired decay model to score and prune frames.
Each frame carries a base importance score s and an age in turns t. The effective score after t turns is:
Effective score: `effective_score = s / sqrt(t + 1)`

Prune threshold: `effective_score < 0.5`

Score assignment:

| Frame type | Base score s |
|---|---|
| Super-Admin / Auth frames | 20 (never pruned in practice) |
| Successful tool-call logic | 15 |
| Routine reasoning frames | 5 |
| Casual / low-signal turns | 1 |
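Solving s / sqrt(t + 1) < 0.5 for t gives t > 4s² − 1. A frame with base score 1 sits exactly at the threshold on turn 3 and is pruned on turn 4. A frame with base score 20 stays at or above the threshold through turn 1599 and is pruned on turn 1600, which is effectively permanent for on-device usage patterns.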
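The survival figures can be checked numerically. The sketch below uses the decay formula and the 0.5 threshold stated above. `effective_score` and `first_pruned_turn` are hypothetical helper names, and the loop assumes ages start at 0.

```python
import math

PRUNE_THRESHOLD = 0.5


def effective_score(s, t):
    """Decayed score of a frame with base score s after t turns."""
    return s / math.sqrt(t + 1)


def first_pruned_turn(s, threshold=PRUNE_THRESHOLD):
    """Smallest age t at which the frame falls strictly below the threshold."""
    t = 0
    while effective_score(s, t) >= threshold:
        t += 1
    return t


for s in (1, 5, 15, 20):
    print(f"base score {s:>2}: pruned at turn {first_pruned_turn(s)}")
```

For integer base scores the result equals 4s², which follows from rearranging the inequality s / sqrt(t + 1) < 0.5 into t + 1 > 4s².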
The manager performs a single linear scan of the brain_textures/ directory per compaction cycle, scoring and pruning in one pass. This mirrors the One-Pass Compaction pattern used in the ZDX Mobile AI JNI bridge and avoids the overhead of multi-pass garbage collection.
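```python
import math

def update_decay(self):
    # Iterate over a snapshot so prune_frame can delete entries safely.
    for frame, data in list(self.memory_metadata.items()):
        age = data["turns"]
        effective = data["score"] / math.sqrt(age + 1)
        if effective < 0.5:
            self.prune_frame(frame)  # below threshold: drop the frame
        else:
            data["turns"] += 1  # survivor ages by one turn
```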
The ZDX Pyxel VM implements passwordless authentication by embedding a challenge token as a spatial RGB pattern in a dedicated thread row (default: row 7). No string, hash, or key is stored anywhere in the conventional sense.
The FrequencyChallenge generator opens a base brain frame and overwrites the key row with the pattern (secret_freq, 0, x mod 255) where secret_freq is a random integer in [100, 200]. The resulting challenge_frame.png appears to a casual observer as a faint tinted row among noise.
The VERIFY_FREQ opcode compares pixel (x, y) to its mirror (W−1−x, y). A symmetric pattern — i.e., one where the challenge row reads identically forward and backward — triggers the authorization event. This requires the challenger to know both the secret frequency and the mirroring property, without either being stored as a retrievable value.
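The key-row pattern and the mirror test can be sketched on a single row of RGB tuples. The pattern `(secret_freq, 0, x mod 255)` comes from the description above, but the text does not say how the generated row becomes symmetric, so `mirror_left_half` is an assumption introduced for illustration. All function names here are hypothetical.

```python
import random


def make_key_row(width, secret_freq):
    """Key-row pattern as described: (secret_freq, 0, x mod 255) at each x."""
    return [(secret_freq, 0, x % 255) for x in range(width)]


def mirror_left_half(row):
    """Assumed step: reflect the left half onto the right so the row is a palindrome."""
    width = len(row)
    return [row[min(x, width - 1 - x)] for x in range(width)]


def is_mirror_symmetric(row):
    """True when pixel x equals pixel W-1-x for every x, as VERIFY_FREQ checks."""
    width = len(row)
    return all(row[x] == row[width - 1 - x] for x in range(width // 2))


secret_freq = random.randint(100, 200)  # randint is inclusive on both ends
raw_row = make_key_row(16, secret_freq)
symmetric_row = mirror_left_half(raw_row)
print(is_mirror_symmetric(raw_row), is_mirror_symmetric(symmetric_row))
```

The raw pattern fails the mirror test because its blue channel increases with x, which is why some symmetrizing step is assumed here.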
For super-admin authorization, the system requires a sequence of three daisy-chained frames:
| Frame | Role | Mechanism |
|---|---|---|
| Frame 1 | Base Frequency | Establishes secret_freq in key row |
| Frame 2 | Shifted Frequency | Applies a deterministic transform to secret_freq |
| Frame 3 | Response Validation | VERIFY_FREQ must match the predicted shift |
Brute-forcing this mechanism requires knowing the daisy-chain sequence, the key row index, the shift transform, and the mirroring validation — all simultaneously. No individual frame leaks enough information to reconstruct any other.
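One possible reading of the three-frame check is sketched below. The shift transform is not specified in this document, so it is passed in as a parameter. `placeholder_shift` is an invented stand-in, not the real transform, and `validate_chain` is a hypothetical name. The sketch also assumes the red channel of each key row carries the frequency.

```python
def is_mirrored(row):
    """True when the row reads identically forward and backward."""
    width = len(row)
    return all(row[x] == row[width - 1 - x] for x in range(width // 2))


def validate_chain(key_rows, shift):
    """Check three key rows (one per frame, in chain order) against a shift function."""
    if len(key_rows) != 3:
        return False
    base, shifted, response = key_rows
    base_freq = base[0][0]        # Frame 1 establishes the frequency
    expected = shift(base_freq)   # Frames 2 and 3 must carry the shifted value
    return (
        all(p[0] == base_freq for p in base)
        and all(p[0] == expected for p in shifted)
        and all(p[0] == expected for p in response)
        and is_mirrored(response)
    )


def placeholder_shift(freq):
    """Illustrative transform only; the actual transform is not given in the text."""
    return (freq + 37) % 256


def flat_row(freq, width=8):
    """Uniform test row whose red channel holds freq."""
    return [(freq, 0, 0)] * width


good = [flat_row(120), flat_row(157), flat_row(157)]
bad = [flat_row(120), flat_row(157), flat_row(158)]
print(validate_chain(good, placeholder_shift), validate_chain(bad, placeholder_shift))
```

Passing the transform as a function keeps the validation logic independent of whichever shift the deployment actually uses.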
State never resides in addressable RAM between frame boundaries. An attacker who dumps process memory mid-execution recovers only the current column's register values — a snapshot of one clock cycle with no context about prior or future states.
Authentication credentials are indistinguishable from texture noise to a human observer and to any tool that does not possess the reverse opcode map. A forensic analyst examining the brain_textures/ directory sees an image management system — a behavior pattern common to any media application.
When integrated with deframe.py (OpenCV frame capture), the VM can ingest challenge frames photographed from a physical display or printed medium. The authentication token exists only in the physical world during the challenge window — it is never transmitted over a network and cannot be intercepted by software.
The system assumes the opcode map (REVERSE_MAP) is kept secret. If an adversary recovers the map, frame content becomes readable. Additionally, the current VERIFY_FREQ implementation compares a pixel to its mirror within the same row; a more robust implementation should incorporate cross-frame comparison and a keyed hash of the frequency value.
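The keyed-hash suggestion could look like the sketch below, which uses HMAC-SHA256 from the Python standard library. The message layout (frequency byte followed by a 4-byte frame index), the truncation to three bytes so the tag fits one pixel, and the names `freq_tag` and `verify_tag` are all assumptions. How the key is provisioned is outside the scope of this sketch.

```python
import hashlib
import hmac


def freq_tag(key, secret_freq, frame_index):
    """Derive a one-pixel RGB tag from the frequency and the frame position."""
    message = bytes([secret_freq]) + frame_index.to_bytes(4, "big")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return tuple(digest[:3])  # truncate to 24 bits to fit a single pixel


def verify_tag(key, secret_freq, frame_index, pixel):
    """Constant-time comparison of an observed pixel against the expected tag."""
    expected = bytes(freq_tag(key, secret_freq, frame_index))
    return hmac.compare_digest(expected, bytes(pixel))


key = b"demo-key-not-for-production"
tag = freq_tag(key, secret_freq=150, frame_index=1)
tampered = ((tag[0] + 1) % 256, tag[1], tag[2])
print(verify_tag(key, 150, 1, tag), verify_tag(key, 150, 1, tampered))
```

A 24-bit tag is short enough to guess by brute force, so a practical design would likely spread a longer tag across several pixels. Binding the frame index into the message is what ties the check to a position in the chain.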
The Pyxel VM is positioned as the proprietary high-order reasoning layer above the open-source inference core of ZDX Mobile AI (llama.cpp + Vulkan, 28.7 tok/s on Samsung Galaxy S25 Ultra).
| Layer | Technology | Responsibility |
|---|---|---|
| Open Core | llama.cpp / LlamaEngine.kt | 1D token inference, on-device, Vulkan GPU |
| DER Memory | Scorer (Qwen 0.5B) + KV cache | Linear conversation scoring and eviction |
| ZDX Pyxel VM | ParallelPyxelVM + VisualMemoryManager | 2D parallel reasoning, persistent brain frames |
| Auth Layer | FrequencyChallenge + AUTH1 | Passwordless super-admin, OIDC federation |
The Scorer Service determines which linear memory segments are significant enough to be compiled into a brain frame via the ThoughtCompiler. This creates a one-way elevation path: important 1D memories become 2D parallel logic; low-value memories decay and are pruned. The AI's long-term memory is therefore not a growing database but a curated, self-compacting visual cortex.
The ZDX Pyxel VM demonstrates that a fully functional, parallel, and secure virtual machine can be implemented using the PNG image format as its storage and instruction medium. The architecture achieves spatial parallelism without threading primitives, persistent state without a heap, and authentication without a password — all within a system that appears to external observers as ordinary image file management.
Expanded ISA. BRANCH, CMP, and JUMP opcodes would allow conditional logic without requiring the program author to pre-layout all branches spatially.
Cross-frame VERIFY_FREQ. Extending the frequency check to compare pixel patterns across two consecutive frames closes the single-frame information leak in the current implementation.
deframe.py integration. Full pipeline from OpenCV camera capture → frame decode → VM ingestion would enable live physical token authentication in the ZDX Mobile AI app.
Distributed brain frames. Storing the daisy-chain across multiple devices in a ZDX federated compute network would allow collaborative reasoning without any single node holding a complete program state.