
Hardware Targeting

Overview

The targeting suite is the core intelligence of JumpStation. It answers a single question:

Given this AI model and this quality threshold, what is the smallest hardware class it can run on?

This question sounds simple. In practice it requires measuring the model’s computational requirements, simulating its behavior at reduced precision across multiple hardware classes, and selecting the minimum viable target. The targeting suite automates all of this.

The output is a target declaration — a machine-readable specification of the minimum hardware class, required inference backend, and recommended weight precision — which is embedded in the JumpBundle manifest and enforced by the runtime at install time.


The Targeting Workflow

Trained model (ONNX, TFLite, PyTorch)
        │
        ▼
┌───────────────────────────────┐
│  Profiler                     │
│  - Count FLOPs                │
│  - Measure peak RAM           │
│  - Run inference passes       │
│  - Measure quantization error │
│    at each precision level    │
└───────────────┬───────────────┘
                │  requirement vector
                ▼
┌───────────────────────────────┐
│  Target Selector              │
│  - Load device catalog        │
│  - Match requirement vector   │
│    against each device class  │
│  - Select minimum viable      │
│    hardware class             │
└───────────────┬───────────────┘
                │  target declaration
                ▼
        Distillation Pipeline
        (see distillation.md)

Profiler

Module: core/targeting/profiler.py

The profiler measures a model's concrete computational requirements. Rather than estimating from the architecture alone, it runs actual inference passes and measures directly.

What it measures

Metric           How measured
---------------  ---------------------------------------------------------------------------
FLOPs            Counted from the model's computational graph (ONNX or TFLite graph traversal)
Peak RAM (KB)    Measured via process memory instrumentation during a live inference pass
Latency (CPU)    Wall-clock time for N inference passes on the JumpStation CM5 CPU
Latency (DX-M1)  Wall-clock time for N passes on the DX-M1 accelerator (Turbo only)
Throughput       Inferences per second at batch size 1
INT8 error       Mean absolute output difference between FP32 and INT8 inference
INT4 error       Mean absolute output difference between FP32 and INT4 inference
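The two error metrics are mean absolute output differences averaged over calibration samples. A minimal sketch, assuming the FP32 and quantized output arrays are already in hand (the helper name and signature are hypothetical, not the profiler's actual API):

```python
import numpy as np

def quantization_error(fp32_outputs, quant_outputs):
    """Mean absolute output difference, averaged over calibration samples.

    Hypothetical helper: each argument is a list of per-sample output arrays.
    """
    per_sample = [float(np.mean(np.abs(a - b)))
                  for a, b in zip(fp32_outputs, quant_outputs)]
    return float(np.mean(per_sample))

# Toy data: two calibration samples with 50-way classifier outputs,
# with the quantized pass simulated as FP32 plus small noise
rng = np.random.default_rng(0)
fp32 = [rng.random(50) for _ in range(2)]
int8_sim = [o + rng.normal(0.0, 0.003, 50) for o in fp32]
err = quantization_error(fp32, int8_sim)
```

A value like `err ≈ 0.003` would correspond to the `int8_error_mean` field in the requirement vector below.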

Requirement vector

The profiler outputs a RequirementVector — a structured record that captures all measurements and becomes the input to the target selector:

{
  "flops": 15200000,
  "peak_ram_kb": 148,
  "latency_cpu_ms": 82.4,
  "latency_dxm1_ms": 1.2,
  "throughput_fps": 12.1,
  "int8_error_mean": 0.0031,
  "int4_error_mean": 0.042,
  "input_shape": [1, 96, 96, 3],
  "output_shape": [1, 50]
}
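One way to model such a record is a dataclass that serializes to the JSON shown above. This is an illustrative sketch, not the actual structure in profiler.py:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class RequirementVector:
    """Illustrative sketch of the profiler's output record."""
    flops: int
    peak_ram_kb: int
    latency_cpu_ms: float
    latency_dxm1_ms: float
    throughput_fps: float
    int8_error_mean: float
    int4_error_mean: float
    input_shape: list
    output_shape: list

    def to_json(self) -> str:
        # Serialize for handoff to the target selector
        return json.dumps(asdict(self), indent=2)

rv = RequirementVector(
    flops=15_200_000, peak_ram_kb=148, latency_cpu_ms=82.4,
    latency_dxm1_ms=1.2, throughput_fps=12.1,
    int8_error_mean=0.0031, int4_error_mean=0.042,
    input_shape=[1, 96, 96, 3], output_shape=[1, 50],
)
```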

The role of the DX-M1

The DX-M1 M.2 accelerator in the Turbo is central to fast profiling. Running quantization sensitivity sweeps (comparing INT8 and INT4 error across calibration samples) on CPU can take minutes to hours. The DX-M1 reduces this to seconds by running the quantization simulation across calibration samples in parallel.

The Turbo RK (RK3588S2 + DX-M1) adds the Rockchip on-chip NPU alongside the DX-M1, yielding approximately 31 combined TOPS and enabling native profiling for RK-silicon deployment targets without cross-architecture simulation.

The Turbo is not required for targeting: the suite runs on the base JumpStation (CM5 or RK3588S2) with CPU-only profiling. The DX-M1 simply makes rapid iteration practical.


Target Selector

Module: core/targeting/target_selector.py

The target selector scans the device catalog (devices/*/profile.json) and finds the device class with the lowest ai_compute specification that satisfies all constraints in the requirement vector.

Selection algorithm

For each device in the catalog, the selector checks:

  1. RAM: device.ram_mb * 1024 >= requirements.peak_ram_kb * safety_margin
  2. Inference backend: the device must support an inference backend compatible with the model’s framework
  3. Quantization viability: if the device only supports INT8 or INT4, the measured quantization error must be below the caller-specified tolerance threshold
  4. Latency: if a latency budget is specified, simulated latency on the target device must fall within it

Devices are scored and sorted by total cost (a weighted function of TOPS, RAM, and price class). The lowest-cost passing device is the minimum viable target.
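The four checks and the cost-based ranking can be sketched as a filter-then-minimize loop. The catalog field names used here (`ram_mb`, `backends`, `min_precision`, `est_latency_ms`, `cost_score`) are assumptions for illustration, not the actual target_selector.py schema:

```python
def select_target(devices, req, tolerance=0.02, safety_margin=1.5,
                  latency_budget_ms=None):
    """Return the lowest-cost device satisfying all constraints, else None.

    Sketch only: field names are assumed, not the real catalog schema.
    """
    def passes(dev):
        # 1. RAM with safety margin (catalog RAM in MB, requirement in KB)
        if dev["ram_mb"] * 1024 < req["peak_ram_kb"] * safety_margin:
            return False
        # 2. Inference backend compatible with the model's framework
        if req["framework"] not in dev["backends"]:
            return False
        # 3. Quantization viability at the device's lowest-common precision
        if dev["min_precision"] == "int8" and req["int8_error_mean"] > tolerance:
            return False
        if dev["min_precision"] == "int4" and req["int4_error_mean"] > tolerance:
            return False
        # 4. Optional latency budget
        if latency_budget_ms is not None and dev["est_latency_ms"] > latency_budget_ms:
            return False
        return True

    viable = [d for d in devices if passes(d)]
    return min(viable, key=lambda d: d["cost_score"]) if viable else None
```

With a toy two-device catalog, the cheaper device wins unless a latency budget rules it out, which mirrors the "next tier up if latency unacceptable" behavior in the example output below.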

The full evaluation order, cheapest to most capable:

uno (ATmega328P) → pico → esp32 → uno_q → jumpstation / jumpstation_rk
  → turbo / turbo_rk → orion_o6 → orion (O9)

For user-facing applications (those requiring a display and Linux runtime), the floor is uno_q. The selector skips uno, pico, and esp32 for bundles that declare UI requirements.

Quality tolerance

The caller specifies a quality tolerance — the maximum acceptable degradation from the FP32 baseline. The default is 2% mean absolute error. If no device can satisfy the model’s requirements within that tolerance, the selector returns the minimum-cost device that comes closest, with a warning.
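The closest-device fallback might look like the following sketch, which ranks devices by how far the measured error overshoots the tolerance and breaks ties on cost. All names here are hypothetical:

```python
import warnings

def closest_fallback(devices, req, tolerance=0.02):
    """Hypothetical fallback when no device meets the quality tolerance:
    pick the device whose required precision's measured error overshoots
    the tolerance by the least, breaking ties on cost."""
    def overshoot(dev):
        err = (req["int8_error_mean"] if dev["min_precision"] == "int8"
               else req["int4_error_mean"])
        return max(0.0, err - tolerance)

    best = min(devices, key=lambda d: (overshoot(d), d["cost_score"]))
    if overshoot(best) > 0:
        warnings.warn(
            f"No device meets tolerance {tolerance}; "
            f"closest is {best['name']} (overshoot {overshoot(best):.4f})"
        )
    return best
```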

Example output

Target analysis complete.
  Model: plant_classifier_v2.onnx
  FLOPs: 15.2M
  Peak RAM: 148 KB
  INT8 error: 0.31% (within tolerance)

  Minimum viable target: pico
    Backend: tflite_micro
    Precision: int8
    Estimated latency: ~45ms (RP2040 @ 133MHz)

  Next tier up if latency unacceptable: esp32
    Estimated latency: ~28ms (ESP32 @ 240MHz)

Requirement Vector → Bundle Manifest

After targeting is complete, the profiler and selector results are embedded in the model object of the JumpBundle manifest:

"model": {
  "framework": "tflite_micro",
  "precision": "int8",
  "weights": "model/weights.tflite",
  "flops": 15200000,
  "peak_ram_kb": 148,
  "latency_ms_turbo": 1.2,
  "targeting_version": "0.1.0"
}

This means every deployed JumpBundle carries a complete audit trail of how it was targeted and what it requires.
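The embedding step amounts to merging the two result records into the manifest's model object. Field names follow the example above; the merge helper itself is an illustrative assumption, not part of the JumpStation codebase:

```python
def embed_targeting(manifest: dict, profile: dict, target: dict) -> dict:
    """Merge profiler and selector results into the manifest's "model" object.

    Sketch only: the helper and its argument shapes are assumptions.
    """
    model = manifest.setdefault("model", {})
    model.update({
        "framework": target["backend"],
        "precision": target["precision"],
        "flops": profile["flops"],
        "peak_ram_kb": profile["peak_ram_kb"],
        "targeting_version": "0.1.0",
    })
    return manifest

bundle = embed_targeting(
    {"model": {"weights": "model/weights.tflite"}},
    {"flops": 15_200_000, "peak_ram_kb": 148},
    {"backend": "tflite_micro", "precision": "int8"},
)
```

Existing fields such as the weights path are preserved; targeting results are layered on top.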


Running the Targeting Suite

# Profile a model
python core/targeting/profiler.py \
  --model ./my_model.onnx \
  --calibration-data ./data/calibration_samples/ \
  --output ./profile.json

# Select a target
python core/targeting/target_selector.py \
  --profile ./profile.json \
  --tolerance 0.02 \
  --output ./target.json

# View result
cat ./target.json

Further Reading