The targeting suite is the core intelligence of JumpStation. It answers a single question:
Given this AI model and this quality threshold, what is the smallest hardware class it can run on?
This question sounds simple. In practice it requires measuring the model’s computational requirements, simulating its behavior at reduced precision across multiple hardware classes, and selecting the minimum viable target. The targeting suite automates all of this.
The output is a target declaration — a machine-readable specification of the minimum hardware class, required inference backend, and recommended weight precision — which is embedded in the JumpBundle manifest and enforced by the runtime at install time.
```
Trained model (ONNX, TFLite, PyTorch)
                │
                ▼
┌───────────────────────────────┐
│ Profiler                      │
│  - Count FLOPs                │
│  - Measure peak RAM           │
│  - Run inference passes       │
│  - Measure quantization error │
│    at each precision level    │
└───────────────┬───────────────┘
                │ requirement vector
                ▼
┌───────────────────────────────┐
│ Target Selector               │
│  - Load device catalog        │
│  - Match requirement vector   │
│    against each device class  │
│  - Select minimum viable      │
│    hardware class             │
└───────────────┬───────────────┘
                │ target declaration
                ▼
        Distillation Pipeline
        (see distillation.md)
```
Module: core/targeting/profiler.py
The profiler measures a model’s concrete computational requirements. It does not estimate from architecture — it runs actual inference passes and measures.
| Metric | How |
|---|---|
| FLOPs | Counted from the model’s computational graph (ONNX or TFLite graph traversal) |
| Peak RAM (KB) | Measured via process memory instrumentation during a live inference pass |
| Latency (CPU) | Wall-clock time for N inference passes on the JumpStation CM5 CPU |
| Latency (DX-M1) | Wall-clock time for N passes on the DX-M1 accelerator (Turbo only) |
| Throughput | Inferences per second at batch size 1 |
| INT8 error | Mean absolute output difference between FP32 and INT8 inference |
| INT4 error | Mean absolute output difference between FP32 and INT4 inference |
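The INT8/INT4 error rows above can be approximated without quantized hardware by fake-quantizing values through a symmetric integer grid. This is a minimal sketch of that idea; real profiling runs an actual quantized backend, and `fake_quantize` / `mean_abs_error` are illustrative names, not the profiler's API:

```python
import numpy as np

def fake_quantize(x: np.ndarray, bits: int) -> np.ndarray:
    """Round-trip x through a symmetric integer grid to simulate
    reduced-precision inference (a stand-in for a true INT8/INT4 run)."""
    qmax = 2 ** (bits - 1) - 1          # 127 for INT8, 7 for INT4
    scale = float(np.max(np.abs(x))) / qmax
    if scale == 0.0:
        return x.copy()                 # all-zero tensor: nothing to quantize
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def mean_abs_error(fp32_out: np.ndarray, bits: int) -> float:
    """Mean absolute difference between FP32 outputs and their
    fake-quantized counterparts, as in the error metrics above."""
    return float(np.mean(np.abs(fp32_out - fake_quantize(fp32_out, bits))))
```

As expected, the coarser INT4 grid produces a larger mean error than INT8 on the same outputs.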
The profiler outputs a RequirementVector — a structured record that captures all measurements and becomes the input to the target selector:
```json
{
  "flops": 15200000,
  "peak_ram_kb": 148,
  "latency_cpu_ms": 82.4,
  "latency_dxm1_ms": 1.2,
  "throughput_fps": 12.1,
  "int8_error_mean": 0.0031,
  "int4_error_mean": 0.042,
  "input_shape": [1, 96, 96, 3],
  "output_shape": [1, 50]
}
```
The DX-M1 M.2 accelerator in the Turbo is central to fast profiling. Running quantization sensitivity sweeps (comparing INT8 and INT4 error across calibration samples) on CPU takes minutes to hours. The DX-M1 reduces this to seconds by vectorizing the quantization simulation across the device catalog.
The Turbo RK (RK3588S2 + DX-M1) adds the Rockchip on-chip NPU alongside the DX-M1, yielding approximately 31 combined TOPS and enabling native profiling for RK-silicon deployment targets without cross-architecture simulation.
The Turbo is not required for targeting: the suite also runs on the base JumpStation (CM5 or RK3588S2) with CPU-only profiling. The DX-M1 simply makes rapid iteration practical.
Module: core/targeting/target_selector.py
The target selector scans the device catalog (devices/*/profile.json) and finds the device class with the lowest ai_compute specification that satisfies all constraints in the requirement vector.
For each device in the catalog, the selector checks:
```
device.ram_mb * 1024 >= requirements.peak_ram_kb * safety_margin
```

Devices are scored and sorted by total cost (a weighted function of TOPS, RAM, and price class). The lowest-cost passing device is the minimum viable target.
The full evaluation order, cheapest to most capable:
```
uno (ATmega328P) → pico → esp32 → uno_q → jumpstation / jumpstation_rk
  → turbo / turbo_rk → orion_o6 → orion (O9)
```
For user-facing applications (those requiring a display and Linux runtime), the floor is uno_q. The selector skips uno, pico, and esp32 for bundles that declare UI requirements.
The caller specifies a quality tolerance — the maximum acceptable degradation from the FP32 baseline. The default is 2% mean absolute error. If no device can satisfy the model’s requirements within that tolerance, the selector returns the minimum-cost device that comes closest, with a warning.
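Combining the RAM check, the UI floor, and the quality tolerance, the selection pass might be sketched as follows. The `select_target` helper, the device dict fields (`name`, `ram_mb`, `cost`), and the 1.5x safety margin are illustrative assumptions, not the actual `target_selector.py` API:

```python
import warnings

def select_target(devices, req, tolerance=0.02, needs_ui=False,
                  safety_margin=1.5):
    """Return (device, precision) for the minimum viable target.
    Field names and the safety margin are illustrative."""
    # Pick the cheapest precision whose measured error is within tolerance.
    if req["int4_error_mean"] <= tolerance:
        precision = "int4"
    elif req["int8_error_mean"] <= tolerance:
        precision = "int8"
    else:
        precision = "fp32"
        warnings.warn("no quantized precision meets the quality tolerance")

    candidates = []
    for dev in devices:
        # UI bundles skip MCU-class boards: the floor is uno_q.
        if needs_ui and dev["name"] in {"uno", "pico", "esp32"}:
            continue
        # RAM constraint with headroom, as in the check above.
        if dev["ram_mb"] * 1024 >= req["peak_ram_kb"] * safety_margin:
            candidates.append(dev)

    if not candidates:
        # No device passes: fall back to the closest one, with a warning.
        warnings.warn("no device satisfies the requirement vector")
        return max(devices, key=lambda d: d["ram_mb"]), precision

    # Lowest total cost among passing devices = minimum viable target.
    return min(candidates, key=lambda d: d["cost"]), precision
```

With the example requirement vector from the profiler section, a 148 KB peak-RAM model lands on the smallest board that still clears the margin, and setting `needs_ui=True` lifts the floor to uno_q.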
```
Target analysis complete.

Model: plant_classifier_v2.onnx
FLOPs: 15.2M
Peak RAM: 148 KB
INT8 error: 0.31% (within tolerance)

Minimum viable target: pico
  Backend: tflite_micro
  Precision: int8
  Estimated latency: ~45ms (RP2040 @ 133MHz)

Next tier up if latency unacceptable: esp32
  Estimated latency: ~28ms (ESP32 @ 240MHz)
```
After targeting is complete, the profiler and selector results are embedded in the model object of the JumpBundle manifest:
```json
"model": {
  "framework": "tflite_micro",
  "precision": "int8",
  "weights": "model/weights.tflite",
  "flops": 15200000,
  "peak_ram_kb": 148,
  "latency_ms_turbo": 1.2,
  "targeting_version": "0.1.0"
}
```
This means every deployed JumpBundle carries a complete audit trail of how it was targeted and what it requires.
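The embedding step can be sketched as a simple merge into the manifest. The `embed_targeting` helper and the keys of the `target` dict are assumptions for illustration; the real pipeline writes the manifest fields shown above:

```python
def embed_targeting(manifest: dict, req: dict, target: dict,
                    targeting_version: str = "0.1.0") -> dict:
    """Merge profiler and selector results into the manifest's
    "model" object (helper name and target keys are illustrative)."""
    model = manifest.setdefault("model", {})
    model.update({
        "framework": target["backend"],
        "precision": target["precision"],
        "flops": req["flops"],
        "peak_ram_kb": req["peak_ram_kb"],
        "targeting_version": targeting_version,
    })
    return manifest
```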
```bash
# Profile a model
python core/targeting/profiler.py \
  --model ./my_model.onnx \
  --calibration-data ./data/calibration_samples/ \
  --output ./profile.json

# Select a target
python core/targeting/target_selector.py \
  --profile ./profile.json \
  --tolerance 0.02 \
  --output ./target.json

# View result
cat ./target.json
```