RFD 027 idea

Training vs Inference Workload Discrimination

AUTHOR Amodo CREATED 2024-11-27
verification, side-channel, inference

The Idea

Use out-of-band signals (power draw, EM emissions, network traffic patterns) to distinguish training workloads from inference workloads, without trusting software attestation. This would enable verification regimes in which a cluster is licensed for inference only and an operator's attempt to run unauthorized training on it can be detected.

Key observables:

  • Power: Training tends to draw sustained high power; inference load tends to be bursty
  • Network: Training shows AllReduce patterns across many nodes; inference traffic is request-response
  • Memory: Training updates weights; inference only reads them (possibly detectable via cache/memory access patterns?)
  • Duration: Training runs last hours to days; individual inference requests complete in seconds
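
As a rough illustration of the power observable above, the following sketch summarizes a sampled power trace and applies a simple threshold rule. It is not the method used in this RFD: the function names, the feature set, the 80% "high power" cutoff, and the decision thresholds are all hypothetical placeholders that would need calibration against real measurements.

```python
from statistics import mean, pstdev


def power_features(samples_w, tdp_w):
    """Summarize a power trace (watts, fixed sampling interval).

    tdp_w is the assumed peak board power; 0.8 * tdp_w is an
    illustrative cutoff for "high power", not a measured value.
    """
    if not samples_w:
        raise ValueError("empty power trace")
    avg = mean(samples_w)
    return {
        # Average draw as a fraction of assumed peak power.
        "mean_util": avg / tdp_w,
        # Coefficient of variation: low for sustained load, high for bursts.
        "cv": pstdev(samples_w) / avg if avg > 0 else 0.0,
        # Fraction of samples spent near peak power.
        "high_frac": sum(s >= 0.8 * tdp_w for s in samples_w) / len(samples_w),
    }


def looks_like_training(features, min_high_frac=0.9, max_cv=0.15):
    """Hypothetical rule: sustained, low-variance, near-peak draw."""
    return features["high_frac"] >= min_high_frac and features["cv"] <= max_cv


# Synthetic traces for illustration only (not measured data).
sustained = [680, 690, 700, 695, 685] * 20
bursty = [120, 130, 650, 110, 125, 640, 115, 120] * 10

sustained_verdict = looks_like_training(power_features(sustained, tdp_w=700))
bursty_verdict = looks_like_training(power_features(bursty, tdp_w=700))
```

A rule this simple is exactly what the "inference-shaped training" question under Open Questions is meant to stress: a practical discriminator would likely need to combine power with the network and duration observables.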

Why It Matters

RFD 017 (Inference-Only Verification Package) assumes you can verify inference-only compliance. This RFD addresses how to make that determination from observable signals, which is the core technical challenge.

Implementation Status (Amodo)

  • Collecting side-channel data from research servers
  • Testing power measurement, EM probes, network packet sniffing
  • Building demo for policy stakeholders (“workload attestation”)
  • Planning to publish dataset for verification researchers

Open Questions

  • What’s the minimum observation time to distinguish with high confidence?
  • Can adversaries design “inference-shaped” training (small batches, bursty)?
  • How does MoE routing affect signatures (sparse expert activation)?
  • Can fine-tuning be distinguished both from inference and from full training?
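
One possible way to frame the minimum-observation-time question is as a sequential hypothesis test: keep observing until the evidence crosses a confidence threshold, then count how many samples that took. The sketch below uses Wald's sequential probability ratio test on windowed mean power, under the strong and purely illustrative assumption that each workload class produces Gaussian readings with a known mean and a shared standard deviation. The function name and every parameter value are hypothetical, not results from this RFD.

```python
import math


def sprt_decide(samples, mu_inf, mu_train, sigma, alpha=0.01, beta=0.01):
    """Wald SPRT between 'inference' (mean mu_inf) and 'training' (mean mu_train).

    alpha: tolerated rate of flagging inference as training.
    beta:  tolerated rate of missing training.
    Returns (decision, samples_used); decision is None if undecided.
    """
    upper = math.log((1 - beta) / alpha)   # cross this: decide "training"
    lower = math.log(beta / (1 - alpha))   # cross this: decide "inference"
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment for equal-variance Gaussians.
        llr += (mu_train - mu_inf) / sigma**2 * (x - (mu_train + mu_inf) / 2)
        if llr >= upper:
            return "training", n
        if llr <= lower:
            return "inference", n
    return None, len(samples)


# Synthetic, noise-free readings for illustration only.
decision, n_used = sprt_decide([650] * 10, mu_inf=300, mu_train=650, sigma=150)
```

The number of samples consumed before a decision, multiplied by the window length, gives an observation-time estimate under these assumptions. Overlapping class distributions or adversarially shaped workloads would lengthen it or break the model entirely.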

References

  • Related: RFD 017 (Inference-Only Verification Package)
  • Related: RFD 001 (Side-Channel Leakage from LLM Inference)
  • Related: RFD 012 (Analog Sensors for Compute Verification)