Portable Signal Analyzer Architecture

Version: v0.1.0 Last updated: 2026-04-04

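Note: This version is a historical document reorganized to follow the documentation standard; the body retains the original English content for now.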

This document describes a domain-specific product architecture that sits under the higher-level framework design in 多模态分析框架.md.

1. Purpose

This document captures a product and architecture direction for a portable signal analyzer inspired by a "tricorder"-style workflow:

collect signals from multiple physical sources
detect, demodulate, decode, and frame them using deterministic algorithms
apply protocol analysis on framed data
use AI as an assistant, orchestrator, and experiment controller

The core principle is:

AI should not replace the actual signal-processing and protocol-analysis engine. Deterministic algorithms should do the decoding work. AI should control experiments, compare results, explain outcomes, and choose the next action.

2. Product Positioning

The target device is not just a packet sniffer and not just an SDR receiver. It is a multi-stage analysis platform for:

RF and non-RF signal acquisition
physical-layer and link-layer recovery
protocol identification and interpretation
guided diagnostics and anomaly explanation

Practical examples include:

identifying unknown digital bursts
recovering framed traffic from noisy captures
decoding standard or proprietary protocols
presenting operator-friendly summaries and next-step suggestions

3. Wireshark Reuse Boundary

Wireshark is useful, but only for the upper half of the stack.

Wireshark is strong at:

framed packet dissection
protocol tree generation
display filtering
reassembly, statistics, and follow-stream style analysis
export and structured protocol interpretation

Wireshark is not the right tool for:

raw RF analysis
blind modulation recognition
carrier recovery
symbol timing recovery
unknown physical-layer reconstruction

The practical reuse boundary is:

collect raw signal data
perform DSP, demodulation, bit recovery, and framing
convert recovered traffic into packets or events
hand those results to Wireshark-related tooling such as:
- pcap / pcapng
- tshark
- sharkd
- custom dissectors where appropriate

This makes Wireshark a protocol-analysis backend, not the full analyzer brain.
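
As a rough illustration of the hand-off step, the sketch below serializes recovered frames into a classic pcap byte buffer that tshark or Wireshark can open. The names `RecoveredFrame` and `buildPcap` are hypothetical, and the choice of link type 147 (LINKTYPE_USER0, a private-use value) is an assumption made for the example; a real build might prefer pcapng or a registered link type paired with a custom dissector.

```typescript
// Hypothetical shape of a frame produced by the framing stage.
type RecoveredFrame = { timestampUs: number; payload: Uint8Array };

const PCAP_MAGIC = 0xa1b2c3d4; // classic pcap with microsecond timestamps
const LINKTYPE_USER0 = 147;    // private-use link type (assumed choice)

// Serialize frames into a little-endian classic pcap buffer.
// Assumes every payload is shorter than the 65535-byte snap length.
function buildPcap(recovered: RecoveredFrame[], linkType: number = LINKTYPE_USER0): Uint8Array {
  const total = 24 + recovered.reduce((n, f) => n + 16 + f.payload.length, 0);
  const buf = new Uint8Array(total);
  const view = new DataView(buf.buffer);

  // Global file header (24 bytes).
  view.setUint32(0, PCAP_MAGIC, true);
  view.setUint16(4, 2, true);      // version major
  view.setUint16(6, 4, true);      // version minor
  view.setInt32(8, 0, true);       // thiszone
  view.setUint32(12, 0, true);     // sigfigs
  view.setUint32(16, 65535, true); // snaplen
  view.setUint32(20, linkType, true);

  let offset = 24;
  for (const f of recovered) {
    // Per-record header (16 bytes) followed by the raw payload bytes.
    view.setUint32(offset, Math.floor(f.timestampUs / 1000000), true);
    view.setUint32(offset + 4, f.timestampUs % 1000000, true);
    view.setUint32(offset + 8, f.payload.length, true);  // captured length
    view.setUint32(offset + 12, f.payload.length, true); // original length
    buf.set(f.payload, offset + 16);
    offset += 16 + f.payload.length;
  }
  return buf;
}
```

The resulting bytes can be written to a `.pcap` file and opened with tshark; with a private-use link type the payload appears as undissected data until a dissector is bound to that link type.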

4. System Architecture

The system should be split into clear layers.

4.1 Acquisition Layer

Inputs may include:

IQ streams
IF or audio data
logic-level captures
UART / SPI / I2C / CAN buses
BLE / Wi-Fi / Ethernet mirrored traffic
file-based replay samples

This layer should normalize access to multiple hardware front-ends and record:

timestamp
sample rate
center frequency
gain / front-end state
source identity
capture duration

4.2 Signal Workspace

The workspace is the canonical store for both raw and intermediate data. It should retain:

raw samples
derived features
intermediate bitstreams
framed outputs
experiment metadata
scores and failure reasons

This is essential for reproducibility, offline replay, regression testing, and AI-driven iteration.

4.3 DSP / Demodulation Pipeline

This layer performs actual signal recovery. Typical module categories:

preprocessing
- DC removal
- AGC
- filtering
- resampling
detection
- energy detection
- burst detection
- coarse frequency estimation
synchronization
- carrier recovery
- symbol clock recovery
- preamble / sync-word search
demodulation
- OOK / ASK
- FSK / GFSK
- PSK / QPSK
- OFDM-family or chirp-style paths when supported
bit-domain processing
- hard or soft decision
- de-whitening
- de-interleaving
- FEC decoding
- CRC validation
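
To make two of these module categories concrete, here is a minimal sketch of DC removal and moving-average energy detection. It assumes real-valued samples in a plain `number[]` (a real build would more likely use typed arrays of IQ data), and the names `removeDc`, `detectBursts`, and `BurstSpan` are hypothetical.

```typescript
// Subtract the mean so a constant offset does not bias energy detection.
function removeDc(samples: number[]): number[] {
  if (samples.length === 0) return [];
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  return samples.map((s) => s - mean);
}

type BurstSpan = { start: number; end: number }; // end index is exclusive

// Report spans where the moving-average energy exceeds a threshold.
function detectBursts(samples: number[], windowLen: number, threshold: number): BurstSpan[] {
  const spans: BurstSpan[] = [];
  let energy = 0; // running sum of squared samples inside the window
  let start = -1; // start index of the burst in progress, or -1
  for (let i = 0; i < samples.length; i++) {
    energy += samples[i] * samples[i];
    if (i >= windowLen) energy -= samples[i - windowLen] * samples[i - windowLen];
    const avg = energy / Math.min(i + 1, windowLen);
    if (avg > threshold && start < 0) start = i;
    if (avg <= threshold && start >= 0) {
      spans.push({ start, end: i });
      start = -1;
    }
  }
  if (start >= 0) spans.push({ start, end: samples.length });
  return spans;
}
```

Because the detector averages over a window, reported spans lag the true burst edges by up to one window length; a downstream synchronization stage would be expected to refine the boundaries.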

4.4 Frame Builder

This layer transforms bitstreams into candidate frames or packets. It is the boundary between "signal recovery" and "protocol interpretation".

Responsibilities:

frame boundary detection
fixed/variable length frame assembly
checksum / CRC validation
field boundary estimation
event extraction
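
Frame boundary detection and fixed-length assembly can be sketched as a sync-word scan over a recovered bitstream. The names `CandidateFrame` and `frameBySyncWord` are hypothetical, and the bit-error tolerance parameter is an illustrative assumption rather than a stated requirement.

```typescript
type CandidateFrame = { bitOffset: number; payloadBits: number[] };

// Scan a bitstream for a sync word and cut a fixed-length payload after each hit.
// `maxBitErrors` allows a few mismatched sync bits on noisy captures.
function frameBySyncWord(
  bits: number[],
  syncWord: number[],
  payloadLen: number,
  maxBitErrors: number = 0,
): CandidateFrame[] {
  const found: CandidateFrame[] = [];
  let i = 0;
  while (i + syncWord.length + payloadLen <= bits.length) {
    let errors = 0;
    for (let j = 0; j < syncWord.length && errors <= maxBitErrors; j++) {
      if (bits[i + j] !== syncWord[j]) errors++;
    }
    if (errors <= maxBitErrors) {
      const start = i + syncWord.length;
      found.push({ bitOffset: i, payloadBits: bits.slice(start, start + payloadLen) });
      i = start + payloadLen; // resume after the frame just consumed
    } else {
      i++;
    }
  }
  return found;
}
```

Each candidate would then go through checksum or CRC validation before being treated as a frame, since a short sync word will also match by chance inside payload data or noise.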

4.5 Protocol Analysis Layer

Once data has become packets or events:

use Wireshark-compatible outputs for standard protocols
use internal parsers and heuristics for proprietary protocols
gradually migrate stable proprietary formats into custom dissectors if needed

4.6 UI and Assistant Layer

This layer provides:

live scan results
replay and lab analysis
confidence-ranked protocol candidates
anomaly explanations
next-step recommendations
exportable reports

5. AI Role in the System

AI should not directly replace DSP blocks. Its primary role is that of an orchestration and analysis controller.

AI responsibilities:

choose candidate algorithm pipelines
tune parameters
compare candidate outputs
explain likely failure points
decide what to try next
summarize results for the operator

AI should behave like an automated signal-analysis engineer, not like a magical decoder.

6. AI Search for Algorithm Chains

6.1 Problem Definition

Algorithm-chain search is a constrained program-search problem.

Input:

raw or partially processed signal data
prior device/context metadata
previous experiment history

Output:

a ranked set of candidate pipelines
associated parameter settings
score and confidence estimates

Optimization target:

maximize correctness and interpretability
minimize computational cost and false positives

6.2 Canonical Pipeline Shape

A pipeline can be modeled as:

source
-> preprocess
-> detect
-> sync
-> demod
-> decode
-> frame
-> proto

Each stage may have several interchangeable modules.
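
The stage ordering above can be encoded so that candidate chains are checked before they run. This is a sketch under assumptions: the names `STAGE_ORDER`, `StagedModule`, and `isCanonicalOrder` are hypothetical, and the rule that stages may repeat or be skipped but never go backwards is one possible interpretation of the canonical shape.

```typescript
const STAGE_ORDER = ["source", "preprocess", "detect", "sync", "demod", "decode", "frame", "proto"] as const;
type StageName = (typeof STAGE_ORDER)[number];
type StagedModule = { module: string; stage: StageName };

// A chain is accepted if its stages never move backwards in the canonical order.
// Several modules may share a stage (for example two preprocessing steps).
function isCanonicalOrder(chain: StagedModule[]): boolean {
  let last = -1;
  for (const node of chain) {
    const idx = STAGE_ORDER.indexOf(node.stage);
    if (idx < last) return false;
    last = idx;
  }
  return true;
}
```

A check like this lets the search controller discard malformed candidates cheaply, before any signal data is processed.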

6.3 Search State

Each attempt should be tracked as an experiment node.

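type Experiment = {
  id: string
  parentId?: string         // experiment this attempt was derived from, if any
  pipeline: PipelineNode[]  // ordered module chain used for this attempt
  inputRef: string          // reference to the input data for this attempt
  outputs: StageOutput[]    // outputs retained from the pipeline stages
  score: ScoreCard
  status: "pending" | "running" | "done" | "failed"
  notes?: string
}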

type PipelineNode = {
  module: string
  params: Record<string, number | string | boolean>
}

This allows AI to operate over an experiment tree instead of producing one-off guesses.
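
A minimal sketch of working with such a tree follows. It uses a reduced record type rather than the full `Experiment` type, and the names `ExperimentRecord`, `ChainNode`, `deriveExperiment`, and `lineage` are hypothetical; the store is a plain object keyed by id and is assumed to contain no parent cycles.

```typescript
type ChainNode = { module: string; params: Record<string, number | string | boolean> };
type ExperimentRecord = { id: string; parentId?: string; pipeline: ChainNode[] };

// Create a child experiment that overrides parameters of one module in the parent's chain.
// The parent is left untouched so its result stays reproducible.
function deriveExperiment(
  parent: ExperimentRecord,
  id: string,
  moduleName: string,
  overrides: ChainNode["params"],
): ExperimentRecord {
  const pipeline = parent.pipeline.map((n) =>
    n.module === moduleName
      ? { module: n.module, params: { ...n.params, ...overrides } }
      : { module: n.module, params: { ...n.params } },
  );
  return { id, parentId: parent.id, pipeline };
}

// Walk parent links back to the root; returns ids ordered from root to the given node.
function lineage(store: Record<string, ExperimentRecord>, id: string): string[] {
  const path: string[] = [];
  let cur: ExperimentRecord | undefined = store[id];
  while (cur) {
    path.unshift(cur.id);
    cur = cur.parentId !== undefined ? store[cur.parentId] : undefined;
  }
  return path;
}
```

Keeping the lineage explicit means any promising result can be traced back through the exact sequence of parameter changes that produced it.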

6.4 Recommended Search Strategy

Use a hybrid strategy:

rules for initialization
beam search for structure search
local optimization for parameter tuning

Recommended control flow:

classify signal at a coarse level
generate a small number of high-probability candidate pipelines
run short-window experiments
score results and prune aggressively
mutate the best pipelines locally
rerun on longer samples for confirmation

This tends to be more stable than unconstrained random search, because each round narrows the candidate set using measured scores rather than chance.
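
The structure-search step of this control flow can be sketched as a small beam search over module chains. The function `beamSearchChains` and the type `ScoredChain` are hypothetical names, and the scoring callback stands in for a short-window experiment run plus the scoring engine.

```typescript
type ScoredChain = { chain: string[]; score: number };

// Beam search over module chains: extend every surviving chain by one stage,
// score the results, and keep only the best `beamWidth` candidates.
function beamSearchChains(
  stageOptions: string[][],                // candidate modules per stage, in pipeline order
  scoreChain: (chain: string[]) => number, // higher is better
  beamWidth: number,
): ScoredChain[] {
  let beam: ScoredChain[] = [{ chain: [], score: 0 }];
  for (const options of stageOptions) {
    const expanded: ScoredChain[] = [];
    for (const partial of beam) {
      for (const mod of options) {
        const chain = [...partial.chain, mod];
        expanded.push({ chain, score: scoreChain(chain) });
      }
    }
    expanded.sort((a, b) => b.score - a.score); // prune aggressively
    beam = expanded.slice(0, beamWidth);
  }
  return beam;
}
```

The number of scored candidates per stage is bounded by the beam width times the number of options for that stage, which is what keeps the search predictable on constrained hardware.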

6.5 Why Beam Search Fits

Beam search is a strong fit because it:

is resource-bounded
is easy to explain and debug
supports progressive refinement
works well with ranked experiment history

Suggested pattern:

outer loop: beam search over module-chain structure
inner loop: bounded parameter tuning around the best chains

6.6 Parameter Tuning

Continuous or range-based parameters should be tuned separately from structure search. Typical tunables:

symbol rate
bandwidth
threshold values
timing recovery parameters
frequency offset compensation
framing tolerances

Possible strategies:

bounded grid search
adaptive range narrowing
Bayesian optimization where available
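
Bounded grid search and adaptive range narrowing can be combined in a few lines for a single numeric parameter. This is a sketch only: `tuneParameter` is a hypothetical name, the evaluation callback stands in for an experiment run, and the method assumes the score has a single dominant peak inside the initial range.

```typescript
// Evaluate an evenly spaced grid, then shrink the range around the best point and repeat.
function tuneParameter(
  evaluate: (value: number) => number, // higher is better
  lo: number,
  hi: number,
  pointsPerRound: number = 5,          // must be at least 2
  rounds: number = 4,
): { value: number; score: number } {
  let best = { value: lo, score: -Infinity };
  for (let r = 0; r < rounds; r++) {
    const step = (hi - lo) / (pointsPerRound - 1);
    for (let k = 0; k < pointsPerRound; k++) {
      const value = lo + k * step;
      const score = evaluate(value);
      if (score > best.score) best = { value, score };
    }
    // Narrow to one grid step either side of the best point, staying inside the old bounds.
    lo = Math.max(lo, best.value - step);
    hi = Math.min(hi, best.value + step);
  }
  return best;
}
```

For example, tuning a symbol rate between 1000 and 20000 with six rounds of five points costs 30 evaluations, far fewer than a dense grid at the same final resolution would need.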

7. Scoring System

The scoring system is the backbone of the AI loop. AI can only optimize what is measured.

7.1 Physical-Layer Score

Examples:

SNR improvement
carrier lock stability
clock recovery stability
cluster separation after demodulation
residual frequency error

7.2 Frame-Level Score

Examples:

preamble detection rate
frame length consistency
frame-boundary stability
CRC pass rate
repeated structure frequency

7.3 Protocol-Level Score

Examples:

known-header matches
valid field lengths
legal enum / field value ratios
session consistency
successful Wireshark-style protocol interpretation

7.4 Cost Penalty

Examples:

CPU cost
memory cost
latency
fragility under small parameter changes
overfitting to short windows

7.5 Example Composite Score

score =
  0.25 * phy_score +
  0.35 * frame_score +
  0.30 * proto_score -
  0.10 * cost_penalty

Weights should initially be hand-tuned and later adjusted using replay corpora.
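
The composite formula translates directly into code. In this sketch the type `ScoreCardSketch`, the function `compositeScore`, and the convention that every component is normalized to the range 0 to 1 are assumptions; only the weights come from the formula above.

```typescript
// Hypothetical score card; each component is assumed to be normalized to [0, 1].
type ScoreCardSketch = { phy: number; frame: number; proto: number; costPenalty: number };

const DEFAULT_WEIGHTS: ScoreCardSketch = { phy: 0.25, frame: 0.35, proto: 0.30, costPenalty: 0.10 };

function compositeScore(card: ScoreCardSketch, weights: ScoreCardSketch = DEFAULT_WEIGHTS): number {
  // Clamp so a misbehaving metric cannot dominate the total.
  const clamp = (x: number): number => Math.min(1, Math.max(0, x));
  return (
    weights.phy * clamp(card.phy) +
    weights.frame * clamp(card.frame) +
    weights.proto * clamp(card.proto) -
    weights.costPenalty * clamp(card.costPenalty)
  );
}
```

With these weights the best possible score is 0.90 and the worst is -0.10, so a frame-level success (weight 0.35) counts for more than a clean physical layer alone (weight 0.25).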

8. Failure Attribution

Every experiment should return structured failure reasons.

Example labels:

no_signal_detected
unstable_symbol_clock
carrier_not_locked
frame_sync_failed
crc_failed
field_semantics_invalid
overfit_to_noise

This enables targeted next-step decisions.

Examples:

unstable_symbol_clock -> adjust symbol-rate range or swap timing recovery module
crc_failed -> try bit inversion, whitening, byte order, CRC family changes
field_semantics_invalid -> reconsider framing or protocol family
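
These mappings can be kept as data so the orchestrator looks up next steps instead of improvising them. The sketch below encodes only the three examples given above; the action identifiers and the names `FailureLabel`, `NEXT_ACTIONS`, and `nextActionsFor` are hypothetical.

```typescript
type FailureLabel =
  | "no_signal_detected"
  | "unstable_symbol_clock"
  | "carrier_not_locked"
  | "frame_sync_failed"
  | "crc_failed"
  | "field_semantics_invalid"
  | "overfit_to_noise";

// Only the mappings stated in the text are filled in; other labels have no entry yet.
const NEXT_ACTIONS: Partial<Record<FailureLabel, string[]>> = {
  unstable_symbol_clock: ["adjust_symbol_rate_range", "swap_timing_recovery_module"],
  crc_failed: ["try_bit_inversion", "try_whitening", "try_byte_order_swap", "try_other_crc_family"],
  field_semantics_invalid: ["reconsider_framing", "reconsider_protocol_family"],
};

// Collect suggested actions for a set of failure labels, without duplicates.
function nextActionsFor(labels: FailureLabel[]): string[] {
  const out: string[] = [];
  for (const label of labels) {
    for (const action of NEXT_ACTIONS[label] ?? []) {
      if (out.indexOf(action) === -1) out.push(action);
    }
  }
  return out;
}
```

A label with no entry returns no suggestions, which makes gaps in the attribution table visible instead of hiding them behind a default action.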

9. Module Registry

Every algorithmic building block should be registered with machine-readable metadata.

type ModuleSpec = {
  name: string
  stage: "preprocess" | "detect" | "sync" | "demod" | "decode" | "frame" | "proto"
  inputFormat: string
  outputFormat: string
  params: Record<string, ParamSpec>
  constraints: string[]
  metrics: string[]
  cost: { cpu: number; memory: number; latency: number }
}

Without a registry, AI cannot safely orchestrate pipelines, because it has no machine-checkable way to know which modules are compatible or what they cost.

The registry should allow the system to answer:

what can run after what
what parameters are tunable
what metrics each module produces
which modules are expensive
which modules are suitable for real-time use
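
Two of these questions can be answered with simple filters over registry entries. The sketch uses a reduced entry type; the names `RegisteredModule`, `compatibleSuccessors`, and `realTimeCandidates`, the format strings in the usage example, and the reading of `cost.latency` as milliseconds are all assumptions.

```typescript
type RegisteredModule = {
  name: string;
  inputFormat: string;
  outputFormat: string;
  cost: { cpu: number; memory: number; latency: number }; // latency assumed to be in ms
};

// "What can run after what": modules whose input format matches the previous module's output.
function compatibleSuccessors(registry: RegisteredModule[], previous: RegisteredModule): RegisteredModule[] {
  return registry.filter((m) => m.inputFormat === previous.outputFormat);
}

// "Which modules are suitable for real-time use": here reduced to a latency budget check.
function realTimeCandidates(registry: RegisteredModule[], maxLatencyMs: number): RegisteredModule[] {
  return registry.filter((m) => m.cost.latency <= maxLatencyMs);
}
```

For instance, if `ook_demod` is registered with output format `"bits"`, then `compatibleSuccessors` returns every module that declares `"bits"` as its input, such as a Manchester decoder or a framer.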

10. Knowledge Base and Priors

The system should maintain a history of prior successful analyses.

type PriorCase = {
  featureFingerprint: number[]
  successfulPipelines: RankedPipeline[]
}

Benefits:

faster startup on familiar signal families
reduced search cost
improved reliability over time
operator trust through precedent-based suggestions

This lets the AI behave more like an experienced lab engineer.
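
One way to use stored cases is a nearest-fingerprint lookup. The sketch below uses cosine similarity with a minimum-similarity cutoff; that metric, the 0.9 default, and the names `PriorCaseSketch`, `cosineSimilarity`, and `closestPrior` are assumptions, and pipelines are reduced to lists of module names for brevity.

```typescript
type PriorCaseSketch = { featureFingerprint: number[]; successfulPipelines: string[][] };

// Cosine similarity of two fingerprints; assumes both use the same feature layout.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the most similar prior case, or undefined if nothing is similar enough.
function closestPrior(
  cases: PriorCaseSketch[],
  fingerprint: number[],
  minSimilarity: number = 0.9,
): PriorCaseSketch | undefined {
  let best: PriorCaseSketch | undefined;
  let bestSim = minSimilarity;
  for (const c of cases) {
    const sim = cosineSimilarity(c.featureFingerprint, fingerprint);
    if (sim >= bestSim) {
      best = c;
      bestSim = sim;
    }
  }
  return best;
}
```

When a match is found, its pipelines can seed the initial candidate set for the search; when none is found, the search falls back to rule-based initialization.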

11. Runtime Modes

At minimum, the product should support:

11.1 Live Scan

real-time acquisition
limited local search
fast confidence-ranked hints
real-time alerting

11.2 Lab Replay

deterministic offline reprocessing
multiple experiment branches
parameter tuning
regression validation

11.3 Protocol Assist

packet/event summarization
protocol explanation
filter and query generation
reporting and export

12. Device vs Host Split

A portable device has strict CPU, memory, thermal, and battery limits. Do not assume the full AI search workload belongs on-device.

Recommended split:

device side
- acquisition
- lightweight feature extraction
- small bounded search
- fast heuristic alerts
host / dock / edge side
- deep experiment search
- heavy replay analysis
- larger AI inference
- training / rule generation

This split keeps the handheld usable under real operating conditions.

13. Safety Boundaries for AI

AI should be allowed to:

select pipelines
adjust parameters
reorder compatible modules
choose which experiment to run next
generate summaries

AI should not directly and automatically:

patch low-level production DSP code in the live path
disable safety limits
bypass deterministic validation
replace scoring with free-form judgment

If AI-generated changes extend beyond parameter or policy updates, they should be validated in replay or sandbox mode first.

14. MVP Implementation Plan

A practical first version should be intentionally small.

14.1 MVP Scope

one or two acquisition sources
10 to 20 reusable modules
experiment manager
scoring engine
beam-search controller
replay dataset support
export to pcap / pcapng
protocol analysis through tshark or sharkd

14.2 Suggested Initial Modules

dc_remove
agc
bandpass
resample
burst_detect
freq_offset_est
clock_recovery
ook_demod
2fsk_demod
gfsk_demod
slicer
manchester_decode
whitening_try
crc_scan
fixed_preamble_framer
variable_length_framer

14.3 Build Order

acquisition and replay path
module registry
pipeline executor
scoring engine
experiment persistence
AI orchestration loop
Wireshark-compatible export and protocol backend
handheld UI and reporting

15. Key Risks

Primary technical risks:

search-space explosion
weak scoring functions
overfitting to noise or short windows
mixing structure search and parameter search too early
lack of reproducible experiment logs

Primary product risks:

placing AI too low in the stack
trying to make the first version too universal
failing to define a standard intermediate representation

Primary integration risk:

misunderstanding Wireshark's role and pushing it below the framing boundary

16. Summary

The proposed portable signal analyzer should be designed as a layered system:

deterministic algorithms do the actual signal recovery
Wireshark-derived tooling handles protocol analysis after framing
AI operates above those layers as an experiment orchestrator, tuning controller, and explanation engine

The winning architecture is not "AI decodes everything". It is "AI controls a rigorous decoding and analysis workflow".
