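Portable Signal Analyzer Architecture Notes
Version: v0.1.0
Last updated: 2026-04-04
Note: This version is a historical document reorganized to follow the documentation standard; the body retains the original English text for now.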
This document describes a domain-specific product architecture that sits under the higher-level framework design in 多模态分析框架.md.
1. Purpose
This document captures a product and architecture direction for a portable signal analyzer inspired by a "tricorder"-style workflow:
- collect signals from multiple physical sources
- detect, demodulate, decode, and frame them using deterministic algorithms
- apply protocol analysis on framed data
- use AI as an assistant, orchestrator, and experiment controller
The core principle is:
AI should not replace the actual signal-processing and protocol-analysis engine.
Deterministic algorithms should do the decoding work.
AI should control experiments, compare results, explain outcomes, and choose the next action.
2. Product Positioning
The target device is not just a packet sniffer and not just an SDR receiver.
It is a multi-stage analysis platform for:
- RF and non-RF signal acquisition
- physical-layer and link-layer recovery
- protocol identification and interpretation
- guided diagnostics and anomaly explanation
Practical examples include:
- identifying unknown digital bursts
- recovering framed traffic from noisy captures
- decoding standard or proprietary protocols
- presenting operator-friendly summaries and next-step suggestions
3. Wireshark Reuse Boundary
Wireshark is useful, but only for the upper half of the stack.
Wireshark is strong at:
- framed packet dissection
- protocol tree generation
- display filtering
- reassembly, statistics, and follow-stream style analysis
- export and structured protocol interpretation
Wireshark is not the right tool for:
- raw RF analysis
- blind modulation recognition
- carrier recovery
- symbol timing recovery
- unknown physical-layer reconstruction
The practical reuse boundary is:
- collect raw signal data
- perform DSP, demodulation, bit recovery, and framing
- convert recovered traffic into packets or events
- hand those results to Wireshark-related tooling such as:
  - pcap / pcapng
  - tshark
  - sharkd
  - custom dissectors where appropriate
This makes Wireshark a protocol-analysis backend, not the full analyzer brain.
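To make the hand-off concrete, the sketch below serializes already-framed payloads into a classic pcap byte buffer that Wireshark-family tools can read. The `RecoveredFrame` type and `buildPcap` function are hypothetical names for illustration. Using `LINKTYPE_USER0` (147, a value reserved for private use) is an assumption that suits a proprietary link layer; a standard protocol would use its own registered link type instead.

```typescript
// Hypothetical sketch: pack recovered frames into a classic pcap buffer.
type RecoveredFrame = { timestampUs: number; payload: Uint8Array };

// Assumption: private-use link type, to be paired with a custom dissector.
const LINKTYPE_USER0 = 147;

function buildPcap(frames: RecoveredFrame[], linkType: number = LINKTYPE_USER0): Uint8Array {
  const total = 24 + frames.reduce((n, f) => n + 16 + f.payload.length, 0);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);

  // Global header (24 bytes), written little-endian.
  view.setUint32(0, 0xa1b2c3d4, true); // magic number: microsecond timestamps
  view.setUint16(4, 2, true);          // version major
  view.setUint16(6, 4, true);          // version minor
  view.setInt32(8, 0, true);           // thiszone (unused)
  view.setUint32(12, 0, true);         // sigfigs (unused)
  view.setUint32(16, 65535, true);     // snaplen
  view.setUint32(20, linkType, true);  // link-layer type

  let offset = 24;
  for (const f of frames) {
    // Per-record header (16 bytes) followed by the frame bytes.
    view.setUint32(offset, Math.floor(f.timestampUs / 1000000), true);
    view.setUint32(offset + 4, f.timestampUs % 1000000, true);
    view.setUint32(offset + 8, f.payload.length, true);  // captured length
    view.setUint32(offset + 12, f.payload.length, true); // original length
    out.set(f.payload, offset + 16);
    offset += 16 + f.payload.length;
  }
  return out;
}
```

The returned bytes can be written to a file and read back with `tshark -r`; everything below this boundary (DSP, bit recovery, framing) stays outside Wireshark.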
4. System Architecture
The system should be split into clear layers.
4.1 Acquisition Layer
Inputs may include:
- IQ streams
- IF or audio data
- logic-level captures
- UART / SPI / I2C / CAN buses
- BLE / Wi-Fi / Ethernet mirrored traffic
- file-based replay samples
This layer should normalize access to multiple hardware front-ends and record:
- timestamp
- sample rate
- center frequency
- gain / front-end state
- source identity
- capture duration
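One way to hold the fields listed above is a single metadata record per capture. The sketch below is an assumption about shape only: `CaptureMetadata`, its field names, and both helper functions are hypothetical and not part of any existing code.

```typescript
// Hypothetical record shape; field names are illustrative only.
interface CaptureMetadata {
  timestampUtc: string;        // capture start time, ISO 8601
  sampleRateHz: number;
  centerFrequencyHz?: number;  // omitted for baseband or bus captures
  gainDb?: number;             // front-end gain, when the hardware reports it
  sourceId: string;            // which front-end produced the capture
  durationSec: number;
}

// Returns a list of problems; an empty list means the record looks usable.
function validateCapture(meta: CaptureMetadata): string[] {
  const problems: string[] = [];
  if (isNaN(Date.parse(meta.timestampUtc))) problems.push("timestampUtc is not a valid date");
  if (!(meta.sampleRateHz > 0)) problems.push("sampleRateHz must be positive");
  if (!(meta.durationSec > 0)) problems.push("durationSec must be positive");
  if (meta.sourceId.trim() === "") problems.push("sourceId must not be empty");
  return problems;
}

// Sample count implied by the metadata, useful for detecting truncated captures.
function expectedSampleCount(meta: CaptureMetadata): number {
  return Math.round(meta.sampleRateHz * meta.durationSec);
}
```

Comparing `expectedSampleCount` against the number of samples actually stored is a cheap consistency check before any DSP runs.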
4.2 Signal Workspace
The workspace is the canonical store for both raw and intermediate data.
It should retain:
- raw samples
- derived features
- intermediate bitstreams
- framed outputs
- experiment metadata
- scores and failure reasons
This is essential for reproducibility, offline replay, regression testing, and AI-driven iteration.
4.3 DSP / Demodulation Pipeline
This layer performs actual signal recovery.
Typical module categories:
- preprocessing
  - DC removal
  - AGC
  - filtering
  - resampling
- detection
  - energy detection
  - burst detection
  - coarse frequency estimation
- synchronization
  - carrier recovery
  - symbol clock recovery
  - preamble / sync-word search
- demodulation
  - OOK / ASK
  - FSK / GFSK
  - PSK / QPSK
  - OFDM-family or chirp-style paths when supported
- bit-domain processing
  - hard or soft decision
  - de-whitening
  - de-interleaving
  - FEC decoding
  - CRC validation
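As a small illustration of what two of these deterministic blocks might look like, the sketch below implements DC removal and a windowed energy detector for bursts. It assumes real-valued samples in a plain array; `removeDc`, `detectBursts`, and the `Burst` type are hypothetical names, and a production path would more likely operate on complex IQ data in typed arrays.

```typescript
// DC removal: subtract the mean so the signal is centered on zero.
function removeDc(samples: number[]): number[] {
  if (samples.length === 0) return [];
  const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
  return samples.map((s) => s - mean);
}

type Burst = { start: number; end: number }; // sample indices, end exclusive

// Energy detection over non-overlapping windows: a burst is a run of
// consecutive windows whose mean power exceeds the threshold.
function detectBursts(samples: number[], windowSize: number, threshold: number): Burst[] {
  const bursts: Burst[] = [];
  let start = -1;
  for (let i = 0; i + windowSize <= samples.length; i += windowSize) {
    const power =
      samples.slice(i, i + windowSize).reduce((acc, s) => acc + s * s, 0) / windowSize;
    const active = power > threshold;
    if (active && start < 0) start = i;
    if (!active && start >= 0) {
      bursts.push({ start, end: i });
      start = -1;
    }
  }
  // Close a burst that is still active at the last full window.
  if (start >= 0) {
    bursts.push({ start, end: samples.length - (samples.length % windowSize) });
  }
  return bursts;
}
```

Both functions are pure and parameterized, which is what lets an orchestrator rerun them with different window sizes or thresholds without touching the code.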
4.4 Frame Builder
This layer transforms bitstreams into candidate frames or packets.
It is the boundary between "signal recovery" and "protocol interpretation".
Responsibilities:
- frame boundary detection
- fixed/variable length frame assembly
- checksum / CRC validation
- field boundary estimation
- event extraction
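The responsibilities above can be sketched as a minimal fixed-length framer. This is a simplified illustration under several stated assumptions: the stream is already byte-aligned, each frame is a sync word followed by a fixed-length payload and a big-endian CRC-16/CCITT-FALSE over the payload, and the names `crc16CcittFalse`, `extractFrames`, and `Frame` are hypothetical. A real framer would usually search at the bit level and try several CRC families.

```typescript
// CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, no reflection.
function crc16CcittFalse(data: Uint8Array): number {
  let crc = 0xffff;
  data.forEach((byte) => {
    crc ^= byte << 8;
    for (let i = 0; i < 8; i++) {
      crc = crc & 0x8000 ? ((crc << 1) ^ 0x1021) & 0xffff : (crc << 1) & 0xffff;
    }
  });
  return crc;
}

type Frame = { offset: number; payload: Uint8Array; crcOk: boolean };

// Scan for the sync word, cut a candidate frame, and record whether its CRC passes.
function extractFrames(stream: Uint8Array, sync: Uint8Array, payloadLen: number): Frame[] {
  const frames: Frame[] = [];
  const frameLen = sync.length + payloadLen + 2;
  let i = 0;
  while (i + frameLen <= stream.length) {
    const matches = sync.every((b, k) => stream[i + k] === b);
    if (!matches) {
      i++;
      continue;
    }
    const payloadStart = i + sync.length;
    const payload = stream.slice(payloadStart, payloadStart + payloadLen);
    const crcPos = payloadStart + payloadLen;
    const received = ((stream[crcPos] ?? 0) << 8) | (stream[crcPos + 1] ?? 0);
    const crcOk = crc16CcittFalse(payload) === received;
    frames.push({ offset: i, payload, crcOk });
    // After a CRC failure, resume one byte later in case the sync match was spurious.
    i += crcOk ? frameLen : 1;
  }
  return frames;
}
```

Keeping failed candidates in the output (with `crcOk: false`) rather than dropping them gives the scoring and failure-attribution steps something to measure.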
4.5 Protocol Analysis Layer
Once data has become packets or events:
- use Wireshark-compatible outputs for standard protocols
- use internal parsers and heuristics for proprietary protocols
- gradually migrate stable proprietary formats into custom dissectors if needed
4.6 UI and Assistant Layer
This layer provides:
- live scan results
- replay and lab analysis
- confidence-ranked protocol candidates
- anomaly explanations
- next-step recommendations
- exportable reports
5. AI Role in the System
AI should not directly replace DSP blocks.
Its primary role is that of an orchestration and analysis controller.
AI responsibilities:
- choose candidate algorithm pipelines
- tune parameters
- compare candidate outputs
- explain likely failure points
- decide what to try next
- summarize results for the operator
AI should behave like an automated signal-analysis engineer, not like a magical decoder.
6. AI Search for Algorithm Chains
6.1 Problem Definition
Algorithm-chain search is a constrained program-search problem.
Input:
- raw or partially processed signal data
- prior device/context metadata
- previous experiment history
Output:
- a ranked set of candidate pipelines
- associated parameter settings
- score and confidence estimates
Optimization target:
- maximize correctness and interpretability
- minimize computational cost and false positives
6.2 Canonical Pipeline Shape
A pipeline can be modeled as:
    source
      -> preprocess
      -> detect
      -> sync
      -> demod
      -> decode
      -> frame
      -> proto
Each stage may have several interchangeable modules.
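The stage order above can be encoded directly so that candidate pipelines are checked before they run. The sketch below assumes that stages may be skipped or repeated (for example two preprocessing modules in a row) but never run backward, and that every pipeline starts at `source`; those rules and the names `STAGE_ORDER` and `isValidStageSequence` are assumptions for illustration.

```typescript
// Canonical stage order, taken from the pipeline shape described above.
const STAGE_ORDER = [
  "source", "preprocess", "detect", "sync", "demod", "decode", "frame", "proto",
] as const;

type Stage = (typeof STAGE_ORDER)[number];

// Assumed rule: a pipeline starts at "source" and never moves to an earlier stage.
function isValidStageSequence(stages: Stage[]): boolean {
  if (stages.length === 0 || stages[0] !== "source") return false;
  let last = -1;
  for (const stage of stages) {
    const index = STAGE_ORDER.indexOf(stage);
    if (index < last) return false;
    last = index;
  }
  return true;
}
```

A check like this is cheap enough to run on every candidate the search generates, which prunes structurally invalid chains before any signal data is touched.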
6.3 Search State
Each attempt should be tracked as an experiment node.
    // StageOutput and ScoreCard are not defined in this document.
    type Experiment = {
      id: string
      parentId?: string   // absent for a root experiment
      pipeline: PipelineNode[]
      inputRef: string
      outputs: StageOutput[]
      score: ScoreCard
      status: "pending" | "running" | "done" | "failed"
      notes?: string
    }

    type PipelineNode = {
      module: string
      params: Record<string, number | string | boolean>
    }
This allows AI to operate over an experiment tree instead of producing one-off guesses.
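A minimal sketch of working with such a tree follows. It uses a reduced `ExperimentNode` type (only the fields needed here) so that the block is self-contained; `deriveChild` and `lineage` are hypothetical helpers, not part of the design above.

```typescript
type PipelineNode = { module: string; params: Record<string, number | string | boolean> };

// Reduced experiment shape for illustration; the full type carries more fields.
type ExperimentNode = { id: string; parentId?: string; pipeline: PipelineNode[] };

// Create a child experiment that differs from its parent in one parameter.
// The parent is left untouched so the history stays reproducible.
function deriveChild(
  parent: ExperimentNode,
  childId: string,
  moduleName: string,
  key: string,
  value: number | string | boolean,
): ExperimentNode {
  const pipeline = parent.pipeline.map((node) =>
    node.module === moduleName
      ? { module: node.module, params: { ...node.params, [key]: value } }
      : node,
  );
  return { id: childId, parentId: parent.id, pipeline };
}

// Walk parent links back to the root, returning ids from root to the given node.
function lineage(byId: Map<string, ExperimentNode>, id: string): string[] {
  const path: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current = byId.get(id);
  while (current && !seen.has(current.id)) {
    seen.add(current.id);
    path.unshift(current.id);
    current = current.parentId ? byId.get(current.parentId) : undefined;
  }
  return path;
}
```

Because each child records its parent, any result can be explained as a sequence of small, inspectable changes from a known starting point.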
6.4 Recommended Search Strategy
Use a hybrid strategy:
- rules for initialization
- beam search for structure search
- local optimization for parameter tuning
Recommended control flow:
- classify signal at a coarse level
- generate a small number of high-probability candidate pipelines
- run short-window experiments
- score results and prune aggressively
- mutate the best pipelines locally
- rerun on longer samples for confirmation
This keeps the search bounded and repeatable, which makes it more stable than unconstrained random search.
6.5 Why Beam Search Fits
Beam search is a strong fit because it:
- is resource-bounded
- is easy to explain and debug
- supports progressive refinement
- works well with ranked experiment history
Suggested pattern:
- outer loop: beam search over module-chain structure
- inner loop: bounded parameter tuning around the best chains
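The outer loop can be sketched as a plain beam search over per-stage module choices. This is a minimal illustration, not a tuned implementation: `beamSearch` and `Candidate` are hypothetical names, and `scoreChain` stands in for running a short-window experiment and scoring it.

```typescript
type Candidate = { chain: string[]; score: number };

// Beam search over module-chain structure.
// stages[i] lists the interchangeable modules for the i-th pipeline stage.
function beamSearch(
  stages: string[][],
  scoreChain: (chain: string[]) => number,
  beamWidth: number,
): Candidate[] {
  let beam: Candidate[] = [{ chain: [], score: 0 }];
  for (const options of stages) {
    // Expand every surviving chain with every module available at this stage.
    const expanded: Candidate[] = [];
    for (const candidate of beam) {
      for (const moduleName of options) {
        const chain = [...candidate.chain, moduleName];
        expanded.push({ chain, score: scoreChain(chain) });
      }
    }
    // Keep only the best beamWidth chains; everything else is pruned.
    expanded.sort((a, b) => b.score - a.score);
    beam = expanded.slice(0, beamWidth);
  }
  return beam;
}
```

The number of scored chains per stage is at most `beamWidth` times the number of options at that stage, which is what makes the search resource-bounded. Parameter tuning would then run only on the chains that survive.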
6.6 Parameter Tuning
Continuous or range-based parameters should be tuned separately from structure search.
Typical tunables:
- symbol rate
- bandwidth
- threshold values
- timing recovery parameters
- frequency offset compensation
- framing tolerances
Possible strategies:
- bounded grid search
- adaptive range narrowing
- Bayesian optimization where available
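Adaptive range narrowing for a single parameter can be sketched as a coarse grid that repeatedly zooms in around the best point found so far. The function name `narrowRange` and its defaults are hypothetical, and `objective` stands in for running the pipeline with a given parameter value and returning its score. The method assumes the score is roughly unimodal inside the starting range; with several peaks it may settle on a local one.

```typescript
// Grid search with adaptive range narrowing over one numeric parameter.
// pointsPerRound must be at least 2.
function narrowRange(
  objective: (x: number) => number,
  lo: number,
  hi: number,
  pointsPerRound: number = 9,
  rounds: number = 4,
): { best: number; score: number } {
  let best = lo;
  let bestScore = -Infinity;
  for (let r = 0; r < rounds; r++) {
    const step = (hi - lo) / (pointsPerRound - 1);
    for (let i = 0; i < pointsPerRound; i++) {
      const x = lo + i * step;
      const score = objective(x);
      if (score > bestScore) {
        bestScore = score;
        best = x;
      }
    }
    // Shrink the window to one grid step on each side of the current best.
    lo = Math.max(lo, best - step);
    hi = Math.min(hi, best + step);
  }
  return { best, score: bestScore };
}
```

For example, searching a symbol rate between 1000 and 20000 with nine points per round costs nine evaluations per round regardless of how wide the initial range is.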
7. Scoring System
The scoring system is the backbone of the AI loop.
AI can only optimize what is measured.
7.1 Physical-Layer Score
Examples:
- SNR improvement
- carrier lock stability
- clock recovery stability
- cluster separation after demodulation
- residual frequency error
7.2 Frame-Level Score
Examples:
- preamble detection rate
- frame length consistency
- frame-boundary stability
- CRC pass rate
- repeated structure frequency
7.3 Protocol-Level Score
Examples:
- known-header matches
- valid field lengths
- legal enum / field value ratios
- session consistency
- successful Wireshark-style protocol interpretation
7.4 Cost Penalty
Examples:
- CPU cost
- memory cost
- latency
- fragility under small parameter changes
- overfitting to short windows
7.5 Example Composite Score
    score =
        0.25 * phy_score +
        0.35 * frame_score +
        0.30 * proto_score -
        0.10 * cost_penalty
Weights should initially be hand-tuned and later adjusted using replay corpora.
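The composite score translates directly into code. The sketch below assumes each sub-score and the cost penalty have already been normalized to the range 0 to 1, which the formula above does not state explicitly; inputs outside that range are clamped. `ScoreCard` here is a hypothetical four-field shape, and the weights are the example values given above.

```typescript
// Hypothetical score card; each field is assumed to be normalized to [0, 1].
type ScoreCard = { phy: number; frame: number; proto: number; costPenalty: number };

// Example weights from the composite formula; expected to be re-tuned later.
const SCORE_WEIGHTS = { phy: 0.25, frame: 0.35, proto: 0.30, costPenalty: 0.10 };

function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

function compositeScore(card: ScoreCard): number {
  return (
    SCORE_WEIGHTS.phy * clamp01(card.phy) +
    SCORE_WEIGHTS.frame * clamp01(card.frame) +
    SCORE_WEIGHTS.proto * clamp01(card.proto) -
    SCORE_WEIGHTS.costPenalty * clamp01(card.costPenalty)
  );
}
```

With these weights the result ranges from -0.10 (nothing recovered, maximum cost) to 0.90 (perfect recovery at no cost), so candidate pipelines remain directly comparable.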
8. Failure Attribution
Every experiment should return structured failure reasons.
Example labels:
    no_signal_detected
    unstable_symbol_clock
    carrier_not_locked
    frame_sync_failed
    crc_failed
    field_semantics_invalid
    overfit_to_noise
This enables targeted next-step decisions.
Examples:
    unstable_symbol_clock -> adjust symbol-rate range or swap timing recovery module
    crc_failed -> try bit inversion, whitening, byte order, CRC family changes
    field_semantics_invalid -> reconsider framing or protocol family
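The mapping from failure labels to follow-up actions can be kept as data rather than logic. The sketch below encodes only the three examples given above; the action identifiers, the `ask_operator` fallback for labels without a listed action, and the function name `nextActions` are all hypothetical.

```typescript
type FailureLabel =
  | "no_signal_detected"
  | "unstable_symbol_clock"
  | "carrier_not_locked"
  | "frame_sync_failed"
  | "crc_failed"
  | "field_semantics_invalid"
  | "overfit_to_noise";

// Only the three mappings given as examples are filled in.
// Action identifiers are illustrative names, not an existing API.
const NEXT_ACTIONS: Partial<Record<FailureLabel, string[]>> = {
  unstable_symbol_clock: ["adjust_symbol_rate_range", "swap_timing_recovery_module"],
  crc_failed: ["try_bit_inversion", "try_whitening", "try_byte_order", "try_crc_family"],
  field_semantics_invalid: ["reconsider_framing", "reconsider_protocol_family"],
};

// Assumed fallback for labels that have no mapping yet.
const FALLBACK_ACTION = "ask_operator";

// Merge the suggested actions for all reported labels, without duplicates.
function nextActions(labels: FailureLabel[]): string[] {
  const actions = new Set<string>();
  for (const label of labels) {
    for (const action of NEXT_ACTIONS[label] ?? [FALLBACK_ACTION]) {
      actions.add(action);
    }
  }
  return Array.from(actions);
}
```

Keeping the table as data means new labels and actions can be added, reviewed, and tested against replay corpora without changing the control loop.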
9. Module Registry
Every algorithmic building block should be registered with machine-readable metadata.
    // ParamSpec is not defined in this document.
    type ModuleSpec = {
      name: string
      // "detect" is included to match the canonical pipeline shape in section 6.2.
      stage: "preprocess" | "detect" | "sync" | "demod" | "decode" | "frame" | "proto"
      inputFormat: string
      outputFormat: string
      params: Record<string, ParamSpec>
      constraints: string[]
      metrics: string[]
      cost: { cpu: number; memory: number; latency: number }
    }
Without a registry, AI cannot safely orchestrate pipelines.
The registry should allow the system to answer:
- what can run after what
- what parameters are tunable
- what metrics each module produces
- which modules are expensive
- which modules are suitable for real-time use
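The first of these questions, what can run after what, can be answered from the input and output formats alone. The sketch below uses a reduced module record with only the fields it needs; the format strings (`iq`, `bits`, `frames`), the rule that formats must match exactly, and the function names are assumptions for illustration.

```typescript
// Reduced registry entry for illustration; the full spec carries more fields.
type RegisteredModule = { name: string; inputFormat: string; outputFormat: string };

// Assumed compatibility rule: the producer's output format must equal
// the consumer's input format.
function canFollow(prev: RegisteredModule, next: RegisteredModule): boolean {
  return prev.outputFormat === next.inputFormat;
}

// Check a proposed chain against the registry and return all problems found.
function validateChain(registry: Map<string, RegisteredModule>, chain: string[]): string[] {
  const errors: string[] = [];
  let prev: RegisteredModule | undefined;
  for (const name of chain) {
    const spec = registry.get(name);
    if (!spec) {
      errors.push(`unknown module: ${name}`);
      prev = undefined;
      continue;
    }
    if (prev && !canFollow(prev, spec)) {
      errors.push(
        `${prev.name} (${prev.outputFormat}) cannot feed ${spec.name} (${spec.inputFormat})`,
      );
    }
    prev = spec;
  }
  return errors;
}
```

Running this check on every chain the orchestrator proposes means an invalid pipeline is rejected with a readable reason instead of failing somewhere inside the DSP path.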
10. Knowledge Base and Priors
The system should maintain a history of prior successful analyses.
    type PriorCase = {
      featureFingerprint: number[]
      successfulPipelines: RankedPipeline[]
    }
Benefits:
- faster startup on familiar signal families
- reduced search cost
- improved reliability over time
- operator trust through precedent-based suggestions
This lets the AI behave more like an experienced lab engineer.
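Looking up a prior case could be as simple as a nearest-neighbor search over fingerprints. The sketch below uses cosine similarity with a minimum-similarity cutoff; the choice of metric, the 0.9 default, the `RankedPipeline` shape, and the function names are assumptions, since the design above does not specify how fingerprints are compared.

```typescript
// Hypothetical shape for a ranked pipeline entry.
type RankedPipeline = { modules: string[]; score: number };

type PriorCase = { featureFingerprint: number[]; successfulPipelines: RankedPipeline[] };

// Cosine similarity; returns 0 for mismatched lengths or zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Return the most similar prior case, or undefined if none is similar enough.
function nearestPrior(
  cases: PriorCase[],
  fingerprint: number[],
  minSimilarity: number = 0.9,
): PriorCase | undefined {
  let best: PriorCase | undefined;
  let bestSimilarity = minSimilarity;
  for (const candidate of cases) {
    const similarity = cosineSimilarity(candidate.featureFingerprint, fingerprint);
    if (similarity >= bestSimilarity) {
      best = candidate;
      bestSimilarity = similarity;
    }
  }
  return best;
}
```

When a match is found, its `successfulPipelines` can seed the initial candidate set for the search; when none is found, the search starts from rules alone.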
11. Runtime Modes
At minimum, the product should support:
11.1 Live Scan
- real-time acquisition
- limited local search
- fast confidence-ranked hints
- real-time alerting
11.2 Lab Replay
- deterministic offline reprocessing
- multiple experiment branches
- parameter tuning
- regression validation
11.3 Protocol Assist
- packet/event summarization
- protocol explanation
- filter and query generation
- reporting and export
12. Device vs Host Split
A portable device has strict CPU, memory, thermal, and battery limits.
Do not assume the full AI search workload belongs on-device.
Recommended split:
- device side
  - acquisition
  - lightweight feature extraction
  - small bounded search
  - fast heuristic alerts
- host / dock / edge side
  - deep experiment search
  - heavy replay analysis
  - larger AI inference
  - training / rule generation
This split keeps the handheld usable under real operating conditions.
13. Safety Boundaries for AI
AI should be allowed to:
- select pipelines
- adjust parameters
- reorder compatible modules
- choose which experiment to run next
- generate summaries
AI should not directly and automatically:
- patch low-level production DSP code in the live path
- disable safety limits
- bypass deterministic validation
- replace scoring with free-form judgment
If AI-generated changes extend beyond parameter or policy updates, they should be validated in replay or sandbox mode first.
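These boundaries can be enforced with an explicit gate in front of every AI-proposed action. The sketch below is one possible encoding: the action identifiers are hypothetical names derived from the two lists above, and routing anything unrecognized to `sandbox_first` is an assumption that follows the replay-or-sandbox rule.

```typescript
// Action identifiers are illustrative names for the items listed above.
const ALLOWED_ACTIONS = new Set<string>([
  "select_pipeline",
  "adjust_params",
  "reorder_modules",
  "choose_next_experiment",
  "generate_summary",
]);

const FORBIDDEN_ACTIONS = new Set<string>([
  "patch_live_dsp_code",
  "disable_safety_limits",
  "bypass_validation",
  "replace_scoring",
]);

type Decision = "allow" | "deny" | "sandbox_first";

// Gate an AI-proposed action. Anything that is neither explicitly allowed
// nor explicitly forbidden must be validated in replay or sandbox mode first.
function reviewAction(kind: string): Decision {
  if (FORBIDDEN_ACTIONS.has(kind)) return "deny";
  if (ALLOWED_ACTIONS.has(kind)) return "allow";
  return "sandbox_first";
}
```

Defaulting unknown actions to the sandbox rather than to "allow" keeps the gate safe when new action types are introduced before the lists are updated.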
14. MVP Implementation Plan
A practical first version should be intentionally small.
14.1 MVP Scope
- one or two acquisition sources
- 10 to 20 reusable modules
- experiment manager
- scoring engine
- beam-search controller
- replay dataset support
- export to pcap / pcapng
- protocol analysis through tshark or sharkd
14.2 Suggested Initial Modules
    dc_remove
    agc
    bandpass
    resample
    burst_detect
    freq_offset_est
    clock_recovery
    ook_demod
    2fsk_demod
    gfsk_demod
    slicer
    manchester_decode
    whitening_try
    crc_scan
    fixed_preamble_framer
    variable_length_framer
14.3 Build Order
- acquisition and replay path
- module registry
- pipeline executor
- scoring engine
- experiment persistence
- AI orchestration loop
- Wireshark-compatible export and protocol backend
- handheld UI and reporting
15. Key Risks
Primary technical risks:
- search-space explosion
- weak scoring functions
- overfitting to noise or short windows
- mixing structure search and parameter search too early
- lack of reproducible experiment logs
Primary product risks:
- placing AI too low in the stack
- trying to make the first version too universal
- failing to define a standard intermediate representation
Primary integration risk:
- misunderstanding Wireshark's role and pushing it below the framing boundary
16. Summary
The proposed portable signal analyzer should be designed as a layered system:
- deterministic algorithms do the actual signal recovery
- Wireshark-derived tooling handles protocol analysis after framing
- AI operates above those layers as an experiment orchestrator, tuning controller, and explanation engine
The winning architecture is not "AI decodes everything".
It is "AI controls a rigorous decoding and analysis workflow".