# Portable Signal Analyzer Architecture

- Version: v0.1.0
- Last updated: 2026-04-04
- Note: This version is a historical document reorganized to the documentation standard; the body keeps the original English content for now.

This document is a domain-specific product architecture under the higher-level framework design in [多模态分析框架.md](D:/dev/TC/doc/总体架构/多模态分析框架.md).

## 1. Purpose

This document captures a product and architecture direction for a portable signal analyzer inspired by the "tricorder" style workflow:

- collect signals from multiple physical sources
- detect, demodulate, decode, and frame them using deterministic algorithms
- apply protocol analysis on framed data
- use AI as an assistant, orchestrator, and experiment controller

The core principle is:

> AI should not replace the actual signal-processing and protocol-analysis engine.
> Deterministic algorithms should do the decoding work.
> AI should control experiments, compare results, explain outcomes, and choose the next action.

## 2. Product Positioning

The target device is not just a packet sniffer and not just an SDR receiver. It is a multi-stage analysis platform for:

- RF and non-RF signal acquisition
- physical-layer and link-layer recovery
- protocol identification and interpretation
- guided diagnostics and anomaly explanation

Practical examples include:

- identifying unknown digital bursts
- recovering framed traffic from noisy captures
- decoding standard or proprietary protocols
- presenting operator-friendly summaries and next-step suggestions

## 3. Wireshark Reuse Boundary

Wireshark is useful, but only for the upper half of the stack.

Wireshark is strong at:

- framed packet dissection
- protocol tree generation
- display filtering
- reassembly, statistics, and follow-stream style analysis
- export and structured protocol interpretation

Wireshark is not the right tool for:

- raw RF analysis
- blind modulation recognition
- carrier recovery
- symbol timing recovery
- unknown physical-layer reconstruction

The practical reuse boundary is:

1. collect raw signal data
2. perform DSP, demodulation, bit recovery, and framing
3. convert recovered traffic into packets or events
4. hand those results to Wireshark-related tooling such as:
   - `pcap` / `pcapng`
   - `tshark`
   - `sharkd`
   - custom dissectors where appropriate

This makes Wireshark a protocol-analysis backend, not the full analyzer brain.

## 4. System Architecture

The system should be split into clear layers.

### 4.1 Acquisition Layer

Inputs may include:

- IQ streams
- IF or audio data
- logic-level captures
- UART / SPI / I2C / CAN buses
- BLE / Wi-Fi / Ethernet mirrored traffic
- file-based replay samples

This layer should normalize access to multiple hardware front-ends and record:

- timestamp
- sample rate
- center frequency
- gain / front-end state
- source identity
- capture duration

### 4.2 Signal Workspace

The workspace is the canonical store for both raw and intermediate data.

It should retain:

- raw samples
- derived features
- intermediate bitstreams
- framed outputs
- experiment metadata
- scores and failure reasons

This is essential for reproducibility, offline replay, regression testing, and AI-driven iteration.

### 4.3 DSP / Demodulation Pipeline

This layer performs actual signal recovery.

Typical module categories:

- preprocessing
  - DC removal
  - AGC
  - filtering
  - resampling
- detection
  - energy detection
  - burst detection
  - coarse frequency estimation
- synchronization
  - carrier recovery
  - symbol clock recovery
  - preamble / sync-word search
- demodulation
  - OOK / ASK
  - FSK / GFSK
  - PSK / QPSK
  - OFDM-family or chirp-style paths when supported
- bit-domain processing
  - hard or soft decision
  - de-whitening
  - de-interleaving
  - FEC decoding
  - CRC validation

### 4.4 Frame Builder

This layer transforms bitstreams into candidate frames or packets. It is the boundary between "signal recovery" and "protocol interpretation".

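As a minimal sketch of that boundary, assuming a fixed sync word and a fixed payload length (neither is specified in this document), a framer could scan a recovered hard-decision bitstream and cut candidate frames after each match. The names `findCandidateFrames`, `CandidateFrame`, `syncWord`, and `payloadBits` are hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch: scan a hard-decision bitstream for a fixed sync word
// and cut a fixed-length candidate frame after each match.
type CandidateFrame = {
  bitOffset: number // index of the first payload bit in the input stream
  payload: number[] // payload bits, each 0 or 1
}

function findCandidateFrames(
  bits: number[],
  syncWord: number[],
  payloadBits: number,
): CandidateFrame[] {
  const found: CandidateFrame[] = []
  let i = 0
  // Stop once a full sync word plus payload no longer fits in the stream.
  while (i + syncWord.length + payloadBits <= bits.length) {
    const matched = syncWord.every((bit, k) => bits[i + k] === bit)
    if (matched) {
      const start = i + syncWord.length
      found.push({ bitOffset: start, payload: bits.slice(start, start + payloadBits) })
      i = start + payloadBits // skip past this frame before searching again
    } else {
      i += 1
    }
  }
  return found
}

// Usage: two frames with sync word 1,0,1,1 and 4-bit payloads, separated by idle bits.
const bitStream = [0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
const candidateFrames = findCandidateFrames(bitStream, [1, 0, 1, 1], 4)
```

In this toy stream the sketch yields two candidates, at bit offsets 6 and 16. A real framer would also need to tolerate bit errors in the sync word and validate each candidate, for example with a CRC check.
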
Responsibilities:

- frame boundary detection
- fixed/variable length frame assembly
- checksum / CRC validation
- field boundary estimation
- event extraction

### 4.5 Protocol Analysis Layer

Once data has become packets or events:

- use Wireshark-compatible outputs for standard protocols
- use internal parsers and heuristics for proprietary protocols
- gradually migrate stable proprietary formats into custom dissectors if needed

### 4.6 UI and Assistant Layer

This layer provides:

- live scan results
- replay and lab analysis
- confidence-ranked protocol candidates
- anomaly explanations
- next-step recommendations
- exportable reports

## 5. AI Role in the System

AI should not directly replace DSP blocks. Its primary role is that of an orchestration and analysis controller.

AI responsibilities:

- choose candidate algorithm pipelines
- tune parameters
- compare candidate outputs
- explain likely failure points
- decide what to try next
- summarize results for the operator

AI should behave like an automated signal-analysis engineer, not like a magical decoder.

## 6. AI Search for Algorithm Chains

### 6.1 Problem Definition

Algorithm-chain search is a constrained program-search problem.

Input:

- raw or partially processed signal data
- prior device/context metadata
- previous experiment history

Output:

- a ranked set of candidate pipelines
- associated parameter settings
- score and confidence estimates

Optimization target:

- maximize correctness and interpretability
- minimize computational cost and false positives

### 6.2 Canonical Pipeline Shape

A pipeline can be modeled as:

```text
source -> preprocess -> detect -> sync -> demod -> decode -> frame -> proto
```

Each stage may have several interchangeable modules.

### 6.3 Search State

Each attempt should be tracked as an experiment node.

```ts
// One node in the experiment tree. `StageOutput` and `ScoreCard` are
// assumed to be defined elsewhere in the codebase.
type Experiment = {
  id: string
  parentId?: string // absent for root experiments
  pipeline: PipelineNode[]
  inputRef: string // reference to the input data in the signal workspace
  outputs: StageOutput[]
  score: ScoreCard
  status: "pending" | "running" | "done" | "failed"
  notes?: string
}

type PipelineNode = {
  module: string
  params: Record<string, unknown> // value type left open here
}
```

This allows AI to operate over an experiment tree instead of producing one-off guesses.

### 6.4 Recommended Search Strategy

Use a hybrid strategy:

- rules for initialization
- beam search for structure search
- local optimization for parameter tuning

Recommended control flow:

1. classify signal at a coarse level
2. generate a small number of high-probability candidate pipelines
3. run short-window experiments
4. score results and prune aggressively
5. mutate the best pipelines locally
6. rerun on longer samples for confirmation

This is more stable than unconstrained random search.

### 6.5 Why Beam Search Fits

Beam search is a strong fit because it:

- is resource-bounded
- is easy to explain and debug
- supports progressive refinement
- works well with ranked experiment history

Suggested pattern:

- outer loop: beam search over module-chain structure
- inner loop: bounded parameter tuning around the best chains

### 6.6 Parameter Tuning

Continuous or range-based parameters should be tuned separately from structure search.

Typical tunables:

- symbol rate
- bandwidth
- threshold values
- timing recovery parameters
- frequency offset compensation
- framing tolerances

Possible strategies:

- bounded grid search
- adaptive range narrowing
- Bayesian optimization where available

## 7. Scoring System

The scoring system is the backbone of the AI loop. AI can only optimize what is measured.

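Because the structure search in Section 6 prunes purely on score, the scoring function is effectively its objective. The following is a minimal sketch of that coupling, not a definitive implementation: a beam search over module-chain structure that keeps only the top-scoring partial chains at each stage. The `beamSearch` helper, the `ScoredChain` type, and the toy scorer are hypothetical; the module names are taken from the suggested initial modules in Section 14.2.

```typescript
// Hypothetical sketch: beam search over module-chain structure.
// Each stage offers interchangeable modules; only the `beamWidth`
// best-scoring partial chains survive to the next stage.
type ScoredChain = { modules: string[]; score: number }

function beamSearch(
  stageOptions: string[][],
  scoreChain: (modules: string[]) => number,
  beamWidth: number,
): ScoredChain[] {
  let beam: ScoredChain[] = [{ modules: [], score: 0 }]
  for (const options of stageOptions) {
    const expanded: ScoredChain[] = []
    for (const partial of beam) {
      for (const moduleName of options) {
        const modules = [...partial.modules, moduleName]
        expanded.push({ modules, score: scoreChain(modules) })
      }
    }
    // Prune aggressively: keep only the highest-scoring candidates.
    expanded.sort((a, b) => b.score - a.score)
    beam = expanded.slice(0, beamWidth)
  }
  return beam
}

// Toy scorer (assumption): pretends the capture is GFSK with a fixed preamble.
// A real scorer would run each chain on a short sample window and measure it.
const toyPreferences: Record<string, number> = {
  gfsk_demod: 5,
  fixed_preamble_framer: 3,
  "2fsk_demod": 1,
}
const toyScore = (modules: string[]): number =>
  modules.reduce((sum, m) => sum + (toyPreferences[m] ?? 0), 0)

const rankedChains = beamSearch(
  [
    ["ook_demod", "2fsk_demod", "gfsk_demod"],
    ["fixed_preamble_framer", "variable_length_framer"],
  ],
  toyScore,
  2,
)
```

With a beam width of 2, `ook_demod` is dropped after the first stage, so its framer variants are never evaluated. That is the resource bound in action: a weak or noisy score at an early stage can discard a chain that would have succeeded later, which is why score quality matters more than search cleverness.
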
### 7.1 Physical-Layer Score

Examples:

- SNR improvement
- carrier lock stability
- clock recovery stability
- cluster separation after demodulation
- residual frequency error

### 7.2 Frame-Level Score

Examples:

- preamble detection rate
- frame length consistency
- frame-boundary stability
- CRC pass rate
- repeated structure frequency

### 7.3 Protocol-Level Score

Examples:

- known-header matches
- valid field lengths
- legal enum / field value ratios
- session consistency
- successful Wireshark-style protocol interpretation

### 7.4 Cost Penalty

Examples:

- CPU cost
- memory cost
- latency
- fragility under small parameter changes
- overfitting to short windows

### 7.5 Example Composite Score

```text
score =
  0.25 * phy_score +
  0.35 * frame_score +
  0.30 * proto_score -
  0.10 * cost_penalty
```

Weights should initially be hand-tuned and later adjusted using replay corpora.

## 8. Failure Attribution

Every experiment should return structured failure reasons.

Example labels:

- `no_signal_detected`
- `unstable_symbol_clock`
- `carrier_not_locked`
- `frame_sync_failed`
- `crc_failed`
- `field_semantics_invalid`
- `overfit_to_noise`

This enables targeted next-step decisions.

Examples:

- `unstable_symbol_clock` -> adjust symbol-rate range or swap timing recovery module
- `crc_failed` -> try bit inversion, whitening, byte order, CRC family changes
- `field_semantics_invalid` -> reconsider framing or protocol family

## 9. Module Registry

Every algorithmic building block should be registered with machine-readable metadata.

```ts
type ModuleSpec = {
  name: string
  // Stage names follow the canonical pipeline shape in 6.2.
  stage: "preprocess" | "detect" | "sync" | "demod" | "decode" | "frame" | "proto"
  inputFormat: string
  outputFormat: string
  params: Record<string, unknown> // tunable parameters; value type left open here
  constraints: string[]
  metrics: string[]
  cost: {
    cpu: number
    memory: number
    latency: number
  }
}
```

Without a registry, AI cannot safely orchestrate pipelines.

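As one illustration of why, registry metadata turns chain validity into a mechanical check instead of a guess. The sketch below is hypothetical: it uses a trimmed-down entry shape with only `name`, `inputFormat`, and `outputFormat`, and the format labels (`iq`, `bits`, `frames`) are invented for the example.

```typescript
// Hypothetical sketch: use registry metadata to check that each module's
// output format matches the next module's input format.
type RegistryEntry = { name: string; inputFormat: string; outputFormat: string }

const moduleRegistry = new Map<string, RegistryEntry>([
  ["gfsk_demod", { name: "gfsk_demod", inputFormat: "iq", outputFormat: "bits" }],
  [
    "fixed_preamble_framer",
    { name: "fixed_preamble_framer", inputFormat: "bits", outputFormat: "frames" },
  ],
  ["crc_scan", { name: "crc_scan", inputFormat: "frames", outputFormat: "frames" }],
])

// Returns a list of problems; an empty list means the chain is well-formed.
function validateChain(chain: string[]): string[] {
  const problems: string[] = []
  let previous: RegistryEntry | undefined = undefined
  for (const moduleName of chain) {
    const entry = moduleRegistry.get(moduleName)
    if (!entry) {
      problems.push(`unknown module: ${moduleName}`)
      previous = undefined // cannot check the next link without metadata
      continue
    }
    if (previous && previous.outputFormat !== entry.inputFormat) {
      problems.push(
        `${previous.name} -> ${entry.name}: ${previous.outputFormat} != ${entry.inputFormat}`,
      )
    }
    previous = entry
  }
  return problems
}

const goodChainProblems = validateChain(["gfsk_demod", "fixed_preamble_framer", "crc_scan"])
const badChainProblems = validateChain(["gfsk_demod", "crc_scan"])
```

Here the first chain passes with no problems, while the second is rejected because `crc_scan` expects frames and `gfsk_demod` produces bits. An orchestrator could run a check like this before spending any compute on an experiment.
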
The registry should allow the system to answer:

- what can run after what
- what parameters are tunable
- what metrics each module produces
- which modules are expensive
- which modules are suitable for real-time use

## 10. Knowledge Base and Priors

The system should maintain a history of prior successful analyses.

```ts
type PriorCase = {
  featureFingerprint: number[]
  successfulPipelines: RankedPipeline[]
}
```

Benefits:

- faster startup on familiar signal families
- reduced search cost
- improved reliability over time
- operator trust through precedent-based suggestions

This lets the AI behave more like an experienced lab engineer.

## 11. Runtime Modes

At minimum, the product should support:

### 11.1 Live Scan

- real-time acquisition
- limited local search
- fast confidence-ranked hints
- real-time alerting

### 11.2 Lab Replay

- deterministic offline reprocessing
- multiple experiment branches
- parameter tuning
- regression validation

### 11.3 Protocol Assist

- packet/event summarization
- protocol explanation
- filter and query generation
- reporting and export

## 12. Device vs Host Split

A portable device has strict CPU, memory, thermal, and battery limits. Do not assume the full AI search workload belongs on-device.

Recommended split:

- device side
  - acquisition
  - lightweight feature extraction
  - small bounded search
  - fast heuristic alerts
- host / dock / edge side
  - deep experiment search
  - heavy replay analysis
  - larger AI inference
  - training / rule generation

This split keeps the handheld usable under real operating conditions.

## 13. Safety Boundaries for AI

AI should be allowed to:

- select pipelines
- adjust parameters
- reorder compatible modules
- choose which experiment to run next
- generate summaries

AI should not directly and automatically:

- patch low-level production DSP code in the live path
- disable safety limits
- bypass deterministic validation
- replace scoring with free-form judgment

If AI-generated changes extend beyond parameter or policy updates, they should be validated in replay or sandbox mode first.

## 14. MVP Implementation Plan

A practical first version should be intentionally small.

### 14.1 MVP Scope

- one or two acquisition sources
- 10 to 20 reusable modules
- experiment manager
- scoring engine
- beam-search controller
- replay dataset support
- export to `pcap` / `pcapng`
- protocol analysis through `tshark` or `sharkd`

### 14.2 Suggested Initial Modules

- `dc_remove`
- `agc`
- `bandpass`
- `resample`
- `burst_detect`
- `freq_offset_est`
- `clock_recovery`
- `ook_demod`
- `2fsk_demod`
- `gfsk_demod`
- `slicer`
- `manchester_decode`
- `whitening_try`
- `crc_scan`
- `fixed_preamble_framer`
- `variable_length_framer`

### 14.3 Build Order

1. acquisition and replay path
2. module registry
3. pipeline executor
4. scoring engine
5. experiment persistence
6. AI orchestration loop
7. Wireshark-compatible export and protocol backend
8. handheld UI and reporting

## 15. Key Risks

Primary technical risks:

- search-space explosion
- weak scoring functions
- overfitting to noise or short windows
- mixing structure search and parameter search too early
- lack of reproducible experiment logs

Primary product risks:

- placing AI too low in the stack
- trying to make the first version too universal
- failing to define a standard intermediate representation

Primary integration risk:

- misunderstanding Wireshark's role and pushing it below the framing boundary

## 16. Summary

The proposed portable signal analyzer should be designed as a layered system:

- deterministic algorithms do the actual signal recovery
- Wireshark-derived tooling handles protocol analysis after framing
- AI operates above those layers as an experiment orchestrator, tuning controller, and explanation engine

The winning architecture is not "AI decodes everything". It is "AI controls a rigorous decoding and analysis workflow".