# Portable Signal Analyzer Architecture Notes

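Version: v0.1.0
Last updated: 2026-04-04

Note: this version is a historical document reorganized to follow the documentation standard; the body retains its original English content for now.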

This document describes a domain-specific product architecture that sits under the higher-level framework design in [多模态分析框架.md](D:/dev/TC/doc/总体架构/多模态分析框架.md).

## 1. Purpose

This document captures a product and architecture direction for a portable signal analyzer inspired by a "tricorder"-style workflow:

- collect signals from multiple physical sources
- detect, demodulate, decode, and frame them using deterministic algorithms
- apply protocol analysis on framed data
- use AI as an assistant, orchestrator, and experiment controller

The core principle is:

> AI should not replace the actual signal-processing and protocol-analysis engine.
> Deterministic algorithms should do the decoding work.
> AI should control experiments, compare results, explain outcomes, and choose the next action.

## 2. Product Positioning

The target device is not just a packet sniffer and not just an SDR receiver.
It is a multi-stage analysis platform for:

- RF and non-RF signal acquisition
- physical-layer and link-layer recovery
- protocol identification and interpretation
- guided diagnostics and anomaly explanation

Practical examples include:

- identifying unknown digital bursts
- recovering framed traffic from noisy captures
- decoding standard or proprietary protocols
- presenting operator-friendly summaries and next-step suggestions

## 3. Wireshark Reuse Boundary

Wireshark is useful, but only for the upper half of the stack.

Wireshark is strong at:

- framed packet dissection
- protocol tree generation
- display filtering
- reassembly, statistics, and follow-stream style analysis
- export and structured protocol interpretation

Wireshark is not the right tool for:

- raw RF analysis
- blind modulation recognition
- carrier recovery
- symbol timing recovery
- unknown physical-layer reconstruction

The practical reuse boundary is:

1. collect raw signal data
2. perform DSP, demodulation, bit recovery, and framing
3. convert recovered traffic into packets or events
4. hand those results to Wireshark-related tooling such as:
   - `pcap` / `pcapng`
   - `tshark`
   - `sharkd`
   - custom dissectors where appropriate

This makes Wireshark a protocol-analysis backend, not the full analyzer brain.
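
As a rough illustration of step 3 and step 4, the sketch below wraps recovered frames in the classic `pcap` container so that Wireshark-related tooling can read them. The names `RecoveredFrame` and `encodePcap` are hypothetical and not part of this design; the choice of `LINKTYPE_USER0` (147, a link type reserved for private use) is also an assumption that suits proprietary framing.

```ts
// Sketch only: RecoveredFrame and encodePcap are hypothetical names.
type RecoveredFrame = {
  timestampUs: number // microseconds since the Unix epoch
  payload: Uint8Array // bytes produced by the frame builder
}

const PCAP_MAGIC = 0xa1b2c3d4 // classic pcap with microsecond timestamps
const LINKTYPE_USER0 = 147 // reserved for private use (assumed choice)
const SNAPLEN = 65535 // maximum bytes stored per record

function encodePcap(frames: RecoveredFrame[], linkType: number = LINKTYPE_USER0): Uint8Array {
  let total = 24
  for (const f of frames) total += 16 + Math.min(f.payload.length, SNAPLEN)
  const out = new Uint8Array(total)
  const view = new DataView(out.buffer)

  // 24-byte global header, written little-endian.
  view.setUint32(0, PCAP_MAGIC, true)
  view.setUint16(4, 2, true) // version major
  view.setUint16(6, 4, true) // version minor
  view.setInt32(8, 0, true) // timezone offset (unused)
  view.setUint32(12, 0, true) // timestamp accuracy (unused)
  view.setUint32(16, SNAPLEN, true)
  view.setUint32(20, linkType, true)

  // One 16-byte record header plus payload per frame.
  let offset = 24
  for (const f of frames) {
    const captured = f.payload.subarray(0, SNAPLEN)
    const sec = Math.floor(f.timestampUs / 1000000)
    view.setUint32(offset, sec, true)
    view.setUint32(offset + 4, f.timestampUs - sec * 1000000, true)
    view.setUint32(offset + 8, captured.length, true) // bytes stored
    view.setUint32(offset + 12, f.payload.length, true) // original length
    out.set(captured, offset + 16)
    offset += 16 + captured.length
  }
  return out
}
```

The resulting bytes can be written to a file and opened with `tshark -r`. For a private link type, a custom dissector would still need to be associated with it before the payload is interpreted.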

## 4. System Architecture

The system should be split into clear layers.

### 4.1 Acquisition Layer

Inputs may include:

- IQ streams
- IF or audio data
- logic-level captures
- UART / SPI / I2C / CAN buses
- BLE / Wi-Fi / Ethernet mirrored traffic
- file-based replay samples

This layer should normalize access to multiple hardware front-ends and record:

- timestamp
- sample rate
- center frequency
- gain / front-end state
- source identity
- capture duration
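
One way to make this record concrete is a small metadata type with a sanity check. This is a minimal sketch; the field names, units, and the `validateCapture` helper are assumptions chosen for illustration.

```ts
// Hypothetical shape for the per-capture record listed above.
type CaptureMetadata = {
  timestampUs: number // capture start, microseconds since the Unix epoch
  sampleRateHz: number
  centerFrequencyHz?: number // absent for baseband or logic-level sources
  gainDb?: number // front-end gain, when the hardware reports it
  sourceId: string // which front-end or file produced the data
  durationS: number
}

// Returns a list of problems; an empty list means the record looks usable.
function validateCapture(meta: CaptureMetadata): string[] {
  const problems: string[] = []
  if (!(meta.timestampUs >= 0)) problems.push("timestampUs must be non-negative")
  if (!(meta.sampleRateHz > 0)) problems.push("sampleRateHz must be positive")
  if (meta.centerFrequencyHz !== undefined && !(meta.centerFrequencyHz >= 0)) {
    problems.push("centerFrequencyHz must be non-negative")
  }
  if (meta.sourceId.length === 0) problems.push("sourceId must not be empty")
  if (!(meta.durationS > 0)) problems.push("durationS must be positive")
  return problems
}

// Number of samples the capture should contain, useful for detecting drops.
function expectedSampleCount(meta: CaptureMetadata): number {
  return Math.round(meta.sampleRateHz * meta.durationS)
}
```

Comparing `expectedSampleCount` with the number of samples actually stored gives a cheap check for dropped buffers before any DSP runs.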

### 4.2 Signal Workspace

The workspace is the canonical store for both raw and intermediate data.
It should retain:

- raw samples
- derived features
- intermediate bitstreams
- framed outputs
- experiment metadata
- scores and failure reasons

This is essential for reproducibility, offline replay, regression testing, and AI-driven iteration.

### 4.3 DSP / Demodulation Pipeline

This layer performs actual signal recovery.
Typical module categories:

- preprocessing
  - DC removal
  - AGC
  - filtering
  - resampling
- detection
  - energy detection
  - burst detection
  - coarse frequency estimation
- synchronization
  - carrier recovery
  - symbol clock recovery
  - preamble / sync-word search
- demodulation
  - OOK / ASK
  - FSK / GFSK
  - PSK / QPSK
  - OFDM-family or chirp-style paths when supported
- bit-domain processing
  - hard or soft decision
  - de-whitening
  - de-interleaving
  - FEC decoding
  - CRC validation
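
To show the kind of deterministic block this layer is built from, here is a minimal sketch of two of the simplest modules: DC removal and energy-based burst detection over real-valued samples. The function names, the sliding-window approach, and the threshold convention are illustrative assumptions, not a prescribed implementation.

```ts
// Subtracts the mean so a constant offset does not look like signal energy.
function removeDc(samples: number[]): number[] {
  let sum = 0
  for (const s of samples) sum += s
  const mean = samples.length > 0 ? sum / samples.length : 0
  return samples.map((s) => s - mean)
}

type Burst = { start: number; end: number } // sample indices, end exclusive

// Flags regions where the mean energy over a trailing window exceeds a threshold.
function detectBursts(samples: number[], window: number, threshold: number): Burst[] {
  if (window < 1) throw new Error("window must be at least 1")
  const bursts: Burst[] = []
  let energy = 0
  let start = -1
  for (let i = 0; i < samples.length; i++) {
    energy += samples[i] * samples[i]
    if (i >= window) energy -= samples[i - window] * samples[i - window]
    const count = Math.min(i + 1, window)
    const active = energy / count > threshold
    if (active && start < 0) start = i
    if (!active && start >= 0) {
      bursts.push({ start, end: i })
      start = -1
    }
  }
  if (start >= 0) bursts.push({ start, end: samples.length })
  return bursts
}
```

Because the window trails the current sample, a detected burst ends slightly after the signal does. A real detector would likely add hysteresis and a noise-floor estimate.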

### 4.4 Frame Builder

This layer transforms bitstreams into candidate frames or packets.
It is the boundary between "signal recovery" and "protocol interpretation".

Responsibilities:

- frame boundary detection
- fixed/variable length frame assembly
- checksum / CRC validation
- field boundary estimation
- event extraction
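
The first two responsibilities can be sketched as a sync-word search followed by fixed-length frame assembly. This is a simplified illustration: the helper names are hypothetical, bits are packed MSB-first by assumption, and checksum validation is left out.

```ts
// Counts bit mismatches between the stream at `offset` and a pattern.
function countMismatches(bits: number[], offset: number, pattern: number[]): number {
  let errors = 0
  for (let i = 0; i < pattern.length; i++) {
    if (bits[offset + i] !== pattern[i]) errors++
  }
  return errors
}

// Packs bits into bytes, most significant bit first; leftover bits are dropped.
function packBits(bits: number[]): Uint8Array {
  const bytes = new Uint8Array(Math.floor(bits.length / 8))
  for (let i = 0; i < bytes.length * 8; i++) {
    bytes[i >> 3] = (bytes[i >> 3] << 1) | (bits[i] & 1)
  }
  return bytes
}

// Scans for the sync word (tolerating a few bit errors) and cuts out
// a fixed-length payload after each match.
function extractFrames(
  bits: number[],
  sync: number[],
  payloadBits: number,
  maxSyncErrors: number
): Uint8Array[] {
  const frames: Uint8Array[] = []
  const frameBits = sync.length + payloadBits
  let i = 0
  while (i + frameBits <= bits.length) {
    if (countMismatches(bits, i, sync) <= maxSyncErrors) {
      frames.push(packBits(bits.slice(i + sync.length, i + frameBits)))
      i += frameBits // resume the search after this frame
    } else {
      i++
    }
  }
  return frames
}
```

Allowing a small `maxSyncErrors` trades missed frames for false frame starts, which is why downstream CRC validation matters.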

### 4.5 Protocol Analysis Layer

Once data has become packets or events:

- use Wireshark-compatible outputs for standard protocols
- use internal parsers and heuristics for proprietary protocols
- gradually migrate stable proprietary formats into custom dissectors if needed

### 4.6 UI and Assistant Layer

This layer provides:

- live scan results
- replay and lab analysis
- confidence-ranked protocol candidates
- anomaly explanations
- next-step recommendations
- exportable reports

## 5. AI Role in the System

AI should not directly replace DSP blocks.
Its primary role is that of an orchestration and analysis controller.

AI responsibilities:

- choose candidate algorithm pipelines
- tune parameters
- compare candidate outputs
- explain likely failure points
- decide what to try next
- summarize results for the operator

AI should behave like an automated signal-analysis engineer, not like a magical decoder.

## 6. AI Search for Algorithm Chains

### 6.1 Problem Definition

Algorithm-chain search is a constrained program-search problem.

Input:

- raw or partially processed signal data
- prior device/context metadata
- previous experiment history

Output:

- a ranked set of candidate pipelines
- associated parameter settings
- score and confidence estimates

Optimization target:

- maximize correctness and interpretability
- minimize computational cost and false positives

### 6.2 Canonical Pipeline Shape

A pipeline can be modeled as:

```text
source
-> preprocess
-> detect
-> sync
-> demod
-> decode
-> frame
-> proto
```

Each stage may have several interchangeable modules.
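
A candidate pipeline can be checked against this shape before it is run. The sketch below assumes that stages may be skipped and that several modules may share one stage (for example two preprocessing steps), but that stages never go backwards; `StageChoice` and `isCanonicalOrder` are hypothetical names.

```ts
// Stage names taken from the canonical pipeline shape above.
const STAGE_ORDER = [
  "source", "preprocess", "detect", "sync", "demod", "decode", "frame", "proto",
] as const

type Stage = (typeof STAGE_ORDER)[number]
type StageChoice = { stage: Stage; module: string }

// True when the stages appear in non-decreasing canonical order.
function isCanonicalOrder(pipeline: StageChoice[]): boolean {
  let last = -1
  for (const step of pipeline) {
    const rank = STAGE_ORDER.indexOf(step.stage)
    if (rank < last) return false
    last = rank
  }
  return true
}
```

Rejecting out-of-order chains early keeps the search controller from spending experiment budget on pipelines that cannot work.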

### 6.3 Search State

Each attempt should be tracked as an experiment node.

```ts
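// StageOutput and ScoreCard are not defined in this document; the shapes
// below are placeholder assumptions so the snippet type-checks on its own.
type StageOutput = { stage: string; dataRef: string }
type ScoreCard = { phy: number; frame: number; proto: number; costPenalty: number }

// One node in the experiment tree.
type Experiment = {
  id: string
  parentId?: string // absent for root experiments
  pipeline: PipelineNode[]
  inputRef: string // reference to the input data in the signal workspace
  outputs: StageOutput[]
  score: ScoreCard
  status: "pending" | "running" | "done" | "failed"
  notes?: string
}

// One module invocation with its concrete parameter values.
type PipelineNode = {
  module: string
  params: Record<string, number | string | boolean>
}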
```

This allows AI to operate over an experiment tree instead of producing one-off guesses.

### 6.4 Recommended Search Strategy

Use a hybrid strategy:

- rules for initialization
- beam search for structure search
- local optimization for parameter tuning

Recommended control flow:

1. classify signal at a coarse level
2. generate a small number of high-probability candidate pipelines
3. run short-window experiments
4. score results and prune aggressively
5. mutate the best pipelines locally
6. rerun on longer samples for confirmation

This is more stable than unconstrained random search because every round starts from scored candidates and pruning bounds the work done per round.

### 6.5 Why Beam Search Fits

Beam search is a strong fit because it:

- is resource-bounded
- is easy to explain and debug
- supports progressive refinement
- works well with ranked experiment history

Suggested pattern:

- outer loop: beam search over module-chain structure
- inner loop: bounded parameter tuning around the best chains
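
The outer loop can be sketched as a plain beam search over one module choice per stage. This is a minimal illustration: `beamSearch` is a hypothetical helper, and the caller-supplied `scoreChain` stands in for running a short-window experiment and scoring it.

```ts
type Chain = string[] // module names, one per stage so far
type ScoredChain = { chain: Chain; score: number }

// Expands the beam one stage at a time and keeps only the best `beamWidth` chains.
function beamSearch(
  stages: string[][], // candidate modules for each stage, in pipeline order
  scoreChain: (chain: Chain) => number, // higher is better
  beamWidth: number
): ScoredChain[] {
  let beam: ScoredChain[] = [{ chain: [], score: 0 }]
  for (const candidates of stages) {
    const expanded: ScoredChain[] = []
    for (const entry of beam) {
      for (const moduleName of candidates) {
        const chain = entry.chain.concat([moduleName])
        expanded.push({ chain, score: scoreChain(chain) })
      }
    }
    expanded.sort((a, b) => b.score - a.score)
    beam = expanded.slice(0, beamWidth) // prune aggressively
  }
  return beam
}
```

The number of experiments per stage is bounded by `beamWidth` times the number of candidates for that stage, which is what keeps the search resource-bounded. The surviving chains would then be handed to the inner parameter-tuning loop.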

### 6.6 Parameter Tuning

Continuous or range-based parameters should be tuned separately from structure search.
Typical tunables:

- symbol rate
- bandwidth
- threshold values
- timing recovery parameters
- frequency offset compensation
- framing tolerances

Possible strategies:

- bounded grid search
- adaptive range narrowing
- Bayesian optimization where available
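
The first two strategies combine naturally: run a bounded grid search, then narrow the range around the best point and repeat. The sketch below tunes a single parameter this way. `tuneByNarrowing` is a hypothetical helper, and it assumes the objective has a single peak within the bounds; a multi-peak objective may settle on a local optimum.

```ts
type TuneResult = { value: number; score: number }

// Repeated grid search over [min, max], shrinking the range around the
// best point after every round. Higher objective values are better.
function tuneByNarrowing(
  objective: (x: number) => number,
  min: number,
  max: number,
  gridPoints: number,
  rounds: number
): TuneResult {
  if (gridPoints < 3 || !(max > min)) {
    throw new Error("need max > min and at least 3 grid points")
  }
  let lo = min
  let hi = max
  let best: TuneResult = { value: lo, score: objective(lo) }
  for (let r = 0; r < rounds; r++) {
    const step = (hi - lo) / (gridPoints - 1)
    for (let k = 0; k < gridPoints; k++) {
      const x = lo + k * step
      const s = objective(x)
      if (s > best.score) best = { value: x, score: s }
    }
    // Keep one grid step on each side of the best point, within the bounds.
    lo = Math.max(min, best.value - step)
    hi = Math.min(max, best.value + step)
  }
  return best
}
```

For example, a symbol-rate search could pass an objective that runs clock recovery at the given rate and returns the frame-level score. The cost is `gridPoints * rounds` experiment runs per parameter.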

## 7. Scoring System

The scoring system is the backbone of the AI loop.
AI can only optimize what is measured.

### 7.1 Physical-Layer Score

Examples:

- SNR improvement
- carrier lock stability
- clock recovery stability
- cluster separation after demodulation
- residual frequency error

### 7.2 Frame-Level Score

Examples:

- preamble detection rate
- frame length consistency
- frame-boundary stability
- CRC pass rate
- repeated structure frequency

### 7.3 Protocol-Level Score

Examples:

- known-header matches
- valid field lengths
- legal enum / field value ratios
- session consistency
- successful Wireshark-style protocol interpretation

### 7.4 Cost Penalty

Examples:

- CPU cost
- memory cost
- latency
- fragility under small parameter changes
- overfitting to short windows

### 7.5 Example Composite Score

```text
score =
  0.25 * phy_score +
  0.35 * frame_score +
  0.30 * proto_score -
  0.10 * cost_penalty
```

Weights should initially be hand-tuned and later adjusted using replay corpora.
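
The formula above translates directly into code. This sketch assumes each component score has already been normalized to the range 0 to 1, which the formula does not state; the `ScoreCard` fields shown here are one possible shape, not a defined interface.

```ts
// Assumed shape: every component is normalized to [0, 1].
type ScoreCard = {
  phy: number
  frame: number
  proto: number
  costPenalty: number
}

// Weights copied from the example composite score.
const WEIGHTS = { phy: 0.25, frame: 0.35, proto: 0.3, cost: 0.1 }

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x))
}

// Weighted sum of the three quality scores minus the weighted cost penalty.
function compositeScore(s: ScoreCard): number {
  return (
    WEIGHTS.phy * clamp01(s.phy) +
    WEIGHTS.frame * clamp01(s.frame) +
    WEIGHTS.proto * clamp01(s.proto) -
    WEIGHTS.cost * clamp01(s.costPenalty)
  )
}
```

With these weights the result lies between -0.10 and 0.90, so comparisons between experiments are meaningful only when all of them use the same weight set.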

## 8. Failure Attribution

Every experiment should return structured failure reasons.

Example labels:

- `no_signal_detected`
- `unstable_symbol_clock`
- `carrier_not_locked`
- `frame_sync_failed`
- `crc_failed`
- `field_semantics_invalid`
- `overfit_to_noise`

This enables targeted next-step decisions.

Examples:

- `unstable_symbol_clock` -> adjust symbol-rate range or swap timing recovery module
- `crc_failed` -> try bit inversion, whitening, byte order, CRC family changes
- `field_semantics_invalid` -> reconsider framing or protocol family
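
These mappings can live in a small lookup table that the search controller consults. In the sketch below the failure labels come from the list above, while the action identifiers are hypothetical names for the follow-ups described in the examples. Labels without a listed follow-up return an empty list so the controller can fall back to its general search.

```ts
type FailureLabel =
  | "no_signal_detected"
  | "unstable_symbol_clock"
  | "carrier_not_locked"
  | "frame_sync_failed"
  | "crc_failed"
  | "field_semantics_invalid"
  | "overfit_to_noise"

// Only the three labels with documented follow-ups are filled in.
const NEXT_ACTIONS: Partial<Record<FailureLabel, string[]>> = {
  unstable_symbol_clock: ["adjust_symbol_rate_range", "swap_timing_recovery_module"],
  crc_failed: ["try_bit_inversion", "try_whitening", "try_byte_order", "try_crc_family"],
  field_semantics_invalid: ["reconsider_framing", "reconsider_protocol_family"],
}

// Returns targeted follow-up actions, or an empty list when none are known.
function nextActions(label: FailureLabel): string[] {
  const actions = NEXT_ACTIONS[label]
  return actions !== undefined ? actions : []
}
```

Keeping the table as data rather than code makes it easy to extend from replay results without touching the controller.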

## 9. Module Registry

Every algorithmic building block should be registered with machine-readable metadata.

```ts
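// ParamSpec is not defined in this document; the shape below is a
// placeholder assumption so the snippet type-checks on its own.
type ParamSpec = {
  kind: "number" | "string" | "boolean"
  min?: number
  max?: number
  tunable: boolean
}

type ModuleSpec = {
  name: string
  stage: "preprocess" | "detect" | "sync" | "demod" | "decode" | "frame" | "proto"
  inputFormat: string // data format consumed, used for chain compatibility checks
  outputFormat: string
  params: Record<string, ParamSpec>
  constraints: string[]
  metrics: string[] // names of the metrics this module reports
  cost: { cpu: number; memory: number; latency: number }
}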
```

Without a registry, AI cannot safely orchestrate pipelines.

The registry should allow the system to answer:

- what can run after what
- what parameters are tunable
- what metrics each module produces
- which modules are expensive
- which modules are suitable for real-time use
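
Two of these questions, "what can run after what" and "which chains are suitable for real-time use", can be answered with simple checks over registry entries. The sketch below uses a reduced entry type and hypothetical helper names; treating format equality as the compatibility rule and summed latency as the real-time criterion are both simplifying assumptions.

```ts
// Reduced registry entry with only the fields these checks need.
type ModuleInfo = {
  name: string
  inputFormat: string
  outputFormat: string
  cost: { cpu: number; memory: number; latency: number }
}

// Lists every adjacent pair whose formats do not line up.
function chainProblems(chain: ModuleInfo[]): string[] {
  const problems: string[] = []
  for (let i = 1; i < chain.length; i++) {
    const prev = chain[i - 1]
    const next = chain[i]
    if (prev.outputFormat !== next.inputFormat) {
      problems.push(`${prev.name} -> ${next.name}: ${prev.outputFormat} != ${next.inputFormat}`)
    }
  }
  return problems
}

// True when the summed module latency stays within the budget.
function fitsLatencyBudget(chain: ModuleInfo[], budget: number): boolean {
  let total = 0
  for (const m of chain) total += m.cost.latency
  return total <= budget
}
```

The orchestrator would run checks like these before scheduling an experiment, so an invalid chain never reaches the pipeline executor.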

## 10. Knowledge Base and Priors

The system should maintain a history of prior successful analyses.

```ts
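// RankedPipeline is not defined in this document; placeholder assumption.
type RankedPipeline = { modules: string[]; score: number }

type PriorCase = {
  featureFingerprint: number[] // feature vector summarizing the signal
  successfulPipelines: RankedPipeline[] // pipelines that worked for it
}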
```

Benefits:

- faster startup on familiar signal families
- reduced search cost
- improved reliability over time
- operator trust through precedent-based suggestions

This lets the AI behave more like an experienced lab engineer.
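
Looking up a precedent can be as simple as a nearest-neighbor search over stored fingerprints. The sketch below uses cosine similarity, which is one reasonable choice rather than a requirement of this design; `RankedPipeline` is given a placeholder shape and `closestPrior` is a hypothetical helper.

```ts
// Placeholder shape for illustration only.
type RankedPipeline = { modules: string[]; score: number }

type PriorCase = {
  featureFingerprint: number[]
  successfulPipelines: RankedPipeline[]
}

// Cosine similarity; returns 0 for mismatched lengths or zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) return 0
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / Math.sqrt(normA * normB)
}

// Returns the most similar prior case at or above the similarity floor.
function closestPrior(
  cases: PriorCase[],
  fingerprint: number[],
  minSimilarity: number
): PriorCase | undefined {
  let best: PriorCase | undefined = undefined
  let bestSimilarity = minSimilarity
  for (const c of cases) {
    const similarity = cosineSimilarity(c.featureFingerprint, fingerprint)
    if (similarity >= bestSimilarity) {
      best = c
      bestSimilarity = similarity
    }
  }
  return best
}
```

When a match is found, its `successfulPipelines` can seed the initial beam; when nothing clears the floor, the search starts from the rule-based initialization instead.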

## 11. Runtime Modes

At minimum, the product should support:

### 11.1 Live Scan

- real-time acquisition
- limited local search
- fast confidence-ranked hints
- real-time alerting

### 11.2 Lab Replay

- deterministic offline reprocessing
- multiple experiment branches
- parameter tuning
- regression validation

### 11.3 Protocol Assist

- packet/event summarization
- protocol explanation
- filter and query generation
- reporting and export

## 12. Device vs Host Split

A portable device has strict CPU, memory, thermal, and battery limits.
Do not assume the full AI search workload belongs on-device.

Recommended split:

- device side
  - acquisition
  - lightweight feature extraction
  - small bounded search
  - fast heuristic alerts
- host / dock / edge side
  - deep experiment search
  - heavy replay analysis
  - larger AI inference
  - training / rule generation

This split keeps the handheld usable under real operating conditions.

## 13. Safety Boundaries for AI

AI should be allowed to:

- select pipelines
- adjust parameters
- reorder compatible modules
- choose which experiment to run next
- generate summaries

AI should not directly and automatically:

- patch low-level production DSP code in the live path
- disable safety limits
- bypass deterministic validation
- replace scoring with free-form judgment

If AI-generated changes extend beyond parameter or policy updates, they should be validated in replay or sandbox mode first.
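
These boundaries can be enforced by a small gate that every AI-proposed action passes through. The sketch below is one possible reading of the lists above: the action identifiers are hypothetical, and routing code patches to a sandbox while denying the other three outright is an assumption about how the rule would be applied.

```ts
type ProposedAction =
  | "select_pipeline"
  | "adjust_parameters"
  | "reorder_modules"
  | "schedule_experiment"
  | "generate_summary"
  | "patch_dsp_code"
  | "disable_safety_limit"
  | "bypass_validation"
  | "replace_scoring"

type Verdict = "allow" | "sandbox_first" | "deny"

// Maps each proposed action to a verdict; anything not explicitly allowed is denied.
function reviewAction(action: ProposedAction): Verdict {
  switch (action) {
    case "select_pipeline":
    case "adjust_parameters":
    case "reorder_modules":
    case "schedule_experiment":
    case "generate_summary":
      return "allow"
    case "patch_dsp_code":
      // Goes beyond parameter or policy updates: validate in replay or sandbox first.
      return "sandbox_first"
    default:
      return "deny"
  }
}
```

Defaulting to `deny` means a newly added action type stays blocked until someone classifies it deliberately.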

## 14. MVP Implementation Plan

A practical first version should be intentionally small.

### 14.1 MVP Scope

- one or two acquisition sources
- 10 to 20 reusable modules
- experiment manager
- scoring engine
- beam-search controller
- replay dataset support
- export to `pcap` / `pcapng`
- protocol analysis through `tshark` or `sharkd`

### 14.2 Suggested Initial Modules

- `dc_remove`
- `agc`
- `bandpass`
- `resample`
- `burst_detect`
- `freq_offset_est`
- `clock_recovery`
- `ook_demod`
- `2fsk_demod`
- `gfsk_demod`
- `slicer`
- `manchester_decode`
- `whitening_try`
- `crc_scan`
- `fixed_preamble_framer`
- `variable_length_framer`

### 14.3 Build Order

1. acquisition and replay path
2. module registry
3. pipeline executor
4. scoring engine
5. experiment persistence
6. AI orchestration loop
7. Wireshark-compatible export and protocol backend
8. handheld UI and reporting

## 15. Key Risks

Primary technical risks:

- search-space explosion
- weak scoring functions
- overfitting to noise or short windows
- mixing structure search and parameter search too early
- lack of reproducible experiment logs

Primary product risks:

- placing AI too low in the stack
- trying to make the first version too universal
- failing to define a standard intermediate representation

Primary integration risk:

- misunderstanding Wireshark's role and pushing it below the framing boundary

## 16. Summary

The proposed portable signal analyzer should be designed as a layered system:

- deterministic algorithms do the actual signal recovery
- Wireshark-derived tooling handles protocol analysis after framing
- AI operates above those layers as an experiment orchestrator, tuning controller, and explanation engine

The winning architecture is not "AI decodes everything".
It is "AI controls a rigorous decoding and analysis workflow".