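Core Data Model

Version: v0.1.0  Last updated: 2026-04-04

Note: This edition is a historical document reorganized to follow the documentation standard; the body keeps its original English text for now.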

1. Purpose

This document defines the core shared data model for the multimodal analysis framework.

These data structures are the foundation for:

  • modality integration
  • algorithm module registration
  • pipeline execution
  • experiment tracking
  • evidence fusion
  • AI orchestration

This document is intended to be the first implementation-facing specification after the architecture documents.

It complements:

2. Design Principles

The core data model should be:

  • modality-agnostic at the top level
  • specific enough to support real execution
  • serializable
  • versioned
  • stable across runtime, storage, and network boundaries

The first implementation should prefer simple, explicit schemas over clever abstractions.

3. Core Entities

The first version of the system should standardize five primary entities:

  1. InputSource
  2. Observation
  3. AlgorithmModule
  4. Evidence
  5. Experiment

These entities should be sufficient to support:

  • audio-first execution
  • later RF extension
  • future multimodal expansion

4. InputSource

4.1 Purpose

InputSource describes where data comes from and what it is capable of producing.

Examples:

  • phone microphone
  • external USB microphone
  • SDR receiver
  • camera
  • photodiode board
  • file replay source

4.2 Suggested Shape

type InputSource = {
  id: string
  kind: "audio" | "rf" | "video" | "optical" | "bus" | "file" | "log"
  name: string
  capabilities: string[]
  deviceInfo?: Record<string, unknown>
  supportedFormats: DataFormat[]
  location?: SourceLocation
  status: "available" | "busy" | "offline" | "error"
  createdAt: string
  updatedAt: string
}

4.3 Notes

  • InputSource is not the captured data itself.
  • It represents a reusable source of observations.
  • Capabilities should be machine-readable whenever possible.
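
As an illustration of the shape and notes above, a phone microphone might be described as follows. This is a minimal sketch, not a prescribed catalog entry: the capability strings, the deviceInfo keys, and the helper names hasCapability and canCapture are assumptions made for this example, and the optional location field is left out because SourceLocation is not defined in this document.

```typescript
// Types repeated from sections 4.2 and 6.2 so the snippet stands alone.
type DataFormat = {
  family: string
  encoding: string
  shape?: Record<string, unknown>
}

type InputSource = {
  id: string
  kind: "audio" | "rf" | "video" | "optical" | "bus" | "file" | "log"
  name: string
  capabilities: string[]
  deviceInfo?: Record<string, unknown>
  supportedFormats: DataFormat[]
  status: "available" | "busy" | "offline" | "error"
  createdAt: string
  updatedAt: string
}

// Hypothetical values: the capability vocabulary is not defined by this document.
const phoneMic: InputSource = {
  id: "src-phone-mic-01",
  kind: "audio",
  name: "Phone microphone",
  capabilities: ["capture.live", "sample_rate.48000", "channels.1"],
  deviceInfo: { model: "example-phone" },
  supportedFormats: [{ family: "audio", encoding: "wav_pcm16" }],
  status: "available",
  createdAt: "2026-04-04T00:00:00Z",
  updatedAt: "2026-04-04T00:00:00Z",
}

// Machine-readable capabilities make programmatic checks trivial.
function hasCapability(source: InputSource, capability: string): boolean {
  return source.capabilities.indexOf(capability) !== -1
}

// A source can capture only when it is available and advertises live capture.
function canCapture(source: InputSource): boolean {
  return source.status === "available" && hasCapability(source, "capture.live")
}
```

Note that the object carries no audio data; it only describes the source that later Observations refer to through sourceId.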

5. Observation

5.1 Purpose

Observation is a concrete captured sample, or a replayable unit of data, together with its capture context.

Examples:

  • a 5-second audio clip
  • a segment of IQ samples
  • a short video clip
  • an imported WAV file

5.2 Suggested Shape

type Observation = {
  id: string
  sourceId: string
  modality: "audio" | "rf" | "video" | "optical" | "bus" | "log"
  format: DataFormat
  payloadRef: string
  byteSize?: number
  timeRange: {
    start: string
    end: string
  }
  captureMetadata: Record<string, unknown>
  tags: string[]
  createdAt: string
}

5.3 Audio-Specific Metadata Examples

For the audio-first product, captureMetadata may include:

  • sample rate
  • channels
  • bit depth
  • microphone type
  • estimated noise floor
  • device model

5.4 Notes

  • payloadRef should point to a stable storage reference rather than embedding large binary data directly.
  • Observation should be replayable: a pipeline can be re-run against the same payloadRef without capturing from the source again.
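
A 5-second audio clip could be recorded like this. The sketch assumes that timestamps are ISO 8601 strings (the document only says string), and the store:// reference scheme, the captureMetadata key names, and the durationMs helper are hypothetical.

```typescript
// Types repeated from sections 5.2 and 6.2 so the snippet stands alone.
type DataFormat = {
  family: string
  encoding: string
  shape?: Record<string, unknown>
}

type Observation = {
  id: string
  sourceId: string
  modality: "audio" | "rf" | "video" | "optical" | "bus" | "log"
  format: DataFormat
  payloadRef: string
  byteSize?: number
  timeRange: {
    start: string
    end: string
  }
  captureMetadata: Record<string, unknown>
  tags: string[]
  createdAt: string
}

const demoClip: Observation = {
  id: "obs-0001",
  sourceId: "src-phone-mic-01",
  modality: "audio",
  format: { family: "audio", encoding: "wav_pcm16" },
  // Hypothetical storage reference: the payload lives elsewhere, not in this object.
  payloadRef: "store://observations/obs-0001.wav",
  // 5 s x 48000 samples/s x 2 bytes x 1 channel, plus a 44-byte WAV header.
  byteSize: 480044,
  timeRange: {
    start: "2026-04-04T10:00:00.000Z",
    end: "2026-04-04T10:00:05.000Z",
  },
  captureMetadata: { sampleRate: 48000, channels: 1, bitDepth: 16 },
  tags: ["demo"],
  createdAt: "2026-04-04T10:00:05.000Z",
}

// Assumes ISO 8601 timestamps, which Date.parse understands.
function durationMs(obs: Observation): number {
  return Date.parse(obs.timeRange.end) - Date.parse(obs.timeRange.start)
}
```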

6. DataFormat

6.1 Purpose

DataFormat describes the structural format flowing between modules.

6.2 Suggested Shape

type DataFormat = {
  family: string
  encoding: string
  shape?: Record<string, unknown>
}

6.3 Examples

  • { family: "audio", encoding: "wav_pcm16" }
  • { family: "audio", encoding: "spectrogram_f32" }
  • { family: "rf", encoding: "complex_iq_f32" }
  • { family: "signal", encoding: "soft_bits" }
  • { family: "packet", encoding: "pcap_frame" }
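
Because modules are wired together by format, a small comparison helper is likely to be needed early. The sketch below assumes two formats are compatible when family and encoding match exactly and ignores shape; that rule, and the names sameFormat and formatKey, are assumptions rather than part of the model.

```typescript
// Type repeated from section 6.2 so the snippet stands alone.
type DataFormat = {
  family: string
  encoding: string
  shape?: Record<string, unknown>
}

// Assumed rule: exact match on family and encoding; shape is not compared.
function sameFormat(a: DataFormat, b: DataFormat): boolean {
  return a.family === b.family && a.encoding === b.encoding
}

// A compact string key, convenient for logs or lookup tables.
function formatKey(format: DataFormat): string {
  return `${format.family}/${format.encoding}`
}

const wavPcm16: DataFormat = { family: "audio", encoding: "wav_pcm16" }
const spectrogramF32: DataFormat = { family: "audio", encoding: "spectrogram_f32" }
```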

7. AlgorithmModule

7.1 Purpose

AlgorithmModule defines a registered processing block that can be used in a pipeline.

Examples:

  • denoiser
  • spectrogram extractor
  • MFCC extractor
  • sound event detector
  • demodulator
  • framer
  • classifier

7.2 Suggested Shape

type AlgorithmModule = {
  id: string
  name: string
  version: string
  modality: string[]
  stage: string
  inputFormat: DataFormat
  outputFormat: DataFormat
  params: Record<string, ParamSpec>
  metrics: string[]
  executionModel: "in_process" | "out_of_process" | "sandboxed"
  trustLevel: "core" | "trusted" | "partner" | "experimental" | "untrusted"
  realtimeSafe: boolean
  source: ModuleSource
  dependencies?: string[]
  createdAt: string
  updatedAt: string
}

7.3 ParamSpec

type ParamSpec = {
  type: "int" | "float" | "bool" | "string" | "enum"
  required?: boolean
  defaultValue?: unknown
  range?: [number, number]
  choices?: string[]
  description?: string
}

7.4 ModuleSource

type ModuleSource = {
  provider: string
  packageName?: string
  license?: string
  originType: "builtin" | "pack" | "third_party" | "ai_generated"
}
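
One use of ParamSpec is validating the parameters supplied for a module before it runs. The following is a sketch under stated assumptions: range is treated as inclusive, defaultValue is applied when a parameter is missing, and unknown parameter names are reported. None of these rules is fixed by this document, and validateParams and the denoiser parameters are hypothetical.

```typescript
// Type repeated from section 7.3 so the snippet stands alone.
type ParamSpec = {
  type: "int" | "float" | "bool" | "string" | "enum"
  required?: boolean
  defaultValue?: unknown
  range?: [number, number]
  choices?: string[]
  description?: string
}

// Returns a list of problems; an empty list means the parameters are acceptable.
function validateParams(
  specs: Record<string, ParamSpec>,
  params: Record<string, unknown>
): string[] {
  const problems: string[] = []
  for (const key of Object.keys(specs)) {
    const spec = specs[key]
    // Assumption: fall back to the declared default when no value is given.
    const value = params[key] !== undefined ? params[key] : spec.defaultValue
    if (value === undefined) {
      if (spec.required) problems.push(`${key}: missing required parameter`)
      continue
    }
    if (spec.type === "int" || spec.type === "float") {
      if (typeof value !== "number" || !isFinite(value)) {
        problems.push(`${key}: expected a finite number`)
      } else if (spec.type === "int" && Math.floor(value) !== value) {
        problems.push(`${key}: expected an integer`)
      } else if (spec.range && (value < spec.range[0] || value > spec.range[1])) {
        // Assumption: both ends of the range are inclusive.
        problems.push(`${key}: outside range ${spec.range[0]}..${spec.range[1]}`)
      }
    } else if (spec.type === "bool") {
      if (typeof value !== "boolean") problems.push(`${key}: expected a boolean`)
    } else if (spec.type === "string") {
      if (typeof value !== "string") problems.push(`${key}: expected a string`)
    } else {
      // Remaining case: "enum", which must be one of the declared choices.
      const choices = spec.choices || []
      if (typeof value !== "string" || choices.indexOf(value) === -1) {
        problems.push(`${key}: not one of the allowed choices`)
      }
    }
  }
  for (const key of Object.keys(params)) {
    if (!Object.prototype.hasOwnProperty.call(specs, key)) {
      problems.push(`${key}: unknown parameter`)
    }
  }
  return problems
}

// Hypothetical parameter specs for a denoiser module.
const denoiserSpecs: Record<string, ParamSpec> = {
  strength: { type: "float", range: [0, 1], defaultValue: 0.5 },
  mode: { type: "enum", choices: ["spectral", "wiener"], required: true },
}
```

Returning a list rather than throwing on the first problem lets a registry or UI show every issue at once.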

8. PipelineNode

8.1 Purpose

PipelineNode captures one configured module instance inside an experiment pipeline.

8.2 Suggested Shape

type PipelineNode = {
  moduleId: string
  params: Record<string, unknown>
  enabled: boolean
}

8.3 Notes

  • AlgorithmModule is a registry definition.
  • PipelineNode is a concrete use of that module with specific parameters.
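
The split between registry definition and configured instance can be illustrated with a pipeline check that resolves each node against a registry and verifies that formats line up. This is a sketch only: ModuleDef is a reduced stand-in for AlgorithmModule, disabled nodes are assumed to be skipped, formats are assumed to match on family and encoding, and the module ids and the class_scores encoding are hypothetical.

```typescript
type DataFormat = {
  family: string
  encoding: string
  shape?: Record<string, unknown>
}

// Reduced stand-in for AlgorithmModule (section 7.2): only the fields used here.
type ModuleDef = {
  id: string
  inputFormat: DataFormat
  outputFormat: DataFormat
}

type PipelineNode = {
  moduleId: string
  params: Record<string, unknown>
  enabled: boolean
}

// Walks the pipeline in order and reports unknown modules and format mismatches.
function checkPipeline(
  nodes: PipelineNode[],
  registry: Record<string, ModuleDef>,
  input: DataFormat
): string[] {
  const problems: string[] = []
  let current = input
  for (const node of nodes) {
    if (!node.enabled) continue // assumption: disabled nodes are skipped entirely
    if (!Object.prototype.hasOwnProperty.call(registry, node.moduleId)) {
      problems.push(`unknown module: ${node.moduleId}`)
      continue
    }
    const mod = registry[node.moduleId]
    const expected = mod.inputFormat
    if (expected.family !== current.family || expected.encoding !== current.encoding) {
      problems.push(
        `${node.moduleId}: expects ${expected.family}/${expected.encoding}, ` +
          `got ${current.family}/${current.encoding}`
      )
    }
    current = mod.outputFormat
  }
  return problems
}

const wavFormat: DataFormat = { family: "audio", encoding: "wav_pcm16" }
const specFormat: DataFormat = { family: "audio", encoding: "spectrogram_f32" }
const scoresFormat: DataFormat = { family: "audio", encoding: "class_scores" } // hypothetical

const demoRegistry: Record<string, ModuleDef> = {
  denoiser: { id: "denoiser", inputFormat: wavFormat, outputFormat: wavFormat },
  spectrogram: { id: "spectrogram", inputFormat: wavFormat, outputFormat: specFormat },
  classifier: { id: "classifier", inputFormat: specFormat, outputFormat: scoresFormat },
}

const demoPipeline: PipelineNode[] = [
  { moduleId: "denoiser", params: { strength: 0.3 }, enabled: true },
  { moduleId: "spectrogram", params: {}, enabled: true },
  { moduleId: "classifier", params: {}, enabled: true },
]
```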

9. Evidence

9.1 Purpose

Evidence is the structured result that AI and downstream logic reason over.

It should represent facts, measurements, ranked hypotheses, or anomalies produced by algorithms.

9.2 Suggested Shape

type Evidence = {
  id: string
  observationId: string
  experimentId: string
  producerModuleId?: string
  category: "signal" | "feature" | "frame" | "protocol" | "semantic" | "anomaly"
  values: Record<string, unknown>
  confidence: number
  scoreContribution?: number
  traceRefs: string[]
  createdAt: string
}

9.3 Audio Examples

Examples for audio-first execution:

  • dominant frequency band
  • onset count
  • top-3 sound-class predictions
  • clip similarity score
  • background noise estimate
  • uncertainty indicator

9.4 Notes

  • Evidence should be human-reviewable and machine-consumable.
  • Confidence should always be explicit.
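
As a concrete case, a top-3 sound-class prediction might be captured as one Evidence record. The sketch assumes confidence is a value in the range 0 to 1, which this document does not state; the layout of values, the class labels, the trace reference, and the helpers isValidConfidence and summarize are likewise hypothetical.

```typescript
// Type repeated from section 9.2 so the snippet stands alone.
type Evidence = {
  id: string
  observationId: string
  experimentId: string
  producerModuleId?: string
  category: "signal" | "feature" | "frame" | "protocol" | "semantic" | "anomaly"
  values: Record<string, unknown>
  confidence: number
  scoreContribution?: number
  traceRefs: string[]
  createdAt: string
}

const top3Evidence: Evidence = {
  id: "ev-0001",
  observationId: "obs-0001",
  experimentId: "exp-0001",
  producerModuleId: "sound-classifier",
  category: "semantic",
  // Hypothetical value layout: ranked hypotheses with per-class probabilities.
  values: {
    predictions: [
      { label: "dog_bark", probability: 0.71 },
      { label: "door_slam", probability: 0.18 },
      { label: "speech", probability: 0.06 },
    ],
  },
  confidence: 0.71,
  traceRefs: ["store://experiments/exp-0001/classify.json"],
  createdAt: "2026-04-04T10:00:06.000Z",
}

// Assumption: confidence is a probability-like value in [0, 1].
function isValidConfidence(evidence: Evidence): boolean {
  return isFinite(evidence.confidence) && evidence.confidence >= 0 && evidence.confidence <= 1
}

// One-line, human-reviewable summary derived from the structured record.
function summarize(evidence: Evidence): string {
  const producer = evidence.producerModuleId || "unknown producer"
  return `${evidence.category} evidence from ${producer} (confidence ${evidence.confidence.toFixed(2)})`
}
```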

10. ScoreCard

10.1 Purpose

ScoreCard summarizes how well an experiment performed.

10.2 Suggested Shape

type ScoreCard = {
  total: number
  components: Record<string, number>
  penalties?: Record<string, number>
  notes?: string[]
}

10.3 Example Components

  • signal_quality
  • segmentation_quality
  • classification_confidence
  • structural_consistency
  • resource_cost
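
This document does not say how total is derived. One simple possibility, shown here purely as an assumption, is the sum of components minus the sum of penalties; buildScoreCard, sumValues, and the clipping_detected penalty name are hypothetical.

```typescript
// Type repeated from section 10.2 so the snippet stands alone.
type ScoreCard = {
  total: number
  components: Record<string, number>
  penalties?: Record<string, number>
  notes?: string[]
}

function sumValues(values: Record<string, number> | undefined): number {
  if (!values) return 0
  let sum = 0
  for (const key of Object.keys(values)) sum += values[key]
  return sum
}

// Assumed aggregation rule: total = sum(components) - sum(penalties).
function buildScoreCard(
  components: Record<string, number>,
  penalties?: Record<string, number>,
  notes?: string[]
): ScoreCard {
  const card: ScoreCard = {
    total: sumValues(components) - sumValues(penalties),
    components,
  }
  if (penalties) card.penalties = penalties
  if (notes) card.notes = notes
  return card
}

const demoScore = buildScoreCard(
  { signal_quality: 0.5, classification_confidence: 0.25 },
  { clipping_detected: 0.25 },
  ["penalty name is illustrative"]
)
```

Keeping components and penalties alongside total means a reviewer can always see why an experiment scored as it did, whatever aggregation rule is eventually chosen.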

11. Experiment

11.1 Purpose

Experiment records one pipeline execution attempt over an observation.

This is the main unit of search, replay, validation, and AI orchestration.

11.2 Suggested Shape

type Experiment = {
  id: string
  observationId: string
  parentId?: string
  pipeline: PipelineNode[]
  status: "pending" | "running" | "done" | "failed" | "cancelled"
  outputs: ExperimentOutput[]
  evidenceIds: string[]
  score: ScoreCard
  failureReasons: string[]
  runtimeStats?: RuntimeStats
  createdAt: string
  startedAt?: string
  finishedAt?: string
}

11.3 ExperimentOutput

type ExperimentOutput = {
  stage: string
  format: DataFormat
  payloadRef: string
  metadata?: Record<string, unknown>
}

11.4 RuntimeStats

type RuntimeStats = {
  elapsedMs: number
  cpuMs?: number
  peakMemoryMb?: number
}
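
The status field implies a lifecycle. A sketch of one possible transition table follows; the allowed transitions are an assumption (the document lists the states but not the rules), and ExperimentState is a reduced view of Experiment holding only the fields the transition touches.

```typescript
type ExperimentStatus = "pending" | "running" | "done" | "failed" | "cancelled"

// Assumed lifecycle: pending -> running -> done | failed, with cancellation
// possible before completion. Terminal states allow no further transitions.
const allowedTransitions: Record<ExperimentStatus, ExperimentStatus[]> = {
  pending: ["running", "cancelled"],
  running: ["done", "failed", "cancelled"],
  done: [],
  failed: [],
  cancelled: [],
}

// Reduced view of Experiment (section 11.2).
type ExperimentState = {
  status: ExperimentStatus
  failureReasons: string[]
  startedAt?: string
  finishedAt?: string
}

// Returns a new state object; the input is left unmodified.
function transition(
  state: ExperimentState,
  next: ExperimentStatus,
  now: string,
  reason?: string
): ExperimentState {
  if (allowedTransitions[state.status].indexOf(next) === -1) {
    throw new Error(`illegal transition: ${state.status} -> ${next}`)
  }
  const updated: ExperimentState = {
    ...state,
    status: next,
    failureReasons: state.failureReasons.slice(),
  }
  if (next === "running") {
    updated.startedAt = now
  } else {
    updated.finishedAt = now
  }
  if (next === "failed" && reason) updated.failureReasons.push(reason)
  return updated
}

const pendingRun: ExperimentState = { status: "pending", failureReasons: [] }
const activeRun = transition(pendingRun, "running", "2026-04-04T10:00:06.000Z")
const failedRun = transition(activeRun, "failed", "2026-04-04T10:00:07.000Z", "decoder timeout")
```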

12. Relationship Between Entities

The first-order relationships are:

  • one InputSource produces many Observation records
  • one Observation can have many Experiment records
  • one Experiment contains many PipelineNode entries
  • one Experiment produces many Evidence records
  • one AlgorithmModule may be used by many PipelineNode entries

Simple view:

InputSource
  -> Observation
      -> Experiment
          -> PipelineNode
          -> Evidence
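
Since the relationships are expressed through id fields (sourceId, observationId, experimentId), the one-to-many views can be rebuilt with a generic grouping helper. The sketch below uses reduced record types and a hypothetical groupBy function; it is one way to navigate the model, not a storage design.

```typescript
// Reduced record types: only the id fields that carry the relationships.
type ExperimentRef = { id: string; observationId: string }
type EvidenceRef = { id: string; experimentId: string }

// Groups items by a string key, preserving input order within each group.
function groupBy<T>(items: T[], keyOf: (item: T) => string): Record<string, T[]> {
  const groups: Record<string, T[]> = {}
  for (const item of items) {
    const key = keyOf(item)
    if (!Object.prototype.hasOwnProperty.call(groups, key)) groups[key] = []
    groups[key].push(item)
  }
  return groups
}

const demoExperiments: ExperimentRef[] = [
  { id: "exp-1", observationId: "obs-1" },
  { id: "exp-2", observationId: "obs-1" },
]

const demoEvidence: EvidenceRef[] = [
  { id: "ev-1", experimentId: "exp-1" },
  { id: "ev-2", experimentId: "exp-1" },
  { id: "ev-3", experimentId: "exp-2" },
]

// one Observation -> many Experiment
const experimentsByObservation = groupBy(demoExperiments, (e) => e.observationId)
// one Experiment -> many Evidence
const evidenceByExperiment = groupBy(demoEvidence, (e) => e.experimentId)
```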

13. Versioning and Serialization

The core model should be serializable to JSON and representable in Protobuf later.

Recommended rule:

  • define the shapes first in a language-neutral way
  • initially persist as JSON
  • promote to Protobuf once the shapes stabilize

This avoids locking in a transport format too early while keeping a clean path to RPC later.
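
A versioned JSON envelope is one way to keep persisted entities both serializable and versioned. In the sketch below the envelope fields schemaVersion and entityType, the major-version compatibility rule, and the serialize and deserialize helpers are all assumptions introduced for illustration.

```typescript
// Hypothetical envelope wrapping any entity with version and type information.
type Envelope<T> = {
  schemaVersion: string
  entityType: string
  data: T
}

function serialize<T>(entityType: string, data: T, schemaVersion: string): string {
  const envelope: Envelope<T> = { schemaVersion, entityType, data }
  return JSON.stringify(envelope)
}

// Assumed rule: a reader accepts only payloads whose major version it supports.
function deserialize<T>(json: string, expectedType: string, supportedMajor: number): T {
  const parsed = JSON.parse(json) as Envelope<T>
  if (parsed.entityType !== expectedType) {
    throw new Error(`expected ${expectedType}, got ${parsed.entityType}`)
  }
  const major = parseInt(parsed.schemaVersion.split(".")[0], 10)
  if (major !== supportedMajor) {
    throw new Error(`unsupported schema version: ${parsed.schemaVersion}`)
  }
  return parsed.data
}

type DataFormat = {
  family: string
  encoding: string
  shape?: Record<string, unknown>
}

const wireText = serialize<DataFormat>(
  "DataFormat",
  { family: "audio", encoding: "wav_pcm16" },
  "0.1.0"
)
const roundTripped = deserialize<DataFormat>(wireText, "DataFormat", 0)
```

The cast inside deserialize performs no runtime validation of data; a real reader would check the payload against a schema before trusting it.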

14. First Audio-First Specialization

The first implementation should specialize the generic model only where necessary.

Recommended first concrete instances:

  • InputSource.kind = "audio"
  • Observation.modality = "audio"
  • DataFormat.family = "audio"
  • AlgorithmModule.stage values such as:
    • preprocess
    • segment
    • feature_extract
    • detect
    • classify
    • explain

This keeps the shared model intact while letting audio move quickly.
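
Because AlgorithmModule.stage is a plain string in the shared model, an audio product could narrow it locally without changing the shared type. The following sketch does that with a literal union and a type guard; AUDIO_STAGES, AudioStage, and isAudioStage are hypothetical names, and the list simply mirrors the stage values given above.

```typescript
// Stage values taken from the list above; the shared model still stores a string.
const AUDIO_STAGES = [
  "preprocess",
  "segment",
  "feature_extract",
  "detect",
  "classify",
  "explain",
] as const

type AudioStage = typeof AUDIO_STAGES[number]

// Type guard: narrows an arbitrary stage string to the audio-specific union.
function isAudioStage(value: string): value is AudioStage {
  return (AUDIO_STAGES as readonly string[]).indexOf(value) !== -1
}

// Example: audio-only code can require the narrowed type.
function describeStage(stage: AudioStage): string {
  return `audio stage: ${stage}`
}

const candidateStage: string = "feature_extract"
const stageDescription = isAudioStage(candidateStage)
  ? describeStage(candidateStage)
  : "not an audio stage"
```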

15. Anti-Patterns to Avoid

Do not:

  • embed large binary payloads directly into entity objects
  • make the first model deeply modality-specific
  • hide confidence or uncertainty
  • mix registry definitions with runtime instances
  • emit AI outputs only as unstructured free text

These would make later scaling much harder.

16. Recommended Immediate Next Step

After this model is accepted, the next implementation-facing work should be:

  1. define JSON schemas or Protobuf drafts for these entities
  2. define the module registry API
  3. define experiment-runner input/output contracts
  4. define an initial audio-specific module catalog

17. Summary

The first implementation should standardize the system around five entities:

  • InputSource
  • Observation
  • AlgorithmModule
  • Evidence
  • Experiment

These should remain the stable backbone of the platform while modalities, algorithms, and products evolve on top of them.