Concept · metadata

Meaning behind the numbers

Define what your data is — types, units, ranges, operational limits — so every dashboard, query engine, and storage schema speaks the same language.

A pipeline computes numbers — smoothed values, trends, flags. But what do those numbers mean? Is 92.5 a temperature in Celsius or a pressure in bar? Is 0.42 vibration in mm/s or displacement in microns? Without metadata, a dashboard shows bare numbers and a query engine returns anonymous columns.

Semantics answer this. They define what your data is — types, units, physical ranges, operational limits — so every system downstream speaks the same language.

The Single Source of Truth

Semantics are a shared metadata layer consumed by multiple parts of the ecosystem: dashboards, the query engine, and the storage schema.

Define a column once — its type, unit, and limits — and every consumer inherits the same definition. Change the unit from bar to kPa in one JSON file and the dashboard, the query engine, and the storage schema all update together.
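As a rough illustration of the idea, the sketch below feeds one column definition to two independent consumers. The object shape and the helper names (`dashboardLabel`, `storageColumn`) are assumptions made for this example; they are not winkComposer's actual JSON schema or API.

```javascript
// Hypothetical column definition; field names are illustrative only.
const outletPressure = {
    name: 'outletPressure',
    type: 'float64',
    unit: 'bar',
    description: 'Pump outlet pressure'
};

// Consumer 1: a dashboard builds its axis label from the definition.
function dashboardLabel( column ) {
    return `${column.description} (${column.unit})`;
}

// Consumer 2: a storage layer derives its column spec from the same definition.
function storageColumn( column ) {
    return { name: column.name, type: column.type, unit: column.unit };
}

console.log( dashboardLabel( outletPressure ) );
console.log( storageColumn( outletPressure ) );

// Changing the unit in the single definition changes what both consumers report.
const inKilopascals = { ...outletPressure, unit: 'kPa' };
console.log( dashboardLabel( inKilopascals ) );
console.log( storageColumn( inKilopascals ) );
```

Neither consumer hard-codes a unit or a type, so there is nothing to keep in sync by hand.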

Facts vs. Decisions

This is the core design principle: semantics describe facts about data, not decisions about what to do with it.

Semantics (facts)	Flow language (decisions)
Outlet pressure is measured in bar	Alert when pressure exceeds 110 bar
Temperature range is -40 to 150 °C	Switch to high-temp mode above 85 °C
Vibration unit is mm/s RMS	Flag bearing as degraded when trend slope > 0.02
Pump status codes: 0=Idle, 1=Running	Only process messages where status is Running

A pump’s outlet pressure has a specification limit of 90 bar (semantic fact). The flow uses a threshold of 110 bar to detect high-pressure events (policy decision). These are independent — the limit describes the equipment, the threshold drives the logic.
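The separation can be sketched in plain JavaScript. The 90 bar limit and the 110 bar threshold come from the paragraph above; the object shape and the `describeReading` helper are hypothetical and exist only to show the two values living in different places.

```javascript
// Semantic fact: describes the equipment (hypothetical field names).
const outletPressureSemantics = { unit: 'bar', specificationLimit: 90 };

// Policy decision: belongs to the flow logic, not to the semantics.
const HIGH_PRESSURE_THRESHOLD = 110;

// Evaluate one reading against the fact and the decision independently.
function describeReading( value ) {
    return {
        value,
        unit: outletPressureSemantics.unit,
        outsideSpecification: value > outletPressureSemantics.specificationLimit,
        highPressureEvent: value > HIGH_PRESSURE_THRESHOLD
    };
}

console.log( describeReading( 100 ) );
console.log( describeReading( 115 ) );
```

A reading of 100 bar is outside the specification limit but does not trigger the high-pressure policy, which is the independence described above: either number can change without touching the other.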

What Semantics Define

Columns

Each column carries its full definition:

Type — float64, int64, string, bool, timestamp
Unit — engineering units (bar, °C, mm/s)
Description — human-readable explanation
Resolution — measurement precision
Three-tier limits — nested validation boundaries:

Physical range catches impossible values (sensor faults). Operational limits define safe bounds with warning and critical thresholds. Specification limits track process capability for quality control.
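One way to picture the nesting is a classifier that checks the widest boundary first. This is a minimal sketch under stated assumptions: the physical range of -40 to 150 °C is the one quoted in the Facts vs. Decisions table, while the operational and specification values, the field names, and the tier ordering (specification inside warning inside critical inside physical) are illustrative guesses, not winkComposer's schema.

```javascript
// Hypothetical column definition with three tiers of limits.
const temperature = {
    unit: '°C',
    physicalRange: { min: -40, max: 150 },
    operationalLimits: {
        warning: { min: 0, max: 85 },      // illustrative values
        critical: { min: -20, max: 120 }   // illustrative values
    },
    specificationLimits: { min: 20, max: 70 }  // illustrative values
};

// True when the value falls outside the closed interval [min, max].
function outside( value, range ) {
    return value < range.min || value > range.max;
}

// Check from the outermost boundary inward and report the first one violated.
function classify( value, column ) {
    if ( outside( value, column.physicalRange ) ) return 'impossible';
    if ( outside( value, column.operationalLimits.critical ) ) return 'critical';
    if ( outside( value, column.operationalLimits.warning ) ) return 'warning';
    if ( outside( value, column.specificationLimits ) ) return 'out-of-spec';
    return 'ok';
}

for ( const reading of [ 200, 130, 100, 80, 50 ] ) {
    console.log( reading, classify( reading, temperature ) );
}
```

Checking the physical range first means a faulty sensor is reported as a sensor fault rather than as a critical process condition.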

Asset Classes

An asset class groups columns into an equipment type — like industrialPump or bearingTestRig. It defines which measurements belong together and how they map to storage:

Columns — all measurements for this equipment type
Insight types — column projections for storage tables (e.g., operational for continuous data, faults for event data)
Enumerations — coded-value lookups (e.g., 0 → Idle, 1 → Running)
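The three parts of an asset class can be sketched as a plain object. The shape below, the column names, and the `project` and `decode` helpers are assumptions made for illustration; only the names `industrialPump`, `operational`, `faults`, and the 0 → Idle, 1 → Running mapping come from the text.

```javascript
// Hypothetical asset class layout; not winkComposer's actual format.
const industrialPump = {
    columns: {
        outletPressure: { type: 'float64', unit: 'bar' },
        vibration: { type: 'float64', unit: 'mm/s' },
        status: { type: 'int64', enumeration: 'pumpStatus' }
    },
    // Each insight type is a projection: the subset of columns one table stores.
    insightTypes: {
        operational: [ 'outletPressure', 'vibration', 'status' ],
        faults: [ 'status' ]
    },
    enumerations: {
        pumpStatus: { 0: 'Idle', 1: 'Running' }
    }
};

// Keep only the columns that belong to the given insight type.
function project( assetClass, insightType, record ) {
    const projected = {};
    for ( const name of assetClass.insightTypes[ insightType ] ) {
        projected[ name ] = record[ name ];
    }
    return projected;
}

// Translate a coded value into its label via the column's enumeration.
function decode( assetClass, columnName, code ) {
    const enumName = assetClass.columns[ columnName ].enumeration;
    return assetClass.enumerations[ enumName ][ code ];
}

const record = { outletPressure: 92.5, vibration: 0.42, status: 1 };
console.log( project( industrialPump, 'faults', record ) );
console.log( decode( industrialPump, 'status', record.status ) );
```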

Semantic Digest

A SHA-256 hash of the loaded semantics serves as a version fingerprint. Storage records carry this digest, so you can always verify which schema version produced a given dataset — essential for reproducibility and auditing.
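The source does not say how winkComposer serializes the semantics before hashing, so the following is only a sketch of the general technique using Node.js's built-in `crypto` module. The `canonicalize` and `semanticDigest` helpers are hypothetical; sorting keys before hashing is one common way to make the fingerprint independent of key order.

```javascript
const crypto = require( 'crypto' );

// Serialize JSON-compatible data with object keys sorted, so that two
// definitions differing only in key order produce the same string.
function canonicalize( value ) {
    if ( Array.isArray( value ) ) {
        return `[${value.map( canonicalize ).join( ',' )}]`;
    }
    if ( value !== null && typeof value === 'object' ) {
        const body = Object.keys( value ).sort()
            .map( ( key ) => `${JSON.stringify( key )}:${canonicalize( value[ key ] )}` )
            .join( ',' );
        return `{${body}}`;
    }
    return JSON.stringify( value );
}

// SHA-256 over the canonical form, returned as a 64-character hex string.
function semanticDigest( semantics ) {
    return crypto.createHash( 'sha256' ).update( canonicalize( semantics ) ).digest( 'hex' );
}

const v1 = { outletPressure: { type: 'float64', unit: 'bar' } };
const v1Reordered = { outletPressure: { unit: 'bar', type: 'float64' } };
const v2 = { outletPressure: { type: 'float64', unit: 'kPa' } };

console.log( semanticDigest( v1 ) === semanticDigest( v1Reordered ) ); // true
console.log( semanticDigest( v1 ) === semanticDigest( v2 ) );          // false
```

Any change to a type, unit, or limit yields a different digest, which is what lets a stored record be matched to the exact definitions that produced it.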

How Semantics Connect to Flows

In the flow language, semantics wire in through configuration methods:


flow( 'pump-monitor' )
    .assetClass( pumpSemantics )      // load the asset class
    .storage( questdb, config )       // columns, types, and units flow to storage
    .persistIf( 'operational', ... )  // "operational" is an insight type from the asset class
    .run()

The persistIf node references an insight type defined in the asset class. winkComposer validates at load time that the insight type exists and that the columns it references are defined — configuration errors surface immediately, not after hours of data collection.
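The kind of check described here might look like the sketch below. It is not winkComposer's implementation: the `validateInsightType` function, the asset class shape, and the `faultCode` column are all hypothetical, chosen to show both failure modes (an unknown insight type and a projection that names an undefined column).

```javascript
// Hypothetical asset class; the faults projection deliberately names a
// column (faultCode) that is not defined, to trigger the second check.
const pumpSemantics = {
    columns: {
        outletPressure: { type: 'float64', unit: 'bar' },
        status: { type: 'int64' }
    },
    insightTypes: {
        operational: [ 'outletPressure', 'status' ],
        faults: [ 'status', 'faultCode' ]
    }
};

const has = ( object, key ) => Object.prototype.hasOwnProperty.call( object, key );

// Throw before any data flows if the insight type or its columns are missing.
function validateInsightType( assetClass, insightType ) {
    if ( !has( assetClass.insightTypes, insightType ) ) {
        throw new Error( `Unknown insight type: ${insightType}` );
    }
    const missing = assetClass.insightTypes[ insightType ]
        .filter( ( name ) => !has( assetClass.columns, name ) );
    if ( missing.length > 0 ) {
        throw new Error( `Insight type ${insightType} references undefined columns: ${missing.join( ', ' )}` );
    }
}

for ( const insightType of [ 'operational', 'maintenance', 'faults' ] ) {
    try {
        validateInsightType( pumpSemantics, insightType );
        console.log( `${insightType}: ok` );
    } catch ( error ) {
        console.log( `${insightType}: ${error.message}` );
    }
}
```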

Next Steps

Use Cases — see complete pipelines with semantics at production scale
Node Index — find any building block by name or category