metadata
Content-addressed metadata with SHA-256 deduplication.
Stores arbitrary JSON-compatible data with a type classification (MetadataType). A deterministic content hash is computed from the canonical JSON representation of the data, enabling content-addressed deduplication in PostgreSQL.
The Metadata class is agnostic about
the internal structure of data; higher-level models in
bigbrotr.nips.nip11 and bigbrotr.nips.nip66 define their own
conventions for what goes inside it.
See Also
bigbrotr.models.relay_metadata: Junction model linking a Relay to a Metadata record.
bigbrotr.nips.nip11: Produces nip11_info-typed metadata from
relay information documents.
bigbrotr.nips.nip66: Produces nip66_*-typed metadata from
health check results (RTT, SSL, DNS, Geo, Net, HTTP).
Classes
MetadataType
Bases: StrEnum
Metadata type identifiers stored in the metadata.type column.
Each value corresponds to a specific data source or monitoring test performed by the Monitor service.
Attributes:
- NIP11_INFO – NIP-11 relay information document fetched via HTTP(S).
- NIP66_RTT – NIP-66 round-trip time measurements (WebSocket latency).
- NIP66_SSL – NIP-66 SSL/TLS certificate information (expiry, issuer, chain).
- NIP66_GEO – NIP-66 geolocation data (country, city, coordinates).
- NIP66_NET – NIP-66 network and ASN information (provider, AS number).
- NIP66_DNS – NIP-66 DNS resolution data (A/AAAA records, response times).
- NIP66_HTTP – NIP-66 HTTP header information (server, content-type, CORS).
See Also
Metadata: The content-addressed container that carries a MetadataType alongside its data.
RelayMetadata: Junction model linking a relay to a metadata record.
MetadataDbParams
Bases: NamedTuple
Positional parameters for the metadata database insert procedure.
Produced by Metadata.to_db_params()
and consumed by the metadata_insert stored procedure in PostgreSQL.
Attributes:
- id (bytes) – SHA-256 content hash (32 bytes), part of composite PK (id, type).
- type (MetadataType) – MetadataType discriminator, part of composite PK (id, type).
- data (str) – Canonical JSON string for PostgreSQL JSONB storage.
See Also
Metadata: The model that produces these parameters.
Metadata.from_db_params(): Reconstructs a Metadata instance from these parameters with integrity verification.
Metadata
dataclass
Metadata(
type: MetadataType, data: Mapping[str, Any] = dict()
)
Immutable metadata with deterministic content hashing.
On construction, the data dict is sanitized (null values and
empty containers removed, keys sorted) and a canonical JSON string
is produced. The SHA-256 hash of that string serves as a
content-addressed identifier for deduplication.
The hash is derived from data only. The type is not included in
the hash computation, but it is part of the composite primary key
(id, type) in the database.
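The construction steps described above (sanitize, serialize canonically, hash) can be sketched roughly as follows. This is a minimal illustration, not the library's code: the helper names `sanitize`, `canonical_json`, and `content_hash` are hypothetical, and the exact sanitization rules (for example, how empty strings are treated) are assumptions.

```python
import hashlib
import json
from typing import Any


def sanitize(value: Any) -> Any:
    """Recursively drop None values and empty containers (assumed rules)."""
    if isinstance(value, dict):
        cleaned = {}
        for key in sorted(value):
            item = sanitize(value[key])
            # Skip nulls and containers that ended up empty after cleaning.
            if item is None or item == {} or item == []:
                continue
            cleaned[key] = item
        return cleaned
    if isinstance(value, list):
        items = [sanitize(v) for v in value]
        return [v for v in items if v is not None and v != {} and v != []]
    return value


def canonical_json(data: dict) -> str:
    """Serialize with sorted keys and compact separators."""
    return json.dumps(
        sanitize(data), sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )


def content_hash(data: dict) -> bytes:
    """SHA-256 digest of the UTF-8 encoded canonical JSON."""
    return hashlib.sha256(canonical_json(data).encode("utf-8")).digest()
```

Because keys are sorted before serialization, insertion order in the input dict does not affect the resulting digest.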
Attributes:
- type (MetadataType) – The metadata classification (see MetadataType).
- data (Mapping[str, Any]) – Sanitized JSON-compatible dictionary.
Examples:
from bigbrotr.models.metadata import Metadata, MetadataType
meta = Metadata(type=MetadataType.NIP11_INFO, data={"name": "My Relay"})
meta.content_hash # 32-byte SHA-256 digest
meta.canonical_json # '{"name":"My Relay"}'
meta.to_db_params() # MetadataDbParams(...)
Identical data always produces the same hash (content-addressed):
m1 = Metadata(type=MetadataType.NIP11_INFO, data={"b": 2, "a": 1})
m2 = Metadata(type=MetadataType.NIP11_INFO, data={"a": 1, "b": 2})
m1.content_hash == m2.content_hash # True
Note
The content hash is derived from data alone. The type is stored
alongside the hash on the metadata table with composite primary key
(id, type), ensuring each document is tied to exactly one type.
The relay_metadata junction table references metadata via a
compound foreign key on (metadata_id, metadata_type).
Computed fields (_canonical_json, _content_hash, _db_params)
are set via object.__setattr__ in __post_init__ because the
dataclass is frozen.
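The `object.__setattr__` pattern mentioned in this note can be illustrated with a small, self-contained sketch. `FrozenRecord` is a hypothetical stand-in, not the real Metadata class, and it skips sanitization entirely; it only shows how a frozen dataclass can cache computed fields in `__post_init__`.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass(frozen=True)
class FrozenRecord:
    """Hypothetical stand-in showing cached computed fields on a frozen dataclass."""

    data: Mapping[str, Any] = field(default_factory=dict)
    # Computed fields: excluded from __init__, repr, and comparisons.
    _canonical_json: str = field(init=False, repr=False, compare=False, default="")
    _content_hash: bytes = field(init=False, repr=False, compare=False, default=b"")

    def __post_init__(self) -> None:
        text = json.dumps(dict(self.data), sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Plain attribute assignment would raise FrozenInstanceError here,
        # so the values are written through object.__setattr__ instead.
        object.__setattr__(self, "_canonical_json", text)
        object.__setattr__(self, "_content_hash", digest)

    @property
    def content_hash(self) -> bytes:
        return self._content_hash
```

After construction the instance still rejects ordinary assignment, so the cached values cannot drift from the data they were derived from.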
Warning
String data containing null bytes (\x00) will raise ValueError
during sanitization. PostgreSQL TEXT and JSONB columns do not support null
bytes.
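A check of the kind this warning describes might look like the sketch below. The function name `reject_null_bytes` is hypothetical, and whether the real sanitizer also inspects dictionary keys is an assumption.

```python
from typing import Any


def reject_null_bytes(value: Any) -> None:
    """Raise ValueError if any string in a nested structure contains \\x00."""
    if isinstance(value, str):
        if "\x00" in value:
            raise ValueError("string contains a null byte, which PostgreSQL cannot store")
    elif isinstance(value, dict):
        for key, item in value.items():
            reject_null_bytes(key)  # assumption: keys are checked as well
            reject_null_bytes(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            reject_null_bytes(item)
```

Rejecting such input up front keeps the failure in Python, where it is easy to trace, rather than surfacing later as a database error.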
See Also
MetadataType: Enum of supported metadata classifications.
MetadataDbParams: Database parameter container produced by to_db_params().
RelayMetadata: Junction linking a Relay to this metadata record.
Attributes
content_hash
property
SHA-256 digest of the canonical JSON representation.
Computed once at construction time. Identical semantic data always
produces the same 32-byte hash, enabling content-addressed
deduplication in the metadata table.
Returns:
- bytes – 32-byte SHA-256 digest suitable for PostgreSQL BYTEA columns.
See Also
canonical_json: The JSON string from which this hash is derived.
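To illustrate how a content hash enables deduplication on the composite key (id, type), here is an in-memory sketch that uses a dict in place of the metadata table. The function `insert_once` and the type strings `"nip11_info"` and `"nip66_rtt"` are assumptions made for the example; the real logic lives in the PostgreSQL stored procedure.

```python
import hashlib
import json
from typing import Any, Dict, Tuple

# Stand-in for the metadata table, keyed by the composite key (id, type).
Store = Dict[Tuple[bytes, str], str]


def insert_once(store: Store, type_: str, data: Dict[str, Any]) -> bool:
    """Insert data unless an identical (hash, type) row exists; return True if inserted."""
    text = json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    key = (hashlib.sha256(text.encode("utf-8")).digest(), type_)
    if key in store:
        return False  # same content and same type: deduplicated
    store[key] = text
    return True
```

The same data stored under two different types yields two rows, since the hash covers the data alone and the type is only part of the key.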
canonical_json
property
Canonical JSON string used for hashing and JSONB storage.
Format: sorted keys, compact separators, UTF-8 encoding.
Returns:
- str – Deterministic JSON string of the sanitized value.
Functions
to_db_params
to_db_params() -> MetadataDbParams
Return cached positional parameters for the database insert procedure.
The result is computed once during construction and cached for the lifetime of the (frozen) instance.
Returns:
- MetadataDbParams – MetadataDbParams with the content hash as id, the canonical JSON as data, and the metadata type.
Source code in src/bigbrotr/models/metadata.py
from_db_params
classmethod
from_db_params(params: MetadataDbParams) -> Metadata
Reconstruct a Metadata instance from database parameters.
Re-parses the stored JSON and verifies that the recomputed hash
matches the stored id to detect data corruption.
Parameters:
- params (MetadataDbParams) – Database row values previously produced by to_db_params().
Returns:
- Metadata – The reconstructed Metadata instance.
Raises:
- ValueError – If the recomputed hash does not match params.id, indicating data corruption in the database.
Note
Unlike Relay.from_db_params(),
this method performs an explicit integrity check by comparing the
recomputed SHA-256 hash against the stored id. This catches
silent data corruption that could otherwise propagate through the
system.
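The integrity check described here can be sketched as follows. `Row` is a hypothetical stand-in for MetadataDbParams (with the type kept as a plain string), and `load_verified` only approximates what from_db_params() is documented to do; sanitization is omitted for brevity.

```python
import hashlib
import json
from typing import Any, Dict, NamedTuple


class Row(NamedTuple):
    """Hypothetical stand-in for MetadataDbParams."""

    id: bytes
    type: str
    data: str


def canonical(data: Dict[str, Any]) -> str:
    """Canonical JSON: sorted keys, compact separators."""
    return json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def load_verified(row: Row) -> Dict[str, Any]:
    """Re-parse the stored JSON and verify it still hashes to the stored id."""
    parsed = json.loads(row.data)
    recomputed = hashlib.sha256(canonical(parsed).encode("utf-8")).digest()
    if recomputed != row.id:
        raise ValueError("content hash mismatch: stored data may be corrupted")
    return parsed
```

Re-serializing the parsed value before hashing means the check does not depend on how the database formats the stored JSON text.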