metadata

Content-addressed metadata with SHA-256 deduplication.

Stores arbitrary JSON-compatible data with a type classification (MetadataType). A deterministic content hash is computed from the canonical JSON representation of the data, enabling content-addressed deduplication in PostgreSQL.

The Metadata class is agnostic about the internal structure of data; higher-level models in bigbrotr.nips.nip11 and bigbrotr.nips.nip66 define their own conventions for what goes inside it.

See Also

  • bigbrotr.models.relay_metadata: Junction model linking a Relay to a Metadata record.
  • bigbrotr.nips.nip11: Produces nip11_info-typed metadata from relay information documents.
  • bigbrotr.nips.nip66: Produces nip66_*-typed metadata from health check results (RTT, SSL, DNS, Geo, Net, HTTP).

Classes

MetadataType

Bases: StrEnum

Metadata type identifiers stored in the metadata.type column.

Each value corresponds to a specific data source or monitoring test performed by the Monitor service.

Attributes:

  • NIP11_INFO

    NIP-11 relay information document fetched via HTTP(S).

  • NIP66_RTT

    NIP-66 round-trip time measurements (WebSocket latency).

  • NIP66_SSL

    NIP-66 SSL/TLS certificate information (expiry, issuer, chain).

  • NIP66_GEO

    NIP-66 geolocation data (country, city, coordinates).

  • NIP66_NET

    NIP-66 network and ASN information (provider, AS number).

  • NIP66_DNS

    NIP-66 DNS resolution data (A/AAAA records, response times).

  • NIP66_HTTP

    NIP-66 HTTP header information (server, content-type, CORS).
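A minimal sketch of how such an enum might be declared. The string values are assumptions inferred from the nip11_info and nip66_* type names mentioned on this page, not copied from the source. The fallback class exists only so the snippet also runs on Python versions older than 3.11.

```python
try:
    from enum import StrEnum  # available from Python 3.11
except ImportError:
    from enum import Enum

    class StrEnum(str, Enum):
        """Minimal stand-in for older interpreters."""


class MetadataType(StrEnum):
    # Values are assumed; the real strings are whatever metadata.type stores.
    NIP11_INFO = "nip11_info"
    NIP66_RTT = "nip66_rtt"
    NIP66_SSL = "nip66_ssl"
    NIP66_GEO = "nip66_geo"
    NIP66_NET = "nip66_net"
    NIP66_DNS = "nip66_dns"
    NIP66_HTTP = "nip66_http"


# Members compare equal to plain strings, so a value read back from the
# database can be turned into a member again by calling the enum.
restored = MetadataType("nip66_rtt")
```

Because the members are strings, they can be passed to a database driver without an explicit conversion step.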

See Also

  • Metadata: The content-addressed container that carries a MetadataType alongside its data.
  • RelayMetadata: Junction model linking a relay to a metadata record.

MetadataDbParams

Bases: NamedTuple

Positional parameters for the metadata database insert procedure.

Produced by Metadata.to_db_params() and consumed by the metadata_insert stored procedure in PostgreSQL.

Attributes:

  • id (bytes) –

    SHA-256 content hash (32 bytes), part of composite PK (id, type).

  • type (MetadataType) –

    MetadataType discriminator, part of composite PK (id, type).

  • data (str) –

    Canonical JSON string for PostgreSQL JSONB storage.
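The three fields above can be sketched as a plain NamedTuple. This is an illustration under stated assumptions: the type field is a plain str here rather than a MetadataType member, the "nip11_info" value is assumed, and the SQL in the comment is a guess at how metadata_insert might be called.

```python
import hashlib
from typing import NamedTuple


class MetadataDbParams(NamedTuple):
    id: bytes   # SHA-256 digest, 32 bytes
    type: str   # a MetadataType member in the real model; plain str here
    data: str   # canonical JSON text


data = '{"name":"My Relay"}'
params = MetadataDbParams(
    id=hashlib.sha256(data.encode("utf-8")).digest(),
    type="nip11_info",  # assumed string value
    data=data,
)

# A NamedTuple unpacks positionally, which is the shape a driver expects
# for a call such as `SELECT metadata_insert($1, $2, $3)` (the exact SQL
# is an assumption).
row_id, row_type, row_data = params
```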

See Also

  • Metadata: The model that produces these parameters.
  • Metadata.from_db_params(): Reconstructs a Metadata instance from these parameters with integrity verification.

Metadata dataclass

Metadata(
    type: MetadataType, data: Mapping[str, Any] = dict()
)

Immutable metadata with deterministic content hashing.

On construction, the data dict is sanitized (null values and empty containers removed, keys sorted) and a canonical JSON string is produced. The SHA-256 hash of that string serves as a content-addressed identifier for deduplication.

The hash is derived from data only. The type is not included in the hash computation, but it is part of the composite primary key (id, type) in the database.
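To illustrate the split between hash and key, the sketch below hashes one payload and pairs it with two type strings. The content_hash helper and the type strings are hypothetical stand-ins, and the json.dumps options are an assumption about how the canonical form is produced.

```python
import hashlib
import json


def content_hash(data: dict) -> bytes:
    """Hash the canonical JSON of `data`; the type is deliberately absent."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()


payload = {"name": "My Relay"}

# Same payload under two (assumed) type strings: same id, different keys.
key_a = (content_hash(payload), "nip11_info")
key_b = (content_hash(payload), "nip66_http")

same_id = key_a[0] == key_b[0]
distinct_rows = key_a != key_b
```

Here same_id and distinct_rows are both true: the digest is shared, while the (id, type) pairs differ.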

Attributes:

  • type (MetadataType) –

    The metadata classification (see MetadataType).

  • data (Mapping[str, Any]) –

    Sanitized JSON-compatible dictionary.
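A rough sketch of what the sanitization step could look like. The sanitize function is hypothetical. How the real implementation treats None inside lists, empty strings, or non-string keys is not stated on this page, so those choices are assumptions.

```python
from typing import Any


def sanitize(value: Any) -> Any:
    """Recursively drop None values and empty containers; sort dict keys."""
    if isinstance(value, dict):
        cleaned = {}
        for key in sorted(value):  # assumes keys are mutually comparable (str)
            item = sanitize(value[key])
            if item is None or item == {} or item == []:
                continue
            cleaned[key] = item
        return cleaned
    if isinstance(value, list):
        items = [sanitize(item) for item in value]
        # Assumption: None and empty containers are also dropped from lists.
        return [i for i in items if i is not None and i != {} and i != []]
    return value


result = sanitize({"name": "r", "b": None, "a": {"x": []}, "c": [1, None]})
```

In this sketch, "a" disappears because its only child is an empty list, which leaves an empty dict that is then dropped in turn.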

Examples:

meta = Metadata(type=MetadataType.NIP11_INFO, data={"name": "My Relay"})
meta.content_hash    # 32-byte SHA-256 digest
meta.canonical_json  # '{"name":"My Relay"}'
meta.to_db_params()  # MetadataDbParams(...)

Identical data always produces the same hash (content-addressed):

m1 = Metadata(type=MetadataType.NIP11_INFO, data={"b": 2, "a": 1})
m2 = Metadata(type=MetadataType.NIP11_INFO, data={"a": 1, "b": 2})
m1.content_hash == m2.content_hash  # True

Note

The content hash is derived from data alone. The type is stored alongside the hash on the metadata table with composite primary key (id, type), so each row ties one document to exactly one type, and identical data stored under two different types yields two distinct rows. The relay_metadata junction table references metadata via a compound foreign key on (metadata_id, metadata_type).

Computed fields (_canonical_json, _content_hash, _db_params) are set via object.__setattr__ in __post_init__ because the dataclass is frozen.

Warning

String data containing null bytes (\x00) will raise ValueError during sanitization. PostgreSQL TEXT and JSONB columns do not support null bytes.
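A check of this kind might look like the following sketch. The reject_null_bytes helper and its error message are hypothetical, and whether the real sanitizer also inspects dictionary keys is an assumption.

```python
def reject_null_bytes(value):
    """Raise ValueError if any nested string contains a NUL byte."""
    if isinstance(value, str):
        if "\x00" in value:
            raise ValueError("null bytes are not allowed in metadata strings")
    elif isinstance(value, dict):
        for key, item in value.items():
            reject_null_bytes(key)   # assumption: keys are checked too
            reject_null_bytes(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            reject_null_bytes(item)


try:
    reject_null_bytes({"name": "bad\x00relay"})
    rejected = False
except ValueError:
    rejected = True
```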

See Also

  • MetadataType: Enum of supported metadata classifications.
  • MetadataDbParams: Database parameter container produced by to_db_params().
  • RelayMetadata: Junction linking a Relay to this metadata record.

Attributes
content_hash property
content_hash: bytes

SHA-256 digest of the canonical JSON representation.

Computed once at construction time. Semantically identical data always produces the same 32-byte hash, enabling content-addressed deduplication in the metadata table.

Returns:

  • bytes

    32-byte SHA-256 digest suitable for PostgreSQL BYTEA columns.

See Also

canonical_json: The JSON string from which this hash is derived.

canonical_json property
canonical_json: str

Canonical JSON string used for hashing and JSONB storage.

Format: sorted keys, compact separators, UTF-8 encoding.

Returns:

  • str

    Deterministic JSON string of the sanitized value.
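The format described above maps naturally onto json.dumps options. This is a sketch, not the library's code: the canonical_json function here is a hypothetical free function, and ensure_ascii=False is an assumption based on the "UTF-8 encoding" wording.

```python
import json


def canonical_json(data) -> str:
    return json.dumps(
        data,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no whitespace after commas or colons
        ensure_ascii=False,      # assumption: keep non-ASCII characters as-is
    )


text = canonical_json({"b": 2, "a": {"z": 1, "y": "caf\u00e9"}})
```

sort_keys applies to nested objects as well, so text starts with the "a" object and lists its "y" key before "z".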

Functions
to_db_params
to_db_params() -> MetadataDbParams

Return cached positional parameters for the database insert procedure.

The result is computed once during construction and cached for the lifetime of the (frozen) instance.

Returns:

  • MetadataDbParams

    The content hash as id, the canonical JSON as data, and the metadata type.

Source code in src/bigbrotr/models/metadata.py
def to_db_params(self) -> MetadataDbParams:
    """Return cached positional parameters for the database insert procedure.

    The result is computed once during construction and cached for the
    lifetime of the (frozen) instance.

    Returns:
        [MetadataDbParams][bigbrotr.models.metadata.MetadataDbParams] with
        the content hash as ``id``, the canonical JSON as ``data``,
        and the metadata type.
    """
    return self._db_params

from_db_params classmethod
from_db_params(params: MetadataDbParams) -> Metadata

Reconstruct a Metadata instance from database parameters.

Re-parses the stored JSON and verifies that the recomputed hash matches the stored id to detect data corruption.

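Parameters:

  • params (MetadataDbParams) –

    Database row values previously produced by to_db_params().

Returns:

  • Metadata

    A new Metadata instance.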

Raises:

  • ValueError

    If the recomputed hash does not match params.id, indicating data corruption in the database.

Note

Unlike Relay.from_db_params(), this method performs an explicit integrity check by comparing the recomputed SHA-256 hash against the stored id. This catches silent data corruption that could otherwise propagate through the system.
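The integrity check can be demonstrated in isolation with a small stand-alone sketch. The canonical and verify functions are hypothetical helpers that mimic the re-parse and re-hash steps; they are not part of the bigbrotr API, and the canonical form is assumed to be sorted keys with compact separators.

```python
import hashlib
import json


def canonical(data) -> str:
    return json.dumps(data, sort_keys=True, separators=(",", ":"))


def verify(stored_id: bytes, stored_json: str) -> dict:
    """Re-parse, re-canonicalize, re-hash, and compare against the stored id."""
    data = json.loads(stored_json)
    computed = hashlib.sha256(canonical(data).encode("utf-8")).digest()
    if computed != stored_id:
        raise ValueError(
            f"Hash mismatch: computed {computed.hex()}, expected {stored_id.hex()}"
        )
    return data


good_json = canonical({"name": "My Relay"})
good_id = hashlib.sha256(good_json.encode("utf-8")).digest()

ok = verify(good_id, good_json)

# Simulate a corrupted row: the JSON changed but the id did not.
try:
    verify(good_id, '{"name":"Other Relay"}')
    detected = False
except ValueError:
    detected = True
```

Because the data is re-canonicalized before hashing, whitespace or key-order differences in the stored text do not trigger a false mismatch in this sketch.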

Source code in src/bigbrotr/models/metadata.py
@classmethod
def from_db_params(cls, params: MetadataDbParams) -> Metadata:
    """Reconstruct a ``Metadata`` instance from database parameters.

    Re-parses the stored JSON and verifies that the recomputed hash
    matches the stored ``id`` to detect data corruption.

    Args:
        params: Database row values previously produced by
            [to_db_params()][bigbrotr.models.metadata.Metadata.to_db_params].

    Returns:
        A new [Metadata][bigbrotr.models.metadata.Metadata] instance.

    Raises:
        ValueError: If the recomputed hash does not match ``params.id``,
            indicating data corruption in the database.

    Note:
        Unlike [Relay.from_db_params()][bigbrotr.models.relay.Relay.from_db_params],
        this method performs an explicit integrity check by comparing the
        recomputed SHA-256 hash against the stored ``id``. This catches
        silent data corruption that could otherwise propagate through the
        system.
    """
    value_dict = json.loads(params.data)
    instance = cls(type=params.type, data=value_dict)

    if instance._content_hash != params.id:
        raise ValueError(
            f"Hash mismatch: computed {instance._content_hash.hex()}, "
            f"expected {params.id.hex()}"
        )

    return instance
