XProtocol — Discovery & Transport Specification

Protocol Version: 0.1 (Draft)
Date: 2026-05-31
License: CC BY 4.0 (this document) · Apache 2.0 (code)
Status: Proposed (not yet ratified as a community standard)
Namespace: xp.discovery.*
Depends on: XProtocol-Specification.md (core protocol)


Abstract

XProtocol's identity model is key-based, not address-based. Every participant — person, service, AI agent, IoT device — is identified by their Ed25519 public key. This key never changes regardless of IP address, network, physical location, or device. The discovery problem is therefore not "how do I find your IP address" but "how do I route an event to your key" — a fundamentally different and more powerful abstraction.

This specification defines the XProtocol Discovery Layer: a pluggable, extensible transport negotiation system that allows any two XProtocol participants to find and reach each other through any available physical or network channel, with transport selection driven by availability, capability, performance, and trust requirements.

The result is a protocol where the key IS the address, the transport IS pluggable, and the trust level of the connection is a cryptographic property of the transport used to establish it — not an assumption made about the network.


License

XProtocol is free and open. Code is Apache 2.0. Documentation and specifications (including this document) are CC BY 4.0. See LICENSE and LICENSE-DOCS in the repository root.

The mechanisms described in this specification (including visual place recognition, audio fingerprint key derivation, cryptographic physical context binding, proximity broadcast pairing, continuous location presence gating, and all other novel primitives) are contributed to the public under these open licenses. Under Section 3 of Apache 2.0, contributors grant every user a royalty-free patent license covering their contributions; the grant is irrevocable except as stated in that section, which ends it for a licensee who starts patent litigation over the work. These ideas are yours to build on. No permission required.


1. Design Principles

1.1 — Key-Addressed Routing

The destination of every XProtocol event is a public key, not an IP address. Discovery is the process of mapping a public key to a currently reachable transport path. This mapping is ephemeral and may change at any time — a device that was reachable via BLE five minutes ago may now be reachable only via relay. The key remains constant; the transport path does not.

1.2 — Transport Pluggability

Discovery transports are pluggable modules. The protocol defines a common interface that all transports implement. New transports can be added without changing the core protocol. Implementations declare which transports they support. The protocol selects the best available common transport.

1.3 — Trust Tiers

Not all transports provide the same trust guarantees. A connection established via NFC tap (physical proximity proven) carries a stronger trust assertion than one established via relay (no physical context whatsoever). The trust level of the transport used to establish a relationship is a first-class property, recorded in the pairing event and available to capability policies.

1.4 — Graceful Degradation

Every XProtocol implementation supports at minimum one transport: relay. Relay is the universal fallback — it works on any device with any internet connection. Every additional transport is an enhancement. A device that supports only relay can communicate with a device that supports all transports. The richer device uses relay when that is the only common transport.

1.5 — Privacy by Default

Network transports expose metadata (who is communicating with whom, how often, how much data). Physical transports (NFC, QR, UWB, camera) expose physical presence but not network identity. The choice of transport is a privacy decision. Applications should use the minimum transport exposure required for the use case.


2. The Reachability Profile

Every XProtocol participant publishes a reachability profile — a signed event declaring which discovery transports they support, in priority order.

{
  "kind": "xp.discovery.profile",
  "payload": {
    "key": "<ed25519_public_key>",
    "x25519_key": "<x25519_public_key>",
    "transports": [
      {
        "type": "relay",
        "priority": 1,
        "params": {
          "relay_urls": ["wss://relay.xprotocol.ai", "wss://relay2.xprotocol.ai"]
        }
      },
      {
        "type": "rendezvous",
        "priority": 2,
        "params": {
          "server": "rv.xprotocol.ai"
        }
      },
      {
        "type": "dht",
        "priority": 3,
        "params": {
          "bootstrap_nodes": ["dht1.xprotocol.ai", "dht2.xprotocol.ai"]
        }
      },
      {
        "type": "ble",
        "priority": 4,
        "params": {
          "advertise": true,
          "service_uuid": "<xprotocol_ble_service_uuid>"
        }
      },
      {
        "type": "wifi_direct",
        "priority": 5,
        "params": {}
      },
      {
        "type": "nfc",
        "priority": 6,
        "params": {
          "passive_tag": false
        }
      },
      {
        "type": "uwb",
        "priority": 7,
        "params": {
          "ranging_enabled": true
        }
      },
      {
        "type": "qr",
        "priority": 8,
        "params": {
          "ttl": 300
        }
      },
      {
        "type": "visual_place",
        "priority": 9,
        "params": {
          "tolerance": "medium"
        }
      },
      {
        "type": "location",
        "priority": 10,
        "params": {
          "geofence_precision": 7
        }
      }
    ],
    "published_at": 1717000000,
    "expires_at": 1717086400
  }
}

The reachability profile is:

  • Signed by the participant's Ed25519 key
  • Stored in the participant's relay and/or DHT node
  • Cached locally by peers who have previously communicated with this participant
  • Refreshed when transport availability changes (network change, BLE toggle, etc.)


3. Transport Negotiation

3.1 — Discovery Flow

When device A wants to reach device B:

1. A fetches B's reachability profile
   (from relay, DHT, local cache, or out-of-band)

2. A intersects B's transport list with A's own capabilities

3. A attempts transports in priority order (B's priority takes precedence)
   until one succeeds

4. A sends the event via the first successful transport

5. A optionally attempts xp.discovery.upgrade after initial contact
   (e.g., upgrade from relay to direct connection)
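Steps 2 to 4 of the flow above can be sketched in a few lines of Python. This is a minimal illustration, not a normative algorithm: the function name select_transport and the try_connect callback are hypothetical, and it assumes that a lower priority number means "try first", which matches the ordering in the example profile.

```python
from typing import Callable, Iterable, Optional


def select_transport(
    peer_transports: Iterable[dict],
    local_types: set[str],
    try_connect: Callable[[dict], bool],
) -> Optional[dict]:
    """Return the first peer transport that is locally supported and connects."""
    # Step 2: keep only the transports both sides support.
    common = [t for t in peer_transports if t["type"] in local_types]
    # Step 3: the peer's priority wins; lower number is assumed to mean "first".
    for transport in sorted(common, key=lambda t: t["priority"]):
        if try_connect(transport):
            return transport  # Step 4: send the event over this transport
    return None


# B advertises relay, rendezvous and BLE. A has no BLE radio, and in this
# run only the rendezvous attempt succeeds.
profile_b = [
    {"type": "ble", "priority": 4, "params": {}},
    {"type": "relay", "priority": 1, "params": {}},
    {"type": "rendezvous", "priority": 2, "params": {}},
]
chosen = select_transport(
    profile_b, {"relay", "rendezvous"}, lambda t: t["type"] == "rendezvous"
)
```

Here relay is tried first and fails, so chosen ends up as the rendezvous entry. A return value of None means no common transport could be reached.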

3.2 — Transport Negotiation Events

xp.discovery.profile      — publish reachability profile
xp.discovery.probe        — "can you reach me via this transport?"
xp.discovery.probe.ack    — "yes / no, and here is why"
xp.discovery.upgrade      — "we are on relay, want to try direct?"
xp.discovery.upgrade.ack  — "yes / no"
xp.discovery.register     — register a new transport type (extensibility)
xp.discovery.physical_bind — bind a device to its physical context (§9)
xp.discovery.recommission  — authorized repositioning of a bound device

3.3 — Upgrade Protocol

Once initial contact is established via any transport, either party may propose upgrading to a faster or more private transport:

{
  "kind": "xp.discovery.upgrade",
  "payload": {
    "current_transport": "relay",
    "proposed_transport": "rendezvous",
    "rendezvous_token": "<hkdf_derived_token>",
    "my_external_addr": "203.0.113.42:51820",
    "expires_at": 1717000300
  }
}

The recipient attempts the proposed transport. If successful, subsequent events flow via the upgraded transport. If not, the current transport continues. No disruption to the session either way.


4. Network Transports

Network transports provide connectivity without requiring physical proximity. They expose varying amounts of metadata to infrastructure.

4.1 — Relay (Mandatory)

How it works: Device A maintains a persistent WebSocket connection to one or more relays. Events addressed to A's public key are delivered to connected relays and forwarded to A. Device B sends events to A's key — the relay mesh routes them without B ever knowing A's IP address.

Key properties:

  • Universal: works on any device with any internet connection
  • Privacy: A's IP is never disclosed to B; relay sees ciphertext only
  • Async: events are stored for offline recipients and delivered on reconnect
  • Relay mesh: relays forward events to neighboring relays for keys they don't directly serve, enabling full decentralization

Metadata exposed to relay operator: sender key fingerprint, recipient key fingerprint, event kind (unless envelope-encrypted), timestamp, payload size. Envelope encryption hides the sender key fingerprint and event kind; the relay still sees the recipient key fingerprint, the arrival time, and the ciphertext size.

Conformance: MANDATORY. Every XProtocol implementation must support relay.

4.2 — Cryptographic Rendezvous

How it works: Two devices that share a prior secret (from a previous pairing) can meet directly without a central directory by deriving a time-windowed rendezvous token:

token = HKDF(
  shared_secret,
  info = "xp.discovery.rendezvous",
  salt = floor(unix_time_ms / 300000)  // 5-minute windows
)

Device A posts its current external IP:port alongside this token to any rendezvous server. Device B computes the same token, queries the server, retrieves A's address, and establishes a direct connection. The token is single-use and immediately invalidated after match.
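The token derivation above can be reproduced with the Python standard library. The sketch below implements HKDF as defined in RFC 5869 with HMAC-SHA256. The choice of SHA-256, the 32-byte output length, and the 8-byte big-endian encoding of the window index are assumptions, since the formula does not fix them.

```python
import hashlib
import hmac

WINDOW_MS = 300_000  # 5-minute windows, as in the formula above


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def rendezvous_token(shared_secret: bytes, unix_time_ms: int) -> bytes:
    """Derive the token for the 5-minute window containing unix_time_ms."""
    window = unix_time_ms // WINDOW_MS
    salt = window.to_bytes(8, "big")  # assumed encoding of the window index
    return hkdf_sha256(shared_secret, salt, b"xp.discovery.rendezvous")
```

Two peers holding the same secret get the same token only while their clocks fall in the same window. The text does not say how clock skew near a window boundary is handled; one option is for the querying peer to also try the adjacent window.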

Privacy model: The rendezvous server is completely untrusted. It sees only opaque tokens and IP addresses — no keys, no identities, no content. Any commodity server can run the rendezvous service. Multiple independent rendezvous servers can be used for redundancy.

NAT traversal: The rendezvous server doubles as a STUN endpoint: it reports each device's external IP:port as seen from the internet. Devices exchange this information via the rendezvous token and attempt UDP hole-punch for direct peer-to-peer connectivity. If hole-punching fails (for example behind symmetric NAT), TURN relay fallback is used. This follows the same pattern as the ICE/STUN/TURN stack, with peers identified by key rather than by address.

Conformance: OPTIONAL. Recommended for all implementations that support persistent connections between known peers.

4.3 — Key-Addressed DHT

How it works: A Kademlia-style distributed hash table where the node ID IS the XProtocol public key fingerprint. Devices store their current connection information (IP, relay URL, or both) in the DHT under their own key. Any device that knows a public key can look up current reachability information directly in the DHT — no central server, no DNS, no registrar.

Self-authenticating records: Location records stored in the DHT are signed XProtocol events. Any DHT node storing a record can verify it was signed by the key it is stored under. Forging a location record for a key you don't control is computationally infeasible. Signature verification does not by itself rule out replay of an older, validly signed record.
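In a Kademlia-style table, "closeness" between a lookup target and a node is the XOR of their IDs. The sketch below shows that metric applied to key fingerprints. It assumes the fingerprint is the SHA-256 hash of the raw public key, which this section does not specify, and the function names are illustrative only.

```python
import hashlib


def fingerprint(public_key: bytes) -> int:
    """Node ID as an integer. SHA-256 of the key is an assumed fingerprint."""
    return int.from_bytes(hashlib.sha256(public_key).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance metric: bitwise XOR of the two IDs."""
    return a ^ b


def closest_nodes(target_key: bytes, known_keys: list[bytes], k: int = 3) -> list[bytes]:
    """Return the k known keys whose fingerprints are nearest the target."""
    target = fingerprint(target_key)
    return sorted(known_keys, key=lambda pk: xor_distance(fingerprint(pk), target))[:k]
```

A lookup for a key repeatedly asks the closest known nodes for nodes that are closer still, until it reaches the nodes storing the signed record for that key.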

Conformance: OPTIONAL. Recommended for implementations targeting decentralized or censorship-resistant deployments.

4.4 — Return Address Capability

How it works: Every outgoing event may carry a signed, encrypted return-address capability — a short-lived token that allows the recipient to reach the sender through whatever path is currently available, without the sender needing a persistent discoverable address.

{
  "return_capability": {
    "transport": "relay",
    "relay_url": "wss://relay.xprotocol.ai",
    "ephemeral_key": "<x25519_public_key>",
    "valid_for_seconds": 300,
    "single_use": false
  }
}

The return capability is encrypted to the recipient's X25519 key — only the intended recipient can use it. It expires after the specified window. Single-use capabilities are invalidated after first use.
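The expiry and single-use rules can be illustrated with a small recipient-side check. This is a sketch under assumptions: the issued_at field is hypothetical (the payload above carries only valid_for_seconds, so the start of the window must come from somewhere, for example the enclosing event's timestamp), and decryption of the capability is taken to have happened already.

```python
from dataclasses import dataclass


@dataclass
class ReturnCapability:
    transport: str
    relay_url: str
    issued_at: int          # Unix seconds; hypothetical, see the note above
    valid_for_seconds: int
    single_use: bool
    used: bool = False

    def redeem(self, now: int) -> bool:
        """Return True if the capability may be used at `now`, and record the use."""
        if now >= self.issued_at + self.valid_for_seconds:
            return False    # the validity window has closed
        if self.single_use and self.used:
            return False    # a single-use capability was already spent
        self.used = True
        return True


cap = ReturnCapability("relay", "wss://relay.xprotocol.ai", 1000, 300, single_use=True)
first = cap.redeem(now=1100)   # inside the window, first use
second = cap.redeem(now=1101)  # inside the window, but already spent
```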

Privacy model: The sender is reachable for exactly as long as they choose, through exactly the channels they expose, to exactly the recipients they intend. Discoverability is an active per-interaction choice, not a passive always-on state.

Conformance: OPTIONAL. Recommended for privacy-sensitive deployments and IoT devices that should not maintain persistent discoverable addresses.


5. Local Presence Transports

Local presence transports provide connectivity within a physical space — typically a building, floor, or room. They expose less metadata than network transports and provide implicit proximity context.

5.1 — Bluetooth Low Energy (BLE)

How it works: Devices advertise their XProtocol key fingerprint via BLE service advertisements. Nearby devices that share prior pairings recognize each other's fingerprints and establish direct connections.

Discovery without prior pairing: BLE advertisements carry only the key fingerprint — not the full public key. This is sufficient for known peers (who already have the full key) to recognize each other. For first-contact, BLE discovery is combined with QR, NFC, or UWB for key exchange.

Automatic silent reconnection: Devices that have previously paired reconnect automatically when they come into BLE range — silently, without user action, without any server involvement.

Conformance: OPTIONAL. Recommended for mobile and IoT implementations.

5.2 — WiFi Direct / P2P

How it works: Devices establish direct WiFi connections without a router or access point. Suitable for high-bandwidth local transfers (large file sync, video, sensor data streams) where BLE throughput is insufficient.

Typical use: Upgrade from BLE discovery to WiFi Direct transport for bulk data operations. BLE handles the discovery and initial key exchange; WiFi Direct handles the data transfer.

Conformance: OPTIONAL.


6. Physical Presence Transports — Proximity

Physical presence transports provide connectivity that requires one or both devices to be physically close to the other. This physical proximity is a cryptographic trust signal — connections established via these transports carry a higher trust assertion than network transports.

6.1 — QR Code

How it works: One device displays a QR code encoding a signed, time-limited connection invitation. Another device scans it to initiate contact.

QR payload structure:

{
  "v": 1,
  "kind": "xp.discovery.qr",
  "key": "<ed25519_public_key>",
  "x25519": "<x25519_public_key>",
  "transports": ["wss://relay.xprotocol.ai"],
  "nonce": "<32_random_bytes_base64url>",
  "purpose": "pair | service | iot | capability | rendezvous",
  "expires": 1717000300,
  "sig": "<ed25519_signature_of_above>"
}

Compact-encoded (CBOR + base64url), this fits comfortably within a standard QR code. The signature proves the QR was not tampered with. The nonce prevents replay. The expiry makes photographed QR codes useless after the window.
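The three checks named above (signature, nonce, expiry) might be combined as follows by whichever side validates the invitation. This is a sketch only: the class name is invented, the payload is assumed to be already decoded into a dict, and verify_sig is a placeholder callback standing in for real Ed25519 verification, which the Python standard library does not provide.

```python
import time
from typing import Callable, Optional


class InvitationChecker:
    """Expiry, replay and signature checks for a decoded QR payload."""

    def __init__(self, verify_sig: Callable[[dict], bool]):
        # verify_sig is a placeholder for real Ed25519 verification.
        self._verify_sig = verify_sig
        self._seen_nonces: dict[str, int] = {}  # nonce -> expiry time

    def accept(self, payload: dict, now: Optional[int] = None) -> bool:
        now = int(time.time()) if now is None else now
        # Nonces of expired invitations no longer need to be remembered.
        self._seen_nonces = {n: e for n, e in self._seen_nonces.items() if e > now}
        if payload["expires"] <= now:
            return False  # stale or photographed code
        if payload["nonce"] in self._seen_nonces:
            return False  # replayed invitation
        if not self._verify_sig(payload):
            return False  # tampered or forged payload
        self._seen_nonces[payload["nonce"]] = payload["expires"]
        return True
```

Because every invitation expires, the nonce cache stays bounded: an entry can be dropped as soon as its invitation would be rejected on expiry anyway.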

QR as a signed capability, not just an address:

The QR code is not merely an identity advertisement — it is a signed, one-time capability that can encode different purposes:

| Purpose | What it enables |
|---|---|
| pair | First-contact identity exchange: scan to add a contact |
| service | Service endpoint bootstrap: scan to connect to this XProtocol service |
| iot | IoT device onboarding: scan to add this device to your key hierarchy |
| capability | Temporary access grant: scan to receive a time-limited capability |
| rendezvous | Ephemeral direct connection: scan to establish a direct link right now |

Physical proximity as a trust signal: Scanning a QR code proves the scanner was physically present at the display — or had access to a screen or printout showing it. For IoT onboarding, this physical presence proof is the right trust model: the person commissioning the device was there.

NFC tag variant: QR code payloads can be written to passive NFC tags. This enables physical objects (devices, access points, products, locations) to carry XProtocol endpoint information without any power source. A device mounted on a wall can be commissioned by tapping an NFC tag attached to it.

Conformance: OPTIONAL. Strongly recommended for mobile implementations and IoT onboarding flows.

6.2 — NFC (Near Field Communication)

How it works: Two NFC-capable devices exchange XProtocol bootstrap information in a single physical tap. Unlike QR (one-way, display to camera), NFC is bidirectional — both devices transmit and receive in the same interaction.

NFC pairing payload:

{
  "kind": "xp.discovery.nfc",
  "initiator": {
    "key": "<ed25519_public_key>",
    "x25519": "<x25519_public_key>",
    "transports": ["wss://relay.xprotocol.ai"],
    "nonce": "<random_bytes>",
    "sig": "<signature>"
  },
  "responder": {
    "key": "<ed25519_public_key>",
    "x25519": "<x25519_public_key>",
    "transports": ["wss://relay2.xprotocol.ai"],
    "nonce": "<random_bytes>",
    "sig": "<signature>"
  }
}

Pairing is complete at tap. Because NFC is bidirectional, both sides have everything they need — both public keys, both X25519 keys, both transport lists — after the single tap. No network follow-up is required to complete the pairing. The devices can communicate immediately via any shared transport, even if they are no longer in NFC range.

NFC tags (passive): Passive NFC tags (no power source) carry the same QR code payload in NFC-readable form. Anything that can be done with a QR code can be done with an NFC tag, plus the tag can be embedded in physical objects (devices, access cards, product packaging) invisibly.

Trust level: NFC requires physical contact or near-contact (< 4cm). This is the strongest physical proximity proof among the non-UWB transports. Pairings established via NFC carry the highest proximity trust assertion.

Conformance: OPTIONAL. Recommended for mobile implementations.


7. Physical Presence Transports — Spatial

Spatial transports add precise physical positioning to the trust model. They are not just proximity-based but position-based — enabling use cases where the exact location and orientation of a device is a security primitive.

7.1 — Ultra-Wideband (UWB)

How it works: UWB radios provide centimeter-accurate ranging between devices. Unlike GPS (meter-accurate outdoors only) or BLE (approximate, indoors), UWB provides precise distance and angle measurements indoors and outdoors at cm-level accuracy.

UWB in the XProtocol discovery context:

Spatial addressing: Events can be addressed to "the device 2 meters in front of me facing north" rather than by key fingerprint. The UWB ranging system resolves this spatial address to a specific device key in real time.

Proximity-gated capabilities: A capability token is valid only while the holding device is within N centimeters of a designated anchor device. The capability is automatically suspended when the device moves out of range and restored when it returns. The gate is enforced by continuous UWB ranging — no server check, no polling interval.

Physical context as authorization: "This door unlocks for keys that are within 30cm of the reader" — not "keys that are in this building" or "keys that are on this network." The cm-accurate ranging makes this meaningful in a way BLE cannot provide.

Anti-relay attack prevention: For high-security pairing scenarios, UWB ranging proves the device is physically present — not being proxied through a long-range radio by an attacker. A device claiming to be at 5cm but actually 50 meters away cannot pass UWB ranging verification. This prevents relay attacks that defeat NFC and BLE proximity claims.
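The proximity-gated capability described above reduces to a small piece of state driven by ranging measurements. The sketch below assumes the UWB stack delivers distance readings in centimeters through some callback; the class and method names are illustrative, not part of any UWB API.

```python
from dataclasses import dataclass


@dataclass
class ProximityGate:
    max_distance_cm: float
    active: bool = False

    def on_range(self, distance_cm: float) -> bool:
        """Feed one ranging measurement; return whether the capability is usable."""
        # Suspended when out of range, restored as soon as the device returns.
        self.active = distance_cm <= self.max_distance_cm
        return self.active


door = ProximityGate(max_distance_cm=30.0)
states = [door.on_range(d) for d in (25.0, 120.0, 10.0)]
```

With the 30cm door example, the three readings give usable, suspended, usable. A real deployment would likely also treat a missing or stale measurement as out of range; that policy is not stated in the text.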

Trust level: UWB provides the highest physical presence trust assertion of any transport — cm-accurate, direction-aware, and relay-attack-resistant.

Hardware availability: UWB is present in iPhone 11 and later (excluding iPhone SE models), some recent Android flagships (for example Samsung Galaxy S21+ and S21 Ultra, Google Pixel 6 Pro), Apple AirTags, and a growing range of IoT hardware. The hardware exists at scale today.

Conformance: OPTIONAL. Recommended for high-security access control and IoT installations.

7.2 — GPS Location-Derived Key Material

The core idea: Derive encryption key material from a geographic location, such that only a device physically present within a defined geofence can reconstruct the decryption key. The location IS the key — not a password that happens to describe a location.

Key derivation:

location_key = HKDF(
  shared_secret,
  info = "xp.discovery.location",
  salt = geohash(lat, lon, precision)
)

The geohash precision parameter controls geofence size:

  • Precision 9 → ~4m × 4m cell
  • Precision 8 → ~38m × 19m cell
  • Precision 7 → ~153m × 153m cell
  • Precision 6 → ~1.2km × 0.6km cell

Sender flow:

1. Sender selects a target location and precision
2. Derives location_key from their shared secret with the recipient
3. Encrypts payload with location_key
4. Sends event with geofence parameters (center + precision) but NOT the key

Recipient flow:

1. Recipient receives event with geofence parameters
2. Device's GPS must report position within the geofenced area
3. Device derives the same location_key from GPS position + shared secret
4. Decryption succeeds only within the geofence
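The derivation shared by both flows can be sketched with a standard geohash encoder and HKDF (RFC 5869). The hash function (SHA-256), the 32-byte key length, and the use of the ASCII geohash string as the salt are assumptions; the formula above names only the inputs.

```python
import hashlib
import hmac

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet


def geohash(lat: float, lon: float, precision: int) -> str:
    """Standard geohash: interleave longitude and latitude bisection bits."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits make one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def location_key(shared_secret: bytes, lat: float, lon: float, precision: int) -> bytes:
    """Key material bound to the geohash cell containing (lat, lon)."""
    salt = geohash(lat, lon, precision).encode("ascii")
    return hkdf_sha256(shared_secret, salt, b"xp.discovery.location")
```

The sender calls location_key with the geofence center and the recipient calls it with the reported GPS fix; the keys match only when both points fall in the same geohash cell. A recipient standing just across a cell edge derives a different key, so an implementation may want to try neighboring cells or choose a coarser precision.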

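Privacy model: The recipient's position is never transmitted. The key material is derived locally from GPS position, which the server cannot read, and the payload is encrypted and meaningless without the location-derived key. For the server to remain unaware of where the geofence is, the geofence parameters sent in step 4 of the sender flow must travel inside the encrypted envelope rather than in cleartext.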

Location proof hardening: GPS coordinates alone are public knowledge — anyone can claim to be anywhere. Hardening mechanisms:

  • Witness co-signing: Nearby XProtocol devices co-sign a location attestation. Decryption requires both the location-derived key AND a threshold of witness signatures from devices known to be at that location.
  • Time-locked location: Combine geofence with a time window. The key is only reconstructable at location X between time T1 and T2.
  • UWB + GPS combination: GPS gets you to the right building; UWB gets you to the right room. Combined, they create a precise physical context extremely difficult to spoof.
  • Visual verification: Combined with the visual place recognition transport (§8), the geofence key can require both GPS position AND visual scene match.

Use cases:

  • Time-sensitive documents retrievable only from a specific jurisdiction
  • Dead drops: encrypted messages retrievable only from a physical location
  • Location-gated IoT device activation
  • Proof-of-presence verification for legal or compliance purposes
  • Access grants valid only within a specific facility or zone

Conformance: OPTIONAL.


7.3 — Continuous Location Presence Gating for Messaging

7.3.1 — Concept

GPS Location-Derived Key Material (§7.2) is a one-time gate: go to the location, decrypt, keep the content forever. Continuous Location Presence Gating is a fundamentally different primitive: the message is only visible while the recipient is physically inside the geofence. Leave the area — the message disappears from the timeline. Return — it reappears. The content is not destroyed; the read condition is evaluated continuously against current GPS position.

This creates a new category of spatially-aware content: messages that exist in the physical world, not just in a conversation thread.

7.3.2 — The Delivery × Visibility × Persistence Matrix

Geofenced messages are defined by three independent, composable dimensions. Any combination is valid.

Dimension 1 — Delivery Mode:

| Mode | Value | Behavior |
|---|---|---|
| Deliver immediately | deliver_immediately | Message arrives at relay and is stored on device silently. No notification until geofence entry. |
| Hold at relay | hold_until_present | Relay holds ciphertext. Message never touches recipient device until location proof is provided. |

Dimension 2 — Visibility Mode:

| Mode | Value | Behavior |
|---|---|---|
| Persist after read | persist_after_read | Geofence gates first access only. Once read inside the area, message is visible everywhere thereafter. |
| Geofence only | geofence_only | Continuously gated. Visible inside, hidden outside. Appears and disappears dynamically. |
| Disappear on exit | disappear_on_exit | Plaintext and local copy deleted when recipient leaves geofence. Cannot be recovered without re-entering. |

Dimension 3 — Persistence Mode:

| Mode | Value | Behavior |
|---|---|---|
| Permanent | permanent | Normal retention. No time limit on reading. |
| Timed read window (soft) | timed_on_open | Timer starts when recipient opens the message. After N seconds, plaintext evicted, local copy deleted, relay copy deleted. |
| Timed read window (hard) | timed_on_delivery | read_by timestamp set by sender. If not opened by deadline, relay deletes regardless of geofence status. |
| Single read | single_read | Deleted from system the moment the recipient closes the message. |

The interesting combinations:

deliver_immediately + geofence_only + permanent
→ "Location echo" — time capsule tied to a place.
  Come back in 10 years, message is still there.

deliver_immediately + geofence_only + timed_on_open
→ Silent delivery, visible only inside geofence,
  5-minute read window once opened. Miss the window,
  timer resets on next visit (or not, as configured
  by the sender via timer_resets_on_reentry).

hold_until_present + disappear_on_exit + single_read
→ Cryptographic dead drop. Relay holds until you arrive.
  One read. Closes — gone from the entire system.

hold_until_present + geofence_only + timed_on_delivery
→ Maximum restriction. Relay holds it. Hard deadline.
  Must arrive before the clock runs out. Opens, reads,
  N minutes to finish. Then gone.

deliver_immediately + persist_after_read + permanent
→ Geofenced notification only. Normal message once read.
  Friend sends "look up" — you only see it when you're
  at the right corner.
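Because the three dimensions are independent, they map naturally onto three enumerations combined in one record. The sketch below uses the flag values from the tables above; the class names and the two consistency checks (a timed mode must carry its own parameter) are assumptions, not rules stated by the specification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Delivery(Enum):
    DELIVER_IMMEDIATELY = "deliver_immediately"
    HOLD_UNTIL_PRESENT = "hold_until_present"


class Visibility(Enum):
    PERSIST_AFTER_READ = "persist_after_read"
    GEOFENCE_ONLY = "geofence_only"
    DISAPPEAR_ON_EXIT = "disappear_on_exit"


class Persistence(Enum):
    PERMANENT = "permanent"
    TIMED_ON_OPEN = "timed_on_open"
    TIMED_ON_DELIVERY = "timed_on_delivery"
    SINGLE_READ = "single_read"


@dataclass(frozen=True)
class GeofencePolicy:
    delivery: Delivery
    visibility: Visibility
    persistence: Persistence
    read_window_seconds: Optional[int] = None
    read_by: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumed consistency rules: each timed mode needs its own parameter.
        if self.persistence is Persistence.TIMED_ON_OPEN and not self.read_window_seconds:
            raise ValueError("timed_on_open requires read_window_seconds")
        if self.persistence is Persistence.TIMED_ON_DELIVERY and self.read_by is None:
            raise ValueError("timed_on_delivery requires read_by")


# The "cryptographic dead drop" combination from the list above.
DEAD_DROP = GeofencePolicy(
    Delivery.HOLD_UNTIL_PRESENT, Visibility.DISAPPEAR_ON_EXIT, Persistence.SINGLE_READ
)
# Every combination is valid: 2 x 3 x 4 = 24 policies.
ALL_COMBINATIONS = [(d, v, p) for d in Delivery for v in Visibility for p in Persistence]
```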

7.3.3 — Privacy-Preserving Relay Hold

The hold_until_present delivery mode requires the relay to know when to release the held message. The relay cannot be told the recipient's GPS position, since that would be a privacy violation. Two privacy-preserving release mechanisms are defined:

Option A — Zero-knowledge location proof (client-initiated pull):

The recipient's device monitors geofence entry locally via OS-level geofencing APIs (iOS CLLocationManager region monitoring, Android Geofencing API — battery-efficient OS callbacks, not continuous GPS polling). On geofence entry, the device generates a zero-knowledge proof of geofence membership and sends it to the relay:

{
  "kind": "xp.geofence.proof",
  "payload": {
    "held_event_id": "<sha256_of_held_event>",
    "proof": "<zk_proof_of_geofence_membership>",
    "timestamp": 1717000000,
    "sig": "<recipient_key_signature>"
  }
}

The relay verifies the proof (without learning exact coordinates) and releases the held ciphertext to the recipient.

Option B — Cryptographic location unlock (zero relay logic):

The relay holds the event ciphertext with no special logic. The decryption key is itself encrypted to a location-derived key:

outer_key = location_key (derived from geofence coords + shared secret)
inner_event = encrypt(message_payload, inner_symmetric_key)
outer_event = encrypt(inner_symmetric_key, outer_key)

The relay delivers both ciphertexts immediately; it cannot tell the difference between a held message and a normal one. The device simply cannot decrypt until GPS confirms geofence entry and the outer_key can be derived, so under this option the ciphertext reaches the device early while the plaintext stays unavailable. No new relay infrastructure required.

Option B is the recommended approach — it requires no changes to the relay specification, no zero-knowledge proof infrastructure, and no relay awareness of geofencing. The location-gating is entirely client-side cryptography.
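The nested construction in Option B is a key-wrapping pattern and can be sketched independently of any particular cipher. In the sketch below the encrypt and decrypt callables are placeholders for an authenticated cipher chosen by the implementation (the Python standard library does not include one), and the function names are invented for illustration.

```python
import os
from typing import Callable, Tuple

Cipher = Callable[[bytes, bytes], bytes]  # (key, data) -> data


def wrap_for_location(
    message: bytes, location_key: bytes, encrypt: Cipher
) -> Tuple[bytes, bytes]:
    """Return (inner_event, outer_event) as in the construction above."""
    inner_key = os.urandom(32)                      # fresh symmetric key per message
    inner_event = encrypt(inner_key, message)       # payload under the inner key
    outer_event = encrypt(location_key, inner_key)  # inner key under the location key
    return inner_event, outer_event


def unwrap_at_location(
    inner_event: bytes, outer_event: bytes, location_key: bytes, decrypt: Cipher
) -> bytes:
    """Recover the message once the location key can be derived."""
    inner_key = decrypt(location_key, outer_event)
    return decrypt(inner_key, inner_event)
```

With an authenticated cipher, a device outside the geofence derives the wrong location key and the first decrypt call fails outright, rather than yielding a wrong inner key. The relay only ever handles the two opaque byte strings.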

7.3.4 — Notification Behavior

Silent delivery (deliver_immediately mode) requires suppressing the OS-level push notification until geofence entry:

On event receipt (outside geofence):
  → Store encrypted event locally
  → Do not increment badge count
  → Show no notification
  → Register OS geofence region for this event's coordinates

On OS geofence entry callback:
  → Derive location_key from current GPS
  → Decrypt event
  → Surface notification: "📍 [Sender] sent you a message nearby"
  → Render in timeline

On OS geofence exit callback:
  → If visibility_mode == geofence_only or disappear_on_exit:
      Evict plaintext from memory
      Remove from rendered timeline
      (local encrypted copy retained unless disappear_on_exit)
  → If disappear_on_exit:
      Delete local copy entirely
      Send xp.geofence.exited event to relay
      Relay deletes its copy
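The entry and exit handlers above amount to a small state machine per message. The sketch below models only the local state. Relay notifications such as xp.geofence.exited are omitted, the decrypt argument is a placeholder callback, and the persist_after_read branch is inferred from the visibility table rather than from the handler listing.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GeofencedMessage:
    ciphertext: Optional[bytes]
    visibility_mode: str  # persist_after_read | geofence_only | disappear_on_exit
    plaintext: Optional[str] = None
    visible: bool = False
    read_once: bool = False

    def on_enter(self, decrypt: Callable[[bytes], str]) -> None:
        """Geofence entry: decrypt and surface the message."""
        if self.ciphertext is None:
            return  # local copy was already deleted
        self.plaintext = decrypt(self.ciphertext)
        self.visible = True

    def mark_read(self) -> None:
        """Record that the user opened the message while it was visible."""
        if self.visible:
            self.read_once = True

    def on_exit(self) -> None:
        """Geofence exit: apply the visibility mode."""
        if self.visibility_mode == "persist_after_read" and self.read_once:
            return  # already read inside the area, stays visible everywhere
        self.plaintext = None  # evict plaintext from memory
        self.visible = False   # remove from the rendered timeline
        if self.visibility_mode == "disappear_on_exit":
            self.ciphertext = None  # delete the local copy entirely
```

A geofence_only message keeps its ciphertext across exits, so a later on_enter makes it reappear; a disappear_on_exit message has nothing left to decrypt locally after the first exit.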

7.3.5 — Timed Read Window

The read window timer starts according to persistence_mode:

On open (soft timer):

User taps message → timer starts → countdown visible in UI
At T=0:
  Evict plaintext from memory
  Overwrite local SQLite record
  Send xp.message.timed_expire event to relay
  Relay deletes its copy
  Message slot shows: "⏱ This message has expired"

On delivery (hard timer): The sender sets read_by as a Unix timestamp. The relay enforces this as an absolute deadline — the event is deleted from the relay at read_by regardless of whether the recipient ever opened it or entered the geofence. The recipient's device also checks read_by and suppresses display after the deadline even if a local copy exists.

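Timer reset on re-entry: By default (timer_resets_on_reentry: false), if the recipient exits and re-enters the geofence before the timer expires, the timer continues from where it left off; it does not reset. If the timer expires while the recipient is outside the geofence, the message is gone on next entry. A sender who wants a fresh window on each visit sets timer_resets_on_reentry to true.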
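The soft and hard timers can be expressed as two independent deadline checks. The sketch below reads "continues from where it left off" as a wall-clock countdown that keeps running while the recipient is outside the geofence, which is an interpretation rather than something the text states outright; the class name is illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReadWindow:
    read_window_seconds: Optional[int] = None  # soft timer length (timed_on_open)
    read_by: Optional[int] = None              # hard deadline (timed_on_delivery)
    opened_at: Optional[int] = None

    def open(self, now: int) -> None:
        """Start the soft timer on the first open only; later opens do not reset it."""
        if self.opened_at is None:
            self.opened_at = now

    def expired(self, now: int) -> bool:
        """True once either the hard deadline or the soft window has passed."""
        if self.read_by is not None and now >= self.read_by:
            return True
        if self.read_window_seconds is not None and self.opened_at is not None:
            return now >= self.opened_at + self.read_window_seconds
        return False


window = ReadWindow(read_window_seconds=300)
window.open(now=1000)
window.open(now=1200)  # re-entry: the countdown is not restarted
still_readable = not window.expired(now=1299)
gone = window.expired(now=1300)
```

On expiry the client would then run the eviction steps listed above: evict plaintext, overwrite the local record, and notify the relay.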

7.3.6 — Deletion Mechanics and Honest Limitations

What deletion covers:

  • Relay copy: deleted on signed xp.message.delete event from sender, or on xp.geofence.exited from recipient, or on read_by expiry. Relay issues a signed deletion receipt.
  • Device plaintext: evicted from memory. Local SQLite record overwritten. OS screenshot prevention flag set during display.
  • System-wide: once relay and all known device copies are deleted, the message no longer exists in any XProtocol-controlled storage.

What deletion cannot guarantee:

  • A recipient who photographed the screen with another device
  • A recipient who screenshotted before screenshot prevention engaged
  • A recipient who rooted their device and extracted memory

Honest statement: Timed and location-gated deletion provides strong privacy guarantees and genuine deterrence. It is not absolute proof against a determined adversary with physical access to the device. The XProtocol threat model covers protocol-level adversaries; device-level attacks are out of scope.

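Screenshot prevention: The secure_display: true flag requests OS-level capture protection. On Android, WindowManager.LayoutParams.FLAG_SECURE blocks system screenshots and screen recording of the window. On iOS, UIScreen.isCaptured only reports that the screen is being recorded or mirrored, so the client can hide content while capture is active; it does not block still screenshots. Neither mechanism prevents a second device photographing the screen.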

7.3.7 — Complete Event Schema

{
  "kind": "xp.message.geofenced",
  "payload": {
    "text": "<encrypted message content>",
    "geofence": {
      "center": { "lat": 37.7749, "lon": -122.4194 },
      "radius_meters": 500,
      "precision": 7
    },
    "delivery_mode": "deliver_immediately | hold_until_present",
    "visibility_mode": "persist_after_read | geofence_only | disappear_on_exit",
    "persistence_mode": "permanent | timed_on_open | timed_on_delivery | single_read",
    "read_window_seconds": 300,
    "read_by": null,
    "timer_resets_on_reentry": false,
    "secure_display": true,
    "hint_outside_geofence": "📍 There's a message waiting for you nearby",
    "map_pin_visible": true
  }
}

Supporting event kinds:

xp.geofence.proof        — ZK location proof for relay-held messages
xp.geofence.entered      — Client notifies relay of geofence entry
xp.geofence.exited       — Client notifies relay of geofence exit
xp.message.timed_expire  — Client notifies relay to delete on timer expiry
xp.message.delete        — Sender requests deletion from relay

7.3.8 — Application Scenarios

Location echo / time capsule: Drop a message at a meaningful place — the bench where you met, the coffee shop where the deal closed, the hospital room. It waits there indefinitely. Anyone with access who visits reads it in place. A spatially-anchored memory.

Treasure hunt / scavenger hunt: A series of geofenced messages each revealing the next location. The map shows pins but no content until you arrive. The chat timeline shows only the messages you've physically reached.

Event-specific channels: A festival, conference, or concert creates a geofenced group thread. You're in the conversation when you're at the venue. Leave — thread disappears. Return — full history reappears. The channel IS the place.

Neighborhood community: A geofenced group that automatically includes you when you're home and excludes you when you travel. No subscription, no invite, no leave action — presence is membership.

Sensitive location-aware information: A doctor's office sends post-visit instructions geofenced to the pharmacy. Readable at the pharmacy, invisible elsewhere. Not deleted — just contextually appropriate.

Physical check-in proof: Reading a geofenced message is implicit cryptographic proof of physical presence. The read receipt timestamp combined with the geofence coordinates constitutes a location attestation signed by the recipient's key.

Dead drop communications: Relay holds the message. Recipient must physically arrive. Single read. Gone on close. No network trace connecting sender and recipient at the moment of delivery.

7.3.9 — Novel Mechanisms in this Section

Mechanism: Continuous Location Presence Gating for Messages A messaging system where message visibility is continuously gated by the recipient's real-time GPS position within a defined geofence — such that messages appear in the conversation timeline when the recipient is inside the geofence and disappear when they exit — with plaintext evicted from device memory on geofence exit, without destroying the underlying encrypted event.

Mechanism: Delivery × Visibility × Persistence Matrix A composable message flag system for location-gated messaging where delivery mode (immediate vs. relay-held), visibility mode (persist after read vs. continuously gated vs. disappear on exit), and persistence mode (permanent vs. timed vs. single-read) are independent, composable flags on a single message — enabling any combination of geofenced delivery, visibility, and deletion behavior.

Mechanism: Privacy-Preserving Relay Hold via Nested Encryption A method of holding a message at a relay until a recipient is physically present at a location, without the relay ever learning the geofence coordinates — achieved by encrypting the message decryption key to a location-derived key, such that the relay holds opaque ciphertext and the location-gating is enforced entirely by client-side cryptographic derivation on geofence entry.

Mechanism: Silent Delivery with Deferred Geofence Notification A messaging delivery mechanism where an encrypted event is stored on the recipient's device without generating any OS-level notification, badge count, or timeline entry — with the notification deferred until the recipient's device detects entry into the sender-specified geofence, at which point the message is decrypted and surfaced as if it just arrived.

Mechanism: Read Receipt as Implicit Location Attestation The property of a geofenced message read receipt — wherein the act of successfully decrypting and opening a location-gated message constitutes cryptographic proof of physical presence, since decryption required deriving a location-key from GPS coordinates within the geofence at the time of reading.


8. Visual Place Recognition

8.1 — Concept

The human eye recognizes places. A photograph taken from a specific position facing a specific direction produces a visual fingerprint of that location and orientation — a fingerprint that is:

  • Deterministic: The same scene, captured from the same position and direction, produces the same (or very similar) fingerprint.
  • Location-bound: A different location, even meters away, produces a different fingerprint.
  • Orientation-bound: Facing a different direction from the same position produces a different fingerprint.
  • Difficult to fake: Reproducing the fingerprint requires being physically present at that location facing that direction, or having a sufficiently accurate reproduction of the scene.

XProtocol uses this visual fingerprint as cryptographic key material — making physical presence at a specific location AND orientation a prerequisite for decryption.

8.2 — Visual Fingerprinting Pipeline

The visual fingerprint is a compact, normalized representation of a scene that is robust to expected environmental variation (lighting changes, minor angle differences, seasonal variation) while being sensitive to meaningful changes (location change, direction change, scene replacement).

Pipeline stages:

Raw camera frame
      │
      ▼
Preprocessing
  - Normalize exposure and white balance
  - Crop to central region (removes sky/ground extremes)
  - Resize to standard resolution (e.g., 64×64 or 128×128)
      │
      ▼
Feature extraction
  - Compute perceptual hash (pHash or dHash)
  - Extract dominant structural features (edges, corners, gradients)
  - Compute local feature descriptors (ORB or similar — no GPU required)
      │
      ▼
Quantization
  - Reduce feature vector to N bits (e.g., 256 bits)
  - This is the visual fingerprint
      │
      ▼
Key derivation
  visual_key = HKDF(
    visual_fingerprint || geohash(lat, lon) || quantized_heading,
    info = "xp.discovery.visual_place"
  )
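The feature-extraction and quantization stages can be illustrated with a difference hash (dHash), one of the perceptual hashes named in the pipeline. This is a minimal sketch, not the reference pipeline: it assumes the frame has already been preprocessed into a small grayscale grid (here 8 rows of 9 values, giving 64 bits), and the function names are hypothetical.

```python
def dhash_bits(gray):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    gray is a list of equal-length rows of brightness values. A bit is 1
    when the left pixel is brighter than its right-hand neighbour.
    """
    bits = []
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))

# Synthetic 8x9 "scene": brightness increases left to right in every row.
scene = [[r * 7 + c * 13 for c in range(9)] for r in range(8)]
brighter = [[v + 10 for v in row] for row in scene]   # uniform exposure shift
mirrored = [row[::-1] for row in scene]               # horizontally flipped scene

f = dhash_bits(scene)
print(len(f))                              # 64
print(hamming(f, dhash_bits(brighter)))    # 0
print(hamming(f, dhash_bits(mirrored)))    # 64
```

Because dHash compares neighbouring pixels rather than absolute values, a uniform brightness change leaves the fingerprint unchanged, while a structurally different view flips many bits. Real scenes fall between these extremes, which is why the tolerance mechanism in §8.3 is needed.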

8.3 — Fuzzy Commitment for Environmental Tolerance

Standard cryptographic keys are all-or-nothing: a key is either exactly right or completely wrong. A visual fingerprint has to be fuzzy: similar captures of the same scene must yield the same key despite minor variation. This is achieved with a fuzzy commitment scheme (a construction closely related to secure sketches and fuzzy extractors):

  1. Sender commits to visual fingerprint F with tolerance D (maximum acceptable Hamming distance between F and F')
  2. Commitment encodes enough error-correction information to recover F from any F' within Hamming distance D of F
  3. Commitment is included in the event alongside the encrypted payload
  4. Recipient captures their own fingerprint F', applies error correction using the commitment, and recovers F if within tolerance
  5. F is used to derive the decryption key

The tolerance parameter D is the security/usability tradeoff:

  • Low D (strict): very sensitive to any scene change; fails in poor lighting
  • High D (loose): tolerant of variation; less sensitive to tampering
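The commit/recover steps can be sketched with a code-offset fuzzy commitment. The example below is illustrative only: it assumes a simple repetition code (each secret bit repeated REPEAT times, decoded by majority vote), which tolerates a bounded number of flips per group rather than a global Hamming distance D. A deployment would more likely use a stronger error-correcting code such as BCH. All names are hypothetical.

```python
import hashlib
import secrets

REPEAT = 5  # each secret bit is repeated 5 times; up to 2 flips per group are corrected

def encode(bits):
    """Repetition-code encoder."""
    return [b for b in bits for _ in range(REPEAT)]

def decode(code):
    """Majority-vote decoder for the repetition code."""
    return [1 if 2 * sum(code[i:i + REPEAT]) > REPEAT else 0
            for i in range(0, len(code), REPEAT)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def commit(fingerprint):
    """Return (helper, check) committing to fingerprint F."""
    if len(fingerprint) % REPEAT:
        raise ValueError("fingerprint length must be a multiple of REPEAT")
    secret = [secrets.randbits(1) for _ in range(len(fingerprint) // REPEAT)]
    helper = xor(encode(secret), fingerprint)           # codeword XOR F
    check = hashlib.sha256(bytes(secret)).hexdigest()   # lets the recipient verify recovery
    return helper, check

def recover(helper, check, candidate):
    """Recover F from a nearby capture F', or return None if out of tolerance."""
    secret = decode(xor(helper, candidate))
    if hashlib.sha256(bytes(secret)).hexdigest() != check:
        return None
    return xor(helper, encode(secret))

F = [i % 2 for i in range(60)]      # stand-in for a 60-bit visual fingerprint
helper, check = commit(F)

noisy = F[:]                        # two flipped bits, in different groups
noisy[0] ^= 1
noisy[7] ^= 1
print(recover(helper, check, noisy) == F)

far = F[:]                          # a whole group flipped: beyond tolerance
for i in range(REPEAT):
    far[i] ^= 1
print(recover(helper, check, far))
```

The helper data hides F only to the extent that the fingerprint bits are unpredictable, so the entropy of real visual fingerprints matters for the security of the commitment.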

Recommended defaults by deployment context:

| Context | Tolerance | Notes |
| --- | --- | --- |
| Indoor, controlled lighting | Low | Server rooms, offices |
| Indoor, variable lighting | Medium | Homes, retail |
| Outdoor, stable scene | Medium | Buildings, monuments |
| Outdoor, variable scene | High | Parks, landscapes |
| IoT physical binding | Low | Maximum tamper sensitivity |

8.4 — The Visual Dead Drop

The visual dead drop is the primary user-facing application of visual place recognition as a cryptographic primitive. It enables one device to leave an encrypted message that can only be decrypted by another device that physically visits a specific location and "sees" the same scene.

Sender flow:

  1. Sender opens the XProtocol app and points the camera at a scene
  2. App captures visual fingerprint + GPS coordinates + compass heading
  3. App generates visual_key from these inputs
  4. Sender composes and encrypts the payload with visual_key
  5. Event is sent containing:
     • Encrypted payload (requires visual_key to decrypt)
     • GPS coordinates (where to go)
     • Approximate compass heading (which direction to face)
     • Approximate distance to subject (how far away)
     • Fuzzy commitment (error correction data — does NOT reveal the fingerprint)
     • Expiry (optional — the dead drop self-destructs after this time)
     • Hint text (optional — human-readable clue about the scene)

Recipient flow:

  1. Recipient receives the event, sees location hint + coordinates + heading
  2. Travels to the indicated GPS location
  3. Points device camera in the indicated direction
  4. App captures visual fingerprint in real time
  5. Applies fuzzy commitment error correction
  6. If scene matches within tolerance → visual_key reconstructed → payload decrypts
  7. If scene does not match → decryption fails — recipient is not at the right place

What the visual dead drop protects against:

| Attack | Protection |
| --- | --- |
| Remote decryption (attacker has the event but not the location) | Cannot reconstruct visual_key without being physically present |
| Photo-of-photo attack (attacker replays a photograph of the location to the camera) | Re-photographing a print or screen is expected to push the perceptual hash beyond the tolerance threshold |
| GPS spoofing alone | GPS is one of three inputs; visual match is required independently |
| Correct location, wrong direction | Compass heading is part of key material |
| Scene photograph from satellite/street view | Resolution and perspective differences cause hash mismatch |

Limitations:

  • Scene destruction: If the scene changes dramatically (building demolished, mural painted over), the key is permanently lost. Mitigate with fallback transport or key escrow.
  • Adversarial high-fidelity reproduction: A sophisticated attacker with a photorealistic display positioned at the correct location could potentially reconstruct the hash. Combined with GPS + UWB, this becomes prohibitively difficult.
  • Live camera requirement: Implementations MUST prevent gallery image imports from being used in place of live camera capture. Motion sensor and GPS activity should be confirmed during capture to resist static image attacks.

8.5 — Applications

Scavenger hunts and location-based games: The visual dead drop was conceived as the foundation for cryptographically authentic scavenger hunts. Clues that can only be read at specific physical locations. Prizes that unlock only when you've physically visited the right place and seen the right thing.

Secure physical document delivery: Legal documents, contracts, or sensitive data that can only be accessed from a specific physical location — a courthouse, a notary office, a secure facility. The document is location-gated not by policy but by cryptography.

Proof of physical presence: A journalist, inspector, or auditor can prove they were physically at a specific location by capturing a visual_key event signed by their identity key. The event is tamper-evident and location-bound — it cannot be fabricated without physical presence.

IoT commissioning via visual fingerprint: Point your phone at a device to commission it. The visual fingerprint of the device's physical installation (its label, housing, and surrounding environment) IS the bootstrap credential. Discussed further in §9.

Location-gated access control: A server room that grants access only to keys whose holders are photographing the correct wall from the correct angle. Walking away from the camera angle revokes access.


9. Cryptographic Physical Context Binding

9.1 — Concept

Physical context binding is the cryptographic attachment of a device's communication capability to its physical installation. A bound device:

  • Can only communicate when it is at the correct location (GPS)
  • Can only communicate when it is facing the correct direction (compass/gyroscope)
  • Can only communicate when its camera sees the correct scene (visual fingerprint)
  • Optionally: can only communicate when it is within the correct UWB range of an anchor device (centimeter-accurate)

If any of these physical context parameters change beyond defined tolerances, the device loses the ability to reconstruct its communication key and goes cryptographically silent — not locked out by a server, not disabled by a flag, but literally unable to reconstruct the key material needed to sign or decrypt events.

This creates a new security primitive: a device that is cryptographically inseparable from its physical installation.

9.2 — Key Derivation for Physical Context Binding

physical_context_key = HKDF(
  root_secret,
  info = concat(
    "xp.discovery.physical_bind",
    geohash(lat, lon, precision),      // GPS location
    quantize(compass_heading, 5°),     // Orientation (5° buckets)
    visual_fingerprint,                // Scene fingerprint
    uwb_anchor_key_fingerprint         // UWB anchor (optional)
  )
)

The device derives this key at commissioning time. The key is then used to encrypt the device's X25519 private key at rest:

stored_x25519_private_key = encrypt(x25519_private_key, physical_context_key)

At every communication event, the device:

  1. Reads current GPS position, compass heading, camera frame
  2. Derives physical_context_key from current physical context
  3. Attempts to decrypt x25519_private_key using derived key
  4. If decryption succeeds → device can sign and encrypt events → operates normally
  5. If decryption fails → device cannot reconstruct private key → goes silent

No server is involved. No heartbeat check. The physics of the device's situation IS its authentication, enforced locally and continuously.
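The derivation in this section can be sketched with HKDF-SHA256 (RFC 5869) built from the Python standard library. Several details here are assumptions made for illustration: the length-prefixed encoding used for concat, the 5° floor bucketing used for quantize, the example geohash string, and the requirement that visual_fp is the error-corrected fingerprint from §8.3 (a raw capture would almost never reproduce the same bytes).

```python
import hashlib
import hmac

def hkdf_sha256(ikm, info, salt=b"", length=32):
    """HKDF extract-then-expand (RFC 5869) using HMAC-SHA256."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def heading_bucket(heading_deg, step=5):
    """Quantize a compass heading into step-degree buckets (assumed floor bucketing)."""
    return int(heading_deg % 360 // step)

def physical_context_key(root_secret, geohash, heading_deg, visual_fp, uwb_fp=b""):
    """Derive the binding key from the root secret and the current physical context."""
    parts = [
        b"xp.discovery.physical_bind",
        geohash.encode(),
        str(heading_bucket(heading_deg)).encode(),
        visual_fp,
        uwb_fp,                      # empty when no UWB anchor is used
    ]
    # Length-prefix each part so different splits cannot collide (an assumption;
    # the specification does not define how concat encodes its inputs).
    info = b"".join(len(p).to_bytes(2, "big") + p for p in parts)
    return hkdf_sha256(root_secret, info)

root = bytes(32)                     # placeholder root secret for the example
fp = bytes.fromhex("a5" * 32)        # placeholder 256-bit fingerprint
k_commissioned = physical_context_key(root, "9q8yyk8", 92.0, fp)
k_same_bucket = physical_context_key(root, "9q8yyk8", 94.0, fp)
k_rotated = physical_context_key(root, "9q8yyk8", 96.0, fp)
print(k_commissioned == k_same_bucket)   # same 5-degree bucket
print(k_commissioned == k_rotated)       # different bucket, different key
```

Hard bucket edges mean two headings a fraction of a degree apart can land in different buckets, so an implementation would need to pair this with the tolerance handling described in §9.5.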

9.3 — Commissioning Flow

Factory / Pre-deployment:
  Device generates Ed25519 + X25519 keypairs
  Device public keys registered with owner's key hierarchy
  Device shipped — not yet active (no physical_context_key yet)

Installation:
  Technician opens XProtocol app
  Technician installs device at target location
  Technician points phone camera at device's intended field of view
  App captures: GPS position, compass heading, visual fingerprint
  App optionally places UWB anchor device
  App sends xp.discovery.physical_bind event to device:
    - signed by technician's authorized key
    - contains physical context parameters + tolerances
  Device captures its own GPS, compass, and camera frame
  Device verifies its captured context matches technician's within tolerance
  Device derives physical_context_key from captured context
  Device encrypts X25519 private key with physical_context_key
  Device confirms commissioning — now active and physically bound

Operation:
  Every N seconds (configurable — default 60s), device re-derives key
  If context within tolerance → continues operating normally
  If context drifts toward tolerance boundary → emits xp.iot.context_warning
  If context exceeds security threshold → goes silent
  If device is moved completely → permanently silent until recommissioned

9.4 — Security Properties

Tamper detection without tamper hardware: Traditional tamper-evident IoT uses physical seals, accelerometers, or dedicated tamper chips. These can be defeated with careful physical access. A physically-bound device cannot be moved, reoriented, or have its camera covered without immediately losing communication ability — and the loss is cryptographic, not just a flag that can be reset.

Anti-theft by design: A stolen security camera that is physically bound to its installation point cannot be re-used elsewhere. It is not deactivated by a server command — it is physically incapable of functioning anywhere else because the physical context key that encrypted its private key does not exist at any other location.

Supply chain integrity: A device commissioned at a factory with a visual binding can verify it arrived at the correct installation location before activating. If intercepted and installed somewhere else, it never activates.

Camera occlusion as automatic lockout: Taping paper over the camera changes the visual fingerprint dramatically. The device goes silent. Any attempt to blind the camera is treated identically to moving it. This is not a policy decision — it is a cryptographic consequence.

The attack surface is physical, not digital: To compromise a physically-bound device, an attacker must physically recreate its original environment — same location, same orientation, same scene in the camera. For a device in a server room pointing at a specific wall, this is essentially impossible to fake remotely.

9.5 — Tolerance and Drift Management

Tolerance parameters (set at commissioning):

| Parameter | Tight | Medium | Loose |
| --- | --- | --- | --- |
| GPS radius | 1m | 5m | 20m |
| Heading tolerance | ±2° | ±5° | ±15° |
| Visual hash distance | 8 bits | 16 bits | 32 bits |
| UWB range | ±10cm | ±50cm | ±2m |

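Gradual drift handling: A device on a rooftop may shift gradually over months due to thermal expansion, wind, or settling. Its measured context (position, heading, and visual hash distance from the commissioned fingerprint) then drifts gradually rather than jumping suddenly; the key itself does not drift, it simply stops being derivable once the context leaves tolerance. The device tracks hash drift over time and emits xp.iot.context_warning as it approaches the tolerance boundary, so operators are alerted before the device goes silent rather than after.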
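The warning behaviour can be sketched as a simple classification of the measured visual hash distance against the commissioned tolerance. The tolerance values come from the table above; the warn_fraction threshold (warn at 75% of tolerance) and the function name are assumptions for illustration.

```python
# Visual hash distance tolerances from the table above, in bits.
VISUAL_TOLERANCE_BITS = {"tight": 8, "medium": 16, "loose": 32}

def classify_drift(distance_bits, tolerance_bits, warn_fraction=0.75):
    """Map a measured hash distance to 'ok', 'warning', or 'silent'."""
    if distance_bits > tolerance_bits:
        return "silent"        # key can no longer be reconstructed
    if distance_bits >= warn_fraction * tolerance_bits:
        return "warning"       # emit xp.iot.context_warning
    return "ok"

tolerance = VISUAL_TOLERANCE_BITS["medium"]
for distance in (4, 12, 16, 17):
    print(distance, classify_drift(distance, tolerance))
```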

Intentional repositioning — recommission flow: When a device legitimately needs to be repositioned:

  1. Authorized key sends signed xp.discovery.recommission event
  2. Recommission event specifies a time window during which binding is suspended
  3. Device temporarily accepts a new physical context capture
  4. New physical_context_key derived and stored
  5. Recommission event logged immutably in Graph Store — every repositioning is permanently auditable

Multi-factor binding: For highest-security installations, combine all four binding factors: GPS + compass + visual + UWB. Moving the device even 10cm out of UWB anchor range silences it instantly, independent of GPS and visual checks. This provides four independent physical tamper detection mechanisms.

9.6 — Applications

Security cameras: A camera that goes offline if physically moved or reoriented. Cannot be stolen and redeployed without full recommissioning by an authorized key.

Access control readers: A door reader that silences itself if detached from the door. Cannot be cloned and placed elsewhere to capture credentials.

Industrial sensors: A sensor on critical infrastructure that proves it is still in its intended position. Any tampering is immediately detectable and stops the device from reporting false readings.

Server hardware: A server that verifies it is still in its designated rack position before processing sensitive operations. Physically relocating the server silences it.

Autonomous vehicles and drones: A vehicle that verifies its physical context against expected parameters before executing sensitive operations. Cannot be spoofed into thinking it is somewhere it is not.


10. Physical Presence Trust Tier Model

⚠️ The trust tier model itself — specifically the use of physical transport type as a cryptographic trust assertion embedded in pairing events and capability policies — is a novel open primitive contributed to the public record under CC BY 4.0.

Different transports provide different levels of assurance that the connecting device is physically where it claims to be. This assurance level is a first-class property of the discovery system.

10.1 — Trust Tiers

TIER 5 — Cryptographic physical context (highest)
  Physical context binding (§9)
  Requires: correct GPS + compass + visual scene + optional UWB
  Assurance: device cannot operate unless all physical parameters match
  Use cases: high-security IoT, tamper-evident sensors

TIER 4 — Precise spatial presence
  UWB ranging
  Requires: within N centimeters of anchor device
  Assurance: cm-accurate, relay-attack-resistant
  Use cases: access control, proximity-gated capabilities

TIER 3 — Visual + spatial presence
  Visual place recognition + GPS + compass (§8)
  Requires: at correct location, facing correct direction, seeing correct scene
  Assurance: must physically be at location with correct orientation
  Use cases: visual dead drops, location-gated documents, proof of presence

TIER 2 — Physical proximity
  NFC, QR code
  Requires: physical contact (NFC) or visual access to display (QR)
  Assurance: was physically present at device or display
  Use cases: device pairing, service onboarding, first contact

TIER 1 — Local presence
  BLE, WiFi-Direct
  Requires: within ~10-100 meters
  Assurance: probably in the same building or space
  Use cases: automatic reconnection, local service discovery

TIER 0 — Network presence (no physical context)
  Relay, rendezvous, DHT, return_addr
  Requires: internet connectivity only
  Assurance: none — could be anywhere
  Use cases: general communication, remote access, async messaging

10.2 — Trust Tier as a Pairing Property

When two devices pair, the trust tier of the transport used is recorded in the pairing event:

{
  "kind": "xp.identity.pair",
  "payload": {
    "peer_key": "<ed25519_public_key>",
    "peer_x25519": "<x25519_public_key>",
    "established_via": "nfc",
    "trust_tier": 2,
    "physical_context": {
      "location": { "lat": 37.7749, "lon": -122.4194 },
      "timestamp": 1717000000
    }
  }
}

10.3 — Trust Tier in Capability Policies

Services and devices can require a minimum trust tier for specific capabilities:

{
  "capability": "vault.open",
  "requires_trust_tier": 4,
  "requires_transport": ["uwb"],
  "uwb_max_range_cm": 30
}

A key that was paired via relay (Tier 0) cannot exercise a Tier 4 capability — even if the key is otherwise authorized. The key holder must physically re-pair via a Tier 4 transport to gain access.
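A policy check along these lines might look like the following sketch. The tier numbers follow §10.1, but the transport identifiers other than "nfc" and "uwb" are hypothetical, as is the rule of capping a claimed tier at the tier of the transport actually used. The uwb_max_range_cm constraint is not evaluated here.

```python
# Tier per transport, following §10.1. Identifier strings are assumed.
TRANSPORT_TIER = {
    "physical_bind": 5,
    "uwb": 4,
    "visual_place": 3,
    "nfc": 2, "qr": 2,
    "ble": 1, "wifi_direct": 1,
    "relay": 0, "rendezvous": 0, "dht": 0, "return_addr": 0,
}

def may_exercise(policy, pairing):
    """True if the pairing's transport and trust tier satisfy the capability policy."""
    via = pairing["established_via"]
    # Never trust a claimed tier above what the transport can provide.
    effective_tier = min(pairing["trust_tier"], TRANSPORT_TIER.get(via, 0))
    if effective_tier < policy.get("requires_trust_tier", 0):
        return False
    allowed = policy.get("requires_transport")
    if allowed is not None and via not in allowed:
        return False
    return True

policy = {
    "capability": "vault.open",
    "requires_trust_tier": 4,
    "requires_transport": ["uwb"],
    "uwb_max_range_cm": 30,
}
print(may_exercise(policy, {"established_via": "relay", "trust_tier": 0}))
print(may_exercise(policy, {"established_via": "nfc", "trust_tier": 2}))
print(may_exercise(policy, {"established_via": "uwb", "trust_tier": 4}))
```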

10.4 — Integration with the Graph Store

Physical presence pairing events automatically generate Graph Store annotations recording the trust tier of the relationship. This makes trust tiers queryable just like any other annotation — enabling capability policies, audit queries, and trust upgrade flows built on top of the standard xp.store.query interface.

When a pairing is established via any physical presence transport, the relay (if it is also a Graph Store) writes:

{
  "kind": "xp.graph.annotate",
  "payload": {
    "target_event_id": "<pairing_event_id>",
    "operation": "merge",
    "annotations": {
      "physical": {
        "trust_tier": 2,
        "transport": "nfc",
        "established_at": 1717000000,
        "location": {
          "lat": 37.7749,
          "lon": -122.4194
        }
      }
    }
  }
}

Capability policies can then query trust tier directly:

{
  "kind": "xp.store.query",
  "payload": {
    "filter": {
      "and": [
        { "field": "annotations.physical.trust_tier", "op": "gte", "value": 3 },
        { "field": "sender", "op": "eq", "value": "<key_fingerprint>" }
      ]
    }
  }
}

This closes the loop between the discovery layer and the graph layer: physical presence proof becomes a queryable, auditable, signed data point — not just a session-level assertion.
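To make the filter semantics concrete, here is a minimal sketch of how a store might evaluate the filter above against a single record. The record layout (a top-level sender field and an annotations object) and the helper names are assumptions; the authoritative xp.store.query semantics are defined in the core specification.

```python
def lookup(record, dotted_path):
    """Resolve a dotted field path such as 'annotations.physical.trust_tier'."""
    current = record
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

# Only the two operators used in the example query are sketched here.
OPS = {
    "eq": lambda actual, expected: actual == expected,
    "gte": lambda actual, expected: actual is not None and actual >= expected,
}

def matches(record, flt):
    """Evaluate an 'and' group or a single field/op/value condition."""
    if "and" in flt:
        return all(matches(record, sub) for sub in flt["and"])
    return OPS[flt["op"]](lookup(record, flt["field"]), flt["value"])

query_filter = {
    "and": [
        {"field": "annotations.physical.trust_tier", "op": "gte", "value": 3},
        {"field": "sender", "op": "eq", "value": "key-abc"},
    ]
}
record = {"sender": "key-abc", "annotations": {"physical": {"trust_tier": 4}}}
print(matches(record, query_filter))
```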


11. Proximity Broadcast Pairing

11.1 — Concept

All transports defined in sections 4–10 assume you either already know the other party's public key (reconnection) or are physically exchanging keys (NFC, QR, UWB). Proximity Broadcast Pairing solves a different and more interesting problem: anonymous discovery between strangers who share compatible interests, within physical proximity, with no server involvement and no identity disclosure until both parties explicitly consent.

This is distinct from the BLE reconnection transport (§5.1) in a fundamental way:

| BLE Reconnection | Proximity Broadcast Pairing |
| --- | --- |
| Known peers only | Complete strangers |
| Key fingerprint lookup | Predicate-based profile matching |
| Reconnects existing relationship | Establishes new relationship |
| Identity always known | Identity gated behind mutual consent |
| No server | No server |

The primitive has obvious consumer applications (proximity-based social discovery) but the underlying mechanism — predicate-encrypted broadcast with consent-gated identity exchange — is general-purpose infrastructure applicable across many domains.

11.2 — Beacon Structure

A proximity beacon carries two cryptographically distinct layers:

Layer 1 — Public envelope (anyone in range can read):

  • A fresh ephemeral session key (not the identity key — rotated per broadcast session)
  • A seeking commitment (cryptographic hash of what the sender is looking for)
  • Encrypted seeking criteria (what the sender wants — readable by the recipient's app to evaluate match)
  • App namespace (scopes the beacon to a specific application context)
  • Beacon TTL (how long this broadcast is valid)
  • Signature from the session key (proves the beacon is internally consistent)

Layer 2 — Predicate-gated content (decryptable only if you match):

  • Profile content encrypted to a key derivable only by someone whose attributes satisfy the sender's seeking criteria
  • Response capability (how to reach the sender if interested)

{
  "kind": "xp.proximity.beacon",
  "payload": {
    "namespace": "xp.proximity.dating",
    "session_key": "<ephemeral_ed25519_public_key>",
    "session_x25519": "<ephemeral_x25519_public_key>",
    "seeking_commitment": "<sha256_of_canonical_criteria>",
    "seeking_criteria": "<criteria_encrypted_to_session_x25519>",
    "profile_encrypted": "<profile_encrypted_to_criteria_key>",
    "beacon_ttl_seconds": 300,
    "response_hint": "<how_to_respond_encrypted_to_criteria_key>",
    "signed_at": 1717000000,
    "sig": "<session_key_signature_of_above>"
  }
}

11.3 — Predicate Key Derivation

The criteria key — the key that gates access to the profile — is derived from the seeking criteria in a way that allows any receiver whose profile satisfies the criteria to derive the same key independently:

criteria_key = HKDF(
  session_x25519_secret,
  info = "xp.proximity.criteria." + canonical_encode(seeking_criteria),
  salt = beacon_nonce
)

profile_ciphertext = encrypt(profile_content, criteria_key)

Receiver evaluation:

1. Receiver reads seeking_criteria from beacon (decryptable by anyone —
   criteria themselves are public so receivers can self-evaluate)

2. Receiver's app locally evaluates: does my profile satisfy this criteria?
   (This evaluation happens entirely on the receiver's device — no server)

3. If YES:
   a. Receiver derives the same criteria_key using the same HKDF inputs
   b. Receiver decrypts profile_ciphertext → sees sender's profile
   c. Receiver decides whether to respond

4. If NO:
   a. Receiver cannot derive criteria_key
   b. profile_ciphertext is meaningless ciphertext
   c. Receiver ignores the beacon

Important: The seeking criteria themselves are readable by anyone in range — they are not secret. "Looking for: 28-38, hiking enthusiast, within 5km" is public. Only the profile content (photos, bio, contact capability) is gated behind the criteria key. This is intentional — receivers need to read the criteria to evaluate whether they match.
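The local steps a receiver performs (checking the seeking commitment and self-evaluating against the criteria) can be sketched as follows. This section does not define canonical_encode, so the sketch assumes sorted-key compact JSON; the criteria and profile field names are likewise hypothetical. Key derivation is deliberately left out of the example.

```python
import hashlib
import json

def canonical_encode(criteria):
    """Assumed canonical form: UTF-8 JSON with sorted keys and no extra whitespace."""
    return json.dumps(criteria, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")

def seeking_commitment(criteria):
    """SHA-256 over the canonical criteria, as carried in seeking_commitment."""
    return hashlib.sha256(canonical_encode(criteria)).hexdigest()

def satisfies(profile, criteria):
    """Local self-evaluation: does my profile meet the sender's criteria?"""
    low, high = criteria["age_range"]
    return (low <= profile["age"] <= high
            and set(criteria["interests"]) <= set(profile["interests"]))

criteria = {"age_range": [28, 38], "interests": ["hiking"]}
reordered = {"interests": ["hiking"], "age_range": [28, 38]}

# Key order does not affect the commitment once the encoding is canonical.
print(seeking_commitment(criteria) == seeking_commitment(reordered))

my_profile = {"age": 31, "interests": ["hiking", "cooking"]}
print(satisfies(my_profile, criteria))
```

A receiver would compare the recomputed commitment with the value in the beacon before trusting the criteria it evaluates.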

11.4 — Four-Phase Consent Model

Proximity Broadcast Pairing uses a four-phase consent model that progressively reveals identity only as mutual interest is confirmed:

PHASE 1 — Beacon (one-way, anonymous sender)
─────────────────────────────────────────────
Sender broadcasts xp.proximity.beacon via BLE and/or WiFi
Receiver reads beacon, evaluates criteria locally
If match: receiver decrypts profile, views sender's public content
Sender does NOT know receiver was there

PHASE 2 — Interest signal (one-way, pseudonymous receiver)
────────────────────────────────────────────────────────────
Interested receiver sends xp.proximity.interest to the address carried
in the beacon's response_hint (pseudonymous: uses receiver's own
session key, not identity key)
Payload: receiver's profile encrypted to sender's criteria_key
         + receiver's own seeking criteria for sender to evaluate
         + receiver's response session key
Sender receives interest signal, evaluates receiver's criteria
If sender's profile matches receiver's criteria: mutual match

PHASE 3 — Mutual match (bidirectional, still pseudonymous)
────────────────────────────────────────────────────────────
Both parties have each other's profiles
Neither has the other's real identity key
Both independently view profiles and decide whether to proceed
Either party can walk away — no record of the interaction

PHASE 4 — Identity exchange (explicit, bilateral consent)
──────────────────────────────────────────────────────────
Either party sends xp.proximity.connect containing their real
Ed25519 identity key + X25519 key + preferred transport list
Signed by their session key (proves continuity with Phase 1/2)
The other party accepts (sends their identity keys back) or ignores
If accepted → full XProtocol pairing established (stored in Graph Store)
If ignored → session keys expire, interaction leaves no trace

Privacy properties of the four-phase model:

| Party | What they learn at each phase |
| --- | --- |
| Anyone in range | Sender's seeking criteria (Phase 1) |
| Matching receiver | Sender's profile content (Phase 1) |
| Sender | That someone responded + receiver's profile (Phase 2) |
| Both parties | Each other's profiles, not identity keys (Phase 3) |
| Both parties (if connected) | Each other's real identity keys (Phase 4) |

A receiver who decrypts the profile and chooses not to respond is completely invisible to the sender — no record, no trace, no notification.

11.5 — Session Key Rotation

Beacon session keys rotate on a configurable schedule (default: every 5 minutes). This prevents:

  • Tracking: An observer cannot correlate beacons from the same person over time — each rotation produces a completely different session key
  • Replay attacks: Old beacons expire with their session keys
  • Persistent surveillance: Recording beacons for later analysis yields no useful identity information after session key rotation

The rotation schedule is a privacy parameter the user controls:

  • Faster rotation = stronger anti-tracking, harder to pair (receiver must respond within the rotation window)
  • Slower rotation = easier pairing, weaker anti-tracking
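A receiver has to decide whether a beacon is still valid and how long it has left to respond. The sketch below uses the signed_at and beacon_ttl_seconds fields from the beacon in §11.2; the function names are hypothetical, and clock skew between devices is not handled.

```python
def beacon_is_live(beacon, now):
    """True while now falls inside [signed_at, signed_at + beacon_ttl_seconds)."""
    payload = beacon["payload"]
    start = payload["signed_at"]
    return start <= now < start + payload["beacon_ttl_seconds"]

def seconds_left_to_respond(beacon, now):
    """Remaining response window in seconds, or 0 if the beacon is not live."""
    if not beacon_is_live(beacon, now):
        return 0
    payload = beacon["payload"]
    return payload["signed_at"] + payload["beacon_ttl_seconds"] - now

beacon = {
    "kind": "xp.proximity.beacon",
    "payload": {"signed_at": 1717000000, "beacon_ttl_seconds": 300},
}
print(beacon_is_live(beacon, 1717000120))
print(seconds_left_to_respond(beacon, 1717000120))
print(beacon_is_live(beacon, 1717000300))
```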

11.6 — Beacon Transport Mechanisms

Proximity beacons are broadcast via two complementary channels:

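BLE Advertising: The beacon is announced in BLE advertisement data and served from the GATT characteristic of an XProtocol proximity service. Legacy advertisements carry at most 31 bytes of payload, so a full beacon generally needs Bluetooth 5 extended advertising or a GATT read. Range: ~10-100 meters depending on hardware and environment. Works through walls. Lower bandwidth.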

WiFi Beacon / Action Frames: On supported hardware, the beacon payload is embedded in WiFi management frames (action frames or probe responses). Range: ~50-150 meters. Higher bandwidth — allows larger profile payloads. Requires WiFi hardware in promiscuous/monitor mode (platform-dependent capability).

Combined broadcast: Both channels simultaneously for maximum coverage. Receivers with BLE-only see the BLE beacon. Receivers with WiFi capability see the WiFi beacon (higher-quality profile data). The same session key is used for both — they are the same beacon on different physical channels.

11.7 — Namespace System

The namespace field in the beacon scopes the broadcast to a specific application context. Devices only process beacons in namespaces their app supports. This prevents dating app beacons from cluttering IoT device discovery and vice versa.

Standard namespaces:

| Namespace | Application |
| --- | --- |
| xp.proximity.social | General social discovery — meet people nearby |
| xp.proximity.dating | Romantic / relationship discovery |
| xp.proximity.professional | Career networking at conferences, events |
| xp.proximity.gaming | Local multiplayer game discovery |
| xp.proximity.trade | Hyperlocal marketplace — buy/sell nearby |
| xp.proximity.emergency | Skills and resources in emergency situations |
| xp.proximity.community | Neighborhood groups, local interests |
| xp.proximity.mesh | Infrastructure-free mesh communication |
| xp.proximity.iot | IoT device discovery (§5.1 extended) |
| xp.proximity.* | Open extension — any app defines its own namespace |

Custom namespaces follow the DNS-ownership convention: xp.proximity.com.myapp.myfeature

11.8 — Application Domains

Dating and Relationship Discovery

The application that inspired this primitive. A user builds two separate profiles:

  • Who I am: Photos, bio, interests, age, preferences — the content encrypted behind the criteria key
  • Who I'm looking for: Age range, interests, distance, intentions — the seeking criteria broadcast in the public envelope

When two users are within BLE/WiFi range and each satisfies the other's criteria, both see each other's profiles. The app surfaces the match notification: "Someone nearby matches what you're looking for." Neither user knows who the other is until both choose to connect.

Key advantages over existing dating apps:

  • No server stores or matches profiles — privacy by construction
  • No server-side profile verification needed: profiles are signed by their owners' keys
  • No location tracking by the platform — proximity is local only
  • No subscription required to see who likes you — matching is local
  • Works completely offline (BLE/WiFi only) — no internet required
  • Identity gated behind mutual consent — no unsolicited contact

Professional Networking at Events

At a conference, trade show, or meetup — broadcast your professional profile and what you're looking to discuss or collaborate on. Walk into a session room and your app silently surfaces the two people in the room who are working on problems you care about or have skills you're looking for.

Seeking criteria examples:

  • "Flutter developer looking for: AI/ML engineers, product managers, fintech experience"
  • "VC investor looking for: seed-stage founders, healthcare or climate verticals, technical background"
  • "Job seeker looking for: hiring managers, senior engineers willing to refer, remote-friendly companies"

No badge scanning. No LinkedIn QR hunting. No app-specific platform lock-in — because it's a protocol, any conference app can implement it and they all interoperate.

Local Multiplayer Game Discovery

A game broadcasts an invitation beacon with the game type, skill level, and number of players needed as the seeking criteria. Nearby devices running compatible games see the invitation, evaluate whether they want to join, and connect — all without a game server, without internet, and without knowing each other in advance.

Works in airplane mode. Works when the game server is down. Works at a LAN party. Works anywhere two devices are within WiFi Direct range.

Hyperlocal Marketplace

Broadcast what you're selling or what you're looking to buy as proximity beacons. Walk through your neighborhood and your app surfaces people within 100 meters who have what you want or want what you have.

The predicate encryption model means sellers only reveal their inventory to buyers who are looking for it. A seller broadcasting "vintage camera equipment" does not reveal their profile to someone looking for furniture. Profile content (condition photos, price, contact details) is only decryptable by a receiver whose "looking for" criteria matches the seller's goods.

This is Craigslist but: proximity-native, no server, identity-authenticated, privacy-preserving, and functional without internet infrastructure.

Emergency Response and Disaster Relief

In a disaster scenario — earthquake, flood, mass casualty event — normal infrastructure may be down. First responders and community members need to find each other based on skills and needs, not based on prior relationships.

Emergency beacons broadcast skills and resources:

  • "Trauma surgeon, available, within 200m"
  • "Structural engineer, can assess building safety"
  • "Generator + fuel, can share power"
  • "Need: insulin, Type 1 diabetic, urgent"
  • "Need: translation, Mandarin speaker"

Receiving devices evaluate the broadcast criteria against their own situation and surface relevant matches. No internet required — pure BLE/WiFi mesh. No central coordination — distributed matching across all devices in range.

The emergency namespace uses relaxed privacy defaults — in emergencies, faster matching matters more than anti-tracking. Profile identity is revealed at Phase 1 rather than Phase 4 when the emergency flag is set.

Community Mesh During Infrastructure Failures

During prolonged outages (natural disasters, power grid failures, civil unrest), proximity broadcast pairing enables community mesh formation:

Devices that have paired via proximity automatically form a BLE/WiFi mesh network. Events are relayed device-to-device through the mesh — no internet required. The mesh routing uses the same XProtocol event model: events addressed to public keys, relayed by any mesh node that receives them.
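The specification does not state which routing strategy the mesh uses. The sketch below assumes the simplest option, flooding with duplicate suppression and a hop limit, to show how an event addressed to a public key can be relayed by any node that receives it. `MeshNode`, `link`, and the event fields `id` and `to` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class MeshNode:
    pubkey: str
    neighbors: list = field(default_factory=list, repr=False, compare=False)
    seen: set = field(default_factory=set, repr=False, compare=False)
    inbox: list = field(default_factory=list, repr=False, compare=False)

    def receive(self, event: dict, hops_left: int) -> None:
        """Deliver the event if it is addressed to this node, else relay it."""
        if event["id"] in self.seen:
            return                      # already handled: suppress duplicates
        self.seen.add(event["id"])
        if event["to"] == self.pubkey:
            self.inbox.append(event)    # addressed to our key: deliver locally
            return
        if hops_left > 0:
            for neighbor in self.neighbors:
                neighbor.receive(event, hops_left - 1)


def link(a: MeshNode, b: MeshNode) -> None:
    """Record a bidirectional proximity pairing between two nodes."""
    a.neighbors.append(b)
    b.neighbors.append(a)
```

A sender originates an event by calling `receive` on its own node. Signature verification of relayed events is omitted here and would be required in practice.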

Community members can:

  • Send messages to anyone in the mesh (range: as far as the mesh extends)
  • Broadcast resource availability and needs
  • Coordinate without any central infrastructure
  • Authenticate all communications via their identity keys (no fake messages in the mesh)

This is ham radio for the smartphone era — decentralized, resilient, cryptographically authenticated.

Skills and Talent Discovery

At a hackathon, coworking space, or maker event — broadcast what skills you have and what skills you need. Teams form organically based on complementary capabilities without a matchmaking platform.

Seeking criteria: "Need: iOS developer, 3+ years, available for 48h"

Profile content: "iOS developer, 7 years, available this weekend, portfolio URL, contact capability"

The match happens locally. The team forms. No platform takes a cut. No data goes to a server. The event organizer doesn't need to build a team-formation feature — the protocol provides it.

Anonymous Civic and Community Organizing

Community members can broadcast interest in local issues, mutual aid, or civic activities without revealing identity until they choose to:

  • "Interested in: neighborhood watch discussion, block [redacted]"
  • "Offering: childcare swap, local parents group"
  • "Looking for: carpool partners, [destination] commute"

Identity is gated behind Phase 4 consent. Community members can evaluate interest and compatibility before revealing who they are. Organizers cannot build surveillance lists of attendees — because there is no server and no persistent identity in the beacon.

11.9 — Security Considerations

Criteria honesty: A receiver can lie about their profile to satisfy the sender's criteria and decrypt the profile. The protocol cannot prevent dishonest self-evaluation; its matching guarantees hold only against an honest-but-curious attacker who evaluates criteria truthfully. Mitigations:

  • Verifiable credentials: profile attributes can be signed by trusted third parties (age verification services, professional credential issuers) and included in the profile
  • Reputation: after identity exchange (Phase 4), the receiver's real key is known, so reputation systems can track dishonest actors
  • Consequence: a dishonest receiver sees a profile they weren't supposed to see, but cannot contact the sender without revealing their identity

Beacon harvesting: An attacker in range can collect beacons passively. Session key rotation (§11.5) limits the value of harvested beacons — they expire quickly. Profile content is still gated behind criteria keys — the attacker only harvests public envelopes (seeking criteria), which are intentionally public.

Fake beacons: An attacker can broadcast fake beacons. The session key signature proves internal consistency (the beacon wasn't tampered with) but does not prove the profile content is honest. Same mitigation as criteria honesty: verifiable credentials and post-connection reputation.

Physical surveillance: An observer can note that a device is broadcasting beacons and infer the person is socially active. Session key rotation prevents identity correlation across sessions. The beacon itself reveals only the seeking criteria — not the broadcaster's identity. This is the same privacy model as being visible in public while your face is unknown.

11.10 — Conformance

xp.proximity.* is an OPTIONAL extension. Implementations that support proximity broadcast pairing MUST:

  1. Implement session key rotation with a minimum rotation interval of 60 seconds and a maximum of 3600 seconds (user-configurable)
  2. Never use the identity Ed25519 key as the session key
  3. Support the four-phase consent flow. Implementations MUST NOT reveal identity keys before Phase 4, except in the emergency namespace when the emergency flag is set (§11.8)
  4. Declare supported namespaces in their xp.endpoint.announce event
  5. Honor beacon TTL — expired beacons MUST be ignored
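Requirements 1 and 5 reduce to two small checks, sketched below. The bounds come from the list above; the function names and the way a beacon's receive time and TTL are passed in are assumptions, since the beacon field layout is defined elsewhere in §11.

```python
MIN_ROTATION_S = 60      # minimum session key rotation interval (requirement 1)
MAX_ROTATION_S = 3600    # maximum session key rotation interval (requirement 1)


def clamp_rotation_interval(requested_s: int) -> int:
    """Force a user-configured rotation interval into the conformant range."""
    return max(MIN_ROTATION_S, min(MAX_ROTATION_S, requested_s))


def beacon_is_live(received_at: int, ttl_s: int, now: int) -> bool:
    """Expired beacons MUST be ignored (requirement 5)."""
    return 0 <= now - received_at < ttl_s
```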


12. Audio as a Cryptographic and Discovery Primitive

12.1 — Two Distinct Audio Primitives

Audio contributes two fundamentally different capabilities to the XProtocol discovery stack:

Audio Channel Transport (xp.discovery.audio_channel): Sound as a physical data transmission medium — key material encoded in audio signals (audible or ultrasonic) and exchanged between devices via microphone and speaker. Analogous to QR/NFC but using sound rather than light or radio.

Audio Fingerprint Key Material (xp.discovery.audio_fingerprint): Audio content itself — a song, ambient soundscape, spoken phrase, musical pattern, or rhythmic sequence — as cryptographic key material. The sound IS the key. Analogous to visual place recognition but using acoustic fingerprinting rather than visual perceptual hashing.

These two primitives are independent and complementary. Audio channel transport carries key data through sound. Audio fingerprint makes the content of the sound the key itself.

12.2 — Audio Channel Transport

How it works: One device encodes XProtocol bootstrap data (key material, transport list, nonce, signature) as audio signals and plays them through its speaker. Nearby devices receive the signal through their microphone and decode the key exchange data.

Frequency variants:

Audible (20Hz–18kHz): Data encoded in frequency patterns within the human-audible range — similar to a modem handshake or DTMF tones, but using modern acoustic modulation (chirp spread spectrum or FSK). Range: ~10-20 meters in a quiet environment. Both humans and devices hear the pairing signal. This audibility is a feature for some use cases — the pairing event is witnessed by everyone present.

Ultrasonic (18kHz–22kHz): Data encoded above the threshold of human hearing. Inaudible to humans, receivable by most smartphone microphones. Range: ~1-5 meters. Silent, invisible pairing between devices in a shared space. This is the mechanism used by some existing proximity payment and loyalty systems — XProtocol uses it for cryptographic key exchange.
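To make the ultrasonic variant concrete, here is a minimal binary FSK sketch in pure Python: each bit becomes a short tone burst, and the receiver compares tone power at the two frequencies using the Goertzel algorithm. The sample rate, the two tone frequencies, and the 10 ms symbol length are assumptions chosen for illustration. The specification does not fix them, and a real implementation would add synchronization, framing, and error correction.

```python
import math

SAMPLE_RATE = 48_000          # Hz, assumed capture/playback rate
F0, F1 = 18_500.0, 19_500.0   # assumed tone frequencies for bit 0 and bit 1
SYMBOL_SAMPLES = 480          # 10 ms per bit at 48 kHz (assumed)


def fsk_modulate(bits):
    """Return float samples in [-1, 1], one continuous-phase tone burst per bit."""
    samples, phase = [], 0.0
    for bit in bits:
        step = 2 * math.pi * (F1 if bit else F0) / SAMPLE_RATE
        for _ in range(SYMBOL_SAMPLES):
            samples.append(math.sin(phase))
            phase += step
    return samples


def goertzel_power(chunk, freq):
    """Signal power at a single frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in chunk:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def fsk_demodulate(samples):
    """Decide each bit by comparing power at F1 against power at F0."""
    bits = []
    for start in range(0, len(samples) - SYMBOL_SAMPLES + 1, SYMBOL_SAMPLES):
        chunk = samples[start:start + SYMBOL_SAMPLES]
        bits.append(1 if goertzel_power(chunk, F1) > goertzel_power(chunk, F0) else 0)
    return bits
```

With these parameters each symbol contains a whole number of cycles of either tone (185 or 195), which keeps the two tones well separated in the power comparison.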

Infrasonic signaling (< 20Hz): Very low frequencies propagate through walls and barriers at ranges impractical for other local transports. Not suitable for data encoding (too low bandwidth) but useful as a presence signal — "a device is broadcasting XProtocol in this building" — detected by devices before they switch to a higher-bandwidth channel for the actual key exchange.

Acoustic distance ranging: Sound travels at roughly 343 m/s in air at room temperature. Two devices can measure the round-trip time of an audio exchange, subtract the responder's known processing delay, and calculate distance to within centimeters under good conditions. This is comparable to UWB ranging but uses only the speaker and microphone hardware already present in every smartphone. No UWB chip required.
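The ranging arithmetic is short enough to show directly. This sketch assumes the responder reports a fixed processing delay that the initiator subtracts before halving the round trip; both function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0   # approximate, dry air at about 20 °C


def distance_from_round_trip(rtt_s: float, responder_delay_s: float = 0.0) -> float:
    """One-way distance from a measured round trip and a known responder delay."""
    flight_s = rtt_s - responder_delay_s
    if flight_s < 0:
        raise ValueError("responder delay exceeds measured round-trip time")
    return SPEED_OF_SOUND_M_S * flight_s / 2


def range_resolution_m(sample_rate_hz: int) -> float:
    """Distance error caused by one sample of round-trip timing uncertainty."""
    return SPEED_OF_SOUND_M_S / (2 * sample_rate_hz)


print(distance_from_round_trip(0.0125, responder_delay_s=0.001))
print(range_resolution_m(48_000))
```

At a 48 kHz sample rate one sample of timing error corresponds to a few millimeters, so in practice the accuracy limit tends to come from detecting the signal onset and from temperature-dependent variation in the speed of sound rather than from the sampling itself.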

Directional audio via microphone arrays: Devices with multiple microphones (most modern smartphones) can determine the direction a sound originated from. Combined with acoustic distance ranging, a device can spatially address an event to "the device approximately 2 meters to my left" without UWB hardware.

One-to-many broadcast: A single device playing an audio signal can simultaneously initiate pairing with every device in range that is listening. NFC is one-to-one. QR is one-to-one. Audio broadcast is genuinely one-to-many — a room full of devices can all receive the same pairing signal simultaneously.

Trust level: Tier 2 — physical proximity required (must be within audio range). Audible variant adds witnessed social proof (everyone in the room hears the pairing).

12.3 — Audio Fingerprint Key Derivation

The core concept: Derive cryptographic key material from audio content using acoustic feature extraction — the same underlying technique used by audio recognition systems like Shazam — repurposed as a key derivation function with fuzzy commitment for environmental tolerance.

The audio fingerprinting pipeline:

Live audio capture (microphone)
      │
      ▼
Preprocessing
  - Normalize volume/gain to reference level
  - Strip silence at start/end (VAD — voice activity detection)
  - Bandpass filter (focus on stable frequency ranges 300Hz–8kHz)
  - Segment into overlapping frames (25ms frames, 10ms hop)
      │
      ▼
Feature extraction
  - Compute Short-Time Fourier Transform (STFT) per frame
  - Convert to Mel-frequency spectrogram
  - Extract MFCCs (Mel-Frequency Cepstral Coefficients) — 13 coefficients
  - Compute chromagram (12-bin pitch class distribution)
  - Extract spectral centroid, rolloff, and flux per frame
  - Aggregate features across all frames (mean + std per coefficient)
      │
      ▼
Fingerprint quantization
  - Reduce feature vector to N bits (default: 256 bits)
  - Apply locality-sensitive hashing for fuzzy matching support
  - This is the audio fingerprint
      │
      ▼
Key derivation
  audio_key = HKDF(
    audio_fingerprint,
    info = "xp.discovery.audio_fingerprint." + context_label,
    salt = optional_nonce
  )
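The key derivation step can be sketched with the Python standard library. The pipeline above names HKDF but not the underlying hash, so SHA-256 is an assumption here, as are the helper names `hkdf_sha256` and `derive_audio_key`. The HKDF construction itself follows RFC 5869 (extract, then expand).

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, salt: bytes = b"", length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    if length > 255 * 32:
        raise ValueError("requested output too long for HKDF-SHA256")
    if not salt:
        salt = bytes(32)                      # RFC 5869 default: HashLen zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_audio_key(fingerprint: bytes, context_label: str, nonce: bytes = b"") -> bytes:
    """audio_key = HKDF(fingerprint, info = prefix + context_label, salt = nonce)."""
    info = ("xp.discovery.audio_fingerprint." + context_label).encode("utf-8")
    return hkdf_sha256(fingerprint, info, salt=nonce)
```

Because the context label is part of `info`, the same fingerprint yields unrelated keys in different contexts.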

Fuzzy commitment for environmental tolerance:

Live audio captures of the same source differ due to microphone quality, room acoustics, background noise, and recording distance. The fuzzy commitment scheme (BCH or Reed-Solomon error correction) allows the recipient's fingerprint F' to recover the sender's fingerprint F if their Hamming distance is within tolerance D:

Sender:
  Captures audio → extracts fingerprint F
  Computes commitment C = fuzzy_commit(F, tolerance_D)
  Encrypts payload with audio_key = HKDF(F, ...)
  Sends: encrypted_payload + commitment C + context metadata

Recipient:
  Captures same audio → extracts fingerprint F'
  Applies error correction: F = fuzzy_recover(F', C)
  If Hamming_distance(F, F') ≤ D:
    Derives same audio_key → decryption succeeds
  Else:
    Wrong audio source or too much variation → decryption fails
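A toy version of the commit/recover flow is sketched below. It swaps in a 5x repetition code for the BCH or Reed-Solomon codes named above, purely to keep the example short, so its tolerance is "up to 2 flipped bits per 5-bit group" rather than a global Hamming distance D. The structure (random codeword, XOR helper data, hash check) follows the usual fuzzy commitment construction; the function names and the `(helper, tag)` layout of the commitment are assumptions.

```python
import hashlib
import hmac
import secrets

REPEAT = 5  # each secret bit is repeated 5 times; majority vote fixes 2 flips


def _encode(message_bits):
    return [bit for bit in message_bits for _ in range(REPEAT)]


def _decode(code_bits):
    groups = [code_bits[i:i + REPEAT] for i in range(0, len(code_bits), REPEAT)]
    return [1 if sum(group) > REPEAT // 2 else 0 for group in groups]


def _xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def _digest(bits):
    return hashlib.sha256(bytes(bits)).digest()


def fuzzy_commit(fingerprint):
    """Return (helper, tag): helper hides F behind a random codeword."""
    if len(fingerprint) % REPEAT:
        raise ValueError("fingerprint length must be a multiple of REPEAT")
    message = [secrets.randbelow(2) for _ in range(len(fingerprint) // REPEAT)]
    codeword = _encode(message)
    return _xor(fingerprint, codeword), _digest(codeword)


def fuzzy_recover(noisy_fingerprint, helper, tag):
    """Return the sender's fingerprint F, or None if F' is too far from F."""
    codeword = _encode(_decode(_xor(noisy_fingerprint, helper)))
    if not hmac.compare_digest(_digest(codeword), tag):
        return None
    return _xor(codeword, helper)
```

The recipient never compares F and F' directly: recovery succeeds only if error correction lands on the committed codeword, which the hash tag confirms.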

Tolerance parameters by audio context:

Audio Source | Tolerance | Notes
Studio recording, controlled playback | Low (8 bits) | Same file, calibrated playback
Broadcast audio (radio, streaming) | Low–Medium (12 bits) | Compression artifacts
Live performance, audience recording | Medium (20 bits) | Room acoustics, crowd noise
Ambient environment fingerprint | High (32 bits) | Variable background
Spoken passphrase, same speaker | Medium (16 bits) | Natural voice variation
Musical pattern, human performance | Medium (18 bits) | Tempo/pitch variation
Secret knock / rhythmic pattern | Low–Medium (12 bits) | Timing jitter
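When calibrating these tolerances against test recordings, it helps to measure the Hamming distance between two fingerprints directly. The sketch below is a diagnostic aid only, not a substitute for the fuzzy commitment scheme; the dictionary keys are hypothetical labels for the table rows, and the values are the bit counts from the table.

```python
TOLERANCE_BITS = {
    "studio_recording": 8,
    "broadcast_audio": 12,
    "live_performance": 20,
    "ambient_environment": 32,
    "spoken_passphrase": 16,
    "musical_pattern": 18,
    "secret_knock": 12,
}


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def within_tolerance(a: bytes, b: bytes, source: str) -> bool:
    """True when two captures differ by no more than the source's tolerance."""
    return hamming_distance(a, b) <= TOLERANCE_BITS[source]
```

For the default 256-bit fingerprint, the largest tolerance in the table (32 bits) corresponds to 12.5% of the bits differing.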

12.4 — Audio Fingerprint Applications

Song-Locked Content

Content encrypted with a key derived from a song's audio fingerprint. Anyone who plays that song through the app — from the same recording or a sufficiently similar one — can derive the key and decrypt the content. The song IS the password. It is not transmitted, not stored on a server, not exchangeable — it is heard.

Use cases: music-gated communities, artist-to-fan exclusive content, playlist-locked collaborative documents, album-unlocked bonus material.

Event-Exclusive Content and Proof of Physical Presence

Content encrypted to the audio fingerprint of a live performance — a specific song played at a specific concert, a keynote speech at a specific conference, a ceremony at a specific event. The live recording's unique acoustic fingerprint (room acoustics, crowd noise, slight tempo variations, specific reverberation characteristics of that venue) differs from any studio recording or recording from another event.

Only people physically present when that audio played can decrypt:

  • Concert-exclusive content: Behind-the-scenes footage, unreleased tracks, meet-and-greet invitations — decryptable only by fans who were in the room during the performance
  • Conference session notes: Encrypted to the audio fingerprint of the opening keynote — only actual attendees can access
  • Ceremony records: Wedding vows, graduation speeches, awards presentations — encrypted to the acoustic fingerprint of the moment
  • Sports event content: Match-day exclusive content decryptable only by fans who were in the stadium

The live recording's uniqueness makes this proof of physical presence — not just proof of knowing the song. A studio recording will not match a live concert fingerprint within tolerance because the acoustic environments are completely different.

Acoustic Dead Drop

The audio equivalent of the visual dead drop. Encrypted content retrievable only when a specific sound plays at a specific location and time:

  • A fountain that plays a specific pattern on the hour
  • A clock tower chime sequence
  • A street musician's recurring melody
  • A factory machine's characteristic operational sound
  • A bird call unique to a specific habitat

Sender flow:

  1. Sender records the target audio at the target location
  2. Extracts audio fingerprint → derives audio_key
  3. Encrypts payload with audio_key
  4. Sends event containing: encrypted payload + fuzzy commitment + location hint + time hint + hint text

Recipient flow:

  1. Receives event with hints
  2. Goes to location, waits for the described sound
  3. Records audio through the app
  4. Extracts fingerprint F' → applies fuzzy recovery → derives audio_key
  5. Payload decrypts if audio matches

Temporal specificity: The clock tower at noon produces a different fingerprint than at midnight (different chime count). A fountain's pattern may vary by season or operational schedule. Time is implicitly encoded in the acoustic content — adding a temporal dimension to the dead drop without requiring GPS or network time.

Spoken Passphrase as Biometric Key

A spoken phrase in a specific voice becomes key material. The MFCC features capture both the phonetic content (what is said) and the speaker characteristics (who says it). "Open sesame" said by Alice produces a different fingerprint than "Open sesame" said by Bob — the voiceprint is part of the key.

Applications:

  • Voice-keyed personal data vaults
  • Speaker-authenticated document signing
  • Voice-gated group channels (only members who have trained the passphrase model can participate)

Anti-spoofing via acoustic environment: Combine spoken passphrase with room acoustic fingerprint to defend against replay attacks using recorded voices. "Open sesame" said by Alice in the server room produces a different key than the same phrase played from a recording in a different acoustic environment — because the room's reverb characteristics are part of the fingerprint.

Ambient Environment as Location Fingerprint

The acoustic fingerprint of a physical space — its characteristic background frequencies, reverb profile, dominant sounds, and noise floor — is unique and stable enough to function as a location identifier. A server room sounds different from a coffee shop. A factory floor sounds different from a library.

Applications:

  • Location-gated documents: accessible only in the correct acoustic environment (an office, a secure facility, a specific room)
  • Environment-authenticated operations: "execute this command only when the device is in the server room", verified by ambient sound
  • Acoustic geofencing: complement GPS geofencing with acoustic verification for indoor environments where GPS is unreliable

Combination with visual binding: For maximum physical context assurance, combine ambient acoustic fingerprint with visual scene fingerprint and GPS. Three independent physical channels must all match — acoustic, visual, and spatial — before decryption succeeds.

Musical Pattern as Shared Secret

A musical phrase — a melody, a rhythm, a chord sequence, a traditional song — known to a group of people becomes a shared cryptographic secret without any digital key exchange. Anyone who knows and can produce the musical pattern can derive the key.

Applications:

  • Cultural cryptography: content accessible to communities who share musical heritage (folk songs, traditional melodies, regional musical forms)
  • Group authentication: a band, ensemble, or community derives a shared key from a piece of music they all know and can perform
  • Organizational secrets: a company's internal musical motif (a jingle, a theme) as an authentication mechanism for internal systems (memorable, non-digital, reproducible by any member)

Secret Knock as Authentication

A specific rhythmic pattern tapped on a surface — detected via accelerometer or microphone — generates a consistent acoustic fingerprint. The "secret knock" becomes a cryptographic primitive: groups that know the knock can authenticate to each other without any prior digital key exchange.

Applications:

  • Physical access control: tap the correct pattern on a door to unlock (no keys, no cards, no app unlock required, just the knock)
  • Group membership proof: a secret society, club, or organization uses a tap pattern as a physical authentication ritual that is also a real cryptographic primitive
  • IoT commissioning: tap a device with the correct pattern to authorize it; the knock IS the credential

Group Authentication via Shared Musical Experience

Everyone at a meeting, ceremony, or event records a shared musical moment — a song played at the opening, a theme performed live, a signal broadcast simultaneously. All recordings, despite individual variation in microphone quality and position, produce fingerprints within tolerance of each other. The group now shares a cryptographic secret derived from the shared experience.

Properties:

  • No digital key exchange ever occurred
  • The key is not stored anywhere; it exists in the acoustic memory of the event
  • Anyone who attended and recorded the moment can derive the key
  • Anyone who was not present cannot derive the key (their recording of a different performance will not match within tolerance)
  • The key is ephemeral: a later performance that differs by more than the tolerance produces a different fingerprint, so recordings of it do not match the original

Applications:

  • Post-event exclusive content distribution
  • Conference session-gated resources
  • Wedding/ceremony encrypted memories: only guests who attended can access
  • Sports season ticket holder communities: each home game produces a new key for that game's exclusive content

Acoustic Mesh Coordination During Outages

During infrastructure failures where both internet and cellular networks are down, devices within audio range can exchange XProtocol events via ultrasonic audio channels. Range is limited (~1-5 meters per hop) but devices can relay events through a human chain — each person within earshot passes events forward.

Combined with proximity broadcast pairing (§11), devices first discover each other via BLE, pair via audio channel (QR-equivalent without requiring a screen), and then form a mesh relay network for broader event distribution.

12.5 — Audio Channel Event Schema

{
  "kind": "xp.discovery.audio_channel",
  "payload": {
    "mode": "audible | ultrasonic | infrasonic",
    "encoding": "chirp_spread_spectrum | fsk | ook",
    "frequency_range_hz": [18000, 22000],
    "data": "<key_exchange_payload_base64url>",
    "duration_ms": 500,
    "repeat_count": 3,
    "signed_at": 1717000000,
    "sig": "<session_key_signature>"
  }
}
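A receiver would typically sanity-check such an event before attempting to decode audio. The sketch below checks the enumerated fields and verifies that the declared frequency range falls inside the band that §12.2 gives for the declared mode. These validation rules are an assumption layered on the schema, not something the specification mandates, and signature verification is left out.

```python
BANDS_HZ = {
    "infrasonic": (0, 20),
    "audible": (20, 18_000),
    "ultrasonic": (18_000, 22_000),
}
ENCODINGS = {"chirp_spread_spectrum", "fsk", "ook"}


def validate_audio_channel(event: dict) -> list:
    """Return a list of problems; an empty list means the event looks well-formed."""
    problems = []
    if event.get("kind") != "xp.discovery.audio_channel":
        problems.append("unexpected kind")
    payload = event.get("payload", {})
    mode = payload.get("mode")
    if mode not in BANDS_HZ:
        problems.append("unknown mode")
    if payload.get("encoding") not in ENCODINGS:
        problems.append("unknown encoding")
    freq = payload.get("frequency_range_hz")
    if not (isinstance(freq, list) and len(freq) == 2 and freq[0] < freq[1]):
        problems.append("frequency_range_hz must be [low, high]")
    elif mode in BANDS_HZ:
        band_low, band_high = BANDS_HZ[mode]
        if freq[0] < band_low or freq[1] > band_high:
            problems.append("frequency range outside the declared mode's band")
    if payload.get("repeat_count", 1) < 1:
        problems.append("repeat_count must be at least 1")
    return problems
```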

12.6 — Audio Fingerprint Event Schema

{
  "kind": "xp.discovery.audio_fingerprint",
  "payload": {
    "context": "song | live_event | ambient | voice | pattern | knock",
    "commitment": "<fuzzy_commitment_base64url>",
    "tolerance_bits": 20,
    "hint": {
      "description": "The opening song at the keynote",
      "location": "optional_text_hint",
      "time_window": {
        "start": 1717000000,
        "end": 1717003600
      }
    },
    "profile_encrypted": "<payload_encrypted_to_audio_key>",
    "expires_at": 1717086400,
    "signed_at": 1717000000,
    "sig": "<sender_identity_signature>"
  }
}
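Before running the fingerprint pipeline, a client can use the timestamps in this event as a cheap pre-check. The sketch assumes that `expires_at` bounds when decryption may be attempted and that `hint.time_window` bounds when the target audio could have been captured. The schema does not spell out these semantics, so both readings are assumptions, as is the function name.

```python
def recording_may_unlock(event: dict, recorded_at: int, now: int) -> bool:
    """Skip fingerprinting when the event has expired or the capture is out of window."""
    payload = event["payload"]
    if now >= payload["expires_at"]:
        return False                    # event no longer valid
    window = payload.get("hint", {}).get("time_window")
    if window is None:
        return True                     # no time hint: any capture may be tried
    return window["start"] <= recorded_at <= window["end"]
```

Passing this check says nothing about whether the audio matches. It only avoids a capture and recovery attempt that could not succeed.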

12.7 — Security Considerations

Replay attacks via recording: A high-quality recording of the target audio could potentially reconstruct the fingerprint within tolerance. Mitigations:

  • Live capture requirement: implementations MUST prevent gallery audio imports from being used in place of live microphone capture
  • Motion sensor confirmation: require device motion during capture (eliminates static playback from a speaker)
  • Combine with location: audio_key = HKDF(audio_fingerprint + geohash), so a recording played at the wrong location fails
  • Combine with visual: require simultaneous visual scene match
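The location-binding mitigation can be sketched as follows. The text writes the input as `audio_fingerprint + geohash`; this sketch length-prefixes the fingerprint before concatenating so that the boundary between the two parts is unambiguous, which is a design choice of the example rather than something the specification states. SHA-256 as the HKDF hash and the function names are likewise assumptions.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, salt: bytes = b"", length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256."""
    prk = hmac.new(salt or bytes(32), ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def location_bound_audio_key(fingerprint: bytes, geohash: str, context_label: str) -> bytes:
    """Derive a key that depends on both the audio fingerprint and the geohash cell."""
    ikm = len(fingerprint).to_bytes(2, "big") + fingerprint + geohash.encode("ascii")
    info = ("xp.discovery.audio_fingerprint." + context_label).encode("utf-8")
    return hkdf_sha256(ikm, info)
```

Geohash precision sets the size of the cell in which the key is reproducible: a longer geohash means a smaller area, but also more sensitivity to GPS error near cell boundaries.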

Environment reproduction attack: A sophisticated attacker could attempt to reproduce the target acoustic environment. The more specific and complex the audio fingerprint (a live concert with crowd noise, a room with unique reverb, a natural soundscape), the harder reproduction becomes. Simple tones or widely-available recordings are less secure than complex, specific, or live audio.

Eavesdropping on audio channel transport: Ultrasonic transmission is inaudible to humans but detectable by any microphone in range. The data transmitted is already an encrypted XProtocol event — eavesdropping yields only ciphertext. Audible transmission is deliberately witnessed — its security model assumes public observation is acceptable (or even desired).

12.8 — Conformance

xp.discovery.audio_channel and xp.discovery.audio_fingerprint are OPTIONAL extensions. Implementations that support them MUST:

  1. Never accept gallery/file audio imports in place of live capture for fingerprint key derivation
  2. Apply the fuzzy commitment scheme — raw fingerprint comparison is NOT conformant (too brittle for real-world use)
  3. Declare audio capability in xp.endpoint.announce: "discovery_transports": ["audio_channel", "audio_fingerprint"]
  4. Document the tolerance parameters used for each audio context type

13. Conformance

13.1 — Mandatory Transports

Transport | Conformance
Relay | MANDATORY (all implementations)
All others | OPTIONAL

13.2 — Capability Declaration

Implementations declare supported transports in their xp.endpoint.announce event:

{
  "kind": "xp.endpoint.announce",
  "payload": {
    "discovery_transports": [
      "relay", "rendezvous", "ble", "nfc", "qr", "uwb",
      "audio_channel", "audio_fingerprint",
      "visual_place", "location_key", "physical_binding"
    ],
    "proximity_namespaces": [
      "xp.proximity.social", "xp.proximity.professional",
      "xp.proximity.emergency"
    ],
    "physical_binding_supported": true,
    "max_trust_tier": 5
  }
}
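Given two such announcements, a client can compute what both sides support. The sketch below intersects the declared transports and namespaces, preserving the local side's order, and takes the lower of the two `max_trust_tier` values. Treating the declared order as a preference order and using the minimum tier are assumptions of this example, and `negotiate` is a hypothetical helper.

```python
def negotiate(local: dict, remote: dict) -> dict:
    """Capabilities shared by two xp.endpoint.announce events."""
    mine, theirs = local["payload"], remote["payload"]
    their_transports = set(theirs.get("discovery_transports", []))
    their_namespaces = set(theirs.get("proximity_namespaces", []))
    return {
        "transports": [t for t in mine.get("discovery_transports", [])
                       if t in their_transports],
        "namespaces": [n for n in mine.get("proximity_namespaces", [])
                       if n in their_namespaces],
        "max_trust_tier": min(mine.get("max_trust_tier", 0),
                              theirs.get("max_trust_tier", 0)),
    }
```

Since relay is the only mandatory transport (§13.1), the shared transport list between two conformant implementations should always contain at least "relay".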

13.3 — Implementation Profiles

Minimal (server/cloud services): Relay only. Trust Tier 0.

Mobile standard (smartphone apps): Relay + rendezvous + BLE + QR + NFC + proximity broadcast + audio channel. Trust Tiers 0–2.

Mobile enhanced (flagship smartphones with UWB): All standard mobile transports + UWB + audio fingerprint + visual place recognition + GPS location key. Trust Tiers 0–4.

IoT basic (constrained devices): Relay + BLE + QR + audio channel. Trust Tiers 0–2. Physical binding optional.

IoT high-security (cameras, access control, sensors): Relay + UWB + visual place recognition + GPS + audio fingerprint + physical context binding. Trust Tiers 0–5.


14. Privacy Considerations

14.1 — Metadata Exposure by Transport

Transport | Exposed to infrastructure | Exposed to peer
Relay | Sender key, recipient key, kind, timestamp, size | Nothing additional
Relay + envelope encryption | Recipient key only | Nothing additional
Rendezvous | Opaque token, IP (to rendezvous server) | IP address
DHT | Key fingerprint, IP (to DHT nodes) | IP address
BLE | Key fingerprint (to anyone in range) | Key fingerprint
Proximity beacon | Seeking criteria (to anyone in range) | Criteria only until Phase 4
Audio channel | Presence signal (to anyone in range) | Audio signal
Audio fingerprint | Nothing (local derivation) | Hint text only
QR | Nothing (offline display) | All QR payload fields
NFC | Nothing (short range) | All NFC payload fields
UWB | Ranging data to anchor | Ranging data
GPS/visual | Nothing (local derivation) | Geofence/location hint

14.2 — Minimum Exposure Principle

  • Anonymous initial contact: QR, NFC, or audio channel (no network metadata)
  • Proximity social discovery: proximity beacon with session key rotation
  • Silent proximity pairing: ultrasonic audio channel
  • Witnessed proximity pairing: audible audio channel
  • Content gated to physical presence: audio or visual fingerprint
  • Persistent remote communication: relay with envelope encryption
  • High-performance known-peer: rendezvous

14.3 — Location and Audio Privacy

GPS coordinates and audio fingerprint hints reveal the target location or context to the recipient — intentionally, since they must go there or reproduce the audio. These fields should be transmitted only to intended recipients via encrypted events. Audio channel transmissions in ultrasonic mode are inaudible but detectable — treat them as potentially observable by any device in range.


Appendix A — Transport Summary Table

Transport | Tier | Range | Direction | Requires | Novel
Relay | 0 | Global | Any | Internet | No
Rendezvous | 0 | Global | Any | Internet + shared secret | No
DHT | 0 | Global | Any | Internet | No
Return address | 0 | Global | Any | Prior event | No
BLE | 1 | ~10-100m | Any | BLE hardware | No
WiFi Direct | 1 | ~100m | Any | WiFi hardware | No
Proximity beacon | 1–2 | ~10-150m | Broadcast→any | BLE or WiFi | Yes
Audio channel (audible) | 2 | ~10-20m | Speaker→mic | Speaker + mic | No
Audio channel (ultrasonic) | 2 | ~1-5m | Speaker→mic | Speaker + mic | No
QR code | 2 | ~0-5m | Display→camera | Camera | No
NFC | 2 | ~0-4cm | Bidirectional | NFC hardware | No
Audio fingerprint (live event) | 3 | Event venue | Ambient→mic | Microphone | Yes
Audio fingerprint (ambient) | 3 | ~room | Ambient→mic | Microphone | Yes
GPS location key | 3 | Geofence | N/A | GPS | Yes
Visual place recognition | 3 | ~1-5m | Camera→scene | Camera + GPS | Yes
UWB | 4 | ~0-10m | Bidirectional | UWB hardware | No
Physical context binding | 5 | Installation | N/A | Camera + GPS + compass | Yes

Appendix B — Novel Mechanisms Summary

The following mechanisms described in this specification are believed to have no direct prior art in deployed systems. They are documented here as a technical reference for implementors and researchers, and as an open contribution to the public record under CC BY 4.0.

All mechanisms are freely available for anyone to implement, build upon, or extend under the terms of the XProtocol open licenses.

Visual Place Recognition as a Cryptographic Primitive (§8) Using perceptual hashing of camera images combined with GPS coordinates and compass heading to derive cryptographic key material — decryption requires physical presence at a specific location facing a specific direction seeing a specific scene. Fuzzy commitment schemes provide tolerance for environmental variation.

Visual Dead Drop (§8.4) Encrypted information transmittable only to a recipient who physically visits a specific location, orients correctly, and captures a matching camera image. Key material is derived from the visual fingerprint of the scene, not from a transmitted secret.

Cryptographic Physical Context Binding for IoT (§9) Binding an IoT device's communication private key to its physical installation context — GPS, compass orientation, visual scene, optional UWB — such that the device cannot communicate if moved, reoriented, or occluded. Tamper detection enforced cryptographically without hardware tamper seals.

Physical Presence Trust Tier Model (§10) Classifying the trust level of cryptographic relationships by the physical transport used to establish them. Trust tier is recorded as a signed pairing property and enforced in capability policies — distinguishing remote pairing from centimeter-accurate physical presence.

Camera Occlusion as Cryptographic Lockout (§9) Covering, repointing, or moving an IoT device's camera causes the device to lose its private key and go silent — without server involvement or tamper hardware.

Time-Windowed Cryptographic Rendezvous (§4.2) HKDF with a time-epoch salt derived from a shared secret generates rendezvous tokens allowing devices to find each other on an untrusted server with no identity disclosure.

Proximity Broadcast Pairing with Predicate-Encrypted Profiles (§11) Anonymous proximity discovery between strangers via BLE/WiFi broadcast where profile content is encrypted to a key derivable only by receivers whose profile satisfies the sender's criteria. Four-phase consent flow. Session key rotation for anti-tracking. No server involved.

Predicate Encryption for Proximity Social Discovery (§11) Predicate-based key derivation for compatibility matching — encryption predicate encodes compatibility criteria, decryption key derivable only by profiles satisfying those criteria, evaluated locally.

Emergency Proximity Beacon with Relaxed Privacy Defaults (§11.8) Proximity beacon mode for emergency scenarios where the four-phase consent flow is replaced with immediate identity disclosure — enabling faster skills-and-needs matching when speed is critical.

Infrastructure-Free Community Mesh via Proximity Pairing (§11.8) Authenticated decentralized mesh network formation during infrastructure failures using proximity broadcast pairing, with device-to-device XProtocol event relay and cryptographic authentication throughout.

Audio Fingerprint as Cryptographic Key Material (§12) Acoustic feature extraction (MFCC, chromagram, spectral features) applied to live audio derives cryptographic key material via HKDF with fuzzy commitment for environmental tolerance.

Live Event Audio Fingerprint as Proof of Physical Presence (§12.4) The unique acoustic fingerprint of a live performance constitutes cryptographic proof of attendance — venue acoustics, crowd noise, and live variation distinguish it from any studio recording.

Acoustic Dead Drop (§12.4) Encrypted information decryptable only when a specific sound plays at a specific location. Time-varying sounds encode temporal specificity as an implicit key component.

Secret Knock as Cryptographic Authentication Primitive (§12.4) Rhythmic patterns tapped on a surface, detected via accelerometer or microphone, as a cryptographic authentication mechanism.

Group Authentication via Shared Live Audio Experience (§12.4) A shared cryptographic secret established among a group through simultaneous live audio experience — each recording produces a fingerprint within fuzzy tolerance of all others. No digital key exchange required.

Acoustic Ranging as UWB Alternative (§12.2) Time-of-flight measurement of audio signals providing centimeter-accurate distance estimation using only standard speaker and microphone hardware.
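
The arithmetic behind time-of-flight ranging can be sketched as follows, assuming a two-way exchange in which the responder's processing delay is known and subtracted. The function names are hypothetical, and the linear speed-of-sound formula is a common approximation for dry air rather than something specified in §12.2.

```python
def speed_of_sound(temp_c: float = 20.0) -> float:
    """Approximate speed of sound in dry air, in metres per second."""
    return 331.3 + 0.606 * temp_c


def two_way_distance(t_round_s: float, t_reply_s: float, temp_c: float = 20.0) -> float:
    """Distance from a round trip: remove the responder's delay, halve the rest."""
    time_of_flight = (t_round_s - t_reply_s) / 2
    return speed_of_sound(temp_c) * time_of_flight


def distance_per_sample(sample_rate_hz: int, temp_c: float = 20.0) -> float:
    """Distance sound travels during one audio sample: the timing resolution floor."""
    return speed_of_sound(temp_c) / sample_rate_hz
```

At a 48 kHz sample rate one sample corresponds to about 7 mm of travel, which is why ordinary audio hardware can in principle reach centimeter-scale resolution. Real accuracy also depends on echo handling, clock drift, and how precisely the reply delay is known.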

Continuous Location Presence Gating for Messages (§7.3) Message visibility continuously gated by real-time GPS position — messages appear in the timeline inside the geofence and disappear outside it, with plaintext evicted from memory on exit.

Delivery × Visibility × Persistence Matrix for Geofenced Messaging (§7.3) Composable independent flags for delivery mode, visibility mode, and persistence mode — enabling any combination of geofenced delivery, visibility, and deletion behavior from a unified schema.
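
One possible shape for such a schema is three independent enums combined in a single policy record, sketched below. The mode names and helper functions are hypothetical placeholders; the actual flag values are defined in §7.3.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Delivery(Enum):
    IMMEDIATE = "immediate"          # deliver as soon as the event arrives
    ON_ENTRY = "on_geofence_entry"   # deliver only once inside the geofence


class Visibility(Enum):
    ALWAYS = "always"                # visible anywhere once delivered
    INSIDE_ONLY = "inside_geofence"  # visible only while inside the geofence


class Persistence(Enum):
    KEEP = "keep"                    # retain after leaving the geofence
    DELETE_ON_EXIT = "delete_on_exit"


@dataclass(frozen=True)
class GeofencePolicy:
    delivery: Delivery
    visibility: Visibility
    persistence: Persistence


def is_visible(policy: GeofencePolicy, inside_geofence: bool) -> bool:
    """A message is shown if it is always visible or the reader is inside."""
    return policy.visibility is Visibility.ALWAYS or inside_geofence


def delete_on_exit(policy: GeofencePolicy) -> bool:
    return policy.persistence is Persistence.DELETE_ON_EXIT


# Every combination of the three flags is a valid policy.
ALL_POLICIES = [GeofencePolicy(d, v, p) for d, v, p in product(Delivery, Visibility, Persistence)]
```

Because the flags are independent, two values per axis already give eight distinct behaviors without any special cases in the schema.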

Privacy-Preserving Relay Hold via Nested Location-Key Encryption (§7.3) Relay holds a message without knowing the geofence coordinates — the decryption key is encrypted to a location-derived key, enforcing location-gating entirely through client-side cryptographic derivation.
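
The nesting can be illustrated with a toy sketch: a random content key is wrapped under a key derived from a quantized position, so only a client inside the same grid cell can unwrap it. Everything here is an assumption made for illustration: the 0.001-degree grid, HMAC-SHA256 as the key derivation step, and a one-time XOR wrap standing in for an authenticated cipher. §7.3 defines the actual construction.

```python
import hashlib
import hmac
import math
import secrets

CELL_DEG = 0.001  # assumed grid size, roughly 100 m of latitude


def location_cell(lat: float, lon: float) -> bytes:
    """Quantize a position to a grid cell so nearby fixes map to the same bytes."""
    return f"{math.floor(lat / CELL_DEG)}:{math.floor(lon / CELL_DEG)}".encode()


def location_key(lat: float, lon: float, message_salt: bytes) -> bytes:
    """Derive a 32-byte key from the grid cell (HMAC-SHA256 as a stand-in KDF)."""
    return hmac.new(message_salt, location_cell(lat, lon), hashlib.sha256).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def wrap_content_key(content_key: bytes, lat: float, lon: float, salt: bytes) -> bytes:
    """Toy wrap: XOR the 32-byte content key with the location key (no integrity)."""
    return xor_bytes(content_key, location_key(lat, lon, salt))


def unwrap_content_key(wrapped: bytes, lat: float, lon: float, salt: bytes) -> bytes:
    """XOR is its own inverse; a wrong cell yields an unrelated key."""
    return xor_bytes(wrapped, location_key(lat, lon, salt))


# Sender side: the relay stores only the salt and wrapped key, never coordinates.
salt = secrets.token_bytes(16)
content_key = secrets.token_bytes(32)
wrapped = wrap_content_key(content_key, 40.71280, -74.00650, salt)
```

A recipient at a nearby fix in the same cell recovers `content_key`, while one elsewhere derives a different location key and gets unusable bytes. A small grid has few possible cells, so a real design has to address guessing attacks and cell-boundary effects, which this sketch does not.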

Silent Delivery with Deferred Geofence Notification (§7.3) Encrypted event stored on device without OS notification, badge, or timeline entry — notification deferred until geofence entry, at which point the message surfaces as a new arrival.

Read Receipt as Implicit Location Attestation (§7.3) A geofenced message read receipt constitutes cryptographic proof of physical presence — decryption required deriving a location-key from GPS coordinates within the geofence at the time of reading.

XProtocol.ai is an independent open protocol project and is not affiliated with, endorsed by, or connected to XProtocol.org or any related entities.