Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

QoS, Status Events, and Discovery

This page covers runtime event behavior that application authors need after choosing an RMW backend: liveliness, deadline misses, lost messages, and the places where nano-ros deliberately differs from standard ROS 2 waitset-style event handling.

Transport-level status events let an application observe conditions the underlying middleware detects: a remote node going silent, a publisher missing its rate, a subscriber dropping samples. nano-ros exposes a Tier-1 subset useful for embedded RTOS deployments — liveliness changes, deadline misses, and message loss.

This page describes the event surface, the dispatch model, and the patterns that motivate having events at all on RTOS targets.

The Tier-1 events

Five event kinds. Three on subscribers, two on publishers.

Subscriber events

KindFires whenUse
LivelinessChangedA tracked publisher’s liveliness state changes (publisher started asserting / stopped)Safety-island fail-over: trigger MRM when remote control node goes silent
RequestedDeadlineMissedAn expected sample didn’t arrive within the configured deadlinePeriodic-sensor pattern: 100 Hz topic with 15 ms deadline; fire alarm when late
MessageLostThe backend dropped one or more samples (typically: ring-buffer overflow, slow consumer)Diagnostic + adaptive: log, drop more aggressively, coalesce

Publisher events

KindFires whenUse
LivelinessLostThis publisher missed its own liveliness assertion deadlineSelf-monitoring: detect own task starvation
OfferedDeadlineMissedThis publisher promised X Hz, fell behindSelf-monitoring: detect overrunning compute

Registering callbacks

Same shape as message callbacks. Register at construction time; callback fires from inside spin_once when the backend detects the event.

Rust

#![allow(unused)]
fn main() {
use core::time::Duration;

let mut sub = node.create_subscription::<SensorReading>("/sensor")?;

// Liveliness — fires when the publisher comes / goes
sub.on_liveliness_changed(|status| {
    if status.alive_count == 0 {
        log::warn!("sensor publisher went silent");
        trigger_failover();
    }
})?;

// Deadline — fires when no sample within 15 ms
sub.on_requested_deadline_missed(Duration::from_millis(15), |status| {
    metric_inc(&LATE_SAMPLE_COUNT, status.total_count_change);
})?;

// Message lost — fires when backend drops a sample
sub.on_message_lost(|status| {
    log::warn!("dropped {} samples", status.total_count_change);
})?;
}

Async equivalents (spin_async / Embassy / tokio):

#![allow(unused)]
fn main() {
let status = sub.next_liveliness_change().await?;
let status = sub.next_deadline_miss().await?;
}

C

static void on_liveliness_changed(
        nros_subscription_t *sub,
        nros_liveliness_changed_status_t status,
        void *user_context) {
    if (status.alive_count == 0) {
        trigger_failover();
    }
}

nros_subscription_set_liveliness_changed_callback(
    sub, on_liveliness_changed, NULL);

Same shape on the four other event kinds. Each returns NROS_RMW_RET_UNSUPPORTED if the active backend doesn’t generate that event.

C++

sub.on_liveliness_changed([&](nros::LivelinessChangedStatus status) {
    if (status.alive_count == 0) trigger_failover();
});

sub.on_requested_deadline_missed(
    std::chrono::milliseconds(15),
    [&](nros::DeadlineMissedStatus status) {
        late_count_ += status.total_count_change;
    });

std::function overloads available with NROS_CPP_STD; freestanding mode uses C function pointers + user-context.

How dispatch works

Events fire from inside drive_io on the executor thread, the same way message callbacks do:

spin_once(timeout)
 └─ session.drive_io(...)
       ├─ backend RX worker detects message → fires message callback
       ├─ backend RX worker detects liveliness change → fires liveliness callback
       ├─ backend timer expires (deadline) → fires deadline callback
       └─ return

Same execution context, same priority, same constraints as message callbacks. The max_callbacks_per_spin knob (see RTOS Cooperation) covers events too — an event callback counts as one against the cap.

This means events get the same scheduling treatment as messages. max_callbacks_per_spin = 1 will fire either one message OR one event per spin_once, whichever the backend has ready first.

Why a callback model, not a waitset

Upstream rmw.h exposes events via a waitset — register an rmw_event_t handle in a waitset, call rmw_wait, then rmw_take_event to pull payload. nano-ros doesn’t have a waitset (see RMW vs upstream Section 4).

Reusing the existing message-callback path avoids introducing the waitset. Trade-off: applications can’t bulk-poll for events. For the Tier-1 set this isn’t load-bearing — events are rare, callbacks are cheap.

Backend support is uneven

Not every backend can generate every event. Apps must handle Unsupported errors when registering:

BackendLivelinessChanged / LostDeadlineMissedMessageLost
Cyclone DDS✗ Not wired through the nano-ros event API yet✗ Not wired yet✗ Not wired yet
XRCE-DDS✗ Not exposed (xrce-dds-client API limitation)✅ Shim-side clock check (sub: has_data / try_recv_raw; pub: publish_raw)✗ Not exposed
zenoh-pico✅ Poll-based via zenoh tokens (alive_count ∈ {0,1})✅ Shim-side clock check (<P as PlatformClock>::clock_ms)✅ Seq-gap detection from RMW attachment
uORB✗ No wire-level liveliness✗ No rate concept✅ Native (host mock + real PX4 via RustSubscriptionCallback publish-counter)

✓ = wired and tested. 🟡 = surface API works (returns Err while pending), wiring planned. ✗ = not feasible at this layer.

No current backend exposes the full Tier-1 event set through the nano-ros event API. Apps should call Subscriber::supports_event(kind) first or design for graceful fallback.

The Subscriber::supports_event(kind) query lets applications check support before registering:

#![allow(unused)]
fn main() {
if sub.supports_event(EventKind::RequestedDeadlineMissed) {
    sub.on_requested_deadline_missed(...)?;
} else {
    // fallback: app-side timeout monitoring
}
}

Apps that need cross-backend portability code defensively. Apps pinned to one backend can call register-and-unwrap.

What’s deliberately skipped

Three upstream event types are intentionally absent from the API:

*_MATCHED (publication / subscription matched)

Fires when a remote endpoint appears or disappears. Useful for discovery-tracking dashboards and dynamic-topology apps. Static- topology embedded apps don’t benefit; if a use case appears, the event kind is additive.

*_QOS_INCOMPATIBLE

Fires when publisher and subscriber QoS profiles can’t be reconciled. nano-ros surfaces this synchronously at create_publisher / create_subscriber time as NROS_RMW_RET_INCOMPATIBLE_QOS. No runtime event needed; the failure is visible at startup.

*_INCOMPATIBLE_TYPE

Type-hash mismatch. Same: surfaced synchronously as NROS_RMW_RET_TOPIC_NAME_INVALID (or a future NROS_RMW_RET_INCOMPATIBLE_TYPE) at creation. No runtime event.

Patterns

Drone-bridge fail-over on liveliness loss

#![allow(unused)]
fn main() {
let mut sub = node.create_subscription::<VehicleAttitude>("/vehicle_attitude")?;
sub.on_liveliness_changed(|status| {
    if status.alive_count == 0 {
        // PX4 commander.cpp went silent — trigger MRM
        request_minimum_risk_manoeuvre();
    } else if status.alive_count_change > 0 {
        // commander came back online; clear MRM if appropriate
        clear_failover();
    }
})?;
}

This pairs with the cross-backend bridge example pattern.

100 Hz sensor with deadline alarm

#![allow(unused)]
fn main() {
let mut sub = node.create_subscription::<SensorReading>("/imu")?;
sub.on_requested_deadline_missed(
    Duration::from_millis(15),    // expected 100 Hz, allow 15 ms deadline
    |status| {
        if status.total_count_change > 0 {
            log::error!("IMU late: {} missed deadlines", status.total_count_change);
            // Optional: enter degraded-mode controller
        }
    },
)?;
}

Slow-consumer logging

#![allow(unused)]
fn main() {
let mut sub = node.create_subscription::<Pointcloud>("/lidar")?;
sub.on_message_lost(|status| {
    log::warn!("dropped {} pointcloud frames (total: {})",
               status.total_count_change, status.total_count);
    // Tighten downstream filter, reduce work, coalesce, etc.
})?;
}

See also