# Lesavka Agent Notes

## A/V Sync Probe And Lip-Sync Validation Checklist

Context: Google Meet testing on 2026-04-30 showed audio roughly 8 seconds behind video even though internal client/server telemetry reported fresh uplink packets. Treat this as a product correctness failure, not a calibration issue. Do not resume blind lip-sync tuning until the probe can explain where delay appears.

### Operating Principles

- Avoid hard-resetting USB, UVC, UAC, display managers, or remote hosts unless the user explicitly approves it.
- Prefer observation and reversible user-space probes before changing media pipelines.
- Treat Tethys-only SSH/device inspection as a development luxury, not a production dependency.
- Do not claim lip sync is fixed from internal telemetry alone; require end-to-end device-level evidence.
- Keep this checklist updated as work lands.

### Phase 1: Build The Probe

- [x] Create this tracked checklist in `AGENTS.md`.
- [x] Inventory existing `client/src/sync_probe/` code and decide what can be reused.
  - Reuse the existing synthetic beacon in `client/src/sync_probe/`.
  - Reuse the existing Tethys capture harness in `scripts/manual/run_upstream_av_sync.sh`.
  - Reuse and extend `lesavka-sync-analyze`; current gap is structured evidence output, not capture generation.
- [x] Define the phase-1 output contract:
  - [x] `report.json`
  - [x] `report.txt`
  - [x] per-event rows with event id, video time, audio time, skew, and confidence
  - [x] pass/fail verdict using preferred/acceptable/catastrophic thresholds
- [x] Add a deterministic local sync beacon source:
  - [x] video flash pattern with event identity or cadence
  - [x] simultaneous audio click/beep
  - [x] stable event schedule suitable for automated detection
- [x] Add a Tethys-side capture probe:
  - [x] capture Lesavka UVC video device
  - [x] capture Lesavka UAC microphone device
  - [x] record enough raw evidence for debugging when detection fails
  - [x] detect video flashes
  - [x] detect audio clicks
  - [x] pair events and compute skew
- [x] Add a runner that can launch or instruct the Tethys probe safely over SSH without rebooting or restarting the desktop.
- [x] Store probe artifacts under `/tmp/lesavka-sync-probe-*` by default.
- [x] Keep the probe usable without Google Meet first; Google Meet validation is a later application-level check.

### Phase 2: Use Probe To Root-Cause Desync

- [x] Run probe through direct Lesavka UVC/UAC devices on Tethys.
  - First live run reached the devices but exposed analyzer/tooling gaps instead of a valid skew report.
  - Fixed the manual probe tunnel to preserve HTTPS/mTLS through SSH (`LESAVKA_SERVER_SCHEME=https`, `LESAVKA_TLS_DOMAIN=lesavka-server`).
  - Fixed analyzer handling for MJPEG captures whose FFprobe metadata over-reports frames versus decodable video frames.
- [x] Compare client-generated event times against Tethys-observed times.
  - The preserved Tethys capture had 323 decodable frames with constant brightness, so no video flash reached UVC.
  - Server logs show the probe entered a stale upstream session and dropped audio as ~326 seconds late.
- [x] Identify whether delay appears before server planning, at server UAC sink, at UVC helper, inside Tethys device capture, or inside browser/WebRTC.
  - Current root cause is server planning/session lifecycle, before UVC/UAC sink output.
  - A previous one-sided microphone session started at 2026-04-30T22:59:52Z; the new probe at 2026-05-01T00:57:08Z inherited its stale playout epoch.
- [x] Add diagnostics for whichever stage is hiding delay.
  - Existing server lifecycle/planning logs were enough to isolate this run; next gate should preserve these as structured artifacts.
- [x] Do not tune calibration offsets until gross backlog is ruled out.
  - No calibration offsets were changed during the stale-session investigation.
  - Current evidence points at lifecycle/session planning, not an offset problem.

### Phase 3: Fix Lesavka With Evidence

- [x] If stale upstream lifecycle is confirmed, reset shared A/V timing anchors when a new stream replaces an existing owner.
  - Added a lifecycle guard so normal camera/microphone stream replacement clears stale shared timing anchors before re-pairing.
  - Kept soft microphone recovery intentionally separate so it supersedes the mic owner without disturbing an active healthy camera/shared clock.
  - Added regression coverage for stale timing-anchor replacement and soft microphone recovery preservation.
- [ ] If UAC sink backlog is confirmed, make UAC output freshness-bounded.
- [ ] If audio progress is marked too early, move/augment progress telemetry to reflect actual sink emission readiness.
- [ ] If UVC and UAC are using incompatible freshness semantics, unify them behind one live-media policy.
- [ ] If browser/WebRTC adds delay after devices are already synced, document the application boundary and add browser-specific mitigation or guidance.

### Phase 4: Gate And Release Criteria

- [x] Add deterministic unit/integration tests for probe analysis logic.
- [x] Add a hardware-in-the-loop/manual gate artifact schema for real Tethys probe runs.
- [x] Update `scripts/ci/media_reliability_gate.sh` to report probe evidence when present.
  - Gate now reads `LESAVKA_SYNC_PROBE_REPORT_JSON`, `LESAVKA_SYNC_PROBE_REPORT_DIR`, or `target/media-reliability-gate/sync-probe/report.json`.
  - Gate emits sync-probe verdict/check metrics, skew metrics, event counts, and a verdict info metric.
- [x] Require a fresh probe report before declaring lip sync fixed.
  - Gate now supports `LESAVKA_REQUIRE_SYNC_PROBE=1`, which fails media reliability when a valid passing probe report is absent.
  - Product/release judgment still requires a new live Theia/Tethys probe after the lifecycle fix is installed.
- [ ] Suggested thresholds:
  - [x] preferred: p95 skew <= 35 ms
  - [x] acceptable: p95 skew <= 80 ms
  - [x] gross failure: sustained skew > 250 ms
  - [x] catastrophic failure: any sustained skew near or above 1000 ms

### Open Questions

- [x] Decide whether the phase-1 beacon should run as a separate binary, a hidden client mode, or both.
- [x] Decide whether Tethys probe should be Rust-only, shell plus GStreamer, or a hybrid.
- [ ] Confirm whether sudo/Vault access is available for installing missing probe dependencies on Theia/Tethys.
  - Non-sudo server journal inspection worked; noninteractive sudo over SSH still needs an explicit TTY/password path.

### Validation Evidence

- [x] `cargo test -p lesavka_server upstream_media_runtime::tests::lifecycle`
- [x] `cargo test -p lesavka_client sync_probe::analyze`
- [x] `cargo test -p lesavka_testing upstream_sync_script_tunnels_auto_server_addr_through_ssh`
- [x] `bash -n scripts/ci/media_reliability_gate.sh`
- [x] `cargo test -p lesavka_testing media_reliability_gate_reports_direct_sync_probe_evidence`
- [x] `LESAVKA_REQUIRE_SYNC_PROBE=1 ./scripts/ci/media_reliability_gate.sh`
  - Used a synthetic passing report at `target/media-reliability-gate/sync-probe/report.json` to verify gate parsing/enforcement.
  - This validates CI glue only; a real Theia/Tethys probe is still required for product judgment.
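
The suggested thresholds in Phase 4 can be read as a simple classifier. The following is a minimal Rust sketch, not the analyzer's actual code. The names `Verdict`, `p95_abs_skew_ms`, and `classify` are hypothetical. The `Fail` band between 80 ms and 250 ms is an assumption, since the notes do not name it. "Near or above 1000 ms" is approximated as `>= 1000.0`, and the sustained-skew thresholds are applied to p95 absolute skew for simplicity.

```rust
/// Hypothetical verdict names; only gross and catastrophic failure are named in these notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Preferred,
    Acceptable,
    Fail, // assumed band: worse than acceptable, not yet gross
    GrossFailure,
    CatastrophicFailure,
}

/// Nearest-rank p95 of absolute skew in milliseconds; `None` for empty input.
fn p95_abs_skew_ms(skews_ms: &[f64]) -> Option<f64> {
    if skews_ms.is_empty() {
        return None;
    }
    let mut abs: Vec<f64> = skews_ms.iter().map(|s| s.abs()).collect();
    abs.sort_by(|a, b| a.total_cmp(b));
    // Integer ceil(0.95 * n) avoids floating-point rounding at the rank boundary.
    let rank = (95 * abs.len() + 99) / 100;
    Some(abs[rank - 1])
}

/// Map a p95 absolute skew onto the suggested thresholds.
/// Comparisons are ordered so a NaN input falls through to the worst verdict.
fn classify(p95_ms: f64) -> Verdict {
    if p95_ms <= 35.0 {
        Verdict::Preferred
    } else if p95_ms <= 80.0 {
        Verdict::Acceptable
    } else if p95_ms <= 250.0 {
        Verdict::Fail
    } else if p95_ms < 1000.0 {
        Verdict::GrossFailure
    } else {
        Verdict::CatastrophicFailure
    }
}

fn main() {
    // Illustrative skews only; not taken from a real probe run.
    let skews_ms = [-12.0, 8.5, -30.0, 22.0, 41.0];
    if let Some(p95) = p95_abs_skew_ms(&skews_ms) {
        println!("p95 = {:.1} ms -> {:?}", p95, classify(p95));
    }
}
```

Under these assumptions, p95 values of `273.8 ms` and `775.7 ms` both classify as `GrossFailure`, which is consistent with the `gross_failure` verdicts recorded later in these notes.
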
## Real Upstream Lip-Sync Fix Checklist

Context: the mirrored browser probe finally reproduced the real failure class on 2026-05-01: `activity_start_delta_ms=+9591.1`. This means the end-to-end browser-visible path can still start video far ahead of audio. The fix target is not silence in the logs; it is a freshness-first A/V uplink whose startup can heal briefly but cannot drift into seconds of skew.

### Acceptance Criteria

- [ ] Mirrored browser probe passes with `activity_start_delta_ms <= 1000`.
- [ ] Steady-state preferred sync: median skew within `35 ms`.
- [ ] Steady-state acceptable sync: p95 absolute skew within `80 ms`.
- [ ] Any sustained or startup A/V split near `1000 ms` remains a hard failure.
- [ ] No stale audio backlog is ever drained into UAC to catch up.
- [ ] No stale video backlog is ever drained into UVC to catch up.
- [ ] Google Meet manual testing agrees with the mirrored probe instead of revealing hidden seconds-scale skew.

### Phase 0: Keep The Probe Honest

- [x] Split raw activity-start fields from filtered/coded paired-pulse fields in probe reports.
- [x] Print explicit raw first-video and first-audio timestamps in `report.txt`.
- [x] Root-cause the 0.16.17 `raw_first_video_activity_s=0.000` artifact as the mirrored probe counting its own bright pre-start positioning card.
- [x] Make the mirrored stimulus pre-start screen dark/dim so only real flash pulses can be detected as video activity.
- [x] Add analyzer coverage proving dim pre-start positioning frames are ignored.
- [x] Replace generic light/dark mirrored flashes with color-coded event IDs.
- [x] Make mirrored audio pulses unique by the same event ID via pulse width plus tone frequency.
- [x] Teach the analyzer to decode mirrored video event IDs from color, not grayscale brightness.
- [x] Tighten real-camera color matching after 0.16.18 accepted washed-out brown/gray remnants as red/yellow events.
- [x] Preserve raw activity-start timing before cadence cleanup in coded reports.
- [x] Merge short audio envelope dropouts inside one coded pulse so a single tone burst cannot become two fake events.
- [x] Add diagnostic coded-pair correlation so stable large skew reports as measured failure instead of `not enough pairs`.
- [x] Make coded mirrored verdicts/calibration use matched coded pulses as authority; raw activity-start deltas are reported separately unless they agree with the coded pairs.
- [x] Print unpaired video/audio onsets in the human report so missed coded pulses are visible during probe triage.
- [ ] Keep the mirrored browser probe as the release/blocking upstream A/V gate.
- [ ] Keep the old raw-device probe as a lower-level diagnostic only.

### Phase 1: Stop One-Sided Startup Drift

- [x] Default upstream planning must require both camera and microphone before live playout.
- [x] One-sided playout may only happen through an explicit compatibility override.
- [x] While pairing is overdue, keep replacing the waiting-side anchor with fresh packets instead of preserving stale startup anchors.
- [x] While awaiting the peer stream, keep only fresh pending camera packets.
- [x] While awaiting the peer stream, keep only fresh pending microphone packets.
- [x] Add tests proving the pairing window no longer expires into one-sided playout by default.
- [x] Add tests proving the explicit one-sided override still works for intentional single-stream scenarios.

### Phase 2: Bound UAC Freshness

- [x] Configure UAC `appsrc` as non-blocking and bounded.
- [x] Log and drop UAC appsrc push failures instead of treating enqueue as guaranteed playback.
- [x] Raise calibration offset limits to cover one-second healing without rejecting measured probe corrections.
- [x] Update the MJPEG/UVC factory audio baseline from `-45ms` to `+720ms` based on the first trustworthy mirrored browser probe artifact.
- [x] Migrate untouched legacy `-45ms` factory/env calibration files on load so old installs actually receive the new baseline.
- [x] Make the video/audio-master wait offset-aware so a positive audio playout delay does not freeze UVC video while UAC sleeps before emission.
- [ ] Flush/stop UAC cleanly on session close, replacement, and recovery.
- [x] Add tests or contract coverage for bounded UAC settings where practical.

### Phase 3: Add Real Timing Evidence

- [ ] Add server timing counters for first camera packet, first mic packet, first UVC write, and first UAC push per session.
- [ ] Add dropped-stale audio/video counters to diagnostics.
- [ ] Add a concise health explanation when startup pairing exceeds the healing window.
- [ ] Surface `Starting`, `Healing`, `Flowing`, `Lagging`, `Dropping`, and `Stale` states in chips/diagnostics from real path evidence.

### Phase 4: Recovery And Mid-Session Changes

- [ ] Make device changes trigger soft-pause, stream replacement, queue flush, and re-pairing.
- [ ] Keep recovery soft-first; reserve hard UVC/UAC gadget rebuilds for explicit guarded recoveries.
- [ ] Add cooldown/state guards so recovery buttons cannot wedge Theia.
- [ ] Ensure disconnect closes all client/server media tasks for the session.

### Phase 5: Verification Loop

- [x] Run focused upstream runtime tests.
- [x] Run server/client media contract tests.
- [x] Run `cargo check` for touched packages.
- [x] Bump version for the fix release.
- [x] Run the mirrored browser probe on installed client/server.
  - 0.16.17 still failed: reported `activity_start_delta_ms=+6735.0`, but `raw_first_video_activity_s=0.000` exposed a probe false-positive from the pre-start screen. Paired pulses still showed real steady-state skew (`p95=411.8 ms`, `median=-99.0 ms`), so the product remains unfixed.
  - 0.16.18 captured real colored/audio-coded events but the analyzer still bailed with `need at least 3 matching coded pulse pairs; saw 1`. Replaying that artifact after analyzer hardening now reports `gross_failure`: 16/16 coded pairs, p95 `775.7 ms`, activity start `-766.4 ms`, and drift `-2.8 ms`; the failure is stable audio-ahead/video-late skew, not random detector noise.
  - 0.16.19 changes the shipped MJPEG/UVC audio playout baseline to `+720ms`; the next mirrored browser probe should move the measured median from about `-766ms` toward roughly `-46ms` before fine calibration.
  - 0.16.19 mirrored browser probe did not move the measured skew: p95 `885.7 ms`, median `-788.4 ms`, activity start `-659.1 ms`, drift `-81.2 ms`. SSH inspection showed Theia was on commit `c348597`, but `/etc/lesavka/server.env` still contained `LESAVKA_UPSTREAM_AUDIO_PLAYOUT_OFFSET_US=-45000`; the new `+720ms` baseline was not actually installed. Patch the installer to migrate leaked legacy ambient `-45000` to `+720000` unless `LESAVKA_INSTALL_UPSTREAM_AUDIO_PLAYOUT_OFFSET_US` explicitly asks for the legacy value.
  - 0.16.20 installed the `+720ms` offset (`/etc/lesavka/server.env` had `LESAVKA_UPSTREAM_AUDIO_PLAYOUT_OFFSET_US=720000`), but the mirrored browser capture contained no recognizable color pulses. Theia server logs showed repeated `upstream video frame dropped because the audio master never caught up inside the pairing window`; UVC was effectively starved by the positive audio delay instead of flowing delayed-but-fresh frames.
  - 0.16.21 makes that wait offset-aware and adds a regression test proving a configured positive audio delay does not freeze UVC video while UAC sleeps before playout.
  - Replaying the 0.16.21 artifact after 0.16.22 analyzer hardening changes the verdict from false `catastrophic_failure` to `gross_failure`: p95 `273.8 ms`, median `-188.4 ms`, 7 paired coded pulses. The raw activity-start delta (`-3620.7 ms`) is still printed, but it is ignored for verdict/calibration because it disagrees with coded pairs by `3432.3 ms`; unpaired video/audio onsets are printed for triage.
- [ ] Re-run the mirrored browser probe after the pre-start false-positive fix.
- [ ] Run Google Meet manual validation.
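
The 0.16.22 replay treats matched coded pulses as the authority and ignores a raw activity-start delta that disagrees with them. A minimal Rust sketch of that check follows, under stated assumptions. The names `median`, `judge_raw_start`, and `RawStart` are hypothetical, and the `250.0` ms tolerance is an illustrative value rather than the analyzer's real agreement threshold. The disagreement is assumed to be measured against the coded-pair median, which is consistent with the 0.16.21 replay numbers (`-3620.7 ms` against a median of `-188.4 ms` gives `3432.3 ms`).

```rust
/// Median of the coded-pair skews in milliseconds; `None` for empty input.
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Hypothetical outcome type: whether the raw activity-start delta may inform the verdict.
#[derive(Debug, PartialEq)]
enum RawStart {
    Agrees { disagreement_ms: f64 },
    Ignored { disagreement_ms: f64 },
}

/// Compare the raw activity-start delta with the coded-pair median.
/// `tolerance_ms` is an assumed parameter, not a documented analyzer setting.
fn judge_raw_start(raw_start_delta_ms: f64, coded_median_ms: f64, tolerance_ms: f64) -> RawStart {
    let disagreement_ms = (raw_start_delta_ms - coded_median_ms).abs();
    if disagreement_ms <= tolerance_ms {
        RawStart::Agrees { disagreement_ms }
    } else {
        RawStart::Ignored { disagreement_ms }
    }
}

fn main() {
    // Values from the 0.16.21 artifact replay recorded in these notes.
    let outcome = judge_raw_start(-3620.7, -188.4, 250.0);
    println!("{:?}", outcome);
}
```

Keeping the disagreement value in both variants means a report can still print it for triage even when the raw delta is excluded from the verdict.
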