152 lines
10 KiB
Markdown
152 lines
10 KiB
Markdown
# Lesavka Agent Notes
|
|
|
|
## A/V Sync Probe And Lip-Sync Validation Checklist
|
|
|
|
Context: Google Meet testing on 2026-04-30 showed audio roughly 8 seconds behind video even though internal client/server telemetry reported fresh uplink packets. Treat this as a product correctness failure, not a calibration issue. Do not resume blind lip-sync tuning until the probe can explain where delay appears.
|
|
|
|
### Operating Principles
|
|
- Avoid hard-resetting USB, UVC, UAC, display managers, or remote hosts unless the user explicitly approves it.
|
|
- Prefer observation and reversible user-space probes before changing media pipelines.
|
|
- Treat Tethys-only SSH/device inspection as a development luxury, not a production dependency.
|
|
- Do not claim lip sync is fixed from internal telemetry alone; require end-to-end device-level evidence.
|
|
- Keep this checklist updated as work lands.
|
|
|
|
### Phase 1: Build The Probe
|
|
- [x] Create this tracked checklist in `AGENTS.md`.
|
|
- [x] Inventory existing `client/src/sync_probe/` code and decide what can be reused.
|
|
- Reuse the existing synthetic beacon in `client/src/sync_probe/`.
|
|
- Reuse the existing Tethys capture harness in `scripts/manual/run_upstream_av_sync.sh`.
|
|
- Reuse and extend `lesavka-sync-analyze`; current gap is structured evidence output, not capture generation.
|
|
- [x] Define the phase-1 output contract:
|
|
- [x] `report.json`
|
|
- [x] `report.txt`
|
|
- [x] per-event rows with event id, video time, audio time, skew, and confidence
|
|
- [x] pass/fail verdict using preferred/acceptable/catastrophic thresholds
|
|
- [x] Add a deterministic local sync beacon source:
|
|
- [x] video flash pattern with event identity or cadence
|
|
- [x] simultaneous audio click/beep
|
|
- [x] stable event schedule suitable for automated detection
|
|
- [x] Add a Tethys-side capture probe:
|
|
- [x] capture Lesavka UVC video device
|
|
- [x] capture Lesavka UAC microphone device
|
|
- [x] record enough raw evidence for debugging when detection fails
|
|
- [x] detect video flashes
|
|
- [x] detect audio clicks
|
|
- [x] pair events and compute skew
|
|
- [x] Add a runner that can launch or instruct the Tethys probe safely over SSH without rebooting or restarting the desktop.
|
|
- [x] Store probe artifacts under `/tmp/lesavka-sync-probe-*` by default.
|
|
- [x] Keep the probe usable without Google Meet first; Google Meet validation is a later application-level check.
|
|
|
|
### Phase 2: Use Probe To Root-Cause Desync
|
|
- [x] Run probe through direct Lesavka UVC/UAC devices on Tethys.
|
|
- First live run reached the devices but exposed analyzer/tooling gaps instead of a valid skew report.
|
|
- Fixed the manual probe tunnel to preserve HTTPS/mTLS through SSH (`LESAVKA_SERVER_SCHEME=https`, `LESAVKA_TLS_DOMAIN=lesavka-server`).
|
|
- Fixed analyzer handling for MJPEG captures whose FFprobe metadata over-reports frames versus decodable video frames.
|
|
- [x] Compare client-generated event times against Tethys-observed times.
|
|
- The preserved Tethys capture had 323 decodable frames with constant brightness, so no video flash reached UVC.
|
|
- Server logs show the probe entered a stale upstream session and dropped audio as ~326 seconds late.
|
|
- [x] Identify whether delay appears before server planning, at server UAC sink, at UVC helper, inside Tethys device capture, or inside browser/WebRTC.
|
|
- Current root cause is server planning/session lifecycle, before UVC/UAC sink output.
|
|
- A previous one-sided microphone session started at 2026-04-30T22:59:52Z; the new probe at 2026-05-01T00:57:08Z inherited its stale playout epoch.
|
|
- [x] Add diagnostics for whichever stage is hiding delay.
|
|
- Existing server lifecycle/planning logs were enough to isolate this run; next gate should preserve these as structured artifacts.
|
|
- [x] Do not tune calibration offsets until gross backlog is ruled out.
|
|
- No calibration offsets were changed during the stale-session investigation.
|
|
- Current evidence points at lifecycle/session planning, not an offset problem.
|
|
|
|
### Phase 3: Fix Lesavka With Evidence
|
|
- [x] If stale upstream lifecycle is confirmed, reset shared A/V timing anchors when a new stream replaces an existing owner.
|
|
- Added a lifecycle guard so normal camera/microphone stream replacement clears stale shared timing anchors before re-pairing.
|
|
- Kept soft microphone recovery intentionally separate so it supersedes the mic owner without disturbing an active healthy camera/shared clock.
|
|
- Added regression coverage for stale timing-anchor replacement and soft microphone recovery preservation.
|
|
- [ ] If UAC sink backlog is confirmed, make UAC output freshness-bounded.
|
|
- [ ] If audio progress is marked too early, move/augment progress telemetry to reflect actual sink emission readiness.
|
|
- [ ] If UVC and UAC are using incompatible freshness semantics, unify them behind one live-media policy.
|
|
- [ ] If browser/WebRTC adds delay after devices are already synced, document the application boundary and add browser-specific mitigation or guidance.
|
|
|
|
### Phase 4: Gate And Release Criteria
|
|
- [x] Add deterministic unit/integration tests for probe analysis logic.
|
|
- [x] Add a hardware-in-the-loop/manual gate artifact schema for real Tethys probe runs.
|
|
- [x] Update `scripts/ci/media_reliability_gate.sh` to report probe evidence when present.
|
|
- Gate now reads `LESAVKA_SYNC_PROBE_REPORT_JSON`, `LESAVKA_SYNC_PROBE_REPORT_DIR`, or `target/media-reliability-gate/sync-probe/report.json`.
|
|
- Gate emits sync-probe verdict/check metrics, skew metrics, event counts, and a verdict info metric.
|
|
- [x] Require a fresh probe report before declaring lip sync fixed.
|
|
- Gate now supports `LESAVKA_REQUIRE_SYNC_PROBE=1`, which fails media reliability when a valid passing probe report is absent.
|
|
- Product/release judgment still requires a new live Theia/Tethys probe after the lifecycle fix is installed.
|
|
- [ ] Suggested thresholds:
|
|
- [x] preferred: p95 skew <= 35 ms
|
|
- [x] acceptable: p95 skew <= 80 ms
|
|
- [x] gross failure: sustained skew > 250 ms
|
|
- [x] catastrophic failure: any sustained skew near or above 1000 ms
|
|
|
|
### Open Questions
|
|
- [x] Decide whether the phase-1 beacon should run as a separate binary, a hidden client mode, or both.
|
|
- [x] Decide whether Tethys probe should be Rust-only, shell plus GStreamer, or a hybrid.
|
|
- [ ] Confirm whether sudo/Vault access is available for installing missing probe dependencies on Theia/Tethys.
|
|
- Non-sudo server journal inspection worked; noninteractive sudo over SSH still needs an explicit TTY/password path.
|
|
|
|
### Validation Evidence
|
|
- [x] `cargo test -p lesavka_server upstream_media_runtime::tests::lifecycle`
|
|
- [x] `cargo test -p lesavka_client sync_probe::analyze`
|
|
- [x] `cargo test -p lesavka_testing upstream_sync_script_tunnels_auto_server_addr_through_ssh`
|
|
- [x] `bash -n scripts/ci/media_reliability_gate.sh`
|
|
- [x] `cargo test -p lesavka_testing media_reliability_gate_reports_direct_sync_probe_evidence`
|
|
- [x] `LESAVKA_REQUIRE_SYNC_PROBE=1 ./scripts/ci/media_reliability_gate.sh`
|
|
- Used a synthetic passing report at `target/media-reliability-gate/sync-probe/report.json` to verify gate parsing/enforcement.
|
|
- This validates CI glue only; a real Theia/Tethys probe is still required for product judgment.
|
|
|
|
## Real Upstream Lip-Sync Fix Checklist
|
|
|
|
Context: the mirrored browser probe finally reproduced the real failure class on 2026-05-01:
|
|
`activity_start_delta_ms=+9591.1`. This means the end-to-end browser-visible path can still start video far ahead of audio. The fix target is not silence in the logs; it is a freshness-first A/V uplink whose startup can heal briefly but cannot drift into seconds of skew.
|
|
|
|
### Acceptance Criteria
|
|
- [ ] Mirrored browser probe passes with `activity_start_delta_ms <= 1000`.
|
|
- [ ] Steady-state preferred sync: median skew within `35 ms`.
|
|
- [ ] Steady-state acceptable sync: p95 absolute skew within `80 ms`.
|
|
- [ ] Any sustained or startup A/V split near `1000 ms` remains a hard failure.
|
|
- [ ] No stale audio backlog is ever drained into UAC to catch up.
|
|
- [ ] No stale video backlog is ever drained into UVC to catch up.
|
|
- [ ] Google Meet manual testing agrees with the mirrored probe instead of revealing hidden seconds-scale skew.
|
|
|
|
### Phase 0: Keep The Probe Honest
|
|
- [x] Split raw activity-start fields from filtered/coded paired-pulse fields in probe reports.
|
|
- [x] Print explicit raw first-video and first-audio timestamps in `report.txt`.
|
|
- [ ] Keep the mirrored browser probe as the release/blocking upstream A/V gate.
|
|
- [ ] Keep the old raw-device probe as a lower-level diagnostic only.
|
|
|
|
### Phase 1: Stop One-Sided Startup Drift
|
|
- [x] Default upstream planning must require both camera and microphone before live playout.
|
|
- [x] One-sided playout may only happen through an explicit compatibility override.
|
|
- [x] While pairing is overdue, keep replacing the waiting-side anchor with fresh packets instead of preserving stale startup anchors.
|
|
- [x] While awaiting the peer stream, keep only fresh pending camera packets.
|
|
- [x] While awaiting the peer stream, keep only fresh pending microphone packets.
|
|
- [x] Add tests proving the pairing window no longer expires into one-sided playout by default.
|
|
- [x] Add tests proving the explicit one-sided override still works for intentional single-stream scenarios.
|
|
|
|
### Phase 2: Bound UAC Freshness
|
|
- [x] Configure UAC `appsrc` as non-blocking and bounded.
|
|
- [x] Log and drop UAC appsrc push failures instead of treating enqueue as guaranteed playback.
|
|
- [ ] Flush/stop UAC cleanly on session close, replacement, and recovery.
|
|
- [x] Add tests or contract coverage for bounded UAC settings where practical.
|
|
|
|
### Phase 3: Add Real Timing Evidence
|
|
- [ ] Add server timing counters for first camera packet, first mic packet, first UVC write, and first UAC push per session.
|
|
- [ ] Add dropped-stale audio/video counters to diagnostics.
|
|
- [ ] Add a concise health explanation when startup pairing exceeds the healing window.
|
|
- [ ] Surface `Starting`, `Healing`, `Flowing`, `Lagging`, `Dropping`, and `Stale` states in chips/diagnostics from real path evidence.
|
|
|
|
### Phase 4: Recovery And Mid-Session Changes
|
|
- [ ] Make device changes trigger soft-pause, stream replacement, queue flush, and re-pairing.
|
|
- [ ] Keep recovery soft-first; reserve hard UVC/UAC gadget rebuilds for explicit guarded recoveries.
|
|
- [ ] Add cooldown/state guards so recovery buttons cannot wedge Theia.
|
|
- [ ] Ensure disconnect closes all client/server media tasks for the session.
|
|
|
|
### Phase 5: Verification Loop
|
|
- [x] Run focused upstream runtime tests.
|
|
- [x] Run server/client media contract tests.
|
|
- [x] Run `cargo check` for touched packages.
|
|
- [x] Bump version for the fix release.
|
|
- [ ] Run the mirrored browser probe on installed client/server.
|
|
- [ ] Run Google Meet manual validation.
|