lesavka/AGENTS.md

792 lines
68 KiB
Markdown

# Lesavka Agent Notes
## 0.18.3 Bundled Webcam A/V Migration Checklist
Context: manual Google Meet and mirrored-probe testing showed the split webcam
and microphone uplink design is too fragile under real browser/device pressure.
The new product contract is: when webcam video is present, microphone audio
travels with it on one client-owned upstream media stream. The server manages
freshness and smoothness after arrival; it no longer tries to make two racing
upstream channels look synchronized. Microphone-only remains supported as the
explicit no-camera path.
### Product Invariants
- [x] Webcam-enabled sessions use one bundled upstream media RPC by default.
- [x] Webcam-enabled sessions imply microphone capture when the server supports UAC.
- [x] The UI-selected camera, camera quality, microphone, speaker, gain, and
enable switches remain authoritative; defaults may not override visible UI state.
- [x] Client capture timestamps are the source of A/V sync truth for webcam sessions.
- [x] Server bundled playout rebases that client timeline onto a fresh local epoch.
- [x] Server bundled playout may drop stale packets, but must not rebuild sync by
independently pairing separate camera and microphone streams.
- [x] Mic-only sessions keep the existing microphone stream path.
- [x] Legacy split webcam/mic uplink is only an explicit compatibility escape hatch.
- [x] Manual probes and diagnostics clearly label `bundled-webcam-media` versus
`mic-only` so we never confuse the architectures during debugging.
- [x] Sync protection takes precedence over freshness and smoothness: bad mixed
bundle timing is dropped coherently instead of letting one side play alone.
- [x] Startup video is allowed to prime UVC if a first mixed bundle has bad
audio/video timing; the mismatched audio is dropped, preserving sync while
avoiding browser `Camera is starting` starvation.
- [x] An already-attached UVC gadget descriptor is the physical browser contract:
if it still advertises an older profile, server handshake/capture sizing
follows that live descriptor until a controlled gadget rebuild is allowed.
- [x] Bundled webcam sessions enforce a hard one-second freshness ceiling:
server-side reanchors may improve smoothness, but stale/future waits over
the live budget are dropped instead of preserving lag.
- [x] Bundled webcam sessions do not inherit legacy split-path static A/V
calibration offsets by default; the client-owned capture timeline is the
sync source, with only explicit bundled-offset env overrides allowed.
### Wire Protocol
- [x] Add `UpstreamMediaBundle` containing one optional video frame plus zero or
more audio packets from the same client capture timeline.
- [x] Add `StreamWebcamMedia(stream UpstreamMediaBundle)` to the relay service.
- [x] Advertise bundled support in the handshake capability set.
- [ ] Add compatibility tests proving older split RPCs are not used by normal
webcam sessions when bundled support is advertised.
### Client Migration
- [x] Change session startup to prefer bundled webcam media whenever camera and
microphone are both available.
- [x] Spawn one bundled capture/uplink task instead of separate camera and mic
tasks for webcam sessions.
- [x] Bundle camera frames and microphone packets into one freshness-bounded queue.
- [x] Bound the pre-bundle capture handoff channel so camera/mic workers drop
old events under pressure instead of building unbounded latency.
- [x] Drop lag-clamped camera and microphone source buffers before bundling;
stale source data may not be relabeled as fresh media.
- [x] Stamp all packets at capture/uplink enqueue before the async gRPC stream
can add misleading delay.
- [x] Preserve live UI device/profile changes by restarting the bundled capture
pipeline when selected camera, camera quality, or microphone changes.
- [x] Make launcher diagnostics expose the active upstream mode as first-class
text rather than inferring from separate camera/mic telemetry.
- [ ] Migrate sync-probe runner to the bundled path explicitly and remove any
normal probe dependence on split `StreamCamera` + `StreamMicrophone`.
### Server Migration
- [x] Implement `StreamWebcamMedia` and make it own both UAC and UVC/HDMI sinks
for one upstream session.
- [x] Schedule bundled packets by shared client capture timestamp instead of
startup-pairing independent streams.
- [x] Sanitize packet timestamps before bundling so stale/future source PTS values
cannot become the server's A/V sync truth.
- [x] Make server bundled scheduling use the client capture sidecar rather than
raw packet `pts`, and reset the bundled epoch on client-session changes.
- [x] Keep server freshness drops/reanchors active for bundled media.
- [x] Drop mixed A/V bundles coherently when one side fails freshness/sync planning.
- [x] Reanchor bundled playout when due times drift too far into the future, and
drop packets whose predicted playout age would exceed the one-second budget.
- [x] Activate the camera relay before opening the microphone sink so UVC can
become ready even if UAC setup is slow.
- [x] Log the first bundled video frame handed to the camera sink.
- [x] Honor the live configfs UVC descriptor when it differs from configured
defaults, preventing browsers from receiving frames outside negotiated caps.
- [x] Make the UVC control helper answer probe/commit requests from the same
live descriptor so Firefox/Chrome negotiation matches server frame output.
- [x] Continue reporting client timing and sink handoff diagnostics from bundled packets.
- [ ] Add bundled-mode counters for first bundle, first audio push, first video feed,
dropped stale bundles, and bundle queue age.
- [ ] Retire split-stream planner assumptions from the webcam path after the
bundled mode passes manual Google Meet and mirrored-probe validation.
### Validation
- [x] `cargo check -p lesavka_common -p lesavka_client -p lesavka_server --bins`
- [x] Focused handshake and launcher tests.
- [x] Focused UVC profile test for stale configured profile vs live attached descriptor.
- [ ] Focused server upstream-media tests including bundled stream acceptance.
- [ ] Install on both ends and verify diagnostics show bundled webcam media.
- [ ] Manual Google Meet test: camera starts, video is not black/unsupported,
audio is intelligible, and lip sync is inside the acceptable band.
- [ ] Mirrored browser probe: recordings open in Dolphin and automated analyzer
no longer depends on fragile split-channel assumptions.
## A/V Sync Probe And Lip-Sync Validation Checklist
Context: Google Meet testing on 2026-04-30 showed audio roughly 8 seconds behind video even though internal client/server telemetry reported fresh uplink packets. Treat this as a product correctness failure, not a calibration issue. Do not resume blind lip-sync tuning until the probe can explain where delay appears.
### Operating Principles
- Avoid hard-resetting USB, UVC, UAC, display managers, or remote hosts unless the user explicitly approves it.
- Prefer observation and reversible user-space probes before changing media pipelines.
- Treat Tethys-only SSH/device inspection as a development luxury, not a production dependency.
- Do not claim lip sync is fixed from internal telemetry alone; require end-to-end device-level evidence.
- Keep this checklist updated as work lands.
### Phase 1: Build The Probe
- [x] Create this tracked checklist in `AGENTS.md`.
- [x] Inventory existing `client/src/sync_probe/` code and decide what can be reused.
- Reuse the existing synthetic beacon in `client/src/sync_probe/`.
- Reuse the existing Tethys capture harness in `scripts/manual/run_upstream_av_sync.sh`.
- Reuse and extend `lesavka-sync-analyze`; current gap is structured evidence output, not capture generation.
- [x] Define the phase-1 output contract:
- [x] `report.json`
- [x] `report.txt`
- [x] per-event rows with event id, video time, audio time, skew, and confidence
- [x] pass/fail verdict using preferred/acceptable/catastrophic thresholds
- [x] Add a deterministic local sync beacon source:
- [x] video flash pattern with event identity or cadence
- [x] simultaneous audio click/beep
- [x] stable event schedule suitable for automated detection
- [x] Add a Tethys-side capture probe:
- [x] capture Lesavka UVC video device
- [x] capture Lesavka UAC microphone device
- [x] record enough raw evidence for debugging when detection fails
- [x] detect video flashes
- [x] detect audio clicks
- [x] pair events and compute skew
- [x] Add a runner that can launch or instruct the Tethys probe safely over SSH without rebooting or restarting the desktop.
- [x] Store probe artifacts under `/tmp/lesavka-sync-probe-*` by default.
- [x] Keep the probe usable without Google Meet first; Google Meet validation is a later application-level check.
### Phase 2: Use Probe To Root-Cause Desync
- [x] Run probe through direct Lesavka UVC/UAC devices on Tethys.
- First live run reached the devices but exposed analyzer/tooling gaps instead of a valid skew report.
- Fixed the manual probe tunnel to preserve HTTPS/mTLS through SSH (`LESAVKA_SERVER_SCHEME=https`, `LESAVKA_TLS_DOMAIN=lesavka-server`).
- Fixed analyzer handling for MJPEG captures whose FFprobe metadata over-reports frames versus decodable video frames.
- [x] Compare client-generated event times against Tethys-observed times.
- The preserved Tethys capture had 323 decodable frames with constant brightness, so no video flash reached UVC.
- Server logs show the probe entered a stale upstream session and dropped audio as ~326 seconds late.
- [x] Identify whether delay appears before server planning, at server UAC sink, at UVC helper, inside Tethys device capture, or inside browser/WebRTC.
- Current root cause is server planning/session lifecycle, before UVC/UAC sink output.
- A previous one-sided microphone session started at 2026-04-30T22:59:52Z; the new probe at 2026-05-01T00:57:08Z inherited its stale playout epoch.
- [x] Add diagnostics for whichever stage is hiding delay.
- Existing server lifecycle/planning logs were enough to isolate this run; next gate should preserve these as structured artifacts.
- [x] Do not tune calibration offsets until gross backlog is ruled out.
- No calibration offsets were changed during the stale-session investigation.
- Current evidence points at lifecycle/session planning, not an offset problem.
### Phase 3: Fix Lesavka With Evidence
- [x] If stale upstream lifecycle is confirmed, reset shared A/V timing anchors when a new stream replaces an existing owner.
- Added a lifecycle guard so normal camera/microphone stream replacement clears stale shared timing anchors before re-pairing.
- Kept soft microphone recovery intentionally separate so it supersedes the mic owner without disturbing an active healthy camera/shared clock.
- Added regression coverage for stale timing-anchor replacement and soft microphone recovery preservation.
- [ ] If UAC sink backlog is confirmed, make UAC output freshness-bounded.
- [ ] If audio progress is marked too early, move/augment progress telemetry to reflect actual sink emission readiness.
- [ ] If UVC and UAC are using incompatible freshness semantics, unify them behind one live-media policy.
- [ ] If browser/WebRTC adds delay after devices are already synced, document the application boundary and add browser-specific mitigation or guidance.
### Phase 4: Gate And Release Criteria
- [x] Add deterministic unit/integration tests for probe analysis logic.
- [x] Add a hardware-in-the-loop/manual gate artifact schema for real Tethys probe runs.
- [x] Update `scripts/ci/media_reliability_gate.sh` to report probe evidence when present.
- Gate now reads `LESAVKA_SYNC_PROBE_REPORT_JSON`, `LESAVKA_SYNC_PROBE_REPORT_DIR`, or `target/media-reliability-gate/sync-probe/report.json`.
- Gate emits sync-probe verdict/check metrics, skew metrics, event counts, and a verdict info metric.
- [x] Require a fresh probe report before declaring lip sync fixed.
- Gate now supports `LESAVKA_REQUIRE_SYNC_PROBE=1`, which fails media reliability when a valid passing probe report is absent.
- Product/release judgment still requires a new live Theia/Tethys probe after the lifecycle fix is installed.
- [ ] Suggested thresholds:
- [x] preferred: p95 skew <= 35 ms
- [x] acceptable: p95 skew <= 80 ms
- [x] gross failure: sustained skew > 250 ms
- [x] catastrophic failure: any sustained skew near or above 1000 ms
### Open Questions
- [x] Decide whether the phase-1 beacon should run as a separate binary, a hidden client mode, or both.
- [x] Decide whether Tethys probe should be Rust-only, shell plus GStreamer, or a hybrid.
- [ ] Confirm whether sudo/Vault access is available for installing missing probe dependencies on Theia/Tethys.
- Non-sudo server journal inspection worked; noninteractive sudo over SSH still needs an explicit TTY/password path.
### Validation Evidence
- [x] `cargo test -p lesavka_server upstream_media_runtime::tests::lifecycle`
- [x] `cargo test -p lesavka_client sync_probe::analyze`
- [x] `cargo test -p lesavka_testing upstream_sync_script_tunnels_auto_server_addr_through_ssh`
- [x] `bash -n scripts/ci/media_reliability_gate.sh`
- [x] `cargo test -p lesavka_testing media_reliability_gate_reports_direct_sync_probe_evidence`
- [x] `LESAVKA_REQUIRE_SYNC_PROBE=1 ./scripts/ci/media_reliability_gate.sh`
- Used a synthetic passing report at `target/media-reliability-gate/sync-probe/report.json` to verify gate parsing/enforcement.
- This validates CI glue only; a real Theia/Tethys probe is still required for product judgment.
## Real Upstream Lip-Sync Fix Checklist
Context: the mirrored browser probe finally reproduced the real failure class on 2026-05-01:
`activity_start_delta_ms=+9591.1`. This means the end-to-end browser-visible path can still start video far ahead of audio. The fix target is not silence in the logs; it is a freshness-first A/V uplink whose startup can heal briefly but cannot drift into seconds of skew.
### Acceptance Criteria
- [ ] Mirrored browser probe passes with `activity_start_delta_ms <= 1000`.
- [ ] Steady-state preferred sync: median skew within `35 ms`.
- [ ] Steady-state acceptable sync: p95 absolute skew within `80 ms`.
- [ ] Any sustained or startup A/V split near `1000 ms` remains a hard failure.
- [ ] No stale audio backlog is ever drained into UAC to catch up.
- [ ] No stale video backlog is ever drained into UVC to catch up.
- [ ] Google Meet manual testing agrees with the mirrored probe instead of revealing hidden seconds-scale skew.
### Phase 0: Keep The Probe Honest
- [x] Split raw activity-start fields from filtered/coded paired-pulse fields in probe reports.
- [x] Print explicit raw first-video and first-audio timestamps in `report.txt`.
- [x] Root-cause the 0.16.17 `raw_first_video_activity_s=0.000` artifact as the mirrored probe counting its own bright pre-start positioning card.
- [x] Make the mirrored stimulus pre-start screen dark/dim so only real flash pulses can be detected as video activity.
- [x] Add analyzer coverage proving dim pre-start positioning frames are ignored.
- [x] Replace generic light/dark mirrored flashes with color-coded event IDs.
- [x] Make mirrored audio pulses unique by the same event ID via pulse width plus tone frequency.
- [x] Teach the analyzer to decode mirrored video event IDs from color, not grayscale brightness.
- [x] Tighten real-camera color matching after 0.16.18 accepted washed-out brown/gray remnants as red/yellow events.
- [x] Preserve raw activity-start timing before cadence cleanup in coded reports.
- [x] Merge short audio envelope dropouts inside one coded pulse so a single tone burst cannot become two fake events.
- [x] Add diagnostic coded-pair correlation so stable large skew reports as measured failure instead of `not enough pairs`.
- [x] Make coded mirrored verdicts/calibration use matched coded pulses as authority; raw activity-start deltas are reported separately unless they agree with the coded pairs.
- [x] Print unpaired video/audio onsets in the human report so missed coded pulses are visible during probe triage.
- [ ] Keep the mirrored browser probe as the release/blocking upstream A/V gate.
- [ ] Keep the old raw-device probe as a lower-level diagnostic only.
### Phase 1: Stop One-Sided Startup Drift
- [x] Default upstream planning must require both camera and microphone before live playout.
- [x] One-sided playout may only happen through an explicit compatibility override.
- [x] While pairing is overdue, keep replacing the waiting-side anchor with fresh packets instead of preserving stale startup anchors.
- [x] While awaiting the peer stream, keep only fresh pending camera packets.
- [x] While awaiting the peer stream, keep only fresh pending microphone packets.
- [x] Send the latest camera packet from the client uplink queue instead of draining old-but-not-yet-stale video backlog.
- [x] Add tests proving the pairing window no longer expires into one-sided playout by default.
- [x] Add tests proving the explicit one-sided override still works for intentional single-stream scenarios.
### Phase 2: Bound UAC Freshness
- [x] Configure UAC `appsrc` as non-blocking and bounded.
- [x] Log and drop UAC appsrc push failures instead of treating enqueue as guaranteed playback.
- [x] Raise calibration offset limits enough to cover the measured MJPEG/UVC path delta without rejecting probe corrections.
- [x] Update the MJPEG/UVC factory audio baseline from the old `-45ms`/`+720ms` values to `+1260ms` as the mirrored probe exposes the fresh UAC-vs-UVC path delta.
- [x] Migrate untouched legacy `-45ms` factory/env calibration files on load so old installs actually receive the new baseline.
- [x] Make the video/audio-master wait offset-aware so a positive audio playout delay does not freeze UVC video while UAC sleeps before emission.
- [ ] Flush/stop UAC cleanly on session close, replacement, and recovery.
- [x] Add tests or contract coverage for bounded UAC settings where practical.
### Phase 3: Add Real Timing Evidence
- [ ] Add server timing counters for first camera packet, first mic packet, first UVC write, and first UAC push per session.
- [ ] Add dropped-stale audio/video counters to diagnostics.
- [ ] Add a concise health explanation when startup pairing exceeds the healing window.
- [ ] Surface `Starting`, `Healing`, `Flowing`, `Lagging`, `Dropping`, and `Stale` states in chips/diagnostics from real path evidence.
### Phase 4: Recovery And Mid-Session Changes
- [x] Make device changes trigger soft-pause, stream replacement, queue flush, and re-pairing.
- [ ] Keep recovery soft-first; reserve hard UVC/UAC gadget rebuilds for explicit guarded recoveries.
- [ ] Add cooldown/state guards so recovery buttons cannot wedge Theia.
- [ ] Ensure disconnect closes all client/server media tasks for the session.
### Phase 5: Verification Loop
- [x] Run focused upstream runtime tests.
- [x] Run server/client media contract tests.
- [x] Run `cargo check` for touched packages.
- [x] Bump version for the fix release.
- [x] Run the mirrored browser probe on installed client/server.
- 0.16.17 still failed: reported `activity_start_delta_ms=+6735.0`, but `raw_first_video_activity_s=0.000` exposed a probe false-positive from the pre-start screen. Paired pulses still showed real steady-state skew (`p95=411.8 ms`, `median=-99.0 ms`), so the product remains unfixed.
- 0.16.18 captured real colored/audio-coded events but the analyzer still bailed with `need at least 3 matching coded pulse pairs; saw 1`. Replaying that artifact after analyzer hardening now reports `gross_failure`: 16/16 coded pairs, p95 `775.7 ms`, activity start `-766.4 ms`, and drift `-2.8 ms`; the failure is stable audio-ahead/video-late skew, not random detector noise.
- 0.16.19 changes the shipped MJPEG/UVC audio playout baseline to `+720ms`; the next mirrored browser probe should move the measured median from about `-766ms` toward roughly `-46ms` before fine calibration.
- 0.16.19 mirrored browser probe did not move the measured skew: p95 `885.7 ms`, median `-788.4 ms`, activity start `-659.1 ms`, drift `-81.2 ms`. SSH inspection showed Theia was on commit `c348597`, but `/etc/lesavka/server.env` still contained `LESAVKA_UPSTREAM_AUDIO_PLAYOUT_OFFSET_US=-45000`; the new `+720ms` baseline was not actually installed. Patch the installer to migrate leaked legacy ambient `-45000` to `+720000` unless `LESAVKA_INSTALL_UPSTREAM_AUDIO_PLAYOUT_OFFSET_US` explicitly asks for the legacy value.
- 0.16.20 installed the `+720ms` offset (`/etc/lesavka/server.env` had `LESAVKA_UPSTREAM_AUDIO_PLAYOUT_OFFSET_US=720000`), but the mirrored browser capture contained no recognizable color pulses. Theia server logs showed repeated `upstream video frame dropped because the audio master never caught up inside the pairing window`; UVC was effectively starved by the positive audio delay instead of flowing delayed-but-fresh frames.
- 0.16.21 makes that wait offset-aware and adds a regression test proving a configured positive audio delay does not freeze UVC video while UAC sleeps before playout.
- Replaying the 0.16.21 artifact after 0.16.22 analyzer hardening changes the verdict from false `catastrophic_failure` to `gross_failure`: p95 `273.8 ms`, median `-188.4 ms`, 7 paired coded pulses. The raw activity-start delta (`-3620.7 ms`) is still printed, but it is ignored for verdict/calibration because it disagrees with coded pairs by `3432.3 ms`; unpaired video/audio onsets are printed for triage.
- 0.16.22 live mirrored run still failed with p95 `433.7 ms`, median `-359.4 ms`, and 5 paired coded pulses. Client telemetry showed camera uplink `latest_age_ms` repeatedly around `300-350 ms`, matching the measured skew; patch 0.16.23 to make video queues latest-only instead of draining stale-but-under-budget backlog.
- 0.16.23 local validation passed for fresh-queue behavior, uplink/probe freshness contracts, sync analyzer tests, client/server binary checks, and whitespace checks.
- 0.16.23 live mirrored run improved to p95 `215.2 ms`, median `+142.2 ms`, 13 paired coded pulses, and raw activity alignment within `6.6 ms` of coded pairs. Patch 0.16.24 makes the probe print local client and remote server versions before capture so every run records what was actually tested.
- 0.16.24 live mirrored run improved again to p95 `168.4 ms`, median `-19.1 ms`, 11 paired coded pulses, but still failed because individual paired pulses bounced between about `-168 ms` and `+45 ms`. Client logs showed the microphone uplink queue still accumulating depth `16`; patch 0.16.25 makes microphone uplink queues latest-only too so stale audio PTS cannot continue acting as the server timing master under backpressure.
- 0.16.25 removed the client mic backlog but exposed a stable hardware/browser path delta: p95 `557.3 ms`, median `-540.5 ms`, drift `+9.0 ms`, and fresh mic delivery ages around `2-10 ms`. Patch 0.16.26 raises the MJPEG/UVC factory audio delay to `+1260 ms` and expands the calibration clamp so this stable offset can actually be corrected instead of rejected.
- [ ] Re-run the mirrored browser probe after the pre-start false-positive fix.
- [ ] Run Google Meet manual validation.
## 0.17.0 Tyrannical Upstream Playout Checklist
Context: 0.16.x proved that queue tweaks and static calibration cannot guarantee lip sync. 0.17.0 changes the upstream contract: the server planner is authoritative, audio is the master, video follows by timestamp or freezes, and freshness wins over smooth-but-wrong playback.
### Hard Product Invariants
- [x] No normal live upstream playout may be more than 1000 ms behind the freshest known client capture frontier.
- [x] Video may not advance outside the audio playhead sync budget.
- [x] Audio should be continuous when possible, but stale audio must be dropped/skipped rather than drained.
- [x] Missing or late video should freeze/stutter instead of pulling audio backward.
- [x] Startup may be ugly, but must either enter fresh synced live mode or declare failure within 60 seconds.
- [x] Healing may be visible, but must prevent persistent seconds-scale skew.
- [x] Calibration may fine-tune sub-frame offsets only; it must not be required to rescue seconds-scale desync.
- [x] Bump Lesavka to 0.17.0 because this is a media-contract change, not a patch tune.
- [x] Bump patch follow-ups to 0.17.1 instead of reporting `version+revision` as the release version.
- [x] Add planner policy config: max live lag, max skew, startup timeout, target playout delay, and healing cooldown.
- [x] Reset 0.17 defaults so shipped audio/video offsets do not intentionally exceed the freshness budget.
- [x] Track latest camera/audio input timestamps in the server planner.
- [x] Track actual planned/emitted audio and video playheads.
- [x] Enforce audio as master: stale audio is dropped/skipped; it does not drain backlog.
- [x] Enforce video follower behavior: frames ahead of audio wait; frames too far behind audio are dropped so the previous frame freezes.
- [x] Re-anchor continuously when the live playhead falls outside the freshness budget, not only once at startup.
- [x] Keep startup paired-only by default and fail visibly after the startup timeout.
- [x] Add planner phases and counters to diagnostics/logs: acquiring, syncing, live, healing, failed; stale drops, skew drops, freshness heals, video freezes.
- [x] Keep UVC/UAC sinks as dumb consumers of planner-approved packets.
- [x] Update tests to prove stale media cannot be emitted and video cannot outrun/lag audio beyond policy.
- [x] Update manual/probe diagnostics so 0.17 reports the planner state being tested.
### Validation Targets Before Human Test
- [x] Unit tests for startup pairing, stale audio drop, stale video drop, reanchor, startup timeout, audio-master/video-follower rules.
- [x] Contract tests for installer defaults and version reporting.
- [x] `cargo check -p lesavka_client -p lesavka_server --bins`.
- [x] Focused `lesavka_testing` media/runtime contracts.
- [x] Only after all of the above, run the mirrored browser probe.
### Progress Log
- 2026-05-01: Added 0.17 planner defaults (`350ms` target playout, `1000ms` max live lag, `60000ms` startup timeout, `80000us` pair slack), reset MJPEG audio factory offset to `0`, and migrated old `-45ms`, `+720ms`, and `+1260ms` untouched baselines.
- 2026-05-01: Server planner now tracks latest input frontier, presented audio/video playheads, phase, stale drops, skew drops, reanchors, startup timeouts, and freezes.
- 2026-05-01: Runtime tests green for stale audio drop, stale video drop, audio-master/video-follower freeze, repeated reanchor, paired startup timeout, and planner snapshot basics: `cargo test -p lesavka_server upstream_media_runtime::tests -- --nocapture`.
- 2026-05-01: Added `GetUpstreamSync` RPC, `lesavka-relayctl upstream-sync`, launcher diagnostics text, and mirrored-probe before/after planner snapshots so 0.17 probe runs report the exact planner state under test.
- 2026-05-01: Validation green: `cargo test -p lesavka_server --lib --bins`, `cargo test -p lesavka_testing`, `cargo test -p lesavka_client --bins --lib`, and targeted installer/RPC/layout contracts.
- 2026-05-01: First installed 0.17.0 mirrored browser probe on client/server commit `3920e0a` failed honestly: planner reported fresh live state (`live_lag_ms=10`, `skew_ms=+20.7`) but browser-observed paired pulses showed audio late by median `+349.1ms`, p95 `429.1ms`, with 6 video freezes/skew drops. Replayed artifact after analyzer hardening now reports `gross_failure` instead of false raw-start `catastrophic_failure`.
- 2026-05-01: Patch follow-up models the observed MJPEG/UVC browser egress delta by defaulting video playout offset to `+350ms` and preserving the 1s freshness ceiling. Raw activity-start evidence is now ignored for verdict/calibration when it disagrees with paired pulses that are already failing directly. Existing early-0.17 `audio=0/video=0` factory/env calibration files migrate to the new `video=+350ms` default on load.
- 2026-05-01: Release identity cleanup: bumped the patched build to clean semver `0.17.1`; probe attribution now prints `client_version`/`server_version` separately from `client_revision`/`server_revision` and refuses old `client_full_version` output.
- 2026-05-01: 0.17.1 mirrored probe failed with video about `1.18-1.31s` behind audio and 761 planner video freezes. Root cause candidate: the client rebaser forced independent camera/mic pipelines onto one first-packet capture base, so a later-starting camera path was timestamped too early and looked permanently behind audio. Patch 0.17.2 anchors each stream to the shared monotonic clock at its own first packet time.
- 2026-05-02: 0.17.2 mirrored probe and Google Meet test showed major improvement but persistent sub-second late video. Root cause follow-up: the temporary `+350ms` factory MJPEG video playout offset matched the observed browser skew and also made the server skew guard freeze video against its own offset. Patch 0.17.3 restores factory video offset to `0ms`, migrates untouched `+350ms` install/calibration defaults back to `0ms`, and makes the skew guard offset-aware for intentional site calibration.
- 2026-05-02: 0.17.3 Google Meet manual test improved to roughly sub-second/near-quarter-second lip sync, but the mirrored analyzer could not pair pulses and the user still heard choppy background audio. Client logs showed Pulse microphone packets arriving unevenly with ages around `90-240ms`; patch 0.17.4 lowers Pulse mic `buffer-time`/`latency-time`, bounds the mic queue/appsink, and keeps mirrored-probe after-run planner diagnostics even when analysis fails.
- 2026-05-02: 0.17.4 mirrored run was salvageable after an SCP banner timeout, but analysis still failed with no close pulse pairs. The client log still showed `180-240ms` microphone delivery ages, pointing at server playout sleeps backpressuring the gRPC microphone stream. Patch 0.17.5 drains inbound microphone packets while waiting for scheduled UAC playout and retries browser-capture SCP fetches.
- 2026-05-02: 0.17.5 mirrored run still failed with insufficient paired evidence, and the client log still showed recurring `180-240ms` microphone packet age while camera age stayed near zero. Patch 0.17.6 splits oversized mic samples into `20ms` timestamped packets and keeps a short fresh server-side audio window instead of collapsing every pending burst to one newest chunk, aiming to preserve lip sync without making background audio choppy.
- 2026-05-02: 0.17.6 Bumblebee mirrored run proved Bumblebee mic packets are already `10ms`, but camera source timestamps were being rebased up to roughly `1.8s` into the future while mic packets sat around `180-240ms` old. Patch 0.17.7 adds a source lead cap (`80ms` default) to both direct and duration-paced client timestamp rebasing so bursty camera buffers cannot make the server wait for fake future video while fresh audio keeps moving.
- 2026-05-02: The launcher UI was still writing live control files with only camera/mic/speaker booleans, so media device combo changes were honestly only staged for the next child launch. Patch 0.17.7 extends the live media control file with base64-encoded camera source, camera profile, microphone source, and speaker sink choices; the relay child now rebuilds the affected camera, mic, or speaker pipeline when those selections change.
## 0.17.8 Sync-Only Output Bias Checklist
Context: 0.17.7 with the Bumblebee mic and BRIO camera removed the seconds-scale failure and left a stable browser-visible output skew: paired pulses were audio-late by roughly `+95ms` to `+183ms` (`median=+110.8ms`, `mean=+132.6ms`, `p95=+183.1ms`). Per user direction, 0.17.8 is only about establishing sync. Freshness and smoothness tuning are explicitly deferred until the mirrored probe is inside the sync band.
- [x] Do not change freshness ceilings, reanchor thresholds, queue policy, UAC smoothness, or startup healing behavior in this version.
- [x] Set the MJPEG/UVC factory video playout baseline to `+130ms` to counter the measured browser output audio-late bias.
- [x] Migrate only untouched old `0ms` and `+350ms` video defaults to the new `+130ms` baseline.
- [x] Preserve manual/site calibration values exactly as-is.
- [x] Update installer defaults so Theia receives the same `+130ms` baseline after reinstall.
- [x] Update docs and contracts to state the measured sync baseline clearly.
- [x] Run focused calibration/installer/runtime tests.
- [x] Run package checks before push.
- [x] Push clean semver `0.17.8` for installed client/server testing.
## 0.17.9 Sync-Only Audio-Master Presentation Checklist
Context: 0.17.8 installed cleanly on both ends (`314c55b`) but the mirrored probe failed with insufficient data: only 2 paired events, 1187 video freezes, and planner phase `healing`. The server was using the newest planned audio packet as the video-drop reference, so future audio planning could make current video look falsely behind before that audio was actually handed to UAC.
- [x] Keep 0.17.9 scoped to sync enforcement only; no freshness ceilings, queue policy, or smoothness changes.
- [x] Make video freeze/drop decisions compare against audio actually presented to UAC, not merely planned audio.
- [x] Make `wait_for_audio_master` wake on `mark_audio_presented` so video waits for real audio progress.
- [x] Add/adjust tests proving future planned audio alone cannot freeze video.
- [x] Run focused upstream planner tests.
- [x] Run package checks before push.
- [x] Push clean semver `0.17.9` for installed client/server testing.
## 0.17.10 Sync-Only Audio Catch-Up Grace Checklist
Context: 0.17.9 installed cleanly on both ends (`fbf274d`) and improved the mirrored probe to `median=+19.8ms`, `mean=-42.0ms`, and planner phase `live`, but it still failed with `p95=254.1ms`, only 6 paired pulses, `drift=341.9ms`, and 591 video freezes. The Theia server log showed repeated `upstream video frame dropped because the audio master never caught up inside the pairing window`, so the video follower was still giving up at the nominal video due time instead of spending a bounded sync grace to let audio catch up.
- [x] Keep 0.17.10 scoped to establishing sync; defer freshness and smoothness tuning until paired skew is stable.
- [x] Add `LESAVKA_UPSTREAM_AUDIO_MASTER_WAIT_GRACE_MS` with a `350ms` default so video can wait past nominal due time for UAC audio progress.
- [x] Stop dropping video solely because it woke late after a successful audio-master wait.
- [x] Preserve the global `1000ms` live-lag ceiling and existing stale-input planner rules.
- [x] Update installer defaults and operational docs for the sync grace.
- [x] Add/adjust tests proving video can wait through sync grace and still times out after grace expires.
- [x] Run focused upstream planner tests.
- [x] Run package checks before push.
- [x] Push clean semver `0.17.10` for installed client/server testing.
## 0.17.11 Sync-Only Browser Egress Compensation Checklist
Context: 0.17.10 installed cleanly on both ends (`4bb0f4a`) and produced a high-confidence coded-pulse failure instead of probe ambiguity. Browser-visible audio on Tethys arrived about `+891ms` to `+971ms` after the matching video (`median=+962.1ms`, `mean=+946.7ms`, `p95=+971.5ms`), while the server planner reported internal skew near zero (`planner_skew_ms=-56.9`). The missing model is UAC/browser output egress latency: Lesavka was treating `appsrc.push_buffer`/UAC enqueue as audio presentation, but the browser consumes that audio about one second later.
- [x] Keep 0.17.11 scoped to establishing sync; do not tune freshness ceilings or smoothness policy.
- [x] Raise the MJPEG/UVC factory video playout baseline from `+130ms` to `+1090ms` to align video with browser-visible UAC audio.
- [x] Allow intentional A/V playout offsets to exceed the generic future-wait freshness guard so the planner does not immediately reanchor away the sync compensation.
- [x] Widen calibration offset bounds so the measured browser egress baseline is representable instead of silently clamped.
- [x] Migrate untouched `0ms`, `+130ms`, and `+350ms` MJPEG/UVC video baselines to the new browser-visible baseline.
- [x] Preserve manual/site calibration values exactly as-is.
- [x] Update installer defaults so Theia receives the same browser-visible baseline after reinstall.
- [x] Update operational docs and installer contract tests for the new baseline.
- [x] Run focused calibration, installer, and runtime checks.
- [x] Push clean semver `0.17.11` for installed client/server testing.
## 0.17.12 Sync-Only Audio Marker Continuity Checklist
Context: 0.17.11 installed cleanly on both ends (`092c03a`) and fixed the constant browser-visible offset: raw activity start was `-17.0ms`, median coded skew was `+13.1ms`, and no server planner freeze/drop counters moved. The run still failed because only 5 coded pairs were usable and the last two had weak confidence (`0.58`, `0.33`) while the client log showed repeated microphone stale/superseded drops. The remaining sync blocker is audio-marker continuity, not another constant offset.
- [x] Keep 0.17.12 scoped to sync establishment; do not change video freshness or server freshness ceilings.
- [x] Stop using latest-only policy for microphone uplink packets; preserve oldest fresh audio chunks so speech/pulses stay continuous.
- [x] Keep microphone queue age bounded at `400ms` so continuity does not become unbounded backlog.
- [x] Apply the same bounded audio-continuity policy to the synthetic sync-probe audio queue.
- [x] Update contract tests to encode the new audio continuity policy.
- [x] Run focused client queue/probe contracts and package checks.
- [x] Push clean semver `0.17.12` for installed client/server testing.
## 0.17.13 Probe-Driven Calibration Loop Checklist
Context: 0.17.12 installed cleanly on both ends (`2b26fde`) and moved the paired-pulse median into the near-sync zone (`median=+71.6ms`, first/last `+59.8ms/-60.1ms`), but the run still failed with only 4 usable coded pairs and `p95=206.8ms`. Static guesses have reached diminishing returns. 0.17.13 adds a measured calibration loop so the mirrored probe can become the authority for site-specific browser-visible output compensation when, and only when, the analyzer has enough stable evidence.
- [x] Keep 0.17.13 scoped to probe-driven sync calibration tooling; do not change freshness ceilings, queue policy, UAC smoothness, or startup healing behavior.
- [x] Expose relay CLI calibration state and safe calibration actions for scripts.
- [x] Make the mirrored probe locate the analyzer `report.json` after a run and print the calibration decision.
- [x] Add opt-in probe calibration apply mode gated by `LESAVKA_SYNC_APPLY_CALIBRATION=1`.
- [x] Apply only when analyzer `calibration.ready=true`; otherwise refuse and print the analyzer reason.
- [x] Default probe-driven correction to video offset adjustment, because the measured residual is browser-visible UAC egress delay relative to UVC video.
- [x] Keep saving the measured offset as the site default opt-in via `LESAVKA_SYNC_SAVE_CALIBRATION=1`.
- [x] Update contract tests for relay CLI and manual probe script behavior.
- [x] Run focused relay/manual-script tests and package checks.
- [x] Push clean semver `0.17.13` for installed client/server testing.
Follow-up candidate: after 0.17.13 proves safe measured apply/refuse behavior, add a segmented live-calibration probe. The current browser probe uploads one WebM after recording ends, so it can only do measure/apply/rerun. A true same-session loop should run a longer stimulus, capture/analyze separate Tethys browser windows, apply calibration only between windows, and use the next window as the confirmation segment so before/after evidence is not mixed.
## 0.17.14 Segmented Live Calibration Probe Checklist
Context: 0.17.13 adds safe measured calibration apply/refuse plumbing, but it is still a single-window measure-then-rerun workflow. The next probe should keep the same Lesavka client/server session alive across multiple browser-capture windows so we can measure, apply, and re-measure without reinstalling or restarting the media path. This is the bridge from probe truth to blind/server-side calibration targets.
- [x] Keep 0.17.14 scoped to probe tooling and observability; do not change media planner policy.
- [x] Add optional multi-segment mirrored probe mode via `LESAVKA_SYNC_CALIBRATION_SEGMENTS`.
- [x] Keep one local stimulus browser and one headless Lesavka sender alive across all segments.
- [x] Run a fresh Tethys browser recording/analyzer pass per segment so before/after calibration evidence is not mixed in one WebM.
- [x] Allow calibration apply between segments using the 0.17.13 ready/refuse gate.
- [x] Capture planner and calibration snapshots before and after each segment for metric correlation.
- [x] Preserve single-segment default behavior for normal manual probes.
- [x] Update manual probe contract tests for segmented live calibration mode.
- [x] Run focused script/CLI checks and package checks.
- [x] Push clean semver `0.17.14` for installed client/server testing.
## 0.17.15 Adaptive Probe Metrics and Blind Target Checklist
Context: 0.17.14 can keep one Lesavka session alive across multiple measured segments, but we still need the probe to teach Lesavka what "good" looks like from server-only telemetry. 0.17.15 turns segmented runs into an adaptive calibration dataset: every segment gets probe truth, planner state, and calibration state joined into artifacts that can drive blind calibration/healing targets when Tethys/browser probe access is not available.
- [x] Keep 0.17.15 scoped to probe intelligence and metrics correlation; do not change media playout policy.
- [x] Add adaptive calibration ergonomics for longer near-continuous runs without changing the default one-segment probe.
- [x] Write per-run segment metrics as CSV and JSONL, joining analyzer verdicts with planner/calibration before/after snapshots.
- [x] Emit a blind-target candidate JSON from segments whose probe verdict passes, including server-visible planner lag/skew ranges.
- [x] Record when no segment is probe-good enough so blind-target generation refuses instead of inventing targets.
- [x] Keep calibration mutation gated by the existing ready/refuse logic and `LESAVKA_SYNC_APPLY_CALIBRATION=1`.
- [x] Update manual probe contract tests for the adaptive artifacts and controls.
- [x] Run focused script checks and package checks.
- [x] Push clean semver `0.17.15` for installed client/server testing.
## 0.17.16 Continuous Browser Evidence Checklist
Context: 0.17.15 proved the adaptive/live-edit loop is structurally useful, but each segment restarted the Tethys browser/getUserMedia receiver. That contaminates calibration segments with receiver startup noise and keeps usable coded pairs too low (`3-5` pairs instead of the `8+` needed for safe calibration). 0.17.16 is scoped to making the probe evidence continuous and attributable before any more media playout changes.
- [x] Keep 0.17.16 scoped to probe/tooling reliability; do not change server media playout policy, freshness ceilings, queue policy, or UAC smoothness.
- [x] Make adaptive mirrored runs keep one Tethys browser/getUserMedia session alive after the first segment.
- [x] Preserve single-segment/manual probe behavior by default.
- [x] Add an explicit `BROWSER_CONSUMER_REUSE_SESSION` control to the browser probe runner.
- [x] Verify browser uploads by start token so the fetched WebM belongs to the segment just triggered.
- [x] Track browser upload counts/tokens in `browser_consumer_probe.py` status JSON for post-run debugging.
- [x] Wire adaptive mirrored mode to reuse the existing Tethys receiver for segments 2+.
- [x] Keep calibration mutation behind the existing analyzer `calibration.ready=true` gate.
- [x] Update manual probe contract tests for continuous browser session behavior.
- [x] Run shell syntax checks, focused contract tests, and package checks.
- [x] Push clean semver `0.17.16` for installed client/server testing.
## 0.17.17 Adaptive Segment Failure Continuation Checklist
Context: 0.17.16 successfully installed and token-verified the first browser capture, but the adaptive run still aborted at segment 1 because `lesavka-sync-analyze` could not form coded pairs (`saw 0`) even though raw activity was nearly synced (`-32.2ms`). That prevented segments 2-4 from exercising the same long-lived Tethys browser session. 0.17.17 keeps adaptive data gathering alive across analyzer-only failures and preserves the failure evidence for summaries.
- [x] Keep 0.17.17 scoped to probe/tooling reliability; do not change media playout policy.
- [x] Add `BROWSER_ANALYSIS_REQUIRED` so browser captures can keep artifacts even when analyzer exits nonzero.
- [x] In adaptive mirrored mode, treat analyzer-only failures as nonfatal segment evidence.
- [x] Preserve analyzer failure reason, raw activity delta, and log path as structured segment artifacts.
- [x] Continue to abort on browser startup, recording, upload, or capture fetch failures.
- [x] Include analyzer failure artifacts in segment CSV/JSONL summaries.
- [x] Keep calibration apply impossible without a real analyzer `report.json` and `calibration.ready=true`.
- [x] Update manual probe contract tests for analyzer-failure continuation.
- [x] Run shell syntax checks, focused contract tests, and package checks.
- [x] Push clean semver `0.17.17` for installed client/server testing.
## 0.17.18 Explicit Probe Source Integrity Checklist
Context: the 0.17.17 Bumblebee mirrored run was configured with
`LESAVKA_MIC_SOURCE=alsa_input.usb-Neat_Microphones_Bumblebee_II_USB_Microphone-00.mono-fallback`,
but the Bumblebee was unplugged. The client log recorded `requested mic ... not found; using default`,
so the run measured the fallback/default microphone and must not be treated as Bumblebee calibration
evidence.
- [x] Treat 0.17.17 Bumblebee probe metrics as invalid for Bumblebee-specific calibration.
- [x] Keep ordinary launcher/client source selection forgiving by default.
- [x] Add strict explicit-source mode with `LESAVKA_REQUIRE_EXPLICIT_MEDIA_SOURCES=1`.
- [x] In strict mode, fail client startup when requested `LESAVKA_MIC_SOURCE` is unavailable instead of falling back to default.
- [x] Make the mirrored manual probe launch the real client with strict explicit-source mode by default.
- [x] Add contract coverage so the mirrored probe cannot regress to silent explicit-source fallback.
- [x] Run shell syntax checks, focused contract tests, and package checks.
- [x] Push clean semver `0.17.18` for installed client/server testing.
- [ ] Re-run the mirrored probe only after confirming the intended microphone is physically present and selected.
## 0.17.19 Fatal Required Source Failure Checklist
Context: the 0.17.18 run proved fallback was blocked, but the headless client kept running
camera-only after the required Bumblebee source failed to open. The server stayed in `acquiring`
for all four segments (`awaiting both upstream media streams`), the analyzer saw no color-coded
video pulses, and no calibration data was produced. Required-source failure must fail the probe,
not degrade into camera-only evidence.
- [x] Treat the 0.17.18 run as a required-microphone setup failure, not a lip-sync measurement.
- [x] Keep strict no-fallback behavior from 0.17.18.
- [x] Abort the client process when an explicit required microphone source cannot start.
- [x] Abort the client process when an explicit required camera source cannot start.
- [x] Run shell syntax checks, focused contract tests, and package checks.
- [x] Push clean semver `0.17.19` for installed client/server testing.
- [ ] Re-run only after `LESAVKA_MIC_SOURCE` is listed by the local audio stack.
## 0.17.20 Provisional Adaptive Calibration Checklist
Context: the 0.17.19 adaptive run reached live media and produced browser-visible sync evidence,
but the probe still did not calibrate. Each segment either had only 3 coded pairs or an analyzer
failure, so the existing `calibration.ready=true` gate refused every adjustment. That was safe for
saved defaults, but wrong for the user-directed adaptive probe goal: live segments should be allowed
to make bounded provisional corrections from imperfect-but-usable evidence, then let the next segment
judge whether the correction helped.
- [x] Treat the 0.17.19 run as a measurement-only run, not a calibration run.
- [x] Keep permanent saved calibration gated by analyzer-ready evidence.
- [x] Add provisional adaptive calibration mode enabled by adaptive runs by default.
- [x] Allow provisional corrections from low-but-usable paired-pulse reports when p95 and drift are bounded.
- [x] Use median skew as the provisional correction source and default to video offset adjustment.
- [x] Clamp provisional correction gain and max step so one noisy segment cannot swing the site offset wildly.
- [x] Record decision mode and provisional recommendations in segment metrics.
- [x] Update manual probe contract tests for provisional calibration controls and output.
- [x] Run shell syntax checks, focused contract tests, and package checks.
- [x] Push clean semver `0.17.20` for installed client/server testing.
## 0.17.21 Calibrate-Then-Confirm Probe Checklist
Context: 0.17.20 made adaptive runs capable of provisional calibration between measured
segments, but that still did not strictly guarantee the user-requested flow: run the probe,
calibrate the server while it is running, then run a post-calibration test segment. It also
still ignored analyzer-failure captures that contained a bounded raw activity delta. 0.17.21
makes the probe behavior explicit: calibration segments mutate active server calibration,
confirmation segments do not mutate it, and adaptive runs fail unless confirmation passes.
- [x] Treat `LESAVKA_SYNC_CALIBRATION_SEGMENTS` as calibration windows in adaptive confirm mode.
- [x] Add post-calibration confirmation windows via `LESAVKA_SYNC_CONFIRMATION_SEGMENTS`.
- [x] Disable calibration apply during confirmation windows so they are a clean test.
- [x] Require confirmation pass by default in adaptive confirm mode.
- [x] Add bounded raw-activity provisional calibration for analyzer failures that still report a raw A/V delta.
- [x] Include confirmation summaries and segment phase in adaptive artifacts.
- [x] Update manual probe contract tests for calibrate-then-confirm behavior.
- [x] Run shell syntax checks, focused contract tests, and package checks.
- [x] Push clean semver `0.17.21` for installed client/server testing.
## 0.17.22 Adaptive Calibration Export Fix Checklist
Context: the 0.17.21 run proved the calibrate-then-confirm shape worked, but no calibration
was actually applied. The calibration decision printed usable provisional recommendations, then
reported `provisional_calibration_enabled=false` and left the active offset unchanged. Root cause:
the Bash defaults were shell variables, while the embedded Python decision helper reads environment
variables via `os.environ`.
- [x] Treat the 0.17.21 run as confirmation that the probe shape works but calibration was disabled by script plumbing.
- [x] Export adaptive/provisional/raw-failure/confirmation knobs after defaulting them.
- [x] Add contract coverage so provisional calibration defaults cannot silently stop reaching the Python decision helper.
- [x] Run shell syntax checks, focused contract tests, and package checks.
- [x] Push clean semver `0.17.22` for installed client/server testing.
## 0.17.23 Audio Probe Robustness Checklist
Context: recent mirrored runs consistently detected more video events than audio events. That can
represent a real audio path problem, but the probe should not under-count audio just because the
room/speaker/mic path is quieter or mildly chopped. Harden the test tooling before interpreting
low paired-pulse counts as product failure.
- [x] Raise the default local stimulus tone level and expose it as `PROBE_AUDIO_GAIN`.
- [x] Pass the configured audio gain into the local stimulus browser page.
- [x] Lower the analyzer audio peak floor so faint but valid probe tones are accepted.
- [x] Smooth the audio envelope before thresholding so single-window dips do not erase pulses.
- [x] Merge longer internal tone dropouts inside one pulse without merging adjacent 1s pulses.
- [x] Add analyzer tests for faint tones and longer within-pulse audio dropouts.
- [x] Update manual probe contract coverage for the audio-gain control.
- [x] Run focused analyzer/manual-probe tests and package checks.
- [x] Push clean semver `0.17.23` for installed client/server testing.
## 0.17.24 Probe Truthfulness And Localization Checklist
Context: the 0.17.23 run proved adaptive calibration is now live-editing the server,
but confirmation still failed. Segment 3 passed and triggered a provisional calibration
nudge, while the confirmation segment failed with a near-centered median but high p95/drift.
This means the fastest high-quality path is localization tooling, not another static offset
guess.
- [x] Treat the latest failure as timing instability/outlier drift until the probe proves otherwise.
- [x] Fix analyzer-failure raw activity delta parsing so bounded raw-delta calibration can use the evidence it prints.
- [x] Stop marking `blind-targets.json` ready from calibration-only passes when confirmation segments exist and fail.
- [x] Emit combined `segment-events.csv` and `segment-events.jsonl` artifacts so each run exposes per-pulse skew and confidence across segments.
- [ ] Use the next run to decide whether bad p95 is caused by low-confidence analyzer pairings, camera/mic capture instability, or server planner/output jitter.
- [ ] Add stage-local timing evidence for stimulus schedule, client capture onsets, server output timing, and browser/device capture if the event table still cannot isolate the source.
- [ ] Only save calibration defaults after a confirmation segment passes.
## 0.17.25 Client/Server Timing Sidecar Checklist
Context: the probe should remain an external truth check, not a runtime dependency.
Production sync needs client/server-only timing evidence that can predict and heal jitter before
browser/probe validation. Attach timing metadata to media packets first; add a separate timing RPC
only if packet-attached metadata cannot explain the next failure.
- [x] Add client send/capture/queue timing metadata to upstream camera packets.
- [x] Add client send/capture/queue timing metadata to upstream microphone packets.
- [x] Record the latest packet timing samples on the server when media packets arrive.
- [x] Expose blind client/server timing metrics through `GetUpstreamSync` and `lesavka-relayctl upstream-sync`.
- [x] Include those timing metrics in segmented mirrored-probe summaries.
- [x] Add planner tests covering client capture skew, client send skew, server receive skew, and queue ages.
- [ ] Use the next mirrored run to compare browser p95/drift against client capture/send skew and server receive skew.
- [x] Instrument UVC/UAC/HDMI sink handoff timing before waiting for another run.
## 0.17.26 Blind Timing Window And Sink Handoff Checklist
Context: the next probe should not be required to discover that the server is
blind between "packet arrived" and "packet handed to UAC/UVC/HDMI". Close
measurement gaps before tuning any new healing controller.
- [x] Retain rolling client capture/send skew windows inside the server.
- [x] Retain rolling server receive skew and client queue age windows.
- [x] Record audio/video sink handoff instants and schedule lateness at the server boundary.
- [x] Expose sink handoff skew, sink lateness, and rolling p95 timing metrics through `GetUpstreamSync`.
- [x] Include rolling blind metrics in mirrored-probe CSV/JSONL summaries and blind targets.
- [x] Add planner tests for rolling timing windows and sink handoff timing.
- [ ] Use the next mirrored run only for correlation/tuning: decide whether the controller should adjust playout delay, offset, or drop/freeze policy from these blind metrics.
## 0.17.27 Runtime Blind Healing Checklist
Context: if client/server-only timing is stable enough, Lesavka should make
small runtime corrections without waiting for the external probe. The probe
remains the truth judge and root-cause localizer, not the production dependency.
- [x] Add a server-owned blind healer loop enabled by default with `LESAVKA_UPSTREAM_BLIND_HEAL=0` escape hatch.
- [x] Gate blind healing on rolling sample counts plus client-send, server-receive, queue-age, sink-late, and sink-handoff p95 limits.
- [x] Apply bounded transient offset nudges from sink handoff skew without saving them as site defaults.
- [x] Expose sample counts in `upstream-sync` and mirrored probe artifacts so failed runs can separate "insufficient evidence" from real timing failure.
- [x] Emit `root-cause-summary.json` from mirrored probe runs to classify failing layers instead of eyeballing raw metrics.
- [x] Add unit tests for apply/refuse/target behavior in the blind healer.
- [ ] Next run should identify the failing layer if confirmation still fails: client capture/uplink, network/server receive, server planner, server sink handoff, or external USB/browser/probe boundary.
## 0.17.28 Blind Timing Normalization Checklist
Context: the first preferred confirmation pass showed the probe-calibrate-confirm
loop can work, but also revealed two blind-healing blockers: sink handoff samples
stayed empty, and client timing skew included a false cross-pipeline PTS offset.
- [x] Pair server sink handoff samples by planned due time, not raw local PTS, so offset-compensated streams still produce handoff evidence.
- [x] Normalize client sidecar capture/send windows onto the shared capture clock using queue delivery age instead of raw per-pipeline packet PTS.
- [x] Add tests proving sink handoff survives large offset-compensated local PTS gaps.
- [x] Add tests proving audio/video timing metadata no longer copies packet PTS domains into blind sidecar fields.
- [ ] Next mirrored run should show non-zero `planner_sink_handoff_window_samples` and much smaller client send/capture p95 skew before trusting blind healing.
## 0.17.29 Enqueue-Bound Client Timing Checklist
Context: the first blind-healing runs showed huge client capture/send skew even though media packets
were latest-only. The sidecar timestamps were being written in async sender tasks after queueing, so
parallel scheduling delay leaked into the diagnostic clock and made blind healing distrust the wrong
layer.
- [x] Stamp client timing metadata at the capture/enqueue boundary instead of the async gRPC send boundary.
- [x] Keep async sender updates limited to queue depth and queue age so scheduling delay stays observable but does not rewrite capture/send time.
- [x] Pair server-side client timing samples by nearby enqueue/send time before reporting rolling skew windows.
- [x] Add regression tests proving queue delay no longer changes capture/send timestamps.
- [x] Push clean semver `0.17.29` for installed client/server testing.
- [x] Use the next mirrored run to confirm client capture/send p95 drops from seconds to single-digit milliseconds.
## 0.17.30 Raw-Failure Calibration Safety Checklist
Context: the 0.17.29 mirrored run confirmed the client-side scheduling leak is fixed, but the probe
then applied large opposite calibration nudges from analyzer failures with zero or one coded pair.
Raw activity deltas are useful diagnostic breadcrumbs; they are not safe steering evidence when coded
pairing collapses.
- [x] Treat the 0.17.29 run as proof that client sidecar timing is now trustworthy enough to move the investigation downstream.
- [x] Default raw analyzer-failure calibration to off instead of inheriting provisional calibration.
- [x] Add `LESAVKA_SYNC_RAW_FAILURE_MIN_PAIRS` so even explicit raw-failure calibration refuses weak coded evidence.
- [x] Print the raw-failure pair floor in calibration decisions and segment artifacts.
- [x] Prefer server-side receive/sink blockers over probe-pairing blockers when root-cause evidence is available.
- [x] Update manual probe contract coverage for the safer defaults and refusal reason.
- [ ] Re-run the probe-calibrate-confirm flow; analyzer failures should diagnose but not mutate calibration unless raw fallback is explicitly enabled and has enough coded support.
- [ ] If client send/capture p95 stays low and server receive p95 stays high, localize the transport/server-receive timing layer next.
## 0.17.31 Server Receive Timing Drain Checklist
Context: the 0.17.30 mirrored run kept calibration stable and proved the client enqueue-side timing
fix held: client capture/send p95 stayed in single-digit milliseconds after startup. The remaining
blind blocker moved to server receive timing, where video packets were only timestamped when the
camera playout loop woke up after waiting for the audio master.
- [x] Treat the 0.17.30 run as confirmation that raw-failure calibration no longer whipsaws offsets.
- [x] Keep polling and timing inbound camera packets while video waits for the audio master.
- [x] Keep polling and timing inbound camera packets while video waits for its due time.
- [x] Coalesce pending video to the freshest packet during those waits so the server does not build a stale video backlog.
- [x] Add regression coverage that video timing is recorded at enqueue/drain time before scheduler waits.
- [x] Re-run the probe-calibrate-confirm flow; `planner_server_receive_abs_skew_p95_ms` should fall if this was the receive-side scheduling leak.
- [ ] If receive p95 remains high after this, inspect actual gRPC/HTTP2 stream delivery and OS/network scheduling rather than static calibration.
## 0.17.32 Blind Heal Opt-In And Stability Checklist
Context: the 0.17.31 mirrored run confirmed the receive-side drain worked: client send and server
receive p95 both stayed near 50ms. The run still failed as probe pairing, and the server-side blind
healer silently changed calibration during the probe run because it was enabled by default and allowed
sink handoff p95 near 240ms.
- [x] Treat the 0.17.31 run as confirmation that the server receive scheduling leak is fixed.
- [x] Default runtime blind healing to disabled so probe-calibration runs cannot be contaminated by hidden server nudges.
- [x] Require explicit server-side `LESAVKA_UPSTREAM_BLIND_HEAL=1` before blind healing mutates transient calibration.
- [x] Tighten the blind-heal sink handoff p95 gate from 250ms to 120ms before applying runtime nudges.
- [x] Align mirrored-run root-cause summaries with the stricter sink handoff stability threshold.
- [x] Add regression coverage for default-disabled blind healing and noisy sink-handoff refusal.
- [ ] Re-run the normal probe-calibrate-confirm flow; `calibration_source` should remain non-blind unless the server was explicitly started with blind healing.
- [ ] If the probe still produces only one or two visual events while blind metrics stay stable, move the next fix to stimulus/browser/probe detection instead of transport timing.
## 0.17.33 Probe Detection Robustness Checklist
Context: the 0.17.32 mirrored run proved hidden blind healing stayed off and calibration remained
stable, but the external browser probe still produced too few pairs. The capture path is now limited
by analyzer robustness: the webcam sees the screen plus room background, and the microphone hears the
stimulus plus environmental noise.
- [x] Treat probe pairing as the top priority before applying more calibration logic.
- [x] Replace whole-frame color/brightness averaging with adaptive video ROI detection that follows the changing stimulus region.
- [x] Add regression coverage for a small flashing screen region inside a larger static frame.
- [x] Add tone-aware audio detection using the stimulus frequency palette so steady hum/noise is less likely to become a pulse.
- [x] Add regression coverage for test-tone pulses under strong low-frequency background hum.
- [x] Add coded-video fallbacks for overexposed screen captures: pulse-shaped color filtering, brightness fallback, and duplicate-frame normalization.
- [x] Make the local stimulus more probe-friendly by defaulting to kiosk mode and darker saturated colors.
- [x] Generate a `manual-review/index.html` with embedded segment captures so runs are easy to inspect by eye.
- [ ] Re-run the mirrored probe and confirm pair counts rise enough for calibration-ready evidence.
- [ ] If pair counts improve but p95 remains high, move next to server sink handoff jitter and late-run queue pressure.
## 0.17.34 Stimulus Verification Checklist
Context: the first 0.17.33 run did not prove the analyzer changes because the
browser-control path timed out before the local stimulus was started, and the
operator did not see colored flashes or hear coded tones. A sync probe must
verify its own source before asking the analyzer to explain the capture.
- [x] Add a short visible/audible local stimulus preview before the real client starts so framing and audio audibility are human-checkable.
- [x] Record stimulus start/preview tokens, audio context state, and active pulse metadata in `stimulus-status.json`.
- [x] Make each mirrored segment fail early if the local page does not observe `/start` or WebAudio is not running.
- [x] Retry the Tethys browser recording `/start` request to survive transient SSH banner timeouts.
- [x] Open the manual review capture directory in Dolphin after summarization so copied Tethys captures are immediately inspectable.
- [ ] Re-run the mirrored probe and confirm the preview is visible/audible before trusting any pairing diagnosis.
## 0.17.35 Right-Eye Capture Diagnostics Checklist
Context: manual Tethys testing showed the desktop was awake and HDMI was on, but the right-eye feed stayed black. Server logs showed `eye=r` reached `PLAYING` and then hit a V4L2/GStreamer `Invalid argument (22)` poll error before any frame was pushed, while the left eye streamed normally.
- [x] Confirm Tethys was not asleep: X11 reported HDMI connected at 1920x1080 and `Monitor is On`.
- [x] Remove stale Tethys browser-probe processes after mirrored probe runs so manual Google Meet testing does not compete with old recorder sessions.
- [x] Propagate late eye-capture GStreamer bus errors into the gRPC video stream so the launcher reports a preview stream error instead of a silent black window.
- [x] Add a first-frame watchdog for eye capture streams so opened-but-empty sources surface as explicit diagnostics.
- [ ] Re-run a manual two-eye session and confirm right-eye failures now appear in the session log with the concrete source error.
- [ ] If `eye-r` still reports `poll error ... EINVAL`, recover/reseat the right HDMI capture path or add a dedicated eye-capture soft recovery path separate from UVC/UAC.
## 0.17.36 Call Media Stability Checklist
Context: a manual Google Meet test on 0.17.34/0.17.35 was much worse than the earlier baseline:
remote audio sounded like choppy chunks/clicks and the video was visibly choppy. Live Theia
configuration showed the installer-generated UVC gadget was advertising `640x480 @ 20fps`, the
client camera pipeline was using that server profile as the outgoing packet profile even when the UI
selected `720p@30`, and the UAC speech path used a very tight 20ms/5ms sink buffer/latency with
downstream appsrc dropping.
- [x] Treat probe/analyzer measurements as suspect until the copied Tethys capture is visually and audibly stable.
- [x] Make UI-selected camera quality king: the launcher camera profile now drives the outgoing uplink profile by default instead of being downscaled to stale server caps.
- [x] Keep an explicit `LESAVKA_CAM_LOCK_TO_SERVER_PROFILE=1` lab escape hatch for debugging the server UVC gadget contract.
- [x] Restore generated UVC gadget fallback defaults to `1280x720 @ 30fps` for sessions without an explicit UI/session profile.
- [x] Align runtime UVC fallback defaults with the generated 30fps gadget profile.
- [x] Raise UAC speech sink buffering and appsrc limits so speech favors intelligibility over bare-minimum latency under jitter.
- [x] Stop default downstream appsrc leaking on the UAC speech path; shredded chunks are worse than modest added latency for calls.
- [ ] Reinstall/restart Theia services so `/etc/lesavka/uvc.env` is refreshed from `640x480 @ 20fps` to `1280x720 @ 30fps`.
- [ ] Re-run manual Google Meet before trusting mirrored probe calibration; verify speech is intelligible and video cadence is stable by eye.
## 0.17.37 UVC Format Compatibility Checklist
Context: after 0.17.36, Google Meet showed `Video Format Not Supported`. The client correctly
captured the UI-selected `720p@30` profile, but it emitted those frames into a server UVC gadget still
advertising `640x480 @ 20fps`. USB camera consumers require advertised caps and frame payloads to
match; otherwise the feed is rejected before we can evaluate smoothness or sync.
- [x] Preserve UI-selected capture quality as the source capture profile.
- [x] Restore safe default UVC emission to the negotiated server gadget profile so browsers see frames matching the camera format they negotiated.
- [x] Keep `LESAVKA_CAM_EMIT_UI_PROFILE=1` as an explicit lab-only opt-in until the server can reconfigure the UVC gadget from the UI/session profile.
- [x] Keep `LESAVKA_CAM_LOCK_TO_SERVER_PROFILE=1` as a safety override that wins over experimental UI-profile emission.
- [ ] Add a real server-side UVC profile reconfigure path before making UI-selected quality drive the gadget-advertised output format.