# Sanad Package 2 — Premium Communication — PLAN (locked scope)

> **Status: IMPLEMENTED (self-contained), built 2026-06-22, pending Docker build and
> on-robot test.** Vendored from SanadV3, mirrors P1. Dashboard port **:8012**.
> Structural validation passed (compile, import resolution, shim-symbol coverage,
> namespace bootstrap, `license_check P2` entitled, YAML).
>
> **Refinement (2026-06):** the engine is now vendored from **SanadV3**, not
> plain Sanad. SanadV3 already implements the mask/face subsystem and the
> evolved voice/audio/arm engine P2 needs — so P2 *vendors and wires* it rather
> than building the mask from scratch. See "Why vendor SanadV3".

P2 is a **superset of P1**: everything P1 does, plus multilingual auto-detect,
voice-commanded **arm gestures**, gestures while speaking, and a lip-syncing LED
face on the BLE "Shining Mask".

## Locked scope (decisions taken)
- **Motion = arm gestures only (first pass).** Voice-command **locomotion**
  (robot walking/turning) is **DEFERRED** to a later pass — `voice/movement_dispatch.py`
  + `G1_Controller/loco_controller.py` and SanadV3's `dashboard/routes/_arbiter.py`
  (loco/nav leg arbitration) are intentionally **out of scope here**.
- **Mask = included, BLE driven from INSIDE the P2 container** (vendor the Mask
  controller; `bleak` + host BlueZ/D-Bus). No separate `sanad-mask` side-car.
- **Single self-contained container** (no `Sanad_Core`, no `sanad-base`) — like P1.
- **Keyless** ship; customer supplies their own Gemini key.

## Why vendor SanadV3 (not plain Sanad)
SanadV3 is **not a fork** of Sanad — it is plain Sanad **plus exactly the
subsystems P2 needs**, reachable through the identical `Project.Sanad.*` import
surface the dashboard routes already use. Verified in the tree:

- `face/mask_face.py` → `FaceController` (BLE asyncio-loop thread + reconnect
  supervisor + `set_speaking`/`set_mouth` lip-sync inputs). Imports only
  `Project.Sanad.config.BASE_DIR`, `core.config_loader`, `core.logger`, then the
  flat Mask lib via `sys.path.insert(mask_dir)`.
- `face/face_motion.py` → `LifelikeFace` (saccades, varied blinks,
  idle/listening/thinking/speaking states, timed reactions, smooth lip-sync),
  with automatic fallback to the flat lib's `FaceAnimator`.
- `config/mask_config.json` → already env-driven
  (`SANAD_MASK_DIR`/`ADDRESS`/`NAME_PREFIX`/`ADAPTER`), with
  `brightness`/`fps`/`lifelike`/`autostart` and persisted face colors.
- `dashboard/routes/mask.py` → a **failure-safe** `/api/mask` router; every
  handler is `asyncio.to_thread`-wrapped and maps errors to 503/409/500 so a
  missing mask never crashes the dashboard.
- Lip-sync chain → `gemini/script.py` emits `[[MOUTH:n]]` (0–3) RMS markers;
  `gemini/subprocess.py` exposes `register_mouth_callback`; `core/brain.py`
  `set_gestural_speaking` emits `brain.gestural_speaking_changed`.
- Evolved voice/audio → `voice/live_voice_loop.py` (position-based dedup),
  `voice/text_utils.py` (Arabic normalize + `maybe_trigger_arm`),
  `voice/audio_manager.py` (per-instance + TTL throttle) — supersets of plain
  Sanad's race-prone equivalents.
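
The failure-safe behaviour described for `dashboard/routes/mask.py` can be pictured with a dependency-free sketch. Only the `asyncio.to_thread` wrapping and the 503/409/500 outcomes come from the list above; the exception classes, the `safe_call` helper, and the exact error-to-status mapping are illustrative assumptions, not the vendored router's code.

```python
import asyncio


class MaskUnavailable(Exception):
    """Hypothetical: the mask subsystem is missing or not connected."""


class MaskBusy(Exception):
    """Hypothetical: the request conflicts with the mask's current state."""


async def safe_call(fn, *args):
    """Run a blocking mask call off the event loop and never let it raise."""
    try:
        result = await asyncio.to_thread(fn, *args)
        return 200, {"ok": True, "result": result}
    except MaskUnavailable as exc:
        return 503, {"ok": False, "error": str(exc)}
    except MaskBusy as exc:
        return 409, {"ok": False, "error": str(exc)}
    except Exception as exc:  # anything unexpected becomes a plain 500
        return 500, {"ok": False, "error": str(exc)}


def set_brightness(level):
    # Stand-in for a blocking BLE call on the face controller.
    if level is None:
        raise MaskUnavailable("mask subsystem not loaded")
    return level


ok_status, ok_body = asyncio.run(safe_call(set_brightness, 40))
down_status, down_body = asyncio.run(safe_call(set_brightness, None))
```

In this sketch a missing mask surfaces as a 503 payload instead of an exception, which is the property that keeps the dashboard alive.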

Building the mask from plain Sanad (the prior plan) would mean **re-deriving all
of the above by hand** and inheriting plain Sanad's known bugs (content-based
dedup, no `[[MOUTH:n]]`, non-atomic `index.json` writes). The one thing SanadV3
does **not** solve is BLE-inside-a-container (it ran the mask on the host
`g1_env`) — but that is a Docker capability/mount problem P2 owns regardless of
base engine.

**Correction to the prior plan:** the lip-sync source of truth is **Gemini's
`[[MOUTH:n]]` markers over the event bus**, *not* a raw audio-amplitude tap, and
the driver is **`FaceController` + `LifelikeFace`**, *not* `Mask/talking.py
TalkingFace`. P2 adopts the marker path.
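
To make the marker path concrete, here is a minimal sketch of how a parent process might pull `[[MOUTH:n]]` levels out of a text line and forward them to a callback. It assumes the markers arrive inline in text lines from the voice child; the helper name `dispatch_mouth_markers` is hypothetical, and the real parsing lives in the vendored `gemini/` modules.

```python
import re

# Levels 0-3, matching the marker range stated in the plan.
MOUTH_MARKER = re.compile(r"\[\[MOUTH:([0-3])\]\]")


def dispatch_mouth_markers(line, callback):
    """Call callback(level) per marker and return the line with markers removed."""
    for match in MOUTH_MARKER.finditer(line):
        callback(int(match.group(1)))
    return MOUTH_MARKER.sub("", line)


levels = []
clean = dispatch_mouth_markers("[[MOUTH:2]]hello[[MOUTH:0]]", levels.append)
```

In P2 the callback would be `mask_face.set_mouth`, registered through `register_mouth_callback`.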

## Architecture
One self-contained container that **owns all the hardware it needs directly**:
the G1 DDS link, `rt/arm_sdk` (arm), chest **or** USB/Anker audio, and the BLE
mask. It runs the premium slice of the vendored **SanadV3** engine.

P2 is a **containerization wrapper** (like P1), not a fork: it
1. bootstraps the `Project.Sanad` namespace (deployed layout, P1's mechanism),
2. constructs ONLY the P2 superset subsystems,
3. injects a P2-scoped `Project.Sanad.main` shim exposing the singletons the
   routers import lazily (`mask_face`, `brain`, `live_voice`, …),
4. mounts the P1 + premium routers + the logs websocket,
5. serves the real SanadV3 SPA with non-P2 tabs hidden,
6. runs uvicorn on **:8012**.

Why no `Sanad_Core`/`hwbroker`/`sanad-mask` split: that split exists only to stop
**multiple** package containers fighting over the one device set. A customer who
buys **just P2** has a single owner — no contention — so everything folds into one
container (exactly like P1 standalone, and like the original Sanad monolith which
did comms + arm + mask in one process). The ZMQ bus seam stays available for a
future P1+P2+P3 fleet SKU but is **not** part of standalone P2.

## Namespace bootstrap (reuse P1's, not SanadV3's self-alias)
P2 reuses **P1's exact bootstrap** (`app_p1.py` lines 30–46): synthesize a
`Project` namespace package and alias `Project.Sanad` → the vendored `Sanad`
module, then inject a `Project.Sanad.main` shim holding the P2 singletons. The
mask + voice routes resolve their singletons via lazy `from Project.Sanad.main
import mask_face` inside handlers, so the shim must define them. SanadV3's own
self-alias-by-folder-name is skipped automatically: the vendored tree is named
`Sanad`, so the `if _THIS_DIR.name != 'Sanad'` branch in `main.py` never fires.
The wrapper-with-shim is what P1 ships and what the routes expect.
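
The bootstrap idea can be sketched with the standard library alone. This is not a copy of `app_p1.py`: the `bootstrap_namespace` helper and the empty stand-in engine module are assumptions made so the example runs without the vendored tree.

```python
import sys
import types


def bootstrap_namespace(engine_module, singletons):
    """Alias an engine module as Project.Sanad and attach a Project.Sanad.main shim."""
    project = types.ModuleType("Project")
    project.__path__ = []  # an empty __path__ marks the module as a package
    project.Sanad = engine_module
    sys.modules["Project"] = project
    sys.modules["Project.Sanad"] = engine_module

    shim = types.ModuleType("Project.Sanad.main")
    for name, obj in singletons.items():
        setattr(shim, name, obj)
    engine_module.main = shim
    sys.modules["Project.Sanad.main"] = shim
    return shim


# Stand-in for the vendored engine package (hypothetical, empty).
fake_engine = types.ModuleType("Sanad")
fake_engine.__path__ = []
shim = bootstrap_namespace(fake_engine, {"mask_face": object(), "brain": object()})


def route_handler():
    # Same lazy-import shape the routes use inside their handlers.
    from Project.Sanad.main import mask_face
    return mask_face
```

Because the import system consults `sys.modules` first, the lazy import inside the handler resolves to the shim without touching the filesystem.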

## Features / capabilities

### Inherited from P1 (superset — same code, premium flags on)
Hands-free Gemini conversation · persona editor (who/tone/language) · **keyless**
Gemini key (customer adds own) · chest **or** USB/Anker audio (selectable,
hot-swap) · typed-replay / "say a line" · live logs + download · offline license
gate. **Same audio (mic+speaker) mechanism as P1**, via the evolved per-instance
`voice/audio_manager.py` + `voice/audio_devices.py`.

### New in P2 (this pass)
1. **Multilingual auto-detect** — Gemini natively detects the visitor's language
   (Arabic Gulf/English) and replies in kind, via the bilingual system prompt in
   `gemini/script.py` / `voice/sanad_voice.py`. No per-user flag.
2. **Voice-command arm gestures** — `voice/live_voice_loop.py`: USER speech →
   arm actions via `scripts/sanad_arm.txt` (**23 actions**, non-contiguous ids
   `{0–15, 23–28, 30}`; the file's nominal range is 0–28; hundreds of Arabic/EN
   phrase variants) → `sanad_arm_controller.ARM.trigger_action_by_id()`.
   **Instant** or **deferred** mode (0.65 s fallback so a silent user still
   fires). Master **trigger-enabled** gate (default **OFF**). Position-based
   dedup (`_last_snapshot` + `_trigger_lock`).
3. **Gestures while speaking** — `core/brain.set_gestural_speaking` →
   `brain.gestural_speaking_changed` → `mask_face.set_speaking(True)` (mouth +
   any gestural motion animate together while Gemini talks).
4. **Wake-phrase management** — phrase→action CRUD
   (`voice/wake_phrase_manager.py`, persisted to `data/wake_phrases.json`),
   folded into the live loop at runtime via `_merge_wake_phrases`.
5. **Skills registry** — skill CRUD, execute, upload-audio
   (`dashboard/routes/skills.py`).
6. **Lip-sync on the LED "Shining Mask"** — vendored `face/mask_face.py`
   `FaceController` + `face/face_motion.py` `LifelikeFace`:
   - mouth driven by Gemini `[[MOUTH:n]]` (0–3) markers →
     `gemini/subprocess.register_mouth_callback` → `mask_face.set_mouth(level)`;
   - `set_speaking(on)` from the gestural-speaking event for auto-talk;
   - state-aware idle/listening/thinking + timed reactions (smile/surprised/sad);
   - falls back to the flat lib's `FaceAnimator` if `LifelikeFace`, Pillow, or
     bleak is unavailable (`lifelike=false` in `mask_config.json`).
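
The position-based dedup in item 2 can be illustrated as follows. This is one plausible reading, not the code in `voice/live_voice_loop.py`: it assumes the loop sees a growing transcript snapshot and fires a phrase only when its end position lies past the region already scanned. The class name, the phrase table, and the action ids are hypothetical; `_last_snapshot` and `_trigger_lock` are borrowed from the description above.

```python
import threading


class PositionDedup:
    """Fire each phrase occurrence once, keyed by where it ends in the transcript."""

    def __init__(self, phrases):
        self._phrases = dict(phrases)  # phrase text -> action id
        self._last_snapshot = ""
        self._trigger_lock = threading.Lock()

    def feed(self, snapshot):
        """Return action ids for phrases that completed since the last snapshot."""
        with self._trigger_lock:
            # A growing transcript keeps its old prefix; anything else is a new turn.
            if snapshot.startswith(self._last_snapshot):
                seen = len(self._last_snapshot)
            else:
                seen = 0
            self._last_snapshot = snapshot

            hits = []
            for phrase, action_id in self._phrases.items():
                start = snapshot.find(phrase)
                while start != -1:
                    end = start + len(phrase)
                    if end > seen:  # ended after the already-scanned region
                        hits.append((end, action_id))
                    start = snapshot.find(phrase, start + 1)
            return [action_id for _, action_id in sorted(hits)]


dedup = PositionDedup({"wave": 3, "shake hands": 7})
first = dedup.feed("please wave")
repeat = dedup.feed("please wave")
again = dedup.feed("please wave and wave")
```

Unlike content-based dedup, a user who says the same phrase twice triggers twice, while a re-delivered snapshot triggers nothing.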

### Deferred (NOT in this pass)
- **Voice-command locomotion** (`voice/movement_dispatch.py` + `loco_controller`):
  Gemini's spoken confirmation → discrete bounded steps, `movement_enabled` gate,
  "stop" = E-STOP. Adds a walking-on-voice safety surface + on-robot calibration —
  staged separately. SanadV3's `dashboard/routes/_arbiter.py` (Nav2 ↔ LocoController
  leg arbitration) belongs to that deferred locomotion pass, **not** to P2's arm-only pass.
- Multi-package fleet via `Sanad_Core` (hwbroker/busd/shared `sanad-mask`).

## Dashboard (:8012) = all P1 tabs **+** (all routes VENDORED from SanadV3)
- **Voice** — adds the multilingual auto-detect toggle + per-language voice config.
- **Live-voice (commands)** — `dashboard/routes/live_voice.py`: start/stop,
  deferred-mode toggle, **trigger-enabled** master gate, status, trigger history.
- **Wake-phrases** — phrase→action CRUD (AR dialects + EN).
- **Motion / Gestural** — `dashboard/routes/motion.py`: gestural-speaking toggle,
  trigger / cancel arm actions. *(Arm only — loco controls present in the route
  but unwired this pass.)*
- **Skills** — `dashboard/routes/skills.py`.
- **Mask / Lip-sync** — the existing SanadV3 SPA **Mask Face** tab + the vendored
  `dashboard/routes/mask.py` (`/api/mask/*`): connect/disconnect, brightness,
  face start/stop/return/color, speaking toggle, mouth slider, expressions,
  text/image/animation overrides, status. **Mounted as-is — not authored.**
- **Logs**.

Non-P2 tabs (recognition, temp/3D, controller, navigation, terminal) are hidden
the same way P1 hides its non-P1 set.

## What it vendors / reuses (self-contained, like P1)
- `vendor/Sanad` — the **SanadV3** engine tree (rsync-excluding
  `data/`, `Logs/`, `__pycache__/`, `tests/`, `static/temp3d/`). Includes
  `face/`, evolved `voice/`, evolved `gemini/`, `motion/`, the mask, live-voice,
  motion, and skills routes, and the SPA. *(Locomotion modules vendored but left
  unwired this pass.)*
- `vendor/sanad_pkg` — IPC bus shim + offline license verification lib (P1's set).
- `vendor/mask` — the flat `shiningmask` library copied from `Project/Mask`
  (`mask.py`, `faceanim.py`, `colorface.py`, `constants.py`, `protocol.py`,
  `transport.py`, `bitmap.py`, `NotoSans-Regular.ttf`, …). `Project/Mask` uses
  **flat imports** (`import faces`, `import mask`), so it goes on its **own**
  path (`SANAD_MASK_DIR=/app/mask`, also on `PYTHONPATH`) — **NOT** under
  `Sanad/`, to avoid collisions. `face/mask_face.py` and `face/face_motion.py`
  both `sys.path.insert(mask_dir)` and `import mask / faceanim / colorface`.
- A `sync_vendor.sh` that refreshes **both** `vendor/Sanad` (from SanadV3) and
  `vendor/mask` (from `Project/Mask`), and blanks any baked Gemini key.

## Wiring the lip-sync + gestures (in `app_p2.py` / the `Project.Sanad.main` shim)
1. Construct P1 comms singletons (brain, audio_mgr, voice_client,
   `GeminiSubprocess`) exactly as P1 does.
2. Construct premium singletons: `FaceController()` (mask_face),
   `LiveVoiceLoop(...)`, `WakePhraseManager()`, arm controllers, skills.
3. `brain.attach_live_voice(live_voice)`; **wire the arm⇄locomotion motion-block
   predicate exactly as SanadV3's `main.py` does** (`arm_controller.set_motion_block(...)`).
   This is load-bearing safety, not optional: **two `rt/arm_sdk` publishers
   coexist in-process** — `motion/arm_controller.py` (publisher ~line 237) and
   `motion/sanad_arm_controller.py` (~line 176). They stay collision-free ONLY via
   (a) that `set_motion_block` interlock on `ArmController` and (b)
   `sanad_arm_controller`'s `_is_busy`/`_busy_lock` atomic guard. A boot-time
   'sole writer' assertion only covers cross-*container* contention — it does
   **not** replace this intra-process interlock, which `app_p2.py` must reproduce.
4. **Lip-sync:** `gemini_subprocess.register_mouth_callback(mask_face.set_mouth)`.
5. **Gestures-while-speaking:** subscribe `brain.gestural_speaking_changed` →
   `mask_face.set_speaking(on)`. The event bus is **synchronous**
   (`core/event_bus` `.on`/`.emit_sync`), so `set_speaking` runs on the caller's
   thread — keep it non-blocking (it only flips a flag the BLE loop reads).
6. **Lifelike state:** wire voice events (connected→`set_listening`,
   user_said→`set_thinking`, disconnected→`set_idle`,
   `voice.error`/`motion.action_error`→`react('sad')`,
   `skill.finished`→`react('smile')`). **SanadV3's `main.py` lines ~360–427 are
   the concrete reference implementation** — copy that wiring (incl.
   `register_mouth_callback` ~383–391 and `mask_face.shutdown()` ~587–591) rather
   than re-deriving it.
7. Expose `mask_face`, `brain`, `live_voice` on the `Project.Sanad.main` shim so
   the lazy route accessors resolve them.
8. On shutdown: `mask_face.shutdown()` (BLE disconnect + stop loop) — handle
   SIGTERM so the container exits cleanly.
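
Steps 4 to 6 can be sketched against stand-ins. The `EventBus` and `StubFace` classes below are hypothetical substitutes for `core/event_bus` and `FaceController`, and the `voice.connected` / `voice.user_said` / `voice.disconnected` event names are assumed spellings of the events listed in step 6. SanadV3's `main.py` remains the reference for the real wiring.

```python
from collections import defaultdict


class EventBus:
    """Stand-in synchronous bus: handlers run on the emitter's thread."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event, handler):
        self._handlers[event].append(handler)

    def emit_sync(self, event, *args):
        for handler in list(self._handlers[event]):
            handler(*args)


class StubFace:
    """Stand-in for FaceController: every method only records state."""

    def __init__(self):
        self.speaking, self.mouth, self.state = False, 0, "idle"
        self.reactions = []

    def set_speaking(self, on):
        self.speaking = bool(on)

    def set_mouth(self, level):
        self.mouth = level

    def set_listening(self):
        self.state = "listening"

    def set_thinking(self):
        self.state = "thinking"

    def set_idle(self):
        self.state = "idle"

    def react(self, name):
        self.reactions.append(name)


def wire_face(bus, face, register_mouth_callback):
    register_mouth_callback(face.set_mouth)                       # step 4
    bus.on("brain.gestural_speaking_changed", face.set_speaking)  # step 5
    bus.on("voice.connected", lambda *_: face.set_listening())    # step 6
    bus.on("voice.user_said", lambda *_: face.set_thinking())
    bus.on("voice.disconnected", lambda *_: face.set_idle())
    for event in ("voice.error", "motion.action_error"):
        bus.on(event, lambda *_: face.react("sad"))
    bus.on("skill.finished", lambda *_: face.react("smile"))


mouth_callbacks = []
bus, face = EventBus(), StubFace()
wire_face(bus, face, mouth_callbacks.append)
bus.emit_sync("brain.gestural_speaking_changed", True)
mouth_callbacks[0](2)
bus.emit_sync("voice.error", "timeout")
```

Every handler here only flips a field, which mirrors the requirement that callbacks on the synchronous bus stay non-blocking.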

## Container & hardware
- `FROM python:3.10-slim`, `WITH_UNITREE_SDK=1` (the SDK now builds: the cyclonedds
  `idlc` fix is what gives arm + chest audio).
- **System deps (added over P1):** `bluez`, `libdbus-1-3`/`libdbus-1-dev`,
  `libglib2.0-0` for BlueZ/D-Bus; Pillow needs no extra apt on slim.
- **Python deps:** P1's set **+** `bleak==0.22.3` (**pinned** — bleak 3.x throws
  `KeyError 'Roles'` on the Jetson's BlueZ 5.53 and every connect fails) **+**
  `Pillow` (LifelikeFace frame rendering).
- **BLE for the mask (in-container):** mount `/var/run/dbus`, pass through
  `/dev/bus/usb`, add `--cap-add NET_ADMIN`, and set `network_mode: host`. Free the
  mask from the phone app before connecting. Set `SANAD_MASK_DIR=/app/mask`.
- `/dev/snd` (audio), license mount, **writable** `./data` + `./config` mounts
  (mask color persistence), `restart: unless-stopped`.
- Port **:8012**. Ships **keyless** (`strip_key.py` blanks any baked key).
- **License features:** `multilingual`, `voice_command_motion` (arm gestures),
  `lipsync`, `mask`. Entrypoint checks entitlement **P2**. *(A future
  `voice_command_locomotion`/`navigation` feature gates the deferred walking.)*
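
A small preflight check could catch the most common container misconfigurations before the mask subsystem starts. The sketch below is an assumption, not part of the package: the `preflight` helper is hypothetical, and it checks only the items named above (the `SANAD_MASK_DIR` path, the D-Bus system socket under the `/var/run/dbus` mount, and the bleak pin).

```python
import os
from importlib.metadata import PackageNotFoundError, version

DBUS_SOCKET = "/var/run/dbus/system_bus_socket"
PINNED_BLEAK = "0.22.3"


def preflight(env, path_exists, bleak_version):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    mask_dir = env.get("SANAD_MASK_DIR", "")
    if not mask_dir:
        problems.append("SANAD_MASK_DIR is not set")
    elif not path_exists(mask_dir):
        problems.append(f"mask library directory missing: {mask_dir}")
    if not path_exists(DBUS_SOCKET):
        problems.append(f"D-Bus system socket not mounted: {DBUS_SOCKET}")
    if bleak_version != PINNED_BLEAK:
        problems.append(f"bleak {bleak_version!r} installed, expected {PINNED_BLEAK}")
    return problems


def installed_bleak_version():
    """Report the installed bleak version, or None when bleak is absent."""
    try:
        return version("bleak")
    except PackageNotFoundError:
        return None
```

At container start this would be called as `preflight(os.environ, os.path.exists, installed_bleak_version())`; taking the environment and the path check as parameters keeps the function testable off-robot.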

## Package layout (to build later — mirrors P1)
```
Sanad_Package_2/
  app_p2.py  routes_p2.py  entrypoint.sh  strip_key.py  p2ctl.sh
  config/p2_config.json  static/  Dockerfile  docker-compose.yml  requirements.txt
  vendor/Sanad   (from SanadV3)
  vendor/sanad_pkg
  vendor/mask    (from Project/Mask — own PYTHONPATH)
  license/(pubkey + example)  data/(seed incl. wake_phrases.json)  sync_vendor.sh
  README.md  PLAN.md  NEW_ROBOT_SETUP.md
```

## Build sequence (when implemented)
1. Vendor the **SanadV3** engine + `sanad_pkg` + the flat `Project/Mask` →
   `vendor/mask`; merge requirements (P1 deps + `bleak==0.22.3` + `Pillow`).
2. Self-contained Dockerfile (P1's + BlueZ/D-Bus system deps; `COPY vendor/mask
   /app/mask`; `ENV SANAD_MASK_DIR=/app/mask`).
3. `app_p2.py` — P1's namespace bootstrap + `Project.Sanad.main` shim; construct
   P1 comms **+** `FaceController` + `LiveVoiceLoop` + wake-phrase mgr + skills;
   wire `register_mouth_callback`→`set_mouth` and
   `gestural_speaking_changed`→`set_speaking` + lifelike state hooks; mount P1 +
   premium routers (incl. the vendored `mask.py`); serve :8012.
4. Mask: `FaceController` already runs its own BLE asyncio loop +
   reconnect supervisor; with `autostart=true` it connects + uploads frames
   (~30–90 s one-time, persists on flash) in the background — never blocks boot.
5. Multilingual: ship SanadV3's bilingual system prompt (rename persona as
   decided); enable per-language voice config in the Voice tab.
6. License gate `P2`; keyless; smoke test (P1 endpoints + live-voice + wake +
   skills + `/api/mask/status` → connect → face/start → mouth slider); validate
   on the robot.
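
The mask part of the smoke test in step 6 could be scripted roughly as below. The paths follow the `/api/mask/status` → connect → face/start order named above, but the HTTP methods and the exact `connect` and `face/start` paths are assumptions to confirm against the vendored `mask.py`. The mouth-slider step is left out because its path is not given here.

```python
import urllib.error
import urllib.request

# Assumed methods and paths; verify against dashboard/routes/mask.py.
SMOKE_STEPS = [
    ("GET", "/api/mask/status"),
    ("POST", "/api/mask/connect"),
    ("POST", "/api/mask/face/start"),
]


def http_request(method, path, base="http://127.0.0.1:8012"):
    """Return the HTTP status code for one request, including error statuses."""
    req = urllib.request.Request(base + path, method=method)
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code


def run_smoke(request, steps=SMOKE_STEPS):
    """Run each step through request(method, path) and collect the failures."""
    failures = []
    for method, path in steps:
        status = request(method, path)
        if status >= 400:
            failures.append((method, path, status))
    return failures
```

Running `run_smoke(http_request)` against a live container returns an empty list on success. A 503 entry means the mask subsystem is down, which is the router's designed degradation and not a dashboard crash.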

## Risks / mitigations
- **BLE-from-container is fiddly** — prototype the mask connect early on the
  target Jetson (BlueZ 5.53) with `bleak==0.22.3`; if in-container BLE
  misbehaves, fall back to the mask controller on the **host** with a tiny socket
  shim (contingency, not the plan). The `mask.py` route already degrades to 503 if
  the subsystem is down, so the dashboard never crashes.
- **Mask flat imports** — vendor onto `/app/mask` via `SANAD_MASK_DIR`; never
  place under `Sanad/`.
- **LifelikeFace deps** — needs Pillow + bleak; if either is missing the
  subsystem auto-falls-back to `FaceAnimator` (`lifelike=false`) and reports the
  reason in the Mask tab. The rest of P2 is unaffected.
- **Lip-sync chain spans the voice subprocess** — `gemini/subprocess.py` spawns
  `voice/sanad_voice.py` (the child, `SANAD_VOICE_BRAIN=gemini`), which is an
  **orchestrator** that in turn runs `gemini/script.py` — the actual `GeminiBrain`.
  The `[[MOUTH:n]]` markers are emitted in **`gemini/script.py` (~lines 563/578)**,
  not `sanad_voice.py`. Vendor the full **`script.py → sanad_voice.py →
  subprocess.py`** chain intact and register the parent callback
  (`GeminiSubprocess.register_mouth_callback`) in the shim. P1 already vendors
  `gemini/subprocess.py`, so this is additive wiring, not a new vendor.
- **Arm safety (two publishers + interlock)** — `trigger_enabled` defaults **OFF**;
  arm actions are bounded. The single container is the sole *container* writing
  `rt/arm_sdk`, but **two in-process publishers** (`arm_controller.py` +
  `sanad_arm_controller.py`) coexist — collision-free ONLY via the
  `set_motion_block` interlock + `_is_busy` busy-lock that `app_p2.py` must wire
  (Wiring step 3). Assert sole-container-writer at boot **and** reproduce the interlock.
- **Arm⇄locomotion interlock is a no-op in this arm-only build** —
  `sanad_arm_controller.trigger_action_by_id()` calls `_blocked()` (refuses arm
  motion while the robot may be walking). With locomotion **deferred**, nothing
  publishes the locomotion-state signals `_blocked()` reads, so it always permits
  arm motion. That is safe by omission **now**, but the interlock **must be re-armed
  before the deferred voice-command-locomotion pass ships**; otherwise voice-walking
  would ship with a silently disabled arm interlock.
- **Mask hardware presence** — lip-sync needs the physical BLE mask paired/in
  range and freed from the phone app.
- **Locomotion creep** — `movement_dispatch.py` + `_arbiter.py` are present in
  the vendored tree; keep them **unwired** this pass to avoid accidentally
  shipping voice-command walking without the safety surface + calibration.
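
The two guards named in the arm-safety risk can be pictured with a toy model. The `ArmGuard` class is a hypothetical stand-in, not either controller, and the sketch assumes the motion-block predicate returns `True` whenever the other publisher holds the arm. What the real predicate checks is defined by SanadV3's `main.py`.

```python
import threading


class ArmGuard:
    """Toy model of a busy-lock plus an injectable motion-block predicate."""

    def __init__(self):
        self._busy_lock = threading.Lock()
        self._is_busy = False
        self._motion_block = lambda: False  # permissive until wired

    def set_motion_block(self, predicate):
        self._motion_block = predicate

    @property
    def busy(self):
        with self._busy_lock:
            return self._is_busy

    def try_begin(self):
        """Atomically claim the arm; False means the caller must not publish."""
        if self._motion_block():
            return False
        with self._busy_lock:
            if self._is_busy:
                return False
            self._is_busy = True
            return True

    def finish(self):
        with self._busy_lock:
            self._is_busy = False


gesture_arm = ArmGuard()  # plays the role of sanad_arm_controller
manual_arm = ArmGuard()   # plays the role of ArmController
manual_arm.set_motion_block(lambda: gesture_arm.busy)
```

The default predicate is permissive, which is the failure mode the plan warns about: an interlock that is never wired silently allows both writers to publish.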

## Open decisions (resolve before/while building)
1. **In-container BLE vs host side-car** — confirm `bleak`+BlueZ/D-Bus actually
   connects from inside `python:3.10-slim` with `/var/run/dbus` + `NET_ADMIN` +
   `network_mode: host` on the target Jetson (BlueZ 5.53). Pin **`bleak==0.22.3`**
   (3.x throws `KeyError 'Roles'` on BlueZ 5.53 — every connect fails). If flaky,
   fall back to a host-side mask controller + tiny socket shim (contingency).
2. **LifelikeFace default** — SanadV3 defaults `lifelike=true` (needs
   `face/face_motion.py` + Pillow + bleak). Confirm Pillow+bleak install cleanly
   in the slim image; otherwise `mask_config.json lifelike=false` auto-falls back
   to `FaceAnimator`.
3. **Gemini lip-sync chain vendoring** — vendor the evolved `gemini/script.py`
   (emits `[[MOUTH:n]]`) **and** `gemini/subprocess.py` (`register_mouth_callback`),
   and wire `register_mouth_callback → mask_face.set_mouth` in the shim. P1 already
   vendors `subprocess.py`, so this is additive.
4. **Persona / robot name** — SanadV3's fallback prompt identifies the robot as
   **"Marcus"** (bilingual Gulf-Arabic/English auto-detect — this prompt *is* the
   multilingual engine). Decide P2's shipped persona name/dialect in
   `scripts/sanad_script.txt` and whether to keep the bilingual prompt verbatim.
5. **First-boot frame-upload latency** — the one-time DIY frame upload (~30–90 s,
   persists on mask flash). Decide `autostart=true` (background, non-blocking) vs
   gating `face/start` behind an explicit dashboard action with a progress indicator.
6. **Vendor `movement_dispatch.py` now (unwired) or omit** — keep for forward-compat
   with the deferred locomotion pass, or drop it to keep the arm-only image lean.
7. **Writable mounts for persisted face colors** — `/face/color` persists eye/
   mouth/sclera colors to `mask_config.json`; ensure `config/` (or the `data/`
   seed) is on a writable volume so colors survive container restarts.

> **Carried-forward safety gate:** before the deferred *voice-command-locomotion*
> pass ships, re-arm the `_blocked()` arm⇄locomotion interlock (a no-op today) and
> revisit `_arbiter.py` (Nav2 ↔ LocoController leg arbitration).

## Container-runtime audit (2026-06-23)
A 3-reader audit of the vendored runtime paths (voice/audio, mask/face, arm+dashboard)
for in-container failure modes. **Verdict: container-safe on the G1 after the
`iproute2` fix** — no remaining crash-level landmine on the builtin+gemini+host-net path.
- **FIXED (crash, P1+P2):** chest-mic `voice/audio_io.py:_find_g1_local_ip()` shells
  out to `ip` — added **`iproute2`** to both Dockerfiles. (This was the live-voice
  crash we hit.)
- **Mitigated (crash, off-robot only):** `BuiltinMic.start()` calls
  `_find_g1_local_ip()` **unguarded**; off a G1 (no `192.168.123.x` net) it raises
  and kills the voice subprocess. On the G1 (network_mode host) the interface exists →
  fine. Package-level safety valve: **`SANAD_AUDIO_PROFILE=plugged` on any non-G1 host**
  (documented in compose + `.env.example`). Not fixed in the engine (shared with SanadV3).
- **FIXED (degraded, persistence):** mask face colors are written to
  `config/mask_config.json` (a baked layer, lost on recreate) → added a pre-seeded
  single-file mount `./config/mask_config.json:/app/Sanad/config/mask_config.json`.
- **Deploy-side (degraded, P2):** mask BLE cold-start can stall ~45 s (3×15 s scan) if
  host `bluetoothd` is down / adapter missing — retry-bounded, supervised, background
  thread; does **not** crash. Ensure `bluetoothd` is up; leave `SANAD_MASK_ADDRESS`
  empty for auto-detect.
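
The off-robot safety valve could also be automated at the package level, outside the shared engine. The sketch below is an assumption: `pick_audio_profile` is a hypothetical entrypoint-side helper, `builtin` is an assumed name for the chest-audio profile, and only `plugged` and the `192.168.123.x` network come from the audit notes above.

```python
import ipaddress

# The G1's internal network, per the audit note (assumed to be a /24).
G1_SUBNET = ipaddress.ip_network("192.168.123.0/24")


def pick_audio_profile(local_ips, requested=None):
    """Honour an explicit profile, else choose by G1 network presence."""
    if requested:
        return requested
    on_g1 = any(ipaddress.ip_address(ip) in G1_SUBNET for ip in local_ips)
    return "builtin" if on_g1 else "plugged"


robot_profile = pick_audio_profile(["192.168.123.164", "10.0.0.5"])
laptop_profile = pick_audio_profile(["10.0.0.5"])
```

An explicit `SANAD_AUDIO_PROFILE` would be passed as `requested`, so the operator's choice always wins over detection.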

Verified-safe (NO action, to avoid over-fixing): `~/logs` FileHandler (root-writable,
no crash); `NotoSans-Regular.ttf` present + COPYed; `local_tts` torch never imported
(gemini brain); `pactl` with no server (guarded, and not on the builtin voice path);
`parec` capture (gated); arm/DDS init (degrades, catches); config path resolution
(robust fallback chain); `teaching.py` tempfile (writes under the `./data` mount).