Sanad_Package_2/README.md
2026-07-05 13:34:15 +00:00

# Unitree G1 - Sanad Package 2 — Premium Communication
Everything **Package 1** does (hands-free Gemini Live conversation, chest or USB
audio) **plus** multilingual auto-detect, voice-command **arm gestures**, gestures
while speaking, wake-phrases, skills, and a lip-syncing + expressive LED **"Shining
Mask"**. Dashboard on **:8012**. (Voice-command **locomotion/walking** is *deferred*
to a later pass.)

This package is **self-contained**: it vendors the SanadV3 engine and the flat BLE
Mask lib under `vendor/` and builds `FROM python:3.10-slim-bookworm`, so a clone of
just this folder builds and runs with **no** `sanad-base`, `Sanad_Core`, or sibling
checkout. Full design notes: **[PLAN.md](PLAN.md)**.
## Install (per-robot, first run)
The workstation is canonical — you **build and run Docker on the robot**, but you
edit and stage the folder on the workstation and copy it over.

**1. Get the package onto the robot.** From the workstation `Packages/` directory,
rsync just this folder to the robot's deploy dir (dev robot = `10.255.254.86`):
```bash
rsync -az --exclude __pycache__ --exclude .git \
Sanad_Package_2 unitree@<robot>:~/sanad_deploy/
```
Only this folder is needed; the engine and mask lib are vendored under `vendor/`.

**2. License.** A signed **Ed25519** `sanad.lic` decides which packages a robot may
run, verified **fully offline** against `pubkey.ed25519` (baked into the image at
`/etc/sanad/pubkey.ed25519`). The build and a first run work out of the box with the
**bundled demo license** (`license/sanad.lic.example`: `robot_id G1-SN-DEMO-0001`,
unbound, entitles **P1+P2+P3**, expires 2030-01-01). A **production delivery needs a
signed per-robot license** — see [Licensing](#licensing) and, for the full flow,
**[NEW_ROBOT_SETUP.md](NEW_ROBOT_SETUP.md)**.

**3. Keyless: the customer adds their own Gemini key.** The image ships with **no
Gemini API key baked in**. On first open of the dashboard, paste your own key in the
**Gemini API key** card (step 1 below). Nothing to configure at build time.

For a brand-new G1 end-to-end (prereqs, build, signing + binding a P2 license, mask
pairing, verify) follow **[NEW_ROBOT_SETUP.md](NEW_ROBOT_SETUP.md)**.
## Run with Docker (quick start)
On the robot, **from this folder** (`~/sanad_deploy/Sanad_Package_2/`) — nothing
else required:
```bash
cp .env.example .env # optional: license path / audio / mask MAC
docker compose up -d --build # build (vendored engine + mask + chest-audio SDK) + run
# Jetson Docker without buildx: DOCKER_BUILDKIT=0 docker compose up -d --build
```
`network_mode: host` is **required** (G1 DDS discovery + chest audio + Gemini cloud +
host BlueZ for the mask); it is already set in `docker-compose.yml`.

Then open **http://&lt;robot-ip&gt;:8012** (the image ships **keyless**):
1. **Gemini API key** card → paste your key.
2. **Persona** card → who the robot is; the bilingual prompt drives **multilingual
auto-detect** (saving restarts the live session).
3. **Audio** card → pick chest vs USB/Anker + volume; **Mask** tab → connect the LED
face and run a lip-sync test.
4. **Live-voice** tab → enable voice-command **arm gestures** (default OFF), then
press **Start** in the Conversation card and talk.
> The **LED mask** must be paired / in range and **freed from its phone app** (BLE
> allows one central at a time); `bleak` is pinned to **0.22.3** for BlueZ 5.53
> compatibility.
## Manage it
```bash
docker compose logs -f # live logs
docker compose down # stop
./test_p2.sh <robot-ip>:8012 # smoke test (expect 16/16 PASS)
```
- **Auto-start on boot:** `sudo systemctl enable docker`; `docker-compose.yml`
already runs P2 with `restart: unless-stopped`, so it returns after a reboot.
- **No Docker? (dev mode)** run against `./vendor` in the robot's `gemini_sdk` conda
env:
```bash
cd ~/sanad_deploy/Sanad_Package_2
./p2ctl.sh start # launch on :8012 (runs against ./vendor)
./p2ctl.sh status # process + /api/health
./p2ctl.sh logs 80 # tail the P2 log
./p2ctl.sh restart
./p2ctl.sh stop
```
The conda env must have `google-genai` and (for the mask) `bleak==0.22.3` + Pillow.

**Entrypoint flow** (`entrypoint.sh`):
1. **License gate:** `python3 -m sanad_pkg.license_check P2`; if the robot is **not**
entitled, the container exits **cleanly** (code 0) so the restart policy never
crash-loops.
2. **Resolve config** with precedence **env var > license feature >
`config/p2_config.json`**.
3. **Preflight** that prints clear diagnostics (python / google-genai / pyaudio /
bleak / Pillow / mask lib / unitree SDK / audio profile).
4. `exec app_p2.py`.
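
The precedence rule in the entrypoint flow can be illustrated with a minimal Python sketch. This is not the code `entrypoint.sh` runs: the function name `resolve_setting`, the idea that license features carry key/value pairs, and the choice to treat an empty env var as unset are all assumptions made for illustration.

```python
import os


def resolve_setting(key, env_var, license_features, defaults, environ=None):
    """Return (value, source) using env var > license feature > config defaults.

    Hypothetical helper: `license_features` and `defaults` are plain dicts,
    standing in for the verified license and config/p2_config.json.
    """
    environ = os.environ if environ is None else environ
    # 1. A non-empty environment variable always wins (assumption: empty == unset).
    value = environ.get(env_var)
    if value:
        return value, "env"
    # 2. Otherwise fall back to a value carried by the license.
    if key in license_features:
        return license_features[key], "license"
    # 3. Finally use the packaged defaults.
    if key in defaults:
        return defaults[key], "config"
    return None, "unset"
```

Returning the source alongside the value makes it easy for a preflight step to print where each setting came from.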
## What it ships
- `app_p2.py` — launcher: bootstraps the `Project.Sanad` namespace + the flat Mask
path, constructs the P2 superset (comms + arm + `FaceController` mask), wires
lip-sync (Gemini `[[MOUTH:n]]` markers → `mask_face.set_mouth`), gestures-while-
speaking, lifelike face state, and the arm↔loco motion-block interlock seam (no-op
while locomotion is deferred); mounts the P1 + premium routers; serves the SanadV3
SPA (non-P2 tabs hidden) on :8012.
- `routes_p2.py` — `/api/p2/*` convenience routes (api-key, persona, say, logs,
settings) that also restart the live Gemini session so a new key/persona applies
immediately.
- `vendor/Sanad` (SanadV3 engine) · `vendor/sanad_pkg` (license + bus lib) ·
`vendor/mask` (flat BLE Shining-Mask lib, its own path via `SANAD_MASK_DIR=/app/mask`).
Refresh from a monorepo checkout with `./sync_vendor.sh`.
- `Dockerfile` — self-contained `FROM python:3.10-slim-bookworm`; adds BlueZ/D-Bus,
`bleak==0.22.3`, Pillow, and (when `WITH_UNITREE_SDK=1`) a full CycloneDDS 0.10.2 +
`unitree_sdk2_python` for chest audio.
- `docker-compose.yml` — standalone build + run; `network_mode: host`, `cap_add:
NET_ADMIN`, and mounts `/var/run/dbus` + `/dev/bus/usb` for the mask.
- `entrypoint.sh` — license gate → resolve env → preflight → launch.
- `config/` — `p2_config.json` (defaults) + `mask_config.json` (persisted face
colors). `license/` — `pubkey.ed25519` + demo `sanad.lic.example`.
- `test_p2.sh` — 16-check smoke test · `p2ctl.sh` — dev-mode control script ·
`sync_vendor.sh` — re-vendor the engine · `NEW_ROBOT_SETUP.md` / `PLAN.md`.
## Configuration (env vars)
Copy `.env.example` → `.env` (docker compose reads it automatically). Precedence at
runtime is **env var > license feature > `config/p2_config.json`**.

| Var | Default | What it does |
|---|---|---|
| `SANAD_LICENSE_FILE` | `./license/sanad.lic.example` | Host path to the signed `sanad.lic`, mounted **read-only** at `/etc/sanad/sanad.lic`. Point it at your per-robot license for production. |
| `SANAD_LICENSE_BIND` | `0` | `1` enforces machine-fingerprint binding — also uncomment the `/etc/machine-id` mount in `docker-compose.yml`. |
| `SANAD_AUDIO_PROFILE` | `builtin` | `builtin` = G1 chest over DDS · `plugged` = USB (e.g. Anker) via PulseAudio. Use `plugged` on any non-G1 host. |
| `SANAD_DDS_INTERFACE` | `eth0` | DDS interface to the G1 firmware (chest audio + arm). |
| `SANAD_LANGUAGE` | *(empty)* | Empty = **multilingual auto-detect** (P2's headline feature). Set `ar`/`en` only to force a single language. |
| `SANAD_MASK_ADDRESS` | *(empty)* | Pin the LED mask's BLE MAC; empty = auto-discover by name prefix. |
| `SANAD_PULSE_DIR` | `/run/user/1000/pulse` | Host PulseAudio runtime dir mounted into the container for plugged/BT speaker volume. |
| `PULSE_SERVER` | `unix:/run/user/1000/pulse/native` | PulseAudio socket the (root) container uses to reach the uid-1000 server. |
| `PULSE_COOKIE` | `/run/user/1000/pulse/cookie` | PulseAudio auth cookie (root→uid-1000). |
| `WITH_UNITREE_SDK` | `1` | **Build arg.** `1` bundles CycloneDDS 0.10.2 + `unitree_sdk2_python` so **chest audio works out of the box**; `0` = leaner USB/plugged-only image. |
| `BASE_OS_IMAGE` | `python:3.10-slim-bookworm` | **Build arg.** Base image; override only for a GPU build. |
| `SANAD_IMAGE` | `sanad-p2:latest` | Image name/tag — set a registry path for pull-and-run at fleet scale. |

Audio note: as in SanadV3, the volume slider drives **all** speaker types. The
unified `POST /api/audio/g1-speaker/volume` sets the G1 chest (DDS `SetVolume`) **and**
the active PulseAudio sink (plugged/Bluetooth, e.g. JBL). The plugged/BT half needs
the host PulseAudio socket + cookie (the `PULSE_*` vars above); chest audio needs
none of it. One-time host setup for a stable boot-time socket: `loginctl enable-linger
unitree`. Advanced mask overrides also exist: `SANAD_MASK_NAME_PREFIX` (default
`MASK`) and `SANAD_MASK_ADAPTER`.
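
How `SANAD_MASK_ADDRESS` and `SANAD_MASK_NAME_PREFIX` might interact can be sketched as a small selection function. This is an assumption about the behavior, not the vendored mask lib's code: `pick_mask` is a hypothetical name, the prefix match is assumed to be case-sensitive, and the scan results are assumed to be `(address, name)` pairs (for example, built from the `address` and `name` of devices returned by bleak's `BleakScanner.discover()`).

```python
from typing import Iterable, Optional, Tuple


def pick_mask(devices: Iterable[Tuple[str, Optional[str]]],
              pinned_address: str = "",
              name_prefix: str = "MASK") -> Optional[str]:
    """Return the BLE address to connect to, or None if nothing matches.

    Hypothetical helper: a pinned address skips discovery entirely;
    otherwise the first device whose advertised name starts with the
    prefix is chosen.
    """
    if pinned_address:
        return pinned_address
    for address, name in devices:
        # Some devices advertise no name at all, so guard against None.
        if name and name.startswith(name_prefix):
            return address
    return None
```

Pinning the MAC is the more predictable option when several masks are in range, since name-prefix discovery simply takes the first match.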
## Dashboard & features
**SPA tabs** (rendered by `app_p2.py` `P2_SPA_TABS`) — **6 tabs**: `operations`,
`voice`, `motion`, `mask`, `recordings`, `settings`. The premium controls for
**live-voice** (voice-command triggers), **wake-phrases**, and **skills** live
*inside* those tabs (operations / voice / motion), not as separate tabs — but their
routers are fully mounted (`/api/live-voice/*`, `/api/wake-phrases/*`,
`/api/skills/*`). Tabs for `recognition`, `zones`, `navigation`, `controller`,
`temp`, and `terminal` belong to other packages and stay hidden.

**API endpoint groups** (mounted by `app_p2.py`, confirmed against `test_p2.sh`) cover
all the P1 groups **plus** the premium ones:

| Group | What it covers |
|---|---|
| `/api/health` · `/api/system/info` | health + system info |
| `/api/package` | manifest + license state + entitled features + api-key status |
| `/api/p2/*` | convenience: `api-key`, `persona`, `say`, `settings`, `logs/delete` (with live-session restart) |
| `/api/voice/*` · `/api/prompt/*` · `/api/typed-replay/*` | conversation, persona/prompt, typed replay (multilingual) |
| `/api/live-subprocess/*` · `/api/live-voice/*` | start/stop the live Gemini session · voice-command gesture status/toggle |
| `/api/wake-phrases/*` · `/api/skills/*` | wake-phrases · skills registry |
| `/api/motion/*` · `/api/macros/*` · `/api/replay/*` · `/api/scripts/*` | **arm** gestures/actions, macros, replays, scripts (locomotion deferred) |
| `/api/mask/*` | LED mask: connect, brightness, face start/stop, speaking, mouth, expressions, status (+ QR/social) |
| `/api/audio/*` | speaker/mic profiles, `g1-speaker/volume`, mute, refresh |
| `/api/records/*` · `/api/logs/*` · `/ws/logs` | saved recordings · logs list/bundle · live log stream |

Lip-sync + expression source of truth is **Gemini markers**, not raw amplitude:
`[[MOUTH:n]]` drives the mouth, `[[FACE:name]]` drives brief expressions, and
`[[SHOW:account]]` relays a social QR onto the mask.
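
As a rough illustration of the marker scheme, the sketch below separates markers from the text around them. It is a guess at the general shape, not the parser in `app_p2.py`: `split_markers` and `MARKER_RE` are hypothetical names, and it assumes markers arrive inline in a text stream and that `n` in `[[MOUTH:n]]` is an integer.

```python
import re

# Matches the three marker kinds named above, e.g. [[MOUTH:3]] or [[FACE:smile]].
MARKER_RE = re.compile(r"\[\[(MOUTH|FACE|SHOW):([^\]\s]+)\]\]")


def split_markers(text):
    """Return (clean_text, events), where events is a list of (kind, value)."""
    events = []
    for kind, value in MARKER_RE.findall(text):
        # Assumption: mouth openness is numeric; other values stay as strings.
        if kind == "MOUTH" and value.isdigit():
            value = int(value)
        events.append((kind, value))
    # Remove the markers and tidy up any doubled whitespace they leave behind.
    clean = re.sub(r"\s{2,}", " ", MARKER_RE.sub("", text)).strip()
    return clean, events
```

A caller could then route `MOUTH` events to something like `mask_face.set_mouth` and handle `FACE` and `SHOW` events separately, while only the clean text is shown or logged.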
## Licensing
The bundled `license/sanad.lic.example` is a **demo** license (`robot_id
G1-SN-DEMO-0001`, unbound, expires 2030-01-01) that entitles **P1 + P2 + P3** with
features `multilingual`, `voice_command_motion`, `lipsync`, `mask`, `face_rec`,
`places`, `memory`. The license **features that gate P2** are `multilingual`,
`voice_command_motion` (arm gestures), `lipsync`, and `mask`. A future
`voice_command_locomotion` feature will gate the deferred walking.

To sign a **real per-robot license** (vendor side; keep the private key **off** the
robot):
```bash
python licensing/sign_license.py gen-keys --out-dir . # once, ever
# edit claims.json: packages {P1..P4} + features {...} + robot_id (+ optional machine_fingerprint)
python licensing/sign_license.py sign --key privkey.ed25519 \
--in claims.json --out sanad.lic
```
Put `sanad.lic` on the robot and point `SANAD_LICENSE_FILE` at it (or replace
`license/sanad.lic.example`). Verification is **fully offline**. Optional hardware
binding: set `SANAD_LICENSE_BIND=1` to bind the license to the machine fingerprint
(and uncomment the `/etc/machine-id` mount). Verify entitlement the way the entrypoint
does:
```bash
SANAD_LICENSE=license/sanad.lic python -m sanad_pkg.license_check P2
```
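
Conceptually, once the Ed25519 signature has been verified, the entitlement decision reduces to a check over the claims. The sketch below shows only that second step and omits signature verification entirely. The field names `packages`, `features`, and `expires` are assumptions about the shape of `claims.json`, and `is_entitled` is a hypothetical helper, not the `sanad_pkg.license_check` implementation.

```python
from datetime import date


def is_entitled(claims, package, required_features=(), today=None):
    """Check already-verified license claims for a package and its features.

    Assumed claim fields: `packages` (list), `features` (list) and
    `expires` (ISO date string). Signature checking must happen before this.
    """
    today = today or date.today()
    # An expired license entitles nothing (expiry day itself counts as expired).
    if today >= date.fromisoformat(claims["expires"]):
        return False
    if package not in claims.get("packages", []):
        return False
    features = claims.get("features", [])
    return all(feature in features for feature in required_features)
```

With claims modeled on the bundled demo license, `is_entitled(claims, "P2", ("multilingual", "mask"))` would hold until the 2030-01-01 expiry, while a request for `voice_command_locomotion` would be refused because that feature is not yet issued.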
## Status
**Built + validated locally** on the workstation (compile, import resolution,
shim-symbol coverage, namespace bootstrap, `license_check P2` entitled, YAML, mask-lib
import), and a prior on-robot smoke run passed **16/16**. The remaining production
gate is a **clean self-contained Docker build + on-robot smoke on the target G1**:
chest audio and arm gestures are confirmed, but **mask lip-sync needs the physical BLE
mask** paired and freed from its phone app. (P1 is already robot-tested; the on-robot
Docker build + smoke is the outstanding gate for P2/P3/P4.)