Sanad_Package_1/README.md
2026-07-05 13:34:34 +00:00

# Unitree G1 - Sanad Package 1 — Basic Communication
Hands-free conversation in **one operator-selected language** (Gemini Live),
audio via the **G1 chest** or **any plugged USB mic/speaker (Anker)**. **No**
voice-command motion, vision, recognition, or navigation. Dashboard on **:8011**.
This package is **self-contained** — it vendors the Sanad engine under `vendor/`
and builds `FROM python:3.10-slim-bookworm`, so a clone of **just this folder**
builds and runs with no sibling folders and no `sanad-base` image. The image
ships **keyless**: the customer pastes their own Gemini API key in the dashboard
on first open.
---
## Install (per-robot, first run)
The workstation holds the canonical copy: you copy the folder from there, then
**build and run Docker on the robot**. Three steps: deploy the folder, give the
robot a license, then add the Gemini key in the dashboard. For a **brand-new G1**
that has never run Sanad, follow the full flow in **[`NEW_ROBOT_SETUP.md`](NEW_ROBOT_SETUP.md)**.
**1. Get the package onto the robot.** P1 is self-contained, so you copy only the
package folder (no `Sanad/` sibling, no `sanad-base`). From the workstation:
```bash
rsync -az --exclude __pycache__ \
Project/G1/Packages/Sanad_Package_1 \
unitree@<robot>:~/sanad_deploy/
```
Everything P1 needs to build and run then lives under
`~/sanad_deploy/Sanad_Package_1` on the robot. (Dev robot: `unitree@10.255.254.86`.)
**2. License this robot.** A signed Ed25519 `sanad.lic` decides which packages a
robot may run; verification is fully **offline** (the public key is baked into the
image). The package ships a **bundled demo license**
(`license/sanad.lic.example`, robot_id `G1-SN-DEMO-0001`, unbound, expires
`2030-01-01`) that entitles **P1 + P3** — so the build **runs out of the box** with no
extra work. For a **delivered/production robot**, sign a real **per-robot license**
(optionally hardware-bound) — see **[Licensing](#licensing)** below and
**[`NEW_ROBOT_SETUP.md`](NEW_ROBOT_SETUP.md)**.
**3. First run is keyless.** No Gemini key is baked into the image. On first open
of the dashboard the customer pastes **their own** Gemini API key (see the
first-run steps under [Run with Docker](#run-with-docker-quick-start)).
---
## Run with Docker (quick start)
On the robot, **from this folder** (`~/sanad_deploy/Sanad_Package_1/`) — nothing
else needed:
```bash
cp .env.example .env # optional: set language / audio / license path
docker compose up -d --build # build (vendored engine) + run
# Jetson Docker without buildx: DOCKER_BUILDKIT=0 docker compose up -d --build
```
The container runs `network_mode: host` — this is **required** so the G1 DDS
discovery (and chest audio) reaches the robot firmware on `eth0`.
Open **http://&lt;robot-ip&gt;:8011**. The image ships **keyless**, so on first open:
1. **Gemini API key** card → paste your key.
2. **Persona** card → set who the robot is + the language/dialect it speaks (saving restarts the live session).
3. **Audio** card → pick chest vs USB/Anker speaker, volume, mute.
4. Press **Start** in the **Conversation** card and talk.
The first build bakes in the vendored engine and (by default, `WITH_UNITREE_SDK=1`)
compiles the chest-audio Unitree SDK, so it takes a few minutes; later builds are cached.
---
## Manage it
```bash
docker compose logs -f # live logs
docker compose down # stop
./test_p1.sh <robot-ip>:8011 # smoke test (expect 11/11 PASS)
```
- **Auto-start on boot:** `sudo systemctl enable docker` — compose already runs P1
with `restart: unless-stopped`, so the container returns after a reboot until you
explicitly `docker compose down`. (Full boot options — Docker-native vs a systemd
unit — are in [`NEW_ROBOT_SETUP.md`](NEW_ROBOT_SETUP.md) §7.)
- **No Docker? (dev mode)** Run P1 in the robot's `gemini_sdk` conda env via the
control script. It runs against the vendored engine in `./vendor`, so only the
package folder is needed:
```bash
cd ~/sanad_deploy/Sanad_Package_1
./p1ctl.sh start # launch on :8011 (runs against ./vendor)
./p1ctl.sh status # process + /api/health
./p1ctl.sh logs 80 # tail the P1 log
./p1ctl.sh restart
./p1ctl.sh stop
```
- **Logs:** the dashboard's **Logs** card streams live (`/ws/logs`) and the
**⬇ Download** button saves the full bundle (`/api/logs/bundle`) as
`sanad_p1_logs_<ts>.txt`.
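After `docker compose up` or `./p1ctl.sh start`, a script can wait for the dashboard by polling `/api/health`. The following is a minimal sketch, assuming the endpoint answers HTTP 200 once P1 is ready (the response body is not documented here, so it is ignored). `http_ok` and `wait_until_healthy` are illustrative names, not part of the package.

```python
import time
import urllib.request


def http_ok(url, timeout=3.0):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts are all OSError.
        return False


def wait_until_healthy(url, attempts=30, delay=2.0, probe=http_ok, sleep=time.sleep):
    """Poll `url` until `probe` succeeds.

    Returns the 1-based attempt number that succeeded, or 0 if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return 0


# Usage (hypothetical address; replace with the robot's IP):
# if not wait_until_healthy("http://192.0.2.10:8011/api/health"):
#     raise SystemExit("P1 dashboard did not come up")
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a robot on the network.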
**Container startup (what the entrypoint does):**
1. **License gate:** `python3 -m sanad_pkg.license_check P1`. If the robot is **not** entitled, the container **exits cleanly (code 0)** so the restart policy won't crash-loop.
2. **Config resolution** with precedence **env var > license feature > `config/p1_config.json`**.
3. **Preflight** that prints clear diagnostics (python / google-genai / pyaudio / Unitree SDK / audio profile).
4. `exec app_p1.py`.
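The precedence rule in the config step (env var, then license feature, then `config/p1_config.json`) can be sketched as below. This is an illustration only: the real resolution happens in `entrypoint.sh`, the function names are hypothetical, and the `language` key inside the config file is an assumption. An empty env var is treated as unset, matching the documented `SANAD_LANGUAGE` fallback.

```python
def resolve_setting(env_value, license_value, config_value):
    """Return the first non-empty value: env var, license feature, then config file."""
    for candidate in (env_value, license_value, config_value):
        if candidate not in (None, ""):
            return candidate
    return None


def resolve_language(environ, license_features, config):
    """Resolve the conversation language from the three sources, highest precedence first."""
    return resolve_setting(
        environ.get("SANAD_LANGUAGE"),    # operator override from .env
        license_features.get("language"),  # `language` feature in the signed license
        config.get("language"),            # assumed key in config/p1_config.json
    )


# An explicit override wins; an empty override falls through to the license feature.
print(resolve_language({"SANAD_LANGUAGE": "en"}, {"language": "ar"}, {"language": "ar"}))
print(resolve_language({"SANAD_LANGUAGE": ""}, {"language": "hi"}, {"language": "ar"}))
```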
---
## What it ships
- `app_p1.py` — launcher: bootstraps the `Project.Sanad` namespace, constructs
ONLY the comms subsystems (`brain`, `audio_mgr`, `voice_client`, `local_tts`,
`typed_replay`, `live_sub`), injects a P1-scoped `Project.Sanad.main` shim, and
mounts ONLY the P1 dashboard routers (`voice`, `audio`, `prompt`,
`typed-replay`, `records`, `logs`, `live-subprocess`, `health`, `system`) +
the logs websocket. Serves the real Sanad SPA with non-P1 tabs hidden.
- `routes_p1.py` — the P1-scoped routes mounted at `/api/p1` (`api-key`, `persona`,
`say`, `settings`) that reuse Sanad's canonical logic and add the live-session
restart on change.
- `entrypoint.sh` — license gate → resolve language/audio/port → preflight → launch.
- `Dockerfile` / `requirements.txt` — **SELF-CONTAINED**:
`FROM python:3.10-slim-bookworm`, installs all deps, and bakes the vendored engine
(+ chest-audio SDK when `WITH_UNITREE_SDK=1`) — **no `sanad-base`**.
- `docker-compose.yml` — standalone build + run (`context: .`, `network_mode: host`,
`restart: unless-stopped`). The top-level `Packages/docker-compose.yml` can still
run P1 in the fleet via `--profile p1`.
- `vendor/Sanad` + `vendor/sanad_pkg` — the vendored Sanad engine + license/bus
lib. Refresh from a monorepo checkout with `./sync_vendor.sh`.
- `config/p1_config.json` — defaults (language, audio profile, port, SPA tab set).
- `license/` — `pubkey.ed25519` (public verification key) + `sanad.lic.example`
(the bundled demo license used by the default mount).
- `sync_vendor.sh` — re-vendor the engine from a monorepo checkout.
- `test_p1.sh` — smoke test (`./test_p1.sh <robot-ip>:8011`, expect 11/11 PASS).
- `p1ctl.sh` — dev-mode (no-Docker) start/stop/status/logs against `./vendor`.
The package does **not** fork Sanad: it **vendors** the canonical source under
`vendor/` (re-synced by `sync_vendor.sh`), so it builds and runs entirely on its
own with **no sibling folders**.
---
## Configuration (env vars)
`docker compose` reads `.env` from this directory automatically (`cp .env.example .env`).
The knobs you'll actually touch:
| Var | Default | What it does |
|---|---|---|
| `SANAD_LICENSE_FILE` | `./license/sanad.lic.example` | Host path to the signed license, mounted **read-only** into the container at `/etc/sanad/sanad.lic`. Point it at your signed `sanad.lic` on a delivered robot. |
| `SANAD_LICENSE_BIND` | `0` | `1` = enforce the machine-fingerprint lock (bound license). When `1`, also uncomment the `/etc/machine-id` mount in `docker-compose.yml`. |
| `SANAD_AUDIO_PROFILE` | `builtin` | `builtin` = G1 chest mic+speaker over DDS · `plugged` = USB (e.g. Anker) via PulseAudio. |
| `SANAD_DDS_INTERFACE` | `eth0` | Network interface carrying the G1 DDS link to the robot firmware. |
| `SANAD_LANGUAGE` | *(empty)* | Conversation language override (`ar`, `en`, `hi`, `ur`, `zh`, `ru`, `fr`, …). Empty → falls back to the license `language` feature, then the config default (`ar`). |
| `SANAD_PULSE_DIR` | `/run/user/1000/pulse` | Host PulseAudio runtime dir mounted into the container (socket + cookie) so plugged/Bluetooth volume+output works. |
| `PULSE_SERVER` | `unix:/run/user/1000/pulse/native` | PulseAudio socket the (root) container talks to for plugged/BT sinks. |
| `PULSE_COOKIE` | `/run/user/1000/pulse/cookie` | PulseAudio auth cookie (root → uid-1000). If plugged volume is silent, try `/home/<user>/.config/pulse/cookie`. |
| `WITH_UNITREE_SDK` | `1` | **Build arg.** `1` bundles CycloneDDS + `unitree_sdk2_python` so chest audio works out of the box; `0` builds a leaner USB/plugged-only image. |
| `BASE_OS_IMAGE` | `python:3.10-slim-bookworm` | **Build arg.** Base image (override only for a GPU build). |
| `SANAD_IMAGE` | `sanad-p1:latest` | Image name/tag — set to a registry path for pull-and-run at fleet scale. |
Fixed in `docker-compose.yml` (rarely overridden): `SANAD_PACKAGE=P1`,
`SANAD_DASHBOARD_PORT=8011`, `SANAD_DASHBOARD_HOST=0.0.0.0`,
`SANAD_VOICE_BRAIN=gemini`, and the in-container license paths
`SANAD_LICENSE=/etc/sanad/sanad.lic` + `SANAD_PUBKEY=/etc/sanad/pubkey.ed25519`.
### Audio & volume (all speaker types)
Like SanadV3, the **volume slider drives every speaker type**. The unified control
`POST /api/audio/g1-speaker/volume` sets **both** the G1 chest (DDS `SetVolume`,
always) **and** the active PulseAudio sink (plugged USB / Bluetooth, e.g. JBL).
- **`builtin` (chest)** uses DDS only and needs **none** of the pulse setup.
- **`plugged` / Bluetooth** — the container runs as **root**, so it needs the
**host** PulseAudio socket + cookie to reach the uid-1000 PulseAudio. That is why
`docker-compose.yml` mounts `SANAD_PULSE_DIR` and sets `PULSE_SERVER` +
`PULSE_COOKIE`. One-time host setup for a stable boot-time socket:
```bash
loginctl enable-linger unitree
```
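When `plugged` or Bluetooth output stays silent, a quick first check is whether the socket and cookie named by `PULSE_SERVER` and `PULSE_COOKIE` are actually visible where the app runs. The sketch below only checks that the paths exist; it does not talk to PulseAudio. The helper names are hypothetical, and the fallback paths are the defaults from the table above.

```python
import os


def pulse_socket_path(pulse_server):
    """Extract the filesystem path from a `unix:<path>` PULSE_SERVER value, else None."""
    prefix = "unix:"
    if pulse_server.startswith(prefix):
        return pulse_server[len(prefix):]
    return None


def check_pulse(environ=os.environ):
    """Report whether the PulseAudio socket and cookie named in `environ` exist."""
    server = environ.get("PULSE_SERVER", "unix:/run/user/1000/pulse/native")
    cookie = environ.get("PULSE_COOKIE", "/run/user/1000/pulse/cookie")
    socket_path = pulse_socket_path(server)
    return {
        "socket": bool(socket_path) and os.path.exists(socket_path),
        "cookie": os.path.isfile(cookie),
    }


# Usage, for example inside the running container:
# print(check_pulse())
```

If `socket` is false, the `SANAD_PULSE_DIR` mount is the first thing to inspect. If only `cookie` is false, try the alternative cookie path noted in the table.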
---
## Dashboard & features (http://&lt;robot&gt;:8011)
Two UIs from one server:
- **`/`** — a clean **P1 control page** with cards: Conversation (start/stop),
Say-a-line, **Persona** (Save & Apply), **Gemini API key**, **Audio** (speaker
profile + volume + mute + rescan), and a live Logs view. This is the everyday
UI — no API knowledge needed.
- **`/full`** — the complete Sanad SPA (advanced), tabs
**operations · voice · recordings · settings**, with non-P1 tabs hidden
(motion/recognition/nav/mask/temperature/terminal belong to other packages).
**What you can do (cards on `/`, and the matching endpoints):**
| You want to… | Where / endpoint |
|---|---|
| **Talk to the robot** (start/stop the live conversation) | Voice · `POST /api/live-subprocess/start\|stop`, `/api/voice/connect\|disconnect` |
| **Make it say a specific line** | Voice / Typed-replay · `POST /api/voice/generate`, `POST /api/typed-replay/say`, `POST /api/p1/say` |
| **Change the robot persona** (who it is, tone, **language/dialect**) | Settings · `GET/POST /api/p1/persona` (or base `/api/prompt`) |
| **Set / update the Gemini API key** | Settings · `GET/POST /api/p1/api-key` |
| **Pick speaker/mic** (chest vs Anker/USB), **volume**, mute | Audio · `/api/audio/devices\|profiles\|select-profile\|select-manual\|g1-speaker/volume\|*/mute\|refresh\|reset` |
| **Manage saved recordings** (save/replay/rename/delete) | Recordings · `/api/records/*`, `/api/typed-replay/*` |
| **See logs / system / health** | Settings · `/api/logs/*`, `/ws/logs`, `/api/system/info`, `/api/health` |
**Endpoint groups:** `/` and `/full` (SPA) · `/api/package` (manifest + license +
api-key status) · **`/api/p1/*`** (api-key, persona, say, settings) · `/api/voice/*` ·
`/api/audio/*` · `/api/prompt/*` · `/api/typed-replay/*` · `/api/records/*` ·
`/api/logs/*` · `/api/live-subprocess/*` · `/api/health` · `/api/system/info` ·
`/ws/logs`.
### Change the robot persona
The persona is the system prompt at `scripts/sanad_script.txt` (who Sanad is,
tone, and the language/dialect it speaks). Edit it from the Settings tab or:
```bash
curl http://<robot>:8011/api/p1/persona # current persona + rules
curl -X POST http://<robot>:8011/api/p1/persona \
-H 'Content-Type: application/json' \
-d '{"content":"You are Sanad, a friendly Emirati guide. Speak Khaleeji Arabic..."}'
```
`POST /api/p1/persona` writes the persona **and restarts the live session** so it
takes effect immediately (the base `/api/prompt/update` writes the file but a
running session keeps the old persona until restarted). This is also how you steer
the conversation **language** — put the language directive in the persona.
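The same persona update can be scripted from Python with only the standard library. This sketch builds the `POST /api/p1/persona` request shown in the `curl` example, using the `content` field from that example. `persona_request` and its `language_directive` parameter are illustrative names, and appending the directive as a final line is just one way to place it in the persona.

```python
import json
import urllib.request


def persona_request(base_url, persona, language_directive=""):
    """Build (not send) the POST /api/p1/persona request, with an optional language line."""
    content = persona.strip()
    if language_directive:
        content = f"{content}\n{language_directive.strip()}"
    body = json.dumps({"content": content}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/p1/persona",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (sends the request; the robot address is a placeholder):
# req = persona_request("http://<robot>:8011",
#                       "You are Sanad, a friendly Emirati guide.",
#                       "Speak Khaleeji Arabic.")
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

Keeping request construction separate from sending makes it easy to inspect the payload before it reaches the robot, since a successful POST restarts the live session.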
### Set / update the Gemini API key
Two ways, both available in P1:
- **Base (Sanad):** `GET/POST /api/voice/api-key` — the SPA Voice/Settings tab
uses this. POST persists to `data/motions/config.json`, hot-swaps the in-memory
key, and disconnects the short-session client. The **live** Gemini subprocess
must be restarted (Stop→Start) to pick it up.
- **P1 convenience:** `GET/POST /api/p1/api-key` — same persist + hot-swap, and
**also auto-restarts the live Gemini subprocess** so the new key applies
immediately. `GET /api/p1/settings` returns api-key status + persona + language +
audio profile + whether a live session is running.
```bash
# set or update the key (works for first-time set AND replacing an existing key)
curl -X POST http://<robot>:8011/api/p1/api-key \
-H 'Content-Type: application/json' -d '{"api_key":"AIza...."}'
# check status (masked; never returns the full key)
curl http://<robot>:8011/api/p1/api-key
```
The key is masked in every response and persisted to `data/motions/config.json`,
which takes precedence over the `SANAD_GEMINI_API_KEY` env var and `core_config.json`.
---
## Licensing
A signed **Ed25519** `sanad.lic` decides which packages a robot may run.
`pubkey.ed25519` is baked into every image at `/etc/sanad/pubkey.ed25519`, so
**verification is fully offline** — no network, no license server.
`docker-compose.yml` mounts the license read-only:
```yaml
- "${SANAD_LICENSE_FILE:-./license/sanad.lic.example}:/etc/sanad/sanad.lic:ro"
```
**Bundled demo license** (`license/sanad.lic.example`): robot_id `G1-SN-DEMO-0001`,
**unbound**, expires `2030-01-01`. It entitles **P1 + P3** with feature
`language=ar`, so **P1 starts out of the box** for evaluation. P1 itself is gated by
the license `packages.P1` bit; the `language` feature sets the default conversation
language when `SANAD_LANGUAGE` is unset.
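The entitlement decision can be pictured as a check on the license claims once the signature has been verified. The sketch below is not the `sanad_pkg.license_check` implementation. It assumes the claims are a JSON object with `packages`, `features` and `robot_id` (the fields named for `claims.json`) plus an `expires` ISO date, which is a guessed field name, and it treats the license as invalid from the expiry date onward. Signature verification and hardware binding are deliberately left out.

```python
from datetime import date


def is_entitled(claims, package, today=None):
    """Check already-verified claims for `package`; signature verification is not shown."""
    today = today or date.today()
    expires = date.fromisoformat(claims["expires"])  # assumed field name and format
    if today >= expires:
        return False
    return bool(claims.get("packages", {}).get(package, False))


# Claims mirroring the bundled demo license described above (layout is an assumption).
demo_claims = {
    "robot_id": "G1-SN-DEMO-0001",
    "expires": "2030-01-01",
    "packages": {"P1": True, "P2": False, "P3": True, "P4": False},
    "features": {"language": "ar"},
}

print(is_entitled(demo_claims, "P1", today=date(2026, 7, 5)))
print(is_entitled(demo_claims, "P2", today=date(2026, 7, 5)))
```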
**Sign a real per-robot license (vendor side — keep the private key OFF the robot):**
```bash
# once: generate the vendor keypair (privkey stays on the workstation)
python licensing/sign_license.py gen-keys --out-dir .
# edit claims.json — set packages{P1..P4}, features{...}, robot_id
# (+ optional machine_fingerprint for hardware binding)
python licensing/sign_license.py sign --key privkey.ed25519 --in claims.json --out sanad.lic
```
Put `sanad.lic` on the robot and point `SANAD_LICENSE_FILE` at it (or replace
`license/sanad.lic.example`). **Optional hardware binding:** set
`SANAD_LICENSE_BIND=1` to bind to the machine fingerprint (also uncomment the
`/etc/machine-id` mount in `docker-compose.yml`). Verify entitlement the way the
entrypoint does:
```bash
SANAD_LICENSE=license/sanad.lic python -m sanad_pkg.license_check P1
```
The full brand-new-robot signing/binding walkthrough is in
**[`NEW_ROBOT_SETUP.md`](NEW_ROBOT_SETUP.md)**.
---
## Status
**Built and validated locally, then built and smoke-tested on the dev robot
(11/11 PASS).** P1 is the reference-tested package. The other packages (P2/P3/P4)
still need to pass the same on-robot Docker build and smoke test.