# Unitree G1 - Sanad Package 2 — Premium Communication

Everything **Package 1** does (hands-free Gemini Live conversation, chest or USB
audio) **plus** multilingual auto-detect, voice-command **arm gestures**, gestures
while speaking, wake-phrases, skills, and a lip-syncing + expressive LED **"Shining
Mask"**. Dashboard on **:8012**. (Voice-command **locomotion/walking** is *deferred*
to a later pass.)

This package is **self-contained** — it vendors the SanadV3 engine and the flat BLE
Mask lib under `vendor/` and builds `FROM python:3.10-slim-bookworm`, so a clone of
just this folder builds and runs with **no** `sanad-base`, `Sanad_Core`, or sibling
checkout. Full design notes: **[PLAN.md](PLAN.md)**.

## Install (per-robot, first run)

The workstation is canonical — you **build and run Docker on the robot**, but you
edit and stage the folder on the workstation and copy it over.

**1. Get the package onto the robot.** From the workstation `Packages/` directory,
rsync just this folder to the robot's deploy dir (dev robot = `10.255.254.86`):
```bash
rsync -az --exclude __pycache__ --exclude .git \
  Sanad_Package_2 unitree@<robot>:~/sanad_deploy/
```
Only this folder is needed — the engine and mask lib are vendored under `vendor/`.

**2. License.** A signed **Ed25519** `sanad.lic` decides which packages a robot may
run, verified **fully offline** against `pubkey.ed25519` (baked into the image at
`/etc/sanad/pubkey.ed25519`). The build and a first run work out of the box with the
**bundled demo license** (`license/sanad.lic.example` — `robot_id G1-SN-DEMO-0001`,
unbound, entitles **P1+P2+P3**, expires 2030-01-01). A **production delivery needs a
signed per-robot license** — see [Licensing](#licensing) and, for the full flow,
**[NEW_ROBOT_SETUP.md](NEW_ROBOT_SETUP.md)**.

**3. Keyless — the customer adds their own Gemini key.** The image ships with **no
Gemini API key baked in**. On first open of the dashboard, paste your own key in the
**Gemini API key** card (step 1 below). Nothing to configure at build time.

For a brand-new G1 end-to-end (prereqs, build, signing + binding a P2 license, mask
pairing, verify) follow **[NEW_ROBOT_SETUP.md](NEW_ROBOT_SETUP.md)**.

## Run with Docker (quick start)

On the robot, **from this folder** (`~/sanad_deploy/Sanad_Package_2/`) — nothing
else required:
```bash
cp .env.example .env            # optional: license path / audio / mask MAC
docker compose up -d --build    # build (vendored engine + mask + chest-audio SDK) + run
# Jetson Docker without buildx:  DOCKER_BUILDKIT=0 docker compose up -d --build
```
`network_mode: host` is **required** (G1 DDS discovery + chest audio + Gemini cloud +
host BlueZ for the mask) — it is already set in `docker-compose.yml`.

Then open **`http://<robot-ip>:8012`** (the image ships **keyless**):
1. **Gemini API key** card → paste your key.
2. **Persona** card → who the robot is; the bilingual prompt drives **multilingual
   auto-detect** (saving restarts the live session).
3. **Audio** card → pick chest vs USB/Anker + volume; **Mask** tab → connect the LED
   face and run a lip-sync test.
4. **Live-voice** controls (inside the existing tabs, not a separate tab) → enable
   voice-command **arm gestures** (default OFF), then press **Start** in the
   Conversation card and talk.

> The **LED mask** must be paired / in range and **freed from its phone app** (BLE
> allows one central at a time); `bleak` is pinned to **0.22.3** for BlueZ 5.53
> compatibility.

## Manage it

```bash
docker compose logs -f          # live logs
docker compose down             # stop
./test_p2.sh <robot-ip>:8012    # smoke test (expect 16/16 PASS)
```
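
For a quick scripted check between full smoke runs, the `/api/health` endpoint listed under the API groups can be probed directly. The following is a minimal sketch, not part of the package: the helper names `health_url` and `is_healthy` are hypothetical, and it only checks for an HTTP 200 because the response body's schema is not documented here.

```python
import http.client
import urllib.request


def health_url(host: str, port: int = 8012) -> str:
    """Build the health URL; 8012 is the dashboard port named in this README."""
    return f"http://{host}:{port}/api/health"


def is_healthy(host: str, port: int = 8012, timeout: float = 3.0) -> bool:
    """Return True when /api/health answers with HTTP 200.

    The body is deliberately not parsed: its format is an unknown here.
    """
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError/HTTPError, refused connections and timeouts are OSError
        # subclasses; HTTPException covers malformed HTTP replies.
        return False
```

Calling `is_healthy("<robot-ip>")` from the workstation returns `False` rather than raising when the container is down or the robot is unreachable, which makes it easy to use in a retry loop.
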

- **Auto-start on boot:** `sudo systemctl enable docker` — `docker-compose.yml`
  already runs P2 with `restart: unless-stopped`, so it returns after a reboot.
- **No Docker? (dev mode)** run against `./vendor` in the robot's `gemini_sdk` conda
  env:
  ```bash
  cd ~/sanad_deploy/Sanad_Package_2
  ./p2ctl.sh start      # launch on :8012 (runs against ./vendor)
  ./p2ctl.sh status     # process + /api/health
  ./p2ctl.sh logs 80    # tail the P2 log
  ./p2ctl.sh restart
  ./p2ctl.sh stop
  ```
  The conda env must have `google-genai` and (for the mask) `bleak==0.22.3` + Pillow.

**Entrypoint flow** (`entrypoint.sh`): (1) **license gate** —
`python3 -m sanad_pkg.license_check P2`; if the robot is **not** entitled the
container exits **cleanly** (code 0) so the restart policy never crash-loops; (2)
resolve config with precedence **env var > license feature > `config/p2_config.json`**;
(3) a **preflight** that prints clear diagnostics (python / google-genai / pyaudio /
bleak / Pillow / mask lib / unitree SDK / audio profile); (4) `exec app_p2.py`.

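
The precedence in step (2) can be pictured as a lookup that walks three layers in order and takes the first usable value. This is only an illustrative sketch: the function name `resolve`, the shared key names, and the choice to treat an empty string as "unset" are assumptions, not the actual logic in `entrypoint.sh`.

```python
from typing import Any, Mapping, Optional


def resolve(
    key: str,
    env: Mapping[str, Any],
    license_features: Mapping[str, Any],
    file_defaults: Mapping[str, Any],
) -> Optional[Any]:
    """Return the first non-empty value for `key`, highest-priority layer first.

    Layer order mirrors the README: env var > license feature > p2_config.json.
    Treating None and "" as "not set" is an assumption made for this sketch.
    """
    for layer in (env, license_features, file_defaults):
        value = layer.get(key)
        if value not in (None, ""):
            return value
    return None


# Hypothetical layer contents, for illustration only.
env = {"audio_profile": "plugged"}
license_features = {"audio_profile": "builtin", "language": "ar"}
file_defaults = {"audio_profile": "builtin", "language": "", "dds_interface": "eth0"}

print(resolve("audio_profile", env, license_features, file_defaults))
print(resolve("dds_interface", env, license_features, file_defaults))
```

With these sample layers the environment wins for `audio_profile`, while `dds_interface` falls through to the file defaults because neither higher layer sets it.
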
## What it ships

- `app_p2.py` — launcher: bootstraps the `Project.Sanad` namespace + the flat Mask
  path, constructs the P2 superset (comms + arm + `FaceController` mask), wires
  lip-sync (Gemini `[[MOUTH:n]]` markers → `mask_face.set_mouth`),
  gestures-while-speaking, lifelike face state, and the arm↔loco motion-block
  interlock seam (no-op while locomotion is deferred); mounts the P1 + premium
  routers; serves the SanadV3 SPA (non-P2 tabs hidden) on :8012.
- `routes_p2.py` — `/api/p2/*` convenience routes (api-key, persona, say, logs,
  settings) that also restart the live Gemini session so a new key/persona applies
  immediately.
- `vendor/Sanad` (SanadV3 engine) · `vendor/sanad_pkg` (license + bus lib) ·
  `vendor/mask` (flat BLE Shining-Mask lib, its own path via `SANAD_MASK_DIR=/app/mask`).
  Refresh from a monorepo checkout with `./sync_vendor.sh`.
- `Dockerfile` — self-contained `FROM python:3.10-slim-bookworm`; adds BlueZ/D-Bus,
  `bleak==0.22.3`, Pillow, and (when `WITH_UNITREE_SDK=1`) a full CycloneDDS 0.10.2 +
  `unitree_sdk2_python` for chest audio.
- `docker-compose.yml` — standalone build + run; `network_mode: host`,
  `cap_add: NET_ADMIN`, and mounts `/var/run/dbus` + `/dev/bus/usb` for the mask.
- `entrypoint.sh` — license gate → resolve env → preflight → launch.
- `config/` — `p2_config.json` (defaults) + `mask_config.json` (persisted face
  colors). `license/` — `pubkey.ed25519` + demo `sanad.lic.example`.
- `test_p2.sh` — 16-check smoke test · `p2ctl.sh` — dev-mode control script ·
  `sync_vendor.sh` — re-vendor the engine · `NEW_ROBOT_SETUP.md` / `PLAN.md`.

## Configuration (env vars)

Copy `.env.example` → `.env` (docker compose reads it automatically). Precedence at
runtime is **env var > license feature > `config/p2_config.json`**.

| Var | Default | What it does |
|---|---|---|
| `SANAD_LICENSE_FILE` | `./license/sanad.lic.example` | Host path to the signed `sanad.lic`, mounted **read-only** at `/etc/sanad/sanad.lic`. Point it at your per-robot license for production. |
| `SANAD_LICENSE_BIND` | `0` | `1` enforces machine-fingerprint binding — also uncomment the `/etc/machine-id` mount in `docker-compose.yml`. |
| `SANAD_AUDIO_PROFILE` | `builtin` | `builtin` = G1 chest over DDS · `plugged` = USB (e.g. Anker) via PulseAudio. Use `plugged` on any non-G1 host. |
| `SANAD_DDS_INTERFACE` | `eth0` | DDS interface to the G1 firmware (chest audio + arm). |
| `SANAD_LANGUAGE` | *(empty)* | Empty = **multilingual auto-detect** (P2's headline feature). Set `ar`/`en` only to force a single language. |
| `SANAD_MASK_ADDRESS` | *(empty)* | Pin the LED mask's BLE MAC; empty = auto-discover by name prefix. |
| `SANAD_PULSE_DIR` | `/run/user/1000/pulse` | Host PulseAudio runtime dir mounted into the container for plugged/BT speaker volume. |
| `PULSE_SERVER` | `unix:/run/user/1000/pulse/native` | PulseAudio socket the (root) container uses to reach the uid-1000 server. |
| `PULSE_COOKIE` | `/run/user/1000/pulse/cookie` | PulseAudio auth cookie (root→uid-1000). |
| `WITH_UNITREE_SDK` | `1` | **Build arg.** `1` bundles CycloneDDS 0.10.2 + `unitree_sdk2_python` so **chest audio works out of the box**; `0` = leaner USB/plugged-only image. |
| `BASE_OS_IMAGE` | `python:3.10-slim-bookworm` | **Build arg.** Base image; override only for a GPU build. |
| `SANAD_IMAGE` | `sanad-p2:latest` | Image name/tag — set a registry path for pull-and-run at fleet scale. |

Audio note: the volume slider drives **all** speaker types, as in SanadV3 — the
unified `POST /api/audio/g1-speaker/volume` sets the G1 chest (DDS `SetVolume`) **and**
the active PulseAudio sink (plugged/Bluetooth, e.g. JBL). The plugged/BT half needs
the host PulseAudio socket + cookie (the `PULSE_*` vars above); chest audio needs
none of it. One-time host setup for a stable boot-time socket:
`loginctl enable-linger unitree`. Advanced mask overrides also exist:
`SANAD_MASK_NAME_PREFIX` (default `MASK`) and `SANAD_MASK_ADAPTER`.

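
To make the mask settings concrete, here is a hedged sketch of how a pinned address and a name-prefix fallback could interact. It is not the vendored mask lib: `pick_mask` and `scan_for_mask` are hypothetical names, the case-sensitive prefix match is an assumption, and the only real library call used is bleak's `BleakScanner.discover`.

```python
from typing import Iterable, Optional, Tuple


def pick_mask(
    devices: Iterable[Tuple[str, Optional[str]]],
    prefix: str = "MASK",
    pinned_address: Optional[str] = None,
) -> Optional[str]:
    """Choose a BLE address from (address, advertised_name) pairs.

    A pinned address (cf. SANAD_MASK_ADDRESS) wins outright; otherwise the first
    device whose name starts with the prefix (cf. SANAD_MASK_NAME_PREFIX) is used.
    """
    if pinned_address:
        return pinned_address
    for address, name in devices:
        if name and name.startswith(prefix):
            return address
    return None


async def scan_for_mask(prefix: str = "MASK", timeout: float = 5.0) -> Optional[str]:
    """Scan with bleak and apply pick_mask; needs a host Bluetooth adapter."""
    from bleak import BleakScanner  # imported lazily so pick_mask runs without bleak

    found = await BleakScanner.discover(timeout=timeout)
    return pick_mask([(d.address, d.name) for d in found], prefix=prefix)
```

Keeping the selection rule in a pure function means it can be exercised without Bluetooth hardware; `asyncio.run(scan_for_mask())` performs the live scan on the robot.
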
## Dashboard & features

**SPA tabs** (rendered by `app_p2.py` `P2_SPA_TABS`) — **6 tabs**: `operations`,
`voice`, `motion`, `mask`, `recordings`, `settings`. The premium controls for
**live-voice** (voice-command triggers), **wake-phrases**, and **skills** live
*inside* those tabs (operations / voice / motion), not as separate tabs — but their
routers are fully mounted (`/api/live-voice/*`, `/api/wake-phrases/*`,
`/api/skills/*`). Tabs for `recognition`, `zones`, `navigation`, `controller`,
`temp`, and `terminal` belong to other packages and stay hidden.

**API endpoint groups** (mounted by `app_p2.py`, confirmed against `test_p2.sh`) — all
the P1 groups **plus** the premium ones:

| Group | What it covers |
|---|---|
| `/api/health` · `/api/system/info` | health + system info |
| `/api/package` | manifest + license state + entitled features + api-key status |
| `/api/p2/*` | convenience: `api-key`, `persona`, `say`, `settings`, `logs/delete` (with live-session restart) |
| `/api/voice/*` · `/api/prompt/*` · `/api/typed-replay/*` | conversation, persona/prompt, typed replay (multilingual) |
| `/api/live-subprocess/*` · `/api/live-voice/*` | start/stop the live Gemini session · voice-command gesture status/toggle |
| `/api/wake-phrases/*` · `/api/skills/*` | wake-phrases · skills registry |
| `/api/motion/*` · `/api/macros/*` · `/api/replay/*` · `/api/scripts/*` | **arm** gestures/actions, macros, replays, scripts (locomotion deferred) |
| `/api/mask/*` | LED mask: connect, brightness, face start/stop, speaking, mouth, expressions, status (+ QR/social) |
| `/api/audio/*` | speaker/mic profiles, `g1-speaker/volume`, mute, refresh |
| `/api/records/*` · `/api/logs/*` · `/ws/logs` | saved recordings · logs list/bundle · live log stream |

Lip-sync + expression source of truth is **Gemini markers**, not raw amplitude:
`[[MOUTH:n]]` drives the mouth, `[[FACE:name]]` drives brief expressions, and
`[[SHOW:account]]` relays a social QR onto the mask.

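
As an illustration of the marker scheme, the sketch below pulls the three marker kinds out of a piece of model text and returns the remaining text separately. It is an assumption-laden example rather than the engine's parser: `split_markers` is a hypothetical name, the sample values `smile` and `instagram` are made up, and the real pipeline may receive markers in a different form.

```python
import re
from typing import List, Tuple

# Matches [[MOUTH:n]], [[FACE:name]] and [[SHOW:account]] as written in this README.
MARKER_RE = re.compile(r"\[\[(MOUTH|FACE|SHOW):([^\]\s]+)\]\]")


def split_markers(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (text without markers, [(kind, value), ...] in order of appearance)."""
    events = [(m.group(1), m.group(2)) for m in MARKER_RE.finditer(text)]
    spoken = MARKER_RE.sub("", text)
    # Collapse whitespace runs left behind where markers were removed.
    spoken = re.sub(r"\s{2,}", " ", spoken).strip()
    return spoken, events


spoken, events = split_markers("[[FACE:smile]]Hello [[MOUTH:3]]there [[SHOW:instagram]]")
print(spoken)
print(events)
```

Each `(kind, value)` pair could then be routed to the matching mask call, for example `MOUTH` values to `mask_face.set_mouth` as described under "What it ships".
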
## Licensing

The bundled `license/sanad.lic.example` is a **demo** license
(`robot_id G1-SN-DEMO-0001`, unbound, expires 2030-01-01) that entitles
**P1 + P2 + P3** with features `multilingual`, `voice_command_motion`, `lipsync`,
`mask`, `face_rec`, `places`, `memory`. The license **features that gate P2** are
`multilingual`, `voice_command_motion` (arm gestures), `lipsync`, and `mask`. A future
`voice_command_locomotion` feature will gate the deferred walking.

To sign a **real per-robot license** (vendor side — keep the private key **off** the
robot):
```bash
python licensing/sign_license.py gen-keys --out-dir .      # once, ever
# edit claims.json: packages {P1..P4} + features {...} + robot_id (+ optional machine_fingerprint)
python licensing/sign_license.py sign --key privkey.ed25519 \
  --in claims.json --out sanad.lic
```
Put `sanad.lic` on the robot and point `SANAD_LICENSE_FILE` at it (or replace
`license/sanad.lic.example`). Verification is **fully offline**. Optional hardware
binding: set `SANAD_LICENSE_BIND=1` to bind the license to the machine fingerprint
(and uncomment the `/etc/machine-id` mount). Verify entitlement the way the entrypoint
does:
```bash
SANAD_LICENSE=license/sanad.lic python -m sanad_pkg.license_check P2
```

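
Once the signature has been verified, the entitlement decision reduces to a few checks on the claims. The sketch below shows one plausible shape of that decision and is not the code in `sanad_pkg.license_check`: the claim field names, the rule that a license is invalid from its expiry date onward, and the rule that binding fails closed for an unbound license are all assumptions.

```python
from datetime import date
from typing import Any, Mapping, Optional


def is_entitled(
    claims: Mapping[str, Any],
    package: str,
    today: date,
    machine_fingerprint: Optional[str] = None,
    enforce_binding: bool = False,
) -> bool:
    """Decide entitlement from already signature-verified claims (hypothetical schema)."""
    if package not in claims.get("packages", []):
        return False
    expires = claims.get("expires")
    if expires and today >= date.fromisoformat(expires):
        return False  # assumption: the license is invalid from the expiry date onward
    if enforce_binding:
        bound = claims.get("machine_fingerprint")
        if not bound or bound != machine_fingerprint:
            return False  # assumption: binding fails closed when unbound or mismatched
    return True


# Values taken from the demo license described above; the key names are illustrative.
demo_claims = {
    "robot_id": "G1-SN-DEMO-0001",
    "packages": ["P1", "P2", "P3"],
    "features": ["multilingual", "voice_command_motion", "lipsync", "mask",
                 "face_rec", "places", "memory"],
    "expires": "2030-01-01",
    "machine_fingerprint": None,
}

print(is_entitled(demo_claims, "P2", date(2026, 1, 1)))
```

Under these assumptions the demo claims entitle P2 but not P4, and turning on binding rejects the unbound demo license, which is consistent with needing a signed per-robot license for production.
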
## Status

**Built + validated locally** on the workstation (compile, import resolution,
shim-symbol coverage, namespace bootstrap, `license_check P2` entitled, YAML, mask-lib
import), and a prior on-robot smoke run passed **16/16**. The remaining production
gate is a **clean self-contained Docker build + on-robot smoke on the target G1**:
chest audio and arm gestures are confirmed, but **mask lip-sync needs the physical BLE
mask** paired and freed from its phone app. (P1 is already robot-tested; the on-robot
Docker build + smoke is the outstanding gate for P2/P3/P4.)