Sanad_Package_2/README.md
2026-07-05 13:34:15 +00:00

Unitree G1 - Sanad Package 2 — Premium Communication

Everything Package 1 does (hands-free Gemini Live conversation, chest or USB audio) plus multilingual auto-detect, voice-command arm gestures, gestures while speaking, wake-phrases, skills, and a lip-syncing + expressive LED "Shining Mask". Dashboard on :8012. (Voice-command locomotion/walking is deferred to a later pass.)

This package is self-contained — it vendors the SanadV3 engine and the flat BLE Mask lib under vendor/ and builds FROM python:3.10-slim-bookworm, so a clone of just this folder builds and runs with no sanad-base, Sanad_Core, or sibling checkout. Full design notes: PLAN.md.

Install (per-robot, first run)

The workstation is canonical — you build and run Docker on the robot, but you edit and stage the folder on the workstation and copy it over.

1. Get the package onto the robot. From the workstation Packages/ directory, rsync just this folder to the robot's deploy dir (dev robot = 10.255.254.86):

rsync -az --exclude __pycache__ --exclude .git \
  Sanad_Package_2 unitree@<robot>:~/sanad_deploy/

Only this folder is needed — the engine and mask lib are vendored under vendor/.

2. License. A signed Ed25519 sanad.lic decides which packages a robot may run. It is verified fully offline against pubkey.ed25519 (baked into the image at /etc/sanad/pubkey.ed25519). The build and a first run work out of the box with the bundled demo license (license/sanad.lic.example: robot_id G1-SN-DEMO-0001, unbound, entitles P1+P2+P3, expires 2030-01-01). A production delivery needs a signed per-robot license; see Licensing and, for the full flow, NEW_ROBOT_SETUP.md.

3. Keyless — the customer adds their own Gemini key. The image ships with no Gemini API key baked in. On first open of the dashboard, paste your own key in the Gemini API key card (step 1 below). Nothing to configure at build time.

For a brand-new G1 end-to-end (prereqs, build, signing + binding a P2 license, mask pairing, verify) follow NEW_ROBOT_SETUP.md.

Run with Docker (quick start)

On the robot, from this folder (~/sanad_deploy/Sanad_Package_2/) — nothing else required:

cp .env.example .env            # optional: license path / audio / mask MAC
docker compose up -d --build    # build (vendored engine + mask + chest-audio SDK) + run
#   Jetson Docker without buildx:  DOCKER_BUILDKIT=0 docker compose up -d --build

network_mode: host is required (G1 DDS discovery + chest audio + Gemini cloud + host BlueZ for the mask) — it is already set in docker-compose.yml.

Then open http://<robot-ip>:8012 (the image ships keyless):

  1. Gemini API key card → paste your key.
  2. Persona card → who the robot is; the bilingual prompt drives multilingual auto-detect (saving restarts the live session).
  3. Audio card → pick chest vs USB/Anker + volume; Mask tab → connect the LED face and run a lip-sync test.
  4. Live-voice tab → enable voice-command arm gestures (default OFF), then press Start in the Conversation card and talk.

The LED mask must be paired / in range and freed from its phone app (BLE allows one central at a time); bleak is pinned to 0.22.3 for BlueZ 5.53 compatibility.

Manage it

docker compose logs -f           # live logs
docker compose down              # stop
./test_p2.sh <robot-ip>:8012     # smoke test (expect 16/16 PASS)
  • Auto-start on boot: sudo systemctl enable docker. docker-compose.yml already runs P2 with restart: unless-stopped, so it returns after a reboot.
  • No Docker? (dev mode) run against ./vendor in the robot's gemini_sdk conda env:
    cd ~/sanad_deploy/Sanad_Package_2
    ./p2ctl.sh start      # launch on :8012 (runs against ./vendor)
    ./p2ctl.sh status     # process + /api/health
    ./p2ctl.sh logs 80    # tail the P2 log
    ./p2ctl.sh restart
    ./p2ctl.sh stop
    
    The conda env must have google-genai and (for the mask) bleak==0.22.3 + Pillow.

Entrypoint flow (entrypoint.sh): (1) license gate: python3 -m sanad_pkg.license_check P2; if the robot is not entitled, the container exits cleanly (code 0) so the restart policy never crash-loops; (2) resolve config with precedence env var > license feature > config/p2_config.json; (3) a preflight that prints clear diagnostics (python / google-genai / pyaudio / bleak / Pillow / mask lib / unitree SDK / audio profile); (4) exec app_p2.py.
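The precedence in step (2) can be illustrated with a minimal sketch. This is not the actual entrypoint.sh logic: the function name resolve_setting, the SANAD_<NAME> key convention, the idea that a license feature can carry a value, and the choice to treat an empty env var as unset are all assumptions made for illustration.

```python
from typing import Any, Mapping, Optional


def resolve_setting(
    name: str,
    env: Mapping[str, str],
    license_features: Mapping[str, Any],
    file_defaults: Mapping[str, Any],
) -> Optional[Any]:
    """Return the first value found: env var, then license feature, then file default."""
    env_key = f"SANAD_{name.upper()}"  # hypothetical naming convention
    # Assumption: an empty env var counts as "not set" and falls through.
    if env.get(env_key, "") != "":
        return env[env_key]
    if name in license_features:
        return license_features[name]
    return file_defaults.get(name)


if __name__ == "__main__":
    defaults = {"audio_profile": "builtin", "dds_interface": "eth0"}
    features = {"dds_interface": "eth1"}  # hypothetical value-carrying feature
    env = {"SANAD_AUDIO_PROFILE": "plugged"}
    print(resolve_setting("audio_profile", env, features, defaults))
    print(resolve_setting("dds_interface", env, features, defaults))
```

In this sketch the env var wins for audio_profile, while dds_interface falls through to the license layer because no env var is set for it.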

What it ships

  • app_p2.py: launcher that bootstraps the Project.Sanad namespace + the flat Mask path, constructs the P2 superset (comms + arm + FaceController mask), wires lip-sync (Gemini [[MOUTH:n]] markers → mask_face.set_mouth), gestures-while-speaking, lifelike face state, and the arm↔loco motion-block interlock seam (no-op while locomotion is deferred); mounts the P1 + premium routers; serves the SanadV3 SPA (non-P2 tabs hidden) on :8012.
  • routes_p2.py: /api/p2/* convenience routes (api-key, persona, say, logs, settings) that also restart the live Gemini session so a new key/persona applies immediately.
  • vendor/Sanad (SanadV3 engine) · vendor/sanad_pkg (license + bus lib) · vendor/mask (flat BLE Shining-Mask lib, its own path via SANAD_MASK_DIR=/app/mask). Refresh from a monorepo checkout with ./sync_vendor.sh.
  • Dockerfile — self-contained FROM python:3.10-slim-bookworm; adds BlueZ/D-Bus, bleak==0.22.3, Pillow, and (when WITH_UNITREE_SDK=1) a full CycloneDDS 0.10.2 + unitree_sdk2_python for chest audio.
  • docker-compose.yml — standalone build + run; network_mode: host, cap_add: NET_ADMIN, and mounts /var/run/dbus + /dev/bus/usb for the mask.
  • entrypoint.sh — license gate → resolve env → preflight → launch.
  • config/p2_config.json (defaults) + mask_config.json (persisted face colors). license/pubkey.ed25519 + demo sanad.lic.example.
  • test_p2.sh — 16-check smoke test · p2ctl.sh — dev-mode control script · sync_vendor.sh — re-vendor the engine · NEW_ROBOT_SETUP.md / PLAN.md.

Configuration (env vars)

Copy .env.example to .env (docker compose reads it automatically). Precedence at runtime is env var > license feature > config/p2_config.json.

| Var | Default | What it does |
| --- | --- | --- |
| SANAD_LICENSE_FILE | ./license/sanad.lic.example | Host path to the signed sanad.lic, mounted read-only at /etc/sanad/sanad.lic. Point it at your per-robot license for production. |
| SANAD_LICENSE_BIND | 0 | 1 enforces machine-fingerprint binding; also uncomment the /etc/machine-id mount in docker-compose.yml. |
| SANAD_AUDIO_PROFILE | builtin | builtin = G1 chest over DDS · plugged = USB (e.g. Anker) via PulseAudio. Use plugged on any non-G1 host. |
| SANAD_DDS_INTERFACE | eth0 | DDS interface to the G1 firmware (chest audio + arm). |
| SANAD_LANGUAGE | (empty) | Empty = multilingual auto-detect (P2's headline feature). Set ar/en only to force a single language. |
| SANAD_MASK_ADDRESS | (empty) | Pin the LED mask's BLE MAC; empty = auto-discover by name prefix. |
| SANAD_PULSE_DIR | /run/user/1000/pulse | Host PulseAudio runtime dir mounted into the container for plugged/BT speaker volume. |
| PULSE_SERVER | unix:/run/user/1000/pulse/native | PulseAudio socket the (root) container uses to reach the uid-1000 server. |
| PULSE_COOKIE | /run/user/1000/pulse/cookie | PulseAudio auth cookie (root→uid-1000). |
| WITH_UNITREE_SDK | 1 | Build arg. 1 bundles CycloneDDS 0.10.2 + unitree_sdk2_python so chest audio works out of the box; 0 = leaner USB/plugged-only image. |
| BASE_OS_IMAGE | python:3.10-slim-bookworm | Build arg. Base image; override only for a GPU build. |
| SANAD_IMAGE | sanad-p2:latest | Image name/tag; set a registry path for pull-and-run at fleet scale. |

Audio note: the volume slider drives all speaker types like SanadV3 — the unified POST /api/audio/g1-speaker/volume sets the G1 chest (DDS SetVolume) and the active PulseAudio sink (plugged/Bluetooth, e.g. JBL). The plugged/BT half needs the host PulseAudio socket + cookie (the PULSE_* vars above); chest audio needs none of it. One-time host setup for a stable boot-time socket: loginctl enable-linger unitree. Advanced mask overrides also exist: SANAD_MASK_NAME_PREFIX (default MASK) and SANAD_MASK_ADAPTER.
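How SANAD_MASK_ADDRESS and SANAD_MASK_NAME_PREFIX might interact can be sketched as follows. This is an illustration, not the vendored mask lib: pick_mask and find_mask are hypothetical names, and the case-sensitive prefix match is an assumption. Only BleakScanner.discover is a real bleak call, and it is imported lazily so the selection logic can run without a Bluetooth stack.

```python
import asyncio
import os
from typing import Any, Iterable, Optional


def pick_mask(
    devices: Iterable[Any],
    pinned_address: str = "",
    name_prefix: str = "MASK",
) -> Optional[Any]:
    """Choose a device: a pinned MAC wins, otherwise the first name-prefix match."""
    devices = list(devices)
    if pinned_address:
        for dev in devices:
            if dev.address.upper() == pinned_address.upper():
                return dev
        return None  # a pinned address never falls back to discovery by name
    for dev in devices:
        # Some advertisements carry no name, so guard against None.
        if dev.name and dev.name.startswith(name_prefix):
            return dev
    return None


async def find_mask(timeout: float = 5.0) -> Optional[Any]:
    """Scan over BLE and apply the same selection rule to the results."""
    from bleak import BleakScanner  # lazy import: needs bleak and a BLE adapter

    devices = await BleakScanner.discover(timeout=timeout)
    return pick_mask(
        devices,
        os.environ.get("SANAD_MASK_ADDRESS", ""),
        os.environ.get("SANAD_MASK_NAME_PREFIX", "MASK"),
    )


if __name__ == "__main__":
    found = asyncio.run(find_mask())
    print(found.address if found else "no mask in range")
```

If the scan finds nothing, check that the mask is powered, in range, and disconnected from its phone app before retrying.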

Dashboard & features

SPA tabs (rendered by app_p2.py P2_SPA_TABS) — 6 tabs: operations, voice, motion, mask, recordings, settings. The premium controls for live-voice (voice-command triggers), wake-phrases, and skills live inside those tabs (operations / voice / motion), not as separate tabs — but their routers are fully mounted (/api/live-voice/*, /api/wake-phrases/*, /api/skills/*). Tabs for recognition, zones, navigation, controller, temp, and terminal belong to other packages and stay hidden.

API endpoint groups (mounted by app_p2.py, confirmed against test_p2.sh) — all the P1 groups plus the premium ones:

| Group | What it covers |
| --- | --- |
| /api/health · /api/system/info | health + system info |
| /api/package | manifest + license state + entitled features + api-key status |
| /api/p2/* | convenience: api-key, persona, say, settings, logs/delete (with live-session restart) |
| /api/voice/* · /api/prompt/* · /api/typed-replay/* | conversation, persona/prompt, typed replay (multilingual) |
| /api/live-subprocess/* · /api/live-voice/* | start/stop the live Gemini session · voice-command gesture status/toggle |
| /api/wake-phrases/* · /api/skills/* | wake-phrases · skills registry |
| /api/motion/* · /api/macros/* · /api/replay/* · /api/scripts/* | arm gestures/actions, macros, replays, scripts (locomotion deferred) |
| /api/mask/* | LED mask: connect, brightness, face start/stop, speaking, mouth, expressions, status (+ QR/social) |
| /api/audio/* | speaker/mic profiles, g1-speaker/volume, mute, refresh |
| /api/records/* · /api/logs/* · /ws/logs | saved recordings · logs list/bundle · live log stream |
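A quick way to confirm the dashboard is up is to probe /api/health from the workstation. The sketch below uses only the Python standard library. The helper names health_url and is_healthy are hypothetical, and it assumes only that a healthy service answers with HTTP 200; the response body format is not documented here, so it is not parsed.

```python
import http.client
import sys
import urllib.error
import urllib.request


def health_url(host: str, port: int = 8012) -> str:
    """Build the health endpoint URL for a robot host or IP."""
    return f"http://{host}:{port}/api/health"


def is_healthy(host: str, port: int = 8012, timeout: float = 3.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    robot = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    print("healthy" if is_healthy(robot) else "unreachable or unhealthy")
```

For full coverage of the endpoint groups, test_p2.sh remains the reference smoke test.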

Lip-sync + expression source of truth is Gemini markers, not raw amplitude: [[MOUTH:n]] drives the mouth, [[FACE:name]] drives brief expressions, and [[SHOW:account]] relays a social QR onto the mask.
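To make the marker format concrete, here is a minimal sketch of separating markers from the text around them. It is not the engine's parser: MARKER_RE and split_markers are hypothetical names, and it assumes the markers arrive inline in a text stream and that the value after the colon contains no whitespace or closing bracket.

```python
import re
from typing import List, Tuple

# Matches [[MOUTH:n]], [[FACE:name]] and [[SHOW:account]].
MARKER_RE = re.compile(r"\[\[(MOUTH|FACE|SHOW):([^\]\s]+)\]\]")


def split_markers(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (text without markers, [(kind, value), ...] in order of appearance)."""
    markers = [(m.group(1), m.group(2)) for m in MARKER_RE.finditer(text)]
    cleaned = MARKER_RE.sub("", text)
    # Removing a marker can leave doubled spaces behind; collapse them.
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    return cleaned, markers


if __name__ == "__main__":
    cleaned, markers = split_markers("Hello [[FACE:smile]] there [[MOUTH:3]]")
    print(cleaned)
    print(markers)
```

Values are kept as strings here; a caller driving mask_face.set_mouth would still need to convert the MOUTH value to whatever type that method expects.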

Licensing

The bundled license/sanad.lic.example is a demo license (robot_id G1-SN-DEMO-0001, unbound, expires 2030-01-01) that entitles P1 + P2 + P3 with features multilingual, voice_command_motion, lipsync, mask, face_rec, places, memory. The license features that gate P2 are multilingual, voice_command_motion (arm gestures), lipsync, and mask. A future voice_command_locomotion feature will gate the deferred walking.

To sign a real per-robot license (vendor side — keep the private key off the robot):

python licensing/sign_license.py gen-keys --out-dir .        # once, ever
# edit claims.json: packages {P1..P4} + features {...} + robot_id (+ optional machine_fingerprint)
python licensing/sign_license.py sign --key privkey.ed25519 \
       --in claims.json --out sanad.lic

Put sanad.lic on the robot and point SANAD_LICENSE_FILE at it (or replace license/sanad.lic.example). Verification is fully offline. Optional hardware binding: set SANAD_LICENSE_BIND=1 to bind the license to the machine fingerprint (and uncomment the /etc/machine-id mount). Verify entitlement the way the entrypoint does:

SANAD_LICENSE=license/sanad.lic python -m sanad_pkg.license_check P2
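Once the signature has been verified, the entitlement decision reduces to a few checks on the claims. The sketch below illustrates that step only and is not the sanad_pkg.license_check implementation: the function is_entitled, the "expires" field name, the list shape of "packages", and treating the expiry date as exclusive are assumptions. Signature verification itself is not shown.

```python
from datetime import date
from typing import Any, Mapping, Optional


def is_entitled(
    claims: Mapping[str, Any],
    package: str,
    today: date,
    machine_fingerprint: Optional[str] = None,
) -> bool:
    """Check package entitlement, expiry, and optional machine binding."""
    if package not in claims.get("packages", []):
        return False
    expires = claims.get("expires")  # hypothetical field, ISO date string
    if expires and today >= date.fromisoformat(expires):
        return False  # assumption: the license is invalid from the expiry date on
    bound_to = claims.get("machine_fingerprint")
    if bound_to and bound_to != machine_fingerprint:
        return False  # a bound license only runs on the matching machine
    return True


if __name__ == "__main__":
    # Values mirror the bundled demo license described above; field names are assumed.
    demo = {
        "robot_id": "G1-SN-DEMO-0001",
        "packages": ["P1", "P2", "P3"],
        "features": ["multilingual", "voice_command_motion", "lipsync", "mask"],
        "expires": "2030-01-01",
    }
    print(is_entitled(demo, "P2", date.today()))
    print(is_entitled(demo, "P4", date.today()))
```

An unbound license such as the demo one skips the fingerprint check entirely, which matches the SANAD_LICENSE_BIND=0 default.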

Status

Built + validated locally on the workstation (compile, import resolution, shim-symbol coverage, namespace bootstrap, license_check P2 entitled, YAML, mask-lib import), and a prior on-robot smoke run passed 16/16. The remaining production gate is a clean self-contained Docker build + on-robot smoke on the target G1: chest audio and arm gestures are confirmed, but mask lip-sync needs the physical BLE mask paired and freed from its phone app. (P1 is already robot-tested; the on-robot Docker build + smoke is the outstanding gate for P2/P3/P4.)