Sanad Package 1 — Basic Communication
Hands-free conversation in one operator-selected language (Gemini Live), audio via the G1 chest or any plugged USB mic/speaker (Anker). No voice-command motion, vision, recognition, or navigation. Dashboard on :8011.
This package is self-contained — it vendors the Sanad engine under vendor/,
so a clone of this folder builds and runs with no sibling folders or sanad-base.
Quick start (how to start)
On the robot, from this folder (Sanad_Package_1/) — nothing else needed:
cp .env.example .env # optional: set language / audio / license path
docker compose up -d --build # build (vendored engine) + run
# Jetson Docker without buildx: DOCKER_BUILDKIT=0 docker compose up -d --build
Open http://<robot-ip>:8011 and finish setup in the dashboard (the image ships keyless, so add a key first):
- Gemini API key card → paste your key.
- Persona card → set who the robot is + the language/dialect it speaks (saving restarts the live session).
- Audio card → pick chest vs USB/Anker speaker, volume, mute.
- Press Start in the Conversation card and talk.
Manage it:
docker compose logs -f # live logs
docker compose down # stop
./test_p1.sh <robot-ip>:8011 # smoke test (expect 11/11 PASS)
- Auto-start on boot: sudo systemctl enable docker. Compose already runs P1 with restart: unless-stopped, so it returns after a reboot.
- A brand-new G1 (build, sign a license, bind to the robot) → NEW_ROBOT_SETUP.md.
- No Docker? Dev mode in the gemini_sdk conda env → ./p1ctl.sh start (runs against ./vendor).
What it ships
- app_p1.py: launcher. Bootstraps the Project.Sanad namespace, constructs ONLY the comms subsystems (brain, audio_mgr, voice_client, local_tts, typed_replay, live_sub), injects a P1-scoped Project.Sanad.main shim, and mounts ONLY the P1 dashboard routers (voice, audio, prompt, typed-replay, records, logs, live-subprocess, health, system) + the logs websocket. Serves the real Sanad SPA with non-P1 tabs hidden.
- entrypoint.sh: license gate (license_check P1; clean exit if unlicensed), resolves language/audio/port (env > license feature > config/p1_config.json).
- Dockerfile / requirements.txt: SELF-CONTAINED. FROM python:3.10-slim, installs all deps, and bakes in the vendored engine; no sanad-base.
- vendor/Sanad + vendor/sanad_pkg: the vendored Sanad engine + license/bus lib. Refresh from a monorepo checkout with ./sync_vendor.sh.
- config/p1_config.json: defaults (language, audio profile, port, tab set).
- docker-compose.yml: standalone build + run (context: .); the top-level Packages/docker-compose.yml can still run P1 in the fleet via --profile p1.
- license/: public verification key + an example license for the default mount.
It does not fork Sanad — it vendors the canonical source under vendor/
(re-synced by sync_vendor.sh), so the package builds and runs entirely on its
own with no sibling folders.
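The precedence the entrypoint applies (env > license feature > config/p1_config.json) can be sketched as a small lookup. This is only an illustration of the ordering, not the code in entrypoint.sh: the dictionary keys and the SANAD_PORT variable name are assumptions, while SANAD_LANGUAGE and SANAD_AUDIO_PROFILE are the variables named in this README.

```python
# Hypothetical setting names; the real config/p1_config.json schema may differ.
SETTINGS = ("language", "audio_profile", "port")

ENV_NAMES = {
    "language": "SANAD_LANGUAGE",            # named in this README
    "audio_profile": "SANAD_AUDIO_PROFILE",  # named in this README
    "port": "SANAD_PORT",                    # assumed name, for illustration only
}


def resolve_settings(env, license_features, file_defaults):
    """Pick each setting from the first source that provides it.

    Order: environment variable, then license feature, then file default.
    """
    resolved = {}
    for key in SETTINGS:
        env_value = env.get(ENV_NAMES[key])
        if env_value:
            resolved[key] = env_value
        elif key in license_features:
            resolved[key] = license_features[key]
        else:
            resolved[key] = file_defaults.get(key)
    return resolved


settings = resolve_settings(
    env={"SANAD_LANGUAGE": "en"},
    license_features={"language": "ar", "audio_profile": "plugged"},
    file_defaults={"language": "ar", "audio_profile": "builtin", "port": 8011},
)
print(settings)
```

Here the language comes from the environment, the audio profile from the license, and the port from the file defaults. Values read from the environment arrive as strings, so a real resolver would likely convert the port to an integer.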
Run & stop P1
A) Docker (the productized way) — self-contained; from this package dir
(Sanad_Package_1/) on the robot, nothing else required:
docker compose up -d --build # build (vendored engine) + run -> http://<robot>:8011
docker compose logs -f # view logs
docker compose down # stop
# Jetson Docker without buildx: DOCKER_BUILDKIT=0 docker compose up -d --build
# audio: SANAD_AUDIO_PROFILE=builtin (chest) | plugged (USB/Anker)
# language: license `language` feature, or SANAD_LANGUAGE=en docker compose up -d --build
No sanad-base image and no sibling Sanad/ checkout are needed — the engine is
vendored under vendor/. (The top-level Packages/docker-compose.yml can still
run P1 in the multi-package fleet via --profile p1.)
B) Dev mode (no Docker) — run P1 in the robot's gemini_sdk conda env via the
control script. It runs against the vendored engine in ./vendor, so only the
package folder is needed:
cd ~/sanad_deploy/Sanad_Package_1
./p1ctl.sh start # launch on :8011 (runs against ./vendor)
./p1ctl.sh status # process + /api/health
./p1ctl.sh logs 80 # tail the P1 log
./p1ctl.sh restart
./p1ctl.sh stop
Deploy/update from the workstation first:
rsync -az --exclude __pycache__ Project/Packages/Sanad_Package_1 unitree@<robot>:~/sanad_deploy/
Logs: the dashboard's Logs card streams live (/ws/logs) and the ⬇ Download
button saves the full bundle (/api/logs/bundle) as sanad_p1_logs_<ts>.txt.
Endpoints (P1 subset)
/ (filtered SPA) · /api/package (manifest + license + api-key status) ·
/api/voice/* · /api/audio/* · /api/prompt/* · /api/typed-replay/* ·
/api/records/* · /api/logs/* · /api/live-subprocess/* · /api/health ·
/api/system/info · /ws/logs · /api/p1/* (P1 settings, see below).
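After a start or restart it can be handy to wait until the dashboard answers before running anything else. The sketch below polls /api/health using only the standard library. It assumes a healthy service answers with HTTP 200; the response body is not inspected because its shape is not documented here, and the function names are illustrative.

```python
import time
import urllib.request


def fetch_status(url, timeout=3.0):
    """Return the HTTP status code for a GET on url."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def wait_for_health(base_url, attempts=10, delay=1.0, fetch=fetch_status):
    """Poll <base_url>/api/health until it returns 200 or attempts run out."""
    url = base_url.rstrip("/") + "/api/health"
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # Connection refused, timeouts, and HTTP errors all derive from OSError.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    # Replace the host with the robot's address.
    print(wait_for_health("http://127.0.0.1:8011", attempts=3))
```

The fetch parameter is injectable so the retry logic can be exercised without a robot on the network.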
The P1 dashboard (http://<robot>:8011)
- / is a clean P1 control page with cards: Conversation (start/stop), Say-a-line, Persona (Save & Apply), Gemini API key, Audio (speaker profile + volume + mute + rescan), and a live Logs view. This is the everyday UI; no API knowledge needed.
- /full is the complete Sanad SPA (advanced), with non-P1 tabs hidden (motion/recognition/nav/temperature/terminal belong to other packages).
What you can do (cards on /, also the matching endpoints):
| You want to… | Where / endpoint |
|---|---|
| Talk to the robot (start/stop the live conversation) | Voice tab · POST /api/live-subprocess/start |
| Make it say a specific line | Voice/Typed-replay · POST /api/voice/generate, POST /api/typed-replay/say |
| Change the robot persona (who it is, tone, language/dialect) | Settings · GET/POST /api/p1/persona (or /api/prompt) |
| Set / update the Gemini API key | Settings · GET/POST /api/p1/api-key |
| Pick speaker/mic (chest vs Anker/USB), volume, mute | Audio · /api/audio/devices |
| Manage saved recordings (save/replay/rename/delete) | Recordings · /api/records/*, /api/typed-replay/* |
| See logs / system / health | Settings · /api/logs, /ws/logs, /api/system/info, /api/health |
Change the robot persona
The persona is the system prompt at scripts/sanad_script.txt (who Sanad is,
tone, and the language/dialect it speaks). Edit it from the Settings tab or:
curl http://<robot>:8011/api/p1/persona # current persona + rules
curl -X POST http://<robot>:8011/api/p1/persona \
-H 'Content-Type: application/json' \
-d '{"content":"You are Sanad, a friendly Emirati guide. Speak Khaleeji Arabic..."}'
POST /api/p1/persona writes the persona and restarts the live session so it
takes effect immediately (the base /api/prompt/update writes the file but a
running session keeps the old persona until restarted). This is also how you
steer the conversation language (put the language directive in the persona).
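For scripted deployments, the same POST can be issued from Python with the standard library. This is a minimal sketch mirroring the curl call above; the helper names are illustrative, and only the {"content": ...} body shown in that call is assumed.

```python
import json
import urllib.request


def build_persona_request(base_url, content):
    """Build the POST /api/p1/persona request without sending it."""
    payload = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/p1/persona",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def apply_persona(base_url, content, timeout=10.0):
    """Send the persona and return the HTTP status code."""
    req = build_persona_request(base_url, content)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    req = build_persona_request(
        "http://127.0.0.1:8011",
        "You are Sanad, a friendly Emirati guide. Speak Khaleeji Arabic.",
    )
    print(req.get_method(), req.full_url)
```

Because saving restarts the live session, a caller may want to pause briefly before starting a new conversation.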
Set / update the Gemini API key
Two ways, both available in P1:
- Base (Sanad): GET/POST /api/voice/api-key. The SPA Voice/Settings tab uses this. POST persists to data/motions/config.json, hot-swaps the in-memory key, and disconnects the short-session client. The live Gemini subprocess must be restarted (Stop→Start) to pick it up.
- P1 convenience: GET/POST /api/p1/api-key. Same persist + hot-swap, and also auto-restarts the live Gemini subprocess so the new key applies immediately. GET /api/p1/settings returns api-key status + language + audio profile + whether a live session is running.
# set or update the key (works for first-time set AND replacing an existing key)
curl -X POST http://<robot>:8011/api/p1/api-key \
-H 'Content-Type: application/json' -d '{"api_key":"AIza...."}'
# check status (masked; never returns the full key)
curl http://<robot>:8011/api/p1/api-key
Keys are validated (must start with AIza, plus a length check), masked in every
response, and persisted to data/motions/config.json, which takes highest
precedence, ahead of the SANAD_GEMINI_API_KEY env variable and core_config.json.
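The validation, masking, and precedence rules can be pictured with a short sketch. It is not the engine's implementation: the minimum length and the number of visible characters are assumptions chosen for illustration, since the exact thresholds are not stated here.

```python
MIN_KEY_LENGTH = 30  # assumed threshold; the real length check may differ


def is_plausible_key(key):
    """Check the AIza prefix and a minimum length."""
    return (
        isinstance(key, str)
        and key.startswith("AIza")
        and len(key) >= MIN_KEY_LENGTH
    )


def mask_key(key, visible=4):
    """Hide the middle of a key so it is safe to show in a response."""
    if len(key) <= visible * 2:
        return "*" * len(key)
    return key[:visible] + "*" * (len(key) - 2 * visible) + key[-visible:]


def pick_key(config_json_key, env_key, core_config_key):
    """Return the first non-empty key in precedence order.

    Order: data/motions/config.json, then SANAD_GEMINI_API_KEY, then core_config.json.
    """
    for candidate in (config_json_key, env_key, core_config_key):
        if candidate:
            return candidate
    return None


print(mask_key("AIzaABCDEFGH1234"))
print(pick_key(None, "key-from-env", "key-from-core"))
```

With these assumptions, a key saved through the dashboard wins over one supplied through the environment, which matches the precedence described above.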
Plug-and-play status
- Base: python:3.10-slim (multi-arch) → google-genai installs cleanly, no CUDA needed. Build on the Jetson (or x86) with docker compose up -d --build.
- Works out of the box with a plugged USB speaker/mic. The entrypoint runs a preflight (python / google-genai / pyaudio / Unitree-SDK / audio profile) and prints clear guidance if something's missing.
- Language is set via the Persona card (put the dialect/language directive in the system prompt; saving applies it to the live session immediately).
- Self-contained: the Sanad engine is vendored under vendor/ and the chest-audio unitree_sdk2_python is built into the image (WITH_UNITREE_SDK=1), so a clone of this repo builds and runs with no sibling folders. For pull-and-run at fleet scale, publish the linux/arm64 image to a registry and docker compose pull. In a multi-package deployment, audio output can route through the Sanad_Core hwbroker audio-lock (P1 standalone speaks directly).
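A preflight of the kind described above can be approximated by checking whether each dependency is importable. The sketch below is an assumption about how such a check could look, not the entrypoint's actual logic. The import names google.genai and pyaudio correspond to the google-genai and PyAudio packages, and unitree_sdk2py is assumed to be the module installed by unitree_sdk2_python.

```python
import importlib.util


def module_available(name):
    """Return True if the module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "google") raises instead of returning None.
        return False


def preflight(modules):
    """Map each module name to whether it is importable."""
    return {name: module_available(name) for name in modules}


if __name__ == "__main__":
    report = preflight(["google.genai", "pyaudio", "unitree_sdk2py"])
    for name, ok in report.items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

A real preflight would also need to probe the audio profile (for example, whether a USB speaker is present), which is hardware-specific and left out here.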