Update 2026-07-04 23:59:48

This commit is contained in:
kassam 2026-07-04 23:59:49 +04:00
parent 39be38811c
commit e03c6f68b3

README.md

@@ -1,22 +1,249 @@
# Sanad Package 3 — Facial Recognition + Places + Memories (SCAFFOLD)
# Sanad Package 3 — Facial Recognition + Places + Memories
Perception + memory only; **no motion**. Dashboard on **:8013**. License
features: `face_rec`, `places`, `memory`, `mask`.
Identify **faces** (VIP database), recognize **places** (visual zones), and
**remember visitors** across visits — then greet them personally and drive the
LED mask on recognition. Perception + memory only; **no robot motion**. Dashboard
on **:8013**.
Planned (`sanad-recognition` container):
- **Faces / VIP DB**: `vision/face_gallery.py` (enroll/upload/capture/describe,
VIP flag). Recognition is Gemini-side in-context (primer images) — no local ML.
- **Places**: `vision/zone_gallery.py` visual place recognition (the `/go` nav
action stays disabled here; it belongs to P4).
- **Memories** — NEW persistent visitor-profile store (attributes, notes,
last-seen, linked `face_id`); feeds personalized-greeting primers to comms.
- **Face/Mask** — drives the shared `sanad-mask` for expressions on recognition.
- Consumes the `frames` topic from `Sanad_Core/camera`; emits `recognition.event`
/ `place.event`.
This package is **self-contained** — it vendors the Sanad engine under `vendor/`
and builds `FROM python:3.10-slim-bookworm`, so a clone of this folder builds and
runs with **no sibling folders and no shared base image**. Recognition is 100%
**Gemini-side / in-context** (primer images) — there is **no local ML model** and
no CUDA needed. The image ships **keyless**: the customer pastes their own Gemini
API key in the dashboard on first open.
Depends on `Sanad_Core` (camera, shared `sanad-mask`). See the plan for details.
---
**Build model (when implemented):** self-contained like P1 — a `vendor/` copy of
the Sanad engine + a standalone `Dockerfile` (`FROM python:3.10-slim`) + its own
`docker-compose.yml`, refreshed by a `sync_vendor.sh`. No `sanad-base`. (Camera
frames + shared mask are the cross-container dependencies.)
## Install (per-robot, first run)
The workstation is canonical. You **rsync the folder to the robot, then build and
run Docker on the robot**.
### 1. Copy the package onto the robot
From the workstation (dev robot = `10.255.254.86`):
```bash
rsync -az --exclude __pycache__ --exclude data/faces --exclude data/zones \
Project/G1/Packages/Sanad_Package_3 unitree@<robot>:~/sanad_deploy/
```
Everything the container needs is inside the folder (`vendor/Sanad`,
`vendor/sanad_pkg`, `vendor/mask`) — no monorepo checkout on the robot.
### 2. License
Verification is a signed **Ed25519** `sanad.lic` (fully **offline** — the public
key is baked into the image). The folder ships a **demo** license
(`license/sanad.lic.example`, robot_id `G1-SN-DEMO-0001`, unbound, expires
`2030-01-01`) that entitles **P1 + P2 + P3**, so the package **builds and runs
as-is** for evaluation. A **production** robot needs a **signed per-robot
license** — see [Licensing](#licensing) below (P3 has no `NEW_ROBOT_SETUP.md`;
the fleet-level `../README.md` has the vendor signing flow too).
### 3. Keyless — add your own Gemini key
No Gemini key is baked into the image. On first open of the dashboard, paste your
key in the **Gemini API key** card (or `POST /api/p3/api-key`). It is persisted to
`data/motions/config.json` and hot-swapped into the live session.
### 4. Camera — the #1 gotcha (read this before `up`)
Recognition needs the camera, but the `/dev/video*` device mounts in
`docker-compose.yml` are **commented out by default** — because Docker
**hard-fails `up`** if you bind a device node that is absent. Before enabling
recognition, edit `docker-compose.yml` and **uncomment the V4L node your camera
actually exposes**:
- A plain USB webcam is usually `/dev/video0`.
- A **RealSense colour node is NOT `video0`** — a RealSense exposes ~6 nodes; pick
the colour one (or add several) **or** set `SANAD_CAMERA_USB_INDEX` to the right
OpenCV index.
Capture is plain **OpenCV** (`opencv-python-headless`, `cv2.VideoCapture`) — no
`pyrealsense2`. The container is granted `group_add: video` so it can open the
node. Faces/zones/memories persist under the bind-mounted `./data`.
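
To find a value for `SANAD_CAMERA_USB_INDEX`, one option is to probe the OpenCV indices and see which ones deliver frames. The sketch below is illustrative and not part of the package: `open_with_opencv` and `find_working_indices` are hypothetical helper names, and the upper bound of 8 indices is an assumption. It uses only `cv2.VideoCapture`, the same capture path described above.

```python
from typing import Callable, List


def open_with_opencv(index: int) -> bool:
    """Return True if OpenCV can open this index and read one frame."""
    # Imported lazily so find_working_indices() can be exercised without OpenCV.
    import cv2

    cap = cv2.VideoCapture(index)
    try:
        if not cap.isOpened():
            return False
        ok, _frame = cap.read()
        return bool(ok)
    finally:
        cap.release()


def find_working_indices(try_open: Callable[[int], bool],
                         max_index: int = 8) -> List[int]:
    """Probe indices 0..max_index-1 and return those where try_open succeeds."""
    return [i for i in range(max_index) if try_open(i)]


# On the robot (needs opencv-python-headless and access to /dev/video*):
#     print(find_working_indices(open_with_opencv))
```

An index that returns frames is not necessarily the colour stream: on a RealSense, depth and infrared nodes may also open, so it is worth checking the frame shape or saving a test image before pinning the index.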
---
## Run with Docker (quick start)
On the robot, **from this folder** (`Sanad_Package_3/`) — nothing else needed:
```bash
cp .env.example .env # optional: license / audio / camera / language
docker compose up -d --build # build (vendored engine) + run
# Jetson Docker without buildx: DOCKER_BUILDKIT=0 docker compose up -d --build
```
`network_mode: host` is **mandatory** (G1 DDS discovery on `eth0`) and is already
set in the compose file. Then open **http://&lt;robot-ip&gt;:8013** and:
1. **Gemini API key** card → paste your key (ships keyless).
2. **Recognition** tab → enroll faces (camera capture or upload), name them, flag
**VIP**; capture **places** (zones). This builds the primer DB.
3. **Mask** tab (optional) → confirm the LED mask connects for expressions on
recognition.
4. **Settings** tab → speaker profile (chest vs USB/BT), volume, persona/language.
If the camera devices are still commented out, the container boots fine but
recognition has no video source — uncomment the right V4L node (step 4 of Install)
and `docker compose up -d` again.
---
## Manage it
```bash
docker compose logs -f # live logs
docker compose down # stop
./test_p3.sh <robot-ip>:8013 # smoke test (expect 13/13 PASS + memory roundtrip)
```
- **Auto-start on boot:** `sudo systemctl enable docker`. Compose already runs P3 with
`restart: unless-stopped`, so it returns after a reboot. (For plugged/BT audio to
survive a reboot, also run the one-time `loginctl enable-linger unitree` — see
[Configuration](#configuration-env-vars).)
- **No Docker? (dev mode)** run P3 in the robot's `gemini_sdk` conda env against the
vendored engine in `./vendor`:
```bash
cd ~/sanad_deploy/Sanad_Package_3
./p3ctl.sh start # launch on :8013 (runs against ./vendor)
./p3ctl.sh status # process + /api/health
./p3ctl.sh logs 80 # tail the P3 log
./p3ctl.sh restart
./p3ctl.sh stop
```
The conda env needs `google-genai`, `opencv-python-headless`, and (for the LED
mask) `bleak==0.22.3` + `Pillow`.
**What happens on `up`** (from `entrypoint.sh`): (1) **license gate**
`python3 -m sanad_pkg.license_check P3`; if the robot is **not** entitled the
container **exits cleanly (code 0)** so the restart policy will not crash-loop;
(2) **resolve config** with precedence **env var > license feature >
`config/p3_config.json`**; (3) a **preflight** that prints clear diagnostics
(python / google-genai / pyaudio / bleak / Pillow / mask lib / Unitree SDK /
audio profile); (4) `exec app_p3.py`.
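
The precedence in step (2) can be pictured as a three-level lookup. This is a minimal sketch, not the code in `entrypoint.sh`: `resolve_setting` is a hypothetical name, it assumes all three sources use the same key for a setting, and it assumes an empty environment variable counts as unset.

```python
from typing import Any, Mapping


def resolve_setting(name: str,
                    env: Mapping[str, str],
                    license_features: Mapping[str, Any],
                    file_config: Mapping[str, Any],
                    default: Any = None) -> Any:
    """Resolve one setting: env var > license feature > config file > default."""
    # Assumption: an empty env var (the *(empty)* defaults) means "not set".
    if env.get(name, "") != "":
        return env[name]
    if name in license_features:
        return license_features[name]
    if name in file_config:
        return file_config[name]
    return default
```

In real use `env` would be `os.environ` and `file_config` the parsed `config/p3_config.json`; passing plain mappings keeps the lookup easy to test in isolation.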
---
## What it ships
- `app_p3.py` — the launcher: bootstraps the `Project.Sanad` namespace, builds the
perception subsystems (camera, face gallery, zone gallery, recognition state) +
comms core (brain/audio/voice/live_sub) + mask, constructs the **NEW**
package-local `VisitorMemory` store, injects a P3-scoped `Project.Sanad.main`
shim, and mounts only the P3 routers + the filtered SPA (non-P3 tabs hidden).
- `routes_p3.py` — P3 convenience routes (`/api/p3/*`: api-key, persona, say,
settings, logs) that also restart the live Gemini session so a new key/persona
applies immediately.
- `routes_memory.py` — the **NEW** visitor-memory CRUD router (`/api/memory/*`).
- `visitor_memory.py` — the persistent visitor-profile store (attributes, notes,
tags, last-seen, linked `face_id`; feeds personalized-greeting primers).
- `vendor/Sanad` + `vendor/sanad_pkg` + `vendor/mask` — the vendored Sanad engine,
license/bus lib, and Shining-Mask BLE lib. Refresh with `./sync_vendor.sh`.
- `Dockerfile` / `requirements.txt`: **SELF-CONTAINED** build (`FROM
python:3.10-slim-bookworm`), bakes the vendored engine + pubkey; no `sanad-base`.
- `docker-compose.yml` — standalone build + run (`context: .`, image
`sanad-p3:latest`, container `sanad-p3`, service `p3`).
- `entrypoint.sh` — license gate → config resolve → preflight → launch.
- `config/`: `p3_config.json` (defaults: port, audio, tab set) +
`mask_config.json`.
- `license/`: `pubkey.ed25519` (public verify key) + `sanad.lic.example` (demo).
- `sync_vendor.sh` — refresh `vendor/` from a full monorepo checkout.
- `test_p3.sh` — 13-check smoke test + a memory create→list→delete roundtrip.
- `p3ctl.sh` — no-Docker dev-mode control (`start|stop|restart|status|logs`).
- `strip_key.py` — build step that guarantees the image ships keyless.
- `data/` — bind-mounted persistence (`faces` / `zones` / `memories` / recordings).
---
## Configuration (env vars)
Copy `.env.example` to `.env` (compose reads it automatically). The most-used knobs:
| Var | Default | What it does |
|---|---|---|
| `SANAD_LICENSE_FILE` | `./license/sanad.lic.example` | Signed license mounted read-only at `/etc/sanad/sanad.lic`. Point at your real per-robot `sanad.lic`. |
| `SANAD_LICENSE_BIND` | `0` | `1` enforces machine-fingerprint binding (also uncomment the `/etc/machine-id` mount in `docker-compose.yml`). |
| `SANAD_CAMERA_USB_INDEX` | *(empty)* | Pin the OpenCV colour-camera index (a RealSense colour node is **not** `video0`). Alternative to uncommenting a specific `/dev/video*` bind. |
| `SANAD_AUDIO_PROFILE` | `builtin` | `builtin` = G1 chest over DDS; `plugged` = USB (Anker) / Bluetooth (JBL) via host PulseAudio. |
| `SANAD_DDS_INTERFACE` | `eth0` | NIC used for G1 DDS discovery. |
| `SANAD_LANGUAGE` | *(empty)* | Empty = **multilingual auto-detect**; set `ar`/`en` to pin one language. |
| `SANAD_MASK_ADDRESS` | *(empty)* | Pin the LED mask BLE MAC; empty = auto-discover by name prefix. |
| `SANAD_MEMORIES_DIR` | `/app/Sanad/data/memories` | Where visitor profiles are stored (persisted via the `./data` bind). |
| `SANAD_PULSE_DIR` / `PULSE_SERVER` / `PULSE_COOKIE` | `/run/user/1000/pulse…` | Host PulseAudio socket + auth cookie the root container needs to set **plugged/BT** volume. Chest audio needs none of this. |
| `WITH_UNITREE_SDK` | `1` | Build arg: `1` bundles CycloneDDS (pinned `0.10.2`, built full) + `unitree_sdk2_python` so chest audio works out of the box; `0` = leaner USB-only image. |
| `BASE_OS_IMAGE` | `python:3.10-slim-bookworm` | Build arg: override only for a GPU build. |
| `SANAD_IMAGE` | `sanad-p3:latest` | Image name/tag (e.g. a registry path for pull-and-run). |
**Audio note (like SanadV3):** the volume slider drives **all** speaker types. The
unified control `POST /api/audio/g1-speaker/volume` sets the G1 chest (DDS
`SetVolume`) **and** the active PulseAudio sink (plugged/BT). For the plugged/BT
half, the root container needs the host PulseAudio socket + cookie — already mounted
by compose. One-time host setup for a stable boot-time socket:
`loginctl enable-linger unitree`.
---
## Dashboard & features (http://&lt;robot&gt;:8013)
The dashboard is the SanadV3 SPA with only the **P3 tabs** shown:
**operations · voice · recognition · mask · recordings · settings**
(motion / controller / navigation / livemap / mapeditor / temp / terminal are
hidden — they belong to other packages).
What you can do:
- **Faces / VIP DB** (`vision/face_gallery.py`) — enroll via camera capture or
upload, rename, describe, flag **VIP**.
- **Places** (`vision/zone_gallery.py`) — capture and recognize visual zones. (The
nav `/go` action stays disabled here — it belongs to P4.)
- **Visitor memory** (`visitor_memory.py`, **NEW**) — persistent profiles with
attributes / notes / tags / last-seen / linked `face_id`, feeding
personalized-greeting primers.
- **Mask** — drives the LED "Shining Mask" for expressions on recognition (same
in-container BLE stack as P2: `bleak==0.22.3`, host `/var/run/dbus`, `NET_ADMIN`,
`/dev/bus/usb`).
### API endpoint groups
| Group | Prefix | Purpose |
|---|---|---|
| **Recognition** | `/api/recognition/*` | Face identification + VIP gallery state. |
| **Places** | `/api/zones/*` | Visual zone / place recognition. |
| **Visitor memory** (NEW) | `/api/memory/*` | CRUD: `GET /` list · `GET /status` · `GET /primer` · `GET /by-face/{id}` · `GET/PUT/DELETE /{id}` · `POST /` create · `POST /{id}/touch`. |
| **Mask** | `/api/mask/*` | LED mask status + expressions. |
| **P3 settings** | `/api/p3/*` | `api-key`, `persona`, `say`, `settings`, `logs/delete`. |
| **Comms** | `/api/voice/*` `/api/audio/*` `/api/prompt/*` `/api/typed-replay/*` `/api/records/*` `/api/live-subprocess/*` | Reused Sanad conversation + audio subset. |
| **Ops** | `/api/package` `/api/health` `/api/system/info` `/api/logs/*` `/ws/logs` | Manifest (+ license + features), health, system info, logs. |
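
As an illustration of the visitor-memory routes in the table, the sketch below builds requests against `/api/memory/*` with the standard library. The routes and HTTP methods come from the table; everything else is an assumption: `memory_request` and `send` are hypothetical helpers, and the JSON field names in the usage comment (`notes`, `tags`, `face_id`) are guesses based on the profile description, not a confirmed schema.

```python
import json
import urllib.request
from typing import Any, Optional


def memory_request(base_url: str, method: str, path: str = "/",
                   body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request to the /api/memory router."""
    url = base_url.rstrip("/") + "/api/memory" + path
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(req: urllib.request.Request, timeout: float = 5.0) -> Any:
    """Send the request and decode a JSON response body, if there is one."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


# Hypothetical roundtrip (field names are assumptions, not the real schema):
#     base = "http://<robot-ip>:8013"
#     created = send(memory_request(base, "POST", "/", {"notes": "likes tea"}))
#     profiles = send(memory_request(base, "GET", "/"))
#     send(memory_request(base, "DELETE", "/<id returned by create>"))
```

Separating request construction from sending makes it possible to inspect the URL and payload without a running robot; `test_p3.sh` remains the reference for the actual create, list, delete roundtrip.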
---
## Licensing
A signed **Ed25519** `sanad.lic` decides which packages a robot may run.
`pubkey.ed25519` is baked into every image at `/etc/sanad/pubkey.ed25519`;
verification is fully **offline**. Compose mounts the license read-only:
`${SANAD_LICENSE_FILE:-./license/sanad.lic.example}:/etc/sanad/sanad.lic:ro`.
**The bundled demo** (`license/sanad.lic.example`) entitles **P1 + P2 + P3** with
features `face_rec`, `places`, `memory`, `mask` (among others), robot_id
`G1-SN-DEMO-0001`, unbound, expiring `2030-01-01` — enough to build and evaluate P3.
**Features that gate P3:** `face_rec`, `places`, `memory`, `mask`.
**Sign a real per-robot license** (vendor side — keep the private key **OFF** the
robot):
```bash
python licensing/sign_license.py gen-keys --out-dir . # once
# edit claims.json: set packages{P1..P4}, features{face_rec,places,memory,mask,…},
# robot_id, and (optionally) machine_fingerprint
python licensing/sign_license.py sign --key privkey.ed25519 \
--in claims.json --out sanad.lic
```
Put `sanad.lic` on the robot and point `SANAD_LICENSE_FILE` at it (or replace
`license/sanad.lic.example`). Optional **hardware binding**: set
`SANAD_LICENSE_BIND=1` (also uncomment the `/etc/machine-id` mount). Verify
entitlement exactly as the entrypoint does:
```bash
SANAD_LICENSE=license/sanad.lic python -m sanad_pkg.license_check P3
```
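
Once the signature has been verified, the entitlement decision reduces to a few checks on the claims. The sketch below covers only that second step and does not verify the Ed25519 signature. It is not `sanad_pkg.license_check`: `is_entitled` is a hypothetical function, and the claim layout (`packages` and `features` as name-to-boolean maps, `expires` as an ISO date, the license valid up to and including that date) is an assumption loosely based on the `claims.json` comment above.

```python
from datetime import date
from typing import Any, Iterable, Mapping, Optional

# The features this README lists as gating P3.
P3_FEATURES = ("face_rec", "places", "memory", "mask")


def is_entitled(claims: Mapping[str, Any], package: str,
                required_features: Iterable[str] = (),
                today: Optional[date] = None) -> bool:
    """Check already-verified claims for expiry, package, and features."""
    today = today or date.today()
    # Assumption: "expires" is an ISO date and the license is valid through it.
    if today > date.fromisoformat(claims["expires"]):
        return False
    if not claims.get("packages", {}).get(package, False):
        return False
    features = claims.get("features", {})
    return all(features.get(name, False) for name in required_features)
```

With claims shaped like the bundled demo (P1 to P3 enabled, expiring `2030-01-01`), `is_entitled(claims, "P3", P3_FEATURES)` would return `True` until the expiry date and `False` for `"P4"`.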
---
## Status
**Built + validated locally** — the package compiles, `license_check P3` reports
entitled against the demo license, and the compose YAML is valid. P3 is a real,
shipped package (not a scaffold).
**Remaining gate:** the on-robot Docker build + `./test_p3.sh` smoke test on the G1
(P1 is already robot-tested; P2/P3/P4 have been validated locally but not yet
built + smoke-tested on hardware). The camera V4L node must be uncommented on the
target robot before recognition will have video.