# SanadV3 — Feature Catalog

Sanad is a bilingual (Arabic/English) humanoid receptionist/assistant on a
Unitree G1 (Jetson Orin NX, ROS 2 Foxy, Livox MID-360). This document catalogs
**what's built today** (Part A) and **what's on the roadmap** (Part B).

---

# Part A — Current features (built & running)

Verified from the live subsystem registry (19 subsystems), dashboard tabs
(12), and API routers (22).

## 1. Voice & Conversation

- **Gemini live voice** — real-time bilingual AR/EN spoken conversation (native-audio model)
- **Offline brain** — local pipeline via `ollama` (`SANAD_VOICE_BRAIN=local`), no cloud
- **Wake phrases** — configurable wake-word manager
- **Typed replay** — type text, robot speaks it (with speaker-monitor capture)
- **Local TTS** — on-device text-to-speech engine
- **Prompt management** — edit the system prompt from the dashboard
- **Lip-sync** — mask mouth driven by TTS `MOUTH` markers
- **Barge-in** — interrupt speech (volume-scaled threshold)

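
The offline-brain switch listed above can be pictured as a small selector that reads `SANAD_VOICE_BRAIN`. This is a minimal sketch, not the project's code: only the variable name and the `local` value come from this catalog, while the `gemini` default name and the fall-back-on-unknown behavior are assumptions.

```python
import os

# Hypothetical backend names; only "local" is documented in this catalog.
KNOWN_BRAINS = {"gemini", "local"}
DEFAULT_BRAIN = "gemini"  # assumption: the cloud voice path is the default


def select_voice_brain(env=None):
    """Return the voice backend name chosen by SANAD_VOICE_BRAIN."""
    env = os.environ if env is None else env
    value = env.get("SANAD_VOICE_BRAIN", DEFAULT_BRAIN).strip().lower()
    # Fall back to the default rather than failing on an unknown value.
    return value if value in KNOWN_BRAINS else DEFAULT_BRAIN
```

Passing the environment as a parameter keeps the selector easy to exercise without touching the real process environment.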
## 2. Motion & Arm

- **35 arm actions** — 16 SDK built-ins + 19 custom JSONL motions
- **Macro record / playback** — capture and replay motion sequences
- **Teaching mode** — kinesthetic teach-and-repeat
- **Skills** — composed higher-level behaviors (`skills.json`)
- **Movement dispatch** — voice → motion (53 fixed + 10 parametric phrases, cooldown-gated)
- **Arm motion-block** — auto-inhibits arm moves while locomotion is active (safety interlock)

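
The cooldown gate and the arm motion-block described above combine naturally into one check that runs before any voice-triggered arm action. The sketch below is illustrative only: `MotionGate`, `try_dispatch`, the `locomotion_active` flag, and the 3-second cooldown are hypothetical and not taken from the SanadV3 code.

```python
import time


class MotionGate:
    """Decides whether a voice-triggered arm action may run right now."""

    def __init__(self, cooldown_s=3.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s     # hypothetical cooldown length
        self._clock = clock              # injectable so the gate can be tested
        self._last_fired = None
        self.last_action = None
        self.locomotion_active = False   # assumed to be set by the locomotion layer

    def try_dispatch(self, action):
        """Return True if `action` is allowed, False if it is suppressed."""
        if self.locomotion_active:
            return False  # interlock: no arm motion while the robot is walking
        now = self._clock()
        if self._last_fired is not None and now - self._last_fired < self.cooldown_s:
            return False  # still inside the cooldown window
        self._last_fired = now
        self.last_action = action
        return True
```

Checking the interlock before the cooldown means a suppressed request does not consume the cooldown window, so the first request after locomotion stops can run immediately.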
## 3. Locomotion

- **LocoClient + MotionSwitcher** — walk / pose control via Unitree SDK (eth0)
- **E-STOP** — dashboard kill button
- **Single Ctrl+C teardown** — one signal cleanly stops every subsystem (~2s)

## 4. LED Face Mask

- **Animated expressions** — neutral, smile, blink, look L/R, talk1–3, surprised, sad
- **Gestural-speaking events** — face reacts while speaking
- **Lip-sync** — mouth animates to speech

## 5. Vision & Recognition

- **Face recognition** — identify people via camera
- **Face gallery** — enroll/manage known faces
- **Zone gallery / zones** — visual zone recognition
- **Camera feed** — attached to the live voice subprocess (vision-in-the-loop)

## 6. Navigation (web_nav3 integration)

- **Live Map tab** — full embedded web_nav3 dashboard (set-pose, goals, bringup)
- **Navigation tab** — native canvas viewer (saved/live map, places, missions)
- **map_relay** — re-publishes the latched `/map` at 1 Hz so the map renders even when the robot is stationary
- **Saved maps** — load & view a pre-built `.db` (localize mode)
- **Places** — save named poses, one-click "Go"
- **Missions** — multi-waypoint routes (defined in web_nav3)
- **Cancel goal** — stop an active goal without tearing down bringup
- **SLAM** — RTAB-Map LiDAR-ICP, drift-corrected mapping/localization

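
The `map_relay` pattern above (cache the latest map, re-emit it on a fixed period) can be sketched without any ROS dependency. This only illustrates the idea: `MapRelay`, `on_map`, and `tick` are hypothetical names, and in the real system the input would presumably be a `/map` subscription and the tick a 1 Hz timer.

```python
class MapRelay:
    """Caches the most recent map and re-emits it on every tick."""

    def __init__(self, publish):
        self._publish = publish  # callable that sends a map downstream
        self._latest = None

    def on_map(self, grid):
        """Called when the upstream (latched) map arrives or changes."""
        self._latest = grid

    def tick(self):
        """Called periodically; returns True if a map was re-sent."""
        if self._latest is None:
            return False  # nothing received yet, so nothing to relay
        self._publish(self._latest)
        return True
```

With this shape, a viewer that connects late receives a map within one period even if the robot has not moved and no new map has been produced.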
## 7. Audio

- **Device manager** — sink/source selection, live refresh
- **Audio profiles** — builtin / anker / hollyland_builtin (auto-switch on plug/unplug)

## 8. Operations, System & Diagnostics

- **System control** — start/stop subsystems, status
- **Temperature monitor** — motor temps (live websocket stream)
- **Controller** — gamepad/teleop input
- **Web terminal** — shell in the browser (websocket)
- **Logs** — live log stream
- **Recordings & replay** — record/playback sessions
- **Scripts** — run saved scripts

## Dashboard infrastructure

- 12 tabs, **fault-isolated routers** (one broken module never breaks the dashboard)
- WebSocket streams: `log_stream`, `motor_temps`, `terminal`
- No-store HTML (no stale-cache 404s after deploy)
- Lazy subsystem imports (a missing dependency makes only that subsystem unavailable; the rest keeps running)

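
The lazy-import behavior above can be approximated with `importlib`: each subsystem module is imported on its own, and a failure is recorded instead of propagated. This is a sketch under assumptions, not the project's loader; `load_subsystems` and its return shape are hypothetical. Catching broad `Exception` is deliberate here, since a subsystem can fail at import time for reasons other than a missing package.

```python
import importlib


def load_subsystems(module_names):
    """Import each subsystem module independently; collect failures instead of raising."""
    available, unavailable = {}, {}
    for name in module_names:
        try:
            available[name] = importlib.import_module(name)
        except Exception as exc:  # missing dependency or any other import-time error
            unavailable[name] = repr(exc)
    return available, unavailable
```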
---

# Part B — Roadmap (to add)

Tiers are ordered by priority. 🏗️ = load-bearing · ⚠️ = ROS 2 Foxy constraint.

## Tier 1 — Autonomous behaviors (the product)

1. **Voice-driven navigation** — "Sanad, go to the lobby" → nav goal
2. **Greeter mission** — recognized face → navigate → greet → express
3. **Named-person greeting** — identity → personalized line
4. **Patrol / guided tours** — ordered places, speech at each stop
5. **Return-to-base / dock-on-idle** — auto-home on idle/low battery

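
As a rough picture of roadmap item 1, a spoken phrase can be reduced to a saved place and its pose before a navigation goal is sent. Everything in this sketch is hypothetical: the place table, the pose values, and the regular expression are invented for illustration, and it handles only a simple English phrasing (Arabic phrases would need their own patterns).

```python
import re

# Hypothetical saved places: name -> (x, y, yaw) in the map frame.
PLACES = {"lobby": (4.2, -1.0, 0.0), "meeting room": (9.5, 3.1, 1.57)}

# Matches e.g. "Sanad, go to the lobby" and captures the place name.
GO_TO = re.compile(r"\bgo to (?:the )?(?P<place>[a-z ]+?)\s*[.!?]?$", re.IGNORECASE)


def phrase_to_goal(utterance, places=PLACES):
    """Return (place_name, pose) for a recognized go-to phrase, else None."""
    match = GO_TO.search(utterance.strip())
    if not match:
        return None
    name = match.group("place").strip().lower()
    pose = places.get(name)
    # Unknown places are rejected rather than guessed.
    return (name, pose) if pose is not None else None
```

Returning `None` for an unknown place leaves the caller free to ask the visitor to repeat or clarify instead of driving somewhere unintended.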
## Tier 2 — Navigation & map (harden + edit)

6. 🏗️ **Map republish relay** — ✅ DONE (map_relay)
7. **Click-to-goal on Nav tab canvas**
8. **Live nav telemetry** — distance/ETA/waypoint, "arrived" toast
9. **Battery + nav-state status bar**
10. **Geofence zones on the map**
11. **Cancel-goal button** — ✅ DONE

### Map editing & annotation (all build on #6)

12. **Erase tool** — paint cells free; wipe ghost obstacles + the SLAM "spokes"
13. **Obstacle paint ("black points" / virtual walls)** — ⚠️ Foxy-safe KeepoutFilter substitute
14. **Shape tools + brush size** — line/rectangle/polygon
15. **Non-destructive overlay + undo/redo**
16. **Persist & auto-reload edits per map**
17. **Crop / trim map bounds**

## Tier 3 — Voice & interaction

18. **Barge-in from dashboard**
19. **Quick-phrase soundboard**
20. **Conversation memory / visitor log**
21. **Per-speaker AR/EN auto-detect**
22. **Scheduled announcements**
23. **Bake edited map → PGM/YAML** (static map_server deploy)

## Tier 4 — Face & presence

24. **Gaze / head-track recognized face**
25. **Emotion-from-context** (sentiment → expression)
26. **Idle breathing / look-around**
27. **Lip-sync to TTS amplitude** (enhance existing markers)

## Tier 5 — Operator, fleet & reliability

28. 🏗️ **Global E-STOP button** — ✅ exists; surface consistently
29. **Health watchdog** — auto-restart dead subsystem + alert
30. **Per-subsystem enable/disable toggles**
31. **Behavior recorder → replay** (nav+voice timelines)
32. **Mission editor UI** (visual sequence builder)
33. **Remote access / tunnel**
34. **Reverse-proxy web_nav3 through :8001** — one origin, no iframe cross-port issues

## Tier 6 — Future / blocked

35. **Speed / caution zones** — needs Galactic SpeedFilter or custom layer
36. **Multi-robot fleet** (SanadV3 ↔ BotBrain) — needs LocoClient arbitration + coordinator

---

### Recommended next build order

**#1 voice→nav** → **#2 greeter mission** (the product), then **#12/#13 map editing**
(erase the SLAM spokes, add virtual walls). The #6 republish relay and the #11 cancel-goal button are already done.