# SanadV3 — Feature Catalog
Sanad is a bilingual (Arabic/English) humanoid receptionist/assistant on a
Unitree G1 (Jetson Orin NX, ROS 2 Foxy, Livox MID-360). This document catalogs
**what's built today** (Part A) and **what's on the roadmap** (Part B).

---
# Part A — Current features (built & running)
Verified from the live subsystem registry (19 subsystems), dashboard tabs
(12), and API routers (22).
## 1. Voice & Conversation
- **Gemini live voice** — real-time bilingual AR/EN spoken conversation (native-audio model)
- **Offline brain** — local pipeline via `ollama` (`SANAD_VOICE_BRAIN=local`), no cloud
- **Wake phrases** — configurable wake-word manager
- **Typed replay** — type text, robot speaks it (with speaker-monitor capture)
- **Local TTS** — on-device text-to-speech engine
- **Prompt management** — edit the system prompt from the dashboard
- **Lip-sync** — mask mouth driven by TTS `MOUTH` markers
- **Barge-in** — interrupt speech (volume-scaled threshold)
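
The barge-in bullet mentions a volume-scaled threshold. One plausible reading, sketched below, is that the microphone level needed to interrupt rises with the robot's own playback volume, so the robot does not interrupt itself. The function names, the base threshold, and the linear scaling are illustrative assumptions and are not taken from the SanadV3 code.

```python
import math
from typing import Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (samples in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def barge_in_threshold(volume: float, base: float = 0.05, gain: float = 0.20) -> float:
    """Hypothetical linear scaling: louder playback demands a louder interruption."""
    volume = min(max(volume, 0.0), 1.0)  # clamp playback volume to [0, 1]
    return base + gain * volume


def should_barge_in(frame: Sequence[float], volume: float) -> bool:
    """True when the mic frame is louder than the volume-scaled threshold."""
    return rms(frame) > barge_in_threshold(volume)
```

With these made-up constants, a frame at level 0.1 interrupts at volume 0.0 but not at volume 1.0.
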
## 2. Motion & Arm
- **35 arm actions** — 16 SDK built-ins + 19 custom JSONL motions
- **Macro record / playback** — capture and replay motion sequences
- **Teaching mode** — kinesthetic teach-and-repeat
- **Skills** — composed higher-level behaviors (skills.json)
- **Movement dispatch** — voice → motion (53 fixed + 10 parametric phrases, cooldown-gated)
- **Arm motion-block** — auto-inhibits arm moves while locomotion is active (safety interlock)
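
The cooldown gate on movement dispatch and the arm motion-block can be pictured as two checks in front of every arm action. The following is a minimal sketch under stated assumptions: the class name, the `locomotion_active` flag, the 3-second cooldown, and the `wave_hand` action are all hypothetical, and the real phrase tables (53 fixed + 10 parametric) are not reproduced here.

```python
import time
from typing import Callable, Dict, Optional


class MovementDispatcher:
    """Maps recognized phrases to arm actions, with a cooldown and a locomotion interlock."""

    def __init__(self, phrases: Dict[str, str], cooldown_s: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.phrases = {k.lower(): v for k, v in phrases.items()}
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.locomotion_active = False  # assumed to be set by the locomotion subsystem
        self._last_fired: Optional[float] = None

    def dispatch(self, utterance: str) -> Optional[str]:
        """Return the action to run, or None if there is no match or the move is gated."""
        action = self.phrases.get(utterance.strip().lower())
        if action is None:
            return None
        if self.locomotion_active:
            return None  # arm motion-block: no arm moves while walking
        now = self.clock()
        if self._last_fired is not None and now - self._last_fired < self.cooldown_s:
            return None  # still inside the cooldown window
        self._last_fired = now
        return action
```

Injecting the clock keeps the gate testable without sleeping; `MovementDispatcher({"wave": "wave_hand"})` uses the monotonic clock by default.
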
## 3. Locomotion
- **LocoClient + MotionSwitcher** — walk / pose control via Unitree SDK (eth0)
- **E-STOP** — dashboard kill button
- **Single Ctrl+C teardown** — one signal cleanly stops every subsystem (~2s)
## 4. LED Face Mask
- **Animated expressions** — neutral, smile, blink, look L/R, talk13, surprised, sad
- **Gestural-speaking events** — face reacts while speaking
- **Lip-sync** — mouth animates to speech
## 5. Vision & Recognition
- **Face recognition** — identify people via camera
- **Face gallery** — enroll/manage known faces
- **Zone gallery / zones** — visual zone recognition
- **Camera feed** — attached to the live voice subprocess (vision-in-the-loop)
## 6. Navigation (web_nav3 integration)
- **Live Map tab** — full embedded web_nav3 dashboard (set-pose, goals, bringup)
- **Navigation tab** — native canvas viewer (saved/live map, places, missions)
- **map_relay** — re-publishes the latched `/map` at 1 Hz so the map renders even when stationary
- **Saved maps** — load & view a pre-built `.db` (localize mode)
- **Places** — save named poses, one-click "Go"
- **Missions** — multi-waypoint routes (defined in web_nav3)
- **Cancel goal** — stop an active goal without tearing down bringup
- **SLAM** — RTABMap LiDAR-ICP, drift-corrected mapping/localization
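
The idea behind `map_relay` (cache the latched `/map` and re-emit it once per second) can be sketched with `rclpy` as follows. This is not the project's implementation: the node name, the `LatestRelay` helper, and the `/map_relay` output topic are assumptions made for illustration. The ROS imports sit inside `main()` so the caching logic can be exercised without a ROS installation.

```python
from typing import Any, Callable, Optional


class LatestRelay:
    """Remembers the most recent message and re-emits it on every tick."""

    def __init__(self) -> None:
        self.latest: Optional[Any] = None

    def on_message(self, msg: Any) -> None:
        self.latest = msg

    def tick(self, publish: Callable[[Any], None]) -> bool:
        """Publish the cached message, if any. Returns True when something was sent."""
        if self.latest is None:
            return False
        publish(self.latest)
        return True


def main() -> None:
    # ROS imports live here so the cache above stays importable without ROS.
    import rclpy
    from nav_msgs.msg import OccupancyGrid
    from rclpy.qos import DurabilityPolicy, QoSProfile, ReliabilityPolicy

    rclpy.init()
    node = rclpy.create_node("map_relay_sketch")
    relay = LatestRelay()

    # Latched topics use TRANSIENT_LOCAL durability, so a late subscriber still gets the last map.
    latched = QoSProfile(depth=1, durability=DurabilityPolicy.TRANSIENT_LOCAL,
                         reliability=ReliabilityPolicy.RELIABLE)
    pub = node.create_publisher(OccupancyGrid, "/map_relay", 1)  # hypothetical output topic
    node.create_subscription(OccupancyGrid, "/map", relay.on_message, latched)
    node.create_timer(1.0, lambda: relay.tick(pub.publish))  # 1 Hz

    try:
        rclpy.spin(node)
    finally:
        node.destroy_node()
        rclpy.shutdown()
```

Calling `main()` on a machine with ROS 2 sourced starts the relay; until the first `/map` message arrives, each tick is a no-op.
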
## 7. Audio
- **Device manager** — sink/source selection, live refresh
- **Audio profiles** — builtin / anker / hollyland_builtin (auto-switch on plug/unplug)
## 8. Operations, System & Diagnostics
- **System control** — start/stop subsystems, status
- **Temperature monitor** — motor temps (live websocket stream)
- **Controller** — gamepad/teleop input
- **Web terminal** — shell in the browser (websocket)
- **Logs** — live log stream
- **Recordings & replay** — record/playback sessions
- **Scripts** — run saved scripts
## Dashboard infrastructure
- 12 tabs, **fault-isolated routers** (one broken module never breaks the dashboard)
- WebSocket streams: log_stream, motor_temps, terminal
- No-store HTML (no stale-cache 404s after deploy)
- Lazy subsystem imports (missing dep → that subsystem unavailable, rest runs)
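
Lazy subsystem imports and fault isolation share one pattern: import each module on its own and record failures instead of propagating them. The sketch below uses only the standard library; `load_subsystems` and the module names passed to it are hypothetical, and how SanadV3 actually registers its routers is not shown in this catalog.

```python
import importlib
from types import ModuleType
from typing import Dict, Iterable, Tuple


def load_subsystems(module_names: Iterable[str]) -> Tuple[Dict[str, ModuleType], Dict[str, str]]:
    """Import each subsystem independently; a failure disables only that subsystem."""
    loaded: Dict[str, ModuleType] = {}
    unavailable: Dict[str, str] = {}
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except Exception as exc:  # broad on purpose: any import-time error stays isolated
            unavailable[name] = f"{type(exc).__name__}: {exc}"
    return loaded, unavailable
```
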
---
# Part B — Roadmap (to add)
Tiers are listed in priority order. 🏗️ = load-bearing · ⚠️ = ROS 2 Foxy constraint.
## Tier 1 — Autonomous behaviors (the product)
1. **Voice-driven navigation** — "Sanad, go to the lobby" → nav goal
2. **Greeter mission** — recognized face → navigate → greet → express
3. **Named-person greeting** — identity → personalized line
4. **Patrol / guided tours** — ordered places, speech at each stop
5. **Return-to-base / dock-on-idle** — auto-home on idle/low battery
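
For item 1, the step from an utterance to a nav goal could be as small as matching a "go to" phrase against the saved places. The following is a rough sketch, not a design: the regular expression, the `resolve_goal` name, and the `Pose` layout are assumptions, and it handles English phrasing only.

```python
import re
from typing import Dict, Optional, Tuple

Pose = Tuple[float, float, float]  # x, y, yaw in the map frame (assumed layout)

# Matches "go to <place>" or "go to the <place>", ignoring trailing punctuation.
GO_TO = re.compile(r"\bgo to (?:the )?(?P<place>.+?)[.!?]*$", re.IGNORECASE)


def resolve_goal(utterance: str, places: Dict[str, Pose]) -> Optional[Pose]:
    """Return the saved pose named in the utterance, or None if nothing matches."""
    match = GO_TO.search(utterance.strip())
    if match is None:
        return None
    return places.get(match.group("place").strip().lower())
```

Given `places = {"lobby": (1.0, 2.0, 0.0)}`, the phrase "Sanad, go to the lobby" resolves to that pose, while an unknown place name resolves to `None` and should leave the robot where it is.
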
## Tier 2 — Navigation & map (harden + edit)
6. 🏗️ **Map republish relay** — ✅ DONE (map_relay)
7. **Click-to-goal on Nav tab canvas**
8. **Live nav telemetry** — distance/ETA/waypoint, "arrived" toast
9. **Battery + nav-state status bar**
10. **Geofence zones on the map**
11. **Cancel-goal button** — ✅ DONE
### Map editing & annotation (all build on #6)
12. **Erase tool** — paint cells free; wipe ghost obstacles + the SLAM "spokes"
13. **Obstacle paint ("black points" / virtual walls)** — ⚠️ Foxy-safe KeepoutFilter substitute
14. **Shape tools + brush size** — line/rectangle/polygon
15. **Non-destructive overlay + undo/redo**
16. **Persist & auto-reload edits per map**
17. **Crop / trim map bounds**
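
Items 12, 13, and 15 fit together naturally if edits are stored as a sparse overlay on top of an untouched base grid. A minimal sketch of that idea follows, assuming the usual `nav_msgs/OccupancyGrid` cell values (0 free, 100 occupied); the `MapOverlay` class and its method names are hypothetical.

```python
from typing import Dict, List, Tuple

FREE, OCCUPIED = 0, 100  # nav_msgs/OccupancyGrid cell values
Cell = Tuple[int, int]   # (row, col)


class MapOverlay:
    """Edits live in a sparse overlay; the base grid is never modified."""

    def __init__(self, base: List[List[int]]) -> None:
        self.base = base
        self.edits: Dict[Cell, int] = {}
        self._undo: List[Dict[Cell, int]] = []
        self._redo: List[Dict[Cell, int]] = []

    def _apply(self, cells: List[Cell], value: int) -> None:
        self._undo.append(dict(self.edits))  # snapshot before the stroke
        self._redo.clear()
        for cell in cells:
            self.edits[cell] = value

    def erase(self, cells: List[Cell]) -> None:
        """Mark cells free, e.g. to wipe ghost obstacles."""
        self._apply(cells, FREE)

    def paint_obstacle(self, cells: List[Cell]) -> None:
        """Mark cells occupied, e.g. to draw a virtual wall."""
        self._apply(cells, OCCUPIED)

    def undo(self) -> None:
        if self._undo:
            self._redo.append(self.edits)
            self.edits = self._undo.pop()

    def redo(self) -> None:
        if self._redo:
            self._undo.append(self.edits)
            self.edits = self._redo.pop()

    def composed(self) -> List[List[int]]:
        """Base grid with the overlay applied, returned as a fresh copy."""
        grid = [row[:] for row in self.base]
        for (r, c), value in self.edits.items():
            grid[r][c] = value
        return grid
```

Snapshotting the whole overlay per stroke is simple but memory-hungry on large maps; storing per-stroke diffs would be the obvious refinement.
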
## Tier 3 — Voice & interaction
18. **Barge-in from dashboard**
19. **Quick-phrase soundboard**
20. **Conversation memory / visitor log**
21. **Per-speaker AR/EN auto-detect**
22. **Scheduled announcements**
23. **Bake edited map → PGM/YAML** (static map_server deploy)
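
For item 23, baking comes down to writing a binary PGM plus the YAML that `map_server` reads. The sketch below assumes a grid of `OccupancyGrid` values and uses the conventional pixel values (254 free, 0 occupied, 205 unknown); the `bake_map` name, the cutoff of 50 for "occupied", and the threshold values in the YAML are illustrative choices, not project settings.

```python
from pathlib import Path
from typing import List, Tuple


def bake_map(grid: List[List[int]], resolution: float, origin_xy: Tuple[float, float],
             out_dir: str, name: str = "edited_map") -> Path:
    """Write <name>.pgm and <name>.yaml into out_dir; returns the YAML path."""
    height, width = len(grid), len(grid[0])

    def to_pixel(cell: int) -> int:
        if cell < 0:
            return 205  # unknown (grey)
        return 0 if cell >= 50 else 254  # occupied (black) or free (white); cutoff is an assumption

    # Grid row 0 sits at the map origin (bottom); image row 0 is the top, so flip vertically.
    pixels = bytes(to_pixel(c) for row in reversed(grid) for c in row)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    pgm = out / f"{name}.pgm"
    pgm.write_bytes(f"P5\n{width} {height}\n255\n".encode("ascii") + pixels)

    yaml_path = out / f"{name}.yaml"
    yaml_path.write_text(
        f"image: {pgm.name}\n"
        f"resolution: {resolution}\n"
        f"origin: [{origin_xy[0]}, {origin_xy[1]}, 0.0]\n"
        "negate: 0\n"
        "occupied_thresh: 0.65\n"
        "free_thresh: 0.196\n"
    )
    return yaml_path
```
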
## Tier 4 — Face & presence
24. **Gaze / head-track recognized face**
25. **Emotion-from-context** (sentiment → expression)
26. **Idle breathing / look-around**
27. **Lip-sync to TTS amplitude** (enhance existing markers)
## Tier 5 — Operator, fleet & reliability
28. 🏗️ **Global E-STOP button** — ✅ exists; surface consistently
29. **Health watchdog** — auto-restart dead subsystem + alert
30. **Per-subsystem enable/disable toggles**
31. **Behavior recorder → replay** (nav+voice timelines)
32. **Mission editor UI** (visual sequence builder)
33. **Remote access / tunnel**
34. **Reverse-proxy web_nav3 through :8001** — one origin, no iframe cross-port issues
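
Item 29 could start as a polling loop over liveness probes. The following is a hedged sketch of one pass of such a loop; the `Watchdog` class, the probe and restart callables, and the alert text are all hypothetical, and how real subsystems would report liveness is left open.

```python
from typing import Callable, Dict, List, Tuple


class Watchdog:
    """Polls liveness probes and restarts whatever is down."""

    def __init__(self, alert: Callable[[str], None] = print) -> None:
        self.alert = alert
        self._subsystems: Dict[str, Tuple[Callable[[], bool], Callable[[], None]]] = {}

    def register(self, name: str, is_alive: Callable[[], bool],
                 restart: Callable[[], None]) -> None:
        self._subsystems[name] = (is_alive, restart)

    def check_once(self) -> List[str]:
        """Run every probe once; return the names that were restarted."""
        restarted: List[str] = []
        for name, (is_alive, restart) in self._subsystems.items():
            try:
                alive = is_alive()
            except Exception:
                alive = False  # a crashing probe counts as a dead subsystem
            if not alive:
                self.alert(f"{name} is down, restarting")
                restart()
                restarted.append(name)
        return restarted
```

A scheduler would call `check_once()` on a fixed interval; a production version would also want backoff so a subsystem that keeps dying is not restarted in a tight loop.
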
## Tier 6 — Future / blocked
35. **Speed / caution zones** — needs Galactic SpeedFilter or custom layer
36. **Multi-robot fleet** (SanadV3 ↔ BotBrain) — needs LocoClient arbitration + coordinator
---
### Recommended next build order
**#1 voice→nav** → **#2 greeter mission** (the product), then **#12/#13 map editing**
(clean the spokes + virtual walls). #6 republish relay and #11 cancel are already done.