kassam c1f74a167d Update 2026-07-04 23:28:11

2026-07-04 23:28:12 +04:00

SanadV3 — Feature Catalog

Sanad is a bilingual (Arabic/English) humanoid receptionist and assistant that runs on a Unitree G1 (Jetson Orin NX, ROS 2 Foxy, Livox MID-360). This catalog lists what is built today (Part A) and what is on the roadmap (Part B).

Part A — Current features (built & running)

Verified from the live subsystem registry (19 subsystems), dashboard tabs (12), and API routers (22).

1. Voice & Conversation

Gemini live voice — real-time bilingual AR/EN spoken conversation (native-audio model)
Offline brain — local pipeline via ollama (SANAD_VOICE_BRAIN=local), no cloud
Wake phrases — configurable wake-word manager
Typed replay — type text, robot speaks it (with speaker-monitor capture)
Local TTS — on-device text-to-speech engine
Prompt management — edit the system prompt from the dashboard
Lip-sync — mask mouth driven by TTS MOUTH markers
Barge-in — interrupt speech (volume-scaled threshold)
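
The barge-in entry mentions a volume-scaled threshold. One plausible reading, sketched here, is that the microphone level needed to interrupt rises with the playback volume, so the robot's own speech is less likely to trigger an interruption. The function names, `base_threshold`, and `gain` are illustrative assumptions, not values from the Sanad codebase.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (samples in -1.0..1.0)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_barge_in(
    mic_frame: Sequence[float],
    playback_volume: float,
    base_threshold: float = 0.05,  # assumed: threshold at zero playback volume
    gain: float = 0.20,            # assumed: extra threshold at full volume
) -> bool:
    """Return True when the mic level exceeds a threshold scaled by volume."""
    volume = min(max(playback_volume, 0.0), 1.0)  # clamp to 0..1
    threshold = base_threshold + gain * volume
    return rms(mic_frame) > threshold
```

With these assumed numbers, a frame at RMS 0.1 interrupts when playback is muted but not when playback is at full volume.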

2. Motion & Arm

35 arm actions — 16 SDK built-ins + 19 custom JSONL motions
Macro record / playback — capture and replay motion sequences
Teaching mode — kinesthetic teach-and-repeat
Skills — composed higher-level behaviors (skills.json)
Movement dispatch — voice → motion (53 fixed + 10 parametric phrases, cooldown-gated)
Arm motion-block — auto-inhibits arm moves while locomotion is active (safety interlock)
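
The cooldown gate and the arm motion-block can be pictured as two checks in front of every voice-triggered arm action. The sketch below is a minimal illustration under assumptions: the class name `MotionGate`, the per-action cooldown, and the 3-second default are hypothetical, and the real interlock may be wired differently.

```python
import time
from typing import Callable, Dict, Optional


class MotionGate:
    """Decides whether a voice-triggered arm action may run right now."""

    def __init__(self, cooldown_s: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_s = cooldown_s      # assumed value, not from the source
        self._clock = clock
        self._last_run: Dict[str, float] = {}
        self.locomotion_active = False    # would be set by the locomotion side

    def try_dispatch(self, action: str) -> Optional[str]:
        """Return None if the action may run, else the reason it was blocked."""
        if self.locomotion_active:
            return "blocked: locomotion active"
        now = self._clock()
        last = self._last_run.get(action)
        if last is not None and now - last < self.cooldown_s:
            return "blocked: cooldown"
        self._last_run[action] = now
        return None
```

The interlock is checked first, so a blocked action does not consume its cooldown.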

3. Locomotion

LocoClient + MotionSwitcher — walk / pose control via Unitree SDK (eth0)
E-STOP — dashboard kill button
Single Ctrl+C teardown — one signal cleanly stops every subsystem (~2s)
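
A single-signal teardown is commonly built from a registry of stop callbacks plus a SIGINT handler that only sets a flag. The following is a sketch of that general pattern, not Sanad's actual shutdown code; `register`, `stop_all`, and `run_until_interrupted` are hypothetical names.

```python
import signal
import threading
from typing import Callable, List, Tuple

_stop_event = threading.Event()
_stoppers: List[Tuple[str, Callable[[], None]]] = []


def register(name: str, stop: Callable[[], None]) -> None:
    """Record a subsystem's stop callback in start order."""
    _stoppers.append((name, stop))


def stop_all() -> List[str]:
    """Stop subsystems in reverse start order; keep going if one fails."""
    stopped = []
    while _stoppers:
        name, stop = _stoppers.pop()
        try:
            stop()
            stopped.append(name)
        except Exception as exc:  # one bad subsystem must not block the rest
            print(f"stop failed for {name}: {exc}")
    return stopped


def _on_sigint(signum, frame) -> None:
    # Only set a flag here; the main thread does the actual teardown.
    _stop_event.set()


def run_until_interrupted() -> List[str]:
    """Block until Ctrl+C, then tear everything down once."""
    signal.signal(signal.SIGINT, _on_sigint)
    while not _stop_event.wait(0.2):  # short waits keep the loop responsive
        pass
    return stop_all()
```

Stopping in reverse start order is a design choice of this sketch: subsystems started last usually depend on those started first.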

4. LED Face Mask

Animated expressions — neutral, smile, blink, look L/R, talk1–3, surprised, sad
Gestural-speaking events — face reacts while speaking
Lip-sync — mouth animates to speech

5. Vision & Recognition

Face recognition — identify people via camera
Face gallery — enroll/manage known faces
Zone gallery / zones — visual zone recognition
Camera feed — attached to the live voice subprocess (vision-in-the-loop)

6. Navigation (web_nav3 integration)

Live Map tab — full embedded web_nav3 dashboard (set-pose, goals, bringup)
Navigation tab — native canvas viewer (saved/live map, places, missions)
map_relay — re-publishes the latched /map @1Hz so the map renders even when stationary
Saved maps — load & view a pre-built .db (localize mode)
Places — save named poses, one-click "Go"
Missions — multi-waypoint routes (defined in web_nav3)
Cancel goal — stop an active goal without tearing down bringup
SLAM — RTABMap LiDAR-ICP, drift-corrected mapping/localization

7. Audio

Device manager — sink/source selection, live refresh
Audio profiles — builtin / anker / hollyland_builtin (auto-switch on plug/unplug)
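
Auto-switching on plug/unplug boils down to mapping the set of connected devices to a profile name. The profile names below come from the list above; the device-name substrings and the priority order are assumptions made for this sketch.

```python
from typing import Iterable


def pick_profile(connected_devices: Iterable[str]) -> str:
    """Choose an audio profile from the names of connected devices."""
    names = [d.lower() for d in connected_devices]
    if any("anker" in n for n in names):       # assumed match rule
        return "anker"
    if any("hollyland" in n for n in names):   # assumed match rule
        return "hollyland_builtin"
    return "builtin"                           # fallback when nothing is plugged in
```

A hotplug callback could call `pick_profile` with the refreshed device list and switch only when the result changes.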

8. Operations, System & Diagnostics

System control — start/stop subsystems, status
Temperature monitor — motor temps (live websocket stream)
Controller — gamepad/teleop input
Web terminal — shell in the browser (websocket)
Logs — live log stream
Recordings & replay — record/playback sessions
Scripts — run saved scripts

Dashboard infrastructure

12 tabs, fault-isolated routers (one broken module never breaks the dashboard)
WebSocket streams: log_stream, motor_temps, terminal
No-store HTML (no stale-cache 404s after deploy)
Lazy subsystem imports (missing dep → that subsystem unavailable, rest runs)
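
Lazy imports and fault isolation can both be achieved by importing each subsystem module inside a `try` block and recording failures instead of raising. This is a generic sketch using the standard `importlib`; the function name `load_subsystems` is hypothetical and the module names are placeholders.

```python
import importlib
from types import ModuleType
from typing import Dict, Iterable, Optional


def load_subsystems(module_names: Iterable[str]) -> Dict[str, Optional[ModuleType]]:
    """Import each module; map failed imports to None instead of crashing."""
    loaded: Dict[str, Optional[ModuleType]] = {}
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except Exception as exc:  # ImportError or an error at module level
            print(f"subsystem {name} unavailable: {exc}")
            loaded[name] = None
    return loaded
```

Callers can then skip registration for any entry whose value is `None`, which leaves the remaining subsystems and routers running.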

Part B — Roadmap (to add)

Tiers are listed in priority order. 🏗️ = load-bearing (other items build on it) · ⚠️ = constrained by ROS 2 Foxy.

Tier 1 — Autonomous behaviors (the product)

#1 Voice-driven navigation — "Sanad, go to the lobby" → nav goal
#2 Greeter mission — recognized face → navigate → greet → express
#3 Named-person greeting — identity → personalized line
#4 Patrol / guided tours — ordered places, speech at each stop
#5 Return-to-base / dock-on-idle — auto-home on idle/low battery
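
Since voice-driven navigation is still a roadmap item, the following is only a sketch of one way a spoken phrase could be resolved to a saved place. The regular expression, the `Pose` tuple, and `phrase_to_goal` are hypothetical; it handles simple English phrases only and ignores Arabic input and trailing words.

```python
import re
from typing import Dict, Optional, Tuple

Pose = Tuple[float, float, float]  # assumed layout: x, y, yaw in the map frame

_GO_TO = re.compile(
    r"\b(?:go|walk|move)\s+to\s+(?:the\s+)?(?P<place>[\w ]+)", re.IGNORECASE
)


def phrase_to_goal(utterance: str, places: Dict[str, Pose]) -> Optional[Pose]:
    """Return the saved pose named in the utterance, or None if unknown."""
    match = _GO_TO.search(utterance)
    if not match:
        return None
    name = match.group("place").strip().lower()
    return places.get(name)
```

Returning `None` for unknown places lets the voice layer ask for clarification instead of sending a goal.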

Tier 2 — Navigation & map (harden + edit)

#6 🏗️ Map republish relay — ✅ DONE (map_relay)
#7 Click-to-goal on Nav tab canvas
#8 Live nav telemetry — distance/ETA/waypoint, "arrived" toast
#9 Battery + nav-state status bar
#10 Geofence zones on the map
#11 Cancel-goal button — ✅ DONE

Map editing & annotation (all build on #6)

#12 Erase tool — paint cells free; wipe ghost obstacles + the SLAM "spokes"
#13 Obstacle paint ("black points" / virtual walls) — ⚠️ Foxy-safe KeepoutFilter substitute
#14 Shape tools + brush size — line/rectangle/polygon
#15 Non-destructive overlay + undo/redo
#16 Persist & auto-reload edits per map
#17 Crop / trim map bounds
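
The erase, obstacle-paint, and undo/redo items fit naturally into a single overlay that stores strokes separately from the base grid. The sketch below assumes the map is a row-major grid using the `nav_msgs/OccupancyGrid` value convention (0 free, 100 occupied, -1 unknown); the `MapOverlay` class and its methods are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]   # (row, col)
FREE, OCCUPIED = 0, 100  # occupancy values as in nav_msgs/OccupancyGrid


class MapOverlay:
    """Edits stored separately from the base grid, with undo/redo."""

    def __init__(self, base: List[List[int]]) -> None:
        self.base = base
        self._strokes: List[Dict[Cell, int]] = []
        self._redo: List[Dict[Cell, int]] = []

    def paint(self, cells: List[Cell], value: int) -> None:
        """Record one stroke: FREE erases, OCCUPIED adds a virtual wall."""
        self._strokes.append({c: value for c in cells})
        self._redo.clear()  # a new stroke invalidates the redo history

    def undo(self) -> None:
        if self._strokes:
            self._redo.append(self._strokes.pop())

    def redo(self) -> None:
        if self._redo:
            self._strokes.append(self._redo.pop())

    def render(self) -> List[List[int]]:
        """Return a copy of the base grid with all strokes applied in order."""
        grid = [row[:] for row in self.base]
        for stroke in self._strokes:
            for (r, c), value in stroke.items():
                if 0 <= r < len(grid) and 0 <= c < len(grid[r]):
                    grid[r][c] = value  # cells outside the map are ignored
        return grid
```

Because `render` always starts from an untouched copy of the base grid, the original SLAM output is never modified, which is what makes the overlay non-destructive.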

Tier 3 — Voice & interaction

Barge-in from dashboard
Quick-phrase soundboard
Conversation memory / visitor log
Per-speaker AR/EN auto-detect
Scheduled announcements
Bake edited map → PGM/YAML (static map_server deploy)

Tier 4 — Face & presence

Gaze / head-track recognized face
Emotion-from-context (sentiment → expression)
Idle breathing / look-around
Lip-sync to TTS amplitude (enhance existing markers)

Tier 5 — Operator, fleet & reliability

🏗️ Global E-STOP button — ✅ exists; surface consistently
Health watchdog — auto-restart dead subsystem + alert
Per-subsystem enable/disable toggles
Behavior recorder → replay (nav+voice timelines)
Mission editor UI (visual sequence builder)
Remote access / tunnel
Reverse-proxy web_nav3 through :8001 — one origin, no iframe cross-port issues
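
The health watchdog item could start as a periodic pass over the subsystem registry. This is a sketch only: `Subsystem`, its `is_alive` and `restart` hooks, and `watchdog_pass` are hypothetical stand-ins for whatever the registry actually exposes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Subsystem:
    name: str
    is_alive: Callable[[], bool]   # hypothetical health probe
    restart: Callable[[], None]    # hypothetical restart hook


def watchdog_pass(subsystems: Iterable[Subsystem],
                  alert: Callable[[str], None]) -> List[str]:
    """Check every subsystem once; restart and report the dead ones."""
    restarted = []
    for sub in subsystems:
        if sub.is_alive():
            continue
        alert(f"{sub.name} is down, restarting")
        try:
            sub.restart()
            restarted.append(sub.name)
        except Exception as exc:  # a failed restart is reported, not raised
            alert(f"{sub.name} restart failed: {exc}")
    return restarted
```

Running `watchdog_pass` from a timer every few seconds would give the auto-restart and alert behavior described in the list.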

Tier 6 — Future / blocked

Speed / caution zones — needs Galactic SpeedFilter or custom layer
Multi-robot fleet (SanadV3 ↔ BotBrain) — needs LocoClient arbitration + coordinator

Recommended next build order

Build #1 (voice→nav) and then #2 (greeter mission) first, since they are the product; follow with #12 and #13 (map editing: erase the SLAM spokes, add virtual walls). #6 (republish relay) and #11 (cancel goal) are already done.
