# GoWelcome
An autonomous **Unitree Go2** "backyard greeter". The robot dog wanders a yard,
spots a person with on-board vision, walks up to them via visual servoing, and
plays a friendly greeting (audio + gestures) -- all while steering clear of
roads and vehicles.

GoWelcome is **mapless**, **reactive**, and **NON-ROS2**: a single Python
process running a fixed-rate state machine on top of `unitree_webrtc_connect`
(the default transport) or the official `unitree_sdk2py` (`--transport dds`).
No SLAM, no global planner -- just perceive, decide, move.

---
## Behaviour: the 5-state machine
Arbitration priority each tick (highest first):
**e-stop / pause** → **stale-perception halt** → **AVOID_DANGER (cars/obstacles)** →
**GPS hold (no fix)** → finish **GREET** → **BOUNDARY (geofence)** →
**APPROACH** → **WANDER + dog-play**. So cars always win, the dog never chases a
person out of the area, and idle play only happens when nothing else is going on.
```
+----------+  person (conf>thr)  +-----------+   box fills  +---------+
|  WANDER  | ------------------> | APPROACH  | -----------> |  GREET  |
|  + play  | <------ lost ------ |           |  stop_ratio  | wav+wag |
+----------+    cooldown <----------------------------------+---------+
  ^  ^  ^
  |  |  +--- back inside ---- +--------------+  near geofence edge (GPS)
  |  |                        |   BOUNDARY   |  home back toward centre
  |  +------ clear ---------- +--------------+
  |                                 ^
  |   road / vehicle near           any state can preempt to
  +------------------------ +--------------+
                            | AVOID_DANGER |  back up + pivot away
                            +--------------+
```
- **WANDER** -- cruise forward (`wander.forward_speed`) with a gentle yaw sweep,
scanning the yard, and occasionally performing an idle **dog-play** action
(see below). It also **keeps its distance from the road/cars**: as pavement or
a vehicle appears ahead it veers toward the clear side and slows down (the
vision-only containment — see below). Transitions to **APPROACH** when a person
is detected above `perception.person_conf`.
- **APPROACH** -- run the visual servo (below) to centre and close on the
person. Falls back to **WANDER** if the person is missing for
`loop.person_lost_frames`. Transitions to **GREET** when the person's box
fills `servo.stop_height_ratio` of the frame height.
- **GREET** -- settle to a full stop (`greet.settle_time`), play the greeting
`wav`, run `greet.gestures` (spaced by `greet.gesture_gap`), then return to
**WANDER** and ignore people for `greet.cooldown` seconds.
- **AVOID_DANGER** -- preempts every other state (highest safety). Triggered by
HSV road coverage over `perception.road_trigger_coverage`, or a vehicle box
taller than `perception.danger_min_height_ratio`. Backs up (`avoid.backup_speed`)
then pivots away from the road until clear for `avoid.clear_frames` frames.
- **BOUNDARY** -- *optional* GPS keep-in-area behaviour (off by default; see
below). When enabled and the dog nears the geofence edge it homes back toward
the centre, and won't leave the area even to chase a person.

A perception time-out (`safety.perception_timeout`) in any state stops the
robot until fresh frames arrive.
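
The priority order at the top of this section can be read as a chain of early returns. The sketch below is a minimal illustration under stated assumptions, not the project's `GoWelcomeStateMachine`: the `Inputs` fields and the `arbitrate` helper are hypothetical names, and the greeting cooldown is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Tuple


class State(Enum):
    WANDER = auto()
    APPROACH = auto()
    GREET = auto()
    AVOID_DANGER = auto()
    BOUNDARY = auto()


@dataclass
class Inputs:
    """Hypothetical per-tick snapshot; field names are illustrative only."""
    estop_or_paused: bool = False
    perception_stale: bool = False
    danger_near: bool = False          # road coverage or vehicle over threshold
    geofence_enabled: bool = False
    gps_fix: bool = True
    greeting_in_progress: bool = False
    near_fence_edge: bool = False
    person_visible: bool = False


def arbitrate(i: Inputs) -> Tuple[Optional[State], bool]:
    """Return (state, halted). halted=True means hold zero velocity this tick."""
    if i.estop_or_paused or i.perception_stale:
        return None, True                  # e-stop / pause / stale perception
    if i.danger_near:
        return State.AVOID_DANGER, False   # cars always win
    if i.geofence_enabled and not i.gps_fix:
        return None, True                  # GPS hold (no fix)
    if i.greeting_in_progress:
        return State.GREET, False          # finish the greeting
    if i.geofence_enabled and i.near_fence_edge:
        return State.BOUNDARY, False       # beats APPROACH: no chasing outside
    if i.person_visible:
        return State.APPROACH, False
    return State.WANDER, False             # idle play happens only here
```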
---
## Stay-in-area: vision (default), GPS optional
The dog patrols an **open area with no physical fence**. By **default it stays in
the area with vision** — no GPS needed:
- **Soft road/car repulsion (WANDER):** as pavement appears in the lower frame
(HSV road mask), or a vehicle is detected (YOLO), the dog **veers toward the
clear side and slows down** *before* reaching the hard reaction — actively
keeping its distance from the road/cars (`avoid.soft_road_coverage`,
`road_repulsion_gain`, `car_repulsion_gain`).
- **Hard reaction (AVOID_DANGER):** up close (road fills the centre past
`perception.road_trigger_coverage`, or a near car), it backs up and pivots
away. The firmware **LiDAR hard-stop** sits underneath all of it.
This relies on a usable visual border (a clear grass→pavement edge, decent
lighting) and is the recommended setup. Tune the gains in `config.py`
(`AvoidConfig`); set `avoid.soft_avoid_enabled = False` to keep only the hard
reaction.
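
One way to picture the soft repulsion is as a yaw bias plus a speed scale computed from what the camera sees. The sketch below is an assumption-laden illustration, not the code in `config.py` or the state machine: the function name, the left/right coverage inputs, the formula, and every default value are hypothetical. Only the parameter names `soft_road_coverage`, `road_repulsion_gain`, and `car_repulsion_gain` come from this README.

```python
def soft_repulsion(road_left, road_right, car_offset, forward_speed,
                   soft_road_coverage=0.10, road_repulsion_gain=1.0,
                   car_repulsion_gain=0.8, max_vyaw=1.0):
    """Return (vx, vyaw); vyaw > 0 turns left (CCW), as in the README convention.

    road_left / road_right: road-pixel fraction (0..1) in each half of the lower frame.
    car_offset: horizontal offset of the nearest vehicle in [-1, 1] (+ = right), or None.
    All defaults are placeholders, not the project's tuned values.
    """
    vyaw = 0.0
    slow = 0.0  # 0 = full speed, 1 = stopped
    coverage = max(road_left, road_right)
    if coverage > soft_road_coverage:
        # More road on the right gives a positive (left) turn, and vice versa.
        vyaw += road_repulsion_gain * (road_right - road_left)
        slow = max(slow, coverage)
    if car_offset is not None:
        # A car on the right pushes the dog left; dead ahead is treated as "right".
        side = 1.0 if car_offset >= 0 else -1.0
        centred = 1.0 - abs(car_offset)  # strongest when the car is centred
        vyaw += car_repulsion_gain * side * centred
        slow = max(slow, centred)
    vyaw = max(-max_vyaw, min(max_vyaw, vyaw))
    vx = forward_speed * (1.0 - min(slow, 1.0))
    return vx, vyaw
```

With no road and no vehicle in view the command passes through unchanged, so the repulsion only shapes WANDER when there is something to steer away from.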
### Optional: GPS geofence (`--geofence`)
For a hard metric boundary, add an external GPS receiver and enable the geofence
with `--geofence` (off by default). It adds the **BOUNDARY** state, homing the
dog back toward a centre point.
> ⚠️ **The Go2 has no built-in GPS.** This requires an **external GPS receiver on
> the onboard computer** (USB u-blox-class into the Jetson), read via `gpsd` or
> serial NMEA. Standard GPS is typically accurate to about **±2-5 m**, so keep
> `geofence.radius_m` well inside the real edge (use RTK GPS for tight bounds near a road).
How it works:
1. On startup the first good fix becomes the fence **centre** (`center_mode:
onstart`), or set explicit `center_lat/lon` (`center_mode: fixed`), or press
**"Set fence centre here"** on the dashboard.
2. The dog roams freely within `geofence.radius_m`. Within `geofence.margin_m`
of the edge it enters **BOUNDARY** and homes back toward the centre, steering
with the GPS **course-over-ground** (no compass needed), until `release_m`
back inside (hysteresis).
3. **Fail-safe:** if GPS is lost/stale (`gps.stale_after`) the dog **stops**
(`geofence.no_fix_behavior: stop`) rather than roam blind near the edge.
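
The three steps above reduce to a distance check with hysteresis plus a heading error derived from course-over-ground. The sketch below is a minimal illustration, not the project's geofence module: the `Geofence` class and helper names are hypothetical, and it assumes `margin_m` and `release_m` are both measured inward from the fence edge with `release_m > margin_m`.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def turn_error_deg(course_deg, target_bearing_deg):
    """Signed smallest angle from course to target in [-180, 180); + = clockwise."""
    return (target_bearing_deg - course_deg + 180.0) % 360.0 - 180.0


class Geofence:
    """Circular fence with hysteresis (assumed semantics, see lead-in)."""

    def __init__(self, center_lat, center_lon, radius_m, margin_m, release_m):
        self.center = (center_lat, center_lon)
        self.radius_m = radius_m
        self.margin_m = margin_m
        self.release_m = release_m
        self.homing = False

    def update(self, lat, lon):
        """Return True while the dog should be in BOUNDARY (homing to centre)."""
        d = haversine_m(self.center[0], self.center[1], lat, lon)
        if not self.homing and d >= self.radius_m - self.margin_m:
            self.homing = True    # entered the margin band near the edge
        elif self.homing and d <= self.radius_m - self.release_m:
            self.homing = False   # far enough back inside: release
        return self.homing
```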
GPS source (`gps.source`): `auto` (probe gpsd, else serial), `gpsd`, `serial`
(NMEA on `gps.serial_port`), or `mock` (a simulated receiver that integrates the
commanded motion, for testing). `--gps`/`--radius` imply `--geofence`.

---
## Act like a dog (idle play)
While WANDERing, an idle scheduler occasionally performs a random dog action
(`play.actions`: `stretch`, `wiggle`, `scrape`/dig, `dance1`, `wallow`, ...),
pausing briefly to do the trick. Intensity is **runtime-settable** (default
`moderate`): `calm` (~75 s between actions), `moderate` (~30 s), `playful`
(~15 s) — change it any time from the dashboard or with `--play`. (The greeting
itself adds Hello/Heart "wags".) Play never overrides safety, the geofence, or a
greeting.
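
As a rough picture of the idle scheduler, the sketch below fires a random action at a jittered interval and only while wandering. It is an illustration under assumptions: the `PlayScheduler` class and the ±30 % jitter are hypothetical, and only the mode names and their approximate intervals come from this README.

```python
import random

# Approximate mean seconds between actions, per the modes described above.
PLAY_INTERVAL_S = {"calm": 75.0, "moderate": 30.0, "playful": 15.0}


class PlayScheduler:
    """Hypothetical idle-play timer; not the project's implementation."""

    def __init__(self, actions, mode="moderate", rng=None):
        self.actions = list(actions)
        self.mode = mode                  # may be changed at runtime
        self.rng = rng or random.Random()
        self._next_at = None

    def _schedule(self, now):
        mean = PLAY_INTERVAL_S[self.mode]
        # The +/-30% jitter is an assumption so tricks do not look clockwork.
        self._next_at = now + mean * self.rng.uniform(0.7, 1.3)

    def poll(self, now, wandering):
        """Return an action name when one is due during WANDER, else None."""
        if self._next_at is None:
            self._schedule(now)
            return None
        if not wandering or now < self._next_at:
            return None                   # play never preempts other states
        self._schedule(now)
        return self.rng.choice(self.actions)
```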
---
## Architecture
```
     camera frames
           |
           v
+---------------------+       latest()        +-------------------------+
|  PerceptionThread   | -------------------> |  GoWelcomeStateMachine  |
| (background thread) |   PerceptionResult    |  step(dt) -> State      |
|  YOLO + HSV road    |                      |  WANDER/APPROACH/GREET/ |
+---------------------+                      |  AVOID_DANGER           |
           ^                                 +-------------------------+
           | get_frame()                                  |
           |                                              | drive() / stop() /
           |                                              | gesture() / play_greeting()
           |                                              v
           |                                 +--------------------------+
           +-------------------------------- |      RobotInterface      |
                                             |   (abstract contract)    |
                                             +--------------------------+
                                               /          |           \
                         +------------------+  +------------------+  +------------------+
                         | Go2WebRTCRobot   |  | Go2Robot         |  | MockRobot        |
                         | unitree_webrtc_  |  | unitree_sdk2py   |  | webcam / video,  |
                         | connect (DEFAULT)|  | over CycloneDDS  |  | no hardware, for |
                         | wifi, AIR/PRO/EDU|  | (--transport dds)|  | off-robot dev/CI |
                         | + AudioHub audio |  | wired / EDU      |  |                  |
                         +------------------+  +------------------+  +------------------+
                         (async bridge: WebRTC event loop on its own thread)
```
- **`config.py`** -- every tunable lives here in grouped dataclasses
(`GoWelcomeConfig`). CLI flags in `main.py` override a handful at startup.
- **`gowelcome/types.py`** -- frozen data contracts: `State`, `Detection`,
`RoadInfo`, `PerceptionResult`. The shared language between layers.
- **`gowelcome/robot/interface.py`** -- the `RobotInterface` and `AudioBackend`
ABCs plus the `GESTURES` vocabulary. The behaviour layer talks only to these.
- **`PerceptionThread`** -- grabs frames from the robot, runs YOLO + the HSV
road mask off the control loop, and publishes the newest `PerceptionResult`
via `latest()`.
- **`GoWelcomeStateMachine`** -- the reactive brain; `step(dt)` reads the latest
perception and issues robot commands, returning the current `State`.
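
The `PerceptionThread` / `latest()` hand-off described above is a "latest value wins" pattern: the worker overwrites a single slot, and the control loop reads whatever is newest without ever blocking on inference. The sketch below shows that pattern in generic form. The class name, the `grab`/`process` callables, and the two-field result type are hypothetical stand-ins, not the project's classes.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Result:
    """Stand-in for PerceptionResult: a timestamp plus an opaque payload."""
    stamp: float
    payload: object


class LatestValueWorker(threading.Thread):
    """Runs process(grab()) in the background and keeps only the newest result."""

    def __init__(self, grab: Callable[[], object], process: Callable[[object], object]):
        super().__init__(daemon=True)
        self._grab = grab
        self._process = process
        self._mutex = threading.Lock()
        self._latest: Optional[Result] = None
        self._stop_evt = threading.Event()

    def run(self):
        while not self._stop_evt.is_set():
            frame = self._grab()
            if frame is None:
                time.sleep(0.005)        # no frame yet; avoid a busy loop
                continue
            result = Result(time.monotonic(), self._process(frame))
            with self._mutex:
                self._latest = result    # overwrite: stale results are dropped

    def latest(self) -> Optional[Result]:
        """Non-blocking read of the newest result (None before the first one)."""
        with self._mutex:
            return self._latest

    def stop(self):
        self._stop_evt.set()
```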
Velocity convention everywhere (matches `SportClient.Move`):
`vx` forward+, `vy` left+, `vyaw` CCW/left+ (rad/s).

---
## Transports & greeting audio
GoWelcome talks to the robot through one of two transports (`--transport`):
| Transport | Library | Works on | Greeting audio | When |
|-----------|---------|----------|----------------|------|
| **`webrtc`** *(default)* | `unitree_webrtc_connect` (app protocol) | Go2 **AIR/PRO/EDU over wifi** | ✅ **from the dog's speaker** (AudioHub) | default; no jailbreak |
| `dds` | official `unitree_sdk2py` (CycloneDDS) | Go2 **EDU, wired** | ❌ none on Go2 → host speaker | `--transport dds` |
**Greeting from the dog (WebRTC default).** On startup GoWelcome uploads
`assets/greeting.wav` to the robot via **AudioHub** and plays it by uuid on each
greeting — sound comes from the **Go2's own speaker**. Pick the method with
`--audio-method`:
- `audiohub` *(default)* — upload once, `play_by_uuid` per greeting (persistent, low latency).
- `stream` — stream the file live each greeting via an aiortc `MediaPlayer`.
**DDS transport audio.** The official SDK has **no Go2 audio path** (its
`AudioClient` is G1-only). On `--transport dds`, greeting audio falls back to a
**pluggable host backend** (`--audio host|go2|null`) that plays on the machine
running GoWelcome. The field-proven pattern (from the team's G1 `Sanad` stack)
is a USB/Bluetooth speaker on the onboard computer, pinned by its PulseAudio sink:
```bash
pactl list short sinks # find your speaker's sink
python main.py --transport dds --interface eth0 \
--audio-device alsa_output.usb-Anker_PowerConf_A3321-DEV-SN1-01.analog-stereo
```
Drop your clip at `assets/greeting.wav` — see `assets/greeting.README.md`. (DDS
host playback wants 16 kHz mono 16-bit PCM; AudioHub accepts any wav.)
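
To check a clip against the DDS host format before deploying it, the standard-library `wave` module is enough. This is a small helper sketch and not part of GoWelcome; the function name is hypothetical.

```python
import wave


def wav_matches_dds_host_format(path):
    """Return True if the file is 16 kHz, mono, 16-bit uncompressed PCM."""
    with wave.open(path, "rb") as w:
        return (
            w.getframerate() == 16000
            and w.getnchannels() == 1
            and w.getsampwidth() == 2      # bytes per sample -> 16-bit
            and w.getcomptype() == "NONE"  # uncompressed PCM
        )
```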
---
## Install
### Off-robot (mock / development / tests)
```bash
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt # numpy, opencv-python, ultralytics, simpleaudio
```
You do **not** need any robot library for `--mock`.
### On the real robot — WebRTC (default)
```bash
sudo apt install -y portaudio19-dev
pip install -r requirements.txt # includes unitree_webrtc_connect
```
For **Go2 firmware ≥ 1.1.15** you also need the per-device AES-128 key (once):
fetch it with the connector's `examples/fetch_aes_key.py`, then pass `--aes-key`.
### On the real robot — DDS (alternative, EDU/wired)
Install the official Unitree SDK on the robot's host (not a plain pip install):
```bash
# https://github.com/unitreerobotics/unitree_sdk2_python
git clone https://github.com/unitreerobotics/unitree_sdk2_python
cd unitree_sdk2_python && pip install -e . # pulls in cyclonedds
```
`pyserial` is **not** required.

---
## Run
```bash
# Off-robot, webcam index 0, silent audio:
python main.py --mock --audio null --source 0
# or: ./scripts/run_mock.sh
# Off-robot from a video file with the debug window:
python main.py --mock --source backyard.mp4
# Real Go2 over WebRTC (default) — greeting plays from the dog's speaker:
python main.py --robot-ip 192.168.1.50 # add --aes-key <hex> on fw >= 1.1.15
# or: ./welcome.sh --robot-ip 192.168.1.50 # TEST SUSPENDED FIRST
# Real Go2 over DDS (EDU/wired):
python main.py --transport dds --interface eth0
# Useful flags:
python main.py --robot-ip 192.168.1.50 --device cuda --headless --web
python main.py --mock --dry-run # perceive + decide, never move
```
| Flag | Config field set |
|-----------------|------------------------------------------|
| `--mock` | `mock` |
| `--transport` | `transport` (`webrtc`/`dds`) |
| `--robot-ip` | `webrtc.ip` (used with the LocalSTA connection method) |
| `--serial` | `webrtc.serial_number` |
| `--aes-key` | `webrtc.aes_128_key` (fw ≥ 1.1.15) |
| `--connection` | `webrtc.connection_method` |
| `--audio-method`| `webrtc.audio_method` (`audiohub`/`stream`) |
| `--interface` | `network.interface` (dds) |
| `--device` | `perception.device` |
| `--model` | `perception.model_path` |
| `--source` | `camera.mock_source` |
| `--wav` | `greet.wav_path` |
| `--audio` | `audio.backend` (`host`/`go2`/`null`) |
| `--audio-device`| `audio.output_device` (PulseAudio sink) |
| `--no-avoidance`| `safety.use_lidar_avoidance = False` |
| `--headless` | `headless` (no cv2 window) |
| `--dry-run` | `dry_run` (decide but never move) |
| `--conf` | `perception.person_conf` |
| `--web` | `web.enabled` (control dashboard) |
| `--web-port` | `web.port` (default 8080) |
| `--geofence` | enable GPS geofence (default vision-only) |
| `--gps` | `gps.source` (`auto`/`gpsd`/`serial`/`mock`) |
| `--radius` | `geofence.radius_m` (metres) |
| `--play` | `play.mode` (`calm`/`moderate`/`playful`) |
A live cv2 window (unless `--headless`) draws **green** person boxes,
**red** danger boxes, the road-coverage percentage, and the current state.
Press **ESC** in the window (or **Ctrl-C** in the terminal) to quit.
### Control dashboard (HTTP)
Add `--web` and open **`http://<dog-ip>:8080/`** from any laptop or phone on the
network; this is the easiest way to monitor the dog when it runs headless. The
page shows the live camera (with the detection/state overlay) **plus controls**:
change **play mode** (calm/moderate/playful), **pause/resume**, **E-STOP**, and
**"set fence centre here"**, with a live status panel (state, GPS fix, in/out of fence).
```bash
./welcome.sh --robot-ip 192.168.1.50 --headless --web # dog, browse to its IP:8080
./welcome.sh --mock --source 0 --web --web-port 9000 # off-robot demo
```
Endpoints: `/` (dashboard), `/stream.mjpg` (raw MJPEG), `/snapshot.jpg`,
`/status.json` (status), `POST /control` (commands), `/healthz`. Stdlib-only
(`http.server` + `cv2.imencode`) — no extra dependency, multiple viewers.
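
For a sense of how little the stdlib needs for the read-only endpoints, here is a minimal sketch that serves `/healthz` and `/status.json` from `http.server`. It is not the project's dashboard code: `make_handler`, `serve`, and the status payload are hypothetical, and the MJPEG stream and `POST /control` are omitted.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(get_status):
    """Build a handler class that reports whatever get_status() returns."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/healthz":
                body, ctype = b"ok", "text/plain"
            elif self.path == "/status.json":
                body, ctype = json.dumps(get_status()).encode(), "application/json"
            else:
                self.send_error(404)
                return
            self.send_response(200)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the console quiet in this sketch

    return Handler


def serve(get_status, host="127.0.0.1", port=8080):
    """Start the server on a daemon thread; one thread per viewer."""
    server = ThreadingHTTPServer((host, port), make_handler(get_status))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```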
> ⚠️ **Security:** the dashboard binds `0.0.0.0` with **no authentication**, so anyone
> on the same network can watch the camera at `http://<dog-ip>:<port>/` and send
> commands through `POST /control`. That's intended for a trusted home LAN. On an
> untrusted network, set `web.host = "127.0.0.1"` in `config.py` (view only via an
> SSH tunnel), or leave `--web` off.
---
## Tuning
Everything is in **`config.py`**, grouped by subsystem. Common knobs:
- **Detection** -- `perception.person_conf` (default `0.80`), `perception.device`,
`perception.model_path`, `perception.danger_classes`,
`perception.danger_min_height_ratio`.
- **Road mask** -- `perception.road_hsv_lower/upper`, `road_crop_frac`,
`road_trigger_coverage`.
- **Approach feel** -- `servo.kp_yaw`, `servo.max_yaw_rate`, `servo.yaw_deadband`,
`servo.kp_forward`, `servo.max_forward`, `servo.stop_height_ratio`,
`servo.yaw_sign` (flip if your camera mounting inverts left/right).
- **Wander** -- `wander.forward_speed`, `wander.yaw_sweep_rate/period`.
- **Greeting** -- `greet.gestures`, `greet.gesture_gap`, `greet.cooldown`.
- **Safety caps** -- `safety.max_vx/max_vy/max_vyaw`, `safety.perception_timeout`,
`safety.command_timeout`.
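
The grouped-dataclass layout makes dotted names such as `servo.stop_height_ratio` map directly onto attributes. The sketch below illustrates the idea with three groups only. The defaults marked as placeholders are invented for the example, and `apply_overrides` is a hypothetical helper, not a function in `config.py`.

```python
from dataclasses import dataclass, field


@dataclass
class PerceptionConfig:
    person_conf: float = 0.80        # default stated in this README
    device: str = "cpu"              # placeholder default


@dataclass
class ServoConfig:
    stop_height_ratio: float = 0.50  # default stated in this README
    yaw_sign: int = -1               # default stated in this README
    kp_yaw: float = 1.0              # placeholder default
    max_yaw_rate: float = 1.0        # placeholder default


@dataclass
class WebConfig:
    enabled: bool = False
    port: int = 8080                 # default stated in this README


@dataclass
class GoWelcomeConfig:
    perception: PerceptionConfig = field(default_factory=PerceptionConfig)
    servo: ServoConfig = field(default_factory=ServoConfig)
    web: WebConfig = field(default_factory=WebConfig)


def apply_overrides(cfg, overrides):
    """Apply {"group.field": value} overrides, rejecting unknown names."""
    for dotted, value in overrides.items():
        group_name, field_name = dotted.split(".", 1)
        group = getattr(cfg, group_name)
        if not hasattr(group, field_name):
            raise AttributeError(f"unknown config field: {dotted}")
        setattr(group, field_name, value)
    return cfg
```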
---
## Visual-servoing math
The servo turns a single person bounding box into a `(vx, vyaw)` command each
tick. Let `frame_w`, `frame_h` be the frame size and the box have centre `cx`
and height `h`.
**Horizontal (yaw) error** -- normalised to `[-1, 1]`, `+` = right of centre:
```
err = (cx - frame_w/2) / (frame_w/2) # Detection.horizontal_offset
```
**Yaw command** -- a P(ID) controller on `err`, with a deadband and clamp:
```
vyaw = yaw_sign * PID(err) clamped to +/- servo.max_yaw_rate
|err| < servo.yaw_deadband -> vyaw = 0
```
With the default `yaw_sign = -1`: a target to the **right** (`err > 0`) yields
`vyaw < 0` (a clockwise/right turn) -- the robot turns toward the person. Flip
`yaw_sign` if your mounting inverts this.
**Distance proxy** -- how much of the frame height the box fills:
```
height_ratio = h / frame_h # Detection.height_ratio
arrived = height_ratio >= servo.stop_height_ratio (default 0.50)
```
**Forward command** -- proportional to remaining distance, throttled when the
heading error is large so the dog squares up before charging, and zeroed on
arrival:
```
vx = kp_forward * (stop_height_ratio - height_ratio)
vx *= exp(-forward_heading_falloff * |err|) # slow down off-axis
vx = clamp(vx, 0, servo.max_forward) # never reverse to approach
arrived -> vx = 0
```
All commands then pass the global safety caps (`safety.max_*`) before reaching
`drive()`.
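
Put together, the formulas in this section fit in one function. The sketch below follows them line by line but is not the project's servo: it uses a proportional-only yaw term in place of the full `PID(err)`, the function name is hypothetical, and the gain, deadband, and falloff defaults are placeholders (only `yaw_sign = -1` and `stop_height_ratio = 0.50` are stated here).

```python
import math


def servo_command(cx, h, frame_w, frame_h, *, kp_yaw=1.2, max_yaw_rate=1.0,
                  yaw_deadband=0.05, yaw_sign=-1, kp_forward=1.0,
                  max_forward=0.5, stop_height_ratio=0.50,
                  forward_heading_falloff=2.0):
    """Return (vx, vyaw, arrived) for one person box."""
    # Horizontal error in [-1, 1]; + means the person is right of centre.
    err = (cx - frame_w / 2) / (frame_w / 2)
    # Distance proxy: fraction of the frame height the box fills.
    height_ratio = h / frame_h
    arrived = height_ratio >= stop_height_ratio

    # Yaw: P-only stand-in for PID(err), with deadband and clamp.
    if abs(err) < yaw_deadband:
        vyaw = 0.0
    else:
        vyaw = yaw_sign * kp_yaw * err
        vyaw = max(-max_yaw_rate, min(max_yaw_rate, vyaw))

    # Forward: proportional to remaining "distance", throttled off-axis.
    if arrived:
        vx = 0.0
    else:
        vx = kp_forward * (stop_height_ratio - height_ratio)
        vx *= math.exp(-forward_heading_falloff * abs(err))
        vx = max(0.0, min(max_forward, vx))  # never reverse to approach
    return vx, vyaw, arrived
```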
---
## Safety notes
- **E-stop = Ctrl-C.** SIGINT/SIGTERM set a stop flag; the loop then stops the
robot and runs a clean `shutdown()` (zero velocity, release avoidance, close
camera/audio). ESC in the debug window does the same.
- **Test suspended first.** Always run a new build with the Go2 hung off the
ground and a hand on Ctrl-C before letting it walk.
- **LiDAR firmware hard-stop.** On real hardware, `drive()` routes through the
Go2 `ObstaclesAvoidClient` when `safety.use_lidar_avoidance` is on (default).
This is a firmware-level last line of defence on top of GoWelcome's own
`AVOID_DANGER` logic. `--no-avoidance` disables it (use with care).
- **Velocity caps.** Every command is clamped by `safety.max_vx/vy/vyaw` after
the controllers run, so a controller bug can't command an unsafe speed.
- **Stale perception.** If no fresh frame arrives within
`safety.perception_timeout`, the robot stops.
- **`--dry-run`** runs the full perception + decision pipeline but never sends a
non-zero velocity -- handy for validating behaviour safely.
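
The velocity-cap, stale-perception, and `--dry-run` rules above can all sit in one final gate in front of `drive()`. The sketch below is an illustration of that idea with hypothetical function names and arguments; it is not the project's safety layer.

```python
def clamp(value, limit):
    """Clamp value to the symmetric range [-limit, +limit]."""
    return max(-limit, min(limit, value))


def safe_command(vx, vy, vyaw, *, frame_age_s, max_vx, max_vy, max_vyaw,
                 perception_timeout, dry_run=False):
    """Final gate before drive(): zero on stale frames or dry-run, else clamp."""
    if dry_run or frame_age_s > perception_timeout:
        return 0.0, 0.0, 0.0
    return clamp(vx, max_vx), clamp(vy, max_vy), clamp(vyaw, max_vyaw)
```

Because the gate runs after every controller, a bad gain or a bug upstream can at worst produce a command at the cap, never above it.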