# GoWelcome

An autonomous **Unitree Go2** "backyard greeter". The robot dog wanders a yard,
spots a person with on-board vision, walks up to them via visual servoing, and
plays a friendly greeting (audio + gestures) -- all while steering clear of
roads and vehicles.

GoWelcome is **mapless**, **reactive**, and **NON-ROS2**: a single Python
process running a fixed-rate state machine on top of `unitree_webrtc_connect`
(the default transport) or the official `unitree_sdk2py`. No SLAM, no global
planner -- just perceive, decide, move.

---

## Behaviour: the 5-state machine

Arbitration priority each tick (highest first):
**e-stop / pause** → **stale-perception halt** → **AVOID_DANGER (cars/obstacles)**
→ **GPS hold (no fix)** → finish **GREET** → **BOUNDARY (geofence)** →
**APPROACH** → **WANDER + dog-play**. So cars always win, the dog never chases a
person out of the area, and idle play only happens when nothing else is going on.

```
+----------+  person (conf>thr)  +-----------+   box fills   +---------+
|  WANDER  | ------------------> | APPROACH  | ------------> |  GREET  |
|  + play  | <------ lost ------ |           |  stop_ratio   | wav+wag |
+----------+      cooldown <---------------------------------+---------+
  ^  ^  ^
  |  |  +--- back inside ---- +--------------+  near geofence edge (GPS)
  |  |                        |   BOUNDARY   |  home back toward centre
  |  +------ clear ---------- +--------------+
  |                                  ^
  |   road / vehicle near            any state can preempt to
  +------------------------ +--------------+
                            | AVOID_DANGER |  back up + pivot away
                            +--------------+
```

- **WANDER** -- cruise forward (`wander.forward_speed`) with a gentle yaw sweep,
  scanning the yard, and occasionally performing an idle **dog-play** action
  (see below). It also **keeps its distance from the road/cars**: as pavement or
  a vehicle appears ahead it veers toward the clear side and slows down (the
  vision-only containment -- see below). Transitions to **APPROACH** when a person
  is detected above `perception.person_conf`.
- **APPROACH** -- run the visual servo (below) to centre and close on the
  person. Falls back to **WANDER** if the person is missing for
  `loop.person_lost_frames`. Transitions to **GREET** when the person's box
  fills `servo.stop_height_ratio` of the frame height.
- **GREET** -- settle to a full stop (`greet.settle_time`), play the greeting
  `wav`, run `greet.gestures` (spaced by `greet.gesture_gap`), then return to
  **WANDER** and ignore people for `greet.cooldown` seconds.
- **AVOID_DANGER** -- preempts every other behaviour state (only e-stop/pause
  and the stale-perception halt rank higher). Triggered by HSV road coverage
  over `perception.road_trigger_coverage`, or a vehicle box taller than
  `perception.danger_min_height_ratio`. Backs up (`avoid.backup_speed`), then
  pivots away from the road until clear for `avoid.clear_frames` frames.
- **BOUNDARY** -- *optional* GPS keep-in-area behaviour (off by default; see
  below). When enabled and the dog nears the geofence edge it homes back toward
  the centre, and won't leave the area even to chase a person.

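The arbitration order above can be read as a single "first matching rule wins" function. The following is a minimal sketch under that reading, not the project's actual code: the `Inputs` fields and the `arbitrate` name are hypothetical, and the real state machine also tracks timers and frame counters that are left out here.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    WANDER = auto()
    APPROACH = auto()
    GREET = auto()
    AVOID_DANGER = auto()
    BOUNDARY = auto()


@dataclass
class Inputs:
    """Hypothetical per-tick snapshot; field names are illustrative only."""
    estop_or_paused: bool = False
    perception_stale: bool = False
    danger: bool = False           # road coverage or vehicle box over threshold
    gps_hold: bool = False         # geofence enabled but no usable fix
    greeting_active: bool = False  # a greeting is in progress and must finish
    near_fence_edge: bool = False
    person_visible: bool = False   # detection above perception.person_conf


def arbitrate(inp: Inputs) -> Optional[State]:
    """Return the winning state, or None when the robot must halt in place."""
    if inp.estop_or_paused or inp.perception_stale:
        return None
    if inp.danger:
        return State.AVOID_DANGER
    if inp.gps_hold:
        return None
    if inp.greeting_active:
        return State.GREET
    if inp.near_fence_edge:
        return State.BOUNDARY
    if inp.person_visible:
        return State.APPROACH
    return State.WANDER
```

Because the checks run top to bottom, a vehicle in view beats a visible person, and the fence edge beats an approach, which matches the "cars always win" and "never chases a person out of the area" guarantees.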
A perception time-out (`safety.perception_timeout`) in any state stops the
robot until fresh frames arrive.

---

## Stay-in-area: vision (default), GPS optional

The dog patrols an **open area with no physical fence**. By **default it stays in
the area with vision** -- no GPS needed:

- **Soft road/car repulsion (WANDER):** as pavement appears in the lower frame
  (HSV road mask), or a vehicle is detected (YOLO), the dog **veers toward the
  clear side and slows down** *before* reaching the hard reaction -- actively
  keeping its distance from the road/cars (`avoid.soft_road_coverage`,
  `road_repulsion_gain`, `car_repulsion_gain`).
- **Hard reaction (AVOID_DANGER):** up close (road fills the centre past
  `perception.road_trigger_coverage`, or a near car), it backs up and pivots
  away. The firmware **LiDAR hard-stop** sits underneath all of it.

This relies on a usable visual border (a clear grass→pavement edge, decent
lighting) and is the recommended setup. Tune the gains in `config.py`
(`AvoidConfig`); set `avoid.soft_avoid_enabled = False` to keep only the hard
reaction.

### Optional: GPS geofence (`--geofence`)

For a hard metric boundary, add an external GPS receiver and enable the geofence
with `--geofence` (off by default). It adds the **BOUNDARY** state, homing the
dog back toward a centre point.

> ⚠️ **The Go2 has no built-in GPS.** This requires an **external GPS receiver on
> the onboard computer** (USB u-blox-class into the Jetson), read via `gpsd` or
> serial NMEA. Standard GPS is accurate to **±2–5 m**, so keep `geofence.radius_m`
> well inside the real edge (use RTK GPS for tight bounds near a road).

How it works:

1. On startup the first good fix becomes the fence **centre**
   (`center_mode: onstart`), or set explicit `center_lat/lon`
   (`center_mode: fixed`), or press **"Set fence centre here"** on the dashboard.
2. The dog roams freely within `geofence.radius_m`. Within `geofence.margin_m`
   of the edge it enters **BOUNDARY** and homes back toward the centre, steering
   with the GPS **course-over-ground** (no compass needed), until `release_m`
   back inside (hysteresis).
3. **Fail-safe:** if GPS is lost/stale (`gps.stale_after`) the dog **stops**
   (`geofence.no_fix_behavior: stop`) rather than roam blind near the edge.

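The enter/release hysteresis in step 2 can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: the `Geofence` class and `distance_m` helper are hypothetical names, and `release_m` is read here as "distance from the edge at which homing ends", so it should be larger than `margin_m`.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class Geofence:
    """Circular fence with hysteresis (hypothetical helper, not the real class)."""

    def __init__(self, center, radius_m, margin_m, release_m):
        self.center = center          # (lat, lon) of the fence centre
        self.radius_m = radius_m
        self.margin_m = margin_m      # start homing this close to the edge
        self.release_m = release_m    # stop homing once this far inside the edge
        self.homing = False

    def update(self, lat, lon):
        """Feed one GPS fix; return True while BOUNDARY homing should be active."""
        d = distance_m(self.center[0], self.center[1], lat, lon)
        if not self.homing and d >= self.radius_m - self.margin_m:
            self.homing = True
        elif self.homing and d <= self.radius_m - self.release_m:
            self.homing = False
        return self.homing
```

With `radius_m=20`, `margin_m=3` and `release_m=6`, homing starts at 17 m from the centre and only ends once the dog is back within 14 m, so GPS jitter around a single threshold cannot make the state flap.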
GPS source (`gps.source`): `auto` (probe gpsd, else serial), `gpsd`, `serial`
(NMEA on `gps.serial_port`), or `mock` (a simulated receiver that integrates the
commanded motion -- for testing). `--gps`/`--radius` imply `--geofence`.

---

## Act like a dog (idle play)

While WANDERing, an idle scheduler occasionally performs a random dog action
(`play.actions`: `stretch`, `wiggle`, `scrape`/dig, `dance1`, `wallow`, ...),
pausing briefly to do the trick. Intensity is **runtime-settable** (default
`moderate`): `calm` (~75 s between actions), `moderate` (~30 s), `playful`
(~15 s) -- change it any time from the dashboard or with `--play`. (The greeting
itself adds Hello/Heart "wags".) Play never overrides safety, the geofence, or a
greeting.

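One way to picture the idle scheduler is a timer whose mean gap depends on the play mode. The sketch below is hypothetical: `PlayScheduler`, `poll`, and the ±50 % jitter around the mean are assumptions made for illustration; only the mode names and the approximate 75/30/15 s gaps come from the description above.

```python
import random

# Approximate mean gap between idle actions, per play mode (seconds).
MEAN_GAP_S = {"calm": 75.0, "moderate": 30.0, "playful": 15.0}


class PlayScheduler:
    """Picks a random idle action every so often (illustrative sketch)."""

    def __init__(self, actions, mode="moderate", rng=None):
        self.actions = list(actions)
        self.mode = mode                      # may be changed at runtime
        self.rng = rng or random.Random()
        self._next_at = None

    def _gap(self):
        mean = MEAN_GAP_S[self.mode]
        # Assumed jitter: uniform in [0.5, 1.5] x mean so tricks feel unscripted.
        return self.rng.uniform(0.5 * mean, 1.5 * mean)

    def poll(self, now, idle):
        """Return an action name when one is due and the dog is idle, else None."""
        if self._next_at is None:
            self._next_at = now + self._gap()
        if not idle or now < self._next_at:
            return None
        self._next_at = now + self._gap()
        return self.rng.choice(self.actions)
```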

---

## Architecture

```
    camera frames
          |
          v
+---------------------+   latest()           +-------------------------+
|  PerceptionThread   | -------------------> |  GoWelcomeStateMachine  |
| (background thread) |   PerceptionResult   |  step(dt) -> State      |
|  YOLO + HSV road    |                      |  WANDER/APPROACH/GREET/ |
+---------------------+                      |  AVOID_DANGER           |
          ^                                  +-------------------------+
          | get_frame()                                   |
          |                                               | drive() / stop() /
          |                                               | gesture() / play_greeting()
          |                                               v
          |                                  +--------------------------+
          +--------------------------------- |      RobotInterface      |
                                             |   (abstract contract)    |
                                             +--------------------------+
                                               /          |          \
                           +------------------+ +------------------+ +------------------+
                           | Go2WebRTCRobot   | | Go2Robot         | | MockRobot        |
                           | unitree_webrtc_  | | unitree_sdk2py   | | webcam / video,  |
                           | connect (DEFAULT)| | over CycloneDDS  | | no hardware, for |
                           | wifi, AIR/PRO/EDU| | (--transport dds)| | off-robot dev/CI |
                           | + AudioHub audio | | wired / EDU      | |                  |
                           +------------------+ +------------------+ +------------------+
                           (async bridge: WebRTC event loop on its own thread)
```

- **`config.py`** -- every tunable lives here in grouped dataclasses
  (`GoWelcomeConfig`). CLI flags in `main.py` override a handful at startup.
- **`gowelcome/types.py`** -- frozen data contracts: `State`, `Detection`,
  `RoadInfo`, `PerceptionResult`. The shared language between layers.
- **`gowelcome/robot/interface.py`** -- the `RobotInterface` and `AudioBackend`
  ABCs plus the `GESTURES` vocabulary. The behaviour layer talks only to these.
- **`PerceptionThread`** -- grabs frames from the robot, runs YOLO + the HSV
  road mask off the control loop, and publishes the newest `PerceptionResult`
  via `latest()`.
- **`GoWelcomeStateMachine`** -- the reactive brain; `step(dt)` reads the latest
  perception and issues robot commands, returning the current `State`.

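The hand-off between `PerceptionThread` and the state machine is a "latest value wins" pattern: the producer overwrites a single slot and the consumer reads whatever is newest, rejecting it if it is too old. A minimal sketch of that pattern follows; `LatestValue`, `publish`, and the `max_age_s` argument are hypothetical names, and the real `latest()` may expose staleness differently.

```python
import threading
import time
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class LatestValue(Generic[T]):
    """Thread-safe single-slot mailbox: writers overwrite, readers never queue."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value: Optional[T] = None
        self._stamp = float("-inf")   # monotonic time of the last publish

    def publish(self, value: T) -> None:
        """Called by the perception thread after each processed frame."""
        with self._lock:
            self._value = value
            self._stamp = time.monotonic()

    def latest(self, max_age_s: float) -> Optional[T]:
        """Return the newest value, or None if nothing fresh enough exists."""
        with self._lock:
            if time.monotonic() - self._stamp > max_age_s:
                return None
            return self._value
```

A control loop that gets `None` back would treat it as stale perception and stop the robot, which is the role `safety.perception_timeout` plays in this README.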
Velocity convention everywhere (matches `SportClient.Move`):
`vx` forward+, `vy` left+, `vyaw` CCW/left+ (rad/s).

---

## Transports & greeting audio

GoWelcome talks to the robot through one of two transports (`--transport`):

| Transport | Library | Works on | Greeting audio | When |
|-----------|---------|----------|----------------|------|
| **`webrtc`** *(default)* | `unitree_webrtc_connect` (app protocol) | Go2 **AIR/PRO/EDU over wifi** | ✅ **from the dog's speaker** (AudioHub) | default; no jailbreak |
| `dds` | official `unitree_sdk2py` (CycloneDDS) | Go2 **EDU, wired** | ❌ none on Go2 → host speaker | `--transport dds` |

**Greeting from the dog (WebRTC default).** On startup GoWelcome uploads
`assets/greeting.wav` to the robot via **AudioHub** and plays it by uuid on each
greeting, so the sound comes from the **Go2's own speaker**. Pick the method with
`--audio-method`:

- `audiohub` *(default)* -- upload once, `play_by_uuid` per greeting (persistent, low latency).
- `stream` -- stream the file live each greeting via an aiortc `MediaPlayer`.

**DDS transport audio.** The official SDK has **no Go2 audio path** (its
`AudioClient` is G1-only). On `--transport dds`, greeting audio falls back to a
**pluggable host backend** (`--audio host|go2|null`) that plays on the machine
running GoWelcome. The field-proven pattern (from the team's G1 `Sanad` stack)
is a USB/Bluetooth speaker on the onboard computer, pinned by its PulseAudio sink:

```bash
pactl list short sinks                      # find your speaker's sink
python main.py --transport dds --interface eth0 \
    --audio-device alsa_output.usb-Anker_PowerConf_A3321-DEV-SN1-01.analog-stereo
```

Drop your clip at `assets/greeting.wav` (see `assets/greeting.README.md`). DDS
host playback wants 16 kHz mono 16-bit PCM; AudioHub accepts any wav.

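To check a clip against the 16 kHz mono 16-bit PCM format before using the DDS host backend, the standard-library `wave` module is enough. This is a small stand-alone helper sketched for illustration; `check_greeting_wav` is a hypothetical name and not part of GoWelcome.

```python
import wave


def check_greeting_wav(path):
    """Return a list of problems; an empty list means 16 kHz mono 16-bit PCM."""
    problems = []
    # wave.open() only reads uncompressed PCM and raises wave.Error otherwise.
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != 16_000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 16000")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected mono")
        if wav.getsampwidth() != 2:
            problems.append(f"{8 * wav.getsampwidth()}-bit samples, expected 16-bit")
    return problems
```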

---

## Install

### Off-robot (mock / development / tests)

```bash
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt   # numpy, opencv-python, ultralytics, simpleaudio
```

You do **not** need any robot library for `--mock`.

### On the real robot — WebRTC (default)

```bash
sudo apt install -y portaudio19-dev
pip install -r requirements.txt   # includes unitree_webrtc_connect
```

For **Go2 firmware ≥ 1.1.15** you also need the per-device AES-128 key (once):
fetch it with the connector's `examples/fetch_aes_key.py`, then pass `--aes-key`.

### On the real robot — DDS (alternative, EDU/wired)

Install the official Unitree SDK on the robot's host (not a plain pip install):

```bash
# https://github.com/unitreerobotics/unitree_sdk2_python
git clone https://github.com/unitreerobotics/unitree_sdk2_python
cd unitree_sdk2_python && pip install -e .   # pulls in cyclonedds
```

`pyserial` is **not** required.

---

## Run

```bash
# Off-robot, webcam index 0, silent audio:
python main.py --mock --audio null --source 0
# or: ./scripts/run_mock.sh

# Off-robot from a video file with the debug window:
python main.py --mock --source backyard.mp4

# Real Go2 over WebRTC (default) -- greeting plays from the dog's speaker:
python main.py --robot-ip 192.168.1.50        # add --aes-key <hex> on fw >= 1.1.15
# or: ./welcome.sh --robot-ip 192.168.1.50    # TEST SUSPENDED FIRST

# Real Go2 over DDS (EDU/wired):
python main.py --transport dds --interface eth0

# Useful flags:
python main.py --robot-ip 192.168.1.50 --device cuda --headless --web
python main.py --mock --dry-run               # perceive + decide, never move
```

| Flag             | Config field set                              |
|------------------|-----------------------------------------------|
| `--mock`         | `mock`                                        |
| `--transport`    | `transport` (`webrtc`/`dds`)                  |
| `--robot-ip`     | `webrtc.ip` (localsta)                        |
| `--serial`       | `webrtc.serial_number`                        |
| `--aes-key`      | `webrtc.aes_128_key` (fw ≥ 1.1.15)            |
| `--connection`   | `webrtc.connection_method`                    |
| `--audio-method` | `webrtc.audio_method` (`audiohub`/`stream`)   |
| `--interface`    | `network.interface` (dds)                     |
| `--device`       | `perception.device`                           |
| `--model`        | `perception.model_path`                       |
| `--source`       | `camera.mock_source`                          |
| `--wav`          | `greet.wav_path`                              |
| `--audio`        | `audio.backend` (`host`/`go2`/`null`)         |
| `--audio-device` | `audio.output_device` (PulseAudio sink)       |
| `--no-avoidance` | `safety.use_lidar_avoidance = False`          |
| `--headless`     | `headless` (no cv2 window)                    |
| `--dry-run`      | `dry_run` (decide but never move)             |
| `--conf`         | `perception.person_conf`                      |
| `--web`          | `web.enabled` (control dashboard)             |
| `--web-port`     | `web.port` (default 8080)                     |
| `--geofence`     | enable GPS geofence (default vision-only)     |
| `--gps`          | `gps.source` (`auto`/`gpsd`/`serial`/`mock`)  |
| `--radius`       | `geofence.radius_m` (metres)                  |
| `--play`         | `play.mode` (`calm`/`moderate`/`playful`)     |

A live cv2 window (unless `--headless`) draws **green** person boxes,
**red** danger boxes, the road-coverage percentage, and the current state.
Press **ESC** in the window (or **Ctrl-C** in the terminal) to quit.

### Control dashboard (HTTP)

Add `--web` and open **`http://<dog-ip>:8080/`** from any laptop/phone on the
network -- ideal when running headless on the dog. The page shows the live
camera (with the detection/state overlay) **plus controls**: change **play mode**
(calm/moderate/playful), **pause/resume**, **E-STOP**, and **"set fence centre
here"**, with a live status panel (state, GPS fix, in/out of fence).

```bash
./welcome.sh --robot-ip 192.168.1.50 --headless --web     # dog, browse to its IP:8080
./welcome.sh --mock --source 0 --web --web-port 9000      # off-robot demo
```

Endpoints: `/` (dashboard), `/stream.mjpg` (raw MJPEG), `/snapshot.jpg`,
`/status.json` (status), `POST /control` (commands), `/healthz`. Stdlib-only
(`http.server` + `cv2.imencode`), so there is no extra dependency, and multiple
viewers can connect at once.

> ⚠️ **Security:** the dashboard binds `0.0.0.0` with **no authentication**, so
> anyone on the same network can watch the camera and send control commands at
> `http://<dog-ip>:<port>/`. That's intended for a trusted home LAN. On an
> untrusted network, set `web.host = "127.0.0.1"` in `config.py` (access only
> via an SSH tunnel), or leave `--web` off.

---

## Tuning

Everything is in **`config.py`**, grouped by subsystem. Common knobs:

- **Detection** -- `perception.person_conf` (default `0.80`), `perception.device`,
  `perception.model_path`, `perception.danger_classes`,
  `perception.danger_min_height_ratio`.
- **Road mask** -- `perception.road_hsv_lower/upper`, `road_crop_frac`,
  `road_trigger_coverage`.
- **Approach feel** -- `servo.kp_yaw`, `servo.max_yaw_rate`, `servo.yaw_deadband`,
  `servo.kp_forward`, `servo.max_forward`, `servo.stop_height_ratio`,
  `servo.yaw_sign` (flip if your camera mounting inverts left/right).
- **Wander** -- `wander.forward_speed`, `wander.yaw_sweep_rate/period`.
- **Greeting** -- `greet.gestures`, `greet.gesture_gap`, `greet.cooldown`.
- **Safety caps** -- `safety.max_vx/max_vy/max_vyaw`, `safety.perception_timeout`,
  `safety.command_timeout`.

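For orientation, the "grouped dataclasses" layout and a dotted-path override could look roughly like this. It is a cut-down sketch, not the real `config.py`: only a few fields are shown, `device = "cpu"` and `web.enabled = False` are assumed defaults, and `apply_override` is a hypothetical helper for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PerceptionConfig:
    person_conf: float = 0.80        # default stated in this README
    device: str = "cpu"              # assumed default


@dataclass
class ServoConfig:
    stop_height_ratio: float = 0.50  # default stated in this README


@dataclass
class WebConfig:
    enabled: bool = False            # assumed: dashboard is opt-in via --web
    port: int = 8080


@dataclass
class GoWelcomeConfig:
    perception: PerceptionConfig = field(default_factory=PerceptionConfig)
    servo: ServoConfig = field(default_factory=ServoConfig)
    web: WebConfig = field(default_factory=WebConfig)


def apply_override(cfg, dotted, value):
    """Set a nested field such as 'perception.person_conf' (hypothetical helper)."""
    *parents, leaf = dotted.split(".")
    target = cfg
    for name in parents:
        target = getattr(target, name)
    if not hasattr(target, leaf):
        raise AttributeError(f"unknown config field: {dotted}")
    setattr(target, leaf, value)
```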

---

## Visual-servoing math

The servo turns a single person bounding box into a `(vx, vyaw)` command each
tick. Let `frame_w`, `frame_h` be the frame size and the box have centre `cx`
and height `h`.

**Horizontal (yaw) error** -- normalised to `[-1, 1]`, `+` = right of centre:

```
err = (cx - frame_w/2) / (frame_w/2)        # Detection.horizontal_offset
```

**Yaw command** -- a P(ID) controller on `err`, with a deadband and clamp:

```
vyaw = yaw_sign * PID(err)   clamped to +/- servo.max_yaw_rate
|err| < servo.yaw_deadband   ->  vyaw = 0
```

With the default `yaw_sign = -1`: a target to the **right** (`err > 0`) yields
`vyaw < 0` (a clockwise/right turn) -- the robot turns toward the person. Flip
`yaw_sign` if your mounting inverts this.

**Distance proxy** -- how much of the frame height the box fills:

```
height_ratio = h / frame_h                  # Detection.height_ratio
arrived      = height_ratio >= servo.stop_height_ratio   (default 0.50)
```

**Forward command** -- proportional to the remaining height-ratio gap (the
distance proxy), throttled when the heading error is large so the dog squares
up before charging, and zeroed on arrival:

```
vx = kp_forward * (stop_height_ratio - height_ratio)
vx *= exp(-forward_heading_falloff * |err|)     # slow down off-axis
vx = clamp(vx, 0, servo.max_forward)            # never reverse to approach
arrived -> vx = 0
```

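Putting the formulas above together, one servo tick can be sketched as a pure function. This is an illustration, not the project's controller: only the proportional term of the P(ID) yaw loop is shown, and every gain in `ServoGains` except `yaw_sign = -1` and `stop_height_ratio = 0.50` is a placeholder value chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ServoGains:
    """Placeholder gains; the real defaults live in config.py."""
    kp_yaw: float = 1.0
    max_yaw_rate: float = 0.8
    yaw_deadband: float = 0.05
    yaw_sign: float = -1.0
    kp_forward: float = 1.5
    max_forward: float = 0.6
    stop_height_ratio: float = 0.50
    forward_heading_falloff: float = 2.0


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def servo_command(cx, h, frame_w, frame_h, g=ServoGains()):
    """Map one person box to (vx, vyaw, arrived) using the equations above."""
    err = (cx - frame_w / 2) / (frame_w / 2)      # +1 = far right, -1 = far left
    height_ratio = h / frame_h
    arrived = height_ratio >= g.stop_height_ratio

    if abs(err) < g.yaw_deadband:
        vyaw = 0.0
    else:
        # P-only stand-in for PID(err); positive vyaw is a CCW/left turn.
        vyaw = clamp(g.yaw_sign * g.kp_yaw * err, -g.max_yaw_rate, g.max_yaw_rate)

    if arrived:
        vx = 0.0
    else:
        vx = g.kp_forward * (g.stop_height_ratio - height_ratio)
        vx *= math.exp(-g.forward_heading_falloff * abs(err))  # slow down off-axis
        vx = clamp(vx, 0.0, g.max_forward)                     # never reverse
    return vx, vyaw, arrived
```

For a 640x480 frame, a centred box 120 px tall gives `height_ratio = 0.25`, so with these placeholder gains `vx = 1.5 * 0.25 = 0.375` and `vyaw = 0`; moving the same box to `cx = 480` makes `vyaw` negative (a right turn) and scales `vx` down by `exp(-1)`.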
All commands then pass the global safety caps (`safety.max_*`) before reaching
`drive()`.

---

## Safety notes

- **E-stop = Ctrl-C.** SIGINT/SIGTERM set a stop flag; the loop then stops the
  robot and runs a clean `shutdown()` (zero velocity, release avoidance, close
  camera/audio). ESC in the debug window does the same.
- **Test suspended first.** Always run a new build with the Go2 hung off the
  ground and a hand on Ctrl-C before letting it walk.
- **LiDAR firmware hard-stop.** On real hardware, `drive()` routes through the
  Go2 `ObstaclesAvoidClient` when `safety.use_lidar_avoidance` is on (default).
  This is a firmware-level last line of defence on top of GoWelcome's own
  `AVOID_DANGER` logic. `--no-avoidance` disables it (use with care).
- **Velocity caps.** Every command is clamped by `safety.max_vx/vy/vyaw` after
  the controllers run, so a controller bug can't command an unsafe speed.
- **Stale perception.** If no fresh frame arrives within
  `safety.perception_timeout`, the robot stops.
- **`--dry-run`** runs the full perception + decision pipeline but never sends a
  non-zero velocity -- handy for validating behaviour safely.