# GoWelcome

An autonomous **Unitree Go2** "backyard greeter". The robot dog wanders a yard, spots a person with on-board vision, walks up to them via visual servoing, and plays a friendly greeting (audio + gestures) -- all while steering clear of roads and vehicles.

GoWelcome is **mapless**, **reactive**, and **NON-ROS2**: a single Python process running a fixed-rate state machine on top of a Unitree client library (`unitree_webrtc_connect` by default, or the official `unitree_sdk2py`). No SLAM, no global planner -- just perceive, decide, move.

---

## Behaviour: the 5-state machine

Arbitration priority each tick (highest first): **e-stop / pause** → **stale-perception halt** → **AVOID_DANGER (cars/obstacles)** → **GPS hold (no fix)** → finish **GREET** → **BOUNDARY (geofence)** → **APPROACH** → **WANDER + dog-play**. So avoiding cars outranks every behaviour state, the dog never chases a person out of the area, and idle play only happens when nothing else is going on.

```
+----------+  person (conf>thr)  +-----------+   box fills    +---------+
|  WANDER  | ------------------> | APPROACH  | -------------> |  GREET  |
|  + play  | <------ lost ------ |           |   stop_ratio   | wav+wag |
+----------+                     +-----------+                +---------+
  ^  ^  ^                                                         |
  |  |  +------------------------ cooldown -----------------------+
  |  |
  |  +--- back inside ------ +--------------+  near geofence edge (GPS):
  |                          |   BOUNDARY   |  home back toward centre
  |                          +--------------+
  |
  +------ clear ------------ +--------------+  road / vehicle near:
                             | AVOID_DANGER |  any state can preempt to here,
                             +--------------+  then back up + pivot away
```

- **WANDER** -- cruise forward (`wander.forward_speed`) with a gentle yaw sweep, scanning the yard, and occasionally performing an idle **dog-play** action (see below). It also **keeps its distance from the road/cars**: as pavement or a vehicle appears ahead it veers toward the clear side and slows down (the vision-only containment; see below). Transitions to **APPROACH** when a person is detected above `perception.person_conf`.
- **APPROACH** -- run the visual servo (below) to centre and close on the person. Falls back to **WANDER** if the person is missing for `loop.person_lost_frames`. Transitions to **GREET** when the person's box fills `servo.stop_height_ratio` of the frame height.
- **GREET** -- settle to a full stop (`greet.settle_time`), play the greeting `wav`, run `greet.gestures` (spaced by `greet.gesture_gap`), then return to **WANDER** and ignore people for `greet.cooldown` seconds.
- **AVOID_DANGER** -- preempts every other state (highest safety). Triggered by HSV road coverage over `perception.road_trigger_coverage`, or a vehicle box taller than `perception.danger_min_height_ratio`. Backs up (`avoid.backup_speed`) then pivots away from the road until clear for `avoid.clear_frames` frames.
- **BOUNDARY** -- *optional* GPS keep-in-area behaviour (off by default; see below). When enabled and the dog nears the geofence edge it homes back toward the centre, and won't leave the area even to chase a person.

A perception time-out (`safety.perception_timeout`) in any state stops the robot until fresh frames arrive.

---

## Stay-in-area: vision (default), GPS optional

The dog patrols an **open area with no physical fence**. By **default it stays in the area with vision**, no GPS needed:

- **Soft road/car repulsion (WANDER):** as pavement appears in the lower frame (HSV road mask), or a vehicle is detected (YOLO), the dog **veers toward the clear side and slows down** *before* reaching the hard reaction, actively keeping its distance from the road/cars (`avoid.soft_road_coverage`, `road_repulsion_gain`, `car_repulsion_gain`).
- **Hard reaction (AVOID_DANGER):** up close (road fills the centre past `perception.road_trigger_coverage`, or a near car), it backs up and pivots away. The firmware **LiDAR hard-stop** sits underneath all of it.

This relies on a usable visual border (a clear grass→pavement edge, decent lighting) and is the recommended setup. Tune the gains in `config.py` (`AvoidConfig`); set `avoid.soft_avoid_enabled = False` to keep only the hard reaction.
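To make the soft repulsion concrete, here is a minimal sketch of one way the "veer toward the clear side and slow down" rule could be computed. It is not the project's implementation: the split of road coverage into left and right halves, the `SoftAvoidSketch` class, the linear slowdown, and every default value are assumptions for illustration; only the names `soft_road_coverage` and `road_repulsion_gain` come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftAvoidSketch:
    # Hypothetical container; illustrative values, not the defaults in config.py.
    soft_road_coverage: float = 0.15   # coverage at which soft repulsion starts
    road_repulsion_gain: float = 1.0   # yaw rate (rad/s) per unit of imbalance
    max_vyaw: float = 0.8              # local clamp on the repulsion yaw rate


def soft_road_repulsion(left_cov, right_cov, forward_speed, cfg):
    """Return (vx, vyaw) for WANDER given road coverage in each frame half.

    Assumes coverage values in [0, 1] and the README's velocity convention:
    vyaw > 0 is a counter-clockwise (left) turn.
    """
    coverage = max(left_cov, right_cov)
    if coverage < cfg.soft_road_coverage:
        # Not enough pavement in view: keep cruising straight.
        return forward_speed, 0.0
    # More road on the right gives a positive imbalance, i.e. a left turn.
    imbalance = right_cov - left_cov
    vyaw = cfg.road_repulsion_gain * imbalance
    vyaw = max(-cfg.max_vyaw, min(cfg.max_vyaw, vyaw))
    # Slow down linearly as the road fills more of the view.
    vx = forward_speed * (1.0 - min(1.0, coverage))
    return vx, vyaw


cfg = SoftAvoidSketch()
print(soft_road_repulsion(0.0, 0.05, 0.4, cfg))  # below threshold: unchanged
print(soft_road_repulsion(0.1, 0.5, 0.4, cfg))   # road on the right: slow, turn left
```

A vehicle term scaled by `car_repulsion_gain` could be added to `vyaw` in the same way, using the detected box's horizontal offset instead of the coverage imbalance.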
### Optional: GPS geofence (`--geofence`)

For a hard metric boundary, add an external GPS receiver and enable the geofence with `--geofence` (off by default). It adds the **BOUNDARY** state, homing the dog back toward a centre point.

> ⚠️ **The Go2 has no built-in GPS.** This requires an **external GPS receiver on
> the onboard computer** (USB u-blox-class into the Jetson), read via `gpsd` or
> serial NMEA. Standard GPS is accurate to **±2–5 m**, so keep `geofence.radius_m`
> well inside the real edge (use RTK GPS for tight bounds near a road).

How it works:

1. On startup the first good fix becomes the fence **centre** (`center_mode: onstart`), or set explicit `center_lat/lon` (`center_mode: fixed`), or press **"Set fence centre here"** on the dashboard.
2. The dog roams freely within `geofence.radius_m`. Within `geofence.margin_m` of the edge it enters **BOUNDARY** and homes back toward the centre, steering with the GPS **course-over-ground** (no compass needed), until `release_m` back inside (hysteresis).
3. **Fail-safe:** if GPS is lost/stale (`gps.stale_after`) the dog **stops** (`geofence.no_fix_behavior: stop`) rather than roam blind near the edge.

GPS source (`gps.source`): `auto` (probe gpsd, else serial), `gpsd`, `serial` (NMEA on `gps.serial_port`), or `mock` (a simulated receiver that integrates the commanded motion, for testing). `--gps`/`--radius` imply `--geofence`.

---

## Act like a dog (idle play)

While WANDERing, an idle scheduler occasionally performs a random dog action (`play.actions`: `stretch`, `wiggle`, `scrape`/dig, `dance1`, `wallow`, ...), pausing briefly to do the trick. Intensity is **runtime-settable** (default `moderate`): `calm` (~75 s between actions), `moderate` (~30 s), `playful` (~15 s). Change it any time from the dashboard or with `--play`. (The greeting itself adds Hello/Heart "wags".) Play never overrides safety, the geofence, or a greeting.
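The idle scheduler described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the class name `IdlePlaySketch`, the `poll(now, idle)` method, and the ±50 % jitter around each mode's interval are all hypothetical; only the three mode names and their approximate gaps come from the text.

```python
import random

# Approximate seconds between actions per mode, as quoted in the README.
MEAN_GAP_S = {"calm": 75.0, "moderate": 30.0, "playful": 15.0}


class IdlePlaySketch:
    """Hypothetical idle-play scheduler: picks a random action now and then."""

    def __init__(self, actions, mode="moderate", rng=None):
        self.actions = list(actions)
        self.mode = mode
        self.rng = rng if rng is not None else random.Random()
        self._next_at = None  # time of the next allowed action

    def set_mode(self, mode):
        # Runtime-settable intensity; takes effect at the next reschedule.
        if mode not in MEAN_GAP_S:
            raise ValueError(f"unknown play mode: {mode!r}")
        self.mode = mode

    def _draw_gap(self):
        # Assumed jitter: uniform within +/-50 % of the mode's mean gap.
        mean = MEAN_GAP_S[self.mode]
        return self.rng.uniform(0.5 * mean, 1.5 * mean)

    def poll(self, now, idle):
        """Return an action name when one is due, else None.

        `idle` should be True only when nothing of higher priority
        (safety, geofence, greeting) is active.
        """
        if self._next_at is None:
            self._next_at = now + self._draw_gap()
            return None
        if not idle or now < self._next_at:
            return None
        self._next_at = now + self._draw_gap()
        return self.rng.choice(self.actions)


sched = IdlePlaySketch(["stretch", "wiggle", "dance1"], mode="playful")
sched.poll(0.0, idle=True)          # first call only arms the timer
print(sched.poll(60.0, idle=True))  # due by now: prints one of the actions
```

Passing `idle=False` while another state is active keeps the timer armed but suppresses the action, which matches the rule that play never overrides safety, the geofence, or a greeting.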
---

## Architecture

```
    camera frames
          |
          v
+---------------------+       latest()       +-------------------------+
|  PerceptionThread   | -------------------> |  GoWelcomeStateMachine  |
| (background thread) |   PerceptionResult   |   step(dt) -> State     |
|   YOLO + HSV road   |                      | WANDER/APPROACH/GREET/  |
+---------------------+                      |      AVOID_DANGER       |
          ^                                  +-------------------------+
          | get_frame()                                   |
          |                                               | drive() / stop() /
          |                                               | gesture() / play_greeting()
          |                                               v
          |                                  +--------------------------+
          +--------------------------------- |      RobotInterface      |
                                             |   (abstract contract)    |
                                             +--------------------------+
                                               /          |          \
          +------------------+ +------------------+ +------------------+
          | Go2WebRTCRobot   | | Go2Robot         | | MockRobot        |
          | unitree_webrtc_  | | unitree_sdk2py   | | webcam / video,  |
          | connect (DEFAULT)| | over CycloneDDS  | | no hardware, for |
          | wifi, AIR/PRO/EDU| | (--transport dds)| | off-robot dev/CI |
          | + AudioHub audio | | wired / EDU      | |                  |
          +------------------+ +------------------+ +------------------+
          (async bridge: WebRTC event loop on its own thread)
```

- **`config.py`** -- every tunable lives here in grouped dataclasses (`GoWelcomeConfig`). CLI flags in `main.py` override a handful at startup.
- **`gowelcome/types.py`** -- frozen data contracts: `State`, `Detection`, `RoadInfo`, `PerceptionResult`. The shared language between layers.
- **`gowelcome/robot/interface.py`** -- the `RobotInterface` and `AudioBackend` ABCs plus the `GESTURES` vocabulary. The behaviour layer talks only to these.
- **`PerceptionThread`** -- grabs frames from the robot, runs YOLO + the HSV road mask off the control loop, and publishes the newest `PerceptionResult` via `latest()`.
- **`GoWelcomeStateMachine`** -- the reactive brain; `step(dt)` reads the latest perception and issues robot commands, returning the current `State`.

Velocity convention everywhere (matches `SportClient.Move`): `vx` forward+, `vy` left+, `vyaw` CCW/left+ (rad/s).
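As a rough picture of the abstract contract in the diagram, the sketch below defines an ABC with the five calls shown there and a recording stand-in that could back a unit test. The method names come from the diagram; the signatures, return types, and the `RobotInterfaceSketch` / `RecordingRobot` class names are assumptions, so check `gowelcome/robot/interface.py` for the real definitions.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Tuple


class RobotInterfaceSketch(ABC):
    """Assumed shape of the contract; signatures are illustrative only."""

    @abstractmethod
    def get_frame(self) -> Optional[Any]:
        """Return the newest camera frame, or None if none is available."""

    @abstractmethod
    def drive(self, vx: float, vy: float, vyaw: float) -> None:
        """Command body velocities: vx forward+, vy left+, vyaw CCW+ (rad/s)."""

    @abstractmethod
    def stop(self) -> None:
        """Command zero velocity."""

    @abstractmethod
    def gesture(self, name: str) -> None:
        """Run a named gesture."""

    @abstractmethod
    def play_greeting(self) -> None:
        """Play the greeting clip."""


class RecordingRobot(RobotInterfaceSketch):
    """Hardware-free stand-in that records every call for inspection."""

    def __init__(self) -> None:
        self.log: List[Tuple] = []

    def get_frame(self) -> Optional[Any]:
        return None  # no camera attached in this sketch

    def drive(self, vx: float, vy: float, vyaw: float) -> None:
        self.log.append(("drive", vx, vy, vyaw))

    def stop(self) -> None:
        self.log.append(("stop",))

    def gesture(self, name: str) -> None:
        self.log.append(("gesture", name))

    def play_greeting(self) -> None:
        self.log.append(("play_greeting",))


robot = RecordingRobot()
robot.drive(0.3, 0.0, -0.2)  # forward while turning right (clockwise)
robot.stop()
print(robot.log)
```

Because the behaviour layer depends only on the contract, the same state machine can run against WebRTC, DDS, or a mock without changes.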
---

## Transports & greeting audio

GoWelcome talks to the robot through one of two transports (`--transport`):

| Transport | Library | Works on | Greeting audio | When |
|-----------|---------|----------|----------------|------|
| **`webrtc`** *(default)* | `unitree_webrtc_connect` (app protocol) | Go2 **AIR/PRO/EDU over wifi** | ✅ **from the dog's speaker** (AudioHub) | default; no jailbreak |
| `dds` | official `unitree_sdk2py` (CycloneDDS) | Go2 **EDU, wired** | ❌ none on Go2 → host speaker | `--transport dds` |

**Greeting from the dog (WebRTC default).** On startup GoWelcome uploads `assets/greeting.wav` to the robot via **AudioHub** and plays it by uuid on each greeting, so sound comes from the **Go2's own speaker**. Pick the method with `--audio-method`:

- `audiohub` *(default)* -- upload once, `play_by_uuid` per greeting (persistent, low latency).
- `stream` -- stream the file live each greeting via an aiortc `MediaPlayer`.

**DDS transport audio.** The official SDK has **no Go2 audio path** (its `AudioClient` is G1-only). On `--transport dds`, greeting audio falls back to a **pluggable host backend** (`--audio host|go2|null`) that plays on the machine running GoWelcome. The field-proven pattern (from the team's G1 `Sanad` stack) is a USB/Bluetooth speaker on the onboard computer, pinned by its PulseAudio sink:

```bash
pactl list short sinks        # find your speaker's sink
python main.py --transport dds --interface eth0 \
    --audio-device alsa_output.usb-Anker_PowerConf_A3321-DEV-SN1-01.analog-stereo
```

Drop your clip at `assets/greeting.wav`; see `assets/greeting.README.md`. (DDS host playback wants 16 kHz mono 16-bit PCM; AudioHub accepts any wav.)

---

## Install

### Off-robot (mock / development / tests)

```bash
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt     # numpy, opencv-python, ultralytics, simpleaudio
```

You do **not** need any robot library for `--mock`.
### On the real robot -- WebRTC (default)

```bash
sudo apt install -y portaudio19-dev
pip install -r requirements.txt     # includes unitree_webrtc_connect
```

For **Go2 firmware ≥ 1.1.15** you also need the per-device AES-128 key (once): fetch it with the connector's `examples/fetch_aes_key.py`, then pass `--aes-key`.

### On the real robot -- DDS (alternative, EDU/wired)

Install the official Unitree SDK on the robot's host (not a plain pip install):

```bash
# https://github.com/unitreerobotics/unitree_sdk2_python
git clone https://github.com/unitreerobotics/unitree_sdk2_python
cd unitree_sdk2_python && pip install -e .     # pulls in cyclonedds
```

`pyserial` is **not** required.

---

## Run

```bash
# Off-robot, webcam index 0, silent audio:
python main.py --mock --audio null --source 0
# or: ./scripts/run_mock.sh

# Off-robot from a video file with the debug window:
python main.py --mock --source backyard.mp4

# Real Go2 over WebRTC (default) -- greeting plays from the dog's speaker:
python main.py --robot-ip 192.168.1.50      # add --aes-key on fw >= 1.1.15
# or: ./welcome.sh --robot-ip 192.168.1.50  # TEST SUSPENDED FIRST

# Real Go2 over DDS (EDU/wired):
python main.py --transport dds --interface eth0

# Useful flags:
python main.py --robot-ip 192.168.1.50 --device cuda --headless --web
python main.py --mock --dry-run             # perceive + decide, never move
```

| Flag             | Config field set                               |
|------------------|------------------------------------------------|
| `--mock`         | `mock`                                         |
| `--transport`    | `transport` (`webrtc`/`dds`)                   |
| `--robot-ip`     | `webrtc.ip` (localsta)                         |
| `--serial`       | `webrtc.serial_number`                         |
| `--aes-key`      | `webrtc.aes_128_key` (fw ≥ 1.1.15)             |
| `--connection`   | `webrtc.connection_method`                     |
| `--audio-method` | `webrtc.audio_method` (`audiohub`/`stream`)    |
| `--interface`    | `network.interface` (dds)                      |
| `--device`       | `perception.device`                            |
| `--model`        | `perception.model_path`                        |
| `--source`       | `camera.mock_source`                           |
| `--wav`          | `greet.wav_path`                               |
| `--audio`        | `audio.backend` (`host`/`go2`/`null`)          |
| `--audio-device` | `audio.output_device` (PulseAudio sink)        |
| `--no-avoidance` | `safety.use_lidar_avoidance = False`           |
| `--headless`     | `headless` (no cv2 window)                     |
| `--dry-run`      | `dry_run` (decide but never move)              |
| `--conf`         | `perception.person_conf`                       |
| `--web`          | `web.enabled` (control dashboard)              |
| `--web-port`     | `web.port` (default 8080)                      |
| `--geofence`     | enable GPS geofence (default vision-only)      |
| `--gps`          | `gps.source` (`auto`/`gpsd`/`serial`/`mock`)   |
| `--radius`       | `geofence.radius_m` (metres)                   |
| `--play`         | `play.mode` (`calm`/`moderate`/`playful`)      |

A live cv2 window (unless `--headless`) draws **green** person boxes, **red** danger boxes, the road-coverage percentage, and the current state. Press **ESC** in the window (or **Ctrl-C** in the terminal) to quit.

### Control dashboard (HTTP)

Add `--web` and open **`http://<robot-ip>:8080/`** from any laptop/phone on the network; this is ideal when the dog runs headless. The page shows the live camera (with the detection/state overlay) **plus controls**: change **play mode** (calm/moderate/playful), **pause/resume**, **E-STOP**, and **"set fence centre here"**, with a live status panel (state, GPS fix, in/out of fence).

```bash
./welcome.sh --robot-ip 192.168.1.50 --headless --web    # dog, browse to its IP:8080
./welcome.sh --mock --source 0 --web --web-port 9000     # off-robot demo
```

Endpoints: `/` (dashboard), `/stream.mjpg` (raw MJPEG), `/snapshot.jpg`, `/status.json` (status), `POST /control` (commands), `/healthz`. Stdlib-only (`http.server` + `cv2.imencode`), so there is no extra dependency, and multiple viewers are supported.

> ⚠️ **Security:** the dashboard binds `0.0.0.0` with **no authentication**, so anyone
> on the same network can watch the camera at `http://<host>:<port>/` and send
> commands through `POST /control`. That's intended for a trusted home LAN. On an
> untrusted network, set `web.host = "127.0.0.1"` in `config.py` (view only via an
> SSH tunnel), or leave `--web` off.

---

## Tuning

Everything is in **`config.py`**, grouped by subsystem.
Common knobs:

- **Detection** -- `perception.person_conf` (default `0.80`), `perception.device`, `perception.model_path`, `perception.danger_classes`, `perception.danger_min_height_ratio`.
- **Road mask** -- `perception.road_hsv_lower/upper`, `road_crop_frac`, `road_trigger_coverage`.
- **Approach feel** -- `servo.kp_yaw`, `servo.max_yaw_rate`, `servo.yaw_deadband`, `servo.kp_forward`, `servo.max_forward`, `servo.stop_height_ratio`, `servo.yaw_sign` (flip if your camera mounting inverts left/right).
- **Wander** -- `wander.forward_speed`, `wander.yaw_sweep_rate/period`.
- **Greeting** -- `greet.gestures`, `greet.gesture_gap`, `greet.cooldown`.
- **Safety caps** -- `safety.max_vx/max_vy/max_vyaw`, `safety.perception_timeout`, `safety.command_timeout`.

---

## Visual-servoing math

The servo turns a single person bounding box into a `(vx, vyaw)` command each tick. Let `frame_w`, `frame_h` be the frame size and the box have centre `cx` and height `h`.

**Horizontal (yaw) error** -- normalised to `[-1, 1]`, `+` = right of centre:

```
err = (cx - frame_w/2) / (frame_w/2)        # Detection.horizontal_offset
```

**Yaw command** -- a P(ID) controller on `err`, with a deadband and clamp:

```
vyaw = yaw_sign * PID(err)        clamped to +/- servo.max_yaw_rate
|err| < servo.yaw_deadband   ->   vyaw = 0
```

With the default `yaw_sign = -1`: a target to the **right** (`err > 0`) yields `vyaw < 0` (a clockwise/right turn) -- the robot turns toward the person. Flip `yaw_sign` if your mounting inverts this.
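The yaw rule above can be sketched in a few lines. This is a simplification under stated assumptions: it keeps only the proportional term of the P(ID) controller, the function names are hypothetical, and the gain and limit values in the example calls are illustrative rather than the defaults in `config.py`.

```python
def horizontal_offset(cx, frame_w):
    """Normalised horizontal error in [-1, 1]; positive means right of centre."""
    half = frame_w / 2.0
    return (cx - half) / half


def yaw_command(err, kp_yaw, max_yaw_rate, yaw_deadband, yaw_sign=-1.0):
    """Proportional-only yaw command with deadband and clamp (rad/s, CCW+)."""
    if abs(err) < yaw_deadband:
        return 0.0  # close enough to centred: do not hunt
    raw = yaw_sign * kp_yaw * err
    return max(-max_yaw_rate, min(max_yaw_rate, raw))


# Person centred at x=480 in a 640-pixel-wide frame: right of centre.
err = horizontal_offset(480, 640)
vyaw = yaw_command(err, kp_yaw=1.0, max_yaw_rate=0.8, yaw_deadband=0.05)
print(err, vyaw)  # positive error, negative (clockwise) yaw rate
```

With `yaw_sign = -1`, a positive error always maps to a negative `vyaw`, which is the right turn the text describes.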
**Distance proxy** -- how much of the frame height the box fills:

```
height_ratio = h / frame_h                                  # Detection.height_ratio
arrived      = height_ratio >= servo.stop_height_ratio      (default 0.50)
```

**Forward command** -- proportional to remaining distance, throttled when the heading error is large so the dog squares up before charging, and zeroed on arrival:

```
vx  = kp_forward * (stop_height_ratio - height_ratio)
vx *= exp(-forward_heading_falloff * |err|)     # slow down off-axis
vx  = clamp(vx, 0, servo.max_forward)           # never reverse to approach
arrived -> vx = 0
```

All commands then pass the global safety caps (`safety.max_*`) before reaching `drive()`.

---

## Safety notes

- **E-stop = Ctrl-C.** SIGINT/SIGTERM set a stop flag; the loop then stops the robot and runs a clean `shutdown()` (zero velocity, release avoidance, close camera/audio). ESC in the debug window does the same.
- **Test suspended first.** Always run a new build with the Go2 hung off the ground and a hand on Ctrl-C before letting it walk.
- **LiDAR firmware hard-stop.** On real hardware, `drive()` routes through the Go2 `ObstaclesAvoidClient` when `safety.use_lidar_avoidance` is on (default). This is a firmware-level last line of defence on top of GoWelcome's own `AVOID_DANGER` logic. `--no-avoidance` disables it (use with care).
- **Velocity caps.** Every command is clamped by `safety.max_vx/vy/vyaw` after the controllers run, so a controller bug can't command an unsafe speed.
- **Stale perception.** If no fresh frame arrives within `safety.perception_timeout`, the robot stops.
- **`--dry-run`** runs the full perception + decision pipeline but never sends a non-zero velocity -- handy for validating behaviour safely.
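To tie the forward command from the visual-servoing section to the velocity caps listed above, here is a minimal sketch of both steps chained together. The formulas follow the pseudocode in this README; the function names, argument order, and the numeric values in the example are assumptions for illustration, not the project's API or defaults.

```python
import math


def forward_command(height_ratio, err, kp_forward, stop_height_ratio,
                    forward_heading_falloff, max_forward):
    """Forward speed from the box-height distance proxy (never negative)."""
    if height_ratio >= stop_height_ratio:
        return 0.0  # arrived: the box fills enough of the frame
    vx = kp_forward * (stop_height_ratio - height_ratio)
    vx *= math.exp(-forward_heading_falloff * abs(err))  # slow down off-axis
    return max(0.0, min(max_forward, vx))


def apply_safety_caps(vx, vy, vyaw, max_vx, max_vy, max_vyaw):
    """Clamp each axis symmetrically; the last step before drive()."""
    def clamp(value, limit):
        return max(-limit, min(limit, value))
    return clamp(vx, max_vx), clamp(vy, max_vy), clamp(vyaw, max_vyaw)


# Illustrative numbers: the box fills 25 % of the frame, target is 50 %.
servo = dict(kp_forward=2.0, stop_height_ratio=0.5,
             forward_heading_falloff=2.0, max_forward=0.6)
vx_on_axis = forward_command(0.25, 0.0, **servo)
vx_off_axis = forward_command(0.25, 1.0, **servo)
print(vx_on_axis, vx_off_axis)  # the off-axis speed is much smaller

# A deliberately oversized command is cut down by the caps.
print(apply_safety_caps(2.0, -1.0, 0.1, max_vx=0.5, max_vy=0.3, max_vyaw=1.0))
```

Keeping the caps as a separate final step means they still hold if a controller upstream misbehaves, which is the point of the "Velocity caps" note.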