GoWelcome
An autonomous Unitree Go2 "backyard greeter". The robot dog wanders a yard, spots a person with on-board vision, walks up to them via visual servoing, and plays a friendly greeting (audio + gestures) -- all while steering clear of roads and vehicles.
GoWelcome is mapless, reactive, and NON-ROS2: a single Python
process running a fixed-rate state machine on top of a Unitree client library
(unitree_webrtc_connect by default, or the official unitree_sdk2py over DDS).
No SLAM, no global planner -- just perceive, decide, move.
Behaviour: the 5-state machine
Arbitration priority each tick (highest first): e-stop / pause → stale-perception halt → AVOID_DANGER (cars/obstacles) → GPS hold (no fix) → finish GREET → BOUNDARY (geofence) → APPROACH → WANDER + dog-play. So cars always win, the dog never chases a person out of the area, and idle play only happens when nothing else is going on.
    +----------+  person (conf>thr)  +-----------+  box fills   +---------+
    |  WANDER  | ------------------> | APPROACH  | -----------> |  GREET  |
    |  + play  | <------ lost ------ |           |  stop_ratio  | wav+wag |
    +----------+  cooldown <------------------------------------+---------+
      ^  ^  ^
      |  |  +--- back inside ---- +--------------+  near geofence edge (GPS)
      |  |                        |   BOUNDARY   |  home back toward centre
      |  +------ clear ---------- +--------------+
      |                                  ^
      |   road / vehicle near     any state can preempt to
      +-------------------------- +--------------+
                                  | AVOID_DANGER |  back up + pivot away
                                  +--------------+
- WANDER -- cruise forward (wander.forward_speed) with a gentle yaw sweep, scanning the yard, and occasionally performing an idle dog-play action (see below). It also keeps its distance from the road/cars: as pavement or a vehicle appears ahead it veers toward the clear side and slows down (the vision-only containment; see below). Transitions to APPROACH when a person is detected above perception.person_conf.
- APPROACH -- run the visual servo (below) to centre and close on the person. Falls back to WANDER if the person is missing for loop.person_lost_frames. Transitions to GREET when the person's box fills servo.stop_height_ratio of the frame height.
- GREET -- settle to a full stop (greet.settle_time), play the greeting wav, run greet.gestures (spaced by greet.gesture_gap), then return to WANDER and ignore people for greet.cooldown seconds.
- AVOID_DANGER -- preempts every other state (highest safety). Triggered by HSV road coverage over perception.road_trigger_coverage, or a vehicle box taller than perception.danger_min_height_ratio. Backs up (avoid.backup_speed) then pivots away from the road until clear for avoid.clear_frames frames.
- BOUNDARY -- optional GPS keep-in-area behaviour (off by default; see below). When enabled and the dog nears the geofence edge it homes back toward the centre, and won't leave the area even to chase a person.
A perception time-out (safety.perception_timeout) in any state stops the
robot until fresh frames arrive.
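The arbitration order listed at the top of this section reads naturally as a first-match-wins chain. Below is a minimal sketch of that idea, not the project's actual code: the `Inputs` snapshot, its field names, and the `Mode` enum are hypothetical, and it assumes the "GPS hold" rule only applies when the geofence is enabled.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Mode(Enum):
    HALT = auto()          # e-stop / pause / stale perception
    AVOID_DANGER = auto()
    GPS_HOLD = auto()
    GREET = auto()
    BOUNDARY = auto()
    APPROACH = auto()
    WANDER = auto()


@dataclass(frozen=True)
class Inputs:
    """Hypothetical per-tick snapshot; field names are illustrative only."""
    estop_or_paused: bool = False
    perception_stale: bool = False
    danger: bool = False               # road or vehicle too close
    geofence_enabled: bool = False
    gps_fix: bool = True
    greeting_in_progress: bool = False
    near_fence_edge: bool = False
    person_visible: bool = False


def arbitrate(i: Inputs) -> Mode:
    """Return the highest-priority behaviour whose condition holds."""
    if i.estop_or_paused or i.perception_stale:
        return Mode.HALT
    if i.danger:
        return Mode.AVOID_DANGER
    if i.geofence_enabled and not i.gps_fix:   # assumption: only with --geofence
        return Mode.GPS_HOLD
    if i.greeting_in_progress:
        return Mode.GREET                      # finish a greeting already started
    if i.geofence_enabled and i.near_fence_edge:
        return Mode.BOUNDARY
    if i.person_visible:
        return Mode.APPROACH
    return Mode.WANDER
```

Because the checks run top to bottom, a person in view never outranks a nearby car or the fence edge, which is the property the priority list is after.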
Stay-in-area: vision (default), GPS optional
The dog patrols an open area with no physical fence. By default it stays in the area with vision — no GPS needed:
- Soft road/car repulsion (WANDER): as pavement appears in the lower frame (HSV road mask), or a vehicle is detected (YOLO), the dog veers toward the clear side and slows down before reaching the hard reaction, actively keeping its distance from the road/cars (avoid.soft_road_coverage, road_repulsion_gain, car_repulsion_gain).
- Hard reaction (AVOID_DANGER): up close (road fills the centre past perception.road_trigger_coverage, or a near car), it backs up and pivots away. The firmware LiDAR hard-stop sits underneath all of it.
This relies on a usable visual border (a clear grass→pavement edge, decent
lighting) and is the recommended setup. Tune the gains in config.py
(AvoidConfig); set avoid.soft_avoid_enabled = False to keep only the hard
reaction.
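To make the soft repulsion concrete, here is a rough sketch of one way "veer toward the clear side and slow down" could be computed from road coverage in the left and right halves of the lower frame. The function, the left/right split, the linear ramp, and every default value are assumptions for illustration; only the names soft_road_coverage and road_repulsion_gain come from this README.

```python
def soft_road_repulsion(
    cov_left: float,
    cov_right: float,
    base_speed: float,
    soft_road_coverage: float = 0.15,    # assumed default
    road_trigger_coverage: float = 0.50,  # assumed default (hard reaction)
    road_repulsion_gain: float = 1.0,     # assumed default
) -> tuple[float, float]:
    """Return (vx, vyaw_bias) for WANDER given road coverage in [0, 1].

    Velocity convention as elsewhere in this README: vyaw > 0 turns left.
    """
    coverage = max(cov_left, cov_right)
    if coverage < soft_road_coverage:
        return base_speed, 0.0            # nothing to avoid yet

    # Ramp from 0 at the soft threshold to 1 at the hard trigger.
    span = road_trigger_coverage - soft_road_coverage
    urgency = min(1.0, (coverage - soft_road_coverage) / span)

    # Road mostly on the right -> steer left (positive yaw), and vice versa.
    direction = 1.0 if cov_right >= cov_left else -1.0
    vx = base_speed * (1.0 - urgency)     # slow down as the road fills the view
    return vx, direction * road_repulsion_gain * urgency
```

With these assumed numbers, pavement covering 30% of the right half slows the dog to a bit over half speed and biases it left, well before the hard AVOID_DANGER trigger.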
Optional: GPS geofence (--geofence)
For a hard metric boundary, add an external GPS receiver and enable the geofence
with --geofence (off by default). It adds the BOUNDARY state, homing the
dog back toward a centre point.
⚠️ The Go2 has no built-in GPS. This requires an external GPS receiver on the onboard computer (USB u-blox-class into the Jetson), read via gpsd or serial NMEA. Standard GPS is accurate to ±2–5 m, so keep geofence.radius_m well inside the real edge (use RTK GPS for tight bounds near a road).
How it works:
- On startup the first good fix becomes the fence centre (center_mode: onstart), or set explicit center_lat/lon (center_mode: fixed), or press "Set fence centre here" on the dashboard.
- The dog roams freely within geofence.radius_m. Within geofence.margin_m of the edge it enters BOUNDARY and homes back toward the centre, steering with the GPS course-over-ground (no compass needed), until release_m back inside (hysteresis).
- Fail-safe: if GPS is lost/stale (gps.stale_after) the dog stops (geofence.no_fix_behavior: stop) rather than roam blind near the edge.
GPS source (gps.source): auto (probe gpsd, else serial), gpsd, serial
(NMEA on gps.serial_port), or mock (a simulated receiver that integrates the
commanded motion — for testing). --gps/--radius imply --geofence.
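The enter/release hysteresis from the "How it works" list can be sketched with a great-circle distance and a two-threshold latch. This is an illustration only: the class and function names are hypothetical, and it assumes both margin_m and release_m are measured inward from the fence edge with release_m larger than margin_m.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points (degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class FenceHysteresis:
    """Latch BOUNDARY on near the edge, off only once well back inside."""

    def __init__(self, radius_m: float, margin_m: float, release_m: float):
        if not 0 <= margin_m < release_m <= radius_m:
            raise ValueError("expected 0 <= margin_m < release_m <= radius_m")
        self.radius_m = radius_m
        self.margin_m = margin_m
        self.release_m = release_m
        self.in_boundary = False

    def update(self, dist_from_centre_m: float) -> bool:
        if self.in_boundary:
            # Stay in BOUNDARY until the dog is release_m inside the edge.
            if dist_from_centre_m <= self.radius_m - self.release_m:
                self.in_boundary = False
        elif dist_from_centre_m >= self.radius_m - self.margin_m:
            self.in_boundary = True
        return self.in_boundary
```

The gap between the two thresholds is what stops the state from flickering when GPS noise of a few metres moves the fix back and forth across a single line.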
Act like a dog (idle play)
While WANDERing, an idle scheduler occasionally performs a random dog action
(play.actions: stretch, wiggle, scrape/dig, dance1, wallow, ...),
pausing briefly to do the trick. Intensity is runtime-settable (default
moderate): calm (~75 s between actions), moderate (~30 s), playful
(~15 s) — change it any time from the dashboard or with --play. (The greeting
itself adds Hello/Heart "wags".) Play never overrides safety, the geofence, or a
greeting.
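One plausible shape for the idle scheduler is a timer that only runs while the dog is idle and draws a jittered gap around the mode's typical interval. The sketch below is hypothetical: the class, the uniform ±50% jitter, and the "reset when not idle" rule are assumptions; only the three mode names and their approximate intervals come from the text above.

```python
import random
from typing import Optional, Sequence

# Approximate seconds between actions per mode, as quoted above.
MEAN_GAP_S = {"calm": 75.0, "moderate": 30.0, "playful": 15.0}


class IdlePlayScheduler:
    def __init__(self, actions: Sequence[str], mode: str = "moderate",
                 rng: Optional[random.Random] = None):
        self.actions = list(actions)
        self.rng = rng or random.Random()
        self._next_at: Optional[float] = None
        self.set_mode(mode)

    def set_mode(self, mode: str) -> None:
        if mode not in MEAN_GAP_S:
            raise ValueError(f"unknown play mode: {mode!r}")
        self.mode = mode
        self._next_at = None  # re-draw the gap under the new mode

    def _draw_gap(self) -> float:
        mean = MEAN_GAP_S[self.mode]
        return self.rng.uniform(0.5 * mean, 1.5 * mean)  # assumed jitter

    def poll(self, now: float, idle: bool) -> Optional[str]:
        """Return an action name when one is due, else None."""
        if not idle:
            self._next_at = None  # safety, geofence, or greeting took over
            return None
        if self._next_at is None:
            self._next_at = now + self._draw_gap()
            return None
        if now < self._next_at:
            return None
        self._next_at = now + self._draw_gap()
        return self.rng.choice(self.actions)
```

Calling `poll(now, idle=state_is_plain_wander)` once per tick keeps play strictly subordinate: any higher-priority behaviour simply passes `idle=False`.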
Architecture
         camera frames
               |
               v
    +---------------------+       latest()       +-------------------------+
    |  PerceptionThread   | -------------------> |  GoWelcomeStateMachine  |
    | (background thread) |   PerceptionResult   |  step(dt) -> State      |
    |  YOLO + HSV road    |                      |  WANDER/APPROACH/GREET/ |
    +---------------------+                      |  AVOID_DANGER           |
               ^                                 +-------------------------+
               | get_frame()                                  |
               |                                              | drive() / stop() /
               |                                              | gesture() / play_greeting()
               |                                              v
               |                                +--------------------------+
               +------------------------------- |      RobotInterface      |
                                                |   (abstract contract)    |
                                                +--------------------------+
                                                  /          |          \
            +------------------+  +------------------+  +------------------+
            | Go2WebRTCRobot   |  | Go2Robot         |  | MockRobot        |
            | unitree_webrtc_  |  | unitree_sdk2py   |  | webcam / video,  |
            | connect (DEFAULT)|  | over CycloneDDS  |  | no hardware, for |
            | wifi, AIR/PRO/EDU|  | (--transport dds)|  | off-robot dev/CI |
            | + AudioHub audio |  | wired / EDU      |  |                  |
            +------------------+  +------------------+  +------------------+
            (async bridge: WebRTC event loop on its own thread)
- config.py -- every tunable lives here in grouped dataclasses (GoWelcomeConfig). CLI flags in main.py override a handful at startup.
- gowelcome/types.py -- frozen data contracts: State, Detection, RoadInfo, PerceptionResult. The shared language between layers.
- gowelcome/robot/interface.py -- the RobotInterface and AudioBackend ABCs plus the GESTURES vocabulary. The behaviour layer talks only to these.
- PerceptionThread -- grabs frames from the robot, runs YOLO + the HSV road mask off the control loop, and publishes the newest PerceptionResult via latest().
- GoWelcomeStateMachine -- the reactive brain; step(dt) reads the latest perception and issues robot commands, returning the current State.
Velocity convention everywhere (matches SportClient.Move):
vx forward+, vy left+, vyaw CCW/left+ (rad/s).
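The PerceptionThread pattern described above (slow work on a background thread, the control loop reading only the newest result) can be sketched generically as follows. This is not the project's class: `LatestValueWorker`, `Result`, and `is_stale` are hypothetical names, and the `produce` callable stands in for "grab a frame and run YOLO + the road mask".

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Result:
    stamp: float      # time.monotonic() when the result was produced
    payload: object   # stand-in for a PerceptionResult


class LatestValueWorker:
    """Run `produce` in a loop on a daemon thread; keep only the newest result."""

    def __init__(self, produce: Callable[[], object]):
        self._produce = produce
        self._lock = threading.Lock()
        self._latest: Optional[Result] = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join(timeout=1.0)

    def latest(self) -> Optional[Result]:
        with self._lock:
            return self._latest

    def is_stale(self, timeout_s: float) -> bool:
        """True if nothing has been published within timeout_s seconds."""
        result = self.latest()
        return result is None or (time.monotonic() - result.stamp) > timeout_s

    def _run(self) -> None:
        while not self._stop.is_set():
            payload = self._produce()          # may be slow (inference)
            result = Result(time.monotonic(), payload)
            with self._lock:
                self._latest = result          # overwrite; no queue to back up
```

Overwriting a single slot instead of queueing means the control loop never acts on an old frame just because inference fell behind, and the timestamp gives the stale-perception check something to compare against.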
Transports & greeting audio
GoWelcome talks to the robot through one of two transports (--transport):
| Transport | Library | Works on | Greeting audio | When |
|---|---|---|---|---|
| webrtc (default) | unitree_webrtc_connect (app protocol) | Go2 AIR/PRO/EDU over wifi | ✅ from the dog's speaker (AudioHub) | default; no jailbreak |
| dds | official unitree_sdk2py (CycloneDDS) | Go2 EDU, wired | ❌ none on Go2 → host speaker | --transport dds |
Greeting from the dog (WebRTC default). On startup GoWelcome uploads
assets/greeting.wav to the robot via AudioHub and plays it by uuid on each
greeting — sound comes from the Go2's own speaker. Pick the method with
--audio-method:
- audiohub (default) -- upload once, play_by_uuid per greeting (persistent, low latency).
- stream -- stream the file live each greeting via an aiortc MediaPlayer.
DDS transport audio. The official SDK has no Go2 audio path (its
AudioClient is G1-only). On --transport dds, greeting audio falls back to a
pluggable host backend (--audio host|go2|null) that plays on the machine
running GoWelcome. The field-proven pattern (from the team's G1 Sanad stack)
is a USB/Bluetooth speaker on the onboard computer, pinned by its PulseAudio sink:
pactl list short sinks # find your speaker's sink
python main.py --transport dds --interface eth0 \
--audio-device alsa_output.usb-Anker_PowerConf_A3321-DEV-SN1-01.analog-stereo
Drop your clip at assets/greeting.wav — see assets/greeting.README.md. (DDS
host playback wants 16 kHz mono 16-bit PCM; AudioHub accepts any wav.)
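Since DDS host playback expects 16 kHz mono 16-bit PCM, it can help to check a clip before deploying it. The helper below is a small stdlib-only sketch (the function name is made up, and it is not part of GoWelcome); it uses Python's `wave` module to report any mismatch.

```python
import wave


def wav_format_problems(path: str) -> list[str]:
    """Return a list of mismatches against 16 kHz / mono / 16-bit PCM."""
    problems = []
    with wave.open(path, "rb") as w:
        if w.getframerate() != 16000:
            problems.append(f"sample rate is {w.getframerate()} Hz, want 16000")
        if w.getnchannels() != 1:
            problems.append(f"{w.getnchannels()} channels, want 1 (mono)")
        if w.getsampwidth() != 2:
            problems.append(f"{8 * w.getsampwidth()}-bit samples, want 16-bit")
    return problems


if __name__ == "__main__":
    import sys

    for issue in wav_format_problems(sys.argv[1]):
        print("greeting.wav:", issue)
```

An empty list means the clip already matches; otherwise resample or downmix it with your audio tool of choice before copying it to assets/greeting.wav.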
Install
Off-robot (mock / development / tests)
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt # numpy, opencv-python, ultralytics, simpleaudio
You do not need any robot library for --mock.
On the real robot — WebRTC (default)
sudo apt install -y portaudio19-dev
pip install -r requirements.txt # includes unitree_webrtc_connect
For Go2 firmware ≥ 1.1.15 you also need the per-device AES-128 key (once):
fetch it with the connector's examples/fetch_aes_key.py, then pass --aes-key.
On the real robot — DDS (alternative, EDU/wired)
Install the official Unitree SDK on the robot's host (not a plain pip install):
# https://github.com/unitreerobotics/unitree_sdk2_python
git clone https://github.com/unitreerobotics/unitree_sdk2_python
cd unitree_sdk2_python && pip install -e . # pulls in cyclonedds
pyserial is not required.
Run
# Off-robot, webcam index 0, silent audio:
python main.py --mock --audio null --source 0
# or: ./scripts/run_mock.sh
# Off-robot from a video file with the debug window:
python main.py --mock --source backyard.mp4
# Real Go2 over WebRTC (default) — greeting plays from the dog's speaker:
python main.py --robot-ip 192.168.1.50 # add --aes-key <hex> on fw >= 1.1.15
# or: ./welcome.sh --robot-ip 192.168.1.50 # TEST SUSPENDED FIRST
# Real Go2 over DDS (EDU/wired):
python main.py --transport dds --interface eth0
# Useful flags:
python main.py --robot-ip 192.168.1.50 --device cuda --headless --web
python main.py --mock --dry-run # perceive + decide, never move
| Flag | Config field set |
|---|---|
| --mock | mock |
| --transport | transport (webrtc/dds) |
| --robot-ip | webrtc.ip (localsta) |
| --serial | webrtc.serial_number |
| --aes-key | webrtc.aes_128_key (fw ≥ 1.1.15) |
| --connection | webrtc.connection_method |
| --audio-method | webrtc.audio_method (audiohub/stream) |
| --interface | network.interface (dds) |
| --device | perception.device |
| --model | perception.model_path |
| --source | camera.mock_source |
| --wav | greet.wav_path |
| --audio | audio.backend (host/go2/null) |
| --audio-device | audio.output_device (PulseAudio sink) |
| --no-avoidance | safety.use_lidar_avoidance = False |
| --headless | headless (no cv2 window) |
| --dry-run | dry_run (decide but never move) |
| --conf | perception.person_conf |
| --web | web.enabled (control dashboard) |
| --web-port | web.port (default 8080) |
| --geofence | enable GPS geofence (default vision-only) |
| --gps | gps.source (auto/gpsd/serial/mock) |
| --radius | geofence.radius_m (metres) |
| --play | play.mode (calm/moderate/playful) |
A live cv2 window (unless --headless) draws green person boxes,
red danger boxes, the road-coverage percentage, and the current state.
Press ESC in the window (or Ctrl-C in the terminal) to quit.
Control dashboard (HTTP)
Add --web and open http://<dog-ip>:8080/ from any laptop/phone on the
network — ideal headless on the dog. The page shows the live camera (with the
detection/state overlay) plus controls: change play mode
(calm/moderate/playful), pause/resume, E-STOP, and "set fence centre
here", with a live status panel (state, GPS fix, in/out of fence).
./welcome.sh --robot-ip 192.168.1.50 --headless --web # dog, browse to its IP:8080
./welcome.sh --mock --source 0 --web --web-port 9000 # off-robot demo
Endpoints: / (dashboard), /stream.mjpg (raw MJPEG), /snapshot.jpg,
/status.json (status), POST /control (commands), /healthz. Stdlib-only
(http.server + cv2.imencode) — no extra dependency, multiple viewers.
⚠️ Security: the viewer binds 0.0.0.0 with no authentication, so anyone on the same network can watch the camera at http://<dog-ip>:<port>/ and reach the POST /control endpoint. That's intended for a trusted home LAN. On an untrusted network, set web.host = "127.0.0.1" in config.py (view only via an SSH tunnel), or leave --web off.
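For a feel of how a stdlib-only dashboard backend can serve the two simplest endpoints, here is a minimal sketch built on `http.server`. It is an illustration, not GoWelcome's server: the handler factory, the `get_status` callback, and the JSON fields are hypothetical, and it binds 127.0.0.1 by default in line with the security note above.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable


def make_handler(get_status: Callable[[], dict]):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/healthz":
                self._send(200, "text/plain", b"ok")
            elif self.path == "/status.json":
                body = json.dumps(get_status()).encode("utf-8")
                self._send(200, "application/json", body)
            else:
                self._send(404, "text/plain", b"not found")

        def _send(self, code: int, ctype: str, body: bytes) -> None:
            self.send_response(code)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep the control loop's console quiet

    return Handler


def serve_in_background(get_status: Callable[[], dict],
                        host: str = "127.0.0.1", port: int = 0):
    """Start the server on a daemon thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), make_handler(get_status))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

`ThreadingHTTPServer` handles each request on its own thread, which is what lets several viewers connect at once without any dependency beyond the standard library.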
Tuning
Everything is in config.py, grouped by subsystem. Common knobs:
- Detection -- perception.person_conf (default 0.80), perception.device, perception.model_path, perception.danger_classes, perception.danger_min_height_ratio.
- Road mask -- perception.road_hsv_lower/upper, road_crop_frac, road_trigger_coverage.
- Approach feel -- servo.kp_yaw, servo.max_yaw_rate, servo.yaw_deadband, servo.kp_forward, servo.max_forward, servo.stop_height_ratio, servo.yaw_sign (flip if your camera mounting inverts left/right).
- Wander -- wander.forward_speed, wander.yaw_sweep_rate/period.
- Greeting -- greet.gestures, greet.gesture_gap, greet.cooldown.
- Safety caps -- safety.max_vx/max_vy/max_vyaw, safety.perception_timeout, safety.command_timeout.
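As a rough picture of "grouped dataclasses plus a few CLI overrides", the sketch below shows the general shape. It is not a copy of config.py: only the defaults this README states (person_conf 0.80, stop_height_ratio 0.50, yaw_sign -1) are taken from the text; every other value, and the `apply_overrides` helper, is an assumption for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PerceptionConfig:
    person_conf: float = 0.80        # default stated in this README
    device: str = "cpu"              # assumed default


@dataclass
class ServoConfig:
    stop_height_ratio: float = 0.50  # default stated in this README
    yaw_sign: int = -1               # default stated in this README
    max_forward: float = 0.5         # assumed value


@dataclass
class SafetyConfig:
    max_vx: float = 0.8              # assumed value
    perception_timeout: float = 0.5  # assumed value (seconds)


@dataclass
class GoWelcomeConfig:
    perception: PerceptionConfig = field(default_factory=PerceptionConfig)
    servo: ServoConfig = field(default_factory=ServoConfig)
    safety: SafetyConfig = field(default_factory=SafetyConfig)


def apply_overrides(cfg: GoWelcomeConfig, overrides: dict) -> GoWelcomeConfig:
    """Apply dotted-path overrides such as {"perception.person_conf": 0.6}."""
    for dotted, value in overrides.items():
        *parents, leaf = dotted.split(".")
        target = cfg
        for name in parents:
            target = getattr(target, name)
        if not hasattr(target, leaf):
            raise AttributeError(f"unknown config field: {dotted}")
        setattr(target, leaf, value)
    return cfg
```

Rejecting unknown dotted paths catches typos like `servo.stop_heigth_ratio` at startup instead of silently leaving the default in place.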
Visual-servoing math
The servo turns a single person bounding box into a (vx, vyaw) command each
tick. Let frame_w, frame_h be the frame size and the box have centre cx
and height h.
Horizontal (yaw) error -- normalised to [-1, 1], + = right of centre:
err = (cx - frame_w/2) / (frame_w/2) # Detection.horizontal_offset
Yaw command -- a P(ID) controller on err, with a deadband and clamp:
vyaw = yaw_sign * PID(err) clamped to +/- servo.max_yaw_rate
|err| < servo.yaw_deadband -> vyaw = 0
With the default yaw_sign = -1: a target to the right (err > 0) yields
vyaw < 0 (a clockwise/right turn) -- the robot turns toward the person. Flip
yaw_sign if your mounting inverts this.
Distance proxy -- how much of the frame height the box fills:
height_ratio = h / frame_h # Detection.height_ratio
arrived = height_ratio >= servo.stop_height_ratio (default 0.50)
Forward command -- proportional to remaining distance, throttled when the heading error is large so the dog squares up before charging, and zeroed on arrival:
vx = kp_forward * (stop_height_ratio - height_ratio)
vx *= exp(-forward_heading_falloff * |err|) # slow down off-axis
vx = clamp(vx, 0, servo.max_forward) # never reverse to approach
arrived -> vx = 0
All commands then pass the global safety caps (safety.max_*) before reaching
drive().
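Putting the formulas above together, a single servo step might look like the sketch below. It uses a proportional-only yaw term (no I or D), and all gain and limit values are placeholder assumptions; only stop_height_ratio = 0.50 and yaw_sign = -1 are the defaults quoted in this section. The function name is hypothetical.

```python
import math


def clamp(value: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, value))


def servo_step(cx: float, h: float, frame_w: float, frame_h: float, *,
               kp_yaw: float = 1.0, yaw_sign: int = -1,
               yaw_deadband: float = 0.05, max_yaw_rate: float = 1.0,
               kp_forward: float = 1.0, max_forward: float = 0.5,
               stop_height_ratio: float = 0.50,
               forward_heading_falloff: float = 2.0):
    """Return (vx, vyaw, arrived) for one person box."""
    err = (cx - frame_w / 2) / (frame_w / 2)   # [-1, 1], + = right of centre
    height_ratio = h / frame_h                 # distance proxy

    # Yaw: P term with deadband and clamp.
    if abs(err) < yaw_deadband:
        vyaw = 0.0
    else:
        vyaw = clamp(yaw_sign * kp_yaw * err, -max_yaw_rate, max_yaw_rate)

    # Forward: proportional to remaining "distance", throttled off-axis.
    arrived = height_ratio >= stop_height_ratio
    if arrived:
        vx = 0.0
    else:
        vx = kp_forward * (stop_height_ratio - height_ratio)
        vx *= math.exp(-forward_heading_falloff * abs(err))
        vx = clamp(vx, 0.0, max_forward)       # never reverse to approach
    return vx, vyaw, arrived
```

For a 640x480 frame with the person centred at cx = 480 and a 120 px tall box, err is 0.5 and height_ratio is 0.25, so with these placeholder gains the dog turns right (vyaw = -0.5) while creeping forward at 0.25 * exp(-1), roughly 0.09 m/s.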
Safety notes
- E-stop = Ctrl-C. SIGINT/SIGTERM set a stop flag; the loop then stops the robot and runs a clean shutdown() (zero velocity, release avoidance, close camera/audio). ESC in the debug window does the same.
- Test suspended first. Always run a new build with the Go2 hung off the ground and a hand on Ctrl-C before letting it walk.
- LiDAR firmware hard-stop. On real hardware, drive() routes through the Go2 ObstaclesAvoidClient when safety.use_lidar_avoidance is on (default). This is a firmware-level last line of defence on top of GoWelcome's own AVOID_DANGER logic. --no-avoidance disables it (use with care).
- Velocity caps. Every command is clamped by safety.max_vx/vy/vyaw after the controllers run, so a controller bug can't command an unsafe speed.
- Stale perception. If no fresh frame arrives within safety.perception_timeout, the robot stops.
- --dry-run runs the full perception + decision pipeline but never sends a non-zero velocity -- handy for validating behaviour safely.
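The velocity-cap and dry-run rules above amount to one small gate that every command passes through last. A minimal sketch, assuming a hypothetical function name and treating non-finite inputs as "stop" (that last rule is an extra precaution, not something this README states):

```python
import math


def apply_safety_caps(vx: float, vy: float, vyaw: float, *,
                      max_vx: float, max_vy: float, max_vyaw: float,
                      dry_run: bool = False) -> tuple[float, float, float]:
    """Clamp a velocity command symmetrically; zero it in dry-run mode."""
    if dry_run:
        return (0.0, 0.0, 0.0)           # decide, but never move
    if not all(math.isfinite(v) for v in (vx, vy, vyaw)):
        return (0.0, 0.0, 0.0)           # assumed precaution: NaN/inf -> stop

    def cap(value: float, limit: float) -> float:
        return max(-limit, min(limit, value))

    return (cap(vx, max_vx), cap(vy, max_vy), cap(vyaw, max_vyaw))
```

Keeping this gate separate from the controllers is the point: whatever the servo or wander logic computes, the value that reaches drive() is bounded.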