# Saqr — PPE Safety Detection on Unitree G1

Real-time PPE compliance (helmet, vest, boots, gloves, goggles) using YOLO11n. Runs on a Unitree G1 humanoid with an Intel RealSense D435i. On UNSAFE the robot announces the missing gear via the onboard TTS and plays a recorded arm-motion trajectory from `assets/motions/`.
## Layout
```
Saqr/
├── core/                     # detection + tracking + events (shared by CLI/GUI/bridge)
├── apps/                     # CLI modules (saqr_cli, detect_cli, train_cli, manager_cli, view_stream)
├── gui/                      # PySide6 desktop GUI (dev-machine only)
├── robot/                    # G1 bridge + DDS controller
├── utils/                    # logger, config loader
├── scripts/
│   ├── start_saqr.sh         # the single entry point
│   ├── saqr-bridge.service   # systemd unit (wraps start_saqr.sh)
│   └── deploy.sh             # push code dev machine → robot
├── config/                   # logging.json, core_config.json, robot_config.json
├── assets/
│   ├── audio/                # pre-recorded WAV clips (16 kHz mono int16, per category)
│   └── motions/              # teach-and-replay arm trajectories (*.jsonl)
├── data/                     # dataset/, models/ (gitignored)
├── runtime/                  # captures/, runs/ (gitignored)
├── logs/                     # per-module .log files (gitignored)
├── docs/                     # DEPLOY.md, start.md, use_case_catalogue.pdf
├── pyproject.toml
└── README.md
```
The project root is auto-detected via `core/paths.py::PROJECT_ROOT` — drop the Saqr/ folder anywhere on disk and the code finds itself. Override with `SAQR_ROOT=/custom/path` if needed.
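The detection logic can be sketched roughly like this (an approximation only — the real `core/paths.py` may use a different marker file or fallback):

```python
import os
from pathlib import Path

def find_project_root() -> Path:
    """Resolve the Saqr project root.

    Precedence (assumed): SAQR_ROOT env var, else walk up from this
    file looking for pyproject.toml, else fall back to the CWD.
    """
    override = os.environ.get("SAQR_ROOT")
    if override:
        return Path(override)
    here = Path(__file__).resolve()
    for parent in here.parents:
        if (parent / "pyproject.toml").exists():
            return parent
    return Path.cwd()  # last resort: assume we run from the repo

PROJECT_ROOT = find_project_root()
```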
## Run
The project only runs through `scripts/start_saqr.sh` (directly or under the `saqr-bridge` systemd unit):
```sh
# On the robot:
sudo systemctl start saqr-bridge   # production
# or
~/Saqr/scripts/start_saqr.sh       # foreground / debug
```
Then on the wireless remote:

- `R2 + X` → start detection
- `R2 + Y` → stop detection
See `docs/DEPLOY.md` for first-time deploy, `docs/start.md` for the systemd workflow, and `docs/ARCHITECTURE.md` for the component map.
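The start/stop combos only fire on the *rising edge* of the key state, so holding the buttons doesn't retrigger. A generic sketch of that edge detection (hypothetical helper — the real bridge decodes the remote's key bitmask out of `rt/lowstate`, which is not reproduced here):

```python
class ComboEdge:
    """Fire once on the rising edge of a two-button combo (e.g. R2+X).

    Pressed-state booleans are fed in per LowState message; the
    combo fires only on the transition from not-held to held.
    """
    def __init__(self) -> None:
        self._was_down = False

    def update(self, r2_down: bool, x_down: bool) -> bool:
        now_down = r2_down and x_down
        fired = now_down and not self._was_down  # rising edge only
        self._was_down = now_down
        return fired
```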
## Usage
One-time setup on the robot (already done, documented for rebuilds):
```sh
# 1. Clone into ~/Saqr
git clone <repo> ~/Saqr && cd ~/Saqr

# 2. Create the conda env and install in editable mode
conda create -n saqr python=3.8 -y
conda activate saqr
pip install -e .
pip install "cyclonedds==0.10.2"   # exact version required by unitree_sdk2py

# 3. Drop the model at data/models/saqr_best.pt
#    (trained on the dev machine; copy via scp or scripts/deploy.sh)

# 4. Install the systemd unit for auto-start on boot
sudo cp scripts/saqr-bridge.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable saqr-bridge
```
Normal day-to-day operation on the robot:
```sh
# Foreground (see logs live, Ctrl-C to stop)
CONDA_ENV=saqr ~/Saqr/scripts/start_saqr.sh

# Background via systemd (survives reboots)
sudo systemctl start saqr-bridge
journalctl -u saqr-bridge -f   # follow logs
sudo systemctl stop saqr-bridge
```
Controlling detection from the wireless remote:
- Start the bridge (above). It announces "Saqr is running. Press R2 plus X to start."
- Press `R2 + X` → the bridge spawns `apps.saqr_cli` as a subprocess, opens the RealSense camera, loads the model, and begins publishing events on stdout.
- Walk into frame. On UNSAFE (missing helmet or vest) the robot speaks what's missing and plays the recorded arm gesture.
- Press `R2 + Y` → the subprocess is SIGINT'd, the robot announces "Saqr deactivated.", and the bridge goes back to idle.
- You can start/stop repeatedly without restarting the bridge.
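The spawn/SIGINT lifecycle can be sketched like this (`DetectorProcess` is a hypothetical name; only the behaviour — subprocess spawn, SIGINT on stop, restartable — comes from the description above):

```python
import signal
import subprocess
import sys

class DetectorProcess:
    """Sketch of the bridge's start/stop handling of apps.saqr_cli.

    Assumes the CLI prints one event per line on stdout and shuts
    down cleanly on SIGINT, as the README describes.
    """
    def __init__(self, module: str = "apps.saqr_cli") -> None:
        self.module = module
        self.proc = None

    def start(self) -> None:
        # Idempotent: only spawn if nothing is running.
        if self.proc is None or self.proc.poll() is not None:
            self.proc = subprocess.Popen(
                [sys.executable, "-m", self.module],
                stdout=subprocess.PIPE, text=True)

    def stop(self, timeout: float = 5.0) -> None:
        if self.proc and self.proc.poll() is None:
            self.proc.send_signal(signal.SIGINT)  # graceful shutdown
            try:
                self.proc.wait(timeout)
            except subprocess.TimeoutExpired:
                self.proc.kill()  # last resort
        self.proc = None
```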
Dev-machine workflow (push code + weights to the robot):
```sh
# From the dev machine, from the Saqr/ dir
scripts/deploy.sh   # rsync + pip install -e . in the robot's conda env

# Or a single file:
scp robot/robot_controller.py unitree@192.168.123.164:~/Saqr/robot/
ssh unitree@192.168.123.164 'sudo systemctl restart saqr-bridge'
```
Streaming view (optional, for debugging from a browser): with the subprocess running, open `http://<robot-ip>:8080` — an MJPEG stream of the annotated frames. Port and interface are set in `config/core_config.json` under `stream`.
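For reference, the MJPEG mechanics behind such a stream are simple: a `multipart/x-mixed-replace` response whose parts are JPEG frames. A minimal stdlib sketch (the actual server lives in `core/` and its details will differ; names here are illustrative):

```python
from http.server import BaseHTTPRequestHandler

BOUNDARY = "frame"

def mjpeg_part(jpeg_bytes: bytes) -> bytes:
    """Wrap one JPEG-encoded frame as a multipart chunk."""
    head = (f"--{BOUNDARY}\r\n"
            "Content-Type: image/jpeg\r\n"
            f"Content-Length: {len(jpeg_bytes)}\r\n\r\n").encode()
    return head + jpeg_bytes + b"\r\n"

class StreamHandler(BaseHTTPRequestHandler):
    frames = iter(())  # injected: an iterator of JPEG-encoded frames

    def do_GET(self):
        # The browser keeps the connection open and repaints each part.
        self.send_response(200)
        self.send_header("Content-Type",
                         f"multipart/x-mixed-replace; boundary={BOUNDARY}")
        self.end_headers()
        for jpeg in self.frames:
            self.wfile.write(mjpeg_part(jpeg))
```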
Log locations:

- Bridge stdout/stderr: `journalctl -u saqr-bridge` (systemd) or the terminal.
- Saqr CLI: streams to the bridge and appears inline in the same log.
- Per-module files: `logs/*.log` (rotate on size); tail with `tail -f logs/saqr.log`.
- Event CSVs: `runtime/runs/<timestamp>/events.csv`.
Troubleshooting quick-refs:

| Symptom | Likely cause | Fix |
|---|---|---|
| "Camera not connected" announcement | RealSense not on USB | Replug, `lsusb \| grep Intel`, rerun |
| R2+X does nothing | `LowState` not subscribed | Check `BRIDGE … Subscribed to rt/lowstate` in the log; network down? |
| `rc=3104` warnings | Firmware audio/arm contention | Expected — see Known limitations |
| Motion files missing | `assets/motions/*.jsonl` absent on robot | `scripts/deploy.sh` or `scp` the directory |
| `ModuleNotFoundError: cyclonedds` | Wrong version | `pip install "cyclonedds==0.10.2"` in the saqr env |
## Deploy
From the dev machine:

```sh
scripts/deploy.sh          # rsync + pip install -e . in the robot's conda env
scripts/deploy.sh --ip …   # custom robot IP
```
## Configure
All tunable values live in JSON — no code edits needed:

- `config/core_config.json` — detection, tracking, camera, stream, training.
- `config/robot_config.json` — bridge, TTS, phrases, arm actions, deploy, start_saqr defaults.
- `config/logging.json` — log levels per category.
Precedence: env var > config JSON > code fallback. Most defaults can be overridden via env vars without editing the files (`CONDA_ENV`, `SAQR_SOURCE`, `STREAM_PORT`, `DDS_IFACE`, `ROBOT_IP`, …).
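That precedence chain can be sketched as a single lookup helper (hypothetical — the real loader lives in `utils/` and may differ in shape):

```python
import json
import os
from pathlib import Path

def get_setting(env_var, json_path, keys, default):
    """Resolve one setting with the documented precedence:
    env var > config JSON > code fallback.

    `keys` walks nested JSON, e.g. ("stream", "port").
    """
    if env_var in os.environ:
        return os.environ[env_var]
    try:
        node = json.loads(Path(json_path).read_text())
        for k in keys:
            node = node[k]
        return node
    except (OSError, KeyError, TypeError, json.JSONDecodeError):
        return default  # file missing/malformed or key absent

# e.g. get_setting("STREAM_PORT", "config/core_config.json",
#                  ("stream", "port"), 8080)
```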
After editing any JSON:

```sh
sudo systemctl restart saqr-bridge
```
## Data & Models
`data/` and `runtime/` are gitignored (too large). Download separately:

- Dataset: testcasque/ppe-detection-qlq3d → YOLOv11 format → unzip to `data/dataset/`.
- Base weights: Ultralytics releases → `yolo11n.pt` into `data/models/`.
- Saqr fine-tuned weights (`saqr_best.pt`): produced by training — see `docs/DEPLOY.md`.
## Training (dev machine, off-path)
Training is done on a workstation, not on the robot. It's a one-off, not part of the normal run flow:

```sh
python -m apps.train_cli --epochs 100 --batch 16
# best weights land at data/models/saqr_best.pt; deploy with scripts/deploy.sh
```
## Audio pipeline
On UNSAFE, `robot/robot_controller.py` queues one announcement to a worker thread, which routes it per `tts.mode`:

- `tts_only` → firmware `TtsMaker(text, speaker_id=2)` (current default).
- `recorded_only` → WAV lookup in `assets/audio/<category>/<key>.wav`; no fallback to TtsMaker.
- `recorded_or_tts` → WAV if available, else TtsMaker.
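The routing table above reduces to a small pure function — a sketch only (the real controller also handles queuing and retries, and its return shape is unknown):

```python
from pathlib import Path

def route_announcement(mode, category, key, audio_root="assets/audio"):
    """Pick the playback backend for one announcement per tts.mode.

    Returns ("wav", path) or ("tts", None).
    """
    wav = Path(audio_root) / category / f"{key}.wav"
    if mode == "tts_only":
        return ("tts", None)
    if mode == "recorded_only":
        return ("wav", wav)  # no TtsMaker fallback, even if missing
    if mode == "recorded_or_tts":
        return ("wav", wav) if wav.exists() else ("tts", None)
    raise ValueError(f"unknown tts.mode: {mode}")
```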
Each `TtsMaker` call is preceded by `AUDIO_STOP_PLAY` + a 300 ms settle (the reset pattern from `G1_Lootah/Audio_Recorder/voice_note.txt`) and retried once on `rc != 0`. The audio RPC timeout is capped at 3 s so a stuck firmware call fails fast instead of blocking the worker for the bridge-wide 10 s. The Python SDK's `TtsMaker` has a broken `tts_index` counter that never increments — the controller bypasses it and calls the underlying RPC with a real index.
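The timeout-plus-retry hardening can be approximated like this (illustrative only — it assumes the RPC returns an integer `rc` with 0 meaning success, as the Unitree audio client does, and it is not the project's actual wrapper):

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as RpcTimeout

def call_with_retry(rpc, *args, timeout=3.0, retries=1):
    """Cap a firmware RPC at `timeout` seconds and retry on a
    nonzero rc, so a stuck call fails fast instead of blocking.
    """
    rc = -1
    for _ in range(retries + 1):
        pool = ThreadPoolExecutor(max_workers=1)
        try:
            rc = pool.submit(rpc, *args).result(timeout=timeout)
        except RpcTimeout:
            rc = -1  # stuck firmware call: give up on this attempt
        finally:
            pool.shutdown(wait=False)  # abandon a stuck worker thread
        if rc == 0:
            return 0
    return rc
```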
## Known limitations
G1 firmware serialises the audio channel and `rt/arm_sdk`. When the low-level arm replayer is publishing joint commands at 60 Hz, every audio RPC (`TtsMaker`, `PlayStream`, even `AUDIO_STOP_PLAY`) blocks until the RPC timeout and returns `rc=3104`. Measured audio dropout during parallel arm+audio runs: ~50 % of alerts. This is a firmware-level constraint on the voice service, not anything the client can tune its way out of — verified by testing retries, `SetVolume` warmups, and `AUDIO_STOP_PLAY` resets, all of which hang under the same condition.
The reliable path is serial: `speak()` → `wait_for_audio_done()` → `reject()` (swap `bridge.py`'s `handle_line` if that's the behaviour you want). Parallel is kept behind `audio_lead_s` for latency-sensitive demos where occasional dropped audio is acceptable.
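The two delivery strategies can be sketched side by side — callables are injected and the names mirror the README, not a real API:

```python
import threading
import time

def deliver_alert(speak, play_motion, wait_for_audio_done,
                  audio_lead_s=None):
    """Serial vs parallel alert delivery.

    audio_lead_s=None  -> serial: speak, wait, then move (reliable).
    audio_lead_s=0.3   -> parallel: start audio, give it a head start,
                          then move (lower latency; audio may drop
                          under firmware contention).
    """
    if audio_lead_s is None:
        speak()
        wait_for_audio_done()
        play_motion()
    else:
        t = threading.Thread(target=speak, daemon=True)
        t.start()
        time.sleep(audio_lead_s)  # let the audio RPC get in first
        play_motion()
        t.join()
```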
## Current status (2026-04-21)
Snapshot of what is and isn't working, so whoever picks this up next doesn't have to rediscover it.
Working:
- End-to-end flow: RealSense → YOLO11n → tracker → compliance → UNSAFE/SAFE/PARTIAL events → robot voice + recorded arm motion.
- Wireless-remote gated start/stop (R2+X / R2+Y) via `rt/lowstate`.
- Recorded arm replay from `assets/motions/adnoc1.jsonl` via `robot/arm_replay.py` — custom teach-and-replay trajectory instead of the canned `reject` action.
- Recorded WAV library under `assets/audio/` with a `recorded_or_tts` mode that can be toggled via `tts.mode` in config.
- TtsMaker path hardened: `AUDIO_STOP_PLAY` reset + retry + 3 s RPC timeout + index-bug workaround (see Audio pipeline).
- MJPEG stream on port 8080 for remote visual QA.
- Fully config-driven — no hard-coded IPs, paths, or thresholds in code beyond the `SAQR_ROOT` fallback and firmware API ids.
- systemd unit + `start_saqr.sh` = single entry point, survives reboots.
Known broken / in tension:

- Audio ↔ arm firmware contention (see Known limitations). Current default is parallel via `audio_lead_s=0.3` — ~50 % of UNSAFE alerts drop their audio while the arm SDK is publishing. Mitigations were tested and none fixes the root cause (it's a firmware limitation). The deterministic fix is to serialise audio-then-arm; decision pending.
- Track-ID churn — each re-acquisition of the same person generates a new track id, which bypasses the per-`(track_id, status)` cooldown and triggers a fresh UNSAFE alert. Not yet quantified, but visible in rapid-fire `ID 0001 → 0002 → 0003` sequences for what appears to be one worker.
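The churn problem falls directly out of how the cooldown is keyed. A minimal sketch of a `(track_id, status)`-keyed suppressor (hypothetical class name; only the keying and the 8 s window come from the config described here):

```python
import time

class AlertCooldown:
    """Per-(track_id, status) alert suppression.

    Because the key includes track_id, a re-acquired person with a
    fresh id gets a fresh key — and a fresh, un-suppressed alert.
    """
    def __init__(self, cooldown_s=8.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable for testing
        self._last = {}

    def should_alert(self, track_id, status) -> bool:
        key = (track_id, status)
        now = self.clock()
        if now - self._last.get(key, float("-inf")) < self.cooldown_s:
            return False  # still inside the cooldown window
        self._last[key] = now
        return True
```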
Configuration right now:

- `tts.mode = "tts_only"` (firmware TTS, no recorded WAVs used).
- `motion.enabled = true`, `motion.unsafe_file = "adnoc1.jsonl"` — the custom 7-second trajectory with 60 frames of smooth move-in and 180 frames of smooth return-to-home.
- `bridge.audio_lead_s = 0.3` — parallel mode.
- `bridge.cooldown = 8.0` — per `(track_id, status)`.
Next-session picks (prioritised):
- Decide parallel-vs-serial for audio+arm and wire it in. Serial is safer for safety-critical alerts.
- Stabilise track-IDs or make the cooldown per-identity instead of per-track so one worker doesn't trigger N alerts.
- Add a boot-time self-check (camera reachable, DDS iface up, model file present, motion JSONL parseable) so failures surface before the first event rather than during it.
- Rotate `logs/` — currently grows unbounded.
- Add a `logs/events.jsonl` audit trail (one structured line per UNSAFE/SAFE event) for post-hoc compliance review.
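The proposed boot-time self-check could start as simply as this (a sketch of the file-level checks only — camera and DDS probes need hardware and are left out; function and message strings are illustrative):

```python
import json
from pathlib import Path

def boot_self_check(root: Path):
    """Verify the model weights and motion files exist and that each
    motion file is valid JSONL; return human-readable problems.
    """
    problems = []
    model = root / "data" / "models" / "saqr_best.pt"
    if not model.exists():
        problems.append(f"model missing: {model}")
    motions = sorted((root / "assets" / "motions").glob("*.jsonl"))
    if not motions:
        problems.append("no motion files in assets/motions/")
    for m in motions:
        for i, line in enumerate(m.read_text().splitlines(), 1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"{m.name}: bad JSONL at line {i}")
                break
    return problems
```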
See `docs/ARCHITECTURE.md` for the full component map and data flow.