# Saqr — PPE Safety Detection on Unitree G1

Real-time PPE compliance (helmet, vest, boots, gloves, goggles) using YOLO11n. Runs on a Unitree G1 humanoid with an Intel RealSense D435i. On UNSAFE the robot announces the missing gear via the onboard TTS and plays a recorded arm-motion trajectory from `assets/motions/`.
## Layout
```
Saqr/
├── core/                     # detection + tracking + events (shared by CLI/GUI/bridge)
├── apps/                     # CLI modules (saqr_cli, detect_cli, train_cli, manager_cli, view_stream)
├── gui/                      # PySide6 desktop GUI (dev-machine only)
├── robot/                    # G1 bridge + DDS controller
├── utils/                    # logger, config loader
├── scripts/
│   ├── start_saqr.sh         # the single entry point
│   ├── saqr-bridge.service   # systemd unit (wraps start_saqr.sh)
│   └── deploy.sh             # push code dev machine → robot
├── config/                   # logging.json, core_config.json, robot_config.json
├── assets/
│   ├── audio/                # pre-recorded WAV clips (16 kHz mono int16, per category)
│   └── motions/              # teach-and-replay arm trajectories (*.jsonl)
├── data/                     # dataset/, models/ (gitignored)
├── runtime/                  # captures/, runs/ (gitignored)
├── logs/                     # per-module .log files (gitignored)
├── docs/                     # DEPLOY.md, start.md, use_case_catalogue.pdf
├── pyproject.toml
└── README.md
```
The project root is auto-detected via `core/paths.py::PROJECT_ROOT` — drop the Saqr/ folder anywhere on disk and the code finds itself. Override with `SAQR_ROOT=/custom/path` if needed.
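The detection logic can be sketched roughly like this (an approximation only — the real `core/paths.py` may use a different marker file or fallback):

```python
import os
from pathlib import Path

def find_project_root() -> Path:
    """Resolve the Saqr project root.

    Precedence (assumed): SAQR_ROOT env var, else walk up from this
    file looking for pyproject.toml, else fall back to the CWD.
    """
    override = os.environ.get("SAQR_ROOT")
    if override:
        return Path(override)
    here = Path(__file__).resolve()
    for parent in here.parents:
        if (parent / "pyproject.toml").exists():
            return parent
    return Path.cwd()  # last resort: assume we run from the repo

PROJECT_ROOT = find_project_root()
```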
## Run
The project only runs through `scripts/start_saqr.sh` (directly or under the `saqr-bridge` systemd unit):
```sh
# On the robot:
sudo systemctl start saqr-bridge   # production
# or
~/Saqr/scripts/start_saqr.sh       # foreground / debug
```
Then on the wireless remote:

- `R2 + X` → start detection
- `R2 + Y` → stop detection
See `docs/DEPLOY.md` for first-time deploy, `docs/start.md` for the systemd workflow, and `docs/ARCHITECTURE.md` for the component map.
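The start/stop combos only fire on the *rising edge* of the key state, so holding the buttons doesn't retrigger. A generic sketch of that edge detection (hypothetical helper — the real bridge decodes the remote's key bitmask out of `rt/lowstate`, which is not reproduced here):

```python
class ComboEdge:
    """Fire once on the rising edge of a two-button combo (e.g. R2+X).

    Pressed-state booleans are fed in per LowState message; the
    combo fires only on the transition from not-held to held.
    """
    def __init__(self) -> None:
        self._was_down = False

    def update(self, r2_down: bool, x_down: bool) -> bool:
        now_down = r2_down and x_down
        fired = now_down and not self._was_down  # rising edge only
        self._was_down = now_down
        return fired
```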
## Usage
One-time setup on the robot (already done, documented for rebuilds):
```sh
# 1. Clone into ~/Saqr
git clone <repo> ~/Saqr && cd ~/Saqr

# 2. Create the conda env and install in editable mode
conda create -n saqr python=3.8 -y
conda activate saqr
pip install -e .
pip install "cyclonedds==0.10.2"   # exact version required by unitree_sdk2py

# 3. Drop the model at data/models/saqr_best.pt
#    (trained on the dev machine; copy via scp or scripts/deploy.sh)

# 4. Install the systemd unit for auto-start on boot
sudo cp scripts/saqr-bridge.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable saqr-bridge
```
Normal day-to-day operation on the robot:
```sh
# Foreground (see logs live, Ctrl-C to stop)
CONDA_ENV=saqr ~/Saqr/scripts/start_saqr.sh

# Background via systemd (survives reboots)
sudo systemctl start saqr-bridge
journalctl -u saqr-bridge -f   # follow logs
sudo systemctl stop saqr-bridge
```
Controlling detection from the wireless remote:
- Start the bridge (above). It announces "Saqr is running. Press R2 plus X to start."
- Press `R2 + X` → the bridge spawns `apps.saqr_cli` as a subprocess, opens the RealSense camera, loads the model, and begins publishing events on stdout.
- Walk into frame. On UNSAFE (missing helmet or vest) the robot speaks what's missing and plays the recorded arm gesture.
- Press `R2 + Y` → the subprocess is SIGINT'd, the robot announces "Saqr deactivated.", and the bridge goes back to idle.
- You can start/stop repeatedly without restarting the bridge.
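The spawn/SIGINT lifecycle can be sketched like this (`DetectorProcess` is a hypothetical name; only the behaviour — subprocess spawn, SIGINT on stop, restartable — comes from the description above):

```python
import signal
import subprocess
import sys

class DetectorProcess:
    """Sketch of the bridge's start/stop handling of apps.saqr_cli.

    Assumes the CLI prints one event per line on stdout and shuts
    down cleanly on SIGINT, as the README describes.
    """
    def __init__(self, module: str = "apps.saqr_cli") -> None:
        self.module = module
        self.proc = None

    def start(self) -> None:
        # Idempotent: only spawn if nothing is running.
        if self.proc is None or self.proc.poll() is not None:
            self.proc = subprocess.Popen(
                [sys.executable, "-m", self.module],
                stdout=subprocess.PIPE, text=True)

    def stop(self, timeout: float = 5.0) -> None:
        if self.proc and self.proc.poll() is None:
            self.proc.send_signal(signal.SIGINT)  # graceful shutdown
            try:
                self.proc.wait(timeout)
            except subprocess.TimeoutExpired:
                self.proc.kill()  # last resort
        self.proc = None
```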
Dev-machine workflow (push code + weights to the robot):
```sh
# From the dev machine, from the Saqr/ dir
scripts/deploy.sh   # rsync + pip install -e . in the robot's conda env

# Or a single file:
scp robot/robot_controller.py unitree@192.168.123.164:~/Saqr/robot/
ssh unitree@192.168.123.164 'sudo systemctl restart saqr-bridge'
```
Streaming view (optional, for debugging from a browser): with the subprocess running, open `http://<robot-ip>:8080` — an MJPEG stream of the annotated frames. Port and interface are set in `config/core_config.json` under `stream`.
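For reference, the MJPEG mechanics behind such a stream are simple: a `multipart/x-mixed-replace` response whose parts are JPEG frames. A minimal stdlib sketch (the actual server lives in `core/` and its details will differ; names here are illustrative):

```python
from http.server import BaseHTTPRequestHandler

BOUNDARY = "frame"

def mjpeg_part(jpeg_bytes: bytes) -> bytes:
    """Wrap one JPEG-encoded frame as a multipart chunk."""
    head = (f"--{BOUNDARY}\r\n"
            "Content-Type: image/jpeg\r\n"
            f"Content-Length: {len(jpeg_bytes)}\r\n\r\n").encode()
    return head + jpeg_bytes + b"\r\n"

class StreamHandler(BaseHTTPRequestHandler):
    frames = iter(())  # injected: an iterator of JPEG-encoded frames

    def do_GET(self):
        # The browser keeps the connection open and repaints each part.
        self.send_response(200)
        self.send_header("Content-Type",
                         f"multipart/x-mixed-replace; boundary={BOUNDARY}")
        self.end_headers()
        for jpeg in self.frames:
            self.wfile.write(mjpeg_part(jpeg))
```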
Log locations:

- Bridge stdout/stderr: `journalctl -u saqr-bridge` (systemd) or the terminal.
- Saqr CLI: streams to the bridge and appears inline in the same log.
- Per-module files: `logs/*.log` (rotate on size); tail with `tail -f logs/saqr.log`.
- Event CSVs: `runtime/runs/<timestamp>/events.csv`.
Troubleshooting quick-refs:

| Symptom | Likely cause | Fix |
|---|---|---|
| "Camera not connected" announcement | RealSense not on USB | Replug, `lsusb \| grep Intel`, rerun |
| R2+X does nothing | `LowState` not subscribed | Check `BRIDGE … Subscribed to rt/lowstate` in the log; network down? |
| `rc=3104` warnings | Firmware audio/arm contention | Expected — see Known limitations |
| Motion files missing | `assets/motions/*.jsonl` absent on robot | `scripts/deploy.sh` or `scp` the directory |
| `ModuleNotFoundError: cyclonedds` | Wrong version | `pip install "cyclonedds==0.10.2"` in the saqr env |
## Deploy
From the dev machine:

```sh
scripts/deploy.sh          # rsync + pip install -e . in the robot's conda env
scripts/deploy.sh --ip …   # custom robot IP
```
## Configure
All tunable values live in JSON — no code edits needed:

- `config/core_config.json` — detection, tracking, camera, stream, training.
- `config/robot_config.json` — bridge, TTS, phrases, arm actions, deploy, start_saqr defaults.
- `config/logging.json` — log levels per category.
Precedence: env var > config JSON > code fallback. Most defaults can be overridden via env vars without editing the files (`CONDA_ENV`, `SAQR_SOURCE`, `STREAM_PORT`, `DDS_IFACE`, `ROBOT_IP`, …).
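That precedence chain can be sketched as a single lookup helper (hypothetical — the real loader lives in `utils/` and may differ in shape):

```python
import json
import os
from pathlib import Path

def get_setting(env_var, json_path, keys, default):
    """Resolve one setting with the documented precedence:
    env var > config JSON > code fallback.

    `keys` walks nested JSON, e.g. ("stream", "port").
    """
    if env_var in os.environ:
        return os.environ[env_var]
    try:
        node = json.loads(Path(json_path).read_text())
        for k in keys:
            node = node[k]
        return node
    except (OSError, KeyError, TypeError, json.JSONDecodeError):
        return default  # file missing/malformed or key absent

# e.g. get_setting("STREAM_PORT", "config/core_config.json",
#                  ("stream", "port"), 8080)
```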
After editing any JSON:

```sh
sudo systemctl restart saqr-bridge
```
## Data & Models
`data/` and `runtime/` are gitignored (too large). Download separately:

- Dataset: testcasque/ppe-detection-qlq3d → YOLOv11 format → unzip to `data/dataset/`.
- Base weights: Ultralytics releases → `yolo11n.pt` into `data/models/`.
- Saqr fine-tuned weights (`saqr_best.pt`): produced by training — see `docs/DEPLOY.md`.
## Training (dev machine, off-path)
Training is done on a workstation, not on the robot. It's a one-off, not part of the normal run flow:

```sh
python -m apps.train_cli --epochs 100 --batch 16
# best weights land at data/models/saqr_best.pt; deploy with scripts/deploy.sh
```
## Audio pipeline
On UNSAFE, `robot/robot_controller.py` queues one announcement to a worker thread, which routes it per `tts.mode`:

- `tts_only` → firmware `TtsMaker(text, speaker_id=2)` (current default).
- `recorded_only` → WAV lookup in `assets/audio/<category>/<key>.wav`; no fallback to TtsMaker.
- `recorded_or_tts` → WAV if available, else TtsMaker.
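The routing table above reduces to a small pure function — a sketch only (the real controller also handles queuing and retries, and its return shape is unknown):

```python
from pathlib import Path

def route_announcement(mode, category, key, audio_root="assets/audio"):
    """Pick the playback backend for one announcement per tts.mode.

    Returns ("wav", path) or ("tts", None).
    """
    wav = Path(audio_root) / category / f"{key}.wav"
    if mode == "tts_only":
        return ("tts", None)
    if mode == "recorded_only":
        return ("wav", wav)  # no TtsMaker fallback, even if missing
    if mode == "recorded_or_tts":
        return ("wav", wav) if wav.exists() else ("tts", None)
    raise ValueError(f"unknown tts.mode: {mode}")
```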
Each `TtsMaker` call is preceded by `AUDIO_STOP_PLAY` + a 300 ms settle (the reset pattern from `G1_Lootah/Audio_Recorder/voice_note.txt`) and retried once on `rc != 0`. The audio RPC timeout is capped at 3 s so a stuck firmware call fails fast instead of blocking the worker for the bridge-wide 10 s. The Python SDK's `TtsMaker` has a broken `tts_index` counter that never increments — the controller bypasses it and calls the underlying RPC with a real index.
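The timeout-plus-retry hardening can be approximated like this (illustrative only — it assumes the RPC returns an integer `rc` with 0 meaning success, as the Unitree audio client does, and it is not the project's actual wrapper):

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as RpcTimeout

def call_with_retry(rpc, *args, timeout=3.0, retries=1):
    """Cap a firmware RPC at `timeout` seconds and retry on a
    nonzero rc, so a stuck call fails fast instead of blocking.
    """
    rc = -1
    for _ in range(retries + 1):
        pool = ThreadPoolExecutor(max_workers=1)
        try:
            rc = pool.submit(rpc, *args).result(timeout=timeout)
        except RpcTimeout:
            rc = -1  # stuck firmware call: give up on this attempt
        finally:
            pool.shutdown(wait=False)  # abandon a stuck worker thread
        if rc == 0:
            return 0
    return rc
```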
## Known limitations
G1 firmware serialises the audio channel and `rt/arm_sdk`. When the low-level arm replayer is publishing joint commands at 60 Hz, every audio RPC (`TtsMaker`, `PlayStream`, even `AUDIO_STOP_PLAY`) blocks until the RPC timeout and returns `rc=3104`. Measured audio dropout during parallel arm+audio runs: ~50 % of alerts. This is a firmware-level constraint on the voice service, not anything the client can tune its way out of — verified by testing retries, `SetVolume` warmups, and `AUDIO_STOP_PLAY` resets, all of which hang under the same condition.
The reliable path is serial: `speak()` → `wait_for_audio_done()` → `reject()` (swap `bridge.py`'s `handle_line` if that's the behaviour you want). Parallel is kept behind `audio_lead_s` for latency-sensitive demos where occasional dropped audio is acceptable.
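The two delivery strategies can be sketched side by side — callables are injected and the names mirror the README, not a real API:

```python
import threading
import time

def deliver_alert(speak, play_motion, wait_for_audio_done,
                  audio_lead_s=None):
    """Serial vs parallel alert delivery.

    audio_lead_s=None  -> serial: speak, wait, then move (reliable).
    audio_lead_s=0.3   -> parallel: start audio, give it a head start,
                          then move (lower latency; audio may drop
                          under firmware contention).
    """
    if audio_lead_s is None:
        speak()
        wait_for_audio_done()
        play_motion()
    else:
        t = threading.Thread(target=speak, daemon=True)
        t.start()
        time.sleep(audio_lead_s)  # let the audio RPC get in first
        play_motion()
        t.join()
```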
## Current status (2026-04-21)
Snapshot of what is and isn't working, so whoever picks this up next doesn't have to rediscover it.
Working:
- End-to-end flow: RealSense → YOLO11n → tracker → compliance → UNSAFE/SAFE/PARTIAL events → robot voice + recorded arm motion.
- Wireless-remote gated start/stop (R2+X / R2+Y) via `rt/lowstate`.
- Recorded arm replay from `assets/motions/adnoc1.jsonl` via `robot/arm_replay.py` — custom teach-and-replay trajectory instead of the canned `reject` action.
- Recorded WAV library under `assets/audio/` with a `recorded_or_tts` mode that can be toggled via `tts.mode` in config.
- TtsMaker path hardened: `AUDIO_STOP_PLAY` reset + retry + 3 s RPC timeout + index-bug workaround (see Audio pipeline).
- MJPEG stream on port 8080 for remote visual QA.
- Fully config-driven — no hard-coded IPs, paths, or thresholds in code beyond the `SAQR_ROOT` fallback and firmware API ids.
- systemd unit + `start_saqr.sh` = single entry point, survives reboots.
Known broken / in tension:

- Audio ↔ arm firmware contention (see Known limitations). Current default is parallel via `audio_lead_s=0.3` — ~50 % of UNSAFE alerts drop their audio while the arm SDK is publishing. Mitigations were tested and none fixes the root cause (it's a firmware limitation). The deterministic fix is to serialise audio-then-arm; decision pending.
- Track-ID churn — each re-acquisition of the same person generates a new track id, which bypasses the per-`(track_id, status)` cooldown and triggers a fresh UNSAFE alert. Not yet quantified, but visible in rapid-fire `ID 0001 → 0002 → 0003` sequences for what appears to be one worker.
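The churn problem falls directly out of how the cooldown is keyed. A minimal sketch of a `(track_id, status)`-keyed suppressor (hypothetical class name; only the keying and the 8 s window come from the config described here):

```python
import time

class AlertCooldown:
    """Per-(track_id, status) alert suppression.

    Because the key includes track_id, a re-acquired person with a
    fresh id gets a fresh key — and a fresh, un-suppressed alert.
    """
    def __init__(self, cooldown_s=8.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock  # injectable for testing
        self._last = {}

    def should_alert(self, track_id, status) -> bool:
        key = (track_id, status)
        now = self.clock()
        if now - self._last.get(key, float("-inf")) < self.cooldown_s:
            return False  # still inside the cooldown window
        self._last[key] = now
        return True
```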
Configuration right now:

- `tts.mode = "tts_only"` (firmware TTS, no recorded WAVs used).
- `motion.enabled = true`, `motion.unsafe_file = "adnoc1.jsonl"` — the custom 7-second trajectory with 60 frames of smooth move-in and 180 frames of smooth return-to-home.
- `bridge.audio_lead_s = 0.3` — parallel mode.
- `bridge.cooldown = 8.0` — per `(track_id, status)`.
Next-session picks (prioritised):
- Decide parallel-vs-serial for audio+arm and wire it in. Serial is safer for safety-critical alerts.
- Stabilise track-IDs or make the cooldown per-identity instead of per-track so one worker doesn't trigger N alerts.
- Add a boot-time self-check (camera reachable, DDS iface up, model file present, motion JSONL parseable) so failures surface before the first event rather than during it.
- Rotate `logs/` — currently grows unbounded.
- Add a `logs/events.jsonl` audit trail (one structured line per UNSAFE/SAFE event) for post-hoc compliance review.
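The proposed boot-time self-check could start as simply as this (a sketch of the file-level checks only — camera and DDS probes need hardware and are left out; function and message strings are illustrative):

```python
import json
from pathlib import Path

def boot_self_check(root: Path):
    """Verify the model weights and motion files exist and that each
    motion file is valid JSONL; return human-readable problems.
    """
    problems = []
    model = root / "data" / "models" / "saqr_best.pt"
    if not model.exists():
        problems.append(f"model missing: {model}")
    motions = sorted((root / "assets" / "motions").glob("*.jsonl"))
    if not motions:
        problems.append("no motion files in assets/motions/")
    for m in motions:
        for i, line in enumerate(m.read_text().splitlines(), 1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"{m.name}: bad JSONL at line {i}")
                break
    return problems
```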
See `docs/ARCHITECTURE.md` for the full component map and data flow.