# Saqr — PPE Safety Detection on Unitree G1

Real-time PPE compliance detection (helmet, vest, boots, gloves, goggles) using
YOLO11n. It runs on a Unitree G1 humanoid with an Intel RealSense D435I. On
UNSAFE, the robot announces the missing gear via the onboard TTS and plays a
recorded arm-motion trajectory from `assets/motions/`.

## Layout

```
Saqr/
├── core/                     # detection + tracking + events (shared by CLI/GUI/bridge)
├── apps/                     # CLI modules (saqr_cli, detect_cli, train_cli, manager_cli, view_stream)
├── gui/                      # PySide6 desktop GUI (dev-machine only)
├── robot/                    # G1 bridge + DDS controller
├── utils/                    # logger, config loader
├── scripts/
│   ├── start_saqr.sh         # the single entry point
│   ├── saqr-bridge.service   # systemd unit (wraps start_saqr.sh)
│   └── deploy.sh             # push code dev machine → robot
├── config/                   # logging.json, core_config.json, robot_config.json
├── assets/
│   ├── audio/                # pre-recorded WAV clips (16 kHz mono int16, per category)
│   └── motions/              # teach-and-replay arm trajectories (*.jsonl)
├── data/                     # dataset/, models/ (gitignored)
├── runtime/                  # captures/, runs/ (gitignored)
├── logs/                     # per-module .log files (gitignored)
├── docs/                     # DEPLOY.md, start.md, use_case_catalogue.pdf
├── pyproject.toml
└── README.md
```

The project root is resolved automatically by `PROJECT_ROOT` in
`core/paths.py`, so the `Saqr/` folder can live anywhere on disk. Set
`SAQR_ROOT=/custom/path` to override the detected location if needed.
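
A minimal sketch of how this kind of root detection could work, assuming `core/paths.py` sits one directory below the project root. The function name `resolve_project_root` is hypothetical and not taken from the codebase; only `SAQR_ROOT` and the file location come from this README.

```python
import os
from pathlib import Path


def resolve_project_root(anchor_file: str) -> Path:
    """Return SAQR_ROOT if it is set, else the directory two levels above anchor_file."""
    override = os.environ.get("SAQR_ROOT")
    if override:
        return Path(override).expanduser().resolve()
    # Assumption: anchor_file is <root>/core/paths.py, so the root is two parents up.
    return Path(anchor_file).resolve().parent.parent


if __name__ == "__main__":
    # Inside core/paths.py the argument would be __file__.
    print(resolve_project_root("/opt/Saqr/core/paths.py"))
```

Resolving relative to the module file rather than the working directory is what lets the folder be moved without editing any path.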

## Run

The project only runs through **[scripts/start_saqr.sh](scripts/start_saqr.sh)**
(directly or under the `saqr-bridge` systemd unit):

```bash
# On the robot:
sudo systemctl start saqr-bridge    # production
# or
~/Saqr/scripts/start_saqr.sh        # foreground / debug
```

Then on the wireless remote:
- **R2 + X** → start detection
- **R2 + Y** → stop detection

See [docs/DEPLOY.md](docs/DEPLOY.md) for first-time deploy,
[docs/start.md](docs/start.md) for the systemd workflow, and
[docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the component map.

## Usage

**One-time setup on the robot (already done, documented for rebuilds):**

```bash
# 1. Clone into ~/Saqr
git clone <repo> ~/Saqr && cd ~/Saqr

# 2. Create the conda env and install in editable mode
conda create -n saqr python=3.8 -y
conda activate saqr
pip install -e .
pip install "cyclonedds==0.10.2"   # exact version required by unitree_sdk2py

# 3. Drop the model at data/models/saqr_best.pt
#    (trained on the dev machine; copy via scp or scripts/deploy.sh)

# 4. Install the systemd unit for auto-start on boot
sudo cp scripts/saqr-bridge.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable saqr-bridge
```

**Normal day-to-day operation on the robot:**

```bash
# Foreground (see logs live, Ctrl-C to stop)
CONDA_ENV=saqr ~/Saqr/scripts/start_saqr.sh

# Background via systemd (survives reboots)
sudo systemctl start saqr-bridge
journalctl -u saqr-bridge -f        # follow logs
sudo systemctl stop saqr-bridge
```

**Controlling detection from the wireless remote:**

1. Start the bridge (above). It announces "Saqr is running. Press R2 plus X
   to start."
2. Press **R2 + X** → the bridge spawns `apps.saqr_cli` as a subprocess,
   which opens the RealSense camera, loads the model, and begins publishing
   events on stdout.
3. Walk into frame. On UNSAFE (missing helmet or vest) the robot speaks
   what's missing and plays the recorded arm gesture.
4. Press **R2 + Y** → the bridge sends SIGINT to the subprocess, the robot
   announces "Saqr deactivated.", and the bridge goes back to idle.
5. You can start/stop repeatedly without restarting the bridge.
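
The start/stop cycle in steps 2 to 5 can be sketched as a small wrapper around `subprocess.Popen`. This is an illustration under stated assumptions, not the bridge's actual code: the class `DetectionRunner`, its method names, the 5-second grace period, and the `python -m apps.saqr_cli` invocation are hypothetical, and reading events from the child's stdout is left out for brevity.

```python
import signal
import subprocess
import sys
from typing import List, Optional


class DetectionRunner:
    """Start and stop one child process; repeated cycles are allowed."""

    def __init__(self, cmd: List[str]) -> None:
        self.cmd = cmd
        self.proc: Optional[subprocess.Popen] = None

    def running(self) -> bool:
        return self.proc is not None and self.proc.poll() is None

    def start(self) -> bool:
        if self.running():
            return False  # ignore a second R2+X while detection is active
        self.proc = subprocess.Popen(self.cmd)
        return True

    def stop(self, timeout_s: float = 5.0) -> Optional[int]:
        if not self.running():
            return None
        self.proc.send_signal(signal.SIGINT)  # same signal as Ctrl-C (POSIX)
        try:
            return self.proc.wait(timeout=timeout_s)
        except subprocess.TimeoutExpired:
            self.proc.kill()  # last resort if the child ignores SIGINT
            return self.proc.wait()


if __name__ == "__main__":
    # Assumed invocation; the real bridge may build the command differently.
    runner = DetectionRunner([sys.executable, "-m", "apps.saqr_cli"])
    runner.start()   # R2 + X
    runner.stop()    # R2 + Y
```

Keeping the bridge process alive and only cycling the child is what makes step 5 possible: the remote subscription survives every stop.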

**Dev-machine workflow (push code + weights to the robot):**

```bash
# From the dev machine, from the Saqr/ dir
scripts/deploy.sh                   # rsync + pip install -e . in the robot's conda env
# Or a single file:
scp robot/robot_controller.py unitree@192.168.123.164:~/Saqr/robot/
ssh unitree@192.168.123.164 'sudo systemctl restart saqr-bridge'
```

**Streaming view (optional, for debugging from a browser):**

With the subprocess running, open `http://<robot-ip>:8080` to view an MJPEG
stream of the annotated frames. Port and interface are set in
[config/core_config.json](config/core_config.json) under `stream`.

**Log locations:**

- Bridge stdout/stderr: `journalctl -u saqr-bridge` (systemd) or the terminal.
- Saqr CLI: streams to the bridge and appears inline in the same log.
- Per-module files: `logs/*.log` rotate on size; tail with
  `tail -f logs/saqr.log`.
- Event CSVs: `runtime/runs/<timestamp>/events.csv`.

**Troubleshooting quick-refs:**

| Symptom | Likely cause | Fix |
|---|---|---|
| "Camera not connected" announcement | RealSense not on USB | Replug, `lsusb \| grep Intel`, rerun |
| R2+X does nothing | LowState not subscribed | Check `BRIDGE … Subscribed to rt/lowstate` in the log; network down? |
| `rc=3104` warnings | Firmware audio/arm contention | Expected; see [Known limitations](#known-limitations) |
| `motion files missing` | `assets/motions/*.jsonl` absent on robot | `scripts/deploy.sh` or scp the directory |
| `ModuleNotFoundError: cyclonedds` | Package missing or wrong version in the active env | `pip install "cyclonedds==0.10.2"` in the `saqr` env |

## Deploy

From the dev machine:
```bash
scripts/deploy.sh                   # rsync + pip install -e . in the robot's conda env
scripts/deploy.sh --ip …            # custom robot IP
```

## Configure

All tunable values live in JSON — no code edits needed:

- [config/core_config.json](config/core_config.json) — detection, tracking,
  camera, stream, training.
- [config/robot_config.json](config/robot_config.json) — bridge, TTS,
  phrases, arm actions, deploy, start_saqr defaults.
- [config/logging.json](config/logging.json) — log levels per category.

Precedence: **env var > config JSON > code fallback**. Most defaults can be
overridden via env vars without editing the files (`CONDA_ENV`,
`SAQR_SOURCE`, `STREAM_PORT`, `DDS_IFACE`, `ROBOT_IP`, …).
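
The precedence rule can be illustrated with a short lookup helper. This is a sketch only: `resolve_setting` is a hypothetical name, the flat `port` key is a simplified stand-in for the real JSON layout, and the type-casting rule is an assumption about how string env values might be coerced.

```python
import os
from typing import Any, Mapping, Optional


def resolve_setting(env_name: str, config: Mapping[str, Any], key: str,
                    fallback: Any, env: Optional[Mapping[str, str]] = None) -> Any:
    """Return the env value if set, else config[key], else the code fallback."""
    env = os.environ if env is None else env
    if env_name in env:
        raw = env[env_name]
        # Env vars are always strings, so cast them to the fallback's type.
        if isinstance(fallback, bool):
            return raw.strip().lower() in ("1", "true", "yes")
        if isinstance(fallback, (int, float)):
            return type(fallback)(raw)
        return raw
    return config.get(key, fallback)


config = {"port": 8080}  # stand-in for a value loaded from a config JSON file
print(resolve_setting("STREAM_PORT", config, "port", 9000, env={"STREAM_PORT": "8081"}))  # 8081
print(resolve_setting("STREAM_PORT", config, "port", 9000, env={}))                       # 8080
print(resolve_setting("STREAM_PORT", {}, "port", 9000, env={}))                           # 9000
```

Passing `env` explicitly keeps the helper testable; in normal use it falls back to `os.environ`.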

After editing any JSON:
```bash
sudo systemctl restart saqr-bridge
```

## Data & Models

`data/` and `runtime/` are gitignored (too large). Download separately:

- **Dataset**: [testcasque/ppe-detection-qlq3d](https://universe.roboflow.com/testcasque/ppe-detection-qlq3d)
  → YOLOv11 format → unzip to `data/dataset/`.
- **Base weights**: [Ultralytics releases](https://github.com/ultralytics/assets/releases)
  → `yolo11n.pt` into `data/models/`.
- **Saqr fine-tuned weights** (`saqr_best.pt`): produced by training — see
  [docs/DEPLOY.md](docs/DEPLOY.md).

## Training (dev machine, off-path)

Training is done on a workstation, not on the robot. It's a one-off, not
part of the normal run flow:

```bash
python -m apps.train_cli --epochs 100 --batch 16
# best weights land at data/models/saqr_best.pt; deploy with scripts/deploy.sh
```

## Audio pipeline

On UNSAFE, [robot/robot_controller.py](robot/robot_controller.py) queues one
announcement to a worker thread, which routes it according to `tts.mode`:

- `tts_only` → firmware `TtsMaker(text, speaker_id=2)` (current default).
- `recorded_only` → WAV lookup in `assets/audio/<category>/<key>.wav`; no
  fallback to TtsMaker.
- `recorded_or_tts` → WAV if available, else TtsMaker.

Each TtsMaker call is preceded by `AUDIO_STOP_PLAY` plus a 300 ms settle (the
reset pattern from `G1_Lootah/Audio_Recorder/voice_note.txt`) and is retried
once on `rc != 0`. The audio RPC timeout is capped at 3 s so that a stuck
firmware call fails fast instead of blocking the worker for the bridge-wide
10 s. The Python SDK's `TtsMaker` has a broken `tts_index` counter that never
increments, so the controller bypasses it and calls the underlying RPC with
a real index.
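
The routing and the reset-then-retry pattern described above can be sketched as follows. This is not the controller's code: `find_wav`, `choose_backend`, and `speak_with_reset` are hypothetical names, and the firmware calls are passed in as plain callables so that the sketch does not guess at SDK signatures.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def find_wav(audio_root: Path, category: str, key: str) -> Optional[Path]:
    """Look up assets/audio/<category>/<key>.wav; None if the clip is absent."""
    wav = audio_root / category / f"{key}.wav"
    return wav if wav.is_file() else None


def choose_backend(mode: str, wav: Optional[Path]) -> Optional[str]:
    """Return "wav", "tts", or None (nothing to play) for a tts.mode value."""
    if mode == "tts_only":
        return "tts"
    if mode == "recorded_only":
        return "wav" if wav else None  # no fallback to TtsMaker
    if mode == "recorded_or_tts":
        return "wav" if wav else "tts"
    raise ValueError(f"unknown tts.mode: {mode!r}")


def speak_with_reset(stop_play: Callable[[], int], tts: Callable[[str], int],
                     text: str, settle_s: float = 0.3, attempts: int = 2) -> int:
    """Stop playback, wait for the settle time, call TTS; retry once on rc != 0."""
    rc = -1
    for _ in range(attempts):
        stop_play()            # stands in for AUDIO_STOP_PLAY
        time.sleep(settle_s)   # 300 ms settle by default
        rc = tts(text)         # stands in for the TtsMaker RPC
        if rc == 0:
            break
    return rc
```

With `attempts=2` the TTS call is made at most twice, which matches "retried once"; the reset is repeated before the retry because each call is preceded by the stop-and-settle step.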

## Known limitations

**G1 firmware serialises the audio channel and `rt/arm_sdk`.** When the
low-level arm replayer is publishing joint commands at 60 Hz, every audio
RPC (`TtsMaker`, `PlayStream`, even `AUDIO_STOP_PLAY`) blocks until the
RPC timeout and returns `rc=3104`. Measured audio dropout during parallel
arm+audio runs: ~50 % of alerts. This is a firmware-level constraint on
the voice service, not something the client can tune its way out of:
retries, `SetVolume` warmups, and `AUDIO_STOP_PLAY` resets were all tested,
and all of them hang under the same condition.

The reliable path is **serial**: `speak()` → `wait_for_audio_done()` →
`reject()` (edit `handle_line` in `bridge.py` if that's the behaviour you
want). Parallel mode is kept behind `audio_lead_s` for latency-sensitive
demos where occasional dropped audio is acceptable.
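
The difference between the two orderings can be shown with a small dispatcher. This is a sketch, not the contents of `bridge.py`: the function `alert` is hypothetical, the three steps are passed in as callables, and treating `audio_lead_s` as the head start given to the audio before the arm moves is an assumption about that setting.

```python
import threading
import time
from typing import Callable, Optional


def alert(speak: Callable[[], None], wait_for_audio_done: Callable[[], None],
          play_motion: Callable[[], None], audio_lead_s: Optional[float] = None) -> None:
    """audio_lead_s=None runs serially; a number runs in parallel with that head start."""
    if audio_lead_s is None:
        speak()
        wait_for_audio_done()  # the arm stays idle until the audio has finished
        play_motion()
        return
    audio = threading.Thread(target=speak, daemon=True)
    audio.start()
    time.sleep(audio_lead_s)   # give the audio a head start, then move regardless
    play_motion()
    audio.join()


order = []
alert(lambda: order.append("speak"), lambda: order.append("wait"),
      lambda: order.append("motion"))
print(order)  # ['speak', 'wait', 'motion']
```

In the serial branch the arm never publishes while an audio RPC is in flight, which is why it avoids the contention at the cost of a later gesture.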

## Current status (2026-04-21)

Snapshot of what is and isn't working, so whoever picks this up next
doesn't have to rediscover it.

**Working:**

- End-to-end flow: RealSense → YOLO11n → tracker → compliance →
  UNSAFE/SAFE/PARTIAL events → robot voice + recorded arm motion.
- Wireless-remote gated start/stop (R2+X / R2+Y) via `rt/lowstate`.
- Recorded arm replay from [assets/motions/adnoc1.jsonl](assets/motions/)
  via [robot/arm_replay.py](robot/arm_replay.py) — custom teach-and-replay
  trajectory instead of the canned `reject` action.
- Recorded WAV library under [assets/audio/](assets/audio/) with a
  `recorded_or_tts` mode that can be toggled via `tts.mode` in config.
- TtsMaker path hardened: `AUDIO_STOP_PLAY` reset + retry + 3 s RPC
  timeout + index-bug workaround (see [Audio pipeline](#audio-pipeline)).
- MJPEG stream on port 8080 for remote visual QA.
- Fully config-driven — no hard-coded IPs, paths, or thresholds in code
  beyond `SAQR_ROOT` fallback and firmware API ids.
- systemd unit + `start_saqr.sh` = single entry point, survives reboots.

**Known broken / in tension:**

- **Audio ↔ arm firmware contention** (see [Known limitations](#known-limitations)).
  The current default is parallel mode via `audio_lead_s=0.3`, in which
  ~50 % of UNSAFE alerts drop their audio while `rt/arm_sdk` is publishing.
  Several mitigations were tested and **none of them fixes the root cause**
  (it's a firmware limitation). The deterministic fix is to serialise
  audio-then-arm; a decision is pending.
- **Track-ID churn**: each re-acquisition of the same person generates a
  new track id, which bypasses the per-`(track_id, status)` cooldown and
  triggers a fresh UNSAFE alert. Not yet quantified, but visible in
  rapid-fire `ID 0001 → 0002 → 0003` sequences for what appears to be
  one worker.
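
The track-ID churn problem follows directly from how a per-`(track_id, status)` cooldown is keyed. The sketch below is illustrative only: `AlertCooldown` and `should_alert` are hypothetical names, and the 8-second window is taken from the `bridge.cooldown` value quoted in this README.

```python
from typing import Dict, Tuple


class AlertCooldown:
    """Suppress repeat alerts for the same (track_id, status) within cooldown_s."""

    def __init__(self, cooldown_s: float = 8.0) -> None:
        self.cooldown_s = cooldown_s
        self._last: Dict[Tuple[int, str], float] = {}

    def should_alert(self, track_id: int, status: str, now: float) -> bool:
        key = (track_id, status)
        last = self._last.get(key)
        if last is not None and now - last < self.cooldown_s:
            return False
        self._last[key] = now
        return True


gate = AlertCooldown(cooldown_s=8.0)
print(gate.should_alert(1, "UNSAFE", now=0.0))  # True: first sighting
print(gate.should_alert(1, "UNSAFE", now=3.0))  # False: inside the 8 s window
print(gate.should_alert(2, "UNSAFE", now=4.0))  # True: new track id, cooldown bypassed
```

The third call is the churn case: if track 2 is the same worker re-acquired by the tracker, the cooldown offers no protection because the key has changed.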

**Configuration right now:**

- `tts.mode = "tts_only"` (firmware TTS, no recorded WAVs used).
- `motion.enabled = true`, `motion.unsafe_file = "adnoc1.jsonl"` — the
  custom 7-second trajectory with 60 frames of smooth move-in and 180
  frames of smooth return-to-home.
- `bridge.audio_lead_s = 0.3` — parallel mode.
- `bridge.cooldown = 8.0` — per (track_id, status).

**Next-session picks (prioritised):**

1. Decide parallel-vs-serial for audio+arm and wire it in. Serial is
   safer for safety-critical alerts.
2. Stabilise track-IDs or make the cooldown per-identity instead of
   per-track so one worker doesn't trigger N alerts.
3. Add a boot-time self-check (camera reachable, DDS iface up, model
   file present, motion JSONL parseable) so failures surface before the
   first event rather than during it.
4. Rotate `logs/` — currently grows unbounded.
5. Add a `logs/events.jsonl` audit trail (one structured line per
   UNSAFE/SAFE event) for post-hoc compliance review.
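
As a possible starting point for pick 3, the two file-based checks (model file present, motion JSONL parseable) need only the standard library. Nothing like this exists in the repo yet; `preflight` is a hypothetical name, and the camera and DDS checks are left out because they depend on the RealSense and Unitree SDKs.

```python
import json
from pathlib import Path
from typing import List


def preflight(model_path: Path, motion_path: Path) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems: List[str] = []
    if not model_path.is_file():
        problems.append(f"model file missing: {model_path}")
    if not motion_path.is_file():
        problems.append(f"motion file missing: {motion_path}")
        return problems
    with motion_path.open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"{motion_path.name}:{lineno}: {exc.msg}")
                break  # report only the first bad line
    return problems


if __name__ == "__main__":
    issues = preflight(Path("data/models/saqr_best.pt"),
                       Path("assets/motions/adnoc1.jsonl"))
    for issue in issues:
        print("PREFLIGHT:", issue)
```

Returning a list instead of raising on the first failure lets the bridge announce every problem at once before the first event.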

See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the full component
map and data flow.