Update 2026-04-20 17:44:59
parent 2c9cde84ca · commit 380794a4ed
README.md (147 lines changed)
@@ -1,101 +1,98 @@
# Saqr — PPE Safety Detection on Unitree G1

Real-time PPE compliance (helmet, vest, boots, gloves, goggles) using YOLO11n.
Runs on a Unitree G1 humanoid with an Intel RealSense D435I. On UNSAFE the
robot speaks a warning and plays the `reject` arm action.
## Layout

```
Saqr/
├── core/                    # detection + tracking + events (shared by CLI/GUI/bridge)
├── apps/                    # CLI modules (saqr_cli, detect_cli, train_cli, manager_cli, view_stream)
├── gui/                     # PySide6 desktop GUI (dev-machine only)
├── robot/                   # G1 bridge + DDS controller
├── utils/                   # logger, config loader
├── scripts/
│   ├── start_saqr.sh        # the single entry point
│   ├── saqr-bridge.service  # systemd unit (wraps start_saqr.sh)
│   └── deploy.sh            # push code dev machine → robot
├── config/                  # logging.json, core_config.json, robot_config.json
├── data/                    # dataset/, models/ (gitignored)
├── runtime/                 # captures/, runs/ (gitignored)
├── logs/                    # per-module .log files (gitignored)
├── docs/                    # DEPLOY.md, start.md, use_case_catalogue.pdf
├── pyproject.toml
└── README.md
```
## Quick start

The project root is auto-detected from `core/paths.py::PROJECT_ROOT` — drop
the `Saqr/` folder anywhere on disk and the code finds itself. Override
with `SAQR_ROOT=/custom/path` if needed.
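The auto-detection described above can be sketched as follows. This is a minimal illustration only — `resolve_project_root` is a hypothetical name, and the real `core/paths.py` may differ in details:

```python
import os
from pathlib import Path

def resolve_project_root() -> Path:
    # Env override wins, mirroring SAQR_ROOT=/custom/path.
    override = os.environ.get("SAQR_ROOT")
    if override:
        return Path(override).resolve()
    # Otherwise the package locates itself relative to this file:
    # .../Saqr/core/paths.py -> parents[1] == .../Saqr
    return Path(__file__).resolve().parents[1]
```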

## Run

The project only runs through **[scripts/start_saqr.sh](scripts/start_saqr.sh)**
(directly or under the `saqr-bridge` systemd unit):

```bash
# On the robot:
sudo systemctl start saqr-bridge    # production
# or
~/Saqr/scripts/start_saqr.sh        # foreground / debug
```

Then on the wireless remote:

- **R2 + X** → start detection
- **R2 + Y** → stop detection

See [docs/DEPLOY.md](docs/DEPLOY.md) for first-time deploy and
[docs/start.md](docs/start.md) for the systemd workflow.

## Deploy

From the dev machine:

```bash
scripts/deploy.sh          # rsync + pip install -e . in the robot's conda env
scripts/deploy.sh --ip …   # custom robot IP
```

## Configure

All tunable values live in JSON — no code edits needed:

- [config/core_config.json](config/core_config.json) — detection, tracking,
  camera, stream, training.
- [config/robot_config.json](config/robot_config.json) — bridge, TTS,
  phrases, arm actions, deploy, start_saqr defaults.
- [config/logging.json](config/logging.json) — log levels per category.

Precedence: **env var > config JSON > code fallback**. Most defaults can be
overridden via env vars without editing the files (`CONDA_ENV`,
`SAQR_SOURCE`, `STREAM_PORT`, `DDS_IFACE`, `ROBOT_IP`, …).
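That precedence chain can be sketched as a simple lookup. Names here (`get_setting`, the `STREAM_PORT` example) are illustrative, not the project's actual API:

```python
import os

def get_setting(config: dict, key: str, env_var: str, fallback):
    # 1. Environment variable wins when set.
    if env_var in os.environ:
        return os.environ[env_var]
    # 2. Then the value from the config JSON, if present.
    if key in config:
        return config[key]
    # 3. Finally the hard-coded fallback.
    return fallback
```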

After editing any JSON:

```bash
sudo systemctl restart saqr-bridge
```

## Data & Models

`data/` and `runtime/` are gitignored (too large). Download separately:

### `data/` — dataset and pre-trained weights

Expected contents:

```
data/
  dataset/
    train/{images,labels}/
    valid/{images,labels}/
    test/{images,labels}/
    data.yaml
  models/
    saqr_best.pt   # Saqr YOLO11n fine-tuned on PPE
    saqr_last.pt
    yolo11n.pt     # base YOLO11n
    yolo26n.pt     # base YOLO26n
```

Download:

- **Dataset** (PPE, Roboflow): [testcasque/ppe-detection-qlq3d](https://universe.roboflow.com/testcasque/ppe-detection-qlq3d)
  Open the Roboflow link → *Download Dataset* → format **YOLOv11** → unzip into `data/dataset/`.
- **Base YOLO weights**: [Ultralytics assets releases](https://github.com/ultralytics/assets/releases)
  Grab `yolo11n.pt` (and optionally `yolo26n.pt`) into `data/models/`.
- **Saqr fine-tuned weights** (`saqr_best.pt`, `saqr_last.pt`):
  Produced by training — see "Training" below. Or request from the maintainer.

Place everything under `data/` so the tree matches above.

### `runtime/` — training output (optional)

Auto-generated when you run training. Not required for inference.
Contains confusion matrices, PR curves, batch previews, and the raw weights
under `runtime/runs/train/saqr_det/weights/`.

### Training (dev machine, off-path)

Training is done on a workstation, not on the robot. It's a one-off, not
part of the normal run flow:

```bash
# after placing the dataset in data/dataset/ and base weights in data/models/
python -m apps.train_cli --epochs 100 --batch 16
# best weights land at data/models/saqr_best.pt; deploy with scripts/deploy.sh
```

Outputs land in `runtime/runs/train/saqr_det/`. Copy the best checkpoint to
`data/models/saqr_best.pt` to use it at inference time.

@@ -8,11 +8,14 @@ from pathlib import Path
import cv2
from ultralytics import YOLO

from core.detection import get_inference_config, set_inference_config
from core.model import resolve_model_path
from utils.config import load_config
from utils.logger import get_logger

log = get_logger("Inference", "detect")
_DET = load_config("core")["detection"]
_CAM = load_config("core")["camera"]

VIOLATION = {"no-helmet", "no-vest", "no-boots", "no-gloves", "no-goggles"}
COMPLIANT = {"helmet", "vest", "boots", "gloves", "goggles"}
@@ -97,12 +100,12 @@ def run_image(model, path, conf):

def main():
    parser = argparse.ArgumentParser(description="Saqr simple PPE detection")
    parser.add_argument("--source", default=_CAM["default_source"])
    parser.add_argument("--model", default=_DET["default_model"])
    parser.add_argument("--conf", type=float, default=_DET["conf"])
    parser.add_argument("--device", default=_DET["device"], help="'cpu', '0', 'cuda:0'")
    parser.add_argument("--half", action="store_true", default=_DET["half"])
    parser.add_argument("--imgsz", type=int, default=_DET["imgsz"])
    args = parser.parse_args()

    set_inference_config(device=args.device, half=args.half, imgsz=args.imgsz)
@@ -10,8 +10,8 @@ from pathlib import Path

import cv2

from core.paths import CAPTURES_DIR, PROJECT_ROOT
from utils.logger import get_logger

log = get_logger("Manager", "manager")

@@ -9,23 +9,28 @@ from typing import Dict
import cv2
from ultralytics import YOLO

from core.camera import RealSenseCapture, open_capture
from core.capture import setup_capture_dirs, setup_snapshot_dirs
from core.detection import set_inference_config
from core.drawing import draw_counters
from core.events import EventLogger, write_result_csv
from core.model import resolve_model_path
from core.paths import EVENTS_CSV, RESULT_CSV
from core.pipeline import process_frame
from core.streaming import start_stream_server, update_stream_frame
from core.tracking import PersonTracker
from utils.config import load_config
from utils.logger import get_logger

log = get_logger("Inference", "saqr")
_CORE = load_config("core")


def run_video(model, source, conf, capture_dirs: Dict[str, Path], show_gui,
              csv_interval, max_missing, match_distance, status_confirm_frames,
              *,
              snapshot_dirs=None, max_distance_m: float = 0.0,
              stream_port: int = 0):
    cap = open_capture(source)
    if not cap.isOpened():
        log.error(f"Cannot open source: {source}")
@@ -37,6 +42,9 @@ def run_video(model, source, conf, capture_dirs: Dict[str, Path], show_gui, csv_
        cap.release()
        return

    is_realsense = isinstance(cap, RealSenseCapture) and cap.has_depth
    depth_scale = cap.depth_scale if is_realsense else 0.001

    event_logger = EventLogger(EVENTS_CSV)
    tracker = PersonTracker(
        event_logger=event_logger,
@@ -48,7 +56,11 @@ def run_video(model, source, conf, capture_dirs: Dict[str, Path], show_gui, csv_
    if stream_port > 0:
        start_stream_server(stream_port)

    log.info(
        f"Session started | source={source} depth={is_realsense} "
        f"max_distance_m={max_distance_m if max_distance_m > 0 else 'off'} "
        f"csv_interval={csv_interval}"
    )
    if show_gui:
        print("Running - press q to quit, s to save frame.")

@@ -58,10 +70,16 @@ def run_video(model, source, conf, capture_dirs: Dict[str, Path], show_gui, csv_

    while True:
        frame_idx += 1
        depth_frame = cap.latest_depth if is_realsense else None
        write_csv_this_frame = csv_interval > 0 and (frame_idx % csv_interval == 0)

        try:
            annotated, visible = process_frame(
                frame, model, tracker, frame_idx, conf,
                capture_dirs, write_csv=write_csv_this_frame,
                snapshot_dirs=snapshot_dirs,
                depth_frame=depth_frame, depth_scale=depth_scale,
                max_distance_m=max_distance_m,
            )
        except Exception as e:
            log.exception(f"Frame error #{frame_idx}: {e}")
@@ -94,11 +112,12 @@ def run_video(model, source, conf, capture_dirs: Dict[str, Path], show_gui, csv_
    if show_gui:
        cv2.destroyAllWindows()

    # Always write final state on exit so the last tracked people are recorded.
    write_result_csv(list(tracker.tracks.values()), RESULT_CSV)
    log.info(f"Session ended | frames={frame_idx} tracks_created={tracker.next_id - 1}")


def run_image(model, path, conf, capture_dirs: Dict[str, Path], show_gui, snapshot_dirs=None):
    frame = cv2.imread(path)
    if frame is None:
        log.error(f"Cannot read image: {path}")
@@ -107,7 +126,10 @@ def run_image(model, path, conf, capture_dirs: Dict[str, Path], show_gui):
    event_logger = EventLogger(EVENTS_CSV)
    tracker = PersonTracker(event_logger=event_logger)

    annotated, visible = process_frame(
        frame, model, tracker, 1, conf, capture_dirs,
        snapshot_dirs=snapshot_dirs,
    )
    draw_counters(annotated, visible, 0.0)

    out = Path(path).stem + "_saqr.jpg"
@@ -122,20 +144,32 @@ def run_image(model, path, conf, capture_dirs: Dict[str, Path], show_gui):

def main():
    parser = argparse.ArgumentParser(description="Saqr PPE detection with tracking")

    det = _CORE["detection"]
    trk = _CORE["tracking"]
    cam = _CORE["camera"]
    cap_cfg = _CORE["capture"]

    parser.add_argument("--source", default=cam["default_source"],
                        help="0/1=webcam, realsense, realsense:SERIAL, /dev/videoX, or video path")
    parser.add_argument("--model", default=det["default_model"],
                        help="Trained YOLO weights (resolved under data/models/ by default)")
    parser.add_argument("--conf", type=float, default=det["conf"])
    parser.add_argument("--max-missing", type=int, default=trk["max_missing"])
    parser.add_argument("--match-distance", type=float, default=trk["match_distance"])
    parser.add_argument("--status-confirm-frames", type=int, default=trk["status_confirm_frames"])
    parser.add_argument("--max-distance-m", type=float, default=det.get("max_distance_m", 0.0),
                        help="RealSense-only: drop candidates beyond this depth (0 = off)")
    parser.add_argument("--headless", action="store_true")
    parser.add_argument("--stream", type=int, default=0, metavar="PORT")
    parser.add_argument("--csv-interval", type=int, default=trk.get("csv_write_every_n_frames", 30),
                        help="Write result.csv every N frames (0 = only on exit)")
    parser.add_argument("--csv-on-exit", action="store_true",
                        help="Alias for --csv-interval 0")
    parser.add_argument("--no-snapshots", action="store_true",
                        help="Disable full-frame snapshot on transitions")
    parser.add_argument("--device", default=det["device"])
    parser.add_argument("--half", action="store_true", default=det["half"])
    parser.add_argument("--imgsz", type=int, default=det["imgsz"])
    args = parser.parse_args()

    set_inference_config(device=args.device, half=args.half, imgsz=args.imgsz)
@@ -154,12 +188,17 @@ def main():
        log.warning("PyTorch not found")

    capture_dirs = setup_capture_dirs()
    snapshot_dirs = None
    if cap_cfg.get("save_event_snapshot", True) and not args.no_snapshots:
        snapshot_dirs = setup_snapshot_dirs()

    csv_interval = 0 if args.csv_on_exit else max(0, args.csv_interval)

    try:
        model_path = resolve_model_path(args.model)
    except FileNotFoundError as e:
        log.error(str(e))
        log.error("Train first: python -m apps.train_cli --dataset data/dataset")
        raise SystemExit(1)

    log.info(f"Loading model: {model_path}")
@@ -176,14 +215,17 @@ def main():
        run_video(
            model, source, args.conf, capture_dirs,
            show_gui=not args.headless,
            csv_interval=csv_interval,
            max_missing=args.max_missing,
            match_distance=args.match_distance,
            status_confirm_frames=args.status_confirm_frames,
            snapshot_dirs=snapshot_dirs,
            max_distance_m=args.max_distance_m,
            stream_port=args.stream,
        )
    elif Path(source).exists():
        run_image(model, source, args.conf, capture_dirs,
                  show_gui=not args.headless, snapshot_dirs=snapshot_dirs)
    else:
        log.error(f"Source not found: {source}")
        raise SystemExit(1)
@@ -7,10 +7,12 @@ from pathlib import Path

import yaml

from core.paths import DATASET_DIR, MODELS_DIR, PROJECT_ROOT, RUNS_DIR
from utils.config import load_config
from utils.logger import get_logger

log = get_logger("Training", "train")
_TRAIN = load_config("core")["training"]

EXPECTED_CLASSES = [
    "boots", "gloves", "goggles", "helmet", "no-boots",
@@ -52,13 +54,13 @@ def main():
    parser = argparse.ArgumentParser(description="Train Saqr PPE detector (YOLO11n)")
    parser.add_argument("--dataset", default=str(DATASET_DIR),
                        help="Root folder containing data.yaml + train/valid/test")
    parser.add_argument("--epochs", type=int, default=_TRAIN["epochs"])
    parser.add_argument("--imgsz", type=int, default=_TRAIN["imgsz"])
    parser.add_argument("--batch", type=int, default=_TRAIN["batch"])
    parser.add_argument("--model", default=_TRAIN["base_model"],
                        help="Base YOLO model (auto-downloaded if not present)")
    parser.add_argument("--name", default=_TRAIN["run_name"])
    parser.add_argument("--device", default=_TRAIN["device"])
    args = parser.parse_args()

    dataset_root = Path(args.dataset)
@@ -1,15 +1,20 @@
"""View the robot's MJPEG stream from a laptop."""
from __future__ import annotations

import argparse

import cv2

from utils.config import load_config


def main():
    default_ip = load_config("robot")["deploy"]["robot_ip"]
    default_port = load_config("core")["stream"]["port"]

    parser = argparse.ArgumentParser(description="View Saqr PPE stream from robot")
    parser.add_argument("--ip", default=default_ip, help="Robot IP address")
    parser.add_argument("--port", default=str(default_port), help="Stream port")
    args = parser.parse_args()

    url = f"http://{args.ip}:{args.port}/stream"
assets/audio/README.md (new file, 88 lines)
@@ -0,0 +1,88 @@
# Pre-recorded audio library

WAV clips played via `AudioClient.PlayStream` on the G1 speaker. Bypassing
`TtsMaker` saves ~200–700 ms of firmware synthesis buffer per announcement
and eliminates error 3104 ("device busy").

## Required format

Every file **must** be:

- **16 kHz** sample rate
- **mono** (1 channel)
- **16-bit signed PCM** (`int16`)
- `.wav` container

Any file not matching this is logged as a warning and skipped (the bridge
falls back to `TtsMaker` for that phrase per `config/robot_config.json.tts.mode`).

## Expected files

Drop WAVs under the right `<category>/<key>.wav` path so `AudioPlayer` finds them:

```
fixed/
  ready.wav            "Saqr is running. Press R2 plus X to start."
  deactivated.wav      "Saqr deactivated."
  no_camera.wav        "Camera not connected. Please plug in the camera and try again."
  safe.wav             "Safe to enter. Have a good day."
  unsafe_generic.wav   "Please stop. Wear your proper safety equipment."

unsafe_missing/
  helmet.wav       "Please stop. Wear your proper safety equipment. You are missing helmet."
  vest.wav         "Please stop. Wear your proper safety equipment. You are missing vest."
  helmet_vest.wav  "Please stop. Wear your proper safety equipment. You are missing helmet and vest."
```

**Naming rule for `unsafe_missing/`**: the `<key>` is the missing PPE names
sorted alphabetically and joined with `_`. So if someone is missing both helmet
and vest, the bridge looks up `unsafe_missing/helmet_vest.wav` (not
`vest_helmet.wav`). If you extend `compliance.required_ppe` later, add clips
for every non-empty subset — for 3 required items that's 7 combinations (2³−1).
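The naming rule above can be sketched in a few lines. The helper names (`clip_key`, `all_clip_keys`) are illustrative, not the bridge's actual API:

```python
from itertools import combinations

def clip_key(missing: set) -> str:
    # Sort alphabetically and join with "_" -> deterministic filename key.
    return "_".join(sorted(missing))

def all_clip_keys(required_ppe: list) -> list:
    # Every non-empty subset of required_ppe needs its own WAV clip.
    keys = []
    for n in range(1, len(required_ppe) + 1):
        for combo in combinations(sorted(required_ppe), n):
            keys.append("_".join(combo))
    return keys
```

For example, `clip_key({"vest", "helmet"})` gives `helmet_vest`, and with three required items `all_clip_keys` enumerates all 7 combinations.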

## Converting existing recordings

If your source file is at a different sample rate or stereo, convert with
`ffmpeg`:

```bash
ffmpeg -y -i input.m4a -ac 1 -ar 16000 -sample_fmt s16 fixed/safe.wav
```

Validate with:

```bash
python3 - <<'EOF'
import wave
with wave.open("fixed/safe.wav", "rb") as wf:
    print(wf.getframerate(), "Hz,", wf.getnchannels(), "ch,", wf.getsampwidth()*8, "bit")
EOF
# must print: 16000 Hz, 1 ch, 16 bit
```

## Recording tips

- Quiet room; no echo.
- Don't clip — keep peaks below 0 dBFS.
- Leave ~100 ms of silence at the start and end so the clip doesn't pop.
- Target speaking rate: ~3 syllables/sec. The shortest clip (`deactivated`)
  should be ~2 s; the longest (`no_camera`) around 5–6 s.

## Runtime behaviour

In `config/robot_config.json`:

```json
"tts": {
  "mode": "recorded_or_tts"
}
```

Modes:

- `recorded_or_tts` — play the WAV if the clip exists, otherwise fall back to `TtsMaker`.
- `recorded_only` — play the WAV or stay silent. Useful for demos where you want
  deterministic audio. Skips any phrase whose clip is missing.
- `tts_only` — ignore the WAV library entirely (the legacy behaviour).
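A sketch of how the three modes could resolve a phrase; the function and return shape are hypothetical, not the bridge's real code:

```python
from pathlib import Path

def resolve_audio(mode: str, clip: Path, phrase: str):
    """Return ("wav", path), ("tts", phrase), or ("silent", None)."""
    if mode == "tts_only":
        return ("tts", phrase)          # ignore the WAV library entirely
    if clip.exists():
        return ("wav", clip)            # recorded clip available: play it
    # Clip missing: recorded_or_tts falls back, recorded_only stays silent.
    if mode == "recorded_or_tts":
        return ("tts", phrase)
    return ("silent", None)
```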

After adding or replacing WAVs, restart the bridge to reload the library
(`sudo systemctl restart saqr-bridge`).
BIN assets/audio/fixed/deactivated.wav (new file; binary not shown)
BIN assets/audio/fixed/no_camera.wav (new file; binary not shown)
BIN assets/audio/fixed/ready.wav (new file; binary not shown)
BIN assets/audio/fixed/safe.wav (new file; binary not shown)
BIN assets/audio/fixed/unsafe_generic.wav (new file; binary not shown)
BIN assets/audio/unsafe_missing/helmet.wav (new file; binary not shown)
BIN assets/audio/unsafe_missing/helmet_vest.wav (new file; binary not shown)
BIN assets/audio/unsafe_missing/vest.wav (new file; binary not shown)
config/core_config.json (new file, 44 lines)
@@ -0,0 +1,44 @@
{
  "detection": {
    "conf": 0.35,
    "device": "0",
    "half": false,
    "imgsz": 320,
    "default_model": "saqr_best.pt",
    "max_distance_m": 3.0
  },
  "tracking": {
    "max_missing": 90,
    "match_distance": 250.0,
    "status_confirm_frames": 5,
    "csv_write_every_n_frames": 30,
    "stationary_check_frames": 15,
    "stationary_tolerance_px": 40.0
  },
  "compliance": {
    "required_ppe": ["helmet", "vest"]
  },
  "capture": {
    "save_event_snapshot": true
  },
  "camera": {
    "width": 640,
    "height": 480,
    "fps": 30,
    "default_source": "0",
    "enable_depth": true
  },
  "stream": {
    "host": "0.0.0.0",
    "port": 8080,
    "jpeg_quality": 70
  },
  "training": {
    "epochs": 100,
    "imgsz": 640,
    "batch": 16,
    "device": "0",
    "base_model": "yolo11n.pt",
    "run_name": "saqr_det"
  }
}
config/robot_config.json (new file, 51 lines)
@@ -0,0 +1,51 @@
{
  "bridge": {
    "iface": "eth0",
    "timeout": 10.0,
    "cooldown": 8.0,
    "release_after": 0.5,
    "audio_lead_s": 0.3,
    "quick_fail_window": 8.0,
    "trigger_poll_hz": 50.0
  },
  "tts": {
    "mode": "recorded_only",
    "speaker_id": 2,
    "volume": 100,
    "seconds_per_char": 0.12,
    "min_seconds": 2.5,
    "queue_max": 4,
    "busy_factor": {
      "min": 1.0,
      "max": 2.5,
      "up": 1.20,
      "down": 0.97
    }
  },
  "phrases": {
    "safe": "Safe to enter. Have a good day.",
    "unsafe_generic": "Please stop. Wear your proper safety equipment.",
    "unsafe_missing": "Please stop. Wear your proper safety equipment. You are missing {items}.",
    "deactivated": "Saqr deactivated.",
    "ready": "Saqr is running. Press R2 plus X to start.",
    "no_camera": "Camera not connected. Please plug in the camera and try again."
  },
  "arm": {
    "reject_action": "reject",
    "release_action": "release arm"
  },
  "deploy": {
    "robot_ip": "192.168.123.164",
    "robot_user": "unitree",
    "robot_env": "marcus",
    "remote_dir": "/home/unitree/Saqr",
    "ssh_opts": "-o StrictHostKeyChecking=no -o ConnectTimeout=10"
  },
  "start_saqr": {
    "conda_root": "$HOME/miniconda3",
    "conda_env": "marcus",
    "dds_iface": "eth0",
    "saqr_source": "realsense",
    "stream_port": 8080
  }
}
core/camera.py (new file, 157 lines)
@@ -0,0 +1,157 @@
"""Camera backends: RealSense SDK (with optional aligned depth) and OpenCV V4L2.
|
||||
|
||||
RealSenseCapture exposes the latest depth frame via ``latest_depth`` (a
|
||||
numpy uint16 array in millimetres) and ``depth_scale`` (meters per raw unit).
|
||||
Non-RealSense captures leave ``latest_depth = None`` — callers must guard.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Optional
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
from utils.config import load_config
|
||||
from utils.logger import get_logger
|
||||
|
||||
log = get_logger("Inference", "camera")
|
||||
_CFG = load_config("core")["camera"]
|
||||
|
||||
try:
|
||||
import pyrealsense2 as rs
|
||||
HAS_REALSENSE = True
|
||||
except ImportError:
|
||||
HAS_REALSENSE = False
|
||||
|
||||
|
||||
class RealSenseCapture:
    """pyrealsense2 pipeline with an OpenCV-like read() interface.

    When ``enable_depth`` is true (default from core_config.camera.enable_depth),
    the pipeline also streams depth and aligns each frame to the color view.
    The aligned depth frame is stored on ``self.latest_depth`` as a numpy array.
    """

    def __init__(self, width: int = _CFG["width"], height: int = _CFG["height"],
                 fps: int = _CFG["fps"], serial: Optional[str] = None,
                 enable_depth: bool = _CFG.get("enable_depth", True)):
        if not HAS_REALSENSE:
            raise RuntimeError("pyrealsense2 not installed")
        self.pipeline = rs.pipeline()
        cfg = rs.config()
        if serial:
            cfg.enable_device(serial)
        cfg.enable_stream(rs.stream.color, width, height, rs.format.bgr8, fps)

        self.has_depth = bool(enable_depth)
        self._align = None
        self.depth_scale = 0.001  # metres per raw unit (D435 default: 1 mm); overwritten below

        if self.has_depth:
            cfg.enable_stream(rs.stream.depth, width, height, rs.format.z16, fps)

        self.profile = self.pipeline.start(cfg)
        self._open = True

        if self.has_depth:
            try:
                depth_sensor = self.profile.get_device().first_depth_sensor()
                self.depth_scale = float(depth_sensor.get_depth_scale())
                self._align = rs.align(rs.stream.color)
            except Exception as e:
                log.warning(f"Depth init failed ({e}); disabling depth")
                self.has_depth = False
                self._align = None

        self.latest_depth: Optional[np.ndarray] = None

        dev = self.profile.get_device()
        log.info(
            f"RealSense opened | {dev.get_info(rs.camera_info.name)} "
            f"serial={dev.get_info(rs.camera_info.serial_number)} "
            f"{width}x{height}@{fps} depth={self.has_depth}"
        )

    def isOpened(self) -> bool:
        return self._open

    def read(self):
        if not self._open:
            return False, None
        try:
            frames = self.pipeline.wait_for_frames(timeout_ms=3000)
            if self._align is not None:
                frames = self._align.process(frames)
            color = frames.get_color_frame()
            if not color:
                return False, None
            if self.has_depth:
                depth = frames.get_depth_frame()
                self.latest_depth = (
                    np.asanyarray(depth.get_data()) if depth else None
                )
            return True, np.asanyarray(color.get_data())
        except Exception:
            return False, None

    def release(self):
        if self._open:
            self.pipeline.stop()
            self._open = False


def open_capture(source: str):
    if source.lower().startswith("realsense"):
        serial = None
        if ":" in source:
            serial = source.split(":", 1)[1]
        return RealSenseCapture(
            width=_CFG["width"], height=_CFG["height"], fps=_CFG["fps"],
            serial=serial,
        )

    if str(source).isdigit():
        idx = int(source)
        cap = cv2.VideoCapture(idx)
        if cap.isOpened():
            return cap
        cap = cv2.VideoCapture(idx, cv2.CAP_ANY)
        if cap.isOpened():
            return cap
        cap = cv2.VideoCapture(idx, cv2.CAP_V4L2)
        return cap

    if source.startswith("/dev/video"):
        cap = cv2.VideoCapture(source, cv2.CAP_V4L2)
        if cap.isOpened():
            cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)
            cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
            cap.set(cv2.CAP_PROP_FRAME_WIDTH, _CFG["width"])
            cap.set(cv2.CAP_PROP_FRAME_HEIGHT, _CFG["height"])
            cap.set(cv2.CAP_PROP_FPS, _CFG["fps"])
        return cap

    return cv2.VideoCapture(source)


def estimate_person_distance_m(depth_frame, bbox, depth_scale: float) -> Optional[float]:
    """Median distance in metres inside ``bbox`` on the depth frame.

    Returns ``None`` if depth is unavailable or the bbox falls outside the
    valid area. Ignores zero-depth pixels (RealSense's "no return" marker).
    """
    if depth_frame is None:
        return None
    h, w = depth_frame.shape[:2]
    x1, y1, x2, y2 = bbox
    x1 = max(0, min(int(x1), w - 1))
    x2 = max(0, min(int(x2), w))
    y1 = max(0, min(int(y1), h - 1))
    y2 = max(0, min(int(y2), h))
    if x2 <= x1 or y2 <= y1:
        return None
    roi = depth_frame[y1:y2, x1:x2]
    valid = roi[roi > 0]
    if valid.size == 0:
        return None
    return float(np.median(valid) * depth_scale)
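The median-over-ROI estimate above can be sanity-checked offline with a synthetic depth frame. A minimal sketch (NumPy only; the helper is re-stated inline so it runs without pyrealsense2, and the `0.001` scale mirrors the D435's 1 mm-per-unit default):

```python
import numpy as np

def estimate_person_distance_m(depth_frame, bbox, depth_scale):
    """Median distance in metres inside bbox; zero pixels (no IR return) ignored."""
    h, w = depth_frame.shape[:2]
    x1, y1, x2, y2 = [int(v) for v in bbox]
    roi = depth_frame[max(0, y1):min(h, y2), max(0, x1):min(w, x2)]
    valid = roi[roi > 0]
    if valid.size == 0:
        return None
    return float(np.median(valid) * depth_scale)

# Synthetic 480x640 z16 frame: background at 4 m, a "person" patch at 1.5 m,
# plus a corner of dropout pixels (0 = no return) that must not skew the median.
depth = np.full((480, 640), 4000, dtype=np.uint16)
depth[100:400, 200:350] = 1500
depth[100:150, 200:250] = 0          # simulated IR dropout

d = estimate_person_distance_m(depth, (200, 100, 350, 400), 0.001)
print(d)                              # → 1.5
```

Using the median rather than the mean is what makes the mixed background/dropout pixels harmless here.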
67
core/capture.py
Normal file
@ -0,0 +1,67 @@
"""Per-track image cropping, capture dirs, and full-frame event snapshots."""
from __future__ import annotations

from datetime import datetime
from pathlib import Path
from typing import Dict, Optional

import cv2

from core.detection import STATUSES
from core.geometry import clamp_bbox
from core.paths import CAPTURES_DIR, SNAPSHOTS_DIR


def setup_capture_dirs() -> Dict[str, Path]:
    """runtime/captures/{SAFE,PARTIAL,UNSAFE}/ — latest crop per track."""
    dirs: Dict[str, Path] = {}
    for s in STATUSES:
        d = CAPTURES_DIR / s
        d.mkdir(parents=True, exist_ok=True)
        dirs[s] = d
    return dirs


def setup_snapshot_dirs() -> Dict[str, Path]:
    """runtime/snapshots/{SAFE,PARTIAL,UNSAFE}/ — annotated full frame per transition."""
    dirs: Dict[str, Path] = {}
    for s in STATUSES:
        d = SNAPSHOTS_DIR / s
        d.mkdir(parents=True, exist_ok=True)
        dirs[s] = d
    return dirs


def save_track_image(frame, track, capture_dirs: Dict[str, Path]) -> Optional[Path]:
    """Save the latest crop for a track. Overwritten each frame."""
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = clamp_bbox(track.bbox, w, h)
    if x2 <= x1 or y2 <= y1:
        return None
    crop = frame[y1:y2, x1:x2]
    if crop.size == 0:
        return None

    target = capture_dirs[track.status] / f"track_{track.track_id:04d}.jpg"
    if track.photo_path and track.photo_path != target and track.photo_path.exists():
        try:
            track.photo_path.unlink()
        except OSError:
            pass

    cv2.imwrite(str(target), crop)
    track.photo_path = target
    return target


def save_event_snapshot(annotated_frame, track, snapshot_dirs: Dict[str, Path]) -> Optional[Path]:
    """Save the full annotated frame at the moment of a NEW / STATUS_CHANGE event.

    Timestamped filename so a history is preserved across events.
    """
    if annotated_frame is None or annotated_frame.size == 0:
        return None
    ts = datetime.now().strftime("%Y%m%d_%H%M%S_%f")[:-3]
    target = snapshot_dirs[track.status] / f"track_{track.track_id:04d}_{ts}.jpg"
    cv2.imwrite(str(target), annotated_frame)
    return target
55
core/compliance.py
Normal file
@ -0,0 +1,55 @@
"""Binary SAFE / UNSAFE classification — every item in REQUIRED_PPE must be worn.

The required list comes from ``config/core_config.json::compliance.required_ppe``
so it can be retuned without a code change.
"""
from __future__ import annotations

from typing import Dict, List, Tuple

from core.detection import POSITIVE_TO_NEGATIVE, PPE_DISPLAY_ORDER
from utils.config import load_config

# PPE items required for SAFE status. Other items (gloves/goggles/boots) are
# still detected and logged but don't affect compliance.
REQUIRED_PPE: Tuple[str, ...] = tuple(
    load_config("core").get("compliance", {}).get("required_ppe", ["helmet", "vest"])
)


def _is_wearing(items: Dict[str, float], ppe: str) -> bool:
    pos = items.get(ppe, 0.0)
    neg = items.get(POSITIVE_TO_NEGATIVE[ppe], 0.0)
    return pos > neg and pos > 0


def status_from_items(items: Dict[str, float]) -> str:
    """SAFE only when every required item is confidently worn."""
    if all(_is_wearing(items, p) for p in REQUIRED_PPE):
        return "SAFE"
    return "UNSAFE"


def split_wearing_missing(items: Dict[str, float]) -> Tuple[List[str], List[str], List[str]]:
    """Bucket each PPE item into wearing / missing / unknown.

    ``missing`` only contains REQUIRED items that aren't confidently worn —
    this is what the TTS announcement keys off of, so it maps cleanly to a
    recorded clip. Non-required items (gloves/goggles/boots) are never in
    ``missing`` regardless of whether their ``no-X`` class was detected; they
    go to ``unknown`` so they're still visible in the event log.
    """
    wearing, missing, unknown = [], [], []
    for pos in PPE_DISPLAY_ORDER:
        neg = POSITIVE_TO_NEGATIVE[pos]
        pos_conf = items.get(pos, 0.0)
        neg_conf = items.get(neg, 0.0)

        worn = pos_conf > neg_conf and pos_conf > 0
        if worn:
            wearing.append(pos)
        elif pos in REQUIRED_PPE:
            missing.append(pos)
        else:
            unknown.append(pos)
    return wearing, missing, unknown
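The wear-vs-no-wear vote is easiest to see on a toy `items` dict. A standalone sketch — the mappings below are re-stated inline as illustrative values; the real ones come from `core.detection` and `core_config.json`:

```python
# Hypothetical stand-ins for core.detection constants / the config-driven list.
POSITIVE_TO_NEGATIVE = {"helmet": "no-helmet", "vest": "no-vest"}
REQUIRED_PPE = ("helmet", "vest")

def is_wearing(items, ppe):
    # Positive class must both outvote its negative class and be present at all.
    pos = items.get(ppe, 0.0)
    neg = items.get(POSITIVE_TO_NEGATIVE[ppe], 0.0)
    return pos > neg and pos > 0

def status_from_items(items):
    return "SAFE" if all(is_wearing(items, p) for p in REQUIRED_PPE) else "UNSAFE"

print(status_from_items({"helmet": 0.9, "vest": 0.8}))                    # → SAFE
print(status_from_items({"helmet": 0.4, "no-helmet": 0.6, "vest": 0.8}))  # → UNSAFE
```

Note the second call: the helmet *was* detected at 0.4, but `no-helmet` outvotes it, so the verdict flips to UNSAFE.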
@ -5,8 +5,8 @@ from typing import List, Tuple

import cv2

from saqr.core.compliance import split_wearing_missing
from saqr.core.detection import STATUSES
from core.compliance import split_wearing_missing
from core.detection import STATUSES

GREEN = (0, 200, 0)
YELLOW = (0, 200, 255)
@ -1,6 +1,6 @@
"""Event-line emission and CSV writers.

The ``emit_event`` output line is a contract with ``saqr.robot.bridge`` — its
The ``emit_event`` output line is a contract with ``robot.bridge`` — its
regex parses this exact format. Don't change the field order without updating
the bridge.
"""
@ -11,8 +11,8 @@ from datetime import datetime
from pathlib import Path
from typing import Dict, List

from saqr.core.compliance import split_wearing_missing
from saqr.core.detection import CLASS_ORDER
from core.compliance import split_wearing_missing
from core.detection import CLASS_ORDER


def now_iso() -> str:
@ -21,7 +21,7 @@ def now_iso() -> str:

class EventLogger:
    FIELDS = ["timestamp", "track_id", "event_type", "status",
              "wearing", "missing", "unknown", "photo", "path"]
              "wearing", "missing", "unknown", "photo", "path", "snapshot"]

    def __init__(self, path: Path):
        self.path = path
@ -63,15 +63,16 @@ def write_result_csv(tracks: List, output: Path) -> None:


def emit_event(track, event_logger: EventLogger, event_type: str = "STATUS_CHANGE",
               force: bool = False) -> None:
               force: bool = False, snapshot_path: Path = None) -> None:
    if track.photo_path is None:
        return
    if not force and track.announced_status == track.status:
        return

    wearing, missing, unknown = split_wearing_missing(track.items)
    ts = datetime.now().strftime("%H:%M:%S.%f")[:-3]
    msg = (
        f"ID {track.track_id:04d} | {event_type} | {track.status} | "
        f"[{ts}] ID {track.track_id:04d} | {event_type} | {track.status} | "
        f"wearing: {', '.join(wearing) or 'none'} | "
        f"missing: {', '.join(missing) or 'none'} | "
        f"unknown: {', '.join(unknown) or 'none'}"
@ -88,6 +89,7 @@ def emit_event(track, event_logger: EventLogger, event_type: str = "STATUS_CHANG
        "unknown": ", ".join(unknown),
        "photo": track.photo_path.name if track.photo_path else "",
        "path": str(track.photo_path) if track.photo_path else "",
        "snapshot": str(snapshot_path) if snapshot_path else "",
    })
    track.announced_status = track.status
    track.event_count += 1
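Since the pipe-delimited line is the contract the bridge's regex parses, it helps to see one rendered end to end. A sketch with illustrative field values (the track/status data is made up; only the format mirrors `emit_event`):

```python
from datetime import datetime

# Hypothetical event data for one track.
track_id, event_type, status = 7, "STATUS_CHANGE", "UNSAFE"
wearing, missing, unknown = ["vest"], ["helmet"], ["gloves", "goggles", "boots"]

ts = datetime.now().strftime("%H:%M:%S.%f")[:-3]   # wall-clock with ms
msg = (
    f"[{ts}] ID {track_id:04d} | {event_type} | {status} | "
    f"wearing: {', '.join(wearing) or 'none'} | "
    f"missing: {', '.join(missing) or 'none'} | "
    f"unknown: {', '.join(unknown) or 'none'}"
)
print(msg)
# e.g. [14:02:31.118] ID 0007 | STATUS_CHANGE | UNSAFE | wearing: vest | missing: helmet | unknown: gloves, goggles, boots
```

Empty buckets render as the literal word `none`, so the bridge always sees the same number of `|`-separated fields.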
@ -4,8 +4,8 @@ from __future__ import annotations
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

from saqr.core.detection import PPEItem
from saqr.core.geometry import box_distance, expand_bbox, merge_boxes
from core.detection import PPEItem
from core.geometry import box_distance, expand_bbox, merge_boxes


@dataclass
@ -3,7 +3,7 @@ from __future__ import annotations

from pathlib import Path

from saqr.core.paths import MODELS_DIR, PROJECT_ROOT
from core.paths import MODELS_DIR, PROJECT_ROOT


def resolve_model_path(model_arg: str) -> Path:
28
core/paths.py
Normal file
@ -0,0 +1,28 @@
"""Canonical project paths, resolved dynamically at import time.

By default the project root is the parent of the ``core/`` package —
whatever filesystem location ``Saqr/`` currently lives at. Override by
setting the ``SAQR_ROOT`` environment variable.
"""
from __future__ import annotations

import os
from pathlib import Path

_DEFAULT_ROOT = Path(__file__).resolve().parents[1]
PROJECT_ROOT = Path(os.environ.get("SAQR_ROOT", _DEFAULT_ROOT)).resolve()

CONFIG_DIR = PROJECT_ROOT / "config"
DATA_DIR = PROJECT_ROOT / "data"
DATASET_DIR = DATA_DIR / "dataset"
MODELS_DIR = DATA_DIR / "models"

RUNTIME_DIR = PROJECT_ROOT / "runtime"
CAPTURES_DIR = RUNTIME_DIR / "captures"
SNAPSHOTS_DIR = RUNTIME_DIR / "snapshots"
RUNS_DIR = RUNTIME_DIR / "runs"

LOGS_DIR = PROJECT_ROOT / "logs"

RESULT_CSV = CAPTURES_DIR / "result.csv"
EVENTS_CSV = CAPTURES_DIR / "events.csv"
79
core/pipeline.py
Normal file
@ -0,0 +1,79 @@
"""Per-frame detect + group + depth-gate + track + capture + emit pipeline.

Announcements are gated on two conditions:
* the track is currently stationary (``is_stationary()``), so walkers-by
  never trigger a reject/TTS — only people planted at the checkpoint do;
* the current status differs from the last-announced status for the track,
  so we never repeat the same announcement.
"""
from __future__ import annotations

from pathlib import Path
from typing import Dict, Optional

from core.camera import estimate_person_distance_m
from core.capture import save_event_snapshot, save_track_image
from core.detection import collect_detections
from core.drawing import draw_track
from core.events import emit_event, write_result_csv
from core.grouping import group_detections_to_people
from core.paths import RESULT_CSV
from core.tracking import PersonTracker


def _filter_by_depth(candidates, depth_frame, depth_scale: float, max_distance_m: float):
    """Drop candidates whose median depth exceeds ``max_distance_m``.

    Fail-open: candidates with no valid depth pass through. ``max_distance_m``
    <= 0 disables the filter entirely.
    """
    if depth_frame is None or max_distance_m <= 0 or not candidates:
        return candidates
    kept = []
    for cand in candidates:
        d = estimate_person_distance_m(depth_frame, cand.bbox, depth_scale)
        if d is None or d <= max_distance_m:
            cand.distance_m = d
            kept.append(cand)
    return kept


def process_frame(frame, model, tracker: PersonTracker, frame_idx: int, conf: float,
                  capture_dirs: Dict[str, Path], write_csv: bool = True,
                  *,
                  snapshot_dirs: Optional[Dict[str, Path]] = None,
                  depth_frame=None, depth_scale: float = 0.001,
                  max_distance_m: float = 0.0):
    annotated = frame.copy()
    h, w = annotated.shape[:2]

    detections = collect_detections(frame, model, conf)
    candidates = group_detections_to_people(detections, w, h)
    candidates = _filter_by_depth(candidates, depth_frame, depth_scale, max_distance_m)

    tracker.update(candidates, frame_idx)
    visible = tracker.visible_tracks()

    # Pass 1: save crops + draw annotations so snapshots capture the full scene.
    for track in visible:
        save_track_image(frame, track, capture_dirs)
        draw_track(annotated, track)

    # Pass 2: emit events ONLY for stationary tracks whose current status
    # hasn't been announced yet. Walkers-by never reach this branch because
    # their bbox-center history shows movement above the tolerance.
    for track in visible:
        if track.announced_status == track.status:
            continue
        if not track.is_stationary():
            continue
        ev_type = "NEW" if track.announced_status is None else "STATUS_CHANGE"
        snap_path = None
        if snapshot_dirs is not None:
            snap_path = save_event_snapshot(annotated, track, snapshot_dirs)
        emit_event(track, tracker.event_logger, ev_type, snapshot_path=snap_path)

    if write_csv:
        write_result_csv(list(tracker.tracks.values()), RESULT_CSV)

    return annotated, visible
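The fail-open behaviour of the depth gate is the subtle part: a candidate with no valid depth reading is kept, and a non-positive `max_distance_m` disables the gate entirely. A standalone sketch with hard-coded distances standing in for `estimate_person_distance_m()` (the `(name, dist)` tuples are illustrative, not the real candidate objects):

```python
def filter_by_depth(candidates, max_distance_m):
    """Keep candidates at or inside max_distance_m; unknown depth passes (fail-open)."""
    if max_distance_m <= 0 or not candidates:
        return candidates                      # gate disabled → everything passes
    kept = []
    for name, dist in candidates:
        if dist is None or dist <= max_distance_m:   # None = no valid depth → keep
            kept.append((name, dist))
    return kept

people = [("near", 1.2), ("far", 6.5), ("no-depth", None)]
print(filter_by_depth(people, 3.0))   # → [('near', 1.2), ('no-depth', None)]
print(filter_by_depth(people, 0.0))   # gate disabled → all three pass
```

Failing open means a flaky depth stream degrades to plain 2D detection instead of silently dropping everyone.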
@ -8,9 +8,11 @@ from typing import Optional

import cv2

from saqr.utils.logger import get_logger
from utils.config import load_config
from utils.logger import get_logger

log = get_logger("Inference", "streaming")
_CFG = load_config("core")["stream"]

_stream_frame: Optional[bytes] = None
_stream_lock = threading.Lock()
@ -47,16 +49,18 @@ class MJPEGHandler(BaseHTTPRequestHandler):
            pass


def start_stream_server(port: int = 8080):
    server = HTTPServer(("0.0.0.0", port), MJPEGHandler)
def start_stream_server(port: Optional[int] = None, host: Optional[str] = None):
    h = host or _CFG["host"]
    p = port if port is not None else _CFG["port"]
    server = HTTPServer((h, p), MJPEGHandler)
    t = threading.Thread(target=server.serve_forever, daemon=True)
    t.start()
    log.info(f"MJPEG stream server started on http://0.0.0.0:{port}")
    log.info(f"MJPEG stream server started on http://{h}:{p}")
    return server


def update_stream_frame(frame):
    global _stream_frame
    _, jpeg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 70])
    _, jpeg = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, _CFG["jpeg_quality"]])
    with _stream_lock:
        _stream_frame = jpeg.tobytes()
@ -1,14 +1,24 @@
"""Per-person Track dataclass and the greedy nearest-match PersonTracker."""
"""Per-person Track dataclass and the greedy nearest-match PersonTracker.

Tracks keep a rolling history of bbox centers so ``is_stationary()`` can gate
announcements: the bridge only fires TTS + arm actions on people standing
still at the checkpoint, not on anyone walking past.
"""
from __future__ import annotations

from dataclasses import dataclass
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Optional, Tuple

from saqr.core.compliance import status_from_items
from saqr.core.events import EventLogger, now_iso
from saqr.core.geometry import box_distance
from saqr.core.grouping import PersonCandidate
from core.compliance import status_from_items
from core.events import EventLogger, now_iso
from core.geometry import box_distance
from core.grouping import PersonCandidate
from utils.config import load_config

_TRK = load_config("core")["tracking"]
STATIONARY_CHECK_FRAMES = int(_TRK.get("stationary_check_frames", 15))
STATIONARY_TOLERANCE_PX = float(_TRK.get("stationary_tolerance_px", 40.0))


@dataclass
@ -26,6 +36,26 @@ class Track:
    event_count: int = 0
    pending_status: Optional[str] = None
    pending_count: int = 0
    # Rolling history of bbox centers (most recent N frames) used to decide
    # whether the person is standing still at the checkpoint.
    recent_centers: List[Tuple[float, float]] = field(default_factory=list)

    def record_center(self) -> None:
        cx = (self.bbox[0] + self.bbox[2]) / 2.0
        cy = (self.bbox[1] + self.bbox[3]) / 2.0
        self.recent_centers.append((cx, cy))
        if len(self.recent_centers) > STATIONARY_CHECK_FRAMES:
            del self.recent_centers[0]

    def is_stationary(self) -> bool:
        """True when the bbox center hasn't moved more than the configured
        tolerance across the last ``STATIONARY_CHECK_FRAMES`` frames."""
        if len(self.recent_centers) < STATIONARY_CHECK_FRAMES:
            return False
        xs = [c[0] for c in self.recent_centers]
        ys = [c[1] for c in self.recent_centers]
        spread = max(max(xs) - min(xs), max(ys) - min(ys))
        return spread <= STATIONARY_TOLERANCE_PX
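The stationarity gate is worth seeing on concrete center histories: a track is "planted" only once its bbox-center spread over the last N frames stays inside the pixel tolerance. A standalone sketch (constants re-stated inline; the real values come from `core_config.json::tracking`):

```python
STATIONARY_CHECK_FRAMES = 15     # config default per the diff above
STATIONARY_TOLERANCE_PX = 40.0

def is_stationary(centers):
    """True once N centers are recorded and their bounding spread fits the tolerance."""
    if len(centers) < STATIONARY_CHECK_FRAMES:
        return False                       # not enough history yet → never announce early
    xs = [c[0] for c in centers]
    ys = [c[1] for c in centers]
    spread = max(max(xs) - min(xs), max(ys) - min(ys))
    return spread <= STATIONARY_TOLERANCE_PX

standing = [(320.0 + i % 3, 240.0) for i in range(15)]   # ~2 px of jitter
walking = [(100.0 + 12 * i, 240.0) for i in range(15)]   # 12 px/frame drift → 168 px spread
print(is_stationary(standing))   # → True
print(is_stationary(walking))    # → False
```

At 30 fps the 15-frame window means a person must hold still for about half a second before any TTS or arm action fires.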


class PersonTracker:
@ -101,6 +131,7 @@ class PersonTracker:
                track.pending_status = None
                track.pending_count = 0

            track.record_center()
            used.add(track.track_id)

        stale = []
114
docs/DEPLOY.md
@ -21,15 +21,27 @@

## Repo layout

```
saqr/            # python package (core/apps/gui/robot/utils)
scripts/         # deploy.sh, start_saqr.sh, run_local.sh, run_robot.sh, saqr-bridge.service
config/          # logging.json
data/            # dataset/, models/ (gitignored)
runtime/         # captures/, logs/, runs/ (gitignored)
docs/            # this file, start.md, use_case_catalogue.pdf
pyproject.toml   # installs the `saqr`, `saqr-bridge`, `saqr-gui`, ... scripts
Saqr/
├── core/            # detection + tracking + events (shared by CLI/GUI/bridge)
├── apps/            # CLI entry points (saqr, detect, train, manager, view_stream)
├── gui/             # PySide6 desktop GUI
├── robot/           # G1 bridge + DDS controller
├── utils/           # logger
├── scripts/         # start_saqr.sh (entry), saqr-bridge.service (systemd), deploy.sh
├── config/          # logging.json
├── data/            # dataset/, models/ (gitignored)
├── runtime/         # captures/, runs/ (gitignored)
├── logs/            # per-module .log files (gitignored)
├── docs/            # this file, start.md, use_case_catalogue.pdf
├── pyproject.toml   # installs the package (no console scripts — use start_saqr.sh)
└── README.md
```

`core/`, `apps/`, `gui/`, `robot/`, and `utils/` are the importable Python
packages. Run them via `python -m apps.saqr_cli`, `python -m robot.bridge`,
etc. The project root is auto-detected from `core/paths.py`; override with
`SAQR_ROOT=/custom/path` if the tree lives elsewhere.

---

## Step 1: Train the Model (Dev Machine)
@ -38,7 +50,7 @@ pyproject.toml # installs the `saqr`, `saqr-bridge`, `saqr-gui`, ... scripts
cd ~/Robotics_workspace/yslootahtech/Project/Saqr
conda activate AI_MSI_yolo
pip install -e .
saqr-train --dataset data/dataset --epochs 100 --batch 16
python -m apps.train_cli --dataset data/dataset --epochs 100 --batch 16
```

Verify model exists:
@ -58,10 +70,9 @@ scripts/deploy.sh --run # ...and start the bridge
scripts/deploy.sh --ip 10.0.0.5 # custom robot IP
```

The script rsyncs `saqr/`, `scripts/`, `config/`, `docs/`,
`pyproject.toml`, `requirements.txt`, and `README.md` to
`~/Saqr` on the robot, then runs `pip install -e .` inside the
target conda env (default `marcus`).
The script rsyncs `core/`, `apps/`, `gui/`, `robot/`, `utils/`, `scripts/`,
`config/`, `docs/`, and `pyproject.toml` to `~/Saqr` on the robot, then runs
`pip install -e .` inside the target conda env (default `marcus`).

---

@ -100,56 +111,49 @@ sudo date -s "2026-04-10 15:00:00"
python -c "from ultralytics import YOLO; print('ultralytics OK')"
python -c "import torch; print('CUDA:', torch.cuda.is_available())"
python -c "import cv2; print('opencv OK')"
python -c "import saqr; print('saqr', saqr.__version__)"
python -c "import core.paths; print('project root:', core.paths.PROJECT_ROOT)"
```

---

## Step 4: Run Saqr (Robot)

### Production: bridge with R2+X / R2+Y

The bridge owns the DDS clients and spawns `saqr` on demand. On the robot:
**The project has one entry point: [scripts/start_saqr.sh](../scripts/start_saqr.sh).**
Everything — conda activation, DDS iface, camera source, stream port — is
read from `config/robot_config.json.start_saqr` (env vars override).

```bash
cd ~/Saqr
scripts/start_saqr.sh # manual launch
sudo systemctl restart saqr-bridge # systemd-managed (see start.md)
scripts/start_saqr.sh # manual (foreground, Ctrl+C to stop)
sudo systemctl restart saqr-bridge # systemd-owned (see start.md)
```

Or without the helper script:
The launcher execs `python -m robot.bridge` with the production flags.
The bridge owns the DDS clients and spawns `python -m apps.saqr_cli` as a
subprocess when R2+X is pressed.

### Overrides (no config edit)

```bash
conda activate marcus
python -m saqr.robot.bridge --iface eth0 --source realsense --headless -- --stream 8080
SAQR_SOURCE=/dev/video2 scripts/start_saqr.sh # V4L2 fallback
CONDA_ENV=teleimager scripts/start_saqr.sh
DDS_IFACE=wlan0 scripts/start_saqr.sh
STREAM_PORT=9090 scripts/start_saqr.sh
```

### Plain saqr (no bridge)
### Companion utilities (off the run path)

Operators occasionally need these on the robot. They're not part of the
normal run flow:

```bash
# With display
scripts/run_robot.sh --stream 8080

# Headless
scripts/run_robot.sh --headless --stream 8080

# V4L2 fallback if RealSense SDK won't enumerate
scripts/run_robot.sh --source /dev/video2 --headless
python -m apps.manager_cli --export # dump a CSV report
python -m apps.manager_cli # interactive photo CRUD
```

Equivalent `python -m` forms:

Training (dev machine only):
```bash
python -m saqr.apps.saqr_cli --source realsense --model saqr_best.pt --headless
python -m saqr.apps.detect_cli --source /dev/video2 --model saqr_best.pt
python -m saqr.apps.manager_cli --export
```

### Dev machine GUI

```bash
pip install -e ".[gui]"
python -m saqr.gui.app --source 0
python -m apps.train_cli --epochs 100
```

---
@ -205,7 +209,7 @@ cat runtime/captures/result.csv # current state per tracked person
cat runtime/captures/events.csv # audit log
ls runtime/captures/{SAFE,PARTIAL,UNSAFE}/

saqr-manager --export # quick CSV export
python -m apps.manager_cli --export # quick CSV export

# Download to dev machine
scp -r unitree@192.168.123.164:~/Saqr/runtime/captures ./captures_from_robot
@ -268,7 +272,7 @@ saqr --source realsense --headless \
| `result.csv` | `runtime/captures/` | Current state of tracked persons |
| `events.csv` | `runtime/captures/` | Audit log (NEW / STATUS_CHANGE) |
| Person crops | `runtime/captures/{SAFE,PARTIAL,UNSAFE}/*.jpg` | Latest crop per track |
| Logs | `runtime/logs/Inference/*.log` | Module log output |
| Logs | `logs/Inference/*.log` | Module log output |

---

@ -276,17 +280,17 @@ saqr --source realsense --headless \

| Path | Purpose |
|------|---------|
| [saqr/apps/saqr_cli.py](../saqr/apps/saqr_cli.py) | Main PPE tracking entry (`saqr`) |
| [saqr/robot/bridge.py](../saqr/robot/bridge.py) | Saqr → G1 bridge (R2+X/R2+Y) |
| [saqr/robot/robot_controller.py](../saqr/robot/robot_controller.py) | G1 arm + audio + TTS worker |
| [saqr/robot/controller.py](../saqr/robot/controller.py) | G1 wireless-remote DDS reader |
| [saqr/core/pipeline.py](../saqr/core/pipeline.py) | Per-frame detect + track + emit |
| [saqr/core/tracking.py](../saqr/core/tracking.py) | `PersonTracker`, `Track` |
| [saqr/core/events.py](../saqr/core/events.py) | Event-line format (contract with bridge) |
| [saqr/apps/detect_cli.py](../saqr/apps/detect_cli.py) | Simple detection without tracking |
| [saqr/apps/train_cli.py](../saqr/apps/train_cli.py) | YOLO11n training |
| [saqr/apps/manager_cli.py](../saqr/apps/manager_cli.py) | Capture CRUD + CSV export |
| [saqr/gui/app.py](../saqr/gui/app.py) | PySide6 desktop GUI |
| [apps/saqr_cli.py](../apps/saqr_cli.py) | Main PPE tracking entry (`saqr`) |
| [robot/bridge.py](../robot/bridge.py) | Saqr → G1 bridge (R2+X/R2+Y) |
| [robot/robot_controller.py](../robot/robot_controller.py) | G1 arm + audio + TTS worker |
| [robot/controller.py](../robot/controller.py) | G1 wireless-remote DDS reader |
| [core/pipeline.py](../core/pipeline.py) | Per-frame detect + track + emit |
| [core/tracking.py](../core/tracking.py) | `PersonTracker`, `Track` |
| [core/events.py](../core/events.py) | Event-line format (contract with bridge) |
| [apps/detect_cli.py](../apps/detect_cli.py) | Simple detection without tracking |
| [apps/train_cli.py](../apps/train_cli.py) | YOLO11n training |
| [apps/manager_cli.py](../apps/manager_cli.py) | Capture CRUD + CSV export |
| [gui/app.py](../gui/app.py) | PySide6 desktop GUI |

---

@ -1,6 +1,6 @@
# Saqr — Auto-start on boot

How to auto-start `saqr.robot.bridge` on every boot of the Unitree G1
How to auto-start `robot.bridge` on every boot of the Unitree G1
(Jetson), via `systemd` + `scripts/start_saqr.sh`.

---
@ -9,7 +9,7 @@ How to auto-start `saqr.robot.bridge` on every boot of the Unitree G1

| File | Role |
|------|------|
| `~/Saqr/saqr/robot/bridge.py` | The bridge process (DDS + TTS + R2+X/R2+Y trigger loop). Entry point: `python -m saqr.robot.bridge`. |
| `~/Saqr/robot/bridge.py` | The bridge process (DDS + TTS + R2+X/R2+Y trigger loop). Entry point: `python -m robot.bridge`. |
| `~/Saqr/scripts/start_saqr.sh` | Bash launcher: sources conda, activates `marcus`, `cd ~/Saqr`, exec the bridge with the production flags. |
| `~/Saqr/scripts/saqr-bridge.service` | systemd unit that runs `start_saqr.sh` as user `unitree` on every boot, restarts on failure, logs to journalctl. |

@ -164,6 +164,6 @@ sudo systemctl restart saqr-bridge
### Bridge is running twice

```bash
ps -ef | grep "saqr.robot.bridge"
ps -ef | grep "robot.bridge"
sudo systemctl restart saqr-bridge
```

@ -18,18 +18,20 @@ from PySide6.QtWidgets import (
)
from ultralytics import YOLO

from saqr.core.capture import save_track_image, setup_capture_dirs
from saqr.core.compliance import split_wearing_missing
from saqr.core.detection import STATUSES, collect_detections
from saqr.core.drawing import draw_counters, draw_track
from saqr.core.events import EventLogger, emit_event, write_result_csv
from saqr.core.grouping import group_detections_to_people
from saqr.core.model import resolve_model_path
from saqr.core.paths import EVENTS_CSV, MODELS_DIR, PROJECT_ROOT, RESULT_CSV
from saqr.core.tracking import PersonTracker
from saqr.utils.logger import get_logger
from core.capture import save_track_image, setup_capture_dirs
from core.compliance import split_wearing_missing
from core.detection import STATUSES, collect_detections
from core.drawing import draw_counters, draw_track
from core.events import EventLogger, emit_event, write_result_csv
from core.grouping import group_detections_to_people
from core.model import resolve_model_path
from core.paths import EVENTS_CSV, MODELS_DIR, PROJECT_ROOT, RESULT_CSV
from core.tracking import PersonTracker
from utils.config import load_config
from utils.logger import get_logger

log = get_logger("Inference", "gui")
_CORE = load_config("core")


def list_cameras(max_idx: int = 10) -> List[str]:
@ -47,7 +49,10 @@ def list_cameras(max_idx: int = 10) -> List[str]:
    return sources if sources else ["0"]


def open_camera(source: str, width: int = 640, height: int = 480, fps: int = 30):
def open_camera(source: str,
                width: int = _CORE["camera"]["width"],
                height: int = _CORE["camera"]["height"],
                fps: int = _CORE["camera"]["fps"]):
    if source.startswith("/dev/video"):
        cap = cv2.VideoCapture(source, cv2.CAP_V4L2)
    elif source.isdigit():
@ -191,7 +196,9 @@ def cv_to_qpixmap(frame: np.ndarray, max_w: int = 960, max_h: int = 720) -> QPix


class SaqrWindow(QMainWindow):
    def __init__(self, default_model: str = "saqr_best.pt", default_source: str = "0"):
    def __init__(self,
                 default_model: str = _CORE["detection"]["default_model"],
                 default_source: str = _CORE["camera"]["default_source"]):
        super().__init__()
        self.setWindowTitle("Saqr - PPE Safety Tracking")
        self.setMinimumSize(1200, 700)
@ -235,26 +242,26 @@ class SaqrWindow(QMainWindow):
|
||||
self.conf_spin = QDoubleSpinBox()
|
||||
self.conf_spin.setRange(0.1, 0.9)
|
||||
self.conf_spin.setSingleStep(0.05)
|
||||
self.conf_spin.setValue(0.35)
|
||||
self.conf_spin.setValue(_CORE["detection"]["conf"])
|
||||
param_lay.addWidget(self.conf_spin, 0, 1)
|
||||
|
||||
param_lay.addWidget(QLabel("Max Missing:"), 1, 0)
|
||||
self.missing_spin = QSpinBox()
|
||||
self.missing_spin.setRange(10, 300)
|
||||
self.missing_spin.setValue(90)
|
||||
self.missing_spin.setValue(_CORE["tracking"]["max_missing"])
|
||||
param_lay.addWidget(self.missing_spin, 1, 1)
|
||||
|
||||
param_lay.addWidget(QLabel("Match Dist:"), 2, 0)
|
||||
self.dist_spin = QDoubleSpinBox()
|
||||
self.dist_spin.setRange(50, 500)
|
||||
self.dist_spin.setSingleStep(10)
|
||||
self.dist_spin.setValue(250)
|
||||
self.dist_spin.setValue(_CORE["tracking"]["match_distance"])
|
||||
param_lay.addWidget(self.dist_spin, 2, 1)
|
||||
|
||||
param_lay.addWidget(QLabel("Confirm Frames:"), 3, 0)
|
||||
self.confirm_spin = QSpinBox()
|
||||
self.confirm_spin.setRange(1, 20)
|
||||
self.confirm_spin.setValue(5)
|
||||
self.confirm_spin.setValue(_CORE["tracking"]["status_confirm_frames"])
|
||||
param_lay.addWidget(self.confirm_spin, 3, 1)
|
||||
|
||||
left.addWidget(param_grp)
|
||||
@ -437,7 +444,7 @@ class SaqrWindow(QMainWindow):
|
||||
"CSV Files (*.csv)"
|
||||
)
|
||||
if path:
|
||||
from saqr.apps.manager_cli import export_csv, load_photos
|
||||
from apps.manager_cli import export_csv, load_photos
|
||||
export_csv(load_photos(), Path(path))
|
||||
self._on_event(f"Exported: {path}")
|
||||
|
||||
@ -449,8 +456,8 @@ class SaqrWindow(QMainWindow):
|
||||
def main():
|
||||
import argparse
|
||||
parser = argparse.ArgumentParser(description="Saqr PPE GUI")
|
||||
parser.add_argument("--model", default="saqr_best.pt")
|
||||
parser.add_argument("--source", default="0")
|
||||
parser.add_argument("--model", default=_CORE["detection"]["default_model"])
|
||||
parser.add_argument("--source", default=_CORE["camera"]["default_source"])
|
||||
args = parser.parse_args()
|
||||
|
||||
app = QApplication(sys.argv)
|
||||
@@ -18,14 +18,9 @@ dependencies = [
gui = ["PySide6>=6.5.0"]
realsense = ["pyrealsense2"]

[project.scripts]
saqr = "saqr.apps.saqr_cli:main"
saqr-detect = "saqr.apps.detect_cli:main"
saqr-train = "saqr.apps.train_cli:main"
saqr-manager = "saqr.apps.manager_cli:main"
saqr-view-stream = "saqr.apps.view_stream:main"
saqr-gui = "saqr.gui.app:main"
saqr-bridge = "saqr.robot.bridge:main"
# No console scripts — the project is launched through scripts/start_saqr.sh
# (which calls `python -m robot.bridge`). That bridge then spawns
# `python -m apps.saqr_cli` as a subprocess. Everything else is a module.

[tool.setuptools.packages.find]
include = ["saqr*"]
include = ["core*", "apps*", "gui*", "robot*", "utils*"]
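Both layouts above resolve their tunables through `utils.config.load_config`. The loader itself is not part of this commit; a minimal sketch of what it presumably does, assuming the `config/<name>_config.json` naming visible in the repo layout (`core_config.json`, `robot_config.json` — the file-name pattern is an assumption):

```python
import json
from pathlib import Path

def load_config(name: str, base: Path = Path("config")) -> dict:
    # Hypothetical loader: "core" -> config/core_config.json,
    # "robot" -> config/robot_config.json (naming assumed, not from the diff).
    return json.loads((base / f"{name}_config.json").read_text())
```

Usage would then match the diff: `_CORE = load_config("core")` followed by lookups like `_CORE["camera"]["fps"]`.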
0	robot/__init__.py	Normal file
181	robot/audio_player.py	Normal file
@@ -0,0 +1,181 @@
"""Plays pre-recorded WAV clips via AudioClient.PlayStream — bypasses TtsMaker.

Expected WAV format (required by the G1 audio channel):
    16 kHz, mono, 16-bit signed PCM.

Library layout under assets/audio/:
    fixed/<key>.wav          e.g. safe.wav, unsafe_generic.wav, ready.wav
    unsafe_missing/<key>.wav e.g. helmet.wav, vest.wav, helmet_vest.wav
        (key is sorted-joined PPE names, "_" separator)

Callers look up by (category, key). Missing clip → returns False so the caller
can fall back to TtsMaker.
"""
from __future__ import annotations

import datetime
import json
import time
import wave
from pathlib import Path
from typing import Dict, Optional, Tuple

from core.paths import PROJECT_ROOT

AUDIO_ROOT = PROJECT_ROOT / "assets" / "audio"


def _ts() -> str:
    return datetime.datetime.now().strftime("%H:%M:%S.%f")[:-3]

EXPECTED_RATE = 16000
EXPECTED_CHANNELS = 1
EXPECTED_WIDTH = 2       # bytes per sample (int16)
PLAY_CHUNK_BYTES = 96000  # 3 s per PlayStream call (matches the Unitree example)
PLAY_APP_NAME = "saqr_audio"

# The G1 arm action and audio stack share a firmware busy state. If we try to
# PlayStream while an arm action is still being processed, chunk 0 is often
# rejected with rc=3104. Later chunks can fail too if the firmware hasn't
# fully processed the previous chunk. Retry both cases with back-off.
CHUNK0_RETRIES = 4
CHUNK0_BACKOFF_S = 1.0   # 1,2,3,4 s — total ≈ 10 s, covers a full arm cycle
CHUNKN_RETRIES = 2
CHUNKN_BACKOFF_S = 1.0   # 1,2 s — firmware usually clears within 1-2 s
PRE_STREAM_SLEEP = 0.1   # pause after PlayStop before first PlayStream
INTER_CHUNK_MARGIN = 0.1  # extra sleep after each chunk's audio duration

def _read_wav_pcm(path: Path) -> Optional[bytes]:
    """Return the raw PCM bytes if the WAV matches the expected format, else None."""
    try:
        with wave.open(str(path), "rb") as wf:
            ch = wf.getnchannels()
            sw = wf.getsampwidth()
            fr = wf.getframerate()
            if ch != EXPECTED_CHANNELS or sw != EXPECTED_WIDTH or fr != EXPECTED_RATE:
                print(
                    f"[audio_player {_ts()}][WARN] {path}: expected "
                    f"{EXPECTED_RATE} Hz mono 16-bit; got "
                    f"{fr} Hz {ch}-ch {sw*8}-bit. Skipping.",
                    flush=True,
                )
                return None
            return wf.readframes(wf.getnframes())
    except Exception as e:
        print(f"[audio_player {_ts()}][WARN] failed to load {path}: {e}", flush=True)
        return None


class AudioPlayer:
    """Loads WAVs under ``assets/audio/<category>/<key>.wav`` and plays them on the G1."""

    def __init__(self, audio_client):
        self.audio_client = audio_client
        self._clips: Dict[Tuple[str, str], bytes] = {}
        self._load_all()
        if self._clips:
            print(f"[audio_player {_ts()}] loaded {len(self._clips)} clip(s): "
                  f"{sorted(self._clips.keys())}", flush=True)
        else:
            print(f"[audio_player {_ts()}] no clips found under {AUDIO_ROOT}", flush=True)

    # ── library ─────────────────────────────────────────────────────────────
    def _load_all(self) -> None:
        if not AUDIO_ROOT.exists():
            return
        for category_dir in sorted(AUDIO_ROOT.iterdir()):
            if not category_dir.is_dir():
                continue
            for wav_path in sorted(category_dir.glob("*.wav")):
                pcm = _read_wav_pcm(wav_path)
                if pcm is not None:
                    self._clips[(category_dir.name, wav_path.stem)] = pcm

    def has(self, category: str, key: str) -> bool:
        return (category, key) in self._clips

    # ── playback ────────────────────────────────────────────────────────────
    def play(self, category: str, key: str) -> bool:
        """Blocking play. Returns True on success, False if clip missing / failed."""
        pcm = self._clips.get((category, key))
        if pcm is None:
            return False
        if self.audio_client is None:
            return False

        # Re-assert max volume on every play; the firmware sometimes resets
        # between sessions or after certain events. Matches the Unitree
        # reference script's pattern.
        try:
            self.audio_client.SetVolume(100)
        except Exception:
            pass

        # Cancel any in-flight stream from a previous call.
        try:
            from unitree_sdk2py.g1.audio.g1_audio_api import ROBOT_API_ID_AUDIO_STOP_PLAY
            self.audio_client._Call(
                ROBOT_API_ID_AUDIO_STOP_PLAY,
                json.dumps({"app_name": PLAY_APP_NAME}),
            )
        except Exception:
            pass
        time.sleep(PRE_STREAM_SLEEP)

        sid = f"saqr_{int(time.time() * 1000)}"
        offset = 0
        chunk0_attempts = 0
        chunkn_attempts = 0
        while offset < len(pcm):
            chunk = pcm[offset:offset + PLAY_CHUNK_BYTES]
            code, _ = self.audio_client.PlayStream(PLAY_APP_NAME, sid, chunk)

            if code != 0:
                # chunk 0: likely firmware busy (arm motion colliding with audio).
                # retry with linear back-off; new sid so firmware sees a fresh stream.
                if offset == 0 and chunk0_attempts < CHUNK0_RETRIES:
                    chunk0_attempts += 1
                    delay = CHUNK0_BACKOFF_S * chunk0_attempts
                    print(f"[audio_player {_ts()}][WARN] PlayStream rc={code} at byte 0; "
                          f"retry {chunk0_attempts}/{CHUNK0_RETRIES} in {delay:.1f}s",
                          flush=True)
                    time.sleep(delay)
                    sid = f"saqr_{int(time.time() * 1000)}"
                    continue

                # later chunks: firmware is still processing the previous chunk.
                # Retry a couple of times with back-off, keeping the same sid.
                if offset > 0 and chunkn_attempts < CHUNKN_RETRIES:
                    chunkn_attempts += 1
                    delay = CHUNKN_BACKOFF_S * chunkn_attempts
                    print(f"[audio_player {_ts()}][WARN] PlayStream rc={code} at byte {offset}; "
                          f"mid-stream retry {chunkn_attempts}/{CHUNKN_RETRIES} in {delay:.1f}s",
                          flush=True)
                    time.sleep(delay)
                    continue

                print(f"[audio_player {_ts()}][WARN] PlayStream rc={code} at byte {offset} "
                      f"(retries exhausted)", flush=True)
                return False

            if offset == 0 and chunk0_attempts > 0:
                print(f"[audio_player {_ts()}] chunk 0 succeeded after "
                      f"{chunk0_attempts} retry/retries", flush=True)
            elif offset > 0 and chunkn_attempts > 0:
                print(f"[audio_player {_ts()}] chunk at byte {offset} succeeded after "
                      f"{chunkn_attempts} retry/retries", flush=True)
                chunkn_attempts = 0  # reset for any subsequent chunk

            offset += len(chunk)
            # Wait for the chunk to finish playing before sending the next.
            chunk_seconds = len(chunk) / (EXPECTED_RATE * EXPECTED_WIDTH)
            time.sleep(chunk_seconds + INTER_CHUNK_MARGIN)

        # Short settle time after the final chunk before PlayStop.
        time.sleep(0.2)
        try:
            self.audio_client.PlayStop(PLAY_APP_NAME)
        except Exception:
            pass
        return True
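The chunk-size and back-off constants in audio_player.py encode two bits of arithmetic worth spelling out: 96000 bytes of 16 kHz mono int16 PCM is exactly 3 s of audio per PlayStream call, and the chunk-0 linear back-off (1, 2, 3, 4 s) sums to 10 s, roughly one arm cycle. A standalone sketch of that math (illustrative, not part of the committed file):

```python
EXPECTED_RATE = 16000     # Hz
EXPECTED_WIDTH = 2        # bytes per sample (int16)
PLAY_CHUNK_BYTES = 96000  # bytes per PlayStream call

def chunk_seconds(n_bytes: int) -> float:
    """Playback duration of a PCM chunk: bytes / (rate * sample width)."""
    return n_bytes / (EXPECTED_RATE * EXPECTED_WIDTH)

def retry_delays(retries: int, backoff_s: float) -> list:
    """Linear back-off schedule: backoff_s * attempt number."""
    return [backoff_s * n for n in range(1, retries + 1)]

print(chunk_seconds(PLAY_CHUNK_BYTES))  # 3.0  (seconds per chunk)
print(retry_delays(4, 1.0))             # [1.0, 2.0, 3.0, 4.0]  (10 s total)
```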
@@ -11,7 +11,7 @@ R2+Y stops it. While saqr is running the bridge parses its event stream and:

See docs/DEPLOY.md for wireless-remote workflow and systemd deploy notes.

Saqr event line format (from saqr.core.events.emit_event):
Saqr event line format (from core.events.emit_event):
    ID 0001 | NEW | UNSAFE | wearing: ... | missing: ... | unknown: ...
    ID 0001 | STATUS_CHANGE | SAFE | wearing: ... | missing: ... | unknown: ...
"""
@@ -27,32 +27,42 @@ import threading
import time
from typing import Dict, Optional

from saqr.core.paths import PROJECT_ROOT
from saqr.robot.robot_controller import RobotController
from core.paths import PROJECT_ROOT
from robot.robot_controller import RobotController
from utils.config import load_config

import datetime


def _ts() -> str:
    return datetime.datetime.now().strftime("%H:%M:%S.%f")[:-3]

_ROBOT = load_config("robot")
_BRIDGE = _ROBOT["bridge"]
_PHRASES = _ROBOT["phrases"]

DANGER_STATUS = "UNSAFE"
SAFE_STATUS = "SAFE"

# speaker_id=2 was confirmed English on current G1 firmware via
# Project/Sanad/voice_example.py mode 6. speaker_id=0 is Chinese.
TTS_SPEAKER_ID = 2
# speaker_id is locked to a language by G1 firmware: 2=English, 0=Chinese.
# Confirmed via Project/Sanad/voice_example.py mode 6.
TTS_SPEAKER_ID = _ROBOT["tts"]["speaker_id"]

TTS_TEXT_SAFE = "Safe to enter. Have a good day."
TTS_UNSAFE_WITH_MISSING = (
    "Please stop. Wear your proper safety equipment. You are missing {items}."
)
TTS_UNSAFE_GENERIC = "Please stop. Wear your proper safety equipment."
TTS_BRIDGE_DEACTIVATED = "Saqr deactivated."
TTS_BRIDGE_READY = "Saqr is running. Press R2 plus X to start."
TTS_BRIDGE_NO_CAMERA = (
    "Camera not connected. Please plug in the camera and try again."
)
TTS_TEXT_SAFE = _PHRASES["safe"]
TTS_UNSAFE_WITH_MISSING = _PHRASES["unsafe_missing"]
TTS_UNSAFE_GENERIC = _PHRASES["unsafe_generic"]
TTS_BRIDGE_DEACTIVATED = _PHRASES["deactivated"]
TTS_BRIDGE_READY = _PHRASES["ready"]
TTS_BRIDGE_NO_CAMERA = _PHRASES["no_camera"]

QUICK_FAIL_WINDOW_S = 8.0
QUICK_FAIL_WINDOW_S = _BRIDGE["quick_fail_window"]
TRIGGER_POLL_HZ = _BRIDGE["trigger_poll_hz"]

# ID NNNN | EVENT_TYPE | STATUS | wearing: ... | missing: ... | unknown: ...
# [HH:MM:SS.fff] ID NNNN | EVENT_TYPE | STATUS | wearing: ... | missing: ... | unknown: ...
# The leading timestamp is optional for backwards compatibility with old logs.
EVENT_RE = re.compile(
    r"^ID\s+(?P<id>\d+)\s*\|\s*"
    r"^(?:\[[\d:.]+\]\s+)?"
    r"ID\s+(?P<id>\d+)\s*\|\s*"
    r"(?P<event>NEW|STATUS_CHANGE)\s*\|\s*"
    r"(?P<status>SAFE|PARTIAL|UNSAFE)\s*\|\s*"
    r"wearing:\s*(?P<wearing>[^|]*?)\s*\|\s*"
@@ -86,7 +96,7 @@ def build_unsafe_tts(missing: list) -> str:

def build_saqr_cmd(saqr_extra_args: list) -> list:
    """Invoke the saqr CLI via ``python -m`` so it picks up the package layout."""
    return [sys.executable, "-u", "-m", "saqr.apps.saqr_cli", *saqr_extra_args]
    return [sys.executable, "-u", "-m", "apps.saqr_cli", *saqr_extra_args]


def split_argv(argv):
@@ -101,10 +111,12 @@ class Bridge:

    def __init__(self, robot: RobotController, cooldown_s: float,
                 release_after_s: float, saqr_args: list,
                 env: Dict[str, str], cwd: str):
                 env: Dict[str, str], cwd: str,
                 audio_lead_s: float = 0.3):
        self.robot = robot
        self.cooldown_s = cooldown_s
        self.release_after_s = release_after_s
        self.audio_lead_s = audio_lead_s
        self.saqr_args = saqr_args
        self.env = env
        self.cwd = cwd
@@ -125,11 +137,11 @@ class Bridge:
    def start_saqr(self):
        with self._proc_lock:
            if self.proc is not None and self.proc.poll() is None:
                print("[BRIDGE] start ignored — saqr already running", flush=True)
                print(f"[BRIDGE {_ts()}] start ignored — saqr already running", flush=True)
                return

            cmd = build_saqr_cmd(self.saqr_args)
            print(f"[BRIDGE] starting saqr: {' '.join(cmd)}", flush=True)
            print(f"[BRIDGE {_ts()}] starting saqr: {' '.join(cmd)}", flush=True)
            self.proc = subprocess.Popen(
                cmd,
                cwd=self.cwd,
@@ -154,10 +166,10 @@ class Bridge:
        with self._proc_lock:
            proc = self.proc
            if proc is None or proc.poll() is not None:
                print("[BRIDGE] stop ignored — saqr not running", flush=True)
                print(f"[BRIDGE {_ts()}] stop ignored — saqr not running", flush=True)
                self.proc = None
                return
            print("[BRIDGE] stopping saqr (SIGINT)", flush=True)
            print(f"[BRIDGE {_ts()}] stopping saqr (SIGINT)", flush=True)
            try:
                proc.send_signal(signal.SIGINT)
            except Exception:
@@ -166,12 +178,12 @@ class Bridge:
            try:
                proc.wait(timeout=3.0)
            except subprocess.TimeoutExpired:
                print("[BRIDGE] saqr did not exit in 3s, sending SIGTERM", flush=True)
                print(f"[BRIDGE {_ts()}] saqr did not exit in 3s, sending SIGTERM", flush=True)
                try:
                    proc.terminate()
                    proc.wait(timeout=2.0)
                except subprocess.TimeoutExpired:
                    print("[BRIDGE] saqr unresponsive, sending SIGKILL", flush=True)
                    print(f"[BRIDGE {_ts()}] saqr unresponsive, sending SIGKILL", flush=True)
                    proc.kill()
                    proc.wait()

@@ -182,7 +194,7 @@ class Bridge:
        self.proc = None
        self.reader_thread = None

        self.robot.speak(TTS_BRIDGE_DEACTIVATED)
        self.robot.speak(TTS_BRIDGE_DEACTIVATED, category="fixed", key="deactivated")

    def _read_stdout(self, proc: subprocess.Popen):
        start_t = self._proc_start_t
@@ -191,17 +203,17 @@ class Bridge:
            for line in proc.stdout:
                self.handle_line(line)
        except Exception as e:
            print(f"[BRIDGE][ERR] reader thread: {e}", flush=True)
            print(f"[BRIDGE {_ts()}][ERR] reader thread: {e}", flush=True)
        rc = proc.wait()
        lifetime = time.time() - start_t if start_t > 0 else 0.0
        print(f"[BRIDGE] saqr exited rc={rc} (lifetime={lifetime:.1f}s)",
        print(f"[BRIDGE {_ts()}] saqr exited rc={rc} (lifetime={lifetime:.1f}s)",
              flush=True)

        if rc not in (0, -2) and 0 < lifetime < QUICK_FAIL_WINDOW_S:
            try:
                self.robot.speak(TTS_BRIDGE_NO_CAMERA)
                self.robot.speak(TTS_BRIDGE_NO_CAMERA, category="fixed", key="no_camera")
            except Exception as e:
                print(f"[BRIDGE][ERR] no-camera tts failed: {e}", flush=True)
                print(f"[BRIDGE {_ts()}][ERR] no-camera tts failed: {e}", flush=True)

    def handle_line(self, line: str):
        line = line.rstrip()
@@ -235,21 +247,33 @@ class Bridge:

        try:
            if status == DANGER_STATUS:
                self.robot.speak(build_unsafe_tts(missing))
                # Fire audio first, give the worker thread a head start so
                # PlayStream reaches the firmware BEFORE the arm command.
                # Once audio is actively playing, the arm command queues
                # behind it without blocking playback → audible overlap.
                if missing:
                    key = "_".join(sorted(missing))
                    self.robot.speak(build_unsafe_tts(missing),
                                     category="unsafe_missing", key=key)
                else:
                    self.robot.speak(TTS_UNSAFE_GENERIC,
                                     category="fixed", key="unsafe_generic")
                time.sleep(self.audio_lead_s)
                self.robot.reject(release_after=self.release_after_s)
            else:
                self.robot.speak(TTS_TEXT_SAFE)
                self.robot.speak(TTS_TEXT_SAFE,
                                 category="fixed", key="safe")
        except Exception as e:
            print(f"[BRIDGE][ERR] robot action failed: {e}", flush=True)
            print(f"[BRIDGE {_ts()}][ERR] robot action failed: {e}", flush=True)


def trigger_loop(bridge: Bridge, hub, stop_event: threading.Event,
                 poll_hz: float = 50.0):
                 poll_hz: float = TRIGGER_POLL_HZ):
    """Watch the wireless remote for R2+X (start) and R2+Y (stop)."""
    period = 1.0 / max(poll_hz, 1.0)
    waiting_release_x = False
    waiting_release_y = False
    print("[BRIDGE] trigger loop ready — press R2+X to start, R2+Y to stop.",
    print(f"[BRIDGE {_ts()}] trigger loop ready — press R2+X to start, R2+Y to stop.",
          flush=True)
    while not stop_event.is_set():
        time.sleep(period)
@@ -264,22 +288,22 @@ def trigger_loop(bridge: Bridge, hub, stop_event: threading.Event,
                waiting_release_x = False
        elif r2x:
            waiting_release_x = True
            print("[BRIDGE] R2+X pressed -> start saqr", flush=True)
            print(f"[BRIDGE {_ts()}] R2+X pressed -> start saqr", flush=True)
            try:
                bridge.start_saqr()
            except Exception as e:
                print(f"[BRIDGE][ERR] start_saqr failed: {e}", flush=True)
                print(f"[BRIDGE {_ts()}][ERR] start_saqr failed: {e}", flush=True)

        if waiting_release_y:
            if not r2y:
                waiting_release_y = False
        elif r2y:
            waiting_release_y = True
            print("[BRIDGE] R2+Y pressed -> stop saqr", flush=True)
            print(f"[BRIDGE {_ts()}] R2+Y pressed -> stop saqr", flush=True)
            try:
                bridge.stop_saqr()
            except Exception as e:
                print(f"[BRIDGE][ERR] stop_saqr failed: {e}", flush=True)
                print(f"[BRIDGE {_ts()}][ERR] stop_saqr failed: {e}", flush=True)


def main():
@@ -288,14 +312,14 @@ def main():
    ap = argparse.ArgumentParser(
        description="Bridge Saqr PPE events to the G1 arm 'reject' action."
    )
    ap.add_argument("--iface", default=None,
    ap.add_argument("--iface", default=_BRIDGE["iface"],
                    help="DDS network interface (e.g. eth0).")
    ap.add_argument("--timeout", type=float, default=10.0)
    ap.add_argument("--cooldown", type=float, default=8.0)
    ap.add_argument("--release-after", type=float, default=2.0)
    ap.add_argument("--dry-run", action="store_true")
    ap.add_argument("--speaker-id", type=int, default=TTS_SPEAKER_ID)
    ap.add_argument("--no-trigger", action="store_true")
    ap.add_argument("--timeout", type=float, default=_BRIDGE["timeout"])
    ap.add_argument("--cooldown", type=float, default=_BRIDGE["cooldown"])
    ap.add_argument("--release-after", type=float, default=_BRIDGE["release_after"])
    ap.add_argument("--dry-run", action="store_true")
    ap.add_argument("--speaker-id", type=int, default=TTS_SPEAKER_ID)
    ap.add_argument("--no-trigger", action="store_true")

    ap.add_argument("--source", default=None)
    ap.add_argument("--headless", action="store_true")
@@ -335,18 +359,19 @@ def main():
        robot=robot,
        cooldown_s=args.cooldown,
        release_after_s=args.release_after,
        audio_lead_s=float(_BRIDGE.get("audio_lead_s", 0.3)),
        saqr_args=saqr_args,
        env=env,
        cwd=str(PROJECT_ROOT),
    )
    print(f"[BRIDGE] saqr cmd template: {' '.join(build_saqr_cmd(saqr_args))}",
    print(f"[BRIDGE {_ts()}] saqr cmd template: {' '.join(build_saqr_cmd(saqr_args))}",
          flush=True)
    print(f"[BRIDGE] cwd: {PROJECT_ROOT}", flush=True)
    print(f"[BRIDGE {_ts()}] cwd: {PROJECT_ROOT}", flush=True)

    stop_event = threading.Event()

    def _forward_signal(signum, _frame):
        print(f"[BRIDGE] signal {signum} -> shutting down", flush=True)
        print(f"[BRIDGE {_ts()}] signal {signum} -> shutting down", flush=True)
        stop_event.set()

    signal.signal(signal.SIGINT, _forward_signal)
@@ -354,16 +379,16 @@ def main():

    have_hub = use_trigger and robot.hub is not None
    if use_trigger and not have_hub:
        print("[BRIDGE][WARN] --no-trigger not set, but no LowState hub is "
        print(f"[BRIDGE {_ts()}][WARN] --no-trigger not set, but no LowState hub is "
              "available. Falling back to legacy auto-start mode.", flush=True)

    trigger_thread: Optional[threading.Thread] = None
    try:
        if have_hub:
            try:
                robot.speak(TTS_BRIDGE_READY)
                robot.speak(TTS_BRIDGE_READY, category="fixed", key="ready")
            except Exception as e:
                print(f"[BRIDGE][WARN] startup announce failed: {e}", flush=True)
                print(f"[BRIDGE {_ts()}][WARN] startup announce failed: {e}", flush=True)

            trigger_thread = threading.Thread(
                target=trigger_loop,
@@ -388,7 +413,7 @@ def main():
        robot.shutdown_tts()
    except Exception:
        pass
    print("[BRIDGE] bye.", flush=True)
    print(f"[BRIDGE {_ts()}] bye.", flush=True)
    sys.exit(0)
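The EVENT_RE hunk above is cut off at the `wearing:` group by the diff context. For illustration, here is a self-contained regex in the same shape, with the `missing:`/`unknown:` tail reconstructed from the documented event-line format (the tail is an assumption, not the committed source):

```python
import re

# Optional [HH:MM:SS.fff] prefix keeps old logs parseable, as the diff notes.
EVENT_RE = re.compile(
    r"^(?:\[[\d:.]+\]\s+)?"
    r"ID\s+(?P<id>\d+)\s*\|\s*"
    r"(?P<event>NEW|STATUS_CHANGE)\s*\|\s*"
    r"(?P<status>SAFE|PARTIAL|UNSAFE)\s*\|\s*"
    r"wearing:\s*(?P<wearing>[^|]*?)\s*\|\s*"
    r"missing:\s*(?P<missing>[^|]*?)\s*\|\s*"   # reconstructed group
    r"unknown:\s*(?P<unknown>.*)$"              # reconstructed group
)

line = "[12:04:09.120] ID 0007 | STATUS_CHANGE | UNSAFE | wearing: vest | missing: helmet, gloves | unknown: "
m = EVENT_RE.match(line)
missing = [p.strip() for p in m["missing"].split(",") if p.strip()]
```

The bridge's `handle_line` would derive the WAV lookup key from such a parse: `"_".join(sorted(missing))` gives `gloves_helmet` here.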
@@ -1,8 +1,10 @@
"""G1 arm + audio + LowState DDS client owned by the bridge.

A dedicated TTS worker thread paces ``TtsMaker`` calls so overlapping phrases
don't trip the SDK's "device busy" error (3104). The busy multiplier adapts
up on 3104s and decays on clean calls.
Announcements run on a dedicated worker thread. Each queue item is a tuple
``(text, category, key)``. The worker picks WAV playback via
``AudioClient.PlayStream`` when the clip exists under ``assets/audio/`` and
``tts.mode`` allows, otherwise falls back to ``TtsMaker`` with the adaptive
busy-factor backoff for 3104 ("device busy") errors.
"""
from __future__ import annotations

@@ -10,20 +12,28 @@ import collections
import datetime
import threading
import time
from typing import Deque, Optional
from typing import Deque, Optional, Tuple

TTS_VOLUME = 100
from utils.config import load_config

TTS_SECONDS_PER_CHAR = 0.12
TTS_MIN_SECONDS = 2.5
TTS_QUEUE_MAX = 4
TTS_BUSY_FACTOR_MIN = 1.0
TTS_BUSY_FACTOR_MAX = 2.5
TTS_BUSY_FACTOR_UP = 1.20
TTS_BUSY_FACTOR_DOWN = 0.97
_ROBOT = load_config("robot")
_TTS = _ROBOT["tts"]
_ARM = _ROBOT["arm"]

REJECT_ACTION = "reject"
RELEASE_ACTION = "release arm"
TTS_VOLUME = _TTS["volume"]
TTS_SECONDS_PER_CHAR = _TTS["seconds_per_char"]
TTS_MIN_SECONDS = _TTS["min_seconds"]
TTS_QUEUE_MAX = _TTS["queue_max"]
TTS_BUSY_FACTOR_MIN = _TTS["busy_factor"]["min"]
TTS_BUSY_FACTOR_MAX = _TTS["busy_factor"]["max"]
TTS_BUSY_FACTOR_UP = _TTS["busy_factor"]["up"]
TTS_BUSY_FACTOR_DOWN = _TTS["busy_factor"]["down"]
TTS_MODE = _TTS.get("mode", "tts_only")  # tts_only | recorded_or_tts | recorded_only

REJECT_ACTION = _ARM["reject_action"]
RELEASE_ACTION = _ARM["release_action"]

QueueItem = Tuple[str, Optional[str], Optional[str]]  # (text, category, key)


def _ts() -> str:
@@ -31,11 +41,7 @@ def _ts() -> str:


class RobotController:
    """Owns both the G1 arm action client and the G1 audio (TTS) client.

    A single ``ChannelFactoryInitialize`` call is shared by both clients and
    the optional ``rt/lowstate`` subscriber used by the wireless-remote loop.
    """
    """Owns both the G1 arm action client and the G1 audio (TTS + PlayStream) client."""

    def __init__(self, iface: Optional[str], timeout: float, dry_run: bool,
                 tts_speaker_id: int, want_lowstate: bool = True):
@@ -46,8 +52,9 @@ class RobotController:
        self._action_map = None
        self.hub = None
        self._lowstate_sub = None
        self._player = None  # AudioPlayer, lazily initialised

        self._tts_queue: Deque[str] = collections.deque(maxlen=TTS_QUEUE_MAX)
        self._tts_queue: Deque[QueueItem] = collections.deque(maxlen=TTS_QUEUE_MAX)
        self._tts_event = threading.Event()
        self._tts_worker_stop = threading.Event()
        self._tts_worker_thread: Optional[threading.Thread] = None
@@ -57,7 +64,7 @@ class RobotController:
        self._tts_busy_count: int = 0

        if dry_run:
            print("[BRIDGE] DRY RUN — G1 SDK will not be loaded.", flush=True)
            print(f"[BRIDGE {_ts()}] DRY RUN — G1 SDK will not be loaded.", flush=True)
            return

        from unitree_sdk2py.core.channel import ChannelFactoryInitialize
@@ -77,7 +84,7 @@ class RobotController:
        self.arm_client = G1ArmActionClient()
        self.arm_client.SetTimeout(timeout)
        self.arm_client.Init()
        print(f"[BRIDGE] G1ArmActionClient ready (iface={iface or 'default'})",
        print(f"[BRIDGE {_ts()}] G1ArmActionClient ready (iface={iface or 'default'})",
              flush=True)

        self.audio_client = AudioClient()
@@ -86,9 +93,13 @@ class RobotController:
        try:
            self.audio_client.SetVolume(TTS_VOLUME)
        except Exception as e:
            print(f"[BRIDGE][WARN] AudioClient.SetVolume failed: {e}", flush=True)
        print(f"[BRIDGE] G1 AudioClient ready (speaker_id={tts_speaker_id})",
              flush=True)
            print(f"[BRIDGE {_ts()}][WARN] AudioClient.SetVolume failed: {e}", flush=True)
        print(f"[BRIDGE {_ts()}] G1 AudioClient ready (speaker_id={tts_speaker_id}, "
              f"tts_mode={TTS_MODE})", flush=True)

        # Pre-recorded clip library (WAVs under assets/audio/).
        from robot.audio_player import AudioPlayer
        self._player = AudioPlayer(self.audio_client)

        self._tts_worker_thread = threading.Thread(
            target=self._tts_worker_loop, name="TtsWorker", daemon=True,
@@ -99,35 +110,40 @@ class RobotController:
        try:
            from unitree_sdk2py.core.channel import ChannelSubscriber
            from unitree_sdk2py.idl.unitree_hg.msg.dds_ import LowState_
            from saqr.robot.controller import LowStateHub
            from robot.controller import LowStateHub

            self.hub = LowStateHub(watchdog_timeout=0.25)
            self._lowstate_sub = ChannelSubscriber("rt/lowstate", LowState_)
            self._lowstate_sub.Init(self.hub.handler, 10)
            print("[BRIDGE] Subscribed to rt/lowstate (wireless remote)",
            print(f"[BRIDGE {_ts()}] Subscribed to rt/lowstate (wireless remote)",
                  flush=True)
        except Exception as e:
            print(f"[BRIDGE][WARN] LowState subscribe failed: {e}", flush=True)
            print("[BRIDGE][WARN] Trigger keys (R2+X / R2+Y) will not work.",
            print(f"[BRIDGE {_ts()}][WARN] LowState subscribe failed: {e}", flush=True)
            print(f"[BRIDGE {_ts()}][WARN] Trigger keys (R2+X / R2+Y) will not work.",
                  flush=True)
            self.hub = None

    # ── TTS ─────────────────────────────────────────────────────────────────
    def _estimate_tts_seconds(self, text: str) -> float:
        base = max(TTS_MIN_SECONDS, len(text) * TTS_SECONDS_PER_CHAR)
        return base * self._tts_busy_factor
    # ── Public API ──────────────────────────────────────────────────────────
    def speak(self, text: str,
              category: Optional[str] = None, key: Optional[str] = None):
        """Non-blocking — enqueue an announcement for the worker thread.

    def speak(self, text: str):
        """Non-blocking — enqueue the phrase for the worker thread."""
        If ``category`` + ``key`` are provided and a matching WAV exists under
        ``assets/audio/<category>/<key>.wav``, the worker plays the WAV via
        PlayStream (no firmware TTS latency). Otherwise falls back per
        ``tts.mode`` setting.
        """
        if self.dry_run:
            print(f"[BRIDGE] (dry) would TtsMaker({text!r}, "
                  f"speaker_id={self.tts_speaker_id})", flush=True)
            print(f"[BRIDGE {_ts()}] (dry) would speak({text!r}, category={category!r}, "
                  f"key={key!r})", flush=True)
            return
        if self.audio_client is None:
            return
        if self._tts_queue and self._tts_queue[-1] == text:
        item: QueueItem = (text, category, key)
        # Drop adjacent duplicates (same text + routing).
        if self._tts_queue and self._tts_queue[-1] == item:
            return
        self._tts_queue.append(text)
        self._tts_queue.append(item)
        self._tts_event.set()

def shutdown_tts(self):
|
||||
@ -136,6 +152,7 @@ class RobotController:
|
||||
if self._tts_worker_thread is not None:
|
||||
self._tts_worker_thread.join(timeout=1.0)
|
||||
|
||||
# ── Worker thread ───────────────────────────────────────────────────────
|
||||
def _tts_worker_loop(self):
|
||||
while not self._tts_worker_stop.is_set():
|
||||
if not self._tts_queue:
|
||||
@ -143,10 +160,49 @@ class RobotController:
|
||||
self._tts_event.clear()
|
||||
continue
|
||||
try:
|
||||
text = self._tts_queue.popleft()
|
||||
item = self._tts_queue.popleft()
|
||||
except IndexError:
|
||||
continue
|
||||
self._speak_blocking(text)
|
||||
self._dispatch(*item)
|
||||
|
||||
def _dispatch(self, text: str, category: Optional[str], key: Optional[str]):
|
||||
"""Route one queue item to PlayStream or TtsMaker per ``tts.mode``."""
|
||||
wants_clip = (
|
||||
TTS_MODE in ("recorded_or_tts", "recorded_only")
|
||||
and category is not None and key is not None
|
||||
and self._player is not None
|
||||
and self._player.has(category, key)
|
||||
)
|
||||
if wants_clip:
|
||||
print(f"[BRIDGE {_ts()}] play -> {category}/{key!r} "
|
||||
f"(text={text!r})", flush=True)
|
||||
call_t0 = time.monotonic()
|
||||
ok = self._player.play(category, key)
|
||||
dt = time.monotonic() - call_t0
|
||||
if ok:
|
||||
print(f"[BRIDGE {_ts()}] play done ({dt*1000:.0f} ms)", flush=True)
|
||||
return
|
||||
# Play failed. Decide by mode whether to fall back to TtsMaker.
|
||||
if TTS_MODE == "recorded_only":
|
||||
print(f"[BRIDGE {_ts()}][WARN] play failed and tts.mode=recorded_only "
|
||||
f"— dropping phrase silently", flush=True)
|
||||
return
|
||||
print(f"[BRIDGE {_ts()}][WARN] play failed; falling back to TtsMaker",
|
||||
flush=True)
|
||||
# fall through to TtsMaker
|
||||
|
||||
if TTS_MODE == "recorded_only":
|
||||
# No clip exists for this phrase and user opted out of TtsMaker.
|
||||
print(f"[BRIDGE {_ts()}] skip (recorded_only, no clip for "
|
||||
f"{category}/{key!r}): {text!r}", flush=True)
|
||||
return
|
||||
|
||||
self._speak_blocking(text)
|
||||
|
||||
# ── TtsMaker path (fallback + legacy) ───────────────────────────────────
|
||||
def _estimate_tts_seconds(self, text: str) -> float:
|
||||
base = max(TTS_MIN_SECONDS, len(text) * TTS_SECONDS_PER_CHAR)
|
||||
return base * self._tts_busy_factor
|
||||
|
||||
def _speak_blocking(self, text: str):
|
||||
if self.audio_client is None:
|
||||
@ -199,18 +255,18 @@ class RobotController:
|
||||
    # ── Arm ─────────────────────────────────────────────────────────────────
    def reject(self, release_after: float):
        if self.dry_run:
            print(f"[BRIDGE] (dry) would run '{REJECT_ACTION}' "
            print(f"[BRIDGE {_ts()}] (dry) would run '{REJECT_ACTION}' "
                  f"then release after {release_after:.1f}s", flush=True)
            return
        if self.arm_client is None or self._action_map is None:
            return
        if REJECT_ACTION not in self._action_map:
            print(f"[BRIDGE][ERR] '{REJECT_ACTION}' not in SDK action_map",
            print(f"[BRIDGE {_ts()}][ERR] '{REJECT_ACTION}' not in SDK action_map",
                  flush=True)
            return
        print(f"[BRIDGE] -> {REJECT_ACTION}", flush=True)
        print(f"[BRIDGE {_ts()}] -> {REJECT_ACTION}", flush=True)
        self.arm_client.ExecuteAction(self._action_map[REJECT_ACTION])
        if release_after > 0:
            time.sleep(release_after)
            print(f"[BRIDGE] -> {RELEASE_ACTION}", flush=True)
            print(f"[BRIDGE {_ts()}] -> {RELEASE_ACTION}", flush=True)
            self.arm_client.ExecuteAction(self._action_map[RELEASE_ACTION])
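The queue-item dedup introduced in the hunk above can be sketched in isolation. The `SpeakQueue` wrapper below is hypothetical scaffolding (not the bridge's real class); only the `QueueItem` shape and the tail-comparison rule are taken from the diff:

```python
from collections import deque
from typing import Optional, Tuple

QueueItem = Tuple[str, Optional[str], Optional[str]]  # (text, category, key)

class SpeakQueue:
    def __init__(self):
        self._tts_queue = deque()  # holds QueueItem tuples

    def speak(self, text: str, category: Optional[str] = None,
              key: Optional[str] = None) -> bool:
        item: QueueItem = (text, category, key)
        # Drop adjacent duplicates (same text + routing), as in the bridge.
        if self._tts_queue and self._tts_queue[-1] == item:
            return False
        self._tts_queue.append(item)
        return True

q = SpeakQueue()
print(q.speak("Safe to enter.", "fixed", "safe"))  # True  (queued)
print(q.speak("Safe to enter.", "fixed", "safe"))  # False (adjacent duplicate)
print(q.speak("Safe to enter.", None, None))       # True  (different routing)
```

Changing either the text or the (category, key) routing breaks the adjacent-duplicate match, so the same phrase can still be re-queued later.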
@@ -1,3 +0,0 @@
"""Saqr — PPE safety detection and G1 humanoid integration."""

__version__ = "0.1.0"
@@ -1,88 +0,0 @@
"""Camera backends: RealSense SDK and OpenCV V4L2."""
from __future__ import annotations

from typing import Optional

import cv2
import numpy as np

from saqr.utils.logger import get_logger

log = get_logger("Inference", "camera")

try:
    import pyrealsense2 as rs
    HAS_REALSENSE = True
except ImportError:
    HAS_REALSENSE = False


class RealSenseCapture:
    """pyrealsense2 pipeline with an OpenCV-like read() interface."""

    def __init__(self, width: int = 640, height: int = 480, fps: int = 30,
                 serial: Optional[str] = None):
        if not HAS_REALSENSE:
            raise RuntimeError("pyrealsense2 not installed")
        self.pipeline = rs.pipeline()
        cfg = rs.config()
        if serial:
            cfg.enable_device(serial)
        cfg.enable_stream(rs.stream.color, width, height, rs.format.bgr8, fps)
        self.profile = self.pipeline.start(cfg)
        self._open = True
        dev = self.profile.get_device()
        log.info(f"RealSense opened | {dev.get_info(rs.camera_info.name)} "
                 f"serial={dev.get_info(rs.camera_info.serial_number)} "
                 f"{width}x{height}@{fps}")

    def isOpened(self) -> bool:
        return self._open

    def read(self):
        if not self._open:
            return False, None
        try:
            frames = self.pipeline.wait_for_frames(timeout_ms=3000)
            color = frames.get_color_frame()
            if not color:
                return False, None
            return True, np.asanyarray(color.get_data())
        except Exception:
            return False, None

    def release(self):
        if self._open:
            self.pipeline.stop()
            self._open = False


def open_capture(source: str):
    if source.lower().startswith("realsense"):
        serial = None
        if ":" in source:
            serial = source.split(":", 1)[1]
        return RealSenseCapture(width=640, height=480, fps=30, serial=serial)

    if str(source).isdigit():
        idx = int(source)
        cap = cv2.VideoCapture(idx)
        if cap.isOpened():
            return cap
        cap = cv2.VideoCapture(idx, cv2.CAP_ANY)
        if cap.isOpened():
            return cap
        cap = cv2.VideoCapture(idx, cv2.CAP_V4L2)
        return cap

    if source.startswith("/dev/video"):
        cap = cv2.VideoCapture(source, cv2.CAP_V4L2)
        if cap.isOpened():
            cap.set(cv2.CAP_PROP_BUFFERSIZE, 1)
            cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc(*"MJPG"))
            cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
            cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
            cap.set(cv2.CAP_PROP_FPS, 30)
        return cap

    return cv2.VideoCapture(source)
@@ -1,41 +0,0 @@
"""Per-track image cropping + capture directory setup."""
from __future__ import annotations

from pathlib import Path
from typing import Dict, Optional

import cv2

from saqr.core.detection import STATUSES
from saqr.core.geometry import clamp_bbox
from saqr.core.paths import CAPTURES_DIR


def setup_capture_dirs() -> Dict[str, Path]:
    dirs: Dict[str, Path] = {}
    for s in STATUSES:
        d = CAPTURES_DIR / s
        d.mkdir(parents=True, exist_ok=True)
        dirs[s] = d
    return dirs


def save_track_image(frame, track, capture_dirs: Dict[str, Path]) -> Optional[Path]:
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = clamp_bbox(track.bbox, w, h)
    if x2 <= x1 or y2 <= y1:
        return None
    crop = frame[y1:y2, x1:x2]
    if crop.size == 0:
        return None

    target = capture_dirs[track.status] / f"track_{track.track_id:04d}.jpg"
    if track.photo_path and track.photo_path != target and track.photo_path.exists():
        try:
            track.photo_path.unlink()
        except OSError:
            pass

    cv2.imwrite(str(target), crop)
    track.photo_path = target
    return target
@@ -1,36 +0,0 @@
"""SAFE / PARTIAL / UNSAFE classification (helmet + vest focus)."""
from __future__ import annotations

from typing import Dict, List, Tuple

from saqr.core.detection import POSITIVE_TO_NEGATIVE, PPE_DISPLAY_ORDER


def status_from_items(items: Dict[str, float]) -> str:
    has_helmet = items.get("helmet", 0.0) > items.get("no-helmet", 0.0) and items.get("helmet", 0.0) > 0
    has_vest = items.get("vest", 0.0) > items.get("no-vest", 0.0) and items.get("vest", 0.0) > 0
    no_helmet = items.get("no-helmet", 0.0) > 0
    no_vest = items.get("no-vest", 0.0) > 0

    if no_helmet or no_vest:
        return "UNSAFE"
    if has_helmet and has_vest:
        return "SAFE"
    if has_helmet or has_vest:
        return "PARTIAL"
    return "UNSAFE"


def split_wearing_missing(items: Dict[str, float]) -> Tuple[List[str], List[str], List[str]]:
    wearing, missing, unknown = [], [], []
    for pos in PPE_DISPLAY_ORDER:
        neg = POSITIVE_TO_NEGATIVE[pos]
        pos_conf = items.get(pos, 0.0)
        neg_conf = items.get(neg, 0.0)
        if pos_conf > neg_conf and pos_conf > 0:
            wearing.append(pos)
        elif neg_conf >= pos_conf and neg_conf > 0:
            missing.append(pos)
        else:
            unknown.append(pos)
    return wearing, missing, unknown
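The helmet/vest decision table in the deleted `status_from_items` can be exercised standalone. This sketch inlines the rules without the saqr imports; it is equivalent because any positive `no-helmet`/`no-vest` confidence already forces UNSAFE before the greater-than comparisons matter:

```python
from typing import Dict

def status_from_items(items: Dict[str, float]) -> str:
    # Any positive "no-helmet"/"no-vest" evidence wins outright.
    if items.get("no-helmet", 0.0) > 0 or items.get("no-vest", 0.0) > 0:
        return "UNSAFE"
    # No negative evidence: classify by which positives are present.
    has_helmet = items.get("helmet", 0.0) > 0
    has_vest = items.get("vest", 0.0) > 0
    if has_helmet and has_vest:
        return "SAFE"
    if has_helmet or has_vest:
        return "PARTIAL"
    return "UNSAFE"

print(status_from_items({"helmet": 0.9, "vest": 0.8}))    # SAFE
print(status_from_items({"helmet": 0.9}))                 # PARTIAL
print(status_from_items({"helmet": 0.9, "no-vest": 0.6})) # UNSAFE
```

Note the asymmetry: a person with no detections at all is UNSAFE, not unknown, which is the conservative default for a gate-keeping robot.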
@@ -1,19 +0,0 @@
"""Canonical project paths, resolved from the saqr package location."""
from __future__ import annotations

from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parents[2]

CONFIG_DIR = PROJECT_ROOT / "config"
DATA_DIR = PROJECT_ROOT / "data"
DATASET_DIR = DATA_DIR / "dataset"
MODELS_DIR = DATA_DIR / "models"

RUNTIME_DIR = PROJECT_ROOT / "runtime"
CAPTURES_DIR = RUNTIME_DIR / "captures"
LOGS_DIR = RUNTIME_DIR / "logs"
RUNS_DIR = RUNTIME_DIR / "runs"

RESULT_CSV = CAPTURES_DIR / "result.csv"
EVENTS_CSV = CAPTURES_DIR / "events.csv"
@@ -1,40 +0,0 @@
"""Per-frame detect + group + track + capture + emit pipeline."""
from __future__ import annotations

from pathlib import Path
from typing import Dict

from saqr.core.capture import save_track_image
from saqr.core.detection import collect_detections
from saqr.core.drawing import draw_track
from saqr.core.events import emit_event, write_result_csv
from saqr.core.grouping import group_detections_to_people
from saqr.core.paths import RESULT_CSV
from saqr.core.tracking import PersonTracker


def process_frame(frame, model, tracker: PersonTracker, frame_idx: int, conf: float,
                  capture_dirs: Dict[str, Path], write_csv: bool = True):
    annotated = frame.copy()
    h, w = annotated.shape[:2]

    detections = collect_detections(frame, model, conf)
    candidates = group_detections_to_people(detections, w, h)
    created, changed = tracker.update(candidates, frame_idx)
    visible = tracker.visible_tracks()

    created_ids = {t.track_id for t in created}
    changed_ids = {t.track_id for t in changed}
    event_ids = created_ids | changed_ids

    for track in visible:
        save_track_image(frame, track, capture_dirs)
        if track.track_id in event_ids:
            ev_type = "NEW" if track.track_id in created_ids else "STATUS_CHANGE"
            emit_event(track, tracker.event_logger, ev_type)
        draw_track(annotated, track)

    if write_csv:
        write_result_csv(list(tracker.tracks.values()), RESULT_CSV)

    return annotated, visible
@@ -15,11 +15,22 @@ set -e
HERE="$(cd "$(dirname "$0")" && pwd)"
PROJECT_ROOT="$(cd "$HERE/.." && pwd)"

ROBOT_IP="${ROBOT_IP:-192.168.123.164}"
ROBOT_USER="${ROBOT_USER:-unitree}"
ROBOT_ENV="${ROBOT_ENV:-marcus}"
REMOTE_DIR="/home/${ROBOT_USER}/Saqr"
SSH_OPTS="-o StrictHostKeyChecking=no -o ConnectTimeout=10"
config_get() {
    python3 -c "
import json, os, sys
with open('$PROJECT_ROOT/config/robot_config.json') as f:
    c = json.load(f)
for k in sys.argv[1].split('.'):
    c = c[k]
print(os.path.expandvars(str(c)))
" "$1"
}

ROBOT_IP="${ROBOT_IP:-$(config_get deploy.robot_ip)}"
ROBOT_USER="${ROBOT_USER:-$(config_get deploy.robot_user)}"
ROBOT_ENV="${ROBOT_ENV:-$(config_get deploy.robot_env)}"
REMOTE_DIR="${REMOTE_DIR:-$(config_get deploy.remote_dir)}"
SSH_OPTS="${SSH_OPTS:-$(config_get deploy.ssh_opts)}"

RUN_AFTER=false

@@ -50,7 +61,10 @@ ssh ${SSH_OPTS} ${ROBOT_USER}@${ROBOT_IP} "echo 'Connected OK'" || {

echo "[2/4] Creating remote directory..."
ssh ${SSH_OPTS} ${ROBOT_USER}@${ROBOT_IP} \
    "mkdir -p ${REMOTE_DIR}/runtime/{captures/{SAFE,PARTIAL,UNSAFE},logs,runs} ${REMOTE_DIR}/data/models"
    "mkdir -p ${REMOTE_DIR}/runtime/captures/{SAFE,PARTIAL,UNSAFE} \
        ${REMOTE_DIR}/runtime/runs \
        ${REMOTE_DIR}/logs \
        ${REMOTE_DIR}/data/models"

echo "[3/4] Rsyncing package..."
RSYNC_OPTS=(
@@ -60,12 +74,16 @@ RSYNC_OPTS=(
    --exclude="*.egg-info" --exclude="build/" --exclude="dist/"
)
rsync "${RSYNC_OPTS[@]}" \
    "${PROJECT_ROOT}/saqr" \
    "${PROJECT_ROOT}/core" \
    "${PROJECT_ROOT}/apps" \
    "${PROJECT_ROOT}/gui" \
    "${PROJECT_ROOT}/robot" \
    "${PROJECT_ROOT}/utils" \
    "${PROJECT_ROOT}/scripts" \
    "${PROJECT_ROOT}/config" \
    "${PROJECT_ROOT}/docs" \
    "${PROJECT_ROOT}/assets" \
    "${PROJECT_ROOT}/pyproject.toml" \
    "${PROJECT_ROOT}/requirements.txt" \
    "${PROJECT_ROOT}/README.md" \
    ${ROBOT_USER}@${ROBOT_IP}:${REMOTE_DIR}/

@@ -84,8 +102,8 @@ ssh ${SSH_OPTS} ${ROBOT_USER}@${ROBOT_IP} bash -s <<INSTALL_EOF
source ~/miniconda3/etc/profile.d/conda.sh
conda activate ${ROBOT_ENV}
cd ${REMOTE_DIR}
pip install -q -e . 2>/dev/null || pip install -e .
chmod +x scripts/start_saqr.sh scripts/run_robot.sh scripts/run_local.sh 2>/dev/null || true
pip install -e .
chmod +x scripts/start_saqr.sh
echo " Install OK"
INSTALL_EOF
178
scripts/generate_phrases.py
Normal file
@@ -0,0 +1,178 @@
#!/usr/bin/env python3
"""
generate_phrases.py — synthesize Saqr's 8 phrases to WAV using piper-tts.

This produces the audio library locally (no G1 mic, no TtsMaker capture,
no PulseAudio) at exactly the format robot.audio_player expects:
16 kHz mono int16 WAV under assets/audio/<category>/<key>.wav.

Setup (once):
    pip install piper-tts
    python -m piper.download_voices en_US-amy-medium

Usage:
    python scripts/generate_phrases.py
    python scripts/generate_phrases.py --voice en_US-lessac-medium
    python scripts/generate_phrases.py --only safe,helmet_vest
    python scripts/generate_phrases.py --voice-dir /custom/path

Common voices (run the download command above, swap the name):
    en_US-amy-medium      female, neutral   (~60 MB)
    en_US-lessac-medium   female, friendly  (~60 MB)
    en_US-ryan-high       male, clear       (~120 MB)
    en_GB-alan-medium     male, British     (~60 MB)
"""
from __future__ import annotations

import argparse
import io
import sys
import wave
from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parent.parent
ASSETS_ROOT = PROJECT_ROOT / "assets" / "audio"

PHRASES = [
    ("fixed", "ready",
     "Saqr is running. Press R2 plus X to start."),
    ("fixed", "deactivated",
     "Saqr deactivated."),
    ("fixed", "no_camera",
     "Camera not connected. Please plug in the camera and try again."),
    ("fixed", "safe",
     "Safe to enter. Have a good day."),
    ("fixed", "unsafe_generic",
     "Please stop. Wear your proper safety equipment."),
    ("unsafe_missing", "helmet",
     "Please stop. Wear your proper safety equipment. You are missing helmet."),
    ("unsafe_missing", "vest",
     "Please stop. Wear your proper safety equipment. You are missing vest."),
    ("unsafe_missing", "helmet_vest",
     "Please stop. Wear your proper safety equipment. You are missing helmet and vest."),
]


def find_voice_files(voice_name: str, override_dir: Path = None):
    """Locate <voice_name>.onnx + .onnx.json in piper's standard dirs."""
    search_dirs = []
    if override_dir is not None:
        search_dirs.append(Path(override_dir))
    search_dirs += [
        Path.home() / ".local" / "share" / "piper-voices",
        Path.home() / ".cache" / "piper" / "voices",
        Path.cwd(),
    ]
    for d in search_dirs:
        onnx = d / f"{voice_name}.onnx"
        js = d / f"{voice_name}.onnx.json"
        if onnx.exists() and js.exists():
            return onnx, js
    return None, None


def resample_to_16k_mono_int16(pcm: bytes, src_rate: int, src_channels: int) -> bytes:
    """Linear-resample int16 PCM to 16 kHz mono."""
    import numpy as np
    a = np.frombuffer(pcm, dtype=np.int16)
    if src_channels > 1:
        a = a.reshape(-1, src_channels).mean(axis=1).astype(np.int16)
    if src_rate == 16000:
        return a.tobytes()
    target_len = int(round(len(a) * 16000 / src_rate))
    idx = np.linspace(0, len(a) - 1, target_len).astype(np.int64)
    return a[idx].astype(np.int16).tobytes()
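One nuance worth flagging: despite its docstring, the resampler above picks the nearest source sample along a `linspace` index grid rather than interpolating, which is adequate for short speech clips. A standalone check of the length math (numpy assumed installed; the silent input is just a convenient fixture):

```python
import numpy as np

def resample_to_16k_mono_int16(pcm: bytes, src_rate: int, src_channels: int) -> bytes:
    # Downmix interleaved channels, then pick nearest source samples
    # along an evenly spaced index grid (no interpolation).
    a = np.frombuffer(pcm, dtype=np.int16)
    if src_channels > 1:
        a = a.reshape(-1, src_channels).mean(axis=1).astype(np.int16)
    if src_rate == 16000:
        return a.tobytes()
    target_len = int(round(len(a) * 16000 / src_rate))
    idx = np.linspace(0, len(a) - 1, target_len).astype(np.int64)
    return a[idx].astype(np.int16).tobytes()

# 1 s of 8 kHz stereo in → 1 s of 16 kHz mono out: 16000 samples × 2 bytes.
pcm_in = np.zeros(8000 * 2, dtype=np.int16).tobytes()
out = resample_to_16k_mono_int16(pcm_in, 8000, 2)
print(len(out))  # 32000
```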


def save_wav_16k_mono(pcm: bytes, path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(16000)
        wf.writeframes(pcm)


def main():
    ap = argparse.ArgumentParser(
        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    ap.add_argument("--voice", default="en_US-amy-medium",
                    help="piper voice name (default en_US-amy-medium)")
    ap.add_argument("--voice-dir", default=None,
                    help="override directory to search for the voice .onnx files")
    ap.add_argument("--only", default=None,
                    help="comma-separated keys to generate (e.g. safe,helmet)")
    ap.add_argument("--length-scale", type=float, default=None,
                    help="speaking rate (piper default ~1.0; 0.8 faster, 1.2 slower)")
    args = ap.parse_args()

    only = set(args.only.split(",")) if args.only else None
    targets = [(c, k, t) for (c, k, t) in PHRASES if only is None or k in only]
    if not targets:
        print(f"No phrases match --only={args.only!r}.")
        sys.exit(1)

    # Import piper
    try:
        from piper.voice import PiperVoice
    except ImportError:
        print("ERROR: piper-tts is not installed in this Python environment.")
        print()
        print("Install it:")
        print(" pip install piper-tts")
        print(f" python -m piper.download_voices {args.voice}")
        sys.exit(1)

    onnx, js = find_voice_files(args.voice, args.voice_dir)
    if onnx is None:
        print(f"ERROR: voice files for {args.voice!r} not found.")
        print()
        print("Download:")
        print(f" python -m piper.download_voices {args.voice}")
        print()
        print("Or pass --voice-dir pointing at a directory that contains")
        print(f" {args.voice}.onnx + {args.voice}.onnx.json")
        sys.exit(1)

    print(f"Loading voice: {onnx}")
    voice = PiperVoice.load(str(onnx), config_path=str(js))
    src_rate = int(voice.config.sample_rate)
    print(f"Native rate: {src_rate} Hz")
    print(f"Target: 16000 Hz mono int16 under {ASSETS_ROOT}")
    print()

    synth_kwargs = {}
    if args.length_scale is not None:
        synth_kwargs["length_scale"] = args.length_scale

    for cat, key, text in targets:
        out = ASSETS_ROOT / cat / f"{key}.wav"
        print(f"[{cat}/{key}]")
        print(f" text : {text}")

        # Synthesize into an in-memory WAV buffer.
        buf = io.BytesIO()
        with wave.open(buf, "wb") as wf:
            voice.synthesize(text, wf, **synth_kwargs)
        buf.seek(0)
        with wave.open(buf, "rb") as wf:
            pcm = wf.readframes(wf.getnframes())
            rate_in = wf.getframerate()
            channels = wf.getnchannels()

        resampled = resample_to_16k_mono_int16(pcm, rate_in, channels)
        save_wav_16k_mono(resampled, out)

        dur = len(resampled) / 2.0 / 16000
        print(f" saved → {out.relative_to(PROJECT_ROOT)} ({dur:.1f}s)")
        print()

    print("Done. Deploy + restart:")
    print(" scripts/deploy.sh")
    print(" ssh unitree@192.168.123.164 'sudo systemctl restart saqr-bridge && "
          "journalctl -u saqr-bridge -n 20 | grep audio_player'")


if __name__ == "__main__":
    main()
206
scripts/import_voices.py
Normal file
@@ -0,0 +1,206 @@
#!/usr/bin/env python3
"""
import_voices.py — Convert TTS MP3s to Saqr's audio library format.

Reads MP3s from a source directory, maps each to its correct
``assets/audio/<category>/<key>.wav`` path, and transcodes to
16 kHz mono int16 WAV using ffmpeg.

Matching rules:
  1. If the filename (without extension, trailing dots stripped) exactly
     equals one of the known phrases, that's a direct map.
  2. Any file that doesn't match — typical case is ``ttsmaker-file-*.mp3``
     where the site didn't rename it — is assigned to whichever phrase
     key is still missing. If exactly ONE phrase is unmapped and ONE
     file is unmatched, they're paired. Otherwise you're prompted.

Requires ffmpeg on PATH: sudo apt install -y ffmpeg

Usage:
    python scripts/import_voices.py
    python scripts/import_voices.py --src "/home/zedx/Downloads/voices saqr"
    python scripts/import_voices.py --dry-run   # just show mapping
"""
from __future__ import annotations

import argparse
import shutil
import subprocess
import sys
from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parent.parent
ASSETS_ROOT = PROJECT_ROOT / "assets" / "audio"
DEFAULT_SRC = Path("/home/zedx/Downloads/voices saqr")

# (category, key, text) — filename match uses the text (with or without
# trailing period).
PHRASES = [
    ("fixed", "ready",
     "Saqr is running. Press R2 plus X to start."),
    ("fixed", "deactivated",
     "Saqr deactivated."),
    ("fixed", "no_camera",
     "Camera not connected. Please plug in the camera and try again."),
    ("fixed", "safe",
     "Safe to enter. Have a good day."),
    ("fixed", "unsafe_generic",
     "Please stop. Wear your proper safety equipment."),
    ("unsafe_missing", "helmet",
     "Please stop. Wear your proper safety equipment. You are missing helmet."),
    ("unsafe_missing", "vest",
     "Please stop. Wear your proper safety equipment. You are missing vest."),
    ("unsafe_missing", "helmet_vest",
     "Please stop. Wear your proper safety equipment. You are missing helmet and vest."),
]

# Text → (category, key) and normalised-text → (category, key)
TEXT_TO_KEY = {text: (cat, key) for (cat, key, text) in PHRASES}


def normalise(stem: str) -> str:
    """Strip trailing dots so 'Saqr deactivated.' matches 'Saqr deactivated..mp3'."""
    return stem.rstrip(".").strip()


def match_by_filename(stem: str):
    candidates = {stem, stem + ".", normalise(stem), normalise(stem) + "."}
    for c in candidates:
        if c in TEXT_TO_KEY:
            return TEXT_TO_KEY[c]
    return None
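The candidate-set trick in `match_by_filename` above handles the dot-mangled filenames TTS sites produce. A quick standalone check, with the phrase table trimmed to one entry for brevity:

```python
TEXT_TO_KEY = {"Saqr deactivated.": ("fixed", "deactivated")}

def normalise(stem: str) -> str:
    # 'Saqr deactivated..mp3' has stem 'Saqr deactivated.' — strip the dots.
    return stem.rstrip(".").strip()

def match_by_filename(stem: str):
    # Try the stem as-is, plus with/without a trailing period.
    candidates = {stem, stem + ".", normalise(stem), normalise(stem) + "."}
    for c in candidates:
        if c in TEXT_TO_KEY:
            return TEXT_TO_KEY[c]
    return None

print(match_by_filename("Saqr deactivated."))  # ('fixed', 'deactivated')
print(match_by_filename("Saqr deactivated"))   # ('fixed', 'deactivated')
print(match_by_filename("ttsmaker-file-123"))  # None
```

The `None` case is what feeds the script's pass-2 auto-assignment (or the manual-mapping prompt).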


def convert_to_wav(src: Path, dst: Path, dry_run: bool = False) -> None:
    dst.parent.mkdir(parents=True, exist_ok=True)
    cmd = [
        "ffmpeg", "-y", "-loglevel", "error",
        "-i", str(src),
        "-ac", "1",            # mono
        "-ar", "16000",        # 16 kHz
        "-sample_fmt", "s16",  # 16-bit int
        str(dst),
    ]
    if dry_run:
        return
    subprocess.run(cmd, check=True)


def main():
    ap = argparse.ArgumentParser(description=__doc__,
                                 formatter_class=argparse.RawDescriptionHelpFormatter)
    ap.add_argument("--src", default=str(DEFAULT_SRC),
                    help=f"source directory of MP3s (default: {DEFAULT_SRC!s})")
    ap.add_argument("--dry-run", action="store_true",
                    help="print the mapping but don't convert")
    args = ap.parse_args()

    if shutil.which("ffmpeg") is None and not args.dry_run:
        print("ERROR: ffmpeg not on PATH. Install it first:")
        print(" sudo apt install -y ffmpeg")
        sys.exit(1)

    src_dir = Path(args.src)
    if not src_dir.exists():
        print(f"ERROR: source directory not found: {src_dir}")
        sys.exit(1)

    mp3s = sorted(src_dir.glob("*.mp3"))
    if not mp3s:
        print(f"No .mp3 files under {src_dir}")
        sys.exit(1)

    print(f"Source: {src_dir}")
    print(f"Target: {ASSETS_ROOT}")
    print(f"Files: {len(mp3s)} mp3 (expected 8)")
    print()

    # Pass 1: match by filename.
    assigned: dict[tuple, Path] = {}  # (cat,key) → src mp3
    unmatched: list[Path] = []
    for mp3 in mp3s:
        key = match_by_filename(mp3.stem)
        if key is None:
            unmatched.append(mp3)
        elif key in assigned:
            print(f" [WARN] duplicate match for {key}: keeping {assigned[key].name},"
                  f" ignoring {mp3.name}")
        else:
            assigned[key] = mp3

    # Pass 2: assign unmatched files to still-missing phrases.
    missing_keys = [(cat, key) for (cat, key, _) in PHRASES if (cat, key) not in assigned]
    if len(unmatched) == 1 and len(missing_keys) == 1:
        only_key = missing_keys[0]
        only_file = unmatched[0]
        print(f" [AUTO] assigning '{only_file.name}' → {only_key[0]}/{only_key[1]} "
              f"(the only phrase still unmapped)")
        assigned[only_key] = only_file
        unmatched = []
        missing_keys = []
    elif unmatched:
        # Prompt user
        print(f" [WARN] {len(unmatched)} unmatched file(s) and "
              f"{len(missing_keys)} unassigned phrase(s). Manual mapping needed:")
        for f in unmatched:
            print(f" unmatched: {f.name}")
        for (cat, key) in missing_keys:
            print(f" missing: {cat}/{key}")
        print()
        print("Resolve manually by renaming the source files to one of:")
        for (_, _, text) in PHRASES:
            print(f" '{text}.mp3'")
        sys.exit(1)

    # Show the plan
    print("Mapping:")
    for (cat, key, _) in PHRASES:
        src = assigned.get((cat, key))
        if src is None:
            print(f" [MISSING] no source for {cat}/{key}")
            continue
        rel = src.name if len(src.name) <= 60 else src.name[:57] + "..."
        print(f" {rel:<60s} → assets/audio/{cat}/{key}.wav")
    print()

    if args.dry_run:
        print("(dry-run — nothing written)")
        return

    # Convert
    print("Converting…")
    converted = 0
    for (cat, key, _) in PHRASES:
        src = assigned.get((cat, key))
        if src is None:
            continue
        dst = ASSETS_ROOT / cat / f"{key}.wav"
        try:
            convert_to_wav(src, dst)
        except subprocess.CalledProcessError as e:
            print(f" [FAIL] {src.name} → {dst.name}: ffmpeg exited {e.returncode}")
            continue

        # Probe the result
        import wave
        with wave.open(str(dst), "rb") as wf:
            ok = (wf.getframerate() == 16000
                  and wf.getnchannels() == 1
                  and wf.getsampwidth() == 2)
            dur = wf.getnframes() / wf.getframerate()
        mark = "✓" if ok else "⚠ wrong format"
        print(f" {mark} {dst.relative_to(PROJECT_ROOT)} ({dur:.1f}s)")
        if ok:
            converted += 1

    print()
    print(f"Done — {converted}/{len(PHRASES)} clip(s) converted.")
    print()
    print("Deploy + restart:")
    print(" scripts/deploy.sh")
    print(" ssh unitree@192.168.123.164 "
          "'sudo systemctl restart saqr-bridge && journalctl -u saqr-bridge -n 20 | grep audio_player'")


if __name__ == "__main__":
    main()
408
scripts/record_phrases.py
Executable file
@@ -0,0 +1,408 @@
#!/usr/bin/env python3
"""
record_phrases.py — Build Saqr's audio library for AudioClient.PlayStream.

Two modes:

  DEFAULT (TTS capture) — the script calls AudioClient.TtsMaker for each
                          phrase and simultaneously records the G1 speaker
                          output via the mic. Zero speaking required. Voice
                          is the G1's own TTS voice, but at runtime
                          PlayStream skips the firmware synthesis buffer so
                          it plays back ~200–700 ms faster.

  --mic (your voice)    — the script shows each phrase, counts down, and
                          records whatever the mic hears. You speak each
                          line yourself.

Both modes save 16 kHz mono int16 WAVs under
``assets/audio/<category>/<key>.wav`` — exactly what ``robot/audio_player``
expects.

Usage (run on the robot or any machine on the G1 subnet):

    python3 scripts/record_phrases.py                     # TTS capture, all 8
    python3 scripts/record_phrases.py --only safe,helmet  # just those two
    python3 scripts/record_phrases.py --iface enp3s0      # custom DDS iface
    python3 scripts/record_phrases.py --mic               # your voice instead
    python3 scripts/record_phrases.py --play              # verify each clip by replaying

Requires ``unitree_sdk2py`` in the active conda env (TTS-capture and --play).
"""
from __future__ import annotations

import argparse
import socket
import struct
import subprocess
import sys
import threading
import time
import wave
from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parent.parent
ASSETS_ROOT = PROJECT_ROOT / "assets" / "audio"

MCAST_GRP = "239.168.123.161"
MCAST_PORT = 5555
RATE = 16000

# (category, key, text).
PHRASES = [
    ("fixed", "ready",
     "Saqr is running. Press R2 plus X to start."),
    ("fixed", "deactivated",
     "Saqr deactivated."),
    ("fixed", "no_camera",
     "Camera not connected. Please plug in the camera and try again."),
    ("fixed", "safe",
     "Safe to enter. Have a good day."),
    ("fixed", "unsafe_generic",
     "Please stop. Wear your proper safety equipment."),
    ("unsafe_missing", "helmet",
     "Please stop. Wear your proper safety equipment. You are missing helmet."),
    ("unsafe_missing", "vest",
     "Please stop. Wear your proper safety equipment. You are missing vest."),
    ("unsafe_missing", "helmet_vest",
     "Please stop. Wear your proper safety equipment. You are missing helmet and vest."),
]


# ── mic capture ─────────────────────────────────────────────────────────────
def find_local_ip() -> str:
    """Pick the first 192.168.123.x address on this machine."""
    out = subprocess.run(
        ["ip", "-4", "-o", "addr"], capture_output=True, text=True
    ).stdout
    for line in out.splitlines():
        for tok in line.split():
            if tok.startswith("192.168.123."):
                return tok.split("/")[0]
    raise RuntimeError(
        "No 192.168.123.x address on this machine — connect to the G1 network first."
    )


def _open_mcast_socket(local_ip: str) -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", MCAST_PORT))
    mreq = struct.pack("4s4s", socket.inet_aton(MCAST_GRP), socket.inet_aton(local_ip))
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return s


def record_multicast_fixed(seconds: float) -> bytes:
    """Blocking mic capture for exactly ``seconds`` (used by --mic mode)."""
    local_ip = find_local_ip()
    sock = _open_mcast_socket(local_ip)
    sock.settimeout(2.0)
    target_bytes = int(RATE * 2 * seconds)
    buf = bytearray()
    t0 = time.time()
    try:
        while len(buf) < target_bytes and time.time() - t0 < seconds + 5:
            try:
                data, _ = sock.recvfrom(4096)
                buf.extend(data)
            except socket.timeout:
                continue
    finally:
        sock.close()
    return bytes(buf)
|
||||
|
||||
|
||||
def record_while_tts_plays(ac, text: str, speaker_id: int,
                           capture_seconds: float) -> bytes:
    """Start mic capture, call TtsMaker, capture for ``capture_seconds``, stop."""
    local_ip = find_local_ip()
    buf = bytearray()
    stop_flag = threading.Event()

    def _mic():
        sock = _open_mcast_socket(local_ip)
        sock.settimeout(0.5)
        try:
            while not stop_flag.is_set():
                try:
                    data, _ = sock.recvfrom(4096)
                    buf.extend(data)
                except socket.timeout:
                    continue
        finally:
            sock.close()

    t = threading.Thread(target=_mic, daemon=True)
    t.start()
    time.sleep(0.3)  # let the mic socket settle before TTS fires

    try:
        code = ac.TtsMaker(text, speaker_id)
    except Exception as e:
        print(f" [FAIL] TtsMaker raised: {e}")
        stop_flag.set()
        t.join(timeout=1.0)
        return b""
    if code != 0:
        print(f" [FAIL] TtsMaker rc={code} — retry in 2s…")
        time.sleep(2.0)
        try:
            code = ac.TtsMaker(text, speaker_id)
        except Exception as e:
            print(f" [FAIL] TtsMaker retry raised: {e}")
            stop_flag.set()
            t.join(timeout=1.0)
            return b""
        if code != 0:
            print(f" [FAIL] TtsMaker retry rc={code}")
            stop_flag.set()
            t.join(timeout=1.0)
            return b""

    time.sleep(capture_seconds)
    stop_flag.set()
    t.join(timeout=1.0)
    return bytes(buf)

# ── processing ──────────────────────────────────────────────────────────────
def rms_and_peak(pcm: bytes):
    import numpy as np
    a = np.frombuffer(pcm, dtype=np.int16)
    if a.size == 0:
        return 0.0, 0
    rms = float(np.sqrt(np.mean(a.astype(np.float32) ** 2)))
    peak = int(np.abs(a).max())
    return rms, peak


def trim_leading_silence(pcm: bytes, threshold: int = 500,
                         head_ms: int = 80) -> bytes:
    """Cut silence before the first sample above threshold; keep ``head_ms`` lead-in."""
    import numpy as np
    a = np.frombuffer(pcm, dtype=np.int16)
    above = np.where(np.abs(a) > threshold)[0]
    if above.size == 0:
        return pcm
    start = max(0, int(above[0]) - int(head_ms / 1000.0 * RATE))
    return a[start:].tobytes()


def trim_trailing_silence(pcm: bytes, threshold: int = 500,
                          tail_ms: int = 150) -> bytes:
    """Cut silence after the last sample above threshold; keep ``tail_ms`` tail-out."""
    import numpy as np
    a = np.frombuffer(pcm, dtype=np.int16)
    above = np.where(np.abs(a) > threshold)[0]
    if above.size == 0:
        return pcm
    end = int(above[-1]) + int(tail_ms / 1000.0 * RATE)
    end = min(end, a.size)
    return a[:end].tobytes()


def save_wav(pcm: bytes, path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(RATE)
        wf.writeframes(pcm)


def estimate_tts_duration(text: str) -> float:
    """Match RobotController's pacing estimate (0.12 s/char, min 2.5 s)."""
    return max(2.5, len(text) * 0.12)

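# Worked example: the "safe" phrase ("Safe to enter. Have a good day.") is
# 31 characters, so estimate_tts_duration returns max(2.5, 31 * 0.12) ≈ 3.7 s;
# main() then pads the TTS capture window by a further 1.5 s.
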
# ── optional playback verification ──────────────────────────────────────────
def play_on_g1(ac, path: Path) -> None:
    with wave.open(str(path), "rb") as wf:
        pcm = wf.readframes(wf.getnframes())
    CHUNK = 96000
    sid = f"verify_{int(time.time() * 1000)}"
    offset = 0
    while offset < len(pcm):
        chunk = pcm[offset:offset + CHUNK]
        ac.PlayStream("saqr_verify", sid, chunk)
        offset += len(chunk)
        time.sleep(len(chunk) / (RATE * 2) / 2)
    time.sleep(len(pcm) / (RATE * 2) + 0.3)
    try:
        ac.PlayStop("saqr_verify")
    except Exception:
        pass


def countdown(seconds: int) -> None:
    for i in range(seconds, 0, -1):
        print(f" starting in {i}...", end="\r", flush=True)
        time.sleep(1)
    print(" " + " " * 30, end="\r")

# ── main ────────────────────────────────────────────────────────────────────
def main():
    ap = argparse.ArgumentParser(
        description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    ap.add_argument("--mic", action="store_true",
                    help="record YOUR voice via the G1 mic instead of TtsMaker")
    ap.add_argument("--duration", type=float, default=None,
                    help="seconds to record per phrase (default: estimate per phrase)")
    ap.add_argument("--pause", type=float, default=1.0,
                    help="seconds of pause between phrases (default 1)")
    ap.add_argument("--only", default=None,
                    help="comma-separated keys to record (e.g. 'safe,helmet,vest')")
    ap.add_argument("--iface", default="eth0",
                    help="DDS iface for TtsMaker / PlayStream (default eth0)")
    ap.add_argument("--speaker-id", type=int, default=2,
                    help="TtsMaker speaker_id (2 = English on current firmware)")
    ap.add_argument("--volume", type=int, default=100,
                    help="G1 speaker volume 0–100 (default 100)")
    ap.add_argument("--play", action="store_true",
                    help="play each saved clip back on the G1 via PlayStream")
    ap.add_argument("--countdown", type=int, default=2,
                    help="countdown seconds before --mic recordings (default 2)")
    ap.add_argument("--no-trim", action="store_true",
                    help="don't auto-trim leading/trailing silence")
    ap.add_argument("--threshold", type=int, default=500,
                    help="amplitude threshold for silence trim (default 500)")
    args = ap.parse_args()

    only = set(args.only.split(",")) if args.only else None
    targets = [(c, k, t) for (c, k, t) in PHRASES if only is None or k in only]
    if not targets:
        print(f"No phrases match --only={args.only!r}. Known keys: "
              f"{', '.join(k for _, k, _ in PHRASES)}")
        sys.exit(1)

    try:
        local_ip = find_local_ip()
    except RuntimeError as e:
        print(f"[FATAL] {e}")
        sys.exit(1)

    mode = "mic" if args.mic else "tts"

    # Init AudioClient for TTS mode or --play.
    ac = None
    if mode == "tts" or args.play:
        try:
            from unitree_sdk2py.core.channel import ChannelFactoryInitialize
            from unitree_sdk2py.g1.audio.g1_audio_client import AudioClient
        except ImportError as e:
            print(f"[FATAL] unitree_sdk2py not available in this env: {e}")
            sys.exit(1)
        print(f"[init] ChannelFactoryInitialize(0, {args.iface!r})")
        ChannelFactoryInitialize(0, args.iface)
        ac = AudioClient()
        ac.SetTimeout(10.0)
        ac.Init()
        try:
            ac.SetVolume(args.volume)
        except Exception as e:
            print(f"[init][WARN] SetVolume failed: {e}")
        print(f"[init] AudioClient ready (speaker_id={args.speaker_id} volume={args.volume})")
        print()

    print("=" * 60)
    print(f" Saqr phrase recorder — {len(targets)} clip(s)")
    print(f" Mode: {mode.upper()} "
          f"({'TtsMaker → mic' if mode == 'tts' else 'your voice → mic'})")
    print(f" Duration: {('auto (per phrase)' if args.duration is None else f'{args.duration}s')}")
    print(f" Pause: {args.pause}s between phrases")
    print(f" Trim: {'off' if args.no_trim else f'on (threshold={args.threshold})'}")
    print(f" Output: {ASSETS_ROOT}")
    print(f" Mic feed: {MCAST_GRP}:{MCAST_PORT} (local IP: {local_ip})")
    print("=" * 60)
    print()

    if mode == "mic":
        print("Stand within ~1 m of the G1. When you see '>>> SPEAK NOW <<<' say")
        print("the SAY: line clearly. Target rms 500–5000.")
    else:
        print("The G1 will speak each phrase via TtsMaker while the script records")
        print("through the mic. Stay quiet during capture; any room noise you hear")
        print("will end up baked into the clip.")
    print()
    try:
        input("Press Enter to start…")
    except EOFError:
        return
    print()

    saved = []
    for i, (cat, key, text) in enumerate(targets, 1):
        out = ASSETS_ROOT / cat / f"{key}.wav"
        print(f"── [{i}/{len(targets)} {cat}/{key}] ──")
        print(f" SAY: {text}")

        if mode == "mic":
            if args.countdown > 0:
                countdown(args.countdown)
            dur = args.duration if args.duration is not None else 5.0
            print(f" >>> SPEAK NOW — {dur}s <<<", flush=True)
            pcm = record_multicast_fixed(dur)
        else:
            dur = args.duration if args.duration is not None else \
                estimate_tts_duration(text) + 1.5
            print(f" TtsMaker playing… (capturing {dur:.1f}s)", flush=True)
            pcm = record_while_tts_plays(ac, text, args.speaker_id, dur)

        if not pcm:
            print(" [FAIL] no audio captured")
            print()
            continue

        if not args.no_trim:
            pcm = trim_leading_silence(pcm, threshold=args.threshold)
            pcm = trim_trailing_silence(pcm, threshold=args.threshold)

        rms, peak = rms_and_peak(pcm)
        dur_s = len(pcm) / 2.0 / RATE
        save_wav(pcm, out)

        if rms < 400:
            marker = f" ⚠ TOO QUIET — re-record with --only {key}"
        elif peak > 30000:
            marker = " ⚠ CLIPPING"
        else:
            marker = " ✓"
        print(f" saved → {out.relative_to(PROJECT_ROOT)} "
              f"({dur_s:.1f}s rms={rms:.0f} peak={peak}){marker}")
        saved.append((cat, key, out, dur_s, rms, peak))

        if args.play and ac is not None:
            print(" playing back…")
            try:
                play_on_g1(ac, out)
            except Exception as e:
                print(f" [play] failed: {e}")

        if i < len(targets) and args.pause > 0:
            time.sleep(args.pause)
        print()

    # Summary
    print("=" * 60)
    print(f" Done — {len(saved)}/{len(targets)} clip(s) saved")
    print("=" * 60)
    bad = []
    for cat, key, path, dur, rms, peak in saved:
        warn = " ⚠ low level" if rms < 400 else (" ⚠ clipping" if peak > 30000 else "")
        if warn:
            bad.append(key)
        print(f" {cat}/{key:20s} {dur:4.1f}s rms={rms:5.0f} peak={peak:5d}{warn}")
    print()
    if bad:
        print("Re-record the flagged ones with:")
        print(f"  python3 scripts/record_phrases.py --only {','.join(bad)}")
        print()
    print("Next:")
    print("  1. (on robot) sudo systemctl restart saqr-bridge")
    print("  2. expect: [audio_player] loaded N clip(s): ...")


if __name__ == "__main__":
    main()
@@ -1,87 +0,0 @@

#!/bin/bash
# ============================================================================
# Saqr PPE Detection - Run on Local Laptop
# ============================================================================
#
# Usage:
#   scripts/run_local.sh                      # webcam 0
#   scripts/run_local.sh --source 1           # webcam 1
#   scripts/run_local.sh --source video.mp4   # video file
#   scripts/run_local.sh --gui                # PySide6 GUI
#   scripts/run_local.sh --detect             # simple detection (no tracking)
#
# ============================================================================

set -e

HERE="$(cd "$(dirname "$0")" && pwd)"
PROJECT_ROOT="$(cd "$HERE/.." && pwd)"
cd "$PROJECT_ROOT"

SOURCE="0"
MODEL="saqr_best.pt"
CONF="0.35"
MODE="saqr"          # saqr | gui | detect
HEADLESS=false
MAX_MISSING=90
MATCH_DIST=250
CONFIRM=5

while [[ $# -gt 0 ]]; do
    case $1 in
        --source) SOURCE="$2"; shift 2 ;;
        --model) MODEL="$2"; shift 2 ;;
        --conf) CONF="$2"; shift 2 ;;
        --gui) MODE="gui"; shift ;;
        --detect) MODE="detect"; shift ;;
        --headless) HEADLESS=true; shift ;;
        --max-missing) MAX_MISSING="$2"; shift 2 ;;
        --match-distance) MATCH_DIST="$2"; shift 2 ;;
        --confirm) CONFIRM="$2"; shift 2 ;;
        *) echo "Unknown arg: $1"; exit 1 ;;
    esac
done

if command -v conda &>/dev/null; then
    source "$(conda info --base)/etc/profile.d/conda.sh" 2>/dev/null || true
    conda activate AI_MSI_yolo 2>/dev/null || true
fi

echo "============================================"
echo " Saqr PPE Detection - Local Laptop"
echo "============================================"
echo " Mode   : $MODE"
echo " Source : $SOURCE"
echo " Model  : $MODEL"
echo " Conf   : $CONF"
echo "============================================"
echo ""

HEADLESS_FLAG=""
if [ "$HEADLESS" = true ]; then
    HEADLESS_FLAG="--headless"
fi

case $MODE in
    saqr)
        python -m saqr.apps.saqr_cli \
            --source "$SOURCE" \
            --model "$MODEL" \
            --conf "$CONF" \
            --max-missing "$MAX_MISSING" \
            --match-distance "$MATCH_DIST" \
            --status-confirm-frames "$CONFIRM" \
            $HEADLESS_FLAG
        ;;
    gui)
        python -m saqr.gui.app \
            --source "$SOURCE" \
            --model "$MODEL"
        ;;
    detect)
        python -m saqr.apps.detect_cli \
            --source "$SOURCE" \
            --model "$MODEL" \
            --conf "$CONF"
        ;;
esac
@@ -1,102 +0,0 @@

#!/bin/bash
# ============================================================================
# Saqr PPE Detection - Run on Unitree G1 Robot (no bridge, direct saqr run)
# ============================================================================
#
# Run on the robot's physical terminal (with monitor) or via ssh -X:
#   scripts/run_robot.sh
#   scripts/run_robot.sh --headless              # no display
#   scripts/run_robot.sh --source /dev/video2    # V4L2 fallback
#
# For the production R2+X / R2+Y workflow, use scripts/start_saqr.sh instead.
# ============================================================================

set -e

HERE="$(cd "$(dirname "$0")" && pwd)"
PROJECT_ROOT="$(cd "$HERE/.." && pwd)"
cd "$PROJECT_ROOT"

SOURCE="realsense"
MODEL="saqr_best.pt"
CONF="0.35"
HEADLESS=false
MAX_MISSING=120
MATCH_DIST=300
CONFIRM=7
DEVICE="0"
IMGSZ=320
HALF=true
STREAM_PORT=0

while [[ $# -gt 0 ]]; do
    case $1 in
        --source) SOURCE="$2"; shift 2 ;;
        --model) MODEL="$2"; shift 2 ;;
        --conf) CONF="$2"; shift 2 ;;
        --headless) HEADLESS=true; shift ;;
        --max-missing) MAX_MISSING="$2"; shift 2 ;;
        --match-distance) MATCH_DIST="$2"; shift 2 ;;
        --confirm) CONFIRM="$2"; shift 2 ;;
        --device) DEVICE="$2"; shift 2 ;;
        --imgsz) IMGSZ="$2"; shift 2 ;;
        --no-half) HALF=false; shift ;;
        --stream) STREAM_PORT="$2"; shift 2 ;;
        --cpu) DEVICE="cpu"; HALF=false; shift ;;
        *) echo "Unknown arg: $1"; exit 1 ;;
    esac
done

source ~/miniconda3/etc/profile.d/conda.sh 2>/dev/null || true
conda activate marcus 2>/dev/null || conda activate teleimager 2>/dev/null || true

YEAR=$(date +%Y)
if [ "$YEAR" -lt 2025 ]; then
    echo "[WARN] System clock is wrong (year=$YEAR). Fixing..."
    echo "123" | sudo -S date -s "2026-04-10 16:00:00" 2>/dev/null || true
fi

if [ "$HEADLESS" = true ]; then
    export QT_QPA_PLATFORM=offscreen
    HEADLESS_FLAG="--headless"
    echo "Mode: HEADLESS (no display, results saved to runtime/captures/)"
else
    xhost + >/dev/null 2>&1 || true
    export DISPLAY=:0
    HEADLESS_FLAG=""
    echo "Mode: DISPLAY (OpenCV window on monitor)"
fi

HALF_FLAG=""
if [ "$HALF" = true ]; then
    HALF_FLAG="--half"
fi

STREAM_FLAG=""
if [ "$STREAM_PORT" -gt 0 ]; then
    STREAM_FLAG="--stream $STREAM_PORT"
fi

echo "============================================"
echo " Saqr PPE Detection - Unitree G1 Robot"
echo "============================================"
echo " Source : $SOURCE"
echo " Model  : $MODEL"
echo " Device : $DEVICE (half=$HALF, imgsz=$IMGSZ)"
echo " Conf   : $CONF"
echo " Stream : ${STREAM_PORT:-disabled}"
echo "============================================"
echo ""

python -m saqr.apps.saqr_cli \
    --source "$SOURCE" \
    --model "$MODEL" \
    --conf "$CONF" \
    --max-missing "$MAX_MISSING" \
    --match-distance "$MATCH_DIST" \
    --status-confirm-frames "$CONFIRM" \
    --device "$DEVICE" \
    --imgsz "$IMGSZ" \
    $HALF_FLAG \
    $STREAM_FLAG \
    $HEADLESS_FLAG
@@ -6,7 +6,7 @@

# What it does:
#   1. Sources miniconda and activates the target env (default: marcus).
#   2. cd to the project root (parent of this scripts/ dir).
#   3. Execs `python -m saqr.robot.bridge` with the production flags.
#   3. Execs `python -m robot.bridge` with the production flags.
#
# The bridge will:
#   - init the G1 arm + audio + LowState DDS clients
@@ -22,19 +22,33 @@ set -u

HERE="$(cd "$(dirname "$0")" && pwd)"
SAQR_DIR="${SAQR_DIR:-$(cd "$HERE/.." && pwd)}"

CONDA_ROOT="${CONDA_ROOT:-$HOME/miniconda3}"
CONDA_ENV="${CONDA_ENV:-marcus}"
DDS_IFACE="${DDS_IFACE:-eth0}"
SAQR_SOURCE="${SAQR_SOURCE:-realsense}"
STREAM_PORT="${STREAM_PORT:-8080}"
# Read defaults from config/robot_config.json (env vars override).
config_get() {
    # config_get dotted.key
    python3 -c "
import json, os, sys
with open('$SAQR_DIR/config/robot_config.json') as f:
    c = json.load(f)
for k in sys.argv[1].split('.'):
    c = c[k]
print(os.path.expandvars(str(c)))
" "$1"
}

CONDA_ROOT="${CONDA_ROOT:-$(config_get start_saqr.conda_root)}"
CONDA_ENV="${CONDA_ENV:-$(config_get start_saqr.conda_env)}"
DDS_IFACE="${DDS_IFACE:-$(config_get start_saqr.dds_iface)}"
SAQR_SOURCE="${SAQR_SOURCE:-$(config_get start_saqr.saqr_source)}"
STREAM_PORT="${STREAM_PORT:-$(config_get start_saqr.stream_port)}"

if [ ! -d "$SAQR_DIR" ]; then
    echo "[start_saqr] FATAL: SAQR_DIR not found: $SAQR_DIR" >&2
    exit 1
fi

if [ ! -d "$SAQR_DIR/saqr" ]; then
    echo "[start_saqr] FATAL: saqr/ package not found in $SAQR_DIR" >&2
if [ ! -f "$SAQR_DIR/robot/bridge.py" ]; then
    echo "[start_saqr] FATAL: robot/bridge.py not found in $SAQR_DIR" >&2
    echo "              (expected $SAQR_DIR to contain core/ apps/ robot/ utils/)" >&2
    exit 1
fi
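# Example config/robot_config.json block for the lookups above (the key names
# come from the config_get calls; the values shown are the previous hardcoded
# defaults, so the actual file may differ):
#   "start_saqr": {
#     "conda_root": "$HOME/miniconda3", "conda_env": "marcus",
#     "dds_iface": "eth0", "saqr_source": "realsense", "stream_port": 8080
#   }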
@@ -58,7 +72,7 @@ cd "$SAQR_DIR" || {

echo "[start_saqr] env=$CONDA_ENV cwd=$PWD iface=$DDS_IFACE source=$SAQR_SOURCE stream=$STREAM_PORT"
echo "[start_saqr] launching bridge..."

exec python3 -m saqr.robot.bridge \
exec python3 -m robot.bridge \
    --iface "$DDS_IFACE" \
    --source "$SAQR_SOURCE" \
    --headless \
0 utils/__init__.py Normal file
45 utils/config.py Normal file
@@ -0,0 +1,45 @@

"""Cached JSON config loader for config/<name>_config.json files.

Usage:
    from utils.config import load_config
    cfg = load_config("core")   # reads config/core_config.json
    conf_threshold = cfg["detection"]["conf"]

String values pass through ``os.path.expandvars`` on load, so JSON like
``"$HOME/miniconda3"`` resolves at read time.
"""
from __future__ import annotations

import json
import os
from typing import Any, Dict

from core.paths import CONFIG_DIR

_CACHE: Dict[str, Dict[str, Any]] = {}


def _expand(value):
    if isinstance(value, str):
        return os.path.expandvars(value)
    if isinstance(value, list):
        return [_expand(v) for v in value]
    if isinstance(value, dict):
        return {k: _expand(v) for k, v in value.items()}
    return value


def load_config(name: str) -> Dict[str, Any]:
    """Return the parsed JSON config for ``config/<name>_config.json`` (cached)."""
    if name in _CACHE:
        return _CACHE[name]

    path = CONFIG_DIR / f"{name}_config.json"
    if not path.exists():
        raise FileNotFoundError(f"Config not found: {path}")

    with open(path, "r") as f:
        raw = json.load(f)
    cfg = _expand(raw)
    _CACHE[name] = cfg
    return cfg
@@ -4,7 +4,7 @@ import json

import logging
from typing import Dict

from saqr.core.paths import CONFIG_DIR, LOGS_DIR
from core.paths import CONFIG_DIR, LOGS_DIR

_LOGGER_CACHE: Dict[str, logging.Logger] = {}
@@ -23,7 +23,7 @@ def _level_from_name(name: str) -> int:

def get_logger(category: str, name: str) -> logging.Logger:
    """Return a cached logger that writes to runtime/logs/<category>/<name>.log."""
    """Return a cached logger that writes to logs/<category>/<name>.log."""
    key = f"{category}.{name}"
    if key in _LOGGER_CACHE:
        return _LOGGER_CACHE[key]