Full-day voice-stack refactor. Experiments run and reverted:
- Gemini Live HTTP microservice (Python 3.8 env incompat, latency)
- Vosk grammar STT (English lexicon can't decode 'Sanad'; big model
cold-load too slow on Jetson CPU)
Kept architecture:
- Voice/wake_detector.py — pure-numpy energy state machine with
adaptive baseline, burst-audio capture for post-hoc verify.
- Voice/marcus_voice.py — orchestrator with 3 modes
(wake_and_command / always_on / always_on_gated), hysteretic VAD,
pre-silence trim (300 ms pre-roll), DSP pipeline (DC remove,
80 Hz HPF, 0.97 pre-emphasis, peak-normalize), faster-whisper
base.en int8 with beam=8 + temperature fallback [0,0.2,0.4],
fuzzy-match canonicalisation, GARBAGE_PATTERNS + length filter,
/s-/ phonetic wake-verify, full-turn debug WAV recording.
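The energy-gated wake idea above can be sketched in a few lines — purely illustrative (plain Python, made-up constants; the real detector is numpy-based with burst capture and richer state):

```python
# Hypothetical sketch of an energy-gated wake detector: an exponential
# moving average tracks the noise floor; a frame "fires" when its RMS
# exceeds the adapted baseline by a fixed ratio. All constants are examples.
import math

class EnergyWake:
    def __init__(self, alpha=0.05, ratio=3.0):
        self.alpha = alpha      # baseline adaptation rate
        self.ratio = ratio      # trigger threshold = ratio * baseline
        self.baseline = None    # adaptive noise-floor estimate

    def feed(self, frame):
        """frame: iterable of PCM samples in [-1, 1]. True when the frame fires."""
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if self.baseline is None:
            self.baseline = rms             # seed the floor from the first frame
            return False
        fired = rms > self.ratio * max(self.baseline, 1e-6)
        if not fired:  # adapt only on non-speech so speech can't raise the floor
            self.baseline += self.alpha * (rms - self.baseline)
        return fired
```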
Config-driven vocab (zero hardcoded strings in Python):
- stt.wake_words (33 variants of 'Sanad')
- stt.command_vocab (68 canonical phrases)
- stt.garbage_patterns (17 Whisper noise outputs)
- stt.min_transcription_length, stt.command_vocab_cutoff
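The fuzzy-match canonicalisation step, sketched with stdlib difflib — the vocab and 0.75 cutoff below are stand-ins for `stt.command_vocab` / `stt.command_vocab_cutoff`, not the shipped values:

```python
# Illustrative canonicalisation pass: snap a noisy Whisper transcript onto
# the nearest canonical phrase, or None when nothing clears the cutoff.
from difflib import get_close_matches

COMMAND_VOCAB = ["move forward", "move back", "turn left", "turn right", "stop"]

def canonicalise(text, vocab=COMMAND_VOCAB, cutoff=0.75):
    hits = get_close_matches(text.lower().strip(), vocab, n=1, cutoff=cutoff)
    return hits[0] if hits else None
```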
Command parser widened (Brain/command_parser.py):
- _RE_SIMPLE_DIR — bare direction + verb+direction combos
('left', 'go back', 'move forward', 'step right', ...)
- _RE_STOP_SIMPLE — bare stop/halt/wait/pause/freeze/hold
- All motion constants sourced from config_Navigation.json
(move_map + step_duration_sec) via API/zmq_api.py; no more
hardcoded 0.3 / 2.0 magic numbers.
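The shape of that fast path, sketched (patterns illustrative — not the actual `_RE_SIMPLE_DIR` / `_RE_STOP_SIMPLE` source):

```python
# Bare-direction / bare-stop regex fast path: match locally in ~microseconds
# instead of round-tripping the LLM. Patterns here are simplified examples.
import re

_RE_SIMPLE_DIR = re.compile(
    r"^(?:(?:go|move|step|turn)\s+)?(left|right|forward|back(?:ward)?)$")
_RE_STOP_SIMPLE = re.compile(r"^(stop|halt|wait|pause|freeze|hold)$")

def fast_parse(text):
    """Return ('move', direction) or ('stop',) without the LLM, else None."""
    t = text.lower().strip()
    m = _RE_SIMPLE_DIR.match(t)
    if m:
        return ("move", m.group(1))
    if _RE_STOP_SIMPLE.match(t):
        return ("stop",)
    return None  # fall through to the full parser
```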
API/audio_api.py — _play_pcm now uses AudioClient.PlayStream with
automatic resampling to 16 kHz (matches Sanad's proven pattern).
Removed:
- Voice/vosk_stt.py (and all Vosk references in marcus_voice.py)
- Models/vosk-model-small-en-us-0.15/ (40 MB model + zip)
- All Vosk keys from Config/config_Voice.json
Documentation synced across README, Doc/architecture.md,
Doc/pipeline.md, Doc/functions.md, Doc/controlling.md,
Doc/MARCUS_API.md, Doc/environment.md changelog.
Known limitation: faster-whisper base.en on Jetson CPU + G1
far-field mic yields ~50% command-transcription accuracy due
to model capacity and mic reverberation. Wake + ack + recording
+ trim + Whisper + fuzzy + brain + motion all verified working
end-to-end. Future improvement path (unused): close-talking USB
mic via pactl_parec, or Gemini Live via HTTP microservice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Marcus — Environment & Version Reference
**Project**: Marcus | YS Lootah Technology

**Robot persona**: Sanad (wake word + self-intro; codebase stays under `Marcus/`)

**Hardware**: Unitree G1 EDU Humanoid (29 DOF) + Jetson Orin NX 16 GB

**Deployment host**: `unitree@192.168.123.164` (hostname `ubuntu`)

**Conda env**: `marcus`

**Captured**: 2026-04-12 (updated 2026-04-21)

This document is the canonical record of the verified GPU-accelerated software stack running on the Jetson Orin NX. It covers system software, Python environment, Marcus runtime dependencies, installation recipe, verification commands, and known quirks. Pair it with `architecture.md` (what the code does) and `controlling.md` (how to drive it).

---
## 1. Hardware

| Item | Value |
|---|---|
| Robot | Unitree G1 EDU humanoid, 29 DoF |
| Compute | Jetson Orin NX 16 GB (integrated Ampere GPU, compute capability 8.7, tensor cores) |
| Camera | Intel RealSense D435 (424x240 @ 15 fps, BGR8) |
| LiDAR | (optional) loaded via `API/lidar_api.py` + `Lidar/SLAM_worker.py` |
| Network | `eth0` 192.168.123.164 (Holosoma + Marcus), `wlan0` 10.255.254.86 |

---
## 2. System software (Jetson)

| Layer | Version | Source of truth |
|---|---|---|
| Kernel | `Linux 5.10.104-tegra aarch64` | `uname -a` |
| OS | Ubuntu 20.04.6 LTS | `/etc/os-release` |
| L4T | R35.3.1 (2023-03-19 build, GCID 32827747) | `/etc/nv_tegra_release` |
| JetPack | **5.1.1** (derived from L4T R35.3.1) | `nvidia-l4t-core 35.3.1-20230319081403` |
| CUDA runtime | `11.4.19-1` | `dpkg -l cuda-runtime-11-4` |
| CUDA toolkit (nvcc) | `11.4.315` (Built 2022-10-23) | `nvcc --version` |
| cuDNN | `8.6.0.166-1+cuda11.4` | `dpkg -l libcudnn8` |
| CUDA install paths | `/usr/local/cuda`, `/usr/local/cuda-11`, `/usr/local/cuda-11.4` | `ls /usr/local` |

---
## 3. Python environment

| Layer | Version |
|---|---|
| Conda | `25.11.1` |
| Active env | `marcus` at `/home/unitree/miniconda3/envs/marcus` |
| Python | `3.8.20` (Jetson stock Python for JetPack 5) |
| pip | `25.0.1` (user site `~/.local/lib/python3.8/site-packages`) |
| `which python3` | `/home/unitree/miniconda3/envs/marcus/bin/python3` |

Other conda envs on the machine (not used by Marcus): `base`, `gemini`, `gmr`, `marcus_tts`, `saqr`, `teleimager`, `tv`, `twist2`, `unitree_lerobot`, plus the Holosoma-side `hsinference` under `~/.holosoma_deps/miniconda3`.

Note: Python 3.8 is EOL (Oct 2024). It is retained because JetPack 5 ships it and NVIDIA's pre-built Jetson torch wheels for this generation target cp38. Upgrading requires either JetPack 6 or a from-source torch build.

---
## 4. PyTorch stack (GPU-critical)

**The torch install must be the NVIDIA Jetson wheel, not the PyPI wheel.** PyPI `torch` for aarch64 is CPU-only; only NVIDIA's builds expose CUDA on Jetson.

| Item | Expected | Verified |
|---|---|---|
| `torch.__version__` | `2.1.0a0+41361538.nv23.06` | yes |
| `torch.cuda.is_available()` | `True` | yes |
| `torch.version.cuda` | `11.4` | yes |
| `torch.backends.cudnn.version()` | `8600` (= cuDNN 8.6.0) | yes |
| `torch.cuda.get_device_name(0)` | `Orin` | yes |
| `torch.cuda.get_device_capability(0)` | `(8, 7)` (Ampere + tensor cores) | yes |
| `torchvision.__version__` | `0.16.1` (built from source against the Jetson torch) | yes |
| `torchvision.ops.nms(...).device` | `cuda:0` | yes |

Capability 8.7 gives us FP16 tensor cores — the GPU-path FP16 kwarg in `Vision/marcus_yolo.py` is meaningful here, not placebo.

---
## 5. Ultralytics / YOLO runtime

| Item | Value |
|---|---|
| `ultralytics` | `8.4.21` |
| Weights | `Models/yolov8m.pt` (~50 MB, auto-fetched if missing) |
| `yolo checks` GPU line | `GPU: Orin, 15389MiB`, `CUDA: 11.4` |
| Marcus config device | `cuda` (hard-required — no CPU fallback) |
| Marcus config half | `true` (FP16) |
| Marcus config imgsz | `320` |
| First inference warmup | ~4–5 s (cuDNN kernel autotune) |
| **Steady-state FPS on Orin** | **~21.9 fps** at imgsz=320 FP16 |

The 21.9 fps figure is measured via the smoke test in section 11 below. It comfortably exceeds the 15 fps camera stream, so YOLO is no longer the pipeline bottleneck.

---
## 6. Ollama / vision-language model

| Item | Value |
|---|---|
| Ollama CLI client | `0.20.0` |
| Ollama server | `0.20.0` (`curl http://localhost:11434/api/version`) |
| Python `ollama` package | `0.6.1` (no `__version__` attribute — use `pip show ollama`) |
| Models installed | `qwen2.5vl:3b` (3.2 GB), `llava:7b` (4.7 GB) |
| Marcus-configured model | `qwen2.5vl:3b` (`Config/config_Brain.json`) |
| Resident VRAM when loaded | ~11 GB (includes KV cache + vision projector) |
| Processor placement | **`100% GPU`** per `ollama ps` |

Headroom note: with Qwen2.5-VL resident (~11 GB) + YOLO (~0.5 GB) + camera buffers + the ZMQ bridge, you have ~4 GB free on the 16 GB Orin NX. Comfortable but not unlimited — if image-search (which sends two images to Qwen at once) ever OOMs, enable quantized KV cache via `OLLAMA_KV_CACHE_TYPE=q8_0`.
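One way that switch might be applied — a sketch, assuming the server is restarted by hand rather than via systemd (adjust to however Ollama is supervised on your host):

```shell
# Illustrative: restart the Ollama server with a quantized (q8_0) KV cache
# to reclaim VRAM if two-image Qwen calls start to OOM.
export OLLAMA_KV_CACHE_TYPE=q8_0
pkill -f "ollama serve" || true          # stop any running server
ollama serve > /tmp/ollama.log 2>&1 &    # relaunch with the new cache type
```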
---
## 7. Marcus runtime Python dependencies

Captured from `importlib` on 2026-04-12, `marcus` env on the Jetson.

| Module | Version | Site |
|---|---|---|
| `numpy` | 1.24.4 | user |
| `cv2` (opencv-python) | 4.13.0 | user |
| `PIL` (Pillow) | 10.4.0 | user |
| `yaml` (PyYAML) | 6.0.3 | user |
| `zmq` (pyzmq) | 27.1.0 | user |
| `websockets` | 13.1 | conda env |
| `pyrealsense2` | 2.55.1.6486 | user |
| `dotenv` | (no `__version__`) | user |
| `ollama` (python client) | 0.6.1 | user |
| `requests` | 2.32.4 | user |
| `ultralytics` | 8.4.21 | user |
| `torch` | 2.1.0a0+41361538.nv23.06 | user |
| `torchvision` | 0.16.1 (egg) | user |
| `matplotlib` | 3.7.5 | user (via ultralytics) |
| `scipy` | 1.10.1 | user (via ultralytics) |
| `psutil` | 7.2.2 | user (via ultralytics) |
| `polars` | 1.8.2 | user (via ultralytics) |
| `ultralytics-thop` | 2.0.18 | user |

"user" = `~/.local/lib/python3.8/site-packages`. Most Marcus deps live here rather than in the conda env's site-packages; JetPack ships its system libraries globally, so these packages were installed with `pip install --user`.
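The capture method above can be reproduced with a short helper (a sketch — the module list here is trimmed to stdlib names for illustration):

```python
# Reproduce the version capture: import each module and report its
# __version__ where one exists, None when the module is missing.
import importlib

def capture(mod_names):
    report = {}
    for name in mod_names:
        try:
            mod = importlib.import_module(name)
        except ImportError:
            report[name] = None          # not installed in this env
            continue
        report[name] = getattr(mod, "__version__", "(no __version__)")
    return report
```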
---
## 8. Marcus project modules — import status

All 25 project modules import cleanly from the `marcus` env at `/home/unitree/Marcus`:

```
OK Core.config_loader Core.env_loader
OK Core.log_backend Core.logger
OK Voice.builtin_mic Voice.builtin_tts Voice.marcus_voice
OK Vision.marcus_yolo Vision.marcus_imgsearch
OK API.llava_api API.yolo_api API.camera_api
OK API.zmq_api API.imgsearch_api API.odometry_api
OK API.memory_api API.arm_api API.audio_api
OK Navigation.goal_nav Navigation.patrol Navigation.marcus_odometry
OK Brain.marcus_brain Brain.marcus_memory Brain.command_parser
OK Autonomous.marcus_autonomous
```

Notable removals: `Voice/marcus_gemini_voice.py` deleted on 2026-04-21. `Core/Logger.py` renamed to `Core/log_backend.py`.
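A smoke check of this kind can be scripted in a few lines (sketch — the list below uses stdlib names; substitute the 25 Marcus modules on the robot):

```python
# Import smoke test: attempt each module, collect a short traceback for
# every failure. An empty dict means all imports are clean.
import importlib
import traceback

def import_check(modules):
    failures = {}
    for name in modules:
        try:
            importlib.import_module(name)
        except Exception:
            failures[name] = traceback.format_exc(limit=1)
    return failures
```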
---
## 9. Installation recipe (reproducing this environment)

Run these steps on a fresh `marcus` conda env, in order. They reproduce the exact stack above.

### 9.1 Fix the Jetson clock if needed

Jetsons have no RTC battery; after power cycles the clock may reset to 1970 and break TLS (cert validation fails with "issued certificate not yet valid"). `systemd-timesyncd` may refuse NTP — fall back to manual date:

```bash
sudo systemctl restart systemd-timesyncd
sudo timedatectl set-ntp true  # may fail on Jetson
# fallback:
sudo date -s "YYYY-MM-DD HH:MM:SS"
sudo hwclock --systohc
```
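A guard one might run before any TLS-dependent step (illustrative; the 2023 threshold is arbitrary):

```shell
# Refuse to proceed if the clock looks like a post-power-cycle reset
# (a Jetson without RTC battery boots thinking it is 1970).
year=$(date +%Y)
if [ "$year" -lt 2023 ]; then
    echo "Clock looks reset ($year) — fix the date before pip/wget." >&2
    exit 1
fi
echo "clock ok: $(date -u +%Y-%m-%dT%H:%MZ)"
```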
### 9.2 Remove CPU-only torch and install the NVIDIA Jetson wheel

```bash
conda activate marcus
pip uninstall -y torch torchvision torchaudio

cd ~
wget https://developer.download.nvidia.com/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
# (JP 5.1.1 uses the same nv23.06 wheel; v511 returns 404 on this build)

pip install ~/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
```

Verify before proceeding:

```bash
python3 -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.get_device_name(0))"
# Expect: 2.1.0a0+41361538.nv23.06 True Orin
```
### 9.3 Build torchvision 0.16.1 from source (matches torch 2.1.0)

```bash
sudo apt install -y libjpeg-dev zlib1g-dev libpython3-dev \
    libavcodec-dev libavformat-dev libswscale-dev

cd ~
git clone --branch v0.16.1 https://github.com/pytorch/vision torchvision
cd torchvision
export BUILD_VERSION=0.16.1
python3 setup.py install --user
```

Build takes ~15–25 min on Orin NX. Verify:

```bash
python3 -c "
import torch, torchvision
x = torch.rand(5, 4).cuda(); s = torch.rand(5).cuda()
print(torchvision.__version__, torchvision.ops.nms(x, s, 0.5).device)
"
# Expect: 0.16.1 cuda:0
```
### 9.4 Ollama server + model

```bash
ollama serve > /tmp/ollama.log 2>&1 &
sleep 3
ollama list                   # confirm qwen2.5vl:3b present
ollama pull qwen2.5vl:3b      # if missing (~3 GB)
ollama run qwen2.5vl:3b "hi"  # warm model into VRAM
ollama ps                     # PROCESSOR must say "100% GPU"
```
### 9.5 Other deps

Already present via `pip install --user` from earlier setup — see section 7 for versions. No action needed unless reinstalling from scratch.

---
## 10. Marcus launch sequence

Full terminal-mode bring-up:

```bash
# Terminal 1 — Holosoma locomotion policy (runs in hsinference env, not marcus)
source ~/.holosoma_deps/miniconda3/bin/activate hsinference
cd ~/holosoma
~/.holosoma_deps/miniconda3/envs/hsinference/bin/python3 \
    src/holosoma_inference/holosoma_inference/run_policy.py \
    inference:g1-29dof-loco \
    --task.model-path src/holosoma_inference/holosoma_inference/models/loco/g1_29dof/fastsac_g1_29dof.onnx \
    --task.velocity-input zmq --task.state-input zmq --task.interface eth0

# Terminal 2 — Ollama server (leave running)
ollama serve & sleep 3

# Terminal 3 — Marcus brain
conda activate marcus
cd ~/Marcus
python3 run_marcus.py
```

Expected Marcus YOLO init line:

```
[YOLO] Model loaded ✅ | device: cuda (Orin) | FP16 | 19 tracked classes
```

If instead you get `RuntimeError: [YOLO] CUDA not available — torch.cuda.is_available() == False`, the torch install is wrong or was overwritten by a `pip install torch` somewhere — redo section 9.2.

---
## 11. Verification commands (copy/paste)

Full-stack version check:

```bash
cat /etc/nv_tegra_release
dpkg -l | grep -E "nvidia-l4t-core|cuda-runtime|libcudnn8" | awk '{print $2, $3}'
nvcc --version | tail -n2
python3 --version
python3 -c "import torch, torchvision; print('torch', torch.__version__, '| cuda', torch.cuda.is_available(), '| cudnn', torch.backends.cudnn.version(), '| gpu', torch.cuda.get_device_name(0), '| tv', torchvision.__version__)"
python3 -c "import ultralytics; print('ultralytics', ultralytics.__version__)"
ollama --version
curl -s http://localhost:11434/api/version
ollama list
ollama ps
```

YOLO warmup + steady-state FPS (the gold-standard GPU smoke test):

```bash
cd ~/Marcus
python3 - <<'EOF'
import sys, os, time, threading
sys.path.insert(0, os.getcwd())
os.environ.setdefault("PROJECT_BASE", "/home/unitree")
os.environ.setdefault("PROJECT_NAME", "Marcus")
import numpy as np
import Vision.marcus_yolo as my
from Vision.marcus_yolo import (
    start_yolo, yolo_fps, yolo_is_running, _resolve_device, YOLO_DEVICE
)

dev, half = _resolve_device(YOLO_DEVICE)
print(f"[resolve] device={dev!r} half={half}")

raw, lock = [None], threading.Lock()
assert start_yolo(raw_frame_ref=raw, frame_lock=lock)
raw[0] = np.random.randint(0, 255, (240, 424, 3), dtype=np.uint8)

for i in range(15):
    time.sleep(1)
    print(f"  t={i+1:2d}s fps={yolo_fps():.1f}")

time.sleep(5)
print(f"[final] fps={yolo_fps():.1f}")
my._yolo_running[0] = False
time.sleep(0.3)
EOF
```

GPU live telemetry while Marcus runs:

```bash
tegrastats --interval 500 | grep -oE "GR3D_FREQ [0-9]+%"
```

`nvidia-smi` is absent on Jetson — `tegrastats` is the equivalent.

---
## 12. Known quirks

1. **No RTC battery** — clock resets to 1970 on every full power cycle. Fix before any `wget`/`pip install` that hits HTTPS. See 9.1.
2. **`ollama` python lib has no `__version__`** — use `pip show ollama` instead of `ollama.__version__`.
3. **`nvidia-smi` not available** — normal on Jetson. Use `tegrastats` and `torch.cuda.*` APIs.
4. **Ollama server "could not connect" warning** on first `ollama list`/`ollama ps` just means the server isn't running yet. Start it with `ollama serve &` before Marcus.
5. **YOLO first inference ~4–5 s** — cuDNN kernel autotune + FP16 conversion on cold start. The first user command after `python3 run_marcus.py` will feel slow; subsequent commands are steady-state. A YOLO warmup pass in `init_brain()` would hide this — open item.
6. **Holosoma and Marcus share ZMQ port 5556** — `run_marcus.py` (terminal) and `Server/marcus_server.py` (websocket) cannot run simultaneously. Pick one.
7. **NVIDIA torch wheel is at `/jp/v512/`** on developer.download.nvidia.com even though this host is JetPack 5.1.1. The `nv23.06` wheel is shared across JP 5.1.x (same CUDA 11.4 + cuDNN 8.6 runtime). `/jp/v511/pytorch/` 404s — use `v512`.
8. **PyPI torch is CPU-only on aarch64** — any `pip install torch` with no wheel argument will silently replace the NVIDIA build with a CPU wheel and break Marcus startup (Marcus is now hard-configured to refuse CPU). If that happens, redo 9.2.
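For quirk 6, a pre-launch check is easy to script — a sketch (the bind-probe approach has an inherent small race; port 5556 per the quirk above):

```python
# Probe whether the shared ZMQ port is already bound before starting the
# second consumer (run_marcus.py vs Server/marcus_server.py).
import socket

def port_in_use(port, host="127.0.0.1"):
    """True if something is already bound on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return True   # bind failed: port already taken
        return False      # bind succeeded: port was free
```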
---
## 13. GPU-only policy (enforced in code)

As of 2026-04-12, `Vision/marcus_yolo.py::_resolve_device` raises `RuntimeError` instead of falling back to CPU when any of:

- `Config/config_Vision.json` has `yolo_device: "cpu"`
- `torch` is not installed
- `torch.cuda.is_available()` returns False

`API/yolo_api.py::init_yolo` was also updated to **propagate** that `RuntimeError` (previously it caught `Exception` and silently disabled YOLO, leaving Marcus running blind). The brain crashes at `init_brain()` with a clear message if the GPU is unreachable — preferred over silent degradation on a safety-sensitive robot.

Config file (`Config/config_Vision.json`):

```json
{
  "yolo_model_path": "Models/yolov8m.pt",
  "yolo_confidence": 0.45,
  "yolo_iou": 0.45,
  "yolo_device": "cuda",
  "yolo_half": true,
  "yolo_img_size": 320,
  "tracked_classes": [ ... ],
  "ppe_violation_classes": [ "no-helmet", "no_helmet", "no-vest", "no_vest" ]
}
```
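The policy reduces to a small decision function — a sketch, not the actual `_resolve_device` (which also returns the half-precision flag); `cuda_available` stands in for `torch.cuda.is_available()`:

```python
# Hedged sketch of the GPU-only policy: every CPU outcome raises instead
# of degrading silently, mirroring the hard-fail described above.
def resolve_device(configured: str, cuda_available: bool) -> str:
    if configured.lower() == "cpu":
        raise RuntimeError(
            "[YOLO] CPU device configured — GPU-only policy forbids it")
    if not cuda_available:
        raise RuntimeError(
            "[YOLO] CUDA not available — torch.cuda.is_available() == False")
    return "cuda"
```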
---
## 14. Change log

| Date | Change |
|---|---|
| 2026-04-12 | Initial environment.md — full stack captured, GPU bring-up verified end to end. Steady-state YOLOv8m FPS on Orin NX measured at 21.9. Ollama Qwen2.5-VL verified at 100% GPU. |
| 2026-04-12 | `Vision/marcus_yolo.py` rewired to load `config_Vision.json`, added `_resolve_device()` with hard-fail on missing CUDA (GPU-only policy). `API/yolo_api.py` updated to propagate `RuntimeError`. `Config/config_Vision.json` set `yolo_device=cuda`, `yolo_half=true`. |
| 2026-04-12 | Installed NVIDIA Jetson torch `2.1.0a0+41361538.nv23.06` (replacing CPU-only PyPI `2.4.1`) + built torchvision `0.16.1` from source against it. Verified `nms device = cuda:0`. |
| 2026-04-12 | Fixed llama.cpp compute-graph OOM on Jetson: added `num_batch=128` + `num_ctx=2048` caps in `Config/config_Brain.json`, propagated through `API/llava_api.py` and `Vision/marcus_imgsearch.py`. Qwen2.5-VL compute graph drops from ~7.5 GiB to ~1.8 GiB. |
| 2026-04-21 | **Restructure**: moved ZMQ bind out of `API/zmq_api.py` import time into `init_zmq()`; fixes LiDAR SLAM worker spawn crash. Added loud GPU-requirement banner in `API/yolo_api.py`. Dropped `num_predict_main` 200→120. Made inner-loop sleeps in goal_nav/autonomous/imgsearch conditional. Renamed `Core/Logger.py` → `Core/log_backend.py` (case-collision fix). Updated `Doc/MARCUS_API.md` to current state. |
| 2026-04-21 | **Voice restructure**: added `Voice/builtin_mic.py` (G1 array mic via UDP multicast `239.168.123.161:5555`) and `Voice/builtin_tts.py` (thin `AudioClient.TtsMaker` wrapper). Rewired `Voice/marcus_voice.py` to use BuiltinMic. Refactored `API/audio_api.py::speak()` to use BuiltinTTS — removed ~110 lines of edge-tts + pydub + Piper plumbing. Deleted `Voice/marcus_gemini_voice.py`. Added `subsystems.{lidar,voice,imgsearch,autonomous}` gate in `config_Brain.json::init_brain()`. |
| 2026-04-21 | **Persona swap**: robot identifies as Sanad. Wake words `["sanad","sannad","sanat","sunnat"]`, `speaker.app_name="sanad"`, all Qwen prompts say "You are Sanad", banner reads `SANAD AI BRAIN — READY`, hardcoded self-intro says "I am Sanad". Project directory, class names, filenames, and `PROJECT_NAME=Marcus` env var unchanged. |
| 2026-04-21 | **English-only sweep**: stripped 5.8 KB of Arabic examples from `marcus_prompts.yaml`, removed Arabic talk-pattern and greeting regexes in `Brain/marcus_brain.py`, dropped Arabic wake words from `config_Voice.json`, changed user-facing prints `Marcus: …` → `Sanad: …` in `executor.py`, `marcus_brain.py`, `marcus_cli.py`. Verified: 0 Arabic chars in live code/config. |
| 2026-04-21 | **Logs hardened**: `Core/log_backend.py` now uses `RotatingFileHandler` (5 MB × 3 backups, env-tunable via `MARCUS_LOG_MAX_BYTES` / `MARCUS_LOG_BACKUP_COUNT`) for all three code paths (main_handler, `LogEngine`, `LogsMessages`). `API/audio_api.py` + `Voice/marcus_voice.py` also rotate `voice.log`. `default_logs_dir` fixed: `"Logs"` → `"logs"` (matches actual directory; no more case-collision recreation). |
| 2026-04-21 | **Dead code removed**: deleted `Legacy/marcus_nav.py` (unused + Arabic), deleted `Config/config_Memory.json` (orphan — never loaded). Config count: 13 → **12** JSON files + `marcus_prompts.yaml`. |
| 2026-04-21 | **Orphan config keys wired up (0 orphans remaining)**: `config_ImageSearch.json` → `Vision/marcus_imgsearch.py` (4 constants), `config_Voice.mic_udp.read_timeout_sec` → `Voice/builtin_mic.py`, `config_Camera.{timeout_ms, stale_threshold_s, reconnect_delay_s}` → `API/camera_api.py`, `config_Odometry.json` (10 keys) → `Navigation/marcus_odometry.py`. All 156 config keys now referenced by code. |
| 2026-04-21 | **Subprocess leak fix**: `AudioAPI._record_parec` now wraps `Popen` in try/finally with `terminate → wait(1.0) → kill` fallback; orphan `parec` processes can no longer survive Ctrl-C. Last-resort `proc.kill()` catches only `OSError` (not bare `except`). |
| 2026-04-21 | **Modelfile corrected**: `Models/Modelfile` now `FROM qwen2.5vl:3b` (was `:7b`) with a header explaining it's an optional build template — runtime uses `ollama pull qwen2.5vl:3b` directly. |
| 2026-04-21 | **Final verification**: 14-dimension smoke test green — no Arabic, no dead dirs, 0 orphan keys, every FileHandler rotates, no bare `except: pass`, no stale `Models_marcus` / `marcus_llava` refs, 25/25 modules import. |
| 2026-04-24 | **Voice finalised on faster-whisper + custom energy wake**. Added `Voice/wake_detector.py` (pure-numpy energy state machine, adaptive noise floor, burst-audio capture for verify). Rewrote `Voice/marcus_voice.py` around it: three operating modes (`wake_and_command` / `always_on` / `always_on_gated`), hysteretic record VAD, pre-speech silence trim (300 ms pre-roll preserved), faster-whisper `base.en` int8 CPU decode, fuzzy-match canonicalisation against `command_vocab`, `GARBAGE_PATTERNS` + length filter for noise hallucinations, `/s-/` phonetic wake verify (accepts Whisper mishearings of "Sanad" like "Stop"/"Set"/"Sand"). Tried and reverted: Gemini Live WebSocket (Python 3.8 incompatibility + latency), Vosk grammar STT (English lexicon can't decode "Sanad"; big model cold-load too slow on Jetson). All voice tunables (33 wake_words, 68 command_vocab, 17 garbage_patterns, ~25 threshold/VAD/Whisper keys) live in `config_Voice.json::stt.*` — zero hardcoded strings in Voice/. |
| 2026-04-24 | **Command parser widened**: `Brain/command_parser.py` now has `_RE_SIMPLE_DIR` (`left`, `go back`, `move forward`, `step right`, etc.) and `_RE_STOP_SIMPLE` (`stop`, `halt`, `wait`, `pause`, `freeze`) regex fast-paths — these bare-direction / bare-stop commands now skip Qwen entirely (~50 ms vs ~5 s). Motion velocities and step duration pulled from `config_Navigation.json::{move_map, step_duration_sec}` via `API/zmq_api.py`; command_parser no longer contains hardcoded `0.3` / `2.0` magic numbers. |