Marcus — Control & Startup Guide
Robot persona: Sanad (wake word + self-intro; project code lives under Marcus/)
Updated: 2026-04-21
Quick Start
Prerequisites (Jetson Orin NX, JetPack 5.1.1)
# Terminal 1 — Start Holosoma (locomotion policy, in hsinference env)
source ~/.holosoma_deps/miniconda3/bin/activate hsinference
cd ~/holosoma
~/.holosoma_deps/miniconda3/envs/hsinference/bin/python3 \
src/holosoma_inference/holosoma_inference/run_policy.py \
inference:g1-29dof-loco \
--task.model-path src/holosoma_inference/holosoma_inference/models/loco/g1_29dof/fastsac_g1_29dof.onnx \
--task.velocity-input zmq \
--task.state-input zmq \
--task.interface eth0
# Terminal 2 — Ollama server (leave running)
ollama serve > /tmp/ollama.log 2>&1 &
sleep 3
ollama list # confirm qwen2.5vl:3b present
Option A — Terminal Mode (on Jetson)
# Terminal 3 — Start Marcus Brain
conda activate marcus
cd ~/Marcus
python3 run_marcus.py
Direct keyboard control + voice input (say "Sanad" to wake). Expected banner on boot:
================================================
SANAD AI BRAIN — READY
================================================
model : qwen2.5vl:3b
yolo : True
odometry : True
memory : True
lidar : True
voice : True
camera : 424x240@15
Option B — Server + Client (remote)
# Terminal 3 (Jetson) — Start Server
conda activate marcus
cd ~/Marcus
python3 -m Server.marcus_server
# Terminal 4 (Workstation) — Connect Client
cd ~/Robotics_workspace/yslootahtech/Project/Marcus
python3 -m Client.marcus_cli
Client prompts for connection:
Connection options:
1) eth0 — 192.168.123.164:8765
2) wlan0 — 10.255.254.86:8765
3) custom
Choose [1/2/3] or IP:
Or skip prompt: python3 -m Client.marcus_cli --ip 192.168.123.164 --port 8765
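The prompt logic above can be sketched as a small parser. This is a hypothetical helper for illustration, not the actual `Client.marcus_cli` code (which also re-prompts when "3" is chosen):

```python
# Hypothetical sketch of the connection-prompt parsing; the real
# Client.marcus_cli implementation may differ.
DEFAULT_PORT = 8765
PRESETS = {
    "1": ("192.168.123.164", DEFAULT_PORT),  # eth0
    "2": ("10.255.254.86", DEFAULT_PORT),    # wlan0
}

def parse_choice(raw: str) -> tuple[str, int]:
    """Map a prompt answer ('1', '2', or 'ip[:port]') to (ip, port)."""
    raw = raw.strip()
    if raw in PRESETS:
        return PRESETS[raw]
    if ":" in raw:  # custom 'ip:port'
        ip, port = raw.rsplit(":", 1)
        return ip, int(port)
    return raw, DEFAULT_PORT  # bare IP, default port
```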
Voice
- Wake word: "Sanad" (variants "sannad", "sanat", "sunnat" — see config_Voice.json::stt.wake_words_en)
- Mic: G1 on-board array mic, captured via UDP multicast 239.168.123.161:5555 (16 kHz mono, 16-bit PCM). No USB mic needed.
- STT: Whisper tiny (wake detection) + Whisper small (command transcription) — both run locally.
- TTS: Unitree client.TtsMaker() → G1 body speaker. English only.
- Barge-in: say something while Marcus is speaking and the mic buffer flushes on the next command.
Interaction flow: say "Sanad" → hear "Listening" → speak your command → see transcript on console → Marcus answers through the speaker.
To disable voice entirely, set subsystems.voice: false in config_Brain.json — Marcus will boot text-only ~2 s faster.
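A minimal sketch of wake-word matching against the configured variants. This is an assumed implementation for illustration; the real detection runs on Whisper tiny output inside Voice/marcus_voice.py:

```python
import re

# Hypothetical wake-word check; the real pipeline reads the variants from
# config_Voice.json::stt.wake_words_en.
WAKE_WORDS = {"sanad", "sannad", "sanat", "sunnat"}

def heard_wake_word(transcript: str, wake_words=WAKE_WORDS) -> bool:
    """Return True if any wake-word variant appears as a whole word."""
    tokens = re.findall(r"[a-z]+", transcript.lower())
    return any(t in wake_words for t in tokens)
```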
Command Reference
Movement
| Command | Action |
|---|---|
| turn left / turn right | Rotate (2s default) |
| walk forward / move back | Walk (2s default) |
| walk 1 meter | Precise odometry walk |
| walk backward 2 meters | Precise backward walk |
| turn right 90 degrees | Precise odometry turn |
| turn right then walk forward | Multi-step compound |
| come to me / come here | Forward 2s (instant, no AI) |
| stop | Gradual stop |
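The "2s default" vs. precise-distance distinction above can be illustrated with a small regex parser. This is a hypothetical sketch, not the actual Brain/marcus_brain.py dispatch:

```python
import re

# Hypothetical parser: split 'walk 1 meter' / 'turn right 90 degrees' style
# commands into (verb, direction, amount, unit). amount=None means the
# timed 2s default rather than a precise odometry move.
MOVE_RE = re.compile(
    r"^(?P<verb>walk|move|turn)\s*(?P<dir>forward|back(?:ward)?|left|right)?"
    r"\s*(?P<amount>\d+(?:\.\d+)?)?\s*(?P<unit>meters?|degrees?)?$"
)

def parse_move(cmd: str):
    m = MOVE_RE.match(cmd.strip().lower())
    if not m:
        return None
    d = m.groupdict()
    amount = float(d["amount"]) if d["amount"] else None
    return d["verb"], d["dir"], amount, d["unit"]
```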
Vision
| Command | Action |
|---|---|
| what do you see | Qwen2.5-VL describes camera view |
| describe the room | Qwen2.5-VL scene description |
| is anyone here | Qwen2.5-VL person check |
| yolo | Show YOLO detection status |
Goal Navigation
| Command | Action |
|---|---|
| goal/ stop when you see a person | YOLO fast search + stop |
| goal/ find a laptop | YOLO + Qwen-VL search |
| goal/ stop when you see a guy holding a phone | YOLO + Qwen-VL compound verification |
| find a person | Auto-detected as goal (no prefix needed) |
| look for a bottle | Auto-detected as goal |
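The auto-detection rows above suggest a prefix check along these lines. The phrase list is an assumption for illustration; the real detection logic may be richer:

```python
# Hypothetical goal auto-detection: commands starting with these phrases are
# treated as goal navigation even without the explicit 'goal/' prefix.
GOAL_PREFIXES = ("goal/", "find ", "look for ", "stop when you see ")

def is_goal_command(cmd: str) -> bool:
    return cmd.strip().lower().startswith(GOAL_PREFIXES)
```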
Place Memory
| Command | Action |
|---|---|
| remember this as door | Save current position |
| go to door | Navigate to saved place |
| places | List all saved places |
| forget door | Delete place |
| rename door to entrance | Rename place |
| where am I | Show odometry position |
| go home | Return to start position |
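A minimal sketch of the bookkeeping behind remember/forget/rename. The class and the pose-as-list schema are assumptions for illustration; the on-disk format of the real Data/History/Places/places.json may differ:

```python
import json
from pathlib import Path

# Hypothetical place store: a name -> pose map persisted as JSON.
class Places:
    def __init__(self, path):
        self.path = Path(path)
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, name, pose):          # pose e.g. (x, y, yaw)
        self.data[name] = list(pose)
        self._save()

    def forget(self, name):
        self.data.pop(name, None)
        self._save()

    def rename(self, old, new):
        if old in self.data:
            self.data[new] = self.data.pop(old)
            self._save()

    def _save(self):
        self.path.write_text(json.dumps(self.data, indent=2))
```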
Patrol
| Command | Action |
|---|---|
| patrol | Autonomous patrol (prompts for duration) |
| patrol: door → desk → exit | Named waypoint patrol |
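Parsing the named waypoint syntax could look like the following sketch (hypothetical helper; the `->` fallback separator is an assumption):

```python
# Hypothetical parsing of 'patrol: door → desk → exit' into waypoints.
def parse_patrol(cmd: str):
    if not cmd.lower().startswith("patrol:"):
        return []
    route = cmd.split(":", 1)[1]
    # Accept both the arrow glyph and a plain '->' separator.
    parts = route.replace("->", "→").split("→")
    return [p.strip() for p in parts if p.strip()]
```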
Image Search (requires subsystems.imgsearch: true)
| Command | Action |
|---|---|
| search/ /path/to/photo.jpg | Find target from reference image |
| search/ /path/to/photo.jpg person in blue shirt | Image + hint |
| search/ person in blue shirt | Text-only search |
Session Memory
| Command | Action |
|---|---|
| last command | Show last typed command |
| do that again | Repeat last command |
| undo | Reverse last movement |
| last session | Previous session summary |
| session summary | Current session stats |
Autonomous Mode
| Command | Action |
|---|---|
| auto on | Start autonomous exploration |
| auto off | Stop |
| auto status | Current step / observations |
| auto save | Snapshot observations to disk |
System
| Command | Action |
|---|---|
| help | Command reference |
| example | Usage examples |
| lidar / lidar status | SLAM engine pose + health |
| q / quit | Shutdown |
Client-Only Commands (CLI)
| Command | Action |
|---|---|
| status | Ping server + LiDAR status |
| camera | Get camera configuration |
| profile low/medium/high/full | Switch camera profile |
| capture | Take a photo |
Subsystem flags (Config/config_Brain.json)
Control what initializes at boot. Defaults:
"subsystems": {
"lidar": true,
"voice": true,
"imgsearch": false,
"autonomous": true
}
Set any to false to skip that subsystem's init. Boot time drops roughly:
- voice: false → ~2 s faster (no Whisper model load)
- lidar: false → ~1 s faster (no SLAM subprocess spawn)
- imgsearch: false → already the default; re-enable only when you need search/ …
- autonomous: false → minor, but removes the AutonomousMode init
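Gated initialization of this kind can be sketched as below. The defaults match the config fragment above; the function itself is a hypothetical illustration, not the real boot code:

```python
# Hypothetical sketch of subsystem-gated init driven by the 'subsystems'
# block of Config/config_Brain.json.
DEFAULTS = {"lidar": True, "voice": True, "imgsearch": False, "autonomous": True}

def enabled_subsystems(config: dict) -> list:
    """Return the subsystems to initialize, falling back to defaults."""
    flags = {**DEFAULTS, **config.get("subsystems", {})}
    return [name for name, on in flags.items() if on]
```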
Network Configuration
| Interface | IP | Use |
|---|---|---|
| eth0 | 192.168.123.164 | Robot internal network (Jetson ↔ G1 ↔ LiDAR) |
| wlan0 | 10.255.254.86 | Office WiFi (Jetson ↔ Workstation) |
| Service | Port | Protocol |
|---|---|---|
| Marcus WebSocket | 8765 | ws:// |
| ZMQ velocity (→ Holosoma) | 5556 | tcp:// (PUB/SUB) |
| Ollama API | 11434 | HTTP (localhost only) |
| G1 audio multicast (mic) | 5555 | UDP multicast 239.168.123.161 |
| Livox Mid-360 (LiDAR) | — | UDP (Livox SDK; device at 192.168.123.120) |
Most values configurable in Config/config_Network.json and config_Voice.json::mic_udp.
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Banner shows SANAD AI BRAIN — READY but nothing moves | Holosoma not running | Start Holosoma (Terminal 1) first |
| RuntimeError: CUDA not available on boot | Wrong torch build on Jetson | See Doc/environment.md section 9.2 — reinstall the NVIDIA Jetson torch wheel |
| llama runner process has terminated: %!w(<nil>) | Ollama compute graph OOM | Already capped at num_batch=128 / num_ctx=2048. Check free -h; kill stale Ollama runners: pkill -f "ollama runner" |
| Traceback mentioning multiprocessing/spawn.py + ZMQ port 5556 | Old import-time ZMQ bind regressed | Pull latest API/zmq_api.py — must call init_zmq() from the parent only |
| [Camera] No frame for 10s during warmup | Ollama blocking the main thread, or USB bandwidth | Warmup is ~10–15 s on first Qwen load; subsequent commands are fast |
| Wake word never fires | Whisper hearing something else | Check logs/voice.log — if it transcribes as "sunnat"/"sannat", add your variant to config_Voice.json::stt.wake_words_en |
| Mic silent | G1 audio service not publishing | Run python3 Voice/builtin_mic.py standalone — must print "OK — mic is capturing audio" |
| [LiDAR] No data yet (will keep trying) | SLAM worker still spawning (normal) or Livox network | First ~5 s is normal. If it persists, ping 192.168.123.120 |
| Client can't connect | Wrong IP or server not running | Verify ollama serve and python3 -m Server.marcus_server are both up |
File Locations
| What | Path |
|---|---|
| Brain code | ~/Marcus/Brain/ |
| Server | ~/Marcus/Server/marcus_server.py |
| Voice | ~/Marcus/Voice/{builtin_mic,builtin_tts,marcus_voice}.py |
| Config | ~/Marcus/Config/ |
| Prompts | ~/Marcus/Config/marcus_prompts.yaml |
| YOLO model | ~/Marcus/Models/yolov8m.pt |
| Session data | ~/Marcus/Data/Brain/Sessions/ |
| Places | ~/Marcus/Data/History/Places/places.json |
| Logs | ~/Marcus/logs/ |
See Doc/architecture.md for full project structure and file-by-file documentation.
See Doc/environment.md for the verified Jetson software stack.
See Doc/pipeline.md for the end-to-end data flow.
See Doc/functions.md for the full function inventory (AST-generated).
Language policy
English only. Arabic was removed from the codebase on 2026-04-21:
- Config/config_Voice.json::stt.wake_words_en — only English variants (sanad, sannad, sanat, sunnat)
- Config/marcus_prompts.yaml — no Arabic examples left in any of the 7 prompts
- API/audio_api.py::speak(text) — rejects non-ASCII (the G1 TtsMaker silently maps Arabic to Chinese, which nobody wants)
- Brain/marcus_brain.py — greeting and talk-pattern regexes match English only
If you need Arabic back, the cleanest paths are either Piper TTS (offline) or edge-tts (online) — see git log for the removed implementations.
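The non-ASCII rejection described above amounts to a guard like this. The function name and error message are illustrative assumptions; the real check lives in API/audio_api.py::speak():

```python
# Sketch of the non-ASCII TTS guard; exact error handling in the real
# speak() may differ.
def ascii_only(text: str) -> str:
    if not text.isascii():
        raise ValueError(
            "English-only TTS: non-ASCII text rejected "
            "(G1 TtsMaker mishandles non-Latin scripts)"
        )
    return text
```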
Logs
All .log files in logs/ rotate at 5 MB × 3 backups by default. To change:
export MARCUS_LOG_MAX_BYTES=10000000 # 10 MB per file
export MARCUS_LOG_BACKUP_COUNT=5 # keep 5 rotations
export MARCUS_LOG_DIR=/var/log/marcus # move logs off SD card
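Those environment variables map naturally onto a RotatingFileHandler. This is a hypothetical sketch, assuming a 5 MiB default and the env-var names above; the real setup lives in Core.logger:

```python
import logging.handlers
import os

# Hypothetical: build one rotating handler per module log file, driven by
# the MARCUS_LOG_* environment variables documented above.
def make_handler(name: str) -> logging.handlers.RotatingFileHandler:
    log_dir = os.environ.get("MARCUS_LOG_DIR", "logs")
    os.makedirs(log_dir, exist_ok=True)
    return logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, f"{name}.log"),
        maxBytes=int(os.environ.get("MARCUS_LOG_MAX_BYTES", 5 * 1024 * 1024)),
        backupCount=int(os.environ.get("MARCUS_LOG_BACKUP_COUNT", 3)),
    )
```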
Per-module log files:
- brain.log, camera.log, lidar.log, zmq.log, server.log, main.log — via Core.logger.log()
- voice.log — via stdlib logging in audio_api.py + marcus_voice.py
- Session JSON: Data/Brain/Sessions/session_NNN_YYYY-MM-DD/{commands,detections,alerts,places}.json