# Marcus — Function Inventory

**Robot persona:** Sanad (wake word + self-intro)
**Updated:** 2026-04-21

Every callable in the codebase, grouped by layer. Generated from the AST and kept in sync with the source. See `architecture.md` for where each module lives and `pipeline.md` for how they connect.

**Totals:** 25 importable modules · 73 top-level functions · 9 public classes.

---

## `run_marcus.py` — entrypoint

Script only. Prepends `PROJECT_ROOT` to `sys.path`, then calls `Brain.marcus_brain.run_terminal()` in `__main__`.

---

## `Core/` — foundation, no external deps

| File | Functions | Purpose |
|---|---|---|
| `env_loader.py` | `_find_env_file()`, `_load_dotenv(path)` | find + parse `.env` into `os.environ`; exports `PROJECT_ROOT` |
| `config_loader.py` | `load_config(name)`, `config_path(relative)` | cached reader for `Config/config_{name}.json` |
| `log_backend.py` | `_rotating_handler(path)` + **class `Logs`** | custom logging engine; all handlers are `RotatingFileHandler` (5 MB × 3) |
| `logger.py` | `get_logger(module)`, `log(msg, level, module)`, `log_and_print(msg, level, module)` | project-wide logging façade |

**`Core.log_backend.Logs`** methods: `__init__(default_log_level, main_log_file)`, `_choose_fallback_log_dir`, `_normalize_log_name`, `_is_writable_path`, `_with_fallback`, `resolve_log_path`, `construct_path`, `log_to_file`, `LogEngine(folder, log_name)`, `LogsMessages(msg, type, folder, file)`, `print_and_log(...)`.
---

## `API/` — subsystem wrappers (Brain imports only from here)

| File | Public functions |
|---|---|
| `zmq_api.py` | `init_zmq()`, `get_socket()`, `send_vel(vx, vy, vyaw)`, `gradual_stop()`, `send_cmd(cmd)` |
| `camera_api.py` | `start_camera()`, `stop_camera()`, `get_frame()`, `get_frame_age()`, `get_raw_refs()`, `camera_loop()` |
| `llava_api.py` | `call_llava(prompt, img_b64, num_predict, use_history)`, `ask(command, img_b64)`, `ask_goal(goal, img_b64)`, `ask_talk(command, img_b64, facts)`, `ask_verify(target, condition, img_b64)`, `ask_patrol(img_b64)`, `remember_fact(fact)`, `add_to_history(user_msg, assistant_msg)`, `parse_json(raw)` |
| `yolo_api.py` | `init_yolo(raw_frame_ref, frame_lock)` + 8 stubs rebound on success: `yolo_sees`, `yolo_count`, `yolo_closest`, `yolo_summary`, `yolo_ppe_violations`, `yolo_person_too_close`, `yolo_all_classes`, `yolo_fps` |
| `odometry_api.py` | `init_odometry(zmq_sock)`, `get_position()` |
| `memory_api.py` | `init_memory()`, `log_cmd(cmd, response, duration)`, `log_detection(class_name, position, distance)`, `place_save(name)`, `place_goto(name)`, `places_list_str()` |
| `arm_api.py` | `do_arm(action)` — G1 GR00T stub |
| `imgsearch_api.py` | `init_imgsearch(get_frame_fn, send_vel_fn, gradual_stop_fn, llava_fn, yolo_sees_fn, model)`, `get_searcher()` |
| `audio_api.py` | **class `AudioAPI`** (see below) |
| `lidar_api.py` | `init_lidar()`, `obstacle_ahead(radius)`, `get_slam_pose()`, `get_nav_cmd()`, `get_loc_state()`, `get_safety_reasons()`, `get_lidar_status()`, `get_client()`, `stop_lidar()` |

**`API.audio_api.AudioAPI`** methods: `speak(text, lang="en")`, `record(seconds)` → np.int16 array, `play_pcm(audio_16k)`, `save_recording(audio, name)`, properties `is_speaking`, `is_available`. Internal: `_init_sdk`, `_mute_mic`, `_unmute_mic`, `_resample`, `_play_pcm`, `_record_builtin`, `_record_parec`.
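Several `llava_api` helpers funnel raw model replies through `parse_json(raw)`. The real implementation is not documented here; the regex-based extraction below is an assumption about what such a helper typically does (tolerate prose or markdown fences around the JSON, return `None` on failure, matching the API layer's error-return convention).

```python
# Illustrative sketch of an llava_api.parse_json-style helper (assumed
# behavior, not the real code): pull the first JSON object out of a
# model reply that may wrap it in prose or ``` fences.
import json
import re

def parse_json(raw: str):
    """Return the first parseable JSON object in raw, or None."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
```

Returning `None` instead of raising keeps the caller's control flow simple when the model produces malformed output.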
---

## `Voice/` — audio I/O + Gemini Live STT + TtsMaker

| File | Public API |
|---|---|
| `audio_io.py` | `_find_g1_local_ip()`, `_resample_int16`, `_as_int16_array`, abstract **classes `Mic`, `Speaker`**, concrete **classes `BuiltinMic`, `BuiltinSpeaker`**, **dataclass `AudioIO`** with `from_profile()` factory |
| `builtin_mic.py` | **class `BuiltinMic`** (subclass of `audio_io.BuiltinMic` + `read_seconds()` for `AudioAPI.record()`) |
| `builtin_tts.py` | **class `BuiltinTTS`** (used by `AudioAPI.speak()`) |
| `gemini_script.py` | module-level `_load_voice_cfg()`, `_audio_energy()`, **class `GeminiBrain`** |
| `turn_recorder.py` | **class `TurnRecorder`** |
| `marcus_voice.py` | module-level `WAKE_WORDS`, `COMMAND_VOCAB`, `GARBAGE_PATTERNS` (populated from config), helpers `_has_wake_word`, `_strip_wake_word`, `_strip_wake_word_once`, `_closest_command`, **class `VoiceModule`** |

**`Voice.audio_io.BuiltinMic`** — G1 UDP multicast mic (Sanad-pattern port): `__init__(group, port, buf_max)`, `start()`, `stop()`, `read_chunk(num_bytes)`, `flush()`; internal `_recv_loop`.

**`Voice.audio_io.BuiltinSpeaker`** — streaming wrapper over `AudioClient.PlayStream` (built but idle in STT-only mode; TtsMaker owns the speaker): `__init__(audio_client, app_name=None)`, `begin_stream()`, `send_chunk(pcm, source_rate)`, `wait_finish()`, `stop()`, properties `interrupted`, `total_sent_sec`. Internal: `_stop_play_api()`.

**`Voice.audio_io.AudioIO`** — paired mic + speaker bundle: `@classmethod from_profile(profile_id, *, audio_client=None) -> AudioIO`, `start()`, `stop()`. Only the `"builtin"` profile is supported (Anker/Hollyland USB profiles dropped).

**`Voice.builtin_tts.BuiltinTTS`** — wraps `AudioClient.TtsMaker`: `__init__(audio_client, default_speaker_id=0)`, `speak(text, speaker_id=None, block=True)`.
**`Voice.gemini_script.GeminiBrain`** — Gemini Live STT-only brain (port of Sanad's `gemini/script.py`): `__init__(audio_io, recorder, voice_name=None, system_prompt="", *, api_key, on_transcript=None, on_command=None)`, `start()`, `stop()`, `async run()`. Internal: `_thread_main()` runs an asyncio loop in a worker thread, `_build_config(types)` returns `LiveConnectConfig(response_modalities=["TEXT"], input_audio_transcription, system_instruction)`, `_send_mic_loop(session, types)` streams 32 ms PCM chunks, `_receive_loop(session)` extracts `input_transcription.text` → callbacks and `model_turn` text → log + recorder.

**`Voice.turn_recorder.TurnRecorder`** — per-turn WAV saver: `__init__(enabled, out_dir, user_rate, robot_rate)`, `capture_user(pcm_bytes)`, `capture_robot(pcm_bytes)`, `add_user_text(text)`, `add_robot_text(text)`, `finish_turn() -> dict`. Internal: `_save_wav`, `_append_index`. In STT-only mode `_robot.wav` is never written (Gemini emits text, not audio).

**`Voice.marcus_voice.VoiceModule`** — voice orchestrator. Builds `AudioIO.from_profile("builtin", audio_client=ac)`, then spawns `GeminiBrain` with two hooks: `_on_gemini_transcript` (transcript log) and `_dispatch_gemini_command` (wake-word gate + fuzzy match → `on_command` callback). Every "Sanad + X" transcript is forwarded to Marcus's brain via the user-supplied `on_command` callback. `__init__(audio_api, on_command=None, on_wake=None)`, `start()`, `stop()`, `flush_mic()`, `is_speaking` property. Internal: `_voice_loop` (calls `_voice_loop_gemini`), `_voice_loop_gemini` (assembles AudioIO + TurnRecorder + GeminiBrain), `_on_gemini_transcript(text)`, `_dispatch_gemini_command(text, lang)`, `_normalize_command(text)`. The `flush_mic()` hook is called by `Brain/marcus_brain._on_command` both before and after `audio_api.speak()` to prevent TtsMaker output from being transcribed back as user input.
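The flush-before-and-after-speak bracket described for `flush_mic()` can be sketched with stand-in classes. Only the call ordering reflects the text above; the class and function names here are hypothetical.

```python
# Sketch of the echo-suppression bracket: flush the mic buffer before
# and after TTS so the robot's own speech is never transcribed as user
# input. FakeVoice/FakeAudio are test stand-ins, not Marcus classes.
class FakeVoice:
    def __init__(self):
        self.buffer = ["stale audio"]
        self.calls = []

    def flush_mic(self):
        self.buffer.clear()          # drop anything already captured
        self.calls.append("flush")

class FakeAudio:
    def __init__(self, voice):
        self.voice = voice

    def speak(self, text):
        self.voice.calls.append(f"speak:{text}")

def on_command_reply(voice, audio, reply_text):
    voice.flush_mic()        # drop audio captured while the brain was thinking
    audio.speak(reply_text)  # TtsMaker plays the reply out loud
    voice.flush_mic()        # drop any TTS echo the mic picked up
```

Without the second flush, a sensitive mic can hear the robot's reply and feed it straight back into the STT stream as a phantom command.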
---

## `Vision/`

| File | Public API |
|---|---|
| `marcus_yolo.py` | `start_yolo(raw_frame_ref, frame_lock)`, `yolo_sees(class, min_confidence)`, `yolo_count(class)`, `yolo_closest(class)`, `yolo_all_classes()`, `yolo_summary()`, `yolo_ppe_violations()`, `yolo_person_too_close(threshold)`, `yolo_is_running()`, `yolo_fps()`, `_resolve_device(requested)` + **class `Detection`** |
| `marcus_imgsearch.py` | **class `ImageSearch`** + prompt helpers `_build_compare_prompt`, `_build_single_prompt`, image utils `_load_image_b64`, `_numpy_to_b64`, `_resize_b64` |

**`Vision.marcus_yolo.Detection`** — a single detection's metadata: `__init__(class_name, confidence, x1, y1, x2, y2, frame_w, frame_h)`, props `size_ratio`, `position`, `distance_estimate`, method `to_dict()`, `__repr__`.

**`Vision.marcus_imgsearch.ImageSearch`** — rotate-and-compare search: `__init__(get_frame_fn, send_vel_fn, gradual_stop_fn, llava_fn, yolo_sees_fn, model)`, `search(ref_img_b64, hint, max_steps, direction, yolo_prefilter)`, `search_from_file(image_path, hint, max_steps, direction)`, `abort()`.
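In the spirit of `ImageSearch.search`, a rotate-and-compare loop might be structured as below. This is a sketch with injected stand-in callables, not Marcus's real signatures; the yaw rate and step logic are assumptions.

```python
# Illustrative rotate-and-compare loop: rotate in place in fixed steps,
# grab a frame, and ask a comparator (e.g. a vision model) whether it
# matches the reference. All callables are injected stand-ins.
def rotate_and_compare(get_frame, send_vel, gradual_stop, compare,
                       max_steps=8, yaw=0.5):
    for step in range(max_steps):
        frame = get_frame()
        if frame is not None and compare(frame):
            gradual_stop()
            return step            # found: report which step matched
        send_vel(0.0, 0.0, yaw)    # keep rotating in place
    gradual_stop()
    return None                    # not found within max_steps
```

Injecting `get_frame` / `send_vel` / `compare` keeps the search logic testable without hardware, which matches how `ImageSearch.__init__` takes its collaborators as `*_fn` parameters.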
---

## `Navigation/`

| File | Public API |
|---|---|
| `goal_nav.py` | `navigate_to_goal(goal, max_steps)`; private `_goal_yolo_target`, `_extract_extra_condition`, `_verify_condition` |
| `patrol.py` | `patrol(duration_minutes, alert_callback)` |
| `marcus_odometry.py` | **class `Odometry`** |

**`Navigation.marcus_odometry.Odometry`** — ROS2 `/dog_odom` + dead-reckoning fallback:

- lifecycle: `__init__()`, `start(zmq_sock)`, `stop()`, `reset()`, `is_running()`
- pose: `get_position()` → `{x, y, heading, source}`, `get_distance_from_start()`, `status_str()`, `__repr__`
- movement: `walk_distance(meters, speed, direction)`, `turn_degrees(degrees, speed)`, `navigate_to(x, y, heading, speed)`, `return_to_start(speed)`, `patrol_route(waypoints, speed, loop)`
- internal: `_init_own_zmq`, `_reset_state`, `_try_start_ros2`, `_dead_reckoning_loop`, `_send_vel`, `_gradual_stop`, `_check_stale`, `_time_based_walk`, `_time_based_turn`

---

## `Brain/`

| File | Public API |
|---|---|
| `marcus_brain.py` | `init_brain()`, `process_command(cmd)` → `{type, speak, action, elapsed}`, `get_brain_status()`, `shutdown()`, `run_terminal()`; private `_init_voice`, `_handle_llava`, `_handle_talk`, `_handle_search`, `_warmup_llava` |
| `command_parser.py` | `init_autonomous(auto_instance)`, `try_local_command(cmd)` (regex-table dispatcher); `_print_help`, `_print_examples` |
| `executor.py` | `execute(d)`, `execute_action(move, duration)`, `move_step(move, duration)`, `merge_actions(actions)`; `_obstacle_check` |
| `marcus_memory.py` | **class `Memory`** + utils `_read_json`, `_write_json`, `_sanitize_name`, `_fuzzy_match`, `_new_session_id` |

**`Brain.marcus_memory.Memory`** — places + sessions store, JSON-backed:

- places: `save_place(name, x, y, heading)`, `get_place(name)`, `delete_place(name)`, `list_places()`, `rename_place(old, new)`, `places_count()`
- sessions: `start_session()`, `end_session()`, `log_command(cmd, response, duration_s)`, `log_detection(class, pos, dist, x, y)`, `log_alert(type, detail)`, `get_last_command()`, `get_last_n_commands(n)`, `get_session_detections()`, `commands_count()`, `session_duration_str()`
- history: `last_session_summary()`, `previous_session_detections()`, `previous_session_places()`, `all_sessions()`
- internal: `_load_places`, `_start_autosave`, `_flush_session`, `_emergency_save`, `_write_summary`, `_prune_old_sessions`, `_get_previous_session_dir`

---

## `Autonomous/`

`marcus_autonomous.py` — **class `AutonomousMode`**: patrol-and-map state machine.

- `__init__(get_frame_fn, send_vel_fn, gradual_stop_fn, yolo_sees_fn, yolo_summary_fn, yolo_all_classes_fn, yolo_closest_fn, odom_fn, call_llava_fn, patrol_prompt, mem, models_dir)`
- lifecycle: `enable()`, `disable()`, `is_enabled()`, `status()`, `save_snapshot()`
- internal: `_explore_loop`, `_move_forward`, `_turn`, `_assess_scene`, `_create_map_dir`, `_save_observations`, `_save_path`, `_save_frame`, `_generate_summary`, `_save_session`, `_print_summary`

---

## `Server/` & `Bridge/`

| File | Public API |
|---|---|
| `Server/marcus_server.py` | `async handler(websocket)`, `async broadcast_frames()`, `async run_server(host, port)`, `main()`; helpers `_get_interface_ips`, `_check_lidar` |
| `Bridge/ros2_zmq_bridge.py` | **class `ROS2ZMQBridge`** (`_vel_cb`, `_cmd_cb`) + `main()` — standalone tool, not imported by Marcus |

---

## Suggested import surface for integration code

If you're writing glue on top of Marcus, the stable public surface is:

```python
# brain orchestration
from Brain.marcus_brain import init_brain, process_command, shutdown

# direct robot control (bypasses brain)
from API.zmq_api import init_zmq, send_vel, gradual_stop, send_cmd
from API.yolo_api import yolo_sees, yolo_summary, yolo_closest
from API.camera_api import start_camera, get_frame
from API.audio_api import AudioAPI  # .speak(text), .record(seconds)
from API.lidar_api import init_lidar, obstacle_ahead, get_slam_pose, stop_lidar
from API.memory_api import init_memory, log_cmd, log_detection, place_save, place_goto

# voice pipeline
from Voice.marcus_voice import VoiceModule
from Voice.audio_io import AudioIO, BuiltinMic, BuiltinSpeaker
from Voice.builtin_tts import BuiltinTTS  # used by AudioAPI.speak()
from Voice.gemini_script import GeminiBrain
from Voice.turn_recorder import TurnRecorder

# navigation
from Navigation.goal_nav import navigate_to_goal
from Navigation.patrol import patrol
from Navigation.marcus_odometry import Odometry

# autonomous mode
from Autonomous.marcus_autonomous import AutonomousMode
```

---

## Convention notes

- **All layers above Core must import from `API.*` only** (not directly from `Vision/`, `Navigation/`, `Voice/`). Enforced by convention, not by the language.
- **Underscore prefix = private.** `_foo` is internal; don't import it outside its module unless you're the test harness.
- **Stub rebinding pattern** (e.g. `API.yolo_api`): module-level placeholders are replaced with real implementations inside `init_*()` on success. If init fails, callers keep getting the safe stub (e.g. `yolo_sees` returns `False`).
- **Error returns are consistent per layer:** the API layer returns `None` / empty dict / `False`; the Brain layer returns structured dicts (`{"type", "speak", "action", "elapsed"}`); no exception leaks into the terminal loop except at startup (`init_brain()` raises to surface hardware issues like missing CUDA).