|
|
211d4f52ab
|
Update 2026-04-27 09:39:12
|
2026-04-27 09:39:13 +04:00 |
|
|
|
9485601e18
|
Update 2026-04-24 15:23:19
|
2026-04-24 15:23:19 +04:00 |
|
|
|
5d839d4f4e
|
Voice: finalise on faster-whisper + energy wake, remove Vosk

Full-day voice-stack refactor. Experiments run and reverted:
- Gemini Live HTTP microservice (Python 3.8 env incompat, latency)
- Vosk grammar STT (English lexicon can't decode 'Sanad'; big model
cold-load too slow on Jetson CPU)
Kept architecture:
- Voice/wake_detector.py — pure-numpy energy state machine with
adaptive baseline, burst-audio capture for post-hoc verify.
- Voice/marcus_voice.py — orchestrator with 3 modes
(wake_and_command / always_on / always_on_gated), hysteretic VAD,
pre-silence trim (300 ms pre-roll), DSP pipeline (DC remove,
80 Hz HPF, 0.97 pre-emphasis, peak-normalize), faster-whisper
base.en int8 with beam=8 + temperature fallback [0,0.2,0.4],
fuzzy-match canonicalisation, GARBAGE_PATTERNS + length filter,
/s-/ phonetic wake-verify, full-turn debug WAV recording.
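The DSP chain listed above (DC removal, 80 Hz high-pass, 0.97 pre-emphasis, peak normalisation) could look roughly like the sketch below. It is an illustration, not the code in Voice/marcus_voice.py: the function name `preprocess`, the first-order filter design and the use of plain Python lists (chosen so the sketch has no dependencies) are all assumptions.

```python
import math


def preprocess(samples, sample_rate=16000, hpf_hz=80.0, preemph=0.97):
    """Return a cleaned copy of `samples` (a list of floats).

    Hypothetical helper: the stage order follows the commit message,
    the filter design is an assumption.
    """
    if not samples:
        return []

    # 1. DC removal: subtract the mean so the waveform is centred on zero.
    mean = sum(samples) / len(samples)
    x = [s - mean for s in samples]

    # 2. First-order RC high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1]).
    rc = 1.0 / (2.0 * math.pi * hpf_hz)
    dt = 1.0 / sample_rate
    a = rc / (rc + dt)
    hp = [x[0]]
    for n in range(1, len(x)):
        hp.append(a * (hp[-1] + x[n] - x[n - 1]))

    # 3. Pre-emphasis: y[n] = x[n] - k * x[n-1], boosting high frequencies.
    pe = [hp[0]] + [hp[n] - preemph * hp[n - 1] for n in range(1, len(hp))]

    # 4. Peak normalisation: scale so the largest magnitude is 1.0.
    peak = max(abs(v) for v in pe)
    if peak == 0.0:
        return pe  # pure silence, nothing to scale
    return [v / peak for v in pe]
```

A call such as `preprocess(frame)` on a list of float samples returns a list of the same length whose largest magnitude is 1.0, or all zeros for a constant input.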
Config-driven vocab (zero hardcoded strings in Python):
- stt.wake_words (33 variants of 'Sanad')
- stt.command_vocab (68 canonical phrases)
- stt.garbage_patterns (17 Whisper noise outputs)
- stt.min_transcription_length, stt.command_vocab_cutoff
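As a rough illustration of how these config keys might work together, the sketch below drops short or garbage transcriptions and then fuzzy-matches what remains against the command vocabulary. The config values are invented examples, `canonicalise` is a hypothetical name, and using `difflib` with `command_vocab_cutoff` as its similarity cutoff is an assumption about the matching method rather than something the commit states.

```python
import difflib
import re

# Hypothetical excerpt of the "stt" section; the real lists are much longer.
STT_CONFIG = {
    "command_vocab": ["move forward", "go back", "turn left", "turn right", "stop"],
    "garbage_patterns": [r"^thanks for watching", r"^\W*$"],
    "min_transcription_length": 3,
    "command_vocab_cutoff": 0.6,
}


def canonicalise(text, cfg=STT_CONFIG):
    """Map a raw transcription to a canonical phrase, or None if rejected."""
    cleaned = text.strip().lower()

    # Length filter: very short outputs are treated as noise.
    if len(cleaned) < cfg["min_transcription_length"]:
        return None

    # Garbage filter: known noise outputs are dropped outright.
    if any(re.search(p, cleaned) for p in cfg["garbage_patterns"]):
        return None

    # Fuzzy match: keep the closest canonical phrase above the cutoff.
    matches = difflib.get_close_matches(
        cleaned, cfg["command_vocab"], n=1, cutoff=cfg["command_vocab_cutoff"]
    )
    return matches[0] if matches else None
```

Because every list and threshold is read from the config dictionary, the vocabulary can be tuned without touching the Python code, which is the property the commit describes.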
Command parser widened (Brain/command_parser.py):
- _RE_SIMPLE_DIR — bare direction + verb+direction combos
('left', 'go back', 'move forward', 'step right', ...)
- _RE_STOP_SIMPLE — bare stop/halt/wait/pause/freeze/hold
- All motion constants sourced from config_Navigation.json
(move_map + step_duration_sec) via API/zmq_api.py; no more
hardcoded 0.3 / 2.0 magic numbers.
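The widened parser can be pictured with the sketch below. Only the pattern names `_RE_SIMPLE_DIR` and `_RE_STOP_SIMPLE` and the config keys `move_map` and `step_duration_sec` come from the commit; the regex bodies, the shape of `move_map` (direction mapped to a velocity triple), the numeric values and the `parse_command` helper are assumptions made for illustration.

```python
import re

# Hypothetical excerpt of config_Navigation.json; values are illustrative.
NAV_CONFIG = {
    "move_map": {
        "forward": [0.3, 0.0, 0.0],
        "back": [-0.3, 0.0, 0.0],
        "left": [0.0, 0.3, 0.0],
        "right": [0.0, -0.3, 0.0],
    },
    "step_duration_sec": 2.0,
}

# Bare direction, or verb + direction ("left", "go back", "step right").
_RE_SIMPLE_DIR = re.compile(
    r"^(?:(?:go|move|step)\s+)?(forward|back|left|right)$", re.IGNORECASE
)
# Bare stop words.
_RE_STOP_SIMPLE = re.compile(
    r"^(?:stop|halt|wait|pause|freeze|hold)$", re.IGNORECASE
)


def parse_command(text, cfg=NAV_CONFIG):
    """Return a command dict for simple phrases, or None if nothing matches."""
    phrase = text.strip()
    if _RE_STOP_SIMPLE.match(phrase):
        return {"action": "stop"}
    m = _RE_SIMPLE_DIR.match(phrase)
    if m:
        direction = m.group(1).lower()
        return {
            "action": "move",
            # Speeds and duration come from config, not from literals here.
            "velocity": tuple(cfg["move_map"][direction]),
            "duration": cfg["step_duration_sec"],
        }
    return None
```

With this layout, changing a speed or the step duration means editing the JSON only, and the stop pattern is checked first so that a stop word is never shadowed by a motion match.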
API/audio_api.py — _play_pcm now uses AudioClient.PlayStream with
automatic resampling to 16 kHz (matches Sanad's proven pattern).
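The resampling step can be sketched as follows. This is a minimal linear-interpolation resampler for mono PCM samples, offered only to show the idea; it is not the code in API/audio_api.py, the name `resample_linear` is hypothetical, and it makes no claim about the AudioClient.PlayStream signature. It also applies no anti-aliasing filter, so downsampling with it can alias.

```python
def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample a list of integer PCM samples by linear interpolation."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)

    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        if lo >= len(samples) - 1:
            # Past the last interpolation interval: hold the final sample.
            out.append(samples[-1])
            continue
        frac = pos - lo
        out.append(round(samples[lo] + (samples[lo + 1] - samples[lo]) * frac))
    return out
```

For example, `resample_linear([0, 100, 200, 300], 8000)` doubles the sample count and returns `[0, 50, 100, 150, 200, 250, 300, 300]`.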
Removed:
- Voice/vosk_stt.py (and all Vosk references in marcus_voice.py)
- Models/vosk-model-small-en-us-0.15/ (40 MB model + zip)
- All Vosk keys from Config/config_Voice.json
Documentation synced across README, Doc/architecture.md,
Doc/pipeline.md, Doc/functions.md, Doc/controlling.md,
Doc/MARCUS_API.md, Doc/environment.md changelog.
Known limitation: faster-whisper base.en on Jetson CPU with the G1
far-field mic yields ~50% command-transcription accuracy, due to
model capacity and mic reverberation. The wake, ack, recording,
trim, Whisper, fuzzy-match, brain and motion stages are all verified
working end-to-end. Possible future improvements (not implemented):
a close-talking USB mic via pactl_parec, or Gemini Live via an HTTP
microservice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
2026-04-24 14:32:28 +04:00 |
|
|
|
f45e12fae5
|
Update 2026-04-23 09:54:45
|
2026-04-23 09:54:45 +04:00 |
|
|
|
ce09b6920a
|
Update 2026-04-22 17:54:49
|
2026-04-22 17:54:50 +04:00 |
|
|
|
00e52496a9
|
Update 2026-04-22 17:01:46
|
2026-04-22 17:01:48 +04:00 |
|
|
|
78a5b0b408
|
Update 2026-04-22 15:37:55
|
2026-04-22 15:37:56 +04:00 |
|
|
|
e9fb18eea1
|
Update 2026-04-22 14:44:13
|
2026-04-22 14:44:14 +04:00 |
|
|
|
f4ff5c27fa
|
Update 2026-04-22 13:47:14
|
2026-04-22 13:47:15 +04:00 |
|
|
|
dc06864ec3
|
Update 2026-04-22 13:28:38
|
2026-04-22 13:28:39 +04:00 |
|
|
|
9991e742da
|
Update 2026-04-22 12:32:54
|
2026-04-22 12:32:55 +04:00 |
|
|
|
d257808e48
|
Update 2026-04-22 12:17:30
|
2026-04-22 12:17:32 +04:00 |
|
|
|
dcf5f9f39b
|
Update 2026-04-22 12:08:03
|
2026-04-22 12:08:04 +04:00 |
|
|
|
1c994fa175
|
Update 2026-04-22 12:00:29
|
2026-04-22 12:00:30 +04:00 |
|
|
|
9ac3e19ed1
|
Update 2026-04-22 11:53:06
|
2026-04-22 11:53:07 +04:00 |
|
|
|
5b8e94f42e
|
Update 2026-04-22 11:47:53
|
2026-04-22 11:47:54 +04:00 |
|
|
|
3122a52966
|
Update 2026-04-22 11:43:20
|
2026-04-22 11:43:21 +04:00 |
|
|
|
2e3cc1ba5b
|
Update 2026-04-22 11:39:53
|
2026-04-22 11:39:54 +04:00 |
|
|
|
88e22a423b
|
Update 2026-04-22 11:19:55
|
2026-04-22 11:19:59 +04:00 |
|
|
|
e7609b119f
|
Update 2026-04-22 11:13:23
|
2026-04-22 11:13:23 +04:00 |
|
|
|
ac9271c62b
|
Update 2026-04-22 10:57:22
|
2026-04-22 10:57:23 +04:00 |
|
|
|
af1d0c1b8a
|
Update 2026-04-21 16:27:07
|
2026-04-21 16:27:11 +04:00 |
|
|
|
e0f6acd5c7
|
Update 2026-04-21 16:10:00
|
2026-04-21 16:10:03 +04:00 |
|
|
|
8491be7f1e
|
Initial project commit
|
2026-04-12 18:50:22 +04:00 |
|