v0.12.0
Release date: 2026-05-30
Added
- Added browser dictation and a modular speech-to-text service with OpenAI, Deepgram, ElevenLabs, Google/Gemini, and xAI backends, shared WAV preprocessing, provider settings, and /api/stt/transcribe (fcadb238, 80c5475f, 209bff6c).
- Added a live microphone waveform in the composer so users get visible recording feedback while dictating (fcadb238).
- Added Google/Gemini speech-to-text with inline WAV upload support and exposed Google as an STT-capable provider (209bff6c).
- Added grok-build-0.1 and grok-4.3 model catalog entries with pricing details (dd0accb5).
Changed
- Normalized speech-to-text provider selection in Settings, including capability badges, auto-save behavior, and consistent active-default styling (fcadb238, f888b5ab).
- Replaced remaining native picker controls in settings, board/task dialogs, and dashboard filters with shared shadcn/Radix Select controls (bd37b82f).
- Aligned Gemini image input with current Interactions API shapes, including HEIC/HEIF detection, upload picker support, and current multimodal turn serialization (209bff6c).
- Standardized model-provider retry behavior across OpenAI, xAI, Google, Anthropic, generic OpenAI-compatible, and GitHub Copilot providers with shared transient/rate-limit backoff tracing (209bff6c).
- Configured ElevenLabs speech-to-text requests to avoid verbatim filler transcription (40d2b3e).
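
The shared retry behavior described in the Changed list above is not spelled out in these notes. As a rough illustration only, a transient/rate-limit backoff with tracing could look like the sketch below. Every name here (ProviderError, RETRYABLE_STATUSES, call_with_retry, the trace callback) and the choice of status codes and delays are assumptions made for the example, not the project's actual code.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Assumed set of retryable HTTP statuses for this sketch (rate limit + overload);
# the project's real list is not given in the release notes.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


class ProviderError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed provider call."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(message or f"provider returned HTTP {status}")
        self.status = status


def call_with_retry(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
    trace: Optional[Callable[[str], None]] = None,
) -> T:
    """Run request(), retrying retryable failures with capped exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ProviderError as err:
            # Non-retryable statuses and the final attempt propagate to the caller.
            if err.status not in RETRYABLE_STATUSES or attempt == max_attempts:
                raise
            # Exponential backoff with full jitter, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay = random.uniform(0, ceiling)
            if trace is not None:
                trace(f"attempt {attempt} failed with HTTP {err.status}; retrying in {delay:.2f}s")
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

Routing every provider through one helper of this kind is what would make retry traces comparable across backends. The injectable sleep and trace parameters are a convenience of the sketch so the behavior can be exercised without real delays.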
Fixed
- Fixed Gemini Interactions parsing for steps/model_output responses so tool calls, web search items, grounding annotations, and final assistant text are handled reliably (209bff6c).
- Fixed Google semantic incomplete responses and HTTP overload statuses to retry through the shared provider retry path (209bff6c).
- Fixed broken Tk workspace folder picker handling so web workspace selection can recover when the native dialog is unavailable (e955c761).
- Fixed model profile modal overflow artifacts and restored padding for Select dropdowns after the picker migration (3d7173b8, bd37b82f).
- Fixed shared session/workspace search fields by replacing the native cancel affordance with a styled clear button and safe input padding (0cea096f).
Removed
- Removed the legacy native select wrapper after migrating app pickers to shared shadcn/Radix Select primitives (bd37b82f).
Documentation
- Added speech-to-text documentation and linked provider, environment variable, session command, and web UI guidance to the new dictation flow (54bdc68e, 209bff6c).