speech-ui
Speech - OpenAI Text-to-Speech GUI
A Crystal GUI application using the OpenAI Text-to-Speech API. Enter text, choose voice/model/format, optionally add instructions, then generate and play AI‑generated speech.
Features
- Uses the OpenAI gpt-4o-mini-tts / tts-1 / tts-1-hd models
- 11 selectable built‑in voices
- Multiple audio output formats (mp3, wav, pcm, opus, flac, aac)
- Optional instruction prompt to change tone / style
- Save-to-file option with custom path
- Simple, compact cross‑platform GUI (macOS / Linux playback helpers)
Prerequisites
- Crystal language installed
- OpenAI API key (https://platform.openai.com/api-keys)
- macOS: afplay available (pre-installed by default)
- Linux: one of mpg123 or aplay installed for playback
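The playback requirement above can be checked from a shell before launching the app. The following is a minimal sketch, not the application's own logic: the function names `first_available` and `pick_player` are hypothetical, and the platform mapping simply mirrors the prerequisites listed here.

```bash
# Print the first command from the argument list that exists on PATH.
first_available() {
  local cmd
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      printf '%s\n' "$cmd"
      return 0
    fi
  done
  return 1
}

# Choose a player by platform: afplay on macOS, mpg123 or aplay on Linux.
pick_player() {
  case "$(uname -s)" in
    Darwin) first_available afplay ;;
    Linux)  first_available mpg123 aplay ;;
    *)      return 1 ;;
  esac
}

if player="$(pick_player)"; then
  echo "Playback tool: $player"
else
  echo "No supported playback tool found" >&2
fi
```

mpg123 decodes MP3 while aplay handles WAV and raw PCM, so which Linux tool is usable may depend on the output format you pick.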
Install dependencies
shards install
Build
# Speech - OpenAI TTS GUI
Minimal Crystal GUI for OpenAI Text-to-Speech. Type text, pick voice / model / format, optionally add instructions, then generate & play (or save) audio.
## Highlights
* Models: gpt-4o-mini-tts / tts-1 / tts-1-hd
* Voices (11): alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer
* Formats: mp3 (default), wav, pcm, opus, flac, aac
* Optional instructions & save-to-file
## Quick Start
Prerequisites: Crystal, OpenAI API key, playback tool (macOS: afplay, Linux: mpg123 or aplay).
```bash
shards install
export OPENAI_API_KEY="your-api-key"
shards build --release -Dpreview_mt
bin/speech
```
## Usage
- Select Voice / Model / Format
- (Optional) Enter instructions (e.g. "Warm, calm")
- Enter text
- (Optional) Enable "Save file" and choose a path
- Click "Generate & Play"
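For reference, the steps above correspond roughly to a single request against the OpenAI speech endpoint (`POST https://api.openai.com/v1/audio/speech`). The sketch below shows that request from a shell; it is an illustration under assumptions, not the app's Crystal code. The helper names `json_escape`, `build_payload`, and `speak_to_file` are hypothetical, the defaults (`alloy`, `gpt-4o-mini-tts`, `mp3`) are chosen for the example, and the escaping only covers single-line text.

```bash
# Escape backslashes and double quotes so a single-line string is safe inside JSON.
# (Assumption: input has no newlines or control characters.)
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON body: text, voice, model, format, optional instructions.
build_payload() {
  local text="$1" voice="${2:-alloy}" model="${3:-gpt-4o-mini-tts}"
  local format="${4:-mp3}" instructions="${5:-}"
  printf '{"model":"%s","voice":"%s","response_format":"%s","input":"%s"' \
    "$model" "$voice" "$format" "$(json_escape "$text")"
  # Only include the instructions field when one was given.
  if [ -n "$instructions" ]; then
    printf ',"instructions":"%s"' "$(json_escape "$instructions")"
  fi
  printf '}\n'
}

# Send the request and write the audio to a file.
# Needs OPENAI_API_KEY in the environment and network access.
speak_to_file() {
  local out="$1"
  shift
  curl -sS --fail https://api.openai.com/v1/audio/speech \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$@")" \
    -o "$out"
}
```

A call such as `speak_to_file hello.mp3 "Hello there" coral gpt-4o-mini-tts mp3 "Warm, calm"` would then save the generated audio to `hello.mp3`, which can be played with one of the playback tools from the prerequisites.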
## Notes
- Disclose to users that the voice is AI-generated
- API usage may incur costs
- Requires an internet connection
## Dev
crystal run src/speech.cr
## License
MIT
- uing – GUI toolkit