I’m building https://TaleTwister.com – an AI tool that generates and narrates bedtime stories tailored to a kid’s interests, age, a specific moral lesson, and any optional extra details. The product was born out of necessity (a stint of the bad behavior that comes with a certain age).
It uses GPT-4 for story generation, ElevenLabs for narration, and a simple Next.js + Supabase stack for the app layer. I’m experimenting with story memory (so kids can revisit recurring characters) using vector embeddings, and building a “choose-your-own-adventure” mode with dynamic audio rendering.
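For the curious, a minimal sketch of what that embedding-based character memory could look like (the `CharacterMemory` shape, the `recallCharacter` helper, and the toy vectors here are all hypothetical; in practice the vectors would come from an embedding API and live in Supabase/pgvector, not an in-memory array):

```typescript
// Hypothetical sketch: recall a recurring character by embedding similarity.
// Real embeddings would come from an embedding model; these are toy vectors.

interface CharacterMemory {
  name: string;
  description: string;
  embedding: number[]; // vector from an embedding model
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the stored character whose embedding is closest to the query.
function recallCharacter(
  query: number[],
  memory: CharacterMemory[]
): CharacterMemory | null {
  let best: CharacterMemory | null = null;
  let bestScore = -Infinity;
  for (const c of memory) {
    const score = cosineSimilarity(query, c.embedding);
    if (score > bestScore) {
      bestScore = score;
      best = c;
    }
  }
  return best;
}
```

Embedding a new story request and pulling the nearest stored character is what would let "the fox from last week" reappear without the model re-inventing him.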
Biggest challenge so far: aligning narration, ambient sounds, and story pacing without sounding janky or robotic. Solved it by tokenizing and chunking the story for synchronized audio stitching via ffmpeg.
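The chunking half of that is simple enough to sketch (this is my guess at the approach, not the actual TaleTwister code: split on sentence boundaries, pack sentences into chunks under a size cap so each chunk can be narrated separately, then feed the per-chunk audio files to ffmpeg's concat demuxer):

```typescript
// Hypothetical sketch: split a story into sentence-level chunks so each can
// be narrated as its own audio clip and stitched back together with ffmpeg.

function chunkStory(story: string, maxChars = 200): string[] {
  // Naive sentence split on ., !, ? terminators.
  const sentences = story.match(/[^.!?]+[.!?]+/g) ?? [story];
  const chunks: string[] = [];
  let current = "";
  for (const s of sentences) {
    if (current && current.length + s.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += s;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Build an ffmpeg concat-demuxer list file: one narrated clip per chunk.
function concatList(files: string[]): string {
  return files.map((f) => `file '${f}'`).join("\n");
}
```

The resulting list can then be stitched with something like `ffmpeg -f concat -safe 0 -i list.txt -c copy story.mp3`, with ambient beds mixed in per chunk before the concat step.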
Another challenge was inconsistent illustrations from DALL-E 3. I’ve adopted a dynamic prompting method that includes as much detail as possible about the scene, the characters, and any other visual elements that should remain consistent across the storybook pages.
If your kid ever demands “one more story” after a long day, I built this for you. It’s had a meaningful impact on my son’s behavior.
That's really cool! I made something similar but much simpler using R Shiny.
Edit: Looking at the featured story, I notice the character illustration isn’t consistent currently. I got relatively consistent character illustrations from scene to scene by asking the LLM to write an image-generation prompt based on the story text and to reuse the same character description in every prompt. That got me pretty consistent character drawings. Also, to keep the same drawing style, my prompt template was roughly:
{drawing style, e.g. Digital painting} {drawing prompt}
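In case it helps anyone else, here’s a small sketch of that as a prompt builder (the `CharacterSheet` shape and field names are made up; the point is just that the style prefix and character description are fixed once and only the scene text changes per page):

```typescript
// Hypothetical sketch: keep the style prefix and character description
// identical on every page, and vary only the scene, so the image model
// renders the same-looking character throughout the book.

interface CharacterSheet {
  name: string;
  appearance: string; // e.g. "a small red fox wearing a blue scarf"
}

function buildImagePrompt(
  style: string,
  character: CharacterSheet,
  scene: string
): string {
  return `${style}. ${character.name}, ${character.appearance}, ${scene}.`;
}
```

Generating every page’s prompt through the same function is what guarantees the fixed parts never drift between pages.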
Thanks for the feedback! I’ll be incorporating that. The scene is already described via the dynamic image prompting; however, there’s currently nothing about style in the prompt.
I’ll update the prompt and see how it works. Thanks again!