Multilingual Text-to-Speech for Kids' Learning

A parent-friendly guide to the languages KORENANI supports and how spoken labels turn everyday photos into language moments.

What is KORENANI's multilingual text-to-speech?

KORENANI's multilingual text-to-speech lets children hear the label and short description for a recognized photo. It supports nine languages: Japanese, English, Spanish, French, German, Italian, Portuguese, Chinese, and Korean.

Hear the names of things, in nine languages

KORENANI takes the names it finds in your photos and reads them out loud. Children who can't read yet can still pick up new words by ear, much as they pick up their first language at home.

The app combines two complementary sources of speech. iOS provides on-device speech synthesis for instant playback, and our backend pre-generates clearer audio for the labels and short descriptions of recognized objects. Either way, when your child taps a result, they hear the word right away.
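
For readers curious how the two speech sources might fit together, here is a minimal Python sketch. It assumes a policy this guide does not spell out: play the pre-generated clip when one exists, and otherwise fall back to on-device synthesis. The names `Playback` and `choose_playback`, the language codes, and the clip path are hypothetical illustrations, not KORENANI's actual code.

```python
from dataclasses import dataclass
from typing import Optional

# ISO 639-1 codes for the nine languages listed in this guide (assumed codes).
SUPPORTED_LANGUAGES = ("ja", "en", "es", "fr", "de", "it", "pt", "zh", "ko")


@dataclass(frozen=True)
class Playback:
    source: str   # "pregenerated" or "on_device"
    payload: str  # clip reference for pre-generated audio, or the text to synthesize


def choose_playback(label: str, language: str, clip_ref: Optional[str]) -> Playback:
    """Pick an audio source for one tapped result (hypothetical policy)."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    if clip_ref:
        # A clip prepared in advance is available, so prefer it.
        return Playback("pregenerated", clip_ref)
    # No prepared clip: hand the label text to the device's speech synthesizer.
    return Playback("on_device", label)
```

With a clip reference such as `"clips/apple-en.m4a"` the function returns a pre-generated playback; with `None` it returns an on-device playback carrying the label text.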

Supported languages

KORENANI ships with vocabulary and audio for nine languages:

  • Japanese
  • English
  • Spanish
  • French
  • German
  • Italian
  • Portuguese
  • Chinese
  • Korean

Language      Example family use
Japanese      Confirm the everyday name in a home language
English       Hear the same photo in English and compare the sound
Spanish       Add common words to a family language routine
French        Replay the label for its rhythm and pronunciation
German        Compare names for vehicles, tools, and household objects
Italian       Use food and daily objects as simple listening practice
Portuguese    Focus on the language your family wants to practice
Chinese       Hear the same object in a different language system
Korean        Build familiarity with nearby regional vocabulary

Two kinds of audio

For every recognized object, we prepare two pieces of audio:

  • Label audio: a short, native-sounding pronunciation of the word itself. Great for repeating aloud.
  • Description audio: a short sentence that gives a little more context (what it is, where you might see it).

Because the audio is prepared in advance and stored alongside the recognition result, playback feels instant even on a slow connection.
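
As a rough illustration of audio "stored alongside the recognition result", the Python sketch below keeps one label clip and one description clip per language on each result. The class names, field names, and file paths are all hypothetical. The guide does not describe the real storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AudioPair:
    label_clip: str        # short pronunciation of the word itself
    description_clip: str  # short sentence with a little more context


@dataclass
class RecognitionResult:
    label: str
    # Maps a language code (for example "en") to the clips prepared for it.
    audio_by_language: Dict[str, AudioPair] = field(default_factory=dict)

    def clip_for(self, language: str, kind: str) -> Optional[str]:
        """Return the stored clip reference, or None if that language has no audio."""
        if kind not in ("label", "description"):
            raise ValueError(f"unknown audio kind: {kind}")
        pair = self.audio_by_language.get(language)
        if pair is None:
            return None
        return pair.label_clip if kind == "label" else pair.description_clip


# Example record for one recognized object (paths are made up).
flower = RecognitionResult(
    label="sunflower",
    audio_by_language={
        "en": AudioPair("audio/sunflower/en-label.m4a", "audio/sunflower/en-desc.m4a"),
    },
)
```

Because the clip references travel with the result, looking one up is a dictionary read rather than a new request, which is one plausible reason playback can feel immediate.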

Turning photos into language moments

The real magic happens when audio meets the moment a child is curious:

  • Snap the flower you spotted on a walk and hear the name in two different languages back-to-back
  • Take a picture of dinner and have your child repeat each ingredient in their target language
  • Show the same photo again next week to refresh the word by ear, not just by eye

Vocabulary that arrives through the ear, attached to a real object the child cared about, tends to stick better than words drilled from flashcards. KORENANI is built around that simple idea.

Read next

For more on how the app picks the right answer, read "General, Insect, and Plant Recognition Modes." To learn how photos are handled, read "How Photo Data Flows During Image Recognition."
