How vocabulary becomes a language, and what it takes to get there

A student named Isabella accidentally summarized 40 years of language acquisition research in one feature request

A few months ago a student named Isabella sent me a message. She wanted to know if she could upload her Swedish vocabulary list to Lexie and have the app generate reading passages from those exact words. Then comprehension questions on top, to check she had understood what she'd read.

She is 16. She had a Swedish exam that week. And she had just described comprehensible input plus active recall in one sentence, without using either term.

Vocabulary drilling has a bad reputation it doesn't deserve. The research is unambiguous. Words become available for use through retrieval practice (Karpicke and Roediger, 2008), spaced over time (Cepeda et al., 2006), and ideally encoded across more than one channel, which is the dual coding effect Paivio described in 1971. Reading the word, hearing it, typing it, and producing it from a prompt all leave different traces. Stack enough of them and the word becomes recoverable when the brain needs it.

What doesn't work is staring at the list and willing yourself to remember it. Recognition is a much weaker signal than recall. You read "kissa = cat" twenty times and the next day, on the test, the word is gone. Producing the word from nothing is the part that builds the memory, and it's the part students skip when they're studying alone.

Lexie handles this part the way the research says it should be done. You photograph the list and the app builds typed recall (the word in your language, you type it in the target language), matching pairs, and listening exercises where the word is read aloud and you type what you hear. Reviews get scheduled by FSRS, a more recent and more efficient descendant of the spaced repetition schedules built on the forgetting curve Ebbinghaus mapped in 1885. None of this is novel. What is rare is finding it executed cleanly inside one app.
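For readers who want to see the mechanics, here is a minimal sketch of typed recall with spaced review. It is not FSRS and not Lexie's code. It uses a deliberately crude rule (double the interval after a correct answer, reset to one day after a miss) just to show the shape of the loop. The names Card and review are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Card:
    prompt: str              # the word in the learner's own language
    answer: str              # the word in the target language
    due: date                # next day this card should be reviewed
    interval_days: int = 1   # current gap between reviews


def review(card: Card, typed: str, today: date) -> bool:
    """Check a typed answer and reschedule the card.

    Assumption: a simple doubling rule stands in for a real scheduler.
    FSRS models memory far more carefully than this.
    """
    correct = typed.strip().lower() == card.answer.lower()
    if correct:
        card.interval_days *= 2   # remembered: wait longer next time
    else:
        card.interval_days = 1    # forgotten: see it again tomorrow
    card.due = today + timedelta(days=card.interval_days)
    return correct


card = Card(prompt="cat", answer="katt", due=date(2024, 1, 1))
review(card, "katt", today=date(2024, 1, 1))
print(card.due, card.interval_days)
```

The point of the sketch is the asymmetry: a correct recall pushes the next review further out, a miss pulls it back in, and the learner always has to produce the word rather than recognize it.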

Vocabulary drilling has a ceiling though. You can know what kissa means and still freeze when you read it inside a sentence, because words in isolation are not the units real language operates on. Stephen Krashen's input hypothesis, the i+1 formulation, is the standard reference for what comes next. Language gets acquired when a learner is exposed to material slightly above their current level, in context, doing meaningful work. Words live inside grammar. Grammar lives inside meaning. Memorizing definitions without exposure to the language working as language gets you to a wall faster than students expect.

A word doesn't move from list to language all at once. It moves in stages. You encounter it. You notice it as a unit. You comprehend it in context. You retrieve it from memory under pressure. You produce it in the right form, first in writing, then out loud. Eventually you use it without thinking about it at all. Different kinds of practice serve different stages, and a student who is only doing one kind will stall at whichever stage that practice doesn't reach.

Paul Nation's four-strands framework, which is roughly the consensus view of how a balanced language course should be structured, breaks practice into four equal pillars. Meaning-focused input, where you read and listen for the meaning of the message. Deliberate language study (Nation calls it language-focused learning), where you drill vocabulary, grammar, and pronunciation. Meaning-focused output, where you write and speak to communicate. And fluency development, where you use what you already know at a faster pace. Each strand should get roughly equal time. A student doing only one or two of them, however much of them they do, is leaving stages of acquisition undertrained.

The strand that gets the shortest shrift, in most school curricula and in most self-study, is output. This is where Merrill Swain's output hypothesis becomes useful. Swain formulated it in 1985 after watching Canadian French immersion students develop strong comprehension and surprisingly weak production. The students understood almost everything. They couldn't reliably produce the language they understood. Comprehension and production turned out to be different skills, built by different kinds of practice. Producing language forces decisions the brain can avoid when it's only receiving. You can't smooth over the verb form you don't know. You have to commit to one. The commitment is what makes you notice the gap, and noticing the gap is what eventually closes it.

School gives students the vocab list, the textbook chapter, the grammar drill, the listening exercise, the test. Plenty of input across a week. The thing school can't give every student is the sheer volume of additional practice each individual brain needs to move a word from "I have seen this" to "I can use this." Nation estimates a learner needs to encounter a word between 10 and 20 times in meaningful context before it moves from recognition to productive use. The classroom can deliver some of those encounters. The rest happens at home, on the bus, the night before, in whatever practice the student manages to build for themselves.

What Isabella was asking for was a way to build more of that practice without having to stitch it together from three separate apps. Vocabulary not on a list. In sentences. Doing real work. With questions afterward to confirm it had landed.

So Lexie generates reading passages from any vocabulary list you give it. The passages use your words inside sentences that make sense, at a difficulty level matched to the source. Then it generates comprehension questions on top of those passages, which forces retrieval of meaning rather than the shallower kind of word recognition that fooled you on the wordlist.
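One way to picture the "uses your words" requirement is a coverage check: given a passage and a vocabulary list, count how often each listed word actually appears. The sketch below is an illustration under stated assumptions, not a description of how Lexie generates or validates passages. The function names are hypothetical, and it matches exact word forms only, so an inflected form such as "hunden" does not count toward "hund".

```python
import re
from collections import Counter


def vocabulary_coverage(passage: str, vocabulary: list[str]) -> dict[str, int]:
    """Count exact-form occurrences of each vocabulary word in the passage."""
    # \w+ matches Unicode letters in Python 3, so å, ä and ö are kept intact.
    tokens = Counter(re.findall(r"\w+", passage.lower()))
    return {word: tokens[word.lower()] for word in vocabulary}


def missing_words(counts: dict[str, int]) -> list[str]:
    """List the vocabulary words that never appeared in the passage."""
    return [word for word, n in counts.items() if n == 0]


counts = vocabulary_coverage(
    "Min katt sover. Hunden ser min katt.",
    ["katt", "hund", "sover"],
)
print(counts)
print(missing_words(counts))
```

A real implementation would need to handle inflection, which is exactly why seeing a word inside sentences teaches more than seeing it on a list: the word shows up in the forms the language actually uses.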

You can listen to the passages with full audio, which adds the dual coding layer back in. You can tap any word for an instant translation if something trips you up, which keeps you inside the passage rather than bouncing to a dictionary and losing your place. The same vocab list can run through typed recall, matching pairs, and spaced review as a separate study mode whenever you want to drill.

Drilling and reading aren't competing methods. They sit at different points on the same path, and the path only works if you can walk all of it.

There is one stretch of the path Lexie can only partly walk with you. Output has two halves. Written output, where you produce language with time to think, room to revise, and a forgiving medium. Spoken output, where you produce language in real time with no take-backs and a person waiting on the other side. Lexie's typed recall, written comprehension answers, and open-ended think questions are written output. They force production. They reveal the gaps that reading alone doesn't.

Spoken output is a different thing. You can rehearse a sentence in your head and still freeze when it comes out of your mouth. The muscles haven't done it. The timing hasn't been practiced. The reaction loop with another speaker hasn't trained your prediction. None of that gets built by typing words into a phone. It gets built by talking to a person who is also speaking the language, and ideally to a person who will tell you when you're wrong.

That gap is real and the app doesn't pretend otherwise. Lexie covers vocabulary, reading, listening comprehension, written recall, and the spaced review that keeps it all available. The talking part still needs a human. A tutor, an exchange partner, a classroom that actually uses the target language, a stubborn relative who refuses to switch to English. Whatever combination a student can assemble.

What this means in practice is that everything around the conversation fits inside the app. The preparation. The vocabulary. The reading. The comprehension. The written recall. The bus-ride drill on Thursday morning before the Friday test. The conversation itself still happens elsewhere, which is where it should happen.

What Isabella described is roughly the full theoretical stack of modern second language acquisition research, compressed into one feature request. Vocab to exposure to recall. The fact that a 16-year-old felt the gap and named it precisely tells you the gap is real. School gets students most of the way there. The remaining stretch is practice, and practice is what Lexie is for.