As someone who's learning a language (French) with Duolingo, and also supplementing that with other methods (podcasts, social media, online chatting, talking to ChatGPT), I've really wanted a way to get a Duolingo-type experience with my own set of vocabulary that I encounter. So I'll definitely check this out. Also, your English is impressive!
This looks really good. I wish I had had something like this many years ago when I was studying languages.
Somebody has already suggested adding spaced repetition and audio, which I agree with completely.
One more suggestion: In addition to having the LLM give you the meaning and example for the context in which you originally saw the word, also ask it to provide the word’s other main meanings and examples of it being used in those senses. You might encounter a word first in a slang or technical sense; while it’s useful to learn that meaning, it’s also important to learn other, more common meanings.
Below are some examples of words you might encounter first in technical contexts but would also be worth knowing in their more general meanings. (Examples suggested and defined by ChatGPT o1.)
canonical
Religious/General: Relating to a canon (e.g., church law) or a recognized body of works.
Math/Computing: Conforming to a standard or simplest form (e.g., “canonical form” of an equation).
resolution
General: A firm decision or determination (often heard in “New Year’s resolution”).
Tech/Imaging: The detail an image holds, typically measured in pixels, dots per inch (DPI), etc.
protocol
Diplomatic/General: The official procedure or set of rules governing state or ceremonial events.
Computing: A set of conventions and rules for transmitting data between electronic devices.
flux
General: Continuous movement or change, often implying instability.
Physics/Engineering: The amount of some quantity (e.g., heat, magnetism) passing through a given area over time.
Hey there, quick suggestion as a PhD Linguistics candidate and avid language learner!
The best way I've found to identify vocabulary most important to my life is through journaling in the language I'm trying to learn. Describing exactly what I did that day, my thoughts, etc, as best I can.
I had thought of doing the journal entries digitally and gathering dictionary headwords from them, whether they're written in my mother tongue (English) or not, then using the resulting word lists to drill vocab.
Traditionally you'd use a lemmatizer with a morphosyntactic tagger for the language to identify the dictionary words. AI is serviceable these days for easily identifying dictionary words in long-form text in many languages, though honestly I would be surprised if AI already outperforms the traditional methods.
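For anyone curious what the headword-gathering step could look like, here is a minimal sketch. It assumes a tiny hand-written lemma table (`LEMMAS`, hypothetical and English-only) standing in for a real lemmatizer or tagger, so it only illustrates the shape of the pipeline: tokenize, map each token to its dictionary form, count.

```python
import re
from collections import Counter

# Hypothetical, hand-written lemma table for illustration only.
# A real lemmatizer/tagger for the target language would replace it.
LEMMAS = {
    "went": "go", "going": "go", "goes": "go",
    "ate": "eat", "eating": "eat",
    "apples": "apple", "friends": "friend",
}

def headwords(journal_text: str) -> Counter:
    """Count dictionary headwords in a journal entry."""
    # Runs of letters only (Unicode-aware); digits and punctuation are dropped.
    tokens = re.findall(r"[^\W\d_]+", journal_text.lower())
    # Unknown tokens are kept as-is, which is wrong for inflected forms
    # missing from the table but keeps the sketch simple.
    return Counter(LEMMAS.get(tok, tok) for tok in tokens)

entry = "I went to the market. My friends ate apples while I was going home."
counts = headwords(entry)
print(counts.most_common(5))
```

The resulting counter can be sorted by frequency to decide which words to drill first.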
Good luck and have fun :)
Thoughts on FSI methodology? That's what I used for mine (my app).
Honestly had never even heard of it! But adult language acquisition isn't really a domain of study I've ever been interested in. I can only speak to what I have found most helpful in my own adult language acquisition journeys. The journaling method was taught to me by a polyglot friend of mine and it sort of solved the "what actually is my everyday vocabulary anyway" side of language learning for me.
tl;dr "The Foreign Service Institute (FSI) is the primary training institution to prepare American diplomats to advance U.S. foreign affairs interests, teaching, among other things, the languages of the countries where Foreign Service Officers will serve."
Apologies, I should have linked beforehand.
From my research, the best language learning program is Anki. It is open source and one can make custom 'decks' fairly simply. Perhaps a dictionary add-on for Anki would be a good idea?
Great work, I had a similar need, and built a similar app (using podcasts) [1]
I originally planned to add some kind of SRS to it, but I found that I learned much better just reading things in context instead of explicitly using SRS to memorize them. Steve Kaufmann (creator of LingQ) explains this better here [2]
Thank you so much, both for your comment and for sharing your app! (There are definitely great tools out there that we're not aware of.) I am very happy to find your app because I actually needed something like this! I enjoy listening while working and being able to see the transcription alongside it, with word definitions in context. This kind of learning really works for me! It's fantastic how it supports all those languages: you can listen, read, and look up definitions all in one place. Compared to this, the one I shared above looks very basic. You handle transcription, media playing, testing pronunciation, LLM interaction (I guess for contextual meaning and examples)...! The only question I have (sorry if this already exists, but I couldn't find it): is there a chance I can see a list of words I've encountered and marked as known?
As for the second part, I'm planning to include the SRS features @markvdb pointed out in the comments. Combining contextual learning with SRS would be interesting, I guess.
Similar to LingQ there is Migaku which can do this for YouTube and other sites. It definitely has significantly aided my learning and made it a zero friction and even fun experience to learn another language.
Thank you for sharing! Looking at their blog, I saw this post about learning Japanese vocabulary (https://migaku.com/blog/japanese/how-to-learn-japanese-vocab...). They share a Japanese Netflix Frequency List - (https://docs.google.com/spreadsheets/d/15b3j9--RJ1K5hI9vz_2L...)
"To recognize 99% of all the words in Netflix's subtitles, you'd need to know 37,247 words"
Interesting approach! I really don't know how they managed to gather this list, but it's an interesting and clever method.
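I don't know how Migaku built their list, but the "N words for X% coverage" figure itself is simple to compute once you have word counts. A minimal sketch, assuming you already have a frequency table from some subtitle corpus (the `toy` counts below are made up for illustration): sort words by frequency and walk down the list until the running total reaches the target share of all tokens.

```python
from collections import Counter
from itertools import accumulate

def words_needed(freqs: Counter, target: float) -> int:
    """Smallest number of most-frequent words covering `target` of all tokens."""
    total = sum(freqs.values())
    # Running total of occurrences, most frequent word first.
    running = accumulate(count for _, count in freqs.most_common())
    for rank, covered in enumerate(running, start=1):
        if covered / total >= target:
            return rank
    return len(freqs)

# Made-up counts standing in for a real subtitle corpus (100 tokens total).
toy = Counter({"the": 50, "is": 30, "cat": 15, "quantum": 4, "flux": 1})
print(words_needed(toy, 0.90))   # 3 words cover 95 of 100 tokens
print(words_needed(toy, 0.995))  # all 5 words are needed
```

The long tail is why the last few percent are so expensive: in the toy data, going from 95% to 99.5% coverage takes as many new words as going from 80% to 95% did in tokens covered per word, many times over.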
There's also https://nuenki.app (disclaimer: I made it), which applies the same approach to every single website*. It translates appropriate-difficulty sentences into your target language, and you can hover for definitions, pronunciations, etc.
*other than those blocked for privacy reasons
I actually want to learn German, but I want to learn it by reading German texts and starting from zero, even though that makes it challenging. I need to look up definitions and such, but translating the entire page defeats the core purpose. In my case this app is just a perfect match! Thank you for sharing!
Awesome! Let me know if you have any feedback!
I'll drop this here: If anyone wants to work on Language Reactor (well compensated), my email is in my profile. I'm planning to start open-sourcing much of it soon.
Interesting approach! Thanks for making it open-source, I think we need more open-source language learning tools. As I'm also building one (https://github.com/laurentlb/lingostories/), I'm going to take a look at what you did and the technical decisions.
You seem to focus on the English use-case. In my experience, getting exposure to other languages can be much more difficult, especially when you're not fluent yet. It would be interesting to see how to approach it: ideally, questions and answers should be in the target language, but the questions have to be very simple.
As someone else mentioned, having audio would be very useful. At some point, you could consider a hands-free mode: it reads the question out loud, pauses a few seconds, then reads the answer.
This looks neat. If you’re going to add Duolingo-style features, please don’t add fill-in-the-blank or word matching to the question types, or at least make them optional. They are an incredibly frustrating waste of time on Duolingo: they take up a ton of time to solve and don’t actually improve comprehension. My biggest gripe with Duolingo is that half of the questions asked in a lesson are questions like these, which have the pretense of helping you learn but don’t actually deliver. I think if you instead came up with some very difficult question types that really challenged someone’s comprehension, it would be stickier than Duolingo (especially for the HN crowd, who are actually trying to learn and are not just here to “play a game” like a large portion of the Duolingo audience).
Out of curiosity, do you have any citations on how those exercises don't enable learning?
On the latter part, there used to be a hard mode, at least in the browser version, where you could have it force you to hand-type every word. I always really liked that, but then they got rid of it. Of course, with the heart system these days, I wouldn't last 5 minutes if I tried to do it that way, so such is life I suppose.
Thank you! I am very interested in this project and want to keep working on it, hopefully getting help from open source contributors.
I actually had this idea of using Duolingo-style exercises, but now with your comment, I realize some might not be appropriate for individual learners with different goals.
The cool thing would be to have customizable exercise types, where users can choose which ones they want and which ones they don't want!
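A rough sketch of what that could look like, not how the app currently works: the exercise type names, the `Preferences` class, and the dict-based exercise format are all hypothetical and chosen only for illustration. Each user keeps a set of disabled types, and the exercise pool is filtered against it before a session starts.

```python
from dataclasses import dataclass, field

# Hypothetical exercise type names, for illustration only.
ALL_TYPES = {"multiple_choice", "fill_in_the_blank", "word_matching", "free_typing"}

@dataclass
class Preferences:
    """Per-user settings: every type is enabled unless explicitly disabled."""
    disabled: set[str] = field(default_factory=set)

    def enabled(self) -> set[str]:
        return ALL_TYPES - self.disabled

def pick_exercises(pool: list[dict], prefs: Preferences) -> list[dict]:
    """Keep only exercises whose type the user has not turned off."""
    allowed = prefs.enabled()
    return [ex for ex in pool if ex["type"] in allowed]

pool = [
    {"type": "multiple_choice", "word": "Zeitreise"},
    {"type": "fill_in_the_blank", "word": "Zeitreise"},
    {"type": "free_typing", "word": "Zeitreise"},
]
prefs = Preferences(disabled={"fill_in_the_blank", "word_matching"})
session = pick_exercises(pool, prefs)
print([ex["type"] for ex in session])
```

Storing the disabled set rather than the enabled one means newly added exercise types show up for everyone by default.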
I will add this to the roadmap in the README, pointing out this comment! Thanks again!
Thank you for your contribution to the FOSS learning space.
Here are a few random suggestions:
- Spaced repetition. Again, Anki style.
- Audio. Can you make it easy to record a phrase, Anki style? Or maybe even make AI pronounce them correctly?
I would like something like that.
Thank you so much! I will definitely add those ideas to the roadmap in the README (pointing out this comment).
I believe the spaced repetition feature must be prioritized because that's the most important thing in this app. I mean, what's the purpose of seeing the words over and over again if I already have confidence with them?
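To make that concrete, here is a minimal sketch of one simple spaced repetition scheme, a Leitner-style box system. This is not Anki's actual algorithm, and the `Card` class and the interval values are assumptions made for illustration. A correct answer moves a word to the next box with a longer interval; a wrong answer sends it back to the first box.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumed review intervals in days for boxes 0..4; the values are illustrative.
INTERVALS = [1, 2, 4, 8, 16]

@dataclass
class Card:
    word: str
    due: date
    box: int = 0

def review(card: Card, correct: bool, today: date) -> None:
    """Promote the card on a correct answer, reset it to box 0 otherwise."""
    card.box = min(card.box + 1, len(INTERVALS) - 1) if correct else 0
    card.due = today + timedelta(days=INTERVALS[card.box])

def due_cards(cards: list[Card], today: date) -> list[Card]:
    """Cards whose scheduled review date has arrived."""
    return [c for c in cards if c.due <= today]

today = date(2025, 1, 1)
cards = [Card("Zeitreise", due=today), Card("flux", due=today)]
review(cards[0], correct=True, today=today)   # moves to box 1, due in 2 days
review(cards[1], correct=False, today=today)  # stays in box 0, due tomorrow
print([c.word for c in due_cards(cards, today + timedelta(days=1))])
```

Words I already know well drift into the later boxes and show up rarely, which addresses exactly the problem of seeing confident words over and over.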
For the pronunciation feature, I had similar work before and there are great open source tools and libraries we can build upon that analyze your pronunciation and spot where you made mistakes. We can use open source TTS libraries to pronounce the correct version.
I also would definitely want to see audio questions in exercises similar to Duolingo, and it would be great to work on those features.
I am learning Turkish so I built something like that for me. You can highlight any word online and it will translate colloquially so you can actually use it irl.
It also has audio and pronunciation. It is around the halfway mark in the demo.
Apologies if this is answered in the README, but does this support languages other than English?
Hi! I actually forgot to mention this in the README, thank you for pointing it out.
The app would work for any language, but the definitions and exercises will be written in English. I created a list just now for German words and added the German word "Zeitreise". It generated this definition:
<<"Zeitreise" in a German mystery series means time travel. It refers to the act of a character or characters moving through time, either to the past or the future, often as a central element of the mystery's plot.>>
Exercises were asked in English.
"What does "Zeitreise" mean?":
- Time travel
- Train journey
- Long wait
- Difficult puzzle
Maybe a feature where you can choose the language would be cool. I mean, someone might prefer to learn German using German, or say Spanish using Turkish.
Again, thank you for pointing it out. I will update the README and hopefully add an inference-language preference feature.
Abi, slick design!
A feature where it supports TR -> EN and vice versa would be amazing!