What is the difference between a traditional (online) dictionary and a concordancer?
A traditional dictionary lists all the meanings a word has or has had in a language. Its great advantage is completeness, but unfortunately, dictionaries often provide only a very limited number of usage examples, and for some meanings, no examples at all. This is a major drawback for language learners, who need plenty of input to make progress and assimilate new words effectively.
A concordancer is a computer program that searches a large text corpus (e.g., all newspaper articles from the past 10 years) and returns all instances of a search term. For example, if you enter “par exemple” in a concordancer, it will find all sentences containing that phrase. While this can be very useful, there are some drawbacks: the number of results can be huge—sometimes in the thousands depending on the corpus size—and the quality of examples is often inconsistent. Additionally, the different uses of a lemma are not always clearly illustrated, so the user has to guess what the differences are between examples, if any.
Lexogoth attempts to combine these two approaches to the lexicon in the following way:
- We include only the relevant (more common) meanings of a lemma and leave out outdated ones.
- We provide clear, relatively short, and study-friendly examples of correct usage for each lemma, construction, or expression.
- For lemmas that may cause problems for learners, we add extra explanations about usage—such as the language registers involved or grammatical peculiarities—to ensure the learner can confidently use the word appropriately.
- We also pay careful attention to the possible combinations (collocations) of a lemma with other words, giving learners expressive freedom.
This combination means Lexogoth is neither just a dictionary nor just a concordancer. Users can treat it as an information source (examples of correct usage) or as a study tool (selected examples for learning). Moreover, in some cases, we add extra commentary to clarify the content of example sentences.
Why should I use this program instead of just DeepL (Linguee), ChatGPT, or another AI tool like Reverso?
The tools mentioned above are very useful for someone who already has a good command of the language (or none at all) and simply wants to quickly draft or translate documents without spending too much time. Students can also benefit from them to save time or get help with assignments. However, these tools are of little value to people who are learning a language and need to use it actively. By systematically relying on them during exercises, students switch off their own critical thinking about the language and build no real knowledge of it. They are simply given a solution, which they adopt uncritically. Moreover, these programs do not explicitly distinguish between different language registers, nor do they explain why certain constructions or word choices are made. With Lexogoth, the learner is required to critically examine the provided examples and select those that best match their query. This approach offers many advantages: learners often read the same examples repeatedly, which helps them significantly expand their lexical knowledge through reading. Additionally, the numerous explanatory notes provide essential knowledge about word usage, an absolute prerequisite for becoming a competent user of a foreign language. These aspects are not offered by translation websites like Reverso or DeepL, which sometimes even generate incorrect translations for certain inputs. Furthermore, the AI tools mentioned are mainly trained on written corpora and therefore primarily produce written language. It is no surprise that they struggle with informal usage, which is mainly spoken and to which Lexogoth pays great attention. Unlike ChatGPT or other AI tools, Lexogoth is not a productivity tool that generates or translates texts and suggests approximate solutions.
The program is a true learning tool that, for every search term, provides one or more common examples verified in numerous reference works and a corpus, enabling learners to form an accurate understanding of how a word is actually used in the target language. Moreover, Lexogoth regularly offers clarifying comments on the usage and meaning of certain words and constructions. Such elements, largely absent from AI tools, are in our view indispensable for acquiring a solid knowledge of a foreign language.
What is the role of Lexogoth in FLE?
Lexogoth acts as a bridge between corpus linguistics and the FLE classroom. Unlike traditional textbooks that often use artificial examples, Lexogoth provides students with authentic, context-rich language data. It has three key functions:
- Data-Driven Learning: It encourages students to act as “language detectives.” Instead of just memorizing rules, they discover how French is actually used in various registers and contexts by exploring over 40,000 authentic entries.
- Focus on Lexical Chunks: Lexogoth emphasizes collocations, that is, the way words naturally stick together. This is crucial for FLE learners who want to achieve native-like fluency and move beyond literal translations.
- Scaffolded Autonomy: With the Lexical Toolbox, Lexogoth moves from a reference tool to an active learning environment. It allows teachers to create differentiated materials and enables students to build their own personalized learning paths based on the vocabulary they encounter.
How does Lexogoth differ from other (online) French learning programs like Duolingo, Babbel, and Rosetta Stone?
Duolingo, Babbel, Rosetta Stone, and other popular language apps are primarily designed for beginners, offering language instruction through playful, accessible exercises like multiple-choice questions or short fill-in-the-blank tasks. Such an approach is useful for learners who have little to no prior knowledge of the target language and want to quickly pick up some basic understanding of the new language (e.g., vocabulary, grammar).
However, we believe that these apps don’t offer the necessary depth and accuracy to truly learn a new language. Many of the examples, especially in Duolingo, feel artificial and do not always reflect natural spoken or written French. This is likely due to the widespread use of AI-generated content, where almost no attention is paid to nuance, language registers, or communicative context.
Additionally, most popular language apps offer only fixed learning paths, leaving little room for learner autonomy and deeper exploration. Many language apps also rely on subscription models with automatically renewing fees, all too often accompanied by overly optimistic marketing claims promising fluency within a very short time frame, something we believe is not achievable with such a superficial input.
Lexogoth takes a different approach:
- Lexogoth is designed for learners beyond the beginner level, as well as for teachers who need authentic, high-quality materials for advanced instruction.
- Lexogoth includes over 20,000 French chunks (short segments of language), each with an English translation and detailed notes on grammar and register.
- The chunks and sentences included in Lexogoth are (almost) all drawn from authentic sources (spoken and written language), allowing users to employ them confidently in real-world communication.
- Unlike most apps, Lexogoth does not lock users into a fixed curriculum. Users are free to explore and review whatever topic or structure they choose, at their own pace and depth.
- Lexogoth also provides over 100 downloadable PDF lessons, offering clear, structured coverage of French grammar, vocabulary, and language functions, spanning the entire A level and parts of B1.
In summary, where numerous language apps promise quick results with superficial exercises, Lexogoth offers a remarkably reliable, context-rich and rigorous learning environment for those who truly want to build an excellent command of the French language.
Is Lexogoth a full-fledged teaching method?
No, Lexogoth was designed to be used alongside existing teaching methods and should not be seen as a fully standalone course. The program does not contain exercises, and the knowledge offered through the lessons (grammar, vocabulary, language functions) is intended only to allow students (levels A1-A2) to independently review or refresh certain points. Moreover, the lexical database, the main component of Lexogoth, is not organized by progressive level of difficulty; the intention is for students to consult it when studying vocabulary and specific grammar topics (e.g., the difference between “à condition de” and “à condition que,” the mood after “le fait que,” etc.) so that the acquisition of vocabulary and fixed expressions (les locutions) can be more efficient and thorough. Such support is usually very limited in the current range of digital and non-digital methods.
Which language registers are used? What is their importance?
Sometimes, following the reference works we consulted, we indicate to which language register or language level a particular word belongs. We distinguish the following registers: familiar language, more elegant language, standard language, vulgar language, and colloquial language. Most of the words and expressions in our lexicon belong to the common standard language (le registre courant), meaning they are used in everyday language in almost all situations. Familiar language (le registre familier) is the language of casual speech, usually avoided in more formal or serious situations; in English, this is often called informal language. The more elegant register (le registre littéraire / soutenu), or formal language, is characterized by sophisticated vocabulary and more complex, refined expressions. Then there is the colloquial register (le registre populaire), often used as a label for the language of socially disadvantaged environments. Vulgar language (le registre vulgaire) needs no explanation: it has very limited applicability and should be used cautiously, since some conversation partners are more easily offended by vulgar terms than others. We mention these registers to give language learners some guidance when processing and using words and structures in a foreign language; when learning a foreign language, you operate without the internalized norms of your mother tongue, and mistakes can happen. Be honest: what would you think if someone on the train asked you in broken English whether “he might be permitted to assume the seat beside you”?
If we do not specify language registers for a word, you can assume that the word belongs to the standard language. When addressing the different meanings, we have paid special attention to familiar (informal) language use. More information about language registers can be found further on in the FAQ (section on varieties of French).
Varieties of French: language levels, Belgicisms, and Anglicisms.
In Lexogoth, we pay due attention to the different varieties of French. For example, we usually indicate to which language register a word or construction belongs. In Lexogoth, we distinguish four broad language registers for the words and constructions treated:
– The standard or common language: this is the language used in daily life at work, at school, and in the media.
– The more refined, polished, and literary language: vocabulary and constructions that require considerable lexical knowledge and that mainly belong to written language use.
– The familiar language: this is informal language commonly used among speakers who know each other well; primarily oral language.
– The popular language (registre populaire): This is very informal language that exists almost exclusively orally, characterized by casual pronunciation, little lexical and syntactic variation, simple constructions, use of coarse (vulgar) words, use of verlan and Anglicisms, etc. For these reasons, the popular register is considered the lowest language level. For most foreign language learners, this variety has only limited communicative usability. We advise users to handle this register with great care and not to use it lightly without first fully understanding the communicative context.
For more explanation about the characteristics of the different language registers, we refer to the document available through the grammar menu in Lexogoth: Extension 12: les registres de langue.
We also always alert users when they encounter a Belgicism. A Belgicism is an expression mainly used by Belgian French speakers and scarcely or not at all used outside Belgium; in other words, Belgicisms are not considered part of the general standard language. Many of these Belgicisms originate from the regional variants of Belgian French, from Dutch—specifically Flemish or Standard Dutch—or from Walloon, a sister language (with many variants) of French historically spoken in southern Belgium and definitively replaced by French after WWI. The best-known Belgicisms are undoubtedly nonante, septante, and the excessive use of savoir. The reader must judge, according to the communicative goal, whether it is appropriate to use a particular Belgicism or not. Most (lexical) Belgicisms can easily be found in Lexogoth by entering the search term “belgicisme.” In Lexogoth, we focus almost exclusively on French Belgicisms.
French is not immune to the influence of English, and we therefore give proper attention to Anglicisms. Anglicisms are constructions and words directly borrowed from English (usually American English) into French. The reader can easily find these by typing “anglicisme” into the search field (language option: French).
How do you decide which words are added to Lexogoth?
The Lexogoth corpus consists of two parts: the first layer (about one-fifth of the total number of examples) comprises the lexical items that a student is expected to know for the CEFR levels A1–A2. The second layer of examples (four-fifths of the total number of examples) consists of items we regularly encountered in:
– French-language media, especially Belgian and French news websites (examples of standard written language)
– French-language internet forums and blogs (examples of spoken language and informal usage)
– government documents: informational campaigns from ministries, reports from parliament (Belgian), ministry websites
– written press (Charlie Hebdo, Le Monde, Le Soir, La Libre Belgique)
The frequently occurring lemmas and the sentences in which they were used were then verified in at least three reference works from the following list:
– Dictionnaire Le Petit Robert, Dictionnaire de l’Académie française, Dictionnaire Larousse
– The open-source dictionaries La langue française and Wiktionnaire
If the meaning could be found in the above works, we consulted the concordancer SketchEngine to get an idea of the frequency of use. If the word was sufficiently frequent, we added one or more examples to our corpus. If the word appeared rarely or not at all in the concordancer, we usually chose not to include it in the corpus. Sometimes a word’s meaning could not be found in any of the above reference works, and in those cases, we only added the word provided there were enough user examples found on internet forums or in the concordancer.
Currently, Lexogoth contains over 40,000 clickable items, and we are now working on filling the gaps that (certainly) still exist in the work: we review the lists present in Lexogoth and compare them with dictionary word lists. Words that were not yet included but have common usage are then added based on clear user examples. This process will definitely take several years; as of now, the letter “d” is fully completed, and we are working on the letter “e” (December 2025).
The reformed spelling
In our lexicon, we have tried to take the 1990 reformed spelling into account as much as possible and to systematically add it to the corpus. For most words, we used the traditional spelling and then provided the updated spelling at the end of the sentence. You can recognize this by the addition ORT.
In principle, both spellings of a word are included in the search terms; however, there are still some words for which we have not yet been able to do this systematically. Therefore, we advise users to search for words with a “circumflex accent” (yes, that little hat) both with and without the accent to ensure they get all relevant results. For example, if you want to look up « croître », it is best to search for both « croître » and « croitre ». Keep in mind, however, that the pre-reform spelling should not be considered outdated or incorrect: it remains valid for an “indefinite period” and is fully equivalent to the renewed spelling. Just to make things a bit more complicated…
If I click on ‘fait’ in the word list, I also get all the search results for the verb ‘faire.’ Why is that, and how can I avoid it?
The Lexogoth program is based on about 2,500 pages of text, combining English and French. When the program starts, it processes these texts and the program’s tokenizer-lemmatizer assigns all possible tags applicable to a word. For example, the word “fait” will be linked to both “noun” (un fait) and “verb” (faire). For this reason, when you click on “fait” in the word list, you will see results for faire as well as for un fait (and also phrases like en fait, au fait, etc.).
For words that are not very frequent, it is not a problem to read through the few examples to find the desired form, but for very frequent words, this can be an issue. To help the user with this, we have manually added tags so that it is easier to find the desired category of a word with multiple tags. For example, for fait/faire we added the following tags: fait[nom], faire[verb], fait[en_fait], …. This way, the user can more easily navigate through all the examples. Make sure to use the visual search methods to take advantage of this feature.
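The tagging mechanism described above can be sketched in a few lines. This is a hypothetical illustration, not Lexogoth's actual implementation: the `ANALYSES` table and the `results_for` function are invented for this example, using the fait/faire case from the text.

```python
# Hypothetical sketch of multi-tag lookup: one surface form ("fait") maps
# to several analyses, and a manual tag narrows the results to one category.

# Lemmatizer output: each surface form maps to every plausible analysis.
ANALYSES = {
    "fait": [
        {"lemma": "fait", "tag": "fait[nom]"},      # un fait (noun)
        {"lemma": "faire", "tag": "faire[verb]"},   # il fait (verb)
        {"lemma": "fait", "tag": "fait[en_fait]"},  # en fait (phrase)
    ],
}

def results_for(word, tag=None):
    """Return all analyses for a word, optionally filtered by a manual tag."""
    hits = ANALYSES.get(word, [])
    if tag is not None:
        hits = [h for h in hits if h["tag"] == tag]
    return hits

# Clicking "fait" with no tag returns every reading; a manual tag such as
# fait[nom] keeps only the noun examples.
print(len(results_for("fait")))                          # all analyses
print(results_for("fait", tag="fait[nom]")[0]["lemma"])  # noun reading only
```

In this sketch, an untagged click surfaces every reading of the form, while supplying one of the manually added tags filters the example list down to a single category, which is exactly the navigation aid described above.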
How do I search for expressions consisting of multiple words? What is the meaning of the “underscore” in words? And why is “parce que” listed as “parce_que” in the word list, but “pour que” appears as “pour[pour_que]”?
The underscore was added internally to indicate that the words belong together and form a single unit. For example, ‘parce_que’ is treated as one word, whereas the computer sees ‘parce que’ as two separate words. If you look in the dictionary, you will easily find the sequence ‘parce_que’, but not ‘pour_que’. Why is that? To avoid making the dictionary cluttered and hard to read by systematically adding every fixed expression as one string (e.g., à_bride_abattue, and all expressions starting with the preposition à), we chose to index compound expressions and words under the meaningful words that form the expression. In the example above, you will find the expression under the lemma ‘bride’ as bride[à_bride_abattue] and under ‘abattre’ as abattre[à_bride_abattue], but not under the meaningless preposition ‘à’. But why, then, don’t we find ‘parce que’ under parce[parce_que]? Because ‘parce’ is not a valid or existing lemma in French, so the whole string is added directly. This also often applies to Anglicisms. In summary: for compound words, it is best to search visually (dictionary or autocomplete) using the core words that make up the expression or compound word.
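The indexing rule just described can be summarized in a short sketch. This is a toy illustration under assumed data, not Lexogoth's real code: the `FUNCTION_WORDS` stop list, the `VALID_LEMMAS` lexicon, and the `index_expression` function are invented for this example.

```python
# Hypothetical sketch: a fixed expression is stored as one underscore-joined
# string and indexed under its meaningful (content) lemmas, never under
# function words such as the preposition "à". If no component is a valid
# lemma on its own (like "parce"), the whole string becomes the entry.

FUNCTION_WORDS = {"à", "de", "que"}           # assumed stop list
VALID_LEMMAS = {"bride", "abattre", "pour"}   # toy lexicon

def index_expression(words, lemmas):
    """Return the dictionary keys under which an expression is filed."""
    joined = "_".join(words)
    usable = [l for l in lemmas if l in VALID_LEMMAS and l not in FUNCTION_WORDS]
    keys = [f"{l}[{joined}]" for l in usable]
    if not keys:  # no usable content lemma: file the joined string itself
        keys = [joined]
    return keys

print(index_expression(["à", "bride", "abattue"], ["à", "bride", "abattre"]))
# -> ['bride[à_bride_abattue]', 'abattre[à_bride_abattue]']
print(index_expression(["parce", "que"], ["parce", "que"]))
# -> ['parce_que']  ("parce" is not a lemma, "que" is a function word)
print(index_expression(["pour", "que"], ["pour", "que"]))
# -> ['pour[pour_que]']
```

The sketch reproduces the three cases from the text: à bride abattue is filed under bride and abattre, pour que under pour, and parce que directly as parce_que.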
Why can’t I simply find all compound words like “pomme de terre” or “film d’horreur”?
In French, compound words are most often formed with prepositions, which means that one “word” often consists of several words separated by spaces. For a text-processing program, this is naturally a nightmare. How do we distinguish between a true word (such as pomme de terre) and a coincidental combination? It is not always clear where to draw the line in recognizing a sequence of letters (or sounds) as a word: in English, for example, we say potato (one continuous sequence of letters), whereas French needs three “words” to name the same object (the potato), namely pomme de terre. Now suppose we treat pomme de terre as one word (because, for example, native speakers no longer analyze this term as the sum of three separate words, literally “an apple from/of the ground,” but as a whole). What then do we do with words like film d’action or film d’aventures, for which the situation is less clear? Should these also be given the status of “word” because they have the same form as the pomme de terre example, even though they can still be analyzed based on their components? Intuitions on this will likely vary, and this is naturally reflected in the search results. We had to decide which word groups form a unit and which do not: in the example above, we chose to treat film d’action as three separate words, while pomme de terre is considered one word.
What do the abbreviations SYN, LOC, EXP, CONT, ORT, and PROV mean?
– SYN(A : B, C, D,…): This means that word A has words B, C, D as synonyms. Note that sometimes this relation applies only to one meaning of the word!
– LOC(): This abbreviation stands for locution (adjectival, adverbial, verbal, etc.) and indicates fixed word groups that frequently occur in the language. Word groups such as ‘à cause de,’ ‘pour que,’ ‘demander pardon’ fall under this somewhat vague definition. Under LOC, we also include all reflexive verbs and sometimes verbs that unexpectedly take a fixed preposition. For Dutch, LOC also contains separable verb forms (e.g., doorlopen, rondkomen). For example, in the sentence ‘Jan spreekt af met Marie’, the computer (and perhaps the student) may not automatically recognize the verb ‘afspreken’, which is why we add it under LOC.
– EXP(): This abbreviation stands for expression. The question naturally arises how expressions differ from the fixed word groups listed above as locutions. We consider a word group an expression if, at first glance, it is not clear how the sum of the individual meanings yields the overall meaning of the expression. The criteria for distinguishing locutions from expressions are quite complex, but for practical purposes we chose to make only a rudimentary distinction; after all, it matters little to a language learner whether something is a locution or an expression. In future updates, we will try to distinguish the two more systematically. In practice, if you want to find all fixed expressions with the word ‘main’, it is safest to search first for ‘LOC main’ and then for ‘EXP main’ and combine both results.
– CONT(A : B): This means that the antonym (opposite) of word A is word B. We added this relation only sporadically.
– ORT(A, B, C): This indicates the reformed spelling or an alternative spelling of a word. For example, a search result containing the word ‘connaît’ should also show, further on: (ORT: connait).
– PROV: This indicates a saying or proverb. Proverbs will only be added at a later stage, as they have limited usefulness for language learners (and users in general).
Why doesn’t Lexogoth offer spell check, and why doesn’t it correct sentences?
This program does not have the necessary modules (a rule-based parser) to syntactically analyze a sentence. To achieve this, we would need to add a parser that can determine whether a sentence is correctly formed. Unfortunately, it doesn’t stop there: since we are in a learning context, this parsing module would also need to indicate exactly where the error is (a notoriously difficult and unsolved problem) and suggest ways to improve the sentence. That is a huge challenge and cannot be implemented quickly or easily.
Moreover, the program is not a spell checker, although it can be used somewhat for that purpose by tweaking the search queries. A spell checker performs a shallow parse of the text and flags possible errors based on the immediate context of surrounding words. For example, if you want to know whether you can say ‘une vieux femme’, you could enter the sequence ‘vieux femme’ and the program will return the tags for both words. You will quickly notice that these tags are incompatible: ‘masculine (singular or plural)’ for ‘vieux’ does not match ‘feminine singular’ for ‘femme’!
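The shallow agreement check described above can be sketched as follows. This is a toy illustration under assumed tag sets, not the program's real lexicon: the `TAGS` table and the `compatible` function are invented for this example.

```python
# Hypothetical sketch: compare the gender/number tags of two adjacent words
# and flag pairs that share no analysis, as in the 'vieux femme' example.

TAGS = {
    "vieux":   {("m", "sg"), ("m", "pl")},  # masculine, singular or plural
    "vieille": {("f", "sg")},               # feminine singular
    "femme":   {("f", "sg")},               # feminine singular
}

def compatible(adj, noun):
    """True if the adjective and noun share at least one gender/number tag."""
    return bool(TAGS[adj] & TAGS[noun])

print(compatible("vieux", "femme"))    # no shared tag: likely an error
print(compatible("vieille", "femme"))  # both feminine singular: fine
```

As in the text, the check does not parse the sentence; it only intersects the tag sets of neighboring words, which is enough to reveal that ‘vieux femme’ cannot agree while ‘vieille femme’ can.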
How do you update Lexogoth?
As mentioned above, Lexogoth is only available through the Microsoft Store. We commit to releasing updates for the program every three months. These updates mainly involve expanding the text corpus and fixing errors. The update process is as follows: Open the Microsoft Store app on your computer, then select Library, and you will see the option Update (Apps). Clicking this will update all apps. Please note that updating Lexogoth must always be done manually through the Microsoft Store app and does not happen automatically via regular Windows updates. Organizations that have made a group purchase will receive an update via email.
Comments regarding the installation of Lexogoth and loading times.
*Important: Starting from Lexogoth version 3.8 (current version is v3.11), users no longer need to have Microsoft Visual C++ Redistributable (at least version 14.40.x, x64 architecture) installed separately; we have included the necessary files in the installer. However, if you encounter error messages about missing files when opening Lexogoth, please visit the following link and download and install the Microsoft Visual C++ Redistributable (x64 architecture): https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170 (under the title Latest Microsoft Visual C++ Redistributable Version). After that, try installing Lexogoth again.
*If your Windows operating system is running in S mode (which is rare), the program may not work properly. A possible solution is to disable S mode or to download and install the program on another computer. This limitation is due to permission restrictions imposed by S mode on programs.
*It takes about eight to ten seconds for the program to start up; this loading time is caused by the large amount of data that needs to be loaded into memory (approximately six ‘lexicons’). However, once the program is started, each search query can be completed within a fraction of a second.
*If you notice that resizing the program window is slow or laggy, it is likely because there are already thousands of example sentences displayed in the results window without you realizing it. The easiest solution is to refresh the screen (using the refresh icon), after which resizing should work smoothly again.