Constructing Languages with a Multi-Hop LLM Pipeline
We introduce a fully automated system for creating constructed languages (conlangs) using large language models. Our multi-stage pipeline creates coherent, diverse artificial languages with their own phonology, grammar, lexicon, and translation capabilities.
Constructed languages (conlangs) such as Esperanto and Quenya have played diverse roles in art, philosophy, and international communication. Meanwhile, large-scale foundation models have revolutionized creative generation in text, images, and beyond. In this work, we use modern LLMs as computational creativity aids for end-to-end conlang creation. We introduce ConlangCrafter, a multi-hop pipeline that decomposes language design into modular stages: phonology, morphology, syntax, lexicon generation, and translation. At each stage, our method draws on LLMs' meta-linguistic reasoning capabilities, injecting randomness to encourage diversity and applying self-refinement feedback to encourage consistency in the emerging language description. We evaluate ConlangCrafter on metrics measuring coherence and typological diversity, demonstrating its ability to produce coherent and varied conlangs without human linguistic expertise.
ConlangCrafter decomposes language creation into modular stages, each with specialized prompting strategies and self-refinement loops for consistency.
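As a rough illustration of the staged design described above, the following Python sketch chains the five stages, adds a random tag to each generation prompt, and runs a small critique-and-revise loop per stage. It is a minimal sketch under stated assumptions, not the authors' implementation: the `llm` callable (prompt in, text out), the prompt wording, the `"OK"` stopping convention, and the names `run_stage` and `build_conlang` are all hypothetical.

```python
import random
from typing import Callable, Dict

# Stage names follow the text above; everything else is a hypothetical illustration.
STAGES = ["phonology", "morphology", "syntax", "lexicon", "translation"]

# Assumed interface: any function that maps a prompt string to a response string.
LLM = Callable[[str], str]


def run_stage(llm: LLM, stage: str, spec: Dict[str, str],
              rng: random.Random, max_refinements: int = 2) -> str:
    """Draft one stage, then critique and revise it against the language so far."""
    # Randomness injection (assumed form): a random tag in the prompt to vary outputs.
    seed_tag = rng.randint(0, 10**6)
    context = "\n".join(f"[{name}]\n{text}" for name, text in spec.items())
    draft = llm(
        f"Design the {stage} of a constructed language.\n"
        f"Random seed: {seed_tag}\n"
        f"Language so far:\n{context}"
    )
    # Self-refinement loop: ask for inconsistencies, revise until none are reported
    # or the refinement budget is used up.
    for _ in range(max_refinements):
        critique = llm(
            f"List inconsistencies between this {stage} draft and the language "
            f"so far, or reply OK.\nLanguage so far:\n{context}\nDraft:\n{draft}"
        )
        if critique.strip() == "OK":
            break
        draft = llm(
            f"Revise the {stage} draft to fix these issues.\n"
            f"Issues:\n{critique}\nDraft:\n{draft}"
        )
    return draft


def build_conlang(llm: LLM, seed: int = 0) -> Dict[str, str]:
    """Run all stages in order; each stage sees the output of the earlier ones."""
    rng = random.Random(seed)
    spec: Dict[str, str] = {}
    for stage in STAGES:
        spec[stage] = run_stage(llm, stage, spec, rng)
    return spec
```

Calling `build_conlang(my_llm, seed=42)` with any prompt-to-text function returns a dictionary keyed by stage name. Passing the accumulated description into every later prompt is one simple way to give the critique step something to check consistency against.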
Browse our collection of generated constructed languages. Click on any language to explore its phonology, grammar, lexicon, and sample translations.
@article{conlangcrafter2025,
  title={ConlangCrafter: Constructing Languages with a Multi-Hop LLM Pipeline},
  author={Morris Alper and Moran Yanuka and Raja Giryes and Ga{\v{s}}per Begu{\v{s}}},
  year={2025},
  eprint={2508.06094},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2508.06094},
}