GitPedia

Whatlang rs

Natural language detection library for Rust. Try demo online: https://whatlang.org/

From greyblake · Updated June 21, 2026

Natural language detection for Rust with a focus on simplicity and performance. The project is written primarily in Rust, distributed under the MIT License, and was first published in 2016. It has 1,078 stars and 119 forks on GitHub. Key topics include: ai, algorithm, classifier, detect-language, language.

Latest release: Whatlang v0.16.2 (October 23, 2022)

Stand With Ukraine

Features

  • Supports 70 languages
  • 100% written in Rust
  • Lightweight, fast and simple
  • Recognizes not only a language, but also a script (Latin, Cyrillic, etc.)
  • Provides reliability information

Get started

Example:

```rust
use whatlang::{detect, Lang, Script};

fn main() {
    let text = "Ĉu vi ne volas eklerni Esperanton? Bonvolu! Estas unu de la plej bonaj aferoj!";

    // `detect` returns an Option, so unwrap it to get the detection info.
    let info = detect(text).unwrap();
    assert_eq!(info.lang(), Lang::Epo);
    assert_eq!(info.script(), Script::Latin);
    assert_eq!(info.confidence(), 1.0);
    assert!(info.is_reliable());
}
```

For more details (e.g. how to blacklist some languages), please check the documentation at https://docs.rs/whatlang.

Who uses Whatlang?

Whatlang is used in the following large projects as a direct or indirect dependency for language recognition.
You'll be in great company using Whatlang:

  • Sonic - fast, lightweight and schema-less search backend in Rust.
  • Meilisearch - an open-source, easy-to-use, blazingly fast, and hyper-relevant search engine built in Rust.

Feature toggles

| Feature | Description |
|---|---|
| enum-map | Lang and Script implement the Enum trait from enum-map |
| arbitrary | Support Arbitrary |
| serde | Implements Serialize and Deserialize for Lang and Script |
| dev | Enables the whatlang::dev module, which provides some internal API. It exists for profiling purposes, and normal users are discouraged from relying on this API. |

How does it work?

How does the language recognition work?

The algorithm is based on trigram language models, which are a particular case of n-grams.
To understand the idea, please check the original paper by Cavnar and Trenkle (1994), "N-Gram-Based Text Categorization".
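
The idea from the Cavnar and Trenkle paper can be illustrated with a small, standard-library-only sketch. It is not Whatlang's actual implementation: the function names (`trigram_counts`, `ranked_profile`, `out_of_place_distance`) and the tiny inline "language profiles" are assumptions made for illustration, whereas a real detector would use profiles built from large corpora. The sketch ranks the trigrams of a text by frequency and compares that ranking against each language's ranking using the "out-of-place" distance.

```rust
use std::collections::HashMap;

/// Count character trigrams in `text` after lowercasing it and padding it
/// with one space on each side.
fn trigram_counts(text: &str) -> HashMap<String, usize> {
    let padded: Vec<char> = format!(" {} ", text.to_lowercase()).chars().collect();
    let mut counts = HashMap::new();
    for window in padded.windows(3) {
        let trigram: String = window.iter().collect();
        *counts.entry(trigram).or_insert(0) += 1;
    }
    counts
}

/// Trigrams ordered from most to least frequent. Ties are broken
/// alphabetically so the result is deterministic.
fn ranked_profile(text: &str) -> Vec<String> {
    let mut entries: Vec<(String, usize)> = trigram_counts(text).into_iter().collect();
    entries.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    entries.into_iter().map(|(trigram, _)| trigram).collect()
}

/// "Out-of-place" distance: for every trigram of the text, add how far its
/// rank is from its rank in the language profile. A trigram missing from the
/// language profile costs the maximum penalty (the profile length).
fn out_of_place_distance(text_profile: &[String], lang_profile: &[String]) -> usize {
    let max_penalty = lang_profile.len();
    let lang_ranks: HashMap<&str, usize> = lang_profile
        .iter()
        .enumerate()
        .map(|(rank, trigram)| (trigram.as_str(), rank))
        .collect();

    text_profile
        .iter()
        .enumerate()
        .map(|(rank, trigram)| match lang_ranks.get(trigram.as_str()) {
            Some(&lang_rank) => rank.abs_diff(lang_rank),
            None => max_penalty,
        })
        .sum()
}

fn main() {
    // Toy profiles built from single sentences (illustrative only).
    let english = ranked_profile("the quick brown fox jumps over the lazy dog and the cat");
    let spanish = ranked_profile("el rapido zorro marron salta sobre el perro perezoso y el gato");

    let sample = ranked_profile("the dog and the fox");
    println!("distance to English: {}", out_of_place_distance(&sample, &english));
    println!("distance to Spanish: {}", out_of_place_distance(&sample, &spanish));
}
```

The language whose profile gives the smallest distance is the best guess; in this toy run the English profile is closer to the sample than the Spanish one.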

How is is_reliable calculated?

It is based on the following factors:

  • How many unique trigrams are in the given text
  • How big the difference is between the first and the second (not returned) detected languages. This metric is called rate in the code base.

Therefore, it can be presented as a 2D space with a threshold function that splits it into "Reliable" and "Not reliable" areas.
The threshold function is a hyperbola and looks like this:

<img alt="Language recognition whatlang rust" src="https://raw.githubusercontent.com/greyblake/whatlang-rs/master/misc/images/whatlang_is_reliable.png" width="450" height="300" />
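
A minimal sketch of such a hyperbolic threshold is shown below. It only illustrates the shape of the decision and is not Whatlang's code: the constants `CURVE_SCALE` and `CURVE_OFFSET`, the function names, and the assumption that a higher score means a better match are all hypothetical.

```rust
/// Hypothetical constants chosen for illustration only; they are NOT taken
/// from the Whatlang code base.
const CURVE_SCALE: f64 = 12.0;
const CURVE_OFFSET: f64 = 0.05;

/// Relative gap between the best and the second-best language score.
/// Assumes higher scores are better (an assumption of this sketch).
fn rate(best_score: f64, second_score: f64) -> f64 {
    if second_score <= 0.0 {
        // No meaningful runner-up: treat the gap as unbounded.
        return f64::INFINITY;
    }
    (best_score - second_score) / second_score
}

/// Hyperbolic threshold: the fewer unique trigrams a text has, the larger
/// the gap between the top two languages must be.
fn required_rate(unique_trigrams: usize) -> f64 {
    CURVE_SCALE / unique_trigrams.max(1) as f64 + CURVE_OFFSET
}

/// A point (unique_trigrams, rate) is "reliable" if it lies above the curve.
fn is_reliable_sketch(unique_trigrams: usize, best_score: f64, second_score: f64) -> bool {
    rate(best_score, second_score) > required_rate(unique_trigrams)
}

fn main() {
    // Short text with a small gap: falls below the curve.
    println!("{}", is_reliable_sketch(10, 1.10, 1.00));
    // Long text with a clear gap: lies above the curve.
    println!("{}", is_reliable_sketch(200, 1.30, 1.00));
}
```

With these illustrative constants, a text with 10 unique trigrams needs a rate above 1.25, while a text with 200 unique trigrams needs only about 0.11.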

For more details, please see the blog article "Introduction to Rust Whatlang Library and Natural Language Identification Algorithms".

Make tasks

  • make bench - run performance benchmarks
  • make doc - generate and open doc
  • make test - run tests
  • make watch - watch changes and run tests

Comparison with alternatives

| | Whatlang | CLD2 | CLD3 |
|---|---|---|---|
| Implementation language | Rust | C++ | C++ |
| Languages | 70 | 83 | 107 |
| Algorithm | trigrams | quadgrams | neural network |
| Supported encoding | UTF-8 | UTF-8 | ? |
| HTML support | no | yes | ? |

Ports and clones

Donations

You can support the project by donating NEAR tokens.

Our NEAR wallet address is whatlang.near

Derivation

Whatlang is a derivative work of Franc (JavaScript, MIT) by Titus Wormer.

License

MIT © Sergey Potapov

This article is auto-generated from greyblake/whatlang-rs via the GitHub API. Last fetched: 6/24/2026