GitPedia

Russiannames

Russian names parsers, gender identification and processing tools

From datacoon · Updated June 16, 2026

`russiannames` is a Python 3 library for parsing Russian first names, surnames and middle names, identifying a person's gender from the full name, and detecting the format in which the name is written. It uses MongoDB as a backend to speed up name parsing. The project is written primarily in Python, is distributed under the BSD 3-Clause "New" or "Revised" License, and was first published in 2019. Key topics include: gender-detection, names-classification, names-frequent, russian-language, russian-specific.

Latest release: 0.2 (Preservation release), May 11, 2019

Russian Names

russiannames is a Python 3 library for parsing Russian first names, surnames and middle names, identifying a person's gender from the full name, and detecting the format in which the name is written. It uses MongoDB as a backend to speed up name parsing.

Documentation

Documentation is built automatically and is available at
https://russiannames.readthedocs.org/en/latest/

Installation

To install the Python library, run `pip install russiannames`, or run `python setup.py install` from a source checkout.

To use the database you need a running MongoDB instance.
Download and unpack the db_dump_bson.zip file from https://github.com/datacoon/russiannames/blob/master/data/bson/db_dump_bson.zip

Then use the mongorestore command to restore the names database with its 3 collections: names, surnames and midnames.
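
After the restore it can help to confirm that all three collections are present. The following is a minimal sketch and not part of russiannames: the helper names `missing_collections` and `check_local_mongo`, the connection URI and the database name `names` are assumptions for illustration. Replace the URI and database name with the values that match your MongoDB instance and the restored dump.

```python
EXPECTED_COLLECTIONS = ("names", "surnames", "midnames")


def missing_collections(db, expected=EXPECTED_COLLECTIONS):
    """Return the expected collection names that are absent from `db`.

    `db` is any object exposing list_collection_names(), such as a
    pymongo Database.
    """
    present = set(db.list_collection_names())
    return [name for name in expected if name not in present]


def check_local_mongo(uri="mongodb://localhost:27017", db_name="names"):
    """Connect to MongoDB and report which expected collections are missing.

    Assumption: `uri` and `db_name` are placeholders, not values
    documented by the library.
    """
    # Imported here so missing_collections() works without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    return missing_collections(client[db_name])
```

An empty list from `check_local_mongo()` means the three collections were found; any names it returns are still missing from that database.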

Features

Database of names used for identification

  • 375449 surnames - collection: surnames
  • 32134 first names - collection: names
  • 48274 midnames - collection: midnames

Detailed database statistics by gender and collection

| collection | total | males | females | universal or unidentified |
| names | 32134 | 19297 | 8278 | 1196 |
| midnames | 48274 | 30114 | 16143 | 0 |
| surnames | 375274 | 124662 | 111534 | 38827 |

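Supports 13 writing formats for Russian full names. In the format codes, s, f and m stand for surname, first name and middle name; a lowercase letter marks a part written in full and an uppercase letter marks an initial.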

| Format | Example | Description |
| f | Ольга | only first name |
| s | Петров | only surname |
| Fs | О. Сидорова | first letter of first name and full surname |
| sF | Николаев С. | full surname and first letter of first name |
| sf | Абрамов Семен | full surname and full first name |
| fs | Соня Камиуллина | full first name and full surname |
| fm | Иван Петрович | full first name and full middle name |
| SFM | М.Д.М. | first letters of surname, first name and middle name |
| FMs | А.Н. Егорова | first letters of first and middle names and full surname |
| sFM | Николаенко С.П. | full surname and first letters of first and middle names |
| sfM | Петракова Зинаида М. | full surname, full first name and first letter of middle name |
| sfm | Казаков Ринат Артурович | full name as surname, first name and middle name |
| fms | Светлана Архиповна Волкова | full name as first name, middle name and surname |
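
The format codes follow a simple pattern: one letter per name part in written order (s, f, m), lowercase when the part is written in full and uppercase when only an initial is given. The sketch below illustrates that notation only. The function `format_code` is hypothetical and is not how the library detects formats, since the library must also decide which token is a surname, first name or middle name.

```python
def format_code(parts):
    """Build a format code from (kind, text) pairs given in written order.

    kind is 's' (surname), 'f' (first name) or 'm' (middle name).
    A part counts as an initial when it is a single letter,
    optionally followed by a dot.
    """
    code = []
    for kind, text in parts:
        if kind not in ("s", "f", "m"):
            raise ValueError(f"unknown part kind: {kind!r}")
        is_initial = len(text.rstrip(".")) == 1
        code.append(kind.upper() if is_initial else kind)
    return "".join(code)


# Examples taken from the format table.
print(format_code([("s", "Петракова"), ("f", "Зинаида"), ("m", "М.")]))
print(format_code([("f", "О."), ("s", "Сидорова")]))
```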

Supports ethnic identification of names

9 ethnic types are supported across first names, surnames and middle names

| key | name (en) | name (rus) |
| arab | Arabic | Арабское |
| arm | Armenian | Армянское |
| geor | Georgian | Грузинское |
| germ | German | Немецкие |
| greek | Greek | Греческие |
| jew | Jewish | Еврейские |
| polsk | Polish | Польские |
| slav | Slavic (Russian) | Славянские |
| tur | Turkic | Тюркские (тюркоязычные) |

Limitations

  • very rare names, surnames or middle names may fail to parse
  • ethnic identification is still at an early stage

Speed optimization

  • uses preconfigured and preindexed MongoDB collections

Usage and Examples

Parse name and identify gender

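Parses a full name and returns a dict with the detected format, surname (sn), first name (fn), middle name (mn), gender, the original text, and a parsed flag (True/False). When only an initial is given, the key carries an _s suffix (fn_s, mn_s).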

>>> from russiannames.parser import NamesParser
>>> parser = NamesParser()
>>> parser.parse('Нигматуллин Ринат Ахметович')
{'format': 'sfm', 'sn': 'Нигматуллин', 'fn': 'Ринат', 'mn': 'Ахметович', 'gender': 'm', 'text': 'Нигматуллин Ринат Ахметович', 'parsed': True}
>>> parser.parse('Петрова C.Я.')
{'format': 'sFM', 'sn': 'Петрова', 'fn_s': 'C', 'mn_s': 'Я', 'gender': 'f', 'text': 'Петрова C.Я.', 'parsed': True}
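
Because the result uses different keys for full parts (fn, mn) and initials (fn_s, mn_s), downstream code usually has to handle both. The following is a minimal sketch of that handling, not part of the library. The helper `short_name` is hypothetical, it relies only on the keys visible in the two example results, and it assumes an unparsed result still carries the text and parsed keys.

```python
def short_name(result):
    """Render a parse result as 'Surname F.M.' from whatever parts it has."""
    if not result.get("parsed"):
        # Fall back to the original text when parsing failed.
        return result.get("text", "")
    # Prefer explicit initials; otherwise take the first letter of the full part.
    first = result.get("fn_s") or result.get("fn", "")[:1]
    middle = result.get("mn_s") or result.get("mn", "")[:1]
    initials = "".join(f"{letter}." for letter in (first, middle) if letter)
    surname = result.get("sn", "")
    return " ".join(part for part in (surname, initials) if part)


# Result copied from the example output.
full = {'format': 'sfm', 'sn': 'Нигматуллин', 'fn': 'Ринат', 'mn': 'Ахметович',
        'gender': 'm', 'text': 'Нигматуллин Ринат Ахметович', 'parsed': True}
print(short_name(full))
```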

The gender field can have one of the following values:

  • m: Male
  • f: Female
  • u: Unknown / unidentified
  • -: Impossible to identify
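
When results are shown to users or written to reports, the one-character codes are usually translated into labels. A small sketch of such a lookup follows. The names `GENDER_LABELS` and `gender_label` are illustrative and not provided by the library, and the fallback for unexpected codes is a design choice of this example.

```python
# Labels taken from the list of gender values above.
GENDER_LABELS = {
    "m": "male",
    "f": "female",
    "u": "unknown / unidentified",
    "-": "impossible to identify",
}


def gender_label(result):
    """Return a readable label for the gender code in a parse result."""
    code = result.get("gender")
    # Report unexpected codes explicitly instead of raising KeyError.
    return GENDER_LABELS.get(code, f"unexpected code: {code!r}")


print(gender_label({"gender": "f"}))
```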

Ethnic identification (experimental)

Takes a surname, first name and middle name and tries to identify the person's ethnic affiliation

>>> from russiannames.parser import NamesParser
>>> parser = NamesParser()
>>> parser.classify('Нигматуллин', 'Ринат', 'Ахметович')
{'ethnics': ['tur'], 'gender': 'm'}
>>> parser.classify('Алексеева', 'Ольга', 'Ивановна')
{'ethnics': ['slav'], 'gender': 'f'}
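
Since classify returns a list under the ethnics key, results for many people can be aggregated by counting keys. The sketch below shows one way to do that with the two example results. The helper `tally_ethnics` is hypothetical, and the label mapping simply repeats the English names from the ethnic types table.

```python
from collections import Counter

# English labels copied from the ethnic types table.
ETHNIC_NAMES = {
    "arab": "Arabic", "arm": "Armenian", "geor": "Georgian",
    "germ": "German", "greek": "Greek", "jew": "Jewish",
    "polsk": "Polish", "slav": "Slavic (Russian)", "tur": "Turkic",
}


def tally_ethnics(classifications):
    """Count how often each ethnic key occurs across classify() results."""
    counts = Counter()
    for item in classifications:
        # A result may list several keys, or none at all.
        counts.update(item.get("ethnics", []))
    return counts


# Results copied from the example output.
results = [
    {'ethnics': ['tur'], 'gender': 'm'},
    {'ethnics': ['slav'], 'gender': 'f'},
]
for key, count in tally_ethnics(results).items():
    print(ETHNIC_NAMES.get(key, key), count)
```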

Supported languages

  • Russian

Requirements

  • pymongo
  • click


This article is auto-generated from datacoon/russiannames via the GitHub API. Last fetched: 6/28/2026