Simple Sentence Similarity

Exploring the simple sentence similarity measurements using word embeddings

From TharinduDR · Updated July 31, 2025

We provide a collection of simple unsupervised semantic textual similarity (STS) methods for calculating the semantic similarity between two sentences. The project is written primarily in Python, is distributed under the Apache License 2.0, and was first published in 2018. Key topics include: elmo, fasttext, glove, ipynb, python.
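
To make the idea of unsupervised sentence similarity from word embeddings concrete, here is a minimal sketch of one common baseline: average the word vectors of each sentence and compare the averages with cosine similarity. It is an illustration only, not necessarily how this project implements its methods. The `TOY_EMBEDDINGS` table and the function names are hypothetical; in practice the vectors would come from a pretrained model such as GloVe or fastText.

```python
import math

# Hypothetical 3-dimensional toy vectors standing in for pretrained embeddings.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.0],
    "sits": [0.1, 0.9, 0.0],
    "rests": [0.2, 0.8, 0.0],
    "stock": [0.0, 0.1, 0.9],
    "falls": [0.1, 0.0, 0.8],
}


def sentence_vector(sentence, embeddings):
    """Average the vectors of in-vocabulary tokens; return None if none are known."""
    vectors = [embeddings[tok] for tok in sentence.lower().split() if tok in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sentence_similarity(s1, s2, embeddings=TOY_EMBEDDINGS):
    """Similarity of two sentences as the cosine of their averaged word vectors."""
    v1 = sentence_vector(s1, embeddings)
    v2 = sentence_vector(s2, embeddings)
    if v1 is None or v2 is None:
        return 0.0  # no known words in at least one sentence
    return cosine_similarity(v1, v2)


if __name__ == "__main__":
    print(sentence_similarity("cat sits", "kitten rests"))
    print(sentence_similarity("cat sits", "stock falls"))
```

With these toy vectors, the first pair scores much higher than the second. Whitespace tokenisation and skipping unknown words are simplifying assumptions made for the sketch.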

Latest release: v2.3.0 (Removing Elmo Embeddings), August 20, 2024

References

If you find this code useful in your research, please consider citing:

@inproceedings{ranasinghe-etal-2019-enhancing,
    title = "Enhancing Unsupervised Sentence Similarity Methods with Deep Contextualised Word Representations",
    author = "Ranasinghe, Tharindu  and
      Orasan, Constantin  and
      Mitkov, Ruslan",
    booktitle = "Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)",
    month = sep,
    year = "2019",
    address = "Varna, Bulgaria",
    publisher = "INCOMA Ltd.",
    url = "https://www.aclweb.org/anthology/R19-1115",
    doi = "10.26615/978-954-452-056-4_115",
    pages = "994--1003",
    abstract = "Calculating Semantic Textual Similarity (STS) plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction. All modern state of the art STS methods rely on word embeddings one way or another. The recently introduced contextualised word embeddings have proved more effective than standard word embeddings in many natural language processing tasks. This paper evaluates the impact of several contextualised word embeddings on unsupervised STS methods and compares it with the existing supervised/unsupervised STS methods for different datasets in different languages and different domains",
}

This article is auto-generated from TharinduDR/Simple-Sentence-Similarity via the GitHub API. Last fetched: 6/13/2026