obss/jury

Comprehensive NLP Evaluation System

23 Releases

v2.3.1 (Latest)

devrimcavusoglu · May 20, 2024

📋 What's Changed

Update CI actions versions. by @devrimcavusoglu in https://github.com/obss/jury/pull/134
Update dev installation to allow for e.g. Zsh by @KennethEnevoldsen in https://github.com/obss/jury/pull/136
Update README.md by @devrimcavusoglu in https://github.com/obss/jury/pull/137

✨ New Contributors

@KennethEnevoldsen made their first contribution in https://github.com/obss/jury/pull/136
Full Changelog: https://github.com/obss/jury/compare/2.3...2.3.1

v2.3

devrimcavusoglu · October 8, 2023

📋 What's Changed

COMET version updated, and the corresponding changes made. by @devrimcavusoglu in https://github.com/obss/jury/pull/129
Update README.md by @eltociear in https://github.com/obss/jury/pull/130
Drop py3.7 support, change CI. by @devrimcavusoglu in https://github.com/obss/jury/pull/132
README.md updated. Jury paper added. by @devrimcavusoglu in https://github.com/obss/jury/pull/133

✨ New Contributors

@eltociear made their first contribution in https://github.com/obss/jury/pull/130
Full Changelog: https://github.com/obss/jury/compare/2.2.4...2.3

v2.2.4

devrimcavusoglu · June 15, 2023

📋 What's Changed

datasets dependency added with constraint. by @devrimcavusoglu in https://github.com/obss/jury/pull/126
Add try/catch block across ZeroDivisionError for AccuracyForLanguageGeneration._compute_single_pred_single_ref by @NISH1001 in https://github.com/obss/jury/pull/123
Package `evaluate` updated to 0.4 (from <0.3). by @devrimcavusoglu in https://github.com/obss/jury/pull/128

✨ New Contributors

@NISH1001 made their first contribution in https://github.com/obss/jury/pull/123
Full Changelog: https://github.com/obss/jury/compare/2.2.3...2.2.4

v2.2.3

devrimcavusoglu · December 26, 2022

📋 What's Changed

`flake8` error on python3.7 by @devrimcavusoglu in https://github.com/obss/jury/pull/118
Seqeval typo fix by @devrimcavusoglu in https://github.com/obss/jury/pull/117
Refactored requirements (sklearn). by @devrimcavusoglu in https://github.com/obss/jury/pull/121
Full Changelog: https://github.com/obss/jury/compare/2.2.2...2.2.3

v2.2.2

devrimcavusoglu · September 30, 2022

📋 What's Changed

Migrating to `evaluate` package (from `datasets`). by @devrimcavusoglu in https://github.com/obss/jury/pull/116
Full Changelog: https://github.com/obss/jury/compare/2.2.1...2.2.2

v2.2.1

devrimcavusoglu · September 21, 2022

📋 What's Changed

Fixed warning message in BLEURT default initialization by @zafercavdar in https://github.com/obss/jury/pull/110
`ZeroDivisionError` on precision and recall values. by @devrimcavusoglu in https://github.com/obss/jury/pull/112
validators added to the requirements. by @devrimcavusoglu in https://github.com/obss/jury/pull/113
Intermediate patch, fixes, updates. by @devrimcavusoglu in https://github.com/obss/jury/pull/114

✨ New Contributors

@zafercavdar made their first contribution in https://github.com/obss/jury/pull/110
Full Changelog: https://github.com/obss/jury/compare/2.2...2.2.1
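
The `ZeroDivisionError` entry above concerns precision and recall values. The changelog does not say how the fix was implemented; the following is only a minimal sketch of one common guard, assuming a simple whitespace-token overlap metric. The names `safe_divide` and `token_precision_recall` are hypothetical and are not part of jury.

```python
from collections import Counter
from typing import Tuple


def safe_divide(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Return numerator / denominator, or `default` when the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return default


def token_precision_recall(prediction: str, reference: str) -> Tuple[float, float]:
    """Token-overlap precision and recall for one prediction/reference pair.

    Hypothetical helper for illustration only; not jury's implementation.
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    # Multiset intersection: each shared token counts min(pred count, ref count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = safe_divide(overlap, len(pred_tokens))  # empty prediction gives 0.0
    recall = safe_divide(overlap, len(ref_tokens))      # empty reference gives 0.0
    return precision, recall


if __name__ == "__main__":
    print(token_precision_recall("", "a b"))
    print(token_precision_recall("a b c", "a b"))
```

Returning a default of 0.0 for an empty prediction or reference is a design choice of this sketch; raising a descriptive error would be an equally reasonable alternative.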

v2.2

devrimcavusoglu · March 29, 2022

📋 What's Changed

Fix Reference Structure for Basic BLEU calculation by @Sophylax in https://github.com/obss/jury/pull/74
Added BLEURT. by @devrimcavusoglu in https://github.com/obss/jury/pull/78
README.md updated with DOI badge and citation information. by @devrimcavusoglu in https://github.com/obss/jury/pull/81
Add VSCode Folder to Gitignore by @Sophylax in https://github.com/obss/jury/pull/82
Change one BERTScore test Device to CPU by @Sophylax in https://github.com/obss/jury/pull/84
Add Prism metric by @devrimcavusoglu in https://github.com/obss/jury/pull/79
Update issue templates by @devrimcavusoglu in https://github.com/obss/jury/pull/85
Dl manager rework by @devrimcavusoglu in https://github.com/obss/jury/pull/86
+ 13 more

✨ New Contributors

@Sophylax made their first contribution in https://github.com/obss/jury/pull/74
Full Changelog: https://github.com/obss/jury/compare/2.1.5...2.2

v2.1.5

devrimcavusoglu · December 23, 2021

📋 What's Changed

Bug fix: Typo corrected in _remove_empty() in core.py. by @devrimcavusoglu in https://github.com/obss/jury/pull/67
Metric name path bug fix. by @devrimcavusoglu in https://github.com/obss/jury/pull/69
Full Changelog: https://github.com/obss/jury/compare/2.1.4...2.1.5

v2.1.4

devrimcavusoglu · December 6, 2021

📋 What's Changed

Handle for empty predictions & references on Jury (skipping empty). by @devrimcavusoglu in https://github.com/obss/jury/pull/65
Full Changelog: https://github.com/obss/jury/compare/2.1.3...2.1.4

v2.1.3

devrimcavusoglu · December 1, 2021

📋 What's Changed

Bug fix: Bleu reshape error fixed. by @devrimcavusoglu in https://github.com/obss/jury/pull/63
Full Changelog: https://github.com/obss/jury/compare/2.1.2...2.1.3

v2.1.2

devrimcavusoglu · November 14, 2021

📋 What's Changed

Bug fix: bleu returning same score with different max_order is fixed. by @devrimcavusoglu in https://github.com/obss/jury/pull/59
nltk version upgraded as >=3.6.4 (from >=3.6.2). by @devrimcavusoglu in https://github.com/obss/jury/pull/61
Full Changelog: https://github.com/obss/jury/compare/2.1.1...2.1.2

v2.1.1

devrimcavusoglu · November 10, 2021

📋 What's Changed

Seqeval: json normalization added. by @devrimcavusoglu in https://github.com/obss/jury/pull/55
Read support from folders by @devrimcavusoglu in https://github.com/obss/jury/pull/57
Full Changelog: https://github.com/obss/jury/compare/2.1.0...2.1.1

v2.1.0

devrimcavusoglu · October 25, 2021

📦 AutoMetric ✨

AutoMetric is introduced as the main factory class for loading metrics automatically. `load_metric` remains available for backward compatibility and is still the preferred entry point (it uses AutoMetric under the hood).
Tasks are now distinguished within metrics. For example, precision can be used for the `language-generation` or the `sequence-classification` task: the former evaluates strings (generated text), while the latter evaluates integers (class labels).
In the configuration file, metrics can now be specified with the initialization parameters of HuggingFace `datasets` metrics. Keyword arguments used at computation time are now kept separately under the `"compute_kwargs"` key.
Full Changelog: https://github.com/obss/jury/compare/2.0.0...2.1.0
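
To illustrate the separation between initialization parameters and `"compute_kwargs"` described in this release, here is a minimal sketch. Only the `"compute_kwargs"` key comes from the notes above; the `"path"` and `"metrics"` keys, the `split_metric_spec` helper, and the overall config layout are assumptions made for illustration and may not match jury's actual configuration schema.

```python
from typing import Any, Dict, Tuple, Union

# A metric entry is either a bare name or a dictionary of parameters.
MetricSpec = Union[str, Dict[str, Any]]


def split_metric_spec(spec: MetricSpec) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one metric entry into (initialization parameters, compute-time kwargs).

    Hypothetical helper; the "path" key is an assumed name for the metric identifier.
    """
    if isinstance(spec, str):
        # A bare name carries no extra parameters.
        return {"path": spec}, {}
    init_params = dict(spec)  # copy so the caller's config is not mutated
    compute_kwargs = dict(init_params.pop("compute_kwargs", {}))
    return init_params, compute_kwargs


# Hypothetical configuration content, written as a Python dict for brevity.
config = {
    "metrics": [
        "meteor",
        {"path": "bleu", "compute_kwargs": {"max_order": 2}},
    ]
}

if __name__ == "__main__":
    for entry in config["metrics"]:
        init_params, compute_kwargs = split_metric_spec(entry)
        print(init_params, compute_kwargs)
```

Keeping the two groups apart means that parameters fixed when a metric is created are not mixed up with options that only affect a single computation.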

v2.0.0

devrimcavusoglu · October 11, 2021

✨ New Metric System

The `datasets` package's Metric implementation is adopted (and extended) to provide high performance 💯 and a more unified interface 🤗.
Custom metric implementation changed accordingly (it now requires three abstract methods to be implemented).
The Jury class is now callable (it implements the `__call__()` method, so an instance can be called directly), though the `evaluate()` method is still available for backward compatibility.
When using Jury's `evaluate`, the `predictions` and `references` parameters must be passed as keyword arguments to prevent confusion and wrong computations (as with `datasets` metrics).
MetricCollator is removed and its metric-handling methods are attached directly to the Jury class. Metrics can now be added to and removed from a Jury instance directly.
Jury now supports reading metrics from strings, lists and dictionaries, so it is more flexible about how metrics and their parameters are specified.
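
The callable interface and the keyword-only `predictions` and `references` parameters described above can be sketched as follows. This is a toy stand-in under stated assumptions, not the real Jury class: `ToyScorer` and its exact-match score are hypothetical and exist only to show the calling convention.

```python
from typing import Dict, List


class ToyScorer:
    """Toy stand-in (not the real Jury class) showing a callable scorer."""

    def __call__(self, *, predictions: List[str], references: List[str]) -> Dict[str, float]:
        # The bare `*` makes both parameters keyword-only, so their order
        # cannot be swapped by accident in a positional call.
        if len(predictions) != len(references):
            raise ValueError("predictions and references must have the same length")
        matches = sum(p == r for p, r in zip(predictions, references))
        score = matches / len(predictions) if predictions else 0.0
        return {"exact_match": score}

    def evaluate(self, *, predictions: List[str], references: List[str]) -> Dict[str, float]:
        # Kept as an alias so older call sites keep working.
        return self(predictions=predictions, references=references)


if __name__ == "__main__":
    scorer = ToyScorer()
    print(scorer(predictions=["a", "b"], references=["a", "c"]))
    try:
        scorer(["a"], ["a"])  # positional arguments are rejected
    except TypeError as error:
        print("TypeError:", error)
```

Swapping predictions and references silently changes asymmetric scores such as precision and recall, which is the kind of wrong computation the keyword-only restriction guards against.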

✨ New metrics

Accuracy, F1, Precision and Recall are added to Jury metrics.
All metrics in the `datasets` package are still available in jury through `jury.load_metric()`.

📦 Development

Test cases are improved with fixtures, and the test structure is enhanced.
Expected outputs are now required for tests as a JSON file with the proper name.

v1.1.2

devrimcavusoglu · September 15, 2021

📋 Changes

SQuAD bug fixed for evaluating with multiple references.
Test design & cases revised with fixtures (improvement).

v1.1.1

devrimcavusoglu · August 15, 2021

📋 Changes

Malfunctioning multiple prediction calculation caused by multiple reference input for BLEU and SacreBLEU is fixed.
CLI Implementation is completed. 🎉

v1.0.1

devrimcavusoglu · August 13, 2021

📋 Changes

Fix for nltk version (Colab is fixed as well).

v1.0.0

devrimcavusoglu · August 9, 2021

📦 Release Notes

New metric structure is completed.
Custom metric support is improved: custom metrics no longer need to extend `datasets.Metric` and use `jury.metrics.Metric` instead.
Metric usage is unified with the `compute`, `preprocess` and `postprocess` functions, of which `compute` is the only one a custom metric is required to implement.
Both strings and `Metric` objects can now be passed to `Jury(metrics=metrics)` in a mixed fashion.
The `load_metric` function was rearranged to capture end score results, and several metrics were added accordingly (e.g. `load_metric("squad_f1")` will load the squad metric, which returns the F1-score).
An example notebook has been added to the examples.
MT and QA tasks are illustrated.
Custom metric creation is added as an example.
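
The `compute`, `preprocess` and `postprocess` structure from these notes, where only `compute` is required, can be sketched as follows. `ToyMetric` is a hypothetical base class written for this illustration and is not `jury.metrics.Metric`; the method signatures and the `evaluate` driver are assumptions.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Tuple


class ToyMetric(ABC):
    """Toy base class (not jury.metrics.Metric) with optional pre/post hooks."""

    def preprocess(self, predictions: List[str], references: List[str]) -> Tuple[List[str], List[str]]:
        # Default hook: pass the inputs through unchanged.
        return predictions, references

    @abstractmethod
    def compute(self, predictions: List[str], references: List[str]) -> Dict[str, Any]:
        """The only method a subclass must implement."""

    def postprocess(self, result: Dict[str, Any]) -> Dict[str, Any]:
        # Default hook: return the raw result unchanged.
        return result

    def evaluate(self, predictions: List[str], references: List[str]) -> Dict[str, Any]:
        predictions, references = self.preprocess(predictions, references)
        return self.postprocess(self.compute(predictions, references))


class LengthRatio(ToyMetric):
    """Hypothetical custom metric: total predicted tokens over total reference tokens."""

    def compute(self, predictions: List[str], references: List[str]) -> Dict[str, Any]:
        pred_len = sum(len(p.split()) for p in predictions)
        ref_len = sum(len(r.split()) for r in references)
        return {"length_ratio": pred_len / ref_len if ref_len else 0.0}


if __name__ == "__main__":
    print(LengthRatio().evaluate(["a b"], ["a b c d"]))
```

Giving the hooks pass-through defaults is what lets a subclass implement `compute` alone, while still allowing it to override `preprocess` or `postprocess` when needed.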

📦 Acknowledgments

@fcakyon @cemilcengiz @devrimcavusoglu

v0.0.6

devrimcavusoglu · July 28, 2021

v0.0.5

devrimcavusoglu · July 27, 2021

v0.0.4

devrimcavusoglu · July 26, 2021

v0.0.3

devrimcavusoglu · July 26, 2021

Multiple predictions and multiple references support

v0.0.2 (Pre-release)

fcakyon · July 14, 2021

First PyPI release
