cvs-health/langfair
LangFair is a Python library for conducting use-case level LLM bias and fairness assessments
📦 1. Adversarial toxicity evaluation
- This implementation generates responses with the provided LLM to prompts from the [RealToxicityPrompts](https://arxiv.org/abs/2009.11462) dataset.
📋 What's Changed
- v0.7.1 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/210
- Za/adversarial generator by @zeya30 in https://github.com/cvs-health/langfair/pull/205
- 171 adversarial generator by @dylanbouchard in https://github.com/cvs-health/langfair/pull/213
- chore(deps): bump actions/setup-python from 5.6.0 to 6.0.0 by @dependabot[bot] in https://github.com/cvs-health/langfair/pull/204
- add unit tests for adversarial generator by @zeya30 in https://github.com/cvs-health/langfair/pull/217
- Enable `rich` progress bars by @dylanbouchard in https://github.com/cvs-health/langfair/pull/216
- Minor release: `v0.8.0` by @dylanbouchard in https://github.com/cvs-health/langfair/pull/231
- Release `v0.8.0` (version bump) by @dylanbouchard in https://github.com/cvs-health/langfair/pull/232
- + 1 more
📦 Highlights
- Compatibility with Python 3.13
- Additional checks for `AutoEval`
- Misc. dependency version updates
📋 What's Changed
- v0.6.8 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/197
- chore(deps-dev): bump sphinx from 7.3.7 to 7.4.7 by @dependabot[bot] in https://github.com/cvs-health/langfair/pull/196
- chore(deps): bump actions/checkout from 4 to 5 by @dependabot[bot] in https://github.com/cvs-health/langfair/pull/191
- chore(deps-dev): bump pytest-asyncio from 0.24.0 to 1.1.0 by @dependabot[bot] in https://github.com/cvs-health/langfair/pull/192
- chore(deps): bump tiktoken from 0.7.0 to 0.11.0 by @dependabot[bot] in https://github.com/cvs-health/langfair/pull/194
- python3.13 support by @dskarbrevik in https://github.com/cvs-health/langfair/pull/206
- Minor release: `v0.7.0` by @dylanbouchard in https://github.com/cvs-health/langfair/pull/208
- Release branch/v0.7.1 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/209
✨ New Contributors
- @dependabot[bot] made their first contribution in https://github.com/cvs-health/langfair/pull/196
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.8...v0.7.1
📦 Highlights
- Improve security posture
- Fix bug in Stereotype demo notebook
- Remove redundant LLM instantiation from demo notebooks
- Polish use of `AzureChatOpenAI` to follow best practices
- Add missing words to word lists for FTU checking and counterfactual substitution for improved performance of `CounterfactualGenerator`
📋 What's Changed
- feat: improve project security posture by @trumant in https://github.com/cvs-health/langfair/pull/166
- fix: pin setup-python to v5.6.0 by @trumant in https://github.com/cvs-health/langfair/pull/180
- fix: add ci ignores to optimize PR feedback by @trumant in https://github.com/cvs-health/langfair/pull/183
- Ds/update stereotype by @dskarbrevik in https://github.com/cvs-health/langfair/pull/176
- Update LLM instantiation in notebooks by @mehtajinesh in https://github.com/cvs-health/langfair/pull/177
- fix: mv snyk from pre-commit to CI by @trumant in https://github.com/cvs-health/langfair/pull/178
- update race words requiring context by @dylanbouchard in https://github.com/cvs-health/langfair/pull/185
- fix: update jupyter-core to v5.8.1 for CVE fix by @trumant in https://github.com/cvs-health/langfair/pull/184
- + 4 more
✨ New Contributors
- @mehtajinesh made their first contribution in https://github.com/cvs-health/langfair/pull/177
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.7...v0.6.8
📋 What's Changed
- chore: gitleaks:allow a false positive finding by @trumant in https://github.com/cvs-health/langfair/pull/164
- Stereotype classifier_model attribute by @mohitcek in https://github.com/cvs-health/langfair/pull/167
- Counterfactual `cosine_transformer` and `sentiment_classifier` attributes by @mohitcek in https://github.com/cvs-health/langfair/pull/169
- Release PR: v0.6.7 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/170
✨ New Contributors
- @trumant made their first contribution in https://github.com/cvs-health/langfair/pull/164
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.6...v0.6.7
📦 Highlights
- Update the version of `aiohttp` per Dependabot security alert.
📋 What's Changed
- v0.6.5 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/162
- Release branch/v0.6.6 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/163
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.5...v0.6.6
This patch release updates the versions of `transformers`, `urllib3`, and `pillow` per Dependabot security alerts.
📦 Highlights
- Update version of `protobuf` per Dependabot security alert
📋 What's Changed
- Release PR: v0.6.4 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/157
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.3...v0.6.4
📋 What's Changed
- dependabot security patch by @dylanbouchard in https://github.com/cvs-health/langfair/pull/156
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.2...v0.6.3
📋 What's Changed
- Patch release: update tornado, setuptools per dependabot alert by @dylanbouchard in https://github.com/cvs-health/langfair/pull/155
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.1...v0.6.2
📦 Highlights
- upgrade the version of `h11` per Dependabot security alert.
📋 What's Changed
- Release PR: v0.6.1 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/152
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.6.0...v0.6.1
📦 Highlights
- option for new multilingual toxicity classifier (`detoxify_multilingual`)
- upgrade version of `torch`
📋 What's Changed
- v0.5.2 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/148
- Multilingual classifier by @venkataseetharam in https://github.com/cvs-health/langfair/pull/147
- Refactor ToxicityMetric class by @mohitcek in https://github.com/cvs-health/langfair/pull/149
- New toxicity classifier by @dylanbouchard in https://github.com/cvs-health/langfair/pull/150
- Release PR: v0.6.0 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/151
✨ New Contributors
- @venkataseetharam made their first contribution in https://github.com/cvs-health/langfair/pull/147
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.5.3...v0.6.0
📦 Highlights
- require `torch` version `>2.6.0` per Dependabot security alert
📋 What's Changed
- Release PR: v0.5.3 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/146
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.5.2...v0.5.3
📦 Highlights
- improve handling of case-sensitivity of counterfactual substitutions for gender
- add missing parameter for specifying sentiment classifier with `CounterfactualMetrics` class
📋 What's Changed
- v0.5 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/135
- feat: configurable `sentiment_classfier` for `CounterFactualMetrics` (#138) by @Mihir3 in https://github.com/cvs-health/langfair/pull/139
- Enhance substitution logic and update race word lists by @ManaliJoshi92 in https://github.com/cvs-health/langfair/pull/137
- Restructure `device` and `sentiment_classifier` parameter by @mohitcek in https://github.com/cvs-health/langfair/pull/144
- Release PR: v0.5.2 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/143
✨ New Contributors
- @Mihir3 made their first contribution in https://github.com/cvs-health/langfair/pull/139
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.5.1...v0.5.2
📦 Highlights
- Compatibility with Python 3.12
- Additional sentiment classifier (https://huggingface.co/siebert/sentiment-roberta-large-english) enabled with counterfactual sentiment bias
- Security patch for `jinja2`
- add `.basedpyright`
📋 What's Changed
- Release PR: v0.5.1 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/134
- v0.4.0 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/122
- Re/basedpyright by @renzmann in https://github.com/cvs-health/langfair/pull/124
- Apply pre-commit formatting by @renzmann in https://github.com/cvs-health/langfair/pull/125
- Add Support for Python 3.12 in LangFair by @kmadan in https://github.com/cvs-health/langfair/pull/126
- Remove redundant condition in evaluate method by @vsatyamuralikrishna in https://github.com/cvs-health/langfair/pull/128
- upgrade jinja2 for security patch by @dylanbouchard in https://github.com/cvs-health/langfair/pull/131
- Enable Additional Sentiment Classifier by @ManaliJoshi92 in https://github.com/cvs-health/langfair/pull/129
- + 2 more
✨ New Contributors
- @renzmann made their first contribution in https://github.com/cvs-health/langfair/pull/124
- @kmadan made their first contribution in https://github.com/cvs-health/langfair/pull/126
- @vsatyamuralikrishna made their first contribution in https://github.com/cvs-health/langfair/pull/128
- @ManaliJoshi92 made their first contribution in https://github.com/cvs-health/langfair/pull/129
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.4.0...v0.5.1
📦 Highlights
- Option to use `n` parameter for select `BaseChatModel` classes for significantly faster generation
- `ResponseGenerator`, `CounterfactualGenerator`, and `AutoEval` now accept only LangChain `BaseChatModel` instances for the `langchain_llm` parameter
- Customizable failure messages for `ResponseGenerator`, `CounterfactualGenerator`, and `AutoEval`
- Customizable `count` for `AutoEval`
- Patch security vulnerability related to `transformers` package
- Added graphic to readme to illustrate LangFair workflow
📋 What's Changed
- v0.3.2 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/103
- fix typos in notebook by @dylanbouchard in https://github.com/cvs-health/langfair/pull/109
- fixed issue #96 by @Riddhimaan-Senapati in https://github.com/cvs-health/langfair/pull/110
- Enable use of `BaseLanguageModel.n` parameter for `ResponseGenerator` by @dylanbouchard in https://github.com/cvs-health/langfair/pull/112
- update notebooks by @zeya30 in https://github.com/cvs-health/langfair/pull/114
- Adding developer deps for jupyter notebook development by @dskarbrevik in https://github.com/cvs-health/langfair/pull/115
- Adding dotenv to developer deps by @dskarbrevik in https://github.com/cvs-health/langfair/pull/117
- update default use of n by @dylanbouchard in https://github.com/cvs-health/langfair/pull/119
- + 1 more
✨ New Contributors
- @Riddhimaan-Senapati made their first contribution in https://github.com/cvs-health/langfair/pull/110
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.3.2...v0.4.0
📦 Highlights
- Security patch for `jinja2`
- Update readme to include software paper bibtex
- Minor docstring updates for docs site fixes
- Create PR template
📋 What's Changed
- v0.3.1 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/88
- update docstrings by @zeya30 in https://github.com/cvs-health/langfair/pull/101
- Add new paper bibtex, fix docstring, add PR template by @dylanbouchard in https://github.com/cvs-health/langfair/pull/99
- Release PR: v0.3.2 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/102
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.3.1...v0.3.2
📦 Highlights
- New `check_ftu` method in the `CounterfactualGenerator` class for checking fairness through unawareness (FTU). It is more user-friendly than the previous approach, which relied on `parse_texts`.
- Updates to counterfactual demo notebook
- Updates to dev dependencies
- Fix broken links in readme
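
To illustrate what an FTU check does conceptually, here is a minimal sketch that scans prompts for protected-attribute words. It does not call LangFair: the function names, the return shape, and the tiny `GENDER_WORDS` list are all hypothetical, and the library's own word lists and matching logic are more extensive.

```python
import re
from typing import Dict, List, Set

# Hypothetical, deliberately tiny word list for illustration only.
GENDER_WORDS: Set[str] = {
    "he", "she", "him", "her", "his", "hers", "man", "woman", "men", "women",
}


def find_attribute_words(prompt: str, words: Set[str]) -> List[str]:
    """Return the listed words found in a prompt, in order of appearance."""
    tokens = re.findall(r"[a-z']+", prompt.lower())
    return [tok for tok in tokens if tok in words]


def check_ftu_sketch(prompts: List[str], words: Set[str] = GENDER_WORDS) -> Dict[str, object]:
    """Flag every prompt that mentions a listed word.

    FTU is treated as satisfied only when no prompt is flagged.
    """
    flagged = []
    for prompt in prompts:
        hits = find_attribute_words(prompt, words)
        if hits:
            flagged.append({"prompt": prompt, "words": hits})
    return {"flagged": flagged, "ftu_satisfied": not flagged}
```

Calling `check_ftu_sketch(["She asked for a refund.", "Summarize the claim."])` flags only the first prompt, which would then be a candidate for counterfactual substitution.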
📋 What's Changed
- v0.3.0 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/80
- Add sphinx to a poetry dep group by @dskarbrevik in https://github.com/cvs-health/langfair/pull/78
- Fix broken links in README and copy-paste errors in example notebook by @xavieryao in https://github.com/cvs-health/langfair/pull/81
- New FTU check method by @dylanbouchard in https://github.com/cvs-health/langfair/pull/85
- Contributing guide update by @dskarbrevik in https://github.com/cvs-health/langfair/pull/87
- Release PR: v0.3.1 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/86
✨ New Contributors
- @xavieryao made their first contribution in https://github.com/cvs-health/langfair/pull/81
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.3.0...v0.3.1
📦 Highlights
- Option to return response-level scores for `CounterfactualMetrics`, `AutoEval`
- Additional unit tests for `CounterfactualMetrics`, `AutoEval`
- Data loader functions for cleaner code when using example data
- Enforced string outputs in `ResponseGenerator` and `CounterfactualGenerator` to avoid errors when computing metrics if any response is `None`
📋 What's Changed
- v0.2.1 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/65
- Ds/data loader by @dskarbrevik in https://github.com/cvs-health/langfair/pull/59
- Unit tests for classification metrics by @mohitcek in https://github.com/cvs-health/langfair/pull/69
- enforce strings in response outputs, return response-level cf scores by @dylanbouchard in https://github.com/cvs-health/langfair/pull/66
- Consistent return object of AutoEval class by @mohitcek in https://github.com/cvs-health/langfair/pull/70
- notebook updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/72
- AutoEval unit tests by @mohitcek in https://github.com/cvs-health/langfair/pull/73
- Final changes before releasing v0.3.0 by @mohitcek in https://github.com/cvs-health/langfair/pull/75
- + 2 more
📦 Highlights
- updated README for more illustrative examples
- patch to `AutoEval` for pairwise filtering of counterfactual responses in cases of generation failure
- references in docstring
- fix to SPDX expression in pyproject.toml
📋 What's Changed
- v0.2.0 updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/46
- Update docstrings by @vasisthasinghal in https://github.com/cvs-health/langfair/pull/53
- Fix pyproject and readme by @dylanbouchard in https://github.com/cvs-health/langfair/pull/61
- skip select unit tests due to memory issue by @dylanbouchard in https://github.com/cvs-health/langfair/pull/63
- Updated Readme file and AutoEval bugfix by @mohitcek in https://github.com/cvs-health/langfair/pull/62
- Release PR: v0.2.1 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/64
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.2.0...v0.2.1
📦 Highlights
- Upgrade version of LangChain to 0.3.7 to resolve dependency conflicts with later versions of LangChain community packages
- Refactor `ResponseGenerator`, `CounterfactualGenerator`, `AutoEval` to adjust for LangChain upgrade
- More intuitive printing in `AutoEval`
- Update unit tests
- Update documentation in notebooks for user-friendliness and to include MistralAI
- Improved exception handling
- Remove 'langchain: ' from print statements
📋 What's Changed
- v0.1.2 Updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/36
- upgrade langchain by @dylanbouchard in https://github.com/cvs-health/langfair/pull/39
- Formatting changes made by @vasisthasinghal in https://github.com/cvs-health/langfair/pull/41
- Resolve issue: upgrade version of langchain by @dylanbouchard in https://github.com/cvs-health/langfair/pull/40
- Add pytest to dev dependencies by @virenbajaj in https://github.com/cvs-health/langfair/pull/38
- Update exception handling and notebooks by @dylanbouchard in https://github.com/cvs-health/langfair/pull/42
- Vb/handle suppressed exceptions by @dylanbouchard in https://github.com/cvs-health/langfair/pull/44
- Release PR: v0.2.0 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/45
✨ New Contributors
- @vasisthasinghal made their first contribution in https://github.com/cvs-health/langfair/pull/41
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.1.2...v0.2.0
📦 Highlights
- Improved Readme for readability
- Improved notebook documentation for readability
- Removed `scipy`, `sklearn`, `openai` and `langchain-openai` dependencies
- Created new argument for `ResponseGenerator` and `CounterfactualGenerator` that allows users to specify which exceptions to suppress
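
The idea behind suppressing selected exceptions during generation can be sketched as follows. This is a standalone illustration, not LangFair's implementation: `generate_all`, its `suppressed` parameter, `fake_llm`, and the placeholder text are hypothetical names chosen for the example.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple, Type

FAILURE_MESSAGE = "Unable to get response"  # hypothetical placeholder text


async def generate_all(
    prompts: List[str],
    call_llm: Callable[[str], Awaitable[str]],
    suppressed: Tuple[Type[Exception], ...] = (),
) -> List[str]:
    """Call the model once per prompt, concurrently.

    Exception types listed in `suppressed` are replaced by a placeholder
    string; any other exception propagates to the caller.
    """

    async def one(prompt: str) -> str:
        try:
            return await call_llm(prompt)
        except suppressed:
            return FAILURE_MESSAGE

    return list(await asyncio.gather(*(one(p) for p in prompts)))


async def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; fails on prompts containing 'bad'."""
    if "bad" in prompt:
        raise ValueError("content filter triggered")
    return prompt.upper()
```

With `suppressed=(ValueError,)`, a failing prompt yields the placeholder instead of aborting the whole batch; with the default empty tuple, the `ValueError` is raised as usual.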
📋 What's Changed
- v0.1.1 -> Develop by @dylanbouchard in https://github.com/cvs-health/langfair/pull/19
- Update readme and notebooks by @dylanbouchard in https://github.com/cvs-health/langfair/pull/20
- Remove `scipy` dependency by @dylanbouchard in https://github.com/cvs-health/langfair/pull/21
- Remove dependency on the scikit-learn confusion matrix by @mohitcek in https://github.com/cvs-health/langfair/pull/23
- Add code of conduct by @virenbajaj in https://github.com/cvs-health/langfair/pull/25
- Remove `openai`, `langchain-openai` dependencies by @dylanbouchard in https://github.com/cvs-health/langfair/pull/22
- Add code of conduct: main -> develop by @dylanbouchard in https://github.com/cvs-health/langfair/pull/26
- Move metrics section by @virenbajaj in https://github.com/cvs-health/langfair/pull/27
- + 7 more
✨ New Contributors
- @mohitcek made their first contribution in https://github.com/cvs-health/langfair/pull/23
- @virenbajaj made their first contribution in https://github.com/cvs-health/langfair/pull/25
- @dskarbrevik made their first contribution in https://github.com/cvs-health/langfair/pull/30
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.1.1...v0.1.2
📋 What's Changed
- update docstring by @dylanbouchard in https://github.com/cvs-health/langfair/pull/14
- Update readme by @dylanbouchard in https://github.com/cvs-health/langfair/pull/16
- readme updates by @dylanbouchard in https://github.com/cvs-health/langfair/pull/17
- Release PR - v0.1.1 by @dylanbouchard in https://github.com/cvs-health/langfair/pull/18
- Full Changelog: https://github.com/cvs-health/langfair/compare/v0.1.0...v0.1.1
📋 Changes
- Strict Counterfactual Sentiment Parity ([Huang et al., 2020](https://arxiv.org/pdf/1911.03064))
- Weak Counterfactual Sentiment Parity ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
- Counterfactual Cosine Similarity Score ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
- Counterfactual BLEU ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
- Counterfactual ROUGE-L ([Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
- Stereotypical Associations ([Liang et al., 2023](https://arxiv.org/pdf/2211.09110))
- Co-occurrence Bias Score ([Bordia & Bowman, 2019](https://aclanthology.org/N19-3002.pdf))
- Stereotype classifier metrics ([Zekun et al., 2023](https://arxiv.org/ftp/arxiv/papers/2311/2311.14126.pdf), [Bouchard, 2024](https://arxiv.org/pdf/2407.10853))
- + 11 more
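
Of the metrics listed above, Counterfactual ROUGE-L is simple enough to sketch in a few lines. The version below is a simplified illustration that assumes lowercase whitespace tokenization and a plain ROUGE-L F1 averaged over aligned response pairs; the function names are hypothetical, and the library's tokenization and aggregation may differ.

```python
from typing import List, Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(text_1: str, text_2: str) -> float:
    """ROUGE-L F1 between two texts, using whitespace tokens."""
    tokens_1, tokens_2 = text_1.lower().split(), text_2.lower().split()
    if not tokens_1 or not tokens_2:
        return 0.0
    lcs = lcs_length(tokens_1, tokens_2)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(tokens_1), lcs / len(tokens_2)
    return 2 * precision * recall / (precision + recall)


def counterfactual_rouge_l(responses_1: List[str], responses_2: List[str]) -> float:
    """Mean ROUGE-L F1 over aligned pairs of counterfactual responses."""
    if len(responses_1) != len(responses_2) or not responses_1:
        raise ValueError("need two equal-length, non-empty response lists")
    scores = [rouge_l_f1(a, b) for a, b in zip(responses_1, responses_2)]
    return sum(scores) / len(scores)
```

A score near 1 means the responses to the two counterfactual prompts are nearly identical in wording; lower scores indicate that swapping the protected-attribute words changed the output more.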
