niieani/gpt-tokenizer
The fastest JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT models (gpt-5, gpt-o*, gpt-4o, etc.). Port of OpenAI's tiktoken with additional features.
3.4.0 (Latest)
✨ Features
- add function calling token counting ([#83](https://github.com/niieani/gpt-tokenizer/issues/83)) ([7f880f4](https://github.com/niieani/gpt-tokenizer/commit/7f880f46bb34a644ec8f9b3069060b2d2f99e11c))
3.3.0
✨ Features
- correct split regexp for o200 and add harmony format support ([59422fd](https://github.com/niieani/gpt-tokenizer/commit/59422fd8b68987dc0a207e43c48198c1346ecc50)), closes [#82](https://github.com/niieani/gpt-tokenizer/issues/82) [#78](https://github.com/niieani/gpt-tokenizer/issues/78)
3.2.0
✨ Features
- support encode options in isWithinTokenLimit ([#80](https://github.com/niieani/gpt-tokenizer/issues/80)) ([dcc8783](https://github.com/niieani/gpt-tokenizer/commit/dcc878381f2b173998141d0adefaf5116c33e6b3))
3.1.0
🐛 Bug Fixes
- codegen: guard BPE generation against empty output ([c46019a](https://github.com/niieani/gpt-tokenizer/commit/c46019a95b34182781dbfcf1ab240f2672dec522))
- place generated headers after lint directives ([70bed74](https://github.com/niieani/gpt-tokenizer/commit/70bed74b8858bae62f007f04984cc2c2329e8964))
✨ Features
- add new models (gpt-5*) and update pricing data ([135a851](https://github.com/niieani/gpt-tokenizer/commit/135a8513f31459b569b2899f676bc1fa4b4e8ca3))
3.0.1
🐛 Bug Fixes
- add o3-pro and update pricing ([52a3b3c](https://github.com/niieani/gpt-tokenizer/commit/52a3b3c52bf10c8fce30a30da491d71e498b1d34))
3.0.0
✨ Features
- add new models and ability to estimate cost ([#72](https://github.com/niieani/gpt-tokenizer/issues/72)) ([1d1d76d](https://github.com/niieani/gpt-tokenizer/commit/1d1d76d86bd22d89b89ebba09f5f7b4eed04c4eb)), closes [#71](https://github.com/niieani/gpt-tokenizer/issues/71) [#70](https://github.com/niieani/gpt-tokenizer/issues/70)
💥 BREAKING CHANGES
- changes the default encoding to `o200k_base` as that is what most modern models use now
2.9.0
✨ Features
- add new models and update pricing ([e2506c2](https://github.com/niieani/gpt-tokenizer/commit/e2506c229c542384928e35b34a2d3ed07cf68a10))
- implement 'estimateCost' ([4124587](https://github.com/niieani/gpt-tokenizer/commit/4124587f4e2b30908db2cfe2d81b6520411e87de))
2.8.1
🐛 Bug Fixes
- add 'clearMergeCache' ([4f64377](https://github.com/niieani/gpt-tokenizer/commit/4f64377455243d5a923a0fd0437886c6b0d11dc9))
2.8.0
✨ Features
- add a LRU BPE merge cache ([15d13b1](https://github.com/niieani/gpt-tokenizer/commit/15d13b1a35047d531efda795200257183b892a93)), closes [#68](https://github.com/niieani/gpt-tokenizer/issues/68)
⚡ Performance Improvements
- optimize token counting ([c3e533c](https://github.com/niieani/gpt-tokenizer/commit/c3e533c9f814875cd97ee8ae8b11d2129fc8e86f))
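For readers unfamiliar with the idea behind the merge cache, here is a minimal sketch of an LRU cache keyed by text fragment. It is an illustration only, not the library's implementation: the `LruCache` class, its capacity, and the token IDs are all hypothetical.

```typescript
// Hypothetical sketch of an LRU cache; not gpt-tokenizer's actual code.
class LruCache<K, V> {
  // Map preserves insertion order, which we use as the recency order.
  private readonly entries = new Map<K, V>()
  private readonly capacity: number

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError('capacity must be at least 1')
    this.capacity = capacity
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: K, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as K
      this.entries.delete(oldest)
    }
  }

  clear(): void {
    this.entries.clear()
  }

  get size(): number {
    return this.entries.size
  }
}

// Usage: cache merge results per fragment (token IDs are placeholders).
const cache = new LruCache<string, number[]>(2)
cache.set('hello', [1])
cache.set(' world', [2])
cache.get('hello') // marks 'hello' as recently used
cache.set('!', [3]) // evicts ' world', the least recently used entry
```

Bounding the cache keeps memory use predictable on long inputs, at the cost of recomputing merges for fragments that were evicted.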
2.7.0
✨ Features
- implement 'countTokens' function ([2d4146a](https://github.com/niieani/gpt-tokenizer/commit/2d4146a064d9dc6c2512bf6c869b05f2d18ce741)), closes [#67](https://github.com/niieani/gpt-tokenizer/issues/67)
2.6.2
🐛 Bug Fixes
- correct special token matching & counting ([3547826](https://github.com/niieani/gpt-tokenizer/commit/3547826b37e829009a40d421a3733a54d13cd452))
- unify property and variable names across the library ([6030d91](https://github.com/niieani/gpt-tokenizer/commit/6030d91cbd8a08876212e9e43d4eb7387465e5ac))
2.6.1
🐛 Bug Fixes
- expose vocabulary size ([402ff0b](https://github.com/niieani/gpt-tokenizer/commit/402ff0bea15acdd62cf5d2069ffa94b26f8200c4)), closes [#66](https://github.com/niieani/gpt-tokenizer/issues/66)
- use extensions in models.ts ([78b803d](https://github.com/niieani/gpt-tokenizer/commit/78b803d4cf60dcf04b203b293378244e2efbabb2)), closes [#65](https://github.com/niieani/gpt-tokenizer/issues/65)
2.6.0
🐛 Bug Fixes
- initialize encodings array in parts ([aa6c71d](https://github.com/niieani/gpt-tokenizer/commit/aa6c71d1d3d6756087c5d246daa17669f94bc0a0)), closes [#62](https://github.com/niieani/gpt-tokenizer/issues/62)
✨ Features
- add new and update existing models ([e832f9a](https://github.com/niieani/gpt-tokenizer/commit/e832f9a3c6ece43ad6f709e0fda33f7c0e68a743))
- provide comprehensive data for all OpenAI models ([ec2ad7e](https://github.com/niieani/gpt-tokenizer/commit/ec2ad7efc7873a303baab71853047f58becb1877))
2.5.1
📦 [2.5.1](https://github.com/niieani/gpt-tokenizer/compare/2.5.0...2.5.1) (2024-10-21)
- (no changes, only benchmark update)
2.5.0
✨ Features
- added o1-preview and o1-mini chat completion models ([#56](https://github.com/niieani/gpt-tokenizer/issues/56)) ([41673af](https://github.com/niieani/gpt-tokenizer/commit/41673afe7078c73d439583ffd470b6c52ed4f625))
2.4.1
🐛 Bug Fixes
- deps: update dependency gpt-tokenizer to ^2.4.0 ([bf4b459](https://github.com/niieani/gpt-tokenizer/commit/bf4b459d8d99903264698f606bdd9a31ca0b724f))
2.4.0
✨ Features
- performance optimizations ([661e283](https://github.com/niieani/gpt-tokenizer/commit/661e283ec92fa9b31a8d1eee01b29680c251e00a))
2.3.0
✨ Features
- improve performance, memory usage & initialization time ([#50](https://github.com/niieani/gpt-tokenizer/issues/50)) ([e2c560a](https://github.com/niieani/gpt-tokenizer/commit/e2c560aafeda84dcbec61880d552ffbaa69deaac)), closes [#18](https://github.com/niieani/gpt-tokenizer/issues/18) [#35](https://github.com/niieani/gpt-tokenizer/issues/35)
2.2.3
🐛 Bug Fixes
- deps: update dependency rfc4648 to ^1.5.3 ([fcbf48a](https://github.com/niieani/gpt-tokenizer/commit/fcbf48a553dcc4d6e7b617374880736070d16882))
2.2.2
🐛 Bug Fixes
- improve test typing ([bbd0764](https://github.com/niieani/gpt-tokenizer/commit/bbd0764ad238c6c3f83aadfc75c37d47488577f6))
- upgrade dependencies (including typescript) ([75ebd54](https://github.com/niieani/gpt-tokenizer/commit/75ebd542d8c70c2938b2fb214474f763fad4dccf)), closes [#49](https://github.com/niieani/gpt-tokenizer/issues/49)
2.2.1
🐛 Bug Fixes
- add files for other models ([c21d498](https://github.com/niieani/gpt-tokenizer/commit/c21d4986b283904766a19fb26129b0815dca68bd)), closes [#19](https://github.com/niieani/gpt-tokenizer/issues/19)
- regenerate o200k encoding from tiktoken file ([c7ba009](https://github.com/niieani/gpt-tokenizer/commit/c7ba0091c6af08c72ca67054c27f627e8dcc1117))
2.2.0
🐛 Bug Fixes
- add correct encoding for o200k_base ([137e07b](https://github.com/niieani/gpt-tokenizer/commit/137e07ba92cdbca761e2dc63cf467f6e1c3df844))
- add gpt-4o on readme as supported model ([27b4e20](https://github.com/niieani/gpt-tokenizer/commit/27b4e20dc4507f3304db3ad4cc0084fddbdd5cf5))
- update readme ([0b33e1e](https://github.com/niieani/gpt-tokenizer/commit/0b33e1edbe6c7575ce3be8ee55c9638df0a75acb))
✨ Features
- add o200k_base test plans ([44ce38e](https://github.com/niieani/gpt-tokenizer/commit/44ce38eae93d9a7b019691fb2baf0db97592d9e8))
- added o200k_base to encodings and configured its specialTokens ([2a9da2b](https://github.com/niieani/gpt-tokenizer/commit/2a9da2b79985b3332855d83dd1db6e08e3823424))
2.1.2
🐛 Bug Fixes
- bind encodeChat and encodeChatGenerator ([86c270c](https://github.com/niieani/gpt-tokenizer/commit/86c270cb6a27600149f9bd56f675628109c4a134)), closes [#15](https://github.com/niieani/gpt-tokenizer/issues/15)
2.1.1
🐛 Bug Fixes
- missing default exports due to limitations of ESM ([2a55474](https://github.com/niieani/gpt-tokenizer/commit/2a55474f9725f6907a5f17fa68cd13d76e8d2f9d)), closes [#11](https://github.com/niieani/gpt-tokenizer/issues/11)
2.1.0
🐛 Bug Fixes
- correct excluded path ([71af9d3](https://github.com/niieani/gpt-tokenizer/commit/71af9d3415f903c304e9bbf4c8feef33acd48e02))
- workaround for webpack not exposing the default export in UMD correctly ([84887b4](https://github.com/niieani/gpt-tokenizer/commit/84887b487789d17bc18e9778674defd814498d1b)), closes [#12](https://github.com/niieani/gpt-tokenizer/issues/12)
✨ Features
- add encodeChat ([ff30f11](https://github.com/niieani/gpt-tokenizer/commit/ff30f112c03775a19948bf5b867177b1350a2881)), closes [#10](https://github.com/niieani/gpt-tokenizer/issues/10)
2.0.0
✨ Features
- complete rewrite to support different models ([eedd944](https://github.com/niieani/gpt-tokenizer/commit/eedd944628d67f3a4121447bf45aa83022922800)), closes [#5](https://github.com/niieani/gpt-tokenizer/issues/5) [#6](https://github.com/niieani/gpt-tokenizer/issues/6)
- more tests and better README ([e660a25](https://github.com/niieani/gpt-tokenizer/commit/e660a25ab2388416ca027639828b026a0724a5ea))
💥 BREAKING CHANGES
- default encoder is now GPT3.5 / GPT4
2.0.0-beta.2 (Pre-release)
✨ Features
- more tests and better README ([e660a25](https://github.com/niieani/gpt-tokenizer/commit/e660a25ab2388416ca027639828b026a0724a5ea))
2.0.0-beta.1 (Pre-release)
✨ Features
- complete rewrite to support different models ([eedd944](https://github.com/niieani/gpt-tokenizer/commit/eedd944628d67f3a4121447bf45aa83022922800)), closes [#5](https://github.com/niieani/gpt-tokenizer/issues/5) [#6](https://github.com/niieani/gpt-tokenizer/issues/6)
💥 BREAKING CHANGES
- default encoder is now GPT3.5 / GPT4
1.0.5
🐛 Bug Fixes
- publish "dist" folder ([f7df7e9](https://github.com/niieani/gpt-tokenizer/commit/f7df7e95eec1a8b23b6f7df3a1b853450f0a08a7)), closes [#7](https://github.com/niieani/gpt-tokenizer/issues/7)
1.0.4
🐛 Bug Fixes
- eslint and jest complaints ([8b1cb01](https://github.com/niieani/gpt-tokenizer/commit/8b1cb0185e900f3b6f8ea8f75df7f59a287e0f89))
- importing/requiring under certain setups was not working ([8b71131](https://github.com/niieani/gpt-tokenizer/commit/8b711315afc50342158c3c28d247548dc5e4ef83)), closes [#4](https://github.com/niieani/gpt-tokenizer/issues/4)
