gliwka/hyperscan-java
Match tens of thousands of regular expressions within milliseconds - Java bindings for Intel's hyperscan 5
✨ Added
- New callback API for efficient custom match processing ([#58](https://github.com/gliwka/hyperscan-java/issues/58))
- Byte-based scanning API for direct operation on byte[] and ByteBuffer without String overhead
- New `hasMatch` methods for quick existence checking that terminate immediately on first match ([#68](https://github.com/gliwka/hyperscan-java/issues/68))
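
The early-termination idea behind a callback API and `hasMatch` can be sketched with plain JDK classes. This is only an illustration: it uses `java.util.regex` as a stand-in engine, and `EarlyExitSketch`, `MatchHandler`, `scan` and this `hasMatch` are hypothetical names, not the library's actual API or signatures.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of callback-driven scanning with early exit. */
final class EarlyExitSketch {

    /** Hypothetical callback: return false to stop the scan. */
    interface MatchHandler {
        boolean onMatch(int start, int end);
    }

    /** Passes each match to the handler and returns how many were reported. */
    static int scan(Pattern pattern, CharSequence input, MatchHandler handler) {
        Matcher matcher = pattern.matcher(input);
        int reported = 0;
        while (matcher.find()) {
            reported++;
            if (!handler.onMatch(matcher.start(), matcher.end())) {
                break; // the handler asked to stop, skip the rest of the input
            }
        }
        return reported;
    }

    /** Existence check: the handler stops the scan at the first match. */
    static boolean hasMatch(Pattern pattern, CharSequence input) {
        return scan(pattern, input, (start, end) -> false) > 0;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("ab");
        System.out.println(scan(p, "ababab", (s, e) -> true)); // 3
        System.out.println(hasMatch(p, "ababab"));             // true
    }
}
```

Processing matches in a callback avoids building a result list, and an existence check does not need to look past the first hit.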
⚡ Performance
- Reduced memory usage for UTF-8 string mapping by dynamically selecting optimal array type
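
As a rough sketch of what selecting an array type dynamically can look like: the class below keeps offsets in the narrowest primitive array that can hold the largest value. `CompactOffsets` and its methods are hypothetical and are not taken from the library's source; the actual implementation may differ.

```java
/**
 * Hypothetical illustration: stores non-negative offsets in a byte[],
 * char[] or int[] depending on the largest value that must be held.
 */
final class CompactOffsets {
    private final byte[] bytes; // used when every value fits in 8 bits
    private final char[] chars; // used when every value fits in 16 bits
    private final int[] ints;   // fallback for larger values

    CompactOffsets(int[] values) {
        int max = 0;
        for (int v : values) {
            if (v < 0) throw new IllegalArgumentException("negative offset: " + v);
            max = Math.max(max, v);
        }
        byte[] b = null;
        char[] c = null;
        int[] i = null;
        if (max <= 0xFF) {
            b = new byte[values.length];
            for (int k = 0; k < values.length; k++) b[k] = (byte) values[k];
        } else if (max <= 0xFFFF) {
            c = new char[values.length];
            for (int k = 0; k < values.length; k++) c[k] = (char) values[k];
        } else {
            i = values.clone();
        }
        bytes = b;
        chars = c;
        ints = i;
    }

    /** Reads a value back, undoing the narrowing (bytes are read as unsigned). */
    int get(int index) {
        if (bytes != null) return bytes[index] & 0xFF;
        if (chars != null) return chars[index];
        return ints[index];
    }

    /** Storage cost per entry in bytes: 1, 2 or 4. */
    int bytesPerEntry() {
        return bytes != null ? 1 : chars != null ? 2 : 4;
    }

    public static void main(String[] args) {
        CompactOffsets small = new CompactOffsets(new int[] {0, 1, 200, 255});
        System.out.println(small.bytesPerEntry() + " byte(s) per entry, value " + small.get(2));
    }
}
```

For short inputs this kind of layout uses a quarter of the memory of a plain `int[]`.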
🐛 Fixed
- Fixed upstream vectorscan correctness regression on x86 architecture with targeted patch ([#228](https://github.com/gliwka/hyperscan-java/issues/228), [#231](https://github.com/gliwka/hyperscan-java/issues/231))
- Ensured all native handles are properly reclaimed after database compilation ([#230](https://github.com/gliwka/hyperscan-java/issues/230))
- Reworked UTF-8 position mapping to handle the mapping correctly in edge cases ([#170](https://github.com/gliwka/hyperscan-java/issues/170))
📋 Changed
- Removed 255 thread limit for concurrent scanning operations ([#222](https://github.com/gliwka/hyperscan-java/issues/222), [#229](https://github.com/gliwka/hyperscan-java/issues/229))
💥 Breaking
- Windows support has been dropped due to vectorscan not supporting it
✨ Added
- Support for ARM64 architecture on Linux and macOS (M1/M2/M3 family of chips)
🐛 Fixed
- Database instances not reclaimable by GC ([#161](https://github.com/gliwka/hyperscan-java/issues/161)) - thanks [@mmimica](https://github.com/mmimica)!
- Race condition during tracking of native references on multiple threads ([#158](https://github.com/gliwka/hyperscan-java/issues/158)) - thanks [@mmimica](https://github.com/mmimica)!
- Expression IDs can now have arbitrary gaps between them without consuming additional memory ([#163](https://github.com/gliwka/hyperscan-java/issues/163)) - thanks [@mmimica](https://github.com/mmimica)!
- Removed superfluous duplicate call during mapping of expressions in PatternFilter ([#205](https://github.com/gliwka/hyperscan-java/pull/205)) - thanks [@Jiar](https://github.com/Jiar)!
✨ Added
- New PatternFilter allowing for prefiltering of Java regex patterns, similar to Chimera
- Windows support
- Possibility to manually specify expression ids
📋 Changed
- Moved access to native library from JNA to JavaCPP
- Removed context object from expressions
🐛 Fixed
- Lock contention while scanning with high concurrency ([#89](https://github.com/gliwka/hyperscan-java/issues/89))
📋 Changes
- Hyperscan v5.1.1 binaries (#48)
- Java 11 compatibility (#54)
- Logical combinations (thanks to @swapnilnawale @digitalreasoning for the contribution) (#55)
- Change from checked to unchecked exceptions to clean up method signature (#53)
📋 Changes
- Enforce UTF-8 encoding in the direct mapping library (#43) - thanks to @eliaslevy for this PR!
This release contains the new hyperscan v4.7.0 binaries
📋 Changes
- Fixed segmentation fault when passing a null value as expression (#34)
📋 Changes
- Fixed error caused by reading structs after the call to free (#31)
📋 Changes
- Support for 32-bit Linux
- New hyperscan binaries (v4.6.0)
- Updated JUnit and Surefire plugins for testing (now works with the most recent IntelliJ release)
📋 Changes
- Add context object to Expression
📋 Changes
- Removed a "System.out.println" call that had been left in accidentally
📋 Changes
- Closeable interface implemented in Database and Scanner
- New constructor to create expressions without flags
- Avoid some work in the fast path of Scanner.scan. (#24 - thanks to @eliaslevy)
- Bugfix for edge cases causing an ArrayIndexOutOfBoundsException (#23)
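
Implementing `Closeable` lets callers release native resources with try-with-resources. The sketch below uses a hypothetical stand-in class, `NativeResource`, rather than the real `Database` or `Scanner`, so it only shows the general pattern.

```java
import java.io.Closeable;

/** Hypothetical stand-in for an object that owns native memory. */
final class NativeResource implements Closeable {
    private boolean closed;

    @Override
    public void close() {
        closed = true; // a real implementation would free native memory here
    }

    boolean isClosed() {
        return closed;
    }
}

final class CloseableDemo {
    /** Uses the resource inside try-with-resources and returns it afterwards. */
    static NativeResource useAndRelease() {
        NativeResource handle;
        try (NativeResource resource = new NativeResource()) {
            handle = resource;
            // ... work with the resource here ...
        }
        return handle; // close() has already run at this point
    }

    public static void main(String[] args) {
        System.out.println(useAndRelease().isClosed()); // true
    }
}
```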
📋 Changes
- Faster native calls due to JNA direct mapping (PR #17) - thanks to @eliaslevy for this contribution.
📋 Changes
- Enforce UTF-8 encoding on every string (Java and JNA, see #14) - thanks to @eliaslevy for this PR!
- New version of the libhs.so shared Linux library: it has been updated to v4.5.1 and linked against older libraries to ensure it works on more conservative distributions (see #13)
This release introduces a precompiled, bundled 64-bit macOS library and thus works out of the box on those systems. Thanks to @eliaslevy for submitting this with PR #12. In addition to the jar files below, you can add this to Gradle, Maven, sbt and Leiningen by visiting: https://jitpack.io/#cerebuild/hyperscan-java
📋 Changes
- Add test scope to the JUnit dependency and pin it to a version that is not RELEASE (#11)
📋 Changes
- Adds a new method, Scanner.allocScratch, to preallocate the scratch space outside the scan code path and removes the allocation in Scanner.scan. (#9)
📋 Changes
- Fixed occasional segmentation faults under high memory pressure (#8)
📋 Changes
- API method names were refactored to start with a lowercase letter to match the official Java style
- The endOfMatch value now matches the string index (zero based) of the end of the match (issue #6)
Changed match positions from bytes to actual character positions in the String to integrate better into the Java ecosystem.
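
To illustrate why byte positions and String positions differ: a UTF-8 encoded character can occupy one to four bytes, while Java indexes a String by UTF-16 code units. The following is a minimal sketch of such a mapping, not the library's implementation; `Utf8PositionMap` and its methods are hypothetical names, and the input is assumed to be well formed (no unpaired surrogates).

```java
import java.util.Arrays;

/** Illustrative mapping from UTF-8 byte offsets to Java String indices. */
final class Utf8PositionMap {

    /** Number of bytes a code point occupies in UTF-8. */
    static int utf8Width(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    /** Returns map where map[b] is the String index of the character that contains byte b. */
    static int[] byteToCharIndex(String s) {
        // First pass: total length of the UTF-8 encoding.
        int byteLength = 0;
        for (int i = 0; i < s.length(); i += Character.charCount(s.codePointAt(i))) {
            byteLength += utf8Width(s.codePointAt(i));
        }
        // Second pass: every byte of a character points at that character's index.
        int[] map = new int[byteLength];
        int bytePos = 0;
        for (int charPos = 0; charPos < s.length(); ) {
            int codePoint = s.codePointAt(charPos);
            for (int k = utf8Width(codePoint); k > 0; k--) {
                map[bytePos++] = charPos;
            }
            charPos += Character.charCount(codePoint);
        }
        return map;
    }

    public static void main(String[] args) {
        // 'a' (1 byte), 'é' (2 bytes), an emoji (4 bytes, 2 chars), 'b' (1 byte)
        int[] map = byteToCharIndex("a\u00e9\uD83D\uDE00b");
        System.out.println(Arrays.toString(map)); // [0, 1, 1, 2, 2, 2, 2, 4]
    }
}
```

In this example the final `b` sits at byte offset 7 but at String index 4, which is the kind of difference the change above removes for callers.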
🐛 Bugfixes
- Fixed wrong matched string being returned for input containing UTF-8 characters (#5)
📋 Changes
- Includes precompiled x86 64-bit shared hyperscan library for most modern linux distributions (#3)
🐛 Bugfixes
- Fixed wrong string length for input containing UTF-8 characters (bug #2)
