niklak/dom_query
A Flexible Rust Crate for DOM Querying and Manipulation
Changes
- Updated dependencies:
  - `html5ever`: `0.38.0` -> `0.39.0`
  - `selectors`: `0.36.0` -> `0.38.0`
  - `cssparser`: `0.36.0` -> `0.37.0`
  - `hashbrown`: `0.16.0` -> `0.16.1`
- Refactored internal `Selection` tree access logic to improve maintainability. No public API changes.
- Applied selected Clippy (`pedantic` and `nursery`) suggestions to improve overall code quality.
- Markdown: moved `NodeRef::md` and `Document::md` implementations into a dedicated `serializing/md` submodule for better internal organization. No public API changes.
New Features
- Introduced `Tree::new_element_qualname`, allowing creation of elements with a specific `html5ever::QualName`.
Changes
- Improved performance of `Element::add_class` for both empty and existing `class` attributes.
- Optimized `Element::remove_class` using a `Vec`-based approach, preserving the original class order while improving performance.
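The order-preserving, `Vec`-based idea mentioned for `Element::remove_class` can be sketched on plain strings. This is a minimal standalone illustration, not the crate's implementation: `remove_classes` is a hypothetical free function, and the assumption that classes are whitespace-separated tokens is ours.

```rust
/// Hypothetical helper (not part of dom_query): removes every class named in
/// `to_remove` from a `class` attribute value. The surviving classes keep
/// their original relative order because they are collected into a `Vec`.
fn remove_classes(class_attr: &str, to_remove: &str) -> String {
    // Classes to drop, split on whitespace.
    let unwanted: Vec<&str> = to_remove.split_whitespace().collect();

    // Walk the existing classes in order and keep the ones not listed.
    let kept: Vec<&str> = class_attr
        .split_whitespace()
        .filter(|class| !unwanted.contains(class))
        .collect();

    kept.join(" ")
}

fn main() {
    let updated = remove_classes("card active large", "active");
    assert_eq!(updated, "card large");
    println!("{updated}");
}
```

A set-based approach would also remove the classes, but it would not guarantee that the remaining ones come back in their original order.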
Bug Fixes
- Fixed an incorrect namespace in `Tree::new_element`. Previously, manually created void elements could serialize with closing tags due to the wrong namespace. Elements created during HTML parsing were not affected.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.26.0...0.27.0
Changes
- Updated dependencies:
  - `html5ever`: 0.36.1 → 0.38.0
  - `tendril`: 0.4.3 → 0.5.0
  - `selectors`: 0.35.0 → 0.36.0
- Refactored internal `TreeNodeOps` methods; no functional or API changes.
Bug Fixes
- `NodeRef::immediate_text` now returns text for text nodes, not only the immediate text of child elements.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.25.1...0.26.0
Changes
- Rolled back dependencies due to an upstream issue (details in [#168](https://github.com/niklak/dom_query/pull/168)):
  - `html5ever`: 0.37.1 → 0.36.1
  - `tendril`: 0.5.0 → 0.4.3
- Reverted code changes introduced to maintain compatibility with newer versions of `html5ever` and `tendril`.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.25.0...0.25.1
Changes
- Updated dependencies:
  - `selectors`: 0.33.0 → 0.35.0
  - `html5ever`: 0.36.1 → 0.37.1
  - `tendril`: 0.4.3 → 0.5.0
- Refactored internal code to stay compatible with the latest `html5ever` and `tendril` APIs.
- Reworked the internal structure of the `markdown` feature; no public API changes.
Bug Fixes
- Markdown: block-level elements are now properly handled inside list elements (`ul`, `ol`), resulting in more correct and predictable Markdown output.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.24.0...0.25.0
Changes
- Updated internal implementation of `NodeRef::find` for cleaner logic and improved maintainability.
- Exposed `TreeNode` to the public API.
- Dependency updates:
  - `selectors`: 0.31.1 -> 0.33.0
  - `cssparser`: 0.35.0 -> 0.36.0
  - `html5ever`: 0.35.0 -> 0.36.1
Benchmarks
- Added benchmark for `Selection`.
- Added benchmark for `NodeRef::normalized_char_count`.
- Added *parsing* benchmark for `Document::from`.
- Added benchmark for `Document::html`.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.23.1...0.24.0
Changes
- Markdown: `MDSerializer::find_code_language` now uses `MDSerializer::find_code_language_css_class` as a fallback when it fails to detect the `code` language from ancestor elements. (by @justahero)
Bug Fixes
- Markdown: Escaped double quotes in `title` attributes of `<a>` elements. (by @justahero)
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.23.0...0.23.1
New Features
- Implemented `Element::attr_ref` method, which returns an `&str` reference to the attribute value by `html5ever::LocalName`.
- Re-exported `html5ever::LocalName` and `html5ever::local_name` for easier access.
- Markdown: Enhanced `<pre>`-block parsing by checking `data-lang` and `data-language` attributes. (by @justahero)
- Markdown: Multiline `<code>` blocks are now parsed as `<pre>` blocks. (by @justahero)
Changes
- Revised `NodeRef::find_descendants` (requires `mini_selector` feature). This method now supports the `Adjacent (+)` and `Sibling (~)` combinators.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.22.0...0.23.0
New Features
- Added `Tree::is_mathml_annotation_xml_integration_point` method to check whether a node is a MathML annotation XML integration point.
Changes
- Updated dependencies:
  - `foldhash`: 0.1.5 → 0.2.0
  - `hashbrown`: 0.15.3 → 0.16.0
- Refactored `dom_tree::helpers::normalized_char_count` for cleaner implementation.
- Set MSRV (Minimum Supported Rust Version) to 1.75.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
New Features
- Introduced `Tree::head` and `Tree::body` methods, which return `None` if the corresponding element is absent (e.g., fragments typically lack `<head>`/`<body>`). Added equivalent `Document::head` and `Document::body` methods.
Changes
- Minor refactor of `TreeNode::adjust` method; no functional or API changes.
Bug Fixes
- Revised `Document::create_element`. Now the `template` element precedes its `Fragment`, allowing HTML trees with templates to be merged more predictably.
- Skip merging trees (and all related operations) when the main tree is empty (e.g., a document created via `Document::default()`).
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.20.0...0.21.0
New Features
- Introduced `Selection::select_matcher_iter`, a method that returns an iterator over all nodes matching a given matcher, without collecting them into a result vector. This approach is more efficient for read-only use cases where a full collection isn't needed.
Changes
- `NodeRef` now implements the `Copy` trait, simplifying usage in performance-critical scenarios.
- Minor internal code refactoring for improved readability and maintainability.
- `examples/manipulate_html.rs`: added an assertion that `doc.select(".remove-it").exists()` is false. (by @aaronsturm in [#110](https://github.com/niklak/dom_query/pull/110))
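To show why a `Copy` node handle simplifies calling code, here is a small standalone model. The `Tree` and `NodeRef` types below are simplified stand-ins written for this sketch (an id plus a shared reference), not the crate's definitions, and `describe` is a hypothetical helper.

```rust
/// Simplified stand-in for a DOM tree: just a list of element names.
struct Tree {
    names: Vec<&'static str>,
}

/// Simplified stand-in for a node handle: an index plus a shared reference.
/// Both fields are `Copy`, so the handle can derive `Copy` as well.
#[derive(Clone, Copy)]
struct NodeRef<'a> {
    id: usize,
    tree: &'a Tree,
}

impl<'a> NodeRef<'a> {
    /// Takes `self` by value; this does not consume the caller's handle
    /// because the handle is `Copy`.
    fn name(self) -> &'a str {
        self.tree.names[self.id]
    }
}

/// Hypothetical helper that takes the handle by value.
fn describe(node: NodeRef<'_>) -> String {
    format!("#{} <{}>", node.id, node.name())
}

fn main() {
    let tree = Tree { names: vec!["html", "body"] };
    let body = NodeRef { id: 1, tree: &tree };

    let first = describe(body); // passes a copy of the handle
    let second = describe(body); // `body` is still usable, no `.clone()` needed
    assert_eq!(first, second);
}
```

Without `Copy`, the second call would need an explicit `body.clone()` or a borrowed parameter.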
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.19.0...0.20.0
New Features
- Introduced `Tree::html_root` and `Document::html_root` methods to conveniently retrieve the `<html>` root element of a document.
- Implemented `NodeRef::to_fragment`, which allows creating a deep copy of a node's contents as a standalone document fragment.
Removed
- Removed deprecated methods for better maintainability and clarity:
  - `Tree::append_prev_sibling_of`
  - `NodeRef::append_prev_sibling`
  - `NodeRef::append_prev_siblings`
  - `Selection::next`
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
New Features
- Added `NodeRef::wrap_node`, `NodeRef::wrap_html`, and `NodeRef::unwrap_node` methods. These allow wrapping a node with another node or an HTML fragment, and unwrapping it back. (by @phayes)
- Introduced `Tree::validate`, a method to perform comprehensive integrity checks on node relationships, links, and cycles within the DOM tree. (by @phayes)
Changes
- Updated dependencies:
  - `selectors` updated to 0.27.0.
  - `cssparser` updated to 0.35.0.
  - `html5ever` updated to 0.31.0.
- Improved `mini_selector::Attribute`: attribute values can now be enclosed in double quotes, single quotes, or left unquoted.
- Changed `entities::Attr` visibility to `pub`, making it publicly accessible.
- Enhanced `TreeNodeOps::append_child_of` and `TreeNodeOps::prepend_child_of` to internally call `TreeNodeOps::remove_from_parent` on the new child node. This ensures the node is properly detached from its previous parent and siblings before reattachment, preventing dangling references.
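The detach-before-attach behavior described for `append_child_of` can be modeled with a tiny arena. This is a sketch under stated assumptions: the `Arena` and `Node` types are hypothetical, and children are stored in a `Vec` for brevity, which may differ from how the crate links nodes internally.

```rust
/// Hypothetical arena node: a parent link plus an ordered list of children.
#[derive(Default)]
struct Node {
    parent: Option<usize>,
    children: Vec<usize>,
}

/// Hypothetical arena that owns all nodes; ids are indexes into `nodes`.
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    fn with_nodes(count: usize) -> Self {
        Arena {
            nodes: (0..count).map(|_| Node::default()).collect(),
        }
    }

    /// Unlinks `id` from its current parent, if it has one.
    fn remove_from_parent(&mut self, id: usize) {
        if let Some(parent) = self.nodes[id].parent.take() {
            self.nodes[parent].children.retain(|&child| child != id);
        }
    }

    /// Detaches `child` first, then appends it to `parent`, so the old
    /// parent never keeps a stale reference to the moved node.
    fn append_child_of(&mut self, parent: usize, child: usize) {
        self.remove_from_parent(child);
        self.nodes[child].parent = Some(parent);
        self.nodes[parent].children.push(child);
    }
}

fn main() {
    let mut arena = Arena::with_nodes(3);
    arena.append_child_of(0, 2);
    arena.append_child_of(1, 2); // moves node 2 from parent 0 to parent 1

    assert!(arena.nodes[0].children.is_empty());
    assert_eq!(arena.nodes[1].children, vec![2]);
    assert_eq!(arena.nodes[2].parent, Some(1));
}
```

If the detach step were skipped, node 0 would still list node 2 as a child even though node 2 points at node 1 as its parent.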
Bug Fixes
- Fixed `template` element serialization in the `NodeRef::html` and `NodeRef::inner_html` methods.
- ---
- [Examples](https://niklak.github.io/dom_query_by_example)
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.17.0...0.18.0
New Features
- Introduced the `NodeRef::strip_elements(&[&str])` method, which removes matched elements while retaining their children in the document tree.
- Introduced the `Selection::strip_elements(&[&str])` method, which performs the same operation as `NodeRef::strip_elements(&[&str])` but for every node in the `Selection`.
- Introduced the `NodeRef::retain_attrs(&[&str])` method, which allows retaining only the specified attributes of a node.
- Introduced the `Selection::retain_attrs(&[&str])` method, which performs the same operation as `NodeRef::retain_attrs` but for every node in the `Selection`.
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.16.0...0.17.0
New Features
- Added `NodeRef::element_ref`, which returns a reference to the underlying `Element` if the node is an element node.
- Added `NodeRef::qual_name_ref`, which returns a reference to the qualified name of the node.
- Added `NodeRef::has_name`, which checks if the node is an element with the given local name.
- Added `NodeRef::is_nonempty_text`, which checks if the node is a non-empty text node.
Changes
- Optimized element matching: when a single root node is present, `DescendantMatches` (using `DescendantNodes`) is now utilized as the internal iterator, improving performance by eliminating the need for duplicate checks. When multiple root nodes exist, `Matches` remains in use, ensuring proper duplicate handling.
- Updated `<NodeRef as selectors::Element>::is_link` to always return `false` as its impact on element matching was unclear and added unnecessary overhead.
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.15.2...0.16.0
Release Notes for 0.15.0 (2025-03-01)
- [Check out examples](https://niklak.github.io/dom_query_by_example)
New Features
- Implemented the `markdown` feature, which allows serializing a `Document` or `NodeRef` into Markdown text using the `md()` method.
- Implemented the `mini_selector` feature, providing a lightweight and faster alternative for element matching with limited CSS selector support. This includes additional `NodeRef` methods: `find_descendants`, `try_find_descendants`, `mini_is`, and `mini_match`.
Bug Fixes
- `Selection::select` now returns nodes in ascending order if there were multiple underlying root nodes. If there was only one root node, it still returns nodes in ascending order, just as before.
- [Changelog](./CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.14.0...0.15.0
New Features
- Implemented the `Node::id_attr` and `Node::class` methods, which return the `id` and `class` attributes of the node, respectively. The `Selection::id` and `Selection::class` methods provide the same functionality for the first node in the selection.
Changes
- Replaced `foldhash::HashSet` with `bit-set::BitSet` in the `Matches::next` method. This ensures efficient duplicate detection while iterating over matches, as `bit-set` offers better performance for this use case.
- Revised the `NodeRef::formatted_text` implementation: moved related code to a separate module, extended the formatting logic, and added more test cases.
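The bit-set approach to duplicate detection mentioned above can be illustrated with the standard library alone. This is a hand-rolled sketch, not the `bit-set` crate and not the crate's `Matches` iterator: `SeenIds` and `unique_in_order` are hypothetical names, and node ids are assumed to be small, dense integers.

```rust
/// Hypothetical minimal bit set: one bit per node id, packed into `u64` words.
struct SeenIds {
    words: Vec<u64>,
}

impl SeenIds {
    fn new() -> Self {
        SeenIds { words: Vec::new() }
    }

    /// Marks `id` as seen. Returns `true` if it was not seen before.
    fn insert(&mut self, id: usize) -> bool {
        let (word, bit) = (id / 64, id % 64);
        if word >= self.words.len() {
            // Grow on demand so the set covers the largest id so far.
            self.words.resize(word + 1, 0);
        }
        let mask = 1u64 << bit;
        let is_new = (self.words[word] & mask) == 0;
        self.words[word] |= mask;
        is_new
    }
}

/// Yields each id once, in first-seen order, skipping later duplicates.
fn unique_in_order(ids: &[usize]) -> Vec<usize> {
    let mut seen = SeenIds::new();
    ids.iter().copied().filter(|&id| seen.insert(id)).collect()
}

fn main() {
    let matches = [3, 70, 3, 1, 70, 200];
    assert_eq!(unique_in_order(&matches), vec![3, 70, 1, 200]);
}
```

For dense integer ids, membership checks reduce to a shift and a mask, with no hashing involved.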
Bug Fixes
- Fixed an issue where `DescendantNodes` could traverse beyond the initial node when iterating over descendants, affecting `NodeRef::descendants` and `NodeRef::descendants_it`.
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.13.3...0.14.0
New Features
- Text Processing Enhancements:
  - Added `NodeRef::normalized_char_count` method to estimate character count in descendant nodes' text after normalization.
  - Introduced new formatted text retrieval methods across different levels:
    - `Document::formatted_text` for document-level text formatting
    - `Selection::formatted_text` for handling formatted text in selections
    - `NodeRef::formatted_text` for node-specific formatted text extraction
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.12.0...0.13.0
New Features
- NodeRef Enhancements:
  - Added `NodeRef::is_match` and `NodeRef::is` methods for checking if a node matches a given matcher (`&Matcher`) or selector (`&str`) without creating a `Selection` object.
  - Introduced `NodeRef::find`, an experimental method for finding all descendant elements matching a given path. It is significantly faster than `Selection::select`.
- Tree and Document Enhancements:
  - Added `Document::base_uri` and `NodeRef::base_uri` for retrieving the base URI of a document using the `href` attribute of the `<base>` element. Inspired by [Node: baseURI property](https://developer.mozilla.org/en-US/docs/Web/API/Node/baseURI).
Improvements
- Selection:
  - Reduced usage of `RefCell::borrow` and `RefCell::borrow_mut` to simplify internal code.
- Matches:
  - Optimized internal code to improve selection performance.
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.11.0...0.12.0
New Features
- Atomic Feature:
  - Added the `atomic` feature, which replaces `NodeData`'s use of `StrTendril` with `Tendril<tendril::fmt::UTF8, tendril::Atomic>`.
  - This enables `NodeData` and related structures, including `Document`, to implement the `Send` trait.
- NodeRef Enhancements:
  - Introduced `NodeRef::insert_siblings_after`, enabling the insertion of a node along with its siblings after a selected node.
  - Introduced `NodeRef::before_html` and `NodeRef::after_html` for inserting HTML content before or after a specific node.
- Selection Enhancements:
  - Introduced `Selection::set_text` to set the content of each node in the selection to specified content.
Improvements
- Internal code changes reduce calls to `RefCell::borrow` and `RefCell::borrow_mut`.
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.10.0...0.11.0
New Features
- Introduced new node manipulation methods:
  - `NodeRef::insert_after`: Insert a node after the selected node
  - `NodeRef::descendants_it`: Iterate over all descendants of a node
  - `NodeRef::descendants`: Get a vector containing all descendants of a node
- Added normalization functionality:
  - `NodeRef::normalize`: Merges adjacent text nodes and removes empty text nodes at node level
  - `Document::normalize`: Performs normalization across the entire document
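The normalization rule described above (merge adjacent text nodes, drop empty ones) can be sketched on a flat list of children. This is a standalone model rather than the crate's code: the `Child` enum and the free function `normalize` are hypothetical, and a real DOM would apply the same rule at every level of the tree.

```rust
/// Hypothetical child node: either a text run or an element name.
#[derive(Debug, PartialEq)]
enum Child {
    Text(String),
    Element(&'static str),
}

/// Merges adjacent text nodes and removes empty text nodes, keeping
/// elements and the overall order untouched.
fn normalize(children: Vec<Child>) -> Vec<Child> {
    let mut out: Vec<Child> = Vec::new();
    for child in children {
        match child {
            // Empty text nodes are dropped.
            Child::Text(text) if text.is_empty() => {}
            // Non-empty text is appended to a preceding text node if any.
            Child::Text(text) => match out.last_mut() {
                Some(Child::Text(prev)) => prev.push_str(&text),
                _ => out.push(Child::Text(text)),
            },
            // Elements pass through and break up text runs.
            element => out.push(element),
        }
    }
    out
}

fn main() {
    let children = vec![
        Child::Text("Hello".to_string()),
        Child::Text(String::new()),
        Child::Text(", world".to_string()),
        Child::Element("br"),
        Child::Text(String::new()),
    ];
    let expected = vec![
        Child::Text("Hello, world".to_string()),
        Child::Element("br"),
    ];
    assert_eq!(normalize(children), expected);
}
```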
Deprecations
- Several methods have been deprecated in favor of more consistently named alternatives:
  - `NodeRef::append_prev_sibling` → `NodeRef::insert_before`
  - `NodeRef::append_prev_siblings` → `NodeRef::insert_siblings_before`
  - `Tree::append_prev_sibling_of` → `Tree::insert_before_of`
Improvements
- Modified `Document::from` and `Document::fragment` to disable scripting in HTML parser, enabling proper querying of `noscript` elements
Fixed
- `Document::text` method now returns the text content, whereas previously it returned an empty string
Minor Changes
- Added support for ordered comparison of node IDs through implementation of `Ord` trait for `NodeId`
Migration Guide
- Users are encouraged to update their code to use the new method names for better maintainability. The deprecated methods will be removed in a future release.
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.9.1...0.10.0
Changes
- Added reverse iteration support in `Tree::child_ids_of_it` and `NodeRef::children_it` (use `rev: true` for reverse order)
- Improved internal logic for selection operations (`Selection::append_selection` and `Selection::replace_with_selection`)
New Features
- Added new node prepending methods:
  - `NodeRef::prepend_child`: Insert a single child at the start
  - `NodeRef::prepend_children`: Insert multiple children at the start
  - `NodeRef::prepend_html`: Parse and insert HTML content at the start
  - `Selection::prepend_html`: Parse and insert HTML at the start of all matched nodes
- Introduced new selection extension methods:
  - `Selection::ancestors`: Get ancestors of matched elements
  - `Selection::add_selection`: Add another selection to current selection
Bug Fixes
- Enhanced selection operations to properly handle multiple nodes and cross-tree selections:
  - Fixed `Selection::append_selection`
  - Fixed `Selection::replace_with_selection`
- Improved node attachment methods to automatically handle node detachment from previous parent:
  - `Node::append_child`
  - `Node::append_children`
  - `Node::prepend_child`
  - `Node::prepend_children`
Critical Fix
- Fixed a significant issue in `NodeRef::first_element_child`, where the method incorrectly checked the node itself instead of its children. This fix may affect existing code that relies on this method, including:
  - Behavior of the CSS `:has()` selector
  - First-element-child detection logic
  - Any code depending on this method's previous behavior
- Please review any code that depends on `NodeRef::first_element_child` after upgrading to ensure compatibility with the corrected behavior.
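The corrected semantics (inspect the children, not the node itself) can be modeled as follows. This is an illustrative sketch only: `Kind`, `Node`, `sample_tree`, and the free function `first_element_child` are hypothetical and do not mirror the crate's internals.

```rust
/// Hypothetical node kinds for the sketch.
enum Kind {
    Element,
    Text,
}

/// Hypothetical arena node: its kind and the ids of its children, in order.
struct Node {
    kind: Kind,
    children: Vec<usize>,
}

/// Returns the first child of `id` that is an element.
/// Note that the kind of `id` itself is never inspected.
fn first_element_child(nodes: &[Node], id: usize) -> Option<usize> {
    nodes[id]
        .children
        .iter()
        .copied()
        .find(|&child| matches!(nodes[child].kind, Kind::Element))
}

/// Models `<p>text<b></b></p>`: node 0 is `<p>`, node 1 is the text node,
/// node 2 is `<b>`.
fn sample_tree() -> Vec<Node> {
    vec![
        Node { kind: Kind::Element, children: vec![1, 2] },
        Node { kind: Kind::Text, children: vec![] },
        Node { kind: Kind::Element, children: vec![] },
    ]
}

fn main() {
    let nodes = sample_tree();
    // The text node is skipped; `<b>` is the first element child of `<p>`.
    assert_eq!(first_element_child(&nodes, 0), Some(2));
    // `<b>` is an element but has no children, so the result is `None`.
    assert_eq!(first_element_child(&nodes, 2), None);
}
```

The second assertion is the case a self-check would get wrong: an element with no children must not count as its own first element child.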
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.8.0...0.9.0
Major Changes
- Simplified internal architecture by replacing generic types with concrete `NodeData` type
- Moved implementations from `Node` to `NodeRef` (`Node` is now an alias for `NodeRef`)
- Simplified the internal logic of several methods (`replace_with_html`, `set_html`, `append_html`) using `Tree::merge`
New Features
- Added new Selection filtering methods:
  - `Selection::filter`
  - `Selection::filter_matcher`
  - `Selection::try_filter`
  - `Selection::filter_selection`
- Introduced new NodeRef manipulation methods:
  - `NodeRef::replace_with` - replace a node with another one
  - `NodeRef::replace_with_html` - replace a node with HTML content
Minor Changes
- Simplified `Node::has_text` functionality
Bug Fixes
- Corrected `<NodeRef<'a> as selectors::Element>::is_empty` to handle line breaks and whitespace properly.
Removals
- The following deprecated methods have been removed:
  - `Tree::append_children_from_another_tree`
  - `Tree::append_prev_siblings_from_another_tree`
  - `Node::append_children_from_another_tree`
  - `Node::append_prev_siblings_from_another_tree`
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.7.0...0.8.0
Node Tree API
- Added the `Node::ancestors` method to get all ancestors of a node, or a limited number of them
- New methods for working with related elements:
  - `Node::element_children`: returns child nodes of `NodeData::Element` type
  - `Node::children_it`: provides an iterator over child nodes
  - `Node::ancestors_it`: provides an iterator over ancestor nodes
- Added the `Node::immediate_text` method to get a node's text without its descendants' text
  - Similarly, `Selection::immediate_text` does the same for every node in the selection
Tree API Extensions
- New methods for working with node IDs:
  - `Tree::child_ids_of` and `Tree::child_ids_of_it`: get child node IDs as a vec and as an iterator, respectively
  - `Tree::ancestor_ids_of` and `Tree::ancestor_ids_of_it`: get ancestor node IDs as a vec and as an iterator, respectively
Selector Improvements
- Enabled support for `:is()` and `:where()` pseudo-classes
- Implemented `:has` support from the `selectors` crate
- Added `From<Vec<Node>>` implementation for `Selection`
Performance Optimizations
- Improved performance for several core operations:
  - `Document::from`
  - `Selection::select`
  - Other related methods
- Switched from `rustc-hash` to `foldhash`
API Changes
- Exposed `Matcher::match_element` for use outside the crate
- Updated `selectors::Element` implementation for `Node<'a>::opaque`
- Removed `&mut` requirement from `Selection` methods
Dependencies
- Updated to `selectors` v0.26.0
- [Changelog](CHANGELOG.md)
- Full Changelog: https://github.com/niklak/dom_query/compare/0.6.0...0.7.0
Changed
- Exposed `Document::tree`.
- `Selection` methods that previously required `&mut` no longer require it.
- Changed the project structure: modules are now divided based on the `struct` implementations.
Added
- Added `Node::append_html` and `Node::set_html` methods for creating child nodes of a single selected node.
- Added `Tree<NodeData>::new_element`, an easy way to create an empty element with a given name.
- Added `NodeRef::last_child`.
- Added `Node::has_attr` method, which returns `true` if an attribute exists on the node element.
  - `Selection::has_attr` does the same thing for the first node in the selection.
- Added `Node::remove_all_attrs` method for removing all attributes of a node.
  - `Selection::remove_all_attrs` does the same thing for every node in the selection.
- Added `Node::remove_attrs` method, a convenient way to remove multiple attributes from the node element.
Added
- Added `select_single_matcher` and `select_single` methods for `Document` and `Selection`.
- Added `Document::fragment`, which creates a document fragment.
Changed
- Updated documentation.
- *A small breaking change*: the `From` implementation for `Document` is now based on `Into<StrTendril>`. Because of that, the previous `From<&String>` implementation no longer works (the two are in conflict). If your code passes a `&String`, use `String::as_str()` instead.
- Refactored the code (`NodeData::Element`).
- [Changelog](./CHANGELOG.md)
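The `&String` migration described in the breaking change above can be illustrated with a standalone model. Everything here is a stand-in written for this sketch: `Text` plays the role of `StrTendril` (assumed to convert from `&str` and `String` but not from `&String`), and `Document` and `load` are hypothetical, not the crate's types.

```rust
/// Stand-in for `StrTendril`: convertible from `&str` and `String` only.
struct Text(String);

impl From<&str> for Text {
    fn from(s: &str) -> Self {
        Text(s.to_owned())
    }
}

impl From<String> for Text {
    fn from(s: String) -> Self {
        Text(s)
    }
}

/// Stand-in for the document type.
struct Document {
    html: Text,
}

/// One blanket conversion covering every type that converts into `Text`.
impl<T: Into<Text>> From<T> for Document {
    fn from(html: T) -> Self {
        Document { html: html.into() }
    }
}

/// Hypothetical caller that only has a `&String` at hand.
fn load(contents: &String) -> Document {
    // `Document::from(contents)` would not compile in this model, because
    // `&String` does not implement `Into<Text>`. Borrowing as `&str` works.
    Document::from(contents.as_str())
}

fn main() {
    let contents = String::from("<p>hi</p>");
    let doc = load(&contents);
    assert_eq!(doc.html.0, "<p>hi</p>");
}
```

Passing an owned `String` also works in this model and avoids the extra copy made when converting from `&str`.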
