Tokenizer

A small library for converting tokenized PHP source code into XML (and potentially other formats)

From theseer · Updated June 5, 2026

A small library for converting tokenized PHP source code into XML. The project is written primarily in PHP, is distributed under a license listed as "Other", and was first published in 2017. It has 5,194 stars and 23 forks on GitHub. Key topics: php, tokenizer, xml.

Latest release: 2.0.1 (December 8, 2025)

Installation

You can add this library as a local, per-project dependency to your project using Composer:

composer require theseer/tokenizer

If you only need this library during development, for instance to run your project's test suite, then you should add it as a development-time dependency:

composer require --dev theseer/tokenizer

Usage examples

php
$tokenizer = new TheSeer\Tokenizer\Tokenizer();
$tokens = $tokenizer->parse(file_get_contents(__DIR__ . '/src/XMLSerializer.php'));

$serializer = new TheSeer\Tokenizer\XMLSerializer();
$xml = $serializer->toXML($tokens);

echo $xml;

The generated XML structure looks something like this:

xml
<?xml version="1.0"?>
<source xmlns="https://github.com/theseer/tokenizer">
  <line no="1">
    <token name="T_OPEN_TAG">&lt;?php </token>
    <token name="T_DECLARE">declare</token>
    <token name="T_OPEN_BRACKET">(</token>
    <token name="T_STRING">strict_types</token>
    <token name="T_WHITESPACE"> </token>
    <token name="T_EQUAL">=</token>
    <token name="T_WHITESPACE"> </token>
    <token name="T_LNUMBER">1</token>
    <token name="T_CLOSE_BRACKET">)</token>
    <token name="T_SEMICOLON">;</token>
  </line>
</source>
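Because every element in the output lives in the `https://github.com/theseer/tokenizer` namespace, XPath queries against it need a registered prefix. The following is a minimal sketch of reading such a document with PHP's built-in DOM extension. It is not part of the library: the `tok` prefix is an arbitrary choice made here, and the input is a hard-coded copy of the sample shown above rather than real serializer output.

```php
<?php
declare(strict_types=1);

// Sample input copied from the example output above. In a real script this
// string would be the return value of the serializer.
$xml = <<<'XML'
<?xml version="1.0"?>
<source xmlns="https://github.com/theseer/tokenizer">
  <line no="1">
    <token name="T_OPEN_TAG">&lt;?php </token>
    <token name="T_DECLARE">declare</token>
    <token name="T_OPEN_BRACKET">(</token>
    <token name="T_STRING">strict_types</token>
    <token name="T_WHITESPACE"> </token>
    <token name="T_EQUAL">=</token>
    <token name="T_WHITESPACE"> </token>
    <token name="T_LNUMBER">1</token>
    <token name="T_CLOSE_BRACKET">)</token>
    <token name="T_SEMICOLON">;</token>
  </line>
</source>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// The document uses a default namespace, so XPath needs an explicit prefix.
// 'tok' is a hypothetical prefix picked for this sketch.
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('tok', 'https://github.com/theseer/tokenizer');

// Count how often each token name occurs.
$counts = [];
foreach ($xpath->query('//tok:line/tok:token') as $token) {
    $name = $token->getAttribute('name');
    $counts[$name] = ($counts[$name] ?? 0) + 1;
}

// Rebuild the source text of line 1 by concatenating the token contents.
$line = '';
foreach ($xpath->query('//tok:line[@no="1"]/tok:token') as $token) {
    $line .= $token->textContent;
}

print_r($counts);
echo $line, PHP_EOL;
```

Concatenating only the `token` elements ignores the indentation between them, so the rebuilt line depends on the token contents alone.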

This article is auto-generated from theseer/tokenizer via the GitHub API. Last fetched: 6/17/2026