Mdsplit
`mdsplit` is a Python command-line tool to **split Markdown files** into chapters **at a given [heading level](https://spec.commonmark.org/0.31.2/#atx-headings)**. The project is written primarily in Python, distributed under the MIT License, and first published in 2022. Key topics include: markdown, splitter.
Each chapter (or subchapter) is written to its own file,
which is named after the heading title.
These files are written to subdirectories representing the document's structure.
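The mapping from headings to files can be pictured with a small sketch. It is only an illustration of the naming scheme described above, not mdsplit's own code: `chapter_path` is a hypothetical helper, and the real tool may treat characters that are invalid in file names differently.

```python
from pathlib import PurePosixPath


def chapter_path(out_dir, heading_titles):
    """Map a chain of heading titles (outermost first) to an output path.

    Hypothetical helper: parent headings become directories and the
    innermost heading names the Markdown file.
    """
    *parents, leaf = heading_titles
    return PurePosixPath(out_dir, *parents, leaf + ".md")


print(chapter_path("out", ["Heading 1"]))                  # out/Heading 1.md
print(chapter_path("out", ["HeadingTwo", "Heading 2.1"]))  # out/HeadingTwo/Heading 2.1.md
```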
Optionally you can create:
- a table of contents (`toc.md`) for each input file
- navigation footers (links to table of contents, previous page, next page)
Note:
- Code blocks (`` ``` ``) are detected, and headings inside them are ignored
- The output is guaranteed to be identical with the input
  (except for the separation into multiple files of course)
  - This means: no touching of whitespace or changing `-` to `*` of your lists
    like some visual Markdown editors tend to do
- Text before the first heading is written to a file with the same name as the Markdown file
- Chapters with the same heading name are written to the same file.
- Reading from `stdin` is supported
- Can easily handle large files,
  e.g. a 1 GB file is split into 30k files in 35 seconds on a 2015 Thinkpad (with an SSD)
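The first two notes can be illustrated with a minimal sketch. It is not mdsplit's implementation: `split_chapters` and both regular expressions are assumptions made for this example, and the fence handling is deliberately simplified (any fence line toggles the state). Lines are passed through unchanged, so joining the chunks gives back the input.

```python
import re

# Simplified patterns, loosely following CommonMark; not mdsplit's own code.
ATX_HEADING = re.compile(r"^(#{1,6})(\s|$)")
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def split_chapters(lines, max_level=1):
    """Yield chunks (lists of lines), starting a new chunk at every ATX
    heading of level <= max_level that is not inside a fenced code block."""
    chunk = []
    in_fence = False
    for line in lines:
        if FENCE.match(line):
            # Simplification: any fence line opens or closes a code block.
            in_fence = not in_fence
        elif not in_fence:
            match = ATX_HEADING.match(line)
            if match and len(match.group(1)) <= max_level and chunk:
                yield chunk
                chunk = []
        chunk.append(line)  # lines are never modified
    if chunk:
        yield chunk


fence = "`" * 3
text = f"intro\n# A\nbody\n{fence}\n# not a heading\n{fence}\n# B\nmore\n"
chunks = list(split_chapters(text.splitlines(keepends=True)))

print(len(chunks))  # 3: the intro, chapter A (with its code block), chapter B
# Concatenating the chunks reproduces the input exactly.
assert "".join(line for chunk in chunks for line in chunk) == text
```

Because the function consumes an iterator of lines and yields one chunk at a time, the same approach also works for inputs that do not fit comfortably in memory.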
Limitations:
- Only ATX headings such as `# Heading 1` are supported.
  Setext headings (underlined headings) are not recognised.
positional arguments:
  input                 path to input file/folder (omit or set to '-' to read from stdin)

options:
  -h, --help            show this help message and exit
  -e ENCODING, --encoding ENCODING
                        force a specific encoding, default: python's default platform encoding
  -l {1,2,3,4,5,6}, --max-level {1,2,3,4,5,6}
                        maximum heading level to split, default: 1
  -t, --table-of-contents
                        generate a table of contents (one 'toc.md' per input file)
  -n, --navigation      add a navigation footer on each page (links to toc, previous page, next page)
  -o OUTPUT, --output OUTPUT
                        path to output folder (must not exist)
  -f, --force           write into output folder even if it already exists
  -v, --verbose
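For readers who want to see how such an interface is typically declared, here is a sketch of an `argparse` parser that accepts the options listed above. It is reconstructed from the help text only and is an assumption about structure, not a copy of mdsplit's source; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented mdsplit options."""
    parser = argparse.ArgumentParser(prog="mdsplit")
    parser.add_argument(
        "input", nargs="?", default="-",
        help="path to input file/folder (omit or set to '-' to read from stdin)")
    parser.add_argument(
        "-e", "--encoding", default=None,
        help="force a specific encoding, default: python's default platform encoding")
    parser.add_argument(
        "-l", "--max-level", type=int, choices=range(1, 7), default=1,
        help="maximum heading level to split, default: 1")
    parser.add_argument(
        "-t", "--table-of-contents", action="store_true",
        help="generate a table of contents (one 'toc.md' per input file)")
    parser.add_argument(
        "-n", "--navigation", action="store_true",
        help="add a navigation footer on each page")
    parser.add_argument(
        "-o", "--output", default=None,
        help="path to output folder (must not exist)")
    parser.add_argument(
        "-f", "--force", action="store_true",
        help="write into output folder even if it already exists")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


args = build_parser().parse_args(["in.md", "--max-level", "2", "--output", "out"])
print(args.input, args.max_level, args.output)  # in.md 2 out
```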
Similar projects:
You may also be interested in https://github.com/alandefreitas/mdsplit (C++-based).
Installation
Either use pip:
pip install mdsplit
mdsplit
Or simply download mdsplit.py and run it directly (it has no dependencies beyond Python itself):
python3 mdsplit.py
Usage
Show documentation and supported arguments:
mdsplit --help
Split a file at level 1 headings, e.g. `# This Heading`, and write results to an output folder based on the input name:
mdsplit in.md
Given an `in.md` with the chapters `# Heading 1` and `# HeadingTwo`, where the latter has a subsection `## Heading 2.1`, this produces:
- `in/Heading 1.md` with the `# Heading 1` chapter
- `in/HeadingTwo.md` with the `# HeadingTwo` chapter, including its `## Heading 2.1` subsection
Split a file at level 2 headings and above (levels 1 and 2), e.g. `# This Heading` and `## That Heading`, and write to a specific output directory:
mdsplit in.md --max-level 2 --output out
Given an `in.md` with the chapters `# Heading 1` and `# HeadingTwo`, where the latter has a subsection `## Heading 2.1`, this produces:
- `out/Heading 1.md` with the `# Heading 1` chapter
- `out/HeadingTwo.md` with the `# HeadingTwo` chapter up to its first subsection
- `out/HeadingTwo/Heading 2.1.md` with the `## Heading 2.1` subsection
Split Markdown from stdin:
cat in.md | mdsplit --output out
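The same stdin workflow can be driven from a Python script. The sketch below assumes `mdsplit` is installed and on the `PATH`; `mdsplit_command` and `split_text` are hypothetical helper names, and only flags listed in the help output above are used.

```python
import subprocess


def mdsplit_command(output, max_level=1, toc=False, navigation=False):
    """Build an argument list for the mdsplit CLI (documented flags only)."""
    cmd = ["mdsplit", "--max-level", str(max_level), "--output", output]
    if toc:
        cmd.append("--table-of-contents")
    if navigation:
        cmd.append("--navigation")
    return cmd


def split_text(markdown, output, **options):
    """Pipe a Markdown string to mdsplit on stdin.

    No input path is passed, so mdsplit reads from stdin. Raises
    CalledProcessError if the command exits with a non-zero status.
    """
    return subprocess.run(
        mdsplit_command(output, **options),
        input=markdown, text=True, check=True)
```

A call such as `split_text("# A\ntext\n", "out", max_level=2)` would then behave like the shell pipeline above; note that the output folder must not exist unless `--force` is added.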
Development (Ubuntu 24.04)
Add the deadsnakes PPA
and install additional python versions for testing
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.9-distutils python3.9-venv ...
Install poetry
Prepare virtual environment and download dependencies
poetry install
Run tests (for the default python version)
poetry run pytest
Run tests for all supported python versions
poetry run tox
Release new version
poetry build
poetry publish
