"""
|
|
Tokenizers for diff operations.
|
|
|
|
This module provides various tokenization strategies for use with the diff system.
|
|
New tokenizers can be easily added by:
|
|
1. Creating a new module in this directory
|
|
2. Importing and registering it in the TOKENIZERS dictionary below
|
|
"""
|
|
|
|
from .natural_text import tokenize_words
|
|
from .words_and_html import tokenize_words_and_html
|
|
|
|
# Tokenizer registry - maps tokenizer names to functions
|
|
TOKENIZERS = {
|
|
'words': tokenize_words,
|
|
'words_and_html': tokenize_words_and_html,
|
|
}
|
|
|
|
__all__ = [
|
|
'tokenize_words',
|
|
'tokenize_words_and_html',
|
|
'TOKENIZERS',
|
|
]
|
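
# --- Illustrative sketch: registering a new tokenizer (assumed names, not part of this module) ---
# This follows the two steps from the docstring above, using a hypothetical
# `.sentences` module that exposes `tokenize_sentences`. The module name,
# function name, and signature are assumptions for illustration only.
#
#   from .sentences import tokenize_sentences
#
#   TOKENIZERS['sentences'] = tokenize_sentences
#   __all__.append('tokenize_sentences')
#
# Callers could then look the tokenizer up by name at runtime, e.g.
# TOKENIZERS['sentences'](text), which is the point of keeping the registry
# as a plain dictionary.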