Compare commits

36 Commits

Author SHA1 Message Date
dgtlmoon
5484b2352e Merge branch 'master' into 3159-test-notification-send
Some checks failed
Publish Python 🐍distribution 📦 to PyPI and TestPyPI / Build distribution 📦 (push) Has been cancelled
ChangeDetection.io Container Build Test / Build linux/amd64 (alpine) (push) Has been cancelled
ChangeDetection.io Container Build Test / Build linux/arm64 (alpine) (push) Has been cancelled
ChangeDetection.io Container Build Test / Build linux/amd64 (main) (push) Has been cancelled
ChangeDetection.io Container Build Test / Build linux/arm/v7 (main) (push) Has been cancelled
ChangeDetection.io Container Build Test / Build linux/arm/v8 (main) (push) Has been cancelled
ChangeDetection.io Container Build Test / Build linux/arm64 (main) (push) Has been cancelled
ChangeDetection.io App Test / lint-code (push) Has been cancelled
Publish Python 🐍distribution 📦 to PyPI and TestPyPI / Test the built 📦 package works basically. (push) Has been cancelled
Publish Python 🐍distribution 📦 to PyPI and TestPyPI / Publish Python 🐍 distribution 📦 to PyPI (push) Has been cancelled
ChangeDetection.io App Test / test-application-3-10 (push) Has been cancelled
ChangeDetection.io App Test / test-application-3-11 (push) Has been cancelled
ChangeDetection.io App Test / test-application-3-12 (push) Has been cancelled
ChangeDetection.io App Test / test-application-3-13 (push) Has been cancelled
2025-10-09 00:52:30 +02:00
dgtlmoon
d318bb77a1 Merge branch 'master' into 3159-test-notification-send
Some checks failed
2025-10-03 17:23:17 +02:00
dgtlmoon
4216ffeca9 some WIP
Some checks failed
2025-09-18 11:41:33 +02:00
dgtlmoon
fe800fd7a4 fix colour on diff_added/diff_removed
Some checks failed
2025-09-18 09:42:35 +02:00
dgtlmoon
0781de94ad Merge branch 'master' into 3159-test-notification-send
Some checks failed
2025-09-17 13:53:34 +02:00
dgtlmoon
ec43d1afc2 Merge branch 'master' into 3159-test-notification-send
Some checks failed
2025-09-16 19:10:33 +02:00
dgtlmoon
0058103744 Add missing extension
Some checks failed
2025-09-16 16:35:51 +02:00
dgtlmoon
4608989316 Maybe this fixes it 2025-09-16 16:29:08 +02:00
dgtlmoon
19162991a9 improved error handling 2025-09-16 16:20:12 +02:00
dgtlmoon
f730db8164 fix defaults 2025-09-16 15:58:50 +02:00
dgtlmoon
7ba14b6f39 Re #3426 2025-09-16 15:53:57 +02:00
dgtlmoon
660bf3e9bb HTML improvements 2025-09-16 15:45:57 +02:00
dgtlmoon
74c275d570 WIP 2025-09-16 13:09:47 +02:00
dgtlmoon
d90ad2d845 oops
Some checks failed
2025-09-15 15:50:09 +02:00
dgtlmoon
8e68043a58 oops 2025-09-15 14:18:30 +02:00
dgtlmoon
4ab222e882 Fixing error handlers 2025-09-15 13:53:33 +02:00
dgtlmoon
623f056ebe Fixing markup safety 2025-09-15 13:53:15 +02:00
dgtlmoon
6e1c53b1bf fix error handler 2025-09-15 13:52:55 +02:00
dgtlmoon
c1a92de50c little styling fixup
Some checks failed
2025-09-10 15:10:41 +02:00
dgtlmoon
52987484ce New default notification 2025-09-10 15:02:10 +02:00
dgtlmoon
c77a970330 Merge branch 'master' into 3159-test-notification-send 2025-09-10 15:01:50 +02:00
dgtlmoon
c2eb736051 WIP 2025-09-10 14:54:24 +02:00
dgtlmoon
0bfa9fe9cf Fix links from being mashed
Some checks failed
2025-08-29 10:03:20 +02:00
dgtlmoon
dfd7e71985 Merge branch 'master' into 3159-test-notification-send
Some checks failed
2025-08-29 09:49:12 +02:00
dgtlmoon
0820dc1f97 Merge branch 'master' into 3159-test-notification-send
Some checks failed
2025-08-28 18:54:52 +02:00
dgtlmoon
4ea90138d5 Adding ability to use a wrapping template "notification.html"
Some checks failed
2025-08-28 17:18:55 +02:00
dgtlmoon
abd24c2a50 Fix up selection of correct group uuid 2025-08-28 15:25:28 +02:00
dgtlmoon
a8e402754b little cleanup for tests 2025-08-28 14:59:31 +02:00
dgtlmoon
a9a0ae0896 WIP 2025-08-28 14:30:16 +02:00
dgtlmoon
e7d82bb346 WIP 2025-08-28 11:33:46 +02:00
dgtlmoon
9f0bc0688c Use iframe for preview
Some checks failed
2025-08-22 16:56:15 +02:00
dgtlmoon
bfd5432062 WIP 2025-08-22 16:26:43 +02:00
dgtlmoon
5dd00c1e8f UI - Fixing tabs handling
Some checks failed
2025-08-22 13:13:04 +02:00
dgtlmoon
017898d9bc Update notification method with new queue system
Some checks failed
2025-08-21 12:56:09 +02:00
dgtlmoon
97e6933fef Merge branch 'master' into 3159-test-notification-send 2025-08-21 12:48:50 +02:00
dgtlmoon
51081941e3 Re #3159 - better send test handling 2025-04-30 17:07:44 +02:00
181 changed files with 3937 additions and 9279 deletions

View File

@@ -1,51 +0,0 @@
-name: 'Extract Memory Test Report'
-description: 'Extracts and displays memory test report from a container'
-inputs:
-  container-name:
-    description: 'Name of the container to extract logs from'
-    required: true
-  python-version:
-    description: 'Python version for artifact naming'
-    required: true
-  output-dir:
-    description: 'Directory to store output logs'
-    required: false
-    default: 'output-logs'
-runs:
-  using: "composite"
-  steps:
-    - name: Create output directory
-      shell: bash
-      run: |
-        mkdir -p ${{ inputs.output-dir }}
-    - name: Dump container log
-      shell: bash
-      run: |
-        echo "Disabled for now"
-        # return
-        # docker logs ${{ inputs.container-name }} > ${{ inputs.output-dir }}/${{ inputs.container-name }}-stdout-${{ inputs.python-version }}.txt 2>&1 || echo "Could not get stdout"
-        # docker logs ${{ inputs.container-name }} 2> ${{ inputs.output-dir }}/${{ inputs.container-name }}-stderr-${{ inputs.python-version }}.txt || echo "Could not get stderr"
-    - name: Extract and display memory test report
-      shell: bash
-      run: |
-        echo "Disabled for now"
-        # echo "Extracting test-memory.log from container..."
-        # docker cp ${{ inputs.container-name }}:/app/changedetectionio/test-memory.log ${{ inputs.output-dir }}/test-memory-${{ inputs.python-version }}.log || echo "test-memory.log not found in container"
-        #
-        # echo "=== Top 10 Highest Peak Memory Tests ==="
-        # if [ -f ${{ inputs.output-dir }}/test-memory-${{ inputs.python-version }}.log ]; then
-        #   grep "Peak memory:" ${{ inputs.output-dir }}/test-memory-${{ inputs.python-version }}.log | \
-        #     sed 's/.*Peak memory: //' | \
-        #     paste -d'|' - <(grep "Peak memory:" ${{ inputs.output-dir }}/test-memory-${{ inputs.python-version }}.log) | \
-        #     sort -t'|' -k1 -nr | \
-        #     cut -d'|' -f2 | \
-        #     head -10
-        #   echo ""
-        #   echo "=== Full Memory Test Report ==="
-        #   cat ${{ inputs.output-dir }}/test-memory-${{ inputs.python-version }}.log
-        # else
-        #   echo "No memory log available"
-        # fi
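
The disabled report step in this action ranks log lines by their peak-memory figure. A minimal shell sketch of the same idea follows, assuming each relevant log line contains `Peak memory: <number>`. The function name `top_peak_memory` and that log format are illustrative assumptions, not something the repository confirms.

```shell
# Hypothetical helper: print the N log lines with the largest "Peak memory:" value.
# Assumes lines look like "<anything> Peak memory: <number> <unit>".
top_peak_memory() {
  log_file="$1"
  count="${2:-10}"
  # Prefix each matching line with its numeric value, sort descending on that
  # prefix, then strip the prefix again so the original lines are printed.
  awk -F'Peak memory: ' '/Peak memory:/ { printf "%s|%s\n", $2 + 0, $0 }' "$log_file" \
    | sort -t'|' -k1,1 -nr \
    | cut -d'|' -f2- \
    | head -n "$count"
}
```

Called as `top_peak_memory test-memory.log 10`, this would print the ten heaviest entries. It uses `awk` in place of the `paste` and process-substitution pipeline so that it does not depend on bash-only syntax.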

View File

@@ -30,11 +30,11 @@ jobs:
     steps:
     - name: Checkout repository
-      uses: actions/checkout@v6
+      uses: actions/checkout@v5
     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
-      uses: github/codeql-action/init@v4
+      uses: github/codeql-action/init@v3
       with:
         languages: ${{ matrix.language }}
         # If you wish to specify custom queries, you can do so here or in a config file.
@@ -45,7 +45,7 @@ jobs:
     # Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
     # If this step fails, then you should remove it and run the build manually (see below)
     - name: Autobuild
-      uses: github/codeql-action/autobuild@v4
+      uses: github/codeql-action/autobuild@v3
     # Command-line programs to run using the OS shell.
     # 📚 https://git.io/JvXDl
@@ -59,4 +59,4 @@ jobs:
     # make release
     - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v4
+      uses: github/codeql-action/analyze@v3

View File

@@ -15,7 +15,6 @@ on:
   push:
     branches:
       - master
-      - dev
 jobs:
   metadata:
@@ -40,20 +39,12 @@ jobs:
     # Or if we are in a tagged release scenario.
     if: ${{ github.event.workflow_run.conclusion == 'success' }} || ${{ github.event.release.tag_name }} != ''
     steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v5
       - name: Set up Python 3.11
         uses: actions/setup-python@v6
         with:
           python-version: 3.11
-      - name: Cache pip packages
-        uses: actions/cache@v4
-        with:
-          path: ~/.cache/pip
-          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
-          restore-keys: |
-            ${{ runner.os }}-pip-
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
@@ -93,10 +84,10 @@ jobs:
           version: latest
           driver-opts: image=moby/buildkit:master
-      # dev branch -> :dev container tag
+      # master branch -> :dev container tag
       - name: Build and push :dev
         id: docker_build
-        if: ${{ github.ref == 'refs/heads/dev' }}
+        if: ${{ github.ref }} == "refs/heads/master"
         uses: docker/build-push-action@v6
         with:
           context: ./
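
The two `if:` conditions on the "Build and push :dev" step differ in where the comparison sits relative to `${{ }}`. When the comparison is written outside the braces, GitHub Actions appears to evaluate the whole value as a non-empty string, which is truthy, so the step would likely run on every ref. Keeping the comparison inside the expression avoids that. The branch check itself is a plain string equality, sketched here in shell with a hypothetical helper `ref_matches_branch` that is not part of the workflow:

```shell
# Hypothetical helper: succeed only when the full Git ref names the given branch.
ref_matches_branch() {
  # $1: full Git ref, e.g. refs/heads/dev
  # $2: branch name to match, e.g. dev
  [ "$1" = "refs/heads/$2" ]
}

# Usage sketch: decide whether the :dev container tag should be built.
if ref_matches_branch "refs/heads/dev" "dev"; then
  echo "build and push :dev"
fi
```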

View File

@@ -7,7 +7,7 @@ jobs:
     runs-on: ubuntu-latest
     steps:
-    - uses: actions/checkout@v6
+    - uses: actions/checkout@v5
     - name: Set up Python
       uses: actions/setup-python@v6
       with:
@@ -21,20 +21,20 @@ jobs:
     - name: Build a binary wheel and a source tarball
       run: python3 -m build
     - name: Store the distribution packages
-      uses: actions/upload-artifact@v5
+      uses: actions/upload-artifact@v4
       with:
         name: python-package-distributions
         path: dist/
   test-pypi-package:
-    name: Test the built package works basically.
+    name: Test the built 📦 package works basically.
     runs-on: ubuntu-latest
     needs:
     - build
     steps:
     - name: Download all the dists
-      uses: actions/download-artifact@v6
+      uses: actions/download-artifact@v5
       with:
         name: python-package-distributions
         path: dist/
@@ -42,39 +42,18 @@ jobs:
       uses: actions/setup-python@v6
       with:
         python-version: '3.11'
     - name: Test that the basic pip built package runs without error
       run: |
         set -ex
         ls -alR
-        # Install the first wheel found in dist/
-        WHEEL=$(find dist -type f -name "*.whl" -print -quit)
-        echo Installing $WHEEL
-        python3 -m pip install --upgrade pip
-        python3 -m pip install "$WHEEL"
+        # Find and install the first .whl file
+        find dist -type f -name "*.whl" -exec pip3 install {} \; -quit
         changedetection.io -d /tmp -p 10000 &
         sleep 3
         curl --retry-connrefused --retry 6 http://127.0.0.1:10000/static/styles/pure-min.css >/dev/null
         curl --retry-connrefused --retry 6 http://127.0.0.1:10000/ >/dev/null
-        # --- API test ---
-        # This also means that the docs/api-spec.yml was shipped and could be read
-        test -f /tmp/url-watches.json
-        API_KEY=$(jq -r '.. | .api_access_token? // empty' /tmp/url-watches.json)
-        echo Test API KEY is $API_KEY
-        curl -X POST "http://127.0.0.1:10000/api/v1/watch" \
-          -H "x-api-key: ${API_KEY}" \
-          -H "Content-Type: application/json" \
-          --show-error --fail \
-          --retry 6 --retry-delay 1 --retry-connrefused \
-          -d '{
-            "url": "https://example.com",
-            "title": "Example Site Monitor",
-            "time_between_check": { "hours": 1 }
-          }'
         killall changedetection.io
@@ -93,7 +72,7 @@ jobs:
     steps:
     - name: Download all the dists
-      uses: actions/download-artifact@v6
+      uses: actions/download-artifact@v5
       with:
         name: python-package-distributions
         path: dist/
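
Both variants of the package smoke test in this file install the first wheel found under `dist/` before starting the application. A small sketch of that selection step follows, assuming a hypothetical helper `first_wheel`. It uses `sort` and `head` in place of `find -quit`, so "first" here means first in sorted order rather than first encountered.

```shell
# Hypothetical helper: print the path of one .whl file below directory $1,
# or print nothing when no wheel exists.
first_wheel() {
  find "$1" -type f -name '*.whl' | sort | head -n 1
}

# Usage sketch (not executed here): install only when a wheel was actually built.
# WHEEL=$(first_wheel dist)
# [ -n "$WHEEL" ] && python3 -m pip install "$WHEEL"
```

Capturing the path in a variable, as one of the two variants does, makes it possible to log which wheel is installed and to fail early when `dist/` is empty.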

View File

@@ -44,20 +44,12 @@ jobs:
           - platform: linux/arm64
             dockerfile: ./.github/test/Dockerfile-alpine
     steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v5
       - name: Set up Python 3.11
         uses: actions/setup-python@v6
         with:
           python-version: 3.11
-      - name: Cache pip packages
-        uses: actions/cache@v4
-        with:
-          path: ~/.cache/pip
-          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
-          restore-keys: |
-            ${{ runner.os }}-pip-
       # Just test that the build works, some libraries won't compile on ARM/rPi etc
       - name: Set up QEMU
         uses: docker/setup-qemu-action@v3
@@ -82,5 +74,5 @@ jobs:
           file: ${{ matrix.dockerfile }}
           platforms: ${{ matrix.platform }}
           cache-from: type=gha
-          cache-to: type=gha,mode=max
+          cache-to: type=gha,mode=min

View File

@@ -7,7 +7,7 @@ jobs:
   lint-code:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v5
       - name: Lint with Ruff
         run: |
           pip install ruff
@@ -21,8 +21,6 @@ jobs:
           python3 -c "from openapi_spec_validator import validate_spec; import yaml; validate_spec(yaml.safe_load(open('docs/api-spec.yaml')))"
   test-application-3-10:
-    # Only run on push to master (including PR merges)
-    if: github.event_name == 'push' && github.ref == 'refs/heads/master'
     needs: lint-code
     uses: ./.github/workflows/test-stack-reusable-workflow.yml
     with:
@@ -30,15 +28,12 @@ jobs:
   test-application-3-11:
-    # Always run
     needs: lint-code
     uses: ./.github/workflows/test-stack-reusable-workflow.yml
     with:
       python-version: '3.11'
   test-application-3-12:
-    # Only run on push to master (including PR merges)
-    if: github.event_name == 'push' && github.ref == 'refs/heads/master'
     needs: lint-code
     uses: ./.github/workflows/test-stack-reusable-workflow.yml
     with:
@@ -46,8 +41,6 @@ jobs:
       skip-pypuppeteer: true
   test-application-3-13:
-    # Only run on push to master (including PR merges)
-    if: github.event_name == 'push' && github.ref == 'refs/heads/master'
     needs: lint-code
     uses: ./.github/workflows/test-stack-reusable-workflow.yml
     with:
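
The `if:` guard that appears on the 3.10, 3.12 and 3.13 jobs in these hunks restricts them to pushes on `master`, while the 3.11 job carries no guard. A shell sketch of that predicate, with a hypothetical function name chosen only for illustration:

```shell
# Hypothetical helper mirroring the guard
#   github.event_name == 'push' && github.ref == 'refs/heads/master'
should_run_full_matrix() {
  # $1: event name (e.g. push, pull_request)
  # $2: full Git ref (e.g. refs/heads/master)
  [ "$1" = "push" ] && [ "$2" = "refs/heads/master" ]
}

# Usage sketch: a pull request event would skip the extra Python versions.
if should_run_full_matrix "pull_request" "refs/pull/1/merge"; then
  echo "run 3.10, 3.12 and 3.13"
else
  echo "run 3.11 only"
fi
```

With the guard in place, the 3.11 job is the only one that runs on other events. Without it, every listed Python version is tested on each trigger.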

View File

@@ -15,294 +15,138 @@ on:
default: false
jobs:
# Build the Docker image once and share it with all test jobs
build:
test-application:
runs-on: ubuntu-latest
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
# Mainly just for link/flake8
- name: Set up Python ${{ env.PYTHON_VERSION }}
uses: actions/setup-python@v6
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: Cache pip packages
uses: actions/cache@v4
with:
path: ~/.cache/pip
key: ${{ runner.os }}-pip-py${{ env.PYTHON_VERSION }}-${{ hashFiles('requirements.txt') }}
restore-keys: |
${{ runner.os }}-pip-py${{ env.PYTHON_VERSION }}-
${{ runner.os }}-pip-
- name: Build changedetection.io container for testing under Python ${{ env.PYTHON_VERSION }}
run: |
echo "---- Building for Python ${{ env.PYTHON_VERSION }} -----"
# Build a changedetection.io container and start testing inside
docker build --build-arg PYTHON_VERSION=${{ env.PYTHON_VERSION }} --build-arg LOGGER_LEVEL=TRACE -t test-changedetectionio .
docker run test-changedetectionio bash -c 'pip list'
# Debug info
docker run test-changedetectionio bash -c 'pip list'
- name: We should be Python ${{ env.PYTHON_VERSION }} ...
run: |
docker run test-changedetectionio bash -c 'python3 --version'
- name: Spin up ancillary testable services
run: |
docker run test-changedetectionio bash -c 'python3 --version'
docker network create changedet-network
# Selenium
docker run --network changedet-network -d --hostname selenium -p 4444:4444 --rm --shm-size="2g" selenium/standalone-chrome:4
# SocketPuppetBrowser + Extra for custom browser test
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser --hostname sockpuppetbrowser --rm -p 3000:3000 dgtlmoon/sockpuppetbrowser:latest
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser-custom-url --hostname sockpuppetbrowser-custom-url -p 3001:3000 --rm dgtlmoon/sockpuppetbrowser:latest
- name: Save Docker image
- name: Spin up ancillary SMTP+Echo message test server
run: |
docker save test-changedetectionio -o /tmp/test-changedetectionio.tar
# Debug SMTP server/echo message back server
docker run --network changedet-network -d -p 11025:11025 -p 11080:11080 --hostname mailserver test-changedetectionio bash -c 'pip3 install aiosmtpd && python changedetectionio/tests/smtp/smtp-test-server.py'
docker ps
- name: Upload Docker image artifact
uses: actions/upload-artifact@v5
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp/test-changedetectionio.tar
retention-days: 1
# Unit tests (lightweight, no ancillary services needed)
unit-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
- name: Show docker container state and other debug info
run: |
docker load -i /tmp/test-changedetectionio.tar
set -x
echo "Running processes in docker..."
docker ps
- name: Run Unit Tests
run: |
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_notification_diff'
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_watch_model'
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_jinja2_security'
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_semver'
# Unit tests
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_notification_diff'
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_watch_model'
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_jinja2_security'
docker run test-changedetectionio bash -c 'python3 -m unittest changedetectionio.tests.unit.test_semver'
# Basic pytest tests with ancillary services
basic-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 25
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
- name: Test built container with Pytest (generally as requests/plaintext fetching)
run: |
docker load -i /tmp/test-changedetectionio.tar
# All tests
echo "run test with pytest"
# The default pytest logger_level is TRACE
# To change logger_level for pytest(test/conftest.py),
# append the docker option. e.g. '-e LOGGER_LEVEL=DEBUG'
docker run --name test-cdio-basic-tests --network changedet-network test-changedetectionio bash -c 'cd changedetectionio && ./run_basic_tests.sh'
- name: Test built container with Pytest
# PLAYWRIGHT/NODE-> CDP
- name: Playwright and SocketPuppetBrowser - Specific tests in built container
run: |
docker network inspect changedet-network >/dev/null 2>&1 || docker network create changedet-network
docker run --name test-cdio-basic-tests --network changedet-network test-changedetectionio bash -c 'cd changedetectionio && ./run_basic_tests.sh'
# Playwright via Sockpuppetbrowser fetch
# tests/visualselector/test_fetch_data.py will do browser steps
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_content.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_errorhandling.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/visualselector/test_fetch_data.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_custom_js_before_content.py'
- name: Extract memory report and logs
if: always()
uses: ./.github/actions/extract-memory-report
with:
container-name: test-cdio-basic-tests
python-version: ${{ env.PYTHON_VERSION }}
- name: Store test artifacts
if: always()
uses: actions/upload-artifact@v5
with:
name: test-cdio-basic-tests-output-py${{ env.PYTHON_VERSION }}
path: output-logs
- name: Playwright and SocketPuppetBrowser - Headers and requests
run: |
# Settings headers playwright tests - Call back in from Sockpuppetbrowser, check headers
docker run --name "changedet" --hostname changedet --rm -e "FLASK_SERVER_NAME=changedet" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000?dumpio=true" --network changedet-network test-changedetectionio bash -c 'find .; cd changedetectionio; pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_request.py; pwd;find .'
# Playwright tests
playwright-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Playwright and SocketPuppetBrowser - Restock detection
run: |
# restock detection via playwright - added name=changedet here so that playwright and sockpuppetbrowser can connect to it
docker run --rm --name "changedet" -e "FLASK_SERVER_NAME=changedet" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-port=5004 --live-server-host=0.0.0.0 tests/restock/test_restock.py'
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
# STRAIGHT TO CDP
- name: Pyppeteer and SocketPuppetBrowser - Specific tests in built container
if: ${{ inputs.skip-pypuppeteer == false }}
run: |
docker load -i /tmp/test-changedetectionio.tar
# Playwright via Sockpuppetbrowser fetch
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_content.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_errorhandling.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/visualselector/test_fetch_data.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_custom_js_before_content.py'
- name: Spin up ancillary services
- name: Pyppeteer and SocketPuppetBrowser - Headers and requests checks
if: ${{ inputs.skip-pypuppeteer == false }}
run: |
docker network create changedet-network
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser --hostname sockpuppetbrowser --rm -p 3000:3000 dgtlmoon/sockpuppetbrowser:latest
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser-custom-url --hostname sockpuppetbrowser-custom-url -p 3001:3000 --rm dgtlmoon/sockpuppetbrowser:latest
# Settings headers playwright tests - Call back in from Sockpuppetbrowser, check headers
docker run --name "changedet" --hostname changedet --rm -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "FLASK_SERVER_NAME=changedet" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000?dumpio=true" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio; pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_request.py'
- name: Playwright - Specific tests in built container
run: |
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_content.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_errorhandling.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/visualselector/test_fetch_data.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest -vv --capture=tee-sys --showlocals --tb=long --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_custom_js_before_content.py'
- name: Playwright - Headers and requests
run: |
docker run --name "changedet" --hostname changedet --rm -e "FLASK_SERVER_NAME=changedet" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000?dumpio=true" --network changedet-network test-changedetectionio bash -c 'find .; cd changedetectionio; pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_request.py; pwd;find .'
- name: Playwright - Restock detection
run: |
docker run --rm --name "changedet" -e "FLASK_SERVER_NAME=changedet" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-port=5004 --live-server-host=0.0.0.0 tests/restock/test_restock.py'
# Pyppeteer tests
pyppeteer-tests:
runs-on: ubuntu-latest
needs: build
if: ${{ inputs.skip-pypuppeteer == false }}
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
run: |
docker load -i /tmp/test-changedetectionio.tar
- name: Spin up ancillary services
run: |
docker network create changedet-network
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser --hostname sockpuppetbrowser --rm -p 3000:3000 dgtlmoon/sockpuppetbrowser:latest
- name: Pyppeteer - Specific tests in built container
run: |
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_content.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_errorhandling.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/visualselector/test_fetch_data.py'
docker run --rm -e "FLASK_SERVER_NAME=cdio" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network --hostname=cdio test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/fetchers/test_custom_js_before_content.py'
- name: Pyppeteer - Headers and requests checks
run: |
docker run --name "changedet" --hostname changedet --rm -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "FLASK_SERVER_NAME=changedet" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000?dumpio=true" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio; pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_request.py'
- name: Pyppeteer - Restock detection
run: |
docker run --rm --name "changedet" -e "FLASK_SERVER_NAME=changedet" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-port=5004 --live-server-host=0.0.0.0 tests/restock/test_restock.py'
# Selenium tests
selenium-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
run: |
docker load -i /tmp/test-changedetectionio.tar
- name: Spin up ancillary services
run: |
docker network create changedet-network
docker run --network changedet-network -d --hostname selenium -p 4444:4444 --rm --shm-size="2g" selenium/standalone-chrome:4
sleep 3
- name: Specific tests for headers and requests checks with Selenium
run: |
docker run --name "changedet" --hostname changedet --rm -e "FLASK_SERVER_NAME=changedet" -e "WEBDRIVER_URL=http://selenium:4444/wd/hub" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio; pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_request.py'
- name: Pyppeteer and SocketPuppetBrowser - Restock detection
if: ${{ inputs.skip-pypuppeteer == false }}
run: |
# restock detection via playwright - added name=changedet here so that playwright and sockpuppetbrowser can connect to it
docker run --rm --name "changedet" -e "FLASK_SERVER_NAME=changedet" -e "FAST_PUPPETEER_CHROME_FETCHER=True" -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest --live-server-port=5004 --live-server-host=0.0.0.0 tests/restock/test_restock.py'
# SELENIUM
- name: Specific tests in built container for Selenium
run: |
docker run --rm -e "WEBDRIVER_URL=http://selenium:4444/wd/hub" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest tests/fetchers/test_content.py && pytest tests/test_errorhandling.py'
# Selenium fetch
docker run --rm -e "WEBDRIVER_URL=http://selenium:4444/wd/hub" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest tests/fetchers/test_content.py && pytest tests/test_errorhandling.py'
# SMTP tests
smtp-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
- name: Specific tests in built container for headers and requests checks with Selenium
run: |
docker load -i /tmp/test-changedetectionio.tar
- name: Spin up SMTP test server
run: |
docker network create changedet-network
docker run --network changedet-network -d -p 11025:11025 -p 11080:11080 --hostname mailserver test-changedetectionio bash -c 'pip3 install aiosmtpd && python changedetectionio/tests/smtp/smtp-test-server.py'
docker run --name "changedet" --hostname changedet --rm -e "FLASK_SERVER_NAME=changedet" -e "WEBDRIVER_URL=http://selenium:4444/wd/hub" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio; pytest --live-server-host=0.0.0.0 --live-server-port=5004 tests/test_request.py'
# OTHER STUFF
- name: Test SMTP notification mime types
run: |
# SMTP content types - needs the 'Debug SMTP server/echo message back server' container from above
# "mailserver" hostname defined above
docker run --rm --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest tests/smtp/test_notification_smtp.py'
# Proxy tests
proxy-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
run: |
docker load -i /tmp/test-changedetectionio.tar
- name: Spin up services
run: |
docker network create changedet-network
docker run --network changedet-network -d --hostname selenium -p 4444:4444 --rm --shm-size="2g" selenium/standalone-chrome:4
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser --hostname sockpuppetbrowser --rm -p 3000:3000 dgtlmoon/sockpuppetbrowser:latest
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser-custom-url --hostname sockpuppetbrowser-custom-url -p 3001:3000 --rm dgtlmoon/sockpuppetbrowser:latest
- name: Test proxy Squid style interaction
# @todo Add a test via playwright/puppeteer
# squid with auth is tested in run_proxy_tests.sh -> tests/proxy_list/test_select_custom_proxy.py
- name: Test proxy squid style interaction
run: |
cd changedetectionio
./run_proxy_tests.sh
docker ps
cd ..
- name: Test proxy SOCKS5 style interaction
@@ -311,65 +155,28 @@ jobs:
./run_socks_proxy_tests.sh
cd ..
# Custom browser URL tests
custom-browser-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
run: |
docker load -i /tmp/test-changedetectionio.tar
- name: Spin up ancillary services
run: |
docker network create changedet-network
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser --hostname sockpuppetbrowser --rm -p 3000:3000 dgtlmoon/sockpuppetbrowser:latest
docker run --network changedet-network -d -e "LOG_LEVEL=TRACE" --cap-add=SYS_ADMIN --name sockpuppetbrowser-custom-url --hostname sockpuppetbrowser-custom-url -p 3001:3000 --rm dgtlmoon/sockpuppetbrowser:latest
- name: Test custom browser URL
run: |
cd changedetectionio
./run_custom_browser_url_tests.sh
cd ..
# Container startup tests
container-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
- name: Test changedetection.io container starts+runs basically without error
run: |
docker load -i /tmp/test-changedetectionio.tar
- name: Test container starts+runs basically without error
run: |
docker run --name test-changedetectionio -p 5556:5000 -d test-changedetectionio
docker run --name test-changedetectionio -p 5556:5000 -d test-changedetectionio
sleep 3
curl --retry-connrefused --retry 6 -s http://localhost:5556 |grep -q checkbox-uuid
curl --retry-connrefused --retry 6 -s -g -6 "http://[::1]:5556"|grep -q checkbox-uuid
# Should return 0 (no error) when grep finds it
curl --retry-connrefused --retry 6 -s http://localhost:5556 |grep -q checkbox-uuid
# and IPv6
curl --retry-connrefused --retry 6 -s -g -6 "http://[::1]:5556"|grep -q checkbox-uuid
# Check whether TRACE log is enabled.
# Also, check whether TRACE came from STDOUT
docker logs test-changedetectionio 2>/dev/null | grep 'TRACE log is enabled' || exit 1
# Check whether DEBUG came from STDOUT
docker logs test-changedetectionio 2>/dev/null | grep 'DEBUG' || exit 1
docker kill test-changedetectionio
- name: Test HTTPS SSL mode
@@ -377,66 +184,78 @@ jobs:
openssl req -x509 -newkey rsa:4096 -keyout privkey.pem -out cert.pem -days 365 -nodes -subj "/CN=localhost"
docker run --name test-changedetectionio-ssl --rm -e SSL_CERT_FILE=cert.pem -e SSL_PRIVKEY_FILE=privkey.pem -p 5000:5000 -v ./cert.pem:/app/cert.pem -v ./privkey.pem:/app/privkey.pem -d test-changedetectionio
sleep 3
# Should return 0 (no error) when grep finds it
# -k because its self-signed
curl --retry-connrefused --retry 6 -k https://localhost:5000 -v|grep -q checkbox-uuid
docker kill test-changedetectionio-ssl
- name: Test IPv6 Mode
run: |
# IPv6 - :: bind to all interfaces inside container (like 0.0.0.0), ::1 would be localhost only
docker run --name test-changedetectionio-ipv6 --rm -p 5000:5000 -e LISTEN_HOST=:: -d test-changedetectionio
sleep 3
# Should return 0 (no error) when grep finds it on localhost
curl --retry-connrefused --retry 6 http://[::1]:5000 -v|grep -q checkbox-uuid
docker kill test-changedetectionio-ipv6
# Signal tests
signal-tests:
runs-on: ubuntu-latest
needs: build
timeout-minutes: 10
env:
PYTHON_VERSION: ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v6
- name: Download Docker image artifact
uses: actions/download-artifact@v6
with:
name: test-changedetectionio-${{ env.PYTHON_VERSION }}
path: /tmp
- name: Load Docker image
run: |
docker load -i /tmp/test-changedetectionio.tar
- name: Test SIGTERM and SIGINT signal shutdown
- name: Test changedetection.io SIGTERM and SIGINT signal shutdown
run: |
echo SIGINT Shutdown request test
docker run --name sig-test -d test-changedetectionio
sleep 3
echo ">>> Sending SIGINT to sig-test container"
docker kill --signal=SIGINT sig-test
sleep 3
# invert the check (it should be not 0/not running)
docker ps
# check signal catch(STDERR) log. Because of
# changedetectionio/__init__.py: logger.add(sys.stderr, level=logger_level)
docker logs sig-test 2>&1 | grep 'Shutdown: Got Signal - SIGINT' || exit 1
test -z "`docker ps|grep sig-test`"
if [ $? -ne 0 ]; then
if [ $? -ne 0 ]
then
echo "Looks like container was running when it shouldnt be"
docker ps
exit 1
fi
# @todo - scan the container log to see the right "graceful shutdown" text exists
docker rm sig-test
echo SIGTERM Shutdown request test
docker run --name sig-test -d test-changedetectionio
sleep 3
echo ">>> Sending SIGTERM to sig-test container"
docker kill --signal=SIGTERM sig-test
sleep 3
# invert the check (it should be not 0/not running)
docker ps
# check signal catch(STDERR) log. Because of
# changedetectionio/__init__.py: logger.add(sys.stderr, level=logger_level)
docker logs sig-test 2>&1 | grep 'Shutdown: Got Signal - SIGTERM' || exit 1
test -z "`docker ps|grep sig-test`"
if [ $? -ne 0 ]; then
if [ $? -ne 0 ]
then
echo "Looks like container was running when it shouldnt be"
docker ps
exit 1
fi
# @todo - scan the container log to see the right "graceful shutdown" text exists
docker rm sig-test
- name: Dump container log
if: always()
run: |
mkdir output-logs
docker logs test-cdio-basic-tests > output-logs/test-cdio-basic-tests-stdout-${{ env.PYTHON_VERSION }}.txt
docker logs test-cdio-basic-tests 2> output-logs/test-cdio-basic-tests-stderr-${{ env.PYTHON_VERSION }}.txt
- name: Store everything including test-datastore
if: always()
uses: actions/upload-artifact@v4
with:
name: test-cdio-basic-tests-output-py${{ env.PYTHON_VERSION }}
path: .

.gitignore

@@ -21,7 +21,6 @@ venv/
# IDEs
.idea
.vscode/settings.json
*~
# Datastore files
datastore/


@@ -34,27 +34,23 @@ ENV OPENSSL_LIB_DIR="/usr/lib/arm-linux-gnueabihf"
ENV OPENSSL_INCLUDE_DIR="/usr/include/openssl"
# Additional environment variables for cryptography Rust build
ENV CRYPTOGRAPHY_DONT_BUILD_RUST=1
RUN --mount=type=cache,id=pip,sharing=locked,target=/tmp/pip-cache \
pip install \
--prefer-binary \
--extra-index-url https://www.piwheels.org/simple \
--extra-index-url https://pypi.anaconda.org/ARM-software/simple \
--cache-dir=/tmp/pip-cache \
--target=/dependencies \
-r /requirements.txt
RUN --mount=type=cache,target=/tmp/pip-cache \
pip install \
--extra-index-url https://www.piwheels.org/simple \
--extra-index-url https://pypi.anaconda.org/ARM-software/simple \
--cache-dir=/tmp/pip-cache \
--target=/dependencies \
-r /requirements.txt
# Playwright is an alternative to Selenium
# Excluded this package from requirements.txt to prevent arm/v6 and arm/v7 builds from failing
# https://github.com/dgtlmoon/changedetection.io/pull/1067 also musl/alpine (not supported)
RUN --mount=type=cache,id=pip,sharing=locked,target=/tmp/pip-cache \
pip install \
--prefer-binary \
--cache-dir=/tmp/pip-cache \
--target=/dependencies \
playwright~=1.56.0 \
|| echo "WARN: Failed to install Playwright. The application can still run, but the Playwright option will be disabled."
RUN --mount=type=cache,target=/tmp/pip-cache \
pip install \
--cache-dir=/tmp/pip-cache \
--target=/dependencies \
playwright~=1.48.0 \
|| echo "WARN: Failed to install Playwright. The application can still run, but the Playwright option will be disabled."
# Final image stage
FROM python:${PYTHON_VERSION}-slim-bookworm


@@ -1,9 +1,7 @@
recursive-include changedetectionio/api *
include docs/api-spec.yaml
recursive-include changedetectionio/blueprint *
recursive-include changedetectionio/conditions *
recursive-include changedetectionio/content_fetchers *
recursive-include changedetectionio/jinja2_custom *
recursive-include changedetectionio/model *
recursive-include changedetectionio/notification *
recursive-include changedetectionio/processors *


@@ -14,7 +14,7 @@ Ideal for monitoring price changes, content edits, conditional changes and more.
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring, list of websites with changes" title="Self-hosted web page change monitoring, list of websites with changes" />](https://changedetection.io)
[**Don't have time? Try our extremely affordable subscription use our proxies and support!**](https://changedetection.io)
[**Don't have time? Try our extremely affordable subscription use our proxies and support!**](https://changedetection.io)
@@ -31,7 +31,7 @@ Available when connected to a <a href="https://github.com/dgtlmoon/changedetecti
### Perform interactive browser steps
Fill in text boxes, click buttons and more, setup your changedetection scenario.
Fill in text boxes, click buttons and more, setup your changedetection scenario.
Using the **Browser Steps** configuration, add basic steps before performing change detection, such as logging into websites, adding a product to a cart, accept cookie logins, entering dates and refining searches.
@@ -54,7 +54,7 @@ Requires Playwright to be enabled.
- Know when your favourite whiskey is on sale, or other special deals are announced before anyone else
- COVID related news from government websites
- University/organisation news from their website
- Detect and monitor changes in JSON API responses
- Detect and monitor changes in JSON API responses
- JSON API monitoring and alerting
- Changes in legal and other documents
- Trigger API calls via notifications when text appears on a website
@@ -86,7 +86,7 @@ _Need an actual Chrome runner with Javascript support? We support fetching via W
We [recommend and use Bright Data](https://brightdata.grsm.io/n0r16zf7eivq) global proxy services, Bright Data will match any first deposit up to $100 using our signup link.
[Oxylabs](https://oxylabs.go2cloud.org/SH2d) is also an excellent proxy provider and well worth using, they offer Residential, ISP, Rotating and many other proxy types to suit your project.
[Oxylabs](https://oxylabs.go2cloud.org/SH2d) is also an excellent proxy provider and well worth using, they offer Residental, ISP, Rotating and many other proxy types to suit your project.
Please :star: star :star: this project and help it grow! https://github.com/dgtlmoon/changedetection.io/
@@ -106,3 +106,4 @@ $ changedetection.io -d /path/to/empty/data/dir -p 5000
Then visit http://127.0.0.1:5000 , You should now be able to access the UI.
See https://changedetection.io for more information.


@@ -64,7 +64,7 @@ def count_words_in_history(watch):
return 0
latest_key = list(watch.history.keys())[-1]
latest_content = watch.get_history_snapshot(timestamp=latest_key)
latest_content = watch.get_history_snapshot(latest_key)
return len(latest_content.split())
except Exception as e:
logger.error(f"Error counting words: {str(e)}")


@@ -1,8 +1,8 @@
#!/usr/bin/env python3
# Read more https://github.com/dgtlmoon/changedetection.io/wiki
# Semver means never use .01, or 00. Should be .1.
__version__ = '0.51.4'
__version__ = '0.50.17'
from changedetectionio.strtobool import strtobool
from json.decoder import JSONDecodeError
@@ -74,12 +74,6 @@ def main():
datastore_path = None
do_cleanup = False
# Optional URL to watch since start
default_url = None
# Set a default logger level
logger_level = 'DEBUG'
include_default_watches = True
host = os.environ.get("LISTEN_HOST", "0.0.0.0").strip()
port = int(os.environ.get('PORT', 5000))
ssl_mode = False
@@ -93,13 +87,15 @@ def main():
datastore_path = os.path.join(os.getcwd(), "../datastore")
try:
opts, args = getopt.getopt(sys.argv[1:], "6Ccsd:h:p:l:u:", "port")
opts, args = getopt.getopt(sys.argv[1:], "6Ccsd:h:p:l:", "port")
except getopt.GetoptError:
print('backend.py -s SSL enable -h [host] -p [port] -d [datastore path] -u [default URL to watch] -l [debug level - TRACE, DEBUG(default), INFO, SUCCESS, WARNING, ERROR, CRITICAL]')
print('backend.py -s SSL enable -h [host] -p [port] -d [datastore path] -l [debug level - TRACE, DEBUG(default), INFO, SUCCESS, WARNING, ERROR, CRITICAL]')
sys.exit(2)
create_datastore_dir = False
# Set a default logger level
logger_level = 'DEBUG'
# Set a logger level via shell env variable
# Used: Dockerfile for CICD
# To set logger level for pytest, see the app function in tests/conftest.py
@@ -120,10 +116,6 @@ def main():
if opt == '-d':
datastore_path = arg
if opt == '-u':
default_url = arg
include_default_watches = False
# Cleanup (remove text files that aren't in the index)
if opt == '-c':
do_cleanup = True
@@ -180,16 +172,13 @@ def main():
sys.exit(2)
try:
datastore = store.ChangeDetectionStore(datastore_path=app_config['datastore_path'], version_tag=__version__, include_default_watches=include_default_watches)
datastore = store.ChangeDetectionStore(datastore_path=app_config['datastore_path'], version_tag=__version__)
except JSONDecodeError as e:
# Don't start if the JSON DB looks corrupt
logger.critical(f"ERROR: JSON DB or Proxy List JSON at '{app_config['datastore_path']}' appears to be corrupt, aborting.")
logger.critical(str(e))
return
if default_url:
datastore.add_watch(url = default_url)
app = changedetection_app(app_config, datastore)
# Get the SocketIO instance from the Flask app (created in flask_app.py)
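
The option handling touched by this hunk can be sketched in isolation. The snippet below is only an illustration: `parse_args` and the `settings` dict are hypothetical names that do not exist in changedetection.io, and the sketch covers just a few of the options in the option string shown above, including the `-u` default-URL flag that is present on only one side of the diff.

```python
import getopt


# Hypothetical helper for illustration; the real logic lives inline in main().
def parse_args(argv):
    """Parse a subset of the options from the diff and return them as a dict."""
    settings = {
        "datastore_path": None,
        "default_url": None,
        "include_default_watches": True,
        "logger_level": "DEBUG",
        "port": 5000,
    }
    # Letters followed by ':' take a value; "6", "C", "c" and "s" are plain flags.
    opts, _args = getopt.getopt(argv, "6Ccsd:h:p:l:u:", ["port"])
    for opt, arg in opts:
        if opt == "-d":
            settings["datastore_path"] = arg
        elif opt == "-p":
            settings["port"] = int(arg)
        elif opt == "-l":
            settings["logger_level"] = arg
        elif opt == "-u":
            # Starting with an explicit URL turns off the built-in example watches.
            settings["default_url"] = arg
            settings["include_default_watches"] = False
    return settings
```

Passing `["-u", "https://example.com"]` to this sketch yields a default URL and disables the default watches, which is the behaviour the `-u` lines in the hunk describe.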


@@ -1,22 +1,9 @@
import os
from changedetectionio.strtobool import strtobool
from flask_restful import abort, Resource
from flask import request
from functools import wraps
import validators
from . import auth, validate_openapi_request
from ..validate_url import is_safe_valid_url
def default_content_type(content_type='text/plain'):
"""Decorator to set a default Content-Type header if none is provided."""
def decorator(f):
@wraps(f)
def wrapper(*args, **kwargs):
if not request.content_type:
# Set default content type in the request environment
request.environ['CONTENT_TYPE'] = content_type
return f(*args, **kwargs)
return wrapper
return decorator
class Import(Resource):
@@ -25,7 +12,6 @@ class Import(Resource):
self.datastore = kwargs['datastore']
@auth.check_token
@default_content_type('text/plain') #3547 #3542
@validate_openapi_request('importWatches')
def post(self):
"""Import a list of watched URLs."""
@@ -49,13 +35,14 @@ class Import(Resource):
urls = request.get_data().decode('utf8').splitlines()
added = []
allow_simplehost = not strtobool(os.getenv('BLOCK_SIMPLEHOSTS', 'False'))
for url in urls:
url = url.strip()
if not len(url):
continue
# If hosts that only contain alphanumerics are allowed ("localhost" for example)
if not is_safe_valid_url(url):
if not validators.url(url, simple_host=allow_simplehost):
return f"Invalid or unsupported URL - {url}", 400
if dedupe and self.datastore.url_exists(url):
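
The `default_content_type` decorator in this file follows the usual decorator-factory pattern. The sketch below reproduces that pattern without Flask so it can run on its own; `FakeRequest` and `handle_import` are hypothetical stand-ins, and the request is passed in explicitly rather than read from Flask's global `request` as the diffed code does.

```python
from functools import wraps


class FakeRequest:
    """Hypothetical stand-in for flask.request, used only in this sketch."""

    def __init__(self, content_type=None):
        self.environ = {}
        if content_type:
            self.environ["CONTENT_TYPE"] = content_type

    @property
    def content_type(self):
        return self.environ.get("CONTENT_TYPE")


def default_content_type(content_type="text/plain"):
    """Decorator factory: fill in a Content-Type when the caller sent none."""
    def decorator(f):
        @wraps(f)
        def wrapper(request, *args, **kwargs):
            if not request.content_type:
                # Only the missing case is touched; explicit values are kept.
                request.environ["CONTENT_TYPE"] = content_type
            return f(request, *args, **kwargs)
        return wrapper
    return decorator


@default_content_type("text/plain")
def handle_import(request):
    # Return the content type the handler actually observes.
    return request.content_type
```

Because the decorator only acts when no content type is present, a client that sends `application/json` still reaches the handler with its own header intact.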


@@ -1,12 +1,12 @@
import os
from changedetectionio.validate_url import is_safe_valid_url
from changedetectionio.strtobool import strtobool
from flask_expects_json import expects_json
from changedetectionio import queuedWatchMetaData
from changedetectionio import worker_handler
from flask_restful import abort, Resource
from flask import request, make_response, send_from_directory
import validators
from . import auth
import copy
@@ -121,10 +121,6 @@ class Watch(Resource):
if validation_error:
return validation_error, 400
# XSS etc protection
if request.json.get('url') and not is_safe_valid_url(request.json.get('url')):
return "Invalid URL", 400
watch.update(request.json)
return "OK", 200
@@ -175,7 +171,7 @@ class WatchSingleHistory(Resource):
response = make_response("No content found", 404)
response.mimetype = "text/plain"
else:
content = watch.get_history_snapshot(timestamp=timestamp)
content = watch.get_history_snapshot(timestamp)
response = make_response(content, 200)
response.mimetype = "text/plain"
@@ -230,7 +226,9 @@ class CreateWatch(Resource):
json_data = request.get_json()
url = json_data['url'].strip()
if not is_safe_valid_url(url):
# If hosts that only contain alphanumerics are allowed ("localhost" for example)
allow_simplehost = not strtobool(os.getenv('BLOCK_SIMPLEHOSTS', 'False'))
if not validators.url(url, simple_host=allow_simplehost):
return "Invalid or unsupported URL", 400
if json_data.get('proxy'):

View File

@@ -1,7 +1,10 @@
import copy
import yaml
import functools
from flask import request, abort
from loguru import logger
from openapi_core import OpenAPI
from openapi_core.contrib.flask import FlaskOpenAPIRequest
from . import api_schema
from ..model import watch_base
@@ -31,17 +34,9 @@ schema_delete_notification_urls['required'] = ['notification_urls']
@functools.cache
def get_openapi_spec():
"""Lazy load OpenAPI spec and dependencies only when validation is needed."""
import os
import yaml # Lazy import - only loaded when API validation is actually used
from openapi_core import OpenAPI # Lazy import - saves ~10.7 MB on startup
spec_path = os.path.join(os.path.dirname(__file__), '../../docs/api-spec.yaml')
if not os.path.exists(spec_path):
# Possibly for pip3 packages
spec_path = os.path.join(os.path.dirname(__file__), '../docs/api-spec.yaml')
with open(spec_path, 'r', encoding='utf-8') as f:
with open(spec_path, 'r') as f:
spec_dict = yaml.safe_load(f)
_openapi_spec = OpenAPI.from_dict(spec_dict)
return _openapi_spec
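
The `@functools.cache` decorator on `get_openapi_spec()` means the spec file is parsed at most once per process, and only when a request actually needs validation. A minimal sketch of that pattern, using a plain dict in place of the real YAML load and `OpenAPI.from_dict` call (the names `get_spec` and `load_count` are illustrative only):

```python
import functools

load_count = 0  # Counts how often the expensive load actually runs


@functools.cache
def get_spec():
    """Stand-in for get_openapi_spec(); the dict replaces the parsed YAML spec."""
    global load_count
    load_count += 1
    return {"openapi": "3.0.0", "paths": {}}


first = get_spec()
second = get_spec()  # Served from the cache; the function body does not run again
```

Because the function takes no arguments, the cache holds a single entry and every caller shares the same object.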
@@ -54,9 +49,6 @@ def validate_openapi_request(operation_id):
try:
# Skip OpenAPI validation for GET requests since they don't have request bodies
if request.method.upper() != 'GET':
# Lazy import - only loaded when actually validating a request
from openapi_core.contrib.flask import FlaskOpenAPIRequest
spec = get_openapi_spec()
openapi_request = FlaskOpenAPIRequest(request)
result = spec.unmarshal_request(openapi_request)

View File

@@ -96,10 +96,7 @@ def build_watch_json_schema(d):
"enum": ["html_requests", "html_webdriver"]
})
schema['properties']['processor'] = {"anyOf": [
{"type": "string", "enum": ["restock_diff", "text_json_diff"]},
{"type": "null"}
]}
# All headers must be key/value type dict
schema['properties']['headers'] = {

View File

@@ -91,7 +91,7 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore):
try:
processor_module = importlib.import_module(processor_module_name)
except ModuleNotFoundError as e:
print(f"Processor module '{processor}' not found.")
logger.error(f"Processor module '{processor}' not found.")
raise e
update_handler = processor_module.perform_site_check(datastore=datastore,
@@ -334,10 +334,6 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore):
if update_handler.fetcher.content or (not update_handler.fetcher.content and empty_pages_are_a_change):
watch.save_last_fetched_html(contents=update_handler.fetcher.content, timestamp=int(fetch_start_time))
# Explicitly delete large content variables to free memory IMMEDIATELY after saving
# These are no longer needed after being saved to history
del contents
# Send notifications on second+ check
if watch.history_n >= 2:
logger.info(f"Change detected in UUID {uuid} - {watch['url']}")
@@ -353,15 +349,12 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore):
count = watch.get('check_count', 0) + 1
# Always record page title (used in notifications, and can change even when the content is the same)
if update_obj.get('content-type') and 'html' in update_obj.get('content-type'):
try:
page_title = html_tools.extract_title(data=update_handler.fetcher.content)
if page_title:
page_title = page_title.strip()[:2000]
logger.debug(f"UUID: {uuid} Page <title> is '{page_title}'")
datastore.update_watch(uuid=uuid, update_obj={'page_title': page_title})
except Exception as e:
logger.warning(f"UUID: {uuid} Exception when extracting <title> - {str(e)}")
try:
page_title = html_tools.extract_title(data=update_handler.fetcher.content)
logger.debug(f"UUID: {uuid} Page <title> is '{page_title}'")
datastore.update_watch(uuid=uuid, update_obj={'page_title': page_title})
except Exception as e:
logger.warning(f"UUID: {uuid} Exception when extracting <title> - {str(e)}")
# Record server header
try:
@@ -379,12 +372,6 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore):
datastore.update_watch(uuid=uuid, update_obj={'fetch_time': round(time.time() - fetch_start_time, 3),
'check_count': count})
# NOW clear fetcher content - after all processing is complete
# This is the last point where we need the fetcher data
if update_handler and hasattr(update_handler, 'fetcher') and update_handler.fetcher:
update_handler.fetcher.clear_content()
logger.debug(f"Cleared fetcher content for UUID {uuid}")
except Exception as e:
logger.error(f"Worker {worker_id} unexpected error processing {uuid}: {e}")
logger.error(f"Worker {worker_id} traceback:", exc_info=True)
@@ -405,28 +392,7 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore):
#logger.info(f"Worker {worker_id} sending completion signal for UUID {watch['uuid']}")
watch_check_update.send(watch_uuid=watch['uuid'])
# Explicitly clean up update_handler and all its references
if update_handler:
# Clear fetcher content using the proper method
if hasattr(update_handler, 'fetcher') and update_handler.fetcher:
update_handler.fetcher.clear_content()
# Clear processor references
if hasattr(update_handler, 'content_processor'):
update_handler.content_processor = None
update_handler = None
# Clear local contents variable if it still exists
if 'contents' in locals():
del contents
# Note: We don't set watch = None here because:
# 1. watch is just a local reference to datastore.data['watching'][uuid]
# 2. Setting it to None doesn't affect the datastore
# 3. GC can't collect the object anyway (still referenced by datastore)
# 4. It would just cause confusion
update_handler = None
logger.debug(f"Worker {worker_id} completed watch {uuid} in {time.time()-fetch_start_time:.2f}s")
except Exception as cleanup_error:
logger.error(f"Worker {worker_id} error during cleanup: {cleanup_error}")

View File

@@ -6,7 +6,7 @@ from loguru import logger
from changedetectionio.content_fetchers import SCREENSHOT_MAX_HEIGHT_DEFAULT
from changedetectionio.content_fetchers.base import manage_user_agent
from changedetectionio.jinja2_custom import render as jinja_render
from changedetectionio.safe_jinja import render as jinja_render
@@ -439,7 +439,7 @@ class browsersteps_live_ui(steppable_browser_interface):
logger.warning("Attempted to get current state after cleanup")
return (None, None)
xpath_element_js = importlib.resources.files("changedetectionio.content_fetchers.res").joinpath('xpath_element_scraper.js').read_text(encoding="utf-8")
xpath_element_js = importlib.resources.files("changedetectionio.content_fetchers.res").joinpath('xpath_element_scraper.js').read_text()
now = time.time()
await self.page.wait_for_timeout(1 * 1000)

View File

@@ -33,7 +33,7 @@ def construct_blueprint(datastore: ChangeDetectionStore):
def long_task(uuid, preferred_proxy):
import time
from changedetectionio.content_fetchers import exceptions as content_fetcher_exceptions
from changedetectionio.jinja2_custom import render as jinja_render
from changedetectionio.safe_jinja import render as jinja_render
status = {'status': '', 'length': 0, 'text': ''}

View File

@@ -1,27 +1 @@
from copy import deepcopy
from loguru import logger
from changedetectionio.model import USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
from changedetectionio.notification import valid_notification_formats
RSS_CONTENT_FORMAT_DEFAULT = 'text'
# Some stuff not related
RSS_FORMAT_TYPES = deepcopy(valid_notification_formats)
if RSS_FORMAT_TYPES.get('markdown'):
del RSS_FORMAT_TYPES['markdown']
if RSS_FORMAT_TYPES.get(USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH):
del RSS_FORMAT_TYPES[USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH]
if not RSS_FORMAT_TYPES.get(RSS_CONTENT_FORMAT_DEFAULT):
logger.critical(f"RSS_CONTENT_FORMAT_DEFAULT not in the acceptable list {RSS_CONTENT_FORMAT_DEFAULT}")
RSS_TEMPLATE_TYPE_OPTIONS = {'system_default': 'System default', 'notification_body': 'Notification body'}
# @note: We use <pre> because nearly all RSS readers render only HTML (Thunderbird, for example, can't do just plaintext)
RSS_TEMPLATE_PLAINTEXT_DEFAULT = "<pre>{{watch_label}} had a change.\n\n{{diff}}\n</pre>"
# @todo add some [edit]/[history]/[goto] etc links
# @todo need {{watch_edit_link}} + delete + history link token
RSS_TEMPLATE_HTML_DEFAULT = "<html><body>\n<h4><a href=\"{{watch_url}}\">{{watch_label}}</a></h4>\n<p>{{diff}}</p>\n</body></html>\n"
RSS_FORMAT_TYPES = [('plaintext', 'Plain text'), ('html', 'HTML Color')]

View File

@@ -1,156 +0,0 @@
"""
Utility functions for RSS feed generation.
"""
from changedetectionio.notification.handler import process_notification
from changedetectionio.notification_service import NotificationContextData, _check_cascading_vars
from loguru import logger
import datetime
import pytz
import re
BAD_CHARS_REGEX = r'[\x00-\x08\x0B\x0C\x0E-\x1F]'
def scan_invalid_chars_in_rss(content):
"""
Scan for invalid characters in RSS content.
Returns True if invalid characters are found.
"""
for match in re.finditer(BAD_CHARS_REGEX, content):
i = match.start()
bad_char = content[i]
hex_value = f"0x{ord(bad_char):02x}"
# Grab context
start = max(0, i - 20)
end = min(len(content), i + 21)
context = content[start:end].replace('\n', '\\n').replace('\r', '\\r')
logger.warning(f"Invalid char {hex_value} at pos {i}: ...{context}...")
# First match is enough
return True
return False
def clean_entry_content(content):
"""
Remove invalid characters from RSS content.
"""
cleaned = re.sub(BAD_CHARS_REGEX, '', content)
return cleaned
def generate_watch_guid(watch, timestamp):
"""
Generate a unique GUID for a watch RSS entry.
Args:
watch: The watch object
timestamp: The timestamp of the specific change this entry represents
"""
return f"{watch['uuid']}/{timestamp}"
def validate_rss_token(datastore, request):
"""
Validate the RSS access token from the request.
Returns:
tuple: (is_valid, error_response) where error_response is None if valid
"""
app_rss_token = datastore.data['settings']['application'].get('rss_access_token')
rss_url_token = request.args.get('token')
if rss_url_token != app_rss_token:
return False, ("Access denied, bad token", 403)
return True, None
def get_rss_template(datastore, watch, rss_content_format, default_html, default_plaintext):
"""Get the appropriate template for RSS content."""
if datastore.data['settings']['application'].get('rss_template_type') == 'notification_body':
return _check_cascading_vars(datastore=datastore, var_name='notification_body', watch=watch)
override = datastore.data['settings']['application'].get('rss_template_override')
if override and override.strip():
return override
elif 'text' in rss_content_format:
return default_plaintext
else:
return default_html
def get_watch_label(datastore, watch):
"""Get the label for a watch based on settings."""
if datastore.data['settings']['application']['ui'].get('use_page_title_in_list') or watch.get('use_page_title_in_list'):
return watch.label
else:
return watch.get('url')
def add_watch_categories(fe, watch, datastore):
"""Add category tags to a feed entry based on watch tags."""
for tag_uuid in watch.get('tags', []):
tag = datastore.data['settings']['application'].get('tags', {}).get(tag_uuid)
if tag and tag.get('title'):
fe.category(term=tag.get('title'))
def build_notification_context(watch, timestamp_from, timestamp_to, watch_label,
n_body_template, rss_content_format):
"""Build the notification context object."""
return NotificationContextData(initial_data={
'notification_urls': ['null://just-sending-a-null-test-for-the-render-in-RSS'],
'notification_body': n_body_template,
'timestamp_to': timestamp_to,
'timestamp_from': timestamp_from,
'watch_label': watch_label,
'notification_format': rss_content_format
})
def render_notification(n_object, notification_service, watch, datastore,
date_index_from=None, date_index_to=None):
"""Process and render the notification content."""
kwargs = {'n_object': n_object, 'watch': watch}
if date_index_from is not None and date_index_to is not None:
kwargs['date_index_from'] = date_index_from
kwargs['date_index_to'] = date_index_to
n_object = notification_service.queue_notification_for_watch(**kwargs)
n_object['watch_mime_type'] = None
res = process_notification(n_object=n_object, datastore=datastore)
return res[0]
def populate_feed_entry(fe, watch, content, guid, timestamp, link=None, title_suffix=None):
"""Populate a feed entry with content and metadata."""
watch_label = watch.get('url') # Already determined by caller
# Set link
if link:
fe.link(link=link)
# Set title
if title_suffix:
fe.title(title=f"{watch_label} - {title_suffix}")
else:
fe.title(title=watch_label)
# Clean and set content
if scan_invalid_chars_in_rss(content):
content = clean_entry_content(content)
fe.content(content=content, type='CDATA')
# Set GUID
fe.guid(guid, permalink=False)
# Set pubDate using the timestamp of this specific change
dt = datetime.datetime.fromtimestamp(int(timestamp))
dt = dt.replace(tzinfo=pytz.UTC)
fe.pubDate(dt)
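
The `BAD_CHARS_REGEX` used by `scan_invalid_chars_in_rss` and `clean_entry_content` targets the ASCII control characters that are not legal in XML 1.0, while leaving tab (0x09), line feed (0x0A), and carriage return (0x0D) alone. A small sketch of the effect, reusing the regex from the diff on a made-up sample string:

```python
import re

# Same pattern as in the diff: control chars except \t, \n and \r
BAD_CHARS_REGEX = r'[\x00-\x08\x0B\x0C\x0E-\x1F]'


def clean_entry_content(content: str) -> str:
    """Remove characters that would break RSS/XML generation."""
    return re.sub(BAD_CHARS_REGEX, '', content)


# Illustrative input: a NUL byte and a vertical tab mixed into ordinary text
sample = "price\x00 changed\x0b\n\tok"
cleaned = clean_entry_content(sample)
has_bad_chars = re.search(BAD_CHARS_REGEX, sample) is not None
```

Here the NUL and vertical tab are dropped, and the newline and tab survive.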

View File

@@ -1,26 +1,150 @@
from changedetectionio.safe_jinja import render as jinja_render
from changedetectionio.store import ChangeDetectionStore
from flask import Blueprint
from feedgen.feed import FeedGenerator
from flask import Blueprint, make_response, request, url_for, redirect
from loguru import logger
import datetime
import pytz
import re
import time
from . import tag as tag_routes
from . import main_feed
from . import single_watch
BAD_CHARS_REGEX=r'[\x00-\x08\x0B\x0C\x0E-\x1F]'
# Anything that is not text/UTF-8 should be stripped before it breaks feedgen (such as binary data etc)
def scan_invalid_chars_in_rss(content):
for match in re.finditer(BAD_CHARS_REGEX, content):
i = match.start()
bad_char = content[i]
hex_value = f"0x{ord(bad_char):02x}"
# Grab context
start = max(0, i - 20)
end = min(len(content), i + 21)
context = content[start:end].replace('\n', '\\n').replace('\r', '\\r')
logger.warning(f"Invalid char {hex_value} at pos {i}: ...{context}...")
# First match is enough
return True
return False
def clean_entry_content(content):
cleaned = re.sub(BAD_CHARS_REGEX, '', content)
return cleaned
def construct_blueprint(datastore: ChangeDetectionStore):
"""
Construct and configure the RSS blueprint with all routes.
Args:
datastore: The ChangeDetectionStore instance
Returns:
The configured Flask blueprint
"""
rss_blueprint = Blueprint('rss', __name__)
# Register all route modules
main_feed.construct_main_feed_routes(rss_blueprint, datastore)
single_watch.construct_single_watch_routes(rss_blueprint, datastore)
tag_routes.construct_tag_routes(rss_blueprint, datastore)
# Some RSS reader situations ended up with rss/ (forward slash after RSS) due
# to some earlier blueprint rerouting work, it should goto feed.
@rss_blueprint.route("/", methods=['GET'])
def extraslash():
return redirect(url_for('rss.feed'))
# Import the login decorator if needed
# from changedetectionio.auth_decorator import login_optionally_required
@rss_blueprint.route("", methods=['GET'])
def feed():
now = time.time()
# Always requires token set
app_rss_token = datastore.data['settings']['application'].get('rss_access_token')
rss_url_token = request.args.get('token')
if rss_url_token != app_rss_token:
return "Access denied, bad token", 403
from changedetectionio import diff
limit_tag = request.args.get('tag', '').lower().strip()
# Be sure limit_tag is a uuid
for uuid, tag in datastore.data['settings']['application'].get('tags', {}).items():
if limit_tag == tag.get('title', '').lower().strip():
limit_tag = uuid
# Sort by last_changed and add the uuid which is usually the key..
sorted_watches = []
# @todo needs a .itemsWithTag() or something - then we can use that in Jinja2 and throw this away
for uuid, watch in datastore.data['watching'].items():
# @todo tag notification_muted skip also (improve Watch model)
if datastore.data['settings']['application'].get('rss_hide_muted_watches') and watch.get('notification_muted'):
continue
if limit_tag and not limit_tag in watch['tags']:
continue
watch['uuid'] = uuid
sorted_watches.append(watch)
sorted_watches.sort(key=lambda x: x.last_changed, reverse=False)
fg = FeedGenerator()
fg.title('changedetection.io')
fg.description('Feed description')
fg.link(href='https://changedetection.io')
html_colour_enable = False
if datastore.data['settings']['application'].get('rss_content_format') == 'html':
html_colour_enable = True
for watch in sorted_watches:
dates = list(watch.history.keys())
# Re #521 - Don't bother processing this one if there are fewer than 2 snapshots; that means we never had a change detected.
if len(dates) < 2:
continue
if not watch.viewed:
# Re #239 - GUID needs to be individual for each event
# @todo In the future make this a configurable link back (see work on BASE_URL https://github.com/dgtlmoon/changedetection.io/pull/228)
guid = "{}/{}".format(watch['uuid'], watch.last_changed)
fe = fg.add_entry()
# Include a link to the diff page, they will have to login here to see if password protection is enabled.
# Description is the page you watch, link takes you to the diff JS UI page
# Dict val base_url will get overridden with the env var if it is set.
ext_base_url = datastore.data['settings']['application'].get('active_base_url')
# @todo fix
# Because we are called via whatever web server, flask should figure out the right path (
diff_link = {'href': url_for('ui.ui_views.diff_history_page', uuid=watch['uuid'], _external=True)}
fe.link(link=diff_link)
# Same logic as watch-overview.html
if datastore.data['settings']['application']['ui'].get('use_page_title_in_list') or watch.get('use_page_title_in_list'):
watch_label = watch.label
else:
watch_label = watch.get('url')
fe.title(title=watch_label)
try:
html_diff = diff.render_diff(previous_version_file_contents=watch.get_history_snapshot(dates[-2]),
newest_version_file_contents=watch.get_history_snapshot(dates[-1]),
include_equal=False,
line_feed_sep="<br>",
html_colour=html_colour_enable
)
except FileNotFoundError as e:
html_diff = f"History snapshot file for watch {watch.get('uuid')}@{watch.last_changed} - '{watch.get('title')} not found."
# @todo Make this configurable and also consider html-colored markup
# @todo User could decide if <link> goes to the diff page, or to the watch link
rss_template = "<html><body>\n<h4><a href=\"{{watch_url}}\">{{watch_title}}</a></h4>\n<p>{{html_diff}}</p>\n</body></html>\n"
content = jinja_render(template_str=rss_template, watch_title=watch_label, html_diff=html_diff, watch_url=watch.link)
# Out of range chars could also break feedgen
if scan_invalid_chars_in_rss(content):
content = clean_entry_content(content)
fe.content(content=content, type='CDATA')
fe.guid(guid, permalink=False)
dt = datetime.datetime.fromtimestamp(int(watch.newest_history_key))
dt = dt.replace(tzinfo=pytz.UTC)
fe.pubDate(dt)
response = make_response(fg.rss_str())
response.headers.set('Content-Type', 'application/rss+xml;charset=utf-8')
logger.trace(f"RSS generated in {time.time() - now:.3f}s")
return response
return rss_blueprint

View File

@@ -1,105 +0,0 @@
from flask import make_response, request, url_for, redirect
def construct_main_feed_routes(rss_blueprint, datastore):
"""
Construct the main RSS feed routes.
Args:
rss_blueprint: The Flask blueprint to add routes to
datastore: The ChangeDetectionStore instance
"""
# Some RSS reader situations ended up with rss/ (forward slash after RSS) due
# to some earlier blueprint rerouting work, it should goto feed.
@rss_blueprint.route("/", methods=['GET'])
def extraslash():
return redirect(url_for('rss.feed'))
# Import the login decorator if needed
# from changedetectionio.auth_decorator import login_optionally_required
@rss_blueprint.route("", methods=['GET'])
def feed():
from feedgen.feed import FeedGenerator
from loguru import logger
import time
from . import RSS_TEMPLATE_HTML_DEFAULT, RSS_TEMPLATE_PLAINTEXT_DEFAULT
from ._util import (validate_rss_token, generate_watch_guid, get_rss_template,
get_watch_label, build_notification_context, render_notification,
populate_feed_entry, add_watch_categories)
from ...notification_service import NotificationService
now = time.time()
# Validate token
is_valid, error = validate_rss_token(datastore, request)
if not is_valid:
return error
rss_content_format = datastore.data['settings']['application'].get('rss_content_format')
limit_tag = request.args.get('tag', '').lower().strip()
# Be sure limit_tag is a uuid
for uuid, tag in datastore.data['settings']['application'].get('tags', {}).items():
if limit_tag == tag.get('title', '').lower().strip():
limit_tag = uuid
# Sort by last_changed and add the uuid which is usually the key..
sorted_watches = []
# @todo needs a .itemsWithTag() or something - then we can use that in Jinja2 and throw this away
for uuid, watch in datastore.data['watching'].items():
# @todo tag notification_muted skip also (improve Watch model)
if datastore.data['settings']['application'].get('rss_hide_muted_watches') and watch.get('notification_muted'):
continue
if limit_tag and not limit_tag in watch['tags']:
continue
sorted_watches.append(watch)
sorted_watches.sort(key=lambda x: x.last_changed, reverse=False)
fg = FeedGenerator()
fg.title('changedetection.io')
fg.description('Feed description')
fg.link(href='https://changedetection.io')
notification_service = NotificationService(datastore=datastore, notification_q=False)
for watch in sorted_watches:
dates = list(watch.history.keys())
# Re #521 - Don't bother processing this one if there are fewer than 2 snapshots; that means we never had a change detected.
if len(dates) < 2:
continue
if not watch.viewed:
# Re #239 - GUID needs to be individual for each event
# @todo In the future make this a configurable link back (see work on BASE_URL https://github.com/dgtlmoon/changedetection.io/pull/228)
watch_label = get_watch_label(datastore, watch)
timestamp_to = dates[-1]
timestamp_from = dates[-2]
guid = generate_watch_guid(watch, timestamp_to)
# Because we are called via whatever web server, flask should figure out the right path
diff_link = {'href': url_for('ui.ui_views.diff_history_page', uuid=watch['uuid'], _external=True)}
# Get template and build notification context
n_body_template = get_rss_template(datastore, watch, rss_content_format,
RSS_TEMPLATE_HTML_DEFAULT, RSS_TEMPLATE_PLAINTEXT_DEFAULT)
n_object = build_notification_context(watch, timestamp_from, timestamp_to,
watch_label, n_body_template, rss_content_format)
# Render notification
res = render_notification(n_object, notification_service, watch, datastore)
# Create and populate feed entry
fe = fg.add_entry()
populate_feed_entry(fe, watch, res['body'], guid, timestamp_to, link=diff_link)
fe.title(title=watch_label) # Override title to not include suffix
add_watch_categories(fe, watch, datastore)
response = make_response(fg.rss_str())
response.headers.set('Content-Type', 'application/rss+xml;charset=utf-8')
logger.trace(f"RSS generated in {time.time() - now:.3f}s")
return response

View File

@@ -1,115 +0,0 @@
def construct_single_watch_routes(rss_blueprint, datastore):
"""
Construct RSS feed routes for single watches.
Args:
rss_blueprint: The Flask blueprint to add routes to
datastore: The ChangeDetectionStore instance
"""
@rss_blueprint.route("/watch/<string:uuid>", methods=['GET'])
def rss_single_watch(uuid):
import time
from flask import make_response, request
from feedgen.feed import FeedGenerator
from loguru import logger
from . import RSS_TEMPLATE_HTML_DEFAULT, RSS_TEMPLATE_PLAINTEXT_DEFAULT
from ._util import (validate_rss_token, get_rss_template, get_watch_label,
build_notification_context, render_notification,
populate_feed_entry, add_watch_categories)
from ...notification_service import NotificationService
"""
Display the most recent changes for a single watch as RSS feed.
Returns RSS XML with multiple entries showing diffs between consecutive snapshots.
The number of entries is controlled by the rss_diff_length setting.
"""
now = time.time()
# Validate token
is_valid, error = validate_rss_token(datastore, request)
if not is_valid:
return error
rss_content_format = datastore.data['settings']['application'].get('rss_content_format')
# Get the watch by UUID
watch = datastore.data['watching'].get(uuid)
if not watch:
return f"Watch with UUID {uuid} not found", 404
# Check if watch has at least 2 history snapshots
dates = list(watch.history.keys())
if len(dates) < 2:
return f"Watch {uuid} does not have enough history snapshots to show changes (need at least 2)", 400
# Add uuid to watch for proper functioning
watch['uuid'] = uuid
# Get the number of diffs to include (default: 5)
rss_diff_length = datastore.data['settings']['application'].get('rss_diff_length', 5)
# Calculate how many diffs we can actually show (limited by available history)
# We need at least 2 snapshots to create 1 diff
max_possible_diffs = len(dates) - 1
num_diffs = min(rss_diff_length, max_possible_diffs) if rss_diff_length > 0 else max_possible_diffs
# Create RSS feed
fg = FeedGenerator()
# Set title: use "label (url)" if label differs from url, otherwise just url
watch_url = watch.get('url', '')
watch_label = get_watch_label(datastore, watch)
if watch_label != watch_url:
feed_title = f'changedetection.io - {watch_label} ({watch_url})'
else:
feed_title = f'changedetection.io - {watch_url}'
fg.title(feed_title)
fg.description('Changes')
fg.link(href='https://changedetection.io')
# Loop through history and create RSS entries for each diff
# Add entries in reverse order because feedgen reverses them
# This way, the newest change appears first in the final RSS
notification_service = NotificationService(datastore=datastore, notification_q=False)
for i in range(num_diffs - 1, -1, -1):
# Calculate indices for this diff (working backwards from newest)
# i=0: compare dates[-2] to dates[-1] (most recent change)
# i=1: compare dates[-3] to dates[-2] (previous change)
# etc.
date_index_to = -(i + 1)
date_index_from = -(i + 2)
timestamp_to = dates[date_index_to]
timestamp_from = dates[date_index_from]
# Get template and build notification context
n_body_template = get_rss_template(datastore, watch, rss_content_format,
RSS_TEMPLATE_HTML_DEFAULT, RSS_TEMPLATE_PLAINTEXT_DEFAULT)
n_object = build_notification_context(watch, timestamp_from, timestamp_to,
watch_label, n_body_template, rss_content_format)
# Render notification with date indices
res = render_notification(n_object, notification_service, watch, datastore,
date_index_from, date_index_to)
# Create and populate feed entry
guid = f"{watch['uuid']}/{timestamp_to}"
fe = fg.add_entry()
title_suffix = f"Change @ {res['original_context']['change_datetime']}"
populate_feed_entry(fe, watch, res.get('body', ''), guid, timestamp_to,
link={'href': watch.get('url')}, title_suffix=title_suffix)
add_watch_categories(fe, watch, datastore)
response = make_response(fg.rss_str())
response.headers.set('Content-Type', 'application/rss+xml;charset=utf-8')
logger.debug(f"RSS Single watch built in {time.time()-now:.2f}s")
return response
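
The index arithmetic in `rss_single_watch` is easier to follow in isolation. The sketch below extracts just the pairing logic into a hypothetical helper, `diff_pairs` (not part of the project), and assumes `dates` is an ordered list of snapshot timestamps, oldest first:

```python
def diff_pairs(dates, rss_diff_length=5):
    """Return (timestamp_from, timestamp_to) pairs, oldest pair first.

    Mirrors the loop in the diff: N snapshots give at most N - 1 diffs,
    and a non-positive rss_diff_length means "no limit".
    """
    max_possible_diffs = len(dates) - 1
    if rss_diff_length > 0:
        num_diffs = min(rss_diff_length, max_possible_diffs)
    else:
        num_diffs = max_possible_diffs

    pairs = []
    for i in range(num_diffs - 1, -1, -1):
        # i=0 pairs dates[-2] with dates[-1] (the most recent change)
        date_index_to = -(i + 1)
        date_index_from = -(i + 2)
        pairs.append((dates[date_index_from], dates[date_index_to]))
    return pairs


history = ['100', '200', '300', '400']
limited = diff_pairs(history, rss_diff_length=2)
unlimited = diff_pairs(history, rss_diff_length=0)
```

The pairs come out oldest first; per the comment in the diff, that ordering is deliberate because feedgen reverses entries when it renders the feed.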

View File

@@ -1,98 +0,0 @@
def construct_tag_routes(rss_blueprint, datastore):
"""
Construct RSS feed routes for tags.
Args:
rss_blueprint: The Flask blueprint to add routes to
datastore: The ChangeDetectionStore instance
"""
@rss_blueprint.route("/tag/<string:tag_uuid>", methods=['GET'])
def rss_tag_feed(tag_uuid):
from flask import make_response, request, url_for
from feedgen.feed import FeedGenerator
from . import RSS_TEMPLATE_HTML_DEFAULT, RSS_TEMPLATE_PLAINTEXT_DEFAULT
from ._util import (validate_rss_token, generate_watch_guid, get_rss_template,
get_watch_label, build_notification_context, render_notification,
populate_feed_entry, add_watch_categories)
from ...notification_service import NotificationService
"""
Display an RSS feed for all unviewed watches that belong to a specific tag.
Returns RSS XML with entries for each unviewed watch with sufficient history.
"""
# Validate token
is_valid, error = validate_rss_token(datastore, request)
if not is_valid:
return error
rss_content_format = datastore.data['settings']['application'].get('rss_content_format')
# Verify tag exists
tag = datastore.data['settings']['application'].get('tags', {}).get(tag_uuid)
if not tag:
return f"Tag with UUID {tag_uuid} not found", 404
tag_title = tag.get('title', 'Unknown Tag')
# Create RSS feed
fg = FeedGenerator()
fg.title(f'changedetection.io - {tag_title}')
fg.description(f'Changes for watches tagged with {tag_title}')
fg.link(href='https://changedetection.io')
notification_service = NotificationService(datastore=datastore, notification_q=False)
# Find all watches with this tag
for uuid, watch in datastore.data['watching'].items():
#@todo This is wrong, it needs to sort by most recently changed and then limit it datastore.data['watching'].items().sorted(?)
# So get all watches in this tag then sort
# Skip if watch doesn't have this tag
if tag_uuid not in watch.get('tags', []):
continue
# Skip muted watches if configured
if datastore.data['settings']['application'].get('rss_hide_muted_watches') and watch.get('notification_muted'):
continue
# Check if watch has at least 2 history snapshots
dates = list(watch.history.keys())
if len(dates) < 2:
continue
# Only include unviewed watches
if not watch.viewed:
# Add uuid to watch for proper functioning
watch['uuid'] = uuid
# Include a link to the diff page
diff_link = {'href': url_for('ui.ui_views.diff_history_page', uuid=watch['uuid'], _external=True)}
# Get watch label
watch_label = get_watch_label(datastore, watch)
# Get template and build notification context
timestamp_to = dates[-1]
timestamp_from = dates[-2]
# Generate GUID for this entry
guid = generate_watch_guid(watch, timestamp_to)
n_body_template = get_rss_template(datastore, watch, rss_content_format,
RSS_TEMPLATE_HTML_DEFAULT, RSS_TEMPLATE_PLAINTEXT_DEFAULT)
n_object = build_notification_context(watch, timestamp_from, timestamp_to,
watch_label, n_body_template, rss_content_format)
# Render notification
res = render_notification(n_object, notification_service, watch, datastore)
# Create and populate feed entry
fe = fg.add_entry()
title_suffix = f"Change @ {res['original_context']['change_datetime']}"
populate_feed_entry(fe, watch, res['body'], guid, timestamp_to, link=diff_link, title_suffix=title_suffix)
add_watch_categories(fe, watch, datastore)
response = make_response(fg.rss_str())
response.headers.set('Content-Type', 'application/rss+xml;charset=utf-8')
return response

View File

@@ -119,7 +119,7 @@ def construct_blueprint(datastore: ChangeDetectionStore):
hide_remove_pass=os.getenv("SALTED_PASS", False),
min_system_recheck_seconds=int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 3)),
settings_application=datastore.data['settings']['application'],
timezone_default_config=datastore.data['settings']['application'].get('scheduler_timezone_default'),
timezone_default_config=datastore.data['settings']['application'].get('timezone'),
utc_time=utc_time,
)

View File

@@ -1,10 +1,11 @@
{% extends 'base.html' %}
{% block content %}
{% from '_helpers.html' import render_field, render_checkbox_field, render_button, render_time_schedule_form, render_ternary_field, render_fieldlist_with_inline_errors %}
{% from '_common_fields.html' import render_common_settings_form, show_token_placeholders %}
{% from '_helpers.html' import render_field, render_checkbox_field, render_button, render_time_schedule_form, render_ternary_field %}
{% from '_common_fields.html' import render_common_settings_form %}
<script>
const notification_base_url="{{url_for('ui.ui_notification.ajax_callback_send_notification_test', mode="global-settings")}}";
const notification_test_render_preview_url="{{url_for('ui.ui_notification.ajax_callback_test_render_preview', mode="global-settings")}}";
{% if emailprefix %}
const email_notification_prefix=JSON.parse('{{emailprefix|tojson}}');
{% endif %}
@@ -24,7 +25,6 @@
<li class="tab"><a href="#filters">Global Filters</a></li>
<li class="tab"><a href="#ui-options">UI Options</a></li>
<li class="tab"><a href="#api">API</a></li>
<li class="tab"><a href="#rss">RSS</a></li>
<li class="tab"><a href="#timedate">Time &amp; Date</a></li>
<li class="tab"><a href="#proxies">CAPTCHA &amp; Proxies</a></li>
</ul>
@@ -66,13 +66,28 @@
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.shared_diff_access, class="shared_diff_access") }}
<span class="pure-form-message-inline">Allow access to the watch change history page when password is enabled (Good for sharing the diff page)
<span class="pure-form-message-inline">Allow access to view watch diff page when password is enabled (Good for sharing the diff page)
</span>
</div>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.rss_hide_muted_watches) }}
</div>
<div class="pure-control-group">
{{ render_field(form.application.form.rss_content_format) }}
<span class="pure-form-message-inline">Love RSS? Does your reader support HTML? Set it here</span>
</div>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.empty_pages_are_a_change) }}
<span class="pure-form-message-inline">When a request returns no content, or the HTML does not contain any text, is this considered a change?</span>
</div>
{% if form.requests.proxy %}
<div class="pure-control-group inline-radio">
{{ render_field(form.requests.form.proxy, class="fetch-backend-proxy") }}
<span class="pure-form-message-inline">
Choose a default proxy for all watches
</span>
</div>
{% endif %}
</fieldset>
</div>
@@ -119,10 +134,6 @@
{{ render_field(form.requests.form.jitter_seconds, class="jitter_seconds") }}
<span class="pure-form-message-inline">Example - 3 seconds random jitter could trigger up to 3 seconds earlier or up to 3 seconds later</span>
</div>
<div class="pure-control-group">
{{ render_field(form.requests.form.timeout) }}
<span class="pure-form-message-inline">For regular plain requests (not chrome based), maximum number of seconds until timeout, 1-999.</span>
</div>
<div class="pure-control-group inline-radio">
{{ render_field(form.requests.form.default_ua) }}
<span class="pure-form-message-inline">
@@ -190,21 +201,12 @@ nav
</div>
<div class="tab-pane-inner" id="api">
<h4>API Access</h4>
<p>Drive your changedetection.io via API, More about <a href="https://changedetection.io/docs/api_v1/index.html">API access and examples here</a>.</p>
<p>
<strong>Chrome extension and API Access</strong><br>
</p>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.api_access_token_enabled) }}
<div class="pure-form-message-inline">Restrict API access limit by using <code>x-api-key</code> header - required for the Chrome Extension to work</div><br>
<div class="pure-form-message-inline"><br>API Key <span id="api-key">{{api_key}}</span>
<span style="display:none;" id="api-key-copy" >copy</span>
</div>
</div>
<div class="pure-control-group">
<a href="{{url_for('settings.settings_reset_api_key')}}" class="pure-button button-small button-cancel">Regenerate API key</a>
</div>
<div class="pure-control-group">
<h4>Chrome Extension</h4>
<div class="pure-control-group border-fieldset">
<strong>Chrome Extension</strong><br>
<p>Easily add any web-page to your changedetection.io installation from within Chrome.</p>
<strong>Step 1</strong> Install the extension, <strong>Step 2</strong> Navigate to this page,
<strong>Step 3</strong> Open the extension from the toolbar and click "<i>Sync API Access</i>"
@@ -217,38 +219,22 @@ nav
</a>
</p>
</div>
</div>
<div class="tab-pane-inner" id="rss">
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.rss_hide_muted_watches) }}
</div>
<div class="pure-control-group">
{{ render_field(form.application.form.rss_diff_length) }}
<span class="pure-form-message-inline">Maximum number of history snapshots to include in the watch specific RSS feed.</span>
</div>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.rss_reader_mode) }}
<span class="pure-form-message-inline">For watching other RSS feeds - When watching RSS/Atom feeds, convert them into clean text for better change detection.</span>
</div>
<div class="pure-control-group grey-form-border">
<div class="pure-control-group">
{{ render_field(form.application.form.rss_content_format) }}
<span class="pure-form-message-inline">Does your reader support HTML? Set it here</span>
</div>
<div class="pure-control-group">
{{ render_field(form.application.form.rss_template_type) }}
<span class="pure-form-message-inline">'System default' for the same template for all items, or re-use your "Notification Body" as the template.</span>
</div>
<div>
{{ render_field(form.application.form.rss_template_override) }}
{{ show_token_placeholders(extra_notification_token_placeholder_info=extra_notification_token_placeholder_info, suffix="-rss") }}
<div class="pure-control-group border-fieldset">
Drive your changedetection.io via API, More about <a href="https://changedetection.io/docs/api_v1/index.html">API access and examples here</a>.<br>
<p>
{{ render_checkbox_field(form.application.form.api_access_token_enabled) }}
</p>
<div class="pure-form-message-inline">Restrict API access limit by using <code>x-api-key</code> header - required for the Chrome Extension to work</div><br>
<div class="pure-form-message-inline"><br>API Key <span id="api-key">{{api_key}}</span>
<span style="display:none;" id="api-key-copy" >copy</span>
</div>
<p>
<a href="{{url_for('settings.settings_reset_api_key')}}" class="pure-button button-small button-cancel">Regenerate API key</a>
</p>
</div>
<br>
</div>
<div class="tab-pane-inner" id="timedate">
<div class="tab-pane-inner" id="timedate">
<div class="pure-control-group">
Ensure the settings below are correct, they are used to manage the time schedule for checking your web page watches.
</div>
@@ -256,9 +242,11 @@ nav
<p><strong>UTC Time &amp; Date from Server:</strong> <span id="utc-time" >{{ utc_time }}</span></p>
<p><strong>Local Time &amp; Date in Browser:</strong> <span class="local-time" data-utc="{{ utc_time }}"></span></p>
<p>
{{ render_field(form.application.form.scheduler_timezone_default) }}
{{ render_field(form.application.form.timezone) }}
<datalist id="timezones" style="display: none;">
{%- for timezone in available_timezones -%}<option value="{{ timezone }}">{{ timezone }}</option>{%- endfor -%}
{% for tz_name in available_timezones %}
<option value="{{ tz_name }}">{{ tz_name }}</option>
{% endfor %}
</datalist>
</p>
</div>
@@ -332,27 +320,17 @@ nav
<p><strong>Tip</strong>: "Residential" and "Mobile" proxy types can be more successful than "Data Center" for blocked websites.
<div class="pure-control-group" id="extra-proxies-setting">
{{ render_fieldlist_with_inline_errors(form.requests.form.extra_proxies) }}
{{ render_field(form.requests.form.extra_proxies) }}
<span class="pure-form-message-inline">"Name" will be used for selecting the proxy in the Watch Edit settings</span><br>
<span class="pure-form-message-inline">SOCKS5 proxies with authentication are only supported with 'plain requests' fetcher, for other fetchers you should whitelist the IP access instead</span>
{% if form.requests.proxy %}
<div>
<br>
<div class="inline-radio">
{{ render_field(form.requests.form.proxy, class="fetch-backend-proxy") }}
<span class="pure-form-message-inline">Choose a default proxy for all watches</span>
</div>
</div>
{% endif %}
</div>
<div class="pure-control-group" id="extra-browsers-setting">
<p>
<span class="pure-form-message-inline"><i>Extra Browsers</i> can be attached to further defeat CAPTCHAs on websites that are particularly hard to scrape.</span><br>
<span class="pure-form-message-inline">Simply paste the connection address into the box, <a href="https://changedetection.io/tutorial/using-bright-datas-scraping-browser-pass-captchas-and-other-protection-when-monitoring">More instructions and examples here</a> </span>
</p>
{{ render_fieldlist_with_inline_errors(form.requests.form.extra_browsers) }}
{{ render_field(form.requests.form.extra_browsers) }}
</div>
</div>
<div id="actions">
<div class="pure-control-group">

View File

@@ -21,10 +21,9 @@ def construct_blueprint(datastore: ChangeDetectionStore):
tag_count = Counter(tag for watch in datastore.data['watching'].values() if watch.get('tags') for tag in watch['tags'])
output = render_template("groups-overview.html",
app_rss_token=datastore.data['settings']['application'].get('rss_access_token'),
available_tags=sorted_tags,
form=add_form,
tag_count=tag_count,
tag_count=tag_count
)
return output
@@ -150,9 +149,9 @@ def construct_blueprint(datastore: ChangeDetectionStore):
included_content = template.render(**template_args)
output = render_template("edit-tag.html",
extra_form_content=included_content,
extra_tab_content=form.extra_tab_content() if form.extra_tab_content() else None,
settings_application=datastore.data['settings']['application'],
extra_tab_content=form.extra_tab_content() if form.extra_tab_content() else None,
extra_form_content=included_content,
**template_args
)

View File

@@ -4,6 +4,8 @@
{% from '_common_fields.html' import render_common_settings_form %}
<script>
const notification_base_url="{{url_for('ui.ui_notification.ajax_callback_send_notification_test', mode="group-settings")}}";
const notification_test_render_preview_url="{{url_for('ui.ui_notification.ajax_callback_test_render_preview', mode="group-settings", watch_uuid=data.uuid)}}";
//alert(notification_test_render_preview_url)
</script>
<script src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script>
@@ -19,6 +21,8 @@
<script src="{{url_for('static_content', group='js', filename='watch-settings.js')}}" defer></script>
<script src="{{url_for('static_content', group='js', filename='notifications.js')}}" defer></script>
<script src="{{url_for('static_content', group='js', filename='plugins.js')}}" defer></script>
<div class="edit-form monospaced-textarea">

View File

@@ -52,7 +52,6 @@
<a class="pure-button pure-button-primary" href="{{ url_for('tags.form_tag_edit', uuid=uuid) }}">Edit</a>&nbsp;
<a class="pure-button pure-button-primary" href="{{ url_for('tags.delete', uuid=uuid) }}" title="Deletes and removes tag">Delete</a>
<a class="pure-button pure-button-primary" href="{{ url_for('tags.unlink', uuid=uuid) }}" title="Keep the tag but unlink any watches">Unlink</a>
<a href="{{ url_for('rss.rss_tag_feed', tag_uuid=uuid, token=app_rss_token)}}"><img alt="RSS Feed for this watch" style="padding-left: 1em;" src="{{url_for('static_content', group='images', filename='generic_feed-icon.svg')}}" height="15"></a>
</td>
</tr>
{% endfor %}

View File

@@ -76,14 +76,14 @@ def _handle_operations(op, uuids, datastore, worker_handler, update_q, queuedWat
elif (op == 'notification-default'):
from changedetectionio.notification import (
USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
default_notification_format_for_watch
)
for uuid in uuids:
if datastore.data['watching'].get(uuid):
datastore.data['watching'][uuid]['notification_title'] = None
datastore.data['watching'][uuid]['notification_body'] = None
datastore.data['watching'][uuid]['notification_urls'] = []
datastore.data['watching'][uuid]['notification_format'] = USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
datastore.data['watching'][uuid]['notification_format'] = default_notification_format_for_watch
if emit_flash:
flash(f"{len(uuids)} watches set to use default notification settings")
@@ -106,14 +106,14 @@ def _handle_operations(op, uuids, datastore, worker_handler, update_q, queuedWat
for uuid in uuids:
watch_check_update.send(watch_uuid=uuid)
def construct_blueprint(datastore: ChangeDetectionStore, update_q, worker_handler, queuedWatchMetaData, watch_check_update):
def construct_blueprint(datastore: ChangeDetectionStore, update_q, worker_handler, queuedWatchMetaData, watch_check_update, notification_q):
ui_blueprint = Blueprint('ui', __name__, template_folder="templates")
# Register the edit blueprint
edit_blueprint = construct_edit_blueprint(datastore, update_q, queuedWatchMetaData)
ui_blueprint.register_blueprint(edit_blueprint)
# Register the notification blueprint
# Register the notification blueprint - mostly used for sending test
notification_blueprint = construct_notification_blueprint(datastore)
ui_blueprint.register_blueprint(notification_blueprint)

View File

@@ -187,7 +187,7 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
tz_name = time_schedule_limit.get('timezone')
if not tz_name:
tz_name = datastore.data['settings']['application'].get('scheduler_timezone_default', os.getenv('TZ', 'UTC').strip())
tz_name = datastore.data['settings']['application'].get('timezone', 'UTC')
if time_schedule_limit and time_schedule_limit.get('enabled'):
try:
@@ -236,7 +236,7 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
# Import the global plugin system
from changedetectionio.pluggy_interface import collect_ui_edit_stats_extras
app_rss_token = datastore.data['settings']['application'].get('rss_access_token')
template_args = {
'available_processors': processors.available_processors(),
'available_timezones': sorted(available_timezones()),
@@ -252,17 +252,12 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
'has_special_tag_options': _watch_has_tag_options_set(watch=watch),
'jq_support': jq_support,
'playwright_enabled': os.getenv('PLAYWRIGHT_DRIVER_URL', False),
'app_rss_token': app_rss_token,
'rss_uuid_feed' : {
'label': watch.label,
'url': url_for('rss.rss_single_watch', uuid=watch['uuid'], token=app_rss_token)
},
'settings_application': datastore.data['settings']['application'],
'system_has_playwright_configured': os.getenv('PLAYWRIGHT_DRIVER_URL'),
'system_has_webdriver_configured': os.getenv('WEBDRIVER_URL'),
'ui_edit_stats_extras': collect_ui_edit_stats_extras(watch),
'visual_selector_data_ready': datastore.visualselector_data_is_ready(watch_uuid=uuid),
'timezone_default_config': datastore.data['settings']['application'].get('scheduler_timezone_default'),
'timezone_default_config': datastore.data['settings']['application'].get('timezone'),
'using_global_webdriver_wait': not default['webdriver_delay'],
'uuid': uuid,
'watch': watch,

View File

@@ -1,47 +1,85 @@
from flask import Blueprint, request, make_response
from flask import Blueprint, request, make_response, jsonify
import random
from loguru import logger
from changedetectionio.notification.handler import process_notification
from changedetectionio.store import ChangeDetectionStore
from changedetectionio.auth_decorator import login_optionally_required
def construct_blueprint(datastore: ChangeDetectionStore):
notification_blueprint = Blueprint('ui_notification', __name__, template_folder="../ui/templates")
@notification_blueprint.route("/notification/render-preview/<string:watch_uuid>", methods=['POST'])
@notification_blueprint.route("/notification/render-preview", methods=['POST'])
@notification_blueprint.route("/notification/render-preview/", methods=['POST'])
@login_optionally_required
def ajax_callback_test_render_preview(watch_uuid=None):
return ajax_callback_send_notification_test(watch_uuid=watch_uuid, send_as_null_test=True)
# AJAX endpoint for sending a test
@notification_blueprint.route("/notification/send-test/<string:watch_uuid>", methods=['POST'])
@notification_blueprint.route("/notification/send-test", methods=['POST'])
@notification_blueprint.route("/notification/send-test/", methods=['POST'])
@login_optionally_required
def ajax_callback_send_notification_test(watch_uuid=None):
from changedetectionio.notification_service import NotificationContextData, set_basic_notification_vars
def ajax_callback_send_notification_test(watch_uuid=None, send_as_null_test=False):
# watch_uuid could be unset when this is called from the tag editor or global settings
import apprise
from changedetectionio.notification.handler import process_notification
from urllib.parse import urlparse
from changedetectionio.notification.apprise_plugin.assets import apprise_asset
from changedetectionio.jinja2_custom import render as jinja_render
from changedetectionio.notification.apprise_plugin.custom_handlers import apprise_http_custom_handler
# Necessary so that we import our custom handlers
from changedetectionio.notification.apprise_plugin.custom_handlers import apprise_http_custom_handler, apprise_null_custom_handler
apobj = apprise.Apprise(asset=apprise_asset)
sent_obj = {}
is_global_settings_form = request.args.get('mode', '') == 'global-settings'
is_group_settings_form = request.args.get('mode', '') == 'group-settings'
# Use an existing random one on the global/main settings form
if not watch_uuid and (is_global_settings_form or is_group_settings_form) \
and datastore.data.get('watching'):
logger.debug(f"Send test notification - Choosing random Watch {watch_uuid}")
if not watch_uuid and is_global_settings_form and datastore.data.get('watching'):
watch_uuid = random.choice(list(datastore.data['watching'].keys()))
logger.debug(f"Send test notification - Chose random watch UUID: {watch_uuid}")
if is_group_settings_form and datastore.data.get('watching'):
logger.debug(f"Send test notification - Choosing random Watch from group {watch_uuid}")
matching_watches = [uuid for uuid, watch in datastore.data['watching'].items() if watch.get('tags') and watch_uuid in watch['tags']]
if matching_watches:
watch_uuid = random.choice(matching_watches)
else:
# Just fallback to any
watch_uuid = random.choice(list(datastore.data['watching'].keys()))
if not watch_uuid:
return make_response("Error: You must have at least one watch configured for 'test notification' to work", 400)
watch = datastore.data['watching'].get(watch_uuid)
notification_urls = request.form.get('notification_urls','').strip().splitlines()
notification_urls = []
if send_as_null_test:
test_schema = ''
try:
if request.form.get('notification_urls') and '://' in request.form.get('notification_urls'):
first_test_notification_url = request.form['notification_urls'].strip().splitlines()[0]
test_schema = urlparse(first_test_notification_url).scheme.lower().strip()
except Exception as e:
logger.error(f"Error trying to get a test schema based on the first notification_url {str(e)}")
notification_urls = [
# Null lets us do the whole chain of the same code without any extra repeated code
f'null://null-test-just-to-render-everything-on-the-same-codepath-and-get-preview?test_schema={test_schema}'
]
else:
if request.form.get('notification_urls'):
notification_urls += request.form['notification_urls'].strip().splitlines()
if not notification_urls:
logger.debug("Test notification - Trying by group/tag in the edit form if available")
# @todo this logic is not clear, omegaconf?
# On an edit page, we should also fire off to the tags if they have notifications
if request.form.get('tags') and request.form['tags'].strip():
for k in request.form['tags'].split(','):
@@ -55,27 +93,26 @@ def construct_blueprint(datastore: ChangeDetectionStore):
notification_urls = datastore.data['settings']['application']['notification_urls']
if not notification_urls:
return 'Error: No Notification URLs set/found'
return make_response("Error: No Notification URLs set/found.", 400)
for n_url in notification_urls:
# We are ONLY validating the apprise:// part here, convert all tags to something so as not to break apprise URLs
generic_notification_context_data = NotificationContextData()
generic_notification_context_data.set_random_for_validation()
n_url = jinja_render(template_str=n_url, **generic_notification_context_data).strip()
if len(n_url.strip()):
if not apobj.add(n_url):
return f'Error: {n_url} is not a valid AppRise URL.'
return make_response(f'Error: {n_url} is not a valid AppRise URL.', 400)
try:
# use the same as when it is triggered, but then override it with the form test values
n_object = NotificationContextData({
n_object = {
'watch_url': request.form.get('window_url', "https://changedetection.io"),
'notification_urls': notification_urls
})
'notification_urls': notification_urls,
'uuid': watch_uuid # Ensure uuid is present so diff rendering works
}
# Only use if present, if not set in n_object it should use the default system value
if 'notification_format' in request.form and request.form['notification_format'].strip():
n_object['notification_format'] = request.form.get('notification_format', '').strip()
notif_format = request.form.get('notification_format', '').strip()
# Use it if provided and not "System default", otherwise fall back
if notif_format and notif_format != 'System default':
n_object['notification_format'] = notif_format
else:
n_object['notification_format'] = datastore.data['settings']['application'].get('notification_format')
@@ -94,29 +131,14 @@ def construct_blueprint(datastore: ChangeDetectionStore):
n_object['notification_body'] = "Test body"
n_object['as_async'] = False
n_object.update(watch.extra_notification_token_values())
# Same as in the notification service, should be refactored
dates = list(watch.history.keys())
trigger_text = ''
snapshot_contents = ''
# Could be called as a 'test notification' with only 1 snapshot available
prev_snapshot = "Example text: example test\nExample text: change detection is cool\nExample text: some more examples\n"
current_snapshot = "Example text: example test\nExample text: change detection is fantastic\nExample text: even more examples\nExample text: a lot more examples"
if len(dates) > 1:
prev_snapshot = watch.get_history_snapshot(timestamp=dates[-2])
current_snapshot = watch.get_history_snapshot(timestamp=dates[-1])
n_object.update(set_basic_notification_vars(snapshot_contents=snapshot_contents,
current_snapshot=current_snapshot,
prev_snapshot=prev_snapshot,
watch=watch,
triggered_text=trigger_text,
timestamp_changed=dates[-1] if dates else None))
sent_obj = process_notification(n_object, datastore)
# This uses the same processor that the queue runner uses
# @todo - Split the notification URLs so we know which one worked, maybe highlight them in green in the UI
result = process_notification(n_object, datastore)
if result:
sent_obj['result'] = result[0]
sent_obj['status'] = 'OK - Sent test notifications'
except Exception as e:
e_str = str(e)
@@ -124,9 +146,9 @@ def construct_blueprint(datastore: ChangeDetectionStore):
e_str = e_str.replace(
"DEBUG - <class 'apprise.decorators.base.CustomNotifyPlugin.instantiate_plugin.<locals>.CustomNotifyPluginWrapper'>",
'')
return make_response(e_str, 400)
return 'OK - Sent test notifications'
# it will be a list of things reached, for this purpose just the first is good so we can see the body that was sent
return make_response(sent_obj, 200)
return notification_blueprint
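
For readers following the render-preview path above, here is a minimal sketch of how the preview URL appears to be derived from the submitted form value. The function name `build_preview_urls` and the constant `NULL_PREVIEW_URL` are hypothetical and exist only for illustration; the scheme extraction mirrors the `urlparse(...).scheme` call in the diff, without the surrounding Flask request handling or error logging.

```python
from urllib.parse import urlparse

# Hypothetical constant for this sketch; the diff builds the same string inline.
NULL_PREVIEW_URL = "null://null-test-just-to-render-everything-on-the-same-codepath-and-get-preview"


def build_preview_urls(raw_notification_urls: str) -> list:
    """Return a single null:// URL that carries the scheme of the first submitted URL."""
    test_schema = ""
    if raw_notification_urls and "://" in raw_notification_urls:
        # Only the first line is inspected, as in the diff.
        first_url = raw_notification_urls.strip().splitlines()[0]
        test_schema = urlparse(first_url).scheme.lower().strip()
    return [f"{NULL_PREVIEW_URL}?test_schema={test_schema}"]


if __name__ == "__main__":
    print(build_preview_urls("mailto://user@example.com\njson://host"))
```

The point of the design, as far as the diff shows, is that the preview goes through the same `process_notification` code path as a real send, with the null handler standing in for the real transport.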

View File

@@ -21,6 +21,7 @@
const email_notification_prefix=JSON.parse('{{ emailprefix|tojson }}');
{% endif %}
const notification_base_url="{{url_for('ui.ui_notification.ajax_callback_send_notification_test', watch_uuid=uuid)}}";
const notification_test_render_preview_url="{{url_for('ui.ui_notification.ajax_callback_test_render_preview', watch_uuid=uuid)}}";
const playwright_enabled={% if playwright_enabled %}true{% else %}false{% endif %};
const recheck_proxy_start_url="{{url_for('check_proxies.start_check', uuid=uuid)}}";
const proxy_recheck_status_url="{{url_for('check_proxies.get_recheck_status', uuid=uuid)}}";
@@ -356,12 +357,12 @@ Math: {{ 1 + 1 }}") }}
</script>
<br>
{#<div id="text-preview-controls"><span id="text-preview-refresh" class="pure-button button-xsmall">Refresh</span></div>#}
<div class="minitabs-wrapper">
<div class="minitabs-wrapper" id="filter-preview-minitabs">
<div class="minitabs-content">
<div id="text-preview-inner" class="monospace-preview">
<div id="text-preview-inner" class="tab-contents-monospace-preview">
<p>Loading...</p>
</div>
<div id="text-preview-before-inner" style="display: none;" class="monospace-preview">
<div id="text-preview-before-inner" style="display: none;" class="tab-contents-monospace-preview">
<p>Loading...</p>
</div>
</div>
@@ -476,7 +477,6 @@ Math: {{ 1 + 1 }}") }}
class="pure-button button-error">Clear History</a>{% endif %}
<a href="{{url_for('ui.form_clone', uuid=uuid)}}"
class="pure-button">Clone &amp; Edit</a>
<a href="{{ url_for('rss.rss_single_watch', uuid=uuid, token=app_rss_token)}}"><img alt="RSS Feed for this watch" style="padding: .5em 1em;" src="{{url_for('static_content', group='images', filename='generic_feed-icon.svg')}}" height="15"></a>
</div>
</div>
</form>

View File

@@ -47,7 +47,7 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
try:
versions = list(watch.history.keys())
content = watch.get_history_snapshot(timestamp=timestamp)
content = watch.get_history_snapshot(timestamp)
triggered_line_numbers = html_tools.strip_ignore_text(content=content,
wordlist=watch['trigger_text'],

View File

@@ -1,7 +1,7 @@
import pluggy
import os
import importlib
import sys
from loguru import logger
from . import default_plugin
# ✅ Ensure that the namespace in HookspecMarker matches PluginManager
@@ -65,7 +65,7 @@ def load_plugins_from_directory():
# Register the plugin with pluggy
plugin_manager.register(module, module_name)
except (ImportError, AttributeError) as e:
print(f"Error loading plugin {module_name}: {e}")
logger.critical(f"Error loading plugin {module_name}: {e}")
# Load plugins from the plugins directory
load_plugins_from_directory()

View File

@@ -14,7 +14,7 @@ def count_words_in_history(watch, incoming_text=None):
elif watch.history.keys():
# When called from UI extras to count latest snapshot
latest_key = list(watch.history.keys())[-1]
latest_content = watch.get_history_snapshot(timestamp=latest_key)
latest_content = watch.get_history_snapshot(latest_key)
return len(latest_content.split())
return 0
except Exception as e:

View File

@@ -64,18 +64,6 @@ class Fetcher():
# Time on top of the system defined env minimum time
render_extract_delay = 0
def clear_content(self):
"""
Explicitly clear all content from memory to free up heap space.
Call this after content has been saved to disk.
"""
self.content = None
if hasattr(self, 'raw_content'):
self.raw_content = None
self.screenshot = None
self.xpath_data = None
# Keep headers and status_code as they're small
@abstractmethod
def get_error(self):
return self.error
@@ -140,7 +128,7 @@ class Fetcher():
async def iterate_browser_steps(self, start_url=None):
from changedetectionio.blueprint.browser_steps.browser_steps import steppable_browser_interface
from playwright._impl._errors import TimeoutError, Error
from changedetectionio.jinja2_custom import render as jinja_render
from changedetectionio.safe_jinja import render as jinja_render
step_n = 0
if self.browser_steps is not None and len(self.browser_steps):

View File

@@ -139,7 +139,7 @@ class fetcher(Fetcher):
content = await self.page.content()
destination = os.path.join(self.browser_steps_screenshot_path, 'step_{}.html'.format(step_n))
logger.debug(f"Saving step HTML to {destination}")
with open(destination, 'w', encoding='utf-8') as f:
with open(destination, 'w') as f:
f.write(content)
async def run(self,

View File

@@ -1,7 +1,6 @@
from loguru import logger
import hashlib
import os
import re
import asyncio
from changedetectionio import strtobool
from changedetectionio.content_fetchers.exceptions import BrowserStepsInUnsupportedFetcher, EmptyReply, Non200ErrorCodeReceived
@@ -52,7 +51,6 @@ class fetcher(Fetcher):
session = requests.Session()
if strtobool(os.getenv('ALLOW_FILE_URI', 'false')) and url.startswith('file://'):
from requests_file import FileAdapter
session.mount('file://', FileAdapter())
@@ -77,22 +75,9 @@ class fetcher(Fetcher):
if not is_binary:
# Don't run this for PDF (and requests identified as binary) takes a _long_ time
if not r.headers.get('content-type') or not 'charset=' in r.headers.get('content-type'):
# For XML/RSS feeds, check the XML declaration for encoding attribute
# This is more reliable than chardet which can misdetect UTF-8 as MacRoman
content_type = r.headers.get('content-type', '').lower()
if 'xml' in content_type or 'rss' in content_type:
# Look for <?xml version="1.0" encoding="UTF-8"?>
xml_encoding_match = re.search(rb'<\?xml[^>]+encoding=["\']([^"\']+)["\']', r.content[:200])
if xml_encoding_match:
r.encoding = xml_encoding_match.group(1).decode('ascii')
else:
# Default to UTF-8 for XML if no encoding found
r.encoding = 'utf-8'
else:
# For other content types, use chardet
encoding = chardet.detect(r.content)['encoding']
if encoding:
r.encoding = encoding
encoding = chardet.detect(r.content)['encoding']
if encoding:
r.encoding = encoding
self.headers = r.headers
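
The XML branch in the hunk above can be read in isolation as a small helper. This is only a sketch under assumptions: the name `sniff_xml_encoding` is hypothetical, and the function returns `None` for non-XML content types, where the code in the diff falls back to `chardet` instead.

```python
import re
from typing import Optional

# Same pattern as in the diff: look for encoding="..." inside the XML declaration.
XML_ENCODING_RE = re.compile(rb'<\?xml[^>]+encoding=["\']([^"\']+)["\']')


def sniff_xml_encoding(content: bytes, content_type: str) -> Optional[str]:
    """Return the declared encoding for XML/RSS content, 'utf-8' if undeclared, else None."""
    content_type = (content_type or "").lower()
    if "xml" not in content_type and "rss" not in content_type:
        return None  # the diff uses chardet for this case
    match = XML_ENCODING_RE.search(content[:200])  # only the first 200 bytes are checked
    if match:
        return match.group(1).decode("ascii")
    return "utf-8"


if __name__ == "__main__":
    sample = b'<?xml version="1.0" encoding="ISO-8859-1"?><rss/>'
    print(sniff_xml_encoding(sample, "application/rss+xml"))
```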

View File

@@ -1,32 +1,8 @@
import difflib
from typing import List, Iterator, Union
# https://github.com/dgtlmoon/changedetection.io/issues/821#issuecomment-1241837050
#HTML_ADDED_STYLE = "background-color: #d2f7c2; color: #255d00;"
#HTML_CHANGED_INTO_STYLE = "background-color: #dafbe1; color: #116329;"
#HTML_CHANGED_STYLE = "background-color: #ffd6cc; color: #7a2000;"
#HTML_REMOVED_STYLE = "background-color: #ffebe9; color: #82071e;"
# @todo - In the future we can make this configurable
HTML_ADDED_STYLE = "background-color: #eaf2c2; color: #406619"
HTML_REMOVED_STYLE = "background-color: #fadad7; color: #b30000"
HTML_CHANGED_STYLE = HTML_REMOVED_STYLE
HTML_CHANGED_INTO_STYLE = HTML_ADDED_STYLE
# These get set to html or telegram type or discord compatible or whatever in handler.py
# Something that cant get escaped to HTML by accident
REMOVED_PLACEMARKER_OPEN = '@removed_PLACEMARKER_OPEN'
REMOVED_PLACEMARKER_CLOSED = '@removed_PLACEMARKER_CLOSED'
ADDED_PLACEMARKER_OPEN = '@added_PLACEMARKER_OPEN'
ADDED_PLACEMARKER_CLOSED = '@added_PLACEMARKER_CLOSED'
CHANGED_PLACEMARKER_OPEN = '@changed_PLACEMARKER_OPEN'
CHANGED_PLACEMARKER_CLOSED = '@changed_PLACEMARKER_CLOSED'
CHANGED_INTO_PLACEMARKER_OPEN = '@changed_into_PLACEMARKER_OPEN'
CHANGED_INTO_PLACEMARKER_CLOSED = '@changed_into_PLACEMARKER_CLOSED'
REMOVED_STYLE = "background-color: #fadad7; color: #b30000;"
ADDED_STYLE = "background-color: #eaf2c2; color: #406619;"
def same_slicer(lst: List[str], start: int, end: int) -> List[str]:
"""Return a slice of the list, or a single element if start == end."""
@@ -39,7 +15,8 @@ def customSequenceMatcher(
include_removed: bool = True,
include_added: bool = True,
include_replaced: bool = True,
include_change_type_prefix: bool = True
include_change_type_prefix: bool = True,
html_colour: bool = False
) -> Iterator[List[str]]:
"""
Compare two sequences and yield differences based on specified parameters.
@@ -52,6 +29,8 @@ def customSequenceMatcher(
include_added (bool): Include added parts
include_replaced (bool): Include replaced parts
include_change_type_prefix (bool): Add prefixes to indicate change types
html_colour (bool): Use HTML background colors for differences
Yields:
List[str]: Differences between sequences
"""
@@ -63,22 +42,22 @@ def customSequenceMatcher(
if include_equal and tag == 'equal':
yield before[alo:ahi]
elif include_removed and tag == 'delete':
if include_change_type_prefix:
yield [f'{REMOVED_PLACEMARKER_OPEN}{line}{REMOVED_PLACEMARKER_CLOSED}' for line in same_slicer(before, alo, ahi)]
if html_colour:
yield [f'<span style="{REMOVED_STYLE}">{line}</span>' for line in same_slicer(before, alo, ahi)]
else:
yield same_slicer(before, alo, ahi)
yield [f"(removed) {line}" for line in same_slicer(before, alo, ahi)] if include_change_type_prefix else same_slicer(before, alo, ahi)
elif include_replaced and tag == 'replace':
if include_change_type_prefix:
yield [f'{CHANGED_PLACEMARKER_OPEN}{line}{CHANGED_PLACEMARKER_CLOSED}' for line in same_slicer(before, alo, ahi)] + \
[f'{CHANGED_INTO_PLACEMARKER_OPEN}{line}{CHANGED_INTO_PLACEMARKER_CLOSED}' for line in same_slicer(after, blo, bhi)]
if html_colour:
yield [f'<span style="{REMOVED_STYLE}">{line}</span>' for line in same_slicer(before, alo, ahi)] + \
[f'<span style="{ADDED_STYLE}">{line}</span>' for line in same_slicer(after, blo, bhi)]
else:
yield same_slicer(before, alo, ahi) + same_slicer(after, blo, bhi)
yield [f"(changed) {line}" for line in same_slicer(before, alo, ahi)] + \
[f"(into) {line}" for line in same_slicer(after, blo, bhi)] if include_change_type_prefix else same_slicer(before, alo, ahi) + same_slicer(after, blo, bhi)
elif include_added and tag == 'insert':
if include_change_type_prefix:
yield [f'{ADDED_PLACEMARKER_OPEN}{line}{ADDED_PLACEMARKER_CLOSED}' for line in same_slicer(after, blo, bhi)]
if html_colour:
yield [f'<span style="{ADDED_STYLE}">{line}</span>' for line in same_slicer(after, blo, bhi)]
else:
yield same_slicer(after, blo, bhi)
yield [f"(added) {line}" for line in same_slicer(after, blo, bhi)] if include_change_type_prefix else same_slicer(after, blo, bhi)
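As a rough illustration of what the `html_colour` branches above do, here is a minimal standalone sketch built on the standard library's `difflib.SequenceMatcher`. The helper names `wrap` and `coloured_diff` are hypothetical and not part of the project; only the two style constants are taken from the diff. It does not HTML-escape the lines, which a real renderer would probably need to consider.

```python
from difflib import SequenceMatcher
from typing import Iterator, List

# Style constants copied from the diff above.
REMOVED_STYLE = "background-color: #fadad7; color: #b30000;"
ADDED_STYLE = "background-color: #eaf2c2; color: #406619;"


def wrap(lines: List[str], style: str) -> List[str]:
    """Hypothetical helper: wrap each line in a styled <span>."""
    return [f'<span style="{style}">{line}</span>' for line in lines]


def coloured_diff(before: List[str], after: List[str]) -> Iterator[List[str]]:
    """Yield one list of output lines per changed opcode, skipping equal runs."""
    matcher = SequenceMatcher(isjunk=None, a=before, b=after, autojunk=False)
    for tag, alo, ahi, blo, bhi in matcher.get_opcodes():
        if tag == "delete":
            yield wrap(before[alo:ahi], REMOVED_STYLE)
        elif tag == "replace":
            # Old lines first, then the lines they were changed into.
            yield wrap(before[alo:ahi], REMOVED_STYLE) + wrap(after[blo:bhi], ADDED_STYLE)
        elif tag == "insert":
            yield wrap(after[blo:bhi], ADDED_STYLE)


result = list(coloured_diff(["a", "b", "c"], ["a", "x", "c", "d"]))
```

Here `result` holds two groups: one for the `b` to `x` replacement and one for the inserted `d`.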
def render_diff(
previous_version_file_contents: str,
@@ -89,7 +68,8 @@ def render_diff(
include_replaced: bool = True,
line_feed_sep: str = "\n",
include_change_type_prefix: bool = True,
patch_format: bool = False
patch_format: bool = False,
html_colour: bool = False
) -> str:
"""
Render the difference between two file contents.
@@ -104,6 +84,8 @@ def render_diff(
line_feed_sep (str): Separator for lines in output
include_change_type_prefix (bool): Add prefixes to indicate change types
patch_format (bool): Use patch format for output
html_colour (bool): Use HTML background colors for differences
Returns:
str: Rendered difference
"""
@@ -121,7 +103,8 @@ def render_diff(
include_removed=include_removed,
include_added=include_added,
include_replaced=include_replaced,
include_change_type_prefix=include_change_type_prefix
include_change_type_prefix=include_change_type_prefix,
html_colour=html_colour
)
def flatten(lst: List[Union[str, List[str]]]) -> str:


@@ -101,12 +101,12 @@ def init_app_secret(datastore_path):
path = os.path.join(datastore_path, "secret.txt")
try:
with open(path, "r", encoding='utf-8') as f:
with open(path, "r") as f:
secret = f.read()
except FileNotFoundError:
import secrets
with open(path, "w", encoding='utf-8') as f:
with open(path, "w") as f:
secret = secrets.token_hex(32)
f.write(secret)
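The read-or-create pattern in this hunk can be tried in isolation. The sketch below is an assumption-laden stand-in, not the project's function: `load_or_create_secret` is a hypothetical name, and it pins `encoding='utf-8'` explicitly so the result does not depend on the platform's default encoding.

```python
import os
import secrets
import tempfile


def load_or_create_secret(datastore_path: str) -> str:
    """Return the stored secret, generating and saving one on first use."""
    path = os.path.join(datastore_path, "secret.txt")
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        # 32 random bytes rendered as 64 hexadecimal characters.
        secret = secrets.token_hex(32)
        with open(path, "w", encoding="utf-8") as f:
            f.write(secret)
        return secret


with tempfile.TemporaryDirectory() as tmp:
    first = load_or_create_secret(tmp)   # creates secret.txt
    second = load_or_create_secret(tmp)  # reads the same value back
```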
@@ -133,11 +133,6 @@ def get_socketio_path():
# Socket.IO will be available at {prefix}/socket.io/
return prefix
@app.template_global('is_safe_valid_url')
def _is_safe_valid_url(test_url):
from .validate_url import is_safe_valid_url
return is_safe_valid_url(test_url)
@app.template_filter('format_number_locale')
def _jinja2_filter_format_number_locale(value: float) -> str:
@@ -387,7 +382,7 @@ def changedetection_app(config=None, datastore_o=None):
# We would sometimes get login loop errors on sites hosted in sub-paths
# note for the future:
# if not is_safe_valid_url(next):
# if not is_safe_url(next):
# return flask.abort(400)
return redirect(url_for('watchlist.index'))
@@ -524,7 +519,7 @@ def changedetection_app(config=None, datastore_o=None):
# watchlist UI buttons etc
import changedetectionio.blueprint.ui as ui
app.register_blueprint(ui.construct_blueprint(datastore, update_q, worker_handler, queuedWatchMetaData, watch_check_update))
app.register_blueprint(ui.construct_blueprint(datastore, update_q, worker_handler, queuedWatchMetaData, watch_check_update, notification_q))
import changedetectionio.blueprint.watchlist as watchlist
app.register_blueprint(watchlist.construct_blueprint(datastore=datastore, update_q=update_q, queuedWatchMetaData=queuedWatchMetaData), url_prefix='')
@@ -794,19 +789,15 @@ def ticker_thread_check_time_launch_checks():
# @todo - Maybe make this a hook?
# Time schedule limit - Decide between watch or global settings
scheduler_source = None
if watch.get('time_between_check_use_default'):
time_schedule_limit = datastore.data['settings']['requests'].get('time_schedule_limit', {})
scheduler_source = 'system/global settings'
logger.trace(f"{uuid} Time scheduler - Using system/global settings")
else:
time_schedule_limit = watch.get('time_schedule_limit')
scheduler_source = 'watch'
tz_name = datastore.data['settings']['application'].get('scheduler_timezone_default', os.getenv('TZ', 'UTC').strip())
logger.trace(f"{uuid} Time scheduler - Using watch settings (not global settings)")
tz_name = datastore.data['settings']['application'].get('timezone', 'UTC')
if time_schedule_limit and time_schedule_limit.get('enabled'):
logger.trace(f"{uuid} Time scheduler - Using scheduler settings from {scheduler_source}")
try:
result = is_within_schedule(time_schedule_limit=time_schedule_limit,
default_tz=tz_name
@@ -818,7 +809,6 @@ def ticker_thread_check_time_launch_checks():
logger.error(
f"{uuid} - Recheck scheduler, error handling timezone, check skipped - TZ name '{tz_name}' - {str(e)}")
return False
# If they supplied an individual entry minutes to threshold.
threshold = recheck_time_system_seconds if watch.get('time_between_check_use_default') else watch.threshold_seconds()


@@ -3,9 +3,8 @@ import re
from loguru import logger
from wtforms.widgets.core import TimeInput
from changedetectionio.blueprint.rss import RSS_FORMAT_TYPES, RSS_TEMPLATE_TYPE_OPTIONS, RSS_TEMPLATE_HTML_DEFAULT
from changedetectionio.blueprint.rss import RSS_FORMAT_TYPES
from changedetectionio.conditions.form import ConditionFormRow
from changedetectionio.notification_service import NotificationContextData
from changedetectionio.strtobool import strtobool
from wtforms import (
@@ -28,8 +27,11 @@ from wtforms.utils import unset_value
from wtforms.validators import ValidationError
from validators.url import url as url_validator
from changedetectionio.widgets import TernaryNoneBooleanField
# default
# each select <option data-enabled="enabled-0-0"
from changedetectionio.blueprint.browser_steps.browser_steps import browser_step_ui_config
@@ -467,16 +469,11 @@ class ValidateAppRiseServers(object):
import apprise
from .notification.apprise_plugin.assets import apprise_asset
from .notification.apprise_plugin.custom_handlers import apprise_http_custom_handler # noqa: F401
from changedetectionio.jinja2_custom import render as jinja_render
apobj = apprise.Apprise(asset=apprise_asset)
for server_url in field.data:
generic_notification_context_data = NotificationContextData()
# Make sure there is at least something in all those regular token fields
generic_notification_context_data.set_random_for_validation()
url = jinja_render(template_str=server_url.strip(), **generic_notification_context_data).strip()
url = server_url.strip()
if url.startswith("#"):
continue
@@ -490,8 +487,9 @@ class ValidateJinja2Template(object):
"""
def __call__(self, form, field):
from changedetectionio import notification
from changedetectionio.jinja2_custom import create_jinja_env
from jinja2 import BaseLoader, TemplateSyntaxError, UndefinedError
from jinja2.sandbox import ImmutableSandboxedEnvironment
from jinja2.meta import find_undeclared_variables
import jinja2.exceptions
@@ -499,13 +497,9 @@ class ValidateJinja2Template(object):
joined_data = ' '.join(map(str, field.data)) if isinstance(field.data, list) else f"{field.data}"
try:
# Use the shared helper to create a properly configured environment
jinja2_env = create_jinja_env(loader=BaseLoader)
# Add notification tokens for validation
static_token_placeholders = NotificationContextData()
static_token_placeholders.set_random_for_validation()
jinja2_env.globals.update(static_token_placeholders)
jinja2_env = ImmutableSandboxedEnvironment(loader=BaseLoader, extensions=['jinja2_time.TimeExtension'])
jinja2_env.globals.update(notification.valid_tokens)
# Extra validation tokens provided on the form_class(... extra_tokens={}) setup
if hasattr(field, 'extra_notification_tokens'):
jinja2_env.globals.update(field.extra_notification_tokens)
@@ -517,7 +511,6 @@ class ValidateJinja2Template(object):
except jinja2.exceptions.SecurityError as e:
raise ValidationError(f"This is not a valid Jinja2 template: {e}") from e
# Check for undeclared variables
ast = jinja2_env.parse(joined_data)
undefined = ", ".join(find_undeclared_variables(ast))
if undefined:
@@ -540,10 +533,19 @@ class validateURL(object):
def validate_url(test_url):
from changedetectionio.validate_url import is_safe_valid_url
if not is_safe_valid_url(test_url):
# If hosts that only contain alphanumerics are allowed ("localhost" for example)
try:
url_validator(test_url, simple_host=allow_simplehost)
except validators.ValidationError:
#@todo check for xss
message = f"'{test_url}' is not a valid URL."
# This should be wtforms.validators.
raise ValidationError('Watch protocol is not permitted or invalid URL format')
raise ValidationError(message)
from .model.Watch import is_safe_url
if not is_safe_url(test_url):
# This should be wtforms.validators.
raise ValidationError('Watch protocol is not permitted by SAFE_PROTOCOL_REGEX or incorrect URL format')
class ValidateSinglePythonRegexString(object):
@@ -676,51 +678,6 @@ class ValidateCSSJSONXPATHInput(object):
except:
raise ValidationError("A system-error occurred when validating your jq expression")
class ValidateSimpleURL:
"""Validate that the value can be parsed by urllib.parse.urlparse() and has a scheme/netloc."""
def __init__(self, message=None):
self.message = message or "Invalid URL."
def __call__(self, form, field):
data = (field.data or "").strip()
if not data:
return # empty is OK — pair with validators.Optional()
from urllib.parse import urlparse
parsed = urlparse(data)
if not parsed.scheme or not parsed.netloc:
raise ValidationError(self.message)
class ValidateStartsWithRegex(object):
def __init__(self, regex, *, flags=0, message=None, allow_empty=True, split_lines=True):
# compile with the given flags (we'll pass re.IGNORECASE below)
self.pattern = re.compile(regex, flags) if isinstance(regex, str) else regex
self.message = message
self.allow_empty = allow_empty
self.split_lines = split_lines
def __call__(self, form, field):
data = field.data
if not data:
return
# normalize into list of lines
if isinstance(data, str) and self.split_lines:
lines = data.splitlines()
elif isinstance(data, (list, tuple)):
lines = data
else:
lines = [data]
for line in lines:
stripped = line.strip()
if not stripped:
if self.allow_empty:
continue
raise ValidationError(self.message or "Empty value not allowed.")
if not self.pattern.match(stripped):
raise ValidationError(self.message or "Invalid value.")
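To see how the two validators above combine for a proxy URL, the following sketch reproduces the same checks without WTForms. `check_proxy_url` is a hypothetical helper that returns error messages instead of raising `ValidationError`; the regex and messages are the ones used for `proxy_url` elsewhere in this diff.

```python
import re
from urllib.parse import urlparse

# Same prefix pattern and flag as the proxy_url field in this diff.
PROXY_PREFIX = re.compile(r'^(https?|socks5)://', re.IGNORECASE)


def check_proxy_url(value: str) -> list:
    """Hypothetical standalone check; returns a list of error messages."""
    errors = []
    stripped = (value or "").strip()
    if not stripped:
        return errors  # empty is treated as optional
    if not PROXY_PREFIX.match(stripped):
        errors.append("Proxy URLs must start with http://, https:// or socks5://")
    parsed = urlparse(stripped)
    if not parsed.scheme or not parsed.netloc:
        errors.append("Invalid URL.")
    return errors
```

A value such as `http://` passes the prefix check but has no network location, which is the case the second validator is there to catch.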
class quickWatchForm(Form):
from . import processors
@@ -731,6 +688,7 @@ class quickWatchForm(Form):
edit_and_watch_submit_button = SubmitField('Edit > Watch', render_kw={"class": "pure-button pure-button-primary"})
# Common to a single watch and the global settings
class commonSettingsForm(Form):
from . import processors
@@ -743,21 +701,13 @@ class commonSettingsForm(Form):
fetch_backend = RadioField(u'Fetch Method', choices=content_fetchers.available_fetchers(), validators=[ValidateContentFetcherIsReady()])
notification_body = TextAreaField('Notification Body', default='{{ watch_url }} had a change.', validators=[validators.Optional(), ValidateJinja2Template()])
notification_format = SelectField('Notification format', choices=list(valid_notification_formats.items()))
notification_format = SelectField('Notification format', choices=valid_notification_formats.keys())
notification_title = StringField('Notification Title', default='ChangeDetection.io Notification - {{ watch_url }}', validators=[validators.Optional(), ValidateJinja2Template()])
notification_urls = StringListField('Notification URL List', validators=[validators.Optional(), ValidateAppRiseServers(), ValidateJinja2Template()])
processor = RadioField( label=u"Processor - What do you want to achieve?", choices=processors.available_processors(), default="text_json_diff")
scheduler_timezone_default = StringField("Default timezone for watch check scheduler", render_kw={"list": "timezones"}, validators=[validateTimeZoneName()])
timezone = StringField("Timezone for watch schedule", render_kw={"list": "timezones"}, validators=[validateTimeZoneName()])
webdriver_delay = IntegerField('Wait seconds before extracting text', validators=[validators.Optional(), validators.NumberRange(min=1, message="Should contain one or more seconds")])
# Not true anymore but keep the validate_ hook for future use, we convert color tags
# def validate_notification_urls(self, field):
# """Validate that HTML Color format is not used with Telegram"""
# if self.notification_format.data == 'HTML Color' and field.data:
# for url in field.data:
# if url and ('tgram://' in url or 'discord://' in url or 'discord.com/api/webhooks' in url):
# raise ValidationError('HTML Color format is not supported by Telegram and Discord. Please choose another Notification Format (Plain Text, HTML, or Markdown to HTML).')
class importForm(Form):
from . import processors
@@ -845,7 +795,7 @@ class processor_text_json_diff_form(commonSettingsForm):
if not super().validate():
return False
from changedetectionio.jinja2_custom import render as jinja_render
from changedetectionio.safe_jinja import render as jinja_render
result = True
# Fail form validation when a body is set for a GET
@@ -908,36 +858,23 @@ class processor_text_json_diff_form(commonSettingsForm):
):
super().__init__(formdata, obj, prefix, data, meta, **kwargs)
if kwargs and kwargs.get('default_system_settings'):
default_tz = kwargs.get('default_system_settings').get('application', {}).get('scheduler_timezone_default')
default_tz = kwargs.get('default_system_settings').get('application', {}).get('timezone')
if default_tz:
self.time_schedule_limit.form.timezone.render_kw['placeholder'] = default_tz
class SingleExtraProxy(Form):
# maybe better to set some <script>var..
proxy_name = StringField('Name', [validators.Optional()], render_kw={"placeholder": "Name"})
proxy_url = StringField('Proxy URL', [
validators.Optional(),
ValidateStartsWithRegex(
regex=r'^(https?|socks5)://', # ✅ main pattern
flags=re.IGNORECASE, # ✅ makes it case-insensitive
message='Proxy URLs must start with http://, https:// or socks5://',
),
ValidateSimpleURL()
], render_kw={"placeholder": "socks5:// or regular proxy http://user:pass@...:3128", "size":50})
proxy_url = StringField('Proxy URL', [validators.Optional()], render_kw={"placeholder": "socks5:// or regular proxy http://user:pass@...:3128", "size":50})
# @todo do the validation here instead
class SingleExtraBrowser(Form):
browser_name = StringField('Name', [validators.Optional()], render_kw={"placeholder": "Name"})
browser_connection_url = StringField('Browser connection URL', [
validators.Optional(),
ValidateStartsWithRegex(
regex=r'^(wss?|ws)://',
flags=re.IGNORECASE,
message='Browser URLs must start with wss:// or ws://'
),
ValidateSimpleURL()
], render_kw={"placeholder": "wss://brightdata... wss://oxylabs etc", "size":50})
browser_connection_url = StringField('Browser connection URL', [validators.Optional()], render_kw={"placeholder": "wss://brightdata... wss://oxylabs etc", "size":50})
# @todo do the validation here instead
class DefaultUAInputForm(Form):
html_requests = StringField('Plaintext requests', validators=[validators.Optional()], render_kw={"placeholder": "<default>"})
@@ -948,7 +885,7 @@ class DefaultUAInputForm(Form):
class globalSettingsRequestForm(Form):
time_between_check = RequiredFormField(TimeBetweenCheckForm)
time_schedule_limit = FormField(ScheduleLimitForm)
proxy = RadioField('Default proxy')
proxy = RadioField('Proxy')
jitter_seconds = IntegerField('Random jitter seconds ± check',
render_kw={"style": "width: 5em;"},
validators=[validators.NumberRange(min=0, message="Should contain zero or more seconds")])
@@ -957,12 +894,7 @@ class globalSettingsRequestForm(Form):
render_kw={"style": "width: 5em;"},
validators=[validators.NumberRange(min=1, max=50,
message="Should be between 1 and 50")])
timeout = IntegerField('Requests timeout in seconds',
render_kw={"style": "width: 5em;"},
validators=[validators.NumberRange(min=1, max=999,
message="Should be between 1 and 999")])
extra_proxies = FieldList(FormField(SingleExtraProxy), min_entries=5)
extra_browsers = FieldList(FormField(SingleExtraBrowser), min_entries=5)
@@ -1000,9 +932,7 @@ class globalSettingsApplicationForm(commonSettingsForm):
validators=[validators.NumberRange(min=0,
message="Should be atleast zero (disabled)")])
rss_content_format = SelectField('RSS Content format', choices=list(RSS_FORMAT_TYPES.items()))
rss_template_type = SelectField('RSS <description> body built from', choices=list(RSS_TEMPLATE_TYPE_OPTIONS.items()))
rss_template_override = TextAreaField('RSS "System default" template override', render_kw={"rows": "5", "placeholder": RSS_TEMPLATE_HTML_DEFAULT}, validators=[validators.Optional(), ValidateJinja2Template()])
rss_content_format = SelectField('RSS Content format', choices=RSS_FORMAT_TYPES)
removepassword_button = SubmitField('Remove password', render_kw={"class": "pure-button pure-button-primary"})
render_anchor_tag_content = BooleanField('Render anchor tag content', default=False)
@@ -1010,12 +940,6 @@ class globalSettingsApplicationForm(commonSettingsForm):
strip_ignored_lines = BooleanField('Strip ignored lines')
rss_hide_muted_watches = BooleanField('Hide muted watches from RSS feed', default=True,
validators=[validators.Optional()])
rss_reader_mode = BooleanField('Enable RSS reader mode ', default=False, validators=[validators.Optional()])
rss_diff_length = IntegerField(label='Number of changes to show in watch RSS feed',
render_kw={"style": "width: 5em;"},
validators=[validators.NumberRange(min=0, message="Should contain zero or more attempts")])
filter_failure_notification_threshold_attempts = IntegerField('Number of times the filter can be missing before sending a notification',
render_kw={"style": "width: 5em;"},
validators=[validators.NumberRange(min=0,


@@ -1,6 +1,5 @@
from functools import lru_cache
from loguru import logger
from lxml import etree
from typing import List
import html
import json
@@ -15,6 +14,7 @@ TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.I | re.S)
META_CS = re.compile(r'<meta[^>]+charset=["\']?\s*([a-z0-9_\-:+.]+)', re.I)
META_CT = re.compile(r'<meta[^>]+http-equiv=["\']?content-type["\']?[^>]*content=["\'][^>]*charset=([a-z0-9_\-:+.]+)', re.I)
# 'price' , 'lowPrice', 'highPrice' are usually under here
# All of those may or may not appear on different websites - I didn't find a way to do case-insensitive searching here
LD_JSON_PRODUCT_OFFER_SELECTORS = ["json:$..offers", "json:$..Offers"]
@@ -23,9 +23,9 @@ class JSONNotFound(ValueError):
def __init__(self, msg):
ValueError.__init__(self, msg)
# Doesn't look like python supports forward slash auto enclosure in re.findall
# So convert it to inline flag "(?i)foobar" type configuration
@lru_cache(maxsize=100)
def perl_style_slash_enclosed_regex_to_options(regex):
res = re.search(PERL_STYLE_REGEX, regex, re.IGNORECASE)
@@ -58,17 +58,13 @@ def include_filters(include_filters, html_content, append_pretty_line_formatting
return html_block
def subtractive_css_selector(css_selector, content):
def subtractive_css_selector(css_selector, html_content):
from bs4 import BeautifulSoup
soup = BeautifulSoup(content, "html.parser")
soup = BeautifulSoup(html_content, "html.parser")
# So that the elements don't shift their index, build a list of elements here which will be pointers to their place in the DOM
elements_to_remove = soup.select(css_selector)
if not elements_to_remove:
# Better to return the original than rebuild with BeautifulSoup
return content
# Then, remove them in a separate loop
for item in elements_to_remove:
item.decompose()
@@ -76,7 +72,6 @@ def subtractive_css_selector(css_selector, content):
return str(soup)
def subtractive_xpath_selector(selectors: List[str], html_content: str) -> str:
from lxml import etree
# Parse the HTML content using lxml
html_tree = etree.HTML(html_content)
@@ -88,10 +83,6 @@ def subtractive_xpath_selector(selectors: List[str], html_content: str) -> str:
# Collect elements for each selector
elements_to_remove.extend(html_tree.xpath(selector))
# If no elements were found, return the original HTML content
if not elements_to_remove:
return html_content
# Then, remove them in a separate loop
for element in elements_to_remove:
if element.getparent() is not None: # Ensure the element has a parent before removing
@@ -109,7 +100,7 @@ def element_removal(selectors: List[str], html_content):
xpath_selectors = []
for selector in selectors:
if selector.strip().startswith(('xpath:', 'xpath1:', '//')):
if selector.startswith(('xpath:', 'xpath1:', '//')):
# Handle XPath selectors separately
xpath_selector = selector.removeprefix('xpath:').removeprefix('xpath1:')
xpath_selectors.append(xpath_selector)
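The practical difference between the two `startswith` lines in this hunk is whether leading whitespace hides an XPath prefix. The sketch below is a simplified, hypothetical `split_selectors` helper (not the project's `element_removal`) that strips each selector before classifying it; note that, unlike the hunk, it also strips the stored value. `str.removeprefix` requires Python 3.9 or newer.

```python
from typing import List, Tuple

XPATH_PREFIXES = ('xpath:', 'xpath1:', '//')


def split_selectors(selectors: List[str]) -> Tuple[List[str], List[str]]:
    """Split a mixed selector list into (css_selectors, xpath_selectors)."""
    css, xpath = [], []
    for selector in selectors:
        candidate = selector.strip()
        if candidate.startswith(XPATH_PREFIXES):
            # Drop the routing prefix, keeping only the XPath expression.
            xpath.append(candidate.removeprefix('xpath:').removeprefix('xpath1:'))
        else:
            css.append(candidate)
    return css, xpath


css, xpath = split_selectors(
    ['  xpath://div[@id="ad"]', 'xpath1://footer', '//nav', 'header .banner']
)
```

Without the `strip()`, the first entry would fall through to the CSS list because of its leading spaces.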
@@ -172,131 +163,75 @@ def elementpath_tostring(obj):
return str(obj)
# Return str Utf-8 of matched rules
def xpath_filter(xpath_filter, html_content, append_pretty_line_formatting=False, is_xml=False):
"""
:param xpath_filter:
:param html_content:
:param append_pretty_line_formatting:
:param is_xml: set to true if is XML or is RSS (RSS is XML)
:return:
"""
def xpath_filter(xpath_filter, html_content, append_pretty_line_formatting=False, is_rss=False):
from lxml import etree, html
import elementpath
# xpath 2.0-3.1
from elementpath.xpath3 import XPath3Parser
parser = etree.HTMLParser()
tree = None
try:
if is_xml:
# So that we can keep CDATA for cdata_in_document_to_text() to process
parser = etree.XMLParser(strip_cdata=False)
# For XML/RSS content, use etree.fromstring to properly handle XML declarations
tree = etree.fromstring(html_content.encode('utf-8') if isinstance(html_content, str) else html_content, parser=parser)
if is_rss:
# So that we can keep CDATA for cdata_in_document_to_text() to process
parser = etree.XMLParser(strip_cdata=False)
tree = html.fromstring(bytes(html_content, encoding='utf-8'), parser=parser)
html_block = ""
r = elementpath.select(tree, xpath_filter.strip(), namespaces={'re': 'http://exslt.org/regular-expressions'}, parser=XPath3Parser)
#@note: //title/text() wont work where <title>CDATA..
if type(r) != list:
r = [r]
for element in r:
# When there's more than 1 match, then add the suffix to separate each line
# And where the matched result doesn't include something that will cause Inscriptis to add a newline
# (This way each 'match' reliably has a new-line in the diff)
# Divs are converted to 4 whitespaces by inscriptis
if append_pretty_line_formatting and len(html_block) and (not hasattr( element, 'tag' ) or not element.tag in (['br', 'hr', 'div', 'p'])):
html_block += TEXT_FILTER_LIST_LINE_SUFFIX
if type(element) == str:
html_block += element
elif issubclass(type(element), etree._Element) or issubclass(type(element), etree._ElementTree):
html_block += etree.tostring(element, pretty_print=True).decode('utf-8')
else:
tree = html.fromstring(html_content, parser=parser)
html_block = ""
html_block += elementpath_tostring(element)
# Build namespace map for XPath queries
namespaces = {'re': 'http://exslt.org/regular-expressions'}
# Handle default namespace in documents (common in RSS/Atom feeds, but can occur in any XML)
# XPath spec: unprefixed element names have no namespace, not the default namespace
# Solution: Register the default namespace with empty string prefix in elementpath
# This is primarily for RSS/Atom feeds but works for any XML with default namespace
if hasattr(tree, 'nsmap') and tree.nsmap and None in tree.nsmap:
# Register the default namespace with empty string prefix for elementpath
# This allows //title to match elements in the default namespace
namespaces[''] = tree.nsmap[None]
r = elementpath.select(tree, xpath_filter.strip(), namespaces=namespaces, parser=XPath3Parser)
#@note: //title/text() now works with default namespaces (fixed by registering '' prefix)
#@note: //title/text() wont work where <title>CDATA.. (use cdata_in_document_to_text first)
if type(r) != list:
r = [r]
for element in r:
# When there's more than 1 match, then add the suffix to separate each line
# And where the matched result doesn't include something that will cause Inscriptis to add a newline
# (This way each 'match' reliably has a new-line in the diff)
# Divs are converted to 4 whitespaces by inscriptis
if append_pretty_line_formatting and len(html_block) and (not hasattr( element, 'tag' ) or not element.tag in (['br', 'hr', 'div', 'p'])):
html_block += TEXT_FILTER_LIST_LINE_SUFFIX
if type(element) == str:
html_block += element
elif issubclass(type(element), etree._Element) or issubclass(type(element), etree._ElementTree):
# Use 'xml' method for RSS/XML content, 'html' for HTML content
# parser will be XMLParser if we detected XML content
method = 'xml' if (is_xml or isinstance(parser, etree.XMLParser)) else 'html'
html_block += etree.tostring(element, pretty_print=True, method=method, encoding='unicode')
else:
html_block += elementpath_tostring(element)
return html_block
finally:
# Explicitly clear the tree to free memory
# lxml trees can hold significant memory, especially with large documents
if tree is not None:
tree.clear()
return html_block
# Return str Utf-8 of matched rules
# 'xpath1:'
def xpath1_filter(xpath_filter, html_content, append_pretty_line_formatting=False, is_xml=False):
def xpath1_filter(xpath_filter, html_content, append_pretty_line_formatting=False, is_rss=False):
from lxml import etree, html
parser = None
tree = None
try:
if is_xml:
# So that we can keep CDATA for cdata_in_document_to_text() to process
parser = etree.XMLParser(strip_cdata=False)
# For XML/RSS content, use etree.fromstring to properly handle XML declarations
tree = etree.fromstring(html_content.encode('utf-8') if isinstance(html_content, str) else html_content, parser=parser)
if is_rss:
# So that we can keep CDATA for cdata_in_document_to_text() to process
parser = etree.XMLParser(strip_cdata=False)
tree = html.fromstring(bytes(html_content, encoding='utf-8'), parser=parser)
html_block = ""
r = tree.xpath(xpath_filter.strip(), namespaces={'re': 'http://exslt.org/regular-expressions'})
#@note: //title/text() wont work where <title>CDATA..
for element in r:
# When there's more than 1 match, then add the suffix to separate each line
# And where the matched result doesn't include something that will cause Inscriptis to add a newline
# (This way each 'match' reliably has a new-line in the diff)
# Divs are converted to 4 whitespaces by inscriptis
if append_pretty_line_formatting and len(html_block) and (not hasattr(element, 'tag') or not element.tag in (['br', 'hr', 'div', 'p'])):
html_block += TEXT_FILTER_LIST_LINE_SUFFIX
# Some kind of text, UTF-8 or other
if isinstance(element, (str, bytes)):
html_block += element
else:
tree = html.fromstring(html_content, parser=parser)
html_block = ""
# Return the HTML which will get parsed as text
html_block += etree.tostring(element, pretty_print=True).decode('utf-8')
# Build namespace map for XPath queries
namespaces = {'re': 'http://exslt.org/regular-expressions'}
# NOTE: lxml's native xpath() does NOT support empty string prefix for default namespace
# For documents with default namespace (RSS/Atom feeds), users must use:
# - local-name(): //*[local-name()='title']/text()
# - Or use xpath_filter (not xpath1_filter) which supports default namespaces
# XPath spec: unprefixed element names have no namespace, not the default namespace
r = tree.xpath(xpath_filter.strip(), namespaces=namespaces)
#@note: xpath1 (lxml) does NOT automatically handle default namespaces
#@note: Use //*[local-name()='element'] or switch to xpath_filter for default namespace support
#@note: //title/text() wont work where <title>CDATA.. (use cdata_in_document_to_text first)
for element in r:
# When there's more than 1 match, then add the suffix to separate each line
# And where the matched result doesn't include something that will cause Inscriptis to add a newline
# (This way each 'match' reliably has a new-line in the diff)
# Divs are converted to 4 whitespaces by inscriptis
if append_pretty_line_formatting and len(html_block) and (not hasattr(element, 'tag') or not element.tag in (['br', 'hr', 'div', 'p'])):
html_block += TEXT_FILTER_LIST_LINE_SUFFIX
# Some kind of text, UTF-8 or other
if isinstance(element, (str, bytes)):
html_block += element
else:
# Return the HTML/XML which will get parsed as text
# Use 'xml' method for RSS/XML content, 'html' for HTML content
# parser will be XMLParser if we detected XML content
method = 'xml' if (is_xml or isinstance(parser, etree.XMLParser)) else 'html'
html_block += etree.tostring(element, pretty_print=True, method=method, encoding='unicode')
return html_block
finally:
# Explicitly clear the tree to free memory
# lxml trees can hold significant memory, especially with large documents
if tree is not None:
tree.clear()
return html_block
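The namespace comments above describe a general XPath rule: an unprefixed name in the expression does not match elements that sit in a document's default namespace. The sketch below shows that rule using the standard library's `xml.etree.ElementTree` rather than lxml or elementpath, so it is an analogy under that substitution and not the code path in this diff. The feed content is made up for illustration; the `{*}` wildcard and the `''` prefix mapping need Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Made-up Atom document that declares a default namespace.
ATOM = f"""<feed xmlns="{ATOM_NS}">
  <title>Example feed</title>
  <entry><title>First entry</title></entry>
</feed>"""

root = ET.fromstring(ATOM)

# Unprefixed name: matches nothing, the elements live in the Atom namespace.
unprefixed = root.findall(".//title")

# Explicit prefix mapped to the namespace URI.
qualified = [el.text for el in root.findall(".//atom:title", {"atom": ATOM_NS})]

# Namespace wildcard, similar in spirit to local-name() = 'title'.
wildcard = [el.text for el in root.findall(".//{*}title")]

# Mapping the empty prefix moves unprefixed names into the given namespace.
defaulted = [el.text for el in root.findall(".//title", {"": ATOM_NS})]
```

The last form mirrors the idea of registering the default namespace under an empty prefix, while the wildcard form corresponds to the `local-name()` workaround suggested for the XPath 1.0 filter.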
# Extract/find element
def extract_element(find='title', html_content=''):
@@ -360,92 +295,70 @@ def _get_stripped_text_from_json_match(match):
return stripped_text_from_html
def extract_json_blob_from_html(content, ensure_is_ldjson_info_type, json_filter):
from bs4 import BeautifulSoup
stripped_text_from_html = ''
# Foreach <script json></script> blob.. just return the first that matches json_filter
# As a last resort, try to parse the whole <body>
soup = BeautifulSoup(content, 'html.parser')
if ensure_is_ldjson_info_type:
bs_result = soup.find_all('script', {"type": "application/ld+json"})
else:
bs_result = soup.find_all('script')
bs_result += soup.find_all('body')
bs_jsons = []
for result in bs_result:
# result.text is how bs4 magically strips JSON from the body
content_start = result.text.lstrip("\ufeff").strip()[:100] if result.text else ''
# Skip empty tags, and things that dont even look like JSON
if not result.text or not (content_start[0] == '{' or content_start[0] == '['):
continue
try:
json_data = json.loads(result.text)
bs_jsons.append(json_data)
except json.JSONDecodeError:
# Skip objects which cannot be parsed
continue
if not bs_jsons:
raise JSONNotFound("No parsable JSON found in this document")
for json_data in bs_jsons:
stripped_text_from_html = _parse_json(json_data, json_filter)
if ensure_is_ldjson_info_type:
# Could sometimes be list, string or something else random
if isinstance(json_data, dict):
# If it has LD JSON 'key' @type, and @type is 'product', and something was found for the search
# (Some sites have multiple of the same ld+json @type='product', but some have the review part, some have the 'price' part)
# @type could also be a list although non-standard ("@type": ["Product", "SubType"],)
# LD_JSON auto-extract also requires some content PLUS the ldjson to be present
# 1833 - could be either str or dict, should not be anything else
t = json_data.get('@type')
if t and stripped_text_from_html:
if isinstance(t, str) and t.lower() == ensure_is_ldjson_info_type.lower():
break
# The non-standard part, some have a list
elif isinstance(t, list):
if ensure_is_ldjson_info_type.lower() in [x.lower().strip() for x in t]:
break
elif stripped_text_from_html:
break
return stripped_text_from_html
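The "does this even look like JSON" test used above can be sketched on its own. `parse_if_json` is a hypothetical helper, not a function from the project; it uses `str.startswith` with a tuple instead of indexing the first character, which sidesteps an `IndexError` when the text is only whitespace.

```python
import json
from typing import Any, Optional


def parse_if_json(text: str) -> Optional[Any]:
    """Return the parsed value if text looks like and parses as JSON, else None."""
    # Drop a UTF-8 byte order mark, then surrounding whitespace.
    candidate = text.lstrip("\ufeff").strip()
    if not candidate.startswith(("{", "[")):
        return None  # does not look like a JSON object or array
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None  # looked like JSON but could not be parsed
```

A caller could try this on the whole document first and only fall back to scanning `<script>` blocks when it returns `None`.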
# content - json
# json_filter - ie json:$..price
# ensure_is_ldjson_info_type - str "product", optional, "@type == product" (I dont know how to do that as a json selector)
def extract_json_as_string(content, json_filter, ensure_is_ldjson_info_type=None):
from bs4 import BeautifulSoup
stripped_text_from_html = False
# https://github.com/dgtlmoon/changedetection.io/pull/2041#issuecomment-1848397161w
# Try to parse/filter out the JSON, if we get some parser error, then maybe it's embedded within HTML tags
try:
# .lstrip("\ufeff") strips the byte order mark (BOM) from UTF-8 and still lets the UTF work
stripped_text_from_html = _parse_json(json.loads(content.lstrip("\ufeff") ), json_filter)
except json.JSONDecodeError as e:
logger.warning(str(e))
# Looks like clean JSON, dont bother extracting from HTML
# Foreach <script json></script> blob.. just return the first that matches json_filter
# As a last resort, try to parse the whole <body>
soup = BeautifulSoup(content, 'html.parser')
content_start = content.lstrip("\ufeff").strip()[:100]
if ensure_is_ldjson_info_type:
bs_result = soup.find_all('script', {"type": "application/ld+json"})
else:
bs_result = soup.find_all('script')
bs_result += soup.find_all('body')
if content_start[0] == '{' or content_start[0] == '[':
try:
# .lstrip("\ufeff") strips the byte order mark (BOM) from UTF-8 content so json.loads() can still parse it
stripped_text_from_html = _parse_json(json.loads(content.lstrip("\ufeff")), json_filter)
except json.JSONDecodeError as e:
logger.warning(f"Error processing JSON {content[:20]}...{str(e)})")
else:
# Probably something else, go fish inside for it
try:
stripped_text_from_html = extract_json_blob_from_html(content=content,
ensure_is_ldjson_info_type=ensure_is_ldjson_info_type,
json_filter=json_filter )
except json.JSONDecodeError as e:
logger.warning(f"Error processing JSON while extracting JSON from HTML blob {content[:20]}...{str(e)})")
bs_jsons = []
for result in bs_result:
# Skip empty tags, and things that don't even look like JSON
if not result.text or '{' not in result.text:
continue
try:
json_data = json.loads(result.text)
bs_jsons.append(json_data)
except json.JSONDecodeError:
# Skip objects which cannot be parsed
continue
if not bs_jsons:
raise JSONNotFound("No parsable JSON found in this document")
for json_data in bs_jsons:
stripped_text_from_html = _parse_json(json_data, json_filter)
if ensure_is_ldjson_info_type:
# Could sometimes be list, string or something else random
if isinstance(json_data, dict):
# If it has LD JSON 'key' @type, and @type is 'product', and something was found for the search
# (Some sites have multiple of the same ld+json @type='product', but some have the review part, some have the 'price' part)
# @type could also be a list although non-standard ("@type": ["Product", "SubType"],)
# LD_JSON auto-extract also requires some content PLUS the ldjson to be present
# 1833 - could be either str or dict, should not be anything else
t = json_data.get('@type')
if t and stripped_text_from_html:
if isinstance(t, str) and t.lower() == ensure_is_ldjson_info_type.lower():
break
# The non-standard part, some have a list
elif isinstance(t, list):
if ensure_is_ldjson_info_type.lower() in [x.lower().strip() for x in t]:
break
elif stripped_text_from_html:
break
if not stripped_text_from_html:
# Re 265 - Just return an empty string when filter not found
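Review note: the `@type` check in the hunk above accepts either a string or a (non-standard) list of strings and compares case-insensitively. A minimal stand-alone sketch of that rule follows; the helper name `ldjson_type_matches` is hypothetical and is not part of the diff, and the `isinstance(x, str)` guard inside the list branch is an extra assumption added for safety.

```python
import json


def ldjson_type_matches(json_data, wanted_type):
    """Return True when json_data is a dict whose '@type' matches wanted_type.

    Hypothetical helper for illustration only. '@type' may be a string or,
    non-standard, a list of strings; comparison is case-insensitive.
    """
    if not isinstance(json_data, dict):
        # ld+json blobs can also be lists or scalars; those never match here
        return False
    t = json_data.get('@type')
    if isinstance(t, str):
        return t.lower() == wanted_type.lower()
    if isinstance(t, list):
        # Assumption: ignore non-string list members instead of raising
        return wanted_type.lower() in [x.lower().strip() for x in t if isinstance(x, str)]
    return False


# Strip a leading byte order mark before parsing, as the diff does
raw = '\ufeff{"@type": ["Product", "SubType"], "price": 10}'
blob = json.loads(raw.lstrip('\ufeff'))
print(ldjson_type_matches(blob, 'product'))
```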
@@ -465,9 +378,6 @@ def strip_ignore_text(content, wordlist, mode="content"):
ignored_lines = []
for k in wordlist:
# Skip empty strings to avoid matching everything
if not k or not k.strip():
continue
# Is it a regex?
res = re.search(PERL_STYLE_REGEX, k, re.IGNORECASE)
if res:

@@ -1,22 +0,0 @@
"""
Jinja2 custom extensions and safe rendering utilities.
"""
from .extensions.TimeExtension import TimeExtension
from .safe_jinja import (
render,
render_fully_escaped,
create_jinja_env,
JINJA2_MAX_RETURN_PAYLOAD_SIZE,
DEFAULT_JINJA2_EXTENSIONS,
)
from .plugins.regex import regex_replace
__all__ = [
'TimeExtension',
'render',
'render_fully_escaped',
'create_jinja_env',
'JINJA2_MAX_RETURN_PAYLOAD_SIZE',
'DEFAULT_JINJA2_EXTENSIONS',
'regex_replace',
]

@@ -1,221 +0,0 @@
"""
Jinja2 TimeExtension - Custom date/time handling for templates.
This extension provides the {% now %} tag for Jinja2 templates, offering timezone-aware
date/time formatting with support for time offsets.
Why This Extension Exists:
The Arrow library has a now() function (arrow.now()), but Jinja2 templates cannot
directly call Python functions - they need extensions or filters to expose functionality.
This TimeExtension serves as a Jinja2-to-Arrow bridge that:
1. Makes Arrow accessible in templates - Jinja2 requires registering functions/tags
through extensions. You cannot use arrow.now() directly in a template.
2. Provides template-friendly syntax - Instead of complex Python code, you get clean tags:
{% now 'UTC' %}
{% now 'UTC' + 'hours=2' %}
{% now 'Europe/London', '%Y-%m-%d' %}
3. Adds convenience features on top of Arrow:
- Default timezone from environment variable (TZ) or config
- Default datetime format configuration
- Offset syntax parsing: 'hours=2,minutes=30' → shift(hours=2, minutes=30)
- Empty string timezone support to use configured defaults
4. Maintains security - Works within Jinja2's sandboxed environment so users
cannot access arbitrary Python code or objects.
Essentially, this is a Jinja2 wrapper around arrow.now() and Arrow.shift() that
provides user-friendly template syntax while maintaining security.
Basic Usage:
{% now 'UTC' %}
# Output: Wed, 09 Dec 2015 23:33:01
Custom Format:
{% now 'UTC', '%Y-%m-%d %H:%M:%S' %}
# Output: 2015-12-09 23:33:01
Timezone Support:
{% now 'America/New_York' %}
{% now 'Europe/London' %}
{% now '' %} # Uses default timezone from environment.default_timezone
Time Offsets (Addition):
{% now 'UTC' + 'hours=2' %}
{% now 'UTC' + 'hours=2,minutes=30' %}
{% now 'UTC' + 'days=1,hours=2,minutes=15,seconds=10' %}
Time Offsets (Subtraction):
{% now 'UTC' - 'minutes=11' %}
{% now 'UTC' - 'days=2,minutes=33,seconds=1' %}
Time Offsets with Custom Format:
{% now 'UTC' + 'hours=2', '%Y-%m-%d %H:%M:%S' %}
# Output: 2015-12-10 01:33:01
Weekday Support (for finding next/previous weekday):
{% now 'UTC' + 'weekday=0' %} # Next Monday (0=Monday, 6=Sunday)
{% now 'UTC' + 'weekday=4' %} # Next Friday
Configuration:
- Default timezone: Set via TZ environment variable or override environment.default_timezone
- Default format: '%a, %d %b %Y %H:%M:%S' (can be overridden via environment.datetime_format)
Environment Customization:
from changedetectionio.jinja2_custom import create_jinja_env
jinja2_env = create_jinja_env()
jinja2_env.default_timezone = 'America/New_York' # Override default timezone
jinja2_env.datetime_format = '%Y-%m-%d %H:%M' # Override default format
Supported Offset Parameters:
- years, months, weeks, days
- hours, minutes, seconds, microseconds
- weekday (0=Monday through 6=Sunday, must be integer)
Note:
This extension uses the Arrow library for timezone-aware datetime handling.
All timezone names should be valid IANA timezone identifiers (e.g., 'America/New_York').
"""
import arrow
from jinja2 import nodes
from jinja2.ext import Extension
import os
class TimeExtension(Extension):
"""
Jinja2 Extension providing the {% now %} tag for timezone-aware date/time rendering.
This extension adds two attributes to the Jinja2 environment:
- datetime_format: Default strftime format string (default: '%a, %d %b %Y %H:%M:%S')
- default_timezone: Default timezone for rendering (default: TZ env var or 'UTC')
Both can be overridden after environment creation by setting the attributes directly.
"""
tags = {'now'}
def __init__(self, environment):
"""Jinja2 Extension constructor."""
super().__init__(environment)
environment.extend(
datetime_format='%a, %d %b %Y %H:%M:%S',
default_timezone=os.getenv('TZ', 'UTC').strip()
)
def _datetime(self, timezone, operator, offset, datetime_format):
"""
Get current datetime with time offset applied.
Args:
timezone: IANA timezone identifier (e.g., 'UTC', 'America/New_York') or empty string for default
operator: '+' for addition or '-' for subtraction
offset: Comma-separated offset parameters (e.g., 'hours=2,minutes=30')
datetime_format: strftime format string or None to use environment default
Returns:
Formatted datetime string with offset applied
Example:
_datetime('UTC', '+', 'hours=2,minutes=30', '%Y-%m-%d %H:%M:%S')
# Returns current time + 2.5 hours
"""
# Use default timezone if none specified
if not timezone or timezone == '':
timezone = self.environment.default_timezone
d = arrow.now(timezone)
# parse shift params from offset and include operator
shift_params = {}
for param in offset.split(','):
interval, value = param.split('=')
shift_params[interval.strip()] = float(operator + value.strip())
# The weekday parameter must be an int, not a float
if 'weekday' in shift_params:
shift_params['weekday'] = int(shift_params['weekday'])
d = d.shift(**shift_params)
if datetime_format is None:
datetime_format = self.environment.datetime_format
return d.strftime(datetime_format)
def _now(self, timezone, datetime_format):
"""
Get current datetime without any offset.
Args:
timezone: IANA timezone identifier (e.g., 'UTC', 'America/New_York') or empty string for default
datetime_format: strftime format string or None to use environment default
Returns:
Formatted datetime string for current time
Example:
_now('America/New_York', '%Y-%m-%d %H:%M:%S')
# Returns current time in New York timezone
"""
# Use default timezone if none specified
if not timezone or timezone == '':
timezone = self.environment.default_timezone
if datetime_format is None:
datetime_format = self.environment.datetime_format
return arrow.now(timezone).strftime(datetime_format)
def parse(self, parser):
"""
Parse the {% now %} tag and generate appropriate AST nodes.
This method is called by Jinja2 when it encounters a {% now %} tag.
It parses the tag syntax and determines whether to call _now() or _datetime()
based on whether offset operations (+ or -) are present.
Supported syntax:
{% now 'timezone' %} -> calls _now()
{% now 'timezone', 'format' %} -> calls _now()
{% now 'timezone' + 'offset' %} -> calls _datetime()
{% now 'timezone' + 'offset', 'format' %} -> calls _datetime()
{% now 'timezone' - 'offset', 'format' %} -> calls _datetime()
Args:
parser: Jinja2 parser instance
Returns:
nodes.Output: AST output node containing the formatted datetime string
"""
lineno = next(parser.stream).lineno
node = parser.parse_expression()
if parser.stream.skip_if('comma'):
datetime_format = parser.parse_expression()
else:
datetime_format = nodes.Const(None)
if isinstance(node, nodes.Add):
call_method = self.call_method(
'_datetime',
[node.left, nodes.Const('+'), node.right, datetime_format],
lineno=lineno,
)
elif isinstance(node, nodes.Sub):
call_method = self.call_method(
'_datetime',
[node.left, nodes.Const('-'), node.right, datetime_format],
lineno=lineno,
)
else:
call_method = self.call_method(
'_now',
[node, datetime_format],
lineno=lineno,
)
return nodes.Output([call_method], lineno=lineno)
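Review note: the offset parsing inside `_datetime()` can be tried in isolation. The sketch below copies the parsing loop into a hypothetical helper named `parse_offset` and, to stay dependency-free, applies the result with the standard library's `timedelta` instead of Arrow. That substitution is an assumption made for illustration: `timedelta` does not understand `years`, `months`, or `weekday`, which the extension hands to Arrow.

```python
from datetime import datetime, timedelta


def parse_offset(operator, offset):
    """Turn 'hours=2,minutes=30' plus an operator into signed shift parameters."""
    shift_params = {}
    for param in offset.split(','):
        interval, value = param.split('=')
        # The operator ('+' or '-') is prefixed to the value before conversion
        shift_params[interval.strip()] = float(operator + value.strip())
    # weekday must be an int, all other values stay floats
    if 'weekday' in shift_params:
        shift_params['weekday'] = int(shift_params['weekday'])
    return shift_params


# Fixed timestamp taken from the docstring examples, so the result is reproducible
base = datetime(2015, 12, 9, 23, 33, 1)
shifted = base + timedelta(**parse_offset('+', 'hours=2'))
print(shifted.strftime('%Y-%m-%d %H:%M:%S'))
```

Because the operator is folded into each value, `{% now 'UTC' - 'days=2,minutes=33' %}` negates every component of the offset, not only the first one.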

@@ -1,6 +0,0 @@
"""
Jinja2 custom filter plugins for changedetection.io
"""
from .regex import regex_replace
__all__ = ['regex_replace']

@@ -1,98 +0,0 @@
"""
Regex filter plugin for Jinja2 templates.
Provides regex_replace filter for pattern-based string replacements in templates.
"""
import re
import signal
from loguru import logger
def regex_replace(value: str, pattern: str, replacement: str = '', count: int = 0) -> str:
"""
Replace occurrences of a regex pattern in a string.
Security: Protected against ReDoS (Regular Expression Denial of Service) attacks:
- Limits input value size to prevent excessive processing
- Uses timeout mechanism to prevent runaway regex operations
- Validates pattern complexity to prevent catastrophic backtracking
Args:
value: The input string to perform replacements on
pattern: The regex pattern to search for
replacement: The replacement string (default: '')
count: Maximum number of replacements (0 = replace all, default: 0)
Returns:
String with replacements applied, or original value on error
Example:
{{ "hello world" | regex_replace("world", "universe") }}
{{ diff | regex_replace("<td>([^<]+)</td><td>([^<]+)</td>", "Label1: \\1\\nLabel2: \\2") }}
Security limits:
- Maximum input size: 10MB
- Maximum pattern length: 500 characters
- Operation timeout: 10 seconds
- Dangerous nested quantifier patterns are rejected
"""
# Security limits
MAX_INPUT_SIZE = 1024 * 1024 * 10 # 10MB max input size
MAX_PATTERN_LENGTH = 500 # Maximum regex pattern length
REGEX_TIMEOUT_SECONDS = 10 # Maximum time for regex operation
# Validate input sizes
value_str = str(value)
if len(value_str) > MAX_INPUT_SIZE:
logger.warning(f"regex_replace: Input too large ({len(value_str)} bytes), truncating")
value_str = value_str[:MAX_INPUT_SIZE]
if len(pattern) > MAX_PATTERN_LENGTH:
logger.warning(f"regex_replace: Pattern too long ({len(pattern)} chars), rejecting")
return value_str
# Check for potentially dangerous patterns (basic checks)
# Nested quantifiers like (a+)+ can cause catastrophic backtracking
dangerous_patterns = [
r'\([^)]*\+[^)]*\)\+', # (x+)+
r'\([^)]*\*[^)]*\)\+', # (x*)+
r'\([^)]*\+[^)]*\)\*', # (x+)*
r'\([^)]*\*[^)]*\)\*', # (x*)*
]
for dangerous in dangerous_patterns:
if re.search(dangerous, pattern):
logger.warning(f"regex_replace: Potentially dangerous pattern detected: {pattern}")
return value_str
def timeout_handler(signum, frame):
raise TimeoutError("Regex operation timed out")
try:
# Set up timeout for regex operation (Unix-like systems only)
# This prevents ReDoS attacks
old_handler = None
if hasattr(signal, 'SIGALRM'):
old_handler = signal.signal(signal.SIGALRM, timeout_handler)
signal.alarm(REGEX_TIMEOUT_SECONDS)
try:
result = re.sub(pattern, replacement, value_str, count=count)
finally:
# Cancel the alarm
if hasattr(signal, 'SIGALRM'):
signal.alarm(0)
if old_handler is not None:
signal.signal(signal.SIGALRM, old_handler)
return result
except TimeoutError:
logger.error(f"regex_replace: Regex operation timed out - possible ReDoS attack. Pattern: {pattern}")
return value_str
except re.error as e:
logger.warning(f"regex_replace: Invalid regex pattern: {e}")
return value_str
except Exception as e:
logger.error(f"regex_replace: Unexpected error: {e}")
return value_str
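Review note: the nested-quantifier screen in `regex_replace()` is a heuristic, and it helps to see what it does and does not flag. The sketch below reuses the four patterns from the diff inside a hypothetical helper named `looks_dangerous`. Separately, `signal.signal()` can only be called from the main thread, so in a worker thread the alarm setup would raise `ValueError`, which the function's generic `except Exception` branch would then turn into "return the input unchanged".

```python
import re

# The same four screening patterns as in the diff
DANGEROUS_PATTERNS = [
    r'\([^)]*\+[^)]*\)\+',  # (x+)+
    r'\([^)]*\*[^)]*\)\+',  # (x*)+
    r'\([^)]*\+[^)]*\)\*',  # (x+)*
    r'\([^)]*\*[^)]*\)\*',  # (x*)*
]


def looks_dangerous(pattern):
    """Return True if the pattern contains a quantified group with a quantifier inside."""
    return any(re.search(dangerous, pattern) for dangerous in DANGEROUS_PATTERNS)


for candidate in ['(a+)+', '(a*)*', '(ab)+', 'a+b', '(a|aa)+']:
    print(candidate, looks_dangerous(candidate))
```

Patterns built on alternation, such as `(a|aa)+`, pass this screen even though they can also backtrack heavily, which is why the timeout remains the real safety net.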

@@ -1,58 +0,0 @@
"""
Safe Jinja2 render with max payload sizes
See https://jinja.palletsprojects.com/en/3.1.x/sandbox/#security-considerations
"""
import jinja2.sandbox
import typing as t
import os
from .extensions.TimeExtension import TimeExtension
from .plugins import regex_replace
JINJA2_MAX_RETURN_PAYLOAD_SIZE = 1024 * int(os.getenv("JINJA2_MAX_RETURN_PAYLOAD_SIZE_KB", 1024 * 10))
# Default extensions - can be overridden in create_jinja_env()
DEFAULT_JINJA2_EXTENSIONS = [TimeExtension]
def create_jinja_env(extensions=None, **kwargs) -> jinja2.sandbox.ImmutableSandboxedEnvironment:
"""
Create a sandboxed Jinja2 environment with our custom extensions and default timezone.
Args:
extensions: List of extension classes to use (defaults to DEFAULT_JINJA2_EXTENSIONS)
**kwargs: Additional arguments to pass to ImmutableSandboxedEnvironment
Returns:
Configured Jinja2 environment
"""
if extensions is None:
extensions = DEFAULT_JINJA2_EXTENSIONS
jinja2_env = jinja2.sandbox.ImmutableSandboxedEnvironment(
extensions=extensions,
**kwargs
)
# Get default timezone from environment variable
default_timezone = os.getenv('TZ', 'UTC').strip()
jinja2_env.default_timezone = default_timezone
# Register custom filters
jinja2_env.filters['regex_replace'] = regex_replace
return jinja2_env
# This is used for notifications etc, so actually it's OK to send custom HTML such as <a href> etc, but it should limit what data is available.
# (Which also limits available functions that could be called)
def render(template_str, **args: t.Any) -> str:
jinja2_env = create_jinja_env()
output = jinja2_env.from_string(template_str).render(args)
return output[:JINJA2_MAX_RETURN_PAYLOAD_SIZE]
def render_fully_escaped(content):
env = jinja2.sandbox.ImmutableSandboxedEnvironment(autoescape=True)
template = env.from_string("{{ some_html|e }}")
return template.render(some_html=content)
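Review note: `JINJA2_MAX_RETURN_PAYLOAD_SIZE` is derived from an environment variable expressed in KB, and `render()` applies it by slicing a `str`, so the limit counts characters, not encoded bytes. The sketch below reproduces only that arithmetic with the standard library; `max_payload_size` and `truncate` are hypothetical names, and the environment is passed in as a plain dict so the example does not depend on the real process environment.

```python
def max_payload_size(environ):
    """Same arithmetic as the module constant: KB from the environment, times 1024."""
    return 1024 * int(environ.get("JINJA2_MAX_RETURN_PAYLOAD_SIZE_KB", 1024 * 10))


def truncate(output, limit):
    """Slice the rendered string the way render() does."""
    return output[:limit]


print(max_payload_size({}))                                          # default limit
print(max_payload_size({"JINJA2_MAX_RETURN_PAYLOAD_SIZE_KB": "1"}))  # 1 KB override
print(truncate("héllo", 2))                                          # counts characters
```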

@@ -1,7 +1,6 @@
from os import getenv
from copy import deepcopy
from changedetectionio.blueprint.rss import RSS_FORMAT_TYPES, RSS_CONTENT_FORMAT_DEFAULT
from changedetectionio.blueprint.rss import RSS_FORMAT_TYPES
from changedetectionio.notification import (
default_notification_body,
@@ -54,17 +53,13 @@ class model(dict):
'password': False,
'render_anchor_tag_content': False,
'rss_access_token': None,
'rss_content_format': RSS_CONTENT_FORMAT_DEFAULT,
'rss_template_type': 'system_default',
'rss_template_override': None,
'rss_diff_length': 5,
'rss_content_format': RSS_FORMAT_TYPES[0][0],
'rss_hide_muted_watches': True,
'rss_reader_mode': False,
'scheduler_timezone_default': None, # Default IANA timezone name
'schema_version' : 0,
'shared_diff_access': False,
'strip_ignored_lines': False,
'tags': {}, #@todo use Tag.model initialisers
'timezone': None, # Default IANA timezone name
'webdriver_delay': None , # Extra delay in seconds before extracting text
'ui': {
'use_page_title_in_list': True,
@@ -78,13 +73,12 @@ class model(dict):
def __init__(self, *arg, **kw):
super(model, self).__init__(*arg, **kw)
# CRITICAL: deepcopy to avoid sharing mutable objects between instances
self.update(deepcopy(self.base_config))
self.update(self.base_config)
def parse_headers_from_text_file(filepath):
headers = {}
with open(filepath, 'r', encoding='utf-8') as f:
with open(filepath, 'r') as f:
for l in f.readlines():
l = l.strip()
if not l.startswith('#') and ':' in l:
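Review note: the two `__init__` variants in this file differ only in whether `base_config` is deep-copied before `update()`. The sketch below shows the practical difference with two hypothetical classes and a hypothetical, much smaller `base_config`; it is not the application's real model.

```python
from copy import deepcopy

# Hypothetical stand-in for the class-level base_config with a nested mutable value
base_config = {'settings': {'application': {'tags': {}}}}


class SharedModel(dict):
    """update() without deepcopy: nested dicts are shared with base_config."""
    def __init__(self):
        super().__init__()
        self.update(base_config)


class IsolatedModel(dict):
    """update() with deepcopy: every instance gets its own nested dicts."""
    def __init__(self):
        super().__init__()
        self.update(deepcopy(base_config))


a, b = SharedModel(), SharedModel()
a['settings']['application']['tags']['demo'] = True
print('demo' in b['settings']['application']['tags'])   # leaked into b

c, d = IsolatedModel(), IsolatedModel()
c['settings']['application']['tags']['only_c'] = True
print('only_c' in d['settings']['application']['tags'])  # d is unaffected
```

`dict.update()` copies only the top-level keys, so without `deepcopy` any nested dict or list is the same object in every instance and in `base_config` itself.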

@@ -1,24 +1,42 @@
from blinker import signal
from changedetectionio.validate_url import is_safe_valid_url
from changedetectionio.strtobool import strtobool
from changedetectionio.jinja2_custom import render as jinja_render
from changedetectionio.safe_jinja import render as jinja_render
from . import watch_base
import os
import re
from pathlib import Path
from loguru import logger
from .. import jinja2_custom as safe_jinja
from ..diff import ADDED_PLACEMARKER_OPEN
from .. import safe_jinja
from ..html_tools import TRANSLATE_WHITESPACE_TABLE
# Allowable protocols, protects against javascript: etc
# file:// is further checked by ALLOW_FILE_URI
SAFE_PROTOCOL_REGEX='^(http|https|ftp|file):'
FAVICON_RESAVE_THRESHOLD_SECONDS=86400
minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 3))
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
def is_safe_url(test_url):
# See https://github.com/dgtlmoon/changedetection.io/issues/1358
# Remove 'source:' prefix so we don't get 'source:javascript:' etc
# 'source:' is a valid way to tell us to return the source
r = re.compile(re.escape('source:'), re.IGNORECASE)
test_url = r.sub('', test_url)
pattern = re.compile(os.getenv('SAFE_PROTOCOL_REGEX', SAFE_PROTOCOL_REGEX), re.IGNORECASE)
if not pattern.match(test_url.strip()):
return False
return True
class model(watch_base):
__newest_history_key = None
__history_n = 0
@@ -61,7 +79,7 @@ class model(watch_base):
def link(self):
url = self.get('url', '')
if not is_safe_valid_url(url):
if not is_safe_url(url):
return 'DISABLED'
ready_url = url
@@ -71,8 +89,9 @@ class model(watch_base):
ready_url = jinja_render(template_str=url)
except Exception as e:
logger.critical(f"Invalid URL template for: '{url}' - {str(e)}")
from flask import flash, url_for
from markupsafe import Markup
from flask import (
flash, Markup, url_for
)
message = Markup('<a href="{}#general">The URL {} is invalid and cannot be used, click to edit</a>'.format(
url_for('ui.ui_edit.edit_page', uuid=self.get('uuid')), self.get('url', '')))
flash(message, 'error')
@@ -82,7 +101,7 @@ class model(watch_base):
ready_url=ready_url.replace('source:', '')
# Also double check it after any Jinja2 formatting just in case
if not is_safe_valid_url(ready_url):
if not is_safe_url(ready_url):
return 'DISABLED'
return ready_url
@@ -188,7 +207,7 @@ class model(watch_base):
fname = os.path.join(self.watch_data_dir, "history.txt")
if os.path.isfile(fname):
logger.debug(f"Reading watch history index for {self.get('uuid')}")
with open(fname, "r", encoding='utf-8') as f:
with open(fname, "r") as f:
for i in f.readlines():
if ',' in i:
k, v = i.strip().split(',', 2)
@@ -276,17 +295,9 @@ class model(watch_base):
# When the 'last viewed' timestamp is less than the oldest snapshot, return oldest
return sorted_keys[-1]
def get_history_snapshot(self, timestamp=None, filepath=None):
"""
Accepts either timestamp or filepath
:param timestamp:
:param filepath:
:return:
"""
def get_history_snapshot(self, timestamp):
import brotli
if not filepath:
filepath = self.history[timestamp]
filepath = self.history[timestamp]
# See if a brotli version exists and switch to that
if not filepath.endswith('.br') and os.path.isfile(f"{filepath}.br"):
@@ -390,7 +401,7 @@ class model(watch_base):
# Compare each line (set) against each history text file (set) looking for something new..
existing_history = set({})
for k, v in self.history.items():
content = self.get_history_snapshot(filepath=v)
content = self.get_history_snapshot(k)
if ignore_whitespace:
alist = set([line.translate(TRANSLATE_WHITESPACE_TABLE).lower() for line in content.splitlines()])
@@ -594,7 +605,7 @@ class model(watch_base):
"""Return the text saved from a previous request that resulted in a non-200 error"""
fname = os.path.join(self.watch_data_dir, "last-error.txt")
if os.path.isfile(fname):
with open(fname, 'r', encoding='utf-8') as f:
with open(fname, 'r') as f:
return f.read()
return False
@@ -631,8 +642,10 @@ class model(watch_base):
def extra_notification_token_placeholder_info(self):
# Used for providing extra tokens
# return [('widget', "Get widget amounts")]
return []
values = []
values.append(('watch_html_link', "Link to URL as <a href>"))
values.append(('watch_url_raw', "Raw URL/link before any jinja2 macro"))
return values
def extract_regex_from_all_history(self, regex):
@@ -647,7 +660,7 @@ class model(watch_base):
for k, fname in self.history.items():
if os.path.isfile(fname):
if True:
contents = self.get_history_snapshot(timestamp=k)
contents = self.get_history_snapshot(k)
res = re.findall(regex, contents, re.MULTILINE)
if res:
if not csv_writer:
@@ -740,7 +753,7 @@ class model(watch_base):
# If a previous attempt doesn't yet exist, just snarf the previous snapshot instead
dates = list(self.history.keys())
if len(dates):
return self.get_history_snapshot(timestamp=dates[-1])
return self.get_history_snapshot(dates[-1])
else:
return ''
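Review note: the `is_safe_url()` helper shown earlier in this file's diff is easy to exercise on its own. The sketch below restates it with the protocol regex passed as a parameter instead of being read from the `SAFE_PROTOCOL_REGEX` environment variable; that parameter is an assumption made so the example is deterministic.

```python
import re

SAFE_PROTOCOL_REGEX = '^(http|https|ftp|file):'


def is_safe_url(test_url, protocol_regex=SAFE_PROTOCOL_REGEX):
    """Allow only URLs whose scheme matches protocol_regex (case-insensitive)."""
    # Drop 'source:' first so 'source:javascript:...' cannot slip through
    test_url = re.sub(re.escape('source:'), '', test_url, flags=re.IGNORECASE)
    return bool(re.match(protocol_regex, test_url.strip(), re.IGNORECASE))


for url in ['https://example.com', 'source:https://example.com',
            'javascript:alert(1)', 'source:javascript:alert(1)']:
    print(url, is_safe_url(url))
```

This check validates the scheme only; the note in the diff that `file://` is further gated by `ALLOW_FILE_URI` still applies elsewhere.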

@@ -2,7 +2,7 @@ import os
import uuid
from changedetectionio import strtobool
USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH = 'System default'
default_notification_format_for_watch = 'System default'
CONDITIONS_MATCH_LOGIC_DEFAULT = 'ALL'
class watch_base(dict):
@@ -44,7 +44,7 @@ class watch_base(dict):
'method': 'GET',
'notification_alert_count': 0,
'notification_body': None,
'notification_format': USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH,
'notification_format': default_notification_format_for_watch,
'notification_muted': False,
'notification_screenshot': False, # Include the latest screenshot if available and supported by the apprise URL
'notification_title': None,

@@ -1,16 +1,35 @@
from changedetectionio.model import USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
from changedetectionio.model import default_notification_format_for_watch
default_notification_format = 'htmlcolor'
default_notification_body = '{{watch_url}} had a change.\n---\n{{diff}}\n---\n'
default_notification_title = 'ChangeDetection.io Notification - {{watch_url}}'
ult_notification_format_for_watch = 'System default'
default_notification_format = 'HTML Color'
default_notification_body = '{{watch_title}} had a change.\n---\n{{diff}}\n---\n'
default_notification_title = 'ChangeDetection.io Notification - {{watch_title}}'
# The values (markdown etc) are from apprise NotifyFormat,
# But to avoid importing the whole heavy module just use the same strings here.
valid_notification_formats = {
'text': 'Plain Text',
'html': 'HTML',
'htmlcolor': 'HTML Color',
'markdown': 'Markdown to HTML',
'Text': 'text',
'Markdown': 'markdown',
'HTML': 'html',
'HTML Color': 'htmlcolor',
# Used only for editing a watch (not for global)
USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH: USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
default_notification_format_for_watch: default_notification_format_for_watch
}
valid_tokens = {
'base_url': '',
'current_snapshot': '',
'diff': '',
'diff_added': '',
'diff_full': '',
'diff_patch': '',
'diff_removed': '',
'diff_url': '',
'preview_url': '',
'triggered_text': '',
'watch_tag': '',
'watch_title': '',
'watch_url': '',
'watch_uuid': '',
}

@@ -1,61 +1,10 @@
"""
Custom Apprise HTTP Handlers with format= Parameter Support
IMPORTANT: This module works around a limitation in Apprise's @notify decorator.
THE PROBLEM:
-------------
When using Apprise's @notify decorator to create custom notification handlers, the
decorator creates a CustomNotifyPlugin that uses parse_url(..., simple=True) to parse
URLs. This simple parsing mode does NOT extract the format= query parameter from the URL
and set it as a top-level parameter that NotifyBase.__init__ can use to set notify_format.
As a result:
1. URL: post://example.com/webhook?format=html
2. Apprise parses this and sees format=html in qsd (query string dictionary)
3. But it does NOT extract it and pass it to NotifyBase.__init__
4. NotifyBase defaults to notify_format=TEXT
5. When you call apobj.notify(body="<html>...", body_format="html"):
- Apprise sees: input format = html, output format (notify_format) = text
- Apprise calls convert_between("html", "text", body)
- This strips all HTML tags, leaving only plain text
6. Your custom handler receives stripped plain text instead of HTML
THE SOLUTION:
-------------
Instead of using the @notify decorator directly, we:
1. Manually register custom plugins using plugins.N_MGR.add()
2. Create a CustomHTTPHandler class that extends CustomNotifyPlugin
3. Override __init__ to extract format= from qsd and set it as kwargs['format']
4. Call NotifyBase.__init__ which properly sets notify_format from kwargs['format']
5. Set up _default_args like CustomNotifyPlugin does for compatibility
This ensures that when format=html is in the URL:
- notify_format is set to HTML
- Apprise sees: input format = html, output format = html
- No conversion happens (convert_between returns content unchanged)
- Your custom handler receives the original HTML intact
TESTING:
--------
To verify this works:
>>> apobj = apprise.Apprise()
>>> apobj.add('post://localhost:5005/test?format=html')
>>> for server in apobj:
... print(server.notify_format) # Should print: html (not text)
>>> apobj.notify(body='<span>Test</span>', body_format='html')
# Your handler should receive '<span>Test</span>' not 'Test'
"""
import json
import re
from urllib.parse import unquote_plus
import requests
from apprise import plugins
from apprise.decorators.base import CustomNotifyPlugin
from apprise.utils.parse import parse_url as apprise_parse_url, url_assembly
from apprise.utils.logic import dict_full_update
from apprise.decorators import notify
from apprise.utils.parse import parse_url as apprise_parse_url
from loguru import logger
from requests.structures import CaseInsensitiveDict
@@ -63,64 +12,16 @@ SUPPORTED_HTTP_METHODS = {"get", "post", "put", "delete", "patch", "head"}
def notify_supported_methods(func):
"""Register custom HTTP method handlers that properly support format= parameter."""
for method in SUPPORTED_HTTP_METHODS:
_register_http_handler(method, func)
_register_http_handler(f"{method}s", func)
func = notify(on=method)(func)
# Add support for https, for each supported http method
func = notify(on=f"{method}s")(func)
return func
def _register_http_handler(schema, send_func):
"""Register a custom HTTP handler that extracts format= from URL query parameters."""
# Parse base URL
base_url = f"{schema}://"
base_args = apprise_parse_url(base_url, default_schema=schema, verify_host=False, simple=True)
class CustomHTTPHandler(CustomNotifyPlugin):
secure_protocol = schema
service_name = f"Custom HTTP - {schema.upper()}"
_base_args = base_args
def __init__(self, **kwargs):
# Extract format from qsd and set it as a top-level kwarg
# This allows NotifyBase.__init__ to properly set notify_format
if 'qsd' in kwargs and 'format' in kwargs['qsd']:
kwargs['format'] = kwargs['qsd']['format']
# Call NotifyBase.__init__ (skip CustomNotifyPlugin.__init__)
super(CustomNotifyPlugin, self).__init__(**kwargs)
# Set up _default_args like CustomNotifyPlugin does
self._default_args = {}
kwargs.pop("secure", None)
dict_full_update(self._default_args, self._base_args)
dict_full_update(self._default_args, kwargs)
self._default_args["url"] = url_assembly(**self._default_args)
__send = staticmethod(send_func)
def send(self, body, title="", notify_type="info", *args, **kwargs):
"""Call the custom send function."""
try:
result = self.__send(
body, title, notify_type,
*args,
meta=self._default_args,
**kwargs
)
return True if result is None else bool(result)
except Exception as e:
self.logger.warning(f"Exception in custom HTTP handler: {e}")
return False
# Register the plugin
plugins.N_MGR.add(
plugin=CustomHTTPHandler,
schemas=schema,
send_func=send_func,
url=base_url,
)
def notify_null_method(func):
func = notify(on="null")(func)
return func
def _get_auth(parsed_url: dict) -> str | tuple[str, str]:
@@ -174,12 +75,9 @@ def apprise_http_custom_handler(
title: str,
notify_type: str,
meta: dict,
body_format: str = None,
*args,
**kwargs,
) -> bool:
url: str = meta.get("url")
schema: str = meta.get("schema")
method: str = re.sub(r"s$", "", schema).upper()
@@ -195,16 +93,43 @@ def apprise_http_custom_handler(
url = re.sub(rf"^{schema}", "https" if schema.endswith("s") else "http", parsed_url.get("url"))
response = requests.request(
method=method,
url=url,
auth=auth,
headers=headers,
params=params,
data=body.encode("utf-8") if isinstance(body, str) else body,
)
try:
response = requests.request(
method=method,
url=url,
auth=auth,
headers=headers,
params=params,
data=body.encode("utf-8") if isinstance(body, str) else body,
)
response.raise_for_status()
response.raise_for_status()
logger.info(f"Successfully sent custom notification to {url}")
return True
except requests.RequestException as e:
logger.error(f"Remote host error while sending custom notification to {url}: {e}")
return False
except Exception as e:
logger.error(f"Unexpected error occurred while sending custom notification to {url}: {e}")
return False
@notify_null_method
def apprise_null_custom_handler(
body: str,
title: str,
notify_type: str,
meta: dict,
*args,
**kwargs,
) -> bool:
url: str = meta.get("url")
schema: str = meta.get("schema")
method: str = re.sub(r"s$", "", schema).upper()
logger.info(f"Processed 'null' notification")
logger.info(f"Successfully sent custom notification to {url}")
return True
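Review note: the custom handler derives both the HTTP method and the real transport from the Apprise schema (`post://` means POST over http, `posts://` means POST over https). The sketch below isolates those two `re.sub` steps with the standard library only; `split_schema` and `rewrite_url` are hypothetical helper names, and no Apprise or `requests` calls are involved.

```python
import re

SUPPORTED_HTTP_METHODS = {"get", "post", "put", "delete", "patch", "head"}


def split_schema(schema):
    """Map an Apprise-style schema to (HTTP method, transport scheme)."""
    # A trailing 's' marks the https variant; no supported method ends in 's'
    method = re.sub(r"s$", "", schema).upper()
    transport = "https" if schema.endswith("s") else "http"
    return method, transport


def rewrite_url(schema, url):
    """Replace the custom schema at the start of the URL with http or https."""
    method, transport = split_schema(schema)
    return method, re.sub(rf"^{schema}", transport, url)


print(rewrite_url("posts", "posts://example.com/hook"))
print(rewrite_url("get", "get://example.com/ping"))
```

The trailing-`s` convention is unambiguous only because none of the entries in `SUPPORTED_HTTP_METHODS` ends in `s`; a method that did would need an explicit mapping.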

@@ -1,286 +0,0 @@
"""
Custom Discord plugin for changedetection.io
Extends Apprise's Discord plugin to support custom colored embeds for removed/added content
"""
from apprise.plugins.discord import NotifyDiscord
from apprise.decorators import notify
from apprise.common import NotifyFormat
from loguru import logger
# Import placeholders from changedetection's diff module
from ...diff import (
REMOVED_PLACEMARKER_OPEN,
REMOVED_PLACEMARKER_CLOSED,
ADDED_PLACEMARKER_OPEN,
ADDED_PLACEMARKER_CLOSED,
CHANGED_PLACEMARKER_OPEN,
CHANGED_PLACEMARKER_CLOSED,
CHANGED_INTO_PLACEMARKER_OPEN,
CHANGED_INTO_PLACEMARKER_CLOSED,
)
# Discord embed sidebar colors for different change types
DISCORD_COLOR_UNCHANGED = 8421504 # Gray (#808080)
DISCORD_COLOR_REMOVED = 16711680 # Red (#FF0000)
DISCORD_COLOR_ADDED = 65280 # Green (#00FF00)
DISCORD_COLOR_CHANGED = 16753920 # Orange (#FFA500)
DISCORD_COLOR_CHANGED_INTO = 3447003 # Blue (#5865F2 - Discord blue)
DISCORD_COLOR_WARNING = 16776960 # Yellow (#FFFF00)
class NotifyDiscordCustom(NotifyDiscord):
"""
Custom Discord notification handler that supports multiple colored embeds
for showing removed (red) and added (green) content separately.
"""
def send(self, body, title="", notify_type=None, attach=None, **kwargs):
"""
Override send method to create custom embeds with red/green colors
for removed/added content when placeholders are present.
"""
# Check if body contains our diff placeholders
has_removed = REMOVED_PLACEMARKER_OPEN in body
has_added = ADDED_PLACEMARKER_OPEN in body
has_changed = CHANGED_PLACEMARKER_OPEN in body
has_changed_into = CHANGED_INTO_PLACEMARKER_OPEN in body
# If we have diff placeholders and we're in markdown/html format, create custom embeds
if (has_removed or has_added or has_changed or has_changed_into) and self.notify_format in (NotifyFormat.MARKDOWN, NotifyFormat.HTML):
return self._send_with_colored_embeds(body, title, notify_type, attach, **kwargs)
# Otherwise, use the parent class's default behavior
return super().send(body, title, notify_type, attach, **kwargs)
def _send_with_colored_embeds(self, body, title, notify_type, attach, **kwargs):
"""
Send Discord message with embeds in the original diff order.
Preserves the sequence: unchanged -> removed -> added -> unchanged, etc.
"""
from datetime import datetime, timezone
payload = {
"tts": self.tts,
"wait": self.tts is False,
}
if self.flags:
payload["flags"] = self.flags
# Acquire image_url
image_url = self.image_url(notify_type)
if self.avatar and (image_url or self.avatar_url):
payload["avatar_url"] = self.avatar_url if self.avatar_url else image_url
if self.user:
payload["username"] = self.user
# Associate our thread_id with our message
params = {"thread_id": self.thread_id} if self.thread_id else None
# Build embeds array preserving order
embeds = []
# Add title as plain bold text in message content (not an embed)
if title:
payload["content"] = f"**{title}**"
# Parse the body into ordered chunks
chunks = self._parse_body_into_chunks(body)
# Discord limits:
# - Max 10 embeds per message
# - Max 6000 characters total across all embeds
# - Max 4096 characters per embed description
max_embeds = 10
max_total_chars = 6000
max_embed_description = 4096
# All 10 embed slots are available for content
max_content_embeds = max_embeds
# Start character count
total_chars = 0
# Create embeds from chunks in order (no titles, just color coding)
for chunk_type, content in chunks:
if not content.strip():
continue
# Truncate individual embed description if needed
if len(content) > max_embed_description:
content = content[:max_embed_description - 3] + "..."
# Check if we're approaching the embed count limit
# We need room for the warning embed, so stop at max_content_embeds - 1
current_content_embeds = len(embeds)
if current_content_embeds >= max_content_embeds - 1:
# Add a truncation notice (this will be the 10th embed)
embeds.append({
"description": "⚠️ Content truncated (Discord 10 embed limit reached) - Tip: Select 'Plain Text' or 'HTML' format for longer diffs",
"color": DISCORD_COLOR_WARNING,
})
break
# Check if adding this embed would exceed total character limit
if total_chars + len(content) > max_total_chars:
# Add a truncation notice
remaining_chars = max_total_chars - total_chars
if remaining_chars > 100:
# Add partial content if we have room
truncated_content = content[:remaining_chars - 100] + "..."
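embeds.append({
"description": truncated_content,
# Map every chunk type to its color, including "changed" and "changed_into"
"color": {"unchanged": DISCORD_COLOR_UNCHANGED, "removed": DISCORD_COLOR_REMOVED,
"added": DISCORD_COLOR_ADDED, "changed": DISCORD_COLOR_CHANGED,
"changed_into": DISCORD_COLOR_CHANGED_INTO}.get(chunk_type, DISCORD_COLOR_UNCHANGED),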
})
embeds.append({
"description": "⚠️ Content truncated (Discord 6000 char limit reached)\nTip: Select 'Plain Text' or 'HTML' format for longer diffs",
"color": DISCORD_COLOR_WARNING,
})
break
if chunk_type == "unchanged":
embeds.append({
"description": content,
"color": DISCORD_COLOR_UNCHANGED,
})
elif chunk_type == "removed":
embeds.append({
"description": content,
"color": DISCORD_COLOR_REMOVED,
})
elif chunk_type == "added":
embeds.append({
"description": content,
"color": DISCORD_COLOR_ADDED,
})
elif chunk_type == "changed":
# Changed (old value) - use orange to distinguish from pure removal
embeds.append({
"description": content,
"color": DISCORD_COLOR_CHANGED,
})
elif chunk_type == "changed_into":
# Changed into (new value) - use blue to distinguish from pure addition
embeds.append({
"description": content,
"color": DISCORD_COLOR_CHANGED_INTO,
})
total_chars += len(content)
if embeds:
payload["embeds"] = embeds
# Send the payload using parent's _send method
if not self._send(payload, params=params):
return False
# Handle attachments if present
if attach and self.attachment_support:
payload.update({
"tts": False,
"wait": True,
})
payload.pop("embeds", None)
payload.pop("content", None)
payload.pop("allow_mentions", None)
for attachment in attach:
self.logger.info(f"Posting Discord Attachment {attachment.name}")
if not self._send(payload, params=params, attach=attachment):
return False
return True
def _parse_body_into_chunks(self, body):
"""
Parse the body into ordered chunks of (type, content) tuples.
Types: "unchanged", "removed", "added", "changed", "changed_into"
Preserves the original order of the diff.
"""
chunks = []
position = 0
while position < len(body):
# Find the next marker
next_removed = body.find(REMOVED_PLACEMARKER_OPEN, position)
next_added = body.find(ADDED_PLACEMARKER_OPEN, position)
next_changed = body.find(CHANGED_PLACEMARKER_OPEN, position)
next_changed_into = body.find(CHANGED_INTO_PLACEMARKER_OPEN, position)
# Determine which marker comes first
if next_removed == -1 and next_added == -1 and next_changed == -1 and next_changed_into == -1:
# No more markers, rest is unchanged
if position < len(body):
chunks.append(("unchanged", body[position:]))
break
# Find the earliest marker
next_marker_pos = None
next_marker_type = None
# Compare all marker positions to find the earliest
markers = []
if next_removed != -1:
markers.append((next_removed, "removed"))
if next_added != -1:
markers.append((next_added, "added"))
if next_changed != -1:
markers.append((next_changed, "changed"))
if next_changed_into != -1:
markers.append((next_changed_into, "changed_into"))
if markers:
next_marker_pos, next_marker_type = min(markers, key=lambda x: x[0])
# Add unchanged content before the marker
if next_marker_pos > position:
chunks.append(("unchanged", body[position:next_marker_pos]))
# Find the closing marker
if next_marker_type == "removed":
open_marker = REMOVED_PLACEMARKER_OPEN
close_marker = REMOVED_PLACEMARKER_CLOSED
elif next_marker_type == "added":
open_marker = ADDED_PLACEMARKER_OPEN
close_marker = ADDED_PLACEMARKER_CLOSED
elif next_marker_type == "changed":
open_marker = CHANGED_PLACEMARKER_OPEN
close_marker = CHANGED_PLACEMARKER_CLOSED
else: # changed_into
open_marker = CHANGED_INTO_PLACEMARKER_OPEN
close_marker = CHANGED_INTO_PLACEMARKER_CLOSED
close_pos = body.find(close_marker, next_marker_pos)
if close_pos == -1:
# No closing marker, take rest as this type
content = body[next_marker_pos + len(open_marker):]
chunks.append((next_marker_type, content))
break
else:
# Extract content between markers
content = body[next_marker_pos + len(open_marker):close_pos]
chunks.append((next_marker_type, content))
position = close_pos + len(close_marker)
return chunks
# Register the custom Discord handler with Apprise
# This will override the built-in discord:// handler
@notify(on="discord")
def discord_custom_wrapper(body, title, notify_type, meta, body_format=None, *args, **kwargs):
"""
Wrapper function to make the custom Discord handler work with Apprise's decorator system.
Note: This decorator approach may not work for overriding built-in plugins.
The class-based approach above is the proper way to extend NotifyDiscord.
"""
logger.info("Custom Discord handler called")
# This is here for potential future use with decorator-based registration
return True
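
The ordered-chunk parsing described in `_parse_body_into_chunks` can be sketched in isolation as follows. This is a minimal sketch, not the plugin's code: the marker strings in `MARKERS` and the function name `parse_chunks` are hypothetical stand-ins, since the real placemarker values live in changedetection's diff module and are not shown in this diff.

```python
# Hypothetical marker strings, used only for illustration.
MARKERS = {
    "removed": ("@removed_open@", "@removed_close@"),
    "added": ("@added_open@", "@added_close@"),
}


def parse_chunks(body, markers=MARKERS):
    """Split body into ordered (type, content) tuples, preserving diff order."""
    chunks = []
    position = 0
    while position < len(body):
        # Collect the position of every opening marker at or after the cursor
        found = []
        for kind, (open_marker, _close_marker) in markers.items():
            idx = body.find(open_marker, position)
            if idx != -1:
                found.append((idx, kind))
        if not found:
            # No more markers: the remainder is unchanged text
            chunks.append(("unchanged", body[position:]))
            break
        start, kind = min(found)
        open_marker, close_marker = markers[kind]
        if start > position:
            chunks.append(("unchanged", body[position:start]))
        content_start = start + len(open_marker)
        end = body.find(close_marker, content_start)
        if end == -1:
            # Unclosed marker: treat the rest of the body as this type
            chunks.append((kind, body[content_start:]))
            break
        chunks.append((kind, body[content_start:end]))
        position = end + len(close_marker)
    return chunks


sample = "same @removed_open@old@removed_close@@added_open@new@added_close@ tail"
print(parse_chunks(sample))
# [('unchanged', 'same '), ('removed', 'old'), ('added', 'new'), ('unchanged', ' tail')]
```

Keeping the markers in one mapping lets new chunk types be added without touching the loop. The real plugin may want to keep its explicit branches for readability.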


@@ -1,42 +0,0 @@
def as_monospaced_html_email(content: str, title: str) -> str:
"""
Wraps `content` in a minimal, email-safe HTML template
that forces monospace rendering across Gmail, Hotmail, Apple Mail, etc.
Args:
content: The body text (plain text or HTML-like).
title: The title plaintext
Returns:
A complete HTML document string suitable for sending as an email body.
"""
# Strip every CR/LF character; callers should pass line breaks as <br> tags only,
# so the <pre> styling does not produce doubled line breaks
content = content.translate(str.maketrans('', '', '\r\n'))
if title:
import html
title = html.escape(title)
else:
title = ''
# Full email-safe HTML
html_email = f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="x-apple-disable-message-reformatting">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<!--[if mso]>
<style>
body, div, pre, td {{ font-family: "Courier New", Courier, monospace !important; }}
</style>
<![endif]-->
<title>{title}</title>
</head>
<body style="-webkit-text-size-adjust:100%;-ms-text-size-adjust:100%;">
<pre role="article" aria-roledescription="email" lang="en"
style="font-family: monospace, 'Courier New', Courier; font-size: 0.9rem;
white-space: pre-wrap; word-break: break-word;">{content}</pre>
</body>
</html>"""
return html_email
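
The two preprocessing steps in `as_monospaced_html_email` (dropping raw line feeds and escaping the title) can be tried on their own. This is a small sketch using only the standard library; the helper names `strip_linefeeds` and `safe_title` are hypothetical and do not exist in the codebase.

```python
import html


def strip_linefeeds(content: str) -> str:
    # Delete every CR and LF; line breaks are expected to arrive as <br> tags
    return content.translate(str.maketrans('', '', '\r\n'))


def safe_title(title) -> str:
    # Escape &, <, > and quotes so the title is safe inside <title>...</title>
    return html.escape(title) if title else ''


print(strip_linefeeds("line one<br>\r\nline two<br>\n"))  # line one<br>line two<br>
print(safe_title("<Price> & stock"))  # &lt;Price&gt; &amp; stock
```

Note that only the title is escaped. The body is inserted as-is, so callers are responsible for escaping `content` before it reaches the template.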


@@ -1,315 +1,166 @@
import os
import time
import apprise
from apprise import NotifyFormat
from loguru import logger
from urllib.parse import urlparse
from .apprise_plugin.assets import apprise_asset, APPRISE_AVATAR_URL
from .email_helpers import as_monospaced_html_email
from ..diff import HTML_REMOVED_STYLE, REMOVED_PLACEMARKER_OPEN, REMOVED_PLACEMARKER_CLOSED, ADDED_PLACEMARKER_OPEN, HTML_ADDED_STYLE, \
ADDED_PLACEMARKER_CLOSED, CHANGED_INTO_PLACEMARKER_OPEN, CHANGED_INTO_PLACEMARKER_CLOSED, CHANGED_PLACEMARKER_OPEN, \
CHANGED_PLACEMARKER_CLOSED, HTML_CHANGED_STYLE, HTML_CHANGED_INTO_STYLE
import re
from changedetectionio.safe_jinja import render as jinja_render
from urllib.parse import urlparse
from ..notification_service import NotificationContextData
newline_re = re.compile(r'\r\n|\r|\n')
def markup_text_links_to_html(body):
def _populate_notification_tokens(n_object, datastore):
"""
Convert plaintext to HTML with clickable links.
Uses Jinja2's escape and Markup for XSS safety.
Populate notification tokens (diff, current_snapshot, etc.) if not already present.
This ensures both queued notifications and test notifications have the same data.
"""
from linkify_it import LinkifyIt
from markupsafe import Markup, escape
from changedetectionio import diff
from changedetectionio.notification import default_notification_format_for_watch
from markupsafe import escape
linkify = LinkifyIt()
watch_uuid = n_object.get('uuid')
if not watch_uuid:
return
watch = datastore.data['watching'].get(watch_uuid)
if not watch:
return
# Match URLs in the ORIGINAL text (before escaping)
matches = linkify.match(body)
dates = []
trigger_text = ''
watch_html_link = ''
if not matches:
# No URLs, just escape everything
return Markup(escape(body))
if watch:
watch_history = watch.history
dates = list(watch_history.keys())
trigger_text = watch.get('trigger_text', [])
result = []
last_index = 0
# Add text that was triggered
if len(dates):
snapshot_contents = watch.get_history_snapshot(dates[-1])
# Process each URL match
for match in matches:
# Add escaped text before the URL
if match.index > last_index:
text_part = body[last_index:match.index]
result.append(escape(text_part))
if n_object.get('notification_format').lower().startswith('html'):
snapshot_contents = str(escape(snapshot_contents))
# Add the link with escaped URL (both in href and display)
url = match.url
result.append(Markup(f'<a href="{escape(url)}">{escape(url)}</a>'))
last_index = match.last_index
# Add remaining escaped text
if last_index < len(body):
result.append(escape(body[last_index:]))
# Join all parts
return str(Markup(''.join(str(part) for part in result)))
def notification_format_align_with_apprise(n_format : str):
"""
Correctly align changedetection's formats with apprise's formats
Probably these are the same - but good to be sure.
These set the expected OUTPUT format type
:param n_format:
:return:
"""
if n_format.startswith('html'):
# Apprise only knows 'html', not 'htmlcolor' etc., which shouldn't matter here
n_format = NotifyFormat.HTML.value
elif n_format.startswith('markdown'):
# probably the same but just to be safe
n_format = NotifyFormat.MARKDOWN.value
elif n_format.startswith('text'):
# probably the same but just to be safe
n_format = NotifyFormat.TEXT.value
else:
n_format = NotifyFormat.TEXT.value
snapshot_contents = "No snapshot/history available, the watch should fetch at least once."
return n_format
# If we ended up here with "System default"
if n_object.get('notification_format') == default_notification_format_for_watch:
n_object['notification_format'] = datastore.data['settings']['application'].get('notification_format')
def apply_discord_markdown_to_body(n_body):
"""
Discord does not support <del> but it supports non-standard ~~strikethrough~~
:param n_body:
:return:
"""
import re
# Define the mapping between your placeholders and markdown markers
replacements = [
(REMOVED_PLACEMARKER_OPEN, '~~', REMOVED_PLACEMARKER_CLOSED, '~~'),
(ADDED_PLACEMARKER_OPEN, '**', ADDED_PLACEMARKER_CLOSED, '**'),
(CHANGED_PLACEMARKER_OPEN, '~~', CHANGED_PLACEMARKER_CLOSED, '~~'),
(CHANGED_INTO_PLACEMARKER_OPEN, '**', CHANGED_INTO_PLACEMARKER_CLOSED, '**'),
]
# So that the markdown gets added without any whitespace following it which would break it
for open_tag, open_md, close_tag, close_md in replacements:
# Regex: match opening tag, optional whitespace, capture the content, optional whitespace, then closing tag
pattern = re.compile(
re.escape(open_tag) + r'(\s*)(.*?)?(\s*)' + re.escape(close_tag),
flags=re.DOTALL
)
n_body = pattern.sub(lambda m: f"{m.group(1)}{open_md}{m.group(2)}{close_md}{m.group(3)}", n_body)
return n_body
html_colour_enable = False
line_feed_sep = "\n"
def apply_standard_markdown_to_body(n_body):
"""
Apprise does not support ~~strikethrough~~ but it will convert <del> to HTML strikethrough.
:param n_body:
:return:
"""
import re
# Define the mapping between your placeholders and markdown markers
replacements = [
(REMOVED_PLACEMARKER_OPEN, '<del>', REMOVED_PLACEMARKER_CLOSED, '</del>'),
(ADDED_PLACEMARKER_OPEN, '**', ADDED_PLACEMARKER_CLOSED, '**'),
(CHANGED_PLACEMARKER_OPEN, '<del>', CHANGED_PLACEMARKER_CLOSED, '</del>'),
(CHANGED_INTO_PLACEMARKER_OPEN, '**', CHANGED_INTO_PLACEMARKER_CLOSED, '**'),
]
# HTML needs linebreak, but MarkDown and Text can use a linefeed
if n_object.get('notification_format').lower().startswith('html'):
line_feed_sep = "<br>"
# Snapshot will be plaintext on the disk, convert to some kind of HTML
snapshot_contents = snapshot_contents.replace('\n', line_feed_sep)
if n_object.get('notification_format') == 'HTML Color':
html_colour_enable = True
# So that the markdown gets added without any whitespace following it which would break it
for open_tag, open_md, close_tag, close_md in replacements:
# Regex: match opening tag, optional whitespace, capture the content, optional whitespace, then closing tag
pattern = re.compile(
re.escape(open_tag) + r'(\s*)(.*?)?(\s*)' + re.escape(close_tag),
flags=re.DOTALL
)
n_body = pattern.sub(lambda m: f"{m.group(1)}{open_md}{m.group(2)}{close_md}{m.group(3)}", n_body)
return n_body
triggered_text = ''
if len(trigger_text):
from changedetectionio import html_tools
triggered_text = html_tools.get_triggered_text(content=snapshot_contents, trigger_text=trigger_text)
if triggered_text:
triggered_text = line_feed_sep.join(triggered_text)
# Could be called as a 'test notification' with only 1 snapshot available
prev_snapshot = "Example text: example test\nExample text: change detection is cool\nExample text: some more examples\n"
current_snapshot = "Example text: example test\nExample text: More than 1 watch change needs to exist to build a nice preview!"
if len(dates) > 1:
prev_snapshot = watch.get_history_snapshot(dates[-2])
current_snapshot = watch.get_history_snapshot(dates[-1])
if n_object.get('notification_format').lower().startswith('html'):
prev_snapshot = str(escape(prev_snapshot))
current_snapshot = str(escape(current_snapshot))
def replace_placemarkers_in_text(text, url, requested_output_format):
"""
Replace diff placemarkers in text based on the URL service type and requested output format.
Used for both notification title and body to ensure consistent placeholder replacement.
:param text: The text to process
:param url: The notification URL (to detect service type)
:param requested_output_format: The output format (html, htmlcolor, markdown, text, etc.)
:return: Processed text with placemarkers replaced
"""
if not text:
return text
if url.startswith('tgram://'):
# Telegram only supports a limited subset of HTML
# Use strikethrough for removed content, bold for added content
text = text.replace(REMOVED_PLACEMARKER_OPEN, '<s>')
text = text.replace(REMOVED_PLACEMARKER_CLOSED, '</s>')
text = text.replace(ADDED_PLACEMARKER_OPEN, '<b>')
text = text.replace(ADDED_PLACEMARKER_CLOSED, '</b>')
# Handle changed/replaced lines (old → new)
text = text.replace(CHANGED_PLACEMARKER_OPEN, '<s>')
text = text.replace(CHANGED_PLACEMARKER_CLOSED, '</s>')
text = text.replace(CHANGED_INTO_PLACEMARKER_OPEN, '<b>')
text = text.replace(CHANGED_INTO_PLACEMARKER_CLOSED, '</b>')
elif (url.startswith('discord://') or url.startswith('https://discordapp.com/api/webhooks')
or url.startswith('https://discord.com/api')) and requested_output_format == 'html':
# Discord doesn't support HTML, use Discord markdown
text = apply_discord_markdown_to_body(n_body=text)
elif requested_output_format == 'htmlcolor':
# https://github.com/dgtlmoon/changedetection.io/issues/821#issuecomment-1241837050
text = text.replace(REMOVED_PLACEMARKER_OPEN, f'<span style="{HTML_REMOVED_STYLE}" role="deletion" aria-label="Removed text" title="Removed text">')
text = text.replace(REMOVED_PLACEMARKER_CLOSED, f'</span>')
text = text.replace(ADDED_PLACEMARKER_OPEN, f'<span style="{HTML_ADDED_STYLE}" role="insertion" aria-label="Added text" title="Added text">')
text = text.replace(ADDED_PLACEMARKER_CLOSED, f'</span>')
# Handle changed/replaced lines (old → new)
text = text.replace(CHANGED_PLACEMARKER_OPEN, f'<span style="{HTML_CHANGED_STYLE}" role="note" aria-label="Changed text" title="Changed text">')
text = text.replace(CHANGED_PLACEMARKER_CLOSED, f'</span>')
text = text.replace(CHANGED_INTO_PLACEMARKER_OPEN, f'<span style="{HTML_CHANGED_INTO_STYLE}" role="note" aria-label="Changed into" title="Changed into">')
text = text.replace(CHANGED_INTO_PLACEMARKER_CLOSED, f'</span>')
elif requested_output_format == 'markdown':
# Markdown to HTML - Apprise will convert this to HTML
text = apply_standard_markdown_to_body(n_body=text)
else:
# plaintext, html, and default - use simple text markers
text = text.replace(REMOVED_PLACEMARKER_OPEN, '(removed) ')
text = text.replace(REMOVED_PLACEMARKER_CLOSED, '')
text = text.replace(ADDED_PLACEMARKER_OPEN, '(added) ')
text = text.replace(ADDED_PLACEMARKER_CLOSED, '')
text = text.replace(CHANGED_PLACEMARKER_OPEN, f'(changed) ')
text = text.replace(CHANGED_PLACEMARKER_CLOSED, f'')
text = text.replace(CHANGED_INTO_PLACEMARKER_OPEN, f'(into) ')
text = text.replace(CHANGED_INTO_PLACEMARKER_CLOSED, f'')
return text
def apply_service_tweaks(url, n_body, n_title, requested_output_format):
logger.debug(f"Applying markup in '{requested_output_format}' mode")
# Re 323 - Limit Discord message length to their 2000 char total limit or it won't send.
# Because different notifications may require different pre-processing, run each sequentially :(
# 2000 bytes minus:
# 200 bytes for the overhead of the _entire_ JSON payload, 200 bytes for {tts, wait, content} etc. headers
# Length of URL - in case they specify a longer custom avatar_url
if not n_body or not n_body.strip():
return url, n_body, n_title
# Normalize URL scheme to lowercase to prevent case-sensitivity issues
# e.g., "Discord://webhook" -> "discord://webhook", "TGRAM://bot123" -> "tgram://bot123"
scheme_separator_pos = url.find('://')
if scheme_separator_pos > 0:
url = url[:scheme_separator_pos].lower() + url[scheme_separator_pos:]
# So if no avatar_url is specified, add one so it can be correctly calculated into the total payload
parsed = urlparse(url)
k = '?' if not parsed.query else '&'
if url and not 'avatar_url' in url \
and not url.startswith('mail') \
and not url.startswith('post') \
and not url.startswith('get') \
and not url.startswith('delete') \
and not url.startswith('put'):
url += k + f"avatar_url={APPRISE_AVATAR_URL}"
# Replace placemarkers in title first (this was the missing piece causing the bug)
# Titles are ALWAYS plain text across all notification services (Discord embeds, Slack attachments,
# email Subject headers, etc.), so we always use 'text' format for title placemarker replacement
# Looking over apprise library it seems that all plugins only expect plain-text.
n_title = replace_placemarkers_in_text(n_title, url, 'text')
if url.startswith('tgram://'):
# Telegram only supports a limited subset of HTML, remove the '<br>' we place in.
# re https://github.com/dgtlmoon/changedetection.io/issues/555
# @todo re-use an existing library we have already imported to strip all non-allowed tags
n_body = n_body.replace('<br>', '\n')
n_body = n_body.replace('</br>', '\n')
n_body = newline_re.sub('\n', n_body)
# Replace placemarkers for body
n_body = replace_placemarkers_in_text(n_body, url, requested_output_format)
# real limit is 4096, but minus some for extra metadata
payload_max_size = 3600
body_limit = max(0, payload_max_size - len(n_title))
n_title = n_title[0:payload_max_size]
n_body = n_body[0:body_limit]
elif (url.startswith('discord://') or url.startswith('https://discordapp.com/api/webhooks')
or url.startswith('https://discord.com/api'))\
and 'html' in requested_output_format:
# Discord doesn't support HTML, replace <br> with newlines
n_body = n_body.strip().replace('<br>', '\n')
n_body = n_body.replace('</br>', '\n')
n_body = newline_re.sub('\n', n_body)
# Don't replace placeholders or truncate here - let the custom Discord plugin handle it
# The plugin will use embeds (6000 char limit across all embeds) if placeholders are present,
# or plain content (2000 char limit) otherwise
# Only do placeholder replacement if NOT using htmlcolor (which triggers embeds in custom plugin)
if requested_output_format == 'html':
# No diff placeholders, use Discord markdown for any other formatting
# Use Discord markdown: strikethrough for removed, bold for added
n_body = replace_placemarkers_in_text(n_body, url, requested_output_format)
# Apply 2000 char limit for plain content
payload_max_size = 1700
body_limit = max(0, payload_max_size - len(n_title))
n_title = n_title[0:payload_max_size]
n_body = n_body[0:body_limit]
# else: our custom Discord plugin will convert any placeholders left over into embeds with color bars
# Is not discord/tgram and they want htmlcolor
elif requested_output_format == 'htmlcolor':
n_body = replace_placemarkers_in_text(n_body, url, requested_output_format)
n_body = newline_re.sub('<br>\n', n_body)
elif requested_output_format == 'html':
n_body = replace_placemarkers_in_text(n_body, url, requested_output_format)
n_body = newline_re.sub('<br>\n', n_body)
elif requested_output_format == 'markdown':
# Markdown to HTML - Apprise will convert this to HTML
n_body = replace_placemarkers_in_text(n_body, url, requested_output_format)
else: #plaintext etc default
n_body = replace_placemarkers_in_text(n_body, url, requested_output_format)
return url, n_body, n_title
if watch:
v = {'url': watch.get('url'), 'label': watch.label}
watch_html_link = jinja_render(template_str='<a href="{{ label or url | e }}" rel="noopener noreferrer">{{ url | e }}</a>', **v)
def process_notification(n_object: NotificationContextData, datastore):
from changedetectionio.jinja2_custom import render as jinja_render
from . import USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH, default_notification_format, valid_notification_formats
n_object.update({
'current_snapshot': snapshot_contents,
'diff': diff.render_diff(prev_snapshot, current_snapshot, line_feed_sep=line_feed_sep, html_colour=html_colour_enable),
'diff_added': diff.render_diff(prev_snapshot, current_snapshot, include_removed=False, line_feed_sep=line_feed_sep, html_colour=html_colour_enable),
'diff_full': diff.render_diff(prev_snapshot, current_snapshot, include_equal=True, line_feed_sep=line_feed_sep, html_colour=html_colour_enable),
'diff_patch': diff.render_diff(prev_snapshot, current_snapshot, line_feed_sep=line_feed_sep, patch_format=True),
'diff_removed': diff.render_diff(prev_snapshot, current_snapshot, include_added=False, line_feed_sep=line_feed_sep, html_colour=html_colour_enable),
'screenshot': watch.get_screenshot() if watch and watch.get('notification_screenshot') else None,
'triggered_text': triggered_text,
'watch_html_link': watch_html_link,
'watch_url': watch.link,
'watch_url_raw': watch.get('url'),
})
if watch:
n_object.update(watch.extra_notification_token_values())
def scan_notification_file_templates(url, datastore, n_body, notification_parameters):
import glob
from urllib.parse import urlparse, parse_qs
try:
scheme = urlparse(url).scheme.lower().strip()
# scheme could be overridden dynamically
if scheme == 'null' and 'test_schema=' in url:
scheme = parse_qs(urlparse(url).query).get("test_schema", [None])[0]
logger.debug(f"Looking for '{scheme}' notification wrapper templates...")
# Try exact match first, then wildcard matches
candidates = [
os.path.join(datastore.datastore_path, f"notification-wrapper-{scheme}.html"),
*[f for f in glob.glob(os.path.join(datastore.datastore_path, "notification-wrapper-*--.html"))
if scheme.startswith(os.path.basename(f).replace("notification-wrapper-", "").replace("--.html", ""))]
]
for tpl_name in candidates:
if os.path.isfile(tpl_name):
template_params = notification_parameters.copy()
template_params['notification_body'] = n_body
with open(tpl_name, 'r', encoding='utf-8') as f:
logger.info(f"Using HTML notification template wrapper from '{tpl_name}'")
return jinja_render(template_str=f.read(), **template_params)
except Exception as e:
logger.warning(f"Failed to load notification template: {e}")
return None
def process_notification(n_object, datastore):
from . import default_notification_format_for_watch, default_notification_format, valid_notification_formats
# be sure its registered
from .apprise_plugin.custom_handlers import apprise_http_custom_handler
# Register custom Discord plugin
from .apprise_plugin.discord import NotifyDiscordCustom
from .apprise_plugin.custom_handlers import apprise_http_custom_handler, apprise_null_custom_handler
if not isinstance(n_object, NotificationContextData):
raise TypeError(f"Expected NotificationContextData, got {type(n_object)}")
n_body = ''
n_title = ''
now = time.time()
if n_object.get('notification_timestamp'):
logger.trace(f"Time since queued {now-n_object['notification_timestamp']:.3f}s")
n_format = valid_notification_formats.get(
n_object.get('notification_format', default_notification_format),
valid_notification_formats[default_notification_format],
)
# If we arrived with 'System default' then look it up
if n_format == default_notification_format_for_watch and datastore.data['settings']['application'].get('notification_format') != default_notification_format_for_watch:
# Initially text or whatever
n_format = datastore.data['settings']['application'].get('notification_format', valid_notification_formats[default_notification_format])
# Ensure diff rendering is done if not already present (for test notifications)
if not n_object.get('diff') and n_object.get('uuid'):
_populate_notification_tokens(n_object, datastore)
# Insert variables into the notification content
notification_parameters = create_notification_parameters(n_object, datastore)
requested_output_format = n_object.get('notification_format', default_notification_format)
logger.debug(f"Requested notification output format: '{requested_output_format}'")
# If we arrived with 'System default' then look it up
if requested_output_format == USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH:
# Initially text or whatever
requested_output_format = datastore.data['settings']['application'].get('notification_format', default_notification_format)
requested_output_format_original = requested_output_format
# Now clean it up so it fits perfectly with apprise
requested_output_format = notification_format_align_with_apprise(n_format=requested_output_format)
logger.trace(f"Complete notification body including Jinja and placeholders calculated in {time.time() - now:.2f}s")
@@ -324,112 +175,105 @@ def process_notification(n_object: NotificationContextData, datastore):
apobj = apprise.Apprise(debug=True, asset=apprise_asset)
# Override Apprise's built-in Discord plugin with our custom one
# This allows us to use colored embeds for diff content
# First remove the built-in discord plugin, then add our custom one
apprise.plugins.N_MGR.remove('discord')
apprise.plugins.N_MGR.add(NotifyDiscordCustom, schemas='discord')
if not n_object.get('notification_urls'):
return None
with (apprise.LogCapture(level=apprise.logging.DEBUG) as logs):
with apprise.LogCapture(level=apprise.logging.DEBUG) as logs:
for url in n_object['notification_urls']:
n_body = jinja_render(template_str=n_object.get('notification_body', ''), **notification_parameters)
n_title = jinja_render(template_str=n_object.get('notification_title', ''), **notification_parameters)
if n_object.get('markup_text_links_to_html_links'):
n_body = markup_text_links_to_html(body=n_body)
url = url.strip()
if not url or url.startswith('#'):
logger.debug(f"Skipping commented out or empty notification URL - '{url}'")
# Commented out is OK
if url.startswith('#') or not url or not url.strip():
logger.trace(f"Skipping notification URL - '{url}'")
continue
logger.info(f">> Process Notification: AppRise start notifying '{url}'")
# Get the notification body from datastore
n_body = jinja_render(template_str=n_object.get('notification_body', ''), **notification_parameters)
# hmm unsure about this, but why not
if n_object.get('notification_format', '').startswith('HTML'):
n_body = n_body.replace("\n", '<br>')
n_title = jinja_render(template_str=n_object.get('notification_title', ''), **notification_parameters)
n_body_from_file_template = scan_notification_file_templates(url=url,
datastore=datastore,
n_body=n_body,
notification_parameters=notification_parameters)
if n_body_from_file_template:
n_body = n_body_from_file_template
logger.info(f">> Process Notification: AppRise notifying {url}")
url = jinja_render(template_str=url, **notification_parameters)
# If it's a plaintext document, and they want HTML type email/alerts, so it needs to be escaped
watch_mime_type = n_object.get('watch_mime_type')
if watch_mime_type and 'text/' in watch_mime_type.lower() and not 'html' in watch_mime_type.lower():
if 'html' in requested_output_format:
from markupsafe import escape
n_body = str(escape(n_body))
# Re 323 - Limit Discord message length to their 2000 char total limit or it won't send.
# Because different notifications may require different pre-processing, run each sequentially :(
# 2000 bytes minus:
# 200 bytes for the overhead of the _entire_ JSON payload, 200 bytes for {tts, wait, content} etc. headers
# Length of URL - in case they specify a longer custom avatar_url
if 'html' in requested_output_format:
# Since n_body is always some kind of text from the 'diff' engine, try to preserve whitespace in the HTML output,
# but only for runs of two or more spaces; otherwise "and this" becomes "and&nbsp;this" etc., which is too much.
n_body = n_body.replace('  ', '&nbsp;&nbsp;')
# So if no avatar_url is specified, add one so it can be correctly calculated into the total payload
k = '?' if not '?' in url else '&'
if not 'avatar_url' in url \
and not url.startswith('mail') \
and not url.startswith('post') \
and not url.startswith('get') \
and not url.startswith('delete') \
and not url.startswith('put'):
url += k + f"avatar_url={APPRISE_AVATAR_URL}"
(url, n_body, n_title) = apply_service_tweaks(url=url, n_body=n_body, n_title=n_title, requested_output_format=requested_output_format_original)
if url.startswith('tgram://'):
# Telegram only supports a limited subset of HTML, remove the '<br>' we place in.
# re https://github.com/dgtlmoon/changedetection.io/issues/555
# @todo re-use an existing library we have already imported to strip all non-allowed tags
n_body = n_body.replace('<br>', '\n')
n_body = n_body.replace('</br>', '\n')
# real limit is 4096, but minus some for extra metadata
payload_max_size = 3600
body_limit = max(0, payload_max_size - len(n_title))
n_title = n_title[0:payload_max_size]
n_body = n_body[0:body_limit]
apprise_input_format = "NO-THANKS-WE-WILL-MANAGE-ALL-OF-THIS"
elif url.startswith('discord://') or url.startswith('https://discordapp.com/api/webhooks') or url.startswith(
'https://discord.com/api'):
# real limit is 2000, but minus some for extra metadata
payload_max_size = 1700
body_limit = max(0, payload_max_size - len(n_title))
n_title = n_title[0:payload_max_size]
n_body = n_body[0:body_limit]
if not 'format=' in url:
parsed_url = urlparse(url)
prefix_add_to_url = '?' if not parsed_url.query else '&'
elif url.startswith('mailto'):
# Apprise will default to HTML, so we need to override it
# so that what's generated in n_body is in line with what is going to be sent.
# https://github.com/caronc/apprise/issues/633#issuecomment-1191449321
if not 'format=' in url and (n_format.lower() == 'text' or n_format.lower() == 'markdown'):
prefix = '?' if not '?' in url else '&'
# Apprise format is lowercase text https://github.com/caronc/apprise/issues/633
n_format = n_format.lower()
url = f"{url}{prefix}format={n_format}"
# If n_format == HTML, then apprise email should default to text/html and we should be sending HTML only
# THIS IS THE TRICK HOW TO DISABLE APPRISE DOING WEIRD AUTO-CONVERSION WITH BREAKING BR TAGS ETC
if 'html' in requested_output_format:
url = f"{url}{prefix_add_to_url}format={NotifyFormat.HTML.value}"
apprise_input_format = NotifyFormat.HTML.value
elif 'text' in requested_output_format:
url = f"{url}{prefix_add_to_url}format={NotifyFormat.TEXT.value}"
apprise_input_format = NotifyFormat.TEXT.value
apobj.add(url)
elif requested_output_format == NotifyFormat.MARKDOWN.value:
# Convert markdown to HTML ourselves since not all plugins do this
from apprise.conversion import markdown_to_html
# Make sure there are paragraph breaks around horizontal rules
n_body = n_body.replace('---', '\n\n---\n\n')
n_body = markdown_to_html(n_body)
url = f"{url}{prefix_add_to_url}format={NotifyFormat.HTML.value}"
requested_output_format = NotifyFormat.HTML.value
apprise_input_format = NotifyFormat.HTML.value # Changed from MARKDOWN to HTML
else:
# ?format was IN the apprise URL, they are kind of on their own here, we will try our best
if 'format=html' in url:
n_body = newline_re.sub('<br>\r\n', n_body)
# This will also prevent apprise from doing conversion
apprise_input_format = NotifyFormat.HTML.value
requested_output_format = NotifyFormat.HTML.value
elif 'format=text' in url:
apprise_input_format = NotifyFormat.TEXT.value
requested_output_format = NotifyFormat.TEXT.value
#@todo on null:// (only if it's a single URL with null) probably doesn't need to actually .add/setup/etc
sent_objs.append({'title': n_title,
'body': n_body,
'url': url,
# So that we can do a null:// call and get back exactly what would have been sent
'original_context': n_object })
'body_format': n_format})
if not url.startswith('null://'):
apobj.add(url)
# Since the output is always based on the plaintext of the 'diff' engine, wrap it nicely.
# It should always be similar to the 'history' part of the UI.
if url.startswith('mail') and 'html' in requested_output_format:
if not '<pre' in n_body and not '<body' in n_body: # No custom HTML-ish body was setup already
n_body = as_monospaced_html_email(content=n_body, title=n_title)
if not url.startswith('null://'):
if n_object.get('notification_urls'):
# Blast off the notifications that are set in .add()
apobj.notify(
title=n_title,
body=n_body,
# `body_format` Tell apprise what format the INPUT is in, specify a wrong/bad type and it will force skip conversion in apprise
# &format= in URL Tell apprise what format the OUTPUT should be in (it can convert between)
body_format=apprise_input_format,
body_format=n_format,
# False is not an option for AppRise, must be type None
attach=n_object.get('screenshot', None)
)
# Returns empty string if nothing found, multi-line string otherwise
log_value = logs.getvalue()
if log_value and ('WARNING' in log_value or 'ERROR' in log_value):
if log_value and 'WARNING' in log_value or 'ERROR' in log_value:
logger.critical(log_value)
raise Exception(log_value)
@@ -439,17 +283,17 @@ def process_notification(n_object: NotificationContextData, datastore):
# Notification title + body content parameters get created here.
# ( Where we prepare the tokens in the notification to be replaced with actual values )
def create_notification_parameters(n_object: NotificationContextData, datastore):
if not isinstance(n_object, NotificationContextData):
raise TypeError(f"Expected NotificationContextData, got {type(n_object)}")
def create_notification_parameters(n_object, datastore):
from copy import deepcopy
from . import valid_tokens
ext_base_url = datastore.data['settings']['application'].get('active_base_url').strip('/')+'/'
# in the case we send a test notification from the main settings, there is no UUID.
uuid = n_object['uuid'] if 'uuid' in n_object else ''
watch = datastore.data['watching'].get(n_object['uuid'])
if watch:
watch_title = datastore.data['watching'][n_object['uuid']].label
if uuid:
watch_title = datastore.data['watching'][uuid].label
tag_list = []
tags = datastore.get_all_tags_for_watch(n_object['uuid'])
tags = datastore.get_all_tags_for_watch(uuid)
if tags:
for tag_uuid, tag in tags.items():
tag_list.append(tag.get('title'))
@@ -458,36 +302,34 @@ def create_notification_parameters(n_object: NotificationContextData, datastore)
watch_title = 'Change Detection'
watch_tag = ''
# Create URLs to customise the notification with
# active_base_url - set in store.py data property
base_url = datastore.data['settings']['application'].get('active_base_url')
watch_url = n_object['watch_url']
# Build URLs manually instead of using url_for() to avoid requiring a request context
# This allows notifications to be processed in background threads
uuid = n_object['uuid']
diff_url = "{}/diff/{}".format(base_url, uuid)
preview_url = "{}/preview/{}".format(base_url, uuid)
if n_object.get('timestamp_from') and n_object.get('timestamp_to'):
# Include a link to the diff page with specific versions
diff_url = f"{ext_base_url}diff/{uuid}?from_version={n_object['timestamp_from']}&to_version={n_object['timestamp_to']}"
else:
diff_url = f"{ext_base_url}diff/{uuid}"
# Not sure deepcopy is needed here, but why not
tokens = deepcopy(valid_tokens)
preview_url = f"{ext_base_url}preview/{uuid}"
edit_url = f"{ext_base_url}edit/{uuid}"
# @todo test that preview_url is correct when running in not-null mode?
# if not, first time app loads i think it can set a flask context
n_object.update(
# Valid_tokens also used as a field validator
tokens.update(
{
'base_url': ext_base_url,
'base_url': base_url,
'diff_url': diff_url,
'preview_url': preview_url, #@todo include 'version='
'edit_url': edit_url, #@todo also pause, also mute link
'preview_url': preview_url,
'watch_tag': watch_tag if watch_tag is not None else '',
'watch_title': watch_title if watch_title is not None else '',
'watch_url': watch_url,
'watch_uuid': n_object['uuid'],
'watch_uuid': uuid,
})
if watch:
n_object.update(datastore.data['watching'].get(n_object['uuid']).extra_notification_token_values())
# n_object will contain diff, diff_added etc etc
tokens.update(n_object)
return n_object
if uuid:
tokens.update(datastore.data['watching'].get(uuid).extra_notification_token_values())
return tokens

View File

@@ -0,0 +1,15 @@
{# Copy this to your data-store directory if you wish to enable it for HTML style notifications, applies to all as a wrapper :) #}
<html>
<body>
Hello,<br>
<p>A change was detected on your web page watch for {{ watch_html_link }}.</p>
[ view history ] [ pause checks ] [ mute notifications ]
<div>
{{ notification_body }}
</div>
</body>
</html>

View File

@@ -0,0 +1,17 @@
## Notification syntax
All notifications use the https://github.com/caronc/apprise syntax. There are also some custom ones, such as `posts`, for general web-service usability.
## Template file notification wrappers
You can wrap all notifications by creating a `notification-wrapper-HTML-schema.html` file in your datastore directory, where _schema_ is the notification URL schema.
You can use "`--`" in the filename where the _schema_ is to symbolize a wildcard. For example, `notification-wrapper-HTML-mail--.html` would
apply to `mail://`, `mailto://foobar..` etc. notifications.
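As an illustration only, the wildcard filename lookup described above could be sketched as follows. The helper name `find_wrapper_template`, its parameters, and the "exact match beats wildcard" ordering are assumptions made for this example; they are not taken from the project's code.

```python
import os
from typing import Optional


def find_wrapper_template(datastore_dir: str, notification_url: str, fmt: str = "HTML") -> Optional[str]:
    """Hypothetical helper: return the path of a wrapper template for this URL, or None."""
    # "mailto://user@example.com" -> "mailto"
    schema = notification_url.split("://", 1)[0].lower()
    prefix = f"notification-wrapper-{fmt}-"
    suffix = ".html"
    wildcard_match = None

    for name in sorted(os.listdir(datastore_dir)):
        if not (name.startswith(prefix) and name.endswith(suffix)):
            continue
        # The part of the filename where the schema goes, e.g. "mail--" or "mailto"
        pattern = name[len(prefix):-len(suffix)]
        if pattern == schema:
            # Assumption: an exact schema match takes priority over a wildcard
            return os.path.join(datastore_dir, name)
        if pattern.endswith("--") and schema.startswith(pattern[:-2]):
            # "mail--" matches "mail", "mailto", "mailtos", ...
            wildcard_match = wildcard_match or os.path.join(datastore_dir, name)

    return wildcard_match
```

With only `notification-wrapper-HTML-mail--.html` present, this sketch returns that file for a `mailto://` URL and `None` for a `discord://` URL.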

View File

@@ -5,166 +5,10 @@ Notification Service Module
Extracted from update_worker.py to provide standalone notification functionality
for both sync and async workers
"""
import datetime
import pytz
from loguru import logger
import time
from loguru import logger
from changedetectionio.notification import default_notification_format, valid_notification_formats
def _check_cascading_vars(datastore, var_name, watch):
"""
Check notification variables in cascading priority:
Individual watch settings > Tag settings > Global settings
"""
from changedetectionio.notification import (
USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH,
default_notification_body,
default_notification_title
)
# Would be better if this was some kind of Object where Watch can reference the parent datastore etc
v = watch.get(var_name)
if v and not watch.get('notification_muted'):
if var_name == 'notification_format' and v == USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH:
return datastore.data['settings']['application'].get('notification_format')
return v
tags = datastore.get_all_tags_for_watch(uuid=watch.get('uuid'))
if tags:
for tag_uuid, tag in tags.items():
v = tag.get(var_name)
if v and not tag.get('notification_muted'):
return v
if datastore.data['settings']['application'].get(var_name):
return datastore.data['settings']['application'].get(var_name)
# Otherwise could be defaults
if var_name == 'notification_format':
return USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
if var_name == 'notification_body':
return default_notification_body
if var_name == 'notification_title':
return default_notification_title
return None
# What is passed around as notification context, also used as the complete list of valid {{ tokens }}
class NotificationContextData(dict):
def __init__(self, initial_data=None, **kwargs):
super().__init__({
'base_url': None,
'current_snapshot': None,
'diff': None,
'diff_clean': None,
'diff_added': None,
'diff_added_clean': None,
'diff_full': None,
'diff_full_clean': None,
'diff_patch': None,
'diff_removed': None,
'diff_removed_clean': None,
'diff_url': None,
'markup_text_links_to_html_links': False, # If automatic conversion of plaintext to HTML should happen
'notification_timestamp': time.time(),
'preview_url': None,
'screenshot': None,
'triggered_text': None,
'timestamp_from': None,
'timestamp_to': None,
'uuid': 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX', # Converted to 'watch_uuid' in create_notification_parameters
'watch_mime_type': None,
'watch_tag': None,
'watch_title': None,
'watch_url': 'https://WATCH-PLACE-HOLDER/',
})
# Apply any initial data passed in
self.update({'watch_uuid': self.get('uuid')})
if initial_data:
self.update(initial_data)
# Apply any keyword arguments
if kwargs:
self.update(kwargs)
n_format = self.get('notification_format')
if n_format and not valid_notification_formats.get(n_format):
raise ValueError(f'Invalid notification format: "{n_format}"')
def set_random_for_validation(self):
import random, string
"""Randomly fills all dict keys with random strings (for validation/testing).
So we can test the output in the notification body
"""
for key in self.keys():
if key in ['uuid', 'time', 'watch_uuid']:
continue
rand_str = 'RANDOM-PLACEHOLDER-'+''.join(random.choices(string.ascii_letters + string.digits, k=12))
self[key] = rand_str
def __setitem__(self, key, value):
if key == 'notification_format' and isinstance(value, str) and not value.startswith('RANDOM-PLACEHOLDER-'):
if not valid_notification_formats.get(value):
raise ValueError(f'Invalid notification format: "{value}"')
super().__setitem__(key, value)
def timestamp_to_localtime(timestamp):
# Format the date using locale-aware formatting with timezone
dt = datetime.datetime.fromtimestamp(int(timestamp))
dt = dt.replace(tzinfo=pytz.UTC)
# Get local timezone-aware datetime
local_tz = datetime.datetime.now().astimezone().tzinfo
local_dt = dt.astimezone(local_tz)
# Format date with timezone - using strftime for locale awareness
try:
formatted_date = local_dt.strftime('%Y-%m-%d %H:%M:%S %Z')
except:
# Fallback if locale issues
formatted_date = local_dt.isoformat()
return formatted_date
def set_basic_notification_vars(snapshot_contents, current_snapshot, prev_snapshot, watch, triggered_text, timestamp_changed=None):
now = time.time()
from changedetectionio import diff
n_object = {
'current_snapshot': snapshot_contents,
'diff': diff.render_diff(prev_snapshot, current_snapshot),
'diff_clean': diff.render_diff(prev_snapshot, current_snapshot, include_change_type_prefix=False),
'diff_added': diff.render_diff(prev_snapshot, current_snapshot, include_removed=False),
'diff_added_clean': diff.render_diff(prev_snapshot, current_snapshot, include_removed=False, include_change_type_prefix=False),
'diff_full': diff.render_diff(prev_snapshot, current_snapshot, include_equal=True),
'diff_full_clean': diff.render_diff(prev_snapshot, current_snapshot, include_equal=True, include_change_type_prefix=False),
'diff_patch': diff.render_diff(prev_snapshot, current_snapshot, patch_format=True),
'diff_removed': diff.render_diff(prev_snapshot, current_snapshot, include_added=False),
'diff_removed_clean': diff.render_diff(prev_snapshot, current_snapshot, include_added=False, include_change_type_prefix=False),
'screenshot': watch.get_screenshot() if watch and watch.get('notification_screenshot') else None,
'change_datetime': timestamp_to_localtime(timestamp_changed) if timestamp_changed else None,
'triggered_text': triggered_text,
'uuid': watch.get('uuid') if watch else None,
'watch_url': watch.get('url') if watch else None,
'watch_uuid': watch.get('uuid') if watch else None,
'watch_mime_type': watch.get('content-type')
}
# The \n's in the content from the above will get converted to <br> etc depending on the notification format
if watch:
n_object.update(watch.extra_notification_token_values())
logger.trace(f"Main rendered notification placeholders (diff_added etc) calculated in {time.time() - now:.3f}s")
return n_object
class NotificationService:
"""
@@ -176,69 +20,70 @@ class NotificationService:
self.datastore = datastore
self.notification_q = notification_q
def queue_notification_for_watch(self, n_object: NotificationContextData, watch, date_index_from=-2, date_index_to=-1):
def queue_notification_for_watch(self, n_object, watch):
"""
Queue a notification for a watch with full diff rendering and template variables
Queue a notification for a watch. Diff rendering and template variables will be
handled by process_notification() to ensure consistency with test notifications.
"""
from changedetectionio.notification import USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
now = time.time()
if not isinstance(n_object, NotificationContextData):
raise TypeError(f"Expected NotificationContextData, got {type(n_object)}")
dates = []
trigger_text = ''
# Add basic metadata for the notification
n_object.update({
'notification_timestamp': now,
'uuid': watch.get('uuid') if watch else None,
'watch_url': watch.get('url') if watch else None,
})
if watch:
watch_history = watch.history
dates = list(watch_history.keys())
trigger_text = watch.get('trigger_text', [])
n_object.update(watch.extra_notification_token_values())
# Add text that was triggered
if len(dates):
snapshot_contents = watch.get_history_snapshot(timestamp=dates[-1])
else:
snapshot_contents = "No snapshot/history available, the watch should fetch atleast once."
logger.debug("Queued notification for sending")
self.notification_q.put(n_object)
# If we ended up here with "System default"
if n_object.get('notification_format') == USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH:
n_object['notification_format'] = self.datastore.data['settings']['application'].get('notification_format')
def _check_cascading_vars(self, var_name, watch):
"""
Check notification variables in cascading priority:
Individual watch settings > Tag settings > Global settings
"""
from changedetectionio.notification import (
default_notification_format_for_watch,
default_notification_body,
default_notification_title
)
# Would be better if this was some kind of Object where Watch can reference the parent datastore etc
v = watch.get(var_name)
if v and not watch.get('notification_muted'):
if var_name == 'notification_format' and v == default_notification_format_for_watch:
return self.datastore.data['settings']['application'].get('notification_format')
triggered_text = ''
if len(trigger_text):
from . import html_tools
triggered_text = html_tools.get_triggered_text(content=snapshot_contents, trigger_text=trigger_text)
if triggered_text:
triggered_text = '\n'.join(triggered_text)
return v
# Could be called as a 'test notification' with only 1 snapshot available
prev_snapshot = "Example text: example test\nExample text: change detection is cool\nExample text: some more examples\n"
current_snapshot = "Example text: example test\nExample text: change detection is fantastic\nExample text: even more examples\nExample text: a lot more examples"
tags = self.datastore.get_all_tags_for_watch(uuid=watch.get('uuid'))
if tags:
for tag_uuid, tag in tags.items():
v = tag.get(var_name)
if v and not tag.get('notification_muted'):
return v
if len(dates) > 1:
prev_snapshot = watch.get_history_snapshot(timestamp=dates[date_index_from])
current_snapshot = watch.get_history_snapshot(timestamp=dates[date_index_to])
if self.datastore.data['settings']['application'].get(var_name):
return self.datastore.data['settings']['application'].get(var_name)
# Otherwise could be defaults
if var_name == 'notification_format':
return default_notification_format_for_watch
if var_name == 'notification_body':
return default_notification_body
if var_name == 'notification_title':
return default_notification_title
n_object.update(set_basic_notification_vars(snapshot_contents=snapshot_contents,
current_snapshot=current_snapshot,
prev_snapshot=prev_snapshot,
watch=watch,
triggered_text=triggered_text,
timestamp_changed=dates[date_index_to]))
if self.notification_q:
logger.debug("Queued notification for sending")
self.notification_q.put(n_object)
else:
logger.debug("Not queued, no queue defined. Just returning processed data")
return n_object
return None
def send_content_changed_notification(self, watch_uuid):
"""
Send notification when content changes are detected
"""
n_object = NotificationContextData()
n_object = {}
watch = self.datastore.data['watching'].get(watch_uuid)
if not watch:
return
@@ -255,11 +100,10 @@ class NotificationService:
# Should be a better parent getter in the model object
# Prefer - Individual watch settings > Tag settings > Global settings (in that order)
# this change probably not needed?
n_object['notification_urls'] = _check_cascading_vars(self.datastore, 'notification_urls', watch)
n_object['notification_title'] = _check_cascading_vars(self.datastore,'notification_title', watch)
n_object['notification_body'] = _check_cascading_vars(self.datastore,'notification_body', watch)
n_object['notification_format'] = _check_cascading_vars(self.datastore,'notification_format', watch)
n_object['notification_urls'] = self._check_cascading_vars('notification_urls', watch)
n_object['notification_title'] = self._check_cascading_vars('notification_title', watch)
n_object['notification_body'] = self._check_cascading_vars('notification_body', watch)
n_object['notification_format'] = self._check_cascading_vars('notification_format', watch)
# (Individual watch) Only prepare to notify if the rules above matched
queued = False
@@ -282,25 +126,11 @@ class NotificationService:
if not watch:
return
filter_list = ", ".join(watch['include_filters'])
# @todo - This could be a markdown template on the disk, apprise will convert the markdown to HTML+Plaintext parts in the email, and then 'markup_text_links_to_html_links' is not needed
body = f"""Hello,
Your configured CSS/xPath filters of '{filter_list}' for {{{{watch_url}}}} did not appear on the page after {threshold} attempts.
It's possible the page changed layout and the filter needs updating ( Try the 'Visual Selector' tab )
Edit link: {{{{base_url}}}}/edit/{{{{watch_uuid}}}}
Thanks - Your omniscient changedetection.io installation.
"""
n_object = NotificationContextData({
'notification_title': 'Changedetection.io - Alert - CSS/xPath filter was not present in the page',
'notification_body': body,
'notification_format': _check_cascading_vars(self.datastore, 'notification_format', watch),
})
n_object['markup_text_links_to_html_links'] = n_object.get('notification_format').startswith('html')
n_object = {'notification_title': 'Changedetection.io - Alert - CSS/xPath filter was not present in the page',
'notification_body': "Your configured CSS/xPath filters of '{}' for {{{{watch_url}}}} did not appear on the page after {} attempts, did the page change layout?\n\nLink: {{{{base_url}}}}/edit/{{{{watch_uuid}}}}\n\nThanks - Your omniscient changedetection.io installation :)\n".format(
", ".join(watch['include_filters']),
threshold),
'notification_format': 'text'}
if len(watch['notification_urls']):
n_object['notification_urls'] = watch['notification_urls']
@@ -328,28 +158,12 @@ Thanks - Your omniscient changedetection.io installation.
if not watch:
return
threshold = self.datastore.data['settings']['application'].get('filter_failure_notification_threshold_attempts')
step = step_n + 1
# @todo - This could be a markdown template on the disk, apprise will convert the markdown to HTML+Plaintext parts in the email, and then 'markup_text_links_to_html_links' is not needed
# {{{{ }}}} because this will be Jinja2 {{ }} tokens
body = f"""Hello,
Your configured browser step at position {step} for the web page watch {{{{watch_url}}}} did not appear on the page after {threshold} attempts, did the page change layout?
The element may have moved and needs editing, or does it need a delay added?
Edit link: {{{{base_url}}}}/edit/{{{{watch_uuid}}}}
Thanks - Your omniscient changedetection.io installation.
"""
n_object = NotificationContextData({
'notification_title': f"Changedetection.io - Alert - Browser step at position {step} could not be run",
'notification_body': body,
'notification_format': self._check_cascading_vars('notification_format', watch),
})
n_object['markup_text_links_to_html_links'] = n_object.get('notification_format').startswith('html')
n_object = {'notification_title': "Changedetection.io - Alert - Browser step at position {} could not be run".format(step_n+1),
'notification_body': "Your configured browser step at position {} for {{{{watch_url}}}} "
"did not appear on the page after {} attempts, did the page change layout? "
"Does it need a delay added?\n\nLink: {{{{base_url}}}}/edit/{{{{watch_uuid}}}}\n\n"
"Thanks - Your omniscient changedetection.io installation :)\n".format(step_n+1, threshold),
'notification_format': 'text'}
if len(watch['notification_urls']):
n_object['notification_urls'] = watch['notification_urls']

View File

@@ -1,7 +1,7 @@
import pluggy
import os
import importlib
import sys
from loguru import logger
# Global plugin namespace for changedetection.io
PLUGIN_NAMESPACE = "changedetectionio"
@@ -57,7 +57,7 @@ def load_plugins_from_directories():
# Register the plugin with pluggy
plugin_manager.register(module, module_name)
except (ImportError, AttributeError) as e:
print(f"Error loading plugin {module_name}: {e}")
logger.critical(f"Error loading plugin {module_name}: {e}")
# Load plugins
load_plugins_from_directories()

View File

@@ -91,8 +91,6 @@ class difference_detection_processor():
else:
logger.debug("Skipping adding proxy data when custom Browser endpoint is specified. ")
logger.debug(f"Using proxy '{proxy_url}' for {self.watch['uuid']}")
# Now call the fetcher (playwright/requests/etc) with arguments that only a fetcher would need.
# When browser_connection_url is None, the method should default to working out what the best defaults are (os env vars etc)
self.fetcher = fetcher_obj(proxy_override=proxy_url,
@@ -104,7 +102,7 @@ class difference_detection_processor():
self.fetcher.browser_steps_screenshot_path = os.path.join(self.datastore.datastore_path, self.watch.get('uuid'))
# Tweak the base config with the per-watch ones
from changedetectionio.jinja2_custom import render as jinja_render
from changedetectionio.safe_jinja import render as jinja_render
request_headers = CaseInsensitiveDict()
ua = self.datastore.data['settings']['requests'].get('default_ua')

View File

@@ -20,6 +20,8 @@ Used by: processors/text_json_diff/processor.py and other content processors
RSS_XML_CONTENT_TYPES = [
"application/rss+xml",
"application/rdf+xml",
"text/xml",
"application/xml",
"application/atom+xml",
"text/rss+xml", # rare, non-standard
"application/x-rss+xml", # legacy (older feed software)
@@ -35,6 +37,11 @@ JSON_CONTENT_TYPES = [
"application/vnd.api+json",
]
# CSV Content-types
CSV_CONTENT_TYPES = [
"text/csv",
"application/csv",
]
# Generic XML Content-types (non-RSS/Atom)
XML_CONTENT_TYPES = [
@@ -42,10 +49,21 @@ XML_CONTENT_TYPES = [
"application/xml",
]
# YAML Content-types
YAML_CONTENT_TYPES = [
"text/yaml",
"text/x-yaml",
"application/yaml",
"application/x-yaml",
]
HTML_PATTERNS = ['<!doctype html', '<html', '<head', '<body', '<script', '<iframe', '<div']
import re
import magic
from loguru import logger
class guess_stream_type():
is_pdf = False
is_json = False
@@ -57,77 +75,64 @@ class guess_stream_type():
is_yaml = False
def __init__(self, http_content_header, content):
import re
magic_content_header = http_content_header
test_content = content[:200].lower().strip()
# Remove whitespace between < and tag name for robust detection (handles '< html', '<\nhtml', etc.)
test_content_normalized = re.sub(r'<\s+', '<', test_content)
# Use puremagic for lightweight MIME detection (saves ~14MB vs python-magic)
# Magic will sometimes call text/plain as text/html!
magic_result = None
try:
import puremagic
# puremagic needs bytes, so encode if we have a string
content_bytes = content[:200].encode('utf-8') if isinstance(content, str) else content[:200]
# puremagic returns a list of PureMagic objects with confidence scores
detections = puremagic.magic_string(content_bytes)
if detections:
# Get the highest confidence detection
mime = detections[0].mime_type
logger.debug(f"Guessing mime type, original content_type '{http_content_header}', mime type detected '{mime}'")
if mime and "/" in mime:
magic_result = mime
# Ignore generic/fallback mime types
if mime in ['application/octet-stream', 'application/x-empty', 'binary']:
logger.debug(f"Ignoring generic mime type '{mime}' from puremagic library")
# Trust puremagic for non-text types immediately
elif mime not in ['text/html', 'text/plain']:
magic_content_header = mime
mime = magic.from_buffer(content[:200], mime=True) # Send the original content
logger.debug(f"Guessing mime type, original content_type '{http_content_header}', mime type detected '{mime}'")
if mime and "/" in mime:
magic_result = mime
# Ignore generic/fallback mime types from magic
if mime in ['application/octet-stream', 'application/x-empty', 'binary']:
logger.debug(f"Ignoring generic mime type '{mime}' from magic library")
# Trust magic for non-text types immediately
elif mime not in ['text/html', 'text/plain']:
magic_content_header = mime
except Exception as e:
logger.warning(f"Error getting a more precise mime type from 'puremagic' library ({str(e)}), using content-based detection")
logger.error(f"Error getting a more precise mime type from 'magic' library ({str(e)}), using content-based detection")
# Content-based detection (most reliable for text formats)
# Check for HTML patterns first - if found, override magic's text/plain
has_html_patterns = any(p in test_content_normalized for p in HTML_PATTERNS)
# Always trust headers first
if 'text/plain' in http_content_header:
self.is_plaintext = True
if any(s in http_content_header for s in RSS_XML_CONTENT_TYPES):
if any(s in http_content_header for s in RSS_XML_CONTENT_TYPES) or any(s in magic_content_header for s in RSS_XML_CONTENT_TYPES):
self.is_rss = True
elif any(s in http_content_header for s in JSON_CONTENT_TYPES):
elif any(s in http_content_header for s in JSON_CONTENT_TYPES) or any(s in magic_content_header for s in JSON_CONTENT_TYPES):
self.is_json = True
elif 'pdf' in magic_content_header:
self.is_pdf = True
# magic will call a rss document 'xml'
# Rarely do endpoints give the right header, usually just text/xml, so we check also for <rss
# This also triggers the automatic CDATA text parser so the RSS comes back as a nice content list
elif '<rss' in test_content_normalized or '<feed' in test_content_normalized or any(s in magic_content_header for s in RSS_XML_CONTENT_TYPES) or '<rdf:' in test_content_normalized:
self.is_rss = True
elif has_html_patterns or http_content_header == 'text/html':
self.is_html = True
elif any(s in magic_content_header for s in JSON_CONTENT_TYPES):
self.is_json = True
elif any(s in http_content_header for s in XML_CONTENT_TYPES):
elif any(s in http_content_header for s in CSV_CONTENT_TYPES) or any(s in magic_content_header for s in CSV_CONTENT_TYPES):
self.is_csv = True
elif any(s in http_content_header for s in XML_CONTENT_TYPES) or any(s in magic_content_header for s in XML_CONTENT_TYPES):
# Only mark as generic XML if not already detected as RSS
if not self.is_rss:
self.is_xml = True
elif test_content_normalized.startswith('<?xml') or any(s in magic_content_header for s in XML_CONTENT_TYPES):
# Generic XML that's not RSS/Atom (RSS/Atom checked above)
self.is_xml = True
elif '%pdf-1' in test_content:
elif any(s in http_content_header for s in YAML_CONTENT_TYPES) or any(s in magic_content_header for s in YAML_CONTENT_TYPES):
self.is_yaml = True
elif 'pdf' in magic_content_header:
self.is_pdf = True
elif http_content_header.startswith('text/'):
self.is_plaintext = True
# Only trust magic for 'text' if no other patterns matched
elif 'text' in magic_content_header:
self.is_plaintext = True
###
elif has_html_patterns or http_content_header == 'text/html':
self.is_html = True
# If magic says text/plain and we found no HTML patterns, trust it
elif magic_result == 'text/plain':
self.is_plaintext = True
logger.debug(f"Trusting magic's text/plain result (no HTML patterns detected)")
elif '<rss' in test_content_normalized or '<feed' in test_content_normalized:
self.is_rss = True
elif test_content_normalized.startswith('<?xml'):
# Generic XML that's not RSS/Atom (RSS/Atom checked above)
self.is_xml = True
elif '%pdf-1' in test_content:
self.is_pdf = True
# Only trust magic for 'text' if no other patterns matched
elif 'text' in magic_content_header:
self.is_plaintext = True

View File

@@ -32,7 +32,7 @@ def prepare_filter_prevew(datastore, watch_uuid, form_data):
'''Used by @app.route("/edit/<string:uuid>/preview-rendered", methods=['POST'])'''
from changedetectionio import forms, html_tools
from changedetectionio.model.Watch import model as watch_model
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import ProcessPoolExecutor
from copy import deepcopy
from flask import request
import brotli
@@ -76,16 +76,13 @@ def prepare_filter_prevew(datastore, watch_uuid, form_data):
update_handler.fetcher.headers['content-type'] = tmp_watch.get('content-type')
# Process our watch with filters and the HTML from disk, and also a blank watch with no filters but also with the same HTML from disk
# Do this as parallel threads (not processes) to avoid pickle issues with Lock objects
try:
with ThreadPoolExecutor(max_workers=2) as executor:
future1 = executor.submit(_task, tmp_watch, update_handler)
future2 = executor.submit(_task, blank_watch_no_filters, update_handler)
# Do this as a parallel process because it could take some time
with ProcessPoolExecutor(max_workers=2) as executor:
future1 = executor.submit(_task, tmp_watch, update_handler)
future2 = executor.submit(_task, blank_watch_no_filters, update_handler)
text_after_filter = future1.result()
text_before_filter = future2.result()
except Exception as e:
x=1
text_after_filter = future1.result()
text_before_filter = future2.result()
try:
trigger_line_numbers = html_tools.strip_ignore_text(content=text_after_filter,
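
One of the two variants in this hunk gives "pickle issues with Lock objects" as the reason for using threads rather than processes. The sketch below illustrates that constraint under stated assumptions: `Handler` is a hypothetical stand-in for an object that holds a `threading.Lock`, not the project's `update_handler`.

```python
import pickle
import threading
from concurrent.futures import ThreadPoolExecutor


class Handler:
    """Hypothetical stand-in for an object that holds a lock."""

    def __init__(self):
        self.lock = threading.Lock()

    def run(self, text):
        with self.lock:
            return text.upper()


def can_pickle(obj) -> bool:
    """Return True if pickle can serialise obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


handler = Handler()

# A process pool has to pickle the submitted arguments to send them to a
# worker process; an object that holds a threading.Lock cannot be pickled.
lock_is_picklable = can_pickle(handler)

# Threads share memory, so the same handler can be handed to both tasks as-is.
with ThreadPoolExecutor(max_workers=2) as executor:
    future1 = executor.submit(handler.run, "after filter")
    future2 = executor.submit(handler.run, "before filter")
    results = (future1.result(), future2.result())
```

The trade-off is that threads avoid serialisation entirely but share one interpreter, while processes give true parallelism only for arguments that survive pickling.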


@@ -7,7 +7,6 @@ import re
import urllib3
from changedetectionio.conditions import execute_ruleset_against_all_plugins
from changedetectionio.diff import ADDED_PLACEMARKER_OPEN
from changedetectionio.processors import difference_detection_processor
from changedetectionio.html_tools import PERL_STYLE_REGEX, cdata_in_document_to_text, TRANSLATE_WHITESPACE_TABLE
from changedetectionio import html_tools, content_fetchers
@@ -21,7 +20,7 @@ urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
name = 'Webpage Text/HTML, JSON and PDF changes'
description = 'Detects all text changes where possible'
JSON_FILTER_PREFIXES = ['json:', 'jq:', 'jqraw:']
json_filter_prefixes = ['json:', 'jq:', 'jqraw:']
# Assume it's this type if the server says nothing on content-type
DEFAULT_WHEN_NO_CONTENT_TYPE_HEADER = 'text/html'
@@ -38,560 +37,355 @@ class PDFToHTMLToolNotFound(ValueError):
ValueError.__init__(self, msg)
class FilterConfig:
"""Consolidates all filter and rule configurations from watch, tags, and global settings."""
def __init__(self, watch, datastore):
self.watch = watch
self.datastore = datastore
self.watch_uuid = watch.get('uuid')
# Cache computed properties to avoid repeated list operations
self._include_filters_cache = None
self._subtractive_selectors_cache = None
def _get_merged_rules(self, attr, include_global=False):
"""Merge rules from watch, tags, and optionally global settings."""
watch_rules = self.watch.get(attr, [])
tag_rules = self.datastore.get_tag_overrides_for_watch(uuid=self.watch_uuid, attr=attr)
rules = list(dict.fromkeys(watch_rules + tag_rules))
if include_global:
global_rules = self.datastore.data['settings']['application'].get(f'global_{attr}', [])
rules = list(dict.fromkeys(rules + global_rules))
return rules
@property
def include_filters(self):
if self._include_filters_cache is None:
filters = self._get_merged_rules('include_filters')
# Inject LD+JSON price tracker rule if enabled
if self.watch.get('track_ldjson_price_data', '') == PRICE_DATA_TRACK_ACCEPT:
filters += html_tools.LD_JSON_PRODUCT_OFFER_SELECTORS
self._include_filters_cache = filters
return self._include_filters_cache
@property
def subtractive_selectors(self):
if self._subtractive_selectors_cache is None:
watch_selectors = self.watch.get("subtractive_selectors", [])
tag_selectors = self.datastore.get_tag_overrides_for_watch(uuid=self.watch_uuid, attr='subtractive_selectors')
global_selectors = self.datastore.data["settings"]["application"].get("global_subtractive_selectors", [])
self._subtractive_selectors_cache = [*tag_selectors, *watch_selectors, *global_selectors]
return self._subtractive_selectors_cache
@property
def extract_text(self):
return self._get_merged_rules('extract_text')
@property
def ignore_text(self):
return self._get_merged_rules('ignore_text', include_global=True)
@property
def trigger_text(self):
return self._get_merged_rules('trigger_text')
@property
def text_should_not_be_present(self):
return self._get_merged_rules('text_should_not_be_present')
@property
def has_include_filters(self):
return bool(self.include_filters) and bool(self.include_filters[0].strip())
@property
def has_include_json_filters(self):
return any(f.strip().startswith(prefix) for f in self.include_filters for prefix in JSON_FILTER_PREFIXES)
@property
def has_subtractive_selectors(self):
return bool(self.subtractive_selectors) and bool(self.subtractive_selectors[0].strip())
class ContentTransformer:
"""Handles text transformations like trimming, sorting, and deduplication."""
@staticmethod
def trim_whitespace(text):
"""Remove leading/trailing whitespace from each line."""
# Use generator expression to avoid building intermediate list
return '\n'.join(line.strip() for line in text.replace("\n\n", "\n").splitlines())
@staticmethod
def remove_duplicate_lines(text):
"""Remove duplicate lines while preserving order."""
return '\n'.join(dict.fromkeys(line for line in text.replace("\n\n", "\n").splitlines()))
@staticmethod
def sort_alphabetically(text):
"""Sort lines alphabetically (case-insensitive)."""
# Remove double line feeds before sorting
text = text.replace("\n\n", "\n")
return '\n'.join(sorted(text.splitlines(), key=lambda x: x.lower()))
@staticmethod
def extract_by_regex(text, regex_patterns):
"""Extract text matching regex patterns."""
# Use list of strings instead of concatenating lists repeatedly (avoids O(n²) behavior)
regex_matched_output = []
for s_re in regex_patterns:
# Check if it's perl-style regex /.../
if re.search(PERL_STYLE_REGEX, s_re, re.IGNORECASE):
regex = html_tools.perl_style_slash_enclosed_regex_to_options(s_re)
result = re.findall(regex, text)
for match in result:
if type(match) is tuple:
regex_matched_output.extend(match)
regex_matched_output.append('\n')
else:
regex_matched_output.append(match)
regex_matched_output.append('\n')
else:
# Plain text search (case-insensitive)
r = re.compile(re.escape(s_re), re.IGNORECASE)
res = r.findall(text)
if res:
for match in res:
regex_matched_output.append(match)
regex_matched_output.append('\n')
return ''.join(regex_matched_output) if regex_matched_output else ''
class RuleEngine:
"""Evaluates blocking rules (triggers, conditions, text_should_not_be_present)."""
@staticmethod
def evaluate_trigger_text(content, trigger_patterns):
"""
Check if trigger text is present. If trigger_text is configured,
content is blocked UNLESS the trigger is found.
Returns True if blocked, False if allowed.
"""
if not trigger_patterns:
return False
# Assume blocked if trigger_text is configured
result = html_tools.strip_ignore_text(
content=str(content),
wordlist=trigger_patterns,
mode="line numbers"
)
# Unblock if trigger was found
return not bool(result)
@staticmethod
def evaluate_text_should_not_be_present(content, patterns):
"""
Check if forbidden text is present. If found, block the change.
Returns True if blocked, False if allowed.
"""
if not patterns:
return False
result = html_tools.strip_ignore_text(
content=str(content),
wordlist=patterns,
mode="line numbers"
)
# Block if forbidden text was found
return bool(result)
@staticmethod
def evaluate_conditions(watch, datastore, content):
"""
Evaluate custom conditions ruleset.
Returns True if blocked, False if allowed.
"""
if not watch.get('conditions') or not watch.get('conditions_match_logic'):
return False
conditions_result = execute_ruleset_against_all_plugins(
current_watch_uuid=watch.get('uuid'),
application_datastruct=datastore.data,
ephemeral_data={'text': content}
)
# Block if conditions not met
return not conditions_result.get('result')
class ContentProcessor:
"""Handles content preprocessing, filtering, and extraction."""
def __init__(self, fetcher, watch, filter_config, datastore):
self.fetcher = fetcher
self.watch = watch
self.filter_config = filter_config
self.datastore = datastore
def preprocess_rss(self, content):
"""
Convert CDATA/comments in RSS to usable text.
Supports two RSS processing modes:
- 'default': Inline CDATA replacement (original behavior)
- 'formatted': Format RSS items with title, link, guid, pubDate, and description (CDATA unmarked)
"""
from changedetectionio import rss_tools
rss_mode = self.datastore.data["settings"]["application"].get("rss_reader_mode")
if rss_mode:
# Format RSS items nicely with CDATA content unmarked and converted to text
return rss_tools.format_rss_items(content)
else:
# Default: Original inline CDATA replacement
return cdata_in_document_to_text(html_content=content)
def preprocess_pdf(self, raw_content):
"""Convert PDF to HTML using external tool."""
from shutil import which
tool = os.getenv("PDF_TO_HTML_TOOL", "pdftohtml")
if not which(tool):
raise PDFToHTMLToolNotFound(
f"Command-line `{tool}` tool was not found in system PATH, was it installed?"
)
import subprocess
proc = subprocess.Popen(
[tool, '-stdout', '-', '-s', 'out.pdf', '-i'],
stdout=subprocess.PIPE,
stdin=subprocess.PIPE
)
proc.stdin.write(raw_content)
proc.stdin.close()
html_content = proc.stdout.read().decode('utf-8')
proc.wait(timeout=60)
# Add metadata for change detection
metadata = (
f"<p>Added by changedetection.io: Document checksum - "
f"{hashlib.md5(raw_content).hexdigest().upper()} "
f"Original file size - {len(raw_content)} bytes</p>"
)
return html_content.replace('</body>', metadata + '</body>')
def preprocess_json(self, raw_content):
"""Format and sort JSON content."""
# No JSON filters are set, so re-format it here; otherwise the filters applied later would reformat it anyway
content = html_tools.extract_json_as_string(content=raw_content, json_filter="json:$")
# Sort JSON to avoid false alerts from reordering
try:
content = json.dumps(json.loads(content), sort_keys=True, indent=2, ensure_ascii=False)
except Exception:
# Might be malformed JSON, continue anyway
pass
return content
def apply_include_filters(self, content, stream_content_type):
"""Apply CSS, XPath, or JSON filters to extract specific content."""
filtered_content = ""
for filter_rule in self.filter_config.include_filters:
# XPath filters
if filter_rule[0] == '/' or filter_rule.startswith('xpath:'):
filtered_content += html_tools.xpath_filter(
xpath_filter=filter_rule.replace('xpath:', ''),
html_content=content,
append_pretty_line_formatting=not self.watch.is_source_type_url,
is_xml=stream_content_type.is_rss or stream_content_type.is_xml
)
# XPath1 filters (first match only)
elif filter_rule.startswith('xpath1:'):
filtered_content += html_tools.xpath1_filter(
xpath_filter=filter_rule.replace('xpath1:', ''),
html_content=content,
append_pretty_line_formatting=not self.watch.is_source_type_url,
is_xml=stream_content_type.is_rss or stream_content_type.is_xml
)
# JSON filters
elif any(filter_rule.startswith(prefix) for prefix in JSON_FILTER_PREFIXES):
filtered_content += html_tools.extract_json_as_string(
content=content,
json_filter=filter_rule
)
# CSS selectors, default fallback
else:
filtered_content += html_tools.include_filters(
include_filters=filter_rule,
html_content=content,
append_pretty_line_formatting=not self.watch.is_source_type_url
)
# Raise error if filter returned nothing
if not filtered_content.strip():
raise FilterNotFoundInResponse(
msg=self.filter_config.include_filters,
screenshot=self.fetcher.screenshot,
xpath_data=self.fetcher.xpath_data
)
return filtered_content
def apply_subtractive_selectors(self, content):
"""Remove elements matching subtractive selectors."""
return html_tools.element_removal(self.filter_config.subtractive_selectors, content)
def extract_text_from_html(self, html_content, stream_content_type):
"""Convert HTML to plain text."""
do_anchor = self.datastore.data["settings"]["application"].get("render_anchor_tag_content", False)
return html_tools.html_to_text(
html_content=html_content,
render_anchor_tag_content=do_anchor,
is_rss=stream_content_type.is_rss
)
class ChecksumCalculator:
"""Calculates checksums with various options."""
@staticmethod
def calculate(text, ignore_whitespace=False):
"""Calculate MD5 checksum of text content."""
if ignore_whitespace:
text = text.translate(TRANSLATE_WHITESPACE_TABLE)
return hashlib.md5(text.encode('utf-8')).hexdigest()
# Some common stuff here that can be moved to a base class
# (set_proxy_from_list)
class perform_site_check(difference_detection_processor):
def run_changedetection(self, watch):
changed_detected = False
html_content = ""
screenshot = False # as bytes
stripped_text_from_html = ""
if not watch:
raise Exception("Watch no longer exists.")
# Initialize components
filter_config = FilterConfig(watch, self.datastore)
content_processor = ContentProcessor(self.fetcher, watch, filter_config, self.datastore)
transformer = ContentTransformer()
rule_engine = RuleEngine()
# Get content type and stream info
ctype_header = self.fetcher.get_all_headers().get('content-type', DEFAULT_WHEN_NO_CONTENT_TYPE_HEADER).lower()
stream_content_type = guess_stream_type(http_content_header=ctype_header, content=self.fetcher.content)
# Unset any existing notification error
update_obj = {'last_notification_error': False, 'last_error': False}
url = watch.link
self.screenshot = self.fetcher.screenshot
self.xpath_data = self.fetcher.xpath_data
# Track the content type and checksum before filters
# Track the content type
update_obj['content_type'] = ctype_header
# Watches added automatically in the queue manager will skip if it's the same checksum as the previous run
# Saves a lot of CPU
update_obj['previous_md5_before_filters'] = hashlib.md5(self.fetcher.content.encode('utf-8')).hexdigest()
# === CONTENT PREPROCESSING ===
# Avoid creating unnecessary intermediate string copies by reassigning only when needed
content = self.fetcher.content
# Fetching complete, now filters
# RSS preprocessing
# @note: I feel like the following should be in a more obvious chain system
# - Check filter text
# - Is the checksum different?
# - Do we convert to JSON?
# https://stackoverflow.com/questions/41817578/basic-method-chaining ?
# return content().textfilter().jsonextract().checksumcompare() ?
# Go into RSS preprocess for converting CDATA/comment to usable text
if stream_content_type.is_rss:
content = content_processor.preprocess_rss(content)
if self.datastore.data["settings"]["application"].get("rss_reader_mode"):
# Now just becomes regular HTML that can have xpath/CSS applied (first of the set etc)
stream_content_type.is_rss = False
stream_content_type.is_html = True
self.fetcher.content = content
self.fetcher.content = cdata_in_document_to_text(html_content=self.fetcher.content)
# PDF preprocessing
if watch.is_pdf or stream_content_type.is_pdf:
content = content_processor.preprocess_pdf(raw_content=self.fetcher.raw_content)
stream_content_type.is_html = True
from shutil import which
tool = os.getenv("PDF_TO_HTML_TOOL", "pdftohtml")
if not which(tool):
raise PDFToHTMLToolNotFound("Command-line `{}` tool was not found in system PATH, was it installed?".format(tool))
# JSON - Always reformat it nicely for consistency.
import subprocess
proc = subprocess.Popen(
[tool, '-stdout', '-', '-s', 'out.pdf', '-i'],
stdout=subprocess.PIPE,
stdin=subprocess.PIPE)
proc.stdin.write(self.fetcher.raw_content)
proc.stdin.close()
self.fetcher.content = proc.stdout.read().decode('utf-8')
proc.wait(timeout=60)
# Add a little metadata so we know if the file changes (like if an image changes, but the text is the same)
# @todo may cause problems with non-UTF8?
metadata = "<p>Added by changedetection.io: Document checksum - {} Filesize - {} bytes</p>".format(
hashlib.md5(self.fetcher.raw_content).hexdigest().upper(),
len(self.fetcher.content))
self.fetcher.content = self.fetcher.content.replace('</body>', metadata + '</body>')
# Better would be if Watch.model could access the global data also
# and then use getattr https://docs.python.org/3/reference/datamodel.html#object.__getitem__
# https://realpython.com/inherit-python-dict/ instead of doing it procedurally
include_filters_from_tags = self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='include_filters')
# 1845 - remove duplicated filters in both group and watch include filter
include_filters_rule = list(dict.fromkeys(watch.get('include_filters', []) + include_filters_from_tags))
subtractive_selectors = [*self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='subtractive_selectors'),
*watch.get("subtractive_selectors", []),
*self.datastore.data["settings"]["application"].get("global_subtractive_selectors", [])
]
# Inject a virtual LD+JSON price tracker rule
if watch.get('track_ldjson_price_data', '') == PRICE_DATA_TRACK_ACCEPT:
include_filters_rule += html_tools.LD_JSON_PRODUCT_OFFER_SELECTORS
has_filter_rule = len(include_filters_rule) and len(include_filters_rule[0].strip())
has_subtractive_selectors = len(subtractive_selectors) and len(subtractive_selectors[0].strip())
if stream_content_type.is_json:
if not filter_config.has_include_json_filters:
content = content_processor.preprocess_json(raw_content=content)
# Otherwise it gets sorted/formatted in the filter stage anyway
if not has_filter_rule:
# Force a reformat
include_filters_rule.append("json:$")
has_filter_rule = True
# HTML obfuscation workarounds
if stream_content_type.is_html:
content = html_tools.workarounds_for_obfuscations(content)
# Sort the JSON so we don't get false alerts when the content is just re-ordered
try:
self.fetcher.content = json.dumps(json.loads(self.fetcher.content), sort_keys=True)
except Exception as e:
# Might have just been a snippet, or otherwise bad JSON, continue
pass
# Check for LD+JSON price data (for HTML content)
if stream_content_type.is_html:
update_obj['has_ldjson_price_data'] = html_tools.has_ldjson_product_info(content)
if has_filter_rule:
for filter in include_filters_rule:
if any(prefix in filter for prefix in json_filter_prefixes):
stripped_text_from_html += html_tools.extract_json_as_string(content=self.fetcher.content, json_filter=filter)
if stripped_text_from_html:
stream_content_type.is_json = True
stream_content_type.is_html = False
# === FILTER APPLICATION ===
# Start with content reference, avoid copy until modification
html_content = content
# We have 'watch.is_source_type_url' because we should be able to use selectors on the raw HTML but return just that selected HTML
if stream_content_type.is_html or watch.is_source_type_url or stream_content_type.is_plaintext or stream_content_type.is_rss or stream_content_type.is_xml or stream_content_type.is_pdf:
# Apply include filters (CSS, XPath, JSON)
# Except for plaintext (in case they tried to confuse the system, it will HTML escape)
#if not stream_content_type.is_plaintext:
if filter_config.has_include_filters:
html_content = content_processor.apply_include_filters(content, stream_content_type)
# CSS Filter, extract the HTML that matches and feed that into the existing inscriptis::get_text
self.fetcher.content = html_tools.workarounds_for_obfuscations(self.fetcher.content)
html_content = self.fetcher.content
# Apply subtractive selectors
if filter_config.has_subtractive_selectors:
html_content = content_processor.apply_subtractive_selectors(html_content)
# === TEXT EXTRACTION ===
if watch.is_source_type_url:
# For source URLs, keep raw content
stripped_text = html_content
elif stream_content_type.is_plaintext:
# For plaintext, keep as-is without HTML-to-text conversion
stripped_text = html_content
else:
# Extract text from HTML/RSS content (not generic XML)
if stream_content_type.is_html or stream_content_type.is_rss:
stripped_text = content_processor.extract_text_from_html(html_content, stream_content_type)
# Some kind of "text" but definitely not RSS looking
if stream_content_type.is_plaintext:
# Don't run get_text or xpath/css filters on plaintext
# We are not HTML, we are not any kind of RSS, doesn't even look like HTML
stripped_text_from_html = html_content
else:
stripped_text = html_content
# If not JSON, and if it's not text/plain..
# Does it have some ld+json price data? used for easier monitoring
update_obj['has_ldjson_price_data'] = html_tools.has_ldjson_product_info(self.fetcher.content)
# Then we assume HTML
if has_filter_rule:
html_content = ""
for filter_rule in include_filters_rule:
# For HTML/XML we offer xpath as an option, just start a regular xPath "/.."
if filter_rule[0] == '/' or filter_rule.startswith('xpath:'):
html_content += html_tools.xpath_filter(xpath_filter=filter_rule.replace('xpath:', ''),
html_content=self.fetcher.content,
append_pretty_line_formatting=not watch.is_source_type_url,
is_rss=stream_content_type.is_rss)
elif filter_rule.startswith('xpath1:'):
html_content += html_tools.xpath1_filter(xpath_filter=filter_rule.replace('xpath1:', ''),
html_content=self.fetcher.content,
append_pretty_line_formatting=not watch.is_source_type_url,
is_rss=stream_content_type.is_rss)
else:
html_content += html_tools.include_filters(include_filters=filter_rule,
html_content=self.fetcher.content,
append_pretty_line_formatting=not watch.is_source_type_url)
if not html_content.strip():
raise FilterNotFoundInResponse(msg=include_filters_rule, screenshot=self.fetcher.screenshot, xpath_data=self.fetcher.xpath_data)
if has_subtractive_selectors:
html_content = html_tools.element_removal(subtractive_selectors, html_content)
if watch.is_source_type_url:
stripped_text_from_html = html_content
else:
# extract text
do_anchor = self.datastore.data["settings"]["application"].get("render_anchor_tag_content", False)
stripped_text_from_html = html_tools.html_to_text(html_content=html_content,
render_anchor_tag_content=do_anchor,
is_rss=stream_content_type.is_rss) # 1874 activate the <title workaround hack
# === TEXT TRANSFORMATIONS ===
if watch.get('trim_text_whitespace'):
stripped_text = transformer.trim_whitespace(stripped_text)
stripped_text_from_html = '\n'.join(line.strip() for line in stripped_text_from_html.replace("\n\n", "\n").splitlines())
# Save text before ignore filters (for diff calculation)
text_content_before_ignored_filter = stripped_text
# Re #340 - return the content before the 'ignore text' was applied
# Also used to calculate/show what was removed
text_content_before_ignored_filter = stripped_text_from_html
# @todo whitespace coming from missing rtrim()?
# stripped_text_from_html could be based on their preferences, replace the processed text with only that which they want to know about.
# Rewrites the processed text based only on the diff result they want to see
# === DIFF FILTERING ===
# If user wants specific diff types (added/removed/replaced only)
if watch.has_special_diff_filter_options_set() and len(watch.history.keys()):
stripped_text = self._apply_diff_filtering(watch, stripped_text, text_content_before_ignored_filter)
if stripped_text is None:
# No differences found, but content exists
c = ChecksumCalculator.calculate(text_content_before_ignored_filter, ignore_whitespace=True)
return False, {'previous_md5': c}, text_content_before_ignored_filter.encode('utf-8')
# Now the content comes from the diff-parser and not the returned HTTP traffic, so could be some differences
from changedetectionio import diff
# needs to not include (added) etc or it may get used twice
# Replace the processed text with the preferred result
rendered_diff = diff.render_diff(previous_version_file_contents=watch.get_last_fetched_text_before_filters(),
newest_version_file_contents=stripped_text_from_html,
include_equal=False, # not the same lines
include_added=watch.get('filter_text_added', True),
include_removed=watch.get('filter_text_removed', True),
include_replaced=watch.get('filter_text_replaced', True),
line_feed_sep="\n",
include_change_type_prefix=False)
# === EMPTY PAGE CHECK ===
watch.save_last_text_fetched_before_filters(text_content_before_ignored_filter.encode('utf-8'))
if not rendered_diff and stripped_text_from_html:
# We had some content, but no differences were found
# Store our new file as the MD5 so it will trigger in the future
c = hashlib.md5(stripped_text_from_html.translate(TRANSLATE_WHITESPACE_TABLE).encode('utf-8')).hexdigest()
return False, {'previous_md5': c}, stripped_text_from_html.encode('utf-8')
else:
stripped_text_from_html = rendered_diff
# Treat pages with no renderable text content as a change? No by default
empty_pages_are_a_change = self.datastore.data['settings']['application'].get('empty_pages_are_a_change', False)
if not stream_content_type.is_json and not empty_pages_are_a_change and len(stripped_text.strip()) == 0:
raise content_fetchers.exceptions.ReplyWithContentButNoText(
url=url,
status_code=self.fetcher.get_last_status_code(),
screenshot=self.fetcher.screenshot,
has_filters=filter_config.has_include_filters,
html_content=html_content,
xpath_data=self.fetcher.xpath_data
)
if not stream_content_type.is_json and not empty_pages_are_a_change and len(stripped_text_from_html.strip()) == 0:
raise content_fetchers.exceptions.ReplyWithContentButNoText(url=url,
status_code=self.fetcher.get_last_status_code(),
screenshot=self.fetcher.screenshot,
has_filters=has_filter_rule,
html_content=html_content,
xpath_data=self.fetcher.xpath_data
)
# We rely on the actual text in the html output.. many sites have random script vars etc,
# in the future we'll implement other mechanisms.
update_obj["last_check_status"] = self.fetcher.get_last_status_code()
# === REGEX EXTRACTION ===
if filter_config.extract_text:
extracted = transformer.extract_by_regex(stripped_text, filter_config.extract_text)
stripped_text = extracted
# 615 Extract text by regex
extract_text = list(dict.fromkeys(watch.get('extract_text', []) + self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='extract_text')))
if len(extract_text) > 0:
regex_matched_output = []
for s_re in extract_text:
# In case they specified something in '/.../x'
if re.search(PERL_STYLE_REGEX, s_re, re.IGNORECASE):
regex = html_tools.perl_style_slash_enclosed_regex_to_options(s_re)
result = re.findall(regex, stripped_text_from_html)
for l in result:
if type(l) is tuple:
# @todo - some formatter option default (between groups)
regex_matched_output += list(l) + ['\n']
else:
# @todo - some formatter option default (between each ungrouped result)
regex_matched_output += [l] + ['\n']
else:
# Doesn't look like regex, just hunt for plaintext and return that which matches
# `stripped_text_from_html` will be bytes, so we must encode s_re also to bytes
r = re.compile(re.escape(s_re), re.IGNORECASE)
res = r.findall(stripped_text_from_html)
if res:
for match in res:
regex_matched_output += [match] + ['\n']
##########################################################
stripped_text_from_html = ''
if regex_matched_output:
# @todo some formatter for presentation?
stripped_text_from_html = ''.join(regex_matched_output)
# === MORE TEXT TRANSFORMATIONS ===
if watch.get('remove_duplicate_lines'):
stripped_text = transformer.remove_duplicate_lines(stripped_text)
stripped_text_from_html = '\n'.join(dict.fromkeys(line for line in stripped_text_from_html.replace("\n\n", "\n").splitlines()))
if watch.get('sort_text_alphabetically'):
stripped_text = transformer.sort_alphabetically(stripped_text)
# Note: Because a <p>something</p> will add an extra line feed to signify the paragraph gap
# we end up with 'Some text\n\n', sorting will add all those extra \n at the start, so we remove them here.
stripped_text_from_html = stripped_text_from_html.replace("\n\n", "\n")
stripped_text_from_html = '\n'.join(sorted(stripped_text_from_html.splitlines(), key=lambda x: x.lower()))
# === CHECKSUM CALCULATION ===
text_for_checksuming = stripped_text
### CALCULATE MD5
# If there's text to ignore
text_to_ignore = watch.get('ignore_text', []) + self.datastore.data['settings']['application'].get('global_ignore_text', [])
text_to_ignore += self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='ignore_text')
# Apply ignore_text for checksum calculation
if filter_config.ignore_text:
text_for_checksuming = html_tools.strip_ignore_text(stripped_text, filter_config.ignore_text)
# Optionally remove ignored lines from output
strip_ignored_lines = watch.get('strip_ignored_lines')
if strip_ignored_lines is None:
strip_ignored_lines = self.datastore.data['settings']['application'].get('strip_ignored_lines')
text_for_checksuming = stripped_text_from_html
if text_to_ignore:
text_for_checksuming = html_tools.strip_ignore_text(stripped_text_from_html, text_to_ignore)
# Some people prefer to also completely remove it
strip_ignored_lines = watch.get('strip_ignored_lines') if watch.get('strip_ignored_lines') is not None else self.datastore.data['settings']['application'].get('strip_ignored_lines')
if strip_ignored_lines:
stripped_text = text_for_checksuming
# @todo add test in the 'preview' mode, check the widget works? compare to datastruct
stripped_text_from_html = text_for_checksuming
# Calculate checksum
ignore_whitespace = self.datastore.data['settings']['application'].get('ignore_whitespace', False)
fetched_md5 = ChecksumCalculator.calculate(text_for_checksuming, ignore_whitespace=ignore_whitespace)
# Re #133 - if we should strip whitespaces from triggering the change detected comparison
if text_for_checksuming and self.datastore.data['settings']['application'].get('ignore_whitespace', False):
fetched_md5 = hashlib.md5(text_for_checksuming.translate(TRANSLATE_WHITESPACE_TABLE).encode('utf-8')).hexdigest()
else:
fetched_md5 = hashlib.md5(text_for_checksuming.encode('utf-8')).hexdigest()
# === BLOCKING RULES EVALUATION ===
############ Blocking rules, after checksum #################
blocked = False
# Check trigger_text
if rule_engine.evaluate_trigger_text(stripped_text, filter_config.trigger_text):
trigger_text = list(dict.fromkeys(watch.get('trigger_text', []) + self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='trigger_text')))
if len(trigger_text):
# Assume blocked
blocked = True
# Filter and trigger works the same, so reuse it
# It should return the line numbers that match
# Unblock flow if the trigger was found (some text remained after stripping what didn't match)
result = html_tools.strip_ignore_text(content=str(stripped_text_from_html),
wordlist=trigger_text,
mode="line numbers")
# Unblock if the trigger was found
if result:
blocked = False
# Check text_should_not_be_present
if rule_engine.evaluate_text_should_not_be_present(stripped_text, filter_config.text_should_not_be_present):
blocked = True
text_should_not_be_present = list(dict.fromkeys(watch.get('text_should_not_be_present', []) + self.datastore.get_tag_overrides_for_watch(uuid=watch.get('uuid'), attr='text_should_not_be_present')))
if len(text_should_not_be_present):
# If anything matched, then we should block a change from happening
result = html_tools.strip_ignore_text(content=str(stripped_text_from_html),
wordlist=text_should_not_be_present,
mode="line numbers")
if result:
blocked = True
# Check custom conditions
if rule_engine.evaluate_conditions(watch, self.datastore, stripped_text):
blocked = True
# And check if 'conditions' will let this pass through
if watch.get('conditions') and watch.get('conditions_match_logic'):
conditions_result = execute_ruleset_against_all_plugins(current_watch_uuid=watch.get('uuid'),
application_datastruct=self.datastore.data,
ephemeral_data={
'text': stripped_text_from_html
}
)
# === CHANGE DETECTION ===
if not conditions_result.get('result'):
# Conditions say "Condition not met" so we block it.
blocked = True
# Looks like something changed, but did it match all the rules?
if blocked:
changed_detected = False
else:
# Compare checksums
# The main thing that all this at the moment comes down to :)
if watch.get('previous_md5') != fetched_md5:
changed_detected = True
# Always record the new checksum
update_obj["previous_md5"] = fetched_md5
# On first run, initialize previous_md5
# On the first run of a site, watch['previous_md5'] will be None, set it to the current one.
if not watch.get('previous_md5'):
watch['previous_md5'] = fetched_md5
logger.debug(f"Watch UUID {watch.get('uuid')} content check - Previous MD5: {watch.get('previous_md5')}, Fetched MD5 {fetched_md5}")
# === UNIQUE LINES CHECK ===
if changed_detected and watch.get('check_unique_lines', False):
has_unique_lines = watch.lines_contain_something_unique_compared_to_history(
lines=stripped_text.splitlines(),
ignore_whitespace=ignore_whitespace
)
if changed_detected:
if watch.get('check_unique_lines', False):
ignore_whitespace = self.datastore.data['settings']['application'].get('ignore_whitespace')
if not has_unique_lines:
logger.debug(f"check_unique_lines: UUID {watch.get('uuid')} didnt have anything new setting change_detected=False")
changed_detected = False
else:
logger.debug(f"check_unique_lines: UUID {watch.get('uuid')} had unique content")
has_unique_lines = watch.lines_contain_something_unique_compared_to_history(
lines=stripped_text_from_html.splitlines(),
ignore_whitespace=ignore_whitespace
)
# Note: Explicit cleanup is only needed here because text_json_diff handles
# large strings (100KB-300KB for RSS/HTML). The other processors work with
# small strings and don't need this.
#
# Python would clean these up automatically, but explicit `del` frees memory
# immediately rather than waiting for function return, reducing peak memory usage.
del content
if 'html_content' in locals() and html_content is not stripped_text:
del html_content
if 'text_content_before_ignored_filter' in locals() and text_content_before_ignored_filter is not stripped_text:
del text_content_before_ignored_filter
if 'text_for_checksuming' in locals() and text_for_checksuming is not stripped_text:
del text_for_checksuming
# One or more lines? unsure?
if not has_unique_lines:
logger.debug(f"check_unique_lines: UUID {watch.get('uuid')} didnt have anything new setting change_detected=False")
changed_detected = False
else:
logger.debug(f"check_unique_lines: UUID {watch.get('uuid')} had unique content")
return changed_detected, update_obj, stripped_text
def _apply_diff_filtering(self, watch, stripped_text, text_before_filter):
"""Apply user's diff filtering preferences (show only added/removed/replaced lines)."""
from changedetectionio import diff
rendered_diff = diff.render_diff(
previous_version_file_contents=watch.get_last_fetched_text_before_filters(),
newest_version_file_contents=stripped_text,
include_equal=False,
include_added=watch.get('filter_text_added', True),
include_removed=watch.get('filter_text_removed', True),
include_replaced=watch.get('filter_text_replaced', True),
line_feed_sep="\n",
include_change_type_prefix=False
)
watch.save_last_text_fetched_before_filters(text_before_filter.encode('utf-8'))
if not rendered_diff and stripped_text:
# No differences found
return None
return rendered_diff
# stripped_text_from_html - Everything after filters and NO 'ignored' content
return changed_detected, update_obj, stripped_text_from_html
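
Reviewer note: the change-detection gate in this hunk can be sketched in isolation as below. This is a minimal illustration, not the project's implementation. The names `md5_of`, `detect_change` and `history_lines` are hypothetical, and the unique-lines check is reduced to a plain set difference rather than the real `lines_contain_something_unique_compared_to_history` method.

```python
import hashlib


def md5_of(text: str) -> str:
    """Hex MD5 of the filtered text, used only as a change fingerprint."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def detect_change(previous_md5, text, blocked=False, history_lines=None):
    """Return (changed_detected, fetched_md5).

    blocked:        True when conditions or trigger rules veto the change.
    history_lines:  optional set of lines seen in earlier snapshots (assumption:
                    a simplified stand-in for the watch history lookup).
    """
    fetched_md5 = md5_of(text)
    if blocked:
        return False, fetched_md5

    # A differing checksum is the primary signal; previous_md5 is None on first run.
    changed = previous_md5 != fetched_md5

    # Optional second gate: require at least one line not seen before.
    if changed and history_lines is not None:
        current = {line.strip() for line in text.splitlines() if line.strip()}
        if not (current - history_lines):
            changed = False
    return changed, fetched_md5
```

With this shape, reordering the same lines changes the checksum but is still suppressed when `history_lines` already contains every line, which is the behaviour the `check_unique_lines` branch is after.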

@@ -1,5 +1,5 @@
[pytest]
addopts = --no-start-live-server --live-server-port=0
addopts = --no-start-live-server --live-server-port=5005
#testpaths = tests pytest_invenio
#live_server_scope = function

@@ -37,6 +37,18 @@ class SignalHandler:
notification_event_signal.connect(self.handle_notification_event, weak=False)
logger.info("SignalHandler: Connected to notification_event signal")
# Create and start the queue update thread using standard threading
import threading
self.polling_emitter_thread = threading.Thread(
target=self.polling_emit_running_or_queued_watches_threaded,
daemon=True
)
self.polling_emitter_thread.start()
logger.info("Started polling thread using threading (eventlet-free)")
# Store the thread reference in socketio for clean shutdown
self.socketio_instance.polling_emitter_thread = self.polling_emitter_thread
def handle_signal(self, *args, **kwargs):
logger.trace(f"SignalHandler: Signal received with {len(args)} args and {len(kwargs)} kwargs")
# Safely extract the watch UUID from kwargs
@@ -112,6 +124,74 @@ class SignalHandler:
except Exception as e:
logger.error(f"Socket.IO error in handle_notification_event: {str(e)}")
def polling_emit_running_or_queued_watches_threaded(self):
"""Threading version of polling for Windows compatibility"""
import time
import threading
logger.info("Queue update thread started (threading mode)")
# Import here to avoid circular imports
from changedetectionio.flask_app import app
from changedetectionio import worker_handler
watch_check_update = signal('watch_check_update')
# Track previous state to avoid unnecessary emissions
previous_running_uuids = set()
# Run until app shutdown - check exit flag more frequently for fast shutdown
exit_event = getattr(app.config, 'exit', threading.Event())
while not exit_event.is_set():
try:
# Get current running UUIDs from async workers
running_uuids = set(worker_handler.get_running_uuids())
# Only send updates for UUIDs that changed state
newly_running = running_uuids - previous_running_uuids
no_longer_running = previous_running_uuids - running_uuids
# Send updates for newly running UUIDs (but exit fast if shutdown requested)
for uuid in newly_running:
if exit_event.is_set():
break
logger.trace(f"Threading polling: UUID {uuid} started processing")
with app.app_context():
watch_check_update.send(app_context=app, watch_uuid=uuid)
time.sleep(0.01) # Small yield
# Send updates for UUIDs that finished processing (but exit fast if shutdown requested)
if not exit_event.is_set():
for uuid in no_longer_running:
if exit_event.is_set():
break
logger.trace(f"Threading polling: UUID {uuid} finished processing")
with app.app_context():
watch_check_update.send(app_context=app, watch_uuid=uuid)
time.sleep(0.01) # Small yield
# Update tracking for next iteration
previous_running_uuids = running_uuids
# Sleep between polling cycles, but check exit flag every 0.5 seconds for fast shutdown
for _ in range(20): # 20 * 0.5 = 10 seconds total
if exit_event.is_set():
break
time.sleep(0.5)
except Exception as e:
logger.error(f"Error in threading polling: {str(e)}")
# Even during error recovery, check for exit quickly
for _ in range(1): # 1 * 0.5 = 0.5 seconds
if exit_event.is_set():
break
time.sleep(0.5)
# Check if we're in pytest environment - if so, be more gentle with logging
import sys
in_pytest = "pytest" in sys.modules or "PYTEST_CURRENT_TEST" in os.environ
if not in_pytest:
logger.info("Queue update thread stopped (threading mode)")
def handle_watch_update(socketio, **kwargs):
@@ -303,6 +383,19 @@ def init_socketio(app, datastore):
"""Shutdown the SocketIO server fast and aggressively"""
try:
logger.info("Socket.IO: Fast shutdown initiated...")
# For threading mode, give the thread a very short time to exit gracefully
if hasattr(socketio, 'polling_emitter_thread'):
if socketio.polling_emitter_thread.is_alive():
logger.info("Socket.IO: Waiting 1 second for polling thread to stop...")
socketio.polling_emitter_thread.join(timeout=1.0) # Only 1 second timeout
if socketio.polling_emitter_thread.is_alive():
logger.info("Socket.IO: Polling thread still running after timeout - continuing with shutdown")
else:
logger.info("Socket.IO: Polling thread stopped quickly")
else:
logger.info("Socket.IO: Polling thread already stopped")
logger.info("Socket.IO: Fast shutdown complete")
except Exception as e:
logger.error(f"Socket.IO error during shutdown: {str(e)}")
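
Reviewer note: the polling thread and its fast shutdown can be sketched without Flask or Socket.IO as below. This is an illustration under stated assumptions: `get_running`, `emit` and `stop_poller` are hypothetical names, newly running and finished UUIDs are merged into one loop instead of two, and `threading.Event.wait(timeout)` is used in place of the hunk's loop of `time.sleep(0.5)` calls.

```python
import threading


def poll_state_changes(get_running, emit, exit_event, interval=10.0):
    """Emit one update per UUID whose running state changed since the last poll."""
    previous = set()
    while not exit_event.is_set():
        current = set(get_running())
        # Symmetric difference = newly running plus no longer running
        for uuid in sorted(current ^ previous):
            if exit_event.is_set():
                break  # leave quickly when shutdown was requested
            emit(uuid)
        previous = current
        # Sleeps up to `interval` seconds but returns as soon as the event is set
        exit_event.wait(interval)


def stop_poller(thread, exit_event, timeout=1.0):
    """Request shutdown, wait briefly, and report whether the thread stopped."""
    exit_event.set()
    thread.join(timeout=timeout)
    return not thread.is_alive()
```

Because `Event.wait` wakes up the moment the event is set, the thread does not need a sub-loop to stay responsive, and a one second `join` timeout is normally enough.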

@@ -1,204 +0,0 @@
"""
RSS/Atom feed processing tools for changedetection.io
"""
from loguru import logger
import re
def cdata_in_document_to_text(html_content: str, render_anchor_tag_content=False) -> str:
"""
Process CDATA sections in HTML/XML content - inline replacement.
Args:
html_content: The HTML/XML content to process
render_anchor_tag_content: Whether to render anchor tag content
Returns:
Processed HTML/XML content with CDATA sections replaced inline
"""
from xml.sax.saxutils import escape as xml_escape
from .html_tools import html_to_text
pattern = r'<!\[CDATA\[(\s*(?:.(?<!\]\]>)\s*)*)\]\]>'
def repl(m):
text = m.group(1)
return xml_escape(html_to_text(html_content=text, render_anchor_tag_content=render_anchor_tag_content)).strip()
return re.sub(pattern, repl, html_content)
# Jinja2 template for formatting RSS/Atom feed entries
# Covers all common feedparser entry fields including namespaced elements
# Outputs HTML that will be converted to text via html_to_text
# @todo - This could be a UI setting in the future
RSS_ENTRY_TEMPLATE = """<article class="rss-item" id="{{ entry.id|replace('"', '')|replace(' ', '-') }}">{%- if entry.title -%}Title: {{ entry.title }}<br>{%- endif -%}
{%- if entry.link -%}<strong>Link:</strong> <a href="{{ entry.link }}">{{ entry.link }}</a><br>
{%- endif -%}
{%- if entry.id -%}
<strong>Guid:</strong> {{ entry.id }}<br>
{%- endif -%}
{%- if entry.published -%}
<strong>PubDate:</strong> {{ entry.published }}<br>
{%- endif -%}
{%- if entry.updated and entry.updated != entry.published -%}
<strong>Updated:</strong> {{ entry.updated }}<br>
{%- endif -%}
{%- if entry.author -%}
<strong>Author:</strong> {{ entry.author }}<br>
{%- elif entry.author_detail and entry.author_detail.name -%}
<strong>Author:</strong> {{ entry.author_detail.name }}
{%- if entry.author_detail.email %} ({{ entry.author_detail.email }}){% endif -%}
<br>
{%- endif -%}
{%- if entry.contributors -%}
<strong>Contributors:</strong> {% for contributor in entry.contributors -%}
{{ contributor.name if contributor.name else contributor }}
{%- if not loop.last %}, {% endif -%}
{%- endfor %}<br>
{%- endif -%}
{%- if entry.publisher -%}
<strong>Publisher:</strong> {{ entry.publisher }}<br>
{%- endif -%}
{%- if entry.rights -%}
<strong>Rights:</strong> {{ entry.rights }}<br>
{%- endif -%}
{%- if entry.license -%}
<strong>License:</strong> {{ entry.license }}<br>
{%- endif -%}
{%- if entry.language -%}
<strong>Language:</strong> {{ entry.language }}<br>
{%- endif -%}
{%- if entry.tags -%}
<strong>Tags:</strong> {% for tag in entry.tags -%}
{{ tag.term if tag.term else tag }}
{%- if not loop.last %}, {% endif -%}
{%- endfor %}<br>
{%- endif -%}
{%- if entry.category -%}
<strong>Category:</strong> {{ entry.category }}<br>
{%- endif -%}
{%- if entry.comments -%}
<strong>Comments:</strong> <a href="{{ entry.comments }}">{{ entry.comments }}</a><br>
{%- endif -%}
{%- if entry.slash_comments -%}
<strong>Comment Count:</strong> {{ entry.slash_comments }}<br>
{%- endif -%}
{%- if entry.enclosures -%}
<strong>Enclosures:</strong><br>
{%- for enclosure in entry.enclosures %}
- <a href="{{ enclosure.href }}">{{ enclosure.href }}</a> ({{ enclosure.type if enclosure.type else 'unknown type' }}
{%- if enclosure.length %}, {{ enclosure.length }} bytes{% endif -%}
)<br>
{%- endfor -%}
{%- endif -%}
{%- if entry.media_content -%}
<strong>Media:</strong><br>
{%- for media in entry.media_content %}
- <a href="{{ media.url }}">{{ media.url }}</a>
{%- if media.type %} ({{ media.type }}){% endif -%}
{%- if media.width and media.height %} {{ media.width }}x{{ media.height }}{% endif -%}
<br>
{%- endfor -%}
{%- endif -%}
{%- if entry.media_thumbnail -%}
<strong>Thumbnail:</strong> <a href="{{ entry.media_thumbnail[0].url if entry.media_thumbnail[0].url else entry.media_thumbnail[0] }}">{{ entry.media_thumbnail[0].url if entry.media_thumbnail[0].url else entry.media_thumbnail[0] }}</a><br>
{%- endif -%}
{%- if entry.media_description -%}
<strong>Media Description:</strong> {{ entry.media_description }}<br>
{%- endif -%}
{%- if entry.itunes_duration -%}
<strong>Duration:</strong> {{ entry.itunes_duration }}<br>
{%- endif -%}
{%- if entry.itunes_author -%}
<strong>Podcast Author:</strong> {{ entry.itunes_author }}<br>
{%- endif -%}
{%- if entry.dc_identifier -%}
<strong>Identifier:</strong> {{ entry.dc_identifier }}<br>
{%- endif -%}
{%- if entry.dc_source -%}
<strong>DC Source:</strong> {{ entry.dc_source }}<br>
{%- endif -%}
{%- if entry.dc_type -%}
<strong>Type:</strong> {{ entry.dc_type }}<br>
{%- endif -%}
{%- if entry.dc_format -%}
<strong>Format:</strong> {{ entry.dc_format }}<br>
{%- endif -%}
{%- if entry.dc_relation -%}
<strong>Related:</strong> {{ entry.dc_relation }}<br>
{%- endif -%}
{%- if entry.dc_coverage -%}
<strong>Coverage:</strong> {{ entry.dc_coverage }}<br>
{%- endif -%}
{%- if entry.source and entry.source.title -%}
<strong>Source:</strong> {{ entry.source.title }}
{%- if entry.source.link %} (<a href="{{ entry.source.link }}">{{ entry.source.link }}</a>){% endif -%}
<br>
{%- endif -%}
{%- if entry.dc_content -%}
<strong>Content:</strong> {{ entry.dc_content | safe }}
{%- elif entry.content and entry.content[0].value -%}
<strong>Content:</strong> {{ entry.content[0].value | safe }}
{%- elif entry.summary -%}
<strong>Summary:</strong> {{ entry.summary | safe }}
{%- endif -%}</article>
"""
def format_rss_items(rss_content: str, render_anchor_tag_content=False) -> str:
"""
Format RSS/Atom feed items in a readable text format using feedparser and Jinja2.
Converts RSS <item> or Atom <entry> elements to formatted text with all available fields:
- Basic fields: title, link, id/guid, published date, updated date
- Author fields: author, author_detail, contributors, publisher
- Content fields: content, summary, description
- Metadata: tags, category, rights, license
- Media: enclosures, media_content, media_thumbnail
- Dublin Core elements: dc:creator, dc:date, dc:publisher, etc. (mapped by feedparser)
Args:
rss_content: The RSS/Atom feed content
render_anchor_tag_content: Whether to render anchor tag content in descriptions (unused, kept for compatibility)
Returns:
Formatted HTML content ready for html_to_text conversion
"""
try:
import feedparser
from changedetectionio.jinja2_custom import safe_jinja
# Parse the feed - feedparser handles all RSS/Atom variants, CDATA, entity unescaping, etc.
feed = feedparser.parse(rss_content)
# Determine feed type for appropriate labels
is_atom = feed.version and 'atom' in feed.version
formatted_items = []
for entry in feed.entries:
# Render the entry using Jinja2 template
rendered = safe_jinja.render(RSS_ENTRY_TEMPLATE, entry=entry, is_atom=is_atom)
formatted_items.append(rendered.strip())
# Wrap each item in a div with classes (first, last, item-N)
items_html = []
total_items = len(formatted_items)
for idx, item in enumerate(formatted_items):
classes = ['rss-item']
if idx == 0:
classes.append('first')
if idx == total_items - 1:
classes.append('last')
classes.append(f'item-{idx + 1}')
class_str = ' '.join(classes)
items_html.append(f'<div class="{class_str}">{item}</div>')
return '<html><body>\n' + "\n<br>".join(items_html) + '\n</body></html>'
except Exception as e:
logger.warning(f"Error formatting RSS items: {str(e)}")
# Fall back to original content
return rss_content
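
Reviewer note: the inline CDATA replacement done by `cdata_in_document_to_text` in this removed file can be illustrated with the standard library alone. This is a sketch, not the removed code: `strip_tags` is a hypothetical, much cruder stand-in for the project's `html_to_text`, and the pattern is a simplified non-greedy one rather than the original look-behind expression.

```python
import re
from xml.sax.saxutils import escape as xml_escape

# Simplified pattern (assumption): non-greedy match up to the first "]]>"
CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(fragment: str) -> str:
    """Hypothetical stand-in for html_to_text: drops tags, keeps the text."""
    return TAG_RE.sub("", fragment)


def cdata_to_text(document: str) -> str:
    """Replace each CDATA section with its XML-escaped plain text, in place."""
    def repl(match):
        return xml_escape(strip_tags(match.group(1))).strip()
    return CDATA_RE.sub(repl, document)
```

Escaping the extracted text matters because the result is written back into the surrounding XML, where a raw `&` or `<` would no longer be protected by the CDATA wrapper.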

@@ -11,86 +11,36 @@ set -e
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
# Since there's no curl installed, let's roll with python3
check_sanity() {
local port="$1"
if [ -z "$port" ]; then
echo "Usage: check_sanity <port>" >&2
return 1
fi
find tests/test_*py -type f|while read test_name
do
echo "TEST RUNNING $test_name"
# REMOVE_REQUESTS_OLD_SCREENSHOTS disabled so that we can write a screenshot and send it in test_notifications.py without a real browser
REMOVE_REQUESTS_OLD_SCREENSHOTS=false pytest $test_name
done
python3 - "$port" <<'PYCODE'
import sys, time, urllib.request, socket
port = sys.argv[1]
url = f'http://localhost:{port}'
ok = False
for _ in range(6): # --retry 6
try:
r = urllib.request.urlopen(url, timeout=3).read().decode()
if 'est-url-is-sanity' in r:
ok = True
break
except (urllib.error.URLError, ConnectionRefusedError, socket.error):
time.sleep(1)
sys.exit(0 if ok else 1)
PYCODE
}
data_sanity_test () {
# Restart data sanity test
cd ..
TMPDIR=$(mktemp -d)
PORT_N=$((5000 + RANDOM % (6501 - 5000)))
./changedetection.py -p $PORT_N -d $TMPDIR -u "https://localhost?test-url-is-sanity=1" &
PID=$!
sleep 5
kill $PID
sleep 2
./changedetection.py -p $PORT_N -d $TMPDIR &
PID=$!
sleep 5
# On a restart the URL should still be there
check_sanity $PORT_N || exit 1
kill $PID
cd $OLDPWD
# datastore looks alright, continue
}
data_sanity_test
# REMOVE_REQUESTS_OLD_SCREENSHOTS disabled so that we can write a screenshot and send it in test_notifications.py without a real browser
REMOVE_REQUESTS_OLD_SCREENSHOTS=false pytest -n 30 --dist load tests/test_*.py
#time pytest -n auto --dist loadfile -vv --tb=long tests/test_*.py
echo "RUNNING WITH BASE_URL SET"
# Now re-run some tests with BASE_URL enabled
# Re #65 - Ability to include a link back to the installation, in the notification.
export BASE_URL="https://really-unique-domain.io"
REMOVE_REQUESTS_OLD_SCREENSHOTS=false pytest -vv -s --maxfail=1 tests/test_notification.py
REMOVE_REQUESTS_OLD_SCREENSHOTS=false pytest tests/test_notification.py
# Re-run with HIDE_REFERER set - could affect login
export HIDE_REFERER=True
pytest -vv -s --maxfail=1 tests/test_access_control.py
pytest tests/test_access_control.py
# Re-run a few tests that will trigger brotli based storage
export SNAPSHOT_BROTLI_COMPRESSION_THRESHOLD=5
pytest -vv -s --maxfail=1 tests/test_access_control.py
pytest tests/test_access_control.py
REMOVE_REQUESTS_OLD_SCREENSHOTS=false pytest tests/test_notification.py
pytest -vv -s --maxfail=1 tests/test_backend.py
pytest -vv -s --maxfail=1 tests/test_rss.py
pytest -vv -s --maxfail=1 tests/test_unique_lines.py
pytest tests/test_backend.py
pytest tests/test_rss.py
pytest tests/test_unique_lines.py
# Try high concurrency
FETCH_WORKERS=130 pytest tests/test_history_consistency.py -v -l
# Check file:// will pickup a file when enabled
echo "Hello world" > /tmp/test-file.txt
ALLOW_FILE_URI=yes pytest -vv -s tests/test_security.py
ALLOW_FILE_URI=yes pytest tests/test_security.py
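
Reviewer note: the retry loop inside `check_sanity` can be written as a small reusable helper, sketched below. The names `wait_for_marker` and `http_fetch` are hypothetical. Two deliberate differences from the heredoc version: it catches `OSError`, which is the base class of `URLError`, `ConnectionRefusedError` and `socket.error`, and it also sleeps when the page loads but the marker is missing.

```python
import time
import urllib.request


def http_fetch(url, timeout=3):
    """Fetch a URL and return the decoded body."""
    return urllib.request.urlopen(url, timeout=timeout).read().decode()


def wait_for_marker(fetch, marker, retries=6, delay=1.0, sleep=time.sleep):
    """Call fetch() up to `retries` times until `marker` appears in the result."""
    for _ in range(retries):
        try:
            if marker in fetch():
                return True
        except OSError:
            pass  # server not up yet, try again
        sleep(delay)
    return False
```

A call such as `wait_for_marker(lambda: http_fetch("http://localhost:5005"), "test-url-is-sanity")` would mirror the shell function; the port here is only an example value.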

@@ -6,8 +6,6 @@
# enable debug
set -x
docker network inspect changedet-network >/dev/null 2>&1 || docker network create changedet-network
docker run --network changedet-network -d --hostname selenium -p 4444:4444 --rm --shm-size="2g" selenium/standalone-chrome:4
# An extra browser is configured, but we never chose to use it, so it should NOT show in the logs
docker run --rm -e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" --network changedet-network test-changedetectionio bash -c 'cd changedetectionio;pytest tests/custom_browser_url/test_custom_browser_url.py::test_request_not_via_custom_browser_url'

@@ -19,13 +19,12 @@ docker run --network changedet-network -d \
-v `pwd`/tests/proxy_list/squid-passwords.txt:/etc/squid3/passwords \
ubuntu/squid:4.13-21.10_edge
sleep 5
## 2nd test actually choose the preferred proxy from proxies.json
# This will force a request via "proxy-two"
docker run --network changedet-network \
-v `pwd`/tests/proxy_list/proxies.json-example:/tmp/proxies.json \
-v `pwd`/tests/proxy_list/proxies.json-example:/app/changedetectionio/test-datastore/proxies.json \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest -s tests/proxy_list/test_multiple_proxy.py --datastore-path /tmp'
bash -c 'cd changedetectionio && pytest tests/proxy_list/test_multiple_proxy.py'
set +e
echo "- Looking for chosen.changedetection.io request in squid-one - it should NOT be here"
@@ -49,10 +48,8 @@ fi
# Test the UI configurable proxies
docker run --network changedet-network \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest tests/proxy_list/test_select_custom_proxy.py --datastore-path /tmp'
bash -c 'cd changedetectionio && pytest tests/proxy_list/test_select_custom_proxy.py'
# Give squid proxies a moment to flush their logs
sleep 2
# Should see a request for one.changedetection.io in there
echo "- Looking for .changedetection.io request in squid-custom"
@@ -66,10 +63,7 @@ fi
# Test "no-proxy" option
docker run --network changedet-network \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest tests/proxy_list/test_noproxy.py --datastore-path /tmp'
# Give squid proxies a moment to flush their logs
sleep 2
bash -c 'cd changedetectionio && pytest tests/proxy_list/test_noproxy.py'
# We need to handle grep returning 1
set +e
@@ -86,8 +80,6 @@ for c in $(echo "squid-one squid-two squid-custom"); do
fi
done
echo "docker ps output"
docker ps
docker kill squid-one squid-two squid-custom
@@ -96,19 +88,19 @@ docker kill squid-one squid-two squid-custom
# Requests
docker run --network changedet-network \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest tests/proxy_list/test_proxy_noconnect.py --datastore-path /tmp'
bash -c 'cd changedetectionio && pytest tests/proxy_list/test_proxy_noconnect.py'
# Playwright
docker run --network changedet-network \
test-changedetectionio \
bash -c 'cd changedetectionio && PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000 pytest tests/proxy_list/test_proxy_noconnect.py --datastore-path /tmp'
bash -c 'cd changedetectionio && PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000 pytest tests/proxy_list/test_proxy_noconnect.py'
# Puppeteer fast
docker run --network changedet-network \
test-changedetectionio \
bash -c 'cd changedetectionio && FAST_PUPPETEER_CHROME_FETCHER=1 PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000 pytest tests/proxy_list/test_proxy_noconnect.py --datastore-path /tmp'
bash -c 'cd changedetectionio && FAST_PUPPETEER_CHROME_FETCHER=1 PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000 pytest tests/proxy_list/test_proxy_noconnect.py'
# Selenium
docker run --network changedet-network \
test-changedetectionio \
bash -c 'cd changedetectionio && WEBDRIVER_URL=http://selenium:4444/wd/hub pytest tests/proxy_list/test_proxy_noconnect.py --datastore-path /tmp'
bash -c 'cd changedetectionio && WEBDRIVER_URL=http://selenium:4444/wd/hub pytest tests/proxy_list/test_proxy_noconnect.py'

@@ -5,7 +5,6 @@ set -e
# enable debug
set -x
docker network inspect changedet-network >/dev/null 2>&1 || docker network create changedet-network
# SOCKS5 related - start simple Socks5 proxy server
# SOCKSTEST=xyz should show in the logs of this service to confirm it fetched
@@ -15,13 +14,13 @@ docker run --network changedet-network -d --hostname socks5proxy-noauth --rm -p
echo "---------------------------------- SOCKS5 -------------------"
# SOCKS5 related - test from proxies.json
docker run --network changedet-network \
-v `pwd`/tests/proxy_socks5/proxies.json-example:/tmp/proxies.json \
-v `pwd`/tests/proxy_socks5/proxies.json-example:/app/changedetectionio/test-datastore/proxies.json \
--rm \
-e "FLASK_SERVER_NAME=cdio" \
--hostname cdio \
-e "SOCKSTEST=proxiesjson" \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy_sources.py --datastore-path /tmp'
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy_sources.py'
# SOCKS5 related - by manually entering in UI
docker run --network changedet-network \
@@ -30,18 +29,18 @@ docker run --network changedet-network \
--hostname cdio \
-e "SOCKSTEST=manual" \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy.py --datastore-path /tmp'
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy.py'
# SOCKS5 related - test from proxies.json via playwright - NOTE- PLAYWRIGHT DOESNT SUPPORT AUTHENTICATING PROXY
docker run --network changedet-network \
-e "SOCKSTEST=manual-playwright" \
--hostname cdio \
-e "FLASK_SERVER_NAME=cdio" \
-v `pwd`/tests/proxy_socks5/proxies.json-example-noauth:/tmp/proxies.json \
-v `pwd`/tests/proxy_socks5/proxies.json-example-noauth:/app/changedetectionio/test-datastore/proxies.json \
-e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" \
--rm \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy_sources.py --datastore-path /tmp'
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy_sources.py'
echo "socks5 server logs"
docker logs socks5proxy

@@ -0,0 +1,24 @@
"""
Safe Jinja2 render with max payload sizes
See https://jinja.palletsprojects.com/en/3.1.x/sandbox/#security-considerations
"""
import jinja2.sandbox
import typing as t
import os
JINJA2_MAX_RETURN_PAYLOAD_SIZE = 1024 * int(os.getenv("JINJA2_MAX_RETURN_PAYLOAD_SIZE_KB", 1024 * 10))
# This is used for notifications etc, so actually it's OK to send custom HTML such as <a href> etc, but it should limit what data is available.
# (Which also limits available functions that could be called)
def render(template_str, **args: t.Any) -> str:
jinja2_env = jinja2.sandbox.ImmutableSandboxedEnvironment(extensions=['jinja2_time.TimeExtension'])
output = jinja2_env.from_string(template_str).render(args)
return output[:JINJA2_MAX_RETURN_PAYLOAD_SIZE]
def render_fully_escaped(content):
env = jinja2.sandbox.ImmutableSandboxedEnvironment(autoescape=True, extensions=['jinja2_time.TimeExtension'])
template = env.from_string("{{ some_html|e }}")
return template.render(some_html=content)
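
Reviewer note: the size cap applied by `render` is easy to misread, so here is the arithmetic on its own. This sketch leaves Jinja2 out entirely and only reproduces the limit calculation and the slice. `max_payload_chars` and `cap_output` are hypothetical names, and the environment is read per call here, whereas the module above reads it once at import time.

```python
import os


def max_payload_chars(environ=os.environ) -> int:
    """Limit in characters: 1024 * KB setting, defaulting to 10240 KB."""
    kb = int(environ.get("JINJA2_MAX_RETURN_PAYLOAD_SIZE_KB", 1024 * 10))
    return 1024 * kb


def cap_output(rendered: str, environ=os.environ) -> str:
    """Truncate rendered template output to the configured limit."""
    return rendered[:max_payload_chars(environ)]
```

With no override the limit is 1024 * 10240 = 10,485,760. Note that slicing a `str` counts characters rather than encoded bytes, so the byte size of non-ASCII output can exceed that figure.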

@@ -29,7 +29,7 @@ $(document).ready(function () {
$(this).text(new Date($(this).data("utc")).toLocaleString());
})
const timezoneInput = $('#application-scheduler_timezone_default');
const timezoneInput = $('#application-timezone');
if(timezoneInput.length) {
const timezone = Intl.DateTimeFormat().resolvedOptions().timeZone;
if (!timezoneInput.val().trim()) {

@@ -1,5 +1,18 @@
$(document).ready(function () {
// Could be from 'watch' or system settings or other
function getNotificationData() {
const data = {
notification_body: $('textarea.notification-body').val(),
notification_format: $('select.notification-format').val(),
notification_title: $('input.notification-title').val(),
notification_urls: $('textarea.notification-urls').val(),
tags: $('#tags').val(),
window_url: window.location.href,
}
return data
}
$('#add-email-helper').click(function (e) {
e.preventDefault();
email = prompt("Destination email");
@@ -10,17 +23,82 @@ $(document).ready(function () {
}
});
$('#notifications-minitabs').miniTabs({
"Customise": "#notification-setup",
"Preview": "#notification-preview",
});
$(document).on('click', '[data-target="#notification-preview"]', function (e) {
var data = getNotificationData();
$('#notification-iframe-html-preview').contents().find('body').html('Loading...');
$.ajax({
type: "POST",
url: notification_test_render_preview_url,
data: data,
statusCode: {
400: function (data) {
$('#notification-test-log').show().toggleClass('error', true);
$("#notification-test-log>span").text(data.responseText);
},
}
}).done(function (data) {
$('#notification-test-log').toggleClass('error', false);
setPreview(data['result']);
})
});
function setPreview(data) {
const iframe = document.getElementById("notification-iframe-html-preview");
const isDark = document.documentElement.getAttribute('data-darkmode') === 'true';
// This should come back in the data object
const isTextFormat = $('select.notification-format').val() === 'Text';
$('#notification-preview-title-text').text(data['title']);
$('#notification-div-text-preview').text(data['body']);
return;
iframe.srcdoc = `
<html data-darkmode="${isDark}">
<head>
<style>
:root {
--color-white: #fff;
--color-grey-200: #333;
--color-grey-800: #e0e0e0;
--color-black: #000;
--color-dark-red: #a00;
--color-light-red: #dd0000;
--color-background: var(--color-grey-800);
--color-text: var(--color-grey-200);
}
html[data-darkmode="true"] {
--color-background: var(--color-grey-200);
--color-text: var(--color-white);
}
body { /* no darkmode */
background-color: var(--color-background);
color: var(--color-text);
padding: 5px;
}
body.text-format {
font-family: monospace;
white-space: pre;
overflow-wrap: normal;
overflow-x: auto;
}
</style>
</head>
<body class="${isTextFormat ? 'text-format' : ''}">${data['body']}</body>
</html>`;
}
$('#send-test-notification').click(function (e) {
e.preventDefault();
data = {
notification_urls: $('textarea.notification-urls').val(),
notification_title: $('input.notification-title').val(),
notification_body: $('textarea.notification-body').val(),
notification_format: $('select.notification-format').val(),
tags: $('#tags').val(),
window_url: window.location.href,
}
var data = getNotificationData();
$('.notifications-wrapper .spinner').fadeIn();
$('#notification-test-log').show();
@@ -30,11 +108,14 @@ $(document).ready(function () {
data: data,
statusCode: {
400: function (data) {
$("#notification-test-log").toggleClass('error', true);
$("#notification-test-log>span").text(data.responseText);
},
}
}).done(function (data) {
$("#notification-test-log>span").text(data);
$("#notification-test-log").toggleClass('error', false);
$("#notification-test-log>span").text(data['status']);
}).fail(function (jqXHR, textStatus, errorThrown) {
// Handle connection refused or other errors
if (textStatus === "error" && errorThrown === "") {
@@ -42,11 +123,13 @@ $(document).ready(function () {
$("#notification-test-log>span").text("Error: Connection refused or server is unreachable.");
} else {
console.error("Error:", textStatus, errorThrown);
$("#notification-test-log>span").text("An error occurred: " + textStatus);
$("#notification-test-log>span").text("An error occurred: " + errorThrown);
}
}).always(function () {
$('.notifications-wrapper .spinner').hide();
})
});
});

@@ -2,13 +2,6 @@
$(document).ready(function () {
function reapplyTableStripes() {
$('.watch-table tbody tr').each(function(index) {
$(this).removeClass('pure-table-odd pure-table-even');
$(this).addClass(index % 2 === 0 ? 'pure-table-odd' : 'pure-table-even');
});
}
function bindSocketHandlerButtonsEvents(socket) {
$('.ajax-op').on('click.socketHandlerNamespace', function (e) {
e.preventDefault();
@@ -108,7 +101,6 @@ $(document).ready(function () {
socket.on('watch_deleted', function (data) {
$('tr[data-watch-uuid="' + data.uuid + '"] td').fadeOut(500, function () {
$(this).closest('tr').remove();
reapplyTableStripes();
});
});

@@ -1,11 +1,11 @@
// Rewrite this as a plugin.. is all this JS really 'worth it'?
window.addEventListener('hashchange', function () {
var tabs = document.getElementsByClassName('active');
while (tabs[0]) {
tabs[0].classList.remove('active');
var tabs = document.querySelectorAll('.tabs .active');
tabs.forEach(function (tab) {
tab.classList.remove('active');
document.body.classList.remove('full-width');
}
});
set_active_tab();
}, false);

@@ -74,7 +74,7 @@ $(document).ready(function () {
$('#filters-and-triggers input')[method]('change', request_textpreview_update.throttle(1000));
$("#filters-and-triggers-tab")[method]('click', request_textpreview_update.throttle(1000));
});
$('.minitabs-wrapper').miniTabs({
$('#filter-preview-minitabs').miniTabs({
"Content after filters": "#text-preview-inner",
"Content raw/before filters": "#text-preview-before-inner"
});

@@ -18,8 +18,15 @@ html[data-darkmode="true"] {
display: block;
}
}
.minitabs-content {
> div {
background-color: rgb(249 249 249 / 13%) !important;
}
}
}

@@ -1,6 +1,13 @@
.minitabs-wrapper {
width: 100%;
.tab-contents-monospace-preview {
font-family: "Courier New", Courier, monospace; /* Sets the font to a monospace type */
font-size: 70%;
word-break: break-word;
white-space: pre-wrap; /* Preserves whitespace and line breaks like <pre> */
}
> div[id] {
padding: 20px;
border: 1px solid #ccc;
@@ -10,38 +17,45 @@
.minitabs-content {
width: 100%;
display: flex;
> div {
flex: 1 1 auto;
min-width: 0;
overflow: scroll;
padding: 1rem;
border: 1px solid #ddd;
background-color: #eee;
}
}
.minitabs {
display: flex;
border-bottom: 1px solid #ccc;
}
.minitab {
flex: 1;
text-align: center;
padding: 12px 0;
text-decoration: none;
color: #333;
background-color: #f1f1f1;
border: 1px solid #ccc;
border-bottom: none;
cursor: pointer;
transition: background-color 0.3s;
}
.minitab {
flex: 1;
text-align: center;
padding: 12px 0;
text-decoration: none;
color: #333;
background-color: #f1f1f1;
border: 1px solid #ccc;
border-bottom: none;
cursor: pointer;
transition: background-color 0.3s;
border-top-left-radius: 5px;
border-top-right-radius: 5px;
opacity: 0.45;
&:hover {
background-color: #ddd;
}
.minitab:hover {
background-color: #ddd;
&.active {
background-color: #eee;
font-weight: bold;
opacity: 1.0;
}
}
}
.minitab.active {
background-color: #fff;
font-weight: bold;
}
}

@@ -0,0 +1,12 @@
#notification-preview {
resize: both;
overflow: hidden;
}
#notification-iframe-html-preview {
width: 100%;
height: 100%;
border: 0;
display: block;
overflow: auto;
}

@@ -1,5 +1,3 @@
@use "minitabs";
body.preview-text-enabled {
@media (min-width: 800px) {
@@ -31,19 +29,7 @@ body.preview-text-enabled {
}
#activate-text-preview {
background-color: var(--color-grey-500);
}
/* actual preview area */
.monospace-preview {
background: var(--color-background-input);
border: 1px solid var(--color-grey-600);
padding: 1rem;
color: var(--color-text-input);
font-family: "Courier New", Courier, monospace; /* Sets the font to a monospace type */
font-size: 70%;
word-break: break-word;
white-space: pre-wrap; /* Preserves whitespace and line breaks like <pre> */
background-color: var(--color-grey-500);
}
}
@@ -53,3 +39,11 @@ body.preview-text-enabled {
z-index: 3;
box-shadow: 1px 1px 4px var(--color-shadow-jump);
}
#filter-preview-minitabs {
.minitabs-content {
> div {
overflow: scroll;
}
}
}


@@ -20,6 +20,8 @@
@use "parts/lister_extra";
@use "parts/socket";
@use "parts/visualselector";
@use "parts/_minitabs";
@use "parts/_notification";
@use "parts/widgets";
body {
@@ -329,18 +331,17 @@ a.pure-button-selected {
.notifications-wrapper {
padding-top: 0.5rem;
#notification-test-log {
margin-top: 1rem;
padding: 1rem;
padding-top: 1rem;
white-space: pre-wrap;
word-break: break-word;
overflow-wrap: break-word;
max-width: 100%;
box-sizing: border-box;
max-height: 12rem;
overflow-y: scroll;
border: 1px solid var(--color-border-notification);
border-radius: 5px;
&.error {
> span {
color: var(--color-error) !important;
}
}
}
}
@@ -350,11 +351,6 @@ label {
}
}
.grey-form-border {
border: 1px solid var(--color-border-notification);
padding: 0.5rem;
border-radius: 5px;
}
#notification-error-log {
border: 1px solid var(--color-border-notification);

File diff suppressed because one or more lines are too long


@@ -1,14 +1,11 @@
from changedetectionio.strtobool import strtobool
from changedetectionio.validate_url import is_safe_valid_url
from flask import (
flash
)
from .blueprint.rss import RSS_CONTENT_FORMAT_DEFAULT
from .html_tools import TRANSLATE_WHITESPACE_TABLE
from .model import App, Watch, USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH
from . model import App, Watch
from copy import deepcopy, copy
from os import path, unlink
from threading import Lock
@@ -23,13 +20,6 @@ import uuid as uuid_builder
from loguru import logger
from blinker import signal
# Try to import orjson for faster JSON serialization
try:
import orjson
HAS_ORJSON = True
except ImportError:
HAS_ORJSON = False
from .processors import get_custom_watch_obj_for_processor
from .processors.restock_diff import Restock
@@ -45,41 +35,22 @@ class ChangeDetectionStore:
lock = Lock()
# For general updates/writes that can wait a few seconds
needs_write = False
datastore_path = None
# For when we edit, we should write to disk
needs_write_urgent = False
__version_check = True
save_data_thread = None
def __init__(self, datastore_path="/datastore", include_default_watches=True, version_tag="0.0.0"):
# Should only be active for docker
# logging.basicConfig(filename='/dev/stdout', level=logging.INFO)
self.__data = App.model()
self.datastore_path = datastore_path
self.json_store_path = os.path.join(self.datastore_path, "url-watches.json")
logger.info(f"Datastore path is '{self.json_store_path}'")
self.needs_write = False
self.start_time = time.time()
self.stop_thread = False
self.save_version_copy_json_db(version_tag)
self.reload_state(datastore_path=datastore_path, include_default_watches=include_default_watches, version_tag=version_tag)
def save_version_copy_json_db(self, version_tag):
import re
version_text = re.sub(r'\D+', '-', version_tag)
db_path = os.path.join(self.datastore_path, "url-watches.json")
db_path_version_backup = os.path.join(self.datastore_path, f"url-watches-{version_text}.json")
if not os.path.isfile(db_path_version_backup) and os.path.isfile(db_path):
from shutil import copyfile
logger.info(f"Backing up JSON DB due to new version to '{db_path_version_backup}'.")
copyfile(db_path, db_path_version_backup)
def reload_state(self, datastore_path, include_default_watches, version_tag):
logger.info(f"Datastore path is '{datastore_path}'")
self.__data = App.model()
self.json_store_path = os.path.join(self.datastore_path, "url-watches.json")
# Base definition for all watchers
# deepcopy part of #569 - not sure why its needed exactly
self.generic_definition = deepcopy(Watch.model(datastore_path = datastore_path, default={}))
@@ -91,46 +62,37 @@ class ChangeDetectionStore:
self.__data['build_sha'] = f.read()
try:
if HAS_ORJSON:
# orjson.loads() expects UTF-8 encoded bytes #3611
with open(self.json_store_path, 'rb') as json_file:
from_disk = orjson.loads(json_file.read())
else:
with open(self.json_store_path, encoding='utf-8') as json_file:
from_disk = json.load(json_file)
# @todo retest with ", encoding='utf-8'"
with open(self.json_store_path) as json_file:
from_disk = json.load(json_file)
if not from_disk:
# No FileNotFound exception was thrown but somehow the JSON was empty - abort for safety.
logger.critical(f"JSON DB existed but was empty on load - empty JSON file? '{self.json_store_path}' Aborting")
raise Exception('JSON DB existed but was empty on load - Aborting')
# @todo isnt there a way todo this dict.update recursively?
# Problem here is if the one on the disk is missing a sub-struct, it wont be present anymore.
if 'watching' in from_disk:
self.__data['watching'].update(from_disk['watching'])
# @todo isnt there a way todo this dict.update recursively?
# Problem here is if the one on the disk is missing a sub-struct, it wont be present anymore.
if 'watching' in from_disk:
self.__data['watching'].update(from_disk['watching'])
if 'app_guid' in from_disk:
self.__data['app_guid'] = from_disk['app_guid']
if 'app_guid' in from_disk:
self.__data['app_guid'] = from_disk['app_guid']
if 'settings' in from_disk:
if 'headers' in from_disk['settings']:
self.__data['settings']['headers'].update(from_disk['settings']['headers'])
if 'settings' in from_disk:
if 'headers' in from_disk['settings']:
self.__data['settings']['headers'].update(from_disk['settings']['headers'])
if 'requests' in from_disk['settings']:
self.__data['settings']['requests'].update(from_disk['settings']['requests'])
if 'requests' in from_disk['settings']:
self.__data['settings']['requests'].update(from_disk['settings']['requests'])
if 'application' in from_disk['settings']:
self.__data['settings']['application'].update(from_disk['settings']['application'])
if 'application' in from_disk['settings']:
self.__data['settings']['application'].update(from_disk['settings']['application'])
# Convert each existing watch back to the Watch.model object
for uuid, watch in self.__data['watching'].items():
self.__data['watching'][uuid] = self.rehydrate_entity(uuid, watch)
logger.info(f"Watching: {uuid} {watch['url']}")
# Convert each existing watch back to the Watch.model object
for uuid, watch in self.__data['watching'].items():
self.__data['watching'][uuid] = self.rehydrate_entity(uuid, watch)
logger.info(f"Watching: {uuid} {watch['url']}")
# And for Tags also, should be Restock type because it has extra settings
for uuid, tag in self.__data['settings']['application']['tags'].items():
self.__data['settings']['application']['tags'][uuid] = self.rehydrate_entity(uuid, tag, processor_override='restock_diff')
logger.info(f"Tag: {uuid} {tag['title']}")
# And for Tags also, should be Restock type because it has extra settings
for uuid, tag in self.__data['settings']['application']['tags'].items():
self.__data['settings']['application']['tags'][uuid] = self.rehydrate_entity(uuid, tag, processor_override='restock_diff')
logger.info(f"Tag: {uuid} {tag['title']}")
# First time ran, Create the datastore.
except (FileNotFoundError):
@@ -181,10 +143,7 @@ class ChangeDetectionStore:
self.needs_write = True
# Finally start the thread that will manage periodic data saves to JSON
# Only start if thread is not already running (reload_state might be called multiple times)
if not self.save_data_thread or not self.save_data_thread.is_alive():
self.save_data_thread = threading.Thread(target=self.save_datastore)
self.save_data_thread.start()
save_data_thread = threading.Thread(target=self.save_datastore).start()
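Note on the hunk above: one side guards the thread start with an `is_alive()` check, the other assigns the result of `Thread(...).start()`, which is always `None`. A minimal sketch of the guarded variant follows. The `Saver` class, `_run`, and `ensure_started` are hypothetical names for illustration, not part of the project.

```python
import threading
import time


class Saver:
    """Hypothetical stand-in for a store that owns one background save thread."""

    def __init__(self):
        self.save_data_thread = None
        self.stop_thread = False

    def _run(self):
        # Placeholder loop; a real store would flush data to disk here.
        while not self.stop_thread:
            time.sleep(0.01)

    def ensure_started(self):
        # Start the thread only if none exists or the previous one has exited,
        # so calling this repeatedly never spawns a second worker.
        if not self.save_data_thread or not self.save_data_thread.is_alive():
            self.save_data_thread = threading.Thread(target=self._run, daemon=True)
            self.save_data_thread.start()
        return self.save_data_thread
```

Keeping the `Thread` object and calling `start()` on a separate line is what makes the later `is_alive()` check possible.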
def rehydrate_entity(self, uuid, entity, processor_override=None):
"""Set the dict back to the dict Watch object"""
@@ -269,37 +228,26 @@ class ChangeDetectionStore:
d['settings']['application']['active_base_url'] = active_base_url.strip('" ')
return d
from pathlib import Path
def delete_path(self, path: Path):
import shutil
"""Delete a file or directory tree, including the path itself."""
if not path.exists():
return
if path.is_file() or path.is_symlink():
path.unlink(missing_ok=True) # deletes a file or symlink
else:
shutil.rmtree(path, ignore_errors=True) # deletes dir *and* its contents
# Delete a single watch by UUID
def delete(self, uuid):
import pathlib
import shutil
with self.lock:
if uuid == 'all':
self.__data['watching'] = {}
time.sleep(1) # Mainly used for testing to allow all items to flush before running next test
# GitHub #30 also delete history records
for uuid in self.data['watching']:
path = pathlib.Path(
os.path.join(self.datastore_path, uuid))
path = pathlib.Path(os.path.join(self.datastore_path, uuid))
if os.path.exists(path):
self.delete(uuid)
shutil.rmtree(path)
else:
path = pathlib.Path(os.path.join(self.datastore_path, uuid))
if os.path.exists(path):
self.delete_path(path)
shutil.rmtree(path)
del self.data['watching'][uuid]
self.needs_write_urgent = True
@@ -382,10 +330,9 @@ class ChangeDetectionStore:
logger.error(f"Error fetching metadata for shared watch link {url} {str(e)}")
flash("Error fetching metadata for {}".format(url), 'error')
return False
if not is_safe_valid_url(url):
flash('Watch protocol is not permitted or invalid URL format', 'error')
from .model.Watch import is_safe_url
if not is_safe_url(url):
flash('Watch protocol is not permitted by SAFE_PROTOCOL_REGEX', 'error')
return None
if tag and type(tag) == str:
@@ -451,19 +398,14 @@ class ChangeDetectionStore:
self.sync_to_json()
return
else:
try:
# Re #286 - First write to a temp file, then confirm it looks OK and rename it
# This is a fairly basic strategy to deal with the case that the file is corrupted,
# system was out of memory, out of RAM etc
if HAS_ORJSON:
# Use orjson for faster serialization
# orjson.dumps() always returns UTF-8 encoded bytes #3611
with open(self.json_store_path+".tmp", 'wb') as json_file:
json_file.write(orjson.dumps(data, option=orjson.OPT_INDENT_2))
else:
# Fallback to standard json module
with open(self.json_store_path+".tmp", 'w', encoding='utf-8') as json_file:
json.dump(data, json_file, indent=2, ensure_ascii=False)
with open(self.json_store_path+".tmp", 'w') as json_file:
# Use compact JSON in production for better performance
json.dump(data, json_file, indent=2)
os.replace(self.json_store_path+".tmp", self.json_store_path)
except Exception as e:
logger.error(f"Error writing JSON!! (Main JSON file save was skipped) : {str(e)}")
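The write-to-temp-then-rename strategy referenced in the hunk above (Re #286), combined with the optional `orjson` path, can be sketched as a standalone helper. This is a sketch under stated assumptions, not the project's implementation: `atomic_write_json` and `load_json` are hypothetical names, and `os.replace()` is only atomic when both paths are on the same filesystem (a POSIX guarantee).

```python
import json
import os
import tempfile

try:
    import orjson  # optional, faster serializer
    HAS_ORJSON = True
except ImportError:
    HAS_ORJSON = False


def atomic_write_json(data, dest_path):
    """Serialize to dest_path + '.tmp', then rename it over dest_path."""
    tmp_path = dest_path + ".tmp"
    if HAS_ORJSON:
        # orjson.dumps() returns UTF-8 encoded bytes, so write in binary mode.
        with open(tmp_path, "wb") as f:
            f.write(orjson.dumps(data, option=orjson.OPT_INDENT_2))
    else:
        # Standard library fallback; keep non-ASCII characters as UTF-8.
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2, ensure_ascii=False)
    # A crash before this line leaves the previous dest_path untouched.
    os.replace(tmp_path, dest_path)


def load_json(path):
    """Read the file as bytes and decode it with whichever parser is available."""
    with open(path, "rb") as f:
        raw = f.read()
    return orjson.loads(raw) if HAS_ORJSON else json.loads(raw.decode("utf-8"))
```

Reading and writing bytes on the `orjson` path avoids depending on the platform default text encoding, which is the concern the `#3611` comments in the diff point at.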
@@ -486,7 +428,7 @@ class ChangeDetectionStore:
logger.remove()
logger.add(sys.stderr)
logger.info(f"Shutting down datastore '{self.datastore_path}' thread")
logger.critical("Shutting down datastore thread")
return
if self.needs_write or self.needs_write_urgent:
@@ -525,13 +467,8 @@ class ChangeDetectionStore:
# Load from external config file
if path.isfile(proxy_list_file):
if HAS_ORJSON:
# orjson.loads() expects UTF-8 encoded bytes #3611
with open(os.path.join(self.datastore_path, "proxies.json"), 'rb') as f:
proxy_list = orjson.loads(f.read())
else:
with open(os.path.join(self.datastore_path, "proxies.json"), encoding='utf-8') as f:
proxy_list = json.load(f)
with open(os.path.join(self.datastore_path, "proxies.json")) as f:
proxy_list = json.load(f)
# Mapping from UI config if available
extras = self.data['settings']['requests'].get('extra_proxies')
@@ -776,28 +713,6 @@ class ChangeDetectionStore:
return updates_available
def add_notification_url(self, notification_url):
logger.debug(f">>> Adding new notification_url - '{notification_url}'")
notification_urls = self.data['settings']['application'].get('notification_urls', [])
if notification_url in notification_urls:
return notification_url
with self.lock:
notification_urls = self.__data['settings']['application'].get('notification_urls', [])
if notification_url in notification_urls:
return notification_url
# Append and update the datastore
notification_urls.append(notification_url)
self.__data['settings']['application']['notification_urls'] = notification_urls
self.needs_write = True
return notification_url
# Run all updates
# IMPORTANT - Each update could be run even when they have a new install and the schema is correct
# So therefor - each `update_n` should be very careful about checking if it needs to actually run
@@ -810,16 +725,7 @@ class ChangeDetectionStore:
logger.critical(f"Applying update_{update_n}")
# Wont exist on fresh installs
if os.path.exists(self.json_store_path):
i = 0
while True:
i+=1
dest = os.path.join(self.datastore_path, f"url-watches-before-{update_n}-{i}.json")
if not os.path.exists(dest):
logger.debug(f"Copying url-watches.json DB to '{dest}' backup.")
shutil.copyfile(self.json_store_path, dest)
break
else:
logger.warning(f"Backup of url-watches.json '{dest}', DB already exists, trying {i+1}.. ")
shutil.copyfile(self.json_store_path, os.path.join(self.datastore_path, f"url-watches-before-{update_n}.json"))
try:
update_method = getattr(self, f"update_{update_n}")()
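The hunk above contains two backup strategies: a loop that picks the first unused numbered filename, and a single copy to a fixed name that would overwrite an earlier backup. A minimal sketch of the numbered variant, with `backup_with_counter` and its parameters as hypothetical names chosen for illustration:

```python
import os
import shutil
import tempfile


def backup_with_counter(src_path, dest_dir, update_n):
    """Copy src_path to the first unused url-watches-before-{n}-{i}.json name."""
    i = 1
    while True:
        dest = os.path.join(dest_dir, f"url-watches-before-{update_n}-{i}.json")
        if not os.path.exists(dest):
            shutil.copyfile(src_path, dest)
            return dest
        # Name already taken by an earlier run; try the next counter value.
        i += 1
```

With this approach, re-running the same update keeps every earlier backup instead of replacing it.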
@@ -1070,55 +976,26 @@ class ChangeDetectionStore:
if self.data['settings']['application'].get('extract_title_as_title'):
self.data['settings']['application']['ui']['use_page_title_in_list'] = self.data['settings']['application'].get('extract_title_as_title')
def update_21(self):
if self.data['settings']['application'].get('timezone'):
self.data['settings']['application']['scheduler_timezone_default'] = self.data['settings']['application'].get('timezone')
del self.data['settings']['application']['timezone']
def add_notification_url(self, notification_url):
logger.debug(f">>> Adding new notification_url - '{notification_url}'")
# Some notification formats got the wrong name type
def update_23(self):
notification_urls = self.data['settings']['application'].get('notification_urls', [])
def re_run(formats):
sys_n_format = self.data['settings']['application'].get('notification_format')
key_exists_as_value = next((k for k, v in formats.items() if v == sys_n_format), None)
if key_exists_as_value: # key of "Plain text"
logger.success(f"['settings']['application']['notification_format'] '{sys_n_format}' -> '{key_exists_as_value}'")
self.data['settings']['application']['notification_format'] = key_exists_as_value
if notification_url in notification_urls:
return notification_url
for uuid, watch in self.data['watching'].items():
n_format = self.data['watching'][uuid].get('notification_format')
key_exists_as_value = next((k for k, v in formats.items() if v == n_format), None)
if key_exists_as_value and key_exists_as_value != USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH: # key of "Plain text"
logger.success(f"['watching'][{uuid}]['notification_format'] '{n_format}' -> '{key_exists_as_value}'")
self.data['watching'][uuid]['notification_format'] = key_exists_as_value # should be 'text' or whatever
with self.lock:
notification_urls = self.__data['settings']['application'].get('notification_urls', [])
for uuid, tag in self.data['settings']['application']['tags'].items():
n_format = self.data['settings']['application']['tags'][uuid].get('notification_format')
key_exists_as_value = next((k for k, v in formats.items() if v == n_format), None)
if key_exists_as_value and key_exists_as_value != USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH: # key of "Plain text"
logger.success(
f"['settings']['application']['tags'][{uuid}]['notification_format'] '{n_format}' -> '{key_exists_as_value}'")
self.data['settings']['application']['tags'][uuid][
'notification_format'] = key_exists_as_value # should be 'text' or whatever
if notification_url in notification_urls:
return notification_url
from .notification import valid_notification_formats
formats = deepcopy(valid_notification_formats)
re_run(formats)
# And in previous versions, it was "text" instead of Plain text, Markdown instead of "Markdown to HTML"
formats['text'] = 'Text'
formats['markdown'] = 'Markdown'
re_run(formats)
# Append and update the datastore
notification_urls.append(notification_url)
self.__data['settings']['application']['notification_urls'] = notification_urls
self.needs_write = True
return notification_url
# RSS types should be inline with the same names as notification types
def update_24(self):
rss_format = self.data['settings']['application'].get('rss_content_format')
if not rss_format or 'text' in rss_format:
# might have been 'plaintext, 'plain text' or something
self.data['settings']['application']['rss_content_format'] = RSS_CONTENT_FORMAT_DEFAULT
elif 'html' in rss_format:
self.data['settings']['application']['rss_content_format'] = 'htmlcolor'
else:
# safe fallback to text
self.data['settings']['application']['rss_content_format'] = RSS_CONTENT_FORMAT_DEFAULT
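The `update_23` migration in the hunk above relies on a reverse lookup: given a stored value that is actually a display label, find the key it belongs to. A minimal sketch of that idiom follows. The `formats` table is a hypothetical example; the real `valid_notification_formats` may contain different keys and labels.

```python
def key_for_label(formats, stored_value):
    """Return the key whose label equals stored_value, or None if none matches."""
    return next((k for k, v in formats.items() if v == stored_value), None)


def normalise(formats, stored_value):
    """Replace a stored label with its key; leave other values unchanged."""
    key = key_for_label(formats, stored_value)
    return key if key else stored_value


# Hypothetical key -> label table, for illustration only.
formats = {"text": "Plain Text", "html": "HTML", "markdown": "Markdown to HTML"}
```

Because values that are already keys do not match any label, running the normalisation twice gives the same result as running it once.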


@@ -1,118 +1,6 @@
{% from '_helpers.html' import render_field %}
{% macro show_token_placeholders(extra_notification_token_placeholder_info, suffix="") %}
<div class="pure-controls">
<span class="pure-form-message-inline">
Body for all notifications &dash; You can use <a target="newwindow" href="https://jinja.palletsprojects.com/en/3.0.x/templates/">Jinja2</a> templating in the notification title, body and URL, and tokens from below.
</span><br>
<div data-target="#notification-tokens-info{{ suffix }}" class="toggle-show pure-button button-tag button-xsmall">Show
token/placeholders
</div>
</div>
<div class="pure-controls" style="display: none;" id="notification-tokens-info{{ suffix }}">
<table class="pure-table" id="token-table">
<thead>
<tr>
<th>Token</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>{{ '{{base_url}}' }}</code></td>
<td>The URL of the changedetection.io instance you are running.</td>
</tr>
<tr>
<td><code>{{ '{{watch_url}}' }}</code></td>
<td>The URL being watched.</td>
</tr>
<tr>
<td><code>{{ '{{watch_uuid}}' }}</code></td>
<td>The UUID of the watch.</td>
</tr>
<tr>
<td><code>{{ '{{watch_title}}' }}</code></td>
<td>The page title of the watch, uses &lt;title&gt; if not set, falls back to URL</td>
</tr>
<tr>
<td><code>{{ '{{watch_tag}}' }}</code></td>
<td>The watch group / tag</td>
</tr>
<tr>
<td><code>{{ '{{preview_url}}' }}</code></td>
<td>The URL of the preview page generated by changedetection.io.</td>
</tr>
<tr>
<td><code>{{ '{{diff_url}}' }}</code></td>
<td>The URL of the diff output for the watch.</td>
</tr>
<tr>
<td><code>{{ '{{diff}}' }}</code></td>
<td>The diff output - only changes, additions, and removals</td>
</tr>
<tr>
<td><code>{{ '{{diff_clean}}' }}</code></td>
<td>The diff output - only changes, additions, and removals &dash; <i>Without (added) prefix or colors</i>
</td>
</tr>
<tr>
<td><code>{{ '{{diff_added}}' }}</code></td>
<td>The diff output - only changes and additions</td>
</tr>
<tr>
<td><code>{{ '{{diff_added_clean}}' }}</code></td>
<td>The diff output - only changes and additions &dash; <i>Without (added) prefix or colors</i></td>
</tr>
<tr>
<td><code>{{ '{{diff_removed}}' }}</code></td>
<td>The diff output - only changes and removals</td>
</tr>
<tr>
<td><code>{{ '{{diff_removed_clean}}' }}</code></td>
<td>The diff output - only changes and removals &dash; <i>Without (added) prefix or colors</i></td>
</tr>
<tr>
<td><code>{{ '{{diff_full}}' }}</code></td>
<td>The diff output - full difference output</td>
</tr>
<tr>
<td><code>{{ '{{diff_full_clean}}' }}</code></td>
<td>The diff output - full difference output &dash; <i>Without (added) prefix or colors</i></td>
</tr>
<tr>
<td><code>{{ '{{diff_patch}}' }}</code></td>
<td>The diff output - patch in unified format</td>
</tr>
<tr>
<td><code>{{ '{{current_snapshot}}' }}</code></td>
<td>The current snapshot text contents value, useful when combined with JSON or CSS filters
</td>
</tr>
<tr>
<td><code>{{ '{{triggered_text}}' }}</code></td>
<td>Text that tripped the trigger from filters</td>
{% if extra_notification_token_placeholder_info %}
{% for token in extra_notification_token_placeholder_info %}
<tr>
<td><code>{{ '{{' }}{{ token[0] }}{{ '}}' }}</code></td>
<td>{{ token[1] }}</td>
</tr>
{% endfor %}
{% endif %}
</tbody>
</table>
<span class="pure-form-message-inline">
Warning: Contents of <code>{{ '{{diff}}' }}</code>, <code>{{ '{{diff_removed}}' }}</code>, and <code>{{ '{{diff_added}}' }}</code> depend on how the difference algorithm perceives the change. <br>
For example, an addition or removal could be perceived as a change in some cases. <a target="newwindow" href="https://github.com/dgtlmoon/changedetection.io/wiki/Using-the-%7B%7Bdiff%7D%7D,-%7B%7Bdiff_added%7D%7D,-and-%7B%7Bdiff_removed%7D%7D-notification-tokens">More Here</a> <br>
</span>
</div>
{% endmacro %}
{% macro render_common_settings_form(form, emailprefix, settings_application, extra_notification_token_placeholder_info) %}
<div class="pure-control-group">
{{ render_field(form.notification_urls, rows=5, placeholder="Examples:
@@ -136,43 +24,153 @@
</ul>
</div>
<div class="notifications-wrapper">
<a id="send-test-notification" class="pure-button button-secondary button-xsmall" >Send test notification</a> <div class="spinner" style="display: none;"></div>
<a id="send-test-notification" class="pure-button button-secondary" >Send test notification</a> <div class="spinner" style="display: none;"></div>
{% if emailprefix %}
<a id="add-email-helper" class="pure-button button-secondary button-xsmall" >Add email <img style="height: 1em; display: inline-block" src="{{url_for('static_content', group='images', filename='email.svg')}}" alt="Add an email address"> </a>
<a id="add-email-helper" class="pure-button button-secondary" >Add email <img style="height: 1em; display: inline-block" src="{{url_for('static_content', group='images', filename='email.svg')}}" alt="Add an email address"> </a>
{% endif %}
<a href="{{url_for('settings.notification_logs')}}" class="pure-button button-secondary button-xsmall" >Notification debug logs</a>
<a href="{{url_for('settings.notification_logs')}}" class="pure-button button-secondary " >Notification debug logs</a>
<br>
<div id="notification-test-log" style="display: none;"><span class="pure-form-message-inline">Processing..</span></div>
</div>
</div>
<div class="pure-control-group grey-form-border">
<div class="pure-control-group">
{{ render_field(form.notification_title, class="m-d notification-title", placeholder=settings_application['notification_title']) }}
<span class="pure-form-message-inline">Title for all notifications</span>
</div>
<div class="pure-control-group">
{{ render_field(form.notification_body , rows=5, class="notification-body", placeholder=settings_application['notification_body']) }}
{{ show_token_placeholders(extra_notification_token_placeholder_info=extra_notification_token_placeholder_info) }}
<div class="pure-form-message-inline">
<ul>
<li><span class="pure-form-message-inline">
For JSON payloads, use <strong>|tojson</strong> without quotes for automatic escaping, for example - <code>{ "name": {{ '{{ watch_title|tojson }}' }} }</code>
</span></li>
<li><span class="pure-form-message-inline">
URL encoding, use <strong>|urlencode</strong>, for example - <code>gets://hook-website.com/test.php?title={{ '{{ watch_title|urlencode }}' }}</code>
</span></li>
<li><span class="pure-form-message-inline">
Regular-expression replace, use <strong>|regex_replace</strong>, for example - <code>{{ "{{ \"hello world 123\" | regex_replace('[0-9]+', 'no-more-numbers') }}" }}</code>
</span></li>
<li><span class="pure-form-message-inline">
For a complete reference of all Jinja2 built-in filters, users can refer to the <a href="https://jinja.palletsprojects.com/en/3.1.x/templates/#builtin-filters">https://jinja.palletsprojects.com/en/3.1.x/templates/#builtin-filters</a>
</span></li>
</ul>
<br>
</div>
<div class="">
{{ render_field(form.notification_format , class="notification-format") }}
<span class="pure-form-message-inline">Format for all notifications</span>
<div class="pure-control-group">
<p>Customise the contents of the notification using the form below, this is not necessary but you can create quite interesting integrations :-)</p>
<div class="minitabs-wrapper" id="notifications-minitabs">
<div class="minitabs-content">
<div id="notification-setup">
<div class="pure-control-group">
{{ render_field(form.notification_title, class="m-d notification-title", placeholder=settings_application['notification_title']) }}
</div>
<div class="pure-control-group">
{{ render_field(form.notification_body , rows=5, class="notification-body", placeholder=settings_application['notification_body']) }}
<span class="pure-form-message-inline">Body for the notification &dash; You can use <a
target="newwindow"
href="https://jinja.palletsprojects.com/en/3.0.x/templates/">Jinja2</a> templating in the notification title, body and URL, and tokens from below.
</span>
</div>
<div class="pure-controls">
<div data-target="#notification-tokens-info"
class="toggle-show pure-button button-tag button-xsmall">Show
token/placeholders
</div>
</div>
<div class="pure-controls" style="display: none;" id="notification-tokens-info">
<table class="pure-table" id="token-table">
<thead>
<tr>
<th>Token</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>{{ '{{base_url}}' }}</code></td>
<td>The URL of the changedetection.io instance you are running.</td>
</tr>
<tr>
<td><code>{{ '{{watch_url}}' }}</code></td>
<td>The URL being watched.</td>
</tr>
<tr>
<td><code>{{ '{{watch_uuid}}' }}</code></td>
<td>The UUID of the watch.</td>
</tr>
<tr>
<td><code>{{ '{{watch_title}}' }}</code></td>
<td>The page title of the watch, uses &lt;title&gt; if not set, falls back to URL</td>
</tr>
<tr>
<td><code>{{ '{{watch_tag}}' }}</code></td>
<td>The watch group / tag</td>
</tr>
<tr>
<td><code>{{ '{{preview_url}}' }}</code></td>
<td>The URL of the preview page generated by changedetection.io.
</td>
</tr>
<tr>
<td><code>{{ '{{diff_url}}' }}</code></td>
<td>The URL of the diff output for the watch.</td>
</tr>
<tr>
<td><code>{{ '{{diff}}' }}</code></td>
<td>The diff output - only changes, additions, and removals</td>
</tr>
<tr>
<td><code>{{ '{{diff_added}}' }}</code></td>
<td>The diff output - only changes and additions</td>
</tr>
<tr>
<td><code>{{ '{{diff_removed}}' }}</code></td>
<td>The diff output - only changes and removals</td>
</tr>
<tr>
<td><code>{{ '{{diff_full}}' }}</code></td>
<td>The diff output - full difference output</td>
</tr>
<tr>
<td><code>{{ '{{diff_patch}}' }}</code></td>
<td>The diff output - patch in unified format</td>
</tr>
<tr>
<td><code>{{ '{{current_snapshot}}' }}</code></td>
<td>The current snapshot text contents value, useful when combined
with JSON or CSS filters
</td>
</tr>
<tr>
<td><code>{{ '{{triggered_text}}' }}</code></td>
<td>Text that tripped the trigger from filters</td>
</tr>
{% if extra_notification_token_placeholder_info %}
{% for token in extra_notification_token_placeholder_info %}
<tr>
<td><code>{{ '{{' }}{{ token[0] }}{{ '}}' }}</code></td>
<td>{{ token[1] }}</td>
</tr>
{% endfor %}
{% endif %}
</tbody>
</table>
<div class="pure-form-message-inline">
<p>
Warning: Contents of <code>{{ '{{diff}}' }}</code>,
<code>{{ '{{diff_removed}}' }}</code>, and
<code>{{ '{{diff_added}}' }}</code> depend on how the difference
algorithm perceives the change. <br>
For example, an addition or removal could be perceived as a change
in some cases. <a target="newwindow"
href="https://github.com/dgtlmoon/changedetection.io/wiki/Using-the-%7B%7Bdiff%7D%7D,-%7B%7Bdiff_added%7D%7D,-and-%7B%7Bdiff_removed%7D%7D-notification-tokens">More
Here</a> <br>
</p>
<p>
For JSON payloads, use <strong>|tojson</strong> without quotes for
automatic escaping, for example - <code>{
"name": {{ '{{ watch_title|tojson }}' }} }</code>
</p>
<p>
URL encoding, use <strong>|urlencode</strong>, for example - <code>gets://hook-website.com/test.php?title={{ '{{ watch_title|urlencode }}' }}</code>
</p>
</div>
</div>
<div class="pure-control-group">
{{ render_field(form.notification_format , class="notification-format") }}
<span class="pure-form-message-inline">Format for all notifications</span>
</div>
</div>
<div id="notification-preview" style="display: none; height:100%; display:flex; flex-direction:column;">
<p><strong>Title: </strong><span id="notification-preview-title-text">Preview loading..</span></p>
<div style="flex:1; display:flex; flex-direction:column;">
<strong>Body: </strong>
<div id="notification-div-text-preview" style="flex:1; height:95%; width:100%; border-radius:4px; margin-top:0.5rem; border:none;"></div>
<iframe id="notification-iframe-html-preview"
style="flex:1; height:95%; width:100%; border-radius:4px; margin-top:0.5rem; border:none;">
Preview loading...
</iframe>
</div>
</div>
</div>
</div>
</div>
{% endmacro %}
{% endmacro %}


@@ -14,39 +14,28 @@
{% if field.errors is mapping and 'form' in field.errors %}
{# and subfield form errors, such as used in RequiredFormField() for TimeBetweenCheckForm sub form #}
{% set errors = field.errors['form'] %}
{% for error in errors %}
<li>{{ error }}</li>
{% endfor %}
{% elif field.type == 'FieldList' %}
{# Handle FieldList of FormFields - errors is a list of dicts, one per entry #}
{% for idx, entry_errors in field.errors|enumerate %}
{% if entry_errors is mapping and entry_errors %}
{# Only show entries that have actual errors #}
<li><strong>Entry {{ idx + 1 }}:</strong>
<ul>
{% for field_name, messages in entry_errors.items() %}
{% for message in messages %}
<li>{{ field_name }}: {{ message }}</li>
{% endfor %}
{% endfor %}
</ul>
</li>
{% endif %}
{% endfor %}
{% else %}
{# regular list of errors with this field #}
{% for error in field.errors %}
<li>{{ error }}</li>
{% endfor %}
{% set errors = field.errors %}
{% endif %}
{% for error in errors %}
<li>{{ error }}</li>
{% endfor %}
</ul>
{% endif %}
</div>
{% endmacro %}
{% macro render_checkbox_field(field) %}
<div class="checkbox {% if field.errors %} error {% endif %}">
<div class="checkbox {% if field.errors or field.top_errors %} error {% endif %}">
{{ field(**kwargs)|safe }} {{ field.label }}
{% if field.top_errors %}
<ul class="errors top-errors">
{% for error in field.top_errors %}
<li>{{ error }}</li>
{% endfor %}
</ul>
{% endif %}
{% if field.errors %}
<ul class=errors>
{% for error in field.errors %}
@@ -61,9 +50,16 @@
{% if BooleanField %}
{% set _ = field.__setattr__('boolean_mode', true) %}
{% endif %}
<div class="ternary-field {% if field.errors %} error {% endif %}">
<div class="ternary-field {% if field.errors or field.top_errors %} error {% endif %}">
<div class="ternary-field-label">{{ field.label }}</div>
<div class="ternary-field-widget">{{ field(**kwargs)|safe }}</div>
{% if field.top_errors %}
<ul class="errors top-errors">
{% for error in field.top_errors %}
<li>{{ error }}</li>
{% endfor %}
</ul>
{% endif %}
{% if field.errors %}
<ul class=errors>
{% for error in field.errors %}
@@ -76,8 +72,15 @@
{% macro render_simple_field(field) %}
<span class="label {% if field.errors %}error{% endif %}">{{ field.label }}</span>
<span {% if field.errors %} class="error" {% endif %}>{{ field(**kwargs)|safe }}
<span class="label {% if field.errors or field.top_errors %}error{% endif %}">{{ field.label }}</span>
<span {% if field.errors or field.top_errors %} class="error" {% endif %}>{{ field(**kwargs)|safe }}
{% if field.top_errors %}
<ul class="errors top-errors">
{% for error in field.top_errors %}
<li>{{ error }}</li>
{% endfor %}
</ul>
{% endif %}
{% if field.errors %}
<ul class=errors>
{% for error in field.errors %}
@@ -92,8 +95,15 @@
{% macro render_nolabel_field(field) %}
<span>
{{ field(**kwargs)|safe }}
{% if field.errors %}
{% if field.top_errors or field.errors %}
<span class="error">
{% if field.top_errors %}
<ul class="errors top-errors">
{% for error in field.top_errors %}
<li>{{ error }}</li>
{% endfor %}
</ul>
{% endif %}
{% if field.errors %}
<ul class=errors>
{% for error in field.errors %}
@@ -111,39 +121,6 @@
{{ field(**kwargs)|safe }}
{% endmacro %}
{% macro render_fieldlist_with_inline_errors(fieldlist) %}
{# Specialized macro for FieldList(FormField(...)) that renders errors inline with each field #}
<div {% if fieldlist.errors %} class="error" {% endif %}>{{ fieldlist.label }}</div>
<div {% if fieldlist.errors %} class="error" {% endif %}>
<ul id="{{ fieldlist.id }}">
{% for entry in fieldlist %}
<li {% if entry.errors %} class="error" {% endif %}>
<label for="{{ entry.id }}" {% if entry.errors %} class="error" {% endif %}>{{ fieldlist.label.text }}-{{ loop.index0 }}</label>
<table id="{{ entry.id }}" {% if entry.errors %} class="error" {% endif %}>
<tbody>
{% for subfield in entry %}
<tr {% if subfield.errors %} class="error" {% endif %}>
<th {% if subfield.errors %} class="error" {% endif %}><label for="{{ subfield.id }}" {% if subfield.errors %} class="error" {% endif %}>{{ subfield.label.text }}</label></th>
<td {% if subfield.errors %} class="error" {% endif %}>
{{ subfield(**kwargs)|safe }}
{% if subfield.errors %}
<ul class="errors">
{% for error in subfield.errors %}
<li class="error">{{ error }}</li>
{% endfor %}
</ul>
{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</li>
{% endfor %}
</ul>
</div>
{% endmacro %}
{% macro render_conditions_fieldlist_of_formfields_as_table(fieldlist, table_id="rulesTable") %}
<div class="fieldlist_formfields" id="{{ table_id }}">
<div class="fieldlist-header">
@@ -266,7 +243,9 @@
<li id="timezone-info">
{{ render_field(form.time_schedule_limit.timezone, placeholder=timezone_default_config) }} <span id="local-time-in-tz"></span>
<datalist id="timezones" style="display: none;">
{%- for timezone in available_timezones -%}<option value="{{ timezone }}">{{ timezone }}</option>{%- endfor -%}
{% for timezone in available_timezones %}
<option value="{{ timezone }}">{{ timezone }}</option>
{% endfor %}
</datalist>
</li>
</ul>


@@ -8,13 +8,8 @@
<meta name="robots" content="noindex">
<title>Change Detection{{extra_title}}</title>
{% if app_rss_token %}
<link rel="alternate" type="application/rss+xml" title="Changedetection.io » Feed{% if active_tag_uuid %}- {{active_tag.title}}{% endif %}" href="{{ url_for('rss.feed', tag=active_tag_uuid, token=app_rss_token, _external=True )}}" >
{% if rss_uuid_feed %}
<link rel="alternate" type="application/rss+xml" title="Feed » {{ rss_uuid_feed['label'] }}" href="{{ rss_uuid_feed['url'] }}" >
{%- endif -%}
{%- endif -%}
<link rel="alternate" type="application/rss+xml" title="Changedetection.io » Feed{% if active_tag_uuid %}- {{active_tag.title}}{% endif %}" href="{{ url_for('rss.feed', tag=active_tag_uuid , token=app_rss_token)}}" >
{% endif %}
<link rel="stylesheet" href="{{url_for('static_content', group='styles', filename='pure-min.css')}}" >
<link rel="stylesheet" href="{{url_for('static_content', group='styles', filename='styles.css')}}?v={{ get_css_version() }}" >
{% if extra_stylesheets %}
@@ -58,7 +53,7 @@
<a class="pure-menu-heading" href="{{url_for('watchlist.index')}}">
<strong>Change</strong>Detection.io</a>
{% endif %}
{% if current_diff_url and is_safe_valid_url(current_diff_url) %}
{% if current_diff_url %}
<a class="current-diff-url" href="{{ current_diff_url }}">
<span style="max-width: 30%; overflow: hidden">{{ current_diff_url }}</span></a>
{% else %}


@@ -4,14 +4,12 @@ import time
from threading import Thread
import pytest
import arrow
from changedetectionio import changedetection_app
from changedetectionio import store
import os
import sys
from loguru import logger
from changedetectionio.flask_app import init_app_secret
from changedetectionio.tests.util import live_server_setup, new_live_server_setup
# https://github.com/pallets/flask/blob/1.1.2/examples/tutorial/tests/test_auth.py
@@ -31,39 +29,16 @@ def reportlog(pytestconfig):
logger.remove(handler_id)
@pytest.fixture
def environment(mocker):
"""Mock arrow.now() to return a fixed datetime for testing jinja2 time extension."""
# Fixed datetime: Wed, 09 Dec 2015 23:33:01 UTC
# This is calculated to match the test expectations when offsets are applied
fixed_datetime = arrow.Arrow(2015, 12, 9, 23, 33, 1, tzinfo='UTC')
# Patch arrow.now in the TimeExtension module where it's actually used
mocker.patch('changedetectionio.jinja2_custom.extensions.TimeExtension.arrow.now', return_value=fixed_datetime)
return fixed_datetime
def format_memory_human(bytes_value):
"""Format memory in human-readable units (KB, MB, GB)"""
if bytes_value < 1024:
return f"{bytes_value} B"
elif bytes_value < 1024 ** 2:
return f"{bytes_value / 1024:.2f} KB"
elif bytes_value < 1024 ** 3:
return f"{bytes_value / (1024 ** 2):.2f} MB"
else:
return f"{bytes_value / (1024 ** 3):.2f} GB"
def track_memory(memory_usage, ):
process = psutil.Process(os.getpid())
while not memory_usage["stop"]:
current_rss = process.memory_info().rss
memory_usage["peak"] = max(memory_usage["peak"], current_rss)
memory_usage["current"] = current_rss # Keep updating current
time.sleep(0.01) # Adjust the sleep time as needed
@pytest.fixture(scope='function')
def measure_memory_usage(request):
memory_usage = {"peak": 0, "current": 0, "stop": False}
memory_usage = {"peak": 0, "stop": False}
tracker_thread = Thread(target=track_memory, args=(memory_usage,))
tracker_thread.start()
@@ -72,22 +47,22 @@ def measure_memory_usage(request):
memory_usage["stop"] = True
tracker_thread.join()
# Note: psutil returns RSS memory in bytes
peak_human = format_memory_human(memory_usage["peak"])
s = f"{time.time()} {request.node.fspath} - '{request.node.name}' - Peak memory: {peak_human}"
# Note: ru_maxrss is in kilobytes on Unix-based systems
max_memory_used = memory_usage["peak"] / 1024 # Convert to MB
s = f"Peak memory used by the test {request.node.fspath} - '{request.node.name}': {max_memory_used:.2f} MB"
logger.debug(s)
with open("test-memory.log", 'a') as f:
f.write(f"{s}\n")
# Assert that the memory usage is less than 200MB
# assert peak_memory_kb < 150 * 1024, f"Memory usage exceeded 150MB: {peak_human}"
# assert max_memory_used < 150, f"Memory usage exceeded 200MB: {max_memory_used:.2f} MB"
def cleanup(datastore_path):
import glob
# Unlink test output files
for g in ["*.txt", "*.json", "*.pdf"]:
files = glob.glob(os.path.join(datastore_path, g))
for f in files:
@@ -97,121 +72,34 @@ def cleanup(datastore_path):
if os.path.isfile(f):
os.unlink(f)
def pytest_addoption(parser):
"""Add custom command-line options for pytest.
Provides --datastore-path option for specifying custom datastore location.
Note: Cannot use -d short option as it's reserved by pytest for debug mode.
"""
parser.addoption(
"--datastore-path",
action="store",
default=None,
help="Custom datastore path for tests"
)
@pytest.fixture(scope='session')
def datastore_path(tmp_path_factory, request):
"""Provide datastore path unique to this worker.
Supports custom path via --datastore-path/-d flag (mirrors main app).
CRITICAL for xdist isolation:
- Each WORKER gets its own directory
- Tests on same worker run SEQUENTIALLY and cleanup between tests
- No subdirectories needed since tests don't overlap on same worker
- Example: /tmp/test-datastore-gw0/ for worker gw0
"""
# Check for custom path first (mirrors main app's -d flag)
custom_path = request.config.getoption("--datastore-path")
if custom_path:
# Ensure the directory exists
os.makedirs(custom_path, exist_ok=True)
logger.info(f"Using custom datastore path: {custom_path}")
return custom_path
# Otherwise use default tmp_path_factory logic
worker_id = getattr(request.config, 'workerinput', {}).get('workerid', 'master')
if worker_id == 'master':
path = tmp_path_factory.mktemp("test-datastore")
else:
path = tmp_path_factory.mktemp(f"test-datastore-{worker_id}")
return str(path)
@pytest.fixture(scope='function', autouse=True)
def prepare_test_function(live_server, datastore_path):
"""Prepare each test with complete isolation.
def prepare_test_function(live_server):
CRITICAL for xdist per-test isolation:
- Reuses the SAME datastore instance (so blueprint references stay valid)
- Clears all watches and state for a clean slate
- First watch will get uuid="first"
"""
routes = [rule.rule for rule in live_server.app.url_map.iter_rules()]
if '/test-random-content-endpoint' not in routes:
logger.debug("Setting up test URL routes")
new_live_server_setup(live_server)
# CRITICAL: Point app to THIS test's unique datastore directory
live_server.app.config['TEST_DATASTORE_PATH'] = datastore_path
# CRITICAL: Get datastore and stop it from writing stale data
datastore = live_server.app.config.get('DATASTORE')
# Prevent background thread from writing during cleanup/reload
datastore.needs_write = False
datastore.needs_write_urgent = False
# CRITICAL: Clean up any files from previous tests
# This ensures a completely clean directory
cleanup(datastore_path)
# CRITICAL: Reload the EXISTING datastore instead of creating a new one
# This keeps blueprint references valid (they capture datastore at construction)
# reload_state() completely resets the datastore to a clean state
# Reload state with clean data (no default watches)
datastore.reload_state(
datastore_path=datastore_path,
include_default_watches=False,
version_tag=datastore.data.get('version_tag', '0.0.0')
)
live_server.app.secret_key = init_app_secret(datastore_path)
logger.debug(f"prepare_test_function: Reloaded datastore at {hex(id(datastore))}")
logger.debug(f"prepare_test_function: Path {datastore.datastore_path}")
yield
# Cleanup: Clear watches again after test
try:
datastore.data['watching'] = {}
datastore.needs_write = True
except Exception as e:
logger.warning(f"Error during datastore cleanup: {e}")
# So the app can also know which test name it was
@pytest.fixture(autouse=True)
def set_test_name(request):
"""Automatically set TEST_NAME env var for every test"""
test_name = request.node.name
os.environ['PYTEST_CURRENT_TEST'] = test_name
yield
# Cleanup if needed
# Then cleanup/shutdown
live_server.app.config['DATASTORE'].data['watching']={}
time.sleep(0.3)
live_server.app.config['DATASTORE'].data['watching']={}
@pytest.fixture(scope='session')
def app(request, datastore_path):
"""Create application once per worker (session).
def app(request):
"""Create application for the tests."""
datastore_path = "./test-datastore"
Note: Actual per-test isolation is handled by:
- prepare_test_function() recreates datastore and cleans directory
- All tests on same worker use same directory (cleaned between tests)
"""
# So they don't delay in fetching
os.environ["MINIMUM_SECONDS_RECHECK_TIME"] = "0"
logger.debug(f"Testing with datastore_path={datastore_path}")
try:
os.mkdir(datastore_path)
except FileExistsError:
pass
cleanup(datastore_path)
app_config = {'datastore_path': datastore_path, 'disable_checkver' : True}
@@ -234,8 +122,6 @@ def app(request, datastore_path):
# Disable CSRF while running tests
app.config['WTF_CSRF_ENABLED'] = False
app.config['STOP_THREADS'] = True
# Store datastore_path so Flask routes can access it
app.config['TEST_DATASTORE_PATH'] = datastore_path
def teardown():
# Stop all threads and services
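
The two variants of the peak-memory log line in the `measure_memory_usage` hunk above use different units. `track_memory` stores `process.memory_info().rss`, which the diff's own comment notes is in bytes, so the variant that divides by 1024 produces kilobytes even though its comment and log label say MB. A minimal sketch of the conversion, assuming the peak value is a byte count (`bytes_to_mb` is a hypothetical helper name, not something in the repository):

```python
def bytes_to_mb(rss_bytes: int) -> float:
    """Convert a byte count (such as an RSS reading in bytes) to mebibytes."""
    return rss_bytes / (1024 ** 2)


# Hypothetical peak RSS of 150 MiB, expressed in bytes.
peak = 150 * 1024 ** 2

print(f"{bytes_to_mb(peak):.2f} MB")   # divides by 1024 twice: megabytes
print(f"{peak / 1024:.2f} KB")         # a single division by 1024 yields kilobytes
```

With a byte input, a threshold check such as the commented-out `< 150` assertion only makes sense against the value returned by the two-step conversion.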


@@ -29,8 +29,13 @@ def do_test(client, live_server, make_test_use_extra_browser=False):
assert b"Settings updated." in res.data
# Add our URL to the import page
uuid = client.application.config.get('DATASTORE').add_watch(url=test_url)
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
res = client.post(
url_for("imports.import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
wait_for_all_checks(client)
if make_test_use_extra_browser:
@@ -73,13 +78,13 @@ def do_test(client, live_server, make_test_use_extra_browser=False):
# Requires playwright to be installed
def test_request_via_custom_browser_url(client, live_server, measure_memory_usage, datastore_path):
def test_request_via_custom_browser_url(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
# We do this so we can grep the logs of the custom container and see if the request actually went through that container
do_test(client, live_server, make_test_use_extra_browser=True)
def test_request_not_via_custom_browser_url(client, live_server, measure_memory_usage, datastore_path):
def test_request_not_via_custom_browser_url(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
# We do this so we can grep the logs of the custom container and see if the request actually went through that container
do_test(client, live_server, make_test_use_extra_browser=False)


@@ -8,7 +8,7 @@ import logging
# Requires playwright to be installed
def test_fetch_webdriver_content(client, live_server, measure_memory_usage, datastore_path):
def test_fetch_webdriver_content(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
#####################


@@ -3,7 +3,7 @@ from flask import url_for
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
def test_execute_custom_js(client, live_server, measure_memory_usage, datastore_path):
def test_execute_custom_js(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
assert os.getenv('PLAYWRIGHT_DRIVER_URL'), "Needs PLAYWRIGHT_DRIVER_URL set for this test"


@@ -5,7 +5,7 @@ from flask import url_for
from ..util import live_server_setup, wait_for_all_checks
def test_preferred_proxy(client, live_server, measure_memory_usage, datastore_path):
def test_preferred_proxy(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
url = "http://chosen.changedetection.io"


@@ -5,7 +5,7 @@ from flask import url_for
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
def test_noproxy_option(client, live_server, measure_memory_usage, datastore_path):
def test_noproxy_option(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
# Run by run_proxy_tests.sh
# Call this URL then scan the containers that it never went through them


@@ -5,7 +5,7 @@ from flask import url_for
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
# just make a request, we will grep in the docker logs to see it actually got called
def test_check_basic_change_detection_functionality(client, live_server, measure_memory_usage, datastore_path):
def test_check_basic_change_detection_functionality(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
res = client.post(
url_for("imports.import_page"),


@@ -12,7 +12,7 @@ from ... import strtobool
# FAST_PUPPETEER_CHROME_FETCHER=True PLAYWRIGHT_DRIVER_URL=ws://127.0.0.1:3000 pytest tests/proxy_list/test_proxy_noconnect.py
# WEBDRIVER_URL=http://127.0.0.1:4444/wd/hub pytest tests/proxy_list/test_proxy_noconnect.py
def test_proxy_noconnect_custom(client, live_server, measure_memory_usage, datastore_path):
def test_proxy_noconnect_custom(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
# Goto settings, add our custom one


@@ -6,7 +6,7 @@ from ..util import live_server_setup, wait_for_all_checks
import os
# just make a request, we will grep in the docker logs to see it actually got called
def test_select_custom(client, live_server, measure_memory_usage, datastore_path):
def test_select_custom(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
# Goto settings, add our custom one
@@ -49,39 +49,3 @@ def test_select_custom(client, live_server, measure_memory_usage, datastore_path
#
# Now we should see the request in the container logs for "squid-squid-custom" because it will be the only default
def test_custom_proxy_validation(client, live_server, measure_memory_usage, datastore_path):
# live_server_setup(live_server) # Setup on conftest per function
# Goto settings, add our custom one
res = client.post(
url_for("settings.settings_page"),
data={
"requests-time_between_check-minutes": 180,
"application-ignore_whitespace": "y",
"application-fetch_backend": 'html_requests',
"requests-extra_proxies-0-proxy_name": "custom-test-proxy",
"requests-extra_proxies-0-proxy_url": "xxxxhtt/333??p://test:awesome@squid-custom:3128",
},
follow_redirects=True
)
assert b"Settings updated." not in res.data
assert b'Proxy URLs must start with' in res.data
res = client.post(
url_for("settings.settings_page"),
data={
"requests-time_between_check-minutes": 180,
"application-ignore_whitespace": "y",
"application-fetch_backend": 'html_requests',
"requests-extra_proxies-0-proxy_name": "custom-test-proxy",
"requests-extra_proxies-0-proxy_url": "https://",
},
follow_redirects=True
)
assert b"Settings updated." not in res.data
assert b"Invalid URL." in res.data
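
`test_custom_proxy_validation` above expects two distinct rejections: a malformed scheme yields a message beginning "Proxy URLs must start with", and a URL with a valid scheme but no host yields "Invalid URL.". The form validator itself is not part of this diff, so the following is only a sketch of a check that would produce those two outcomes. The function name, the set of allowed schemes, and the rest of the first message's wording are assumptions.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the real validator's list of accepted schemes is not shown in the diff.
ALLOWED_SCHEMES = ("http", "https", "socks5")


def check_proxy_url(url: str) -> Optional[str]:
    """Return an error message for a bad proxy URL, or None if it looks usable."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises ValueError for some malformed netlocs (e.g. bad IPv6 brackets).
        return "Invalid URL."
    if parts.scheme not in ALLOWED_SCHEMES:
        return "Proxy URLs must start with one of: " + ", ".join(ALLOWED_SCHEMES)
    if not parts.hostname:
        # Scheme is fine but there is no host, as in "https://".
        return "Invalid URL."
    return None


# The two inputs used by the test, plus a well-formed URL.
print(check_proxy_url("xxxxhtt/333??p://test:awesome@squid-custom:3128"))
print(check_proxy_url("https://"))
print(check_proxy_url("http://test:awesome@squid-custom:3128"))
```

Checking the scheme before the host keeps the two messages distinct, which is what the test's paired assertions rely on.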


@@ -2,10 +2,10 @@
import json
import os
from flask import url_for
from changedetectionio.tests.util import live_server_setup, wait_for_all_checks, extract_UUID_from_client, delete_all_watches
from changedetectionio.tests.util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
def set_response(datastore_path):
def set_response():
import time
data = """<html>
<body>
@@ -15,13 +15,13 @@ def set_response(datastore_path):
</html>
"""
with open(os.path.join(datastore_path, "endpoint-content.txt"), "w") as f:
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(data)
time.sleep(1)
def test_socks5(client, live_server, measure_memory_usage, datastore_path):
def test_socks5(client, live_server, measure_memory_usage):
# live_server_setup(live_server) # Setup on conftest per function
set_response(datastore_path)
set_response()
# Setup a proxy
res = client.post(
@@ -98,5 +98,6 @@ def test_socks5(client, live_server, measure_memory_usage, datastore_path):
)
assert b"OK" in res.data
delete_all_watches(client)
res = client.get(url_for("ui.form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data

Some files were not shown because too many files have changed in this diff.