Compare commits

...

13 Commits

Author SHA1 Message Date
dgtlmoon
141aea07b8 JSONP - Attempt to strip out JSONP 2026-03-15 21:53:46 +01:00
dgtlmoon
5a4266069b Content Fetchers / Browsers - Improvements for pluggable extra fetchers/browsers. (#3981) 2026-03-15 17:35:46 +01:00
Yunhao Jiang
36269717b2 fix: add commit calls for pause and mute operations (#3978) 2026-03-13 11:32:15 +01:00
dependabot[bot]
84f2629a0c Bump apprise from 1.9.7 to 1.9.8 (#3979) 2026-03-13 10:00:12 +01:00
dgtlmoon
e9d740bd49 0.54.5 2026-03-12 17:11:21 +01:00
dgtlmoon
c18421fbe9 CI - YML tidyup 2026-03-12 16:46:14 +01:00
dgtlmoon
f29d6a857b Docker image - Improving org.opencontainers labels for dev containers 2026-03-12 16:41:45 +01:00
dgtlmoon
fcfe089a53 Docker image - Improving org.opencontainers labels #3794 2026-03-12 16:36:07 +01:00
dgtlmoon
b32617d700 API - Invert changes_only flag for include_equal parameter, add test, fixes changesOnly option for history diff API call (#3976) 2026-03-12 16:15:37 +01:00
dgtlmoon
380d8a26a1 UI - Fixing Preview "GO" version button (#3969) 2026-03-10 11:52:58 +01:00
dgtlmoon
02c03fc32b API - Create (POST) tag/group through API do not save processor_config_restock_diff values #3966 (#3968) 2026-03-10 11:19:59 +01:00
Adrián González
db3d38b3ee Add complete Spanish translation (es) (#3961) 2026-03-09 14:45:56 +01:00
dgtlmoon
ecd8af94f6 Various memory and CPU improvements (#3960) 2026-03-08 14:32:50 +01:00
24 changed files with 3960 additions and 140 deletions

View File

@@ -103,6 +103,14 @@ jobs:
             ghcr.io/${{ github.repository }}
           tags: |
             type=raw,value=dev
+          labels: |
+            org.opencontainers.image.created=${{ github.event.release.published_at }}
+            org.opencontainers.image.description=Website, webpage change detection, monitoring and notifications.
+            org.opencontainers.image.documentation=https://changedetection.io
+            org.opencontainers.image.revision=${{ github.sha }}
+            org.opencontainers.image.source=https://github.com/dgtlmoon/changedetection.io
+            org.opencontainers.image.title=changedetection.io
+            org.opencontainers.image.url=https://changedetection.io

       - name: Build and push :dev
         id: docker_build
@@ -128,7 +136,7 @@ jobs:
           echo "Release tag: ${{ github.event.release.tag_name }}"
           echo "Github ref: ${{ github.ref }}"
           echo "Github ref name: ${{ github.ref_name }}"

       - name: Docker meta :tag
         if: github.event_name == 'release' && startsWith(github.event.release.tag_name, '0.')
         uses: docker/metadata-action@v6
@@ -142,6 +150,15 @@ jobs:
             type=semver,pattern={{major}}.{{minor}},value=${{ github.event.release.tag_name }}
             type=semver,pattern={{major}},value=${{ github.event.release.tag_name }}
             type=raw,value=latest
+          labels: |
+            org.opencontainers.image.created=${{ github.event.release.published_at }}
+            org.opencontainers.image.description=Website, webpage change detection, monitoring and notifications.
+            org.opencontainers.image.documentation=https://changedetection.io
+            org.opencontainers.image.revision=${{ github.sha }}
+            org.opencontainers.image.source=https://github.com/dgtlmoon/changedetection.io
+            org.opencontainers.image.title=changedetection.io
+            org.opencontainers.image.url=https://changedetection.io
+            org.opencontainers.image.version=${{ github.event.release.tag_name }}

       - name: Build and push :tag
         id: docker_build_tag_release
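The hunks above attach the standard `org.opencontainers.*` labels to both the `:dev` and release images. A hedged sketch (image name and `docker inspect` usage are illustrative, not part of the diff) of checking that a built image carries the expected label keys:

```python
# Sketch: verify an image's label dict contains the OCI keys the workflow adds.
# The docker invocation in the comment is an assumed local check, not project code.
import json
import subprocess

EXPECTED_LABELS = {
    "org.opencontainers.image.created",
    "org.opencontainers.image.description",
    "org.opencontainers.image.documentation",
    "org.opencontainers.image.revision",
    "org.opencontainers.image.source",
    "org.opencontainers.image.title",
    "org.opencontainers.image.url",
}

def missing_labels(labels: dict) -> set:
    """Return the expected OCI label keys absent from an image's label dict."""
    return EXPECTED_LABELS - set(labels)

# e.g. labels = json.loads(subprocess.check_output(
#     ["docker", "inspect", "--format", "{{json .Config.Labels}}",
#      "ghcr.io/dgtlmoon/changedetection.io:dev"]))
# assert not missing_labels(labels)
```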

View File

@@ -2,7 +2,7 @@
 # Read more https://github.com/dgtlmoon/changedetection.io/wiki
 # Semver means never use .01, or 00. Should be .1.
-__version__ = '0.54.4'
+__version__ = '0.54.5'

 from changedetectionio.strtobool import strtobool
 from json.decoder import JSONDecodeError
@@ -61,8 +61,22 @@ import time
 # ==============================================================================
 import multiprocessing
+import os
 import sys

+# Limit glibc malloc arena count to prevent RSS growth from concurrent requests.
+# Default: glibc creates up to 8×CPU_cores arenas. Each concurrent thread/connection
+# can trigger a new arena, and freed memory stays mapped in those arenas as RSS forever.
+# With MALLOC_ARENA_MAX=2, at most 2 arenas are used; freed pages return to the OS faster.
+# Must be set before worker threads start; env var is read lazily by glibc on first arena creation.
+if 'MALLOC_ARENA_MAX' not in os.environ:
+    os.environ['MALLOC_ARENA_MAX'] = '2'
+try:
+    import ctypes as _ctypes
+    _ctypes.CDLL('libc.so.6').mallopt(-8, 2)  # M_ARENA_MAX = -8
+except Exception:
+    pass

 # Set spawn as global default (safety net - all our code uses explicit contexts anyway)
 # Skip in tests to avoid breaking pytest-flask's LiveServer fixture (uses unpicklable local functions)
 if 'pytest' not in sys.modules:
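The arena cap above can be reproduced as a standalone sketch (Linux/glibc only; the function name is mine, not the project's). The env var must be set before worker threads start because glibc reads it lazily on first arena creation; the `mallopt` call additionally covers the already-running process. Non-glibc platforms (macOS, musl) simply skip the `mallopt` step:

```python
# Minimal sketch of capping glibc malloc arenas, as done in the diff above.
import ctypes
import os

M_ARENA_MAX = -8  # glibc's mallopt parameter number for the arena limit

def cap_malloc_arenas(limit: int = 2) -> bool:
    """Cap glibc malloc arenas; returns True only if the mallopt call succeeded."""
    os.environ.setdefault('MALLOC_ARENA_MAX', str(limit))
    try:
        libc = ctypes.CDLL('libc.so.6')
        # mallopt returns nonzero on success
        return bool(libc.mallopt(M_ARENA_MAX, limit))
    except Exception:
        return False  # not glibc; the env var alone still helps child processes

cap_malloc_arenas()
```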

View File

@@ -177,6 +177,13 @@ class Tag(Resource):
         new_uuid = self.datastore.add_tag(title=title)
         if new_uuid:
+            # Apply any extra fields (e.g. processor_config_restock_diff) beyond just title
+            extra = {k: v for k, v in json_data.items() if k != 'title'}
+            if extra:
+                tag = self.datastore.data['settings']['application']['tags'].get(new_uuid)
+                if tag:
+                    tag.update(extra)
+                    tag.commit()
             return {'uuid': new_uuid}, 201
         else:
             return "Invalid or unsupported tag", 400
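The hunk above fixes tag creation via POST dropping everything except `title`. The merge pattern in isolation (the dict shape here is illustrative, not the real datastore schema):

```python
# Sketch: apply every POST-body field except 'title' onto the freshly created tag.
def apply_extra_tag_fields(tag: dict, json_data: dict) -> dict:
    extra = {k: v for k, v in json_data.items() if k != 'title'}
    if extra:
        tag.update(extra)
    return tag

tag = apply_extra_tag_fields(
    {'title': 'shoes'},
    {'title': 'shoes', 'processor_config_restock_diff': {'in_stock_only': True}},
)
```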

View File

@@ -338,7 +338,7 @@ class WatchHistoryDiff(Resource):
             word_diff = True

         # Get boolean diff preferences with defaults from DIFF_PREFERENCES_CONFIG
-        changes_only = strtobool(request.args.get('changesOnly', 'true'))
+        changes_only = strtobool(request.args.get('changesOnly', 'false'))
         ignore_whitespace = strtobool(request.args.get('ignoreWhitespace', 'false'))
         include_removed = strtobool(request.args.get('removed', 'true'))
         include_added = strtobool(request.args.get('added', 'true'))
@@ -349,7 +349,7 @@ class WatchHistoryDiff(Resource):
             previous_version_file_contents=from_version_file_contents,
             newest_version_file_contents=to_version_file_contents,
             ignore_junk=ignore_whitespace,
-            include_equal=changes_only,
+            include_equal=not changes_only,
             include_removed=include_removed,
             include_added=include_added,
             include_replaced=include_replaced,
@@ -567,4 +567,4 @@ class CreateWatch(Resource):
             return {'status': f'OK, queueing {len(watches_to_queue)} watches in background'}, 202

         return list, 200
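The fix above is a sign inversion: the public API parameter is `changesOnly` (true means show only changed lines), while the internal renderer takes `include_equal` (true means also emit unchanged lines). A sketch of the corrected mapping with the new defaults (helper name and truthy-parsing are mine; the project uses its own `strtobool`):

```python
# Sketch: map the history-diff API's query flags to the renderer's keyword flags.
def parse_diff_args(args: dict) -> dict:
    truthy = {'1', 'true', 'yes', 'on', 'y'}

    def flag(name: str, default: str) -> bool:
        return args.get(name, default).strip().lower() in truthy

    changes_only = flag('changesOnly', 'false')  # corrected default
    return {
        'include_equal': not changes_only,       # the inversion the fix introduces
        'ignore_junk': flag('ignoreWhitespace', 'false'),
        'include_removed': flag('removed', 'true'),
        'include_added': flag('added', 'true'),
    }
```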

View File

@@ -102,6 +102,35 @@ def run_async_in_browser_loop(coro):
     else:
         raise RuntimeError("Browser steps event loop is not available")

+async def _close_session_resources(session_data, label=''):
+    """Close all browser resources for a session in the correct order.
+
+    browserstepper.cleanup() closes page+context but not the browser itself.
+    For CloakBrowser, browser.close() is what stops the local Chromium process via pw.stop().
+    For the default CDP path, playwright_context.stop() shuts down the playwright instance.
+    """
+    browserstepper = session_data.get('browserstepper')
+    if browserstepper:
+        try:
+            await browserstepper.cleanup()
+        except Exception as e:
+            logger.error(f"Error cleaning up browserstepper{label}: {e}")
+
+    browser = session_data.get('browser')
+    if browser:
+        try:
+            await asyncio.wait_for(browser.close(), timeout=5.0)
+        except Exception as e:
+            logger.warning(f"Error closing browser{label}: {e}")
+
+    playwright_context = session_data.get('playwright_context')
+    if playwright_context:
+        try:
+            await playwright_context.stop()
+        except Exception as e:
+            logger.warning(f"Error stopping playwright context{label}: {e}")
+
 def cleanup_expired_sessions():
     """Remove expired browsersteps sessions and cleanup their resources"""
     global browsersteps_sessions, browsersteps_watch_to_session
@@ -119,13 +148,10 @@ def cleanup_expired_sessions():
             logger.debug(f"Cleaning up expired browsersteps session {session_id}")
             session_data = browsersteps_sessions[session_id]

-            # Cleanup playwright resources asynchronously
-            browserstepper = session_data.get('browserstepper')
-            if browserstepper:
-                try:
-                    run_async_in_browser_loop(browserstepper.cleanup())
-                except Exception as e:
-                    logger.error(f"Error cleaning up session {session_id}: {e}")
+            try:
+                run_async_in_browser_loop(_close_session_resources(session_data, label=f" for session {session_id}"))
+            except Exception as e:
+                logger.error(f"Error cleaning up session {session_id}: {e}")

             # Remove from sessions dict
             del browsersteps_sessions[session_id]
@@ -152,12 +178,10 @@ def cleanup_session_for_watch(watch_uuid):
     session_data = browsersteps_sessions.get(session_id)
     if session_data:
-        browserstepper = session_data.get('browserstepper')
-        if browserstepper:
-            try:
-                run_async_in_browser_loop(browserstepper.cleanup())
-            except Exception as e:
-                logger.error(f"Error cleaning up session {session_id} for watch {watch_uuid}: {e}")
+        try:
+            run_async_in_browser_loop(_close_session_resources(session_data, label=f" for watch {watch_uuid}"))
+        except Exception as e:
+            logger.error(f"Error cleaning up session {session_id} for watch {watch_uuid}: {e}")

         # Remove from sessions dict
         del browsersteps_sessions[session_id]
@@ -178,59 +202,69 @@ def construct_blueprint(datastore: ChangeDetectionStore):
         import time
         from playwright.async_api import async_playwright

-        # We keep the playwright session open for many minutes
         keepalive_seconds = int(os.getenv('BROWSERSTEPS_MINUTES_KEEPALIVE', 10)) * 60
+        keepalive_ms = ((keepalive_seconds + 3) * 1000)

         browsersteps_start_session = {'start_time': time.time()}

-        # Create a new async playwright instance for browser steps
-        playwright_instance = async_playwright()
-        playwright_context = await playwright_instance.start()
-
-        keepalive_ms = ((keepalive_seconds + 3) * 1000)
-        base_url = os.getenv('PLAYWRIGHT_DRIVER_URL', '').strip('"')
-        a = "?" if not '?' in base_url else '&'
-        base_url += a + f"timeout={keepalive_ms}"
-
-        browser = await playwright_context.chromium.connect_over_cdp(base_url, timeout=keepalive_ms)
-        browsersteps_start_session['browser'] = browser
-        browsersteps_start_session['playwright_context'] = playwright_context
-
+        # Build proxy dict first — needed by both the CDP path and fetcher-specific launchers
         proxy_id = datastore.get_preferred_proxy_for_watch(uuid=watch_uuid)
         proxy = None
         if proxy_id:
-            proxy_url = datastore.proxy_list.get(proxy_id).get('url')
+            proxy_url = datastore.proxy_list.get(proxy_id, {}).get('url')
             if proxy_url:
-                # Playwright needs separate username and password values
                 from urllib.parse import urlparse
                 parsed = urlparse(proxy_url)
                 proxy = {'server': proxy_url}
                 if parsed.username:
                     proxy['username'] = parsed.username
                 if parsed.password:
                     proxy['password'] = parsed.password
                 logger.debug(f"Browser Steps: UUID {watch_uuid} selected proxy {proxy_url}")

-        # Tell Playwright to connect to Chrome and setup a new session via our stepper interface
+        # Resolve the fetcher class for this watch so we can ask it to launch its own browser
+        # if it supports that (e.g. CloakBrowser, which runs locally rather than via CDP)
+        watch = datastore.data['watching'][watch_uuid]
+        from changedetectionio import content_fetchers
+        fetcher_name = watch.get_fetch_backend or 'system'
+        if fetcher_name == 'system':
+            fetcher_name = datastore.data['settings']['application'].get('fetch_backend', 'html_requests')
+        fetcher_class = getattr(content_fetchers, fetcher_name, None)
+
+        browser = None
+        playwright_context = None
+
+        # If the fetcher has its own browser launch for the live steps UI, use it.
+        # get_browsersteps_browser(proxy, keepalive_ms) returns (browser, playwright_context_or_None)
+        # or None to fall back to the default CDP path.
+        if fetcher_class and hasattr(fetcher_class, 'get_browsersteps_browser'):
+            result = await fetcher_class.get_browsersteps_browser(proxy=proxy, keepalive_ms=keepalive_ms)
+            if result is not None:
+                browser, playwright_context = result
+                logger.debug(f"Browser Steps: using fetcher-specific browser for '{fetcher_name}'")
+
+        # Default: connect to the remote Playwright/sockpuppetbrowser via CDP
+        if browser is None:
+            playwright_instance = async_playwright()
+            playwright_context = await playwright_instance.start()
+            base_url = os.getenv('PLAYWRIGHT_DRIVER_URL', '').strip('"')
+            a = "?" if '?' not in base_url else '&'
+            base_url += a + f"timeout={keepalive_ms}"
+            browser = await playwright_context.chromium.connect_over_cdp(base_url, timeout=keepalive_ms)
+            logger.debug(f"Browser Steps: using CDP connection to {base_url}")
+
+        browsersteps_start_session['browser'] = browser
+        browsersteps_start_session['playwright_context'] = playwright_context

         browserstepper = browser_steps.browsersteps_live_ui(
             playwright_browser=browser,
             proxy=proxy,
-            start_url=datastore.data['watching'][watch_uuid].link,
-            headers=datastore.data['watching'][watch_uuid].get('headers')
+            start_url=watch.link,
+            headers=watch.get('headers')
         )

-        # Initialize the async connection
         await browserstepper.connect(proxy=proxy)

         browsersteps_start_session['browserstepper'] = browserstepper

-        # For test
-        #await browsersteps_start_session['browserstepper'].action_goto_url(value="http://example.com?time="+str(time.time()))

         return browsersteps_start_session
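The dispatch the last hunk adds can be sketched in isolation: resolve the watch's fetcher name (falling back through `'system'` to the application default), then let the fetcher class supply its own browser via an optional `get_browsersteps_browser` hook, otherwise fall back to the shared CDP connection. The fetcher classes and return values below are stand-ins, not the project's real classes:

```python
# Sketch of the pluggable-browser fallback: optional classmethod hook, duck-typed
# with hasattr, with a CDP default when the hook is absent or returns None.
import asyncio

class HtmlRequests:
    """A fetcher with no live-steps browser of its own."""

class CloakBrowser:
    """A fetcher that launches its own local browser (illustrative)."""
    @classmethod
    async def get_browsersteps_browser(cls, proxy=None, keepalive_ms=0):
        return ('local-browser', 'local-playwright-context')

FETCHERS = {'html_requests': HtmlRequests, 'cloak_browser': CloakBrowser}

async def pick_browser(fetcher_name, app_default='html_requests'):
    if fetcher_name == 'system':
        fetcher_name = app_default          # resolve 'system' to the app setting
    fetcher_class = FETCHERS.get(fetcher_name)
    if fetcher_class and hasattr(fetcher_class, 'get_browsersteps_browser'):
        result = await fetcher_class.get_browsersteps_browser(proxy=None, keepalive_ms=1000)
        if result is not None:
            return result                   # fetcher supplied its own browser
    return ('cdp-browser', 'cdp-playwright-context')  # default remote CDP path
```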

View File

@@ -10,7 +10,8 @@ from changedetectionio import html_tools
 def construct_blueprint(datastore: ChangeDetectionStore):
     preview_blueprint = Blueprint('ui_preview', __name__, template_folder="../ui/templates")

-    @preview_blueprint.route("/preview/<uuid_str:uuid>", methods=['GET'])
+    @preview_blueprint.route("/preview/<uuid_str:uuid>", methods=['GET', 'POST'])
     @login_optionally_required
     def preview_page(uuid):
         """
@@ -59,12 +60,8 @@ def construct_blueprint(datastore: ChangeDetectionStore):
         versions = []
         timestamp = None

-        system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
         extra_stylesheets = [url_for('static_content', group='styles', filename='diff.css')]

-        is_html_webdriver = False
-        if (watch.get('fetch_backend') == 'system' and system_uses_webdriver) or watch.get('fetch_backend') == 'html_webdriver' or watch.get('fetch_backend', '').startswith('extra_browser_'):
-            is_html_webdriver = True
+        is_html_webdriver = watch.fetcher_supports_screenshots

         triggered_line_numbers = []
         ignored_line_numbers = []
@@ -74,7 +71,9 @@ def construct_blueprint(datastore: ChangeDetectionStore):
             flash(gettext("Preview unavailable - No fetch/check completed or triggers not reached"), "error")
         else:
             # So prepare the latest preview or not
-            preferred_version = request.args.get('version')
+            preferred_version = request.values.get('version') if request.method == 'POST' else request.args.get('version')
             versions = list(watch.history.keys())
             timestamp = versions[-1]
             if preferred_version and preferred_version in versions:
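The preview endpoint now accepts POST as well as GET, so the "Go" button's form submission actually reaches the handler. The version-selection logic itself is simple: use the requested snapshot when it exists in the watch's history, otherwise fall back to the newest. A sketch (function name is mine):

```python
# Sketch: pick the snapshot timestamp to preview, defaulting to the newest.
def resolve_preview_version(versions, preferred=None):
    timestamp = versions[-1]              # newest snapshot by default
    if preferred and preferred in versions:
        timestamp = preferred             # honour a valid requested version
    return timestamp
```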

View File

@@ -17,7 +17,7 @@
 <script src="{{ url_for('static_content', group='js', filename='tabs.js') }}" defer></script>
 {% if versions|length >= 2 %}
 <div id="diff-form" style="text-align: center;">
-    <form class="pure-form " action="" method="POST">
+    <form class="pure-form " action="{{url_for('ui.ui_preview.preview_page', uuid=uuid)}}" method="POST">
         <fieldset>
             <label for="preview-version">{{ _('Select timestamp') }}</label> <select id="preview-version"
                     name="from_version"
@@ -28,6 +28,7 @@
                 </option>
                 {% endfor %}
             </select>
+            <input type="hidden" name="csrf_token" value="{{ csrf_token() }}">
             <button type="submit" class="pure-button pure-button-primary">{{ _('Go') }}</button>
         </fieldset>

View File

@@ -81,6 +81,7 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
         sorted_tags = sorted(datastore.data['settings']['application'].get('tags').items(), key=lambda x: x[1]['title'])

+        proxy_list = datastore.proxy_list
         output = render_template(
             "watch-overview.html",
             active_tag=active_tag,
@@ -92,7 +93,7 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
             form=form,
             generate_tag_colors=processors.generate_processor_badge_colors,
             guid=datastore.data['app_guid'],
-            has_proxies=datastore.proxy_list,
+            has_proxies=proxy_list,
             hosted_sticky=os.getenv("SALTED_PASS", False) == False,
             now_time_server=round(time.time()),
             pagination=pagination,
@@ -110,6 +111,16 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
             watches=sorted_watches
         )

+        # Return freed template-building memory to the OS immediately.
+        # render_template allocates ~20MB of intermediate strings that are freed on return,
+        # but glibc keeps those pages mapped in its arenas as RSS. malloc_trim() forces
+        # glibc to release them, preventing RSS growth from concurrent Chrome connections.
+        try:
+            import ctypes
+            ctypes.CDLL('libc.so.6').malloc_trim(0)
+        except Exception:
+            pass

         if session.get('share-link'):
             del (session['share-link'])
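The `malloc_trim` call above pairs with the `MALLOC_ARENA_MAX` cap elsewhere in this compare: after a large, short-lived allocation burst such as rendering the watch-overview template, glibc keeps the freed heap pages mapped as RSS, and `malloc_trim(0)` asks it to return them to the OS. As a reusable sketch with a guard for platforms without `libc.so.6` (function name is mine):

```python
# Sketch: ask glibc to return freed heap pages to the OS; harmless no-op elsewhere.
import ctypes

def release_freed_pages() -> bool:
    """Run glibc malloc_trim(0); returns True if the call was made."""
    try:
        libc = ctypes.CDLL('libc.so.6')
        libc.malloc_trim(0)
        return True
    except Exception:
        return False  # macOS / musl have no libc.so.6 or no malloc_trim
```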

View File

@@ -213,12 +213,13 @@ html[data-darkmode="true"] .watch-tag-list.tag-{{ class_name }} {
 {%- set checking_now = is_checking_now(watch) -%}
 {%- set history_n = watch.history_n -%}
 {%- set favicon = watch.get_favicon_filename() -%}
+{%- set error_texts = watch.compile_error_texts(has_proxies=has_proxies) -%}
 {%- set system_use_url_watchlist = datastore.data['settings']['application']['ui'].get('use_page_title_in_list') -%}
 {# Class settings mirrored in changedetectionio/static/js/realtime.js for the frontend #}
 {%- set row_classes = [
     loop.cycle('pure-table-odd', 'pure-table-even'),
     'processor-' ~ watch['processor'],
-    'has-error' if watch.compile_error_texts()|length > 2 else '',
+    'has-error' if error_texts|length > 2 else '',
     'paused' if watch.paused is defined and watch.paused != False else '',
     'unviewed' if watch.has_unviewed else '',
     'has-restock-info' if watch.has_restock_info else 'no-restock-info',
@@ -271,7 +272,7 @@ html[data-darkmode="true"] .watch-tag-list.tag-{{ class_name }} {
 {% endif %}
 <a class="external" target="_blank" rel="noopener" href="{{ watch.link.replace('source:','') }}">&nbsp;</a>
 </span>
-<div class="error-text" style="display:none;">{{ watch.compile_error_texts(has_proxies=datastore.proxy_list)|safe }}</div>
+<div class="error-text" style="display:none;">{{ error_texts|safe }}</div>
 {%- if watch['processor'] == 'text_json_diff' -%}
 {%- if watch['has_ldjson_price_data'] and not watch['track_ldjson_price_data'] -%}
 <div class="ldjson-price-track-offer">Switch to Restock & Price watch mode? <a href="{{url_for('price_data_follower.accept', uuid=watch.uuid)}}" class="pure-button button-xsmall">Yes</a> <a href="{{url_for('price_data_follower.reject', uuid=watch.uuid)}}" class="">No</a></div>
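The template change above hoists `compile_error_texts()` into a per-row `{%- set -%}` so it runs once per watch instead of twice (once for the row class, once for the tooltip). The same hoisting pattern in plain Python, with a stand-in `Watch` class whose call counter makes the saving visible:

```python
# Sketch: evaluate an expensive per-row method once and reuse the result.
class Watch:
    """Stand-in for the real watch object; counts compile_error_texts calls."""
    def __init__(self):
        self.calls = 0

    def compile_error_texts(self, has_proxies=False):
        self.calls += 1
        return "Connection refused"

w = Watch()
error_texts = w.compile_error_texts(has_proxies=True)   # hoisted: evaluated once
row_class = 'has-error' if len(error_texts) > 2 else '' # first use
tooltip = error_texts                                    # second use, no extra call
```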

View File

@@ -4,6 +4,7 @@ import flask_login
import locale import locale
import os import os
import queue import queue
import re
import sys import sys
import threading import threading
import time import time
@@ -387,6 +388,8 @@ def _jinja2_filter_fetcher_status_icons(fetcher_name):
return '' return ''
_RE_SANITIZE_TAG = re.compile(r'[^a-zA-Z0-9]')
@app.template_filter('sanitize_tag_class') @app.template_filter('sanitize_tag_class')
def _jinja2_filter_sanitize_tag_class(tag_title): def _jinja2_filter_sanitize_tag_class(tag_title):
"""Sanitize a tag title to create a valid CSS class name. """Sanitize a tag title to create a valid CSS class name.
@@ -398,9 +401,8 @@ def _jinja2_filter_sanitize_tag_class(tag_title):
     Returns:
         str: A sanitized string suitable for use as a CSS class name
     """
-    import re
     # Remove all non-alphanumeric characters and convert to lowercase
-    sanitized = re.sub(r'[^a-zA-Z0-9]', '', tag_title).lower()
+    sanitized = _RE_SANITIZE_TAG.sub('', tag_title).lower()
     # Ensure it starts with a letter (CSS requirement)
     if sanitized and not sanitized[0].isalpha():
         sanitized = 'tag' + sanitized
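The precompiled-pattern version behaves identically to the old per-call `re.sub`, it just avoids recompiling the pattern on every template-filter invocation. A stand-alone sketch (the free-function name is hypothetical; it mirrors the filter above):

```python
import re

# Compile once at module import instead of on every call
_RE_SANITIZE_TAG = re.compile(r'[^a-zA-Z0-9]')

def sanitize_tag_class(tag_title):
    # Strip non-alphanumerics, lowercase, and force a leading letter
    # (CSS class names must not start with a digit)
    sanitized = _RE_SANITIZE_TAG.sub('', tag_title).lower()
    if sanitized and not sanitized[0].isalpha():
        sanitized = 'tag' + sanitized
    return sanitized

print(sanitize_tag_class("My Tag #1"))   # mytag1
print(sanitize_tag_class("2024-list"))   # tag2024list
```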
@@ -488,28 +490,21 @@ def changedetection_app(config=None, datastore_o=None):
     available_languages = get_available_languages()
     language_codes = get_language_codes()

-    def get_locale():
-        # Locale aliases: map browser language codes to translation directory names
-        # This handles cases where browsers send standard codes (e.g., zh-TW)
-        # but our translations use more specific codes (e.g., zh_Hant_TW)
-        locale_aliases = {
-            'zh-TW': 'zh_Hant_TW',  # Traditional Chinese: browser sends zh-TW, we use zh_Hant_TW
-            'zh_TW': 'zh_Hant_TW',  # Also handle underscore variant
-        }
+    _locale_aliases = {
+        'zh-TW': 'zh_Hant_TW',  # Traditional Chinese: browser sends zh-TW, we use zh_Hant_TW
+        'zh_TW': 'zh_Hant_TW',  # Also handle underscore variant
+    }
+    _locale_match_list = language_codes + list(_locale_aliases.keys())

+    def get_locale():
         # 1. Try to get locale from session (user explicitly selected)
         if 'locale' in session:
             return session['locale']

         # 2. Fall back to Accept-Language header
-        # Get the best match from browser's Accept-Language header
-        browser_locale = request.accept_languages.best_match(language_codes + list(locale_aliases.keys()))
-
-        # 3. Check if we need to map the browser locale to our internal locale
-        if browser_locale in locale_aliases:
-            return locale_aliases[browser_locale]
-        return browser_locale
+        browser_locale = request.accept_languages.best_match(_locale_match_list)
+        # 3. Map browser locale to our internal locale if needed
+        return _locale_aliases.get(browser_locale, browser_locale)

     # Initialize Babel with locale selector
     babel = Babel(app, locale_selector=get_locale)
@@ -1022,15 +1017,16 @@ def check_for_new_version():
     import urllib3
     urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

+    session = requests.Session()
+    session.verify = False
+
     while not app.config.exit.is_set():
         try:
-            r = requests.post("https://changedetection.io/check-ver.php",
-                              data={'version': __version__,
-                                    'app_guid': datastore.data['app_guid'],
-                                    'watch_count': len(datastore.data['watching'])
-                                    },
-                              verify=False)
+            r = session.post("https://changedetection.io/check-ver.php",
+                             data={'version': __version__,
+                                   'app_guid': datastore.data['app_guid'],
+                                   'watch_count': len(datastore.data['watching'])
+                                   })
         except:
             pass


@@ -487,13 +487,25 @@ def extract_json_as_string(content, json_filter, ensure_is_ldjson_info_type=None
         except json.JSONDecodeError as e:
             logger.warning(f"Error processing JSON {content[:20]}...{str(e)})")
     else:
-        # Probably something else, go fish inside for it
-        try:
-            stripped_text_from_html = extract_json_blob_from_html(content=content,
-                                                                  ensure_is_ldjson_info_type=ensure_is_ldjson_info_type,
-                                                                  json_filter=json_filter )
-        except json.JSONDecodeError as e:
-            logger.warning(f"Error processing JSON while extracting JSON from HTML blob {content[:20]}...{str(e)})")
+        # Check for JSONP wrapper: someCallback({...}) or some.namespace({...})
+        # Server may claim application/json but actually return JSONP
+        jsonp_match = re.match(r'^\w[\w.]*\s*\((.+)\)\s*;?\s*$', content.lstrip("\ufeff").strip(), re.DOTALL)
+        if jsonp_match:
+            try:
+                inner = jsonp_match.group(1).strip()
+                logger.warning(f"Content looks like JSONP, attempting to extract inner JSON for filter '{json_filter}'")
+                stripped_text_from_html = _parse_json(json.loads(inner), json_filter)
+            except json.JSONDecodeError as e:
+                logger.warning(f"Error processing JSONP inner content {content[:20]}...{str(e)})")
+
+        if not stripped_text_from_html:
+            # Probably something else, go fish inside for it
+            try:
+                stripped_text_from_html = extract_json_blob_from_html(content=content,
+                                                                      ensure_is_ldjson_info_type=ensure_is_ldjson_info_type,
+                                                                      json_filter=json_filter)
+            except json.JSONDecodeError as e:
+                logger.warning(f"Error processing JSON while extracting JSON from HTML blob {content[:20]}...{str(e)})")

     if not stripped_text_from_html:
         # Re 265 - Just return an empty string when filter not found
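The unwrap step can be exercised on its own. This sketch reuses the same regular expression as the hunk above; `unwrap_jsonp` is a hypothetical helper, and the greedy `(.+)` deliberately runs to the last `)` before an optional trailing semicolon:

```python
import json
import re

# Identifier (possibly dotted) + '(' ... ')' + optional ';' - same pattern as the diff
JSONP_RE = re.compile(r'^\w[\w.]*\s*\((.+)\)\s*;?\s*$', re.DOTALL)

def unwrap_jsonp(content):
    """Return the parsed inner JSON document if content is JSONP, else None."""
    m = JSONP_RE.match(content.lstrip("\ufeff").strip())  # tolerate a BOM
    if not m:
        return None
    return json.loads(m.group(1).strip())

print(unwrap_jsonp('weixin.update.callback({"v": "8.0.68"});'))  # {'v': '8.0.68'}
print(unwrap_jsonp('{"v": "8.0.68"}'))                           # None (already plain JSON)
```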


@@ -43,6 +43,11 @@ from ..html_tools import TRANSLATE_WHITESPACE_TABLE
 FAVICON_RESAVE_THRESHOLD_SECONDS=86400
 BROTLI_COMPRESS_SIZE_THRESHOLD = int(os.getenv('SNAPSHOT_BROTLI_COMPRESSION_THRESHOLD', 1024*20))

+# Module-level favicon filename cache: data_dir → basename (or None)
+# Keyed by data_dir so it survives Watch object recreation, deepcopy, and concurrent requests.
+# Invalidated explicitly in bump_favicon() when a new favicon is saved.
+_FAVICON_FILENAME_CACHE: dict = {}
+
 minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 3))
 mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
@@ -383,6 +388,25 @@ class model(EntityPersistenceMixin, watch_base):
         return self.get('fetch_backend')

+    @property
+    def fetcher_supports_screenshots(self):
+        """Return True if the fetcher configured for this watch supports screenshots.
+
+        Resolves 'system' via self._datastore, then checks supports_screenshots on
+        the actual fetcher class. Works for built-in and plugin fetchers alike.
+        """
+        from changedetectionio import content_fetchers
+        fetcher_name = self.get_fetch_backend  # already handles is_pdf → html_requests
+        if not fetcher_name or fetcher_name == 'system':
+            fetcher_name = self._datastore['settings']['application'].get('fetch_backend', 'html_requests')
+
+        fetcher_class = getattr(content_fetchers, fetcher_name, None)
+        if fetcher_class is None:
+            return False
+        return bool(getattr(fetcher_class, 'supports_screenshots', False))
+
     @property
     def is_pdf(self):
         url = str(self.get("url") or "").lower()
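The capability lookup is plain attribute reflection: resolve `'system'` to the global default, find the fetcher class by name, and read a class-level flag. A self-contained sketch (the `content_fetchers` namespace and both fetcher classes here are stand-ins, mirroring the property's getattr logic):

```python
import types

# Stand-in for the content_fetchers module: fetcher classes advertise
# capabilities as class attributes
class html_requests:
    supports_screenshots = False

class html_webdriver:
    supports_screenshots = True

content_fetchers = types.SimpleNamespace(html_requests=html_requests,
                                         html_webdriver=html_webdriver)

def fetcher_supports_screenshots(fetcher_name, system_default='html_requests'):
    # 'system' defers to the global setting; unknown names resolve to False
    if not fetcher_name or fetcher_name == 'system':
        fetcher_name = system_default
    fetcher_class = getattr(content_fetchers, fetcher_name, None)
    if fetcher_class is None:
        return False
    return bool(getattr(fetcher_class, 'supports_screenshots', False))
```

This replaces the hard-coded name checks (`'html_webdriver'`, `extra_browser_` prefixes) in the form/preview handlers further down, so plugin fetchers only need to set the attribute.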
@@ -806,9 +830,8 @@ class model(EntityPersistenceMixin, watch_base):
         with open(fname, 'wb') as f:
             f.write(decoded)

-        # Invalidate favicon filename cache
-        if hasattr(self, '_favicon_filename_cache'):
-            delattr(self, '_favicon_filename_cache')
+        # Invalidate module-level favicon filename cache for this watch
+        _FAVICON_FILENAME_CACHE.pop(self.data_dir, None)

         # A signal that could trigger the socket server to update the browser also
         watch_check_update = signal('watch_favicon_bump')
@@ -823,35 +846,23 @@ class model(EntityPersistenceMixin, watch_base):
     def get_favicon_filename(self) -> str | None:
         """
-        Find any favicon.* file in the current working directory
-        and return the contents of the newest one.
+        Find any favicon.* file in the watch data directory.

-        MEMORY LEAK FIX: Cache the result to avoid repeated glob.glob() operations.
-        glob.glob() causes millions of fnmatch allocations when called for every watch on page load.
+        Uses a module-level cache keyed by data_dir to survive Watch object recreation,
+        deepcopy (which drops instance attrs), and concurrent request races.
+        Invalidated by bump_favicon() when a new favicon is saved.

         Returns:
-            str: Basename of the newest favicon file, or None if not found.
+            str: Basename of the favicon file, or None if not found.
         """
-        # Check cache first (prevents 26M+ allocations from repeated glob operations)
-        cache_key = '_favicon_filename_cache'
-        if hasattr(self, cache_key):
-            return getattr(self, cache_key)
+        if self.data_dir in _FAVICON_FILENAME_CACHE:
+            return _FAVICON_FILENAME_CACHE[self.data_dir]

         import glob
-        # Search for all favicon.* files
         files = glob.glob(os.path.join(self.data_dir, "favicon.*"))
-
-        if not files:
-            result = None
-        else:
-            # Find the newest by modification time
-            newest_file = max(files, key=os.path.getmtime)
-            result = os.path.basename(newest_file)
-
-        # Cache the result
-        setattr(self, cache_key, result)
-        return result
+        fname = os.path.basename(files[0]) if files else None
+        _FAVICON_FILENAME_CACHE[self.data_dir] = fname
+        return fname

     def get_screenshot_as_thumbnail(self, max_age=3200):
         """Return path to a square thumbnail of the most recent screenshot.
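The cache-plus-explicit-invalidation shape above can be demonstrated as free functions. Note that negative results (`None`) are cached too, which is exactly why `bump_favicon()` must pop the entry when a favicon is first written (the stand-alone layout is hypothetical; the logic mirrors the diff):

```python
import glob
import os
import tempfile

# Module-level cache keyed by directory path, as in the diff above
_FAVICON_FILENAME_CACHE = {}

def get_favicon_filename(data_dir):
    if data_dir in _FAVICON_FILENAME_CACHE:
        return _FAVICON_FILENAME_CACHE[data_dir]
    files = glob.glob(os.path.join(data_dir, "favicon.*"))
    fname = os.path.basename(files[0]) if files else None
    _FAVICON_FILENAME_CACHE[data_dir] = fname   # negative results are cached too
    return fname

def bump_favicon(data_dir):
    # Saving a new favicon must drop the cached name, or readers keep seeing
    # the stale (possibly None) entry
    _FAVICON_FILENAME_CACHE.pop(data_dir, None)

d = tempfile.mkdtemp()
assert get_favicon_filename(d) is None           # miss, cached as None
open(os.path.join(d, "favicon.ico"), "w").close()
assert get_favicon_filename(d) is None           # stale until invalidated
bump_favicon(d)
assert get_favicon_filename(d) == "favicon.ico"  # re-globbed after invalidation
```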
@@ -1182,18 +1193,13 @@ class model(EntityPersistenceMixin, watch_base):
     def compile_error_texts(self, has_proxies=None):
         """Compile error texts for this watch.
         Accepts has_proxies parameter to ensure it works even outside app context"""
-        from flask import url_for
+        from flask import url_for, has_request_context
         from markupsafe import Markup

         output = []  # Initialize as list since we're using append
         last_error = self.get('last_error','')

-        try:
-            url_for('settings.settings_page')
-        except Exception as e:
-            has_app_context = False
-        else:
-            has_app_context = True
+        has_app_context = has_request_context()

         # has app+request context, we can use url_for()
         if has_app_context:


@@ -42,10 +42,7 @@ def render_form(watch, datastore, request, url_for, render_template, flash, redi
     # Get error information for the template
     screenshot_url = watch.get_screenshot()

-    system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
-    is_html_webdriver = False
-    if (watch.get('fetch_backend') == 'system' and system_uses_webdriver) or watch.get('fetch_backend') == 'html_webdriver' or watch.get('fetch_backend', '').startswith('extra_browser_'):
-        is_html_webdriver = True
+    is_html_webdriver = watch.fetcher_supports_screenshots

     password_enabled_and_share_is_off = False
     if datastore.data['settings']['application'].get('password') or os.getenv("SALTED_PASS", False):


@@ -100,7 +100,13 @@ class guess_stream_type():
         if any(s in http_content_header for s in RSS_XML_CONTENT_TYPES):
             self.is_rss = True
         elif any(s in http_content_header for s in JSON_CONTENT_TYPES):
-            self.is_json = True
+            # JSONP detection: server claims application/json but content is actually JSONP (e.g. cb({...}))
+            # A JSONP response starts with an identifier followed by '(' - not valid JSON
+            if re.match(r'^\w[\w.]*\s*\(', test_content):
+                logger.warning(f"Content-Type header claims JSON but content looks like JSONP (starts with identifier+parenthesis) - treating as plaintext")
+                self.is_plaintext = True
+            else:
+                self.is_json = True
         elif 'pdf' in magic_content_header:
             self.is_pdf = True
         # magic will call a rss document 'xml'
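The detection side of the fix relies on a structural fact: a valid JSON document must begin with `{`, `[`, a quote, a digit/minus, or a bare literal, never an identifier followed by `(`. A tiny sketch of just the prefix test (`looks_like_jsonp` is a hypothetical helper using the same pattern):

```python
import re

# An identifier (optionally dotted) followed by '(' can never open valid JSON
_JSONP_PREFIX = re.compile(r'^\w[\w.]*\s*\(')

def looks_like_jsonp(body):
    return bool(_JSONP_PREFIX.match(body))
```

Bare JSON literals like `true` are safe here: `\w` would match them, but the required `(` never follows, so the pattern fails.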


@@ -154,11 +154,7 @@ def render(watch, datastore, request, url_for, render_template, flash, redirect,
     screenshot_url = watch.get_screenshot()

-    system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
-    is_html_webdriver = False
-    if (watch.get('fetch_backend') == 'system' and system_uses_webdriver) or watch.get('fetch_backend') == 'html_webdriver' or watch.get('fetch_backend', '').startswith('extra_browser_'):
-        is_html_webdriver = True
+    is_html_webdriver = watch.fetcher_supports_screenshots

     password_enabled_and_share_is_off = False
     if datastore.data['settings']['application'].get('password') or os.getenv("SALTED_PASS", False):


@@ -29,9 +29,11 @@ def register_watch_operation_handlers(socketio, datastore):
             # Perform the operation
             if op == 'pause':
                 watch.toggle_pause()
+                watch.commit()
                 logger.info(f"Socket.IO: Toggled pause for watch {uuid}")
             elif op == 'mute':
                 watch.toggle_mute()
+                watch.commit()
                 logger.info(f"Socket.IO: Toggled mute for watch {uuid}")
             elif op == 'recheck':
                 # Import here to avoid circular imports
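The added `commit()` calls matter because a toggle alone only mutates in-memory state; without an explicit commit the change silently disappears on the next reload. A minimal stand-in showing why (the `Watch` class and file layout here are hypothetical, not changedetection.io's actual persistence code):

```python
import json
import os
import tempfile

class Watch:
    def __init__(self, path):
        self.path = path
        self.data = {'paused': False}
        if os.path.exists(path):
            with open(path) as f:
                self.data = json.load(f)

    def toggle_pause(self):
        # In-memory only: nothing hits disk yet
        self.data['paused'] = not self.data['paused']

    def commit(self):
        # Persist current state; skipping this loses the toggle on restart
        with open(self.path, 'w') as f:
            json.dump(self.data, f)

path = os.path.join(tempfile.mkdtemp(), 'watch.json')
w = Watch(path)
w.toggle_pause()                              # paused in memory...
assert Watch(path).data['paused'] is False    # ...but a fresh load doesn't see it
w.commit()
assert Watch(path).data['paused'] is True     # now it survives reload
```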


@@ -170,6 +170,14 @@ def test_api_simple(client, live_server, measure_memory_usage, datastore_path):
         headers={'x-api-key': api_key},
     )
     assert b'(changed) Which is across' in res.data
+    assert b'Some text thats the same' in res.data
+
+    # Fetch the difference between two versions (default text format)
+    res = client.get(
+        url_for("watchhistorydiff", uuid=watch_uuid, from_timestamp='previous', to_timestamp='latest')+"?changesOnly=true",
+        headers={'x-api-key': api_key},
+    )
+    assert b'Some text thats the same' not in res.data

     # Test htmlcolor format
     res = client.get(


@@ -178,23 +178,44 @@ def test_api_tags_listing(client, live_server, measure_memory_usage, datastore_p
 def test_api_tag_restock_processor_config(client, live_server, measure_memory_usage, datastore_path):
     """
-    Test that a tag/group can be updated with processor_config_restock_diff via the API.
+    Test that a tag/group can be created and updated with processor_config_restock_diff via the API.
     Since Tag extends WatchBase, processor config fields injected into WatchBase are also valid for tags.
     """
     api_key = live_server.app.config['DATASTORE'].data['settings']['application'].get('api_access_token')
     set_original_response(datastore_path=datastore_path)

-    # Create a tag
+    # Create a tag with processor_config_restock_diff in a single POST (issue #3966)
     res = client.post(
         url_for("tag"),
-        data=json.dumps({"title": "Restock Group"}),
+        data=json.dumps({
+            "title": "Restock Group",
+            "overrides_watch": True,
+            "processor_config_restock_diff": {
+                "in_stock_processing": "in_stock_only",
+                "follow_price_changes": True,
+                "price_change_min": 7777777
+            }
+        }),
         headers={'content-type': 'application/json', 'x-api-key': api_key}
     )
-    assert res.status_code == 201
+    assert res.status_code == 201, f"POST tag with restock config failed: {res.data}"
     tag_uuid = res.json.get('uuid')

-    # Update tag with valid processor_config_restock_diff
+    # Verify processor config was saved during creation (the bug: these were discarded)
+    res = client.get(
+        url_for("tag", uuid=tag_uuid),
+        headers={'x-api-key': api_key}
+    )
+    assert res.status_code == 200
+    tag_data = res.json
+    assert tag_data.get('overrides_watch') == True, "overrides_watch should be saved on POST"
+    assert tag_data.get('processor_config_restock_diff', {}).get('in_stock_processing') == 'in_stock_only', \
+        "processor_config_restock_diff should be saved on POST"
+    assert tag_data.get('processor_config_restock_diff', {}).get('price_change_min') == 7777777, \
+        "price_change_min should be saved on POST"
+
+    # Update tag with valid processor_config_restock_diff via PUT
     res = client.put(
         url_for("tag", uuid=tag_uuid),
         headers={'x-api-key': api_key, 'content-type': 'application/json'},


@@ -48,6 +48,15 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
     # Check this class does not appear (that we didnt see the actual source)
     assert b'foobar-detection' not in res.data

+    # Check POST preview
+    res = client.post(
+        url_for("ui.ui_preview.preview_page", uuid="first"),
+        follow_redirects=True
+    )
+    # Check this class does not appear (that we didnt see the actual source)
+    assert b'foobar-detection' not in res.data
+
     # Make a change
     set_modified_response(datastore_path=datastore_path)


@@ -16,6 +16,51 @@ except ModuleNotFoundError:
+def test_jsonp_treated_as_plaintext():
+    from ..processors.magic import guess_stream_type
+
+    # JSONP content (server wrongly claims application/json) should be detected as plaintext
+    # Callback names are arbitrary identifiers, not always 'cb'
+    jsonp_content = 'jQuery123456({ "version": "8.0.41", "url": "https://example.com/app.apk" })'
+    result = guess_stream_type(http_content_header="application/json", content=jsonp_content)
+    assert result.is_json is False
+    assert result.is_plaintext is True
+
+    # Variation with dotted callback name e.g. jQuery.cb(...)
+    jsonp_dotted = 'some.callback({ "version": "1.0" })'
+    result = guess_stream_type(http_content_header="application/json", content=jsonp_dotted)
+    assert result.is_json is False
+    assert result.is_plaintext is True
+
+    # Real JSON should still be detected as JSON
+    json_content = '{ "version": "8.0.41", "url": "https://example.com/app.apk" }'
+    result = guess_stream_type(http_content_header="application/json", content=json_content)
+    assert result.is_json is True
+    assert result.is_plaintext is False
+
+def test_jsonp_json_filter_extraction():
+    from .. import html_tools
+
+    # Tough case: dotted namespace callback, trailing semicolon, deeply nested content with arrays
+    jsonp_content = 'weixin.update.callback({"platforms": {"android": {"variants": [{"arch": "arm64", "versionName": "8.0.68", "url": "https://example.com/app-arm64.apk"}, {"arch": "arm32", "versionName": "8.0.41", "url": "https://example.com/app-arm32.apk"}]}}});'
+
+    # Deep nested jsonpath filter into array element
+    text = html_tools.extract_json_as_string(jsonp_content, "json:$.platforms.android.variants[0].versionName")
+    assert text == '"8.0.68"'
+
+    # Filter that selects the second array element
+    text = html_tools.extract_json_as_string(jsonp_content, "json:$.platforms.android.variants[1].arch")
+    assert text == '"arm32"'
+
+    if jq_support:
+        text = html_tools.extract_json_as_string(jsonp_content, "jq:.platforms.android.variants[0].versionName")
+        assert text == '"8.0.68"'
+
+        text = html_tools.extract_json_as_string(jsonp_content, "jqraw:.platforms.android.variants[1].url")
+        assert text == "https://example.com/app-arm32.apk"
 def test_unittest_inline_html_extract():
     # So lets pretend that the JSON we want is inside some HTML
     content="""

File diff suppressed because it is too large.


@@ -100,6 +100,19 @@ def is_safe_valid_url(test_url):
         logger.warning('URL validation failed: URL is empty or whitespace only')
         return False

+    # Per-request cache: same URL is often validated 2-3x per watchlist render (sort + display).
+    # Flask's g is scoped to one request and auto-cleared on teardown, so dynamic Jinja2 URLs
+    # like {{microtime()}} are always re-evaluated on the next request.
+    # Falls back gracefully when called outside a request context (e.g. background workers).
+    _cache_key = test_url
+    try:
+        from flask import g
+        _cache = g.setdefault('_url_validation_cache', {})
+        if _cache_key in _cache:
+            return _cache[_cache_key]
+    except RuntimeError:
+        _cache = None  # No app context
+
     allow_file_access = strtobool(os.getenv('ALLOW_FILE_URI', 'false'))
     safe_protocol_regex = '^(http|https|ftp|file):' if allow_file_access else '^(http|https|ftp):'
@@ -112,11 +125,14 @@ def is_safe_valid_url(test_url):
     test_url = r.sub('', test_url)

     # Check the actual rendered URL in case of any Jinja markup
-    try:
-        test_url = jinja_render(test_url)
-    except Exception as e:
-        logger.error(f'URL "{test_url}" is not correct Jinja2? {str(e)}')
-        return False
+    # Only run jinja_render when the URL actually contains Jinja2 syntax - creating a new
+    # ImmutableSandboxedEnvironment is expensive and is called once per watch per page load
+    if '{%' in test_url or '{{' in test_url:
+        try:
+            test_url = jinja_render(test_url)
+        except Exception as e:
+            logger.error(f'URL "{test_url}" is not correct Jinja2? {str(e)}')
+            return False

     # Check query parameters and fragment
     if re.search(r'[<>]', test_url):
@@ -142,4 +158,6 @@ def is_safe_valid_url(test_url):
         logger.warning(f'URL f"{test_url}" failed validation, aborting.')
         return False

+    if _cache is not None:
+        _cache[_cache_key] = True
     return True
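The per-request memo plus its context-free fallback can be sketched without Flask by injecting the cache accessor; `is_safe_valid_url_cached`, `validate`, and `get_request_cache` are hypothetical stand-ins, and unlike the real diff (which only stores successful validations) this sketch memoizes both outcomes:

```python
def is_safe_valid_url_cached(test_url, validate, get_request_cache):
    """Validate a URL with a per-request memo.

    get_request_cache mimics flask.g: it returns a dict scoped to the current
    request, or raises RuntimeError outside a request context, in which case
    we just validate directly (e.g. background workers).
    """
    try:
        cache = get_request_cache()
    except RuntimeError:
        return validate(test_url)
    if test_url not in cache:
        cache[test_url] = validate(test_url)
    return cache[test_url]

calls = []
def validate(url):
    calls.append(url)          # count how often the expensive path runs
    return url.startswith('https://')

request_scope = {}
is_safe_valid_url_cached('https://example.com', validate, lambda: request_scope)
is_safe_valid_url_cached('https://example.com', validate, lambda: request_scope)
assert len(calls) == 1         # second call served from the memo
```

Because the scope dict dies with the request (as `flask.g` does on teardown), dynamic Jinja2 URLs are still re-evaluated on the next request.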


@@ -40,7 +40,7 @@ orjson~=3.11
 # jq not available on Windows so must be installed manually

 # Notification library
-apprise==1.9.7
+apprise==1.9.8

 diff_match_patch