Compare commits

21 Commits

Author SHA1 Message Date
dgtlmoon c442a798a3 More clenaup 2026-02-16 18:17:38 +01:00
dgtlmoon 06fd43ee24 safer cleanup 2026-02-16 17:59:40 +01:00
dgtlmoon 111d424d23 Hmm 2 2026-02-16 17:58:23 +01:00
dgtlmoon fbe245c1d7 hmm 2026-02-16 17:47:15 +01:00
dgtlmoon 1eeed8dd5b add cleanup 2026-02-16 17:36:53 +01:00
dgtlmoon 363dcf6ff0 Remove static sleeps 2026-02-16 17:21:52 +01:00
dgtlmoon 7a3c9cb391 Security - Adding small test and fixing overzealous filename cleaner 2026-02-16 16:56:16 +01:00
dgtlmoon 549e167746 Datastore - On fresh installs, also scan for existing watch.json watches in subdirectories 2026-02-16 15:56:46 +01:00
dgtlmoon 9d38b45173 Security CVE-2026-25527 - Unauthenticated static path traversal in resources 2026-02-16 15:48:03 +01:00
dgtlmoon 3558e9ee10 Browser Steps - Minor code cleanup 2026-02-16 13:22:54 +01:00
dgtlmoon 4b94de7e0c UI - Browser Steps - First step was missing Clear / Remove / Pic buttons 2026-02-16 13:20:34 +01:00
dgtlmoon 3f99f0dd7b 0.53.1 2026-02-16 13:06:49 +01:00
dgtlmoon fe465de73c Browser Steps - Clean off empty fields on save/update (UI and API), small refactor Re #3874, #3879 (#3880) 2026-02-16 13:05:46 +01:00
dgtlmoon 1ad3207288 Test - Improve test for watch package download 2026-02-16 13:05:18 +01:00
dgtlmoon dbe238e33d UI - Watch data download, fix test, update text. 2026-02-16 11:13:19 +01:00
dgtlmoon 32cb72b459 UI - Ability to download a complete data package (.zip) of a watch (#3877) 2026-02-15 10:53:21 +01:00
dgtlmoon 501aa61e19 Disable content compression of HTML/etc by default due to memory leak between flask_socketio and flask and flask_compress. 2026-02-15 08:19:29 +01:00
dgtlmoon b6d3d63372 Avoid reprocessing if the page was the same (#3867) 2026-02-14 21:24:28 +01:00
dependabot[bot] f4bb32f588 Update python-socketio requirement from ~=5.16.0 to ~=5.16.1 (#3869) 2026-02-13 17:43:43 +01:00
dgtlmoon bcd32852ca API - Remove flask_expects_json validation, this is covered entirely by OpenAPI, update OpenAPI spec. (#3871) 2026-02-13 16:30:59 +01:00
dependabot[bot] ad14807067 Update python-engineio requirement from ~=4.13.0 to ~=4.13.1 (#3868) 2026-02-13 11:24:50 +01:00
56 changed files with 2551 additions and 581 deletions
+1 -1
@@ -2,7 +2,7 @@
 # Read more https://github.com/dgtlmoon/changedetection.io/wiki
 # Semver means never use .01, or 00. Should be .1.
-__version__ = '0.52.9'
+__version__ = '0.53.1'
 from changedetectionio.strtobool import strtobool
 from json.decoder import JSONDecodeError
+22 -7
@@ -2,7 +2,7 @@ from changedetectionio.strtobool import strtobool
 from flask_restful import abort, Resource
 from flask import request
 from functools import wraps
-from . import auth, validate_openapi_request, schema_create_watch
+from . import auth, validate_openapi_request
 from ..validate_url import is_safe_valid_url
 import json
@@ -33,9 +33,25 @@ def convert_query_param_to_type(value, schema_property):
Returns: Returns:
Converted value in the appropriate type Converted value in the appropriate type
Supports both OpenAPI 3.1 formats:
- type: [string, 'null'] (array format)
- anyOf: [{type: string}, {type: null}] (anyOf format)
""" """
# Handle anyOf schemas (extract the first type) prop_type = schema_property.get('type')
if 'anyOf' in schema_property:
# Handle OpenAPI 3.1 type arrays: type: [string, 'null']
if isinstance(prop_type, list):
# Use the first non-null type from the array
for t in prop_type:
if t != 'null':
prop_type = t
break
else:
prop_type = None
# Handle anyOf schemas (older format)
elif 'anyOf' in schema_property:
# Use the first non-null type from anyOf # Use the first non-null type from anyOf
for option in schema_property['anyOf']: for option in schema_property['anyOf']:
if option.get('type') and option.get('type') != 'null': if option.get('type') and option.get('type') != 'null':
@@ -43,8 +59,6 @@ def convert_query_param_to_type(value, schema_property):
break break
else: else:
prop_type = None prop_type = None
else:
prop_type = schema_property.get('type')
# Handle array type (e.g., notification_urls) # Handle array type (e.g., notification_urls)
if prop_type == 'array': if prop_type == 'array':
@@ -89,7 +103,7 @@ class Import(Resource):
@validate_openapi_request('importWatches') @validate_openapi_request('importWatches')
def post(self): def post(self):
"""Import a list of watched URLs with optional watch configuration.""" """Import a list of watched URLs with optional watch configuration."""
from . import get_watch_schema_properties
# Special parameters that are NOT watch configuration # Special parameters that are NOT watch configuration
special_params = {'tag', 'tag_uuids', 'dedupe', 'proxy'} special_params = {'tag', 'tag_uuids', 'dedupe', 'proxy'}
@@ -115,7 +129,8 @@ class Import(Resource):
tag_uuids = tag_uuids.split(',') tag_uuids = tag_uuids.split(',')
# Extract ALL other query parameters as watch configuration # Extract ALL other query parameters as watch configuration
schema_properties = schema_create_watch.get('properties', {}) # Get schema from OpenAPI spec (replaces old schema_create_watch)
schema_properties = get_watch_schema_properties()
for param_name, param_value in request.args.items(): for param_name, param_value in request.args.items():
# Skip special parameters # Skip special parameters
if param_name in special_params: if param_name in special_params:
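The type-resolution change in `convert_query_param_to_type` can be illustrated in isolation. The following is a minimal sketch, not the project's code: `resolve_schema_type` is a hypothetical helper name, and it only mirrors the branch order visible in the hunk above (type array first, then `anyOf`, then a plain `type`).

```python
def resolve_schema_type(schema_property):
    """Return the first non-null type declared by a property schema.

    Hypothetical helper; it follows the same branch order as the diff.
    """
    prop_type = schema_property.get('type')

    # OpenAPI 3.1 array form, e.g. type: [string, 'null']
    if isinstance(prop_type, list):
        return next((t for t in prop_type if t != 'null'), None)

    # anyOf form, e.g. anyOf: [{type: string}, {type: null}]
    if 'anyOf' in schema_property:
        return next(
            (
                option['type']
                for option in schema_property['anyOf']
                if option.get('type') and option.get('type') != 'null'
            ),
            None,
        )

    # Plain scalar form, e.g. type: boolean (or None when absent)
    return prop_type


array_form = resolve_schema_type({'type': ['string', 'null']})
any_of_form = resolve_schema_type({'anyOf': [{'type': 'null'}, {'type': 'integer'}]})
plain_form = resolve_schema_type({'type': 'boolean'})
only_null = resolve_schema_type({'type': ['null']})
```

A schema that declares only `null` resolves to `None`, which matches the `for ... else` fallback in the diff.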
-5
@@ -1,8 +1,6 @@
-from flask_expects_json import expects_json
 from flask_restful import Resource, abort
 from flask import request
 from . import auth, validate_openapi_request
-from . import schema_create_notification_urls, schema_delete_notification_urls
 class Notifications(Resource):
     def __init__(self, **kwargs):
@@ -22,7 +20,6 @@ class Notifications(Resource):
     @auth.check_token
     @validate_openapi_request('addNotifications')
-    @expects_json(schema_create_notification_urls)
     def post(self):
         """Create Notification URLs."""
@@ -50,7 +47,6 @@ class Notifications(Resource):
     @auth.check_token
     @validate_openapi_request('replaceNotifications')
-    @expects_json(schema_create_notification_urls)
     def put(self):
         """Replace Notification URLs."""
         json_data = request.get_json()
@@ -73,7 +69,6 @@ class Notifications(Resource):
     @auth.check_token
     @validate_openapi_request('deleteNotifications')
-    @expects_json(schema_delete_notification_urls)
     def delete(self):
         """Delete Notification URLs."""
+60 -9
@@ -1,6 +1,5 @@
from changedetectionio import queuedWatchMetaData from changedetectionio import queuedWatchMetaData
from changedetectionio import worker_pool from changedetectionio import worker_pool
from flask_expects_json import expects_json
from flask_restful import abort, Resource from flask_restful import abort, Resource
from loguru import logger from loguru import logger
@@ -8,8 +7,7 @@ import threading
from flask import request from flask import request
from . import auth from . import auth
# Import schemas from __init__.py from . import validate_openapi_request
from . import schema_tag, schema_create_tag, schema_update_tag, validate_openapi_request
class Tag(Resource): class Tag(Resource):
@@ -69,7 +67,25 @@ class Tag(Resource):
tag.commit() tag.commit()
return "OK", 200 return "OK", 200
return tag # Filter out Watch-specific runtime fields that don't apply to Tags (yet)
# TODO: Future enhancement - aggregate these values from all Watches that have this tag:
# - check_count: sum of all watches' check_count
# - last_checked: most recent last_checked from all watches
# - last_changed: most recent last_changed from all watches
# - consecutive_filter_failures: count of watches with failures
# - etc.
# These come from watch_base inheritance but currently have no meaningful value for Tags
watch_only_fields = {
'browser_steps_last_error_step', 'check_count', 'consecutive_filter_failures',
'content-type', 'fetch_time', 'last_changed', 'last_checked', 'last_error',
'last_notification_error', 'last_viewed', 'notification_alert_count',
'page_title', 'previous_md5', 'remote_server_reply'
}
# Create clean tag dict without Watch-specific fields
clean_tag = {k: v for k, v in tag.items() if k not in watch_only_fields}
return clean_tag
@auth.check_token @auth.check_token
@validate_openapi_request('deleteTag') @validate_openapi_request('deleteTag')
@@ -102,38 +118,73 @@ class Tag(Resource):
@auth.check_token @auth.check_token
@validate_openapi_request('updateTag') @validate_openapi_request('updateTag')
@expects_json(schema_update_tag)
def put(self, uuid): def put(self, uuid):
"""Update tag information.""" """Update tag information."""
tag = self.datastore.data['settings']['application']['tags'].get(uuid) tag = self.datastore.data['settings']['application']['tags'].get(uuid)
if not tag: if not tag:
abort(404, message='No tag exists with the UUID of {}'.format(uuid)) abort(404, message='No tag exists with the UUID of {}'.format(uuid))
# Make a mutable copy of request.json for modification
json_data = dict(request.json)
# Validate notification_urls if provided # Validate notification_urls if provided
if 'notification_urls' in request.json: if 'notification_urls' in json_data:
from wtforms import ValidationError from wtforms import ValidationError
from changedetectionio.api.Notifications import validate_notification_urls from changedetectionio.api.Notifications import validate_notification_urls
try: try:
notification_urls = request.json.get('notification_urls', []) notification_urls = json_data.get('notification_urls', [])
validate_notification_urls(notification_urls) validate_notification_urls(notification_urls)
except ValidationError as e: except ValidationError as e:
return str(e), 400 return str(e), 400
tag.update(request.json) # Filter out readOnly fields (extracted from OpenAPI spec Tag schema)
# These are system-managed fields that should never be user-settable
from . import get_readonly_tag_fields
readonly_fields = get_readonly_tag_fields()
# Tag model inherits from watch_base but has no @property attributes of its own
# So we only need to filter readOnly fields
for field in readonly_fields:
json_data.pop(field, None)
# Validate remaining fields - reject truly unknown fields
# Get valid fields from Tag schema
from . import get_tag_schema_properties
valid_fields = set(get_tag_schema_properties().keys())
# Check for unknown fields
unknown_fields = set(json_data.keys()) - valid_fields
if unknown_fields:
return f"Unknown field(s): {', '.join(sorted(unknown_fields))}", 400
tag.update(json_data)
tag.commit() tag.commit()
# Clear checksums for all watches using this tag to force reprocessing
# Tag changes affect inherited configuration
cleared_count = self.datastore.clear_checksums_for_tag(uuid)
logger.info(f"Tag {uuid} updated via API, cleared {cleared_count} watch checksums")
return "OK", 200 return "OK", 200
@auth.check_token @auth.check_token
@validate_openapi_request('createTag') @validate_openapi_request('createTag')
# Only cares for {'title': 'xxxx'}
def post(self): def post(self):
"""Create a single tag/group.""" """Create a single tag/group."""
json_data = request.get_json() json_data = request.get_json()
title = json_data.get("title",'').strip() title = json_data.get("title",'').strip()
# Validate that only valid fields are provided
# Get valid fields from Tag schema
from . import get_tag_schema_properties
valid_fields = set(get_tag_schema_properties().keys())
# Check for unknown fields
unknown_fields = set(json_data.keys()) - valid_fields
if unknown_fields:
return f"Unknown field(s): {', '.join(sorted(unknown_fields))}", 400
new_uuid = self.datastore.add_tag(title=title) new_uuid = self.datastore.add_tag(title=title)
if new_uuid: if new_uuid:
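The Tag `put` handler above applies two steps to the request body: it silently drops readOnly fields, then rejects anything that is still not in the schema. A minimal sketch of that order of operations follows. `split_update_payload` and the field names in the example are hypothetical and are not taken from the project's OpenAPI spec.

```python
def split_update_payload(payload, valid_fields, readonly_fields):
    """Drop read-only keys, then report keys that are not in the schema.

    Returns (cleaned_payload, sorted_unknown_field_names).
    Hypothetical helper that mirrors the flow shown in the diff.
    """
    # Work on a copy so the caller's dict is left untouched
    cleaned = {k: v for k, v in payload.items() if k not in readonly_fields}
    unknown = sorted(set(cleaned) - set(valid_fields))
    return cleaned, unknown


# Example values are assumptions for illustration only
cleaned, unknown = split_update_payload(
    payload={'title': 'News', 'uuid': 'abc', 'bogus': 1},
    valid_fields={'title', 'notification_urls'},
    readonly_fields={'uuid'},
)

# A caller could build the same 400 body as the diff does
error_body = f"Unknown field(s): {', '.join(unknown)}" if unknown else None
```

Stripping read-only fields before the unknown-field check means a client can send back an object it previously fetched without triggering a 400 for system-managed keys.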
+30 -5
@@ -8,13 +8,11 @@ from . import auth
from changedetectionio import queuedWatchMetaData, strtobool from changedetectionio import queuedWatchMetaData, strtobool
from changedetectionio import worker_pool from changedetectionio import worker_pool
from flask import request, make_response, send_from_directory from flask import request, make_response, send_from_directory
from flask_expects_json import expects_json
from flask_restful import abort, Resource from flask_restful import abort, Resource
from loguru import logger from loguru import logger
import copy import copy
# Import schemas from __init__.py from . import validate_openapi_request, get_readonly_watch_fields
from . import schema, schema_create_watch, schema_update_watch, validate_openapi_request
from ..notification import valid_notification_formats from ..notification import valid_notification_formats
from ..notification.handler import newline_re from ..notification.handler import newline_re
@@ -121,7 +119,6 @@ class Watch(Resource):
@auth.check_token @auth.check_token
@validate_openapi_request('updateWatch') @validate_openapi_request('updateWatch')
@expects_json(schema_update_watch)
def put(self, uuid): def put(self, uuid):
"""Update watch information.""" """Update watch information."""
watch = self.datastore.data['watching'].get(uuid) watch = self.datastore.data['watching'].get(uuid)
@@ -175,6 +172,35 @@ class Watch(Resource):
# Extract and remove processor config fields from json_data # Extract and remove processor config fields from json_data
processor_config_data = processors.extract_processor_config_from_form_data(json_data) processor_config_data = processors.extract_processor_config_from_form_data(json_data)
# Filter out readOnly fields (extracted from OpenAPI spec Watch schema)
# These are system-managed fields that should never be user-settable
readonly_fields = get_readonly_watch_fields()
# Also filter out @property attributes (computed/derived values from the model)
# These are not stored and should be ignored in PUT requests
from changedetectionio.model.Watch import model as WatchModel
property_fields = WatchModel.get_property_names()
# Combine both sets of fields to ignore
fields_to_ignore = readonly_fields | property_fields
# Remove all ignored fields from update data
for field in fields_to_ignore:
json_data.pop(field, None)
# Validate remaining fields - reject truly unknown fields
# Get valid fields from WatchBase schema
from . import get_watch_schema_properties
valid_fields = set(get_watch_schema_properties().keys())
# Also allow last_viewed (explicitly defined in UpdateWatch schema)
valid_fields.add('last_viewed')
# Check for unknown fields
unknown_fields = set(json_data.keys()) - valid_fields
if unknown_fields:
return f"Unknown field(s): {', '.join(sorted(unknown_fields))}", 400
# Update watch with regular (non-processor-config) fields # Update watch with regular (non-processor-config) fields
watch.update(json_data) watch.update(json_data)
watch.commit() watch.commit()
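The Watch `put` handler calls `WatchModel.get_property_names()`, whose body is not part of this section. One way such a helper could be written is sketched below, assuming it only needs to list `@property` attributes defined on the class and its parents. `PropertyNamesMixin` and `DemoWatch` are hypothetical names used for illustration.

```python
import inspect


class PropertyNamesMixin:
    """Hypothetical mixin; the project's real implementation may differ."""

    @classmethod
    def get_property_names(cls):
        # inspect.getmembers on a class returns the property objects
        # themselves, so they can be filtered with isinstance
        return frozenset(
            name
            for name, _ in inspect.getmembers(
                cls, lambda member: isinstance(member, property)
            )
        )


class DemoWatch(PropertyNamesMixin, dict):
    @property
    def label(self):
        return self.get('title') or self.get('url')

    def commit(self):
        pass


property_fields = DemoWatch.get_property_names()

# Set union, as in the diff: readonly_fields | property_fields
fields_to_ignore = frozenset({'uuid'}) | property_fields
```

Returning a `frozenset` keeps the result hashable and lets it combine with another set through `|`, which is how the diff builds `fields_to_ignore`.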
@@ -393,7 +419,6 @@ class CreateWatch(Resource):
@auth.check_token @auth.check_token
@validate_openapi_request('createWatch') @validate_openapi_request('createWatch')
@expects_json(schema_create_watch)
def post(self): def post(self):
"""Create a single watch.""" """Create a single watch."""
+83 -37
@@ -1,41 +1,6 @@
import copy
import functools import functools
from flask import request, abort from flask import request, abort
from loguru import logger from loguru import logger
from . import api_schema
from ..model import watch_base
# Build a JSON Schema atleast partially based on our Watch model
watch_base_config = watch_base()
schema = api_schema.build_watch_json_schema(watch_base_config)
schema_create_watch = copy.deepcopy(schema)
schema_create_watch['required'] = ['url']
del schema_create_watch['properties']['last_viewed']
# Allow processor_config_* fields (handled separately in endpoint)
schema_create_watch['patternProperties'] = {
'^processor_config_': {'type': ['string', 'number', 'boolean', 'object', 'array', 'null']}
}
schema_update_watch = copy.deepcopy(schema)
schema_update_watch['additionalProperties'] = False
# Allow processor_config_* fields (handled separately in endpoint)
schema_update_watch['patternProperties'] = {
'^processor_config_': {'type': ['string', 'number', 'boolean', 'object', 'array', 'null']}
}
# Tag schema is also based on watch_base since Tag inherits from it
schema_tag = copy.deepcopy(schema)
schema_create_tag = copy.deepcopy(schema_tag)
schema_create_tag['required'] = ['title']
schema_update_tag = copy.deepcopy(schema_tag)
schema_update_tag['additionalProperties'] = False
schema_notification_urls = copy.deepcopy(schema)
schema_create_notification_urls = copy.deepcopy(schema_notification_urls)
schema_create_notification_urls['required'] = ['notification_urls']
schema_delete_notification_urls = copy.deepcopy(schema_notification_urls)
schema_delete_notification_urls['required'] = ['notification_urls']
@functools.cache @functools.cache
def get_openapi_spec(): def get_openapi_spec():
@@ -54,6 +19,79 @@ def get_openapi_spec():
_openapi_spec = OpenAPI.from_dict(spec_dict) _openapi_spec = OpenAPI.from_dict(spec_dict)
return _openapi_spec return _openapi_spec
@functools.cache
def get_openapi_schema_dict():
"""
Get the raw OpenAPI spec dictionary for schema access.
Used by Import endpoint to validate and convert query parameters.
Returns the YAML dict directly (not the OpenAPI object).
"""
import os
import yaml
spec_path = os.path.join(os.path.dirname(__file__), '../../docs/api-spec.yaml')
if not os.path.exists(spec_path):
spec_path = os.path.join(os.path.dirname(__file__), '../docs/api-spec.yaml')
with open(spec_path, 'r', encoding='utf-8') as f:
return yaml.safe_load(f)
@functools.cache
def _resolve_schema_properties(schema_name):
"""
Generic helper to resolve schema properties, including allOf inheritance.
Args:
schema_name: Name of the schema (e.g., 'WatchBase', 'Watch', 'Tag')
Returns:
dict: All properties including inherited ones from $ref schemas
"""
spec_dict = get_openapi_schema_dict()
schema = spec_dict['components']['schemas'].get(schema_name, {})
properties = {}
# Handle allOf (schema inheritance)
if 'allOf' in schema:
for item in schema['allOf']:
# Resolve $ref to parent schema
if '$ref' in item:
ref_path = item['$ref'].split('/')[-1]
ref_schema = spec_dict['components']['schemas'].get(ref_path, {})
properties.update(ref_schema.get('properties', {}))
# Add schema-specific properties
if 'properties' in item:
properties.update(item['properties'])
else:
# Direct properties (no inheritance)
properties = schema.get('properties', {})
return properties
@functools.cache
def get_watch_schema_properties():
"""
Extract watch schema properties from OpenAPI spec for Import endpoint.
Returns WatchBase properties (all writable Watch fields).
"""
return _resolve_schema_properties('WatchBase')
# Import readonly field utilities from shared module (avoids circular dependencies with model layer)
from changedetectionio.model.schema_utils import get_readonly_watch_fields, get_readonly_tag_fields
@functools.cache
def get_tag_schema_properties():
"""
Extract Tag schema properties from OpenAPI spec.
Returns WatchBase properties + Tag-specific properties (overrides_watch).
"""
return _resolve_schema_properties('Tag')
def validate_openapi_request(operation_id): def validate_openapi_request(operation_id):
"""Decorator to validate incoming requests against OpenAPI spec.""" """Decorator to validate incoming requests against OpenAPI spec."""
def decorator(f): def decorator(f):
@@ -72,8 +110,16 @@ def validate_openapi_request(operation_id):
if result.errors: if result.errors:
error_details = [] error_details = []
for error in result.errors: for error in result.errors:
error_details.append(str(error)) # Extract detailed schema errors from __cause__
raise BadRequest(f"OpenAPI validation failed: {error_details}") if hasattr(error, '__cause__') and hasattr(error.__cause__, 'schema_errors'):
for schema_error in error.__cause__.schema_errors:
field = '.'.join(str(p) for p in schema_error.path) if schema_error.path else 'body'
msg = schema_error.message if hasattr(schema_error, 'message') else str(schema_error)
error_details.append(f"{field}: {msg}")
else:
error_details.append(str(error))
logger.error(f"API Call - Validation failed: {'; '.join(error_details)}")
raise BadRequest(f"Validation failed: {'; '.join(error_details)}")
except BadRequest: except BadRequest:
# Re-raise BadRequest exceptions (validation failures) # Re-raise BadRequest exceptions (validation failures)
raise raise
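The `allOf` handling in `_resolve_schema_properties` is easier to follow against a small spec. The sketch below re-implements the same single-level resolution with the spec passed in as an argument, so it runs without `docs/api-spec.yaml`. The toy spec is an assumption, not the project's real schema.

```python
def resolve_schema_properties(spec_dict, schema_name):
    """Collect a schema's properties, following one level of allOf/$ref."""
    schemas = spec_dict['components']['schemas']
    schema = schemas.get(schema_name, {})

    if 'allOf' not in schema:
        # Direct properties (no inheritance)
        return dict(schema.get('properties', {}))

    properties = {}
    for item in schema['allOf']:
        if '$ref' in item:
            # '#/components/schemas/WatchBase' -> 'WatchBase'
            ref_name = item['$ref'].split('/')[-1]
            properties.update(schemas.get(ref_name, {}).get('properties', {}))
        if 'properties' in item:
            # Schema-specific properties are applied in allOf order
            properties.update(item['properties'])
    return properties


# Toy spec for illustration only
toy_spec = {
    'components': {
        'schemas': {
            'WatchBase': {'properties': {'title': {'type': 'string'}}},
            'Tag': {
                'allOf': [
                    {'$ref': '#/components/schemas/WatchBase'},
                    {'properties': {'overrides_watch': {'type': 'boolean'}}},
                ]
            },
        }
    }
}

tag_fields = sorted(resolve_schema_properties(toy_spec, 'Tag'))
base_fields = sorted(resolve_schema_properties(toy_spec, 'WatchBase'))
```

Like the code in the diff, this does not recurse: a referenced schema that itself uses `allOf` would contribute no properties.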
-162
@@ -1,162 +0,0 @@
# Responsible for building the storage dict into a set of rules ("JSON Schema") acceptable via the API
# Probably other ways to solve this when the backend switches to some ORM
from changedetectionio.notification import valid_notification_formats
def build_time_between_check_json_schema():
# Setup time between check schema
schema_properties_time_between_check = {
"type": "object",
"additionalProperties": False,
"properties": {}
}
for p in ['weeks', 'days', 'hours', 'minutes', 'seconds']:
schema_properties_time_between_check['properties'][p] = {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
]
}
return schema_properties_time_between_check
def build_watch_json_schema(d):
# Base JSON schema
schema = {
'type': 'object',
'properties': {},
}
for k, v in d.items():
# @todo 'integer' is not covered here because its almost always for internal usage
if isinstance(v, type(None)):
schema['properties'][k] = {
"anyOf": [
{"type": "null"},
]
}
elif isinstance(v, list):
schema['properties'][k] = {
"anyOf": [
{"type": "array",
# Always is an array of strings, like text or regex or something
"items": {
"type": "string",
"maxLength": 5000
}
},
]
}
elif isinstance(v, bool):
schema['properties'][k] = {
"anyOf": [
{"type": "boolean"},
]
}
elif isinstance(v, str):
schema['properties'][k] = {
"anyOf": [
{"type": "string",
"maxLength": 5000},
]
}
# Can also be a string (or None by default above)
for v in ['body',
'notification_body',
'notification_format',
'notification_title',
'proxy',
'tag',
'title',
'webdriver_js_execute_code'
]:
schema['properties'][v]['anyOf'].append({'type': 'string', "maxLength": 5000})
for v in ['last_viewed']:
schema['properties'][v] = {
"type": "integer",
"description": "Unix timestamp in seconds of the last time the watch was viewed.",
"minimum": 0
}
# None or Boolean
schema['properties']['track_ldjson_price_data']['anyOf'].append({'type': 'boolean'})
schema['properties']['method'] = {"type": "string",
"enum": ["GET", "POST", "DELETE", "PUT"]
}
schema['properties']['fetch_backend']['anyOf'].append({"type": "string",
"enum": ["html_requests", "html_webdriver"]
})
schema['properties']['processor'] = {"anyOf": [
{"type": "string", "enum": ["restock_diff", "text_json_diff"]},
{"type": "null"}
]}
# All headers must be key/value type dict
schema['properties']['headers'] = {
"type": "object",
"patternProperties": {
# Should always be a string:string type value
".*": {"type": "string"},
}
}
schema['properties']['notification_format'] = {'type': 'string',
'enum': list(valid_notification_formats.keys())
}
# Stuff that shouldn't be available but is just state-storage
for v in ['previous_md5', 'last_error', 'has_ldjson_price_data', 'previous_md5_before_filters', 'uuid']:
del schema['properties'][v]
schema['properties']['webdriver_delay']['anyOf'].append({'type': 'integer'})
schema['properties']['time_between_check'] = build_time_between_check_json_schema()
schema['properties']['time_between_check_use_default'] = {
"type": "boolean",
"default": True,
"description": "Whether to use global settings for time between checks - defaults to true if not set"
}
schema['properties']['browser_steps'] = {
"anyOf": [
{
"type": "array",
"items": {
"type": "object",
"properties": {
"operation": {
"type": ["string", "null"],
"maxLength": 5000 # Allows null and any string up to 5000 chars (including "")
},
"selector": {
"type": ["string", "null"],
"maxLength": 5000
},
"optional_value": {
"type": ["string", "null"],
"maxLength": 5000
}
},
"required": ["operation", "selector", "optional_value"],
"additionalProperties": False # No extra keys allowed
}
},
{"type": "null"}, # Allows null for `browser_steps`
{"type": "array", "maxItems": 0} # Allows empty array []
]
}
# headers ?
return schema
@@ -174,7 +174,7 @@ def construct_blueprint(datastore: ChangeDetectionStore):
browser_steps_blueprint = Blueprint('browser_steps', __name__, template_folder="templates") browser_steps_blueprint = Blueprint('browser_steps', __name__, template_folder="templates")
async def start_browsersteps_session(watch_uuid): async def start_browsersteps_session(watch_uuid):
from . import browser_steps from changedetectionio.browser_steps import browser_steps
import time import time
from playwright.async_api import async_playwright from playwright.async_api import async_playwright
@@ -238,7 +238,6 @@ def construct_blueprint(datastore: ChangeDetectionStore):
@browser_steps_blueprint.route("/browsersteps_start_session", methods=['GET']) @browser_steps_blueprint.route("/browsersteps_start_session", methods=['GET'])
def browsersteps_start_session(): def browsersteps_start_session():
# A new session was requested, return sessionID # A new session was requested, return sessionID
import asyncio
import uuid import uuid
browsersteps_session_id = str(uuid.uuid4()) browsersteps_session_id = str(uuid.uuid4())
watch_uuid = request.args.get('uuid') watch_uuid = request.args.get('uuid')
@@ -301,11 +300,10 @@ def construct_blueprint(datastore: ChangeDetectionStore):
@browser_steps_blueprint.route("/browsersteps_update", methods=['POST']) @browser_steps_blueprint.route("/browsersteps_update", methods=['POST'])
def browsersteps_ui_update(): def browsersteps_ui_update():
import base64 import base64
import playwright._impl._errors
from changedetectionio.blueprint.browser_steps import browser_steps
remaining =0 remaining = 0
uuid = request.args.get('uuid') uuid = request.args.get('uuid')
goto_website_url_first_step = request.args.get('goto_website_url_first_step')
browsersteps_session_id = request.args.get('browsersteps_session_id') browsersteps_session_id = request.args.get('browsersteps_session_id')
@@ -316,33 +314,33 @@ def construct_blueprint(datastore: ChangeDetectionStore):
return make_response('No session exists under that ID', 500) return make_response('No session exists under that ID', 500)
is_last_step = False is_last_step = False
# Actions - step/apply/etc, do the thing and return state
if request.method == 'POST': # @todo - should always be an existing session
# @todo - should always be an existing session if goto_website_url_first_step:
logger.debug("Going to site (requested automatically before stepping)..")
step_operation = "Goto site"
step_selector = None
step_optional_value = None
else:
step_operation = request.form.get('operation') step_operation = request.form.get('operation')
step_selector = request.form.get('selector') step_selector = request.form.get('selector')
step_optional_value = request.form.get('optional_value') step_optional_value = request.form.get('optional_value')
is_last_step = strtobool(request.form.get('is_last_step')) is_last_step = strtobool(request.form.get('is_last_step'))
try: try:
# Run the async call_action method in the dedicated browser steps event loop # Run the async call_action method in the dedicated browser steps event loop
run_async_in_browser_loop( run_async_in_browser_loop(
browsersteps_sessions[browsersteps_session_id]['browserstepper'].call_action( browsersteps_sessions[browsersteps_session_id]['browserstepper'].call_action(
action_name=step_operation, action_name=step_operation,
selector=step_selector, selector=step_selector,
optional_value=step_optional_value optional_value=step_optional_value
)
) )
)
except Exception as e: except Exception as e:
logger.error(f"Exception when calling step operation {step_operation} {str(e)}") logger.error(f"Exception when calling step operation {step_operation} {str(e)}")
# Try to find something of value to give back to the user # Try to find something of value to give back to the user
return make_response(str(e).splitlines()[0], 401) return make_response(str(e).splitlines()[0], 401)
# if not this_session.page:
# cleanup_playwright_session()
# return make_response('Browser session ran out of time :( Please reload this page.', 401)
# Screenshots and other info only needed on requesting a step (POST) # Screenshots and other info only needed on requesting a step (POST)
try: try:
@@ -350,7 +348,7 @@ def construct_blueprint(datastore: ChangeDetectionStore):
(screenshot, xpath_data) = run_async_in_browser_loop( (screenshot, xpath_data) = run_async_in_browser_loop(
browsersteps_sessions[browsersteps_session_id]['browserstepper'].get_current_state() browsersteps_sessions[browsersteps_session_id]['browserstepper'].get_current_state()
) )
if is_last_step: if is_last_step:
watch = datastore.data['watching'].get(uuid) watch = datastore.data['watching'].get(uuid)
u = browsersteps_sessions[browsersteps_session_id]['browserstepper'].page.url u = browsersteps_sessions[browsersteps_session_id]['browserstepper'].page.url
@@ -83,6 +83,10 @@ def construct_blueprint(datastore: ChangeDetectionStore):
datastore.data['settings']['requests'].update(form.data['requests']) datastore.data['settings']['requests'].update(form.data['requests'])
datastore.commit() datastore.commit()
# Clear all checksums to force reprocessing with new settings
# Global settings can affect watch behavior (filters, rendering, etc.)
datastore.clear_all_last_checksums()
# Adjust worker count if it changed # Adjust worker count if it changed
if new_worker_count != old_worker_count: if new_worker_count != old_worker_count:
from changedetectionio import worker_pool from changedetectionio import worker_pool
@@ -244,6 +244,12 @@ def construct_blueprint(datastore: ChangeDetectionStore):
tag.update(form.data) tag.update(form.data)
tag['processor'] = 'restock_diff' tag['processor'] = 'restock_diff'
tag.commit() tag.commit()
# Clear checksums for all watches using this tag to force reprocessing
# Tag changes affect inherited configuration
cleared_count = datastore.clear_checksums_for_tag(uuid)
logger.info(f"Tag {uuid} updated, cleared {cleared_count} watch checksums")
flash(gettext("Updated")) flash(gettext("Updated"))
return redirect(url_for('tags.tags_overview_page')) return redirect(url_for('tags.tags_overview_page'))
+51 -1
@@ -26,7 +26,7 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
# https://wtforms.readthedocs.io/en/3.0.x/forms/#wtforms.form.Form.populate_obj ? # https://wtforms.readthedocs.io/en/3.0.x/forms/#wtforms.form.Form.populate_obj ?
def edit_page(uuid): def edit_page(uuid):
from changedetectionio import forms from changedetectionio import forms
from changedetectionio.blueprint.browser_steps.browser_steps import browser_step_ui_config from changedetectionio.browser_steps.browser_steps import browser_step_ui_config
from changedetectionio import processors from changedetectionio import processors
import importlib import importlib
@@ -354,6 +354,56 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
# Return a 500 error # Return a 500 error
abort(500) abort(500)
@edit_blueprint.route("/edit/<string:uuid>/get-data-package", methods=['GET'])
@login_optionally_required
def watch_get_data_package(uuid):
"""Download all data for a single watch as a zip file"""
from io import BytesIO
from flask import send_file
import zipfile
from pathlib import Path
import datetime
watch = datastore.data['watching'].get(uuid)
if not watch:
abort(404)
# Create zip in memory
memory_file = BytesIO()
with zipfile.ZipFile(memory_file, 'w',
compression=zipfile.ZIP_DEFLATED,
compresslevel=8) as zipObj:
# Add the watch's JSON file if it exists
watch_json_path = os.path.join(watch.data_dir, 'watch.json')
if os.path.isfile(watch_json_path):
zipObj.write(watch_json_path,
arcname=os.path.join(uuid, 'watch.json'),
compress_type=zipfile.ZIP_DEFLATED,
compresslevel=8)
# Add all files in the watch data directory
if os.path.isdir(watch.data_dir):
for f in Path(watch.data_dir).glob('*'):
if f.is_file() and f.name != 'watch.json': # Skip watch.json since we already added it
zipObj.write(f,
arcname=os.path.join(uuid, f.name),
compress_type=zipfile.ZIP_DEFLATED,
compresslevel=8)
# Seek to beginning of file
memory_file.seek(0)
# Generate filename with timestamp
timestamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
filename = f"watch-data-{uuid[:8]}-{timestamp}.zip"
return send_file(memory_file,
as_attachment=True,
download_name=filename,
mimetype='application/zip')
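Review note: the in-memory zip technique used by watch_get_data_package() can be tried outside Flask with a sketch like the one below. The helper name build_watch_zip and its parameters are hypothetical and not part of this change. A single loop is used here instead of adding watch.json separately, which should yield the same set of archive members.

```python
import zipfile
from io import BytesIO
from pathlib import Path


def build_watch_zip(data_dir: str, uuid: str) -> BytesIO:
    """Hypothetical helper: zip every regular file in data_dir under a '<uuid>/' prefix."""
    memory_file = BytesIO()
    with zipfile.ZipFile(memory_file, 'w',
                         compression=zipfile.ZIP_DEFLATED,
                         compresslevel=8) as zip_obj:
        for f in sorted(Path(data_dir).glob('*')):
            # Sub-directories are skipped, as in the route above
            if f.is_file():
                zip_obj.write(f, arcname=f"{uuid}/{f.name}")
    # Rewind so a caller such as send_file() reads from the start
    memory_file.seek(0)
    return memory_file
```

The returned buffer could then be handed to flask.send_file() with as_attachment=True, as the route does.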
# Ajax callback # Ajax callback
@edit_blueprint.route("/edit/<string:uuid>/preview-rendered", methods=['POST']) @edit_blueprint.route("/edit/<string:uuid>/preview-rendered", methods=['POST'])
@login_optionally_required @login_optionally_required
@@ -488,6 +488,7 @@ Math: {{ 1 + 1 }}") }}
{% if watch.history_n %} {% if watch.history_n %}
<p> <p>
<a href="{{url_for('ui.ui_edit.watch_get_latest_html', uuid=uuid)}}" class="pure-button button-small">{{ _('Download latest HTML snapshot') }}</a> <a href="{{url_for('ui.ui_edit.watch_get_latest_html', uuid=uuid)}}" class="pure-button button-small">{{ _('Download latest HTML snapshot') }}</a>
<a href="{{url_for('ui.ui_edit.watch_get_data_package', uuid=uuid)}}" class="pure-button button-small">{{ _('Download watch data package') }}</a>
</p> </p>
{% endif %} {% endif %}
@@ -8,6 +8,17 @@ from changedetectionio.content_fetchers import SCREENSHOT_MAX_HEIGHT_DEFAULT
from changedetectionio.content_fetchers.base import manage_user_agent from changedetectionio.content_fetchers.base import manage_user_agent
from changedetectionio.jinja2_custom import render as jinja_render from changedetectionio.jinja2_custom import render as jinja_render
def browser_steps_get_valid_steps(browser_steps: list):
if browser_steps is not None and len(browser_steps):
valid_steps = list(filter(
lambda s: (s['operation'] and len(s['operation']) and s['operation'] != 'Choose one'),browser_steps))
# Just in case they selected Goto site by accident with older JS
if valid_steps and valid_steps[0]['operation'] == 'Goto site':
del(valid_steps[0])
return valid_steps
return []
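Review note: a standalone sketch of the filtering rules in browser_steps_get_valid_steps(), useful for checking the expected behaviour by hand. The function name get_valid_steps and the sample steps are made up for illustration. Unlike the code above, this stand-in uses .get() so a step without an 'operation' key is dropped rather than raising KeyError.

```python
def get_valid_steps(browser_steps):
    """Stand-in for browser_steps_get_valid_steps(): drop empty and placeholder steps."""
    if not browser_steps:
        return []
    valid = [s for s in browser_steps
             if s.get('operation') and s['operation'] != 'Choose one']
    # A leading 'Goto site' is implied by the watch URL, so it is dropped
    if valid and valid[0]['operation'] == 'Goto site':
        valid = valid[1:]
    return valid


steps = [
    {'operation': 'Goto site'},
    {'operation': ''},
    {'operation': 'Choose one'},
    {'operation': 'Click element', 'selector': '#accept'},
]
print(get_valid_steps(steps))
```

Returning an empty list instead of None (the old Fetcher method returned None) lets callers iterate the result without a separate None check.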
# Two flags, tell the JS which of the "Selector" or "Value" field should be enabled in the front end # Two flags, tell the JS which of the "Selector" or "Value" field should be enabled in the front end
+3 -18
@@ -38,7 +38,6 @@ def manage_user_agent(headers, current_ua=''):
return None return None
class Fetcher(): class Fetcher():
browser_connection_is_custom = None browser_connection_is_custom = None
browser_connection_url = None browser_connection_url = None
@@ -163,30 +162,16 @@ class Fetcher():
""" """
return {k.lower(): v for k, v in self.headers.items()} return {k.lower(): v for k, v in self.headers.items()}
def browser_steps_get_valid_steps(self):
if self.browser_steps is not None and len(self.browser_steps):
valid_steps = list(filter(
lambda s: (s['operation'] and len(s['operation']) and s['operation'] != 'Choose one'),
self.browser_steps))
# Just in case they selected Goto site by accident with older JS
if valid_steps and valid_steps[0]['operation'] == 'Goto site':
del(valid_steps[0])
return valid_steps
return None
async def iterate_browser_steps(self, start_url=None): async def iterate_browser_steps(self, start_url=None):
from changedetectionio.blueprint.browser_steps.browser_steps import steppable_browser_interface from changedetectionio.browser_steps.browser_steps import steppable_browser_interface, browser_steps_get_valid_steps
from playwright._impl._errors import TimeoutError, Error from playwright._impl._errors import TimeoutError, Error
from changedetectionio.jinja2_custom import render as jinja_render from changedetectionio.jinja2_custom import render as jinja_render
step_n = 0 step_n = 0
if self.browser_steps is not None and len(self.browser_steps): if self.browser_steps:
interface = steppable_browser_interface(start_url=start_url) interface = steppable_browser_interface(start_url=start_url)
interface.page = self.page interface.page = self.page
valid_steps = self.browser_steps_get_valid_steps() valid_steps = browser_steps_get_valid_steps(self.browser_steps)
for step in valid_steps: for step in valid_steps:
step_n += 1 step_n += 1
@@ -295,7 +295,7 @@ class fetcher(Fetcher):
self.page.on("console", lambda msg: logger.debug(f"Playwright console: Watch URL: {url} {msg.type}: {msg.text} {msg.args}")) self.page.on("console", lambda msg: logger.debug(f"Playwright console: Watch URL: {url} {msg.type}: {msg.text} {msg.args}"))
# Re-use as much code from browser steps as possible so its the same # Re-use as much code from browser steps as possible so its the same
from changedetectionio.blueprint.browser_steps.browser_steps import steppable_browser_interface from changedetectionio.browser_steps.browser_steps import steppable_browser_interface
browsersteps_interface = steppable_browser_interface(start_url=url) browsersteps_interface = steppable_browser_interface(start_url=url)
browsersteps_interface.page = self.page browsersteps_interface.page = self.page
@@ -362,7 +362,7 @@ class fetcher(Fetcher):
# Wrap remaining operations in try/finally to ensure cleanup # Wrap remaining operations in try/finally to ensure cleanup
try: try:
# Run Browser Steps here # Run Browser Steps here
if self.browser_steps_get_valid_steps(): if self.browser_steps:
try: try:
await self.iterate_browser_steps(start_url=url) await self.iterate_browser_steps(start_url=url)
except BrowserStepsStepException: except BrowserStepsStepException:
@@ -456,7 +456,7 @@ class fetcher(Fetcher):
# Run Browser Steps here # Run Browser Steps here
# @todo not yet supported, we switch to playwright in this case # @todo not yet supported, we switch to playwright in this case
# if self.browser_steps_get_valid_steps(): # if self.browser_steps:
# self.iterate_browser_steps() # self.iterate_browser_steps()
@@ -3,7 +3,7 @@ import hashlib
import os import os
import re import re
import asyncio import asyncio
from functools import partial
from changedetectionio import strtobool from changedetectionio import strtobool
from changedetectionio.content_fetchers.exceptions import BrowserStepsInUnsupportedFetcher, EmptyReply, Non200ErrorCodeReceived from changedetectionio.content_fetchers.exceptions import BrowserStepsInUnsupportedFetcher, EmptyReply, Non200ErrorCodeReceived
from changedetectionio.content_fetchers.base import Fetcher from changedetectionio.content_fetchers.base import Fetcher
@@ -36,7 +36,7 @@ class fetcher(Fetcher):
import requests import requests
from requests.exceptions import ProxyError, ConnectionError, RequestException from requests.exceptions import ProxyError, ConnectionError, RequestException
if self.browser_steps_get_valid_steps(): if self.browser_steps:
raise BrowserStepsInUnsupportedFetcher(url=url) raise BrowserStepsInUnsupportedFetcher(url=url)
proxies = {} proxies = {}
@@ -184,7 +184,6 @@ class fetcher(Fetcher):
) )
async def quit(self, watch=None): async def quit(self, watch=None):
# In case they switched to `requests` fetcher from something else # In case they switched to `requests` fetcher from something else
# Then the screenshot could be old, in any case, it's not used here. # Then the screenshot could be old, in any case, it's not used here.
# REMOVE_REQUESTS_OLD_SCREENSHOTS - Mainly used for testing # REMOVE_REQUESTS_OLD_SCREENSHOTS - Mainly used for testing
+18 -8
@@ -70,13 +70,17 @@ socketio_server = None
# Enable CORS, especially useful for the Chrome extension to operate from anywhere # Enable CORS, especially useful for the Chrome extension to operate from anywhere
CORS(app) CORS(app)
# Super handy for compressing large BrowserSteps responses and others # Flask-Compress handles HTTP compression, Socket.IO compression disabled to prevent memory leak.
# Flask-Compress handles HTTP compression, Socket.IO compression disabled to prevent memory leak # There's also a bug between flask compress and socketio that causes some kind of slow memory leak
# It's better to use compression on your reverse proxy (nginx etc) instead.
if strtobool(os.getenv("FLASK_ENABLE_COMPRESSION")):
app.config['COMPRESS_MIN_SIZE'] = 2096
app.config['COMPRESS_MIMETYPES'] = ['text/html', 'text/css', 'text/javascript', 'application/json', 'application/javascript', 'image/svg+xml']
# Use gzip only - smaller memory footprint than zstd/brotli (4-8KB vs 200-500KB contexts)
app.config['COMPRESS_ALGORITHM'] = ['gzip']
compress = FlaskCompress() compress = FlaskCompress()
app.config['COMPRESS_MIN_SIZE'] = 2096
app.config['COMPRESS_MIMETYPES'] = ['text/html', 'text/css', 'text/javascript', 'application/json', 'application/javascript', 'image/svg+xml']
# Use gzip only - smaller memory footprint than zstd/brotli (4-8KB vs 200-500KB contexts)
app.config['COMPRESS_ALGORITHM'] = ['gzip']
compress.init_app(app) compress.init_app(app)
app.config['TEMPLATES_AUTO_RELOAD'] = False app.config['TEMPLATES_AUTO_RELOAD'] = False
@@ -708,8 +712,14 @@ def changedetection_app(config=None, datastore_o=None):
def static_content(group, filename): def static_content(group, filename):
from flask import make_response from flask import make_response
import re import re
group = re.sub(r'[^\w.-]+', '', group.lower())
filename = re.sub(r'[^\w.-]+', '', filename.lower()) # Strict sanitization: only allow a-z, 0-9, underscore and hyphen (blocks .. and other traversal)
group = re.sub(r'[^a-z0-9_-]+', '', group.lower())
filename = filename
# Additional safety: reject if sanitization resulted in empty strings
if not group or not filename:
abort(404)
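Review note: a quick check of what the new group pattern keeps and strips. The helper name sanitize_group is hypothetical. Note that filename is passed through unchanged in this hunk, so any traversal protection for it has to come from the rest of the handler, which is not shown here.

```python
import re


def sanitize_group(group: str) -> str:
    """Hypothetical wrapper around the new pattern: keep only a-z, 0-9, '_' and '-'."""
    return re.sub(r'[^a-z0-9_-]+', '', group.lower())


for candidate in ('Screenshot', '../../etc', '..', 'static-images_1'):
    print(repr(candidate), '->', repr(sanitize_group(candidate)))
```

An input made up only of dots and slashes collapses to an empty string, which is the case the added `if not group or not filename` guard turns into a 404.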
if group == 'screenshot': if group == 'screenshot':
# Could be sensitive, follow password requirements # Could be sensitive, follow password requirements
+2 -6
@@ -7,8 +7,6 @@ from flask_babel import lazy_gettext as _l, gettext
from changedetectionio.blueprint.rss import RSS_FORMAT_TYPES, RSS_TEMPLATE_TYPE_OPTIONS, RSS_TEMPLATE_HTML_DEFAULT from changedetectionio.blueprint.rss import RSS_FORMAT_TYPES, RSS_TEMPLATE_TYPE_OPTIONS, RSS_TEMPLATE_HTML_DEFAULT
from changedetectionio.conditions.form import ConditionFormRow from changedetectionio.conditions.form import ConditionFormRow
from changedetectionio.notification_service import NotificationContextData from changedetectionio.notification_service import NotificationContextData
from changedetectionio.processors.image_ssim_diff import SCREENSHOT_COMPARISON_THRESHOLD_OPTIONS, \
SCREENSHOT_COMPARISON_THRESHOLD_OPTIONS_DEFAULT
from changedetectionio.strtobool import strtobool from changedetectionio.strtobool import strtobool
from changedetectionio import processors from changedetectionio import processors
@@ -37,7 +35,7 @@ from changedetectionio.widgets import TernaryNoneBooleanField
# default # default
# each select <option data-enabled="enabled-0-0" # each select <option data-enabled="enabled-0-0"
from changedetectionio.blueprint.browser_steps.browser_steps import browser_step_ui_config from changedetectionio.browser_steps.browser_steps import browser_step_ui_config
from changedetectionio import html_tools, content_fetchers from changedetectionio import html_tools, content_fetchers
@@ -494,7 +492,6 @@ class ValidateJinja2Template(object):
Validates that a {token} is from a valid set Validates that a {token} is from a valid set
""" """
def __call__(self, form, field): def __call__(self, form, field):
from changedetectionio import notification
from changedetectionio.jinja2_custom import create_jinja_env from changedetectionio.jinja2_custom import create_jinja_env
from jinja2 import BaseLoader, TemplateSyntaxError, UndefinedError from jinja2 import BaseLoader, TemplateSyntaxError, UndefinedError
from jinja2.meta import find_undeclared_variables from jinja2.meta import find_undeclared_variables
@@ -820,8 +817,7 @@ class processor_text_json_diff_form(commonSettingsForm):
filter_text_removed = BooleanField(_l('Removed lines'), default=True) filter_text_removed = BooleanField(_l('Removed lines'), default=True)
trigger_text = StringListField(_l('Keyword triggers - Trigger/wait for text'), [validators.Optional(), ValidateListRegex()]) trigger_text = StringListField(_l('Keyword triggers - Trigger/wait for text'), [validators.Optional(), ValidateListRegex()])
if os.getenv("PLAYWRIGHT_DRIVER_URL"): browser_steps = FieldList(FormField(SingleBrowserStep), min_entries=10)
browser_steps = FieldList(FormField(SingleBrowserStep), min_entries=10)
text_should_not_be_present = StringListField(_l('Block change-detection while text matches'), [validators.Optional(), ValidateListRegex()]) text_should_not_be_present = StringListField(_l('Block change-detection while text matches'), [validators.Optional(), ValidateListRegex()])
webdriver_js_execute_code = TextAreaField(_l('Execute JavaScript before change detection'), render_kw={"rows": "5"}, validators=[validators.Optional()]) webdriver_js_execute_code = TextAreaField(_l('Execute JavaScript before change detection'), render_kw={"rows": "5"}, validators=[validators.Optional()])
+10 -5
@@ -335,7 +335,6 @@ class model(EntityPersistenceMixin, watch_base):
'last_notification_error': False, 'last_notification_error': False,
'last_viewed': 0, 'last_viewed': 0,
'previous_md5': False, 'previous_md5': False,
'previous_md5_before_filters': False,
'remote_server_reply': None, 'remote_server_reply': None,
'track_ldjson_price_data': None 'track_ldjson_price_data': None
}) })
@@ -386,10 +385,16 @@ class model(EntityPersistenceMixin, watch_base):
@property @property
def is_pdf(self): def is_pdf(self):
# content_type field is set in the future url = str(self.get("url") or "").lower()
# https://github.com/dgtlmoon/changedetection.io/issues/1392 content_type = str(self.get("content-type") or "").lower()
# Not sure the best logic here
return self.get('url', '').lower().endswith('.pdf') or 'pdf' in self.get('content_type', '').lower() if content_type in ("none", "null", ""):
content_type = ""
return (
url.endswith(".pdf")
or content_type.split(";")[0].strip() == "application/pdf"
)
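Review note: the new is_pdf logic restated as a free function so the edge cases can be exercised directly. The name looks_like_pdf and the sample URLs are assumptions for illustration. The old check matched any content type containing "pdf"; this one compares the media type before any ";" parameters against "application/pdf".

```python
def looks_like_pdf(url, content_type):
    """Sketch of the is_pdf property: URL suffix or exact application/pdf media type."""
    url = str(url or "").lower()
    content_type = str(content_type or "").lower()
    # Stringified placeholders such as "None" count as "no content type"
    if content_type in ("none", "null", ""):
        content_type = ""
    return (
        url.endswith(".pdf")
        or content_type.split(";")[0].strip() == "application/pdf"
    )


print(looks_like_pdf("https://example.com/REPORT.PDF", None))
print(looks_like_pdf("https://example.com/doc", "application/pdf; charset=binary"))
print(looks_like_pdf("https://example.com/pdf-guide", "text/html"))
```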
@property @property
def label(self): def label(self):
+156 -3
@@ -6,6 +6,8 @@ from .persistence import EntityPersistenceMixin, _determine_entity_type
__all__ = ['EntityPersistenceMixin', 'watch_base'] __all__ = ['EntityPersistenceMixin', 'watch_base']
from ..browser_steps.browser_steps import browser_steps_get_valid_steps
USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH = 'System default' USE_SYSTEM_DEFAULT_NOTIFICATION_FORMAT_FOR_WATCH = 'System default'
CONDITIONS_MATCH_LOGIC_DEFAULT = 'ALL' CONDITIONS_MATCH_LOGIC_DEFAULT = 'ALL'
@@ -26,6 +28,7 @@ class watch_base(dict):
- Configuration override chain resolution (Watch Tag Global) - Configuration override chain resolution (Watch Tag Global)
- Immutability options - Immutability options
- Better testing - Better testing
- USE https://docs.pydantic.dev/latest/integrations/datamodel_code_generator TO BUILD THE MODEL FROM THE API-SPEC!!!
CHAIN RESOLUTION ARCHITECTURE: CHAIN RESOLUTION ARCHITECTURE:
The dream is a 3-level override hierarchy: The dream is a 3-level override hierarchy:
@@ -128,7 +131,6 @@ class watch_base(dict):
fetch_time (float): Duration of last fetch in seconds fetch_time (float): Duration of last fetch in seconds
consecutive_filter_failures (int): Counter for consecutive filter match failures consecutive_filter_failures (int): Counter for consecutive filter match failures
previous_md5 (str|bool): MD5 hash of previous content previous_md5 (str|bool): MD5 hash of previous content
previous_md5_before_filters (str|bool): MD5 hash before filters applied
history_snapshot_max_length (int|None): Max history snapshots to keep (None = use global) history_snapshot_max_length (int|None): Max history snapshots to keep (None = use global)
Conditions: Conditions:
@@ -165,6 +167,10 @@ class watch_base(dict):
if kw.get('datastore_path'): if kw.get('datastore_path'):
del kw['datastore_path'] del kw['datastore_path']
# IMPORTANT: Don't initialize __watch_was_edited yet!
# We'll initialize it AFTER the initial update() call below
# This prevents marking the watch as edited during initialization
self.update({ self.update({
# Custom notification content # Custom notification content
# Re #110, so then if this is set to None, we know to use the default value instead # Re #110, so then if this is set to None, we know to use the default value instead
@@ -173,7 +179,7 @@ class watch_base(dict):
'body': None, 'body': None,
'browser_steps': [], 'browser_steps': [],
'browser_steps_last_error_step': None, 'browser_steps_last_error_step': None,
'conditions' : {}, 'conditions' : [],
'conditions_match_logic': CONDITIONS_MATCH_LOGIC_DEFAULT, 'conditions_match_logic': CONDITIONS_MATCH_LOGIC_DEFAULT,
'check_count': 0, 'check_count': 0,
'check_unique_lines': False, # On change-detected, compare against all history if its something new 'check_unique_lines': False, # On change-detected, compare against all history if its something new
@@ -210,7 +216,6 @@ class watch_base(dict):
'page_title': None, # <title> from the page 'page_title': None, # <title> from the page
'paused': False, 'paused': False,
'previous_md5': False, 'previous_md5': False,
'previous_md5_before_filters': False, # Used for skipping changedetection entirely
'processor': 'text_json_diff', # could be restock_diff or others from .processors 'processor': 'text_json_diff', # could be restock_diff or others from .processors
'price_change_threshold_percent': None, 'price_change_threshold_percent': None,
'proxy': None, # Preferred proxy connection 'proxy': None, # Preferred proxy connection
@@ -296,9 +301,157 @@ class watch_base(dict):
super(watch_base, self).__init__(*arg, **kw) super(watch_base, self).__init__(*arg, **kw)
# Check if we're being initialized from an existing watch object
# that has was_edited=True, so we can preserve the flag
preserve_edited_flag = False
if self.get('default'): if self.get('default'):
# When creating a new watch object from an existing one (e.g., changing processor),
# preserve the was_edited flag if it was True
default_watch = self.get('default')
if hasattr(default_watch, 'was_edited') and default_watch.was_edited:
preserve_edited_flag = True
del self['default'] del self['default']
# NOW initialize the edited flag after all initial setup is complete
# This ensures initialization doesn't trigger the edited flag
# But preserve it if the source watch had it set to True
self.__watch_was_edited = preserve_edited_flag
def _mark_field_as_edited(self, key):
"""
Helper to mark a field as edited if it's writable.
Internal method used by __setitem__, update(), pop(), etc.
"""
# Don't track edits during initial load or if already edited
if not hasattr(self, '_watch_base__watch_was_edited'):
return
if self.__watch_was_edited:
return # Already marked as edited
# Import from shared schema utilities (no circular dependency)
from .schema_utils import get_readonly_watch_fields
readonly_fields = get_readonly_watch_fields()
# Additional system-managed fields not in OpenAPI spec (yet)
# These are set by processors/workers and should not trigger edited flag
additional_system_fields = {
'last_check_status', # Set by processors
'restock', # Set by restock processor
'last_viewed', # Set by mark_all_viewed endpoint
}
# Only mark as edited if this is a user-writable field
if key not in readonly_fields and key not in additional_system_fields:
self.__watch_was_edited = True
def __setitem__(self, key, value):
"""
Override dict.__setitem__ to track when writable watch fields are modified.
This enables skipping reprocessing when:
1. HTML content is unchanged (checksumFromPreviousCheckWasTheSame)
2. AND watch configuration was not edited
Only sets the edited flag when field is NOT in readonly_fields (from OpenAPI spec).
"""
# Set the value first (always)
super().__setitem__(key, value)
# Mark as edited if writable field
self._mark_field_as_edited(key)
def __delitem__(self, key):
"""Override dict.__delitem__ to track deletions of writable fields."""
super().__delitem__(key)
self._mark_field_as_edited(key)
def update(self, *args, **kwargs):
"""Override dict.update() to track modifications to writable fields."""
# Strip empty/placeholder browser steps before storing them
if args and args[0].get('browser_steps'):
args[0]['browser_steps'] = browser_steps_get_valid_steps(args[0].get('browser_steps'))
# Call parent update first
super().update(*args, **kwargs)
# Mark as edited for any writable fields that were updated
# Handle both update(dict) and update(key=value) forms
if args:
for key in args[0].keys():
self._mark_field_as_edited(key)
for key in kwargs.keys():
self._mark_field_as_edited(key)
def pop(self, key, *args):
"""Override dict.pop() to track removal of writable fields."""
result = super().pop(key, *args)
self._mark_field_as_edited(key)
return result
def setdefault(self, key, default=None):
"""Override dict.setdefault() to track modifications to writable fields."""
# Only marks as edited if key didn't exist (i.e., a new value was set)
existed = key in self
result = super().setdefault(key, default)
if not existed:
self._mark_field_as_edited(key)
return result
@property
def was_edited(self):
"""
Check if watch configuration was edited since last processing.
Returns:
bool: True if writable fields were modified, False otherwise
"""
return getattr(self, '_watch_base__watch_was_edited', False)
def reset_watch_edited_flag(self):
"""
Reset the watch edited flag after successful processing.
Call this after processing completes to allow future content-only change detection.
"""
self.__watch_was_edited = False
@classmethod
def get_property_names(cls):
"""
Get all @property attribute names from this model class using introspection.
This discovers computed/derived properties that are not stored in the datastore.
These properties should be filtered out during PUT/POST requests.
Returns:
frozenset: Immutable set of @property attribute names from the model class
"""
import functools
# Create a cached version if it doesn't exist
if not hasattr(cls, '_cached_get_property_names'):
@functools.cache
def _get_props():
properties = set()
# Use introspection to find all @property attributes
for name in dir(cls):
# Skip private/magic attributes
if name.startswith('_'):
continue
try:
attr = getattr(cls, name)
# Check if it's a property descriptor
if isinstance(attr, property):
properties.add(name)
except (AttributeError, TypeError):
continue
return frozenset(properties)
cls._cached_get_property_names = _get_props
return cls._cached_get_property_names()
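Review note: the edit-tracking idea in watch_base, reduced to a minimal dict subclass. TrackedDict, its READONLY set and the field names are hypothetical stand-ins; the real code reads the read-only fields from the OpenAPI spec and also overrides __delitem__, pop() and setdefault().

```python
class TrackedDict(dict):
    """Minimal sketch: remember whether any writable key was modified."""

    # Hypothetical stand-in for get_readonly_watch_fields()
    READONLY = frozenset({'uuid', 'date_created', 'last_checked'})

    def __init__(self, *args, **kwargs):
        # dict.__init__ does not route through our overrides, so the
        # initial load never sets the flag
        super().__init__(*args, **kwargs)
        self._edited = False

    def _mark(self, key):
        if key not in self.READONLY:
            self._edited = True

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._mark(key)

    def update(self, *args, **kwargs):
        super().update(*args, **kwargs)
        for key in dict(*args, **kwargs):
            self._mark(key)

    @property
    def was_edited(self):
        return self._edited

    def reset_edited(self):
        self._edited = False


w = TrackedDict(url='https://example.com')
w['last_checked'] = 123          # system-managed, flag stays False
w['url'] = 'https://example.org'  # user-writable, flag becomes True
print(w.was_edited)
```

A worker could call reset_edited() after a successful run, mirroring reset_watch_edited_flag() above.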
def __deepcopy__(self, memo): def __deepcopy__(self, memo):
""" """
Custom deepcopy for all watch_base subclasses (Watch, Tag, etc.). Custom deepcopy for all watch_base subclasses (Watch, Tag, etc.).
+92
@@ -0,0 +1,92 @@
"""
Schema utilities for Watch and Tag models.
Provides functions to extract readonly fields and properties from OpenAPI spec.
Shared by both the model layer and API layer to avoid circular dependencies.
"""
import functools
@functools.cache
def get_openapi_schema_dict():
"""
Get the raw OpenAPI spec dictionary for schema access.
Returns the YAML dict directly (not the OpenAPI object).
"""
import os
import yaml
spec_path = os.path.join(os.path.dirname(__file__), '../../docs/api-spec.yaml')
if not os.path.exists(spec_path):
spec_path = os.path.join(os.path.dirname(__file__), '../docs/api-spec.yaml')
with open(spec_path, 'r', encoding='utf-8') as f:
return yaml.safe_load(f)
@functools.cache
def _resolve_readonly_fields(schema_name):
"""
Generic helper to resolve readOnly fields, including allOf inheritance.
Args:
schema_name: Name of the schema (e.g., 'Watch', 'Tag')
Returns:
frozenset: All readOnly field names including inherited ones
"""
spec_dict = get_openapi_schema_dict()
schema = spec_dict['components']['schemas'].get(schema_name, {})
readonly_fields = set()
# Handle allOf (schema inheritance)
if 'allOf' in schema:
for item in schema['allOf']:
# Resolve $ref to parent schema
if '$ref' in item:
ref_path = item['$ref'].split('/')[-1]
ref_schema = spec_dict['components']['schemas'].get(ref_path, {})
if 'properties' in ref_schema:
for field_name, field_def in ref_schema['properties'].items():
if field_def.get('readOnly') is True:
readonly_fields.add(field_name)
# Check schema-specific properties
if 'properties' in item:
for field_name, field_def in item['properties'].items():
if field_def.get('readOnly') is True:
readonly_fields.add(field_name)
else:
# Direct properties (no inheritance)
if 'properties' in schema:
for field_name, field_def in schema['properties'].items():
if field_def.get('readOnly') is True:
readonly_fields.add(field_name)
return frozenset(readonly_fields)
@functools.cache
def get_readonly_watch_fields():
"""
Extract readOnly field names from Watch schema in OpenAPI spec.
Returns readOnly fields from WatchBase (uuid, date_created) + Watch-specific readOnly fields.
Used by:
- model/watch_base.py: Track when writable fields are edited
- api/Watch.py: Filter readonly fields from PUT requests
"""
return _resolve_readonly_fields('Watch')
@functools.cache
def get_readonly_tag_fields():
"""
Extract readOnly field names from Tag schema in OpenAPI spec.
Returns readOnly fields from WatchBase (uuid, date_created) + Tag-specific readOnly fields.
"""
return _resolve_readonly_fields('Tag')
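Review note: the allOf/$ref resolution in _resolve_readonly_fields() can be exercised against an inline spec rather than docs/api-spec.yaml. The SPEC dictionary, its field names, and the function names below are invented for illustration and only mimic the shape the code above expects.

```python
def readonly_names(properties):
    """Names of properties flagged readOnly: true."""
    return {name for name, definition in properties.items()
            if definition.get('readOnly') is True}


def resolve_readonly_fields(spec, schema_name):
    """Sketch: collect readOnly fields, following one level of allOf/$ref."""
    schemas = spec['components']['schemas']
    schema = schemas.get(schema_name, {})
    found = set()
    # Without allOf, treat the schema itself as the only item
    for item in schema.get('allOf', [schema]):
        if '$ref' in item:
            parent = schemas.get(item['$ref'].split('/')[-1], {})
            found |= readonly_names(parent.get('properties', {}))
        found |= readonly_names(item.get('properties', {}))
    return frozenset(found)


# Invented spec fragment in the shape of components/schemas
SPEC = {'components': {'schemas': {
    'WatchBase': {'properties': {
        'uuid': {'type': 'string', 'readOnly': True},
        'title': {'type': 'string'},
    }},
    'Watch': {'allOf': [
        {'$ref': '#/components/schemas/WatchBase'},
        {'properties': {
            'last_checked': {'type': 'integer', 'readOnly': True},
            'url': {'type': 'string'},
        }},
    ]},
}}}

print(sorted(resolve_readonly_fields(SPEC, 'Watch')))
```

As in the code above, only one level of $ref is followed; a parent schema that itself uses allOf would not be expanded.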
+24 -7
@@ -1,6 +1,6 @@
from functools import lru_cache from functools import lru_cache
from loguru import logger from loguru import logger
from flask_babel import gettext from flask_babel import gettext, get_locale
import importlib import importlib
import inspect import inspect
import os import os
@@ -190,14 +190,15 @@ def get_plugin_processor_metadata():
logger.warning(f"Error getting plugin processor metadata: {e}") logger.warning(f"Error getting plugin processor metadata: {e}")
return metadata return metadata
@lru_cache(maxsize=32)
def available_processors(): def _available_processors_cached(locale_str):
"""
Get a list of processors by name and description for the UI elements.
Can be filtered via DISABLED_PROCESSORS environment variable (comma-separated list).
:return: A list :)
""" """
Internal cached function that includes locale in cache key.
This ensures translations are cached per-language instead of globally.
:param locale_str: The locale string (e.g., 'en', 'it', 'zh')
:return: A list of tuples (processor_name, translated_description, weight)
"""
processor_classes = find_processors() processor_classes = find_processors()
# Check if DISABLED_PROCESSORS env var is set # Check if DISABLED_PROCESSORS env var is set
@@ -256,6 +257,22 @@ def available_processors():
# Return as tuples without weight (for backwards compatibility) # Return as tuples without weight (for backwards compatibility)
return [(name, desc) for name, desc, weight in available] return [(name, desc) for name, desc, weight in available]
def available_processors():
"""
Get a list of processors by name and description for the UI elements.
Can be filtered via DISABLED_PROCESSORS environment variable (comma-separated list).
This function delegates to a locale-aware cached version to ensure translations
are cached per-language instead of globally.
:return: A list of tuples (processor_name, translated_description)
"""
# Get current locale and use it as cache key
# Convert Babel Locale object to string for use as cache key
locale = get_locale()
locale_str = str(locale) if locale else 'en'
return _available_processors_cached(locale_str)
def get_default_processor(): def get_default_processor():
""" """
+71 -2
@@ -1,5 +1,7 @@
import re import re
import hashlib import hashlib
from changedetectionio.browser_steps.browser_steps import browser_steps_get_valid_steps
from changedetectionio.content_fetchers.base import Fetcher from changedetectionio.content_fetchers.base import Fetcher
from changedetectionio.strtobool import strtobool from changedetectionio.strtobool import strtobool
from copy import deepcopy from copy import deepcopy
@@ -19,6 +21,7 @@ class difference_detection_processor():
xpath_data = None xpath_data = None
preferred_proxy = None preferred_proxy = None
screenshot_format = SCREENSHOT_FORMAT_JPEG screenshot_format = SCREENSHOT_FORMAT_JPEG
last_raw_content_checksum = None
def __init__(self, datastore, watch_uuid): def __init__(self, datastore, watch_uuid):
self.datastore = datastore self.datastore = datastore
@@ -34,6 +37,64 @@ class difference_detection_processor():
# Generic fetcher that should be extended (requests, playwright etc) # Generic fetcher that should be extended (requests, playwright etc)
self.fetcher = Fetcher() self.fetcher = Fetcher()
# Load the last raw content checksum from file
self.read_last_raw_content_checksum()
def update_last_raw_content_checksum(self, checksum):
"""
Save the raw content MD5 checksum to file.
This is used for skip logic - avoid reprocessing if raw HTML unchanged.
"""
if not checksum:
return
watch = self.datastore.data['watching'].get(self.watch_uuid)
if not watch:
return
data_dir = watch.data_dir
if not data_dir:
return
watch.ensure_data_dir_exists()
checksum_file = os.path.join(data_dir, 'last-checksum.txt')
try:
with open(checksum_file, 'w', encoding='utf-8') as f:
f.write(checksum)
self.last_raw_content_checksum = checksum
except IOError as e:
logger.warning(f"Failed to write checksum file for {self.watch_uuid}: {e}")
def read_last_raw_content_checksum(self):
"""
Read the last raw content MD5 checksum from file.
Returns None if file doesn't exist (first run) or can't be read.
"""
watch = self.datastore.data['watching'].get(self.watch_uuid)
if not watch:
self.last_raw_content_checksum = None
return
data_dir = watch.data_dir
if not data_dir:
self.last_raw_content_checksum = None
return
checksum_file = os.path.join(data_dir, 'last-checksum.txt')
if not os.path.isfile(checksum_file):
self.last_raw_content_checksum = None
return
try:
with open(checksum_file, 'r', encoding='utf-8') as f:
self.last_raw_content_checksum = f.read().strip()
except IOError as e:
logger.warning(f"Failed to read checksum file for {self.watch_uuid}: {e}")
self.last_raw_content_checksum = None
async def call_browser(self, preferred_proxy_id=None): async def call_browser(self, preferred_proxy_id=None):
from requests.structures import CaseInsensitiveDict from requests.structures import CaseInsensitiveDict
@@ -110,7 +171,7 @@ class difference_detection_processor():
) )
if self.watch.has_browser_steps: if self.watch.has_browser_steps:
-self.fetcher.browser_steps = self.watch.get('browser_steps', [])
+self.fetcher.browser_steps = browser_steps_get_valid_steps(self.watch.get('browser_steps', []))
self.fetcher.browser_steps_screenshot_path = os.path.join(self.datastore.datastore_path, self.watch.get('uuid')) self.fetcher.browser_steps_screenshot_path = os.path.join(self.datastore.datastore_path, self.watch.get('uuid'))
# Tweak the base config with the per-watch ones # Tweak the base config with the per-watch ones
@@ -257,8 +318,16 @@ class difference_detection_processor():
except IOError as e: except IOError as e:
logger.error(f"Failed to write extra watch config {filename}: {e}") logger.error(f"Failed to write extra watch config {filename}: {e}")
def get_raw_document_checksum(self):
checksum = None
if self.fetcher.content:
checksum = hashlib.md5(self.fetcher.content.encode('utf-8')).hexdigest()
return checksum
@abstractmethod @abstractmethod
-def run_changedetection(self, watch):
+def run_changedetection(self, watch, force_reprocess=False):
update_obj = {'last_notification_error': False, 'last_error': False} update_obj = {'last_notification_error': False, 'last_error': False}
some_data = 'xxxxx' some_data = 'xxxxx'
update_obj["previous_md5"] = hashlib.md5(some_data.encode('utf-8')).hexdigest() update_obj["previous_md5"] = hashlib.md5(some_data.encode('utf-8')).hexdigest()
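The checksum persistence added to the base processor can be exercised on its own. This is a rough sketch under stated assumptions: the free functions `raw_checksum`, `write_checksum` and `read_checksum` are hypothetical stand-ins for the methods in the diff, and a temporary directory replaces the watch's `data_dir`. Only the `last-checksum.txt` filename and the MD5-over-UTF-8 hashing are taken from the diff.

```python
import hashlib
import os
import tempfile

CHECKSUM_FILENAME = "last-checksum.txt"  # filename used in the diff


def raw_checksum(content):
    """MD5 hex digest of the raw document, or None when there is no content."""
    if not content:
        return None
    return hashlib.md5(content.encode("utf-8")).hexdigest()


def write_checksum(data_dir, checksum):
    """Persist the checksum; returns False when there is nothing to write."""
    if not checksum or not data_dir:
        return False
    os.makedirs(data_dir, exist_ok=True)
    with open(os.path.join(data_dir, CHECKSUM_FILENAME), "w", encoding="utf-8") as f:
        f.write(checksum)
    return True


def read_checksum(data_dir):
    """Return the stored checksum, or None on first run or missing directory."""
    if not data_dir:
        return None
    path = os.path.join(data_dir, CHECKSUM_FILENAME)
    if not os.path.isfile(path):
        return None
    with open(path, "r", encoding="utf-8") as f:
        return f.read().strip()


with tempfile.TemporaryDirectory() as data_dir:
    first_run = read_checksum(data_dir)            # nothing stored yet
    checksum = raw_checksum("<html>hello</html>")
    stored = write_checksum(data_dir, checksum)
    second_run = read_checksum(data_dir)           # round-trips the value
```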
@@ -30,7 +30,7 @@ class perform_site_check(difference_detection_processor):
# Override to use PNG format for better image comparison (JPEG compression creates noise) # Override to use PNG format for better image comparison (JPEG compression creates noise)
screenshot_format = SCREENSHOT_FORMAT_PNG screenshot_format = SCREENSHOT_FORMAT_PNG
-def run_changedetection(self, watch):
+def run_changedetection(self, watch, force_reprocess=False):
""" """
Perform screenshot comparison using OpenCV subprocess handler. Perform screenshot comparison using OpenCV subprocess handler.
@@ -2,6 +2,7 @@ from ..base import difference_detection_processor
from ..exceptions import ProcessorException from ..exceptions import ProcessorException
from . import Restock from . import Restock
from loguru import logger from loguru import logger
from changedetectionio.content_fetchers.exceptions import checksumFromPreviousCheckWasTheSame
import urllib3 import urllib3
import time import time
@@ -403,22 +404,37 @@ class perform_site_check(difference_detection_processor):
screenshot = None screenshot = None
xpath_data = None xpath_data = None
-def run_changedetection(self, watch):
+def run_changedetection(self, watch, force_reprocess=False):
import hashlib import hashlib
if not watch: if not watch:
raise Exception("Watch no longer exists.") raise Exception("Watch no longer exists.")
current_raw_document_checksum = self.get_raw_document_checksum()
# Skip processing only if BOTH conditions are true:
# 1. HTML content unchanged (checksum matches last saved checksum)
# 2. Watch configuration was not edited (including trigger_text, filters, etc.)
# The was_edited flag handles all watch configuration changes, so we don't need
# separate checks for trigger_text or other processing rules.
if (not force_reprocess and
not watch.was_edited and
self.last_raw_content_checksum and
self.last_raw_content_checksum == current_raw_document_checksum):
raise checksumFromPreviousCheckWasTheSame()
# Unset any existing notification error # Unset any existing notification error
update_obj = {'last_notification_error': False, 'last_error': False, 'restock': Restock()} update_obj = {'last_notification_error': False, 'last_error': False, 'restock': Restock()}
self.screenshot = self.fetcher.screenshot self.screenshot = self.fetcher.screenshot
self.xpath_data = self.fetcher.xpath_data self.xpath_data = self.fetcher.xpath_data
-# Track the content type
-update_obj['content_type'] = self.fetcher.headers.get('Content-Type', '')
+# Track the content type (readonly field, doesn't trigger was_edited)
+update_obj['content-type'] = self.fetcher.headers.get('Content-Type', '') # Use hyphen (matches OpenAPI spec)
update_obj["last_check_status"] = self.fetcher.get_last_status_code() update_obj["last_check_status"] = self.fetcher.get_last_status_code()
# Save the raw content checksum to file (processor implementation detail, not watch config)
self.update_last_raw_content_checksum(current_raw_document_checksum)
# Only try to process restock information (like scraping for keywords) if the page was actually rendered correctly. # Only try to process restock information (like scraping for keywords) if the page was actually rendered correctly.
# Otherwise it will assume "in stock" because nothing suggesting the opposite was found # Otherwise it will assume "in stock" because nothing suggesting the opposite was found
from ...html_tools import html_to_text from ...html_tools import html_to_text
@@ -17,7 +17,8 @@ def _task(watch, update_handler):
try: try:
# The slow process (we run 2 of these in parallel) # The slow process (we run 2 of these in parallel)
-changed_detected, update_obj, text_after_filter = update_handler.run_changedetection(watch=watch)
+# Always force reprocess for preview - we want to show the filtered content regardless of checksums
changed_detected, update_obj, text_after_filter = update_handler.run_changedetection(watch=watch, force_reprocess=True)
except FilterNotFoundInResponse as e: except FilterNotFoundInResponse as e:
text_after_filter = f"Filter not found in HTML: {str(e)}" text_after_filter = f"Filter not found in HTML: {str(e)}"
except ReplyWithContentButNoText as e: except ReplyWithContentButNoText as e:
@@ -7,6 +7,7 @@ import re
import urllib3 import urllib3
from changedetectionio.conditions import execute_ruleset_against_all_plugins from changedetectionio.conditions import execute_ruleset_against_all_plugins
from changedetectionio.content_fetchers.exceptions import checksumFromPreviousCheckWasTheSame
from ..base import difference_detection_processor from ..base import difference_detection_processor
from changedetectionio.html_tools import PERL_STYLE_REGEX, cdata_in_document_to_text, TRANSLATE_WHITESPACE_TABLE from changedetectionio.html_tools import PERL_STYLE_REGEX, cdata_in_document_to_text, TRANSLATE_WHITESPACE_TABLE
from changedetectionio import html_tools, content_fetchers from changedetectionio import html_tools, content_fetchers
@@ -368,12 +369,24 @@ class ChecksumCalculator:
# (set_proxy_from_list) # (set_proxy_from_list)
class perform_site_check(difference_detection_processor): class perform_site_check(difference_detection_processor):
-def run_changedetection(self, watch):
+def run_changedetection(self, watch, force_reprocess=False):
changed_detected = False changed_detected = False
if not watch: if not watch:
raise Exception("Watch no longer exists.") raise Exception("Watch no longer exists.")
current_raw_document_checksum = self.get_raw_document_checksum()
# Skip processing only if BOTH conditions are true:
# 1. HTML content unchanged (checksum matches last saved checksum)
# 2. Watch configuration was not edited (including trigger_text, filters, etc.)
# The was_edited flag handles all watch configuration changes, so we don't need
# separate checks for trigger_text or other processing rules.
if (not force_reprocess and
not watch.was_edited and
self.last_raw_content_checksum and
self.last_raw_content_checksum == current_raw_document_checksum):
raise checksumFromPreviousCheckWasTheSame()
# Initialize components # Initialize components
filter_config = FilterConfig(watch, self.datastore) filter_config = FilterConfig(watch, self.datastore)
content_processor = ContentProcessor(self.fetcher, watch, filter_config, self.datastore) content_processor = ContentProcessor(self.fetcher, watch, filter_config, self.datastore)
@@ -391,9 +404,11 @@ class perform_site_check(difference_detection_processor):
self.screenshot = self.fetcher.screenshot self.screenshot = self.fetcher.screenshot
self.xpath_data = self.fetcher.xpath_data self.xpath_data = self.fetcher.xpath_data
-# Track the content type and checksum before filters
-update_obj['content_type'] = ctype_header
+# Track the content type (readonly field, doesn't trigger was_edited)
+update_obj['content-type'] = ctype_header # Use hyphen (matches OpenAPI spec and watch_base default)
update_obj['previous_md5_before_filters'] = hashlib.md5(self.fetcher.content.encode('utf-8')).hexdigest()
# Save the raw content checksum to file (processor implementation detail, not watch config)
self.update_last_raw_content_checksum(current_raw_document_checksum)
# === CONTENT PREPROCESSING === # === CONTENT PREPROCESSING ===
# Avoid creating unnecessary intermediate string copies by reassigning only when needed # Avoid creating unnecessary intermediate string copies by reassigning only when needed
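The skip guard that both processors now share has four parts, even though the in-code comment speaks of two conditions. A small truth-table sketch may make that explicit. `should_skip`, `run_check` and `ChecksumUnchanged` are hypothetical names for illustration; the real code raises `checksumFromPreviousCheckWasTheSame`.

```python
class ChecksumUnchanged(Exception):
    """Hypothetical stand-in for checksumFromPreviousCheckWasTheSame."""


def should_skip(force_reprocess, was_edited, last_checksum, current_checksum):
    """True only when every part of the guard from the diff holds."""
    return bool(
        not force_reprocess                     # preview always reprocesses
        and not was_edited                      # config edits force a reprocess
        and last_checksum                       # first run has no stored checksum
        and last_checksum == current_checksum   # raw document is unchanged
    )


def run_check(force_reprocess, was_edited, last_checksum, current_checksum):
    # Mirrors the control flow: bail out early, otherwise carry on processing.
    if should_skip(force_reprocess, was_edited, last_checksum, current_checksum):
        raise ChecksumUnchanged()
    return "processed"


try:
    outcome = run_check(False, False, "abc", "abc")
except ChecksumUnchanged:
    outcome = "skipped"
```

Note that a missing stored checksum (first run, or a cleared `last-checksum.txt`) never skips, which is what lets file deletion act as a "force reprocess" signal.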
+64 -80
@@ -17,8 +17,6 @@ $(document).ready(function () {
set_scale(); set_scale();
}); });
// Should always be disabled // Should always be disabled
$('#browser_steps-0-operation option[value="Goto site"]').prop("selected", "selected");
$('#browser_steps-0-operation').attr('disabled', 'disabled');
$('#browsersteps-click-start').click(function () { $('#browsersteps-click-start').click(function () {
$("#browsersteps-click-start").fadeOut(); $("#browsersteps-click-start").fadeOut();
@@ -45,12 +43,6 @@ $(document).ready(function () {
browsersteps_session_id = false; browsersteps_session_id = false;
apply_buttons_disabled = false; apply_buttons_disabled = false;
ctx.clearRect(0, 0, c.width, c.height); ctx.clearRect(0, 0, c.width, c.height);
set_first_gotosite_disabled();
}
function set_first_gotosite_disabled() {
$('#browser_steps >li:first-child select').val('Goto site').attr('disabled', 'disabled');
$('#browser_steps >li:first-child').css('opacity', '0.5');
} }
// Show seconds remaining until the browser interface needs to restart the session // Show seconds remaining until the browser interface needs to restart the session
@@ -243,14 +235,54 @@ $(document).ready(function () {
ctx.fill(); ctx.fill();
} }
// Reusable AJAX function for browser step operations
function executeBrowserStep(url, data = {}) {
$('#browser-steps-ui .loader .spinner').fadeIn();
apply_buttons_disabled = true;
$('ul#browser_steps li .control .apply').css('opacity', 0.5);
$("#browsersteps-img").css('opacity', 0.65);
return $.ajax({
method: "POST",
url: url,
data: data,
statusCode: {
400: function () {
alert("There was a problem processing the request, please reload the page.");
$("#loading-status-text").hide();
$('#browser-steps-ui .loader .spinner').fadeOut();
},
401: function (data) {
alert(data.responseText);
$("#loading-status-text").hide();
$('#browser-steps-ui .loader .spinner').fadeOut();
}
}
}).done(function (data) {
xpath_data = data.xpath_data;
$('#browsersteps-img').attr('src', data.screenshot);
$('#browser-steps-ui .loader .spinner').fadeOut();
apply_buttons_disabled = false;
$("#browsersteps-img").css('opacity', 1);
$('ul#browser_steps li .control .apply').css('opacity', 1);
$("#loading-status-text").hide();
}).fail(function (data) {
console.log(data);
if (data.responseText && data.responseText.includes("Browser session expired")) {
disable_browsersteps_ui();
}
apply_buttons_disabled = false;
$("#loading-status-text").hide();
$('ul#browser_steps li .control .apply').css('opacity', 1);
$("#browsersteps-img").css('opacity', 1);
});
}
function start() { function start() {
console.log("Starting browser-steps UI"); console.log("Starting browser-steps UI");
browsersteps_session_id = false; browsersteps_session_id = false;
// @todo This setting of the first one should be done at the datalayer but wtforms doesnt wanna play nice
$('#browser_steps >li:first-child').removeClass('empty');
set_first_gotosite_disabled();
$('#browser-steps-ui .loader .spinner').show(); $('#browser-steps-ui .loader .spinner').show();
-$('.clear,.remove', $('#browser_steps >li:first-child')).hide();
+// Request a new session
$.ajax({ $.ajax({
type: "GET", type: "GET",
url: browser_steps_start_url, url: browser_steps_start_url,
@@ -267,11 +299,12 @@ $(document).ready(function () {
}).done(function (data) { }).done(function (data) {
$("#loading-status-text").fadeIn(); $("#loading-status-text").fadeIn();
browsersteps_session_id = data.browsersteps_session_id; browsersteps_session_id = data.browsersteps_session_id;
// This should trigger 'Goto site'
console.log("Got startup response, requesting Goto-Site (first) step fake click");
$('#browser_steps >li:first-child .apply').click();
browser_interface_seconds_remaining = 500; browser_interface_seconds_remaining = 500;
-set_first_gotosite_disabled();
+// Request goto_site operation
executeBrowserStep(
browser_steps_sync_url + "&browsersteps_session_id=" + browsersteps_session_id + "&goto_website_url_first_step=true"
);
}).fail(function (data) { }).fail(function (data) {
console.log(data); console.log(data);
alert('There was an error communicating with the server.'); alert('There was an error communicating with the server.');
@@ -280,7 +313,6 @@ $(document).ready(function () {
} }
function disable_browsersteps_ui() { function disable_browsersteps_ui() {
set_first_gotosite_disabled();
$("#browser-steps-ui").css('opacity', '0.3'); $("#browser-steps-ui").css('opacity', '0.3');
$('#browsersteps-selector-canvas').off("mousemove mousedown click"); $('#browsersteps-selector-canvas').off("mousemove mousedown click");
} }
@@ -328,16 +360,13 @@ $(document).ready(function () {
// Add the extra buttons to the steps // Add the extra buttons to the steps
$('ul#browser_steps li').each(function (i) { $('ul#browser_steps li').each(function (i) {
var s = '<div class="control">' + '<a data-step-index=' + i + ' class="pure-button button-secondary button-green button-xsmall apply" >Apply</a>&nbsp;'; var s = '<div class="control">' + '<a data-step-index=' + i + ' class="pure-button button-secondary button-green button-xsmall apply" >Apply</a>&nbsp;';
-if (i > 0) {
-// The first step never gets these (Goto-site)
-s += `<a data-step-index="${i}" class="pure-button button-secondary button-xsmall clear" >Clear</a>&nbsp;` +
-`<a data-step-index="${i}" class="pure-button button-secondary button-red button-xsmall remove" >Remove</a>`;
+s += `<a data-step-index="${i}" class="pure-button button-secondary button-xsmall clear" >Clear</a>&nbsp;` +
+`<a data-step-index="${i}" class="pure-button button-secondary button-red button-xsmall remove" >Remove</a>`;
// if a screenshot is available // if a screenshot is available
if (browser_steps_available_screenshots.includes(i.toString())) { if (browser_steps_available_screenshots.includes(i.toString())) {
var d = (browser_steps_last_error_step === i+1) ? 'before' : 'after'; var d = (browser_steps_last_error_step === i+1) ? 'before' : 'after';
s += `&nbsp;<a data-step-index="${i}" class="pure-button button-secondary button-xsmall show-screenshot" title="Show screenshot from last run" data-type="${d}">Pic</a>&nbsp;`; s += `&nbsp;<a data-step-index="${i}" class="pure-button button-secondary button-xsmall show-screenshot" title="Show screenshot from last run" data-type="${d}">Pic</a>&nbsp;`;
}
} }
s += '</div>'; s += '</div>';
$(this).append(s) $(this).append(s)
@@ -376,80 +405,35 @@ $(document).ready(function () {
}); });
$('ul#browser_steps li .control .apply').click(function (event) { $('ul#browser_steps li .control .apply').click(function (event) {
// sequential requests @todo refactor
if (apply_buttons_disabled) { if (apply_buttons_disabled) {
return; return;
} }
var current_data = $(event.currentTarget).closest('li'); var current_data = $(event.currentTarget).closest('li');
$('#browser-steps-ui .loader .spinner').fadeIn();
apply_buttons_disabled = true;
$('ul#browser_steps li .control .apply').css('opacity', 0.5);
$("#browsersteps-img").css('opacity', 0.65);
var is_last_step = 0;
var step_n = $(event.currentTarget).data('step-index'); var step_n = $(event.currentTarget).data('step-index');
-// On the last step, we should also be getting data ready for the visual selector
+// Determine if this is the last configured step
var is_last_step = 0;
$('ul#browser_steps li select').each(function (i) { $('ul#browser_steps li select').each(function (i) {
if ($(this).val() !== 'Choose one') { if ($(this).val() !== 'Choose one') {
is_last_step += 1; is_last_step += 1;
} }
}); });
is_last_step = (is_last_step == (step_n + 1));
if (is_last_step == (step_n + 1)) {
is_last_step = true;
} else {
is_last_step = false;
}
console.log("Requesting step via POST " + $("select[id$='operation']", current_data).first().val()); console.log("Requesting step via POST " + $("select[id$='operation']", current_data).first().val());
// POST the currently clicked step form widget back and await response, redraw
-$.ajax({
-method: "POST",
-url: browser_steps_sync_url + "&browsersteps_session_id=" + browsersteps_session_id,
-data: {
+// Execute the browser step
+executeBrowserStep(
+browser_steps_sync_url + "&browsersteps_session_id=" + browsersteps_session_id,
+{
'operation': $("select[id$='operation']", current_data).first().val(), 'operation': $("select[id$='operation']", current_data).first().val(),
'selector': $("input[id$='selector']", current_data).first().val(), 'selector': $("input[id$='selector']", current_data).first().val(),
'optional_value': $("input[id$='optional_value']", current_data).first().val(), 'optional_value': $("input[id$='optional_value']", current_data).first().val(),
'step_n': step_n, 'step_n': step_n,
'is_last_step': is_last_step 'is_last_step': is_last_step
},
statusCode: {
400: function () {
// More than likely the CSRF token was lost when the server restarted
alert("There was a problem processing the request, please reload the page.");
$("#loading-status-text").hide();
$('#browser-steps-ui .loader .spinner').fadeOut();
},
401: function (data) {
// More than likely the CSRF token was lost when the server restarted
alert(data.responseText);
$("#loading-status-text").hide();
$('#browser-steps-ui .loader .spinner').fadeOut();
}
} }
-}).done(function (data) {
+);
// it should return the new state (selectors available and screenshot)
xpath_data = data.xpath_data;
$('#browsersteps-img').attr('src', data.screenshot);
$('#browser-steps-ui .loader .spinner').fadeOut();
apply_buttons_disabled = false;
$("#browsersteps-img").css('opacity', 1);
$('ul#browser_steps li .control .apply').css('opacity', 1);
$("#loading-status-text").hide();
set_first_gotosite_disabled();
}).fail(function (data) {
console.log(data);
if (data.responseText.includes("Browser session expired")) {
disable_browsersteps_ui();
}
apply_buttons_disabled = false;
$("#loading-status-text").hide();
$('ul#browser_steps li .control .apply').css('opacity', 1);
$("#browsersteps-img").css('opacity', 1);
});
}); });
$('ul#browser_steps li .control .show-screenshot').click(function (element) { $('ul#browser_steps li .control .show-screenshot').click(function (element) {
+59
@@ -235,6 +235,8 @@ class ChangeDetectionStore(DatastoreUpdatesMixin, FileSavingDataStore):
# No datastore yet - check if this is a fresh install or legacy migration # No datastore yet - check if this is a fresh install or legacy migration
self.init_fresh_install(include_default_watches=include_default_watches, self.init_fresh_install(include_default_watches=include_default_watches,
version_tag=version_tag) version_tag=version_tag)
# Maybe they copied a bunch of watch subdirs across too
self._load_state()
def init_fresh_install(self, include_default_watches, version_tag): def init_fresh_install(self, include_default_watches, version_tag):
# Generate app_guid FIRST (required for all operations) # Generate app_guid FIRST (required for all operations)
@@ -456,6 +458,63 @@ class ChangeDetectionStore(DatastoreUpdatesMixin, FileSavingDataStore):
self.__data['settings']['application']['password'] = False self.__data['settings']['application']['password'] = False
self.commit() self.commit()
def clear_all_last_checksums(self):
"""
Delete all last-checksum.txt files to force reprocessing of all watches.
This should be called when global settings change, since watches inherit
configuration and need to reprocess even if their individual watch dict
hasn't been modified.
Note: We delete the checksum file rather than setting was_edited=True because:
- was_edited is not persisted across restarts
- File deletion ensures reprocessing works across app restarts
"""
deleted_count = 0
for uuid in self.__data['watching'].keys():
watch = self.__data['watching'][uuid]
if watch.data_dir:
checksum_file = os.path.join(watch.data_dir, 'last-checksum.txt')
if os.path.isfile(checksum_file):
try:
os.remove(checksum_file)
deleted_count += 1
logger.debug(f"Cleared checksum for watch {uuid}")
except OSError as e:
logger.warning(f"Failed to delete checksum file for {uuid}: {e}")
logger.info(f"Cleared {deleted_count} checksum files to force reprocessing")
return deleted_count
def clear_checksums_for_tag(self, tag_uuid):
"""
Delete last-checksum.txt files for all watches using a specific tag.
This should be called when a tag configuration is edited, since watches
inherit tag settings and need to reprocess.
Args:
tag_uuid: UUID of the tag that was modified
Returns:
int: Number of checksum files deleted
"""
deleted_count = 0
for uuid, watch in self.__data['watching'].items():
if watch.get('tags') and tag_uuid in watch['tags']:
if watch.data_dir:
checksum_file = os.path.join(watch.data_dir, 'last-checksum.txt')
if os.path.isfile(checksum_file):
try:
os.remove(checksum_file)
deleted_count += 1
logger.debug(f"Cleared checksum for watch {uuid} (tag {tag_uuid})")
except OSError as e:
logger.warning(f"Failed to delete checksum file for {uuid}: {e}")
logger.info(f"Cleared {deleted_count} checksum files for tag {tag_uuid}")
return deleted_count
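The two methods above differ only in which watches they select. As a sketch of that shared logic, assuming watches are plain dicts with `data_dir` and `tags` keys (the real watch objects expose `data_dir` as an attribute), a single hypothetical helper `clear_checksums` with an optional tag filter could look like this:

```python
import os
import tempfile

CHECKSUM_FILENAME = "last-checksum.txt"  # filename used in the diff


def clear_checksums(watches, tag_uuid=None):
    """Delete checksum files; when tag_uuid is given, only for watches with that tag.

    `watches` maps uuid -> dict with 'data_dir' and optional 'tags' (assumed shape).
    Returns the number of files actually removed.
    """
    deleted = 0
    for uuid, watch in watches.items():
        if tag_uuid is not None and tag_uuid not in (watch.get("tags") or []):
            continue
        data_dir = watch.get("data_dir")
        if not data_dir:
            continue
        path = os.path.join(data_dir, CHECKSUM_FILENAME)
        if os.path.isfile(path):
            try:
                os.remove(path)
                deleted += 1
            except OSError:
                # A failed delete is not fatal; that watch simply is not forced.
                pass
    return deleted


with tempfile.TemporaryDirectory() as root:
    watches = {}
    for uuid, tags in (("w1", ["t1"]), ("w2", []), ("w3", ["t1", "t2"])):
        watch_dir = os.path.join(root, uuid)
        os.makedirs(watch_dir)
        with open(os.path.join(watch_dir, CHECKSUM_FILENAME), "w", encoding="utf-8") as f:
            f.write("abc")
        watches[uuid] = {"data_dir": watch_dir, "tags": tags}

    cleared_for_tag = clear_checksums(watches, tag_uuid="t1")  # w1 and w3
    cleared_rest = clear_checksums(watches)                    # only w2 is left
```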
def commit(self): def commit(self):
""" """
Save settings immediately to disk using atomic write. Save settings immediately to disk using atomic write.
+13
@@ -331,6 +331,7 @@ def prepare_test_function(live_server, datastore_path):
# Cleanup: Clear watches and queue after test # Cleanup: Clear watches and queue after test
try: try:
from changedetectionio.flask_app import update_q from changedetectionio.flask_app import update_q
from pathlib import Path
# Clear the queue to prevent leakage to next test # Clear the queue to prevent leakage to next test
while not update_q.empty(): while not update_q.empty():
@@ -340,6 +341,18 @@ def prepare_test_function(live_server, datastore_path):
break break
datastore.data['watching'] = {} datastore.data['watching'] = {}
# Delete any old watch metadata JSON files
base_path = Path(datastore.datastore_path).resolve()
max_depth = 2
for file in base_path.rglob("*.json"):
# Calculate depth relative to base path
depth = len(file.relative_to(base_path).parts) - 1
if depth <= max_depth and file.is_file():
file.unlink()
except Exception as e: except Exception as e:
logger.warning(f"Error during datastore cleanup: {e}") logger.warning(f"Error during datastore cleanup: {e}")
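The depth arithmetic in the cleanup above is easy to misread, so here is a small standalone sketch. `delete_json_up_to_depth` is a hypothetical helper and the directory layout is invented; the depth formula and `max_depth = 2` are the ones from the diff. A file directly in the base directory has depth 0, a file inside one subdirectory has depth 1, and so on.

```python
import tempfile
from pathlib import Path


def delete_json_up_to_depth(base, max_depth=2):
    """Remove *.json files at most max_depth directories below base."""
    base = Path(base).resolve()
    removed = []
    # Materialise the matches first so deletion does not interfere with the walk.
    for file in sorted(base.rglob("*.json")):
        depth = len(file.relative_to(base).parts) - 1
        if depth <= max_depth and file.is_file():
            file.unlink()
            removed.append(file.relative_to(base).as_posix())
    return sorted(removed)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    layout = (
        "top.json",             # depth 0
        "uuid1/watch.json",     # depth 1
        "uuid1/a/x.json",       # depth 2
        "uuid1/a/b/deep.json",  # depth 3, beyond max_depth
        "uuid1/history.txt",    # not a .json file
    )
    for rel in layout:
        path = base / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("{}", encoding="utf-8")

    removed = delete_json_up_to_depth(base)
    survivors = sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )
```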
+240 -5
@@ -328,6 +328,68 @@ def test_api_simple(client, live_server, measure_memory_usage, datastore_path):
) )
assert len(res.json) == 0, "Watch list should be empty" assert len(res.json) == 0, "Watch list should be empty"
def test_roundtrip_API(client, live_server, measure_memory_usage, datastore_path):
"""
Test the full round trip, this way we test the default Model fits back into OpenAPI spec
:param client:
:param live_server:
:param measure_memory_usage:
:param datastore_path:
:return:
"""
api_key = live_server.app.config['DATASTORE'].data['settings']['application'].get('api_access_token')
set_original_response(datastore_path=datastore_path)
test_url = url_for('test_endpoint', _external=True)
# Create new
res = client.post(
url_for("createwatch"),
data=json.dumps({"url": test_url}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
follow_redirects=True
)
assert res.status_code == 201
uuid = res.json.get('uuid')
# Now fetch it and send it back
res = client.get(
url_for("watch", uuid=uuid),
headers={'x-api-key': api_key}
)
watch=res.json
# Be sure that 'readOnly' values are never updated in the real watch
watch['last_changed'] = 454444444444
watch['date_created'] = 454444444444
# HTTP PUT ( UPDATE an existing watch )
res = client.put(
url_for("watch", uuid=uuid),
headers={'x-api-key': api_key, 'content-type': 'application/json'},
data=json.dumps(watch),
)
if res.status_code != 200:
print(f"\n=== PUT failed with {res.status_code} ===")
print(f"Error: {res.data}")
assert res.status_code == 200, "HTTP PUT update was sent OK"
res = client.get(
url_for("watch", uuid=uuid),
headers={'x-api-key': api_key}
)
last_changed = res.json.get('last_changed')
assert last_changed != 454444444444
assert last_changed != "454444444444"
date_created = res.json.get('date_created')
assert date_created != 454444444444
assert date_created != "454444444444"
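The round-trip test expects a PUT that echoes back `last_changed` and `date_created` to succeed without changing them. The diff does not show how the backend achieves this; one possible approach, sketched here purely as an assumption, is to drop read-only keys before merging. `READ_ONLY_FIELDS` and `apply_update` are hypothetical names.

```python
# Assumed subset of read-only fields, taken from the two the test overwrites.
READ_ONLY_FIELDS = frozenset({"last_changed", "date_created"})


def apply_update(stored, payload, read_only=READ_ONLY_FIELDS):
    """Return a copy of `stored` with writable keys from `payload` applied."""
    updated = dict(stored)
    for key, value in payload.items():
        if key in read_only:
            continue  # silently ignore instead of rejecting the request
        updated[key] = value
    return updated


stored_watch = {"url": "https://example.com", "last_changed": 0, "date_created": 1700000000}
echoed = dict(stored_watch, last_changed=454444444444, date_created=454444444444, title="New title")
result = apply_update(stored_watch, echoed)
```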
def test_access_denied(client, live_server, measure_memory_usage, datastore_path): def test_access_denied(client, live_server, measure_memory_usage, datastore_path):
# `config_api_token_enabled` Should be On by default # `config_api_token_enabled` Should be On by default
res = client.get( res = client.get(
@@ -401,6 +463,9 @@ def test_api_watch_PUT_update(client, live_server, measure_memory_usage, datasto
follow_redirects=True follow_redirects=True
) )
if res.status_code != 201:
print(f"\n=== POST createwatch failed with {res.status_code} ===")
print(f"Response: {res.data}")
assert res.status_code == 201 assert res.status_code == 201
wait_for_all_checks(client) wait_for_all_checks(client)
@@ -464,11 +529,12 @@ def test_api_watch_PUT_update(client, live_server, measure_memory_usage, datasto
) )
assert res.status_code == 400, "Should get error 400 when we give a field that doesnt exist" assert res.status_code == 400, "Should get error 400 when we give a field that doesnt exist"
-# Message will come from `flask_expects_json`
-# With patternProperties for processor_config_*, the error message format changed slightly
-assert (b'Additional properties are not allowed' in res.data or
+# Backend validation now rejects unknown fields with a clear error message
+assert (b'Unknown field' in res.data or
+b'Additional properties are not allowed' in res.data or
+b'Unevaluated properties are not allowed' in res.data or
 b'does not match any of the regexes' in res.data), \
-"Should reject unknown fields with schema validation error"
+"Should reject unknown fields with validation error"
# Try a XSS URL # Try a XSS URL
@@ -553,6 +619,8 @@ def test_api_import(client, live_server, measure_memory_usage, datastore_path):
assert res.status_code == 200 assert res.status_code == 200
uuid = res.json[0] uuid = res.json[0]
watch = live_server.app.config['DATASTORE'].data['watching'][uuid] watch = live_server.app.config['DATASTORE'].data['watching'][uuid]
assert isinstance(watch['notification_urls'], list), "notification_urls must be stored as a list"
assert len(watch['notification_urls']) == 2, "notification_urls should have 2 entries"
assert 'mailto://test@example.com' in watch['notification_urls'], "notification_urls should contain first email" assert 'mailto://test@example.com' in watch['notification_urls'], "notification_urls should contain first email"
assert 'mailto://admin@example.com' in watch['notification_urls'], "notification_urls should contain second email" assert 'mailto://admin@example.com' in watch['notification_urls'], "notification_urls should contain second email"
@@ -599,6 +667,34 @@ def test_api_import(client, live_server, measure_memory_usage, datastore_path):
assert res.status_code == 400, "Should reject unknown field" assert res.status_code == 400, "Should reject unknown field"
assert b"Unknown watch configuration parameter" in res.data, "Error message should mention unknown parameter" assert b"Unknown watch configuration parameter" in res.data, "Error message should mention unknown parameter"
# Test 7: Import with complex nested array (browser_steps) - array of objects
browser_steps = json.dumps([
{"operation": "wait", "selector": "5", "optional_value": ""},
{"operation": "click", "selector": "button.submit", "optional_value": ""}
])
params = urllib.parse.urlencode({
'tag': 'browser-test',
'browser_steps': browser_steps
})
res = client.post(
url_for("import") + "?" + params,
data='https://website8.com',
headers={'x-api-key': api_key},
follow_redirects=True
)
assert res.status_code == 200, "Should accept browser_steps array"
uuid = res.json[0]
watch = live_server.app.config['DATASTORE'].data['watching'][uuid]
assert len(watch['browser_steps']) == 2, "Should have 2 browser steps"
assert watch['browser_steps'][0]['operation'] == 'wait', "First step should be wait"
assert watch['browser_steps'][1]['operation'] == 'click', "Second step should be click"
assert watch['browser_steps'][1]['selector'] == 'button.submit', "Second step selector should be button.submit"
# Cleanup
delete_all_watches(client)
def test_api_import_small_synchronous(client, live_server, measure_memory_usage, datastore_path): def test_api_import_small_synchronous(client, live_server, measure_memory_usage, datastore_path):
"""Test that small imports (< threshold) are processed synchronously""" """Test that small imports (< threshold) are processed synchronously"""
@@ -837,7 +933,9 @@ def test_api_url_validation(client, live_server, measure_memory_usage, datastore
) )
assert res.status_code == 400, "Updating watch URL to null should fail" assert res.status_code == 400, "Updating watch URL to null should fail"
# Accept either OpenAPI validation error or our custom validation error # Accept either OpenAPI validation error or our custom validation error
assert b'URL cannot be null' in res.data or b'OpenAPI validation failed' in res.data or b'validation error' in res.data.lower() assert (b'URL cannot be null' in res.data or
b'Validation failed' in res.data or
b'validation error' in res.data.lower())
# Test 8: UPDATE to empty string URL should fail # Test 8: UPDATE to empty string URL should fail
res = client.put( res = client.put(
@@ -924,3 +1022,140 @@ def test_api_url_validation(client, live_server, measure_memory_usage, datastore
headers={'x-api-key': api_key}, headers={'x-api-key': api_key},
) )
delete_all_watches(client) delete_all_watches(client)
def test_api_time_between_check_validation(client, live_server, measure_memory_usage, datastore_path):
"""
Test that time_between_check validation works correctly:
- When time_between_check_use_default is false, at least one time value must be > 0
- Values must be valid integers
"""
import json
from flask import url_for
api_key = live_server.app.config['DATASTORE'].data['settings']['application'].get('api_access_token')
# Test 1: time_between_check_use_default=false with NO time_between_check should fail
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example.com",
"time_between_check_use_default": False
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 400, "Should fail when time_between_check_use_default=false with no time_between_check"
assert b"At least one time interval" in res.data, "Error message should mention time interval requirement"
# Test 2: time_between_check_use_default=false with ALL zeros should fail
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example.com",
"time_between_check_use_default": False,
"time_between_check": {
"weeks": 0,
"days": 0,
"hours": 0,
"minutes": 0,
"seconds": 0
}
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 400, "Should fail when all time values are 0"
assert b"At least one time interval" in res.data, "Error message should mention time interval requirement"
# Test 3: time_between_check_use_default=false with NULL values should fail
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example.com",
"time_between_check_use_default": False,
"time_between_check": {
"weeks": None,
"days": None,
"hours": None,
"minutes": None,
"seconds": None
}
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 400, "Should fail when all time values are null"
assert b"At least one time interval" in res.data, "Error message should mention time interval requirement"
# Test 4: time_between_check_use_default=false with valid hours should succeed
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example.com",
"time_between_check_use_default": False,
"time_between_check": {
"hours": 2
}
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 201, "Should succeed with valid hours value"
uuid1 = res.json.get('uuid')
# Test 5: time_between_check_use_default=false with valid minutes should succeed
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example2.com",
"time_between_check_use_default": False,
"time_between_check": {
"minutes": 30
}
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 201, "Should succeed with valid minutes value"
uuid2 = res.json.get('uuid')
# Test 6: time_between_check_use_default=true (or missing) with no time_between_check should succeed (uses defaults)
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example3.com",
"time_between_check_use_default": True
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 201, "Should succeed when using default settings"
uuid3 = res.json.get('uuid')
# Test 7: Default behavior (no time_between_check_use_default field) should use defaults and succeed
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example4.com"
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 201, "Should succeed with default behavior (using global settings)"
uuid4 = res.json.get('uuid')
# Test 8: Verify integer type validation - string should fail (OpenAPI validation)
res = client.post(
url_for("createwatch"),
data=json.dumps({
"url": "https://example5.com",
"time_between_check_use_default": False,
"time_between_check": {
"hours": "not_a_number"
}
}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
)
assert res.status_code == 400, "Should fail when time value is not an integer"
assert b"Validation failed" in res.data or b"not of type" in res.data, "Should mention validation/type error"
# Cleanup
for uuid in [uuid1, uuid2, uuid3, uuid4]:
client.delete(
url_for("watch", uuid=uuid),
headers={'x-api-key': api_key},
)
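The rule exercised by test_api_time_between_check_validation can be sketched as a small standalone check. This is only an illustration under stated assumptions: the helper name `validate_time_between_check`, the `TIME_UNITS` tuple and the full wording of the error messages are hypothetical. Only the rule itself and the message fragments "At least one time interval" and "not of type" come from the assertions in the test.

```python
from typing import Any, Mapping, Optional

# Units accepted inside "time_between_check", as used by the test payloads.
TIME_UNITS = ("weeks", "days", "hours", "minutes", "seconds")


def validate_time_between_check(payload: Mapping[str, Any]) -> Optional[str]:
    """Return an error message, or None when the payload is acceptable.

    Hypothetical helper: this is not the project's implementation.
    """
    # A missing flag behaves like True (fall back to the global default).
    if payload.get("time_between_check_use_default", True):
        return None

    interval = payload.get("time_between_check") or {}
    values = {unit: interval.get(unit) for unit in TIME_UNITS}

    # Type check first, so a bad value is reported even if another unit is valid.
    for unit, value in values.items():
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            return f"Validation failed: '{unit}' is not of type 'integer'"

    # Null and zero values both count as "no interval given".
    if not any(value is not None and value > 0 for value in values.values()):
        return "At least one time interval must be greater than zero"
    return None
```

Returning a message instead of raising keeps the sketch easy to map onto a 400 response body; the real API may well structure this differently.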
@@ -107,7 +107,7 @@ def test_watch_notification_urls_validation(client, live_server, measure_memory_
headers={'content-type': 'application/json', 'x-api-key': api_key} headers={'content-type': 'application/json', 'x-api-key': api_key}
) )
assert res.status_code == 400, "Should reject non-list notification_urls" assert res.status_code == 400, "Should reject non-list notification_urls"
assert b"OpenAPI validation failed" in res.data or b"Request body validation error" in res.data assert b"Validation failed" in res.data or b"is not of type" in res.data
# Test 6: Verify original URLs are preserved after failed update # Test 6: Verify original URLs are preserved after failed update
res = client.get( res = client.get(
@@ -159,7 +159,7 @@ def test_tag_notification_urls_validation(client, live_server, measure_memory_us
headers={'content-type': 'application/json', 'x-api-key': api_key} headers={'content-type': 'application/json', 'x-api-key': api_key}
) )
assert res.status_code == 400, "Should reject non-list notification_urls" assert res.status_code == 400, "Should reject non-list notification_urls"
assert b"OpenAPI validation failed" in res.data or b"Request body validation error" in res.data assert b"Validation failed" in res.data or b"is not of type" in res.data
# Test 4: Verify original URLs are preserved after failed update # Test 4: Verify original URLs are preserved after failed update
tag = datastore.data['settings']['application']['tags'][tag_uuid] tag = datastore.data['settings']['application']['tags'][tag_uuid]
+19 -10
@@ -9,7 +9,7 @@ by testing various scenarios that should trigger validation errors.
import time import time
import json import json
from flask import url_for from flask import url_for
from .util import live_server_setup, wait_for_all_checks from .util import live_server_setup, wait_for_all_checks, delete_all_watches
def test_openapi_validation_invalid_content_type_on_create_watch(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_invalid_content_type_on_create_watch(client, live_server, measure_memory_usage, datastore_path):
@@ -26,7 +26,8 @@ def test_openapi_validation_invalid_content_type_on_create_watch(client, live_se
# Should get 400 error due to OpenAPI validation failure # Should get 400 error due to OpenAPI validation failure
assert res.status_code == 400, f"Expected 400 but got {res.status_code}" assert res.status_code == 400, f"Expected 400 but got {res.status_code}"
assert b"OpenAPI validation failed" in res.data, "Should contain OpenAPI validation error message" assert b"Validation failed" in res.data, "Should contain validation error message"
delete_all_watches(client)
def test_openapi_validation_missing_required_field_create_watch(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_missing_required_field_create_watch(client, live_server, measure_memory_usage, datastore_path):
@@ -43,7 +44,8 @@ def test_openapi_validation_missing_required_field_create_watch(client, live_ser
# Should get 400 error due to missing required field # Should get 400 error due to missing required field
assert res.status_code == 400, f"Expected 400 but got {res.status_code}" assert res.status_code == 400, f"Expected 400 but got {res.status_code}"
assert b"OpenAPI validation failed" in res.data, "Should contain OpenAPI validation error message" assert b"Validation failed" in res.data, "Should contain validation error message"
delete_all_watches(client)
def test_openapi_validation_invalid_field_in_request_body(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_invalid_field_in_request_body(client, live_server, measure_memory_usage, datastore_path):
@@ -80,10 +82,10 @@ def test_openapi_validation_invalid_field_in_request_body(client, live_server, m
# Should get 400 error due to invalid field (this will be caught by internal validation) # Should get 400 error due to invalid field (this will be caught by internal validation)
# Note: This tests the flow where OpenAPI validation passes but internal validation catches it # Note: This tests the flow where OpenAPI validation passes but internal validation catches it
assert res.status_code == 400, f"Expected 400 but got {res.status_code}" assert res.status_code == 400, f"Expected 400 but got {res.status_code}"
# With patternProperties for processor_config_*, the error message format changed slightly # Backend validation now returns "Unknown field(s):" message
assert (b"Additional properties are not allowed" in res.data or assert b"Unknown field" in res.data, \
b"does not match any of the regexes" in res.data), \ "Should contain validation error about unknown fields"
"Should contain validation error about additional/invalid properties" delete_all_watches(client)
def test_openapi_validation_import_wrong_content_type(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_import_wrong_content_type(client, live_server, measure_memory_usage, datastore_path):
@@ -100,7 +102,8 @@ def test_openapi_validation_import_wrong_content_type(client, live_server, measu
# Should get 400 error due to content-type mismatch # Should get 400 error due to content-type mismatch
assert res.status_code == 400, f"Expected 400 but got {res.status_code}" assert res.status_code == 400, f"Expected 400 but got {res.status_code}"
assert b"OpenAPI validation failed" in res.data, "Should contain OpenAPI validation error message" assert b"Validation failed" in res.data, "Should contain validation error message"
delete_all_watches(client)
def test_openapi_validation_import_correct_content_type_succeeds(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_import_correct_content_type_succeeds(client, live_server, measure_memory_usage, datastore_path):
@@ -118,6 +121,7 @@ def test_openapi_validation_import_correct_content_type_succeeds(client, live_se
# Should succeed # Should succeed
assert res.status_code == 200, f"Expected 200 but got {res.status_code}" assert res.status_code == 200, f"Expected 200 but got {res.status_code}"
assert len(res.json) == 2, "Should import 2 URLs" assert len(res.json) == 2, "Should import 2 URLs"
delete_all_watches(client)
def test_openapi_validation_get_requests_bypass_validation(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_get_requests_bypass_validation(client, live_server, measure_memory_usage, datastore_path):
@@ -142,6 +146,7 @@ def test_openapi_validation_get_requests_bypass_validation(client, live_server,
# Should return JSON with watch list (empty in this case) # Should return JSON with watch list (empty in this case)
assert isinstance(res.json, dict), "Should return JSON dictionary for watch list" assert isinstance(res.json, dict), "Should return JSON dictionary for watch list"
delete_all_watches(client)
def test_openapi_validation_create_tag_missing_required_title(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_create_tag_missing_required_title(client, live_server, measure_memory_usage, datastore_path):
@@ -158,11 +163,14 @@ def test_openapi_validation_create_tag_missing_required_title(client, live_serve
# Should get 400 error due to missing required field # Should get 400 error due to missing required field
assert res.status_code == 400, f"Expected 400 but got {res.status_code}" assert res.status_code == 400, f"Expected 400 but got {res.status_code}"
assert b"OpenAPI validation failed" in res.data, "Should contain OpenAPI validation error message" assert b"Validation failed" in res.data, "Should contain validation error message"
delete_all_watches(client)
def test_openapi_validation_watch_update_allows_partial_updates(client, live_server, measure_memory_usage, datastore_path): def test_openapi_validation_watch_update_allows_partial_updates(client, live_server, measure_memory_usage, datastore_path):
"""Test that watch updates allow partial updates without requiring all fields (positive test).""" """Test that watch updates allow partial updates without requiring all fields (positive test)."""
#xxx
api_key = live_server.app.config['DATASTORE'].data['settings']['application'].get('api_access_token') api_key = live_server.app.config['DATASTORE'].data['settings']['application'].get('api_access_token')
# First create a valid watch # First create a valid watch
@@ -199,4 +207,5 @@ def test_openapi_validation_watch_update_allows_partial_updates(client, live_ser
) )
assert res.status_code == 200 assert res.status_code == 200
assert res.json.get('title') == 'Updated Title Only', "Title should be updated" assert res.json.get('title') == 'Updated Title Only', "Title should be updated"
assert res.json.get('url') == 'https://example.com', "URL should remain unchanged" assert res.json.get('url') == 'https://example.com', "URL should remain unchanged"
delete_all_watches(client)
+53
@@ -176,4 +176,57 @@ def test_api_tags_listing(client, live_server, measure_memory_usage, datastore_p
assert res.status_code == 204 assert res.status_code == 204
def test_roundtrip_API(client, live_server, measure_memory_usage, datastore_path):
"""
Test the full round trip, so that we verify the default Model fits back into the OpenAPI spec
:param client:
:param live_server:
:param measure_memory_usage:
:param datastore_path:
:return:
"""
api_key = live_server.app.config['DATASTORE'].data['settings']['application'].get('api_access_token')
set_original_response(datastore_path=datastore_path)
res = client.post(
url_for("tag"),
data=json.dumps({"title": "My tag title"}),
headers={'content-type': 'application/json', 'x-api-key': api_key}
)
assert res.status_code == 201
uuid = res.json.get('uuid')
# Now fetch it and send it back
res = client.get(
url_for("tag", uuid=uuid),
headers={'x-api-key': api_key}
)
tag = res.json
# Only test with date_created (readOnly field that should be filtered out)
# last_changed is Watch-specific and doesn't apply to Tags
tag['date_created'] = 454444444444
# HTTP PUT ( UPDATE an existing tag )
res = client.put(
url_for("tag", uuid=uuid),
headers={'x-api-key': api_key, 'content-type': 'application/json'},
data=json.dumps(tag),
)
if res.status_code != 200:
print(f"\n=== PUT failed with {res.status_code} ===")
print(f"Error: {res.data}")
assert res.status_code == 200, "HTTP PUT update was sent OK"
# Verify readOnly fields like date_created cannot be overridden
res = client.get(
url_for("tag", uuid=uuid),
headers={'x-api-key': api_key}
)
date_created = res.json.get('date_created')
assert date_created != 454444444444, "ReadOnly date_created should not be updateable"
assert date_created != "454444444444", "ReadOnly date_created should not be updateable"
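The round trip above depends on readOnly fields being dropped from the PUT body before it is applied. A minimal sketch of that idea, assuming the filter is driven by `readOnly` flags in the schema's `properties`; the schema fragment `EXAMPLE_TAG_SCHEMA` and the helper `strip_read_only` are hypothetical and not taken from the project.

```python
from typing import Any, Dict, Mapping

# Hypothetical schema fragment; the real OpenAPI spec is not shown in this diff.
EXAMPLE_TAG_SCHEMA: Dict[str, Any] = {
    "properties": {
        "uuid": {"type": "string", "readOnly": True},
        "date_created": {"type": "integer", "readOnly": True},
        "title": {"type": "string"},
    }
}


def strip_read_only(payload: Mapping[str, Any], schema: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of payload without the properties marked readOnly."""
    properties = schema.get("properties", {})
    return {
        key: value
        for key, value in payload.items()
        # Keys unknown to the schema are kept here; rejecting them is a separate concern.
        if not properties.get(key, {}).get("readOnly", False)
    }
```

With this filter in place, a client can send back exactly what it fetched, and an altered `date_created` is silently ignored rather than stored.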
-2
@@ -6,8 +6,6 @@ from flask import url_for
from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks, extract_rss_token_from_UI, \ from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks, extract_rss_token_from_UI, \
extract_UUID_from_client, delete_all_watches extract_UUID_from_client, delete_all_watches
sleep_time_for_fetch_thread = 3
# Basic test to check inscriptus is not adding return line chars, basically works etc # Basic test to check inscriptus is not adding return line chars, basically works etc
def test_inscriptus(): def test_inscriptus():
+42 -4
@@ -54,11 +54,11 @@ def test_backup(client, live_server, measure_memory_usage, datastore_path):
backup = ZipFile(io.BytesIO(res.data)) backup = ZipFile(io.BytesIO(res.data))
l = backup.namelist() l = backup.namelist()
# Check for UUID-based txt files (history and snapshot) # Check for UUID-based txt files (history, snapshot, and last-checksum)
uuid4hex_txt = re.compile('^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}.*txt', re.I) uuid4hex_txt = re.compile('^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}.*txt', re.I)
txt_files = list(filter(uuid4hex_txt.match, l)) txt_files = list(filter(uuid4hex_txt.match, l))
# Should be two txt files in the archive (history and the snapshot) # Should be three txt files in the archive (history, snapshot, and last-checksum)
assert len(txt_files) == 2 assert len(txt_files) == 3
# Check for watch.json files (new format) # Check for watch.json files (new format)
uuid4hex_json = re.compile('^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}/watch\.json$', re.I) uuid4hex_json = re.compile('^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}/watch\.json$', re.I)
@@ -75,4 +75,42 @@ def test_backup(client, live_server, measure_memory_usage, datastore_path):
follow_redirects=True follow_redirects=True
) )
assert b'No backups found.' in res.data assert b'No backups found.' in res.data
def test_watch_data_package_download(client, live_server, measure_memory_usage, datastore_path):
"""Test downloading a single watch's data as a zip package"""
import os
set_original_response(datastore_path=datastore_path)
uuid = client.application.config.get('DATASTORE').add_watch(url=url_for('test_endpoint', _external=True))
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Download the watch data package
res = client.get(url_for("ui.ui_edit.watch_get_data_package", uuid=uuid))
# Should get the right zip content type
assert res.content_type == "application/zip"
# Should be PK/ZIP stream (PKzip header)
assert res.data[:2] == b'PK', "File should start with PK (PKzip header)"
assert res.data.count(b'PK') >= 2, "Should have multiple PK markers (zip file structure)"
# Verify zip contents
backup = ZipFile(io.BytesIO(res.data))
files = backup.namelist()
# Should have files in a UUID directory
assert any(uuid in f for f in files), f"Files should be in UUID directory: {files}"
# Should contain watch.json
watch_json_path = f"{uuid}/watch.json"
assert watch_json_path in files, f"Should contain watch.json, got: {files}"
# Should contain history/snapshot files
uuid4hex_txt = re.compile(f'^{re.escape(uuid)}/.*\\.txt', re.I)
txt_files = list(filter(uuid4hex_txt.match, files))
assert len(txt_files) > 0, f"Should have at least one .txt file (history/snapshot), got: {files}"
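The checks in test_watch_data_package_download can be tried in isolation against an in-memory archive. The sketch below is an assumption-laden illustration: `build_example_package` and `txt_members` are hypothetical helpers, and the archive layout simply mirrors what the test asserts (a UUID directory holding watch.json plus at least one .txt file).

```python
import io
import re
from typing import List
from zipfile import ZipFile


def build_example_package(uuid: str) -> bytes:
    """Build a tiny zip in memory with the layout the test expects."""
    buffer = io.BytesIO()
    with ZipFile(buffer, "w") as archive:
        archive.writestr(f"{uuid}/watch.json", "{}")
        archive.writestr(f"{uuid}/history.txt", "")
    return buffer.getvalue()


def txt_members(data: bytes, uuid: str) -> List[str]:
    """List the .txt members stored under the given UUID directory."""
    # re.escape guards against regex metacharacters in the directory name.
    pattern = re.compile(f"^{re.escape(uuid)}/.*\\.txt$", re.I)
    with ZipFile(io.BytesIO(data)) as archive:
        return [name for name in archive.namelist() if pattern.match(name)]
```

Matching on the escaped UUID prefix, rather than a generic UUID pattern, ties the assertion to the one watch that was requested.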
+6 -9
@@ -71,22 +71,19 @@ def test_include_filters_output():
# Tests the whole stack works with the CSS Filter # Tests the whole stack works with the CSS Filter
def test_check_markup_include_filters_restriction(client, live_server, measure_memory_usage, datastore_path): def test_check_markup_include_filters_restriction(client, live_server, measure_memory_usage, datastore_path):
sleep_time_for_fetch_thread = 3
include_filters = "#sametext" include_filters = "#sametext"
set_original_response(datastore_path=datastore_path) set_original_response(datastore_path=datastore_path)
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page # Add our URL to the import page
test_url = url_for('test_endpoint', _external=True) test_url = url_for('test_endpoint', _external=True)
uuid = client.application.config.get('DATASTORE').add_watch(url=test_url) uuid = client.application.config.get('DATASTORE').add_watch(url=test_url)
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up wait_for_all_checks(client)
time.sleep(sleep_time_for_fetch_thread)
# Goto the edit page, add our ignore text # Goto the edit page, add our ignore text
# Add our URL to the import page # Add our URL to the import page
@@ -103,15 +100,15 @@ def test_check_markup_include_filters_restriction(client, live_server, measure_m
) )
assert bytes(include_filters.encode('utf-8')) in res.data assert bytes(include_filters.encode('utf-8')) in res.data
# Give the thread time to pick it up wait_for_all_checks(client)
time.sleep(sleep_time_for_fetch_thread)
# Make a change # Make a change
set_modified_response(datastore_path=datastore_path) set_modified_response(datastore_path=datastore_path)
# Trigger a check # Trigger a check
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up wait_for_all_checks(client)
time.sleep(sleep_time_for_fetch_thread)
# It should have 'has-unread-changes' still # It should have 'has-unread-changes' still
# Because it should be looking at only that 'sametext' id # Because it should be looking at only that 'sametext' id
@@ -6,10 +6,6 @@ from urllib.request import urlopen
from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks
import os import os
sleep_time_for_fetch_thread = 3
def test_check_extract_text_from_diff(client, live_server, measure_memory_usage, datastore_path): def test_check_extract_text_from_diff(client, live_server, measure_memory_usage, datastore_path):
import time import time
with open(os.path.join(datastore_path, "endpoint-content.txt"), "w") as f: with open(os.path.join(datastore_path, "endpoint-content.txt"), "w") as f:
@@ -106,7 +106,7 @@ def test_consistent_history(client, live_server, measure_memory_usage, datastore
# Find the snapshot one # Find the snapshot one
for fname in files_in_watch_dir: for fname in files_in_watch_dir:
if fname != 'history.txt' and fname != 'watch.json' and 'html' not in fname: if fname != 'history.txt' and fname != 'watch.json' and fname != 'last-checksum.txt' and 'html' not in fname:
if strtobool(os.getenv("TEST_WITH_BROTLI")): if strtobool(os.getenv("TEST_WITH_BROTLI")):
assert fname.endswith('.br'), "Forced TEST_WITH_BROTLI then it should be a .br filename" assert fname.endswith('.br'), "Forced TEST_WITH_BROTLI then it should be a .br filename"
@@ -123,11 +123,18 @@ def test_consistent_history(client, live_server, measure_memory_usage, datastore
assert json_obj['watching'][w]['title'], "Watch should have a title set" assert json_obj['watching'][w]['title'], "Watch should have a title set"
assert contents.startswith(watch_title + "x"), f"Snapshot contents in file {fname} should start with '{watch_title}x', got '{contents}'" assert contents.startswith(watch_title + "x"), f"Snapshot contents in file {fname} should start with '{watch_title}x', got '{contents}'"
# With new format, we also have watch.json, so 4 files total # With new format, we have watch.json, so 4 files minimum
# Note: last-checksum.txt may or may not exist - it gets cleared by settings changes,
# and this test changes settings before checking files
# This assertion should be AFTER the loop, not inside it
if os.path.exists(changedetection_json): if os.path.exists(changedetection_json):
assert len(files_in_watch_dir) == 4, "Should be four files in the dir with new format: watch.json, html.br snapshot, history.txt and the extracted text snapshot" # 4 required files: watch.json, html.br, history.txt, extracted text snapshot
# last-checksum.txt is optional (cleared by settings changes in this test)
assert 4 <= len(files_in_watch_dir) <= 5, f"Should be 4-5 files in the dir with new format (last-checksum.txt is optional). Found {len(files_in_watch_dir)}: {files_in_watch_dir}"
else: else:
assert len(files_in_watch_dir) == 3, "Should be just three files in the dir with legacy format: html.br snapshot, history.txt and the extracted text snapshot" # 3 required files: html.br, history.txt, extracted text snapshot
# last-checksum.txt is optional
assert 3 <= len(files_in_watch_dir) <= 4, f"Should be 3-4 files in the dir with legacy format (last-checksum.txt is optional). Found {len(files_in_watch_dir)}: {files_in_watch_dir}"
# Check that 'default' Watch vars aren't accidentally being saved # Check that 'default' Watch vars aren't accidentally being saved
if os.path.exists(changedetection_json): if os.path.exists(changedetection_json):
@@ -41,7 +41,6 @@ def set_modified_ignore_response(datastore_path):
def test_render_anchor_tag_content_true(client, live_server, measure_memory_usage, datastore_path): def test_render_anchor_tag_content_true(client, live_server, measure_memory_usage, datastore_path):
"""Testing that the link changes are detected when """Testing that the link changes are detected when
render_anchor_tag_content setting is set to true""" render_anchor_tag_content setting is set to true"""
sleep_time_for_fetch_thread = 3
# Give the endpoint time to spin up # Give the endpoint time to spin up
time.sleep(1) time.sleep(1)
@@ -100,7 +100,6 @@ def test_normal_page_check_works_with_ignore_status_code(client, live_server, me
# Tests the whole stack works with status codes ignored # Tests the whole stack works with status codes ignored
def test_403_page_check_works_with_ignore_status_code(client, live_server, measure_memory_usage, datastore_path): def test_403_page_check_works_with_ignore_status_code(client, live_server, measure_memory_usage, datastore_path):
sleep_time_for_fetch_thread = 3
set_original_response(datastore_path=datastore_path) set_original_response(datastore_path=datastore_path)
@@ -112,8 +111,7 @@ def test_403_page_check_works_with_ignore_status_code(client, live_server, measu
uuid = client.application.config.get('DATASTORE').add_watch(url=test_url) uuid = client.application.config.get('DATASTORE').add_watch(url=test_url)
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up wait_for_all_checks(client)
time.sleep(sleep_time_for_fetch_thread)
# Goto the edit page, check our ignore option # Goto the edit page, check our ignore option
# Add our URL to the import page # Add our URL to the import page
@@ -2,10 +2,9 @@
import time import time
from flask import url_for from flask import url_for
from . util import live_server_setup
import os import os
from .util import live_server_setup, delete_all_watches, wait_for_all_checks
# Should be the same as set_original_ignore_response(datastore_path=datastore_path) but with a little more whitespacing # Should be the same as set_original_ignore_response(datastore_path=datastore_path) but with a little more whitespacing
@@ -50,10 +49,7 @@ def set_original_ignore_response(datastore_path):
# If there was only a change in the whitespacing, then we shouldn't have a change detected # If there was only a change in the whitespacing, then we shouldn't have a change detected
def test_check_ignore_whitespace(client, live_server, measure_memory_usage, datastore_path): def test_check_ignore_whitespace(client, live_server, measure_memory_usage, datastore_path):
sleep_time_for_fetch_thread = 3
# Give the endpoint time to spin up
time.sleep(1)
set_original_ignore_response(datastore_path=datastore_path) set_original_ignore_response(datastore_path=datastore_path)
@@ -74,17 +70,17 @@ def test_check_ignore_whitespace(client, live_server, measure_memory_usage, data
uuid = client.application.config.get('DATASTORE').add_watch(url=test_url) uuid = client.application.config.get('DATASTORE').add_watch(url=test_url)
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) wait_for_all_checks(client)
# Trigger a check # Trigger a check
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
set_original_ignore_response_but_with_whitespace(datastore_path) set_original_ignore_response_but_with_whitespace(datastore_path)
time.sleep(sleep_time_for_fetch_thread) wait_for_all_checks(client)
# Trigger a check # Trigger a check
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) wait_for_all_checks(client)
# It should report nothing found (no new 'has-unread-changes' class) # It should report nothing found (no new 'has-unread-changes' class)
res = client.get(url_for("watchlist.index")) res = client.get(url_for("watchlist.index"))
+100
@@ -24,6 +24,29 @@ def set_original_response(datastore_path):
f.write(test_return_data) f.write(test_return_data)
return None return None
def test_favicon(client, live_server, measure_memory_usage, datastore_path):
# Attempt to fetch it, make sure that works
SVG_BASE64 = 'PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAxIDEiLz4='
uuid = client.application.config.get('DATASTORE').add_watch(url='https://localhost')
live_server.app.config['DATASTORE'].data['watching'][uuid].bump_favicon(url="favicon-set-type.svg",
favicon_base_64=SVG_BASE64
)
res = client.get(url_for('static_content', group='favicon', filename=uuid))
assert res.status_code == 200
assert len(res.data) > 10
res = client.get(url_for('static_content', group='..', filename='__init__.py'))
assert res.status_code != 200
res = client.get(url_for('static_content', group='.', filename='../__init__.py'))
assert res.status_code != 200
# Traverse by filename protection
res = client.get(url_for('static_content', group='js', filename='../styles/styles.css'))
assert res.status_code != 200
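The traversal checks above rely on the group name being sanitized before it is joined to a filesystem path. The docstring of test_static_directory_traversal in this same diff quotes the old and new character filters; the sketch below only compares those two patterns on sample inputs. The helper `sanitize` is hypothetical, and how the application actually applies the pattern (for example, whether it lowercases first) is an assumption not shown here.

```python
import re

# Patterns quoted from the test docstring: old filter vs. fixed filter.
OLD_PATTERN = re.compile(r'[^\w.-]+')
NEW_PATTERN = re.compile(r'[^a-z0-9_]+')


def sanitize(pattern: "re.Pattern[str]", value: str) -> str:
    """Remove every run of characters that the pattern does not allow."""
    return pattern.sub('', value)


if __name__ == "__main__":
    for candidate in ("..", "js", "visual_selector_data"):
        print(repr(candidate), "old:", repr(sanitize(OLD_PATTERN, candidate)),
              "new:", repr(sanitize(NEW_PATTERN, candidate)))
```

The old filter leaves ".." untouched because dots are in its allowed set, while the new filter reduces it to an empty string and still passes names made of lowercase letters, digits and underscores.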
def test_bad_access(client, live_server, measure_memory_usage, datastore_path): def test_bad_access(client, live_server, measure_memory_usage, datastore_path):
res = client.post( res = client.post(
@@ -478,3 +501,80 @@ def test_logout_with_redirect(client, live_server, measure_memory_usage, datasto
# Cleanup # Cleanup
del client.application.config['DATASTORE'].data['settings']['application']['password'] del client.application.config['DATASTORE'].data['settings']['application']['password']
def test_static_directory_traversal(client, live_server, measure_memory_usage, datastore_path):
"""
Test that the static file serving route properly blocks directory traversal attempts.
This tests the fix for GHSA-9jj8-v89v-xjvw (CVE pending).
The vulnerability was in /static/<group>/<filename> where the sanitization regex
allowed dots, enabling "../" traversal to read application source files.
The fix changed the regex from r'[^\w.-]+' to r'[^a-z0-9_]+' which blocks dots.
"""
# Test 1: Direct .. traversal attempt (URL-encoded)
res = client.get(
"/static/%2e%2e/flask_app.py",
follow_redirects=False
)
# Should be blocked (404 or 403)
assert res.status_code in [404, 403], f"Expected 404/403, got {res.status_code}"
# Should NOT contain application source code
assert b"def static_content" not in res.data
assert b"changedetection_app" not in res.data
# Test 2: Direct .. traversal attempt (unencoded)
res = client.get(
"/static/../flask_app.py",
follow_redirects=False
)
assert res.status_code in [404, 403], f"Expected 404/403, got {res.status_code}"
assert b"def static_content" not in res.data
# Test 3: Multiple dots traversal
res = client.get(
"/static/..../flask_app.py",
follow_redirects=False
)
assert res.status_code in [404, 403], f"Expected 404/403, got {res.status_code}"
assert b"def static_content" not in res.data
# Test 4: Try to access other application files
for filename in ["__init__.py", "datastore.py", "store.py"]:
res = client.get(
f"/static/%2e%2e/{filename}",
follow_redirects=False
)
assert res.status_code in [404, 403], f"File {filename} should be blocked"
# Should not contain Python code indicators
assert b"import" not in res.data or b"# Test" in res.data # Allow "1 Imported" etc
# Test 5: Verify legitimate static files still work
# Note: We can't test actual files without knowing what exists,
# but we can verify the sanitization doesn't break valid groups
res = client.get(
"/static/images/test.png", # Will 404 if file doesn't exist, but won't traverse
follow_redirects=False
)
# Should get 404 (file not found) not 403 (blocked)
# This confirms the group name "images" is valid
assert res.status_code == 404
# Test 6: Ensure a multi-level ../ traversal to a system file is blocked
res = client.get(
"/static/../../../etc/passwd",
follow_redirects=False
)
assert res.status_code in [404, 403]
assert b"root:" not in res.data
# Test 7: Test that underscores still work (they're allowed)
res = client.get(
"/static/visual_selector_data/test.json",
follow_redirects=False
)
# visual_selector_data is a real group, but requires auth
# Should get 403 (not authenticated) or 404 (file not found), not a path traversal
assert res.status_code in [403, 404]
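Review note: the docstring of test_static_directory_traversal says the fix swapped the sanitization regex from r'[^\w.-]+' to r'[^a-z0-9_]+'. The sketch below only illustrates what that change does to a group name. It assumes the route removes non-matching characters with re.sub; the helper name `sanitize` is hypothetical and is not taken from the application code.

```python
import re

# Patterns quoted in the test docstring above.
OLD_PATTERN = r'[^\w.-]+'      # keeps word characters, dots and hyphens
NEW_PATTERN = r'[^a-z0-9_]+'   # keeps only lowercase letters, digits and underscores


def sanitize(value: str, pattern: str) -> str:
    """Hypothetical helper: drop every character the pattern matches.

    Assumption: the real route strips disallowed characters this way;
    the diff does not show the actual call.
    """
    return re.sub(pattern, '', value)


# With the old pattern, ".." survives and can act as a parent-directory segment.
old_group = sanitize('..', OLD_PATTERN)

# With the new pattern, the dots are removed, so nothing is left to traverse with.
new_group = sanitize('..', NEW_PATTERN)

# Legitimate group names used in the tests pass through unchanged.
valid_group = sanitize('visual_selector_data', NEW_PATTERN)
```

Under these assumptions the old pattern leaves `..` intact while the new one reduces it to an empty string, which matches the 403/404 expectations in the tests above.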
@@ -0,0 +1,208 @@
#!/usr/bin/env python3
"""
Test that changing global settings or tag configurations forces reprocessing.
When settings or tag configurations change, all affected watches need to
reprocess even if their content hasn't changed, because configuration affects
the processing result.
"""
import os
import time
from flask import url_for
from .util import wait_for_all_checks
def test_settings_change_forces_reprocess(client, live_server, measure_memory_usage, datastore_path):
"""
Test that changing global settings clears all checksums to force reprocessing.
"""
# Setup test content
test_html = """<html>
<body>
<p>Test content that stays the same</p>
</body>
</html>
"""
with open(os.path.join(datastore_path, "endpoint-content.txt"), "w") as f:
f.write(test_html)
test_url = url_for('test_endpoint', _external=True)
# Add two watches
datastore = client.application.config.get('DATASTORE')
uuid1 = datastore.add_watch(url=test_url, extras={'title': 'Watch 1'})
uuid2 = datastore.add_watch(url=test_url, extras={'title': 'Watch 2'})
# Unpause watches
datastore.data['watching'][uuid1]['paused'] = False
datastore.data['watching'][uuid2]['paused'] = False
# First check - establishes baseline
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Verify checksum files were created
checksum1 = os.path.join(datastore_path, uuid1, 'last-checksum.txt')
checksum2 = os.path.join(datastore_path, uuid2, 'last-checksum.txt')
assert os.path.isfile(checksum1), "First check should create checksum file for watch 1"
assert os.path.isfile(checksum2), "First check should create checksum file for watch 2"
# Change global settings (any setting will do)
res = client.post(
url_for("settings.settings_page"),
data={
"application-empty_pages_are_a_change": "",
"requests-time_between_check-minutes": 180,
'application-fetch_backend': "html_requests"
},
follow_redirects=True
)
assert b"Settings updated." in res.data
# Give it a moment to process
time.sleep(0.5)
# Verify ALL checksum files were deleted
assert not os.path.isfile(checksum1), "Settings change should delete checksum for watch 1"
assert not os.path.isfile(checksum2), "Settings change should delete checksum for watch 2"
# Next check should reprocess (not skip) and recreate checksums
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Verify checksum files were recreated
assert os.path.isfile(checksum1), "Reprocessing should recreate checksum file for watch 1"
assert os.path.isfile(checksum2), "Reprocessing should recreate checksum file for watch 2"
print("✓ Settings change forces reprocessing of all watches")
def test_tag_change_forces_reprocess(client, live_server, measure_memory_usage, datastore_path):
"""
Test that changing a tag configuration clears checksums only for watches with that tag.
"""
# Setup test content
test_html = """<html>
<body>
<p>Test content that stays the same</p>
</body>
</html>
"""
with open(os.path.join(datastore_path, "endpoint-content.txt"), "w") as f:
f.write(test_html)
test_url = url_for('test_endpoint', _external=True)
# Create a tag
datastore = client.application.config.get('DATASTORE')
tag_uuid = datastore.add_tag('Test Tag')
# Add watches - one with tag, one without
uuid_with_tag = datastore.add_watch(url=test_url, extras={'title': 'Watch With Tag', 'tags': [tag_uuid]})
uuid_without_tag = datastore.add_watch(url=test_url, extras={'title': 'Watch Without Tag'})
# Unpause watches
datastore.data['watching'][uuid_with_tag]['paused'] = False
datastore.data['watching'][uuid_without_tag]['paused'] = False
# First check - establishes baseline
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Verify checksum files were created
checksum_with = os.path.join(datastore_path, uuid_with_tag, 'last-checksum.txt')
checksum_without = os.path.join(datastore_path, uuid_without_tag, 'last-checksum.txt')
assert os.path.isfile(checksum_with), "First check should create checksum for tagged watch"
assert os.path.isfile(checksum_without), "First check should create checksum for untagged watch"
# Edit the tag (change notification_muted as an example)
tag = datastore.data['settings']['application']['tags'][tag_uuid]
res = client.post(
url_for("tags.form_tag_edit_submit", uuid=tag_uuid),
data={
'title': 'Test Tag',
'notification_muted': 'y',
'overrides_watch': 'n'
},
follow_redirects=True
)
assert b"Updated" in res.data
# Give it a moment to process
time.sleep(0.5)
# Verify ONLY the tagged watch's checksum was deleted
assert not os.path.isfile(checksum_with), "Tag change should delete checksum for watch WITH tag"
assert os.path.isfile(checksum_without), "Tag change should NOT delete checksum for watch WITHOUT tag"
# Next check should reprocess tagged watch and recreate its checksum
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Verify tagged watch's checksum was recreated
assert os.path.isfile(checksum_with), "Reprocessing should recreate checksum for tagged watch"
assert os.path.isfile(checksum_without), "Untagged watch should still have its checksum"
print("✓ Tag change forces reprocessing only for watches with that tag")
def test_tag_change_via_api_forces_reprocess(client, live_server, measure_memory_usage, datastore_path):
"""
Test that updating a tag via API also clears checksums for affected watches.
"""
# Setup test content
test_html = """<html>
<body>
<p>Test content</p>
</body>
</html>
"""
with open(os.path.join(datastore_path, "endpoint-content.txt"), "w") as f:
f.write(test_html)
test_url = url_for('test_endpoint', _external=True)
# Create a tag
datastore = client.application.config.get('DATASTORE')
tag_uuid = datastore.add_tag('API Test Tag')
# Add watch with tag
uuid_with_tag = datastore.add_watch(url=test_url, extras={'title': 'API Watch'})
datastore.data['watching'][uuid_with_tag]['paused'] = False
datastore.data['watching'][uuid_with_tag]['tags'] = [tag_uuid]
datastore.data['watching'][uuid_with_tag].commit()
# First check
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Verify checksum exists
checksum_file = os.path.join(datastore_path, uuid_with_tag, 'last-checksum.txt')
assert os.path.isfile(checksum_file), "First check should create checksum file"
# Update tag via API
res = client.put(
f'/api/v1/tag/{tag_uuid}',
json={'notification_muted': True},
headers={'x-api-key': datastore.data['settings']['application']['api_access_token']}
)
assert res.status_code == 200, f"API call failed with status {res.status_code}: {res.data}"
# Give it more time for async operations
time.sleep(1.0)
# Debug: Check if checksum still exists
if os.path.isfile(checksum_file):
# Read checksum to see if it changed
with open(checksum_file, 'r') as f:
checksum_content = f.read()
print(f"Checksum still exists: {checksum_content}")
# Verify checksum was deleted
assert not os.path.isfile(checksum_file), "API tag update should delete checksum"
print("✓ Tag update via API forces reprocessing")
@@ -6,9 +6,6 @@ from urllib.request import urlopen
from .util import set_original_response, set_modified_response, live_server_setup, delete_all_watches from .util import set_original_response, set_modified_response, live_server_setup, delete_all_watches
import re import re
sleep_time_for_fetch_thread = 3
def test_share_watch(client, live_server, measure_memory_usage, datastore_path): def test_share_watch(client, live_server, measure_memory_usage, datastore_path):
set_original_response(datastore_path=datastore_path) set_original_response(datastore_path=datastore_path)
+4 -2
@@ -6,7 +6,6 @@ from urllib.request import urlopen
from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks
from ..diff import ADDED_STYLE from ..diff import ADDED_STYLE
sleep_time_for_fetch_thread = 3
def test_check_basic_change_detection_functionality_source(client, live_server, measure_memory_usage, datastore_path): def test_check_basic_change_detection_functionality_source(client, live_server, measure_memory_usage, datastore_path):
set_original_response(datastore_path=datastore_path) set_original_response(datastore_path=datastore_path)
@@ -72,7 +71,10 @@ def test_check_ignore_elements(client, live_server, measure_memory_usage, datast
follow_redirects=True follow_redirects=True
) )
time.sleep(sleep_time_for_fetch_thread) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
res = client.get( res = client.get(
url_for("ui.ui_preview.preview_page", uuid="first"), url_for("ui.ui_preview.preview_page", uuid="first"),
@@ -2,7 +2,8 @@
import time import time
from flask import url_for from flask import url_for
from . util import live_server_setup, delete_all_watches
from .util import live_server_setup, delete_all_watches, wait_for_all_checks
import os import os
@@ -25,9 +26,6 @@ def set_original_ignore_response(datastore_path):
def test_trigger_regex_functionality_with_filter(client, live_server, measure_memory_usage, datastore_path): def test_trigger_regex_functionality_with_filter(client, live_server, measure_memory_usage, datastore_path):
# live_server_setup(live_server) # Setup on conftest per function
sleep_time_for_fetch_thread = 3
set_original_ignore_response(datastore_path=datastore_path) set_original_ignore_response(datastore_path=datastore_path)
# Give the endpoint time to spin up # Give the endpoint time to spin up
@@ -38,8 +36,7 @@ def test_trigger_regex_functionality_with_filter(client, live_server, measure_me
uuid = client.application.config.get('DATASTORE').add_watch(url=test_url) uuid = client.application.config.get('DATASTORE').add_watch(url=test_url)
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
# it needs time to save the original version wait_for_all_checks(client)
time.sleep(sleep_time_for_fetch_thread)
### test regex with filter ### test regex with filter
res = client.post( res = client.post(
@@ -52,8 +49,9 @@ def test_trigger_regex_functionality_with_filter(client, live_server, measure_me
follow_redirects=True follow_redirects=True
) )
# Give the thread time to pick it up client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
wait_for_all_checks(client)
client.get(url_for("ui.ui_diff.diff_history_page", uuid="first")) client.get(url_for("ui.ui_diff.diff_history_page", uuid="first"))
@@ -62,7 +60,8 @@ def test_trigger_regex_functionality_with_filter(client, live_server, measure_me
f.write("<html>some new noise with cool stuff2 ok</html>") f.write("<html>some new noise with cool stuff2 ok</html>")
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
wait_for_all_checks(client)
# It should report nothing found (nothing should match the regex and filter) # It should report nothing found (nothing should match the regex and filter)
res = client.get(url_for("watchlist.index")) res = client.get(url_for("watchlist.index"))
@@ -73,7 +72,8 @@ def test_trigger_regex_functionality_with_filter(client, live_server, measure_me
f.write("<html>some new noise with <span id=in-here>cool stuff6</span> ok</html>") f.write("<html>some new noise with <span id=in-here>cool stuff6</span> ok</html>")
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True) client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
wait_for_all_checks(client)
res = client.get(url_for("watchlist.index")) res = client.get(url_for("watchlist.index"))
assert b'has-unread-changes' in res.data assert b'has-unread-changes' in res.data
@@ -0,0 +1,246 @@
#!/usr/bin/env python3
"""
Test the watch edited flag functionality.
This tests the private __watch_was_edited flag that tracks when writable
watch fields are modified, which prevents skipping reprocessing when the
watch configuration has changed.
"""
import os
import time
from flask import url_for
from .util import live_server_setup, wait_for_all_checks
def set_test_content(datastore_path):
"""Write test HTML content to endpoint-content.txt for test server."""
test_html = """<html>
<body>
<p>Test content for watch edited flag tests</p>
<p>This content stays the same across checks</p>
</body>
</html>
"""
with open(os.path.join(datastore_path, "endpoint-content.txt"), "w") as f:
f.write(test_html)
def test_watch_edited_flag_lifecycle(client, live_server, measure_memory_usage, datastore_path):
"""
Test the full lifecycle of the was_edited flag:
1. Flag starts False when watch is created
2. Flag becomes True when writable fields are modified
3. Flag is reset False after worker processing
4. Flag stays False when readonly fields are modified
"""
# Setup - Add a watch
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("ui.ui_views.form_quick_watch_add"),
data={"url": test_url, "tags": "", "edit_and_watch_submit_button": "Edit > Watch"},
follow_redirects=True
)
assert b"Watch added" in res.data or b"Updated watch" in res.data
# Get the watch UUID
datastore = client.application.config.get('DATASTORE')
uuid = list(datastore.data['watching'].keys())[0]
watch = datastore.data['watching'][uuid]
# Reset flag after initial form submission (form sets fields which trigger the flag)
watch.reset_watch_edited_flag()
# Test 1: Flag should be False after reset
assert not watch.was_edited, "Flag should be False after reset"
# Test 2: Modify a writable field (title) - flag should become True
watch['title'] = 'New Title'
assert watch.was_edited, "Flag should be True after modifying writable field 'title'"
# Test 3: Reset flag manually (simulating what worker does)
watch.reset_watch_edited_flag()
assert not watch.was_edited, "Flag should be False after reset"
# Test 4: Modify another writable field (url) - flag should become True again
watch['url'] = 'https://example.com'
assert watch.was_edited, "Flag should be True after modifying writable field 'url'"
# Test 5: Reset and modify a readonly field - flag should stay False
watch.reset_watch_edited_flag()
assert not watch.was_edited, "Flag should be False after reset"
# Modify readonly field (uuid) - should not set flag
old_uuid = watch['uuid']
watch['uuid'] = 'readonly-test-uuid'
assert not watch.was_edited, "Flag should stay False when modifying readonly field 'uuid'"
watch['uuid'] = old_uuid # Restore original
# Note: Worker reset behavior is tested in test_check_removed_line_contains_trigger
# and test_watch_edited_flag_prevents_skip
print("✓ All watch edited flag lifecycle tests passed")
def test_watch_edited_flag_dict_methods(client, live_server, measure_memory_usage, datastore_path):
"""
Test that the flag is set correctly by various dict methods:
- __setitem__ (watch['key'] = value)
- update() (watch.update({'key': value}))
- setdefault() (watch.setdefault('key', default))
- pop() (watch.pop('key'))
- __delitem__ (del watch['key'])
"""
# Setup - Add a watch
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("ui.ui_views.form_quick_watch_add"),
data={"url": test_url, "tags": "", "edit_and_watch_submit_button": "Edit > Watch"},
follow_redirects=True
)
datastore = client.application.config.get('DATASTORE')
uuid = list(datastore.data['watching'].keys())[0]
watch = datastore.data['watching'][uuid]
# Test __setitem__
watch.reset_watch_edited_flag()
watch['title'] = 'Test via setitem'
assert watch.was_edited, "Flag should be True after __setitem__ on writable field"
# Test update() with dict
watch.reset_watch_edited_flag()
watch.update({'title': 'Test via update dict'})
assert watch.was_edited, "Flag should be True after update() with writable field"
# Test update() with kwargs
watch.reset_watch_edited_flag()
watch.update(title='Test via update kwargs')
assert watch.was_edited, "Flag should be True after update() kwargs with writable field"
# Test setdefault() on an existing key (no change), then on a new key
watch.reset_watch_edited_flag()
watch.setdefault('title', 'Should not be set') # Key exists, no change
assert not watch.was_edited, "Flag should stay False when setdefault() doesn't change existing key"
watch.setdefault('custom_field', 'New value') # New key
assert watch.was_edited, "Flag should be True after setdefault() creates new writable field"
# Test pop() on writable field
watch.reset_watch_edited_flag()
watch.pop('custom_field', None)
assert watch.was_edited, "Flag should be True after pop() on writable field"
# Test __delitem__ on writable field
watch.reset_watch_edited_flag()
watch['temp_field'] = 'temp'
watch.reset_watch_edited_flag() # Reset after adding
del watch['temp_field']
assert watch.was_edited, "Flag should be True after __delitem__ on writable field"
print("✓ All dict methods correctly set the flag")
def test_watch_edited_flag_prevents_skip(client, live_server, measure_memory_usage, datastore_path):
"""
Test that the was_edited flag prevents skipping reprocessing.
When watch configuration is edited, it should reprocess even if content unchanged.
After worker processing, flag should be reset and subsequent checks can skip.
"""
# Setup test content
set_test_content(datastore_path)
# Setup - Add a watch
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("ui.ui_views.form_quick_watch_add"),
data={"url": test_url, "tags": "", "edit_and_watch_submit_button": "Edit > Watch"},
follow_redirects=True
)
assert b"Watch added" in res.data or b"Updated watch" in res.data
datastore = client.application.config.get('DATASTORE')
uuid = list(datastore.data['watching'].keys())[0]
watch = datastore.data['watching'][uuid]
# Unpause the watch (watches are paused by default in tests)
watch['paused'] = False
# Run first check to establish baseline
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Verify first check completed successfully - checksum file should exist
checksum_file = os.path.join(datastore_path, uuid, 'last-checksum.txt')
assert os.path.isfile(checksum_file), "First check should create last-checksum.txt file"
# Reset the was_edited flag (simulating clean state after processing)
watch.reset_watch_edited_flag()
assert not watch.was_edited, "Flag should be False after reset"
# Run second check without any changes - should skip via checksumFromPreviousCheckWasTheSame
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# Verify it was skipped (last_check_status should indicate skip)
# Note: The actual skip is tested in test_check_removed_line_contains_trigger
# Here we're focused on the was_edited flag interaction
# Now modify the watch - flag should become True
watch['title'] = 'Modified Title'
assert watch.was_edited, "Flag should be True after modifying watch"
# Run third check - should NOT skip because was_edited=True even though content unchanged
client.get(url_for("ui.form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
# After worker processing, the flag should be reset by the worker
# This reset happens in the processor's run() method after processing completes
assert not watch.was_edited, "Flag should be False after worker processing"
print("✓ was_edited flag correctly prevents skip and is reset by worker")
def test_watch_edited_flag_system_fields(client, live_server, measure_memory_usage, datastore_path):
"""
Test that system fields (readonly + additional system fields) don't trigger the flag.
"""
# Setup - Add a watch
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("ui.ui_views.form_quick_watch_add"),
data={"url": test_url, "tags": "", "edit_and_watch_submit_button": "Edit > Watch"},
follow_redirects=True
)
datastore = client.application.config.get('DATASTORE')
uuid = list(datastore.data['watching'].keys())[0]
watch = datastore.data['watching'][uuid]
# Test readonly fields from OpenAPI spec
readonly_fields = ['uuid', 'date_created', 'last_viewed']
for field in readonly_fields:
watch.reset_watch_edited_flag()
if field in watch:
old_value = watch[field]
watch[field] = 'modified-readonly-value'
assert not watch.was_edited, f"Flag should stay False when modifying readonly field '{field}'"
watch[field] = old_value # Restore
# Test additional system fields not in OpenAPI spec yet
system_fields = ['last_check_status']
for field in system_fields:
watch.reset_watch_edited_flag()
watch[field] = 'system-value'
assert not watch.was_edited, f"Flag should stay False when modifying system field '{field}'"
# Test that content-type (readonly per OpenAPI) doesn't trigger flag
watch.reset_watch_edited_flag()
watch['content-type'] = 'text/html'
assert not watch.was_edited, "Flag should stay False when modifying 'content-type' (readonly)"
print("✓ System fields correctly don't trigger the flag")
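Review note: the tests above describe a dict-like watch object whose edited flag is set by writes to writable fields and ignored for system fields. The following is a minimal sketch of that behaviour, not the project's implementation. The class name `EditTrackingDict` and the `SYSTEM_FIELDS` set are hypothetical; only `was_edited` and `reset_watch_edited_flag()` are names taken from the tests.

```python
class EditTrackingDict(dict):
    """Sketch of a dict that remembers whether a writable key was modified."""

    # Hypothetical list, assembled from the fields the tests treat as readonly/system.
    SYSTEM_FIELDS = frozenset(
        {'uuid', 'date_created', 'last_viewed', 'last_check_status', 'content-type'}
    )

    def __init__(self, *args, **kwargs):
        # dict.__init__ does not go through __setitem__, so the flag starts False.
        super().__init__(*args, **kwargs)
        self._was_edited = False

    @property
    def was_edited(self):
        return self._was_edited

    def reset_watch_edited_flag(self):
        self._was_edited = False

    def _mark(self, key):
        # Only writes to non-system keys count as an edit.
        if key not in self.SYSTEM_FIELDS:
            self._was_edited = True

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._mark(key)

    def __delitem__(self, key):
        super().__delitem__(key)
        self._mark(key)

    def update(self, *args, **kwargs):
        # Route every key through __setitem__ so the flag logic is shared.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        # An existing key is left alone and must not set the flag.
        if key not in self:
            self[key] = default
        return self[key]

    def pop(self, key, *default):
        existed = key in self
        result = super().pop(key, *default)
        if existed:
            self._mark(key)
        return result
```

Overriding each mutating method is needed because the built-in `dict.update()`, `setdefault()` and `pop()` do not call `__setitem__` or `__delitem__` on subclasses.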
+18
@@ -160,6 +160,7 @@ def extract_UUID_from_client(client):
return uuid.strip() return uuid.strip()
def delete_all_watches(client=None): def delete_all_watches(client=None):
wait_for_all_checks(client)
uuids = list(client.application.config.get('DATASTORE').data['watching']) uuids = list(client.application.config.get('DATASTORE').data['watching'])
for uuid in uuids: for uuid in uuids:
@@ -180,6 +181,23 @@ def delete_all_watches(client=None):
time.sleep(0.2) time.sleep(0.2)
# Delete any old watch metadata
from pathlib import Path
base_path = Path(
client.application.config.get('DATASTORE').datastore_path
).resolve()
max_depth = 2
for file in base_path.rglob("*.json"):
# Calculate depth relative to base path
depth = len(file.relative_to(base_path).parts) - 1
if depth <= max_depth and file.is_file():
file.unlink()
def wait_for_all_checks(client=None): def wait_for_all_checks(client=None):
""" """
Waits until the queue is empty and workers are idle. Waits until the queue is empty and workers are idle.
@@ -88,7 +88,6 @@ def test_visual_selector_content_ready(client, live_server, measure_memory_usage
def test_basic_browserstep(client, live_server, measure_memory_usage, datastore_path): def test_basic_browserstep(client, live_server, measure_memory_usage, datastore_path):
assert os.getenv('PLAYWRIGHT_DRIVER_URL'), "Needs PLAYWRIGHT_DRIVER_URL set for this test"
test_url = url_for('test_interactive_html_endpoint', _external=True) test_url = url_for('test_interactive_html_endpoint', _external=True)
test_url = test_url.replace('localhost.localdomain', 'cdio') test_url = test_url.replace('localhost.localdomain', 'cdio')
@@ -108,13 +107,13 @@ def test_basic_browserstep(client, live_server, measure_memory_usage, datastore_
"url": test_url, "url": test_url,
"tags": "", "tags": "",
'fetch_backend': "html_webdriver", 'fetch_backend': "html_webdriver",
'browser_steps-0-operation': 'Enter text in field', 'browser_steps-5-operation': 'Enter text in field',
'browser_steps-0-selector': '#test-input-text', 'browser_steps-5-selector': '#test-input-text',
# Should get set to the actual text (jinja2 rendered) # Should get set to the actual text (jinja2 rendered)
'browser_steps-0-optional_value': "Hello-Jinja2-{% now 'Europe/Berlin', '%Y-%m-%d' %}", 'browser_steps-5-optional_value': "Hello-Jinja2-{% now 'Europe/Berlin', '%Y-%m-%d' %}",
'browser_steps-1-operation': 'Click element', 'browser_steps-8-operation': 'Click element',
'browser_steps-1-selector': 'button[name=test-button]', 'browser_steps-8-selector': 'button[name=test-button]',
'browser_steps-1-optional_value': '', 'browser_steps-8-optional_value': '',
# For now, cookies don't work in headers because it must be a full cookiejar object # For now, cookies don't work in headers because it must be a full cookiejar object
'headers': "testheader: yes\buser-agent: MyCustomAgent", 'headers': "testheader: yes\buser-agent: MyCustomAgent",
"time_between_check_use_default": "y", "time_between_check_use_default": "y",
@@ -122,9 +121,18 @@ def test_basic_browserstep(client, live_server, measure_memory_usage, datastore_
follow_redirects=True follow_redirects=True
) )
assert b"unpaused" in res.data assert b"unpaused" in res.data
wait_for_all_checks(client)
wait_for_all_checks(client)
uuid = next(iter(live_server.app.config['DATASTORE'].data['watching'])) uuid = next(iter(live_server.app.config['DATASTORE'].data['watching']))
# 3874 - should have tidied up any blanks
watch = live_server.app.config['DATASTORE'].data['watching'][uuid]
assert watch['browser_steps'][0].get('operation') == 'Enter text in field'
assert watch['browser_steps'][1].get('selector') == 'button[name=test-button]'
# This part actually needs the browser, before this we are just testing data
assert os.getenv('PLAYWRIGHT_DRIVER_URL'), "Needs PLAYWRIGHT_DRIVER_URL set for this test"
assert live_server.app.config['DATASTORE'].data['watching'][uuid].history_n >= 1, "Watch history had at least 1 (everything fetched OK)" assert live_server.app.config['DATASTORE'].data['watching'][uuid].history_n >= 1, "Watch history had at least 1 (everything fetched OK)"
assert b"This text should be removed" not in res.data assert b"This text should be removed" not in res.data
+23 -11
@@ -276,6 +276,9 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore, exec
# Yes fine, so nothing todo, don't continue to process. # Yes fine, so nothing todo, don't continue to process.
process_changedetection_results = False process_changedetection_results = False
changed_detected = False changed_detected = False
logger.debug(f'[{uuid}] - checksumFromPreviousCheckWasTheSame - Checksum from previous check was the same, nothing todo here.')
# Reset the edited flag since we successfully completed the check
watch.reset_watch_edited_flag()
except content_fetchers_exceptions.BrowserConnectError as e: except content_fetchers_exceptions.BrowserConnectError as e:
datastore.update_watch(uuid=uuid, datastore.update_watch(uuid=uuid,
@@ -378,7 +381,7 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore, exec
if not datastore.data['watching'].get(uuid): if not datastore.data['watching'].get(uuid):
continue continue
update_obj['content-type'] = update_handler.fetcher.get_all_headers().get('content-type', '').lower() update_obj['content-type'] = str(update_handler.fetcher.get_all_headers().get('content-type', '') or "").lower()
if not watch.get('ignore_status_codes'): if not watch.get('ignore_status_codes'):
update_obj['consecutive_filter_failures'] = 0 update_obj['consecutive_filter_failures'] = 0
@@ -392,6 +395,8 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore, exec
logger.debug(f"Processing watch UUID: {uuid} - xpath_data length returned {len(update_handler.xpath_data) if update_handler and update_handler.xpath_data else 'empty.'}") logger.debug(f"Processing watch UUID: {uuid} - xpath_data length returned {len(update_handler.xpath_data) if update_handler and update_handler.xpath_data else 'empty.'}")
if update_handler and process_changedetection_results: if update_handler and process_changedetection_results:
try: try:
# Reset the edited flag BEFORE update_watch (which calls watch.update() and would set it again)
watch.reset_watch_edited_flag()
datastore.update_watch(uuid=uuid, update_obj=update_obj) datastore.update_watch(uuid=uuid, update_obj=update_obj)
if changed_detected or not watch.history_n: if changed_detected or not watch.history_n:
@@ -439,8 +444,22 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore, exec
logger.exception(f"Worker {worker_id} full exception details:") logger.exception(f"Worker {worker_id} full exception details:")
datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)}) datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)})
# Always record attempt count # Always record attempt count
count = watch.get('check_count', 0) + 1 count = watch.get('check_count', 0) + 1
final_updates = {'fetch_time': round(time.time() - fetch_start_time, 3),
'check_count': count,
}
# Record server header
try:
server_header = str(update_handler.fetcher.get_all_headers().get('server', '') or "").strip().lower()[:255]
if server_header:
final_updates['remote_server_reply'] = server_header
except Exception:
server_header = None
if update_handler: # Could be none or empty if the processor was not found if update_handler: # Could be none or empty if the processor was not found
# Always record page title (used in notifications, and can change even when the content is the same) # Always record page title (used in notifications, and can change even when the content is the same)
if update_obj.get('content-type') and 'html' in update_obj.get('content-type'): if update_obj.get('content-type') and 'html' in update_obj.get('content-type'):
@@ -449,17 +468,12 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore, exec
             if page_title:
                 page_title = page_title.strip()[:2000]
                 logger.debug(f"UUID: {uuid} Page <title> is '{page_title}'")
-                datastore.update_watch(uuid=uuid, update_obj={'page_title': page_title})
+                final_updates['page_title'] = page_title
         except Exception as e:
             logger.exception(f"Worker {worker_id} full exception details:")
             logger.warning(f"UUID: {uuid} Exception when extracting <title> - {str(e)}")
-    # Record server header
-    try:
-        server_header = update_handler.fetcher.headers.get('server', '').strip().lower()[:255]
-        datastore.update_watch(uuid=uuid, update_obj={'remote_server_reply': server_header})
-    except Exception as e:
-        pass
     # Store favicon if necessary
     if update_handler.fetcher.favicon_blob and update_handler.fetcher.favicon_blob.get('base64'):
@@ -467,14 +481,12 @@ async def async_update_worker(worker_id, q, notification_q, app, datastore, exec
                 favicon_base_64=update_handler.fetcher.favicon_blob.get('base64')
             )
-datastore.update_watch(uuid=uuid, update_obj={'fetch_time': round(time.time() - fetch_start_time, 3),
-                                              'check_count': count})
+datastore.update_watch(uuid=uuid, update_obj=final_updates)
 # NOW clear fetcher content - after all processing is complete
 # This is the last point where we need the fetcher data
 if update_handler and hasattr(update_handler, 'fetcher') and update_handler.fetcher:
     update_handler.fetcher.clear_content()
-    logger.debug(f"Cleared fetcher content for UUID {uuid}")
 # Explicitly delete update_handler to free all references
 if update_handler:
+313 -54
@@ -28,7 +28,7 @@ info:
     For example: `x-api-key: YOUR_API_KEY`
-  version: 0.1.5
+  version: 0.1.6
   contact:
     name: ChangeDetection.io
     url: https://github.com/dgtlmoon/changedetection.io
@@ -126,13 +126,22 @@ components:
     WatchBase:
       type: object
       properties:
+        uuid:
+          type: string
+          format: uuid
+          description: Unique identifier
+          readOnly: true
+        date_created:
+          type: [integer, 'null']
+          description: Unix timestamp of creation
+          readOnly: true
         url:
           type: string
           format: uri
           description: URL to monitor for changes
           maxLength: 5000
         title:
-          type: string
+          type: [string, 'null']
           description: Custom title for the web page change monitor (watch), not to be confused with page_title
           maxLength: 5000
         tag:
@@ -156,56 +165,61 @@ components:
           description: HTTP method to use
         fetch_backend:
           type: string
-          enum: [html_requests, html_webdriver]
-          description: Backend to use for fetching content
+          description: |
+            Backend to use for fetching content. Common values:
+            - `system` (default) - Use the system-wide default fetcher
+            - `html_requests` - Fast requests-based fetcher
+            - `html_webdriver` - Browser-based fetcher (Playwright/Puppeteer)
+            - `extra_browser_*` - Custom browser configurations (if configured)
+            - Plugin-provided fetchers (if installed)
+          pattern: '^(system|html_requests|html_webdriver|extra_browser_.+)$'
+          default: system
         headers:
           type: object
           additionalProperties:
             type: string
           description: HTTP headers to include in requests
         body:
-          type: string
+          type: [string, 'null']
           description: HTTP request body
           maxLength: 5000
         proxy:
-          type: string
+          type: [string, 'null']
           description: Proxy configuration
           maxLength: 5000
+        ignore_status_codes:
+          type: [boolean, 'null']
+          description: Ignore HTTP status code errors (boolean or null)
         webdriver_delay:
-          type: integer
+          type: [integer, 'null']
           description: Delay in seconds for webdriver
         webdriver_js_execute_code:
-          type: string
+          type: [string, 'null']
           description: JavaScript code to execute
           maxLength: 5000
         time_between_check:
           type: object
           properties:
             weeks:
-              type: integer
+              type: [integer, 'null']
               minimum: 0
               maximum: 52000
-              nullable: true
             days:
-              type: integer
+              type: [integer, 'null']
               minimum: 0
               maximum: 365000
-              nullable: true
             hours:
-              type: integer
+              type: [integer, 'null']
               minimum: 0
               maximum: 8760000
-              nullable: true
             minutes:
-              type: integer
+              type: [integer, 'null']
               minimum: 0
               maximum: 525600000
-              nullable: true
             seconds:
-              type: integer
+              type: [integer, 'null']
               minimum: 0
               maximum: 31536000000
-              nullable: true
           description: Time intervals between checks. All fields must be non-negative. At least one non-zero value required when not using default settings.
         time_between_check_use_default:
           type: boolean
@@ -219,11 +233,11 @@ components:
           maxItems: 100
           description: Notification URLs for this web page change monitor (watch). Maximum 100 URLs.
         notification_title:
-          type: string
+          type: [string, 'null']
           description: Custom notification title
           maxLength: 5000
         notification_body:
-          type: string
+          type: [string, 'null']
           description: Custom notification body
           maxLength: 5000
         notification_format:
@@ -231,7 +245,7 @@ components:
           enum: ['text', 'html', 'htmlcolor', 'markdown', 'System default']
           description: Format for notifications
         track_ldjson_price_data:
-          type: boolean
+          type: [boolean, 'null']
           description: Whether to track JSON-LD price data
         browser_steps:
           type: array
@@ -239,17 +253,14 @@ components:
             type: object
             properties:
               operation:
-                type: string
+                type: [string, 'null']
                 maxLength: 5000
-                nullable: true
               selector:
-                type: string
+                type: [string, 'null']
                 maxLength: 5000
-                nullable: true
               optional_value:
-                type: string
+                type: [string, 'null']
                 maxLength: 5000
-                nullable: true
             required: [operation, selector, optional_value]
             additionalProperties: false
           maxItems: 100
@@ -260,16 +271,197 @@ components:
           default: text_json_diff
           description: Optional processor mode to use for change detection. Defaults to `text_json_diff` if not specified.
+        # Content Filtering
+        include_filters:
+          type: array
+          items:
+            type: string
+            maxLength: 5000
+          maxItems: 100
+          description: CSS/XPath selectors to extract specific content from the page
+        subtractive_selectors:
+          type: array
+          items:
+            type: string
+            maxLength: 5000
+          maxItems: 100
+          description: CSS/XPath selectors to remove content from the page
+        ignore_text:
+          type: array
+          items:
+            type: string
+            maxLength: 5000
+          maxItems: 100
+          description: Text patterns to ignore in change detection
+        trigger_text:
+          type: array
+          items:
+            type: string
+            maxLength: 5000
+          maxItems: 100
+          description: Text/regex patterns that must be present to trigger a change
+        text_should_not_be_present:
+          type: array
+          items:
+            type: string
+            maxLength: 5000
+          maxItems: 100
+          description: Text that should NOT be present (triggers alert if found)
+        extract_text:
+          type: array
+          items:
+            type: string
+            maxLength: 5000
+          maxItems: 100
+          description: Regex patterns to extract specific text after filtering
+        # Text Processing
+        trim_text_whitespace:
+          type: boolean
+          default: false
+          description: Strip leading/trailing whitespace from text
+        sort_text_alphabetically:
+          type: boolean
+          default: false
+          description: Sort lines alphabetically before comparison
+        remove_duplicate_lines:
+          type: boolean
+          default: false
+          description: Remove duplicate lines from content
+        check_unique_lines:
+          type: boolean
+          default: false
+          description: Compare against all history for unique lines
+        strip_ignored_lines:
+          type: [boolean, 'null']
+          description: Remove lines matching ignore patterns
+        # Change Detection Filters
+        filter_text_added:
+          type: boolean
+          default: true
+          description: Include added text in change detection
+        filter_text_removed:
+          type: boolean
+          default: true
+          description: Include removed text in change detection
+        filter_text_replaced:
+          type: boolean
+          default: true
+          description: Include replaced text in change detection
+        # Restock/Price Detection
+        in_stock_only:
+          type: boolean
+          default: true
+          description: Only trigger on in-stock transitions (restock_diff processor)
+        follow_price_changes:
+          type: boolean
+          default: true
+          description: Monitor and track price changes (restock_diff processor)
+        price_change_threshold_percent:
+          type: [number, 'null']
+          description: Minimum price change percentage to trigger notification
+        has_ldjson_price_data:
+          type: [boolean, 'null']
+          description: Whether page has LD-JSON price data (auto-detected)
+          readOnly: true
+        # Notifications
+        notification_screenshot:
+          type: boolean
+          default: false
+          description: Include screenshot in notifications (if supported by notification URL)
+        filter_failure_notification_send:
+          type: boolean
+          default: true
+          description: Send notification when filters fail to match content
+        # History & Display
+        use_page_title_in_list:
+          type: [boolean, 'null']
+          description: Display page title in watch list (null = use system default)
+        history_snapshot_max_length:
+          type: [integer, 'null']
+          minimum: 1
+          maximum: 1000
+          description: Maximum number of history snapshots to keep (null = use system default)
+        # Scheduling
+        time_schedule_limit:
+          type: object
+          description: Weekly schedule limiting when checks can run
+          properties:
+            enabled:
+              type: boolean
+              default: false
+            monday:
+              $ref: '#/components/schemas/DaySchedule'
+            tuesday:
+              $ref: '#/components/schemas/DaySchedule'
+            wednesday:
+              $ref: '#/components/schemas/DaySchedule'
+            thursday:
+              $ref: '#/components/schemas/DaySchedule'
+            friday:
+              $ref: '#/components/schemas/DaySchedule'
+            saturday:
+              $ref: '#/components/schemas/DaySchedule'
+            sunday:
+              $ref: '#/components/schemas/DaySchedule'
+        # Conditions (advanced logic)
+        conditions:
+          type: array
+          items:
+            type: object
+            properties:
+              field:
+                type: string
+                description: Field to check (e.g., 'page_filtered_text', 'page_title')
+              operator:
+                type: string
+                description: Comparison operator (e.g., 'contains_regex', 'equals', 'not_equals')
+              value:
+                type: string
+                description: Value to compare against
+            required: [field, operator, value]
+          maxItems: 100
+          description: Array of condition rules for change detection logic (empty array when not set)
+        conditions_match_logic:
+          type: string
+          enum: ['ALL', 'ANY']
+          default: 'ALL'
+          description: Logic operator - ALL (match all conditions) or ANY (match any condition)
+    DaySchedule:
+      type: object
+      properties:
+        enabled:
+          type: boolean
+          default: true
+        start_time:
+          type: string
+          pattern: '^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$'
+          default: '00:00'
+          description: Start time in HH:MM format
+        duration:
+          type: object
+          properties:
+            hours:
+              type: string
+              pattern: '^[0-9]+$'
+              default: '24'
+            minutes:
+              type: string
+              pattern: '^[0-9]+$'
+              default: '00'
     Watch:
       allOf:
         - $ref: '#/components/schemas/WatchBase'
         - type: object
           properties:
-            uuid:
-              type: string
-              format: uuid
-              description: Unique identifier for the web page change monitor (watch)
-              readOnly: true
             last_checked:
               type: integer
               description: Unix timestamp of last check
@@ -278,9 +470,10 @@ components:
               type: integer
               description: Unix timestamp of last change
               readOnly: true
+              x-computed: true
             last_error:
-              type: string
-              description: Last error message
+              type: [string, boolean, 'null']
+              description: Last error message (false when no error, string when error occurred, null if not checked yet)
               readOnly: true
             last_viewed:
               type: integer
@@ -291,6 +484,61 @@ components:
               format: string
               description: The watch URL rendered in case of any Jinja2 markup, always use this for listing.
               readOnly: true
+              x-computed: true
+            page_title:
+              type: [string, 'null']
+              description: HTML <title> tag extracted from the page
+              readOnly: true
+            check_count:
+              type: integer
+              description: Total number of checks performed
+              readOnly: true
+            fetch_time:
+              type: number
+              description: Duration of last fetch in seconds
+              readOnly: true
+            previous_md5:
+              type: [string, boolean]
+              description: MD5 hash of previous content (false if not set)
+              readOnly: true
+            previous_md5_before_filters:
+              type: [string, boolean]
+              description: MD5 hash before filters applied (false if not set)
+              readOnly: true
+            consecutive_filter_failures:
+              type: integer
+              description: Counter for consecutive filter match failures
+              readOnly: true
+            last_notification_error:
+              type: [string, 'null']
+              description: Last notification error message
+              readOnly: true
+            notification_alert_count:
+              type: integer
+              description: Number of notifications sent
+              readOnly: true
+            content-type:
+              type: [string, 'null']
+              description: Content-Type from last fetch
+              readOnly: true
+            remote_server_reply:
+              type: [string, 'null']
+              description: Server header from last response
+              readOnly: true
+            browser_steps_last_error_step:
+              type: [integer, 'null']
+              description: Last browser step that caused an error
+              readOnly: true
+            viewed:
+              type: [integer, boolean]
+              description: Computed property - true if watch has been viewed, false otherwise (deprecated, use last_viewed instead)
+              readOnly: true
+              x-computed: true
+            history_n:
+              type: integer
+              description: Number of history snapshots available
+              readOnly: true
+              x-computed: true
     CreateWatch:
       allOf:
@@ -301,34 +549,45 @@ components:
     UpdateWatch:
       allOf:
-        - $ref: '#/components/schemas/WatchBase'
+        - $ref: '#/components/schemas/WatchBase'  # Extends WatchBase for user-settable fields
         - type: object
           properties:
             last_viewed:
               type: integer
               description: Unix timestamp in seconds of the last time the watch was viewed. Setting it to a value higher than `last_changed` in the "Update watch" endpoint marks the watch as viewed.
               minimum: 0
+          # Note: ReadOnly and @property fields are filtered out in the backend before update
+          # We don't use unevaluatedProperties:false here to allow roundtrip GET/PUT workflows
+          # where the response includes computed fields that should be silently ignored
     Tag:
-      type: object
-      properties:
-        uuid:
-          type: string
-          format: uuid
-          description: Unique identifier for the tag
-          readOnly: true
-        title:
-          type: string
-          description: Tag title
-          maxLength: 5000
-        notification_urls:
-          type: array
-          items:
-            type: string
-          description: Default notification URLs for web page change monitors (watches) with this tag
-        notification_muted:
-          type: boolean
-          description: Whether notifications are muted for this tag
+      allOf:
+        - $ref: '#/components/schemas/WatchBase'
+        - type: object
+          properties:
+            overrides_watch:
+              type: [boolean, 'null']
+              description: |
+                Whether this tag's settings override watch settings for all watches in this tag/group.
+                - true: Tag settings override watch settings
+                - false: Tag settings do not override (watches use their own settings)
+                - null: Not decided yet / inherit default behavior
+            # Future: Aggregated statistics from all watches with this tag
+            # check_count:
+            #   type: integer
+            #   description: Sum of check_count from all watches with this tag
+            #   readOnly: true
+            #   x-computed: true
+            # last_checked:
+            #   type: integer
+            #   description: Most recent last_checked timestamp from all watches with this tag
+            #   readOnly: true
+            #   x-computed: true
+            # last_changed:
+            #   type: integer
+            #   description: Most recent last_changed timestamp from all watches with this tag
+            #   readOnly: true
+            #   x-computed: true
     CreateTag:
       allOf:
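Review note: the spec changes above add two anchored `pattern` constraints, one on `fetch_backend` and one on `DaySchedule.start_time`. The sketch below shows what those two regexes accept, using Python's standard `re` module. The helper names are hypothetical and this is not the project's validation code; OpenAPI patterns follow ECMA-262 regex semantics, so `re.fullmatch` is used here as an approximation that avoids Python's `$` also matching before a trailing newline.

```python
import re

# Patterns copied verbatim from the schema in the diff above.
FETCH_BACKEND_PATTERN = re.compile(r'^(system|html_requests|html_webdriver|extra_browser_.+)$')
START_TIME_PATTERN = re.compile(r'^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$')


def is_valid_fetch_backend(value: str) -> bool:
    """Hypothetical helper: True if value satisfies the fetch_backend pattern."""
    return FETCH_BACKEND_PATTERN.fullmatch(value) is not None


def is_valid_start_time(value: str) -> bool:
    """Hypothetical helper: True if value satisfies the start_time pattern."""
    return START_TIME_PATTERN.fullmatch(value) is not None


backend_checks = {name: is_valid_fetch_backend(name)
                  for name in ("system", "html_webdriver", "extra_browser_chrome",
                               "extra_browser_", "my_plugin_fetcher")}
time_checks = {value: is_valid_start_time(value)
               for value in ("00:00", "9:30", "23:59", "24:00", "12:60")}
```

As written, the `fetch_backend` pattern only admits the three fixed names and the `extra_browser_` prefix, so a name such as the made-up `my_plugin_fetcher` is rejected by the pattern even though the description mentions plugin-provided fetchers. The `start_time` pattern accepts a single-digit hour such as `9:30`.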
+380 -39
File diff suppressed because one or more lines are too long
+4 -5
@@ -5,15 +5,14 @@ flask-compress
 # 0.6.3 included compatibility fix for werkzeug 3.x (2.x had deprecation of url handlers)
 flask-login>=0.6.3
 flask-paginate
-flask_expects_json~=1.7
 flask_restful
 flask_cors # For the Chrome extension to operate
 # janus # No longer needed - using pure threading.Queue for multi-loop support
 flask_wtf~=1.2
 flask~=3.1
 flask-socketio~=5.6.0
-python-socketio~=5.16.0
+python-socketio~=5.16.1
-python-engineio~=4.13.0
+python-engineio~=4.13.1
 inscriptis~=2.2
 pytz
 timeago~=1.0
@@ -126,8 +125,8 @@ greenlet >= 3.0.3
 # Default SOCKETIO_MODE=threading is recommended for better compatibility
 gevent
-# Pinned or it causes problems with flask_expects_json which seems unmaintained
-referencing==0.35.1
+# Previously pinned for flask_expects_json (removed 2026-02). Unpinning for now.
+referencing
 # For conditions
 panzi-json-logic