Compare commits

...

17 Commits

Author SHA1 Message Date
dgtlmoon 960c0510b3 Fix error handler 2025-04-22 11:25:30 +02:00
dgtlmoon 440847820f Revert "Better error reporting"
This reverts commit c9f0921b02.
2025-04-22 11:15:28 +02:00
dgtlmoon c9f0921b02 Better error reporting 2025-04-22 11:11:57 +02:00
dgtlmoon 0d1366dfb9 Make browsersteps UI a little more resilient 2025-04-22 11:03:47 +02:00
dgtlmoon ffde79ecac 0.49.15
2025-04-18 14:57:28 +02:00
dgtlmoon 66ad43b2df Visual Selector & Browser Steps - Always recheck if the data/screenshot is ready under "Visual Selector" tab after using Browser Steps (#3130) 2025-04-18 10:31:43 +02:00
Dror Levin 6b0e56ca80 App logs - Send TRACE and INFO logs to stdout (#3051) 2025-04-18 10:00:09 +02:00
Luca 5a2d84d8b4 Development: introduce Ruff as linter/formatter (#3039) 2025-04-18 09:59:18 +02:00
dgtlmoon a941156f26 Updating restock texts (#3124)
2025-04-17 10:44:32 +02:00
dgtlmoon a1fdeeaa29 Only add screenshot warning if capture was greater than trim size (#3123)
2025-04-17 00:11:20 +02:00
dgtlmoon 40ea2604a7 0.49.14 2025-04-16 23:23:18 +02:00
dgtlmoon ceda526093 Small fix for multiprocessing start on Mac OS (#3121 #3115) 2025-04-16 22:52:03 +02:00
Justin Goette 4197254c53 docs: Update reference URL (#3119) 2025-04-16 21:37:50 +02:00
dgtlmoon a0b7efb436 UI - Fix to edit and groups template 2025-04-16 18:40:30 +02:00
dgtlmoon 5f5e8ede6c Updating API documentation
2025-04-13 21:51:17 +02:00
dgtlmoon 52ca855a29 Undo forced selenium headless mode, small refactor (#3112)
2025-04-12 19:26:17 +02:00
dgtlmoon 079efd0a85 Playwright + Puppeteer fix for when page is taller than viewport but less than screenshot step_size (#3113) 2025-04-12 18:37:59 +02:00
27 changed files with 578 additions and 289 deletions
+7 -8
@@ -8,13 +8,13 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Lint with flake8
- name: Lint with Ruff
run: |
pip3 install flake8
# stop the build if there are Python syntax errors or undefined names
flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
pip install ruff
# Check for syntax errors and undefined names
ruff check . --select E9,F63,F7,F82
# Complete check with errors treated as warnings
ruff check . --exit-zero
test-application-3-10:
needs: lint-code
@@ -41,5 +41,4 @@ jobs:
uses: ./.github/workflows/test-stack-reusable-workflow.yml
with:
python-version: '3.13'
skip-pypuppeteer: true
skip-pypuppeteer: true
@@ -172,8 +172,8 @@ jobs:
curl --retry-connrefused --retry 6 -s -g -6 "http://[::1]:5556"|grep -q checkbox-uuid
# Check whether TRACE log is enabled.
# Also, check whether TRACE came from STDERR
docker logs test-changedetectionio 2>&1 1>/dev/null | grep 'TRACE log is enabled' || exit 1
# Also, check whether TRACE came from STDOUT
docker logs test-changedetectionio 2>/dev/null | grep 'TRACE log is enabled' || exit 1
# Check whether DEBUG came from STDOUT
docker logs test-changedetectionio 2>/dev/null | grep 'DEBUG' || exit 1
+1
@@ -16,6 +16,7 @@ dist/
.env
.venv/
venv/
.python-version
# IDEs
.idea
+9
@@ -0,0 +1,9 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.11.2
hooks:
# Lint (and apply safe fixes)
- id: ruff
args: [--fix]
# Format
- id: ruff-format
+48
@@ -0,0 +1,48 @@
# Minimum supported version
target-version = "py310"
# Formatting options
line-length = 100
indent-width = 4
exclude = [
"__pycache__",
".eggs",
".git",
".tox",
".venv",
"*.egg-info",
"*.pyc",
]
[lint]
# https://docs.astral.sh/ruff/rules/
select = [
"B", # flake8-bugbear
"B9",
"C",
"E", # pycodestyle
"F", # Pyflakes
"I", # isort
"N", # pep8-naming
"UP", # pyupgrade
"W", # pycodestyle
]
ignore = [
"B007", # unused-loop-control-variable
"B909", # loop-iterator-mutation
"E203", # whitespace-before-punctuation
"E266", # multiple-leading-hashes-for-block-comment
"E501", # line-too-long
"F403", # undefined-local-with-import-star
"N802", # invalid-function-name
"N806", # non-lowercase-variable-in-function
"N815", # mixed-case-variable-in-class-scope
]
[lint.mccabe]
max-complexity = 12
[format]
indent-style = "space"
quote-style = "preserve"
+1 -1
@@ -68,7 +68,7 @@ COPY changedetection.py /app/changedetection.py
# Github Action test purpose(test-only.yml).
# On production, it is effectively LOGGER_LEVEL=''.
ARG LOGGER_LEVEL=''
ENV LOGGER_LEVEL "$LOGGER_LEVEL"
ENV LOGGER_LEVEL="$LOGGER_LEVEL"
WORKDIR /app
CMD ["python", "./changedetection.py", "-d", "/datastore"]
+3 -1
@@ -3,4 +3,6 @@
# Only exists for direct CLI usage
import changedetectionio
changedetectionio.main()
if __name__ == '__main__':
changedetectionio.main()
+2 -2
@@ -2,7 +2,7 @@
# Read more https://github.com/dgtlmoon/changedetection.io/wiki
__version__ = '0.49.13'
__version__ = '0.49.15'
from changedetectionio.strtobool import strtobool
from json.decoder import JSONDecodeError
@@ -106,7 +106,7 @@ def main():
# Without this, a logger will be duplicated
logger.remove()
try:
log_level_for_stdout = { 'DEBUG', 'SUCCESS' }
log_level_for_stdout = { 'TRACE', 'DEBUG', 'INFO', 'SUCCESS' }
logger.configure(handlers=[
{"sink": sys.stdout, "level": logger_level,
"filter" : lambda record: record['level'].name in log_level_for_stdout},
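The effect of widening `log_level_for_stdout` can be sketched without loguru: a record reaches stdout when its level name is in the set. The helper name `sink_for` is hypothetical, and the sketch assumes that every other level goes to stderr, which this hunk does not show.

```python
# Level names routed to stdout after this change (copied from the hunk above).
LOG_LEVEL_FOR_STDOUT = {'TRACE', 'DEBUG', 'INFO', 'SUCCESS'}


def sink_for(level_name: str) -> str:
    """Hypothetical helper: name the stream a record of this level would reach.

    Assumption: levels outside the stdout set are sent to stderr.
    """
    return 'stdout' if level_name in LOG_LEVEL_FOR_STDOUT else 'stderr'


if __name__ == '__main__':
    for name in ('TRACE', 'INFO', 'WARNING', 'ERROR'):
        print(f"{name:8s} -> {sink_for(name)}")
```

This matches the updated workflow check, which greps for the TRACE line with stderr discarded.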
@@ -53,14 +53,7 @@ def construct_blueprint(datastore: ChangeDetectionStore):
a = "?" if '?' not in base_url else '&'
base_url += a + f"timeout={keepalive_ms}"
try:
browsersteps_start_session['browser'] = io_interface_context.chromium.connect_over_cdp(base_url)
except Exception as e:
if 'ECONNREFUSED' in str(e):
return make_response('Unable to start the Playwright Browser session, is it running?', 401)
else:
# Other errors, bad URL syntax, bad reply etc
return make_response(str(e), 401)
browsersteps_start_session['browser'] = io_interface_context.chromium.connect_over_cdp(base_url)
proxy_id = datastore.get_preferred_proxy_for_watch(uuid=watch_uuid)
proxy = None
@@ -109,7 +102,16 @@ def construct_blueprint(datastore: ChangeDetectionStore):
logger.debug("Starting connection with playwright")
logger.debug("browser_steps.py connecting")
browsersteps_sessions[browsersteps_session_id] = start_browsersteps_session(watch_uuid)
try:
browsersteps_sessions[browsersteps_session_id] = start_browsersteps_session(watch_uuid)
except Exception as e:
if 'ECONNREFUSED' in str(e):
return make_response('Unable to start the Playwright Browser session, is sockpuppetbrowser running? Is the network configuration OK?', 401)
else:
# Other errors, bad URL syntax, bad reply etc
return make_response(str(e), 401)
logger.debug("Starting connection with playwright - done")
return {'browsersteps_session_id': browsersteps_session_id}
@@ -1,6 +1,8 @@
import os
import time
import re
import sys
import traceback
from random import randint
from loguru import logger
@@ -54,14 +56,34 @@ browser_step_ui_config = {'Choose one': '0 0',
class steppable_browser_interface():
page = None
start_url = None
action_timeout = 10 * 1000
def __init__(self, start_url):
self.start_url = start_url
def safe_page_operation(self, operation_fn, default_return=None):
"""Safely execute a page operation with error handling"""
if self.page is None:
logger.warning("Attempted operation on None page object")
return default_return
try:
return operation_fn()
except Exception as e:
logger.debug(f"Page operation failed: {str(e)}")
# Try to reclaim memory if possible
try:
self.page.request_gc()
except Exception:
pass
return default_return
# Convert and perform "Click Button" for example
def call_action(self, action_name, selector=None, optional_value=None):
if self.page is None:
logger.warning("Cannot call action on None page object")
return
now = time.time()
call_action_name = re.sub('[^0-9a-zA-Z]+', '_', action_name.lower())
if call_action_name == 'choose_one':
@@ -72,28 +94,46 @@ class steppable_browser_interface():
if selector and selector.startswith('/') and not selector.startswith('//'):
selector = "xpath=" + selector
# Check if action handler exists
if not hasattr(self, "action_" + call_action_name):
logger.warning(f"Action handler for '{call_action_name}' not found")
return
action_handler = getattr(self, "action_" + call_action_name)
# Support for Jinja2 variables in the value and selector
if selector and ('{%' in selector or '{{' in selector):
selector = jinja_render(template_str=selector)
if optional_value and ('{%' in optional_value or '{{' in optional_value):
optional_value = jinja_render(template_str=optional_value)
action_handler(selector, optional_value)
self.page.wait_for_timeout(1.5 * 1000)
logger.debug(f"Call action done in {time.time()-now:.2f}s")
try:
action_handler(selector, optional_value)
# Safely wait for timeout
def wait_timeout():
self.page.wait_for_timeout(1.5 * 1000)
self.safe_page_operation(wait_timeout)
logger.debug(f"Call action done in {time.time()-now:.2f}s")
except Exception as e:
logger.error(f"Error executing action '{call_action_name}': {str(e)}")
# Request garbage collection to free up resources after error
try:
self.page.request_gc()
except Exception:
pass
def action_goto_url(self, selector=None, value=None):
# self.page.set_viewport_size({"width": 1280, "height": 5000})
if not value:
logger.warning("No URL provided for goto_url action")
return None
now = time.time()
response = self.page.goto(value, timeout=0, wait_until='load')
# Should be the same as the puppeteer_fetch.js methods, means, load with no timeout set (skip timeout)
#and also wait for seconds ?
#await page.waitForTimeout(1000);
#await page.waitForTimeout(extra_wait_ms);
def goto_operation():
return self.page.goto(value, timeout=0, wait_until='load')
response = self.safe_page_operation(goto_operation)
logger.debug(f"Time to goto URL {time.time()-now:.2f}s")
return response
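The wrapper pattern introduced in this file can be tried in isolation. The following is a minimal sketch, assuming a stand-in `FakePage` class (hypothetical, not part of Playwright) that exposes only `goto` and `request_gc`; the `Stepper` class name is also invented for illustration, while `safe_page_operation` mirrors the method added in the diff.

```python
import logging

logger = logging.getLogger("browser_steps_sketch")


class FakePage:
    """Hypothetical stand-in for a browser page; only what the sketch needs."""

    def __init__(self, fail=False):
        self.fail = fail
        self.gc_requests = 0

    def goto(self, url):
        if self.fail:
            raise RuntimeError("navigation failed")
        return f"loaded {url}"

    def request_gc(self):
        self.gc_requests += 1


class Stepper:
    """Hypothetical holder for a page, mirroring the wrapper from the diff."""

    def __init__(self, page):
        self.page = page

    def safe_page_operation(self, operation_fn, default_return=None):
        # No page: nothing to run, hand back the fallback value.
        if self.page is None:
            logger.warning("Attempted operation on None page object")
            return default_return
        try:
            return operation_fn()
        except Exception as e:
            logger.debug(f"Page operation failed: {e}")
            # Best-effort memory reclaim; a failure here is ignored.
            try:
                self.page.request_gc()
            except Exception:
                pass
            return default_return


if __name__ == '__main__':
    stepper = Stepper(FakePage())
    print(stepper.safe_page_operation(lambda: stepper.page.goto("https://example.com")))
```

One consequence of this design is that a failed step is logged at debug level and returns the default value instead of raising, so a caller cannot tell a failure apart from an operation that legitimately returned `None`.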
@@ -103,116 +143,209 @@ class steppable_browser_interface():
def action_click_element_containing_text(self, selector=None, value=''):
logger.debug("Clicking element containing text")
if not len(value.strip()):
if not value or not len(value.strip()):
return
elem = self.page.get_by_text(value)
if elem.count():
elem.first.click(delay=randint(200, 500), timeout=self.action_timeout)
def click_operation():
elem = self.page.get_by_text(value)
if elem.count():
elem.first.click(delay=randint(200, 500), timeout=self.action_timeout)
self.safe_page_operation(click_operation)
def action_click_element_containing_text_if_exists(self, selector=None, value=''):
logger.debug("Clicking element containing text if exists")
if not len(value.strip()):
return
elem = self.page.get_by_text(value)
logger.debug(f"Clicking element containing text - {elem.count()} elements found")
if elem.count():
elem.first.click(delay=randint(200, 500), timeout=self.action_timeout)
else:
if not value or not len(value.strip()):
return
def click_if_exists_operation():
elem = self.page.get_by_text(value)
logger.debug(f"Clicking element containing text - {elem.count()} elements found")
if elem.count():
elem.first.click(delay=randint(200, 500), timeout=self.action_timeout)
self.safe_page_operation(click_if_exists_operation)
def action_enter_text_in_field(self, selector, value):
if not len(selector.strip()):
if not selector or not len(selector.strip()):
return
self.page.fill(selector, value, timeout=self.action_timeout)
def fill_operation():
self.page.fill(selector, value, timeout=self.action_timeout)
self.safe_page_operation(fill_operation)
def action_execute_js(self, selector, value):
response = self.page.evaluate(value)
return response
if not value:
return None
def evaluate_operation():
return self.page.evaluate(value)
return self.safe_page_operation(evaluate_operation)
def action_click_element(self, selector, value):
logger.debug("Clicking element")
if not len(selector.strip()):
if not selector or not len(selector.strip()):
return
self.page.click(selector=selector, timeout=self.action_timeout + 20 * 1000, delay=randint(200, 500))
def click_operation():
self.page.click(selector=selector, timeout=self.action_timeout + 20 * 1000, delay=randint(200, 500))
self.safe_page_operation(click_operation)
def action_click_element_if_exists(self, selector, value):
import playwright._impl._errors as _api_types
logger.debug("Clicking element if exists")
if not len(selector.strip()):
return
try:
self.page.click(selector, timeout=self.action_timeout, delay=randint(200, 500))
except _api_types.TimeoutError as e:
return
except _api_types.Error as e:
# Element was there, but page redrew and now its long long gone
if not selector or not len(selector.strip()):
return
def click_if_exists_operation():
try:
self.page.click(selector, timeout=self.action_timeout, delay=randint(200, 500))
except _api_types.TimeoutError:
return
except _api_types.Error:
# Element was there, but the page redrew and now it's long gone
return
self.safe_page_operation(click_if_exists_operation)
def action_click_x_y(self, selector, value):
if not re.match(r'^\s?\d+\s?,\s?\d+\s?$', value):
raise Exception("'Click X,Y' step should be in the format of '100 , 90'")
if not value or not re.match(r'^\s?\d+\s?,\s?\d+\s?$', value):
logger.warning("'Click X,Y' step should be in the format of '100 , 90'")
return
x, y = value.strip().split(',')
x = int(float(x.strip()))
y = int(float(y.strip()))
self.page.mouse.click(x=x, y=y, delay=randint(200, 500))
try:
x, y = value.strip().split(',')
x = int(float(x.strip()))
y = int(float(y.strip()))
def click_xy_operation():
self.page.mouse.click(x=x, y=y, delay=randint(200, 500))
self.safe_page_operation(click_xy_operation)
except Exception as e:
logger.error(f"Error parsing x,y coordinates: {str(e)}")
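The validation and parsing in the 'Click X,Y' step can be exercised on their own. This is a sketch only: `parse_click_xy` is a hypothetical helper that reuses the regular expression from the diff and returns `None` for malformed input instead of logging a warning.

```python
import re

# Same pattern as in the diff: at most one whitespace character is allowed
# before, after, and on each side of the comma.
_XY_PATTERN = re.compile(r'^\s?\d+\s?,\s?\d+\s?$')


def parse_click_xy(value):
    """Hypothetical helper: return (x, y) as ints, or None if the format is wrong."""
    if not value or not _XY_PATTERN.match(value):
        return None
    x, y = value.strip().split(',')
    return int(float(x.strip())), int(float(y.strip()))


if __name__ == '__main__':
    print(parse_click_xy('100 , 90'))
    print(parse_click_xy('not a coordinate'))
```

Because the pattern accepts only digits, negative or fractional coordinates are rejected before the `int(float(...))` conversion is ever reached.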
def action_scroll_down(self, selector, value):
# Some sites this doesnt work on for some reason
self.page.mouse.wheel(0, 600)
self.page.wait_for_timeout(1000)
def scroll_operation():
# This doesn't work on some sites for some reason
self.page.mouse.wheel(0, 600)
self.page.wait_for_timeout(1000)
self.safe_page_operation(scroll_operation)
def action_wait_for_seconds(self, selector, value):
self.page.wait_for_timeout(float(value.strip()) * 1000)
try:
seconds = float(value.strip()) if value else 1.0
def wait_operation():
self.page.wait_for_timeout(seconds * 1000)
self.safe_page_operation(wait_operation)
except (ValueError, TypeError) as e:
logger.error(f"Invalid value for wait_for_seconds: {str(e)}")
def action_wait_for_text(self, selector, value):
if not value:
return
import json
v = json.dumps(value)
self.page.wait_for_function(f'document.querySelector("body").innerText.includes({v});', timeout=30000)
def wait_for_text_operation():
self.page.wait_for_function(
f'document.querySelector("body").innerText.includes({v});',
timeout=30000
)
self.safe_page_operation(wait_for_text_operation)
def action_wait_for_text_in_element(self, selector, value):
if not selector or not value:
return
import json
s = json.dumps(selector)
v = json.dumps(value)
self.page.wait_for_function(f'document.querySelector({s}).innerText.includes({v});', timeout=30000)
def wait_for_text_in_element_operation():
self.page.wait_for_function(
f'document.querySelector({s}).innerText.includes({v});',
timeout=30000
)
self.safe_page_operation(wait_for_text_in_element_operation)
# @todo - in the future make some popout interface to capture what needs to be set
# https://playwright.dev/python/docs/api/class-keyboard
def action_press_enter(self, selector, value):
self.page.keyboard.press("Enter", delay=randint(200, 500))
def press_operation():
self.page.keyboard.press("Enter", delay=randint(200, 500))
self.safe_page_operation(press_operation)
def action_press_page_up(self, selector, value):
self.page.keyboard.press("PageUp", delay=randint(200, 500))
def press_operation():
self.page.keyboard.press("PageUp", delay=randint(200, 500))
self.safe_page_operation(press_operation)
def action_press_page_down(self, selector, value):
self.page.keyboard.press("PageDown", delay=randint(200, 500))
def press_operation():
self.page.keyboard.press("PageDown", delay=randint(200, 500))
self.safe_page_operation(press_operation)
def action_check_checkbox(self, selector, value):
self.page.locator(selector).check(timeout=self.action_timeout)
if not selector:
return
def check_operation():
self.page.locator(selector).check(timeout=self.action_timeout)
self.safe_page_operation(check_operation)
def action_uncheck_checkbox(self, selector, value):
self.page.locator(selector).uncheck(timeout=self.action_timeout)
if not selector:
return
def uncheck_operation():
self.page.locator(selector).uncheck(timeout=self.action_timeout)
self.safe_page_operation(uncheck_operation)
def action_remove_elements(self, selector, value):
"""Removes all elements matching the given selector from the DOM."""
self.page.locator(selector).evaluate_all("els => els.forEach(el => el.remove())")
if not selector:
return
def remove_operation():
self.page.locator(selector).evaluate_all("els => els.forEach(el => el.remove())")
self.safe_page_operation(remove_operation)
def action_make_all_child_elements_visible(self, selector, value):
"""Recursively makes all child elements inside the given selector fully visible."""
self.page.locator(selector).locator("*").evaluate_all("""
els => els.forEach(el => {
el.style.display = 'block'; // Forces it to be displayed
el.style.visibility = 'visible'; // Ensures it's not hidden
el.style.opacity = '1'; // Fully opaque
el.style.position = 'relative'; // Avoids 'absolute' hiding
el.style.height = 'auto'; // Expands collapsed elements
el.style.width = 'auto'; // Ensures full visibility
el.removeAttribute('hidden'); // Removes hidden attribute
el.classList.remove('hidden', 'd-none'); // Removes common CSS hidden classes
})
""")
if not selector:
return
def make_visible_operation():
self.page.locator(selector).locator("*").evaluate_all("""
els => els.forEach(el => {
el.style.display = 'block'; // Forces it to be displayed
el.style.visibility = 'visible'; // Ensures it's not hidden
el.style.opacity = '1'; // Fully opaque
el.style.position = 'relative'; // Avoids 'absolute' hiding
el.style.height = 'auto'; // Expands collapsed elements
el.style.width = 'auto'; // Ensures full visibility
el.removeAttribute('hidden'); // Removes hidden attribute
el.classList.remove('hidden', 'd-none'); // Removes common CSS hidden classes
})
""")
self.safe_page_operation(make_visible_operation)
# Responsible for maintaining a live 'context' with the chrome CDP
# @todo - how long do contexts live for anyway?
@@ -224,7 +357,9 @@ class browsersteps_live_ui(steppable_browser_interface):
# bump and kill this if idle after X sec
age_start = 0
headers = {}
# Track if resources are properly cleaned up
_is_cleaned_up = False
# use a special driver, maybe locally etc
command_executor = os.getenv(
"PLAYWRIGHT_BROWSERSTEPS_DRIVER_URL"
@@ -243,9 +378,14 @@ class browsersteps_live_ui(steppable_browser_interface):
self.age_start = time.time()
self.playwright_browser = playwright_browser
self.start_url = start_url
self._is_cleaned_up = False
if self.context is None:
self.connect(proxy=proxy)
def __del__(self):
# Ensure cleanup happens if object is garbage collected
self.cleanup()
# Connect and setup a new context
def connect(self, proxy=None):
# Should only get called once - test that
@@ -264,31 +404,74 @@ class browsersteps_live_ui(steppable_browser_interface):
user_agent=manage_user_agent(headers=self.headers),
)
self.page = self.context.new_page()
# self.page.set_default_navigation_timeout(keep_open)
self.page.set_default_timeout(keep_open)
# @todo probably this doesnt work
self.page.on(
"close",
self.mark_as_closed,
)
# Set event handlers
self.page.on("close", self.mark_as_closed)
# Listen for all console events and handle errors
self.page.on("console", lambda msg: print(f"Browser steps console - {msg.type}: {msg.text} {msg.args}"))
logger.debug(f"Time to browser setup {time.time()-now:.2f}s")
self.page.wait_for_timeout(1 * 1000)
def mark_as_closed(self):
logger.debug("Page closed, cleaning up..")
self.cleanup()
def cleanup(self):
"""Properly clean up all resources to prevent memory leaks"""
if self._is_cleaned_up:
return
logger.debug("Cleaning up browser steps resources")
# Clean up page
if hasattr(self, 'page') and self.page is not None:
try:
# Force garbage collection before closing
self.page.request_gc()
except Exception as e:
logger.debug(f"Error during page garbage collection: {str(e)}")
try:
# Remove event listeners before closing
self.page.remove_listener("close", self.mark_as_closed)
except Exception as e:
logger.debug(f"Error removing event listeners: {str(e)}")
try:
self.page.close()
except Exception as e:
logger.debug(f"Error closing page: {str(e)}")
self.page = None
# Clean up context
if hasattr(self, 'context') and self.context is not None:
try:
self.context.close()
except Exception as e:
logger.debug(f"Error closing context: {str(e)}")
self.context = None
self._is_cleaned_up = True
logger.debug("Browser steps resources cleanup complete")
@property
def has_expired(self):
if not self.page:
if not self.page or self._is_cleaned_up:
return True
# Check if session has expired based on age
max_age_seconds = int(os.getenv("BROWSER_STEPS_MAX_AGE_SECONDS", 60 * 10)) # Default 10 minutes
if (time.time() - self.age_start) > max_age_seconds:
logger.debug(f"Browser steps session expired after {max_age_seconds} seconds")
return True
return False
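The combination of an idempotent `cleanup()` and an age-based `has_expired` can be sketched independently of Playwright. In this sketch `SessionSketch`, its `resource` attribute, and the injectable `now` clock are all hypothetical; only the `_is_cleaned_up` flag, the `BROWSER_STEPS_MAX_AGE_SECONDS` variable, and the 10-minute default come from the diff.

```python
import os
import time


class SessionSketch:
    """Hypothetical session wrapper showing the cleanup flag and expiry check."""

    def __init__(self, resource, now=time.time):
        self._now = now              # injectable clock, an assumption for testability
        self.age_start = now()
        self.resource = resource     # stands in for the page/context pair
        self._is_cleaned_up = False

    def cleanup(self):
        # Second and later calls are no-ops, so __del__ and event
        # handlers can both call this safely.
        if self._is_cleaned_up:
            return
        if self.resource is not None:
            try:
                self.resource.close()
            except Exception:
                pass
            self.resource = None
        self._is_cleaned_up = True

    @property
    def has_expired(self):
        if self.resource is None or self._is_cleaned_up:
            return True
        max_age_seconds = int(os.getenv("BROWSER_STEPS_MAX_AGE_SECONDS", 60 * 10))
        return (self._now() - self.age_start) > max_age_seconds


class CountingResource:
    """Hypothetical resource that records how many times it was closed."""

    def __init__(self):
        self.closed = 0

    def close(self):
        self.closed += 1
```

The flag matters because `cleanup()` has several callers in the diff (the page `close` event, `__del__`), and closing an already-closed page or context would otherwise raise.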
def get_current_state(self):
"""Return the screenshot and interactive elements mapping, generally always called after action_()"""
@@ -297,36 +480,55 @@ class browsersteps_live_ui(steppable_browser_interface):
# because we for now only run browser steps in playwright mode (not puppeteer mode)
from changedetectionio.content_fetchers.playwright import capture_full_page
# Safety check - don't proceed if resources are cleaned up
if self._is_cleaned_up or self.page is None:
logger.warning("Attempted to get current state after cleanup")
return (None, None)
xpath_element_js = importlib.resources.files("changedetectionio.content_fetchers.res").joinpath('xpath_element_scraper.js').read_text()
now = time.time()
self.page.wait_for_timeout(1 * 1000)
screenshot = capture_full_page(page=self.page)
screenshot = None
xpath_data = None
try:
# Get screenshot first
screenshot = capture_full_page(page=self.page)
logger.debug(f"Time to get screenshot from browser {time.time() - now:.2f}s")
logger.debug(f"Time to get screenshot from browser {time.time() - now:.2f}s")
# Then get interactive elements
now = time.time()
self.page.evaluate("var include_filters=''")
self.page.request_gc()
now = time.time()
self.page.evaluate("var include_filters=''")
# Go find the interactive elements
# @todo in the future, something smarter that can scan for elements with .click/focus etc event handlers?
scan_elements = 'a,button,input,select,textarea,i,th,td,p,li,h1,h2,h3,h4,div,span'
self.page.request_gc()
MAX_TOTAL_HEIGHT = int(os.getenv("SCREENSHOT_MAX_HEIGHT", SCREENSHOT_MAX_HEIGHT_DEFAULT))
xpath_data = json.loads(self.page.evaluate(xpath_element_js, {
"visualselector_xpath_selectors": scan_elements,
"max_height": MAX_TOTAL_HEIGHT
}))
self.page.request_gc()
scan_elements = 'a,button,input,select,textarea,i,th,td,p,li,h1,h2,h3,h4,div,span'
MAX_TOTAL_HEIGHT = int(os.getenv("SCREENSHOT_MAX_HEIGHT", SCREENSHOT_MAX_HEIGHT_DEFAULT))
xpath_data = json.loads(self.page.evaluate(xpath_element_js, {
"visualselector_xpath_selectors": scan_elements,
"max_height": MAX_TOTAL_HEIGHT
}))
self.page.request_gc()
# So the JS will find the smallest one first
xpath_data['size_pos'] = sorted(xpath_data['size_pos'], key=lambda k: k['width'] * k['height'], reverse=True)
logger.debug(f"Time to scrape xPath element data in browser {time.time()-now:.2f}s")
# playwright._impl._api_types.Error: Browser closed.
# @todo show some countdown timer?
# Sort elements by size
xpath_data['size_pos'] = sorted(xpath_data['size_pos'], key=lambda k: k['width'] * k['height'], reverse=True)
logger.debug(f"Time to scrape xPath element data in browser {time.time()-now:.2f}s")
except Exception as e:
logger.error(f"Error getting current state: {str(e)}")
# Attempt recovery - force garbage collection
try:
self.page.request_gc()
except Exception:
pass
# Request garbage collection one final time
try:
self.page.request_gc()
except Exception:
pass
return (screenshot, xpath_data)
@@ -104,6 +104,9 @@ def construct_blueprint(datastore: ChangeDetectionStore):
uuid = list(datastore.data['settings']['application']['tags'].keys()).pop()
default = datastore.data['settings']['application']['tags'].get(uuid)
if not default:
flash("Tag not found", "error")
return redirect(url_for('watchlist.index'))
form = group_restock_settings_form(
formdata=request.form if request.method == 'POST' else None,
@@ -66,7 +66,7 @@
<div class="pure-control-group inline-radio">
{{ render_checkbox_field(form.notification_muted) }}
</div>
{% if is_html_webdriver %}
{% if 1 %}
<div class="pure-control-group inline-radio">
{{ render_checkbox_field(form.notification_screenshot) }}
<span class="pure-form-message-inline">
+8 -22
@@ -19,20 +19,6 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
if tag_uuid in watch.get('tags', []) and (tag.get('include_filters') or tag.get('subtractive_selectors')):
return True
def levenshtein_ratio_recent_history(watch):
try:
from Levenshtein import ratio, distance
k = list(watch.history.keys())
if len(k) >= 2:
a = watch.get_history_snapshot(timestamp=k[0])
b = watch.get_history_snapshot(timestamp=k[1])
distance = distance(a, b)
return distance
except Exception as e:
logger.warning("Unable to calc similarity", e)
return "Unable to calc similarity"
return ''
@edit_blueprint.route("/edit/<string:uuid>", methods=['GET', 'POST'])
@login_optionally_required
# https://stackoverflow.com/questions/42984453/wtforms-populate-form-with-data-if-data-exists
@@ -227,9 +213,6 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
if request.method == 'POST' and not form.validate():
flash("An error occurred, please see below.", "error")
visualselector_data_is_ready = datastore.visualselector_data_is_ready(uuid)
# JQ is difficult to install on windows and must be manually added (outside requirements.txt)
jq_support = True
try:
@@ -239,11 +222,12 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
watch = datastore.data['watching'].get(uuid)
# if system or watch is configured to need a chrome type browser
system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
watch_uses_webdriver = False
watch_needs_selenium_or_playwright = False
if (watch.get('fetch_backend') == 'system' and system_uses_webdriver) or watch.get('fetch_backend') == 'html_webdriver' or watch.get('fetch_backend', '').startswith('extra_browser_'):
watch_uses_webdriver = True
watch_needs_selenium_or_playwright = True
from zoneinfo import available_timezones
@@ -262,14 +246,16 @@ def construct_blueprint(datastore: ChangeDetectionStore, update_q, queuedWatchMe
'has_extra_headers_file': len(datastore.get_all_headers_in_textfile_for_watch(uuid=uuid)) > 0,
'has_special_tag_options': _watch_has_tag_options_set(watch=watch),
'jq_support': jq_support,
'lev_info': levenshtein_ratio_recent_history(watch),
'playwright_enabled': os.getenv('PLAYWRIGHT_DRIVER_URL', False),
'settings_application': datastore.data['settings']['application'],
'system_has_playwright_configured': os.getenv('PLAYWRIGHT_DRIVER_URL'),
'system_has_webdriver_configured': os.getenv('WEBDRIVER_URL'),
'visual_selector_data_ready': datastore.visualselector_data_is_ready(watch_uuid=uuid),
'timezone_default_config': datastore.data['settings']['application'].get('timezone'),
'using_global_webdriver_wait': not default['webdriver_delay'],
'uuid': uuid,
'watch': watch,
'watch_uses_webdriver': watch_uses_webdriver,
'watch_needs_selenium_or_playwright': watch_needs_selenium_or_playwright,
}
included_content = None
+4 -4
@@ -94,11 +94,11 @@ def execute_ruleset_against_all_plugins(current_watch_uuid: str, application_dat
EXECUTE_DATA = {}
result = True
ruleset_settings = application_datastruct['watching'].get(current_watch_uuid)
watch = application_datastruct['watching'].get(current_watch_uuid)
if ruleset_settings.get("conditions"):
logic_operator = "and" if ruleset_settings.get("conditions_match_logic", "ALL") == "ALL" else "or"
complete_rules = filter_complete_rules(ruleset_settings['conditions'])
if watch and watch.get("conditions"):
logic_operator = "and" if watch.get("conditions_match_logic", "ALL") == "ALL" else "or"
complete_rules = filter_complete_rules(watch['conditions'])
if complete_rules:
# Give all plugins a chance to update the data dict again (that we will test the conditions against)
for plugin in plugin_manager.get_plugins():
@@ -26,9 +26,11 @@ def capture_full_page(page):
step_size = SCREENSHOT_SIZE_STITCH_THRESHOLD # Size that won't cause GPU to overflow
screenshot_chunks = []
y = 0
# If page height is larger than current viewport, use a larger viewport for better capturing
if page_height > page.viewport_size['height']:
if page_height < step_size:
step_size = page_height # In case the page is bigger than the default viewport but smaller than the proposed step size
logger.debug(f"Setting bigger viewport to step through large page width W{page.viewport_size['width']}xH{step_size} because page_height > viewport_size")
# Set viewport to a larger size to capture more content at once
page.set_viewport_size({'width': page.viewport_size['width'], 'height': step_size})
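
Reviewer note: the step-size clamp added here can be checked without a browser. The following is a minimal sketch under stated assumptions: the two constant values are placeholders rather than the project's real thresholds, the function name `plan_capture` is hypothetical, and the `y += step_size` increment is assumed because the loop body is not visible in this diff.

```python
# Placeholder values for illustration only; the real constants live elsewhere in the project.
SCREENSHOT_SIZE_STITCH_THRESHOLD = 8000
SCREENSHOT_MAX_TOTAL_HEIGHT = 25000


def plan_capture(page_height, viewport_height,
                 step_size=SCREENSHOT_SIZE_STITCH_THRESHOLD,
                 max_total_height=SCREENSHOT_MAX_TOTAL_HEIGHT):
    """Return (viewport height to use, list of scroll offsets to capture at)."""
    if page_height > viewport_height:
        # The clamp from the diff: never ask for a viewport taller than the page
        if page_height < step_size:
            step_size = page_height
        viewport_height = step_size

    offsets = []
    y = 0
    while y < min(page_height, max_total_height):
        offsets.append(y)
        y += step_size  # assumed increment, not shown in the diff
    return viewport_height, offsets


if __name__ == "__main__":
    # A 3000px page with a 1024px viewport fits in one enlarged chunk
    print(plan_capture(3000, 1024))
    # A 20000px page is stepped through in threshold-sized chunks
    print(plan_capture(20000, 1024))
```

Without the clamp, a page slightly taller than the default viewport would get a viewport of the full threshold height, which is much larger than the page itself.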
@@ -46,9 +46,10 @@ async def capture_full_page(page):
screenshot_chunks = []
y = 0
if page_height > page.viewport['height']:
if page_height < step_size:
step_size = page_height # In case the page is bigger than the default viewport but smaller than the proposed step size
await page.setViewport({'width': page.viewport['width'], 'height': step_size})
while y < min(page_height, SCREENSHOT_MAX_TOTAL_HEIGHT):
await page.evaluate(f"window.scrollTo(0, {y})")
screenshot_chunks.append(await page.screenshot(type_='jpeg',
@@ -10,6 +10,7 @@ async () => {
'article épuisé',
'artikel zurzeit vergriffen',
'as soon as stock is available',
'aucune offre n\'est disponible',
'ausverkauft', // sold out
'available for back order',
'awaiting stock',
@@ -25,9 +26,8 @@ async () => {
'dieser artikel ist bald wieder verfügbar',
'dostępne wkrótce',
'en rupture',
'en rupture de stock',
'épuisé',
'esgotado',
'in kürze lieferbar',
'indisponible',
'indisponível',
'isn\'t in stock right now',
@@ -50,10 +50,11 @@ async () => {
'niet leverbaar',
'niet op voorraad',
'no disponible',
'non disponibile',
'non disponible',
'no featured offers available',
'no longer in stock',
'no tickets available',
'non disponibile',
'non disponible',
'not available',
'not currently available',
'not in stock',
@@ -89,13 +90,15 @@ async () => {
'vergriffen',
'vorbestellen',
'vorbestellung ist bald möglich',
'we don\'t currently have any',
'we couldn\'t find any products that match',
'we do not currently have an estimate of when this product will be back in stock.',
'we don\'t currently have any',
'we don\'t know when or if this item will be back in stock.',
'we were not able to find a match',
'when this arrives in stock',
'when this item is available to order',
'zur zeit nicht an lager',
'épuisé',
'品切れ',
'已售',
'已售完',
@@ -31,33 +31,33 @@ def stitch_images_worker(pipe_conn, chunks_bytes, original_page_height, capture_
# Draw caption on top (overlaid, not extending canvas)
draw = ImageDraw.Draw(stitched)
caption_text = f"WARNING: Screenshot was {original_page_height}px but trimmed to {capture_height}px because it was too long"
padding = 10
font_size = 35
font_color = (255, 0, 0)
background_color = (255, 255, 255)
if original_page_height > capture_height:
caption_text = f"WARNING: Screenshot was {original_page_height}px but trimmed to {capture_height}px because it was too long"
padding = 10
font_size = 35
font_color = (255, 0, 0)
background_color = (255, 255, 255)
# Try to load a proper font
try:
font = ImageFont.truetype("arial.ttf", font_size)
except IOError:
font = ImageFont.load_default()
# Try to load a proper font
try:
font = ImageFont.truetype("arial.ttf", font_size)
except IOError:
font = ImageFont.load_default()
bbox = draw.textbbox((0, 0), caption_text, font=font)
text_width = bbox[2] - bbox[0]
text_height = bbox[3] - bbox[1]
bbox = draw.textbbox((0, 0), caption_text, font=font)
text_width = bbox[2] - bbox[0]
text_height = bbox[3] - bbox[1]
# Draw white rectangle background behind text
rect_top = 0
rect_bottom = text_height + 2 * padding
draw.rectangle([(0, rect_top), (max_width, rect_bottom)], fill=background_color)
# Draw white rectangle background behind text
rect_top = 0
rect_bottom = text_height + 2 * padding
draw.rectangle([(0, rect_top), (max_width, rect_bottom)], fill=background_color)
# Draw text centered horizontally, 10px padding from top of the rectangle
text_x = (max_width - text_width) // 2
text_y = padding
draw.text((text_x, text_y), caption_text, font=font, fill=font_color)
# Draw text centered horizontally, 10px padding from top of the rectangle
text_x = (max_width - text_width) // 2
text_y = padding
draw.text((text_x, text_y), caption_text, font=font, fill=font_color)
# Encode and send image
output = io.BytesIO()
@@ -65,7 +65,17 @@ class fetcher(Fetcher):
# request_body, request_method unused for now, until some magic in the future happens.
options = ChromeOptions()
options.add_argument("--headless")
# Load Chrome options from env
CHROME_OPTIONS = [
line.strip()
for line in os.getenv("CHROME_OPTIONS", "").strip().splitlines()
if line.strip()
]
for opt in CHROME_OPTIONS:
options.add_argument(opt)
if self.proxy:
options.proxy = self.proxy
@@ -80,7 +90,9 @@ class fetcher(Fetcher):
self.quit()
raise
self.driver.set_window_size(1280, 1024)
if not "--window-size" in os.getenv("CHROME_OPTIONS", ""):
self.driver.set_window_size(1280, 1024)
self.driver.implicitly_wait(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)))
if self.webdriver_js_execute_code is not None:
@@ -88,6 +100,7 @@ class fetcher(Fetcher):
# Selenium doesn't automatically wait for actions as good as Playwright, so wait again
self.driver.implicitly_wait(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)))
# @todo - how to check this? is it possible?
self.status_code = 200
# @todo somehow we should try to get this working for WebDriver
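
Reviewer note: the `CHROME_OPTIONS` handling in the hunks above takes a multi-line environment variable, one Chrome argument per line. The following is a minimal sketch of that parsing, assuming a plain mapping in place of the real environment; the function names `chrome_options_from_env` and `should_set_default_window_size` are hypothetical and do not exist in the fetcher.

```python
import os


def chrome_options_from_env(environ=os.environ):
    """Split CHROME_OPTIONS into one argument per non-empty line."""
    raw = environ.get("CHROME_OPTIONS", "")
    return [line.strip() for line in raw.strip().splitlines() if line.strip()]


def should_set_default_window_size(environ=os.environ):
    """Only apply the default 1280x1024 window when the user did not pass --window-size."""
    return "--window-size" not in environ.get("CHROME_OPTIONS", "")


if __name__ == "__main__":
    # Same shape as the YAML block scalar used in docker-compose.yml
    env = {"CHROME_OPTIONS": "--window-size=1280,1024\n  --headless\n\n--disable-gpu\n"}
    print(chrome_options_from_env(env))
    print(should_set_default_window_size(env))
```

The window-size check is a plain substring test on the raw variable, so any line containing `--window-size` suppresses the default `set_window_size(1280, 1024)` call.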
@@ -251,6 +251,10 @@ $(document).ready(function () {
400: function () {
// More than likely the CSRF token was lost when the server restarted
alert("There was a problem processing the request, please reload the page.");
},
401: function (err) {
// This will be a custom error
alert(err.responseText);
}
}
}).done(function (data) {
+3 -5
@@ -98,15 +98,13 @@
{% macro playwright_warning() %}
<p><strong>Error - Playwright support for Chrome based fetching is not enabled.</strong> Alternatively try our <a href="https://changedetection.io">very affordable subscription based service which has all this setup for you</a>.</p>
<p><strong>Error - This watch needs Chrome (with playwright/sockpuppetbrowser), but Chrome based fetching is not enabled.</strong> Alternatively try our <a href="https://changedetection.io">very affordable subscription based service which has all this setup for you</a>.</p>
<p>You may need to <a href="https://github.com/dgtlmoon/changedetection.io/blob/09ebc6ec6338545bdd694dc6eee57f2e9d2b8075/docker-compose.yml#L31">Enable playwright environment variable</a> and uncomment the <strong>sockpuppetbrowser</strong> in the <a href="https://github.com/dgtlmoon/changedetection.io/blob/master/docker-compose.yml">docker-compose.yml</a> file.</p>
<br>
<p>(Also Selenium/WebDriver can not extract full page screenshots reliably so Playwright is recommended here)</p>
{% endmacro %}
{% macro only_webdriver_type_watches_warning() %}
<p><strong>Sorry, this functionality only works with Playwright/Chrome enabled watches.<br>You need to <a href="#request">Set the fetch method to Playwright/Chrome mode and resave</a> and have the Playwright connection enabled.</strong></p><br>
{% macro only_playwright_type_watches_warning() %}
<p><strong>Sorry, this functionality only works with Playwright/Chrome enabled watches.<br>You need to <a href="#request">Set the fetch method to Playwright/Chrome mode and resave</a> and have the SockpuppetBrowser/Playwright or Selenium enabled.</strong></p><br>
{% endmacro %}
{% macro render_time_schedule_form(form, available_timezones, timezone_default_config) %}
+26 -23
@@ -1,6 +1,6 @@
{% extends 'base.html' %}
{% block content %}
{% from '_helpers.html' import render_field, render_checkbox_field, render_button, render_time_schedule_form, playwright_warning, only_webdriver_type_watches_warning, render_conditions_fieldlist_of_formfields_as_table %}
{% from '_helpers.html' import render_field, render_checkbox_field, render_button, render_time_schedule_form, playwright_warning, only_playwright_type_watches_warning, render_conditions_fieldlist_of_formfields_as_table %}
{% from '_common_fields.html' import render_common_settings_form %}
<script src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script>
<script src="{{url_for('static_content', group='js', filename='vis.js')}}" defer></script>
@@ -204,7 +204,9 @@ Math: {{ 1 + 1 }}") }}
</div>
<div class="tab-pane-inner" id="browser-steps">
{% if playwright_enabled and watch_uses_webdriver %}
{% if watch_needs_selenium_or_playwright %}
{# Only works with playwright #}
{% if system_has_playwright_configured %}
<img class="beta-logo" src="{{url_for('static_content', group='images', filename='beta-logo.png')}}" alt="New beta functionality">
<fieldset>
<div class="pure-control-group">
@@ -223,7 +225,6 @@ Math: {{ 1 + 1 }}") }}
<div class="flex-wrapper" >
<div id="browser-steps-ui" class="noselect">
<div class="noselect" id="browsersteps-selector-wrapper" style="width: 100%">
<span class="loader" >
<span id="browsersteps-click-start">
@@ -245,15 +246,16 @@ Math: {{ 1 + 1 }}") }}
</div>
</fieldset>
{% else %}
<span class="pure-form-message-inline">
{% if not watch_uses_webdriver %}
{{ only_webdriver_type_watches_warning() }}
{% endif %}
{% if not playwright_enabled %}
{{ playwright_warning() }}
{% endif %}
</span>
{# it's configured to use selenium or chrome but system says its not configured #}
{{ playwright_warning() }}
{% if system_has_webdriver_configured %}
<strong>Selenium/Webdriver can't be used here because it won't fetch screenshots reliably.</strong>
{% endif %}
{% endif %}
{% else %}
{# "This functionality needs chrome.." #}
{{ only_playwright_type_watches_warning() }}
{% endif %}
</div>
@@ -262,7 +264,7 @@ Math: {{ 1 + 1 }}") }}
<div class="pure-control-group inline-radio">
{{ render_checkbox_field(form.notification_muted) }}
</div>
{% if watch_uses_webdriver %}
{% if watch_needs_selenium_or_playwright %}
<div class="pure-control-group inline-radio">
{{ render_checkbox_field(form.notification_screenshot) }}
<span class="pure-form-message-inline">
@@ -379,13 +381,15 @@ Math: {{ 1 + 1 }}") }}
<fieldset>
<div class="pure-control-group">
{% if playwright_enabled and watch_uses_webdriver %}
{% if watch_needs_selenium_or_playwright %}
{% if system_has_playwright_configured %}
<span class="pure-form-message-inline" id="visual-selector-heading">
The Visual Selector tool lets you select the <i>text</i> elements that will be used for the change detection. It automatically fills-in the filters in the "CSS/JSONPath/JQ/XPath Filters" box of the <a href="#filters-and-triggers">Filters & Triggers</a> tab. Use <strong>Shift+Click</strong> to select multiple items.
</span>
<div id="selector-header">
<a id="clear-selector" class="pure-button button-secondary button-xsmall" style="font-size: 70%">Clear selection</a>
<!-- visual selector IMG will try to load, it will either replace this or on error replace it with some handy text -->
<i class="fetching-update-notice" style="font-size: 80%;">One moment, fetching screenshot and element information..</i>
</div>
<div id="selector-wrapper" style="display: none">
@@ -397,13 +401,16 @@ Math: {{ 1 + 1 }}") }}
</div>
<div id="selector-current-xpath" style="overflow-x: hidden"><strong>Currently:</strong>&nbsp;<span class="text">Loading...</span></div>
{% else %}
{% if not watch_uses_webdriver %}
{{ only_webdriver_type_watches_warning() }}
{% endif %}
{% if not playwright_enabled %}
{{ playwright_warning() }}
{% endif %}
{# The watch needed chrome but system says that playwright is not ready #}
{{ playwright_warning() }}
{% endif %}
{% if system_has_webdriver_configured %}
<strong>Selenium/Webdriver can't be used here because it won't fetch screenshots reliably.</strong>
{% endif %}
{% else %}
{# "This functionality needs chrome.." #}
{{ only_playwright_type_watches_warning() }}
{% endif %}
</div>
</fieldset>
</div>
@@ -443,10 +450,6 @@ Math: {{ 1 + 1 }}") }}
</tr>
</tbody>
</table>
<h4>Text similarity</h4>
<p><strong>Levenshtein Distance</strong> - Last 2 snapshots: {{ lev_info }}</p>
<p style="max-width: 80%; font-size: 80%"><strong>Levenshtein Distance</strong> Calculates the minimum number of insertions, deletions, and substitutions required to change one text into the other.</p>
{% if watch.history_n %}
<p>
<a href="{{url_for('ui.ui_edit.watch_get_latest_html', uuid=uuid)}}" class="pure-button button-small">Download latest HTML snapshot</a>
-5
@@ -74,11 +74,6 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
res = client.get(url_for("ui.ui_edit.watch_get_latest_html", uuid=uuid))
assert b'which has this one new line' in res.data
# Check the 'levenshtein' distance calc showed something useful
res = client.get(url_for("ui.ui_edit.edit_page", uuid=uuid))
assert b'Last 2 snapshots: 17' in res.data
# Now something should be ready, indicated by having a 'unviewed' class
res = client.get(url_for("watchlist.index"))
assert b'unviewed' in res.data
+17 -11
@@ -9,15 +9,20 @@ services:
# - ./proxies.json:/datastore/proxies.json
# environment:
# Default listening port, can also be changed with the -p option
# Default listening port, can also be changed with the -p option (not to be confused with ports: below)
# - PORT=5000
#
# Log levels are in descending order. (TRACE is the most detailed one)
# Log output levels: TRACE, DEBUG(default), INFO, SUCCESS, WARNING, ERROR, CRITICAL
# - LOGGER_LEVEL=TRACE
#
# Alternative WebDriver/selenium URL, do not use "'s or 's!
# - WEBDRIVER_URL=http://browser-chrome:4444/wd/hub
#
# Uncomment below and the "sockpuppetbrowser" to use a real Chrome browser (It uses the "playwright" protocol)
# - PLAYWRIGHT_DRIVER_URL=ws://browser-sockpuppet-chrome:3000
#
#
# Alternative WebDriver/selenium URL, do not use "'s or 's! (old, deprecated, does not support screenshots very well)
# - WEBDRIVER_URL=http://browser-selenium-chrome:4444/wd/hub
#
# WebDriver proxy settings webdriver_proxyType, webdriver_ftpProxy, webdriver_noProxy,
# webdriver_proxyAutoconfigUrl, webdriver_autodetect,
@@ -25,9 +30,6 @@ services:
#
# https://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.common.proxy
#
# Alternative target "Chrome" Playwright URL, do not use "'s or 's!
# "Playwright" is a driver/librarythat allows changedetection to talk to a Chrome or similar browser.
# - PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000
#
# Playwright proxy settings playwright_proxy_server, playwright_proxy_bypass, playwright_proxy_username, playwright_proxy_password
#
@@ -43,7 +45,7 @@ services:
# Base URL of your changedetection.io install (Added to the notification alert)
# - BASE_URL=https://mysite.com
# Respect proxy_pass type settings, `proxy_set_header Host "localhost";` and `proxy_set_header X-Forwarded-Prefix /app;`
# More here https://github.com/dgtlmoon/changedetection.io/wiki/Running-changedetection.io-behind-a-reverse-proxy-sub-directory
# More here https://github.com/dgtlmoon/changedetection.io/wiki/Running-changedetection.io-behind-a-reverse-proxy
# - USE_X_SETTINGS=1
#
# Hides the `Referer` header so that monitored websites can't see the changedetection.io hostname.
@@ -86,8 +88,8 @@ services:
# Sockpuppetbrowser is basically chrome wrapped in an API for allowing fast fetching of web-pages.
# RECOMMENDED FOR FETCHING PAGES WITH CHROME, be sure to enable the "PLAYWRIGHT_DRIVER_URL" env variable in the main changedetection container
# sockpuppetbrowser:
# hostname: sockpuppetbrowser
# browser-sockpuppet-chrome:
# hostname: browser-sockpuppet-chrome
# image: dgtlmoon/sockpuppetbrowser:latest
# cap_add:
# - SYS_ADMIN
@@ -102,14 +104,18 @@ services:
# Used for fetching pages via Playwright+Chrome where you need Javascript support.
# Note: Works well but is deprecated, does not fetch full page screenshots (doesn't work with Visual Selector)
# Does not report status codes (200, 404, 403) and other issues
# browser-chrome:
# hostname: browser-chrome
# browser-selenium-chrome:
# hostname: browser-selenium-chrome
# image: selenium/standalone-chrome:4
# environment:
# - VNC_NO_PASSWORD=1
# - SCREEN_WIDTH=1920
# - SCREEN_HEIGHT=1080
# - SCREEN_DEPTH=24
# CHROME_OPTIONS: |
# --window-size=1280,1024
# --headless
# --disable-gpu
# volumes:
# # Workaround to avoid the browser crashing inside a docker container
# # See https://github.com/SeleniumHQ/docker-selenium#quick-start
File diff suppressed because one or more lines are too long
+8 -8
@@ -5,13 +5,13 @@
<meta name="description" content="Manage your changedetection.io watches via API, requires the `x-api-key` header which is found in the settings UI.">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link href="assets/bootstrap.min.css?v=1701595483622" rel="stylesheet" media="screen">
<link href="assets/prism.css?v=1701595483622" rel="stylesheet" />
<link href="assets/main.css?v=1701595483622" rel="stylesheet" media="screen, print">
<link href="assets/favicon.ico?v=1701595483622" rel="icon" type="image/x-icon">
<link href="assets/apple-touch-icon.png?v=1701595483622" rel="apple-touch-icon" sizes="180x180">
<link href="assets/favicon-32x32.png?v=1701595483622" rel="icon" type="image/png" sizes="32x32">
<link href="assets/favicon-16x16.png?v=1701595483622" rel="icon" type="image/png" sizes="16x16">
<link href="assets/bootstrap.min.css?v=1744573753999" rel="stylesheet" media="screen">
<link href="assets/prism.css?v=1744573753999" rel="stylesheet" />
<link href="assets/main.css?v=1744573753999" rel="stylesheet" media="screen, print">
<link href="assets/favicon.ico?v=1744573753999" rel="icon" type="image/x-icon">
<link href="assets/apple-touch-icon.png?v=1744573753999" rel="apple-touch-icon" sizes="180x180">
<link href="assets/favicon-32x32.png?v=1744573753999" rel="icon" type="image/png" sizes="32x32">
<link href="assets/favicon-16x16.png?v=1744573753999" rel="icon" type="image/png" sizes="16x16">
</head>
<body class="container-fluid">
@@ -928,6 +928,6 @@
</div>
</div>
<script src="assets/main.bundle.js?v=1701595483622"></script>
<script src="assets/main.bundle.js?v=1744573753999"></script>
</body>
</html>
+3 -2
@@ -68,8 +68,6 @@ openpyxl
jq~=1.3; python_version >= "3.8" and sys_platform == "darwin"
jq~=1.3; python_version >= "3.8" and sys_platform == "linux"
levenshtein
# playwright is installed at Dockerfile build time because it's not available on all platforms
pyppeteer-ng==2.0.0rc9
@@ -112,3 +110,6 @@ pluggy ~= 1.5
# Needed for testing, cross-platform for process and system monitoring
psutil==7.0.0
ruff >= 0.11.2
pre_commit >= 4.2.0