mirror of
https://github.com/dgtlmoon/changedetection.io.git
synced 2025-11-12 12:36:48 +00:00
Compare commits
1 Commits
compose-im
...
improve-lo
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
6e40278566 |
2
.github/ISSUE_TEMPLATE/bug_report.md
vendored
2
.github/ISSUE_TEMPLATE/bug_report.md
vendored
@@ -21,7 +21,7 @@ Steps to reproduce the behavior:
|
||||
3. Scroll down to '....'
|
||||
4. See error
|
||||
|
||||
! ALWAYS INCLUDE AN EXAMPLE URL WHERE IT IS POSSIBLE TO RE-CREATE THE ISSUE - USE THE 'SHARE WATCH' FEATURE AND PASTE IN THE SHARE-LINK!
|
||||
! ALWAYS INCLUDE AN EXAMPLE URL WHERE IT IS POSSIBLE TO RE-CREATE THE ISSUE !
|
||||
|
||||
**Expected behavior**
|
||||
A clear and concise description of what you expected to happen.
|
||||
|
||||
15
.github/workflows/containers.yml
vendored
15
.github/workflows/containers.yml
vendored
@@ -85,8 +85,8 @@ jobs:
|
||||
version: latest
|
||||
driver-opts: image=moby/buildkit:master
|
||||
|
||||
# master branch -> :dev container tag
|
||||
- name: Build and push :dev
|
||||
# master always builds :latest
|
||||
- name: Build and push :latest
|
||||
id: docker_build
|
||||
if: ${{ github.ref }} == "refs/heads/master"
|
||||
uses: docker/build-push-action@v2
|
||||
@@ -95,12 +95,12 @@ jobs:
|
||||
file: ./Dockerfile
|
||||
push: true
|
||||
tags: |
|
||||
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:dev,ghcr.io/${{ github.repository }}:dev
|
||||
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:latest,ghcr.io/${{ github.repository }}:latest
|
||||
platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7
|
||||
cache-from: type=local,src=/tmp/.buildx-cache
|
||||
cache-to: type=local,dest=/tmp/.buildx-cache
|
||||
|
||||
# A new tagged release is required, which builds :tag and :latest
|
||||
# A new tagged release is required, which builds :tag
|
||||
- name: Build and push :tag
|
||||
id: docker_build_tag_release
|
||||
if: github.event_name == 'release' && startsWith(github.event.release.tag_name, '0.')
|
||||
@@ -110,10 +110,7 @@ jobs:
|
||||
file: ./Dockerfile
|
||||
push: true
|
||||
tags: |
|
||||
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:${{ github.event.release.tag_name }}
|
||||
ghcr.io/dgtlmoon/changedetection.io:${{ github.event.release.tag_name }}
|
||||
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:latest
|
||||
ghcr.io/dgtlmoon/changedetection.io:latest
|
||||
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:${{ github.event.release.tag_name }},ghcr.io/dgtlmoon/changedetection.io:${{ github.event.release.tag_name }}
|
||||
platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7
|
||||
cache-from: type=local,src=/tmp/.buildx-cache
|
||||
cache-to: type=local,dest=/tmp/.buildx-cache
|
||||
@@ -128,3 +125,5 @@ jobs:
|
||||
key: ${{ runner.os }}-buildx-${{ github.sha }}
|
||||
restore-keys: |
|
||||
${{ runner.os }}-buildx-
|
||||
|
||||
|
||||
|
||||
@@ -56,9 +56,9 @@ Easily see what changed, examine by word, line, or individual character.
|
||||
|
||||
Please :star: star :star: this project and help it grow! https://github.com/dgtlmoon/changedetection.io/
|
||||
|
||||
### Filter by elements using the Visual Selector tool.
|
||||
### Target elements with the Visual Selector tool.
|
||||
|
||||
Available when connected to a <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Playwright-content-fetcher">playwright content fetcher</a> (included as part of our subscription service)
|
||||
Available when connected to a <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Playwright-content-fetcher">playwright content fetcher</a> (available also as part of our subscription service)
|
||||
|
||||
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/visualselector-anim.gif" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />
|
||||
|
||||
@@ -67,18 +67,14 @@ Available when connected to a <a href="https://github.com/dgtlmoon/changedetecti
|
||||
### Docker
|
||||
|
||||
With Docker composer, just clone this repository and..
|
||||
|
||||
```bash
|
||||
$ docker-compose up -d
|
||||
```
|
||||
|
||||
Docker standalone
|
||||
```bash
|
||||
$ docker run -d --restart always -p "127.0.0.1:5000:5000" -v datastore-volume:/datastore --name changedetection.io dgtlmoon/changedetection.io
|
||||
```
|
||||
|
||||
`:latest` tag is our latest stable release, `:dev` tag is our bleeding edge `master` branch.
|
||||
|
||||
### Windows
|
||||
|
||||
See the install instructions at the wiki https://github.com/dgtlmoon/changedetection.io/wiki/Microsoft-Windows
|
||||
|
||||
@@ -44,7 +44,7 @@ from flask_wtf import CSRFProtect
|
||||
from changedetectionio import html_tools
|
||||
from changedetectionio.api import api_v1
|
||||
|
||||
__version__ = '0.39.15'
|
||||
__version__ = '0.39.14'
|
||||
|
||||
datastore = None
|
||||
|
||||
@@ -108,7 +108,7 @@ def _jinja2_filter_datetime(watch_obj, format="%Y-%m-%d %H:%M:%S"):
|
||||
# Worker thread tells us which UUID it is currently processing.
|
||||
for t in running_update_threads:
|
||||
if t.current_uuid == watch_obj['uuid']:
|
||||
return '<span class="loader"></span><span> Checking now</span>'
|
||||
return "Checking now.."
|
||||
|
||||
if watch_obj['last_checked'] == 0:
|
||||
return 'Not yet'
|
||||
@@ -352,8 +352,9 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
latest_fname = watch.history[dates[-1]]
|
||||
|
||||
html_diff = diff.render_diff(prev_fname, latest_fname, include_equal=False, line_feed_sep="</br>")
|
||||
fe.content(content="<html><body><h4>{}</h4>{}</body></html>".format(watch_title, html_diff),
|
||||
type='CDATA')
|
||||
fe.description(description="<![CDATA["
|
||||
"<html><body><h4>{}</h4>{}</body></html>"
|
||||
"]]>".format(watch_title, html_diff))
|
||||
|
||||
fe.guid(guid, permalink=False)
|
||||
dt = datetime.datetime.fromtimestamp(int(watch.newest_history_key))
|
||||
@@ -458,19 +459,6 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
|
||||
return 'OK'
|
||||
|
||||
|
||||
@app.route("/scrub/<string:uuid>", methods=['GET'])
|
||||
@login_required
|
||||
def scrub_watch(uuid):
|
||||
try:
|
||||
datastore.scrub_watch(uuid)
|
||||
except KeyError:
|
||||
flash('Watch not found', 'error')
|
||||
else:
|
||||
flash("Scrubbed watch {}".format(uuid))
|
||||
|
||||
return redirect(url_for('index'))
|
||||
|
||||
@app.route("/scrub", methods=['GET', 'POST'])
|
||||
@login_required
|
||||
def scrub_page():
|
||||
@@ -822,13 +810,7 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
|
||||
screenshot_url = datastore.get_screenshot(uuid)
|
||||
|
||||
system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
|
||||
|
||||
is_html_webdriver = True if watch.get('fetch_backend') == 'html_webdriver' or (
|
||||
watch.get('fetch_backend', None) is None and system_uses_webdriver) else False
|
||||
|
||||
output = render_template("diff.html",
|
||||
watch_a=watch,
|
||||
output = render_template("diff.html", watch_a=watch,
|
||||
newest=newest_version_file_contents,
|
||||
previous=previous_version_file_contents,
|
||||
extra_stylesheets=extra_stylesheets,
|
||||
@@ -839,8 +821,7 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
current_diff_url=watch['url'],
|
||||
extra_title=" - Diff - {}".format(watch['title'] if watch['title'] else watch['url']),
|
||||
left_sticky=True,
|
||||
screenshot=screenshot_url,
|
||||
is_html_webdriver=is_html_webdriver)
|
||||
screenshot=screenshot_url)
|
||||
|
||||
return output
|
||||
|
||||
@@ -855,12 +836,6 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
if uuid == 'first':
|
||||
uuid = list(datastore.data['watching'].keys()).pop()
|
||||
|
||||
# Normally you would never reach this, because the 'preview' button is not available when there's no history
|
||||
# However they may try to scrub and reload the page
|
||||
if datastore.data['watching'][uuid].history_n == 0:
|
||||
flash("Preview unavailable - No fetch/check completed or triggers not reached", "error")
|
||||
return redirect(url_for('index'))
|
||||
|
||||
extra_stylesheets = [url_for('static_content', group='styles', filename='diff.css')]
|
||||
|
||||
try:
|
||||
@@ -907,11 +882,6 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
content.append({'line': "No history found", 'classes': ''})
|
||||
|
||||
screenshot_url = datastore.get_screenshot(uuid)
|
||||
system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
|
||||
|
||||
is_html_webdriver = True if watch.get('fetch_backend') == 'html_webdriver' or (
|
||||
watch.get('fetch_backend', None) is None and system_uses_webdriver) else False
|
||||
|
||||
output = render_template("preview.html",
|
||||
content=content,
|
||||
extra_stylesheets=extra_stylesheets,
|
||||
@@ -920,9 +890,8 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
current_diff_url=watch['url'],
|
||||
screenshot=screenshot_url,
|
||||
watch=watch,
|
||||
uuid=uuid,
|
||||
is_html_webdriver=is_html_webdriver)
|
||||
|
||||
uuid=uuid)
|
||||
|
||||
return output
|
||||
|
||||
@app.route("/settings/notification-logs", methods=['GET'])
|
||||
@@ -930,7 +899,7 @@ def changedetection_app(config=None, datastore_o=None):
|
||||
def notification_logs():
|
||||
global notification_debug_log
|
||||
output = render_template("notification-log.html",
|
||||
logs=notification_debug_log if len(notification_debug_log) else ["Notification logs are empty - no notifications sent yet."])
|
||||
logs=notification_debug_log if len(notification_debug_log) else ["No errors or warnings detected"])
|
||||
|
||||
return output
|
||||
|
||||
@@ -1250,9 +1219,6 @@ def check_for_new_version():
|
||||
|
||||
def notification_runner():
|
||||
global notification_debug_log
|
||||
from datetime import datetime
|
||||
import json
|
||||
|
||||
while not app.config.exit.is_set():
|
||||
try:
|
||||
# At the moment only one thread runs (single runner)
|
||||
@@ -1261,14 +1227,10 @@ def notification_runner():
|
||||
time.sleep(1)
|
||||
|
||||
else:
|
||||
|
||||
now = datetime.now()
|
||||
sent_obj = None
|
||||
|
||||
# Process notifications
|
||||
try:
|
||||
from changedetectionio import notification
|
||||
|
||||
sent_obj = notification.process_notification(n_object, datastore)
|
||||
notification.process_notification(n_object, datastore)
|
||||
|
||||
except Exception as e:
|
||||
logging.error("Watch URL: {} Error {}".format(n_object['watch_url'], str(e)))
|
||||
@@ -1281,18 +1243,14 @@ def notification_runner():
|
||||
log_lines = str(e).splitlines()
|
||||
notification_debug_log += log_lines
|
||||
|
||||
# Process notifications
|
||||
notification_debug_log+= ["{} - SENDING - {}".format(now.strftime("%Y/%m/%d %H:%M:%S,000"), json.dumps(sent_obj))]
|
||||
# Trim the log length
|
||||
notification_debug_log = notification_debug_log[-100:]
|
||||
# Trim the log length
|
||||
notification_debug_log = notification_debug_log[-100:]
|
||||
|
||||
|
||||
# Thread runner to check every minute, look for new watches to feed into the Queue.
|
||||
def ticker_thread_check_time_launch_checks():
|
||||
import random
|
||||
from changedetectionio import update_worker
|
||||
|
||||
recheck_time_minimum_seconds = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 20))
|
||||
print("System env MINIMUM_SECONDS_RECHECK_TIME", recheck_time_minimum_seconds)
|
||||
import logging
|
||||
|
||||
# Spin up Workers that do the fetching
|
||||
# Can be overriden by ENV or use the default settings
|
||||
@@ -1325,12 +1283,14 @@ def ticker_thread_check_time_launch_checks():
|
||||
while update_q.qsize() >= 2000:
|
||||
time.sleep(1)
|
||||
|
||||
|
||||
recheck_time_system_seconds = int(datastore.threshold_seconds)
|
||||
|
||||
# Check for watches outside of the time threshold to put in the thread queue.
|
||||
now = time.time()
|
||||
|
||||
recheck_time_minimum_seconds = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60))
|
||||
recheck_time_system_seconds = datastore.threshold_seconds
|
||||
|
||||
for uuid in watch_uuid_list:
|
||||
now = time.time()
|
||||
|
||||
watch = datastore.data['watching'].get(uuid)
|
||||
if not watch:
|
||||
logging.error("Watch: {} no longer present.".format(uuid))
|
||||
@@ -1341,33 +1301,20 @@ def ticker_thread_check_time_launch_checks():
|
||||
continue
|
||||
|
||||
# If they supplied an individual entry minutes to threshold.
|
||||
|
||||
threshold = now
|
||||
watch_threshold_seconds = watch.threshold_seconds()
|
||||
threshold = watch_threshold_seconds if watch_threshold_seconds > 0 else recheck_time_system_seconds
|
||||
if watch_threshold_seconds:
|
||||
threshold -= watch_threshold_seconds
|
||||
else:
|
||||
threshold -= recheck_time_system_seconds
|
||||
|
||||
# #580 - Jitter plus/minus amount of time to make the check seem more random to the server
|
||||
jitter = datastore.data['settings']['requests'].get('jitter_seconds', 0)
|
||||
if jitter > 0:
|
||||
if watch.jitter_seconds == 0:
|
||||
watch.jitter_seconds = random.uniform(-abs(jitter), jitter)
|
||||
|
||||
|
||||
seconds_since_last_recheck = now - watch['last_checked']
|
||||
if seconds_since_last_recheck >= (threshold + watch.jitter_seconds) and seconds_since_last_recheck >= recheck_time_minimum_seconds:
|
||||
# Yeah, put it in the queue, it's more than time
|
||||
if watch['last_checked'] <= max(threshold, recheck_time_minimum_seconds):
|
||||
if not uuid in running_uuids and uuid not in update_q.queue:
|
||||
print("Queued watch UUID {} last checked at {} queued at {:0.2f} jitter {:0.2f}s, {:0.2f}s since last checked".format(uuid,
|
||||
watch['last_checked'],
|
||||
now,
|
||||
watch.jitter_seconds,
|
||||
now - watch['last_checked']))
|
||||
# Into the queue with you
|
||||
update_q.put(uuid)
|
||||
|
||||
# Reset for next time
|
||||
watch.jitter_seconds = 0
|
||||
|
||||
# Wait before checking the list again - saves CPU
|
||||
time.sleep(1)
|
||||
# Wait a few seconds before checking the list again
|
||||
time.sleep(3)
|
||||
|
||||
# Should be low so we can break this out in testing
|
||||
app.config.exit.wait(1)
|
||||
@@ -281,14 +281,13 @@ class base_html_playwright(Fetcher):
|
||||
from playwright.sync_api import sync_playwright
|
||||
import playwright._impl._api_types
|
||||
from playwright._impl._api_types import Error, TimeoutError
|
||||
response = None
|
||||
|
||||
with sync_playwright() as p:
|
||||
browser_type = getattr(p, self.browser_type)
|
||||
|
||||
# Seemed to cause a connection Exception even tho I can see it connect
|
||||
# self.browser = browser_type.connect(self.command_executor, timeout=timeout*1000)
|
||||
# 60,000 connection timeout only
|
||||
browser = browser_type.connect_over_cdp(self.command_executor, timeout=60000)
|
||||
browser = browser_type.connect_over_cdp(self.command_executor, timeout=timeout * 1000)
|
||||
|
||||
# Set user agent to prevent Cloudflare from blocking the browser
|
||||
# Use the default one configured in the App.py model that's passed from fetch_site_status.py
|
||||
@@ -303,24 +302,19 @@ class base_html_playwright(Fetcher):
|
||||
|
||||
page = context.new_page()
|
||||
try:
|
||||
page.set_default_navigation_timeout(90000)
|
||||
page.set_default_timeout(90000)
|
||||
|
||||
# Bug - never set viewport size BEFORE page.goto
|
||||
|
||||
# Waits for the next navigation. Using Python context manager
|
||||
# prevents a race condition between clicking and waiting for a navigation.
|
||||
with page.expect_navigation():
|
||||
response = page.goto(url, wait_until='load')
|
||||
|
||||
response = page.goto(url, timeout=timeout * 1000, wait_until='commit')
|
||||
# Wait_until = commit
|
||||
# - `'commit'` - consider operation to be finished when network response is received and the document started loading.
|
||||
# Better to not use any smarts from Playwright and just wait an arbitrary number of seconds
|
||||
# This seemed to solve nearly all 'TimeoutErrors'
|
||||
extra_wait = int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay
|
||||
page.wait_for_timeout(extra_wait * 1000)
|
||||
except playwright._impl._api_types.TimeoutError as e:
|
||||
context.close()
|
||||
browser.close()
|
||||
# This can be ok, we will try to grab what we could retrieve
|
||||
pass
|
||||
raise EmptyReply(url=url, status_code=None)
|
||||
except Exception as e:
|
||||
print ("other exception when page.goto")
|
||||
print (str(e))
|
||||
context.close()
|
||||
browser.close()
|
||||
raise PageUnloadable(url=url, status_code=None)
|
||||
@@ -328,22 +322,18 @@ class base_html_playwright(Fetcher):
|
||||
if response is None:
|
||||
context.close()
|
||||
browser.close()
|
||||
print ("response object was none")
|
||||
raise EmptyReply(url=url, status_code=None)
|
||||
|
||||
if len(page.content().strip()) == 0:
|
||||
context.close()
|
||||
browser.close()
|
||||
raise EmptyReply(url=url, status_code=None)
|
||||
|
||||
# Bug 2(?) Set the viewport size AFTER loading the page
|
||||
page.set_viewport_size({"width": 1280, "height": 1024})
|
||||
extra_wait = int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay
|
||||
time.sleep(extra_wait)
|
||||
self.content = page.content()
|
||||
self.status_code = response.status
|
||||
page.set_viewport_size({"width": 1280, "height": 1024})
|
||||
|
||||
if len(self.content.strip()) == 0:
|
||||
context.close()
|
||||
browser.close()
|
||||
print ("Content was empty")
|
||||
raise EmptyReply(url=url, status_code=None)
|
||||
|
||||
self.status_code = response.status
|
||||
self.content = page.content()
|
||||
self.headers = response.all_headers()
|
||||
|
||||
if current_css_filter is not None:
|
||||
@@ -356,15 +346,9 @@ class base_html_playwright(Fetcher):
|
||||
# Bug 3 in Playwright screenshot handling
|
||||
# Some bug where it gives the wrong screenshot size, but making a request with the clip set first seems to solve it
|
||||
# JPEG is better here because the screenshots can be very very large
|
||||
|
||||
# Screenshots also travel via the ws:// (websocket) meaning that the binary data is base64 encoded
|
||||
# which will significantly increase the IO size between the server and client, it's recommended to use the lowest
|
||||
# acceptable screenshot quality here
|
||||
try:
|
||||
# Quality set to 1 because it's not used, just used as a work-around for a bug, no need to change this.
|
||||
page.screenshot(type='jpeg', clip={'x': 1.0, 'y': 1.0, 'width': 1280, 'height': 1024}, quality=1)
|
||||
# The actual screenshot
|
||||
self.screenshot = page.screenshot(type='jpeg', full_page=True, quality=int(os.getenv("PLAYWRIGHT_SCREENSHOT_QUALITY", 72)))
|
||||
page.screenshot(type='jpeg', clip={'x': 1.0, 'y': 1.0, 'width': 1280, 'height': 1024})
|
||||
self.screenshot = page.screenshot(type='jpeg', full_page=True, quality=92)
|
||||
except Exception as e:
|
||||
context.close()
|
||||
browser.close()
|
||||
|
||||
@@ -224,52 +224,34 @@ class perform_site_check():
|
||||
else:
|
||||
fetched_md5 = hashlib.md5(stripped_text_from_html).hexdigest()
|
||||
|
||||
############ Blocking rules, after checksum #################
|
||||
blocked = False
|
||||
# On the first run of a site, watch['previous_md5'] will be None, set it the current one.
|
||||
if not watch.get('previous_md5'):
|
||||
watch['previous_md5'] = fetched_md5
|
||||
update_obj["previous_md5"] = fetched_md5
|
||||
|
||||
blocked_by_not_found_trigger_text = False
|
||||
|
||||
if len(watch['trigger_text']):
|
||||
# Assume blocked
|
||||
blocked = True
|
||||
# Yeah, lets block first until something matches
|
||||
blocked_by_not_found_trigger_text = True
|
||||
# Filter and trigger works the same, so reuse it
|
||||
# It should return the line numbers that match
|
||||
result = html_tools.strip_ignore_text(content=str(stripped_text_from_html),
|
||||
wordlist=watch['trigger_text'],
|
||||
mode="line numbers")
|
||||
# Unblock if the trigger was found
|
||||
# If it returned any lines that matched..
|
||||
if result:
|
||||
blocked = False
|
||||
blocked_by_not_found_trigger_text = False
|
||||
|
||||
|
||||
if len(watch['text_should_not_be_present']):
|
||||
# If anything matched, then we should block a change from happening
|
||||
result = html_tools.strip_ignore_text(content=str(stripped_text_from_html),
|
||||
wordlist=watch['text_should_not_be_present'],
|
||||
mode="line numbers")
|
||||
if result:
|
||||
blocked = True
|
||||
|
||||
# The main thing that all this at the moment comes down to :)
|
||||
if watch['previous_md5'] != fetched_md5:
|
||||
if not blocked_by_not_found_trigger_text and watch['previous_md5'] != fetched_md5:
|
||||
changed_detected = True
|
||||
|
||||
# Looks like something changed, but did it match all the rules?
|
||||
if blocked:
|
||||
changed_detected = False
|
||||
else:
|
||||
update_obj["previous_md5"] = fetched_md5
|
||||
update_obj["last_changed"] = timestamp
|
||||
|
||||
|
||||
# Extract title as title
|
||||
if is_html:
|
||||
if self.datastore.data['settings']['application']['extract_title_as_title'] or watch['extract_title_as_title']:
|
||||
if not watch['title'] or not len(watch['title']):
|
||||
update_obj['title'] = html_tools.extract_element(find='title', html_content=fetcher.content)
|
||||
|
||||
# Always record the new checksum
|
||||
update_obj["previous_md5"] = fetched_md5
|
||||
|
||||
# On the first run of a site, watch['previous_md5'] will be None, set it the current one.
|
||||
if not watch.get('previous_md5'):
|
||||
watch['previous_md5'] = fetched_md5
|
||||
|
||||
return changed_detected, update_obj, text_content_before_ignored_filter, fetcher.screenshot, fetcher.xpath_data
|
||||
|
||||
@@ -341,8 +341,6 @@ class watchForm(commonSettingsForm):
|
||||
method = SelectField('Request method', choices=valid_method, default=default_method)
|
||||
ignore_status_codes = BooleanField('Ignore status codes (process non-2xx status codes as normal)', default=False)
|
||||
trigger_text = StringListField('Trigger/wait for text', [validators.Optional(), ValidateListRegex()])
|
||||
text_should_not_be_present = StringListField('Block change-detection if text matches', [validators.Optional(), ValidateListRegex()])
|
||||
|
||||
save_button = SubmitField('Save', render_kw={"class": "pure-button pure-button-primary"})
|
||||
save_and_preview_button = SubmitField('Save & Preview', render_kw={"class": "pure-button pure-button-primary"})
|
||||
proxy = RadioField('Proxy')
|
||||
@@ -365,9 +363,7 @@ class watchForm(commonSettingsForm):
|
||||
class globalSettingsRequestForm(Form):
|
||||
time_between_check = FormField(TimeBetweenCheckForm)
|
||||
proxy = RadioField('Proxy')
|
||||
jitter_seconds = IntegerField('Random jitter seconds ± check',
|
||||
render_kw={"style": "width: 5em;"},
|
||||
validators=[validators.NumberRange(min=0, message="Should contain zero or more seconds")])
|
||||
|
||||
|
||||
# datastore.data['settings']['application']..
|
||||
class globalSettingsApplicationForm(commonSettingsForm):
|
||||
|
||||
@@ -23,7 +23,6 @@ class model(dict):
|
||||
'requests': {
|
||||
'timeout': 15, # Default 15 seconds
|
||||
'time_between_check': {'weeks': None, 'days': None, 'hours': 3, 'minutes': None, 'seconds': None},
|
||||
'jitter_seconds': 0,
|
||||
'workers': 10, # Number of threads, lower is better for slow connections
|
||||
'proxy': None # Preferred proxy connection
|
||||
},
|
||||
|
||||
@@ -13,6 +13,7 @@ from changedetectionio.notification import (
|
||||
class model(dict):
|
||||
__newest_history_key = None
|
||||
__history_n=0
|
||||
|
||||
__base_config = {
|
||||
'url': None,
|
||||
'tag': None,
|
||||
@@ -38,7 +39,6 @@ class model(dict):
|
||||
'extract_text': [], # Extract text by regex after filters
|
||||
'subtractive_selectors': [],
|
||||
'trigger_text': [], # List of text or regex to wait for until a change is detected
|
||||
'text_should_not_be_present': [], # Text that should not present
|
||||
'fetch_backend': None,
|
||||
'extract_title_as_title': False,
|
||||
'proxy': None, # Preferred proxy connection
|
||||
@@ -48,8 +48,7 @@ class model(dict):
|
||||
'time_between_check': {'weeks': None, 'days': None, 'hours': None, 'minutes': None, 'seconds': None},
|
||||
'webdriver_delay': None
|
||||
}
|
||||
jitter_seconds = 0
|
||||
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
|
||||
|
||||
def __init__(self, *arg, **kw):
|
||||
import uuid
|
||||
self.update(self.__base_config)
|
||||
@@ -86,7 +85,7 @@ class model(dict):
|
||||
# Read the history file as a dict
|
||||
fname = os.path.join(self.__datastore_path, self.get('uuid'), "history.txt")
|
||||
if os.path.isfile(fname):
|
||||
logging.debug("Reading history index " + str(time.time()))
|
||||
logging.debug("Disk IO accessed " + str(time.time()))
|
||||
with open(fname, "r") as f:
|
||||
tmp_history = dict(i.strip().split(',', 2) for i in f.readlines())
|
||||
|
||||
@@ -158,7 +157,8 @@ class model(dict):
|
||||
|
||||
def threshold_seconds(self):
|
||||
seconds = 0
|
||||
for m, n in self.mtable.items():
|
||||
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
|
||||
for m, n in mtable.items():
|
||||
x = self.get('time_between_check', {}).get(m, None)
|
||||
if x:
|
||||
seconds += x * n
|
||||
|
||||
@@ -48,7 +48,6 @@ def process_notification(n_object, datastore):
|
||||
# Anything higher than or equal to WARNING (which covers things like Connection errors)
|
||||
# raise it as an exception
|
||||
apobjs=[]
|
||||
sent_objs=[]
|
||||
for url in n_object['notification_urls']:
|
||||
|
||||
apobj = apprise.Apprise(debug=True)
|
||||
@@ -68,11 +67,6 @@ def process_notification(n_object, datastore):
|
||||
url += k + 'avatar_url=https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/changedetectionio/static/images/avatar-256x256.png'
|
||||
|
||||
if url.startswith('tgram://'):
|
||||
# Telegram only supports a limit subset of HTML, remove the '<br/>' we place in.
|
||||
# re https://github.com/dgtlmoon/changedetection.io/issues/555
|
||||
# @todo re-use an existing library we have already imported to strip all non-allowed tags
|
||||
n_body = n_body.replace('<br/>', '\n')
|
||||
n_body = n_body.replace('</br>', '\n')
|
||||
# real limit is 4096, but minus some for extra metadata
|
||||
payload_max_size = 3600
|
||||
body_limit = max(0, payload_max_size - len(n_title))
|
||||
@@ -102,15 +96,6 @@ def process_notification(n_object, datastore):
|
||||
log_value = logs.getvalue()
|
||||
if log_value and 'WARNING' in log_value or 'ERROR' in log_value:
|
||||
raise Exception(log_value)
|
||||
|
||||
sent_objs.append({'title': n_title,
|
||||
'body': n_body,
|
||||
'url' : url,
|
||||
'body_format': n_format})
|
||||
|
||||
# Return what was sent for better logging - after the for loop
|
||||
return sent_objs
|
||||
|
||||
|
||||
# Notification title + body content parameters get created here.
|
||||
def create_notification_parameters(n_object, datastore):
|
||||
|
||||
@@ -1,17 +0,0 @@
|
||||
$(document).ready(function () {
|
||||
// Load it when the #screenshot tab is in use, so we dont give a slow experience when waiting for the text diff to load
|
||||
window.addEventListener('hashchange', function (e) {
|
||||
toggle(location.hash);
|
||||
}, false);
|
||||
|
||||
toggle(location.hash);
|
||||
|
||||
function toggle(hash_name) {
|
||||
if (hash_name === '#screenshot') {
|
||||
$("img#screenshot-img").attr('src', screenshot_url);
|
||||
$("#settings").hide();
|
||||
} else {
|
||||
$("#settings").show();
|
||||
}
|
||||
}
|
||||
});
|
||||
@@ -40,19 +40,13 @@ $(document).ready(function() {
|
||||
$.ajax({
|
||||
type: "POST",
|
||||
url: notification_base_url,
|
||||
data : data,
|
||||
statusCode: {
|
||||
400: function() {
|
||||
// More than likely the CSRF token was lost when the server restarted
|
||||
alert("There was a problem processing the request, please reload the page.");
|
||||
}
|
||||
}
|
||||
data : data
|
||||
}).done(function(data){
|
||||
console.log(data);
|
||||
alert('Sent');
|
||||
}).fail(function(data){
|
||||
console.log(data);
|
||||
alert('There was an error communicating with the server.');
|
||||
alert('Error: '+data.responseJSON.error);
|
||||
})
|
||||
});
|
||||
});
|
||||
|
||||
@@ -353,8 +353,6 @@ and also iPads specifically.
|
||||
/* Hide table headers (but not display: none;, for accessibility) */ }
|
||||
.watch-table thead, .watch-table tbody, .watch-table th, .watch-table td, .watch-table tr {
|
||||
display: block; }
|
||||
.watch-table .last-checked > span {
|
||||
vertical-align: middle; }
|
||||
.watch-table .last-checked::before {
|
||||
color: #555;
|
||||
content: "Last Checked "; }
|
||||
@@ -372,8 +370,7 @@ and also iPads specifically.
|
||||
.watch-table td {
|
||||
/* Behave like a "row" */
|
||||
border: none;
|
||||
border-bottom: 1px solid #eee;
|
||||
vertical-align: middle; }
|
||||
border-bottom: 1px solid #eee; }
|
||||
.watch-table td:before {
|
||||
/* Top/left values mimic padding */
|
||||
top: 6px;
|
||||
@@ -493,42 +490,3 @@ ul {
|
||||
|
||||
#api-key-copy {
|
||||
color: #0078e7; }
|
||||
|
||||
/* spinner */
|
||||
.loader,
|
||||
.loader:after {
|
||||
border-radius: 50%;
|
||||
width: 10px;
|
||||
height: 10px; }
|
||||
|
||||
.loader {
|
||||
margin: 0px auto;
|
||||
font-size: 3px;
|
||||
vertical-align: middle;
|
||||
display: inline-block;
|
||||
text-indent: -9999em;
|
||||
border-top: 1.1em solid rgba(38, 104, 237, 0.2);
|
||||
border-right: 1.1em solid rgba(38, 104, 237, 0.2);
|
||||
border-bottom: 1.1em solid rgba(38, 104, 237, 0.2);
|
||||
border-left: 1.1em solid #2668ed;
|
||||
-webkit-transform: translateZ(0);
|
||||
-ms-transform: translateZ(0);
|
||||
transform: translateZ(0);
|
||||
-webkit-animation: load8 1.1s infinite linear;
|
||||
animation: load8 1.1s infinite linear; }
|
||||
|
||||
@-webkit-keyframes load8 {
|
||||
0% {
|
||||
-webkit-transform: rotate(0deg);
|
||||
transform: rotate(0deg); }
|
||||
100% {
|
||||
-webkit-transform: rotate(360deg);
|
||||
transform: rotate(360deg); } }
|
||||
|
||||
@keyframes load8 {
|
||||
0% {
|
||||
-webkit-transform: rotate(0deg);
|
||||
transform: rotate(0deg); }
|
||||
100% {
|
||||
-webkit-transform: rotate(360deg);
|
||||
transform: rotate(360deg); } }
|
||||
|
||||
@@ -487,11 +487,6 @@ and also iPads specifically.
|
||||
display: block;
|
||||
}
|
||||
|
||||
.last-checked {
|
||||
> span {
|
||||
vertical-align: middle;
|
||||
}
|
||||
}
|
||||
.last-checked::before {
|
||||
color: #555;
|
||||
content: "Last Checked ";
|
||||
@@ -522,7 +517,7 @@ and also iPads specifically.
|
||||
/* Behave like a "row" */
|
||||
border: none;
|
||||
border-bottom: 1px solid #eee;
|
||||
vertical-align: middle;
|
||||
|
||||
&:before {
|
||||
/* Top/left values mimic padding */
|
||||
top: 6px;
|
||||
@@ -706,48 +701,3 @@ ul {
|
||||
#api-key-copy {
|
||||
color: #0078e7;
|
||||
}
|
||||
|
||||
/* spinner */
|
||||
.loader,
|
||||
.loader:after {
|
||||
border-radius: 50%;
|
||||
width: 10px;
|
||||
height: 10px;
|
||||
}
|
||||
.loader {
|
||||
margin: 0px auto;
|
||||
font-size: 3px;
|
||||
vertical-align: middle;
|
||||
display: inline-block;
|
||||
text-indent: -9999em;
|
||||
border-top: 1.1em solid rgba(38,104,237, 0.2);
|
||||
border-right: 1.1em solid rgba(38,104,237, 0.2);
|
||||
border-bottom: 1.1em solid rgba(38,104,237, 0.2);
|
||||
border-left: 1.1em solid #2668ed;
|
||||
-webkit-transform: translateZ(0);
|
||||
-ms-transform: translateZ(0);
|
||||
transform: translateZ(0);
|
||||
-webkit-animation: load8 1.1s infinite linear;
|
||||
animation: load8 1.1s infinite linear;
|
||||
}
|
||||
@-webkit-keyframes load8 {
|
||||
0% {
|
||||
-webkit-transform: rotate(0deg);
|
||||
transform: rotate(0deg);
|
||||
}
|
||||
100% {
|
||||
-webkit-transform: rotate(360deg);
|
||||
transform: rotate(360deg);
|
||||
}
|
||||
}
|
||||
@keyframes load8 {
|
||||
0% {
|
||||
-webkit-transform: rotate(0deg);
|
||||
transform: rotate(0deg);
|
||||
}
|
||||
100% {
|
||||
-webkit-transform: rotate(360deg);
|
||||
transform: rotate(360deg);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -159,11 +159,12 @@ class ChangeDetectionStore:
|
||||
def threshold_seconds(self):
|
||||
seconds = 0
|
||||
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
|
||||
minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60))
|
||||
for m, n in mtable.items():
|
||||
x = self.__data['settings']['requests']['time_between_check'].get(m)
|
||||
if x:
|
||||
seconds += x * n
|
||||
return seconds
|
||||
return max(seconds, minimum_seconds_recheck_time)
|
||||
|
||||
@property
|
||||
def has_unviewed(self):
|
||||
@@ -253,23 +254,12 @@ class ChangeDetectionStore:
|
||||
def scrub_watch(self, uuid):
|
||||
import pathlib
|
||||
|
||||
self.__data['watching'][uuid].update(
|
||||
{'last_checked': 0,
|
||||
'last_changed': 0,
|
||||
'last_viewed': 0,
|
||||
'previous_md5': False,
|
||||
'last_notification_error': False,
|
||||
'last_error': False})
|
||||
|
||||
# JSON Data, Screenshots, Textfiles (history index and snapshots), HTML in the future etc
|
||||
for item in pathlib.Path(os.path.join(self.datastore_path, uuid)).rglob("*.*"):
|
||||
unlink(item)
|
||||
|
||||
# Force the attr to recalculate
|
||||
bump = self.__data['watching'][uuid].history
|
||||
|
||||
self.__data['watching'][uuid].update({'history': {}, 'last_checked': 0, 'last_changed': 0, 'previous_md5': False})
|
||||
self.needs_write_urgent = True
|
||||
|
||||
for item in pathlib.Path(self.datastore_path).rglob(uuid+"/*.txt"):
|
||||
unlink(item)
|
||||
|
||||
def add_watch(self, url, tag="", extras=None, write_to_disk_now=True):
|
||||
|
||||
if extras is None:
|
||||
@@ -290,15 +280,14 @@ class ChangeDetectionStore:
|
||||
headers={'App-Guid': self.__data['app_guid']})
|
||||
res = r.json()
|
||||
|
||||
# List of permissible attributes we accept from the wild internet
|
||||
# List of permisable stuff we accept from the wild internet
|
||||
for k in ['url', 'tag',
|
||||
'paused', 'title',
|
||||
'previous_md5', 'headers',
|
||||
'body', 'method',
|
||||
'ignore_text', 'css_filter',
|
||||
'subtractive_selectors', 'trigger_text',
|
||||
'extract_title_as_title', 'extract_text',
|
||||
'text_should_not_be_present']:
|
||||
'paused', 'title',
|
||||
'previous_md5', 'headers',
|
||||
'body', 'method',
|
||||
'ignore_text', 'css_filter',
|
||||
'subtractive_selectors', 'trigger_text',
|
||||
'extract_title_as_title', 'extract_text']:
|
||||
if res.get(k):
|
||||
apply_extras[k] = res[k]
|
||||
|
||||
|
||||
@@ -14,7 +14,7 @@
|
||||
<li>Use <a target=_new href="https://github.com/caronc/apprise">AppRise URLs</a> for notification to just about any service! <i><a target=_new href="https://github.com/dgtlmoon/changedetection.io/wiki/Notification-configuration-notes">Please read the notification services wiki here for important configuration notes</a></i>.</li>
|
||||
<li><code>discord://</code> only supports a maximum <strong>2,000 characters</strong> of notification text, including the title.</li>
|
||||
<li><code>tgram://</code> bots cant send messages to other bots, so you should specify chat ID of non-bot user.</li>
|
||||
<li><code>tgram://</code> only supports very limited HTML and can fail when extra tags are sent, <a href="https://core.telegram.org/bots/api#html-style">read more here</a> (or use plaintext/markdown format)</li>
|
||||
<li>Go here for <a href="{{url_for('notification_logs')}}">notification debug logs</a></li>
|
||||
</ul>
|
||||
</div>
|
||||
<br/>
|
||||
@@ -22,7 +22,6 @@
|
||||
{% if emailprefix %}
|
||||
<a id="add-email-helper" class="pure-button button-secondary button-xsmall" style="font-size: 70%">Add email</a>
|
||||
{% endif %}
|
||||
<a href="{{url_for('notification_logs')}}" class="pure-button button-secondary button-xsmall" style="font-size: 70%">Notification debug logs</a>
|
||||
</div>
|
||||
<div id="notification-customisation" class="pure-control-group">
|
||||
<div class="pure-control-group">
|
||||
|
||||
@@ -1,11 +1,6 @@
|
||||
{% extends 'base.html' %}
|
||||
|
||||
{% block content %}
|
||||
<script>
|
||||
const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}";
|
||||
</script>
|
||||
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script>
|
||||
|
||||
<div id="settings">
|
||||
<h1>Differences</h1>
|
||||
<form class="pure-form " action="" method="GET">
|
||||
@@ -44,7 +39,6 @@
|
||||
<div class="tabs">
|
||||
<ul>
|
||||
<li class="tab" id="default-tab"><a href="#text">Text</a></li>
|
||||
<li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li>
|
||||
</ul>
|
||||
</div>
|
||||
|
||||
@@ -66,21 +60,6 @@
|
||||
</table>
|
||||
Diff algorithm from the amazing <a href="https://github.com/kpdecker/jsdiff">github.com/kpdecker/jsdiff</a>
|
||||
</div>
|
||||
<div class="tab-pane-inner" id="screenshot">
|
||||
<div class="tip">
|
||||
For now, Differences are performed on text, not graphically, only the latest screenshot is available.
|
||||
</div>
|
||||
</br>
|
||||
{% if is_html_webdriver %}
|
||||
{% if screenshot %}
|
||||
<img style="max-width: 80%" id="screenshot-img" alt="Current screenshot from most recent request"/>
|
||||
{% else %}
|
||||
No screenshot available just yet! Try rechecking the page.
|
||||
{% endif %}
|
||||
{% else %}
|
||||
<strong>Screenshot requires Playwright/WebDriver enabled</strong>
|
||||
{% endif %}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
|
||||
|
||||
@@ -199,22 +199,6 @@ nav
|
||||
</span>
|
||||
</div>
|
||||
</fieldset>
|
||||
<fieldset>
|
||||
<div class="pure-control-group">
|
||||
{{ render_field(form.text_should_not_be_present, rows=5, placeholder="For example: Out of stock
|
||||
Sold out
|
||||
Not in stock
|
||||
Unavailable") }}
|
||||
<span class="pure-form-message-inline">
|
||||
<ul>
|
||||
<li>Block change-detection while this text is on the page, all text and regex are tested <i>case-insensitive</i>, good for waiting for when a product is available again</li>
|
||||
<li>Block text is processed from the result-text that comes out of any CSS/JSON Filters for this watch</li>
|
||||
<li>All lines here must not exist (think of each line as "OR")</li>
|
||||
<li>Note: Wrap in forward slash / to use regex example: <code>/foo\d/</code></li>
|
||||
</ul>
|
||||
</span>
|
||||
</div>
|
||||
</fieldset>
|
||||
<fieldset>
|
||||
<div class="pure-control-group">
|
||||
{{ render_field(form.extract_text, rows=5, placeholder="\d+ online") }}
|
||||
@@ -275,8 +259,6 @@ Unavailable") }}
|
||||
|
||||
<a href="{{url_for('form_delete', uuid=uuid)}}"
|
||||
class="pure-button button-small button-error ">Delete</a>
|
||||
<a href="{{url_for('scrub_watch', uuid=uuid)}}"
|
||||
class="pure-button button-small button-error ">Scrub</a>
|
||||
<a href="{{url_for('form_clone', uuid=uuid)}}"
|
||||
class="pure-button button-small ">Create Copy</a>
|
||||
</div>
|
||||
|
||||
@@ -4,7 +4,7 @@
|
||||
<div class="edit-form">
|
||||
<div class="inner">
|
||||
|
||||
<h4 style="margin-top: 0px;">Notification debug log</h4>
|
||||
<h4 style="margin-top: 0px;">The following issues were detected when sending notifications</h4>
|
||||
<div id="notification-error-log">
|
||||
<ul style="font-size: 80%; margin:0px; padding: 0 0 0 7px">
|
||||
{% for log in logs|reverse %}
|
||||
|
||||
@@ -1,10 +1,6 @@
|
||||
{% extends 'base.html' %}
|
||||
|
||||
{% block content %}
|
||||
<script>
|
||||
const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}";
|
||||
</script>
|
||||
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script>
|
||||
|
||||
<div id="settings">
|
||||
<h1>Current - {{watch.last_checked|format_timestamp_timeago}}</h1>
|
||||
@@ -14,7 +10,6 @@
|
||||
<div class="tabs">
|
||||
<ul>
|
||||
<li class="tab" id="default-tab"><a href="#text">Text</a></li>
|
||||
<li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li>
|
||||
</ul>
|
||||
</div>
|
||||
|
||||
@@ -33,20 +28,5 @@
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
<div class="tab-pane-inner" id="screenshot">
|
||||
<div class="tip">
|
||||
For now, Differences are performed on text, not graphically, only the latest screenshot is available.
|
||||
</div>
|
||||
</br>
|
||||
{% if is_html_webdriver %}
|
||||
{% if screenshot %}
|
||||
<img style="max-width: 80%" id="screenshot-img" alt="Current screenshot from most recent request"/>
|
||||
{% else %}
|
||||
No screenshot available just yet! Try rechecking the page.
|
||||
{% endif %}
|
||||
{% else %}
|
||||
<strong>Screenshot requires Playwright/WebDriver enabled</strong>
|
||||
{% endif %}
|
||||
</div>
|
||||
</div>
|
||||
{% endblock %}
|
||||
@@ -32,11 +32,6 @@
|
||||
{{ render_field(form.requests.form.time_between_check, class="time-check-widget") }}
|
||||
<span class="pure-form-message-inline">Default time for all watches, when the watch does not have a specific time setting.</span>
|
||||
</div>
|
||||
<div class="pure-control-group">
|
||||
{{ render_field(form.requests.form.jitter_seconds, class="jitter_seconds") }}
|
||||
<span class="pure-form-message-inline">Example - 3 seconds random jitter could trigger up to 3 seconds earlier or up to 3 seconds later</span>
|
||||
</div>
|
||||
|
||||
<div class="pure-control-group">
|
||||
{% if not hide_remove_pass %}
|
||||
{% if current_user.is_authenticated %}
|
||||
|
||||
@@ -67,7 +67,7 @@
|
||||
<span class="watch-tag-list">{{ watch.tag}}</span>
|
||||
{% endif %}
|
||||
</td>
|
||||
<td class="last-checked">{{watch|format_last_checked_time|safe}}</td>
|
||||
<td class="last-checked">{{watch|format_last_checked_time}}</td>
|
||||
<td class="last-changed">{% if watch.history_n >=2 and watch.last_changed %}
|
||||
{{watch.last_changed|format_timestamp_timeago}}
|
||||
{% else %}
|
||||
|
||||
@@ -32,8 +32,6 @@ def app(request):
|
||||
"""Create application for the tests."""
|
||||
datastore_path = "./test-datastore"
|
||||
|
||||
# So they don't delay in fetching
|
||||
os.environ["MINIMUM_SECONDS_RECHECK_TIME"] = "0"
|
||||
try:
|
||||
os.mkdir(datastore_path)
|
||||
except FileExistsError:
|
||||
|
||||
@@ -1,137 +0,0 @@
|
||||
#!/usr/bin/python3
|
||||
|
||||
import time
|
||||
from flask import url_for
|
||||
from . util import live_server_setup
|
||||
from changedetectionio import html_tools
|
||||
|
||||
def set_original_ignore_response():
|
||||
test_return_data = """<html>
|
||||
<body>
|
||||
Some initial text</br>
|
||||
<p>Which is across multiple lines</p>
|
||||
</br>
|
||||
So let's see what happens. </br>
|
||||
</body>
|
||||
</html>
|
||||
|
||||
"""
|
||||
|
||||
with open("test-datastore/endpoint-content.txt", "w") as f:
|
||||
f.write(test_return_data)
|
||||
|
||||
|
||||
def set_modified_original_ignore_response():
|
||||
test_return_data = """<html>
|
||||
<body>
|
||||
Some NEW nice initial text</br>
|
||||
<p>Which is across multiple lines</p>
|
||||
</br>
|
||||
So let's see what happens. </br>
|
||||
<p>new ignore stuff</p>
|
||||
<p>out of stock</p>
|
||||
<p>blah</p>
|
||||
</body>
|
||||
</html>
|
||||
|
||||
"""
|
||||
|
||||
with open("test-datastore/endpoint-content.txt", "w") as f:
|
||||
f.write(test_return_data)
|
||||
|
||||
|
||||
# Is the same but includes ZZZZZ, 'ZZZZZ' is the last line in ignore_text
|
||||
def set_modified_response_minus_block_text():
|
||||
test_return_data = """<html>
|
||||
<body>
|
||||
Some NEW nice initial text</br>
|
||||
<p>Which is across multiple lines</p>
|
||||
<p>now on sale $2/p>
|
||||
</br>
|
||||
So let's see what happens. </br>
|
||||
<p>new ignore stuff</p>
|
||||
<p>blah</p>
|
||||
</body>
|
||||
</html>
|
||||
|
||||
"""
|
||||
|
||||
with open("test-datastore/endpoint-content.txt", "w") as f:
|
||||
f.write(test_return_data)
|
||||
|
||||
|
||||
def test_check_block_changedetection_text_NOT_present(client, live_server):
|
||||
sleep_time_for_fetch_thread = 3
|
||||
live_server_setup(live_server)
|
||||
# Use a mix of case in ZzZ to prove it works case-insensitive.
|
||||
ignore_text = "out of stoCk\r\nfoobar"
|
||||
|
||||
set_original_ignore_response()
|
||||
|
||||
# Give the endpoint time to spin up
|
||||
time.sleep(1)
|
||||
|
||||
# Add our URL to the import page
|
||||
test_url = url_for('test_endpoint', _external=True)
|
||||
res = client.post(
|
||||
url_for("import_page"),
|
||||
data={"urls": test_url},
|
||||
follow_redirects=True
|
||||
)
|
||||
assert b"1 Imported" in res.data
|
||||
|
||||
# Give the thread time to pick it up
|
||||
time.sleep(sleep_time_for_fetch_thread)
|
||||
|
||||
# Goto the edit page, add our ignore text
|
||||
# Add our URL to the import page
|
||||
res = client.post(
|
||||
url_for("edit_page", uuid="first"),
|
||||
data={"text_should_not_be_present": ignore_text, "url": test_url, 'fetch_backend': "html_requests"},
|
||||
follow_redirects=True
|
||||
)
|
||||
assert b"Updated watch." in res.data
|
||||
|
||||
# Give the thread time to pick it up
|
||||
time.sleep(sleep_time_for_fetch_thread)
|
||||
# Check it saved
|
||||
res = client.get(
|
||||
url_for("edit_page", uuid="first"),
|
||||
)
|
||||
assert bytes(ignore_text.encode('utf-8')) in res.data
|
||||
|
||||
# Trigger a check
|
||||
client.get(url_for("form_watch_checknow"), follow_redirects=True)
|
||||
|
||||
# Give the thread time to pick it up
|
||||
time.sleep(sleep_time_for_fetch_thread)
|
||||
|
||||
# It should report nothing found (no new 'unviewed' class)
|
||||
res = client.get(url_for("index"))
|
||||
assert b'unviewed' not in res.data
|
||||
assert b'/test-endpoint' in res.data
|
||||
|
||||
# The page changed, BUT the text is still there, just the rest of it changes, we should not see a change
|
||||
set_modified_original_ignore_response()
|
||||
|
||||
# Trigger a check
|
||||
client.get(url_for("form_watch_checknow"), follow_redirects=True)
|
||||
# Give the thread time to pick it up
|
||||
time.sleep(sleep_time_for_fetch_thread)
|
||||
|
||||
# It should report nothing found (no new 'unviewed' class)
|
||||
res = client.get(url_for("index"))
|
||||
assert b'unviewed' not in res.data
|
||||
assert b'/test-endpoint' in res.data
|
||||
|
||||
|
||||
# Now we set a change where the text is gone, it should now trigger
|
||||
set_modified_response_minus_block_text()
|
||||
client.get(url_for("form_watch_checknow"), follow_redirects=True)
|
||||
time.sleep(sleep_time_for_fetch_thread)
|
||||
|
||||
res = client.get(url_for("index"))
|
||||
assert b'unviewed' in res.data
|
||||
|
||||
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
|
||||
assert b'Deleted' in res.data
|
||||
@@ -154,10 +154,6 @@ def test_check_notification(client, live_server):
|
||||
time.sleep(1)
|
||||
assert os.path.exists("test-datastore/notification.txt") == False
|
||||
|
||||
res = client.get(url_for("notification_logs"))
|
||||
# be sure we see it in the output log
|
||||
assert b'New ChangeDetection.io Notification - ' + test_url.encode('utf-8') in res.data
|
||||
|
||||
# cleanup for the next
|
||||
client.get(
|
||||
url_for("form_delete", uuid="all"),
|
||||
|
||||
@@ -45,6 +45,7 @@ class update_worker(threading.Thread):
|
||||
|
||||
try:
|
||||
changed_detected, update_obj, contents, screenshot, xpath_data = update_handler.run(uuid)
|
||||
|
||||
# Re #342
|
||||
# In Python 3, all strings are sequences of Unicode characters. There is a bytes type that holds raw bytes.
|
||||
# We then convert/.decode('utf-8') for the notification etc
|
||||
@@ -55,18 +56,18 @@ class update_worker(threading.Thread):
|
||||
except content_fetcher.ReplyWithContentButNoText as e:
|
||||
# Totally fine, it's by choice - just continue on, nothing more to care about
|
||||
# Page had elements/content but no renderable text
|
||||
if self.datastore.data['watching'].get(uuid, False) and self.datastore.data['watching'][uuid].get('css_filter'):
|
||||
if self.datastore.data['watching'][uuid].get('css_filter'):
|
||||
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': "Got HTML content but no text found (CSS / xPath Filter not found in page?)"})
|
||||
else:
|
||||
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': "Got HTML content but no text found."})
|
||||
pass
|
||||
except content_fetcher.EmptyReply as e:
|
||||
# Some kind of custom to-str handler in the exception handler that does this?
|
||||
err_text = "EmptyReply - try increasing 'Wait seconds before extracting text', Status Code {}".format(e.status_code)
|
||||
err_text = "EmptyReply: Status Code {}".format(e.status_code)
|
||||
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
|
||||
'last_check_status': e.status_code})
|
||||
except content_fetcher.ScreenshotUnavailable as e:
|
||||
err_text = "Screenshot unavailable, page did not render fully in the expected time - try increasing 'Wait seconds before extracting text'"
|
||||
err_text = "Screenshot unavailable, page did not render fully in the expected time"
|
||||
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
|
||||
'last_check_status': e.status_code})
|
||||
except content_fetcher.PageUnloadable as e:
|
||||
@@ -98,16 +99,9 @@ class update_worker(threading.Thread):
|
||||
|
||||
# Notifications should only trigger on the second time (first time, we gather the initial snapshot)
|
||||
if watch.history_n >= 2:
|
||||
print(">> Change detected in UUID {} - {}".format(uuid, watch['url']))
|
||||
watch_history = watch.history
|
||||
dates = list(watch_history.keys())
|
||||
# Theoretically it's possible that this could be just 1 long,
|
||||
# - In the case that the timestamp key was not unique
|
||||
if len(dates) == 1:
|
||||
raise ValueError(
|
||||
"History index had 2 or more, but only 1 date loaded, timestamps were not unique? maybe two of the same timestamps got written, needs more delay?"
|
||||
)
|
||||
prev_fname = watch_history[dates[-2]]
|
||||
|
||||
dates = list(watch.history.keys())
|
||||
prev_fname = watch.history[dates[-2]]
|
||||
|
||||
|
||||
# Did it have any notification alerts to hit?
|
||||
|
||||
@@ -73,20 +73,6 @@ services:
|
||||
# hostname: playwright-chrome
|
||||
# image: browserless/chrome
|
||||
# restart: unless-stopped
|
||||
# environment:
|
||||
# - SCREEN_WIDTH=1920
|
||||
# - SCREEN_HEIGHT=1024
|
||||
# - SCREEN_DEPTH=16
|
||||
# - ENABLE_DEBUGGER=false
|
||||
# - SCREEN_WIDTH=1280
|
||||
# - SCREEN_HEIGHT=1024
|
||||
# - SCREEN_DEPTH=16
|
||||
# - PREBOOT_CHROME=true
|
||||
# - CONNECTION_TIMEOUT=300000
|
||||
# - MAX_CONCURRENT_SESSIONS=10
|
||||
# - CHROME_REFRESH_TIME=600000
|
||||
# - DEFAULT_BLOCK_ADS=true
|
||||
# - DEFAULT_STEALTH=true
|
||||
|
||||
volumes:
|
||||
changedetection-data:
|
||||
|
||||
Reference in New Issue
Block a user