Compare commits

...

69 Commits

Author SHA1 Message Date
dgtlmoon
58b79170e4 Use time() as priority 2022-08-18 13:25:49 +02:00
dgtlmoon
1193a7f22c Playwright - Support proxy auth mechanisms (#859) 2022-08-18 09:46:28 +02:00
dgtlmoon
0b976827bb Update README.md 2022-08-17 22:11:04 +02:00
dgtlmoon
280e916033 Update README.md 2022-08-17 22:00:46 +02:00
dgtlmoon
5494e61a05 Skip processing when watch was deleted 2022-08-17 13:29:32 +02:00
dgtlmoon
e461c0b819 Playwright fetcher didn't report low level HTTP errors correctly (like Connection Refused) (#852) 2022-08-17 13:25:08 +02:00
dgtlmoon
d67c654f37 Be sure visual-selector data is set when xPath/CSS filter is not yet found (#851) 2022-08-17 13:21:06 +02:00
dgtlmoon
06ab34b6af Visual selector data not being saved by refactor 2022-08-16 16:53:15 +02:00
dgtlmoon
ba8676c4ba 'Save chrome screenshot' checkbox never used, removing, we always save the screenshot. (#844) 2022-08-16 16:18:09 +02:00
dgtlmoon
4899c1a4f9 Crash fix: Data store sub-directories werent always being created when needed (#842) 2022-08-16 15:17:36 +02:00
dgtlmoon
9bff1582f7 Make the table header easier to understand when sorting (#840) 2022-08-16 12:49:08 +02:00
dgtlmoon
269e3bb7c5 Column sorting (#838) 2022-08-16 10:45:36 +02:00
dgtlmoon
9976f3f969 Update README.md 2022-08-15 22:27:45 +02:00
dgtlmoon
1f250aa868 Revert "don't process paused entries after queue", so we can still manually recheck a paused watch 2022-08-15 22:19:17 +02:00
dgtlmoon
1c08d9f150 Remove 'last-changed' from url-watches.json and always calculate from history index (#835) 2022-08-15 21:14:18 +02:00
dgtlmoon
9942107016 Massive improvements to error handling - show separate output for non HTTP 200 status replies 2022-08-15 18:56:53 +02:00
dgtlmoon
1eb5726cbf Execute JS should happen after waiting seconds 2022-08-15 11:27:04 +02:00
dgtlmoon
b3271ff7bb Upgrade playwright python driver (#834) 2022-08-14 19:53:42 +02:00
dgtlmoon
f82d3b648a Crash protection - handle the case where watch was deleted while being checked (#833) 2022-08-14 19:13:45 +02:00
dgtlmoon
034b1330d4 Don't process a watch if it was paused after being queued (#825) 2022-08-09 10:48:18 +02:00
Hmmbob
a7d005109f Notification Library Update (fixes for Home Assistant) - update requirements.txt (#818) 2022-08-08 08:48:38 +02:00
dgtlmoon
048c355e04 Remove social links for now 2022-08-07 19:24:27 +02:00
dgtlmoon
4026575b0b 0.39.17.2 2022-08-05 15:53:09 +02:00
dgtlmoon
8c466b4826 Test fix - Remove debug from test 2022-08-05 08:26:37 +02:00
dgtlmoon
6f072b42e8 Security update - Password could be unset from settings form unexpectedly (#808) 2022-08-05 00:05:43 +02:00
dgtlmoon
e318253f31 Disable SIGCHLD Handler for now - keeping SIGTERM for DB writes 2022-08-04 23:48:36 +02:00
dgtlmoon
f0f2fe94ce Handle SIGTERM for cleaner shutdowns (#737) 2022-08-02 10:21:25 +02:00
dgtlmoon
26f5c56ba4 Remove [save & preview] button, the preview is not updated live so it can lead to confusion (#801) 2022-08-01 14:47:00 +02:00
dgtlmoon
a1c3107cd6 Feature - priority queue - edited and added watches should get checked before automatically queued watches (#799) 2022-07-31 15:35:35 +02:00
dgtlmoon
8fef3ff4ab [preview current] cleanup code and add test 2022-07-30 20:11:56 +02:00
dgtlmoon
baa25c9f9e Feature - mute notifications (#791) 2022-07-29 21:09:55 +02:00
dgtlmoon
488699b7d4 Test improvement - remove unnecessary step 2022-07-29 10:23:59 +02:00
dgtlmoon
cf3a1ee3e3 0.39.17.1 2022-07-29 10:13:29 +02:00
dgtlmoon
daae43e9f9 Bug fix: Filter failure detection notification was interfering with change-detection results, added test case (#786) 2022-07-29 10:11:49 +02:00
dgtlmoon
cdeedaa65c README.md - new Discord invite link 2022-07-28 23:07:07 +02:00
dgtlmoon
3c9d2ded38 0.39.17 2022-07-28 13:07:51 +02:00
dgtlmoon
9f4364a130 Add https://discord.com/api notification hook to the automatic truncation due to Discords 2000 char limit 2022-07-28 12:34:55 +02:00
dgtlmoon
5bd9eaf99d UI Feature - Add watch in "paused" state, saving then unpauses (#779) 2022-07-28 12:13:26 +02:00
dgtlmoon
b1c51c0a65 Enhancement - support xPath text() function filter, for example "//title/text()" in RSS feeds (#778) 2022-07-28 11:50:31 +02:00
dgtlmoon
232bd92389 Bug fix - Filter "Only trigger when new lines appear" should check all history, not only the first item (#777) 2022-07-28 10:16:19 +02:00
dgtlmoon
e6173357a9 Visual Selector direct element finder fix 2022-07-28 09:19:10 +02:00
dgtlmoon
f2b8888aff Update README.md 2022-07-27 14:25:24 +02:00
dgtlmoon
9c46f175f9 Update README.md links 2022-07-27 14:23:18 +02:00
dgtlmoon
1f27865fdf Filter failure notification send default enable now controlled by setting Env var 2022-07-27 00:01:51 +02:00
dgtlmoon
faa42d75e0 Refactor of extract text filter - Regex, support Regex (groups) and all python regex flags via /something/aiLmsux (#773) 2022-07-26 17:34:34 +02:00
dgtlmoon
3b6e6d85bb Update README.md - adding LinkedIn link 2022-07-25 00:28:41 +02:00
dgtlmoon
30d6a272ce Update README.md - Adding Discord and YouTube links 2022-07-24 23:06:42 +02:00
dgtlmoon
291700554e Bug fix for alerting when xPath based filters are no longer present (#772) 2022-07-23 19:39:52 +02:00
dgtlmoon
a82fad7059 Send notification when CSS/xPath filter is missing after more than 6 (configurable) attempts (#771) 2022-07-23 17:19:00 +02:00
dgtlmoon
c2fe5ae0d1 mailto plaintext handling fix for 'plaintext' apprise integration 2022-07-23 16:55:31 +02:00
dgtlmoon
5beefdb7cc Minor code cleanups 2022-07-23 13:18:44 +02:00
dgtlmoon
872bbba71c Notifications - email - Correctly send plaintext notification email with plaintext header (#767) 2022-07-21 15:22:20 +02:00
Jonathon Sisson
d578de1a35 Form text tweak - Regex clarification (#766) 2022-07-21 10:05:59 +02:00
dgtlmoon
cdc104be10 Update README.md 2022-07-20 14:37:45 +02:00
dgtlmoon
dd0eeca056 Handle simple obfuscations - HomeDepot.com style price obfuscation (#764) 2022-07-20 14:02:22 +02:00
dgtlmoon
a95468be08 Fixing docker-compose.yml PLAYWRIGHT_DRIVER_URL example URL 2022-07-15 20:45:29 +02:00
Brandon Wees
ace44d0e00 Notifications fix - Discord - added discord webhook base url to truncation rules (#753)
Co-authored-by: bwees <branonwees@gmail.com>
2022-07-14 17:41:12 +02:00
dgtlmoon
ebb8b88621 Update Playwright URI Env example with stealth setting and CORS workaround (more reliable fetching) 2022-07-12 22:36:22 +02:00
dgtlmoon
12fc2200de remove extra file 2022-07-12 22:32:20 +02:00
dgtlmoon
52d3d375ba removing package-lock.json - not required to be in git 2022-07-10 20:29:11 +02:00
dgtlmoon
08117089e6 Share-icon cleanups 2022-07-10 20:24:49 +02:00
dgtlmoon
2ba3a6d53f Test improvement: Extract text should return all matches 2022-07-10 20:05:48 +02:00
dgtlmoon
2f636553a9 Bug fix: RSS Feed should also announce utf-8 charset 2022-07-10 18:50:21 +02:00
Freddie Leeman
0bde48b282 Regex extract filter: Return all regex results instead of first match (#730) 2022-07-10 15:09:10 +02:00
dgtlmoon
fae1164c0b Ability to specify JS before running change-detection (#744) 2022-07-10 13:56:01 +02:00
dgtlmoon
169c293143 Playwright - log console errors to output 2022-07-10 13:55:29 +02:00
dgtlmoon
46cb5cff66 UI Improvement - Clarifying "Visual Filter" tool as "Visual Selector Filter" 2022-07-10 12:51:12 +02:00
Simo Elalj
05584ea886 Use environment variables to override new watch settings defaults (user-agent, timeout, workers) (#742) 2022-07-08 20:50:04 +02:00
marvin8
176a591357 Update docker-compose.yml - Remove duplicate environment variables from playwright-chrome sample config in docker-compose.yml (#738) 2022-07-06 09:03:10 +02:00
51 changed files with 1745 additions and 4280 deletions

View File

@@ -22,7 +22,7 @@ RUN pip install --target=/dependencies -r /requirements.txt
# Playwright is an alternative to Selenium # Playwright is an alternative to Selenium
# Excluded this package from requirements.txt to prevent arm/v6 and arm/v7 builds from failing # Excluded this package from requirements.txt to prevent arm/v6 and arm/v7 builds from failing
RUN pip install --target=/dependencies playwright~=1.20 \ RUN pip install --target=/dependencies playwright~=1.24 \
|| echo "WARN: Failed to install Playwright. The application can still run, but the Playwright option will be disabled." || echo "WARN: Failed to install Playwright. The application can still run, but the Playwright option will be disabled."
# Final image stage # Final image stage

View File

@@ -1,21 +1,15 @@
# changedetection.io ## Web Site Change Detection, Monitoring and Notification.
[**Try our $6.99/month subscription - Unlimited checks and watches!**](https://lemonade.changedetection.io/start)
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />](https://lemonade.changedetection.io/start)
[![Release Version][release-shield]][release-link] [![Docker Pulls][docker-pulls]][docker-link] [![License][license-shield]](LICENSE.md) [![Release Version][release-shield]][release-link] [![Docker Pulls][docker-pulls]][docker-link] [![License][license-shield]](LICENSE.md)
![changedetection.io](https://github.com/dgtlmoon/changedetection.io/actions/workflows/test-only.yml/badge.svg?branch=master) ![changedetection.io](https://github.com/dgtlmoon/changedetection.io/actions/workflows/test-only.yml/badge.svg?branch=master)
## Self-Hosted, Open Source, Change Monitoring of Web Pages Know when important content changes, we support notifications via Discord, Telegram, Home-Assistant, Slack, Email and 70+ more
_Know when web pages change! Stay ontop of new information!_
Live your data-life *pro-actively* instead of *re-actively*.
Free, Open-source web page monitoring, notification and change detection. Don't have time? [**Try our $6.99/month subscription - unlimited checks and watches!**](https://lemonade.changedetection.io/start)
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />](https://lemonade.changedetection.io/start)
**Get your own private instance now! Let us host it for you!**
[**Try our $6.99/month subscription - unlimited checks and watches!**](https://lemonade.changedetection.io/start) , _half the price of other website change monitoring services and comes with unlimited watches & checks!_ [**Try our $6.99/month subscription - unlimited checks and watches!**](https://lemonade.changedetection.io/start) , _half the price of other website change monitoring services and comes with unlimited watches & checks!_
@@ -29,6 +23,7 @@ Free, Open-source web page monitoring, notification and change detection. Don't
#### Example use cases #### Example use cases
- Products and services have a change in pricing - Products and services have a change in pricing
- _Out of stock notification_ and _Back In stock notification_
- Governmental department updates (changes are often only on their websites) - Governmental department updates (changes are often only on their websites)
- New software releases, security advisories when you're not on their mailing list. - New software releases, security advisories when you're not on their mailing list.
- Festivals with changes - Festivals with changes

View File

@@ -6,6 +6,36 @@
# Read more https://github.com/dgtlmoon/changedetection.io/wiki # Read more https://github.com/dgtlmoon/changedetection.io/wiki
from changedetectionio import changedetection from changedetectionio import changedetection
import multiprocessing
import signal
import os
def sigchld_handler(_signo, _stack_frame):
import sys
print('Shutdown: Got SIGCHLD')
# https://stackoverflow.com/questions/40453496/python-multiprocessing-capturing-signals-to-restart-child-processes-or-shut-do
pid, status = os.waitpid(-1, os.WNOHANG | os.WUNTRACED | os.WCONTINUED)
print('Sub-process: pid %d status %d' % (pid, status))
if status != 0:
sys.exit(1)
raise SystemExit
if __name__ == '__main__': if __name__ == '__main__':
changedetection.main()
#signal.signal(signal.SIGCHLD, sigchld_handler)
# The only way I could find to get Flask to shutdown, is to wrap it and then rely on the subsystem issuing SIGTERM/SIGKILL
parse_process = multiprocessing.Process(target=changedetection.main)
parse_process.daemon = True
parse_process.start()
import time
try:
while True:
time.sleep(1)
except KeyboardInterrupt:
#parse_process.terminate() not needed, because this process will issue it to the sub-process anyway
print ("Exited - CTRL+C")

View File

@@ -1 +1,2 @@
test-datastore test-datastore
package-lock.json

View File

@@ -44,7 +44,7 @@ from flask_wtf import CSRFProtect
from changedetectionio import html_tools from changedetectionio import html_tools
from changedetectionio.api import api_v1 from changedetectionio.api import api_v1
__version__ = '0.39.16' __version__ = '0.39.17.2'
datastore = None datastore = None
@@ -54,7 +54,7 @@ ticker_thread = None
extra_stylesheets = [] extra_stylesheets = []
update_q = queue.Queue() update_q = queue.PriorityQueue()
notification_q = queue.Queue() notification_q = queue.Queue()
@@ -76,7 +76,7 @@ app.config['LOGIN_DISABLED'] = False
# Disables caching of the templates # Disables caching of the templates
app.config['TEMPLATES_AUTO_RELOAD'] = True app.config['TEMPLATES_AUTO_RELOAD'] = True
app.jinja_env.add_extension('jinja2.ext.loopcontrols')
csrf = CSRFProtect() csrf = CSRFProtect()
csrf.init_app(app) csrf.init_app(app)
@@ -115,18 +115,19 @@ def _jinja2_filter_datetime(watch_obj, format="%Y-%m-%d %H:%M:%S"):
return timeago.format(int(watch_obj['last_checked']), time.time()) return timeago.format(int(watch_obj['last_checked']), time.time())
# @app.context_processor
# def timeago():
# def _timeago(lower_time, now):
# return timeago.format(lower_time, now)
# return dict(timeago=_timeago)
@app.template_filter('format_timestamp_timeago') @app.template_filter('format_timestamp_timeago')
def _jinja2_filter_datetimestamp(timestamp, format="%Y-%m-%d %H:%M:%S"): def _jinja2_filter_datetimestamp(timestamp, format="%Y-%m-%d %H:%M:%S"):
if timestamp == False:
return 'Not yet'
return timeago.format(timestamp, time.time()) return timeago.format(timestamp, time.time())
# return timeago.format(timestamp, time.time())
# return datetime.datetime.utcfromtimestamp(timestamp).strftime(format) @app.template_filter('format_seconds_ago')
def _jinja2_filter_seconds_precise(timestamp):
if timestamp == False:
return 'Not yet'
return format(int(time.time()-timestamp), ',d')
# When nobody is logged in Flask-Login's current_user is set to an AnonymousUser object. # When nobody is logged in Flask-Login's current_user is set to an AnonymousUser object.
class User(flask_login.UserMixin): class User(flask_login.UserMixin):
@@ -313,7 +314,7 @@ def changedetection_app(config=None, datastore_o=None):
watch['uuid'] = uuid watch['uuid'] = uuid
sorted_watches.append(watch) sorted_watches.append(watch)
sorted_watches.sort(key=lambda x: x['last_changed'], reverse=True) sorted_watches.sort(key=lambda x: x.last_changed, reverse=False)
fg = FeedGenerator() fg = FeedGenerator()
fg.title('changedetection.io') fg.title('changedetection.io')
@@ -332,7 +333,7 @@ def changedetection_app(config=None, datastore_o=None):
if not watch.viewed: if not watch.viewed:
# Re #239 - GUID needs to be individual for each event # Re #239 - GUID needs to be individual for each event
# @todo In the future make this a configurable link back (see work on BASE_URL https://github.com/dgtlmoon/changedetection.io/pull/228) # @todo In the future make this a configurable link back (see work on BASE_URL https://github.com/dgtlmoon/changedetection.io/pull/228)
guid = "{}/{}".format(watch['uuid'], watch['last_changed']) guid = "{}/{}".format(watch['uuid'], watch.last_changed)
fe = fg.add_entry() fe = fg.add_entry()
# Include a link to the diff page, they will have to login here to see if password protection is enabled. # Include a link to the diff page, they will have to login here to see if password protection is enabled.
@@ -361,7 +362,7 @@ def changedetection_app(config=None, datastore_o=None):
fe.pubDate(dt) fe.pubDate(dt)
response = make_response(fg.rss_str()) response = make_response(fg.rss_str())
response.headers.set('Content-Type', 'application/rss+xml') response.headers.set('Content-Type', 'application/rss+xml;charset=utf-8')
return response return response
@app.route("/", methods=['GET']) @app.route("/", methods=['GET'])
@@ -370,20 +371,20 @@ def changedetection_app(config=None, datastore_o=None):
from changedetectionio import forms from changedetectionio import forms
limit_tag = request.args.get('tag') limit_tag = request.args.get('tag')
pause_uuid = request.args.get('pause')
# Redirect for the old rss path which used the /?rss=true # Redirect for the old rss path which used the /?rss=true
if request.args.get('rss'): if request.args.get('rss'):
return redirect(url_for('rss', tag=limit_tag)) return redirect(url_for('rss', tag=limit_tag))
if pause_uuid: op = request.args.get('op')
try: if op:
datastore.data['watching'][pause_uuid]['paused'] ^= True uuid = request.args.get('uuid')
datastore.needs_write = True if op == 'pause':
datastore.data['watching'][uuid]['paused'] ^= True
elif op == 'mute':
datastore.data['watching'][uuid]['notification_muted'] ^= True
return redirect(url_for('index', tag = limit_tag)) datastore.needs_write = True
except KeyError: return redirect(url_for('index', tag = limit_tag))
pass
# Sort by last_changed and add the uuid which is usually the key.. # Sort by last_changed and add the uuid which is usually the key..
sorted_watches = [] sorted_watches = []
@@ -406,7 +407,6 @@ def changedetection_app(config=None, datastore_o=None):
existing_tags = datastore.get_all_tags() existing_tags = datastore.get_all_tags()
form = forms.quickWatchForm(request.form) form = forms.quickWatchForm(request.form)
output = render_template("watch-overview.html", output = render_template("watch-overview.html",
form=form, form=form,
watches=sorted_watches, watches=sorted_watches,
@@ -417,7 +417,7 @@ def changedetection_app(config=None, datastore_o=None):
# Don't link to hosting when we're on the hosting environment # Don't link to hosting when we're on the hosting environment
hosted_sticky=os.getenv("SALTED_PASS", False) == False, hosted_sticky=os.getenv("SALTED_PASS", False) == False,
guid=datastore.data['app_guid'], guid=datastore.data['app_guid'],
queued_uuids=update_q.queue) queued_uuids=[uuid for p,uuid in update_q.queue])
if session.get('share-link'): if session.get('share-link'):
@@ -580,6 +580,9 @@ def changedetection_app(config=None, datastore_o=None):
if request.method == 'POST' and form.validate(): if request.method == 'POST' and form.validate():
extra_update_obj = {} extra_update_obj = {}
if request.args.get('unpause_on_save'):
extra_update_obj['paused'] = False
# Re #110, if they submit the same as the default value, set it to None, so we continue to follow the default # Re #110, if they submit the same as the default value, set it to None, so we continue to follow the default
# Assume we use the default value, unless something relevant is different, then use the form value # Assume we use the default value, unless something relevant is different, then use the form value
# values could be None, 0 etc. # values could be None, 0 etc.
@@ -619,24 +622,23 @@ def changedetection_app(config=None, datastore_o=None):
datastore.data['watching'][uuid].update(form.data) datastore.data['watching'][uuid].update(form.data)
datastore.data['watching'][uuid].update(extra_update_obj) datastore.data['watching'][uuid].update(extra_update_obj)
flash("Updated watch.") if request.args.get('unpause_on_save'):
flash("Updated watch - unpaused!.")
else:
flash("Updated watch.")
# Re #286 - We wait for syncing new data to disk in another thread every 60 seconds # Re #286 - We wait for syncing new data to disk in another thread every 60 seconds
# But in the case something is added we should save straight away # But in the case something is added we should save straight away
datastore.needs_write_urgent = True datastore.needs_write_urgent = True
# Queue the watch for immediate recheck # Queue the watch for immediate recheck, with a higher priority
update_q.put(uuid) update_q.put((1, uuid))
# Diff page [edit] link should go back to diff page # Diff page [edit] link should go back to diff page
if request.args.get("next") and request.args.get("next") == 'diff' and not form.save_and_preview_button.data: if request.args.get("next") and request.args.get("next") == 'diff':
return redirect(url_for('diff_history_page', uuid=uuid)) return redirect(url_for('diff_history_page', uuid=uuid))
else:
if form.save_and_preview_button.data: return redirect(url_for('index'))
flash('You may need to reload this page to see the new content.')
return redirect(url_for('preview_page', uuid=uuid))
else:
return redirect(url_for('index'))
else: else:
if request.method == 'POST' and not form.validate(): if request.method == 'POST' and not form.validate():
@@ -702,7 +704,14 @@ def changedetection_app(config=None, datastore_o=None):
return redirect(url_for('settings_page')) return redirect(url_for('settings_page'))
if form.validate(): if form.validate():
datastore.data['settings']['application'].update(form.data['application']) # Don't set password to False when a password is set - should be only removed with the `removepassword` button
app_update = dict(deepcopy(form.data['application']))
# Never update password with '' or False (Added by wtforms when not in submission)
if 'password' in app_update and not app_update['password']:
del (app_update['password'])
datastore.data['settings']['application'].update(app_update)
datastore.data['settings']['requests'].update(form.data['requests']) datastore.data['settings']['requests'].update(form.data['requests'])
if not os.getenv("SALTED_PASS", False) and len(form.application.form.password.encrypted_password): if not os.getenv("SALTED_PASS", False) and len(form.application.form.password.encrypted_password):
@@ -740,7 +749,7 @@ def changedetection_app(config=None, datastore_o=None):
importer = import_url_list() importer = import_url_list()
importer.run(data=request.values.get('urls'), flash=flash, datastore=datastore) importer.run(data=request.values.get('urls'), flash=flash, datastore=datastore)
for uuid in importer.new_uuids: for uuid in importer.new_uuids:
update_q.put(uuid) update_q.put((1, uuid))
if len(importer.remaining_data) == 0: if len(importer.remaining_data) == 0:
return redirect(url_for('index')) return redirect(url_for('index'))
@@ -753,7 +762,7 @@ def changedetection_app(config=None, datastore_o=None):
d_importer = import_distill_io_json() d_importer = import_distill_io_json()
d_importer.run(data=request.values.get('distill-io'), flash=flash, datastore=datastore) d_importer.run(data=request.values.get('distill-io'), flash=flash, datastore=datastore)
for uuid in d_importer.new_uuids: for uuid in d_importer.new_uuids:
update_q.put(uuid) update_q.put((1, uuid))
@@ -822,7 +831,7 @@ def changedetection_app(config=None, datastore_o=None):
previous_version_file_contents = "Unable to read {}.\n".format(previous_file) previous_version_file_contents = "Unable to read {}.\n".format(previous_file)
screenshot_url = datastore.get_screenshot(uuid) screenshot_url = watch.get_screenshot()
system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver' system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
@@ -842,7 +851,11 @@ def changedetection_app(config=None, datastore_o=None):
extra_title=" - Diff - {}".format(watch['title'] if watch['title'] else watch['url']), extra_title=" - Diff - {}".format(watch['title'] if watch['title'] else watch['url']),
left_sticky=True, left_sticky=True,
screenshot=screenshot_url, screenshot=screenshot_url,
is_html_webdriver=is_html_webdriver) is_html_webdriver=is_html_webdriver,
last_error=watch['last_error'],
last_error_text=watch.get_error_text(),
last_error_screenshot=watch.get_error_snapshot()
)
return output return output
@@ -857,73 +870,82 @@ def changedetection_app(config=None, datastore_o=None):
if uuid == 'first': if uuid == 'first':
uuid = list(datastore.data['watching'].keys()).pop() uuid = list(datastore.data['watching'].keys()).pop()
# Normally you would never reach this, because the 'preview' button is not available when there's no history
# However they may try to clear snapshots and reload the page
if datastore.data['watching'][uuid].history_n == 0:
flash("Preview unavailable - No fetch/check completed or triggers not reached", "error")
return redirect(url_for('index'))
extra_stylesheets = [url_for('static_content', group='styles', filename='diff.css')]
try: try:
watch = datastore.data['watching'][uuid] watch = datastore.data['watching'][uuid]
except KeyError: except KeyError:
flash("No history found for the specified link, bad link?", "error") flash("No history found for the specified link, bad link?", "error")
return redirect(url_for('index')) return redirect(url_for('index'))
if watch.history_n >0:
timestamps = sorted(watch.history.keys(), key=lambda x: int(x))
filename = watch.history[timestamps[-1]]
try:
with open(filename, 'r') as f:
tmp = f.readlines()
# Get what needs to be highlighted
ignore_rules = watch.get('ignore_text', []) + datastore.data['settings']['application']['global_ignore_text']
# .readlines will keep the \n, but we will parse it here again, in the future tidy this up
ignored_line_numbers = html_tools.strip_ignore_text(content="".join(tmp),
wordlist=ignore_rules,
mode='line numbers'
)
trigger_line_numbers = html_tools.strip_ignore_text(content="".join(tmp),
wordlist=watch['trigger_text'],
mode='line numbers'
)
# Prepare the classes and lines used in the template
i=0
for l in tmp:
classes=[]
i+=1
if i in ignored_line_numbers:
classes.append('ignored')
if i in trigger_line_numbers:
classes.append('triggered')
content.append({'line': l, 'classes': ' '.join(classes)})
except Exception as e:
content.append({'line': "File doesnt exist or unable to read file {}".format(filename), 'classes': ''})
else:
content.append({'line': "No history found", 'classes': ''})
screenshot_url = datastore.get_screenshot(uuid)
system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver' system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
extra_stylesheets = [url_for('static_content', group='styles', filename='diff.css')]
is_html_webdriver = True if watch.get('fetch_backend') == 'html_webdriver' or ( is_html_webdriver = True if watch.get('fetch_backend') == 'html_webdriver' or (
watch.get('fetch_backend', None) is None and system_uses_webdriver) else False watch.get('fetch_backend', None) is None and system_uses_webdriver) else False
# Never requested successfully, but we detected a fetch error
if datastore.data['watching'][uuid].history_n == 0 and (watch.get_error_text() or watch.get_error_snapshot()):
flash("Preview unavailable - No fetch/check completed or triggers not reached", "error")
output = render_template("preview.html",
content=content,
history_n=watch.history_n,
extra_stylesheets=extra_stylesheets,
# current_diff_url=watch['url'],
watch=watch,
uuid=uuid,
is_html_webdriver=is_html_webdriver,
last_error=watch['last_error'],
last_error_text=watch.get_error_text(),
last_error_screenshot=watch.get_error_snapshot())
return output
timestamp = list(watch.history.keys())[-1]
filename = watch.history[timestamp]
try:
with open(filename, 'r') as f:
tmp = f.readlines()
# Get what needs to be highlighted
ignore_rules = watch.get('ignore_text', []) + datastore.data['settings']['application']['global_ignore_text']
# .readlines will keep the \n, but we will parse it here again, in the future tidy this up
ignored_line_numbers = html_tools.strip_ignore_text(content="".join(tmp),
wordlist=ignore_rules,
mode='line numbers'
)
trigger_line_numbers = html_tools.strip_ignore_text(content="".join(tmp),
wordlist=watch['trigger_text'],
mode='line numbers'
)
# Prepare the classes and lines used in the template
i=0
for l in tmp:
classes=[]
i+=1
if i in ignored_line_numbers:
classes.append('ignored')
if i in trigger_line_numbers:
classes.append('triggered')
content.append({'line': l, 'classes': ' '.join(classes)})
except Exception as e:
content.append({'line': "File doesnt exist or unable to read file {}".format(filename), 'classes': ''})
output = render_template("preview.html", output = render_template("preview.html",
content=content, content=content,
history_n=watch.history_n,
extra_stylesheets=extra_stylesheets, extra_stylesheets=extra_stylesheets,
ignored_line_numbers=ignored_line_numbers, ignored_line_numbers=ignored_line_numbers,
triggered_line_numbers=trigger_line_numbers, triggered_line_numbers=trigger_line_numbers,
current_diff_url=watch['url'], current_diff_url=watch['url'],
screenshot=screenshot_url, screenshot=watch.get_screenshot(),
watch=watch, watch=watch,
uuid=uuid, uuid=uuid,
is_html_webdriver=is_html_webdriver) is_html_webdriver=is_html_webdriver,
last_error=watch['last_error'],
last_error_text=watch.get_error_text(),
last_error_screenshot=watch.get_error_snapshot())
return output return output
@@ -1023,11 +1045,12 @@ def changedetection_app(config=None, datastore_o=None):
if datastore.data['settings']['application']['password'] and not flask_login.current_user.is_authenticated: if datastore.data['settings']['application']['password'] and not flask_login.current_user.is_authenticated:
abort(403) abort(403)
screenshot_filename = "last-screenshot.png" if not request.args.get('error_screenshot') else "last-error-screenshot.png"
# These files should be in our subdirectory # These files should be in our subdirectory
try: try:
# set nocache, set content-type # set nocache, set content-type
watch_dir = datastore_o.datastore_path + "/" + filename response = make_response(send_from_directory(os.path.join(datastore_o.datastore_path, filename), screenshot_filename))
response = make_response(send_from_directory(filename="last-screenshot.png", directory=watch_dir, path=watch_dir + "/last-screenshot.png"))
response.headers['Content-type'] = 'image/png' response.headers['Content-type'] = 'image/png'
response.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate' response.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate'
response.headers['Pragma'] = 'no-cache' response.headers['Pragma'] = 'no-cache'
@@ -1063,9 +1086,9 @@ def changedetection_app(config=None, datastore_o=None):
except FileNotFoundError: except FileNotFoundError:
abort(404) abort(404)
@app.route("/api/add", methods=['POST']) @app.route("/form/add/quickwatch", methods=['POST'])
@login_required @login_required
def form_watch_add(): def form_quick_watch_add():
from changedetectionio import forms from changedetectionio import forms
form = forms.quickWatchForm(request.form) form = forms.quickWatchForm(request.form)
@@ -1078,13 +1101,19 @@ def changedetection_app(config=None, datastore_o=None):
flash('The URL {} already exists'.format(url), "error") flash('The URL {} already exists'.format(url), "error")
return redirect(url_for('index')) return redirect(url_for('index'))
# @todo add_watch should throw a custom Exception for validation etc add_paused = request.form.get('edit_and_watch_submit_button') != None
new_uuid = datastore.add_watch(url=url, tag=request.form.get('tag').strip()) new_uuid = datastore.add_watch(url=url, tag=request.form.get('tag').strip(), extras={'paused': add_paused})
if new_uuid:
if not add_paused and new_uuid:
# Straight into the queue. # Straight into the queue.
update_q.put(new_uuid) update_q.put((1, new_uuid))
flash("Watch added.") flash("Watch added.")
if add_paused:
flash('Watch added in Paused state, saving will unpause.')
return redirect(url_for('edit_page', uuid=new_uuid, unpause_on_save=1))
return redirect(url_for('index')) return redirect(url_for('index'))
@@ -1115,7 +1144,7 @@ def changedetection_app(config=None, datastore_o=None):
uuid = list(datastore.data['watching'].keys()).pop() uuid = list(datastore.data['watching'].keys()).pop()
new_uuid = datastore.clone(uuid) new_uuid = datastore.clone(uuid)
update_q.put(new_uuid) update_q.put((5, new_uuid))
flash('Cloned.') flash('Cloned.')
return redirect(url_for('index')) return redirect(url_for('index'))
@@ -1136,7 +1165,7 @@ def changedetection_app(config=None, datastore_o=None):
if uuid: if uuid:
if uuid not in running_uuids: if uuid not in running_uuids:
update_q.put(uuid) update_q.put((1, uuid))
i = 1 i = 1
elif tag != None: elif tag != None:
@@ -1144,7 +1173,7 @@ def changedetection_app(config=None, datastore_o=None):
for watch_uuid, watch in datastore.data['watching'].items(): for watch_uuid, watch in datastore.data['watching'].items():
if (tag != None and tag in watch['tag']): if (tag != None and tag in watch['tag']):
if watch_uuid not in running_uuids and not datastore.data['watching'][watch_uuid]['paused']: if watch_uuid not in running_uuids and not datastore.data['watching'][watch_uuid]['paused']:
update_q.put(watch_uuid) update_q.put((1, watch_uuid))
i += 1 i += 1
else: else:
@@ -1152,7 +1181,7 @@ def changedetection_app(config=None, datastore_o=None):
for watch_uuid, watch in datastore.data['watching'].items(): for watch_uuid, watch in datastore.data['watching'].items():
if watch_uuid not in running_uuids and not datastore.data['watching'][watch_uuid]['paused']: if watch_uuid not in running_uuids and not datastore.data['watching'][watch_uuid]['paused']:
update_q.put(watch_uuid) update_q.put((1, watch_uuid))
i += 1 i += 1
flash("{} watches are queued for rechecking.".format(i)) flash("{} watches are queued for rechecking.".format(i))
return redirect(url_for('index', tag=tag)) return redirect(url_for('index', tag=tag))
@@ -1254,7 +1283,6 @@ def notification_runner():
global notification_debug_log global notification_debug_log
from datetime import datetime from datetime import datetime
import json import json
while not app.config.exit.is_set(): while not app.config.exit.is_set():
try: try:
# At the moment only one thread runs (single runner) # At the moment only one thread runs (single runner)
@@ -1356,14 +1384,19 @@ def ticker_thread_check_time_launch_checks():
seconds_since_last_recheck = now - watch['last_checked'] seconds_since_last_recheck = now - watch['last_checked']
if seconds_since_last_recheck >= (threshold + watch.jitter_seconds) and seconds_since_last_recheck >= recheck_time_minimum_seconds: if seconds_since_last_recheck >= (threshold + watch.jitter_seconds) and seconds_since_last_recheck >= recheck_time_minimum_seconds:
if not uuid in running_uuids and uuid not in update_q.queue: if not uuid in running_uuids and uuid not in [q_uuid for p,q_uuid in update_q.queue]:
print("Queued watch UUID {} last checked at {} queued at {:0.2f} jitter {:0.2f}s, {:0.2f}s since last checked".format(uuid, # Use Epoch time as priority, so we get a "sorted" PriorityQueue, but we can still push a priority 1 into it.
watch['last_checked'], priority = int(time.time())
now, print(
watch.jitter_seconds, "> Queued watch UUID {} last checked at {} queued at {:0.2f} priority {} jitter {:0.2f}s, {:0.2f}s since last checked".format(
now - watch['last_checked'])) uuid,
watch['last_checked'],
now,
priority,
watch.jitter_seconds,
now - watch['last_checked']))
# Into the queue with you # Into the queue with you
update_q.put(uuid) update_q.put((priority, uuid))
# Reset for next time # Reset for next time
watch.jitter_seconds = 0 watch.jitter_seconds = 0

View File

@@ -24,7 +24,7 @@ class Watch(Resource):
abort(404, message='No watch exists with the UUID of {}'.format(uuid)) abort(404, message='No watch exists with the UUID of {}'.format(uuid))
if request.args.get('recheck'): if request.args.get('recheck'):
self.update_q.put(uuid) self.update_q.put((1, uuid))
return "OK", 200 return "OK", 200
# Return without history, get that via another API call # Return without history, get that via another API call
@@ -100,7 +100,7 @@ class CreateWatch(Resource):
extras = {'title': json_data['title'].strip()} if json_data.get('title') else {} extras = {'title': json_data['title'].strip()} if json_data.get('title') else {}
new_uuid = self.datastore.add_watch(url=json_data['url'].strip(), tag=tag, extras=extras) new_uuid = self.datastore.add_watch(url=json_data['url'].strip(), tag=tag, extras=extras)
self.update_q.put(new_uuid) self.update_q.put((1, new_uuid))
return {'uuid': new_uuid}, 201 return {'uuid': new_uuid}, 201
# Return concise list of available watches and some very basic info # Return concise list of available watches and some very basic info
@@ -113,12 +113,12 @@ class CreateWatch(Resource):
list[k] = {'url': v['url'], list[k] = {'url': v['url'],
'title': v['title'], 'title': v['title'],
'last_checked': v['last_checked'], 'last_checked': v['last_checked'],
'last_changed': v['last_changed'], 'last_changed': v.last_changed,
'last_error': v['last_error']} 'last_error': v['last_error']}
if request.args.get('recheck_all'): if request.args.get('recheck_all'):
for uuid in self.datastore.data['watching'].keys(): for uuid in self.datastore.data['watching'].keys():
self.update_q.put(uuid) self.update_q.put((1, uuid))
return {'status': "OK"}, 200 return {'status': "OK"}, 200
return list, 200 return list, 200

View File

@@ -4,6 +4,7 @@
import getopt import getopt
import os import os
import signal
import sys import sys
import eventlet import eventlet
@@ -11,7 +12,21 @@ import eventlet.wsgi
from . import store, changedetection_app, content_fetcher from . import store, changedetection_app, content_fetcher
from . import __version__ from . import __version__
# Only global so we can access it in the signal handler
datastore = None
app = None
def sigterm_handler(_signo, _stack_frame):
global app
global datastore
# app.config.exit.set()
print('Shutdown: Got SIGTERM, DB saved to disk')
datastore.sync_to_json()
# raise SystemExit
def main(): def main():
global datastore
global app
ssl_mode = False ssl_mode = False
host = '' host = ''
port = os.environ.get('PORT') or 5000 port = os.environ.get('PORT') or 5000
@@ -35,11 +50,6 @@ def main():
create_datastore_dir = False create_datastore_dir = False
for opt, arg in opts: for opt, arg in opts:
# if opt == '--clear-all-history':
# Remove history, the actual files you need to delete manually.
# for uuid, watch in datastore.data['watching'].items():
# watch.update({'history': {}, 'last_checked': 0, 'last_changed': 0, 'previous_md5': None})
if opt == '-s': if opt == '-s':
ssl_mode = True ssl_mode = True
@@ -72,9 +82,12 @@ def main():
"Or use the -C parameter to create the directory.".format(app_config['datastore_path']), file=sys.stderr) "Or use the -C parameter to create the directory.".format(app_config['datastore_path']), file=sys.stderr)
sys.exit(2) sys.exit(2)
datastore = store.ChangeDetectionStore(datastore_path=app_config['datastore_path'], version_tag=__version__) datastore = store.ChangeDetectionStore(datastore_path=app_config['datastore_path'], version_tag=__version__)
app = changedetection_app(app_config, datastore) app = changedetection_app(app_config, datastore)
signal.signal(signal.SIGTERM, sigterm_handler)
# Go into cleanup mode # Go into cleanup mode
if do_cleanup: if do_cleanup:
datastore.remove_unused_snapshots() datastore.remove_unused_snapshots()
@@ -111,4 +124,3 @@ def main():
else: else:
eventlet.wsgi.server(eventlet.listen((host, int(port))), app) eventlet.wsgi.server(eventlet.listen((host, int(port))), app)

View File

@@ -6,38 +6,64 @@ import requests
import time import time
import sys import sys
class PageUnloadable(Exception):
def __init__(self, status_code, url): class Non200ErrorCodeReceived(Exception):
def __init__(self, status_code, url, screenshot=None, xpath_data=None, page_html=None):
# Set this so we can use it in other parts of the app # Set this so we can use it in other parts of the app
self.status_code = status_code self.status_code = status_code
self.url = url self.url = url
self.screenshot = screenshot
self.xpath_data = xpath_data
self.page_text = None
if page_html:
from changedetectionio import html_tools
self.page_text = html_tools.html_to_text(page_html)
return
class JSActionExceptions(Exception):
def __init__(self, status_code, url, screenshot, message=''):
self.status_code = status_code
self.url = url
self.screenshot = screenshot
self.message = message
return
class PageUnloadable(Exception):
def __init__(self, status_code, url, screenshot=False, message=False):
# Set this so we can use it in other parts of the app
self.status_code = status_code
self.url = url
self.screenshot = screenshot
self.message = message
return return
pass
class EmptyReply(Exception): class EmptyReply(Exception):
def __init__(self, status_code, url): def __init__(self, status_code, url, screenshot=None):
# Set this so we can use it in other parts of the app # Set this so we can use it in other parts of the app
self.status_code = status_code self.status_code = status_code
self.url = url self.url = url
self.screenshot = screenshot
return return
pass
class ScreenshotUnavailable(Exception): class ScreenshotUnavailable(Exception):
def __init__(self, status_code, url): def __init__(self, status_code, url, page_html=None):
# Set this so we can use it in other parts of the app # Set this so we can use it in other parts of the app
self.status_code = status_code self.status_code = status_code
self.url = url self.url = url
if page_html:
from html_tools import html_to_text
self.page_text = html_to_text(page_html)
return return
pass
class ReplyWithContentButNoText(Exception): class ReplyWithContentButNoText(Exception):
def __init__(self, status_code, url): def __init__(self, status_code, url, screenshot=None):
# Set this so we can use it in other parts of the app # Set this so we can use it in other parts of the app
self.status_code = status_code self.status_code = status_code
self.url = url self.url = url
self.screenshot = screenshot
return return
pass
class Fetcher(): class Fetcher():
error = None error = None
@@ -46,6 +72,7 @@ class Fetcher():
headers = None headers = None
fetcher_description = "No description" fetcher_description = "No description"
webdriver_js_execute_code = None
xpath_element_js = """ xpath_element_js = """
// Include the getXpath script directly, easier than fetching // Include the getXpath script directly, easier than fetching
!function(e,n){"object"==typeof exports&&"undefined"!=typeof module?module.exports=n():"function"==typeof define&&define.amd?define(n):(e=e||self).getXPath=n()}(this,function(){return function(e){var n=e;if(n&&n.id)return'//*[@id="'+n.id+'"]';for(var o=[];n&&Node.ELEMENT_NODE===n.nodeType;){for(var i=0,r=!1,d=n.previousSibling;d;)d.nodeType!==Node.DOCUMENT_TYPE_NODE&&d.nodeName===n.nodeName&&i++,d=d.previousSibling;for(d=n.nextSibling;d;){if(d.nodeName===n.nodeName){r=!0;break}d=d.nextSibling}o.push((n.prefix?n.prefix+":":"")+n.localName+(i||r?"["+(i+1)+"]":"")),n=n.parentNode}return o.length?"/"+o.reverse().join("/"):""}}); !function(e,n){"object"==typeof exports&&"undefined"!=typeof module?module.exports=n():"function"==typeof define&&define.amd?define(n):(e=e||self).getXPath=n()}(this,function(){return function(e){var n=e;if(n&&n.id)return'//*[@id="'+n.id+'"]';for(var o=[];n&&Node.ELEMENT_NODE===n.nodeType;){for(var i=0,r=!1,d=n.previousSibling;d;)d.nodeType!==Node.DOCUMENT_TYPE_NODE&&d.nodeName===n.nodeName&&i++,d=d.previousSibling;for(d=n.nextSibling;d;){if(d.nodeName===n.nodeName){r=!0;break}d=d.nextSibling}o.push((n.prefix?n.prefix+":":"")+n.localName+(i||r?"["+(i+1)+"]":"")),n=n.parentNode}return o.length?"/"+o.reverse().join("/"):""}});
@@ -62,12 +89,12 @@ class Fetcher():
break; break;
} }
if('' !==r.id) { if('' !==r.id) {
chained_css.unshift("#"+r.id); chained_css.unshift("#"+CSS.escape(r.id));
final_selector= chained_css.join('>'); final_selector= chained_css.join(' > ');
// Be sure theres only one, some sites have multiples of the same ID tag :-( // Be sure theres only one, some sites have multiples of the same ID tag :-(
if (window.document.querySelectorAll(final_selector).length ==1 ) { if (window.document.querySelectorAll(final_selector).length ==1 ) {
return final_selector; return final_selector;
} }
return null; return null;
} else { } else {
chained_css.unshift(r.tagName.toLowerCase()); chained_css.unshift(r.tagName.toLowerCase());
@@ -175,12 +202,11 @@ class Fetcher():
# Will be needed in the future by the VisualSelector, always get this where possible. # Will be needed in the future by the VisualSelector, always get this where possible.
screenshot = False screenshot = False
fetcher_description = "No description"
system_http_proxy = os.getenv('HTTP_PROXY') system_http_proxy = os.getenv('HTTP_PROXY')
system_https_proxy = os.getenv('HTTPS_PROXY') system_https_proxy = os.getenv('HTTPS_PROXY')
# Time ONTOP of the system defined env minimum time # Time ONTOP of the system defined env minimum time
render_extract_delay=0 render_extract_delay = 0
@abstractmethod @abstractmethod
def get_error(self): def get_error(self):
@@ -267,7 +293,15 @@ class base_html_playwright(Fetcher):
# allow per-watch proxy selection override # allow per-watch proxy selection override
if proxy_override: if proxy_override:
self.proxy = {'server': proxy_override} # https://playwright.dev/docs/network#http-proxy
from urllib.parse import urlparse
parsed = urlparse(proxy_override)
proxy_url = "{}://{}:{}".format(parsed.scheme, parsed.hostname, parsed.port)
self.proxy = {'server': proxy_url}
if parsed.username:
self.proxy['username'] = parsed.username
if parsed.password:
self.proxy['password'] = parsed.password
def run(self, def run(self,
url, url,
@@ -309,44 +343,63 @@ class base_html_playwright(Fetcher):
page.set_default_navigation_timeout(90000) page.set_default_navigation_timeout(90000)
page.set_default_timeout(90000) page.set_default_timeout(90000)
# Bug - never set viewport size BEFORE page.goto # Listen for all console events and handle errors
page.on("console", lambda msg: print(f"Playwright console: Watch URL: {url} {msg.type}: {msg.text} {msg.args}"))
# Bug - never set viewport size BEFORE page.goto
# Waits for the next navigation. Using Python context manager # Waits for the next navigation. Using Python context manager
# prevents a race condition between clicking and waiting for a navigation. # prevents a race condition between clicking and waiting for a navigation.
with page.expect_navigation(): with page.expect_navigation():
response = page.goto(url, wait_until='load') response = page.goto(url, wait_until='load')
except playwright._impl._api_types.TimeoutError as e: except playwright._impl._api_types.TimeoutError as e:
context.close() context.close()
browser.close() browser.close()
# This can be ok, we will try to grab what we could retrieve # This can be ok, we will try to grab what we could retrieve
pass pass
except Exception as e: except Exception as e:
print ("other exception when page.goto") print("other exception when page.goto")
print (str(e)) print(str(e))
context.close() context.close()
browser.close() browser.close()
raise PageUnloadable(url=url, status_code=None) raise PageUnloadable(url=url, status_code=None, message=e.message)
if response is None: if response is None:
context.close() context.close()
browser.close() browser.close()
print ("response object was none") print("response object was none")
raise EmptyReply(url=url, status_code=None) raise EmptyReply(url=url, status_code=None)
# Bug 2(?) Set the viewport size AFTER loading the page # Bug 2(?) Set the viewport size AFTER loading the page
page.set_viewport_size({"width": 1280, "height": 1024}) page.set_viewport_size({"width": 1280, "height": 1024})
extra_wait = int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay extra_wait = int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay
time.sleep(extra_wait) time.sleep(extra_wait)
if self.webdriver_js_execute_code is not None:
try:
page.evaluate(self.webdriver_js_execute_code)
except Exception as e:
# Is it possible to get a screenshot?
error_screenshot = False
try:
page.screenshot(type='jpeg',
clip={'x': 1.0, 'y': 1.0, 'width': 1280, 'height': 1024},
quality=1)
# The actual screenshot
error_screenshot = page.screenshot(type='jpeg',
full_page=True,
quality=int(os.getenv("PLAYWRIGHT_SCREENSHOT_QUALITY", 72)))
except Exception as s:
pass
raise JSActionExceptions(status_code=response.status, screenshot=error_screenshot, message=str(e), url=url)
self.content = page.content() self.content = page.content()
self.status_code = response.status self.status_code = response.status
if len(self.content.strip()) == 0:
context.close()
browser.close()
print ("Content was empty")
raise EmptyReply(url=url, status_code=None)
self.headers = response.all_headers() self.headers = response.all_headers()
if current_css_filter is not None: if current_css_filter is not None:
@@ -373,9 +426,17 @@ class base_html_playwright(Fetcher):
browser.close() browser.close()
raise ScreenshotUnavailable(url=url, status_code=None) raise ScreenshotUnavailable(url=url, status_code=None)
if len(self.content.strip()) == 0:
context.close()
browser.close()
print("Content was empty")
raise EmptyReply(url=url, status_code=None, screenshot=self.screenshot)
context.close() context.close()
browser.close() browser.close()
if not ignore_status_codes and self.status_code!=200:
raise Non200ErrorCodeReceived(url=url, status_code=self.status_code, page_html=self.content, screenshot=self.screenshot)
class base_html_webdriver(Fetcher): class base_html_webdriver(Fetcher):
if os.getenv("WEBDRIVER_URL"): if os.getenv("WEBDRIVER_URL"):
@@ -447,6 +508,12 @@ class base_html_webdriver(Fetcher):
self.driver.set_window_size(1280, 1024) self.driver.set_window_size(1280, 1024)
self.driver.implicitly_wait(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5))) self.driver.implicitly_wait(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)))
if self.webdriver_js_execute_code is not None:
self.driver.execute_script(self.webdriver_js_execute_code)
# Selenium doesn't automatically wait for actions as good as Playwright, so wait again
self.driver.implicitly_wait(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)))
self.screenshot = self.driver.get_screenshot_as_png() self.screenshot = self.driver.get_screenshot_as_png()
# @todo - how to check this? is it possible? # @todo - how to check this? is it possible?
@@ -497,7 +564,7 @@ class html_requests(Fetcher):
ignore_status_codes=False, ignore_status_codes=False,
current_css_filter=None): current_css_filter=None):
proxies={} proxies = {}
# Allows override the proxy on a per-request basis # Allows override the proxy on a per-request basis
if self.proxy_override: if self.proxy_override:
@@ -525,10 +592,14 @@ class html_requests(Fetcher):
if encoding: if encoding:
r.encoding = encoding r.encoding = encoding
if not r.content or not len(r.content):
raise EmptyReply(url=url, status_code=r.status_code)
# @todo test this # @todo test this
# @todo maybe you really want to test zero-byte return pages? # @todo maybe you really want to test zero-byte return pages?
if (not ignore_status_codes and not r) or not r.content or not len(r.content): if r.status_code != 200 and not ignore_status_codes:
raise EmptyReply(url=url, status_code=r.status_code) # maybe check with content works?
raise Non200ErrorCodeReceived(url=url, status_code=r.status_code, page_html=r.text)
self.status_code = r.status_code self.status_code = r.status_code
self.content = r.text self.content = r.text

View File

@@ -11,7 +11,10 @@ urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
# Some common stuff here that can be moved to a base class # Some common stuff here that can be moved to a base class
# (set_proxy_from_list)
class perform_site_check(): class perform_site_check():
screenshot = None
xpath_data = None
def __init__(self, *args, datastore, **kwargs): def __init__(self, *args, datastore, **kwargs):
super().__init__(*args, **kwargs) super().__init__(*args, **kwargs)
@@ -45,6 +48,20 @@ class perform_site_check():
return proxy_args return proxy_args
# Doesn't look like python supports forward slash auto enclosure in re.findall
# So convert it to inline flag "foobar(?i)" type configuration
def forward_slash_enclosed_regex_to_options(self, regex):
res = re.search(r'^/(.*?)/(\w+)$', regex, re.IGNORECASE)
if res:
regex = res.group(1)
regex += '(?{})'.format(res.group(2))
else:
regex += '(?{})'.format('i')
return regex
def run(self, uuid): def run(self, uuid):
timestamp = int(time.time()) # used for storage etc too timestamp = int(time.time()) # used for storage etc too
@@ -79,7 +96,7 @@ class perform_site_check():
url = self.datastore.get_val(uuid, 'url') url = self.datastore.get_val(uuid, 'url')
request_body = self.datastore.get_val(uuid, 'body') request_body = self.datastore.get_val(uuid, 'body')
request_method = self.datastore.get_val(uuid, 'method') request_method = self.datastore.get_val(uuid, 'method')
ignore_status_code = self.datastore.get_val(uuid, 'ignore_status_codes') ignore_status_codes = self.datastore.data['watching'][uuid].get('ignore_status_codes', False)
# source: support # source: support
is_source = False is_source = False
@@ -106,9 +123,15 @@ class perform_site_check():
elif system_webdriver_delay is not None: elif system_webdriver_delay is not None:
fetcher.render_extract_delay = system_webdriver_delay fetcher.render_extract_delay = system_webdriver_delay
fetcher.run(url, timeout, request_headers, request_body, request_method, ignore_status_code, watch['css_filter']) if watch['webdriver_js_execute_code'] is not None and watch['webdriver_js_execute_code'].strip():
fetcher.webdriver_js_execute_code = watch['webdriver_js_execute_code']
fetcher.run(url, timeout, request_headers, request_body, request_method, ignore_status_codes, watch['css_filter'])
fetcher.quit() fetcher.quit()
self.screenshot = fetcher.screenshot
self.xpath_data = fetcher.xpath_data
# Fetching complete, now filters # Fetching complete, now filters
# @todo move to class / maybe inside of fetcher abstract base? # @todo move to class / maybe inside of fetcher abstract base?
@@ -147,7 +170,9 @@ class perform_site_check():
is_html = False is_html = False
if is_html or is_source: if is_html or is_source:
# CSS Filter, extract the HTML that matches and feed that into the existing inscriptis::get_text # CSS Filter, extract the HTML that matches and feed that into the existing inscriptis::get_text
fetcher.content = html_tools.workarounds_for_obfuscations(fetcher.content)
html_content = fetcher.content html_content = fetcher.content
# If not JSON, and if it's not text/plain.. # If not JSON, and if it's not text/plain..
@@ -190,7 +215,7 @@ class perform_site_check():
# Treat pages with no renderable text content as a change? No by default # Treat pages with no renderable text content as a change? No by default
empty_pages_are_a_change = self.datastore.data['settings']['application'].get('empty_pages_are_a_change', False) empty_pages_are_a_change = self.datastore.data['settings']['application'].get('empty_pages_are_a_change', False)
if not is_json and not empty_pages_are_a_change and len(stripped_text_from_html.strip()) == 0: if not is_json and not empty_pages_are_a_change and len(stripped_text_from_html.strip()) == 0:
raise content_fetcher.ReplyWithContentButNoText(url=url, status_code=200) raise content_fetcher.ReplyWithContentButNoText(url=url, status_code=fetcher.get_last_status_code(), screenshot=screenshot)
# We rely on the actual text in the html output.. many sites have random script vars etc, # We rely on the actual text in the html output.. many sites have random script vars etc,
# in the future we'll implement other mechanisms. # in the future we'll implement other mechanisms.
@@ -210,15 +235,27 @@ class perform_site_check():
if len(extract_text) > 0: if len(extract_text) > 0:
regex_matched_output = [] regex_matched_output = []
for s_re in extract_text: for s_re in extract_text:
result = re.findall(s_re.encode('utf8'), stripped_text_from_html, # incase they specified something in '/.../x'
flags=re.MULTILINE | re.DOTALL | re.LOCALE) regex = self.forward_slash_enclosed_regex_to_options(s_re)
if result: result = re.findall(regex.encode('utf-8'), stripped_text_from_html)
regex_matched_output.append(result[0])
for l in result:
if type(l) is tuple:
#@todo - some formatter option default (between groups)
regex_matched_output += list(l) + [b'\n']
else:
# @todo - some formatter option default (between each ungrouped result)
regex_matched_output += [l] + [b'\n']
# Now we will only show what the regex matched
stripped_text_from_html = b''
text_content_before_ignored_filter = b''
if regex_matched_output: if regex_matched_output:
stripped_text_from_html = b'\n'.join(regex_matched_output) # @todo some formatter for presentation?
stripped_text_from_html = b''.join(regex_matched_output)
text_content_before_ignored_filter = stripped_text_from_html text_content_before_ignored_filter = stripped_text_from_html
# Re #133 - if we should strip whitespaces from triggering the change detected comparison # Re #133 - if we should strip whitespaces from triggering the change detected comparison
if self.datastore.data['settings']['application'].get('ignore_whitespace', False): if self.datastore.data['settings']['application'].get('ignore_whitespace', False):
fetched_md5 = hashlib.md5(stripped_text_from_html.translate(None, b'\r\n\t ')).hexdigest() fetched_md5 = hashlib.md5(stripped_text_from_html.translate(None, b'\r\n\t ')).hexdigest()
@@ -280,4 +317,4 @@ class perform_site_check():
if not watch.get('previous_md5'): if not watch.get('previous_md5'):
watch['previous_md5'] = fetched_md5 watch['previous_md5'] = fetched_md5
return changed_detected, update_obj, text_content_before_ignored_filter, fetcher.screenshot, fetcher.xpath_data return changed_detected, update_obj, text_content_before_ignored_filter

View File

@@ -308,6 +308,9 @@ class ValidateCSSJSONXPATHInput(object):
class quickWatchForm(Form): class quickWatchForm(Form):
url = fields.URLField('URL', validators=[validateURL()]) url = fields.URLField('URL', validators=[validateURL()])
tag = StringField('Group tag', [validators.Optional()]) tag = StringField('Group tag', [validators.Optional()])
watch_submit_button = SubmitField('Watch', render_kw={"class": "pure-button pure-button-primary"})
edit_and_watch_submit_button = SubmitField('Edit > Watch', render_kw={"class": "pure-button pure-button-primary"})
# Common to a single watch and the global settings # Common to a single watch and the global settings
class commonSettingsForm(Form): class commonSettingsForm(Form):
@@ -344,9 +347,13 @@ class watchForm(commonSettingsForm):
trigger_text = StringListField('Trigger/wait for text', [validators.Optional(), ValidateListRegex()]) trigger_text = StringListField('Trigger/wait for text', [validators.Optional(), ValidateListRegex()])
text_should_not_be_present = StringListField('Block change-detection if text matches', [validators.Optional(), ValidateListRegex()]) text_should_not_be_present = StringListField('Block change-detection if text matches', [validators.Optional(), ValidateListRegex()])
webdriver_js_execute_code = TextAreaField('Execute JavaScript before change detection', render_kw={"rows": "5"}, validators=[validators.Optional()])
save_button = SubmitField('Save', render_kw={"class": "pure-button pure-button-primary"}) save_button = SubmitField('Save', render_kw={"class": "pure-button pure-button-primary"})
save_and_preview_button = SubmitField('Save & Preview', render_kw={"class": "pure-button pure-button-primary"})
proxy = RadioField('Proxy') proxy = RadioField('Proxy')
filter_failure_notification_send = BooleanField(
'Send a notification when the filter can no longer be found on the page', default=False)
def validate(self, **kwargs): def validate(self, **kwargs):
if not super().validate(): if not super().validate():
@@ -377,7 +384,6 @@ class globalSettingsApplicationForm(commonSettingsForm):
global_subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_xpath=False, allow_json=False)]) global_subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_xpath=False, allow_json=False)])
global_ignore_text = StringListField('Ignore Text', [ValidateListRegex()]) global_ignore_text = StringListField('Ignore Text', [ValidateListRegex()])
ignore_whitespace = BooleanField('Ignore whitespace') ignore_whitespace = BooleanField('Ignore whitespace')
real_browser_save_screenshot = BooleanField('Save last screenshot when using Chrome?')
removepassword_button = SubmitField('Remove password', render_kw={"class": "pure-button pure-button-primary"}) removepassword_button = SubmitField('Remove password', render_kw={"class": "pure-button pure-button-primary"})
empty_pages_are_a_change = BooleanField('Treat empty pages as a change?', default=False) empty_pages_are_a_change = BooleanField('Treat empty pages as a change?', default=False)
render_anchor_tag_content = BooleanField('Render anchor tag content', default=False) render_anchor_tag_content = BooleanField('Render anchor tag content', default=False)
@@ -385,6 +391,11 @@ class globalSettingsApplicationForm(commonSettingsForm):
api_access_token_enabled = BooleanField('API access token security check enabled', default=True, validators=[validators.Optional()]) api_access_token_enabled = BooleanField('API access token security check enabled', default=True, validators=[validators.Optional()])
password = SaltyPasswordField() password = SaltyPasswordField()
filter_failure_notification_threshold_attempts = IntegerField('Number of times the filter can be missing before sending a notification',
render_kw={"style": "width: 5em;"},
validators=[validators.NumberRange(min=0,
message="Should contain zero or more attempts")])
class globalSettingsForm(Form): class globalSettingsForm(Form):
# Define these as FormFields/"sub forms", this way it matches the JSON storage # Define these as FormFields/"sub forms", this way it matches the JSON storage

View File

@@ -1,5 +1,4 @@
import json import json
import re
from typing import List from typing import List
from bs4 import BeautifulSoup from bs4 import BeautifulSoup
@@ -8,16 +7,23 @@ import re
from inscriptis import get_text from inscriptis import get_text
from inscriptis.model.config import ParserConfig from inscriptis.model.config import ParserConfig
class FilterNotFoundInResponse(ValueError):
def __init__(self, msg):
ValueError.__init__(self, msg)
class JSONNotFound(ValueError): class JSONNotFound(ValueError):
def __init__(self, msg): def __init__(self, msg):
ValueError.__init__(self, msg) ValueError.__init__(self, msg)
# Given a CSS Rule, and a blob of HTML, return the blob of HTML that matches # Given a CSS Rule, and a blob of HTML, return the blob of HTML that matches
def css_filter(css_filter, html_content): def css_filter(css_filter, html_content):
soup = BeautifulSoup(html_content, "html.parser") soup = BeautifulSoup(html_content, "html.parser")
html_block = "" html_block = ""
for item in soup.select(css_filter, separator=""): r = soup.select(css_filter, separator="")
if len(html_content) > 0 and len(r) == 0:
raise FilterNotFoundInResponse(css_filter)
for item in r:
html_block += str(item) html_block += str(item)
return html_block + "\n" return html_block + "\n"
@@ -42,8 +48,19 @@ def xpath_filter(xpath_filter, html_content):
tree = html.fromstring(bytes(html_content, encoding='utf-8')) tree = html.fromstring(bytes(html_content, encoding='utf-8'))
html_block = "" html_block = ""
for item in tree.xpath(xpath_filter.strip(), namespaces={'re':'http://exslt.org/regular-expressions'}): r = tree.xpath(xpath_filter.strip(), namespaces={'re': 'http://exslt.org/regular-expressions'})
html_block+= etree.tostring(item, pretty_print=True).decode('utf-8')+"<br/>" if len(html_content) > 0 and len(r) == 0:
raise FilterNotFoundInResponse(xpath_filter)
#@note: //title/text() wont work where <title>CDATA..
for element in r:
if type(element) == etree._ElementStringResult:
html_block += str(element) + "<br/>"
elif type(element) == etree._ElementUnicodeResult:
html_block += str(element) + "<br/>"
else:
html_block += etree.tostring(element, pretty_print=True).decode('utf-8') + "<br/>"
return html_block return html_block
@@ -202,3 +219,17 @@ def html_to_text(html_content: str, render_anchor_tag_content=False) -> str:
return text_content return text_content
def workarounds_for_obfuscations(content):
"""
Some sites are using sneaky tactics to make prices and other information un-renderable by Inscriptis
This could go into its own Pip package in the future, for faster updates
"""
# HomeDepot.com style <span>$<!-- -->90<!-- -->.<!-- -->74</span>
# https://github.com/weblyzard/inscriptis/issues/45
if not content:
return content
content = re.sub('<!--\s+-->', '', content)
return content

View File

@@ -1,30 +1,28 @@
import collections from os import getenv
import os
import uuid as uuid_builder
from changedetectionio.notification import ( from changedetectionio.notification import (
default_notification_body, default_notification_body,
default_notification_format, default_notification_format,
default_notification_title, default_notification_title,
) )
_FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT = 6
class model(dict): class model(dict):
base_config = { base_config = {
'note': "Hello! If you change this file manually, please be sure to restart your changedetection.io instance!", 'note': "Hello! If you change this file manually, please be sure to restart your changedetection.io instance!",
'watching': {}, 'watching': {},
'settings': { 'settings': {
'headers': { 'headers': {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36', 'User-Agent': getenv("DEFAULT_SETTINGS_HEADERS_USERAGENT", 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36'),
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate', # No support for brolti in python requests yet. 'Accept-Encoding': 'gzip, deflate', # No support for brolti in python requests yet.
'Accept-Language': 'en-GB,en-US;q=0.9,en;' 'Accept-Language': 'en-GB,en-US;q=0.9,en;'
}, },
'requests': { 'requests': {
'timeout': 15, # Default 15 seconds 'timeout': int(getenv("DEFAULT_SETTINGS_REQUESTS_TIMEOUT", "45")), # Default 45 seconds
'time_between_check': {'weeks': None, 'days': None, 'hours': 3, 'minutes': None, 'seconds': None}, 'time_between_check': {'weeks': None, 'days': None, 'hours': 3, 'minutes': None, 'seconds': None},
'jitter_seconds': 0, 'jitter_seconds': 0,
'workers': 10, # Number of threads, lower is better for slow connections 'workers': int(getenv("DEFAULT_SETTINGS_REQUESTS_WORKERS", "10")), # Number of threads, lower is better for slow connections
'proxy': None # Preferred proxy connection 'proxy': None # Preferred proxy connection
}, },
'application': { 'application': {
@@ -33,7 +31,8 @@ class model(dict):
'base_url' : None, 'base_url' : None,
'extract_title_as_title': False, 'extract_title_as_title': False,
'empty_pages_are_a_change': False, 'empty_pages_are_a_change': False,
'fetch_backend': os.getenv("DEFAULT_FETCH_BACKEND", "html_requests"), 'fetch_backend': getenv("DEFAULT_FETCH_BACKEND", "html_requests"),
'filter_failure_notification_threshold_attempts': _FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT,
'global_ignore_text': [], # List of text to ignore when calculating the comparison checksum 'global_ignore_text': [], # List of text to ignore when calculating the comparison checksum
'global_subtractive_selectors': [], 'global_subtractive_selectors': [],
'ignore_whitespace': True, 'ignore_whitespace': True,
@@ -43,7 +42,6 @@ class model(dict):
'notification_title': default_notification_title, 'notification_title': default_notification_title,
'notification_body': default_notification_body, 'notification_body': default_notification_body,
'notification_format': default_notification_format, 'notification_format': default_notification_format,
'real_browser_save_screenshot': True,
'schema_version' : 0, 'schema_version' : 0,
'webdriver_delay': None # Extra delay in seconds before extracting text 'webdriver_delay': None # Extra delay in seconds before extracting text
} }

View File

@@ -1,7 +1,9 @@
import os import os
import uuid as uuid_builder import uuid as uuid_builder
from distutils.util import strtobool
minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60)) minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60))
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
from changedetectionio.notification import ( from changedetectionio.notification import (
default_notification_body, default_notification_body,
@@ -17,7 +19,6 @@ class model(dict):
'url': None, 'url': None,
'tag': None, 'tag': None,
'last_checked': 0, 'last_checked': 0,
'last_changed': 0,
'paused': False, 'paused': False,
'last_viewed': 0, # history key value of the last viewed via the [diff] link 'last_viewed': 0, # history key value of the last viewed via the [diff] link
#'newest_history_key': 0, #'newest_history_key': 0,
@@ -34,12 +35,16 @@ class model(dict):
'notification_title': default_notification_title, 'notification_title': default_notification_title,
'notification_body': default_notification_body, 'notification_body': default_notification_body,
'notification_format': default_notification_format, 'notification_format': default_notification_format,
'notification_muted': False,
'css_filter': '', 'css_filter': '',
'last_error': False,
'extract_text': [], # Extract text by regex after filters 'extract_text': [], # Extract text by regex after filters
'subtractive_selectors': [], 'subtractive_selectors': [],
'trigger_text': [], # List of text or regex to wait for until a change is detected 'trigger_text': [], # List of text or regex to wait for until a change is detected
'text_should_not_be_present': [], # Text that should not present 'text_should_not_be_present': [], # Text that should not present
'fetch_backend': None, 'fetch_backend': None,
'filter_failure_notification_send': strtobool(os.getenv('FILTER_FAILURE_NOTIFICATION_SEND_DEFAULT', 'True')),
'consecutive_filter_failures': 0, # Every time the CSS/xPath filter cannot be located, reset when all is fine.
'extract_title_as_title': False, 'extract_title_as_title': False,
'check_unique_lines': False, # On change-detected, compare against all history if its something new 'check_unique_lines': False, # On change-detected, compare against all history if its something new
'proxy': None, # Preferred proxy connection 'proxy': None, # Preferred proxy connection
@@ -47,16 +52,17 @@ class model(dict):
# Requires setting to None on submit if it's the same as the default # Requires setting to None on submit if it's the same as the default
# Should be all None by default, so we use the system default in this case. # Should be all None by default, so we use the system default in this case.
'time_between_check': {'weeks': None, 'days': None, 'hours': None, 'minutes': None, 'seconds': None}, 'time_between_check': {'weeks': None, 'days': None, 'hours': None, 'minutes': None, 'seconds': None},
'webdriver_delay': None 'webdriver_delay': None,
'webdriver_js_execute_code': None, # Run before change-detection
} }
jitter_seconds = 0 jitter_seconds = 0
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
def __init__(self, *arg, **kw): def __init__(self, *arg, **kw):
import uuid
self.update(self.__base_config) self.update(self.__base_config)
self.__datastore_path = kw['datastore_path'] self.__datastore_path = kw['datastore_path']
self['uuid'] = str(uuid.uuid4()) self['uuid'] = str(uuid_builder.uuid4())
del kw['datastore_path'] del kw['datastore_path']
@@ -64,7 +70,10 @@ class model(dict):
self.update(kw['default']) self.update(kw['default'])
del kw['default'] del kw['default']
# goes at the end so we update the default object with the initialiser # Be sure the cached timestamp is ready
bump = self.history
# Goes at the end so we update the default object with the initialiser
super(model, self).__init__(*arg, **kw) super(model, self).__init__(*arg, **kw)
@property @property
@@ -74,6 +83,28 @@ class model(dict):
return False return False
def ensure_data_dir_exists(self):
target_path = os.path.join(self.__datastore_path, self['uuid'])
if not os.path.isdir(target_path):
print ("> Creating data dir {}".format(target_path))
os.mkdir(target_path)
@property
def label(self):
# Used for sorting
if self['title']:
return self['title']
return self['url']
@property
def last_changed(self):
# last_changed will be the newest snapshot, but when we have just one snapshot, it should be 0
if self.__history_n <= 1:
return 0
if self.__newest_history_key:
return int(self.__newest_history_key)
return 0
@property @property
def history_n(self): def history_n(self):
return self.__history_n return self.__history_n
@@ -116,19 +147,15 @@ class model(dict):
bump = self.history bump = self.history
return self.__newest_history_key return self.__newest_history_key
# Save some text file to the appropriate path and bump the history # Save some text file to the appropriate path and bump the history
# result_obj from fetch_site_status.run() # result_obj from fetch_site_status.run()
def save_history_text(self, contents, timestamp): def save_history_text(self, contents, timestamp):
import uuid import uuid
from os import mkdir, path, unlink
import logging import logging
output_path = "{}/{}".format(self.__datastore_path, self['uuid']) output_path = "{}/{}".format(self.__datastore_path, self['uuid'])
# Incase the operator deleted it, check and create. self.ensure_data_dir_exists()
if not os.path.isdir(output_path):
mkdir(output_path)
snapshot_fname = "{}/{}.stripped.txt".format(output_path, uuid.uuid4()) snapshot_fname = "{}/{}.stripped.txt".format(output_path, uuid.uuid4())
logging.debug("Saving history text {}".format(snapshot_fname)) logging.debug("Saving history text {}".format(snapshot_fname))
@@ -159,21 +186,70 @@ class model(dict):
def threshold_seconds(self): def threshold_seconds(self):
seconds = 0 seconds = 0
for m, n in self.mtable.items(): for m, n in mtable.items():
x = self.get('time_between_check', {}).get(m, None) x = self.get('time_between_check', {}).get(m, None)
if x: if x:
seconds += x * n seconds += x * n
return seconds return seconds
# Iterate over all history texts and see if something new exists # Iterate over all history texts and see if something new exists
def lines_contain_something_unique_compared_to_history(self, lines=[]): def lines_contain_something_unique_compared_to_history(self, lines: list):
local_lines = [l.decode('utf-8').strip().lower() for l in lines] local_lines = set([l.decode('utf-8').strip().lower() for l in lines])
# Compare each lines (set) against each history text file (set) looking for something new.. # Compare each lines (set) against each history text file (set) looking for something new..
existing_history = set({})
for k, v in self.history.items(): for k, v in self.history.items():
alist = [line.decode('utf-8').strip().lower() for line in open(v, 'rb')] alist = set([line.decode('utf-8').strip().lower() for line in open(v, 'rb')])
res = set(alist) != set(local_lines) existing_history = existing_history.union(alist)
if res:
return True # Check that everything in local_lines(new stuff) already exists in existing_history - it should
# if not, something new happened
return not local_lines.issubset(existing_history)
def get_screenshot(self):
fname = os.path.join(self.__datastore_path, self['uuid'], "last-screenshot.png")
if os.path.isfile(fname):
return fname
return False return False
def __get_file_ctime(self, filename):
fname = os.path.join(self.__datastore_path, self['uuid'], filename)
if os.path.isfile(fname):
return int(os.path.getmtime(fname))
return False
@property
def error_text_ctime(self):
return self.__get_file_ctime('last-error.txt')
@property
def snapshot_text_ctime(self):
if self.history_n==0:
return False
timestamp = list(self.history.keys())[-1]
return int(timestamp)
@property
def snapshot_screenshot_ctime(self):
return self.__get_file_ctime('last-screenshot.png')
@property
def snapshot_error_screenshot_ctime(self):
return self.__get_file_ctime('last-error-screenshot.png')
def get_error_text(self):
"""Return the text saved from a previous request that resulted in a non-200 error"""
fname = os.path.join(self.__datastore_path, self['uuid'], "last-error.txt")
if os.path.isfile(fname):
with open(fname, 'r') as f:
return f.read()
return False
def get_error_snapshot(self):
"""Return path to the screenshot that resulted in a non-200 error"""
fname = os.path.join(self.__datastore_path, self['uuid'], "last-error-screenshot.png")
if os.path.isfile(fname):
return fname
return False

View File

@@ -34,7 +34,6 @@ def process_notification(n_object, datastore):
valid_notification_formats[default_notification_format], valid_notification_formats[default_notification_format],
) )
# Insert variables into the notification content # Insert variables into the notification content
notification_parameters = create_notification_parameters(n_object, datastore) notification_parameters = create_notification_parameters(n_object, datastore)
@@ -64,7 +63,7 @@ def process_notification(n_object, datastore):
# So if no avatar_url is specified, add one so it can be correctly calculated into the total payload # So if no avatar_url is specified, add one so it can be correctly calculated into the total payload
k = '?' if not '?' in url else '&' k = '?' if not '?' in url else '&'
if not 'avatar_url' in url: if not 'avatar_url' in url and not url.startswith('mail'):
url += k + 'avatar_url=https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/changedetectionio/static/images/avatar-256x256.png' url += k + 'avatar_url=https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/changedetectionio/static/images/avatar-256x256.png'
if url.startswith('tgram://'): if url.startswith('tgram://'):
@@ -79,13 +78,21 @@ def process_notification(n_object, datastore):
n_title = n_title[0:payload_max_size] n_title = n_title[0:payload_max_size]
n_body = n_body[0:body_limit] n_body = n_body[0:body_limit]
elif url.startswith('discord://'): elif url.startswith('discord://') or url.startswith('https://discordapp.com/api/webhooks') or url.startswith('https://discord.com/api'):
# real limit is 2000, but minus some for extra metadata # real limit is 2000, but minus some for extra metadata
payload_max_size = 1700 payload_max_size = 1700
body_limit = max(0, payload_max_size - len(n_title)) body_limit = max(0, payload_max_size - len(n_title))
n_title = n_title[0:payload_max_size] n_title = n_title[0:payload_max_size]
n_body = n_body[0:body_limit] n_body = n_body[0:body_limit]
elif url.startswith('mailto'):
# Apprise will default to HTML, so we need to override it
# So that whats' generated in n_body is in line with what is going to be sent.
# https://github.com/caronc/apprise/issues/633#issuecomment-1191449321
if not 'format=' in url and (n_format == 'text' or n_format == 'markdown'):
prefix = '?' if not '?' in url else '&'
url = "{}{}format={}".format(url, prefix, n_format)
apobj.add(url) apobj.add(url)
apobj.notify( apobj.notify(

View File

@@ -32,16 +32,20 @@ docker run -d --name $$-test_selenium -p 4444:4444 --rm --shm-size="2g" seleni
sleep 5 sleep 5
export WEBDRIVER_URL=http://localhost:4444/wd/hub export WEBDRIVER_URL=http://localhost:4444/wd/hub
pytest tests/fetchers/test_content.py pytest tests/fetchers/test_content.py
pytest tests/test_errorhandling.py
unset WEBDRIVER_URL unset WEBDRIVER_URL
docker kill $$-test_selenium docker kill $$-test_selenium
echo "TESTING WEBDRIVER FETCH > PLAYWRIGHT/BROWSERLESS..." echo "TESTING WEBDRIVER FETCH > PLAYWRIGHT/BROWSERLESS..."
# Not all platforms support playwright (not ARM/rPI), so it's not packaged in requirements.txt # Not all platforms support playwright (not ARM/rPI), so it's not packaged in requirements.txt
pip3 install playwright~=1.22 pip3 install playwright~=1.24
docker run -d --name $$-test_browserless -e "DEFAULT_LAUNCH_ARGS=[\"--window-size=1920,1080\"]" --rm -p 3000:3000 --shm-size="2g" browserless/chrome:1.53-chrome-stable docker run -d --name $$-test_browserless -e "DEFAULT_LAUNCH_ARGS=[\"--window-size=1920,1080\"]" --rm -p 3000:3000 --shm-size="2g" browserless/chrome:1.53-chrome-stable
# takes a while to spin up # takes a while to spin up
sleep 5 sleep 5
export PLAYWRIGHT_DRIVER_URL=ws://127.0.0.1:3000 export PLAYWRIGHT_DRIVER_URL=ws://127.0.0.1:3000
pytest tests/fetchers/test_content.py pytest tests/fetchers/test_content.py
pytest tests/test_errorhandling.py
pytest tests/visualselector/test_fetch_data.py
unset PLAYWRIGHT_DRIVER_URL unset PLAYWRIGHT_DRIVER_URL
docker kill $$-test_browserless docker kill $$-test_browserless

View File

@@ -0,0 +1,42 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
width="15"
height="16.363636"
viewBox="0 0 15 16.363636"
version="1.1"
id="svg4"
sodipodi:docname="bell-off.svg"
inkscape:version="1.1.1 (1:1.1+202109281949+c3084ef5ed)"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg">
<sodipodi:namedview
id="namedview5"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageshadow="2"
inkscape:pageopacity="0.0"
inkscape:pagecheckerboard="0"
showgrid="false"
fit-margin-top="0"
fit-margin-left="0"
fit-margin-right="0"
fit-margin-bottom="0"
inkscape:zoom="28.416667"
inkscape:cx="-0.59824046"
inkscape:cy="12"
inkscape:window-width="1554"
inkscape:window-height="896"
inkscape:window-x="2095"
inkscape:window-y="107"
inkscape:window-maximized="0"
inkscape:current-layer="svg4" />
<defs
id="defs8" />
<path
d="m 14.318182,11.762045 v 1.1925 H 5.4102273 L 11.849318,7.1140909 C 12.234545,9.1561364 12.54,11.181818 14.318182,11.762045 Z m -6.7984093,4.601591 c 1.0759091,0 2.0256823,-0.955909 2.0256823,-2.045454 H 5.4545455 c 0,1.089545 0.9879545,2.045454 2.0652272,2.045454 z M 15,2.8622727 0.9177273,15.636136 0,14.627045 l 1.8443182,-1.6725 h -1.1625 v -1.1925 C 4.0070455,10.677273 2.1784091,4.5388636 5.3611364,2.6897727 5.8009091,2.4347727 6.0709091,1.9609091 6.0702273,1.4488636 v -0.00205 C 6.0702273,0.64772727 6.7104545,0 7.5,0 8.2895455,0 8.9297727,0.64772727 8.9297727,1.4468182 v 0.00205 C 8.9290909,1.9602319 9.199773,2.4354591 9.638864,2.6897773 10.364318,3.111141 10.827273,3.7568228 11.1525,4.5129591 L 14.085682,1.8531818 Z M 6.8181818,1.3636364 C 6.8181818,1.74 7.1236364,2.0454545 7.5,2.0454545 7.8763636,2.0454545 8.1818182,1.74 8.1818182,1.3636364 8.1818182,0.98795455 7.8763636,0.68181818 7.5,0.68181818 c -0.3763636,0 -0.6818182,0.30613637 -0.6818182,0.68181822 z"
id="path2"
style="fill:#f8321b;stroke-width:0.681818;fill-opacity:1" />
</svg>

After

Width:  |  Height:  |  Size: 2.1 KiB

View File

@@ -0,0 +1,20 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
width="18"
height="19.92"
viewBox="0 0 18 19.92"
version="1.1"
id="svg6"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg">
<defs
id="defs10" />
<path
d="M -3,-2 H 21 V 22 H -3 Z"
fill="none"
id="path2" />
<path
d="m 15,14.08 c -0.76,0 -1.44,0.3 -1.96,0.77 L 5.91,10.7 C 5.96,10.47 6,10.24 6,10 6,9.76 5.96,9.53 5.91,9.3 L 12.96,5.19 C 13.5,5.69 14.21,6 15,6 16.66,6 18,4.66 18,3 18,1.34 16.66,0 15,0 c -1.66,0 -3,1.34 -3,3 0,0.24 0.04,0.47 0.09,0.7 L 5.04,7.81 C 4.5,7.31 3.79,7 3,7 1.34,7 0,8.34 0,10 c 0,1.66 1.34,3 3,3 0.79,0 1.5,-0.31 2.04,-0.81 l 7.12,4.16 c -0.05,0.21 -0.08,0.43 -0.08,0.65 0,1.61 1.31,2.92 2.92,2.92 1.61,0 2.92,-1.31 2.92,-2.92 0,-1.61 -1.31,-2.92 -2.92,-2.92 z"
id="path4"
style="fill:#ffffff;fill-opacity:1" />
</svg>

After

Width:  |  Height:  |  Size: 892 B

View File

@@ -10,7 +10,13 @@ $(document).ready(function () {
if (hash_name === '#screenshot') { if (hash_name === '#screenshot') {
$("img#screenshot-img").attr('src', screenshot_url); $("img#screenshot-img").attr('src', screenshot_url);
$("#settings").hide(); $("#settings").hide();
} else { } else if (hash_name === '#error-screenshot') {
$("img#error-screenshot-img").attr('src', error_screenshot_url);
$("#settings").hide();
}
else {
$("#settings").show(); $("#settings").show();
} }
} }

View File

@@ -1,51 +1,44 @@
// Rewrite this is a plugin.. is all this JS really 'worth it?' // Rewrite this is a plugin.. is all this JS really 'worth it?'
window.addEventListener('hashchange', function () {
if(!window.location.hash) { var tabs = document.getElementsByClassName('active');
var tab=document.querySelectorAll("#default-tab a"); while (tabs[0]) {
tab[0].click(); tabs[0].classList.remove('active')
} }
set_active_tab();
window.addEventListener('hashchange', function() {
var tabs = document.getElementsByClassName('active');
while (tabs[0]) {
tabs[0].classList.remove('active')
}
set_active_tab();
}, false); }, false);
var has_errors=document.querySelectorAll(".messages .error"); var has_errors = document.querySelectorAll(".messages .error");
if (!has_errors.length) { if (!has_errors.length) {
if (document.location.hash == "" ) { if (document.location.hash == "") {
document.location.hash = "#general"; document.querySelector(".tabs ul li:first-child a").click();
document.getElementById("default-tab").className = "active";
} else { } else {
set_active_tab(); set_active_tab();
} }
} else { } else {
focus_error_tab(); focus_error_tab();
} }
function set_active_tab() { function set_active_tab() {
var tab=document.querySelectorAll("a[href='"+location.hash+"']"); var tab = document.querySelectorAll("a[href='" + location.hash + "']");
if (tab.length) { if (tab.length) {
tab[0].parentElement.className="active"; tab[0].parentElement.className = "active";
} }
// hash could move the page down // hash could move the page down
window.scrollTo(0, 0); window.scrollTo(0, 0);
} }
function focus_error_tab() { function focus_error_tab() {
// time to use jquery or vuejs really, // time to use jquery or vuejs really,
// activate the tab with the error // activate the tab with the error
var tabs = document.querySelectorAll('.tabs li a'),i; var tabs = document.querySelectorAll('.tabs li a'), i;
for (i = 0; i < tabs.length; ++i) { for (i = 0; i < tabs.length; ++i) {
var tab_name=tabs[i].hash.replace('#',''); var tab_name = tabs[i].hash.replace('#', '');
var pane_errors=document.querySelectorAll('#'+tab_name+' .error') var pane_errors = document.querySelectorAll('#' + tab_name + ' .error')
if (pane_errors.length) { if (pane_errors.length) {
document.location.hash = '#'+tab_name; document.location.hash = '#' + tab_name;
return true; return true;
} }
} }
return false; return false;
} }

View File

@@ -1 +1,3 @@
node_modules node_modules
package-lock.json

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,26 @@
.arrow {
border: solid #1b98f8;
border-width: 0 2px 2px 0;
display: inline-block;
padding: 3px;
&.right {
transform: rotate(-45deg);
-webkit-transform: rotate(-45deg);
}
&.left {
transform: rotate(135deg);
-webkit-transform: rotate(135deg);
}
&.up, &.asc {
transform: rotate(-135deg);
-webkit-transform: rotate(-135deg);
}
&.down, &.desc {
transform: rotate(45deg);
-webkit-transform: rotate(45deg);
}
}

View File

@@ -1,11 +1,27 @@
/* /*
* -- BASE STYLES -- * -- BASE STYLES --
* Most of these are inherited from Base, but I want to change a few. * Most of these are inherited from Base, but I want to change a few.
* nvm use v14.18.1 * nvm use v14.18.1 && npm install && npm run build
* npm install
* npm run build
* or npm run watch * or npm run watch
*/ */
.arrow {
border: solid #1b98f8;
border-width: 0 2px 2px 0;
display: inline-block;
padding: 3px; }
.arrow.right {
transform: rotate(-45deg);
-webkit-transform: rotate(-45deg); }
.arrow.left {
transform: rotate(135deg);
-webkit-transform: rotate(135deg); }
.arrow.up, .arrow.asc {
transform: rotate(-135deg);
-webkit-transform: rotate(-135deg); }
.arrow.down, .arrow.desc {
transform: rotate(45deg);
-webkit-transform: rotate(45deg); }
body { body {
color: #333; color: #333;
background: #262626; } background: #262626; }
@@ -55,6 +71,12 @@ code {
white-space: normal; } white-space: normal; }
.watch-table th { .watch-table th {
white-space: nowrap; } white-space: nowrap; }
.watch-table th a {
font-weight: normal; }
.watch-table th a.active {
font-weight: bolder; }
.watch-table th a.inactive .arrow {
display: none; }
.watch-table .title-col a[target="_blank"]::after, .watch-table .current-diff-url::after { .watch-table .title-col a[target="_blank"]::after, .watch-table .current-diff-url::after {
content: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAQElEQVR42qXKwQkAIAxDUUdxtO6/RBQkQZvSi8I/pL4BoGw/XPkh4XigPmsUgh0626AjRsgxHTkUThsG2T/sIlzdTsp52kSS1wAAAABJRU5ErkJggg==); content: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAQElEQVR42qXKwQkAIAxDUUdxtO6/RBQkQZvSi8I/pL4BoGw/XPkh4XigPmsUgh0626AjRsgxHTkUThsG2T/sIlzdTsp52kSS1wAAAABJRU5ErkJggg==);
margin: 0 3px 0 5px; } margin: 0 3px 0 5px; }
@@ -105,24 +127,6 @@ body:after, body:before {
-webkit-clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%); -webkit-clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%);
clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%); } clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%); }
.arrow {
border: solid black;
border-width: 0 3px 3px 0;
display: inline-block;
padding: 3px; }
.arrow.right {
transform: rotate(-45deg);
-webkit-transform: rotate(-45deg); }
.arrow.left {
transform: rotate(135deg);
-webkit-transform: rotate(135deg); }
.arrow.up {
transform: rotate(-135deg);
-webkit-transform: rotate(-135deg); }
.arrow.down {
transform: rotate(45deg);
-webkit-transform: rotate(45deg); }
.button-small { .button-small {
font-size: 85%; } font-size: 85%; }
@@ -203,13 +207,18 @@ body:after, body:before {
border-radius: 10px; border-radius: 10px;
margin-bottom: 1em; } margin-bottom: 1em; }
#new-watch-form input { #new-watch-form input {
width: auto !important; display: inline-block;
display: inline-block; } margin-bottom: 5px; }
#new-watch-form .label { #new-watch-form .label {
display: none; } display: none; }
#new-watch-form legend { #new-watch-form legend {
color: #fff; color: #fff;
font-weight: bold; } font-weight: bold; }
#new-watch-form #watch-add-wrapper-zone > div {
display: inline-block; }
@media only screen and (max-width: 760px) {
#new-watch-form #watch-add-wrapper-zone #url {
width: 100%; } }
#diff-col { #diff-col {
padding-left: 40px; } padding-left: 40px; }
@@ -268,11 +277,15 @@ footer {
#new-version-text a { #new-version-text a {
color: #e07171; } color: #e07171; }
.paused-state.state-False img { .watch-controls {
opacity: 0.2; } /* default */ }
.watch-controls .state-on img {
.paused-state.state-False:hover img { opacity: 0.8; }
opacity: 0.8; } .watch-controls img {
opacity: 0.2; }
.watch-controls img:hover {
transition: opacity 0.3s;
opacity: 0.8; }
.monospaced-textarea textarea { .monospaced-textarea textarea {
width: 100%; width: 100%;
@@ -532,3 +545,13 @@ ul {
100% { 100% {
-webkit-transform: rotate(360deg); -webkit-transform: rotate(360deg);
transform: rotate(360deg); } } transform: rotate(360deg); } }
.snapshot-age {
padding: 4px;
background-color: #dfdfdf;
border-radius: 3px;
font-weight: bold;
margin-bottom: 4px; }
.snapshot-age.error {
background-color: #ff0000;
color: #fff; }

View File

@@ -1,11 +1,11 @@
/* /*
* -- BASE STYLES -- * -- BASE STYLES --
* Most of these are inherited from Base, but I want to change a few. * Most of these are inherited from Base, but I want to change a few.
* nvm use v14.18.1 * nvm use v14.18.1 && npm install && npm run build
* npm install
* npm run build
* or npm run watch * or npm run watch
*/ */
@import "parts/_arrows.scss";
body { body {
color: #333; color: #333;
background: #262626; background: #262626;
@@ -70,6 +70,17 @@ code {
th { th {
white-space: nowrap; white-space: nowrap;
a {
font-weight: normal;
&.active {
font-weight: bolder;
}
&.inactive {
.arrow {
display: none;
}
}
}
} }
.title-col a[target="_blank"]::after, .current-diff-url::after { .title-col a[target="_blank"]::after, .current-diff-url::after {
@@ -139,29 +150,6 @@ body:after, body:before {
clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%) clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%)
} }
.arrow {
border: solid black;
border-width: 0 3px 3px 0;
display: inline-block;
padding: 3px;
&.right {
transform: rotate(-45deg);
-webkit-transform: rotate(-45deg);
}
&.left {
transform: rotate(135deg);
-webkit-transform: rotate(135deg);
}
&.up {
transform: rotate(-135deg);
-webkit-transform: rotate(-135deg);
}
&.down {
transform: rotate(45deg);
-webkit-transform: rotate(45deg);
}
}
.button-small { .button-small {
font-size: 85%; font-size: 85%;
} }
@@ -269,8 +257,8 @@ body:after, body:before {
border-radius: 10px; border-radius: 10px;
margin-bottom: 1em; margin-bottom: 1em;
input { input {
width: auto !important;
display: inline-block; display: inline-block;
margin-bottom: 5px;
} }
.label { .label {
display: none; display: none;
@@ -279,6 +267,17 @@ body:after, body:before {
color: #fff; color: #fff;
font-weight: bold; font-weight: bold;
} }
#watch-add-wrapper-zone {
> div {
display: inline-block;
}
@media only screen and (max-width: 760px) {
#url {
width: 100%;
}
}
}
} }
@@ -353,14 +352,25 @@ footer {
color: #e07171; color: #e07171;
} }
.paused-state { .watch-controls {
&.state-False img { .state-on {
img {
opacity: 0.8;
}
}
/* default */
img {
opacity: 0.2; opacity: 0.2;
} }
&.state-False:hover img { img {
opacity: 0.8; &:hover {
transition: opacity 0.3s;
opacity: 0.8;
}
} }
} }
.monospaced-textarea { .monospaced-textarea {
@@ -492,6 +502,7 @@ and also iPads specifically.
vertical-align: middle; vertical-align: middle;
} }
} }
.last-checked::before { .last-checked::before {
color: #555; color: #555;
content: "Last Checked "; content: "Last Checked ";
@@ -751,3 +762,15 @@ ul {
} }
} }
.snapshot-age {
padding: 4px;
background-color: #dfdfdf;
border-radius: 3px;
font-weight: bold;
margin-bottom: 4px;
&.error {
background-color: #ff0000;
color: #fff;
}
}

View File

@@ -8,7 +8,7 @@ import threading
import time import time
import uuid as uuid_builder import uuid as uuid_builder
from copy import deepcopy from copy import deepcopy
from os import mkdir, path, unlink from os import path, unlink
from threading import Lock from threading import Lock
import re import re
import requests import requests
@@ -158,8 +158,7 @@ class ChangeDetectionStore:
@property @property
def threshold_seconds(self): def threshold_seconds(self):
seconds = 0 seconds = 0
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7} for m, n in Watch.mtable.items():
for m, n in mtable.items():
x = self.__data['settings']['requests']['time_between_check'].get(m) x = self.__data['settings']['requests']['time_between_check'].get(m)
if x: if x:
seconds += x * n seconds += x * n
@@ -255,7 +254,6 @@ class ChangeDetectionStore:
self.__data['watching'][uuid].update( self.__data['watching'][uuid].update(
{'last_checked': 0, {'last_checked': 0,
'last_changed': 0,
'last_viewed': 0, 'last_viewed': 0,
'previous_md5': False, 'previous_md5': False,
'last_notification_error': False, 'last_notification_error': False,
@@ -298,7 +296,8 @@ class ChangeDetectionStore:
'ignore_text', 'css_filter', 'ignore_text', 'css_filter',
'subtractive_selectors', 'trigger_text', 'subtractive_selectors', 'trigger_text',
'extract_title_as_title', 'extract_text', 'extract_title_as_title', 'extract_text',
'text_should_not_be_present']: 'text_should_not_be_present',
'webdriver_js_execute_code']:
if res.get(k): if res.get(k):
apply_extras[k] = res[k] apply_extras[k] = res[k]
@@ -325,25 +324,12 @@ class ChangeDetectionStore:
new_watch.update(apply_extras) new_watch.update(apply_extras)
self.__data['watching'][new_uuid]=new_watch self.__data['watching'][new_uuid]=new_watch
# Get the directory ready self.__data['watching'][new_uuid].ensure_data_dir_exists()
output_path = "{}/{}".format(self.datastore_path, new_uuid)
try:
mkdir(output_path)
except FileExistsError:
print(output_path, "already exists.")
if write_to_disk_now: if write_to_disk_now:
self.sync_to_json() self.sync_to_json()
return new_uuid return new_uuid
def get_screenshot(self, watch_uuid):
output_path = "{}/{}".format(self.datastore_path, watch_uuid)
fname = "{}/last-screenshot.png".format(output_path)
if path.isfile(fname):
return fname
return False
def visualselector_data_is_ready(self, watch_uuid): def visualselector_data_is_ready(self, watch_uuid):
output_path = "{}/{}".format(self.datastore_path, watch_uuid) output_path = "{}/{}".format(self.datastore_path, watch_uuid)
screenshot_filename = "{}/last-screenshot.png".format(output_path) screenshot_filename = "{}/last-screenshot.png".format(output_path)
@@ -354,17 +340,34 @@ class ChangeDetectionStore:
return False return False
# Save as PNG, PNG is larger but better for doing visual diff in the future # Save as PNG, PNG is larger but better for doing visual diff in the future
def save_screenshot(self, watch_uuid, screenshot: bytes): def save_screenshot(self, watch_uuid, screenshot: bytes, as_error=False):
output_path = "{}/{}".format(self.datastore_path, watch_uuid)
fname = "{}/last-screenshot.png".format(output_path) if as_error:
with open(fname, 'wb') as f: target_path = os.path.join(self.datastore_path, watch_uuid, "last-error-screenshot.png")
else:
target_path = os.path.join(self.datastore_path, watch_uuid, "last-screenshot.png")
self.data['watching'][watch_uuid].ensure_data_dir_exists()
with open(target_path, 'wb') as f:
f.write(screenshot) f.write(screenshot)
f.close() f.close()
def save_xpath_data(self, watch_uuid, data): def save_error_text(self, watch_uuid, contents):
output_path = "{}/{}".format(self.datastore_path, watch_uuid)
fname = "{}/elements.json".format(output_path) target_path = os.path.join(self.datastore_path, watch_uuid, "last-error.txt")
with open(fname, 'w') as f:
with open(target_path, 'w') as f:
f.write(contents)
def save_xpath_data(self, watch_uuid, data, as_error=False):
if as_error:
target_path = os.path.join(self.datastore_path, watch_uuid, "elements-error.json")
else:
target_path = os.path.join(self.datastore_path, watch_uuid, "elements.json")
with open(target_path, 'w') as f:
f.write(json.dumps(data)) f.write(json.dumps(data))
f.close() f.close()
@@ -521,8 +524,15 @@ class ChangeDetectionStore:
# We incorrectly stored last_changed when there was not a change, and then confused the output list table # We incorrectly stored last_changed when there was not a change, and then confused the output list table
def update_3(self): def update_3(self):
# see https://github.com/dgtlmoon/changedetection.io/pull/835
return
# `last_changed` not needed, we pull that information from the history.txt index
def update_4(self):
for uuid, watch in self.data['watching'].items(): for uuid, watch in self.data['watching'].items():
# Be sure it's recalculated try:
p = watch.history # Remove it from the struct
if watch.history_n < 2: del(watch['last_changed'])
watch['last_changed'] = 0 except:
continue
return

View File

@@ -0,0 +1,7 @@
{% macro pagination(sorted_watches, total_per_page, current_page) %}
{{ sorted_watches|length }}
{% for row in sorted_watches|batch(total_per_page, '&nbsp;') %}
{{ loop.index}}
{% endfor %}
{% endmacro %}

View File

@@ -3,6 +3,9 @@
{% block content %} {% block content %}
<script> <script>
const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}"; const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}";
{% if last_error_screenshot %}
const error_screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid, error_screenshot=1) }}";
{% endif %}
</script> </script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script>
@@ -43,15 +46,31 @@
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script>
<div class="tabs"> <div class="tabs">
<ul> <ul>
<li class="tab" id="default-tab"><a href="#text">Text</a></li> {% if last_error_text %}<li class="tab" id="error-text-tab"><a href="#error-text">Error Text</a></li> {% endif %}
{% if last_error_screenshot %}<li class="tab" id="error-screenshot-tab"><a href="#error-screenshot">Error Screenshot</a></li> {% endif %}
<li class="tab" id=""><a href="#text">Text</a></li>
<li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li> <li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li>
</ul> </ul>
</div> </div>
<div id="diff-ui"> <div id="diff-ui">
<div class="tab-pane-inner" id="error-text">
<div class="snapshot-age error">{{watch_a.error_text_ctime|format_seconds_ago}} seconds ago</div>
<pre>
{{ last_error_text }}
</pre>
</div>
<div class="tab-pane-inner" id="error-screenshot">
<div class="snapshot-age error">{{watch_a.snapshot_error_screenshot_ctime|format_seconds_ago}} seconds ago</div>
<img id="error-screenshot-img" style="max-width: 80%" alt="Current error-ing screenshot from most recent request"/>
</div>
<div class="tab-pane-inner" id="text"> <div class="tab-pane-inner" id="text">
<div class="tip">Pro-tip: Use <strong>show current snapshot</strong> tab to visualise what will be ignored. <div class="tip">Pro-tip: Use <strong>show current snapshot</strong> tab to visualise what will be ignored.
</div> </div>
<div class="snapshot-age">{{watch_a.snapshot_text_ctime|format_timestamp_timeago}}</div>
<table> <table>
<tbody> <tbody>
<tr> <tr>
@@ -70,10 +89,10 @@
<div class="tip"> <div class="tip">
For now, Differences are performed on text, not graphically, only the latest screenshot is available. For now, Differences are performed on text, not graphically, only the latest screenshot is available.
</div> </div>
</br>
{% if is_html_webdriver %} {% if is_html_webdriver %}
{% if screenshot %} {% if screenshot %}
<img style="max-width: 80%" id="screenshot-img" alt="Current screenshot from most recent request"/> <div class="snapshot-age">{{watch_a.snapshot_screenshot_ctime|format_timestamp_timeago}}</div>
<img style="max-width: 80%" id="screenshot-img" alt="Current screenshot from most recent request"/>
{% else %} {% else %}
No screenshot available just yet! Try rechecking the page. No screenshot available just yet! Try rechecking the page.
{% endif %} {% endif %}
@@ -88,7 +107,6 @@
<script defer=""> <script defer="">
var a = document.getElementById('a'); var a = document.getElementById('a');
var b = document.getElementById('b'); var b = document.getElementById('b');
var result = document.getElementById('result'); var result = document.getElementById('result');

View File

@@ -23,9 +23,9 @@
<div class="tabs collapsable"> <div class="tabs collapsable">
<ul> <ul>
<li class="tab" id="default-tab"><a href="#general">General</a></li> <li class="tab" id=""><a href="#general">General</a></li>
<li class="tab"><a href="#request">Request</a></li> <li class="tab"><a href="#request">Request</a></li>
<li class="tab"><a id="visualselector-tab" href="#visualselector">Visual Selector</a></li> <li class="tab"><a id="visualselector-tab" href="#visualselector">Visual Filter Selector</a></li>
<li class="tab"><a href="#filters-and-triggers">Filters &amp; Triggers</a></li> <li class="tab"><a href="#filters-and-triggers">Filters &amp; Triggers</a></li>
<li class="tab"><a href="#notifications">Notifications</a></li> <li class="tab"><a href="#notifications">Notifications</a></li>
</ul> </ul>
@@ -33,7 +33,7 @@
<div class="box-wrap inner"> <div class="box-wrap inner">
<form class="pure-form pure-form-stacked" <form class="pure-form pure-form-stacked"
action="{{ url_for('edit_page', uuid=uuid, next = request.args.get('next') ) }}" method="POST"> action="{{ url_for('edit_page', uuid=uuid, next = request.args.get('next'), unpause_on_save = request.args.get('unpause_on_save')) }}" method="POST">
<input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/> <input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/>
<div class="tab-pane-inner" id="general"> <div class="tab-pane-inner" id="general">
@@ -62,6 +62,12 @@
<div class="pure-control-group"> <div class="pure-control-group">
{{ render_checkbox_field(form.extract_title_as_title) }} {{ render_checkbox_field(form.extract_title_as_title) }}
</div> </div>
<div class="pure-control-group">
{{ render_checkbox_field(form.filter_failure_notification_send) }}
<span class="pure-form-message-inline">
Sends a notification when the filter can no longer be seen on the page, good for knowing when the page changed and your filter will not work anymore.
</span>
</div>
</fieldset> </fieldset>
</div> </div>
@@ -81,6 +87,9 @@
</span> </span>
</div> </div>
{% endif %} {% endif %}
<div class="pure-control-group inline-radio">
{{ render_checkbox_field(form.ignore_status_codes) }}
</div>
<fieldset id="webdriver-override-options"> <fieldset id="webdriver-override-options">
<div class="pure-control-group"> <div class="pure-control-group">
{{ render_field(form.webdriver_delay) }} {{ render_field(form.webdriver_delay) }}
@@ -88,14 +97,17 @@
<strong>If you're having trouble waiting for the page to be fully rendered (text missing etc), try increasing the 'wait' time here.</strong> <strong>If you're having trouble waiting for the page to be fully rendered (text missing etc), try increasing the 'wait' time here.</strong>
<br/> <br/>
This will wait <i>n</i> seconds before extracting the text. This will wait <i>n</i> seconds before extracting the text.
{% if using_global_webdriver_wait %}
<br/><strong>Using the current global default settings</strong>
{% endif %}
</div> </div>
</div> </div>
{% if using_global_webdriver_wait %} <div class="pure-control-group">
<div class="pure-form-message-inline"> {{ render_field(form.webdriver_js_execute_code) }}
<strong>Using the current global default settings</strong> <div class="pure-form-message-inline">
Run this code before performing change detection, handy for filling in fields and other actions <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Run-JavaScript-before-change-detection">More help and examples here</a>
</div>
</div> </div>
{% endif %}
</fieldset> </fieldset>
<fieldset class="pure-group" id="requests-override-options"> <fieldset class="pure-group" id="requests-override-options">
{% if not playwright_enabled %} {% if not playwright_enabled %}
@@ -119,11 +131,7 @@ User-Agent: wonderbra 1.0") }}
\"car\":null \"car\":null
}") }} }") }}
</div> </div>
<div id="ignore-status-codes-option">
{{ render_checkbox_field(form.ignore_status_codes) }}
</div>
</fieldset> </fieldset>
<br/>
</div> </div>
<div class="tab-pane-inner" id="notifications"> <div class="tab-pane-inner" id="notifications">
@@ -154,15 +162,26 @@ User-Agent: wonderbra 1.0") }}
</div> </div>
</fieldset> </fieldset>
<div class="pure-control-group"> <div class="pure-control-group">
{{ render_field(form.css_filter, placeholder=".class-name or #some-id, or other CSS selector rule.", {% set field = render_field(form.css_filter,
class="m-d") }} placeholder=".class-name or #some-id, or other CSS selector rule.",
class="m-d")
%}
{{ field }}
{% if '/text()' in field %}
<span class="pure-form-message-inline"><strong>Note!: //text() function does not work where the &lt;element&gt; contains &lt;![CDATA[]]&gt;</strong></span><br/>
{% endif %}
<span class="pure-form-message-inline"> <span class="pure-form-message-inline">
<ul> <ul>
<li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li> <li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li>
<li>JSON - Limit text to this JSON rule, using <a href="https://pypi.org/project/jsonpath-ng/">JSONPath</a>, prefix with <code>"json:"</code>, use <code>json:$</code> to force re-formatting if required, <a <li>JSON - Limit text to this JSON rule, using <a href="https://pypi.org/project/jsonpath-ng/">JSONPath</a>, prefix with <code>"json:"</code>, use <code>json:$</code> to force re-formatting if required, <a
href="https://jsonpath.com/" target="new">test your JSONPath here</a></li> href="https://jsonpath.com/" target="new">test your JSONPath here</a></li>
<li>XPath - Limit text to this XPath rule, simply start with a forward-slash, example <code>//*[contains(@class, 'sametext')]</code> or <code>xpath://*[contains(@class, 'sametext')]</code>, <a <li>XPath - Limit text to this XPath rule, simply start with a forward-slash,
<ul>
<li>Example: <code>//*[contains(@class, 'sametext')]</code> or <code>xpath://*[contains(@class, 'sametext')]</code>, <a
href="http://xpather.com/" target="new">test your XPath here</a></li> href="http://xpather.com/" target="new">test your XPath here</a></li>
<li>Example: Get all titles from an RSS feed <code>//title/text()</code></li>
</ul>
</li>
</ul> </ul>
Please be sure that you thoroughly understand how to write CSS or JSONPath, XPath selector rules before filing an issue on GitHub! <a Please be sure that you thoroughly understand how to write CSS or JSONPath, XPath selector rules before filing an issue on GitHub! <a
href="https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help">here for more CSS selector help</a>.<br/> href="https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help">here for more CSS selector help</a>.<br/>
@@ -187,7 +206,7 @@ nav
<span class="pure-form-message-inline"> <span class="pure-form-message-inline">
<ul> <ul>
<li>Each line processed separately, any line matching will be ignored (removed before creating the checksum)</li> <li>Each line processed separately, any line matching will be ignored (removed before creating the checksum)</li>
<li>Regular Expression support, wrap the line in forward slash <code>/regex/</code></li> <li>Regular Expression support, wrap the entire line in forward slash <code>/regex/</code></li>
<li>Changing this will affect the comparison checksum which may trigger an alert</li> <li>Changing this will affect the comparison checksum which may trigger an alert</li>
<li>Use the preview/show current tab to see ignores</li> <li>Use the preview/show current tab to see ignores</li>
</ul> </ul>
@@ -230,8 +249,15 @@ Unavailable") }}
{{ render_field(form.extract_text, rows=5, placeholder="\d+ online") }} {{ render_field(form.extract_text, rows=5, placeholder="\d+ online") }}
<span class="pure-form-message-inline"> <span class="pure-form-message-inline">
<ul> <ul>
<li>Extracts text in the final output after other filters using regular expressions, for example <code>\d+ online</code></li> <li>Extracts text in the final output (line by line) after other filters using regular expressions;
<li>One line per regular-expression.</li> <ul>
<li>Regular expression &dash; example <code>/reports.+?2022/i</code></li>
<li>Use <code>//(?aiLmsux))</code> type flags (more <a href="https://docs.python.org/3/library/re.html#index-15">information here</a>)<br/></li>
<li>Keyword example &dash; example <code>Out of stock</code></li>
<li>Use groups to extract just that text &dash; example <code>/reports.+?(\d+)/i</code> returns a list of years only</li>
</ul>
</li>
<li>One line per regular-expression/ string match</li>
</ul> </ul>
</span> </span>
</div> </div>
@@ -240,7 +266,7 @@ Unavailable") }}
<div class="tab-pane-inner visual-selector-ui" id="visualselector"> <div class="tab-pane-inner visual-selector-ui" id="visualselector">
<img id="beta-logo" src="{{url_for('static_content', group='images', filename='beta-logo.png')}}"> <img id="beta-logo" src="{{url_for('static_content', group='images', filename='beta-logo.png')}}">
<strong>Pro-tip:</strong> This tool is only for limiting which elements will be included on a change-detection, not for interacting with browser directly.
<fieldset> <fieldset>
<div class="pure-control-group"> <div class="pure-control-group">
{% if visualselector_enabled %} {% if visualselector_enabled %}
@@ -280,9 +306,7 @@ Unavailable") }}
<div id="actions"> <div id="actions">
<div class="pure-control-group"> <div class="pure-control-group">
{{ render_button(form.save_button) }}
{{ render_button(form.save_button) }} {{ render_button(form.save_and_preview_button) }}
<a href="{{url_for('form_delete', uuid=uuid)}}" <a href="{{url_for('form_delete', uuid=uuid)}}"
class="pure-button button-small button-error ">Delete</a> class="pure-button button-small button-error ">Delete</a>
<a href="{{url_for('clear_watch_history', uuid=uuid)}}" <a href="{{url_for('clear_watch_history', uuid=uuid)}}"

View File

@@ -5,7 +5,7 @@
<div class="tabs collapsable"> <div class="tabs collapsable">
<ul> <ul>
<li class="tab" id="default-tab"><a href="#url-list">URL List</a></li> <li class="tab" id=""><a href="#url-list">URL List</a></li>
<li class="tab"><a href="#distill-io">Distill.io</a></li> <li class="tab"><a href="#distill-io">Distill.io</a></li>
</ul> </ul>
</div> </div>

View File

@@ -3,23 +3,39 @@
{% block content %} {% block content %}
<script> <script>
const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}"; const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}";
{% if last_error_screenshot %}
const error_screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid, error_screenshot=1) }}";
{% endif %}
</script> </script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script>
<div id="settings">
<h1>Current - {{watch.last_checked|format_timestamp_timeago}}</h1>
</div>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script>
<div class="tabs"> <div class="tabs">
<ul> <ul>
<li class="tab" id="default-tab"><a href="#text">Text</a></li> {% if last_error_text %}<li class="tab" id="error-text-tab"><a href="#error-text">Error Text</a></li> {% endif %}
{% if last_error_screenshot %}<li class="tab" id="error-screenshot-tab"><a href="#error-screenshot">Error Screenshot</a></li> {% endif %}
{% if history_n > 0 %}
<li class="tab" id="text-tab"><a href="#text">Text</a></li>
<li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li> <li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li>
{% endif %}
</ul> </ul>
</div> </div>
<div id="diff-ui"> <div id="diff-ui">
<div class="tab-pane-inner" id="error-text">
<div class="snapshot-age error">{{watch.error_text_ctime|format_seconds_ago}} seconds ago</div>
<pre>
{{ last_error_text }}
</pre>
</div>
<div class="tab-pane-inner" id="error-screenshot">
<div class="snapshot-age error">{{watch.snapshot_error_screenshot_ctime|format_seconds_ago}} seconds ago</div>
<img id="error-screenshot-img" style="max-width: 80%" alt="Current erroring screenshot from most recent request"/>
</div>
<div class="tab-pane-inner" id="text"> <div class="tab-pane-inner" id="text">
<div class="snapshot-age">{{watch.snapshot_text_ctime|format_timestamp_timeago}}</div>
<span class="ignored">Grey lines are ignored</span> <span class="triggered">Blue lines are triggers</span> <span class="ignored">Grey lines are ignored</span> <span class="triggered">Blue lines are triggers</span>
<table> <table>
<tbody> <tbody>
@@ -33,6 +49,7 @@
</tbody> </tbody>
</table> </table>
</div> </div>
<div class="tab-pane-inner" id="screenshot"> <div class="tab-pane-inner" id="screenshot">
<div class="tip"> <div class="tip">
For now, Differences are performed on text, not graphically, only the latest screenshot is available. For now, Differences are performed on text, not graphically, only the latest screenshot is available.

View File

@@ -16,7 +16,7 @@
<div class="edit-form"> <div class="edit-form">
<div class="tabs collapsable"> <div class="tabs collapsable">
<ul> <ul>
<li class="tab" id="default-tab"><a href="#general">General</a></li> <li class="tab" id=""><a href="#general">General</a></li>
<li class="tab"><a href="#notifications">Notifications</a></li> <li class="tab"><a href="#notifications">Notifications</a></li>
<li class="tab"><a href="#fetching">Fetching</a></li> <li class="tab"><a href="#fetching">Fetching</a></li>
<li class="tab"><a href="#filters">Global Filters</a></li> <li class="tab"><a href="#filters">Global Filters</a></li>
@@ -36,7 +36,13 @@
{{ render_field(form.requests.form.jitter_seconds, class="jitter_seconds") }} {{ render_field(form.requests.form.jitter_seconds, class="jitter_seconds") }}
<span class="pure-form-message-inline">Example - 3 seconds random jitter could trigger up to 3 seconds earlier or up to 3 seconds later</span> <span class="pure-form-message-inline">Example - 3 seconds random jitter could trigger up to 3 seconds earlier or up to 3 seconds later</span>
</div> </div>
<div class="pure-control-group">
{{ render_field(form.application.form.filter_failure_notification_threshold_attempts, class="filter_failure_notification_threshold_attempts") }}
<span class="pure-form-message-inline">After this many consecutive times that the CSS/xPath filter is missing, send a notification
<br/>
Set to <strong>0</strong> to disable
</span>
</div>
<div class="pure-control-group"> <div class="pure-control-group">
{% if not hide_remove_pass %} {% if not hide_remove_pass %}
{% if current_user.is_authenticated %} {% if current_user.is_authenticated %}
@@ -63,12 +69,6 @@
{{ render_checkbox_field(form.application.form.extract_title_as_title) }} {{ render_checkbox_field(form.application.form.extract_title_as_title) }}
<span class="pure-form-message-inline">Note: This will automatically apply to all existing watches.</span> <span class="pure-form-message-inline">Note: This will automatically apply to all existing watches.</span>
</div> </div>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.real_browser_save_screenshot) }}
<span class="pure-form-message-inline">When using a Chrome browser, a screenshot from the last check will be available on the Diff page</span>
</div>
<div class="pure-control-group"> <div class="pure-control-group">
{{ render_checkbox_field(form.application.form.empty_pages_are_a_change) }} {{ render_checkbox_field(form.application.form.empty_pages_are_a_change) }}
<span class="pure-form-message-inline">When a page contains HTML, but no renderable text appears (empty page), is this considered a change?</span> <span class="pure-form-message-inline">When a page contains HTML, but no renderable text appears (empty page), is this considered a change?</span>
@@ -148,7 +148,7 @@ nav
<ul> <ul>
<li>Note: This is applied globally in addition to the per-watch rules.</li> <li>Note: This is applied globally in addition to the per-watch rules.</li>
<li>Each line processed separately, any line matching will be ignored (removed before creating the checksum)</li> <li>Each line processed separately, any line matching will be ignored (removed before creating the checksum)</li>
<li>Regular Expression support, wrap the line in forward slash <code>/regex/</code></li> <li>Regular Expression support, wrap the entire line in forward slash <code>/regex/</code></li>
<li>Changing this will affect the comparison checksum which may trigger an alert</li> <li>Changing this will affect the comparison checksum which may trigger an alert</li>
<li>Use the preview/show current tab to see ignores</li> <li>Use the preview/show current tab to see ignores</li>
</ul> </ul>

View File

@@ -1,20 +1,28 @@
{% extends 'base.html' %} {% extends 'base.html' %}
{% block content %} {% block content %}
{% from '_helpers.jinja' import render_simple_field %} {% from '_helpers.jinja' import render_simple_field, render_field %}
{% from '_pagination.jinja' import pagination %}
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='jquery-3.6.0.min.js')}}"></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='jquery-3.6.0.min.js')}}"></script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='watch-overview.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='watch-overview.js')}}" defer></script>
<div class="box"> <div class="box">
<form class="pure-form" action="{{ url_for('form_watch_add') }}" method="POST" id="new-watch-form"> <form class="pure-form" action="{{ url_for('form_quick_watch_add') }}" method="POST" id="new-watch-form">
<input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/> <input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/>
<fieldset> <fieldset>
<legend>Add a new change detection watch</legend> <legend>Add a new change detection watch</legend>
{{ render_simple_field(form.url, placeholder="https://...", required=true) }} <div id="watch-add-wrapper-zone">
{{ render_simple_field(form.tag, value=active_tag if active_tag else '', placeholder="watch group") }} <div>
<button type="submit" class="pure-button pure-button-primary">Watch</button> {{ render_simple_field(form.url, placeholder="https://...", required=true) }}
{{ render_simple_field(form.tag, value=active_tag if active_tag else '', placeholder="watch group") }}
</div>
<div>
{{ render_simple_field(form.watch_submit_button, title="Watch this URL!" ) }}
{{ render_simple_field(form.edit_and_watch_submit_button, title="Edit first then Watch") }}
</div>
</div>
</fieldset> </fieldset>
<span style="color:#eee; font-size: 80%;"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread.svg')}}" /> Tip: You can also add 'shared' watches. <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Sharing-a-Watch">More info</a></a></span> <span style="color:#eee; font-size: 80%;"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread-white.svg')}}" /> Tip: You can also add 'shared' watches. <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Sharing-a-Watch">More info</a></a></span>
</form> </form>
<div> <div>
<a href="{{url_for('index')}}" class="pure-button button-tag {{'active' if not active_tag }}">All</a> <a href="{{url_for('index')}}" class="pure-button button-tag {{'active' if not active_tag }}">All</a>
@@ -25,22 +33,32 @@
{% endfor %} {% endfor %}
</div> </div>
{% set sort_order = request.args.get('order', 'asc') == 'asc' %}
{% set sort_attribute = request.args.get('sort', 'last_changed') %}
{% set pagination_page = request.args.get('page', 0) %}
<div id="watch-table-wrapper"> <div id="watch-table-wrapper">
<table class="pure-table pure-table-striped watch-table"> <table class="pure-table pure-table-striped watch-table">
<thead> <thead>
<tr> <tr>
<th>#</th> <th>#</th>
<th></th> <th></th>
<th></th> {% set link_order = "desc" if sort_order else "asc" %}
<th>Last Checked</th> {% set arrow_span = "" %}
<th>Last Changed</th> <th><a class="{{ 'active '+link_order if sort_attribute == 'label' else 'inactive' }}" href="{{url_for('index', sort='label', order=link_order)}}">Website <span class='arrow {{link_order}}'></span></a></th>
<th><a class="{{ 'active '+link_order if sort_attribute == 'last_checked' else 'inactive' }}" href="{{url_for('index', sort='last_checked', order=link_order)}}">Last Checked <span class='arrow {{link_order}}'></span></a></th>
<th><a class="{{ 'active '+link_order if sort_attribute == 'last_changed' else 'inactive' }}" href="{{url_for('index', sort='last_changed', order=link_order)}}">Last Changed <span class='arrow {{link_order}}'></span></a></th>
<th></th> <th></th>
</tr> </tr>
</thead> </thead>
<tbody> <tbody>
{% set sorted_watches = watches|sort(attribute=sort_attribute, reverse=sort_order) %}
{% for watch in sorted_watches %}
{% for watch in watches|sort(attribute='last_changed', reverse=True) %} {# WIP for pagination, disabled for now
{% if not ( loop.index >= 3 and loop.index <=4) %}{% continue %}{% endif %} -->
#}
<tr id="{{ watch.uuid }}" <tr id="{{ watch.uuid }}"
class="{{ loop.cycle('pure-table-odd', 'pure-table-even') }} class="{{ loop.cycle('pure-table-odd', 'pure-table-even') }}
{% if watch.last_error is defined and watch.last_error != False %}error{% endif %} {% if watch.last_error is defined and watch.last_error != False %}error{% endif %}
@@ -49,8 +67,10 @@
{% if watch.newest_history_key| int > watch.last_viewed and watch.history_n>=2 %}unviewed{% endif %} {% if watch.newest_history_key| int > watch.last_viewed and watch.history_n>=2 %}unviewed{% endif %}
{% if watch.uuid in queued_uuids %}queued{% endif %}"> {% if watch.uuid in queued_uuids %}queued{% endif %}">
<td class="inline">{{ loop.index }}</td> <td class="inline">{{ loop.index }}</td>
<td class="inline paused-state state-{{watch.paused}}"><a href="{{url_for('index', pause=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='pause.svg')}}" alt="Pause" title="Pause"/></a></td> <td class="inline watch-controls">
<a class="state-{{'on' if watch.paused }}" href="{{url_for('index', op='pause', uuid=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='pause.svg')}}" alt="Pause checks" title="Pause checks"/></a>
<a class="state-{{'on' if watch.notification_muted}}" href="{{url_for('index', op='mute', uuid=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='bell-off.svg')}}" alt="Mute notifications" title="Mute notifications"/></a>
</td>
<td class="title-col inline">{{watch.title if watch.title is not none and watch.title|length > 0 else watch.url}} <td class="title-col inline">{{watch.title if watch.title is not none and watch.title|length > 0 else watch.url}}
<a class="external" target="_blank" rel="noopener" href="{{ watch.url.replace('source:','') }}"></a> <a class="external" target="_blank" rel="noopener" href="{{ watch.url.replace('source:','') }}"></a>
<a href="{{url_for('form_share_put_watch', uuid=watch.uuid)}}"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread.svg')}}" /></a> <a href="{{url_for('form_share_put_watch', uuid=watch.uuid)}}"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread.svg')}}" /></a>
@@ -81,7 +101,7 @@
{% if watch.history_n >= 2 %} {% if watch.history_n >= 2 %}
<a href="{{ url_for('diff_history_page', uuid=watch.uuid) }}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary diff-link">Diff</a> <a href="{{ url_for('diff_history_page', uuid=watch.uuid) }}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary diff-link">Diff</a>
{% else %} {% else %}
{% if watch.history_n == 1 %} {% if watch.history_n == 1 or (watch.history_n ==0 and watch.error_text_ctime )%}
<a href="{{ url_for('preview_page', uuid=watch.uuid)}}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary">Preview</a> <a href="{{ url_for('preview_page', uuid=watch.uuid)}}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary">Preview</a>
{% endif %} {% endif %}
{% endif %} {% endif %}
@@ -104,6 +124,10 @@
<a href="{{ url_for('rss', tag=active_tag , token=app_rss_token)}}"><img alt="RSS Feed" id="feed-icon" src="{{url_for('static_content', group='images', filename='Generic_Feed-icon.svg')}}" height="15"></a> <a href="{{ url_for('rss', tag=active_tag , token=app_rss_token)}}"><img alt="RSS Feed" id="feed-icon" src="{{url_for('static_content', group='images', filename='Generic_Feed-icon.svg')}}" height="15"></a>
</li> </li>
</ul> </ul>
{# WIP for pagination, disabled for now
{{ pagination(sorted_watches,3, pagination_page) }}
#}
</div> </div>
</div> </div>
{% endblock %} {% endblock %}

View File

@@ -2,7 +2,7 @@
import time import time
from flask import url_for from flask import url_for
from ..util import live_server_setup from ..util import live_server_setup, wait_for_all_checks
import logging import logging
@@ -29,14 +29,8 @@ def test_fetch_webdriver_content(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
time.sleep(3) time.sleep(3)
attempt = 0
while attempt < 20: wait_for_all_checks(client)
res = client.get(url_for("index"))
if not b'Checking now' in res.data:
break
logging.getLogger().info("Waiting for check to not say 'Checking now'..")
time.sleep(3)
attempt += 1
res = client.get( res = client.get(

View File

@@ -19,7 +19,6 @@ def test_check_access_control(app, client):
) )
assert b"Password protection enabled." in res.data assert b"Password protection enabled." in res.data
assert b"LOG OUT" not in res.data
# Check we hit the login # Check we hit the login
res = c.get(url_for("index"), follow_redirects=True) res = c.get(url_for("index"), follow_redirects=True)
@@ -38,7 +37,40 @@ def test_check_access_control(app, client):
follow_redirects=True follow_redirects=True
) )
# Yes we are correctly logged in
assert b"LOG OUT" in res.data assert b"LOG OUT" in res.data
# 598 - Password should be set and not accidently removed
res = c.post(
url_for("settings_page"),
data={
"requests-time_between_check-minutes": 180,
'application-fetch_backend': "html_requests"},
follow_redirects=True
)
res = c.get(url_for("logout"),
follow_redirects=True)
res = c.get(url_for("settings_page"),
follow_redirects=True)
assert b"Login" in res.data
res = c.get(url_for("login"))
assert b"Login" in res.data
res = c.post(
url_for("login"),
data={"password": "foobar"},
follow_redirects=True
)
# Yes we are correctly logged in
assert b"LOG OUT" in res.data
res = c.get(url_for("settings_page")) res = c.get(url_for("settings_page"))
# Menu should be available now # Menu should be available now

View File

@@ -90,6 +90,14 @@ def test_check_basic_change_detection_functionality(client, live_server):
res = client.get(url_for("diff_history_page", uuid="first")) res = client.get(url_for("diff_history_page", uuid="first"))
assert b'Compare newest' in res.data assert b'Compare newest' in res.data
# Check the [preview] pulls the right one
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
assert b'which has this one new line' in res.data
assert b'Which is across multiple lines' not in res.data
time.sleep(2) time.sleep(2)
# Do this a few times.. ensures we dont accidently set the status # Do this a few times.. ensures we dont accidently set the status

View File

@@ -11,16 +11,17 @@ def test_setup(live_server):
live_server_setup(live_server) live_server_setup(live_server)
def test_error_handler(client, live_server): def _runner_test_http_errors(client, live_server, http_code, expected_text):
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write("Now you going to get a {} error code\n".format(http_code))
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page # Add our URL to the import page
test_url = url_for('test_endpoint', test_url = url_for('test_endpoint',
status_code=403, status_code=http_code,
_external=True) _external=True)
res = client.post( res = client.post(
url_for("import_page"), url_for("import_page"),
data={"urls": test_url}, data={"urls": test_url},
@@ -28,20 +29,39 @@ def test_error_handler(client, live_server):
) )
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(2)
res = client.get(url_for("index")) res = client.get(url_for("index"))
# no change
assert b'unviewed' not in res.data assert b'unviewed' not in res.data
assert b'Status Code 403' in res.data assert bytes(expected_text.encode('utf-8')) in res.data
assert bytes("just now".encode('utf-8')) in res.data
# Error viewing tabs should appear
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
assert b'Error Text' in res.data
# 'Error Screenshot' only when in playwright mode
#assert b'Error Screenshot' in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_http_error_handler(client, live_server):
_runner_test_http_errors(client, live_server, 403, 'Access denied')
_runner_test_http_errors(client, live_server, 404, 'Page not found')
_runner_test_http_errors(client, live_server, 500, '(Internal server Error) received')
_runner_test_http_errors(client, live_server, 400, 'Error - Request returned a HTTP error code 400')
# Just to be sure error text is properly handled # Just to be sure error text is properly handled
def test_error_text_handler(client, live_server): def test_DNS_errors(client, live_server):
# Give the endpoint time to spin up # Give the endpoint time to spin up
time.sleep(1) time.sleep(1)
@@ -53,13 +73,11 @@ def test_error_text_handler(client, live_server):
) )
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'Name or service not known' in res.data assert b'Name or service not known' in res.data
# Should always record that we tried
assert bytes("just now".encode('utf-8')) in res.data assert bytes("just now".encode('utf-8')) in res.data

View File

@@ -15,7 +15,7 @@ def set_original_response():
</br> </br>
So let's see what happens. </br> So let's see what happens. </br>
<div id="sametext">Some text thats the same</div> <div id="sametext">Some text thats the same</div>
<div id="changetext">Some text that will change</div> <div class="changetext">Some text that will change</div>
</body> </body>
</html> </html>
""" """
@@ -33,7 +33,8 @@ def set_modified_response():
</br> </br>
So let's see what happens. </br> So let's see what happens. </br>
<div id="sametext">Some text thats the same</div> <div id="sametext">Some text thats the same</div>
<div id="changetext">Some text that did change ( 1000 online <br/> 80 guests)</div> <div class="changetext">Some text that did change ( 1000 online <br/> 80 guests<br/> 2000 online )</div>
<div class="changetext">SomeCase insensitive 3456</div>
</body> </body>
</html> </html>
""" """
@@ -44,11 +45,78 @@ def set_modified_response():
return None return None
def test_check_filter_and_regex_extract(client, live_server): def set_multiline_response():
sleep_time_for_fetch_thread = 3 test_return_data = """<html>
<body>
<p>Something <br/>
across 6 billion multiple<br/>
lines
</p>
<div>aaand something lines</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
return None
def test_setup(client, live_server):
live_server_setup(live_server) live_server_setup(live_server)
css_filter = "#changetext"
def test_check_filter_multiline(client, live_server):
set_multiline_response()
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(3)
# Goto the edit page, add our ignore text
# Add our URL to the import page
res = client.post(
url_for("edit_page", uuid="first"),
data={"css_filter": '',
'extract_text': '/something.+?6 billion.+?lines/si',
"url": test_url,
"tag": "",
"headers": "",
'fetch_backend': "html_requests"
},
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(3)
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
assert b'<div class="">Something' in res.data
assert b'<div class="">across 6 billion multiple' in res.data
assert b'<div class="">lines' in res.data
# but the last one, which also says 'lines' shouldnt be here (non-greedy match checking)
assert b'aaand something lines' not in res.data
def test_check_filter_and_regex_extract(client, live_server):
sleep_time_for_fetch_thread = 3
css_filter = ".changetext"
set_original_response() set_original_response()
@@ -64,6 +132,7 @@ def test_check_filter_and_regex_extract(client, live_server):
) )
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
time.sleep(1)
# Trigger a check # Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
@@ -75,7 +144,7 @@ def test_check_filter_and_regex_extract(client, live_server):
res = client.post( res = client.post(
url_for("edit_page", uuid="first"), url_for("edit_page", uuid="first"),
data={"css_filter": css_filter, data={"css_filter": css_filter,
'extract_text': '\d+ online\n\d+ guests', 'extract_text': '\d+ online\r\n\d+ guests\r\n/somecase insensitive \d+/i\r\n/somecase insensitive (345\d)/i',
"url": test_url, "url": test_url,
"tag": "", "tag": "",
"headers": "", "headers": "",
@@ -86,15 +155,6 @@ def test_check_filter_and_regex_extract(client, live_server):
assert b"Updated watch." in res.data assert b"Updated watch." in res.data
# Check it saved
res = client.get(
url_for("edit_page", uuid="first"),
)
assert b'\d+ online' in res.data
# Trigger a check
# client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -119,9 +179,20 @@ def test_check_filter_and_regex_extract(client, live_server):
# Class will be blank for now because the frontend didnt apply the diff # Class will be blank for now because the frontend didnt apply the diff
assert b'<div class="">1000 online' in res.data assert b'<div class="">1000 online' in res.data
# All regex matching should be here
assert b'<div class="">2000 online' in res.data
# Both regexs should be here # Both regexs should be here
assert b'<div class="">80 guests' in res.data assert b'<div class="">80 guests' in res.data
# Regex with flag handling should be here
assert b'<div class="">SomeCase insensitive 3456' in res.data
# Singular group from /somecase insensitive (345\d)/i
assert b'<div class="">3456' in res.data
# Regex with multiline flag handling should be here
# Should not be here # Should not be here
assert b'Some text that did change' not in res.data assert b'Some text that did change' not in res.data

View File

@@ -0,0 +1,134 @@
#!/usr/bin/python3
# https://www.reddit.com/r/selfhosted/comments/wa89kp/comment/ii3a4g7/?context=3
import os
import time
from flask import url_for
from .util import set_original_response, live_server_setup
from changedetectionio.model import App
def set_response_without_filter():
test_return_data = """<html>
<body>
Some initial text</br>
<p>Which is across multiple lines</p>
</br>
So let's see what happens. </br>
<div id="nope-doesnt-exist">Some text thats the same</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
return None
def set_response_with_filter():
test_return_data = """<html>
<body>
Some initial text</br>
<p>Which is across multiple lines</p>
</br>
So let's see what happens. </br>
<div class="ticket-available">Ticket now on sale!</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
return None
def test_filter_doesnt_exist_then_exists_should_get_notification(client, live_server):
# Filter knowingly doesn't exist, like someone setting up a known filter to see if some cinema tickets are on sale again
# And the page has that filter available
# Then I should get a notification
live_server_setup(live_server)
# Give the endpoint time to spin up
time.sleep(1)
set_response_without_filter()
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("form_quick_watch_add"),
data={"url": test_url, "tag": 'cinema'},
follow_redirects=True
)
assert b"Watch added" in res.data
# Give the thread time to pick up the first version
time.sleep(3)
# Goto the edit page, add our ignore text
# Add our URL to the import page
url = url_for('test_notification_endpoint', _external=True)
notification_url = url.replace('http', 'json')
print(">>>> Notification URL: " + notification_url)
# Just a regular notification setting, this will be used by the special 'filter not found' notification
notification_form_data = {"notification_urls": notification_url,
"notification_title": "New ChangeDetection.io Notification - {watch_url}",
"notification_body": "BASE URL: {base_url}\n"
"Watch URL: {watch_url}\n"
"Watch UUID: {watch_uuid}\n"
"Watch title: {watch_title}\n"
"Watch tag: {watch_tag}\n"
"Preview: {preview_url}\n"
"Diff URL: {diff_url}\n"
"Snapshot: {current_snapshot}\n"
"Diff: {diff}\n"
"Diff Full: {diff_full}\n"
":-)",
"notification_format": "Text"}
notification_form_data.update({
"url": test_url,
"tag": "my tag",
"title": "my title",
"headers": "",
"css_filter": '.ticket-available',
"fetch_backend": "html_requests"})
res = client.post(
url_for("edit_page", uuid="first"),
data=notification_form_data,
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(3)
# Shouldn't exist, shouldn't have fired
assert not os.path.isfile("test-datastore/notification.txt")
# Now the filter should exist
set_response_with_filter()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3)
assert os.path.isfile("test-datastore/notification.txt")
with open("test-datastore/notification.txt", 'r') as f:
notification = f.read()
assert 'Ticket now on sale' in notification
os.unlink("test-datastore/notification.txt")
# Test that if it gets removed, then re-added, we get a notification
# Remove the target and re-add it, we should get a new notification
set_response_without_filter()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3)
assert not os.path.isfile("test-datastore/notification.txt")
set_response_with_filter()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3)
assert os.path.isfile("test-datastore/notification.txt")
# Also test that the filter was updated after the first one was requested

View File

@@ -0,0 +1,144 @@
import os
import time
import re
from flask import url_for
from .util import set_original_response, live_server_setup
from changedetectionio.model import App
def set_response_with_filter():
test_return_data = """<html>
<body>
Some initial text</br>
<p>Which is across multiple lines</p>
</br>
So let's see what happens. </br>
<div id="nope-doesnt-exist">Some text thats the same</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
return None
def run_filter_test(client, content_filter):
# Give the endpoint time to spin up
time.sleep(1)
# cleanup for the next
client.get(
url_for("form_delete", uuid="all"),
follow_redirects=True
)
if os.path.isfile("test-datastore/notification.txt"):
os.unlink("test-datastore/notification.txt")
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("form_quick_watch_add"),
data={"url": test_url, "tag": ''},
follow_redirects=True
)
assert b"Watch added" in res.data
# Give the thread time to pick up the first version
time.sleep(3)
# Goto the edit page, add our ignore text
# Add our URL to the import page
url = url_for('test_notification_endpoint', _external=True)
notification_url = url.replace('http', 'json')
print(">>>> Notification URL: " + notification_url)
# Just a regular notification setting, this will be used by the special 'filter not found' notification
notification_form_data = {"notification_urls": notification_url,
"notification_title": "New ChangeDetection.io Notification - {watch_url}",
"notification_body": "BASE URL: {base_url}\n"
"Watch URL: {watch_url}\n"
"Watch UUID: {watch_uuid}\n"
"Watch title: {watch_title}\n"
"Watch tag: {watch_tag}\n"
"Preview: {preview_url}\n"
"Diff URL: {diff_url}\n"
"Snapshot: {current_snapshot}\n"
"Diff: {diff}\n"
"Diff Full: {diff_full}\n"
":-)",
"notification_format": "Text"}
notification_form_data.update({
"url": test_url,
"tag": "my tag",
"title": "my title",
"headers": "",
"filter_failure_notification_send": 'y',
"css_filter": content_filter,
"fetch_backend": "html_requests"})
res = client.post(
url_for("edit_page", uuid="first"),
data=notification_form_data,
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(3)
# Now the notification should not exist, because we didnt reach the threshold
assert not os.path.isfile("test-datastore/notification.txt")
for i in range(0, App._FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT):
res = client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3)
# We should see something in the frontend
assert b'Warning, filter' in res.data
# Now it should exist and contain our "filter not found" alert
assert os.path.isfile("test-datastore/notification.txt")
notification = False
with open("test-datastore/notification.txt", 'r') as f:
notification = f.read()
assert 'CSS/xPath filter was not present in the page' in notification
assert content_filter.replace('"', '\\"') in notification
# Remove it and prove that it doesnt trigger when not expected
os.unlink("test-datastore/notification.txt")
set_response_with_filter()
for i in range(0, App._FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT):
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3)
# It should have sent a notification, but..
assert os.path.isfile("test-datastore/notification.txt")
# but it should not contain the info about the failed filter
with open("test-datastore/notification.txt", 'r') as f:
notification = f.read()
assert not 'CSS/xPath filter was not present in the page' in notification
# cleanup for the next
client.get(
url_for("form_delete", uuid="all"),
follow_redirects=True
)
os.unlink("test-datastore/notification.txt")
def test_setup(live_server):
live_server_setup(live_server)
def test_check_css_filter_failure_notification(client, live_server):
set_original_response()
time.sleep(1)
run_filter_test(client, '#nope-doesnt-exist')
def test_check_xpath_filter_failure_notification(client, live_server):
set_original_response()
time.sleep(1)
run_filter_test(client, '//*[@id="nope-doesnt-exist"]')
# Test that notification is never sent

View File

@@ -137,54 +137,3 @@ def test_403_page_check_works_with_ignore_status_code(client, live_server):
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' in res.data assert b'unviewed' in res.data
# Tests the whole stack works with staus codes ignored
def test_403_page_check_fails_without_ignore_status_code(client, live_server):
sleep_time_for_fetch_thread = 3
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', status_code=403, _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# Goto the edit page, check our ignore option
# Add our URL to the import page
res = client.post(
url_for("edit_page", uuid="first"),
data={"url": test_url, "tag": "", "headers": "", 'fetch_backend': "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# Make a change
set_some_changed_response()
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# It should have 'unviewed' still
# Because it should be looking at only that 'sametext' id
res = client.get(url_for("index"))
assert b'Status Code 403' in res.data

View File

@@ -36,7 +36,7 @@ def test_check_notification(client, live_server):
# Add our URL to the import page # Add our URL to the import page
test_url = url_for('test_endpoint', _external=True) test_url = url_for('test_endpoint', _external=True)
res = client.post( res = client.post(
url_for("form_watch_add"), url_for("form_quick_watch_add"),
data={"url": test_url, "tag": ''}, data={"url": test_url, "tag": ''},
follow_redirects=True follow_redirects=True
) )
@@ -172,7 +172,7 @@ def test_notification_validation(client, live_server):
# Add our URL to the import page # Add our URL to the import page
test_url = url_for('test_endpoint', _external=True) test_url = url_for('test_endpoint', _external=True)
res = client.post( res = client.post(
url_for("form_watch_add"), url_for("form_quick_watch_add"),
data={"url": test_url, "tag": 'nice one'}, data={"url": test_url, "tag": 'nice one'},
follow_redirects=True follow_redirects=True
) )

View File

@@ -16,7 +16,7 @@ def test_check_notification_error_handling(client, live_server):
# use a different URL so that it doesnt interfere with the actual check until we are ready # use a different URL so that it doesnt interfere with the actual check until we are ready
test_url = url_for('test_endpoint', _external=True) test_url = url_for('test_endpoint', _external=True)
res = client.post( res = client.post(
url_for("form_watch_add"), url_for("form_quick_watch_add"),
data={"url": "https://changedetection.io/CHANGELOG.txt", "tag": ''}, data={"url": "https://changedetection.io/CHANGELOG.txt", "tag": ''},
follow_redirects=True follow_redirects=True
) )

View File

@@ -0,0 +1,43 @@
#!/usr/bin/python3
import time
from flask import url_for
from .util import live_server_setup
def set_original_ignore_response():
test_return_data = """<html>
<body>
<span>The price is</span><span>$<!-- -->90<!-- -->.<!-- -->74</span>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def test_obfuscations(client, live_server):
set_original_ignore_response()
live_server_setup(live_server)
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
# Give the thread time to pick it up
time.sleep(3)
# Check HTML conversion detected and workd
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
assert b'$90.74' in res.data

View File

@@ -86,6 +86,7 @@ def test_check_xpath_filter_utf8(client, live_server):
follow_redirects=True follow_redirects=True
) )
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
time.sleep(1)
res = client.post( res = client.post(
url_for("edit_page", uuid="first"), url_for("edit_page", uuid="first"),
data={"css_filter": filter, "url": test_url, "tag": "", "headers": "", 'fetch_backend': "html_requests"}, data={"css_filter": filter, "url": test_url, "tag": "", "headers": "", 'fetch_backend': "html_requests"},
@@ -99,6 +100,68 @@ def test_check_xpath_filter_utf8(client, live_server):
assert b'Deleted' in res.data assert b'Deleted' in res.data
# Handle utf-8 charset replies https://github.com/dgtlmoon/changedetection.io/pull/613
def test_check_xpath_text_function_utf8(client, live_server):
filter='//item/title/text()'
d='''<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>rpilocator.com</title>
<link>https://rpilocator.com</link>
<description>Find Raspberry Pi Computers in Stock</description>
<lastBuildDate>Thu, 19 May 2022 23:27:30 GMT</lastBuildDate>
<image>
<url>https://rpilocator.com/favicon.png</url>
<title>rpilocator.com</title>
<link>https://rpilocator.com/</link>
<width>32</width>
<height>32</height>
</image>
<item>
<title>Stock Alert (UK): RPi CM4</title>
<foo>something else unrelated</foo>
</item>
<item>
<title>Stock Alert (UK): Big monitor</title>
<foo>something else unrelated</foo>
</item>
</channel>
</rss>'''
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(d)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True, content_type="application/rss+xml;charset=UTF-8")
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(1)
res = client.post(
url_for("edit_page", uuid="first"),
data={"css_filter": filter, "url": test_url, "tag": "", "headers": "", 'fetch_backend': "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(3)
res = client.get(url_for("index"))
assert b'Unicode strings with encoding declaration are not supported.' not in res.data
# The service should echo back the request headers
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
assert b'<div class="">Stock Alert (UK): RPi CM4' in res.data
assert b'<div class="">Stock Alert (UK): Big monitor' in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_check_markup_xpath_filter_restriction(client, live_server): def test_check_markup_xpath_filter_restriction(client, live_server):
sleep_time_for_fetch_thread = 3 sleep_time_for_fetch_thread = 3

View File

@@ -2,6 +2,8 @@
from flask import make_response, request from flask import make_response, request
from flask import url_for from flask import url_for
import logging
import time
def set_original_response(): def set_original_response():
test_return_data = """<html> test_return_data = """<html>
@@ -68,6 +70,31 @@ def extract_api_key_from_UI(client):
api_key = m.group(1) api_key = m.group(1)
return api_key.strip() return api_key.strip()
# kinda funky, but works for now
def extract_UUID_from_client(client):
import re
res = client.get(
url_for("index"),
)
# <span id="api-key">{{api_key}}</span>
m = re.search('edit/(.+?)"', str(res.data))
uuid = m.group(1)
return uuid.strip()
def wait_for_all_checks(client):
# Loop waiting until done..
attempt=0
while attempt < 60:
time.sleep(1)
res = client.get(url_for("index"))
if not b'Checking now' in res.data:
break
logging.getLogger().info("Waiting for watch-list to not say 'Checking now'.. {}".format(attempt))
attempt += 1
def live_server_setup(live_server): def live_server_setup(live_server):
@live_server.app.route('/test-endpoint') @live_server.app.route('/test-endpoint')
@@ -133,3 +160,4 @@ def live_server_setup(live_server):
return ret return ret
live_server.start() live_server.start()

View File

@@ -0,0 +1,2 @@
"""Tests for the app."""

View File

@@ -0,0 +1,3 @@
#!/usr/bin/python3
from .. import conftest

View File

@@ -0,0 +1,35 @@
#!/usr/bin/python3
import time
from flask import url_for
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
# Add a site in paused mode, add an invalid filter, we should still have visual selector data ready
def test_visual_selector_content_ready(client, live_server):
import os
assert os.getenv('PLAYWRIGHT_DRIVER_URL'), "Needs PLAYWRIGHT_DRIVER_URL set for this test"
live_server_setup(live_server)
time.sleep(1)
# Add our URL to the import page, maybe better to use something we control?
# We use an external URL because the docker container is too difficult to setup to connect back to the pytest socket
test_url = 'https://news.ycombinator.com'
res = client.post(
url_for("form_quick_watch_add"),
data={"url": test_url, "tag": '', 'edit_and_watch_submit_button': 'Edit > Watch'},
follow_redirects=True
)
assert b"Watch added in Paused state, saving will unpause" in res.data
res = client.post(
url_for("edit_page", uuid="first", unpause_on_save=1),
data={"css_filter": ".does-not-exist", "url": test_url, "tag": "", "headers": "", 'fetch_backend': "html_webdriver"},
follow_redirects=True
)
assert b"unpaused" in res.data
time.sleep(1)
wait_for_all_checks(client)
uuid = extract_UUID_from_client(client)
assert os.path.isfile(os.path.join('test-datastore', uuid, 'last-screenshot.png')), "last-screenshot.png should exist"
assert os.path.isfile(os.path.join('test-datastore', uuid, 'elements.json')), "xpath elements.json data should exist"

View File

@@ -1,8 +1,11 @@
import os
import threading import threading
import queue import queue
import time import time
from changedetectionio import content_fetcher from changedetectionio import content_fetcher
from changedetectionio.html_tools import FilterNotFoundInResponse
# A single update worker # A single update worker
# #
# Requests for checking on a single site(watch) from a queue of watches # Requests for checking on a single site(watch) from a queue of watches
@@ -19,6 +22,100 @@ class update_worker(threading.Thread):
self.datastore = datastore self.datastore = datastore
super().__init__(*args, **kwargs) super().__init__(*args, **kwargs)
def send_content_changed_notification(self, t, watch_uuid):
from changedetectionio import diff
n_object = {}
watch = self.datastore.data['watching'].get(watch_uuid, False)
if not watch:
return
watch_history = watch.history
dates = list(watch_history.keys())
# Theoretically it's possible that this could be just 1 long,
# - In the case that the timestamp key was not unique
if len(dates) == 1:
raise ValueError(
"History index had 2 or more, but only 1 date loaded, timestamps were not unique? maybe two of the same timestamps got written, needs more delay?"
)
# Did it have any notification alerts to hit?
if len(watch['notification_urls']):
print(">>> Notifications queued for UUID from watch {}".format(watch_uuid))
n_object['notification_urls'] = watch['notification_urls']
n_object['notification_title'] = watch['notification_title']
n_object['notification_body'] = watch['notification_body']
n_object['notification_format'] = watch['notification_format']
# No? maybe theres a global setting, queue them all
elif len(self.datastore.data['settings']['application']['notification_urls']):
print(">>> Watch notification URLs were empty, using GLOBAL notifications for UUID: {}".format(watch_uuid))
n_object['notification_urls'] = self.datastore.data['settings']['application']['notification_urls']
n_object['notification_title'] = self.datastore.data['settings']['application']['notification_title']
n_object['notification_body'] = self.datastore.data['settings']['application']['notification_body']
n_object['notification_format'] = self.datastore.data['settings']['application']['notification_format']
else:
print(">>> NO notifications queued, watch and global notification URLs were empty.")
# Only prepare to notify if the rules above matched
if 'notification_urls' in n_object:
# HTML needs linebreak, but MarkDown and Text can use a linefeed
if n_object['notification_format'] == 'HTML':
line_feed_sep = "</br>"
else:
line_feed_sep = "\n"
snapshot_contents = ''
with open(watch_history[dates[-1]], 'rb') as f:
snapshot_contents = f.read()
n_object.update({
'watch_url': watch['url'],
'uuid': watch_uuid,
'current_snapshot': snapshot_contents.decode('utf-8'),
'diff': diff.render_diff(watch_history[dates[-2]], watch_history[dates[-1]], line_feed_sep=line_feed_sep),
'diff_full': diff.render_diff(watch_history[dates[-2]], watch_history[dates[-1]], True, line_feed_sep=line_feed_sep)
})
self.notification_q.put(n_object)
def send_filter_failure_notification(self, watch_uuid):
threshold = self.datastore.data['settings']['application'].get('filter_failure_notification_threshold_attempts')
watch = self.datastore.data['watching'].get(watch_uuid, False)
if not watch:
return
n_object = {'notification_title': 'Changedetection.io - Alert - CSS/xPath filter was not present in the page',
'notification_body': "Your configured CSS/xPath filter of '{}' for {{watch_url}} did not appear on the page after {} attempts, did the page change layout?\n\nLink: {{base_url}}/edit/{{watch_uuid}}\n\nThanks - Your omniscient changedetection.io installation :)\n".format(
watch['css_filter'],
threshold),
'notification_format': 'text'}
if len(watch['notification_urls']):
n_object['notification_urls'] = watch['notification_urls']
elif len(self.datastore.data['settings']['application']['notification_urls']):
n_object['notification_urls'] = self.datastore.data['settings']['application']['notification_urls']
# Only prepare to notify if the rules above matched
if 'notification_urls' in n_object:
n_object.update({
'watch_url': watch['url'],
'uuid': watch_uuid
})
self.notification_q.put(n_object)
print("Sent filter not found notification for {}".format(watch_uuid))
def cleanup_error_artifacts(self, uuid):
# All went fine, remove error artifacts
cleanup_files = ["last-error-screenshot.png", "last-error.txt"]
for f in cleanup_files:
full_path = os.path.join(self.datastore.datastore_path, uuid, f)
if os.path.isfile(full_path):
os.unlink(full_path)
def run(self): def run(self):
from changedetectionio import fetch_site_status from changedetectionio import fetch_site_status
@@ -27,7 +124,7 @@ class update_worker(threading.Thread):
while not self.app.config.exit.is_set(): while not self.app.config.exit.is_set():
try: try:
uuid = self.q.get(block=False) priority, uuid = self.q.get(block=False)
except queue.Empty: except queue.Empty:
pass pass
@@ -35,16 +132,17 @@ class update_worker(threading.Thread):
self.current_uuid = uuid self.current_uuid = uuid
if uuid in list(self.datastore.data['watching'].keys()): if uuid in list(self.datastore.data['watching'].keys()):
changed_detected = False changed_detected = False
contents = "" contents = b''
screenshot = False screenshot = False
update_obj= {} update_obj= {}
xpath_data = False xpath_data = False
process_changedetection_results = True
print("> Processing UUID {} Priority {} URL {}".format(uuid, priority, self.datastore.data['watching'][uuid]['url']))
now = time.time() now = time.time()
try: try:
changed_detected, update_obj, contents, screenshot, xpath_data = update_handler.run(uuid) changed_detected, update_obj, contents = update_handler.run(uuid)
# Re #342 # Re #342
# In Python 3, all strings are sequences of Unicode characters. There is a bytes type that holds raw bytes. # In Python 3, all strings are sequences of Unicode characters. There is a bytes type that holds raw bytes.
# We then convert/.decode('utf-8') for the notification etc # We then convert/.decode('utf-8') for the notification etc
@@ -52,14 +150,61 @@ class update_worker(threading.Thread):
raise Exception("Error - returned data from the fetch handler SHOULD be bytes") raise Exception("Error - returned data from the fetch handler SHOULD be bytes")
except PermissionError as e: except PermissionError as e:
self.app.logger.error("File permission error updating", uuid, str(e)) self.app.logger.error("File permission error updating", uuid, str(e))
process_changedetection_results = False
except content_fetcher.ReplyWithContentButNoText as e: except content_fetcher.ReplyWithContentButNoText as e:
# Totally fine, it's by choice - just continue on, nothing more to care about # Totally fine, it's by choice - just continue on, nothing more to care about
# Page had elements/content but no renderable text # Page had elements/content but no renderable text
if self.datastore.data['watching'].get(uuid, False) and self.datastore.data['watching'][uuid].get('css_filter'): # Backend (not filters) gave zero output
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': "Got HTML content but no text found (CSS / xPath Filter not found in page?)"}) self.datastore.update_watch(uuid=uuid, update_obj={'last_error': "Got HTML content but no text found (With {} reply code).".format(e.status_code)})
if e.screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=e.screenshot)
process_changedetection_results = False
except content_fetcher.Non200ErrorCodeReceived as e:
if e.status_code == 403:
err_text = "Error - 403 (Access denied) received"
elif e.status_code == 404:
err_text = "Error - 404 (Page not found) received"
elif e.status_code == 500:
err_text = "Error - 500 (Internal server Error) received"
else: else:
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': "Got HTML content but no text found."}) err_text = "Error - Request returned a HTTP error code {}".format(str(e.status_code))
pass
if e.screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=e.screenshot, as_error=True)
if e.xpath_data:
self.datastore.save_xpath_data(watch_uuid=uuid, data=e.xpath_data, as_error=True)
if e.page_text:
self.datastore.save_error_text(watch_uuid=uuid, contents=e.page_text)
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
# So that we get a trigger when the content is added again
'previous_md5': ''})
process_changedetection_results = False
except FilterNotFoundInResponse as e:
err_text = "Warning, filter '{}' not found".format(str(e))
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
# So that we get a trigger when the content is added again
'previous_md5': ''})
# Only when enabled, send the notification
if self.datastore.data['watching'][uuid].get('filter_failure_notification_send', False):
c = self.datastore.data['watching'][uuid].get('consecutive_filter_failures', 5)
c += 1
# Send notification if we reached the threshold?
threshold = self.datastore.data['settings']['application'].get('filter_failure_notification_threshold_attempts',
0)
print("Filter for {} not found, consecutive_filter_failures: {}".format(uuid, c))
if threshold > 0 and c >= threshold:
if not self.datastore.data['watching'][uuid].get('notification_muted'):
self.send_filter_failure_notification(uuid)
c = 0
self.datastore.update_watch(uuid=uuid, update_obj={'consecutive_filter_failures': c})
process_changedetection_results = True
except content_fetcher.EmptyReply as e: except content_fetcher.EmptyReply as e:
# Some kind of custom to-str handler in the exception handler that does this? # Some kind of custom to-str handler in the exception handler that does this?
err_text = "EmptyReply - try increasing 'Wait seconds before extracting text', Status Code {}".format(e.status_code) err_text = "EmptyReply - try increasing 'Wait seconds before extracting text', Status Code {}".format(e.status_code)
@@ -69,16 +214,41 @@ class update_worker(threading.Thread):
err_text = "Screenshot unavailable, page did not render fully in the expected time - try increasing 'Wait seconds before extracting text'" err_text = "Screenshot unavailable, page did not render fully in the expected time - try increasing 'Wait seconds before extracting text'"
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text, self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
'last_check_status': e.status_code}) 'last_check_status': e.status_code})
except content_fetcher.PageUnloadable as e: process_changedetection_results = False
err_text = "Page request from server didnt respond correctly" except content_fetcher.JSActionExceptions as e:
err_text = "Error running JS Actions - Page request - "+e.message
if e.screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=e.screenshot, as_error=True)
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text, self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
'last_check_status': e.status_code}) 'last_check_status': e.status_code})
except content_fetcher.PageUnloadable as e:
err_text = "Page request from server didnt respond correctly"
if e.message:
err_text = "{} - {}".format(err_text, e.message)
if e.screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=e.screenshot, as_error=True)
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
'last_check_status': e.status_code})
except Exception as e: except Exception as e:
self.app.logger.error("Exception reached processing watch UUID: %s - %s", uuid, str(e)) self.app.logger.error("Exception reached processing watch UUID: %s - %s", uuid, str(e))
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)}) self.datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)})
# Other serious error
process_changedetection_results = False
else: else:
# Crash protection, the watch entry could have been removed by this point (during a slow chrome fetch etc)
if not self.datastore.data['watching'].get(uuid):
continue
# Mark that we never had any failures
if not self.datastore.data['watching'][uuid].get('ignore_status_codes'):
update_obj['consecutive_filter_failures'] = 0
self.cleanup_error_artifacts(uuid)
# Different exceptions mean that we may or may not want to bump the snapshot, trigger notifications etc
if process_changedetection_results:
try: try:
watch = self.datastore.data['watching'][uuid] watch = self.datastore.data['watching'][uuid]
fname = "" # Saved history text filename fname = "" # Saved history text filename
@@ -86,67 +256,19 @@ class update_worker(threading.Thread):
# For the FIRST time we check a site, or a change detected, save the snapshot. # For the FIRST time we check a site, or a change detected, save the snapshot.
if changed_detected or not watch['last_checked']: if changed_detected or not watch['last_checked']:
# A change was detected # A change was detected
fname = watch.save_history_text(contents=contents, timestamp=str(round(time.time()))) watch.save_history_text(contents=contents, timestamp=str(round(time.time())))
# Generally update anything interesting returned
self.datastore.update_watch(uuid=uuid, update_obj=update_obj) self.datastore.update_watch(uuid=uuid, update_obj=update_obj)
# A change was detected # A change was detected
if changed_detected: if changed_detected:
n_object = {}
print (">> Change detected in UUID {} - {}".format(uuid, watch['url'])) print (">> Change detected in UUID {} - {}".format(uuid, watch['url']))
# Notifications should only trigger on the second time (first time, we gather the initial snapshot) # Notifications should only trigger on the second time (first time, we gather the initial snapshot)
if watch.history_n >= 2: if watch.history_n >= 2:
# Atleast 2, means there really was a change if not self.datastore.data['watching'][uuid].get('notification_muted'):
self.datastore.update_watch(uuid=uuid, update_obj={'last_changed': round(now)}) self.send_content_changed_notification(self, watch_uuid=uuid)
watch_history = watch.history
dates = list(watch_history.keys())
# Theoretically it's possible that this could be just 1 long,
# - In the case that the timestamp key was not unique
if len(dates) == 1:
raise ValueError(
"History index had 2 or more, but only 1 date loaded, timestamps were not unique? maybe two of the same timestamps got written, needs more delay?"
)
prev_fname = watch_history[dates[-2]]
# Did it have any notification alerts to hit?
if len(watch['notification_urls']):
print(">>> Notifications queued for UUID from watch {}".format(uuid))
n_object['notification_urls'] = watch['notification_urls']
n_object['notification_title'] = watch['notification_title']
n_object['notification_body'] = watch['notification_body']
n_object['notification_format'] = watch['notification_format']
# No? maybe theres a global setting, queue them all
elif len(self.datastore.data['settings']['application']['notification_urls']):
print(">>> Watch notification URLs were empty, using GLOBAL notifications for UUID: {}".format(uuid))
n_object['notification_urls'] = self.datastore.data['settings']['application']['notification_urls']
n_object['notification_title'] = self.datastore.data['settings']['application']['notification_title']
n_object['notification_body'] = self.datastore.data['settings']['application']['notification_body']
n_object['notification_format'] = self.datastore.data['settings']['application']['notification_format']
else:
print(">>> NO notifications queued, watch and global notification URLs were empty.")
# Only prepare to notify if the rules above matched
if 'notification_urls' in n_object:
# HTML needs linebreak, but MarkDown and Text can use a linefeed
if n_object['notification_format'] == 'HTML':
line_feed_sep = "</br>"
else:
line_feed_sep = "\n"
from changedetectionio import diff
n_object.update({
'watch_url': watch['url'],
'uuid': uuid,
'current_snapshot': contents.decode('utf-8'),
'diff': diff.render_diff(prev_fname, fname, line_feed_sep=line_feed_sep),
'diff_full': diff.render_diff(prev_fname, fname, True, line_feed_sep=line_feed_sep)
})
self.notification_q.put(n_object)
except Exception as e: except Exception as e:
# Catch everything possible here, so that if a worker crashes, we don't lose it until restart! # Catch everything possible here, so that if a worker crashes, we don't lose it until restart!
@@ -154,16 +276,16 @@ class update_worker(threading.Thread):
self.app.logger.error("Exception reached processing watch UUID: %s - %s", uuid, str(e)) self.app.logger.error("Exception reached processing watch UUID: %s - %s", uuid, str(e))
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)}) self.datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)})
finally:
# Always record that we atleast tried
self.datastore.update_watch(uuid=uuid, update_obj={'fetch_time': round(time.time() - now, 3),
'last_checked': round(time.time())})
# Always save the screenshot if it's available # Always record that we atleast tried
if screenshot: self.datastore.update_watch(uuid=uuid, update_obj={'fetch_time': round(time.time() - now, 3),
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=screenshot) 'last_checked': round(time.time())})
if xpath_data:
self.datastore.save_xpath_data(watch_uuid=uuid, data=xpath_data) # Always save the screenshot if it's available
if update_handler.screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=update_handler.screenshot)
if update_handler.xpath_data:
self.datastore.save_xpath_data(watch_uuid=uuid, data=update_handler.xpath_data)
self.current_uuid = None # Done self.current_uuid = None # Done

View File

@@ -24,7 +24,7 @@ services:
# https://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.common.proxy # https://selenium-python.readthedocs.io/api.html#module-selenium.webdriver.common.proxy
# #
# Alternative Playwright URL, do not use "'s or 's! # Alternative Playwright URL, do not use "'s or 's!
# - PLAYWRIGHT_DRIVER_URL=ws://playwright-chrome:3000/ # - PLAYWRIGHT_DRIVER_URL=ws://playwright-chrome:3000/?stealth=1&--disable-web-security=true
# #
# Playwright proxy settings playwright_proxy_server, playwright_proxy_bypass, playwright_proxy_username, playwright_proxy_password # Playwright proxy settings playwright_proxy_server, playwright_proxy_bypass, playwright_proxy_username, playwright_proxy_password
# #
@@ -78,9 +78,6 @@ services:
# - SCREEN_HEIGHT=1024 # - SCREEN_HEIGHT=1024
# - SCREEN_DEPTH=16 # - SCREEN_DEPTH=16
# - ENABLE_DEBUGGER=false # - ENABLE_DEBUGGER=false
# - SCREEN_WIDTH=1280
# - SCREEN_HEIGHT=1024
# - SCREEN_DEPTH=16
# - PREBOOT_CHROME=true # - PREBOOT_CHROME=true
# - CONNECTION_TIMEOUT=300000 # - CONNECTION_TIMEOUT=300000
# - MAX_CONCURRENT_SESSIONS=10 # - MAX_CONCURRENT_SESSIONS=10

View File

@@ -18,7 +18,7 @@ wtforms ~= 3.0
jsonpath-ng ~= 1.5.3 jsonpath-ng ~= 1.5.3
# Notification library # Notification library
apprise ~= 0.9.9 apprise ~= 1.0.0
# apprise mqtt https://github.com/dgtlmoon/changedetection.io/issues/315 # apprise mqtt https://github.com/dgtlmoon/changedetection.io/issues/315
paho-mqtt paho-mqtt