Compare commits

93 Commits

Author SHA1 Message Date
dgtlmoon
aef24c42db extended tests 2022-10-28 14:08:29 +02:00
dgtlmoon
0f6afb9ce8 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-28 13:50:19 +02:00
Brandon Wees
ea2fcee4ad fix syntax error 2022-10-27 12:05:57 -04:00
Brandon Wees
bd79c5decd Update changedetectionio/tests/test_diff_filter_changes_as_add_delete.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 12:03:20 -04:00
Brandon Wees
74428372c3 Update changedetectionio/tests/test_diff_filter_only_deletions.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 11:57:55 -04:00
dgtlmoon
e6cdb57db0 Merge branch 'master' into diff-filters 2022-10-27 17:56:56 +02:00
dgtlmoon
ac3de58116 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-27 17:37:26 +02:00
Brandon Wees
e11c6aeb5f Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 10:59:14 -04:00
Brandon Wees
294bb7be15 remvoe unneeded import 2022-10-27 10:57:50 -04:00
Brandon Wees
c2c8bb4de8 ensure_data_dir_exists call added 2022-10-27 10:54:30 -04:00
Brandon Wees
35d950fa74 Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 10:52:42 -04:00
Brandon Wees
d24111f3a6 Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 10:52:20 -04:00
Brandon Wees
7011a04399 switching to os.path.join 2022-10-27 10:43:18 -04:00
Sandro
57f604dff1 UI - Make fetch error more readable (#1038) 2022-10-27 16:40:24 +02:00
dgtlmoon
8499468749 Update README.md 2022-10-27 15:17:14 +02:00
Brandon Wees
4364521cfc Update changedetectionio/templates/edit.html
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 09:11:28 -04:00
Brandon Wees
748328453e unmerge external header server. Sorry! 2022-10-27 09:03:39 -04:00
Brandon Wees
e867e89303 Update test_backup.py 2022-10-27 08:45:44 -04:00
dgtlmoon
7f6a13ea6c Re #1052 - Watch 'open' link should use any dynamic/template info (#1063) 2022-10-27 13:29:24 +02:00
dgtlmoon
9874f0cbc7 Remove accidental files 2022-10-27 12:43:02 +02:00
dgtlmoon
3e7fd9570a Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-27 12:42:28 +02:00
dgtlmoon
99f3b01013 Merge branch 'master' into diff-filters 2022-10-27 12:38:51 +02:00
dgtlmoon
72834a42fd Backups and Snapshots - Data directory now fully portable, (all paths are relative) , refactored backup zip export creation 2022-10-27 12:35:26 +02:00
Brandon Wees
43c2e71961 Merge branch 'master' into diff-filters 2022-10-26 08:18:27 -04:00
Brandon Wees
9946ee66d0 Merge pull request #2 from bwees/external-header-server
External header server
2022-10-24 09:08:57 -04:00
Brandon Wees
9f722cc76b Merge branch 'dgtlmoon:master' into external-header-server 2022-10-24 08:54:22 -04:00
dgtlmoon
62b6645810 Merge branch 'master' into diff-filters 2022-10-24 11:47:08 +02:00
dgtlmoon
e5e8b3bbbd Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-24 11:47:05 +02:00
bwees
852a698629 add optional for field 2022-10-19 19:14:01 -04:00
bwees
76fd27dfab fix logic error 2022-10-19 19:10:01 -04:00
bwees
83161e4fa3 fixed string None case 2022-10-19 19:03:01 -04:00
bwees
296c7c46cb fixed empty field errors 2022-10-19 19:00:38 -04:00
bwees
0a2644d0c3 fix tests 2022-10-19 18:58:54 -04:00
bwees
495e322c9e fixed import errors 2022-10-19 18:55:05 -04:00
bwees
0d5820932f rename branch 2022-10-19 18:45:43 -04:00
Brandon Wees
408be08a48 Merge branch 'dgtlmoon:master' into external-auth 2022-10-19 18:42:27 -04:00
bwees
bad0909cc2 added external header server 2022-10-19 18:42:04 -04:00
Brandon Wees
c80f46308a Update edit.html 2022-10-17 15:10:36 -04:00
dgtlmoon
802daa6296 Merge branch 'master' into diff-filters 2022-10-17 12:10:59 +02:00
Brandon Wees
2f641da182 Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-14 07:49:28 -04:00
Brandon Wees
4951721286 Update changedetectionio/store.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-11 07:59:51 -04:00
dgtlmoon
a50d6db0b2 Merge branch 'master' into diff-filters 2022-10-11 11:17:53 +02:00
dgtlmoon
f55f7967ef Merge branch 'master' into diff-filters 2022-09-08 20:37:17 +02:00
bwees
13a96e93a2 fix linter errors after merge 2022-08-17 09:33:34 -04:00
dgtlmoon
ed93d51ae8 Merge branch 'master' into diff-filters 2022-08-17 15:26:47 +02:00
bwees
db28b30b1b add test for situation found in https://github.com/dgtlmoon/changedetection.io/pull/749#issuecomment-1200154861 2022-07-30 09:14:06 -04:00
bwees
6bdcdfbaea fixed replace bug in get_diff_types 2022-07-30 09:05:55 -04:00
bwees
0efc504c5d change form wording 2022-07-30 08:47:07 -04:00
bwees
628cb2ad44 added form validation for diff filter checkboxes 2022-07-30 08:30:56 -04:00
Brandon Wees
604f2eaf02 remove unneeded debug statements 2022-07-29 08:40:47 -04:00
bwees
2a649afd22 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-07-29 08:39:32 -04:00
bwees
526f8fac45 remove unneeded import 2022-07-29 08:39:30 -04:00
dgtlmoon
e76f5efee3 Merge branch 'master' into diff-filters 2022-07-29 12:54:54 +02:00
bwees
7ac0620099 fixed merge conflict with latest version 2022-07-28 20:52:01 -04:00
bwees
14765b46bd fix broken logic 2022-07-28 20:48:20 -04:00
bwees
4f3a15e68d clean up test 2022-07-28 20:48:14 -04:00
bwees
c6207f729d added middleware to fix broken default checkboxes during tests 2022-07-28 20:37:20 -04:00
bwees
fcc1a72d30 changed tests 2022-07-28 20:37:03 -04:00
bwees
6f2b7ceddb changed UI to have checkboxes instead of dropdown 2022-07-28 20:36:53 -04:00
bwees
1e265b312e fix macos test running 2022-07-28 20:33:01 -04:00
Brandon Wees
f379dda13d Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-17 11:59:20 -04:00
Brandon Wees
4a88589a27 Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-17 11:58:46 -04:00
bwees
cac53a76c0 added antoher step to test to cover case as described https://github.com/dgtlmoon/changedetection.io/pull/749#issuecomment-1186209681 2022-07-16 19:13:20 -04:00
bwees
8dbf2257d3 added datastore migration step 2022-07-16 19:08:57 -04:00
bwees
c0fb051dde changed get_previous_text to not create the file if it does not exist 2022-07-16 16:02:05 -04:00
bwees
cf09f03d32 fix import statements 2022-07-16 15:54:44 -04:00
Brandon Wees
237cf7db4f Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 15:49:03 -04:00
bwees
a8e24dab01 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-07-16 15:48:44 -04:00
bwees
5c9b7353d4 fixed difflib import 2022-07-16 15:48:43 -04:00
Brandon Wees
1e22949e3d Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 15:48:20 -04:00
Brandon Wees
68e1a64474 Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 15:46:55 -04:00
Brandon Wees
151c2dab3a Update changedetectionio/templates/edit.html
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 10:38:45 -04:00
Brandon Wees
3e43d7ad1a Update changedetectionio/templates/edit.html
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 10:38:27 -04:00
Brandon Wees
58cb7fbc2a Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 10:37:05 -04:00
Brandon Wees
23452a1599 Remove discord change (look at https://github.com/dgtlmoon/changedetection.io/pull/753 for this change) 2022-07-13 18:05:02 -04:00
bwees
7fb432bf06 Created working tests 2022-07-13 17:58:30 -04:00
bwees
dc3fc6cfdf used a drop down menu and rewrote checking code to fit GUI description 2022-07-13 17:58:13 -04:00
bwees
8ee42d2403 fixed my breaking change 2022-07-13 17:57:39 -04:00
bwees
8d9cac4c38 remove my tests because they wont run 2022-07-12 21:16:45 -04:00
bwees
374bb3824f fix test to include the new previous.txt file 2022-07-12 21:11:42 -04:00
bwees
91d8600b19 fixed test naming 2022-07-12 20:53:22 -04:00
bwees
7b0ddc23d3 workaround for diff filter checkboxes getting changed on creation of form object 2022-07-12 20:40:54 -04:00
bwees
ab74377be0 fixed file based text saving system 2022-07-12 18:28:29 -04:00
bwees
2196d120a9 rewrote and broke out tests to simplify 2022-07-12 18:27:51 -04:00
bwees
5dca59a4a0 switched to file handling of previous_text 2022-07-12 17:59:46 -04:00
bwees
ee8042b54e Fix boolean value being sent to difflib 2022-07-12 16:56:59 -04:00
bwees
4c3f233d21 Made unit test 2022-07-11 20:52:18 -04:00
bwees
159b062cb3 removed modify due to the way difflib reacts to changes 2022-07-11 20:37:01 -04:00
bwees
83565787ae added logic for filtering based on diff attributes 2022-07-11 20:35:30 -04:00
bwees
bdab4f5e09 added diff compare function to watch class 2022-07-11 20:34:33 -04:00
bwees
69075a81c5 updated data model 2022-07-11 19:27:05 -04:00
bwees
04746cc706 Added initial UI code 2022-07-11 19:26:56 -04:00
Brandon Wees
234494d907 Added character truncation rule to URL starting with https://discord.com/api/webhooks 2022-07-11 18:02:04 -04:00
16 changed files with 521 additions and 52 deletions

View File

@@ -184,9 +184,9 @@ When you enable a `json:` or `jq:` filter, you can even automatically extract an
`json:$.price` or `jq:.price` would give `23.50`, or you can extract the whole structure
## Proxy configuration
## Proxy Configuration
See the wiki https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration
See the wiki: https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration. We also support using [BrightData proxy services where possible](https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration#brightdata-proxy-support).
## Raspberry Pi support?

View File

@@ -987,9 +987,6 @@ def changedetection_app(config=None, datastore_o=None):
# create a ZipFile object
backupname = "changedetection-backup-{}.zip".format(int(time.time()))
# We only care about UUIDS from the current index file
uuids = list(datastore.data['watching'].keys())
backup_filepath = os.path.join(datastore_o.datastore_path, backupname)
with zipfile.ZipFile(backup_filepath, "w",
@@ -1005,12 +1002,12 @@ def changedetection_app(config=None, datastore_o=None):
# Add the flask app secret
zipObj.write(os.path.join(datastore_o.datastore_path, "secret.txt"), arcname="secret.txt")
# Add any snapshot data we find, use the full path to access the file, but make the file 'relative' in the Zip.
for txt_file_path in Path(datastore_o.datastore_path).rglob('*.txt'):
parent_p = txt_file_path.parent
if parent_p.name in uuids:
zipObj.write(txt_file_path,
arcname=str(txt_file_path).replace(datastore_o.datastore_path, ''),
# Add any data in the watch data directory.
for uuid, w in datastore.data['watching'].items():
for f in Path(w.watch_data_dir).glob('*'):
zipObj.write(f,
# Use the full path to access the file, but make the file 'relative' in the Zip.
arcname=os.path.join(f.parts[-2], f.parts[-1]),
compress_type=zipfile.ZIP_DEFLATED,
compresslevel=8)
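
The rewritten backup loop reads each file through its full path but stores it in the archive as `<uuid>/<filename>`, which is what makes the backup portable. A minimal sketch of that idea follows; the helper name `add_watch_dir_to_zip`, the `demo-uuid` directory, and the file contents are assumptions for illustration, not part of the project.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def add_watch_dir_to_zip(zip_obj, watch_data_dir):
    """Hypothetical helper: add every file in one watch directory to the zip.

    Files are read via their full path but stored as '<dir name>/<file name>',
    so the archive does not depend on where the datastore lives.
    """
    for f in sorted(Path(watch_data_dir).glob('*')):
        if f.is_file():
            zip_obj.write(f,
                          arcname=os.path.join(f.parts[-2], f.parts[-1]),
                          compress_type=zipfile.ZIP_DEFLATED,
                          compresslevel=8)


def build_demo_backup():
    """Create a throwaway datastore with one watch and return the archive names."""
    with tempfile.TemporaryDirectory() as datastore_path:
        watch_dir = os.path.join(datastore_path, "demo-uuid")  # assumed UUID
        os.mkdir(watch_dir)
        for name in ("history.txt", "previous.txt"):
            with open(os.path.join(watch_dir, name), "w") as fh:
                fh.write("example\n")

        backup_path = os.path.join(datastore_path, "backup.zip")
        with zipfile.ZipFile(backup_path, "w") as zip_obj:
            add_watch_dir_to_zip(zip_obj, watch_dir)
        with zipfile.ZipFile(backup_path) as zip_obj:
            return zip_obj.namelist()
```

Calling `build_demo_backup()` yields member names relative to the datastore root, regardless of the temporary directory's absolute location.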

View File

@@ -2,14 +2,14 @@ import hashlib
import logging
import os
import re
import time
import urllib3
import difflib
from changedetectionio import content_fetcher, html_tools
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
# Some common stuff here that can be moved to a base class
# (set_proxy_from_list)
class perform_site_check():
@@ -35,8 +35,6 @@ class perform_site_check():
def run(self, uuid):
from jinja2 import Environment
changed_detected = False
screenshot = False # as bytes
stripped_text_from_html = ""
@@ -68,9 +66,7 @@ class perform_site_check():
timeout = self.datastore.data['settings']['requests'].get('timeout')
# Jinja2 available in URLs along with https://pypi.org/project/jinja2-time/
jinja2_env = Environment(extensions=['jinja2_time.TimeExtension'])
url = str(jinja2_env.from_string(watch.get('url')).render())
url = watch.link
request_body = self.datastore.data['watching'][uuid].get('body')
request_method = self.datastore.data['watching'][uuid].get('method')
@@ -293,8 +289,23 @@ class perform_site_check():
else:
logging.debug("check_unique_lines: UUID {} had unique content".format(uuid))
# Always record the new checksum
if changed_detected:
if not watch.get("trigger_add", True) or not watch.get("trigger_del", True): # if we are supposed to filter any diff types
# get the diff types present in the watch
diff_types = watch.get_diff_types(text_content_before_ignored_filter)
logging.debug("Diff components found: {}".format(diff_types))
# Only Additions (deletions are turned off)
if not watch["trigger_del"] and diff_types["del"] and not diff_types["add"]:
changed_detected = False
# Only Deletions (additions are turned off)
elif not watch["trigger_add"] and diff_types["add"] and not diff_types["del"]:
changed_detected = False
# Always record the new checksum and the new text
update_obj["previous_md5"] = fetched_md5
watch.save_previous_text(text_content_before_ignored_filter)
# On the first run of a site, watch['previous_md5'] will be None, set it the current one.
if not watch.get('previous_md5'):
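
The filter in the hunk above suppresses a detected change only when it consists purely of the diff type that was switched off. The sketch below restates that decision as a standalone function. It assumes both texts are already decoded `str` values, and `should_trigger` is a hypothetical wrapper that does not exist in the project.

```python
import difflib


def get_diff_types(previous_text, new_text):
    """Report whether the change contains additions and/or deletions."""
    diff_types = {'add': False, 'del': False}
    # Character-level comparison, treating spaces and tabs as junk
    matcher = difflib.SequenceMatcher(isjunk=lambda x: x in " \t",
                                      a=previous_text, b=new_text)
    for tag, _alo, _ahi, _blo, _bhi in matcher.get_opcodes():
        # A 'replace' opcode counts as both a deletion and an addition
        if tag in ('delete', 'replace'):
            diff_types['del'] = True
        if tag in ('insert', 'replace'):
            diff_types['add'] = True
    return diff_types


def should_trigger(previous_text, new_text, trigger_add=True, trigger_del=True):
    """Hypothetical wrapper mirroring the add/delete filter."""
    if previous_text == new_text:
        return False
    diff_types = get_diff_types(previous_text, new_text)
    # Pure deletion while deletions are switched off: suppress
    if not trigger_del and diff_types['del'] and not diff_types['add']:
        return False
    # Pure addition while additions are switched off: suppress
    if not trigger_add and diff_types['add'] and not diff_types['del']:
        return False
    return True
```

With this sketch, `should_trigger("abcd", "abc", trigger_del=False)` is suppressed, while a replacement such as `"abc"` to `"axc"` still triggers even with additions switched off, matching the "replacements are always included" behaviour described in the form help text.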

View File

@@ -323,6 +323,18 @@ class ValidateCSSJSONXPATHInput(object):
except:
raise ValidationError("A system-error occurred when validating your jq expression")
class ValidateDiffFilters(object):
"""
Validates that at least one filter checkbox is selected
"""
def __init__(self, message=None):
self.message = message
def __call__(self, form, field):
if not form.trigger_add.data and not form.trigger_del.data:
message = field.gettext('At least one filter checkbox must be selected')
raise ValidationError(message)
class quickWatchForm(Form):
url = fields.URLField('URL', validators=[validateURL()])
@@ -365,6 +377,8 @@ class watchForm(commonSettingsForm):
check_unique_lines = BooleanField('Only trigger when new lines appear', default=False)
trigger_text = StringListField('Trigger/wait for text', [validators.Optional(), ValidateListRegex()])
text_should_not_be_present = StringListField('Block change-detection if text matches', [validators.Optional(), ValidateListRegex()])
trigger_add = BooleanField('Additions', [ValidateDiffFilters()], default=True)
trigger_del = BooleanField('Deletions', [ValidateDiffFilters()], default=True)
webdriver_js_execute_code = TextAreaField('Execute JavaScript before change detection', render_kw={"rows": "5"}, validators=[validators.Optional()])

View File

@@ -1,6 +1,8 @@
import os
import uuid as uuid_builder
from distutils.util import strtobool
import logging
import os
import time
import uuid
minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60))
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
@@ -22,7 +24,7 @@ class model(dict):
#'newest_history_key': 0,
'title': None,
'previous_md5': False,
'uuid': str(uuid_builder.uuid4()),
'uuid': str(uuid.uuid4()),
'headers': {}, # Extra headers to send
'body': None,
'method': 'GET',
@@ -45,6 +47,8 @@ class model(dict):
'consecutive_filter_failures': 0, # Every time the CSS/xPath filter cannot be located, reset when all is fine.
'extract_title_as_title': False,
'check_unique_lines': False, # On change-detected, compare against all history if its something new
'trigger_add': True,
'trigger_del': True,
'proxy': None, # Preferred proxy connection
# Re #110, so then if this is set to None, we know to use the default value instead
# Requires setting to None on submit if it's the same as the default
@@ -60,7 +64,7 @@ class model(dict):
self.update(self.__base_config)
self.__datastore_path = kw['datastore_path']
self['uuid'] = str(uuid_builder.uuid4())
self['uuid'] = str(uuid.uuid4())
del kw['datastore_path']
@@ -82,10 +86,19 @@ class model(dict):
return False
def ensure_data_dir_exists(self):
target_path = os.path.join(self.__datastore_path, self['uuid'])
if not os.path.isdir(target_path):
print ("> Creating data dir {}".format(target_path))
os.mkdir(target_path)
if not os.path.isdir(self.watch_data_dir):
print ("> Creating data dir {}".format(self.watch_data_dir))
os.mkdir(self.watch_data_dir)
@property
def link(self):
url = self.get('url', '')
if '{%' in url or '{{' in url:
from jinja2 import Environment
# Jinja2 available in URLs along with https://pypi.org/project/jinja2-time/
jinja2_env = Environment(extensions=['jinja2_time.TimeExtension'])
return str(jinja2_env.from_string(url).render())
return url
@property
def label(self):
@@ -109,18 +122,39 @@ class model(dict):
@property
def history(self):
"""History index is just a text file as a list
{watch-uuid}/history.txt
contains a list like
{epoch-time},{filename}\n
We read in this list as the history information
"""
tmp_history = {}
import logging
import time
# Read the history file as a dict
fname = os.path.join(self.__datastore_path, self.get('uuid'), "history.txt")
fname = os.path.join(self.watch_data_dir, "history.txt")
if os.path.isfile(fname):
logging.debug("Reading history index " + str(time.time()))
with open(fname, "r") as f:
for i in f.readlines():
if ',' in i:
k, v = i.strip().split(',', 1)
# The index history could contain a relative path, so we need to make the fullpath
# so that python can read it
if '/' not in v and '\\' not in v:
v = os.path.join(self.watch_data_dir, v)
else:
# It's possible that they moved the datadir on older versions
# So the snapshot exists but is in a different path
snapshot_fname = v.split('/')[-1]
proposed_new_path = os.path.join(self.watch_data_dir, snapshot_fname)
if not os.path.exists(v) and os.path.exists(proposed_new_path):
v = proposed_new_path
tmp_history[k] = v
if len(tmp_history):
@@ -132,7 +166,7 @@ class model(dict):
@property
def has_history(self):
fname = os.path.join(self.__datastore_path, self.get('uuid'), "history.txt")
fname = os.path.join(self.watch_data_dir, "history.txt")
return os.path.isfile(fname)
# Returns the newest key, but if theres only 1 record, then it's counted as not being new, so return 0.
@@ -151,25 +185,19 @@ class model(dict):
# Save some text file to the appropriate path and bump the history
# result_obj from fetch_site_status.run()
def save_history_text(self, contents, timestamp):
import uuid
import logging
output_path = os.path.join(self.__datastore_path, self['uuid'])
self.ensure_data_dir_exists()
snapshot_fname = os.path.join(output_path, str(uuid.uuid4()))
logging.debug("Saving history text {}".format(snapshot_fname))
snapshot_fname = "{}.txt".format(str(uuid.uuid4()))
# in /diff/ and /preview/ we are going to assume for now that it's UTF-8 when reading
# most sites are utf-8 and some are even broken utf-8
with open(snapshot_fname, 'wb') as f:
with open(os.path.join(self.watch_data_dir, snapshot_fname), 'wb') as f:
f.write(contents)
f.close()
# Append to index
# @todo check last char was \n
index_fname = os.path.join(output_path, "history.txt")
index_fname = os.path.join(self.watch_data_dir, "history.txt")
with open(index_fname, 'a') as f:
f.write("{},{}\n".format(timestamp, snapshot_fname))
f.close()
@@ -180,6 +208,35 @@ class model(dict):
# @todo bump static cache of the last timestamp so we dont need to examine the file to set a proper ''viewed'' status
return snapshot_fname
# Save previous text snapshot for diffing - used for calculating additions and deletions
def save_previous_text(self, contents):
# In case the operator deleted it, check and create.
self.ensure_data_dir_exists()
snapshot_fname = os.path.join(self.watch_data_dir, "previous.txt")
logging.debug("Saving previous text {}".format(snapshot_fname))
with open(snapshot_fname, 'wb') as f:
f.write(contents)
return snapshot_fname
# Get previous text snapshot for diffing - used for calculating additions and deletions
def get_previous_text(self):
snapshot_fname = os.path.join(self.watch_data_dir, "previous.txt")
if self.history_n < 1:
return ""
with open(snapshot_fname, 'rb') as f:
contents = f.read()
return contents
@property
def has_empty_checktime(self):
# using all() + dictionary comprehension
@@ -209,15 +266,40 @@ class model(dict):
# if not, something new happened
return not local_lines.issubset(existing_history)
# Get diff types (addition, deletion) from the previous snapshot and new_text;
# a 'replace' opcode counts as both an addition and a deletion
# uses similar algorithm to customSequenceMatcher in diff.py
# Returns a dict of diff types and whether they are present in the diff
def get_diff_types(self, new_text):
import difflib
diff_types = {
'add': False,
'del': False,
}
# get diff types using difflib
cruncher = difflib.SequenceMatcher(isjunk=lambda x: x in " \t", a=str(self.get_previous_text()), b=str(new_text))
for tag, alo, ahi, blo, bhi in cruncher.get_opcodes():
if tag == 'delete':
diff_types["del"] = True
elif tag == 'insert':
diff_types["add"] = True
elif tag == 'replace':
diff_types["del"] = True
diff_types["add"] = True
return diff_types
def get_screenshot(self):
fname = os.path.join(self.__datastore_path, self['uuid'], "last-screenshot.png")
fname = os.path.join(self.watch_data_dir, "last-screenshot.png")
if os.path.isfile(fname):
return fname
return False
def __get_file_ctime(self, filename):
fname = os.path.join(self.__datastore_path, self['uuid'], filename)
fname = os.path.join(self.watch_data_dir, filename)
if os.path.isfile(fname):
return int(os.path.getmtime(fname))
return False
@@ -242,9 +324,14 @@ class model(dict):
def snapshot_error_screenshot_ctime(self):
return self.__get_file_ctime('last-error-screenshot.png')
@property
def watch_data_dir(self):
# The base dir of the watch data
return os.path.join(self.__datastore_path, self['uuid'])
def get_error_text(self):
"""Return the text saved from a previous request that resulted in a non-200 error"""
fname = os.path.join(self.__datastore_path, self['uuid'], "last-error.txt")
fname = os.path.join(self.watch_data_dir, "last-error.txt")
if os.path.isfile(fname):
with open(fname, 'r') as f:
return f.read()
@@ -252,7 +339,7 @@ class model(dict):
def get_error_snapshot(self):
"""Return path to the screenshot that resulted in a non-200 error"""
fname = os.path.join(self.__datastore_path, self['uuid'], "last-error-screenshot.png")
fname = os.path.join(self.watch_data_dir, "last-error-screenshot.png")
if os.path.isfile(fname):
return fname
return False
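
The `history` property in this file now accepts both bare snapshot filenames (the new, portable format) and absolute paths written by older versions. The sketch below isolates that resolution logic. `read_history_index` and the demo paths are assumptions for illustration, and it splits each line only once so a comma in a filename cannot break the unpacking.

```python
import os
import tempfile


def read_history_index(watch_data_dir):
    """Parse '{epoch-time},{filename}' lines into {timestamp: snapshot path}."""
    history = {}
    index_path = os.path.join(watch_data_dir, "history.txt")
    if not os.path.isfile(index_path):
        return history
    with open(index_path, "r") as f:
        for line in f:
            if ',' not in line:
                continue
            timestamp, snapshot = line.strip().split(',', 1)
            if '/' not in snapshot and '\\' not in snapshot:
                # Bare filename: resolve it relative to the watch directory
                snapshot = os.path.join(watch_data_dir, snapshot)
            else:
                # Older absolute path: if it no longer exists, fall back to the
                # same filename inside the current watch directory
                # (only '/' separators are handled in this sketch)
                candidate = os.path.join(watch_data_dir, snapshot.split('/')[-1])
                if not os.path.exists(snapshot) and os.path.exists(candidate):
                    snapshot = candidate
            history[timestamp] = snapshot
    return history


def demo():
    """Return True if relative and stale absolute entries resolve to the same file."""
    with tempfile.TemporaryDirectory() as watch_dir:
        with open(os.path.join(watch_dir, "abc.txt"), "w") as f:
            f.write("snapshot")
        with open(os.path.join(watch_dir, "history.txt"), "w") as f:
            f.write("1666000000,abc.txt\n")                 # relative entry
            f.write("1666000100,/old/datastore/abc.txt\n")  # stale absolute entry
        history = read_history_index(watch_dir)
        expected = os.path.join(watch_dir, "abc.txt")
        return history == {"1666000000": expected, "1666000100": expected}
```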

View File

@@ -156,7 +156,7 @@ body:after, body:before {
.fetch-error {
padding-top: 1em;
font-size: 60%;
font-size: 80%;
max-width: 400px;
display: block;
}
@@ -803,4 +803,4 @@ ul {
padding: 0.5rem;
border-radius: 5px;
color: #ff3300;
}
}

View File

@@ -548,6 +548,10 @@ class ChangeDetectionStore:
# `last_changed` not needed, we pull that information from the history.txt index
def update_4(self):
for uuid, watch in self.data['watching'].items():
# Be sure it's recalculated
p = watch.history
if watch.history_n < 2:
watch['last_changed'] = 0
try:
# Remove it from the struct
del(watch['last_changed'])
@@ -583,3 +587,23 @@ class ChangeDetectionStore:
for v in ['User-Agent', 'Accept', 'Accept-Encoding', 'Accept-Language']:
if self.data['settings']['headers'].get(v):
del self.data['settings']['headers'][v]
# Generate a previous.txt for all watches that do not have one and contain history
def update_8(self):
for uuid, watch in self.data['watching'].items():
# Make sure we actually have history
if (watch.history_n == 0):
continue
latest_file_name = watch.history[watch.newest_history_key]
# Check if the previous.txt exists
if not os.path.exists(os.path.join(watch.watch_data_dir, "previous.txt")):
# Generate a previous.txt
with open(os.path.join(watch.watch_data_dir, "previous.txt"), "wb") as f:
# Fill it with the latest history
latest_file_name = watch.history[watch.newest_history_key]
with open(latest_file_name, "rb") as f2:
f.write(f2.read())
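
The `update_8` migration seeds `previous.txt` from the newest snapshot, but only when the file is missing. A small sketch of that "copy once" step is shown here. `seed_previous_text` is a hypothetical helper, and it uses `shutil.copyfile` instead of the nested `open()` calls in the hunk.

```python
import os
import shutil
import tempfile


def seed_previous_text(watch_data_dir, latest_snapshot_path):
    """Hypothetical helper: create previous.txt from the newest snapshot, once."""
    previous_path = os.path.join(watch_data_dir, "previous.txt")
    if os.path.exists(previous_path):
        return False  # never overwrite an existing previous.txt
    shutil.copyfile(latest_snapshot_path, previous_path)
    return True


def demo():
    """Seed twice and return both results plus the seeded content."""
    with tempfile.TemporaryDirectory() as watch_dir:
        snapshot = os.path.join(watch_dir, "snapshot.txt")
        with open(snapshot, "wb") as f:
            f.write(b"latest content")
        first = seed_previous_text(watch_dir, snapshot)
        second = seed_previous_text(watch_dir, snapshot)  # already seeded
        with open(os.path.join(watch_dir, "previous.txt"), "rb") as f:
            return first, second, f.read()
```

Running the migration a second time is a no-op under this approach, which keeps it safe to re-run.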

View File

@@ -173,6 +173,16 @@ User-Agent: wonderbra 1.0") }}
<span class="pure-form-message-inline">Good for websites that just move the content around, and you want to know when NEW content is added, compares new lines against all history for this watch.</span>
</div>
</fieldset>
<fieldset>
<div class="pure-control-group">
<label for="trigger-type">Filter and restrict change detection of content to</label>
{{ render_checkbox_field(form.trigger_add, class="trigger-type") }}
{{ render_checkbox_field(form.trigger_del, class="trigger-type") }}
<span class="pure-form-message-inline">
Filters the change-detection of this watch to only this type of content change. <strong>Replacements</strong> (neither additions nor deletions) are always included. The 'diff' will still include all changes.
</span>
</div>
</fieldset>
<div class="pure-control-group">
{% set field = render_field(form.css_filter,
placeholder=".class-name or #some-id, or other CSS selector rule.",

View File

@@ -87,7 +87,7 @@
<a class="state-{{'on' if watch.notification_muted}}" href="{{url_for('index', op='mute', uuid=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='bell-off.svg')}}" alt="Mute notifications" title="Mute notifications"/></a>
</td>
<td class="title-col inline">{{watch.title if watch.title is not none and watch.title|length > 0 else watch.url}}
<a class="external" target="_blank" rel="noopener" href="{{ watch.url.replace('source:','') }}"></a>
<a class="external" target="_blank" rel="noopener" href="{{ watch.link.replace('source:','') }}"></a>
<a href="{{url_for('form_share_put_watch', uuid=watch.uuid)}}"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread.svg')}}" /></a>
{%if watch.fetch_backend == "html_webdriver" %}<img style="height: 1em; display:inline-block;" src="{{url_for('static_content', group='images', filename='Google-Chrome-icon.png')}}" />{% endif %}

View File

@@ -1,18 +1,31 @@
#!/usr/bin/python3
import time
from .util import set_original_response, set_modified_response, live_server_setup
from flask import url_for
from urllib.request import urlopen
from . util import set_original_response, set_modified_response, live_server_setup
from zipfile import ZipFile
import re
import time
def test_backup(client, live_server):
live_server_setup(live_server)
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
res = client.post(
url_for("import_page"),
data={"urls": url_for('test_endpoint', _external=True)},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(3)
res = client.get(
url_for("get_backup"),
follow_redirects=True
@@ -20,6 +33,19 @@ def test_backup(client, live_server):
# Should get the right zip content type
assert res.content_type == "application/zip"
# Should be PK/ZIP stream
assert res.data.count(b'PK') >= 2
# ZipFile from buffer seems non-obvious, just save it instead
with open("download.zip", 'wb') as f:
f.write(res.data)
zip = ZipFile('download.zip')
l = zip.namelist()
uuid4hex = re.compile('^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}.*txt', re.I)
newlist = list(filter(uuid4hex.match, l)) # Read Note below
# Should be three txt files in the archive (history.txt, previous.txt, and the snapshot)
assert len(newlist) == 3
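
The test above writes the response to `download.zip` before opening it. As an alternative sketch, `zipfile.ZipFile` also accepts a file-like object, so the bytes can be wrapped in `io.BytesIO` and inspected in memory. The helper names and the archive contents below are made up for the example; the regular expression is the one from the test.

```python
import io
import re
import uuid
import zipfile

# Same pattern as in the test: a version-4 UUID followed by anything, then 'txt'
UUID4_TXT = re.compile(
    '^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}.*txt',
    re.I)


def txt_files_under_uuid_dirs(zip_bytes):
    """Return archive members whose path starts with a UUID4 and matches 'txt'."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [name for name in archive.namelist() if UUID4_TXT.match(name)]


def make_demo_zip():
    """Build a small archive in memory; the member names are assumptions."""
    watch_uuid = str(uuid.uuid4())
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(watch_uuid + "/history.txt", "1666000000,snap.txt\n")
        archive.writestr(watch_uuid + "/previous.txt", "text")
        archive.writestr("secret.txt", "not inside a watch directory")
    return buffer.getvalue()
```

Only the two members inside the UUID-named directory match; `secret.txt` at the archive root is ignored.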

View File

@@ -0,0 +1,107 @@
#!/usr/bin/python3
# @NOTE: THIS RELIES ON SOME MIDDLEWARE TO MAKE CHECKBOXES WORK WITH WTFORMS UNDER TEST CONDITION, see changedetectionio/tests/util.py
import time
from flask import url_for
from .util import live_server_setup
def set_original_response():
test_return_data = """
Here
is
some
text
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_response_with_deleted_word():
test_return_data = """
Here
is
text
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_response_with_changed_word():
test_return_data = """
Here
ix
some
text
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def test_diff_filter_changes_as_add_delete(client, live_server):
live_server_setup(live_server)
sleep_time_for_fetch_thread = 3
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
# Wait for it to read the original version
time.sleep(sleep_time_for_fetch_thread)
# Make a change that ONLY includes deletes
set_response_with_deleted_word()
res = client.post(
url_for("edit_page", uuid="first"),
data={"trigger_add": "y",
"trigger_del": "n",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(sleep_time_for_fetch_thread)
# We should NOT see a change because we chose to not know about any Deletions
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
# Recheck to be sure
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
# Now set the original response, which will include the word, which should trigger Added (because trigger_add ==y)
set_original_response()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index"))
assert b'unviewed' in res.data
# Now check 'changes' are always going to be triggered
set_original_response()
client.post(
url_for("edit_page", uuid="first"),
# Neither trigger add nor del? then we should see changes still
data={"trigger_add": "n",
"trigger_del": "n",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
time.sleep(sleep_time_for_fetch_thread)
client.get(url_for("mark_all_viewed"), follow_redirects=True)
set_response_with_changed_word()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index"))
assert b'unviewed' in res.data

View File

@@ -0,0 +1,83 @@
#!/usr/bin/python3
import time
from flask import url_for
from .util import live_server_setup
def set_original_response():
test_return_data = """
A few new lines
Where there is more lines originally
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_delete_response():
test_return_data = """
A few new lines
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def test_diff_filtering_no_del(client, live_server):
live_server_setup(live_server)
sleep_time_for_fetch_thread = 3
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(sleep_time_for_fetch_thread)
# Edit the watch: trigger on additions only, ignore deletions
res = client.post(
url_for("edit_page", uuid="first"),
data={"trigger_add": "y",
"trigger_del": "n",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
assert b'unviewed' not in res.data
# Make a delete change
set_delete_response()
time.sleep(sleep_time_for_fetch_thread)
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# We should NOT see the change
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
# Restore the original response, which re-adds the removed line (an add change)
set_original_response()
time.sleep(sleep_time_for_fetch_thread)
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# We should see the change
res = client.get(url_for("index"))
assert b'unviewed' in res.data

View File

@@ -0,0 +1,72 @@
#!/usr/bin/python3
import time
from flask import url_for
from .util import live_server_setup
def set_original_response():
test_return_data = """
A few new lines
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_add_response():
test_return_data = """
A few new lines
Where there is more lines than before
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def test_diff_filtering_no_add(client, live_server):
live_server_setup(live_server)
sleep_time_for_fetch_thread = 3
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(sleep_time_for_fetch_thread)
# Edit the watch: trigger on deletions only, ignore additions
res = client.post(
url_for("edit_page", uuid="first"),
data={"trigger_add": "n",
"trigger_del": "y",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
assert b'unviewed' not in res.data
# Make an add change
set_add_response()
time.sleep(sleep_time_for_fetch_thread)
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# We should NOT see the change
res = client.get(url_for("index"))
assert b'unviewed' not in res.data

View File

@@ -81,4 +81,4 @@ def test_consistent_history(client, live_server):
assert len(files_in_watch_dir) == 2, "Should be just two files in the dir, history.txt and the snapshot"
assert len(files_in_watch_dir) == 3, "Should be just three files in the dir, history.txt, previous.txt, and the snapshot"

View File

@@ -4,6 +4,12 @@ from flask import make_response, request
from flask import url_for
import logging
import time
from werkzeug import Request
import io
# This is a fix for macOS running tests.
import multiprocessing
multiprocessing.set_start_method("fork")
def set_original_response():
test_return_data = """<html>
@@ -159,6 +165,38 @@ def live_server_setup(live_server):
ret = " ".join([auth.username, auth.password, auth.type])
return ret
# Make sure any checkboxes that are supposed to be defaulted to true are set during the post request
# This is due to the fact that defaults are set in the HTML which we are not using during tests.
# This does not affect the server when running outside of a test
class DefaultCheckboxMiddleware(object):
def __init__(self, app):
self.app = app
def __call__(self, environ, start_response):
request = Request(environ)
if request.method == "POST" and "/edit" in request.path:
body = environ['wsgi.input'].read()
# if the checkboxes are not set, set them to true
if b"trigger_add" not in body:
body += b'&trigger_add=y'
if b"trigger_del" not in body:
body += b'&trigger_del=y'
# remove any checkboxes set to "n" so wtforms processes them correctly
body = body.replace(b"trigger_add=n", b"")
body = body.replace(b"trigger_del=n", b"")
body = body.replace(b"&&", b"&")
new_stream = io.BytesIO(body)
environ["CONTENT_LENGTH"] = str(len(body))
environ['wsgi.input'] = new_stream
return self.app(environ, start_response)
live_server.app.wsgi_app = DefaultCheckboxMiddleware(live_server.app.wsgi_app)
# Just return some GET var
@live_server.app.route('/test-return-query', methods=['GET'])
def test_return_query():
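
The `DefaultCheckboxMiddleware` above patches the raw POST body with byte-level `replace()` calls. A possible alternative, sketched here with the standard library only, parses the form body into key/value pairs first, which avoids leftover `&` separators. The names `normalise_checkbox_body` and `CheckboxDefaultsMiddleware` are hypothetical, and the sketch re-encodes the body with `urlencode`, so values come back percent-encoded.

```python
import io
from urllib.parse import parse_qsl, urlencode

CHECKBOXES = ("trigger_add", "trigger_del")


def normalise_checkbox_body(body):
    """Drop checkboxes posted as 'n' and default missing ones to 'y'."""
    pairs = parse_qsl(body.decode(), keep_blank_values=True)
    posted = {k for k, _ in pairs}
    # An unchecked HTML checkbox is simply absent from the form data
    pairs = [(k, v) for k, v in pairs if not (k in CHECKBOXES and v == "n")]
    for name in CHECKBOXES:
        if name not in posted:
            pairs.append((name, "y"))
    return urlencode(pairs).encode()


class CheckboxDefaultsMiddleware:
    """Hypothetical WSGI middleware applying normalise_checkbox_body to POSTs."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        is_edit_post = (environ.get("REQUEST_METHOD") == "POST"
                        and "/edit" in environ.get("PATH_INFO", ""))
        if is_edit_post:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = normalise_checkbox_body(environ["wsgi.input"].read(length))
            environ["wsgi.input"] = io.BytesIO(body)
            # PEP 3333 expects CGI-style variables such as CONTENT_LENGTH to be strings
            environ["CONTENT_LENGTH"] = str(len(body))
        return self.app(environ, start_response)
```

As in the original middleware, a checkbox posted as `n` is removed rather than defaulted, so the form sees it as unchecked.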

View File

@@ -47,7 +47,7 @@ selenium ~= 4.1.0
werkzeug ~= 2.0.0
# Templating, so far just in the URLs but in the future can be for the notifications also
jinja2
jinja2 ~= 3.1
jinja2-time
# playwright is installed at Dockerfile build time because it's not available on all platforms