Compare commits

53 Commits

Author SHA1 Message Date
dgtlmoon
329a24b234 bump comments 2022-06-15 13:58:28 +02:00
dgtlmoon
6f1be5bf71 Bump docs 2022-06-15 13:57:10 +02:00
dgtlmoon
d1f73ef4f8 Re #695
- :latest should be :latest stable release according to a new release, and also the :0.xx.xx tag for the release
- :dev can be what ever is going on in `master`
2022-06-15 13:45:40 +02:00
dgtlmoon
4a91505af5 Playwright screenshots - no need for high-res "bug workaround" screenshot, use lower quality/faster configurable image quality env var 2022-06-15 10:52:24 +02:00
dgtlmoon
4841c79b4c Adding extra check when updating DB on ReplyWithContentButNoText 2022-06-14 19:54:35 +02:00
dgtlmoon
2ba00d2e1d Notifications log - log what was sent after applying all cleanups 2022-06-14 17:01:13 +02:00
dgtlmoon
19c96f4bdd Re #555 - tgram:// notifications - strip added HTML tag which is not supported by Telegram 2022-06-14 12:00:21 +02:00
dgtlmoon
82b900fbf4 Give more helpful error message when a page doesnt load 2022-06-14 08:16:22 +02:00
dgtlmoon
358a365303 Tweaks to playwright fetch code - better timeout handling 2022-06-13 23:39:43 +02:00
dgtlmoon
a07ca4b136 Re #580 - New functionality - Random "jitter" delay to requests (#681) 2022-06-13 12:41:53 +02:00
dgtlmoon
ba8cf2c8cf 0.39.15 2022-06-12 14:05:34 +02:00
dgtlmoon
3106b6688e Watch overview list - adding spinner to make it easier to see whats currently being 'Checked' 2022-06-12 12:52:17 +02:00
dgtlmoon
2c83845dac Preview section - add helpful check 2022-06-12 11:10:06 +02:00
dgtlmoon
111266d6fa Send test notification - improved handling of errors 2022-06-12 10:47:00 +02:00
dgtlmoon
ead610151f Notification log - also log normal requests and make the log easier to find 2022-06-11 23:07:09 +02:00
dgtlmoon
7e1e763989 Update bug_report.md 2022-06-11 00:43:28 +02:00
dgtlmoon
327cc4af34 Use correct RSS CDATA handling (#662) 2022-06-08 18:40:01 +02:00
dgtlmoon
6008ff516e Improve logging (#671) 2022-06-08 18:32:41 +02:00
dgtlmoon
cdcf4b353f New [scrub] button when editing a watch - scrub single watch history (#672) 2022-06-08 18:32:25 +02:00
dgtlmoon
1ab70f8e86 Diff + Preview - Hide date selector widget when viewing screenshots as its not yet possible to compare screenshots (but will be soon!) 2022-06-07 19:53:13 +02:00
dgtlmoon
8227c012a7 Diff + Preview - Fixing screenshot behaviour after preference change 2022-06-07 19:51:17 +02:00
dgtlmoon
c113d5fb24 Screenshot handling on the diff/preview section refactor (#630) 2022-06-07 19:22:42 +02:00
dgtlmoon
8c8d4066d7 Shared watches - include "Extract text" filter 2022-06-07 17:06:05 +02:00
dgtlmoon
277dc9e1c1 Improve error message when filter not found in page result (#666) 2022-06-07 16:43:57 +02:00
dgtlmoon
fc0fd1ce9d "Extract text" filter - improve placeholder example 2022-06-06 18:26:47 +02:00
dgtlmoon
bd6127728a Visual selector - 'clear selection' button should clear the filter also 2022-06-06 17:07:29 +02:00
dgtlmoon
4101ae00c6 New feature - "Extract text" filter ability (#624) 2022-06-06 16:57:50 +02:00
dgtlmoon
62f14df3cb Fixing RSS feed HTML content formatting (#662) 2022-06-06 10:24:39 +02:00
Fuzzy
560d465c59 Update notification library - Improving telegram support 2022-06-06 10:07:50 +02:00
dgtlmoon
7929aeddfc 'Mark all viewed' button was missing in this version, added test also. (#652) 2022-06-02 10:01:03 +02:00
dgtlmoon
8294519f43 Content fetcher - Handle when a page doesnt load properly 2022-06-01 13:12:37 +02:00
dgtlmoon
8ba8a220b6 Playwright - Correctly close browser context/sessions on exceptions 2022-06-01 12:59:44 +02:00
dgtlmoon
aa3c8a9370 Move history data to a textfile, improves memory handling (#638) 2022-05-31 23:43:50 +02:00
dgtlmoon
dbb5468cdc Update feature_request.md 2022-05-31 22:07:22 +02:00
dgtlmoon
329c7620fb Remove UK Covid news 2022-05-31 22:04:35 +02:00
Amos (LFlare) Ng
1f974bfbb0 Visual Selector fix: Firefox compatibility - Visual Selector (#646) 2022-05-31 09:04:01 +02:00
Tim Loderhose
437c8525af Remove group tag arbitrary length limit (#645) 2022-05-30 18:28:53 +02:00
dgtlmoon
a2a1d5ae90 Distill.io import bug fix when no tags assigned to a watch (#557) 2022-05-29 22:04:23 +02:00
dgtlmoon
2566de2aae Ignore whitespace on by default 2022-05-28 13:30:57 +02:00
dgtlmoon
dfec8dbb39 Visual Selector - clear events when changing tabs 2022-05-25 15:47:30 +02:00
dgtlmoon
5cefb16e52 Minor code cleanup 2022-05-25 15:38:40 +02:00
dgtlmoon
341ae24b73 Re #616 - content trigger - adding extra test (#620) 2022-05-25 15:31:59 +02:00
dgtlmoon
f47c2fb7f6 README.md update Visual Selector tool - tidy up screenshots, improve text 2022-05-25 11:44:59 +02:00
dgtlmoon
9d742446ab Playwright - ByPass CSP for more reliable JS scraping, disable accept downloads 2022-05-25 11:05:18 +02:00
dgtlmoon
e3e022b0f4 VisualSelector - Better handling of filter targets that are no longer available in the HTML 2022-05-25 10:23:43 +02:00
dgtlmoon
6de4027c27 Update bug_report.md 2022-05-24 14:13:11 +02:00
dgtlmoon
cda3837355 pip build fix - include API module 2022-05-24 00:16:50 +02:00
dgtlmoon
7983675325 Visual Selector - be more resilient when sites interfere with the xPath scraping 2022-05-24 00:10:38 +02:00
dgtlmoon
eef56e52c6 Adding new Visual Selector for choosing the area of the webpage to monitor - playwright/browserless only (#566) 2022-05-23 23:44:51 +02:00
dgtlmoon
8e3195f394 0.39.14 2022-05-23 14:40:26 +02:00
dgtlmoon
e17c2121f7 Fix encoding errors with XPath filters from UTF-8 responses (#619) 2022-05-20 18:07:08 +02:00
dgtlmoon
07e279b38d API Interface (#617) 2022-05-20 16:27:51 +02:00
dgtlmoon
2c834cfe37 Add note that changedetection is not performed on the screenshot just yet (WIP https://github.com/dgtlmoon/changedetection.io/pull/419 ) 2022-05-20 12:52:41 +02:00
75 changed files with 2202 additions and 468 deletions

@@ -1,9 +1,9 @@
--- ---
name: Bug report name: Bug report
about: Create a report to help us improve about: Create a bug report, if you don't follow this template, your report will be DELETED
title: '' title: ''
labels: '' labels: 'triage'
assignees: '' assignees: 'dgtlmoon'
--- ---
@@ -11,15 +11,18 @@ assignees: ''
A clear and concise description of what the bug is. A clear and concise description of what the bug is.
**Version** **Version**
In the top right area: 0.... *Exact version* in the top right area: 0....
**To Reproduce** **To Reproduce**
Steps to reproduce the behavior: Steps to reproduce the behavior:
1. Go to '...' 1. Go to '...'
2. Click on '....' 2. Click on '....'
3. Scroll down to '....' 3. Scroll down to '....'
4. See error 4. See error
! ALWAYS INCLUDE AN EXAMPLE URL WHERE IT IS POSSIBLE TO RE-CREATE THE ISSUE - USE THE 'SHARE WATCH' FEATURE AND PASTE IN THE SHARE-LINK!
**Expected behavior** **Expected behavior**
A clear and concise description of what you expected to happen. A clear and concise description of what you expected to happen.

@@ -1,8 +1,8 @@
--- ---
name: Feature request name: Feature request
about: Suggest an idea for this project about: Suggest an idea for this project
title: '' title: '[feature]'
labels: '' labels: 'enhancement'
assignees: '' assignees: ''
--- ---

@@ -85,8 +85,8 @@ jobs:
version: latest version: latest
driver-opts: image=moby/buildkit:master driver-opts: image=moby/buildkit:master
# master always builds :latest # master branch -> :dev container tag
- name: Build and push :latest - name: Build and push :dev
id: docker_build id: docker_build
if: ${{ github.ref }} == "refs/heads/master" if: ${{ github.ref }} == "refs/heads/master"
uses: docker/build-push-action@v2 uses: docker/build-push-action@v2
@@ -95,12 +95,12 @@ jobs:
file: ./Dockerfile file: ./Dockerfile
push: true push: true
tags: | tags: |
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:latest,ghcr.io/${{ github.repository }}:latest ${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:dev,ghcr.io/${{ github.repository }}:dev
platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7 platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7
cache-from: type=local,src=/tmp/.buildx-cache cache-from: type=local,src=/tmp/.buildx-cache
cache-to: type=local,dest=/tmp/.buildx-cache cache-to: type=local,dest=/tmp/.buildx-cache
# A new tagged release is required, which builds :tag # A new tagged release is required, which builds :tag and :latest
- name: Build and push :tag - name: Build and push :tag
id: docker_build_tag_release id: docker_build_tag_release
if: github.event_name == 'release' && startsWith(github.event.release.tag_name, '0.') if: github.event_name == 'release' && startsWith(github.event.release.tag_name, '0.')
@@ -110,7 +110,10 @@ jobs:
file: ./Dockerfile file: ./Dockerfile
push: true push: true
tags: | tags: |
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:${{ github.event.release.tag_name }},ghcr.io/dgtlmoon/changedetection.io:${{ github.event.release.tag_name }} ${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:${{ github.event.release.tag_name }}
ghcr.io/dgtlmoon/changedetection.io:${{ github.event.release.tag_name }}
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:latest
ghcr.io/dgtlmoon/changedetection.io:latest
platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7 platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7
cache-from: type=local,src=/tmp/.buildx-cache cache-from: type=local,src=/tmp/.buildx-cache
cache-to: type=local,dest=/tmp/.buildx-cache cache-to: type=local,dest=/tmp/.buildx-cache
@@ -125,5 +128,3 @@ jobs:
key: ${{ runner.os }}-buildx-${{ github.sha }} key: ${{ runner.os }}-buildx-${{ github.sha }}
restore-keys: | restore-keys: |
${{ runner.os }}-buildx- ${{ runner.os }}-buildx-
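
The tagging rule this workflow change aims for can be summarised in a short sketch. The function `image_tags` and its parameters are hypothetical, introduced only for illustration; the workflow itself expresses the rule through `if:` conditions and `tags:` lists, and registry prefixes are omitted here.

```python
from typing import List, Optional


def image_tags(event_name: str, ref: str, release_tag: Optional[str] = None) -> List[str]:
    """Return the container tags a build is meant to publish (illustrative only)."""
    tags: List[str] = []
    # A build of the master branch publishes the bleeding-edge :dev tag.
    if ref == "refs/heads/master":
        tags.append("dev")
    # A release whose tag starts with "0." publishes that version and :latest.
    if event_name == "release" and release_tag and release_tag.startswith("0."):
        tags.extend([release_tag, "latest"])
    return tags
```

Under these assumptions, a push to `master` yields only `dev`, while a `0.39.15` release yields `0.39.15` and `latest`, which matches the README note that `:latest` is the latest stable release.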

1
.gitignore vendored
@@ -8,5 +8,6 @@ __pycache__
build build
dist dist
venv venv
test-datastore
*.egg-info* *.egg-info*
.vscode/settings.json .vscode/settings.json

@@ -1,3 +1,4 @@
recursive-include changedetectionio/api *
recursive-include changedetectionio/templates * recursive-include changedetectionio/templates *
recursive-include changedetectionio/static * recursive-include changedetectionio/static *
recursive-include changedetectionio/model * recursive-include changedetectionio/model *

@@ -12,7 +12,7 @@ Live your data-life *pro-actively* instead of *re-actively*.
Free, Open-source web page monitoring, notification and change detection. Don't have time? [**Try our $6.99/month subscription - unlimited checks and watches!**](https://lemonade.changedetection.io/start) Free, Open-source web page monitoring, notification and change detection. Don't have time? [**Try our $6.99/month subscription - unlimited checks and watches!**](https://lemonade.changedetection.io/start)
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />](https://lemonade.changedetection.io/start) [<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />](https://lemonade.changedetection.io/start)
**Get your own private instance now! Let us host it for you!** **Get your own private instance now! Let us host it for you!**
@@ -48,26 +48,37 @@ _Need an actual Chrome runner with Javascript support? We support fetching via W
## Screenshots ## Screenshots
Examining differences in content. ### Examine differences in content.
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/screenshot-diff.png" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " /> Easily see what changed, examine by word, line, or individual character.
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot-diff.png" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />
Please :star: star :star: this project and help it grow! https://github.com/dgtlmoon/changedetection.io/ Please :star: star :star: this project and help it grow! https://github.com/dgtlmoon/changedetection.io/
### Target elements with the Visual Selector tool.
Available when connected to a <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Playwright-content-fetcher">playwright content fetcher</a> (available also as part of our subscription service)
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/visualselector-anim.gif" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />
## Installation ## Installation
### Docker ### Docker
With Docker composer, just clone this repository and.. With Docker composer, just clone this repository and..
```bash ```bash
$ docker-compose up -d $ docker-compose up -d
``` ```
Docker standalone Docker standalone
```bash ```bash
$ docker run -d --restart always -p "127.0.0.1:5000:5000" -v datastore-volume:/datastore --name changedetection.io dgtlmoon/changedetection.io $ docker run -d --restart always -p "127.0.0.1:5000:5000" -v datastore-volume:/datastore --name changedetection.io dgtlmoon/changedetection.io
``` ```
`:latest` tag is our latest stable release, `:dev` tag is our bleeding edge `master` branch.
### Windows ### Windows
See the install instructions at the wiki https://github.com/dgtlmoon/changedetection.io/wiki/Microsoft-Windows See the install instructions at the wiki https://github.com/dgtlmoon/changedetection.io/wiki/Microsoft-Windows
@@ -129,7 +140,7 @@ Just some examples
<a href="https://github.com/caronc/apprise#popular-notification-services">And everything else in this list!</a> <a href="https://github.com/caronc/apprise#popular-notification-services">And everything else in this list!</a>
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/screenshot-notifications.png" style="max-width:100%;" alt="Self-hosted web page change monitoring notifications" title="Self-hosted web page change monitoring notifications" /> <img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot-notifications.png" style="max-width:100%;" alt="Self-hosted web page change monitoring notifications" title="Self-hosted web page change monitoring notifications" />
Now you can also customise your notification content! Now you can also customise your notification content!
@@ -137,11 +148,11 @@ Now you can also customise your notification content!
Detect changes and monitor data in JSON API's by using the built-in JSONPath selectors as a filter / selector. Detect changes and monitor data in JSON API's by using the built-in JSONPath selectors as a filter / selector.
![image](https://user-images.githubusercontent.com/275001/125165842-0ce01980-e1dc-11eb-9e73-d8137dd162dc.png) ![image](https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/json-filter-field-example.png)
This will re-parse the JSON and apply formatting to the text, making it super easy to monitor and detect changes in JSON API results This will re-parse the JSON and apply formatting to the text, making it super easy to monitor and detect changes in JSON API results
![image](https://user-images.githubusercontent.com/275001/125165995-d9ea5580-e1dc-11eb-8030-f0deced2661a.png) ![image](https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/json-diff-example.png)
### Parse JSON embedded in HTML! ### Parse JSON embedded in HTML!
@@ -177,7 +188,7 @@ Or directly donate an amount PayPal [![Donate](https://img.shields.io/badge/Dona
Or BTC `1PLFN327GyUarpJd7nVe7Reqg9qHx5frNn` Or BTC `1PLFN327GyUarpJd7nVe7Reqg9qHx5frNn`
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/btc-support.png" style="max-width:50%;" alt="Support us!" /> <img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/btc-support.png" style="max-width:50%;" alt="Support us!" />
## Commercial Support ## Commercial Support

@@ -20,6 +20,7 @@ from copy import deepcopy
from threading import Event from threading import Event
import flask_login import flask_login
import logging
import pytz import pytz
import timeago import timeago
from feedgen.feed import FeedGenerator from feedgen.feed import FeedGenerator
@@ -36,11 +37,14 @@ from flask import (
url_for, url_for,
) )
from flask_login import login_required from flask_login import login_required
from flask_restful import abort, Api
from flask_wtf import CSRFProtect from flask_wtf import CSRFProtect
from changedetectionio import html_tools from changedetectionio import html_tools
from changedetectionio.api import api_v1
__version__ = '0.39.13.1' __version__ = '0.39.15'
datastore = None datastore = None
@@ -78,6 +82,8 @@ csrf.init_app(app)
notification_debug_log=[] notification_debug_log=[]
watch_api = Api(app, decorators=[csrf.exempt])
def init_app_secret(datastore_path): def init_app_secret(datastore_path):
secret = "" secret = ""
@@ -102,7 +108,7 @@ def _jinja2_filter_datetime(watch_obj, format="%Y-%m-%d %H:%M:%S"):
# Worker thread tells us which UUID it is currently processing. # Worker thread tells us which UUID it is currently processing.
for t in running_update_threads: for t in running_update_threads:
if t.current_uuid == watch_obj['uuid']: if t.current_uuid == watch_obj['uuid']:
return "Checking now.." return '<span class="loader"></span><span> Checking now</span>'
if watch_obj['last_checked'] == 0: if watch_obj['last_checked'] == 0:
return 'Not yet' return 'Not yet'
@@ -173,12 +179,35 @@ def changedetection_app(config=None, datastore_o=None):
global datastore global datastore
datastore = datastore_o datastore = datastore_o
# so far just for read-only via tests, but this will be moved eventually to be the main source
# (instead of the global var)
app.config['DATASTORE']=datastore_o
#app.config.update(config or {}) #app.config.update(config or {})
login_manager = flask_login.LoginManager(app) login_manager = flask_login.LoginManager(app)
login_manager.login_view = 'login' login_manager.login_view = 'login'
app.secret_key = init_app_secret(config['datastore_path']) app.secret_key = init_app_secret(config['datastore_path'])
watch_api.add_resource(api_v1.WatchSingleHistory,
'/api/v1/watch/<string:uuid>/history/<string:timestamp>',
resource_class_kwargs={'datastore': datastore, 'update_q': update_q})
watch_api.add_resource(api_v1.WatchHistory,
'/api/v1/watch/<string:uuid>/history',
resource_class_kwargs={'datastore': datastore})
watch_api.add_resource(api_v1.CreateWatch, '/api/v1/watch',
resource_class_kwargs={'datastore': datastore, 'update_q': update_q})
watch_api.add_resource(api_v1.Watch, '/api/v1/watch/<string:uuid>',
resource_class_kwargs={'datastore': datastore, 'update_q': update_q})
# Setup cors headers to allow all domains # Setup cors headers to allow all domains
# https://flask-cors.readthedocs.io/en/latest/ # https://flask-cors.readthedocs.io/en/latest/
# CORS(app) # CORS(app)
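
The four routes registered above share one URL scheme, which a client could assemble roughly as follows. This is a minimal sketch, not the project's client: `watch_request` is a hypothetical helper, the base URL reuses port 5000 from the README's Docker example, and the `x-api-key` header name is an assumption that this diff does not confirm. The request object is only built, never sent.

```python
from urllib.request import Request

# Assumption: host and port taken from the README's "docker run" example.
BASE_URL = "http://localhost:5000"


def watch_request(api_key: str, uuid: str = "", history: bool = False,
                  timestamp: str = "") -> Request:
    """Build (but do not send) a GET request for one of the /api/v1/watch routes."""
    path = "/api/v1/watch"
    if uuid:
        path += "/" + uuid            # /api/v1/watch/<uuid>
        if history:
            path += "/history"        # /api/v1/watch/<uuid>/history
            if timestamp:
                path += "/" + timestamp  # .../history/<timestamp>
    # Assumption: the header name is illustrative, not shown in this diff.
    return Request(BASE_URL + path, headers={"x-api-key": api_key}, method="GET")
```

Calling `watch_request("KEY", "some-uuid", history=True)` would target the `WatchHistory` resource; passing a timestamp as well would target `WatchSingleHistory`.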
@@ -293,25 +322,19 @@ def changedetection_app(config=None, datastore_o=None):
for watch in sorted_watches: for watch in sorted_watches:
dates = list(watch['history'].keys()) dates = list(watch.history.keys())
# Re #521 - Don't bother processing this one if theres less than 2 snapshots, means we never had a change detected. # Re #521 - Don't bother processing this one if theres less than 2 snapshots, means we never had a change detected.
if len(dates) < 2: if len(dates) < 2:
continue continue
# Convert to int, sort and back to str again prev_fname = watch.history[dates[-2]]
# @todo replace datastore getter that does this automatically
dates = [int(i) for i in dates]
dates.sort(reverse=True)
dates = [str(i) for i in dates]
prev_fname = watch['history'][dates[1]]
if not watch['viewed']: if not watch.viewed:
# Re #239 - GUID needs to be individual for each event # Re #239 - GUID needs to be individual for each event
# @todo In the future make this a configurable link back (see work on BASE_URL https://github.com/dgtlmoon/changedetection.io/pull/228) # @todo In the future make this a configurable link back (see work on BASE_URL https://github.com/dgtlmoon/changedetection.io/pull/228)
guid = "{}/{}".format(watch['uuid'], watch['last_changed']) guid = "{}/{}".format(watch['uuid'], watch['last_changed'])
fe = fg.add_entry() fe = fg.add_entry()
# Include a link to the diff page, they will have to login here to see if password protection is enabled. # Include a link to the diff page, they will have to login here to see if password protection is enabled.
# Description is the page you watch, link takes you to the diff JS UI page # Description is the page you watch, link takes you to the diff JS UI page
base_url = datastore.data['settings']['application']['base_url'] base_url = datastore.data['settings']['application']['base_url']
@@ -326,13 +349,14 @@ def changedetection_app(config=None, datastore_o=None):
watch_title = watch.get('title') if watch.get('title') else watch.get('url') watch_title = watch.get('title') if watch.get('title') else watch.get('url')
fe.title(title=watch_title) fe.title(title=watch_title)
latest_fname = watch['history'][dates[0]] latest_fname = watch.history[dates[-1]]
html_diff = diff.render_diff(prev_fname, latest_fname, include_equal=False, line_feed_sep="</br>") html_diff = diff.render_diff(prev_fname, latest_fname, include_equal=False, line_feed_sep="</br>")
fe.description(description="<![CDATA[<html><body><h4>{}</h4>{}</body></html>".format(watch_title, html_diff)) fe.content(content="<html><body><h4>{}</h4>{}</body></html>".format(watch_title, html_diff),
type='CDATA')
fe.guid(guid, permalink=False) fe.guid(guid, permalink=False)
dt = datetime.datetime.fromtimestamp(int(watch['newest_history_key'])) dt = datetime.datetime.fromtimestamp(int(watch.newest_history_key))
dt = dt.replace(tzinfo=pytz.UTC) dt = dt.replace(tzinfo=pytz.UTC)
fe.pubDate(dt) fe.pubDate(dt)
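
The hunk above drops the explicit descending sort and instead indexes from the end of the key list (`dates[-2]`, `dates[-1]`). That only selects the two newest snapshots if `watch.history` yields its keys oldest-first, which is an assumption here since the property's implementation is not part of this hunk. A minimal sketch of the idea, with `newest_two` as a hypothetical helper:

```python
from typing import Dict, Tuple


def newest_two(history: Dict[str, str]) -> Tuple[str, str]:
    """Return (previous, latest) snapshot paths from an oldest-first mapping."""
    # Python dicts preserve insertion order, so the last key is the newest
    # as long as snapshots were inserted in chronological order.
    dates = list(history.keys())
    if len(dates) < 2:
        raise ValueError("need at least two snapshots to produce a diff")
    return history[dates[-2]], history[dates[-1]]
```

If the mapping were ever built out of order, the keys would need sorting by their integer value first, as the removed code did.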
@@ -367,6 +391,8 @@ def changedetection_app(config=None, datastore_o=None):
if limit_tag != None: if limit_tag != None:
# Support for comma separated list of tags. # Support for comma separated list of tags.
if watch['tag'] is None:
continue
for tag_in_watch in watch['tag'].split(','): for tag_in_watch in watch['tag'].split(','):
tag_in_watch = tag_in_watch.strip() tag_in_watch = tag_in_watch.strip()
if tag_in_watch == limit_tag: if tag_in_watch == limit_tag:
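
The added `None` check guards the `split(',')` call, which would otherwise raise `AttributeError` for a watch without tags. The same matching rule can be sketched as a standalone function; `watch_has_tag` is a hypothetical name used only for illustration.

```python
from typing import Optional


def watch_has_tag(watch_tag: Optional[str], limit_tag: str) -> bool:
    """True if limit_tag appears in a comma-separated tag string."""
    # A watch with no tag assigned can never match a tag filter.
    if watch_tag is None:
        return False
    # Tags are compared after stripping surrounding whitespace.
    return any(t.strip() == limit_tag for t in watch_tag.split(","))
```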
@@ -389,11 +415,13 @@ def changedetection_app(config=None, datastore_o=None):
tags=existing_tags, tags=existing_tags,
active_tag=limit_tag, active_tag=limit_tag,
app_rss_token=datastore.data['settings']['application']['rss_access_token'], app_rss_token=datastore.data['settings']['application']['rss_access_token'],
has_unviewed=datastore.data['has_unviewed'], has_unviewed=datastore.has_unviewed,
# Don't link to hosting when we're on the hosting environment # Don't link to hosting when we're on the hosting environment
hosted_sticky=os.getenv("SALTED_PASS", False) == False, hosted_sticky=os.getenv("SALTED_PASS", False) == False,
guid=datastore.data['app_guid'], guid=datastore.data['app_guid'],
queued_uuids=update_q.queue) queued_uuids=update_q.queue)
if session.get('share-link'): if session.get('share-link'):
del(session['share-link']) del(session['share-link'])
return output return output
@@ -430,6 +458,19 @@ def changedetection_app(config=None, datastore_o=None):
return 'OK' return 'OK'
@app.route("/scrub/<string:uuid>", methods=['GET'])
@login_required
def scrub_watch(uuid):
try:
datastore.scrub_watch(uuid)
except KeyError:
flash('Watch not found', 'error')
else:
flash("Scrubbed watch {}".format(uuid))
return redirect(url_for('index'))
@app.route("/scrub", methods=['GET', 'POST']) @app.route("/scrub", methods=['GET', 'POST'])
@login_required @login_required
def scrub_page(): def scrub_page():
@@ -465,10 +506,10 @@ def changedetection_app(config=None, datastore_o=None):
# 0 means that theres only one, so that there should be no 'unviewed' history available # 0 means that theres only one, so that there should be no 'unviewed' history available
if newest_history_key == 0: if newest_history_key == 0:
newest_history_key = list(datastore.data['watching'][uuid]['history'].keys())[0] newest_history_key = list(datastore.data['watching'][uuid].history.keys())[0]
if newest_history_key: if newest_history_key:
with open(datastore.data['watching'][uuid]['history'][newest_history_key], with open(datastore.data['watching'][uuid].history[newest_history_key],
encoding='utf-8') as file: encoding='utf-8') as file:
raw_content = file.read() raw_content = file.read()
@@ -562,12 +603,12 @@ def changedetection_app(config=None, datastore_o=None):
# Reset the previous_md5 so we process a new snapshot including stripping ignore text. # Reset the previous_md5 so we process a new snapshot including stripping ignore text.
if form_ignore_text: if form_ignore_text:
if len(datastore.data['watching'][uuid]['history']): if len(datastore.data['watching'][uuid].history):
extra_update_obj['previous_md5'] = get_current_checksum_include_ignore_text(uuid=uuid) extra_update_obj['previous_md5'] = get_current_checksum_include_ignore_text(uuid=uuid)
# Reset the previous_md5 so we process a new snapshot including stripping ignore text. # Reset the previous_md5 so we process a new snapshot including stripping ignore text.
if form.css_filter.data.strip() != datastore.data['watching'][uuid]['css_filter']: if form.css_filter.data.strip() != datastore.data['watching'][uuid]['css_filter']:
if len(datastore.data['watching'][uuid]['history']): if len(datastore.data['watching'][uuid].history):
extra_update_obj['previous_md5'] = get_current_checksum_include_ignore_text(uuid=uuid) extra_update_obj['previous_md5'] = get_current_checksum_include_ignore_text(uuid=uuid)
# Be sure proxy value is None # Be sure proxy value is None
@@ -600,6 +641,12 @@ def changedetection_app(config=None, datastore_o=None):
if request.method == 'POST' and not form.validate(): if request.method == 'POST' and not form.validate():
flash("An error occurred, please see below.", "error") flash("An error occurred, please see below.", "error")
visualselector_data_is_ready = datastore.visualselector_data_is_ready(uuid)
# Only works reliably with Playwright
visualselector_enabled = os.getenv('PLAYWRIGHT_DRIVER_URL', False) and default['fetch_backend'] == 'html_webdriver'
output = render_template("edit.html", output = render_template("edit.html",
uuid=uuid, uuid=uuid,
watch=datastore.data['watching'][uuid], watch=datastore.data['watching'][uuid],
@@ -607,7 +654,9 @@ def changedetection_app(config=None, datastore_o=None):
has_empty_checktime=using_default_check_time, has_empty_checktime=using_default_check_time,
using_global_webdriver_wait=default['webdriver_delay'] is None, using_global_webdriver_wait=default['webdriver_delay'] is None,
current_base_url=datastore.data['settings']['application']['base_url'], current_base_url=datastore.data['settings']['application']['base_url'],
emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False) emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False),
visualselector_data_is_ready=visualselector_data_is_ready,
visualselector_enabled=visualselector_enabled
) )
return output return output
@@ -671,6 +720,7 @@ def changedetection_app(config=None, datastore_o=None):
form=form, form=form,
current_base_url = datastore.data['settings']['application']['base_url'], current_base_url = datastore.data['settings']['application']['base_url'],
hide_remove_pass=os.getenv("SALTED_PASS", False), hide_remove_pass=os.getenv("SALTED_PASS", False),
api_key=datastore.data['settings']['application'].get('api_access_token'),
emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False)) emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False))
return output return output
@@ -713,15 +763,14 @@ def changedetection_app(config=None, datastore_o=None):
return output return output
# Clear all statuses, so we do not see the 'unviewed' class # Clear all statuses, so we do not see the 'unviewed' class
@app.route("/api/mark-all-viewed", methods=['GET']) @app.route("/form/mark-all-viewed", methods=['GET'])
@login_required @login_required
def mark_all_viewed(): def mark_all_viewed():
# Save the current newest history as the most recently viewed # Save the current newest history as the most recently viewed
for watch_uuid, watch in datastore.data['watching'].items(): for watch_uuid, watch in datastore.data['watching'].items():
datastore.set_last_viewed(watch_uuid, watch['newest_history_key']) datastore.set_last_viewed(watch_uuid, int(time.time()))
flash("Cleared all statuses.")
return redirect(url_for('index')) return redirect(url_for('index'))
@app.route("/diff/<string:uuid>", methods=['GET']) @app.route("/diff/<string:uuid>", methods=['GET'])
@@ -739,20 +788,17 @@ def changedetection_app(config=None, datastore_o=None):
flash("No history found for the specified link, bad link?", "error") flash("No history found for the specified link, bad link?", "error")
return redirect(url_for('index')) return redirect(url_for('index'))
dates = list(watch['history'].keys()) history = watch.history
# Convert to int, sort and back to str again dates = list(history.keys())
# @todo replace datastore getter that does this automatically
dates = [int(i) for i in dates]
dates.sort(reverse=True)
dates = [str(i) for i in dates]
if len(dates) < 2: if len(dates) < 2:
flash("Not enough saved change detection snapshots to produce a report.", "error") flash("Not enough saved change detection snapshots to produce a report.", "error")
return redirect(url_for('index')) return redirect(url_for('index'))
# Save the current newest history as the most recently viewed # Save the current newest history as the most recently viewed
datastore.set_last_viewed(uuid, dates[0]) datastore.set_last_viewed(uuid, time.time())
newest_file = watch['history'][dates[0]]
newest_file = history[dates[-1]]
try: try:
with open(newest_file, 'r') as f: with open(newest_file, 'r') as f:
@@ -762,10 +808,10 @@ def changedetection_app(config=None, datastore_o=None):
previous_version = request.args.get('previous_version') previous_version = request.args.get('previous_version')
try: try:
previous_file = watch['history'][previous_version] previous_file = history[previous_version]
except KeyError: except KeyError:
# Not present, use a default value, the second one in the sorted list. # Not present, use a default value, the second one in the sorted list.
previous_file = watch['history'][dates[1]] previous_file = history[dates[-2]]
try: try:
with open(previous_file, 'r') as f: with open(previous_file, 'r') as f:
@@ -776,18 +822,25 @@ def changedetection_app(config=None, datastore_o=None):
screenshot_url = datastore.get_screenshot(uuid) screenshot_url = datastore.get_screenshot(uuid)
output = render_template("diff.html", watch_a=watch, system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
is_html_webdriver = True if watch.get('fetch_backend') == 'html_webdriver' or (
watch.get('fetch_backend', None) is None and system_uses_webdriver) else False
output = render_template("diff.html",
watch_a=watch,
newest=newest_version_file_contents, newest=newest_version_file_contents,
previous=previous_version_file_contents, previous=previous_version_file_contents,
extra_stylesheets=extra_stylesheets, extra_stylesheets=extra_stylesheets,
versions=dates[1:], versions=dates[1:],
uuid=uuid, uuid=uuid,
newest_version_timestamp=dates[0], newest_version_timestamp=dates[-1],
current_previous_version=str(previous_version), current_previous_version=str(previous_version),
current_diff_url=watch['url'], current_diff_url=watch['url'],
extra_title=" - Diff - {}".format(watch['title'] if watch['title'] else watch['url']), extra_title=" - Diff - {}".format(watch['title'] if watch['title'] else watch['url']),
left_sticky=True, left_sticky=True,
screenshot=screenshot_url) screenshot=screenshot_url,
is_html_webdriver=is_html_webdriver)
return output return output
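
Review note: the `is_html_webdriver` expression added in this hunk (and again in the preview hunk) can be read as a small predicate. The sketch below is only an illustration; `uses_webdriver` is a hypothetical helper name that does not appear in the diff, and it assumes a watch behaves like a dict.

```python
def uses_webdriver(watch, system_fetch_backend):
    """Return True when the watch is fetched with the 'html_webdriver' backend.

    Hypothetical helper, not part of the diff: a watch-level 'fetch_backend'
    wins; when it is unset (None), the system-wide setting decides.
    """
    watch_backend = watch.get('fetch_backend')
    if watch_backend is None:
        return system_fetch_backend == 'html_webdriver'
    return watch_backend == 'html_webdriver'
```

Both templates could then receive `is_html_webdriver=uses_webdriver(watch, datastore.data['settings']['application']['fetch_backend'])`, which would avoid repeating the `True if ... else False` form.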
@@ -802,6 +855,12 @@ def changedetection_app(config=None, datastore_o=None):
if uuid == 'first': if uuid == 'first':
uuid = list(datastore.data['watching'].keys()).pop() uuid = list(datastore.data['watching'].keys()).pop()
# Normally you would never reach this, because the 'preview' button is not available when there's no history
# However they may try to scrub and reload the page
if datastore.data['watching'][uuid].history_n == 0:
flash("Preview unavailable - No fetch/check completed or triggers not reached", "error")
return redirect(url_for('index'))
extra_stylesheets = [url_for('static_content', group='styles', filename='diff.css')] extra_stylesheets = [url_for('static_content', group='styles', filename='diff.css')]
try: try:
@@ -810,9 +869,9 @@ def changedetection_app(config=None, datastore_o=None):
flash("No history found for the specified link, bad link?", "error") flash("No history found for the specified link, bad link?", "error")
return redirect(url_for('index')) return redirect(url_for('index'))
if len(watch['history']): if watch.history_n >0:
timestamps = sorted(watch['history'].keys(), key=lambda x: int(x)) timestamps = sorted(watch.history.keys(), key=lambda x: int(x))
filename = watch['history'][timestamps[-1]] filename = watch.history[timestamps[-1]]
try: try:
with open(filename, 'r') as f: with open(filename, 'r') as f:
tmp = f.readlines() tmp = f.readlines()
@@ -848,6 +907,11 @@ def changedetection_app(config=None, datastore_o=None):
content.append({'line': "No history found", 'classes': ''}) content.append({'line': "No history found", 'classes': ''})
screenshot_url = datastore.get_screenshot(uuid) screenshot_url = datastore.get_screenshot(uuid)
system_uses_webdriver = datastore.data['settings']['application']['fetch_backend'] == 'html_webdriver'
is_html_webdriver = True if watch.get('fetch_backend') == 'html_webdriver' or (
watch.get('fetch_backend', None) is None and system_uses_webdriver) else False
output = render_template("preview.html", output = render_template("preview.html",
content=content, content=content,
extra_stylesheets=extra_stylesheets, extra_stylesheets=extra_stylesheets,
@@ -856,8 +920,9 @@ def changedetection_app(config=None, datastore_o=None):
current_diff_url=watch['url'], current_diff_url=watch['url'],
screenshot=screenshot_url, screenshot=screenshot_url,
watch=watch, watch=watch,
uuid=uuid) uuid=uuid,
is_html_webdriver=is_html_webdriver)
return output return output
@app.route("/settings/notification-logs", methods=['GET']) @app.route("/settings/notification-logs", methods=['GET'])
@@ -865,31 +930,10 @@ def changedetection_app(config=None, datastore_o=None):
def notification_logs(): def notification_logs():
global notification_debug_log global notification_debug_log
output = render_template("notification-log.html", output = render_template("notification-log.html",
logs=notification_debug_log if len(notification_debug_log) else ["No errors or warnings detected"]) logs=notification_debug_log if len(notification_debug_log) else ["Notification logs are empty - no notifications sent yet."])
return output return output
@app.route("/api/<string:uuid>/snapshot/current", methods=['GET'])
@login_required
def api_snapshot(uuid):
# More for testing, possible to return the first/only
if uuid == 'first':
uuid = list(datastore.data['watching'].keys()).pop()
try:
watch = datastore.data['watching'][uuid]
except KeyError:
return abort(400, "No history found for the specified link, bad link?")
newest = list(watch['history'].keys())[-1]
with open(watch['history'][newest], 'r') as f:
content = f.read()
resp = make_response(content)
resp.headers['Content-Type'] = 'text/plain'
return resp
@app.route("/favicon.ico", methods=['GET']) @app.route("/favicon.ico", methods=['GET'])
def favicon(): def favicon():
return send_from_directory("static/images", path="favicon.ico") return send_from_directory("static/images", path="favicon.ico")
@@ -970,10 +1014,9 @@ def changedetection_app(config=None, datastore_o=None):
@app.route("/static/<string:group>/<string:filename>", methods=['GET']) @app.route("/static/<string:group>/<string:filename>", methods=['GET'])
def static_content(group, filename): def static_content(group, filename):
from flask import make_response
if group == 'screenshot': if group == 'screenshot':
from flask import make_response
# Could be sensitive, follow password requirements # Could be sensitive, follow password requirements
if datastore.data['settings']['application']['password'] and not flask_login.current_user.is_authenticated: if datastore.data['settings']['application']['password'] and not flask_login.current_user.is_authenticated:
abort(403) abort(403)
@@ -992,6 +1035,26 @@ def changedetection_app(config=None, datastore_o=None):
except FileNotFoundError: except FileNotFoundError:
abort(404) abort(404)
if group == 'visual_selector_data':
# Could be sensitive, follow password requirements
if datastore.data['settings']['application']['password'] and not flask_login.current_user.is_authenticated:
abort(403)
# These files should be in our subdirectory
try:
# set nocache, set content-type
watch_dir = datastore_o.datastore_path + "/" + filename
response = make_response(send_from_directory(filename="elements.json", directory=watch_dir, path=watch_dir + "/elements.json"))
response.headers['Content-type'] = 'application/json'
response.headers['Cache-Control'] = 'no-cache, no-store, must-revalidate'
response.headers['Pragma'] = 'no-cache'
response.headers['Expires'] = 0
return response
except FileNotFoundError:
abort(404)
# These files should be in our subdirectory # These files should be in our subdirectory
try: try:
return send_from_directory("static/{}".format(group), path=filename) return send_from_directory("static/{}".format(group), path=filename)
@@ -1000,7 +1063,7 @@ def changedetection_app(config=None, datastore_o=None):
@app.route("/api/add", methods=['POST']) @app.route("/api/add", methods=['POST'])
@login_required @login_required
def api_watch_add(): def form_watch_add():
from changedetectionio import forms from changedetectionio import forms
form = forms.quickWatchForm(request.form) form = forms.quickWatchForm(request.form)
@@ -1026,7 +1089,7 @@ def changedetection_app(config=None, datastore_o=None):
@app.route("/api/delete", methods=['GET']) @app.route("/api/delete", methods=['GET'])
@login_required @login_required
def api_delete(): def form_delete():
uuid = request.args.get('uuid') uuid = request.args.get('uuid')
if uuid != 'all' and not uuid in datastore.data['watching'].keys(): if uuid != 'all' and not uuid in datastore.data['watching'].keys():
@@ -1043,7 +1106,7 @@ def changedetection_app(config=None, datastore_o=None):
@app.route("/api/clone", methods=['GET']) @app.route("/api/clone", methods=['GET'])
@login_required @login_required
def api_clone(): def form_clone():
uuid = request.args.get('uuid') uuid = request.args.get('uuid')
# More for testing, possible to return the first/only # More for testing, possible to return the first/only
if uuid == 'first': if uuid == 'first':
@@ -1057,7 +1120,7 @@ def changedetection_app(config=None, datastore_o=None):
@app.route("/api/checknow", methods=['GET']) @app.route("/api/checknow", methods=['GET'])
@login_required @login_required
def api_watch_checknow(): def form_watch_checknow():
tag = request.args.get('tag') tag = request.args.get('tag')
uuid = request.args.get('uuid') uuid = request.args.get('uuid')
@@ -1094,7 +1157,7 @@ def changedetection_app(config=None, datastore_o=None):
@app.route("/api/share-url", methods=['GET']) @app.route("/api/share-url", methods=['GET'])
@login_required @login_required
def api_share_put_watch(): def form_share_put_watch():
"""Given a watch UUID, upload the info and return a share-link """Given a watch UUID, upload the info and return a share-link
the share-link can be imported/added""" the share-link can be imported/added"""
import requests import requests
@@ -1108,6 +1171,7 @@ def changedetection_app(config=None, datastore_o=None):
# Copy it to memory and trim off what we don't need (history) # Copy it to memory and trim off what we don't need (history)
watch = deepcopy(datastore.data['watching'][uuid]) watch = deepcopy(datastore.data['watching'][uuid])
# For older versions that are not a @property
if (watch.get('history')): if (watch.get('history')):
del (watch['history']) del (watch['history'])
@@ -1137,14 +1201,14 @@ def changedetection_app(config=None, datastore_o=None):
except Exception as e: except Exception as e:
flash("Could not share, something went wrong while communicating with the share server.", 'error') logging.error("Error sharing -{}".format(str(e)))
flash("Could not share, something went wrong while communicating with the share server - {}".format(str(e)), 'error')
# https://changedetection.io/share/VrMv05wpXyQa # https://changedetection.io/share/VrMv05wpXyQa
# in the browser - should give you a nice info page - wtf # in the browser - should give you a nice info page - wtf
# paste in etc # paste in etc
return redirect(url_for('index')) return redirect(url_for('index'))
# @todo handle ctrl break # @todo handle ctrl break
ticker_thread = threading.Thread(target=ticker_thread_check_time_launch_checks).start() ticker_thread = threading.Thread(target=ticker_thread_check_time_launch_checks).start()
@@ -1186,6 +1250,9 @@ def check_for_new_version():
def notification_runner(): def notification_runner():
global notification_debug_log global notification_debug_log
from datetime import datetime
import json
while not app.config.exit.is_set(): while not app.config.exit.is_set():
try: try:
# At the moment only one thread runs (single runner) # At the moment only one thread runs (single runner)
@@ -1194,13 +1261,16 @@ def notification_runner():
time.sleep(1) time.sleep(1)
else: else:
# Process notifications
now = datetime.now()
try: try:
from changedetectionio import notification from changedetectionio import notification
notification.process_notification(n_object, datastore)
sent_obj = notification.process_notification(n_object, datastore)
except Exception as e: except Exception as e:
print("Watch URL: {} Error {}".format(n_object['watch_url'], str(e))) logging.error("Watch URL: {} Error {}".format(n_object['watch_url'], str(e)))
# UUID wont be present when we submit a 'test' from the global settings # UUID wont be present when we submit a 'test' from the global settings
if 'uuid' in n_object: if 'uuid' in n_object:
@@ -1210,14 +1280,19 @@ def notification_runner():
log_lines = str(e).splitlines() log_lines = str(e).splitlines()
notification_debug_log += log_lines notification_debug_log += log_lines
# Trim the log length # Process notifications
notification_debug_log = notification_debug_log[-100:] notification_debug_log+= ["{} - SENDING - {}".format(now.strftime("%Y/%m/%d %H:%M:%S,000"), json.dumps(sent_obj))]
# Trim the log length
notification_debug_log = notification_debug_log[-100:]
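
Review note: the log is trimmed by slicing the list to its last 100 entries after each append. A bounded `collections.deque` is one alternative way to get the same effect. This is a swapped-in technique, not what the diff does, and `log_sent_notification` is a hypothetical name used only for illustration.

```python
from collections import deque
from datetime import datetime
import json

# Keeps only the newest 100 lines; older entries fall off automatically.
notification_debug_log = deque(maxlen=100)


def log_sent_notification(sent_obj, now=None):
    """Append one 'SENDING' line in the same shape the diff uses."""
    now = now or datetime.now()
    line = "{} - SENDING - {}".format(
        now.strftime("%Y/%m/%d %H:%M:%S,000"), json.dumps(sent_obj))
    notification_debug_log.append(line)
    return line
```

A deque still supports `len()` and iteration, so a template that loops over the log should not need to change under this assumption.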
# Thread runner to check every minute, look for new watches to feed into the Queue. # Thread runner to check every minute, look for new watches to feed into the Queue.
def ticker_thread_check_time_launch_checks(): def ticker_thread_check_time_launch_checks():
import random
from changedetectionio import update_worker from changedetectionio import update_worker
recheck_time_minimum_seconds = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 20))
print("System env MINIMUM_SECONDS_RECHECK_TIME", recheck_time_minimum_seconds)
# Spin up Workers that do the fetching # Spin up Workers that do the fetching
# Can be overridden by ENV or use the default settings # Can be overridden by ENV or use the default settings
n_workers = int(os.getenv("FETCH_WORKERS", datastore.data['settings']['requests']['workers'])) n_workers = int(os.getenv("FETCH_WORKERS", datastore.data['settings']['requests']['workers']))
@@ -1235,9 +1310,10 @@ def ticker_thread_check_time_launch_checks():
running_uuids.append(t.current_uuid) running_uuids.append(t.current_uuid)
# Re #232 - Deepcopy the data in case it changes while we're iterating through it all # Re #232 - Deepcopy the data in case it changes while we're iterating through it all
watch_uuid_list = []
while True: while True:
try: try:
copied_datastore = deepcopy(datastore) watch_uuid_list = datastore.data['watching'].keys()
except RuntimeError as e: except RuntimeError as e:
# RuntimeError: dictionary changed size during iteration # RuntimeError: dictionary changed size during iteration
time.sleep(0.1) time.sleep(0.1)
@@ -1248,33 +1324,49 @@ def ticker_thread_check_time_launch_checks():
while update_q.qsize() >= 2000: while update_q.qsize() >= 2000:
time.sleep(1) time.sleep(1)
recheck_time_system_seconds = int(datastore.threshold_seconds)
# Check for watches outside of the time threshold to put in the thread queue. # Check for watches outside of the time threshold to put in the thread queue.
now = time.time() for uuid in watch_uuid_list:
now = time.time()
recheck_time_minimum_seconds = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60)) watch = datastore.data['watching'].get(uuid)
recheck_time_system_seconds = datastore.threshold_seconds if not watch:
logging.error("Watch: {} no longer present.".format(uuid))
for uuid, watch in copied_datastore.data['watching'].items(): continue
# No need to do further processing if it's paused # No need to do further processing if it's paused
if watch['paused']: if watch['paused']:
continue continue
# If they supplied an individual entry minutes to threshold. # If they supplied an individual entry minutes to threshold.
threshold = now
watch_threshold_seconds = watch.threshold_seconds()
if watch_threshold_seconds:
threshold -= watch_threshold_seconds
else:
threshold -= recheck_time_system_seconds
# Yeah, put it in the queue, it's more than time watch_threshold_seconds = watch.threshold_seconds()
if watch['last_checked'] <= max(threshold, recheck_time_minimum_seconds): threshold = watch_threshold_seconds if watch_threshold_seconds > 0 else recheck_time_system_seconds
# #580 - Jitter plus/minus amount of time to make the check seem more random to the server
jitter = datastore.data['settings']['requests'].get('jitter_seconds', 0)
if jitter > 0:
if watch.jitter_seconds == 0:
watch.jitter_seconds = random.uniform(-abs(jitter), jitter)
seconds_since_last_recheck = now - watch['last_checked']
if seconds_since_last_recheck >= (threshold + watch.jitter_seconds) and seconds_since_last_recheck >= recheck_time_minimum_seconds:
if not uuid in running_uuids and uuid not in update_q.queue: if not uuid in running_uuids and uuid not in update_q.queue:
print("Queued watch UUID {} last checked at {} queued at {:0.2f} jitter {:0.2f}s, {:0.2f}s since last checked".format(uuid,
watch['last_checked'],
now,
watch.jitter_seconds,
now - watch['last_checked']))
# Into the queue with you
update_q.put(uuid) update_q.put(uuid)
# Wait a few seconds before checking the list again # Reset for next time
time.sleep(3) watch.jitter_seconds = 0
# Wait before checking the list again - saves CPU
time.sleep(1)
# Should be low so we can break this out in testing # Should be low so we can break this out in testing
app.config.exit.wait(1) app.config.exit.wait(1)
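
Review note: the new queueing rule compares the seconds since the last check against the threshold plus a per-watch jitter, and separately against the `MINIMUM_SECONDS_RECHECK_TIME` floor. A minimal sketch of that decision as two pure functions follows; `pick_jitter` and `is_due` are hypothetical names, and the arguments stand in for the values read from the datastore and the environment.

```python
import random


def pick_jitter(jitter_seconds, rng=random):
    """Draw a plus/minus offset in seconds; a value of 0 or less disables jitter."""
    if jitter_seconds <= 0:
        return 0.0
    return rng.uniform(-abs(jitter_seconds), jitter_seconds)


def is_due(now, last_checked, threshold, jitter, minimum_seconds):
    """True when the watch should be queued for a recheck.

    Both conditions must hold: the jittered threshold has elapsed and the
    global minimum recheck time has elapsed.
    """
    elapsed = now - last_checked
    return elapsed >= (threshold + jitter) and elapsed >= minimum_seconds
```

For example, a watch with a 5 second threshold that was checked 10 seconds ago is not queued when the minimum is 20 seconds, because the second condition fails.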

@@ -0,0 +1,124 @@
from flask_restful import abort, Resource
from flask import request, make_response
import validators
from . import auth
# https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
class Watch(Resource):
def __init__(self, **kwargs):
# datastore is a black box dependency
self.datastore = kwargs['datastore']
self.update_q = kwargs['update_q']
# Get information about a single watch, excluding the history list (can be large)
# curl http://localhost:4000/api/v1/watch/<string:uuid>
# ?recheck=true
@auth.check_token
def get(self, uuid):
from copy import deepcopy
watch = deepcopy(self.datastore.data['watching'].get(uuid))
if not watch:
abort(404, message='No watch exists with the UUID of {}'.format(uuid))
if request.args.get('recheck'):
self.update_q.put(uuid)
return "OK", 200
# Return without history, get that via another API call
watch['history_n'] = watch.history_n
return watch
@auth.check_token
def delete(self, uuid):
if not self.datastore.data['watching'].get(uuid):
abort(400, message='No watch exists with the UUID of {}'.format(uuid))
self.datastore.delete(uuid)
return 'OK', 204
class WatchHistory(Resource):
def __init__(self, **kwargs):
# datastore is a black box dependency
self.datastore = kwargs['datastore']
# Get a list of available history for a watch by UUID
# curl http://localhost:4000/api/v1/watch/<string:uuid>/history
def get(self, uuid):
watch = self.datastore.data['watching'].get(uuid)
if not watch:
abort(404, message='No watch exists with the UUID of {}'.format(uuid))
return watch.history, 200
class WatchSingleHistory(Resource):
def __init__(self, **kwargs):
# datastore is a black box dependency
self.datastore = kwargs['datastore']
# Read a given history snapshot and return its content
# <string:timestamp> or "latest"
# curl http://localhost:4000/api/v1/watch/<string:uuid>/history/<int:timestamp>
@auth.check_token
def get(self, uuid, timestamp):
watch = self.datastore.data['watching'].get(uuid)
if not watch:
abort(404, message='No watch exists with the UUID of {}'.format(uuid))
if not len(watch.history):
abort(404, message='Watch found but no history exists for the UUID {}'.format(uuid))
if timestamp == 'latest':
timestamp = list(watch.history.keys())[-1]
with open(watch.history[timestamp], 'r') as f:
content = f.read()
response = make_response(content, 200)
response.mimetype = "text/plain"
return response
class CreateWatch(Resource):
def __init__(self, **kwargs):
# datastore is a black box dependency
self.datastore = kwargs['datastore']
self.update_q = kwargs['update_q']
@auth.check_token
def post(self):
# curl http://localhost:4000/api/v1/watch -H "Content-Type: application/json" -d '{"url": "https://my-nice.com", "tag": "one, two" }'
json_data = request.get_json()
tag = json_data['tag'].strip() if json_data.get('tag') else ''
if not validators.url(json_data['url'].strip()):
return "Invalid or unsupported URL", 400
extras = {'title': json_data['title'].strip()} if json_data.get('title') else {}
new_uuid = self.datastore.add_watch(url=json_data['url'].strip(), tag=tag, extras=extras)
self.update_q.put(new_uuid)
return {'uuid': new_uuid}, 201
# Return concise list of available watches and some very basic info
# curl http://localhost:4000/api/v1/watch|python -mjson.tool
# ?recheck_all=1 to recheck all
@auth.check_token
def get(self):
list = {}
for k, v in self.datastore.data['watching'].items():
list[k] = {'url': v['url'],
'title': v['title'],
'last_checked': v['last_checked'],
'last_changed': v['last_changed'],
'last_error': v['last_error']}
if request.args.get('recheck_all'):
for uuid in self.datastore.data['watching'].keys():
self.update_q.put(uuid)
return {'status': "OK"}, 200
return list, 200
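
Review note: the `curl` comment on `CreateWatch.post` can be mirrored from Python. The sketch below only builds the request and does not send it. The route `/api/v1/watch` and port 4000 are taken from the comments in this file, the `x-api-key` header name from the auth module in this diff; `build_create_watch_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_create_watch_request(base_url, api_key, url, tag="", title=None):
    """Build (but do not send) a POST that creates a watch via the JSON API."""
    payload = {"url": url, "tag": tag}
    if title:
        payload["title"] = title
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/watch",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; going by the handler above, a successful call should answer with status 201 and a JSON body containing the new `uuid`.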

@@ -0,0 +1,33 @@
from flask import request, make_response, jsonify
from functools import wraps
# Simple API auth key comparison
# @todo - Maybe short lived token in the future?
def check_token(f):
@wraps(f)
def decorated(*args, **kwargs):
datastore = args[0].datastore
config_api_token_enabled = datastore.data['settings']['application'].get('api_access_token_enabled')
if not config_api_token_enabled:
return
try:
api_key_header = request.headers['x-api-key']
except KeyError:
return make_response(
jsonify("No authorization x-api-key header."), 403
)
config_api_token = datastore.data['settings']['application'].get('api_access_token')
if api_key_header != config_api_token:
return make_response(
jsonify("Invalid access - API key invalid."), 403
)
return f(*args, **kwargs)
return decorated
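
Review note: as written, `decorated` returns `None` without calling `f` when `api_access_token_enabled` is off, which may not be what is intended. The framework-free sketch below assumes the desired behaviour is to pass the call through in that case. `get_settings` and `get_header` are hypothetical callables standing in for the datastore settings and `request.headers`, and the tuples stand in for Flask responses.

```python
from functools import wraps


def check_token(get_settings, get_header):
    """Build a decorator that enforces the x-api-key header when enabled."""
    def decorator(f):
        @wraps(f)
        def decorated(*args, **kwargs):
            settings = get_settings()
            # Only validate when the feature is switched on in settings
            if settings.get('api_access_token_enabled'):
                api_key = get_header('x-api-key')
                if api_key is None:
                    return "No authorization x-api-key header.", 403
                if api_key != settings.get('api_access_token'):
                    return "Invalid access - API key invalid.", 403
            # Assumption: with checking disabled, the handler still runs
            return f(*args, **kwargs)
        return decorated
    return decorator
```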

@@ -1,10 +1,19 @@
from abc import ABC, abstractmethod from abc import ABC, abstractmethod
import chardet import chardet
import json
import os import os
import requests import requests
import time import time
import sys import sys
class PageUnloadable(Exception):
def __init__(self, status_code, url):
# Set this so we can use it in other parts of the app
self.status_code = status_code
self.url = url
return
pass
class EmptyReply(Exception): class EmptyReply(Exception):
def __init__(self, status_code, url): def __init__(self, status_code, url):
# Set this so we can use it in other parts of the app # Set this so we can use it in other parts of the app
@@ -13,6 +22,14 @@ class EmptyReply(Exception):
return return
pass pass
class ScreenshotUnavailable(Exception):
def __init__(self, status_code, url):
# Set this so we can use it in other parts of the app
self.status_code = status_code
self.url = url
return
pass
class ReplyWithContentButNoText(Exception): class ReplyWithContentButNoText(Exception):
def __init__(self, status_code, url): def __init__(self, status_code, url):
# Set this so we can use it in other parts of the app # Set this so we can use it in other parts of the app
@@ -27,6 +44,135 @@ class Fetcher():
status_code = None status_code = None
content = None content = None
headers = None headers = None
fetcher_description = "No description"
xpath_element_js = """
// Include the getXpath script directly, easier than fetching
!function(e,n){"object"==typeof exports&&"undefined"!=typeof module?module.exports=n():"function"==typeof define&&define.amd?define(n):(e=e||self).getXPath=n()}(this,function(){return function(e){var n=e;if(n&&n.id)return'//*[@id="'+n.id+'"]';for(var o=[];n&&Node.ELEMENT_NODE===n.nodeType;){for(var i=0,r=!1,d=n.previousSibling;d;)d.nodeType!==Node.DOCUMENT_TYPE_NODE&&d.nodeName===n.nodeName&&i++,d=d.previousSibling;for(d=n.nextSibling;d;){if(d.nodeName===n.nodeName){r=!0;break}d=d.nextSibling}o.push((n.prefix?n.prefix+":":"")+n.localName+(i||r?"["+(i+1)+"]":"")),n=n.parentNode}return o.length?"/"+o.reverse().join("/"):""}});
const findUpTag = (el) => {
let r = el
chained_css = [];
depth=0;
// Strategy 1: Keep going up until we hit an ID tag, imagine it's like #list-widget div h4
while (r.parentNode) {
if(depth==5) {
break;
}
if('' !==r.id) {
chained_css.unshift("#"+r.id);
final_selector= chained_css.join('>');
// Be sure theres only one, some sites have multiples of the same ID tag :-(
if (window.document.querySelectorAll(final_selector).length ==1 ) {
return final_selector;
}
return null;
} else {
chained_css.unshift(r.tagName.toLowerCase());
}
r=r.parentNode;
depth+=1;
}
return null;
}
// @todo - if it's SVG or IMG, go into image diff mode
var elements = window.document.querySelectorAll("div,span,form,table,tbody,tr,td,a,p,ul,li,h1,h2,h3,h4, header, footer, section, article, aside, details, main, nav, section, summary");
var size_pos=[];
// after page fetch, inject this JS
// build a map of all elements and their positions (maybe that only include text?)
var bbox;
for (var i = 0; i < elements.length; i++) {
bbox = elements[i].getBoundingClientRect();
// forget really small ones
if (bbox['width'] <20 && bbox['height'] < 20 ) {
continue;
}
// @todo the getXpath kind of sucks, it doesnt know when there is for example just one ID sometimes
// it should not traverse when we know we can anchor off just an ID one level up etc..
// maybe, get current class or id, keep traversing up looking for only class or id until there is just one match
// 1st primitive - if it has class, try joining it all and select, if theres only one.. well thats us.
xpath_result=false;
try {
var d= findUpTag(elements[i]);
if (d) {
xpath_result =d;
}
} catch (e) {
console.log(e);
}
// You could swap it and default to getXpath and then try the smarter one
// default back to the less intelligent one
if (!xpath_result) {
try {
// I've seen on FB and eBay that this doesnt work
// ReferenceError: getXPath is not defined at eval (eval at evaluate (:152:29), <anonymous>:67:20) at UtilityScript.evaluate (<anonymous>:159:18) at UtilityScript.<anonymous> (<anonymous>:1:44)
xpath_result = getXPath(elements[i]);
} catch (e) {
console.log(e);
continue;
}
}
if(window.getComputedStyle(elements[i]).visibility === "hidden") {
continue;
}
size_pos.push({
xpath: xpath_result,
width: Math.round(bbox['width']),
height: Math.round(bbox['height']),
left: Math.floor(bbox['left']),
top: Math.floor(bbox['top']),
childCount: elements[i].childElementCount
});
}
// inject the current one set in the css_filter, which may be a CSS rule
// used for displaying the current one in VisualSelector, where its not one we generated.
if (css_filter.length) {
q=false;
try {
// is it xpath?
if (css_filter.startsWith('/') || css_filter.startsWith('xpath:')) {
q=document.evaluate(css_filter.replace('xpath:',''), document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
} else {
q=document.querySelector(css_filter);
}
} catch (e) {
// Maybe catch DOMException and alert?
console.log(e);
}
bbox=false;
if(q) {
bbox = q.getBoundingClientRect();
}
if (bbox && bbox['width'] >0 && bbox['height']>0) {
size_pos.push({
xpath: css_filter,
width: bbox['width'],
height: bbox['height'],
left: bbox['left'],
top: bbox['top'],
childCount: q.childElementCount
});
}
}
// Window.width required for proper scaling in the frontend
return {'size_pos':size_pos, 'browser_width': window.innerWidth};
"""
xpath_data = None
# Will be needed in the future by the VisualSelector, always get this where possible. # Will be needed in the future by the VisualSelector, always get this where possible.
screenshot = False screenshot = False
fetcher_description = "No description" fetcher_description = "No description"
@@ -47,7 +193,8 @@ class Fetcher():
request_headers, request_headers,
request_body, request_body,
request_method, request_method,
ignore_status_codes=False): ignore_status_codes=False,
current_css_filter=None):
# Should set self.error, self.status_code and self.content # Should set self.error, self.status_code and self.content
pass pass
@@ -128,52 +275,101 @@ class base_html_playwright(Fetcher):
request_headers, request_headers,
request_body, request_body,
request_method, request_method,
ignore_status_codes=False): ignore_status_codes=False,
current_css_filter=None):
from playwright.sync_api import sync_playwright from playwright.sync_api import sync_playwright
import playwright._impl._api_types import playwright._impl._api_types
from playwright._impl._api_types import Error, TimeoutError from playwright._impl._api_types import Error, TimeoutError
response = None
with sync_playwright() as p: with sync_playwright() as p:
browser_type = getattr(p, self.browser_type) browser_type = getattr(p, self.browser_type)
# Seemed to cause a connection Exception even tho I can see it connect # Seemed to cause a connection Exception even tho I can see it connect
# self.browser = browser_type.connect(self.command_executor, timeout=timeout*1000) # self.browser = browser_type.connect(self.command_executor, timeout=timeout*1000)
browser = browser_type.connect_over_cdp(self.command_executor, timeout=timeout * 1000) # 60,000 connection timeout only
browser = browser_type.connect_over_cdp(self.command_executor, timeout=60000)
# Set user agent to prevent Cloudflare from blocking the browser # Set user agent to prevent Cloudflare from blocking the browser
# Use the default one configured in the App.py model that's passed from fetch_site_status.py # Use the default one configured in the App.py model that's passed from fetch_site_status.py
context = browser.new_context( context = browser.new_context(
user_agent=request_headers['User-Agent'] if request_headers.get('User-Agent') else 'Mozilla/5.0', user_agent=request_headers['User-Agent'] if request_headers.get('User-Agent') else 'Mozilla/5.0',
proxy=self.proxy proxy=self.proxy,
# This is needed to enable JavaScript execution on GitHub and others
bypass_csp=True,
# Should never be needed
accept_downloads=False
) )
page = context.new_page() page = context.new_page()
page.set_viewport_size({"width": 1280, "height": 1024})
try: try:
response = page.goto(url, timeout=timeout * 1000, wait_until='commit') page.set_default_navigation_timeout(90000)
# Wait_until = commit page.set_default_timeout(90000)
# - `'commit'` - consider operation to be finished when network response is received and the document started loading.
# Better to not use any smarts from Playwright and just wait an arbitrary number of seconds # Bug - never set viewport size BEFORE page.goto
# This seemed to solve nearly all 'TimeoutErrors'
extra_wait = int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay # Waits for the next navigation. Using Python context manager
page.wait_for_timeout(extra_wait * 1000) # prevents a race condition between clicking and waiting for a navigation.
with page.expect_navigation():
response = page.goto(url, wait_until='load')
except playwright._impl._api_types.TimeoutError as e: except playwright._impl._api_types.TimeoutError as e:
raise EmptyReply(url=url, status_code=None) context.close()
browser.close()
# This can be ok, we will try to grab what we could retrieve
pass
except Exception as e:
print ("other exception when page.goto")
print (str(e))
context.close()
browser.close()
raise PageUnloadable(url=url, status_code=None)
if response is None: if response is None:
context.close()
browser.close()
print ("response object was none")
raise EmptyReply(url=url, status_code=None) raise EmptyReply(url=url, status_code=None)
if len(page.content().strip()) == 0: # Bug 2(?) Set the viewport size AFTER loading the page
raise EmptyReply(url=url, status_code=None) page.set_viewport_size({"width": 1280, "height": 1024})
extra_wait = int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay
self.status_code = response.status time.sleep(extra_wait)
self.content = page.content() self.content = page.content()
self.status_code = response.status
if len(self.content.strip()) == 0:
context.close()
browser.close()
print ("Content was empty")
raise EmptyReply(url=url, status_code=None)
self.headers = response.all_headers() self.headers = response.all_headers()
if current_css_filter is not None:
page.evaluate("var css_filter={}".format(json.dumps(current_css_filter)))
else:
page.evaluate("var css_filter=''")
self.xpath_data = page.evaluate("async () => {" + self.xpath_element_js + "}")
# Bug 3 in Playwright screenshot handling
# Some bug where it gives the wrong screenshot size, but making a request with the clip set first seems to solve it # Some bug where it gives the wrong screenshot size, but making a request with the clip set first seems to solve it
# JPEG is better here because the screenshots can be very very large # JPEG is better here because the screenshots can be very very large
page.screenshot(type='jpeg', clip={'x': 1.0, 'y': 1.0, 'width': 1280, 'height': 1024})
self.screenshot = page.screenshot(type='jpeg', full_page=True, quality=90) # Screenshots also travel via the ws:// (websocket) meaning that the binary data is base64 encoded
# which will significantly increase the IO size between the server and client, it's recommended to use the lowest
# acceptable screenshot quality here
try:
# Quality set to 1 because it's not used, just used as a work-around for a bug, no need to change this.
page.screenshot(type='jpeg', clip={'x': 1.0, 'y': 1.0, 'width': 1280, 'height': 1024}, quality=1)
# The actual screenshot
self.screenshot = page.screenshot(type='jpeg', full_page=True, quality=int(os.getenv("PLAYWRIGHT_SCREENSHOT_QUALITY", 72)))
except Exception as e:
context.close()
browser.close()
raise ScreenshotUnavailable(url=url, status_code=None)
context.close() context.close()
browser.close() browser.close()
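
Review note: `PLAYWRIGHT_SCREENSHOT_QUALITY` is passed through `int()` directly, so a non-numeric value would raise inside the screenshot `try` block and surface as `ScreenshotUnavailable`. A minimal sketch of a more forgiving reader follows. The clamping to 0..100 (the range Playwright documents for JPEG quality) and the fallback to the default are additions for illustration, not behaviour from the diff; `screenshot_quality` is a hypothetical helper name.

```python
import os


def screenshot_quality(default=72):
    """Read PLAYWRIGHT_SCREENSHOT_QUALITY, falling back to the default on bad input."""
    raw = os.getenv("PLAYWRIGHT_SCREENSHOT_QUALITY", default)
    try:
        quality = int(raw)
    except (TypeError, ValueError):
        quality = default
    # JPEG quality is expressed on a 0..100 scale
    return max(0, min(100, quality))
```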
@@ -225,7 +421,8 @@ class base_html_webdriver(Fetcher):
request_headers, request_headers,
request_body, request_body,
request_method, request_method,
ignore_status_codes=False): ignore_status_codes=False,
current_css_filter=None):
from selenium import webdriver from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
@@ -245,6 +442,10 @@ class base_html_webdriver(Fetcher):
self.quit() self.quit()
raise raise
self.driver.set_window_size(1280, 1024)
self.driver.implicitly_wait(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)))
self.screenshot = self.driver.get_screenshot_as_png()
# @todo - how to check this? is it possible? # @todo - how to check this? is it possible?
self.status_code = 200 self.status_code = 200
# @todo somehow we should try to get this working for WebDriver # @todo somehow we should try to get this working for WebDriver
@@ -254,8 +455,6 @@ class base_html_webdriver(Fetcher):
time.sleep(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay) time.sleep(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay)
self.content = self.driver.page_source self.content = self.driver.page_source
self.headers = {} self.headers = {}
self.screenshot = self.driver.get_screenshot_as_png()
self.quit()
# Does the connection to the webdriver work? run a test connection. # Does the connection to the webdriver work? run a test connection.
def is_ready(self): def is_ready(self):
@@ -292,7 +491,8 @@ class html_requests(Fetcher):
request_headers, request_headers,
request_body, request_body,
request_method, request_method,
ignore_status_codes=False): ignore_status_codes=False,
current_css_filter=None):
proxies={} proxies={}

View File

@@ -94,6 +94,7 @@ class perform_site_check():
# If the klass doesnt exist, just use a default # If the klass doesnt exist, just use a default
klass = getattr(content_fetcher, "html_requests") klass = getattr(content_fetcher, "html_requests")
proxy_args = self.set_proxy_from_list(watch) proxy_args = self.set_proxy_from_list(watch)
fetcher = klass(proxy_override=proxy_args) fetcher = klass(proxy_override=proxy_args)
@@ -104,7 +105,8 @@ class perform_site_check():
elif system_webdriver_delay is not None: elif system_webdriver_delay is not None:
fetcher.render_extract_delay = system_webdriver_delay fetcher.render_extract_delay = system_webdriver_delay
fetcher.run(url, timeout, request_headers, request_body, request_method, ignore_status_code) fetcher.run(url, timeout, request_headers, request_body, request_method, ignore_status_code, watch['css_filter'])
fetcher.quit()
# Fetching complete, now filters # Fetching complete, now filters
# @todo move to class / maybe inside of fetcher abstract base? # @todo move to class / maybe inside of fetcher abstract base?
@@ -202,6 +204,20 @@ class perform_site_check():
else: else:
stripped_text_from_html = stripped_text_from_html.encode('utf8') stripped_text_from_html = stripped_text_from_html.encode('utf8')
# 615 Extract text by regex
extract_text = watch.get('extract_text', [])
if len(extract_text) > 0:
regex_matched_output = []
for s_re in extract_text:
result = re.findall(s_re.encode('utf8'), stripped_text_from_html,
flags=re.MULTILINE | re.DOTALL | re.LOCALE)
if result:
regex_matched_output.append(result[0])
if regex_matched_output:
stripped_text_from_html = b'\n'.join(regex_matched_output)
text_content_before_ignored_filter = stripped_text_from_html
# Re #133 - if we should strip whitespaces from triggering the change detected comparison # Re #133 - if we should strip whitespaces from triggering the change detected comparison
if self.datastore.data['settings']['application'].get('ignore_whitespace', False): if self.datastore.data['settings']['application'].get('ignore_whitespace', False):
fetched_md5 = hashlib.md5(stripped_text_from_html.translate(None, b'\r\n\t ')).hexdigest() fetched_md5 = hashlib.md5(stripped_text_from_html.translate(None, b'\r\n\t ')).hexdigest()
@@ -219,9 +235,11 @@ class perform_site_check():
# Yeah, lets block first until something matches # Yeah, lets block first until something matches
blocked_by_not_found_trigger_text = True blocked_by_not_found_trigger_text = True
# Filter and trigger works the same, so reuse it # Filter and trigger works the same, so reuse it
# It should return the line numbers that match
result = html_tools.strip_ignore_text(content=str(stripped_text_from_html), result = html_tools.strip_ignore_text(content=str(stripped_text_from_html),
wordlist=watch['trigger_text'], wordlist=watch['trigger_text'],
mode="line numbers") mode="line numbers")
# If it returned any lines that matched..
if result: if result:
blocked_by_not_found_trigger_text = False blocked_by_not_found_trigger_text = False
@@ -236,4 +254,4 @@ class perform_site_check():
if not watch['title'] or not len(watch['title']): if not watch['title'] or not len(watch['title']):
update_obj['title'] = html_tools.extract_element(find='title', html_content=fetcher.content) update_obj['title'] = html_tools.extract_element(find='title', html_content=fetcher.content)
return changed_detected, update_obj, text_content_before_ignored_filter, fetcher.screenshot return changed_detected, update_obj, text_content_before_ignored_filter, fetcher.screenshot, fetcher.xpath_data
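The "Extract text by regex" step added in the hunks above can be tried in isolation. What follows is a minimal sketch under stated assumptions, not the project's code: `extract_by_regex` is a hypothetical helper name, the page text is assumed to be UTF-8 bytes already (as in the hunk), and the `re.LOCALE` flag used in the hunk is left out for simplicity. It keeps the first match of each pattern and falls back to the unfiltered text when nothing matches.

```python
import re


def extract_by_regex(stripped_text: bytes, patterns: list) -> bytes:
    """Hypothetical helper: keep the first match of each pattern, joined by newlines."""
    matched = []
    for pattern in patterns:
        # Patterns are stored as str, so encode them to search bytes content
        found = re.findall(pattern.encode('utf8'), stripped_text,
                           flags=re.MULTILINE | re.DOTALL)
        if found:
            matched.append(found[0])
    # Fall back to the unfiltered text when no pattern matched
    return b'\n'.join(matched) if matched else stripped_text


if __name__ == "__main__":
    text = b"Price: 12.50 EUR\nStock: 3 left\n"
    print(extract_by_regex(text, [r"Price: [\d.]+", r"Stock: \d+"]))
```

Patterns with more than one capture group make `re.findall` return tuples, which `b'\n'.join` cannot handle; the sketch assumes patterns with at most one group.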

View File

@@ -223,7 +223,7 @@ class validateURL(object):
except validators.ValidationFailure: except validators.ValidationFailure:
message = field.gettext('\'%s\' is not a valid URL.' % (field.data.strip())) message = field.gettext('\'%s\' is not a valid URL.' % (field.data.strip()))
raise ValidationError(message) raise ValidationError(message)
class ValidateListRegex(object): class ValidateListRegex(object):
""" """
Validates that anything that looks like a regex passes as a regex Validates that anything that looks like a regex passes as a regex
@@ -307,7 +307,7 @@ class ValidateCSSJSONXPATHInput(object):
class quickWatchForm(Form): class quickWatchForm(Form):
url = fields.URLField('URL', validators=[validateURL()]) url = fields.URLField('URL', validators=[validateURL()])
tag = StringField('Group tag', [validators.Optional(), validators.Length(max=35)]) tag = StringField('Group tag', [validators.Optional()])
# Common to a single watch and the global settings # Common to a single watch and the global settings
class commonSettingsForm(Form): class commonSettingsForm(Form):
@@ -323,13 +323,16 @@ class commonSettingsForm(Form):
class watchForm(commonSettingsForm): class watchForm(commonSettingsForm):
url = fields.URLField('URL', validators=[validateURL()]) url = fields.URLField('URL', validators=[validateURL()])
tag = StringField('Group tag', [validators.Optional(), validators.Length(max=35)], default='') tag = StringField('Group tag', [validators.Optional()], default='')
time_between_check = FormField(TimeBetweenCheckForm) time_between_check = FormField(TimeBetweenCheckForm)
css_filter = StringField('CSS/JSON/XPATH Filter', [ValidateCSSJSONXPATHInput()], default='') css_filter = StringField('CSS/JSON/XPATH Filter', [ValidateCSSJSONXPATHInput()], default='')
subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_xpath=False, allow_json=False)]) subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_xpath=False, allow_json=False)])
extract_text = StringListField('Extract text', [ValidateListRegex()])
title = StringField('Title', default='') title = StringField('Title', default='')
ignore_text = StringListField('Ignore text', [ValidateListRegex()]) ignore_text = StringListField('Ignore text', [ValidateListRegex()])
@@ -360,7 +363,9 @@ class watchForm(commonSettingsForm):
class globalSettingsRequestForm(Form): class globalSettingsRequestForm(Form):
time_between_check = FormField(TimeBetweenCheckForm) time_between_check = FormField(TimeBetweenCheckForm)
proxy = RadioField('Proxy') proxy = RadioField('Proxy')
jitter_seconds = IntegerField('Random jitter seconds ± check',
render_kw={"style": "width: 5em;"},
validators=[validators.NumberRange(min=0, message="Should contain zero or more seconds")])
# datastore.data['settings']['application'].. # datastore.data['settings']['application']..
class globalSettingsApplicationForm(commonSettingsForm): class globalSettingsApplicationForm(commonSettingsForm):
@@ -374,6 +379,7 @@ class globalSettingsApplicationForm(commonSettingsForm):
empty_pages_are_a_change = BooleanField('Treat empty pages as a change?', default=False) empty_pages_are_a_change = BooleanField('Treat empty pages as a change?', default=False)
render_anchor_tag_content = BooleanField('Render anchor tag content', default=False) render_anchor_tag_content = BooleanField('Render anchor tag content', default=False)
fetch_backend = RadioField('Fetch Method', default="html_requests", choices=content_fetcher.available_fetchers(), validators=[ValidateContentFetcherIsReady()]) fetch_backend = RadioField('Fetch Method', default="html_requests", choices=content_fetcher.available_fetchers(), validators=[ValidateContentFetcherIsReady()])
api_access_token_enabled = BooleanField('API access token security check enabled', default=True, validators=[validators.Optional()])
password = SaltyPasswordField() password = SaltyPasswordField()

View File

@@ -39,7 +39,7 @@ def element_removal(selectors: List[str], html_content):
def xpath_filter(xpath_filter, html_content): def xpath_filter(xpath_filter, html_content):
from lxml import etree, html from lxml import etree, html
tree = html.fromstring(html_content) tree = html.fromstring(bytes(html_content, encoding='utf-8'))
html_block = "" html_block = ""
for item in tree.xpath(xpath_filter.strip(), namespaces={'re':'http://exslt.org/regular-expressions'}): for item in tree.xpath(xpath_filter.strip(), namespaces={'re':'http://exslt.org/regular-expressions'}):

View File

@@ -92,7 +92,7 @@ class import_distill_io_json(Importer):
for d in data.get('data'): for d in data.get('data'):
d_config = json.loads(d['config']) d_config = json.loads(d['config'])
extras = {'title': d['name']} extras = {'title': d.get('name', None)}
if len(d['uri']) and good < 5000: if len(d['uri']) and good < 5000:
try: try:
@@ -114,12 +114,9 @@ class import_distill_io_json(Importer):
except IndexError: except IndexError:
pass pass
try:
if d.get('tags', False):
extras['tag'] = ", ".join(d['tags']) extras['tag'] = ", ".join(d['tags'])
except KeyError:
pass
except IndexError:
pass
new_uuid = datastore.add_watch(url=d['uri'].strip(), new_uuid = datastore.add_watch(url=d['uri'].strip(),
extras=extras, extras=extras,

View File

@@ -23,10 +23,12 @@ class model(dict):
'requests': { 'requests': {
'timeout': 15, # Default 15 seconds 'timeout': 15, # Default 15 seconds
'time_between_check': {'weeks': None, 'days': None, 'hours': 3, 'minutes': None, 'seconds': None}, 'time_between_check': {'weeks': None, 'days': None, 'hours': 3, 'minutes': None, 'seconds': None},
'jitter_seconds': 0,
'workers': 10, # Number of threads, lower is better for slow connections 'workers': 10, # Number of threads, lower is better for slow connections
'proxy': None # Preferred proxy connection 'proxy': None # Preferred proxy connection
}, },
'application': { 'application': {
'api_access_token_enabled': True,
'password': False, 'password': False,
'base_url' : None, 'base_url' : None,
'extract_title_as_title': False, 'extract_title_as_title': False,
@@ -34,7 +36,7 @@ class model(dict):
'fetch_backend': os.getenv("DEFAULT_FETCH_BACKEND", "html_requests"), 'fetch_backend': os.getenv("DEFAULT_FETCH_BACKEND", "html_requests"),
'global_ignore_text': [], # List of text to ignore when calculating the comparison checksum 'global_ignore_text': [], # List of text to ignore when calculating the comparison checksum
'global_subtractive_selectors': [], 'global_subtractive_selectors': [],
'ignore_whitespace': False, 'ignore_whitespace': True,
'render_anchor_tag_content': False, 'render_anchor_tag_content': False,
'notification_urls': [], # Apprise URL list 'notification_urls': [], # Apprise URL list
# Custom notification content # Custom notification content

View File

@@ -1,5 +1,4 @@
import os import os
import uuid as uuid_builder import uuid as uuid_builder
minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60)) minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60))
@@ -12,29 +11,31 @@ from changedetectionio.notification import (
class model(dict): class model(dict):
base_config = { __newest_history_key = None
__history_n=0
__base_config = {
'url': None, 'url': None,
'tag': None, 'tag': None,
'last_checked': 0, 'last_checked': 0,
'last_changed': 0, 'last_changed': 0,
'paused': False, 'paused': False,
'last_viewed': 0, # history key value of the last viewed via the [diff] link 'last_viewed': 0, # history key value of the last viewed via the [diff] link
'newest_history_key': 0, #'newest_history_key': 0,
'title': None, 'title': None,
'previous_md5': False, 'previous_md5': False,
# UUID not needed, should be generated only as a key 'uuid': str(uuid_builder.uuid4()),
# 'uuid':
'headers': {}, # Extra headers to send 'headers': {}, # Extra headers to send
'body': None, 'body': None,
'method': 'GET', 'method': 'GET',
'history': {}, # Dict of timestamp and output stripped filename #'history': {}, # Dict of timestamp and output stripped filename
'ignore_text': [], # List of text to ignore when calculating the comparison checksum 'ignore_text': [], # List of text to ignore when calculating the comparison checksum
# Custom notification content # Custom notification content
'notification_urls': [], # List of URLs to add to the notification Queue (Usually AppRise) 'notification_urls': [], # List of URLs to add to the notification Queue (Usually AppRise)
'notification_title': default_notification_title, 'notification_title': default_notification_title,
'notification_body': default_notification_body, 'notification_body': default_notification_body,
'notification_format': default_notification_format, 'notification_format': default_notification_format,
'css_filter': "", 'css_filter': '',
'extract_text': [], # Extract text by regex after filters
'subtractive_selectors': [], 'subtractive_selectors': [],
'trigger_text': [], # List of text or regex to wait for until a change is detected 'trigger_text': [], # List of text or regex to wait for until a change is detected
'fetch_backend': None, 'fetch_backend': None,
@@ -46,12 +47,106 @@ class model(dict):
'time_between_check': {'weeks': None, 'days': None, 'hours': None, 'minutes': None, 'seconds': None}, 'time_between_check': {'weeks': None, 'days': None, 'hours': None, 'minutes': None, 'seconds': None},
'webdriver_delay': None 'webdriver_delay': None
} }
jitter_seconds = 0
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
def __init__(self, *arg, **kw): def __init__(self, *arg, **kw):
self.update(self.base_config) import uuid
self.update(self.__base_config)
self.__datastore_path = kw['datastore_path']
self['uuid'] = str(uuid.uuid4())
del kw['datastore_path']
if kw.get('default'):
self.update(kw['default'])
del kw['default']
# goes at the end so we update the default object with the initialiser # goes at the end so we update the default object with the initialiser
super(model, self).__init__(*arg, **kw) super(model, self).__init__(*arg, **kw)
@property
def viewed(self):
if int(self['last_viewed']) >= int(self.newest_history_key) :
return True
return False
@property
def history_n(self):
return self.__history_n
@property
def history(self):
tmp_history = {}
import logging
import time
# Read the history file as a dict
fname = os.path.join(self.__datastore_path, self.get('uuid'), "history.txt")
if os.path.isfile(fname):
logging.debug("Disk IO accessed " + str(time.time()))
with open(fname, "r") as f:
tmp_history = dict(i.strip().split(',', 1) for i in f.readlines())
if len(tmp_history):
self.__newest_history_key = list(tmp_history.keys())[-1]
self.__history_n = len(tmp_history)
return tmp_history
@property
def has_history(self):
fname = os.path.join(self.__datastore_path, self.get('uuid'), "history.txt")
return os.path.isfile(fname)
# Returns the newest key, but if there's only 1 record, then it's counted as not being new, so return 0.
@property
def newest_history_key(self):
if self.__newest_history_key is not None:
return self.__newest_history_key
if len(self.history) <= 1:
return 0
bump = self.history
return self.__newest_history_key
# Save some text file to the appropriate path and bump the history
# result_obj from fetch_site_status.run()
def save_history_text(self, contents, timestamp):
import uuid
from os import mkdir, path, unlink
import logging
output_path = "{}/{}".format(self.__datastore_path, self['uuid'])
# In case the operator deleted it, check and create.
if not os.path.isdir(output_path):
mkdir(output_path)
snapshot_fname = "{}/{}.stripped.txt".format(output_path, uuid.uuid4())
logging.debug("Saving history text {}".format(snapshot_fname))
with open(snapshot_fname, 'wb') as f:
f.write(contents)
f.close()
# Append to index
# @todo check last char was \n
index_fname = "{}/history.txt".format(output_path)
with open(index_fname, 'a') as f:
f.write("{},{}\n".format(timestamp, snapshot_fname))
f.close()
self.__newest_history_key = timestamp
self.__history_n+=1
#@todo bump static cache of the last timestamp so we don't need to examine the file to set a proper ''viewed'' status
return snapshot_fname
@property @property
def has_empty_checktime(self): def has_empty_checktime(self):
@@ -62,8 +157,7 @@ class model(dict):
def threshold_seconds(self): def threshold_seconds(self):
seconds = 0 seconds = 0
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7} for m, n in self.mtable.items():
for m, n in mtable.items():
x = self.get('time_between_check', {}).get(m, None) x = self.get('time_between_check', {}).get(m, None)
if x: if x:
seconds += x * n seconds += x * n
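The `history.txt` index introduced in this file holds one `timestamp,snapshot_path` entry per line. A minimal sketch of reading it back, assuming that line format; `parse_history_index` is a hypothetical name and the example path is made up for illustration:

```python
def parse_history_index(lines):
    """Hypothetical helper: map each timestamp to its snapshot path."""
    history = {}
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines, e.g. a trailing newline at the end of the file
            continue
        # Split on the first comma only, so a comma inside the path is kept
        timestamp, snapshot_path = line.split(',', 1)
        history[timestamp] = snapshot_path
    return history


if __name__ == "__main__":
    sample = ["1655290000,/datastore/abc/one.stripped.txt\n",
              "1655293600,/datastore/abc/two.stripped.txt\n"]
    print(parse_history_index(sample))
```

Since Python dicts preserve insertion order, the last key is the newest entry as long as lines are only ever appended in time order.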

View File

@@ -67,6 +67,11 @@ def process_notification(n_object, datastore):
url += k + 'avatar_url=https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/changedetectionio/static/images/avatar-256x256.png' url += k + 'avatar_url=https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/changedetectionio/static/images/avatar-256x256.png'
if url.startswith('tgram://'): if url.startswith('tgram://'):
# Telegram only supports a limited subset of HTML, remove the '<br/>' we place in.
# re https://github.com/dgtlmoon/changedetection.io/issues/555
# @todo re-use an existing library we have already imported to strip all non-allowed tags
n_body = n_body.replace('<br/>', '\n')
n_body = n_body.replace('</br>', '\n')
# real limit is 4096, but minus some for extra metadata # real limit is 4096, but minus some for extra metadata
payload_max_size = 3600 payload_max_size = 3600
body_limit = max(0, payload_max_size - len(n_title)) body_limit = max(0, payload_max_size - len(n_title))
@@ -97,6 +102,12 @@ def process_notification(n_object, datastore):
if log_value and 'WARNING' in log_value or 'ERROR' in log_value: if log_value and 'WARNING' in log_value or 'ERROR' in log_value:
raise Exception(log_value) raise Exception(log_value)
# Return what was sent for better logging
return {'title': n_title,
'body': n_body,
'body_format': n_format}
# Notification title + body content parameters get created here. # Notification title + body content parameters get created here.
def create_notification_parameters(n_object, datastore): def create_notification_parameters(n_object, datastore):
from copy import deepcopy from copy import deepcopy
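The Telegram-specific cleanup in the hunk above can be sketched as a standalone function. This is an illustration only: `prepare_telegram_body` is a hypothetical name, and the final truncation to `body_limit` is an assumption, since the diff shows how the limit is computed but not how it is applied.

```python
# Value used in the hunk; the comment there gives Telegram's real limit as 4096
PAYLOAD_MAX_SIZE = 3600


def prepare_telegram_body(title: str, body: str) -> str:
    """Hypothetical helper: replace unsupported <br/> tags and cap the body length."""
    # Telegram accepts only a subset of HTML, so line-break tags become newlines
    body = body.replace('<br/>', '\n').replace('</br>', '\n')
    # Leave room for the title inside the payload budget
    body_limit = max(0, PAYLOAD_MAX_SIZE - len(title))
    # Assumption: the body is simply cut at the limit
    return body[:body_limit]


if __name__ == "__main__":
    print(prepare_telegram_body("Change detected", "Old line<br/>New line"))
```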

View File

@@ -9,6 +9,8 @@
# exit when any command fails # exit when any command fails
set -e set -e
export MINIMUM_SECONDS_RECHECK_TIME=0
find tests/test_*py -type f|while read test_name find tests/test_*py -type f|while read test_name
do do
echo "TEST RUNNING $test_name" echo "TEST RUNNING $test_name"
@@ -22,3 +24,26 @@ echo "RUNNING WITH BASE_URL SET"
export BASE_URL="https://really-unique-domain.io" export BASE_URL="https://really-unique-domain.io"
pytest tests/test_notification.py pytest tests/test_notification.py
# Now for the selenium and playwright/browserless fetchers
# Note - these are not UI functional tests - just checking that each one can fetch the content
echo "TESTING WEBDRIVER FETCH > SELENIUM/WEBDRIVER..."
docker run -d --name $$-test_selenium -p 4444:4444 --rm --shm-size="2g" selenium/standalone-chrome-debug:3.141.59
# takes a while to spin up
sleep 5
export WEBDRIVER_URL=http://localhost:4444/wd/hub
pytest tests/fetchers/test_content.py
unset WEBDRIVER_URL
docker kill $$-test_selenium
echo "TESTING WEBDRIVER FETCH > PLAYWRIGHT/BROWSERLESS..."
# Not all platforms support playwright (not ARM/rPI), so it's not packaged in requirements.txt
pip3 install playwright~=1.22
docker run -d --name $$-test_browserless -e "DEFAULT_LAUNCH_ARGS=[\"--window-size=1920,1080\"]" --rm -p 3000:3000 --shm-size="2g" browserless/chrome:1.53-chrome-stable
# takes a while to spin up
sleep 5
export PLAYWRIGHT_DRIVER_URL=ws://127.0.0.1:3000
pytest tests/fetchers/test_content.py
unset PLAYWRIGHT_DRIVER_URL
docker kill $$-test_browserless

Binary file not shown. (Size: 6.2 KiB)

Binary file not shown. (Size: 12 KiB)

View File

@@ -0,0 +1,17 @@
$(document).ready(function () {
// Load it when the #screenshot tab is in use, so we don't give a slow experience when waiting for the text diff to load
window.addEventListener('hashchange', function (e) {
toggle(location.hash);
}, false);
toggle(location.hash);
function toggle(hash_name) {
if (hash_name === '#screenshot') {
$("img#screenshot-img").attr('src', screenshot_url);
$("#settings").hide();
} else {
$("#settings").show();
}
}
});

View File

@@ -1,4 +1,4 @@
$(document).ready(function() { $(document).ready(function () {
function toggle() { function toggle() {
if ($('input[name="application-fetch_backend"]:checked').val() != 'html_requests') { if ($('input[name="application-fetch_backend"]:checked').val() != 'html_requests') {
$('#requests-override-options').hide(); $('#requests-override-options').hide();
@@ -8,9 +8,29 @@ $(document).ready(function() {
$('#webdriver-override-options').hide(); $('#webdriver-override-options').hide();
} }
} }
$('input[name="application-fetch_backend"]').click(function (e) { $('input[name="application-fetch_backend"]').click(function (e) {
toggle(); toggle();
}); });
toggle(); toggle();
$("#api-key").hover(
function () {
$("#api-key-copy").html('copy').fadeIn();
},
function () {
$("#api-key-copy").hide();
}
).click(function (e) {
$("#api-key-copy").html('copied');
var range = document.createRange();
var n = $("#api-key")[0];
range.selectNode(n);
window.getSelection().removeAllRanges();
window.getSelection().addRange(range);
document.execCommand("copy");
window.getSelection().removeAllRanges();
});
}); });

View File

@@ -0,0 +1,56 @@
/**
* debounce
* @param {integer} milliseconds This param indicates the number of milliseconds
* to wait after the last call before calling the original function.
* @param {object} context What "this" refers to in the returned function.
* @return {function} This returns a function that when called will wait the
* indicated number of milliseconds after the last call before
* calling the original function.
*/
Function.prototype.debounce = function (milliseconds, context) {
var baseFunction = this,
timer = null,
wait = milliseconds;
return function () {
var self = context || this,
args = arguments;
function complete() {
baseFunction.apply(self, args);
timer = null;
}
if (timer) {
clearTimeout(timer);
}
timer = setTimeout(complete, wait);
};
};
/**
* throttle
* @param {integer} milliseconds This param indicates the number of milliseconds
* to wait between calls before calling the original function.
* @param {object} context What "this" refers to in the returned function.
* @return {function} This returns a function that when called will wait the
* indicated number of milliseconds between calls before
* calling the original function.
*/
Function.prototype.throttle = function (milliseconds, context) {
var baseFunction = this,
lastEventTimestamp = null,
limit = milliseconds;
return function () {
var self = context || this,
args = arguments,
now = Date.now();
if (!lastEventTimestamp || now - lastEventTimestamp >= limit) {
lastEventTimestamp = now;
baseFunction.apply(self, args);
}
};
};

View File

@@ -40,13 +40,19 @@ $(document).ready(function() {
$.ajax({ $.ajax({
type: "POST", type: "POST",
url: notification_base_url, url: notification_base_url,
data : data data : data,
statusCode: {
400: function() {
// More than likely the CSRF token was lost when the server restarted
alert("There was a problem processing the request, please reload the page.");
}
}
}).done(function(data){ }).done(function(data){
console.log(data); console.log(data);
alert('Sent'); alert('Sent');
}).fail(function(data){ }).fail(function(data){
console.log(data); console.log(data);
alert('Error: '+data.responseJSON.error); alert('There was an error communicating with the server.');
}) })
}); });
}); });

View File

@@ -0,0 +1,230 @@
// Horrible proof of concept code :)
// yes - this is really a hack, if you are a front-ender and want to help, please get in touch!
$(document).ready(function() {
var current_selected_i;
var state_clicked=false;
var c;
// greyed out fill context
var xctx;
// redline highlight context
var ctx;
var current_default_xpath;
var x_scale=1;
var y_scale=1;
var selector_image;
var selector_image_rect;
var selector_data;
$('#visualselector-tab').click(function () {
$("img#selector-background").off('load');
state_clicked = false;
current_selected_i = false;
bootstrap_visualselector();
});
$(document).on('keydown', function(event) {
if ($("img#selector-background").is(":visible")) {
if (event.key == "Escape") {
state_clicked=false;
ctx.clearRect(0, 0, c.width, c.height);
}
}
});
// For when the page loads
if(!window.location.hash || window.location.hash != '#visualselector') {
$("img#selector-background").attr('src','');
return;
}
// Handle clearing button/link
$('#clear-selector').on('click', function(event) {
if(!state_clicked) {
alert('Oops, Nothing selected!');
}
state_clicked=false;
ctx.clearRect(0, 0, c.width, c.height);
xctx.clearRect(0, 0, c.width, c.height);
$("#css_filter").val('');
});
bootstrap_visualselector();
function bootstrap_visualselector() {
if ( 1 ) {
// bootstrap it, this will trigger everything else
$("img#selector-background").bind('load', function () {
console.log("Loaded background...");
c = document.getElementById("selector-canvas");
// greyed out fill context
xctx = c.getContext("2d");
// redline highlight context
ctx = c.getContext("2d");
current_default_xpath =$("#css_filter").val();
fetch_data();
$('#selector-canvas').off("mousemove mousedown");
// screenshot_url defined in the edit.html template
}).attr("src", screenshot_url);
}
}
function fetch_data() {
// Image is ready
$('.fetching-update-notice').html("Fetching element data..");
$.ajax({
url: watch_visual_selector_data_url,
context: document.body
}).done(function (data) {
$('.fetching-update-notice').html("Rendering..");
selector_data = data;
console.log("Reported browser width from backend: "+data['browser_width']);
state_clicked=false;
set_scale();
reflow_selector();
$('.fetching-update-notice').fadeOut();
});
};
function set_scale() {
// some things to check if the scaling doesn't work
// - that the widths/sizes really are about the actual screen size cat elements.json |grep -o width......|sort|uniq
selector_image = $("img#selector-background")[0];
selector_image_rect = selector_image.getBoundingClientRect();
// make the canvas the same size as the image
$('#selector-canvas').attr('height', selector_image_rect.height);
$('#selector-canvas').attr('width', selector_image_rect.width);
$('#selector-wrapper').attr('width', selector_image_rect.width);
x_scale = selector_image_rect.width / selector_data['browser_width'];
y_scale = selector_image_rect.height / selector_image.naturalHeight;
ctx.strokeStyle = 'rgba(255,0,0, 0.9)';
ctx.fillStyle = 'rgba(255,0,0, 0.1)';
ctx.lineWidth = 3;
console.log("scaling set x: "+x_scale+" by y:"+y_scale);
$("#selector-current-xpath").css('max-width', selector_image_rect.width);
}
function reflow_selector() {
$(window).resize(function() {
set_scale();
highlight_current_selected_i();
});
var selector_currnt_xpath_text=$("#selector-current-xpath span");
set_scale();
console.log(selector_data['size_pos'].length + " selectors found");
// highlight the default one if we can find it in the xPath list
// or the xpath matches the default one
var found = false;
if(current_default_xpath.length) {
for (var i = selector_data['size_pos'].length; i!==0; i--) {
var sel = selector_data['size_pos'][i-1];
if(selector_data['size_pos'][i - 1].xpath == current_default_xpath) {
console.log("highlighting "+current_default_xpath);
current_selected_i = i-1;
highlight_current_selected_i();
found = true;
break;
}
}
if(!found) {
alert("Unfortunately your existing CSS/xPath Filter was no longer found!");
}
}
$('#selector-canvas').bind('mousemove', function (e) {
if(state_clicked) {
return;
}
ctx.clearRect(0, 0, c.width, c.height);
current_selected_i=null;
// Add in offset
if ((typeof e.offsetX === "undefined" || typeof e.offsetY === "undefined") || (e.offsetX === 0 && e.offsetY === 0)) {
var targetOffset = $(e.target).offset();
e.offsetX = e.pageX - targetOffset.left;
e.offsetY = e.pageY - targetOffset.top;
}
// Reverse order - the most specific element should be the deepest, i.e. the last one in the list
// Basically, find the deepest match
var found=0;
ctx.fillStyle = 'rgba(205,0,0,0.35)';
for (var i = selector_data['size_pos'].length; i!==0; i--) {
// draw all of them? let them choose somehow?
var sel = selector_data['size_pos'][i-1];
// If we are in a bounding-box
if (e.offsetY > sel.top * y_scale && e.offsetY < sel.top * y_scale + sel.height * y_scale
&&
e.offsetX > sel.left * x_scale && e.offsetX < sel.left * x_scale + sel.width * x_scale
) {
// FOUND ONE
set_current_selected_text(sel.xpath);
ctx.strokeRect(sel.left * x_scale, sel.top * y_scale, sel.width * x_scale, sel.height * y_scale);
ctx.fillRect(sel.left * x_scale, sel.top * y_scale, sel.width * x_scale, sel.height * y_scale);
// no need to keep digging
// @todo or, O to go out/up, I to go in
// or double click to go up/out the selector?
current_selected_i=i-1;
found+=1;
break;
}
}
}.debounce(5));
function set_current_selected_text(s) {
selector_currnt_xpath_text[0].innerHTML=s;
}
function highlight_current_selected_i() {
if(state_clicked) {
state_clicked=false;
xctx.clearRect(0,0,c.width, c.height);
return;
}
var sel = selector_data['size_pos'][current_selected_i];
if (sel.xpath[0] == '/') {
// @todo - not sure just checking / is right
$("#css_filter").val('xpath:'+sel.xpath);
} else {
$("#css_filter").val(sel.xpath);
}
xctx.fillStyle = 'rgba(205,205,205,0.95)';
xctx.strokeStyle = 'rgba(225,0,0,0.9)';
xctx.lineWidth = 3;
xctx.fillRect(0,0,c.width, c.height);
// Clear out what only should be seen (make a clear/clean spot)
xctx.clearRect(sel.left * x_scale, sel.top * y_scale, sel.width * x_scale, sel.height * y_scale);
xctx.strokeRect(sel.left * x_scale, sel.top * y_scale, sel.width * x_scale, sel.height * y_scale);
state_clicked=true;
set_current_selected_text(sel.xpath);
}
$('#selector-canvas').bind('mousedown', function (e) {
highlight_current_selected_i();
});
}
});

View File

@@ -4,6 +4,7 @@ $(function () {
$(this).closest('.unviewed').removeClass('unviewed'); $(this).closest('.unviewed').removeClass('unviewed');
}); });
$('.with-share-link > *').click(function () { $('.with-share-link > *').click(function () {
$("#copied-clipboard").remove(); $("#copied-clipboard").remove();
@@ -20,5 +21,6 @@ $(function () {
$(this).remove(); $(this).remove();
}); });
}); });
}); });

View File

@@ -338,7 +338,8 @@ footer {
padding-top: 110px; } padding-top: 110px; }
div.tabs.collapsable ul li { div.tabs.collapsable ul li {
display: block; display: block;
border-radius: 0px; } border-radius: 0px;
margin-right: 0px; }
input[type='text'] { input[type='text'] {
width: 100%; } width: 100%; }
/* /*
@@ -352,6 +353,8 @@ and also iPads specifically.
/* Hide table headers (but not display: none;, for accessibility) */ } /* Hide table headers (but not display: none;, for accessibility) */ }
.watch-table thead, .watch-table tbody, .watch-table th, .watch-table td, .watch-table tr { .watch-table thead, .watch-table tbody, .watch-table th, .watch-table td, .watch-table tr {
display: block; } display: block; }
.watch-table .last-checked > span {
vertical-align: middle; }
.watch-table .last-checked::before { .watch-table .last-checked::before {
color: #555; color: #555;
content: "Last Checked "; } content: "Last Checked "; }
@@ -369,7 +372,8 @@ and also iPads specifically.
.watch-table td { .watch-table td {
/* Behave like a "row" */ /* Behave like a "row" */
border: none; border: none;
border-bottom: 1px solid #eee; } border-bottom: 1px solid #eee;
vertical-align: middle; }
.watch-table td:before { .watch-table td:before {
/* Top/left values mimic padding */ /* Top/left values mimic padding */
top: 6px; top: 6px;
@@ -429,6 +433,15 @@ and also iPads specifically.
.tab-pane-inner:target { .tab-pane-inner:target {
display: block; } display: block; }
#beta-logo {
height: 50px;
right: -3px;
top: -3px;
position: absolute; }
#selector-header {
padding-bottom: 1em; }
.edit-form { .edit-form {
min-width: 70%; min-width: 70%;
/* so it cant overflow */ /* so it cant overflow */
@@ -454,5 +467,68 @@ ul {
.time-check-widget tr input[type="number"] { .time-check-widget tr input[type="number"] {
width: 5em; } width: 5em; }
#selector-wrapper {
height: 600px;
overflow-y: scroll;
position: relative; }
#selector-wrapper > img {
position: absolute;
z-index: 4;
max-width: 100%; }
#selector-wrapper > canvas {
position: relative;
z-index: 5;
max-width: 100%; }
#selector-wrapper > canvas:hover {
cursor: pointer; }
#selector-current-xpath {
font-size: 80%; }
#webdriver-override-options input[type="number"] { #webdriver-override-options input[type="number"] {
width: 5em; } width: 5em; }
#api-key:hover {
cursor: pointer; }
#api-key-copy {
color: #0078e7; }
/* spinner */
.loader,
.loader:after {
border-radius: 50%;
width: 10px;
height: 10px; }
.loader {
margin: 0px auto;
font-size: 3px;
vertical-align: middle;
display: inline-block;
text-indent: -9999em;
border-top: 1.1em solid rgba(38, 104, 237, 0.2);
border-right: 1.1em solid rgba(38, 104, 237, 0.2);
border-bottom: 1.1em solid rgba(38, 104, 237, 0.2);
border-left: 1.1em solid #2668ed;
-webkit-transform: translateZ(0);
-ms-transform: translateZ(0);
transform: translateZ(0);
-webkit-animation: load8 1.1s infinite linear;
animation: load8 1.1s infinite linear; }
@-webkit-keyframes load8 {
0% {
-webkit-transform: rotate(0deg);
transform: rotate(0deg); }
100% {
-webkit-transform: rotate(360deg);
transform: rotate(360deg); } }
@keyframes load8 {
0% {
-webkit-transform: rotate(0deg);
transform: rotate(0deg); }
100% {
-webkit-transform: rotate(360deg);
transform: rotate(360deg); } }


@@ -469,6 +469,7 @@ footer {
div.tabs.collapsable ul li { div.tabs.collapsable ul li {
display: block; display: block;
border-radius: 0px; border-radius: 0px;
margin-right: 0px;
} }
input[type='text'] { input[type='text'] {
@@ -486,6 +487,11 @@ and also iPads specifically.
display: block; display: block;
} }
.last-checked {
> span {
vertical-align: middle;
}
}
.last-checked::before { .last-checked::before {
color: #555; color: #555;
content: "Last Checked "; content: "Last Checked ";
@@ -516,7 +522,7 @@ and also iPads specifically.
/* Behave like a "row" */ /* Behave like a "row" */
border: none; border: none;
border-bottom: 1px solid #eee; border-bottom: 1px solid #eee;
vertical-align: middle;
&:before { &:before {
/* Top/left values mimic padding */ /* Top/left values mimic padding */
top: 6px; top: 6px;
@@ -613,6 +619,18 @@ $form-edge-padding: 20px;
padding: 0px; padding: 0px;
} }
#beta-logo {
height: 50px;
// looks better when it's hanging off a little
right: -3px;
top: -3px;
position: absolute;
}
#selector-header {
padding-bottom: 1em;
}
.edit-form { .edit-form {
min-width: 70%; min-width: 70%;
/* so it cant overflow */ /* so it cant overflow */
@@ -649,8 +667,87 @@ ul {
} }
} }
#selector-wrapper {
height: 600px;
overflow-y: scroll;
position: relative;
//width: 100%;
> img {
position: absolute;
z-index: 4;
max-width: 100%;
}
>canvas {
position: relative;
z-index: 5;
max-width: 100%;
&:hover {
cursor: pointer;
}
}
}
#selector-current-xpath {
font-size: 80%;
}
#webdriver-override-options { #webdriver-override-options {
input[type="number"] { input[type="number"] {
width: 5em; width: 5em;
} }
} }
#api-key {
&:hover {
cursor: pointer;
}
}
#api-key-copy {
color: #0078e7;
}
/* spinner */
.loader,
.loader:after {
border-radius: 50%;
width: 10px;
height: 10px;
}
.loader {
margin: 0px auto;
font-size: 3px;
vertical-align: middle;
display: inline-block;
text-indent: -9999em;
border-top: 1.1em solid rgba(38,104,237, 0.2);
border-right: 1.1em solid rgba(38,104,237, 0.2);
border-bottom: 1.1em solid rgba(38,104,237, 0.2);
border-left: 1.1em solid #2668ed;
-webkit-transform: translateZ(0);
-ms-transform: translateZ(0);
transform: translateZ(0);
-webkit-animation: load8 1.1s infinite linear;
animation: load8 1.1s infinite linear;
}
@-webkit-keyframes load8 {
0% {
-webkit-transform: rotate(0deg);
transform: rotate(0deg);
}
100% {
-webkit-transform: rotate(360deg);
transform: rotate(360deg);
}
}
@keyframes load8 {
0% {
-webkit-transform: rotate(0deg);
transform: rotate(0deg);
}
100% {
-webkit-transform: rotate(360deg);
transform: rotate(360deg);
}
}


@@ -12,6 +12,7 @@ from os import mkdir, path, unlink
from threading import Lock from threading import Lock
import re import re
import requests import requests
import secrets
from . model import App, Watch from . model import App, Watch
@@ -39,7 +40,7 @@ class ChangeDetectionStore:
# Base definition for all watchers # Base definition for all watchers
# deepcopy part of #569 - not sure why its needed exactly # deepcopy part of #569 - not sure why its needed exactly
self.generic_definition = deepcopy(Watch.model()) self.generic_definition = deepcopy(Watch.model(datastore_path = datastore_path, default={}))
if path.isfile('changedetectionio/source.txt'): if path.isfile('changedetectionio/source.txt'):
with open('changedetectionio/source.txt') as f: with open('changedetectionio/source.txt') as f:
@@ -70,13 +71,10 @@ class ChangeDetectionStore:
if 'application' in from_disk['settings']: if 'application' in from_disk['settings']:
self.__data['settings']['application'].update(from_disk['settings']['application']) self.__data['settings']['application'].update(from_disk['settings']['application'])
# Reinitialise each `watching` with our generic_definition in the case that we add a new var in the future. # Convert each existing watch back to the Watch.model object
# @todo pretty sure theres a python we todo this with an abstracted(?) object!
for uuid, watch in self.__data['watching'].items(): for uuid, watch in self.__data['watching'].items():
_blank = deepcopy(self.generic_definition) watch['uuid']=uuid
_blank.update(watch) self.__data['watching'][uuid] = Watch.model(datastore_path=self.datastore_path, default=watch)
self.__data['watching'].update({uuid: _blank})
self.__data['watching'][uuid]['newest_history_key'] = self.get_newest_history_key(uuid)
print("Watching:", uuid, self.__data['watching'][uuid]['url']) print("Watching:", uuid, self.__data['watching'][uuid]['url'])
# First time ran, doesnt exist. # First time ran, doesnt exist.
@@ -86,8 +84,7 @@ class ChangeDetectionStore:
self.add_watch(url='http://www.quotationspage.com/random.php', tag='test') self.add_watch(url='http://www.quotationspage.com/random.php', tag='test')
self.add_watch(url='https://news.ycombinator.com/', tag='Tech news') self.add_watch(url='https://news.ycombinator.com/', tag='Tech news')
self.add_watch(url='https://www.gov.uk/coronavirus', tag='Covid') self.add_watch(url='https://changedetection.io/CHANGELOG.txt', tag='changedetection.io')
self.add_watch(url='https://changedetection.io/CHANGELOG.txt')
self.__data['version_tag'] = version_tag self.__data['version_tag'] = version_tag
@@ -107,10 +104,13 @@ class ChangeDetectionStore:
# Generate the URL access token for RSS feeds # Generate the URL access token for RSS feeds
if not 'rss_access_token' in self.__data['settings']['application']: if not 'rss_access_token' in self.__data['settings']['application']:
import secrets
secret = secrets.token_hex(16) secret = secrets.token_hex(16)
self.__data['settings']['application']['rss_access_token'] = secret self.__data['settings']['application']['rss_access_token'] = secret
# Generate the API access token
if not 'api_access_token' in self.__data['settings']['application']:
secret = secrets.token_hex(16)
self.__data['settings']['application']['api_access_token'] = secret
# Proxy list support - available as a selection in settings when text file is imported # Proxy list support - available as a selection in settings when text file is imported
# CSV list # CSV list
@@ -127,23 +127,8 @@ class ChangeDetectionStore:
# Finally start the thread that will manage periodic data saves to JSON # Finally start the thread that will manage periodic data saves to JSON
save_data_thread = threading.Thread(target=self.save_datastore).start() save_data_thread = threading.Thread(target=self.save_datastore).start()
# Returns the newest key, but if theres only 1 record, then it's counted as not being new, so return 0.
def get_newest_history_key(self, uuid):
if len(self.__data['watching'][uuid]['history']) == 1:
return 0
dates = list(self.__data['watching'][uuid]['history'].keys())
# Convert to int, sort and back to str again
# @todo replace datastore getter that does this automatically
dates = [int(i) for i in dates]
dates.sort(reverse=True)
if len(dates):
# always keyed as str
return str(dates[0])
return 0
def set_last_viewed(self, uuid, timestamp): def set_last_viewed(self, uuid, timestamp):
logging.debug("Setting watch UUID: {} last viewed to {}".format(uuid, int(timestamp)))
self.data['watching'][uuid].update({'last_viewed': int(timestamp)}) self.data['watching'][uuid].update({'last_viewed': int(timestamp)})
self.needs_write = True self.needs_write = True
@@ -167,7 +152,6 @@ class ChangeDetectionStore:
del (update_obj[dict_key]) del (update_obj[dict_key])
self.__data['watching'][uuid].update(update_obj) self.__data['watching'][uuid].update(update_obj)
self.__data['watching'][uuid]['newest_history_key'] = self.get_newest_history_key(uuid)
self.needs_write = True self.needs_write = True
@@ -175,27 +159,26 @@ class ChangeDetectionStore:
def threshold_seconds(self): def threshold_seconds(self):
seconds = 0 seconds = 0
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7} mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60))
for m, n in mtable.items(): for m, n in mtable.items():
x = self.__data['settings']['requests']['time_between_check'].get(m) x = self.__data['settings']['requests']['time_between_check'].get(m)
if x: if x:
seconds += x * n seconds += x * n
return max(seconds, minimum_seconds_recheck_time) return seconds
@property
def has_unviewed(self):
for uuid, watch in self.__data['watching'].items():
if watch.viewed == False:
return True
return False
@property @property
def data(self): def data(self):
has_unviewed = False has_unviewed = False
for uuid, v in self.__data['watching'].items(): for uuid, watch in self.__data['watching'].items():
self.__data['watching'][uuid]['newest_history_key'] = self.get_newest_history_key(uuid)
if int(v['newest_history_key']) <= int(v['last_viewed']):
self.__data['watching'][uuid]['viewed'] = True
else:
self.__data['watching'][uuid]['viewed'] = False
has_unviewed = True
# #106 - Be sure this is None on empty string, False, None, etc # #106 - Be sure this is None on empty string, False, None, etc
# Default var for fetch_backend # Default var for fetch_backend
# @todo this may not be needed anymore, or could be easily removed
if not self.__data['watching'][uuid]['fetch_backend']: if not self.__data['watching'][uuid]['fetch_backend']:
self.__data['watching'][uuid]['fetch_backend'] = self.__data['settings']['application']['fetch_backend'] self.__data['watching'][uuid]['fetch_backend'] = self.__data['settings']['application']['fetch_backend']
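The `threshold_seconds` hunk above sums the configured `time_between_check` units into seconds. The old side clamps the result to `MINIMUM_SECONDS_RECHECK_TIME`, while the new side returns the raw sum. A minimal standalone sketch of that arithmetic follows. The free function and its `minimum_seconds` parameter are assumptions for illustration and are not part of the project's code.

```python
# Multipliers copied from the mtable dict in the hunk above
UNIT_SECONDS = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}


def threshold_seconds(time_between_check, minimum_seconds=0):
    """Sum a {'minutes': 5, 'hours': 1, ...} dict into seconds.

    minimum_seconds is a hypothetical parameter: 0 mirrors the new side of
    the diff (plain sum), a positive value mirrors the old clamping behaviour.
    """
    seconds = 0
    for unit, multiplier in UNIT_SECONDS.items():
        value = time_between_check.get(unit)
        if value:  # skips None, 0 and missing units
            seconds += value * multiplier
    return max(seconds, minimum_seconds)


# Usage: one hour and five minutes, then a too-short interval clamped to 60s
plain_total = threshold_seconds({'hours': 1, 'minutes': 5})
clamped_total = threshold_seconds({'seconds': 10}, minimum_seconds=60)
```

Passing the minimum as an argument, rather than reading the environment inside the function, keeps the sketch deterministic. The original reads it with `os.getenv`.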
@@ -204,14 +187,13 @@ class ChangeDetectionStore:
if not self.__data['settings']['application']['base_url']: if not self.__data['settings']['application']['base_url']:
self.__data['settings']['application']['base_url'] = env_base_url.strip('" ') self.__data['settings']['application']['base_url'] = env_base_url.strip('" ')
self.__data['has_unviewed'] = has_unviewed
return self.__data return self.__data
def get_all_tags(self): def get_all_tags(self):
tags = [] tags = []
for uuid, watch in self.data['watching'].items(): for uuid, watch in self.data['watching'].items():
if watch['tag'] is None:
continue
# Support for comma separated list of tags. # Support for comma separated list of tags.
for tag in watch['tag'].split(','): for tag in watch['tag'].split(','):
tag = tag.strip() tag = tag.strip()
@@ -235,11 +217,11 @@ class ChangeDetectionStore:
# GitHub #30 also delete history records # GitHub #30 also delete history records
for uuid in self.data['watching']: for uuid in self.data['watching']:
for path in self.data['watching'][uuid]['history'].values(): for path in self.data['watching'][uuid].history.values():
self.unlink_history_file(path) self.unlink_history_file(path)
else: else:
for path in self.data['watching'][uuid]['history'].values(): for path in self.data['watching'][uuid].history.values():
self.unlink_history_file(path) self.unlink_history_file(path)
del self.data['watching'][uuid] del self.data['watching'][uuid]
@@ -271,15 +253,31 @@ class ChangeDetectionStore:
def scrub_watch(self, uuid): def scrub_watch(self, uuid):
import pathlib import pathlib
self.__data['watching'][uuid].update({'history': {}, 'last_checked': 0, 'last_changed': 0, 'newest_history_key': 0, 'previous_md5': False}) self.__data['watching'][uuid].update(
self.needs_write_urgent = True {'last_checked': 0,
'last_changed': 0,
'last_viewed': 0,
'previous_md5': False,
'last_notification_error': False,
'last_error': False})
for item in pathlib.Path(self.datastore_path).rglob(uuid+"/*.txt"): # JSON Data, Screenshots, Textfiles (history index and snapshots), HTML in the future etc
for item in pathlib.Path(os.path.join(self.datastore_path, uuid)).rglob("*.*"):
unlink(item) unlink(item)
# Force the attr to recalculate
bump = self.__data['watching'][uuid].history
self.needs_write_urgent = True
def add_watch(self, url, tag="", extras=None, write_to_disk_now=True): def add_watch(self, url, tag="", extras=None, write_to_disk_now=True):
if extras is None: if extras is None:
extras = {} extras = {}
# should always be str
if tag is None or not tag:
tag=''
# Incase these are copied across, assume it's a reference and deepcopy() # Incase these are copied across, assume it's a reference and deepcopy()
apply_extras = deepcopy(extras) apply_extras = deepcopy(extras)
@@ -299,7 +297,7 @@ class ChangeDetectionStore:
'body', 'method', 'body', 'method',
'ignore_text', 'css_filter', 'ignore_text', 'css_filter',
'subtractive_selectors', 'trigger_text', 'subtractive_selectors', 'trigger_text',
'extract_title_as_title']: 'extract_title_as_title', 'extract_text']:
if res.get(k): if res.get(k):
apply_extras[k] = res[k] apply_extras[k] = res[k]
@@ -309,16 +307,15 @@ class ChangeDetectionStore:
return False return False
with self.lock: with self.lock:
# @todo use a common generic version of this
new_uuid = str(uuid_builder.uuid4())
# #Re 569 # #Re 569
# Not sure why deepcopy was needed here, sometimes new watches would appear to already have 'history' set new_watch = Watch.model(datastore_path=self.datastore_path, default={
# I assumed this would instantiate a new object but somehow an existing dict was getting used
new_watch = deepcopy(Watch.model({
'url': url, 'url': url,
'tag': tag 'tag': tag
})) })
new_uuid = new_watch['uuid']
logging.debug("Added URL {} - {}".format(url, new_uuid))
for k in ['uuid', 'history', 'last_checked', 'last_changed', 'newest_history_key', 'previous_md5', 'viewed']: for k in ['uuid', 'history', 'last_checked', 'last_changed', 'newest_history_key', 'previous_md5', 'viewed']:
if k in apply_extras: if k in apply_extras:
@@ -338,23 +335,6 @@ class ChangeDetectionStore:
self.sync_to_json() self.sync_to_json()
return new_uuid return new_uuid
# Save some text file to the appropriate path and bump the history
# result_obj from fetch_site_status.run()
def save_history_text(self, watch_uuid, contents):
import uuid
output_path = "{}/{}".format(self.datastore_path, watch_uuid)
# Incase the operator deleted it, check and create.
if not os.path.isdir(output_path):
mkdir(output_path)
fname = "{}/{}.stripped.txt".format(output_path, uuid.uuid4())
with open(fname, 'wb') as f:
f.write(contents)
f.close()
return fname
def get_screenshot(self, watch_uuid): def get_screenshot(self, watch_uuid):
output_path = "{}/{}".format(self.datastore_path, watch_uuid) output_path = "{}/{}".format(self.datastore_path, watch_uuid)
fname = "{}/last-screenshot.png".format(output_path) fname = "{}/last-screenshot.png".format(output_path)
@@ -363,6 +343,15 @@ class ChangeDetectionStore:
return False return False
def visualselector_data_is_ready(self, watch_uuid):
output_path = "{}/{}".format(self.datastore_path, watch_uuid)
screenshot_filename = "{}/last-screenshot.png".format(output_path)
elements_index_filename = "{}/elements.json".format(output_path)
if path.isfile(screenshot_filename) and path.isfile(elements_index_filename) :
return True
return False
# Save as PNG, PNG is larger but better for doing visual diff in the future # Save as PNG, PNG is larger but better for doing visual diff in the future
def save_screenshot(self, watch_uuid, screenshot: bytes): def save_screenshot(self, watch_uuid, screenshot: bytes):
output_path = "{}/{}".format(self.datastore_path, watch_uuid) output_path = "{}/{}".format(self.datastore_path, watch_uuid)
@@ -371,6 +360,14 @@ class ChangeDetectionStore:
f.write(screenshot) f.write(screenshot)
f.close() f.close()
def save_xpath_data(self, watch_uuid, data):
output_path = "{}/{}".format(self.datastore_path, watch_uuid)
fname = "{}/elements.json".format(output_path)
with open(fname, 'w') as f:
f.write(json.dumps(data))
f.close()
def sync_to_json(self): def sync_to_json(self):
logging.info("Saving JSON..") logging.info("Saving JSON..")
print("Saving JSON..") print("Saving JSON..")
@@ -423,8 +420,8 @@ class ChangeDetectionStore:
index=[] index=[]
for uuid in self.data['watching']: for uuid in self.data['watching']:
for id in self.data['watching'][uuid]['history']: for id in self.data['watching'][uuid].history:
index.append(self.data['watching'][uuid]['history'][str(id)]) index.append(self.data['watching'][uuid].history[str(id)])
import pathlib import pathlib
@@ -495,3 +492,28 @@ class ChangeDetectionStore:
# Only upgrade individual watch time if it was set # Only upgrade individual watch time if it was set
if watch.get('minutes_between_check', False): if watch.get('minutes_between_check', False):
self.data['watching'][uuid]['time_between_check']['minutes'] = watch['minutes_between_check'] self.data['watching'][uuid]['time_between_check']['minutes'] = watch['minutes_between_check']
# Move the history list to a flat text file index
# Better than SQLite because this list is only appended to, and works across NAS / NFS type setups
def update_2(self):
# @todo test running this on a newly updated one (when this already ran)
for uuid, watch in self.data['watching'].items():
history = []
if watch.get('history', False):
for d, p in watch['history'].items():
d = int(d) # Used to be keyed as str, we'll fix this now too
history.append("{},{}\n".format(d,p))
if len(history):
target_path = os.path.join(self.datastore_path, uuid)
if os.path.exists(target_path):
with open(os.path.join(target_path, "history.txt"), "w") as f:
f.writelines(history)
else:
logging.warning("Datastore history directory {} does not exist, skipping history import.".format(target_path))
# No longer needed, dynamically pulled from the disk when needed.
# But we should set it back to an empty dict so we don't break if this schema runs on an earlier version.
# In the distant future we can remove this entirely
self.data['watching'][uuid]['history'] = {}
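The `update_2` migration above flattens each watch's `history` dict into a `history.txt` index with one `timestamp,path` line per snapshot. The sketch below reproduces that conversion outside the datastore class. The function names are hypothetical, and the reader is only an assumption about how such a file could be parsed back, since the diff does not show how `Watch.model` loads it.

```python
import os
import tempfile


def history_to_index_lines(history):
    """Turn a {timestamp: snapshot_path} dict into 'timestamp,path' lines."""
    lines = []
    for timestamp, snapshot_path in history.items():
        # Keys used to be stored as str; normalise them to int on the way out
        lines.append("{},{}\n".format(int(timestamp), snapshot_path))
    return lines


def write_history_index(watch_dir, history):
    """Write history.txt into watch_dir; return False when there is nothing to do."""
    lines = history_to_index_lines(history)
    if not lines or not os.path.isdir(watch_dir):
        return False
    with open(os.path.join(watch_dir, "history.txt"), "w") as f:
        f.writelines(lines)
    return True


def read_history_index(watch_dir):
    """Hypothetical reader: parse history.txt back into {int_timestamp: path}."""
    index = {}
    with open(os.path.join(watch_dir, "history.txt")) as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            # Split on the first comma only, the path itself may contain commas
            timestamp, snapshot_path = line.split(",", 1)
            index[int(timestamp)] = snapshot_path
    return index


# Usage: round-trip a legacy history dict through a temporary watch directory
demo_dir = tempfile.mkdtemp()
legacy_history = {"1655280000": "/datastore/abc/a.txt", "1655290000": "/datastore/abc/b.txt"}
written = write_history_index(demo_dir, legacy_history)
restored = read_history_index(demo_dir)
```

Because the file is only ever appended to, a new snapshot would need just one extra line, which fits the NAS/NFS rationale given in the comment above the migration.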


@@ -14,7 +14,7 @@
<li>Use <a target=_new href="https://github.com/caronc/apprise">AppRise URLs</a> for notification to just about any service! <i><a target=_new href="https://github.com/dgtlmoon/changedetection.io/wiki/Notification-configuration-notes">Please read the notification services wiki here for important configuration notes</a></i>.</li> <li>Use <a target=_new href="https://github.com/caronc/apprise">AppRise URLs</a> for notification to just about any service! <i><a target=_new href="https://github.com/dgtlmoon/changedetection.io/wiki/Notification-configuration-notes">Please read the notification services wiki here for important configuration notes</a></i>.</li>
<li><code>discord://</code> only supports a maximum <strong>2,000 characters</strong> of notification text, including the title.</li> <li><code>discord://</code> only supports a maximum <strong>2,000 characters</strong> of notification text, including the title.</li>
<li><code>tgram://</code> bots cant send messages to other bots, so you should specify chat ID of non-bot user.</li> <li><code>tgram://</code> bots cant send messages to other bots, so you should specify chat ID of non-bot user.</li>
<li>Go here for <a href="{{url_for('notification_logs')}}">notification debug logs</a></li> <li><code>tgram://</code> only supports very limited HTML and can fail when extra tags are sent, <a href="https://core.telegram.org/bots/api#html-style">read more here</a> (or use plaintext/markdown format)</li>
</ul> </ul>
</div> </div>
<br/> <br/>
@@ -22,6 +22,7 @@
{% if emailprefix %} {% if emailprefix %}
<a id="add-email-helper" class="pure-button button-secondary button-xsmall" style="font-size: 70%">Add email</a> <a id="add-email-helper" class="pure-button button-secondary button-xsmall" style="font-size: 70%">Add email</a>
{% endif %} {% endif %}
<a href="{{url_for('notification_logs')}}" class="pure-button button-secondary button-xsmall" style="font-size: 70%">Notification debug logs</a>
</div> </div>
<div id="notification-customisation" class="pure-control-group"> <div id="notification-customisation" class="pure-control-group">
<div class="pure-control-group"> <div class="pure-control-group">


@@ -1,6 +1,11 @@
{% extends 'base.html' %} {% extends 'base.html' %}
{% block content %} {% block content %}
<script>
const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}";
</script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script>
<div id="settings"> <div id="settings">
<h1>Differences</h1> <h1>Differences</h1>
<form class="pure-form " action="" method="GET"> <form class="pure-form " action="" method="GET">
@@ -39,9 +44,7 @@
<div class="tabs"> <div class="tabs">
<ul> <ul>
<li class="tab" id="default-tab"><a href="#text">Text</a></li> <li class="tab" id="default-tab"><a href="#text">Text</a></li>
{% if screenshot %} <li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li>
<li class="tab"><a href="#screenshot">Current screenshot</a></li>
{% endif %}
</ul> </ul>
</div> </div>
@@ -63,17 +66,21 @@
</table> </table>
Diff algorithm from the amazing <a href="https://github.com/kpdecker/jsdiff">github.com/kpdecker/jsdiff</a> Diff algorithm from the amazing <a href="https://github.com/kpdecker/jsdiff">github.com/kpdecker/jsdiff</a>
</div> </div>
{% if screenshot %}
<div class="tab-pane-inner" id="screenshot"> <div class="tab-pane-inner" id="screenshot">
<p> <div class="tip">
<i>For now, only the most recent screenshot is saved and displayed.</i> For now, differences are computed on text, not graphically; only the latest screenshot is available.
</p> </div>
<br/>
<img src="{{url_for('static_content', group='screenshot', filename=uuid)}}"> {% if is_html_webdriver %}
{% if screenshot %}
<img style="max-width: 80%" id="screenshot-img" alt="Current screenshot from most recent request"/>
{% else %}
No screenshot available just yet! Try rechecking the page.
{% endif %}
{% else %}
<strong>Screenshot requires Playwright/WebDriver enabled</strong>
{% endif %}
</div> </div>
{% endif %}
</div> </div>


@@ -5,12 +5,18 @@
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script>
<script> <script>
const notification_base_url="{{url_for('ajax_callback_send_notification_test')}}"; const notification_base_url="{{url_for('ajax_callback_send_notification_test')}}";
const watch_visual_selector_data_url="{{url_for('static_content', group='visual_selector_data', filename=uuid)}}";
const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}";
{% if emailprefix %} {% if emailprefix %}
const email_notification_prefix=JSON.parse('{{ emailprefix|tojson }}'); const email_notification_prefix=JSON.parse('{{ emailprefix|tojson }}');
{% endif %} {% endif %}
</script> </script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='watch-settings.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='watch-settings.js')}}" defer></script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='notifications.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='notifications.js')}}" defer></script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='visual-selector.js')}}" defer></script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='limit.js')}}" defer></script>
<div class="edit-form monospaced-textarea"> <div class="edit-form monospaced-textarea">
@@ -18,6 +24,7 @@
<ul> <ul>
<li class="tab" id="default-tab"><a href="#general">General</a></li> <li class="tab" id="default-tab"><a href="#general">General</a></li>
<li class="tab"><a href="#request">Request</a></li> <li class="tab"><a href="#request">Request</a></li>
<li class="tab"><a id="visualselector-tab" href="#visualselector">Visual Selector</a></li>
<li class="tab"><a href="#filters-and-triggers">Filters &amp; Triggers</a></li> <li class="tab"><a href="#filters-and-triggers">Filters &amp; Triggers</a></li>
<li class="tab"><a href="#notifications">Notifications</a></li> <li class="tab"><a href="#notifications">Notifications</a></li>
</ul> </ul>
@@ -192,6 +199,57 @@ nav
</span> </span>
</div> </div>
</fieldset> </fieldset>
<fieldset>
<div class="pure-control-group">
{{ render_field(form.extract_text, rows=5, placeholder="\d+ online") }}
<span class="pure-form-message-inline">
<ul>
<li>Extracts text in the final output after other filters using regular expressions, for example <code>\d+ online</code></li>
<li>One line per regular-expression.</li>
</ul>
</span>
</div>
</fieldset>
</div>
<div class="tab-pane-inner visual-selector-ui" id="visualselector">
<img id="beta-logo" src="{{url_for('static_content', group='images', filename='beta-logo.png')}}">
<fieldset>
<div class="pure-control-group">
{% if visualselector_enabled %}
{% if visualselector_data_is_ready %}
<div id="selector-header">
<a id="clear-selector" class="pure-button button-secondary button-xsmall" style="font-size: 70%">Clear selection</a>
<i class="fetching-update-notice" style="font-size: 80%;">One moment, fetching screenshot and element information..</i>
</div>
<div id="selector-wrapper">
<!-- request the screenshot and get the element offset info ready -->
<!-- use img src ready load to know everything is ready to map out -->
<!-- @todo: maybe something interesting like a field to select 'elements that contain text... and their parents n' -->
<img id="selector-background" />
<canvas id="selector-canvas"></canvas>
</div>
<div id="selector-current-xpath" style="overflow-x: hidden"><strong>Currently:</strong>&nbsp;<span class="text">Loading...</span></div>
<span class="pure-form-message-inline">
<p><span style="font-weight: bold">Beta!</span> The Visual Selector is new and there may be minor bugs. Please report pages that don't work and help us improve this software!</p>
</span>
{% else %}
<span class="pure-form-message-inline">Screenshot and element data is not available or not yet ready.</span>
{% endif %}
{% else %}
<span class="pure-form-message-inline">
<p>Sorry, this functionality only works with Playwright/Chrome enabled watches.</p>
<p>Enable the Playwright Chrome fetcher, or alternatively try our <a href="https://lemonade.changedetection.io/start">very affordable subscription based service</a>.</p>
<p>This is because Selenium/WebDriver cannot extract full page screenshots reliably.</p>
</span>
{% endif %}
</div>
</fieldset>
</div> </div>
<div id="actions"> <div id="actions">
@@ -199,9 +257,11 @@ nav
{{ render_button(form.save_button) }} {{ render_button(form.save_and_preview_button) }} {{ render_button(form.save_button) }} {{ render_button(form.save_and_preview_button) }}
<a href="{{url_for('api_delete', uuid=uuid)}}" <a href="{{url_for('form_delete', uuid=uuid)}}"
class="pure-button button-small button-error ">Delete</a> class="pure-button button-small button-error ">Delete</a>
<a href="{{url_for('api_clone', uuid=uuid)}}" <a href="{{url_for('scrub_watch', uuid=uuid)}}"
class="pure-button button-small button-error ">Scrub</a>
<a href="{{url_for('form_clone', uuid=uuid)}}"
class="pure-button button-small ">Create Copy</a> class="pure-button button-small ">Create Copy</a>
</div> </div>
</div> </div>


@@ -4,7 +4,7 @@
<div class="edit-form"> <div class="edit-form">
<div class="inner"> <div class="inner">
<h4 style="margin-top: 0px;">The following issues were detected when sending notifications</h4> <h4 style="margin-top: 0px;">Notification debug log</h4>
<div id="notification-error-log"> <div id="notification-error-log">
<ul style="font-size: 80%; margin:0px; padding: 0 0 0 7px"> <ul style="font-size: 80%; margin:0px; padding: 0 0 0 7px">
{% for log in logs|reverse %} {% for log in logs|reverse %}


@@ -1,6 +1,10 @@
{% extends 'base.html' %} {% extends 'base.html' %}
{% block content %} {% block content %}
<script>
const screenshot_url="{{url_for('static_content', group='screenshot', filename=uuid)}}";
</script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='diff-overview.js')}}" defer></script>
<div id="settings"> <div id="settings">
<h1>Current - {{watch.last_checked|format_timestamp_timeago}}</h1> <h1>Current - {{watch.last_checked|format_timestamp_timeago}}</h1>
@@ -10,9 +14,7 @@
<div class="tabs"> <div class="tabs">
<ul> <ul>
<li class="tab" id="default-tab"><a href="#text">Text</a></li> <li class="tab" id="default-tab"><a href="#text">Text</a></li>
{% if screenshot %} <li class="tab" id="screenshot-tab"><a href="#screenshot">Screenshot</a></li>
<li class="tab"><a href="#screenshot">Current screenshot</a></li>
{% endif %}
</ul> </ul>
</div> </div>
@@ -31,15 +33,20 @@
</tbody> </tbody>
</table> </table>
</div> </div>
{% if screenshot %}
<div class="tab-pane-inner" id="screenshot"> <div class="tab-pane-inner" id="screenshot">
<p> <div class="tip">
<i>For now, only the most recent screenshot is saved and displayed.</i> For now, differences are computed on text, not graphically; only the latest screenshot is available.
</p> </div>
<br/>
<img src="{{url_for('static_content', group='screenshot', filename=uuid)}}"> {% if is_html_webdriver %}
{% if screenshot %}
<img style="max-width: 80%" id="screenshot-img" alt="Current screenshot from most recent request"/>
{% else %}
No screenshot available just yet! Try rechecking the page.
{% endif %}
{% else %}
<strong>Screenshot requires Playwright/WebDriver enabled</strong>
{% endif %}
</div> </div>
{% endif %}
</div> </div>
{% endblock %} {% endblock %}

@@ -20,6 +20,7 @@
<li class="tab"><a href="#notifications">Notifications</a></li> <li class="tab"><a href="#notifications">Notifications</a></li>
<li class="tab"><a href="#fetching">Fetching</a></li> <li class="tab"><a href="#fetching">Fetching</a></li>
<li class="tab"><a href="#filters">Global Filters</a></li> <li class="tab"><a href="#filters">Global Filters</a></li>
<li class="tab"><a href="#api">API</a></li>
</ul> </ul>
</div> </div>
<div class="box-wrap inner"> <div class="box-wrap inner">
@@ -31,6 +32,11 @@
{{ render_field(form.requests.form.time_between_check, class="time-check-widget") }} {{ render_field(form.requests.form.time_between_check, class="time-check-widget") }}
<span class="pure-form-message-inline">Default time for all watches, when the watch does not have a specific time setting.</span> <span class="pure-form-message-inline">Default time for all watches, when the watch does not have a specific time setting.</span>
</div> </div>
<div class="pure-control-group">
{{ render_field(form.requests.form.jitter_seconds, class="jitter_seconds") }}
<span class="pure-form-message-inline">Example - 3 seconds random jitter could trigger up to 3 seconds earlier or up to 3 seconds later</span>
</div>
<div class="pure-control-group"> <div class="pure-control-group">
{% if not hide_remove_pass %} {% if not hide_remove_pass %}
{% if current_user.is_authenticated %} {% if current_user.is_authenticated %}
@@ -43,6 +49,7 @@
<span class="pure-form-message-inline">Password is locked.</span> <span class="pure-form-message-inline">Password is locked.</span>
{% endif %} {% endif %}
</div> </div>
<div class="pure-control-group"> <div class="pure-control-group">
{{ render_field(form.application.form.base_url, placeholder="http://yoursite.com:5000/", {{ render_field(form.application.form.base_url, placeholder="http://yoursite.com:5000/",
class="m-d") }} class="m-d") }}
@@ -105,7 +112,6 @@
</fieldset> </fieldset>
</div> </div>
<div class="tab-pane-inner" id="filters"> <div class="tab-pane-inner" id="filters">
<fieldset class="pure-group"> <fieldset class="pure-group">
@@ -150,12 +156,26 @@ nav
</fieldset> </fieldset>
</div> </div>
<div class="tab-pane-inner" id="api">
<p>Drive your changedetection.io via the API. More about <a href="https://github.com/dgtlmoon/changedetection.io/wiki/API-Reference">API access here</a>.</p>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.api_access_token_enabled) }}
<div class="pure-form-message-inline">Restrict API access by requiring the <code>x-api-key</code> header</div><br/>
<div class="pure-form-message-inline"><br/>API Key <span id="api-key">{{api_key}}</span>
<span style="display:none;" id="api-key-copy" >copy</span>
</div>
</div>
</div>
<div id="actions"> <div id="actions">
<div class="pure-control-group"> <div class="pure-control-group">
{{ render_button(form.save_button) }} {{ render_button(form.save_button) }}
<a href="{{url_for('index')}}" class="pure-button button-small button-cancel">Back</a> <a href="{{url_for('index')}}" class="pure-button button-small button-cancel">Back</a>
<a href="{{url_for('scrub_page')}}" class="pure-button button-small button-cancel">Delete History Snapshot Data</a> <a href="{{url_for('scrub_page')}}" class="pure-button button-small button-cancel">Delete History Snapshot Data</a>
</div> </div>
</div> </div>
</form> </form>
</div> </div>
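The jitter help text in this template ("3 seconds random jitter could trigger up to 3 seconds earlier or up to 3 seconds later") can be illustrated with a small sketch. This is not the project's scheduler code; `next_check_delay` and its parameters are hypothetical names used only to show a symmetric random offset around a base interval.

```python
import random


def next_check_delay(base_seconds, jitter_seconds, rng=random):
    """Return the base interval shifted by a random offset in [-jitter, +jitter].

    Hypothetical helper for illustration; the real application may compute
    its schedule differently.
    """
    if jitter_seconds <= 0:
        # Jitter disabled: use the configured interval unchanged.
        return float(base_seconds)
    offset = rng.uniform(-jitter_seconds, jitter_seconds)
    # Clamp so that a large jitter can never produce a negative delay.
    return max(0.0, base_seconds + offset)


if __name__ == "__main__":
    # A 180 minute interval with 3 seconds of jitter.
    print(next_check_delay(180 * 60, 3))
```

Spreading checks this way avoids every watch firing at exactly the same offset, which is presumably the motivation for the setting.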

@@ -3,9 +3,10 @@
{% from '_helpers.jinja' import render_simple_field %} {% from '_helpers.jinja' import render_simple_field %}
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='jquery-3.6.0.min.js')}}"></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='jquery-3.6.0.min.js')}}"></script>
<script type="text/javascript" src="{{url_for('static_content', group='js', filename='watch-overview.js')}}" defer></script> <script type="text/javascript" src="{{url_for('static_content', group='js', filename='watch-overview.js')}}" defer></script>
<div class="box"> <div class="box">
<form class="pure-form" action="{{ url_for('api_watch_add') }}" method="POST" id="new-watch-form"> <form class="pure-form" action="{{ url_for('form_watch_add') }}" method="POST" id="new-watch-form">
<input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/> <input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/>
<fieldset> <fieldset>
<legend>Add a new change detection watch</legend> <legend>Add a new change detection watch</legend>
@@ -45,14 +46,14 @@
{% if watch.last_error is defined and watch.last_error != False %}error{% endif %} {% if watch.last_error is defined and watch.last_error != False %}error{% endif %}
{% if watch.last_notification_error is defined and watch.last_notification_error != False %}error{% endif %} {% if watch.last_notification_error is defined and watch.last_notification_error != False %}error{% endif %}
{% if watch.paused is defined and watch.paused != False %}paused{% endif %} {% if watch.paused is defined and watch.paused != False %}paused{% endif %}
{% if watch.newest_history_key| int > watch.last_viewed| int %}unviewed{% endif %} {% if watch.newest_history_key| int > watch.last_viewed and watch.history_n>=2 %}unviewed{% endif %}
{% if watch.uuid in queued_uuids %}queued{% endif %}"> {% if watch.uuid in queued_uuids %}queued{% endif %}">
<td class="inline">{{ loop.index }}</td> <td class="inline">{{ loop.index }}</td>
<td class="inline paused-state state-{{watch.paused}}"><a href="{{url_for('index', pause=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='pause.svg')}}" alt="Pause" title="Pause"/></a></td> <td class="inline paused-state state-{{watch.paused}}"><a href="{{url_for('index', pause=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='pause.svg')}}" alt="Pause" title="Pause"/></a></td>
<td class="title-col inline">{{watch.title if watch.title is not none and watch.title|length > 0 else watch.url}} <td class="title-col inline">{{watch.title if watch.title is not none and watch.title|length > 0 else watch.url}}
<a class="external" target="_blank" rel="noopener" href="{{ watch.url.replace('source:','') }}"></a> <a class="external" target="_blank" rel="noopener" href="{{ watch.url.replace('source:','') }}"></a>
<a href="{{url_for('api_share_put_watch', uuid=watch.uuid)}}"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread.svg')}}" /></a> <a href="{{url_for('form_share_put_watch', uuid=watch.uuid)}}"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread.svg')}}" /></a>
{%if watch.fetch_backend == "html_webdriver" %}<img style="height: 1em; display:inline-block;" src="{{url_for('static_content', group='images', filename='Google-Chrome-icon.png')}}" />{% endif %} {%if watch.fetch_backend == "html_webdriver" %}<img style="height: 1em; display:inline-block;" src="{{url_for('static_content', group='images', filename='Google-Chrome-icon.png')}}" />{% endif %}
@@ -66,21 +67,21 @@
<span class="watch-tag-list">{{ watch.tag}}</span> <span class="watch-tag-list">{{ watch.tag}}</span>
{% endif %} {% endif %}
</td> </td>
<td class="last-checked">{{watch|format_last_checked_time}}</td> <td class="last-checked">{{watch|format_last_checked_time|safe}}</td>
<td class="last-changed">{% if watch.history|length >= 2 and watch.last_changed %} <td class="last-changed">{% if watch.history_n >=2 and watch.last_changed %}
{{watch.last_changed|format_timestamp_timeago}} {{watch.last_changed|format_timestamp_timeago}}
{% else %} {% else %}
Not yet Not yet
{% endif %} {% endif %}
</td> </td>
<td> <td>
<a {% if watch.uuid in queued_uuids %}disabled="true"{% endif %} href="{{ url_for('api_watch_checknow', uuid=watch.uuid, tag=request.args.get('tag')) }}" <a {% if watch.uuid in queued_uuids %}disabled="true"{% endif %} href="{{ url_for('form_watch_checknow', uuid=watch.uuid, tag=request.args.get('tag')) }}"
class="recheck pure-button button-small pure-button-primary">{% if watch.uuid in queued_uuids %}Queued{% else %}Recheck{% endif %}</a> class="recheck pure-button button-small pure-button-primary">{% if watch.uuid in queued_uuids %}Queued{% else %}Recheck{% endif %}</a>
<a href="{{ url_for('edit_page', uuid=watch.uuid)}}" class="pure-button button-small pure-button-primary">Edit</a> <a href="{{ url_for('edit_page', uuid=watch.uuid)}}" class="pure-button button-small pure-button-primary">Edit</a>
{% if watch.history|length >= 2 %} {% if watch.history_n >= 2 %}
<a href="{{ url_for('diff_history_page', uuid=watch.uuid) }}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary diff-link">Diff</a> <a href="{{ url_for('diff_history_page', uuid=watch.uuid) }}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary diff-link">Diff</a>
{% else %} {% else %}
{% if watch.history|length == 1 %} {% if watch.history_n == 1 %}
<a href="{{ url_for('preview_page', uuid=watch.uuid)}}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary">Preview</a> <a href="{{ url_for('preview_page', uuid=watch.uuid)}}" target="{{watch.uuid}}" class="pure-button button-small pure-button-primary">Preview</a>
{% endif %} {% endif %}
{% endif %} {% endif %}
@@ -96,7 +97,7 @@
</li> </li>
{% endif %} {% endif %}
<li> <li>
<a href="{{ url_for('api_watch_checknow', tag=active_tag) }}" class="pure-button button-tag ">Recheck <a href="{{ url_for('form_watch_checknow', tag=active_tag) }}" class="pure-button button-tag ">Recheck
all {% if active_tag%}in "{{active_tag}}"{%endif%}</a> all {% if active_tag%}in "{{active_tag}}"{%endif%}</a>
</li> </li>
<li> <li>

@@ -0,0 +1,2 @@
"""Tests for the app."""

@@ -0,0 +1,3 @@
#!/usr/bin/python3
from .. import conftest

@@ -0,0 +1,48 @@
#!/usr/bin/python3
import time
from flask import url_for
from ..util import live_server_setup
import logging
def test_fetch_webdriver_content(client, live_server):
live_server_setup(live_server)
#####################
res = client.post(
url_for("settings_page"),
data={"application-empty_pages_are_a_change": "",
"requests-time_between_check-minutes": 180,
'application-fetch_backend': "html_webdriver"},
follow_redirects=True
)
assert b"Settings updated." in res.data
# Add our URL to the import page
res = client.post(
url_for("import_page"),
data={"urls": "https://changedetection.io/ci-test.html"},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(3)
attempt = 0
while attempt < 20:
res = client.get(url_for("index"))
if not b'Checking now' in res.data:
break
logging.getLogger().info("Waiting for check to not say 'Checking now'..")
time.sleep(3)
attempt += 1
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
logging.getLogger().info("Looking for correct fetched HTML (text) from server")
assert b'cool it works' in res.data
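The test above polls the index page until "Checking now" disappears, giving up after 20 attempts. The same pattern can be factored into a reusable helper with an explicit deadline. This is a sketch only; `wait_until` is a hypothetical name and is not part of the project's test utilities.

```python
import time


def wait_until(predicate, timeout=60.0, interval=0.5,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    Returns True if the predicate succeeded, False on timeout.
    `sleep` and `clock` are injectable so the helper can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In a test like the one above, the predicate could be something along the lines of `lambda: b'Checking now' not in client.get(url_for("index")).data`, with the return value asserted so that a hung check fails the test instead of silently continuing.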

@@ -2,73 +2,192 @@
import time import time
from flask import url_for from flask import url_for
from . util import live_server_setup from .util import live_server_setup, extract_api_key_from_UI
def test_setup(live_server): import json
live_server_setup(live_server) import uuid
def set_response_data(test_return_data): def set_original_response():
test_return_data = """<html>
<body>
Some initial text</br>
<p>Which is across multiple lines</p>
</br>
So let's see what happens. </br>
<div id="sametext">Some text thats the same</div>
<div id="changetext">Some text that will change</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
return None
def set_modified_response():
test_return_data = """<html>
<body>
Some initial text</br>
<p>which has this one new line</p>
</br>
So let's see what happens. </br>
<div id="sametext">Some text thats the same</div>
<div id="changetext">Some text that changes</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f: with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data) f.write(test_return_data)
return None
def test_snapshot_api_detects_change(client, live_server):
test_return_data = "Some initial text"
test_return_data_modified = "Some NEW nice initial text" def is_valid_uuid(val):
try:
uuid.UUID(str(val))
return True
except ValueError:
return False
sleep_time_for_fetch_thread = 3
set_response_data(test_return_data) def test_api_simple(client, live_server):
live_server_setup(live_server)
# Give the endpoint time to spin up api_key = extract_api_key_from_UI(client)
time.sleep(1)
# Add our URL to the import page # Create a watch
test_url = url_for('test_endpoint', content_type="text/plain", set_original_response()
_external=True) watch_uuid = None
# Validate bad URL
test_url = url_for('test_endpoint', _external=True,
headers={'x-api-key': api_key}, )
res = client.post( res = client.post(
url_for("import_page"), url_for("createwatch"),
data={"urls": test_url}, data=json.dumps({"url": "h://xxxxxxxxxom"}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
follow_redirects=True follow_redirects=True
) )
assert b"1 Imported" in res.data
# Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
res = client.get(
url_for("api_snapshot", uuid="first"),
follow_redirects=True
)
assert test_return_data.encode() == res.data
# Make a change
set_response_data(test_return_data_modified)
# Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
res = client.get(
url_for("api_snapshot", uuid="first"),
follow_redirects=True
)
assert test_return_data_modified.encode() == res.data
def test_snapshot_api_invalid_uuid(client, live_server):
res = client.get(
url_for("api_snapshot", uuid="invalid"),
follow_redirects=True
)
assert res.status_code == 400 assert res.status_code == 400
# Create new
res = client.post(
url_for("createwatch"),
data=json.dumps({"url": test_url, 'tag': "One, Two", "title": "My test URL"}),
headers={'content-type': 'application/json', 'x-api-key': api_key},
follow_redirects=True
)
s = json.loads(res.data)
assert is_valid_uuid(s['uuid'])
watch_uuid = s['uuid']
assert res.status_code == 201
time.sleep(3)
# Verify it's in the list and that recheck worked
res = client.get(
url_for("createwatch"),
headers={'x-api-key': api_key}
)
assert watch_uuid in json.loads(res.data).keys()
before_recheck_info = json.loads(res.data)[watch_uuid]
assert before_recheck_info['last_checked'] != 0
assert before_recheck_info['title'] == 'My test URL'
set_modified_response()
# Trigger recheck of all ?recheck_all=1
client.get(
url_for("createwatch", recheck_all='1'),
headers={'x-api-key': api_key},
)
time.sleep(3)
# Did the recheck fire?
res = client.get(
url_for("createwatch"),
headers={'x-api-key': api_key},
)
after_recheck_info = json.loads(res.data)[watch_uuid]
assert after_recheck_info['last_checked'] != before_recheck_info['last_checked']
assert after_recheck_info['last_changed'] != 0
# Check history index list
res = client.get(
url_for("watchhistory", uuid=watch_uuid),
headers={'x-api-key': api_key},
)
history = json.loads(res.data)
assert len(history) == 2, "Should have two history entries (the original and the changed)"
# Fetch a snapshot by timestamp, check the right one was found
res = client.get(
url_for("watchsinglehistory", uuid=watch_uuid, timestamp=list(history.keys())[-1]),
headers={'x-api-key': api_key},
)
assert b'which has this one new line' in res.data
# Fetch a snapshot by 'latest', check the right one was found
res = client.get(
url_for("watchsinglehistory", uuid=watch_uuid, timestamp='latest'),
headers={'x-api-key': api_key},
)
assert b'which has this one new line' in res.data
# Fetch the whole watch
res = client.get(
url_for("watch", uuid=watch_uuid),
headers={'x-api-key': api_key}
)
watch = json.loads(res.data)
# @todo how to handle None/default global values?
assert watch['history_n'] == 2, "Found replacement history section, which is in its own API"
# Finally delete the watch
res = client.delete(
url_for("watch", uuid=watch_uuid),
headers={'x-api-key': api_key},
)
assert res.status_code == 204
# Check via a relist
res = client.get(
url_for("createwatch"),
headers={'x-api-key': api_key}
)
watch_list = json.loads(res.data)
assert len(watch_list) == 0, "Watch list should be empty"
def test_access_denied(client, live_server):
# `config_api_token_enabled` Should be On by default
res = client.get(
url_for("createwatch")
)
assert res.status_code == 403
res = client.get(
url_for("createwatch"),
headers={'x-api-key': "something horrible"}
)
assert res.status_code == 403
# Disable config_api_token_enabled and it should work
res = client.post(
url_for("settings_page"),
data={
"requests-time_between_check-minutes": 180,
"application-fetch_backend": "html_requests",
"application-api_access_token_enabled": ""
},
follow_redirects=True
)
assert b"Settings updated." in res.data
res = client.get(
url_for("createwatch")
)
assert res.status_code == 200
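The behaviour asserted in `test_access_denied` (403 without a key or with a wrong key, 200 once the token check is disabled) can be summarised as a small decision function. This is a sketch of the rule the test implies, not the project's actual implementation; `access_status` and its arguments are hypothetical.

```python
import hmac


def access_status(headers, token_enabled, expected_key):
    """Return the HTTP status implied by the x-api-key rule.

    Hypothetical helper: `headers` is a plain dict with lower-case keys.
    A real framework would normally provide case-insensitive header lookup.
    """
    if not token_enabled:
        # Token check switched off in settings: every request is allowed.
        return 200
    supplied = headers.get("x-api-key", "")
    # Constant-time comparison avoids leaking key contents through timing.
    if hmac.compare_digest(supplied.encode(), expected_key.encode()):
        return 200
    return 403
```

The constant-time comparison is a design choice added here for illustration; the diff does not show how the application compares keys.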

@@ -29,7 +29,7 @@ def test_basic_auth(client, live_server):
assert b"Updated watch." in res.data assert b"Updated watch." in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(1) time.sleep(1)
res = client.get( res = client.get(
url_for("preview_page", uuid="first"), url_for("preview_page", uuid="first"),

@@ -3,14 +3,15 @@
import time import time
from flask import url_for from flask import url_for
from urllib.request import urlopen from urllib.request import urlopen
from . util import set_original_response, set_modified_response, live_server_setup from .util import set_original_response, set_modified_response, live_server_setup
sleep_time_for_fetch_thread = 3 sleep_time_for_fetch_thread = 3
# Basic test to check inscriptus is not adding return line chars, basically works etc # Basic test to check inscriptus is not adding return line chars, basically works etc
def test_inscriptus(): def test_inscriptus():
from inscriptis import get_text from inscriptis import get_text
html_content="<html><body>test!<br/>ok man</body></html>" html_content = "<html><body>test!<br/>ok man</body></html>"
stripped_text_from_html = get_text(html_content) stripped_text_from_html = get_text(html_content)
assert stripped_text_from_html == 'test!\nok man' assert stripped_text_from_html == 'test!\nok man'
@@ -32,7 +33,7 @@ def test_check_basic_change_detection_functionality(client, live_server):
# Do this a few times.. ensures we dont accidently set the status # Do this a few times.. ensures we dont accidently set the status
for n in range(3): for n in range(3):
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -65,7 +66,7 @@ def test_check_basic_change_detection_functionality(client, live_server):
assert b'which has this one new line' in res.read() assert b'which has this one new line' in res.read()
# Force recheck # Force recheck
res = client.get(url_for("api_watch_checknow"), follow_redirects=True) res = client.get(url_for("form_watch_checknow"), follow_redirects=True)
assert b'1 watches are queued for rechecking.' in res.data assert b'1 watches are queued for rechecking.' in res.data
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -82,7 +83,7 @@ def test_check_basic_change_detection_functionality(client, live_server):
# re #16 should have the diff in here too # re #16 should have the diff in here too
assert b'(into ) which has this one new line' in res.data assert b'(into ) which has this one new line' in res.data
assert b'CDATA' in res.data assert b'CDATA' in res.data
assert expected_url.encode('utf-8') in res.data assert expected_url.encode('utf-8') in res.data
# Following the 'diff' link, it should no longer display as 'unviewed' even after we recheck it a few times # Following the 'diff' link, it should no longer display as 'unviewed' even after we recheck it a few times
@@ -93,7 +94,7 @@ def test_check_basic_change_detection_functionality(client, live_server):
# Do this a few times.. ensures we dont accidently set the status # Do this a few times.. ensures we dont accidently set the status
for n in range(2): for n in range(2):
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -101,7 +102,8 @@ def test_check_basic_change_detection_functionality(client, live_server):
# It should report nothing found (no new 'unviewed' class) # It should report nothing found (no new 'unviewed' class)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' not in res.data assert b'unviewed' not in res.data
assert b'head title' not in res.data # Should not be present because this is off by default assert b'Mark all viewed' not in res.data
assert b'head title' not in res.data # Should not be present because this is off by default
assert b'test-endpoint' in res.data assert b'test-endpoint' in res.data
set_original_response() set_original_response()
@@ -109,20 +111,28 @@ def test_check_basic_change_detection_functionality(client, live_server):
# Enable auto pickup of <title> in settings # Enable auto pickup of <title> in settings
res = client.post( res = client.post(
url_for("settings_page"), url_for("settings_page"),
data={"application-extract_title_as_title": "1", "requests-time_between_check-minutes": 180, 'application-fetch_backend': "html_requests"}, data={"application-extract_title_as_title": "1", "requests-time_between_check-minutes": 180,
'application-fetch_backend': "html_requests"},
follow_redirects=True follow_redirects=True
) )
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' in res.data assert b'unviewed' in res.data
assert b'Mark all viewed' in res.data
# It should have picked up the <title> # It should have picked up the <title>
assert b'head title' in res.data assert b'head title' in res.data
# hit the mark all viewed link
res = client.get(url_for("mark_all_viewed"), follow_redirects=True)
assert b'Mark all viewed' not in res.data
assert b'unviewed' not in res.data
# #
# Cleanup everything # Cleanup everything
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data assert b'Deleted' in res.data

@@ -23,7 +23,7 @@ def test_trigger_functionality(client, live_server):
res = client.get( res = client.get(
url_for("api_clone", uuid="first"), url_for("form_clone", uuid="first"),
follow_redirects=True follow_redirects=True
) )

@@ -89,7 +89,7 @@ def test_check_markup_css_filter_restriction(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -110,7 +110,7 @@ def test_check_markup_css_filter_restriction(client, live_server):
assert bytes(css_filter.encode('utf-8')) in res.data assert bytes(css_filter.encode('utf-8')) in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -118,7 +118,7 @@ def test_check_markup_css_filter_restriction(client, live_server):
set_modified_response() set_modified_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)

@@ -145,20 +145,19 @@ def test_element_removal_full(client, live_server):
assert bytes(subtractive_selectors_data.encode("utf-8")) in res.data assert bytes(subtractive_selectors_data.encode("utf-8")) in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# No change yet - first check # so that we set the state to 'unviewed' after all the edits
res = client.get(url_for("index")) client.get(url_for("diff_history_page", uuid="first"))
assert b"unviewed" not in res.data
# Make a change to header/footer/nav # Make a change to header/footer/nav
set_modified_response() set_modified_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)

@@ -39,7 +39,7 @@ def test_check_encoding_detection(client, live_server):
) )
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(2) time.sleep(2)
@@ -71,7 +71,7 @@ def test_check_encoding_detection_missing_content_type_header(client, live_serve
) )
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(2) time.sleep(2)

@@ -29,7 +29,7 @@ def test_error_handler(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -54,7 +54,7 @@ def test_error_text_handler(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)

@@ -0,0 +1,127 @@
#!/usr/bin/python3
import time
from flask import url_for
from .util import live_server_setup
from ..html_tools import *
def set_original_response():
test_return_data = """<html>
<body>
Some initial text</br>
<p>Which is across multiple lines</p>
</br>
So let's see what happens. </br>
<div id="sametext">Some text thats the same</div>
<div id="changetext">Some text that will change</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
return None
def set_modified_response():
test_return_data = """<html>
<body>
Some initial text</br>
<p>which has this one new line</p>
</br>
So let's see what happens. </br>
<div id="sametext">Some text thats the same</div>
<div id="changetext">Some text that did change ( 1000 online <br/> 80 guests)</div>
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
return None
def test_check_filter_and_regex_extract(client, live_server):
sleep_time_for_fetch_thread = 3
live_server_setup(live_server)
css_filter = "#changetext"
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# Go to the edit page and set the CSS filter plus the extract_text regexes
res = client.post(
url_for("edit_page", uuid="first"),
data={"css_filter": css_filter,
'extract_text': '\\d+ online\n\\d+ guests',
"url": test_url,
"tag": "",
"headers": "",
'fetch_backend': "html_requests"
},
follow_redirects=True
)
assert b"Updated watch." in res.data
# Check it saved
res = client.get(
url_for("edit_page", uuid="first"),
)
assert rb'\d+ online' in res.data
# Trigger a check
# client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# Make a change
set_modified_response()
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# It should be marked 'unviewed'
# because the filtered '#changetext' element changed
res = client.get(url_for("index"))
assert b'unviewed' in res.data
# Check HTML conversion was detected and worked
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
# Class will be blank for now because the frontend didn't apply the diff
assert b'<div class="">1000 online' in res.data
# Matches from both regexes should be here
assert b'<div class="">80 guests' in res.data
# Should not be here
assert b'Some text that did change' not in res.data
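The test above supplies two regular expressions, one per line, and expects only the matched fragments ("1000 online", "80 guests") to survive. A minimal sketch of that idea follows, assuming each line of the `extract_text` field is treated as an independent pattern; `extract_by_regex` is a hypothetical name and the application's real extraction code may differ.

```python
import re


def extract_by_regex(text, extract_text):
    """Apply each non-empty line of `extract_text` as a regex to `text`.

    Returns all matches, grouped in pattern order. Illustration only.
    """
    matches = []
    for pattern in extract_text.splitlines():
        pattern = pattern.strip()
        if not pattern:
            continue
        matches.extend(re.findall(pattern, text))
    return matches


if __name__ == "__main__":
    sample = "Some text that did change ( 1000 online \n 80 guests)"
    print(extract_by_regex(sample, "\\d+ online\n\\d+ guests"))
```

Note that the separator between the two patterns must be a real newline while the `\d` sequences must reach the regex engine intact, which is why the backslashes are escaped rather than using a raw string.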

@@ -0,0 +1,84 @@
#!/usr/bin/python3

import time
import os
import json
import logging
from flask import url_for
from .util import live_server_setup
from urllib.parse import urlparse, parse_qs


def test_consistent_history(client, live_server):
    live_server_setup(live_server)

    # Give the endpoint time to spin up
    time.sleep(1)
    r = range(1, 50)

    for one in r:
        test_url = url_for('test_endpoint', content_type="text/html", content=str(one), _external=True)
        res = client.post(
            url_for("import_page"),
            data={"urls": test_url},
            follow_redirects=True
        )

        assert b"1 Imported" in res.data

    time.sleep(3)
    while True:
        res = client.get(url_for("index"))
        logging.debug("Waiting for 'Checking now' to go away..")
        if b'Checking now' not in res.data:
            break
        time.sleep(0.5)

    time.sleep(3)
    # Essentially just triggers the DB write/update
    res = client.post(
        url_for("settings_page"),
        data={"application-empty_pages_are_a_change": "",
              "requests-time_between_check-minutes": 180,
              'application-fetch_backend': "html_requests"},
        follow_redirects=True
    )
    assert b"Settings updated." in res.data

    # Give it time to write it out
    time.sleep(3)
    json_db_file = os.path.join(live_server.app.config['DATASTORE'].datastore_path, 'url-watches.json')

    json_obj = None
    with open(json_db_file, 'r') as f:
        json_obj = json.load(f)

    # Assert that the expected number of watches was found in the JSON
    assert len(json_obj['watching']) == len(r), "Correct number of watches was found in the JSON"

    # Each one should have a history.txt containing just one line
    for w in json_obj['watching'].keys():
        history_txt_index_file = os.path.join(live_server.app.config['DATASTORE'].datastore_path, w, 'history.txt')
        assert os.path.isfile(history_txt_index_file), "History.txt should exist where I expect it - {}".format(history_txt_index_file)

        # Same as in model.Watch
        with open(history_txt_index_file, "r") as f:
            tmp_history = dict(i.strip().split(',', 2) for i in f.readlines())
            assert len(tmp_history) == 1, "History.txt should contain 1 line"

        # There should be two files: history.txt and the snapshot file
        files_in_watch_dir = os.listdir(os.path.join(live_server.app.config['DATASTORE'].datastore_path,
                                                     w))
        # Find the snapshot one
        for fname in files_in_watch_dir:
            if fname != 'history.txt':
                # Contents should match what we requested as content returned from the test URL
                with open(os.path.join(live_server.app.config['DATASTORE'].datastore_path, w, fname), 'r') as snapshot_f:
                    contents = snapshot_f.read()
                    watch_url = json_obj['watching'][w]['url']
                    u = urlparse(watch_url)
                    q = parse_qs(u[4])
                    assert q['content'][0] == contents.strip(), "Snapshot file {} should contain {}".format(fname, q['content'][0])

        assert len(files_in_watch_dir) == 2, "Should be just two files in the dir, history.txt and the snapshot"
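
Review note on the new test above: the "Checking now" wait loop has no upper bound, and the history index is parsed inline. A minimal sketch of two small helpers that could cover both, assuming each history.txt line is a single `key,value` pair. The names `wait_until` and `parse_history_index` are hypothetical and are not part of the project.

```python
import time
from typing import Callable, Dict, Iterable


def wait_until(condition: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # Final check so a condition that became true during the last sleep is not missed
    return condition()


def parse_history_index(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'key,value' lines (assumed history.txt layout) into a dict, skipping blank lines."""
    history = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # maxsplit=1 keeps any further commas inside the value part
        key, value = line.split(',', 1)
        history[key] = value
    return history
```

In the test, the loop could then read roughly as `assert wait_until(lambda: b'Checking now' not in client.get(url_for("index")).data)`, which fails with a clear assertion instead of hanging if a check never finishes.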


@@ -102,7 +102,7 @@ def test_check_ignore_text_functionality(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -123,7 +123,7 @@ def test_check_ignore_text_functionality(client, live_server):
assert bytes(ignore_text.encode('utf-8')) in res.data assert bytes(ignore_text.encode('utf-8')) in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -137,7 +137,7 @@ def test_check_ignore_text_functionality(client, live_server):
set_modified_ignore_response() set_modified_ignore_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -152,7 +152,7 @@ def test_check_ignore_text_functionality(client, live_server):
# Just to be sure.. set a regular modified change.. # Just to be sure.. set a regular modified change..
set_modified_original_ignore_response() set_modified_original_ignore_response()
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index")) res = client.get(url_for("index"))
@@ -165,7 +165,7 @@ def test_check_ignore_text_functionality(client, live_server):
# We should be able to see what we ignored # We should be able to see what we ignored
assert b'<div class="ignored">new ignore stuff' in res.data assert b'<div class="ignored">new ignore stuff' in res.data
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data assert b'Deleted' in res.data
def test_check_global_ignore_text_functionality(client, live_server): def test_check_global_ignore_text_functionality(client, live_server):
@@ -200,7 +200,7 @@ def test_check_global_ignore_text_functionality(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -222,7 +222,7 @@ def test_check_global_ignore_text_functionality(client, live_server):
assert bytes(ignore_text.encode('utf-8')) in res.data assert bytes(ignore_text.encode('utf-8')) in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -240,7 +240,7 @@ def test_check_global_ignore_text_functionality(client, live_server):
set_modified_ignore_response() set_modified_ignore_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -251,10 +251,10 @@ def test_check_global_ignore_text_functionality(client, live_server):
# Just to be sure.. set a regular modified change that will trigger it # Just to be sure.. set a regular modified change that will trigger it
set_modified_original_ignore_response() set_modified_original_ignore_response()
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' in res.data assert b'unviewed' in res.data
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data assert b'Deleted' in res.data


@@ -72,14 +72,14 @@ def test_render_anchor_tag_content_true(client, live_server):
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# set a new html text with a modified link # set a new html text with a modified link
set_modified_ignore_response() set_modified_ignore_response()
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -101,7 +101,7 @@ def test_render_anchor_tag_content_true(client, live_server):
assert b"Settings updated." in res.data assert b"Settings updated." in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -119,7 +119,7 @@ def test_render_anchor_tag_content_true(client, live_server):
assert b"/test-endpoint" in res.data assert b"/test-endpoint" in res.data
# Cleanup everything # Cleanup everything
res = client.get(url_for("api_delete", uuid="all"), res = client.get(url_for("form_delete", uuid="all"),
follow_redirects=True) follow_redirects=True)
assert b'Deleted' in res.data assert b'Deleted' in res.data


@@ -70,12 +70,12 @@ def test_normal_page_check_works_with_ignore_status_code(client, live_server):
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
set_some_changed_response() set_some_changed_response()
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -105,7 +105,7 @@ def test_403_page_check_works_with_ignore_status_code(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -120,7 +120,7 @@ def test_403_page_check_works_with_ignore_status_code(client, live_server):
assert b"Updated watch." in res.data assert b"Updated watch." in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -128,7 +128,7 @@ def test_403_page_check_works_with_ignore_status_code(client, live_server):
set_some_changed_response() set_some_changed_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -157,7 +157,7 @@ def test_403_page_check_fails_without_ignore_status_code(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -172,7 +172,7 @@ def test_403_page_check_fails_without_ignore_status_code(client, live_server):
assert b"Updated watch." in res.data assert b"Updated watch." in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -180,7 +180,7 @@ def test_403_page_check_fails_without_ignore_status_code(client, live_server):
set_some_changed_response() set_some_changed_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)


@@ -80,12 +80,12 @@ def test_check_ignore_whitespace(client, live_server):
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
set_original_ignore_response_but_with_whitespace() set_original_ignore_response_but_with_whitespace()
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)


@@ -25,7 +25,7 @@ https://example.com tag1, other tag"""
assert b"3 Imported" in res.data assert b"3 Imported" in res.data
assert b"tag1" in res.data assert b"tag1" in res.data
assert b"other tag" in res.data assert b"other tag" in res.data
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
# Clear flask alerts # Clear flask alerts
res = client.get( url_for("index")) res = client.get( url_for("index"))
@@ -50,7 +50,7 @@ def xtest_import_skip_url(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
assert b"ht000000broken" in res.data assert b"ht000000broken" in res.data
assert b"1 Skipped" in res.data assert b"1 Skipped" in res.data
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
# Clear flask alerts # Clear flask alerts
res = client.get( url_for("index")) res = client.get( url_for("index"))
@@ -79,7 +79,7 @@ def test_import_distillio(client, live_server):
# Give the endpoint time to spin up # Give the endpoint time to spin up
time.sleep(1) time.sleep(1)
client.get(url_for("api_delete", uuid="all"), follow_redirects=True) client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
res = client.post( res = client.post(
url_for("import_page"), url_for("import_page"),
data={ data={
@@ -115,6 +115,6 @@ def test_import_distillio(client, live_server):
assert b"nice stuff" in res.data assert b"nice stuff" in res.data
assert b"nerd-news" in res.data assert b"nerd-news" in res.data
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
# Clear flask alerts # Clear flask alerts
res = client.get(url_for("index")) res = client.get(url_for("index"))


@@ -171,7 +171,7 @@ def test_check_json_without_filter(client, live_server):
) )
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -203,7 +203,7 @@ def test_check_json_filter(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -229,7 +229,7 @@ def test_check_json_filter(client, live_server):
assert bytes(json_filter.encode('utf-8')) in res.data assert bytes(json_filter.encode('utf-8')) in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -237,7 +237,7 @@ def test_check_json_filter(client, live_server):
set_modified_response() set_modified_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(4) time.sleep(4)
@@ -288,7 +288,7 @@ def test_check_json_filter_bool_val(client, live_server):
time.sleep(3) time.sleep(3)
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -296,7 +296,7 @@ def test_check_json_filter_bool_val(client, live_server):
set_modified_response() set_modified_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -327,7 +327,7 @@ def test_check_json_ext_filter(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -353,7 +353,7 @@ def test_check_json_ext_filter(client, live_server):
assert bytes(json_filter.encode('utf-8')) in res.data assert bytes(json_filter.encode('utf-8')) in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(3) time.sleep(3)
@@ -361,7 +361,7 @@ def test_check_json_ext_filter(client, live_server):
set_modified_ext_response() set_modified_ext_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(4) time.sleep(4)


@@ -39,7 +39,7 @@ def test_check_basic_change_detection_functionality(client, live_server):
# Do this a few times.. ensures we don't accidentally set the status # Do this a few times.. ensures we don't accidentally set the status
for n in range(3): for n in range(3):
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -61,7 +61,7 @@ def test_check_basic_change_detection_functionality(client, live_server):
# this should not trigger a change, because no good text could be converted from the HTML # this should not trigger a change, because no good text could be converted from the HTML
set_nonrenderable_response() set_nonrenderable_response()
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -83,7 +83,7 @@ def test_check_basic_change_detection_functionality(client, live_server):
set_modified_response() set_modified_response()
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -97,6 +97,6 @@ def test_check_basic_change_detection_functionality(client, live_server):
# #
# Cleanup everything # Cleanup everything
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data assert b'Deleted' in res.data


@@ -36,7 +36,7 @@ def test_check_notification(client, live_server):
# Add our URL to the import page # Add our URL to the import page
test_url = url_for('test_endpoint', _external=True) test_url = url_for('test_endpoint', _external=True)
res = client.post( res = client.post(
url_for("api_watch_add"), url_for("form_watch_add"),
data={"url": test_url, "tag": ''}, data={"url": test_url, "tag": ''},
follow_redirects=True follow_redirects=True
) )
@@ -98,7 +98,7 @@ def test_check_notification(client, live_server):
notification_submission = None notification_submission = None
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3) time.sleep(3)
# Verify what was sent as a notification, this file should exist # Verify what was sent as a notification, this file should exist
with open("test-datastore/notification.txt", "r") as f: with open("test-datastore/notification.txt", "r") as f:
@@ -133,7 +133,7 @@ def test_check_notification(client, live_server):
# This should insert the {current_snapshot} # This should insert the {current_snapshot}
set_more_modified_response() set_more_modified_response()
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3) time.sleep(3)
# Verify what was sent as a notification, this file should exist # Verify what was sent as a notification, this file should exist
with open("test-datastore/notification.txt", "r") as f: with open("test-datastore/notification.txt", "r") as f:
@@ -146,17 +146,21 @@ def test_check_notification(client, live_server):
os.unlink("test-datastore/notification.txt") os.unlink("test-datastore/notification.txt")
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(1) time.sleep(1)
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(1) time.sleep(1)
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(1) time.sleep(1)
assert os.path.exists("test-datastore/notification.txt") == False assert os.path.exists("test-datastore/notification.txt") == False
res = client.get(url_for("notification_logs"))
# be sure we see it in the output log
assert b'New ChangeDetection.io Notification - ' + test_url.encode('utf-8') in res.data
# cleanup for the next # cleanup for the next
client.get( client.get(
url_for("api_delete", uuid="all"), url_for("form_delete", uuid="all"),
follow_redirects=True follow_redirects=True
) )
@@ -168,7 +172,7 @@ def test_notification_validation(client, live_server):
# Add our URL to the import page # Add our URL to the import page
test_url = url_for('test_endpoint', _external=True) test_url = url_for('test_endpoint', _external=True)
res = client.post( res = client.post(
url_for("api_watch_add"), url_for("form_watch_add"),
data={"url": test_url, "tag": 'nice one'}, data={"url": test_url, "tag": 'nice one'},
follow_redirects=True follow_redirects=True
) )
@@ -208,6 +212,6 @@ def test_notification_validation(client, live_server):
# cleanup for the next # cleanup for the next
client.get( client.get(
url_for("api_delete", uuid="all"), url_for("form_delete", uuid="all"),
follow_redirects=True follow_redirects=True
) )


@@ -16,7 +16,7 @@ def test_check_notification_error_handling(client, live_server):
# use a different URL so that it doesn't interfere with the actual check until we are ready # use a different URL so that it doesn't interfere with the actual check until we are ready
test_url = url_for('test_endpoint', _external=True) test_url = url_for('test_endpoint', _external=True)
res = client.post( res = client.post(
url_for("api_watch_add"), url_for("form_watch_add"),
data={"url": "https://changedetection.io/CHANGELOG.txt", "tag": ''}, data={"url": "https://changedetection.io/CHANGELOG.txt", "tag": ''},
follow_redirects=True follow_redirects=True
) )


@@ -41,7 +41,7 @@ def test_share_watch(client, live_server):
# click share the link # click share the link
res = client.get( res = client.get(
url_for("api_share_put_watch", uuid="first"), url_for("form_share_put_watch", uuid="first"),
follow_redirects=True follow_redirects=True
) )
@@ -54,7 +54,7 @@ def test_share_watch(client, live_server):
# Now delete what we have, we will try to re-import it # Now delete what we have, we will try to re-import it
# Cleanup everything # Cleanup everything
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data assert b'Deleted' in res.data
# Add our URL to the import page # Add our URL to the import page
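
Review note: most hunks in these test files are the same mechanical rename of endpoint names passed to `url_for`. A minimal sketch of how such a rename could be scripted, using only the four old/new pairs visible in this diff. The helper name `rename_endpoints` is hypothetical, and nothing here claims this is how the change was actually made.

```python
import re

# Old endpoint name -> new endpoint name, as they appear in this diff
ENDPOINT_RENAMES = {
    "api_watch_checknow": "form_watch_checknow",
    "api_watch_add": "form_watch_add",
    "api_delete": "form_delete",
    "api_share_put_watch": "form_share_put_watch",
}

# Matches url_for("name" or url_for('name' and captures the endpoint name
_URL_FOR = re.compile(r'''(url_for\(\s*["'])(\w+)(["'])''')


def rename_endpoints(source: str) -> str:
    """Rewrite known old endpoint names inside url_for(...) calls; leave others untouched."""
    def _swap(match):
        name = match.group(2)
        return match.group(1) + ENDPOINT_RENAMES.get(name, name) + match.group(3)

    return _URL_FOR.sub(_swap, source)


if __name__ == "__main__":
    line = 'client.get(url_for("api_watch_checknow"), follow_redirects=True)'
    print(rename_endpoints(line))
```

Restricting the substitution to the first argument of `url_for` avoids touching unrelated strings that merely contain `api_`.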


@@ -39,7 +39,7 @@ def test_check_basic_change_detection_functionality_source(client, live_server):
set_modified_response() set_modified_response()
# Force recheck # Force recheck
res = client.get(url_for("api_watch_checknow"), follow_redirects=True) res = client.get(url_for("form_watch_checknow"), follow_redirects=True)
assert b'1 watches are queued for rechecking.' in res.data assert b'1 watches are queued for rechecking.' in res.data
time.sleep(5) time.sleep(5)


@@ -43,7 +43,7 @@ def set_modified_with_trigger_text_response():
Some NEW nice initial text</br> Some NEW nice initial text</br>
<p>Which is across multiple lines</p> <p>Which is across multiple lines</p>
</br> </br>
foobar123 Add to cart
<br/> <br/>
So let's see what happens. </br> So let's see what happens. </br>
</body> </body>
@@ -60,7 +60,7 @@ def test_trigger_functionality(client, live_server):
live_server_setup(live_server) live_server_setup(live_server)
sleep_time_for_fetch_thread = 3 sleep_time_for_fetch_thread = 3
trigger_text = "foobar123" trigger_text = "Add to cart"
set_original_ignore_response() set_original_ignore_response()
# Give the endpoint time to spin up # Give the endpoint time to spin up
@@ -76,10 +76,7 @@ def test_trigger_functionality(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# Goto the edit page, add our ignore text # Goto the edit page, add our ignore text
# Add our URL to the import page # Add our URL to the import page
@@ -98,8 +95,14 @@ def test_trigger_functionality(client, live_server):
) )
assert bytes(trigger_text.encode('utf-8')) in res.data assert bytes(trigger_text.encode('utf-8')) in res.data
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# so that we set the state to 'unviewed' after all the edits
client.get(url_for("diff_history_page", uuid="first"))
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -113,7 +116,7 @@ def test_trigger_functionality(client, live_server):
set_modified_original_ignore_response() set_modified_original_ignore_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -121,16 +124,22 @@ def test_trigger_functionality(client, live_server):
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' not in res.data assert b'unviewed' not in res.data
# Just to be sure.. set a regular modified change.. # Now set the content which contains the trigger text
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
set_modified_with_trigger_text_response() set_modified_with_trigger_text_response()
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' in res.data assert b'unviewed' in res.data
# https://github.com/dgtlmoon/changedetection.io/issues/616
# Apparently the actual snapshot that contains the trigger never shows
res = client.get(url_for("diff_history_page", uuid="first"))
assert b'Add to cart' in res.data
# Check the preview/highlighter, we should be able to see what we triggered on, but it should be highlighted # Check the preview/highlighter, we should be able to see what we triggered on, but it should be highlighted
res = client.get(url_for("preview_page", uuid="first")) res = client.get(url_for("preview_page", uuid="first"))
# We should be able to see what we ignored
assert b'<div class="triggered">foobar' in res.data # We should be able to see what we triggered on
assert b'<div class="triggered">Add to cart' in res.data


@@ -42,9 +42,6 @@ def test_trigger_regex_functionality(client, live_server):
) )
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -60,12 +57,14 @@ def test_trigger_regex_functionality(client, live_server):
"fetch_backend": "html_requests"}, "fetch_backend": "html_requests"},
follow_redirects=True follow_redirects=True
) )
time.sleep(sleep_time_for_fetch_thread)
# so that we set the state to 'unviewed' after all the edits
client.get(url_for("diff_history_page", uuid="first"))
with open("test-datastore/endpoint-content.txt", "w") as f: with open("test-datastore/endpoint-content.txt", "w") as f:
f.write("some new noise") f.write("some new noise")
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# It should report nothing found (nothing should match the regex) # It should report nothing found (nothing should match the regex)
@@ -75,7 +74,11 @@ def test_trigger_regex_functionality(client, live_server):
with open("test-datastore/endpoint-content.txt", "w") as f: with open("test-datastore/endpoint-content.txt", "w") as f:
f.write("regex test123<br/>\nsomething 123") f.write("regex test123<br/>\nsomething 123")
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' in res.data assert b'unviewed' in res.data
# Cleanup everything
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
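
Review note: the trigger tests in this diff use both a plain trigger ("Add to cart") and slash-wrapped patterns such as "/cool.stuff/". A minimal sketch of that distinction, assuming a trigger wrapped in forward slashes is treated as a regular expression and anything else as a plain substring. The function `trigger_matches` is hypothetical and does not model the CSS filter step.

```python
import re


def trigger_matches(trigger: str, text: str) -> bool:
    """Return True if `trigger` is found in `text`.

    Assumption: "/.../" means a regular expression; otherwise a plain substring match.
    """
    if len(trigger) > 2 and trigger.startswith('/') and trigger.endswith('/'):
        # Strip the surrounding slashes and search anywhere in the text
        return re.search(trigger[1:-1], text) is not None
    return trigger in text


if __name__ == "__main__":
    page = "<html>some new noise with cool stuff2 ok</html>"
    print(trigger_matches("/cool.stuff/", page))
    print(trigger_matches("Add to cart", page))
```

Note that in the filtered test a regex match on the whole page is not enough: the text must also fall inside the configured `css_filter`, which this sketch leaves out.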


@@ -22,10 +22,9 @@ def set_original_ignore_response():
def test_trigger_regex_functionality(client, live_server): def test_trigger_regex_functionality_with_filter(client, live_server):
live_server_setup(live_server) live_server_setup(live_server)
sleep_time_for_fetch_thread = 3 sleep_time_for_fetch_thread = 3
set_original_ignore_response() set_original_ignore_response()
@@ -42,43 +41,44 @@ def test_trigger_regex_functionality(client, live_server):
) )
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # it needs time to save the original version
client.get(url_for("api_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# It should report nothing found (just a new one shouldnt have anything)
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
### test regex with filter ### test regex with filter
res = client.post( res = client.post(
url_for("edit_page", uuid="first"), url_for("edit_page", uuid="first"),
data={"trigger_text": "/cool.stuff\d/", data={"trigger_text": "/cool.stuff/",
"url": test_url, "url": test_url,
"css_filter": '#in-here', "css_filter": '#in-here',
"fetch_backend": "html_requests"}, "fetch_backend": "html_requests"},
follow_redirects=True follow_redirects=True
) )
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
client.get(url_for("diff_history_page", uuid="first"))
# Check that we have the expected text.. but it's not in the css filter we want # Check that we have the expected text.. but it's not in the css filter we want
with open("test-datastore/endpoint-content.txt", "w") as f: with open("test-datastore/endpoint-content.txt", "w") as f:
f.write("<html>some new noise with cool stuff2 ok</html>") f.write("<html>some new noise with cool stuff2 ok</html>")
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
# It should report nothing found (nothing should match the regex and filter) # It should report nothing found (nothing should match the regex and filter)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' not in res.data assert b'unviewed' not in res.data
# now this should trigger something
with open("test-datastore/endpoint-content.txt", "w") as f: with open("test-datastore/endpoint-content.txt", "w") as f:
f.write("<html>some new noise with <span id=in-here>cool stuff6</span> ok</html>") f.write("<html>some new noise with <span id=in-here>cool stuff6</span> ok</html>")
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' in res.data assert b'unviewed' in res.data
# Cleanup everything
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
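
Review note: the tests pass the trigger as `/cool.stuff/`, a pattern wrapped in slashes. How the application interprets that value is not shown in this diff. A minimal sketch, assuming slash-wrapped text is treated as a regex and anything else as a plain substring (`trigger_matches` is a hypothetical name):

```python
import re


def trigger_matches(trigger: str, text: str) -> bool:
    """Return True if `text` satisfies `trigger`.

    Assumption: "/.../" means a regular expression, anything else is
    matched as a literal substring.
    """
    if len(trigger) > 2 and trigger.startswith("/") and trigger.endswith("/"):
        return re.search(trigger[1:-1], text) is not None
    return trigger in text


# Raw strings keep backslash classes such as \d intact without relying on
# Python passing unknown escapes through unchanged.
pattern_with_digit = r"/cool.stuff\d/"
```

Under that assumption `trigger_matches("/cool.stuff/", "cool stuff6")` is true, because `.` matches the space, while `pattern_with_digit` does not match text that lacks a trailing digit.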


@@ -44,6 +44,61 @@ def set_modified_response():
return None return None
# Handle utf-8 charset replies https://github.com/dgtlmoon/changedetection.io/pull/613
def test_check_xpath_filter_utf8(client, live_server):
filter='//item/*[self::description]'
d='''<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
<channel>
<title>rpilocator.com</title>
<link>https://rpilocator.com</link>
<description>Find Raspberry Pi Computers in Stock</description>
<lastBuildDate>Thu, 19 May 2022 23:27:30 GMT</lastBuildDate>
<image>
<url>https://rpilocator.com/favicon.png</url>
<title>rpilocator.com</title>
<link>https://rpilocator.com/</link>
<width>32</width>
<height>32</height>
</image>
<item>
<title>Stock Alert (UK): RPi CM4 - 1GB RAM, No MMC, No Wifi is In Stock at Pimoroni</title>
<description>Stock Alert (UK): RPi CM4 - 1GB RAM, No MMC, No Wifi is In Stock at Pimoroni</description>
<link>https://rpilocator.com?vendor=pimoroni&amp;utm_source=feed&amp;utm_medium=rss</link>
<category>pimoroni</category>
<category>UK</category>
<category>CM4</category>
<guid isPermaLink="false">F9FAB0D9-DF6F-40C8-8DEE5FC0646BB722</guid>
<pubDate>Thu, 19 May 2022 14:32:32 GMT</pubDate>
</item>
</channel>
</rss>'''
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(d)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True, content_type="application/rss+xml;charset=UTF-8")
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
res = client.post(
url_for("edit_page", uuid="first"),
data={"css_filter": filter, "url": test_url, "tag": "", "headers": "", 'fetch_backend': "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(3)
res = client.get(url_for("index"))
assert b'Unicode strings with encoding declaration are not supported.' not in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_check_markup_xpath_filter_restriction(client, live_server): def test_check_markup_xpath_filter_restriction(client, live_server):
sleep_time_for_fetch_thread = 3 sleep_time_for_fetch_thread = 3
@@ -65,7 +120,7 @@ def test_check_markup_xpath_filter_restriction(client, live_server):
assert b"1 Imported" in res.data assert b"1 Imported" in res.data
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
@@ -89,12 +144,14 @@ def test_check_markup_xpath_filter_restriction(client, live_server):
set_modified_response() set_modified_response()
# Trigger a check # Trigger a check
client.get(url_for("api_watch_checknow"), follow_redirects=True) client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up # Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread) time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index")) res = client.get(url_for("index"))
assert b'unviewed' not in res.data assert b'unviewed' not in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_xpath_validation(client, live_server): def test_xpath_validation(client, live_server):
@@ -117,11 +174,13 @@ def test_xpath_validation(client, live_server):
follow_redirects=True follow_redirects=True
) )
assert b"is not a valid XPath expression" in res.data assert b"is not a valid XPath expression" in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
# actually only really used by the distll.io importer, but could be handy too # actually only really used by the distll.io importer, but could be handy too
def test_check_with_prefix_css_filter(client, live_server): def test_check_with_prefix_css_filter(client, live_server):
res = client.get(url_for("api_delete", uuid="all"), follow_redirects=True) res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data assert b'Deleted' in res.data
# Give the endpoint time to spin up # Give the endpoint time to spin up
@@ -153,9 +212,7 @@ def test_check_with_prefix_css_filter(client, live_server):
follow_redirects=True follow_redirects=True
) )
with open('/tmp/fuck.html', 'wb') as f:
f.write(res.data)
assert b"Some text thats the same" in res.data #in selector assert b"Some text thats the same" in res.data #in selector
assert b"Some text that will change" not in res.data #not in selector assert b"Some text that will change" not in res.data #not in selector
client.get(url_for("api_delete", uuid="all"), follow_redirects=True) client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
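
Review note: the string asserted against in `test_check_xpath_filter_utf8` is the message lxml raises when it is handed a `str` that carries an XML encoding declaration. The usual remedy is to parse bytes instead. A minimal sketch of that idea, using the standard library's `xml.etree.ElementTree` as a stand-in (the project's actual XPath backend is not shown in this diff, and `item_descriptions` is a hypothetical helper):

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

FEED = '''<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><item>
<title>Stock Alert</title>
<description>RPi CM4 is In Stock</description>
</item></channel></rss>'''


def item_descriptions(xml_text: str) -> List[Optional[str]]:
    """Parse the feed from bytes and return each item's description text."""
    # Encoding first means the parser sees bytes, so the declared encoding
    # in the XML prolog is honoured instead of conflicting with a str.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [el.text for el in root.findall(".//item/description")]
```

The path `.//item/description` selects the same nodes as the test's `//item/*[self::description]` filter for this feed.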


@@ -1,6 +1,7 @@
#!/usr/bin/python3 #!/usr/bin/python3
from flask import make_response, request from flask import make_response, request
from flask import url_for
def set_original_response(): def set_original_response():
test_return_data = """<html> test_return_data = """<html>
@@ -55,14 +56,32 @@ def set_more_modified_response():
return None return None
# kinda funky, but works for now
def extract_api_key_from_UI(client):
import re
res = client.get(
url_for("settings_page"),
)
# <span id="api-key">{{api_key}}</span>
m = re.search('<span id="api-key">(.+?)</span>', str(res.data))
api_key = m.group(1)
return api_key.strip()
def live_server_setup(live_server): def live_server_setup(live_server):
@live_server.app.route('/test-endpoint') @live_server.app.route('/test-endpoint')
def test_endpoint(): def test_endpoint():
ctype = request.args.get('content_type') ctype = request.args.get('content_type')
status_code = request.args.get('status_code') status_code = request.args.get('status_code')
content = request.args.get('content') or None
try: try:
if content is not None:
resp = make_response(content, status_code)
resp.headers['Content-Type'] = ctype if ctype else 'text/html'
return resp
# Tried using a global var here but didn't seem to work, so reading from a file instead. # Tried using a global var here but didn't seem to work, so reading from a file instead.
with open("test-datastore/endpoint-content.txt", "r") as f: with open("test-datastore/endpoint-content.txt", "r") as f:
resp = make_response(f.read(), status_code) resp = make_response(f.read(), status_code)
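
Review note: `extract_api_key_from_UI` calls `m.group(1)` without checking that the regex matched, and it searches `str(res.data)`, which is the `repr` of a bytes object. A minimal sketch of a stricter variant, assuming the settings page still renders `<span id="api-key">…</span>` (`extract_api_key` is a hypothetical name, and it takes the response body directly so it can be exercised without a Flask client):

```python
import re
from typing import Optional

API_KEY_PATTERN = re.compile(r'<span id="api-key">(.+?)</span>', re.DOTALL)


def extract_api_key(html: bytes) -> Optional[str]:
    """Return the API key found in the settings page body, or None."""
    # Decode instead of str(bytes) so escapes such as \n are real characters.
    text = html.decode("utf-8", errors="replace")
    match = API_KEY_PATTERN.search(text)
    return match.group(1).strip() if match else None
```

Returning `None` lets the caller fail with a clear assertion message instead of an `AttributeError` on `None.group`.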


@@ -40,11 +40,11 @@ class update_worker(threading.Thread):
contents = "" contents = ""
screenshot = False screenshot = False
update_obj= {} update_obj= {}
xpath_data = False
now = time.time() now = time.time()
try: try:
changed_detected, update_obj, contents, screenshot = update_handler.run(uuid) changed_detected, update_obj, contents, screenshot, xpath_data = update_handler.run(uuid)
# Re #342 # Re #342
# In Python 3, all strings are sequences of Unicode characters. There is a bytes type that holds raw bytes. # In Python 3, all strings are sequences of Unicode characters. There is a bytes type that holds raw bytes.
# We then convert/.decode('utf-8') for the notification etc # We then convert/.decode('utf-8') for the notification etc
@@ -55,12 +55,25 @@ class update_worker(threading.Thread):
except content_fetcher.ReplyWithContentButNoText as e: except content_fetcher.ReplyWithContentButNoText as e:
# Totally fine, it's by choice - just continue on, nothing more to care about # Totally fine, it's by choice - just continue on, nothing more to care about
# Page had elements/content but no renderable text # Page had elements/content but no renderable text
if self.datastore.data['watching'].get(uuid, False) and self.datastore.data['watching'][uuid].get('css_filter'):
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': "Got HTML content but no text found (CSS / xPath Filter not found in page?)"})
else:
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': "Got HTML content but no text found."})
pass pass
except content_fetcher.EmptyReply as e: except content_fetcher.EmptyReply as e:
# Some kind of custom to-str handler in the exception handler that does this? # Some kind of custom to-str handler in the exception handler that does this?
err_text = "EmptyReply: Status Code {}".format(e.status_code) err_text = "EmptyReply - try increasing 'Wait seconds before extracting text', Status Code {}".format(e.status_code)
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text, self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
'last_check_status': e.status_code}) 'last_check_status': e.status_code})
except content_fetcher.ScreenshotUnavailable as e:
err_text = "Screenshot unavailable, page did not render fully in the expected time - try increasing 'Wait seconds before extracting text'"
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
'last_check_status': e.status_code})
except content_fetcher.PageUnloadable as e:
err_text = "Page request from server didnt respond correctly"
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
'last_check_status': e.status_code})
except Exception as e: except Exception as e:
self.app.logger.error("Exception reached processing watch UUID: %s - %s", uuid, str(e)) self.app.logger.error("Exception reached processing watch UUID: %s - %s", uuid, str(e))
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)}) self.datastore.update_watch(uuid=uuid, update_obj={'last_error': str(e)})
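
Review note: the new `ScreenshotUnavailable` and `PageUnloadable` handlers both read `e.status_code`, but this diff does not show whether those exceptions define that attribute. A minimal sketch of building the update dict defensively (the exception classes below are hypothetical stand-ins, not the ones in `content_fetcher`):

```python
from typing import Any, Dict


class EmptyReply(Exception):
    """Stand-in for an exception that carries an HTTP status code."""

    def __init__(self, status_code: int):
        super().__init__()
        self.status_code = status_code


class PageUnloadable(Exception):
    """Stand-in for an exception raised without a status code."""


def error_update(exc: Exception, message: str) -> Dict[str, Any]:
    """Build the update_obj for a failed check.

    getattr() with a default avoids an AttributeError inside the handler
    when the exception has no status_code.
    """
    return {
        "last_error": message,
        "last_check_status": getattr(exc, "status_code", None),
    }
```

Each `except` branch could then pass `error_update(e, err_text)` as `update_obj`, which also removes the repeated dict literal.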
@@ -73,9 +86,7 @@ class update_worker(threading.Thread):
# For the FIRST time we check a site, or a change detected, save the snapshot. # For the FIRST time we check a site, or a change detected, save the snapshot.
if changed_detected or not watch['last_checked']: if changed_detected or not watch['last_checked']:
# A change was detected # A change was detected
fname = self.datastore.save_history_text(watch_uuid=uuid, contents=contents) fname = watch.save_history_text(contents=contents, timestamp=str(round(time.time())))
# Should always be keyed by string(timestamp)
self.datastore.update_watch(uuid, {"history": {str(round(time.time())): fname}})
# Generally update anything interesting returned # Generally update anything interesting returned
self.datastore.update_watch(uuid=uuid, update_obj=update_obj) self.datastore.update_watch(uuid=uuid, update_obj=update_obj)
@@ -86,16 +97,10 @@ class update_worker(threading.Thread):
print (">> Change detected in UUID {} - {}".format(uuid, watch['url'])) print (">> Change detected in UUID {} - {}".format(uuid, watch['url']))
# Notifications should only trigger on the second time (first time, we gather the initial snapshot) # Notifications should only trigger on the second time (first time, we gather the initial snapshot)
if len(watch['history']) > 1: if watch.history_n >= 2:
dates = list(watch['history'].keys()) dates = list(watch.history.keys())
# Convert to int, sort and back to str again prev_fname = watch.history[dates[-2]]
# @todo replace datastore getter that does this automatically
dates = [int(i) for i in dates]
dates.sort(reverse=True)
dates = [str(i) for i in dates]
prev_fname = watch['history'][dates[1]]
# Did it have any notification alerts to hit? # Did it have any notification alerts to hit?
@@ -148,6 +153,9 @@ class update_worker(threading.Thread):
# Always save the screenshot if it's available # Always save the screenshot if it's available
if screenshot: if screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=screenshot) self.datastore.save_screenshot(watch_uuid=uuid, screenshot=screenshot)
if xpath_data:
self.datastore.save_xpath_data(watch_uuid=uuid, data=xpath_data)
self.current_uuid = None # Done self.current_uuid = None # Done
self.q.task_done() self.q.task_done()
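
Review note: the rewritten notification branch takes `dates[-2]` from `list(watch.history.keys())`, which is only the previous snapshot if `watch.history` yields its keys in ascending timestamp order. The removed code sorted explicitly, and whether the new `history` property does so is not visible in this diff. A minimal sketch of the lookup with the ordering made explicit, assuming history maps string timestamps to filenames (`previous_snapshot` is a hypothetical helper):

```python
from typing import Dict, Optional


def previous_snapshot(history: Dict[str, str]) -> Optional[str]:
    """Return the filename of the second-newest snapshot, or None."""
    if len(history) < 2:
        return None
    # Keys are stringified epoch seconds, so sort them numerically.
    ordered = sorted(history, key=int)
    return history[ordered[-2]]
```

Sorting with `key=int` matters once timestamps differ in digit count, where a plain string sort would misorder them.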

Image file (before: 894 B, after: 894 B)

BIN docs/json-diff-example.png (normal file). Binary file not shown (after: 20 KiB)

Binary file not shown (after: 22 KiB)

Image file (before: 115 KiB, after: 115 KiB)

Image file (before: 27 KiB, after: 27 KiB)

Image file (before: 190 KiB, after: 190 KiB)

Binary file not shown (after: 238 KiB)

@@ -6,6 +6,7 @@ timeago ~=1.0
inscriptis ~= 2.2 inscriptis ~= 2.2
feedgen ~= 0.9 feedgen ~= 0.9
flask-login ~= 0.5 flask-login ~= 0.5
flask_restful
pytz pytz
# Set these versions together to avoid a RequestsDependencyWarning # Set these versions together to avoid a RequestsDependencyWarning
@@ -17,7 +18,7 @@ wtforms ~= 3.0
jsonpath-ng ~= 1.5.3 jsonpath-ng ~= 1.5.3
# Notification library # Notification library
apprise ~= 0.9.8.3 apprise ~= 0.9.9
# apprise mqtt https://github.com/dgtlmoon/changedetection.io/issues/315 # apprise mqtt https://github.com/dgtlmoon/changedetection.io/issues/315
paho-mqtt paho-mqtt