Compare commits

..

171 Commits

Author SHA1 Message Date
dgtlmoon
aef24c42db extended tests 2022-10-28 14:08:29 +02:00
dgtlmoon
0f6afb9ce8 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-28 13:50:19 +02:00
Brandon Wees
ea2fcee4ad fix syntax error 2022-10-27 12:05:57 -04:00
Brandon Wees
bd79c5decd Update changedetectionio/tests/test_diff_filter_changes_as_add_delete.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 12:03:20 -04:00
Brandon Wees
74428372c3 Update changedetectionio/tests/test_diff_filter_only_deletions.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 11:57:55 -04:00
dgtlmoon
e6cdb57db0 Merge branch 'master' into diff-filters 2022-10-27 17:56:56 +02:00
dgtlmoon
ac3de58116 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-27 17:37:26 +02:00
Brandon Wees
e11c6aeb5f Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 10:59:14 -04:00
Brandon Wees
294bb7be15 remvoe unneeded import 2022-10-27 10:57:50 -04:00
Brandon Wees
c2c8bb4de8 ensure_data_dir_exists call added 2022-10-27 10:54:30 -04:00
Brandon Wees
35d950fa74 Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 10:52:42 -04:00
Brandon Wees
d24111f3a6 Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 10:52:20 -04:00
Brandon Wees
7011a04399 switching to os.path.join 2022-10-27 10:43:18 -04:00
Sandro
57f604dff1 UI - Make fetch error more readable (#1038) 2022-10-27 16:40:24 +02:00
dgtlmoon
8499468749 Update README.md 2022-10-27 15:17:14 +02:00
Brandon Wees
4364521cfc Update changedetectionio/templates/edit.html
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-27 09:11:28 -04:00
Brandon Wees
748328453e unmerge external header server. Sorry! 2022-10-27 09:03:39 -04:00
Brandon Wees
e867e89303 Update test_backup.py 2022-10-27 08:45:44 -04:00
dgtlmoon
7f6a13ea6c Re #1052 - Watch 'open' link should use any dynamic/template info (#1063) 2022-10-27 13:29:24 +02:00
dgtlmoon
9874f0cbc7 Remove accidental files 2022-10-27 12:43:02 +02:00
dgtlmoon
3e7fd9570a Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-27 12:42:28 +02:00
dgtlmoon
99f3b01013 Merge branch 'master' into diff-filters 2022-10-27 12:38:51 +02:00
dgtlmoon
72834a42fd Backups and Snapshots - Data directory now fully portable, (all paths are relative) , refactored backup zip export creation 2022-10-27 12:35:26 +02:00
Brandon Wees
43c2e71961 Merge branch 'master' into diff-filters 2022-10-26 08:18:27 -04:00
dgtlmoon
724cb17224 Re #1052 - Dynamic URLs, use variables in the URL (such as the current date, the date in a month, and other logic see https://github.com/dgtlmoon/changedetection.io/wiki/Handling-variables-in-the-watched-URL ) (#1057) 2022-10-24 23:20:39 +02:00
Brandon Wees
9946ee66d0 Merge pull request #2 from bwees/external-header-server
External header server
2022-10-24 09:08:57 -04:00
Brandon Wees
9f722cc76b Merge branch 'dgtlmoon:master' into external-header-server 2022-10-24 08:54:22 -04:00
dgtlmoon
62b6645810 Merge branch 'master' into diff-filters 2022-10-24 11:47:08 +02:00
dgtlmoon
e5e8b3bbbd Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-10-24 11:47:05 +02:00
dgtlmoon
4eb4b401a1 API - system info - allow 5 minutes grace before watch is considered 'overdue' 2022-10-23 23:12:28 +02:00
dgtlmoon
5d40e16c73 API - Adding basic system info/system state API (#1051) 2022-10-23 19:15:11 +02:00
dgtlmoon
492bbce6b6 Build - Fix syntax in container build test (#1050) 2022-10-23 16:02:13 +02:00
dgtlmoon
0394a56be5 Building - Test container build on PR 2022-10-23 15:54:19 +02:00
Entepotenz
7839551d6b Testing - Use same version of playwright while running tests as in production builds (#1047) 2022-10-23 11:26:32 +02:00
Entepotenz
9c5588c791 update path for validation in the CONTRIBUTING.md (#1046) 2022-10-23 11:25:29 +02:00
bwees
852a698629 add optional for field 2022-10-19 19:14:01 -04:00
bwees
76fd27dfab fix logic error 2022-10-19 19:10:01 -04:00
bwees
83161e4fa3 fixed string None case 2022-10-19 19:03:01 -04:00
bwees
296c7c46cb fixed empty field errors 2022-10-19 19:00:38 -04:00
bwees
0a2644d0c3 fix tests 2022-10-19 18:58:54 -04:00
bwees
495e322c9e fixed import errors 2022-10-19 18:55:05 -04:00
bwees
0d5820932f rename branch 2022-10-19 18:45:43 -04:00
Brandon Wees
408be08a48 Merge branch 'dgtlmoon:master' into external-auth 2022-10-19 18:42:27 -04:00
bwees
bad0909cc2 added external header server 2022-10-19 18:42:04 -04:00
dgtlmoon
5a43a350de History index safety check - Be sure that only valid history index lines are read (#1042) 2022-10-19 22:41:13 +02:00
Michael McMillan
3c31f023ce Option to Hide the Referer header from monitored websites. (#996) 2022-10-18 09:16:22 +02:00
Brandon Wees
c80f46308a Update edit.html 2022-10-17 15:10:36 -04:00
dgtlmoon
4cbcc59461 0.39.20.4 2022-10-17 18:36:47 +02:00
dgtlmoon
4be0260381 Better cross platform file handling in diff and preview (#1034) 2022-10-17 18:36:22 +02:00
dgtlmoon
957a3c1c16 0.39.20.3 2022-10-17 17:43:35 +02:00
dgtlmoon
85897e0bf9 Windows - diff file handling improvements (#1031) 2022-10-17 17:40:28 +02:00
dgtlmoon
63095f70ea Also include tests in pip build 2022-10-17 17:13:15 +02:00
dgtlmoon
802daa6296 Merge branch 'master' into diff-filters 2022-10-17 12:10:59 +02:00
Brandon Wees
2f641da182 Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-14 07:49:28 -04:00
dgtlmoon
8d5b0b5576 Update README.md 2022-10-12 10:51:39 +02:00
dgtlmoon
1b077abd93 0.39.20.2 2022-10-12 09:53:59 +02:00
dgtlmoon
32ea1a8721 Windows - JQ - Make library optional so it doesnt break Windows pip installs (#1009) 2022-10-12 09:53:16 +02:00
dgtlmoon
fff32cef0d Adding test - Test the 'execute JS before changedetection' (#1006) 2022-10-11 14:40:36 +02:00
Brandon Wees
4951721286 Update changedetectionio/store.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-10-11 07:59:51 -04:00
dgtlmoon
a50d6db0b2 Merge branch 'master' into diff-filters 2022-10-11 11:17:53 +02:00
dgtlmoon
8fb146f3e4 0.39.20.1 2022-10-09 23:05:35 +02:00
dgtlmoon
770b0faa45 Code - check containers build when Dockerfile or requirements.txt changes (#1005) 2022-10-09 22:58:01 +02:00
dgtlmoon
f6faa90340 Adding make to Dockerfile build as required by jq for ARM devices 2022-10-09 22:29:18 +02:00
dgtlmoon
669fd3ae0b Dont use default Requests user-agent and accept headers in playwright+selenium requests, breaks sites such as united.com. (#1004) 2022-10-09 18:25:36 +02:00
dgtlmoon
17d37fb626 0.39.20 2022-10-09 16:13:32 +02:00
Yusef Ouda
dfa7fc3a81 Adds support for jq JSON path querying engine (#1001) 2022-10-09 16:12:45 +02:00
dgtlmoon
cd467df97a Adding link to BrightData Proxy info (#1003) 2022-10-09 15:51:57 +02:00
dgtlmoon
71bc2fed82 Remove quotationspage default watch 2022-10-09 14:06:07 +02:00
Hmmbob
738fcfe01c Notification library: Bump apprise to 1.1.0 (signal, opsgenie, pagerduty, bark and mailto fixes, adds support for BulkSMS and SMSEagle) (#1002) 2022-10-09 11:42:51 +02:00
dgtlmoon
3ebb2ab9ba Selenium fetcher - screenshot should be taken after 'wait' time, not before #873 2022-09-25 11:05:07 +02:00
dgtlmoon
ac98bc9144 Upgrade Playwright to 1.26 2022-09-24 23:51:26 +02:00
dgtlmoon
3705ce6681 Render Extract Configurable Delay Seconds should also apply after executing any JS #958 2022-09-24 23:48:03 +02:00
dgtlmoon
f7ea99412f Re #958 - remove change screensize, should be in 1280x720 default, was causing "Unable to retrieve content because the page is navigating and changing the content." on some sites 2022-09-19 14:02:32 +02:00
dgtlmoon
d4715e2bc8 Tidy up proxies.json logic, adding tests (#955) 2022-09-19 13:14:35 +02:00
dgtlmoon
8567a83c47 Update README.md - Include BrightData suggestion 2022-09-16 13:21:01 +02:00
dgtlmoon
77fdf59ae3 Improve Proxy minimum time debug output 2022-09-15 17:17:07 +02:00
dgtlmoon
0e194aa4b4 Default proxy settings fixes 2022-09-15 16:58:23 +02:00
dgtlmoon
2ba55bb477 Use proxies.json instead of proxies.txt - see wiki Proxies section (#945) 2022-09-15 15:25:23 +02:00
dgtlmoon
4c759490da Upgrade Playwright to 1.25 2022-09-15 15:10:40 +02:00
dgtlmoon
58a52c1f60 Update README.md 2022-09-13 15:29:05 +02:00
dgtlmoon
22638399c1 0.39.19.1 2022-09-11 09:23:43 +02:00
dgtlmoon
e3381776f2 Notification - code tidyup 2022-09-11 09:08:13 +02:00
dgtlmoon
26e2f21a80 Watch list & notification - Adding extra list batch operations for Mute, Unmute, Reset-to-default 2022-09-10 15:29:39 +02:00
dgtlmoon
b6009ae9ff Notification - Reset defaults button should be on edit page only 2022-09-10 15:19:18 +02:00
dgtlmoon
b046d6ef32 Notification watch settings - add button to make watch use defaults (empties the settings) 2022-09-10 15:11:31 +02:00
dgtlmoon
e154a3cb7a Notification system update - set watch to use defaults if it is the same as the default 2022-09-10 15:01:11 +02:00
Jason Nader
1262700263 Fix typo (#924) 2022-09-09 12:08:01 +02:00
dgtlmoon
f55f7967ef Merge branch 'master' into diff-filters 2022-09-08 20:37:17 +02:00
dgtlmoon
434c5813b9 0.39.19 2022-09-08 20:16:35 +02:00
dgtlmoon
0a3dc7d77b Update README.md 2022-09-08 20:15:23 +02:00
dgtlmoon
a7e296de65 Tweaks to python PIP readme 2022-09-08 17:53:58 +02:00
dgtlmoon
bd0fbaaf27 Use play and pause separate icons (#919) 2022-09-08 17:50:45 +02:00
dgtlmoon
0c111bd9ae Further notification settings refinement (#910) 2022-09-08 09:10:04 +02:00
dgtlmoon
ed9ac0b7fb Reliability improvement - Check watch UUID exists when reporting missing path (#915) 2022-09-07 23:04:35 +02:00
dgtlmoon
743a3069bb repair pip readme 2022-09-04 15:23:32 +02:00
dgtlmoon
fefc39427b Test improvement - Visual selector data loads as JSON (#895) 2022-08-31 16:32:50 +02:00
dgtlmoon
2c6faa7c4e Cleaner separation of watch/global notification settings (#894) 2022-08-31 15:49:13 +02:00
dgtlmoon
6168cd2899 Code maintenance - Removing old function (#875) 2022-08-31 15:23:10 +02:00
dgtlmoon
f3c7c969d8 Show screenshot age in [preview] 2022-08-25 11:18:00 +02:00
dgtlmoon
1355c2a245 Update README.md 2022-08-25 11:00:20 +02:00
dgtlmoon
96cf1a06df Update README.md 2022-08-24 23:26:55 +02:00
dgtlmoon
019a4a0375 Update README.md 2022-08-24 09:52:11 +02:00
dgtlmoon
db2f7b80ea Update bug_report.md 2022-08-20 15:30:53 +02:00
dgtlmoon
bfabd7b094 Update bug_report.md 2022-08-20 15:28:01 +02:00
dgtlmoon
d92dbfe765 Update README.md 2022-08-20 00:39:37 +02:00
dgtlmoon
67d2441334 0.39.18 2022-08-19 11:37:26 +02:00
dgtlmoon
3c30bc02d5 More data saving pre-checks (#863) 2022-08-18 23:25:23 +02:00
dgtlmoon
dcb54117d5 Update screenshot 2022-08-18 15:29:34 +02:00
dgtlmoon
b1e32275dc Checkbox operations - reorder buttons for safety 2022-08-18 15:10:05 +02:00
dgtlmoon
e2a6865932 UI feature - Basic checkbox/group operations (#861) 2022-08-18 14:48:21 +02:00
dgtlmoon
f04adb7202 Bug fix - automatically queued watch checks weren't always being processed sequentially 2022-08-18 13:41:28 +02:00
dgtlmoon
1193a7f22c Playwright - Support proxy auth mechanisms (#859) 2022-08-18 09:46:28 +02:00
dgtlmoon
0b976827bb Update README.md 2022-08-17 22:11:04 +02:00
dgtlmoon
280e916033 Update README.md 2022-08-17 22:00:46 +02:00
bwees
13a96e93a2 fix linter errors after merge 2022-08-17 09:33:34 -04:00
dgtlmoon
ed93d51ae8 Merge branch 'master' into diff-filters 2022-08-17 15:26:47 +02:00
dgtlmoon
5494e61a05 Skip processing when watch was deleted 2022-08-17 13:29:32 +02:00
dgtlmoon
e461c0b819 Playwright fetcher didn't report low level HTTP errors correctly (like Connection Refused) (#852) 2022-08-17 13:25:08 +02:00
dgtlmoon
d67c654f37 Be sure visual-selector data is set when xPath/CSS filter is not yet found (#851) 2022-08-17 13:21:06 +02:00
dgtlmoon
06ab34b6af Visual selector data not being saved by refactor 2022-08-16 16:53:15 +02:00
dgtlmoon
ba8676c4ba 'Save chrome screenshot' checkbox never used, removing, we always save the screenshot. (#844) 2022-08-16 16:18:09 +02:00
dgtlmoon
4899c1a4f9 Crash fix: Data store sub-directories werent always being created when needed (#842) 2022-08-16 15:17:36 +02:00
dgtlmoon
9bff1582f7 Make the table header easier to understand when sorting (#840) 2022-08-16 12:49:08 +02:00
bwees
db28b30b1b add test for situation found in https://github.com/dgtlmoon/changedetection.io/pull/749#issuecomment-1200154861 2022-07-30 09:14:06 -04:00
bwees
6bdcdfbaea fixed replace bug in get_diff_types 2022-07-30 09:05:55 -04:00
bwees
0efc504c5d change form wording 2022-07-30 08:47:07 -04:00
bwees
628cb2ad44 added form validation for diff filter checkboxes 2022-07-30 08:30:56 -04:00
Brandon Wees
604f2eaf02 remove unneeded debug statements 2022-07-29 08:40:47 -04:00
bwees
2a649afd22 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-07-29 08:39:32 -04:00
bwees
526f8fac45 remove unneeded import 2022-07-29 08:39:30 -04:00
dgtlmoon
e76f5efee3 Merge branch 'master' into diff-filters 2022-07-29 12:54:54 +02:00
bwees
7ac0620099 fixed merge conflict with latest version 2022-07-28 20:52:01 -04:00
bwees
14765b46bd fix broken logic 2022-07-28 20:48:20 -04:00
bwees
4f3a15e68d clean up test 2022-07-28 20:48:14 -04:00
bwees
c6207f729d added middleware to fix broken default checkboxes during tests 2022-07-28 20:37:20 -04:00
bwees
fcc1a72d30 changed tests 2022-07-28 20:37:03 -04:00
bwees
6f2b7ceddb changed UI to have checkboxes instead of dropdown 2022-07-28 20:36:53 -04:00
bwees
1e265b312e fix macos test running 2022-07-28 20:33:01 -04:00
Brandon Wees
f379dda13d Apply suggestions from code review
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-17 11:59:20 -04:00
Brandon Wees
4a88589a27 Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-17 11:58:46 -04:00
bwees
cac53a76c0 added antoher step to test to cover case as described https://github.com/dgtlmoon/changedetection.io/pull/749#issuecomment-1186209681 2022-07-16 19:13:20 -04:00
bwees
8dbf2257d3 added datastore migration step 2022-07-16 19:08:57 -04:00
bwees
c0fb051dde changed get_previous_text to not create the file if it does not exist 2022-07-16 16:02:05 -04:00
bwees
cf09f03d32 fix import statements 2022-07-16 15:54:44 -04:00
Brandon Wees
237cf7db4f Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 15:49:03 -04:00
bwees
a8e24dab01 Merge branch 'diff-filters' of https://github.com/bwees/changedetection.io into diff-filters 2022-07-16 15:48:44 -04:00
bwees
5c9b7353d4 fixed difflib import 2022-07-16 15:48:43 -04:00
Brandon Wees
1e22949e3d Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 15:48:20 -04:00
Brandon Wees
68e1a64474 Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 15:46:55 -04:00
Brandon Wees
151c2dab3a Update changedetectionio/templates/edit.html
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 10:38:45 -04:00
Brandon Wees
3e43d7ad1a Update changedetectionio/templates/edit.html
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 10:38:27 -04:00
Brandon Wees
58cb7fbc2a Update changedetectionio/model/Watch.py
Co-authored-by: dgtlmoon <leigh@morresi.net>
2022-07-16 10:37:05 -04:00
Brandon Wees
23452a1599 Remove discord change (look at https://github.com/dgtlmoon/changedetection.io/pull/753 for this change) 2022-07-13 18:05:02 -04:00
bwees
7fb432bf06 Created working tests 2022-07-13 17:58:30 -04:00
bwees
dc3fc6cfdf used a drop down menu and rewrote checking code to fit GUI description 2022-07-13 17:58:13 -04:00
bwees
8ee42d2403 fixed my breaking change 2022-07-13 17:57:39 -04:00
bwees
8d9cac4c38 remove my tests because they wont run 2022-07-12 21:16:45 -04:00
bwees
374bb3824f fix test to include the new previous.txt file 2022-07-12 21:11:42 -04:00
bwees
91d8600b19 fixed test naming 2022-07-12 20:53:22 -04:00
bwees
7b0ddc23d3 workaround for diff filter checkboxes getting changed on creation of form object 2022-07-12 20:40:54 -04:00
bwees
ab74377be0 fixed file based text saving system 2022-07-12 18:28:29 -04:00
bwees
2196d120a9 rewrote and broke out tests to simplify 2022-07-12 18:27:51 -04:00
bwees
5dca59a4a0 switched to file handling of previous_text 2022-07-12 17:59:46 -04:00
bwees
ee8042b54e Fix boolean value being sent to difflib 2022-07-12 16:56:59 -04:00
bwees
4c3f233d21 Made unit test 2022-07-11 20:52:18 -04:00
bwees
159b062cb3 removed modify due to the way difflib reacts to changes 2022-07-11 20:37:01 -04:00
bwees
83565787ae added logic for filtering based on diff attributes 2022-07-11 20:35:30 -04:00
bwees
bdab4f5e09 added diff compare function to watch class 2022-07-11 20:34:33 -04:00
bwees
69075a81c5 updated data model 2022-07-11 19:27:05 -04:00
bwees
04746cc706 Added initial UI code 2022-07-11 19:26:56 -04:00
Brandon Wees
234494d907 Added character truncation rule to URL starting with https://discord.com/api/webhooks 2022-07-11 18:02:04 -04:00
56 changed files with 1872 additions and 390 deletions

View File

@@ -7,6 +7,20 @@ assignees: 'dgtlmoon'
---
**DO NOT USE THIS FORM TO REPORT THAT A PARTICULAR WEBSITE IS NOT SCRAPING/WATCHING AS EXPECTED**
This form is only for direct bugs and feature requests todo directly with the software.
Please report watched websites (full URL and _any_ settings) that do not work with changedetection.io as expected [**IN THE DISCUSSION FORUMS**](https://github.com/dgtlmoon/changedetection.io/discussions) or your report will be deleted
CONSIDER TAKING OUT A SUBSCRIPTION FOR A SMALL PRICE PER MONTH, YOU GET THE BENEFIT OF USING OUR PAID PROXIES AND FURTHERING THE DEVELOPMENT OF CHANGEDETECTION.IO
THANK YOU
**Describe the bug**
A clear and concise description of what the bug is.

View File

@@ -0,0 +1,55 @@
name: ChangeDetection.io Container Build Test
# Triggers the workflow on push or pull request events
# This line doesnt work, even tho it is the documented one
#on: [push, pull_request]
on:
push:
paths:
- requirements.txt
- Dockerfile
pull_request:
paths:
- requirements.txt
- Dockerfile
# Changes to requirements.txt packages and Dockerfile may or may not always be compatible with arm etc, so worth testing
# @todo: some kind of path filter for requirements.txt and Dockerfile
jobs:
test-container-build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.9
uses: actions/setup-python@v2
with:
python-version: 3.9
# Just test that the build works, some libraries won't compile on ARM/rPi etc
- name: Set up QEMU
uses: docker/setup-qemu-action@v1
with:
image: tonistiigi/binfmt:latest
platforms: all
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v1
with:
install: true
version: latest
driver-opts: image=moby/buildkit:master
- name: Test that the docker containers can build
id: docker_build
uses: docker/build-push-action@v2
# https://github.com/docker/build-push-action#customizing
with:
context: ./
file: ./Dockerfile
platforms: linux/arm/v7,linux/arm/v6,linux/amd64,linux/arm64,
cache-from: type=local,src=/tmp/.buildx-cache
cache-to: type=local,dest=/tmp/.buildx-cache

View File

@@ -1,28 +1,25 @@
name: ChangeDetection.io Test
name: ChangeDetection.io App Test
# Triggers the workflow on push or pull request events
on: [push, pull_request]
jobs:
test-build:
test-application:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.9
uses: actions/setup-python@v2
with:
python-version: 3.9
- name: Show env vars
run: set
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install flake8 pytest
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
if [ -f requirements-dev.txt ]; then pip install -r requirements-dev.txt; fi
- name: Lint with flake8
run: |
# stop the build if there are Python syntax errors or undefined names
@@ -39,7 +36,4 @@ jobs:
# Each test is totally isolated and performs its own cleanup/reset
cd changedetectionio; ./run_all_tests.sh
# https://github.com/docker/build-push-action/blob/master/docs/advanced/test-before-push.md ?
# https://github.com/docker/buildx/issues/59 ? Needs to be one platform?
# https://github.com/docker/buildx/issues/495#issuecomment-918925854

View File

@@ -6,7 +6,7 @@ Otherwise, it's always best to PR into the `dev` branch.
Please be sure that all new functionality has a matching test!
Use `pytest` to validate/test, you can run the existing tests as `pytest tests/test_notifications.py` for example
Use `pytest` to validate/test, you can run the existing tests as `pytest tests/test_notification.py` for example
```
pip3 install -r requirements-dev

View File

@@ -5,13 +5,14 @@ FROM python:3.8-slim as builder
ARG CRYPTOGRAPHY_DONT_BUILD_RUST=1
RUN apt-get update && apt-get install -y --no-install-recommends \
libssl-dev \
libffi-dev \
g++ \
gcc \
libc-dev \
libffi-dev \
libssl-dev \
libxslt-dev \
zlib1g-dev \
g++
make \
zlib1g-dev
RUN mkdir /install
WORKDIR /install
@@ -22,9 +23,14 @@ RUN pip install --target=/dependencies -r /requirements.txt
# Playwright is an alternative to Selenium
# Excluded this package from requirements.txt to prevent arm/v6 and arm/v7 builds from failing
RUN pip install --target=/dependencies playwright~=1.24 \
RUN pip install --target=/dependencies playwright~=1.26 \
|| echo "WARN: Failed to install Playwright. The application can still run, but the Playwright option will be disabled."
RUN pip install --target=/dependencies jq~=1.3 \
|| echo "WARN: Failed to install JQ. The application can still run, but the Jq: filter option will be disabled."
# Final image stage
FROM python:3.8-slim
@@ -58,6 +64,7 @@ EXPOSE 5000
# The actual flask app
COPY changedetectionio /app/changedetectionio
# The eventlet server wrapper
COPY changedetection.py /app/changedetection.py

View File

@@ -2,6 +2,7 @@ recursive-include changedetectionio/api *
recursive-include changedetectionio/templates *
recursive-include changedetectionio/static *
recursive-include changedetectionio/model *
recursive-include changedetectionio/tests *
include changedetection.py
global-exclude *.pyc
global-exclude node_modules

View File

@@ -1,45 +1,48 @@
# changedetection.io
![changedetection.io](https://github.com/dgtlmoon/changedetection.io/actions/workflows/test-only.yml/badge.svg?branch=master)
<a href="https://hub.docker.com/r/dgtlmoon/changedetection.io" target="_blank" title="Change detection docker hub">
<img src="https://img.shields.io/docker/pulls/dgtlmoon/changedetection.io" alt="Docker Pulls"/>
</a>
<a href="https://hub.docker.com/r/dgtlmoon/changedetection.io" target="_blank" title="Change detection docker hub">
<img src="https://img.shields.io/github/v/release/dgtlmoon/changedetection.io" alt="Change detection latest tag version"/>
</a>
## Web Site Change Detection, Monitoring and Notification.
## Self-hosted open source change monitoring of web pages.
Live your data-life pro-actively, track website content changes and receive notifications via Discord, Email, Slack, Telegram and 70+ more
_Know when web pages change! Stay ontop of new information!_
Live your data-life *pro-actively* instead of *re-actively*, do not rely on manipulative social media for consuming important information.
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />](https://lemonade.changedetection.io/start?src=pip)
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />
**Get your own private instance now! Let us host it for you!**
[**Try our $6.99/month subscription - unlimited checks, watches and notifications!**](https://lemonade.changedetection.io/start), choose from different geographical locations, let us handle everything for you.
[**Don't have time? Let us host it for you! try our extremely affordable subscription use our proxies and support!**](https://lemonade.changedetection.io/start)
#### Example use cases
Know when ...
- Government department updates (changes are often only on their websites)
- Local government news (changes are often only on their websites)
- Products and services have a change in pricing
- _Out of stock notification_ and _Back In stock notification_
- Governmental department updates (changes are often only on their websites)
- New software releases, security advisories when you're not on their mailing list.
- Festivals with changes
- Realestate listing changes
- Know when your favourite whiskey is on sale, or other special deals are announced before anyone else
- COVID related news from government websites
- University/organisation news from their website
- Detect and monitor changes in JSON API responses
- API monitoring and alerting
- JSON API monitoring and alerting
- Changes in legal and other documents
- Trigger API calls via notifications when text appears on a website
- Glue together APIs using the JSON filter and JSON notifications
- Create RSS feeds based on changes in web content
- Monitor HTML source code for unexpected changes, strengthen your PCI compliance
- You have a very sensitive list of URLs to watch and you do _not_ want to use the paid alternatives. (Remember, _you_ are the product)
_Need an actual Chrome runner with Javascript support? We support fetching via WebDriver and Playwright!</a>_
#### Key Features
- Lots of trigger filters, such as "Trigger on text", "Remove text by selector", "Ignore text", "Extract text", also using regular-expressions!
- Target elements with xPath and CSS Selectors, Easily monitor complex JSON with JSONPath or jq
- Switch between fast non-JS and Chrome JS based "fetchers"
- Easily specify how often a site should be checked
- Execute JS before extracting text (Good for logging in, see examples in the UI!)
- Override Request Headers, Specify `POST` or `GET` and other methods
- Use the "Visual Selector" to help target specific elements
**Get monitoring now!**
```bash
$ pip3 install changedetection.io
$ pip3 install changedetection.io
```
Specify a target for the *datastore path* with `-d` (required) and a *listening port* with `-p` (defaults to `5000`)
@@ -51,17 +54,5 @@ $ changedetection.io -d /path/to/empty/data/dir -p 5000
Then visit http://127.0.0.1:5000 , You should now be able to access the UI.
### Features
- Website monitoring
- Change detection of content and analyses
- Filters on change (Select by CSS or JSON)
- Triggers (Wait for text, wait for regex)
- Notification support
- JSON API Monitoring
- Parse JSON embedded in HTML
- (Reverse) Proxy support
- Javascript support via WebDriver
- RaspberriPi (arm v6/v7/64 support)
See https://github.com/dgtlmoon/changedetection.io for more information.

View File

@@ -1,29 +1,25 @@
# changedetection.io
## Web Site Change Detection, Monitoring and Notification.
Live your data-life pro-actively, track website content changes and receive notifications via Discord, Email, Slack, Telegram and 70+ more
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />](https://lemonade.changedetection.io/start?src=github)
[![Release Version][release-shield]][release-link] [![Docker Pulls][docker-pulls]][docker-link] [![License][license-shield]](LICENSE.md)
![changedetection.io](https://github.com/dgtlmoon/changedetection.io/actions/workflows/test-only.yml/badge.svg?branch=master)
## Web Site Change Detection, Monitoring and Notification - Self-Hosted or SaaS.
Know when important content changes, we support notifications via Discord, Telegram, Home-Assistant, Slack, Email and 70+ more
_Know when web pages change! Stay ontop of new information! get notifications when important website content changes_
[**Don't have time? Let us host it for you! try our $6.99/month subscription - use our proxies and support!**](https://lemonade.changedetection.io/start) , _half the price of other website change monitoring services and comes with unlimited watches & checks!_
Live your data-life *pro-actively* instead of *re-actively*.
Free, Open-source web page monitoring, notification and change detection. Don't have time? [**Try our $6.99/month subscription - unlimited checks and watches!**](https://lemonade.changedetection.io/start)
- Chrome browser included.
- Super fast, no registration needed setup.
- Start watching and receiving change notifications instantly.
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot.png" style="max-width:100%;" alt="Self-hosted web page change monitoring" title="Self-hosted web page change monitoring" />](https://lemonade.changedetection.io/start)
Easily see what changed, examine by word, line, or individual character.
**Get your own private instance now! Let us host it for you!**
[**Try our $6.99/month subscription - unlimited checks and watches!**](https://lemonade.changedetection.io/start) , _half the price of other website change monitoring services and comes with unlimited watches & checks!_
- Automatic Updates, Automatic Backups, No Heroku "paused application", don't miss a change!
- Javascript browser included
- Unlimited checks and watches!
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot-diff.png" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />
#### Example use cases
@@ -46,16 +42,23 @@ Free, Open-source web page monitoring, notification and change detection. Don't
- Monitor HTML source code for unexpected changes, strengthen your PCI compliance
- You have a very sensitive list of URLs to watch and you do _not_ want to use the paid alternatives. (Remember, _you_ are the product)
_Need an actual Chrome runner with Javascript support? We support fetching via WebDriver!</a>_
_Need an actual Chrome runner with Javascript support? We support fetching via WebDriver and Playwright!</a>_
#### Key Features
- Lots of trigger filters, such as "Trigger on text", "Remove text by selector", "Ignore text", "Extract text", also using regular-expressions!
- Target elements with xPath and CSS Selectors, Easily monitor complex JSON with JSONPath or jq
- Switch between fast non-JS and Chrome JS based "fetchers"
- Easily specify how often a site should be checked
- Execute JS before extracting text (Good for logging in, see examples in the UI!)
- Override Request Headers, Specify `POST` or `GET` and other methods
- Use the "Visual Selector" to help target specific elements
- Configurable [proxy per watch](https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration)
We [recommend and use Bright Data](https://brightdata.grsm.io/n0r16zf7eivq) global proxy services, Bright Data will match any first deposit up to $100 using our signup link.
## Screenshots
### Examine differences in content.
Easily see what changed, examine by word, line, or individual character.
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot-diff.png" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />
Please :star: star :star: this project and help it grow! https://github.com/dgtlmoon/changedetection.io/
### Filter by elements using the Visual Selector tool.
@@ -118,8 +121,8 @@ See the wiki for more information https://github.com/dgtlmoon/changedetection.io
## Filters
XPath, JSONPath and CSS support comes baked in! You can be as specific as you need, use XPath exported from various XPath element query creation tools.
XPath, JSONPath, jq, and CSS support comes baked in! You can be as specific as you need, use XPath exported from various XPath element query creation tools.
(We support LXML `re:test`, `re:math` and `re:replace`.)
## Notifications
@@ -148,7 +151,7 @@ Now you can also customise your notification content!
## JSON API Monitoring
Detect changes and monitor data in JSON API's by using the built-in JSONPath selectors as a filter / selector.
Detect changes and monitor data in JSON API's by using either JSONPath or jq to filter, parse, and restructure JSON as needed.
![image](https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/json-filter-field-example.png)
@@ -156,9 +159,20 @@ This will re-parse the JSON and apply formatting to the text, making it super ea
![image](https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/json-diff-example.png)
### JSONPath or jq?
For more complex parsing, filtering, and modifying of JSON data, jq is recommended due to the built-in operators and functions. Refer to the [documentation](https://stedolan.github.io/jq/manual/) for more specifc information on jq.
One big advantage of `jq` is that you can use logic in your JSON filter, such as filters to only show items that have a value greater than/less than etc.
See the wiki https://github.com/dgtlmoon/changedetection.io/wiki/JSON-Selector-Filter-help for more information and examples
Note: `jq` library must be added separately (`pip3 install jq`)
### Parse JSON embedded in HTML!
When you enable a `json:` filter, you can even automatically extract and parse embedded JSON inside a HTML page! Amazingly handy for sites that build content based on JSON, such as many e-commerce websites.
When you enable a `json:` or `jq:` filter, you can even automatically extract and parse embedded JSON inside a HTML page! Amazingly handy for sites that build content based on JSON, such as many e-commerce websites.
```
<html>
@@ -168,11 +182,11 @@ When you enable a `json:` filter, you can even automatically extract and parse e
</script>
```
`json:$.price` would give `23.50`, or you can extract the whole structure
`json:$.price` or `jq:.price` would give `23.50`, or you can extract the whole structure
## Proxy configuration
## Proxy Configuration
See the wiki https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration
See the wiki https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration , we also support using [BrightData proxy services where possible]( https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration#brightdata-proxy-support)
## Raspberry Pi support?

View File

@@ -1,16 +1,5 @@
#!/usr/bin/python3
# @todo logging
# @todo extra options for url like , verify=False etc.
# @todo enable https://urllib3.readthedocs.io/en/latest/user-guide.html#ssl as option?
# @todo option for interval day/6 hour/etc
# @todo on change detected, config for calling some API
# @todo fetch title into json
# https://distill.io/features
# proxy per check
# - flask_cors, itsdangerous,MarkupSafe
import datetime
import os
import queue
@@ -44,7 +33,7 @@ from flask_wtf import CSRFProtect
from changedetectionio import html_tools
from changedetectionio.api import api_v1
__version__ = '0.39.17.2'
__version__ = '0.39.20.4'
datastore = None
@@ -205,6 +194,9 @@ def changedetection_app(config=None, datastore_o=None):
watch_api.add_resource(api_v1.Watch, '/api/v1/watch/<string:uuid>',
resource_class_kwargs={'datastore': datastore, 'update_q': update_q})
watch_api.add_resource(api_v1.SystemInfo, '/api/v1/systeminfo',
resource_class_kwargs={'datastore': datastore, 'update_q': update_q})
@@ -503,7 +495,7 @@ def changedetection_app(config=None, datastore_o=None):
from changedetectionio import fetch_site_status
# Get the most recent one
newest_history_key = datastore.get_val(uuid, 'newest_history_key')
newest_history_key = datastore.data['watching'][uuid].get('newest_history_key')
# 0 means that theres only one, so that there should be no 'unviewed' history available
if newest_history_key == 0:
@@ -552,16 +544,13 @@ def changedetection_app(config=None, datastore_o=None):
# be sure we update with a copy instead of accidently editing the live object by reference
default = deepcopy(datastore.data['watching'][uuid])
# Show system wide default if nothing configured
if datastore.data['watching'][uuid]['fetch_backend'] is None:
default['fetch_backend'] = datastore.data['settings']['application']['fetch_backend']
# Show system wide default if nothing configured
if all(value == 0 or value == None for value in datastore.data['watching'][uuid]['time_between_check'].values()):
default['time_between_check'] = deepcopy(datastore.data['settings']['requests']['time_between_check'])
# Defaults for proxy choice
if datastore.proxy_list is not None: # When enabled
# @todo
# Radio needs '' not None, or incase that the chosen one no longer exists
if default['proxy'] is None or not any(default['proxy'] in tup for tup in datastore.proxy_list):
default['proxy'] = ''
@@ -575,7 +564,10 @@ def changedetection_app(config=None, datastore_o=None):
# @todo - Couldn't get setattr() etc dynamic addition working, so remove it instead
del form.proxy
else:
form.proxy.choices = [('', 'Default')] + datastore.proxy_list
form.proxy.choices = [('', 'Default')]
for p in datastore.proxy_list:
form.proxy.choices.append(tuple((p, datastore.proxy_list[p]['label'])))
if request.method == 'POST' and form.validate():
extra_update_obj = {}
@@ -598,10 +590,8 @@ def changedetection_app(config=None, datastore_o=None):
if form.fetch_backend.data == datastore.data['settings']['application']['fetch_backend']:
extra_update_obj['fetch_backend'] = None
# Notification URLs
datastore.data['watching'][uuid]['notification_urls'] = form.notification_urls.data
# Ignore text
# Ignore text
form_ignore_text = form.ignore_text.data
datastore.data['watching'][uuid]['ignore_text'] = form_ignore_text
@@ -649,18 +639,27 @@ def changedetection_app(config=None, datastore_o=None):
# Only works reliably with Playwright
visualselector_enabled = os.getenv('PLAYWRIGHT_DRIVER_URL', False) and default['fetch_backend'] == 'html_webdriver'
# JQ is difficult to install on windows and must be manually added (outside requirements.txt)
jq_support = True
try:
import jq
except ModuleNotFoundError:
jq_support = False
output = render_template("edit.html",
uuid=uuid,
watch=datastore.data['watching'][uuid],
form=form,
has_empty_checktime=using_default_check_time,
using_global_webdriver_wait=default['webdriver_delay'] is None,
current_base_url=datastore.data['settings']['application']['base_url'],
emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False),
form=form,
has_default_notification_urls=True if len(datastore.data['settings']['application']['notification_urls']) else False,
has_empty_checktime=using_default_check_time,
jq_support=jq_support,
playwright_enabled=os.getenv('PLAYWRIGHT_DRIVER_URL', False),
settings_application=datastore.data['settings']['application'],
using_global_webdriver_wait=default['webdriver_delay'] is None,
uuid=uuid,
visualselector_data_is_ready=visualselector_data_is_ready,
visualselector_enabled=visualselector_enabled,
playwright_enabled=os.getenv('PLAYWRIGHT_DRIVER_URL', False)
watch=datastore.data['watching'][uuid],
)
return output
@@ -672,26 +671,34 @@ def changedetection_app(config=None, datastore_o=None):
default = deepcopy(datastore.data['settings'])
if datastore.proxy_list is not None:
available_proxies = list(datastore.proxy_list.keys())
# When enabled
system_proxy = datastore.data['settings']['requests']['proxy']
# In the case it doesnt exist anymore
if not any([system_proxy in tup for tup in datastore.proxy_list]):
if not system_proxy in available_proxies:
system_proxy = None
default['requests']['proxy'] = system_proxy if system_proxy is not None else datastore.proxy_list[0][0]
default['requests']['proxy'] = system_proxy if system_proxy is not None else available_proxies[0]
# Used by the form handler to keep or remove the proxy settings
default['proxy_list'] = datastore.proxy_list
default['proxy_list'] = available_proxies[0]
# Don't use form.data on POST so that it doesnt overrid the checkbox status from the POST status
form = forms.globalSettingsForm(formdata=request.form if request.method == 'POST' else None,
data=default
)
# Remove the last option 'System default'
form.application.form.notification_format.choices.pop()
if datastore.proxy_list is None:
# @todo - Couldn't get setattr() etc dynamic addition working, so remove it instead
del form.requests.form.proxy
else:
form.requests.form.proxy.choices = datastore.proxy_list
form.requests.form.proxy.choices = []
for p in datastore.proxy_list:
form.requests.form.proxy.choices.append(tuple((p, datastore.proxy_list[p]['label'])))
if request.method == 'POST':
# Password unset is a GET, but we can lock the session to a salted env password to always need the password
@@ -732,7 +739,8 @@ def changedetection_app(config=None, datastore_o=None):
current_base_url = datastore.data['settings']['application']['base_url'],
hide_remove_pass=os.getenv("SALTED_PASS", False),
api_key=datastore.data['settings']['application'].get('api_access_token'),
emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False))
emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False),
settings_application=datastore.data['settings']['application'])
return output
@@ -811,8 +819,10 @@ def changedetection_app(config=None, datastore_o=None):
newest_file = history[dates[-1]]
# Read as binary and force decode as UTF-8
# Windows may fail decode in python if we just use 'r' mode (chardet decode exception)
try:
with open(newest_file, 'r') as f:
with open(newest_file, 'r', encoding='utf-8', errors='ignore') as f:
newest_version_file_contents = f.read()
except Exception as e:
newest_version_file_contents = "Unable to read {}.\n".format(newest_file)
@@ -825,7 +835,7 @@ def changedetection_app(config=None, datastore_o=None):
previous_file = history[dates[-2]]
try:
with open(previous_file, 'r') as f:
with open(previous_file, 'r', encoding='utf-8', errors='ignore') as f:
previous_version_file_contents = f.read()
except Exception as e:
previous_version_file_contents = "Unable to read {}.\n".format(previous_file)
@@ -902,7 +912,7 @@ def changedetection_app(config=None, datastore_o=None):
timestamp = list(watch.history.keys())[-1]
filename = watch.history[timestamp]
try:
with open(filename, 'r') as f:
with open(filename, 'r', encoding='utf-8', errors='ignore') as f:
tmp = f.readlines()
# Get what needs to be highlighted
@@ -977,9 +987,6 @@ def changedetection_app(config=None, datastore_o=None):
# create a ZipFile object
backupname = "changedetection-backup-{}.zip".format(int(time.time()))
# We only care about UUIDS from the current index file
uuids = list(datastore.data['watching'].keys())
backup_filepath = os.path.join(datastore_o.datastore_path, backupname)
with zipfile.ZipFile(backup_filepath, "w",
@@ -995,12 +1002,12 @@ def changedetection_app(config=None, datastore_o=None):
# Add the flask app secret
zipObj.write(os.path.join(datastore_o.datastore_path, "secret.txt"), arcname="secret.txt")
# Add any snapshot data we find, use the full path to access the file, but make the file 'relative' in the Zip.
for txt_file_path in Path(datastore_o.datastore_path).rglob('*.txt'):
parent_p = txt_file_path.parent
if parent_p.name in uuids:
zipObj.write(txt_file_path,
arcname=str(txt_file_path).replace(datastore_o.datastore_path, ''),
# Add any data in the watch data directory.
for uuid, w in datastore.data['watching'].items():
for f in Path(w.watch_data_dir).glob('*'):
zipObj.write(f,
# Use the full path to access the file, but make the file 'relative' in the Zip.
arcname=os.path.join(f.parts[-2], f.parts[-1]),
compress_type=zipfile.ZIP_DEFLATED,
compresslevel=8)
@@ -1186,6 +1193,63 @@ def changedetection_app(config=None, datastore_o=None):
flash("{} watches are queued for rechecking.".format(i))
return redirect(url_for('index', tag=tag))
@app.route("/form/checkbox-operations", methods=['POST'])
@login_required
def form_watch_list_checkbox_operations():
op = request.form['op']
uuids = request.form.getlist('uuids')
if (op == 'delete'):
for uuid in uuids:
uuid = uuid.strip()
if datastore.data['watching'].get(uuid):
datastore.delete(uuid.strip())
flash("{} watches deleted".format(len(uuids)))
elif (op == 'pause'):
for uuid in uuids:
uuid = uuid.strip()
if datastore.data['watching'].get(uuid):
datastore.data['watching'][uuid.strip()]['paused'] = True
flash("{} watches paused".format(len(uuids)))
elif (op == 'unpause'):
for uuid in uuids:
uuid = uuid.strip()
if datastore.data['watching'].get(uuid):
datastore.data['watching'][uuid.strip()]['paused'] = False
flash("{} watches unpaused".format(len(uuids)))
elif (op == 'mute'):
for uuid in uuids:
uuid = uuid.strip()
if datastore.data['watching'].get(uuid):
datastore.data['watching'][uuid.strip()]['notification_muted'] = True
flash("{} watches muted".format(len(uuids)))
elif (op == 'unmute'):
for uuid in uuids:
uuid = uuid.strip()
if datastore.data['watching'].get(uuid):
datastore.data['watching'][uuid.strip()]['notification_muted'] = False
flash("{} watches un-muted".format(len(uuids)))
elif (op == 'notification-default'):
from changedetectionio.notification import (
default_notification_format_for_watch
)
for uuid in uuids:
uuid = uuid.strip()
if datastore.data['watching'].get(uuid):
datastore.data['watching'][uuid.strip()]['notification_title'] = None
datastore.data['watching'][uuid.strip()]['notification_body'] = None
datastore.data['watching'][uuid.strip()]['notification_urls'] = []
datastore.data['watching'][uuid.strip()]['notification_format'] = default_notification_format_for_watch
flash("{} watches set to use default notification settings".format(len(uuids)))
return redirect(url_for('index'))
@app.route("/api/share-url", methods=['GET'])
@login_required
def form_share_put_watch():
@@ -1321,6 +1385,8 @@ def ticker_thread_check_time_launch_checks():
import random
from changedetectionio import update_worker
proxy_last_called_time = {}
recheck_time_minimum_seconds = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 20))
print("System env MINIMUM_SECONDS_RECHECK_TIME", recheck_time_minimum_seconds)
@@ -1381,17 +1447,42 @@ def ticker_thread_check_time_launch_checks():
if watch.jitter_seconds == 0:
watch.jitter_seconds = random.uniform(-abs(jitter), jitter)
seconds_since_last_recheck = now - watch['last_checked']
if seconds_since_last_recheck >= (threshold + watch.jitter_seconds) and seconds_since_last_recheck >= recheck_time_minimum_seconds:
if not uuid in running_uuids and uuid not in [q_uuid for p,q_uuid in update_q.queue]:
print("> Queued watch UUID {} last checked at {} queued at {:0.2f} jitter {:0.2f}s, {:0.2f}s since last checked".format(uuid,
watch['last_checked'],
now,
watch.jitter_seconds,
now - watch['last_checked']))
# Proxies can be set to have a limit on seconds between which they can be called
watch_proxy = datastore.get_preferred_proxy_for_watch(uuid=uuid)
if watch_proxy and watch_proxy in list(datastore.proxy_list.keys()):
# Proxy may also have some threshold minimum
proxy_list_reuse_time_minimum = int(datastore.proxy_list.get(watch_proxy, {}).get('reuse_time_minimum', 0))
if proxy_list_reuse_time_minimum:
proxy_last_used_time = proxy_last_called_time.get(watch_proxy, 0)
time_since_proxy_used = int(time.time() - proxy_last_used_time)
if time_since_proxy_used < proxy_list_reuse_time_minimum:
# Not enough time difference reached, skip this watch
print("> Skipped UUID {} using proxy '{}', not enough time between proxy requests {}s/{}s".format(uuid,
watch_proxy,
time_since_proxy_used,
proxy_list_reuse_time_minimum))
continue
else:
# Record the last used time
proxy_last_called_time[watch_proxy] = int(time.time())
# Use Epoch time as priority, so we get a "sorted" PriorityQueue, but we can still push a priority 1 into it.
priority = int(time.time())
print(
"> Queued watch UUID {} last checked at {} queued at {:0.2f} priority {} jitter {:0.2f}s, {:0.2f}s since last checked".format(
uuid,
watch['last_checked'],
now,
priority,
watch.jitter_seconds,
now - watch['last_checked']))
# Into the queue with you
update_q.put((5, uuid))
update_q.put((priority, uuid))
# Reset for next time
watch.jitter_seconds = 0

View File

@@ -122,3 +122,37 @@ class CreateWatch(Resource):
return {'status': "OK"}, 200
return list, 200
class SystemInfo(Resource):
def __init__(self, **kwargs):
# datastore is a black box dependency
self.datastore = kwargs['datastore']
self.update_q = kwargs['update_q']
@auth.check_token
def get(self):
import time
overdue_watches = []
# Check all watches and report which have not been checked but should have been
for uuid, watch in self.datastore.data.get('watching', {}).items():
# see if now - last_checked is greater than the time that should have been
# this is not super accurate (maybe they just edited it) but better than nothing
t = watch.threshold_seconds()
if not t:
# Use the system wide default
t = self.datastore.threshold_seconds
time_since_check = time.time() - watch.get('last_checked')
# Allow 5 minutes of grace time before we decide it's overdue
if time_since_check - (5 * 60) > t:
overdue_watches.append(uuid)
return {
'queue_size': self.update_q.qsize(),
'overdue_watches': overdue_watches,
'uptime': round(time.time() - self.datastore.start_time, 2),
'watch_count': len(self.datastore.data.get('watching', {}))
}, 200

View File

@@ -102,6 +102,14 @@ def main():
has_password=datastore.data['settings']['application']['password'] != False
)
# Monitored websites will not receive a Referer header
# when a user clicks on an outgoing link.
@app.after_request
def hide_referrer(response):
if os.getenv("HIDE_REFERER", False):
response.headers["Referrer-Policy"] = "no-referrer"
return response
# Proxy sub-directory support
# Set environment var USE_X_SETTINGS=1 on this script
# And then in your proxy_pass settings

View File

@@ -31,11 +31,12 @@ class JSActionExceptions(Exception):
return
class PageUnloadable(Exception):
def __init__(self, status_code, url, screenshot=False):
def __init__(self, status_code, url, screenshot=False, message=False):
# Set this so we can use it in other parts of the app
self.status_code = status_code
self.url = url
self.screenshot = screenshot
self.message = message
return
class EmptyReply(Exception):
@@ -292,7 +293,15 @@ class base_html_playwright(Fetcher):
# allow per-watch proxy selection override
if proxy_override:
self.proxy = {'server': proxy_override}
# https://playwright.dev/docs/network#http-proxy
from urllib.parse import urlparse
parsed = urlparse(proxy_override)
proxy_url = "{}://{}:{}".format(parsed.scheme, parsed.hostname, parsed.port)
self.proxy = {'server': proxy_url}
if parsed.username:
self.proxy['username'] = parsed.username
if parsed.password:
self.proxy['password'] = parsed.password
def run(self,
url,
@@ -307,6 +316,7 @@ class base_html_playwright(Fetcher):
import playwright._impl._api_types
from playwright._impl._api_types import Error, TimeoutError
response = None
with sync_playwright() as p:
browser_type = getattr(p, self.browser_type)
@@ -356,7 +366,7 @@ class base_html_playwright(Fetcher):
print(str(e))
context.close()
browser.close()
raise PageUnloadable(url=url, status_code=None)
raise PageUnloadable(url=url, status_code=None, message=e.message)
if response is None:
context.close()
@@ -364,8 +374,11 @@ class base_html_playwright(Fetcher):
print("response object was none")
raise EmptyReply(url=url, status_code=None)
# Bug 2(?) Set the viewport size AFTER loading the page
page.set_viewport_size({"width": 1280, "height": 1024})
# Removed browser-set-size, seemed to be needed to make screenshots work reliably in older playwright versions
# Was causing exceptions like 'waiting for page but content is changing' etc
# https://www.browserstack.com/docs/automate/playwright/change-browser-window-size 1280x720 should be the default
extra_wait = int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)) + self.render_extract_delay
time.sleep(extra_wait)
@@ -389,6 +402,13 @@ class base_html_playwright(Fetcher):
raise JSActionExceptions(status_code=response.status, screenshot=error_screenshot, message=str(e), url=url)
else:
# JS eval was run, now we also wait some time if possible to let the page settle
if self.render_extract_delay:
page.wait_for_timeout(self.render_extract_delay * 1000)
page.wait_for_timeout(500)
self.content = page.content()
self.status_code = response.status
self.headers = response.all_headers()
@@ -505,8 +525,6 @@ class base_html_webdriver(Fetcher):
# Selenium doesn't automatically wait for actions as good as Playwright, so wait again
self.driver.implicitly_wait(int(os.getenv("WEBDRIVER_DELAY_BEFORE_CONTENT_READY", 5)))
self.screenshot = self.driver.get_screenshot_as_png()
# @todo - how to check this? is it possible?
self.status_code = 200
# @todo somehow we should try to get this working for WebDriver
@@ -517,6 +535,8 @@ class base_html_webdriver(Fetcher):
self.content = self.driver.page_source
self.headers = {}
self.screenshot = self.driver.get_screenshot_as_png()
# Does the connection to the webdriver work? run a test connection.
def is_ready(self):
from selenium import webdriver
@@ -555,6 +575,11 @@ class html_requests(Fetcher):
ignore_status_codes=False,
current_css_filter=None):
# Make requests use a more modern looking user-agent
if not 'User-Agent' in request_headers:
request_headers['User-Agent'] = os.getenv("DEFAULT_SETTINGS_HEADERS_USERAGENT",
'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36')
proxies = {}
# Allows override the proxy on a per-request basis

View File

@@ -2,50 +2,24 @@ import hashlib
import logging
import os
import re
import time
import urllib3
import difflib
from changedetectionio import content_fetcher, html_tools
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
# Some common stuff here that can be moved to a base class
# (set_proxy_from_list)
class perform_site_check():
screenshot = None
xpath_data = None
def __init__(self, *args, datastore, **kwargs):
super().__init__(*args, **kwargs)
self.datastore = datastore
# If there was a proxy list enabled, figure out what proxy_args/which proxy to use
# if watch.proxy use that
# fetcher.proxy_override = watch.proxy or main config proxy
# Allows override the proxy on a per-request basis
# ALWAYS use the first one is nothing selected
def set_proxy_from_list(self, watch):
proxy_args = None
if self.datastore.proxy_list is None:
return None
# If its a valid one
if any([watch['proxy'] in p for p in self.datastore.proxy_list]):
proxy_args = watch['proxy']
# not valid (including None), try the system one
else:
system_proxy = self.datastore.data['settings']['requests']['proxy']
# Is not None and exists
if any([system_proxy in p for p in self.datastore.proxy_list]):
proxy_args = system_proxy
# Fallback - Did not resolve anything, use the first available
if proxy_args is None:
proxy_args = self.datastore.proxy_list[0][0]
return proxy_args
# Doesn't look like python supports forward slash auto enclosure in re.findall
# So convert it to inline flag "foobar(?i)" type configuration
def forward_slash_enclosed_regex_to_options(self, regex):
@@ -61,13 +35,13 @@ class perform_site_check():
def run(self, uuid):
timestamp = int(time.time()) # used for storage etc too
changed_detected = False
screenshot = False # as bytes
stripped_text_from_html = ""
watch = self.datastore.data['watching'][uuid]
watch = self.datastore.data['watching'].get(uuid)
if not watch:
return
# Protect against file:// access
if re.search(r'^file', watch['url'], re.IGNORECASE) and not os.getenv('ALLOW_FILE_URI', False):
@@ -78,7 +52,7 @@ class perform_site_check():
# Unset any existing notification error
update_obj = {'last_notification_error': False, 'last_error': False}
extra_headers = self.datastore.get_val(uuid, 'headers')
extra_headers =self.datastore.data['watching'][uuid].get('headers')
# Tweak the base config with the per-watch ones
request_headers = self.datastore.data['settings']['headers'].copy()
@@ -90,10 +64,12 @@ class perform_site_check():
if 'Accept-Encoding' in request_headers and "br" in request_headers['Accept-Encoding']:
request_headers['Accept-Encoding'] = request_headers['Accept-Encoding'].replace(', br', '')
timeout = self.datastore.data['settings']['requests']['timeout']
url = self.datastore.get_val(uuid, 'url')
request_body = self.datastore.get_val(uuid, 'body')
request_method = self.datastore.get_val(uuid, 'method')
timeout = self.datastore.data['settings']['requests'].get('timeout')
url = watch.link
request_body = self.datastore.data['watching'][uuid].get('body')
request_method = self.datastore.data['watching'][uuid].get('method')
ignore_status_codes = self.datastore.data['watching'][uuid].get('ignore_status_codes', False)
# source: support
@@ -110,9 +86,13 @@ class perform_site_check():
# If the klass doesnt exist, just use a default
klass = getattr(content_fetcher, "html_requests")
proxy_id = self.datastore.get_preferred_proxy_for_watch(uuid=uuid)
proxy_url = None
if proxy_id:
proxy_url = self.datastore.proxy_list.get(proxy_id).get('url')
print ("UUID {} Using proxy {}".format(uuid, proxy_url))
proxy_args = self.set_proxy_from_list(watch)
fetcher = klass(proxy_override=proxy_args)
fetcher = klass(proxy_override=proxy_url)
# Configurable per-watch or global extra delay before extracting text (for webDriver types)
system_webdriver_delay = self.datastore.data['settings']['application'].get('webdriver_delay', None)
@@ -127,6 +107,9 @@ class perform_site_check():
fetcher.run(url, timeout, request_headers, request_body, request_method, ignore_status_codes, watch['css_filter'])
fetcher.quit()
self.screenshot = fetcher.screenshot
self.xpath_data = fetcher.xpath_data
# Fetching complete, now filters
# @todo move to class / maybe inside of fetcher abstract base?
@@ -160,8 +143,9 @@ class perform_site_check():
has_filter_rule = True
if has_filter_rule:
if 'json:' in css_filter_rule:
stripped_text_from_html = html_tools.extract_json_as_string(content=fetcher.content, jsonpath_filter=css_filter_rule)
json_filter_prefixes = ['json:', 'jq:']
if any(prefix in css_filter_rule for prefix in json_filter_prefixes):
stripped_text_from_html = html_tools.extract_json_as_string(content=fetcher.content, json_filter=css_filter_rule)
is_html = False
if is_html or is_source:
@@ -305,11 +289,26 @@ class perform_site_check():
else:
logging.debug("check_unique_lines: UUID {} had unique content".format(uuid))
# Always record the new checksum
if changed_detected:
if not watch.get("trigger_add", True) or not watch.get("trigger_del", True): # if we are supposed to filter any diff types
# get the diff types present in the watch
diff_types = watch.get_diff_types(text_content_before_ignored_filter)
print("Diff components found: " + str(diff_types))
# Only Additions (deletions are turned off)
if not watch["trigger_del"] and diff_types["del"] and not diff_types["add"]:
changed_detected = False
# Only Deletions (additions are turned off)
elif not watch["trigger_add"] and diff_types["add"] and not diff_types["del"]:
changed_detected = False
# Always record the new checksum and the new text
update_obj["previous_md5"] = fetched_md5
watch.save_previous_text(text_content_before_ignored_filter)
# On the first run of a site, watch['previous_md5'] will be None, set it the current one.
if not watch.get('previous_md5'):
watch['previous_md5'] = fetched_md5
return changed_detected, update_obj, text_content_before_ignored_filter, fetcher.screenshot, fetcher.xpath_data
return changed_detected, update_obj, text_content_before_ignored_filter

View File

@@ -303,6 +303,37 @@ class ValidateCSSJSONXPATHInput(object):
# Re #265 - maybe in the future fetch the page and offer a
# warning/notice that its possible the rule doesnt yet match anything?
if not self.allow_json:
raise ValidationError("jq not permitted in this field!")
if 'jq:' in line:
try:
import jq
except ModuleNotFoundError:
# `jq` requires full compilation in windows and so isn't generally available
raise ValidationError("jq not support not found")
input = line.replace('jq:', '')
try:
jq.compile(input)
except (ValueError) as e:
message = field.gettext('\'%s\' is not a valid jq expression. (%s)')
raise ValidationError(message % (input, str(e)))
except:
raise ValidationError("A system-error occurred when validating your jq expression")
class ValidateDiffFilters(object):
"""
Validates that at least one filter checkbox is selected
"""
def __init__(self, message=None):
self.message = message
def __call__(self, form, field):
if not form.trigger_add.data and not form.trigger_del.data:
message = field.gettext('At least one filter checkbox must be selected')
raise ValidationError(message)
class quickWatchForm(Form):
@@ -314,14 +345,14 @@ class quickWatchForm(Form):
# Common to a single watch and the global settings
class commonSettingsForm(Form):
notification_urls = StringListField('Notification URL list', validators=[validators.Optional(), ValidateNotificationBodyAndTitleWhenURLisSet(), ValidateAppRiseServers()])
notification_title = StringField('Notification title', default=default_notification_title, validators=[validators.Optional(), ValidateTokensList()])
notification_body = TextAreaField('Notification body', default=default_notification_body, validators=[validators.Optional(), ValidateTokensList()])
notification_format = SelectField('Notification format', choices=valid_notification_formats.keys(), default=default_notification_format)
notification_urls = StringListField('Notification URL list', validators=[validators.Optional(), ValidateAppRiseServers()])
notification_title = StringField('Notification title', validators=[validators.Optional(), ValidateTokensList()])
notification_body = TextAreaField('Notification body', validators=[validators.Optional(), ValidateTokensList()])
notification_format = SelectField('Notification format', choices=valid_notification_formats.keys())
fetch_backend = RadioField(u'Fetch method', choices=content_fetcher.available_fetchers(), validators=[ValidateContentFetcherIsReady()])
extract_title_as_title = BooleanField('Extract <title> from document and use as watch title', default=False)
webdriver_delay = IntegerField('Wait seconds before extracting text', validators=[validators.Optional(), validators.NumberRange(min=1, message="Should contain one or more seconds")] )
webdriver_delay = IntegerField('Wait seconds before extracting text', validators=[validators.Optional(), validators.NumberRange(min=1,
message="Should contain one or more seconds")])
class watchForm(commonSettingsForm):
@@ -346,6 +377,8 @@ class watchForm(commonSettingsForm):
check_unique_lines = BooleanField('Only trigger when new lines appear', default=False)
trigger_text = StringListField('Trigger/wait for text', [validators.Optional(), ValidateListRegex()])
text_should_not_be_present = StringListField('Block change-detection if text matches', [validators.Optional(), ValidateListRegex()])
trigger_add = BooleanField('Additions', [ValidateDiffFilters()], default=True)
trigger_del = BooleanField('Deletions', [ValidateDiffFilters()], default=True)
webdriver_js_execute_code = TextAreaField('Execute JavaScript before change detection', render_kw={"rows": "5"}, validators=[validators.Optional()])
@@ -355,6 +388,8 @@ class watchForm(commonSettingsForm):
filter_failure_notification_send = BooleanField(
'Send a notification when the filter can no longer be found on the page', default=False)
notification_muted = BooleanField('Notifications Muted / Off', default=False)
def validate(self, **kwargs):
if not super().validate():
return False
@@ -384,7 +419,6 @@ class globalSettingsApplicationForm(commonSettingsForm):
global_subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_xpath=False, allow_json=False)])
global_ignore_text = StringListField('Ignore Text', [ValidateListRegex()])
ignore_whitespace = BooleanField('Ignore whitespace')
real_browser_save_screenshot = BooleanField('Save last screenshot when using Chrome?')
removepassword_button = SubmitField('Remove password', render_kw={"class": "pure-button pure-button-primary"})
empty_pages_are_a_change = BooleanField('Treat empty pages as a change?', default=False)
render_anchor_tag_content = BooleanField('Render anchor tag content', default=False)

View File

@@ -1,11 +1,11 @@
import json
from typing import List
from bs4 import BeautifulSoup
from jsonpath_ng.ext import parse
import re
from inscriptis import get_text
from inscriptis.model.config import ParserConfig
from jsonpath_ng.ext import parse
from typing import List
import json
import re
class FilterNotFoundInResponse(ValueError):
def __init__(self, msg):
@@ -79,19 +79,35 @@ def extract_element(find='title', html_content=''):
return element_text
#
def _parse_json(json_data, jsonpath_filter):
s=[]
jsonpath_expression = parse(jsonpath_filter.replace('json:', ''))
match = jsonpath_expression.find(json_data)
def _parse_json(json_data, json_filter):
if 'json:' in json_filter:
jsonpath_expression = parse(json_filter.replace('json:', ''))
match = jsonpath_expression.find(json_data)
return _get_stripped_text_from_json_match(match)
if 'jq:' in json_filter:
try:
import jq
except ModuleNotFoundError:
# `jq` requires full compilation in windows and so isn't generally available
raise Exception("jq not support not found")
jq_expression = jq.compile(json_filter.replace('jq:', ''))
match = jq_expression.input(json_data).all()
return _get_stripped_text_from_json_match(match)
def _get_stripped_text_from_json_match(match):
s = []
# More than one result, we will return it as a JSON list.
if len(match) > 1:
for i in match:
s.append(i.value)
s.append(i.value if hasattr(i, 'value') else i)
# Single value, use just the value, as it could be later used in a token in notifications.
if len(match) == 1:
s = match[0].value
s = match[0].value if hasattr(match[0], 'value') else match[0]
# Re #257 - Better handling where it does not exist, in the case the original 's' value was False..
if not match:
@@ -103,16 +119,16 @@ def _parse_json(json_data, jsonpath_filter):
return stripped_text_from_html
def extract_json_as_string(content, jsonpath_filter):
def extract_json_as_string(content, json_filter):
stripped_text_from_html = False
# Try to parse/filter out the JSON, if we get some parser error, then maybe it's embedded <script type=ldjson>
try:
stripped_text_from_html = _parse_json(json.loads(content), jsonpath_filter)
stripped_text_from_html = _parse_json(json.loads(content), json_filter)
except json.JSONDecodeError:
# Foreach <script json></script> blob.. just return the first that matches jsonpath_filter
# Foreach <script json></script> blob.. just return the first that matches json_filter
s = []
soup = BeautifulSoup(content, 'html.parser')
bs_result = soup.findAll('script')
@@ -131,7 +147,7 @@ def extract_json_as_string(content, jsonpath_filter):
# Just skip it
continue
else:
stripped_text_from_html = _parse_json(json_data, jsonpath_filter)
stripped_text_from_html = _parse_json(json_data, json_filter)
if stripped_text_from_html:
break

View File

@@ -13,10 +13,6 @@ class model(dict):
'watching': {},
'settings': {
'headers': {
'User-Agent': getenv("DEFAULT_SETTINGS_HEADERS_USERAGENT", 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36'),
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate', # No support for brolti in python requests yet.
'Accept-Language': 'en-GB,en-US;q=0.9,en;'
},
'requests': {
'timeout': int(getenv("DEFAULT_SETTINGS_REQUESTS_TIMEOUT", "45")), # Default 45 seconds
@@ -42,7 +38,6 @@ class model(dict):
'notification_title': default_notification_title,
'notification_body': default_notification_body,
'notification_format': default_notification_format,
'real_browser_save_screenshot': True,
'schema_version' : 0,
'webdriver_delay': None # Extra delay in seconds before extracting text
}

View File

@@ -1,14 +1,14 @@
import os
import uuid as uuid_builder
from distutils.util import strtobool
import logging
import os
import time
import uuid
minimum_seconds_recheck_time = int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 60))
mtable = {'seconds': 1, 'minutes': 60, 'hours': 3600, 'days': 86400, 'weeks': 86400 * 7}
from changedetectionio.notification import (
default_notification_body,
default_notification_format,
default_notification_title,
default_notification_format_for_watch
)
@@ -24,7 +24,7 @@ class model(dict):
#'newest_history_key': 0,
'title': None,
'previous_md5': False,
'uuid': str(uuid_builder.uuid4()),
'uuid': str(uuid.uuid4()),
'headers': {}, # Extra headers to send
'body': None,
'method': 'GET',
@@ -32,9 +32,9 @@ class model(dict):
'ignore_text': [], # List of text to ignore when calculating the comparison checksum
# Custom notification content
'notification_urls': [], # List of URLs to add to the notification Queue (Usually AppRise)
'notification_title': default_notification_title,
'notification_body': default_notification_body,
'notification_format': default_notification_format,
'notification_title': None,
'notification_body': None,
'notification_format': default_notification_format_for_watch,
'notification_muted': False,
'css_filter': '',
'last_error': False,
@@ -47,6 +47,8 @@ class model(dict):
'consecutive_filter_failures': 0, # Every time the CSS/xPath filter cannot be located, reset when all is fine.
'extract_title_as_title': False,
'check_unique_lines': False, # On change-detected, compare against all history if its something new
'trigger_add': True,
'trigger_del': True,
'proxy': None, # Preferred proxy connection
# Re #110, so then if this is set to None, we know to use the default value instead
# Requires setting to None on submit if it's the same as the default
@@ -62,7 +64,7 @@ class model(dict):
self.update(self.__base_config)
self.__datastore_path = kw['datastore_path']
self['uuid'] = str(uuid_builder.uuid4())
self['uuid'] = str(uuid.uuid4())
del kw['datastore_path']
@@ -83,6 +85,21 @@ class model(dict):
return False
def ensure_data_dir_exists(self):
if not os.path.isdir(self.watch_data_dir):
print ("> Creating data dir {}".format(self.watch_data_dir))
os.mkdir(self.watch_data_dir)
@property
def link(self):
url = self.get('url', '')
if '{%' in url or '{{' in url:
from jinja2 import Environment
# Jinja2 available in URLs along with https://pypi.org/project/jinja2-time/
jinja2_env = Environment(extensions=['jinja2_time.TimeExtension'])
return str(jinja2_env.from_string(url).render())
return url
@property
def label(self):
# Used for sorting
@@ -105,16 +122,40 @@ class model(dict):
@property
def history(self):
"""History index is just a text file as a list
{watch-uuid}/history.txt
contains a list like
{epoch-time},{filename}\n
We read in this list as the history information
"""
tmp_history = {}
import logging
import time
# Read the history file as a dict
fname = os.path.join(self.__datastore_path, self.get('uuid'), "history.txt")
fname = os.path.join(self.watch_data_dir, "history.txt")
if os.path.isfile(fname):
logging.debug("Reading history index " + str(time.time()))
with open(fname, "r") as f:
tmp_history = dict(i.strip().split(',', 2) for i in f.readlines())
for i in f.readlines():
if ',' in i:
k, v = i.strip().split(',', 2)
# The index history could contain a relative path, so we need to make the fullpath
# so that python can read it
if not '/' in v and not '\'' in v:
v = os.path.join(self.watch_data_dir, v)
else:
# It's possible that they moved the datadir on older versions
# So the snapshot exists but is in a different path
snapshot_fname = v.split('/')[-1]
proposed_new_path = os.path.join(self.watch_data_dir, snapshot_fname)
if not os.path.exists(v) and os.path.exists(proposed_new_path):
v = proposed_new_path
tmp_history[k] = v
if len(tmp_history):
self.__newest_history_key = list(tmp_history.keys())[-1]
@@ -125,7 +166,7 @@ class model(dict):
@property
def has_history(self):
fname = os.path.join(self.__datastore_path, self.get('uuid'), "history.txt")
fname = os.path.join(self.watch_data_dir, "history.txt")
return os.path.isfile(fname)
# Returns the newest key, but if theres only 1 record, then it's counted as not being new, so return 0.
@@ -144,35 +185,58 @@ class model(dict):
# Save some text file to the appropriate path and bump the history
# result_obj from fetch_site_status.run()
def save_history_text(self, contents, timestamp):
import uuid
import logging
output_path = "{}/{}".format(self.__datastore_path, self['uuid'])
self.ensure_data_dir_exists()
snapshot_fname = "{}.txt".format(str(uuid.uuid4()))
# Incase the operator deleted it, check and create.
if not os.path.isdir(output_path):
os.mkdir(output_path)
snapshot_fname = "{}/{}.stripped.txt".format(output_path, uuid.uuid4())
logging.debug("Saving history text {}".format(snapshot_fname))
with open(snapshot_fname, 'wb') as f:
# in /diff/ and /preview/ we are going to assume for now that it's UTF-8 when reading
# most sites are utf-8 and some are even broken utf-8
with open(os.path.join(self.watch_data_dir, snapshot_fname), 'wb') as f:
f.write(contents)
f.close()
# Append to index
# @todo check last char was \n
index_fname = "{}/history.txt".format(output_path)
index_fname = os.path.join(self.watch_data_dir, "history.txt")
with open(index_fname, 'a') as f:
f.write("{},{}\n".format(timestamp, snapshot_fname))
f.close()
self.__newest_history_key = timestamp
self.__history_n+=1
self.__history_n += 1
#@todo bump static cache of the last timestamp so we dont need to examine the file to set a proper ''viewed'' status
# @todo bump static cache of the last timestamp so we dont need to examine the file to set a proper ''viewed'' status
return snapshot_fname
# Save previous text snapshot for diffing - used for calculating additions and deletions
def save_previous_text(self, contents):
import logging
output_path = os.path.join(self.__datastore_path, self['uuid'])
# Incase the operator deleted it, check and create.
self.ensure_data_dir_exists()
snapshot_fname = os.path.join(self.watch_data_dir, "previous.txt")
logging.debug("Saving previous text {}".format(snapshot_fname))
with open(snapshot_fname, 'wb') as f:
f.write(contents)
return snapshot_fname
# Get previous text snapshot for diffing - used for calculating additions and deletions
def get_previous_text(self):
snapshot_fname = os.path.join(self.watch_data_dir, "previous.txt")
if self.history_n < 1:
return ""
with open(snapshot_fname, 'rb') as f:
contents = f.read()
return contents
@property
def has_empty_checktime(self):
# using all() + dictionary comprehension
@@ -202,15 +266,40 @@ class model(dict):
# if not, something new happened
return not local_lines.issubset(existing_history)
# Get diff types (addition, deletion, modification) from the previous snapshot and new_text
# uses similar algorithm to customSequenceMatcher in diff.py
# Returns a dict of diff types and wether they are present in the diff
def get_diff_types(self, new_text):
import difflib
diff_types = {
'add': False,
'del': False,
}
# get diff types using difflib
cruncher = difflib.SequenceMatcher(isjunk=lambda x: x in " \\t", a=str(self.get_previous_text()), b=str(new_text))
for tag, alo, ahi, blo, bhi in cruncher.get_opcodes():
if tag == 'delete':
diff_types["del"] = True
elif tag == 'insert':
diff_types["add"] = True
elif tag == 'replace':
diff_types["del"] = True
diff_types["add"] = True
return diff_types
def get_screenshot(self):
fname = os.path.join(self.__datastore_path, self['uuid'], "last-screenshot.png")
fname = os.path.join(self.watch_data_dir, "last-screenshot.png")
if os.path.isfile(fname):
return fname
return False
def __get_file_ctime(self, filename):
fname = os.path.join(self.__datastore_path, self['uuid'], filename)
fname = os.path.join(self.watch_data_dir, filename)
if os.path.isfile(fname):
return int(os.path.getmtime(fname))
return False
@@ -235,9 +324,14 @@ class model(dict):
def snapshot_error_screenshot_ctime(self):
return self.__get_file_ctime('last-error-screenshot.png')
@property
def watch_data_dir(self):
# The base dir of the watch data
return os.path.join(self.__datastore_path, self['uuid'])
def get_error_text(self):
"""Return the text saved from a previous request that resulted in a non-200 error"""
fname = os.path.join(self.__datastore_path, self['uuid'], "last-error.txt")
fname = os.path.join(self.watch_data_dir, "last-error.txt")
if os.path.isfile(fname):
with open(fname, 'r') as f:
return f.read()
@@ -245,7 +339,7 @@ class model(dict):
def get_error_snapshot(self):
"""Return path to the screenshot that resulted in a non-200 error"""
fname = os.path.join(self.__datastore_path, self['uuid'], "last-error-screenshot.png")
fname = os.path.join(self.watch_data_dir, "last-error-screenshot.png")
if os.path.isfile(fname):
return fname
return False

View File

@@ -14,16 +14,19 @@ valid_tokens = {
'current_snapshot': ''
}
default_notification_format_for_watch = 'System default'
default_notification_format = 'Text'
default_notification_body = '{watch_url} had a change.\n---\n{diff}\n---\n'
default_notification_title = 'ChangeDetection.io Notification - {watch_url}'
valid_notification_formats = {
'Text': NotifyFormat.TEXT,
'Markdown': NotifyFormat.MARKDOWN,
'HTML': NotifyFormat.HTML,
# Used only for editing a watch (not for global)
default_notification_format_for_watch: default_notification_format_for_watch
}
default_notification_format = 'Text'
default_notification_body = '{watch_url} had a change.\n---\n{diff}\n---\n'
default_notification_title = 'ChangeDetection.io Notification - {watch_url}'
def process_notification(n_object, datastore):
# Get the notification body from datastore

View File

@@ -9,6 +9,8 @@
# exit when any command fails
set -e
SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
find tests/test_*py -type f|while read test_name
do
echo "TEST RUNNING $test_name"
@@ -23,6 +25,13 @@ export BASE_URL="https://really-unique-domain.io"
pytest tests/test_notification.py
## JQ + JSON: filter test
# jq is not available on windows and we should just test it when the package is installed
# this will re-test with jq support
pip3 install jq~=1.3
pytest tests/test_jsonpath_jq_selector.py
# Now for the selenium and playwright/browserless fetchers
# Note - this is not UI functional tests - just checking that each one can fetch the content
@@ -38,13 +47,60 @@ docker kill $$-test_selenium
echo "TESTING WEBDRIVER FETCH > PLAYWRIGHT/BROWSERLESS..."
# Not all platforms support playwright (not ARM/rPI), so it's not packaged in requirements.txt
pip3 install playwright~=1.22
PLAYWRIGHT_VERSION=$(grep -i -E "RUN pip install.+" "$SCRIPT_DIR/../Dockerfile" | grep --only-matching -i -E "playwright[=><~+]+[0-9\.]+")
echo "using $PLAYWRIGHT_VERSION"
pip3 install "$PLAYWRIGHT_VERSION"
docker run -d --name $$-test_browserless -e "DEFAULT_LAUNCH_ARGS=[\"--window-size=1920,1080\"]" --rm -p 3000:3000 --shm-size="2g" browserless/chrome:1.53-chrome-stable
# takes a while to spin up
sleep 5
export PLAYWRIGHT_DRIVER_URL=ws://127.0.0.1:3000
pytest tests/fetchers/test_content.py
pytest tests/test_errorhandling.py
pytest tests/visualselector/test_fetch_data.py
unset PLAYWRIGHT_DRIVER_URL
docker kill $$-test_browserless
docker kill $$-test_browserless
# Test proxy list handling, starting two squids on different ports
# Each squid adds a different header to the response, which is the main thing we test for.
docker run -d --name $$-squid-one --rm -v `pwd`/tests/proxy_list/squid.conf:/etc/squid/conf.d/debian.conf -p 3128:3128 ubuntu/squid:4.13-21.10_edge
docker run -d --name $$-squid-two --rm -v `pwd`/tests/proxy_list/squid.conf:/etc/squid/conf.d/debian.conf -p 3129:3128 ubuntu/squid:4.13-21.10_edge
# So, basic HTTP as env var test
export HTTP_PROXY=http://localhost:3128
export HTTPS_PROXY=http://localhost:3128
pytest tests/proxy_list/test_proxy.py
docker logs $$-squid-one 2>/dev/null|grep one.changedetection.io
if [ $? -ne 0 ]
then
echo "Did not see a request to one.changedetection.io in the squid logs (while checking env vars HTTP_PROXY/HTTPS_PROXY)"
fi
unset HTTP_PROXY
unset HTTPS_PROXY
# 2nd test actually choose the preferred proxy from proxies.json
cp tests/proxy_list/proxies.json-example ./test-datastore/proxies.json
# Makes a watch use a preferred proxy
pytest tests/proxy_list/test_multiple_proxy.py
# Should be a request in the default "first" squid
docker logs $$-squid-one 2>/dev/null|grep chosen.changedetection.io
if [ $? -ne 0 ]
then
echo "Did not see a request to chosen.changedetection.io in the squid logs (while checking preferred proxy)"
fi
# And one in the 'second' squid (user selects this as preferred)
docker logs $$-squid-two 2>/dev/null|grep chosen.changedetection.io
if [ $? -ne 0 ]
then
echo "Did not see a request to chosen.changedetection.io in the squid logs (while checking preferred proxy)"
fi
# @todo - test system override proxy selection and watch defaults, setup a 3rd squid?
docker kill $$-squid-one
docker kill $$-squid-two

View File

@@ -0,0 +1,51 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
width="20.108334mm"
height="21.43125mm"
viewBox="0 0 20.108334 21.43125"
version="1.1"
id="svg5"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg">
<defs
id="defs2" />
<g
id="layer1"
transform="translate(-141.05873,-76.816635)">
<image
width="20.108334"
height="21.43125"
preserveAspectRatio="none"
style="image-rendering:optimizeQuality"
xlink:href="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAEwAAABRCAYAAAB430BuAAAABHNCSVQICAgIfAhkiAAABLxJREFU
eJztnN2Z2jgUhl8Z7petIGwF0WMXsFBBoIKwFWS2gmQryKSCJRXsTAUDBTDRVBCmgkAB9tkLexh+
bIONLGwP7xU2RjafpaOjoyNBCxHNQAJEfG5sl+3ZLrAWeAyST5/sF91mFH3bRbZbsAq4ClaQq2B7
iKYnmg9Z318F20ICRnj8pMOd6E3HscNVsATxmQD/oeghPCnDLO26q2AkYin+TQ7XREyyrn3zgu2J
BSEjZTBZ179pwQ7EEv7KaoovvFnBUsV6ZHrsd+0WTHhKPV1SLGivYEsA1KEtEs2grFitRjQ65VxP
fH5JgEjAKsvXupKwFfYxaYJeSeHcWqVSCuwD7/HQQD8lRHLWDStBWG3slbAElkTc5/lTZdkIJhpN
h6/UUZDyzAgZK8PKVoEKErE8HlD0bBVcI2ZqwdBWYbFgAT+g1UZwrBbcvRyIpofHJ1Sh1rQCZt1k
lN5msQAm8CoYoFF8KVHOsFtQ5aayExBUhpnopJl6J/3/FREGWCrxmaH40/4z1oyQ320Yf5dDozXC
P4QMCRkCY4S5w/tbMTtd4L2Ngo6wJmSQ4hfdScAU+OjgGazgOXEl8oJyof3Z6Spx0iTzgnLKsMoK
w9SRuoR3rHniVVMXwRpDXQR7d+kHOJV6CFZB0khVOBGsTcE6VzWsNVGQizfJptU+N4LlD3AbVfsu
XsOahhvB8nrB08IrtcGNYNIct+EYl2+S6mr0D8kLUMrV6BfFRTzOGs4Ey8p1aNrUnssaliaMO/vV
sfNi3AmW5j54DgUTO/dyJ1hab9iwHhLcNskP23ZMND0kewFBXek6vZvHg/hMiUPSN00z+OBasFig
y8wSRfnZ0adSBz+sUVwFK4jbJhnPP06To1ETczpcCnavHhltHd82LU0AXDbJMGXBU8PSBAA8Jxk0
wnNaqlGSJuAyg+dsXIV38iZqXU3iWsmodhetSNlDQgJGriZxbWVSe1hS/gQ+S/C6j4QEfES21vxU
icXsoC4vC5mqJvbybyXgduucG/YWaYmmj+IdHvpoxFdt8ltRP5h3iZjRqfBh60C4t1rNY7rxAU95
aYnhEp+/u8pgxGfeRCfyJIR5SkLfFOHYXMMzu63PEDF9WQnSo8MUmhduyUWYEzGyvnRmU3683ugG
GAG/2bqJU4RnFDNCpsfWb5chswUnwb5Xg+hxiyo9w7MGJoSVpmYulam+A8scS+5nPYtf+s9mpZw7
J1nayDnCVuu4Ck+E6DqIBYDHHR1+is/n8kVUhfBExMBFMzm4taafkXcWL9BSfBG/nNN8sutYcE3S
d7XI3o6lSpIe/xcAIX/svzDxMVu22BAyLNKL2q9hwrdLiZWwXbP6B99GDLaGSpoOD6JPn4yxK1i8
B0StY1zKsCJiQNxzQ0HRbAm2BsZN2TBDGVaE5USzIVjsNix2VrzWHmUwB6J5fD32uyKCzQ7OxG5D
vzZuQ0E2osXjRlBMjvWe5WtYPE4b2BynXQJlMEToTUegmEiwM1mzQ1nBvqvH5ov1wlZHcA+AZHdc
xQW7vNuQS9kBtzKs1IIRMM7b0q/YvGTzto4qbFutdV5FnLtLk2x3JVWUfXKTbIu9Opc2J6Osj19S
HLfJKO64r6rg/wFBX3+2ZapW8wAAAABJRU5ErkJggg==
"
id="image832"
x="141.05873"
y="76.816635" />
</g>
</svg>

After

Width:  |  Height:  |  Size: 2.4 KiB

View File

@@ -0,0 +1,122 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
version="1.1"
id="Capa_1"
x="0px"
y="0px"
viewBox="0 0 15 14.998326"
xml:space="preserve"
width="15"
height="14.998326"
sodipodi:docname="play.svg"
inkscape:version="1.1.1 (1:1.1+202109281949+c3084ef5ed)"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"><sodipodi:namedview
id="namedview21"
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1.0"
inkscape:pageshadow="2"
inkscape:pageopacity="0.0"
inkscape:pagecheckerboard="0"
showgrid="false"
inkscape:zoom="45.47174"
inkscape:cx="7.4991632"
inkscape:cy="7.4991632"
inkscape:window-width="1554"
inkscape:window-height="896"
inkscape:window-x="3048"
inkscape:window-y="227"
inkscape:window-maximized="0"
inkscape:current-layer="Capa_1" /><metadata
id="metadata39"><rdf:RDF><cc:Work
rdf:about=""><dc:format>image/svg+xml</dc:format><dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" /></cc:Work></rdf:RDF></metadata><defs
id="defs37" />
<path
id="path2"
style="fill:#1b98f8;fill-opacity:1;stroke-width:0.0292893"
d="M 7.4980469,0 C 4.5496028,-0.04093755 1.7047721,1.8547661 0.58789062,4.5800781 -0.57819305,7.2574082 0.02636631,10.583252 2.0703125,12.671875 4.0368718,14.788335 7.2754393,15.560096 9.9882812,14.572266 12.800219,13.617028 14.874915,10.855516 14.986328,7.8847656 15.172991,4.9968456 13.497714,2.109448 10.910156,0.8203125 9.858961,0.28011352 8.6796569,-0.00179908 7.4980469,0 Z"
sodipodi:nodetypes="ccccccc" />
<g
id="g4"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g6"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g8"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g10"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g12"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g14"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g16"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g18"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g20"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g22"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g24"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g26"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g28"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g30"
transform="translate(-0.01903604,0.02221043)">
</g>
<g
id="g32"
transform="translate(-0.01903604,0.02221043)">
</g>
<path
sodipodi:type="star"
style="fill:#ffffff;fill-opacity:1;stroke-width:37.7953;paint-order:stroke fill markers"
id="path1203"
inkscape:flatsided="false"
sodipodi:sides="3"
sodipodi:cx="7.2964563"
sodipodi:cy="7.3240671"
sodipodi:r1="3.805218"
sodipodi:r2="1.9026089"
sodipodi:arg1="-0.0017436774"
sodipodi:arg2="1.0454539"
inkscape:rounded="0"
inkscape:randomized="0"
d="M 11.101669,7.317432 8.2506324,8.9701135 5.3995964,10.622795 5.3938504,7.3273846 5.3881041,4.0319742 8.2448863,5.6747033 Z"
inkscape:transform-center-x="-0.94843001"
inkscape:transform-center-y="0.0033175346" /></svg>

After

Width:  |  Height:  |  Size: 3.5 KiB

View File

@@ -22,5 +22,18 @@ $(function () {
});
});
// checkboxes - check all
$("#check-all").click(function (e) {
$('input[type=checkbox]').not(this).prop('checked', this.checked);
});
// checkboxes - show/hide buttons
$("input[type=checkbox]").click(function (e) {
if ($('input[type=checkbox]:checked').length) {
$('#checkbox-operations').slideDown();
} else {
$('#checkbox-operations').slideUp();
}
});
});

View File

@@ -30,4 +30,11 @@ $(document).ready(function() {
});
toggle();
$('#notification-setting-reset-to-default').click(function (e) {
$('#notification_title').val('');
$('#notification_body').val('');
$('#notification_format').val('System default');
$('#notification_urls').val('');
e.preventDefault();
});
});

View File

@@ -555,3 +555,26 @@ ul {
.snapshot-age.error {
background-color: #ff0000;
color: #fff; }
#checkbox-operations {
background: rgba(0, 0, 0, 0.05);
padding: 1em;
border-radius: 10px;
margin-bottom: 1em;
display: none; }
.checkbox-uuid > * {
vertical-align: middle; }
.inline-warning {
border: 1px solid #ff3300;
padding: 0.5rem;
border-radius: 5px;
color: #ff3300; }
.inline-warning > span {
display: inline-block;
vertical-align: middle; }
.inline-warning img.inline-warning-icon {
display: inline;
height: 26px;
vertical-align: middle; }

View File

@@ -156,7 +156,7 @@ body:after, body:before {
.fetch-error {
padding-top: 1em;
font-size: 60%;
font-size: 80%;
max-width: 400px;
display: block;
}
@@ -774,3 +774,33 @@ ul {
}
}
#checkbox-operations {
background: rgba(0, 0, 0, 0.05);
padding: 1em;
border-radius: 10px;
margin-bottom: 1em;
display: none;
}
.checkbox-uuid {
> * {
vertical-align: middle;
}
}
.inline-warning {
> span {
display: inline-block;
vertical-align: middle;
}
img.inline-warning-icon {
display: inline;
height: 26px;
vertical-align: middle;
}
border: 1px solid #ff3300;
padding: 0.5rem;
border-radius: 5px;
color: #ff3300;
}

View File

@@ -8,7 +8,7 @@ import threading
import time
import uuid as uuid_builder
from copy import deepcopy
from os import mkdir, path, unlink
from os import path, unlink
from threading import Lock
import re
import requests
@@ -30,14 +30,14 @@ class ChangeDetectionStore:
def __init__(self, datastore_path="/datastore", include_default_watches=True, version_tag="0.0.0"):
# Should only be active for docker
# logging.basicConfig(filename='/dev/stdout', level=logging.INFO)
self.needs_write = False
self.__data = App.model()
self.datastore_path = datastore_path
self.json_store_path = "{}/url-watches.json".format(self.datastore_path)
self.needs_write = False
self.proxy_list = None
self.start_time = time.time()
self.stop_thread = False
self.__data = App.model()
# Base definition for all watchers
# deepcopy part of #569 - not sure why its needed exactly
self.generic_definition = deepcopy(Watch.model(datastore_path = datastore_path, default={}))
@@ -81,8 +81,6 @@ class ChangeDetectionStore:
except (FileNotFoundError, json.decoder.JSONDecodeError):
if include_default_watches:
print("Creating JSON store at", self.datastore_path)
self.add_watch(url='http://www.quotationspage.com/random.php', tag='test')
self.add_watch(url='https://news.ycombinator.com/', tag='Tech news')
self.add_watch(url='https://changedetection.io/CHANGELOG.txt', tag='changedetection.io')
@@ -113,9 +111,7 @@ class ChangeDetectionStore:
self.__data['settings']['application']['api_access_token'] = secret
# Proxy list support - available as a selection in settings when text file is imported
# CSV list
# "name, address", or just "name"
proxy_list_file = "{}/proxies.txt".format(self.datastore_path)
proxy_list_file = "{}/proxies.json".format(self.datastore_path)
if path.isfile(proxy_list_file):
self.import_proxy_list(proxy_list_file)
@@ -244,10 +240,6 @@ class ChangeDetectionStore:
return False
def get_val(self, uuid, val):
# Probably their should be dict...
return self.data['watching'][uuid].get(val)
# Remove a watchs data but keep the entry (URL etc)
def clear_watch_history(self, uuid):
import pathlib
@@ -324,12 +316,7 @@ class ChangeDetectionStore:
new_watch.update(apply_extras)
self.__data['watching'][new_uuid]=new_watch
# Get the directory ready
output_path = "{}/{}".format(self.datastore_path, new_uuid)
try:
mkdir(output_path)
except FileExistsError:
print(output_path, "already exists.")
self.__data['watching'][new_uuid].ensure_data_dir_exists()
if write_to_disk_now:
self.sync_to_json()
@@ -346,29 +333,35 @@ class ChangeDetectionStore:
# Save as PNG, PNG is larger but better for doing visual diff in the future
def save_screenshot(self, watch_uuid, screenshot: bytes, as_error=False):
if not self.data['watching'].get(watch_uuid):
return
if as_error:
target_path = os.path.join(self.datastore_path, watch_uuid, "last-error-screenshot.png")
else:
target_path = os.path.join(self.datastore_path, watch_uuid, "last-screenshot.png")
self.data['watching'][watch_uuid].ensure_data_dir_exists()
with open(target_path, 'wb') as f:
f.write(screenshot)
f.close()
def save_error_text(self, watch_uuid, contents):
if not self.data['watching'].get(watch_uuid):
return
target_path = os.path.join(self.datastore_path, watch_uuid, "last-error.txt")
with open(target_path, 'w') as f:
f.write(contents)
def save_xpath_data(self, watch_uuid, data, as_error=False):
if not self.data['watching'].get(watch_uuid):
return
if as_error:
target_path = os.path.join(self.datastore_path, watch_uuid, "elements.json")
else:
target_path = os.path.join(self.datastore_path, watch_uuid, "elements-error.json")
else:
target_path = os.path.join(self.datastore_path, watch_uuid, "elements.json")
with open(target_path, 'w') as f:
f.write(json.dumps(data))
@@ -440,20 +433,42 @@ class ChangeDetectionStore:
unlink(item)
def import_proxy_list(self, filename):
import csv
with open(filename, newline='') as f:
reader = csv.reader(f, skipinitialspace=True)
# @todo This loop can could be improved
l = []
for row in reader:
if len(row):
if len(row)>=2:
l.append(tuple(row[:2]))
else:
l.append(tuple([row[0], row[0]]))
self.proxy_list = l if len(l) else None
with open(filename) as f:
self.proxy_list = json.load(f)
print ("Registered proxy list", list(self.proxy_list.keys()))
def get_preferred_proxy_for_watch(self, uuid):
"""
Returns the preferred proxy by ID key
:param uuid: UUID
:return: proxy "key" id
"""
proxy_id = None
if self.proxy_list is None:
return None
# If its a valid one
watch = self.data['watching'].get(uuid)
if watch.get('proxy') and watch.get('proxy') in list(self.proxy_list.keys()):
return watch.get('proxy')
# not valid (including None), try the system one
else:
system_proxy_id = self.data['settings']['requests'].get('proxy')
# Is not None and exists
if self.proxy_list.get(system_proxy_id):
return system_proxy_id
# Fallback - Did not resolve anything, use the first available
if system_proxy_id is None:
first_default = list(self.proxy_list)[0]
return first_default
return None
# Run all updates
# IMPORTANT - Each update could be run even when they have a new install and the schema is correct
# So therefor - each `update_n` should be very careful about checking if it needs to actually run
@@ -533,9 +548,62 @@ class ChangeDetectionStore:
# `last_changed` not needed, we pull that information from the history.txt index
def update_4(self):
for uuid, watch in self.data['watching'].items():
# Be sure it's recalculated
p = watch.history
if watch.history_n < 2:
watch['last_changed'] = 0
try:
# Remove it from the struct
del(watch['last_changed'])
except:
continue
return
return
def update_5(self):
# If the watch notification body, title look the same as the global one, unset it, so the watch defaults back to using the main settings
# In other words - the watch notification_title and notification_body are not needed if they are the same as the default one
current_system_body = self.data['settings']['application']['notification_body'].translate(str.maketrans('', '', "\r\n "))
current_system_title = self.data['settings']['application']['notification_body'].translate(str.maketrans('', '', "\r\n "))
for uuid, watch in self.data['watching'].items():
try:
watch_body = watch.get('notification_body', '')
if watch_body and watch_body.translate(str.maketrans('', '', "\r\n ")) == current_system_body:
# Looks the same as the default one, so unset it
watch['notification_body'] = None
watch_title = watch.get('notification_title', '')
if watch_title and watch_title.translate(str.maketrans('', '', "\r\n ")) == current_system_title:
# Looks the same as the default one, so unset it
watch['notification_title'] = None
except Exception as e:
continue
return
# We incorrectly used common header overrides that should only apply to Requests
# These are now handled in content_fetcher::html_requests and shouldnt be passed to Playwright/Selenium
def update_7(self):
# These were hard-coded in early versions
for v in ['User-Agent', 'Accept', 'Accept-Encoding', 'Accept-Language']:
if self.data['settings']['headers'].get(v):
del self.data['settings']['headers'][v]
# Generate a previous.txt for all watches that do not have one and contain history
def update_8(self):
for uuid, watch in self.data['watching'].items():
# Make sure we actually have history
if (watch.history_n == 0):
continue
latest_file_name = watch.history[watch.newest_history_key]
# Check if the previous.txt exists
if not os.path.exists(os.path.join(watch.watch_data_dir, "previous.txt")):
# Generate a previous.txt
with open(os.path.join(watch.watch_data_dir, "previous.txt"), "wb") as f:
# Fill it with the latest history
latest_file_name = watch.history[watch.newest_history_key]
with open(latest_file_name, "rb") as f2:
f.write(f2.read())

View File

@@ -1,13 +1,14 @@
{% from '_helpers.jinja' import render_field %}
{% macro render_common_settings_form(form, current_base_url, emailprefix) %}
{% macro render_common_settings_form(form, emailprefix, settings_application) %}
<div class="pure-control-group">
{{ render_field(form.notification_urls, rows=5, placeholder="Examples:
Gitter - gitter://token/room
Office365 - o365://TenantID:AccountEmail/ClientID/ClientSecret/TargetEmail
AWS SNS - sns://AccessKeyID/AccessSecretKey/RegionName/+PhoneNo
SMTPS - mailtos://user:pass@mail.domain.com?to=receivingAddress@example.com", class="notification-urls")
SMTPS - mailtos://user:pass@mail.domain.com?to=receivingAddress@example.com",
class="notification-urls" )
}}
<div class="pure-form-message-inline">
<ul>
@@ -26,15 +27,16 @@
</div>
<div id="notification-customisation" class="pure-control-group">
<div class="pure-control-group">
{{ render_field(form.notification_title, class="m-d notification-title") }}
{{ render_field(form.notification_title, class="m-d notification-title", placeholder=settings_application['notification_title']) }}
<span class="pure-form-message-inline">Title for all notifications</span>
</div>
<div class="pure-control-group">
{{ render_field(form.notification_body , rows=5, class="notification-body") }}
{{ render_field(form.notification_body , rows=5, class="notification-body", placeholder=settings_application['notification_body']) }}
<span class="pure-form-message-inline">Body for all notifications</span>
</div>
<div class="pure-control-group">
{{ render_field(form.notification_format , rows=5, class="notification-format") }}
<!-- unsure -->
{{ render_field(form.notification_format , class="notification-format") }}
<span class="pure-form-message-inline">Format for all notifications</span>
</div>
<div class="pure-controls">
@@ -94,7 +96,7 @@
</table>
<br/>
URLs generated by changedetection.io (such as <code>{diff_url}</code>) require the <code>BASE_URL</code> environment variable set.<br/>
Your <code>BASE_URL</code> var is currently "{{current_base_url}}"
Your <code>BASE_URL</code> var is currently "{{settings_application['current_base_url']}}"
</span>
</div>
</div>

View File

@@ -40,7 +40,8 @@
<fieldset>
<div class="pure-control-group">
{{ render_field(form.url, placeholder="https://...", required=true, class="m-d") }}
<span class="pure-form-message-inline">Some sites use JavaScript to create the content, for this you should <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Fetching-pages-with-WebDriver">use the Chrome/WebDriver Fetcher</a></span>
<span class="pure-form-message-inline">Some sites use JavaScript to create the content, for this you should <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Fetching-pages-with-WebDriver">use the Chrome/WebDriver Fetcher</a></span><br/>
<span class="pure-form-message-inline">You can use variables in the URL, perfect for inserting the current date and other logic, <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Handling-variables-in-the-watched-URL">help and examples here</a></span><br/>
</div>
<div class="pure-control-group">
{{ render_field(form.title, class="m-d") }}
@@ -77,6 +78,7 @@
<span class="pure-form-message-inline">
<p>Use the <strong>Basic</strong> method (default) where your watched site doesn't need Javascript to render.</p>
<p>The <strong>Chrome/Javascript</strong> method requires a network connection to a running WebDriver+Chrome server, set by the ENV var 'WEBDRIVER_URL'. </p>
Tip: <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration#brightdata-proxy-support">Connect using BrightData Proxies, find out more here.</a>
</span>
</div>
{% if form.proxy %}
@@ -135,10 +137,20 @@ User-Agent: wonderbra 1.0") }}
</div>
<div class="tab-pane-inner" id="notifications">
<strong>Note: <i>These settings override the global settings for this watch.</i></strong>
<fieldset>
<div class="field-group">
{{ render_common_settings_form(form, current_base_url, emailprefix) }}
<div class="pure-control-group inline-radio">
{{ render_checkbox_field(form.notification_muted) }}
</div>
<div class="field-group" id="notification-field-group">
{% if has_default_notification_urls %}
<div class="inline-warning">
<img class="inline-warning-icon" src="{{url_for('static_content', group='images', filename='notice.svg')}}" alt="Look out!" title="Lookout!"/>
There are <a href="{{ url_for('settings_page')}}#notifications">system-wide notification URLs enabled</a>, this form will override notification settings for this watch only &dash; an empty Notification URL list here will still send notifications.
</div>
{% endif %}
<a href="#notifications" id="notification-setting-reset-to-default" class="pure-button button-xsmall" style="right: 20px; top: 20px; position: absolute; background-color: #5f42dd; border-radius: 4px; font-size: 70%; color: #fff">Use system defaults</a>
{{ render_common_settings_form(form, emailprefix, settings_application) }}
</div>
</fieldset>
</div>
@@ -161,6 +173,16 @@ User-Agent: wonderbra 1.0") }}
<span class="pure-form-message-inline">Good for websites that just move the content around, and you want to know when NEW content is added, compares new lines against all history for this watch.</span>
</div>
</fieldset>
<fieldset>
<div class="pure-control-group">
<label for="trigger-type">Filter and restrict change detection of content to</label>
{{ render_checkbox_field(form.trigger_add, class="trigger-type") }}
{{ render_checkbox_field(form.trigger_del, class="trigger-type") }}
<span class="pure-form-message-inline">
Filters the change-detection of this watch to only this type of content change. <strong>Replacements</strong> (neither additions nor deletions) are always included. The 'diff' will still include all changes.
</span>
</div>
</fieldset>
<div class="pure-control-group">
{% set field = render_field(form.css_filter,
placeholder=".class-name or #some-id, or other CSS selector rule.",
@@ -173,8 +195,16 @@ User-Agent: wonderbra 1.0") }}
<span class="pure-form-message-inline">
<ul>
<li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li>
<li>JSON - Limit text to this JSON rule, using <a href="https://pypi.org/project/jsonpath-ng/">JSONPath</a>, prefix with <code>"json:"</code>, use <code>json:$</code> to force re-formatting if required, <a
href="https://jsonpath.com/" target="new">test your JSONPath here</a></li>
<li>JSON - Limit text to this JSON rule, using either <a href="https://pypi.org/project/jsonpath-ng/" target="new">JSONPath</a> or <a href="https://stedolan.github.io/jq/" target="new">jq</a> (if installed).
<ul>
<li>JSONPath: Prefix with <code>json:</code>, use <code>json:$</code> to force re-formatting if required, <a href="https://jsonpath.com/" target="new">test your JSONPath here</a>.</li>
{% if jq_support %}
<li>jq: Prefix with <code>jq:</code> and <a href="https://jqplay.org/" target="new">test your jq here</a>. Using <a href="https://stedolan.github.io/jq/" target="new">jq</a> allows for complex filtering and processing of JSON data with built-in functions, regex, filtering, and more. See examples and documentation <a href="https://stedolan.github.io/jq/manual/" target="new">here</a>.</li>
{% else %}
<li>jq support not installed</li>
{% endif %}
</ul>
</li>
<li>XPath - Limit text to this XPath rule, simply start with a forward-slash,
<ul>
<li>Example: <code>//*[contains(@class, 'sametext')]</code> or <code>xpath://*[contains(@class, 'sametext')]</code>, <a
@@ -183,7 +213,7 @@ User-Agent: wonderbra 1.0") }}
</ul>
</li>
</ul>
Please be sure that you thoroughly understand how to write CSS or JSONPath, XPath selector rules before filing an issue on GitHub! <a
Please be sure that you thoroughly understand how to write CSS, JSONPath, XPath{% if jq_support %}, or jq selector{%endif%} rules before filing an issue on GitHub! <a
href="https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help">here for more CSS selector help</a>.<br/>
</span>
</div>

View File

@@ -57,6 +57,7 @@
</br>
{% if is_html_webdriver %}
{% if screenshot %}
<div class="snapshot-age">{{watch.snapshot_screenshot_ctime|format_timestamp_timeago}}</div>
<img style="max-width: 80%" id="screenshot-img" alt="Current screenshot from most recent request"/>
{% else %}
No screenshot available just yet! Try rechecking the page.

View File

@@ -60,7 +60,7 @@
{{ render_field(form.application.form.base_url, placeholder="http://yoursite.com:5000/",
class="m-d") }}
<span class="pure-form-message-inline">
Base URL used for the {base_url} token in notifications and RSS links.<br/>Default value is the ENV var 'BASE_URL' (Currently "{{current_base_url}}"),
Base URL used for the <code>{base_url}</code> token in notifications and RSS links.<br/>Default value is the ENV var 'BASE_URL' (Currently "{{settings_application['current_base_url']}}"),
<a href="https://github.com/dgtlmoon/changedetection.io/wiki/Configurable-BASE_URL-setting">read more here</a>.
</span>
</div>
@@ -69,12 +69,6 @@
{{ render_checkbox_field(form.application.form.extract_title_as_title) }}
<span class="pure-form-message-inline">Note: This will automatically apply to all existing watches.</span>
</div>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.real_browser_save_screenshot) }}
<span class="pure-form-message-inline">When using a Chrome browser, a screenshot from the last check will be available on the Diff page</span>
</div>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.empty_pages_are_a_change) }}
<span class="pure-form-message-inline">When a page contains HTML, but no renderable text appears (empty page), is this considered a change?</span>
@@ -93,7 +87,7 @@
<div class="tab-pane-inner" id="notifications">
<fieldset>
<div class="field-group">
{{ render_common_settings_form(form.application.form, current_base_url, emailprefix) }}
{{ render_common_settings_form(form.application.form, emailprefix, settings_application) }}
</div>
</fieldset>
</div>
@@ -105,6 +99,8 @@
<p>Use the <strong>Basic</strong> method (default) where your watched sites don't need Javascript to render.</p>
<p>The <strong>Chrome/Javascript</strong> method requires a network connection to a running WebDriver+Chrome server, set by the ENV var 'WEBDRIVER_URL'. </p>
</span>
<br/>
Tip: <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration#brightdata-proxy-support">Connect using BrightData Proxies, find out more here.</a>
</div>
<fieldset class="pure-group" id="webdriver-override-options">
<div class="pure-form-message-inline">

View File

@@ -24,6 +24,17 @@
</fieldset>
<span style="color:#eee; font-size: 80%;"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread-white.svg')}}" /> Tip: You can also add 'shared' watches. <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Sharing-a-Watch">More info</a></a></span>
</form>
<form class="pure-form" action="{{ url_for('form_watch_list_checkbox_operations') }}" method="POST" id="watch-list-form">
<input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/>
<div id="checkbox-operations">
<button class="pure-button button-secondary button-xsmall" style="font-size: 70%" name="op" value="pause">Pause</button>
<button class="pure-button button-secondary button-xsmall" style="font-size: 70%" name="op" value="unpause">UnPause</button>
<button class="pure-button button-secondary button-xsmall" style="font-size: 70%" name="op" value="mute">Mute</button>
<button class="pure-button button-secondary button-xsmall" style="font-size: 70%" name="op" value="unmute">UnMute</button>
<button class="pure-button button-secondary button-xsmall" style="font-size: 70%" name="op" value="notification-default">Use default notification</button>
<button class="pure-button button-secondary button-xsmall" style="background: #dd4242; font-size: 70%" name="op" value="delete">Delete</button>
</div>
<div>
<a href="{{url_for('index')}}" class="pure-button button-tag {{'active' if not active_tag }}">All</a>
{% for tag in tags %}
@@ -41,7 +52,7 @@
<table class="pure-table pure-table-striped watch-table">
<thead>
<tr>
<th>#</th>
<th><input style="vertical-align: middle" type="checkbox" id="check-all"/> #</th>
<th></th>
{% set link_order = "desc" if sort_order else "asc" %}
{% set arrow_span = "" %}
@@ -66,13 +77,17 @@
{% if watch.paused is defined and watch.paused != False %}paused{% endif %}
{% if watch.newest_history_key| int > watch.last_viewed and watch.history_n>=2 %}unviewed{% endif %}
{% if watch.uuid in queued_uuids %}queued{% endif %}">
<td class="inline">{{ loop.index }}</td>
<td class="inline checkbox-uuid" ><input name="uuids" type="checkbox" value="{{ watch.uuid}} "/> <span>{{ loop.index }}</span></td>
<td class="inline watch-controls">
<a class="state-{{'on' if watch.paused }}" href="{{url_for('index', op='pause', uuid=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='pause.svg')}}" alt="Pause checks" title="Pause checks"/></a>
{% if not watch.paused %}
<a class="state-off" href="{{url_for('index', op='pause', uuid=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='pause.svg')}}" alt="Pause checks" title="Pause checks"/></a>
{% else %}
<a class="state-on" href="{{url_for('index', op='pause', uuid=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='play.svg')}}" alt="UnPause checks" title="UnPause checks"/></a>
{% endif %}
<a class="state-{{'on' if watch.notification_muted}}" href="{{url_for('index', op='mute', uuid=watch.uuid, tag=active_tag)}}"><img src="{{url_for('static_content', group='images', filename='bell-off.svg')}}" alt="Mute notifications" title="Mute notifications"/></a>
</td>
<td class="title-col inline">{{watch.title if watch.title is not none and watch.title|length > 0 else watch.url}}
<a class="external" target="_blank" rel="noopener" href="{{ watch.url.replace('source:','') }}"></a>
<a class="external" target="_blank" rel="noopener" href="{{ watch.link.replace('source:','') }}"></a>
<a href="{{url_for('form_share_put_watch', uuid=watch.uuid)}}"><img style="height: 1em;display:inline-block;" src="{{url_for('static_content', group='images', filename='spread.svg')}}" /></a>
{%if watch.fetch_backend == "html_webdriver" %}<img style="height: 1em; display:inline-block;" src="{{url_for('static_content', group='images', filename='Google-Chrome-icon.png')}}" />{% endif %}
@@ -129,5 +144,6 @@
#}
</div>
</form>
</div>
{% endblock %}

View File

@@ -2,7 +2,7 @@
import time
from flask import url_for
from ..util import live_server_setup
from ..util import live_server_setup, wait_for_all_checks
import logging
@@ -29,14 +29,8 @@ def test_fetch_webdriver_content(client, live_server):
assert b"1 Imported" in res.data
time.sleep(3)
attempt = 0
while attempt < 20:
res = client.get(url_for("index"))
if not b'Checking now' in res.data:
break
logging.getLogger().info("Waiting for check to not say 'Checking now'..")
time.sleep(3)
attempt += 1
wait_for_all_checks(client)
res = client.get(

View File

@@ -0,0 +1,2 @@
"""Tests for the app."""

View File

@@ -0,0 +1,14 @@
#!/usr/bin/python3
from .. import conftest
#def pytest_addoption(parser):
# parser.addoption("--url_suffix", action="store", default="identifier for request")
#def pytest_generate_tests(metafunc):
# # This is called for every test. Only get/set command line arguments
# # if the argument is specified in the list of test "fixturenames".
# option_value = metafunc.config.option.url_suffix
# if 'url_suffix' in metafunc.fixturenames and option_value is not None:
# metafunc.parametrize("url_suffix", [option_value])

View File

@@ -0,0 +1,10 @@
{
"proxy-one": {
"label": "One",
"url": "http://127.0.0.1:3128"
},
"proxy-two": {
"label": "two",
"url": "http://127.0.0.1:3129"
}
}

View File

@@ -0,0 +1,41 @@
acl localnet src 0.0.0.1-0.255.255.255 # RFC 1122 "this" network (LAN)
acl localnet src 10.0.0.0/8 # RFC 1918 local private network (LAN)
acl localnet src 100.64.0.0/10 # RFC 6598 shared address space (CGN)
acl localnet src 169.254.0.0/16 # RFC 3927 link-local (directly plugged) machines
acl localnet src 172.16.0.0/12 # RFC 1918 local private network (LAN)
acl localnet src 192.168.0.0/16 # RFC 1918 local private network (LAN)
acl localnet src fc00::/7 # RFC 4193 local private network range
acl localnet src fe80::/10 # RFC 4291 link-local (directly plugged) machines
acl localnet src 159.65.224.174
acl SSL_ports port 443
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 # https
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost manager
http_access deny manager
http_access allow localhost
http_access allow localnet
http_access deny all
http_port 3128
coredump_dir /var/spool/squid
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern \/(Packages|Sources)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
refresh_pattern \/Release(|\.gpg)$ 0 0% 0 refresh-ims
refresh_pattern \/InRelease$ 0 0% 0 refresh-ims
refresh_pattern \/(Translation-.*)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
refresh_pattern . 0 20% 4320
logfile_rotate 0

View File

@@ -0,0 +1,38 @@
#!/usr/bin/python3
import time
from flask import url_for
from ..util import live_server_setup
def test_preferred_proxy(client, live_server):
time.sleep(1)
live_server_setup(live_server)
time.sleep(1)
url = "http://chosen.changedetection.io"
res = client.post(
url_for("import_page"),
# Because a URL wont show in squid/proxy logs due it being SSLed
# Use plain HTTP or a specific domain-name here
data={"urls": url},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(2)
res = client.post(
url_for("edit_page", uuid="first"),
data={
"css_filter": "",
"fetch_backend": "html_requests",
"headers": "",
"proxy": "proxy-two",
"tag": "",
"url": url,
},
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(2)
# Now the request should appear in the second-squid logs

View File

@@ -0,0 +1,19 @@
#!/usr/bin/python3
import time
from flask import url_for
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
# just make a request, we will grep in the docker logs to see it actually got called
def test_check_basic_change_detection_functionality(client, live_server):
live_server_setup(live_server)
res = client.post(
url_for("import_page"),
# Because a URL wont show in squid/proxy logs due it being SSLed
# Use plain HTTP or a specific domain-name here
data={"urls": "http://one.changedetection.io"},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(3)

View File

@@ -147,6 +147,16 @@ def test_api_simple(client, live_server):
# @todo how to handle None/default global values?
assert watch['history_n'] == 2, "Found replacement history section, which is in its own API"
# basic systeminfo check
res = client.get(
url_for("systeminfo"),
headers={'x-api-key': api_key},
)
info = json.loads(res.data)
assert info.get('watch_count') == 1
assert info.get('uptime') > 0.5
# Finally delete the watch
res = client.delete(
url_for("watch", uuid=watch_uuid),

View File

@@ -1,18 +1,31 @@
#!/usr/bin/python3
import time
from .util import set_original_response, set_modified_response, live_server_setup
from flask import url_for
from urllib.request import urlopen
from . util import set_original_response, set_modified_response, live_server_setup
from zipfile import ZipFile
import re
import time
def test_backup(client, live_server):
live_server_setup(live_server)
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
res = client.post(
url_for("import_page"),
data={"urls": url_for('test_endpoint', _external=True)},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(3)
res = client.get(
url_for("get_backup"),
follow_redirects=True
@@ -20,6 +33,19 @@ def test_backup(client, live_server):
# Should get the right zip content type
assert res.content_type == "application/zip"
# Should be PK/ZIP stream
assert res.data.count(b'PK') >= 2
# ZipFile from buffer seems non-obvious, just save it instead
with open("download.zip", 'wb') as f:
f.write(res.data)
zip = ZipFile('download.zip')
l = zip.namelist()
uuid4hex = re.compile('^[a-f0-9]{8}-?[a-f0-9]{4}-?4[a-f0-9]{3}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}.*txt', re.I)
newlist = list(filter(uuid4hex.match, l)) # Read Note below
# Should be three txt files in the archive (history and the snapshot)
assert len(newlist) == 3

View File

@@ -0,0 +1,107 @@
#!/usr/bin/python3
# @NOTE: THIS RELIES ON SOME MIDDLEWARE TO MAKE CHECKBOXES WORK WITH WTFORMS UNDER TEST CONDITION, see changedetectionio/tests/util.py
import time
from flask import url_for
from .util import live_server_setup
def set_original_response():
test_return_data = """
Here
is
some
text
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_response_with_deleted_word():
test_return_data = """
Here
is
text
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_response_with_changed_word():
test_return_data = """
Here
ix
some
text
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def test_diff_filter_changes_as_add_delete(client, live_server):
live_server_setup(live_server)
sleep_time_for_fetch_thread = 3
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
# Wait for it to read the original version
time.sleep(sleep_time_for_fetch_thread)
# Make a change that ONLY includes deletes
set_response_with_deleted_word()
res = client.post(
url_for("edit_page", uuid="first"),
data={"trigger_add": "y",
"trigger_del": "n",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(sleep_time_for_fetch_thread)
# We should NOT see a change because we chose to not know about any Deletions
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
# Recheck to be sure
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
# Now set the original response, which will include the word, which should trigger Added (because trigger_add ==y)
set_original_response()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index"))
assert b'unviewed' in res.data
# Now check 'changes' are always going to be triggered
set_original_response()
client.post(
url_for("edit_page", uuid="first"),
# Neither trigger add nor del? then we should see changes still
data={"trigger_add": "n",
"trigger_del": "n",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
time.sleep(sleep_time_for_fetch_thread)
client.get(url_for("mark_all_viewed"), follow_redirects=True)
set_response_with_changed_word()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index"))
assert b'unviewed' in res.data

View File

@@ -0,0 +1,83 @@
#!/usr/bin/python3
import time
from flask import url_for
from .util import live_server_setup
def set_original_response():
test_return_data = """
A few new lines
Where there is more lines originally
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_delete_response():
test_return_data = """
A few new lines
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def test_diff_filtering_no_del(client, live_server):
live_server_setup(live_server)
sleep_time_for_fetch_thread = 3
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(sleep_time_for_fetch_thread)
# Add our URL to the import page
res = client.post(
url_for("edit_page", uuid="first"),
data={"trigger_add": "y",
"trigger_del": "n",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
assert b'unviewed' not in res.data
# Make an delete change
set_delete_response()
time.sleep(sleep_time_for_fetch_thread)
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# We should NOT see the change
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
# Make an delete change
set_original_response()
time.sleep(sleep_time_for_fetch_thread)
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# We should see the change
res = client.get(url_for("index"))
assert b'unviewed' in res.data

View File

@@ -0,0 +1,72 @@
#!/usr/bin/python3
import time
from flask import url_for
from .util import live_server_setup
def set_original_response():
test_return_data = """
A few new lines
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def set_add_response():
test_return_data = """
A few new lines
Where there is more lines than before
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
def test_diff_filtering_no_add(client, live_server):
live_server_setup(live_server)
sleep_time_for_fetch_thread = 3
set_original_response()
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(sleep_time_for_fetch_thread)
# Add our URL to the import page
res = client.post(
url_for("edit_page", uuid="first"),
data={"trigger_add": "n",
"trigger_del": "y",
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
assert b'unviewed' not in res.data
# Make an add change
set_add_response()
time.sleep(sleep_time_for_fetch_thread)
# Trigger a check
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# We should NOT see the change
res = client.get(url_for("index"))
# save res.data to a file
assert b'unviewed' not in res.data

View File

@@ -81,4 +81,4 @@ def test_consistent_history(client, live_server):
assert len(files_in_watch_dir) == 2, "Should be just two files in the dir, history.txt and the snapshot"
assert len(files_in_watch_dir) == 3, "Should be just three files in the dir, history.txt, previous.txt, and the snapshot"

View File

@@ -0,0 +1,33 @@
#!/usr/bin/python3
import time
from flask import url_for
from .util import live_server_setup
# If there was only a change in the whitespacing, then we shouldnt have a change detected
def test_jinja2_in_url_query(client, live_server):
live_server_setup(live_server)
# Give the endpoint time to spin up
time.sleep(1)
# Add our URL to the import page
test_url = url_for('test_return_query', _external=True)
# because url_for() will URL-encode the var, but we dont here
full_url = "{}?{}".format(test_url,
"date={% now 'Europe/Berlin', '%Y' %}.{% now 'Europe/Berlin', '%m' %}.{% now 'Europe/Berlin', '%d' %}", )
res = client.post(
url_for("form_quick_watch_add"),
data={"url": full_url, "tag": "test"},
follow_redirects=True
)
assert b"Watch added" in res.data
time.sleep(3)
# It should report nothing found (no new 'unviewed' class)
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
assert b'date=2' in res.data

View File

@@ -2,10 +2,15 @@
# coding=utf-8
import time
from flask import url_for
from flask import url_for, escape
from . util import live_server_setup
import pytest
jq_support = True
try:
import jq
except ModuleNotFoundError:
jq_support = False
def test_setup(live_server):
live_server_setup(live_server)
@@ -36,16 +41,28 @@ and it can also be repeated
from .. import html_tools
# See that we can find the second <script> one, which is not broken, and matches our filter
text = html_tools.extract_json_as_string(content, "$.offers.price")
text = html_tools.extract_json_as_string(content, "json:$.offers.price")
assert text == "23.5"
text = html_tools.extract_json_as_string('{"id":5}', "$.id")
# also check for jq
if jq_support:
text = html_tools.extract_json_as_string(content, "jq:.offers.price")
assert text == "23.5"
text = html_tools.extract_json_as_string('{"id":5}', "jq:.id")
assert text == "5"
text = html_tools.extract_json_as_string('{"id":5}', "json:$.id")
assert text == "5"
# When nothing at all is found, it should throw JSONNOTFound
# Which is caught and shown to the user in the watch-overview table
with pytest.raises(html_tools.JSONNotFound) as e_info:
html_tools.extract_json_as_string('COMPLETE GIBBERISH, NO JSON!', "$.id")
html_tools.extract_json_as_string('COMPLETE GIBBERISH, NO JSON!', "json:$.id")
if jq_support:
with pytest.raises(html_tools.JSONNotFound) as e_info:
html_tools.extract_json_as_string('COMPLETE GIBBERISH, NO JSON!', "jq:.id")
def set_original_ext_response():
data = """
@@ -66,6 +83,7 @@ def set_original_ext_response():
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(data)
return None
def set_modified_ext_response():
data = """
@@ -86,6 +104,7 @@ def set_modified_ext_response():
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(data)
return None
def set_original_response():
test_return_data = """
@@ -184,10 +203,10 @@ def test_check_json_without_filter(client, live_server):
assert b'&#34;&lt;b&gt;' in res.data
assert res.data.count(b'{\n') >= 2
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_check_json_filter(client, live_server):
json_filter = 'json:boss.name'
def check_json_filter(json_filter, client, live_server):
set_original_response()
# Give the endpoint time to spin up
@@ -226,7 +245,7 @@ def test_check_json_filter(client, live_server):
res = client.get(
url_for("edit_page", uuid="first"),
)
assert bytes(json_filter.encode('utf-8')) in res.data
assert bytes(escape(json_filter).encode('utf-8')) in res.data
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
@@ -252,10 +271,17 @@ def test_check_json_filter(client, live_server):
# And #462 - check we see the proper utf-8 string there
assert "Örnsköldsvik".encode('utf-8') in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_check_json_filter_bool_val(client, live_server):
json_filter = "json:$['available']"
def test_check_jsonpath_filter(client, live_server):
check_json_filter('json:boss.name', client, live_server)
def test_check_jq_filter(client, live_server):
if jq_support:
check_json_filter('jq:.boss.name', client, live_server)
def check_json_filter_bool_val(json_filter, client, live_server):
set_original_response()
# Give the endpoint time to spin up
@@ -304,14 +330,22 @@ def test_check_json_filter_bool_val(client, live_server):
# But the change should be there, tho its hard to test the change was detected because it will show old and new versions
assert b'false' in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_check_jsonpath_filter_bool_val(client, live_server):
check_json_filter_bool_val("json:$['available']", client, live_server)
def test_check_jq_filter_bool_val(client, live_server):
if jq_support:
check_json_filter_bool_val("jq:.available", client, live_server)
# Re #265 - Extended JSON selector test
# Stuff to consider here
# - Selector should be allowed to return empty when it doesnt match (people might wait for some condition)
# - The 'diff' tab could show the old and new content
# - Form should let us enter a selector that doesnt (yet) match anything
def test_check_json_ext_filter(client, live_server):
json_filter = 'json:$[?(@.status==Sold)]'
def check_json_ext_filter(json_filter, client, live_server):
set_original_ext_response()
# Give the endpoint time to spin up
@@ -350,7 +384,7 @@ def test_check_json_ext_filter(client, live_server):
res = client.get(
url_for("edit_page", uuid="first"),
)
assert bytes(json_filter.encode('utf-8')) in res.data
assert bytes(escape(json_filter).encode('utf-8')) in res.data
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
@@ -376,3 +410,12 @@ def test_check_json_ext_filter(client, live_server):
assert b'ForSale' not in res.data
assert b'Sold' in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_check_jsonpath_ext_filter(client, live_server):
check_json_ext_filter('json:$[?(@.status==Sold)]', client, live_server)
def test_check_jq_ext_filter(client, live_server):
if jq_support:
check_json_ext_filter('jq:.[] | select(.status | contains("Sold"))', client, live_server)

View File

@@ -4,7 +4,13 @@ import re
from flask import url_for
from . util import set_original_response, set_modified_response, set_more_modified_response, live_server_setup
import logging
from changedetectionio.notification import default_notification_body, default_notification_title
from changedetectionio.notification import (
default_notification_body,
default_notification_format,
default_notification_title,
valid_notification_formats,
)
def test_setup(live_server):
live_server_setup(live_server)
@@ -20,9 +26,26 @@ def test_check_notification(client, live_server):
# Re 360 - new install should have defaults set
res = client.get(url_for("settings_page"))
notification_url = url_for('test_notification_endpoint', _external=True).replace('http', 'json')
assert default_notification_body.encode() in res.data
assert default_notification_title.encode() in res.data
#####################
# Set this up for when we remove the notification from the watch, it should fallback with these details
res = client.post(
url_for("settings_page"),
data={"application-notification_urls": notification_url,
"application-notification_title": "fallback-title "+default_notification_title,
"application-notification_body": "fallback-body "+default_notification_body,
"application-notification_format": default_notification_format,
"requests-time_between_check-minutes": 180,
'application-fetch_backend': "html_requests"},
follow_redirects=True
)
assert b"Settings updated." in res.data
# When test mode is in BASE_URL env mode, we should see this already configured
env_base_url = os.getenv('BASE_URL', '').strip()
if len(env_base_url):
@@ -47,8 +70,6 @@ def test_check_notification(client, live_server):
# Goto the edit page, add our ignore text
# Add our URL to the import page
url = url_for('test_notification_endpoint', _external=True)
notification_url = url.replace('http', 'json')
print (">>>> Notification URL: "+notification_url)
@@ -158,6 +179,30 @@ def test_check_notification(client, live_server):
# be sure we see it in the output log
assert b'New ChangeDetection.io Notification - ' + test_url.encode('utf-8') in res.data
set_original_response()
res = client.post(
url_for("edit_page", uuid="first"),
data={
"url": test_url,
"tag": "my tag",
"title": "my title",
"notification_urls": '',
"notification_title": '',
"notification_body": '',
"notification_format": default_notification_format,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(2)
# Verify what was sent as a notification, this file should exist
with open("test-datastore/notification.txt", "r") as f:
notification_submission = f.read()
assert "fallback-title" in notification_submission
assert "fallback-body" in notification_submission
# cleanup for the next
client.get(
url_for("form_delete", uuid="all"),
@@ -180,20 +225,20 @@ def test_notification_validation(client, live_server):
assert b"Watch added" in res.data
# Re #360 some validation
res = client.post(
url_for("edit_page", uuid="first"),
data={"notification_urls": 'json://localhost/foobar',
"notification_title": "",
"notification_body": "",
"notification_format": "Text",
"url": test_url,
"tag": "my tag",
"title": "my title",
"headers": "",
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Notification Body and Title is required when a Notification URL is used" in res.data
# res = client.post(
# url_for("edit_page", uuid="first"),
# data={"notification_urls": 'json://localhost/foobar',
# "notification_title": "",
# "notification_body": "",
# "notification_format": "Text",
# "url": test_url,
# "tag": "my tag",
# "title": "my title",
# "headers": "",
# "fetch_backend": "html_requests"},
# follow_redirects=True
# )
# assert b"Notification Body and Title is required when a Notification URL is used" in res.data
# Now adding a wrong token should give us an error
res = client.post(
@@ -215,3 +260,5 @@ def test_notification_validation(client, live_server):
url_for("form_delete", uuid="all"),
follow_redirects=True
)

View File

@@ -2,6 +2,14 @@
from flask import make_response, request
from flask import url_for
import logging
import time
from werkzeug import Request
import io
# This is a fix for macOS running tests.
import multiprocessing
multiprocessing.set_start_method("fork")
def set_original_response():
test_return_data = """<html>
@@ -68,6 +76,31 @@ def extract_api_key_from_UI(client):
api_key = m.group(1)
return api_key.strip()
# kinda funky, but works for now
def extract_UUID_from_client(client):
import re
res = client.get(
url_for("index"),
)
# <span id="api-key">{{api_key}}</span>
m = re.search('edit/(.+?)"', str(res.data))
uuid = m.group(1)
return uuid.strip()
def wait_for_all_checks(client):
# Loop waiting until done..
attempt=0
while attempt < 60:
time.sleep(1)
res = client.get(url_for("index"))
if not b'Checking now' in res.data:
break
logging.getLogger().info("Waiting for watch-list to not say 'Checking now'.. {}".format(attempt))
attempt += 1
def live_server_setup(live_server):
@live_server.app.route('/test-endpoint')
@@ -132,4 +165,42 @@ def live_server_setup(live_server):
ret = " ".join([auth.username, auth.password, auth.type])
return ret
# Make sure any checkboxes that are supposed to be defaulted to true are set during the post request
# This is due to the fact that defaults are set in the HTML which we are not using during tests.
# This does not affect the server when running outside of a test
class DefaultCheckboxMiddleware(object):
def __init__(self, app):
self.app = app
def __call__(self, environ, start_response):
request = Request(environ)
if request.method == "POST" and "/edit" in request.path:
body = environ['wsgi.input'].read()
# if the checkboxes are not set, set them to true
if b"trigger_add" not in body:
body += b'&trigger_add=y'
if b"trigger_del" not in body:
body += b'&trigger_del=y'
# remove any checkboxes set to "n" so wtforms processes them correctly
body = body.replace(b"trigger_add=n", b"")
body = body.replace(b"trigger_del=n", b"")
body = body.replace(b"&&", b"&")
new_stream = io.BytesIO(body)
environ["CONTENT_LENGTH"] = len(body)
environ['wsgi.input'] = new_stream
return self.app(environ, start_response)
live_server.app.wsgi_app = DefaultCheckboxMiddleware(live_server.app.wsgi_app)
# Just return some GET var
@live_server.app.route('/test-return-query', methods=['GET'])
def test_return_query():
return request.query_string
live_server.start()

View File

@@ -0,0 +1,2 @@
"""Tests for the app."""

View File

@@ -0,0 +1,3 @@
#!/usr/bin/python3
from .. import conftest

View File

@@ -0,0 +1,54 @@
#!/usr/bin/python3
import time
from flask import url_for
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
# Add a site in paused mode, add an invalid filter, we should still have visual selector data ready
def test_visual_selector_content_ready(client, live_server):
import os
import json
assert os.getenv('PLAYWRIGHT_DRIVER_URL'), "Needs PLAYWRIGHT_DRIVER_URL set for this test"
live_server_setup(live_server)
time.sleep(1)
# Add our URL to the import page, because the docker container (playwright/selenium) wont be able to connect to our usual test url
test_url = "https://changedetection.io/ci-test/test-runjs.html"
res = client.post(
url_for("form_quick_watch_add"),
data={"url": test_url, "tag": '', 'edit_and_watch_submit_button': 'Edit > Watch'},
follow_redirects=True
)
assert b"Watch added in Paused state, saving will unpause" in res.data
res = client.post(
url_for("edit_page", uuid="first", unpause_on_save=1),
data={
"url": test_url,
"tag": "",
"headers": "",
'fetch_backend': "html_webdriver",
'webdriver_js_execute_code': 'document.querySelector("button[name=test-button]").click();'
},
follow_redirects=True
)
assert b"unpaused" in res.data
time.sleep(1)
wait_for_all_checks(client)
uuid = extract_UUID_from_client(client)
# Check the JS execute code before extract worked
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
assert b'I smell JavaScript' in res.data
assert os.path.isfile(os.path.join('test-datastore', uuid, 'last-screenshot.png')), "last-screenshot.png should exist"
assert os.path.isfile(os.path.join('test-datastore', uuid, 'elements.json')), "xpath elements.json data should exist"
# Open it and see if it roughly looks correct
with open(os.path.join('test-datastore', uuid, 'elements.json'), 'r') as f:
json.load(f)

View File

@@ -11,11 +11,14 @@ from changedetectionio.html_tools import FilterNotFoundInResponse
# Requests for checking on a single site(watch) from a queue of watches
# (another process inserts watches into the queue that are time-ready for checking)
import logging
import sys
class update_worker(threading.Thread):
current_uuid = None
def __init__(self, q, notification_q, app, datastore, *args, **kwargs):
logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
self.q = q
self.app = app
self.notification_q = notification_q
@@ -26,6 +29,10 @@ class update_worker(threading.Thread):
from changedetectionio import diff
from changedetectionio.notification import (
default_notification_format_for_watch
)
n_object = {}
watch = self.datastore.data['watching'].get(watch_uuid, False)
if not watch:
@@ -40,33 +47,27 @@ class update_worker(threading.Thread):
"History index had 2 or more, but only 1 date loaded, timestamps were not unique? maybe two of the same timestamps got written, needs more delay?"
)
# Did it have any notification alerts to hit?
if len(watch['notification_urls']):
print(">>> Notifications queued for UUID from watch {}".format(watch_uuid))
n_object['notification_urls'] = watch['notification_urls']
n_object['notification_title'] = watch['notification_title']
n_object['notification_body'] = watch['notification_body']
n_object['notification_format'] = watch['notification_format']
n_object['notification_urls'] = watch['notification_urls'] if len(watch['notification_urls']) else \
self.datastore.data['settings']['application']['notification_urls']
n_object['notification_title'] = watch['notification_title'] if watch['notification_title'] else \
self.datastore.data['settings']['application']['notification_title']
n_object['notification_body'] = watch['notification_body'] if watch['notification_body'] else \
self.datastore.data['settings']['application']['notification_body']
n_object['notification_format'] = watch['notification_format'] if watch['notification_format'] != default_notification_format_for_watch else \
self.datastore.data['settings']['application']['notification_format']
# No? maybe theres a global setting, queue them all
elif len(self.datastore.data['settings']['application']['notification_urls']):
print(">>> Watch notification URLs were empty, using GLOBAL notifications for UUID: {}".format(watch_uuid))
n_object['notification_urls'] = self.datastore.data['settings']['application']['notification_urls']
n_object['notification_title'] = self.datastore.data['settings']['application']['notification_title']
n_object['notification_body'] = self.datastore.data['settings']['application']['notification_body']
n_object['notification_format'] = self.datastore.data['settings']['application']['notification_format']
else:
print(">>> NO notifications queued, watch and global notification URLs were empty.")
# Only prepare to notify if the rules above matched
if 'notification_urls' in n_object:
if 'notification_urls' in n_object and n_object['notification_urls']:
# HTML needs linebreak, but MarkDown and Text can use a linefeed
if n_object['notification_format'] == 'HTML':
line_feed_sep = "</br>"
else:
line_feed_sep = "\n"
snapshot_contents = ''
with open(watch_history[dates[-1]], 'rb') as f:
snapshot_contents = f.read()
@@ -77,8 +78,10 @@ class update_worker(threading.Thread):
'diff': diff.render_diff(watch_history[dates[-2]], watch_history[dates[-1]], line_feed_sep=line_feed_sep),
'diff_full': diff.render_diff(watch_history[dates[-2]], watch_history[dates[-1]], True, line_feed_sep=line_feed_sep)
})
logging.info (">> SENDING NOTIFICATION")
self.notification_q.put(n_object)
else:
logging.info (">> NO Notification sent, notification_url was empty in both watch and system")
def send_filter_failure_notification(self, watch_uuid):
@@ -142,7 +145,7 @@ class update_worker(threading.Thread):
now = time.time()
try:
changed_detected, update_obj, contents, screenshot, xpath_data = update_handler.run(uuid)
changed_detected, update_obj, contents = update_handler.run(uuid)
# Re #342
# In Python 3, all strings are sequences of Unicode characters. There is a bytes type that holds raw bytes.
# We then convert/.decode('utf-8') for the notification etc
@@ -183,6 +186,9 @@ class update_worker(threading.Thread):
process_changedetection_results = False
except FilterNotFoundInResponse as e:
if not self.datastore.data['watching'].get(uuid):
continue
err_text = "Warning, filter '{}' not found".format(str(e))
self.datastore.update_watch(uuid=uuid, update_obj={'last_error': err_text,
# So that we get a trigger when the content is added again
@@ -223,6 +229,9 @@ class update_worker(threading.Thread):
'last_check_status': e.status_code})
except content_fetcher.PageUnloadable as e:
err_text = "Page request from server didnt respond correctly"
if e.message:
err_text = "{} - {}".format(err_text, e.message)
if e.screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=e.screenshot, as_error=True)
@@ -234,6 +243,9 @@ class update_worker(threading.Thread):
# Other serious error
process_changedetection_results = False
else:
# Crash protection, the watch entry could have been removed by this point (during a slow chrome fetch etc)
if not self.datastore.data['watching'].get(uuid):
continue
# Mark that we never had any failures
if not self.datastore.data['watching'][uuid].get('ignore_status_codes'):
@@ -241,10 +253,6 @@ class update_worker(threading.Thread):
self.cleanup_error_artifacts(uuid)
# Crash protection, the watch entry could have been removed by this point (during a slow chrome fetch etc)
if not self.datastore.data['watching'].get(uuid):
continue
# Different exceptions mean that we may or may not want to bump the snapshot, trigger notifications etc
if process_changedetection_results:
try:
@@ -280,10 +288,10 @@ class update_worker(threading.Thread):
'last_checked': round(time.time())})
# Always save the screenshot if it's available
if screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=screenshot)
if xpath_data:
self.datastore.save_xpath_data(watch_uuid=uuid, data=xpath_data)
if update_handler.screenshot:
self.datastore.save_screenshot(watch_uuid=uuid, screenshot=update_handler.screenshot)
if update_handler.xpath_data:
self.datastore.save_xpath_data(watch_uuid=uuid, data=update_handler.xpath_data)
self.current_uuid = None # Done

View File

@@ -6,6 +6,8 @@ services:
hostname: changedetection
volumes:
- changedetection-data:/datastore
# Configurable proxy list support, see https://github.com/dgtlmoon/changedetection.io/wiki/Proxy-configuration#proxy-list-support
# - ./proxies.json:/datastore/proxies.json
# environment:
# Default listening port, can also be changed with the -p option
@@ -30,7 +32,7 @@ services:
#
# https://playwright.dev/python/docs/api/class-browsertype#browser-type-launch-option-proxy
#
# Plain requsts - proxy support example.
# Plain requests - proxy support example.
# - HTTP_PROXY=socks5h://10.10.1.10:1080
# - HTTPS_PROXY=socks5h://10.10.1.10:1080
#
@@ -43,6 +45,9 @@ services:
# Respect proxy_pass type settings, `proxy_set_header Host "localhost";` and `proxy_set_header X-Forwarded-Prefix /app;`
# More here https://github.com/dgtlmoon/changedetection.io/wiki/Running-changedetection.io-behind-a-reverse-proxy-sub-directory
# - USE_X_SETTINGS=1
#
# Hides the `Referer` header so that monitored websites can't see the changedetection.io hostname.
# - HIDE_REFERER=true
# Comment out ports: when using behind a reverse proxy , enable networks: etc.
ports:

BIN
docs/proxy-example.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 46 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 190 KiB

After

Width:  |  Height:  |  Size: 209 KiB

View File

@@ -1,8 +1,8 @@
flask~= 2.0
flask ~= 2.0
flask_wtf
eventlet>=0.31.0
eventlet >= 0.31.0
validators
timeago ~=1.0
timeago ~= 1.0
inscriptis ~= 2.2
feedgen ~= 0.9
flask-login ~= 0.5
@@ -10,15 +10,20 @@ flask_restful
pytz
# Set these versions together to avoid a RequestsDependencyWarning
requests[socks] ~= 2.26
# >= 2.26 also adds Brotli support if brotli is installed
brotli ~= 1.0
requests[socks] ~= 2.28
urllib3 > 1.26
chardet > 2.3.0
wtforms ~= 3.0
jsonpath-ng ~= 1.5.3
# jq not available on Windows so must be installed manually
# Notification library
apprise ~= 1.0.0
apprise ~= 1.1.0
# apprise mqtt https://github.com/dgtlmoon/changedetection.io/issues/315
paho-mqtt
@@ -41,4 +46,9 @@ selenium ~= 4.1.0
# need to revisit flask login versions
werkzeug ~= 2.0.0
# Templating, so far just in the URLs but in the future can be for the notifications also
jinja2 ~= 3.1
jinja2-time
# playwright is installed at Dockerfile build time because it's not available on all platforms