Compare commits

123 Commits
0.1 ... 0.26

Author SHA1 Message Date
Leigh Morresi
b0fb52017c Handle the case of someone supplying a bad link 2021-02-24 09:56:29 +01:00
Leigh Morresi
fc6fba377a Merge branch 'master' of github.com:dgtlmoon/changedetection.io 2021-02-24 09:53:58 +01:00
Leigh Morresi
7ea39ada7c Adding jump to next change diff widget 2021-02-24 09:53:40 +01:00
dgtlmoon
e98ea37342 Moving nice screenshot to above the fold :) 2021-02-22 16:39:04 +01:00
dgtlmoon
e20577df15 Adding docker hub badge for tag information 2021-02-22 14:48:57 +01:00
Leigh Morresi
19dcbc2f08 Bumping schema tag to 0.25 2021-02-22 08:53:04 +01:00
Leigh Morresi
c59838a6e4 Issue #5 - Remove arbitrary '600' minutes limit 2021-02-22 08:38:41 +01:00
Leigh Morresi
0a8c339535 Add test delay for github action test 2021-02-21 21:08:04 +01:00
Leigh Morresi
cd5b703037 Add wait for threads in test 2021-02-21 20:54:15 +01:00
Leigh Morresi
90642742bd Extending tests to cover resetting the diff/unviewed status correctly 2021-02-21 20:46:56 +01:00
Leigh Morresi
96221598e7 Tidy up return logic 2021-02-21 20:23:50 +01:00
Leigh Morresi
98623de38c Code tidy 2021-02-21 20:14:35 +01:00
Leigh Morresi
33985dbd9d Fix docker app files paths 2021-02-21 16:31:42 +01:00
Leigh Morresi
a3a5ca78bf Tweaking Dockerfile for new eventlet wrapper 2021-02-21 16:13:55 +01:00
dgtlmoon
3fcbbb3fbf Create LICENSE 2021-02-21 15:42:45 +01:00
dgtlmoon
70252b24f9 Adding docker pulls counter badge 2021-02-21 15:39:17 +01:00
dgtlmoon
0a08616c87 Merge pull request #11 from dgtlmoon/pytest
Separate flask from eventlet runtime and get pytest working
2021-02-21 15:22:54 +01:00
Leigh Morresi
beebba487c Use master branch for badge 2021-02-21 15:21:30 +01:00
Leigh Morresi
cbeafcbaa0 Removing unused import 2021-02-21 14:26:58 +01:00
Leigh Morresi
e200cd3289 Fixing a few more easy lint wins 2021-02-21 14:26:19 +01:00
Leigh Morresi
22c7a1a88d Merge branch 'pytest' of github.com:dgtlmoon/changedetection.io into pytest 2021-02-21 14:21:45 +01:00
Leigh Morresi
63eea2d6db Linting fixups 2021-02-21 14:21:14 +01:00
dgtlmoon
3e9a110671 Update README.md 2021-02-21 14:15:21 +01:00
Leigh Morresi
22bc8fabd1 Add badge under pytest branch 2021-02-21 14:14:27 +01:00
Leigh Morresi
9030070b3d Merge branch 'master' into pytest 2021-02-21 14:09:49 +01:00
dgtlmoon
fca7bb8583 Create python-app.yml 2021-02-21 14:09:34 +01:00
Leigh Morresi
3c175bfc4a Create the test datastore 2021-02-21 14:08:34 +01:00
Leigh Morresi
fd5475ba38 Minor cleanup 2021-02-21 14:05:52 +01:00
Leigh Morresi
b0c5dbd88e Just use the current/previous md5 2021-02-21 13:46:16 +01:00
Leigh Morresi
1718e2e86f Finalse pytest methods 2021-02-21 13:41:00 +01:00
Leigh Morresi
b46a7fc4b1 Port should be an integer 2021-02-21 13:40:48 +01:00
Leigh Morresi
4770ebb2ea Tweaking client 2021-02-16 21:48:38 +01:00
Leigh Morresi
d4db082c01 remove unused imports 2021-02-16 21:44:44 +01:00
Leigh Morresi
c8607ae8bb Use session/client fixture 2021-02-16 21:42:26 +01:00
Leigh Morresi
b361a61d18 Addingmissing files 2021-02-16 21:36:41 +01:00
Leigh Morresi
87f4347fe5 hack of pytest implementation - doesnt work yet 2021-02-16 21:35:28 +01:00
Leigh Morresi
93ee65fe53 Tidy up a few broken datastore paths 2021-02-12 19:43:05 +01:00
Leigh Morresi
9f964b6d3f WIP, separate out the Flask from everything else, get pytest working 2021-02-12 19:24:30 +01:00
Leigh Morresi
426b09b7e1 Make records in the overview that have a difference that have not been viewed in the [diff] tab bold 2021-02-11 10:36:54 +01:00
Leigh Morresi
ec98415c4d Adding 0.24 tag 2021-02-05 18:46:00 +01:00
Leigh Morresi
47e5a7cf09 Avoid accidently using Python's objects that are copied - but land as a 'soft reference', need to use a better dict struct in the future #6 2021-02-05 18:43:35 +01:00
Leigh Morresi
d07cf53a07 Minor fix to 'last changed' field, simplify template and logic 2021-02-04 13:15:39 +01:00
Leigh Morresi
b9f73a6240 Remove debug print 2021-02-04 12:55:13 +01:00
Leigh Morresi
5e31ae86d0 Use a thread locker and cleaner separation of concerns between main thread and site status fetch 2021-02-04 12:38:48 +01:00
Leigh Morresi
ef2dd44e7e Adding tag to json 2021-02-03 22:28:37 +01:00
Leigh Morresi
07f41782c0 Adding SEND_FILE_MAX_AGE_DEFAULT to ensure backups etc dont get old 2021-02-03 09:45:58 +01:00
Leigh Morresi
d93926a8b6 Minor fix - load extra stylesheet only once 2021-02-03 09:29:22 +01:00
Leigh Morresi
7072858814 Minor tweaks for development setup 2021-02-03 09:28:52 +01:00
Leigh Morresi
cd5c05e72a Provide named containers and remove all existing 2021-02-02 23:41:28 +01:00
Leigh Morresi
3034d17c06 Adding new [Scrub All Version History] button under [settings] (But keep your URL list) 2021-02-02 23:26:16 +01:00
Leigh Morresi
3b2c8d356a Flag for immediate sync of index after adding new watch 2021-02-02 23:07:19 +01:00
Leigh Morresi
711853a149 Sometimes it seems .update wasnt thread safe and isnt used here, just add a clean new dict member 2021-02-02 22:51:18 +01:00
Leigh Morresi
5669ae70cc Adding ARG to Dockerfile 2021-02-02 19:11:38 +01:00
Leigh Morresi
084dcde410 Include the triggered build SHA as part of the backup info, when built in docker hub. 2021-02-02 18:32:18 +01:00
Leigh Morresi
37b070f5a0 Add cache busting var to style sheets 2021-02-02 18:11:03 +01:00
Leigh Morresi
3952f3a207 Slightly more bulletproof instructions 2021-02-02 18:10:09 +01:00
Leigh Morresi
0c3d5e55ab Updating screenshot 2021-02-02 17:58:20 +01:00
Leigh Morresi
6a102374c6 Push newly created watches directly into the update check Queue. 2021-02-02 17:50:05 +01:00
Leigh Morresi
bbd99c9aa9 Adding checkall 2021-02-02 17:46:40 +01:00
Leigh Morresi
26c9a6e0fc Easily download a full backup 2021-02-02 17:11:06 +01:00
Leigh Morresi
c4197a5045 Show the date/time of the current/most up to date version 2021-02-02 16:36:03 +01:00
Leigh Morresi
f1c2ece32f Use a pool of thread workers, better for huge lists of watchers 2021-02-02 16:29:06 +01:00
Leigh Morresi
704b8daa6d Code cleanup edit submit handler can be the same function 2021-02-02 15:37:20 +01:00
Leigh Morresi
9ec820fa97 Add update howto 2021-02-02 12:22:04 +01:00
Leigh Morresi
e7e3eb36c0 Refactor slightly confusing difference build function 2021-02-02 12:14:23 +01:00
Leigh Morresi
801b50cb5b Version comparison had the wrong order 2021-02-02 12:02:13 +01:00
Leigh Morresi
eecc620386 https://github.com/psf/requests/issues/4525 - brotli compression is not yet supported in requests, be sure that users cant accidently use this content type encoding in the headers 2021-02-02 11:49:43 +01:00
Leigh Morresi
25b565d9ba Include the current URL in the page when viewing the version diff 2021-02-01 21:56:18 +01:00
Leigh Morresi
7b4ed2429d Include a selfcheck/diagnosis routine 2021-02-01 16:56:26 +01:00
Leigh Morresi
4e0fb33580 On manual recheck request, redirect to same tag listing 2021-02-01 16:54:57 +01:00
Leigh Morresi
4931e757b9 Set default diff type to 'lines', faster for starters. 2021-02-01 12:47:32 +01:00
Leigh Morresi
3e934e8f8c Supply different versions to browse 2021-02-01 12:39:15 +01:00
dgtlmoon
118814912f Fix heading 2021-02-01 11:43:39 +01:00
dgtlmoon
4013e34899 Update README.md
Boldify callout text
2021-02-01 11:42:51 +01:00
Leigh Morresi
b58cf76445 Adding diff screenshot 2021-02-01 10:52:53 +01:00
Leigh Morresi
0042ca08e1 Add more start-up examples 2021-02-01 10:24:29 +01:00
Leigh Morresi
7b5e839e3d Tweak theming 2021-02-01 10:24:15 +01:00
Leigh Morresi
7c589a73c5 Use a even simpler run command 2021-02-01 10:16:21 +01:00
Leigh Morresi
3e23ed314a improve the wording 2021-01-31 21:38:11 +01:00
Leigh Morresi
d0ee49e465 Add basic settings page (so far just recheck time in minutes) 2021-01-31 21:36:42 +01:00
Leigh Morresi
5b8252c171 Updating README 2021-01-31 20:20:49 +01:00
Leigh Morresi
3e503dbbc6 Updating screenshot (new diff button) 2021-01-31 20:19:32 +01:00
Leigh Morresi
86f2f54abe Trigger write index after edit of a watch 2021-01-31 20:07:10 +01:00
Leigh Morresi
81534d9367 Add [diff] mechanism 2021-01-31 19:55:35 +01:00
Leigh Morresi
43c7ccb3fe Use a single thread for writing the sync json 2021-01-31 18:49:14 +01:00
Leigh Morresi
a6c864ecfd Use existing tag 2021-01-30 13:03:14 +01:00
Leigh Morresi
0ee47a1274 When all items showed, show which tag it belongs to 2021-01-30 12:49:36 +01:00
Leigh Morresi
17701c5c72 Sort tag list 2021-01-30 12:44:36 +01:00
Leigh Morresi
a8ae9d54aa Set active tag selection 2021-01-30 12:26:22 +01:00
Leigh Morresi
3eaccfe5da Support for comma separated tags 2021-01-30 11:22:59 +01:00
Leigh Morresi
e589b441db Tweak styling for 'new watch' form 2021-01-30 10:40:42 +01:00
Leigh Morresi
bfcb17ca24 Remove import for old lib 2021-01-30 10:29:39 +01:00
Leigh Morresi
98f6f4619f Switch to inscriptis
prepare config backend struct
2021-01-30 10:14:19 +01:00
Leigh Morresi
fbe20d45cc Support for custom headers per watch 2021-01-29 19:12:39 +01:00
Leigh Morresi
3f60ab4167 Going back to larger PNG screenshot, looks better in Github :) 2021-01-29 18:06:45 +01:00
Leigh Morresi
13dcce9b47 Fix alt text in markup 2021-01-29 18:02:15 +01:00
Leigh Morresi
d133315021 Adding new screenshot binary 2021-01-29 18:01:41 +01:00
Leigh Morresi
49dd88fba5 Updating screenshot 2021-01-29 18:01:21 +01:00
dgtlmoon
49ff826a48 Moving start text to a more visible part 2021-01-29 17:55:34 +01:00
Leigh Morresi
a899c8e12c Tweak messages 2021-01-29 17:52:49 +01:00
Leigh Morresi
7ee36fad8a Change message text 2021-01-29 17:51:40 +01:00
Leigh Morresi
ba17b23f7a Fixing messages styling 2021-01-29 17:50:47 +01:00
Leigh Morresi
b6e9bb12fb Basic tag browse buttons 2021-01-29 15:51:30 +01:00
Leigh Morresi
016937d5de Bulk import 2021-01-29 15:39:38 +01:00
Leigh Morresi
abef169382 Tidy up 'last_checked' date handling 2021-01-29 14:45:12 +01:00
Leigh Morresi
04c8ea7960 Dev environment setup 2021-01-29 14:45:07 +01:00
Leigh Morresi
4ab49f2c8c Dev docker tweaks 2021-01-29 14:45:00 +01:00
Leigh Morresi
ded9d09f37 Remove messy text 2021-01-29 13:28:46 +01:00
Leigh Morresi
1dd34d7548 Tweaking text 2021-01-29 13:26:16 +01:00
Leigh Morresi
7fb8a37da3 Fixing checkall hook 2021-01-29 13:12:27 +01:00
Leigh Morresi
324c54fe46 Use requests's r.text so we dont have to deal with charsets 2021-01-29 13:05:31 +01:00
Leigh Morresi
8b775f9188 Add note 2021-01-29 13:05:26 +01:00
Leigh Morresi
00020a4c90 Fix bad copy command 2021-01-29 12:40:51 +01:00
Leigh Morresi
9edb591670 Oops left out the image name 2021-01-29 12:40:39 +01:00
Leigh Morresi
bbccb3181b Fix build setup for the docker hub image https://hub.docker.com/r/dgtlmoon/changedetection.io 2021-01-29 12:33:42 +01:00
Leigh Morresi
b263773e09 Update screenshot 2021-01-29 12:33:37 +01:00
Leigh Morresi
b7a0c2dbcd Add edit UI
Move to keyed structure instead of list
2021-01-29 10:49:05 +01:00
Leigh Morresi
1629dee6a5 Fixes to CSS 2021-01-28 18:38:47 +01:00
Leigh Morresi
85a91d6e51 Add method to launch a full recheck of all
@note - needs to be converted to a python Queue threads
2021-01-28 15:30:02 +01:00
Leigh Morresi
0d1bc1a22c Merge branch 'master' of github.com:dgtlmoon/changedetection.io 2021-01-28 14:45:39 +01:00
Leigh Morresi
cf345dc567 Tweaks to docker layout 2021-01-28 14:45:30 +01:00
Leigh Morresi
9c0c8bf6aa Remove actual :// links, dont consider these as part of the changes, often they include variables/trackingscript ref etc 2021-01-28 14:45:01 +01:00
dgtlmoon
cf046d88da Create FUNDING.yml 2021-01-28 13:51:04 +01:00
38 changed files with 3055 additions and 536 deletions
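
Several commits in the list above mention a pool of thread workers fed from an update queue (f1c2ece32f, 6a102374c6). The following is only a minimal sketch of that general pattern using the standard library, not the project's actual implementation; `check_watch`, `worker`, and `run_pool` are hypothetical names chosen for illustration.

```python
import queue
import threading


def check_watch(uuid):
    # Hypothetical stand-in for "fetch the page and compare checksums".
    return f"checked {uuid}"


def worker(update_q, results, lock):
    # Each worker pulls UUIDs off the shared queue until it sees a None sentinel.
    while True:
        uuid = update_q.get()
        if uuid is None:
            update_q.task_done()
            break
        outcome = check_watch(uuid)
        with lock:
            results[uuid] = outcome
        update_q.task_done()


def run_pool(uuids, num_workers=4):
    update_q = queue.Queue()
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(update_q, results, lock), daemon=True)
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    # Newly added watches can be pushed straight onto the queue like this.
    for uuid in uuids:
        update_q.put(uuid)
    # One sentinel per worker so every thread shuts down cleanly.
    for _ in threads:
        update_q.put(None)
    update_q.join()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pool(["a", "b", "c"], num_workers=2))
```

A fixed number of workers keeps the count of concurrent fetches bounded regardless of how long the watch list grows, which is presumably the motivation behind the "better for huge lists of watchers" commit message.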

12
.github/FUNDING.yml vendored Normal file

@@ -0,0 +1,12 @@
# These are supported funding model platforms
github: dgtlmoon
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']

37
.github/workflows/python-app.yml vendored Normal file

@@ -0,0 +1,37 @@
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
name: changedetection.io
on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - name: Set up Python 3.9
      uses: actions/setup-python@v2
      with:
        python-version: 3.9
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install flake8 pytest
        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
    - name: Lint with flake8
      run: |
        # stop the build if there are Python syntax errors or undefined names
        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
    - name: Test with pytest
      run: |
        cd backend; pytest

4
.gitignore vendored

@@ -2,4 +2,6 @@ __pycache__
.idea
*.pyc
datastore/url-watches.json
datastore/*
datastore/*
__pycache__
.pytest_cache

28
Dockerfile Normal file

@@ -0,0 +1,28 @@
FROM python:3.8-slim
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt
RUN [ ! -d "/app" ] && mkdir /app
RUN [ ! -d "/datastore" ] && mkdir /datastore
# The actual flask app
COPY backend /app/backend
# The eventlet server wrapper
COPY changedetection.py /app/changedetection.py
WORKDIR /app
# https://stackoverflow.com/questions/58701233/docker-logs-erroneously-appears-empty-until-container-stops
ENV PYTHONUNBUFFERED=1
# Attempt to store the triggered commit
ARG SOURCE_COMMIT
ARG SOURCE_BRANCH
RUN echo "commit: $SOURCE_COMMIT branch: $SOURCE_BRANCH" >/source.txt
CMD [ "python", "./changedetection.py" , "-d", "/datastore"]

201
LICENSE Normal file

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@@ -1,32 +1,63 @@
# changedetection.io
![changedetection.io](https://github.com/dgtlmoon/changedetection.io/actions/workflows/python-app.yml/badge.svg?branch=master)
<a href="https://hub.docker.com/r/dgtlmoon/changedetection.io" target="_blank" title="Change detection docker hub">
<img src="https://img.shields.io/docker/pulls/dgtlmoon/changedetection.io" alt="Docker Pulls"/>
</a>
<a href="https://hub.docker.com/r/dgtlmoon/changedetection.io" target="_blank" title="Change detection docker hub">
<img src="https://img.shields.io/docker/v/dgtlmoon/changedetection.io" alt="Change detection latest tag version"/>
</a>
## Self-hosted change monitoring of web pages.
_Why?_ Many years ago I had used a couple of web site change detection/monitoring services,
but they got bought up by larger media companies, after this, whenI logged in they
wanted _even more_ private data about me.
_Know when web pages change! Stay ontop of new information!_
All I simply wanted todo was to know which pages were changing and when (and maybe see
some basic information about what those changes were)
![Self-hosted web page change monitoring application screenshot](screenshot.png?raw=true "Self-hosted web page change monitoring screenshot")
![Alt text](screenshot.png?raw=true "Screenshot")
Get going...
#### Example use cases
```
$ git clone https://github.com/dgtlmoon/changedetection.io.git
$ cd changedetection.io
$ docker-compose up -d
Know when ...
- Government department updates (changes are often only on their websites)
- Local government news (changes are often only on their websites)
- New software releases, security advisories when you're not on their mailing list.
- Festivals with changes
- Realestate listing changes
**Get monitoring now! super simple, one command!**
```bash
docker run -d --restart always -p "127.0.0.1:5000:5000" -v datastore-volume:/datastore --name changedetection.io dgtlmoon/changedetection.io
```
Now visit http://127.0.0.1:5000 , The interface will now expose the UI, you can change this in the `docker-compose.yml`
Now visit http://127.0.0.1:5000 , You should now be able to access the UI.
#### Updating to latest version
Highly recommended :)
```bash
docker pull dgtlmoon/changedetection.io
docker kill $(docker ps -a|grep changedetection.io|awk '{print $1}')
docker rm $(docker ps -a|grep changedetection.io|awk '{print $1}')
docker run -d --restart always -p "127.0.0.1:5000:5000" -v datastore-volume:/datastore --name changedetection.io dgtlmoon/changedetection.io
```
### Screenshots
Examining differences in content.
![Self-hosted web page change monitoring context difference screenshot](screenshot-diff.png?raw=true "Self-hosted web page change monitoring context difference screenshot")
### Future plans
- Greater configuration of check interval times, page request headers.
- General options for timeout, default headers
- ~~General options for timeout, default headers~~
- On change detection, callout to another API (handy for notices/issue trackers)
- Explore the differences that were detected.
- ~~Explore the differences that were detected~~
- Add more options to explore versions of differences
- Use a graphic/rendered page difference instead of text (see the experimental `selenium-screenshot-diff` branch)
Please :star: star :star: this project and help it grow! https://github.com/dgtlmoon/changedetection.io/

1
backend/README-pytest.md Normal file

@@ -0,0 +1 @@
Note: run `pytest` from this directory.
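
As an illustration of the kind of test `pytest` would collect when run from this directory, here is a small hedged sketch. It does not come from the repository: `parse_header_lines` is a hypothetical helper that mirrors the "split each line on the first colon" header handling visible in `edit_page` in `backend/__init__.py`, with the added assumption that blank or colon-less lines are skipped rather than raising.

```python
def parse_header_lines(text):
    """Turn 'Name: value' lines into a dict (hypothetical helper, not project code)."""
    headers = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: silently ignore blank lines and lines without a colon.
        if not line or ':' not in line:
            continue
        name, value = line.split(':', 1)
        headers[name.strip()] = value.strip()
    return headers


def test_parse_header_lines():
    # pytest collects any function whose name starts with "test_".
    text = "User-Agent: demo\n\nAccept: text/html\nnot-a-header\n"
    assert parse_header_lines(text) == {
        "User-Agent": "demo",
        "Accept": "text/html",
    }
```

Splitting on the first colon only means values that themselves contain colons, such as URLs, survive intact.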

489
backend/__init__.py Normal file

@@ -0,0 +1,489 @@
#!/usr/bin/python3
# @todo logging
# @todo sort by last_changed
# @todo extra options for url like , verify=False etc.
# @todo enable https://urllib3.readthedocs.io/en/latest/user-guide.html#ssl as option?
# @todo maybe a button to reset all 'last-changed'.. so you can see it clearly when something happens since your last visit
# @todo option for interval day/6 hour/etc
# @todo on change detected, config for calling some API
# @todo make tables responsive!
# @todo fetch title into json
# https://distill.io/features
# proxy per check
# - flask_cors, itsdangerous,MarkupSafe
import time
import os
import timeago
import threading
import queue
from flask import Flask, render_template, request, send_file, send_from_directory, abort, redirect, url_for
datastore = None
# Local
running_update_threads = []
ticker_thread = None
messages = []
extra_stylesheets = []
update_q = queue.Queue()
app = Flask(__name__, static_url_path="/var/www/change-detection/backen/static")
# Stop browser caching of assets
app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 0
app.config['STOP_THREADS'] = False
# Disables caching of the templates
app.config['TEMPLATES_AUTO_RELOAD'] = True
# We use the whole watch object from the store/JSON so we can see if there's some related status in terms of a thread
# running or something similar.
@app.template_filter('format_last_checked_time')
def _jinja2_filter_datetime(watch_obj, format="%Y-%m-%d %H:%M:%S"):
    # Worker thread tells us which UUID it is currently processing.
    for t in running_update_threads:
        if t.current_uuid == watch_obj['uuid']:
            return "Checking now.."
    if watch_obj['last_checked'] == 0:
        return 'Not yet'
    return timeago.format(int(watch_obj['last_checked']), time.time())
# @app.context_processor
# def timeago():
#     def _timeago(lower_time, now):
#         return timeago.format(lower_time, now)
#     return dict(timeago=_timeago)
@app.template_filter('format_timestamp_timeago')
def _jinja2_filter_datetimestamp(timestamp, format="%Y-%m-%d %H:%M:%S"):
    return timeago.format(timestamp, time.time())
    # return timeago.format(timestamp, time.time())
    # return datetime.datetime.utcfromtimestamp(timestamp).strftime(format)
def changedetection_app(config=None, datastore_o=None):
    global datastore
    datastore = datastore_o
    # Hmm
    app.config.update(dict(DEBUG=True))
    app.config.update(config or {})
    # Setup cors headers to allow all domains
    # https://flask-cors.readthedocs.io/en/latest/
    # CORS(app)
    # https://github.com/pallets/flask/blob/93dd1709d05a1cf0e886df6223377bdab3b077fb/examples/tutorial/flaskr/__init__.py#L39
    # You can divide up the stuff like this
    @app.route("/", methods=['GET'])
    def index():
        global messages
        limit_tag = request.args.get('tag')
        # Sort by last_changed and add the uuid which is usually the key..
        sorted_watches = []
        for uuid, watch in datastore.data['watching'].items():
            if limit_tag != None:
                # Support for comma separated list of tags.
                for tag_in_watch in watch['tag'].split(','):
                    tag_in_watch = tag_in_watch.strip()
                    if tag_in_watch == limit_tag:
                        watch['uuid'] = uuid
                        sorted_watches.append(watch)
            else:
                watch['uuid'] = uuid
                sorted_watches.append(watch)
        sorted_watches.sort(key=lambda x: x['last_changed'], reverse=True)
        existing_tags = datastore.get_all_tags()
        output = render_template("watch-overview.html",
                                 watches=sorted_watches,
                                 messages=messages,
                                 tags=existing_tags,
                                 active_tag=limit_tag)
        # Show messages but once.
        messages = []
        return output
    @app.route("/scrub", methods=['GET', 'POST'])
    def scrub_page():
        from pathlib import Path
        global messages
        if request.method == 'POST':
            confirmtext = request.form.get('confirmtext')
            if confirmtext == 'scrub':
                for txt_file_path in Path(app.config['datastore_path']).rglob('*.txt'):
                    os.unlink(txt_file_path)
                for uuid, watch in datastore.data['watching'].items():
                    watch['last_checked'] = 0
                    watch['last_changed'] = 0
                    watch['previous_md5'] = None
                    watch['history'] = {}
                datastore.needs_write = True
                messages.append({'class': 'ok', 'message': 'Cleaned all version history.'})
            else:
                messages.append({'class': 'error', 'message': 'Wrong confirm text.'})
            return redirect(url_for('index'))
        return render_template("scrub.html")
@app.route("/edit", methods=['GET', 'POST'])
def edit_page():
global messages
import validators
if request.method == 'POST':
uuid = request.args.get('uuid')
url = request.form.get('url').strip()
tag = request.form.get('tag').strip()
form_headers = request.form.get('headers').strip().split("\n")
extra_headers = {}
if form_headers:
for header in form_headers:
if len(header):
parts = header.split(':', 1)
# Skip lines without a "Name: value" separator instead of raising IndexError
if len(parts) == 2:
extra_headers.update({parts[0].strip(): parts[1].strip()})
validators.url(url) # @todo switch to prop/attr/observer
datastore.data['watching'][uuid].update({'url': url,
'tag': tag,
'headers': extra_headers})
datastore.needs_write = True
messages.append({'class': 'ok', 'message': 'Updated watch.'})
return redirect(url_for('index'))
else:
uuid = request.args.get('uuid')
output = render_template("edit.html", uuid=uuid, watch=datastore.data['watching'][uuid], messages=messages)
return output
@app.route("/settings", methods=['GET', "POST"])
def settings_page():
global messages
if request.method == 'POST':
try:
minutes = int(request.values.get('minutes').strip())
except ValueError:
messages.append({'class': 'error', 'message': "Invalid value given, use an integer."})
else:
if minutes >= 5:
datastore.data['settings']['requests']['minutes_between_check'] = minutes
datastore.needs_write = True
messages.append({'class': 'ok', 'message': "Updated"})
else:
messages.append(
{'class': 'error', 'message': "Must be at least 5 minutes."})
output = render_template("settings.html", messages=messages,
minutes=datastore.data['settings']['requests']['minutes_between_check'])
messages = []
return output
@app.route("/import", methods=['GET', "POST"])
def import_page():
import validators
global messages
remaining_urls = []
good = 0
if request.method == 'POST':
urls = request.values.get('urls').split("\n")
for url in urls:
url = url.strip()
if len(url) and validators.url(url):
new_uuid = datastore.add_watch(url=url.strip(), tag="")
# Straight into the queue.
update_q.put(new_uuid)
good += 1
else:
if len(url):
remaining_urls.append(url)
messages.append({'class': 'ok', 'message': "{} Imported, {} Skipped.".format(good, len(remaining_urls))})
if len(remaining_urls) == 0:
return redirect(url_for('index'))
else:
output = render_template("import.html",
messages=messages,
remaining="\n".join(remaining_urls)
)
messages = []
return output
@app.route("/diff/<string:uuid>", methods=['GET'])
def diff_history_page(uuid):
global messages
# More for testing, possible to return the first/only
if uuid == 'first':
uuid = list(datastore.data['watching'].keys()).pop()
extra_stylesheets = ['/static/css/diff.css']
try:
watch = datastore.data['watching'][uuid]
except KeyError:
messages.append({'class': 'error', 'message': "No history found for the specified link, bad link?"})
return redirect(url_for('index'))
dates = list(watch['history'].keys())
# Convert to int, sort and back to str again
dates = [int(i) for i in dates]
dates.sort(reverse=True)
dates = [str(i) for i in dates]
if len(dates) < 2:
messages.append({'class': 'error', 'message': "Not enough saved change detection snapshots to produce a report."})
return redirect(url_for('index'))
# Save the current newest history as the most recently viewed
datastore.set_last_viewed(uuid, dates[0])
newest_file = watch['history'][dates[0]]
with open(newest_file, 'r') as f:
newest_version_file_contents = f.read()
previous_version = request.args.get('previous_version')
try:
previous_file = watch['history'][previous_version]
except KeyError:
# Not present, use a default value, the second one in the sorted list.
previous_file = watch['history'][dates[1]]
with open(previous_file, 'r') as f:
previous_version_file_contents = f.read()
output = render_template("diff.html", watch_a=watch,
messages=messages,
newest=newest_version_file_contents,
previous=previous_version_file_contents,
extra_stylesheets=extra_stylesheets,
versions=dates[1:],
newest_version_timestamp=dates[0],
current_previous_version=str(previous_version),
current_diff_url=watch['url'])
return output
@app.route("/favicon.ico", methods=['GET'])
def favicon():
return send_from_directory("/app/static/images", filename="favicon.ico")
# We're good but backups are even better!
@app.route("/backup", methods=['GET'])
def get_backup():
import zipfile
from pathlib import Path
# create a ZipFile object
backupname = "changedetection-backup-{}.zip".format(int(time.time()))
# We only care about UUIDS from the current index file
uuids = list(datastore.data['watching'].keys())
with zipfile.ZipFile(os.path.join(app.config['datastore_path'], backupname), 'w',
compression=zipfile.ZIP_DEFLATED,
compresslevel=6) as zipObj:
# Be sure we're written fresh
datastore.sync_to_json()
# Add the index
zipObj.write(os.path.join(app.config['datastore_path'], "url-watches.json"))
# Add any snapshot data we find
for txt_file_path in Path(app.config['datastore_path']).rglob('*.txt'):
parent_p = txt_file_path.parent
if parent_p.name in uuids:
zipObj.write(txt_file_path)
return send_file(os.path.join(app.config['datastore_path'], backupname),
as_attachment=True,
mimetype="application/zip",
attachment_filename=backupname)
@app.route("/static/<string:group>/<string:filename>", methods=['GET'])
def static_content(group, filename):
# These files should be in our subdirectory
full_path = os.path.realpath(__file__)
p = os.path.dirname(full_path)
try:
return send_from_directory("{}/static/{}".format(p, group), filename=filename)
except FileNotFoundError:
abort(404)
@app.route("/api/add", methods=['POST'])
def api_watch_add():
global messages
# @todo add_watch should throw a custom Exception for validation etc
new_uuid = datastore.add_watch(url=request.form.get('url').strip(), tag=request.form.get('tag').strip())
# Straight into the queue.
update_q.put(new_uuid)
messages.append({'class': 'ok', 'message': 'Watch added.'})
return redirect(url_for('index'))
@app.route("/api/delete", methods=['GET'])
def api_delete():
global messages
uuid = request.args.get('uuid')
datastore.delete(uuid)
messages.append({'class': 'ok', 'message': 'Deleted.'})
return redirect(url_for('index'))
@app.route("/api/checknow", methods=['GET'])
def api_watch_checknow():
global messages
tag = request.args.get('tag')
uuid = request.args.get('uuid')
i = 0
running_uuids = []
for t in running_update_threads:
running_uuids.append(t.current_uuid)
# @todo check thread is running and skip
if uuid:
if uuid not in running_uuids:
update_q.put(uuid)
i = 1
elif tag is not None:
# Items that have this current tag
for watch_uuid, watch in datastore.data['watching'].items():
if tag in watch['tag']:
i += 1
if watch_uuid not in running_uuids:
update_q.put(watch_uuid)
else:
# No tag, no uuid, add everything.
for watch_uuid, watch in datastore.data['watching'].items():
i += 1
if watch_uuid not in running_uuids:
update_q.put(watch_uuid)
messages.append({'class': 'ok', 'message': "{} watches are rechecking.".format(i)})
return redirect(url_for('index', tag=tag))
# @todo handle ctrl break
# Thread.start() returns None, so keep the Thread object before starting it
ticker_thread = threading.Thread(target=ticker_thread_check_time_launch_checks)
ticker_thread.start()
return app
# Requests for checking on the site use a pool of thread Workers managed by a Queue.
class Worker(threading.Thread):
current_uuid = None
def __init__(self, q, *args, **kwargs):
self.q = q
super().__init__(*args, **kwargs)
def run(self):
from backend import fetch_site_status
update_handler = fetch_site_status.perform_site_check(datastore=datastore)
while True:
try:
uuid = self.q.get(block=True, timeout=1)
except queue.Empty:
# We have a chance to kill this thread that needs to monitor for new jobs..
# Delays here would be caused by a current response object pending
# @todo switch to threaded response handler
if app.config['STOP_THREADS']:
return
else:
self.current_uuid = uuid
if uuid in list(datastore.data['watching'].keys()):
try:
changed_detected, result, contents = update_handler.run(uuid)
except PermissionError as s:
app.logger.error("File permission error updating %s: %s", uuid, str(s))
else:
if result:
datastore.update_watch(uuid=uuid, update_obj=result)
if changed_detected:
# A change was detected
datastore.save_history_text(uuid=uuid, contents=contents, result_obj=result)
self.current_uuid = None # Done
self.q.task_done()
# Thread runner to check every minute, look for new watches to feed into the Queue.
def ticker_thread_check_time_launch_checks():
# Spin up Workers.
for _ in range(datastore.data['settings']['requests']['workers']):
new_worker = Worker(update_q)
running_update_threads.append(new_worker)
new_worker.start()
# Every minute check for new UUIDs to follow up on
while True:
if app.config['STOP_THREADS']:
return
running_uuids = []
for t in running_update_threads:
running_uuids.append(t.current_uuid)
# Look at the dataset, find a stale watch to process
minutes = datastore.data['settings']['requests']['minutes_between_check']
for uuid, watch in datastore.data['watching'].items():
if watch['last_checked'] <= time.time() - (minutes * 60):
# @todo maybe update_q.queue is enough?
if uuid not in running_uuids and uuid not in update_q.queue:
update_q.put(uuid)
# Should be low so we can break this out in testing
time.sleep(1)
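The Worker/ticker arrangement above can be sketched in isolation. This is only a minimal illustration under stated assumptions, not the project's code: `stop_event` stands in for `app.config['STOP_THREADS']`, `results` replaces the real site check, and `run_demo` is a hypothetical helper that exists only for this example.

```python
import queue
import threading

# Hypothetical stand-in for app.config['STOP_THREADS'] in the diff above.
stop_event = threading.Event()
# Collects processed ids so the sketch can be inspected afterwards.
results = []
results_lock = threading.Lock()


class Worker(threading.Thread):
    def __init__(self, q, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.q = q
        self.current_uuid = None

    def run(self):
        while True:
            try:
                # A short timeout lets the thread notice the stop flag regularly.
                uuid = self.q.get(block=True, timeout=0.1)
            except queue.Empty:
                if stop_event.is_set():
                    return
                continue
            self.current_uuid = uuid
            try:
                with results_lock:
                    results.append(uuid)  # placeholder for the real site check
            finally:
                self.current_uuid = None  # done, whether or not the check failed
                self.q.task_done()


def run_demo(uuids, workers=3):
    """Start a small pool, feed it ids, wait for completion, then stop the pool."""
    q = queue.Queue()
    pool = [Worker(q, daemon=True) for _ in range(workers)]
    for w in pool:
        w.start()
    for u in uuids:
        q.put(u)
    q.join()          # blocks until every queued id has called task_done()
    stop_event.set()  # idle workers exit on their next timeout
    for w in pool:
        w.join(timeout=2)
    return sorted(results)
```

Calling `task_done()` in a `finally` block is a design choice of this sketch: it keeps `q.join()` from hanging if a check raises.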

View File

@@ -1,183 +0,0 @@
#!/usr/bin/python3
# @todo logging
# @todo sort by last_changed
# @todo extra options for url like , verify=False etc.
# @todo enable https://urllib3.readthedocs.io/en/latest/user-guide.html#ssl as option?
# @todo maybe a button to reset all 'last-changed'.. so you can see it clearly when something happens since your last visit
# @todo option for interval day/6 hour/etc
# @todo on change detected, config for calling some API
# @todo make tables responsive!
# @todo fetch title into json
import json
import eventlet
import eventlet.wsgi
import time
import os
import getopt
import sys
import datetime
import timeago
import threading
from flask import Flask, render_template, request, send_file, send_from_directory, safe_join, abort, redirect, url_for
# Local
import store
import fetch_site_status
ticker_thread = None
datastore = store.ChangeDetectionStore()
messages = []
running_update_threads = {}
app = Flask(__name__, static_url_path='/static')
app.config['STATIC_RESOURCES'] = "/app/static"
# app.config['SECRET_KEY'] = 'secret!'
# Disables caching of the templates
app.config['TEMPLATES_AUTO_RELOAD'] = True
# We use the whole watch object from the store/JSON so we can see if there's some related status in terms of a thread
# running or something similar.
@app.template_filter('format_last_checked_time')
def _jinja2_filter_datetime(watch_obj, format="%Y-%m-%d %H:%M:%S"):
global running_update_threads
if watch_obj['uuid'] in running_update_threads:
if running_update_threads[watch_obj['uuid']].is_alive():
return "Checking now.."
if watch_obj['last_checked'] == 0:
return 'Not yet'
return datetime.datetime.utcfromtimestamp(int(watch_obj['last_checked'])).strftime(format)
# @app.context_processor
# def timeago():
# def _timeago(lower_time, now):
# return timeago.format(lower_time, now)
# return dict(timeago=_timeago)
@app.template_filter('format_timestamp_timeago')
def _jinja2_filter_datetimestamp(timestamp, format="%Y-%m-%d %H:%M:%S"):
if timestamp == 0:
return 'Not yet'
return timeago.format(timestamp, time.time())
# return timeago.format(timestamp, time.time())
# return datetime.datetime.utcfromtimestamp(timestamp).strftime(format)
@app.route("/", methods=['GET'])
def main_page():
global messages
# Show messages but once.
# maybe if the change happened more than a few days ago.. add a class
# Sort by last_changed
datastore.data['watching'].sort(key=lambda x: x['last_changed'], reverse=True)
output = render_template("watch-overview.html", watches=datastore.data['watching'], messages=messages)
messages = []
return output
@app.route("/favicon.ico", methods=['GET'])
def favicon():
return send_from_directory("/app/static/images", filename="favicon.ico")
@app.route("/static/<string:group>/<string:filename>", methods=['GET'])
def static_content(group, filename):
try:
return send_from_directory("/app/static/{}".format(group), filename=filename)
except FileNotFoundError:
abort(404)
@app.route("/api/add", methods=['POST'])
def api_watch_add():
global messages
# @todo add_watch should throw a custom Exception for validation etc
datastore.add_watch(url=request.form.get('url').strip(), tag=request.form.get('tag').strip())
messages.append({'class': 'ok', 'message': 'Saved'})
launch_checks()
return redirect(url_for('main_page'))
@app.route("/api/checknow", methods=['GET'])
def api_watch_checknow():
global messages
uuid = request.args.get('uuid')
# dict would be better, this is a simple safety catch.
for watch in datastore.data['watching']:
if watch['uuid'] == uuid:
# @todo cancel if already running?
running_update_threads[uuid] = fetch_site_status.perform_site_check(uuid=uuid,
datastore=datastore)
running_update_threads[uuid].start()
return redirect(url_for('main_page'))
# Can be used whenever, launch threads that need launching to update the stored information
def launch_checks():
import fetch_site_status
global running_update_threads
for watch in datastore.data['watching']:
if watch['last_checked'] <= time.time() - 3 * 60 * 60:
running_update_threads[watch['uuid']] = fetch_site_status.perform_site_check(uuid=watch['uuid'],
datastore=datastore)
running_update_threads[watch['uuid']].start()
# Thread runner to check every minute
def ticker_thread_check_time_launch_checks():
while True:
launch_checks()
time.sleep(60)
def main(argv):
ssl_mode = False
port = 5000
# @todo handle ctrl break
ticker_thread = threading.Thread(target=ticker_thread_check_time_launch_checks).start()
try:
opts, args = getopt.getopt(argv, "sp:")
except getopt.GetoptError:
print('backend.py -s SSL enable -p [port]')
sys.exit(2)
for opt, arg in opts:
if opt == '-s':
ssl_mode = True
if opt == '-p':
port = arg
# @todo finalise SSL config, but this should get you in the right direction if you need it.
if ssl_mode:
eventlet.wsgi.server(eventlet.wrap_ssl(eventlet.listen(('', port)),
certfile='cert.pem',
keyfile='privkey.pem',
server_side=True), app)
else:
eventlet.wsgi.server(eventlet.listen(('', port)), app)
if __name__ == '__main__':
main(sys.argv[1:])

View File

@@ -1,4 +1,15 @@
FROM python:3.8-buster
FROM python:3.8-slim
# https://stackoverflow.com/questions/58701233/docker-logs-erroneously-appears-empty-until-container-stops
ENV PYTHONUNBUFFERED=1
# Should be mounted from docker-compose-development.yml
RUN pip3 install -r /requirements.txt
WORKDIR /app
RUN [ ! -d "/datastore" ] && mkdir /datastore
COPY sleep.py /
CMD [ "python", "/sleep.py" ]

View File

@@ -1,20 +0,0 @@
aiohttp
async-timeout
chardet==2.3.0
multidict
python-engineio
six==1.10.0
yarl
flask
eventlet
requests
validators
bleach==3.2.1
html5lib==0.9999999 # via bleach
timeago
html2text
# @notes
# - Dont install socketio, it interferes with flask_socketio

View File

@@ -1,9 +1,7 @@
import time
import sys
print ("Sleep loop, you should run your script from the console")
while True:
# Wait for 2 seconds
time.sleep(2)
time.sleep(2)

View File

@@ -1,107 +1,91 @@
from threading import Thread
import time
import requests
import hashlib
import os
from inscriptis import get_text
# Hmm Polymorphism datastore, thread, etc
class perform_site_check(Thread):
def __init__(self, *args, uuid=False, datastore, **kwargs):
# Some common stuff here that can be moved to a base class
class perform_site_check():
def __init__(self, *args, datastore, **kwargs):
super().__init__(*args, **kwargs)
self.timestamp = int(time.time()) # used for storage etc too
self.uuid = uuid
self.datastore = datastore
self.url = datastore.get_val(uuid, 'url')
self.current_md5 = datastore.get_val(uuid, 'previous_md5')
self.output_path = "/datastore/{}".format(self.uuid)
def save_firefox_screenshot(self, uuid, output):
# @todo call selenium or whatever
return
def run(self, uuid):
timestamp = int(time.time()) # used for storage etc too
stripped_text_from_html = False
changed_detected = False
def ensure_output_path(self):
update_obj = {'previous_md5': self.datastore.data['watching'][uuid]['previous_md5'],
'history': {},
"last_checked": timestamp
}
extra_headers = self.datastore.get_val(uuid, 'headers')
# Tweak the base config with the per-watch ones
request_headers = self.datastore.data['settings']['headers']
request_headers.update(extra_headers)
# https://github.com/psf/requests/issues/4525
# Requests doesn't yet support brotli encoding, so don't put 'br' here, be totally sure that the user cannot
# do this by accident.
if 'Accept-Encoding' in request_headers and "br" in request_headers['Accept-Encoding']:
request_headers['Accept-Encoding'] = request_headers['Accept-Encoding'].replace(', br', '')
try:
os.stat(self.output_path)
except:
os.mkdir(self.output_path)
def save_response_html_output(self, output):
# @todo maybe record a history.json, [timestamp, md5, filename]
with open("{}/{}.txt".format(self.output_path, self.timestamp), 'w') as f:
f.write(output)
f.close()
def save_response_stripped_output(self, output):
fname = "{}/{}.stripped.txt".format(self.output_path, self.timestamp)
with open(fname, 'w') as f:
f.write(output)
f.close()
return fname
def run(self):
headers = {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8,cs;q=0.7'
}
extra_headers = self.datastore.get_val(self.uuid, 'headers')
headers.update(extra_headers)
print (headers)
print("Checking", self.url)
import html2text
self.ensure_output_path()
timeout = self.datastore.data['settings']['requests']['timeout']
except KeyError:
# @todo yeah this should go back to the default value in store.py, but this whole object should abstract off it
timeout = 15
try:
r = requests.get(self.url, headers=headers, timeout=15, verify=False)
stripped_text_from_html = html2text.html2text(r.content.decode('utf-8'))
url = self.datastore.get_val(uuid, 'url')
r = requests.get(url,
headers=request_headers,
timeout=timeout,
verify=False)
stripped_text_from_html = get_text(r.text)
# Usually from networkIO/requests level
except (requests.exceptions.ConnectionError,requests.exceptions.ReadTimeout) as e:
self.datastore.update_watch(self.uuid, 'last_error', str(e))
except (requests.exceptions.ConnectionError, requests.exceptions.ReadTimeout) as e:
update_obj["last_error"] = str(e)
print(str(e))
except requests.exceptions.MissingSchema:
print("Skipping {} due to missing schema/bad url".format(uuid))
# Usually from html2text level
except UnicodeDecodeError as e:
self.datastore.update_watch(self.uuid, 'last_error', str(e))
update_obj["last_error"] = str(e)
print(str(e))
# figure out how to deal with this cleaner..
# 'utf-8' codec can't decode byte 0xe9 in position 480: invalid continuation byte
else:
# We rely on the actual text in the html output.. many sites have random script vars etc,
# in the future we'll implement other mechanisms.
# We rely on the actual text in the html output.. many sites have random script vars etc
self.datastore.update_watch(self.uuid, 'last_error', False)
self.datastore.update_watch(self.uuid, 'last_check_status', r.status_code)
update_obj["last_check_status"] = r.status_code
update_obj["last_error"] = False
if not len(r.text):
update_obj["last_error"] = "Empty reply"
fetched_md5 = hashlib.md5(stripped_text_from_html.encode('utf-8')).hexdigest()
if self.current_md5 != fetched_md5:
# could be None or False depending on JSON type
if self.datastore.data['watching'][uuid]['previous_md5'] != fetched_md5:
changed_detected = True
# Dont confuse people by putting last-changed, when it actually just changed from nothing..
if self.datastore.get_val(self.uuid, 'previous_md5') is not None:
self.datastore.update_watch(self.uuid, 'last_changed', self.timestamp)
# Don't confuse people by updating as last-changed, when it actually just changed from None..
if self.datastore.get_val(uuid, 'previous_md5'):
update_obj["last_changed"] = timestamp
self.datastore.update_watch(self.uuid, 'previous_md5', fetched_md5)
self.save_response_html_output(r.text)
output_filepath = self.save_response_stripped_output(stripped_text_from_html)
update_obj["previous_md5"] = fetched_md5
# Update history with the stripped text for future reference, this will also mean we save the first
# attempt because 'self.current_md5 != fetched_md5' (current_md5 will be None when not run)
# need to learn more about attr/setters/getters
history = self.datastore.get_val(self.uuid, 'history')
history.update(dict([(self.timestamp, output_filepath)]))
self.datastore.update_watch(self.uuid, 'history', history)
self.datastore.update_watch(self.uuid, 'last_checked', int(time.time()))
pass
return changed_detected, update_obj, stripped_text_from_html
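Two steps of the check above, removing `br` from `Accept-Encoding` and comparing an MD5 of the stripped text, can be sketched as standalone helpers. Both names (`strip_brotli`, `detect_change`) are hypothetical and not part of the diff. The token-based filter is a swapped-in alternative to the `.replace(', br', '')` call, which only matches when `br` follows a comma and a space.

```python
import hashlib


def strip_brotli(accept_encoding):
    """Return the Accept-Encoding value without any 'br' token."""
    tokens = [t.strip() for t in accept_encoding.split(",")]
    # Compare only the coding name, ignoring an optional ';q=...' weight.
    kept = [t for t in tokens if t and t.split(";")[0].strip().lower() != "br"]
    return ", ".join(kept)


def detect_change(previous_md5, stripped_text, timestamp):
    """Hash the stripped text and report whether it differs from the stored hash.

    Returns (changed, update_obj). 'last_changed' is only set when a previous
    hash existed, so the very first fetch is not reported as a change date.
    """
    fetched_md5 = hashlib.md5(stripped_text.encode("utf-8")).hexdigest()
    update_obj = {"previous_md5": fetched_md5, "last_checked": timestamp}
    changed = previous_md5 != fetched_md5
    if changed and previous_md5:
        update_obj["last_changed"] = timestamp
    return changed, update_obj
```

As in the diff, the first fetch still counts as changed so that a snapshot is saved, but no `last_changed` timestamp is recorded for it.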

View File

@@ -1,14 +0,0 @@
from flask import make_response
from functools import wraps, update_wrapper
from datetime import datetime
def nocache(view):
@wraps(view)
def no_cache(*args, **kwargs):
response = make_response(view(*args, **kwargs))
response.headers['hmm'] = datetime.now()
return response
return update_wrapper(no_cache, view)

View File

@@ -1,9 +0,0 @@
FROM python:3.8-buster
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt
# So that it can find the certs
WORKDIR /app
CMD [ "python", "/app/backend.py" ]

2
backend/pytest.ini Normal file
View File

@@ -0,0 +1,2 @@
[pytest]
addopts = --no-start-live-server --live-server-port=5005

View File

@@ -1,20 +0,0 @@
aiohttp
async-timeout
chardet==2.3.0
multidict
python-engineio
six==1.10.0
yarl
flask
eventlet
requests
validators
bleach==3.2.1
html5lib==0.9999999 # via bleach
timeago
html2text
# @notes
# - Dont install socketio, it interferes with flask_socketio

View File

@@ -0,0 +1,66 @@
table {
table-layout: fixed;
width: 100%;
}
td {
width: 33%;
padding: 3px 4px;
border: 1px solid transparent;
vertical-align: top;
font: 1em monospace;
text-align: left;
white-space: pre-wrap;
}
h1 {
display: inline;
font-size: 100%;
}
del {
text-decoration: none;
color: #b30000;
background: #fadad7;
}
ins {
background: #eaf2c2;
color: #406619;
text-decoration: none;
}
#result {
white-space: pre-wrap;
}
#settings {
background: rgba(0,0,0,.05);
padding: 1em;
border-radius: 10px;
margin-bottom: 1em;
color: #fff;
font-size: 80%;
}
#settings label {
margin-left: 1em;
display: inline-block;
font-weight: normal;
}
.source {
position: absolute;
right: 1%;
top: .2em;
}
@-moz-document url-prefix() {
body {
height: 99%; /* Hide scroll bar in Firefox */
}
}
#diff-ui {
background: #fff;
padding: 2em;
margin: 1em;
border-radius: 5px;
font-size: 9px;
}

View File

@@ -3,95 +3,237 @@
* Most of these are inherited from Base, but I want to change a few.
*/
body {
color: #333;
background: #262626;
color: #333;
background: #262626;
}
.pure-table-even {
background: #fff;
}
/* Some styles from https://css-tricks.com/ */
a {
text-decoration: none;
color: #1b98f8;
text-decoration: none;
color: #1b98f8;
}
a.github-link {
color: #fff;
color: #fff;
}
.pure-menu-horizontal {
background: #fff;
padding: 5px;
display: flex;
justify-content: space-between;
border-bottom: 2px solid #ed5900;
align-items: center;
}
section.content {
padding-top: 5em;
padding-bottom: 5em;
flex-direction: column;
padding-top: 5em;
padding-bottom: 5em;
flex-direction: column;
display: flex;
align-items: center;
justify-content: center;
}
.pure-table.watch-table td {
font-size: 90%;
font-size: 80%;
}
/* table related */
.watch-table {
width: 100%;
}
.watch-table tr.unviewed {
font-weight: bold;
}
.watch-tag-list {
color: #e70069;
white-space: nowrap;
}
.box {
max-width: 80%;
flex-direction: column;
display: flex;
justify-content: center;
}
.watch-table .error {
color: #aa0000;
color: #a00;
}
.home-menu {
background: #fff;
padding: 5px;
display: flex;
justify-content: space-between;
.watch-table td {
white-space: nowrap;
}
.pure-table-even {
/* missing */
background: #fff;
.watch-table td.title-col {
word-break: break-all;
white-space: normal;
}
body:after {
content: "";
background: linear-gradient(130deg,#ff7a18,#af002d 41.07%,#319197 76.05%)
.watch-table th {
white-space: nowrap;
}
body:after,body:before {
display: block;
height: 600px;
position: absolute;
top: 0;
left: 0;
width: 100%;
z-index: -1;
}
body::after {
opacity: 0.91;
}
body::before {
content: "";
background-image:
url(/static/images/gradient-border.png);
}
body:before {
background-size: cover
}
body:after,body:before {
-webkit-clip-path: polygon(100% 0,0 0,0 77.5%,1% 77.4%,2% 77.1%,3% 76.6%,4% 75.9%,5% 75.05%,6% 74.05%,7% 72.95%,8% 71.75%,9% 70.55%,10% 69.3%,11% 68.05%,12% 66.9%,13% 65.8%,14% 64.8%,15% 64%,16% 63.35%,17% 62.85%,18% 62.6%,19% 62.5%,20% 62.65%,21% 63%,22% 63.5%,23% 64.2%,24% 65.1%,25% 66.1%,26% 67.2%,27% 68.4%,28% 69.65%,29% 70.9%,30% 72.15%,31% 73.3%,32% 74.35%,33% 75.3%,34% 76.1%,35% 76.75%,36% 77.2%,37% 77.45%,38% 77.5%,39% 77.3%,40% 76.95%,41% 76.4%,42% 75.65%,43% 74.75%,44% 73.75%,45% 72.6%,46% 71.4%,47% 70.15%,48% 68.9%,49% 67.7%,50% 66.55%,51% 65.5%,52% 64.55%,53% 63.75%,54% 63.15%,55% 62.75%,56% 62.55%,57% 62.5%,58% 62.7%,59% 63.1%,60% 63.7%,61% 64.45%,62% 65.4%,63% 66.45%,64% 67.6%,65% 68.8%,66% 70.05%,67% 71.3%,68% 72.5%,69% 73.6%,70% 74.65%,71% 75.55%,72% 76.35%,73% 76.9%,74% 77.3%,75% 77.5%,76% 77.45%,77% 77.25%,78% 76.8%,79% 76.2%,80% 75.4%,81% 74.45%,82% 73.4%,83% 72.25%,84% 71.05%,85% 69.8%,86% 68.55%,87% 67.35%,88% 66.2%,89% 65.2%,90% 64.3%,91% 63.55%,92% 63%,93% 62.65%,94% 62.5%,95% 62.55%,96% 62.8%,97% 63.3%,98% 63.9%,99% 64.75%,100% 65.7%);
clip-path: polygon(100% 0,0 0,0 77.5%,1% 77.4%,2% 77.1%,3% 76.6%,4% 75.9%,5% 75.05%,6% 74.05%,7% 72.95%,8% 71.75%,9% 70.55%,10% 69.3%,11% 68.05%,12% 66.9%,13% 65.8%,14% 64.8%,15% 64%,16% 63.35%,17% 62.85%,18% 62.6%,19% 62.5%,20% 62.65%,21% 63%,22% 63.5%,23% 64.2%,24% 65.1%,25% 66.1%,26% 67.2%,27% 68.4%,28% 69.65%,29% 70.9%,30% 72.15%,31% 73.3%,32% 74.35%,33% 75.3%,34% 76.1%,35% 76.75%,36% 77.2%,37% 77.45%,38% 77.5%,39% 77.3%,40% 76.95%,41% 76.4%,42% 75.65%,43% 74.75%,44% 73.75%,45% 72.6%,46% 71.4%,47% 70.15%,48% 68.9%,49% 67.7%,50% 66.55%,51% 65.5%,52% 64.55%,53% 63.75%,54% 63.15%,55% 62.75%,56% 62.55%,57% 62.5%,58% 62.7%,59% 63.1%,60% 63.7%,61% 64.45%,62% 65.4%,63% 66.45%,64% 67.6%,65% 68.8%,66% 70.05%,67% 71.3%,68% 72.5%,69% 73.6%,70% 74.65%,71% 75.55%,72% 76.35%,73% 76.9%,74% 77.3%,75% 77.5%,76% 77.45%,77% 77.25%,78% 76.8%,79% 76.2%,80% 75.4%,81% 74.45%,82% 73.4%,83% 72.25%,84% 71.05%,85% 69.8%,86% 68.55%,87% 67.35%,88% 66.2%,89% 65.2%,90% 64.3%,91% 63.55%,92% 63%,93% 62.65%,94% 62.5%,95% 62.55%,96% 62.8%,97% 63.3%,98% 63.9%,99% 64.75%,100% 65.7%)
}
.button-small {
font-size: 85%;
}
a[target="_blank"]::after {
.watch-table .title-col a[target="_blank"]::after, .current-diff-url::after {
content: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAAQElEQVR42qXKwQkAIAxDUUdxtO6/RBQkQZvSi8I/pL4BoGw/XPkh4XigPmsUgh0626AjRsgxHTkUThsG2T/sIlzdTsp52kSS1wAAAABJRU5ErkJggg==);
margin: 0 3px 0 5px;
}
#check-all-button {
text-align:right;
}
#check-all-button a {
border-top-left-radius: initial;
border-top-right-radius: initial;
border-bottom-left-radius: 5px;
border-bottom-right-radius: 5px;
}
body:after {
content: "";
background: linear-gradient(130deg, #ff7a18, #af002d 41.07%, #319197 76.05%)
}
body:after, body:before {
display: block;
height: 600px;
position: absolute;
top: 0;
left: 0;
width: 100%;
z-index: -1;
}
body::after {
opacity: 0.91;
}
body::before {
content: "";
background-image: url(/static/images/gradient-border.png);
}
body:before {
background-size: cover
}
body:after, body:before {
-webkit-clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%);
clip-path: polygon(100% 0, 0 0, 0 77.5%, 1% 77.4%, 2% 77.1%, 3% 76.6%, 4% 75.9%, 5% 75.05%, 6% 74.05%, 7% 72.95%, 8% 71.75%, 9% 70.55%, 10% 69.3%, 11% 68.05%, 12% 66.9%, 13% 65.8%, 14% 64.8%, 15% 64%, 16% 63.35%, 17% 62.85%, 18% 62.6%, 19% 62.5%, 20% 62.65%, 21% 63%, 22% 63.5%, 23% 64.2%, 24% 65.1%, 25% 66.1%, 26% 67.2%, 27% 68.4%, 28% 69.65%, 29% 70.9%, 30% 72.15%, 31% 73.3%, 32% 74.35%, 33% 75.3%, 34% 76.1%, 35% 76.75%, 36% 77.2%, 37% 77.45%, 38% 77.5%, 39% 77.3%, 40% 76.95%, 41% 76.4%, 42% 75.65%, 43% 74.75%, 44% 73.75%, 45% 72.6%, 46% 71.4%, 47% 70.15%, 48% 68.9%, 49% 67.7%, 50% 66.55%, 51% 65.5%, 52% 64.55%, 53% 63.75%, 54% 63.15%, 55% 62.75%, 56% 62.55%, 57% 62.5%, 58% 62.7%, 59% 63.1%, 60% 63.7%, 61% 64.45%, 62% 65.4%, 63% 66.45%, 64% 67.6%, 65% 68.8%, 66% 70.05%, 67% 71.3%, 68% 72.5%, 69% 73.6%, 70% 74.65%, 71% 75.55%, 72% 76.35%, 73% 76.9%, 74% 77.3%, 75% 77.5%, 76% 77.45%, 77% 77.25%, 78% 76.8%, 79% 76.2%, 80% 75.4%, 81% 74.45%, 82% 73.4%, 83% 72.25%, 84% 71.05%, 85% 69.8%, 86% 68.55%, 87% 67.35%, 88% 66.2%, 89% 65.2%, 90% 64.3%, 91% 63.55%, 92% 63%, 93% 62.65%, 94% 62.5%, 95% 62.55%, 96% 62.8%, 97% 63.3%, 98% 63.9%, 99% 64.75%, 100% 65.7%)
}
.button-small {
font-size: 85%;
}
.fetch-error {
padding-top: 1em;
font-size: 60%;
max-width: 400px;
display: block;
}
padding-top: 1em;
font-size: 60%;
max-width: 400px;
display: block;
}
.edit-form {
background: #fff;
padding: 2em;
margin: 1em;
border-radius: 5px;
}
.button-secondary {
color: white;
border-radius: 4px;
text-shadow: 0 1px 1px rgba(0, 0, 0, 0.2);
}
.button-success {
background: rgb(28, 184, 65);
/* this is a green */
}
.button-tag {
background: rgb(99, 99, 99);
color: #fff;
font-size: 65%;
border-bottom-left-radius: initial;
border-bottom-right-radius: initial;
}
.button-tag.active {
background: #9c9c9c;
font-weight: bold;
}
.button-error {
background: rgb(202, 60, 60);
/* this is a maroon */
}
.button-warning {
background: rgb(223, 117, 20);
/* this is an orange */
}
.button-secondary {
background: rgb(66, 184, 221);
/* this is a light blue */
}
.button-cancel {
background: rgb(200, 200, 200);
/* this is a light grey */
}
.messages {
padding: 1em;
background: rgba(255,255,255,.2);
border-radius: 10px;
color: #fff;
font-weight: bold;
}
.pure-form label {
font-weight: bold;
}
#new-watch-form {
background: rgba(0,0,0,.05);
padding: 1em;
border-radius: 10px;
margin-bottom: 1em;
}
#new-watch-form legend {
color: #fff;
}
#diff-col {
padding-left:40px;
}
#diff-jump {
position: fixed;
left: 0px;
top: 80px;
background: #fff;
padding: 10px;
border-top-right-radius: 5px;
border-bottom-right-radius: 5px;
box-shadow: 5px 0 5px -2px #888;
}
#diff-jump a {
color: #1b98f8;
cursor: grabbing;
-moz-user-select: none;
-webkit-user-select: none;
-ms-user-select:none;
user-select:none;
-o-user-select:none;
}

1055
backend/static/js/diff.js Normal file

File diff suppressed because it is too large

View File

@@ -1,14 +1,45 @@
import json
import uuid
import validators
import uuid as uuid_builder
import os.path
from os import path
from threading import Lock
from copy import deepcopy
import logging
import time
import threading
# Is there an existing library to ensure some data store (JSON etc) is in sync with CRUD methods?
# Open a github issue if you know something :)
# https://stackoverflow.com/questions/6190468/how-to-trigger-function-on-value-change
class ChangeDetectionStore:
lock = Lock()
def __init__(self):
def __init__(self, datastore_path="/datastore", include_default_watches=True):
self.needs_write = False
self.datastore_path = datastore_path
self.json_store_path = "{}/url-watches.json".format(self.datastore_path)
self.stop_thread = False
self.__data = {
'note': "Hello! If you change this file manually, please be sure to restart your changedetection.io instance!",
'watching': {},
'tag': "0.25",
'settings': {
'headers': {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate', # No support for brotli in python requests yet.
'Accept-Language': 'en-GB,en-US;q=0.9,en;'
},
'requests': {
'timeout': 15, # Default 15 seconds
'minutes_between_check': 3 * 60, # Default 3 hours
'workers': 10 # Number of threads, lower is better for slow connections
}
}
}
# Base definition for all watchers
self.generic_definition = {
@@ -16,61 +47,117 @@ class ChangeDetectionStore:
'tag': None,
'last_checked': 0,
'last_changed': 0,
'last_viewed': 0, # history key value of the last viewed via the [diff] link
'newest_history_key': "",
'title': None,
'uuid': str(uuid.uuid4()),
'headers' : {}, # Extra headers to send
'history' : {} # Dict of timestamp and output stripped filename
'previous_md5': "",
'uuid': str(uuid_builder.uuid4()),
'headers': {}, # Extra headers to send
'history': {} # Dict of timestamp and output stripped filename
}
try:
with open('/datastore/url-watches.json') as json_file:
self.data = json.load(json_file)
# Reinitialise each `watching` with our generic_definition in the case that we add a new var in the future.
i = 0
while i < len(self.data['watching']):
_blank = self.generic_definition.copy()
_blank.update(self.data['watching'][i])
self.data['watching'][i] = _blank
if path.isfile('/source.txt'):
with open('/source.txt') as f:
# Should be set in Dockerfile to look for /source.txt , this will give us the git commit #
# So when someone gives us a backup file to examine, we know exactly what code they were running.
self.__data['build_sha'] = f.read()
print("Watching:", self.data['watching'][i]['url'])
i += 1
try:
with open(self.json_store_path) as json_file:
from_disk = json.load(json_file)
# @todo Isn't there a way to do this dict.update recursively?
# The problem here is that if the one on disk is missing a sub-struct, it won't be present anymore.
if 'watching' in from_disk:
self.__data['watching'].update(from_disk['watching'])
if 'settings' in from_disk:
if 'headers' in from_disk['settings']:
self.__data['settings']['headers'].update(from_disk['settings']['headers'])
if 'requests' in from_disk['settings']:
self.__data['settings']['requests'].update(from_disk['settings']['requests'])
# Reinitialise each `watching` with our generic_definition in the case that we add a new var in the future.
# @todo Pretty sure there's a Python way to do this with an abstracted(?) object!
for uuid, watch in self.data['watching'].items():
_blank = deepcopy(self.generic_definition)
_blank.update(watch)
self.__data['watching'].update({uuid: _blank})
self.__data['watching'][uuid]['newest_history_key'] = self.get_newest_history_key(uuid)
print("Watching:", uuid, self.__data['watching'][uuid]['url'])
# First time run, the JSON store doesn't exist yet.
except (FileNotFoundError, json.decoder.JSONDecodeError):
print("Resetting JSON store")
if include_default_watches:
print("Creating JSON store at", self.datastore_path)
self.data = {}
self.data['watching'] = []
self._init_blank_data()
self.sync_to_json()
self.add_watch(url='http://www.quotationspage.com/random.php', tag='test')
self.add_watch(url='https://news.ycombinator.com/', tag='Tech news')
self.add_watch(url='https://www.gov.uk/coronavirus', tag='Covid')
self.add_watch(url='https://changedetection.io', tag='Tech news')
def _init_blank_data(self):
# Finally start the thread that will manage periodic data saves to JSON
threading.Thread(target=self.save_datastore).start()
# Test site
_blank = self.generic_definition.copy()
_blank.update({
'url': 'https://changedetection.io',
'tag': 'general',
'uuid': str(uuid.uuid4())
})
self.data['watching'].append(_blank)
# Returns the newest key, but if there's only 1 record, it's counted as not being new, so return 0.
def get_newest_history_key(self, uuid):
if len(self.__data['watching'][uuid]['history']) == 1:
return 0
# Test site
_blank = self.generic_definition.copy()
_blank.update({
'url': 'http://www.quotationspage.com/random.php',
'tag': 'test',
'uuid': str(uuid.uuid4())
})
self.data['watching'].append(_blank)
dates = list(self.__data['watching'][uuid]['history'].keys())
# Convert to int, sort and back to str again
dates = [int(i) for i in dates]
dates.sort(reverse=True)
if len(dates):
# always keyed as str
return str(dates[0])
def update_watch(self, uuid, val, var):
# Probably there should be a dict...
for watch in self.data['watching']:
if watch['uuid'] == uuid:
watch[val] = var
# print("Updated..", val)
self.sync_to_json()
return 0
def set_last_viewed(self, uuid, timestamp):
self.data['watching'][uuid].update({'last_viewed': str(timestamp)})
self.needs_write = True
def update_watch(self, uuid, update_obj):
with self.lock:
# In python 3.9 we have the |= dict operator, but that still will lose data on nested structures...
for dict_key, d in self.generic_definition.items():
if isinstance(d, dict):
if update_obj is not None and dict_key in update_obj:
self.__data['watching'][uuid][dict_key].update(update_obj[dict_key])
del (update_obj[dict_key])
self.__data['watching'][uuid].update(update_obj)
self.__data['watching'][uuid]['newest_history_key'] = self.get_newest_history_key(uuid)
self.needs_write = True
@property
def data(self):
return self.__data
def get_all_tags(self):
tags = []
for uuid, watch in self.data['watching'].items():
# Support for comma separated list of tags.
for tag in watch['tag'].split(','):
tag = tag.strip()
if tag not in tags:
tags.append(tag)
tags.sort()
return tags
def delete(self, uuid):
with self.lock:
del (self.__data['watching'][uuid])
self.needs_write = True
def url_exists(self, url):
@@ -83,30 +170,65 @@ class ChangeDetectionStore:
def get_val(self, uuid, val):
# Probably there should be a dict...
for watch in self.data['watching']:
if watch['uuid'] == uuid:
return watch.get(val)
return None
return self.data['watching'][uuid].get(val)
def add_watch(self, url, tag):
validators.url(url)
with self.lock:
# @todo use a common generic version of this
new_uuid = str(uuid_builder.uuid4())
_blank = deepcopy(self.generic_definition)
_blank.update({
'url': url,
'tag': tag,
'uuid': new_uuid
})
# @todo use a common generic version of this
self.data['watching'][new_uuid] = _blank
_blank = self.generic_definition.copy()
_blank.update({
'url': url,
'tag': tag,
'uuid': str(uuid.uuid4())
})
self.data['watching'].append(_blank)
# Get the directory ready
output_path = "{}/{}".format(self.datastore_path, new_uuid)
try:
os.mkdir(output_path)
except FileExistsError:
print(output_path, "already exists.")
self.sync_to_json()
# @todo throw custom exception
return new_uuid
# Save some text file to the appropriate path and bump the history
# result_obj from fetch_site_status.run()
def save_history_text(self, uuid, result_obj, contents):
output_path = "{}/{}".format(self.datastore_path, uuid)
fname = "{}/{}-{}.stripped.txt".format(output_path, result_obj['previous_md5'], str(time.time()))
with open(fname, 'w') as f:
f.write(contents)
# Update history with the stripped text for future reference, this will also mean we save the first
# Should always be keyed by string(timestamp)
self.update_watch(uuid, {"history": {str(result_obj["last_checked"]): fname}})
return fname
def sync_to_json(self):
with open('/datastore/url-watches.json', 'w') as json_file:
json.dump(self.data, json_file, indent=4)
print("Saving..")
with open(self.json_store_path, 'w') as json_file:
json.dump(self.__data, json_file, indent=4)
logging.info("Re-saved index")
self.needs_write = False
# Thread runner: this helps with thread/write issues when many operations want to update the JSON,
# by writing periodically from a single thread. According to Python, dict updates are thread-safe.
def save_datastore(self):
while True:
if self.stop_thread:
print("Shutting down datastore thread")
return
if self.needs_write:
self.sync_to_json()
time.sleep(1)
# body of the constructor
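Review note on the `@todo` about updating nested dicts recursively: a minimal sketch of one way to do it, assuming plain dicts throughout. `deep_update` is a hypothetical helper, not part of this changeset, and the sample `defaults` and `from_disk` values are illustrative only.

```python
from copy import deepcopy


def deep_update(base, overrides):
    """Return a copy of `base` with `overrides` merged in recursively.

    Hypothetical helper: nested dicts are merged key by key, so a
    sub-struct missing from `overrides` keeps its value from `base`.
    """
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend instead of replacing wholesale.
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Illustrative data only, loosely shaped like the store defaults.
defaults = {
    'settings': {
        'headers': {'Accept-Encoding': 'gzip, deflate'},
        'requests': {'timeout': 15, 'workers': 10},
    },
    'watching': {},
}
from_disk = {'settings': {'requests': {'timeout': 30}}}

merged = deep_update(defaults, from_disk)
```

With a helper like this, the per-key `if 'headers' in ...` and `if 'requests' in ...` checks in the constructor could collapse into a single call, at the cost of one more function to maintain.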

View File

@@ -6,18 +6,34 @@
<meta name="description" content="Self hosted website change detection.">
<title>Change Detection</title>
<link rel="stylesheet" href="/static/css/pure-min.css">
<link rel="stylesheet" href="/static/css/styles.css">
<link rel="stylesheet" href="/static/css/styles.css?ver=1000">
{% if extra_stylesheets %}
{% for m in extra_stylesheets %}
<link rel="stylesheet" href="{{ m }}?ver=1000">
{% endfor %}
{% endif %}
</head>
<body>
<div class="header">
<div class="home-menu pure-menu pure-menu-horizontal pure-menu-fixed">
<a class="pure-menu-heading" href=""><strong>Change</strong>Detection.io</a>
<a class="pure-menu-heading" href="/"><strong>Change</strong>Detection.io</a>
{% if current_diff_url %}
<a class="current-diff-url" href="{{ current_diff_url }}"><span style="max-width: 30%; overflow: hidden;">{{ current_diff_url }}</span></a>
{% endif %}
<ul class="pure-menu-list">
<li class="pure-menu-item pure-menu-selected"><a class="github-link " href="https://github.com/dgtlmoon/changedetection.io"
data-hotkey="g d" aria-label="Homepage "
data-ga-click="Header, go to dashboard, icon:logo">
<li class="pure-menu-item">
<a href="/backup" class="pure-menu-link">BACKUP</a>
</li>
<li class="pure-menu-item">
<a href="/import" class="pure-menu-link">IMPORT</a>
</li>
<li class="pure-menu-item">
<a href="/settings" class="pure-menu-link">SETTINGS</a>
</li>
<li class="pure-menu-item"><a class="github-link" href="https://github.com/dgtlmoon/changedetection.io">
<svg class="octicon octicon-mark-github v-align-middle" height="32" viewBox="0 0 16 16" version="1.1"
width="32" aria-hidden="true">
<path fill-rule="evenodd"
@@ -37,11 +53,13 @@
{% block header %}{% endblock %}
</header>
{% if messages %}
<div class="messages">
{% for message in messages %}
<div class="flash-message {{ message['class'] }}">{{ message['message'] }}</div>
{% for message in messages %}
<div class="flash-message {{ message['class'] }}">{{ message['message'] }}</div>
{% endfor %}
</div>
{% endif %}
{% block content %}

165
backend/templates/diff.html Normal file
View File

@@ -0,0 +1,165 @@
{% extends 'base.html' %}
{% block content %}
<div id="settings">
<h1>Differences</h1>
<form class="pure-form " action="" method="GET">
<fieldset>
<label for="diffWords" class="pure-checkbox">
<input type="radio" name="diff_type" id="diffWords" value="diffWords" /> Words</label>
<label for="diffLines" class="pure-checkbox">
<input type="radio" name="diff_type" id="diffLines" value="diffLines" checked=""/> Lines</label>
<label for="diffChars" class="pure-checkbox">
<input type="radio" name="diff_type" id="diffChars" value="diffChars"/> Chars</label>
{% if versions|length >= 1 %}
<label for="diff-version">Compare newest (<span id="current-v-date"></span>) with</label>
<select id="diff-version" name="previous_version">
{% for version in versions %}
<option value="{{version}}" {% if version == current_previous_version %} selected="" {% endif %}>
{{version}}
</option>
{% endfor %}
</select>
<button type="submit" class="pure-button pure-button-primary">Go</button>
{% endif %}
</fieldset>
</form>
<del>Removed text</del>
<ins>Inserted Text</ins>
</div>
<div id="diff-jump">
<a onclick="next_diff();">Jump</a>
</div>
<div id="diff-ui">
<table>
<tbody>
<tr>
<!-- just proof of concept copied straight from github.com/kpdecker/jsdiff -->
<td id="a" style="display: none;">{{previous}}</td>
<td id="b" style="display: none;">{{newest}}</td>
<td id="diff-col">
<span id="result"></span>
</td>
</tr>
</tbody>
</table>
Diff algorithm from the amazing <a href="https://github.com/kpdecker/jsdiff">github.com/kpdecker/jsdiff</a>
</div>
<script src="/static/js/diff.js"></script>
<script defer="">
var a = document.getElementById('a');
var b = document.getElementById('b');
var result = document.getElementById('result');
function changed() {
var diff = JsDiff[window.diffType](a.textContent, b.textContent);
var fragment = document.createDocumentFragment();
for (var i=0; i < diff.length; i++) {
if (diff[i].added && diff[i + 1] && diff[i + 1].removed) {
var swap = diff[i];
diff[i] = diff[i + 1];
diff[i + 1] = swap;
}
var node;
if (diff[i].removed) {
node = document.createElement('del');
node.classList.add("change");
node.appendChild(document.createTextNode(diff[i].value));
} else if (diff[i].added) {
node = document.createElement('ins');
node.classList.add("change");
node.appendChild(document.createTextNode(diff[i].value));
} else {
node = document.createTextNode(diff[i].value);
}
fragment.appendChild(node);
}
result.textContent = '';
result.appendChild(fragment);
}
window.onload = function() {
/* Convert the option values from UTC time.time() to local browser time */
var diffList=document.getElementById("diff-version");
if (typeof(diffList) != 'undefined' && diffList != null) {
for (var option of diffList.options) {
var dateObject = new Date(option.value*1000);
option.label=dateObject.toLocaleString();
}
}
/* Set current version date as local time in the browser also */
var current_v = document.getElementById("current-v-date");
var dateObject = new Date({{ newest_version_timestamp }}*1000);
current_v.innerHTML=dateObject.toLocaleString();
onDiffTypeChange(document.querySelector('#settings [name="diff_type"]:checked'));
changed();
};
a.onpaste = a.onchange =
b.onpaste = b.onchange = changed;
if ('oninput' in a) {
a.oninput = b.oninput = changed;
} else {
a.onkeyup = b.onkeyup = changed;
}
function onDiffTypeChange(radio) {
window.diffType = radio.value;
document.title = "Diff " + radio.value.slice(4);
}
var radio = document.getElementsByName('diff_type');
for (var i = 0; i < radio.length; i++) {
radio[i].onchange = function(e) {
onDiffTypeChange(e.target);
changed();
}
}
var inputs = document.getElementsByClassName('change');
inputs.current=0;
function next_diff() {
var element = inputs[inputs.current];
var headerOffset = 80;
var elementPosition = element.getBoundingClientRect().top;
var offsetPosition = elementPosition - headerOffset + window.scrollY;
window.scrollTo({
top: offsetPosition,
behavior: "smooth"
});
inputs.current++;
if(inputs.current >= inputs.length) {
inputs.current=0;
}
}
</script>
{% endblock %}

View File

@@ -0,0 +1,55 @@
{% extends 'base.html' %}
{% block content %}
<div class="edit-form">
<form class="pure-form pure-form-stacked" action="/edit?uuid={{uuid}}" method="POST">
<fieldset>
<div class="pure-control-group">
<label for="url">URL</label>
<input type="url" id="url" required="" placeholder="https://..." name="url" value="{{ watch.url}}"
size="50"/>
<span class="pure-form-message-inline">This is a required field.</span>
</div>
<div class="pure-control-group">
<label for="tag">Tag</label>
<input type="text" placeholder="tag" size="10" id="tag" name="tag" value="{{ watch.tag}}"/>
<span class="pure-form-message-inline">Grouping tags, can be a comma separated list.</span>
</div>
<fieldset class="pure-group">
<label for="headers">Extra request headers</label>
<textarea id=headers name="headers" class="pure-input-1-2" placeholder="Example
Cookie: foobar
User-Agent: wonderbra 1.0"
style="width: 100%;
font-family:monospace;
white-space: pre;
overflow-wrap: normal;
overflow-x: scroll;" rows="5">{% for key, value in watch.headers.items() %}{{ key }}: {{ value }}
{% endfor %}</textarea>
<br/>
</fieldset>
<div class="pure-control-group">
<button type="submit" class="pure-button pure-button-primary">Save</button>
</div>
<br/>
<div class="pure-control-group">
<a href="/" class="pure-button button-small button-cancel">Cancel</a>
<a href="/api/delete?uuid={{uuid}}"
class="pure-button button-small button-error ">Delete</a>
</div>
</fieldset>
</form>
</div>
{% endblock %}

View File

@@ -0,0 +1,26 @@
{% extends 'base.html' %}
{% block content %}
<div class="edit-form">
<form class="pure-form pure-form-aligned" action="/import" method="POST">
<fieldset class="pure-group">
<legend>One URL per line. URLs that do not pass validation will stay in the textarea.</legend>
<textarea name="urls" class="pure-input-1-2" placeholder="https://"
style="width: 100%;
font-family:monospace;
white-space: pre;
overflow-wrap: normal;
overflow-x: scroll;" rows="25">{{ remaining }}</textarea>
</fieldset>
<button type="submit" class="pure-button pure-input-1-2 pure-button-primary">Import</button>
</form>
</div>
{% endblock %}

View File

@@ -0,0 +1,43 @@
{% extends 'base.html' %}
{% block content %}
<div class="edit-form">
<form class="pure-form pure-form-stacked" action="/scrub" method="POST">
<fieldset>
<div class="pure-control-group">
This will remove all version snapshots/data, but keep your list of URLs. <br/>
You may like to use the <strong>BACKUP</strong> link first.<br/>
Type in the word <strong>scrub</strong> to confirm that you understand!
<br/>
</div>
<div class="pure-control-group">
<br/>
<label for="confirmtext">Confirm</label><br/>
<input type="text" id="confirmtext" required="" name="confirmtext" value="" size="10"/>
<br/>
</div>
<div class="pure-control-group">
<button type="submit" class="pure-button pure-button-primary">Scrub!</button>
</div>
<br/>
<div class="pure-control-group">
<a href="/" class="pure-button button-small button-cancel">Cancel</a>
</div>
</fieldset>
</form>
</div>
{% endblock %}

View File

@@ -0,0 +1,35 @@
{% extends 'base.html' %}
{% block content %}
<div class="edit-form">
<form class="pure-form pure-form-stacked" action="/settings" method="POST">
<fieldset>
<div class="pure-control-group">
<label for="minutes">Maximum time in minutes until recheck.</label>
<input type="text" id="minutes" required="" name="minutes" value="{{minutes}}"
size="5"/>
<span class="pure-form-message-inline">This is a required field.</span>
</div>
<br/>
<div class="pure-control-group">
<button type="submit" class="pure-button pure-button-primary">Save</button>
</div>
<br/>
<div class="pure-control-group">
<a href="/" class="pure-button button-small button-cancel">Back</a>
<a href="/scrub" class="pure-button button-small button-cancel">Reset all version data</a>
</div>
</fieldset>
</form>
</div>
{% endblock %}

View File

@@ -1,52 +1,85 @@
{% extends 'base.html' %}
{% block content %}
<div class="box">
<form class="pure-form" action="/api/add" method="POST">
<fieldset>
<legend>Add new change detection watch</legend>
<input type="url" placeholder="https://..." name="url"/>
<input type="text" placeholder="tag" size="10" name="tag"/>
<button type="submit" class="pure-button pure-button-primary">Save</button>
</fieldset>
<!-- add extra stuff, like do a http POST and send headers -->
<!-- user/pass r = requests.get('https://api.github.com/user', auth=('user', 'pass')) -->
</form>
<form class="pure-form" action="/api/add" method="POST" id="new-watch-form">
<fieldset>
<legend>Add a new change detection watch</legend>
<input type="url" placeholder="https://..." name="url"/>
<input type="text" placeholder="tag" size="10" name="tag" value="{{active_tag if active_tag}}"/>
<button type="submit" class="pure-button pure-button-primary">Watch</button>
</fieldset>
<!-- add extra stuff, like do a http POST and send headers -->
<!-- user/pass r = requests.get('https://api.github.com/user', auth=('user', 'pass')) -->
</form>
<div>
<!-- make a nice list of tags here to click on -->
<i>Note: Times are in UTC for now - todo - JS front end format<br/></i>
<table class="pure-table pure-table-striped watch-table">
<thead>
<tr>
<th>#</th>
<th></th>
<th>Last Checked</th>
<th>Last Changed</th>
<th></th>
</tr>
</thead>
<tbody>
{% for tag in tags %}
{% if tag == "" %}
<a href="/" class="pure-button button-tag {{'active' if active_tag == tag }}">All</a>
{% else %}
<a href="/?tag={{ tag}}" class="pure-button button-tag {{'active' if active_tag == tag }}">{{ tag }}</a>
{% endif %}
{% endfor %}
</div>
<div id="watch-table-wrapper">
<table class="pure-table pure-table-striped watch-table">
<thead>
<tr>
<th>#</th>
<th></th>
<th>Last Checked</th>
<th>Last Changed</th>
<th></th>
</tr>
</thead>
<tbody>
{% for watch in watches %}
<tr id="{{ watch.uuid }}" class="{{ loop.cycle('pure-table-odd', 'pure-table-even') }} {% if watch.last_error is defined and watch.last_error != False %}error{% endif %}">
<td>{{ loop.index }}</td>
<td>{% if watch.title is not none %}{{ watch.title }}{% else %}{{ watch.url }}{% endif %}<a class="external" target=_blank href="{{ watch.url }}"></a>
{% if watch.last_error is defined and watch.last_error != False %}
<div class="fetch-error">{{ watch.last_error }}</div>
{% endif %}
</td>
<td>{{watch|format_last_checked_time}}
</td>
<td>{{watch.last_changed|format_timestamp_timeago}}</td>
<td><a href="/api/checknow?uuid={{ watch.uuid}}" class="pure-button button-small pure-button-primary">Recheck</a> <button type="submit" class="pure-button button-small pure-button-primary">Delete</button></td>
</tr>
{% endfor %}
{% for watch in watches %}
<tr id="{{ watch.uuid }}"
class="{{ loop.cycle('pure-table-odd', 'pure-table-even') }}
{% if watch.last_error is defined and watch.last_error != False %}error{% endif %}
{% if watch.newest_history_key| int > watch.last_viewed| int %}unviewed{% endif %}">
<td>{{ loop.index }}</td>
<td class="title-col">{{watch.title if watch.title is not none else watch.url}}
<a class="external" target=_blank href="{{ watch.url }}"></a>
{% if watch.last_error is defined and watch.last_error != False %}
<div class="fetch-error">{{ watch.last_error }}</div>
{% endif %}
{% if not active_tag %}
<span class="watch-tag-list">{{ watch.tag}}</span>
{% endif %}
</td>
<td>{{watch|format_last_checked_time}}</td>
<td>{% if watch.history|length >= 2 and watch.last_changed %}
{{watch.last_changed|format_timestamp_timeago}}
{% else %}
Not yet
{% endif %}
</td>
<td>
<a href="/api/checknow?uuid={{ watch.uuid}}{% if request.args.get('tag') %}&tag={{request.args.get('tag')}}{% endif %}"
class="pure-button button-small pure-button-primary">Recheck</a>
<a href="/edit?uuid={{ watch.uuid}}" class="pure-button button-small pure-button-primary">Edit</a>
{% if watch.history|length >= 2 %}
<a href="/diff/{{ watch.uuid}}" class="pure-button button-small pure-button-primary">Diff</a>
{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</tbody>
</table>
<div id="check-all-button">
<a href="/api/checknow{% if active_tag%}?tag={{active_tag}}{%endif%}" class="pure-button button-tag ">Recheck
all {% if active_tag%}in "{{active_tag}}"{%endif%}</a>
</div>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,2 @@
"""Tests for the app."""

43
backend/tests/conftest.py Normal file
View File

@@ -0,0 +1,43 @@
#!/usr/bin/python3
import pytest
from backend import changedetection_app
from backend import store
import os
# https://github.com/pallets/flask/blob/1.1.2/examples/tutorial/tests/test_auth.py
# Much better boilerplate than the docs
# https://www.python-boilerplate.com/py3+flask+pytest/
global app
@pytest.fixture(scope='session')
def app(request):
"""Create application for the tests."""
datastore_path = "./test-datastore"
try:
os.mkdir(datastore_path)
except FileExistsError:
pass
try:
os.unlink("{}/url-watches.json".format(datastore_path))
except FileNotFoundError:
pass
app_config = {'datastore_path': datastore_path}
datastore = store.ChangeDetectionStore(datastore_path=app_config['datastore_path'], include_default_watches=False)
app = changedetection_app(app_config, datastore)
def teardown():
datastore.stop_thread = True
app.config['STOP_THREADS'] = True
request.addfinalizer(teardown)
return app

View File

@@ -0,0 +1,117 @@
#!/usr/bin/python3
import time
from flask import url_for
from urllib.request import urlopen
def set_original_response():
test_return_data = """<html>
<body>
Some initial text</br>
<p>Which is across multiple lines</p>
</br>
So let's see what happens. </br>
</body>
</html>
"""
with open("test-datastore/output.txt", "w") as f:
f.write(test_return_data)
def set_modified_response():
test_return_data = """<html>
<body>
Some initial text</br>
<p>which has this one new line</p>
</br>
So let's see what happens. </br>
</body>
</html>
"""
with open("test-datastore/output.txt", "w") as f:
f.write(test_return_data)
def test_check_basic_change_detection_functionality(client, live_server):
sleep_time_for_fetch_thread = 5
@live_server.app.route('/test-endpoint')
def test_endpoint():
# Tried using a global var here but didn't seem to work, so reading from a file instead.
with open("test-datastore/output.txt", "r") as f:
return f.read()
set_original_response()
live_server.start()
# Add our URL to the import page
res = client.post(
url_for("import_page"),
data={"urls": url_for('test_endpoint', _external=True)},
follow_redirects=True
)
assert b"1 Imported" in res.data
time.sleep(sleep_time_for_fetch_thread)
# Do this a few times.. ensures we don't accidentally set the status
for n in range(3):
client.get(url_for("api_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# It should report nothing found (no new 'unviewed' class)
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
assert b'test-endpoint' in res.data
#####################
# Make a change
set_modified_response()
res = urlopen(url_for('test_endpoint', _external=True))
assert b'which has this one new line' in res.read()
# Force recheck
res = client.get(url_for("api_watch_checknow"), follow_redirects=True)
assert b'1 watches are rechecking.' in res.data
time.sleep(sleep_time_for_fetch_thread)
# Now something should be ready, indicated by having a 'unviewed' class
res = client.get(url_for("index"))
assert b'unviewed' in res.data
# Following the 'diff' link, it should no longer display as 'unviewed' even after we recheck it a few times
res = client.get(url_for("diff_history_page", uuid="first"))
assert b'Compare newest' in res.data
time.sleep(2)
# Do this a few times.. ensures we don't accidentally set the status
for n in range(3):
client.get(url_for("api_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# It should report nothing found (no new 'unviewed' class)
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
assert b'test-endpoint' in res.data
set_original_response()
client.get(url_for("api_watch_checknow"), follow_redirects=True)
time.sleep(sleep_time_for_fetch_thread)
res = client.get(url_for("index"))
assert b'unviewed' in res.data
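Review note on the fixed `time.sleep(sleep_time_for_fetch_thread)` calls: where the test waits for something to appear, a polling helper could shorten the run and reduce flakiness. The sketch below is an assumption, not part of this changeset; `wait_until` is a hypothetical name.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper: returns True as soon as the predicate holds,
    otherwise the result of one final check after the deadline.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return bool(predicate())


# Possible use inside the test above (not executed here):
#   assert wait_until(lambda: b'unviewed' in client.get(url_for("index")).data)
```

This only suits the positive checks. The `assert b'unviewed' not in res.data` checks still need a real delay, because the absence of a change cannot be confirmed by returning early.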

65
changedetection.py Normal file
View File

@@ -0,0 +1,65 @@
#!/usr/bin/python3
# Launch as an eventlet.wsgi server instance.
import getopt
import sys
import eventlet
import eventlet.wsgi
import backend
from backend import store
def main(argv):
ssl_mode = False
port = 5000
datastore_path = "./datastore"
try:
opts, args = getopt.getopt(argv, "sd:p:", ["purge"])
except getopt.GetoptError:
print('changedetection.py -s SSL enable -p [port] -d [datastore path]')
sys.exit(2)
for opt, arg in opts:
# if opt == '--purge':
# Remove history, the actual files you need to delete manually.
# for uuid, watch in datastore.data['watching'].items():
# watch.update({'history': {}, 'last_checked': 0, 'last_changed': 0, 'previous_md5': None})
if opt == '-s':
ssl_mode = True
if opt == '-p':
port = int(arg)
if opt == '-d':
datastore_path = arg
# threads can read from disk every x seconds right?
# front end can just save
# We just need to know which threads are looking at which UUIDs
# Isn't there some @decorator to attach to each route to tell it that this route needs a datastore?
app_config = {'datastore_path': datastore_path}
datastore = store.ChangeDetectionStore(datastore_path=app_config['datastore_path'])
app = backend.changedetection_app(app_config, datastore)
if ssl_mode:
# @todo finalise SSL config, but this should get you in the right direction if you need it.
eventlet.wsgi.server(eventlet.wrap_ssl(eventlet.listen(('', port)),
certfile='cert.pem',
keyfile='privkey.pem',
server_side=True), app)
else:
eventlet.wsgi.server(eventlet.listen(('', port)), app)
if __name__ == '__main__':
main(sys.argv[1:])
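Review note on the option handling in `main()`: pulling the parsing into its own function would let it be tested without starting eventlet. A minimal sketch, assuming the same flags and defaults as above; `parse_args` is a hypothetical name, and `--purge` is accepted but ignored, matching the commented-out branch.

```python
import getopt


def parse_args(argv):
    """Parse -s, -p [port] and -d [datastore path] into a plain dict.

    Hypothetical helper mirroring the defaults used in main().
    Raises getopt.GetoptError on an unknown flag.
    """
    options = {'ssl_mode': False, 'port': 5000, 'datastore_path': './datastore'}
    opts, _args = getopt.getopt(argv, "sd:p:", ["purge"])
    for opt, arg in opts:
        if opt == '-s':
            options['ssl_mode'] = True
        elif opt == '-p':
            options['port'] = int(arg)
        elif opt == '-d':
            options['datastore_path'] = arg
    return options


parsed = parse_args(['-s', '-p', '8080', '-d', '/tmp/datastore'])
```

`main()` could then call `parse_args(argv)` inside its existing `try`/`except getopt.GetoptError` block and keep the usage message where it is.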

View File

@@ -1,2 +0,0 @@
Empty dir, please keep, this is used to store your data!

View File

@@ -6,14 +6,15 @@ services:
backend:
build: ./backend/dev-docker
image: dgtlmoon/changedetection.io:0.1-dev
image: dgtlmoon/changedetection.io:dev
container_name: changedetection.io-dev
volumes:
- ./backend:/app
- ./requirements.txt:/requirements.txt # Normally COPY'ed in the Dockerfile
- ./datastore:/datastore
ports:
- "127.0.0.1:5000:5000"
- "127.0.0.1:5001:5000"
networks:
- changenet

View File

@@ -1,21 +0,0 @@
version: "2"
services:
backend:
build: ./backend/production-docker
image: dgtlmoon/changedetection.io:0.1
container_name: changedetection.io
volumes:
- ./backend:/app
- ./datastore:/datastore
ports:
- "127.0.0.1:5000:5000"
networks:
- changenet
restart: always
networks:
changenet:

View File

@@ -7,6 +7,9 @@ six==1.10.0
yarl
flask
pytest
pytest-flask # for live_server
eventlet
requests
validators
@@ -15,6 +18,7 @@ bleach==3.2.1
html5lib==0.9999999 # via bleach
timeago
html2text
inscriptis
# @notes
# - Don't install socketio, it interferes with flask_socketio

BIN
screenshot-diff.png Normal file

Binary file not shown.

Size: 115 KiB

Binary file not shown.

Size before: 190 KiB, after: 217 KiB