mirror of
https://github.com/dgtlmoon/changedetection.io.git
synced 2025-11-11 12:07:08 +00:00
Compare commits
9 Commits
export-reg
...
faster-bro
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
1a331b77bd | ||
|
|
6d932149e3 | ||
|
|
2c764e8f84 | ||
|
|
07765b0d38 | ||
|
|
7c3faa8e38 | ||
|
|
4624974b91 | ||
|
|
991841f1f9 | ||
|
|
e3db324698 | ||
|
|
0988bef2cd |
45
README.md
45
README.md
@@ -13,15 +13,33 @@ _Live your data-life pro-actively, Detect website changes and perform meaningful
|
||||
|
||||
- Chrome browser included.
|
||||
- Super fast, no registration needed setup.
|
||||
- Start watching and receiving change notifications instantly.
|
||||
- Get started watching and receiving website change notifications straight away.
|
||||
|
||||
|
||||
Easily see what changed, examine by word, line, or individual character.
|
||||
### Target specific parts of the webpage using the Visual Selector tool.
|
||||
|
||||
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot-diff.png" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />
|
||||
Available when connected to a <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Playwright-content-fetcher">playwright content fetcher</a> (included as part of our subscription service)
|
||||
|
||||
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/visualselector-anim.gif" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />](https://lemonade.changedetection.io/start?src=github)
|
||||
|
||||
### Easily see what changed, examine by word, line, or individual character.
|
||||
|
||||
[<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/screenshot-diff.png" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />](https://lemonade.changedetection.io/start?src=github)
|
||||
|
||||
|
||||
#### Example use cases
|
||||
### Perform interactive browser steps
|
||||
|
||||
Fill in text boxes, click buttons and more, setup your changedetection scenario.
|
||||
|
||||
Using the **Browser Steps** configuration, add basic steps before performing change detection, such as logging into websites, adding a product to a cart, accept cookie logins, entering dates and refining searches.
|
||||
|
||||
[<img src="docs/browsersteps-anim.gif" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Website change detection with interactive browser steps, login, cookies etc" />](https://lemonade.changedetection.io/start?src=github)
|
||||
|
||||
After **Browser Steps** have been run, then visit the **Visual Selector** tab to refine the content you're interested in.
|
||||
Requires Playwright to be enabled.
|
||||
|
||||
|
||||
### Example use cases
|
||||
|
||||
- Products and services have a change in pricing
|
||||
- _Out of stock notification_ and _Back In stock notification_
|
||||
@@ -59,27 +77,8 @@ _Need an actual Chrome runner with Javascript support? We support fetching via W
|
||||
|
||||
We [recommend and use Bright Data](https://brightdata.grsm.io/n0r16zf7eivq) global proxy services, Bright Data will match any first deposit up to $100 using our signup link.
|
||||
|
||||
## Screenshots
|
||||
|
||||
Please :star: star :star: this project and help it grow! https://github.com/dgtlmoon/changedetection.io/
|
||||
|
||||
### Target specific parts of the webpage using the Visual Selector tool.
|
||||
|
||||
Available when connected to a <a href="https://github.com/dgtlmoon/changedetection.io/wiki/Playwright-content-fetcher">playwright content fetcher</a> (included as part of our subscription service)
|
||||
|
||||
<img src="https://raw.githubusercontent.com/dgtlmoon/changedetection.io/master/docs/visualselector-anim.gif" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Self-hosted web page change monitoring context difference " />
|
||||
|
||||
### Perform interactive browser steps
|
||||
|
||||
Fill in text boxes, click buttons and more, setup your changedetection scenario.
|
||||
|
||||
Using the **Browser Steps** configuration, add basic steps before performing change detection, such as logging into websites, adding a product to a cart, accept cookie logins, entering dates and refining searches.
|
||||
|
||||
<img src="docs/browsersteps-anim.gif" style="max-width:100%;" alt="Self-hosted web page change monitoring context difference " title="Website change detection with interactive browser steps, login, cookies etc" />
|
||||
|
||||
After **Browser Steps** have been run, then visit the **Visual Selector** tab to refine the content you're interested in.
|
||||
Requires Playwright to be enabled.
|
||||
|
||||
## Installation
|
||||
|
||||
### Docker
|
||||
|
||||
@@ -27,6 +27,7 @@ from flask import (
|
||||
session,
|
||||
url_for,
|
||||
)
|
||||
from flask_compress import Compress as FlaskCompress
|
||||
from flask_login import login_required
|
||||
from flask_restful import abort, Api
|
||||
from flask_wtf import CSRFProtect
|
||||
@@ -51,6 +52,10 @@ app = Flask(__name__,
|
||||
static_url_path="",
|
||||
static_folder="static",
|
||||
template_folder="templates")
|
||||
from flask_compress import Compress
|
||||
|
||||
# Super handy for compressing large BrowserSteps responses and others
|
||||
FlaskCompress(app)
|
||||
|
||||
# Stop browser caching of assets
|
||||
app.config['SEND_FILE_MAX_AGE_DEFAULT'] = 0
|
||||
|
||||
@@ -210,16 +210,29 @@ def construct_blueprint(datastore: ChangeDetectionStore):
|
||||
except playwright._impl._api_types.Error as e:
|
||||
return make_response("Browser session ran out of time :( Please reload this page."+str(e), 401)
|
||||
|
||||
p = {'screenshot': "data:image/png;base64,{}".format(
|
||||
# Use send_file() which is way faster than read/write loop on bytes
|
||||
import json
|
||||
from tempfile import mkstemp
|
||||
from flask import send_file
|
||||
tmp_fd, tmp_file = mkstemp(text=True, suffix=".json", prefix="changedetectionio-")
|
||||
|
||||
output = json.dumps({'screenshot': "data:image/png;base64,{}".format(
|
||||
base64.b64encode(state[0]).decode('ascii')),
|
||||
'xpath_data': state[1],
|
||||
'session_age_start': this_session.age_start,
|
||||
'browser_time_remaining': round(remaining)
|
||||
}
|
||||
})
|
||||
|
||||
with os.fdopen(tmp_fd, 'w') as f:
|
||||
f.write(output)
|
||||
|
||||
# @todo BSON/binary JSON, faster xfer, OR pick it off the disk
|
||||
return p
|
||||
response = make_response(send_file(path_or_file=tmp_file,
|
||||
mimetype='application/json; charset=UTF-8',
|
||||
etag=True))
|
||||
# No longer needed
|
||||
os.unlink(tmp_file)
|
||||
|
||||
return response
|
||||
|
||||
return browser_steps_blueprint
|
||||
|
||||
|
||||
@@ -22,6 +22,7 @@ browser_step_ui_config = {'Choose one': '0 0',
|
||||
'Click element': '1 0',
|
||||
'Click element containing text': '0 1',
|
||||
'Enter text in field': '1 1',
|
||||
'Execute JS': '0 1',
|
||||
# 'Extract text and use as filter': '1 0',
|
||||
'Goto site': '0 0',
|
||||
'Press Enter': '0 0',
|
||||
@@ -97,6 +98,9 @@ class steppable_browser_interface():
|
||||
|
||||
self.page.fill(selector, value, timeout=10 * 1000)
|
||||
|
||||
def action_execute_js(self, selector, value):
|
||||
self.page.evaluate(value)
|
||||
|
||||
def action_click_element(self, selector, value):
|
||||
print("Clicking element")
|
||||
if not len(selector.strip()):
|
||||
@@ -232,7 +236,7 @@ class browsersteps_live_ui(steppable_browser_interface):
|
||||
self.page.evaluate("var include_filters=''")
|
||||
# Go find the interactive elements
|
||||
# @todo in the future, something smarter that can scan for elements with .click/focus etc event handlers?
|
||||
elements = 'a,button,input,select,textarea,i,th,td,p,li,h1,h2,h3,h4'
|
||||
elements = 'a,button,input,select,textarea,i,th,td,p,li,h1,h2,h3,h4,div,span'
|
||||
xpath_element_js = xpath_element_js.replace('%ELEMENTS%', elements)
|
||||
xpath_data = self.page.evaluate("async () => {" + xpath_element_js + "}")
|
||||
# So the JS will find the smallest one first
|
||||
|
||||
@@ -400,6 +400,15 @@ class watchForm(commonSettingsForm):
|
||||
self.body.errors.append('Body must be empty when Request Method is set to GET')
|
||||
result = False
|
||||
|
||||
# Attempt to validate jinja2 templates in the URL
|
||||
from jinja2 import Environment
|
||||
# Jinja2 available in URLs along with https://pypi.org/project/jinja2-time/
|
||||
jinja2_env = Environment(extensions=['jinja2_time.TimeExtension'])
|
||||
try:
|
||||
ready_url = str(jinja2_env.from_string(self.url.data).render())
|
||||
except Exception as e:
|
||||
self.url.errors.append('Invalid template syntax')
|
||||
result = False
|
||||
return result
|
||||
|
||||
|
||||
|
||||
@@ -93,12 +93,23 @@ class model(dict):
|
||||
@property
|
||||
def link(self):
|
||||
url = self.get('url', '')
|
||||
ready_url = url
|
||||
if '{%' in url or '{{' in url:
|
||||
from jinja2 import Environment
|
||||
# Jinja2 available in URLs along with https://pypi.org/project/jinja2-time/
|
||||
jinja2_env = Environment(extensions=['jinja2_time.TimeExtension'])
|
||||
return str(jinja2_env.from_string(url).render())
|
||||
return url
|
||||
try:
|
||||
ready_url = str(jinja2_env.from_string(url).render())
|
||||
except Exception as e:
|
||||
from flask import (
|
||||
flash, Markup, url_for
|
||||
)
|
||||
message = Markup('<a href="{}#general">The URL {} is invalid and cannot be used, click to edit</a>'.format(
|
||||
url_for('edit_page', uuid=self.get('uuid')), self.get('url', '')))
|
||||
flash(message, 'error')
|
||||
return ''
|
||||
|
||||
return ready_url
|
||||
|
||||
@property
|
||||
def label(self):
|
||||
|
||||
@@ -1,3 +1,13 @@
|
||||
// @file Scrape the page looking for elements of concern (%ELEMENTS%)
|
||||
// http://matatk.agrip.org.uk/tests/position-and-width/
|
||||
// https://stackoverflow.com/questions/26813480/when-is-element-getboundingclientrect-guaranteed-to-be-updated-accurate
|
||||
//
|
||||
// Some pages like https://www.londonstockexchange.com/stock/NCCL/ncondezi-energy-limited/analysis
|
||||
// will automatically force a scroll somewhere, so include the position offset
|
||||
// Lets hope the position doesnt change while we iterate the bbox's, but this is better than nothing
|
||||
|
||||
var scroll_y=+document.documentElement.scrollTop || document.body.scrollTop
|
||||
|
||||
// Include the getXpath script directly, easier than fetching
|
||||
function getxpath(e) {
|
||||
var n = e;
|
||||
@@ -71,8 +81,13 @@ var bbox;
|
||||
for (var i = 0; i < elements.length; i++) {
|
||||
bbox = elements[i].getBoundingClientRect();
|
||||
|
||||
// forget really small ones
|
||||
if (bbox['width'] < 15 && bbox['height'] < 15) {
|
||||
// Forget really small ones
|
||||
if (bbox['width'] < 10 && bbox['height'] < 10) {
|
||||
continue;
|
||||
}
|
||||
|
||||
// Don't include elements that are offset from canvas
|
||||
if (bbox['top']+scroll_y < 0 || bbox['left'] < 0) {
|
||||
continue;
|
||||
}
|
||||
|
||||
@@ -109,14 +124,16 @@ for (var i = 0; i < elements.length; i++) {
|
||||
continue;
|
||||
}
|
||||
|
||||
// @todo Possible to ONLY list where it's clickable to save JSON xfer size
|
||||
size_pos.push({
|
||||
xpath: xpath_result,
|
||||
width: Math.round(bbox['width']),
|
||||
height: Math.round(bbox['height']),
|
||||
left: Math.floor(bbox['left']),
|
||||
top: Math.floor(bbox['top']),
|
||||
top: Math.floor(bbox['top'])+scroll_y,
|
||||
tagName: (elements[i].tagName) ? elements[i].tagName.toLowerCase() : '',
|
||||
tagtype: (elements[i].tagName == 'INPUT' && elements[i].type) ? elements[i].type.toLowerCase() : ''
|
||||
tagtype: (elements[i].tagName == 'INPUT' && elements[i].type) ? elements[i].type.toLowerCase() : '',
|
||||
isClickable: (elements[i].onclick) || window.getComputedStyle(elements[i]).cursor == "pointer"
|
||||
});
|
||||
|
||||
}
|
||||
@@ -150,6 +167,7 @@ if (include_filters.length) {
|
||||
|
||||
if (q) {
|
||||
bbox = q.getBoundingClientRect();
|
||||
console.log("xpath_element_scraper: Got filter element, scroll from top was "+scroll_y)
|
||||
} else {
|
||||
console.log("xpath_element_scraper: filter element "+f+" was not found");
|
||||
}
|
||||
@@ -157,10 +175,10 @@ if (include_filters.length) {
|
||||
if (bbox && bbox['width'] > 0 && bbox['height'] > 0) {
|
||||
size_pos.push({
|
||||
xpath: f,
|
||||
width: Math.round(bbox['width']),
|
||||
height: Math.round(bbox['height']),
|
||||
left: Math.floor(bbox['left']),
|
||||
top: Math.floor(bbox['top'])
|
||||
width: parseInt(bbox['width']),
|
||||
height: parseInt(bbox['height']),
|
||||
left: parseInt(bbox['left']),
|
||||
top: parseInt(bbox['top'])+scroll_y
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
@@ -196,9 +196,7 @@ $(document).ready(function () {
|
||||
$('input[placeholder="Value"]', first_available).addClass('ok').click().focus();
|
||||
found_something = true;
|
||||
} else {
|
||||
// Assume it's just for clicking on
|
||||
// what are we clicking on?
|
||||
if (x['tagName'].startsWith('h')|| x['tagName'] === 'a' || x['tagName'] === 'button' || x['tagtype'] === 'submit'|| x['tagtype'] === 'checkbox'|| x['tagtype'] === 'radio'|| x['tagtype'] === 'li') {
|
||||
if (x['isClickable'] || x['tagName'].startsWith('h')|| x['tagName'] === 'a' || x['tagName'] === 'button' || x['tagtype'] === 'submit'|| x['tagtype'] === 'checkbox'|| x['tagtype'] === 'radio'|| x['tagtype'] === 'li') {
|
||||
$('select', first_available).val('Click element').change();
|
||||
$('input[type=text]', first_available).first().val(x['xpath']);
|
||||
found_something = true;
|
||||
@@ -220,8 +218,6 @@ $(document).ready(function () {
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
|
||||
}
|
||||
}
|
||||
|
||||
@@ -252,6 +248,7 @@ $(document).ready(function () {
|
||||
}).done(function (data) {
|
||||
xpath_data = data.xpath_data;
|
||||
$('#browsersteps-img').attr('src', data.screenshot);
|
||||
$("#loading-status-text").fadeIn();
|
||||
// This should trigger 'Goto site'
|
||||
$('#browser_steps >li:first-child .apply').click();
|
||||
browserless_seconds_remaining = data.browser_time_remaining;
|
||||
@@ -393,6 +390,7 @@ $(document).ready(function () {
|
||||
400: function () {
|
||||
// More than likely the CSRF token was lost when the server restarted
|
||||
alert("There was a problem processing the request, please reload the page.");
|
||||
$("#loading-status-text").hide();
|
||||
}
|
||||
}
|
||||
}).done(function (data) {
|
||||
@@ -404,12 +402,14 @@ $(document).ready(function () {
|
||||
$("#browsersteps-img").css('opacity',1);
|
||||
$('ul#browser_steps li .control .apply').css('opacity',1);
|
||||
browserless_seconds_remaining = data.browser_time_remaining;
|
||||
$("#loading-status-text").hide();
|
||||
}).fail(function (data) {
|
||||
console.log(data);
|
||||
if (data.responseText.includes("Browser session expired")) {
|
||||
disable_browsersteps_ui();
|
||||
}
|
||||
apply_buttons_disabled=false;
|
||||
$("#loading-status-text").hide();
|
||||
$('ul#browser_steps li .control .apply').css('opacity',1);
|
||||
$("#browsersteps-img").css('opacity',1);
|
||||
//$('#browsersteps-selector-wrapper .loader').fadeOut(2500);
|
||||
|
||||
@@ -160,7 +160,7 @@ User-Agent: wonderbra 1.0") }}
|
||||
<input type=checkbox id="include_text_elements" > <label for="include_text_elements">Turn on text finder</label>
|
||||
</div>
|
||||
|
||||
|
||||
<div id="loading-status-text" style="display: none;">Please wait, first browser step can take a little time to load..<div class="spinner"></div></div>
|
||||
<div class="flex-wrapper" >
|
||||
|
||||
<div id="browser-steps-ui" class="noselect" style="width: 100%; background-color: #eee; border-radius: 5px;">
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
flask~=2.0
|
||||
flask_wtf
|
||||
flask-compress
|
||||
eventlet>=0.31.0
|
||||
validators
|
||||
timeago~=1.0
|
||||
|
||||
Reference in New Issue
Block a user