Jason Schwarzenberger
|
0f39446a61
|
tz aware for use in settings.
|
2020-11-05 16:30:55 +13:00 |
|
Jason Schwarzenberger
|
351059aab1
|
fix excludes.
|
2020-11-05 15:59:13 +13:00 |
|
Jason Schwarzenberger
|
4488e2c292
|
add an excludes list of substrings for urls in the settings for sitemap/category.
|
2020-11-05 15:51:59 +13:00 |
|
Jason Schwarzenberger
|
afda5b635c
|
disqus test.
|
2020-11-05 14:23:51 +13:00 |
|
Jason Schwarzenberger
|
0fc1a44d2b
|
fix issue in substack.
|
2020-11-04 17:40:29 +13:00 |
|
Jason Schwarzenberger
|
9fff1b9e46
|
avoid duplicate articles listed on the category page
|
2020-11-04 17:14:42 +13:00 |
|
Jason Schwarzenberger
|
16b59f6c67
|
try stop bad pages.
|
2020-11-04 16:34:31 +13:00 |
|
Jason Schwarzenberger
|
939f4775a7
|
better settings example.
|
2020-11-04 15:52:34 +13:00 |
|
Jason Schwarzenberger
|
9bfc6fc6fa
|
scraper settings, ordering and loop.
|
2020-11-04 15:47:12 +13:00 |
|
Jason Schwarzenberger
|
6ea9844d00
|
remove useless try blocks.
|
2020-11-04 15:37:19 +13:00 |
|
Jason Schwarzenberger
|
1318259d3d
|
imply referrer is substack.
|
2020-11-04 15:21:07 +13:00 |
|
Jason Schwarzenberger
|
98a0c2257c
|
increase declutter timeout.
|
2020-11-04 15:15:00 +13:00 |
|
Jason Schwarzenberger
|
e6976db25d
|
fix tabs
|
2020-11-04 15:04:20 +13:00 |
|
Jason Schwarzenberger
|
9edc8b7cca
|
move scraping for article content to files.
|
2020-11-04 15:00:58 +13:00 |
|
Jason Schwarzenberger
|
d718d05a04
|
fix dates for newsroom.
|
2020-11-04 11:53:16 +13:00 |
|
Jason Schwarzenberger
|
9f4ff4acf0
|
remove unnecessary sitemap.xml request.
|
2020-11-04 11:22:15 +13:00 |
|
Jason Schwarzenberger
|
db6aad84ec
|
fix mistake.
|
2020-11-04 11:12:01 +13:00 |
|
Jason Schwarzenberger
|
29f8a8b8cc
|
add news site categories feed.
|
2020-11-04 11:08:50 +13:00 |
|
Jason
|
abf8589e02
|
fix sitemap
|
2020-11-03 10:53:40 +00:00 |
|
Jason
|
b759f46582
|
use extruct for opengraph/json-ld/microdata of articles
|
2020-11-03 10:31:36 +00:00 |
|
Jason Schwarzenberger
|
736cdc8576
|
fix mistake.
|
2020-11-03 17:04:46 +13:00 |
|
Jason Schwarzenberger
|
244d416f6e
|
settings config of sitemap/substack publications.
|
2020-11-03 17:01:29 +13:00 |
|
Jason Schwarzenberger
|
5f98a2e76a
|
Merge remote-tracking branch 'tanner/master' into master
And adding relevant settings.py.example/etc.
|
2020-11-03 16:44:02 +13:00 |
|
Jason Schwarzenberger
|
76f1d57702
|
sitemap based feed.
|
2020-11-03 16:00:03 +13:00 |
|
Jason Schwarzenberger
|
4e64cf682a
|
add the bulletin.
|
2020-11-03 12:41:16 +13:00 |
|
Jason Schwarzenberger
|
c5fe5d25a0
|
add substack.py top sites, replacing webworm.py
|
2020-11-03 12:28:39 +13:00 |
|
Jason
|
283a2b1545
|
fix webworm comments
|
2020-11-02 22:06:43 +00:00 |
|
Jason Schwarzenberger
|
0d6a86ace2
|
fix webworm dates.
|
2020-11-03 10:31:14 +13:00 |
|
Jason Schwarzenberger
|
f23bf628e0
|
add webworm/substack as a feed.
|
2020-11-02 17:09:59 +13:00 |
|
|
ca78a6d7a9
|
Move feed and Praw config to settings.py
|
2020-11-02 02:26:54 +00:00 |
|
|
e59acefda9
|
Remove Whoosh
|
2020-11-02 00:22:40 +00:00 |
|
|
cbc802b7e9
|
Try Hackernews API twice
|
2020-11-02 00:17:22 +00:00 |
|
|
4579dfce00
|
Improve logging
|
2020-11-02 00:13:43 +00:00 |
|
|
feba8b7aa0
|
Make qotnews work with WaPo
|
2020-10-29 04:55:34 +00:00 |
|
|
992c1c1233
|
Monkeypatch earlier
|
2020-10-24 22:30:00 +00:00 |
|
|
88d2216627
|
Add a script to delete a story
|
2020-10-03 23:42:21 +00:00 |
|
|
6cf2f01b08
|
Adjust feeds
|
2020-10-03 23:41:57 +00:00 |
|
|
6576eb1bac
|
Adjust content-type request timeout
|
2020-08-14 03:57:43 +00:00 |
|
|
472af76d1a
|
Adjust port
|
2020-08-14 03:57:18 +00:00 |
|
|
4727d34eb6
|
Delete displayed-attributes when init search
|
2020-08-14 03:56:47 +00:00 |
|
|
0e086b60b8
|
Remove business subreddit from feed
|
2020-08-14 03:55:28 +00:00 |
|
|
b46ce36c63
|
Update requirements
|
2020-07-08 05:24:32 +00:00 |
|
|
9a449bf3ca
|
Remove extra logging
|
2020-07-08 02:36:40 +00:00 |
|
|
0bd9f05250
|
Fix crash when HN feed fails
|
2020-07-08 02:36:40 +00:00 |
|
|
9c116bde4a
|
Remove document img and ignore r/technology
|
2020-07-08 02:36:40 +00:00 |
|
|
ebedaef00b
|
Tune search rankings and attributes
|
2020-07-08 02:36:40 +00:00 |
|
|
d7f0643bd7
|
Add more logging
|
2020-07-08 02:36:40 +00:00 |
|
|
f1c846acd0
|
Remove get first image
|
2020-07-08 02:36:40 +00:00 |
|
|
850b30e353
|
Add requests timeouts and temporary logging
|
2020-07-08 02:36:40 +00:00 |
|
|
d614ad0743
|
Integrate with external MeiliSearch server
|
2020-07-08 02:36:40 +00:00 |
|