Commit Graph

140 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Jason Schwarzenberger | 32bc3b906b | add update-story.py | 2020-11-19 15:06:55 +13:00 |
| Jason Schwarzenberger | f5e65632b8 | fix comment date. | 2020-11-19 14:27:24 +13:00 |
| Jason Schwarzenberger | 1fe524207e | stuff comments. | 2020-11-19 14:23:01 +13:00 |
| Jason Schwarzenberger | 539350a83d | port separation. | 2020-11-18 17:21:37 +13:00 |
| Jason Schwarzenberger | f5b38f5c6b | remove readerserver, add declutter. | 2020-11-18 12:59:35 +13:00 |
| Jason Schwarzenberger | 3b885e4327 | renaming things. | 2020-11-17 15:54:14 +13:00 |
| Jason Schwarzenberger | 5668fa5dbc | fix mistake. | 2020-11-17 12:54:54 +13:00 |
| Jason Schwarzenberger | b771b52501 | add regex to get a unique ref from each sitemap/category based article url. | 2020-11-17 12:38:28 +13:00 |
| Jason Schwarzenberger | f5ccd844da | fix import error. | 2020-11-16 15:41:09 +13:00 |
| Jason Schwarzenberger | 6a91b9402f | split categories, sitemap and other crap out of news.py | 2020-11-16 15:30:33 +13:00 |
| Jason Schwarzenberger | b23e470317 | move reddit thresholds as settings variables. | 2020-11-16 10:11:39 +13:00 |
| Jason Schwarzenberger | 7420b5ece9 | fix microdata multiple authors | 2020-11-12 17:33:46 +13:00 |
| Jason Schwarzenberger | 64ced635cc | fix mistake. | 2020-11-12 17:15:29 +13:00 |
| Jason Schwarzenberger | 9318627f1b | ability to pass in multiple site maps/category urls. | 2020-11-12 17:11:51 +13:00 |
| Jason Schwarzenberger | 3d0a3f1577 | support list based json-ld authors. | 2020-11-12 15:08:23 +13:00 |
| Jason Schwarzenberger | 587b10c438 | recursive sitemaps (sitemap indexes) | 2020-11-12 14:56:46 +13:00 |
| Jason | 00954c6cac | local browser scraper | 2020-11-11 09:26:54 +00:00 |
| Jason Schwarzenberger | 3169af3002 | hostname from settings. | 2020-11-11 09:46:27 +13:00 |
| Jason Schwarzenberger | d588a60930 | add source to searchable attributes. | 2020-11-11 09:37:54 +13:00 |
| Jason Schwarzenberger | 408e2870b2 | tzinfo and microdata schema urls. | 2020-11-10 16:51:27 +13:00 |
| Jason Schwarzenberger | 44b8b36547 | add data cast in query. | 2020-11-10 15:50:18 +13:00 |
| Jason Schwarzenberger | 1d78b1c592 | fix favicon url. | 2020-11-10 15:34:21 +13:00 |
| Jason Schwarzenberger | 0374794536 | Sitemap and Category to get favicon into icon property of story. | 2020-11-10 15:22:27 +13:00 |
| Jason Schwarzenberger | 5efc6ef2d3 | add related stories (in api only) | 2020-11-10 14:09:56 +13:00 |
| Jason Schwarzenberger | 4ec50e20cb | feed thread loop. | 2020-11-10 10:10:38 +13:00 |
| Jason Schwarzenberger | c1b7877f4b | remove limit. | 2020-11-09 17:54:50 +13:00 |
| Jason Schwarzenberger | 7b8cbfc9b9 | try to make feed only determined by the max age. | 2020-11-09 17:50:58 +13:00 |
| Jason Schwarzenberger | bfa4108a8e | Merge remote-tracking branch 'tanner/master' | 2020-11-09 16:08:28 +13:00 |
| Jason Schwarzenberger | 0bd0d40a31 | use json type in sqlite. | 2020-11-09 15:45:10 +13:00 |
| Jason Schwarzenberger | 4e04595415 | fix search. | 2020-11-09 15:44:44 +13:00 |
| Jason | 006db2960c | change to 3 days | 2020-11-09 01:36:51 +00:00 |
| Jason Schwarzenberger | 1f063f0dac | undo log level change | 2020-11-06 11:20:34 +13:00 |
| Jason Schwarzenberger | 1658346aa9 | fix news.py feed. | 2020-11-06 10:37:43 +13:00 |
| Jason Schwarzenberger | 2dbc702b40 | switch to python-dateutil for parser, reverse sort xml feeds. | 2020-11-06 10:02:39 +13:00 |
| Jason Schwarzenberger | 1c4764e67d | sort sitemap feed by lastmod time. | 2020-11-06 09:30:15 +13:00 |
| Jason | ee49d2021e | newsroom | 2020-11-05 20:28:55 +00:00 |
| Jason | c391c50ab1 | use localize | 2020-11-05 04:15:31 +00:00 |
| Jason Schwarzenberger | 095f0d549a | use replace. | 2020-11-05 16:57:08 +13:00 |
| Jason Schwarzenberger | c21c71667e | fix date issue. | 2020-11-05 16:41:15 +13:00 |
| Jason Schwarzenberger | c3a2c91a11 | update requirements.txt | 2020-11-05 16:33:50 +13:00 |
| Jason Schwarzenberger | 0f39446a61 | tz aware for use in settings. | 2020-11-05 16:30:55 +13:00 |
| Jason Schwarzenberger | 351059aab1 | fix excludes. | 2020-11-05 15:59:13 +13:00 |
| Jason Schwarzenberger | 4488e2c292 | add an excludes list of substrings for urls in the settings for sitemap/category. | 2020-11-05 15:51:59 +13:00 |
| Jason Schwarzenberger | afda5b635c | disqus test. | 2020-11-05 14:23:51 +13:00 |
| Jason Schwarzenberger | 0fc1a44d2b | fix issue in substack. | 2020-11-04 17:40:29 +13:00 |
| Jason Schwarzenberger | 9fff1b9e46 | avoid duplicate articles listed on the category page | 2020-11-04 17:14:42 +13:00 |
| Jason Schwarzenberger | 16b59f6c67 | try stop bad pages. | 2020-11-04 16:34:31 +13:00 |
| Jason Schwarzenberger | 939f4775a7 | better settings example. | 2020-11-04 15:52:34 +13:00 |
| Jason Schwarzenberger | 9bfc6fc6fa | scraper settings, ordering and loop. | 2020-11-04 15:47:12 +13:00 |
| Jason Schwarzenberger | 6ea9844d00 | remove useless try blocks. | 2020-11-04 15:37:19 +13:00 |