Commit Graph

122 Commits

Author SHA1 Message Date
Jason Schwarzenberger d588a60930 add source to searchable attributes. 2020-11-11 09:37:54 +13:00
Jason Schwarzenberger 408e2870b2 tzinfo and microdata schema urls. 2020-11-10 16:51:27 +13:00
Jason Schwarzenberger 44b8b36547 add data cast in query. 2020-11-10 15:50:18 +13:00
Jason Schwarzenberger 1d78b1c592 fix favicon url. 2020-11-10 15:34:21 +13:00
Jason Schwarzenberger 0374794536 Sitemap and Category to get favicon into icon property of story. 2020-11-10 15:22:27 +13:00
Jason Schwarzenberger 5efc6ef2d3 add related stories (in api only) 2020-11-10 14:09:56 +13:00
Jason Schwarzenberger 4ec50e20cb feed thread loop. 2020-11-10 10:10:38 +13:00
Jason Schwarzenberger c1b7877f4b remove limit. 2020-11-09 17:54:50 +13:00
Jason Schwarzenberger 7b8cbfc9b9 try to make feed only determined by the max age. 2020-11-09 17:50:58 +13:00
Jason Schwarzenberger bfa4108a8e Merge remote-tracking branch 'tanner/master' 2020-11-09 16:08:28 +13:00
Jason Schwarzenberger 0bd0d40a31 use json type in sqlite. 2020-11-09 15:45:10 +13:00
Jason Schwarzenberger 4e04595415 fix search. 2020-11-09 15:44:44 +13:00
Jason 006db2960c change to 3 days 2020-11-09 01:36:51 +00:00
Jason Schwarzenberger 1f063f0dac undo log level change 2020-11-06 11:20:34 +13:00
Jason Schwarzenberger 1658346aa9 fix news.py feed. 2020-11-06 10:37:43 +13:00
Jason Schwarzenberger 2dbc702b40 switch to python-dateutil for parser, reverse sort xml feeds. 2020-11-06 10:02:39 +13:00
Jason Schwarzenberger 1c4764e67d sort sitemap feed by lastmod time. 2020-11-06 09:30:15 +13:00
Jason ee49d2021e newsroom 2020-11-05 20:28:55 +00:00
Jason c391c50ab1 use localize 2020-11-05 04:15:31 +00:00
Jason Schwarzenberger 095f0d549a use replace. 2020-11-05 16:57:08 +13:00
Jason Schwarzenberger c21c71667e fix date issue. 2020-11-05 16:41:15 +13:00
Jason Schwarzenberger c3a2c91a11 update requirements.txt 2020-11-05 16:33:50 +13:00
Jason Schwarzenberger 0f39446a61 tz aware for use in settings. 2020-11-05 16:30:55 +13:00
Jason Schwarzenberger 351059aab1 fix excludes. 2020-11-05 15:59:13 +13:00
Jason Schwarzenberger 4488e2c292 add an excludes list of substrings for urls in the settings for sitemap/category. 2020-11-05 15:51:59 +13:00
Jason Schwarzenberger afda5b635c disqus test. 2020-11-05 14:23:51 +13:00
Jason Schwarzenberger 0fc1a44d2b fix issue in substack. 2020-11-04 17:40:29 +13:00
Jason Schwarzenberger 9fff1b9e46 avoid duplicate articles listed on the category page 2020-11-04 17:14:42 +13:00
Jason Schwarzenberger 16b59f6c67 try stop bad pages. 2020-11-04 16:34:31 +13:00
Jason Schwarzenberger 939f4775a7 better settings example. 2020-11-04 15:52:34 +13:00
Jason Schwarzenberger 9bfc6fc6fa scraper settings, ordering and loop. 2020-11-04 15:47:12 +13:00
Jason Schwarzenberger 6ea9844d00 remove useless try blocks. 2020-11-04 15:37:19 +13:00
Jason Schwarzenberger 1318259d3d imply referrer is substack. 2020-11-04 15:21:07 +13:00
Jason Schwarzenberger 98a0c2257c increase declutter timeout. 2020-11-04 15:15:00 +13:00
Jason Schwarzenberger e6976db25d fix tabs 2020-11-04 15:04:20 +13:00
Jason Schwarzenberger 9edc8b7cca move scraping for article content to files. 2020-11-04 15:00:58 +13:00
Jason Schwarzenberger d718d05a04 fix dates for newsroom. 2020-11-04 11:53:16 +13:00
Jason Schwarzenberger 9f4ff4acf0 remove unnecessary sitemap.xml request. 2020-11-04 11:22:15 +13:00
Jason Schwarzenberger db6aad84ec fix mistake. 2020-11-04 11:12:01 +13:00
Jason Schwarzenberger 29f8a8b8cc add news site categories feed. 2020-11-04 11:08:50 +13:00
tanner 9a279d44b1 Add header to get content type 2020-11-03 20:27:43 +00:00
Jason abf8589e02 fix sitemap 2020-11-03 10:53:40 +00:00
Jason b759f46582 use extruct for opengraph/json-ld/microdata of articles 2020-11-03 10:31:36 +00:00
Jason Schwarzenberger 736cdc8576 fix mistake. 2020-11-03 17:04:46 +13:00
Jason Schwarzenberger 244d416f6e settings config of sitemap/substack publications. 2020-11-03 17:01:29 +13:00
Jason Schwarzenberger 5f98a2e76a Merge remote-tracking branch 'tanner/master' into master. And adding relevant setings.py.example/etc. 2020-11-03 16:44:02 +13:00
Jason Schwarzenberger 76f1d57702 sitemap based feed. 2020-11-03 16:00:03 +13:00
Jason Schwarzenberger 4e64cf682a add the bulletin. 2020-11-03 12:41:16 +13:00
Jason Schwarzenberger c5fe5d25a0 add substack.py top sites, replacing webworm.py 2020-11-03 12:28:39 +13:00
Jason 283a2b1545 fix webworm comments 2020-11-02 22:06:43 +00:00