Commit Graph

88 Commits

Author SHA1 Message Date
Brett Kuprel
09a0f85b8e separate setup processes for flax and torch 2022-07-01 11:08:33 -04:00
Brett Kuprel
b40fd83a0d mega works with latest flax version 0.5.2 now, removing 0.4.2 pin 2022-07-01 02:58:43 -04:00
Brett Kuprel
eaee59a1ef update readme 2022-06-30 21:05:02 -04:00
Brett Kuprel
f8bffc6892 update readme 2022-06-30 21:03:56 -04:00
Brett Kuprel
08b158d580 updated readme 2022-06-30 16:50:04 -04:00
Brett Kuprel
d9d7f34b22 update readme 2022-06-30 15:33:53 -04:00
Brett Kuprel
c8f3304363 update readme 2022-06-30 15:18:29 -04:00
Brett Kuprel
b913b58353 pre converting params to torch allows mega to run in standard colab runtime 2022-06-30 14:54:08 -04:00
Brett Kuprel
41a44068d0 keep params in expendable mode 2022-06-30 09:36:32 -04:00
Brett Kuprel
fb97ba5e20 update readme, cleanup 2022-06-30 07:41:31 -04:00
Brett Kuprel
005ee4938e Update README.md 2022-06-29 15:24:09 -04:00
Brett Kuprel
15b0c03485 Merge pull request #38 from chenxwh/replicate: Add Web Demo & Docker environment 2022-06-29 15:20:20 -04:00
Brett Kuprel
d99828a239 simplified flax attention and matched torch attention 2022-06-29 14:56:28 -04:00
Chenxi
fcc17c895d Merge branch 'kuprel:main' into replicate 2022-06-29 19:50:47 +01:00
Chenxi
eb9f4c6b3b replicate demo 2022-06-29 19:50:10 +01:00
Brett Kuprel
a4df279fd2 simplified attention and keys_values state resulted in decrease in inference time to 7.3 seconds (from ~10 seconds) 2022-06-29 13:56:29 -04:00
Brett Kuprel
764b5bbc0e works with latest flax version 0.5.2, updated requirements.txt 2022-06-29 11:46:19 -04:00
Brett Kuprel
c4f613c89f readme wording 2022-06-29 11:01:46 -04:00
Brett Kuprel
6046863805 readme wording 2022-06-29 11:00:38 -04:00
Brett Kuprel
2b552fe9db readme wording 2022-06-29 10:54:01 -04:00
Brett Kuprel
b7c2414c76 readme wording 2022-06-29 10:47:08 -04:00
Brett Kuprel
4e62f85ab9 readme wording 2022-06-29 10:45:46 -04:00
Brett Kuprel
53695e32f7 updated readme with torch examples 2022-06-29 10:43:46 -04:00
Brett Kuprel
ed91ab4a30 refactored to load models once and run multiple times 2022-06-29 09:42:12 -04:00
Brett Kuprel
1fbb209623 fixed bug with cuda in detokenizer 2022-06-28 22:02:35 -04:00
Brett Kuprel
aef24ea157 torch.no_grad(), cleanup 2022-06-28 12:16:44 -04:00
Omar Sanseviero
efa40ab321 Update README.md 2022-06-28 09:25:56 +02:00
Brett Kuprel
6260252348 update readme 2022-06-27 22:08:29 -04:00
kuprel
2ad7009a16 Update README.md 2022-06-27 21:11:27 -04:00
Brett Kuprel
6a068651e5 updated colab 2022-06-27 20:59:39 -04:00
Brett Kuprel
24d8e29ef2 readme formatting 2022-06-27 17:28:40 -04:00
Brett Kuprel
8363495f0a updated readme 2022-06-27 17:26:05 -04:00
Brett Kuprel
a014dccc05 fixed an issue with argument parser 2022-06-27 16:49:42 -04:00
Brett Kuprel
e7001f063c simplified 2022-06-27 15:46:04 -04:00
Brett Kuprel
18e6a9852f license and cleanup 2022-06-27 14:34:10 -04:00
Brett Kuprel
32b7aa196b readme 2022-06-27 13:51:48 -04:00
Brett Kuprel
194ae7dfa1 examples 2022-06-27 13:38:35 -04:00
Brett Kuprel
97fe8515f1 first commit 2022-06-27 11:55:55 -04:00