33 Commits

Author | SHA1 | Message | Date
Brett Kuprel | 09a0f85b8e | separate setup processes for flax and torch | 2022-07-01 11:08:33 -04:00
Brett Kuprel | 7bf76deafb | fixed wrong file path | 2022-07-01 10:58:29 -04:00
Brett Kuprel | e4c2be54cb | save converted detokenizer params | 2022-07-01 10:17:29 -04:00
Brett Kuprel | b40fd83a0d | mega works with latest flax version 0.5.2 now, removing 0.4.2 pin | 2022-07-01 02:58:43 -04:00
Brett Kuprel | 08b158d580 | updated readme | 2022-06-30 16:50:04 -04:00
Brett Kuprel | 2311a1af7b | delete cache | 2022-06-30 15:48:20 -04:00
Brett Kuprel | b913b58353 | pre converting params to torch allows mega to run in standard colab runtime | 2022-06-30 14:54:08 -04:00
Brett Kuprel | c2a3858c96 | delete params sooner | 2022-06-30 11:44:36 -04:00
Brett Kuprel | f951424e38 | is_reusable | 2022-06-30 11:25:24 -04:00
Brett Kuprel | b55bcba4c0 | removed deepcopy, delete expendable parameters after use | 2022-06-30 11:09:09 -04:00
Brett Kuprel | 41a44068d0 | keep params in expendable mode | 2022-06-30 09:36:32 -04:00
Brett Kuprel | df9aa6f915 | sort -> topk, prev_token_and_index -> prev_token, token_index | 2022-06-30 09:04:11 -04:00
Brett Kuprel | fb97ba5e20 | update readme, cleanup | 2022-06-30 07:41:31 -04:00
Brett Kuprel | 1e18ba0ffa | is_expendable argument reduces memory usage for command line script | 2022-06-30 06:43:10 -04:00
Brett Kuprel | d99828a239 | simplified flax attention and matched torch attention | 2022-06-29 14:56:28 -04:00
Brett Kuprel | 61cc99c13c | read tokenizer files with utf8 encoding | 2022-06-29 14:18:23 -04:00
Brett Kuprel | 661ec976ac | simplified attention for torch model | 2022-06-29 13:48:12 -04:00
Brett Kuprel | ed91ab4a30 | refactored to load models once and run multiple times | 2022-06-29 09:42:12 -04:00
Adam Novak | 28c812c832 | Use all logical cores in Torch mode | 2022-06-28 22:26:51 -04:00
Brett Kuprel | 1fbb209623 | fixed bug with cuda in detokenizer | 2022-06-28 22:02:35 -04:00
Brett Kuprel | 764b0bc685 | cuda in detokenizer from previous commit broke colab flax model, fixed | 2022-06-28 21:36:48 -04:00
Brett Kuprel | 17c96fe110 | works with cuda | 2022-06-28 21:28:36 -04:00
Brett Kuprel | 9d6b6dcc92 | previous commit broke flax model, fixed now | 2022-06-28 12:54:58 -04:00
Brett Kuprel | 5aa6fe49bf | use cuda if available | 2022-06-28 12:47:11 -04:00
Brett Kuprel | 8544f59576 | use cuda if available | 2022-06-28 12:38:31 -04:00
Brett Kuprel | aef24ea157 | torch.no_grad(), cleanup | 2022-06-28 12:16:44 -04:00
Brett Kuprel | 34df2b97df | previous commit broke colab example, so adjusting flax requirement to 0.4.2 for now | 2022-06-28 08:04:08 -04:00
Brett Kuprel | 38ebe54a38 | works with latest flax version 0.5.2 now | 2022-06-28 07:12:29 -04:00
Brett Kuprel | a014dccc05 | fixed an issue with argument parser | 2022-06-27 16:49:42 -04:00
Brett Kuprel | e7001f063c | simplified | 2022-06-27 15:46:04 -04:00
Brett Kuprel | 18e6a9852f | license and cleanup | 2022-06-27 14:34:10 -04:00
Brett Kuprel | c936d26102 | back to linear attention | 2022-06-27 13:19:03 -04:00
Brett Kuprel | 018414a5c3 | fixed relative imports | 2022-06-27 12:43:47 -04:00