| Author | Commit | Message | Date |
|---|---|---|---|
| Brett Kuprel | 09a0f85b8e | separate setup processes for flax and torch | 2022-07-01 11:08:33 -04:00 |
| Brett Kuprel | 7bf76deafb | fixed wrong file path | 2022-07-01 10:58:29 -04:00 |
| Brett Kuprel | e4c2be54cb | save converted detokenizer params | 2022-07-01 10:17:29 -04:00 |
| Brett Kuprel | b40fd83a0d | mega works with latest flax version 0.5.2 now, removing 0.4.2 pin | 2022-07-01 02:58:43 -04:00 |
| Brett Kuprel | 08b158d580 | updated readme | 2022-06-30 16:50:04 -04:00 |
| Brett Kuprel | 2311a1af7b | delete cache | 2022-06-30 15:48:20 -04:00 |
| Brett Kuprel | b913b58353 | pre converting params to torch allows mega to run in standard colab runtime | 2022-06-30 14:54:08 -04:00 |
| Brett Kuprel | c2a3858c96 | delete params sooner | 2022-06-30 11:44:36 -04:00 |
| Brett Kuprel | f951424e38 | is_reusable | 2022-06-30 11:25:24 -04:00 |
| Brett Kuprel | b55bcba4c0 | removed deepcopy, delete expendable parameters after use | 2022-06-30 11:09:09 -04:00 |
| Brett Kuprel | 41a44068d0 | keep params in expendable mode | 2022-06-30 09:36:32 -04:00 |
| Brett Kuprel | df9aa6f915 | sort -> topk, prev_token_and_index -> prev_token, token_index | 2022-06-30 09:04:11 -04:00 |
| Brett Kuprel | fb97ba5e20 | update readme, cleanup | 2022-06-30 07:41:31 -04:00 |
| Brett Kuprel | 1e18ba0ffa | is_expendable argument reduces memory usage for command line script | 2022-06-30 06:43:10 -04:00 |
| Brett Kuprel | d99828a239 | simplified flax attention and matched torch attention | 2022-06-29 14:56:28 -04:00 |
| Brett Kuprel | 61cc99c13c | read tokenizer files with utf8 encoding | 2022-06-29 14:18:23 -04:00 |
| Brett Kuprel | 661ec976ac | simplified attention for torch model | 2022-06-29 13:48:12 -04:00 |
| Brett Kuprel | ed91ab4a30 | refactored to load models once and run multiple times | 2022-06-29 09:42:12 -04:00 |
| Adam Novak | 28c812c832 | Use all logical cores in Torch mode | 2022-06-28 22:26:51 -04:00 |
| Brett Kuprel | 1fbb209623 | fixed bug with cuda in detokenizer | 2022-06-28 22:02:35 -04:00 |
| Brett Kuprel | 764b0bc685 | cuda in detokenizer from previous commit broke colab flax model, fixed | 2022-06-28 21:36:48 -04:00 |
| Brett Kuprel | 17c96fe110 | works with cuda | 2022-06-28 21:28:36 -04:00 |
| Brett Kuprel | 9d6b6dcc92 | previous commit broke flax model, fixed now | 2022-06-28 12:54:58 -04:00 |
| Brett Kuprel | 5aa6fe49bf | use cuda if available | 2022-06-28 12:47:11 -04:00 |
| Brett Kuprel | 8544f59576 | use cuda if available | 2022-06-28 12:38:31 -04:00 |
| Brett Kuprel | aef24ea157 | torch.no_grad(), cleanup | 2022-06-28 12:16:44 -04:00 |
| Brett Kuprel | 34df2b97df | previous commit broke colab example, so adjusting flax requirement to 0.4.2 for now | 2022-06-28 08:04:08 -04:00 |
| Brett Kuprel | 38ebe54a38 | works with latest flax version 0.5.2 now | 2022-06-28 07:12:29 -04:00 |
| Brett Kuprel | a014dccc05 | fixed an issue with argument parser | 2022-06-27 16:49:42 -04:00 |
| Brett Kuprel | e7001f063c | simplified | 2022-06-27 15:46:04 -04:00 |
| Brett Kuprel | 18e6a9852f | license and cleanup | 2022-06-27 14:34:10 -04:00 |
| Brett Kuprel | c936d26102 | back to linear attention | 2022-06-27 13:19:03 -04:00 |
| Brett Kuprel | 018414a5c3 | fixed relative imports | 2022-06-27 12:43:47 -04:00 |