2022-07-03 20:49:38 +00:00

min(DALL·E)
===========

|Colab| |Replicate| |Discord|

This is a fast, minimal port of Boris Dayma’s `DALL·E
Mega <https://github.com/borisdayma/dalle-mini>`__. It has been stripped
down for inference and converted to PyTorch. The only third-party
dependencies are numpy, requests, pillow and torch.

To generate a 4x4 grid of DALL·E Mega images it takes:

- 89 sec with a T4 in Colab
- 48 sec with a P100 in Colab
- 14 sec with an A100 on Replicate

The Flax model and the code for converting it to PyTorch can be found
`here <https://github.com/kuprel/min-dalle-flax>`__.

Install
-------

.. code:: bash

    $ pip install min-dalle

Usage
-----

Load the model parameters once and reuse the model to generate multiple
images.

.. code:: python

    from min_dalle import MinDalle

    model = MinDalle(
        is_mega=True,
        is_reusable=True,
        models_root='./pretrained'
    )

The required models will be downloaded to ``models_root`` if they are
not already there. Once everything has finished initializing, call
``generate_image`` with some text as many times as you want. Use a
positive ``seed`` for reproducible results. Higher values for
``log2_supercondition_factor`` result in better agreement with the text
but a narrower variety of generated images. Every image token is sampled
from the top-:math:`k` most probable tokens.

.. code:: python

    image = model.generate_image(
        text='Nuclear explosion broccoli',
        seed=-1,
        grid_size=4,
        log2_k=6,
        log2_supercondition_factor=5,
        is_verbose=False
    )

    display(image)

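To make the top-:math:`k` sampling step described above concrete, here is
a minimal standard-library sketch. It is not the code used by
``min_dalle``. The helper name ``sample_top_k`` is hypothetical, and the
sketch assumes that ``log2_k`` selects :math:`k = 2^{\text{log2\_k}}`
candidates, so ``log2_k=6`` would keep the 64 most probable tokens.

.. code:: python

    import heapq
    import math
    import random


    def sample_top_k(logits, log2_k, rng):
        """Sample one token index from the k most probable tokens.

        Hypothetical helper for illustration only. Assumes k = 2 ** log2_k.
        """
        k = min(2 ** log2_k, len(logits))
        # Indices of the k largest logits, in descending order of logit.
        top = heapq.nlargest(k, range(len(logits)), key=logits.__getitem__)
        # Softmax weights over the kept logits, shifted by the peak for stability.
        peak = logits[top[0]]
        weights = [math.exp(logits[i] - peak) for i in top]
        return rng.choices(top, weights=weights, k=1)[0]


    rng = random.Random(0)
    # Made-up logits over a made-up vocabulary of 1024 image tokens.
    logits = [rng.gauss(0.0, 1.0) for _ in range(1024)]
    token = sample_top_k(logits, log2_k=6, rng=rng)

Under this assumption, a smaller ``log2_k`` restricts sampling to fewer
candidates, and ``log2_k=0`` always picks the single most probable token.
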
Interactive
~~~~~~~~~~~

If the model is being used interactively (e.g. in a notebook),
``generate_image_stream`` can be used to generate a stream of images as
the model is decoding. The detokenizer adds a slight delay for each
image. Setting ``log2_mid_count`` to 3 results in a total of
``2 ** 3 = 8`` generated images. The only valid values for
``log2_mid_count`` are 0, 1, 2, 3, and 4. This is implemented in the
Colab notebook.

.. code:: python

    image_stream = model.generate_image_stream(
        text='Dali painting of WALL·E',
        seed=-1,
        grid_size=3,
        log2_mid_count=3,
        log2_k=6,
        log2_supercondition_factor=3,
        is_verbose=False
    )

    for image in image_stream:
        display(image)

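The relationship between ``log2_mid_count`` and the number of streamed
images can be tabulated with a short sketch. The helper
``streamed_image_count`` is hypothetical and not part of ``min_dalle``. It
assumes that the ``2 ** 3 = 8`` example generalizes to every valid value.

.. code:: python

    # Valid values as stated in the text above.
    VALID_LOG2_MID_COUNTS = (0, 1, 2, 3, 4)


    def streamed_image_count(log2_mid_count):
        """Return the assumed number of images yielded by the stream."""
        if log2_mid_count not in VALID_LOG2_MID_COUNTS:
            raise ValueError(
                f"log2_mid_count must be one of {VALID_LOG2_MID_COUNTS}, "
                f"got {log2_mid_count}"
            )
        return 2 ** log2_mid_count


    for value in VALID_LOG2_MID_COUNTS:
        print(f"log2_mid_count={value}: {streamed_image_count(value)} images")

Since the detokenizer adds a delay for each image, larger values give more
frequent previews at the cost of a longer total run.
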
Command Line
~~~~~~~~~~~~

Use ``image_from_text.py`` to generate images from the command line.

.. code:: bash

    $ python image_from_text.py --text='artificial intelligence' --no-mega

.. |Colab| image:: https://colab.research.google.com/assets/colab-badge.svg
   :target: https://colab.research.google.com/github/kuprel/min-dalle/blob/main/min_dalle.ipynb
.. |Replicate| image:: https://replicate.com/kuprel/min-dalle/badge
   :target: https://replicate.com/kuprel/min-dalle
.. |Discord| image:: https://img.shields.io/discord/823813159592001537?color=5865F2&logo=discord&logoColor=white
   :target: https://discord.com/channels/823813159592001537/912729332311556136