A simplified version of VQGAN-CLIP for art generation.
I've found existing VQGAN code to be feature-rich but complicated for beginners, and it rarely has built-in CPU support. This is intended to be a lighter-weight but still capable variant.
Developed primarily for Linux (Ubuntu 20) with Python 3.9, but it has also been lightly tested on Windows and macOS (also with Python 3.9).
It is recommended that you install via the command line using Anaconda. First, create a new Anaconda environment:
conda create -n simple_vqgan python=3.9
and activate it:
conda activate simple_vqgan
Clone this repo and navigate into the cloned directory. Then install using setup.py:
python setup.py install
The package should now be installed!
Also note that if you intend to use a GPU, you may need to install a version of PyTorch specific to your CUDA version. For example, for CUDA 10.2 and CUDA 11.3 respectively:
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=10.2 -c pytorch
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
CPU-only use shouldn't require a special PyTorch installation.
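To confirm whether your PyTorch installation can actually see a GPU before generating, a quick check like the following may help. This is a minimal sketch, not part of simple_vqgan: the helper name cuda_available is hypothetical, and it relies only on the standard torch.cuda.is_available() call.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device.

    Hypothetical helper for illustration; not part of simple_vqgan.
    """
    # Avoid an ImportError if PyTorch has not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


print("CUDA available:", cuda_available())
```

If this prints False on a machine with a GPU, the installed PyTorch build probably does not match your CUDA version.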
Using all the defaults, all you need to do is import the generate function and pass it a prompt string:
from simple_vqgan import generate
generate('Neural Dreams')
This will run and then save a generated image. By default, the image is named after the prompt (e.g. Neural_Dreams.png) and saved to a directory named 'renders' in the current directory. Both the file name and the save location can be specified with input arguments.
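The default naming described above can be pictured roughly as follows. This is only an illustrative sketch: the function default_output_path is hypothetical, and the rule that spaces become underscores is an assumption inferred from the Neural_Dreams.png example, not taken from the package's code.

```python
from pathlib import Path


def default_output_path(prompt: str, out_dir: str = "renders") -> Path:
    """Guess where a render for `prompt` would be saved by default.

    Hypothetical helper; the actual naming logic in simple_vqgan may differ.
    """
    # Assumption: spaces are replaced with underscores and '.png' is appended,
    # matching the 'Neural Dreams' -> 'Neural_Dreams.png' example.
    file_name = prompt.strip().replace(" ", "_") + ".png"
    return Path(out_dir) / file_name


print(default_output_path("Neural Dreams"))
```

A helper like this can be handy for checking whether a render already exists before starting a long run.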
You can change other aspects of the generation with other inputs. Of note, you can change the size of the generated image by specifying height and width values. Further input arguments control the device (CPU vs CUDA), iterations, seeds, image transformations, etc.
For example, to generate a 500x500-pixel image for 300 iterations using the GPU, you could run:
generate('Neural Dreams', height=500, width=500, iterations=300, device='CUDA')
You can pass any text string to the model, and it will do its best to create a matching image. For example:
"ocean cityscape"
You can get interesting renders by specifying modifiers, such as styles...:
"landscape pastels"
"fractal 4k HD"
Artists...:
"Andrew Jones City intricate pencil drawing"
"Beksiński depths organic brutalism"
"Andrew Jones City in the style of Beksiński"
Engines, websites, etc...:
"God is dead Unreal Engine"
"Flavortown trending on ArtStation"
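If you want to explore many subject and modifier combinations, you could build the prompt strings programmatically. The sketch below uses only the standard library; build_prompts is a hypothetical helper, and the call to generate is left commented out so the snippet runs without the package installed.

```python
from itertools import product


def build_prompts(subjects, modifiers):
    """Combine every subject with every modifier into one prompt string each."""
    return [f"{subject} {modifier}" for subject, modifier in product(subjects, modifiers)]


subjects = ["ocean cityscape", "landscape"]
modifiers = ["pastels", "fractal 4k HD", "trending on ArtStation"]

prompts = build_prompts(subjects, modifiers)
for prompt in prompts:
    print(prompt)
    # With simple_vqgan installed, each prompt could then be rendered:
    # from simple_vqgan import generate
    # generate(prompt)
```

Each combination takes a full generation run, so it may be worth starting with a small list.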
These are just the tip of the iceberg; have fun!