simple_vqgan

A simplified version of VQGAN-CLIP for art generation.

I've found existing VQGAN code to be feature-rich but complicated for beginners, and it rarely has built-in CPU support. This is intended to be a lighter-weight but still capable variant.

Developed primarily for Linux (Ubuntu 20) with Python 3.9, but it has also seen some testing on Windows and macOS (also with Python 3.9).

Anaconda Installation

It is recommended that you install via the command line using Anaconda. First, create a new Anaconda environment:

conda create -n simple_vqgan python=3.9

and activate it:

conda activate simple_vqgan

Clone this repo and navigate into the cloned directory, then install with setup.py:

python setup.py install

The package should now be installed!

If you want to use CUDA (GPU)

If you intend to use a GPU, you may need to install a version of PyTorch specific to your CUDA version.

CUDA 10.2

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=10.2 -c pytorch

CUDA 11.3

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

CPU-only use shouldn't require a special PyTorch installation.
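To confirm whether the PyTorch build in your environment can actually see a GPU, a quick check along these lines may help. This is a minimal sketch that uses only PyTorch's own `torch.cuda.is_available()`; the helper name `cuda_available` is made up for illustration and is not part of simple_vqgan.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA device."""
    # Avoid an ImportError when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


print("CUDA available:", cuda_available())
```

If this reports False on a machine with an NVIDIA GPU, the installed PyTorch build probably does not match your CUDA version, and one of the conda commands above may be needed.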

Running

Using all the defaults, all you need to do is import the generate function and pass it a prompt string:

from simple_vqgan import generate

generate('Neural Dreams')

This will run and then save a generated image. By default, the image is named after the prompt (e.g. Neural_Dreams.png) and saved to a directory named 'renders' in the current directory. Both the file name and the save location can be specified with input arguments.

Other input arguments control other aspects of generation. Notably, you can set the size of the generated image with height and width values. Further arguments select the device (CPU vs. CUDA), the number of iterations, seeds, image transformations, etc.
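If you want to locate the saved file from a script, you can sketch the default path yourself. The snippet below is an assumption based only on the Neural_Dreams.png example: it guesses that spaces become underscores and that ".png" is appended. The helper `expected_render_path` is hypothetical, not part of the package, and the real naming may differ (for instance, by adding a suffix).

```python
from pathlib import Path


def expected_render_path(prompt: str, folder: str = "renders") -> Path:
    """Guess where a render for `prompt` would be saved by default."""
    # Assumption: spaces are replaced with underscores and ".png" is appended,
    # as the Neural_Dreams.png example suggests. The package may name files differently.
    return Path(folder) / (prompt.replace(" ", "_") + ".png")


print(expected_render_path("Neural Dreams"))
```

Check the 'renders' directory after a first run to see whether this guess matches the names the package actually produces.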

For example, to generate a 500x500-pixel image for 300 iterations using the GPU, you could run:

generate('Neural Dreams', height=500, width=500, iterations=300, device='CUDA')

Examples

You can pass any text string to the model, and it will do its best to create a matching image. For example:

"ocean cityscape"

You can get interesting renders by specifying modifiers, such as styles...:

"landscape pastels"

"fractal 4k HD"


Artists...:

"Andrew Jones City intricate pencil drawing"

"Beksiński depths organic brutalism"

"Andrew Jones City in the style of Beksiński"

Engines, websites, etc...:

"God is dead Unreal Engine"

"Flavortown trending on ArtStation"

These are just the tip of the iceberg; have fun!
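To explore many subject and modifier combinations in one go, you could build the prompt strings programmatically and render them in a loop. The sketch below is illustrative only: `build_prompts` and `render_all` are hypothetical helpers, not part of simple_vqgan, and the generate function is passed in as an argument so the helpers make no assumptions about the package beyond "call it with a prompt string and optional keyword arguments".

```python
from itertools import product


def build_prompts(subjects, modifiers):
    """Pair every subject with every modifier, e.g. 'fractal' + '4k HD'."""
    return [f"{subject} {modifier}" for subject, modifier in product(subjects, modifiers)]


def render_all(prompts, generate_fn, **kwargs):
    """Call generate_fn once per prompt, forwarding keyword arguments unchanged."""
    for prompt in prompts:
        generate_fn(prompt, **kwargs)


prompts = build_prompts(["landscape", "fractal"], ["pastels", "4k HD"])
print(prompts)

# With the package installed, the prompts could then be rendered like this:
# from simple_vqgan import generate
# render_all(prompts, generate, iterations=300)
```

Each render runs to completion before the next begins, so a long prompt list can take a while, especially on CPU.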
