simple_vqgan

A simplified version of VQGAN-CLIP for art generation.

I've found existing VQGAN implementations to be feature-rich but complicated for beginners to use, and they rarely support running on CPU out of the box. This package is intended to be a lighter-weight but still capable alternative.

Developed primarily on Linux (Ubuntu 20) with Python 3.9; it has also received some testing on Windows and macOS (likewise with Python 3.9).

Anaconda Installation

Installing from the command line with Anaconda is recommended. First, create a new conda environment:

conda create -n simple_vqgan python=3.9

and activate it:

conda activate simple_vqgan

Clone this repo and navigate into the cloned directory, then install with setup.py:

python setup.py install

The package should now be installed!
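To verify the install, you can try importing the package from the command line (a quick sanity check; generate is the main entry point used throughout this README):

python -c "from simple_vqgan import generate"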

If you want to use CUDA (GPU)

Note that if you intend to use a GPU, you may need to install a PyTorch build that matches your CUDA version.

CUDA 10.2

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=10.2 -c pytorch

CUDA 11.3

conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

CPU-only use shouldn't require a special PyTorch installation.
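If you're unsure whether PyTorch can see your GPU, you can check with PyTorch's standard API:

import torch

# True means a CUDA-enabled PyTorch build is installed and a GPU is visible
print(torch.cuda.is_available())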

Running

Using the defaults, all you need to do is import the generate function and pass it a prompt string:

from simple_vqgan import generate

generate('Neural Dreams')

This will run and then save a generated image. By default, the image is named after the prompt (e.g. Neural_Dreams.png) and saved to a directory named 'renders' in the current directory. Both the file name and the save location can be specified with input arguments.
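For instance, something along these lines should work; note that save_name and save_dir are illustrative argument names, not confirmed from this README, so check the generate signature for the actual keywords:

from simple_vqgan import generate

# NOTE: 'save_name' and 'save_dir' are hypothetical argument names used for
# illustration; check generate()'s signature for the real keyword arguments.
generate('Neural Dreams', save_name='my_render.png', save_dir='outputs')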

Other aspects of generation can be controlled with additional arguments. Of note, you can set the size of the generated image by specifying height and width values; further arguments control the device (CPU vs. CUDA), number of iterations, random seed, image transformations, etc.

For example, to generate a 500x500-pixel image for 300 iterations using the GPU, you could run:

generate('Neural Dreams', height=500, width=500, iterations=300, device='CUDA')

Examples

You can pass any text string to the model, and it will do its best to create a matching image. For example:

"ocean cityscape"

[generated image]

You can get interesting renders by specifying modifiers, such as styles...:

"landscape pastels"

[generated image]

"fractal 4k HD"

[generated image]

Artists...:

"Andrew Jones City intricate pencil drawing"

[generated image]

"Beksiński depths organic brutalism"

[generated image]

"Andrew Jones City in the style of Beksiński"

[generated image]

Engines, websites, etc...:

"God is dead Unreal Engine"

[generated image]

"Flavortown trending on ArtStation"

[generated image]

These are just the tip of the iceberg; have fun!
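If you want to render several prompts in one go, a simple loop over generate works (a minimal sketch using only the defaults shown above):

from simple_vqgan import generate

# Each image is saved under ./renders, named after its prompt, by default
prompts = ['ocean cityscape', 'landscape pastels', 'fractal 4k HD']
for prompt in prompts:
    generate(prompt)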
