Style transfer is one of the hottest topics around deep learning media these days. There are a number of reasons for this, including the demonstrability of the method lending itself well to publication and the potential utility of making quick stylistic edits to photos. This combination of utility and ease of demonstration makes style transfer one of the most popular first computer vision projects that data scientists, ML engineers, and AI enthusiasts undertake, such as imparting the style of Vincent van Gogh's "Starry Night" to a previously mundane landscape photograph.

Like many computer vision tasks, transferring style onto the rougher, larger areas of an image is far easier than transferring that same style to the finer features of a face. Regions like the eyes and mouth in particular are very difficult for an AI to approximate correctly during generation.

In this tutorial, we will look at JoJoGAN - a novel approach to conducting one-shot style transfer for facial images. This PyTorch-written architecture was constructed with the goal of capturing the stylistic details that have historically been difficult to account for, such as transferring style effects that conserve facial details like eye shape or mouth details.

*An example of JoJoGAN (trained on faces from the TV show Arcane) applying its stylization to randomly sampled faces.*

JoJoGAN aims to solve this problem by first approximating a paired training dataset and then finetuning a StyleGAN to perform one-shot face stylization. It is capable of taking in any single image of a face (ideally a high-quality headshot of some kind), approximating the paired real data using GAN inversion, and using that data to minutely adjust a pre-trained StyleGAN2 model. The StyleGAN2 model is then made generalizable so that the imparted style can subsequently be applied to new images. Previous one- and few-shot attempts have not approached this level of success: JoJoGAN has managed to achieve an extremely high level of quality for the images it generates.

Follow the steps below to see how to run JoJoGAN on Gradient Notebooks!

JoJoGAN is a PyTorch-based package, and it leverages a number of libraries to achieve its functionality. When you go to create your notebook for JoJoGAN, be sure to select the PyTorch tile as well as a GPU instance. Once you've done so, scroll to the bottom of the page and select the advanced options toggle. For your workspace URL, be sure to enter the URL of the JoJoGAN repo. Once that's done and your instance has spun up, go ahead and open the stylize.ipynb file. This is where we will be doing most of our work.

```python
!pip install gdown scikit-learn==0.22 scipy lpips dlib opencv-python-headless tensorflow
!unzip ninja-linux.zip -d /usr/local/bin/
!update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force
```

This first code cell contains the libraries that are required but were not installed on the official PyTorch image we used to create the instance. Be sure to run this cell first to make sure everything will work properly going forward.

This next cell imports the packages to the notebook now that they have been installed on the machine. Notably, we are using both local and pip-installed packages, so be sure not to change the location of the .ipynb file, to ensure this works properly. The following os.makedirs() statements then create and check for the inclusion of the directories we will be using for JoJoGAN.

```python
# imports
import os
import torch
from torchvision import transforms, utils
from e4e_projection import projection as e4e_projection

os.makedirs('inversion_codes', exist_ok=True)
os.makedirs('style_images', exist_ok=True)
os.makedirs('style_images_aligned', exist_ok=True)
```

It is critical that you run the cell that follows these imports as well, because this is where we will be getting the checkpoints for the StyleGAN2 and e4e models we will be using as the basis for our generator.

```python
!mv pretrained_models/stylegan2-ffhq-config-f.pt ~/./notebooks
```

To conclude setup, we need to instantiate our generators. We specify the device as cuda because we are working with GPUs, and we set the latent dimension for both generators to 512. For setup, we first instantiate an untrained generator to be finetuned as we go through the process. It is matched to the state dictionary from the FFHQ StyleGAN2 model checkpoint, so that we can then update a copy of it to reflect the styles we want to impart through training. The copy can then be used to compare outputs with the original version.

```python
latent_dim = 512
device = 'cuda'

original_generator = Generator(1024, latent_dim, 8, 2).to(device)
ckpt = torch.load('stylegan2-ffhq-config-f.pt')
original_generator.load_state_dict(ckpt, strict=False)
```
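To build intuition for the non-strict checkpoint loading and the copy used for comparison, here is a minimal plain-Python sketch (no torch required) of the key-matching behavior that `load_state_dict(..., strict=False)` performs: only keys present in both the checkpoint and the model are copied, extra checkpoint keys are ignored, and a deep copy of the loaded parameters can then be finetuned independently. The parameter names and values here are purely illustrative, not taken from the real checkpoint.

```python
from copy import deepcopy

def load_state_dict(model_params, ckpt, strict=True):
    # Mimics the key-matching behavior of torch's Module.load_state_dict:
    # with strict=False, missing and unexpected keys are tolerated.
    unexpected = [k for k in ckpt if k not in model_params]
    missing = [k for k in model_params if k not in ckpt]
    if strict and (missing or unexpected):
        raise KeyError(f"missing={missing}, unexpected={unexpected}")
    for k in ckpt:
        if k in model_params:
            model_params[k] = ckpt[k]
    return missing, unexpected

# Untrained "generator" parameters and a checkpoint with one extra key
params = {"conv1.weight": 0.0, "conv2.weight": 0.0}
ckpt = {"conv1.weight": 1.5, "latent_avg": 0.7}

missing, unexpected = load_state_dict(params, ckpt, strict=False)

# A deep copy that can be finetuned and later compared to the original
finetune_copy = deepcopy(params)
```

After running this, `params["conv1.weight"]` holds the checkpoint value while the unmatched `latent_avg` key was simply skipped; with `strict=True` the same call would raise instead.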