I tried to work through this part of the guide,
but it didn't really work out!
This article ended up being nothing more than a record of what went wrong!
Image-to-Image, Image-to-upscaler, Image-to-super-resolution
Full script
import torch
from diffusers import AutoPipelineForImage2Image, StableDiffusionLatentUpscalePipeline, StableDiffusionUpscalePipeline
from diffusers.utils import make_image_grid, load_image

# ------------------------------
# Image-to-Image
# ------------------------------
pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
)
pipeline.enable_model_cpu_offload()

# prepare image
init_image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/img2img-init.png"
init_image = load_image(init_image_url)

# pass prompt and image to pipeline
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image_1 = pipeline(prompt, image=init_image, output_type="latent").images[0]

# ------------------------------
# upscaler
# ------------------------------
upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
    "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16, use_safetensors=True
)
upscaler.enable_model_cpu_offload()

image_2 = upscaler(prompt, image=image_1, output_type="latent").images[0]

# ------------------------------
# super-resolution
# ------------------------------
super_res = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16, use_safetensors=True
)
super_res.enable_model_cpu_offload()

image_3 = super_res(prompt, image=image_2).images[0]
make_image_grid([init_image, image_3.resize((512, 512))], rows=1, cols=2)
An error gets raised...
Running it as-is produced the following error:
ValueError                                Traceback (most recent call last)
in ()
     30 # ------------------------------
     31
---> 32 upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
     33     "stabilityai/sd-x2-latent-upscaler",
     34     torch_dtype=torch.float16,

ValueError: The deprecation tuple ('no variant default', '0.24.0', "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available.The default model files: {'text_encoder/model.safetensors', 'vae/diffusion_pytorch_model.bin', 'unet/diffusion_pytorch_model.safetensors', 'vae/diffusion_pytorch_model.safetensors', 'text_encoder/pytorch_model.bin', 'unet/diffusion_pytorch_model.bin'} will be loaded instead. Make sure to not load from `variant=fp16`if such variant modeling files are not available. Doing so will lead to an error in v0.24.0 as defaulting to non-variantmodeling files is deprecated.") should be removed since diffusers' version 0.26.0 is >= 0.24.0
variant="fp16" looked like the culprit, so for now I removed it, but...
then a different error showed up...
ValueError                                Traceback (most recent call last)
in ()
     49 super_res.enable_model_cpu_offload()
     50
---> 51 image_3 = super_res(prompt, image=image_2).images[0]
     52 make_image_grid([init_image, image_3.resize((512, 512))], rows=1, cols=2)

ValueError: `prompt` has batch size 1 and `image` has batch size 4. Please make sure that passed `prompt` matches the batch size of `image`.
Sad.
It turns out other people were stuck on the same error too...
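One plausible reading of the "batch size 4" in the message: with output_type="latent", `.images[0]` drops the batch axis, so what gets passed on has its channel axis first. The sketch below is a simplified stand-in, not diffusers' actual code. The helper `batch_sizes` is hypothetical, and the latent shape (1, 4, 128, 128) is an assumption chosen only for illustration (4 latent channels). It shows how a check that reads the leading axis as the batch size would arrive at 4.

```python
def batch_sizes(prompt, image_shape):
    """Return (prompt_batch, image_batch) as a leading-axis check would see them.

    Hypothetical helper for illustration; not part of diffusers.
    """
    prompt_batch = 1 if isinstance(prompt, str) else len(prompt)
    image_batch = image_shape[0]  # the leading axis is taken as the batch axis
    return prompt_batch, image_batch


# Assumed latent layout: (batch, channels, height, width) with 4 latent channels.
latent_batch_shape = (1, 4, 128, 128)
# Indexing with [0] drops the batch axis, leaving (channels, height, width).
single_latent_shape = latent_batch_shape[1:]

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
print(batch_sizes(prompt, latent_batch_shape))   # (1, 1)
print(batch_sizes(prompt, single_latent_shape))  # (1, 4)
```

This only explains where the number 4 may come from. Whether restoring the batch axis is enough to make the three-stage chain run was not verified here.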
It looks like super-resolution can be used on its own, so let's try that
import torch
from diffusers import StableDiffusionUpscalePipeline
from diffusers.utils import load_image, make_image_grid  # both are used below

init_image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/img2img-init.png"
init_image = load_image(init_image_url)
init_image  # notebook cell output: show the original image

# shrink the input to 128x128 so the upscaler has something small to work on
small_image = init_image.resize((128,128))
small_image  # notebook cell output: show the shrunken image

super_res = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
# enable_model_cpu_offload() handles device placement itself, so the explicit .to("cuda") is dropped
super_res.enable_model_cpu_offload()

prompt = "Astronaut, best quality, masterpiece, an extremely delicate and beautiful, extremely detailed"
# first pass: 128x128 -> 512x512
super_res_image_1 = super_res(prompt, image=small_image).images[0]
# second pass: 512x512 -> 2048x2048
super_res_image_2 = super_res(prompt, image=super_res_image_1).images[0]
make_image_grid([super_res_image_2, super_res_image_1, small_image], rows=1, cols=3)
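For reference, the sizes involved in the script above work out as follows. This is just the arithmetic of applying a x4 upscaler twice to a 128x128 input. The helper `size_after_passes` is a hypothetical name used for illustration only.

```python
def size_after_passes(side, factor=4, passes=1):
    """Side length in pixels after applying an upscaler `passes` times."""
    for _ in range(passes):
        side *= factor
    return side


start = 128
first = size_after_passes(start, passes=1)
second = size_after_passes(start, passes=2)
print(first, second)           # 512 2048
print((second // start) ** 2)  # 256 (times as many pixels as the input)
```

So the final image has 16 times the side length of the 128x128 input, which is 256 times the pixel count.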
Summary
Simply enlarging a 128x128 image to 2048x2048 would never look like this!
Super-resolution is amazing!