ジャコ Lab

A notepad-style blog about programming

From Image to Upscaler, and Then to Super-Resolution

huggingface.co

I tried to work through this part of the documentation,
but it did not go well!

This post ended up being nothing more than a record of what went wrong!

Image-to-Image, Image-to-upscaler, Image-to-super-resolution

Full script


import torch
from diffusers import AutoPipelineForImage2Image, StableDiffusionLatentUpscalePipeline, StableDiffusionUpscalePipeline
from diffusers.utils import make_image_grid, load_image

# ------------------------------
# Image-to-Image
# ------------------------------

pipeline = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    use_safetensors=True
)
pipeline.enable_model_cpu_offload()

# prepare image
init_image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/img2img-init.png"
init_image = load_image(init_image_url)

# pass prompt and image to pipeline
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image_1 = pipeline(prompt, image=init_image, output_type="latent").images[0]

# ------------------------------
# upscaler
# ------------------------------

upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
    "stabilityai/sd-x2-latent-upscaler", 
    torch_dtype=torch.float16, 
    use_safetensors=True
)
upscaler.enable_model_cpu_offload()

image_2 = upscaler(prompt, image=image_1, output_type="latent").images[0]

# ------------------------------
# super-resolution
# ------------------------------

super_res = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
    use_safetensors=True
)
super_res.enable_model_cpu_offload()

image_3 = super_res(prompt, image=image_2).images[0]
make_image_grid([init_image, image_3.resize((512, 512))], rows=1, cols=2)

An error was raised...

Running the script as-is produced the following error:
ValueError                                Traceback (most recent call last)
 in ()
     30 # ------------------------------
     31 
---> 32 upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
     33     "stabilityai/sd-x2-latent-upscaler",
     34     torch_dtype=torch.float16,

ValueError: The deprecation tuple ('no variant default', '0.24.0', "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available.The default model files: {'text_encoder/model.safetensors', 'vae/diffusion_pytorch_model.bin', 'unet/diffusion_pytorch_model.safetensors', 'vae/diffusion_pytorch_model.safetensors', 'text_encoder/pytorch_model.bin', 'unet/diffusion_pytorch_model.bin'} will be loaded instead. Make sure to not load from `variant=fp16`if such variant modeling files are not available. Doing so will lead to an error in v0.24.0 as defaulting to non-variantmodeling files is deprecated.") should be removed since diffusers' version 0.26.0 is >= 0.24.0
    

For now, variant="fp16" looked like the culprit, so I removed it, but...
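The traceback says the repository has no fp16-variant files, so instead of deleting the argument by hand, one option is to try the variant first and fall back to the default files. The sketch below is only an illustration under assumptions: `load_with_variant_fallback` and `fake_loader` are hypothetical names, not part of diffusers, and the sketch assumes a missing variant surfaces as a `ValueError`, as it did here with diffusers 0.26.0 (other versions may raise something else).

```python
def load_with_variant_fallback(loader, model_id, variant="fp16", **kwargs):
    """Try to load with `variant`; on ValueError, retry without it.

    `loader` is any callable with a from_pretrained-like signature.
    Hypothetical helper for illustration only, not a diffusers API.
    """
    try:
        return loader(model_id, variant=variant, **kwargs)
    except ValueError:
        # Assumption: a missing variant is reported as ValueError.
        return loader(model_id, **kwargs)


# Stand-in loader so the sketch runs without downloading a model.
def fake_loader(model_id, variant=None, **kwargs):
    if variant is not None:
        raise ValueError(f"no `{variant}` files for {model_id}")
    return {"model_id": model_id, "variant": variant, **kwargs}


result = load_with_variant_fallback(fake_loader, "stabilityai/sd-x2-latent-upscaler")
print(result["variant"])
```

With the real library, `loader` would be something like `StableDiffusionLatentUpscalePipeline.from_pretrained`, with `torch_dtype` and `use_safetensors` passed through as keyword arguments. Catching `ValueError` this broadly can hide unrelated problems, so it is better suited to a notebook experiment than to production code.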

And then a different error came up...
ValueError                                Traceback (most recent call last)
 in ()
     49 super_res.enable_model_cpu_offload()
     50 
---> 51 image_3 = super_res(prompt, image=image_2).images[0]
     52 make_image_grid([init_image, image_3.resize((512, 512))], rows=1, cols=2)

ValueError: `prompt` has batch size 1 and `image` has batch size 4. Please make sure that passed `prompt` matches the batch size of `image`.
    

That was disappointing.
Someone else was stuck on the same error, too...

discuss.huggingface.co
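One possible reading of this error (an assumption, not something the source confirms): with output_type="latent" the pipeline returns a latent tensor, and Stable Diffusion latents have 4 channels. Indexing with `.images[0]` drops the batch dimension, so a later check that reads the first dimension as the batch size would see 4 instead of 1. The sketch below reproduces only that shape arithmetic with plain tuples so it runs without torch. `inferred_batch_size` is a hypothetical stand-in for the pipeline's input check, and the 64x64 spatial size is illustrative.

```python
def inferred_batch_size(shape):
    """Mimic a check that treats the first dimension as the batch size."""
    return shape[0]


# (batch, channels, height, width): 4 latent channels, illustrative spatial size.
latent_batch_shape = (1, 4, 64, 64)

# Indexing a tensor with [0] drops the batch dimension, leaving (4, 64, 64).
single_latent_shape = latent_batch_shape[1:]

print(inferred_batch_size(latent_batch_shape))   # 1
print(inferred_batch_size(single_latent_shape))  # 4
```

If this reading is right, the mismatch comes from passing a single un-batched latent where the super-resolution pipeline expects an image or a batch, not from the prompt itself.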

It looks like super-resolution can also be used on its own, so I'll try that

import torch
from diffusers import StableDiffusionUpscalePipeline
from diffusers.utils import make_image_grid, load_image

init_image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/img2img-init.png"
init_image = load_image(init_image_url)
init_image

small_image = init_image.resize((128,128))
small_image

super_res = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True
)
# enable_model_cpu_offload() moves each sub-model to the GPU on demand, so no explicit .to("cuda") is needed
super_res.enable_model_cpu_offload()

prompt = "Astronaut, best quality, masterpiece, an extremely delicate and beautiful, extremely detailed"
super_res_image_1 = super_res(prompt, image=small_image).images[0]
super_res_image_2 = super_res(prompt, image=super_res_image_1).images[0]
make_image_grid([super_res_image_2, super_res_image_1, small_image], rows=1, cols=3)

(2048x2048) | (512x512) | (128x128)
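The sizes in the caption follow from applying the x4 upscaler twice: 128 → 512 → 2048. A minimal sketch of that arithmetic, where `upscaled_size` is a hypothetical helper name and the factor of 4 is taken from the model name "stable-diffusion-x4-upscaler":

```python
def upscaled_size(size, factor=4, passes=1):
    """Return (width, height) after applying a `factor`x upscaler `passes` times."""
    width, height = size
    scale = factor ** passes
    return (width * scale, height * scale)


print(upscaled_size((128, 128), passes=1))  # (512, 512)
print(upscaled_size((128, 128), passes=2))  # (2048, 2048)
```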

Summary

Simply enlarging a 128x128 image to 2048x2048 would never look like this!
Super-resolution is amazing!
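For contrast, here is a rough sketch of what "simply enlarging" means. Nearest-neighbor upscaling, one of the simplest resize methods, only repeats pixels that already exist, so it cannot add detail. `nearest_neighbor_upscale` is a hypothetical helper working on a tiny nested list, not the resize filter of any particular imaging library.

```python
def nearest_neighbor_upscale(pixels, factor):
    """Enlarge a 2D grid by repeating each pixel `factor` times in both directions."""
    upscaled = []
    for row in pixels:
        # Repeat each value horizontally.
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically (copy so rows stay independent).
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled


tiny = [[0, 255],
        [255, 0]]
big = nearest_neighbor_upscale(tiny, 2)
for row in big:
    print(row)
```

The enlarged grid contains exactly the same values as the original, just in bigger blocks. The super-resolution pipeline instead generates new pixel values guided by the prompt.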