Trying out combinations of multiple AnimateDiff motion LoRAs

It looks like several motion LoRAs can be combined and used together.

For example, combining a zoom-out with a slide (pan).

In this article I use AnimateDiffPipeline again and try combining motion LoRAs.

Introduction
Combining multiple motion LoRAs
Combining zoom-out with a Pallas's cat LoRA
Summary

Introduction

I follow the reference above, but it produces the following error.

HTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/models/diffusers/animatediff-motion-lora-zoom-out

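Since it says Unauthorized, maybe an HF_TOKEN is needed or something along those lines? That said, the Hub also answers 401 to unauthenticated requests for repositories that do not exist, so the repo ID itself may simply be wrong.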

So I change the following part.

  pipe.load_lora_weights(
-     "diffusers/animatediff-motion-lora-zoom-out",
+     "guoyww/animatediff-motion-lora-zoom-out",
      adapter_name="zoom-out",
  )
  pipe.load_lora_weights(
-     "diffusers/animatediff-motion-lora-pan-left",
+     "guoyww/animatediff-motion-lora-pan-left",
      adapter_name="pan-left",
  )

This seems to work as well, so I will ignore the Unauthorized error!
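If you would rather not edit the repo ID by hand, the same workaround can be written as a small fallback. This is only a sketch: `first_loadable` and `fake_load` are names I made up for illustration, and `fake_load` just imitates the 401 so the snippet runs without network access.

```python
def first_loadable(candidates, load):
    """Try each repo ID in order and return the first one `load` accepts."""
    errors = {}
    for repo_id in candidates:
        try:
            load(repo_id)
            return repo_id
        except Exception as exc:  # e.g. the HTTPError 401 shown above
            errors[repo_id] = exc
    raise RuntimeError(f"no candidate could be loaded: {errors}")


def fake_load(repo_id):
    """Hypothetical stand-in for the real loader; fails for `diffusers/` IDs."""
    if repo_id.startswith("diffusers/"):
        raise PermissionError("401 Client Error: Unauthorized")


chosen = first_loadable(
    [
        "diffusers/animatediff-motion-lora-zoom-out",
        "guoyww/animatediff-motion-lora-zoom-out",
    ],
    fake_load,
)
print(chosen)
```

With a real pipeline, the loader would be something like `lambda repo_id: pipe.load_lora_weights(repo_id, adapter_name="zoom-out")` in place of `fake_load`.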

Combining multiple motion LoRAs

Loading the models (MotionAdapter, AnimateDiffPipeline, DDIMScheduler)

import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter

# Load the motion adapter
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2",
    torch_dtype=torch.float16
).to("cuda")

# Load an SD 1.5 based model with AnimateDiffPipeline
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    motion_adapter=adapter,
    torch_dtype=torch.float16
).to("cuda")

# Load the motion LoRAs
pipe.load_lora_weights(
    "guoyww/animatediff-motion-lora-zoom-out",
    adapter_name="zoom-out",
)
pipe.load_lora_weights(
    "guoyww/animatediff-motion-lora-pan-left",
    adapter_name="pan-left",
)
pipe.set_adapters(["zoom-out", "pan-left"], adapter_weights=[1.0, 1.0])

# Configure the scheduler
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config,
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)

# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()
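To get a feel for what `adapter_weights=[1.0, 1.0]` is doing, here is a toy sketch of how several LoRAs add up. It assumes the usual LoRA formulation, where each adapter contributes a low-rank update `B @ A` to a weight matrix and the adapter weight scales that update. The matrices are made-up numbers, the extra alpha/rank scaling is left out, and the real library applies the adapters layer by layer at runtime rather than merging them like this.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_loras(base, loras, weights):
    """Return base + sum_i weights[i] * (B_i @ A_i) without mutating `base`."""
    merged = [row[:] for row in base]
    for (b_mat, a_mat), weight in zip(loras, weights):
        delta = matmul(b_mat, a_mat)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += weight * value
    return merged


# Toy 2x2 base weight and two rank-1 "LoRAs" as (B, A) pairs
base = [[1.0, 0.0], [0.0, 1.0]]
zoom_out = ([[1.0], [0.0]], [[0.0, 2.0]])
pan_left = ([[0.0], [1.0]], [[4.0, 0.0]])

merged = merge_loras(base, [zoom_out, pan_left], [1.0, 0.5])
print(merged)
```

Lowering one weight shrinks only that adapter's contribution, which is the knob to turn when one motion overpowers the other.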

Running the pipeline

import torch

# Run the pipeline with the prompt
prompt =  "masterpiece, bestquality, highlydetailed, ultradetailed, sunset, orange sky, warm lighting, fishing boats, ocean waves seagulls, rippling water, wharf, silhouette, serene atmosphere, dusk, evening glow, golden hour, coastal landscape, seaside scenery"
negative_prompt = "bad quality, worse quality"
frames = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25,
    generator=torch.Generator("cpu").manual_seed(42),
).frames[0]

Result

from datetime import datetime
from zoneinfo import ZoneInfo
from diffusers.utils import export_to_gif

# Get the current time in the Asia/Tokyo time zone, formatted as YYYYMMDDhhmmss
formattedNow = datetime.now(tz=ZoneInfo("Asia/Tokyo")).strftime("%Y%m%d%H%M%S")

# Save the result as a GIF
export_to_gif(frames, f"animation_{formattedNow}.gif")

Output from combining the zoom-out and slide motion LoRAs

Yep, looks good.

Combining zoom-out with a Pallas's cat LoRA

Above I combined a motion LoRA with another motion LoRA.
This time I combine a motion LoRA with a Pallas's cat (manul) LoRA.

The Pallas's cat LoRA is here.

Loading the models (MotionAdapter, AnimateDiffPipeline, DDIMScheduler)

  import torch
  from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter

  # Load the motion adapter
  adapter = MotionAdapter.from_pretrained(
      "guoyww/animatediff-motion-adapter-v1-5-2",
      torch_dtype=torch.float16
  ).to("cuda")

  # Load an SD 1.5 based model with AnimateDiffPipeline
  pipe = AnimateDiffPipeline.from_pretrained(
      "SG161222/Realistic_Vision_V5.1_noVAE",
      motion_adapter=adapter,
      torch_dtype=torch.float16
  ).to("cuda")

  # Load the motion LoRA and the Pallas's cat LoRA
  pipe.load_lora_weights(
      "guoyww/animatediff-motion-lora-zoom-out",
      adapter_name="zoom-out",
  )
  pipe.load_lora_weights(
-     "guoyww/animatediff-motion-lora-pan-left",
+     ".",
+     weight_name="manuruneko.safetensors",
-     adapter_name="pan-left",
+     adapter_name="manuruneko",
  )
- pipe.set_adapters(["zoom-out", "pan-left"], adapter_weights=[1.0, 1.0])
+ pipe.set_adapters(["zoom-out", "manuruneko"], adapter_weights=[1.0, 0.6])

  # Configure the scheduler
  pipe.scheduler = DDIMScheduler.from_config(
      pipe.scheduler.config,
      clip_sample=False,
      timestep_spacing="linspace",
      beta_schedule="linear",
      steps_offset=1,
  )

  # enable memory savings
  pipe.enable_vae_slicing()
  pipe.enable_model_cpu_offload()
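Since the only thing that changes between the two setups is which LoRAs are loaded and how strongly, the loading part could be driven by a list instead of edited by hand. The sketch below assumes `load_lora_weights` accepts the `weight_name` and `adapter_name` keywords and that `set_adapters` accepts `adapter_weights`. `LoraSpec`, `apply_loras` and `RecordingPipe` are names I made up, and `RecordingPipe` is only a stand-in for the real pipeline so the snippet runs without a GPU.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoraSpec:
    source: str                        # Hub repo ID or local directory
    adapter_name: str
    weight: float = 1.0
    weight_name: Optional[str] = None  # file name inside `source`, if needed


def apply_loras(pipe, specs):
    """Load every LoRA in `specs`, then activate them all with their weights."""
    for spec in specs:
        kwargs = {"adapter_name": spec.adapter_name}
        if spec.weight_name is not None:
            kwargs["weight_name"] = spec.weight_name
        pipe.load_lora_weights(spec.source, **kwargs)
    pipe.set_adapters(
        [spec.adapter_name for spec in specs],
        adapter_weights=[spec.weight for spec in specs],
    )


class RecordingPipe:
    """Hypothetical stand-in for AnimateDiffPipeline that only records calls."""

    def __init__(self):
        self.calls = []

    def load_lora_weights(self, source, **kwargs):
        self.calls.append(("load", source, kwargs))

    def set_adapters(self, names, adapter_weights=None):
        self.calls.append(("set", names, adapter_weights))


specs = [
    LoraSpec("guoyww/animatediff-motion-lora-zoom-out", "zoom-out", 1.0),
    LoraSpec(".", "manuruneko", 0.6, weight_name="manuruneko.safetensors"),
]
fake = RecordingPipe()
apply_loras(fake, specs)
for call in fake.calls:
    print(call)
```

Passing the real `pipe` instead of `fake` should issue the same three calls as the diff above.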

Running the pipeline

import torch

# Run the pipeline with the prompt
prompt =  "cut tusuncub walking in the snow, blurry, looking at viewer, depth of field, blurry background, full body, solo, (cute and lovely:1.4), Beautiful and realistic eye details, perfect anatomy, (Nonsense:1.4), (pure background:1.4), Centered-Shot, realistic photo, photograph, 4k, hyper detailed, DSLR, 24 Megapixels, 8mm Lens, Full Frame, film grain, Global Illumination, (studio Lighting:1.4), Award Winning Photography, diffuse reflection, ray tracing, <lora:TUSUN:0.6>"
negative_prompt = "bad quality, worse quality"
frames = pipe(
    prompt,
    negative_prompt=negative_prompt,
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25
).frames[0]

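I also tried attaching the lpw_stable_diffusion community pipeline, but it threw an error, so I gave up!
The prompt weighting has no effect and the prompt exceeds 77 tokens, but I am not going to worry about it!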
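As far as I can tell, the `<lora:TUSUN:0.6>` tag at the end of the prompt is a WebUI-style convention that the pipeline passes to the text encoder as plain text, so it only eats into the token budget. Here is a minimal sketch for pulling such tags out of a prompt so that the weight can be given to `set_adapters` instead. `split_lora_tags` is a name I made up, and the regex only covers the simple `<lora:NAME:WEIGHT>` form.

```python
import re

# Matches tags such as <lora:TUSUN:0.6>
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def split_lora_tags(prompt):
    """Return (prompt without lora tags, [(name, weight), ...])."""
    tags = [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]
    # Drop the tags, then tidy up the commas left behind
    parts = [part.strip() for part in LORA_TAG.sub("", prompt).split(",")]
    cleaned = ", ".join(part for part in parts if part)
    return cleaned, tags


cleaned, tags = split_lora_tags(
    "tusuncub walking in the snow, ray tracing, <lora:TUSUN:0.6>"
)
print(cleaned)
print(tags)
```

This does nothing about the `(text:1.4)` weighting syntax or the 77 token limit; those would still need something like the community pipeline mentioned above.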