We don’t have quite enough information to fully identify the problem…
If the image is bit-identical (same seed, same prompt, same steps) with and without the LoRA, then the LoRA did not affect the forward pass at all. In Diffusers terms, either (1) the LoRA weights never got attached (key mismatch or loader path issue), or (2) they attached to the wrong component, or (3) you hit a known Flux + 4-bit LoRA bug/regression.
Below is the shortest path to a working setup, with the background for why each change matters.
Background: how LoRA “changes the image” in Diffusers
A LoRA is a set of small matrices that get injected into specific linear layers of the base model. For diffusion models that means:
- Denoiser (UNet for SD1.5/SDXL, or Transformer/DiT for Flux)
- Sometimes also text encoder
Diffusers' load_lora_weights() is supposed to:
- read the LoRA tensors,
- match their keys to modules in the pipeline,
- install adapter hooks so inference uses modified weights.
If matching fails, Diffusers can end up effectively doing nothing. Some Flux LoRA issues even describe “no error, but the style isn’t applied and output is identical.” (GitHub)
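A quick way to see whether anything attached at all is to ask the pipeline which adapters it knows about. This is a sketch assuming a recent Diffusers release with the PEFT backend and an already-built pipeline pipe; the path, file name, and adapter name are placeholders.
# Sanity check right after loading: which pipeline components actually received adapters?
pipe.load_lora_weights(
    "path/to/lora_dir", weight_name="my_lora.safetensors", adapter_name="test"
)

print(pipe.get_list_adapters())    # e.g. {"transformer": ["test"]}; an empty dict means nothing attached
print(pipe.get_active_adapters())  # adapters that will actually be used during inference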
The two biggest problems in your code
1) You load the Flux.2 transformer as AutoModel
For Flux.2, Hugging Face’s own blog shows you should instantiate the denoiser as Flux2Transformer2DModel, not a generic AutoModel. (Hugging Face)
Why it matters:
- LoRA injection depends on model class structure and module naming.
- A generic auto class can still run inference but can break adapter injection or key matching.
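For illustration, the difference is just which class builds the denoiser. The repo id is the one from your code; the commented-out lines mirror your current AutoModel setup.
import torch
from diffusers import Flux2Transformer2DModel

repo_id = "diffusers/FLUX.2-dev-bnb-4bit"

# What you have now (generic class; inference runs, but adapter injection may not line up):
#   from diffusers import AutoModel
#   transformer = AutoModel.from_pretrained(repo_id, subfolder="transformer", torch_dtype=torch.bfloat16)

# Explicit Flux.2 class, matching the reference code:
transformer = Flux2Transformer2DModel.from_pretrained(
    repo_id, subfolder="transformer", torch_dtype=torch.bfloat16
)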
2) You are using the “single file” LoRA path in a risky way
The Diffusers docs show load_lora_weights(pretrained_model_name_or_path, weight_name=...) where the first argument is typically a repo id or directory, and weight_name selects the file. (Hugging Face)
Passing a full file path as the first argument together with a weight_name is easy to get wrong across versions and loaders. If Diffusers treats the first argument as a directory, it can fail to locate the file cleanly or end up loading an empty state dict.
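For comparison, the two call shapes look like this. Paths and file names are placeholders; pipe is your already-built pipeline.
# Risky: a full file path as the first argument plus weight_name; some loader versions
# treat the first argument as a directory/repo id and quietly fail to find the file.
pipe.load_lora_weights(
    "/models/jimfitzpatrick-fluxlora.safetensors",
    weight_name="jimfitzpatrick-fluxlora.safetensors",
)

# Documented pattern: directory (or Hub repo id) first, weight_name selects the file inside it.
pipe.load_lora_weights(
    "/models",
    weight_name="jimfitzpatrick-fluxlora.safetensors",
    adapter_name="jimfitz",
)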
Flux-specific pitfall: “LoRA scaling” is not what you think
In Flux pipelines, lora_scale is documented as applying to text encoder LoRA layers. It is not a guaranteed “global LoRA strength” knob for the transformer denoiser. (GitHub)
There is also a long-standing Flux issue where changing lora_scale produced no change with a fixed seed. (GitHub)
So for Flux, the reliable test (sketched after this list) is:
- confirm adapters actually loaded,
- force a huge adapter weight,
- confirm output changes.
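Here is a minimal sketch of that A/B test. It assumes pipe and prompt already exist and that the LoRA was loaded with adapter_name="jimfitz"; disable_lora()/enable_lora() are part of the Diffusers LoRA API.
import numpy as np
import torch

def render(pipe, prompt, seed=42, steps=8):
    # Fixed seed and step count so the adapter state is the only variable.
    return pipe(
        prompt=prompt,
        generator=torch.Generator("cpu").manual_seed(seed),
        num_inference_steps=steps,
    ).images[0]

pipe.set_adapters("jimfitz", 2.0)   # exaggerated weight so any effect is unmistakable
with_lora = render(pipe, prompt)

pipe.disable_lora()                 # same pipeline, adapter switched off
without_lora = render(pipe, prompt)
pipe.enable_lora()

identical = np.array_equal(np.array(with_lora), np.array(without_lora))
print("outputs identical:", identical)   # True means the LoRA never reached the forward pass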
Flux + 4-bit: you may be on a version where LoRA is broken
Diffusers v0.32.2 explicitly says it “fixes a regression in loading LoRAs with bitsandbytes 4bit quantized Flux models.” (GitHub)
There are also Flux issues about LoRA failing when transformers are quantized to 4-bit. (GitHub)
So even if your code is “conceptually right”, an older Diffusers release can give exactly your symptom: LoRA loads but has no effect.
What I would do for your exact case (in order)
Step 1: Print versions first
You want to know if you are before or after the 4-bit Flux LoRA regression fix.
import diffusers, transformers
print("diffusers", diffusers.__version__)
print("transformers", transformers.__version__)
If you are older than the fix, upgrade Diffusers to at least v0.32.2. (GitHub)
Also be careful with nightly/dev builds: there are reports where load_lora_weights works in one version, breaks in the next, and works again after a downgrade. (GitHub)
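If you want the script to fail fast instead of silently producing un-LoRA'd images, a small guard works; 0.32.2 is the release whose notes mention the fix.
from packaging import version
import diffusers

# Refuse to run on a diffusers build that predates the 4-bit Flux LoRA regression fix.
if version.parse(diffusers.__version__) < version.parse("0.32.2"):
    raise RuntimeError(
        f"diffusers {diffusers.__version__} predates the 4-bit Flux LoRA fix; "
        "run: pip install -U diffusers"
    )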
Step 2: Build the pipeline exactly like the Flux.2 reference code (critical)
Replace AutoModel with Flux2Transformer2DModel exactly as shown in the official Flux.2 post. (Hugging Face)
Step 3: Load the LoRA as “directory + weight_name”
This matches the documented pattern and avoids ambiguity. (Hugging Face)
Step 4: Force adapter activation and do an A/B test
Diffusers supports setting adapter scaling. The docs show set_adapters() for controlling scale. (Hugging Face)
Even if you load only one LoRA, forcing adapter weight removes ambiguity.
A “known-good” rewritten version of your function
import torch
from transformers import Mistral3ForConditionalGeneration
from diffusers import Flux2Pipeline, Flux2Transformer2DModel

def generate_image_using_flux2(prompt: str):
    repo_id = "diffusers/FLUX.2-dev-bnb-4bit"
    device = "cuda:0"
    torch_dtype = torch.bfloat16

    # Match the official Flux.2 4-bit loading pattern
    transformer = Flux2Transformer2DModel.from_pretrained(
        repo_id, subfolder="transformer", torch_dtype=torch_dtype, device_map="cpu"
    )
    text_encoder = Mistral3ForConditionalGeneration.from_pretrained(
        repo_id, subfolder="text_encoder", dtype=torch_dtype, device_map="cpu"
    )

    pipe = Flux2Pipeline.from_pretrained(
        repo_id,
        transformer=transformer,
        text_encoder=text_encoder,
        torch_dtype=torch_dtype,
    )

    # Load LoRA BEFORE offload to reduce device-map weirdness.
    # Use directory + weight_name (doc pattern).
    pipe.load_lora_weights(
        ".",
        weight_name="jimfitzpatrick-fluxlora.safetensors",
        adapter_name="jimfitz",
    )

    # Force a very obvious strength for testing
    pipe.set_adapters("jimfitz", 2.0)

    pipe.enable_model_cpu_offload()

    img = pipe(
        prompt=prompt,
        generator=torch.Generator(device=device).manual_seed(42),
        num_inference_steps=24,
        guidance_scale=4,
    ).images[0]

    return img
Key points this fixes:
- Correct transformer class for Flux.2 loading. (Hugging Face)
- LoRA load path matches Diffusers adapter-loading docs. (Hugging Face)
- Explicit adapter activation and exaggerated weight.
If this still produces identical output, then either the LoRA is incompatible (wrong base model family), or you are still hitting a version-specific LoRA bug for quantized Flux. (GitHub)
If it still doesn’t work: diagnose “incompatible LoRA” quickly
1) The LoRA may be for Flux.1 or SDXL, not Flux.2
This happens a lot with Civitai LoRAs. The symptom matches a known Diffusers issue: “silent erroring, doesn’t error but style isn’t applied and output is identical.” (GitHub)
Because I cannot open the Civitai page from here (site access fails), I cannot confirm what base it was trained for. You can still confirm locally:
Inspect keys inside the safetensors file.
- If you see lots of unet keys, it is an SDXL or SD1.5 style LoRA.
- If you see Flux-specific transformer keys but with Flux.1 shapes or names, it may need conversion.
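A quick way to do that inspection locally, using the safetensors library. Only the file name comes from your setup, and the substring heuristics are rough guesses, not definitive checks.
from safetensors import safe_open

# Peek at the tensor keys to guess which base model family the LoRA targets.
with safe_open("jimfitzpatrick-fluxlora.safetensors", framework="pt", device="cpu") as f:
    keys = list(f.keys())

print(f"{len(keys)} tensors")
for k in keys[:15]:
    print(" ", k)

print("UNet-style (SD1.5/SDXL)?", any("unet" in k for k in keys))
print("Flux transformer-style? ", any("transformer" in k or "double_blocks" in k or "single_blocks" in k for k in keys))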
Flux LoRA key mismatch and conversion is a known topic (kohya conversion scripts are mentioned directly in Diffusers issues). (GitHub)
2) Try the LoRA on a non-quantized Flux.2 pipeline
Quantization is a known boundary where LoRA breaks. There are multiple Flux LoRA issues involving quantized transformers. (GitHub)
If it works unquantized but not on *-bnb-4bit, you have confirmed it is a quantization-path problem, not the LoRA file itself.
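A sketch of that cross-check, assuming the non-quantized weights live under the black-forest-labs/FLUX.2-dev repo (verify the exact repo id on the model card); this needs a lot of VRAM even with offload.
import torch
from diffusers import Flux2Pipeline

# Same LoRA, non-quantized Flux.2 pipeline.
pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev",   # assumed base repo id -- check the model card
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

pipe.load_lora_weights(".", weight_name="jimfitzpatrick-fluxlora.safetensors", adapter_name="jimfitz")
pipe.set_adapters("jimfitz", 2.0)

image = pipe(prompt="test prompt", num_inference_steps=24, guidance_scale=4).images[0]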
Prompt-trigger pitfall (less likely for “identical output”, but real)
Some LoRAs are trained with a trigger token and expect you to include it in the prompt. Diffusers docs explicitly show an example where you must include a token (“cnmt”) to trigger the LoRA. (Hugging Face)
If you never use the trigger token, the effect can be weak or look like “nothing changed,” especially at low CFG.
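If you are unsure whether the LoRA expects a trigger word, the safetensors header sometimes carries training metadata you can read locally. This is a sketch; the metadata keys depend on the training tool and may be absent entirely.
from safetensors import safe_open

with safe_open("jimfitzpatrick-fluxlora.safetensors", framework="pt") as f:
    meta = f.metadata()   # may be None

# kohya-trained LoRAs often store tag/caption statistics that reveal the trigger word.
if meta:
    for key in ("ss_tag_frequency", "ss_output_name", "modelspec.title"):
        if key in meta:
            print(key, "->", meta[key][:200])
else:
    print("no embedded metadata; check the Civitai page for the trigger word")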
Similar “no effect” cases online (the ones that match your symptom)
- Flux LoRA key mismatch leading to identical output, described as “silent erroring.” (GitHub)
- Flux lora_scale not changing output with fixed seed. (GitHub)
- Flux LoRA failing when transformer is quantized. (GitHub)
- Diffusers release explicitly fixing a 4-bit Flux LoRA loading regression. (GitHub)
- Version-specific load_lora_weights breakage reported on dev builds. (GitHub)
Summary
- Your strongest fix: load the denoiser as Flux2Transformer2DModel, not AutoModel. (Hugging Face)
- Load the LoRA as directory + weight_name, then force set_adapters(..., 2.0) to prove it affects output. (Hugging Face)
- Ensure Diffusers includes the 4-bit Flux LoRA regression fix (v0.32.2 mentions it explicitly). (GitHub)
- If still identical, assume LoRA incompatibility (Flux.1 or SDXL) or a quantization-path LoRA failure. (GitHub)