MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks Paper • 2506.05982 • Published Jun 6, 2025 • 2 • 2
Autoregressive Images Watermarking through Lexical Biasing: An Approach Resistant to Regeneration Attack Paper • 2506.01011 • Published Jun 1, 2025 • 9 • 2
DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers Paper • 2505.21541 • Published May 24, 2025 • 7 • 2
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers Paper • 2506.02528 • Published Jun 3, 2025 • 15 • 2
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering Paper • 2505.24417 • Published May 30, 2025 • 13 • 2
GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains Paper • 2505.18700 • Published May 24, 2025 • 4 • 2
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published May 24, 2025 • 63 • 2
FocusedAD: Character-centric Movie Audio Description Paper • 2504.12157 • Published Apr 16, 2025 • 8 • 3
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer Paper • 2503.07027 • Published Mar 10, 2025 • 29 • 2
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published Feb 20, 2025 • 41 • 6
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data Paper • 2502.14397 • Published Feb 20, 2025 • 41 • 6
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer Paper • 2502.01105 • Published Feb 3, 2025 • 21 • 4
MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation Paper • 2502.01572 • Published Feb 3, 2025 • 21 • 2