Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2406.06911

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Papers - Training - Multi-GPU

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Image - Denoising - Stride Denoising

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Image - Diffusion - Parallel Denoising

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Image - LPIPS

Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17, 2024 • 46
Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer

Paper • 2404.14351 • Published Apr 22, 2024 • 6
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models

Paper • 2404.17672 • Published Apr 26, 2024 • 19
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

Papers - Video - SDXL - Multi-GPU

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Image - Inference - Model Segmentation

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Image-Gen Personalization

pOps: Photo-Inspired Diffusion Operators

Paper • 2406.01300 • Published Jun 3, 2024 • 17
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12
Interpreting the Weight Space of Customized Diffusion Models

Paper • 2406.09413 • Published Jun 13, 2024 • 20
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Paper • 2406.09162 • Published Jun 13, 2024 • 14

Papers - University - National University of Singapore

Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

Paper • 2404.09956 • Published Apr 15, 2024 • 12
Contrastive Chain-of-Thought Prompting

Paper • 2311.09277 • Published Nov 15, 2023 • 35
FlashSpeech: Efficient Zero-Shot Speech Synthesis

Paper • 2404.14700 • Published Apr 23, 2024 • 32
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 18
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 62
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

Papers - Training - Multi-GPU

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Video - SDXL - Multi-GPU

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Image - Denoising - Stride Denoising

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Image - Inference - Model Segmentation

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Papers - Image - Diffusion - Parallel Denoising

AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Image-Gen Personalization

pOps: Photo-Inspired Diffusion Operators

Paper • 2406.01300 • Published Jun 3, 2024 • 17
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12
Interpreting the Weight Space of Customized Diffusion Models

Paper • 2406.09413 • Published Jun 13, 2024 • 20
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Paper • 2406.09162 • Published Jun 13, 2024 • 14

Papers - Image - LPIPS

Dynamic Typography: Bringing Words to Life

Paper • 2404.11614 • Published Apr 17, 2024 • 46
Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer

Paper • 2404.14351 • Published Apr 22, 2024 • 6
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models

Paper • 2404.17672 • Published Apr 26, 2024 • 19
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

Papers - University - National University of Singapore

Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

Paper • 2404.09956 • Published Apr 15, 2024 • 12
Contrastive Chain-of-Thought Prompting

Paper • 2311.09277 • Published Nov 15, 2023 • 35
FlashSpeech: Efficient Zero-Shot Speech Synthesis

Paper • 2404.14700 • Published Apr 23, 2024 • 32
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

Paper • 2406.06911 • Published Jun 11, 2024 • 12

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs