We're releasing 4 new variants of DreamSim! These new checkpoints are:
- DINOv2 B/14 and SynCLR B/16 as backbones
- DINOv2 B/14 and DINO B/16 trained with the original contrastive loss on both CLS and dense features.
These models (and the originals) are further evaluated in our new NeurIPS 2024 paper, *When Does Perceptual Alignment Benefit Vision Representations?*
We find that our perceptually-aligned representations outperform the baseline models on a variety of standard downstream computer vision tasks, including semantic segmentation, depth estimation, object counting, instance retrieval, and retrieval-augmented generation. These results point towards perceptual alignment as a useful objective for learning general-purpose vision representations. See the paper and our blog post for more details.
Here's how they perform on NIGHTS:
| Model | NIGHTS - Val | NIGHTS - Test |
|---|---|---|
| ensemble | 96.9% | 96.2% |
| dino_vitb16 | 95.6% | 94.8% |
| open_clip_vitb32 | 95.6% | 95.3% |
| clip_vitb32 | 94.9% | 93.6% |
| dinov2_vitb14 | 94.9% | 95.0% |
| synclr_vitb16 | 96.0% | 95.9% |
| dino_vitb16 (patch) | 94.9% | 94.8% |
| dinov2_vitb14 (patch) | 95.5% | 95.1% |
Additionally, we fixed a bug in embedding normalization. This shouldn't significantly affect model performance, but it may explain very minor changes in pipelines that use DreamSim with `normalize_embeds=True`.
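For context, embedding normalization here refers to L2-normalizing feature vectors to unit length before comparing them. A minimal sketch of that operation in plain Python (the helper name and epsilon handling are illustrative, not DreamSim's actual implementation):

```python
import math

def l2_normalize(embed, eps=1e-8):
    """Scale a feature vector to unit L2 norm.

    Hypothetical helper illustrating what an option like
    `normalize_embeds=True` generally does; not DreamSim's code.
    """
    norm = math.sqrt(sum(x * x for x in embed))
    return [x / (norm + eps) for x in embed]

# A [3, 4] vector has norm 5, so it normalizes to roughly [0.6, 0.8].
unit_vec = l2_normalize([3.0, 4.0])
```

After normalization, cosine similarity between two embeddings reduces to a plain dot product, which is why a subtle normalization bug can shift similarity scores slightly without changing model weights.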