EUPE ViT-T/16 ONNX

ONNX export of kittn/eupe_vitt16 for transformers.js image feature extraction.

This repo contains:

  • config.json
  • preprocessor_config.json
  • onnx/model.onnx

The exported model takes a pixel_values input shaped [batch, channels, height, width] and returns last_hidden_state shaped [batch, sequence_length, hidden]. Both tensors have dynamic shapes.
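To make the relation between input size and sequence_length concrete, here is a minimal sketch. It rests on assumptions that are not stated in this repo: a patch size of 16 (read off the "/16" in the model name), a token layout of one leading token followed by the register tokens and then the patch tokens, and input dimensions that are multiples of the patch size. The helper name expectedTokenLayout is hypothetical.

```javascript
// Hypothetical helper: estimates the token layout for a given input size.
// Assumptions: patch size 16, and sequence = [1 leading token, register tokens, patch tokens].
function expectedTokenLayout(height, width, numRegisterTokens = 0, patchSize = 16) {
  if (height % patchSize !== 0 || width % patchSize !== 0) {
    // How the model treats other sizes is not covered by this sketch.
    throw new Error(`height and width must be multiples of ${patchSize}`);
  }
  const gridHeight = height / patchSize;
  const gridWidth = width / patchSize;
  const numPatches = gridHeight * gridWidth;
  return {
    gridHeight,
    gridWidth,
    numPatches,
    sequenceLength: 1 + numRegisterTokens + numPatches,
  };
}
```

Under these assumptions, a 224×224 input with 4 register tokens would give a 14×14 patch grid and a sequence length of 201. Check the result against the actual last_hidden_state dims before relying on it.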

Minimal transformers.js usage:

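import { pipeline, RawImage } from "@huggingface/transformers";

// `device` and `dtype` are supplied by the caller (for example "webgpu" and "fp32").
const extractor = await pipeline("image-feature-extraction", "kittn/eupe_vitt16-onnx", {
  device,
  dtype,
});

// Disable resizing so the image is fed at its native resolution.
extractor.processor.image_processor.do_resize = false;

// `imageCanvas` is an existing canvas holding the input image.
const imageData = await RawImage.fromCanvas(imageCanvas);
const features = await extractor(imageData, { pooling: "none" });

// Drop the first token and any register tokens, keeping only the patch tokens.
const numRegisterTokens = extractor.model.config.num_register_tokens ?? 0;
const patchFeatures = features.slice(null, [1 + numRegisterTokens, null]);
// L2-normalize each patch feature along the hidden dimension.
const normalizedFeatures = patchFeatures.normalize(2, -1);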
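Because the patch features above are L2-normalized, the dot product of two patch vectors equals their cosine similarity. As one possible follow-up (a sketch, not part of this repo), the hypothetical helper below compares one query patch against all patches. It assumes the features are available as a flat, row-major array of shape [numPatches, hidden].

```javascript
// Hypothetical helper: cosine similarity of one patch against every patch.
// `data` is a flat row-major array of shape [numPatches, hidden] whose rows
// are assumed to be L2-normalized already, so a dot product suffices.
function patchSimilarities(data, numPatches, hidden, queryIndex) {
  const sims = new Float32Array(numPatches);
  const queryOffset = queryIndex * hidden;
  for (let i = 0; i < numPatches; i++) {
    const rowOffset = i * hidden;
    let dot = 0;
    for (let d = 0; d < hidden; d++) {
      dot += data[queryOffset + d] * data[rowOffset + d];
    }
    sims[i] = dot;
  }
  return sims;
}
```

For a single image, normalizedFeatures.data and normalizedFeatures.dims (expected to be [1, numPatches, hidden]) would supply the arguments, for example patchSimilarities(normalizedFeatures.data, dims[1], dims[2], 0).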