Geoparsing: Diagram Parsing for Plane and Solid Geometry with a Unified Formal Language
Paper: arXiv 2604.11600
[Homepage] [GitHub] [Hugging Face Dataset] [Hugging Face Model] [Paper]
We follow the official environment setup of Qwen3-VL. Please refer to: https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct
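As a quick sanity check after setup, the sketch below tests whether the installed `transformers` package exposes `Qwen3VLForConditionalGeneration`, the class the inference example imports. The helper name `qwen3vl_available` is hypothetical and not part of the released code. The check only confirms that the name can be resolved; it does not verify that a GPU or a compatible `torch` build is present, and it is no substitute for the official Qwen3-VL requirements.

```python
import importlib


def qwen3vl_available() -> bool:
    """Return True if transformers imports and exposes the Qwen3-VL model class."""
    try:
        transformers = importlib.import_module("transformers")
        # transformers resolves top-level names lazily, so guard the lookup as well.
        return hasattr(transformers, "Qwen3VLForConditionalGeneration")
    except Exception:
        # Covers a missing package and any error raised while resolving the name.
        return False


if __name__ == "__main__":
    print("Qwen3-VL support available:", qwen3vl_available())
```

If this reports `False`, the installed `transformers` release most likely predates Qwen3-VL support and should be upgraded as described in the Qwen3-VL instructions.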
We provide a minimal example for running inference with the released Geoparsing model.
import torch
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor

model_path = "YOUR_MODEL_PATH"  # local path or HuggingFace repo id

# Load the model weights and the matching processor
model = Qwen3VLForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="cuda:0"
)
processor = AutoProcessor.from_pretrained(model_path)

# A single user turn: the diagram image followed by the parsing instruction
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "examples/3_17.jpg",
            },
            {
                "type": "text",
                "text": "Please parse the geometric diagram and provide its formal description.",
            },
        ],
    }
]

# Apply the chat template and tokenize text and image into model inputs
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
)
inputs = inputs.to(model.device)

# Generate the formal description
generated_ids = model.generate(
    **inputs,
    max_new_tokens=1280
)

# Strip the prompt tokens so that only the newly generated tokens are decoded
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)
print(output_text[0])
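The message structure and the trimming step above are plain Python and can be tried without loading the model. The sketch below wraps them in two hypothetical helpers, `build_messages` and `trim_prompt_tokens` (neither name comes from the released code). It uses nested lists of integers in place of token-id tensors to show that slicing each output by its prompt length leaves only the newly generated tokens.

```python
DEFAULT_PROMPT = "Please parse the geometric diagram and provide its formal description."


def build_messages(image_path, prompt=DEFAULT_PROMPT):
    """Build a single-turn chat message holding one image and one text instruction."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": prompt},
            ],
        }
    ]


def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the prompt prefix from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]


if __name__ == "__main__":
    messages = build_messages("examples/3_17.jpg")
    print(messages[0]["content"][0]["image"])  # prints examples/3_17.jpg

    # Toy token ids: each generated sequence is the prompt followed by new tokens
    prompt_ids = [[101, 7, 8]]
    full_ids = [[101, 7, 8, 42, 43]]
    print(trim_prompt_tokens(prompt_ids, full_ids))  # prints [[42, 43]]
```

The same slicing works on the tensors returned by `model.generate`, which is why the example zips `inputs.input_ids` with `generated_ids` before decoding.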
@article{wang2026geoparsing,
  title={Geoparsing: Diagram Parsing for Plane and Solid Geometry with a Unified Formal Language},
  author={Wang, Peijie and Zhang, Ming-Liang and Cao, Jun and Deng, Chao and Ran, Dekang and Sun, Hongda and Bu, Pi and Zhang, Xuan and Wang, Yingyao and Song, Jun and Zheng, Bo and Yin, Fei and Liu, Cheng-Lin},
  journal={https://arxiv.org/abs/2604.11600},
  year={2026}
}