THUDM: GLM 4.1V 9B Thinking

GLM-4.1V-9B-Thinking is a 9B parameter vision-language model developed by THUDM, based on the GLM-4-9B foundation. It introduces a reasoning-centric "thinking paradigm" enhanced with reinforcement learning to improve multimodal reasoning, long-context understanding (up to 64K tokens), and complex problem solving. It achieves state-of-the-art performance among models in its class, outperforming even larger models like Qwen-2.5-VL-72B on a majority of benchmark tasks.

Description

GLM-4.1V-9B-Thinking is a 9B parameter vision-language model developed by THUDM, based on the GLM-4-9B foundation. It introduces a reasoning-centric "thinking paradigm" enhanced with reinforcement learning to improve multimodal reasoning, long-context understanding (up to 64K tokens), and complex problem solving. It achieves state-of-the-art performance among models in its class, outperforming even larger models like Qwen-2.5-VL-72B on a majority of benchmark tasks.

ArchitectureАрхитектура

Modality:
text+image->text
InputModalities:
image, text
OutputModalities:
text
Tokenizer:
Other

ContextAndLimits

ContextLength:
65536 Tokens
MaxResponseTokens:
8000 Tokens
Moderation:
Disabled

PricingRUB

Request:
Image:
WebSearch:
InternalReasoning:
Prompt1KTokens:
Completion1KTokens:

DefaultParameters

Temperature:
0
StartChatWith THUDM: GLM 4.1V 9B Thinking

UserComments