There is no <think> tag anywhere in the response.
Even after setting "chat_template_kwargs": {"enable_thinking": True}, the <think> tag still does not appear in the answer.
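For reference, here is a minimal sketch of how I am passing that option, assuming an OpenAI-compatible vLLM server on localhost:8000; the model name is a placeholder for whatever name the server reports. With the official openai Python client, non-standard parameters like chat_template_kwargs have to go through extra_body:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="GLM-4.6V-Flash",  # placeholder; use the model name your server exposes
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    # chat_template_kwargs is not part of the OpenAI schema,
    # so it must be passed via extra_body
    extra_body={"chat_template_kwargs": {"enable_thinking": True}},
)

msg = resp.choices[0].message
print("content:", msg.content)
# reasoning_content is a vLLM/SGLang extension field; use getattr
# so this does not raise if the server never sets it
print("reasoning_content:", getattr(msg, "reasoning_content", None))
```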
The <think> tag will be detected and filtered, and the content between <think> and </think> will be separated out and returned as the reasoning_content field. That is the parser logic in SGLang and vLLM.
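A minimal sketch of that splitting idea (not the actual SGLang/vLLM parser code) looks roughly like this; note how everything hinges on the closing </think> tag actually being present:

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text: str) -> tuple[str | None, str]:
    """Return (reasoning_content, content) from a raw model response."""
    m = THINK_RE.search(text)
    if m is None:
        # No complete <think>...</think> span found: a real parser has to
        # decide where the text goes, which is exactly where the failure
        # modes reported below can arise.
        return None, text
    reasoning = m.group(1).strip()
    # Everything outside the tags becomes the visible content
    content = (text[:m.start()] + text[m.end():]).strip()
    return reasoning, content

print(split_reasoning("<think>2+2=4</think>The answer is 4."))
# ('2+2=4', 'The answer is 4.')
```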
Sadly, the reasoning parser doesn't currently work properly. I have this problem with GLM-4.6V-Flash in vLLM (tested on 0.13.0 and nightly):
On 0.13.0, it outputs both the thinking and the response in the content field.
On nightly, everything gets put in the reasoning_content field, with content left empty.
I have the max length set to 50k tokens, and this is happening for responses under 2,000 tokens as well.
I'm currently trying to figure out how to debug this, but it seems the model is not emitting the correct end-of-reasoning tokens.
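One way to check that hypothesis is to take the parser out of the loop and look at the raw text. A minimal debugging sketch, assuming the server was restarted *without* the --reasoning-parser flag so the raw output (including any <think>/</think> tags) comes back untouched in content:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="GLM-4.6V-Flash",  # placeholder; use the model name your server exposes
    messages=[{"role": "user", "content": "Briefly: why is the sky blue?"}],
)

raw = resp.choices[0].message.content or ""
# If the model opens a think block but never closes it, the parser
# has no boundary to split on, matching the symptoms above
print("has <think>: ", "<think>" in raw)
print("has </think>:", "</think>" in raw)
print(raw[:500])
```

If "<think>" shows up but "</think>" never does, that would confirm the model (or its chat template) is the problem rather than the parser itself.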