DongJae Shin

ShinDJ

AI & ML interests

NLP, LLM, Vision-Language Model

Recent Activity

updated a dataset about 23 hours ago

ShinDJ/OneThinker_img_train

Viewer • Updated about 23 hours ago • 173k • 5

published a dataset about 24 hours ago

ShinDJ/OneThinker_img_train

Viewer • Updated about 23 hours ago • 173k • 5

upvoted a paper 3 days ago

Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding

Paper • 2512.17532 • Published 7 days ago • 62

updated 2 datasets 3 days ago

Tutoruslabs/GSM8K_KOR-train

Viewer • Updated 3 days ago • 3.42k • 8 • 3

Tutoruslabs/GSM8K_KOR

Viewer • Updated 3 days ago • 569 • 6 • 2

updated a Space 3 days ago

Trackio

published a Space 3 days ago

Trackio

updated a Space 3 days ago

Trl Trackio

published a Space 3 days ago

Trl Trackio

liked a model 10 days ago

nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16

Text Generation • 32B • Updated 3 days ago • 182k • 479

upvoted an article 22 days ago

We Got Claude to Fine-Tune an Open Source LLM

Article • 23 days ago • 535

reacted to sergiopaniego's post with 🔥 22 days ago

Post • 2383

NEW: @mistralai released a fantastic family of multimodal models, Ministral 3.

You can fine-tune them for free on Colab using TRL ⚡️, which supports both SFT and GRPO.

Link to the notebooks:
- SFT: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_ministral3_vl.ipynb
- GRPO: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_ministral3_vl.ipynb
- TRL and more examples: https://huggingface.co/docs/trl/index

2 replies

reacted to sergiopaniego's post with 👍 about 1 month ago

Post • 1729

Interested in RL training environments?

We just released a beginner-friendly walkthrough notebook!

Train a model to play Wordle using TRL + OpenEnv (TextArena) + GRPO + vLLM.

Happy learning! 🌱

Notebook: https://github.com/huggingface/trl/blob/main/examples/notebooks/openenv_wordle_grpo.ipynb

OpenEnv guide in TRL: https://huggingface.co/docs/trl/main/en/openenv

updated 3 datasets about 1 month ago

Tutoruslabs/KMMR-VisMath

Viewer • Updated Nov 25 • 2.06k • 4 • 2

Tutoruslabs/ELEMENTARY_MATH

Viewer • Updated Nov 25 • 448 • 4 • 2

Tutoruslabs/ChartQA_KOR

Viewer • Updated Nov 25 • 2k • 6 • 2

upvoted a paper about 1 month ago

VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model

Paper • 2504.07615 • Published Apr 10 • 35

liked a model about 1 month ago

case01/llama-3.2-Korean-Bllossom-AICA-5B-gguf

0.9B • Updated Jan 19 • 114 • 2

upvoted a paper about 1 month ago

NVIDIA Nemotron Nano V2 VL

Paper • 2511.03929 • Published Nov 6 • 27

liked a dataset about 1 month ago

Tutoruslabs/ELEMENTARY_MATH

Viewer • Updated Nov 25 • 448 • 4 • 2