magibu's Collections
Pretrain Datasets
Updated Jan 3
Datasets we use for pretraining large language models
omarkamali/wikipedia-monthly • Updated 3 days ago • 12.4k • 52
alibayram/hukuk_soru_cevap • Viewer • Updated Nov 6, 2024 • 2.08k • 15 • 14
umutertugrul/turkish-hospital-medical-articles • Viewer • Updated Oct 2, 2025 • 24.6k • 7 • 8
umutertugrul/turkish-medical-articles • Viewer • Updated Oct 2, 2025 • 42.8k • 9 • 3
alibayram/tr-books • Viewer • Updated Dec 17, 2025 • 3.7k • 6
selimfirat/bilkent-turkish-writings-dataset • Viewer • Updated May 24, 2025 • 25.1k • 88 • 12
umutertugrul/turkish-academic-theses-dataset • Viewer • Updated Aug 18, 2025 • 649k • 74 • 9
alibayram/onedio_haberler • Viewer • Updated Jun 18, 2024 • 66.7k • 8 • 5
habanoz/news-tr-1.8M • Viewer • Updated Oct 6, 2024 • 1.85M • 101 • 7
alibayram/hepsiburada_yorumlar • Viewer • Updated Jun 18, 2024 • 2.66M • 8 • 14
alibayram/kitapyurdu_yorumlar • Viewer • Updated Jun 18, 2024 • 405k • 38 • 1
alibayram/beyazperde_yorumlar • Viewer • Updated Jun 18, 2024 • 192k • 15 • 5
BILGEM-AI/BILGE-Synthetic-Stories • Viewer • Updated Nov 20, 2025 • 2.87M • 34 • 5