peter-sk committed
Commit 7dfc40f · 2 Parent(s): ffb71bd f12a1ba

Merge branch 'main' of https://huggingface.co/danish-foundation-models/munin-7b-open-pt

Files changed (1)
  1. README.md +16 -8
README.md CHANGED
@@ -1,8 +1,16 @@
- ---
- license: apache-2.0
- ---
-
- | Stage | Batch size | Steps | HF path | Data mix | Comments |
- +--------+-------------+-------+---------+-------------------------------+----------+
- | stage1 | 262,144 tok | 37852 | [subfolder="stage1"](https://huggingface.co/danish-foundation-models/munin-7b-open-pt/tree/main/stage1)
- | 2/3 [DynaWord](https://huggingface.co/datasets/danish-foundation-models/danish-dynaword/tree/9e230b35e31a510e5ab909112ad5bfc9463b2c23/), 1/3 [Common-Pile] (common-pile/comma_v0.1_training_dataset/5afc546db324e7f39f297ba757c9a60547151e7c/) | excludes depbank, jvj, nordjyllandnews, synne for DynaWord; uses subsets and weighting from [Comma-v0.1-2T](https://huggingface.co/common-pile/comma-v0.1-2t) cooldown phase for Common-Pile |
+ ---
+ license: apache-2.0
+ datasets:
+ - danish-foundation-models/danish-dynaword
+ - common-pile/comma_v0.1_training_dataset
+ language:
+ - da
+ - en
+ base_model:
+ - common-pile/comma-v0.1-2t
+ pipeline_tag: text-generation
+ ---
+
+ | Stage | Batch size | Steps | HF path | Data mix | Comments |
+ |--------|----------------|-------|-----------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | stage1 | 262,144 tok | 37,852| [subfolder="stage1"](https://huggingface.co/danish-foundation-models/munin-7b-open-pt/tree/main/stage1) | 2/3 [DynaWord](https://huggingface.co/datasets/danish-foundation-models/danish-dynaword/tree/9e230b35e31a510e5ab909112ad5bfc9463b2c23) <br> 1/3 [Common-Pile](https://huggingface.co/common-pile/comma_v0.1_training_dataset/5afc546db324e7f39f297ba757c9a60547151e7c/) | Excludes depbank, jvj, nordjyllandnews, synne for DynaWord; uses subsets and weighting from [Comma-v0.1-2T](https://huggingface.co/common-pile/comma-v0.1-2t) cooldown phase for Common-Pile. |
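
The stage1 row gives a batch size and a step count but no token total. As a rough sketch, the budget can be derived from those two figures. This assumes that "262,144 tok" means tokens per optimizer step, that every step used a full batch, and that the 2/3 and 1/3 data mix is measured in tokens; the README does not confirm any of these, and the variable names are illustrative only.

```python
from fractions import Fraction

# Figures copied from the stage1 table row.
BATCH_SIZE_TOKENS = 262_144  # assumed: tokens per optimizer step
STEPS = 37_852

# Data mix as stated in the table; assumed to be token shares.
DATA_MIX = {
    "DynaWord": Fraction(2, 3),
    "Common-Pile": Fraction(1, 3),
}

# Total tokens seen if every step processes one full batch.
total_tokens = BATCH_SIZE_TOKENS * STEPS

# Approximate per-source token counts, truncated to whole tokens.
tokens_per_source = {
    name: int(total_tokens * share) for name, share in DATA_MIX.items()
}

print(f"total: {total_tokens:,} tokens")
for name, count in tokens_per_source.items():
    print(f"{name}: ~{count:,} tokens")
```

Under these assumptions stage1 covers roughly 9.9 billion tokens, about 6.6 billion from DynaWord and 3.3 billion from Common-Pile.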