Update README.md
README.md
CHANGED
@@ -53,6 +53,7 @@ Furthermore, we are also planning to look into proper RLVR to make her better at
 A more detailed writeup will be released whenever we get to a more final model, but currently this model has undergone
 - Abliteration
 - SFT on primarily roleplay logs and scrapes
+- A merge to heal some of the bad characteristics of the SFT phase with the original abliteration, epoch 1, and epoch 2 of the SFT
 - WPO\[2\] on general preference data, writing data, and Luna persona data
 
 ## Citations