| library_name: transformers | |
| license: apache-2.0 | |
| language: | |
| - en | |
| datasets: | |
| - howey/unarXive | |
| - howey/wiki_en | |
| - howey/hupd | |
| # Model Weights Coming Soon! | |
| ## Using HDT | |
| To use the pre-trained model for masked language modeling, use the following snippet: | |
| ```python | |
| from transformers import AutoModelForMaskedLM, AutoTokenizer | |
| # Hub identifier of the pre-trained HDT encoder. | |
| model_name = 'howey/HDT-E' | |
| # Load the tokenizer and the masked language modeling model from the same checkpoint. | |
| tokenizer = AutoTokenizer.from_pretrained(model_name) | |
| model = AutoModelForMaskedLM.from_pretrained(model_name) | |
| ``` | |
| For more details, please see our GitHub repository: [HDT](https://github.com/autonomousvision/hdt) | |
| ## Model Details | |
| The model has a context length of `8192` tokens and is similar in size to BERT, with approximately `110M` parameters. | |
| It was trained on the standard masked language modeling task with a Transformer-based architecture that uses our proposed hierarchical attention. | |
| Training ran for 24 hours on the ArXiv+Wikipedia+HUPD corpus and processed a total of `1.3 billion` tokens. | |
| For more details, please see our paper: [HDT: Hierarchical Document Transformer](https://arxiv.org/pdf/2407.08330). | |
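To put the figures quoted above in perspective, here is a rough back-of-the-envelope sketch. It assumes the standard BERT-base hyperparameters (vocabulary of 30522, hidden size 768, 12 layers, feed-forward size 3072, 512 positions) purely as a reference point for "approximately `110M`"; HDT's own configuration is not given in this card and may differ. The function names are illustrative, and the throughput is a simple average over the stated 24 hours with no assumption about hardware.

```python
# Illustrative arithmetic only. BERT-base hyperparameters are an assumed
# reference point, not HDT's actual configuration.

def bert_base_param_count(vocab=30522, hidden=768, layers=12,
                          intermediate=3072, max_positions=512, type_vocab=2):
    """Count the parameters of a BERT-base style encoder (including the pooler)."""
    layer_norm = 2 * hidden  # scale and bias
    # Word, position and token-type embeddings, followed by a LayerNorm.
    embeddings = (vocab + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections (weights + biases), then LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers of the feed-forward block, then LayerNorm.
    feed_forward = ((hidden * intermediate + intermediate)
                    + (intermediate * hidden + hidden)
                    + layer_norm)
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


def training_throughput(total_tokens=1.3e9, hours=24):
    """Average tokens processed per second over the whole training run."""
    return total_tokens / (hours * 3600)


def full_context_sequences(total_tokens=1.3e9, context_length=8192):
    """How many full-length (8192-token) sequences the token budget corresponds to."""
    return total_tokens / context_length


if __name__ == "__main__":
    print(f"BERT-base parameters: {bert_base_param_count():,}")  # 109,482,240
    print(f"Average throughput:   {training_throughput():,.0f} tokens/s")
    print(f"Full 8192-token sequences: {full_context_sequences():,.0f}")
```

Under these assumptions, the token budget amounts to roughly 15 thousand tokens per second on average, or about 159 thousand full-length sequences.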
| ## Citation | |
| Please cite our work using the bibtex below: | |
| **BibTeX:** | |
| ``` | |
| @inproceedings{He2024COLM, | |
| title={HDT: Hierarchical Document Transformer}, | |
| author={Haoyu He and Markus Flicke and Jan Buchmann and Iryna Gurevych and Andreas Geiger}, | |
| year={2024}, | |
| booktitle={Conference on Language Modeling} | |
| } | |
| ``` | |
| ## Model Card Contact | |
| Haoyu (haoyu.he@uni-tuebingen.de) |