Add LaTa files

Files changed (6)

README.md +47 -0
config.json +28 -0
flax_model.msgpack +3 -0
pytorch_model.bin +3 -0
tf_model.h5 +3 -0
tokenizer.json +0 -0

README.md ADDED

	@@ -0,0 +1,47 @@

+---
+language: la
+license: apache-2.0
+inference: false
+---
+# LaTa
+The paper [Exploring Large Language Models for Classical Philology](https://arxiv.org/abs/2305.13698) is the first effort to systematically provide state-of-the-art language models for Classical Philology. LaTa is a T5-base-sized, monolingual, encoder-decoder variant.
+This model was trained on the [Corpus Corporum](https://mlat.uzh.ch/).
+Further information can be found in our paper or in our [GitHub repository](https://github.com/Heidelberg-NLP/ancient-language-models).
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+# Load the tokenizer and the T5-style encoder-decoder weights from the Hub.
+tokenizer = AutoTokenizer.from_pretrained('bowphs/LaTa')
+model = AutoModelForSeq2SeqLM.from_pretrained('bowphs/LaTa')
+```
+Please check out the awesome Hugging Face tutorials on how to fine-tune our models.
+## Evaluation Results
+When fine-tuned on lemmatization data from [EvaLatin 2022](https://universaldependencies.org/), LaTa achieves the following results:
+| Task | Classical | Cross-genre | Cross-time |
+|:--:|:--:|:--:|:--:|
+| Lemmatization | 97.30 | 93.95 | 92.26 |
+## Contact
+If you have any questions or problems, feel free to [reach out](mailto:riemenschneider@cl.uni-heidelberg.de).
+## Citation
+```bibtex
+@incollection{riemenschneiderfrank:2023,
+    address = "Toronto, Canada",
+    author = "Riemenschneider, Frederick and Frank, Anette",
+    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23)",
+    note = "to appear",
+    pubType = "incollection",
+    publisher = "Association for Computational Linguistics",
+    title = "Exploring Large Language Models for Classical Philology",
+    url = "https://arxiv.org/abs/2305.13698",
+    year = "2023",
+    key = "riemenschneiderfrank:2023"
+}
+```

config.json ADDED

	@@ -0,0 +1,28 @@

+{
+  "_name_or_path": "bowphs/LaTa",
+  "architectures": [
+    "T5ForConditionalGeneration"
+  ],
+  "d_ff": 2048,
+  "d_kv": 64,
+  "d_model": 768,
+  "decoder_start_token_id": 0,
+  "dropout_rate": 0.1,
+  "eos_token_id": 1,
+  "feed_forward_proj": "gated-gelu",
+  "gradient_checkpointing": false,
+  "initializer_factor": 1.0,
+  "is_encoder_decoder": true,
+  "layer_norm_epsilon": 1e-06,
+  "model_type": "t5",
+  "num_decoder_layers": 12,
+  "num_heads": 12,
+  "num_layers": 12,
+  "output_past": true,
+  "pad_token_id": 0,
+  "relative_attention_num_buckets": 32,
+  "tie_word_embeddings": false,
+  "transformers_version": "4.10.0",
+  "use_cache": true,
+  "vocab_size": 52103
+}
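
As a rough sanity check on the configuration above, the parameter count can be estimated from the config values alone. The sketch below assumes the usual Hugging Face T5 layout for `gated-gelu` models (bias-free linear layers, scale-only layer norms, one relative-attention bias table per stack); the function name `estimate_t5_params` is hypothetical and not part of any library.

```python
def estimate_t5_params(cfg: dict) -> int:
    """Estimate the parameter count of a T5-style model from config values.

    Assumes bias-free projections, a gated feed-forward block (wi_0, wi_1, wo),
    scale-only layer norms, and one relative-attention bias table per stack.
    """
    d_model, d_ff = cfg["d_model"], cfg["d_ff"]
    inner = cfg["num_heads"] * cfg["d_kv"]      # total attention width
    attn = 4 * d_model * inner                  # q, k, v, o projections
    ffn = 3 * d_model * d_ff                    # wi_0, wi_1, wo
    norm = d_model                              # one scale vector per layer norm
    rel_bias = cfg["relative_attention_num_buckets"] * cfg["num_heads"]

    # Encoder layer: self-attention + feed-forward, each with a layer norm.
    encoder = cfg["num_layers"] * (attn + ffn + 2 * norm) + rel_bias + norm
    # Decoder layer: self-attention + cross-attention + feed-forward.
    decoder = cfg["num_decoder_layers"] * (2 * attn + ffn + 3 * norm) + rel_bias + norm

    embed = cfg["vocab_size"] * d_model
    # With untied embeddings the output projection is a separate matrix.
    head = 0 if cfg["tie_word_embeddings"] else embed
    return embed + head + encoder + decoder


# Values copied from the config.json in this commit.
config = {
    "d_model": 768, "d_ff": 2048, "d_kv": 64, "num_heads": 12,
    "num_layers": 12, "num_decoder_layers": 12,
    "relative_attention_num_buckets": 32,
    "tie_word_embeddings": False, "vocab_size": 52103,
}

n_params = estimate_t5_params(config)
fp32_bytes = 4 * n_params  # 4 bytes per float32 parameter
print(n_params, fp32_bytes)  # 278259456 1113037824
```

Under these assumptions the estimate is about 278M parameters, or roughly 1.113 GB in float32, which is within a fraction of a percent of the three weight-file sizes recorded in this commit.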

flax_model.msgpack ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fc78634ed2ab333eb54f9b429fc9c5a71dac9f5515fae97be5a9995e34ed3014
+size 1113050015

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f47245357c65a6b6dfd9aa5ebdd84a1bccb458ebeb488bd4826a896c2f8a80c6
+size 1113160781

tf_model.h5 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d4830bfa8e7bd06f9a5f21269549dd8916a18e60132e7bdf2bc92ab4a9e7b483
+size 1113611600
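
The three weight files are committed as Git LFS pointers rather than as the binaries themselves; each pointer records a spec version, a SHA-256 object ID, and the size in bytes. A minimal sketch of reading such a pointer follows. The helper name `parse_lfs_pointer` is hypothetical, and the sample text is the `tf_model.h5` pointer from this commit.

```python
import re


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value

    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256" or not re.fullmatch(r"[0-9a-f]{64}", digest):
        raise ValueError("unexpected oid format")

    return {
        "version": fields["version"],
        "sha256": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d4830bfa8e7bd06f9a5f21269549dd8916a18e60132e7bdf2bc92ab4a9e7b483
size 1113611600
"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 1113611600
```

After downloading the actual file, its SHA-256 digest and byte length can be compared against `info["sha256"]` and `info["size"]` to confirm the transfer was complete.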

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff