Model Card for WpnSta/lner-xlm-roberta
This is a model fine-tuned for multilingual named entity recognition (NER), specifically for literary texts.
Model Details
Model Description
This model is a fine-tuned version of multilingual XLM-RoBERTa-base, trained on English, French, and Italian literary data.
- Developed by: WpnSta as part of an NLP training course
- Language(s) (NLP): English, French, Italian
- Finetuned from model: XLM-RoBERTa-base
Model Sources
- Repository: GitHub repository with training code
- Demo: Web Interface
Direct Use
This model is ready to use for predicting named entities in new text. It detects the following entity types, following the LitBank annotation schema (see the usage sketch after this list):
- PER (Person, character, also animals with active roles in the narrative)
- FAC (Facility, e.g. the house, the street)
- GPE (Geopolitical Entity, e.g. London, the village)
- LOC (Location, e.g. the river, the sea)
- ORG (Organisation, e.g. the army, the court)
- VEH (Vehicle, e.g. the ship, the coach)
- TIME (Temporal or historical reference, e.g. in the morning, Easter)
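A minimal usage sketch with the transformers token-classification pipeline, assuming the checkpoint is published on the Hugging Face Hub as WpnSta/lner-xlm-roberta with a standard token-classification head; the example sentence is illustrative only.

```python
from transformers import pipeline

# Load the fine-tuned NER model from the Hub (model id assumed from this card).
ner = pipeline(
    "token-classification",
    model="WpnSta/lner-xlm-roberta",
    aggregation_strategy="simple",  # merge sub-word pieces into whole entity spans
)

text = "Elizabeth walked along the river towards London in the morning."

# Each prediction carries a LitBank-style label such as PER, LOC, GPE, or TIME.
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```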
Limitations
The model was trained on literary texts from the 18th, 19th, and 20th centuries (see Training Data below). It will perform best on text in the same languages and from the same period; for more recent texts, a model trained on news data will probably perform better.
Training Data
The model was trained on approximately 620,000 tokens in English, French, and Italian. The following datasets were used: