Model Card for lner-xlm-roberta

This model is fine-tuned for multilingual named entity recognition (NER), specifically on literary texts.

Model Details

Model Description

This model is a fine-tuned version of multilingual XLM-RoBERTa, trained on English, French, and Italian literary data.

  • Developed by: WpnSta as part of an NLP training course
  • Language(s) (NLP): English, French, Italian
  • Finetuned from model: XLM-RoBERTa-base

Direct Use

This model can be used directly to predict named entities in new text (see the usage sketch after the list below). It detects the following entity types, according to the LitBank annotation schema:

  • PER (Person: a character, including animals with an active role in the narrative)
  • FAC (Facility, e.g. the house, the street)
  • GPE (Geopolitical Entity, e.g. London, the village)
  • LOC (Location, e.g. the river, the sea)
  • ORG (Organisation, e.g. the army, the court)
  • VEH (Vehicle, e.g. the ship, the coach)
  • TIME (Temporal or historical reference, e.g. in the morning, Easter)

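Below is a minimal usage sketch with the Hugging Face transformers pipeline. It assumes the checkpoint is hosted as WpnSta/lner-xlm-roberta with a standard token-classification head and BIO-style labels over the entity types listed above; the example sentence and printed fields are illustrative only.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint as a token-classification (NER) pipeline.
ner = pipeline(
    "token-classification",
    model="WpnSta/lner-xlm-roberta",
    aggregation_strategy="simple",  # merge sub-word pieces into whole entity spans
)

text = "Elizabeth walked from the village to the river before Easter."
for entity in ner(text):
    # Each prediction is a dict with the entity group, matched text, and confidence score.
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```

With aggregation_strategy="simple", consecutive sub-word tokens sharing a label are merged, so the output lists whole entity spans (e.g. a PER, a GPE, a LOC, and a TIME mention for the sentence above) rather than individual word pieces.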
Limitations

The model was trained on literary texts from the 18th, 19th, and 20th centuries (see Training Data below). It will perform best on text in the same languages and from the same period; for more recent texts, a model trained on news data will probably perform better.

Training Data

The model was trained on ~620,000 tokens in English, French, and Italian. The following datasets were used:
