---
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
library_name: transformers
pipeline_tag: text-classification
---
# bert-large-relation14

Fine-tuned BERT model for 14-class discourse relation classification. It was introduced in the paper [Automatic Slide Generation Using Discourse Relations](https://link.springer.com/chapter/10.1007/978-3-031-36336-8_61) and first released in this repository. This model is uncased: it does not make a difference between english and English.

In the method proposed in this [paper](https://link.springer.com/chapter/10.1007/978-3-031-36336-8_61), this model is used only to classify the discourse relation between the FIRST and SECOND sentences of the summarized sentences. The relations between the remaining sentence pairs are classified with [this model](https://huggingface.co/teppei727/bert_woco); if you are interested in the proposed method as a whole, see that model card as well.

# Description

This model classifies the discourse relation between a pair of input sentences.
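
The snippet below is a minimal inference sketch. The repository id `teppei727/bert-large-relation14` is an assumption inferred from this card's title and the `teppei727/bert_woco` companion model, and the example sentences are made up:

```python
# Minimal inference sketch; the repository id below is assumed, not confirmed by the card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "teppei727/bert-large-relation14"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

first = "The company reported record profits this quarter."
second = "Its stock price fell sharply after the announcement."

# Encode the two sentences as a single BERT-style sentence pair.
inputs = tokenizer(first, second, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

pred_id = logits.argmax(dim=-1).item()
print(model.config.id2label[pred_id])  # one of the 14 relation labels
```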


The model was fine-tuned from [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the dataset published with the paper [Automatic Prediction of Discourse Connectives](https://arxiv.org/abs/1702.00992).

The dataset is built from English Wikipedia and originally has 20 labels, whereas this model classifies into 14 labels: the 20-class dataset was restructured into 14 classes to suit our research objective of automatic slide generation. The original connectives and their sense labels are listed below (a purely illustrative sketch of such a label collapse follows the table).


|Level 1|Level 2|Level 3|Connectives (20)|
|-------------|-----------------|------------------|--------------------|
| Temporal    | Synchronous     |                  | meanwhile          |
| Temporal    | Asynchronous    | Precedence       | then,              |
| Temporal    | Asynchronous    | Precedence       | finally,           |
| Temporal    | Asynchronous    | Succession       | by then            |
| Contingency | Cause           | Result           | therefore          |
| Comparison  | Concession      | Arg2-as-denier   | however,           |
| Comparison  | Concession      | Arg2-as-denier   | nevertheless       |
| Comparison  | Contrast        |                  | on the other hand, |
| Comparison  | Contrast        |                  | by contrast,       |
| Expansion   | Conjunction     |                  | and                |
| Expansion   | Conjunction     |                  | moreover           |
| Expansion   | Conjunction     |                  | indeed             |
| Expansion   | Equivalence     |                  | in other words     |
| Expansion   | Exception       | Arg1-as-excpt    | otherwise          |
| Expansion   | Instantiation   | Arg2-as-instance | for example,       |
| Expansion   | Level-of-detail | Arg1-as-detail   | overall,           |
| Expansion   | Level-of-detail | Arg2-as-detail   | in particular,     |
| Expansion   | Substitution    | Arg2-as-subst    | instead            |
| Expansion   | Substitution    | Arg2-as-subst    | rather             |
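
The exact 20-to-14 mapping is not listed in this card, so the following is only a hypothetical sketch of how connective-level labels could be collapsed into coarser classes; the groups in `COLLAPSE` are illustrative assumptions, not the authors' mapping:

```python
# Hypothetical illustration only: the card does not publish the exact 20-to-14 mapping.
# This sketch just shows the mechanics of collapsing connective-level labels into
# coarser sense-level classes; the groups below are assumptions.
COLLAPSE = {
    "then,": "Temporal.Asynchronous.Precedence",
    "finally,": "Temporal.Asynchronous.Precedence",
    "however,": "Comparison.Concession",
    "nevertheless": "Comparison.Concession",
    # ... remaining connectives would be mapped analogously
}

def collapse_label(connective_label: str) -> str:
    """Map a connective-level label to its coarser class (identity if unmapped)."""
    return COLLAPSE.get(connective_label, connective_label)
```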

# Training

The model was loaded with `AutoModelForSequenceClassification.from_pretrained` and fine-tuned with the Hugging Face `Trainer`, using the following `TrainingArguments`:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=output_dir,           # output_dir is defined elsewhere in the training script
    save_strategy="epoch",           # checkpoint at the end of every epoch
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    warmup_steps=0,
    weight_decay=0.01,
    logging_dir="./logs",
    evaluation_strategy="epoch",     # evaluate at the end of every epoch
    learning_rate=2e-5,
    metric_for_best_model="f1",      # select the best checkpoint by F1
    load_best_model_at_end=True,
)
```
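
As a sketch (not the authors' exact script), these arguments could be passed to the Hugging Face `Trainer` roughly as follows; `tokenized` is an assumed `DatasetDict` of already tokenized sentence pairs, and the metric function mirrors the macro-averaged scores reported in the Evaluation section:

```python
# Training sketch: `tokenized` is assumed to be a datasets.DatasetDict with "train" and
# "validation" splits tokenized as sentence pairs; the authors' preprocessing is not
# published here.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import AutoModelForSequenceClassification, Trainer

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-uncased", num_labels=14
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,  # matched by metric_for_best_model="f1"
    }

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    compute_metrics=compute_metrics,
)
trainer.train()
```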

# Evaluation

Evaluation on the dataset's test split, for both the 14-label setup and the original 20-label setup, gives:

| Model | Macro F1 | Accuracy | Precision | Recall |
|-------|----------|----------|-----------|--------|
| 14-label classification | 0.586 | 0.589 | 0.630 | 0.591 |
| 20-label classification | 0.478 | 0.488 | 0.536 | 0.488 |
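
Assuming the `trainer` and `tokenized` objects from the training sketch above and a held-out `test` split, numbers like those in the table could be obtained along these lines:

```python
# Hypothetical reproduction of the table above, assuming a "test" split tokenized
# the same way as the training data.
test_metrics = trainer.evaluate(tokenized["test"])
print(test_metrics)  # includes eval_accuracy, eval_precision, eval_recall, eval_f1
```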