Update README.md
README.md
CHANGED

```diff
@@ -232,7 +232,7 @@ import torch
 from transformers import WhisperForConditionalGeneration, WhisperProcessor
 
 #Load the processor and model.
-MODEL_NAME="
+MODEL_NAME="BSC-LT/whisper-bsc-large-v3-cat"
 processor = WhisperProcessor.from_pretrained(MODEL_NAME)
 model = WhisperForConditionalGeneration.from_pretrained(MODEL_NAME).to("cuda")
 
@@ -277,12 +277,12 @@ The specific datasets used to create the model are:
 - [3CatParla](https://huggingface.co/datasets/projecte-aina/3catparla_asr). (Soon to be published)
 - [commonvoice_benchmark_catalan_accents](https://huggingface.co/datasets/projecte-aina/commonvoice_benchmark_catalan_accents)
 - [corts_valencianes](https://huggingface.co/datasets/projecte-aina/corts_valencianes_asr_a) (Only the anonymized version of the dataset is public. We trained the model with the non-anonymized version.)
-- [parlament_parla_v3](https://huggingface.co/datasets/projecte-aina/parlament_parla_v3)
+- [parlament_parla_v3](https://huggingface.co/datasets/projecte-aina/parlament_parla_v3) (Only the anonymized version of the dataset is public. We trained the model with the non-anonymized version.)
 - [IB3](https://huggingface.co/datasets/projecte-aina/ib3_ca_asr) (Soon to be published)
 
 ### Training procedure
 
-This model is the result of
+This model is the result of fine-tuning the model ["openai/whisper-large-v3"](https://huggingface.co/openai/whisper-large-v3) by following this [tutorial](https://github.com/langtech-bsc/whisper_ft_pipeline) provided by [Language Technologies Laboratory](https://huggingface.co/BSC-LT). (Soon to be published)
 
 ### Training Hyperparameters
 
```
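The corrected snippet loads the processor and model by name. Whisper models like this one consume audio in fixed 30-second windows at 16 kHz mono, so longer recordings must be chunked before feeding them to the processor. A minimal sketch of that chunking step, independent of the model itself (the helper name `chunk_audio` and the zero-padding choice are illustrative assumptions, not part of the model card):

```python
import numpy as np

SAMPLE_RATE = 16_000           # Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 30             # Whisper's fixed input window
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def chunk_audio(waveform: np.ndarray) -> list:
    """Split a 1-D waveform into 30 s chunks, zero-padding the last one."""
    chunks = []
    for start in range(0, len(waveform), CHUNK_SAMPLES):
        chunk = waveform[start:start + CHUNK_SAMPLES]
        if len(chunk) < CHUNK_SAMPLES:
            chunk = np.pad(chunk, (0, CHUNK_SAMPLES - len(chunk)))
        chunks.append(chunk)
    return chunks


# Example: 70 s of audio yields three 30 s chunks, the last one padded.
audio = np.zeros(70 * SAMPLE_RATE, dtype=np.float32)
chunks = chunk_audio(audio)
print(len(chunks))  # → 3
```

Each chunk would then go through the usual calls from the README's snippet: `processor(chunk, sampling_rate=16_000, return_tensors="pt")` to build input features, followed by `model.generate(...)` and `processor.batch_decode(...)` to obtain the transcription.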