I am trying to train a Machine Translation model from HuggingFace (t5-large) using the europarl_bilingual dataset.
When I run the code on my Linux GPU machine, it proceeds without error until it reaches the model loading step:
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer
model_checkpoint = "t5-large"  # the t5-large checkpoint mentioned above
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
The download itself completes, but instantiating the model then crashes with:
Segmentation fault (core dumped)
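If it helps, this is the smallest script I would expect to trigger the same crash, with no dataset or training code involved (I have not yet tried it in a fresh session, so take it as an isolation sketch):
# Minimal isolation test: only the import and the model load.
# If this alone segfaults, the europarl_bilingual preprocessing
# and the Seq2SeqTrainer setup can be ruled out as the cause.
from transformers import AutoModelForSeq2SeqLM
model = AutoModelForSeq2SeqLM.from_pretrained("t5-large")
print(type(model))  # only reached if loading succeeds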
I faced the same error earlier while importing the tokenizer, and I found a solution here: ImportError when from transformers import BertTokenizer. It suggested downgrading the tokenizers package:
conda install -c huggingface tokenizers=0.10.1 transformers=4.6.1
I tried keeping tokenizers at 0.10.1 and upgrading transformers to the latest version available for Linux (4.11.3), but I hit the same segmentation fault.
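For reference, this is how I confirm which versions are actually active at runtime (both packages expose a __version__ attribute, which can differ from what conda reports if environments get mixed):
import transformers
import tokenizers
# Print the versions actually loaded by the interpreter.
print("transformers:", transformers.__version__)
print("tokenizers:", tokenizers.__version__)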
Do you have any idea what might be happening or how to fix it?
Thanks,