Main library for working with transformer models (BERT, GPT, etc.).
See this page for more details:
https://huggingface.co/docs/transformers/en/installation
Install transformers:
$ pip install transformers
To install Transformers with PyTorch as the backend (PyTorch runs on the CPU if no CUDA-capable GPU is available), run:
$ pip install 'transformers[torch]'
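To verify that PyTorch was installed and check whether it can see a GPU (a quick sanity check, not part of the official install steps):
$ python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"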
To test whether the installation succeeded, run the following command; it should print a label and a score for the provided text:
$ python3 -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('hugging face is the best'))"
[{'label': 'POSITIVE', 'score': 0.999839186668396}]
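Calling pipeline() without an explicit model falls back to a default checkpoint for the task and prints a warning; pinning the model keeps results reproducible. A minimal sketch (the checkpoint below is the usual default for this task, but any sentiment-analysis model on the Hub works):
from transformers import pipeline

# Pin an explicit checkpoint instead of relying on the task default
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("hugging face is the best"))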
Example: loading a model and tokenizer with Hugging Face Transformers and inspecting them:
$ vi huggingface-llm.py
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download (on first run) and load the model weights and tokenizer
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")

# Print the tokenizer configuration, then the model architecture
print(tokenizer)
print(model)
Run the Python script:
$ python3 huggingface-llm.py
Output:
GPT2TokenizerFast(
    name_or_path='microsoft/DialoGPT-small',
    vocab_size=50257,
    model_max_length=1024,
    is_fast=True,
    padding_side='right',
    truncation_side='right',
    special_tokens={'bos_token': '<|endoftext|>', 'eos_token': '<|endoftext|>', 'unk_token': '<|endoftext|>'},
    clean_up_tokenization_spaces=True,
    added_tokens_decoder={50256: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True, special=True)}
)
GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    ...
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)
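The script above only prints the architecture. To actually generate a reply with DialoGPT, encode a prompt, call model.generate(), and decode the new tokens. A minimal sketch following the pattern from the DialoGPT model card (the prompt and generation settings are illustrative, not tuned):
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")

# DialoGPT expects the EOS token appended to each conversation turn
input_ids = tokenizer.encode("Hello, how are you?" + tokenizer.eos_token, return_tensors="pt")

# Generate up to 100 tokens total; pad with EOS since GPT-2 has no pad token
output_ids = model.generate(input_ids, max_length=100, pad_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens (everything after the prompt)
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))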