Fast and easy text generation in any language with the Huggingface framework
As part of the "Machine Learning. Advanced" course, we have prepared a translation of an interesting article.
Introduction
Text generation is one of the most exciting applications of Natural Language Processing (NLP). In recent years, large language models such as GPT-3 have shown that machines can produce remarkably fluent, human-like text. In this article we will see how easy it is to generate text yourself, in almost any language.
Here we will use GPT-2, the predecessor of GPT-3, through the Transformers library from Huggingface. If you want a closer look at how GPT-2 itself works, see, for example: GPT2 Pytorch.
With GPT-2 and the Transformers pipeline, generating text takes just a few lines of code! The plan is as follows.
Step 1: Install the library
Step 2: Import the pipeline
Step 3: Create a text generation pipeline
Step 4: Define the prefix text
Step 5: Generate the text
Let's get started!
Step 1: Install the library
Huggingface Transformers runs on top of a deep learning framework, either PyTorch or TensorFlow; here we will use PyTorch, so install it first if you have not already.
Once PyTorch is in place, install Huggingface Transformers with:
pip install transformers
Step 2: Import the pipeline
From Transformers we only need to import the pipeline function:
from transformers import pipeline
The pipeline abstraction takes care of loading the model and tokenizer and of all the pre- and post-processing, which is what makes the rest of this tutorial so short.
Step 3: Create a text generation pipeline
Now create a pipeline for the text generation task:
text_generation = pipeline("text-generation")
By default this pipeline downloads and uses GPT-2, which is exactly the model we want.
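If you prefer to pin the checkpoint explicitly rather than rely on the default, the pipeline also accepts a model argument; a minimal sketch (the name "gpt2" is the standard hub identifier for the default model):

```python
from transformers import pipeline

# Equivalent to the default, but with the checkpoint named explicitly.
text_generation = pipeline("text-generation", model="gpt2")
print(text_generation.model.config.model_type)  # -> gpt2
```

Any causal language model from the Huggingface model hub could be substituted here, which is exactly what we will do in the bonus section below.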
Step 4: Define the prefix text
The model generates a continuation of the text we feed it, so we need a prefix (prompt) to start from. For example:
The world is
prefix_text = "The world is"
Step 5: Generate the text
Everything is ready, we can finally generate! It takes a single call:
generated_text = text_generation(prefix_text, max_length=50, do_sample=False)[0]
print(generated_text['generated_text'])
The max_length parameter limits the length of the output, here to 50 tokens. The result:
The world is a better place if you’re a good person.
I’m not saying that you should be a bad person. I’m saying that you should be a good person.
I’m not saying that you should be a bad
As you can see, given the prefix "The world is" the model produced a fairly coherent continuation. Notice, however, that with greedy decoding (do_sample=False) the output quickly starts repeating itself. To get more varied and interesting text, you can enable sampling (for example, top-k/top-p sampling) and tune the decoding parameters. For the full list of generation options, see the Huggingface documentation for TextGenerationPipeline.
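As an illustration of the sampling options just mentioned, here is one possible call with top-k/top-p sampling enabled; the particular values 50 and 0.95 are common defaults chosen for this sketch, not prescribed by the article:

```python
from transformers import pipeline, set_seed

set_seed(42)  # sampling is random; fix the seed for reproducibility
text_generation = pipeline("text-generation", model="gpt2")

# do_sample=True switches from greedy decoding to sampling;
# top_k keeps only the 50 most likely next tokens, and top_p further
# restricts them to the smallest set covering 95% of probability mass.
result = text_generation(
    "The world is",
    max_length=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)[0]
print(result["generated_text"])
```

Rerunning with a different seed (or none at all) gives a different continuation each time, which is exactly the point of sampling.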
Bonus: text generation in other languages
English is by far the best-resourced language in NLP, and for other languages good pretrained models can be much harder to come by. Fortunately, the Huggingface model hub hosts community-trained models for a surprising number of languages, so often all it takes is swapping in a different model and tokenizer.
Let's generate some Chinese text as an example. We will use the Chinese GPT2 model published by CKIPLab, which requires two extra imports:
from transformers import BertTokenizerFast, AutoModelWithLMHead
Load the tokenizer and the model:
tokenizer = BertTokenizerFast.from_pretrained('bert-base-chinese')
model = AutoModelWithLMHead.from_pretrained('ckiplab/gpt2-base-chinese')
Then pass them to the pipeline:
text_generation = pipeline("text-generation", model=model, tokenizer=tokenizer)
From here everything works exactly as before. Take a Chinese prefix, for example this one, meaning roughly "I want to go":
我 想 要 去
prefix_text = "我 想 要 去"
And generate the text just as in step 5:
generated_text = text_generation(prefix_text, max_length=50, do_sample=False)[0]
print(generated_text['generated_text'])
The output:
我 想 要 去 看 看 。 」 他 說 : 「 我 們 不 能 說, 我 們 不 能 說, 我 們 不 能 說, 我 們 不 能 說, 我 們 不 能 說, 我 們 不 能 說, 我 們
("I want to go and have a look." He said: "We cannot say, we cannot say, we cannot say, …")
Again, greedy decoding makes the model repeat itself after a while, but the text it produces is perfectly valid Chinese.
As before, enabling sampling and tuning the generation parameters will give more varied results.
That's it! Thanks to Huggingface's high-level pipeline API, text generation really does take only a few lines of code. For reference, here is the complete English example as run in Jupyter:
In [1]:
from transformers import pipeline
In [ ]:
text_generation = pipeline("text-generation")
In [7]:
prefix_text = "The world is"
In [8]:
generated_text= text_generation(prefix_text, max_length=50, do_sample=False)[0]
print(generated_text['generated_text'])
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The world is a better place if you're a good person.
I'm not saying that you should be a bad person. I'm saying that you should be a good person.
I'm not saying that you should be a bad
That's all! You can now generate text in almost any language in just a few lines of code. Text generation is a very active area of NLP research, so if you want better results, experiment with different models and decoding strategies; the references below are a good place to start.
References:
Brown, Tom B., et al. “Language models are few-shot learners.” arXiv preprint arXiv:2005.14165 (2020).
Radford, Alec, et al. “Language models are unsupervised multitask learners.” OpenAI blog 1.8 (2019): 9.
Transformers Github, Huggingface
Transformers Official Documentation, Huggingface
Pytorch Official Website, Facebook AI Research
Fan, Angela, Mike Lewis, and Yann Dauphin. “Hierarchical neural story generation.” arXiv preprint arXiv:1805.04833 (2018).
Welleck, Sean, et al. “Neural text generation with unlikelihood training.” arXiv preprint arXiv:1908.04319 (2019).
CKIPLab Transformers Github, Chinese Knowledge and Information Processing at the Institute of Information Science and the Institute of Linguistics of Academia Sinica