
#TokenIzer

9 posts


Splitting with 's

Reddit r/learnpython·u/possiblypossums44·about 1 month ago

I have a question about splitting words that contain an apostrophe. I want to split an English text into words so that contractions like 'they're' or 'I'm' are recognized as one word and stay together. I also want words connected with a hyphen to stay together.…
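One possible approach to the question above, offered as a minimal sketch rather than the answer given in the thread, is a regular expression that only treats an apostrophe or hyphen as part of a word when it sits between two alphanumeric runs. The function name `split_words` and the sample sentence are made up for illustration, and the pattern assumes plain ASCII letters and digits.

```python
import re

# Assumption: words are runs of ASCII letters/digits, optionally joined
# by a single apostrophe (straight or curly) or hyphen between two runs.
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*")


def split_words(text: str) -> list[str]:
    """Return the words in text, keeping contractions and hyphenated words whole."""
    return WORD_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "They're sure I'm well-known, aren't they?"
    print(split_words(sample))
```

Because the joiner must be followed by another alphanumeric run, quotation marks around a word and trailing punctuation are dropped, while "they're" and "well-known" stay intact. Text with accented letters would need a wider character class.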


Using hf tokenizers in Rust

DEV Community·Wayne·about 1 month ago

Master Rust tokenizers with Hugging Face's library. Learn to implement text tokenization and encoding/decoding, and to work with pretrained models like GPT-2, BERT, and Llama for NLP applications. The article uses the from_pretrained method.
