Class: DiscourseAi::Tokenizer::BgeM3Tokenizer
- Inherits: BasicTokenizer
  - Object
  - BasicTokenizer
  - DiscourseAi::Tokenizer::BgeM3Tokenizer
- Defined in: lib/discourse_ai/tokenizer/bge_m3_tokenizer.rb
Overview
Tokenizer used by bge-m3, a capable multilingual, long-context embeddings model.
Class Method Summary
Methods inherited from BasicTokenizer
available_llm_tokenizers, below_limit?, decode, encode, size, tokenize, truncate
Class Method Details
.tokenizer ⇒ Object
    # File 'lib/discourse_ai/tokenizer/bge_m3_tokenizer.rb', line 7

    def self.tokenizer
      @tokenizer ||=
        ::Tokenizers.from_file(
          DiscourseAi::Tokenizers.vendor_path("bge-m3.json")
        )
    end
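The `.tokenizer` method above relies on `||=` to memoize the loaded tokenizer at the class level, so the JSON file is presumably parsed only on the first call. The following is a minimal sketch of that pattern, not the plugin's actual code: `WhitespaceBackend`, `SketchBasicTokenizer`, `SketchBgeM3Tokenizer`, and `load_count` are hypothetical names introduced for illustration, the backend splits on whitespace instead of loading a real tokenizer file, and the way the base class helper `size` delegates to `tokenizer` is an assumption about how inherited methods could build on it.

```ruby
# Hypothetical stand-in for the object returned by ::Tokenizers.from_file.
# It splits on whitespace; the real bge-m3 tokenizer does not.
class WhitespaceBackend
  def initialize(path)
    @path = path
  end

  def tokens(text)
    text.split
  end
end

# Hypothetical base class: shared helpers call `tokenizer`, which each
# subclass is expected to supply (an assumption, not the plugin's code).
class SketchBasicTokenizer
  def self.tokenizer
    raise NotImplementedError, "subclasses must define .tokenizer"
  end

  def self.size(text)
    tokenizer.tokens(text).size
  end
end

class SketchBgeM3Tokenizer < SketchBasicTokenizer
  class << self
    attr_reader :load_count # counts how often the backend is built
  end
  @load_count = 0

  def self.tokenizer
    # `||=` evaluates the right-hand side only while @tokenizer is nil or
    # false, so the backend is built once and then reused.
    @tokenizer ||= begin
      @load_count += 1
      WhitespaceBackend.new("bge-m3.json")
    end
  end
end

first = SketchBgeM3Tokenizer.tokenizer
second = SketchBgeM3Tokenizer.tokenizer
puts first.equal?(second)            # true: the same object is returned
puts SketchBgeM3Tokenizer.load_count # 1: the backend was built once
puts SketchBgeM3Tokenizer.size("hello multilingual world") # 3
```

Because `@tokenizer` here is an instance variable of the class object itself, the cached value is shared by every caller in the process. Note that `||=` on its own is not synchronized, so two threads making the first call at the same time could each build a backend.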