Class: DiscourseAi::Tokenizer::MultilingualE5LargeTokenizer
- Inherits: BasicTokenizer
  - Object
  - BasicTokenizer
  - DiscourseAi::Tokenizer::MultilingualE5LargeTokenizer
- Defined in: lib/discourse_ai/tokenizer/multilingual_e5_large_tokenizer.rb
Overview
Tokenizer for multilingual-e5-large, the first multilingual embeddings model used in Discourse.
Class Method Summary
Methods inherited from BasicTokenizer
available_llm_tokenizers, below_limit?, decode, encode, size, tokenize, truncate
Class Method Details
.tokenizer ⇒ Object
# File 'lib/discourse_ai/tokenizer/multilingual_e5_large_tokenizer.rb', line 7

def self.tokenizer
  @tokenizer ||=
    ::Tokenizers.from_file(
      DiscourseAi::Tokenizers.vendor_path("multilingual-e5-large.json")
    )
end
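The `.tokenizer` method above relies on class-level memoization with `||=`, so the vendored JSON file is parsed once and the resulting object is reused on later calls. The sketch below illustrates that pattern in isolation. `FakeLoader` and `ExampleTokenizer` are hypothetical stand-ins invented for this example, not part of the library, and the loader only counts its calls instead of reading a real tokenizer file.

```ruby
# Hypothetical stand-in for an expensive loader such as ::Tokenizers.from_file.
# It records how many times it was invoked so the caching is observable.
class FakeLoader
  class << self
    attr_reader :calls

    def from_file(path)
      @calls = (@calls || 0) + 1
      "tokenizer loaded from #{path}"
    end
  end
end

# Hypothetical class mirroring the shape of the documented .tokenizer method.
class ExampleTokenizer
  # Class-level memoization: the loader runs on the first call only,
  # and every later call returns the cached object.
  def self.tokenizer
    @tokenizer ||= FakeLoader.from_file("example.json")
  end
end

first = ExampleTokenizer.tokenizer
second = ExampleTokenizer.tokenizer

puts first.equal?(second) # same object both times
puts FakeLoader.calls     # number of loader invocations
```

Note that plain `||=` is not synchronized, so two threads making the very first call at the same moment could each run the loader once. Whether that matters depends on how the application boots.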