Class: DiscourseAi::Tokenizer::BgeLargeEnTokenizer
- Inherits: BasicTokenizer
  - Object
    - BasicTokenizer
      - DiscourseAi::Tokenizer::BgeLargeEnTokenizer
- Defined in: lib/discourse_ai/tokenizer/bge_large_en_tokenizer.rb
Overview
Tokenizer for bge-large-en-v1.5, the embeddings model most commonly used with Discourse.
Class Method Summary
Methods inherited from BasicTokenizer
available_llm_tokenizers, below_limit?, decode, encode, size, tokenize, truncate
Class Method Details
.tokenizer ⇒ Object
# File 'lib/discourse_ai/tokenizer/bge_large_en_tokenizer.rb', line 7
def self.tokenizer
  @tokenizer ||=
    ::Tokenizers.from_file(
      DiscourseAi::Tokenizers.vendor_path("bge-large-en.json")
    )
end
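
The `@tokenizer ||=` idiom in .tokenizer caches the loaded tokenizer on the class, so the vocabulary file should only be read on the first call. The sketch below illustrates that pattern in isolation. `FakeLoader` and `ExampleTokenizer` are hypothetical stand-ins invented for this example; they are not part of DiscourseAi or the tokenizers gem, and the real loader returns a tokenizer object rather than a string.

```ruby
# Hypothetical stand-in for ::Tokenizers.from_file.
# It counts how many times a "load" actually happens.
module FakeLoader
  @loads = 0

  class << self
    attr_reader :loads

    def from_file(path)
      @loads += 1
      "tokenizer loaded from #{path}"
    end
  end
end

# Hypothetical class mirroring the shape of .tokenizer.
class ExampleTokenizer
  def self.tokenizer
    # Inside a class method, @tokenizer is an instance variable of the class
    # object itself, so the value is cached per class and the loader runs
    # only while @tokenizer is still nil.
    @tokenizer ||= FakeLoader.from_file("bge-large-en.json")
  end
end

first = ExampleTokenizer.tokenizer
second = ExampleTokenizer.tokenizer
```

Both calls return the same object, and the stand-in loader is invoked once. Because the cache lives on the class object, a subclass calling an inherited method of this shape would keep its own separate cached value.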