Class: Langchain::LLM::LlamaCpp

Inherits:
Base
  • Object
Defined in:
lib/langchain/llm/llama_cpp.rb

Overview

A wrapper around the LlamaCpp.rb library

Gem requirements:

gem "llama_cpp"

Usage:

llama = Langchain::LLM::LlamaCpp.new(
  model_path: ENV["LLAMACPP_MODEL_PATH"],
  n_gpu_layers: Integer(ENV["LLAMACPP_N_GPU_LAYERS"]),
  n_threads: Integer(ENV["LLAMACPP_N_THREADS"])
)
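
Once constructed, the instance exposes #complete and #embed (documented below). Continuing the snippet above, a minimal sketch (the prompt text and token count are illustrative):

llama.complete(prompt: "The capital of France is", n_predict: 32)
# => a String containing the generated continuation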

Instance Attribute Summary

Attributes inherited from Base

#client

Instance Method Summary

Methods inherited from Base

#chat, #default_dimensions, #summarize

Methods included from DependencyHelper

#depends_on

Constructor Details

#initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: 0) ⇒ LlamaCpp

Returns a new instance of LlamaCpp.

Parameters:

  • model_path (String)

    The path to the model to use

  • n_gpu_layers (Integer) (defaults to: 1)

    The number of GPU layers to use

  • n_ctx (Integer) (defaults to: 2048)

    The number of context tokens to use

  • n_threads (Integer) (defaults to: 1)

    The number of CPU threads to use

  • seed (Integer) (defaults to: 0)

    The seed to use



# File 'lib/langchain/llm/llama_cpp.rb', line 25

def initialize(model_path:, n_gpu_layers: 1, n_ctx: 2048, n_threads: 1, seed: 0)
  depends_on "llama_cpp"

  @model_path = model_path
  @n_gpu_layers = n_gpu_layers
  @n_ctx = n_ctx
  @n_threads = n_threads
  @seed = seed
end
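
As a sketch, an instance with a larger context window and a fixed seed could be built like this (the model path and values are illustrative):

llama = Langchain::LLM::LlamaCpp.new(
  model_path: "/path/to/model.gguf", # illustrative path
  n_ctx: 4096,
  n_threads: 4,
  seed: 42
)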

Instance Attribute Details

#model_path ⇒ Object

Returns the value of attribute model_path.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def model_path
  @model_path
end

#n_ctx ⇒ Object

Returns the value of attribute n_ctx.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def n_ctx
  @n_ctx
end

#n_gpu_layers ⇒ Object

Returns the value of attribute n_gpu_layers.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def n_gpu_layers
  @n_gpu_layers
end

#n_threads=(value) ⇒ Object

Sets the attribute n_threads

Parameters:

  • value

    the value to set the attribute n_threads to.



# File 'lib/langchain/llm/llama_cpp.rb', line 18

def n_threads=(value)
  @n_threads = value
end
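
The writer allows adjusting the thread count after construction, for example:

llama.n_threads = 8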

#seed ⇒ Object

Returns the value of attribute seed.



# File 'lib/langchain/llm/llama_cpp.rb', line 17

def seed
  @seed
end

Instance Method Details

#complete(prompt:, n_predict: 128) ⇒ String

Returns the completed prompt.

Parameters:

  • prompt (String)

    The prompt to complete

  • n_predict (Integer) (defaults to: 128)

    The number of tokens to predict

Returns:

  • (String)

    The completed prompt



# File 'lib/langchain/llm/llama_cpp.rb', line 51

def complete(prompt:, n_predict: 128)
  # contexts do not appear to be stateful when it comes to completion, so re-use the same one
  context = completion_context
  ::LLaMACpp.generate(context, prompt, n_predict: n_predict)
end
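
A minimal sketch of calling #complete on a constructed instance (the prompt and n_predict values are illustrative):

result = llama.complete(prompt: "Write a haiku about Ruby:", n_predict: 64)
result # => the generated text as a String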

#embed(text:) ⇒ Langchain::LLM::LlamaCppResponse

Returns a response object wrapping the embedding.

Parameters:

  • text (String)

    The text to embed

Returns:

  • (Langchain::LLM::LlamaCppResponse)

    A response object wrapping the embedding



# File 'lib/langchain/llm/llama_cpp.rb', line 37

def embed(text:)
  # contexts are kinda stateful when it comes to embeddings, so allocate one each time
  context = embedding_context

  embedding_input = @model.tokenize(text: text, add_bos: true)
  return unless embedding_input.size.positive?

  context.eval(tokens: embedding_input, n_past: 0)
  Langchain::LLM::LlamaCppResponse.new(context, model: context.model.desc)
end
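
A minimal sketch of calling #embed; the accessor used to read the vector off the returned response object is an assumption, not confirmed by this page:

response = llama.embed(text: "Ruby is a programmer's best friend")
# The embedding vector is expected to be readable from the response,
# e.g. response.embedding (assumed accessor).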