Class: HTM

Inherits:

Object


Defined in: lib/htm.rb,
lib/htm/errors.rb,
lib/htm/railtie.rb,
lib/htm/sinatra.rb,
lib/htm/version.rb,
lib/htm/database.rb,
lib/htm/models/tag.rb,
lib/htm/job_adapter.rb,
lib/htm/models/node.rb,
lib/htm/tag_service.rb,
lib/htm/models/robot.rb,
lib/htm/configuration.rb,
lib/htm/working_memory.rb,
lib/htm/models/node_tag.rb,
lib/htm/long_term_memory.rb,
lib/htm/embedding_service.rb,
lib/htm/active_record_config.rb,
lib/htm/jobs/generate_tags_job.rb,
lib/htm/jobs/generate_embedding_job.rb

Overview

HTM error classes

Defined Under Namespace

Modules: JobAdapter, Jobs, Models, Sinatra

Classes: ActiveRecordConfig, AuthorizationError, CircuitBreakerOpenError, Configuration, Database, DatabaseError, EmbeddingError, EmbeddingService, Error, LongTermMemory, NotFoundError, QueryTimeoutError, Railtie, ResourceExhaustedError, TagError, TagService, ValidationError, WorkingMemory

Constant Summary

Validation constants

MAX_KEY_LENGTH =

MAX_VALUE_LENGTH =

1_000_000 # 1MB

MAX_ARRAY_SIZE =

VALID_RECALL_STRATEGIES =

[:vector, :fulltext, :hybrid].freeze

VERSION =

"0.0.1"

Class Attribute Summary

.configuration ⇒ HTM::Configuration

Get current configuration.

Instance Attribute Summary

#long_term_memory ⇒ Object readonly

Returns the value of attribute long_term_memory.
#robot_id ⇒ Object readonly

Returns the value of attribute robot_id.
#robot_name ⇒ Object readonly

Returns the value of attribute robot_name.
#working_memory ⇒ Object readonly

Returns the value of attribute working_memory.

Class Method Summary

.configure {|config| ... } ⇒ Object

Configure HTM.
.count_tokens(text) ⇒ Integer

Count tokens using configured counter.
.embed(text) ⇒ Array<Float>

Generate embedding using EmbeddingService.
.extract_tags(text, existing_ontology: []) ⇒ Array<String>

Extract tags using TagService.
.logger ⇒ Logger

Get configured logger.
.reset_configuration! ⇒ Object

Reset configuration to defaults.

Instance Method Summary

#forget(node_id, confirm: false) ⇒ Boolean

Forget a memory node (explicit deletion).
#initialize(working_memory_size: 128_000, robot_name: nil, db_config: nil, db_pool_size: 5, db_query_timeout: 30_000, db_cache_size: 1000, db_cache_ttl: 300) ⇒ HTM constructor

Initialize a new HTM instance.
#recall(topic, timeframe: nil, limit: 20, strategy: :vector, with_relevance: false, query_tags: [], raw: false) ⇒ Array<String>, Array<Hash>

Recall memories from a timeframe and topic.
#remember(content, source: "", tags: []) ⇒ Integer

Remember new information.

Constructor Details

#initialize(working_memory_size: 128_000, robot_name: nil, db_config: nil, db_pool_size: 5, db_query_timeout: 30_000, db_cache_size: 1000, db_cache_ttl: 300) ⇒ `HTM`

Initialize a new HTM instance

Parameters:

working_memory_size (Integer) (defaults to: 128_000) —

Maximum tokens for working memory (default: 128,000)
robot_name (String) (defaults to: nil) —

Human-readable name for this robot (auto-generated if not provided)
db_config (Hash) (defaults to: nil) —

Database configuration (uses ENV if not provided)
db_pool_size (Integer) (defaults to: 5) —

Database connection pool size (default: 5)
db_query_timeout (Integer) (defaults to: 30_000) —

Database query timeout in milliseconds (default: 30000)
db_cache_size (Integer) (defaults to: 1000) —

Number of database query results to cache (default: 1000, use 0 to disable)
db_cache_ttl (Integer) (defaults to: 300) —

Database cache TTL in seconds (default: 300)

# File 'lib/htm.rb', line 68

def initialize(
  working_memory_size: 128_000,
  robot_name: nil,
  db_config: nil,
  db_pool_size: 5,
  db_query_timeout: 30_000,
  db_cache_size: 1000,
  db_cache_ttl: 300
)
  # Establish ActiveRecord connection if not already connected
  HTM::ActiveRecordConfig.establish_connection! unless HTM::ActiveRecordConfig.connected?

  @robot_name = robot_name || "robot_#{SecureRandom.uuid[0..7]}"

  # Initialize components
  @working_memory = HTM::WorkingMemory.new(max_tokens: working_memory_size)
  @long_term_memory = HTM::LongTermMemory.new(
    db_config || HTM::Database.default_config,
    pool_size: db_pool_size,
    query_timeout: db_query_timeout,
    cache_size: db_cache_size,
    cache_ttl: db_cache_ttl
  )

  # Register this robot in the database and get its integer ID
  @robot_id = register_robot
end
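
As a rough illustration of the `robot_name` fallback in the constructor above, the sketch below reproduces only the name-generation expression, outside of HTM. `default_robot_name` is a hypothetical helper introduced for this example, not part of the library.

```ruby
require 'securerandom'

# Hypothetical helper mirroring the constructor's fallback expression:
# a caller-supplied name wins; otherwise the name is "robot_" plus the
# first 8 characters of a random UUID.
def default_robot_name(robot_name = nil)
  robot_name || "robot_#{SecureRandom.uuid[0..7]}"
end

puts default_robot_name("archivist")
puts default_robot_name
```

Because the generated suffix is random, two instances created without a name should be expected to register under different robot names.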

Class Attribute Details

.configuration ⇒ `HTM::Configuration`

Get current configuration

Returns:

(HTM::Configuration)




# File 'lib/htm/configuration.rb', line 274

def configuration
  @configuration ||= Configuration.new
end

Instance Attribute Details

#long_term_memory ⇒ `Object` (readonly)

Returns the value of attribute long_term_memory.




# File 'lib/htm.rb', line 49

def long_term_memory
  @long_term_memory
end

#robot_id ⇒ `Object` (readonly)

Returns the value of attribute robot_id.




# File 'lib/htm.rb', line 49

def robot_id
  @robot_id
end

#robot_name ⇒ `Object` (readonly)

Returns the value of attribute robot_name.




# File 'lib/htm.rb', line 49

def robot_name
  @robot_name
end

#working_memory ⇒ `Object` (readonly)

Returns the value of attribute working_memory.




# File 'lib/htm.rb', line 49

def working_memory
  @working_memory
end

Class Method Details

.configure {|config| ... } ⇒ `Object`

Configure HTM

Examples:

Custom configuration

HTM.configure do |config|
  config.embedding_generator = ->(text) { MyEmbedder.embed(text) }
  config.tag_extractor = ->(text, ontology) { MyTagger.extract(text, ontology) }
end

Default configuration

HTM.configure  # Uses RubyLLM defaults

Yields:

(config) —

Configuration object

Yield Parameters:

config (HTM::Configuration)

# File 'lib/htm/configuration.rb', line 292

def configure
  yield(configuration) if block_given?
  configuration.validate!
  configuration
end
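
The memoize, yield, validate flow used by `.configuration`, `.configure` and `.reset_configuration!` can be sketched with a stand-in module. Everything named `DemoMemory` below is hypothetical, and its word-count default and `validate!` rule are assumptions made for the example; they are not HTM's actual defaults or validation.

```ruby
# Stand-in module (not HTM itself) showing the memoize/yield/validate flow.
module DemoMemory
  class Configuration
    attr_accessor :token_counter

    def initialize
      # Assumed default for this sketch: count whitespace-separated words.
      @token_counter = ->(text) { text.split.size }
    end

    # Assumed rule for this sketch: the counter must respond to #call.
    def validate!
      raise ArgumentError, "token_counter must be callable" unless token_counter.respond_to?(:call)
    end
  end

  class << self
    # Lazily builds one shared configuration object.
    def configuration
      @configuration ||= Configuration.new
    end

    # Yields the shared object, validates it, and returns it.
    def configure
      yield(configuration) if block_given?
      configuration.validate!
      configuration
    end

    # Discards any customisation by replacing the shared object.
    def reset_configuration!
      @configuration = Configuration.new
    end
  end
end

DemoMemory.configure { |c| c.token_counter = ->(text) { text.length } }
puts DemoMemory.configuration.token_counter.call("abcd")   # custom counter in effect
DemoMemory.reset_configuration!
puts DemoMemory.configuration.token_counter.call("a b c")  # sketch default restored
```

Note that validation runs even when no block is given, so calling the method with no arguments still checks the current settings.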

.count_tokens(text) ⇒ `Integer`

Count tokens using configured counter

Parameters:

text (String) —

Text to count tokens for

Returns:

(Integer) —

Token count

# File 'lib/htm/configuration.rb', line 328

def count_tokens(text)
  configuration.token_counter.call(text)
rescue StandardError => e
  raise HTM::ValidationError, "Token counting failed: #{e.message}"
end
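
The rescue clause above means that any `StandardError` raised by the configured counter surfaces as a validation error with a prefixed message. A minimal sketch of that wrapping, using a stand-in error class and a hypothetical `count_tokens_with` function rather than HTM's own code:

```ruby
# Stand-in error class for this sketch; HTM raises HTM::ValidationError.
class DemoValidationError < StandardError; end

# Hypothetical function with the same rescue-and-re-raise shape as above.
def count_tokens_with(counter, text)
  counter.call(text)
rescue StandardError => e
  raise DemoValidationError, "Token counting failed: #{e.message}"
end

word_counter   = ->(text) { text.split.size }
broken_counter = ->(_text) { raise "tokenizer unavailable" }

puts count_tokens_with(word_counter, "hello big world")

begin
  count_tokens_with(broken_counter, "hello")
rescue DemoValidationError => e
  puts e.message
end
```

Callers that want to distinguish a failing tokenizer from other problems would therefore need to inspect the message, since the original exception class is not preserved.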

.embed(text) ⇒ `Array<Float>`

Generate embedding using EmbeddingService

Parameters:

text (String) —

Text to embed

Returns:

(Array<Float>) —

Embedding vector (original, not padded)

# File 'lib/htm/configuration.rb', line 308

def embed(text)
  result = HTM::EmbeddingService.generate(text)
  result[:embedding]
end

.extract_tags(text, existing_ontology: []) ⇒ `Array<String>`

Extract tags using TagService

Parameters:

text (String) —

Text to analyze
existing_ontology (Array<String>) (defaults to: []) —

Sample of existing tags for context

Returns:

(Array<String>) —

Extracted and validated tag names




# File 'lib/htm/configuration.rb', line 319

def extract_tags(text, existing_ontology: [])
  HTM::TagService.extract(text, existing_ontology: existing_ontology)
end

.logger ⇒ `Logger`

Get configured logger

Returns:

(Logger) —

Configured logger instance




# File 'lib/htm/configuration.rb', line 338

def logger
  configuration.logger
end

.reset_configuration! ⇒ `Object`

Reset configuration to defaults




# File 'lib/htm/configuration.rb', line 299

def reset_configuration!
  @configuration = Configuration.new
end

Instance Method Details

#forget(node_id, confirm: false) ⇒ `Boolean`

Forget a memory node (explicit deletion)

Parameters:

node_id (Integer) —

ID of the node to delete
confirm (Symbol) (defaults to: false) —

Must be :confirmed to proceed

Returns:

(Boolean) —

true if deleted

Raises:

(ArgumentError) —

if confirmation not provided
(HTM::NotFoundError) —

if node doesn’t exist

# File 'lib/htm.rb', line 251

def forget(node_id, confirm: false)
  # Validate inputs
  raise ArgumentError, "node_id cannot be nil" if node_id.nil?
  raise ArgumentError, "Must pass confirm: :confirmed to delete" unless confirm == :confirmed

  # Verify node exists
  unless @long_term_memory.exists?(node_id)
    raise HTM::NotFoundError, "Node not found: #{node_id}"
  end

  # Delete the node and remove from working memory
  @long_term_memory.delete(node_id)
  @working_memory.remove(node_id)

  update_robot_activity
  true
end
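
The order of the guards in `#forget` matters for callers: the confirmation check runs before the existence check, and only the symbol `:confirmed` passes it. The sketch below copies that guard order onto a hypothetical in-memory store (`DemoStore` and `DemoNotFoundError` are stand-ins invented for the example).

```ruby
# Stand-in error class; HTM raises HTM::NotFoundError here.
class DemoNotFoundError < StandardError; end

# Hypothetical in-memory store that copies only the guard order of #forget.
class DemoStore
  def initialize(nodes)
    @nodes = nodes # Hash of node_id => content
  end

  def forget(node_id, confirm: false)
    raise ArgumentError, "node_id cannot be nil" if node_id.nil?
    raise ArgumentError, "Must pass confirm: :confirmed to delete" unless confirm == :confirmed
    raise DemoNotFoundError, "Node not found: #{node_id}" unless @nodes.key?(node_id)

    @nodes.delete(node_id)
    true
  end
end

store = DemoStore.new({ 1 => "PostgreSQL is great for HTM" })

begin
  store.forget(1, confirm: true) # a plain true is rejected
rescue ArgumentError => e
  puts e.message
end

puts store.forget(1, confirm: :confirmed)
```

A second call with the same ID would raise the not-found error, because the node has already been removed.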

#recall(topic, timeframe: nil, limit: 20, strategy: :vector, with_relevance: false, query_tags: [], raw: false) ⇒ `Array<String>`, `Array<Hash>`

Recall memories from a timeframe and topic

Examples:

Basic usage (returns content strings)

memories = htm.recall("PostgreSQL")
# => ["PostgreSQL is great for time-series data", "PostgreSQL with TimescaleDB..."]

Get full node hashes

nodes = htm.recall("PostgreSQL", raw: true)
# => [{"id" => 1, "content" => "...", "created_at" => "...", ...}, ...]

With timeframe

memories = htm.recall("PostgreSQL", timeframe: "last week")

With all options

memories = htm.recall("PostgreSQL",
  timeframe: "last month",
  limit: 50,
  strategy: :hybrid,
  with_relevance: true,
  query_tags: ["database", "timeseries"])

Parameters:

topic (String) —

Topic to search for (required)
timeframe (String, Range, nil) (defaults to: nil) —

Time range (default: last 7 days). Examples: “last week”, 7.days.ago..Time.now
limit (Integer) (defaults to: 20) —

Maximum number of nodes to retrieve (default: 20)
strategy (Symbol) (defaults to: :vector) —

Search strategy (:vector, :fulltext, :hybrid) (default: :vector)
with_relevance (Boolean) (defaults to: false) —

Include dynamic relevance scores (default: false)
query_tags (Array<String>) (defaults to: []) —

Tags to boost relevance (default: [])
raw (Boolean) (defaults to: false) —

Return full node hashes (true) or just content strings (false) (default: false)

Returns:

(Array<String>, Array<Hash>) —

Content strings (raw: false) or full node hashes (raw: true)

# File 'lib/htm.rb', line 183

def recall(topic, timeframe: nil, limit: 20, strategy: :vector, with_relevance: false, query_tags: [], raw: false)
  # Use default timeframe if not provided (last 7 days)
  timeframe ||= "last 7 days"

  # Validate inputs
  validate_timeframe!(timeframe)
  validate_positive_integer!(limit, "limit")
  validate_recall_strategy!(strategy)
  validate_array!(query_tags, "query_tags")

  parsed_timeframe = parse_timeframe(timeframe)

  # Use relevance-based search if requested
  if with_relevance
    nodes = @long_term_memory.search_with_relevance(
      timeframe: parsed_timeframe,
      query: topic,
      query_tags: query_tags,
      limit: limit,
      embedding_service: (strategy == :vector || strategy == :hybrid) ? HTM : nil
    )
  else
    # Perform standard RAG-based retrieval
    nodes = case strategy
    when :vector
      # Vector search using query embedding
      @long_term_memory.search(
        timeframe: parsed_timeframe,
        query: topic,
        limit: limit,
        embedding_service: HTM
      )
    when :fulltext
      @long_term_memory.search_fulltext(
        timeframe: parsed_timeframe,
        query: topic,
        limit: limit
      )
    when :hybrid
      # Hybrid search combining vector + fulltext
      @long_term_memory.search_hybrid(
        timeframe: parsed_timeframe,
        query: topic,
        limit: limit,
        embedding_service: HTM
      )
    end
  end

  # Add to working memory (evict if needed)
  nodes.each do |node|
    add_to_working_memory(node)
  end

  update_robot_activity

  # Return full nodes or just content based on raw parameter
  raw ? nodes : nodes.map { |node| node['content'] }
end
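
Two details of `#recall` are easy to miss: the strategy must be one of the three symbols listed in `VALID_RECALL_STRATEGIES`, and node hashes use string keys, so the non-raw result is built from `node['content']`. The sketch below isolates both. `known_strategy?` and `shape_results` are hypothetical helpers, and the sample nodes are made up for illustration.

```ruby
# Mirrors the VALID_RECALL_STRATEGIES constant from the summary.
DEMO_RECALL_STRATEGIES = [:vector, :fulltext, :hybrid].freeze

# Hypothetical helper: true when the strategy is one #recall dispatches on.
def known_strategy?(strategy)
  DEMO_RECALL_STRATEGIES.include?(strategy)
end

# Same final expression as #recall: full hashes when raw, else content only.
def shape_results(nodes, raw: false)
  raw ? nodes : nodes.map { |node| node['content'] }
end

sample_nodes = [
  { 'id' => 1, 'content' => "PostgreSQL is great for time-series data" },
  { 'id' => 2, 'content' => "PostgreSQL with TimescaleDB" }
]

p known_strategy?(:hybrid)
p known_strategy?(:keyword)
p shape_results(sample_nodes)
p shape_results(sample_nodes, raw: true).first['id']
```

How HTM itself reports an unknown strategy is not shown in this section, so the helper here returns a boolean instead of guessing at an error class.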

#remember(content, source: "", tags: []) ⇒ `Integer`

Remember new information

Stores content in long-term memory and adds it to working memory. The embedding and hierarchical tags are generated automatically by an LLM in background jobs, so the node is stored first without an embedding.

If content is empty, returns the ID of the most recent node (or 0 if no node exists) without creating a new entry. Nil values for content or source are converted to empty strings.

Examples:

Remember with source

node_id = htm.remember("PostgreSQL is great for HTM", source: "user")

Remember with manual tags

node_id = htm.remember("Time-series data", source: "user", tags: ["database:timescaledb"])

Parameters:

content (String, nil) —

The information to remember
source (String, nil) (defaults to: "") —

Where this content came from (defaults to empty string if not provided)
tags (Array<String>) (defaults to: []) —

Manual tags to assign (optional, in addition to auto-extracted tags)

Returns:

(Integer) —

Database ID of the memory node

# File 'lib/htm.rb', line 115

def remember(content, source: "", tags: [])
  # Convert nil to empty string
  content = content.to_s
  source = source.to_s

  # If content is empty, return the last node ID without creating a new entry
  if content.empty?
    last_node = HTM::Models::Node.order(created_at: :desc).first
    return last_node&.id || 0
  end

  # Calculate token count using configured counter
  token_count = HTM.count_tokens(content)

  # Store in long-term memory immediately (without embedding)
  # Embedding and tags will be generated asynchronously
  node_id = @long_term_memory.add(
    content: content,
    source: source,
    token_count: token_count,
    robot_id: @robot_id,
    embedding: nil  # Will be generated in background
  )

  HTM.logger.info "Node #{node_id} created for robot #{@robot_name} (#{token_count} tokens)"

  # Enqueue background jobs for embedding and tag generation
  # Both jobs run in parallel with equal priority
  enqueue_embedding_job(node_id)
  enqueue_tags_job(node_id, manual_tags: tags)

  # Add to working memory (access_count starts at 0)
  @working_memory.add(node_id, content, token_count: token_count, access_count: 0)

  update_robot_activity
  node_id
end
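
The nil and empty-content handling at the top of `#remember` can be sketched with an in-memory stand-in. `DemoNotebook` is hypothetical and uses a simple array in place of `HTM::Models::Node`; it is meant only to show the return values in each case, not the storage or background-job behaviour.

```ruby
# Hypothetical in-memory stand-in copying the early-return logic of #remember.
class DemoNotebook
  def initialize
    @nodes = [] # each entry: { id:, content:, source: }
  end

  def remember(content, source: "")
    # nil becomes "" for both arguments, as in the method above.
    content = content.to_s
    source = source.to_s

    # Empty content: hand back the newest ID (or 0) and store nothing.
    if content.empty?
      last = @nodes.last
      return last ? last[:id] : 0
    end

    id = @nodes.size + 1
    @nodes << { id: id, content: content, source: source }
    id
  end

  def size
    @nodes.size
  end
end

notebook = DemoNotebook.new
p notebook.remember(nil)                          # nothing stored yet
p notebook.remember("PostgreSQL is great for HTM", source: "user")
p notebook.remember("")                           # returns the previous ID
p notebook.size
```

One consequence worth keeping in mind is that an empty call returns an ID that may belong to unrelated content, so callers should not treat the return value as proof that their input was stored.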
