Class: HTM

Inherits:
Object
Defined in:
lib/htm.rb,
lib/htm/errors.rb,
lib/htm/railtie.rb,
lib/htm/sinatra.rb,
lib/htm/version.rb,
lib/htm/database.rb,
lib/htm/models/tag.rb,
lib/htm/job_adapter.rb,
lib/htm/models/node.rb,
lib/htm/tag_service.rb,
lib/htm/models/robot.rb,
lib/htm/configuration.rb,
lib/htm/working_memory.rb,
lib/htm/models/node_tag.rb,
lib/htm/long_term_memory.rb,
lib/htm/embedding_service.rb,
lib/htm/active_record_config.rb,
lib/htm/jobs/generate_tags_job.rb,
lib/htm/jobs/generate_embedding_job.rb

Overview

HTM error classes

Defined Under Namespace

Modules: JobAdapter, Jobs, Models, Sinatra

Classes: ActiveRecordConfig, AuthorizationError, CircuitBreakerOpenError, Configuration, Database, DatabaseError, EmbeddingError, EmbeddingService, Error, LongTermMemory, NotFoundError, QueryTimeoutError, Railtie, ResourceExhaustedError, TagError, TagService, ValidationError, WorkingMemory

Constant Summary

  • MAX_KEY_LENGTH = 255

    Validation constants

  • MAX_VALUE_LENGTH = 1_000_000

    1 MB

  • MAX_ARRAY_SIZE = 1000

  • VALID_RECALL_STRATEGIES = [:vector, :fulltext, :hybrid].freeze

  • VERSION = "0.0.1"
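
How limits like these might be enforced can be sketched as follows. The `SketchValidation` module is a hypothetical stand-in: only the constant values come from this page, and the validators HTM actually uses are not shown here, so the method bodies and the choice of `ArgumentError` are assumptions.

```ruby
# Hypothetical stand-in; not HTM's actual validation code.
module SketchValidation
  MAX_ARRAY_SIZE = 1000
  VALID_RECALL_STRATEGIES = [:vector, :fulltext, :hybrid].freeze

  # Returns the strategy if it is one of the allowed symbols, otherwise raises.
  def self.validate_recall_strategy!(strategy)
    return strategy if VALID_RECALL_STRATEGIES.include?(strategy)

    raise ArgumentError,
          "strategy must be one of #{VALID_RECALL_STRATEGIES.inspect}, got #{strategy.inspect}"
  end

  # Returns the array if it is an Array within the size limit, otherwise raises.
  def self.validate_array!(value, name)
    raise ArgumentError, "#{name} must be an Array" unless value.is_a?(Array)
    raise ArgumentError, "#{name} exceeds #{MAX_ARRAY_SIZE} items" if value.size > MAX_ARRAY_SIZE

    value
  end
end

SketchValidation.validate_recall_strategy!(:hybrid)  # => :hybrid
```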

Constructor Details

#initialize(working_memory_size: 128_000, robot_name: nil, db_config: nil, db_pool_size: 5, db_query_timeout: 30_000, db_cache_size: 1000, db_cache_ttl: 300) ⇒ HTM

Initialize a new HTM instance

Parameters:

  • working_memory_size (Integer) (defaults to: 128_000)

    Maximum tokens for working memory (default: 128,000)

  • robot_name (String) (defaults to: nil)

    Human-readable name for this robot (auto-generated if not provided)

  • db_config (Hash) (defaults to: nil)

    Database configuration (uses ENV if not provided)

  • db_pool_size (Integer) (defaults to: 5)

    Database connection pool size (default: 5)

  • db_query_timeout (Integer) (defaults to: 30_000)

    Database query timeout in milliseconds (default: 30000)

  • db_cache_size (Integer) (defaults to: 1000)

    Number of database query results to cache (default: 1000, use 0 to disable)

  • db_cache_ttl (Integer) (defaults to: 300)

    Database cache TTL in seconds (default: 300)



# File 'lib/htm.rb', line 68

def initialize(
  working_memory_size: 128_000,
  robot_name: nil,
  db_config: nil,
  db_pool_size: 5,
  db_query_timeout: 30_000,
  db_cache_size: 1000,
  db_cache_ttl: 300
)
  # Establish ActiveRecord connection if not already connected
  HTM::ActiveRecordConfig.establish_connection! unless HTM::ActiveRecordConfig.connected?

  @robot_name = robot_name || "robot_#{SecureRandom.uuid[0..7]}"

  # Initialize components
  @working_memory = HTM::WorkingMemory.new(max_tokens: working_memory_size)
  @long_term_memory = HTM::LongTermMemory.new(
    db_config || HTM::Database.default_config,
    pool_size: db_pool_size,
    query_timeout: db_query_timeout,
    cache_size: db_cache_size,
    cache_ttl: db_cache_ttl
  )

  # Register this robot in the database and get its integer ID
  @robot_id = register_robot
end
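
The robot-name fallback in the constructor can be tried in isolation. This is a minimal sketch: `default_robot_name` is a hypothetical helper name, while the expression itself is copied from the listing above.

```ruby
require "securerandom"

# Keep the given name; otherwise derive one from the first
# 8 characters of a random UUID, as #initialize does.
def default_robot_name(robot_name = nil)
  robot_name || "robot_#{SecureRandom.uuid[0..7]}"
end

default_robot_name("archivist")  # => "archivist"
default_robot_name               # "robot_" followed by 8 hex digits, different on each call
```

Because the generated name is random, pass `robot_name:` explicitly whenever a robot should be identifiable across runs.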

Class Attribute Details

.configuration ⇒ HTM::Configuration

Get current configuration

Returns:

  • (HTM::Configuration)

    Current configuration, created with defaults on first access

# File 'lib/htm/configuration.rb', line 274

def configuration
  @configuration ||= Configuration.new
end

Instance Attribute Details

#long_term_memory ⇒ Object (readonly)

Returns the value of attribute long_term_memory.



# File 'lib/htm.rb', line 49

def long_term_memory
  @long_term_memory
end

#robot_id ⇒ Object (readonly)

Returns the value of attribute robot_id.



# File 'lib/htm.rb', line 49

def robot_id
  @robot_id
end

#robot_name ⇒ Object (readonly)

Returns the value of attribute robot_name.



# File 'lib/htm.rb', line 49

def robot_name
  @robot_name
end

#working_memory ⇒ Object (readonly)

Returns the value of attribute working_memory.



# File 'lib/htm.rb', line 49

def working_memory
  @working_memory
end

Class Method Details

.configure {|config| ... } ⇒ Object

Configure HTM

Examples:

Custom configuration

HTM.configure do |config|
  config.embedding_generator = ->(text) { MyEmbedder.embed(text) }
  config.tag_extractor = ->(text, ontology) { MyTagger.extract(text, ontology) }
end

Default configuration

HTM.configure  # Uses RubyLLM defaults

Yields:

  • (config)

    Configuration object

Yield Parameters:

  • config (HTM::Configuration)

    The current configuration, open for modification inside the block

# File 'lib/htm/configuration.rb', line 292

def configure
  yield(configuration) if block_given?
  configuration.validate!
  configuration
end
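
The configure, validate, and reset cycle can be sketched without the gem. `SketchConfigurable` and its `Configuration` class are hypothetical stand-ins; the three module methods mirror the listings on this page, while the default token counter and the `validate!` rule are assumptions made only for this example.

```ruby
# Hypothetical stand-in for the pattern behind HTM.configure.
module SketchConfigurable
  class Configuration
    attr_accessor :token_counter

    def initialize
      # Assumed default for this sketch: count whitespace-separated words.
      @token_counter = ->(text) { text.split.size }
    end

    # Assumed rule for this sketch: the counter must be callable.
    def validate!
      raise ArgumentError, "token_counter must respond to #call" unless token_counter.respond_to?(:call)
    end
  end

  class << self
    # Memoized: the same object is returned until reset_configuration! runs.
    def configuration
      @configuration ||= Configuration.new
    end

    # Yields the configuration, validates it, and returns it.
    def configure
      yield(configuration) if block_given?
      configuration.validate!
      configuration
    end

    def reset_configuration!
      @configuration = Configuration.new
    end
  end
end

SketchConfigurable.configure { |c| c.token_counter = ->(text) { text.length } }
SketchConfigurable.configuration.token_counter.call("abc")  # => 3
SketchConfigurable.reset_configuration!
SketchConfigurable.configuration.token_counter.call("a b")  # => 2
```

Note that validation runs even when no block is given, so a bare `configure` call doubles as a check of the current settings.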

.count_tokens(text) ⇒ Integer

Count tokens using configured counter

Parameters:

  • text (String)

    Text to count tokens for

Returns:

  • (Integer)

    Token count



# File 'lib/htm/configuration.rb', line 328

def count_tokens(text)
  configuration.token_counter.call(text)
rescue StandardError => e
  raise HTM::ValidationError, "Token counting failed: #{e.message}"
end
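
The error-wrapping behavior above means any failure inside a custom counter surfaces as a validation error. A minimal sketch, using hypothetical names (`SketchValidationError`, `count_tokens_with`) in place of `HTM::ValidationError` and the configured counter:

```ruby
# Hypothetical stand-in for HTM::ValidationError.
class SketchValidationError < StandardError; end

# Calls the counter and re-raises any StandardError as a validation error.
def count_tokens_with(counter, text)
  counter.call(text)
rescue StandardError => e
  raise SketchValidationError, "Token counting failed: #{e.message}"
end

word_counter = ->(text) { text.split.size }
count_tokens_with(word_counter, "PostgreSQL is great")  # => 3

broken_counter = ->(_text) { raise "tokenizer unavailable" }
begin
  count_tokens_with(broken_counter, "anything")
rescue SketchValidationError => e
  e.message  # => "Token counting failed: tokenizer unavailable"
end
```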

.embed(text) ⇒ Array<Float>

Generate embedding using EmbeddingService

Parameters:

  • text (String)

    Text to embed

Returns:

  • (Array<Float>)

    Embedding vector (original, not padded)



# File 'lib/htm/configuration.rb', line 308

def embed(text)
  result = HTM::EmbeddingService.generate(text)
  result[:embedding]
end

.extract_tags(text, existing_ontology: []) ⇒ Array<String>

Extract tags using TagService

Parameters:

  • text (String)

    Text to analyze

  • existing_ontology (Array<String>) (defaults to: [])

    Sample of existing tags for context

Returns:

  • (Array<String>)

    Extracted and validated tag names



# File 'lib/htm/configuration.rb', line 319

def extract_tags(text, existing_ontology: [])
  HTM::TagService.extract(text, existing_ontology: existing_ontology)
end

.logger ⇒ Logger

Get configured logger

Returns:

  • (Logger)

    Configured logger instance



# File 'lib/htm/configuration.rb', line 338

def logger
  configuration.logger
end

.reset_configuration! ⇒ Object

Reset configuration to defaults



# File 'lib/htm/configuration.rb', line 299

def reset_configuration!
  @configuration = Configuration.new
end

Instance Method Details

#forget(node_id, confirm: false) ⇒ Boolean

Forget a memory node (explicit deletion)

Parameters:

  • node_id (Integer)

    ID of the node to delete

  • confirm (Symbol, Boolean) (defaults to: false)

    Must be :confirmed to proceed

Returns:

  • (Boolean)

    true if deleted

Raises:

  • (ArgumentError)

    if node_id is nil or confirm is not :confirmed

  • (HTM::NotFoundError)

    if node doesn’t exist



# File 'lib/htm.rb', line 251

def forget(node_id, confirm: false)
  # Validate inputs
  raise ArgumentError, "node_id cannot be nil" if node_id.nil?
  raise ArgumentError, "Must pass confirm: :confirmed to delete" unless confirm == :confirmed

  # Verify node exists
  unless @long_term_memory.exists?(node_id)
    raise HTM::NotFoundError, "Node not found: #{node_id}"
  end

  # Delete the node and remove from working memory
  @long_term_memory.delete(node_id)
  @working_memory.remove(node_id)

  update_robot_activity
  true
end
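
The guard order in `#forget` (nil check, confirmation check, existence check, then deletion) can be exercised with an in-memory stand-in. `SketchMemory` and `SketchNotFoundError` are hypothetical names; a plain Hash replaces long-term and working memory.

```ruby
# Hypothetical stand-in for HTM::NotFoundError.
class SketchNotFoundError < StandardError; end

# Hypothetical in-memory store that mirrors the guards in #forget.
class SketchMemory
  def initialize
    @nodes = {}
    @next_id = 1
  end

  def remember(content)
    id = @next_id
    @nodes[id] = content
    @next_id += 1
    id
  end

  def forget(node_id, confirm: false)
    raise ArgumentError, "node_id cannot be nil" if node_id.nil?
    raise ArgumentError, "Must pass confirm: :confirmed to delete" unless confirm == :confirmed
    raise SketchNotFoundError, "Node not found: #{node_id}" unless @nodes.key?(node_id)

    @nodes.delete(node_id)
    true
  end
end

memory = SketchMemory.new
id = memory.remember("temporary note")
memory.forget(id, confirm: :confirmed)  # => true
```

Passing `confirm: true` is not enough: the check compares against the symbol `:confirmed`, so any other value raises `ArgumentError`.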

#recall(topic, timeframe: nil, limit: 20, strategy: :vector, with_relevance: false, query_tags: [], raw: false) ⇒ Array<String>, Array<Hash>

Recall memories from a timeframe and topic

Examples:

Basic usage (returns content strings)

memories = htm.recall("PostgreSQL")
# => ["PostgreSQL is great for time-series data", "PostgreSQL with TimescaleDB..."]

Get full node hashes

nodes = htm.recall("PostgreSQL", raw: true)
# => [{"id" => 1, "content" => "...", "created_at" => "...", ...}, ...]

With timeframe

memories = htm.recall("PostgreSQL", timeframe: "last week")

With all options

memories = htm.recall("PostgreSQL",
  timeframe: "last month",
  limit: 50,
  strategy: :hybrid,
  with_relevance: true,
  query_tags: ["database", "timeseries"])

Parameters:

  • topic (String)

    Topic to search for (required)

  • timeframe (String, Range, nil) (defaults to: nil)

    Time range (default: last 7 days). Examples: “last week”, 7.days.ago..Time.now

  • limit (Integer) (defaults to: 20)

    Maximum number of nodes to retrieve (default: 20)

  • strategy (Symbol) (defaults to: :vector)

    Search strategy (:vector, :fulltext, :hybrid) (default: :vector)

  • with_relevance (Boolean) (defaults to: false)

    Include dynamic relevance scores (default: false)

  • query_tags (Array<String>) (defaults to: [])

    Tags to boost relevance (default: [])

  • raw (Boolean) (defaults to: false)

    Return full node hashes (true) or just content strings (false) (default: false)

Returns:

  • (Array<String>, Array<Hash>)

    Content strings (raw: false) or full node hashes (raw: true)
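
The `7.days.ago..Time.now` example for `timeframe` relies on ActiveSupport's numeric extensions. Where ActiveSupport is not loaded, a comparable Range can be built with plain `Time` arithmetic. This is a sketch using fixed 24-hour days; `last_days` and `SECONDS_PER_DAY` are hypothetical names, not part of HTM.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Builds a Time range covering the given number of 24-hour days up to `now`.
def last_days(days, now: Time.now)
  (now - days * SECONDS_PER_DAY)..now
end

timeframe = last_days(7)
timeframe.cover?(Time.now - 3 * SECONDS_PER_DAY)   # => true
timeframe.cover?(Time.now - 10 * SECONDS_PER_DAY)  # => false
```

The resulting Range would be passed as `timeframe:`, on the assumption that `#recall` accepts any Range of `Time` values as the parameter type suggests.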



# File 'lib/htm.rb', line 183

def recall(topic, timeframe: nil, limit: 20, strategy: :vector, with_relevance: false, query_tags: [], raw: false)
  # Use default timeframe if not provided (last 7 days)
  timeframe ||= "last 7 days"

  # Validate inputs
  validate_timeframe!(timeframe)
  validate_positive_integer!(limit, "limit")
  validate_recall_strategy!(strategy)
  validate_array!(query_tags, "query_tags")

  parsed_timeframe = parse_timeframe(timeframe)

  # Use relevance-based search if requested
  if with_relevance
    nodes = @long_term_memory.search_with_relevance(
      timeframe: parsed_timeframe,
      query: topic,
      query_tags: query_tags,
      limit: limit,
      embedding_service: (strategy == :vector || strategy == :hybrid) ? HTM : nil
    )
  else
    # Perform standard RAG-based retrieval
    nodes = case strategy
    when :vector
      # Vector search using query embedding
      @long_term_memory.search(
        timeframe: parsed_timeframe,
        query: topic,
        limit: limit,
        embedding_service: HTM
      )
    when :fulltext
      @long_term_memory.search_fulltext(
        timeframe: parsed_timeframe,
        query: topic,
        limit: limit
      )
    when :hybrid
      # Hybrid search combining vector + fulltext
      @long_term_memory.search_hybrid(
        timeframe: parsed_timeframe,
        query: topic,
        limit: limit,
        embedding_service: HTM
      )
    end
  end

  # Add to working memory (evict if needed)
  nodes.each do |node|
    add_to_working_memory(node)
  end

  update_robot_activity

  # Return full nodes or just content based on raw parameter
  raw ? nodes : nodes.map { |node| node['content'] }
end
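
The strategy dispatch and the `raw` switch at the end of `#recall` can be sketched with a stand-in store. `SketchStore` and `sketch_recall` are hypothetical; a naive substring match replaces all three real search strategies, so only the control flow is representative.

```ruby
# Hypothetical store; a substring match stands in for every search strategy.
class SketchStore
  NODES = [
    { "id" => 1, "content" => "PostgreSQL is great for time-series data" },
    { "id" => 2, "content" => "Ruby blocks are closures" }
  ].freeze

  def search(query, limit)
    NODES.select { |n| n["content"].downcase.include?(query.downcase) }.first(limit)
  end
  alias search_fulltext search
  alias search_hybrid search
end

# Picks a search method by strategy, then returns nodes or just their content.
def sketch_recall(store, topic, limit: 20, strategy: :vector, raw: false)
  nodes = case strategy
          when :vector   then store.search(topic, limit)
          when :fulltext then store.search_fulltext(topic, limit)
          when :hybrid   then store.search_hybrid(topic, limit)
          else raise ArgumentError, "unknown strategy: #{strategy.inspect}"
          end

  raw ? nodes : nodes.map { |node| node["content"] }
end

store = SketchStore.new
sketch_recall(store, "postgresql")
# => ["PostgreSQL is great for time-series data"]
sketch_recall(store, "postgresql", raw: true).first["id"]
# => 1
```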

#remember(content, source: "", tags: []) ⇒ Integer

Remember new information

Stores content in long-term memory and adds it to working memory. The node is saved immediately without an embedding; the embedding and hierarchical tags are then generated by background jobs.

If content is empty, nothing is stored and the ID of the most recent node is returned instead (0 if no nodes exist yet). Nil values for content or source are converted to empty strings.

Examples:

Remember with source

node_id = htm.remember("PostgreSQL is great for HTM", source: "user")

Remember with manual tags

node_id = htm.remember("Time-series data", source: "user", tags: ["database:timescaledb"])

Parameters:

  • content (String, nil)

    The information to remember

  • source (String, nil) (defaults to: "")

    Where this content came from (defaults to empty string if not provided)

  • tags (Array<String>) (defaults to: [])

    Manual tags to assign (optional, in addition to auto-extracted tags)

Returns:

  • (Integer)

    Database ID of the memory node



# File 'lib/htm.rb', line 115

def remember(content, source: "", tags: [])
  # Convert nil to empty string
  content = content.to_s
  source = source.to_s

  # If content is empty, return the last node ID without creating a new entry
  if content.empty?
    last_node = HTM::Models::Node.order(created_at: :desc).first
    return last_node&.id || 0
  end

  # Calculate token count using configured counter
  token_count = HTM.count_tokens(content)

  # Store in long-term memory immediately (without embedding)
  # Embedding and tags will be generated asynchronously
  node_id = @long_term_memory.add(
    content: content,
    source: source,
    token_count: token_count,
    robot_id: @robot_id,
    embedding: nil  # Will be generated in background
  )

  HTM.logger.info "Node #{node_id} created for robot #{@robot_name} (#{token_count} tokens)"

  # Enqueue background jobs for embedding and tag generation
  # Both jobs run in parallel with equal priority
  enqueue_embedding_job(node_id)
  enqueue_tags_job(node_id, manual_tags: tags)

  # Add to working memory (access_count starts at 0)
  @working_memory.add(node_id, content, token_count: token_count, access_count: 0)

  update_robot_activity
  node_id
end
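
The empty-content shortcut at the top of `#remember` is easy to overlook, so here is a minimal sketch of just that branch. `SketchNodeLog` is a hypothetical in-memory stand-in; an Array of IDs replaces the database query for the most recent node.

```ruby
# Hypothetical stand-in that mirrors the empty-content branch of #remember.
class SketchNodeLog
  def initialize
    @ids = []
  end

  def remember(content)
    content = content.to_s                  # nil becomes ""
    return @ids.last || 0 if content.empty? # reuse the latest ID, or 0 if none

    id = @ids.size + 1
    @ids << id
    id
  end
end

log = SketchNodeLog.new
log.remember(nil)      # => 0 (nothing stored yet)
log.remember("first")  # => 1
log.remember("")       # => 1 (most recent node, no new entry)
```

A return value of 0 therefore does not identify a stored node, and callers that pass possibly empty content may want to check for it before using the ID.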