Class: HTM::LongTermMemory
- Inherits: Object
  - Object
    - HTM::LongTermMemory
- Defined in: lib/htm/long_term_memory.rb
Overview
Long-term Memory - PostgreSQL/TimescaleDB-backed permanent storage
LongTermMemory provides durable storage for all memory nodes with:
- Vector similarity search (RAG)
- Full-text search
- Time-range queries
- Relationship graphs
- Tag system
- ActiveRecord ORM for data access
- Query result caching for efficiency
Constant Summary
- DEFAULT_QUERY_TIMEOUT = 30_000
  Query timeout in milliseconds (30 seconds)
- MAX_VECTOR_DIMENSION = 2000
  Maximum supported dimension with HNSW index (pgvector limitation)
- DEFAULT_CACHE_SIZE = 1000
  Number of queries to cache
- DEFAULT_CACHE_TTL = 300
  Cache lifetime in seconds (5 minutes)
Instance Attribute Summary
- #query_timeout ⇒ Object (readonly)
  Returns the query timeout in milliseconds, as passed to the constructor.
Instance Method Summary
- #add(content:, source:, token_count: 0, robot_id:, embedding: nil) ⇒ Integer
  Add a node to long-term memory.
- #add_tag(node_id:, tag:) ⇒ void
  Add a tag to a node.
- #calculate_relevance(node:, query_tags: [], vector_similarity: nil) ⇒ Float
  Calculate dynamic relevance score for a node given query context.
- #delete(node_id) ⇒ void
  Delete a node.
- #exists?(node_id) ⇒ Boolean
  Check if a node exists.
- #get_node_tags(node_id) ⇒ Array<String>
  Get tags for a specific node.
- #initialize(config, pool_size: nil, query_timeout: DEFAULT_QUERY_TIMEOUT, cache_size: DEFAULT_CACHE_SIZE, cache_ttl: DEFAULT_CACHE_TTL) ⇒ LongTermMemory (constructor)
  A new instance of LongTermMemory.
- #mark_evicted(node_ids) ⇒ void
  Mark nodes as evicted from working memory.
- #node_topics(node_id) ⇒ Array<String>
  Get topics for a specific node.
- #nodes_by_topic(topic_path, exact: false, limit: 50) ⇒ Array<Hash>
  Retrieve nodes by ontological topic.
- #ontology_structure ⇒ Array<Hash>
  Get ontology structure view.
- #pool_size ⇒ Object
  For backwards compatibility with tests/code that expect pool_size.
- #popular_tags(limit: 20, timeframe: nil) ⇒ Array<Hash>
  Get most popular tags.
- #register_robot(robot_name) ⇒ void
  Register a robot.
- #retrieve(node_id) ⇒ Hash?
  Retrieve a node by ID.
- #search(timeframe:, query:, limit:, embedding_service:) ⇒ Array<Hash>
  Vector similarity search.
- #search_by_tags(tags:, match_all: false, timeframe: nil, limit: 20) ⇒ Array<Hash>
  Search nodes by tags.
- #search_fulltext(timeframe:, query:, limit:) ⇒ Array<Hash>
  Full-text search.
- #search_hybrid(timeframe:, query:, limit:, embedding_service:, prefilter_limit: 100) ⇒ Array<Hash>
  Hybrid search (full-text + vector).
- #search_with_relevance(timeframe:, query: nil, query_tags: [], limit: 20, embedding_service: nil) ⇒ Array<Hash>
  Search with dynamic relevance scoring.
- #shutdown ⇒ Object
  Shutdown - no-op with ActiveRecord (connection pool managed by ActiveRecord).
- #stats ⇒ Hash
  Get memory statistics.
- #topic_relationships(min_shared_nodes: 2, limit: 50) ⇒ Array<Hash>
  Get topic relationships (co-occurrence).
- #track_access(node_ids) ⇒ void
  Track access for multiple nodes (bulk operation).
- #update_last_accessed(node_id) ⇒ void
  Update last_accessed timestamp.
- #update_robot_activity(robot_id) ⇒ void
  Update robot activity timestamp.
Constructor Details
#initialize(config, pool_size: nil, query_timeout: DEFAULT_QUERY_TIMEOUT, cache_size: DEFAULT_CACHE_SIZE, cache_ttl: DEFAULT_CACHE_TTL) ⇒ LongTermMemory
Returns a new instance of LongTermMemory.
# File 'lib/htm/long_term_memory.rb', line 28
def initialize(config, pool_size: nil, query_timeout: DEFAULT_QUERY_TIMEOUT, cache_size: DEFAULT_CACHE_SIZE, cache_ttl: DEFAULT_CACHE_TTL)
  @config = config
  @query_timeout = query_timeout  # in milliseconds

  # Set statement timeout for ActiveRecord queries
  ActiveRecord::Base.connection.execute("SET statement_timeout = #{@query_timeout}")

  # Initialize query result cache (disable with cache_size: 0)
  if cache_size > 0
    @query_cache = LruRedux::TTL::ThreadSafeCache.new(cache_size, cache_ttl)
    @cache_stats = { hits: 0, misses: 0 }
  end
end
Instance Attribute Details
#query_timeout ⇒ Object (readonly)
Returns the value of attribute query_timeout.
# File 'lib/htm/long_term_memory.rb', line 26
def query_timeout
  @query_timeout
end
Instance Method Details
#add(content:, source:, token_count: 0, robot_id:, embedding: nil) ⇒ Integer
Add a node to long-term memory
Embeddings should be generated client-side and provided via the embedding parameter.
# File 'lib/htm/long_term_memory.rb', line 53
def add(content:, source:, token_count: 0, robot_id:, embedding: nil)
  # Prepare embedding if provided
  if embedding
    # Pad embedding to 2000 dimensions if needed
    actual_dimension = embedding.length
    if actual_dimension < 2000
      padded_embedding = embedding + Array.new(2000 - actual_dimension, 0.0)
    else
      padded_embedding = embedding
    end
    embedding_str = "[#{padded_embedding.join(',')}]"
  end

  # Create node using ActiveRecord
  node = HTM::Models::Node.create!(
    content: content,
    source: source,
    token_count: token_count,
    robot_id: robot_id,
    embedding: embedding ? embedding_str : nil,
    embedding_dimension: embedding ? embedding.length : nil
  )

  # Invalidate cache since database content changed
  invalidate_cache!

  node.id
end
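The padding step in #add can be tried in isolation. The following is a minimal sketch, not part of the HTM API: `pad_embedding` and `to_vector_literal` are hypothetical helper names, and unlike the listing above (which passes oversized vectors through unchanged) this sketch raises when the input exceeds the target dimension.

```ruby
# Hypothetical helpers that mirror the padding step in #add.
# Assumption: 2000 matches MAX_VECTOR_DIMENSION from the constant summary.
def pad_embedding(embedding, dimension = 2000)
  if embedding.length > dimension
    # Deviation from the source, which stores oversized vectors as-is.
    raise ArgumentError, "embedding has #{embedding.length} dimensions, max is #{dimension}"
  end

  # Append zeros until the vector reaches the target dimension.
  embedding + Array.new(dimension - embedding.length, 0.0)
end

# Builds the bracketed, comma-separated text form used for the embedding column.
def to_vector_literal(embedding)
  "[#{embedding.join(',')}]"
end

padded = pad_embedding([0.25, -0.5, 1.0])
puts padded.length
puts to_vector_literal(padded.first(4))
```

Zero-padding leaves dot products and vector norms unchanged, so cosine similarity between two padded vectors equals the similarity of the originals. The original length is still worth storing, as #add does in embedding_dimension.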
#add_tag(node_id:, tag:) ⇒ void
This method returns an undefined value.
Add a tag to a node
# File 'lib/htm/long_term_memory.rb', line 231
def add_tag(node_id:, tag:)
  tag_record = HTM::Models::Tag.find_or_create_by(name: tag)
  HTM::Models::NodeTag.create(
    node_id: node_id,
    tag_id: tag_record.id
  )
rescue ActiveRecord::RecordNotUnique
  # Tag association already exists, ignore
end
#calculate_relevance(node:, query_tags: [], vector_similarity: nil) ⇒ Float
Calculate dynamic relevance score for a node given query context
Combines four signals into a weighted score on a 0-10 scale:
- Vector similarity (semantic match), weight 0.5
- Tag overlap (categorical match), weight 0.3
- Recency (freshness), weight 0.1
- Access frequency (popularity/utility), weight 0.1
# File 'lib/htm/long_term_memory.rb', line 413
def calculate_relevance(node:, query_tags: [], vector_similarity: nil)
  # 1. Vector similarity (semantic match) - weight: 0.5
  semantic_score = if vector_similarity
    vector_similarity
  elsif node['similarity']
    node['similarity'].to_f
  else
    0.5  # Neutral if no embedding
  end

  # 2. Tag overlap (categorical relevance) - weight: 0.3
  node_tags = get_node_tags(node['id'])
  tag_score = if query_tags.any? && node_tags.any?
    weighted_hierarchical_jaccard(query_tags, node_tags)
  else
    0.5  # Neutral if no tags
  end

  # 3. Recency (temporal relevance) - weight: 0.1
  age_hours = (Time.now - Time.parse(node['created_at'].to_s)) / 3600.0
  recency_score = Math.exp(-age_hours / 168.0)  # Decays by 1/e per week (half-life of about 4.9 days)

  # 4. Access frequency (behavioral signal) - weight: 0.1
  access_count = node['access_count'] || 0
  access_score = Math.log(1 + access_count) / 10.0  # Roughly 0-1; reaches 1.0 only near 22,000 accesses

  # Weighted composite (scale to 0-10)
  relevance = (
    (semantic_score * 0.5) +
    (tag_score * 0.3) +
    (recency_score * 0.1) +
    (access_score * 0.1)
  ) * 10.0

  relevance.clamp(0.0, 10.0)
end
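To see how the four weights interact, the composite can be recomputed outside the class. This is a sketch under stated assumptions: `composite_relevance` is a hypothetical helper, the tag score is passed in directly (the private weighted_hierarchical_jaccard helper is not shown in this page), and the input values are made up.

```ruby
# Standalone re-computation of the weighted composite from #calculate_relevance.
# Assumption: tag is an already-computed tag-overlap score in 0..1.
def composite_relevance(semantic:, tag:, age_hours:, access_count:)
  recency = Math.exp(-age_hours / 168.0)       # falls to 1/e after one week
  access  = Math.log(1 + access_count) / 10.0  # grows logarithmically with use

  score = ((semantic * 0.5) + (tag * 0.3) + (recency * 0.1) + (access * 0.1)) * 10.0
  score.clamp(0.0, 10.0)
end

fresh    = composite_relevance(semantic: 0.8, tag: 0.5, age_hours: 0,   access_count: 0)
week_old = composite_relevance(semantic: 0.8, tag: 0.5, age_hours: 168, access_count: 0)

puts fresh.round(2)     # => 6.5
puts week_old.round(2)  # => 5.87
```

With these weights, a week of age costs at most about 0.63 points, so semantic similarity and tag overlap dominate the ranking.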
#delete(node_id) ⇒ void
This method returns an undefined value.
Delete a node
# File 'lib/htm/long_term_memory.rb', line 115
def delete(node_id)
  node = HTM::Models::Node.find_by(id: node_id)
  node&.destroy

  # Invalidate cache since database content changed
  invalidate_cache!
end
#exists?(node_id) ⇒ Boolean
Check if a node exists
# File 'lib/htm/long_term_memory.rb', line 128
def exists?(node_id)
  HTM::Models::Node.exists?(node_id)
end
#get_node_tags(node_id) ⇒ Array<String>
Get tags for a specific node
# File 'lib/htm/long_term_memory.rb', line 503
def get_node_tags(node_id)
  HTM::Models::Tag
    .joins(:node_tags)
    .where(node_tags: { node_id: node_id })
    .pluck(:name)
rescue
  []
end
#mark_evicted(node_ids) ⇒ void
This method returns an undefined value.
Mark nodes as evicted from working memory
# File 'lib/htm/long_term_memory.rb', line 246
def mark_evicted(node_ids)
  return if node_ids.empty?

  HTM::Models::Node.where(id: node_ids).update_all(in_working_memory: false)
end
#node_topics(node_id) ⇒ Array<String>
Get topics for a specific node
# File 'lib/htm/long_term_memory.rb', line 392
def node_topics(node_id)
  HTM::Models::Tag
    .joins(:node_tags)
    .where(node_tags: { node_id: node_id })
    .order(:name)
    .pluck(:name)
end
#nodes_by_topic(topic_path, exact: false, limit: 50) ⇒ Array<Hash>
Retrieve nodes by ontological topic
# File 'lib/htm/long_term_memory.rb', line 332
def nodes_by_topic(topic_path, exact: false, limit: 50)
  if exact
    nodes = HTM::Models::Node
      .joins(:tags)
      .where(tags: { name: topic_path })
      .distinct
      .order(created_at: :desc)
      .limit(limit)
  else
    nodes = HTM::Models::Node
      .joins(:tags)
      .where("tags.name LIKE ?", "#{topic_path}%")
      .distinct
      .order(created_at: :desc)
      .limit(limit)
  end

  nodes.map(&:attributes)
end
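With exact: false, the topic path is used as a LIKE prefix, so any `%` or `_` inside topic_path acts as a wildcard. The sketch below shows one way a caller might escape such characters before passing a path in. `escape_like` and `prefix_pattern` are hypothetical helpers written in plain Ruby so the example runs without Rails; in a Rails application, `ActiveRecord::Base.sanitize_sql_like` serves the same purpose.

```ruby
# Escapes LIKE metacharacters with a backslash (PostgreSQL's default escape character).
def escape_like(value)
  value.gsub(/[\\%_]/) { |char| "\\#{char}" }
end

# Builds the prefix pattern that nodes_by_topic passes to LIKE.
def prefix_pattern(topic_path)
  "#{escape_like(topic_path)}%"
end

puts prefix_pattern("machine_learning")
```

Without escaping, a path such as machine_learning would also match machine-learning or machineXlearning, because `_` matches any single character.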
#ontology_structure ⇒ Array<Hash>
Get ontology structure view
# File 'lib/htm/long_term_memory.rb', line 356
def ontology_structure
  result = ActiveRecord::Base.connection.select_all(
    "SELECT * FROM ontology_structure WHERE root_topic IS NOT NULL ORDER BY root_topic, level1_topic, level2_topic"
  )
  result.to_a
end
#pool_size ⇒ Object
For backwards compatibility with tests/code that expect pool_size
# File 'lib/htm/long_term_memory.rb', line 321
def pool_size
  ActiveRecord::Base.connection_pool.size
end
#popular_tags(limit: 20, timeframe: nil) ⇒ Array<Hash>
Get most popular tags
# File 'lib/htm/long_term_memory.rb', line 562
def popular_tags(limit: 20, timeframe: nil)
  query = HTM::Models::Tag
    .joins(:node_tags)
    .joins('INNER JOIN nodes ON nodes.id = node_tags.node_id')
    .group('tags.id', 'tags.name')
    .select('tags.name, COUNT(node_tags.id) as usage_count')

  query = query.where('nodes.created_at >= ? AND nodes.created_at <= ?', timeframe.begin, timeframe.end) if timeframe

  query
    .order('usage_count DESC')
    .limit(limit)
    .map { |tag| { name: tag.name, usage_count: tag.usage_count } }
end
#register_robot(robot_name) ⇒ void
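Register a robot: finds or creates the robot by name and refreshes its last_active timestamp.
Although the signature is documented as void, the method body ends with robot.id, so callers currently receive the robot's ID.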
# File 'lib/htm/long_term_memory.rb', line 274
def register_robot(robot_name)
  robot = HTM::Models::Robot.find_or_create_by(name: robot_name)
  robot.update(last_active: Time.current)
  robot.id
end
#retrieve(node_id) ⇒ Hash?
Retrieve a node by ID
Automatically tracks access by incrementing access_count and updating last_accessed
# File 'lib/htm/long_term_memory.rb', line 89
def retrieve(node_id)
  node = HTM::Models::Node.find_by(id: node_id)
  return nil unless node

  # Track access (atomic increment)
  node.increment!(:access_count)
  node.touch(:last_accessed)

  node.attributes
end
#search(timeframe:, query:, limit:, embedding_service:) ⇒ Array<Hash>
Vector similarity search
# File 'lib/htm/long_term_memory.rb', line 140
def search(timeframe:, query:, limit:, embedding_service:)
  # Return uncached if cache disabled
  return search_uncached(timeframe: timeframe, query: query, limit: limit, embedding_service: embedding_service) unless @query_cache

  # Generate cache key
  cache_key = cache_key_for(:search, timeframe, query, limit)

  # Try to get from cache
  cached = @query_cache[cache_key]
  if cached
    @cache_stats[:hits] += 1
    return cached
  end

  # Cache miss - execute query
  @cache_stats[:misses] += 1
  result = search_uncached(timeframe: timeframe, query: query, limit: limit, embedding_service: embedding_service)

  # Store in cache
  @query_cache[cache_key] = result
  result
end
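The three cached search methods share the same look-up, count, compute, store sequence. A minimal sketch of that pattern follows. `TinyQueryCache` is a hypothetical stand-in for the LruRedux cache: it has no TTL and no LRU eviction, and it checks key presence rather than truthiness, so it illustrates only the control flow and the hit/miss counters.

```ruby
# Minimal cache-aside sketch; not the class used by HTM.
class TinyQueryCache
  attr_reader :stats

  def initialize
    @store = {}
    @stats = { hits: 0, misses: 0 }
  end

  # Returns the cached value for key, or runs the block and stores its result.
  def fetch(key)
    if @store.key?(key)
      @stats[:hits] += 1
      return @store[key]
    end

    @stats[:misses] += 1
    @store[key] = yield
  end

  # Drops every entry, as a write to the database would require.
  def clear
    @store.clear
  end
end

cache = TinyQueryCache.new
run_query = ->(text) { ["result for #{text}"] } # stands in for search_uncached

# The key combines the search kind with the query parameters.
2.times { cache.fetch([:search, "ruby", 10]) { run_query.call("ruby") } }
p cache.stats
```

The first call records a miss and runs the query; the second call with the same key records a hit and skips it. Calling `clear` after a write plays the role that invalidate_cache! has in #add and #delete.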
#search_by_tags(tags:, match_all: false, timeframe: nil, limit: 20) ⇒ Array<Hash>
Search nodes by tags
# File 'lib/htm/long_term_memory.rb', line 520
def search_by_tags(tags:, match_all: false, timeframe: nil, limit: 20)
  return [] if tags.empty?

  # Build base query
  query = HTM::Models::Node
    .joins(:tags)
    .where(tags: { name: tags })
    .distinct

  # Apply timeframe filter if provided
  query = query.where(created_at: timeframe) if timeframe

  if match_all
    # Match ALL tags (intersection)
    query = query
      .group('nodes.id')
      .having('COUNT(DISTINCT tags.name) = ?', tags.size)
  end

  # Get results
  nodes = query.limit(limit).map(&:attributes)

  # Calculate relevance and enrich with tags
  nodes.map do |node|
    relevance = calculate_relevance(
      node: node,
      query_tags: tags
    )

    node.merge({
      'relevance' => relevance,
      'tags' => get_node_tags(node['id'])
    })
  end.sort_by { |n| -n['relevance'] }
end
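The difference between the default any-tag match and match_all: true can be shown on in-memory data. This is an illustrative sketch only: `matching_node_ids` is a hypothetical helper and the node-to-tag mapping is made up. It compares against the number of distinct requested tags, whereas the HAVING clause above compares against tags.size, so passing duplicate tags to the real method with match_all: true would match nothing.

```ruby
# Returns the IDs of nodes whose tags overlap the requested tags.
# node_tags maps a node ID to its array of tag names.
def matching_node_ids(node_tags, tags, match_all: false)
  return [] if tags.empty?

  wanted = tags.uniq
  node_tags.select do |_id, names|
    shared = names & wanted
    # match_all requires every requested tag; otherwise one shared tag is enough.
    match_all ? shared.size == wanted.size : shared.any?
  end.keys
end

sample = {
  1 => ["database", "postgresql"],
  2 => ["database"],
  3 => ["ruby"]
}

p matching_node_ids(sample, ["database", "postgresql"])                  # => [1, 2]
p matching_node_ids(sample, ["database", "postgresql"], match_all: true) # => [1]
```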
#search_fulltext(timeframe:, query:, limit:) ⇒ Array<Hash>
Full-text search
# File 'lib/htm/long_term_memory.rb', line 170
def search_fulltext(timeframe:, query:, limit:)
  # Return uncached if cache disabled
  return search_fulltext_uncached(timeframe: timeframe, query: query, limit: limit) unless @query_cache

  # Generate cache key
  cache_key = cache_key_for(:fulltext, timeframe, query, limit)

  # Try to get from cache
  cached = @query_cache[cache_key]
  if cached
    @cache_stats[:hits] += 1
    return cached
  end

  # Cache miss - execute query
  @cache_stats[:misses] += 1
  result = search_fulltext_uncached(timeframe: timeframe, query: query, limit: limit)

  # Store in cache
  @query_cache[cache_key] = result
  result
end
#search_hybrid(timeframe:, query:, limit:, embedding_service:, prefilter_limit: 100) ⇒ Array<Hash>
Hybrid search (full-text + vector)
# File 'lib/htm/long_term_memory.rb', line 202
def search_hybrid(timeframe:, query:, limit:, embedding_service:, prefilter_limit: 100)
  # Return uncached if cache disabled
  return search_hybrid_uncached(timeframe: timeframe, query: query, limit: limit, embedding_service: embedding_service, prefilter_limit: prefilter_limit) unless @query_cache

  # Generate cache key
  cache_key = cache_key_for(:hybrid, timeframe, query, limit, prefilter_limit)

  # Try to get from cache
  cached = @query_cache[cache_key]
  if cached
    @cache_stats[:hits] += 1
    return cached
  end

  # Cache miss - execute query
  @cache_stats[:misses] += 1
  result = search_hybrid_uncached(timeframe: timeframe, query: query, limit: limit, embedding_service: embedding_service, prefilter_limit: prefilter_limit)

  # Store in cache
  @query_cache[cache_key] = result
  result
end
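The body of search_hybrid_uncached is not shown on this page. The sketch below is therefore an assumption about what "full-text + vector" with a prefilter_limit typically means: a keyword filter selects at most prefilter_limit candidates, which are then ranked by cosine similarity. `hybrid_search` and `cosine_similarity` are hypothetical names, the documents are plain hashes, and the keyword test is a simple substring match rather than PostgreSQL full-text search.

```ruby
# Cosine similarity of two equal-length numeric arrays.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Stage 1: keep documents containing any query term (capped at prefilter_limit).
# Stage 2: rank the survivors by similarity to the query embedding.
def hybrid_search(docs, query_text, query_embedding, limit:, prefilter_limit: 100)
  terms = query_text.downcase.split
  candidates = docs
    .select { |doc| terms.any? { |term| doc[:content].downcase.include?(term) } }
    .first(prefilter_limit)

  candidates
    .sort_by { |doc| -cosine_similarity(doc[:embedding], query_embedding) }
    .first(limit)
end

docs = [
  { id: 1, content: "PostgreSQL vector index", embedding: [1.0, 0.0] },
  { id: 2, content: "PostgreSQL backup",       embedding: [0.0, 1.0] },
  { id: 3, content: "Ruby blocks",             embedding: [1.0, 0.0] }
]

p hybrid_search(docs, "postgresql", [1.0, 0.1], limit: 2).map { |doc| doc[:id] } # => [1, 2]
```

Document 3 is excluded even though its embedding is close to the query, because it fails the keyword stage. That trade-off is what distinguishes a prefiltered hybrid search from a pure vector search.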
#search_with_relevance(timeframe:, query: nil, query_tags: [], limit: 20, embedding_service: nil) ⇒ Array<Hash>
Search with dynamic relevance scoring
Returns nodes with calculated relevance scores based on query context
# File 'lib/htm/long_term_memory.rb', line 461
def search_with_relevance(timeframe:, query: nil, query_tags: [], limit: 20, embedding_service: nil)
  # Get candidates from appropriate search method
  candidates = if query && embedding_service
    # Vector search
    search_uncached(timeframe: timeframe, query: query, limit: limit * 2, embedding_service: embedding_service)
  elsif query
    # Full-text search
    search_fulltext_uncached(timeframe: timeframe, query: query, limit: limit * 2)
  else
    # Time-range only
    HTM::Models::Node
      .where(created_at: timeframe)
      .order(created_at: :desc)
      .limit(limit * 2)
      .map(&:attributes)
  end

  # Calculate relevance for each candidate
  scored_nodes = candidates.map do |node|
    relevance = calculate_relevance(
      node: node,
      query_tags: query_tags,
      vector_similarity: node['similarity']&.to_f
    )

    node.merge({
      'relevance' => relevance,
      'tags' => get_node_tags(node['id'])
    })
  end

  # Sort by relevance and return top K
  scored_nodes
    .sort_by { |n| -n['relevance'] }
    .take(limit)
end
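The method fetches limit * 2 candidates and then re-ranks them, which lets a node that sits just outside the top results by similarity move up once other signals are counted. The sketch below illustrates that effect with made-up data. `top_by_relevance` is a hypothetical helper, and the scoring block is a simplified stand-in for calculate_relevance that uses only similarity and access count.

```ruby
# Scores each candidate with the given block, then keeps the best `limit` nodes.
def top_by_relevance(candidates, limit:)
  candidates
    .map { |node| node.merge("relevance" => yield(node)) }
    .sort_by { |node| -node["relevance"] }
    .take(limit)
end

# Four candidates for limit 2, ordered by similarity as a vector search would return them.
candidates = [
  { "id" => 1, "similarity" => 0.90, "access_count" => 0 },
  { "id" => 2, "similarity" => 0.88, "access_count" => 0 },
  { "id" => 3, "similarity" => 0.86, "access_count" => 500 },
  { "id" => 4, "similarity" => 0.60, "access_count" => 0 }
]

top = top_by_relevance(candidates, limit: 2) do |node|
  (node["similarity"] * 0.5) + (Math.log(1 + node["access_count"]) / 10.0 * 0.1)
end

p top.map { |node| node["id"] } # => [3, 1]
```

Had only two candidates been fetched, node 3 would never have been scored, and the result would have been nodes 1 and 2.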
#shutdown ⇒ Object
Shutdown - no-op with ActiveRecord (connection pool managed by ActiveRecord)
# File 'lib/htm/long_term_memory.rb', line 315
def shutdown
  # ActiveRecord handles connection pool shutdown
  # This method kept for API compatibility
end
#stats ⇒ Hash
Get memory statistics
# File 'lib/htm/long_term_memory.rb', line 294
def stats
  base_stats = {
    total_nodes: HTM::Models::Node.count,
    nodes_by_robot: HTM::Models::Node.group(:robot_id).count,
    total_tags: HTM::Models::Tag.count,
    oldest_memory: HTM::Models::Node.minimum(:created_at),
    newest_memory: HTM::Models::Node.maximum(:created_at),
    active_robots: HTM::Models::Robot.count,
    robot_activity: HTM::Models::Robot.select(:id, :name, :last_active).map(&:attributes),
    database_size: ActiveRecord::Base.connection.select_value("SELECT pg_database_size(current_database())").to_i
  }

  # Include cache statistics if cache is enabled
  if @query_cache
    base_stats[:cache] = cache_stats
  end

  base_stats
end
#topic_relationships(min_shared_nodes: 2, limit: 50) ⇒ Array<Hash>
Get topic relationships (co-occurrence)
# File 'lib/htm/long_term_memory.rb', line 369
def topic_relationships(min_shared_nodes: 2, limit: 50)
  result = ActiveRecord::Base.connection.select_all(
    <<~SQL,
      SELECT t1.name AS topic1, t2.name AS topic2, COUNT(DISTINCT nt1.node_id) AS shared_nodes
      FROM tags t1
      JOIN node_tags nt1 ON t1.id = nt1.tag_id
      JOIN node_tags nt2 ON nt1.node_id = nt2.node_id
      JOIN tags t2 ON nt2.tag_id = t2.id
      WHERE t1.name < t2.name
      GROUP BY t1.name, t2.name
      HAVING COUNT(DISTINCT nt1.node_id) >= #{min_shared_nodes.to_i}
      ORDER BY shared_nodes DESC
      LIMIT #{limit.to_i}
    SQL
  )
  result.to_a
end
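The self-join in the query counts, for every pair of tags, how many nodes carry both. The same co-occurrence count can be sketched in plain Ruby on an in-memory mapping. `topic_pairs` is a hypothetical helper and the sample data is made up; ties are broken alphabetically here, whereas the SQL leaves the order of ties unspecified.

```ruby
# node_tags maps a node ID to its array of tag names.
def topic_pairs(node_tags, min_shared_nodes: 2, limit: 50)
  counts = Hash.new(0)

  node_tags.each_value do |tags|
    # Sorting makes topic1 < topic2, like WHERE t1.name < t2.name in the query.
    tags.uniq.sort.combination(2) { |pair| counts[pair] += 1 }
  end

  counts
    .select { |_pair, shared| shared >= min_shared_nodes }
    .sort_by { |pair, shared| [-shared, pair] }
    .first(limit)
    .map { |(topic1, topic2), shared| { topic1: topic1, topic2: topic2, shared_nodes: shared } }
end

sample = {
  1 => ["ai", "ruby"],
  2 => ["ai", "ruby", "sql"],
  3 => ["ai", "sql"]
}

topic_pairs(sample).each { |row| puts "#{row[:topic1]} + #{row[:topic2]}: #{row[:shared_nodes]}" }
```

For the sample data, the pairs ai/ruby and ai/sql each share two nodes and are returned, while ruby/sql shares only one node and falls below the default min_shared_nodes of 2.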
#track_access(node_ids) ⇒ void
This method returns an undefined value.
Track access for multiple nodes (bulk operation)
Updates access_count and last_accessed for all nodes in the array
# File 'lib/htm/long_term_memory.rb', line 259
def track_access(node_ids)
  return if node_ids.empty?

  # Atomic batch update
  HTM::Models::Node.where(id: node_ids).update_all(
    "access_count = access_count + 1, last_accessed = NOW()"
  )
end
#update_last_accessed(node_id) ⇒ void
This method returns an undefined value.
Update last_accessed timestamp
# File 'lib/htm/long_term_memory.rb', line 105
def update_last_accessed(node_id)
  node = HTM::Models::Node.find_by(id: node_id)
  node&.update(last_accessed: Time.current)
end
#update_robot_activity(robot_id) ⇒ void
This method returns an undefined value.
Update robot activity timestamp
# File 'lib/htm/long_term_memory.rb', line 285
def update_robot_activity(robot_id)
  robot = HTM::Models::Robot.find_by(id: robot_id)
  robot&.update(last_active: Time.current)
end