Module: PatternRuby::EntityTypes

Defined in:
lib/pattern_ruby/entity_types.rb

Defined Under Namespace

Classes: Registry

Constant Summary

BUILT_IN =
{
  string: { pattern: nil, parser: nil },
  number: { pattern: /\d+(?:\.\d+)?/, parser: ->(s) { s.include?(".") ? s.to_f : s.to_i } },
  integer: { pattern: /\d+/, parser: ->(s) { s.to_i } },
  email: { pattern: /[\w.+\-]+@[\w\-]+\.[\w.]+/, parser: nil },
  phone: { pattern: /\+?\d[\d\s\-()]{7,}/, parser: nil },
  url: { pattern: %r{https?://\S+}, parser: nil },
  currency: { pattern: /\$[\d,]+(?:\.\d{2})?/, parser: ->(s) { s.delete(",").delete("$").to_f } },
}.freeze
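
Each entry pairs a regular expression with an optional parser lambda. The page does not show how the library consumes this table, so the following is only a minimal sketch of one plausible use: scan a string with the entry's pattern and pass each match through the parser when one is defined. `BUILT_IN_SAMPLE` is a local copy of a few entries, and `extract_entities` is a hypothetical helper, not part of PatternRuby.

```ruby
# Local copy of a few BUILT_IN entries so the sketch runs on its own.
BUILT_IN_SAMPLE = {
  string:   { pattern: nil, parser: nil },
  number:   { pattern: /\d+(?:\.\d+)?/, parser: ->(s) { s.include?(".") ? s.to_f : s.to_i } },
  email:    { pattern: /[\w.+\-]+@[\w\-]+\.[\w.]+/, parser: nil },
  currency: { pattern: /\$[\d,]+(?:\.\d{2})?/, parser: ->(s) { s.delete(",").delete("$").to_f } }
}.freeze

# Hypothetical helper (an assumption, not a PatternRuby API): returns every
# match for the given type, converted by the entry's parser if it has one.
def extract_entities(text, type)
  entry = BUILT_IN_SAMPLE.fetch(type)

  # :string has no pattern; this sketch treats the whole text as the value.
  return [text] if entry[:pattern].nil?

  text.scan(entry[:pattern]).map do |raw|
    entry[:parser] ? entry[:parser].call(raw) : raw
  end
end

p extract_entities("Total: $1,234.50 for 3 items", :currency) # => [1234.5]
p extract_entities("3 items at 2.5 each", :number)            # => [3, 2.5]
p extract_entities("mail bob@example.com now", :email)        # => ["bob@example.com"]
```

Entries whose parser is nil (email, phone, url) return the matched text unchanged, while number, integer, and currency convert it to a numeric value.
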

Standard NER entity types recognized in pattern constraints such as entity:PERSON.

STANDARD_NER_TYPES =
%w[
  PERSON PER
  LOCATION LOC
  ORGANIZATION ORG
  DATE TIME
  MONEY PERCENT
  MISC
].freeze
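
As a rough illustration of how a constraint such as entity:PERSON relates to this list, the sketch below splits the constraint on its first colon and checks the type part for membership. `NER_TYPES` is a local copy of the constant, and `split_constraint` is a hypothetical helper. The library's real constraint parser is not shown on this page.

```ruby
# Local copy of STANDARD_NER_TYPES so the sketch runs on its own.
NER_TYPES = %w[
  PERSON PER
  LOCATION LOC
  ORGANIZATION ORG
  DATE TIME
  MONEY PERCENT
  MISC
].freeze

# Hypothetical helper (an assumption): split "name:TYPE" at the first colon.
def split_constraint(constraint)
  name, type = constraint.split(":", 2)
  [name, type]
end

_name, type = split_constraint("entity:PERSON")
p NER_TYPES.include?(type)      # => true
p NER_TYPES.include?("PER")     # => true  (short form is listed as well)
p NER_TYPES.include?("person")  # => false (Array#include? is case-sensitive)
```
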

Class Method Summary

Class Method Details

.valid_entity_type?(type_str) ⇒ Boolean

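Validates an entity type string used in pattern constraints (e.g., “PERSON” from name:PERSON). Leading and trailing whitespace is stripped first. Returns true if the type is a standard NER type (exact, case-sensitive match), a built-in type (matched case-insensitively), or follows the custom type format (UPPER_SNAKE_CASE or CamelCase). Returns false for nil or blank input.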

Returns:

  • (Boolean)


# File 'lib/pattern_ruby/entity_types.rb', line 26

def self.valid_entity_type?(type_str)
  return false if type_str.nil? || type_str.strip.empty?

  normalized = type_str.strip
  return true if STANDARD_NER_TYPES.include?(normalized)
  return true if BUILT_IN.key?(normalized.downcase.to_sym)

  # Allow custom types that follow UPPER_SNAKE_CASE or CamelCase convention
  !!(normalized.match?(/\A[A-Z][A-Z0-9_]*\z/) || normalized.match?(/\A[A-Z][a-zA-Z0-9]*\z/))
end
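
To make the three acceptance rules concrete, the sketch below mirrors the method in a stand-in module and runs it on a few inputs. `EntityTypesSketch` and `BUILT_IN_KEYS` are assumptions made so the example is self-contained. With the gem loaded, the same calls would go to `PatternRuby::EntityTypes.valid_entity_type?`.

```ruby
# Stand-in module (an assumption) that mirrors the documented method.
module EntityTypesSketch
  STANDARD_NER_TYPES = %w[
    PERSON PER LOCATION LOC ORGANIZATION ORG DATE TIME MONEY PERCENT MISC
  ].freeze

  # Only the keys of BUILT_IN matter for validation, so this copy keeps just those.
  BUILT_IN_KEYS = %i[string number integer email phone url currency].freeze

  def self.valid_entity_type?(type_str)
    return false if type_str.nil? || type_str.strip.empty?

    normalized = type_str.strip
    return true if STANDARD_NER_TYPES.include?(normalized)
    return true if BUILT_IN_KEYS.include?(normalized.downcase.to_sym)

    # Custom types: UPPER_SNAKE_CASE or CamelCase.
    normalized.match?(/\A[A-Z][A-Z0-9_]*\z/) || normalized.match?(/\A[A-Z][a-zA-Z0-9]*\z/)
  end
end

samples = ["PERSON", " ORG ", "email", "Email", "PRODUCT_SKU", "ProductSku",
           "per", "product_sku", "Product_Sku", "9LIVES", "   ", nil]

samples.each do |s|
  puts format("%-15s %s", s.inspect, EntityTypesSketch.valid_entity_type?(s))
end
```

The first six samples are accepted and the last six are rejected. Note that lowercase NER names such as "per" fail because the standard list is matched case-sensitively, and "Product_Sku" fails because it mixes lowercase letters with an underscore and so fits neither custom format.
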