Class: PDF::Reader::AdvancedTextRunFilter

Inherits:
Object
Defined in:
lib/pdf/reader/advanced_text_run_filter.rb

Overview

Filters a collection of TextRun objects by a set of conditions on their attributes. The filter can return either the text runs that match the conditions (only) or the text runs that do not match them (exclude).

Text runs can be filtered on any of their attributes using the operators listed in VALID_OPERATORS. Conditions can be nested with 'and' and 'or' keys.

Examples:

  1. Single condition

AdvancedTextRunFilter.exclude(text_runs, text: { include: 'sample' })

  2. Multiple conditions (and)

AdvancedTextRunFilter.exclude(text_runs, {
  font_size: { greater_than: 10, less_than: 15 }
})

  3. Multiple possible values (or)

AdvancedTextRunFilter.exclude(text_runs, {
  font_size: { equal: [10, 12] }
})

  4. Complex AND/OR filter

AdvancedTextRunFilter.exclude(text_runs, {
  and: [
    { font_size: { greater_than: 10 } },
    { or: [
      { text: { include: "sample" } },
      { width: { greater_than: 100 } }
    ]}
  ]
})
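The page does not show how a nested filter hash is evaluated, so the following is only a minimal sketch of one plausible reading of the examples above. SketchRun and FilterSketch are hypothetical names, not part of PDF::Reader. The sketch assumes that several operators on one attribute are combined with "and", and that an array of expected values means "any of these".

```ruby
# Hypothetical stand-in for a text run; only the attributes used in the
# examples above are modeled.
SketchRun = Struct.new(:text, :font_size, :width, keyword_init: true)

# Hypothetical evaluator, not the library's implementation.
module FilterSketch
  module_function

  # Returns true when `run` satisfies every entry of `filter`.
  def match?(run, filter)
    filter.all? do |key, value|
      case key
      when :and then value.all? { |sub| match?(run, sub) }
      when :or  then value.any? { |sub| match?(run, sub) }
      else
        # `value` is an operator hash, e.g. { greater_than: 10, less_than: 15 }
        actual = run.public_send(key)
        value.all? { |operator, expected| apply(operator, actual, expected) }
      end
    end
  end

  # Applies one operator. Only four operators are covered by this sketch.
  def apply(operator, actual, expected)
    # Assumption: an array of expected values means "any of these".
    return expected.any? { |e| apply(operator, actual, e) } if expected.is_a?(Array)

    case operator
    when :equal        then actual == expected
    when :greater_than then actual > expected
    when :less_than    then actual < expected
    when :include      then actual.to_s.include?(expected.to_s)
    else raise ArgumentError, "operator not covered by this sketch: #{operator}"
    end
  end
end

runs = [
  SketchRun.new(text: "sample heading", font_size: 14, width: 80),
  SketchRun.new(text: "body",           font_size: 9,  width: 300),
  SketchRun.new(text: "wide",           font_size: 12, width: 150),
  SketchRun.new(text: "narrow",         font_size: 12, width: 50)
]

filter = {
  and: [
    { font_size: { greater_than: 10 } },
    { or: [
      { text: { include: "sample" } },
      { width: { greater_than: 100 } }
    ] }
  ]
}

# Mirrors "exclude": drop every run that matches the filter.
kept = runs.reject { |run| FilterSketch.match?(run, filter) }
puts kept.map(&:text).inspect # ["body", "narrow"]
```

Under these assumptions, "body" survives because its font size fails the first branch, and "narrow" survives because neither branch of the inner 'or' holds.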

Constant Summary

VALID_OPERATORS =
%i[
  equal
  not_equal
  greater_than
  less_than
  greater_than_or_equal
  less_than_or_equal
  include
  exclude
]
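The operator names suggest ordinary Ruby comparisons, but this page does not document their exact semantics. As a hedged illustration, the table below maps each name to one plausible comparison. OperatorSketch and COMPARISONS are hypothetical names, and the substring reading of include and exclude is an assumption.

```ruby
# Hypothetical mapping from operator name to comparison. The library's
# actual behavior may differ, especially for :include and :exclude.
module OperatorSketch
  COMPARISONS = {
    equal:                 ->(actual, expected) { actual == expected },
    not_equal:             ->(actual, expected) { actual != expected },
    greater_than:          ->(actual, expected) { actual > expected },
    less_than:             ->(actual, expected) { actual < expected },
    greater_than_or_equal: ->(actual, expected) { actual >= expected },
    less_than_or_equal:    ->(actual, expected) { actual <= expected },
    # Assumption: substring test on the string form of the attribute.
    include:               ->(actual, expected) { actual.to_s.include?(expected.to_s) },
    exclude:               ->(actual, expected) { !actual.to_s.include?(expected.to_s) }
  }.freeze

  module_function

  # Looks up the comparison for `operator` and applies it.
  # Raises ArgumentError for names outside the table.
  def compare(operator, actual, expected)
    comparison = COMPARISONS.fetch(operator) do
      raise ArgumentError, "unknown operator: #{operator.inspect}"
    end
    comparison.call(actual, expected)
  end
end

puts OperatorSketch.compare(:greater_than, 12, 10)            # true
puts OperatorSketch.compare(:exclude, "sample text", "sample") # false
```

Keeping the comparisons in a frozen hash means an unsupported operator fails loudly at lookup time instead of silently matching nothing.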

Constructor Details

#initialize(text_runs, filter_hash) ⇒ AdvancedTextRunFilter

Returns a new instance of AdvancedTextRunFilter.



# File 'lib/pdf/reader/advanced_text_run_filter.rb', line 61

def initialize(text_runs, filter_hash)
  @text_runs = text_runs
  @filter_hash = filter_hash
end

Instance Attribute Details

#filter_hash ⇒ Object (readonly)

Returns the value of attribute filter_hash.



# File 'lib/pdf/reader/advanced_text_run_filter.rb', line 59

def filter_hash
  @filter_hash
end

#text_runs ⇒ Object (readonly)

Returns the value of attribute text_runs.



# File 'lib/pdf/reader/advanced_text_run_filter.rb', line 59

def text_runs
  @text_runs
end

Class Method Details

.exclude(text_runs, filter_hash) ⇒ Object



# File 'lib/pdf/reader/advanced_text_run_filter.rb', line 55

def self.exclude(text_runs, filter_hash)
  new(text_runs, filter_hash).exclude
end

.only(text_runs, filter_hash) ⇒ Object



# File 'lib/pdf/reader/advanced_text_run_filter.rb', line 51

def self.only(text_runs, filter_hash)
  new(text_runs, filter_hash).only
end

Instance Method Details

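#exclude ⇒ Object

Returns the text runs that do not match the filter. When the filter hash is empty, all text runs are returned unchanged.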



# File 'lib/pdf/reader/advanced_text_run_filter.rb', line 71

def exclude
  return text_runs if filter_hash.empty?
  text_runs.reject { |text_run| evaluate_filter(text_run) }
end

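#only ⇒ Object

Returns the text runs that match the filter. When the filter hash is empty, all text runs are returned unchanged.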



# File 'lib/pdf/reader/advanced_text_run_filter.rb', line 66

def only
  return text_runs if filter_hash.empty?
  text_runs.select { |text_run| evaluate_filter(text_run) }
end
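The two method bodies above differ only in select versus reject, and both return the input as-is for an empty filter. The sketch below illustrates that relationship. PartitionSketch is a hypothetical stand-in, and because the private evaluate_filter method is not shown on this page, a caller-supplied predicate takes its place.

```ruby
# Hypothetical stand-in mirroring the #only / #exclude bodies shown above.
class PartitionSketch
  def initialize(items, filter_hash, &predicate)
    @items = items
    @filter_hash = filter_hash
    @predicate = predicate # stands in for the unseen evaluate_filter
  end

  # Keeps the items the predicate accepts.
  def only
    return @items if @filter_hash.empty?
    @items.select { |item| @predicate.call(item, @filter_hash) }
  end

  # Drops the items the predicate accepts.
  def exclude
    return @items if @filter_hash.empty?
    @items.reject { |item| @predicate.call(item, @filter_hash) }
  end
end

sizes  = [8, 10, 12, 14]
bigger = ->(size, filter) { size > filter[:greater_than] }

filtered = PartitionSketch.new(sizes, { greater_than: 10 }, &bigger)
puts filtered.only.inspect    # [12, 14]
puts filtered.exclude.inspect # [8, 10]

# With an empty filter, both methods return every item.
unfiltered = PartitionSketch.new(sizes, {}, &bigger)
puts unfiltered.only.inspect    # [8, 10, 12, 14]
puts unfiltered.exclude.inspect # [8, 10, 12, 14]
```

For a non-empty filter the two results partition the input, so only plus exclude covers every item exactly once. For an empty filter that no longer holds, since both return the full collection.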