Module: RDig::ContentExtractors::ExternalAppHelper

Included in:
PdfContentExtractor, WordContentExtractor
Defined in:
lib/rdig/content_extractors.rb

Overview

Intended for concrete implementations that define a get_content method, which takes the path to a file and returns the textual content extracted from that file.

Instance Method Summary

Instance Method Details

#as_file(content) {|file| ... } ⇒ Object

Yields:

  • (file)


# File 'lib/rdig/content_extractors.rb', line 65

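def as_file(content)
  # Write the content to a temporary file so external tools can read it by path.
  file = Tempfile.new('rdig')
  file << content
  file.close   # flush to disk before handing the file to the block
  yield file
  file.delete  # remove the temporary file once the block returns
end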
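
The temporary-file lifecycle of as_file can be sketched as follows. This is a minimal illustration, not RDig code: AsFileSketch and AsFileDemo are hypothetical names, and the method body is copied into a stand-in module so the example runs without RDig installed.

```ruby
require 'tempfile'

# Hypothetical stand-in module holding a copy of the documented method.
module AsFileSketch
  def as_file(content)
    file = Tempfile.new('rdig')
    file << content
    file.close
    yield file
    file.delete
  end
end

# Hypothetical host class; any object that mixes in the module would do.
class AsFileDemo
  include AsFileSketch
end

path_seen = nil
text_seen = nil
AsFileDemo.new.as_file('some document bytes') do |file|
  path_seen = file.path             # the path is only usable inside the block
  text_seen = File.read(file.path)  # the content is already flushed to disk
end

# The temporary file has been deleted by the time as_file returns.
exists_after = File.exist?(path_seen)
```

Note that, as written, an exception raised inside the block skips file.delete. Tempfile's own finalizer would normally remove the file later, but a caller that needs prompt cleanup may prefer to wrap the yield in begin/ensure.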

#can_do(content_type) ⇒ Object

For this to work, the initializer of the including ContentExtractor must set @available according to whether the required external executables are present.



# File 'lib/rdig/content_extractors.rb', line 75

def can_do(content_type)
  @available and super(content_type)
end
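
One way the @available requirement could be met is sketched below. Everything except the body of can_do is an assumption made for illustration: BaseExtractorSketch stands in for the real ContentExtractor (whose implementation is not shown on this page), the pattern-based can_do in that base class is invented, and executable_on_path? is a hypothetical helper, not part of RDig.

```ruby
# Hypothetical base class; the real ContentExtractor is not shown here.
class BaseExtractorSketch
  def initialize(pattern)
    @pattern = pattern
  end

  # Assumed behavior: accept content types matching a pattern.
  def can_do(content_type)
    !!(content_type =~ @pattern)
  end
end

# Stand-in module with the same can_do body as the documented method.
module AvailabilitySketch
  def can_do(content_type)
    @available and super(content_type)
  end
end

class ToolBackedExtractorSketch < BaseExtractorSketch
  include AvailabilitySketch

  def initialize(executable)
    super(/^application\/pdf/)
    # Decide once, at construction time, whether the external tool exists.
    @available = self.class.executable_on_path?(executable)
  end

  # Hypothetical helper: true if +name+ is an executable file in some PATH directory.
  def self.executable_on_path?(name)
    ENV.fetch('PATH', '').split(File::PATH_SEPARATOR).any? do |dir|
      candidate = File.join(dir, name)
      File.file?(candidate) && File.executable?(candidate)
    end
  end
end

# 'pdftotext' is only an example executable name; the result depends on the machine.
extractor = ToolBackedExtractorSketch.new('pdftotext')
puts extractor.can_do('application/pdf') ? 'PDF extraction available' : 'PDF extraction unavailable'
```

Because the module is included in the subclass, its can_do sits in front of the base class in the method lookup chain, so super reaches the base implementation only when @available is truthy.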

#process(content) ⇒ Object



# File 'lib/rdig/content_extractors.rb', line 57

def process(content)
  result = {}
  as_file(content) do |file|
    result[:content] = get_content(file.path).strip
  end
  result
end
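
The interplay between process, as_file and a get_content implementation can be sketched end to end. This is an illustration under stated assumptions: ProcessSketch copies the two documented methods into a stand-in module, UpcaseExtractorSketch is a hypothetical extractor, and a child Ruby process plays the role of the external converter so the example does not depend on any particular tool being installed.

```ruby
require 'rbconfig'
require 'tempfile'

# Hypothetical stand-in module mirroring the methods documented on this page.
module ProcessSketch
  def process(content)
    result = {}
    as_file(content) do |file|
      result[:content] = get_content(file.path).strip
    end
    result
  end

  def as_file(content)
    file = Tempfile.new('rdig')
    file << content
    file.close
    yield file
    file.delete
  end
end

# Hypothetical extractor: get_content receives a file path and returns text.
class UpcaseExtractorSketch
  include ProcessSketch

  def get_content(path)
    # Stand-in for an external application: a child Ruby process reads the
    # file at +path+ and prints its content upcased to standard output.
    child_script = 'print File.read(ARGV[0]).upcase'
    IO.popen([RbConfig.ruby, '-e', child_script, path], &:read)
  end
end

result = UpcaseExtractorSketch.new.process("  some text\n")
```

The returned hash holds the extracted text under :content with leading and trailing whitespace stripped.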