Class: PDF::Reader::PositionalTextReceiver

Inherits:
PageTextReceiver
  • Object
Defined in:
lib/pdf/reader/positional_text_receiver.rb

Overview

Receiver that collects the text content of a PDF page, keyed by its (x, y) position.

Typical usage:

reader = PDF::Reader.new(filename)
receiver = PDF::Reader::PositionalTextReceiver.new
reader.page(page).walk(receiver)
receiver.content

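Instance Method Summary

  • #content ⇒ Object: returns the hash of positional text.
  • #show_text(string) ⇒ Object: records text that is drawn on the page.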

Instance Method Details

#content ⇒ Object

Overrides the PageTextReceiver content accessor. Returns a hash of positional text, keyed first by y coordinate and then by x coordinate:

{
  y_coord=>{x_coord=>text, x_coord=>text },
  y_coord=>{x_coord=>text, x_coord=>text }
}
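
Because the hash is keyed by coordinates rather than by reading order, callers usually need to sort it before use. The following is a minimal sketch, not part of the library: `positional_lines` is a hypothetical helper, the sample data is invented, and it assumes the default PDF coordinate system, where y grows upward so that larger y values sit nearer the top of the page.

```ruby
# Hypothetical helper (not part of PDF::Reader): flatten a
# {y => {x => text}} hash into lines in reading order.
def positional_lines(content)
  content
    .sort_by { |y, _runs| -y }            # larger y first, i.e. top of page
    .map do |_y, runs|
      runs
        .sort_by { |x, _text| x }         # smaller x first, i.e. left to right
        .map { |_x, text| text }
        .join(" ")
    end
end

# Invented sample shaped like the hash described above.
sample = {
  680.0 => { 300.0 => "12.50", 72.0 => "Widget" },
  700.0 => { 72.0 => "Invoice", 300.0 => "No. 42" }
}

lines = positional_lines(sample)
puts lines
```

Text that looks like one line on the page can still land under slightly different y keys, so real documents may need the y values rounded or grouped within a tolerance before sorting.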


# File 'lib/pdf/reader/positional_text_receiver.rb', line 27

def content
  @content
end

#show_text(string) ⇒ Object

Records text that is drawn on the page under its (x, y) position, appending to any text already recorded at that position.

Raises:

  • (PDF::Reader::MalformedPDFError) if the current font is nil


# File 'lib/pdf/reader/positional_text_receiver.rb', line 13

def show_text(string) # Tj
  raise PDF::Reader::MalformedPDFError, "current font is invalid" if @state.current_font.nil?
  newx, newy = @state.trm_transform(0,0)
  @content[newy] ||= {}
  @content[newy][newx] ||= ''
  @content[newy][newx] << @state.current_font.to_utf8(string)
end
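
The accumulation behaviour of show_text can be tried out without a PDF. The sketch below mirrors the method body above but replaces the page state and font with hand-written stand-ins: `FakeFont`, `FakeState`, and `SketchReceiver` are hypothetical names, the position is set by hand instead of being derived from the text rendering matrix, and `ArgumentError` stands in for `PDF::Reader::MalformedPDFError` so that the example needs no gems.

```ruby
# Hypothetical stand-in for a font; a real font would decode glyph codes.
FakeFont = Struct.new(:name) do
  def to_utf8(string)
    string
  end
end

# Hypothetical stand-in for the page state.
class FakeState
  attr_accessor :current_font, :position

  def initialize(font)
    @current_font = font
    @position = [0, 0]
  end

  # The real state transforms the point through the text rendering matrix;
  # this stand-in returns whatever position was set by hand.
  def trm_transform(_x, _y)
    @position
  end
end

# Same logic as show_text above, isolated for experimentation.
class SketchReceiver
  attr_reader :content

  def initialize(state)
    @state = state
    @content = {}
  end

  def show_text(string)
    # ArgumentError replaces MalformedPDFError in this sketch.
    raise ArgumentError, "current font is invalid" if @state.current_font.nil?
    x, y = @state.trm_transform(0, 0)
    @content[y] ||= {}
    @content[y][x] ||= +""                # unfrozen string so << can append
    @content[y][x] << @state.current_font.to_utf8(string)
  end
end

state = FakeState.new(FakeFont.new("F1"))
receiver = SketchReceiver.new(state)

state.position = [72, 700]
receiver.show_text("Hel")
receiver.show_text("lo")                  # same position: appended to "Hel"

state.position = [200, 700]
receiver.show_text("World")               # same y, new x: separate entry

p receiver.content
```

Two calls at the same position end up as a single "Hello" entry, while the call at a different x on the same y creates a second entry under the same row.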