Class: PDF::Reader::Filter

Inherits:
Object
Defined in:
lib/pdf/reader/filter.rb

Overview

Various parts of a PDF file can be passed through a filter before being stored to provide support for features like compression and encryption. This class is for decoding that content.

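Currently three filters are decoded: ASCII85Decode, ASCIIHexDecode, and FlateDecode. Image-only filters (CCITTFaxDecode, DCTDecode, JBIG2Decode) are accepted but their data is passed through untouched. Hopefully support for others will be added in the future.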

Constructor Details

#initialize(name, options = nil) ⇒ Filter

Creates a new filter for decoding content.

Filters that are only used to encode image data are accepted, but the data is returned untouched. At this stage PDF::Reader has no need to decode images.



# File 'lib/pdf/reader/filter.rb', line 42

def initialize(name, options = nil)
  @options = options

  case name.to_sym
  when :ASCII85Decode  then @filter = :ascii85
  when :ASCIIHexDecode then @filter = :asciihex
  when :CCITTFaxDecode then @filter = nil
  when :DCTDecode      then @filter = nil
  when :FlateDecode    then @filter = :flate
  when :JBIG2Decode    then @filter = nil
  else                 raise UnsupportedFeatureError, "Unknown filter: #{name}"
  end
end
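The name-to-decoder mapping in the constructor can also be read as a lookup table. The following is a minimal, self-contained sketch of that idea, not the library's implementation: `DECODERS`, `decoder_for`, and the local `UnsupportedFeatureError` class are hypothetical names introduced here for illustration.

```ruby
# Hypothetical stand-in for the library's UnsupportedFeatureError.
class UnsupportedFeatureError < StandardError; end

# Filter names taken from the constructor above, mapped to decoder method
# names. A nil value means "accepted, but the data is passed through".
DECODERS = {
  ASCII85Decode:  :ascii85,
  ASCIIHexDecode: :asciihex,
  CCITTFaxDecode: nil,
  DCTDecode:      nil,
  FlateDecode:    :flate,
  JBIG2Decode:    nil
}.freeze

# Returns the decoder name for a filter, or raises for unknown filters.
# Hash#fetch only runs the block when the key is missing, so the nil
# entries above are returned as nil rather than treated as errors.
def decoder_for(name)
  DECODERS.fetch(name.to_sym) do
    raise UnsupportedFeatureError, "Unknown filter: #{name}"
  end
end

decoder_for("FlateDecode") # => :flate
decoder_for(:DCTDecode)    # => nil
```

As in the constructor, both strings and symbols are accepted because the name is normalised with `to_sym` before the lookup.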

Instance Method Details

#ascii85(data) ⇒ Object

Decode the specified data using the Ascii85 algorithm. Relies on the Ascii85 rubygem.



# File 'lib/pdf/reader/filter.rb', line 72

def ascii85(data)
  data = "<~#{data}" unless data.to_s[0,2] == "<~"
  Ascii85::decode(data)
rescue Exception => e
  # Oops, there was a problem decoding the stream
  raise MalformedPDFError, "Error occurred while decoding an ASCII85 stream (#{e.class.to_s}: #{e.to_s})"
end
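To show what the decoding step involves, here is a rough pure-Ruby sketch of Ascii85 decoding. It is an illustration only and is not how the Ascii85 gem is implemented; `ascii85_decode` is a hypothetical helper, and it assumes well-formed input in which any `z` shortcut appears on a group boundary.

```ruby
# Decode Ascii85 text: every 5 characters encode 4 bytes as a base-85 number.
def ascii85_decode(text)
  # Strip the optional <~ and ~> delimiters and any whitespace.
  body = text.sub(/\A<~/, "").sub(/~>\z/, "").gsub(/\s+/, "")
  # "z" is shorthand for four zero bytes, i.e. the group "!!!!!".
  body = body.gsub("z", "!!!!!")

  out = "".b
  body.chars.each_slice(5) do |group|
    pad = 5 - group.size
    # Each character is a base-85 digit offset by 33 ("!"). A short final
    # group is padded with the highest digit (84, the character "u").
    digits = group.map { |c| c.ord - 33 } + [84] * pad
    value = digits.reduce(0) { |acc, d| acc * 85 + d }
    # Emit the 32-bit big-endian value, dropping one byte per padding digit.
    out << [value].pack("N")[0, 4 - pad]
  end
  out
end

ascii85_decode("<~9jqo^~>") # => "Man "
ascii85_decode("9jqo")      # => "Man"
```

The second call has no delimiters, which mirrors the leniency of the method above: it adds the `<~` prefix itself when the stream lacks one.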

#asciihex(data) ⇒ Object

Decode the specified data using the AsciiHex algorithm.



# File 'lib/pdf/reader/filter.rb', line 82

def asciihex(data)
  data.chop! if data[-1,1] == ">"
  data = data[1,data.size] if data[0,1] == "<"
  data.gsub!(/[^A-Fa-f0-9]/,"")
  data << "0" if data.size % 2 == 1
  data.scan(/.{2}/).map { |s| s.hex.chr }.join("")
rescue Exception => e
  # Oops, there was a problem decoding the stream
  raise MalformedPDFError, "Error occurred while decoding an ASCIIHex stream (#{e.class.to_s}: #{e.to_s})"
end
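Note that the method above can modify its argument in place through `chop!` and `gsub!`. If that matters to a caller, the same decoding rules can be sketched without mutation as follows. `asciihex_decode` is a hypothetical helper for illustration, not part of PDF::Reader.

```ruby
# Decode ASCIIHex data without modifying the caller's string.
def asciihex_decode(data)
  # Keep only hex digits; this drops the <, > delimiters and whitespace.
  hex = data.delete("^0-9A-Fa-f")
  # An odd number of digits is treated as if a trailing 0 were present.
  hex += "0" if hex.size.odd?
  # "H*" packs a hex string into bytes, high nibble first.
  [hex].pack("H*")
end

source = "<48 65 6C 6C 6F>".freeze
asciihex_decode(source) # => "Hello"
asciihex_decode("7")    # => "p" (decoded as 0x70)
```

Because `String#delete` and `+=` return new strings, the frozen input in the example is accepted without raising.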

#filter(data) ⇒ Object

Attempts to decode the specified data with the current filter.

Filters that are only used to encode image data are accepted, but the data is returned untouched. At this stage PDF::Reader has no need to decode images.



# File 'lib/pdf/reader/filter.rb', line 61

def filter(data)
  # leave the data untouched if we don't support the required filter
  return data if @filter.nil?

  # decode the data
  self.send(@filter, data)
end
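The dispatch step can be illustrated in isolation. The sketch below uses a hypothetical stand-in class, `SketchFilter`, rather than PDF::Reader::Filter itself, and its single decoder is a simplified hex decoder included only so the example runs on its own.

```ruby
# Hypothetical stand-in for PDF::Reader::Filter, reduced to the dispatch step.
class SketchFilter
  def initialize(decoder)
    # A decoder method name such as :asciihex, or nil for pass-through.
    @decoder = decoder
  end

  def filter(data)
    # No decoder selected: hand the data back untouched.
    return data if @decoder.nil?

    # Call the decoder method whose name was stored at construction time.
    send(@decoder, data)
  end

  private

  # Simplified hex decoder, present only to give the dispatcher a target.
  def asciihex(data)
    [data.delete("^0-9A-Fa-f")].pack("H*")
  end
end

SketchFilter.new(:asciihex).filter("<4869>")    # => "Hi"
SketchFilter.new(nil).filter("raw image bytes") # => "raw image bytes"
```

Storing a method name and calling it with `send` keeps `filter` free of a second `case` statement; the choice of decoder is made once, in the constructor.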

#flate(data) ⇒ Object

Decode the specified data with the Zlib compression algorithm.



# File 'lib/pdf/reader/filter.rb', line 94

def flate(data)
  begin
    Zlib::Inflate.new.inflate(data)
  rescue Zlib::DataError => e
    # By default, Ruby's Zlib assumes the data it's inflating
    # is RFC1951 deflated data, wrapped in a RFC1950 zlib container.
    # If that fails, pass a negative window size to attempt to inflate
    # the data as a raw RFC1951 stream with no container.
    #
    # See
    # - http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/243545
    # - http://www.gzip.org/zlib/zlib_faq.html#faq38
    Zlib::Inflate.new(-Zlib::MAX_WBITS).inflate(data)
  end
rescue Exception => e
  # Oops, there was a problem inflating the stream
  raise MalformedPDFError, "Error occurred while inflating a compressed stream (#{e.class.to_s}: #{e.to_s})"
end
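The two-step strategy above (try a zlib-wrapped stream first, then fall back to raw deflate) can be tried out with the standard library alone. This is a minimal sketch under that assumption; `inflate_with_fallback` is a hypothetical name, and unlike the method above it does not wrap failures in MalformedPDFError.

```ruby
require "zlib"

# Inflate data that may be zlib-wrapped (RFC 1950) or raw deflate (RFC 1951).
def inflate_with_fallback(data)
  Zlib::Inflate.inflate(data)
rescue Zlib::DataError
  # A negative window size tells zlib to expect no header or checksum.
  raw = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  begin
    raw.inflate(data)
  ensure
    raw.close
  end
end

text = "stream contents " * 4

# The same text, compressed with and without the zlib container.
wrapped = Zlib::Deflate.deflate(text)
deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
raw = deflater.deflate(text, Zlib::FINISH)
deflater.close

inflate_with_fallback(wrapped) == text # => true
inflate_with_fallback(raw) == text     # => true
```

The raw stream fails the first attempt because it has no valid zlib header, which is what triggers the fallback branch.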