Class: SimpleTextExtract::FormatExtractor::DocX

Inherits:
Base
  • Object
Defined in:
lib/simple_text_extract/format_extractor/doc_x.rb

Instance Attribute Summary

Attributes inherited from Base

#file

Instance Method Summary

Methods inherited from Base

#initialize, #missing_dependency?

Constructor Details

This class inherits a constructor from SimpleTextExtract::FormatExtractor::Base

Instance Method Details

#extract ⇒ Object



# File 'lib/simple_text_extract/format_extractor/doc_x.rb', line 6

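def extract
  # Return nil when the `unzip` dependency is missing.
  return nil if missing_dependency?("unzip")

  # Print the archive contents to stdout, keep lines containing `<w:t`,
  # strip XML tags, then drop lines that are empty or whitespace-only.
  `unzip -p #{Shellwords.escape(file.path)} | grep '<w:t' | sed 's/<[^<]*>//g' | grep -v '^[[:space:]]*$'`
end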
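
The filtering stages of the pipeline above (`grep`, `sed`, `grep -v`) can be illustrated in plain Ruby on an in-memory string. This is only a sketch for reading the shell command: `filter_docx_xml` and the sample XML are hypothetical and not part of SimpleTextExtract, and the real method shells out to `unzip` rather than doing this in Ruby.

```ruby
# Hypothetical helper (an assumption for illustration, not library code):
# mirrors the grep | sed | grep -v stages of the shell pipeline.
def filter_docx_xml(xml)
  xml.each_line
     .select { |line| line.include?("<w:t") }     # grep '<w:t'
     .map    { |line| line.gsub(/<[^<]*>/, "") }  # sed 's/<[^<]*>//g'
     .reject { |line| line.strip.empty? }         # grep -v '^[[:space:]]*$'
     .join
end

# Made-up sample input, one XML element group per line.
sample = <<~XML
  <w:p><w:r><w:t>Hello</w:t></w:r></w:p>
  <w:sectPr/>
  <w:p><w:r><w:t xml:space="preserve"> </w:t></w:r></w:p>
  <w:p><w:r><w:t>World</w:t></w:r></w:p>
XML

puts filter_docx_xml(sample)
```

Because the first stage is a plain substring match on `<w:t`, it also keeps lines that contain other tags with that prefix, such as `<w:tbl>` or `<w:tab/>`; their text survives tag stripping unless the line ends up blank.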