Class: OoxmlParser::Parser

Inherits:
Object
Defined in:
lib/ooxml_parser/common_parser/parser.rb

Class Method Details

.parse(path_to_file) ⇒ CommonDocumentStructure

Base method that parses an OOXML document of any supported type (DOCX, XLSX, or PPTX)

Parameters:

  • path_to_file (String)

    path to the file to parse

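Returns:

  • (CommonDocumentStructure)

    parsed document model, or nil if the file is encrypted or contains no OOXML content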

# File 'lib/ooxml_parser/common_parser/parser.rb', line 20

def self.parse(path_to_file)
  Parser.parse_format(path_to_file) do
    # The block runs after the file has been unpacked into a temporary folder
    format = Parser.recognize_folder_format
    case format
    when :docx
      DocumentStructure.parse
    when :xlsx
      XLSXWorkbook.parse
    when :pptx
      Presentation.parse
    else
      # `warn` returns nil, so no model is produced for unrecognized content
      warn "#{path_to_file} is a simple zip file without OOXML content"
    end
  end
end
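
The dispatch on the recognized format can be sketched in isolation. This is a minimal illustration, not the library's code: `dispatch_by_format` and the `handlers` lambdas are hypothetical stand-ins for `DocumentStructure.parse`, `XLSXWorkbook.parse`, and `Presentation.parse`, and each stand-in returns only a label.

```ruby
# Hypothetical helper: picks a handler by format symbol.
# Returns the handler's result, or nil after printing a warning,
# because Kernel#warn itself returns nil.
def dispatch_by_format(format, path, handlers)
  handler = handlers[format]
  return handler.call if handler

  warn "#{path} is a simple zip file without OOXML content"
end

# Stand-in handlers (assumptions for illustration only).
handlers = {
  docx: -> { 'word document model' },
  xlsx: -> { 'spreadsheet model' },
  pptx: -> { 'presentation model' }
}

result_docx = dispatch_by_format(:docx, 'report.docx', handlers)
result_none = dispatch_by_format(nil, 'archive.zip', handlers)
```

Under these assumptions, an unrecognized format produces a warning on standard error and a nil result, which is why callers of `.parse` may need to check for nil.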

.parse_format(path_to_file) ⇒ CommonDocumentStructure

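Base method that unpacks the file into a temporary folder, yields to the given block to build the document model, and then removes the temporary folder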

Parameters:

  • path_to_file (String)

    path to the file to parse

Returns:

  • (CommonDocumentStructure)

    model returned by the block, or nil if the file is encrypted


# File 'lib/ooxml_parser/common_parser/parser.rb', line 6

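def self.parse_format(path_to_file)
  # Encrypted files cannot be unpacked, so there is nothing to parse
  return nil if OOXMLDocumentObject.encrypted_file?(path_to_file)
  # Work on a renamed copy and unpack it into the directory that holds the copy
  path_to_zip_file = OOXMLDocumentObject.copy_file_and_rename_to_zip(path_to_file)
  OOXMLDocumentObject.path_to_folder = path_to_zip_file.sub(File.basename(path_to_zip_file), '')
  OOXMLDocumentObject.unzip_file(path_to_zip_file, OOXMLDocumentObject.path_to_folder)
  # The block builds the model from the unpacked folder
  model = yield
  model.file_path = path_to_file if model
  # Remove the unpacked folder once the block has returned
  FileUtils.rm_rf(OOXMLDocumentObject.path_to_folder)
  model
end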
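
The pattern of preparing a working folder, yielding to a block, and cleaning up afterwards can be sketched as follows. This is a hedged variation, not the library's implementation: `with_working_folder` is a hypothetical helper, the unzip step is replaced by a placeholder that just creates a `word` subfolder, and the cleanup is placed in `ensure`, which the listing above does not do.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: creates a working folder, yields it to the block,
# and removes the folder even if the block raises.
def with_working_folder
  folder = Dir.mktmpdir('ooxml_sketch')
  # Placeholder for the unzip step: mimic the folder a DOCX would unpack to.
  FileUtils.mkdir_p(File.join(folder, 'word'))
  yield folder
ensure
  FileUtils.rm_rf(folder) if folder
end

seen_folder = nil
block_result = with_working_folder do |folder|
  seen_folder = folder
  Dir.exist?(File.join(folder, 'word'))
end
```

The method returns whatever the block returns, and the working folder no longer exists once the call finishes. Whether cleanup on error is desirable for the real parser is a design choice this sketch does not settle.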

.recognize_folder_format(directory = OOXMLDocumentObject.path_to_folder) ⇒ Symbol

Recognizes the document format from the subfolders of an unpacked document

Parameters:

  • directory (String) (defaults to: OOXMLDocumentObject.path_to_folder)

    path to directory

Returns:

  • (Symbol)

    type of document (:docx, :xlsx, or :pptx), or nil if none of the expected subfolders exists



# File 'lib/ooxml_parser/common_parser/parser.rb', line 39

def self.recognize_folder_format(directory = OOXMLDocumentObject.path_to_folder)
  return :docx if Dir.exist?("#{directory}/word")
  return :xlsx if Dir.exist?("#{directory}/xl")
  return :pptx if Dir.exist?("#{directory}/ppt")
end
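
The folder check can be tried without any real documents. The sketch below is a standalone copy of the logic under a hypothetical name, `detect_format`. It builds throwaway folders with `Dir.mktmpdir` and records what is detected for each. The `package_*` folder names are assumptions made for illustration.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical standalone copy of the detection logic shown above.
def detect_format(directory)
  return :docx if Dir.exist?(File.join(directory, 'word'))
  return :xlsx if Dir.exist?(File.join(directory, 'xl'))
  return :pptx if Dir.exist?(File.join(directory, 'ppt'))

  nil # no known subfolder: not a recognized OOXML layout
end

detected = {}
Dir.mktmpdir do |root|
  # One fake unpacked package per subfolder name; 'plain' matches nothing.
  %w[word xl ppt plain].each do |name|
    package = File.join(root, "package_#{name}")
    FileUtils.mkdir_p(File.join(package, name))
    detected[name] = detect_format(package)
  end
end
```

The checks run in a fixed order, so a folder that happened to contain both `word` and `xl` would be reported as `:docx`.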