Class: Langchain::Processors::Xlsx

Inherits:
Base
  • Object
Defined in:
lib/langchain/processors/xlsx.rb

Constant Summary

EXTENSIONS =
[".xlsx", ".xlsm"].freeze
CONTENT_TYPES =
["application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"].freeze
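As an illustration of how these two constants could be used to decide whether a file belongs to this processor, here is a minimal sketch. The values are copied from the constants above; the helper `xlsx_candidate?` is hypothetical and is not part of the library.

```ruby
# Values copied from the constant summary above.
EXTENSIONS = [".xlsx", ".xlsm"].freeze
CONTENT_TYPES = ["application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"].freeze

# Hypothetical helper (not a library method): true when either the file
# extension or the MIME type matches one of the listed values.
def xlsx_candidate?(path: nil, content_type: nil)
  return true if path && EXTENSIONS.include?(File.extname(path).downcase)
  return true if content_type && CONTENT_TYPES.include?(content_type)
  false
end
```

Downcasing the extension is a choice made for this sketch so that `report.XLSX` also matches; the library's own lookup may behave differently.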

Instance Method Summary

Methods included from DependencyHelper

#depends_on

Constructor Details

#initialize ⇒ Xlsx

Returns a new instance of Xlsx.



# File 'lib/langchain/processors/xlsx.rb', line 9

def initialize(*)
  depends_on "roo"
end

Instance Method Details

#parse(data) ⇒ Array<Array<String>>

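Parses the spreadsheet and returns its contents as rows of cell strings. Rows from all sheets are concatenated into a single array, and each cell is converted to a string with surrounding whitespace stripped.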

Parameters:

  • data (File)

Returns:

  • (Array<Array<String>>)

    Array of rows; each row is an array of cell values as stripped strings



# File 'lib/langchain/processors/xlsx.rb', line 16

def parse(data)
  xlsx_file = Roo::Spreadsheet.open(data)
  xlsx_file.each_with_pagename.flat_map do |_, sheet|
    sheet.map do |row|
      row.map { |i| i.to_s.strip }
    end
  end
end
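
To make the shape of the return value concrete, the sketch below applies the same flatten-and-normalize steps to a hypothetical in-memory hash of sheets. The hash stands in for the Roo workbook so the example runs without the `roo` gem or a real file; the sheet names and cell values are invented for illustration.

```ruby
# Hypothetical stand-in for a workbook: sheet name => rows of raw cell values.
sheets = {
  "Sheet1" => [["Name ", 1], [nil, 2.5]],
  "Sheet2" => [[" x", true]]
}

# Same shape as the body of #parse: merge the rows of every sheet into one
# list, converting each cell to a string and stripping surrounding whitespace.
rows = sheets.flat_map do |_name, sheet|
  sheet.map do |row|
    row.map { |cell| cell.to_s.strip }
  end
end

rows.each { |row| puts row.inspect }
```

Two consequences follow from this shape: sheet boundaries are not preserved in the result, and empty cells come back as empty strings rather than `nil`.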