Class: Daru::IO::Importers::Excelx

Inherits:
Base
  • Object
Defined in:
lib/daru/io/importers/excelx.rb

Overview

Excelx Importer class, which handles .xlsx files on behalf of the Excel Importer.

Class Method Summary

Instance Method Summary

Methods inherited from Base

from, guess_parse

Methods inherited from Base

#optional_gem

Constructor Details

#initialize ⇒ Excelx

Checks for required gem dependencies of Excelx Importer



# File 'lib/daru/io/importers/excelx.rb', line 11

def initialize
  optional_gem 'roo', '~> 2.7.0'
end
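
The '~> 2.7.0' argument is a RubyGems pessimistic version constraint. As a small illustration, the sketch below checks a few versions against it using RubyGems' own Gem::Requirement; it does not call anything from daru-io, and the variable names are only for this example.

```ruby
require 'rubygems'

# Same constraint string that is passed to optional_gem above.
roo_constraint = Gem::Requirement.new('~> 2.7.0')

# '~> 2.7.0' means: at least 2.7.0, but below 2.8.
accepts_patch_bump = roo_constraint.satisfied_by?(Gem::Version.new('2.7.1'))
accepts_minor_bump = roo_constraint.satisfied_by?(Gem::Version.new('2.8.0'))
accepts_older      = roo_constraint.satisfied_by?(Gem::Version.new('2.6.9'))

puts accepts_patch_bump # true
puts accepts_minor_bump # false
puts accepts_older      # false
```

In other words, any 2.7.x release of roo satisfies the requirement, while 2.8.0 and later do not.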

Class Method Details

.read(path) ⇒ Daru::IO::Importers::Excelx

Reads from an excelx (xlsx) file

Examples:

Reading from a local xlsx file

local_instance = Daru::IO::Importers::Excelx.read("Stock-counts-sheet.xlsx")

Reading from a remote xlsx file

url = "https://www.exact.com/uk/images/downloads/getting-started-excel-sheets/Stock-counts-sheet.xlsx"
remote_instance = Daru::IO::Importers::Excelx.read(url)

Parameters:

  • path (String)

    Local path or remote URL of the xlsx file from which the DataFrame is to be imported.

Returns:

  • (Daru::IO::Importers::Excelx)

# File 'lib/daru/io/importers/excelx.rb', line 30

def read(path)
  @file_data = Roo::Excelx.new(path)
  self
end
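
Because read stores the opened file and returns self, a call can be chained directly onto it. The sketch below shows that pattern with a hypothetical ToyImporter class; it is not the daru-io class, and the way its class-level read delegates to a new instance is an assumption, since Base is not shown on this page.

```ruby
# Hypothetical stand-in for an importer; not part of daru-io.
class ToyImporter
  # Assumed delegation: build an instance and forward to the instance method.
  def self.read(path)
    new.read(path)
  end

  def read(path)
    # The real importer stores Roo::Excelx.new(path) here.
    @file_data = "contents of #{path}"
    self # returning self is what makes chaining possible
  end

  def call
    @file_data.upcase
  end
end

instance = ToyImporter.read('a.xlsx')      # an importer instance, not the data
result   = ToyImporter.read('a.xlsx').call # read and import in one expression
puts result # CONTENTS OF A.XLSX
```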

Instance Method Details

#call(sheet: 0, skiprows: 0, skipcols: 0, order: true, index: false) ⇒ Daru::DataFrame

Imports a Daru::DataFrame from an Excelx Importer instance

Examples:

Importing from specific sheet

df = local_instance.call(sheet: 'Example Stock Counts')

#=> <Daru::DataFrame(15x7)>
#           Status Stock coun  Item code        New Descriptio Stock coun Offset G/L
#     0          H          1        nil        nil New stock  2014-08-01        nil
#     1        nil          1  IND300654          2 New stock  2014-08-01      51035
#     2        nil          1   IND43201          5 New stock  2014-08-01      51035
#     3        nil          1   OUT30045          3 New stock  2014-08-01      51035
#    ...       ...        ...     ...           ...     ...       ...           ...

Importing from a remote URL and default sheet

df = remote_instance.call

#=> <Daru::DataFrame(15x7)>
#           Status Stock coun  Item code        New Descriptio Stock coun Offset G/L
#     0          H          1        nil        nil New stock  2014-08-01        nil
#     1        nil          1  IND300654          2 New stock  2014-08-01      51035
#     2        nil          1   IND43201          5 New stock  2014-08-01      51035
#     3        nil          1   OUT30045          3 New stock  2014-08-01      51035
#    ...       ...        ...     ...           ...     ...       ...           ...

Importing without headers

df = local_instance.call(sheet: 'Example Stock Counts', order: false)

#=> <Daru::DataFrame(16x7)>
#                0           1          2          3          4          5        6
#     0      Status Stock coun  Item code        New Descriptio Stock coun Offset G/L
#     1          H          1        nil        nil New stock  2014-08-01        nil
#     2        nil          1  IND300654          2 New stock  2014-08-01      51035
#     3        nil          1   IND43201          5 New stock  2014-08-01      51035
#     4        nil          1   OUT30045          3 New stock  2014-08-01      51035
#    ...       ...        ...     ...           ...     ...       ...           ...

Parameters:

  • sheet (Integer or String) (defaults to: 0)

    The sheet to import from, given either by position (Integer) or by name (String).

  • skiprows (Integer) (defaults to: 0)

    Skips the first :skiprows number of rows from the sheet being parsed.

  • skipcols (Integer) (defaults to: 0)

    Skips the first :skipcols number of columns from the sheet being parsed.

  • order (Boolean) (defaults to: true)

    When set to true, the first row of the given sheet is used as the order of the Daru::DataFrame, and the data of the DataFrame consists of the remaining rows.

    When set to false, a default order (0 to n-1) is chosen for the DataFrame, and the data of the DataFrame consists of all rows in the sheet.

  • index (Boolean) (defaults to: false)

    When set to true, the first column of the given sheet is used as the index of the Daru::DataFrame, and the data of the DataFrame consists of the remaining columns.
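
To make the effect of skiprows, skipcols and order concrete, here is a minimal sketch on plain Ruby arrays. The sketch_import helper and the sample rows are hypothetical and written only for this illustration; it is not the importer's implementation and leaves out the index option.

```ruby
# Hypothetical helper: mimics skiprows, skipcols and order on an array of rows.
def sketch_import(rows, skiprows: 0, skipcols: 0, order: true)
  # Drop the leading rows, then the leading cells of every remaining row.
  data = rows.drop(skiprows).map { |row| row.drop(skipcols) }

  # order: true  -> the first remaining row becomes the column order
  # order: false -> default order 0..n-1, and every row stays in the data
  header = order ? data.shift : (0...(data.first || []).length).to_a

  { order: header, data: data }
end

sheet = [
  ['junk', 'junk',     'junk'],
  ['x',    'Item',     'Qty'],
  ['x',    'IND43201', 5],
  ['x',    'OUT30045', 3]
]

with_header    = sketch_import(sheet, skiprows: 1, skipcols: 1)
without_header = sketch_import(sheet, skiprows: 1, skipcols: 1, order: false)

p with_header[:order]          # ["Item", "Qty"]
p with_header[:data].length    # 2
p without_header[:order]       # [0, 1]
p without_header[:data].length # 3
```

With order: false the header row is kept as the first data row, which matches the 16x7 shape in the "Importing without headers" example compared with 15x7 when headers are used.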

Returns:

  • (Daru::DataFrame)

# File 'lib/daru/io/importers/excelx.rb', line 87

def call(sheet: 0, skiprows: 0, skipcols: 0, order: true, index: false)
  @order    = order
  @index    = index
  worksheet = @file_data.sheet(sheet)
  @data     = strip_html_tags(skip_data(worksheet.to_a, skiprows, skipcols))
  @index    = process_index
  @order    = process_order || (0..@data.first.length-1)
  @data     = process_data

  Daru::DataFrame.rows(@data, order: @order, index: @index)
end

#read(path) ⇒ Daru::IO::Importers::Excelx

Reads from an excelx (xlsx) file

Examples:

Reading from a local xlsx file

local_instance = Daru::IO::Importers::Excelx.read("Stock-counts-sheet.xlsx")

Reading from a remote xlsx file

url = "https://www.exact.com/uk/images/downloads/getting-started-excel-sheets/Stock-counts-sheet.xlsx"
remote_instance = Daru::IO::Importers::Excelx.read(url)

Parameters:

  • path (String)

    Local path or remote URL of the xlsx file from which the DataFrame is to be imported.

Returns:

  • (Daru::IO::Importers::Excelx)

# File 'lib/daru/io/importers/excelx.rb', line 30

def read(path)
  @file_data = Roo::Excelx.new(path)
  self
end