Class: Daru::IO::Importers::Mongo

Inherits:
JSON
Defined in:
lib/daru/io/importers/mongo.rb

Overview

Mongo Importer class, which adds the from_mongo method to Daru::DataFrame.

Methods inherited from JSON

#read, read

Methods inherited from Base

guess_parse, read

Methods inherited from Base

#optional_gem

Constructor Details

#initialize ⇒ Mongo

Checks for the gem dependencies required by the Mongo Importer.



# File 'lib/daru/io/importers/mongo.rb', line 11

def initialize
  super
  optional_gem 'mongo'
end

Class Method Details

.from(connection) ⇒ Daru::IO::Importers::Mongo

Sets up a Mongo client from the given connection and returns the importer instance. No documents are read until #call is invoked.

Examples:

Loading from a connection string

instance_1 = Daru::IO::Importers::Mongo.from('mongodb://127.0.0.1:27017/test')

Loading from a connection hash

instance_2 = Daru::IO::Importers::Mongo.from({ hosts: ['127.0.0.1:27017'], database: 'test' })

Loading from a Mongo::Client connection

instance_3 = Daru::IO::Importers::Mongo.from(Mongo::Client.new(['127.0.0.1:27017'], database: 'test'))

Parameters:

  • connection (String or Hash or Mongo::Client)

    Details of the Mongo database and hosts to connect to.

Returns:

  • (Daru::IO::Importers::Mongo)

    The importer instance itself, with its Mongo client set.

# File 'lib/daru/io/importers/mongo.rb', line 33

def from(connection)
  @client = get_client(connection)
  self
end
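
Because from stores the client and returns self, the result can be passed straight on to #call. The following is a minimal sketch of that pattern only. SketchImporter and the plain Hash standing in for a Mongo client are hypothetical names used for illustration and are not part of daru-io.

```ruby
# Hypothetical stand-in for the importer: it mirrors only the
# "store the client, return self" shape of the from method above.
class SketchImporter
  def from(connection)
    @client = connection # the real importer builds a Mongo client here
    self                 # returning self is what allows chaining
  end

  def call(collection)
    @client.fetch(collection.to_sym)
  end
end

# A plain Hash plays the role of the database in this sketch.
fake_db = { cars: [{ name: 'Audi' }, { name: 'Volvo' }] }

importer = SketchImporter.new
chained  = importer.from(fake_db)
rows     = chained.call('cars')
```

Here chained is the same object as importer, so from(...).call(...) can be written as a single expression.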

Instance Method Details

#call(collection, *columns, order: nil, index: nil, filter: nil, limit: nil, skip: nil, **named_columns) ⇒ Daru::DataFrame

Note:
  • For more information on using JSON-path selectors, have a look at the explanations here and here.
  • On MRI Ruby 2.4.0, the Mongo gem fails with an "Argument Error : expected Proc Argument" error because of a known bug in that Ruby release. The bug appears to have been fixed from Ruby 2.4.1 onwards, so avoid using this Mongo Importer on Ruby 2.4.0.

Imports a Daru::DataFrame from a Mongo Importer instance.

Examples:

Importing without jsonpath selectors

# The below 'cars' collection can be recreated in a Mongo shell with -
# db.cars.drop()
# db.cars.insert({name: "Audi", price: 52642})
# db.cars.insert({name: "Mercedes", price: 57127})
# db.cars.insert({name: "Volvo", price: 29000})

df = instance.call('cars')

#=> #<Daru::DataFrame(3x3)>
#           _id       name      price
#  0 5948d0bfcd       Audi    52642.0
#  1 5948d0c6cd   Mercedes    57127.0
#  2 5948d0cecd      Volvo    29000.0

Importing with jsonpath selectors

# The below 'cars' collection can be recreated in a Mongo shell with -
# db.cars.drop()
# db.cars.insert({name: "Audi", price: 52642, star: { fuel: 9.8, cost: 8.6, seats: 9.9, sound: 9.3 }})
# db.cars.insert({name: "Mercedes", price: 57127, star: { fuel: 9.3, cost: 8.9, seats: 8.4, sound: 9.1 }})
# db.cars.insert({name: "Volvo", price: 29000, star: { fuel: 7.8, cost: 9.9, seats: 8.2, sound: 8.9 }})

df = instance.call(
  'cars',
  '$.._id',
  '$..name',
  '$..price',
  '$..star..fuel',
  '$..star..cost'
)

#=> #<Daru::DataFrame(3x5)>
#          _id       name      price       fuel       cost
# 0 5948d40b50       Audi    52642.0        9.8        8.6
# 1 5948d42850   Mercedes    57127.0        9.3        8.9
# 2 5948d44350      Volvo    29000.0        7.8        9.9
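
To give a feel for what a recursive-descent selector such as '$..star..fuel' picks out, here is a rough plain-Ruby approximation. The helper deep_values is hypothetical and is not how the importer resolves JSON-path selectors. It only imitates the "match this key at any depth" behaviour of the '..' operator on already-parsed documents.

```ruby
# Hypothetical helper: collects every value stored under `key`,
# at any nesting depth, in document order.
def deep_values(node, key)
  case node
  when Hash
    node.flat_map do |k, v|
      found = k == key ? [v] : []
      found + deep_values(v, key)
    end
  when Array
    node.flat_map { |item| deep_values(item, key) }
  else
    []
  end
end

docs = [
  { 'name' => 'Audi',     'star' => { 'fuel' => 9.8, 'cost' => 8.6 } },
  { 'name' => 'Mercedes', 'star' => { 'fuel' => 9.3, 'cost' => 8.9 } },
  { 'name' => 'Volvo',    'star' => { 'fuel' => 7.8, 'cost' => 9.9 } }
]

names = deep_values(docs, 'name')                      # roughly '$..name'
fuel  = deep_values(deep_values(docs, 'star'), 'fuel') # roughly '$..star..fuel'
```

Each selector yields one value per document, which is why each selector in the example above becomes one column of the resulting Daru::DataFrame.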

Parameters:

  • collection (String or Symbol)

    A specific collection in the Mongo database, to import as Daru::DataFrame.

  • columns (Array)

    JSON-path selectors to select specific fields from the JSON input.

  • order (String or Array) (defaults to: nil)

    Either a JSON-path selector string, or an array containing the order of the Daru::DataFrame. DO NOT provide both order and named_columns at the same time.

  • index (String or Array) (defaults to: nil)

    Either a JSON-path selector string, or an array containing the index of the Daru::DataFrame.

  • filter (Hash) (defaults to: nil)

    Selects only the Mongo documents in the collection that match the given filter.

  • limit (Integer) (defaults to: nil)

    Limits the number of Mongo documents to be parsed from the collection.

  • skip (Integer) (defaults to: nil)

    Skips the given number of documents from the Mongo collection before importing.

  • named_columns (Hash)

    JSON-path selectors to select specific fields from the JSON input. DO NOT provide both order and named_columns at the same time.
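
The filter, skip and limit parameters are handed to the Mongo query rather than applied to the finished DataFrame. As a rough illustration of how they combine, the sketch below emulates them on an in-memory array. The helper select_docs is hypothetical, it supports only simple equality filters (Mongo filters are considerably richer), and it assumes skip is applied before limit.

```ruby
# Hypothetical in-memory emulation of filter / skip / limit.
# Only equality on top-level fields is supported in this sketch.
def select_docs(docs, filter: nil, skip: nil, limit: nil)
  matched = docs.select do |doc|
    (filter || {}).all? { |field, value| doc[field] == value }
  end
  matched = matched.drop(skip)   if skip  # skip first ...
  matched = matched.first(limit) if limit # ... then limit
  matched
end

cars = [
  { name: 'Audi',     price: 52642 },
  { name: 'Mercedes', price: 57127 },
  { name: 'Volvo',    price: 29000 }
]

everything = select_docs(cars)
second     = select_docs(cars, skip: 1, limit: 1)
volvo_only = select_docs(cars, filter: { name: 'Volvo' })
```

With the real importer, the equivalent options would be passed as keyword arguments to #call, for example filter: { name: 'Volvo' }.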

Returns:

  • (Daru::DataFrame)

# File 'lib/daru/io/importers/mongo.rb', line 107

def call(collection, *columns, order: nil, index: nil,
  filter: nil, limit: nil, skip: nil, **named_columns)
  @json = ::JSON.parse(
    @client[collection.to_sym]
    .find(filter, skip: skip, limit: limit)
    .to_json
  )

  super(*columns, order: order, index: index, **named_columns)
end
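
The method body above serializes the query result to JSON and parses it back before delegating to the inherited JSON importer. One visible effect of such a round trip is that symbol keys come back as strings at every nesting level. The sketch below shows this with plain hashes in place of Mongo documents, which is an assumption made so that the example runs without a database.

```ruby
require 'json'

# Plain hashes stand in for documents returned by a Mongo query.
docs = [{ name: 'Audi', price: 52642, star: { fuel: 9.8 } }]

# Same round trip as in the method body: serialize, then parse.
parsed = JSON.parse(docs.to_json)

top_keys    = parsed.first.keys         # keys are now strings
nested_keys = parsed.first['star'].keys # nested keys as well
```

After the round trip the data consists only of plain JSON types (hashes with string keys, arrays, strings, numbers), which is the kind of input the JSON-path selectors are applied to.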

#from(connection) ⇒ Daru::IO::Importers::Mongo

Sets up a Mongo client from the given connection and returns the importer instance. No documents are read until #call is invoked.

Examples:

Loading from a connection string

instance_1 = Daru::IO::Importers::Mongo.from('mongodb://127.0.0.1:27017/test')

Loading from a connection hash

instance_2 = Daru::IO::Importers::Mongo.from({ hosts: ['127.0.0.1:27017'], database: 'test' })

Loading from a Mongo::Client connection

instance_3 = Daru::IO::Importers::Mongo.from(Mongo::Client.new(['127.0.0.1:27017'], database: 'test'))

Parameters:

  • connection (String or Hash or Mongo::Client)

    Details of the Mongo database and hosts to connect to.

Returns:

  • (Daru::IO::Importers::Mongo)

    The importer instance itself, with its Mongo client set.

# File 'lib/daru/io/importers/mongo.rb', line 33

def from(connection)
  @client = get_client(connection)
  self
end