Class: SO2DB::Importer

Inherits:
Object
Defined in:
lib/so2db.rb

Overview

Base class for StackOverflow data importers. Drives database setup and imports data from the files in a directory.

Subclasses must implement a method with the following signature:

import_stream(formatter)

This method may be private. It performs the actual import using data from the provided formatter, which has type SO2DB::Formatter. The formatter is passed in to support both streaming data to STDIN (e.g., for PostgreSQL’s COPY command) and writing data to a file before import (e.g., for MySQL’s mysqlimport utility).
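A minimal sketch of the subclass contract follows. FakeFormatter and FakeImporter are hypothetical stand-ins so the example runs without the gem, and the format(io) method is an assumption: the interface of SO2DB::Formatter is not shown on this page. The point is only that import_stream can be private and receives the formatter as its source of data.

```ruby
require 'stringio'

# Hypothetical stand-in for SO2DB::Formatter. The real interface is not
# documented here, so `format(io)` is an assumption made for illustration.
class FakeFormatter
  def initialize(rows, delimiter)
    @rows = rows
    @delimiter = delimiter
  end

  # Writes each row to the given IO as one delimited line.
  def format(io)
    @rows.each { |row| io.puts(row.join(@delimiter)) }
  end
end

# Hypothetical stand-in for the base class, reduced to the single hook.
class FakeImporter
  def run(formatter)
    import_stream(formatter) # implicit receiver, so a private method is fine
  end
end

class StringImporter < FakeImporter
  attr_reader :sink

  def initialize
    @sink = StringIO.new
  end

  private

  # A real importer would stream to a database client's STDIN or to a
  # temporary file; this sketch streams into an in-memory buffer instead.
  def import_stream(formatter)
    formatter.format(@sink)
  end
end

importer = StringImporter.new
importer.run(FakeFormatter.new([[1, 'Teacher'], [2, 'Student']], "\v"))
```

After the call, importer.sink.string holds the two delimited lines that a database client would have received.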

The importer uses ActiveRecord for table creation and Foreigner for creating table relationships, so only the databases supported by these libraries can be used. In addition, a ‘uuid’ method must be available to the adapter provided to ActiveRecord. (See so2pg for an example of an adapter extension that provides the method.)

The class also provides two accessors for subclasses:

attr_reader :conn_opts
attr_accessor :delimiter

The conn_opts property provides the ActiveRecord connection data (e.g., :database, :host, etc.). The delimiter property sets the delimiter used by the formatter. The default delimiter is \v (vertical tab, 0xB).
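As a quick illustration of the default delimiter, the snippet below builds the vertical tab character the same way the constructor does (11.chr) and uses it to join and split one record. The field values are made up for the example.

```ruby
# 11.chr yields the vertical tab character, i.e. "\v" (0xB).
delimiter = 11.chr

# Hypothetical record; the values are illustrative only.
fields = [42, 'Nice Answer', '2012-01-01']

# Join the fields into one delimited line, then split it back apart.
line = fields.join(delimiter)
parsed = line.split(delimiter)
```

A control character such as the vertical tab is unlikely to occur inside post or comment text, which is presumably why it is used here instead of a comma or tab.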

Direct Known Subclasses

PgImporter

Instance Method Summary

Constructor Details

#initialize(relations = false, optionals = false, adapter = '', options = {}) ⇒ Importer

Initializes the importer.

Arguments:

relations: (Boolean) Indicates whether database relationships should
                     be created.
optionals: (Boolean) Indicates whether optional database tables and
                     content should be created.
adapter:   (String)  The ActiveRecord adapter name (e.g., 'postgresql').
options:   (Hash)    The database connection options, as required by
                     ActiveRecord for the provided adapter.


# File 'lib/so2db.rb', line 65

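def initialize(relations = false, optionals = false, adapter = '', options = {})
  @relations = relations
  @optionals = optionals
  # Fold the adapter name into the ActiveRecord connection options.
  @conn_opts = options.merge( { :adapter => adapter } )
  # 11.chr is the vertical tab character ("\v", 0xB).
  @format_delimiter = 11.chr.to_s
end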
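The way the constructor assembles conn_opts can be sketched in isolation. The option values below are hypothetical; only the merge call mirrors the constructor.

```ruby
# Hypothetical connection options, as a caller might pass them in.
options = { :database => 'so', :host => 'localhost', :adapter => 'sqlite3' }
adapter = 'postgresql'

# Hash#merge returns a new hash and leaves `options` untouched. On a key
# collision the argument wins, so the explicit adapter overrides any
# :adapter entry already present in `options`.
conn_opts = options.merge({ :adapter => adapter })
```

Because the merge is non-destructive, the caller's options hash is not modified by constructing an importer.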

Instance Method Details

#import(dir) ⇒ Object

Creates the database tables, imports the data from the files in the specified directory, and then creates any requested relationships and optional tables.

Arguments:

dir:  (String) The directory path containing the StackOverflow data
               dump XML files (e.g., badges.xml, posts.xml, etc.).


# File 'lib/so2db.rb', line 78

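def import(dir)
  # Connect and create the core tables before loading any data.
  setup
  create_basics
  import_data(dir)
  # Relationships are added only after the data has been imported.
  create_relations if @relations
  create_optionals if @optionals
  create_optional_relations if @relations && @optionals
end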
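The order of steps, and the effect of the two constructor flags, can be traced with a small stand-in class. TracingImporter and its recording stubs are hypothetical; only the body of import mirrors the method shown here.

```ruby
# Hypothetical stand-in that records which steps run, and in what order.
class TracingImporter
  attr_reader :calls

  def initialize(relations = false, optionals = false)
    @relations = relations
    @optionals = optionals
    @calls = []
  end

  # Same sequence of steps as the import method documented above.
  def import(dir)
    setup
    create_basics
    import_data(dir)
    create_relations if @relations
    create_optionals if @optionals
    create_optional_relations if @relations && @optionals
  end

  private

  # Each stub only records its own name instead of touching a database.
  def setup;                     @calls << :setup; end
  def create_basics;             @calls << :create_basics; end
  def import_data(_dir);         @calls << :import_data; end
  def create_relations;          @calls << :create_relations; end
  def create_optionals;          @calls << :create_optionals; end
  def create_optional_relations; @calls << :create_optional_relations; end
end

# 'dump' is a placeholder directory name; the stubs never read it.
basic = TracingImporter.new(false, false)
basic.import('dump')

full = TracingImporter.new(true, true)
full.import('dump')

optionals_only = TracingImporter.new(false, true)
optionals_only.import('dump')
```

With both flags false, only the setup, basic table, and data import steps run. Relationships between optional tables are created only when both flags are true.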