IntegratedData
A tool for compositing custom domain objects from disparate data sources.
Synopsis
IntegratedData is a micro-framework for data integration that allows you to build POROs whose properties are scattered across various sources.
Concepts
Data Sources
An IntegratedData::Source is any class capable of retrieving data from some source on initialization. It could access a database, API, CSV, or pretty much anything under the digital sun. Public methods of IntegratedData::Sources act like different strategies for accessing the data therein.
Entities
An IntegratedData::Entity represents a concept with various attributes. Each attribute can specify different data sources and strategies it can be derived from, using different identifiers.
Identifiers
Finally, you are encouraged to make proper value objects for your identifiers by using IntegratedData::Identifiers. When initializing an entity, any ids with a corresponding identifier class will be coerced into an instance of one of these.
API
Data Sources
Convert any class into a data source by extending IntegratedData::Source:
require 'integrated_data'
class MySource
  extend IntegratedData::Source
end
Requirements
Sources are required to implement these methods:
IntegratedData::Source.build(options = {})
Data sources must implement a class method, build, that can accept a Hash of options and returns an object you can invoke strategies on.
IntegratedData::Source strategies
The object returned by IntegratedData::Source.build should have public methods that accept a single argument–an identifier–and return data from the source.
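The two requirements above can be sketched as follows. This is a minimal illustration, not code from the gem: HashSource, its :records option, and the strict strategy are hypothetical names, and the IntegratedData::Source module defined at the top is only a stand-in (assumed to supply a default build that raises NotImplementedError) so the snippet runs without the gem installed.

```ruby
# Stand-in for the gem's module so this sketch runs on its own.
# Assumption: the real module supplies a default `build` that raises.
module IntegratedData
  module Source
    def build(options = {})
      raise NotImplementedError, "#{self} must implement .build"
    end
  end
end

# Hypothetical source backed by a plain Hash keyed by identifier.
class HashSource
  extend IntegratedData::Source

  # `build` accepts a Hash of options and returns the object
  # that strategies are invoked on.
  def self.build(options = {})
    new(options.fetch(:records, {}))
  end

  def initialize(records)
    @records = records
  end

  # Strategy: return the record for an identifier, or nil when absent.
  def call(id)
    @records[id]
  end

  # Strategy: same lookup, but raise KeyError for an unknown identifier.
  def strict(id)
    @records.fetch(id)
  end
end

source = HashSource.build(records: { 'a1' => { first_name: 'Chris' } })
source.call('a1')      # the stored record
source.call('missing') # nil
```

Each public instance method takes one identifier and returns data, so an entity can choose between call and strict without the source knowing anything about the entity.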
Results
- Hooked in
When the class is passed as the from: parameter of the IntegratedData::Entity.lookup DSL, it will be used to fetch data for that attribute.
Extensions
There are currently no gems that extend IntegratedData::Source.
Identifiers
Convert any class into an identifier by extending IntegratedData::Identifier:
require 'integrated_data'
class MyIdentifier
  extend IntegratedData::Identifier
end
Requirements
Identifiers are required to implement these methods:
IntegratedData::Identifier.parse(value)
Identifiers must implement a class method, parse, that coerces a raw value into a value object.
Results
- Hooked in
When the class is passed as the with: parameter of the IntegratedData::Entity.lookup DSL, it will be used to coerce identifiers into that value object.
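As a rough sketch of the identifier contract, the class below normalizes course codes so that differently formatted inputs compare equal. CourseCode and its behaviour are hypothetical, and the IntegratedData::Identifier module shown is a stand-in (assumed to provide a default parse that raises NotImplementedError) so the snippet runs without the gem.

```ruby
# Stand-in for the gem's module so this sketch runs on its own.
# Assumption: the real module supplies a default `parse` that raises.
module IntegratedData
  module Identifier
    def parse(value)
      raise NotImplementedError, "#{self} must implement .parse"
    end
  end
end

# Hypothetical identifier: course codes compared without regard to
# letter case or surrounding whitespace.
class CourseCode
  extend IntegratedData::Identifier

  attr_reader :code

  # Coerce any raw value into a CourseCode; existing instances pass through.
  def self.parse(value)
    return value if value.is_a?(self)

    new(value.to_s.strip.upcase)
  end

  def initialize(code)
    @code = code
    freeze # value objects should not change after creation
  end

  # Two codes are equal when their normalized strings match.
  def ==(other)
    other.is_a?(CourseCode) && code == other.code
  end
  alias eql? ==

  # Matching hash values let equal codes act as the same Hash key.
  def hash
    code.hash
  end

  def to_s
    code
  end
end

CourseCode.parse(' cs101 ') == CourseCode.parse('CS101') # true
```

Defining eql? and hash alongside == is a design choice for this sketch: it lets a source index its records by identifier in a Hash.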
Extensions
There are currently no gems that extend IntegratedData::Identifier.
Entities
Convert any class into an entity by extending IntegratedData::Entity:
require 'integrated_data'
class MyEntity
  extend IntegratedData::Entity
end
Requirements
Entities are required to implement these methods:
IntegratedData::Entity#initialize(attributes = {})
Entities must be able to be initialized with a single argument, a Hash of attributes.
Results
- Lookup DSL
Entities gain a class method, lookup, that can be used to register ways to look up attributes. They can be instantiated with a hash of identifiers, and are then automatically initialized with a hash of attributes.
@identifiers
Entity instances have access to an instance variable, @identifiers, reflecting the identifiers they were furnished with on initialization.
@attributes
Entity instances have access to an instance variable, @attributes, reflecting the attributes they were furnished with on initialization.
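To make the flow described above concrete, here is a toy model of it. It is not the gem's implementation and makes several assumptions: ToyEntity, ToySource, Person, and the :records option are hypothetical names, identifier coercion is left out, and each looked-up record is assumed to be a Hash keyed by attribute name.

```ruby
# Toy model only: NOT the gem's code, just the behaviour the README describes.
module ToyEntity
  # Registered lookups for the extending class, keyed by attribute name.
  def lookups
    @lookups ||= {}
  end

  # Register how to find one attribute: which source to build, which
  # identifier key to query with, and which strategy method to call.
  def lookup(attribute, from:, by:, strategy: :call, **source_options)
    lookups[attribute] = {
      source: from.build(**source_options), by: by, strategy: strategy
    }
  end

  # Accept a Hash of identifiers, resolve every registered attribute, then
  # pass the resulting attributes Hash to the class's own initializer.
  def new(identifiers = {})
    attributes = lookups.each_with_object({}) do |(name, rule), found|
      id = identifiers[rule[:by]]
      record = rule[:source].public_send(rule[:strategy], id)
      found[name] = record && record[name]
    end
    entity = super(attributes)
    entity.instance_variable_set(:@identifiers, identifiers)
    entity.instance_variable_set(:@attributes, attributes)
    entity
  end
end

# Hypothetical source: a Hash of records keyed by identifier.
class ToySource
  def self.build(records: {})
    new(records)
  end

  def initialize(records)
    @records = records
  end

  def call(id)
    @records[id]
  end
end

# Hypothetical entity that satisfies the initialize(attributes) requirement.
class Person
  extend ToyEntity

  attr_reader :first_name

  lookup :first_name, from: ToySource, by: :person_id,
                      records: { 7 => { first_name: 'Chris' } }

  def initialize(attributes = {})
    @first_name = attributes[:first_name]
  end
end

person = Person.new(person_id: 7)
person.first_name # "Chris"
```

The caller supplies only identifiers, while the initializer only ever sees attributes; the overridden new is what bridges the two.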
Extensions
There are currently no gems that extend IntegratedData::Entity.
Examples
Sources
CSVs
An in-memory data source could be implemented as such:
class InMemorySource < Array
  # This simply creates a default `build` method that raises a NotImplementedError.
  extend IntegratedData::Source
  # Here we fulfill the IntegratedData::Source interface
  # by overriding the class method with our own implementation.
  class << self
    # `build` must accept a Hash; in this case we require an array of hashes.
    def build(data: [])
      # Coerce each element to a Hash, dropping anything that is not hash-like.
      rows = (Array.try_convert(data) || []).map do |hashlike|
        Hash.try_convert(hashlike)
      end.compact
      new rows
    end
  end
  # Default strategy, if none is specified by the entity
  def call(id)
    find do |attributes|
      id == attributes[:id]
    end
  end
end
A CSV data source could be implemented as such:
require 'smarter_csv'
class CSVSource
  # This simply creates a default `build` method that raises a NotImplementedError.
  extend IntegratedData::Source
  # Here we fulfill the IntegratedData::Source interface
  # by overriding the class method with our own implementation.
  class << self
    # `build` must accept a Hash; in this case we require a file parameter.
    def build(file:)
      new(file)
    end
  end
  # Just store the file for reference so CSVs are only parsed when needed.
  def initialize(file)
    @file = file
  end
  # Default strategy, if none is specified by the entity.
  # The parsed rows are memoized, so the file is processed at most once.
  def call(id)
    @data ||= SmarterCSV.process(@file)
    @data.find do |attributes|
      id == attributes[:id]
    end
  end
  # Custom strategy with variant behavior, used on demand.
  # In this case, if an entity needs multiple attributes from the CSV file,
  # and it's using the `:uncached` strategy,
  # it'll process the CSV each time it needs an attribute from it.
  def uncached(id)
    SmarterCSV.process(@file).find do |attributes|
      id == attributes[:id]
    end
  end
end
Identifiers
An identifier could be implemented as such:
class PaddedString < String
  # This simply creates a default `parse` method that raises a NotImplementedError.
  extend IntegratedData::Identifier
  # Here we fulfill the IntegratedData::Identifier interface
  # by overriding the class method with our own implementation.
  class << self
    # `parse` must accept a single value to be coerced.
    def parse(value)
      # Left-pad with zeros to 10 characters; `rjust` returns a new string,
      # so the caller's value is never mutated.
      new value.to_s.rjust(10, '0')
    end
  end
  # Pad other strings before comparison.
  def ==(other)
    super(self.class.parse(other))
  end
end
Entities
Finally, we could put it all together as such:
require 'ostruct'
# We use OpenStruct to get an object that can be initialized with a hash of attributes.
class Student < OpenStruct
  # This allows you to use the `lookup` DSL and defines `new` to hook into it.
  extend IntegratedData::Entity
  # The full lookup DSL looks like this:
  # lookup :attribute_name, from: SourceClass, by: :identifier_key, with: (OptionalIdentifierClass or nil), strategy: (:optional_strategy_method or :call), **extra_options_for_source
  lookup :first_name, from: InMemorySource, by: :student_id, data: [{ student_id: '0000000001', first_name: 'Chris' }]
end