Eventosaurus
Enables easy asynchronous event storing and querying on DynamoDB
Installation
Add this line to your application's Gemfile:
gem 'eventosaurus'
And then execute:
$ bundle
Or install it yourself as:
$ gem install eventosaurus
Setting up a local DynamoDB server
When developing, use a local DynamoDB server.
Set the following environment variables:
EVENT_ENVIRONMENT_PREFIX=localhost
AWS_ENDPOINT=http://localhost:8000
Install and kick off DynamoDB:
$ brew install dynamodb-local
$ ln -sfv /usr/local/opt/dynamodb-local/*.plist ~/Library/LaunchAgents
$ launchctl load ~/Library/LaunchAgents/homebrew.mxcl.dynamodb-local.plist
Configuration
You need to add the following initializer (ex: config/initializers/eventosaurus.rb):
Eventosaurus.configure do |config|
  config.use_sidekiq

  # ex: localhost, production
  config.environment_prefix = ENV['EVENT_ENVIRONMENT_PREFIX']

  config.aws_access_key_id = ENV['AWS_ACCESS_KEY_ID']
  config.aws_secret_access_key = ENV['AWS_SECRET_ACCESS_KEY']
  config.aws_region = ENV['AWS_REGION']

  # optional, used for local dynamodb
  config.aws_endpoint = ENV['AWS_ENDPOINT']
end
Sidekiq Alternatives
Eventosaurus ships with both synchronous and asynchronous options. By default, Eventosaurus uses Sidekiq to persist data to DynamoDB asynchronously.
For synchronous persistence:
Eventosaurus.configure do |config|
  # ...
  config.use_synchronous
  # ...
end
To use your own persistence mechanism, reference the below two files:
And include it in your configuration:
require 'custom_persistor'

Eventosaurus.configure do |config|
  # ...
  config.persistor = CustomPersistor
  # ...
end
Event Representation
Every event type is represented by a class that includes Eventosaurus::Storable. Each class must do two things:
- define the table using the table_definition macro
- define the details class method, which defines the event interface
Here is an example of an event definition:
module Events
  class PhoneCall
    include Eventosaurus::Storable

    table_definition name: :phone_call, partition_key: { person_id: :n }

    def self.details(person_id:, phone_number:, last_called:)
      {
        'person_id' => person_id,
        'phone_number' => phone_number.to_s,
        'last_called' => last_called
      }
    end
  end
end
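Because details takes keyword arguments, Ruby itself enforces the event interface: a call that omits a required keyword, or passes an unknown one, raises ArgumentError. The sketch below shows this in plain Ruby with no Eventosaurus dependency. SketchPhoneCall and details_error_for are hypothetical names used only for illustration.

```ruby
# Hypothetical stand-in for Events::PhoneCall; plain Ruby, no gem required.
class SketchPhoneCall
  def self.details(person_id:, phone_number:, last_called:)
    {
      'person_id' => person_id,
      'phone_number' => phone_number.to_s,
      'last_called' => last_called
    }
  end
end

# Returns the ArgumentError raised by a bad call, or nil if the call is valid.
def details_error_for(**args)
  SketchPhoneCall.details(**args)
  nil
rescue ArgumentError => error
  error
end

# A complete call returns the attribute hash.
puts SketchPhoneCall.details(
  person_id: 5, phone_number: 5_551_234, last_called: '2015-01-04'
).inspect

# A call missing required keywords is rejected before anything is stored.
puts details_error_for(person_id: 5).class
```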
There are some built-in attributes for your events:
- the gem defines the range (sort) key that accompanies your partition key to be event_uuid. When writing an event, this attribute is enforced to be unique, preventing duplicate writes. See Event Duplication Prevention below.
- the event also stores the timestamp, which represents the time when the gem client calls .store
Building the Tables
DynamoDB must have the tables your events need. Once you've written your event classes, you must run a rake task to create the tables. The tables are namespaced by your environment, as defined in the environment_prefix variable mentioned above, so if you build locally, the table name might be localhost_phone_call. This allows us to quickly get up and running on new environments. For the time being, the rake task expects your event definitions to be in app/models/events, so be sure to put them there. Rake tasks are scoped to only work with tables that begin with your environment_prefix. This means that even if staging and production point to the same DynamoDB account, the drop_tables task will only drop tables from the environment specified. To create the tables:
rake eventosaurus:create_tables
(NOTE: in the short term, you will have to run this create_tables task manually upon deploy. The same applies to staging environments and to anyone who pulls code that uses these tables. This is temporary, and there is an outstanding task to change it.)
You may then verify the tables were created:
rake eventosaurus:list_tables
See the JSON used to create your tables:
rake eventosaurus:describe_tables
If you decide to (╯°□°)╯︵ ┻━┻
rake eventosaurus:drop_tables
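The environment scoping described in this section can be sketched in a few lines of plain Ruby. The helper names below are hypothetical and are not part of the Eventosaurus API. The underscore separator is inferred from the localhost_phone_call example above.

```ruby
# Hypothetical helpers illustrating environment-prefixed table names.
# They are not part of the gem; they only model the naming convention.

# Builds a namespaced table name such as "localhost_phone_call".
def namespaced_table_name(environment_prefix, table_name)
  "#{environment_prefix}_#{table_name}"
end

# Selects only the tables that belong to the given environment, which is
# the kind of filter a prefix-scoped drop task would need.
def tables_for_environment(all_tables, environment_prefix)
  all_tables.select { |name| name.start_with?("#{environment_prefix}_") }
end

all_tables = %w[localhost_phone_call staging_phone_call production_phone_call]

puts namespaced_table_name('localhost', :phone_call)
puts tables_for_environment(all_tables, 'staging').inspect
```

Matching on the prefix plus the separator, rather than the bare prefix, keeps an environment named stage from also matching staging tables.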
Storing Data
To store data use the .store class method on your event class. Use the same signature as your details method mentioned above:
# Somewhere in your app:
def check_for_phonecall(row)
  Events::PhoneCall.store(
    person_id: row[:person_id],
    phone_number: row[:phone_number],
    last_called: row[:last_called]
  )
end
Querying Data
The gem gives you some dynamic methods to query your data based on your table definition. Keep in mind that you are working with DynamoDB: it is not meant to be a data store that is accessed generically, and it expects you to know the queries you want to run upfront. Fortunately, we are storing each event type in its own table, so we can make good guesses about this. To this end, Eventosaurus creates getters for the attributes you listed in your table definition:
Events::PhoneCall.by_person_id(5)
Events::PhoneCall.by_person_id(5).by_table_name('users').count
# event_uuid & created_at included for free :)
Events::PhoneCall.by_created_at('2015-01-04', 'GT')
The queries above return eventosaurus Query objects. To actually execute the query, use the run method:
Events::PhoneCall.by_person_id(5).count.run
Note:
- In the last example we query by created_at even though it was not listed in the table definition. This is because each table gets the created_at timestamp column as well as the event_uuid column added.
- The operator defaults to 'EQ' (equals) but there are many to choose from: EQ, NE, IN, LE, LT, GE, GT, CONTAINS, NOT_CONTAINS, BEGINS_WITH
- DynamoDB only allows a single secondary index to accompany the partition key. This means the following query will not work the way you think:
# too many secondary predicates. after one secondary index is used, the rest will be full scans on whatever comes back after the first local index.
Events::PhoneCall.by_created_at('2015-01-04', 'GT').by_table_name('users')
To sum it up: for speed, you are allowed at most one partition key condition and at most one secondary condition. No more than that.
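The chain-then-run pattern described above can be modelled with a small in-memory class. This is only a sketch of the idea: SketchQuery and its where method are hypothetical, it is not the gem's Query class, it does not talk to DynamoDB, and it covers only a subset of the operators.

```ruby
# Hypothetical in-memory model of a chainable, lazy query.
# It is NOT the gem's Query class and performs no DynamoDB calls.
class SketchQuery
  # A subset of the operators listed above, as plain Ruby comparisons.
  OPERATORS = {
    'EQ' => ->(actual, expected) { actual == expected },
    'NE' => ->(actual, expected) { actual != expected },
    'LT' => ->(actual, expected) { actual < expected },
    'LE' => ->(actual, expected) { actual <= expected },
    'GT' => ->(actual, expected) { actual > expected },
    'GE' => ->(actual, expected) { actual >= expected }
  }.freeze

  def initialize(rows, conditions = [], counting = false)
    @rows = rows
    @conditions = conditions
    @counting = counting
  end

  # Adds a condition and returns a new query; nothing is executed yet.
  def where(attribute, value, operator = 'EQ')
    raise ArgumentError, "unknown operator #{operator}" unless OPERATORS.key?(operator)

    SketchQuery.new(@rows, @conditions + [[attribute, value, operator]], @counting)
  end

  # Marks the query as a count; still lazy.
  def count
    SketchQuery.new(@rows, @conditions, true)
  end

  # Executes the accumulated conditions against the in-memory rows.
  def run
    matches = @rows.select do |row|
      @conditions.all? do |attribute, value, operator|
        OPERATORS.fetch(operator).call(row[attribute], value)
      end
    end
    @counting ? matches.size : matches
  end
end

rows = [
  { 'person_id' => 5, 'created_at' => '2015-01-03' },
  { 'person_id' => 5, 'created_at' => '2015-01-05' },
  { 'person_id' => 7, 'created_at' => '2015-01-06' }
]

query = SketchQuery.new(rows).where('person_id', 5).where('created_at', '2015-01-04', 'GT')
puts query.count.run
```

Each chained call returns a new query object, so a partially built query can be reused safely, and no work happens until run is called.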
Test mode
Test mode can be enabled by placing the following in rails_helper.rb (or equivalent):
Eventosaurus.enable_test_mode
On Choosing the proper table_definition for your event
When choosing the correct partition_key, there are a few considerations. The first is the predicates you will filter by: the predicates you use the most should probably become your partition_key. The second is the number of different values you expect to see in your partition key: the more you have, the better. This is a complicated subject, and you should understand how DynamoDB works (partition keys, local and global secondary indexes) before creating an event. Here you can find more detail about best practices, and of course hopefully a co-worker Near You can help too.
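One rough way to compare candidate partition keys is to count the distinct values each attribute takes in a sample of your events. The helper below is a hypothetical sketch in plain Ruby, and the sample data is made up for illustration. It measures only cardinality, not how evenly traffic is spread across values.

```ruby
# Hypothetical helper: counts distinct values per candidate attribute in a
# sample of events. A higher count suggests a better-spread partition key.
def distinct_value_counts(sample_events, candidate_attributes)
  candidate_attributes.map do |attribute|
    [attribute, sample_events.map { |event| event[attribute] }.uniq.size]
  end.to_h
end

# Made-up sample data for illustration only.
sample_events = [
  { 'person_id' => 1, 'status' => 'answered' },
  { 'person_id' => 2, 'status' => 'missed' },
  { 'person_id' => 3, 'status' => 'answered' },
  { 'person_id' => 4, 'status' => 'answered' }
]

puts distinct_value_counts(sample_events, %w[person_id status]).inspect
```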
Event Error Handling upon calling .store
When you call .store, you will be utilizing your .details method. Sadly, sometimes you will make mistakes and the gem will raise. Happily, you can decide what to do about the errors. If your call to .store raises, the on_error class method is called with the error as an argument. Feel free to override this class method in your Event class:
module Events
  class NeatEvent
    include Eventosaurus::Storable

    # ... your primary event code here...

    # Class method, called with the error raised by .store
    def self.on_error(error)
      HaikuNotifier.write_haiku_with_error(error)
    end
  end
end
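The flow described above, where an error raised during .store is routed to on_error, can be sketched in plain Ruby. This is not the gem's implementation: SketchStorable, its persist step, and the re-raising default are all assumptions made for illustration.

```ruby
# Hypothetical sketch of the error-handling flow; not the gem's code.
module SketchStorable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Builds the event attributes and hands them to a persistence step.
    # Any error is routed to on_error instead of propagating directly.
    def store(**args)
      persist(details(**args))
    rescue StandardError => error
      on_error(error)
    end

    # Default in this sketch (an assumption): re-raise the error.
    def on_error(error)
      raise error
    end

    # Stand-in for the real persistence step (an assumption).
    def persist(attributes)
      attributes
    end
  end
end

class SketchNeatEvent
  include SketchStorable

  def self.details(employee_name:)
    { 'employee_name' => employee_name }
  end

  # Override: record the error rather than raising.
  def self.on_error(error)
    handled_errors << error
    nil
  end

  def self.handled_errors
    @handled_errors ||= []
  end
end

SketchNeatEvent.store(employee_name: 'Markus') # stored normally
SketchNeatEvent.store(wrong_keyword: 'oops')   # ArgumentError goes to on_error
puts SketchNeatEvent.handled_errors.map(&:class).inspect
```

A method defined directly on the class takes precedence over one supplied by an extended module, which is why the override wins here.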
Event Duplication Prevention
This gem has two methods of preventing duplicate events from being written to DynamoDB, and each method addresses a different way duplication can occur.
Background on the event_uuid column
Before getting into the two methods below, the foundational piece of information is that DynamoDB writes can be configured to fail if a duplicate value is found on an attribute. The Event Gem leverages this by using an 'event_uuid' as the table's sort key, and setting our writes to fail if the event_uuid already exists.
Duplication Cause 1: Double processing of asynchronous jobs.
If the same job that sends an event to dynamodb gets run twice, we need to make sure we don't store two events. This is handled by the mechanism explained above: writes are instructed to fail if they see the event_uuid already exists.
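The effect of a write that fails when the event_uuid already exists can be imitated in memory. The sketch below is hypothetical: SketchEventTable and run_job are illustrative names, no DynamoDB call is made, and uniqueness is keyed on event_uuid alone for simplicity.

```ruby
# Hypothetical in-memory model of a uniqueness-enforcing write.
# With real DynamoDB this would be a conditional write; this sketch only
# imitates the observable behaviour.
class SketchEventTable
  class DuplicateEvent < StandardError; end

  def initialize
    @items = {}
  end

  # Writes the item only if its event_uuid has not been seen before.
  def put_unless_exists(item)
    uuid = item.fetch('event_uuid')
    raise DuplicateEvent, "event_uuid #{uuid} already exists" if @items.key?(uuid)

    @items[uuid] = item
  end

  def size
    @items.size
  end
end

# Simulates a background job that may be delivered more than once.
def run_job(table, item)
  table.put_unless_exists(item)
  :stored
rescue SketchEventTable::DuplicateEvent
  :skipped_duplicate
end

table = SketchEventTable.new
item = { 'event_uuid' => 'abc-123', 'person_id' => 5 }

puts run_job(table, item) # first delivery
puts run_job(table, item) # repeated delivery of the same job
puts table.size
```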
Duplication Cause 2: Gem client erroneously calls the .store method multiple times
If your (the gem client's) code has an error and calls Events::NeatEvent.store more often than intended, the Event Gem can be configured to ensure that only one event is stored. To guard against this, you can create a composite primary key based on the fields of your choosing. This composite key is then digested and used as the basis of the event_uuid. You may use the macro composite_primary_key to achieve this duplication defense:
module Events
  class NeatEvent
    composite_primary_key :location, :employee_name, :employee_action
  end
end
The above example will cause a string like the following to be generated and used as a digest for the event_uuid:
location=gardens:employee_name=Markus:employee_action=ate grapes
Even if you call Events::NeatEvent.store(args) multiple times with the same args, only one event will be created.
If you do not opt to use the composite_primary_key feature, the Event Gem will use SecureRandom.uuid to generate the UUID. It follows RFC 4122, so a collision is far less likely than you winning the lottery.
Do not add the created_at attr to your list of composite_primary_key attrs
Using a timestamp representing the creation-time of the event (aka the created_at attr) in the composite_primary_key is not advisable, as accidental duplicate events might have slightly different timestamps, and thus slightly different UUIDs. Put simply: do not add the created_at attr to your list of composite_primary_key attrs.
Note that any attribute listed in the composite_primary_key macro promotes that attribute to a required attribute.
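The composite key idea from this section can be sketched as follows. The helper names are hypothetical, and the choice of SHA-256 is an assumption: the gem's actual digest algorithm is not stated here. The sketch builds the key=value string shown above, digests it into a deterministic id, and falls back to SecureRandom.uuid when no composite key is configured.

```ruby
require 'digest'
require 'securerandom'

# Hypothetical helpers; the digest algorithm (SHA-256) is an assumption.

# Joins the chosen attributes into "key=value" pairs separated by colons.
# fetch raises KeyError when an attribute is missing, mirroring the rule
# that composite key attributes are required.
def composite_key_string(attributes, key_names)
  key_names.map { |name| "#{name}=#{attributes.fetch(name)}" }.join(':')
end

# Returns a deterministic id when key_names are given, and a random UUID
# otherwise.
def sketch_event_uuid(attributes, key_names = [])
  return SecureRandom.uuid if key_names.empty?

  Digest::SHA256.hexdigest(composite_key_string(attributes, key_names))
end

attributes = { location: 'gardens', employee_name: 'Markus', employee_action: 'ate grapes' }
keys = %i[location employee_name employee_action]

puts composite_key_string(attributes, keys)
# Same arguments produce the same id, so a repeated store maps to one event.
puts sketch_event_uuid(attributes, keys) == sketch_event_uuid(attributes, keys)
```

Adding a timestamp to the key list would make two accidental duplicates digest to different ids, which is why created_at should stay out of it.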
Using the AWS SDK (V2) Client Directly
To access the AWS SDK v2 client directly (for educational purposes only), use Eventosaurus.configuration.dynamodb_client. For example:
Eventosaurus.configuration.dynamodb_client.list_tables
Development
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
Releasing
Version bumps should be done straight in master after appropriate PRs are merged.
Gem release best practices here
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/blueapron/eventosaurus.
License
The gem is Copyright 2015 Blue Apron, Inc.