Mixpanel input plugin for Embulk
embulk-input-mixpanel is the Embulk input plugin for Mixpanel.
Overview
Required Embulk version >= 0.8.6 (since v0.4.0).
- Plugin type: input
- Resume supported: no
- Cleanup supported: no
- Guess supported: yes
Setup
How to get API configuration
This plugin uses the API key and API secret of the target project. Before writing your config.yml, obtain both from the Mixpanel website.
Log in to the Mixpanel website and click "Account" in the header. Select the "Projects" panel to see the "API Key" and "API Secret" for each project.
How to get project's timezone
This plugin uses the project's timezone to adjust timestamps to UTC.
To find it, log in to the Mixpanel website and click the gear icon at the lower left. The dialog that opens shows the timezone in the "Timezone" column of the "Management" tab.
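The adjustment can be pictured with a minimal Ruby sketch. It assumes that Mixpanel reports event times as epoch seconds on the project's own clock, so converting to UTC means subtracting the project's UTC offset. The helper name `adjust_to_utc` and the fixed offset are hypothetical; the plugin itself may resolve the offset differently, for example from the timezone name so that daylight saving time is handled.

```ruby
# Hypothetical helper, not the plugin's actual code.
# Assumes `local_epoch` is epoch seconds computed from the project's
# wall-clock time, and `utc_offset_sec` is the project's offset from UTC
# at that moment (e.g. -7 * 3600 for US/Pacific during daylight saving time).
def adjust_to_utc(local_epoch, utc_offset_sec)
  local_epoch - utc_offset_sec
end

# 2015-07-19 00:00:00 on the project's clock, expressed as epoch seconds.
local_epoch = 1_437_264_000
utc_epoch = adjust_to_utc(local_epoch, -7 * 3600)
puts Time.at(utc_epoch).utc # prints "2015-07-19 07:00:00 UTC"
```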
Configuration
- api_key: project API Key (string, required)
- api_secret: project API Secret (string, required)
- timezone: project timezone (string, required)
- from_date: Start date of the export (string, optional, default: today - 2)
  - NOTE: The Mixpanel API supports exporting data from at least 2 days ago up to, at most, the previous day.
- fetch_days: Number of days to export (integer, optional, default: from_date - (today - 1))
  - NOTE: Mixpanel does not support from_date > today - 2.
- fetch_unknown_columns (deprecated): Whether this plugin fetches unknown columns, i.e. columns not configured in config (boolean, optional, default: true)
  - NOTE: If true, an unknown_columns column is created and the unknown columns' data is added to it.
- fetch_custom_properties: Whether to fetch all custom properties into the custom_properties key. "Custom properties" are those not described in Mixpanel documents 1, 2. (boolean, optional, default: false)
  - NOTE: Cannot set both fetch_unknown_columns and fetch_custom_properties to true.
- event: The event or events to filter data (array, optional, default: nil)
- where: Expression to filter data (c.f. https://mixpanel.com/docs/api-documentation/data-export-api#segmentation-expressions) (string, optional, default: nil)
- bucket: The data bucket to filter data (string, optional, default: nil)
- retry_initial_wait_sec: Initial wait in seconds for exponential backoff (integer, default: 1)
- retry_limit: Maximum number of retries (integer, default: 5)
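To illustrate how retry_initial_wait_sec and retry_limit might interact, here is a minimal sketch of exponential backoff in Ruby. The method name `with_retries` and the `sleeper` parameter are hypothetical and are not taken from the plugin's source; the sketch also assumes the wait doubles on every retry, which the plugin may or may not do exactly.

```ruby
# Hypothetical sketch of exponential backoff; names are illustrative only.
# `sleeper` is injectable so the waiting strategy can be replaced (e.g. in tests).
def with_retries(retry_limit: 5, retry_initial_wait_sec: 1, sleeper: ->(sec) { sleep(sec) })
  attempts = 0
  begin
    yield
  rescue StandardError
    # Give up and re-raise once the retry budget is used up.
    raise if attempts >= retry_limit
    # Assumed schedule: initial wait doubled on every retry (1, 2, 4, ...).
    sleeper.call(retry_initial_wait_sec * (2**attempts))
    attempts += 1
    retry
  end
end

# Usage: the block fails twice, then succeeds on the third attempt.
# The sleeper only prints here, so the example does not actually wait.
attempt = 0
result = with_retries(sleeper: ->(sec) { puts "waiting #{sec}s" }) do
  attempt += 1
  raise "temporary failure" if attempt < 3
  "exported on attempt #{attempt}"
end
puts result
```

With the defaults (initial wait 1, limit 5), a call that keeps failing would wait 1, 2, 4, 8 and 16 seconds under this schedule before the error is raised to the caller.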
fetch_unknown_columns and fetch_custom_properties
Suppose you have the following data and a config.yml like the one below.
| event | $city | $custom | $foobar |
|---|---|---|---|
| ev | Tokyo | custom | foobar |
(NOTE: $city is a reserved key, $custom and $foobar are not)
in:
  type: mixpanel
  api_key: "API_KEY"
  api_secret: "API_SECRET"
  timezone: "US/Pacific"
  from_date: "2015-07-19"
  fetch_days: 5
  columns:
    - {name: event, type: string}
    - {name: $custom, type: string}
With fetch_unknown_columns: true, the data is fetched as:
| event | $custom | unknown_columns (json) |
|---|---|---|
| ev | custom | {"$city":"Tokyo", "$foobar": "foobar"} |
With fetch_custom_properties: true, the data is fetched as:
| event | $custom | custom_properties (json) |
|---|---|---|
| ev | custom | {"$foobar": "foobar"} |
fetch_unknown_columns recognizes $city and $foobar as unknown_columns because they are not described in config.yml.
fetch_custom_properties recognizes $foobar as custom_properties. $custom is also a custom property, but it is described in config.yml, so it is fetched as its own column.
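The difference between the two options can be sketched in Ruby as follows. This is an illustration of the behavior described above, not the plugin's implementation: `split_record`, its `mode` argument, and the one-element `RESERVED_KEYS` list are hypothetical stand-ins.

```ruby
require "json"

# Assumption for illustration only: a tiny stand-in for Mixpanel's reserved keys.
RESERVED_KEYS = ["$city"].freeze

# Hypothetical helper, not the plugin's actual code.
# `record` is one exported row, `configured` the column names from config.yml.
def split_record(record, configured, mode)
  row = record.select { |key, _| configured.include?(key) }
  rest = record.reject { |key, _| configured.include?(key) }
  case mode
  when :fetch_unknown_columns
    # Everything not configured goes into unknown_columns.
    row["unknown_columns"] = rest
  when :fetch_custom_properties
    # Only non-reserved, unconfigured keys go into custom_properties.
    row["custom_properties"] = rest.reject { |key, _| RESERVED_KEYS.include?(key) }
  end
  row
end

record = { "event" => "ev", "$city" => "Tokyo", "$custom" => "custom", "$foobar" => "foobar" }
configured = ["event", "$custom"]

puts split_record(record, configured, :fetch_unknown_columns).to_json
puts split_record(record, configured, :fetch_custom_properties).to_json
```

The two printed rows correspond to the two result tables above: the first keeps $city and $foobar under unknown_columns, the second keeps only $foobar under custom_properties.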
Example
in:
  type: mixpanel
  api_key: "API_KEY"
  api_secret: "API_SECRET"
  timezone: "US/Pacific"
  from_date: "2015-07-19"
  fetch_days: 5
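As a rough check of which dates this example covers, the following Ruby sketch lists them. It assumes fetch_days counts days inclusively, starting at from_date; the helper `export_dates` is hypothetical and the plugin's actual range handling may differ (for instance, it may drop dates that are not yet exportable).

```ruby
require "date"

# Hypothetical helper: list the dates covered by from_date and fetch_days.
# Assumes fetch_days counts days inclusively, starting at from_date.
def export_dates(from_date, fetch_days)
  first = Date.parse(from_date)
  (0...fetch_days).map { |i| (first + i).to_s }
end

# Prints one date per line, from 2015-07-19 through 2015-07-23.
puts export_dates("2015-07-19", 5)
```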
Run test
$ rake