my_obfuscate

Standalone Ruby code for the selective rewriting of SQL dumps in order to protect user privacy. Supports MySQL and SQL Server.

Install

(sudo) gem install iterationlabs-my_obfuscate

Example Usage

Make an obfuscator.rb script:

#!/usr/bin/env ruby
require "rubygems"
require "my_obfuscate"

obfuscator = MyObfuscate.new({
  :people => {
    :email                     => { :type => :email, :skip_regexes => [/^[\w\.\_]+@my_company\.com$/i] },
    :ethnicity                 => :keep,
    :crypted_password          => { :type => :fixed, :string => "SOME_FIXED_PASSWORD_FOR_EASE_OF_DEBUGGING" },
    :salt                      => { :type => :fixed, :string => "SOME_THING" },
    :remember_token            => :null,
    :remember_token_expires_at => :null,
    :age                       => { :type => :null, :unless => lambda { |person| person[:email] == "[email protected]" } },
    :photo_file_name           => :null,
    :photo_content_type        => :null,
    :photo_file_size           => :null,
    :photo_updated_at          => :null,
    :postal_code               => { :type => :fixed, :string => "94109", :unless => lambda {|person| person[:postal_code] == "12345"} },
    :name                      => :name,
    :full_address              => :address,
    :bio                       => { :type => :lorem, :number => 4 },
    :relationship_status       => { :type => :fixed, :one_of => ["Single", "Divorced", "Married", "Engaged", "In a Relationship"] },
    :has_children              => { :type => :integer, :between => 0..1 },
  },

  :invites                     => :truncate,
  :invite_requests             => :truncate,
  :tags                        => :keep,

  :relationships => {
    :account_id                => :keep,
    :code                      => { :type => :string, :length => 8, :chars => MyObfuscate::USERNAME_CHARS }
  }
})
obfuscator.fail_on_unspecified_columns = true # if you want it to require every column in the table to be in the above definition
obfuscator.globally_kept_columns = %w[id created_at updated_at] # if you set fail_on_unspecified_columns, you may want this as well
obfuscator.obfuscate(STDIN, STDOUT)

And to get an obfuscated dump:

mysqldump -c --add-drop-table -u user -ppassword database | ruby obfuscator.rb > obfuscated_dump.sql

Note that the -c option on mysqldump is required to use my_obfuscate. If you get MySQL errors due to very long lines, try some combination of --max_allowed_packet=128M, --single-transaction, --skip-extended-insert, and --quick.
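
The -c flag is short for --complete-insert, which makes mysqldump write the column names into every INSERT statement. The sketch below illustrates, in plain Ruby, why a dump rewriter needs that column list to match values to column definitions. It is not my_obfuscate's actual parser: the helper insert_columns and the constant INSERT_WITH_COLUMNS are hypothetical names used only for illustration.

```ruby
# Illustrative only: pull the table and column names out of an INSERT
# statement as written by `mysqldump -c`. Not my_obfuscate's real parser.
INSERT_WITH_COLUMNS = /\AINSERT INTO `(?<table>[^`]+)` \((?<columns>[^)]*)\) VALUES/

# Hypothetical helper: returns the table and columns, or nil when the
# statement has no column list (i.e. the dump was made without -c).
def insert_columns(line)
  match = INSERT_WITH_COLUMNS.match(line)
  return nil unless match

  columns = match[:columns].scan(/`([^`]+)`/).flatten.map(&:to_sym)
  { :table => match[:table].to_sym, :columns => columns }
end

with_c    = "INSERT INTO `people` (`id`, `email`) VALUES (1,'a@example.com');"
without_c = "INSERT INTO `people` VALUES (1,'a@example.com');"

p insert_columns(with_c)
p insert_columns(without_c)
```

Without the column list there is nothing to tie the second value in the row to the :email definition, so the statement cannot be rewritten selectively.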

Database Server

By default the database type is assumed to be MySQL, but you can use the built-in SQL Server support by specifying:

obfuscator.database_type = :sql_server

Changes

  • Support for SQL Server

  • :unless and :if now support :nil as a shorthand for a Proc that checks for nil

  • :name, :lorem, and :address are all now supported types. You can pass :number to :lorem to specify how many sentences to generate. The default is one.

  • { :type => :whatever } is now optional when no additional options are needed. Just use :whatever.

  • Warnings are thrown when an unknown column type or table is encountered. Use :keep in both cases.

  • { :type => :fixed, :string => proc { |row| ... } } is now available.
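
As a rough illustration of the :nil shorthand for :if and :unless, the sketch below expands it into a lambda before evaluating the condition. It assumes that :nil tests the value of the column being defined; the helper names condition_proc and obfuscate_column? are hypothetical and are not part of my_obfuscate's API.

```ruby
# Illustrative only: one way the :nil shorthand could be expanded.
# Assumption: :nil means "the value of this column in the row is nil".
def condition_proc(condition, column)
  return lambda { |row| row[column].nil? } if condition == :nil
  condition # already a Proc or lambda taking the row
end

# Hypothetical helper: decide whether a column should be rewritten for
# a given row, honouring :unless and :if.
def obfuscate_column?(definition, column, row)
  if definition[:unless]
    return false if condition_proc(definition[:unless], column).call(row)
  end
  if definition[:if]
    return false unless condition_proc(definition[:if], column).call(row)
  end
  true
end

definition = { :type => :fixed, :string => "94109", :unless => :nil }

p obfuscate_column?(definition, :postal_code, { :postal_code => nil })     # => false
p obfuscate_column?(definition, :postal_code, { :postal_code => "12345" }) # => true
```

Under this reading, :unless => :nil leaves rows whose value is already nil untouched and rewrites every other row.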

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, but do not change the Rakefile, version, or history. (If you want to have your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)

  • Send me a pull request. Bonus points for topic branches.

Thanks

Thanks to Mavenlink and Pivotal Labs for patches and updates!

Copyright © 2009 Honk. Now maintained by Iteration Labs, LLC. See LICENSE for details.