Module: Rails::Snowflake::Id

Defined in:
lib/rails/snowflake/id.rb

Defined Under Namespace

Classes: Callbacks

Constant Summary

DEFAULT_REGEX =
/timestamp_id\('(?<seq_prefix>\w+)'/

Class Method Summary

Class Method Details

.at(timestamp, with_random: true) ⇒ Object



# File 'lib/rails/snowflake/id.rb', line 111

def at(timestamp, with_random: true)
  id  = timestamp.to_i * 1000
  id += rand(1000) if with_random
  id <<= 16
  id += rand(2**16) if with_random
  id
end
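
The bit layout produced by `.at` can be illustrated with a standalone sketch. The `snowflake_at` lambda below is a hypothetical copy of the method body shown above, not part of the gem's API. It suggests that calling with `with_random: false` yields the smallest ID for a given second, which may be convenient as a lower bound when comparing IDs against a point in time.

```ruby
# Hypothetical standalone copy of the `.at` body, for illustration only.
snowflake_at = lambda do |timestamp, with_random: true|
  id  = timestamp.to_i * 1000        # seconds -> milliseconds
  id += rand(1000) if with_random    # random millisecond within that second
  id <<= 16                          # make room for 16 bits of sequence data
  id += rand(2**16) if with_random   # random sequence data
  id
end

time   = Time.utc(2024, 1, 1)
lower  = snowflake_at.call(time, with_random: false)      # smallest ID in this second
upper  = snowflake_at.call(time + 1, with_random: false)  # smallest ID in the next second
sample = snowflake_at.call(time)                          # randomized ID in this second

puts lower >> 16                        # millisecond timestamp in the upper bits
puts lower & 0xFFFF                     # sequence bits are zero without randomness
puts((lower...upper).cover?(sample))    # randomized IDs stay inside the second's range
```

Because the timestamp occupies the high bits, IDs sort by creation time, so a half-open range such as `lower...upper` covers every ID that `.at` can produce for that second.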

.define_timestamp_id ⇒ Object

Our ID will be composed of the following:

  • 6 bytes (48 bits) of millisecond-level timestamp

  • 2 bytes (16 bits) of sequence data

The ‘sequence data’ is intended to be unique within a given millisecond, yet obscure the ‘serial number’ of this row.

To do this, we hash the following data:

  • Table name (if provided, skipped if not)

  • Secret salt (should not be guessable)

  • Timestamp (again, millisecond-level granularity)

We then take the first two bytes of that value, and add the lowest two bytes of the table ID sequence number (`table_name`_id_seq). This means that even if we insert two rows at the same millisecond, they will have distinct ‘sequence data’ portions.

If this happens, and an attacker can see both such IDs, they can determine which of the two entries was inserted first, but not the total number of entries in the table (even mod 2**16).

The table name is included in the hash to ensure that different tables derive separate sequence bases, so that rows inserted in the same millisecond in different tables do not reveal each other's table ID sequence numbers.

The secret salt is included in the hash to ensure that external users cannot derive the sequence base given the timestamp and table name, which would allow them to compute the table ID sequence number.
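
The composition described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: the description does not name the hash function, so MD5 is used here as a stand-in, and the helper `sketch_timestamp_id`, its parameters, and the sample salt are all hypothetical. The actual function is defined in SQL by this method.

```ruby
require "digest"

# Hypothetical helper mirroring the description; MD5, the input ordering,
# and every parameter name are assumptions for illustration.
def sketch_timestamp_id(table_name:, salt:, time_ms:, sequence_number:)
  digest = Digest::MD5.hexdigest("#{table_name}#{salt}#{time_ms}")
  sequence_base = digest[0, 4].to_i(16)               # first two bytes of the hash
  tail = (sequence_base + sequence_number) & 0xFFFF   # keep only the lowest 16 bits
  (time_ms << 16) | tail                              # 48-bit timestamp, 16-bit sequence data
end

args   = { table_name: "users", salt: "not-a-real-salt", time_ms: 1_700_000_000_000 }
first  = sketch_timestamp_id(**args, sequence_number: 41)
second = sketch_timestamp_id(**args, sequence_number: 42)

puts first >> 16                  # the millisecond timestamp is recoverable
puts((second - first) & 0xFFFF)   # consecutive rows in one millisecond differ by 1 (mod 2**16)
```

In this sketch two rows inserted in the same millisecond get distinct sequence data and reveal their relative order, while the hashed base hides the sequence number itself from anyone who does not know the salt.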



# File 'lib/rails/snowflake/id.rb', line 68

def define_timestamp_id
  return if already_defined?

  connection.execute(sanitized_timestamp_id_sql)
end

.ensure_id_sequences_exist ⇒ Object



# File 'lib/rails/snowflake/id.rb', line 74

def ensure_id_sequences_exist
  # Find tables using timestamp IDs.
  connection.tables.each do |table|
    ensure_id_sequences_exist_for(table)
  end
end

.ensure_id_sequences_exist_for(table_name) ⇒ Object



# File 'lib/rails/snowflake/id.rb', line 81

def ensure_id_sequences_exist_for(table_name)
  # We're only concerned with "id" columns.
  id_col = connection.columns(table_name).find { |col| col.name == "id" }
  return unless id_col

  # And only those that are using timestamp_id.
  data = DEFAULT_REGEX.match(id_col.default_function)
  return unless data

  seq_name = "#{data[:seq_prefix]}_id_seq"

  # If we were on Postgres 9.5+, we could do CREATE SEQUENCE IF
  # NOT EXISTS, but we can't depend on that. Instead, catch the
  # possible exception and ignore it.
  # Note that seq_name isn't a column name, but it's a
  # relation, like a column, and follows the same quoting rules
  # in Postgres.
  connection.execute(<<~SQL)
    DO $$
      BEGIN
        CREATE SEQUENCE #{connection.quote_column_name(seq_name)};
      EXCEPTION WHEN duplicate_table THEN
        -- Do nothing, we have the sequence already.
      END
    $$ LANGUAGE plpgsql;
  SQL
rescue StandardError => e
  Rails.logger.warn "Rails::Snowflake: Could not ensure sequence for #{table_name}: #{e.message}"
end
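
To make the sequence-name derivation concrete, the following sketch applies a copy of `DEFAULT_REGEX` to sample default expressions. The two sample strings are assumptions about what `default_function` might contain for a timestamp-ID column and for an ordinary serial column; real values depend on the schema.

```ruby
# Copy of the DEFAULT_REGEX constant listed in the Constant Summary.
default_regex = /timestamp_id\('(?<seq_prefix>\w+)'/

# Assumed example of a column default that uses timestamp_id.
default_function = "timestamp_id('users'::text)"

data = default_regex.match(default_function)
seq_name = "#{data[:seq_prefix]}_id_seq"
puts seq_name   # prints "users_id_seq"

# A column with an ordinary sequence default does not match, so the
# method above would return early without creating anything.
other = default_regex.match("nextval('users_id_seq'::regclass)")
puts other.inspect   # prints "nil"
```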

.to_time(id) ⇒ Object



# File 'lib/rails/snowflake/id.rb', line 119

def to_time(id)
  Time.at((id >> 16) / 1000).utc
end
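
A short sketch of the decoding step follows. The `to_time_sketch` lambda is a hypothetical standalone copy of the method body above, and the sample ID is built by hand. Note that the integer division by 1000 discards the millisecond part, so the result has one-second resolution.

```ruby
# Hypothetical standalone copy of the `.to_time` body, for illustration only.
to_time_sketch = ->(id) { Time.at((id >> 16) / 1000).utc }

ms = 1_700_000_000_123          # a millisecond timestamp with a non-zero ms part
id = (ms << 16) | 0xBEEF        # arbitrary 16 bits of sequence data

time = to_time_sketch.call(id)
puts time.to_i    # 1700000000, the sequence bits and milliseconds are dropped
puts time.usec    # 0, because integer division truncates to whole seconds
puts time.utc?    # true
```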