Module: SnowflakeId::Generator

Defined in:
lib/snowflake_id/generator.rb

Defined Under Namespace

Classes: Callbacks

Constant Summary

DEFAULT_REGEX =
/timestamp_id\('(?<seq_prefix>\w+)'/
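As a rough illustration of how this pattern behaves, the sketch below matches it against two column defaults. The default strings are hypothetical examples of what a database adapter might report, not values taken from this library.

```ruby
# Same pattern as DEFAULT_REGEX above, repeated so the snippet runs standalone.
DEFAULT_REGEX = /timestamp_id\('(?<seq_prefix>\w+)'/

# Hypothetical column default that calls timestamp_id (assumed format).
default_function = "timestamp_id('statuses'::text)"

match = DEFAULT_REGEX.match(default_function)
# The named capture holds the prefix used to build the sequence name.
seq_name = "#{match[:seq_prefix]}_id_seq" if match

# A default that does not call timestamp_id yields no match (nil).
no_match = DEFAULT_REGEX.match("nextval('statuses_id_seq'::regclass)")

puts seq_name
puts no_match.inspect
```

The named capture `seq_prefix` is the part that `ensure_id_sequences_exist_for` turns into a sequence name.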

Class Method Summary

Class Method Details

.at(timestamp, with_random: true) ⇒ Object



# File 'lib/snowflake_id/generator.rb', line 110

def at(timestamp, with_random: true)
  id  = timestamp.to_i * 1000
  id += rand(1000) if with_random
  id <<= 16
  id += rand(2**16) if with_random
  id
end
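The following sketch, assuming a standalone copy of the method body above, illustrates the bit layout that `.at` produces. The variable names and the chosen date are illustrative only.

```ruby
# Standalone copy of the method shown above, for illustration only.
def at(timestamp, with_random: true)
  id  = timestamp.to_i * 1000
  id += rand(1000) if with_random
  id <<= 16
  id += rand(2**16) if with_random
  id
end

time = Time.utc(2024, 1, 1)

lower  = at(time, with_random: false)      # smallest ID for this second
upper  = at(time + 1, with_random: false)  # smallest ID for the next second
sample = at(time)                          # random ID within this second

millis   = lower >> 16     # upper bits: milliseconds since the epoch
sequence = lower & 0xFFFF  # lower 16 bits: zero when with_random is false

puts lower <= sample && sample < upper
```

With `with_random: false` the result is the lowest ID that falls within the given second, so any ID generated with randomness for that second lies between `lower` and `upper`.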

.define_timestamp_id ⇒ Object

Our ID will be composed of the following:

  • 6 bytes (48 bits) of millisecond-level timestamp

  • 2 bytes (16 bits) of sequence data

The ‘sequence data’ is intended to be unique within a given millisecond, yet obscure the ‘serial number’ of this row.

To do this, we hash the following data:

  • Table name (if provided, skipped if not)

  • Secret salt (should not be guessable)

  • Timestamp (again, millisecond-level granularity)

We then take the first two bytes of that value, and add the lowest two bytes of the table ID sequence number (`table_name`_id_seq). This means that even if we insert two rows at the same millisecond, they will have distinct ‘sequence data’ portions.

If this happens, and an attacker can see both such IDs, they can determine which of the two entries was inserted first, but not the total number of entries in the table (even mod 2**16).

The table name is included in the hash to ensure that different tables derive separate sequence bases so rows inserted in the same millisecond in different tables do not reveal the table ID sequence number for one another.

The secret salt is included in the hash to ensure that external users cannot derive the sequence base given the timestamp and table name, which would allow them to compute the table ID sequence number.
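The scheme described above can be sketched in plain Ruby as follows. This is only an approximation under stated assumptions: the helper name `sketch_timestamp_id`, its arguments, the use of SHA-256, and the way the inputs are concatenated are all hypothetical, since the actual function is defined in SQL and its hash is not specified here.

```ruby
require "digest"

# Hypothetical helper, not part of this library. The hash function and the
# input concatenation order are assumptions made for illustration.
def sketch_timestamp_id(table_name, salt, time_ms, sequence_number)
  digest = Digest::SHA256.digest("#{table_name}#{salt}#{time_ms}")

  # First two bytes of the hash, read as an unsigned 16-bit integer.
  sequence_base = digest.unpack1("n")

  # Add the lowest two bytes of the sequence number, wrapping to 16 bits
  # so the result fits in the two bytes reserved for sequence data.
  tail = (sequence_base + (sequence_number & 0xFFFF)) & 0xFFFF

  # 48 bits of millisecond timestamp followed by 16 bits of sequence data.
  (time_ms << 16) | tail
end

time_ms = 1_700_000_000_123 # hypothetical millisecond timestamp
a = sketch_timestamp_id("statuses", "s3cret", time_ms, 1)
b = sketch_timestamp_id("statuses", "s3cret", time_ms, 2)

puts a != b             # rows in the same millisecond still get distinct IDs
puts (a >> 16) == time_ms
```

Because both rows share the same hash-derived base, their low 16 bits differ only by the difference in sequence numbers, which is why insertion order is visible but the absolute sequence number is not.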



# File 'lib/snowflake_id/generator.rb', line 67

def define_timestamp_id
  return if already_defined?

  connection.execute(sanitized_timestamp_id_sql)
end

.ensure_id_sequences_exist ⇒ Object



# File 'lib/snowflake_id/generator.rb', line 73

def ensure_id_sequences_exist
  # Find tables using timestamp IDs.
  connection.tables.each do |table|
    ensure_id_sequences_exist_for(table)
  end
end

.ensure_id_sequences_exist_for(table_name) ⇒ Object



# File 'lib/snowflake_id/generator.rb', line 80

def ensure_id_sequences_exist_for(table_name)
  # We're only concerned with "id" columns.
  id_col = connection.columns(table_name).find { |col| col.name == "id" }
  return unless id_col

  # And only those that are using timestamp_id.
  data = DEFAULT_REGEX.match(id_col.default_function)
  return unless data

  seq_name = "#{data[:seq_prefix]}_id_seq"

  # If we were on Postgres 9.5+, we could do CREATE SEQUENCE IF
  # NOT EXISTS, but we can't depend on that. Instead, catch the
  # possible exception and ignore it.
  # Note that seq_name isn't a column name, but it's a
  # relation, like a column, and follows the same quoting rules
  # in Postgres.
  connection.execute(<<~SQL)
    DO $$
      BEGIN
        CREATE SEQUENCE #{connection.quote_column_name(seq_name)};
      EXCEPTION WHEN duplicate_table THEN
        -- Do nothing, we have the sequence already.
      END
    $$ LANGUAGE plpgsql;
  SQL
rescue StandardError => e
  Rails.logger.warn "SnowflakeId: Could not ensure sequence for #{table_name}: #{e.message}"
end

.to_time(id) ⇒ Object



# File 'lib/snowflake_id/generator.rb', line 118

def to_time(id)
  Time.at((id >> 16) / 1000).utc
end
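A minimal sketch of decoding an ID, assuming a standalone copy of the method above. The ID is assembled by hand from a hypothetical millisecond timestamp and arbitrary sequence bits.

```ruby
# Standalone copy of the method shown above, for illustration only.
def to_time(id)
  Time.at((id >> 16) / 1000).utc
end

# Hypothetical ID: millisecond timestamp in the upper bits,
# arbitrary 16 bits of sequence data below.
time_ms = 1_700_000_000_123
id = (time_ms << 16) | 0xBEEF

decoded = to_time(id)

puts decoded.to_i  # whole seconds since the epoch
puts decoded.usec  # the integer division drops the millisecond part
```

Note that the integer division by 1000 discards the millisecond component, so the returned time has one-second resolution.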