Module: SnowflakeId::Generator
- Defined in: lib/snowflake_id/generator.rb
Defined Under Namespace
Classes: Callbacks
Constant Summary
- DEFAULT_REGEX = /timestamp_id\('(?<seq_prefix>\w+)'/
Class Method Summary
- .at(timestamp, with_random: true) ⇒ Object
- .define_timestamp_id ⇒ Object
  Our ID will be composed of the following: 6 bytes (48 bits) of millisecond-level timestamp and 2 bytes (16 bits) of sequence data.
- .ensure_id_sequences_exist ⇒ Object
- .ensure_id_sequences_exist_for(table_name) ⇒ Object
- .to_time(id) ⇒ Object
Class Method Details
.at(timestamp, with_random: true) ⇒ Object
# File 'lib/snowflake_id/generator.rb', line 110
def at(timestamp, with_random: true)
  id = timestamp.to_i * 1000
  id += rand(1000) if with_random
  id <<= 16
  id += rand(2**16) if with_random
  id
end
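To make the bit layout concrete, here is a minimal standalone sketch. `compose_id` is a hypothetical helper that mirrors the listing above with randomness disabled; it is not part of the library.

```ruby
# Hypothetical standalone helper mirroring `at` with with_random: false.
def compose_id(timestamp)
  millis = timestamp.to_i * 1000 # whole seconds expressed as milliseconds
  millis << 16                   # leave the low 16 bits for sequence data
end

t = Time.utc(2020, 1, 1)
id = compose_id(t)

puts id >> 16    # prints 1577836800000 (the millisecond timestamp)
puts id & 0xFFFF # prints 0 (no sequence data without randomness)
```

Because the timestamp occupies the high bits, an ID built this way is the smallest possible ID for that second, which may be useful as a boundary when comparing IDs by creation time.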
.define_timestamp_id ⇒ Object
Our ID will be composed of the following:
- 6 bytes (48 bits) of millisecond-level timestamp
- 2 bytes (16 bits) of sequence data
The ‘sequence data’ is intended to be unique within a given millisecond, yet obscure the ‘serial number’ of this row.
To do this, we hash the following data:
- Table name (if provided, skipped if not)
- Secret salt (should not be guessable)
- Timestamp (again, millisecond-level granularity)
We then take the first two bytes of that value and add the lowest two bytes of the table ID sequence number (`table_name`_id_seq). This means that even if we insert two rows in the same millisecond, they will have distinct ‘sequence data’ portions.
If this happens, and an attacker can see both such IDs, they can determine which of the two entries was inserted first, but not the total number of entries in the table (even mod 2**16).
The table name is included in the hash to ensure that different tables derive separate sequence bases so rows inserted in the same millisecond in different tables do not reveal the table ID sequence number for one another.
The secret salt is included in the hash to ensure that external users cannot derive the sequence base given the timestamp and table name, which would allow them to compute the table ID sequence number.
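The derivation described above can be sketched in Ruby as follows. This is illustrative only: the library runs this logic inside the database, and the method name `sketch_timestamp_id`, the use of MD5, the salt value, the concatenation order, and the wrap-around to 16 bits are all assumptions made for the example.

```ruby
require "digest"

# Illustrative sketch; hash choice, salt, and 16-bit masking are assumptions.
def sketch_timestamp_id(table_name, millis, seq_value, salt: "example-secret-salt")
  # Hash table name (empty when nil), secret salt, and millisecond timestamp.
  digest = Digest::MD5.hexdigest("#{table_name}#{salt}#{millis}")
  sequence_base = digest[0, 4].to_i(16) # first two bytes of the hash
  # Add the lowest two bytes of the sequence number, keeping 16 bits.
  tail = (sequence_base + (seq_value & 0xFFFF)) & 0xFFFF
  (millis << 16) | tail
end

millis = 1_577_836_800_000
first  = sketch_timestamp_id("statuses", millis, 41)
second = sketch_timestamp_id("statuses", millis, 42)

puts (first >> 16) == (second >> 16) # prints true: same timestamp portion
puts first != second                 # prints true: distinct sequence data
```

Two rows inserted in the same millisecond share the hashed base, so their low 16 bits differ only by the difference in sequence numbers, while the base itself stays hidden from anyone who does not know the salt.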
# File 'lib/snowflake_id/generator.rb', line 67
def define_timestamp_id
  return if already_defined?

  connection.execute()
end
.ensure_id_sequences_exist ⇒ Object
# File 'lib/snowflake_id/generator.rb', line 73
def ensure_id_sequences_exist
  # Find tables using timestamp IDs.
  connection.tables.each do |table|
    ensure_id_sequences_exist_for(table)
  end
end
.ensure_id_sequences_exist_for(table_name) ⇒ Object
# File 'lib/snowflake_id/generator.rb', line 80
def ensure_id_sequences_exist_for(table_name)
  # We're only concerned with "id" columns.
  id_col = connection.columns(table_name).find { |col| col.name == "id" }
  return unless id_col

  # And only those that are using timestamp_id.
  data = DEFAULT_REGEX.match(id_col.default_function)
  return unless data

  seq_name = "#{data[:seq_prefix]}_id_seq"

  # If we were on Postgres 9.5+, we could do CREATE SEQUENCE IF
  # NOT EXISTS, but we can't depend on that. Instead, catch the
  # possible exception and ignore it.
  # Note that seq_name isn't a column name, but it's a
  # relation, like a column, and follows the same quoting rules
  # in Postgres.
  connection.execute(<<~SQL)
    DO $$
      BEGIN
        CREATE SEQUENCE #{connection.quote_column_name(seq_name)};
      EXCEPTION WHEN duplicate_table THEN
        -- Do nothing, we have the sequence already.
      END
    $$ LANGUAGE plpgsql;
  SQL
rescue StandardError => e
  Rails.logger.warn "SnowflakeId: Could not ensure sequence for #{table_name}: #{e.message}"
end
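As a small illustration of how the sequence name is derived, the sketch below applies DEFAULT_REGEX to a sample column default. The default strings shown are hypothetical examples of what a database might report, not values taken from the library.

```ruby
DEFAULT_REGEX = /timestamp_id\('(?<seq_prefix>\w+)'/

# Hypothetical column default for a table using timestamp IDs.
default_function = "timestamp_id('statuses'::text)"

data = DEFAULT_REGEX.match(default_function)
seq_name = "#{data[:seq_prefix]}_id_seq" if data
puts seq_name # prints "statuses_id_seq"

# A column with an ordinary serial default does not match, so it is skipped.
other = DEFAULT_REGEX.match("nextval('accounts_id_seq'::regclass)")
puts other.inspect # prints "nil"
```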
.to_time(id) ⇒ Object
# File 'lib/snowflake_id/generator.rb', line 118
def to_time(id)
  Time.at((id >> 16) / 1000).utc
end
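A short sketch of decoding, using a standalone copy of the method above. The ID is assembled by hand from an arbitrary millisecond timestamp and arbitrary sequence bits chosen for the example.

```ruby
# Standalone copy of the listing above, for illustration.
def to_time(id)
  Time.at((id >> 16) / 1000).utc
end

# An ID built from a known millisecond timestamp plus arbitrary sequence data.
millis = 1_577_836_800_123
id = (millis << 16) | 0xBEEF

decoded = to_time(id)
puts decoded.to_i # prints 1577836800
puts decoded.utc? # prints true
```

Note that the integer division by 1000 discards the millisecond part, so the decoded time has second precision.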