Class: NdrPseudonymise::PrescriptionPseudonymiser
- Inherits: PseudonymisationSpecification
  - Object
  - PseudonymisationSpecification
  - NdrPseudonymise::PrescriptionPseudonymiser
- Defined in: lib/ndr_pseudonymise/prescription_pseudonymiser.rb
Overview
Pseudonymise prescription data
Constant Summary
- PREAMBLE_V2_DEMOG_ONLY = 'Pseudonymised matching data v2.0-demog-only'.freeze
Constants inherited from PseudonymisationSpecification
NdrPseudonymise::PseudonymisationSpecification::HEADER_ROW_PREFIX, NdrPseudonymise::PseudonymisationSpecification::KEY_BYTES, NdrPseudonymise::PseudonymisationSpecification::PREAMBLE_V1_STRIPED
Instance Method Summary
- #csv_header_row ⇒ Object
  Header row for CSV data.
- #emit_csv_rows(out_csv, pseudonymised_row) ⇒ Object
  Append the output of pseudonymise_row to a CSV file.
- #initialize(format_spec, key_bundle) ⇒ PrescriptionPseudonymiser (constructor)
  A new instance of PrescriptionPseudonymiser.
- #pseudonymise_row(row) ⇒ Object
  Pseudonymise a row of prescription data, returning an array containing a single row: [[packed_pseudoid_and_demographics, clinical_data1, …]], where packed_pseudoid_and_demographics has the form “pseudo_id1 (key_bundle) encrypted_demographics”.
- #row_errors2(row) ⇒ Object
  Validate a row of prescription data. Documented as returning false for a valid data row and otherwise a list of errors; the source listed under Instance Method Details raises on the first invalid field instead.
Methods inherited from PseudonymisationSpecification
#all_demographics, #clinical_data, #core_demographics, #data_hash, #decrypt_data, #decrypt_to_csv, #encrypt_data, factory, get_key_bundle, #header_row?, #pseudo_id, #pseudonymise_csv, #random_key, #real_ids, #row_errors, #safe_json
Constructor Details
#initialize(format_spec, key_bundle) ⇒ PrescriptionPseudonymiser
Returns a new instance of PrescriptionPseudonymiser.
# File 'lib/ndr_pseudonymise/prescription_pseudonymiser.rb', line 16

def initialize(format_spec, key_bundle)
  super
  return if @format_spec[:demographics] == [0, 1]
  raise 'Invalid specification: expected nhsnumber and birthdate in first 2 columns'
end
Instance Method Details
#csv_header_row ⇒ Object
Header row for CSV data
# File 'lib/ndr_pseudonymise/prescription_pseudonymiser.rb', line 62

def csv_header_row
  [PREAMBLE_V2_DEMOG_ONLY]
end
#emit_csv_rows(out_csv, pseudonymised_row) ⇒ Object
Append the output of pseudonymise_row to a CSV file
# File 'lib/ndr_pseudonymise/prescription_pseudonymiser.rb', line 67

def emit_csv_rows(out_csv, pseudonymised_row)
  out_csv << pseudonymised_row[0]
end
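The two CSV methods above suggest an output file that starts with a one-cell preamble row, followed by one row per input record. The following is a minimal sketch of that layout using Ruby's standard CSV library. It does not call the real class: `fake_pseudonymised_row` and `write_output` are hypothetical helpers, and the packed first column is a placeholder string rather than real pseudonymised data.

```ruby
require 'csv'

# Value of PREAMBLE_V2_DEMOG_ONLY as listed in the Constant Summary.
PREAMBLE = 'Pseudonymised matching data v2.0-demog-only'

# Hypothetical stand-in for the return value of #pseudonymise_row:
# an array holding a single output row.
def fake_pseudonymised_row(clinical_fields)
  [['pseudo_id1 (key_bundle) encrypted_demographics'] + clinical_fields]
end

# Hypothetical driver: writes the preamble row, then one row per record.
def write_output(rows)
  CSV.generate do |out_csv|
    out_csv << [PREAMBLE] # same shape as #csv_header_row
    rows.each do |clinical_fields|
      pseudonymised_row = fake_pseudonymised_row(clinical_fields)
      out_csv << pseudonymised_row[0] # same step as #emit_csv_rows
    end
  end
end

OUTPUT = write_output([%w[aspirin 75mg], %w[ibuprofen 200mg]])
puts OUTPUT
```

Because #pseudonymise_row returns an array of rows, #emit_csv_rows appends only the first (and only) element.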
#pseudonymise_row(row) ⇒ Object
Pseudonymise a row of prescription data, returning an array containing a single row:
- [[packed_pseudoid_and_demographics, clinical_data1, …]]
Where packed_pseudoid_and_demographics has the form “pseudo_id1 (key_bundle) encrypted_demographics”, and encrypted_demographics is the encrypted JSON of the NHS number and birthdate.
# File 'lib/ndr_pseudonymise/prescription_pseudonymiser.rb', line 40

def pseudonymise_row(row)
  @key_cache ||= {} # Cache pseudonymisation keys for more compact import
  all_demographics = { 'nhsnumber' => row[0], 'birthdate' => row[1] }
  key = all_demographics.to_json
  if @key_cache.key?(key)
    pseudo_id1, key_bundle, demog_key = @key_cache[key]
  else
    pseudo_id1, key_bundle, demog_key = NdrPseudonymise::SimplePseudonymisation.
                                        generate_keys_nhsnumber_demog_only(@salt1, @salt2, row[0])
    if !row[0].to_s.empty? && !row[1].to_s.empty? # && false to stop caching
      @key_cache = {} if @key_cache.size > 10000 # Limit cache size
      @key_cache[key] = [pseudo_id1, key_bundle, demog_key]
    end
  end
  encrypted_demographics = NdrPseudonymise::SimplePseudonymisation.
                           encrypt_data64(demog_key, all_demographics.to_json)
  packed_pseudoid_and_demographics = format('%s (%s) %s', pseudo_id1, key_bundle,
                                            encrypted_demographics)
  [[packed_pseudoid_and_demographics] + row[2..-1]]
end
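The first output column is built with `format('%s (%s) %s', ...)`, so a consumer has to split it back into three parts. The sketch below shows one way that might be done. `pack_field`, `unpack_field` and `PACKED_FIELD` are hypothetical names, not part of the library, and the pattern assumes that none of the three parts contains a space or a parenthesis, which the source does not confirm.

```ruby
# Pack the three parts with the same format string as #pseudonymise_row.
def pack_field(pseudo_id1, key_bundle, encrypted_demographics)
  format('%s (%s) %s', pseudo_id1, key_bundle, encrypted_demographics)
end

# Assumption: no part contains a space or parenthesis.
PACKED_FIELD = /\A(?<pseudo_id1>[^ ()]+) \((?<key_bundle>[^ ()]+)\) (?<encrypted_demographics>[^ ()]+)\z/

# Split a packed field into a hash of its three named parts.
def unpack_field(packed)
  match = PACKED_FIELD.match(packed)
  raise ArgumentError, 'Unrecognised packed field' unless match

  match.named_captures
end

# Placeholder values only; real values come from SimplePseudonymisation.
packed = pack_field('abc123', 'a2V5', 'ZGF0YQ==')
p unpack_field(packed)
```

If the real values can contain spaces or parentheses, a different delimiter-aware parser would be needed.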
#row_errors2(row) ⇒ Object
Validate a row of prescription data. Documented as returning false if this row is a valid data row, otherwise a list of errors; the source below raises on the first invalid field instead of returning a list.
# File 'lib/ndr_pseudonymise/prescription_pseudonymiser.rb', line 24

def row_errors2(row)
  # Not significantly faster than optimised general #row_errors method
  (nhsnumber, birthdate) = row[0..1]
  unless nhsnumber.is_a?(String) && nhsnumber =~ /\A([0-9]{10})?\Z/
    raise 'Invalid NHS number'
  end
  raise 'Missing NHS number' if nhsnumber.size < 10
  unless birthdate.is_a?(String) && birthdate =~ /\A([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]|)\Z/
    raise 'Invalid birthdate'
  end
end
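To make the behaviour of these checks easier to see, here is a standalone sketch that copies the two patterns from the source above into a free function. `check_row` and `outcome` are hypothetical helper names. The sketch only reproduces the checks as listed; it does not involve the rest of the class.

```ruby
NHS_NUMBER_PATTERN = /\A([0-9]{10})?\Z/
BIRTHDATE_PATTERN = /\A([0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]|)\Z/

# Standalone copy of the checks in #row_errors2 (hypothetical helper).
def check_row(row)
  nhsnumber, birthdate = row[0..1]
  raise 'Invalid NHS number' unless nhsnumber.is_a?(String) && nhsnumber =~ NHS_NUMBER_PATTERN
  raise 'Missing NHS number' if nhsnumber.size < 10
  raise 'Invalid birthdate' unless birthdate.is_a?(String) && birthdate =~ BIRTHDATE_PATTERN

  nil
end

# Report the outcome as a string instead of letting the exception propagate.
def outcome(row)
  check_row(row)
  'ok'
rescue RuntimeError => e
  e.message
end

puts outcome(['1234567890', '2001-02-03', 'aspirin'])
puts outcome(['', '2001-02-03', 'aspirin'])
puts outcome(['12345', '2001-02-03', 'aspirin'])
puts outcome(['1234567890', '03/02/2001', 'aspirin'])
```

Two details follow from the patterns as written: an empty birthdate is accepted, and the checks are purely about shape, so neither the NHS number check digit nor the calendar validity of the date is verified.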