AI Redactor

Gem Version Build Status

AI-powered redaction tool for automatically detecting and masking Personally Identifiable Information (PII) in text and images. Designed to help organizations comply with GDPR, HIPAA, and KVKK regulations.

Features

  • Text Analysis: Regex-based detection of PII including:

    • Turkish National ID numbers
    • IBAN numbers
    • Credit card numbers
    • Phone numbers
    • Email addresses
    • Social Security Numbers
    • License plates
    • IP addresses
    • Passport numbers
    • Tax ID numbers
    • Bank account numbers
  • Flexible Masking Options:

    • Custom mask characters
    • Configurable mask length
    • Format-preserving masking
    • Pattern-specific filtering
  • Comprehensive Reporting:

    • JSON reports with detection details
    • Confidence scores for each detection
    • Position information
    • Summary statistics
  • CLI Interface: Command-line tool for batch processing

  • Developer-friendly API: Simple Ruby interface

Installation

Add this line to your application's Gemfile:

gem 'ai_redactor'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install ai_redactor

Usage

Basic Text Masking

require 'ai_redactor'

# Simple masking
text = "John Smith, National ID: 12345678901, IBAN: GB29NWBK60161331926819"
masked = AiRedactor.mask_text(text)
puts masked
# => "John Smith, National ID: **********, IBAN: ********"

# Custom masking options
masked = AiRedactor.mask_text(text, 
  mask_char: 'X', 
  mask_length: 4,
  preserve_format: true
)

Detailed Analysis

# Get detailed analysis report
report = AiRedactor.analyze_text(text)

puts "Found #{report.detection_count} PII items"
puts "Detection types: #{report.detection_types.join(', ')}"
puts "Average confidence: #{report.summary[:average_confidence]}"

# Access individual detections
report.detections.each do |detection|
  puts "#{detection[:type]}: #{detection[:original]} (confidence: #{detection[:confidence]})"
end

# Export as JSON
json_report = report.to_json
File.write('analysis_report.json', json_report)

Pattern Filtering

# Only detect specific patterns
email_only = AiRedactor.mask_text(text, patterns: [:email])

# Multiple specific patterns
financial_only = AiRedactor.mask_text(text, patterns: [:iban, :credit_card, :bank_account])

CLI Usage

# Basic masking
ai_redactor "Contact John at [email protected] or call 555-123-4567"

# Custom options
ai_redactor --mask-char X --mask-length 4 "Email: [email protected]"

# Detailed analysis
ai_redactor --analyze --format json "ID: 12345678901, Email: [email protected]"

# Save to file
ai_redactor --output report.txt --analyze "Sensitive data here"

# List available patterns
ai_redactor --list-patterns

# Filter specific patterns
ai_redactor --patterns email,phone "Contact info: [email protected], 555-1234"

Configuration Options

Option Description Default
mask_char Character used for masking "*"
mask_length Length of mask 8
preserve_format Preserve original format false
patterns Array of patterns to detect All patterns
case_sensitive Case-sensitive matching false

Supported PII Patterns

  • turkish_id: Turkish National ID numbers (11 digits)
  • iban: International Bank Account Numbers
  • credit_card: Credit card numbers (various formats)
  • phone: Phone numbers (international and local)
  • email: Email addresses
  • ssn: Social Security Numbers (US format)
  • license_plate: License plate numbers
  • ip_address: IP addresses
  • passport: Passport numbers
  • tax_id: Tax ID numbers
  • bank_account: Bank account numbers

Privacy & Security

  • Offline Processing: No data sent to external services
  • No Data Storage: Text is processed in memory only
  • Configurable: Full control over detection and masking
  • Compliance Ready: Supports GDPR, HIPAA, and KVKK requirements

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt.

To install this gem onto your local machine, run bundle exec rake install.

Running Tests

bundle exec rspec

Code Quality

bundle exec rubocop

Roadmap

  • v0.1: ✅ Text redaction with regex patterns
  • v0.2: 🔄 ONNX-powered face detection in images
  • v0.3: 📋 REST/gRPC API service mode
  • v1.0: 🎯 Full compliance suite with audit logs

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/ahmetxhero/ai_redactor.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

License

The gem is available as open source under the terms of the MIT License.

Support


Protecting Privacy, One Redaction at a Time 🛡️