Class: Spurline::Security::InjectionScanner
- Inherits:
-
Object
- Object
- Spurline::Security::InjectionScanner
- Defined in:
- lib/spurline/security/injection_scanner.rb
Overview
Scans Content objects for prompt injection patterns. Configurable strictness: :strict, :moderate, :permissive.
Only scans content at trust levels that could be injected (:user, :external, :untrusted). System and operator content is trusted by definition and bypasses scanning.
Pattern tiers are additive: :strict includes all :moderate patterns, :moderate includes all :permissive (BASE) patterns.
Constant Summary collapse
- SKIP_TRUST_LEVELS =
i[system operator].freeze
- BASE_PATTERNS =
Patterns checked at all strictness levels — the most obvious injection attempts.
[ /ignore\s+(all\s+)?(previous|prior|above|earlier)\s+(instructions|prompts|context|rules)/i, /you\s+are\s+now\s+(a|an|in)\s+/i, /\bsystem\s*:\s*\n/i, /\bforget\s+(all\s+|everything\s+)?(previous|prior|your)\s+(instructions|context|rules|training)/i, /\bdisregard\s+(all\s+)?(previous|prior|above|your)\s+(instructions|prompts|rules)/i, /\bnew\s+instructions\s*:/i, /\bpretend\s+(you\s+are|to\s+be|that\s+you)/i, ].freeze
- MODERATE_PATTERNS =
Additional patterns for :moderate and :strict — social engineering and role manipulation.
[ /\bdo\s+not\s+follow\b/i, /\boverride\s+(your|the)\s+(instructions|rules|guidelines|programming)\b/i, /\bact\s+as\s+(if\s+you\s+are|though\s+you|a\b)/i, /\bbehave\s+as\s+(if|though|a\b)/i, /\bfrom\s+now\s+on\s*,?\s*(you|your|act|behave|respond|ignore)/i, /\bjailbreak/i, /\bdeveloper\s+mode\b/i, /\bDAN\s+(mode|prompt)\b/i, /\bdo\s+anything\s+now\b/i, /\bunfiltered\s+(mode|response|output)\b/i, /\bno\s+(restrictions|rules|limitations|filters|censorship)\b/i, /\bbypass\s+(your|the|any|all)\s+(restrictions|rules|filters|safety|guidelines)/i, ].freeze
- STRICT_PATTERNS =
Additional patterns for :strict only — structural attacks and format manipulation.
[ /\brole\s*:\s*(system|assistant)\b/i, /<\/?system>/i, /\[INST\]/i, /<<\s*SYS\s*>>/i, /<\|im_start\|>/i, /\bIMPORTANT\s*:\s*(new|override|ignore|forget|disregard|update)/i, /\bATTENTION\s*:\s*(new|override|ignore|forget|disregard|update)/i, /\b(BEGIN|END)\s+(SYSTEM|INSTRUCTION|PROMPT)\b/i, /---+\s*\n\s*(system|instruction|new prompt|override)/i, /\bbase64\s*[\s:]+[A-Za-z0-9+\/=]{20,}/i, /\btranslate\s+(the\s+)?(following|this)\s+(from|to)\s+.*\s+(and|then)\s+(ignore|forget|override)/i, /\brepeat\s+(the\s+)?(system\s+prompt|instructions|your\s+rules)/i, /\b(reveal|show|display|output|print)\s+(your|the)\s+(system\s+prompt|instructions|rules)/i, ].freeze
- LEVELS =
i[strict moderate permissive].freeze
Instance Attribute Summary collapse
-
#level ⇒ Object
readonly
Returns the value of attribute level.
Instance Method Summary collapse
-
#initialize(level: :strict) ⇒ InjectionScanner
constructor
A new instance of InjectionScanner.
-
#scan!(content) ⇒ Object
Scans a Content object for injection patterns.
Constructor Details
#initialize(level: :strict) ⇒ InjectionScanner
Returns a new instance of InjectionScanner.
64 65 66 67 |
# File 'lib/spurline/security/injection_scanner.rb', line 64 def initialize(level: :strict) validate_level!(level) @level = level end |
Instance Attribute Details
#level ⇒ Object (readonly)
Returns the value of attribute level.
62 63 64 |
# File 'lib/spurline/security/injection_scanner.rb', line 62 def level @level end |
Instance Method Details
#scan!(content) ⇒ Object
Scans a Content object for injection patterns. Returns nil if clean, raises InjectionAttemptError if detected.
71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 |
# File 'lib/spurline/security/injection_scanner.rb', line 71 def scan!(content) return if SKIP_TRUST_LEVELS.include?(content.trust) text = content.text patterns_for_level.each do |pattern| next unless text.match?(pattern) raise Spurline::InjectionAttemptError, "Injection pattern detected in content (trust: #{content.trust}, " \ "source: #{content.source}). Pattern: #{pattern.source[0..40]}. " \ "Review the content or adjust injection_filter level." end nil end |