Class: Tahweel::Ocr

Inherits:
Object
Defined in:
lib/tahweel/ocr.rb

Overview

The main entry point for Optical Character Recognition (OCR). This class acts as a factory/strategy context, delegating the actual extraction logic to a specific processor.

Examples:

Usage with default processor (Google Drive)

text = Tahweel::Ocr.extract("image.png")

Usage with a specific processor (future-proofing; :tesseract is illustrative and not in AVAILABLE_PROCESSORS)

# text = Tahweel::Ocr.extract("image.png", processor: :tesseract)
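The factory/strategy arrangement described in the overview can be sketched as follows. This is a minimal stand-in, not the gem's source: the `Sketch` namespace, the `Fake` processor, and the `REGISTRY` hash are hypothetical names introduced only for illustration, and the fake processor returns a canned string instead of performing OCR.

```ruby
# Stand-in sketch of the strategy arrangement; not the gem's actual source.
module Sketch
  module Processors
    # Hypothetical processor: any object responding to #extract(file_path) works.
    class Fake
      def extract(file_path) = "text from #{File.basename(file_path)}"
    end
  end

  class Ocr
    # Hypothetical lookup table mapping a processor name to its class.
    REGISTRY = { fake: Processors::Fake }.freeze

    # Factory-style shortcut: build a context, then delegate.
    def self.extract(file_path, processor: :fake) = new(processor: processor).extract(file_path)

    def initialize(processor: :fake)
      klass = REGISTRY.fetch(processor) { raise ArgumentError, "Unknown processor: #{processor}" }
      @processor = klass.new
    end

    # The context only forwards the call; the extraction logic lives in the processor.
    def extract(file_path) = @processor.extract(file_path)
  end
end

puts Sketch::Ocr.extract("scans/page-1.png")
```

Because the context depends only on `#extract(file_path)`, a new processor could be supported by adding one class and one lookup entry, without changing callers.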

Constant Summary

AVAILABLE_PROCESSORS =
[:google_drive].freeze

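Class Method Summary

  • .extract(file_path, processor: :google_drive) ⇒ String

    Convenience method to extract text using a specific processor.

Instance Method Summary

  • #extract(file_path) ⇒ String

    Extracts text from the file using the configured processor.

  • #initialize(processor: :google_drive) ⇒ Ocr (constructor)

    Initializes the OCR engine with a specific processor strategy.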

Constructor Details

#initialize(processor: :google_drive) ⇒ Ocr

Initializes the OCR engine with a specific processor strategy.

Parameters:

  • processor (Symbol) (defaults to: :google_drive)

    The processor strategy to use. Only :google_drive is currently accepted.

Raises:

  • (ArgumentError)

    If an unknown processor is specified.



# File 'lib/tahweel/ocr.rb', line 29

def initialize(processor: :google_drive)
  @processor = case processor
               when :google_drive then Processors::GoogleDrive.new
               else raise ArgumentError, "Unknown processor: #{processor}"
               end
end
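Since the constructor raises ArgumentError for any name it does not recognize, callers that take the processor name from user input (a CLI flag, a config file) may want to check it first. The sketch below is one possible approach under stated assumptions: `ProcessorCheck`, `AVAILABLE`, and `resolve` are hypothetical names, and `AVAILABLE` merely mirrors the documented `AVAILABLE_PROCESSORS` so the example runs without the gem.

```ruby
# Hypothetical helper module; not part of Tahweel.
module ProcessorCheck
  # Mirrors the documented constant. With the gem loaded, one would read
  # Tahweel::Ocr::AVAILABLE_PROCESSORS instead.
  AVAILABLE = [:google_drive].freeze

  module_function

  # Normalizes user input such as " Google-Drive " to :google_drive and
  # raises ArgumentError when the name is not in the supported list.
  def resolve(name)
    processor = name.to_s.strip.downcase.tr("-", "_").to_sym
    return processor if AVAILABLE.include?(processor)

    raise ArgumentError, "Unknown processor: #{name} (available: #{AVAILABLE.join(', ')})"
  end
end

begin
  ProcessorCheck.resolve("tesseract")
rescue ArgumentError => e
  warn e.message
end
```

The resolved symbol can then be passed as the `processor:` keyword, so the failure surfaces with a list of valid choices before any OCR work starts.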

Class Method Details

.extract(file_path, processor: :google_drive) ⇒ String

Convenience method that builds an Ocr instance with the given processor and extracts text from the file in one call.

Parameters:

  • file_path (String)

    Path to the image file.

  • processor (Symbol) (defaults to: :google_drive)

    The processor strategy to use. Only :google_drive is currently accepted.

Returns:

  • (String)

    The extracted text.



# File 'lib/tahweel/ocr.rb', line 23

def self.extract(file_path, processor: :google_drive) = new(processor: processor).extract(file_path)

Instance Method Details

#extract(file_path) ⇒ String

Extracts text from the file using the configured processor.

Parameters:

  • file_path (String)

    Path to the image file.

Returns:

  • (String)

    The extracted text.



# File 'lib/tahweel/ocr.rb', line 40

def extract(file_path) = @processor.extract(file_path)
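Because the class-level `.extract` builds a new instance on every call, a batch job may prefer to construct one instance and call `#extract` repeatedly. The sketch below illustrates that pattern; `extract_all` and `StubOcr` are hypothetical names, and the stub stands in for a real `Tahweel::Ocr.new` so the example runs without the gem or credentials.

```ruby
# Hypothetical helper: run one configured OCR object over several files and
# return a Hash of file path => extracted text.
def extract_all(ocr, file_paths)
  file_paths.to_h { |path| [path, ocr.extract(path)] }
end

# Stand-in for an OCR instance; it counts calls and returns canned text.
# With the gem loaded, Tahweel::Ocr.new would be passed instead.
StubOcr = Struct.new(:calls) do
  def extract(file_path)
    self.calls += 1
    "text of #{file_path}"
  end
end

ocr = StubOcr.new(0)
results = extract_all(ocr, ["a.png", "b.png"])
results.each { |path, text| puts "#{path}: #{text}" }
```

The helper accepts any object that responds to `#extract(file_path)`, which also makes it straightforward to substitute a stub in tests.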