Module: AllFather

Included in:
SCC, SRT, TTML, VTT
Defined in:
lib/allfather.rb

Overview

A module that acts as an interface where the generic methods that apply to every caption type are defined.

To support a new caption type, include this module and provide caption-specific implementations.
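
The include-and-override pattern described above could look roughly like the sketch below. It is not the library's code: `AllFatherSketch` is a trimmed stand-in that copies only two of the documented stubs, and `DemoCaption` with its ".demo" extension is a hypothetical caption type invented for illustration.

```ruby
# Stand-in for the documented module (assumption: only two stubs are
# reproduced here; the real module lives in lib/allfather.rb).
module AllFatherSketch
  def is_valid?
    raise "Not Implemented. Class #{self.class.name} doesn't implement is_valid?"
  end

  def supported_transformations
    raise "Not Implemented. Class #{self.class.name} doesn't implement supported_transformations"
  end
end

# Hypothetical caption type that overrides only is_valid?
class DemoCaption
  include AllFatherSketch

  def initialize(path)
    @path = path
  end

  # Caption-specific implementation: accept only the made-up ".demo" extension
  def is_valid?
    File.extname(@path).downcase == ".demo"
  end
end

caption = DemoCaption.new("sample.demo")
valid = caption.is_valid? # uses the override

# Methods that were not overridden fall through to the raising stub
unimplemented_message =
  begin
    caption.supported_transformations
  rescue RuntimeError => e
    e.message
  end
```

Because the stubs raise instead of returning a default, a caption type that forgets to implement a method fails on first use, and the message names the offending class.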

Defined Under Namespace

Classes: InvalidInputException, LangDetectionFailureException

Constant Summary

VALID_FILES = [".scc", ".srt", ".vtt", ".ttml", ".dfxp"]

Valid file extensions that we support. Keep expanding as we grow.

Caption type constants:

TYPE_SCC = 1
TYPE_SRT = 2
TYPE_VTT = 3
TYPE_TTML = 4
TYPE_DFXP = 5

Instance Method Details

#callsign ⇒ Object

While abstracting details away from callers has its benefits, sometimes a caller needs to identify which kind of instance it is operating on. This method identifies the current instance by returning one of the TYPE_ constants defined here. Implement this only when it is absolutely necessary and there is no other easy way to do things.

Returns
  • the call sign of the instance



# File 'lib/allfather.rb', line 171

def callsign
  raise "Not Implemented. Class #{self.class.name} doesn't implement callsign"
end
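
As a rough illustration of how a caller might use `callsign`, the sketch below dispatches on the returned TYPE_ constant. `CallsignSketch`, `SrtLike`, `VttLike`, and `describe` are hypothetical names made up for this example; only the constant values come from the Constant Summary.

```ruby
# Stand-in namespace holding two of the documented constants (assumption:
# the real classes get these by including AllFather).
module CallsignSketch
  TYPE_SRT = 2
  TYPE_VTT = 3
end

# Hypothetical caption classes that implement callsign
class SrtLike
  include CallsignSketch

  def callsign
    TYPE_SRT
  end
end

class VttLike
  include CallsignSketch

  def callsign
    TYPE_VTT
  end
end

# Hypothetical caller that needs to know which concrete type it holds
def describe(caption)
  case caption.callsign
  when CallsignSketch::TYPE_SRT then "srt"
  when CallsignSketch::TYPE_VTT then "vtt"
  else "unknown"
  end
end

labels = [SrtLike.new, VttLike.new].map { |caption| describe(caption) }
```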

#infer_languages ⇒ Object

Infers the language(s) of the caption by inspecting the file; how this is done depends on the type of the caption file.

Returns

  • The ISO 639-1 two-letter language codes



# File 'lib/allfather.rb', line 56

def infer_languages
  raise "Not Implemented. Class #{self.class.name} doesn't implement infer_languages"
end

#is_valid? ⇒ Boolean

Performs basic validation, such as whether the file is a valid caption file to accept for any further processing.

Returns:

  • (Boolean) true if the file is valid and false otherwise


# File 'lib/allfather.rb', line 44

def is_valid?
  raise "Not Implemented. Class #{self.class.name} doesn't implement is_valid?"
end

#set_translator(translator) ⇒ Object

Method to set a translation engine

  • translator - Instance of a translation engine. Refer to `engines/aws` for an example

Raises

  • `InvalidInputException` when the argument `translator` is not an instance of the Translator class



# File 'lib/allfather.rb', line 69

def set_translator(translator)
  if translator && !(translator.is_a? Translator)
    raise InvalidInputException.new("Argument is not an instance of Translator")
  end
end
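
The guard above accepts any Translator instance and also lets `nil` through, because of the leading `translator &&`. The sketch below reproduces that behaviour in isolation. `TranslatorSketch`, `FakeEngine`, and `check_translator` are hypothetical stand-ins, and the exception class is redefined locally, so none of this is the library's actual Translator hierarchy.

```ruby
# Local stand-ins (assumptions): the real Translator and
# InvalidInputException are defined by the library.
module SetTranslatorSketch
  class InvalidInputException < StandardError; end
  class TranslatorSketch; end
  class FakeEngine < TranslatorSketch; end

  # Same shape as the documented guard, returning the argument on success
  def self.check_translator(translator)
    if translator && !translator.is_a?(TranslatorSketch)
      raise InvalidInputException.new("Argument is not an instance of Translator")
    end
    translator
  end
end

engine = SetTranslatorSketch::FakeEngine.new
accepted = SetTranslatorSketch.check_translator(engine).equal?(engine)

# nil is falsy, so the guard never reaches the is_a? test
nil_passes = SetTranslatorSketch.check_translator(nil).nil?

rejected =
  begin
    SetTranslatorSketch.check_translator("not an engine")
    false
  rescue SetTranslatorSketch::InvalidInputException
    true
  end
```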

#supported_transformations ⇒ Object

Method to report the supported transformations. Each implementor is free to return the types to which it can convert itself.

Returns

  • An array of one or more of the TYPE_ constants defined here



# File 'lib/allfather.rb', line 157

def supported_transformations
  raise "Not Implemented. Class #{self.class.name} doesn't implement supported_transformations"
end

#transform_to(types, src_lang, target_lang, output_dir) ⇒ Object

Method to convert from one caption type to other types. If src_lang is not provided, all source languages are converted to the target types. For example, if a ttml file has “en” and “es”, the target type is vtt, and no src_lang is provided, 2 vtt files are created, one per language in the source. If a target_lang is provided, one of the languages from the source is picked for creating the output file in target_lang.

If no target_lang is provided, no translation is applied. The output file is created without any language translation service and hence doesn’t incur any cost.

Note: src_lang makes sense only for caption types that can hold multilingual captions, like dfxp and ttml. For other caption sources this field is ignored.

  • types - An array of valid target caption type(s), given as TYPE_ constants. Refer to `#CaptionType`

  • src_lang - Source language code; can be inferred using the #infer_languages method

  • target_lang - Target two-letter ISO language code into which the source needs to be translated

  • output_dir - Output directory. Generated files are written here

Raises

InvalidInputException shall be raised if

  1. The input file doesn’t exist, is unreadable, or is an invalid caption file

  2. The output dir doesn’t exist (the base implementation shown below creates a missing directory with FileUtils.mkdir_p)

  3. The lang codes are invalid for a given caption type

  4. Conversion is requested to an unsupported type



# File 'lib/allfather.rb', line 134

def transform_to(types, src_lang, target_lang, output_dir)
  if (types - supported_transformations).size != 0
    raise InvalidInputException.new("Unknown types provided for conversion in input #{types}")
  end
  unless File.directory?(output_dir)
    FileUtils.mkdir_p(output_dir)
  end
  # Basic validations
  if types.include?(TYPE_SCC)
    if target_lang && !target_lang.eql?("en")
      raise InvalidInputException.new("SCC can be generated only in en. #{target_lang} is unsupported")
    end
  end
end
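
The base checks above can be exercised on their own, as in the sketch below. It assumes a hypothetical `TransformSketch.validate` helper that takes the supported types as an argument instead of calling `supported_transformations`, and a locally defined exception class; the constant values are the ones from the Constant Summary.

```ruby
require "fileutils"
require "tmpdir"

module TransformSketch
  TYPE_SCC = 1
  TYPE_SRT = 2
  TYPE_VTT = 3

  # Local stand-in for the library's exception class
  class InvalidInputException < StandardError; end

  # Hypothetical helper mirroring the documented checks
  def self.validate(types, supported, target_lang, output_dir)
    # Array difference leaves only the requested types that are unsupported
    unless (types - supported).empty?
      raise InvalidInputException, "Unknown types provided for conversion in input #{types}"
    end
    # A missing output directory is created rather than rejected
    FileUtils.mkdir_p(output_dir) unless File.directory?(output_dir)
    # SCC output is restricted to English
    if types.include?(TYPE_SCC) && target_lang && target_lang != "en"
      raise InvalidInputException, "SCC can be generated only in en. #{target_lang} is unsupported"
    end
    true
  end
end

supported = [TransformSketch::TYPE_SCC, TransformSketch::TYPE_SRT]
results = {}

Dir.mktmpdir do |dir|
  out = File.join(dir, "captions")
  results[:ok] = TransformSketch.validate([TransformSketch::TYPE_SRT], supported, nil, out)
  results[:dir_created] = File.directory?(out)
  results[:unsupported_rejected] =
    begin
      TransformSketch.validate([TransformSketch::TYPE_VTT], supported, nil, out)
      false
    rescue TransformSketch::InvalidInputException
      true
    end
  results[:scc_non_en_rejected] =
    begin
      TransformSketch.validate([TransformSketch::TYPE_SCC], supported, "es", out)
      false
    rescue TransformSketch::InvalidInputException
      true
    end
end
```

Passing the supported list in explicitly keeps the sketch independent of any caption class; in the module itself that list comes from the including class.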

#translate(src_lang, target_lang, output_file) ⇒ Object

Method to translate the caption from one language to another

  • src_lang - Source language code; can be inferred using the #infer_languages method

  • target_lang - Target two-letter ISO language code into which the source needs to be translated

  • output_file - Output file. Can be a fully qualified path or just a file name

Raises

InvalidInputException shall be raised if

  1. The input file doesn’t exist, is unreadable, or is an invalid caption file

  2. The output file can’t be written, or it already exists and is not empty

  3. The target_lang is not a valid ISO 639-1 two-letter language code



# File 'lib/allfather.rb', line 89

def translate(src_lang, target_lang, output_file)
  # Check if a non-empty output file is present and error out to avoid
  # the danger of overwriting some important file
  if File.exist?(output_file) && File.size(output_file) > 0
    raise InvalidInputException.new("Output file #{output_file} is not empty.")
  else
    # Open the file in write mode and close it immediately, just to ensure
    # that we can create or write the output file
    File.open(output_file, "w") {}
  end
  # Check that the file is writable
  unless File.writable?(output_file)
    raise InvalidInputException.new("Output file #{output_file} not writable.")
  end
  # Further checks can be done only in caption-specific implementations
  # or translation-engine-specific implementations
end
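
To show what the output-file precheck does in practice, the sketch below runs the same sequence against a temporary directory. `TranslateSketch.prepare_output` is a hypothetical helper and the exception class is a local stand-in; only the order of checks is taken from the method above.

```ruby
require "tmpdir"

module TranslateSketch
  # Local stand-in for the library's exception class
  class InvalidInputException < StandardError; end

  # Hypothetical helper mirroring the documented precheck
  def self.prepare_output(output_file)
    # Refuse to touch a file that already holds data
    if File.exist?(output_file) && File.size(output_file) > 0
      raise InvalidInputException, "Output file #{output_file} is not empty."
    end
    # Creating (or truncating) the empty file proves the path can be opened for writing
    File.open(output_file, "w") {}
    unless File.writable?(output_file)
      raise InvalidInputException, "Output file #{output_file} not writable."
    end
    output_file
  end
end

outcome = {}

Dir.mktmpdir do |dir|
  fresh = File.join(dir, "out.vtt")

  # First call: the file does not exist yet, so it is created empty
  TranslateSketch.prepare_output(fresh)
  outcome[:created_empty] = File.exist?(fresh) && File.size(fresh).zero?

  # Second call after writing content: the non-empty file is refused
  File.write(fresh, "WEBVTT\n")
  outcome[:refused_non_empty] =
    begin
      TranslateSketch.prepare_output(fresh)
      false
    rescue TranslateSketch::InvalidInputException
      true
    end
  outcome[:content_preserved] = File.read(fresh) == "WEBVTT\n"
end
```

One side effect worth noting: a successful precheck leaves an empty file behind even if a later step fails.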