Class: SmTranscript::Transcript
- Inherits:
- Object
  - Object
    - SmTranscript::Transcript
- Defined in:
- lib/sm_transcript/transcript.rb
Instance Method Summary
-
#cleanup_phrase(phrase) ⇒ Object
There are some word combinations that occur with such regularity that they call out to be fixed.
-
#get_time_expression(milliseconds) ⇒ Object
words_to_phrase.
-
#initialize(word_arr) ⇒ Transcript
constructor
A new instance of Transcript.
-
#words_to_phrase(start_time) ⇒ Object
Times are expressed in milliseconds, far more granularity than is useful for most user-facing apps, especially since the player reports elapsed time only ten times a second.
- #write_html(dest_file) ⇒ Object
- #write_ttml(dest_file) ⇒ Object
Constructor Details
#initialize(word_arr) ⇒ Transcript
Returns a new instance of Transcript.
# File 'lib/sm_transcript/transcript.rb', line 15
def initialize(word_arr)
  @metadata = {}
  @words = word_arr
end
Instance Method Details
#cleanup_phrase(phrase) ⇒ Object
There are some word combinations that occur with such regularity that they call out to be fixed. For example, “m I t” is unambiguously MIT. These edits can only be made once the phrase has been assembled. The current implementation is a placeholder that returns the phrase unchanged.
# File 'lib/sm_transcript/transcript.rb', line 124
def cleanup_phrase(phrase)
  phrase
end
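The kind of fix described above could be sketched as a small substitution table. This is only an illustration: `PHRASE_FIXES` and `cleanup_phrase_sketch` are hypothetical names, and only the "m I t" entry comes from the description.

```ruby
# Hypothetical fix-up table; only the "m I t" entry comes from the text above.
PHRASE_FIXES = {
  /\bm I t\b/ => "MIT"
}.freeze

# Apply every known fix to an assembled phrase and return the result.
def cleanup_phrase_sketch(phrase)
  PHRASE_FIXES.reduce(phrase) do |text, (pattern, replacement)|
    text.gsub(pattern, replacement)
  end
end

puts cleanup_phrase_sketch("welcome to m I t open courseware")
```

The word boundaries in the pattern keep it from matching inside longer words, which is why the fix has to run on the assembled phrase rather than on single words.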
#get_time_expression(milliseconds) ⇒ Object
words_to_phrase
# File 'lib/sm_transcript/transcript.rb', line 117
def get_time_expression(milliseconds)
  milliseconds
end
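The method currently returns its argument unchanged, and the documentation does not say what format it is meant to produce. One plausible direction, offered purely as an assumption, is a clock-style `hh:mm:ss.mmm` string. `time_expression_sketch` is a hypothetical name, not part of the library.

```ruby
# Hypothetical formatter: milliseconds to an "hh:mm:ss.mmm" string.
# The target format is an assumption; the source does not specify one.
def time_expression_sketch(milliseconds)
  total = milliseconds.to_i
  hours, rest     = total.divmod(3_600_000)
  minutes, rest   = rest.divmod(60_000)
  seconds, millis = rest.divmod(1000)
  format("%02d:%02d:%02d.%03d", hours, minutes, seconds, millis)
end

puts time_expression_sketch(3_723_456)
```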
#words_to_phrase(start_time) ⇒ Object
Times are expressed in milliseconds, far more granularity than is useful for most user-facing apps, especially since the player reports elapsed time only ten times a second. Reducing the resolution by three orders of magnitude, from milliseconds to whole seconds, provides these benefits:
1) Multiple words fall within a <span> element.
2) Better mapping between start times and player time tracking.
# File 'lib/sm_transcript/transcript.rb', line 113
def words_to_phrase(start_time)
  start_time.to_i/1000
end
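A small sketch of the arithmetic may help. `second_bucket` repeats the integer division used above. `span_id_for_player_time` is a hypothetical helper that assumes the `T<seconds>` id format used by `write_html` and a player that reports elapsed time in fractional seconds.

```ruby
# Same arithmetic as words_to_phrase: integer division drops the milliseconds.
def second_bucket(start_time)
  start_time.to_i / 1000
end

# Hypothetical helper: map a player time in seconds (e.g. 12.3) to a span id.
def span_id_for_player_time(seconds)
  "T#{seconds.floor}"
end

puts second_bucket("12040")
puts second_bucket("12970")
puts span_id_for_player_time(12.3)
```

Both start times land in bucket 12, so the two words share one element, and any player time from 12.0 up to 12.9 resolves to the same id.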
#write_html(dest_file) ⇒ Object
# File 'lib/sm_transcript/transcript.rb', line 20
def write_html(dest_file)
  # TODO: Do we want to notify user when overwriting existing file?
  # if File.exists?(dest_file)
  #   p "overwriting existing destination file"
  # end
  File.open(dest_file, "w") do |f|
    span_element = ""
    prev_start_time = 0
    start_time = 0
    @words.each do |w|
      # get the start time and reduce its granularity so that multiple
      # words fall within a <span> element.
      start_time = w.start_time.to_i/1000
      if start_time.to_i == prev_start_time.to_i
        # append word
        span_element << " #{w.word}"
      else
        # create a new span_element
        # since prev_start_time is zero on first line, this avoids
        # writing a closing </span> with no opening <span>
        f.puts span_element << "</span> " unless prev_start_time == 0
        span_element = "<span id='T#{start_time}'>#{w.word}"
        prev_start_time = start_time
      end
    end
    # In the block above, the last word isn't written if
    # the start_time and prev_start_time are the same.
    f.puts span_element << "</span> " unless start_time != prev_start_time
  end
end
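Because prev_start_time starts at 0 in the listing above, words that begin within the first second appear never to receive an opening <span>, and they are dropped once a later second begins. The following is a sketch of the same grouping, not the library's implementation, using `Enumerable#chunk` so that no sentinel value is needed. `HtmlWord` and `html_spans_sketch` are hypothetical names; the struct stands in for whatever objects `@words` holds, assuming they respond to `word` and `start_time`.

```ruby
# Stand-in for the word objects; assumed to respond to word and start_time.
HtmlWord = Struct.new(:word, :start_time)

# Group consecutive words that start in the same whole second and
# return one <span> line per group.
def html_spans_sketch(words)
  words.chunk { |w| w.start_time.to_i / 1000 }.map do |second, run|
    "<span id='T#{second}'>#{run.map(&:word).join(' ')}</span>"
  end
end

words = [
  HtmlWord.new("so", "200"),
  HtmlWord.new("today", "850"),
  HtmlWord.new("we", "1100"),
  HtmlWord.new("begin", "2400")
]
puts html_spans_sketch(words)
```

Returning the lines instead of writing them keeps the grouping testable; a caller could pass each line to `f.puts` inside `File.open`.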
#write_ttml(dest_file) ⇒ Object
# File 'lib/sm_transcript/transcript.rb', line 52
def write_ttml(dest_file)
  # TODO: Do we want to notify user when overwriting existing file?
  # if File.exists?(dest_file)
  #   p "overwriting existing destination file"
  # end
  buf = ""
  bldr = Builder::XmlMarkup.new( :target => buf, :indent => 2 )
  bldr.instruct!
  bldr.tt("xmlns" => "http://www.w3.org/2006/04/ttaf1",
          "xmlns:tts" => "http://www.w3.org/ns/ttml#styling",
          "xmlns:ttm" => "http://www.w3.org/ns/ttml#metadata",
          "xml:lang" => "en" ) {
    bldr.head { |b|
      b.ttm :title, 'Document Metadata Example'
      b.ttm :desc, 'This document employs document metadata.'
    }
    bldr.body {
      bldr.div {
        span_element = ""
        prev_start_secs = 0
        start_ms = end_ms = 0
        start_secs = 0
        @words.each do |w|
          # get the start time and reduce its granularity so that multiple
          # words fall within a span element.
          start_secs = w.start_time.to_i/1000
          if start_secs == prev_start_secs
            # append word
            end_ms = w.end_time.to_i
            span_element << " #{w.word}"
          else
            # create a new span_element
            bldr.p( span_element,
                    "xml:id" => "T#{start_secs.to_s}",
                    "begin" => "#{start_ms.to_s}ms",
                    "end" => "#{end_ms.to_s}ms" )
            start_ms = w.start_time.to_i
            end_ms = w.end_time.to_i
            span_element = " #{w.word}"
            prev_start_secs = start_secs
          end
        end
        # In the block above, the last word isn't written if
        # the start_time and prev_start_time are the same.
        bldr.p( span_element,
                "xml:id" => "T#{start_secs.to_s}",
                "begin" => "#{start_ms.to_s}ms",
                "end" => "#{end_ms.to_s}ms" ) unless start_secs != prev_start_secs
      }
    }
  }
  # p buf
  File.open(dest_file, "w") do |f|
    f.puts buf
    f.flush
  end
end
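In the listing above, each flushed paragraph appears to take its xml:id from the word that triggered the flush rather than from the words it contains. The following sketch, which is not the library's implementation, computes the paragraph attributes separately from the XML generation so that each paragraph is labelled with the second of its own words. `TtmlWord` and `ttml_paragraphs_sketch` are hypothetical names; the struct assumes word objects that respond to `word`, `start_time`, and `end_time`. Builder is left out to keep the example free of gem dependencies.

```ruby
# Stand-in for the word objects; assumed to respond to word, start_time, end_time.
TtmlWord = Struct.new(:word, :start_time, :end_time)

# Group consecutive words by whole second and return the attributes and
# text for one <p> element per group.
def ttml_paragraphs_sketch(words)
  words.chunk { |w| w.start_time.to_i / 1000 }.map do |second, run|
    {
      "xml:id" => "T#{second}",
      "begin"  => "#{run.first.start_time.to_i}ms",
      "end"    => "#{run.last.end_time.to_i}ms",
      "text"   => run.map(&:word).join(" ")
    }
  end
end

words = [
  TtmlWord.new("hello", "1020", "1400"),
  TtmlWord.new("world", "1500", "1900"),
  TtmlWord.new("again", "2100", "2600")
]
ttml_paragraphs_sketch(words).each { |para| p para }
```

Each hash could then be handed to the XML builder, with the "text" entry as the element content and the remaining entries as attributes.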