Class: EpubTools::SplitChapters

Inherits:
Object
Includes:
Loggable
Defined in:
lib/epub_tools/split_chapters.rb

Overview

Takes XHTML files that were generated by Google Docs and already extracted from their EPUB, each containing multiple chapters, and:

  • Extracts classes using StyleFinder

  • Looks for tags that say something like Chapter XX or Prologue and splits the text there

  • Creates new chapter_XX.xhtml files that are cleaned using XHTMLCleaner

  • Saves those files to output_dir
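The splitting step described above can be sketched with plain Ruby. This is a simplified illustration, not the implementation in SplitChapters: it works on lines of text rather than on Nokogiri nodes, the helper names `heading_number`, `split_into_chapters`, and `chapter_path` are hypothetical, storing the prologue under chapter 0 is an assumption, and the exact numbering format of the real output filenames is not confirmed by this page.

```ruby
# Returns the chapter number for a heading line, or nil for ordinary text.
# Hypothetical matcher: the real pattern used by SplitChapters is not shown here.
def heading_number(text)
  return 0 if text.match?(/\Aprologue\z/i) # assumption: prologue counts as chapter 0

  match = text.match(/\Achapter\s+(\d+)\z/i)
  match && match[1].to_i
end

# Groups lines into { chapter_number => [lines] }, starting a new group
# whenever a heading line is found. Text before the first heading is dropped.
def split_into_chapters(lines)
  chapters = {}
  current = nil
  lines.each do |line|
    text = line.strip
    number = heading_number(text)
    if number
      current = number
      chapters[current] = []
    elsif current && !text.empty?
      chapters[current] << text
    end
  end
  chapters
end

# Builds an output path from the directory, prefix, and chapter number.
def chapter_path(output_dir, prefix, number)
  File.join(output_dir, "#{prefix}_#{number}.xhtml")
end

lines = ['Prologue', 'It began.', 'Chapter 1', 'First line.', 'Second line.',
         'Chapter 2', 'The end.']
chapters = split_into_chapters(lines)
paths = chapters.keys.map { |n| chapter_path('./chapters', 'chapter', n) }
```

Here `chapters` maps 0, 1, and 2 to their body lines, and `paths` holds one candidate filename per chapter under the default output directory and prefix.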

Instance Method Summary

Methods included from Loggable

#log

Constructor Details

#initialize(options = {}) ⇒ SplitChapters

Initializes a new SplitChapters instance

Parameters:

  • options (Hash) (defaults to: {})

    Configuration options

Options Hash (options):

  • :input_file (String)

    Path to the source XHTML (required)

  • :book_title (String)

    Title to use in HTML <title> tags (required)

  • :output_dir (String)

    Where to write chapter files (default: './chapters')

  • :output_prefix (String)

    Filename prefix for chapter files (default: 'chapter')

  • :verbose (Boolean)

    Whether to print progress to STDOUT (default: false)



# File 'lib/epub_tools/split_chapters.rb', line 26

def initialize(options = {})
  @input_file    = options.fetch(:input_file)
  @book_title    = options.fetch(:book_title)
  @output_dir    = options[:output_dir] || './chapters'
  @output_prefix = options[:output_prefix] || 'chapter'
  @verbose       = options[:verbose] || false
end
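
The constructor treats required and optional keys differently: `Hash#fetch` raises `KeyError` when a required key is missing, while `||` supplies a default for the optional ones. The sketch below reproduces only that option handling in a standalone helper so the behavior can be tried without the gem. The method name `resolve_options` and the sample values are hypothetical and not part of EpubTools.

```ruby
# Hypothetical stand-in that mirrors the option handling of the constructor above.
def resolve_options(options = {})
  {
    input_file: options.fetch(:input_file),            # required: KeyError if absent
    book_title: options.fetch(:book_title),            # required: KeyError if absent
    output_dir: options[:output_dir] || './chapters',  # optional, with default
    output_prefix: options[:output_prefix] || 'chapter',
    verbose: options[:verbose] || false
  }
end

# Only the required keys are given, so every optional key takes its default.
resolved = resolve_options(input_file: 'book.xhtml', book_title: 'My Book')

# Leaving out :input_file raises KeyError; capture the message for inspection.
missing_key_error =
  begin
    resolve_options(book_title: 'My Book')
    nil
  rescue KeyError => e
    e.message
  end
```

Because the defaults are applied with `||`, explicitly passing `nil` for an optional key such as `:output_dir` also falls back to the default.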

Instance Method Details

#run ⇒ Array<String>

Runs the splitter

Returns:

  • (Array<String>)

    List of generated chapter file paths



# File 'lib/epub_tools/split_chapters.rb', line 36

def run
  # Prepare output dir
  FileUtils.mkdir_p(@output_dir)

  # Read the doc
  raw_content = read_and_strip_problematic_tags
  doc = Nokogiri::HTML(raw_content)

  # Find Style Classes
  StyleFinder.new({ file_path: @input_file, verbose: @verbose }).run

  chapters = extract_chapters(doc)
  write_chapter_files(chapters)
end