Module: Wortsammler

Defined in:
lib/wortsammler.rb,
lib/wortsammler/version.rb,
lib/wortsammler/pdf_utilities.rb

Overview

This module provides utilities to handle PDF files.

Note that it only works on Mac OS X.

Author:

  • Bernhard Weichel

Constant Summary

PROGNAME =
"wortsammler"
VERSION =
"2.0.1"

Class Method Details

.beautify(paths, config = nil) ⇒ Nil

beautify a list of Documents

Parameters:

  • paths (Array of String)

    Array of filenames which shall be cleaned.

  • config (ProoConfig) (defaults to: nil)

Returns:

  • (Nil)

    no return



# File 'lib/wortsammler.rb', line 212

def self.beautify(paths, config=nil)

  cleaner = PandocBeautifier.new($log)
  cleaner.config = config if config

  paths.each { |f| cleaner.beautify(f) }
  nil
end

.collect_traces(config) ⇒ Nil

collect the Traceables in a document specified by a manifest

Parameters:

  • config (ProolibConfig)

    the manifest model

Returns:

  • (Nil)

    no return



# File 'lib/wortsammler.rb', line 327

def self.collect_traces(config)

  files   = config.input # get the input files
  rootdir = config.rootdir # get the root directory

  downstream_tracefile = config.downstream_tracefile # String to save downstream filenames
  reqtracefile_base    = config.reqtracefile_base # string to determine the requirements tracing results
  upstream_tracefiles  = config.upstream_tracefiles # String to read upstream tracefiles

  traceable_set = TraceableSet.new

  # collect all traceables in input
  files.each { |f|
    x=TraceableSet.processTracesInMdFile(f)
    traceable_set.merge(x)
  }

  # collect all upstream traceables
  #
  upstream_traceable_set=TraceableSet.new
  unless upstream_tracefiles.nil?
    upstream_tracefiles.each { |f|
      x=TraceableSet.processTracesInMdFile(f)
      upstream_traceable_set.merge(x)
    }
  end

  # check undefined traces
  all_traceable_set=TraceableSet.new
  all_traceable_set.merge(traceable_set)
  all_traceable_set.merge(upstream_traceable_set)
  undefineds=all_traceable_set.undefined_ids
  $log.warn "undefined traces: #{undefineds.join(' ')}" unless undefineds.empty?


  # check duplicates
  duplicates=all_traceable_set.duplicate_traces
  if duplicates.count > 0
    $log.warn "duplicated trace ids found:"
    duplicates.each { |d| d.each { |t| $log.warn "#{t.id} in #{t.info}" } }
  end

  # write traceables to the intermediate Tracing file
  outname                     ="#{rootdir}/#{reqtracefile_base}.md"

  # set the sort order for the traceables
  all_traceable_set.sort_order=config.traceSortOrder if config.traceSortOrder
  traceable_set.sort_order    =config.traceSortOrder if config.traceSortOrder
  # generate synopsis of traceables


  tracelist                   =""
  File.open(outname, "w") { |fx|
    fx.puts ""
    fx.puts "\\clearpage"
    fx.puts ""
    fx.puts "# Requirements Tracing"
    fx.puts ""
    tracelist=all_traceable_set.reqtraceSynopsis(:SPECIFICATION_ITEM)
    fx.puts tracelist
  }

  # output the graphml
  # write the trace graph to the GraphML file
  outname="#{rootdir}/#{reqtracefile_base}.graphml"
  File.open(outname, "w") { |fx| fx.puts all_traceable_set.to_graphml }

  outname="#{rootdir}/#{reqtracefile_base}Compare.txt"
  File.open(outname, "w") { |fx| fx.puts traceable_set.to_compareEntries }

  # write the downstream_trace file - to be included in downstream specifications
  outname="#{rootdir}/#{downstream_tracefile}"
  File.open(outname, "w") { |fx|
    fx.puts ""
    fx.puts "\\clearpage"
    fx.puts ""
    fx.puts "# Upstream Requirements"
    fx.puts ""
    fx.puts traceable_set.to_downstream_tracefile(:SPECIFICATION_ITEM)
  } unless downstream_tracefile.nil?


  # now add the upstream traces to input
  files.concat(upstream_tracefiles) unless upstream_tracefiles.nil?

  nil
end
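
The files written by collect_traces are all derived from the manifest values rootdir, reqtracefile_base and downstream_tracefile. The following is a minimal sketch of that naming scheme only. The helper trace_output_paths and the example values are hypothetical and not part of Wortsammler.

```ruby
# Hypothetical helper (not part of Wortsammler): derives the file names
# that collect_traces writes, following the string patterns in the listing above.
def trace_output_paths(rootdir, reqtracefile_base, downstream_tracefile = nil)
  paths = {
    synopsis: "#{rootdir}/#{reqtracefile_base}.md",
    graphml:  "#{rootdir}/#{reqtracefile_base}.graphml",
    compare:  "#{rootdir}/#{reqtracefile_base}Compare.txt"
  }
  # the downstream trace file is only written if the manifest names one
  paths[:downstream] = "#{rootdir}/#{downstream_tracefile}" unless downstream_tracefile.nil?
  paths
end

# made-up example values
paths = trace_output_paths("ZGEN_RequirementsTracing", "ReqTrace", "downstream.md")
paths.each { |kind, path| puts "#{kind}: #{path}" }
```

Leaving out the third argument mirrors a manifest without a downstream_tracefile entry, in which case no downstream file is written.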

.crop_pdf(infile, outfile = nil) ⇒ nil

crop a pdf file

Parameters:

  • infile (String)

    The name of the pdf file

  • outfile (String) (defaults to: nil)

    The name of the output file. If no file is given, the input file is replaced by the cropped file.

Returns:

  • (nil)

    no return



# File 'lib/wortsammler/pdf_utilities.rb', line 153

def self.crop_pdf(infile, outfile=nil)

  result=`gs -q -dBATCH -dNOPAUSE -sDEVICE=bbox \"#{infile}\" 2>&1`
  coords=/%BoundingBox:\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/.match(result).captures.join(" ")

  tmpfile=infile+"_tmp"

  cmd =[]
  cmd << "gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite"
  cmd << "-o\"#{tmpfile}\""
  cmd << "-c \"[/CropBox [#{coords}] /PAGES pdfmark\""
  cmd << "-f \"#{infile}\""
  cmd = cmd.join(" ")
  result = system(cmd)

  outfile=infile if outfile.nil?
  FileUtils.cp(tmpfile, outfile)
  FileUtils.rm(tmpfile)
  nil
end
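
In the listing above, captures is called directly on the result of the regular expression match, so a Ghostscript run that prints no bounding box line would raise a NoMethodError. A more defensive parse might look like the sketch below. The helper parse_bounding_box is hypothetical, and the sample text is illustrative rather than captured from a real Ghostscript run.

```ruby
# Hypothetical helper (not part of Wortsammler): extracts the four
# bounding box coordinates from the text printed by Ghostscript's bbox
# device. Returns nil instead of raising when no bounding box is found.
def parse_bounding_box(gs_output)
  match = /%BoundingBox:\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/.match(gs_output)
  return nil if match.nil?
  match.captures.map(&:to_i)
end

# Illustrative input, not captured from a real run
sample = "%%BoundingBox: 36 40 576 756\n%%HiResBoundingBox: 36.0 40.0 576.0 756.0\n"
coords = parse_bounding_box(sample)
if coords.nil?
  puts "no bounding box found"
else
  puts "[/CropBox [#{coords.join(' ')}] /PAGES pdfmark"
end
```

Returning nil leaves the decision to the caller, for example to log a warning and skip cropping.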

.execute(options) ⇒ Nil

execute Wortsammler after parsing the command line

Parameters:

  • options (Hash)

    The parsed commandline arguments.

    The key of each entry is the argument name as a symbol

    The value of each entry is the value of the argument

    No default handling is performed, since defaulting of arguments has already been done by the command line processor.

Returns:

  • (Nil)

    No Return
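
As an illustration, an options hash for rendering a folder of markdown files could look like the sketch below. The keys are ones that execute reads in its source listing. The values are made-up examples, the helper output_formats is hypothetical, and the hash is only built and inspected here, not passed to Wortsammler.

```ruby
# Illustrative options hash; all values are made-up examples.
options = {
  process:       true,              # render the input files
  inputpath:     "001_Main",        # a single file or a folder
  outputfolder:  "ZGEN_Documents",  # needed when :process is combined with :inputpath
  outputformats: "pdf:html"         # colon separated list of formats
}

# Hypothetical helper: execute splits the format list on ":" before
# rendering; the result stays nil when no formats were given.
def output_formats(options)
  options[:outputformats] ? options[:outputformats].split(":") : nil
end

puts output_formats(options).inspect
```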



# File 'lib/wortsammler.rb', line 24

def self.execute(options)


  PandocBeautifier.new($log).check_pandoc_version

  ##
  #
  # print version info
  #
  if options[:version] then
    puts "this is #{Wortsammler::PROGNAME} version #{Wortsammler::VERSION}\n"

    pandoc=`#{PANDOC_EXE} -v`.split("\n")[0] rescue pandoc="error running pandoc"
    xetex=`#{LATEX_EXE} -v`.split("\n")[0] rescue xetex="error running xelatex"

    $log.info "found #{pandoc}"
    $log.info "found #{xetex}"

    $log.debug("debug mode turned on")

    l= "-----------------"
    $log.info l
    options.each { |k, v| $log.info "#{k}: #{v}" }
    $log.info l
  end

  ##
  # initialize a project
  #
  if project_folder=options[:init] then
    if File.exist?(project_folder)
      $log.error "directory already exists: '#{project_folder}'"
      exit(false)
    end
    Wortsammler::init_folders(project_folder)
  end


  ##
  #
  # load the manifest or use default configuration
  #
  config = ProoConfig.new();
  if config_file=options[:manifest] then
    config.load_from_file(config_file)
  end

  ##
  # process input path
  #
  #
  input_files=nil
  if inputpath = options[:inputpath]
    unless File.exist? inputpath then
      $log.error "path does not exist: '#{inputpath}'"
      exit(false)
    end
    if File.file?(inputpath) #(RS_Mdc)
      input_files=[inputpath]
    elsif File.exist?(inputpath)
      input_files=Dir["#{inputpath}/**/*.md", "#{inputpath}/**/*.markdown", "#{inputpath}/**/*.plantuml"]
    end
  end

  ##
  #
  # beautify markdown files
  #
  #

  if options[:beautify]

    # process path
    if input_files then
      Wortsammler.beautify(input_files, config)
    end

    # process manifest
    if config.input then
      Wortsammler.beautify(config.input, config)
    end

    unless input_files or config.input
      $log.error "no input specified. Please use -m or -i to specify input"
      exit false
    end
  end

  #
  # plantuml markdown files
  #
  #

  if options[:plantuml]

    # process path
    if input_files then
      Wortsammler.plantuml(input_files)
    end

    # process manifest

    if config.input then
      Wortsammler.plantuml(config.input)
    end

    unless input_files or config.input
      $log.error "no input specified. Please use -m or -i to specify input"
      exit false
    end
  end


  ##
  # process collect in markdown files
  #

  if options[:collect]

    # collect by path

    if input_files then
      $log.warn "collect from path not yet implemented"
    end

    # collect by manifest

    if config.input then
      Wortsammler.collect_traces(config)
    end

    unless input_files or config.input
      $log.error "no input specified. Please use -m or -i to specify input"
      exit false
    end
  end


  ##
  #  process files
  #
  if options[:process]

    if input_files then

      if options[:outputformats] then
        outputformats = options[:outputformats].split(":")
      end

      if options[:outputfolder] then
        outputfolder = options[:outputfolder]
      else
        $log.error "no output folder specified"
        exit false
      end

      unless File.exist?(outputfolder) then
        $log.info "creating folder '#{outputfolder}'"
        FileUtils.mkdir_p(outputfolder)
      end

      input_files.each { |f| Wortsammler.render_single_document(f, outputfolder, outputformats, config) }
    end

    # process by manifest

    if config.input then
      Wortsammler.process(config)
    end

    unless input_files or config.input
      $log.error "no input specified. Please use -m or -i to specify input"
      exit false
    end
  end

  nil
end
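
The handling of the :inputpath option in the listing above can be isolated into a small function. The sketch below assumes the same three glob patterns. The helper resolve_input_files is hypothetical, and it returns nil for a missing path where execute logs an error and exits.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of Wortsammler): mirrors how execute
# turns the :inputpath option into a list of input files.
def resolve_input_files(inputpath)
  return nil unless File.exist?(inputpath)
  return [inputpath] if File.file?(inputpath)
  # a folder is searched recursively for markdown and plantuml sources
  Dir["#{inputpath}/**/*.md", "#{inputpath}/**/*.markdown", "#{inputpath}/**/*.plantuml"]
end

# Try it on a throwaway folder
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p("#{dir}/sub")
  FileUtils.touch(["#{dir}/a.md", "#{dir}/sub/b.markdown", "#{dir}/sub/c.txt"])
  puts resolve_input_files(dir).map { |f| File.basename(f) }.sort.inspect
end
```

Files with other extensions, such as c.txt in the example, are ignored when a folder is given, but a single file is accepted regardless of its extension.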

.init_folders(root) ⇒ Nil

initialize a project directory. It creates a set of folders, a root document, a manifest and an initial rakefile

Parameters:

  • root (String)
    The path to the root folder of the sample project

Returns:

  • (Nil)

    No return



# File 'lib/wortsammler.rb', line 293

def self.init_folders(root)

  folders=["ZSUPP_Manifests",
           "ZGEN_Documents",
           "ZSUPP_Tools",
           "ZSUPP_Styles",
           "ZGEN_RequirementsTracing",
           "001_Main",
           "900_snippets"
  ]

  folders.each { |folder|
    FileUtils.mkdir_p("#{root}/#{folder}")
  }

  resourcedir=File.dirname(__FILE__)+"/../resources"
  Dir["#{resourcedir}/*.yaml"].each { |f|
    FileUtils.cp(f, "#{root}/ZSUPP_Manifests")
  }
  FileUtils.cp("#{resourcedir}/main.md", "#{root}/001_Main")
  FileUtils.cp("#{resourcedir}/rakefile.rb", "#{root}/ZSUPP_Tools")
  FileUtils.cp("#{resourcedir}/default.wortsammler.latex", "#{root}/ZSUPP_Styles")
  FileUtils.cp("#{resourcedir}/logo.jpg", "#{root}/ZSUPP_Styles")
  FileUtils.cp("#{resourcedir}/snippets.xlsx", "#{root}/900_snippets")

  nil
end
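
To see the resulting folder layout without installing the gem's resource files, the folder creation step can be tried on its own. This is a minimal sketch that uses the folder names from the listing above. The helper create_project_skeleton is hypothetical and skips the copying of manifests, styles and snippets.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of Wortsammler): creates only the empty
# folder skeleton; copying the resource files is left out.
def create_project_skeleton(root)
  folders = %w[ZSUPP_Manifests ZGEN_Documents ZSUPP_Tools ZSUPP_Styles
               ZGEN_RequirementsTracing 001_Main 900_snippets]
  folders.each { |folder| FileUtils.mkdir_p(File.join(root, folder)) }
  folders
end

# Create the skeleton in a throwaway location and list it
Dir.mktmpdir do |tmp|
  root = File.join(tmp, "sample_project")
  create_project_skeleton(root)
  puts Dir.children(root).sort
end
```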

.plantuml(paths) ⇒ Nil

run PlantUML on a list of documents

Parameters:

  • paths (Array of String)

    Array of filenames which shall be converted.

Returns:

  • (Nil)

    no return



# File 'lib/wortsammler.rb', line 228

def self.plantuml(paths)

  plantumljar=File.dirname(__FILE__)+"/../resources/plantuml.jar"

  paths.each { |f|
    cmd          = "java -jar \"#{plantumljar}\" -v \"#{f}\" 2>&1"
    r            =`#{cmd}`
    no_of_images = r.split($/).grep(/Number of image/).first.split(":")[1]

    $log.info("#{no_of_images} uml diagram(s) in #{File.basename(f)}")
    $log.info(r) unless $?.success?
  }
  nil
end
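
The listing above assumes that the PlantUML output always contains a line matching "Number of image". If it does not, first returns nil and the subsequent split raises. The sketch below shows a more tolerant extraction. The helper number_of_images is hypothetical, and the sample line is an assumed shape inferred from the pattern in the listing, not captured from a real PlantUML run.

```ruby
# Hypothetical helper (not part of Wortsammler): extracts the image count
# from PlantUML's verbose output without raising when the line is missing.
def number_of_images(plantuml_output)
  line = plantuml_output.split("\n").grep(/Number of image/).first
  return nil if line.nil?
  # take the text after the first colon and trim surrounding whitespace
  line.split(":")[1].to_s.strip
end

# Assumed shape of the relevant output line, not captured from a real run
sample = "some log line\nNumber of image(s): 2\n"
puts "#{number_of_images(sample) || 'no'} uml diagram(s)"
```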

.pptx_to_cropped_pdf(infile, outfile) ⇒ Array of String

convert a PowerPoint presentation to a cropped pdf file. It generates a single pdf file.

Parameters:

  • infile (String)

    name of the Powerpoint file

  • outfile (String)

    basis of the generated pdf file.

Returns:

  • (Array of String)

    the list of generated files. In fact it is only one file, but for the sake of harmonization with xlsx_to_pdf it is returned as an array.



# File 'lib/wortsammler/pdf_utilities.rb', line 97

def self.pptx_to_cropped_pdf(infile, outfile)
  outfiles=self.pptx_to_pdf(infile, outfile)
  outfiles.each{|f|
    self.crop_pdf(f)
  }
  outfiles
end

.pptx_to_pdf(infile, outfile) ⇒ Array of String

convert a PowerPoint presentation to a non-cropped pdf file. It generates a single pdf file.

Parameters:

  • infile (String)

    name of the Powerpoint file

  • outfile (String)

    basis of the generated pdf file.

Returns:

  • (Array of String)

    the list of generated files. In fact it is only one file, but for the sake of harmonization with xlsx_to_pdf it is returned as an array.



# File 'lib/wortsammler/pdf_utilities.rb', line 116

def self.pptx_to_pdf(infile, outfile)

  tmpdir=Dir.mktmpdir
  tmpout="#{tmpdir}/#{File.basename(outfile)}"
  tmpin="#{tmpdir}/#{File.basename(infile)}"
  outdir =File.dirname(File.absolute_path(outfile))

  FileUtils.cp(infile, tmpin)

  cmd=[]
  cmd << "osascript <<-EOF"
  cmd << "tell application \"Microsoft PowerPoint\""
  #cmd << "activate"
  cmd << ""
  cmd << "open \"#{tmpin}\""
  cmd << "open \"#{tmpin}\"" # todo: I open it twice making sure that it is really open
  cmd << "set theActivePPT to the active presentation"
  cmd << "save theActivePPT in \"#{tmpout}\" as save as PDF"
  cmd << "close theActivePPT"
  cmd << "quit"
  cmd << "end tell"
  cmd << "EOF"
  cmd = cmd.join("\n")
  system(cmd)
  FileUtils.cp("#{tmpout}", outfile)
  [outfile]
end
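
Both Office conversions build a shell command that feeds AppleScript lines to osascript through a here-document. The sketch below isolates that step so that the closing EOF marker cannot be forgotten. The helper osascript_command and the file path are hypothetical, and the command is only printed, since running it would require Mac OS X and PowerPoint.

```ruby
# Hypothetical helper (not part of Wortsammler): wraps AppleScript lines
# into a shell command that feeds them to osascript via a here-document.
def osascript_command(script_lines)
  (["osascript <<-EOF"] + script_lines + ["EOF"]).join("\n")
end

cmd = osascript_command([
  'tell application "Microsoft PowerPoint"',
  'open "/tmp/example.pptx"', # made-up path
  'end tell'
])

# Inspect only; system(cmd) would actually run the script
puts cmd
```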

.process(config) ⇒ Nil

process the documents according to the manifest

Parameters:

  • config (ProoConfig)

    A configuration object representing the manifest.

Returns:

  • (Nil)

    no return



# File 'lib/wortsammler.rb', line 250

def self.process(config)
  cleaner = PandocBeautifier.new($log)
  cleaner.config = config

  cleaner.generateDocument(config.input,
                           config.outdir,
                           config.outname,
                           config.format,
                           config.vars,
                           config.editions,
                           config.snippets,
                           config.frontmatter,
                           config)

  nil

end

.render_single_document(filename, outputfolder, outputformats, config = nil) ⇒ Nil

render a single document

Parameters:

  • filename (String)

    The filename of the document file which shall be rendered

  • outputfolder (String)

    The path to the outputfolder where the output files shall be placed.

  • outputformats (Array of String)

    The list of formats which shall be produced

  • config (ProoConfig) (defaults to: nil)

Returns:

  • (Nil)

    no return



# File 'lib/wortsammler.rb', line 279

def self.render_single_document(filename, outputfolder, outputformats, config=nil)
  cleaner = PandocBeautifier.new($log)
  cleaner.config = config if config
  cleaner.render_single_document(filename, outputfolder, outputformats)
  nil
end

.xlsx_to_cropped_pdf(infile, outfile) ⇒ Array of String

convert an Excel workbook to cropped pdf files. It generates one file per sheet.

Parameters:

  • infile (String)

    name of the Excel file

  • outfile (String)

    basis of the generated pdf file. The name of the sheet is mangled into the file name according to the pattern <body_of_outfile> <name of the sheet>.<extension of outfile>. Note the space inserted by Excel; spaces are replaced by underscores when the files are copied to the output folder.

Returns:

  • (Array of String)

    the list of generated files.



# File 'lib/wortsammler/pdf_utilities.rb', line 28

def self.xlsx_to_cropped_pdf(infile, outfile)
  outfiles=self.xlsx_to_pdf(infile, outfile)
  outfiles.each{|f|
    self.crop_pdf(f)
  }

  outfiles
end

.xlsx_to_pdf(infile, outfile) ⇒ Array of String

convert an Excel workbook to *non cropped* pdf files. It generates one file per sheet.

Parameters:

  • infile (String)

    name of the Excel file

  • outfile (String)

    basis of the generated pdf file. The name of the sheet is mangled into the file name according to the pattern <body_of_outfile> <name of the sheet>.<extension of outfile>. Note the space inserted by Excel; spaces are replaced by underscores when the files are copied to the output folder.

Returns:

  • (Array of String)

    the list of generated files.



# File 'lib/wortsammler/pdf_utilities.rb', line 50

def self.xlsx_to_pdf(infile, outfile)

  tmpdir=Dir.mktmpdir
  outext=File.extname(outfile)
  tmpbase=File.basename(outfile, outext)
  tmpfile="#{tmpdir}/#{File.basename(infile)}"
  outdir =File.dirname(File.absolute_path(outfile))

  FileUtils.cp(infile, tmpfile)

  cmd=[]
  cmd << "osascript <<-EOF"
  cmd << "set theTmpBase to (POSIX file \"#{tmpdir}/#{tmpbase}.pdf\") as string"
  cmd << "tell application \"Microsoft Excel\""
  #cmd << "activate"
  cmd << ""
  cmd << "open \"#{tmpfile}\""
  cmd << ""
  cmd << "save active workbook in theTmpBase as PDF file format"
  cmd << "delay 1"
  cmd << "close active workbook saving no"
  cmd << "quit"
  cmd << "end tell"
  cmd << "EOF"
  cmd = cmd.join("\n")
  system(cmd)

  Dir["#{tmpdir}/#{tmpbase}*.pdf"].each do |f|
    outfilename=File.basename(f).gsub(" ", "_")
    FileUtils.cp(f, "#{outdir}/#{outfilename}")
  end

  Dir["#{outdir}/#{tmpbase}*#{outext}"]
end
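
The name of each generated file follows from the pattern described for the outfile parameter and from the gsub call in the listing above. The sketch below predicts the file name for one sheet. The helper sheet_pdf_name is hypothetical, and it assumes that Excel writes the sheet pdfs as "<base> <sheet>.pdf", as the parameter description states.

```ruby
# Hypothetical helper (not part of Wortsammler): predicts the name of the
# pdf copied to the output folder for one sheet of the workbook.
def sheet_pdf_name(outfile, sheet_name)
  outext = File.extname(outfile)
  base   = File.basename(outfile, outext)
  # Excel is assumed to write "<base> <sheet>.pdf"; every space is
  # replaced by "_" when the file is copied to the output folder
  "#{base} #{sheet_name}.pdf".gsub(" ", "_")
end

puts sheet_pdf_name("out/report.pdf", "Sheet 1")
```

Because every space is replaced, a space inside the sheet name is converted to an underscore as well.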