Kitchen

Kitchen lets you modify the structure and content of XML files. You create a Recipe with instructions and bake it in the Oven.

Full documentation at rubydoc.info.

Installation

Add this line to your application's Gemfile:

gem 'openstax_kitchen'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install openstax_kitchen

Two Ways to Use Kitchen

There are two ways to use Kitchen: the "generic" way and the "book" way. The generic way provides mechanisms for traversing and modifying an XML document. The book way extends the generic way by adding mechanisms that are specific to the book content XML produced at OpenStax (e.g. the book way knows about chapters and pages, figures and terms, etc, whereas the generic way does not have this knowledge).

We'll first talk about the generic way since those tools are also available in the book way.

Generic Usage

Kitchen lets you modify the structure and content of XML files. You create a Recipe and bake it in the Oven:

require "openstax_kitchen"

recipe = Kitchen::Recipe.new do |document|
  document.search("div.section").each do |element|
    element.name = "section"
    element.remove_class("section")
  end
end

Kitchen::Oven.bake(
  input_file: "some_file.xhtml",
  recipes: recipe,
  output_file: "some_other_file.xhtml"
)

The above example changes all <div class="section"> tags to <section>.

The document above is a Kitchen::Document and the element is a Kitchen::Element. Both have methods for reading and manipulating the XML. You can of course name the block argument whatever you want (see examples below).

The `search` method and enumerators

search takes one or more CSS and XPath selectors and returns an enumerator that iterates over the matching elements inside the document or element that search is called on.

The enumerator that is returned is an ElementEnumerator which is a subclass of Ruby's Enumerator. Enumerators are also Enumerable which gives you a bunch of methods you can call on enumerators like:

count - get the number of matching elements
map - form a new array using the matching elements
each - do something with each matching element
first - return the first matching element
etc etc
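
Because ElementEnumerator is a subclass of Enumerator, these methods should behave as they do on any Ruby enumerator. Here is a minimal sketch in plain Ruby, with no Kitchen classes involved; the strings are made-up placeholders standing in for matched elements:

```ruby
# A plain Ruby Enumerator standing in for the result of `search`.
# The strings are placeholders for matched elements.
matches = %w[div.example#a div.example#b div.example#c].each

match_count = matches.count                    # number of matches
ids = matches.map { |m| m.split("#").last }    # new array built from the matches
first_match = matches.first                    # only the first match

seen = []
matches.each { |m| seen << m.upcase }          # do something with each match
```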

Here's an example calling search on a document and then calling each on its result:

doc.search("div.example").each do |div| # find all "div.example" elements in the document
  div.add_class("foo")                  # add a class to each of those elements
  div.search("p").each do |p|           # find all "p" elements inside the "div.example" elements
    p.name = "div"                      # change them to "div" tags
  end
end

Clipboards, cut, copy, and paste

When baking our content, we often want to move content around or make copies of content to reuse elsewhere in the document. Kitchen provides clipboard functionality to help with this.

Every document holds a set of named clipboards. You can cut and copy to these named clipboards:

doc.search("div.example").each do |div|
  div.cut(to: :my_special_clipboard)
end

doc.first("p").copy to: :foo

And then, in code where you are building up a string of HTML to insert, you can paste the clipboard's contents:

new_html = doc.clipboard(name: :my_special_clipboard).paste

cut puts the element on the clipboard and removes the original from the document. copy leaves the element in the document and puts a copy of the element on the clipboard.

Instead of using named clipboards, you can also pass any Clipboard object to these methods:

my_clipboard = Kitchen::Clipboard.new
doc.search("div.example").each do |div|
  div.cut(to: my_clipboard)
end

new_html = my_clipboard.paste

This is often the better way to go: if you use the document's named clipboards, you have to remember to clear them before reusing them, or you will be stuck with whatever you left there the last time.

ElementEnumerator also provides extra clipboard-related methods to make your life easier. Instead of writing

doc.search("div.example").each do |div|
  div.cut(to: :my_special_clipboard)
end

You can say

doc.search("div.example").cut(to: :my_special_clipboard)

The same applies to copy and these methods also work with passed-in Clipboard objects. If you don't pass in a clipboard name or a Clipboard object, these methods return a new Clipboard containing the cut or copied content:

a_new_clipboard = doc.search("div.example").cut

Clipboards are also Enumerable so you can call the enumerable methods (count, each, etc) on them.

When elements that were copied are pasted (or when elements that were cut are pasted more than once), Kitchen will update the IDs of pasted elements to keep them unique. Kitchen adds _copy_1, _copy_2, etc to IDs to make this happen. The _copy_ prefix is configurable (or at least close to it).
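
To make the renaming concrete, here is a toy sketch in plain Ruby of how unique IDs could be derived for repeated pastes of a copied element. ToyIdTracker and its method names are hypothetical, and Kitchen's real bookkeeping (especially for cut elements) may differ:

```ruby
# Hypothetical helper, not Kitchen's API: counts how many times each
# original ID has been pasted and derives a unique ID for each paste.
class ToyIdTracker
  def initialize(copy_infix: "_copy_")
    @copy_infix = copy_infix
    @paste_counts = Hash.new(0)
  end

  # Returns a new unique ID for the next pasted copy of `original_id`.
  def next_copy_id(original_id)
    @paste_counts[original_id] += 1
    "#{original_id}#{@copy_infix}#{@paste_counts[original_id]}"
  end
end

tracker = ToyIdTracker.new
first_paste  = tracker.next_copy_id("fig-1")
second_paste = tracker.next_copy_id("fig-1")
other_paste  = tracker.next_copy_id("fig-2")
```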

If you want to remove an element (or all elements matched by an enumerator) but NOT put those elements on a clipboard, you can use the trash method:

some_div.trash
doc.search(".not_needed").trash

Pantries

A document also gives you access to named pantries. A pantry is a place to store items that you can label for later retrieval by that label.

doc.pantry.store "some text", label: "some label"
doc.pantry.get("some label") # => "some text"

The above uses the :default pantry. You can also use named pantries:

doc.pantry(name: :figure_titles).store "Moon landing", label: "id42"

Counters

Oftentimes we need to count things in a document, for example to number chapters and pages. A document provides named counters:

doc.counter(:chapter).increment
doc.counter(:chapter).get
doc.counter(:chapter).reset
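
As a mental model, named counters can be pictured as a hash of small counter objects created on first use. The sketch below is plain Ruby; ToyCounter and ToyDocument are hypothetical stand-ins, and the return values of Kitchen's real methods are not specified here:

```ruby
# Toy model only: named counters in plain Ruby.
class ToyCounter
  def initialize
    @value = 0
  end

  def increment
    @value += 1
  end

  def get
    @value
  end

  def reset
    @value = 0
  end
end

class ToyDocument
  def initialize
    # Each name lazily gets its own independent counter.
    @counters = Hash.new { |hash, name| hash[name] = ToyCounter.new }
  end

  def counter(name)
    @counters[name]
  end
end

toy_doc = ToyDocument.new
2.times { toy_doc.counter(:chapter).increment }
chapters_so_far = toy_doc.counter(:chapter).get
pages_so_far = toy_doc.counter(:page).get
toy_doc.counter(:chapter).reset
chapters_after_reset = toy_doc.counter(:chapter).get
```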

See book-oriented usage for a better way of counting elements.

Adding content

In Kitchen, we can prepend or append element children or siblings:

# <div><span>Hi</span></div> => <div><span><br/>Hi</span></div>
doc.search("span").first.prepend(child: "<br/>")

# <div><span>Hi</span></div> => <div><div></div><span>Hi</span></div>
doc.search("span").first.prepend(sibling: "<div>")

# <div><span>Hi</span></div> => <div><span>Hi<br/></span></div>
doc.search("span").first.append(child: "<br/>")

# <div><span>Hi</span></div> => <div><span>Hi</span><p/></div>
doc.search("span").first.append(sibling: "<p/>")

We can also replace all children:

# <div><span>Hi</span></div> => <div><span><p>Howdy</p></span></div>
doc.search("span").first.replace_children with: "<p>Howdy</p>"

And we can wrap an element with another element:

# <div><span>Hi</span></div> => <div><span class="other"><span>Hi</span></span></div>
doc.search("span").first.wrap("<span class='other'>")

or wrap an element's children:

# <div><span>Hi</span></div> => <div><span><span class="other" data-type="foo">Hi</span></span></div>
doc.search("span").first.wrap_children('span', class: 'other', data_type: 'foo')

Checking for elements

You can see if an element contains an element matching a selector:

my_element.contains?(".title") #=> true or false

Miscellaneous

ElementEnumerator also provides a first! method that is like the standard first except it raises an error if there is no matching first element to return.

Using `raw` to get at underlying Nokogiri objects

Kitchen uses the Nokogiri gem to parse and manipulate XML documents. A Document object wraps a Nokogiri::XML::Document object, and an Element object wraps a Nokogiri::XML::Node object. If you want to do something wild and crazy you can access these underlying objects using the raw method on Document and Element. Note that many of the methods on the underlying objects are exposed on the Kitchen object, e.g. instead of saying my_element.raw['data-type'] you can say my_element['data-type'].

Book-Oriented Usage

All of the above works, but it is generic and we have a specific problem: handling books that use a specific schema. To that end, Kitchen also includes a BookDocument to use in place of Document as well as elements and enumerators specific to this schema, e.g. BookElement, ChapterElement, PageElement, TableElement, FigureElement, NoteElement, ExampleElement. BookDocument has a method called book that returns a BookElement that wraps the top-level html element. All of these elements have methods for finding other elements of these specific types, so that instead of

doc.book.search("[data-type='page']")

we can say

doc.book.pages

In the generic usage, you can chain search methods:

doc.book.search("[data-type='page']").search("figure")

will find all figure elements inside pages inside the book.

In the book-oriented usage, you can chain specific search methods to achieve the same effect:

doc.book.pages.figures

This chaining of enumerators gives other benefits. The above search for figures will yield figures that know the page they were found in as well as their numerical position within that page. So you could do something like this:

doc.book.chapters.pages.figures.each do |figure|
  figure.prepend(child:
    "<span class='os-number'>Figure #{figure.count_in(:chapter)}.#{figure.count_in(:page)}</span>" \
    "<span class='os-title'>A figure in chapter #{figure.ancestor(:chapter).title}</span>"
  )
end

This finds all figures that are in pages that are in chapters in the book. The count_in methods on the figure give the number position of the figure element within the chapter or page so we can form a figure number like "2.13". And as seen here, chapter elements (instances of ChapterElement) have a title method that returns the title text for the chapter. Figures have a caption element, etc.

The CSS for these specific search methods is hidden away so you don't have to deal with it. But if you want to customize that CSS you can. You can pass an overriding CSS selector to these methods, and if you use the $ character in that argument the search method will replace it with the normal CSS selector, e.g. if you wanted to get rid of all of the table elements that have the "unnumbered" class you could say:

doc.book.tables("$.unnumbered").cut
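
The `$` substitution can be pictured as a simple string replacement. This is a sketch in plain Ruby; the default selector shown is a made-up placeholder, and whether Kitchen replaces every `$` or only the first is an assumption here:

```ruby
# Hypothetical default selector; Kitchen's real default for tables may differ.
DEFAULT_TABLE_SELECTOR = "table"

# Replace each "$" in the override with the default selector.
# The block form avoids backslash processing in the replacement string.
def expand_selector(override, default_selector)
  override.gsub("$") { default_selector }
end

expanded = expand_selector("$.unnumbered", DEFAULT_TABLE_SELECTOR)
```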

Sometimes it is difficult to set up a search using CSS. In such cases, you can also pass only and except arguments to search methods, e.g.:

doc.book.figures(except: :subfigure?)

only and except can be the names of methods (that return truthy/falsy values) on the element being iterated over, as shown above, or they can be lambdas or procs as shown here:

doc.book.figures(only: ->(fig) { fig.children.count == 2 })

Obviously this is a somewhat contrived example, but the idea is that by passing a callable you can do complex searches.
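
One way to picture how only and except accept either a method name or a callable is the plain Ruby sketch below. ToyFigure, matches?, and filter_elements are hypothetical names for illustration and are not part of Kitchen:

```ruby
# Toy element with a predicate method, standing in for a figure element.
ToyFigure = Struct.new(:id, :child_count, :subfigure) do
  def subfigure?
    subfigure
  end
end

# A condition is either a method name (Symbol) or something callable.
def matches?(element, condition)
  condition.is_a?(Symbol) ? element.public_send(condition) : condition.call(element)
end

# Keep elements that satisfy `only` (if given) and do not satisfy `except` (if given).
def filter_elements(elements, only: nil, except: nil)
  elements.select do |element|
    (only.nil? || matches?(element, only)) &&
      (except.nil? || !matches?(element, except))
  end
end

figures = [
  ToyFigure.new("f1", 2, false),
  ToyFigure.new("f2", 1, true),
  ToyFigure.new("f3", 2, true)
]

without_subfigures = filter_elements(figures, except: :subfigure?).map(&:id)
two_children = filter_elements(figures, only: ->(fig) { fig.child_count == 2 }).map(&:id)
```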

Overriding Default Book-Oriented Selectors

Book-oriented methods like book.pages.figures hide from us the CSS or XPath selectors that let us find child elements like .pages. But sometimes, the default selector we have isn't what is used in a certain book. In these cases, we can override the selector once in the recipe and still continue to use the book-oriented usage. For example, a page summary is normally found using the CSS section.summary. But some books use a .section-summary class. For these books, we can override the selectors in their recipes:

recipe = Kitchen::BookRecipe.new do |doc|
  doc.selectors.override(
    page_summary: ".section-summary"
  )

  # ... rest of the recipe ...
end

Directions

All of the above talks about how to search through the XML file and perform basic operations on that file. Our recipes will be combinations of all of the above: search for elements; cut, copy and paste them; count them; rework them; etc.

One recipe for processing a book probably does 10-30 different kinds of operations: format and number tables, same for figures and examples, number and organize exercises and their solutions in different parts of the book, build chapter glossaries, build an index, build a table of contents, etc, etc.

We're not going to want to write out all of those steps in every recipe. Instead it'd be a better idea to write out each step in its own little piece of code. With the steps isolated from each other we'll be dealing with less code all at once and it'll be much easier to write tests to exercise that code.

In Kitchen, we've started the process of writing out these steps and we've put them in a directions folder (which is also a Directions module). E.g. Kitchen::Directions::BakeChapterSummary modifies a provided chapter to have a chapter summary at the end.

It is probably true that the BakeChapterSummary code will work for some number of books, but other books might have different requirements. As such we can expect that there will be different variants of the chapter summary baking step. To anticipate this, our first implementation of this step lives in a method named v1 (so to run it you call BakeChapterSummary.v1(chapter: some_chapter)). Later if there's a tweak needed that can't fit into v1's approach, we can make a v2 method that could live in its own file. This may or may not be the right approach to handle this kind of code variation, but it is at least a place to start.
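
The versioned-method convention might look roughly like the sketch below. It is plain Ruby with made-up names (DirectionsToyChapter, BakeToySummary) and does not show what BakeChapterSummary actually does:

```ruby
# Hypothetical direction showing the v1/v2 convention in plain Ruby.
# DirectionsToyChapter stands in for a chapter element; it is not a Kitchen class.
DirectionsToyChapter = Struct.new(:title, :parts)

module BakeToySummary
  # First variant of the step.
  def self.v1(chapter:)
    chapter.parts << "Summary of #{chapter.title}"
  end

  # A later variant for books whose needs do not fit v1's approach.
  def self.v2(chapter:, label: "Key Points")
    chapter.parts << "#{label}: #{chapter.title}"
  end
end

chapter_a = DirectionsToyChapter.new("Atoms", ["Introduction"])
chapter_b = DirectionsToyChapter.new("Bonds", ["Introduction"])
BakeToySummary.v1(chapter: chapter_a)
BakeToySummary.v2(chapter: chapter_b)
```

Each recipe then picks the variant its book needs, and existing recipes that call v1 are unaffected when v2 is added.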

Internationalization (I18n)

Recognizing that our books will be translated into multiple languages, Kitchen has support for internationalization (I18n). There's a spot for translation files in the locales directory, in which there is currently one en.yml translation file for English. Within our directions code you'll see uses of it like here to title an Example:

<span class="os-title-label">#{I18n.t(:example)} </span>

Building HTML strings

There are a number of valid ways of building up HTML strings to insert into documents.

Maybe you have a tiny bit of HTML to add and you can use vanilla Ruby strings:

some_element.append(child: "<br/>")

You can continue doing this with multiline strings but it gets to be a pain -- you have to add your own newlines (\n) and line continuation symbols (\):

some_element.append(child: "first line\n" \
                           "second line")

Ruby has a much better way of handling multiline strings: heredocs. The best of these is the "squiggly" heredoc, which captures string content between <<~SOME_ARBITRARY_TEXT and SOME_ARBITRARY_TEXT:

some_element.append(sibling: <<~HTML
    <div class="os-caption-container">
      <span class="os-caption">Awesomeness</span>
    </div>
  HTML
)

The squiggly heredoc strips the indentation of the least-indented line from every line. It lets you use single and double quotes inside the string without escaping them. And at least in certain editors, when you use HTML as your "some arbitrary text", you'll get HTML syntax highlighting. You can also interpolate variables into the string using #{some_variable}. The above example is equivalent to this Ruby string written in the earlier approach:

"<div class=\"os-caption-container\">\n" \
"  <span class=\"os-caption\">Awesomeness</span>\n" \
"</div>\n"

The big downside to all of these approaches is that for more complicated strings, we often need to use some Ruby logic to build up different parts of the string, and the techniques above don't allow for that.
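
Before moving on to those more complicated cases, the equivalence between the squiggly heredoc and the concatenated string can be checked directly in plain Ruby. The caption variable is introduced here only to illustrate interpolation:

```ruby
squiggly = <<~HTML
  <div class="os-caption-container">
    <span class="os-caption">Awesomeness</span>
  </div>
HTML

concatenated = "<div class=\"os-caption-container\">\n" \
               "  <span class=\"os-caption\">Awesomeness</span>\n" \
               "</div>\n"

same_string = (squiggly == concatenated)

# Interpolation works inside the heredoc as well.
caption = "Awesomeness"
interpolated = <<~HTML
  <span class="os-caption">#{caption}</span>
HTML
```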

Let's invent an example of needing to build some HTML that had a listing of all chapter titles in a bulleted list and then their page titles within them in a nested bulleted list (kind of like a table of contents). This is an example of what we'd be shooting for:

<ul>
  <li>
    <span>Chapter 1 Title</span>
    <ul>
      <li><span>Page 1.1 Title</span></li>
      <li><span>Page 1.2 Title</span></li>
    </ul>
  </li>
  <li>
    ... etc etc
  </li>
</ul>

Here's one way we could build up this string using squiggly heredocs:

class SomethingThatBakes
  def bake(doc)
    @book = doc.book

    chapter_bullets_array = @book.chapters.map do |chapter|
      # Build one <li> string per page in this chapter
      page_bullets_array = chapter.pages.map do |page|
        <<~HTML
          <li><span>#{page.title.text}</span></li>
        HTML
      end

      <<~HTML
        <li>
          <span>#{chapter.title.text}</span>
          <ul>
            #{page_bullets_array.join("\n")}
          </ul>
        </li>
      HTML
    end

    # Join the per-chapter strings once, when embedding them in the outer list
    final_string = <<~HTML
      <ul>#{chapter_bullets_array.join("\n")}</ul>
    HTML

    # do something with that final_string
  end
end

The above works but it is a little fragmented to read. We have to build up parts of the bulleted lists in arrays, then join them together with newlines and embed them in other strings (some of which are also collected in an array and then later substituted and joined).

For these more complex strings we have another option: ERB (Embedded RuBy). ERB is part of standard Ruby and had its heyday when Rails came out in the 2000s. ERB lets us make a separate HTML file with Ruby sprinkled within it. Let's call this file blah.html.erb:

<ul>
  <% @book.chapters.each do |chapter| %>
    <li>
      <span><%= chapter.title.text %></span>
      <ul>
        <% chapter.pages.each do |page| %>
          <li><span><%= page.title.text %></span></li>
        <% end %>
      </ul>
    </li>
  <% end %>
</ul>

In our Ruby class doing the generation, we add a renderable statement at the top of the class, and then we can call render:

class SomethingThatBakes
  renderable

  def bake(doc)
    @book = doc.book # instance variables are visible inside the template

    final_string = render(file: 'blah.html.erb')

    # do something with that final_string
  end
end

This ERB approach is a lot easier to read -- you can see the nesting structure directly in the template file. The Ruby code in the ERB template will have access to any instance variable in the code that called it, i.e. the variables that start with @.

The render method takes a file argument that is a string file path. If the path is relative, it is relative to the directory in which the render call is made.

If you want to make relative file paths be relative to a different directory, you can pass a directory string to the renderable statement: renderable dir: '/Some/other/directory'.
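
To see how a template reaches instance variables, here is a sketch using only Ruby's standard ERB library rather than Kitchen's renderable and render helpers. TocPage, TocChapter, TocBook, and ToyTocBuilder are hypothetical stand-ins, and the template is kept inline so the example is self-contained:

```ruby
require "erb"

# Toy data standing in for book, chapter, and page elements.
TocPage = Struct.new(:title)
TocChapter = Struct.new(:title, :pages)
TocBook = Struct.new(:chapters)

class ToyTocBuilder
  # Inline template; "-%>" drops the newline after pure-logic lines.
  TEMPLATE = <<~HTML_ERB
    <ul>
    <% @book.chapters.each do |chapter| -%>
      <li>
        <span><%= chapter.title %></span>
        <ul>
    <% chapter.pages.each do |page| -%>
          <li><span><%= page.title %></span></li>
    <% end -%>
        </ul>
      </li>
    <% end -%>
    </ul>
  HTML_ERB

  def initialize(book)
    @book = book
  end

  def build
    # Passing `binding` is what lets the template see @book.
    ERB.new(TEMPLATE, trim_mode: "-").result(binding)
  end
end

toy_book = TocBook.new(
  [TocChapter.new("Atoms", [TocPage.new("Protons"), TocPage.new("Electrons")])]
)
toc_html = ToyTocBuilder.new(toy_book).build
```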

Again, all these techniques work and there are times to use them all.

One-file scripts

Want to make a one-file script to do some baking? Use the "inline" form of bundler:

#!/usr/bin/env ruby

require "bundler/inline"

gemfile do
  gem 'openstax_kitchen', '2.0.0'
end

require "openstax_kitchen"

recipe = Kitchen::Recipe.new do |doc|
  # ... recipe steps here
end

Kitchen::Oven.bake(
  input_file: "some_file.xhtml",
  recipes: recipe,
  output_file: "some_other_file.xhtml"
)

Incidentally, the bake method returns timing information; if you puts its result, you'll see it.

Recipe (and Gem) Development

Docker

You can use Docker for your development environment. To build the image:

$> ./docker/build

To drop into the running container:

$> ./docker/bash

To run specs (or something else) from the host:

$> ./docker/run rspec

Non-Docker

After checking out the repo, run bin/setup to install dependencies. If you want to install this gem onto your local machine, run bundle exec rake install.

Console

You can also run bin/console for an interactive prompt that will allow you to experiment.

Tutorials

There are some tutorials you can work through in the tutorials directory. Each tutorial is in a separate numbered subdirectory, e.g. tutorials/01. Each tutorial directory contains a raw.html file that is your starting point (along with some instructions in comments at the top), an expected_baked.html file that is what you're trying to get to when your recipe is applied to the input file, as well as some number of solution files (don't look at those unless you get stuck!!). To get started, run:

$> ./setup_my_recipes

in the tutorials directory. That will make a blank my_recipe.rb file in each of the numbered tutorial subdirectories. This is where you'll do your work. The first "Hello world!" tutorial ("00") asks you to make a recipe that changes

<div class="hello">
  <span>Planet?</span>
</div>

to

<h1 class="hello">
  <span>World!</span>
</h1>

There's an included script to check to see if your recipe achieves the desired transformation:

$> ./check_it 00

Will check to see if your tutorials/00/my_recipe.rb file does what is needed. If it does, you'll see a "way to go" message. If it doesn't, you'll see a diff between the expected output and the actual output. E.g. if you run ./check_it 00 without having done any work yet, you'll see:

The actual output does not match the expected output.
- = actual output
+ = expected output

@@ -1,4 +1,4 @@
-<div class="hello">
-  <span>Planet?</span>
-</div>
+<h1 class="hello">
+  <span>World!</span>
+</h1>

The check_it script can also check the solutions. E.g. if you say

$> ./check_it 00 solution_1

you'll see

The actual output matches the expected output! Way to go!

There is normally more than one way to achieve the desired output, so feel free to diverge from what is shown in the solution files. Note that the my_recipe.rb files and all actual_baked.html files are ignored by Git.

Important: If things aren't working in your tutorial work (or actually in any recipe work), use the debugger! Just add a debugger line anywhere in your code to stop execution there so you can poke around. You can print variables by typing out their name, run methods on objects, say s to step into function calls, n ("next") to step over function calls, b 97 to set a new breakpoint at line 97, and c to continue to the next debugger statement, breakpoint, or the end of the script.

Error Messages

Kitchen tries to give helpful error messages instead of huge stack traces, e.g.:

The recipe has an error: undefined method `bleach' for main:Object
at or near the following highlighted line

-----+ ./my_work/test.rb -----
   17|
   18|   doc.chapters.each do |chapter|
   19|     chapter.bleach("div.exercise") do |elem|
   20|       elem.first("h3").trash
   21|       elem.cut to: :review_questions

Encountered on line 64 in the input document on element:
<div data-type="chapter">...</div>

If you'd still like the huge stack trace, you can set the VERBOSE environment variable to anything, e.g.

$> VERBOSE=1 ./my_work/test.rb

Releases

To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Documentation

Documentation is handled via YARD. The Solargraph gem can be used in popular editors for code completion.

Run yard server --reload to serve the documentation locally and pick up changes in your local codebase every time you refresh the page.

Navigate to http://localhost:8808/ to view documentation in your browser.

Use the inch gem to get feedback on where documentation is lacking: bundle exec inch (add --help for more options).

Specs

Run bundle exec rspec to run the specs. rake rspec probably does the same thing.

The specs offer three ways to compare expected XML output to actual output.

match_snapshot_auto generates a snapshot file and compares the test output to it using rspec-snapshot. To update the snapshots, run UPDATE_SNAPSHOTS=true rspec

match_normalized_html gets rid of extra blanks, sorts all tag attributes alphabetically by attribute name (so that attribute order doesn't impact a match), prints the HTML back out with a standard indent, and then does a normal string diff.

expect(book_1).to match_normalized_html("some string of HTML here")

match_html_nodes does a node-by-node diff using the nokogiri-diff gem. It gives more specific node diff data but is also not quite as clear.

expect(book_1).to match_html_nodes("some string of HTML here")

More on snapshots

Autogenerated snapshot files are created by composing the path with the name of the test. Be aware of collisions (i.e. better to use 'v1 works', 'v2 works', etc than just 'works' when dealing with multiple versions).

When the expected output is less than 3 lines or so, inline matching with match_normalized_html is preferred. Any long expected output block should get a snapshot.

Profiling

If you set the PROFILE environment variable to something before you run specs or a recipe, search query profile data will be collected and printed, e.g.

$> PROFILE=1 rspec

Caching

There's a low-level CSS query caching tool that saves repeated queries. In some tests, it saves 15% of query time. It is disabled by default (because we aren't super sure that it is completely safe) but can be turned on with

doc.config.enable_search_cache = true

VSCode

Visit vscode:extension/ms-vscode-remote.remote-containers in a browser
It'll open VSCode and bring you to an extension install screen, click "Install"
Click the remote button now in the bottom left hand corner.
Click "Remote-Containers: Open Folder in Container"
Select the cloned kitchen folder.

This (assuming you have Docker installed) will launch a docker container for Kitchen, install Ruby and needed libraries, and then let you edit the code running in that container through VSCode. Solargraph will work (code completion and inline documentation) as will Rubocop for linting.

Rubocop

Rubocop is good for helping us keep our code style standardized, but it isn't the end-all be-all of things. We can disable certain checks within a file, e.g.

# rubocop:disable Style/NumericPredicate

or we can disable or change global settings in the .rubocop.yml file.

Rubocop is set up to run within the VSCode dev container (see above).

Lefthook is included in the Docker build. When you push your code to GitHub, lefthook runs Rubocop on all the files you have changed. It won't let you push if you have Rubocop errors. You'll have to fix the errors or change the .rubocop.yml file to bypass them. You can also run lefthook directly with

$ /code> lefthook run pre-push

Misc References

Tutorials

Fix up tutorials and describe how to use them here

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/openstax/kitchen. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.

Tools

Helpful scripts in the bin directory:

normalize - Normalizes content files to make it easier to compare them. E.g. if you want to compare kitchen baked output to cnx-recipes baked output, you should normalize the files first. normalize somefile.xhtml produces somefile.normalized.xhtml which has its attributes sorted by attribute name, copied element ID numbers masked (because they change based on order of operations in recipes, but their values are not important), and some errors in legacy baked files removed (e.g. unnumbered tables get a summary attribute with a bogus number).

License

The gem is available as open source under the terms of the MIT License.

TODO

Specs galore :-)
Think up and handle a bunch more recipe errors, test they all raise some kind of RecipeError.
Encapsulate numbering schemes (e.g. chapter pages are "5.2", appendix pages are "D7") and maybe set on book document? Right now we are doing inline things like #{[*('A'..'Z')][page.count_in(:book)-1]}#{table.count_in(:page)} which is ugly.
Control I18n language in Oven.
README: element_children, .only, selectors, config files
Use ERB for more readable string building?

Quirks

When Kitchen writes out HTML containing unicode characters it uses the hexadecimal form, whereas current CE baking uses the decimal form. I haven't found an internal way to change how Kitchen's underlying library writes these characters, so if you need to do a new-to-old comparison, you can use a few lines of ruby to do a search and replace:

original_output = File.read("kitchen_output.xhtml")
modified_output = original_output.gsub(/&#x([0-9A-F]+);/){"&##{$1.hex};"}
File.open("kitchen_output.xhtml", "w") {|file| file.puts modified_output}

If this difference matters (if we need the decimal version), we can do more work to figure out a better implementation.
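
The same substitution can be tried on an in-memory string before touching any files. This is a small sketch; hex_refs_to_decimal is a helper name made up here, and it also accepts lowercase hex digits, which the snippet above does not:

```ruby
# Convert hexadecimal character references (&#x2019;) to decimal (&#8217;).
def hex_refs_to_decimal(html)
  html.gsub(/&#x([0-9A-Fa-f]+);/) { "&##{Regexp.last_match(1).hex};" }
end

sample = "<p>It&#x2019;s &#xa0;fine</p>"
converted = hex_refs_to_decimal(sample)
```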

Ideas

Use tmux for real-time evaluation of recipes to see output within one split terminal (source XML in one pane, recipe in middle, output on right).

Code of Conduct

Everyone interacting in the Kitchen project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

OpenStax Book Recipes

Uses the openstax_kitchen gem to bake OpenStax books.

Recipe File Structure

In this repo, we have (at least) one recipe for every book. There's a books/ directory in which we have subfolders, one for each book. In those subfolders, we put an executable Ruby file, normally named bake but it could be anything. (Later if you want to make a new recipe for the book you could name it bake_new or whatever in the same directory).

The executable bake Ruby scripts will typically have everything they need inside of them, e.g. they'll look like this:

#!/usr/bin/env ruby

require "bundler/inline"

gemfile do
  gem 'openstax_kitchen', '2.0.0'
  gem 'slop', '4.8.2'
  gem 'byebug'
end

recipe = Kitchen::BookRecipe.new(book_short_name: :chemistry) do |doc|
  # ... RECIPE CODE HERE ...
end

opts = Slop.parse do |slop|
  slop.string '--input', 'Assembled XHTML input file', required: true
  slop.string '--output', 'Baked XHTML output file', required: true
end

puts Kitchen::Oven.bake(
  input_file: opts[:input],
  recipes: recipe,
  output_file: opts[:output]
)

Normal big Ruby projects have a Gemfile that is processed by the bundler gem to install and use all the right versions of any gems your code depends on. Here we're using bundler's ability to include a gemfile directly in the source code. We're only using a few gems, so this is doable. It also makes the script nicely self-contained. Because inline gem declarations like this don't have a Gemfile.lock that locks down the versions of gems (as bigger Ruby projects do), we want to make sure we use specific versions of gems, e.g. '1.0.1' and not '~> 1.0.0'.
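
The difference between an exact version and a pessimistic constraint can be checked with RubyGems' own requirement classes. The version numbers below are arbitrary examples, not versions of any particular gem:

```ruby
require "rubygems"

exact = Gem::Requirement.new("1.0.1")
pessimistic = Gem::Requirement.new("~> 1.0.0")

pinned = Gem::Version.new("1.0.1")
newer_patch = Gem::Version.new("1.0.5")

# An exact requirement accepts only the pinned version.
exact_allows_pinned = exact.satisfied_by?(pinned)
exact_allows_newer = exact.satisfied_by?(newer_patch)

# "~> 1.0.0" also accepts later 1.0.x releases, so without a lockfile
# the script could silently pick up a different version.
pessimistic_allows_newer = pessimistic.satisfied_by?(newer_patch)
```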

Slop is a simple command line argument parser. Here it lets us call this bake script with --input some_file.xhtml and --output other_file.xhtml arguments that are passed into the Kitchen::Oven.bake call along with the recipe this script defines. The Kitchen::Oven.bake method returns some profiling numbers that we output to the screen using puts.

The main `bake` script

The top-level bake Bash script calls the right scripts in the books folder based on the book slug. E.g. if you call ./bake -b chemistry -i in.xhtml -o out.xhtml this script will call ./books/chemistry/bake --input in.xhtml --output out.xhtml. Every time we add a new recipe, we'll need to update this top-level bake script so it knows how to call it. This is also why the names of the scripts inside /books don't really matter, because the top-level bake script knows the names of the lower-level scripts.

The main pipeline (enki) calls bake in the git-bake (or archive-bake) step.

The `shorten` script

This script generates shortened content for a book. It calls the book-specific shorten script in books/book-name/shorten to generate a shortened version of the assembled file, then bakes this file with kitchen, and normalizes the result. Essentially, it bundles calls to three other scripts: the book-specific shorten, the book-specific bake, and normalize. The output files are written to the data/book-name/short/ directory.

Call this script with ./shorten -b <bookname> -i <inputfile>. Add USE_LOCAL_KITCHEN=1 at the beginning to bake with the local version of kitchen.

It is assumed that the given <bookname> will match the folder name for the book in the /books/ directory.

As with the main bake script, new books must be added to the case statement, e.g.:

case "${book}" in
  chemistry) dest="${DIR}/data/chemistry/short" && script="${DIR}/books/chemistry/shorten";;
    ... [other cases]
  {book-name}) dest="${DIR}/data/{book-name}/short" && script="${DIR}/books/{book-name}/shorten";;
  *) echo "Unknown book '${book}'"; exit 1;;
esac

Book-specific `shorten` scripts

Each book has its own directions to create a shortened version for development and testing purposes. This script is in books/book-name/shorten. It uses the kitchen framework and Oven.bake to remove parts of the book and generate output, but it does not yield a baked book.

Docker

Development and execution can be done using Docker.

To build the docker image:

$> ./docker/build

Note this builds the runtime environment, suitable for running in production and some development work. If you want a more full development environment, use VS Code using the remote containers extension. This will build the development environment with a nice terminal, VS Code Live Share for pairing, etc. To install the remote containers extension, visit vscode:extension/ms-vscode-remote.remote-containers in a browser.

To drop into the Docker container:

$> ./docker/bash

To use the Docker image to bake input XHTML files, run the following. The command can go on one line; it is split across lines here only so that each part can be described:

# --rm                     Remove container after the run
# -v $PWD:/files           Mount the current host directory as /files so we can put files
#                            in and get them out
# openstax/recipes:latest  The image ID (could use a specific tagged image instead of "latest")
# /code/bake ...           The call to the main `bake` script
$> docker run --rm \
              -v $PWD:/files \
              openstax/recipes:latest \
              /code/bake -b chemistry -i /files/input.xhtml -o /files/baked.xhtml

The above runs the baking in the latest (or some tagged) image. If you want to run using your latest recipe code on your local machine, you can mount that code in the container by adding another -v argument: -v /path/to/my/local/recipes:/code.

Want to run the recipes and do interactive debugging? Add the -it flags to the docker run call above.

Rubocop

Rubocop is available inside the VSCode dev container. Moreover, the lefthook gem enforces that Rubocop linting passes on modified files before pushes are allowed. To test this without pushing, run lefthook run pre-push.

Using your local git clone of kitchen within the recipes devcontainer

If you put the absolute path of your machine's cloned kitchen folder into a .devcontainer/kitchen_path text file, e.g.:

/Users/staxly/dev/openstax/kitchen

the devcontainer in VS Code will mount that folder to /code/kitchen. This will let you look at and edit the code from within your VS Code recipes workspace and will let you point recipes at this local code with

gem 'openstax_kitchen', path: '/code/kitchen'

so that you can develop in both recipes and kitchen at the same time. The kitchen folder has its own independent git state.

If you don't have a kitchen_path file, the devcontainer will mount a fake empty directory in /tmp.

If you use a line like the following in your recipe script to choose the kitchen gem version:

gem 'openstax_kitchen', ENV['USE_LOCAL_KITCHEN'] ? { path: '/code/kitchen' } : '2.0.0'

and then call the recipe script with the USE_LOCAL_KITCHEN variable set on the command line:

$ /code> USE_LOCAL_KITCHEN=1 ./books/chemistry2e/bake ...

then your recipe will use your local kitchen folder. You can commit the gem line as is: production runs do not set the USE_LOCAL_KITCHEN environment variable, so the version number at the end is used.
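
To make the switch concrete, here is a minimal sketch of how the ternary on the gem line resolves. The kitchen_gem_args helper is hypothetical and exists only for illustration; a real recipe passes these values straight to gem.

```ruby
# Sketch: how the ternary on the gem line resolves. `kitchen_gem_args` is a
# hypothetical helper, not part of kitchen or the recipes repository.
def kitchen_gem_args(env = ENV, version: "2.0.0")
  source = env["USE_LOCAL_KITCHEN"] ? { path: "/code/kitchen" } : version
  ["openstax_kitchen", source]
end

p kitchen_gem_args({ "USE_LOCAL_KITCHEN" => "1" }) # local checkout
p kitchen_gem_args({})                             # pinned version
```

Because environment values are strings and every string is truthy in Ruby, USE_LOCAL_KITCHEN=0 also selects the local path; leave the variable unset to use the pinned version.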

Starting a recipe

The create_new_recipe script offers a quickstart way to generate many of the initial files for recipe development, like the locale files and the boilerplate for the bake script. It also adds the relevant line to the main bake script. Call it with ruby scripts/create_new_recipe <recipe-name> <...>.

Devs will still need to edit/create:

shorten
main_spec
test data for specs

Creating a new recipe manually

New recipe files are created in lib/recipes/book-name.

To run the new recipe via the bake script, the recipe must be added to the case statement, e.g.:

case "${book}" in
  chemistry) $DIR/books/chemistry/bake --input $input_file --output $output_file;;
  ... [other cases]
  {book-name}) $DIR/books/{book-name}/bake --input $input_file --output $output_file;;
  *) echo "Unknown book '${book}'"; exit 1;;
esac

Note that the recipe file must be made executable by running chmod +x [path] or chmod 755 [path] before it can be called by the bake script.

Working on a recipe (converting from easybake)

When developing a recipe that already exists in easybake, the main goal is to produce output via the kitchen recipe that is identical to the easybake output. It's helpful to use VSCode's native differ or another diff tool to compare the output from the two methods of baking.

Normalize

The kitchen output and the easybake output may differ in a number of unimportant ways, such as whitespace and the ordering of attributes. The normalize script tidies the HTML and puts all attributes in alphabetical order. Run it on an HTML file with ruby scripts/normalize [path].
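
To show why attribute ordering matters when diffing, here is a toy sketch that sorts the attributes of a single tag alphabetically. It is not the normalize script: it is regex-based, handles only double-quoted attributes on one tag, and sort_attributes is a hypothetical name. A real normalizer would presumably work on a parsed document.

```ruby
# Toy sketch only: matches one tag whose attributes are all double-quoted.
TAG = /\A<([\w:-]+)((?:\s+[\w:-]+="[^"]*")*)\s*(\/?)>\z/

# Returns the tag with its attributes sorted by name; anything that does not
# look like a single tag is returned unchanged.
def sort_attributes(tag)
  match = TAG.match(tag)
  return tag unless match

  name, attrs, slash = match.captures
  sorted = attrs.scan(/[\w:-]+="[^"]*"/).sort_by { |attr| attr[/\A[\w:-]+/] }
  "<#{[name, *sorted].join(' ')}#{slash}>"
end

# Two tags that differ only in attribute order become identical strings.
puts sort_attributes('<div id="a" class="b">')
puts sort_attributes('<div class="b" id="a">')
```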

Book-specific locale files

A book may contain translations specific to itself (e.g., the note title 'Portrait of a Chemist' only appears in Chemistry). To support this, a book can have its own locales. A recipe can either receive a custom locales path or infer the location of the locales directory, as long as that directory is stored next to the bake file. For example, the recipe in books/chemistry/bake would look for a directory called books/chemistry/locales. These locales do not permanently modify the I18n backend; they persist only for as long as the recipe runs, so there is no need to worry about conflicts with other books' locales.
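
The inference rule above amounts to a small path computation. The sketch below is an assumption about its shape, not kitchen's actual code; locales_dir_for and its custom: parameter are hypothetical names.

```ruby
# Hypothetical sketch: use an explicit locales path if one is given,
# otherwise look for a "locales" directory next to the bake file.
def locales_dir_for(bake_path, custom: nil)
  custom || File.join(File.dirname(bake_path), "locales")
end

puts locales_dir_for("books/chemistry/bake")
puts locales_dir_for("books/chemistry/bake", custom: "shared/locales")
```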

Specs for recipes

When the book-specific recipe is done, we can create a spec for it. Specs compare the baked file to an expected output file via their MD5 hex digests.

1) Create a folder under spec/books/book-name containing a file called input.xhtml and another file called expected_output.xhtml. As a suggestion, the input file could contain the content of assembled.xhtml, and the expected_output file could contain the content of the normalized version of the kitchen-baked file (kitchen-baked.normalized.xhtml). Both of these files are generated by the shorten script. The important thing is that input and expected_output exist and are useful test data.

2) Create an entry on the spec/main_spec.rb file like:

  it 'bakes {book-name}' do
    expect('{book-name}').to bake_correctly
  end

The book-name should match the book's directory name so that match_helper.rb can find its recipe; the bake_correctly matcher performs this lookup.

3) To run the specs against your local kitchen, use USE_LOCAL_KITCHEN=1 rspec. To run them against the current version of kitchen, make sure you have the right version at the top of your recipe, e.g. gem 'openstax_kitchen', ENV['USE_LOCAL_KITCHEN'] ? { path: '/code/kitchen' } : '3.2.0'. Instead of a version, you can also set it to a specific sha of a branch.
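
The digest comparison behind these specs can be sketched with Ruby's standard library. How bake_correctly is actually implemented is not shown in this document, so same_md5? below is a hypothetical stand-in that works on strings rather than files.

```ruby
require "digest/md5"

# Hypothetical stand-in for the comparison: two documents match when the
# MD5 hex digests of their contents are equal.
def same_md5?(actual, expected)
  Digest::MD5.hexdigest(actual) == Digest::MD5.hexdigest(expected)
end

puts same_md5?("<p>baked</p>", "<p>baked</p>")
puts same_md5?("<p>baked</p>", "<p>baked </p>")
```

Because a digest changes with any byte-level difference, even in whitespace, it helps for the expected output to be a normalized file.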
