Chapter 3 Whole document metadata

You are probably not used to thinking about medical manuscripts as having “metadata”, but all files have metadata. Some is set for you automatically by your software (right-click on a file on your computer and select “Get Info” to see some). Some you set yourself, for example in a document’s filename; think: asd_obesity_2015.Rmd. Other metadata may be present in your directory structure, like:

So the file asd_obesity_2015.Rmd was my analysis of data from the Autism Treatment Network for a manuscript on obesity we published in 2015.

When you start working in R Markdown, you can store metadata inside the document itself. The place to store it is the YAML header (sometimes called YAML frontmatter), which is the space at the top of your .Rmd that is fenced in by lines of three dashes (---), like this:
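A minimal sketch of such a header follows. The three keys are the ones discussed in this chapter; the title and author values are placeholders, not taken from a real manuscript.

```yaml
---
title: "My manuscript title"   # placeholder value
author: "Your Name"            # placeholder value
output: html_document          # the output format to knit to
---
```

Everything between the two --- lines is metadata; the body of the document starts after the closing ---.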

What kinds of metadata are useful for medical manuscripts? Think about things you may often add to the header of a Word document you plan to share with other people like collaborators:

  • the author
  • the date the document was written (or last updated!)
  • a title (perhaps with a subtitle) to frame what the document is all about for the reader

All of these things are “keys” we can add to an .Rmd document using the YAML header.
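As a rough illustration, the items in the list above map onto YAML keys as sketched here. All of the values are placeholders chosen for this example.

```yaml
---
title: "My manuscript title"            # placeholder value
subtitle: "A short framing subtitle"    # placeholder value
author: "Your Name"                     # placeholder value
date: "2024-01-01"                      # placeholder; written or last updated
---
```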

3.1 All about YAML

YAML stands for “YAML Ain’t Markup Language”, and you are not the only one to read that and think “well, that is not very helpful at all.” The most important things to know about YAML are that:

  1. You can only have one YAML header per .Rmd document (meaning, there is only one source of metadata for the file, which makes sense!).

  2. The contents of a YAML header are formatted as a series of key: value pairs, with a colon separating each key from its value and each key on a new line. In the YAML above, the keys are title, author, and output.

  3. Indentation matters.
       Indentation matters.
         Indentation matters.

You can “reuse” the content in your YAML in your text too. For example, using R code, rmarkdown::metadata$title will print:

R Markdown for Medicine

…in this text, because it is pulling metadata from our book’s title set in the YAML.

3.2 Changing output formats

Let’s focus first on the most important key in our YAML above: output, which sets the output format. R Markdown comes with a suite of built-in “output formats”, including html_document, word_document, and pdf_document.

Your turn


Change this document’s output format from html_document to word_document.
Save and 🧶 “Knit to Word”!
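After the change, the header might look like the sketch below. Only the output line differs from before; the title and author values are placeholders.

```yaml
---
title: "My manuscript title"   # placeholder value
author: "Your Name"            # placeholder value
output: word_document          # was: html_document
---
```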

3.3 Adding output options

In YAML, indentation is used to indicate nesting. Why do you need to indent? Usually because you want to add output options for a given output format. Output options can be set in your YAML, but they must be nested under the output format. The spacing here is critical because you won’t get an error message when you get it wrong: usually your document will still “knit”, but the options will silently fail to take effect.

As an example, take the html_document format. We can give the document a table of contents, and make it a floating table of contents, by nesting two options under html_document, toc and toc_float, and setting both to true (the default for each is false).
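A sketch of that header is shown here, with the title as a placeholder. Note the two levels of indentation: html_document is nested under output, and the options are nested under html_document.

```yaml
---
title: "My manuscript title"   # placeholder value
output:
  html_document:               # nested one level under output, ends with a colon
    toc: true                  # nested under html_document: add a table of contents
    toc_float: true            # make the table of contents float beside the text
---
```

Because the format now has options beneath it, html_document moves to its own line and gains a trailing colon.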

Try it out to see what this looks like.

Your turn


Word documents can have a table of contents too, but not a floating one. Try adding a toc to your word document output. Save and 🧶 “Knit to Word”!

You can see output options for each output format on the R Markdown reference page. For example, here are the options for word documents. When options are set in the YAML, replace = with : and put each key-value pair on its own line.
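For instance, arguments that the reference page writes R-style as toc = TRUE and toc_depth = 2 could be translated as in the sketch below. The title and the chosen values are placeholders for illustration.

```yaml
---
title: "My manuscript title"   # placeholder value
output:
  word_document:
    toc: true                  # R: toc = TRUE
    toc_depth: 2               # R: toc_depth = 2 (illustrative value)
---
```

R’s TRUE and FALSE are written in lowercase as true and false in YAML.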

Watch
  your

    indentation.