1 From Zero to Quarto

In this chapter, we have prepared a range of small preparatory exercises to get you started with Quarto. The exercises follow a bottom-up approach: you start out with an empty file and gradually add content and complexity.

This is not the way Quarto is usually taught. However, working this way will give you a good understanding of how Quarto works. At a later stage, we will provide you with a template to create a ZHAW-compatible PDF. In the following exercises, we will stick to HTML output.

These instructions were handcrafted to meet the requirements of the PA2 thesis. Our approach is to implement a chapter from Rousseeuw and Leroy (1987) using Quarto and Typst.

1.1 System setup

Before you start, you will need to set up your software environment and your IDE:

Download and install Quarto
Set up Quarto to work with your preferred IDE: VSCode, RStudio, Positron, Jupyter or Neovim
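Before moving on, it can help to confirm that the installation worked. The following is a minimal sketch, assuming the installer has put the `quarto` command on your PATH; the helper name `quarto_version` is made up for this illustration.

```python
import shutil
import subprocess


def quarto_version():
    """Return the installed Quarto version string, or None if `quarto` is not on PATH."""
    executable = shutil.which("quarto")
    if executable is None:
        return None
    # `quarto --version` prints the version number and exits
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


version = quarto_version()
if version is None:
    print("Quarto was not found on PATH; install it or restart your terminal.")
else:
    print(f"Found Quarto {version}")
```

If the command is not found right after installing, opening a new terminal is usually worth trying first, since the PATH is read when the terminal starts.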

1.2 Tasks

Task 1: Initiate a live preview

Start out with an empty text file with the .qmd file extension. In the examples, we will use index.qmd. In the terminal, run the following command:

quarto preview index.qmd

This will:

Create a (currently empty) html file in the working directory
Start a watch process that will automatically re-create index.html when changes to index.qmd are saved
Start a local web server that will serve the rendered html file at http://localhost:someport
Open the rendered html file in your default web browser

This is a very good starting point to learn Quarto.

Important

If you are using Python with a virtual environment, make sure to activate the environment before running the command above. You might be prompted to install additional Python libraries (e.g. jupyter, nbformat), which you should do.
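If you are unsure whether the right environment is active, a quick check from Python can tell you. This is a sketch under assumptions: it only detects venv-style environments (conda environments behave differently), it assumes that the importable module names match the package names mentioned above, and the helper names are made up for this illustration.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def missing_modules(names):
    """Return the module names from `names` that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
print("Missing modules:", missing_modules(["jupyter", "nbformat"]))
```

Run it with the same `python` that your terminal resolves after activating the environment, since that is the interpreter Quarto is expected to pick up.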

Task 2: Add text

Now, open index.qmd in your preferred IDE and add some text from Rousseeuw and Leroy (1987). You can copy and paste the raw, unformatted text from here.

To format this text in Markdown, you have to apply the following changes:

Prepend the title with a # symbol
Separate paragraphs, titles, lists with a blank line

We will introduce images, tables and equations later. Save the file and see the changes reflected in the browser.
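The two formatting rules above are mechanical enough to express in a few lines of code. The sketch below assumes that the raw text has the title on its first line and one paragraph per line, which may not hold for your copy of the text; the sample string is a placeholder, not the book's wording.

```python
def to_markdown(raw_text):
    """Turn raw text (title on the first line, one paragraph per line) into Markdown."""
    lines = [line.strip() for line in raw_text.splitlines() if line.strip()]
    if not lines:
        return ""
    title, paragraphs = lines[0], lines[1:]
    blocks = ["# " + title] + paragraphs
    # A blank line between blocks is what separates paragraphs in Markdown
    return "\n\n".join(blocks) + "\n"


# Placeholder text for illustration only
raw = """Simple Regression
First paragraph of the chapter.
Second paragraph of the chapter."""

print(to_markdown(raw))
```

For a single chapter, editing by hand is just as quick; the point is only to make the rules explicit.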

More information on Markdown

https://quarto.org/docs/authoring/markdown-basics.html

Task 3: Add metadata

Quarto documents usually contain metadata in a YAML header at the top of the file. This metadata can add content to the document (such as a title, author name, date, etc), but also specify how the document should be rendered (for example to pdf or html). The metadata is contained between two lines with three dashes ---.

Add the following metadata to the top of your index.qmd file:

---
title: "Robust Regression and Outlier Detection" 
subtitle: "Chapter 2: Simple Regression"
author: 
  - Peter J. Rousseeuw
  - Annick M. Leroy
date: "1987-10-19"
format: html
---

Important

Although you will need to render your thesis to pdf, we recommend starting out with html output, since it renders much faster. If no format is specified, html is the default.

Save the file and see the changes reflected in the browser.
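To make the role of the two `---` lines concrete, here is a small sketch that separates the YAML header from the rest of a document. It is a simplified illustration, not how Quarto itself parses files, and the function name `split_front_matter` is made up for this example.

```python
def split_front_matter(qmd_text):
    """Split a .qmd source into (yaml_header, body); the header is "" if none is found."""
    lines = qmd_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", qmd_text
    # Look for the closing delimiter after the opening one
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            header = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return header, body
    return "", qmd_text


document = """---
title: "Robust Regression and Outlier Detection"
format: html
---

# Simple Regression
"""

header, body = split_front_matter(document)
print(header)
```

Everything between the delimiters is metadata; everything after the second delimiter is document content.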

Task 4: Add Equations

Equations are written in LaTeX syntax. Inline equations are surrounded by single dollar signs; equations on a separate line are surrounded by double dollar signs.

To write your equation, you can use an online LaTeX equation editor such as this one. You will have to experiment around and do some web searches, then copy the LaTeX code into your Quarto document.

In our example, you can add the following equation to the end of the document and see how it renders in the browser:

Markdown
Rendered

$$ \hat{y} =\hat{\theta}_2^{\prime} x^{\hat{\theta}_1}  $$

\[ \hat{y} =\hat{\theta}_2^{\prime} x^{\hat{\theta}_1} \]

More information on equations

https://quarto.org/docs/authoring/markdown-basics.html#equations

Task 5: Add Code

Now we will experiment with adding some code to our document.

Important

We don’t recommend writing your entire data processing pipeline in a Quarto document. Depending on the topic of your thesis, the processing can be time-consuming (e.g. fitting ML models).

Instead, we recommend writing scripts that do the data processing and analysis in a separate project, and then using Quarto to generate the report.

However, you can use R / Python heavily to visualize your results in plots and tables.

To add code, you need to create a code block. A code block starts with three backticks followed by the language name in braces, and ends with three backticks. We recommend finding out which keyboard shortcut your IDE uses to insert a code chunk (VSCode: Ctrl+Shift+I).

For example, the following code block contains Python code:

Code chunk
Rendered

```{python}
from datetime import date

today = date.today()
print("Today date is: ", today)
```

from datetime import date

today = date.today()
print("Today date is: ", today)

Today date is:  2025-09-25

If you save the file now, you will see that the code is executed and both the code and its output are shown in the rendered document.

By default, code blocks will show both the code and the output. You can control this behavior globally (for the entire document) or locally (for a specific code block). In our case, and we recommend this for your thesis as well, we don’t want any code to be shown in the final output. To achieve this, we add the following global options to the YAML header:

execute:
  echo: false

Save the file and see that the code is no longer shown in the rendered document.
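The local variant works through a special `#|` comment at the top of a chunk, the same mechanism used for captions and labels in the tasks below. As a minimal sketch, the following lines would form the body of a `{python}` chunk; the `echo` option on the first line applies to this chunk only.

```python
#| echo: false
from datetime import date

today = date.today()
print("Today's date is:", today)
```

With the global setting from the YAML header in place, changing the first line to `#| echo: true` should do the opposite and show the code for this one chunk.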

Task 6: Add Tables

Creating tables is where Quarto really shines, since you have your favorite programming language at your disposal. Rather than manually typesetting a machine-readable table (e.g. a CSV file) in a markup language, you can write a small piece of code that generates the table for you. Since R and Python have very powerful table generation libraries, we recommend using one of those languages for this purpose.
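To illustrate the idea without any table library, the sketch below turns CSV text into a Markdown table using only the standard library. The column names and the three rows are assumptions made for illustration, and the helper name is made up; a dataframe library will do this job better in practice.

```python
import csv
import io

# Three illustrative rows; the column names are assumptions
csv_text = """Species,brain,body
Mountain beaver,8.1,1.35
Cow,423.0,465.00
Grey wolf,119.5,36.33
"""


def csv_to_markdown_table(text):
    """Render CSV text as a Markdown pipe table (header row plus data rows)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    lines = ["| " + " | ".join(header) + " |"]
    # The separator row tells Markdown where the header ends
    lines.append("|" + "|".join(" --- " for _ in header) + "|")
    lines.extend("| " + " | ".join(row) + " |" for row in data)
    return "\n".join(lines)


print(csv_to_markdown_table(csv_text))
```

Inside a Quarto chunk, printed Markdown is only interpreted as a table if the chunk passes its output through unchanged, which is what the `output: asis` chunk option is for.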

Download the dataset Animals.csv (see Section 1) and save it in the same directory as your index.qmd file. Then, insert a code chunk to read the data and show the data as a table:

Code Chunk
Rendered

```{python}
import pandas as pd
animals = pd.read_csv("Animals.csv")

animals
```

index	Species	brain	body
1	Mountain beaver	8.1	1.35
2	Cow	423.0	465.00
3	Grey wolf	119.5	36.33
4	Goat	115.0	27.66
5	Guinea pig	5.5	1.04
6	Dipliodocus	50.0	11700.00

(… truncated)

Save your file and see the table appear in the rendered document. By default, a dataframe is rendered as an HTML table in Quarto. This already looks quite good; for now, we will ignore the small differences from the original table (duplicate index, differing column names).

Task 7: Add table caption

What we do need to add is a caption to the table. This can be done by adding the tbl-cap option to the code block:

```{python}
#| tbl-cap: "Table 7: Brain and body weights of 28 animals."
import pandas as pd
animals = pd.read_csv("Animals.csv")

animals
```

Task 8: Add a table identifier

Currently, we have numbered the table manually as “Table 7”. With potentially many tables and figures in your thesis, you will want to number them automatically and be able to refer to them in the text. To do this, we need to add an identifier, a so-called label, to the table.

To let Quarto know that what we are labeling is a table, we need to prefix the label with tbl-. We add the label to the code block options like this:

```{python}
#| tbl-cap: "Brain and body weights of 28 animals."
#| label: tbl-animals
import pandas as pd
animals = pd.read_csv("Animals.csv")

animals
```

Note that you should now remove Table 7 from the caption, since the table will be automatically numbered. Check your output to verify.

Now, to refer to the table in the text, we have to replace the hard coded table number with the table reference.

Before
After

Table 7 presents the brain weight (in grams) and the body weight (in
kilograms) of 28 animals.

@tbl-animals presents the brain weight (in grams) and the body weight (in
kilograms) of 28 animals.

Information on Cross-references

https://quarto.org/docs/authoring/cross-references.html

Task 9: Add Figures (as files)

The general syntax for adding figures is:

![](figure.png)

So, download brain-vs-body.png (see Section 1) and save it in the folder where your index.qmd file is located. Then, add the following line to your document:

![](brain-vs-body.png)

A caption is added between the square brackets. A label can be added by appending {#fig-label} after the closing parenthesis of the image path. Note that adding either one will center the figure on the page. Just as table labels start with tbl-, a figure label must start with fig-.

![Logarithmic brain weight versus logarithmic body weight for 28 animals with LS
(dashed line) and RLS fit (solid line).](brain-vs-body.png){#fig-brain-body}

Now, you can replace the hard coded figure numbers with the figure reference:

Before
After

A clear picture of the relationship between the logarithms (to the base
10) of these measurements is shown in Figure 7.

A clear picture of the relationship between the logarithms (to the base
10) of these measurements is shown in @fig-brain-body.
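Once a document has many labels, a mistyped reference is easy to miss. The sketch below scans the text of a .qmd file for `@tbl-` and `@fig-` references that have no matching label. It is a rough, regex-based check written for this tutorial and not a Quarto feature; the function name and the sample text (including the deliberately misspelled reference) are made up.

```python
import re

# Labels are defined either as a chunk option or as an attribute in braces
LABEL_DEF = re.compile(r"(?:#\|\s*label:\s*|\{#)((?:tbl|fig)-[\w-]+)")
LABEL_REF = re.compile(r"@((?:tbl|fig)-[\w-]+)")


def undefined_references(qmd_text):
    """Return the referenced tbl-/fig- labels that are never defined in the text."""
    defined = set(LABEL_DEF.findall(qmd_text))
    referenced = set(LABEL_REF.findall(qmd_text))
    return referenced - defined


sample = """
@tbl-animals presents the brain weight and the body weight of 28 animals.
The relationship is shown in @fig-brain-body and again in @fig-brain-bdy.

#| label: tbl-animals
![Caption.](brain-vs-body.png){#fig-brain-body}
"""

print(sorted(undefined_references(sample)))
```

To check your own document, read it with `pathlib.Path("index.qmd").read_text()` and pass the result to the function.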

To change the width of the image, you can add width in the curly braces. The width can be specified in pixels or as a percentage of the page width:

![<caption>](brain-vs-body.png){#fig-brain-body width=50%}
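The pieces of the figure syntax (caption, path, label, width) can be summarized in a small helper. This is only a sketch for illustration: the function name is made up, and it builds the Markdown string without checking that the image file exists.

```python
def figure_markdown(path, caption, label, width=None):
    """Build the Markdown for a captioned, labeled figure."""
    if not label.startswith("fig-"):
        raise ValueError("figure labels must start with 'fig-'")
    attributes = "#" + label
    if width is not None:
        attributes += " width=" + width
    # Doubled braces produce literal braces inside an f-string
    return f"![{caption}]({path}){{{attributes}}}"


print(figure_markdown("brain-vs-body.png", "Brain versus body weight.", "fig-brain-body", width="50%"))
```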

Task 10: Add Figure (from code)

You can also generate your figures from code, similarly to how we generated the table above. This makes it easy to ensure that the figures are always up to date with the surrounding text. The following Python code recreates the figure from the book. Add it to your Quarto document and see the result in the rendered document.

Python code

```{python}
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

animals['logBrain'] = np.log10(animals['brain'])
animals['logBody'] = np.log10(animals['body'])
X = sm.add_constant(animals['logBody'])

# LS (Ordinary Least Squares) fit
ols_model = sm.OLS(animals['logBrain'], X)
ols_results = ols_model.fit()

# RLS (robust least squares) fit, here a Huber M-estimator via statsmodels RLM
rls_model = sm.RLM(animals['logBrain'], X, M=sm.robust.norms.HuberT())
rls_results = rls_model.fit()


# Plotting
plt.scatter(animals['logBody'], animals['logBrain'], color='black')
# Sort by logBody to ensure proper line plotting
sorted_indices = np.argsort(animals['logBody'])
sorted_logBody = animals['logBody'].iloc[sorted_indices]
sorted_ols_fitted = ols_results.fittedvalues.iloc[sorted_indices]
sorted_rls_fitted = rls_results.fittedvalues.iloc[sorted_indices]

plt.plot(sorted_logBody, sorted_ols_fitted, color='black', linestyle='--', linewidth=2, label='LS Fit')
plt.plot(sorted_logBody, sorted_rls_fitted, color='black', linestyle='-', linewidth=1.5, label='RLS Fit')
plt.xlabel('Log body weight')
plt.ylabel('Log brain weight')
plt.axis('square')
plt.xlim(-2, 5)
plt.ylim(-2, 5)

# Set major ticks at 0, 2, 4
plt.xticks([0, 2, 4])
plt.yticks([0, 2, 4])

# Set minor ticks at 0.4, 0.8, 1.2, 1.6, 2.4, 2.8, 3.2, 3.6
plt.gca().xaxis.set_minor_locator(MultipleLocator(0.4))
plt.gca().yaxis.set_minor_locator(MultipleLocator(0.4))

# Set ticks to point inward with different lengths
plt.tick_params(direction='in', which='major', length=12, width=1)
plt.tick_params(direction='in', which='minor', length=6, width=1)

plt.show()
```

Just as we did with tables, we can add a caption and a label to the figure by adding options to the code block:

```{python}
#| label: fig-brain-body
#| fig-cap: "Logarithmic brain weight versus logarithmic body weight for 28 animals with LS (dashed line) and RLS fit (solid line)."

<the python code goes here> ...
```

Task 11: Add Bibliography and Citations

Next, we will replace the following hardcoded bibliography entries with citations that are automatically generated from a bibliography file.

(This sample was taken from larger data sets in Weisberg 1980 and Jerison 1973)

The way this works in Quarto is analogous to how it works in LaTeX. The first step is to get the metadata of the papers you want to cite in a machine-readable format. The most common format is BibTeX. You can find the BibTeX entry for a paper in Google Scholar by clicking on the quotation mark symbol below the search result. Many journals also provide BibTeX entries for their papers.

If we look for Weisberg, Sanford 1980 in Google Scholar, we find the following BibTeX entry:

@article{weisberg1980,
  title={Some large-sample tests for nonnormality in the linear regression model: Comment},
  author={Weisberg, Sanford},
  journal={Journal of the American Statistical Association},
  volume={75},
  number={369},
  pages={28--31},
  year={1980},
  publisher={JSTOR}
}

Note that weisberg1980 is the label of this entry, which you can choose yourself. You will need this label to cite the paper in your text. However, before you can do this, you will need to store the BibTeX entry in a file with the .bib file extension. Create a new file named references.bib in the same directory as your index.qmd file, and copy the BibTeX entry above into this file.

Now, you need to tell Quarto where to find your bibliography file. This is done by adding the bibliography field to the YAML header of your index.qmd file:

bibliography: references.bib

Now, you can replace the hardcoded reference to Weisberg 1980 with a citation:

Before
After

(This sample was taken from larger data sets in Weisberg 1980 and Jerison 1973.)

(This sample was taken from larger data sets in @weisberg1980 and Jerison 1973)

Have a look at the rendered document to see that the citation is now shown in the text, and a bibliography entry for Weisberg 1980 is automatically added at the end of the document. You will notice that the citation is linked to the bibliography entry in the rendered document. If you want to place the bibliography at a specific location, add the following lines where you want it to appear:

::: {#refs}
:::

In the same way, you can add the BibTeX entry for Jerison 1973 to your references.bib file and cite it in your text. The BibTeX entry for Jerison 1973 is:

@book{jerison1973,
  title={Evolution of the brain and intelligence},
  author={Jerison, Harry J},
  year={1973},
  publisher={Academic Press}
}
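As the bibliography file grows, it is handy to list the labels you can cite. The following sketch extracts them from BibTeX source with a regular expression. It assumes one `@type{label,` opener per entry, as in the entries above, and is not a full BibTeX parser; the entries are shortened for brevity and the function name is made up.

```python
import re

# Shortened versions of the two entries used in this chapter
bib_text = r"""
@article{weisberg1980,
  author={Weisberg, Sanford},
  year={1980}
}

@book{jerison1973,
  author={Jerison, Harry J},
  year={1973}
}
"""

# An entry opens with @type{label, so capture what sits between "{" and the first ","
ENTRY_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def citation_keys(text):
    """Return the entry labels found in BibTeX source, in file order."""
    return ENTRY_KEY.findall(text)


print(citation_keys(bib_text))
```

Each label returned here can be cited in the text by prefixing it with `@`.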

You might have noticed that the default citation shows the year in parentheses. In this particular case, we do not want this style, since the whole sentence is already in parentheses. To achieve this, apply the following changes:

Before

Markdown
Rendered

(This sample was taken from larger data sets in @weisberg1980 and @jerison1973)

(This sample was taken from larger data sets in Weisberg (1980) and Jerison (1973))

After

Markdown
Rendered

[This sample was taken from larger data sets in @weisberg1980; and @jerison1973]

(This sample was taken from larger data sets in Weisberg 1980; and Jerison 1973)

1.3 Bibliography

Jerison, Harry J. 1973. Evolution of the Brain and Intelligence. Academic Press.

Rousseeuw, Peter J, and Annick M Leroy. 1987. Robust Regression and Outlier Detection. John Wiley & Sons.

Weisberg, Sanford. 1980. “Some Large-Sample Tests for Nonnormality in the Linear Regression Model: Comment.” Journal of the American Statistical Association 75 (369): 28–31.