<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Worked examples | Learning from Data: R programming</title>
    <link>https://bit-2021.netlify.app/example/</link>
      <atom:link href="https://bit-2021.netlify.app/example/index.xml" rel="self" type="application/rss+xml" />
    <description>Worked examples</description>
    <generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator>
    <image>
      <url>https://bit-2021.netlify.app/media/social-image.png</url>
      <title>Worked examples</title>
      <link>https://bit-2021.netlify.app/example/</link>
    </image>
    
    <item>
      <title>Exploratory Data Analysis for Modelling</title>
      <link>https://bit-2021.netlify.app/example/modelling_eda/</link>
      <pubDate>Tue, 28 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/modelling_eda/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#penguin-bill-dimensions&#34;&gt;Penguin Bill dimensions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#exploratory-data-analysis-eda.&#34;&gt;Exploratory Data Analysis (EDA).&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#ggallyggpairs-to-get-scatter-plot-correlation-matrix&#34;&gt;&lt;code&gt;GGally::ggpairs()&lt;/code&gt; to get scatter plot + correlation matrix&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#dummy-variables&#34;&gt;Dummy variables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#modelling-considerations-of-numerical-variables&#34;&gt;Modelling considerations of numerical variables&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#penguin-body-mass-vs-flipper-length&#34;&gt;Penguin body mass vs flipper length&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#penguin-body-mass-vs-bill-depth&#34;&gt;Penguin body mass vs bill depth&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#penguin-body-mass-and-flipper-size-faceted-by-sex&#34;&gt;Penguin body mass and flipper size, faceted by sex&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#simpsons-paradox-penguin-bill-length-vs-bill-depth&#34;&gt;Simpson’s Paradox: Penguin bill length vs bill depth&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#is-y-heavily-skewed-does-it-need-to-be-transformed&#34;&gt;Is Y heavily skewed? Does it need to be transformed?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#acknowledgements&#34;&gt;Acknowledgements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;You may remember the Happy Feet movie and here is an Adélie penguin singing
&lt;a href=&#34;https://www.youtube.com/watch?v=ZCATC0z0K0I&#34; target=&#34;_blank&#34;&gt;My Way&lt;/a&gt;. We shall analyse data that were collected and made available by &lt;a href=&#34;https://www.uaf.edu/cfos/people/faculty/detail/kristen-gorman.php&#34;&gt;Dr. Kristen Gorman&lt;/a&gt; and the &lt;a href=&#34;https://pal.lternet.edu/&#34;&gt;Palmer Station, Antarctica LTER&lt;/a&gt;. The dataset contains data for 344 penguins from 3 species (Adélie, Chinstrap, and Gentoo), collected from 3 islands in the Palmer Archipelago, Antarctica.&lt;/p&gt;
&lt;p&gt;This great artwork was made by &lt;a href=&#34;https://twitter.com/allison_horst&#34; class=&#34;uri&#34;&gt;@allison_horst&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/penguins.png&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;If you are a Happy Feet fan, these are the penguins we have data on:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://happyfeet.fandom.com/wiki/Ad%C3%A9lie_Penguin&#34; target=&#34;_blank&#34;&gt;Adélie&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://happyfeet.fandom.com/wiki/Chinstrap_Penguin&#34; target=&#34;_blank&#34;&gt;Chinstrap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://happyfeet.fandom.com/wiki/Gentoo_Penguin&#34; target=&#34;_blank&#34;&gt;Gentoo&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div id=&#34;penguin-bill-dimensions&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Penguin Bill dimensions&lt;/h2&gt;
&lt;p&gt;The culmen is the upper ridge of a bird’s bill. In the simplified &lt;code&gt;penguins&lt;/code&gt; data, culmen length and depth are renamed as variables &lt;code&gt;bill_length_mm&lt;/code&gt; and &lt;code&gt;bill_depth_mm&lt;/code&gt; to be more intuitive.&lt;/p&gt;
&lt;p&gt;For this penguin data, the culmen (bill) length and depth are measured as shown below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/culmen_depth.png&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;exploratory-data-analysis-eda.&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Exploratory Data Analysis (EDA).&lt;/h2&gt;
&lt;p&gt;The variable of interest is penguin body mass. We want to see whether body mass is related to any of the other variables included in the dataframe. So let us start by looking at the data:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyverse)       # provides glimpse(), the pipe, and ggplot2
library(palmerpenguins)  # provides the penguins data
glimpse(penguins)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 344
## Columns: 8
## $ species           &amp;lt;fct&amp;gt; Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, A...
## $ island            &amp;lt;fct&amp;gt; Torgersen, Torgersen, Torgersen, Torgersen, Torge...
## $ bill_length_mm    &amp;lt;dbl&amp;gt; 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34....
## $ bill_depth_mm     &amp;lt;dbl&amp;gt; 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18....
## $ flipper_length_mm &amp;lt;int&amp;gt; 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, ...
## $ body_mass_g       &amp;lt;int&amp;gt; 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 347...
## $ sex               &amp;lt;fct&amp;gt; male, female, female, NA, female, male, female, m...
## $ year              &amp;lt;int&amp;gt; 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2...&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;skimr::skim(penguins)&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;caption&gt;&lt;span id=&#34;tab:unnamed-chunk-3&#34;&gt;Table 1: &lt;/span&gt;Data summary&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Name&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;penguins&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of rows&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;344&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of columns&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;_______________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Column type frequency:&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;factor&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;numeric&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;________________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Group variables&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: factor&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;ordered&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_unique&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;top_counts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;species&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FALSE&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;Ade: 152, Gen: 124, Chi: 68&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;island&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FALSE&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;Bis: 168, Dre: 124, Tor: 52&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;sex&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.97&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FALSE&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;mal: 168, fem: 165&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: numeric&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;mean&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;sd&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p0&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p25&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p50&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p75&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p100&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;hist&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;bill_length_mm&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.99&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;43.92&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5.46&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;32.1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;39.23&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;44.45&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;48.5&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;59.6&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▃▇▇▆▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;bill_depth_mm&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.99&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;17.15&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.97&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;13.1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;15.60&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;17.30&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;18.7&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;21.5&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▅▅▇▇▂&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;flipper_length_mm&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.99&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;200.92&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;14.06&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;172.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;190.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;197.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;213.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;231.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▂▇▃▅▂&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;body_mass_g&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.99&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4201.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;801.95&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2700.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3550.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4050.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4750.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6300.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▃▇▆▃▂&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;year&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2008.03&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.82&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2007.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2007.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2008.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2009.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2009.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▁▇▁▇&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div id=&#34;ggallyggpairs-to-get-scatter-plot-correlation-matrix&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;GGally::ggpairs()&lt;/code&gt; to get scatter plot + correlation matrix&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;GGally::ggpairs()&lt;/code&gt; is a useful tool for exploring distributions and correlations.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggpairs1 &amp;lt;- penguins %&amp;gt;% 
  na.omit() %&amp;gt;% 
  select( body_mass_g, flipper_length_mm, bill_length_mm, bill_depth_mm, species, sex, island) %&amp;gt;% 
  rename(`Flipper length (mm)` = flipper_length_mm, 
         `Body mass (g)` = body_mass_g, 
         `Bill length (mm)` = bill_length_mm, 
         `Bill depth (mm)` = bill_depth_mm) %&amp;gt;% 
  GGally::ggpairs() +
  theme_minimal() 

ggpairs1&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/ggpairs1-1.png&#34; width=&#34;672&#34; /&gt;
&lt;code&gt;ggpairs()&lt;/code&gt; provides a lot of information, so let us spend some time deciphering this chart.&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Along the diagonal we get a density plot of each variable of interest. For instance, body mass seems to be right skewed, and the rest of the variables seem bimodal.&lt;/li&gt;
&lt;li&gt;The upper part of the matrix shows correlation coefficients; to determine which pair of variables a coefficient refers to, read off the corresponding row and column headers.&lt;/li&gt;
&lt;li&gt;The bottom part provides a scatterplot between any two variables, which you can again identify by looking at the row and column they correspond to.&lt;/li&gt;
&lt;li&gt;For categorical variables (species, sex, island), we do not get any numerical values, but rather histograms and boxplots that show the distribution of outcomes. If we wanted to get a numerical correlation value, we would use the &lt;code&gt;polycor&lt;/code&gt; package and &lt;code&gt;polycor::hetcor()&lt;/code&gt; that calculates polyserial correlations between numeric and ordinal variables, and polychoric correlations between ordinal variables, but this is outside the scope of this class.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Which correlation is the highest? That between body mass and flipper length, at 0.873. We can see the scatterplot of these two variables in the upper left, just underneath the density plot of body mass, whereas the correlation value of 0.873 appears to the right of the body mass density plot. We could almost “see” a line through these points. In building a model for body mass, flipper length will surely be the first variable to consider and the one that would explain a fair bit of the variation in body mass.&lt;/p&gt;
&lt;p&gt;What about the weakest correlation, the one closest to zero? That seems to be the one between bill length and bill depth, with a correlation value of -0.229. On the face of it, there seems to be a very weak relationship between these two variables.&lt;/p&gt;
&lt;p&gt;What about the second weakest correlation? Numerically, it is the one between body mass and bill depth (-0.472). However, if you look at the corresponding scatterplot in the lower left corner you see two clusters of points, so we need to dig a bit deeper.&lt;/p&gt;
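&lt;p&gt;The coefficients shown in the upper part of the matrix can also be computed directly. The following is a minimal sketch, assuming the &lt;code&gt;dplyr&lt;/code&gt; and &lt;code&gt;palmerpenguins&lt;/code&gt; packages are installed; it applies base R’s &lt;code&gt;cor()&lt;/code&gt; to the four numerical variables, after dropping incomplete rows as in the chart above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(palmerpenguins)

# Pearson correlation matrix of the numerical variables,
# using the same complete rows as the ggpairs() chart
penguins %&amp;gt;% 
  na.omit() %&amp;gt;% 
  select(body_mass_g, flipper_length_mm, bill_length_mm, bill_depth_mm) %&amp;gt;% 
  cor() %&amp;gt;% 
  round(3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The off-diagonal entries should match the values printed by &lt;code&gt;ggpairs()&lt;/code&gt;, for example 0.873 for body mass and flipper length.&lt;/p&gt;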
&lt;/div&gt;
&lt;div id=&#34;dummy-variables&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Dummy variables&lt;/h2&gt;
&lt;p&gt;Dummy, or categorical, variables allow us to incorporate non-numeric data into our analysis. In this example we focus on two categorical variables (&lt;code&gt;species&lt;/code&gt;, &lt;code&gt;sex&lt;/code&gt;); we can use &lt;code&gt;ggpairs()&lt;/code&gt; to dig a bit deeper and then use these categorical variables in our modelling. The first scatterplot matrix does not take into consideration the species or the sex of the penguins, variables that may help explain differences in body mass.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggpairs2 &amp;lt;- penguins %&amp;gt;% 
  na.omit() %&amp;gt;% 
  select( species, sex, body_mass_g, flipper_length_mm, bill_length_mm, bill_depth_mm) %&amp;gt;% 
  rename(`Flipper length (mm)` = flipper_length_mm, 
         `Body mass (g)` = body_mass_g, 
         `Bill length (mm)` = bill_length_mm, 
         `Bill depth (mm)` = bill_depth_mm) %&amp;gt;% 
  GGally::ggpairs(aes(colour=species, shape=species),
                  alpha = 0.4) +
  scale_colour_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  scale_fill_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  theme_minimal()
  

ggpairs2&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/ggpairs2-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;When we include the categorical (factor) variables &lt;code&gt;species&lt;/code&gt; and &lt;code&gt;sex&lt;/code&gt;, which are not numerical, we get boxplots and histograms for them. For instance, to explore the relationship between body mass and species, look at the upper left of the matrix; you can see that the green penguins (Gentoo) seem to be heavier than the other two species, which may not differ from each other.&lt;/p&gt;
&lt;p&gt;Since this plot may be confusing, let us just concentrate on the scatterplot matrix of numerical values coloured by &lt;code&gt;species&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggpairs3 &amp;lt;- penguins %&amp;gt;% 
  na.omit() %&amp;gt;% 
  select( species, sex, body_mass_g, flipper_length_mm, bill_length_mm, bill_depth_mm) %&amp;gt;% 
  rename(`Flipper length (mm)` = flipper_length_mm, 
         `Body mass (g)` = body_mass_g, 
         `Bill length (mm)` = bill_length_mm, 
         `Bill depth (mm)` = bill_depth_mm) %&amp;gt;% 
  GGally::ggpairs(aes(colour=species, shape=species),
                  alpha = 0.4,
                  columns = c(3:6)) + 
  scale_colour_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  scale_fill_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  theme_minimal()
  

ggpairs3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/ggpairs3-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
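&lt;p&gt;To make the idea of dummy variables concrete, here is a small sketch (assuming only that &lt;code&gt;palmerpenguins&lt;/code&gt; is installed) that uses base R’s &lt;code&gt;model.matrix()&lt;/code&gt; to show how a factor such as &lt;code&gt;species&lt;/code&gt; is encoded as 0/1 columns when it enters a linear model. With R’s default treatment contrasts, the first level (Adelie) acts as the baseline and gets no column of its own.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)

# One row per penguin: an intercept plus one 0/1 column
# for each non-baseline species
dummies &amp;lt;- model.matrix(~ species, data = penguins)

colnames(dummies)
head(dummies)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The column names should be &lt;code&gt;(Intercept)&lt;/code&gt;, &lt;code&gt;speciesChinstrap&lt;/code&gt; and &lt;code&gt;speciesGentoo&lt;/code&gt;; an Adelie penguin has 0 in both species columns.&lt;/p&gt;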
&lt;/div&gt;
&lt;div id=&#34;modelling-considerations-of-numerical-variables&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Modelling considerations of numerical variables&lt;/h2&gt;
&lt;p&gt;Let us examine the relationship between body mass and flipper length, and body mass and bill depth.&lt;/p&gt;
&lt;div id=&#34;penguin-body-mass-vs-flipper-length&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Penguin body mass vs flipper length&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;mass_flipper &amp;lt;- ggplot(data = penguins,
                       aes(x = flipper_length_mm,
                           y = body_mass_g)) +
  geom_point(aes(colour = species,
                 shape = species),
             size = 3,
             alpha = 0.6) +
  theme_minimal() +
  geom_smooth(method = &amp;quot;lm&amp;quot;, se=FALSE, aes(colour = species)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Flipper length and body mass for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       x = &amp;quot;Flipper length (mm)&amp;quot;,
       y = &amp;quot;Body mass (g)&amp;quot;,
       color = &amp;quot;Penguin species&amp;quot;,
       shape = &amp;quot;Penguin species&amp;quot;) +
  theme(legend.position = c(0.2, 0.7),
        legend.background = element_rect(fill = &amp;quot;white&amp;quot;, colour = NA),
        plot.title.position = &amp;quot;plot&amp;quot;,
        plot.caption = element_text(hjust = 0, face= &amp;quot;italic&amp;quot;),
        plot.caption.position = &amp;quot;plot&amp;quot;)

mass_flipper&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;penguin-body-mass-vs-bill-depth&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Penguin body mass vs bill depth&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;mass_bill_depth &amp;lt;- ggplot(data = penguins,
                       aes(x = bill_depth_mm,
                           y = body_mass_g)) +
  geom_point(aes(colour = species,
                 shape = species),
             size = 3,
             alpha = 0.6) +
  theme_minimal() +
  geom_smooth(method = &amp;quot;lm&amp;quot;, se=FALSE, aes(colour = species)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Bill depth and body mass for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       x = &amp;quot;Bill depth (mm)&amp;quot;,
       y = &amp;quot;Body mass (g)&amp;quot;,
       color = &amp;quot;Penguin species&amp;quot;,
       shape = &amp;quot;Penguin species&amp;quot;) +
  theme(legend.position = c(0.8, 0.8),
        legend.background = element_rect(fill = &amp;quot;white&amp;quot;, colour = NA),
        plot.title.position = &amp;quot;plot&amp;quot;,
        plot.caption = element_text(hjust = 0, face= &amp;quot;italic&amp;quot;),
        plot.caption.position = &amp;quot;plot&amp;quot;)

mass_bill_depth&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/unnamed-chunk-5-1.png&#34; width=&#34;672&#34; /&gt;
This shows that there is very little difference between Adelie and Chinstrap, but Gentoo is markedly different. All three species have a positive relationship (as bill depth increases, so does body mass), which is not what the numerical correlation coefficient of -0.472 would have us believe.&lt;/p&gt;
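&lt;p&gt;The change of sign can be checked numerically. The sketch below, which assumes &lt;code&gt;dplyr&lt;/code&gt; and &lt;code&gt;palmerpenguins&lt;/code&gt; are installed, computes the correlation between body mass and bill depth for all penguins pooled together and then separately within each species.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(palmerpenguins)

penguins_complete &amp;lt;- na.omit(penguins)

# Pooled correlation, ignoring species
cor(penguins_complete$body_mass_g, penguins_complete$bill_depth_mm)

# Correlation within each species
penguins_complete %&amp;gt;% 
  group_by(species) %&amp;gt;% 
  summarise(n = n(),
            correlation = cor(body_mass_g, bill_depth_mm))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The pooled value is negative, while the within-species values should all be positive, in line with the fitted lines in the plot.&lt;/p&gt;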
&lt;/div&gt;
&lt;div id=&#34;penguin-body-mass-and-flipper-size-faceted-by-sex&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Penguin body mass and flipper size, faceted by sex&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(penguins, aes(x = flipper_length_mm,
                            y = body_mass_g)) +
  geom_point(aes(colour = sex)) +
  theme_minimal() +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;cyan4&amp;quot;), na.translate = FALSE) +
  labs(title = &amp;quot;Penguin flipper and body mass&amp;quot;,
       subtitle = &amp;quot;Dimensions for male and female Adelie, Chinstrap and Gentoo Penguins at Palmer Station LTER&amp;quot;,
       x = &amp;quot;Flipper length (mm)&amp;quot;,
       y = &amp;quot;Body mass (g)&amp;quot;,
       color = &amp;quot;Penguin sex&amp;quot;) +
  theme(legend.position = &amp;quot;bottom&amp;quot;,
        legend.background = element_rect(fill = &amp;quot;white&amp;quot;, color = NA),
        plot.title.position = &amp;quot;plot&amp;quot;,
        plot.caption = element_text(hjust = 0, face= &amp;quot;italic&amp;quot;),
        plot.caption.position = &amp;quot;plot&amp;quot;) +
  facet_wrap(~species)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;On average, male penguins seem to be heavier than female ones, and this is consistent across all three species.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;simpsons-paradox-penguin-bill-length-vs-bill-depth&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Simpson’s Paradox: Penguin bill length vs bill depth&lt;/h3&gt;
&lt;p&gt;In the original scatterplot matrix without &lt;code&gt;species&lt;/code&gt;, the weakest correlation coefficient, -0.229, was between bill length and bill depth. The following is the same scatterplot with the line of best fit added.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bill_no_species &amp;lt;- ggplot(data = penguins,
                         aes(x = bill_length_mm,
                             y = bill_depth_mm)) +
  geom_point() +
  theme_minimal() +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  labs(title = &amp;quot;Penguin bill dimensions (omit species)&amp;quot;,
       subtitle = &amp;quot;Palmer Station LTER&amp;quot;,
       x = &amp;quot;Bill length (mm)&amp;quot;,
       y = &amp;quot;Bill depth (mm)&amp;quot;) +
  theme(plot.title.position = &amp;quot;plot&amp;quot;,
        plot.caption = element_text(hjust = 0, face= &amp;quot;italic&amp;quot;),
        plot.caption.position = &amp;quot;plot&amp;quot;) +
  geom_smooth(method = &amp;quot;lm&amp;quot;, se = FALSE, color = &amp;quot;blue&amp;quot;)

bill_no_species&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;p&gt;However, if we plot the same scatterplot colouring points by &lt;code&gt;species&lt;/code&gt;, we get a completely different story.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bill_len_dep &amp;lt;- ggplot(data = penguins,
                         aes(x = bill_length_mm,
                             y = bill_depth_mm,
                             group = species)) +
  geom_point(aes(colour = species,
                 shape = species),
             size = 3,
             alpha = 0.8) +
  geom_smooth(method = &amp;quot;lm&amp;quot;, se = FALSE, aes(colour = species)) +
  theme_minimal() +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  labs(title = &amp;quot;Penguin bill dimensions&amp;quot;,
       subtitle = &amp;quot;Bill length and depth for Adelie, Chinstrap and Gentoo Penguins at Palmer Station LTER&amp;quot;,
       x = &amp;quot;Bill length (mm)&amp;quot;,
       y = &amp;quot;Bill depth (mm)&amp;quot;,
       color = &amp;quot;Penguin species&amp;quot;,
       shape = &amp;quot;Penguin species&amp;quot;) +
  theme(legend.position = c(0.85, 0.10),
        legend.background = element_rect(fill = &amp;quot;white&amp;quot;, color = NA),
        plot.title.position = &amp;quot;plot&amp;quot;,
        plot.caption = element_text(hjust = 0, face= &amp;quot;italic&amp;quot;),
        plot.caption.position = &amp;quot;plot&amp;quot;)

bill_len_dep&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_eda_files/figure-html/unnamed-chunk-8-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;a href=&#34;https://en.wikipedia.org/wiki/Simpson%27s_paradox&#34;&gt;Simpson’s paradox&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
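&lt;p&gt;The reversal can also be seen in the slopes of the fitted lines. As a sketch using base R’s &lt;code&gt;lm()&lt;/code&gt; (and assuming &lt;code&gt;palmerpenguins&lt;/code&gt; is installed), we fit one regression of bill depth on bill length to all penguins, and then one per species. The names &lt;code&gt;pooled_slope&lt;/code&gt; and &lt;code&gt;species_slopes&lt;/code&gt; are chosen here purely for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)

# Slope of the single line fitted to all penguins
pooled_slope &amp;lt;- coef(lm(bill_depth_mm ~ bill_length_mm, data = penguins))[[&amp;quot;bill_length_mm&amp;quot;]]

# Slope of the line fitted within each species
species_slopes &amp;lt;- sapply(
  split(penguins, penguins$species),
  function(d) coef(lm(bill_depth_mm ~ bill_length_mm, data = d))[[&amp;quot;bill_length_mm&amp;quot;]]
)

pooled_slope
species_slopes&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The pooled slope is negative while each within-species slope is positive: the overall trend reverses once we account for species, which is what Simpson’s paradox describes.&lt;/p&gt;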
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;is-y-heavily-skewed-does-it-need-to-be-transformed&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Is Y heavily skewed? Does it need to be transformed?&lt;/h2&gt;
&lt;p&gt;The variable we are interested in explaining, body mass, does not appear to be heavily skewed, so there is no need for any transformation.&lt;/p&gt;
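&lt;p&gt;One quick way to check this is to compare the mean with the median and look at a histogram. A minimal sketch, assuming &lt;code&gt;ggplot2&lt;/code&gt; and &lt;code&gt;palmerpenguins&lt;/code&gt; are installed:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(palmerpenguins)

# For a roughly symmetric variable, mean and median are close;
# a mean well above the median suggests right skew
mean(penguins$body_mass_g, na.rm = TRUE)
median(penguins$body_mass_g, na.rm = TRUE)

# Histogram of body mass; rows with missing body mass are dropped first
ggplot(subset(penguins, !is.na(body_mass_g)), aes(x = body_mass_g)) +
  geom_histogram(bins = 30) +
  theme_minimal() +
  labs(x = &amp;quot;Body mass (g)&amp;quot;, y = &amp;quot;Number of penguins&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the data summary table, the mean (4201.75 g) is somewhat above the median (4050 g), which points to a mild right skew rather than a heavy one.&lt;/p&gt;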
&lt;/div&gt;
&lt;div id=&#34;acknowledgements&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Acknowledgements&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;This page is adapted from the &lt;a href=&#34;https://allisonhorst.github.io/palmerpenguins/articles/examples.html&#34; target=&#34;_blank&#34;&gt;Palmer Penguins package Vignette&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Import data</title>
      <link>https://bit-2021.netlify.app/example/eda-import-data/</link>
      <pubDate>Tue, 21 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/eda-import-data/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#overview&#34;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#importing-csv-files-read_csv&#34;&gt;Importing CSV files: &lt;code&gt;read_csv()&lt;/code&gt;&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#importing-csv-files-directly-off-the-internet&#34;&gt;Importing CSV files directly off the internet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#importing-csv-files-saved-locally&#34;&gt;Importing CSV files saved locally&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#need-for-speed-enter-data.tablefread&#34;&gt;Need for speed: Enter &lt;code&gt;data.table::fread()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#other-data-formats&#34;&gt;Other data formats&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#rio-a-swiss-army-knife-for-data-input-output&#34;&gt;&lt;code&gt;rio&lt;/code&gt;: a swiss-army knife for data input-output&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#never-work-directly-on-the-raw-data&#34;&gt;Never work directly on the raw data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#other-links&#34;&gt;Other links&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Learning Objectives &lt;br&gt;
1. Load external data from a .csv file into a data frame.&lt;br&gt;
2. Describe what a data frame is.&lt;br&gt;
3. Use indexing to subset specific portions of data frames.&lt;br&gt;
4. Describe what a factor is.&lt;br&gt;
5. Reorder and rename factors.&lt;br&gt;
6. Format dates.&lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div id=&#34;overview&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;p&gt;One of the things that I found strange when I started working with R was that, unlike other software like Excel, Stata, SPSS, etc., you couldn’t just double click on an .xls, .dta, or .sav file, load the data and look at its contents. In R, we must use a command to explicitly import the data into memory.&lt;/p&gt;
&lt;p&gt;While there are many possible data formats, we will concentrate on &lt;strong&gt;CSV&lt;/strong&gt; (&lt;em&gt;Comma Separated Values&lt;/em&gt;) files, a common way to save the raw data from spreadsheets without any of the formatting. The &lt;strong&gt;readr&lt;/strong&gt; R package contains functions for importing data saved as &lt;em&gt;flat file&lt;/em&gt; documents; &lt;code&gt;readr&lt;/code&gt; is a core member of the tidyverse and is loaded every time you call &lt;code&gt;library(tidyverse)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;CSV file names end with .csv, and if you opened one inside Excel, it would look like a regular Excel file. NASA provides an estimate of global surface temperature change which allows us to calculate weather anomalies. The data is available at &lt;a href=&#34;https://data.giss.nasa.gov/gistemp/tabledata_v3/NH.Ts+dSST.csv&#34;&gt;https://data.giss.nasa.gov/gistemp/tabledata_v3/NH.Ts+dSST.csv&lt;/a&gt; as a CSV file which, when opened inside Excel, looks something like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/weatherAnomaliesCSVinExcel.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;However, this is what a &lt;strong&gt;CSV&lt;/strong&gt; file looks like on the inside: a bunch of values separated with commas.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/weatherAnomaliesCSV.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;By the way, if you look at the data closely, you will notice that the values in the &lt;code&gt;D-N&lt;/code&gt; (December-November) and &lt;code&gt;DJF&lt;/code&gt; (December-January-February) columns for the year 1880 are &lt;code&gt;***&lt;/code&gt;. These &lt;code&gt;***&lt;/code&gt; denote a missing value, in the same way that R uses the &lt;code&gt;NA&lt;/code&gt; (or &lt;strong&gt;not available&lt;/strong&gt;) value.&lt;/p&gt;
&lt;p&gt;If you’d like R to treat these &lt;code&gt;***&lt;/code&gt; values as missing, you will need to convert them to &lt;code&gt;NA&lt;/code&gt;s. One way to do this is to ask &lt;code&gt;read_csv()&lt;/code&gt; to parse &lt;code&gt;***&lt;/code&gt; values as &lt;code&gt;NA&lt;/code&gt; values when it reads in the data.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;importing-csv-files-read_csv&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Importing CSV files: &lt;code&gt;read_csv()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Base R can import CSV files with the &lt;code&gt;read.csv()&lt;/code&gt; command. However, we will use the &lt;code&gt;readr&lt;/code&gt; package and its &lt;code&gt;read_csv()&lt;/code&gt; command to read flat data. &lt;code&gt;read_csv()&lt;/code&gt; is usually faster than the base R command (up to 8-10 times on some files; the benchmark later on this page shows a smaller gain) and more user friendly: there is no need to define row names, and strings are never converted to factors, so you do not have to worry about &lt;code&gt;stringsAsFactors&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Even though we only concentrate on CSV files, &lt;code&gt;readr&lt;/code&gt; has several functions that allow you to import a specific flat file format.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Reads&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;read_csv()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Comma separated values&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;read_csv2()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Semicolon separated values&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;read_delim()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;General delimited files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;read_fwf()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fixed width files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;read_log()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Apache log files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;read_table()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Whitespace separated files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;read_tsv()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tab delimited values&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Just as you can import data, &lt;code&gt;readr&lt;/code&gt; allows you to export data and save it locally. These functions are similar to the &lt;code&gt;read_&lt;/code&gt; functions, and each saves a tibble (or data frame) in the specific file format.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Writes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;write_csv()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Comma separated values&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;write_excel_csv()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CSV that you plan to open in Excel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;write_delim()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;General delimited files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;write_file()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A single string, written as is&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;write_lines()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A vector of strings, one string per line&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;write_tsv()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Tab delimited values&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;To use a &lt;code&gt;write_&lt;/code&gt; function, first give it the name of the data frame to save, then give it a filename to save in your working directory.&lt;/p&gt;
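&lt;p&gt;As a minimal sketch of that calling pattern, the snippet below saves a small made-up data frame and reads it back. The data frame &lt;code&gt;anomalies&lt;/code&gt; and the file name are hypothetical, and the file is written to a temporary folder here (rather than the working directory) so that nothing is left behind.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(readr)

# hypothetical data frame to save
anomalies &amp;lt;- data.frame(Year = c(1880L, 1881L), Jan = c(-0.54, -0.19))

# first argument: the data frame; second argument: where to save it
out_file &amp;lt;- file.path(tempdir(), &amp;quot;anomalies.csv&amp;quot;)
write_csv(anomalies, out_file)

# read the file back to confirm what was written
saved &amp;lt;- read_csv(out_file)
saved&lt;/code&gt;&lt;/pre&gt;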
&lt;div id=&#34;importing-csv-files-directly-off-the-internet&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Importing CSV files directly off the internet&lt;/h3&gt;
&lt;p&gt;If the CSV exists on the internet and you have the URL address, you don’t have to download it to your local machine and then import it; you can import it directly off the web using the URL link.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;url &amp;lt;- &amp;quot;https://data.giss.nasa.gov/gistemp/tabledata_v3/NH.Ts+dSST.csv&amp;quot;
weather &amp;lt;- read_csv(url, skip = 1, na = &amp;quot;***&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When using the &lt;code&gt;read_csv()&lt;/code&gt; function, we added two options:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;The &lt;code&gt;skip = 1&lt;/code&gt; option is there because the real data table only starts in row 2, so we need to skip one row.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;na = &#34;***&#34;&lt;/code&gt; option informs R how missing observations in the spreadsheet are coded. As discussed earlier, it is best to specify NA values here, as otherwise some of the data may not be recognized as numeric data.&lt;/li&gt;
&lt;/ol&gt;
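&lt;p&gt;To see what these two options do without downloading anything, here is a minimal sketch on a small inline CSV. The title line and the three data rows are made up for illustration (they only mimic the layout of the NASA file), and &lt;code&gt;I()&lt;/code&gt; is used to tell &lt;code&gt;read_csv()&lt;/code&gt; that the string is literal data rather than a file name.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(readr)

# hypothetical mini-file: one title line, then a header row and data rows
csv_text &amp;lt;- &amp;quot;Made-up title line\nYear,Jan,DJF\n1880,-0.54,***\n1881,-0.19,-0.31\n1882,0.22,0.06\n&amp;quot;

# without na = &amp;quot;***&amp;quot; the DJF column cannot be parsed as numbers
no_na &amp;lt;- read_csv(I(csv_text), skip = 1)
class(no_na$DJF)

# with na = &amp;quot;***&amp;quot; the asterisks become NA and DJF is numeric
with_na &amp;lt;- read_csv(I(csv_text), skip = 1, na = &amp;quot;***&amp;quot;)
class(with_na$DJF)
is.na(with_na$DJF)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this sketch the first call should report &lt;code&gt;DJF&lt;/code&gt; as a character column, while the second should report it as numeric with a missing first value.&lt;/p&gt;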
&lt;p&gt;Notice that the code above saves the output to an object named &lt;code&gt;weather&lt;/code&gt;. You must save the output of &lt;code&gt;read_csv()&lt;/code&gt; to an object if you wish to use it later; otherwise, &lt;code&gt;read_csv()&lt;/code&gt; will just print the contents of the data set at the command line.&lt;/p&gt;
&lt;p&gt;Also, the assignment statement doesn’t produce any output for &lt;code&gt;weather&lt;/code&gt; because assignments don’t display anything. If we want to check that our data has been loaded, we can look at the structure of the dataframe using &lt;code&gt;glimpse()&lt;/code&gt; or see its contents by just typing its name: &lt;code&gt;weather&lt;/code&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;glimpse(weather)&lt;/code&gt; shows us the number of observations and variables, and then, for each variable, shows the variable type; in our case, all of the variables are &lt;code&gt;dbl&lt;/code&gt; or double, namely numeric variables.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;weather&lt;/code&gt;, just invoking the name of the dataframe, shows the contents of the dataframe in the tabular form it is saved in.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(weather)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 140
## Columns: 19
## $ Year  &amp;lt;dbl&amp;gt; 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1...
## $ Jan   &amp;lt;dbl&amp;gt; -0.54, -0.19, 0.22, -0.59, -0.23, -1.00, -0.68, -1.07, -0.53,...
## $ Feb   &amp;lt;dbl&amp;gt; -0.38, -0.25, 0.22, -0.67, -0.11, -0.37, -0.68, -0.58, -0.59,...
## $ Mar   &amp;lt;dbl&amp;gt; -0.26, 0.02, 0.00, -0.16, -0.65, -0.21, -0.57, -0.36, -0.58, ...
## $ Apr   &amp;lt;dbl&amp;gt; -0.37, -0.02, -0.36, -0.27, -0.62, -0.53, -0.34, -0.42, -0.24...
## $ May   &amp;lt;dbl&amp;gt; -0.11, -0.06, -0.32, -0.32, -0.42, -0.55, -0.34, -0.27, -0.16...
## $ Jun   &amp;lt;dbl&amp;gt; -0.22, -0.36, -0.38, -0.26, -0.52, -0.47, -0.43, -0.20, -0.04...
## $ Jul   &amp;lt;dbl&amp;gt; -0.23, -0.06, -0.37, -0.09, -0.48, -0.39, -0.20, -0.23, 0.04,...
## $ Aug   &amp;lt;dbl&amp;gt; -0.24, -0.03, -0.14, -0.26, -0.50, -0.44, -0.47, -0.52, -0.19...
## $ Sep   &amp;lt;dbl&amp;gt; -0.26, -0.23, -0.17, -0.33, -0.45, -0.32, -0.34, -0.17, -0.12...
## $ Oct   &amp;lt;dbl&amp;gt; -0.32, -0.40, -0.53, -0.21, -0.41, -0.30, -0.31, -0.40, 0.04,...
## $ Nov   &amp;lt;dbl&amp;gt; -0.37, -0.42, -0.32, -0.40, -0.48, -0.28, -0.45, -0.19, -0.03...
## $ Dec   &amp;lt;dbl&amp;gt; -0.48, -0.28, -0.42, -0.25, -0.40, 0.00, -0.17, -0.43, -0.26,...
## $ `J-D` &amp;lt;dbl&amp;gt; -0.32, -0.19, -0.21, -0.32, -0.44, -0.40, -0.42, -0.40, -0.22...
## $ `D-N` &amp;lt;dbl&amp;gt; NA, -0.21, -0.20, -0.33, -0.43, -0.44, -0.40, -0.38, -0.24, -...
## $ DJF   &amp;lt;dbl&amp;gt; NA, -0.31, 0.06, -0.56, -0.20, -0.59, -0.45, -0.61, -0.52, -0...
## $ MAM   &amp;lt;dbl&amp;gt; -0.24, -0.02, -0.22, -0.25, -0.56, -0.43, -0.42, -0.35, -0.33...
## $ JJA   &amp;lt;dbl&amp;gt; -0.23, -0.15, -0.30, -0.20, -0.50, -0.44, -0.37, -0.32, -0.06...
## $ SON   &amp;lt;dbl&amp;gt; -0.32, -0.35, -0.34, -0.32, -0.44, -0.30, -0.37, -0.25, -0.04...&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;weather&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 140 x 19
##     Year   Jan    Feb    Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov
##    &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt; &amp;lt;dbl&amp;gt;
##  1  1880 -0.54 -0.38  -0.26  -0.37 -0.11 -0.22 -0.23 -0.24 -0.26 -0.32 -0.37
##  2  1881 -0.19 -0.25   0.02  -0.02 -0.06 -0.36 -0.06 -0.03 -0.23 -0.4  -0.42
##  3  1882  0.22  0.22   0     -0.36 -0.32 -0.38 -0.37 -0.14 -0.17 -0.53 -0.32
##  4  1883 -0.59 -0.67  -0.16  -0.27 -0.32 -0.26 -0.09 -0.26 -0.33 -0.21 -0.4 
##  5  1884 -0.23 -0.11  -0.65  -0.62 -0.42 -0.52 -0.48 -0.5  -0.45 -0.41 -0.48
##  6  1885 -1    -0.37  -0.21  -0.53 -0.55 -0.47 -0.39 -0.44 -0.32 -0.3  -0.28
##  7  1886 -0.68 -0.68  -0.570 -0.34 -0.34 -0.43 -0.2  -0.47 -0.34 -0.31 -0.45
##  8  1887 -1.07 -0.580 -0.36  -0.42 -0.27 -0.2  -0.23 -0.52 -0.17 -0.4  -0.19
##  9  1888 -0.53 -0.59  -0.580 -0.24 -0.16 -0.04  0.04 -0.19 -0.12  0.04 -0.03
## 10  1889 -0.31  0.35   0.07   0.15 -0.05 -0.12 -0.1  -0.16 -0.26 -0.34 -0.61
## # ... with 130 more rows, and 7 more variables: Dec &amp;lt;dbl&amp;gt;, `J-D` &amp;lt;dbl&amp;gt;,
## #   `D-N` &amp;lt;dbl&amp;gt;, DJF &amp;lt;dbl&amp;gt;, MAM &amp;lt;dbl&amp;gt;, JJA &amp;lt;dbl&amp;gt;, SON &amp;lt;dbl&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When you call &lt;code&gt;read_csv()&lt;/code&gt;, it tries to match each column of input to one of the basic data types in R. In our case, &lt;code&gt;read_csv()&lt;/code&gt; parsed the contents of the &lt;code&gt;Year&lt;/code&gt; column as real numbers (doubles) rather than integers. You can correct this with R’s &lt;code&gt;as.integer()&lt;/code&gt; function, or you can read the data in again, this time instructing &lt;code&gt;read_csv()&lt;/code&gt; to parse the column as integers.&lt;/p&gt;
&lt;p&gt;To do this:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;add the argument &lt;code&gt;col_types&lt;/code&gt; to &lt;code&gt;read_csv()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;set the &lt;code&gt;col_types&lt;/code&gt; argument equal to a list.&lt;/li&gt;
&lt;li&gt;add a named element to the list for each column you would like to manually parse; in our case, we want to make column ‘Year’ an integer.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;weather &amp;lt;- read_csv(url, skip = 1, na = &amp;quot;***&amp;quot;,
                   col_types = list(Year = col_integer()))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the code above, &lt;code&gt;Year&lt;/code&gt; is set equal to &lt;code&gt;col_integer()&lt;/code&gt;. More generally, you can set a column name equal to any of the functions below; each function instructs &lt;code&gt;read_csv()&lt;/code&gt; to parse that column as a specific type of data.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Type function&lt;/th&gt;
&lt;th&gt;Data Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;col_character()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;character&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;col_date()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Date&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;col_datetime()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;POSIXct (date-time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;col_double()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;double (numeric)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;col_factor()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;factor&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;col_guess()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;let readr guess (default)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;col_integer()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;integer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;col_logical()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;logical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;col_number()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;numbers mixed with non-number characters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;col_numeric()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;double or integer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;col_skip()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;do not read this column&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;col_time()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;time&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
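&lt;p&gt;Both fixes mentioned above, converting after import with &lt;code&gt;as.integer()&lt;/code&gt; and declaring the type through &lt;code&gt;col_types&lt;/code&gt;, can be sketched on a small inline CSV. The two data rows are made up for illustration, and &lt;code&gt;I()&lt;/code&gt; marks the string as literal data rather than a file name.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(readr)

# made-up rows that mimic the Year and Jan columns
csv_text &amp;lt;- &amp;quot;Year,Jan\n1880,-0.54\n1881,-0.19\n&amp;quot;

# option 1: import first, then convert the column with as.integer()
guessed &amp;lt;- read_csv(I(csv_text))
guessed$Year &amp;lt;- as.integer(guessed$Year)

# option 2: declare the type while importing
declared &amp;lt;- read_csv(I(csv_text), col_types = list(Year = col_integer()))

class(guessed$Year)
class(declared$Year)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Columns that are not named in the list, such as &lt;code&gt;Jan&lt;/code&gt; here, are still guessed by &lt;code&gt;readr&lt;/code&gt;.&lt;/p&gt;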
&lt;/div&gt;
&lt;div id=&#34;importing-csv-files-saved-locally&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Importing CSV files saved locally&lt;/h3&gt;
&lt;p&gt;If you want to read a CSV file you have saved locally on your computer, you must let R know which folder the file lives in; in more technical terms, you have to set the working directory. You can determine the location of your working directory by running &lt;code&gt;getwd()&lt;/code&gt;. You can change the location of your working directory by going to &lt;strong&gt;Session &amp;gt; Set Working Directory&lt;/strong&gt; in the RStudio IDE menu bar, or by using the &lt;code&gt;Ctrl+Shift+H&lt;/code&gt; shortcut on Windows (&lt;code&gt;Cmd+Shift+H&lt;/code&gt; on Mac) to browse for the folder where the file resides.&lt;/p&gt;
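&lt;p&gt;A minimal sketch of reading a local file follows. The file &lt;code&gt;local_example.csv&lt;/code&gt; and its contents are made up, and a temporary folder stands in for the folder where your own file lives; giving &lt;code&gt;read_csv()&lt;/code&gt; the full path is an alternative to changing the working directory.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(readr)

# where R currently looks for files given by relative paths
getwd()

# for illustration, create a small made-up CSV in a temporary folder
data_dir &amp;lt;- tempdir()
writeLines(c(&amp;quot;Year,Jan&amp;quot;, &amp;quot;1880,-0.54&amp;quot;, &amp;quot;1881,-0.19&amp;quot;),
           file.path(data_dir, &amp;quot;local_example.csv&amp;quot;))

# build the full path and import; if the working directory were data_dir,
# read_csv(&amp;quot;local_example.csv&amp;quot;) would find the same file
local_data &amp;lt;- read_csv(file.path(data_dir, &amp;quot;local_example.csv&amp;quot;))
local_data&lt;/code&gt;&lt;/pre&gt;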
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;need-for-speed-enter-data.tablefread&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Need for speed: Enter &lt;code&gt;data.table::fread()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;If you have a file which is fairly large, the &lt;strong&gt;fread()&lt;/strong&gt; function from the &lt;em&gt;data.table&lt;/em&gt; package, &lt;code&gt;data.table::fread()&lt;/code&gt;, can make your life easier. You use it in a similar way to &lt;code&gt;read_csv()&lt;/code&gt;, but it is faster. The following table compares how long it takes to read 1.1 million rows from CDC’s &lt;a href=&#34;https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf&#34;&gt;COVID-19 Case Surveillance Public Use Data&lt;/a&gt;. We downloaded the CSV locally and then read it with base R &lt;code&gt;read.csv()&lt;/code&gt;, &lt;code&gt;readr::read_csv()&lt;/code&gt;, and &lt;code&gt;data.table::fread()&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;library(microbenchmark)
library(readr)       # provides read_csv()
library(data.table)  # provides fread()

# time each reader 10 times on the same local file
mbm &amp;lt;- microbenchmark(
  baseR =   read.csv(&amp;quot;COVID-19_Case_Surveillance_Public_Use_Data.csv&amp;quot;),
  readr =   read_csv(&amp;quot;COVID-19_Case_Surveillance_Public_Use_Data.csv&amp;quot;),
  data.table =   fread(&amp;quot;COVID-19_Case_Surveillance_Public_Use_Data.csv&amp;quot;),
  times = 10
)
mbm&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;caption&gt;Unit: seconds&lt;/caption&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;expr&lt;/th&gt;
&lt;th&gt;min&lt;/th&gt;
&lt;th&gt;lq&lt;/th&gt;
&lt;th&gt;mean&lt;/th&gt;
&lt;th&gt;median&lt;/th&gt;
&lt;th&gt;uq&lt;/th&gt;
&lt;th&gt;max&lt;/th&gt;
&lt;th&gt;neval&lt;/th&gt;
&lt;th&gt;cld&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;baseR&lt;/td&gt;
&lt;td&gt;6.314878&lt;/td&gt;
&lt;td&gt;7.598240&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;9.930234&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;9.556545&lt;/td&gt;
&lt;td&gt;11.353675&lt;/td&gt;
&lt;td&gt;15.014476&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;c&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;readr&lt;/td&gt;
&lt;td&gt;4.450299&lt;/td&gt;
&lt;td&gt;5.270310&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;6.300699&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;6.317381&lt;/td&gt;
&lt;td&gt;7.489372&lt;/td&gt;
&lt;td&gt;8.341003&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;b&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;data.table&lt;/td&gt;
&lt;td&gt;1.080079&lt;/td&gt;
&lt;td&gt;1.950533&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.240901&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2.251069&lt;/td&gt;
&lt;td&gt;2.729169&lt;/td&gt;
&lt;td&gt;3.030911&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;a&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The difference in speed is remarkable; base R is the slowest, with an average loading time of 9.93 seconds. &lt;code&gt;read_csv()&lt;/code&gt; yields an average of 6.3 seconds, and &lt;code&gt;data.table::fread()&lt;/code&gt; reduces loading time to 2.24 seconds.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;other-data-formats&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Other data formats&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;readr&lt;/code&gt; package provides efficient functions for reading and saving common flat file data formats. For other data types, consider using:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Package&lt;/th&gt;
&lt;th&gt;Reads&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;readxl&lt;/td&gt;
&lt;td&gt;Excel files (.xls, .xlsx)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;haven&lt;/td&gt;
&lt;td&gt;SPSS, Stata, and SAS files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;jsonlite&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;xml2&lt;/td&gt;
&lt;td&gt;XML&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;httr&lt;/td&gt;
&lt;td&gt;web APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;rvest&lt;/td&gt;
&lt;td&gt;web pages (web scraping)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;DBI&lt;/td&gt;
&lt;td&gt;databases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;sparklyr&lt;/td&gt;
&lt;td&gt;data loaded into Spark&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;div id=&#34;rio-a-swiss-army-knife-for-data-input-output&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;code&gt;rio&lt;/code&gt;: a swiss-army knife for data input-output&lt;/h3&gt;
&lt;p&gt;A really neat package for importing and exporting data is &lt;code&gt;rio&lt;/code&gt;, whose authors call it &lt;em&gt;A Swiss-Army knife for data input-output&lt;/em&gt;. It works by determining the data structure from the file extension, uses reasonable defaults for data import and export (e.g., &lt;code&gt;stringsAsFactors=FALSE&lt;/code&gt;), and supports web-based import (including from SSL/HTTPS). It also has a useful function, &lt;code&gt;convert()&lt;/code&gt;, that provides a simple method for converting between file types. You can &lt;a href=&#34;https://cran.r-project.org/web/packages/rio/vignettes/rio.html&#34;&gt;read more about &lt;code&gt;rio&lt;/code&gt; here&lt;/a&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;never-work-directly-on-the-raw-data&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Never work directly on the raw data&lt;/h2&gt;
&lt;p&gt;In 2012 Cecilia Giménez, an 83-year-old widow and amateur painter, attempted to restore a century-old fresco of Jesus crowned with thorns in her local church in Borja, Spain. The restoration didn’t go very well, but, surprisingly, the &lt;a href=&#34;https://news.artnet.com/art-world/botched-restoration-of-jesus-fresco-miraculously-saves-spanish-town-197057&#34;&gt;botched restoration of Jesus fresco miraculously saved the Spanish Town&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/restoration.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The most important rule: please do not work on the raw data; it’s unlikely you will have Cecilia Giménez’s good fortune to become (in)famous for your not-so-brilliant work. Make sure you import the data into R, leave the raw data aside, and if you make any changes while tidying and wrangling your data, save the result using &lt;code&gt;write_csv()&lt;/code&gt; with a different file name.&lt;/p&gt;
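&lt;p&gt;One way this workflow could look is sketched below. The file names &lt;code&gt;raw_data.csv&lt;/code&gt; and &lt;code&gt;clean_data.csv&lt;/code&gt; and their contents are made up, and a temporary folder is used so the sketch leaves nothing behind; the point is only that the cleaned data goes to a new file while the raw file stays untouched.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(readr)

# stand-in for a raw data file (made up for this sketch)
raw_file &amp;lt;- file.path(tempdir(), &amp;quot;raw_data.csv&amp;quot;)
raw_lines &amp;lt;- c(&amp;quot;Year,Jan&amp;quot;, &amp;quot;1880,-0.54&amp;quot;, &amp;quot;1881,-0.19&amp;quot;)
writeLines(raw_lines, raw_file)

# import the raw data and make changes on the copy held in memory
cleaned &amp;lt;- read_csv(raw_file)
cleaned$Year &amp;lt;- as.integer(cleaned$Year)

# save the result under a different file name; the raw file is never overwritten
clean_file &amp;lt;- file.path(tempdir(), &amp;quot;clean_data.csv&amp;quot;)
write_csv(cleaned, clean_file)

# the raw file still holds exactly what it held before
identical(readLines(raw_file), raw_lines)&lt;/code&gt;&lt;/pre&gt;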
&lt;/div&gt;
&lt;div id=&#34;other-links&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Other links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html&#34;&gt;For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Sampling and confidence intervals (CI)</title>
      <link>https://bit-2021.netlify.app/example/inference_sampling_ci/</link>
      <pubDate>Wed, 29 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/inference_sampling_ci/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#summary-statistics&#34;&gt;Summary statistics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#boxplots&#34;&gt;Boxplots&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#confidence-intervals-ci&#34;&gt;Confidence Intervals (CI)&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#cis-for-body-mass&#34;&gt;CIs for body mass&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#cis-for-flipper-length&#34;&gt;CIs for flipper length&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-assuming-unequal-variance&#34;&gt;t-Test assuming unequal variance&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-for-body-mass&#34;&gt;t-Test for body mass&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-for-flipper-length&#34;&gt;t-Test for flipper length&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-assuming-equal-variance&#34;&gt;t-Test assuming equal variance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#how-do-we-test-whether-the-two-groups-have-equal-variance&#34;&gt;How do we test whether the two groups have equal variance?&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#bartlett-test-check-equality-of-variances-based-on-the-mean&#34;&gt;Bartlett test: Check equality of variances based on the &lt;em&gt;mean&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#levene-test-check-equality-of-variances-based-on-the-median&#34;&gt;Levene test Check equality of variances based on the &lt;em&gt;median&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#fligner-killeen-test-check-homogeneity-of-variances-based-on-the-median-so-its-more-robust-to-outliers&#34;&gt;Fligner-Killeen test: Check homogeneity of variances based on the median, so it’s more robust to outliers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#acknowledgements&#34;&gt;Acknowledgements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;We are back to dealing with penguins, and we want to explore body mass and flipper length across the three different species.&lt;/p&gt;
&lt;div id=&#34;summary-statistics&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Summary statistics&lt;/h2&gt;
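&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# assumed setup: penguins is taken to come from the palmerpenguins package
library(tidyverse); library(palmerpenguins); library(knitr)
penguins %&amp;gt;%
  group_by(species) %&amp;gt;%
  # the lambda avoids passing na.rm through across()&#39;s `...` (deprecated in newer dplyr)
  summarize(across(c(body_mass_g, flipper_length_mm),
                   ~ mean(.x, na.rm = TRUE))) %&amp;gt;%
  kable()&lt;/code&gt;&lt;/pre&gt;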
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
species
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
body_mass_g
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
flipper_length_mm
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Adelie
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3701
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
190
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Chinstrap
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3733
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
196
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gentoo
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5076
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
217
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div id=&#34;boxplots&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Boxplots&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;body_mass_plot &amp;lt;- ggplot(data = penguins, aes(y = species, x= body_mass_g)) +
  geom_boxplot(aes(color = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(color = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Body mass (in grams) for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Body mass (grams)&amp;quot;)

body_mass_plot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_sampling_ci_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;flipper_plot &amp;lt;- ggplot(data = penguins, aes(y = species, x = flipper_length_mm)) +
  geom_boxplot(aes(color = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(color = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Flipper length for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Flipper length (mm)&amp;quot;)



flipper_plot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_sampling_ci_files/figure-html/unnamed-chunk-2-2.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;confidence-intervals-ci&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Confidence Intervals (CI)&lt;/h2&gt;
&lt;div id=&#34;cis-for-body-mass&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;CIs for body mass&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;formula_ci_body_mass &amp;lt;- penguins %&amp;gt;%
  group_by(species) %&amp;gt;%
  summarise( mean_body_mass = mean(body_mass_g, na.rm = TRUE), 
             sd_mass = sd(body_mass_g, na.rm = TRUE), 
             count = n(), 
             
             # get t-critical value with (n-1) degrees of freedom
             t_critical = qt(0.975, count-1),
             se = sd_mass/sqrt(count),
             margin_of_error = t_critical * se,
             ci_low = mean_body_mass - margin_of_error,
             ci_high = mean_body_mass + margin_of_error
  )


formula_ci_body_mass %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
species
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean_body_mass
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd_mass
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
count
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
t_critical
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
se
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
margin_of_error
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_low
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_high
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Adelie
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3701
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
459
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
152
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
37.2
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
73.5
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3627
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3774
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Chinstrap
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3733
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
384
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
68
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
46.6
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
93.0
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3640
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3826
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gentoo
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5076
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
504
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
124
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
45.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
89.6
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
4986
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5166
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
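&lt;p&gt;The formula used above (mean plus or minus the t-critical value times the standard error) can be checked by hand on a handful of numbers. The values below are made up. One assumption worth flagging: this sketch counts only the non-missing values, whereas &lt;code&gt;n()&lt;/code&gt; counts every row in a group, including rows where the variable is &lt;code&gt;NA&lt;/code&gt;, so the two counts can differ slightly when data are missing.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# made-up body masses in grams, with one missing value
x &amp;lt;- c(3750, 3800, 3250, NA, 3450, 3650)

n  &amp;lt;- sum(!is.na(x))                # number of non-missing values
se &amp;lt;- sd(x, na.rm = TRUE) / sqrt(n)  # standard error of the mean
t_critical &amp;lt;- qt(0.975, n - 1)      # 95% CI with (n-1) degrees of freedom
ci &amp;lt;- mean(x, na.rm = TRUE) + c(-1, 1) * t_critical * se
ci

# t.test() drops missing values itself and reports the same interval
t.test(x)$conf.int&lt;/code&gt;&lt;/pre&gt;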
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#visualise  CIs for all species 
ggplot(formula_ci_body_mass, 
       aes(x=reorder(species, mean_body_mass), 
           y=mean_body_mass, 
           colour=species)) +
  geom_point() +
  scale_colour_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  geom_errorbar(width=.2, aes(ymin=ci_low, ymax=ci_high)) + 
  labs(x=&amp;quot; &amp;quot;,
       y= &amp;quot;Mean body mass (grams)&amp;quot;, 
       title=&amp;quot;Which species has the highest mean weight?&amp;quot;) + 
  theme_minimal()+
  coord_flip()+
  theme(legend.position = &amp;quot;none&amp;quot;)+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_sampling_ci_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# we will draw a violin plot and then use position=&amp;quot;jitter&amp;quot; or geom_jitter() 
# to see how spread out the actual points are

ggplot(data = penguins, aes(y = species, x= body_mass_g)) +
  geom_violin(aes(colour = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(colour = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  
 # superimpose  the mean as a big orange dot
  geom_point(data = formula_ci_body_mass,
             aes(x=mean_body_mass, y = species), colour = &amp;quot;orange&amp;quot;, size = 8)+

  
  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Body mass (in grams) for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Body mass (grams)&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_sampling_ci_files/figure-html/unnamed-chunk-4-2.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;cis-for-flipper-length&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;CIs for flipper length&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;formula_ci_flipper_length &amp;lt;- penguins %&amp;gt;%
  group_by(species) %&amp;gt;%
  summarise( mean_flipper_length = mean(flipper_length_mm, na.rm = TRUE), 
             sd_flipper_length = sd(flipper_length_mm, na.rm = TRUE), 
             count = n(), 
             
             # get t-critical value with (n-1) degrees of freedom
             t_critical = qt(0.975, count-1),
             se = sd_flipper_length/sqrt(count),
             margin_of_error = t_critical * se,
             ci_low = mean_flipper_length - margin_of_error,
             ci_high = mean_flipper_length + margin_of_error
  )


formula_ci_flipper_length %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
species
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean_flipper_length
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd_flipper_length
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
count
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
t_critical
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
se
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
margin_of_error
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_low
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_high
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Adelie
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
190
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6.54
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
152
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.530
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.05
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
189
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
191
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Chinstrap
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
196
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
7.13
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
68
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.865
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.73
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
194
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gentoo
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
217
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6.49
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
124
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.582
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.15
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
216
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
218
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#visualise  CIs for all species 
ggplot(formula_ci_flipper_length, 
       aes(x=reorder(species, mean_flipper_length), 
           y=mean_flipper_length, 
           colour=species)) +
  geom_point() +
  scale_colour_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  geom_errorbar(width=.2, aes(ymin=ci_low, ymax=ci_high)) + 
  labs(x=&amp;quot; &amp;quot;,
       y= &amp;quot;Mean flipper length (mm)&amp;quot;, 
       title=&amp;quot;Which species has the longest mean flipper?&amp;quot;) + 
  theme_minimal()+
  coord_flip()+
  theme(legend.position = &amp;quot;none&amp;quot;)+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_sampling_ci_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# we will draw a violin plot and then use position=&amp;quot;jitter&amp;quot; or geom_jitter() 
# to see how spread out the actual points are

ggplot(data = penguins, aes(y = species, x= flipper_length_mm)) +
  geom_violin(aes(colour = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(colour = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  
 # superimpose  the mean as a big orange dot
  geom_point(data = formula_ci_flipper_length,
             aes(x=mean_flipper_length, y = species), colour = &amp;quot;orange&amp;quot;, size = 8)+

  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Flipper length (in mm) for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Flipper length (mm)&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_sampling_ci_files/figure-html/unnamed-chunk-6-2.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Remember in the penguins data, we saw that Gentoo penguins are very unlike the other two; however, what if we wanted to compare Adelie and Chinstrap both in terms of body mass and flipper length? By looking at the confidence intervals, we already have an indication as to whether there is a difference or not. We will use a t-Test to check if the group means are different.&lt;/p&gt;
&lt;p&gt;Briefly, a t-test is used to assess whether the means of two groups differ. The null hypothesis for a t-test is that the two population means are equal, and the alternative is that they are not.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;t-test-assuming-unequal-variance&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;t-Test assuming unequal variance&lt;/h2&gt;
&lt;p&gt;R’s built-in function for running a t-test is &lt;code&gt;t.test()&lt;/code&gt; and by default R assumes that the variance in the two groups’ populations is not equal.&lt;/p&gt;
&lt;div id=&#34;t-test-for-body-mass&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;t-Test for body mass&lt;/h3&gt;
&lt;p&gt;Remember that in our plots, body mass seemed to be fairly similar. While there was variability between the two species, the two average values were fairly similar and the two confidence intervals overlapped quite a bit.&lt;/p&gt;
&lt;p&gt;When we run our hypothesis test, we must first set up the hypotheses.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Null Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_0\)&lt;/span&gt;&lt;/strong&gt;: There is no difference in &lt;em&gt;mean&lt;/em&gt; body mass measurements between the two species (Adelie and Chinstrap). In other words &lt;span class=&#34;math inline&#34;&gt;\(\mu_1 = \mu_2\)&lt;/span&gt;, or their difference is equal to 0.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alternative Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_1\)&lt;/span&gt;&lt;/strong&gt;: There is a difference in &lt;em&gt;mean&lt;/em&gt; body mass measurements between the two species.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Does the data provide enough evidence to reject the null hypothesis, or could the variation be due to chance? Typically, we want the p-value to be less than 5%, or equivalently the absolute value of the t-statistic to be roughly more than 2, as fairly strong evidence to reject the null hypothesis.&lt;/p&gt;
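&lt;p&gt;The “roughly 2” rule of thumb comes from the t-distribution: with a reasonably large number of degrees of freedom, the two-sided 5% critical value is close to 2. The sketch below illustrates that link. The helper &lt;code&gt;two_sided_p()&lt;/code&gt; and the choice of 150 degrees of freedom are illustrative assumptions, not part of the original analysis.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Two-sided 5% critical value of the t-distribution.
# df = 150 is an illustrative choice, not a value taken from the analysis.
t_critical &amp;lt;- qt(0.975, df = 150)
t_critical

# Hypothetical helper: two-sided p-value for a given t-statistic
two_sided_p &amp;lt;- function(t_stat, df) {
  2 * pt(-abs(t_stat), df = df)
}

# A t-statistic exactly at the critical value corresponds to p = 0.05
two_sided_p(t_critical, df = 150)

# A t-statistic close to zero corresponds to a large p-value
two_sided_p(0.5, df = 150)&lt;/code&gt;&lt;/pre&gt;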
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#select only Adelie and Chinstrap penguins
adelie_chinstrap_test_data &amp;lt;- penguins %&amp;gt;%
  filter(species %in% c(&amp;quot;Adelie&amp;quot;, &amp;quot;Chinstrap&amp;quot;))


test1 &amp;lt;- t.test(body_mass_g ~ species, 
        data = adelie_chinstrap_test_data) 

test1&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Welch Two Sample t-test
## 
## data:  body_mass_g by species
## t = -0.5, df = 152, p-value = 0.6
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -150.4   85.5
## sample estimates:
##    mean in group Adelie mean in group Chinstrap 
##                    3701                    3733&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In our case, the t-test confirms what we already knew. First, the t-value is -0.5 and the p-value is 0.6. Another way to look at it is that the CI for the difference between the two means is [-150.4, 85.5], which contains zero, indicating that we do &lt;strong&gt;not&lt;/strong&gt; have strong evidence to reject the null hypothesis.&lt;/p&gt;
&lt;p&gt;We can use &lt;code&gt;broom::tidy()&lt;/code&gt; to convert these t-test results to a nice data frame.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test1_tidy &amp;lt;- tidy(test1) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test1_tidy&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -32.4     3701.     3733.    -0.543   0.588      152.    -150.      85.5
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A much cleaner output! The estimated average difference in body mass is -32.4g (we subtracted Adelie - Chinstrap, 3701-3733), the t-statistic = -0.543 and the p-value = 0.588, way greater than 0.05.&lt;/p&gt;
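&lt;p&gt;To see where these numbers come from, the Welch t-statistic and its degrees of freedom can be approximated by hand from the summary statistics. This is a minimal sketch: &lt;code&gt;welch_t()&lt;/code&gt; is a hypothetical helper, the means and standard deviations are the rounded values from the confidence interval table, and the Adelie sample size of 151 assumes that one of the 152 Adelie rows has a missing body mass and is dropped by &lt;code&gt;t.test()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical helper: Welch t-statistic and degrees of freedom
# from group means, standard deviations and sample sizes
welch_t &amp;lt;- function(m1, s1, n1, m2, s2, n2) {
  v1 &amp;lt;- s1^2 / n1   # squared standard error of the first group mean
  v2 &amp;lt;- s2^2 / n2   # squared standard error of the second group mean
  se_diff &amp;lt;- sqrt(v1 + v2)
  # Welch-Satterthwaite approximation for the degrees of freedom
  df &amp;lt;- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  c(estimate = m1 - m2,
    statistic = (m1 - m2) / se_diff,
    parameter = df)
}

# Rounded summary statistics for body mass (Adelie first, Chinstrap second);
# n1 = 151 is an assumption (152 rows, one missing value dropped)
body_mass_by_hand &amp;lt;- welch_t(m1 = 3701, s1 = 459, n1 = 151,
                             m2 = 3733, s2 = 384, n2 = 68)
body_mass_by_hand&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the inputs are rounded, the result should only approximately match the output of &lt;code&gt;t.test()&lt;/code&gt;.&lt;/p&gt;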
&lt;/div&gt;
&lt;div id=&#34;t-test-for-flipper-length&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;t-Test for flipper length&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Null Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_0\)&lt;/span&gt;&lt;/strong&gt;: There is no difference in &lt;em&gt;mean&lt;/em&gt; flipper length measurements between the two species (Adelie and Chinstrap).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alternative Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_1\)&lt;/span&gt;&lt;/strong&gt;: There is a difference in &lt;em&gt;mean&lt;/em&gt; flipper length measurements between the two species.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test2 &amp;lt;- t.test(flipper_length_mm ~ species, 
        data = adelie_chinstrap_test_data) 

test2_tidy &amp;lt;- tidy(test2) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test2_tidy&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -5.87      190.      196.     -5.78 6.05e-8      120.    -7.88     -3.86
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here too, the t-test confirms what we already knew. The absolute value of the t-statistic is well above 2 and the p-value is well below 0.05, indicating that we have strong evidence to reject the null hypothesis and conclude that there is a difference in mean flipper length.&lt;/p&gt;
&lt;p&gt;The estimated average difference in flipper length is -5.9mm, the t-statistic is -5.78 and the p-value = 6.05e-08 = &lt;span class=&#34;math inline&#34;&gt;\(6.05 \times 10^{-8} = 0.0000000605\)&lt;/span&gt;, a tiny number which is way less than 0.05.&lt;/p&gt;
&lt;p&gt;So where does this leave us?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In terms of body mass, even though we measured an average difference of 32 grams, this is not statistically significant, as the absolute value of its t-statistic is less than 2 and, equivalently, its p-value is &amp;gt;&amp;gt; 0.05&lt;/li&gt;
&lt;li&gt;In terms of flipper length, the measured average difference of 5.87mm &lt;strong&gt;is&lt;/strong&gt; statistically significant, as the absolute value of the t-statistic is 5.78 and the p-value &amp;lt;&amp;lt;&amp;lt; 0.05&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;t-test-assuming-equal-variance&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;t-Test assuming equal variance&lt;/h2&gt;
&lt;p&gt;We can run &lt;code&gt;t.test()&lt;/code&gt; assuming the two groups have equal variance.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test1_equal_variance &amp;lt;- t.test(body_mass_g ~ species, 
        data = adelie_chinstrap_test_data,
        var.equal = TRUE) # assume equal variance 

test1_tidy_equal_variance &amp;lt;- tidy(test1_equal_variance) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test1_tidy_equal_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -32.4     3701.     3733.    -0.508   0.612       217    -158.      93.4
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test2_equal_variance &amp;lt;- t.test(flipper_length_mm ~ species, 
        data = adelie_chinstrap_test_data,
        var.equal = TRUE) # assume equal variance 

test2_tidy_equal_variance &amp;lt;- tidy(test2_equal_variance) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test2_tidy_equal_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -5.87      190.      196.     -5.97 9.38e-9       217    -7.81     -3.93
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
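&lt;p&gt;With &lt;code&gt;var.equal = TRUE&lt;/code&gt;, the standard error is built from a pooled variance and the degrees of freedom are simply the two sample sizes added together minus 2, which is why the parameter column now shows 217. The following is a minimal sketch of that calculation: &lt;code&gt;pooled_t()&lt;/code&gt; is a hypothetical helper, the inputs are the rounded body mass statistics from the confidence interval table, and the Adelie sample size of 151 assumes one missing value was dropped.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical helper: pooled-variance (equal variance) t-statistic
pooled_t &amp;lt;- function(m1, s1, n1, m2, s2, n2) {
  df &amp;lt;- n1 + n2 - 2
  # pooled variance: weighted average of the two sample variances
  pooled_var &amp;lt;- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df
  se_diff &amp;lt;- sqrt(pooled_var * (1 / n1 + 1 / n2))
  c(statistic = (m1 - m2) / se_diff,
    parameter = df)
}

# Rounded summary statistics for body mass (Adelie first, Chinstrap second);
# n1 = 151 is an assumption (152 rows, one missing value dropped)
body_mass_pooled &amp;lt;- pooled_t(m1 = 3701, s1 = 459, n1 = 151,
                             m2 = 3733, s2 = 384, n2 = 68)
body_mass_pooled&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since the means and standard deviations are rounded, the t-statistic should be close to, but not identical to, the value reported by &lt;code&gt;t.test()&lt;/code&gt;.&lt;/p&gt;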
&lt;/div&gt;
&lt;div id=&#34;how-do-we-test-whether-the-two-groups-have-equal-variance&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;How do we test whether the two groups have equal variance?&lt;/h2&gt;
&lt;p&gt;There are several ways to check if the two groups have equal variance. For all these tests, the null hypothesis is that the two groups have equal variances.&lt;/p&gt;
&lt;p&gt;As in all hypothesis tests, if the p-value is less than 0.05, we reject the null hypothesis and conclude that the two groups have unequal variances.&lt;/p&gt;
&lt;div id=&#34;bartlett-test-check-equality-of-variances-based-on-the-mean&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Bartlett test: Check equality of variances based on the &lt;em&gt;mean&lt;/em&gt;&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;body_mass_variance &amp;lt;- bartlett.test(body_mass_g ~ species, 
        data = adelie_chinstrap_test_data)
body_mass_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Bartlett test of homogeneity of variances
## 
## data:  body_mass_g by species
## Bartlett&amp;#39;s K-squared = 3, df = 1, p-value = 0.1&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;flipper_variance &amp;lt;- bartlett.test(flipper_length_mm ~ species, 
        data = adelie_chinstrap_test_data)
flipper_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Bartlett test of homogeneity of variances
## 
## data:  flipper_length_mm by species
## Bartlett&amp;#39;s K-squared = 0.7, df = 1, p-value = 0.4&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In both cases, since the p-value is greater than 0.05, we cannot reject the null hypothesis so we assume that the two groups have equal variances.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;levene-test-check-equality-of-variances-based-on-the-median&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Levene test: Check equality of variances based on the &lt;em&gt;median&lt;/em&gt;&lt;/h3&gt;
&lt;p&gt;Levene’s test also checks for homogeneity of variance and can be based either on the mean or on the median. The median is a robust statistic, as it’s not influenced by outliers as much as the mean can be.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(body_mass_g ~ species, 
                center = mean,
                data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = mean)
##        Df F value Pr(&amp;gt;F)  
## group   1    4.63  0.032 *
##       217                 
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(flipper_length_mm ~ species, 
                  center = mean,
                  data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = mean)
##        Df F value Pr(&amp;gt;F)
## group   1    0.62   0.43
##       217&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(body_mass_g ~ species, 
                center = median,
                data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = median)
##        Df F value Pr(&amp;gt;F)  
## group   1    4.82  0.029 *
##       217                 
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(flipper_length_mm ~ species, 
                  center = median,
                  data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = median)
##        Df F value Pr(&amp;gt;F)
## group   1    0.62   0.43
##       217&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Checking for homogeneity of variance based on the median, we can reject the null hypothesis for body mass (p-value = 0.029 &amp;lt; 0.05), but not for flipper length.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;fligner-killeen-test-check-homogeneity-of-variances-based-on-the-median-so-its-more-robust-to-outliers&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Fligner-Killeen test: Check homogeneity of variances based on the median, so it’s more robust to outliers&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;fligner.test(body_mass_g ~ species, 
             data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  body_mass_g by species
## Fligner-Killeen:med chi-squared = 4, df = 1, p-value = 0.04&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;fligner.test(flipper_length_mm ~ species, 
              data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  flipper_length_mm by species
## Fligner-Killeen:med chi-squared = 0.5, df = 1, p-value = 0.5&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let us summarise all the p-values from these tests&lt;/p&gt;
&lt;table style=&#34;width:71%;&#34;&gt;
&lt;colgroup&gt;
&lt;col width=&#34;30%&#34; /&gt;
&lt;col width=&#34;16%&#34; /&gt;
&lt;col width=&#34;23%&#34; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;Test&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Body Mass&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Flipper Length&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Bartlett&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Levene (mean)&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.032&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Levene (median)&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.029&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Fligner-Killeen&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.04&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In all of the body mass tests, with the exception of the Bartlett test, the p-value is less than 0.05. In other words, we seem to have enough evidence to conclude that the variances are different.&lt;/p&gt;
&lt;p&gt;However, in all of the flipper length tests, the p-values are &amp;gt; 0.05, which means we cannot reject the null hypothesis, so we’re probably safe assuming the variances are equal and setting &lt;code&gt;var.equal = TRUE&lt;/code&gt;.&lt;/p&gt;
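&lt;p&gt;Rather than copying p-values into a table by hand, they can be collected programmatically. The sketch below shows one way to do this with base R only. It uses simulated data as a stand-in for the penguin measurements, and the names &lt;code&gt;sim_data&lt;/code&gt;, &lt;code&gt;variance_tests&lt;/code&gt; and &lt;code&gt;p_values&lt;/code&gt; are illustrative assumptions. Levene’s test is left out here to avoid depending on the &lt;code&gt;car&lt;/code&gt; package.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Simulated two-group data (a stand-in, not the penguin data)
set.seed(42)
sim_data &amp;lt;- data.frame(
  group = factor(rep(c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;), each = 50)),
  value = c(rnorm(50, mean = 10, sd = 1),
            rnorm(50, mean = 10, sd = 1.5))
)

# Both functions accept a formula and a data argument
variance_tests &amp;lt;- list(bartlett = bartlett.test,
                       fligner_killeen = fligner.test)

# Run each test and keep only its p-value
p_values &amp;lt;- sapply(variance_tests, function(test_fun) {
  test_fun(value ~ group, data = sim_data)$p.value
})

round(p_values, 3)&lt;/code&gt;&lt;/pre&gt;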
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;acknowledgements&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Acknowledgements&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;This page is adapted from the &lt;a href=&#34;https://allisonhorst.github.io/palmerpenguins/articles/examples.html&#34; target=&#34;_blank&#34;&gt;Palmer Penguins package Vignette&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Testing for differences in mean values</title>
      <link>https://bit-2021.netlify.app/example/inference_diff_means/</link>
      <pubDate>Wed, 29 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/inference_diff_means/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#summary-statistics&#34;&gt;Summary statistics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#boxplots&#34;&gt;Boxplots&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#confidence-intervals-ci&#34;&gt;Confidence Intervals (CI)&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#cis-for-body-mass&#34;&gt;CIs for body mass&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#cis-for-flipper-length&#34;&gt;CIs for flipper length&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-assuming-unequal-variance&#34;&gt;t-Test assuming unequal variance&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-for-body-mass&#34;&gt;t-Test for body mass&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-for-flipper-length&#34;&gt;t-Test for flipper length&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#t-test-assuming-equal-variance&#34;&gt;t-Test assuming equal variance&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#how-do-we-test-whether-the-two-groups-have-equal-variance&#34;&gt;How do we test whether the two groups have equal variance?&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#bartlett-test-check-equality-of-variances-based-on-the-mean&#34;&gt;Bartlett test: Check equality of variances based on the &lt;em&gt;mean&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#levene-test-check-equality-of-variances-based-on-the-median&#34;&gt;Levene test Check equality of variances based on the &lt;em&gt;median&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#fligner-killeen-test-check-homogeneity-of-variances-based-on-the-median-so-its-more-robust-to-outliers&#34;&gt;Fligner-Killeen test: Check homogeneity of variances based on the median, so it’s more robust to outliers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#acknowledgements&#34;&gt;Acknowledgements&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;We are back to dealing with penguins, and we want to explore body mass and flipper length across the three different species.&lt;/p&gt;
&lt;div id=&#34;summary-statistics&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Summary statistics&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;penguins %&amp;gt;%
  group_by(species) %&amp;gt;%
  summarize(across(c( body_mass_g, flipper_length_mm),
                   mean, na.rm = TRUE)) %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
species
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
body_mass_g
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
flipper_length_mm
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Adelie
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3701
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
190
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Chinstrap
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3733
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
196
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gentoo
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5076
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
217
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div id=&#34;boxplots&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Boxplots&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;body_mass_plot &amp;lt;- ggplot(data = penguins, aes(y = species, x= body_mass_g)) +
  geom_boxplot(aes(color = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(color = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Body mass (in grams) for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Body mass (grams)&amp;quot;)

body_mass_plot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_diff_means_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;flipper_plot &amp;lt;- ggplot(data = penguins, aes(y = species, x = flipper_length_mm)) +
  geom_boxplot(aes(color = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(color = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Flipper length for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Flipper length (mm)&amp;quot;)



flipper_plot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_diff_means_files/figure-html/unnamed-chunk-2-2.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;confidence-intervals-ci&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Confidence Intervals (CI)&lt;/h2&gt;
&lt;div id=&#34;cis-for-body-mass&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;CIs for body mass&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;formula_ci_body_mass &amp;lt;- penguins %&amp;gt;%
  group_by(species) %&amp;gt;%
  summarise( mean_body_mass = mean(body_mass_g, na.rm = TRUE), 
             sd_mass = sd(body_mass_g, na.rm = TRUE), 
             count = n(), 
             
             # get t-critical value with (n-1) degrees of freedom
             t_critical = qt(0.975, count-1),
             se = sd_mass/sqrt(count),
             margin_of_error = t_critical * se,
             ci_low = mean_body_mass - margin_of_error,
             ci_high = mean_body_mass + margin_of_error
  )


formula_ci_body_mass %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
species
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean_body_mass
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd_mass
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
count
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
t_critical
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
se
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
margin_of_error
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_low
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_high
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Adelie
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3701
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
459
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
152
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
37.2
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
73.5
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3627
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3774
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Chinstrap
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3733
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
384
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
68
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
46.6
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
93.0
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3640
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3826
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gentoo
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5076
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
504
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
124
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
45.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
89.6
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
4986
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5166
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
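&lt;p&gt;As a cross-check of the formula above, here is a minimal sketch (not part of the original analysis) that rebuilds the 95% interval for one species and compares it with the interval reported by &lt;code&gt;t.test()&lt;/code&gt;. It assumes the &lt;code&gt;palmerpenguins&lt;/code&gt; package is installed, and the object names are purely illustrative. Unlike &lt;code&gt;n()&lt;/code&gt;, it counts only the non-missing measurements, so if any body masses are missing its interval will differ slightly from the one in the table.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)

# body mass of Adelie penguins only (may contain NA values)
adelie_mass &amp;lt;- penguins$body_mass_g[penguins$species == &amp;quot;Adelie&amp;quot;]

# count only the measurements that are actually present
n_obs &amp;lt;- sum(!is.na(adelie_mass))

mean_mass &amp;lt;- mean(adelie_mass, na.rm = TRUE)
se_mass   &amp;lt;- sd(adelie_mass, na.rm = TRUE) / sqrt(n_obs)

# mean +/- t-critical (n - 1 degrees of freedom) * standard error
formula_ci &amp;lt;- mean_mass + c(-1, 1) * qt(0.975, df = n_obs - 1) * se_mass

# a one-sample t.test() drops NA values and reports the same kind of interval
builtin_ci &amp;lt;- as.numeric(t.test(adelie_mass)$conf.int)

# TRUE when the hand-built and built-in intervals agree
all.equal(formula_ci, builtin_ci)&lt;/code&gt;&lt;/pre&gt;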
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#visualise  CIs for all species 
ggplot(formula_ci_body_mass, 
       aes(x=reorder(species, mean_body_mass), 
           y=mean_body_mass, 
           colour=species)) +
  geom_point() +
  scale_colour_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  geom_errorbar(width=.2, aes(ymin=ci_low, ymax=ci_high)) + 
  labs(x=&amp;quot; &amp;quot;,
       y= &amp;quot;Mean body mass (grams)&amp;quot;, 
       title=&amp;quot;Which species has the highest mean weight?&amp;quot;) + 
  theme_minimal()+
  coord_flip()+
  theme(legend.position = &amp;quot;none&amp;quot;)+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_diff_means_files/figure-html/unnamed-chunk-4-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# we will draw a violin plot and then use position=&amp;quot;jitter&amp;quot; or geom_jitter() 
# to see how spread out the actual points are

ggplot(data = penguins, aes(y = species, x= body_mass_g)) +
  geom_violin(aes(colour = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(colour = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  
 # superimpose  the mean as a big orange dot
  geom_point(data = formula_ci_body_mass,
             aes(x=mean_body_mass, y = species), colour = &amp;quot;orange&amp;quot;, size = 8)+

  
  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Body mass (in grams) for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Body mass (grams)&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_diff_means_files/figure-html/unnamed-chunk-4-2.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;cis-for-flipper-length&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;CIs for flipper length&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;formula_ci_flipper_length &amp;lt;- penguins %&amp;gt;%
  group_by(species) %&amp;gt;%
  summarise( mean_flipper_length = mean(flipper_length_mm, na.rm = TRUE), 
             sd_flipper_length = sd(flipper_length_mm, na.rm = TRUE), 
             count = n(), # n() counts every row in the group, including rows where flipper_length_mm is NA
             
             # get t-critical value with (n-1) degrees of freedom
             t_critical = qt(0.975, count-1),
             se = sd_flipper_length/sqrt(count),
             margin_of_error = t_critical * se,
             ci_low = mean_flipper_length - margin_of_error,
             ci_high = mean_flipper_length + margin_of_error
  )


formula_ci_flipper_length %&amp;gt;% 
  kable()&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:left;&#34;&gt;
species
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean_flipper_length
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd_flipper_length
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
count
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
t_critical
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
se
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
margin_of_error
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_low
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
ci_high
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Adelie
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
190
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6.54
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
152
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.530
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.05
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
189
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
191
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Chinstrap
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
196
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
7.13
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
68
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2.00
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.865
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.73
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
194
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
198
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:left;&#34;&gt;
Gentoo
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
217
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6.49
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
124
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.98
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
0.582
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1.15
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
216
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
218
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#visualise  CIs for all species 
ggplot(formula_ci_flipper_length, 
       aes(x=reorder(species, mean_flipper_length), 
           y=mean_flipper_length, 
           colour=species)) +
  geom_point() +
  scale_colour_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  geom_errorbar(width=.2, aes(ymin=ci_low, ymax=ci_high)) + 
  labs(x=&amp;quot; &amp;quot;,
       y= &amp;quot;Mean flipper length (mm)&amp;quot;, 
       title=&amp;quot;Which species has the longest mean flipper?&amp;quot;) + 
  theme_minimal()+
  coord_flip()+
  theme(legend.position = &amp;quot;none&amp;quot;)+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_diff_means_files/figure-html/unnamed-chunk-6-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# we will draw a violin plot and then use position=&amp;quot;jitter&amp;quot; or geom_jitter() 
# to see how spread out the actual points are

ggplot(data = penguins, aes(y = species, x= flipper_length_mm)) +
  geom_violin(aes(colour = species), width = 0.3, show.legend = FALSE) +
  geom_jitter(aes(colour = species), alpha = 0.5, show.legend = FALSE, position = position_jitter(width = 0.2, seed = 0)) +
  scale_color_manual(values = c(&amp;quot;darkorange&amp;quot;,&amp;quot;purple&amp;quot;,&amp;quot;cyan4&amp;quot;)) +
  
 # superimpose  the mean as a big orange dot
  geom_point(data = formula_ci_flipper_length,
             aes(x=mean_flipper_length, y = species), colour = &amp;quot;orange&amp;quot;, size = 8)+

  theme_minimal() +
  labs(title = &amp;quot;Penguin size, Palmer Station LTER&amp;quot;,
       subtitle = &amp;quot;Flipper length (in mm) for Adelie, Chinstrap and Gentoo Penguins&amp;quot;,
       y = &amp;quot;Species&amp;quot;,
       x = &amp;quot;Flipper length (mm)&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/inference_diff_means_files/figure-html/unnamed-chunk-6-2.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;In the penguins data, we saw that Gentoo penguins are very unlike the other two species. But what if we wanted to compare Adelie and Chinstrap penguins, both in terms of body mass and flipper length? The confidence intervals already give us an indication of whether there is a difference: intervals that overlap a lot suggest similar means, while clearly separated intervals suggest different means. We will use a t-test to check whether the group means are different.&lt;/p&gt;
&lt;p&gt;Briefly, a t-test is used when we want to assess whether the means of two groups differ. The null hypothesis for a t-test is that the two means are equal, and the alternative is that they are not.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;t-test-assuming-unequal-variance&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;t-Test assuming unequal variance&lt;/h2&gt;
&lt;p&gt;R’s built-in function for running a t-test is &lt;code&gt;t.test()&lt;/code&gt;. By default, it does not assume that the two populations have equal variances, so it runs Welch’s t-test.&lt;/p&gt;
&lt;div id=&#34;t-test-for-body-mass&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;t-Test for body mass&lt;/h3&gt;
&lt;p&gt;Remember that in our plots, body mass seemed to be fairly similar for Adelie and Chinstrap penguins. While there was variability within each species, the two average values were close and the two confidence intervals overlapped quite a bit.&lt;/p&gt;
&lt;p&gt;When we run our hypothesis test, we must first set up the hypotheses.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Null Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_0\)&lt;/span&gt;&lt;/strong&gt;: There is no difference in &lt;em&gt;mean&lt;/em&gt; body mass measurements between the two species (Adelie and Chinstrap). In other words &lt;span class=&#34;math inline&#34;&gt;\(\mu_1 = \mu_2\)&lt;/span&gt;, or their difference is equal to 0.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alternative Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_1\)&lt;/span&gt;&lt;/strong&gt;: There is a difference in &lt;em&gt;mean&lt;/em&gt; body mass measurements between the two species.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Does the data provide enough evidence to reject the null hypothesis, or could the observed difference be due to chance? Typically, we want the p-value to be less than 5% or, roughly equivalently, the absolute value of the t-statistic to be greater than about 2, before we treat the result as fairly strong evidence against the null hypothesis.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#select only Adelie and Chinstrap penguins
adelie_chinstrap_test_data &amp;lt;- penguins %&amp;gt;%
  filter(species %in% c(&amp;quot;Adelie&amp;quot;, &amp;quot;Chinstrap&amp;quot;))


test1 &amp;lt;- t.test(body_mass_g ~ species, 
        data = adelie_chinstrap_test_data) 

test1&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Welch Two Sample t-test
## 
## data:  body_mass_g by species
## t = -0.5, df = 152, p-value = 0.6
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -150.4   85.5
## sample estimates:
##    mean in group Adelie mean in group Chinstrap 
##                    3701                    3733&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In our case, the t-test confirms what we already suspected from the plots. The t-value is -0.5 and the p-value is 0.6. Another way to look at it is that the 95% CI for the difference between the two means is [-150.4, 85.5], which contains zero, indicating that we do &lt;strong&gt;not&lt;/strong&gt; have strong evidence to reject the null hypothesis.&lt;/p&gt;
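&lt;p&gt;To see where the t-value comes from, here is a minimal sketch that computes the unequal-variance (Welch) t-statistic and its approximate degrees of freedom from summary statistics alone. The helper &lt;code&gt;welch_from_summary()&lt;/code&gt; is a hypothetical name, not a function from any package, and the inputs are the rounded Adelie and Chinstrap values from the body mass CI table earlier on this page, so the result only approximates what &lt;code&gt;t.test()&lt;/code&gt; reports on the raw data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Welch t-statistic and Welch-Satterthwaite degrees of freedom
# from group means, standard deviations and sample sizes
welch_from_summary &amp;lt;- function(mean1, sd1, n1, mean2, sd2, n2) {
  var_of_mean1 &amp;lt;- sd1^2 / n1   # squared standard error, group 1
  var_of_mean2 &amp;lt;- sd2^2 / n2   # squared standard error, group 2

  # standard error of the difference between the two means
  se_diff &amp;lt;- sqrt(var_of_mean1 + var_of_mean2)

  t_stat &amp;lt;- (mean1 - mean2) / se_diff
  welch_df &amp;lt;- (var_of_mean1 + var_of_mean2)^2 /
    (var_of_mean1^2 / (n1 - 1) + var_of_mean2^2 / (n2 - 1))

  list(t = t_stat, df = welch_df)
}

# rounded body mass summaries: Adelie first, then Chinstrap
body_mass_welch &amp;lt;- welch_from_summary(3701, 459, 152, 3733, 384, 68)
body_mass_welch&lt;/code&gt;&lt;/pre&gt;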
&lt;p&gt;We can use &lt;code&gt;broom::tidy()&lt;/code&gt; to convert these t-test results to a nice data frame.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test1_tidy &amp;lt;- tidy(test1) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test1_tidy&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -32.4     3701.     3733.    -0.543   0.588      152.    -150.      85.5
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A much cleaner output! The estimated average difference in body mass is -32.4g (we subtracted Adelie - Chinstrap, 3701-3733), the t-statistic = -0.543 and the p-value = 0.588, way greater than 0.05.&lt;/p&gt;
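&lt;p&gt;The link between the t-statistic, its degrees of freedom and the two-sided p-value can be sketched with base R’s &lt;code&gt;pt()&lt;/code&gt; and &lt;code&gt;qt()&lt;/code&gt;. The inputs below are the rounded values from the tidy output above, so the recomputed p-value is only approximate, and the variable names are illustrative.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;t_stat      &amp;lt;- -0.543   # the statistic column
deg_freedom &amp;lt;- 152      # the parameter column (rounded)

# two-sided p-value: probability of a t-statistic at least this far from 0
p_value &amp;lt;- 2 * pt(-abs(t_stat), df = deg_freedom)

# critical value behind the &amp;quot;roughly 2&amp;quot; rule of thumb at the 5% level
t_critical &amp;lt;- qt(0.975, df = deg_freedom)

c(p_value = p_value, t_critical = t_critical)&lt;/code&gt;&lt;/pre&gt;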
&lt;/div&gt;
&lt;div id=&#34;t-test-for-flipper-length&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;t-Test for flipper length&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Null Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_0\)&lt;/span&gt;&lt;/strong&gt;: There is no difference in &lt;em&gt;mean&lt;/em&gt; flipper length measurements between the two species (Adelie and Chinstrap).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Alternative Hypothesis, &lt;span class=&#34;math inline&#34;&gt;\(H_1\)&lt;/span&gt;&lt;/strong&gt;: There is a difference in &lt;em&gt;mean&lt;/em&gt; flipper length measurements between the two species.&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test2 &amp;lt;- t.test(flipper_length_mm ~ species, 
        data = adelie_chinstrap_test_data) 

test2_tidy &amp;lt;- tidy(test2) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test2_tidy&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -5.87      190.      196.     -5.78 6.05e-8      120.    -7.88     -3.86
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Again, the t-test confirms what the plots suggested. The absolute value of the t-statistic is well above 2 and the p-value is well below 0.05, indicating that we have strong evidence to reject the null hypothesis and conclude that there is a difference in mean flipper length.&lt;/p&gt;
&lt;p&gt;The estimated average difference in flipper length is -5.9mm, the t-statistic is -5.78 and the p-value is 6.05e-08 = &lt;span class=&#34;math inline&#34;&gt;\(6.05 \times 10^{-8} = 0.0000000605\)&lt;/span&gt;, a tiny number which is way less than 0.05.&lt;/p&gt;
&lt;p&gt;So where does this leave us?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;In terms of body mass, even though we measured an average difference of 32 grams, this is not statistically significant, as the absolute value of its t-statistic is less than 2 and, equivalently, its p-value is &amp;gt;&amp;gt; 0.05.&lt;/li&gt;
&lt;li&gt;In terms of flipper length, the measured average difference of 5.87mm &lt;strong&gt;is&lt;/strong&gt; statistically significant, as the absolute value of the t-statistic is 5.78 and the p-value is &amp;lt;&amp;lt;&amp;lt; 0.05.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;t-test-assuming-equal-variance&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;t-Test assuming equal variance&lt;/h2&gt;
&lt;p&gt;We can run &lt;code&gt;t.test()&lt;/code&gt; assuming the two groups have equal variance.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test1_equal_variance &amp;lt;- t.test(body_mass_g ~ species, 
        data = adelie_chinstrap_test_data,
        var.equal = TRUE) # assume equal variance 

test1_tidy_equal_variance &amp;lt;- tidy(test1_equal_variance) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test1_tidy_equal_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -32.4     3701.     3733.    -0.508   0.612       217    -158.      93.4
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;test2_equal_variance &amp;lt;- t.test(flipper_length_mm ~ species, 
        data = adelie_chinstrap_test_data,
        var.equal = TRUE) # assume equal variance 

test2_tidy_equal_variance &amp;lt;- tidy(test2_equal_variance) %&amp;gt;% 
  # Calculate difference in means, since t.test() doesn&amp;#39;t actually do that
  mutate(estimate = estimate1 - estimate2) %&amp;gt;%
  # Rearrange columns
  select(starts_with(&amp;quot;estimate&amp;quot;), everything())

test2_tidy_equal_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1 x 10
##   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
##      &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;
## 1    -5.87      190.      196.     -5.97 9.38e-9       217    -7.81     -3.93
## # ... with 2 more variables: method &amp;lt;chr&amp;gt;, alternative &amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;how-do-we-test-whether-the-two-groups-have-equal-variance&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;How do we test whether the two groups have equal variance?&lt;/h2&gt;
&lt;p&gt;There are several ways to check if the two groups have equal variance. For all these tests, the null hypothesis is that the two groups have equal variances.&lt;/p&gt;
&lt;p&gt;As in all hypothesis tests, if the p-value is less than 0.05, we reject the null hypothesis and conclude that the two groups have unequal variances.&lt;/p&gt;
&lt;div id=&#34;bartlett-test-check-equality-of-variances-based-on-the-mean&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Bartlett test: Check equality of variances based on the &lt;em&gt;mean&lt;/em&gt;&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;body_mass_variance &amp;lt;- bartlett.test(body_mass_g ~ species, 
        data = adelie_chinstrap_test_data)
body_mass_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Bartlett test of homogeneity of variances
## 
## data:  body_mass_g by species
## Bartlett&amp;#39;s K-squared = 3, df = 1, p-value = 0.1&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;flipper_variance &amp;lt;- bartlett.test(flipper_length_mm ~ species, 
        data = adelie_chinstrap_test_data)
flipper_variance&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Bartlett test of homogeneity of variances
## 
## data:  flipper_length_mm by species
## Bartlett&amp;#39;s K-squared = 0.7, df = 1, p-value = 0.4&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In both cases, since the p-value is greater than 0.05, we cannot reject the null hypothesis so we assume that the two groups have equal variances.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;levene-test-check-equality-of-variances-based-on-the-median&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Levene test: Check equality of variances based on the &lt;em&gt;median&lt;/em&gt;&lt;/h3&gt;
&lt;p&gt;Levene’s test also checks for homogeneity of variance and can be based either on the mean or on the median. The median is a robust statistic, as it’s not influenced by outliers as much as the mean can be.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(body_mass_g ~ species, 
                center = mean,
                data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = mean)
##        Df F value Pr(&amp;gt;F)  
## group   1    4.63  0.032 *
##       217                 
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(flipper_length_mm ~ species, 
                  center = mean,
                  data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = mean)
##        Df F value Pr(&amp;gt;F)
## group   1    0.62   0.43
##       217&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(body_mass_g ~ species, 
                center = median,
                data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = median)
##        Df F value Pr(&amp;gt;F)  
## group   1    4.82  0.029 *
##       217                 
## ---
## Signif. codes:  0 &amp;#39;***&amp;#39; 0.001 &amp;#39;**&amp;#39; 0.01 &amp;#39;*&amp;#39; 0.05 &amp;#39;.&amp;#39; 0.1 &amp;#39; &amp;#39; 1&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;car::leveneTest(flipper_length_mm ~ species, 
                  center = median,
                  data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Levene&amp;#39;s Test for Homogeneity of Variance (center = median)
##        Df F value Pr(&amp;gt;F)
## group   1    0.62   0.43
##       217&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Checking for homogeneity of variance based on the median, we can reject the null hypothesis for body mass (p-value = 0.029 &amp;lt; 0.05), but not for flipper length.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;fligner-killeen-test-check-homogeneity-of-variances-based-on-the-median-so-its-more-robust-to-outliers&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Fligner-Killeen test: Check homogeneity of variances based on the median, so it’s more robust to outliers&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;fligner.test(body_mass_g ~ species, 
             data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  body_mass_g by species
## Fligner-Killeen:med chi-squared = 4, df = 1, p-value = 0.04&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;fligner.test(flipper_length_mm ~ species, 
              data = adelie_chinstrap_test_data)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## 
##  Fligner-Killeen test of homogeneity of variances
## 
## data:  flipper_length_mm by species
## Fligner-Killeen:med chi-squared = 0.5, df = 1, p-value = 0.5&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let us summarise all the p-values from these tests:&lt;/p&gt;
&lt;table style=&#34;width:71%;&#34;&gt;
&lt;colgroup&gt;
&lt;col width=&#34;30%&#34; /&gt;
&lt;col width=&#34;16%&#34; /&gt;
&lt;col width=&#34;23%&#34; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;Test&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Body Mass&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;Flipper Length&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Bartlett&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Levene (mean)&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.032&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Levene (median)&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.029&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.43&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;&lt;strong&gt;Fligner-Killeen&lt;/strong&gt;&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.04&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;In all of the body mass tests, with the exception of the Bartlett test, the p-value is less than 0.05. In other words, we seem to have enough evidence to conclude that the variances are different, so for body mass the default unequal-variance t-test is the safer choice.&lt;/p&gt;
&lt;p&gt;However, in the flipper length tests, all of the p-values are &amp;gt; 0.05, which means we cannot reject the null hypothesis, so we’re probably safe assuming the variances are equal and setting &lt;code&gt;var.equal = TRUE&lt;/code&gt;.&lt;/p&gt;
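&lt;p&gt;The summary table above was filled in by hand. As a rough sketch of how it could be assembled in code, the helper below (a hypothetical name, &lt;code&gt;variance_test_p_values()&lt;/code&gt;) runs the same four tests for one response variable and returns their p-values. It assumes the &lt;code&gt;palmerpenguins&lt;/code&gt; and &lt;code&gt;car&lt;/code&gt; packages are installed, and it rebuilds the two-species data with base R so that the block runs on its own.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)

# keep Adelie and Chinstrap only, and drop the unused Gentoo level
adelie_chinstrap &amp;lt;- droplevels(
  subset(penguins, species %in% c(&amp;quot;Adelie&amp;quot;, &amp;quot;Chinstrap&amp;quot;))
)

variance_test_p_values &amp;lt;- function(response, data) {
  # build e.g. body_mass_g ~ species from the column name
  test_formula &amp;lt;- reformulate(&amp;quot;species&amp;quot;, response = response)

  c(
    bartlett      = bartlett.test(test_formula, data = data)$p.value,
    levene_mean   = car::leveneTest(test_formula, data = data, center = mean)[1, &amp;quot;Pr(&amp;gt;F)&amp;quot;],
    levene_median = car::leveneTest(test_formula, data = data, center = median)[1, &amp;quot;Pr(&amp;gt;F)&amp;quot;],
    fligner       = fligner.test(test_formula, data = data)$p.value
  )
}

# one column of p-values per response variable, one row per test
p_value_table &amp;lt;- sapply(
  c(body_mass_g = &amp;quot;body_mass_g&amp;quot;, flipper_length_mm = &amp;quot;flipper_length_mm&amp;quot;),
  variance_test_p_values,
  data = adelie_chinstrap
)

round(p_value_table, 3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Building the formula from a column name keeps the four calls identical for both variables, which makes it harder for a copied number to drift out of sync with the test output.&lt;/p&gt;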
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;acknowledgements&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Acknowledgements&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;This page is adapted from the &lt;a href=&#34;https://allisonhorst.github.io/palmerpenguins/articles/examples.html&#34; target=&#34;_blank&#34;&gt;Palmer Penguins package Vignette&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Inspect data</title>
      <link>https://bit-2021.netlify.app/example/eda-inspect-data/</link>
      <pubDate>Fri, 24 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/eda-inspect-data/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#overview&#34;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#dplyrglimpse&#34;&gt;&lt;code&gt;dplyr::glimpse()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#skimrskim&#34;&gt;&lt;code&gt;skimr::skim()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#on-you-own&#34;&gt;On your own&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Learning Objectives &lt;br&gt;
1. Glimpse the structure of the dataframe &lt;br&gt;
2. Summarize the structure of a dataframe&lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div id=&#34;overview&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;p&gt;Once you have loaded your data set into R, you have to inspect and get a feel for the data. We are typically interested in the following:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;The dimensions of the data set; how many rows (cases) and how many columns does the data frame have.&lt;/li&gt;
&lt;li&gt;The types of variables we have; are they integer, character, logical, factor (categorical) etc.&lt;/li&gt;
&lt;li&gt;The number of missing, or &lt;code&gt;NA&lt;/code&gt;, values in the dataframe.&lt;/li&gt;
&lt;li&gt;A quick look at some summary statistics&lt;/li&gt;
&lt;/ol&gt;
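&lt;p&gt;Each of these four checks also has a base R one-liner. The sketch below is only an illustration, assuming the &lt;code&gt;gapminder&lt;/code&gt; package is installed; the functions discussed in the rest of this page give the same information in a friendlier format.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(gapminder)

# 1. dimensions: number of rows (cases) and columns (variables)
dim(gapminder)

# 2. type (class) of each variable
sapply(gapminder, class)

# 3. number of missing (NA) values in each column
colSums(is.na(gapminder))

# 4. quick summary statistics for every column
summary(gapminder)&lt;/code&gt;&lt;/pre&gt;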
&lt;p&gt;First, if we wanted to look at the dataset in a spreadsheet-style data viewer, we can just invoke &lt;code&gt;View(gapminder)&lt;/code&gt; (&lt;strong&gt;V&lt;/strong&gt;iew with a capital &lt;strong&gt;V&lt;/strong&gt;).&lt;/p&gt;
&lt;p&gt;While this is nice, it is not very useful, as we cannot dig deeper and see what kind of variables we have, whether there are any missing values, etc.&lt;/p&gt;
&lt;p&gt;There are two functions that we will talk about, &lt;code&gt;dplyr::glimpse()&lt;/code&gt; and &lt;code&gt;skimr::skim()&lt;/code&gt;.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;dplyrglimpse&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;dplyr::glimpse()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;glimpse()&lt;/code&gt; is like a transposed version of &lt;code&gt;print()&lt;/code&gt;: it first gives us the dimensions (rows and columns), and then, for each column (or variable) of the dataframe, its type (&lt;code&gt;&amp;lt;fct&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;int&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;dbl&amp;gt;&lt;/code&gt;) and its first few values. Let us look at the output of &lt;code&gt;glimpse(gapminder)&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(gapminder)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 1,704
## Columns: 6
## $ country   &amp;lt;fct&amp;gt; Afghanistan, Afghanistan, Afghanistan, Afghanistan, Afgha...
## $ continent &amp;lt;fct&amp;gt; Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asi...
## $ year      &amp;lt;int&amp;gt; 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 199...
## $ lifeExp   &amp;lt;dbl&amp;gt; 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 4...
## $ pop       &amp;lt;int&amp;gt; 8425333, 9240934, 10267083, 11537966, 13079460, 14880372,...
## $ gdpPercap &amp;lt;dbl&amp;gt; 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.113...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We see that we have 1704 rows, or cases. We also have 6 columns, or variables, and right underneath we see each column individually:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;country&lt;/code&gt; is a categorical, or factor, variable of type &lt;code&gt;&amp;lt;fct&amp;gt;&lt;/code&gt;. The first few cases are all Afghanistan, just because it is the first one alphabetically.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;continent&lt;/code&gt; is also a factor variable, whose values are continent names such as “Asia”, “Europe”, etc.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;year&lt;/code&gt; is an integer variable of type &lt;code&gt;&amp;lt;int&amp;gt;&lt;/code&gt;. This is the year for which we have data for each country, between 1952 and 2007 in 5-year intervals.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;lifeExp&lt;/code&gt; is a double precision, or real number, of type &lt;code&gt;&amp;lt;dbl&amp;gt;&lt;/code&gt; that refers to life expectancy&lt;/li&gt;
&lt;li&gt;&lt;code&gt;pop&lt;/code&gt; is an integer variable of type &lt;code&gt;&amp;lt;int&amp;gt;&lt;/code&gt; that refers to the population&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gdpPercap&lt;/code&gt; is a double precision, or real number, of type &lt;code&gt;&amp;lt;dbl&amp;gt;&lt;/code&gt; that refers to GDP per capita&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;skimrskim&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;skimr::skim()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;While &lt;code&gt;glimpse()&lt;/code&gt; allows us to look at the contents of the dataframe, &lt;code&gt;skimr::skim()&lt;/code&gt; is more useful and I always use it in my workflow.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;skimr::skim(gapminder)&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;caption&gt;&lt;span id=&#34;tab:unnamed-chunk-2&#34;&gt;Table 1: &lt;/span&gt;Data summary&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Name&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;gapminder&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of rows&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1704&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of columns&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;_______________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Column type frequency:&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;factor&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;numeric&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;________________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Group variables&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: factor&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;ordered&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_unique&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;top_counts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;country&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FALSE&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;142&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;Afg: 12, Alb: 12, Alg: 12, Ang: 12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;continent&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FALSE&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;Afr: 624, Asi: 396, Eur: 360, Ame: 300&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: numeric&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;colgroup&gt;
&lt;col width=&#34;8%&#34; /&gt;
&lt;col width=&#34;6%&#34; /&gt;
&lt;col width=&#34;8%&#34; /&gt;
&lt;col width=&#34;7%&#34; /&gt;
&lt;col width=&#34;8%&#34; /&gt;
&lt;col width=&#34;5%&#34; /&gt;
&lt;col width=&#34;6%&#34; /&gt;
&lt;col width=&#34;6%&#34; /&gt;
&lt;col width=&#34;7%&#34; /&gt;
&lt;col width=&#34;8%&#34; /&gt;
&lt;col width=&#34;25%&#34; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;mean&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;sd&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p0&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p25&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p50&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p75&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p100&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;hist&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;year&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1979.50&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;17.27&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1952.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1965.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1979.50&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1993.25&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2007.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▅▅▅▇&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;lifeExp&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;59.47&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;12.92&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;23.60&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;48.20&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;60.71&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;70.85&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;82.6&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▁▆▇▇▇&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;pop&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;29601212.32&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;106157896.74&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;60011.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2793664.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7023595.50&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;19585221.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1318683096.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▁▁▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;gdpPercap&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7215.33&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;9857.45&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;241.17&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1202.06&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3531.85&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;9325.46&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;113523.1&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▁▁▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;code&gt;skimr::skim()&lt;/code&gt; gives us a data summary with the dimensions (rows and columns) of the dataframe and the types of its columns; in this case, we have 2 factor and 4 numeric columns (variables).&lt;/p&gt;
&lt;p&gt;For all variable types, it gives us the number of missing values (&lt;code&gt;n_missing&lt;/code&gt;) and the &lt;code&gt;complete_rate&lt;/code&gt;; in the gapminder data, there are no missing values, so &lt;code&gt;n_missing = 0&lt;/code&gt; and &lt;code&gt;complete_rate = 1&lt;/code&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;For factor variables&lt;/strong&gt;, skim() provides information on
&lt;ul&gt;
&lt;li&gt;whether it is an &lt;code&gt;ordered&lt;/code&gt; factor; if false, the default ordering is alphabetical, otherwise one has to explicitly specify the order of the categories.&lt;/li&gt;
&lt;li&gt;the &lt;code&gt;n_unique&lt;/code&gt;, or number of distinct values of each factor; in &lt;code&gt;gapminder&lt;/code&gt; we have data on 142 distinct countries and 5 continents&lt;/li&gt;
&lt;li&gt;the &lt;code&gt;top_counts&lt;/code&gt;, which shows the most frequent levels of each factor and how often they occur; each &lt;code&gt;country&lt;/code&gt; has 12 observations, but in &lt;code&gt;continent&lt;/code&gt; Africa has 624 observations, Asia 396, etc.&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;For numeric variables&lt;/strong&gt;, skim() provides summary statistics: the mean, the standard deviation, and the 0th (min), 25th, 50th (median), 75th and 100th (max) percentiles. It also gives us a rough histogram to get an idea of the shape of the distribution (normal, skewed, uniform).&lt;/li&gt;
&lt;/ul&gt;
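&lt;p&gt;As a rough illustration of where &lt;code&gt;n_missing&lt;/code&gt; and &lt;code&gt;complete_rate&lt;/code&gt; come from, the sketch below computes them by hand in base R. The data frame &lt;code&gt;toy&lt;/code&gt; and its columns are made up for this example, and this is not how &lt;code&gt;skimr&lt;/code&gt; is implemented internally; it only reproduces the same arithmetic.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical toy data with one missing value in each column
toy &amp;lt;- data.frame(
  height = c(1.7, NA, 1.8, 1.6),
  group  = factor(c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;, NA, &amp;quot;a&amp;quot;))
)

# Number of missing values in each column
n_missing &amp;lt;- colSums(is.na(toy))

# Share of non-missing values in each column
complete_rate &amp;lt;- 1 - n_missing / nrow(toy)

n_missing
complete_rate&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A column with no missing values would have &lt;code&gt;n_missing = 0&lt;/code&gt; and &lt;code&gt;complete_rate = 1&lt;/code&gt;, which is what we see for every variable in &lt;code&gt;gapminder&lt;/code&gt;.&lt;/p&gt;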
&lt;/div&gt;
&lt;div id=&#34;on-you-own&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;On your own&lt;/h2&gt;
&lt;p&gt;The following dataframe has data on London’s cycle hire scheme, &lt;a href=&#34;https://tfl.gov.uk/modes/cycling/santander-cycles&#34;&gt;Santander Cycles&lt;/a&gt;. Besides the number of bikes rented out, the dataframe also contains weather information.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bikes &amp;lt;- read_csv(here(&amp;quot;data&amp;quot;, &amp;quot;londonBikes.csv&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;skimr::skim(bikes)&lt;/code&gt;&lt;/pre&gt;
&lt;table&gt;
&lt;caption&gt;&lt;span id=&#34;tab:bikes3&#34;&gt;Table 2: &lt;/span&gt;Data summary&lt;/caption&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Name&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;bikes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of rows&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;3439&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Number of columns&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;_______________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Column type frequency:&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;character&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;logical&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;numeric&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;________________________&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;Group variables&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: character&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;min&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;max&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;empty&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_unique&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;whitespace&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;date&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;8&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3439&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: logical&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;mean&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;count&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;rain&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;851&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.62&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;TRU: 1595, FAL: 993&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;fog&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;851&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.07&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FAL: 2403, TRU: 185&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;thunderstorm&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;851&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.03&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FAL: 2512, TRU: 76&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;snow&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;851&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.75&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.02&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;FAL: 2533, TRU: 55&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;Variable type: numeric&lt;/strong&gt;&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th align=&#34;left&#34;&gt;skim_variable&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;n_missing&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;complete_rate&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;mean&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;sd&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p0&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p25&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p50&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p75&lt;/th&gt;
&lt;th align=&#34;right&#34;&gt;p100&lt;/th&gt;
&lt;th align=&#34;left&#34;&gt;hist&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;bikes_hired&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;26158.95&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;9135.13&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3531.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;19626.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;26022.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;32759.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;73094.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▃▇▅▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;season&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.46&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.12&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;2.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;4.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▇▁▇▇&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;max_temp&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1877&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.45&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;16.48&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.19&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;-1.2&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11.93&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;16.7&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;20.9&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;36.7&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▁▆▇▃▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;min_temp&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1929&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.44&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7.62&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5.14&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;-8.2&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.90&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7.9&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11.8&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;20.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▁▅▇▇▂&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;avg_temp&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;27&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.99&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11.70&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;5.41&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;-4.1&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;7.60&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;11.6&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;15.9&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;28.6&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▁▆▇▅▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;avg_humidity&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;745&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.78&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;74.91&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10.84&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;37.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;67.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;76.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;83.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;100.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▁▂▆▇▂&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;avg_pressure&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;773&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.78&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1015.10&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10.24&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;979.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1009.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1016.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1022.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1044.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▁▂▇▆▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td align=&#34;left&#34;&gt;avg_windspeed&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;745&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.78&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;14.01&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;6.10&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;10.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;13.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;18.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;47.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▇▂▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td align=&#34;left&#34;&gt;rainfall_mm&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;51&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.99&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.67&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;3.68&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.00&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;0.0&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;1.5&lt;/td&gt;
&lt;td align=&#34;right&#34;&gt;48.0&lt;/td&gt;
&lt;td align=&#34;left&#34;&gt;▇▁▁▁▁&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;blockquote&gt;
&lt;p&gt;A couple of graded learnr interactive exercises?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;What kind of variable is &lt;code&gt;date&lt;/code&gt;? What kind of variable is &lt;code&gt;season&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;How often does it rain in London?&lt;/li&gt;
&lt;li&gt;What is the average annual temperature (in degrees C)?&lt;/li&gt;
&lt;li&gt;What is the maximum rainfall?&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Simulation-based tests</title>
      <link>https://bit-2021.netlify.app/example/inference_simulating_t_tests/</link>
      <pubDate>Wed, 29 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/inference_simulating_t_tests/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#there-is-only-one-test&#34;&gt;&lt;em&gt;There is only one test&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#bootstrap-simulation-using-infer-package&#34;&gt;Bootstrap simulation, using &lt;code&gt;infer&lt;/code&gt; package&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#ci-for-median&#34;&gt;CI for median&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;there-is-only-one-test&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;em&gt;There is only one test&lt;/em&gt;&lt;/h2&gt;
&lt;p&gt;We can use &lt;code&gt;t.test()&lt;/code&gt; to check whether two means are equal or not. Instead of checking the assumptions the data must meet and finding the appropriate statistical test (for example, first testing for equality of variances), we can use the power of bootstrapping, permutation, and simulation to construct a null distribution, calculate confidence intervals and run any kind of test for inference.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The beauty and power of simulation is that we can run hypothesis tests not just on differences of &lt;strong&gt;means&lt;/strong&gt;, which we have formulas for, but also on other parameters, like &lt;strong&gt;medians&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;According to Allen Downey, &lt;a href=&#34;http://allendowney.blogspot.com/2016/06/there-is-still-only-one-test.html&#34; target=&#34;_blank&#34;&gt;there is only one statistical test&lt;/a&gt;, and all statistical tests follow the same pattern:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Step 1: Calculate a sample statistic from the observed data, or &lt;span class=&#34;math inline&#34;&gt;\(\delta^{*}\)&lt;/span&gt;. This is the measure you care about: the difference in means, the mean, the median, the proportion, the difference in proportions, anything you want really!&lt;/li&gt;
&lt;li&gt;Step 2: Use simulation to invent a world where &lt;span class=&#34;math inline&#34;&gt;\(\delta\)&lt;/span&gt; is null. Simulate what the world would look like if there was no difference between two groups, or if there was no difference in medians or proportions, or where the average value is a specific number.&lt;/li&gt;
&lt;li&gt;Step 3: Look at &lt;span class=&#34;math inline&#34;&gt;\(\delta^{*}\)&lt;/span&gt; in the null world. Put the observed sample statistic in the null world and see if it fits well.&lt;/li&gt;
&lt;li&gt;Step 4: Calculate the probability that &lt;span class=&#34;math inline&#34;&gt;\(\delta^{*}\)&lt;/span&gt; could exist in the null world. This is the p-value, or the probability that you’d see a &lt;span class=&#34;math inline&#34;&gt;\(\delta^{*}\)&lt;/span&gt; at least that extreme in a world where there’s no difference.&lt;/li&gt;
&lt;li&gt;Step 5: Decide if &lt;span class=&#34;math inline&#34;&gt;\(\delta^{*}\)&lt;/span&gt; is statistically significant. Choose a threshold or cutoff value (e.g., 0.05) for deciding if there’s sufficient evidence for rejecting the null world.&lt;/li&gt;
&lt;/ul&gt;
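&lt;p&gt;The five steps can be sketched in a few lines of base R. This is a minimal permutation test on made-up data, not the &lt;code&gt;infer&lt;/code&gt; workflow; the vectors &lt;code&gt;group&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt;, the choice of a difference in medians, the 2000 shuffles and the 0.05 cutoff are all assumptions made for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(42)  # make the simulation reproducible

# Made-up data: two groups of 30 observations each
group &amp;lt;- rep(c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;), each = 30)
value &amp;lt;- c(rnorm(30, mean = 10), rnorm(30, mean = 11))

# Step 1: observed sample statistic (difference in medians)
delta_star &amp;lt;- median(value[group == &amp;quot;b&amp;quot;]) - median(value[group == &amp;quot;a&amp;quot;])

# Step 2: build the null world by shuffling the group labels
null_deltas &amp;lt;- replicate(2000, {
  shuffled &amp;lt;- sample(group)
  median(value[shuffled == &amp;quot;b&amp;quot;]) - median(value[shuffled == &amp;quot;a&amp;quot;])
})

# Steps 3 and 4: share of null-world differences at least as extreme
# as the observed one (two-sided p-value)
p_value &amp;lt;- mean(abs(null_deltas) &amp;gt;= abs(delta_star))

# Step 5: compare with the chosen threshold
p_value &amp;lt; 0.05&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Shuffling the labels breaks any link between group and value, which is one way of simulating a world with no difference between the groups. Swapping &lt;code&gt;median()&lt;/code&gt; for &lt;code&gt;mean()&lt;/code&gt; or any other statistic leaves the rest of the code unchanged.&lt;/p&gt;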
&lt;/div&gt;
&lt;div id=&#34;bootstrap-simulation-using-infer-package&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Bootstrap simulation, using &lt;code&gt;infer&lt;/code&gt; package&lt;/h2&gt;
&lt;/div&gt;
&lt;div id=&#34;ci-for-median&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;CI for median&lt;/h2&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Clean data</title>
      <link>https://bit-2021.netlify.app/example/eda-clean-data/</link>
      <pubDate>Fri, 24 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/eda-clean-data/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#janitor-package-for-cleaning-variable-names&#34;&gt;&lt;code&gt;janitor&lt;/code&gt; package for cleaning variable names&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#code-that-works-is-not-necessarily-good-code&#34;&gt;&lt;em&gt;Code that works is not necessarily good code&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#other-links&#34;&gt;Other links&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;janitor-package-for-cleaning-variable-names&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;janitor&lt;/code&gt; package for cleaning variable names&lt;/h2&gt;
&lt;p&gt;When we create data files, we frequently use variable names and formats that are easily readable for humans, but not so for computers.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Data scientists, according to interviews and expert estimates, spend from 50 percent to 80 percent of their time mired in this more mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets.
– &lt;a href=&#34;https://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html&#34;&gt;For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights&lt;/a&gt; &lt;em&gt;The New York Times, 2014&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;code&gt;janitor&lt;/code&gt; has many functions, but its core function is &lt;code&gt;clean_names()&lt;/code&gt;, which will make your life easier if you call it whenever you load data into R. The following example is taken from &lt;a href=&#34;https://www.rdocumentation.org/packages/janitor/versions/1.2.0&#34;&gt;janitor’s documentation page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Let us read a roster of teachers at a fictional American high school, stored in the Microsoft Excel file &lt;code&gt;dirty_data.xlsx&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/dirty_data.png&#34; width=&#34;80%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Some of the variable names, e.g., &lt;code&gt;First Name&lt;/code&gt;, &lt;code&gt;Last Name&lt;/code&gt;, are not only capitalised, but also contain a space in the variable name. Let us read in the file and have a glimpse inside it.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;roster &amp;lt;- readxl::read_excel(here(&amp;quot;data&amp;quot;, &amp;quot;dirty_data.xlsx&amp;quot;))

glimpse(roster)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 13
## Columns: 11
## $ `First Name`        &amp;lt;chr&amp;gt; &amp;quot;Jason&amp;quot;, &amp;quot;Jason&amp;quot;, &amp;quot;Alicia&amp;quot;, &amp;quot;Ada&amp;quot;, &amp;quot;Desus&amp;quot;, &amp;quot;Ch...
## $ `Last Name`         &amp;lt;chr&amp;gt; &amp;quot;Bourne&amp;quot;, &amp;quot;Bourne&amp;quot;, &amp;quot;Keys&amp;quot;, &amp;quot;Lovelace&amp;quot;, &amp;quot;Nice&amp;quot;,...
## $ `Employee Status`   &amp;lt;chr&amp;gt; &amp;quot;Teacher&amp;quot;, &amp;quot;Teacher&amp;quot;, &amp;quot;Teacher&amp;quot;, &amp;quot;Teacher&amp;quot;, &amp;quot;Ad...
## $ Subject             &amp;lt;chr&amp;gt; &amp;quot;PE&amp;quot;, &amp;quot;Drafting&amp;quot;, &amp;quot;Music&amp;quot;, NA, &amp;quot;Dean&amp;quot;, &amp;quot;Physics...
## $ `Hire Date`         &amp;lt;dbl&amp;gt; 39690, 39690, 37118, 27515, 41431, 11037, 11037...
## $ `% Allocated`       &amp;lt;dbl&amp;gt; 0.75, 0.25, 1.00, 1.00, 1.00, 0.50, 0.50, NA, 0...
## $ `Full time?`        &amp;lt;chr&amp;gt; &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;...
## $ `do not edit! ---&amp;gt;` &amp;lt;lgl&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ Certification...9   &amp;lt;chr&amp;gt; &amp;quot;Physical ed&amp;quot;, &amp;quot;Physical ed&amp;quot;, &amp;quot;Instr. music&amp;quot;, &amp;quot;...
## $ Certification...10  &amp;lt;chr&amp;gt; &amp;quot;Theater&amp;quot;, &amp;quot;Theater&amp;quot;, &amp;quot;Vocal music&amp;quot;, &amp;quot;Computers...
## $ Certification...11  &amp;lt;lgl&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We notice that if we wanted to refer to the variable for a first name (1st in the list) or percent allocated (6th in the list), we would need to wrap their names in backticks, as in &lt;code&gt;`First Name`&lt;/code&gt; and &lt;code&gt;`% Allocated`&lt;/code&gt; respectively, because they contain spaces and special characters. To avoid this, we can use &lt;code&gt;janitor::clean_names()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;roster_clean &amp;lt;- roster %&amp;gt;% 
  clean_names()

glimpse(roster_clean)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 13
## Columns: 11
## $ first_name        &amp;lt;chr&amp;gt; &amp;quot;Jason&amp;quot;, &amp;quot;Jason&amp;quot;, &amp;quot;Alicia&amp;quot;, &amp;quot;Ada&amp;quot;, &amp;quot;Desus&amp;quot;, &amp;quot;Chie...
## $ last_name         &amp;lt;chr&amp;gt; &amp;quot;Bourne&amp;quot;, &amp;quot;Bourne&amp;quot;, &amp;quot;Keys&amp;quot;, &amp;quot;Lovelace&amp;quot;, &amp;quot;Nice&amp;quot;, &amp;quot;...
## $ employee_status   &amp;lt;chr&amp;gt; &amp;quot;Teacher&amp;quot;, &amp;quot;Teacher&amp;quot;, &amp;quot;Teacher&amp;quot;, &amp;quot;Teacher&amp;quot;, &amp;quot;Admi...
## $ subject           &amp;lt;chr&amp;gt; &amp;quot;PE&amp;quot;, &amp;quot;Drafting&amp;quot;, &amp;quot;Music&amp;quot;, NA, &amp;quot;Dean&amp;quot;, &amp;quot;Physics&amp;quot;,...
## $ hire_date         &amp;lt;dbl&amp;gt; 39690, 39690, 37118, 27515, 41431, 11037, 11037, ...
## $ percent_allocated &amp;lt;dbl&amp;gt; 0.75, 0.25, 1.00, 1.00, 1.00, 0.50, 0.50, NA, 0.5...
## $ full_time         &amp;lt;chr&amp;gt; &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, &amp;quot;Yes&amp;quot;, ...
## $ do_not_edit       &amp;lt;lgl&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
## $ certification_9   &amp;lt;chr&amp;gt; &amp;quot;Physical ed&amp;quot;, &amp;quot;Physical ed&amp;quot;, &amp;quot;Instr. music&amp;quot;, &amp;quot;PE...
## $ certification_10  &amp;lt;chr&amp;gt; &amp;quot;Theater&amp;quot;, &amp;quot;Theater&amp;quot;, &amp;quot;Vocal music&amp;quot;, &amp;quot;Computers&amp;quot;,...
## $ certification_11  &amp;lt;lgl&amp;gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now, the variable names contain no spaces, are all lower case, and we can refer to them directly without any special quoting; it all makes life a bit easier!&lt;/p&gt;
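&lt;p&gt;To see the difference in practice, here is a small base R sketch. The data frame &lt;code&gt;roster_toy&lt;/code&gt; is made up for this example, and the renaming is done by hand as a stand-in for what &lt;code&gt;clean_names()&lt;/code&gt; does automatically.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Hypothetical two-row roster that keeps its awkward column names
roster_toy &amp;lt;- data.frame(
  `First Name`  = c(&amp;quot;Ada&amp;quot;, &amp;quot;Jason&amp;quot;),
  `% Allocated` = c(1, 0.75),
  check.names   = FALSE
)

# Names with spaces or symbols must be wrapped in backticks
roster_toy$`% Allocated`

# Rename by hand (clean_names() would do this for every column)
names(roster_toy) &amp;lt;- c(&amp;quot;first_name&amp;quot;, &amp;quot;percent_allocated&amp;quot;)

# Clean names can be typed directly
roster_toy$percent_allocated&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Renaming by hand is manageable for two columns, but it quickly becomes tedious and error-prone for wider tables, which is where an automated, consistent approach helps.&lt;/p&gt;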
&lt;div id=&#34;code-that-works-is-not-necessarily-good-code&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;em&gt;Code that works is not necessarily good code&lt;/em&gt;&lt;/h3&gt;
&lt;p&gt;According to Phil Karlton, &lt;a href=&#34;https://martinfowler.com/bliki/TwoHardThings.html&#34;&gt;&lt;em&gt;there are only two hard things in Computer Science: cache invalidation and naming things&lt;/em&gt;&lt;/a&gt;. It is good practice to use meaningful names for variables and data frames, use spacing, comments, etc. Both Google and Hadley Wickham have great &lt;a href=&#34;https://style.tidyverse.org/&#34;&gt;style guides for programming in R&lt;/a&gt; and the &lt;code&gt;janitor&lt;/code&gt; package helps in creating variable names with a consistent style.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;other-links&#34; class=&#34;section level2 toc-ignore&#34;&gt;
&lt;h2&gt;Other links&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.nytimes.com/2014/08/18/technology/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html&#34;&gt;For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;br&gt;
&lt;br&gt;&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Linear Model Fitting</title>
      <link>https://bit-2021.netlify.app/example/modelling_fit_lm/</link>
      <pubDate>Tue, 28 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/modelling_fit_lm/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#fit-a-model-using-lmy-x1-x2-...-data-dataframe&#34;&gt;Fit a model using &lt;code&gt;lm(Y ~ X1 + X2 +..., data = dataframe)&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#look-at-model-output&#34;&gt;Look at model output&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#write-down-the-equation-for-model1&#34;&gt;Write down the equation for model1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#plot-scatterplot-and-residuals&#34;&gt;Plot scatterplot and residuals&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#add-more-explanatory-variables&#34;&gt;Add more explanatory variables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#check-collinearity&#34;&gt;Check collinearity&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#summary-model-comparison-table-using-huxtablehuxreg&#34;&gt;Summary model comparison table using &lt;code&gt;huxtable::huxreg()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#fitting-multiple-regression-models-in-one-go&#34;&gt;Fitting multiple regression models in one go&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#simpsons-paradox&#34;&gt;Simpson’s paradox&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;We will be using the Palmer penguins data to understand body mass.&lt;/p&gt;
&lt;div id=&#34;fit-a-model-using-lmy-x1-x2-...-data-dataframe&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Fit a model using &lt;code&gt;lm(Y ~ X1 + X2 +..., data = dataframe)&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;The function to fit a linear regression model in R is &lt;code&gt;lm(Y ~ X1 + X2 +..., data = mydataframe)&lt;/code&gt;. &lt;code&gt;lm&lt;/code&gt;, like many other functions in R, uses the formula interface. The tilde (~) can be read as &lt;em&gt;is a function of&lt;/em&gt;. We are saying that &lt;span class=&#34;math inline&#34;&gt;\(Y\)&lt;/span&gt; is a function of &lt;span class=&#34;math inline&#34;&gt;\(X1\)&lt;/span&gt;, &lt;span class=&#34;math inline&#34;&gt;\(X2\)&lt;/span&gt;, etc., and that the data for our analysis come from &lt;code&gt;mydataframe&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Back to our penguins, we want to see whether body mass is a function of flipper length. We create an object called &lt;code&gt;model1&lt;/code&gt; that holds the results of this linear regression model.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model1 &amp;lt;- lm(body_mass_g ~ flipper_length_mm, data = penguins)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;look-at-model-output&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Look at model output&lt;/h2&gt;
&lt;p&gt;We will be using the &lt;code&gt;broom&lt;/code&gt; package to make model output easier to work with. There are three main functions in broom:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;tidy()&lt;/code&gt; - coefficient-level output, which is most of what you want: estimates, standard errors, test statistics, and p-values&lt;/li&gt;
&lt;li&gt;&lt;code&gt;glance()&lt;/code&gt; - model-level measures, including R-squared, log-likelihood, and AIC/BIC&lt;/li&gt;
&lt;li&gt;&lt;code&gt;augment()&lt;/code&gt; - observation-level output: adds fitted values and residuals to the data, or makes predictions with your model using new data&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For now, we will use &lt;code&gt;broom::tidy()&lt;/code&gt; and &lt;code&gt;broom::glance()&lt;/code&gt; to get model results.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model1 %&amp;gt;% broom::tidy()&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:model1_output&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;term&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;estimate&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;std.error&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;(Intercept)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-5.78e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;306&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-18.9&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;5.59e-55&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;flipper_length_mm&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;49.7&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;1.52&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;32.7&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;4.37e-107&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model1 %&amp;gt;% broom::glance()&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:model1_output&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;adj.r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;sigma&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;logLik&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;AIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; 
font-weight: bold;&#34;&gt;BIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;deviance&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df.residual&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;nobs&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.759&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.758&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;394&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1.07e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;4.37e-107&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-2.53e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 
0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;5.06e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;5.07e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;5.29e+07&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;340&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;342&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
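&lt;p&gt;The third function, &lt;code&gt;broom::augment()&lt;/code&gt;, is not used in the output above. As a minimal sketch, assuming &lt;code&gt;penguins&lt;/code&gt; comes from the &lt;code&gt;palmerpenguins&lt;/code&gt; package and using the illustrative name &lt;code&gt;model1_aug&lt;/code&gt;, it could be called like this to attach fitted values and residuals to each observation used in the fit:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)  # assumption: source of the penguins data

model1 &amp;lt;- lm(body_mass_g ~ flipper_length_mm, data = penguins)

# One row per observation used in the fit, with extra columns
# such as .fitted (predicted body mass) and .resid (residual)
model1_aug &amp;lt;- broom::augment(model1)

head(model1_aug)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Rows with a missing body mass or flipper length are dropped by &lt;code&gt;lm()&lt;/code&gt;, so the augmented data should have as many rows as the &lt;code&gt;nobs&lt;/code&gt; value reported by &lt;code&gt;glance()&lt;/code&gt;.&lt;/p&gt;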

&lt;/div&gt;
&lt;div id=&#34;write-down-the-equation-for-model1&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Write down the equation for model1&lt;/h2&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[
\text{body_mass_g} = \alpha + \beta_{1}(\text{flipper_length_mm}) + \epsilon
\]&lt;/span&gt;
&lt;span class=&#34;math display&#34;&gt;\[
\widehat{\text{body_mass_g}} = -5780.83 + 49.69(\text{flipper_length_mm})
\]&lt;/span&gt;&lt;/p&gt;
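&lt;p&gt;One way to read the estimated coefficients is to plug in a value. The sketch below assumes &lt;code&gt;penguins&lt;/code&gt; comes from the &lt;code&gt;palmerpenguins&lt;/code&gt; package; the 200 mm flipper length and the names &lt;code&gt;new_penguin&lt;/code&gt;, &lt;code&gt;pred&lt;/code&gt;, and &lt;code&gt;by_hand&lt;/code&gt; are illustrative choices, not part of the original analysis. With the rounded coefficients, the predicted body mass should come out at roughly 4,160 g.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)  # assumption: source of the penguins data

model1 &amp;lt;- lm(body_mass_g ~ flipper_length_mm, data = penguins)

# Hypothetical penguin with a 200 mm flipper (illustrative value)
new_penguin &amp;lt;- data.frame(flipper_length_mm = 200)

# Prediction from the fitted model
pred &amp;lt;- predict(model1, newdata = new_penguin)

# The same calculation by hand: intercept + slope * flipper length
b &amp;lt;- unname(coef(model1))
by_hand &amp;lt;- b[1] + b[2] * 200

pred
by_hand&lt;/code&gt;&lt;/pre&gt;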
&lt;/div&gt;
&lt;div id=&#34;plot-scatterplot-and-residuals&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Plot scatterplot and residuals&lt;/h2&gt;
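&lt;p&gt;A minimal sketch of the two plots, assuming &lt;code&gt;ggplot2&lt;/code&gt; and &lt;code&gt;broom&lt;/code&gt; are installed and that &lt;code&gt;penguins&lt;/code&gt; comes from the &lt;code&gt;palmerpenguins&lt;/code&gt; package. The object names &lt;code&gt;p_scatter&lt;/code&gt; and &lt;code&gt;p_resid&lt;/code&gt; are illustrative. The first plot overlays the least-squares line on the raw data; the second plots residuals against fitted values, where we would hope to see no clear pattern.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(palmerpenguins)  # assumption: source of the penguins data

model1 &amp;lt;- lm(body_mass_g ~ flipper_length_mm, data = penguins)

# Scatterplot of body mass against flipper length with the fitted line
p_scatter &amp;lt;- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = &amp;#39;lm&amp;#39;, formula = y ~ x, se = FALSE)

# Residuals against fitted values, with a reference line at zero
p_resid &amp;lt;- ggplot(broom::augment(model1), aes(x = .fitted, y = .resid)) +
  geom_point(alpha = 0.5) +
  geom_hline(yintercept = 0, linetype = &amp;#39;dashed&amp;#39;)

p_scatter
p_resid&lt;/code&gt;&lt;/pre&gt;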
&lt;/div&gt;
&lt;div id=&#34;add-more-explanatory-variables&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Add more explanatory variables&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model2 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species , data = penguins)

model3 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex , data = penguins)

model4 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex + bill_length_mm, data = penguins)

model5 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex + bill_length_mm + bill_depth_mm , data = penguins)

# Fit a model with all explanatory variables (~ .)
model6 &amp;lt;- lm(body_mass_g ~ . , data = penguins)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;check-collinearity&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Check collinearity&lt;/h2&gt;
&lt;p&gt;With so many explanatory variables, we need to worry about collinearity, i.e., whether the explanatory variables (all of the &lt;span class=&#34;math inline&#34;&gt;\(X\)&lt;/span&gt;’s) are highly correlated among themselves.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#model2 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species , data = penguins)
car::vif(model2)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##                   GVIF Df GVIF^(1/(2*Df))
## flipper_length_mm 4.51  1            2.12
## species           4.51  2            1.46&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# model3 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex , data = penguins)
car::vif(model3)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##                   GVIF Df GVIF^(1/(2*Df))
## flipper_length_mm 6.05  1            2.46
## species           5.65  2            1.54
## sex               1.36  1            1.17&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# model4 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex + bill_length_mm, data = penguins)
car::vif(model4)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##                    GVIF Df GVIF^(1/(2*Df))
## flipper_length_mm  6.44  1            2.54
## species           18.16  2            2.06
## sex                1.81  1            1.35
## bill_length_mm     5.95  1            2.44&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# model5 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex + bill_length_mm + bill_depth_mm , data = penguins)
car::vif(model5)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##                    GVIF Df GVIF^(1/(2*Df))
## flipper_length_mm  6.69  1            2.59
## species           41.07  2            2.53
## sex                2.31  1            1.52
## bill_length_mm     6.07  1            2.46
## bill_depth_mm      6.08  1            2.47&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# model6 &amp;lt;- lm(body_mass_g ~ . , data = penguins)
car::vif(model6)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##                    GVIF Df GVIF^(1/(2*Df))
## species           71.20  2            2.90
## island             3.76  2            1.39
## bill_length_mm     6.12  1            2.47
## bill_depth_mm      6.27  1            2.50
## flipper_length_mm  7.78  1            2.79
## sex                2.34  1            1.53
## year               1.17  1            1.08&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model7 &amp;lt;- lm(body_mass_g ~ flipper_length_mm +  sex + bill_depth_mm , data = penguins)
car::vif(model7)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## flipper_length_mm               sex     bill_depth_mm 
##              2.44              1.89              2.65&lt;/code&gt;&lt;/pre&gt;
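&lt;p&gt;To see what these numbers measure, recall that for a predictor with a single coefficient the variance inflation factor is &lt;span class=&#34;math inline&#34;&gt;\(\text{VIF}_j = 1 / (1 - R_j^2)\)&lt;/span&gt;, where &lt;span class=&#34;math inline&#34;&gt;\(R_j^2\)&lt;/span&gt; comes from regressing that predictor on the other explanatory variables. The sketch below computes this by hand for &lt;code&gt;flipper_length_mm&lt;/code&gt; in &lt;code&gt;model7&lt;/code&gt;, assuming &lt;code&gt;penguins&lt;/code&gt; comes from the &lt;code&gt;palmerpenguins&lt;/code&gt; package; the names &lt;code&gt;d&lt;/code&gt;, &lt;code&gt;aux&lt;/code&gt;, and &lt;code&gt;vif_flipper&lt;/code&gt; are illustrative. The result should agree with the value that &lt;code&gt;car::vif(model7)&lt;/code&gt; reports for that variable.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)  # assumption: source of the penguins data

# Keep only the rows model7 can use (no missing values in its variables)
d &amp;lt;- na.omit(subset(penguins,
                    select = c(body_mass_g, flipper_length_mm, sex, bill_depth_mm)))

# Auxiliary regression: one predictor explained by the other predictors
aux &amp;lt;- lm(flipper_length_mm ~ sex + bill_depth_mm, data = d)

# VIF = 1 / (1 - R squared of the auxiliary regression)
vif_flipper &amp;lt;- 1 / (1 - summary(aux)$r.squared)
vif_flipper&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;For terms with more than one coefficient, such as &lt;code&gt;species&lt;/code&gt;, &lt;code&gt;car::vif()&lt;/code&gt; reports a generalised VIF instead, which is why the earlier outputs show &lt;code&gt;GVIF&lt;/code&gt; and &lt;code&gt;Df&lt;/code&gt; columns.&lt;/p&gt;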
&lt;/div&gt;
&lt;div id=&#34;summary-model-comparison-table-using-huxtablehuxreg&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Summary model comparison table using &lt;code&gt;huxtable::huxreg()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Which of the seven models we have fitted seems to be the best one? Let us compare them in one table.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(huxtable)  # provides huxreg() and set_caption()

huxreg(model1, model2, model3, model4, model5, model6, model7,
       statistics = c(&amp;#39;#observations&amp;#39; = &amp;#39;nobs&amp;#39;,
                      &amp;#39;R squared&amp;#39; = &amp;#39;r.squared&amp;#39;,
                      &amp;#39;Adj. R Squared&amp;#39; = &amp;#39;adj.r.squared&amp;#39;,
                      &amp;#39;Residual SE&amp;#39; = &amp;#39;sigma&amp;#39;),
       bold_signif = 0.05
) %&amp;gt;%
  set_caption(&amp;#39;Comparison of models&amp;#39;)&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:compare_models&#34;&gt;
&lt;caption style=&#34;caption-side: top; text-align: center;&#34;&gt;Comparison of models&lt;/caption&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(1)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(2)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(3)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(4)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(5)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(6)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(7)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(Intercept)&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-5780.831 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-4031.477 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-365.817&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-759.064&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-1460.995 *&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;84087.945 *&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-2246.829 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(305.815)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(584.151)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(532.050)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(541.377)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(571.308)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(41912.019)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(625.286)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;flipper_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;49.686 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;40.705 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;20.025 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;17.847 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;15.950 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;18.504 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;38.190 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(1.518)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(3.071)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(2.846)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(2.902)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(2.910)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(3.128)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(2.084)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesChinstrap&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-206.510 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-87.634&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-291.711 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-251.477 **&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-282.539 **&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(57.731)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(46.347)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(81.502)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(81.079)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(88.790)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesGentoo&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;266.810 **&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;836.260 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;707.028 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;1014.627 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;890.958 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(95.264)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(85.185)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(94.359)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(129.561)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(144.563)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;sexmale&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;530.381 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;465.395 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;389.892 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;378.977 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;538.080 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(37.810)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(43.081)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(47.848)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(48.074)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(51.310)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;21.633 **&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;18.204 *&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;18.964 **&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(7.148)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(7.106)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(7.112)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_depth_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;67.218 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;60.798 **&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-86.947 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(19.742)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(20.002)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(15.456)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;islandDream&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-21.180&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(58.390)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;islandTorgersen&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-58.777&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(60.852)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;year&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-42.785 *&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(20.949)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid 
solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;#observations&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td 
style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;R squared&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.759&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.783&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.867&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.871&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.875&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.877&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.823&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;Adj. R Squared&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.758&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.781&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.865&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.869&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.873&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.873&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.821&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;Residual SE&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;394.278&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;375.535&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;295.565&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;291.955&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;287.338&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;286.524&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 
6pt 6pt 6pt; font-weight: normal;&#34;&gt;340.427&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th colspan=&#34;8&#34; style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt; *** p &amp;lt; 0.001;  ** p &amp;lt; 0.01;  * p &amp;lt; 0.05.&lt;/th&gt;&lt;/tr&gt;
&lt;/table&gt;

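&lt;p&gt;Model 7 seems to be the best compromise: it uses only flipper length, sex and bill depth, whereas models 5 and 6 reach a higher adjusted R squared (0.873 versus 0.821) at the cost of more terms. We will use &lt;code&gt;broom::tidy()&lt;/code&gt; and &lt;code&gt;broom::glance()&lt;/code&gt; to get the results for model 7.&lt;/p&gt;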
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model7 %&amp;gt;% broom::tidy()&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:model7_output&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;term&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;estimate&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;std.error&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;(Intercept)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-2.25e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;625&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-3.59&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.000376&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;flipper_length_mm&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;38.2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;2.08&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;18.3&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;3.47e-52&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;sexmale&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;538&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;51.3&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;10.5&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;2.17e-22&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_depth_mm&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-86.9&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;15.5&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-5.63&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;3.96e-08&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;model7 %&amp;gt;% broom::glance()&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:model7_output&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;adj.r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;sigma&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;logLik&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;AIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; 
font-weight: bold;&#34;&gt;BIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;deviance&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df.residual&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;nobs&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.823&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.821&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;340&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;509&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;2.9e-123&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;3&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-2.41e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 
0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;4.83e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;4.85e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;3.81e+07&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;329&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;333&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;p&gt;Let us write down its equation, first in general form and then with the estimated coefficients:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[
\begin{aligned}
\text{body_mass_g} &amp;amp;= \alpha + \beta_{1}(\text{flipper_length_mm}) + \beta_{2}(\text{sex}_{\text{male}})\ + \beta_{3}(\text{bill_depth_mm}) + \epsilon
\end{aligned}
\]&lt;/span&gt;&lt;span class=&#34;math display&#34;&gt;\[
\begin{aligned}
\widehat{\text{body_mass_g}} &amp;amp;= -2246.83 + 38.19(\text{flipper_length_mm}) + 538.08(\text{sex}_{\text{male}})\ - 86.95(\text{bill_depth_mm})
\end{aligned}
\]&lt;/span&gt;&lt;/p&gt;
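&lt;p&gt;As a quick check of how to read this equation, the sketch below plugs one set of values into it. The penguin is hypothetical (a male with 200 mm flippers and a bill depth of 18 mm; these inputs are assumptions chosen only for illustration), and the coefficients are the rounded estimates from the equation above, so the result is approximate.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Rounded estimates copied from the fitted equation above
alpha  &amp;lt;- -2246.83   # intercept
beta_1 &amp;lt;- 38.19      # flipper_length_mm
beta_2 &amp;lt;- 538.08     # sexmale (1 = male, 0 = female)
beta_3 &amp;lt;- -86.95     # bill_depth_mm

# Hypothetical penguin, chosen only for illustration
flipper_length_mm &amp;lt;- 200
sex_male          &amp;lt;- 1
bill_depth_mm     &amp;lt;- 18

# Plug the values into the fitted equation
predicted_body_mass_g &amp;lt;- alpha + beta_1 * flipper_length_mm +
  beta_2 * sex_male + beta_3 * bill_depth_mm

predicted_body_mass_g&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This gives a predicted body mass of roughly 4364 g. With the fitted model object at hand, &lt;code&gt;predict(model7, newdata = ...)&lt;/code&gt; should give the same kind of prediction from the unrounded coefficients.&lt;/p&gt;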
&lt;/div&gt;
&lt;div id=&#34;fitting-multiple-regression-models-in-one-go&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Fitting multiple regression models in one go&lt;/h2&gt;
&lt;p&gt;Let us recall the relationship between body mass and bill depth and have a look at the scatterplot.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_fit_lm_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We could run three separate regressions, one per species, but we can estimate all three models with a few lines of code.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;penguins %&amp;gt;%
  na.omit() %&amp;gt;% 
  group_by(species) %&amp;gt;%
  summarise(
    broom::tidy(lm( body_mass_g ~ bill_depth_mm))
  )&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 6
## # Groups:   species [3]
##   species   term          estimate std.error statistic  p.value
##   &amp;lt;fct&amp;gt;     &amp;lt;chr&amp;gt;            &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;
## 1 Adelie    (Intercept)     -297.      469.    -0.634  5.27e- 1
## 2 Adelie    bill_depth_mm    218.       25.5    8.55   1.67e-14
## 3 Chinstrap (Intercept)      -36.2     613.    -0.0591 9.53e- 1
## 4 Chinstrap bill_depth_mm    205.       33.2    6.16   4.79e- 8
## 5 Gentoo    (Intercept)     -422.      488.    -0.864  3.89e- 1
## 6 Gentoo    bill_depth_mm    368.       32.5   11.3    1.64e-20&lt;/code&gt;&lt;/pre&gt;
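&lt;p&gt;What we see is that, within each species, body mass increases with bill depth: the slope is positive and significant for all three species (about 218 g per mm for Adelie, 205 g per mm for Chinstrap and 368 g per mm for Gentoo), while none of the intercepts differs significantly from zero.&lt;/p&gt;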
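&lt;p&gt;A note on versions: from dplyr 1.1.0 onwards, returning more than one row per group from &lt;code&gt;summarise()&lt;/code&gt; is deprecated, and &lt;code&gt;reframe()&lt;/code&gt; is the suggested replacement. The sketch below shows the same per-species fit written that way. It assumes dplyr 1.1.0 or later and that &lt;code&gt;penguins&lt;/code&gt; comes from the palmerpenguins package (the data source is an assumption here); &lt;code&gt;species_models&lt;/code&gt; is just an illustrative name.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(palmerpenguins)  # assumed source of the penguins data

species_models &amp;lt;- penguins %&amp;gt;%
  na.omit() %&amp;gt;%
  group_by(species) %&amp;gt;%
  # reframe() allows any number of rows per group (here, one per model term)
  reframe(broom::tidy(lm(body_mass_g ~ bill_depth_mm)))

species_models&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The result should have the same six rows as the output above, except that &lt;code&gt;reframe()&lt;/code&gt; returns an ungrouped tibble.&lt;/p&gt;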
&lt;p&gt;What if we add &lt;code&gt;sex&lt;/code&gt;? First, let us apply &lt;code&gt;facet_wrap()&lt;/code&gt; to our scatter plot to see what it looks like.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_fit_lm_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;penguins %&amp;gt;%
  na.omit() %&amp;gt;% 
  group_by(species) %&amp;gt;%
  summarise(
    broom::tidy(lm( body_mass_g ~ bill_depth_mm + sex))
  )&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 9 x 6
## # Groups:   species [3]
##   species   term          estimate std.error statistic  p.value
##   &amp;lt;fct&amp;gt;     &amp;lt;chr&amp;gt;            &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;
## 1 Adelie    (Intercept)     1931.      452.      4.28  3.47e- 5
## 2 Adelie    bill_depth_mm     81.6      25.6     3.19  1.74e- 3
## 3 Adelie    sexmale          556.       62.1     8.96  1.63e-15
## 4 Chinstrap (Intercept)      830.      861.      0.964 3.39e- 1
## 5 Chinstrap bill_depth_mm    153.       48.9     3.14  2.55e- 3
## 6 Chinstrap sexmale          156.      110.      1.42  1.60e- 1
## 7 Gentoo    (Intercept)     2741.      579.      4.73  6.37e- 6
## 8 Gentoo    bill_depth_mm    136.       40.6     3.35  1.08e- 3
## 9 Gentoo    sexmale          604.       79.8     7.57  9.94e-12&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;simpsons-paradox&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Simpson’s paradox&lt;/h2&gt;
&lt;p&gt;Recall that in our EDA we saw no clear relationship between bill length and bill depth.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bill_no_species&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_fit_lm_files/figure-html/unnamed-chunk-5-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;If we fit a simple regression model, we get the following:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;simpsons_model &amp;lt;- lm( bill_depth_mm ~ bill_length_mm, 
                      data = penguins %&amp;gt;% na.omit())

simpsons_model %&amp;gt;% broom::tidy()&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:unnamed-chunk-6&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;term&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;estimate&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;std.error&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;(Intercept)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;20.8&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.854&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;24.3&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1.03e-75&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_length_mm&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-0.0823&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.0193&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-4.27&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;2.53e-05&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;simpsons_model %&amp;gt;% broom::glance()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:unnamed-chunk-6&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;adj.r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;sigma&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;logLik&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;AIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; 
font-weight: bold;&#34;&gt;BIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;deviance&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df.residual&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;nobs&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.0523&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.0494&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1.92&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;18.3&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;2.53e-05&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-689&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 
0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1.38e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1.39e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;1.22e+03&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;331&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;333&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

The slope is negative and statistically significant, but the model explains only about 5% of the overall variability in bill depth (r.squared = 0.0523).&lt;/p&gt;
&lt;p&gt;However, when we drew the same scatterplot with the points coloured by species, we got a completely different story.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bill_len_dep&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_fit_lm_files/figure-html/unnamed-chunk-7-1.png&#34; width=&#34;648&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We can again fit three individual models in one go:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;penguins %&amp;gt;%
  na.omit() %&amp;gt;% 
  group_by(species) %&amp;gt;%
  summarise(
    broom::tidy(lm( bill_depth_mm ~ bill_length_mm )),
    broom::glance(lm( bill_depth_mm ~ bill_length_mm ))
  )&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 16
## # Groups:   species [3]
##   species term  estimate std.error statistic  p.value r.squared adj.r.squared
##   &amp;lt;fct&amp;gt;   &amp;lt;chr&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;     &amp;lt;dbl&amp;gt;         &amp;lt;dbl&amp;gt;
## 1 Adelie  (Int~   11.5      1.37        25.2 1.51e- 6     0.149         0.143
## 2 Adelie  bill~    0.177    0.0352      25.2 1.51e- 6     0.149         0.143
## 3 Chinst~ (Int~    7.57     1.55        49.2 1.53e- 9     0.427         0.418
## 4 Chinst~ bill~    0.222    0.0317      49.2 1.53e- 9     0.427         0.418
## 5 Gentoo  (Int~    5.12     1.06        87.5 7.34e-16     0.428         0.423
## 6 Gentoo  bill~    0.208    0.0222      87.5 7.34e-16     0.428         0.423
## # ... with 8 more variables: sigma &amp;lt;dbl&amp;gt;, df &amp;lt;dbl&amp;gt;, logLik &amp;lt;dbl&amp;gt;, AIC &amp;lt;dbl&amp;gt;,
## #   BIC &amp;lt;dbl&amp;gt;, deviance &amp;lt;dbl&amp;gt;, df.residual &amp;lt;int&amp;gt;, nobs &amp;lt;int&amp;gt;&lt;/code&gt;&lt;/pre&gt;
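&lt;p&gt;One thing to watch in this output: &lt;code&gt;tidy()&lt;/code&gt; and &lt;code&gt;glance()&lt;/code&gt; both return columns called &lt;code&gt;statistic&lt;/code&gt; and &lt;code&gt;p.value&lt;/code&gt;. Here the two rows of each species share the same &lt;code&gt;statistic&lt;/code&gt; and &lt;code&gt;p.value&lt;/code&gt;, and the tibble has 16 columns rather than 18, which suggests that the model-level values from &lt;code&gt;glance()&lt;/code&gt; (the F statistic and its p-value) have replaced the per-term ones. For Adelie, for example, (0.177 / 0.0352) squared is roughly 25, in line with the 25.2 shown. A minimal sketch of one way to keep the two sets of results apart is given below. It uses &lt;code&gt;group_modify()&lt;/code&gt; and two separate tables, and it assumes that &lt;code&gt;penguins&lt;/code&gt; comes from the palmerpenguins package; the object names are illustrative.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(palmerpenguins)  # assumed source of the penguins data

by_species &amp;lt;- penguins %&amp;gt;%
  na.omit() %&amp;gt;%
  group_by(species)

# One row per model term: estimates with their own t statistics and p-values
term_level &amp;lt;- by_species %&amp;gt;%
  group_modify(~ broom::tidy(lm(bill_depth_mm ~ bill_length_mm, data = .x)))

# One row per model: R-squared, F statistic and the other fit summaries
model_level &amp;lt;- by_species %&amp;gt;%
  group_modify(~ broom::glance(lm(bill_depth_mm ~ bill_length_mm, data = .x)))

term_level
model_level&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping the term-level and model-level summaries in separate tables avoids the name clash and makes it clear which p-value belongs to a coefficient and which to the model as a whole.&lt;/p&gt;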
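&lt;p&gt;Alternatively, we can fit a single multiple regression model that includes species as a predictor. Its adjusted R2 is 0.765, so it explains about 76.5% of the variability in bill depth.&lt;/p&gt;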
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;simpsons_model2 &amp;lt;- lm( bill_depth_mm ~ bill_length_mm + species, 
                      data = penguins %&amp;gt;% na.omit())

simpsons_model2 %&amp;gt;% broom::tidy()&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:unnamed-chunk-8&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;term&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;estimate&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;std.error&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;(Intercept)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;10.6&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.691&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;15.3&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;2.98e-40&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_length_mm&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.2&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.0177&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;11.3&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;2.26e-25&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;speciesChinstrap&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-1.93&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.226&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-8.56&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;4.26e-16&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesGentoo&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-5.1&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.194&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-26.3&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;1.04e-82&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
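&lt;p&gt;Using the rounded estimates from the table above, the fitted model can be sketched as a single equation, where the two species terms are dummy variables (1 for that species, 0 otherwise) and Adelie is the baseline:&lt;/p&gt;
&lt;p&gt;&lt;span class=&#34;math display&#34;&gt;\[
\begin{aligned}
\widehat{\text{bill_depth_mm}} &amp;amp;= 10.6 + 0.2(\text{bill_length_mm}) - 1.93(\text{species}_{\text{Chinstrap}}) - 5.1(\text{species}_{\text{Gentoo}})
\end{aligned}
\]&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;In other words, the three species share one positive slope of about 0.2 mm of bill depth per mm of bill length, with intercepts of about 10.6 for Adelie, 8.67 for Chinstrap (10.6 - 1.93) and 5.5 for Gentoo (10.6 - 5.1). The sign of the slope flips from negative in the pooled model to positive once species is accounted for.&lt;/p&gt;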

&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;simpsons_model2 %&amp;gt;% broom::glance()&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:unnamed-chunk-8&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;adj.r.squared&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;sigma&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;statistic&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;p.value&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;logLik&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;AIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; 
font-weight: bold;&#34;&gt;BIC&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;deviance&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;df.residual&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;nobs&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0.4pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.767&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.765&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;0.954&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;362&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;8.88e-104&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;3&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;-455&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 
0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;920&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;939&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;300&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;329&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0.4pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; background-color: rgb(242, 242, 242); font-weight: normal;&#34;&gt;333&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Model diagnostics</title>
      <link>https://bit-2021.netlify.app/example/modelling_diagnostics/</link>
      <pubDate>Wed, 29 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/modelling_diagnostics/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#linear-regression-assumptions-l-i-n-e&#34;&gt;Linear Regression Assumptions: &lt;strong&gt;L-I-N-E&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#regression-diagnostic-plots-with-ggfortifyautoplot&#34;&gt;Regression diagnostic plots with &lt;code&gt;ggfortify::autoplot()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;linear-regression-assumptions-l-i-n-e&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Linear Regression Assumptions: &lt;strong&gt;L-I-N-E&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;L&lt;/strong&gt;: Linear relationship between (Y) and the explanatory variable (X)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;I&lt;/strong&gt;: Independence of errors: there is no connection between how far any two points lie from the regression line&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;N&lt;/strong&gt;: Normal distribution of Y at each level of X&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;E&lt;/strong&gt;: Equality of variance of the errors: variability remains the same for all levels of X&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In other words, the following should hold:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;L: The mean value of Y at each level of X lies on the regression line.&lt;/li&gt;
&lt;li&gt;I: There is no clear pattern in the errors.&lt;/li&gt;
&lt;li&gt;N: At each level of X, the values of Y are normally distributed.&lt;/li&gt;
&lt;li&gt;E: The variability of Y is the same at each level of X.&lt;/li&gt;
&lt;/ul&gt;
&lt;div class=&#34;figure&#34; style=&#34;text-align: center&#34;&gt;&lt;span id=&#34;fig:OLSassumptions&#34;&gt;&lt;/span&gt;
&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_diagnostics_files/figure-html/OLSassumptions-1.png&#34; alt=&#34;Assumptions for linear ordinary least squares (OLS) regression&#34; width=&#34;90%&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Figure 1: Assumptions for linear ordinary least squares (OLS) regression
&lt;/p&gt;
&lt;/div&gt;
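&lt;p&gt;As a rough illustration of these assumptions, the sketch below simulates data that satisfy L-I-N-E by construction and fits a line to them. The sample size, intercept, slope, and error standard deviation are arbitrary values chosen for this example; they do not come from the penguins data.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(42)                      # make the simulation reproducible

n &amp;lt;- 200                          # illustrative sample size
x &amp;lt;- runif(n, min = 0, max = 10)  # explanatory variable

# L: the mean of y is a straight line in x
# I, N, E: errors are independent draws from one Normal distribution with constant sd
errors &amp;lt;- rnorm(n, mean = 0, sd = 1.5)
y &amp;lt;- 2 + 3 * x + errors

sim_model &amp;lt;- lm(y ~ x)
coef(sim_model)                   # estimates should be close to the true values 2 and 3&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Changing how &lt;code&gt;errors&lt;/code&gt; is generated (for example, letting its standard deviation grow with &lt;code&gt;x&lt;/code&gt;) is a simple way to see what a violated assumption looks like in the diagnostic plots.&lt;/p&gt;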
&lt;/div&gt;
&lt;div id=&#34;regression-diagnostic-plots-with-ggfortifyautoplot&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Regression diagnostic plots with &lt;code&gt;ggfortify::autoplot()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Let us see what is happening in our models. We will use the &lt;code&gt;ggfortify&lt;/code&gt; package and its &lt;code&gt;autoplot()&lt;/code&gt; command to get the following regression diagnostic plots:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;&lt;strong&gt;Residuals vs. Fitted&lt;/strong&gt;: checks the Linearity assumption. Residuals should scatter randomly around zero, with no pattern; if not, there is a pattern in the data that is currently unaccounted for.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Normal Q-Q&lt;/strong&gt;: checks the Normality assumption for the residuals. Deviations from a straight line indicate that residuals do not follow a Normal distribution.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Scale-Location&lt;/strong&gt;: checks whether residuals have equal (constant) variance. Positive or negative trends across the fitted values indicate variability that is not constant.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Residuals vs. Leverage&lt;/strong&gt;: checks for influential points. Points with high leverage (having unusual values of the predictors) and/or high absolute residuals can have an undue influence on estimates of model parameters.&lt;/li&gt;
&lt;/ol&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins) # provides the penguins data frame

model1 &amp;lt;- lm(body_mass_g ~ flipper_length_mm, data = penguins)

model2 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species , data = penguins)

model3 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex , data = penguins)

# note: as written, model4 has the same specification as model3
model4 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex , data = penguins)

model5 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + species + sex + bill_length_mm + bill_depth_mm , data = penguins)

model6 &amp;lt;- lm(body_mass_g ~ . , data = penguins)


library(ggfortify)

autoplot(model1) +
  theme_minimal() + 
  labs (title = &amp;quot;Model 1 Diagnostic Plots&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_diagnostics_files/figure-html/unnamed-chunk-1-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;autoplot(model2) +
  theme_minimal() + 
  labs (title = &amp;quot;Model 2 Diagnostic Plots&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_diagnostics_files/figure-html/unnamed-chunk-1-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;autoplot(model3) +
  theme_minimal() + 
  labs (title = &amp;quot;Model 3 Diagnostic Plots&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_diagnostics_files/figure-html/unnamed-chunk-1-3.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;autoplot(model4) +
  theme_minimal() + 
  labs (title = &amp;quot;Model 4 Diagnostic Plots&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_diagnostics_files/figure-html/unnamed-chunk-1-4.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;autoplot(model5) +
  theme_minimal() + 
  labs (title = &amp;quot;Model 5 Diagnostic Plots&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_diagnostics_files/figure-html/unnamed-chunk-1-5.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;autoplot(model6) +
  theme_minimal() + 
  labs (title = &amp;quot;Model 6 Diagnostic Plots&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/modelling_diagnostics_files/figure-html/unnamed-chunk-1-6.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
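&lt;p&gt;The quantities behind the four diagnostic plots can also be computed directly with base R. The sketch below uses the built-in &lt;code&gt;mtcars&lt;/code&gt; data as a stand-in (an assumption made so that the example runs without extra packages); the same functions should apply to any of the &lt;code&gt;lm&lt;/code&gt; objects fitted above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# stand-in model on a built-in dataset; swap in model1, model2, ... as needed
fit &amp;lt;- lm(mpg ~ wt, data = mtcars)

diagnostics &amp;lt;- data.frame(
  fitted       = fitted(fit),         # x-axis of Residuals vs. Fitted and Scale-Location
  residual     = residuals(fit),      # y-axis of Residuals vs. Fitted
  std_residual = rstandard(fit),      # standardised residuals, used in the other three plots
  leverage     = hatvalues(fit),      # x-axis of Residuals vs. Leverage
  cooks_d      = cooks.distance(fit)  # combines residual size and leverage
)

# observations with the largest Cook distance are candidates for a closer look
head(diagnostics[order(-diagnostics$cooks_d), ], 3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Sorting by Cook’s distance is only one possible choice; sorting by &lt;code&gt;leverage&lt;/code&gt; or by the absolute value of &lt;code&gt;std_residual&lt;/code&gt; highlights different kinds of unusual observations.&lt;/p&gt;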
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Binary classification model</title>
      <link>https://bit-2021.netlify.app/example/modelling_fit_glm/</link>
      <pubDate>Tue, 28 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/modelling_fit_glm/</guid>
      <description>



</description>
    </item>
    
    <item>
      <title>Visualise Data</title>
      <link>https://bit-2021.netlify.app/example/eda-visualise-data/</link>
      <pubDate>Tue, 21 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/eda-visualise-data/</guid>
      <description>
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/header-attrs/header-attrs.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/kePrint/kePrint.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://bit-2021.netlify.app/rmarkdown-libs/lightable/lightable.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/htmlwidgets/htmlwidgets.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/plotly-binding/plotly.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/typedarray/typedarray.min.js&#34;&gt;&lt;/script&gt;
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/jquery/jquery.min.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://bit-2021.netlify.app/rmarkdown-libs/crosstalk/css/crosstalk.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/crosstalk/js/crosstalk.min.js&#34;&gt;&lt;/script&gt;
&lt;link href=&#34;https://bit-2021.netlify.app/rmarkdown-libs/plotly-htmlwidgets-css/plotly-htmlwidgets.css&#34; rel=&#34;stylesheet&#34; /&gt;
&lt;script src=&#34;https://bit-2021.netlify.app/rmarkdown-libs/plotly-main/plotly-latest.min.js&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#overview&#34;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#layers&#34;&gt;Layers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#facetting&#34;&gt;Facetting&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#tweaking-graphics-for-publication-quality&#34;&gt;Tweaking graphics for publication quality&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#making-plots-interactive-using-plotly&#34;&gt;Making plots interactive using &lt;code&gt;plotly&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#animated-graphs&#34;&gt;Animated Graphs&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#gapminder-animations---transition_time&#34;&gt;Gapminder Animations - &lt;code&gt;transition_time()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#why-you-should-always-plot-your-data&#34;&gt;Why you should always plot your data&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#what-data-patterns-can-lie-behind-a-correlation-coefficient&#34;&gt;What data patterns can lie behind a correlation coefficient?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#rstudios-primers-for-ggplot2&#34;&gt;RStudio’s primers for &lt;strong&gt;ggplot2&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#further-resources&#34;&gt;Further resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Learning Objectives &lt;br&gt;
1. Produce scatter plots, boxplots, and time series plots using ggplot. &lt;br&gt;
2. Set universal plot settings. &lt;br&gt;
3. Describe what faceting is and apply faceting in ggplot. &lt;br&gt;
4. Modify the aesthetics of an existing ggplot plot (including axis labels and colour). &lt;br&gt;
5. Build complex and customized plots from data in a data frame.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div id=&#34;overview&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;blockquote&gt;
&lt;p&gt;Above all else show the data. &lt;br&gt;
      –Edward Tufte, &lt;em&gt;The Visual Display of Quantitative Information&lt;/em&gt;, 2001&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;code&gt;ggplot2&lt;/code&gt; has become the de facto standard for visualising data in R. The ggplot system moves away from a defined set of graphs (e.g., scatterplot, bar chart, etc.) and instead breaks graphics down into their basic components, allowing you to build plots layer by layer.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“In brief… a statistical graphic is a mapping from &lt;strong&gt;data&lt;/strong&gt; to &lt;strong&gt;aesthetic attributes&lt;/strong&gt; (colour, shape, size) of &lt;strong&gt;geometric objects&lt;/strong&gt; (points, lines, bars). The plot may also contain statistical transformations of the data and is drawn on a specific coordinate system.” &lt;br&gt;
      – Hadley Wickham (ggplot2 creator)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/ggplot.png&#34; width=&#34;80%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;It may seem verbose and unwieldy, but the idea of building a plot on a layer-by-layer basis is very powerful.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;We begin a plot by defining the dataset we will use.&lt;/li&gt;
&lt;li&gt;Then, we specify aesthetics, namely (x,y) coordinates, colour, size, etc.&lt;/li&gt;
&lt;li&gt;Finally, we choose what &lt;code&gt;geom&lt;/code&gt; (or geometric shape) we want to use to represent our data.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We can then add more layers, like legends, labels, facets, etc.&lt;/p&gt;
&lt;p&gt;In the following examples, we will use the &lt;code&gt;gapminder&lt;/code&gt; dataset with data on life expectancy &lt;code&gt;lifeExp&lt;/code&gt;, population &lt;code&gt;pop&lt;/code&gt;, and GDP per capita &lt;code&gt;gdpPercap&lt;/code&gt; for a number of countries between 1952 and 2007. We want to build a graph that shows the relationship between GDP per capita and life expectancy.&lt;/p&gt;
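&lt;p&gt;Before plotting, it can help to take a quick look at the data. A minimal sketch, assuming the &lt;code&gt;gapminder&lt;/code&gt; package is installed:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(gapminder)

names(gapminder)        # column names, including lifeExp, pop, and gdpPercap
range(gapminder$year)   # first and last year covered
head(gapminder, 3)      # first few rows&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note the exact spelling of the column names: R is case sensitive, so the GDP column must be written &lt;code&gt;gdpPercap&lt;/code&gt; in the plotting code.&lt;/p&gt;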
&lt;p&gt;As we said, first we define the dataset we are using:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)   # provides ggplot() and the geoms, scales, and themes used below
library(gapminder) # load the package gapminder that contains the data

ggplot(data=gapminder)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder1-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We just get an empty canvas, as we haven’t done much with our dataset.&lt;/p&gt;
&lt;p&gt;The next thing is to map &lt;strong&gt;aesthetics&lt;/strong&gt;. In our case, we will map &lt;code&gt;gdpPercap&lt;/code&gt; to the x-axis, and &lt;code&gt;lifeExp&lt;/code&gt; to the y-axis.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder2-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This is an improvement over the blank canvas we got earlier, as we have mapped the x- and y- axes and we see the likely ranges of both variables. However, to see the scatter plot we want, we must add a &lt;strong&gt;geometry&lt;/strong&gt;; as scatter plots are a bunch of points, the relevant geometry is &lt;code&gt;geom_point()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp)) +
  geom_point()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder3-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;What if we wanted to colour the points by the continent each country is in? This is a change of the aesthetic properties, so we just add &lt;code&gt;colour = continent&lt;/code&gt; inside &lt;code&gt;aes()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent)) +
  geom_point()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder4-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;What if instead of a scatter plot we wanted to create a line plot? It would be the same code as before, but now the relevant geometry we should use is &lt;code&gt;geom_line()&lt;/code&gt; instead of &lt;code&gt;geom_point()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent)) +
  geom_line()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder5-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;However, this is not a particularly useful plot, so let us go back to our scatter plot.&lt;/p&gt;
&lt;p&gt;What if we wanted to have the size of each point correspond to the population of the country? This is not a geometry, but an aesthetic property. If we add &lt;code&gt;size = pop&lt;/code&gt; inside &lt;code&gt;aes()&lt;/code&gt;, the size of each point will reflect the country’s population, and we still have the aesthetic property &lt;code&gt;colour = continent&lt;/code&gt;, which colours each point by the continent the country is in.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent,
    size = pop)) +
  geom_point()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder6-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;This is a more interesting graph, but given the non-linear pattern we see, we can perhaps improve it by taking the logarithm of the x-axis variable, GDP per capita. At the end of the commands, or layers, that make up our graph we add &lt;code&gt;scale_x_log10()&lt;/code&gt;. This draws the x-axis on a base-10 logarithmic scale and should produce a scatterplot with a more linear pattern.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent,
    size = pop)) +
  geom_point()+
  scale_x_log10()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder7-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
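&lt;p&gt;One rough way to check that the log scale makes the pattern more linear is to compare the Pearson correlation of life expectancy with GDP per capita before and after taking logs. This is only a numerical sanity check added for illustration; it is not part of the plot itself.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(gapminder)

# Pearson correlation measures the strength of a linear relationship
cor_raw &amp;lt;- cor(gapminder$gdpPercap, gapminder$lifeExp)
cor_log &amp;lt;- cor(log10(gapminder$gdpPercap), gapminder$lifeExp)

# the correlation on the log scale should be the larger of the two
c(raw = cor_raw, log10 = cor_log)&lt;/code&gt;&lt;/pre&gt;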
&lt;p&gt;If we wanted to change the labels on the x-axis to dollars, we add &lt;code&gt;labels = scales::dollar&lt;/code&gt; to the function &lt;code&gt;scale_x_log10()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent,
    size = pop)) +
  geom_point()+
  scale_x_log10(labels = scales::dollar)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder8-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Any graph should be properly labelled, and we can add labels by adding another layer: &lt;code&gt;labs&lt;/code&gt; will add the relevant labels (title, subtitle, x- and y-axes, and a caption) as shown below.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent,
    size = pop)) +
  geom_point() +
  scale_x_log10(labels = scales::dollar) +
  labs(title = &amp;quot;Life Expectancy vs GDP per capita&amp;quot;,
       subtitle = &amp;quot;1952-2007&amp;quot;, 
       x = &amp;quot;GDP per capita&amp;quot;, 
       y = &amp;quot;Life Expectancy&amp;quot;,
       caption = &amp;quot;Source: Gapminder&amp;quot;  
  )+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder9-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Style advice: While you can have the entire code for a ggplot in one single line, &lt;strong&gt;please don’t&lt;/strong&gt;! &lt;br&gt;
First, it makes it very hard to read and understand. &lt;br&gt;
Secondly, you build a ggplot in layers; by having each layer on a separate line, you can easily comment out a line (just add a hash &lt;code&gt;#&lt;/code&gt; at the beginning of the line) and see the effect of removing that layer. &lt;br&gt; What about the final &lt;code&gt;NULL&lt;/code&gt;? Well, it’s there to ensure that no matter how many lines you comment out, you have no orphan &lt;code&gt;+&lt;/code&gt; at the end and your code will run fine.&lt;/p&gt;
&lt;/blockquote&gt;
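&lt;p&gt;The effect of the trailing &lt;code&gt;NULL&lt;/code&gt; can be seen in a small sketch. Here two layers are commented out, and the plot still runs because the last &lt;code&gt;+&lt;/code&gt; is followed by &lt;code&gt;NULL&lt;/code&gt;; the object name &lt;code&gt;p&lt;/code&gt; and the choice of which layers to switch off are made up for this example.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(gapminder)

p &amp;lt;- ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp)) +
  geom_point() +
  # scale_x_log10(labels = scales::dollar) +   # layer temporarily switched off
  # theme_minimal() +                           # layer temporarily switched off
  NULL

p&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Without the final &lt;code&gt;NULL&lt;/code&gt;, the code would end on the &lt;code&gt;+&lt;/code&gt; after &lt;code&gt;geom_point()&lt;/code&gt; and R would keep waiting for the rest of the expression.&lt;/p&gt;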
&lt;p&gt;Finally, we can change the default theme, which draws the plot on a grey background; for this graph, we have chosen &lt;code&gt;theme_minimal()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent,
    size = pop)) +
  geom_point() +
  scale_x_log10(labels = scales::dollar) +
  labs(title = &amp;quot;Life Expectancy vs GDP per capita&amp;quot;,
       subtitle = &amp;quot;1952-2007&amp;quot;, 
       x = &amp;quot;GDP per capita&amp;quot;, 
       y = &amp;quot;Life Expectancy&amp;quot;,
       caption = &amp;quot;Source: Gapminder&amp;quot;  
      ) +
  theme_minimal()+
  NULL&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder10-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Your turn&lt;/strong&gt;:
&lt;br&gt;
Try experimenting with different themes. &lt;br&gt;
1. Change &lt;code&gt;theme_minimal()&lt;/code&gt; to &lt;code&gt;theme_bw()&lt;/code&gt;. What’s the difference? &lt;br&gt;
2. Now use &lt;code&gt;theme_void()&lt;/code&gt; which is an even more minimal theme!
&lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!---LEARNR EX 1--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframev1&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/ggplot_theme1/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;p&gt;Let us revisit our simple scatter plot. Because we have too many data points, we can add &lt;code&gt;alpha = 0.4&lt;/code&gt; in &lt;code&gt;geom_point()&lt;/code&gt; to make the points semi-transparent; &lt;code&gt;alpha = 1&lt;/code&gt; means solid colour and fully opaque data points, whereas lower values of &lt;code&gt;alpha&lt;/code&gt; make the points more transparent, so that overlapping points remain visible.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp, 
    colour = continent,
    size = pop)) +
  geom_point(alpha = 0.4) +
  scale_x_log10(labels = scales::dollar) +
  labs(title = &amp;quot;Life Expectancy vs GDP per capita&amp;quot;,
       subtitle = &amp;quot;1952-2007&amp;quot;, 
       x = &amp;quot;GDP per capita&amp;quot;, 
       y = &amp;quot;Life Expectancy&amp;quot;,
       caption = &amp;quot;Source: Gapminder&amp;quot;  
      ) +
  theme_minimal()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder10-1-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;layers&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Layers&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;ggplot&lt;/code&gt; creates graphics in layers. Once you define your data and the aesthetics [(x,y) coordinates, colour, size, fill, etc.], you can then add more layers, each of which ‘does’ something further to the data.&lt;/p&gt;
&lt;p&gt;In essence, each &lt;code&gt;geom&lt;/code&gt; layer specifies&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;geom&lt;/code&gt;: the graphical object to be drawn (histogram, boxplot, density plot, etc.)&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;stat&lt;/code&gt;: the statistical transformation applied to the data before it is drawn (count, density, identity, etc.)&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;position&lt;/code&gt;: how it is placed; &lt;code&gt;identity&lt;/code&gt;, &lt;code&gt;jitter&lt;/code&gt;, &lt;code&gt;dodge&lt;/code&gt;, &lt;code&gt;stack&lt;/code&gt;, &lt;code&gt;fill&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;Unfortunately, due to an early design mistake I called these either stat_() or geom_(). A better decision would have been to call them layer_() functions: that’s a more accurate description because every layer involves a stat and a geom. – Hadley Wickham&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Position adjustments are used, as the name says, to adjust the position of each geom. The available position adjustments, and the geoms that use them by default, are listed below:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;position_identity&lt;/code&gt; - default of most geoms. It does not adjust position.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;position_jitter&lt;/code&gt; - default of &lt;code&gt;geom_jitter&lt;/code&gt;. Adding random noise to a plot can sometimes make it easier to read. Jittering is particularly useful for small datasets with at least one discrete position.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;position_dodge&lt;/code&gt; - places overlapping geoms side by side; &lt;code&gt;geom_boxplot&lt;/code&gt; uses the closely related &lt;code&gt;position_dodge2&lt;/code&gt; by default. Dodging preserves the vertical position of a geom while adjusting the horizontal position.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;position_stack&lt;/code&gt; - default of &lt;code&gt;geom_bar&lt;/code&gt;, &lt;code&gt;geom_histogram&lt;/code&gt;, and &lt;code&gt;geom_area&lt;/code&gt;. It stacks bars on top of each other.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;position_fill&lt;/code&gt; - useful for &lt;code&gt;geom_bar&lt;/code&gt;, &lt;code&gt;geom_histogram&lt;/code&gt;, and &lt;code&gt;geom_area&lt;/code&gt;. It stacks bars and standardises each stack to have constant height.&lt;/li&gt;
&lt;/ul&gt;
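&lt;p&gt;Jittering is easiest to understand with a quick sketch. The choice of plot (life expectancy by continent) and the values of &lt;code&gt;width&lt;/code&gt;, &lt;code&gt;height&lt;/code&gt;, &lt;code&gt;seed&lt;/code&gt;, and &lt;code&gt;alpha&lt;/code&gt; are assumptions made for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(gapminder)

jitter_plot &amp;lt;- ggplot(
  data = gapminder,
  mapping = aes(
    x = continent,
    y = lifeExp,
    colour = continent)) +
  geom_point(
    # spread points horizontally only; seed makes the random noise reproducible
    position = position_jitter(width = 0.2, height = 0, seed = 1),
    alpha = 0.4) +
  theme_minimal()

jitter_plot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Setting &lt;code&gt;height = 0&lt;/code&gt; keeps the life expectancy values exactly where they are, so only the discrete continent axis receives noise.&lt;/p&gt;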
&lt;p&gt;Let us create a base plot of life expectancy, coloured by continent.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot &amp;lt;- 
  ggplot(
    data = gapminder,
    mapping = aes(
      x = lifeExp,
      fill = continent)
  ) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Nothing much happens, as we have just defined the base plot. Let us now plot a &lt;code&gt;geom_histogram()&lt;/code&gt;, which uses &lt;code&gt;position_stack&lt;/code&gt; as its default.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot + 
  geom_histogram()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/life_expectancy__plot1-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Is this a useful graph? &lt;code&gt;position_stack&lt;/code&gt;, the default for &lt;code&gt;geom_histogram()&lt;/code&gt;, stacks bars on top of each other. Look at the bar that appears just above a life expectancy of 70 years. Right at the bottom, we have a few observations from Oceania, followed by the blue European ones, the green Asian ones, and so on, all the way to the top, where you see the few red observations that correspond to Africa.&lt;/p&gt;
&lt;p&gt;We can improve on this by using &lt;code&gt;position = &#34;identity&#34;&lt;/code&gt;, which does not adjust position, together with &lt;code&gt;alpha = 0.3&lt;/code&gt; to make the bars more transparent. We also draw a density plot, a smoothed version of a histogram, using &lt;code&gt;geom_density()&lt;/code&gt;; its default position is identity, so the two plots show the same information in comparable form.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot + 
  geom_histogram(
    position = &amp;quot;identity&amp;quot;,
    alpha = 0.3
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/life_expectancy_plot2-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot + 
  geom_density( alpha = 0.3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/life_expectancy_plot2-2.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;If we think again about what each layer specifies, for the density plot we have:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;geom&lt;/code&gt;: density plot&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;stat&lt;/code&gt;: density&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;position&lt;/code&gt;: identity&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What if we go back to the histogram and set its position to &lt;code&gt;stack&lt;/code&gt;?&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot + 
  geom_histogram(
    position = &amp;quot;stack&amp;quot;,
    alpha = 0.3
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/life_expectancy_plot3-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot + 
  geom_histogram(alpha = 0.3)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/life_expectancy_plot3-2.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Both plots are identical; in the second one, we didn’t specify what &lt;code&gt;position&lt;/code&gt; should be, so ggplot used the default position for a histogram, which is &lt;code&gt;position = &#34;stack&#34;&lt;/code&gt;.&lt;/p&gt;
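&lt;p&gt;One way to check a geom’s default position is to look at the default value of its &lt;code&gt;position&lt;/code&gt; argument. The sketch below assumes that the installed version of ggplot2 exposes the default in this way; &lt;code&gt;default_position()&lt;/code&gt; is a hypothetical helper written for this example, not a ggplot2 function.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)

# look up the default value of the position argument of a geom function
default_position &amp;lt;- function(geom_fun) formals(geom_fun)$position

default_position(geom_histogram)
default_position(geom_density)
default_position(geom_point)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The help page of each geom (for example &lt;code&gt;?geom_histogram&lt;/code&gt;) shows the same information in its usage section.&lt;/p&gt;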
&lt;p&gt;Finally, we can also use &lt;code&gt;position = &#34;fill&#34;&lt;/code&gt;, which stacks the bars and standardises each stack to have constant height, or &lt;code&gt;position = &#34;dodge&#34;&lt;/code&gt;, which places the bars for each continent side by side.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot + 
  geom_histogram(
    position = &amp;quot;fill&amp;quot;,
    alpha = 0.3
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/life_expectancy_plot4-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;life_exp_plot + 
  geom_histogram(
    position = &amp;quot;dodge&amp;quot;,
    alpha = 0.5
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/life_expectancy_plot4-2.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;facetting&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Facetting&lt;/h2&gt;
&lt;p&gt;One of the nice features of &lt;code&gt;ggplot2&lt;/code&gt; is a special technique called faceting that allows us to split one plot into multiple plots based on a factor included in the dataset. In the &lt;code&gt;gapminder&lt;/code&gt; scatterplot example, we can use faceting to produce a separate scatter plot for each continent by using &lt;code&gt;facet_wrap&lt;/code&gt; and &lt;code&gt;facet_grid&lt;/code&gt; as shown below.&lt;/p&gt;
&lt;p&gt;Before proceeding, we will define an object &lt;code&gt;gapminder_scatterplot&lt;/code&gt; with the sequence of layers that gives us the ‘core’ life expectancy vs GDP scatterplot. Having stored the ‘core’ plot into an object, we can then add layers to it as needed, something which is useful for programming, as it saves you from retyping things.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;facet_wrap()&lt;/code&gt; lets us draw the same graph once for each value of another variable; in our case, we will look at the core scatterplot first by &lt;code&gt;continent&lt;/code&gt;, and then by &lt;code&gt;year&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#define the core gapminder scatterplot of life expectancy vs GDP
# store it in an object called `gapminder_scatterplot`
gapminder_scatterplot &amp;lt;-  
  ggplot(
    data = gapminder,
    mapping = aes(
      x = gdpPercap,
      y = lifeExp, 
      colour = continent)) +
  geom_point(alpha = 0.2) + # alpha belongs in the geom; passed to ggplot() it has no effect
  scale_x_log10(labels = scales::dollar) +
  labs(title = &amp;quot;Life Expectancy vs GDP per capita, 1952-2007&amp;quot;, 
       x = &amp;quot;GDP per capita&amp;quot;, 
       y = &amp;quot;Life Expectancy&amp;quot;,
       caption = &amp;quot;Source: Gapminder&amp;quot;  
  ) +
  theme_minimal()


# We now add a new layer to our base plot: facet_wrap(~x), 
# where x is the variable you want to facet by

# first, facet the scatterplot by continent
gapminder_scatterplot +
  facet_wrap(~continent) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder11_facet-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# then, facet the scatterplot by year
gapminder_scatterplot +
  facet_wrap(~year) &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder11_facet-2.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;We can use the &lt;code&gt;nrow&lt;/code&gt; argument to manually control the number of rows in the faceting. We take the plot faceted by continent and ask for the output in 3 rows, so &lt;code&gt;nrow = 3&lt;/code&gt;. We also do not want a legend for the colours, as the facet labels already name the continents. To remove the legend, we add &lt;code&gt;theme(legend.position=&#34;none&#34;)&lt;/code&gt; to our ggplot.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder_scatterplot +
  facet_wrap(
    facets = vars(continent),
         nrow = 3) +
  theme(legend.position=&amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder11-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;If we want to facet a plot and have its results appear in a grid, we can use &lt;code&gt;facet_grid()&lt;/code&gt;. We can define which variables the rows and the columns of the grid should correspond to.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# use facet_grid(), where rows refer to continents
gapminder_scatterplot +
  facet_grid(rows = vars(continent)) +
  theme(legend.position=&amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder_facet_grid-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# facet_grid() also lets us define *both* rows and columns
# in our scatterplot, we add a facet_grid() layer where columns = continent and rows = year
gapminder_scatterplot +
  theme_minimal(8) + # just make the font size smaller
  facet_grid(
    cols = vars(continent), 
    rows = vars(year)
    ) +
  theme(legend.position=&amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder_facet_grid-2.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Finally, if instead of a scatter plot we want to create a &lt;strong&gt;boxplot&lt;/strong&gt; of life expectancy by continent, we keep similar aesthetics, but the relevant geometry is &lt;code&gt;geom_boxplot()&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ggplot(
  data = gapminder,
  mapping = aes(
    x = continent,
    y = lifeExp, 
    fill = continent)) +
  geom_boxplot() +
  labs(title = &amp;quot;Life Expectancy among the continents, 1952-2007&amp;quot;, 
       x = &amp;quot; &amp;quot;, # Empty, as the levels of the x-variable are the continents
       y = &amp;quot;Life Expectancy&amp;quot;,
       caption = &amp;quot;Source: Gapminder&amp;quot;  
      ) +
  theme_minimal()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/gapminder12-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;tweaking-graphics-for-publication-quality&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Tweaking graphics for publication quality&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;ggplot&lt;/code&gt; comes with many other options for tweaking plots to get them just the way you want for publication. These can be a bit hard to remember, but I usually look them up in &lt;a href=&#34;http://www.cookbook-r.com/Graphs/&#34;&gt;R graphics cookbook&lt;/a&gt; and the &lt;a href=&#34;https://bbc.github.io/rcookbook/&#34;&gt;BBC Visual and Data Journalism cookbook for R graphics&lt;/a&gt;, both of which have example code to cover most use cases!&lt;/p&gt;
&lt;p&gt;In the example below, we keep only the observations for 1997, 2002, and 2007 (the &lt;code&gt;gapminder&lt;/code&gt; data are recorded every five years) and calculate, for each country, the average life expectancy, average GDP per capita, and average population in millions. We then create a new object, &lt;code&gt;gapminder9707_plot&lt;/code&gt;, which stores the series of commands that make up our plot. To actually see the plot, we either use &lt;code&gt;print(gapminder9707_plot)&lt;/code&gt; or just type &lt;code&gt;gapminder9707_plot&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder9707 &amp;lt;- gapminder %&amp;gt;% 
  group_by(continent, country) %&amp;gt;%
  filter(year %in% c(1997, 2002, 2007)) %&amp;gt;%
  summarise(avg_life = mean(lifeExp, na.rm = TRUE),
            avg_gdp = mean(gdpPercap, na.rm = TRUE),
            avg_population_millions = mean(pop/1000000, na.rm = TRUE)) %&amp;gt;% 
  ungroup()          
            
gapminder9707_plot &amp;lt;- ggplot(data = gapminder9707,
       mapping = aes(x = avg_gdp,
                     y = avg_life,
                     colour = continent,
                     size = avg_population_millions,
                     label = country)) +
  geom_point() +
  scale_x_log10(labels = scales::dollar) +
  theme_bw() +
  labs(title = &amp;quot;Life Expectancy vs GDP per capita, 1997-2007&amp;quot;, 
       x = &amp;quot;Average GDP per capita&amp;quot;, 
       y = &amp;quot;Average Life Expectancy&amp;quot;,
       caption = &amp;quot;Source: Gapminder&amp;quot;) +
  geom_text(nudge_y = -.8, size = 2.2, check_overlap = TRUE)+
  theme(legend.position=&amp;quot;none&amp;quot;) 


gapminder9707_plot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/publication_ready_plot-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
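&lt;p&gt;Once a plot is stored in an object, it can also be written to a file at a resolution suitable for print using &lt;code&gt;ggsave()&lt;/code&gt;. The sketch below uses a compact stand-in plot so that it runs on its own; the object name &lt;code&gt;plot_to_save&lt;/code&gt;, the file name, and the dimensions are illustrative assumptions, not part of the original example.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(gapminder)

# compact stand-in for `gapminder9707_plot`, so this block runs on its own
plot_to_save &amp;lt;- ggplot(
  data = gapminder,
  mapping = aes(
    x = gdpPercap,
    y = lifeExp,
    colour = continent)) +
  geom_point() +
  scale_x_log10(labels = scales::dollar) +
  theme_bw()

# hypothetical output file, written to the session temporary directory
out_file &amp;lt;- file.path(tempdir(), &amp;quot;gapminder_plot.png&amp;quot;)

# save at 8 x 6 inches and 300 dots per inch
ggsave(
  filename = out_file,
  plot = plot_to_save,
  width = 8,
  height = 6,
  units = &amp;quot;in&amp;quot;,
  dpi = 300
)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To save the plot built above instead, pass &lt;code&gt;plot = gapminder9707_plot&lt;/code&gt; and choose a file name in your project folder.&lt;/p&gt;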
&lt;/div&gt;
&lt;div id=&#34;making-plots-interactive-using-plotly&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Making plots interactive using &lt;code&gt;plotly&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;We can make our plots interactive using the &lt;code&gt;plotly&lt;/code&gt; package, which allows us to look at each point, zoom in/out, etc. Once you load the plotly library, it is simply a matter of using the &lt;code&gt;ggplotly&lt;/code&gt; command. Move your cursor over the graph and see what happens!&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;plotly::ggplotly(gapminder9707_plot)&lt;/code&gt;&lt;/pre&gt;
&lt;div id=&#34;htmlwidget-1&#34; style=&#34;width:672px;height:480px;&#34; class=&#34;plotly html-widget&#34;&gt;&lt;/div&gt;
&lt;script type=&#34;application/json&#34; data-for=&#34;htmlwidget-1&#34;&gt;{&#34;x&#34;:{&#34;data&#34;:[{&#34;x&#34;:[3.73529816636707,3.5162117673424,3.13002676887317,4.03101235742701,3.02816078018156,2.64985043208335,3.276495823188,2.8623726152838,3.11005149558937,3.03283057509239,2.44243072300279,3.54821673665988,3.22009207134793,3.29268125150292,3.68451524445853,3.87836985020573,2.88840235491599,2.76261568000973,4.1298084405593,2.83823298308871,3.06004771237998,2.96342380553952,2.81326976138648,3.13685252983265,3.12825534292477,2.71465484474075,4.0150749225077,2.98910899771779,2.8486096223189,2.96758686339591,3.20998901027445,3.96070082304799,3.52551008129843,2.80835720900158,3.62950845746983,2.77840610712675,3.24338575658821,3.82517133418015,2.87287207097169,3.15539712949296,3.18794125122919,2.85261795104452,2.96044212782028,3.91133151367053,3.31722763319711,3.62042639176536,2.96937593184278,2.96244470228726,3.77067082874193,2.97013916557606,3.05616507642202,2.80937964846766],&#34;y&#34;:[70.8156666666667,41.5656666666667,55.3036666666667,49.9726666666667,51.0896666666667,47.422,50.8283333333333,44.705,50.9163333333333,62.9286666666667,44.6716666666667,53.7513333333333,47.717,53.7736666666667,69.4536666666667,49.724,55.5526666666667,51.0246666666667,57.9856666666667,57.7833333333333,59.0103333333333,53.7126666666667,45.5883333333333,53.1696666666667,47.581,43.884,72.748,57.2356666666667,46.9356666666667,52.0626666666667,62.2803333333333,71.8303333333333,69.4796666666667,44.1506666666667,54.4313333333333,54.2253333333333,46.977,75.6526666666667,41.914,64.3903333333333,61.6163333333333,41.159,45.9633333333333,54.3133333333333,56.766,45.9236666666667,50.2113333333333,58.1236666666667,72.9793333333333,47.9776666666667,40.605,43.4283333333333],&#34;text&#34;:[&#34;avg_gdp:  5436&lt;br /&gt;avg_life: 70.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   31.231&lt;br /&gt;country: Algeria&#34;,&#34;avg_gdp:  3283&lt;br /&gt;avg_life: 41.6&lt;br 
/&gt;continent: Africa&lt;br /&gt;avg_population_millions:   11.054&lt;br /&gt;country: Angola&#34;,&#34;avg_gdp:  1349&lt;br /&gt;avg_life: 55.3&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    7.057&lt;br /&gt;country: Benin&#34;,&#34;avg_gdp: 10740&lt;br /&gt;avg_life: 50.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    1.602&lt;br /&gt;country: Botswana&#34;,&#34;avg_gdp:  1067&lt;br /&gt;avg_life: 51.1&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   12.310&lt;br /&gt;country: Burkina Faso&#34;,&#34;avg_gdp:   447&lt;br /&gt;avg_life: 47.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    7.178&lt;br /&gt;country: Burundi&#34;,&#34;avg_gdp:  1890&lt;br /&gt;avg_life: 50.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   15.941&lt;br /&gt;country: Cameroon&#34;,&#34;avg_gdp:   728&lt;br /&gt;avg_life: 44.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    4.038&lt;br /&gt;country: Central African Republic&#34;,&#34;avg_gdp:  1288&lt;br /&gt;avg_life: 50.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    8.879&lt;br /&gt;country: Chad&#34;,&#34;avg_gdp:  1079&lt;br /&gt;avg_life: 62.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    0.618&lt;br /&gt;country: Comoros&#34;,&#34;avg_gdp:   277&lt;br /&gt;avg_life: 44.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   55.929&lt;br /&gt;country: Congo, Dem. 
Rep.&#34;,&#34;avg_gdp:  3534&lt;br /&gt;avg_life: 53.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    3.310&lt;br /&gt;country: Congo, Rep.&#34;,&#34;avg_gdp:  1660&lt;br /&gt;avg_life: 47.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   16.297&lt;br /&gt;country: Cote d&#39;Ivoire&#34;,&#34;avg_gdp:  1962&lt;br /&gt;avg_life: 53.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    0.454&lt;br /&gt;country: Djibouti&#34;,&#34;avg_gdp:  4836&lt;br /&gt;avg_life: 69.5&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   73.237&lt;br /&gt;country: Egypt&#34;,&#34;avg_gdp:  7557&lt;br /&gt;avg_life: 49.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    0.496&lt;br /&gt;country: Equatorial Guinea&#34;,&#34;avg_gdp:   773&lt;br /&gt;avg_life: 55.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    4.460&lt;br /&gt;country: Eritrea&#34;,&#34;avg_gdp:   579&lt;br /&gt;avg_life: 51.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   68.107&lt;br /&gt;country: Ethiopia&#34;,&#34;avg_gdp: 13484&lt;br /&gt;avg_life: 58.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    1.293&lt;br /&gt;country: Gabon&#34;,&#34;avg_gdp:   689&lt;br /&gt;avg_life: 57.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    1.461&lt;br /&gt;country: Gambia&#34;,&#34;avg_gdp:  1148&lt;br /&gt;avg_life: 59.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   20.614&lt;br /&gt;country: Ghana&#34;,&#34;avg_gdp:   919&lt;br /&gt;avg_life: 53.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    8.935&lt;br /&gt;country: Guinea&#34;,&#34;avg_gdp:   651&lt;br /&gt;avg_life: 45.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    1.333&lt;br /&gt;country: Guinea-Bissau&#34;,&#34;avg_gdp:  1370&lt;br /&gt;avg_life: 53.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   31.754&lt;br 
/&gt;country: Kenya&#34;,&#34;avg_gdp:  1344&lt;br /&gt;avg_life: 47.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    2.014&lt;br /&gt;country: Lesotho&#34;,&#34;avg_gdp:   518&lt;br /&gt;avg_life: 43.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    2.736&lt;br /&gt;country: Liberia&#34;,&#34;avg_gdp: 10353&lt;br /&gt;avg_life: 72.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    5.388&lt;br /&gt;country: Libya&#34;,&#34;avg_gdp:   975&lt;br /&gt;avg_life: 57.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   16.602&lt;br /&gt;country: Madagascar&#34;,&#34;avg_gdp:   706&lt;br /&gt;avg_life: 46.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   11.857&lt;br /&gt;country: Malawi&#34;,&#34;avg_gdp:   928&lt;br /&gt;avg_life: 52.1&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   10.666&lt;br /&gt;country: Mali&#34;,&#34;avg_gdp:  1622&lt;br /&gt;avg_life: 62.3&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    2.848&lt;br /&gt;country: Mauritania&#34;,&#34;avg_gdp:  9135&lt;br /&gt;avg_life: 71.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    1.200&lt;br /&gt;country: Mauritius&#34;,&#34;avg_gdp:  3354&lt;br /&gt;avg_life: 69.5&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   31.151&lt;br /&gt;country: Morocco&#34;,&#34;avg_gdp:   643&lt;br /&gt;avg_life: 44.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   18.343&lt;br /&gt;country: Mozambique&#34;,&#34;avg_gdp:  4261&lt;br /&gt;avg_life: 54.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    1.934&lt;br /&gt;country: Namibia&#34;,&#34;avg_gdp:   600&lt;br /&gt;avg_life: 54.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   11.234&lt;br /&gt;country: Niger&#34;,&#34;avg_gdp:  1751&lt;br /&gt;avg_life: 47.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:  120.380&lt;br 
/&gt;country: Nigeria&#34;,&#34;avg_gdp:  6686&lt;br /&gt;avg_life: 75.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    0.742&lt;br /&gt;country: Reunion&#34;,&#34;avg_gdp:   746&lt;br /&gt;avg_life: 41.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    7.975&lt;br /&gt;country: Rwanda&#34;,&#34;avg_gdp:  1430&lt;br /&gt;avg_life: 64.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    0.172&lt;br /&gt;country: Sao Tome and Principe&#34;,&#34;avg_gdp:  1541&lt;br /&gt;avg_life: 61.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   10.891&lt;br /&gt;country: Senegal&#34;,&#34;avg_gdp:   712&lt;br /&gt;avg_life: 41.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    5.361&lt;br /&gt;country: Sierra Leone&#34;,&#34;avg_gdp:   913&lt;br /&gt;avg_life: 46.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    7.835&lt;br /&gt;country: Somalia&#34;,&#34;avg_gdp:  8153&lt;br /&gt;avg_life: 54.3&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   43.755&lt;br /&gt;country: South Africa&#34;,&#34;avg_gdp:  2076&lt;br /&gt;avg_life: 56.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   37.181&lt;br /&gt;country: Sudan&#34;,&#34;avg_gdp:  4173&lt;br /&gt;avg_life: 45.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    1.106&lt;br /&gt;country: Swaziland&#34;,&#34;avg_gdp:   932&lt;br /&gt;avg_life: 50.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   34.473&lt;br /&gt;country: Tanzania&#34;,&#34;avg_gdp:   917&lt;br /&gt;avg_life: 58.1&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    5.000&lt;br /&gt;country: Togo&#34;,&#34;avg_gdp:  5898&lt;br /&gt;avg_life: 73.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:    9.759&lt;br /&gt;country: Tunisia&#34;,&#34;avg_gdp:   934&lt;br /&gt;avg_life: 48.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   
25.040&lt;br /&gt;country: Uganda&#34;,&#34;avg_gdp:  1138&lt;br /&gt;avg_life: 40.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   10.587&lt;br /&gt;country: Zambia&#34;,&#34;avg_gdp:   645&lt;br /&gt;avg_life: 43.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions:   11.881&lt;br /&gt;country: Zimbabwe&#34;],&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;markers&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(248,118,109,1)&#34;,&#34;opacity&#34;:1,&#34;size&#34;:[6.72760602797348,5.52454756951681,5.16755216054104,4.41213862760544,5.62251958020771,5.17968548501525,5.88013868706544,4.81963153494679,5.34044521712509,4.13277174729714,7.7295011179664,4.71663739995308,5.90376239097201,4.06046263566322,8.30122249508917,4.08051501130236,4.87493748319513,8.13958204323653,4.33975554870753,4.38005760189931,6.17124751081782,5.34545406635119,4.34948176618431,6.75231541729152,4.49751602359385,4.62666602047349,4.98772210313198,5.92373866899622,5.5878102614617,5.49313800874509,4.64487722965116,4.31598727853304,6.72383986873236,6.03447276259101,4.4817371621777,5.53892501171234,9.5793254023263,4.1790594814895,5.25722641853177,3.77952755905512,5.51143547817408,4.98450219787755,5.24391149213095,7.27179461844066,6.99764690026521,4.29078480884114,6.87768026134228,4.94186890499087,5.41747897593005,6.41749218442389,5.48666683201641,5.5896428134198],&#34;symbol&#34;:&#34;circle&#34;,&#34;line&#34;:{&#34;width&#34;:1.88976377952756,&#34;color&#34;:&#34;rgba(248,118,109,1)&#34;}},&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Africa&#34;,&#34;legendgroup&#34;:&#34;Africa&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[4.03535371072954,3.54660611539427,3.92350297184792,4.51676942867998,4.05523145026848,3.79886238147315,3.9039135314952,3.83928400703392,3.67526752213631,3.82555079918739,3.73332428839496,3.
6910434950007,3.10422857646474,3.51448197982362,3.85405449985021,4.03459253822447,3.39659989857463,3.90810792731621,3.60937784709796,3.8051897695126,4.26468994293184,4.1056428099285,4.59408370350983,3.9632951225695,4.00268932982023],&#34;y&#34;:[74.3116666666667,63.829,70.928,79.6776666666667,77.4096666666667,71.628,78.055,77.194,71.013,73.8263333333333,70.7156666666667,68.5196666666667,58.5746666666667,68.8073333333333,72.292,74.9223333333333,70.7203333333333,74.6623333333333,70.6356666666667,69.9043333333333,77.147,69.42,77.454,75.3046666666667,72.8863333333333],&#34;text&#34;:[&#34;avg_gdp: 10848&lt;br /&gt;avg_life: 74.3&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   38.279&lt;br /&gt;country: Argentina&#34;,&#34;avg_gdp:  3521&lt;br /&gt;avg_life: 63.8&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    8.419&lt;br /&gt;country: Bolivia&#34;,&#34;avg_gdp:  8385&lt;br /&gt;avg_life: 70.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:  179.491&lt;br /&gt;country: Brazil&#34;,&#34;avg_gdp: 32868&lt;br /&gt;avg_life: 79.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   31.866&lt;br /&gt;country: Canada&#34;,&#34;avg_gdp: 11356&lt;br /&gt;avg_life: 77.4&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   15.461&lt;br /&gt;country: Chile&#34;,&#34;avg_gdp:  6293&lt;br /&gt;avg_life: 71.6&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   40.965&lt;br /&gt;country: Colombia&#34;,&#34;avg_gdp:  8015&lt;br /&gt;avg_life: 78.1&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    3.829&lt;br /&gt;country: Costa Rica&#34;,&#34;avg_gdp:  6907&lt;br /&gt;avg_life: 77.2&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   11.209&lt;br /&gt;country: Cuba&#34;,&#34;avg_gdp:  4734&lt;br /&gt;avg_life: 71.0&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    8.654&lt;br /&gt;country: Dominican 
Republic&#34;,&#34;avg_gdp:  6692&lt;br /&gt;avg_life: 73.8&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   12.863&lt;br /&gt;country: Ecuador&#34;,&#34;avg_gdp:  5412&lt;br /&gt;avg_life: 70.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    6.359&lt;br /&gt;country: El Salvador&#34;,&#34;avg_gdp:  4910&lt;br /&gt;avg_life: 68.5&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   11.185&lt;br /&gt;country: Guatemala&#34;,&#34;avg_gdp:  1271&lt;br /&gt;avg_life: 58.6&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    7.675&lt;br /&gt;country: Haiti&#34;,&#34;avg_gdp:  3270&lt;br /&gt;avg_life: 68.8&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    6.676&lt;br /&gt;country: Honduras&#34;,&#34;avg_gdp:  7146&lt;br /&gt;avg_life: 72.3&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    2.659&lt;br /&gt;country: Jamaica&#34;,&#34;avg_gdp: 10829&lt;br /&gt;avg_life: 74.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:  102.359&lt;br /&gt;country: Mexico&#34;,&#34;avg_gdp:  2492&lt;br /&gt;avg_life: 70.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    5.144&lt;br /&gt;country: Nicaragua&#34;,&#34;avg_gdp:  8093&lt;br /&gt;avg_life: 74.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    2.989&lt;br /&gt;country: Panama&#34;,&#34;avg_gdp:  4068&lt;br /&gt;avg_life: 70.6&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    5.902&lt;br /&gt;country: Paraguay&#34;,&#34;avg_gdp:  6385&lt;br /&gt;avg_life: 69.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   26.731&lt;br /&gt;country: Peru&#34;,&#34;avg_gdp: 18395&lt;br /&gt;avg_life: 77.1&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    3.854&lt;br /&gt;country: Puerto Rico&#34;,&#34;avg_gdp: 12754&lt;br /&gt;avg_life: 69.4&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    
1.099&lt;br /&gt;country: Trinidad and Tobago&#34;,&#34;avg_gdp: 39272&lt;br /&gt;avg_life: 77.5&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:  287.242&lt;br /&gt;country: United States&#34;,&#34;avg_gdp:  9190&lt;br /&gt;avg_life: 75.3&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:    3.358&lt;br /&gt;country: Uruguay&#34;,&#34;avg_gdp: 10062&lt;br /&gt;avg_life: 72.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions:   24.249&lt;br /&gt;country: Venezuela&#34;],&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;markers&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(163,165,0,1)&#34;,&#34;opacity&#34;:1,&#34;size&#34;:[7.04501506814027,5.29868143481951,10.8632004179082,6.75760404471265,5.84791211970399,7.15812828876902,4.79114302942878,5.53694165524078,5.32016768188401,5.66402109672951,5.09532400534705,5.53504210918148,5.22849262510052,5.12865375375649,4.61372783584097,9.12693334049441,4.95907240449911,4.66743024288311,5.04579546811574,6.50568499341888,4.79457654357769,4.28883990013158,12.7422524239991,4.72373070502124,6.37518537135309],&#34;symbol&#34;:&#34;circle&#34;,&#34;line&#34;:{&#34;width&#34;:1.88976377952756,&#34;color&#34;:&#34;rgba(163,165,0,1)&#34;}},&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Americas&#34;,&#34;legendgroup&#34;:&#34;Americas&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[2.89147347846538,4.38911659206344,3.0669981893363,3.04718265879913,3.53855823020568,4.51548362443266,3.27552612958804,3.50214933931494,3.98692194680545,3.59981101151252,4.35746144606758,4.47264552165406,3.60241299933731,3.21577679421717,4.29059590535987,4.6117860068265,3.97817992874428,4.03864060310991,3.3765017515434,2.81734497144193,3.02249136296315,4.31381862691475,3.35205449766939,3.446015835998,4.31002782822298,4.58989661059359,3.50740387576007,3.612420
46477958,4.38117840609998,3.80672723259957,3.27044045988688,3.68876098267699,3.34459367012025],&#34;y&#34;:[42.5733333333333,74.785,61.829,57.6696666666667,71.805,81.2343333333333,63.114,68.4263333333333,69.4856666666667,58.4673333333333,79.57,81.7643333333333,71.19,67.2286666666667,76.7716666666667,76.8826666666667,71.0953333333333,73.0743333333333,65.1536666666667,60.7683333333333,61.517,74.1106666666667,63.637,70.185,71.6453333333333,78.6333333333333,71.2226666666667,72.9076666666667,76.88,68.9003333333333,72.646,72.296,60.342],&#34;text&#34;:[&#34;avg_gdp:   779&lt;br /&gt;avg_life: 42.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   26.462&lt;br /&gt;country: Afghanistan&#34;,&#34;avg_gdp: 24497&lt;br /&gt;avg_life: 74.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    0.655&lt;br /&gt;country: Bahrain&#34;,&#34;avg_gdp:  1167&lt;br /&gt;avg_life: 61.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:  136.473&lt;br /&gt;country: Bangladesh&#34;,&#34;avg_gdp:  1115&lt;br /&gt;avg_life: 57.7&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   12.947&lt;br /&gt;country: Cambodia&#34;,&#34;avg_gdp:  3456&lt;br /&gt;avg_life: 71.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 1276.386&lt;br /&gt;country: China&#34;,&#34;avg_gdp: 32771&lt;br /&gt;avg_life: 81.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    6.746&lt;br /&gt;country: Hong Kong, China&#34;,&#34;avg_gdp:  1886&lt;br /&gt;avg_life: 63.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 1034.523&lt;br /&gt;country: India&#34;,&#34;avg_gdp:  3178&lt;br /&gt;avg_life: 68.4&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:  211.295&lt;br /&gt;country: Indonesia&#34;,&#34;avg_gdp:  9703&lt;br /&gt;avg_life: 69.5&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   66.563&lt;br /&gt;country: Iran&#34;,&#34;avg_gdp:  3979&lt;br /&gt;avg_life: 58.5&lt;br /&gt;continent: 
Asia&lt;br /&gt;avg_population_millions:   24.092&lt;br /&gt;country: Iraq&#34;,&#34;avg_gdp: 22775&lt;br /&gt;avg_life: 79.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    5.996&lt;br /&gt;country: Israel&#34;,&#34;avg_gdp: 29692&lt;br /&gt;avg_life: 81.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:  126.830&lt;br /&gt;country: Japan&#34;,&#34;avg_gdp:  4003&lt;br /&gt;avg_life: 71.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    5.296&lt;br /&gt;country: Jordan&#34;,&#34;avg_gdp:  1644&lt;br /&gt;avg_life: 67.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   22.367&lt;br /&gt;country: Korea, Dem. Rep.&#34;,&#34;avg_gdp: 19525&lt;br /&gt;avg_life: 76.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   47.729&lt;br /&gt;country: Korea, Rep.&#34;,&#34;avg_gdp: 40906&lt;br /&gt;avg_life: 76.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    2.127&lt;br /&gt;country: Kuwait&#34;,&#34;avg_gdp:  9510&lt;br /&gt;avg_life: 71.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    3.676&lt;br /&gt;country: Lebanon&#34;,&#34;avg_gdp: 10931&lt;br /&gt;avg_life: 73.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   22.653&lt;br /&gt;country: Malaysia&#34;,&#34;avg_gdp:  2380&lt;br /&gt;avg_life: 65.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    2.681&lt;br /&gt;country: Mongolia&#34;,&#34;avg_gdp:   657&lt;br /&gt;avg_life: 60.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   45.536&lt;br /&gt;country: Myanmar&#34;,&#34;avg_gdp:  1053&lt;br /&gt;avg_life: 61.5&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   25.926&lt;br /&gt;country: Nepal&#34;,&#34;avg_gdp: 20598&lt;br /&gt;avg_life: 74.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    2.734&lt;br /&gt;country: Oman&#34;,&#34;avg_gdp:  2249&lt;br /&gt;avg_life: 63.6&lt;br /&gt;continent: Asia&lt;br 
/&gt;avg_population_millions:  152.746&lt;br /&gt;country: Pakistan&#34;,&#34;avg_gdp:  2793&lt;br /&gt;avg_life: 70.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   83.028&lt;br /&gt;country: Philippines&#34;,&#34;avg_gdp: 20419&lt;br /&gt;avg_life: 71.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   24.444&lt;br /&gt;country: Saudi Arabia&#34;,&#34;avg_gdp: 38895&lt;br /&gt;avg_life: 78.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    4.184&lt;br /&gt;country: Singapore&#34;,&#34;avg_gdp:  3217&lt;br /&gt;avg_life: 71.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   19.551&lt;br /&gt;country: Sri Lanka&#34;,&#34;avg_gdp:  4097&lt;br /&gt;avg_life: 72.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   17.184&lt;br /&gt;country: Syria&#34;,&#34;avg_gdp: 24054&lt;br /&gt;avg_life: 76.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   22.419&lt;br /&gt;country: Taiwan&#34;,&#34;avg_gdp:  6408&lt;br /&gt;avg_life: 68.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   62.697&lt;br /&gt;country: Thailand&#34;,&#34;avg_gdp:  1864&lt;br /&gt;avg_life: 72.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   80.740&lt;br /&gt;country: Vietnam&#34;,&#34;avg_gdp:  4884&lt;br /&gt;avg_life: 72.3&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:    3.411&lt;br /&gt;country: West Bank and Gaza&#34;,&#34;avg_gdp:  2211&lt;br /&gt;avg_life: 60.3&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions:   18.913&lt;br /&gt;country: Yemen, 
Rep.&#34;],&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;markers&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,191,125,1)&#34;,&#34;opacity&#34;:1,&#34;size&#34;:[6.49185137707063,4.1470342010906,9.95537054210971,5.67026697086806,22.6771653543307,5.13588550970687,20.7924816353829,11.4657626469075,8.08976546048613,6.36673446156033,5.05613352188063,9.73289216790774,4.97693226597572,6.27170320674659,7.42752718539711,4.51928556792372,4.76982752293953,6.2876998158144,4.61746871530203,7.34241414194419,6.46404362079212,4.62626271180737,10.313641086751,8.59467173768862,6.38568597001294,4.83915669700179,6.10823856211687,5.96137058433025,6.27460109152898,7.96239174145164,8.52770538606763,4.73162726006918,6.06958165353194],&#34;symbol&#34;:&#34;circle&#34;,&#34;line&#34;:{&#34;width&#34;:1.88976377952756,&#34;color&#34;:&#34;rgba(0,191,125,1)&#34;}},&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Asia&#34;,&#34;legendgroup&#34;:&#34;Asia&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[3.66068514409951,4.5125066096541,4.48543599205057,3.78370424250963,3.90934131248531,4.08066515550477,4.27475830787878,4.51076507057034,4.45299007477962,4.45375564554941,4.47709741882628,4.36046948640673,4.17187905673463,4.50245029453059,4.51971860982492,4.43250345570442,3.87072967448547,4.52620492575145,4.6542546300248,4.09750889663096,4.28721621089926,3.93852611693021,3.91972173615152,4.1706852723108,4.32626336525676,4.39270770393705,4.4696648282381,4.54042416565394,3.85668443215778,4.47108034127712],&#34;y&#34;:[75.008,78.773,78.4303333333333,74.062,71.8216666666667,74.768,75.3353333333333,77.2073333333333,78.271,79.629,78.472,78.536,72.3226666666667,80.4023333333333,77.5966666666667,79.8686666666667,74.6563333333333,78.774,79.1886666666667,74.3276666666667,77.1193333333333,71.1726666666667,73.149,73.7243333333333,76.572,79.8303333333333
,80.1046666666667,80.5636666666667,70.4856666666667,78.3713333333333],&#34;text&#34;:[&#34;avg_gdp:  4578&lt;br /&gt;avg_life: 75.0&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    3.512&lt;br /&gt;country: Albania&#34;,&#34;avg_gdp: 32547&lt;br /&gt;avg_life: 78.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    8.139&lt;br /&gt;country: Austria&#34;,&#34;avg_gdp: 30580&lt;br /&gt;avg_life: 78.4&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   10.301&lt;br /&gt;country: Belgium&#34;,&#34;avg_gdp:  6077&lt;br /&gt;avg_life: 74.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    4.108&lt;br /&gt;country: Bosnia and Herzegovina&#34;,&#34;avg_gdp:  8116&lt;br /&gt;avg_life: 71.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    7.684&lt;br /&gt;country: Bulgaria&#34;,&#34;avg_gdp: 12041&lt;br /&gt;avg_life: 74.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    4.473&lt;br /&gt;country: Croatia&#34;,&#34;avg_gdp: 18826&lt;br /&gt;avg_life: 75.3&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   10.262&lt;br /&gt;country: Czech Republic&#34;,&#34;avg_gdp: 32416&lt;br /&gt;avg_life: 77.2&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    5.375&lt;br /&gt;country: Denmark&#34;,&#34;avg_gdp: 28379&lt;br /&gt;avg_life: 78.3&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    5.189&lt;br /&gt;country: Finland&#34;,&#34;avg_gdp: 28429&lt;br /&gt;avg_life: 79.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   59.877&lt;br /&gt;country: France&#34;,&#34;avg_gdp: 29998&lt;br /&gt;avg_life: 78.5&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   82.254&lt;br /&gt;country: Germany&#34;,&#34;avg_gdp: 22933&lt;br /&gt;avg_life: 78.5&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   10.604&lt;br /&gt;country: Greece&#34;,&#34;avg_gdp: 14855&lt;br /&gt;avg_life: 72.3&lt;br 
/&gt;continent: Europe&lt;br /&gt;avg_population_millions:   10.095&lt;br /&gt;country: Hungary&#34;,&#34;avg_gdp: 31802&lt;br /&gt;avg_life: 80.4&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    0.287&lt;br /&gt;country: Iceland&#34;,&#34;avg_gdp: 33092&lt;br /&gt;avg_life: 77.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    3.885&lt;br /&gt;country: Ireland&#34;,&#34;avg_gdp: 27071&lt;br /&gt;avg_life: 79.9&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   57.851&lt;br /&gt;country: Italy&#34;,&#34;avg_gdp:  7426&lt;br /&gt;avg_life: 74.7&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    0.699&lt;br /&gt;country: Montenegro&#34;,&#34;avg_gdp: 33590&lt;br /&gt;avg_life: 78.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   16.099&lt;br /&gt;country: Netherlands&#34;,&#34;avg_gdp: 45108&lt;br /&gt;avg_life: 79.2&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    4.523&lt;br /&gt;country: Norway&#34;,&#34;avg_gdp: 12517&lt;br /&gt;avg_life: 74.3&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   38.600&lt;br /&gt;country: Poland&#34;,&#34;avg_gdp: 19374&lt;br /&gt;avg_life: 77.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   10.411&lt;br /&gt;country: Portugal&#34;,&#34;avg_gdp:  8680&lt;br /&gt;avg_life: 71.2&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   22.414&lt;br /&gt;country: Romania&#34;,&#34;avg_gdp:  8312&lt;br /&gt;avg_life: 73.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   10.199&lt;br /&gt;country: Serbia&#34;,&#34;avg_gdp: 14814&lt;br /&gt;avg_life: 73.7&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    5.414&lt;br /&gt;country: Slovak Republic&#34;,&#34;avg_gdp: 21196&lt;br /&gt;avg_life: 76.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    2.011&lt;br /&gt;country: Slovenia&#34;,&#34;avg_gdp: 24701&lt;br /&gt;avg_life: 
79.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   40.152&lt;br /&gt;country: Spain&#34;,&#34;avg_gdp: 29489&lt;br /&gt;avg_life: 80.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    8.961&lt;br /&gt;country: Sweden&#34;,&#34;avg_gdp: 34708&lt;br /&gt;avg_life: 80.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:    7.370&lt;br /&gt;country: Switzerland&#34;,&#34;avg_gdp:  7189&lt;br /&gt;avg_life: 70.5&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   67.172&lt;br /&gt;country: Turkey&#34;,&#34;avg_gdp: 29586&lt;br /&gt;avg_life: 78.4&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions:   59.832&lt;br /&gt;country: United Kingdom&#34;],&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;markers&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(0,176,246,1)&#34;,&#34;opacity&#34;:1,&#34;size&#34;:[4.74636125829721,5.27268627833302,5.4631286533769,4.829052305981,5.22935189554844,4.87660340242221,5.45985013599608,4.9862275737699,4.96436386330961,7.86698548992764,8.57212271456568,5.4881111444911,5.44586871965755,3.9590704474622,4.79888394793951,7.7970347219531,4.16367355543581,5.89067640483796,4.88297273410438,7.0587350989316,5.47222158421371,6.27433402395287,5.45464270246248,4.99062896081242,4.49687331593756,7.12431252832405,5.34778779327948,5.19877426131893,8.10947650172528,7.86543977705],&#34;symbol&#34;:&#34;circle&#34;,&#34;line&#34;:{&#34;width&#34;:1.88976377952756,&#34;color&#34;:&#34;rgba(0,176,246,1)&#34;}},&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Europe&#34;,&#34;legendgroup&#34;:&#34;Europe&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[4.48723766591315,4.36439603639178],&#34;y&#34;:[80.145,78.9546666666667],&#34;text&#34;:[&#34;avg_gdp: 30707&lt;br /&gt;avg_life: 80.1&lt;br /&gt;continent: Oceania&lt;br 
/&gt;avg_population_millions:   19.515&lt;br /&gt;country: Australia&#34;,&#34;avg_gdp: 23142&lt;br /&gt;avg_life: 79.0&lt;br /&gt;continent: Oceania&lt;br /&gt;avg_population_millions:    3.900&lt;br /&gt;country: New Zealand&#34;],&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;markers&#34;,&#34;marker&#34;:{&#34;autocolorscale&#34;:false,&#34;color&#34;:&#34;rgba(231,107,243,1)&#34;,&#34;opacity&#34;:1,&#34;size&#34;:[6.10608530174377,4.80091886080825],&#34;symbol&#34;:&#34;circle&#34;,&#34;line&#34;:{&#34;width&#34;:1.88976377952756,&#34;color&#34;:&#34;rgba(231,107,243,1)&#34;}},&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Oceania&#34;,&#34;legendgroup&#34;:&#34;Oceania&#34;,&#34;showlegend&#34;:true,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[3.73529816636707,3.5162117673424,3.13002676887317,4.03101235742701,3.02816078018156,2.64985043208335,3.276495823188,2.8623726152838,3.11005149558937,3.03283057509239,2.44243072300279,3.54821673665988,3.22009207134793,3.29268125150292,3.68451524445853,3.87836985020573,2.88840235491599,2.76261568000973,4.1298084405593,2.83823298308871,3.06004771237998,2.96342380553952,2.81326976138648,3.13685252983265,3.12825534292477,2.71465484474075,4.0150749225077,2.98910899771779,2.8486096223189,2.96758686339591,3.20998901027445,3.96070082304799,3.52551008129843,2.80835720900158,3.62950845746983,2.77840610712675,3.24338575658821,3.82517133418015,2.87287207097169,3.15539712949296,3.18794125122919,2.85261795104452,2.96044212782028,3.91133151367053,3.31722763319711,3.62042639176536,2.96937593184278,2.96244470228726,3.77067082874193,2.97013916557606,3.05616507642202,2.80937964846766],&#34;y&#34;:[70.0156666666667,40.7656666666667,54.5036666666667,49.1726666666667,50.2896666666667,46.622,50.0283333333333,43.905,50.1163333333333,62.1286666666667,43.8716666666667,52.9513333333333,46.917,52.9736666666667,68.6536666666667,48.924,54.7526666666667,50.
2246666666667,57.1856666666667,56.9833333333333,58.2103333333333,52.9126666666667,44.7883333333333,52.3696666666667,46.781,43.084,71.948,56.4356666666667,46.1356666666667,51.2626666666667,61.4803333333333,71.0303333333333,68.6796666666667,43.3506666666667,53.6313333333333,53.4253333333333,46.177,74.8526666666667,41.114,63.5903333333333,60.8163333333333,40.359,45.1633333333333,53.5133333333333,55.966,45.1236666666667,49.4113333333333,57.3236666666667,72.1793333333333,47.1776666666667,39.805,42.6283333333333],&#34;text&#34;:[&#34;Algeria&#34;,&#34;Angola&#34;,&#34;Benin&#34;,&#34;Botswana&#34;,&#34;Burkina Faso&#34;,&#34;Burundi&#34;,&#34;Cameroon&#34;,&#34;Central African Republic&#34;,&#34;Chad&#34;,&#34;Comoros&#34;,&#34;Congo, Dem. Rep.&#34;,&#34;Congo, Rep.&#34;,&#34;Cote d&#39;Ivoire&#34;,&#34;Djibouti&#34;,&#34;Egypt&#34;,&#34;Equatorial Guinea&#34;,&#34;Eritrea&#34;,&#34;Ethiopia&#34;,&#34;Gabon&#34;,&#34;Gambia&#34;,&#34;Ghana&#34;,&#34;Guinea&#34;,&#34;Guinea-Bissau&#34;,&#34;Kenya&#34;,&#34;Lesotho&#34;,&#34;Liberia&#34;,&#34;Libya&#34;,&#34;Madagascar&#34;,&#34;Malawi&#34;,&#34;Mali&#34;,&#34;Mauritania&#34;,&#34;Mauritius&#34;,&#34;Morocco&#34;,&#34;Mozambique&#34;,&#34;Namibia&#34;,&#34;Niger&#34;,&#34;Nigeria&#34;,&#34;Reunion&#34;,&#34;Rwanda&#34;,&#34;Sao Tome and Principe&#34;,&#34;Senegal&#34;,&#34;Sierra Leone&#34;,&#34;Somalia&#34;,&#34;South Africa&#34;,&#34;Sudan&#34;,&#34;Swaziland&#34;,&#34;Tanzania&#34;,&#34;Togo&#34;,&#34;Tunisia&#34;,&#34;Uganda&#34;,&#34;Zambia&#34;,&#34;Zimbabwe&#34;],&#34;hovertext&#34;:[&#34;avg_gdp:  5436&lt;br /&gt;avg_life: 70.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Algeria&#34;,&#34;avg_gdp:  3283&lt;br /&gt;avg_life: 41.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Angola&#34;,&#34;avg_gdp:  1349&lt;br /&gt;avg_life: 55.3&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: 
Benin&#34;,&#34;avg_gdp: 10740&lt;br /&gt;avg_life: 50.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Botswana&#34;,&#34;avg_gdp:  1067&lt;br /&gt;avg_life: 51.1&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Burkina Faso&#34;,&#34;avg_gdp:   447&lt;br /&gt;avg_life: 47.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Burundi&#34;,&#34;avg_gdp:  1890&lt;br /&gt;avg_life: 50.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Cameroon&#34;,&#34;avg_gdp:   728&lt;br /&gt;avg_life: 44.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Central African Republic&#34;,&#34;avg_gdp:  1288&lt;br /&gt;avg_life: 50.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Chad&#34;,&#34;avg_gdp:  1079&lt;br /&gt;avg_life: 62.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Comoros&#34;,&#34;avg_gdp:   277&lt;br /&gt;avg_life: 44.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Congo, Dem. 
Rep.&#34;,&#34;avg_gdp:  3534&lt;br /&gt;avg_life: 53.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Congo, Rep.&#34;,&#34;avg_gdp:  1660&lt;br /&gt;avg_life: 47.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Cote d&#39;Ivoire&#34;,&#34;avg_gdp:  1962&lt;br /&gt;avg_life: 53.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Djibouti&#34;,&#34;avg_gdp:  4836&lt;br /&gt;avg_life: 69.5&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Egypt&#34;,&#34;avg_gdp:  7557&lt;br /&gt;avg_life: 49.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Equatorial Guinea&#34;,&#34;avg_gdp:   773&lt;br /&gt;avg_life: 55.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Eritrea&#34;,&#34;avg_gdp:   579&lt;br /&gt;avg_life: 51.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Ethiopia&#34;,&#34;avg_gdp: 13484&lt;br /&gt;avg_life: 58.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Gabon&#34;,&#34;avg_gdp:   689&lt;br /&gt;avg_life: 57.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Gambia&#34;,&#34;avg_gdp:  1148&lt;br /&gt;avg_life: 59.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Ghana&#34;,&#34;avg_gdp:   919&lt;br /&gt;avg_life: 53.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Guinea&#34;,&#34;avg_gdp:   651&lt;br /&gt;avg_life: 45.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Guinea-Bissau&#34;,&#34;avg_gdp:  1370&lt;br /&gt;avg_life: 53.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Kenya&#34;,&#34;avg_gdp:  1344&lt;br /&gt;avg_life: 
47.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Lesotho&#34;,&#34;avg_gdp:   518&lt;br /&gt;avg_life: 43.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Liberia&#34;,&#34;avg_gdp: 10353&lt;br /&gt;avg_life: 72.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Libya&#34;,&#34;avg_gdp:   975&lt;br /&gt;avg_life: 57.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Madagascar&#34;,&#34;avg_gdp:   706&lt;br /&gt;avg_life: 46.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Malawi&#34;,&#34;avg_gdp:   928&lt;br /&gt;avg_life: 52.1&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Mali&#34;,&#34;avg_gdp:  1622&lt;br /&gt;avg_life: 62.3&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Mauritania&#34;,&#34;avg_gdp:  9135&lt;br /&gt;avg_life: 71.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Mauritius&#34;,&#34;avg_gdp:  3354&lt;br /&gt;avg_life: 69.5&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Morocco&#34;,&#34;avg_gdp:   643&lt;br /&gt;avg_life: 44.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Mozambique&#34;,&#34;avg_gdp:  4261&lt;br /&gt;avg_life: 54.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Namibia&#34;,&#34;avg_gdp:   600&lt;br /&gt;avg_life: 54.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Niger&#34;,&#34;avg_gdp:  1751&lt;br /&gt;avg_life: 47.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Nigeria&#34;,&#34;avg_gdp:  6686&lt;br /&gt;avg_life: 75.7&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 
2.2&lt;br /&gt;country: Reunion&#34;,&#34;avg_gdp:   746&lt;br /&gt;avg_life: 41.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Rwanda&#34;,&#34;avg_gdp:  1430&lt;br /&gt;avg_life: 64.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Sao Tome and Principe&#34;,&#34;avg_gdp:  1541&lt;br /&gt;avg_life: 61.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Senegal&#34;,&#34;avg_gdp:   712&lt;br /&gt;avg_life: 41.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Sierra Leone&#34;,&#34;avg_gdp:   913&lt;br /&gt;avg_life: 46.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Somalia&#34;,&#34;avg_gdp:  8153&lt;br /&gt;avg_life: 54.3&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: South Africa&#34;,&#34;avg_gdp:  2076&lt;br /&gt;avg_life: 56.8&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Sudan&#34;,&#34;avg_gdp:  4173&lt;br /&gt;avg_life: 45.9&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Swaziland&#34;,&#34;avg_gdp:   932&lt;br /&gt;avg_life: 50.2&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Tanzania&#34;,&#34;avg_gdp:   917&lt;br /&gt;avg_life: 58.1&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Togo&#34;,&#34;avg_gdp:  5898&lt;br /&gt;avg_life: 73.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Tunisia&#34;,&#34;avg_gdp:   934&lt;br /&gt;avg_life: 48.0&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Uganda&#34;,&#34;avg_gdp:  1138&lt;br /&gt;avg_life: 40.6&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Zambia&#34;,&#34;avg_gdp:   
645&lt;br /&gt;avg_life: 43.4&lt;br /&gt;continent: Africa&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Zimbabwe&#34;],&#34;textfont&#34;:{&#34;size&#34;:8.31496062992126,&#34;color&#34;:&#34;rgba(248,118,109,1)&#34;},&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;text&#34;,&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Africa&#34;,&#34;legendgroup&#34;:&#34;Africa&#34;,&#34;showlegend&#34;:false,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[4.03535371072954,3.54660611539427,3.92350297184792,4.51676942867998,4.05523145026848,3.79886238147315,3.9039135314952,3.83928400703392,3.67526752213631,3.82555079918739,3.73332428839496,3.6910434950007,3.10422857646474,3.51448197982362,3.85405449985021,4.03459253822447,3.39659989857463,3.90810792731621,3.60937784709796,3.8051897695126,4.26468994293184,4.1056428099285,4.59408370350983,3.9632951225695,4.00268932982023],&#34;y&#34;:[73.5116666666667,63.029,70.128,78.8776666666667,76.6096666666667,70.828,77.255,76.394,70.213,73.0263333333333,69.9156666666667,67.7196666666667,57.7746666666667,68.0073333333333,71.492,74.1223333333333,69.9203333333333,73.8623333333333,69.8356666666667,69.1043333333333,76.347,68.62,76.654,74.5046666666667,72.0863333333333],&#34;text&#34;:[&#34;Argentina&#34;,&#34;Bolivia&#34;,&#34;Brazil&#34;,&#34;Canada&#34;,&#34;Chile&#34;,&#34;Colombia&#34;,&#34;Costa Rica&#34;,&#34;Cuba&#34;,&#34;Dominican Republic&#34;,&#34;Ecuador&#34;,&#34;El Salvador&#34;,&#34;Guatemala&#34;,&#34;Haiti&#34;,&#34;Honduras&#34;,&#34;Jamaica&#34;,&#34;Mexico&#34;,&#34;Nicaragua&#34;,&#34;Panama&#34;,&#34;Paraguay&#34;,&#34;Peru&#34;,&#34;Puerto Rico&#34;,&#34;Trinidad and Tobago&#34;,&#34;United States&#34;,&#34;Uruguay&#34;,&#34;Venezuela&#34;],&#34;hovertext&#34;:[&#34;avg_gdp: 10848&lt;br /&gt;avg_life: 74.3&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: 
Argentina&#34;,&#34;avg_gdp:  3521&lt;br /&gt;avg_life: 63.8&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Bolivia&#34;,&#34;avg_gdp:  8385&lt;br /&gt;avg_life: 70.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Brazil&#34;,&#34;avg_gdp: 32868&lt;br /&gt;avg_life: 79.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Canada&#34;,&#34;avg_gdp: 11356&lt;br /&gt;avg_life: 77.4&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Chile&#34;,&#34;avg_gdp:  6293&lt;br /&gt;avg_life: 71.6&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Colombia&#34;,&#34;avg_gdp:  8015&lt;br /&gt;avg_life: 78.1&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Costa Rica&#34;,&#34;avg_gdp:  6907&lt;br /&gt;avg_life: 77.2&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Cuba&#34;,&#34;avg_gdp:  4734&lt;br /&gt;avg_life: 71.0&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Dominican Republic&#34;,&#34;avg_gdp:  6692&lt;br /&gt;avg_life: 73.8&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Ecuador&#34;,&#34;avg_gdp:  5412&lt;br /&gt;avg_life: 70.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: El Salvador&#34;,&#34;avg_gdp:  4910&lt;br /&gt;avg_life: 68.5&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Guatemala&#34;,&#34;avg_gdp:  1271&lt;br /&gt;avg_life: 58.6&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Haiti&#34;,&#34;avg_gdp:  3270&lt;br /&gt;avg_life: 68.8&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Honduras&#34;,&#34;avg_gdp:  
7146&lt;br /&gt;avg_life: 72.3&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Jamaica&#34;,&#34;avg_gdp: 10829&lt;br /&gt;avg_life: 74.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Mexico&#34;,&#34;avg_gdp:  2492&lt;br /&gt;avg_life: 70.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Nicaragua&#34;,&#34;avg_gdp:  8093&lt;br /&gt;avg_life: 74.7&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Panama&#34;,&#34;avg_gdp:  4068&lt;br /&gt;avg_life: 70.6&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Paraguay&#34;,&#34;avg_gdp:  6385&lt;br /&gt;avg_life: 69.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Peru&#34;,&#34;avg_gdp: 18395&lt;br /&gt;avg_life: 77.1&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Puerto Rico&#34;,&#34;avg_gdp: 12754&lt;br /&gt;avg_life: 69.4&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Trinidad and Tobago&#34;,&#34;avg_gdp: 39272&lt;br /&gt;avg_life: 77.5&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: United States&#34;,&#34;avg_gdp:  9190&lt;br /&gt;avg_life: 75.3&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Uruguay&#34;,&#34;avg_gdp: 10062&lt;br /&gt;avg_life: 72.9&lt;br /&gt;continent: Americas&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: 
Venezuela&#34;],&#34;textfont&#34;:{&#34;size&#34;:8.31496062992126,&#34;color&#34;:&#34;rgba(163,165,0,1)&#34;},&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;text&#34;,&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Americas&#34;,&#34;legendgroup&#34;:&#34;Americas&#34;,&#34;showlegend&#34;:false,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[2.89147347846538,4.38911659206344,3.0669981893363,3.04718265879913,3.53855823020568,4.51548362443266,3.27552612958804,3.50214933931494,3.98692194680545,3.59981101151252,4.35746144606758,4.47264552165406,3.60241299933731,3.21577679421717,4.29059590535987,4.6117860068265,3.97817992874428,4.03864060310991,3.3765017515434,2.81734497144193,3.02249136296315,4.31381862691475,3.35205449766939,3.446015835998,4.31002782822298,4.58989661059359,3.50740387576007,3.61242046477958,4.38117840609998,3.80672723259957,3.27044045988688,3.68876098267699,3.34459367012025],&#34;y&#34;:[41.7733333333333,73.985,61.029,56.8696666666667,71.005,80.4343333333333,62.314,67.6263333333333,68.6856666666667,57.6673333333333,78.77,80.9643333333333,70.39,66.4286666666667,75.9716666666667,76.0826666666667,70.2953333333333,72.2743333333333,64.3536666666667,59.9683333333333,60.717,73.3106666666667,62.837,69.385,70.8453333333333,77.8333333333333,70.4226666666667,72.1076666666667,76.08,68.1003333333333,71.846,71.496,59.542],&#34;text&#34;:[&#34;Afghanistan&#34;,&#34;Bahrain&#34;,&#34;Bangladesh&#34;,&#34;Cambodia&#34;,&#34;China&#34;,&#34;Hong Kong, China&#34;,&#34;India&#34;,&#34;Indonesia&#34;,&#34;Iran&#34;,&#34;Iraq&#34;,&#34;Israel&#34;,&#34;Japan&#34;,&#34;Jordan&#34;,&#34;Korea, Dem. 
Rep.&#34;,&#34;Korea, Rep.&#34;,&#34;Kuwait&#34;,&#34;Lebanon&#34;,&#34;Malaysia&#34;,&#34;Mongolia&#34;,&#34;Myanmar&#34;,&#34;Nepal&#34;,&#34;Oman&#34;,&#34;Pakistan&#34;,&#34;Philippines&#34;,&#34;Saudi Arabia&#34;,&#34;Singapore&#34;,&#34;Sri Lanka&#34;,&#34;Syria&#34;,&#34;Taiwan&#34;,&#34;Thailand&#34;,&#34;Vietnam&#34;,&#34;West Bank and Gaza&#34;,&#34;Yemen, Rep.&#34;],&#34;hovertext&#34;:[&#34;avg_gdp:   779&lt;br /&gt;avg_life: 42.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Afghanistan&#34;,&#34;avg_gdp: 24497&lt;br /&gt;avg_life: 74.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Bahrain&#34;,&#34;avg_gdp:  1167&lt;br /&gt;avg_life: 61.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Bangladesh&#34;,&#34;avg_gdp:  1115&lt;br /&gt;avg_life: 57.7&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Cambodia&#34;,&#34;avg_gdp:  3456&lt;br /&gt;avg_life: 71.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: China&#34;,&#34;avg_gdp: 32771&lt;br /&gt;avg_life: 81.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Hong Kong, China&#34;,&#34;avg_gdp:  1886&lt;br /&gt;avg_life: 63.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: India&#34;,&#34;avg_gdp:  3178&lt;br /&gt;avg_life: 68.4&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Indonesia&#34;,&#34;avg_gdp:  9703&lt;br /&gt;avg_life: 69.5&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Iran&#34;,&#34;avg_gdp:  3979&lt;br /&gt;avg_life: 58.5&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Iraq&#34;,&#34;avg_gdp: 22775&lt;br /&gt;avg_life: 79.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br 
/&gt;country: Israel&#34;,&#34;avg_gdp: 29692&lt;br /&gt;avg_life: 81.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Japan&#34;,&#34;avg_gdp:  4003&lt;br /&gt;avg_life: 71.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Jordan&#34;,&#34;avg_gdp:  1644&lt;br /&gt;avg_life: 67.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Korea, Dem. Rep.&#34;,&#34;avg_gdp: 19525&lt;br /&gt;avg_life: 76.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Korea, Rep.&#34;,&#34;avg_gdp: 40906&lt;br /&gt;avg_life: 76.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Kuwait&#34;,&#34;avg_gdp:  9510&lt;br /&gt;avg_life: 71.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Lebanon&#34;,&#34;avg_gdp: 10931&lt;br /&gt;avg_life: 73.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Malaysia&#34;,&#34;avg_gdp:  2380&lt;br /&gt;avg_life: 65.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Mongolia&#34;,&#34;avg_gdp:   657&lt;br /&gt;avg_life: 60.8&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Myanmar&#34;,&#34;avg_gdp:  1053&lt;br /&gt;avg_life: 61.5&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Nepal&#34;,&#34;avg_gdp: 20598&lt;br /&gt;avg_life: 74.1&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Oman&#34;,&#34;avg_gdp:  2249&lt;br /&gt;avg_life: 63.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Pakistan&#34;,&#34;avg_gdp:  2793&lt;br /&gt;avg_life: 70.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Philippines&#34;,&#34;avg_gdp: 20419&lt;br /&gt;avg_life: 71.6&lt;br 
/&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Saudi Arabia&#34;,&#34;avg_gdp: 38895&lt;br /&gt;avg_life: 78.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Singapore&#34;,&#34;avg_gdp:  3217&lt;br /&gt;avg_life: 71.2&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Sri Lanka&#34;,&#34;avg_gdp:  4097&lt;br /&gt;avg_life: 72.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Syria&#34;,&#34;avg_gdp: 24054&lt;br /&gt;avg_life: 76.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Taiwan&#34;,&#34;avg_gdp:  6408&lt;br /&gt;avg_life: 68.9&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Thailand&#34;,&#34;avg_gdp:  1864&lt;br /&gt;avg_life: 72.6&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Vietnam&#34;,&#34;avg_gdp:  4884&lt;br /&gt;avg_life: 72.3&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: West Bank and Gaza&#34;,&#34;avg_gdp:  2211&lt;br /&gt;avg_life: 60.3&lt;br /&gt;continent: Asia&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Yemen, 
Rep.&#34;],&#34;textfont&#34;:{&#34;size&#34;:8.31496062992126,&#34;color&#34;:&#34;rgba(0,191,125,1)&#34;},&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;text&#34;,&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Asia&#34;,&#34;legendgroup&#34;:&#34;Asia&#34;,&#34;showlegend&#34;:false,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[3.66068514409951,4.5125066096541,4.48543599205057,3.78370424250963,3.90934131248531,4.08066515550477,4.27475830787878,4.51076507057034,4.45299007477962,4.45375564554941,4.47709741882628,4.36046948640673,4.17187905673463,4.50245029453059,4.51971860982492,4.43250345570442,3.87072967448547,4.52620492575145,4.6542546300248,4.09750889663096,4.28721621089926,3.93852611693021,3.91972173615152,4.1706852723108,4.32626336525676,4.39270770393705,4.4696648282381,4.54042416565394,3.85668443215778,4.47108034127712],&#34;y&#34;:[74.208,77.973,77.6303333333333,73.262,71.0216666666667,73.968,74.5353333333333,76.4073333333333,77.471,78.829,77.672,77.736,71.5226666666667,79.6023333333333,76.7966666666667,79.0686666666667,73.8563333333333,77.974,78.3886666666667,73.5276666666667,76.3193333333333,70.3726666666667,72.349,72.9243333333333,75.772,79.0303333333333,79.3046666666667,79.7636666666667,69.6856666666667,77.5713333333333],&#34;text&#34;:[&#34;Albania&#34;,&#34;Austria&#34;,&#34;Belgium&#34;,&#34;Bosnia and Herzegovina&#34;,&#34;Bulgaria&#34;,&#34;Croatia&#34;,&#34;Czech Republic&#34;,&#34;Denmark&#34;,&#34;Finland&#34;,&#34;France&#34;,&#34;Germany&#34;,&#34;Greece&#34;,&#34;Hungary&#34;,&#34;Iceland&#34;,&#34;Ireland&#34;,&#34;Italy&#34;,&#34;Montenegro&#34;,&#34;Netherlands&#34;,&#34;Norway&#34;,&#34;Poland&#34;,&#34;Portugal&#34;,&#34;Romania&#34;,&#34;Serbia&#34;,&#34;Slovak Republic&#34;,&#34;Slovenia&#34;,&#34;Spain&#34;,&#34;Sweden&#34;,&#34;Switzerland&#34;,&#34;Turkey&#34;,&#34;United Kingdom&#34;],&#34;hovertext&#34;:[&#34;avg_gdp:  4578&lt;br 
/&gt;avg_life: 75.0&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Albania&#34;,&#34;avg_gdp: 32547&lt;br /&gt;avg_life: 78.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Austria&#34;,&#34;avg_gdp: 30580&lt;br /&gt;avg_life: 78.4&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Belgium&#34;,&#34;avg_gdp:  6077&lt;br /&gt;avg_life: 74.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Bosnia and Herzegovina&#34;,&#34;avg_gdp:  8116&lt;br /&gt;avg_life: 71.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Bulgaria&#34;,&#34;avg_gdp: 12041&lt;br /&gt;avg_life: 74.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Croatia&#34;,&#34;avg_gdp: 18826&lt;br /&gt;avg_life: 75.3&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Czech Republic&#34;,&#34;avg_gdp: 32416&lt;br /&gt;avg_life: 77.2&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Denmark&#34;,&#34;avg_gdp: 28379&lt;br /&gt;avg_life: 78.3&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Finland&#34;,&#34;avg_gdp: 28429&lt;br /&gt;avg_life: 79.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: France&#34;,&#34;avg_gdp: 29998&lt;br /&gt;avg_life: 78.5&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Germany&#34;,&#34;avg_gdp: 22933&lt;br /&gt;avg_life: 78.5&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Greece&#34;,&#34;avg_gdp: 14855&lt;br /&gt;avg_life: 72.3&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Hungary&#34;,&#34;avg_gdp: 31802&lt;br /&gt;avg_life: 80.4&lt;br /&gt;continent: Europe&lt;br 
/&gt;avg_population_millions: 2.2&lt;br /&gt;country: Iceland&#34;,&#34;avg_gdp: 33092&lt;br /&gt;avg_life: 77.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Ireland&#34;,&#34;avg_gdp: 27071&lt;br /&gt;avg_life: 79.9&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Italy&#34;,&#34;avg_gdp:  7426&lt;br /&gt;avg_life: 74.7&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Montenegro&#34;,&#34;avg_gdp: 33590&lt;br /&gt;avg_life: 78.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Netherlands&#34;,&#34;avg_gdp: 45108&lt;br /&gt;avg_life: 79.2&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Norway&#34;,&#34;avg_gdp: 12517&lt;br /&gt;avg_life: 74.3&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Poland&#34;,&#34;avg_gdp: 19374&lt;br /&gt;avg_life: 77.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Portugal&#34;,&#34;avg_gdp:  8680&lt;br /&gt;avg_life: 71.2&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Romania&#34;,&#34;avg_gdp:  8312&lt;br /&gt;avg_life: 73.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Serbia&#34;,&#34;avg_gdp: 14814&lt;br /&gt;avg_life: 73.7&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Slovak Republic&#34;,&#34;avg_gdp: 21196&lt;br /&gt;avg_life: 76.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Slovenia&#34;,&#34;avg_gdp: 24701&lt;br /&gt;avg_life: 79.8&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Spain&#34;,&#34;avg_gdp: 29489&lt;br /&gt;avg_life: 80.1&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: 
Sweden&#34;,&#34;avg_gdp: 34708&lt;br /&gt;avg_life: 80.6&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Switzerland&#34;,&#34;avg_gdp:  7189&lt;br /&gt;avg_life: 70.5&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Turkey&#34;,&#34;avg_gdp: 29586&lt;br /&gt;avg_life: 78.4&lt;br /&gt;continent: Europe&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: United Kingdom&#34;],&#34;textfont&#34;:{&#34;size&#34;:8.31496062992126,&#34;color&#34;:&#34;rgba(0,176,246,1)&#34;},&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;text&#34;,&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Europe&#34;,&#34;legendgroup&#34;:&#34;Europe&#34;,&#34;showlegend&#34;:false,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null},{&#34;x&#34;:[4.48723766591315,4.36439603639178],&#34;y&#34;:[79.345,78.1546666666667],&#34;text&#34;:[&#34;Australia&#34;,&#34;New Zealand&#34;],&#34;hovertext&#34;:[&#34;avg_gdp: 30707&lt;br /&gt;avg_life: 80.1&lt;br /&gt;continent: Oceania&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: Australia&#34;,&#34;avg_gdp: 23142&lt;br /&gt;avg_life: 79.0&lt;br /&gt;continent: Oceania&lt;br /&gt;avg_population_millions: 2.2&lt;br /&gt;country: New 
Zealand&#34;],&#34;textfont&#34;:{&#34;size&#34;:8.31496062992126,&#34;color&#34;:&#34;rgba(231,107,243,1)&#34;},&#34;type&#34;:&#34;scatter&#34;,&#34;mode&#34;:&#34;text&#34;,&#34;hoveron&#34;:&#34;points&#34;,&#34;name&#34;:&#34;Oceania&#34;,&#34;legendgroup&#34;:&#34;Oceania&#34;,&#34;showlegend&#34;:false,&#34;xaxis&#34;:&#34;x&#34;,&#34;yaxis&#34;:&#34;y&#34;,&#34;hoverinfo&#34;:&#34;text&#34;,&#34;frame&#34;:null}],&#34;layout&#34;:{&#34;margin&#34;:{&#34;t&#34;:43.7625570776256,&#34;r&#34;:7.30593607305936,&#34;b&#34;:40.1826484018265,&#34;l&#34;:37.2602739726027},&#34;plot_bgcolor&#34;:&#34;rgba(255,255,255,1)&#34;,&#34;paper_bgcolor&#34;:&#34;rgba(255,255,255,1)&#34;,&#34;font&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:14.6118721461187},&#34;title&#34;:{&#34;text&#34;:&#34;Life Expectancy vs GDP per capita, 1997-2007&#34;,&#34;font&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:17.5342465753425},&#34;x&#34;:0,&#34;xref&#34;:&#34;paper&#34;},&#34;xaxis&#34;:{&#34;domain&#34;:[0,1],&#34;automargin&#34;:true,&#34;type&#34;:&#34;linear&#34;,&#34;autorange&#34;:false,&#34;range&#34;:[2.33183952765169,4.7648458253759],&#34;tickmode&#34;:&#34;array&#34;,&#34;ticktext&#34;:[&#34;$300.00&#34;,&#34;$1,000.00&#34;,&#34;$3,000.00&#34;,&#34;$10,000.00&#34;,&#34;$30,000.00&#34;],&#34;tickvals&#34;:[2.47712125471966,3,3.47712125471966,4,4.47712125471966],&#34;categoryorder&#34;:&#34;array&#34;,&#34;categoryarray&#34;:[&#34;$300.00&#34;,&#34;$1,000.00&#34;,&#34;$3,000.00&#34;,&#34;$10,000.00&#34;,&#34;$30,000.00&#34;],&#34;nticks&#34;:null,&#34;ticks&#34;:&#34;outside&#34;,&#34;tickcolor&#34;:&#34;rgba(51,51,51,1)&#34;,&#34;ticklen&#34;:3.65296803652968,&#34;tickwidth&#34;:0.66417600664176,&#34;showticklabels&#34;:true,&#34;tickfont&#34;:{&#34;color&#34;:&#34;rgba(77,77,77,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:11.689497716895},&#34;tickangle&#34;:-0,&#34;showline&#34;:false
,&#34;linecolor&#34;:null,&#34;linewidth&#34;:0,&#34;showgrid&#34;:true,&#34;gridcolor&#34;:&#34;rgba(235,235,235,1)&#34;,&#34;gridwidth&#34;:0.66417600664176,&#34;zeroline&#34;:false,&#34;anchor&#34;:&#34;y&#34;,&#34;title&#34;:{&#34;text&#34;:&#34;Average GDP per capita&#34;,&#34;font&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:14.6118721461187}},&#34;hoverformat&#34;:&#34;.2f&#34;},&#34;yaxis&#34;:{&#34;domain&#34;:[0,1],&#34;automargin&#34;:true,&#34;type&#34;:&#34;linear&#34;,&#34;autorange&#34;:false,&#34;range&#34;:[37.7070333333333,83.8623],&#34;tickmode&#34;:&#34;array&#34;,&#34;ticktext&#34;:[&#34;40&#34;,&#34;50&#34;,&#34;60&#34;,&#34;70&#34;,&#34;80&#34;],&#34;tickvals&#34;:[40,50,60,70,80],&#34;categoryorder&#34;:&#34;array&#34;,&#34;categoryarray&#34;:[&#34;40&#34;,&#34;50&#34;,&#34;60&#34;,&#34;70&#34;,&#34;80&#34;],&#34;nticks&#34;:null,&#34;ticks&#34;:&#34;outside&#34;,&#34;tickcolor&#34;:&#34;rgba(51,51,51,1)&#34;,&#34;ticklen&#34;:3.65296803652968,&#34;tickwidth&#34;:0.66417600664176,&#34;showticklabels&#34;:true,&#34;tickfont&#34;:{&#34;color&#34;:&#34;rgba(77,77,77,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:11.689497716895},&#34;tickangle&#34;:-0,&#34;showline&#34;:false,&#34;linecolor&#34;:null,&#34;linewidth&#34;:0,&#34;showgrid&#34;:true,&#34;gridcolor&#34;:&#34;rgba(235,235,235,1)&#34;,&#34;gridwidth&#34;:0.66417600664176,&#34;zeroline&#34;:false,&#34;anchor&#34;:&#34;x&#34;,&#34;title&#34;:{&#34;text&#34;:&#34;Average Life 
Expectancy&#34;,&#34;font&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:14.6118721461187}},&#34;hoverformat&#34;:&#34;.2f&#34;},&#34;shapes&#34;:[{&#34;type&#34;:&#34;rect&#34;,&#34;fillcolor&#34;:&#34;transparent&#34;,&#34;line&#34;:{&#34;color&#34;:&#34;rgba(51,51,51,1)&#34;,&#34;width&#34;:0.66417600664176,&#34;linetype&#34;:&#34;solid&#34;},&#34;yref&#34;:&#34;paper&#34;,&#34;xref&#34;:&#34;paper&#34;,&#34;x0&#34;:0,&#34;x1&#34;:1,&#34;y0&#34;:0,&#34;y1&#34;:1}],&#34;showlegend&#34;:false,&#34;legend&#34;:{&#34;bgcolor&#34;:&#34;rgba(255,255,255,1)&#34;,&#34;bordercolor&#34;:&#34;transparent&#34;,&#34;borderwidth&#34;:1.88976377952756,&#34;font&#34;:{&#34;color&#34;:&#34;rgba(0,0,0,1)&#34;,&#34;family&#34;:&#34;&#34;,&#34;size&#34;:11.689497716895}},&#34;hovermode&#34;:&#34;closest&#34;,&#34;barmode&#34;:&#34;relative&#34;},&#34;config&#34;:{&#34;doubleClick&#34;:&#34;reset&#34;,&#34;showSendToCloud&#34;:false},&#34;source&#34;:&#34;A&#34;,&#34;attrs&#34;:{&#34;56863f0a&#34;:{&#34;x&#34;:{},&#34;y&#34;:{},&#34;colour&#34;:{},&#34;size&#34;:{},&#34;label&#34;:{},&#34;type&#34;:&#34;scatter&#34;},&#34;56816713709&#34;:{&#34;x&#34;:{},&#34;y&#34;:{},&#34;colour&#34;:{},&#34;size&#34;:{},&#34;label&#34;:{}}},&#34;cur_data&#34;:&#34;56863f0a&#34;,&#34;visdat&#34;:{&#34;56863f0a&#34;:[&#34;function (y) &#34;,&#34;x&#34;],&#34;56816713709&#34;:[&#34;function (y) 
&#34;,&#34;x&#34;]},&#34;highlight&#34;:{&#34;on&#34;:&#34;plotly_click&#34;,&#34;persistent&#34;:false,&#34;dynamic&#34;:false,&#34;selectize&#34;:false,&#34;opacityDim&#34;:0.2,&#34;selected&#34;:{&#34;opacity&#34;:1},&#34;debounce&#34;:0},&#34;shinyEvents&#34;:[&#34;plotly_hover&#34;,&#34;plotly_click&#34;,&#34;plotly_selected&#34;,&#34;plotly_relayout&#34;,&#34;plotly_brushed&#34;,&#34;plotly_brushing&#34;,&#34;plotly_clickannotation&#34;,&#34;plotly_doubleclick&#34;,&#34;plotly_deselect&#34;,&#34;plotly_afterplot&#34;,&#34;plotly_sunburstclick&#34;],&#34;base_url&#34;:&#34;https://plot.ly&#34;},&#34;evals&#34;:[],&#34;jsHooks&#34;:[]}&lt;/script&gt;
&lt;/div&gt;
&lt;div id=&#34;animated-graphs&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Animated Graphs&lt;/h2&gt;
&lt;p&gt;Animated graphs have recently become popular. The internet is full of tutorials and code-throughs where people explain how to do something interesting with R, so here is one if you want to know more about &lt;a href=&#34;https://www.infoworld.com/video/89987/r-tip-animations-in-r&#34;&gt;animations in R&lt;/a&gt;. You have to install the &lt;code&gt;gganimate&lt;/code&gt; package, and animated graphs usually take some time to produce, as R needs to render a number of individual frames and then combine them into the animation, so please be patient!&lt;/p&gt;
&lt;div id=&#34;gapminder-animations---transition_time&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Gapminder Animations - &lt;code&gt;transition_time()&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;First we look at an animated boxplot of life expectancy by continent over time. The code to produce the plot is fairly straightforward &lt;code&gt;ggplot&lt;/code&gt;, but the last couple of lines (&lt;code&gt;transition_time(year)&lt;/code&gt; and &lt;code&gt;ease_aes(&#34;linear&#34;)&lt;/code&gt;) are the ones that produce the animation.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(gganimate)

boxplot_animation &amp;lt;- ggplot(data = gapminder,
       mapping = aes(x = continent,
                     y = lifeExp,
                     fill = continent)) +
  geom_boxplot() +
  theme_bw() +
  theme(legend.position=&amp;quot;none&amp;quot;) +
  labs(title = &amp;quot;Year: {frame_time}&amp;quot;, 
       x = &amp;quot;Continent&amp;quot;, 
       y = &amp;quot;Life Expectancy&amp;quot;) +  
  transition_time(year) +
  ease_aes(&amp;quot;linear&amp;quot;)


animate(boxplot_animation, height=600, width = 600)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/animated_boxplot-1.gif&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
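&lt;p&gt;Since rendering can be slow, it may help to render fewer frames while you are still experimenting, and to save the result so that it does not have to be rebuilt. The following is a minimal sketch, assuming the &lt;code&gt;gapminder&lt;/code&gt;, &lt;code&gt;ggplot2&lt;/code&gt; and &lt;code&gt;gganimate&lt;/code&gt; packages are installed; the object names &lt;code&gt;quick_animation&lt;/code&gt;, &lt;code&gt;rendered&lt;/code&gt; and &lt;code&gt;gif_path&lt;/code&gt;, as well as the frame settings, are our own illustrative choices.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(gapminder)
library(gganimate)

# Same animated boxplot as above, kept short for illustration
quick_animation &amp;lt;- ggplot(data = gapminder,
       mapping = aes(x = continent,
                     y = lifeExp,
                     fill = continent)) +
  geom_boxplot() +
  theme_bw() +
  theme(legend.position = &amp;quot;none&amp;quot;) +
  labs(title = &amp;quot;Year: {frame_time}&amp;quot;) +
  transition_time(year)

# Fewer frames render faster; animate() defaults to 100 frames at 10 frames per second
rendered &amp;lt;- animate(quick_animation, nframes = 24, fps = 8, height = 400, width = 400)

# Save the rendered animation to a file (here a temporary folder, as an assumption)
gif_path &amp;lt;- file.path(tempdir(), &amp;quot;boxplot_animation.gif&amp;quot;)
anim_save(gif_path, animation = rendered)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once you are happy with the plot, you can increase &lt;code&gt;nframes&lt;/code&gt; again to get a smoother animation.&lt;/p&gt;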
&lt;p&gt;If we want to animate the evolution of the relationship between life expectancy and GDP, similar to &lt;a href=&#34;https://www.youtube.com/watch?v=jbkSRLYSojo&#34;&gt;Hans Rosling’s 200 Countries, 200 Years, 4 Minutes&lt;/a&gt;, we can use the code below&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;animation1 &amp;lt;- ggplot(data = gapminder,
       mapping = aes(x = gdpPercap,
                     y = lifeExp,
                     colour = continent,
                     size = pop)) +
  geom_point(alpha = 0.5) +
  scale_x_log10(labels = scales::dollar) +
  theme_bw() +
  theme(legend.position=&amp;quot;none&amp;quot;) +
  labs(title = &amp;quot;Year: {frame_time}&amp;quot;, 
       x = &amp;quot;GDP per capita&amp;quot;, 
       y = &amp;quot;Life Expectancy&amp;quot;) +    
  transition_time(year)+
  ease_aes(&amp;quot;linear&amp;quot;)

animate(animation1, height=600, width = 600)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/animation-1.gif&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Finally, if we want to facet the animation by continent instead of drawing a single scatter plot, we just add the &lt;code&gt;facet_wrap(~continent)&lt;/code&gt; line of code, as shown below.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;faceted_animation &amp;lt;- ggplot(data = gapminder,
       mapping = aes(x = gdpPercap,
                     y = lifeExp,
                     colour = continent,
                     size = pop)) +
  geom_point(alpha = 0.5) +
  scale_x_log10(labels = scales::dollar) +
  theme_bw() +
  theme(legend.position=&amp;quot;none&amp;quot;) +
  facet_wrap(~continent) +
  labs(title = &amp;quot;Year: {frame_time}&amp;quot;, 
       x = &amp;quot;GDP per capita&amp;quot;, 
       y = &amp;quot;Life Expectancy&amp;quot;) +    
  transition_time(year)+
  ease_aes(&amp;quot;linear&amp;quot;)

animate(faceted_animation, height=800, width = 800)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/faceted_animation_by_continent-1.gif&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;why-you-should-always-plot-your-data&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Why you should always plot your data&lt;/h2&gt;
&lt;p&gt;We have touched on the basics of &lt;code&gt;ggplot&lt;/code&gt; visualisations, but in this section we wanted to discuss why one should always plot the data and not just rely on tables of summary statistics.&lt;/p&gt;
&lt;p&gt;Let us consider thirteen datasets all of which have 142 observations of (x,y) values. The table below shows the average value of X and Y, the standard deviation of X and Y, as well as the correlation coefficient between X and Y.&lt;/p&gt;
&lt;table class=&#34;table table-striped table-bordered&#34; style=&#34;margin-left: auto; margin-right: auto;&#34;&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
id
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
n
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean_x
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
mean_y
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd_x
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
sd_y
&lt;/th&gt;
&lt;th style=&#34;text-align:right;&#34;&gt;
correlation
&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
1
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.064
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
2
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.069
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.068
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
4
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.064
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
5
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.060
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
6
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.062
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
7
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.069
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.069
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.069
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
10
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.063
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
11
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.069
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
12
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.067
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
13
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
142
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
54.3
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
47.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
16.8
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
26.9
&lt;/td&gt;
&lt;td style=&#34;text-align:right;&#34;&gt;
-0.066
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
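&lt;p&gt;A table like the one above can be produced along the following lines. This is a minimal sketch, assuming the thirteen datasets are those in the &lt;code&gt;datasaurus_dozen&lt;/code&gt; data frame of the &lt;code&gt;datasauRus&lt;/code&gt; package (columns &lt;code&gt;dataset&lt;/code&gt;, &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;y&lt;/code&gt;); the object name &lt;code&gt;summary_stats&lt;/code&gt; is our own, and the datasets are identified by name rather than by a numeric id.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(datasauRus)  # assumed source of the 13 datasets

# One row of summary statistics per dataset
summary_stats &amp;lt;- datasaurus_dozen %&amp;gt;% 
  group_by(dataset) %&amp;gt;% 
  summarise(n = n(),
            mean_x = mean(x),
            mean_y = mean(y),
            sd_x = sd(x),
            sd_y = sd(y),
            correlation = cor(x, y))

summary_stats&lt;/code&gt;&lt;/pre&gt;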
&lt;p&gt;Since our datasets contain values for X and Y, we can estimate 13 regression models of Y on X, one per dataset, and plot the estimated intercept and slope of X for each of them.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/datasaurus-regression-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
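&lt;p&gt;The intercepts and slopes can be estimated with one &lt;code&gt;lm()&lt;/code&gt; call per dataset. The sketch below uses base R only and again assumes the data are in &lt;code&gt;datasauRus::datasaurus_dozen&lt;/code&gt;; the names &lt;code&gt;models&lt;/code&gt; and &lt;code&gt;coefficients_by_dataset&lt;/code&gt; are our own, and this is not necessarily how the plot above was produced.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(datasauRus)  # assumed source of the 13 datasets

# Split the data into one data frame per dataset and fit y ~ x to each
models &amp;lt;- lapply(split(datasaurus_dozen, datasaurus_dozen$dataset),
                 function(d) lm(y ~ x, data = d))

# Collect the two coefficients of every model: one row per dataset
coefficients_by_dataset &amp;lt;- t(sapply(models, coef))
colnames(coefficients_by_dataset) &amp;lt;- c(&amp;quot;intercept&amp;quot;, &amp;quot;slope_x&amp;quot;)

coefficients_by_dataset&lt;/code&gt;&lt;/pre&gt;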
&lt;p&gt;If we just looked at either the summary statistics table, or the plots of intercepts and slopes, we may be tempted to conclude that the 13 datasets are either identical or very much alike. However, this is far from the truth, as this is what the 13 individual datasets look like.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/datasaurus_graph-1.png&#34; width=&#34;768&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;!-- We can create an animation to show how the data sets change --&gt;
&lt;!-- ```{r datasaurus_animation, warning = FALSE} --&gt;
&lt;!-- ggplot(datasaurus_dozen, aes(x = x, y = y))+ --&gt;
&lt;!--   geom_point() + --&gt;
&lt;!--   theme_bw() + --&gt;
&lt;!--   transition_states(dataset, 3, 1) + --&gt;
&lt;!--   ease_aes(&#39;cubic-in-out&#39;) --&gt;
&lt;!-- ``` --&gt;
&lt;p&gt;You can read more about why you &lt;a href=&#34;https://www.autodeskresearch.com/publications/samestats&#34;&gt;should never trust summary statistics alone and should always visualize your data&lt;/a&gt;.&lt;/p&gt;
&lt;div id=&#34;what-data-patterns-can-lie-behind-a-correlation-coefficient&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;What data patterns can lie behind a correlation coefficient?&lt;/h3&gt;
&lt;p&gt;Jan Vanhove has written about the &lt;a href=&#34;http://janhove.github.io/teaching/2016/11/21/what-correlations-look-like&#34;&gt;data patterns that can lie behind a correlation coefficient&lt;/a&gt; and why you should always draw a scatter plot; he has created a package, &lt;code&gt;cannonball&lt;/code&gt;, where you specify a correlation coefficient &lt;code&gt;r&lt;/code&gt; and a sample size &lt;code&gt;n&lt;/code&gt;, and you get multiple scatterplots that share the same correlation value but look fairly different.&lt;/p&gt;
&lt;p&gt;We will visualise 16 different datasets, all of which have a correlation of 0.50, and a sample of size n = 100.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/example/eda-visualise-data_files/figure-html/cannonball_correlations-1.png&#34; width=&#34;672&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
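&lt;p&gt;To see how very different patterns can share one correlation, the sketch below builds, for any &lt;code&gt;x&lt;/code&gt;, a &lt;code&gt;y&lt;/code&gt; whose sample correlation with &lt;code&gt;x&lt;/code&gt; is exactly 0.50. It uses base R only and is not the method used by the &lt;code&gt;cannonball&lt;/code&gt; package; the helper &lt;code&gt;y_with_correlation()&lt;/code&gt; and the three choices of &lt;code&gt;x&lt;/code&gt; are hypothetical and for illustration only.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;set.seed(2020)

# Return a y whose sample correlation with x is exactly r
y_with_correlation &amp;lt;- function(x, r) {
  raw_noise &amp;lt;- rnorm(length(x))
  noise &amp;lt;- residuals(lm(raw_noise ~ x))  # noise that is uncorrelated with x in this sample
  r * as.numeric(scale(x)) + sqrt(1 - r^2) * as.numeric(scale(noise))
}

n &amp;lt;- 100
xs &amp;lt;- list(normal  = rnorm(n),
           skewed  = rexp(n),
           bimodal = c(rnorm(n / 2, mean = -3), rnorm(n / 2, mean = 3)))
ys &amp;lt;- lapply(xs, y_with_correlation, r = 0.5)

# All three pairs have the same correlation coefficient
correlations &amp;lt;- mapply(cor, xs, ys)
correlations

# ... yet their scatter plots look different
par(mfrow = c(1, 3))
for (name in names(xs)) {
  plot(xs[[name]], ys[[name]], main = name, xlab = &amp;quot;x&amp;quot;, ylab = &amp;quot;y&amp;quot;)
}&lt;/code&gt;&lt;/pre&gt;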
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;rstudios-primers-for-ggplot2&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;RStudio’s primers for &lt;strong&gt;ggplot2&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;You can work through RStudio’s introductory primers for &lt;strong&gt;ggplot2&lt;/strong&gt;; these are fairly short once you get used to the syntax of &lt;code&gt;ggplot()&lt;/code&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;RStudio’s primers on visualising data&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.1&#34;&gt;Exploratory Data Analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.2&#34;&gt;Bar Charts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.3&#34;&gt;Histograms&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.4&#34;&gt;Boxplots and Counts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.5&#34;&gt;Scatterplots&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.6&#34;&gt;Line plots&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.7&#34;&gt;Overplotting and Big Data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/3.8&#34;&gt;Customize Your Plots&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div id=&#34;further-resources&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Further resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://resources.rstudio.com/the-essentials-of-data-science/data-visualization-2-1&#34;&gt;Data visualisation with ggplot cheatsheet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/rstudio/cheatsheets/raw/master/gganimate.pdf&#34;&gt;gganimate cheatsheet&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://cedricscherer.netlify.com/2019/05/17/the-evolution-of-a-ggplot-ep.-1/&#34;&gt;The Evolution of a ggplot&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/clauswilke/practical_ggplot2&#34;&gt;Step-by-step examples of building publication-quality figures in ggplot2 from ‘Fundamentals of Data Visualization’ by Claus Wilke&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/TheEconomist/covid-19-excess-deaths-tracker&#34;&gt;The Economist’s tracker for covid-19 excess deaths&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;br&gt;
&lt;br&gt;&lt;/p&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Manipulate Data</title>
      <link>https://bit-2021.netlify.app/example/eda-manipulate-data/</link>
      <pubDate>Tue, 21 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/eda-manipulate-data/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#the-pipe-operator-or&#34;&gt;The &lt;code&gt;pipe&lt;/code&gt; operator, or &lt;strong&gt;&lt;code&gt;%&amp;gt;%&lt;/code&gt;&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#key-functions-in-dplyr&#34;&gt;Key functions in &lt;code&gt;dplyr&lt;/code&gt;&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#pick-columns-with-select&#34;&gt;Pick columns with &lt;code&gt;select()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#pick-rows-with-filter&#34;&gt;Pick rows with &lt;code&gt;filter()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#sort-data-with-arrange&#34;&gt;Sort data with &lt;code&gt;arrange()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#add-new-columns-with-mutate&#34;&gt;Add new columns with &lt;code&gt;mutate()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#combine-multiple-verbs-with-pipes&#34;&gt;Combine multiple verbs with pipes (&lt;code&gt;%&amp;gt;%&lt;/code&gt;)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#summarise-data-by-groups-with-group_by-summarise&#34;&gt;Summarise data by groups with &lt;code&gt;group_by() %&amp;gt;% summarise()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#further-resources&#34;&gt;Further resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Learning Objectives &lt;br&gt;
1. Select certain variables (or columns) in a dataframe with the dplyr function &lt;strong&gt;select&lt;/strong&gt; &lt;code&gt;dplyr::select()&lt;/code&gt; &lt;br&gt;
2. Select certain cases (or rows) in a dataframe according to filtering conditions with the dplyr function &lt;strong&gt;filter&lt;/strong&gt; &lt;code&gt;dplyr::filter()&lt;/code&gt; &lt;br&gt;
3. Pass the output of one dplyr function to the input of another function with the ‘pipe’ operator &lt;code&gt;%&amp;gt;%&lt;/code&gt; &lt;br&gt;
4. Create new variables (columns) in a dataframe that are functions of existing columns with &lt;code&gt;dplyr::mutate()&lt;/code&gt; &lt;br&gt;
5. Use &lt;code&gt;dplyr::group_by()&lt;/code&gt;, &lt;code&gt;dplyr::summarise()&lt;/code&gt;, and &lt;code&gt;dplyr::count()&lt;/code&gt; to split a dataframe into groups of observations, calculate summary statistics for each group, and also count the number of total observations in each group &lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;When working on a real project, data will seldom (if ever!) arrive in exactly the format you need in order to analyse it. We need to &lt;strong&gt;manipulate and transform&lt;/strong&gt; data and, just as we have a grammar for generating graphics (the &lt;strong&gt;layered grammar of graphics&lt;/strong&gt; in &lt;code&gt;ggplot&lt;/code&gt;), we also have a syntax for data transformation.&lt;/p&gt;
&lt;p&gt;&lt;code&gt;dplyr&lt;/code&gt; is a package that contains useful functions for transforming and manipulating data frames. You can think of these functions as &lt;strong&gt;verbs&lt;/strong&gt; that do something to the data. All of the &lt;code&gt;dplyr&lt;/code&gt; verbs (or functions), and in fact pretty much everything in the &lt;code&gt;tidyverse&lt;/code&gt;, work in the following fashion:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;The first argument is a data frame&lt;/li&gt;
&lt;li&gt;Subsequent arguments describe what to do with the data frame&lt;/li&gt;
&lt;li&gt;The result is a new data frame&lt;/li&gt;
&lt;/ol&gt;
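&lt;p&gt;The three points above can be seen in a minimal sketch that uses &lt;code&gt;mtcars&lt;/code&gt;, a data frame that ships with R; the choice of dataset and the name &lt;code&gt;four_cylinders&lt;/code&gt; are our own, for illustration only.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# 1. The first argument is a data frame (mtcars)
# 2. The next argument describes what to do: keep the cars with 4 cylinders
four_cylinders &amp;lt;- filter(mtcars, cyl == 4)

# 3. The result is a new data frame; mtcars itself is left unchanged
is.data.frame(four_cylinders)
nrow(four_cylinders)
nrow(mtcars)&lt;/code&gt;&lt;/pre&gt;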
&lt;div id=&#34;the-pipe-operator-or&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;The &lt;code&gt;pipe&lt;/code&gt; operator, or &lt;strong&gt;&lt;code&gt;%&amp;gt;%&lt;/code&gt;&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;The pipe operator, this strange &lt;strong&gt;&lt;code&gt;%&amp;gt;%&lt;/code&gt;&lt;/strong&gt; thing, takes the value to the left of it and passes it through to the thing to the right of it. Let us create a couple of lists of numbers (strictly speaking, vectors in R) and a simple function to see an example:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Two lists (strictly speaking, vectors) of multiple values:
my_first_list &amp;lt;- c(1, 2, 3, 5, 8, 13, 21, 34, 55, 89)
my_second_list &amp;lt;- c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)

# Define a function that takes X and adds 100
my_function &amp;lt;- function(x) {
  new_x &amp;lt;- x + 100
  return(new_x)
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Functions work on single values and on lists (or vectors):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# call my_function with x=14 as an argument
my_function(14)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 114&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# call my_function with x=my_first_list as an argument; this is a 
# vectorised operation, as it will add 100 to each value in my_first_list
my_function(my_first_list)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##  [1] 101 102 103 105 108 113 121 134 155 189&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# call my_function with my_first_list+my_second_list as argument; this is a 
# vectorised operation, as it will first add my_first_list+my_second_list 
# and then add 100 to each value 
my_function(my_first_list+my_second_list)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##  [1] 102 103 105 108 113 121 134 155 189 244&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can nest functions inside each other and use &lt;code&gt;mean(my_function(my_first_list))&lt;/code&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;mean(my_function(my_first_list))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 123&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But this can get really hard to read, since you have to read from the inside out. In English, this nested mess reads “Calculate the &lt;code&gt;mean&lt;/code&gt; of the results of &lt;code&gt;my_function&lt;/code&gt; applied to &lt;code&gt;my_first_list&lt;/code&gt;.” We can simplify this by reversing the nested chain and using the pipe operator&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;my_first_list %&amp;gt;% 
  my_function() %&amp;gt;% 
  mean() &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 123&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here we start with the data and then describe the actions/verbs to do something to the data. We can read this chain as “Take &lt;code&gt;my_first_list&lt;/code&gt;, pass it through &lt;code&gt;my_function&lt;/code&gt;, and calculate the mean of that.”&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;&lt;code&gt;%&amp;gt;%&lt;/code&gt;&lt;/strong&gt; is called a &lt;em&gt;pipe&lt;/em&gt;, and you can read or think of the pipe operator as the words “and then.”
In RStudio there is also a keyboard shortcut, since typing &lt;code&gt;%&amp;gt;%&lt;/code&gt; all the time can be tedious: on Windows use &lt;code&gt;Ctrl + Shift + M&lt;/code&gt; and on a Mac use &lt;code&gt;⌘/Cmd + Shift + M&lt;/code&gt;.&lt;/p&gt;
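&lt;p&gt;To make the “and then” reading concrete, here is a small sketch, assuming &lt;code&gt;dplyr&lt;/code&gt; is loaded (it makes &lt;code&gt;%&amp;gt;%&lt;/code&gt; available). The pipe inserts the value on its left as the first argument of the function on its right, so the two lines in each pair below should give the same result. The vector &lt;code&gt;x&lt;/code&gt; is made up for the illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)  # makes the pipe operator available

# Hypothetical values, used only for this illustration
x &amp;lt;- c(1.234, 5.678)

# x %&amp;gt;% f(y) is another way of writing f(x, y)
round(x, 1)
x %&amp;gt;% round(1)

# The same holds for a longer chain
sum(sqrt(x))
x %&amp;gt;% sqrt() %&amp;gt;% sum()&lt;/code&gt;&lt;/pre&gt;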
&lt;p&gt;Similarly, we frequently need to perform a series of intermediate steps to transform data for analysis. If we write each step as a separate command and store each result as a new object, our code becomes difficult to read and understand.&lt;/p&gt;
&lt;p&gt;When speaking or writing, we usually start a sentence with a noun (the subject) rather than with a verb. In the same way, it is good practice to start with a dataframe/object and then use verbs (or functions) to describe what you want to do.&lt;/p&gt;
&lt;p&gt;Suppose we wanted to look at the first few rows of life expectancy values, using the &lt;code&gt;head()&lt;/code&gt; function, of the &lt;code&gt;gapminder&lt;/code&gt; dataframe.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Nested command, rather hard to read, since we read from the inside out
head(select(gapminder,lifeExp))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 1
##   lifeExp
##     &amp;lt;dbl&amp;gt;
## 1    28.8
## 2    30.3
## 3    32.0
## 4    34.0
## 5    36.1
## 6    38.4&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# using the pipe operator: Start with gapminder, and then
gapminder %&amp;gt;% 
  
  # select the column (or variable) lifeExp, and then 
  select(lifeExp) %&amp;gt;% 
  
  # use the head() function to return the first few rows of the dataset 
  head()&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 1
##   lifeExp
##     &amp;lt;dbl&amp;gt;
## 1    28.8
## 2    30.3
## 3    32.0
## 4    34.0
## 5    36.1
## 6    38.4&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;key-functions-in-dplyr&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Key functions in &lt;code&gt;dplyr&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;There are six important verbs that you’ll typically use when working with data (the last two are usually used together):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Extract columns/variables with &lt;code&gt;select()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Extract rows/cases with &lt;code&gt;filter()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Arrange/sort rows with &lt;code&gt;arrange()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Make new columns/variables with &lt;code&gt;mutate()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Make group summaries with &lt;code&gt;group_by() %&amp;gt;% summarise()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;table&gt;
&lt;colgroup&gt;
&lt;col width=&#34;20%&#34; /&gt;
&lt;col width=&#34;80%&#34; /&gt;
&lt;/colgroup&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;&lt;code&gt;function()&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;Action performed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;select()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Selects a subset of &lt;strong&gt;columns&lt;/strong&gt; (or variables) from the data frame&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;filter()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Subsets &lt;strong&gt;observations&lt;/strong&gt; based on their values&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;arrange()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Changes the order of observations based on their values&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;mutate()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Creates new &lt;strong&gt;columns&lt;/strong&gt; (or variables)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;group_by()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Changes the unit of analysis from the complete dataset to individual groups of &lt;strong&gt;observations&lt;/strong&gt; (rows)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;summarise()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Collapses the data frame to a smaller number of rows which summarise the larger data&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Every &lt;strong&gt;dplyr&lt;/strong&gt; verb follows the same pattern. The first argument is always a data frame, and the function always returns a data frame:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:pink&#39;&gt;VERB&lt;/span&gt;(&lt;span style=&#39;background-color:yellow&#39;&gt;DATA_TO_TRANSFORM&lt;/span&gt;, &lt;span style=&#39;background-color:lightblue&#39;&gt;STUFF_IT_DOES&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;div id=&#34;pick-columns-with-select&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Pick columns with &lt;code&gt;select()&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;If we want to select, or drop, specific columns from a tibble, we use the &lt;code&gt;select()&lt;/code&gt; verb. For instance, if we wanted to keep only the &lt;code&gt;lifeExp&lt;/code&gt; and &lt;code&gt;year&lt;/code&gt; columns:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;select&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;lifeExp, year&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 2
##    lifeExp  year
##      &amp;lt;dbl&amp;gt; &amp;lt;int&amp;gt;
##  1    28.8  1952
##  2    30.3  1957
##  3    32.0  1962
##  4    34.0  1967
##  5    36.1  1972
##  6    38.4  1977
##  7    39.9  1982
##  8    40.8  1987
##  9    41.7  1992
## 10    41.8  1997
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can remove specific columns by prefacing the column names with a minus sign &lt;code&gt;-&lt;/code&gt;. So, to drop &lt;code&gt;lifeExp&lt;/code&gt; from our tibble, we would use:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;select&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;-lifeExp&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 5
##    country     continent  year      pop gdpPercap
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
##  1 Afghanistan Asia       1952  8425333      779.
##  2 Afghanistan Asia       1957  9240934      821.
##  3 Afghanistan Asia       1962 10267083      853.
##  4 Afghanistan Asia       1967 11537966      836.
##  5 Afghanistan Asia       1972 13079460      740.
##  6 Afghanistan Asia       1977 14880372      786.
##  7 Afghanistan Asia       1982 12881816      978.
##  8 Afghanistan Asia       1987 13867957      852.
##  9 Afghanistan Asia       1992 16317921      649.
## 10 Afghanistan Asia       1997 22227415      635.
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also rename columns with &lt;code&gt;select()&lt;/code&gt;, using the syntax &lt;code&gt;select(new_name = old_name)&lt;/code&gt;. Only the columns you list are kept:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;select&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;year, country, life_expectancy = lifeExp&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 3
##     year country     life_expectancy
##    &amp;lt;int&amp;gt; &amp;lt;fct&amp;gt;                 &amp;lt;dbl&amp;gt;
##  1  1952 Afghanistan            28.8
##  2  1957 Afghanistan            30.3
##  3  1962 Afghanistan            32.0
##  4  1967 Afghanistan            34.0
##  5  1972 Afghanistan            36.1
##  6  1977 Afghanistan            38.4
##  7  1982 Afghanistan            39.9
##  8  1987 Afghanistan            40.8
##  9  1992 Afghanistan            41.7
## 10  1997 Afghanistan            41.8
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Alternatively, there’s a special &lt;code&gt;rename()&lt;/code&gt; verb with the same syntax, i.e., &lt;code&gt;rename(new_name = old_name)&lt;/code&gt; that will rename a column to a new name, while keeping all the other columns:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;rename&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;life_expectancy = lifeExp&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 6
##    country     continent  year life_expectancy      pop gdpPercap
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;           &amp;lt;dbl&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
##  1 Afghanistan Asia       1952            28.8  8425333      779.
##  2 Afghanistan Asia       1957            30.3  9240934      821.
##  3 Afghanistan Asia       1962            32.0 10267083      853.
##  4 Afghanistan Asia       1967            34.0 11537966      836.
##  5 Afghanistan Asia       1972            36.1 13079460      740.
##  6 Afghanistan Asia       1977            38.4 14880372      786.
##  7 Afghanistan Asia       1982            39.9 12881816      978.
##  8 Afghanistan Asia       1987            40.8 13867957      852.
##  9 Afghanistan Asia       1992            41.7 16317921      649.
## 10 Afghanistan Asia       1997            41.8 22227415      635.
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;pick-rows-with-filter&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Pick rows with &lt;code&gt;filter()&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;filter()&lt;/code&gt; function takes two kinds of arguments: a tibble to transform, and one or more logical tests. It will return each row for which every test is TRUE.&lt;/p&gt;
&lt;p&gt;This code, for instance, will look at the &lt;code&gt;gapminder&lt;/code&gt; dataset and return all rows where &lt;code&gt;country&lt;/code&gt; is equal to “Jordan”.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:pink&#39;&gt;filter&lt;/span&gt;(&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt;, &lt;span style=&#39;background-color:lightblue&#39;&gt;country == &#34;Jordan&#34;&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 12 x 6
##    country continent  year lifeExp     pop gdpPercap
##    &amp;lt;fct&amp;gt;   &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;   &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
##  1 Jordan  Asia       1952    43.2  607914     1547.
##  2 Jordan  Asia       1957    45.7  746559     1886.
##  3 Jordan  Asia       1962    48.1  933559     2348.
##  4 Jordan  Asia       1967    51.6 1255058     2742.
##  5 Jordan  Asia       1972    56.5 1613551     2111.
##  6 Jordan  Asia       1977    61.1 1937652     2852.
##  7 Jordan  Asia       1982    63.7 2347031     4161.
##  8 Jordan  Asia       1987    65.9 2820042     4449.
##  9 Jordan  Asia       1992    68.0 3867409     3432.
## 10 Jordan  Asia       1997    69.8 4526235     3645.
## 11 Jordan  Asia       2002    71.3 5307470     3845.
## 12 Jordan  Asia       2007    72.5 6053193     4519.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that when testing for equality, we use a double equal sign (&lt;code&gt;==&lt;/code&gt;). A single equal sign assigns a value, for example when you set an argument (like &lt;code&gt;data = gapminder&lt;/code&gt;); when you use two equal signs, you are running a logical test.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Test&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;x &amp;lt; y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Less than&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;x &amp;gt; y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Greater than&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;x == y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Equal to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;x &amp;lt;= y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Less than or equal to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;x &amp;gt;= y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Greater than or equal to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;x != y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Not equal to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;x %in% y&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;In (group membership)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;is.na(x)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Is missing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;!is.na(x)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Is not missing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
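&lt;p&gt;Each of these tests returns one TRUE or FALSE per row (or &lt;code&gt;NA&lt;/code&gt; when a value is missing), and &lt;code&gt;filter()&lt;/code&gt; keeps only the rows where the test is TRUE. The sketch below illustrates this with a made-up tibble called &lt;code&gt;toy&lt;/code&gt;, assuming &lt;code&gt;dplyr&lt;/code&gt; is loaded:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Hypothetical data with one missing value
toy &amp;lt;- tibble(
  id    = c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;, &amp;quot;c&amp;quot;, &amp;quot;d&amp;quot;),
  value = c(5, NA, 12, 8)
)

# The test on its own gives one result per row: FALSE NA TRUE TRUE
toy$value &amp;gt; 6

# filter() keeps rows where the test is TRUE; the NA row is dropped
filter(toy, value &amp;gt; 6)

# Group membership and missing-value tests
filter(toy, id %in% c(&amp;quot;a&amp;quot;, &amp;quot;d&amp;quot;))
filter(toy, is.na(value))&lt;/code&gt;&lt;/pre&gt;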
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Your turn&lt;/strong&gt;: Use &lt;code&gt;filter()&lt;/code&gt; and logical tests to show:
&lt;br&gt;
1. The data for China &lt;br&gt;
2. All data for countries in Africa &lt;br&gt;
3. All cases (rows) where life expectancy is greater than 80 &lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!---LEARNR EX 1--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe1&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/dplyr_filter1/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;p&gt;You can also use multiple conditions, and these will extract rows that meet every test. By default, if you separate the tests with a comma, R will consider this an “and” test and find rows where &lt;em&gt;both&lt;/em&gt; the country is Jordan and the year is greater than 2000.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:pink&#39;&gt;filter&lt;/span&gt;(&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt;, &lt;span style=&#39;background-color:lightblue&#39;&gt;country == &#34;Jordan&#34;, year &gt; 2000&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 2 x 6
##   country continent  year lifeExp     pop gdpPercap
##   &amp;lt;fct&amp;gt;   &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;   &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
## 1 Jordan  Asia       2002    71.3 5307470     3845.
## 2 Jordan  Asia       2007    72.5 6053193     4519.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also combine tests with the logical operators that will be familiar if you have any programming experience: &lt;strong&gt;“and”&lt;/strong&gt; with “&lt;code&gt;&amp;amp;&lt;/code&gt;”, &lt;strong&gt;“or”&lt;/strong&gt; with “&lt;code&gt;|&lt;/code&gt;”, and &lt;strong&gt;“not”&lt;/strong&gt; with “&lt;code&gt;!&lt;/code&gt;”:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr class=&#34;header&#34;&gt;
&lt;th&gt;Operator&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;a &amp;amp; b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;and&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td&gt;&lt;code&gt;a | b&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;or&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td&gt;&lt;code&gt;!a&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;not&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
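&lt;p&gt;As a short sketch of the three operators in use, assuming the &lt;code&gt;dplyr&lt;/code&gt; and &lt;code&gt;gapminder&lt;/code&gt; packages are loaded (the choice of countries here is just for illustration):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(gapminder)

# &amp;quot;and&amp;quot;: both tests must be TRUE (same result as separating them with a comma)
filter(gapminder, country == &amp;quot;Jordan&amp;quot; &amp;amp; year &amp;gt; 2000)

# &amp;quot;or&amp;quot;: at least one of the tests must be TRUE
filter(gapminder, country == &amp;quot;Jordan&amp;quot; | country == &amp;quot;Lebanon&amp;quot;)

# &amp;quot;not&amp;quot;: reverse a test; here, every row that is not in Asia
filter(gapminder, !(continent == &amp;quot;Asia&amp;quot;))&lt;/code&gt;&lt;/pre&gt;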
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Your turn&lt;/strong&gt;: Use &lt;code&gt;filter()&lt;/code&gt; and logical tests to show:
&lt;br&gt;
1. India before 1970 &lt;br&gt;
2. Countries where life expectancy in 2007 is below 60 &lt;br&gt;
3. Countries where life expectancy in 2007 is below 60 and are not in Africa &lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!---LEARNR EX 2--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe2&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/dplyr_filter2/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;p&gt;Beware of some common mistakes! You can’t collapse multiple tests into one. Instead, use two separate tests:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# This won&amp;#39;t work! R does not allow chained comparisons
filter(gapminder, 1960 &amp;lt; year &amp;lt; 1980)

# This will work
filter(gapminder, 1960 &amp;lt; year, year &amp;lt; 1980)&lt;/code&gt;&lt;/pre&gt;
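&lt;p&gt;The same range test can also be written with &lt;code&gt;&amp;amp;&lt;/code&gt; or, as an alternative not used elsewhere in this tutorial, with the &lt;code&gt;between()&lt;/code&gt; helper from &lt;code&gt;dplyr&lt;/code&gt;. Note that &lt;code&gt;between()&lt;/code&gt; includes both end points, so in this sketch the bounds are 1961 and 1979 rather than 1960 and 1980:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(gapminder)

# Two tests joined with &amp;quot;and&amp;quot;
range_and &amp;lt;- filter(gapminder, 1960 &amp;lt; year &amp;amp; year &amp;lt; 1980)

# between(x, left, right) tests left &amp;lt;= x &amp;lt;= right
range_between &amp;lt;- filter(gapminder, between(year, 1961, 1979))

# Both versions should pick out the same years
unique(range_and$year)
unique(range_between$year)&lt;/code&gt;&lt;/pre&gt;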
&lt;p&gt;Also, you can avoid stringing together lots of tests by using the &lt;code&gt;%in%&lt;/code&gt; operator, which checks whether a value is in a vector of values.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# This works, but is tedious-- what if you wanted to pick a dozen countries?
filter(gapminder, 
       country == &amp;quot;Mexico&amp;quot; |  country == &amp;quot;United States&amp;quot; | country == &amp;quot;Canada&amp;quot; )

# This is more concise and easier to add other countries later
filter(gapminder, 
       country %in% c(&amp;quot;Mexico&amp;quot;, &amp;quot;United States&amp;quot;, &amp;quot;Canada&amp;quot; ))&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;sort-data-with-arrange&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Sort data with &lt;code&gt;arrange()&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;arrange()&lt;/code&gt; verb sorts data. By default it sorts in ascending order, from minimum to maximum value:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;arrange&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;lifeExp&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 6
##    country      continent  year lifeExp     pop gdpPercap
##    &amp;lt;fct&amp;gt;        &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;   &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
##  1 Rwanda       Africa     1992    23.6 7290203      737.
##  2 Afghanistan  Asia       1952    28.8 8425333      779.
##  3 Gambia       Africa     1952    30    284320      485.
##  4 Angola       Africa     1952    30.0 4232095     3521.
##  5 Sierra Leone Africa     1952    30.3 2143249      880.
##  6 Afghanistan  Asia       1957    30.3 9240934      821.
##  7 Cambodia     Asia       1977    31.2 6978607      525.
##  8 Mozambique   Africa     1952    31.3 6446316      469.
##  9 Sierra Leone Africa     1957    31.6 2295678     1004.
## 10 Burkina Faso Africa     1952    32.0 4469979      543.
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can sort in descending order (max to min) by wrapping the column/variable you want sorted in &lt;code&gt;desc()&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;arrange&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;desc(lifeExp)&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 6
##    country          continent  year lifeExp       pop gdpPercap
##    &amp;lt;fct&amp;gt;            &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;     &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
##  1 Japan            Asia       2007    82.6 127467972    31656.
##  2 Hong Kong, China Asia       2007    82.2   6980412    39725.
##  3 Japan            Asia       2002    82   127065841    28605.
##  4 Iceland          Europe     2007    81.8    301931    36181.
##  5 Switzerland      Europe     2007    81.7   7554661    37506.
##  6 Hong Kong, China Asia       2002    81.5   6762476    30209.
##  7 Australia        Oceania    2007    81.2  20434176    34435.
##  8 Spain            Europe     2007    80.9  40448191    28821.
##  9 Sweden           Europe     2007    80.9   9031088    33860.
## 10 Israel           Asia       2007    80.7   6426679    25523.
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can sort by multiple columns by specifying them in a comma-separated list. For example, we can sort by &lt;code&gt;continent&lt;/code&gt; first and then sort by &lt;code&gt;lifeExp&lt;/code&gt; (life expectancy) in descending order within each continent:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;br&gt;&amp;nbsp;&amp;nbsp;&lt;span style=&#39;background-color:pink&#39;&gt;arrange&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;continent, desc(lifeExp)&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 6
##    country   continent  year lifeExp      pop gdpPercap
##    &amp;lt;fct&amp;gt;     &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
##  1 Reunion   Africa     2007    76.4   798094     7670.
##  2 Reunion   Africa     2002    75.7   743981     6316.
##  3 Reunion   Africa     1997    74.8   684810     6072.
##  4 Libya     Africa     2007    74.0  6036914    12057.
##  5 Tunisia   Africa     2007    73.9 10276158     7093.
##  6 Reunion   Africa     1992    73.6   622191     6101.
##  7 Tunisia   Africa     2002    73.0  9770575     5723.
##  8 Mauritius Africa     2007    72.8  1250882    10957.
##  9 Libya     Africa     2002    72.7  5368585     9535.
## 10 Algeria   Africa     2007    72.3 33333216     6223.
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;add-new-columns-with-mutate&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Add new columns with &lt;code&gt;mutate()&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;You create new columns, or variables, with the &lt;code&gt;mutate()&lt;/code&gt; function. You can create a single new column of &lt;code&gt;gdp&lt;/code&gt; in the &lt;code&gt;gapminder&lt;/code&gt; tibble as follows:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:pink&#39;&gt;mutate&lt;/span&gt;(&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt;, &lt;span style=&#39;background-color:lightblue&#39;&gt;gdp = gdpPercap * pop&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 7
##    country     continent  year lifeExp      pop gdpPercap          gdp
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;        &amp;lt;dbl&amp;gt;
##  1 Afghanistan Asia       1952    28.8  8425333      779.  6567086330.
##  2 Afghanistan Asia       1957    30.3  9240934      821.  7585448670.
##  3 Afghanistan Asia       1962    32.0 10267083      853.  8758855797.
##  4 Afghanistan Asia       1967    34.0 11537966      836.  9648014150.
##  5 Afghanistan Asia       1972    36.1 13079460      740.  9678553274.
##  6 Afghanistan Asia       1977    38.4 14880372      786. 11697659231.
##  7 Afghanistan Asia       1982    39.9 12881816      978. 12598563401.
##  8 Afghanistan Asia       1987    40.8 13867957      852. 11820990309.
##  9 Afghanistan Asia       1992    41.7 16317921      649. 10595901589.
## 10 Afghanistan Asia       1997    41.8 22227415      635. 14121995875.
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And you can create multiple columns by including a comma-separated list of new columns to create:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:pink&#39;&gt;mutate&lt;/span&gt;(&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt;, &lt;span style=&#39;background-color:lightblue&#39;&gt;gdp = gdpPercap * pop&lt;/span&gt;,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span style=&#39;background-color:lightblue&#39;&gt;pop_mill = round(pop / 1000000)&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 8
##    country     continent  year lifeExp      pop gdpPercap          gdp pop_mill
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;        &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;
##  1 Afghanistan Asia       1952    28.8  8425333      779.  6567086330.        8
##  2 Afghanistan Asia       1957    30.3  9240934      821.  7585448670.        9
##  3 Afghanistan Asia       1962    32.0 10267083      853.  8758855797.       10
##  4 Afghanistan Asia       1967    34.0 11537966      836.  9648014150.       12
##  5 Afghanistan Asia       1972    36.1 13079460      740.  9678553274.       13
##  6 Afghanistan Asia       1977    38.4 14880372      786. 11697659231.       15
##  7 Afghanistan Asia       1982    39.9 12881816      978. 12598563401.       13
##  8 Afghanistan Asia       1987    40.8 13867957      852. 11820990309.       14
##  9 Afghanistan Asia       1992    41.7 16317921      649. 10595901589.       16
## 10 Afghanistan Asia       1997    41.8 22227415      635. 14121995875.       22
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also run logical, conditional tests within &lt;code&gt;mutate()&lt;/code&gt; using the &lt;code&gt;ifelse()&lt;/code&gt; function. This works like the &lt;code&gt;=IF&lt;/code&gt; function in Excel and it takes three arguments:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;a logical test,&lt;/li&gt;
&lt;li&gt;what happens if the test is true, and&lt;/li&gt;
&lt;li&gt;what happens if the test is false:&lt;/li&gt;
&lt;/ol&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;ifelse(&lt;span style=&#39;background-color:#faca7d&#39;&gt;TEST&lt;/span&gt;, &lt;span style=&#39;background-color:#9bbffa&#39;&gt;VALUE_IF_TRUE&lt;/span&gt;, &lt;span style=&#39;background-color:#f79b94&#39;&gt;VALUE_IF_FALSE&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can create a new column that is a binary (TRUE/FALSE) indicator for whether &lt;code&gt;year&lt;/code&gt; is after 1960:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;mutate(gapminder, after_1960 = ifelse(&lt;span style=&#39;background-color:#faca7d&#39;&gt;year &gt; 1960&lt;/span&gt;, &lt;span style=&#39;background-color:#9bbffa&#39;&gt;TRUE&lt;/span&gt;, &lt;span style=&#39;background-color:#f79b94&#39;&gt;FALSE&lt;/span&gt;))&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 7
##    country     continent  year lifeExp      pop gdpPercap after_1960
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt; &amp;lt;lgl&amp;gt;     
##  1 Afghanistan Asia       1952    28.8  8425333      779. FALSE     
##  2 Afghanistan Asia       1957    30.3  9240934      821. FALSE     
##  3 Afghanistan Asia       1962    32.0 10267083      853. TRUE      
##  4 Afghanistan Asia       1967    34.0 11537966      836. TRUE      
##  5 Afghanistan Asia       1972    36.1 13079460      740. TRUE      
##  6 Afghanistan Asia       1977    38.4 14880372      786. TRUE      
##  7 Afghanistan Asia       1982    39.9 12881816      978. TRUE      
##  8 Afghanistan Asia       1987    40.8 13867957      852. TRUE      
##  9 Afghanistan Asia       1992    41.7 16317921      649. TRUE      
## 10 Afghanistan Asia       1997    41.8 22227415      635. TRUE      
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
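&lt;p&gt;Because the test &lt;code&gt;year &amp;gt; 1960&lt;/code&gt; already evaluates to TRUE or FALSE for every row, the &lt;code&gt;ifelse()&lt;/code&gt; wrapper is optional in this particular case. A short sketch, assuming &lt;code&gt;dplyr&lt;/code&gt; and &lt;code&gt;gapminder&lt;/code&gt; are loaded (the object names &lt;code&gt;with_test&lt;/code&gt; and &lt;code&gt;with_ifelse&lt;/code&gt; are just for illustration):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(gapminder)

# The comparison itself already returns a logical (TRUE/FALSE) column
with_test   &amp;lt;- mutate(gapminder, after_1960 = year &amp;gt; 1960)
with_ifelse &amp;lt;- mutate(gapminder, after_1960 = ifelse(year &amp;gt; 1960, TRUE, FALSE))

# Check that both approaches give the same column
identical(with_test$after_1960, with_ifelse$after_1960)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;ifelse()&lt;/code&gt; becomes necessary once the two outcomes are something other than TRUE and FALSE.&lt;/p&gt;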
&lt;p&gt;We can also use text labels instead of &lt;code&gt;TRUE&lt;/code&gt; and &lt;code&gt;FALSE&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;mutate(gapminder, &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;after_1960 = ifelse(&lt;span style=&#39;background-color:#faca7d&#39;&gt;year &gt; 1960&lt;/span&gt;, &lt;span style=&#39;background-color:#9bbffa&#39;&gt;&#34;After 1960&#34;&lt;/span&gt;, &lt;span style=&#39;background-color:#f79b94&#39;&gt;&#34;Before 1960&#34;&lt;/span&gt;))&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 7
##    country     continent  year lifeExp      pop gdpPercap after_1960 
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt; &amp;lt;chr&amp;gt;      
##  1 Afghanistan Asia       1952    28.8  8425333      779. Before 1960
##  2 Afghanistan Asia       1957    30.3  9240934      821. Before 1960
##  3 Afghanistan Asia       1962    32.0 10267083      853. After 1960 
##  4 Afghanistan Asia       1967    34.0 11537966      836. After 1960 
##  5 Afghanistan Asia       1972    36.1 13079460      740. After 1960 
##  6 Afghanistan Asia       1977    38.4 14880372      786. After 1960 
##  7 Afghanistan Asia       1982    39.9 12881816      978. After 1960 
##  8 Afghanistan Asia       1987    40.8 13867957      852. After 1960 
##  9 Afghanistan Asia       1992    41.7 16317921      649. After 1960 
## 10 Afghanistan Asia       1997    41.8 22227415      635. After 1960 
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
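&lt;p&gt;To see how the condition, the “yes” label, and the “no” label line up inside &lt;code&gt;ifelse()&lt;/code&gt;, here is a minimal sketch on a small made-up data frame. The &lt;code&gt;toy&lt;/code&gt; tibble, its values, and the &lt;code&gt;era&lt;/code&gt; column name are assumptions for illustration only; they are not part of &lt;code&gt;gapminder&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Hypothetical three-row stand-in for gapminder
toy &amp;lt;- tibble(year = c(1952L, 1957L, 1962L))

toy_labelled &amp;lt;- mutate(
  toy,
  after_1960 = year &amp;gt; 1960,             # logical version: TRUE or FALSE
  era = ifelse(year &amp;gt; 1960,             # condition, checked row by row
               &amp;quot;After 1960&amp;quot;,       # label used where the condition is TRUE
               &amp;quot;Before 1960&amp;quot;)      # label used where the condition is FALSE
)

toy_labelled&lt;/code&gt;&lt;/pre&gt;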
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Your turn&lt;/strong&gt;: Use &lt;code&gt;mutate()&lt;/code&gt; to:
&lt;br&gt;
1. Add an &lt;code&gt;africa&lt;/code&gt; column that is TRUE if the country is on the African continent &lt;br&gt;
2. Add a column &lt;code&gt;log_GDP&lt;/code&gt; for the logarithm of GDP per capita, using &lt;code&gt;log(gdpPercap)&lt;/code&gt; &lt;br&gt;
3. Add an &lt;code&gt;africa_asia&lt;/code&gt; column that says “Africa or Asia” if the country is in Africa or Asia, and “Not Africa or Asia” if it’s not &lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!---LEARNR EX 3--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe3&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/dplyr_mutate1/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;combine-multiple-verbs-with-pipes&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Combine multiple verbs with pipes (&lt;code&gt;%&amp;gt;%&lt;/code&gt;)&lt;/h3&gt;
&lt;p&gt;What if you want to include only rows from 2002 &lt;em&gt;and&lt;/em&gt; make a new column with the logged GDP per capita? Doing this requires both &lt;code&gt;filter()&lt;/code&gt; and &lt;code&gt;mutate()&lt;/code&gt;, so we need to find a way to use both at once.&lt;/p&gt;
&lt;p&gt;One solution is to use intermediate data frames for each step:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:#faca7d&#39;&gt;gapminder_2002_filtered&lt;/span&gt; &lt;- filter(gapminder, year == 2002)&lt;br&gt;&lt;br&gt;&lt;span style=&#39;background-color:#9bbffa&#39;&gt;gapminder_2002_logged&lt;/span&gt; &lt;- mutate(&lt;span style=&#39;background-color:#faca7d&#39;&gt;gapminder_2002_filtered&lt;/span&gt;, log_gdpPercap = log(gdpPercap))&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That works fine, but your environment panel will quickly fill up with intermediate data frames.&lt;/p&gt;
&lt;p&gt;Another solution is to nest the functions inside each other. Remember that all &lt;strong&gt;dplyr&lt;/strong&gt; functions return data frames, so you can feed the results of one into another:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:#faca7d&#39;&gt;filter&lt;/span&gt;(&lt;span style=&#39;background-color:#9bbffa&#39;&gt;mutate(gapminder, log_gdpPercap = log(gdpPercap))&lt;/span&gt;, &lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span style=&#39;background-color:#faca7d&#39;&gt;year == 2002&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That works too, but it gets &lt;em&gt;really&lt;/em&gt; complicated once you have even more functions, and it’s hard to keep track of which function’s arguments go where. I’d avoid doing this entirely.&lt;/p&gt;
&lt;p&gt;One really nice solution is to use the pipe operator, or &lt;code&gt;%&amp;gt;%&lt;/code&gt;. &lt;strong&gt;The pipe takes an object on the left and passes it as the first argument of the function on the right&lt;/strong&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# gapminder will automatically get placed in the _____ spot
gapminder %&amp;gt;% filter(_____, country == &amp;quot;Jordan&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These two lines of code do the same thing:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;filter(&lt;span style=&#39;background-color:#f79b94&#39;&gt;gapminder&lt;/span&gt;, country == &#34;Jordan&#34;)&lt;br&gt;&lt;br&gt;&lt;span style=&#39;background-color:#f79b94&#39;&gt;gapminder&lt;/span&gt; %&gt;% filter(country == &#34;Jordan&#34;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using pipes, you always start with a data frame, pass it to one verb to do one thing, then pass the output of that verb (a data frame) to the next verb, which does something else, and so on. &lt;strong&gt;When reading any code with a &lt;code&gt;%&amp;gt;%&lt;/code&gt;, it’s easiest to read the &lt;code&gt;%&amp;gt;%&lt;/code&gt; as “and then”.&lt;/strong&gt; For example, the code below would read:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Take the &lt;code&gt;gapminder&lt;/code&gt; dataset &lt;em&gt;and then&lt;/em&gt; filter it so that it only has rows from 2002 &lt;em&gt;and then&lt;/em&gt; add a new column (&lt;code&gt;mutate&lt;/code&gt;) with the logged GDP per capita&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder %&amp;gt;% 
  filter(year == 2002) %&amp;gt;% 
  mutate(log_gdpPercap = log(gdpPercap))&lt;/code&gt;&lt;/pre&gt;
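&lt;p&gt;If you want to convince yourself that the three approaches really are interchangeable, a quick check along the following lines may help. It is only a sketch: the &lt;code&gt;toy&lt;/code&gt; tibble is a made-up stand-in for &lt;code&gt;gapminder&lt;/code&gt;, and all object names are our own.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Hypothetical stand-in for gapminder (values are invented)
toy &amp;lt;- tibble(
  country   = c(&amp;quot;A&amp;quot;, &amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;B&amp;quot;),
  year      = c(1997L, 2002L, 1997L, 2002L),
  gdpPercap = c(1000, 1500, 20000, 25000)
)

# 1. Intermediate data frames
toy_2002     &amp;lt;- filter(toy, year == 2002)
intermediate &amp;lt;- mutate(toy_2002, log_gdpPercap = log(gdpPercap))

# 2. Nested function calls
nested &amp;lt;- mutate(filter(toy, year == 2002), log_gdpPercap = log(gdpPercap))

# 3. Pipes
piped &amp;lt;- toy %&amp;gt;% 
  filter(year == 2002) %&amp;gt;% 
  mutate(log_gdpPercap = log(gdpPercap))

# Both comparisons should return TRUE
identical(intermediate, piped)
identical(nested, piped)&lt;/code&gt;&lt;/pre&gt;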
&lt;/div&gt;
&lt;div id=&#34;summarise-data-by-groups-with-group_by-summarise&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Summarise data by groups with &lt;code&gt;group_by() %&amp;gt;% summarise()&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;summarise()&lt;/code&gt; verb (&lt;code&gt;summarize()&lt;/code&gt; is an identical alias) takes an entire data frame and collapses all of its rows into a single row of summary information. For instance, the following code will start with the entire &lt;code&gt;gapminder&lt;/code&gt; data, calculate average life expectancy, and return just a single value, namely the average life expectancy across all countries and all years:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;summarize&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;mean_life = mean(lifeExp)&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1 x 1
##   mean_life
##       &amp;lt;dbl&amp;gt;
## 1      59.5
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can also make multiple summary variables, just like &lt;code&gt;mutate()&lt;/code&gt;, and it will return a column for each:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;summarize&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;mean_life = mean(lifeExp)&lt;/span&gt;,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span style=&#39;background-color:lightblue&#39;&gt;sd_life = sd(lifeExp)&lt;/span&gt;,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span style=&#39;background-color:lightblue&#39;&gt;min_life = min(lifeExp)&lt;/span&gt;,&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;span style=&#39;background-color:lightblue&#39;&gt;max_life = max(lifeExp)&lt;/span&gt;&lt;br&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1 x 4
##   mean_life sd_life min_life max_life
##       &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;    &amp;lt;dbl&amp;gt;
## 1      59.5    12.9     23.6     82.6
&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Your turn&lt;/strong&gt;: Use &lt;code&gt;summarise()&lt;/code&gt; to calculate:
&lt;br&gt;
1. The first (minimum) year in the &lt;code&gt;gapminder&lt;/code&gt; dataset &lt;br&gt;
2. The last (maximum) year in the dataset &lt;br&gt;
3. The number of rows in the dataset (use the &lt;a href=&#34;https://rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf&#34;&gt;&lt;strong&gt;dplyr&lt;/strong&gt; cheatsheet&lt;/a&gt;) &lt;br&gt;
4. The number of distinct countries in the dataset (use the &lt;a href=&#34;https://rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf&#34;&gt;&lt;strong&gt;dplyr&lt;/strong&gt; cheatsheet&lt;/a&gt;) &lt;br&gt;
5. Use &lt;code&gt;filter()&lt;/code&gt; and &lt;code&gt;summarise()&lt;/code&gt; to calculate the median, minimum, and maximum life expectancy on the African continent in 2007
&lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!---LEARNR EX 4--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe4&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/dplyr_summarise1/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;p&gt;Again, remember that &lt;code&gt;summarise()&lt;/code&gt; on its own summarises the entire dataset, so you only get numbers in a single row. These values can be what you want, e.g., averages, standard deviations, and min/max values for the entire dataset. If you group your data into separate subgroups with &lt;code&gt;group_by()&lt;/code&gt;, you can use &lt;code&gt;summarise()&lt;/code&gt; to calculate summary statistics for each group.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;group_by()&lt;/code&gt; function puts rows into groups based on values in a column. If you run:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#39;language-r&#39;&gt;&lt;code&gt;&lt;span style=&#39;background-color:yellow&#39;&gt;gapminder&lt;/span&gt; %&gt;% &lt;span style=&#39;background-color:pink&#39;&gt;group_by&lt;/span&gt;(&lt;span style=&#39;background-color:lightblue&#39;&gt;continent&lt;/span&gt;)&lt;/code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;
## # A tibble: 1,704 x 6
## # Groups:   continent [5]
##    country     continent  year lifeExp      pop gdpPercap
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;   &amp;lt;dbl&amp;gt;    &amp;lt;int&amp;gt;     &amp;lt;dbl&amp;gt;
##  1 Afghanistan Asia       1952    28.8  8425333      779.
##  2 Afghanistan Asia       1957    30.3  9240934      821.
##  3 Afghanistan Asia       1962    32.0 10267083      853.
##  4 Afghanistan Asia       1967    34.0 11537966      836.
##  5 Afghanistan Asia       1972    36.1 13079460      740.
##  6 Afghanistan Asia       1977    38.4 14880372      786.
##  7 Afghanistan Asia       1982    39.9 12881816      978.
##  8 Afghanistan Asia       1987    40.8 13867957      852.
##  9 Afghanistan Asia       1992    41.7 16317921      649.
## 10 Afghanistan Asia       1997    41.8 22227415      635.
## # ... with 1,694 more rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;…you won’t see much of a difference! Apart from the &lt;code&gt;# Groups: continent [5]&lt;/code&gt; line at the top of the output, the data look exactly the same. R has put the dataset into separate invisible groups behind the scenes, but you haven’t done anything with those groups yet, so nothing else has changed. Once you do something with those groups using &lt;code&gt;summarise()&lt;/code&gt;, though, &lt;code&gt;group_by()&lt;/code&gt; becomes much more useful.&lt;/p&gt;
&lt;p&gt;For instance, this will take the &lt;code&gt;gapminder&lt;/code&gt; data frame, group it by continent, and then summarize it by calculating the number of distinct countries in each group. It will return &lt;em&gt;one row for each group&lt;/em&gt;, so there should be a row for each continent:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder %&amp;gt;% 
  group_by(continent) %&amp;gt;% 
  summarize(n_countries = n_distinct(country)) &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 5 x 2
##   continent n_countries
##   &amp;lt;fct&amp;gt;           &amp;lt;int&amp;gt;
## 1 Africa             52
## 2 Americas           25
## 3 Asia               33
## 4 Europe             30
## 5 Oceania             2&lt;/code&gt;&lt;/pre&gt;
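&lt;p&gt;To make the “one row for each group” behaviour concrete, here is a small sketch that runs the same &lt;code&gt;summarise()&lt;/code&gt; call with and without &lt;code&gt;group_by()&lt;/code&gt;. The &lt;code&gt;toy&lt;/code&gt; data and its numbers are invented for illustration:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Hypothetical data: three countries on two continents
toy &amp;lt;- tibble(
  continent = c(&amp;quot;Europe&amp;quot;, &amp;quot;Europe&amp;quot;, &amp;quot;Asia&amp;quot;),
  country   = c(&amp;quot;France&amp;quot;, &amp;quot;Spain&amp;quot;, &amp;quot;Japan&amp;quot;),
  lifeExp   = c(80, 81, 82)
)

# Without group_by(): a single row for the whole data frame
overall &amp;lt;- toy %&amp;gt;% 
  summarise(avg_life_exp = mean(lifeExp))

# With group_by(): one row per continent
by_continent &amp;lt;- toy %&amp;gt;% 
  group_by(continent) %&amp;gt;% 
  summarise(avg_life_exp = mean(lifeExp))

nrow(overall)       # 1
nrow(by_continent)  # 2&lt;/code&gt;&lt;/pre&gt;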
&lt;p&gt;You can calculate multiple summary statistics, as before:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder %&amp;gt;% 
  group_by(continent) %&amp;gt;% 
  summarize(n_countries = n_distinct(country),
            avg_life_exp = mean(lifeExp)) &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 5 x 3
##   continent n_countries avg_life_exp
##   &amp;lt;fct&amp;gt;           &amp;lt;int&amp;gt;        &amp;lt;dbl&amp;gt;
## 1 Africa             52         48.9
## 2 Americas           25         64.7
## 3 Asia               33         60.1
## 4 Europe             30         71.9
## 5 Oceania             2         74.3&lt;/code&gt;&lt;/pre&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Your turn&lt;/strong&gt;:
&lt;br&gt;
1. Calculate summary statistics for life expectancy for each continent. Calculate minimum, maximum, median, mean, and standard deviation, and total count (n) &lt;br&gt;
2. Do the same, but for the year 2007 only
&lt;br&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;!---LEARNR EX 5--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe5&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/dplyr_summarise2/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;p&gt;Finally, you can group by multiple columns, and R will create a subgroup for every combination of their values and return one row per combination. For instance, we can calculate the average life expectancy by both year and continent and we’ll get 60 rows, since there are 5 continents and 12 years (5 × 12 = 60):&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder %&amp;gt;% 
  group_by(continent, year) %&amp;gt;% 
  summarize(avg_life_exp = mean(lifeExp)) &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 60 x 3
## # Groups:   continent [5]
##    continent  year avg_life_exp
##    &amp;lt;fct&amp;gt;     &amp;lt;int&amp;gt;        &amp;lt;dbl&amp;gt;
##  1 Africa     1952         39.1
##  2 Africa     1957         41.3
##  3 Africa     1962         43.3
##  4 Africa     1967         45.3
##  5 Africa     1972         47.5
##  6 Africa     1977         49.6
##  7 Africa     1982         51.6
##  8 Africa     1987         53.3
##  9 Africa     1992         53.6
## 10 Africa     1997         53.6
## # ... with 50 more rows&lt;/code&gt;&lt;/pre&gt;
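&lt;p&gt;Notice the &lt;code&gt;# Groups: continent [5]&lt;/code&gt; line in the output above: after summarising by two grouping columns, the result is still grouped by the first one. The sketch below, on an invented &lt;code&gt;toy&lt;/code&gt; data frame, shows one way to get an ungrouped result instead. It assumes dplyr 1.0.0 or later, where &lt;code&gt;summarise()&lt;/code&gt; has a &lt;code&gt;.groups&lt;/code&gt; argument; piping the result into &lt;code&gt;ungroup()&lt;/code&gt; is an alternative.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Hypothetical data: two continents, two years
toy &amp;lt;- tibble(
  continent = c(&amp;quot;Asia&amp;quot;, &amp;quot;Asia&amp;quot;, &amp;quot;Asia&amp;quot;, &amp;quot;Europe&amp;quot;, &amp;quot;Europe&amp;quot;, &amp;quot;Europe&amp;quot;),
  year      = c(2002L, 2002L, 2007L, 2002L, 2007L, 2007L),
  lifeExp   = c(70, 72, 74, 78, 79, 81)
)

# Default: the last grouping variable (year) is dropped, continent remains
still_grouped &amp;lt;- toy %&amp;gt;% 
  group_by(continent, year) %&amp;gt;% 
  summarise(avg_life_exp = mean(lifeExp))

# .groups = &amp;quot;drop&amp;quot; returns an ungrouped result
ungrouped &amp;lt;- toy %&amp;gt;% 
  group_by(continent, year) %&amp;gt;% 
  summarise(avg_life_exp = mean(lifeExp), .groups = &amp;quot;drop&amp;quot;)

group_vars(still_grouped)  # &amp;quot;continent&amp;quot;
group_vars(ungrouped)      # character(0)
nrow(ungrouped)            # 4 (2 continents x 2 years)&lt;/code&gt;&lt;/pre&gt;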
&lt;blockquote&gt;
&lt;p&gt;A common mistake I have seen is that people use the &lt;code&gt;summarise()&lt;/code&gt; function &lt;strong&gt;before&lt;/strong&gt; any &lt;code&gt;group_by()&lt;/code&gt;. Remember that if you &lt;code&gt;summarise()&lt;/code&gt; first, you collapse the entire data frame into a single row, so there is nothing left for &lt;code&gt;group_by()&lt;/code&gt; to group!&lt;/p&gt;
&lt;/blockquote&gt;
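&lt;p&gt;A small sketch of what goes wrong, again on invented &lt;code&gt;toy&lt;/code&gt; data: once &lt;code&gt;summarise()&lt;/code&gt; has run, the &lt;code&gt;continent&lt;/code&gt; column is gone, so the later &lt;code&gt;group_by(continent)&lt;/code&gt; fails. We wrap the wrong version in &lt;code&gt;tryCatch()&lt;/code&gt; only so that the example keeps running; the message text is our own placeholder, not the error dplyr prints.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# Hypothetical data
toy &amp;lt;- tibble(
  continent = c(&amp;quot;Asia&amp;quot;, &amp;quot;Asia&amp;quot;, &amp;quot;Europe&amp;quot;),
  lifeExp   = c(70, 74, 80)
)

# Wrong order: after summarise() only avg_life_exp is left,
# so group_by(continent) has no continent column to work with
wrong &amp;lt;- tryCatch(
  toy %&amp;gt;% 
    summarise(avg_life_exp = mean(lifeExp)) %&amp;gt;% 
    group_by(continent),
  error = function(e) &amp;quot;error: no continent column left to group by&amp;quot;
)

# Right order: group first, then summarise
right &amp;lt;- toy %&amp;gt;% 
  group_by(continent) %&amp;gt;% 
  summarise(avg_life_exp = mean(lifeExp))

wrong
right&lt;/code&gt;&lt;/pre&gt;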
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;further-resources&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Further resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf&#34;&gt;&lt;strong&gt;dplyr&lt;/strong&gt; and &lt;strong&gt;tidyr&lt;/strong&gt; cheat sheet&lt;/a&gt; for examples.&lt;/li&gt;
&lt;/ul&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Reshape Data</title>
      <link>https://bit-2021.netlify.app/example/eda-reshape-data/</link>
      <pubDate>Tue, 21 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/eda-reshape-data/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#overview&#34;&gt;Overview&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#pivot_longer-or-gather-data&#34;&gt;&lt;code&gt;pivot_longer&lt;/code&gt; or &lt;code&gt;gather&lt;/code&gt; data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#pivot_wider-or-spread-data&#34;&gt;&lt;code&gt;pivot_wider&lt;/code&gt; or &lt;code&gt;spread&lt;/code&gt; data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#separating&#34;&gt;Separating&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#uniting&#34;&gt;Uniting&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#rstudio-primer-on-tidyr&#34;&gt;RStudio primer on &lt;strong&gt;tidyr&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#more-resources&#34;&gt;More resources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Learning Objectives &lt;br&gt;
1. Understand the concept of a wide and a long table format and for which purpose those formats are useful. &lt;br&gt;
2. Understand what key-value pairs are. &lt;br&gt;
3. Reshape a dataframe from long to wide format and back with the &lt;code&gt;tidyr::pivot_longer()&lt;/code&gt; and &lt;code&gt;tidyr::pivot_wider()&lt;/code&gt; commands.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;div id=&#34;overview&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Overview&lt;/h2&gt;
&lt;p&gt;It is often said that the vast majority of the time in a data analysis is spent on cleaning and preparing data. This work must be repeated many times over the course of an analysis as new problems come to light or new data is collected.&lt;/p&gt;
&lt;p&gt;Most people are used to analysing data in a spreadsheet or tabular format. For instance, if we wanted to study climate change, we could find data on the &lt;em&gt;Combined Land-Surface Air and Sea-Surface Water Temperature Anomalies&lt;/em&gt; in the Northern Hemisphere at &lt;a href=&#34;https://data.giss.nasa.gov/gistemp&#34;&gt;NASA’s Goddard Institute for Space Studies&lt;/a&gt;. The &lt;a href=&#34;https://data.giss.nasa.gov/gistemp/tabledata_v3/NH.Ts+dSST.txt&#34;&gt;tabular data of temperature anomalies can be found here&lt;/a&gt; and part of that data set is shown below:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/weather_anomalies.png&#34; width=&#34;90%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
&lt;p&gt;Many of these tabular spreadsheets were designed for efficient data entry, not for statistical analysis. The principles of tidy data provide a standard way to organise data values within a dataset. A standard makes initial data cleaning easier because you don’t need to start from scratch and reinvent the wheel every time.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tidy data&lt;/strong&gt; is a specific way of organising data in a consistent manner and structuring datasets to facilitate analysis with the tidyverse. The tidy data standard has been designed to facilitate exploratory data analysis; tidy datasets and tidy tools help make data analysis easier, allowing you to focus on the interesting domain problem, not on the logistics of cleaning data.&lt;/p&gt;
&lt;p&gt;Before we proceed, a few definitions taken from &lt;a href=&#34;https://garrettgman.github.io/tidying/&#34;&gt;Garrett Grolemund&lt;/a&gt; and &lt;code&gt;vignette(&amp;quot;tidy-data&amp;quot;)&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Variable&lt;/strong&gt;: A quantity, quality, or property that you can measure.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Observation&lt;/strong&gt;: A set of values that display the relationship between variables. To be an observation, values need to be measured under similar conditions, usually measured on the same observational unit at the same time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Value&lt;/strong&gt;: The state of a variable that you observe when you measure it.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are three rules which make a dataset &lt;strong&gt;tidy&lt;/strong&gt;:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Each variable must have its own column.&lt;/li&gt;
&lt;li&gt;Each observation must have its own row.&lt;/li&gt;
&lt;li&gt;Each value must have its own cell.&lt;/li&gt;
&lt;/ol&gt;
&lt;div class=&#34;figure&#34;&gt;
&lt;img src=&#34;https://r4ds.had.co.nz/images/tidy-1.png&#34; alt=&#34;&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;Figure 12.1 from &lt;a href=&#34;https://r4ds.had.co.nz&#34;&gt;&lt;em&gt;R for Data Science&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;A tidy dataset is a &lt;strong&gt;long&lt;/strong&gt; dataset, where each variable appears in one column, and each observation has its own row.
The weather anomalies dataset is a &lt;strong&gt;wide&lt;/strong&gt; dataset: the monthly values are spread across separate columns, whereas the underlying variables are &lt;code&gt;year&lt;/code&gt;, &lt;code&gt;month&lt;/code&gt; (or a single &lt;code&gt;date&lt;/code&gt; if you combine the two), and &lt;code&gt;delta&lt;/code&gt; (the actual temperature difference).&lt;/p&gt;
&lt;p&gt;We will often need to reshape our datasets and should have a way to go:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;from wide format to long (tidy) format using &lt;code&gt;tidyr::gather()&lt;/code&gt; or &lt;code&gt;tidyr::pivot_longer()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;from long (tidy) to wide format using &lt;code&gt;tidyr::spread()&lt;/code&gt; or &lt;code&gt;tidyr::pivot_wider()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In a set of wonderful animations from &lt;a href=&#34;https://github.com/gadenbuie/tidyexplain#tidy-data&#34;&gt;Garrick Aden-Buie&lt;/a&gt;, this is the process of converting from long format to wide and back:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/tidyr-longer-wider.gif&#34; width=&#34;90%&#34; style=&#34;display: block; margin: auto;&#34; /&gt;&lt;/p&gt;
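&lt;p&gt;Let us review the basic tasks for tidying data using the small example tables from &lt;em&gt;R for Data Science&lt;/em&gt; (&lt;code&gt;table1&lt;/code&gt;, &lt;code&gt;table2&lt;/code&gt;, and so on). They ship with the &lt;code&gt;tidyr&lt;/code&gt; package and record tuberculosis cases and population for three countries in 1999 and 2000.&lt;/p&gt;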
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table1&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 4
##   country      year  cases population
##   &amp;lt;chr&amp;gt;       &amp;lt;int&amp;gt;  &amp;lt;int&amp;gt;      &amp;lt;int&amp;gt;
## 1 Afghanistan  1999    745   19987071
## 2 Afghanistan  2000   2666   20595360
## 3 Brazil       1999  37737  172006362
## 4 Brazil       2000  80488  174504898
## 5 China        1999 212258 1272915272
## 6 China        2000 213766 1280428583&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that in this data frame, each variable is in its own column (&lt;code&gt;country&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt;, &lt;code&gt;cases&lt;/code&gt;, and &lt;code&gt;population&lt;/code&gt;), each observation is in its own row (i.e. each row is a different country-year pairing), and each value has its own cell.&lt;/p&gt;
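&lt;p&gt;One payoff of this layout is that new variables can be computed column by column. As a minimal sketch, the code below adds cases per 10,000 people; the &lt;code&gt;rate&lt;/code&gt; column name and the per-10,000 scaling are our own illustrative choices:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)  # provides table1

# Because cases and population each sit in their own column,
# the rate is a simple column-wise calculation
table1_rate &amp;lt;- table1 %&amp;gt;% 
  mutate(rate = cases / population * 10000)

table1_rate&lt;/code&gt;&lt;/pre&gt;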
&lt;div id=&#34;pivot_longer-or-gather-data&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;code&gt;pivot_longer&lt;/code&gt; or &lt;code&gt;gather&lt;/code&gt; data&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Gathering&lt;/strong&gt; entails bringing a variable spread across multiple columns into a single column. For example, this version of &lt;code&gt;table1&lt;/code&gt; is not tidy because the &lt;code&gt;year&lt;/code&gt; variable is in wide format, spread across multiple columns:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table4a&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 3 x 3
##   country     `1999` `2000`
## * &amp;lt;chr&amp;gt;        &amp;lt;int&amp;gt;  &amp;lt;int&amp;gt;
## 1 Afghanistan    745   2666
## 2 Brazil       37737  80488
## 3 China       212258 213766&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A tidy version of this data frame would have the variables (columns) &lt;code&gt;country&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt;, and &lt;code&gt;cases&lt;/code&gt;. We can use the &lt;code&gt;pivot_longer()&lt;/code&gt; or &lt;code&gt;gather()&lt;/code&gt; function from the &lt;code&gt;tidyr&lt;/code&gt; package to reshape the data frame and make it tidy. To do this we need three pieces of information:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;The names of the columns that represent the values, not variables. Here, those are &lt;code&gt;1999&lt;/code&gt; and &lt;code&gt;2000&lt;/code&gt;.&lt;/li&gt;
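&lt;li&gt;The &lt;code&gt;key&lt;/code&gt; (called &lt;code&gt;names_to&lt;/code&gt; in &lt;code&gt;pivot_longer()&lt;/code&gt;), or the name of the variable whose values form the column names. Here that is &lt;code&gt;year&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;value&lt;/code&gt; (called &lt;code&gt;values_to&lt;/code&gt; in &lt;code&gt;pivot_longer()&lt;/code&gt;), or the name of the variable whose values are spread over the cells. Here that is &lt;code&gt;cases&lt;/code&gt;.&lt;/li&gt;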
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;Notice that we create the names for &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt; - they do not already exist in the data frame.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;We implement this using the &lt;code&gt;pivot_longer()&lt;/code&gt; or &lt;code&gt;gather()&lt;/code&gt; function. &lt;code&gt;pivot_longer()&lt;/code&gt; requires &lt;code&gt;tidyr&lt;/code&gt; version 1.0.0 or later; &lt;code&gt;gather()&lt;/code&gt; still works in current versions, but it has been superseded by &lt;code&gt;pivot_longer()&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Once you have a recent version of &lt;code&gt;tidyr&lt;/code&gt; installed, you can use either function:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table4a %&amp;gt;% 
  pivot_longer(cols=c(`1999`, `2000`), 
               names_to = &amp;quot;year&amp;quot;, 
               values_to = &amp;quot;cases&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   country     year   cases
##   &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt;  &amp;lt;int&amp;gt;
## 1 Afghanistan 1999     745
## 2 Afghanistan 2000    2666
## 3 Brazil      1999   37737
## 4 Brazil      2000   80488
## 5 China       1999  212258
## 6 China       2000  213766&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table4a %&amp;gt;% 
  gather(`1999`, `2000`, 
         key = year, 
         value = cases)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   country     year   cases
##   &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt;  &amp;lt;int&amp;gt;
## 1 Afghanistan 1999     745
## 2 Brazil      1999   37737
## 3 China       1999  212258
## 4 Afghanistan 2000    2666
## 5 Brazil      2000   80488
## 6 China       2000  213766&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This operation is called reshaping data from wide to long.&lt;/p&gt;
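&lt;p&gt;Two details in the outputs above are worth a closer look: the new &lt;code&gt;year&lt;/code&gt; column is text (&lt;code&gt;&amp;lt;chr&amp;gt;&lt;/code&gt;) because it was built from column names, and the two functions return the same rows in a different order. The following sketch converts &lt;code&gt;year&lt;/code&gt; to an integer with &lt;code&gt;mutate()&lt;/code&gt; and then checks that the two results contain the same rows. The object names are our own, and &lt;code&gt;anti_join()&lt;/code&gt; is just one possible way to make the comparison.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)  # provides table4a

long_pivot &amp;lt;- table4a %&amp;gt;% 
  pivot_longer(cols = c(`1999`, `2000`), 
               names_to = &amp;quot;year&amp;quot;, 
               values_to = &amp;quot;cases&amp;quot;) %&amp;gt;% 
  mutate(year = as.integer(year))  # column names arrive as text

long_gather &amp;lt;- table4a %&amp;gt;% 
  gather(`1999`, `2000`, key = year, value = cases) %&amp;gt;% 
  mutate(year = as.integer(year))

# Rows of long_pivot that have no match in long_gather: there should be none
anti_join(long_pivot, long_gather, by = c(&amp;quot;country&amp;quot;, &amp;quot;year&amp;quot;, &amp;quot;cases&amp;quot;))&lt;/code&gt;&lt;/pre&gt;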
&lt;/div&gt;
&lt;div id=&#34;pivot_wider-or-spread-data&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;code&gt;pivot_wider&lt;/code&gt; or &lt;code&gt;spread&lt;/code&gt; data&lt;/h3&gt;
&lt;p&gt;If we want to make a long table into a wide one, we use &lt;code&gt;pivot_wider()&lt;/code&gt; or &lt;code&gt;spread()&lt;/code&gt;. &lt;strong&gt;Spreading&lt;/strong&gt; brings an observation spread across multiple rows into a single row. It is the reverse of gathering, which takes a wide dataset and makes it long. For instance, take &lt;code&gt;table2&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table2&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 12 x 4
##    country      year type            count
##    &amp;lt;chr&amp;gt;       &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt;           &amp;lt;int&amp;gt;
##  1 Afghanistan  1999 cases             745
##  2 Afghanistan  1999 population   19987071
##  3 Afghanistan  2000 cases            2666
##  4 Afghanistan  2000 population   20595360
##  5 Brazil       1999 cases           37737
##  6 Brazil       1999 population  172006362
##  7 Brazil       2000 cases           80488
##  8 Brazil       2000 population  174504898
##  9 China        1999 cases          212258
## 10 China        1999 population 1272915272
## 11 China        2000 cases          213766
## 12 China        2000 population 1280428583&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;table2&lt;/code&gt; violates the tidy data principles because each observation (the unit of analysis is a country-year pairing) is split across multiple rows. To tidy the data frame, we need to know:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
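&lt;li&gt;The &lt;code&gt;key&lt;/code&gt; column (called &lt;code&gt;names_from&lt;/code&gt; in &lt;code&gt;pivot_wider()&lt;/code&gt;), or the column that contains variable names. Here, it is &lt;code&gt;type&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;value&lt;/code&gt; column (called &lt;code&gt;values_from&lt;/code&gt; in &lt;code&gt;pivot_wider()&lt;/code&gt;), or the column that contains values for multiple variables. Here it is &lt;code&gt;count&lt;/code&gt;.&lt;/li&gt;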
&lt;/ol&gt;
&lt;blockquote&gt;
&lt;p&gt;Notice that unlike for gathering, when spreading the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt; columns are already defined in the data frame. We do not create the names ourselves, only identify them in the existing data frame.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table2 %&amp;gt;%
  pivot_wider(names_from = type, values_from = count)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 4
##   country      year  cases population
##   &amp;lt;chr&amp;gt;       &amp;lt;int&amp;gt;  &amp;lt;int&amp;gt;      &amp;lt;int&amp;gt;
## 1 Afghanistan  1999    745   19987071
## 2 Afghanistan  2000   2666   20595360
## 3 Brazil       1999  37737  172006362
## 4 Brazil       2000  80488  174504898
## 5 China        1999 212258 1272915272
## 6 China        2000 213766 1280428583&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table2 %&amp;gt;%
  spread(key = type, value = count)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 4
##   country      year  cases population
##   &amp;lt;chr&amp;gt;       &amp;lt;int&amp;gt;  &amp;lt;int&amp;gt;      &amp;lt;int&amp;gt;
## 1 Afghanistan  1999    745   19987071
## 2 Afghanistan  2000   2666   20595360
## 3 Brazil       1999  37737  172006362
## 4 Brazil       2000  80488  174504898
## 5 China        1999 212258 1272915272
## 6 China        2000 213766 1280428583&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This operation is called reshaping data from long to wide.&lt;/p&gt;
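&lt;p&gt;The widened result has the same layout as &lt;code&gt;table1&lt;/code&gt; from the overview. If you would like to verify that rather than compare the printouts by eye, a check along these lines should work; the object name &lt;code&gt;wide&lt;/code&gt; is our own:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)  # provides table1 and table2

wide &amp;lt;- table2 %&amp;gt;% 
  pivot_wider(names_from = type, values_from = count)

# Compare the reshaped columns with the tidy table1; both should return TRUE
all.equal(wide$cases, table1$cases)
all.equal(wide$population, table1$population)&lt;/code&gt;&lt;/pre&gt;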
&lt;/div&gt;
&lt;div id=&#34;separating&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Separating&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Separating&lt;/strong&gt; splits multiple variables stored in a single column into multiple columns. For example in &lt;code&gt;table3&lt;/code&gt;, the &lt;code&gt;rate&lt;/code&gt; column contains both &lt;code&gt;cases&lt;/code&gt; and &lt;code&gt;population&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table3&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   country      year rate             
## * &amp;lt;chr&amp;gt;       &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt;            
## 1 Afghanistan  1999 745/19987071     
## 2 Afghanistan  2000 2666/20595360    
## 3 Brazil       1999 37737/172006362  
## 4 Brazil       2000 80488/174504898  
## 5 China        1999 212258/1272915272
## 6 China        2000 213766/1280428583&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;strong&gt;This is a bad idea&lt;/strong&gt;: both values are locked inside a single text string, so you cannot calculate with either of them directly. Tidy data principles require each column to contain a single variable. We can use &lt;code&gt;separate()&lt;/code&gt; to split the column into two new columns:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table3 %&amp;gt;% 
  separate(rate, into = c(&amp;quot;cases&amp;quot;, &amp;quot;population&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 4
##   country      year cases  population
##   &amp;lt;chr&amp;gt;       &amp;lt;int&amp;gt; &amp;lt;chr&amp;gt;  &amp;lt;chr&amp;gt;     
## 1 Afghanistan  1999 745    19987071  
## 2 Afghanistan  2000 2666   20595360  
## 3 Brazil       1999 37737  172006362 
## 4 Brazil       2000 80488  174504898 
## 5 China        1999 212258 1272915272
## 6 China        2000 213766 1280428583&lt;/code&gt;&lt;/pre&gt;
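&lt;p&gt;Note that both new columns in the output above are still character (&lt;code&gt;&amp;lt;chr&amp;gt;&lt;/code&gt;). A minimal sketch of one way to get numbers instead, assuming &lt;strong&gt;tidyr&lt;/strong&gt; is installed: &lt;code&gt;separate()&lt;/code&gt; has a &lt;code&gt;sep&lt;/code&gt; argument to name the separator explicitly and a &lt;code&gt;convert&lt;/code&gt; argument that asks it to guess a better type for the new columns. The name &lt;code&gt;table3_tidy&lt;/code&gt; is made up for this illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(tidyr)

# table3_tidy is a hypothetical name used only in this sketch
table3_tidy &amp;lt;- table3 %&amp;gt;% 
  separate(rate, into = c(&amp;quot;cases&amp;quot;, &amp;quot;population&amp;quot;), 
           sep = &amp;quot;/&amp;quot;,      # split only at the forward slash
           convert = TRUE)  # let tidyr guess the type of the new columns

# check the class of every column after the conversion
sapply(table3_tidy, class)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With numeric columns you can go straight on to calculations such as &lt;code&gt;cases / population&lt;/code&gt;.&lt;/p&gt;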
&lt;/div&gt;
&lt;div id=&#34;uniting&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Uniting&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;Uniting&lt;/strong&gt; is the inverse of separating: when a variable is stored in multiple columns, uniting brings the variable back into a single column. &lt;code&gt;table5&lt;/code&gt; splits the year variable into two columns:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table5&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 4
##   country     century year  rate             
## * &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt;            
## 1 Afghanistan 19      99    745/19987071     
## 2 Afghanistan 20      00    2666/20595360    
## 3 Brazil      19      99    37737/172006362  
## 4 Brazil      20      00    80488/174504898  
## 5 China       19      99    212258/1272915272
## 6 China       20      00    213766/1280428583&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To bring them back, use the &lt;code&gt;unite()&lt;/code&gt; function:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;table5 %&amp;gt;% 
  unite(new, century, year)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   country     new   rate             
##   &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt;            
## 1 Afghanistan 19_99 745/19987071     
## 2 Afghanistan 20_00 2666/20595360    
## 3 Brazil      19_99 37737/172006362  
## 4 Brazil      20_00 80488/174504898  
## 5 China       19_99 212258/1272915272
## 6 China       20_00 213766/1280428583&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# remove underscore
table5 %&amp;gt;% 
  unite(new, century, year, sep = &amp;quot;&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 6 x 3
##   country     new   rate             
##   &amp;lt;chr&amp;gt;       &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt;            
## 1 Afghanistan 1999  745/19987071     
## 2 Afghanistan 2000  2666/20595360    
## 3 Brazil      1999  37737/172006362  
## 4 Brazil      2000  80488/174504898  
## 5 China       1999  212258/1272915272
## 6 China       2000  213766/1280428583&lt;/code&gt;&lt;/pre&gt;
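&lt;p&gt;The two verbs can be combined. The following is a sketch, assuming &lt;strong&gt;dplyr&lt;/strong&gt; and &lt;strong&gt;tidyr&lt;/strong&gt; are installed, of one way to tidy &lt;code&gt;table5&lt;/code&gt; in a single pipeline. The names &lt;code&gt;table5_tidy&lt;/code&gt; and &lt;code&gt;full_year&lt;/code&gt; are made up for this illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)

# table5_tidy and full_year are hypothetical names used only in this sketch
table5_tidy &amp;lt;- table5 %&amp;gt;% 
  # glue century and year together with no separator, e.g. 19 and 99 give 1999
  unite(full_year, century, year, sep = &amp;quot;&amp;quot;) %&amp;gt;% 
  # unite() returns text, so convert the result to an integer
  mutate(full_year = as.integer(full_year)) %&amp;gt;% 
  # split rate at the slash and let tidyr guess the column types
  separate(rate, into = c(&amp;quot;cases&amp;quot;, &amp;quot;population&amp;quot;), 
           sep = &amp;quot;/&amp;quot;, convert = TRUE)

table5_tidy&lt;/code&gt;&lt;/pre&gt;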
&lt;p&gt;If we wanted to turn &lt;code&gt;gapminder&lt;/code&gt; into a wide dataframe, with one row per country and one column of life expectancy per year, we would use &lt;code&gt;pivot_wider()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder_life_exp_wide &amp;lt;- gapminder %&amp;gt;% 
  select(country, continent, lifeExp, year) %&amp;gt;% 
  pivot_wider(names_from = year, values_from = lifeExp)

gapminder_life_exp_wide&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 142 x 14
##    country continent `1952` `1957` `1962` `1967` `1972` `1977` `1982` `1987`
##    &amp;lt;fct&amp;gt;   &amp;lt;fct&amp;gt;      &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;
##  1 Afghan~ Asia        28.8   30.3   32.0   34.0   36.1   38.4   39.9   40.8
##  2 Albania Europe      55.2   59.3   64.8   66.2   67.7   68.9   70.4   72  
##  3 Algeria Africa      43.1   45.7   48.3   51.4   54.5   58.0   61.4   65.8
##  4 Angola  Africa      30.0   32.0   34     36.0   37.9   39.5   39.9   39.9
##  5 Argent~ Americas    62.5   64.4   65.1   65.6   67.1   68.5   69.9   70.8
##  6 Austra~ Oceania     69.1   70.3   70.9   71.1   71.9   73.5   74.7   76.3
##  7 Austria Europe      66.8   67.5   69.5   70.1   70.6   72.2   73.2   74.9
##  8 Bahrain Asia        50.9   53.8   56.9   59.9   63.3   65.6   69.1   70.8
##  9 Bangla~ Asia        37.5   39.3   41.2   43.5   45.3   46.9   50.0   52.8
## 10 Belgium Europe      68     69.2   70.2   70.9   71.4   72.8   73.9   75.4
## # ... with 132 more rows, and 4 more variables: `1992` &amp;lt;dbl&amp;gt;, `1997` &amp;lt;dbl&amp;gt;,
## #   `2002` &amp;lt;dbl&amp;gt;, `2007` &amp;lt;dbl&amp;gt;&lt;/code&gt;&lt;/pre&gt;
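&lt;p&gt;Similarly, if we wanted to convert from the wide gapminder to the long one, we would use either &lt;code&gt;gather()&lt;/code&gt; or &lt;code&gt;pivot_longer()&lt;/code&gt;. Both return the same 1,704 rows, although in a different order: &lt;code&gt;gather()&lt;/code&gt; stacks the year columns one after another, while &lt;code&gt;pivot_longer()&lt;/code&gt; keeps the rows of each country together.&lt;/p&gt;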
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder_life_exp_wide %&amp;gt;% 
  gather(key = &amp;quot;year&amp;quot;, value = &amp;quot;lifeExp&amp;quot;,
         -country, -continent) %&amp;gt;% 
  mutate(year = as.numeric(year)) &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1,704 x 4
##    country     continent  year lifeExp
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;
##  1 Afghanistan Asia       1952    28.8
##  2 Albania     Europe     1952    55.2
##  3 Algeria     Africa     1952    43.1
##  4 Angola      Africa     1952    30.0
##  5 Argentina   Americas   1952    62.5
##  6 Australia   Oceania    1952    69.1
##  7 Austria     Europe     1952    66.8
##  8 Bahrain     Asia       1952    50.9
##  9 Bangladesh  Asia       1952    37.5
## 10 Belgium     Europe     1952    68  
## # ... with 1,694 more rows&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;gapminder_life_exp_wide %&amp;gt;% 
  pivot_longer(
    cols = c(-country, -continent), # pivot every column except country and continent
    names_to = &amp;quot;year&amp;quot;, 
    values_to = &amp;quot;lifeExp&amp;quot;
  ) %&amp;gt;% 
  mutate(year = as.numeric(year)) &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 1,704 x 4
##    country     continent  year lifeExp
##    &amp;lt;fct&amp;gt;       &amp;lt;fct&amp;gt;     &amp;lt;dbl&amp;gt;   &amp;lt;dbl&amp;gt;
##  1 Afghanistan Asia       1952    28.8
##  2 Afghanistan Asia       1957    30.3
##  3 Afghanistan Asia       1962    32.0
##  4 Afghanistan Asia       1967    34.0
##  5 Afghanistan Asia       1972    36.1
##  6 Afghanistan Asia       1977    38.4
##  7 Afghanistan Asia       1982    39.9
##  8 Afghanistan Asia       1987    40.8
##  9 Afghanistan Asia       1992    41.7
## 10 Afghanistan Asia       1997    41.8
## # ... with 1,694 more rows&lt;/code&gt;&lt;/pre&gt;
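&lt;p&gt;Both versions above need an extra &lt;code&gt;mutate()&lt;/code&gt; because the former column names arrive as text. The following is a sketch of an alternative, assuming &lt;strong&gt;tidyr&lt;/strong&gt; 1.1.0 or later, where &lt;code&gt;pivot_longer()&lt;/code&gt; has a &lt;code&gt;names_transform&lt;/code&gt; argument that converts the new &lt;code&gt;year&lt;/code&gt; column while pivoting. The name &lt;code&gt;gapminder_life_exp_long&lt;/code&gt; is made up for this illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(gapminder)
library(dplyr)
library(tidyr)

# rebuild the wide table so this sketch runs on its own
gapminder_life_exp_wide &amp;lt;- gapminder %&amp;gt;% 
  select(country, continent, lifeExp, year) %&amp;gt;% 
  pivot_wider(names_from = year, values_from = lifeExp)

# gapminder_life_exp_long is a hypothetical name used only in this sketch
gapminder_life_exp_long &amp;lt;- gapminder_life_exp_wide %&amp;gt;% 
  pivot_longer(
    cols = -c(country, continent),            # pivot every year column
    names_to = &amp;quot;year&amp;quot;, 
    values_to = &amp;quot;lifeExp&amp;quot;,
    names_transform = list(year = as.integer) # convert the year names while pivoting
  )

# the long table should have as many rows as the original gapminder data
nrow(gapminder_life_exp_long) == nrow(gapminder)&lt;/code&gt;&lt;/pre&gt;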
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;rstudio-primer-on-tidyr&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;RStudio primer on &lt;strong&gt;tidyr&lt;/strong&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://rstudio.cloud/learn/primers/4.1&#34;&gt;Reshape Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;blockquote&gt;
&lt;p&gt;Recent versions of &lt;strong&gt;tidyr&lt;/strong&gt; (1.0.0 and later) have superseded these core functions: &lt;code&gt;pivot_longer()&lt;/code&gt; replaces &lt;code&gt;gather()&lt;/code&gt; and &lt;code&gt;pivot_wider()&lt;/code&gt; replaces &lt;code&gt;spread()&lt;/code&gt;. The syntax for these &lt;code&gt;pivot_*()&lt;/code&gt; functions is &lt;em&gt;slightly&lt;/em&gt; different from what it was in &lt;code&gt;gather()&lt;/code&gt; and &lt;code&gt;spread()&lt;/code&gt;, so you can’t just replace the names. Even though both &lt;code&gt;gather()&lt;/code&gt; and &lt;code&gt;spread()&lt;/code&gt; still work and won’t go away for a while, I think it’s worth learning the newer &lt;code&gt;pivot_*()&lt;/code&gt; functions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/div&gt;
&lt;div id=&#34;more-resources&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;More resources&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://www.storybench.org/pivoting-data-from-columns-to-rows-and-back-in-the-tidyverse/&#34; target=&#34;_blank&#34;&gt;Pivoting data from columns to rows (and back!) in the tidyverse&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://tidyr.tidyverse.org/dev/articles/pivot.html&#34; target=&#34;_blank&#34;&gt;Pivoting in &lt;code&gt;tidyr&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Hadley Wickham’s &lt;a href=&#34;https://vita.had.co.nz/papers/tidy-data.html&#34; target=&#34;_blank&#34;&gt;tidy data paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989&#34; target=&#34;_blank&#34;&gt;Data Organization in Spreadsheets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Side-by-side regression tables</title>
      <link>https://bit-2021.netlify.app/example/modelling_side_by_side_tables/</link>
      <pubDate>Tue, 28 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/example/modelling_side_by_side_tables/</guid>
      <description>

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#huxtablehuxreg&#34;&gt;&lt;code&gt;huxtable::huxreg()&lt;/code&gt;&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#installing&#34;&gt;Installing&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#basic-usage&#34;&gt;Basic usage&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#statistics-to-display-bold-significant-variables-add-captions&#34;&gt;Statistics to display, bold significant variables, add captions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#warning-1-huxtable-reformats-all-your-tables&#34;&gt;Warning 1: &lt;strong&gt;huxtable&lt;/strong&gt; reformats all your tables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#warning-2-knitting-to-pdf-is-fragile&#34;&gt;Warning 2: Knitting to PDF is fragile&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;It’s often helpful to put the results from regression models in a side-by-side table so you can compare coefficients across different model specifications. If you’re unfamiliar with these kinds of tables, &lt;a href=&#34;http://svmiller.com/blog/2014/08/reading-a-regression-table-a-guide-for-students/&#34;&gt;check out this helpful guide to how to read them&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We will use the &lt;code&gt;huxreg()&lt;/code&gt; function in the &lt;a href=&#34;https://hughjonesd.github.io/huxtable/&#34;&gt;&lt;strong&gt;huxtable&lt;/strong&gt; package&lt;/a&gt;, the Palmer Penguins dataset, and a few regression models that try to explain &lt;code&gt;body_mass_g&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)

model1 &amp;lt;- lm(body_mass_g ~ flipper_length_mm, data = penguins)
model2 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + bill_length_mm, data = penguins)
model3 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + bill_length_mm + species, data = penguins)
model4 &amp;lt;- lm(body_mass_g ~ flipper_length_mm + bill_length_mm + species + sex, data = penguins)&lt;/code&gt;&lt;/pre&gt;
&lt;div id=&#34;huxtablehuxreg&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;huxtable::huxreg()&lt;/code&gt;&lt;/h2&gt;
&lt;div id=&#34;installing&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Installing&lt;/h3&gt;
&lt;p&gt;&lt;strong&gt;huxtable&lt;/strong&gt; is published on CRAN, so use the “Packages” panel in RStudio to install &lt;code&gt;huxtable&lt;/code&gt;, or use:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;huxtable&amp;quot;, dependencies = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To knit to Word, you also need the &lt;strong&gt;flextable&lt;/strong&gt; package, so install that too from the “Packages” panel, or use:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;install.packages(&amp;quot;flextable&amp;quot;, dependencies = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Using &lt;code&gt;huxtable::huxreg()&lt;/code&gt; to knit to HTML and Word works very well.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;basic-usage&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Basic usage&lt;/h3&gt;
&lt;p&gt;As you go about building your linear models, save them as objects with different names, e.g., &lt;em&gt;model1&lt;/em&gt;, &lt;em&gt;model2&lt;/em&gt;, etc. Once you have the models you want to compare, pass them to &lt;code&gt;huxreg()&lt;/code&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(huxtable)

huxreg(model1, model2, model3, model4)&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:unnamed-chunk-3&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(1)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(2)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(3)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(4)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(Intercept)&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-5780.831 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-5736.897 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-3904.387 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-759.064&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(305.815)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(307.959)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(529.257)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(541.377)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;flipper_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;49.686 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;48.145 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;27.429 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;17.847 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(1.518)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(2.011)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(3.176)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(2.902)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;6.047&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;61.736 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;21.633 **&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(5.180)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(7.126)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(7.148)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesChinstrap&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-748.562 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-291.711 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(81.534)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(81.502)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesGentoo&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;90.435&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;707.028 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(88.647)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(94.359)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;sexmale&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;465.395 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(43.081)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;N&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;R2&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.759&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.760&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.822&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.871&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;logLik&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2528.427&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2527.741&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2476.373&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2359.787&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;AIC&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;5062.855&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;5063.482&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;4964.745&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;4733.574&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th colspan=&#34;5&#34; style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt; *** p &amp;lt; 0.001;  ** p &amp;lt; 0.01;  * p &amp;lt; 0.05.&lt;/th&gt;&lt;/tr&gt;
&lt;/table&gt;
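&lt;p&gt;In the table above, models 1 to 3 use N = 342 observations but model 4 uses only 333: &lt;code&gt;lm()&lt;/code&gt; drops every row with a missing value in any variable of the model, and &lt;code&gt;sex&lt;/code&gt; is missing for some penguins. Statistics such as logLik and AIC are only comparable between models fitted to the same observations. The following is a minimal sketch of one way to put all four models on the same rows, assuming you are happy to discard the incomplete rows. The names &lt;code&gt;model_vars&lt;/code&gt; and &lt;code&gt;penguins_complete&lt;/code&gt; and the &lt;code&gt;_cc&lt;/code&gt; suffix are made up for this illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(palmerpenguins)

# keep only the rows that are complete for every variable used in model 4
# (model_vars, penguins_complete and the _cc names are hypothetical)
model_vars &amp;lt;- c(&amp;quot;body_mass_g&amp;quot;, &amp;quot;flipper_length_mm&amp;quot;, &amp;quot;bill_length_mm&amp;quot;, &amp;quot;species&amp;quot;, &amp;quot;sex&amp;quot;)
penguins_complete &amp;lt;- na.omit(penguins[, model_vars])

# refit the same four specifications on the common set of rows
model1_cc &amp;lt;- lm(body_mass_g ~ flipper_length_mm, data = penguins_complete)
model2_cc &amp;lt;- lm(body_mass_g ~ flipper_length_mm + bill_length_mm, data = penguins_complete)
model3_cc &amp;lt;- lm(body_mass_g ~ flipper_length_mm + bill_length_mm + species, data = penguins_complete)
model4_cc &amp;lt;- lm(body_mass_g ~ flipper_length_mm + bill_length_mm + species + sex, data = penguins_complete)

# every model should now report the same number of observations
sapply(list(model1_cc, model2_cc, model3_cc, model4_cc), nobs)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The refitted models can be passed to &lt;code&gt;huxreg()&lt;/code&gt; in exactly the same way as the originals.&lt;/p&gt;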

&lt;p&gt;You can also name the columns to make the output table more user-friendly, by passing the models as a named list:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;huxreg(list(&amp;quot;A&amp;quot; = model1, &amp;quot;B&amp;quot; = model2, &amp;quot;C&amp;quot; = model3, &amp;quot;D&amp;quot; = model4))&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:unnamed-chunk-4&#34;&gt;
&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;A&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;B&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;C&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;D&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(Intercept)&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-5780.831 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-5736.897 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-3904.387 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-759.064&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(305.815)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(307.959)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(529.257)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(541.377)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;flipper_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;49.686 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;48.145 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;27.429 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;17.847 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(1.518)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(2.011)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(3.176)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(2.902)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;6.047&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;61.736 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;21.633 **&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(5.180)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(7.126)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(7.148)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesChinstrap&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-748.562 ***&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-291.711 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(81.534)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(81.502)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesGentoo&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;90.435&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;707.028 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(88.647)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(94.359)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;sexmale&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;465.395 ***&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(43.081)&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;N&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;333&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;R2&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.759&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.760&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.822&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.871&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;logLik&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2528.427&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2527.741&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2476.373&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;-2359.787&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;AIC&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;5062.855&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;5063.482&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;4964.745&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;4733.574&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th colspan=&#34;5&#34; style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt; *** p &amp;lt; 0.001;  ** p &amp;lt; 0.01;  * p &amp;lt; 0.05.&lt;/th&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;/div&gt;
&lt;div id=&#34;statistics-to-display-bold-significant-variables-add-captions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Statistics to display, bold significant variables, add captions&lt;/h3&gt;
&lt;p&gt;You can choose which statistics to display. For instance, to show the number of observations, R^2, adjusted R^2, and the residual standard error, pass them to the &lt;code&gt;statistics&lt;/code&gt; argument. You can also bold the coefficients that are significant at, say, the 0.05 level with &lt;code&gt;bold_signif&lt;/code&gt;, and drop the stars that denote significance levels by setting &lt;code&gt;stars = NULL&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;huxreg(model1, model2, model3,
       statistics = c(&amp;#39;#observations&amp;#39; = &amp;#39;nobs&amp;#39;,
                      &amp;#39;R squared&amp;#39; = &amp;#39;r.squared&amp;#39;,
                      &amp;#39;Adj. R Squared&amp;#39; = &amp;#39;adj.r.squared&amp;#39;,
                      &amp;#39;Residual SE&amp;#39; = &amp;#39;sigma&amp;#39;),
       bold_signif = 0.05,
       stars = NULL
) %&amp;gt;%
  set_caption(&amp;#39;Comparison of models&amp;#39;)&lt;/code&gt;&lt;/pre&gt;
&lt;table class=&#34;huxtable&#34; style=&#34;border-collapse: collapse; border: 0px; margin-bottom: 2em; margin-top: 2em; ; margin-left: auto; margin-right: auto;  &#34; id=&#34;tab:unnamed-chunk-5&#34;&gt;
&lt;caption style=&#34;caption-side: top; text-align: center;&#34;&gt;Comparison of models&lt;/caption&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;col&gt;&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(1)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(2)&lt;/th&gt;&lt;th style=&#34;vertical-align: top; text-align: center; white-space: normal; border-style: solid solid solid solid; border-width: 0.8pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(3)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(Intercept)&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-5780.831&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-5736.897&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-3904.387&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(305.815)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(307.959)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(529.257)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;flipper_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;49.686&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;48.145&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;27.429&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(1.518)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(2.011)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(3.176)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;bill_length_mm&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;6.047&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;61.736&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(5.180)&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(7.126)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesChinstrap&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;-748.562&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: bold;&#34;&gt;(81.534)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;speciesGentoo&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;90.435&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.4pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;(88.647)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;#observations&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0.4pt 0pt 0pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;342&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;R squared&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.759&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.760&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.822&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;Adj. R Squared&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.758&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.759&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;0.820&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;
&lt;th style=&#34;vertical-align: top; text-align: left; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;Residual SE&lt;/th&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;394.278&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;394.068&amp;nbsp;&lt;/td&gt;&lt;td style=&#34;vertical-align: top; text-align: right; white-space: normal; border-style: solid solid solid solid; border-width: 0pt 0pt 0.8pt 0pt;    padding: 6pt 6pt 6pt 6pt; font-weight: normal;&#34;&gt;340.114&amp;nbsp;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
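&lt;p&gt;The values on the right-hand side of &lt;code&gt;statistics&lt;/code&gt; (such as &lt;code&gt;r.squared&lt;/code&gt; and &lt;code&gt;sigma&lt;/code&gt;) correspond to column names returned by &lt;code&gt;broom::glance()&lt;/code&gt;. As a minimal sketch, assuming the &lt;strong&gt;broom&lt;/strong&gt; package is installed, you could list the statistics available for a model as shown here. The name &lt;code&gt;example_model&lt;/code&gt; and the built-in &lt;code&gt;mtcars&lt;/code&gt; data are illustrative stand-ins, not the penguin models used above:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(broom)

# Illustrative model on a built-in dataset (a stand-in, not one of the penguin models)
example_model &amp;lt;- lm(mpg ~ wt, data = mtcars)

# glance() returns a one-row summary of the model; its column names are
# candidates for the statistics argument of huxreg()
names(glance(example_model))&lt;/code&gt;&lt;/pre&gt;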

&lt;/div&gt;
&lt;div id=&#34;warning-1-huxtable-reformats-all-your-tables&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Warning 1: &lt;strong&gt;huxtable&lt;/strong&gt; reformats all your tables&lt;/h3&gt;
&lt;p&gt;If your document creates any other tables (for example, with &lt;code&gt;tidy()&lt;/code&gt;), &lt;strong&gt;huxtable&lt;/strong&gt; automatically applies its own formatting to them as well. If you don’t want that, you can turn it off with the following code; put it at the top of your document, near where you load your libraries:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# Huxtable likes to automatically format *all* tables, which is annoying. 
# This turns that off.
options(&amp;#39;huxtable.knit_print_df&amp;#39; = FALSE)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;warning-2-knitting-to-pdf-is-fragile&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Warning 2: Knitting to PDF is fragile&lt;/h3&gt;
&lt;p&gt;In order to knit to PDF, you need to install LaTeX, which you should have done when you installed &lt;code&gt;tinytex&lt;/code&gt;. When using &lt;strong&gt;huxtable&lt;/strong&gt;, before knitting to PDF for the first time on your computer, you need to run this in your &lt;em&gt;console&lt;/em&gt; to install the LaTeX packages that R uses to knit &lt;strong&gt;huxtable&lt;/strong&gt; tables to PDF:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;huxtable::install_latex_dependencies()&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you’re using &lt;strong&gt;tinytex&lt;/strong&gt;, you’ll also need to run this once on your computer:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;tinytex::tlmgr_install(&amp;quot;unicode-math&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
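&lt;p&gt;If you are not sure whether these steps have already been done on your computer, a rough check like the one here may help. This is only a sketch: it assumes both packages are installed and that your version of &lt;strong&gt;huxtable&lt;/strong&gt; provides &lt;code&gt;check_latex_dependencies()&lt;/code&gt;. Run it in your &lt;em&gt;console&lt;/em&gt;:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# TRUE if the LaTeX distribution in use is TinyTeX
tinytex::is_tinytex()

# Reports whether the LaTeX packages that huxtable needs are present
huxtable::check_latex_dependencies()&lt;/code&gt;&lt;/pre&gt;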
&lt;/div&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
