<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Exercises | Learning from Data: R programming</title>
    <link>https://bit-2021.netlify.app/exercise/</link>
      <atom:link href="https://bit-2021.netlify.app/exercise/index.xml" rel="self" type="application/rss+xml" />
    <description>Exercises</description>
    <generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><lastBuildDate>Sun, 26 Jul 2020 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://bit-2021.netlify.app/media/social-image.png</url>
      <title>Exercises</title>
      <link>https://bit-2021.netlify.app/exercise/</link>
    </image>
    
    <item>
      <title>Confidence Intervals</title>
      <link>https://bit-2021.netlify.app/exercise/inference_ci-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/inference_ci-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;


&lt;!---LEARNR sampling_mcq--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;sampling_mcq&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/sampling_mcq&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
</description>
    </item>
    
    <item>
      <title>Exploratory Data Analysis for Modelling</title>
      <link>https://bit-2021.netlify.app/exercise/modelling_eda-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/modelling_eda-exercise/</guid>
      <description>



</description>
    </item>
    
    <item>
      <title>R Syntax, Vectors, missing data</title>
      <link>https://bit-2021.netlify.app/exercise/rbasics-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/rbasics-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#r-syntax&#34;&gt;R Syntax&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#assignmnent-operator--&#34;&gt;Assignment Operator &lt;code&gt;&amp;lt;-&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#r-is-case-sensitive&#34;&gt;R is case sensitive&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#typos&#34;&gt;Typos&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#comments&#34;&gt;Comments&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#r-knows-youre-not-finished&#34;&gt;R knows you’re not finished&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#arithmetic-operations-and-functions&#34;&gt;Arithmetic Operations and Functions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#main-data-types-and-vectors&#34;&gt;Main Data types and Vectors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#your-turn&#34;&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#vectors&#34;&gt;Vectors&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#manipulating-vectors&#34;&gt;Manipulating vectors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#your-turn-1&#34;&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#factors&#34;&gt;Factors&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#your-turn-2&#34;&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#missing-data-or-na&#34;&gt;Missing data, or &lt;strong&gt;&lt;code&gt;NA&lt;/code&gt;&lt;/strong&gt;&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#your-turn-3&#34;&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#online-quiz-variable-types---vectors&#34;&gt;Online Quiz: Variable Types - Vectors&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;r-syntax&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;R Syntax&lt;/h2&gt;
&lt;p&gt;We can type commands in the command prompt and use R as a simple calculator. For instance, try typing &lt;code&gt;5 + 20&lt;/code&gt;, and hitting enter. When you do this, you’ve entered a command, and R will &lt;strong&gt;execute&lt;/strong&gt; that command. However, it’s more interesting when we can create objects or variables and work with these beasts!&lt;/p&gt;
&lt;div id=&#34;assignmnent-operator--&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Assignment Operator &lt;code&gt;&amp;lt;-&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;R treats everything (single numbers, lists, vectors, datasets) as &lt;strong&gt;objects&lt;/strong&gt;. To create an object, we use the assignment operator &lt;code&gt;&amp;lt;-&lt;/code&gt;. For instance, if we had data on a student whose name is Alex, who is 28 years old and comes from Athens, we would create three objects, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;age&lt;/code&gt;, and &lt;code&gt;city&lt;/code&gt;, and assign them the values &lt;code&gt;Alex&lt;/code&gt;, &lt;code&gt;28&lt;/code&gt;, and &lt;code&gt;Athens&lt;/code&gt; respectively by typing&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;name &amp;lt;- &amp;quot;Alex&amp;quot;
age &amp;lt;- 28
city &amp;lt;- &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The three objects have now been created; to print out their values, we can use the &lt;code&gt;print()&lt;/code&gt; function or just type the names of the objects.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;print(name); print(age); print(city)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Alex&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 28&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;name&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Alex&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;age&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 28&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;city&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can mentally read the command &lt;code&gt;age &amp;lt;- 28&lt;/code&gt; as &lt;em&gt;object &lt;code&gt;age&lt;/code&gt; becomes equal to the value 28&lt;/em&gt;. In RStudio, the keyboard shortcut &lt;code&gt;Alt + -&lt;/code&gt; (&lt;code&gt;Option + -&lt;/code&gt; on a Mac) inserts the assignment operator. We can do more interesting and useful things by creating variables and assigning values to them. For instance, if we have the relevant dimensions and want to calculate the floor area and volume of a room, we could do it as follows:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;room_length &amp;lt;- 5.63
room_width  &amp;lt;- 6.48
room_height &amp;lt;- 2.93
room_area &amp;lt;- room_length * room_width
room_volume &amp;lt;- room_length * room_width * room_height

room_area&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 36.4824&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;room_volume&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 106.8934&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;r-is-case-sensitive&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;R is case sensitive&lt;/h3&gt;
&lt;p&gt;R is case sensitive and needs everything exactly as it was defined. &lt;code&gt;age&lt;/code&gt; is different from &lt;code&gt;AgE&lt;/code&gt; and &lt;code&gt;Age&lt;/code&gt;. So if you type&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;age &amp;lt;- 28
AgE &amp;lt;- 34
Age &amp;lt;- 55

age; AgE; Age&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 28&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 34&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 55&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;R will create three different objects.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;typos&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Typos&lt;/h3&gt;
&lt;p&gt;R is a brilliant piece of software, but it cannot handle typos. Unlike Google’s search, with its &lt;em&gt;“Did you mean…”&lt;/em&gt; suggestions, R takes it on faith that what you typed is &lt;strong&gt;exactly&lt;/strong&gt; what you meant. For example, suppose that you forgot to hit the shift key when trying to type &lt;code&gt;+&lt;/code&gt;, and as a result your command ended up being &lt;code&gt;5 = 20&lt;/code&gt; rather than &lt;code&gt;5 + 20&lt;/code&gt;. Here’s what happens:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;5 = 20&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Error in 5 = 20: invalid (do_set) left-hand side to assignment&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;R attempted to interpret &lt;code&gt;5 = 20&lt;/code&gt; as a command, and spat out an error message because a number cannot be the target of an assignment. Even more subtle is the fact that some typos won’t produce errors at all, because they happen to correspond to valid R commands. For instance, suppose that instead of &lt;code&gt;5 + 20&lt;/code&gt;, you mistakenly type the command &lt;code&gt;5 - 20&lt;/code&gt;. Clearly, R has no way of knowing that you meant to add &lt;code&gt;20&lt;/code&gt; to &lt;code&gt;5&lt;/code&gt;, not subtract &lt;code&gt;20&lt;/code&gt; from &lt;code&gt;5&lt;/code&gt;, so what happens this time is this:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;5 - 20&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] -15&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this case, R produces the right answer, but to the wrong question.&lt;/p&gt;
&lt;p&gt;R will always try to do exactly what you ask it to do. There is no autocorrect or equivalent to “Did you mean…” in R, and for good reason. When doing advanced stuff (and even the simplest of statistics is pretty advanced in a lot of ways), it’s dangerous to let a mindless automaton like R try to overrule the human user. But because of this, it’s your responsibility to be careful. Always make sure you type exactly what you mean. When dealing with computers, it’s not enough to type approximately the right thing. In general, you absolutely must be precise in what you say to R: like all machines, it is too stupid to be anything other than absurdly literal in its interpretation.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;comments&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Comments&lt;/h3&gt;
&lt;p&gt;It is useful to put comments in your code to make it more readable. Comments help others, and you, when you go back to your code in the future. R comments start with a hashtag sign &lt;code&gt;#&lt;/code&gt;. Everything after the hashtag to the end of the line is ignored by R. RStudio by default treats every line you write as a command; to turn a line into a comment, place the cursor in the line and hit &lt;code&gt;Ctrl + Shift + C&lt;/code&gt; on Windows or &lt;code&gt;Cmd + Shift + C&lt;/code&gt; on a Mac.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# This line is a comment and will be ignored when run.
city # Text after the hashtag &amp;quot;#&amp;quot; is also ignored.&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Athens&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;r-knows-youre-not-finished&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;R knows you’re not finished&lt;/h3&gt;
&lt;p&gt;If you hit enter in a situation where it’s obvious to R that you haven’t actually finished typing the command, R is just smart enough to keep waiting. For example, if you wanted to calculate &lt;code&gt;15 - 4&lt;/code&gt;, and start by typing &lt;code&gt;15 -&lt;/code&gt; and then press enter by mistake, R is smart enough to realise that you probably wanted to type in another number. So here’s what happens:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt; 15 -
+&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;and there’s a blinking cursor next to the plus &lt;code&gt;+&lt;/code&gt; sign. What this means is that R is still waiting for you to finish. It thinks you’re still typing your command, so it hasn’t tried to execute it yet. In other words, this plus sign is actually another command prompt. It’s different from the usual one (i.e., the &lt;code&gt;&amp;gt;&lt;/code&gt; symbol) to remind you that R is going to add whatever you type now to what you typed last time. For example, if you then go on to type &lt;code&gt;4&lt;/code&gt; and hit enter, here is what you get:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&amp;gt; 15 -
+ 4
[1] 11&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And as far as R is concerned, this is exactly the same as if you had typed &lt;code&gt;15 - 4&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;By the way, if after entering the &lt;code&gt;15 -&lt;/code&gt; you wanted to stop execution and cancel your command, just hit the &lt;strong&gt;escape&lt;/strong&gt; key. R will return you to the normal command prompt (i.e. &lt;code&gt;&amp;gt;&lt;/code&gt;) without attempting to execute the botched command.&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;arithmetic-operations-and-functions&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Arithmetic Operations and Functions&lt;/h3&gt;
&lt;p&gt;R has the basic operators and you can use it as a simple calculator: addition is &lt;code&gt;+&lt;/code&gt;, subtraction is &lt;code&gt;-&lt;/code&gt;, multiplication is &lt;code&gt;*&lt;/code&gt;, division is &lt;code&gt;/&lt;/code&gt;, and &lt;code&gt;^&lt;/code&gt; is the power operator:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;2 + 3 &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;5 - 8&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] -3&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;13 * 21&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 273&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;34 / 55&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 0.6181818&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;(5 * 13)/4 - 7&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 9.25&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# ^ : to the power of
2^3&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 8&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# for exponentiation, you can also use **
2 ** 3&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 8&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# square root
sqrt(25)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 5&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Besides the basic operators, you can use standard mathematical functions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Rounding: &lt;code&gt;round()&lt;/code&gt;, &lt;code&gt;floor()&lt;/code&gt;, &lt;code&gt;ceiling()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Logarithms and exponentials: &lt;code&gt;exp()&lt;/code&gt;, &lt;code&gt;log()&lt;/code&gt;, &lt;code&gt;log10()&lt;/code&gt;, &lt;code&gt;log2()&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# R knows pi = 3.1415926...

# round to 2 decimal places 
round(pi, digits = 2); round(pi,2)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 3.14&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 3.14&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#Round down to nearest integer
floor(pi)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 3&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;#Round up to nearest integer
ceiling(pi)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 4&lt;/code&gt;&lt;/pre&gt;
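&lt;p&gt;The logarithm and exponential functions listed above are called in the same way as the rounding functions. The following is a minimal sketch using only base R; the input values are arbitrary and chosen purely for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# log() is the natural logarithm, so it undoes exp()
log(exp(1))

# base-10 logarithm: which power of 10 gives 1000?
log10(1000)

# base-2 logarithm: which power of 2 gives 8?
log2(8)&lt;/code&gt;&lt;/pre&gt;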
&lt;/div&gt;
&lt;div id=&#34;main-data-types-and-vectors&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Main Data types and Vectors&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;character&lt;/strong&gt;: sometimes referred to as &lt;code&gt;string&lt;/code&gt; data, written inside quotes &lt;code&gt;&amp;lt;chr&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;numeric&lt;/strong&gt;: real or decimal numbers, sometimes referred to as “double” &lt;code&gt;&amp;lt;dbl&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;integer&lt;/strong&gt;: a subset of numeric in which numbers are stored as integers &lt;code&gt;&amp;lt;int&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;factor&lt;/strong&gt;: a categorical variable whose categories are sorted alphabetically by default &lt;code&gt;&amp;lt;fct&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;logical&lt;/strong&gt;: Boolean data (TRUE and FALSE) &lt;code&gt;&amp;lt;lgl&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
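&lt;p&gt;One way to see which of these types R has chosen for a value is the base R function &lt;code&gt;class()&lt;/code&gt;. The values in this short sketch are made up for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;class(&amp;quot;Athens&amp;quot;)                            # character
class(28)                                  # numeric (stored as a double)
class(28L)                                 # integer: the L suffix asks for an integer
class(TRUE)                                # logical
class(factor(c(&amp;quot;low&amp;quot;, &amp;quot;high&amp;quot;)))  # factor&lt;/code&gt;&lt;/pre&gt;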
&lt;/div&gt;
&lt;div id=&#34;your-turn&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/h3&gt;
&lt;!---LEARNR EX S1_ex1--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe_s1_ex1&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s1_ex1_variables/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;vectors&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Vectors&lt;/h2&gt;
&lt;p&gt;A vector is a collection of values of the same type. R has a function, &lt;code&gt;c()&lt;/code&gt;, which we use to &lt;strong&gt;c&lt;/strong&gt;ombine different elements into a vector.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# assign vector
ages &amp;lt;- c(20:30, 35, 50, 42, 72) 

# recall vector
ages&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;##  [1] 20 21 22 23 24 25 26 27 28 29 30 35 50 42 72&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# how many things are in the vector &amp;#39;ages&amp;#39;?
length(ages)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 15&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# what type of object is &amp;#39;ages&amp;#39;?
class(ages)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;numeric&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;R functions can work on a whole vector at once, so we can get the mean or median of &lt;code&gt;ages&lt;/code&gt; by just typing&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# performing functions with vectors
mean(ages)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 31.6&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;median(ages)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 27&lt;/code&gt;&lt;/pre&gt;
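&lt;p&gt;Arithmetic and comparisons are also applied to every element of a vector at once. The sketch below rebuilds the same &lt;code&gt;ages&lt;/code&gt; vector so that it runs on its own; the name &lt;code&gt;ages_in_months&lt;/code&gt; is introduced here purely for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;ages &amp;lt;- c(20:30, 35, 50, 42, 72)

# multiply every element by 12 in one step
ages_in_months &amp;lt;- ages * 12
head(ages_in_months, 3)

# a comparison returns one TRUE/FALSE per element
ages &amp;gt; 30

# TRUE counts as 1, so sum() counts how many ages are above 30
sum(ages &amp;gt; 30)&lt;/code&gt;&lt;/pre&gt;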
&lt;p&gt;We can also have a collection of strings, or characters&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# vector of days of the week 
days &amp;lt;- c(&amp;quot;Monday&amp;quot;, &amp;quot;Tuesday&amp;quot;, &amp;quot;Wednesday&amp;quot;, &amp;quot;Thursday&amp;quot;, &amp;quot;Friday&amp;quot;, &amp;quot;Saturday&amp;quot;, &amp;quot;Sunday&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this case, each word is encased in quotation marks, indicating they are characters rather than object names.&lt;/p&gt;
&lt;p&gt;Please answer the following questions about &lt;code&gt;days&lt;/code&gt;:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;How many values are in &lt;code&gt;days&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;What type of data (&lt;code&gt;class&lt;/code&gt;) is &lt;code&gt;days&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;How can you get an overview of &lt;code&gt;days&lt;/code&gt;?&lt;/li&gt;
&lt;/ol&gt;
&lt;div id=&#34;manipulating-vectors&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Manipulating vectors&lt;/h3&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# add a value to end of vector
ages &amp;lt;- c(ages, 90) &lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# add value at the beginning
ages &amp;lt;- c(30, ages)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# extracting second value
days[2] &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Tuesday&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# excluding (dropping) second value
days[-2] &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Monday&amp;quot;    &amp;quot;Wednesday&amp;quot; &amp;quot;Thursday&amp;quot;  &amp;quot;Friday&amp;quot;    &amp;quot;Saturday&amp;quot;  &amp;quot;Sunday&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# extracting first and third values
days[c(1, 3)] &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Monday&amp;quot;    &amp;quot;Wednesday&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;your-turn-1&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/h3&gt;
&lt;!---LEARNR EX S1_ex2--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe_s1_ex2&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s1_ex2_vectors/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;p&gt;R tends to handle interpreting data types in the background of most operations. Usually it tries to coerce data to fit the general pattern of the data given to it.&lt;/p&gt;
&lt;p&gt;What type of data is each of the following objects? Anything unusual?&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;num_char &amp;lt;- c(1, 2, 3, &amp;quot;a&amp;quot;)
num_logical &amp;lt;- c(1, 2, 3, TRUE)
char_logical &amp;lt;- c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;, &amp;quot;c&amp;quot;, TRUE)
tricky &amp;lt;- c(1, 2, 3, &amp;quot;4&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
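&lt;p&gt;Once you have made your guesses, one way to check them is to ask R with &lt;code&gt;class()&lt;/code&gt;. This sketch repeats the four definitions so that it runs on its own and deliberately leaves the results for you to discover.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;num_char &amp;lt;- c(1, 2, 3, &amp;quot;a&amp;quot;)
num_logical &amp;lt;- c(1, 2, 3, TRUE)
char_logical &amp;lt;- c(&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;, &amp;quot;c&amp;quot;, TRUE)
tricky &amp;lt;- c(1, 2, 3, &amp;quot;4&amp;quot;)

# a vector holds a single type, so R coerces mixed input to one common type
class(num_char)
class(num_logical)
class(char_logical)
class(tricky)

# printing a vector shows what the coerced values look like
num_logical&lt;/code&gt;&lt;/pre&gt;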
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;factors&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Factors&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;days&lt;/code&gt; is a character vector. When we convert it to a factor, R sorts its levels alphabetically by default, so it thinks that Friday should come first. To make &lt;code&gt;days&lt;/code&gt; into a categorical variable, we use&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;days &amp;lt;- factor(days)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can reorder, or relevel, the factor using &lt;code&gt;fct_relevel()&lt;/code&gt; from the tidyverse package &lt;code&gt;forcats&lt;/code&gt;, or using the &lt;code&gt;levels&lt;/code&gt; argument of &lt;code&gt;factor()&lt;/code&gt; from base R.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;days_sorted &amp;lt;- forcats::fct_relevel(days, levels = c(&amp;quot;Monday&amp;quot;,
                                     &amp;quot;Tuesday&amp;quot;,
                                     &amp;quot;Wednesday&amp;quot;,
                                     &amp;quot;Thursday&amp;quot;,
                                     &amp;quot;Friday&amp;quot;,
                                     &amp;quot;Saturday&amp;quot;,
                                     &amp;quot;Sunday&amp;quot;))

days_sorted2 &amp;lt;- factor(days, levels = c(&amp;quot;Monday&amp;quot;,
                                     &amp;quot;Tuesday&amp;quot;,
                                     &amp;quot;Wednesday&amp;quot;,
                                     &amp;quot;Thursday&amp;quot;,
                                     &amp;quot;Friday&amp;quot;,
                                     &amp;quot;Saturday&amp;quot;,
                                     &amp;quot;Sunday&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
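&lt;p&gt;To see the effect of releveling, we can inspect the order of the categories with &lt;code&gt;levels()&lt;/code&gt;. This is a minimal base R sketch; the names &lt;code&gt;week_days&lt;/code&gt;, &lt;code&gt;days_default&lt;/code&gt; and &lt;code&gt;days_ordered&lt;/code&gt; are hypothetical and used only so that the objects created above are not overwritten.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;week_days &amp;lt;- c(&amp;quot;Monday&amp;quot;, &amp;quot;Tuesday&amp;quot;, &amp;quot;Wednesday&amp;quot;, &amp;quot;Thursday&amp;quot;,
               &amp;quot;Friday&amp;quot;, &amp;quot;Saturday&amp;quot;, &amp;quot;Sunday&amp;quot;)

# default: levels are sorted alphabetically, so Friday comes first
days_default &amp;lt;- factor(week_days)
levels(days_default)

# explicit levels: the order we supply is kept, so Monday comes first
days_ordered &amp;lt;- factor(week_days, levels = week_days)
levels(days_ordered)&lt;/code&gt;&lt;/pre&gt;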
&lt;div id=&#34;your-turn-2&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/h3&gt;
&lt;!---LEARNR EX S1_ex3--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe_s1_ex3&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s1_ex3_factors/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;missing-data-or-na&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Missing data, or &lt;strong&gt;&lt;code&gt;NA&lt;/code&gt;&lt;/strong&gt;&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# create a vector with missing data
times &amp;lt;- c(2, 4, 4, NA, 6)&lt;/code&gt;&lt;/pre&gt;
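&lt;p&gt;&lt;code&gt;NA&lt;/code&gt; is not a character string; it is a special value that marks a missing observation, which is why it is written without quotes.&lt;/p&gt;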
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# calculate mean and max on vector with missing data
mean(times)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] NA&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;max(times)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] NA&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# add argument to remove NA
mean(times, na.rm = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 4&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;max(times, na.rm = TRUE)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 6&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# remove incomplete cases
na.omit(times) &lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] 2 4 4 6
## attr(,&amp;quot;na.action&amp;quot;)
## [1] 4
## attr(,&amp;quot;class&amp;quot;)
## [1] &amp;quot;omit&amp;quot;&lt;/code&gt;&lt;/pre&gt;
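&lt;p&gt;Before removing missing values, it often helps to find out where they are and how many there are. The sketch below uses the base R function &lt;code&gt;is.na()&lt;/code&gt;, which is not covered above, and repeats the definition of &lt;code&gt;times&lt;/code&gt; so that it runs on its own.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;times &amp;lt;- c(2, 4, 4, NA, 6)

# TRUE marks the positions that are missing
is.na(times)

# count the missing values
sum(is.na(times))

# keep only the observed values, without the extra attributes of na.omit()
times[!is.na(times)]&lt;/code&gt;&lt;/pre&gt;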
&lt;div id=&#34;your-turn-3&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;&lt;strong&gt;Your turn&lt;/strong&gt;&lt;/h3&gt;
&lt;!---LEARNR EX S1_ex4--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;myIframe_s1_ex4&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s1_ex4_nas/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;online-quiz-variable-types---vectors&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Online Quiz: Variable Types - Vectors&lt;/h2&gt;
&lt;!---LEARNR EX S1_quiz1--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;height: 900px;&#34; id=&#34;myIframe_s1_quiz1&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s1_quiz1_variabletype_vector_nas/&#34; frameborder=&#34;0&#34; scrolling=&#34;yes&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Visualise data</title>
      <link>https://bit-2021.netlify.app/exercise/ggplot-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/ggplot-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#introduction-to-ggplot2-by-visualising-numeric-data.&#34;&gt;Introduction to &lt;code&gt;ggplot2&lt;/code&gt; by visualising numeric data.&lt;/a&gt;&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#scatter-plots-and-multiple-panels-using-facet_wrap&#34;&gt;Scatter plots and multiple panels using &lt;code&gt;facet_wrap()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#animating-changes&#34;&gt;Animating changes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#imdb-movie-ratings-scatterplots-and-relationships&#34;&gt;IMDB movie ratings: Scatterplots and relationships&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#imdb-movie-ratings-boxplots-violin-plots&#34;&gt;IMDB movie ratings: Boxplots, violin plots&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#multiple-panels-using-facet_wrap-and-facet_grid&#34;&gt;Multiple panels using &lt;code&gt;facet_wrap()&lt;/code&gt; and &lt;code&gt;facet_grid()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;introduction-to-ggplot2-by-visualising-numeric-data.&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Introduction to &lt;code&gt;ggplot2&lt;/code&gt; by visualising numeric data.&lt;/h2&gt;
&lt;p&gt;We will start with the &lt;code&gt;gapminder&lt;/code&gt; data set. Let us first look at its contents:&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(gapminder)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 1,704
## Columns: 6
## $ country   &amp;lt;fct&amp;gt; Afghanistan, Afghanistan, Afghanistan, Afghanistan, Afgha...
## $ continent &amp;lt;fct&amp;gt; Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asi...
## $ year      &amp;lt;int&amp;gt; 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 199...
## $ lifeExp   &amp;lt;dbl&amp;gt; 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 4...
## $ pop       &amp;lt;int&amp;gt; 8425333, 9240934, 10267083, 11537966, 13079460, 14880372,...
## $ gdpPercap &amp;lt;dbl&amp;gt; 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.113...&lt;/code&gt;&lt;/pre&gt;
&lt;div id=&#34;scatter-plots-and-multiple-panels-using-facet_wrap&#34; class=&#34;section level3&#34;&gt;
&lt;h3&gt;Scatter plots and multiple panels using &lt;code&gt;facet_wrap()&lt;/code&gt;&lt;/h3&gt;
&lt;!---LEARNR sec2_ex1_dates--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;sec2_ex1&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/sec2_ex1/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
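&lt;p&gt;As a rough sketch of what such a plot can look like, the code below draws life expectancy against GDP per capita and splits the plot into one panel per continent. It assumes that the &lt;code&gt;ggplot2&lt;/code&gt; and &lt;code&gt;gapminder&lt;/code&gt; packages are installed; the choice of variables, the log scale and the object name &lt;code&gt;p&lt;/code&gt; are illustrative and not necessarily what the exercise asks for.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)
library(gapminder)

# one point per country and year
p &amp;lt;- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(alpha = 0.3) +
  scale_x_log10() +          # GDP per capita is heavily skewed
  facet_wrap(~continent) +   # one panel per continent
  theme_minimal()

p&lt;/code&gt;&lt;/pre&gt;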
&lt;/div&gt;
&lt;/div&gt;
&lt;div id=&#34;animating-changes&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Animating changes&lt;/h2&gt;
&lt;p&gt;Racing bars! We will create a simple bar graph showing the evolution of GDP per capita for the top 8 countries.&lt;/p&gt;
&lt;!---LEARNR sec2_ex3--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;sec2_ex3&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/sec2_ex3/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;imdb-movie-ratings-scatterplots-and-relationships&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;IMDB movie ratings: Scatterplots and relationships&lt;/h2&gt;
&lt;p&gt;For this section, we will use a sample of movies released since 2000 with data from IMDB. We have data on movies from the following six genres:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Action&lt;/li&gt;
&lt;li&gt;Adventure&lt;/li&gt;
&lt;li&gt;Comedy&lt;/li&gt;
&lt;li&gt;Drama&lt;/li&gt;
&lt;li&gt;Animation&lt;/li&gt;
&lt;li&gt;Documentary&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;imdb &amp;lt;- read_csv(here::here(&amp;quot;data&amp;quot;, &amp;quot;movies.csv&amp;quot;))
imdb_short &amp;lt;- imdb %&amp;gt;% 
  filter(genre %in% c(&amp;quot;Action&amp;quot;, &amp;quot;Adventure&amp;quot;, &amp;quot;Comedy&amp;quot;, &amp;quot;Drama&amp;quot;, &amp;quot;Animation&amp;quot;, &amp;quot;Documentary&amp;quot;),
         year &amp;gt;= 2000)&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(imdb_short)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 1,762
## Columns: 11
## $ title               &amp;lt;chr&amp;gt; &amp;quot;Avatar&amp;quot;, &amp;quot;Jurassic World&amp;quot;, &amp;quot;The Avengers&amp;quot;, &amp;quot;Th...
## $ genre               &amp;lt;chr&amp;gt; &amp;quot;Action&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Action...
## $ director            &amp;lt;chr&amp;gt; &amp;quot;James Cameron&amp;quot;, &amp;quot;Colin Trevorrow&amp;quot;, &amp;quot;Joss Whedo...
## $ year                &amp;lt;dbl&amp;gt; 2009, 2015, 2012, 2008, 2015, 2012, 2004, 2013,...
## $ duration            &amp;lt;dbl&amp;gt; 178, 124, 173, 152, 141, 164, 93, 146, 151, 103...
## $ gross               &amp;lt;dbl&amp;gt; 760505847, 652177271, 623279547, 533316061, 458...
## $ budget              &amp;lt;dbl&amp;gt; 2.37e+08, 1.50e+08, 2.20e+08, 1.85e+08, 2.50e+0...
## $ cast_facebook_likes &amp;lt;dbl&amp;gt; 4834, 8458, 87697, 57802, 92000, 106759, 1148, ...
## $ votes               &amp;lt;dbl&amp;gt; 886204, 418214, 995415, 1676169, 462669, 114433...
## $ reviews             &amp;lt;dbl&amp;gt; 3777, 1934, 2425, 5312, 1752, 3514, 688, 1208, ...
## $ rating              &amp;lt;dbl&amp;gt; 7.9, 7.0, 8.1, 9.0, 7.5, 8.5, 7.2, 7.6, 7.3, 8....&lt;/code&gt;&lt;/pre&gt;
&lt;!---LEARNR sec2_ex4_5--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;sec2_ex4_5&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/sec2_ex4_5/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;imdb-movie-ratings-boxplots-violin-plots&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;IMDB movie ratings: Boxplots, violin plots&lt;/h2&gt;
&lt;p&gt;Let us consider the &lt;code&gt;rating&lt;/code&gt; that movies received, broken down by &lt;code&gt;genre&lt;/code&gt;. How can we visualise the distribution of ratings within each genre?&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# alpha is a fixed setting, not a data mapping, so it belongs in the geom rather than in aes()
ggplot(imdb_short,
       aes(x = rating, y = genre, fill = genre))+
  geom_boxplot(alpha = 0.2)+
  theme_minimal()+
  theme(legend.position = &amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/exercise/ggplot-exercise_files/figure-html/unnamed-chunk-2-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# alpha is a fixed setting, not a data mapping, so it belongs in the geom rather than in aes()
ggplot(imdb_short,
       aes(x = rating, y = genre, fill = genre))+
  geom_violin(alpha = 0.2)+
  theme_minimal()+
  theme(legend.position = &amp;quot;none&amp;quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/exercise/ggplot-exercise_files/figure-html/unnamed-chunk-2-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div id=&#34;multiple-panels-using-facet_wrap-and-facet_grid&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Multiple panels using &lt;code&gt;facet_wrap()&lt;/code&gt; and &lt;code&gt;facet_grid()&lt;/code&gt;&lt;/h2&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;imdb_short %&amp;gt;% 
  filter(genre %in% c(&amp;quot;Action&amp;quot;, &amp;quot;Comedy&amp;quot;, &amp;quot;Drama&amp;quot;),
         year &amp;gt;= 2010) %&amp;gt;% 
ggplot(aes(x = rating, fill = genre))+
  geom_boxplot(alpha = 0.2)+  # fixed transparency is set in the geom, not mapped in aes()
  theme_minimal()+
  theme(legend.position = &amp;quot;none&amp;quot;)+
  facet_grid(
    rows= vars(year),
    cols= vars(genre)
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/exercise/ggplot-exercise_files/figure-html/unnamed-chunk-3-1.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;imdb_short %&amp;gt;% 
  filter(genre %in% c(&amp;quot;Action&amp;quot;, &amp;quot;Comedy&amp;quot;, &amp;quot;Drama&amp;quot;),
         year &amp;gt;= 2010) %&amp;gt;% 
ggplot(aes(x = rating, fill = genre))+
  geom_boxplot(alpha = 0.2)+  # fixed transparency is set in the geom, not mapped in aes()
  theme_minimal()+
  theme(legend.position = &amp;quot;none&amp;quot;)+
  facet_grid(
    rows= vars(cut(budget, 3)),
    cols= vars(genre)
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/exercise/ggplot-exercise_files/figure-html/unnamed-chunk-3-2.png&#34; width=&#34;672&#34; /&gt;&lt;/p&gt;
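&lt;p&gt;Both plots above use &lt;code&gt;facet_grid()&lt;/code&gt;. As a minimal sketch of the other function named in the heading, the code below applies &lt;code&gt;facet_wrap()&lt;/code&gt; to a small made-up data frame. The name &lt;code&gt;toy_movies&lt;/code&gt; and its values are assumptions for illustration only and are not part of the IMDB sample.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(ggplot2)

# toy_movies is made-up illustration data, not the IMDB sample
toy_movies &amp;lt;- data.frame(
  genre  = rep(c(&amp;quot;Action&amp;quot;, &amp;quot;Comedy&amp;quot;, &amp;quot;Drama&amp;quot;), each = 4),
  rating = c(6.1, 7.0, 7.4, 8.1, 5.5, 6.2, 6.8, 7.1, 6.9, 7.3, 7.8, 8.4)
)

# facet_wrap() draws one panel per level of a single variable;
# ncol = 1 stacks the panels in one column so the ratings share an x-axis
p &amp;lt;- ggplot(toy_movies, aes(x = rating, fill = genre))+
  geom_boxplot(alpha = 0.2)+
  theme_minimal()+
  theme(legend.position = &amp;quot;none&amp;quot;)+
  facet_wrap(vars(genre), ncol = 1)

p&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In general, &lt;code&gt;facet_wrap()&lt;/code&gt; suits a single faceting variable with many levels, whereas &lt;code&gt;facet_grid()&lt;/code&gt; suits two variables crossed as rows and columns.&lt;/p&gt;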
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Hypothesis Testing</title>
      <link>https://bit-2021.netlify.app/exercise/inference_hypothesis-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/inference_hypothesis-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;


&lt;!---LEARNR sampling_mcq--&gt;
&lt;!-- &lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;sampling_mcq&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/sampling_mcq&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;&lt;/iframe&gt; --&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
</description>
    </item>
    
    <item>
      <title>Import, inspect, and clean data</title>
      <link>https://bit-2021.netlify.app/exercise/import-inspect-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/import-inspect-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#inspecting-and-cleaning-data&#34;&gt;Inspecting and Cleaning Data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;inspecting-and-cleaning-data&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Inspecting and Cleaning Data&lt;/h2&gt;
&lt;p&gt;Inspecting data using &lt;code&gt;skimr::skim()&lt;/code&gt;&lt;/p&gt;
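&lt;p&gt;As a minimal sketch, assuming the &lt;code&gt;skimr&lt;/code&gt; package is installed, the code below skims &lt;code&gt;airquality&lt;/code&gt;, a dataset that ships with R. It is used here only as a stand-in because it contains missing values; it is not the dataset used in the exercise.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(skimr)

# airquality is a built-in dataset with some missing values,
# so the summary shows how skim() reports missingness per variable
skim_summary &amp;lt;- skim(airquality)

# one row per variable, with counts of missing values and summary statistics
skim_summary&lt;/code&gt;&lt;/pre&gt;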
&lt;!---LEARNR s1_ex5_skim--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s1_ex5_skim&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s1_ex5_skim/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Manipulate data</title>
      <link>https://bit-2021.netlify.app/exercise/dplyr-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/dplyr-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#select-variables-in-a-dataset-using-select-and-sort-using-arrange&#34;&gt;&lt;strong&gt;Select&lt;/strong&gt; variables in a dataset using &lt;code&gt;select()&lt;/code&gt; and &lt;strong&gt;sort&lt;/strong&gt; using &lt;code&gt;arrange()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#filter-rows-in-a-dataset-using-filter&#34;&gt;&lt;strong&gt;Filter&lt;/strong&gt; rows in a dataset using &lt;code&gt;filter()&lt;/code&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#mutate-to-change-the-data-type-of-a-variable-and-create-new-variables&#34;&gt;&lt;strong&gt;&lt;code&gt;mutate()&lt;/code&gt;&lt;/strong&gt; to change the data type of a variable and create new variables&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#group_bysummarise-to-get-summary-statistics-including-counts-means-etc.-within-categories.&#34;&gt;&lt;strong&gt;&lt;code&gt;group_by()/summarise()&lt;/code&gt;&lt;/strong&gt; to get summary statistics, including counts, means, etc., within categories.&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#putting-it-all-together&#34;&gt;Putting it all together&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;select-variables-in-a-dataset-using-select-and-sort-using-arrange&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Select&lt;/strong&gt; variables in a dataset using &lt;code&gt;select()&lt;/code&gt; and &lt;strong&gt;sort&lt;/strong&gt; using &lt;code&gt;arrange()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;The dataframe &lt;code&gt;movies&lt;/code&gt; has been loaded into memory. It contains a sample of movies from IMDB, and its contents are shown below:&lt;/p&gt;
&lt;!---LEARNR s3_ex12_pipe_select--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s3_ex12_pipe_select&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s3_ex12_pipe_select&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;filter-rows-in-a-dataset-using-filter&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;Filter&lt;/strong&gt; rows in a dataset using &lt;code&gt;filter()&lt;/code&gt;&lt;/h2&gt;
&lt;p&gt;Remember that &lt;code&gt;select()&lt;/code&gt; allows us to choose columns, or variables, whereas &lt;code&gt;filter()&lt;/code&gt; chooses rows, or cases, that conform to certain criteria.&lt;/p&gt;
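&lt;p&gt;The difference can be sketched on a small made-up data frame. The name &lt;code&gt;toy_movies&lt;/code&gt; and its values are assumptions for illustration and are not the &lt;code&gt;movies&lt;/code&gt; dataframe used in the exercise.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# toy_movies is made-up illustration data
toy_movies &amp;lt;- tibble(
  title  = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;, &amp;quot;D&amp;quot;),
  genre  = c(&amp;quot;Action&amp;quot;, &amp;quot;Comedy&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Drama&amp;quot;),
  rating = c(7.9, 6.4, 8.1, 7.2)
)

# select() keeps columns; the number of rows is unchanged
titles_and_ratings &amp;lt;- toy_movies %&amp;gt;% 
  select(title, rating)

# filter() keeps rows that satisfy every condition; the columns are unchanged
good_action &amp;lt;- toy_movies %&amp;gt;% 
  filter(genre == &amp;quot;Action&amp;quot;, rating &amp;gt;= 8)&lt;/code&gt;&lt;/pre&gt;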
&lt;!---LEARNR s3_ex3_filter--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s3_ex3_filter&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s3_ex3_filter&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;mutate-to-change-the-data-type-of-a-variable-and-create-new-variables&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;&lt;code&gt;mutate()&lt;/code&gt;&lt;/strong&gt; to change the data type of a variable and create new variables&lt;/h2&gt;
&lt;!---LEARNR s3_ex5_mutates--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s3_ex5_mutates&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s3_ex5_mutates&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;group_bysummarise-to-get-summary-statistics-including-counts-means-etc.-within-categories.&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;&lt;code&gt;group_by()/summarise()&lt;/code&gt;&lt;/strong&gt; to get summary statistics, including counts, means, etc., within categories.&lt;/h2&gt;
&lt;!---LEARNR s3_ex8_summarise--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s3_ex8_summarise&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s3_ex8_summarise&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;putting-it-all-together&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;Putting it all together&lt;/h2&gt;
&lt;p&gt;You can put together all of your &lt;code&gt;dplyr&lt;/code&gt; knowledge to work with four genres of movies, namely action, adventure, comedy, and drama, and create the following plot.&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/movies_to_watch.png&#34; width=&#34;100%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;For these genres, you have to&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Make sure you do not have multiple entries of the same movie; use &lt;code&gt;distinct(movie, .keep_all = TRUE)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Calculate a normalised metric for rating, where you adjust the movie’s rating by the number of votes it received out of the total votes in its genre, &lt;code&gt;normalised_rating = rating * (votes / total votes in genre)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Arrange movies, so higher &lt;code&gt;normalised_rating&lt;/code&gt; appears first.&lt;/li&gt;
&lt;li&gt;Categorise and colour movies according to their budget &lt;code&gt;cost&lt;/code&gt;
&lt;ul&gt;
&lt;li&gt;cheap (&amp;lt;20m, or &amp;lt; &lt;code&gt;20e6&lt;/code&gt;, as &lt;code&gt;e6&lt;/code&gt; is R shorthand for 1 million, or &lt;span class=&#34;math inline&#34;&gt;\(10^6\)&lt;/span&gt;),&lt;/li&gt;
&lt;li&gt;moderate (20-120m), and&lt;/li&gt;
&lt;li&gt;expensive (&amp;gt;120m)&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Convert &lt;code&gt;cost&lt;/code&gt; column to a factor variable and re-level in the correct order (cheap, moderate, expensive)&lt;/li&gt;
&lt;li&gt;Change the labels on the x- and y-axes, and give appropriate titles, subtitles, etc.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;theme_minimal()&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Some tips:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;To sort columns within a ggplot, use &lt;code&gt;reorder()&lt;/code&gt; in the &lt;code&gt;x&lt;/code&gt; argument.&lt;/li&gt;
&lt;li&gt;If you apply dplyr verbs to the original dataframe, assign the result back to a dataframe; otherwise the changes are computed on the fly and not saved.&lt;/li&gt;
&lt;li&gt;Consider freeing the scales in &lt;code&gt;facet_wrap()&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
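&lt;p&gt;The data-preparation steps listed above can be sketched on a small made-up data frame. Everything here is an assumption for illustration: the name &lt;code&gt;toy_movies&lt;/code&gt;, its values, and the use of &lt;code&gt;title&lt;/code&gt; as the column that identifies a movie. The plotting steps are left out.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# toy_movies is made-up illustration data; the second row duplicates the first
toy_movies &amp;lt;- tibble(
  title  = c(&amp;quot;A&amp;quot;, &amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;),
  genre  = c(&amp;quot;Action&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Action&amp;quot;, &amp;quot;Drama&amp;quot;),
  rating = c(8.0, 8.0, 6.0, 7.0),
  votes  = c(300, 300, 100, 50),
  budget = c(150e6, 150e6, 10e6, 40e6)
)

toy_ranked &amp;lt;- toy_movies %&amp;gt;% 
  # keep one row per movie, retaining all other columns
  distinct(title, .keep_all = TRUE) %&amp;gt;% 
  # weight each rating by the share of votes within its genre
  group_by(genre) %&amp;gt;% 
  mutate(normalised_rating = rating * votes / sum(votes)) %&amp;gt;% 
  ungroup() %&amp;gt;% 
  # classify the budget, then turn the labels into an ordered set of levels
  mutate(
    cost = case_when(
      budget &amp;lt; 20e6   ~ &amp;quot;cheap&amp;quot;,
      budget &amp;lt;= 120e6 ~ &amp;quot;moderate&amp;quot;,
      TRUE            ~ &amp;quot;expensive&amp;quot;
    ),
    cost = factor(cost, levels = c(&amp;quot;cheap&amp;quot;, &amp;quot;moderate&amp;quot;, &amp;quot;expensive&amp;quot;))
  ) %&amp;gt;% 
  # highest normalised rating first
  arrange(desc(normalised_rating))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Assigning the result to &lt;code&gt;toy_ranked&lt;/code&gt; is what keeps the changes; without the assignment the pipeline would only print its result.&lt;/p&gt;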
&lt;!---LEARNR s3_ex9_complete--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s3_ex9_complete&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s3_ex9_complete&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Model Fitting</title>
      <link>https://bit-2021.netlify.app/exercise/modelling_fit-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/modelling_fit-exercise/</guid>
      <description>



</description>
    </item>
    
    <item>
      <title>Questions on Descriptive and Inferential Statistics</title>
      <link>https://bit-2021.netlify.app/exercise/thought_questions-exercise/</link>
      <pubDate>Wed, 26 Aug 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/thought_questions-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;


&lt;!-- ## Thought Questions for Descriptive and Inferential Statistics --&gt;
&lt;p&gt;The questions listed below are designed for discussion and preparation. When reviewing these questions, try to illustrate your points with specific examples/cases from what we have seen in class.&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Why do we make distinctions between &lt;strong&gt;samples&lt;/strong&gt; and &lt;strong&gt;populations&lt;/strong&gt; in statistics?&lt;/li&gt;
&lt;li&gt;Discuss the use of exploratory data analysis. Illustrate with an example.&lt;/li&gt;
&lt;li&gt;Why do we care about the distribution of our data?&lt;/li&gt;
&lt;li&gt;What are outliers? What impact do they have on how you describe your data?&lt;/li&gt;
&lt;li&gt;What is a robust statistic? Why would you choose to use it?&lt;/li&gt;
&lt;li&gt;What is variability? Discuss measures of variability. What are their strengths and weaknesses? In an operations setting, what is the relation of variability to the quality of the product/service?&lt;/li&gt;
&lt;li&gt;Why do we standardise data? What does a Z-score tell you, i.e., how do you interpret one? How do you convert a raw score to a Z-score?&lt;/li&gt;
&lt;li&gt;What is the Normal distribution? What is the Standard Normal distribution?&lt;/li&gt;
&lt;li&gt;How do you use the Normal distribution table to find the percentage of the population that is expected to fall between two points or beyond or below one point in the distribution?&lt;/li&gt;
&lt;li&gt;What does the Central Limit Theorem tell us?&lt;/li&gt;
&lt;li&gt;Discuss the differences between descriptive and inferential statistics. Is one better than the other? Are they competitive or complementary? Illustrate the kind of situation in which each approach is appropriate.&lt;/li&gt;
&lt;li&gt;What are the steps involved in hypothesis testing?&lt;/li&gt;
&lt;li&gt;What does specifying the null hypothesis mean? What about the alternative hypothesis? What is the benefit of being so specific about the hypotheses?&lt;/li&gt;
&lt;li&gt;With inferential statistics, the goal is to reject the null hypothesis. What does this mean? Do we conclude that the alternative hypothesis is correct? Why or why not?&lt;/li&gt;
&lt;li&gt;Why is the standard error of the mean, based on many samples, going to be smaller than the standard deviation of a single sample? In explaining your answer, be sure to describe the interpretation of a standard error of the mean.&lt;/li&gt;
&lt;li&gt;What types of error can occur when making decisions based on a hypothesis test? Be specific.&lt;/li&gt;
&lt;li&gt;Why do some researchers consider observations more than 3 standard deviations above or below the mean to be outliers?&lt;/li&gt;
&lt;li&gt;What does it mean if a researcher sets her significance level at &lt;strong&gt;&lt;span class=&#34;math inline&#34;&gt;\(\alpha = 0.01\)&lt;/span&gt;&lt;/strong&gt; and rejects the null hypothesis? How does this differ from setting &lt;strong&gt;&lt;span class=&#34;math inline&#34;&gt;\(\alpha = 0.05\)&lt;/span&gt;&lt;/strong&gt; and rejecting the null? In which case is the researcher more likely to reject the null hypothesis?&lt;/li&gt;
&lt;li&gt;What is the difference between a one-tailed (directional) and a two-tailed (non-directional) test? When would you use each of them?&lt;/li&gt;
&lt;li&gt;What is meant when a researcher says that a finding is &lt;strong&gt;statistically significant?&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;!---LEARNR sampling_mcq--&gt;
&lt;!-- &lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;sampling_mcq&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/sampling_mcq&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;&lt;/iframe&gt; --&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
</description>
    </item>
    
    <item>
      <title>Handling dates/times</title>
      <link>https://bit-2021.netlify.app/exercise/lubridate-exercise/</link>
      <pubDate>Mon, 27 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/lubridate-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#lubridate-to-convert-strings-to-date-objects&#34;&gt;&lt;code&gt;lubridate&lt;/code&gt; to convert strings to &lt;code&gt;Date&lt;/code&gt; objects&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#on-your-own&#34;&gt;&lt;strong&gt;On your own&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;lubridate&lt;/code&gt; package is one of the most useful packages to handle dates and times in R. Your data may contain dates as a string of characters (class &lt;code&gt;&amp;lt;chr&amp;gt;&lt;/code&gt;), and you need to convert them to date objects before you can do any kind of analysis.&lt;/p&gt;
&lt;div id=&#34;lubridate-to-convert-strings-to-date-objects&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;lubridate&lt;/code&gt; to convert strings to &lt;code&gt;Date&lt;/code&gt; objects&lt;/h2&gt;
&lt;p&gt;Let us look at an example: We will define Christmas as a string in various formats and will then try to convert it to a &lt;code&gt;Date&lt;/code&gt; object so we can manipulate it.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(lubridate)

today &amp;lt;-  Sys.Date()
today&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;2020-08-25&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;class(today)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Date&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;date1 &amp;lt;-  &amp;quot;25-12-2020&amp;quot;
date2 &amp;lt;-  &amp;quot;12-25-2020&amp;quot;
date3 &amp;lt;-  &amp;quot;2020-12-25&amp;quot;
date4 &amp;lt;-  &amp;quot;Dec 25, 2020&amp;quot;

class(date1)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;character&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;class(date4)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;character&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# dmy: day-month-year
xmas1 &amp;lt;- lubridate::dmy(date1)
class(xmas1)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Date&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# mdy: month-day-year
xmas2 &amp;lt;- lubridate::mdy(date2)
class(xmas2)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Date&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# ymd: year-month-day, ISO 8601 standard
# https://en.wikipedia.org/wiki/ISO_8601
xmas3 &amp;lt;- ymd(date3)
class(xmas3)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Date&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# mdy: month-day-year
xmas4 &amp;lt;- lubridate::mdy(date4)
class(xmas4)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## [1] &amp;quot;Date&amp;quot;&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# once we have it as a Date object, we can do calculations...
xmas1 - today&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Time difference of 122 days&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;# ... but these calculations will not work if the date is a string (character) 
date1 - today&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Error in `-.Date`(date1, today): can only subtract from &amp;quot;Date&amp;quot; objects&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;div id=&#34;on-your-own&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;strong&gt;On your own&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;We’ll use data from &lt;a href=&#34;http://www.tfl.gov.uk&#34; class=&#34;uri&#34;&gt;http://www.tfl.gov.uk&lt;/a&gt; to analyse usage of the London Bike Sharing scheme. This data has already been downloaded for you and is stored, along with weather information, in a &lt;code&gt;CSV&lt;/code&gt; (comma-separated values) file.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;bike &amp;lt;- read_csv(here::here(&amp;quot;data&amp;quot;, &amp;quot;londonBikes.csv&amp;quot;))&lt;/code&gt;&lt;/pre&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;glimpse(bike)&lt;/code&gt;&lt;/pre&gt;
&lt;pre&gt;&lt;code&gt;## Rows: 3,439
## Columns: 14
## $ date          &amp;lt;chr&amp;gt; &amp;quot;01-01-11&amp;quot;, &amp;quot;02-01-11&amp;quot;, &amp;quot;03-01-11&amp;quot;, &amp;quot;04-01-11&amp;quot;, &amp;quot;05-0...
## $ bikes_hired   &amp;lt;dbl&amp;gt; 4555, 6250, 7262, 13430, 13757, 9595, 9294, 9338, 105...
## $ season        &amp;lt;dbl&amp;gt; 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
## $ max_temp      &amp;lt;dbl&amp;gt; 7.2, 4.0, 2.9, NA, 7.1, NA, 10.8, 10.4, 7.2, 8.9, 8.3...
## $ min_temp      &amp;lt;dbl&amp;gt; NA, NA, NA, 0.3, 3.8, NA, 1.0, NA, NA, -1.9, NA, 3.7,...
## $ avg_temp      &amp;lt;dbl&amp;gt; 5.6, 2.9, 1.4, 2.7, 5.6, 4.1, 6.1, 6.9, 3.1, 4.3, 5.8...
## $ avg_humidity  &amp;lt;dbl&amp;gt; 84, 79, 80, 87, 84, 92, 92, 82, 79, 87, 82, 89, 89, 8...
## $ avg_pressure  &amp;lt;dbl&amp;gt; 1025, 1028, 1024, 1013, 1000, 996, 999, 997, 1012, 10...
## $ avg_windspeed &amp;lt;dbl&amp;gt; 10, 8, 6, 6, 19, 5, 11, 23, 16, 14, 16, 16, 23, 24, 2...
## $ rainfall_mm   &amp;lt;dbl&amp;gt; 0.0, 0.5, 0.0, 0.0, 0.0, 0.5, 11.4, 13.0, 1.0, 0.0, 7...
## $ rain          &amp;lt;lgl&amp;gt; TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, FAL...
## $ fog           &amp;lt;lgl&amp;gt; FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE,...
## $ thunderstorm  &amp;lt;lgl&amp;gt; FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS...
## $ snow          &amp;lt;lgl&amp;gt; FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE,...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;date&lt;/code&gt; is a character string, and is given as 01-01-11, 02-01-11, 03-01-11, meaning the 1st, 2nd, and 3rd of January 2011, etc. In other words, the format of the string is &lt;code&gt;dmy&lt;/code&gt;, or day-month-year.&lt;/p&gt;
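&lt;p&gt;As a minimal sketch of what that conversion looks like, the code below parses a short character vector with &lt;code&gt;dmy()&lt;/code&gt;. The name &lt;code&gt;toy_dates&lt;/code&gt; is an assumption for illustration; its values copy the first three entries of the &lt;code&gt;date&lt;/code&gt; column shown above.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(lubridate)

# toy_dates copies the first three values of the date column shown above
toy_dates &amp;lt;- c(&amp;quot;01-01-11&amp;quot;, &amp;quot;02-01-11&amp;quot;, &amp;quot;03-01-11&amp;quot;)

# dmy() reads each string as day-month-year and returns Date objects
parsed &amp;lt;- dmy(toy_dates)
class(parsed)

# once parsed, individual components can be extracted
year(parsed)
month(parsed)
day(parsed)&lt;/code&gt;&lt;/pre&gt;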
&lt;!---LEARNR s1_ex6_dates--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s1_ex6_dates&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s1_ex6_dates/&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Model diagnostics</title>
      <link>https://bit-2021.netlify.app/exercise/modelling_diagnostics-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/modelling_diagnostics-exercise/</guid>
      <description>



</description>
    </item>
    
    <item>
      <title>Reshape data</title>
      <link>https://bit-2021.netlify.app/exercise/reshape-exercise/</link>
      <pubDate>Sat, 25 Jul 2020 00:00:00 +0000</pubDate>
      <guid>https://bit-2021.netlify.app/exercise/reshape-exercise/</guid>
      <description>
&lt;script src=&#34;https://cdnjs.cloudflare.com/ajax/libs/iframe-resizer/3.5.16/iframeResizer.min.js&#34; type=&#34;text/javascript&#34;&gt;&lt;/script&gt;

&lt;div id=&#34;TOC&#34;&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#pivot_longer---pivot_wider-to-make-a-wide-dataset-long-and-vice-versa&#34;&gt;&lt;code&gt;pivot_longer()&lt;/code&gt; - &lt;code&gt;pivot_wider()&lt;/code&gt; to make a &lt;em&gt;wide&lt;/em&gt; dataset &lt;em&gt;long&lt;/em&gt; and vice-versa&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#left_join-to-join-tables-on-columns&#34;&gt;&lt;code&gt;left_join()&lt;/code&gt; to join tables on columns&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#bind_rows-to-combine-rows-from-two-or-more-datasets&#34;&gt;&lt;code&gt;bind_rows()&lt;/code&gt; to combine rows from two or more datasets&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div id=&#34;pivot_longer---pivot_wider-to-make-a-wide-dataset-long-and-vice-versa&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;pivot_longer()&lt;/code&gt; - &lt;code&gt;pivot_wider()&lt;/code&gt; to make a &lt;em&gt;wide&lt;/em&gt; dataset &lt;em&gt;long&lt;/em&gt; and vice-versa&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;https://data.un.org/&#34;&gt;United Nations have all sorts of data&lt;/a&gt;. For this example, we will work with data on tourist/visitor arrivals and tourism expenditure. The dataframe &lt;code&gt;un_tourism_data&lt;/code&gt; has been loaded into memory, and contains data on tourist arrivals (in thousands) and tourism expenditure (in millions of US$). We would like to calculate spending per tourist and see how some of the top tourist destinations compare.&lt;/p&gt;
&lt;p&gt;You have to:&lt;/p&gt;
&lt;ol style=&#34;list-style-type: decimal&#34;&gt;
&lt;li&gt;Convert &lt;code&gt;un_tourism_data&lt;/code&gt; from long to wide format; you need to do this to create the new variable &lt;code&gt;spending_per_tourist&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Clean the column names&lt;/li&gt;
&lt;li&gt;Rename the columns to &lt;code&gt;tourism_expenditure&lt;/code&gt; and &lt;code&gt;tourist_arrivals&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Remove rows where tourism expenditure or arrivals are &lt;code&gt;NA&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Create a new column where you calculate spending per tourist (remember that expenditure is in millions and arrivals are in thousands)&lt;/li&gt;
&lt;/ol&gt;
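&lt;p&gt;The reshaping and unit conversion in the steps above can be sketched on a small made-up data frame. The name &lt;code&gt;toy_un_long&lt;/code&gt;, its column names, and its values are assumptions for illustration and do not reflect the actual layout of &lt;code&gt;un_tourism_data&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)

# toy_un_long is made-up illustration data in long format:
# one row per country and series
toy_un_long &amp;lt;- tibble(
  country = c(&amp;quot;X&amp;quot;, &amp;quot;X&amp;quot;, &amp;quot;Y&amp;quot;, &amp;quot;Y&amp;quot;),
  series  = c(&amp;quot;tourism_expenditure&amp;quot;, &amp;quot;tourist_arrivals&amp;quot;,
              &amp;quot;tourism_expenditure&amp;quot;, &amp;quot;tourist_arrivals&amp;quot;),
  value   = c(5000, 2000, 900, NA)
)

toy_un_wide &amp;lt;- toy_un_long %&amp;gt;% 
  # one column per series, so both values for a country sit on the same row
  pivot_wider(names_from = series, values_from = value) %&amp;gt;% 
  # drop countries where either value is missing
  filter(!is.na(tourism_expenditure), !is.na(tourist_arrivals)) %&amp;gt;% 
  # expenditure is in millions of US$, arrivals are in thousands
  mutate(spending_per_tourist = (tourism_expenditure * 1e6) / (tourist_arrivals * 1e3))&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The wide format is needed because &lt;code&gt;mutate()&lt;/code&gt; works across columns within a row; in the long format the two quantities sit in different rows.&lt;/p&gt;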
&lt;!---LEARNR s4_ex1_pivotwide--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s4_ex1_pivotwide&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s4_ex1_pivotwide&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;p&gt;You have successfully calculated spending per tourist. We are now faced with the challenge of producing a plot that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&#34;https://bit-2021.netlify.app/img/tourist_arrivals_spending.png&#34; width=&#34;100%&#34; /&gt;&lt;/p&gt;
&lt;p&gt;The best way to get this plot is to first reshape the dataframe from wide to long, and then apply your ggplot skills.&lt;/p&gt;
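&lt;p&gt;As a minimal sketch of the wide-to-long step, the code below reshapes a small made-up data frame. The names &lt;code&gt;toy_tourism_wide&lt;/code&gt;, &lt;code&gt;measure&lt;/code&gt;, and &lt;code&gt;value&lt;/code&gt; are assumptions for illustration.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)
library(tidyr)

# toy_tourism_wide is made-up illustration data: one row per country
toy_tourism_wide &amp;lt;- tibble(
  country             = c(&amp;quot;X&amp;quot;, &amp;quot;Y&amp;quot;),
  tourism_expenditure = c(5000, 900),
  tourist_arrivals    = c(2000, 450)
)

# stack the two measurement columns into name/value pairs,
# giving one row per country and measure
toy_tourism_long &amp;lt;- toy_tourism_wide %&amp;gt;% 
  pivot_longer(
    cols      = c(tourism_expenditure, tourist_arrivals),
    names_to  = &amp;quot;measure&amp;quot;,
    values_to = &amp;quot;value&amp;quot;
  )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With the data in this shape, a single column such as &lt;code&gt;measure&lt;/code&gt; can be passed to &lt;code&gt;facet_wrap()&lt;/code&gt; to draw one panel per measure.&lt;/p&gt;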
&lt;!---LEARNR s4_ex2_pivotlong--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s4_ex2_pivotlong&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s4_ex2_pivotlong&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;left_join-to-join-tables-on-columns&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;left_join()&lt;/code&gt; to join tables on columns&lt;/h2&gt;
&lt;p&gt;We have loaded two dataframes into memory, &lt;code&gt;countries&lt;/code&gt; and &lt;code&gt;matches&lt;/code&gt;, which together describe matches played in various European football (soccer) leagues over a number of years. We want to join the two dataframes so that we can see the name of each league rather than its ID. We also want to calculate the average number of goals per game in each league and plot those averages for all seasons.&lt;/p&gt;
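&lt;p&gt;The join and the per-league average can be sketched on two small made-up data frames. All names and values below (&lt;code&gt;toy_countries&lt;/code&gt;, &lt;code&gt;toy_matches&lt;/code&gt;, &lt;code&gt;country_id&lt;/code&gt;, &lt;code&gt;home_goals&lt;/code&gt;, &lt;code&gt;away_goals&lt;/code&gt;) are assumptions for illustration; the real dataframes may use different column names.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# both data frames are made-up illustration data
toy_countries &amp;lt;- tibble(
  country_id = c(1, 2),
  name       = c(&amp;quot;England&amp;quot;, &amp;quot;Italy&amp;quot;)
)

toy_matches &amp;lt;- tibble(
  country_id = c(1, 1, 2),
  season     = c(&amp;quot;2015/2016&amp;quot;, &amp;quot;2015/2016&amp;quot;, &amp;quot;2015/2016&amp;quot;),
  home_goals = c(2, 0, 1),
  away_goals = c(1, 0, 3)
)

goals_by_league &amp;lt;- toy_matches %&amp;gt;% 
  # keep every match and attach the name that shares its country_id
  left_join(toy_countries, by = &amp;quot;country_id&amp;quot;) %&amp;gt;% 
  group_by(name, season) %&amp;gt;% 
  # average total goals per game within each league and season
  summarise(avg_goals = mean(home_goals + away_goals), .groups = &amp;quot;drop&amp;quot;)&lt;/code&gt;&lt;/pre&gt;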
&lt;!---LEARNR s4_ex3_joins--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s4_ex3_joins&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s4_ex3_joins&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;/div&gt;
&lt;div id=&#34;bind_rows-to-combine-rows-from-two-or-more-datasets&#34; class=&#34;section level2&#34;&gt;
&lt;h2&gt;&lt;code&gt;bind_rows()&lt;/code&gt; to combine rows from two or more datasets&lt;/h2&gt;
&lt;p&gt;We have three distinct dataframes, &lt;code&gt;england_matches&lt;/code&gt;, &lt;code&gt;germany_matches&lt;/code&gt;, and &lt;code&gt;italy_matches&lt;/code&gt;, that contain match data for each country. We need to combine these three datasets into one, and sort it in ascending order by date.&lt;/p&gt;
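&lt;p&gt;As a minimal sketch, the code below stacks three small made-up data frames and sorts the result. The names &lt;code&gt;toy_england&lt;/code&gt;, &lt;code&gt;toy_germany&lt;/code&gt;, and &lt;code&gt;toy_italy&lt;/code&gt; and their values are assumptions for illustration, as is the column name &lt;code&gt;date&lt;/code&gt;.&lt;/p&gt;
&lt;pre class=&#34;r&#34;&gt;&lt;code&gt;library(dplyr)

# three made-up data frames that share the same columns
toy_england &amp;lt;- tibble(date = as.Date(c(&amp;quot;2016-01-10&amp;quot;, &amp;quot;2016-03-05&amp;quot;)), country = &amp;quot;England&amp;quot;)
toy_germany &amp;lt;- tibble(date = as.Date(&amp;quot;2016-02-01&amp;quot;), country = &amp;quot;Germany&amp;quot;)
toy_italy   &amp;lt;- tibble(date = as.Date(&amp;quot;2015-12-20&amp;quot;), country = &amp;quot;Italy&amp;quot;)

# bind_rows() stacks the rows, matching columns by name;
# arrange() then sorts by date, earliest first
all_matches &amp;lt;- bind_rows(toy_england, toy_germany, toy_italy) %&amp;gt;% 
  arrange(date)&lt;/code&gt;&lt;/pre&gt;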
&lt;!---LEARNR s4_ex4_bindrows--&gt;
&lt;iframe style=&#34;margin:0 auto; min-width: 100%;&#34; id=&#34;s4_ex4_bindrows&#34; class=&#34;interactive&#34; src=&#34;https://kchristodoulou.shinyapps.io/s4_ex4_bindrows&#34; scrolling=&#34;no&#34; frameborder=&#34;no&#34;&gt;
&lt;/iframe&gt;
&lt;!----------------&gt;
&lt;script&gt;
  iFrameResize({}, &#34;.interactive&#34;);
&lt;/script&gt;
&lt;/div&gt;
</description>
    </item>
    
  </channel>
</rss>
