Bio300B Lecture 2
Institutt for biovitenskap, UiB
25 August 2025
Why not just use Excel for everything?
Assign object to a name
Forgetting to assign is a very common error
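A minimal sketch of the difference, using base R's `sqrt()`; the name `root` is made up for illustration.

```r
# Without assignment the result is printed and then lost
sqrt(16)

# With assignment the result is stored under the name `root`
root <- sqrt(16)
root
```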
Function name followed by brackets
Arguments separated by comma
Don’t include an argument - uses default
Don’t need to name arguments if in correct order
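A small illustration of these rules with base R's `round()`, whose `digits` argument defaults to 0. The variable names are hypothetical.

```r
# `digits` left out, so the default (0) is used
rounded_default <- round(3.14159)

# Argument given by name
rounded_named <- round(3.14159, digits = 2)

# Argument given by position; same result as naming it
rounded_position <- round(3.14159, 2)
```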
All elements must be the same type
Atomic vectors
Automatic coercion
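A short sketch of coercion when types are mixed in `c()`. R converts to the most flexible type present (logical, then integer, then double, then character). The values are made up.

```r
# logical mixed with numeric becomes numeric
mixed_num <- c(2, TRUE, FALSE)

# numeric mixed with character becomes character
mixed_chr <- c(1, "a")

# integer mixed with double becomes double
mixed_dbl <- c(1L, 2.5)

class(mixed_num)
class(mixed_chr)
typeof(mixed_dbl)
```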
Predict the outcome of
[1] 1 0
Extract from
2 dimensional
All elements same type
Arrays can have 3+ dimensions
[row_indices, column_indices]
[1] 4 5
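A minimal sketch of `[row_indices, column_indices]` on a made-up 3 x 3 matrix. Leaving an index empty selects every row or column.

```r
# matrix() fills column by column
m <- matrix(1:9, nrow = 3)

single_cell <- m[2, 3]   # row 2, column 3
first_row <- m[1, ]      # all columns of row 1
second_col <- m[, 2]     # all rows of column 2
dim(m)                   # number of rows and columns
```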
Each element of a list can be a different type
Can make a smaller list, or extract contents of a carriage
Extract vector “a”
From the following extract
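Separately from the exercise, a sketch of the two kinds of list subsetting on a hypothetical list: `[` returns a smaller list, while `[[` and `$` return the contents.

```r
# Elements of different types (made-up example data)
survey <- list(site = "Bergen", counts = c(4, 8, 15), complete = TRUE)

smaller_list <- survey["counts"]    # still a list, of length 1
contents <- survey[["counts"]]      # the numeric vector itself
same_contents <- survey$counts      # shorthand for [[ with a name
```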
rectangular data structure - 2 dimensions
columns can hold different types of object
special type of list where all vectors have the same length
Tibbles are a better-behaved version of data.frame
Data.frames have row and column names
With square brackets
With column names
Which method is safer?
Can also use the dplyr package.
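A base R sketch of the extraction methods on a made-up data frame. Extracting by position depends on the column order, whereas extracting by name does not.

```r
# Hypothetical example data
penguins_small <- data.frame(id = 1:3, mass_g = c(3750, 3800, 3250))

by_position <- penguins_small[, 2]       # second column, whatever it is
by_name <- penguins_small$mass_g         # column called mass_g
by_name_2 <- penguins_small[["mass_g"]]  # same, using a string
one_col_df <- penguins_small["mass_g"]   # stays a data frame
```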
Extract the from
Return TRUE or FALSE
a == b: TRUE if a equals b (test of equality - single = is assignment)
a != b: TRUE if a not equal to b
a > b: TRUE if a greater than b
a <= b: TRUE if a less than or equal to b
Useful with subsetting, ifelse() or dplyr::case_when()
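A minimal sketch of a comparison used for subsetting and with `ifelse()`. The measurements and the 45 mm threshold are made up.

```r
bill_length_mm <- c(39.1, 46.5, 50.0, 36.7)

# Comparison is vectorised: one TRUE/FALSE per element
is_long <- bill_length_mm > 45

# Logical vector used to subset
long_bills <- bill_length_mm[is_long]

# ifelse() picks a value per element
size_class <- ifelse(is_long, "long", "short")
```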
logical conditions can be combined
&: AND - TRUE if both TRUE
|: OR - TRUE if either TRUE
!: NOT - TRUE if FALSE
if statements for choice
else is optional
Use && and || to return a single TRUE/FALSE
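A sketch contrasting the element-wise operators with `&&` inside an `if` statement. The weather values and labels are hypothetical.

```r
temperature <- c(-2, 5, 12)
raining <- c(TRUE, FALSE, TRUE)

warm_and_wet <- temperature > 0 & raining  # element-wise AND
warm_or_wet <- temperature > 0 | raining   # element-wise OR
dry <- !raining                            # element-wise NOT

# if() needs a single TRUE/FALSE, so use && on single values
if (temperature[1] < 0 && raining[1]) {
  first_day <- "icy"
} else {
  first_day <- "not icy"
}
```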
Often don’t need an explicit loop - R is vectorised
for loops
for loops iterate over elements of a vector
for pitfalls
Need to pre-allocate space or the loop is slow
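A minimal sketch of a loop with pre-allocated output, next to the vectorised equivalent. The task (squaring 1 to 5) is made up for illustration.

```r
n <- 5

# Pre-allocate the result instead of growing it inside the loop
squares <- numeric(n)
for (i in seq_len(n)) {
  squares[i] <- i^2
}

# The same result without a loop, because ^ is vectorised
squares_vectorised <- seq_len(n)^2
```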
Rarely need a loop - purrr::map(), apply() generally cleaner
map()
$a
[1] 2
$b
[1] 5.5
$c
[1] 7
a b c
2.0 5.5 7.0
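A sketch of how output shaped like this could arise, using base R so that no package is needed. The input list is an assumption chosen so that the means are 2, 5.5 and 7; it is not the original slide code. `lapply()` plays the role of `purrr::map()` and `vapply()` the role of `purrr::map_dbl()`.

```r
# Hypothetical input: a named list of numeric vectors
x <- list(a = 1:3, b = 1:10, c = 5:9)

# Returns a list with one element per input, like purrr::map(x, mean)
means_list <- lapply(x, mean)

# Returns a named numeric vector, like purrr::map_dbl(x, mean)
means_vector <- vapply(x, mean, numeric(1))
```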
apply() for iterating over rows/columns of a matrix
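A small sketch of `apply()` on a made-up matrix. The second argument is the margin: 1 for rows, 2 for columns.

```r
m <- matrix(1:6, nrow = 2)  # rows are 1 3 5 and 2 4 6

row_sums <- apply(m, 1, sum)  # one value per row
col_max <- apply(m, 2, max)   # one value per column
```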
With your computer
With your collaborators
“Your closest collaborator is you six months ago but you don’t reply to email.” — Paul Wilson
Need understandable code
Badstylemakescodehardertoread
A condition of publication in a Nature Portfolio journal is that authors are required to make materials, data, code, and associated protocols promptly available to readers without undue qualifications.
it is a condition for publication of accepted manuscripts at CJFAS that authors make publicly available all data and code needed to reproduce those results (including code to reproduce statistical results, simulation results, and figures) via an online data repository.
The only way to write good code is to write tons of shitty code first. Feeling shame about bad code stops you from getting to good code
— Hadley Wickham (@hadleywickham) 17 April 2015
Make your own style - but be consistent
“There are only two hard things in Computer Science: cache invalidation and naming things.”
— Phil Karlton
Reserved words cannot be used as names: TRUE, for, if
camelCase 🐫 | UpperCamelCase | snake_case 🐍 |
---|---|---|
billLengthMM | BillLengthMM | bill_length_mm |
bergenWeather2022 | BergenWeather2022 | bergen_weather_2022 |
dryMassG | DryMassG | dry_mass_g |
makeWeatherPlot | MakeWeatherPlot | make_weather_plot |
Place spaces around |>, +, -, <-, and around = in function calls
Use styler package to edit code to meet style guide.
Use lintr package for static code analysis, including style check
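A before-and-after sketch of the spacing advice. Both lines compute the same value, and the object names are made up. The native pipe `|>` assumes R 4.1 or later.

```r
# Cramped: runs, but is hard to read
y_cramped<-c(1,2,3)|>mean(na.rm=TRUE)

# Spaced: same code with spaces around <-, |>, = and after commas
y_spaced <- c(1, 2, 3) |> mean(na.rm = TRUE)
```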
Helps you find your way around a script
Long scripts become difficult to navigate
Fix by moving parts of the code into different files
For example:
Import with
Number files so they sort alphabetically in order of use.
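A self-contained sketch of splitting work across numbered files, assuming the import step is done with base R's `source()`. The file names, the temporary project folder and the objects are all hypothetical.

```r
# Create a throwaway project folder with two numbered scripts
project_dir <- file.path(tempdir(), "demo_project")
dir.create(project_dir, showWarnings = FALSE)
writeLines("raw_mass_g <- c(2.1, 3.4, 1.8)",
           file.path(project_dir, "01_import.R"))
writeLines("mean_mass_g <- mean(raw_mass_g)",
           file.path(project_dir, "02_analyse.R"))

# Numbered names sort into the order of use
scripts <- sort(list.files(project_dir, pattern = "\\.R$", full.names = TRUE))

# Run each script in turn; later scripts can use earlier results
for (script in scripts) {
  source(script)
}
mean_mass_g
```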
Repeated code is hard to maintain
Make repeated code into functions.
Single place to maintain
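A minimal sketch of turning a repeated calculation into a function. The function name `standardise` and the data are made up for illustration.

```r
# One definition to maintain, instead of repeating the formula for each variable
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

bill_length_z <- standardise(c(39.1, 46.5, 50.0))
dry_mass_z <- standardise(c(2.1, 3.4, 1.8))
```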
Comments
Use # to start comments.
Help you and others to understand what you did
Comments should explain the why, not the what.
Try to make code self-documenting with descriptive object names
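A sketch contrasting a comment that repeats the code with one that explains the reason. The variable names and the milligram scenario are hypothetical.

```r
dry_mass_mg <- c(2100, 3400, 1800)

# Less helpful: short names, and the comment only restates the code
# divide d by 1000
d <- dry_mass_mg
m <- d / 1000

# More helpful: descriptive names, and the comment gives the reason
# The balance records milligrams, but the analysis uses grams
dry_mass_g <- dry_mass_mg / 1000
```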