2020-11-29

Design cheatsheet

I like to complain that design can distract from the main topic and is therefore not important. However, design is important. If your site, presentation or article looks ugly, then you are already one step behind in convincing the audience. The cheatsheet below can be used to quickly fix design mistakes.

Colors

Surprisingly, you should Never Use Black. Instead, you can use colors which are near black. For example:

| Tint | HTML color code | Example text |
| --- | --- | --- |
| Pure black | #000000 | Lorem ipsum dolor sit amet |
| Grey | #4D4D4D | Lorem ipsum dolor sit amet |
| Green | #506455 | Lorem ipsum dolor sit amet |
| Blue | #113654 | Lorem ipsum dolor sit amet |
| Pink | #564556 | Lorem ipsum dolor sit amet |


2020-11-14

Frequentist and Bayesian coin flipping

To me, it is still unclear what exactly the difference is between Frequentist and Bayesian statistics. Most explanations involve terms such as "likelihood", "uncertainty" and "prior probabilities". Here, I'm going to show the difference between both statistical paradigms by using a coin flipping example. In the examples, the effect of showing more data to both paradigms will be visualised.

Generating data

Let's start by generating some data from a fair coin flip, that is, the probability of heads is 0.5.

import CairoMakie

using AlgebraOfGraphics: Lines, Scatter, data, draw, visual, mapping
using Distributions
using HypothesisTests: OneSampleTTest, confint
using StableRNGs: StableRNG
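As a rough illustration of this step, the sketch below simulates fair coin flips using only Julia's standard library instead of the packages imported above. The seed, the number of flips and the variable names are assumptions made for this example.

```julia
using Random: MersenneTwister

# Assumption: a fixed seed, chosen only to make the sketch reproducible.
rng = MersenneTwister(2020)

# Each flip is `true` (heads) or `false` (tails), both with probability 0.5.
n_flips = 1_000
flips = rand(rng, Bool, n_flips)

# For a fair coin, the proportion of heads should end up close to 0.5.
proportion_heads = count(flips) / n_flips
```

With more flips, `proportion_heads` tends to move closer to 0.5, which is the kind of effect that showing more data is meant to reveal.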


2020-11-07

Installing NixOS with encryption on a Lenovo laptop

In this post, I walk through the steps to install NixOS on a Lenovo Yoga 7 with an encrypted root disk. This tutorial is mainly based on the tutorial by Martijn Vermaat and comments by @ahstro and dbwest.

USB preparation

Download NixOS and figure out the location of the USB drive with lsblk. Use the location of the drive and not the partition, so /dev/sdb instead of /dev/sdb1. Then, prepare the USB with


2020-11-04

The logit and logistic functions

Linear regression works on real numbers \mathbb{R}, that is, the input and output are in \mathbb{R}. For probabilities, this is problematic because the linear regression will happily give a probability of -934, whereas we know that probabilities should always lie between 0 and 1. This is only by definition, but it is a useful definition in practice. Informally, the logistic function converts values from real numbers to probabilities and the logit function does the reverse.
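To make the informal description concrete, here is a minimal sketch of both functions using their standard textbook definitions. The function names are chosen for this example and are defined here rather than taken from a package.

```julia
# The logistic function maps any real number to a value between 0 and 1.
logistic(x::Real) = 1 / (1 + exp(-x))

# The logit function is the inverse: it maps a probability in (0, 1)
# back to a real number.
logit(p::Real) = log(p / (1 - p))

logistic(0.0)         # 0.5
logit(0.5)            # 0.0
logit(logistic(2.0))  # approximately 2.0, since logit undoes logistic
```

Passing the output of a linear regression through `logistic` therefore keeps the result between 0 and 1, even for an extreme input such as -934.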


2020-09-26

The principle of maximum entropy

Say that you are a statistician and are asked to come up with a probability distribution for the current state of knowledge on some particular topic you know little about. (This, in Bayesian statistics, is known as choosing a suitable prior.) To do this, the safest bet is coming up with the least informative distribution via the principle of maximum entropy.

This principle is clearly explained by Jaynes (1968): consider a die which has been tossed a very large number of times N. We expect the average to be 3.5, that is, we expect a distribution where P_n = \frac{1}{6} for each n, see the figure below.
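A small numerical check can make this plausible. The sketch below compares the uniform distribution with a hypothetical alternative that has the same average of 3.5; the alternative and the helper names are assumptions made for this illustration and do not come from Jaynes (1968).

```julia
# Shannon entropy in nats; outcomes with zero probability contribute nothing.
shannon_entropy(p) = -sum(x -> x > 0 ? x * log(x) : 0.0, p)

# Expected face value of a six-sided die with face probabilities `p`.
expected_face(p) = sum(n * p[n] for n in 1:6)

uniform = fill(1 / 6, 6)

# Hypothetical alternative: only the faces 1, 2, 5 and 6 can occur.
lopsided = [0.25, 0.25, 0.0, 0.0, 0.25, 0.25]

# Both distributions have an average of 3.5 ...
expected_face(uniform) ≈ 3.5   # true
expected_face(lopsided) ≈ 3.5  # true

# ... but the uniform distribution has the higher entropy.
shannon_entropy(uniform) > shannon_entropy(lopsided)  # true
```

The uniform distribution reaches an entropy of log(6), whereas the alternative only reaches log(4) because it rules out two faces without any evidence for doing so.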


2020-08-12

Writing effectively

According to McEnerney (2014), academics are trained to be poor writers. Eventually, they end up in his office and tell him, while crying, that their careers might end soon. One reason why academics are poor writers is that they are expert writers. Expert writers are not experts in writing but are experts who write. An expert writer typically thinks via writing and assumes that this raw output is good enough for readers. However, it isn't good enough. For a start, expert writers have a worldview which differs from the readers' due to the writers' expertise. So, to avoid crying, McEnerney argues that writers should instead write to be valuable to the community of readers.


2020-07-29

Writing checklist

I keep forgetting lessons about writing. After writing a text, my usual response is to declare it as near perfect and never look at it again. In this text, I will describe a checklist, which I can use to quickly debunk the declaration. I plan to improve this checklist over time. Hopefully, text which passes the checklist in a few dozen years from now will, indeed, be near perfect.

The list is roughly ordered by importance. The text should:

  1. Ensure that the writing is valuable to the community of readers.

  2. Be simple (Adams, 2015) or be made as simple as possible, but not simpler. This is also known as Occam's razor, kill your darlings or the KISS principle.

  3. Be polite, that is, not contain a career-limiting move. For example, do not "write papers proclaiming the superiority of your work and the pathetic inadequacy of the contributions of A, B, C, ..." (Wadge, 2020).

  4. Be consistent. For example, either use the Oxford comma in the entire text or do not use it at all.

  5. Avoid misspellings.

  6. Avoid comma splices.

  7. Place the subject before the action, that is, prefer the active voice. So, write "the boy hit the ball" instead of "the ball was hit by the boy".

  8. Flow naturally, just like a normal conversation. This is, for me, contradictory to writing when programming.

  9. Provide a high-level overview of the text. This can be a summary, abstract, a few sentences in the introduction or a combination of these.

  10. Prefer common collocations. A list of common collocations is The Academic Collocation List.

  11. Use simple verbs, for example, prefer "stop" over "cease to move on" or "do not continue".

  12. Avoid dying metaphors such as "stand shoulder to shoulder with" (Orwell, 1946). Metaphors aim to "assist thought by evoking a visual image" (Orwell, 1946). Dying metaphors do not evoke such an image anymore due to overuse (Orwell, 1946).

  13. Avoid pretentious diction such as dressing up simple statements, inappropriate adjectives and foreign words and expressions (Orwell, 1946). For example, respectively "effective", "epic" and "status quo" (Orwell, 1946).

  14. Avoid meaningless words, that is, words for which no clear definition exists. For example, "democracy" and "freedom" have "several different meanings which cannot be reconciled with one another" (Orwell, 1946).


2020-06-28

Combinations and permutations

Counting is simple except when there is a lot to be counted. Combinations and permutations are such a case; they are about counting without replacement. Suppose we want to count the number of possible results we can obtain from picking k numbers, without replacement, from an equal or larger set of numbers, that is, from n where k \leq n. When the same set of numbers in different orders should be counted separately, then the count is called the number of permutations. So, if we have some set of numbers and shuffle some numbers around, then we say that the numbers are permuted. When the same set of numbers in different orders should be counted only once, then the count is called the number of combinations. This makes sense since it is only about the combination of numbers and not the order.
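Both counts can be sketched in a few lines. The helper names `n_permutations` and `n_combinations` are made up for this example, whereas `binomial` and `factorial` are part of Julia's Base.

```julia
# Ordered picks of k out of n without replacement:
# n * (n - 1) * ... * (n - k + 1), which equals n! / (n - k)!.
n_permutations(n::Integer, k::Integer) = prod((n - k + 1):n)

# Unordered picks of k out of n without replacement.
n_combinations(n::Integer, k::Integer) = binomial(n, k)

n_permutations(5, 3)  # 60
n_combinations(5, 3)  # 10

# Each combination of k numbers can be ordered in k! ways.
n_permutations(5, 3) == n_combinations(5, 3) * factorial(3)  # true
```

The last line shows the link between the two counts: dividing the number of permutations by k! removes the orderings that a combination treats as identical.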


2020-06-27

Comparing means and SDs

When comparing different papers, it might be that the papers have numbers about the same thing, but that the numbers are on different scales. For example, many different questionnaires exist that measure the same constructs; the NEO-PI and the BFI both measure the Big Five personality traits. Say we want to compare reported means and standard deviations (SDs) for these questionnaires, which both use a Likert scale.

In this post, the equations to rescale reported means and SDs to another scale are derived. Before that, an example is worked through to get an intuition of the problem.
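As a rough sketch of what such a rescaling can look like, the code below assumes a plain linear mapping from a scale running from a to b onto a scale running from c to d. This mapping, the function names and the example numbers are assumptions for illustration and may differ from the equations derived in the post.

```julia
# Linearly map a mean from the scale [a, b] onto the scale [c, d].
rescale_mean(m, a, b, c, d) = (m - a) / (b - a) * (d - c) + c

# An SD is unaffected by the shift and only scales with the ratio of the ranges.
rescale_sd(sd, a, b, c, d) = sd * (d - c) / (b - a)

# Hypothetical report: mean 3.0 and SD 1.0 on a 1 to 5 Likert scale,
# expressed on a 1 to 7 Likert scale.
rescale_mean(3.0, 1, 5, 1, 7)  # 4.0
rescale_sd(1.0, 1, 5, 1, 7)    # 1.5
```

The midpoint of the first scale lands on the midpoint of the second, which is a quick sanity check for a linear mapping like this one.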


2020-05-11

Predicates and reproducibility

While reading texts on statistics and meta-science, I kept noticing vagueness. For example, there seem to be half a dozen definitions of replicability in papers since 2016. In this text, I try to formalize the underlying structure.

Edit 2020-11-01: The model below is basically the same as, but poorer than, the causal models presented by, for example, Pearl (2009).

Assume determinism. Assume that for any function f there is a set of predicates, or context, C which needs to hold for the function to hold, that is, to return the correct answer. Let this be denoted by C \xRightarrow{a} f. For example, Bernoulli's equation solved for \rho only holds for a context C_b containing isentropic flows, that is, C_b \Rightarrow \text{Bernoulli's equation}. There have been arguments that such contexts need to contain an (open-ended) list of negative conditions (Hoefer, 2003). Let these contexts and the contexts below also contain this list.

