Comparing means and SDs


When comparing papers, you may find that they report numbers about the same thing, but on different scales. For example, many questionnaires measure the same constructs: the NEO-PI and the BFI both measure the Big Five personality traits. Say we want to compare the reported means and standard deviations (SDs) for these questionnaires, which both use a Likert scale.

In this post, the equations to rescale reported means and SDs to another scale are derived. Before that, an example is worked through to build an intuition for the problem.

  1. Preliminaries
  2. An example with numbers
  3. Linear transformations
  4. Transformations
  5. References


Preliminaries

In this post, I stick to the set theory convention of denoting sets by uppercase letters. So, $|A|$ denotes the number of items in the set $A$ and $|a|$ denotes the absolute value of the number $a$. To say that predicate $P_x$ holds for all elements in $X$, I use the notation $\forall_x[x \in X : P_x]$. For example, if $X$ contains only integers above 3, then we can write $\forall_x[x \in X : 3 < x]$.

For some study, let the set of participants and questions be respectively denoted by $P$ and $Q$ with $|P| = n$ and $|Q| = v$. Let the set of responses be denoted by $R$ with $|R| = n \cdot v$ and let $T$ denote the set of the summed scores per participant, that is, $T = \{ t_1, t_2, \ldots, t_n \}$, see the table below.

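| Participant | $q_1$ | $q_2$ | ... | $q_v$ | Summed score |
| --- | --- | --- | --- | --- | --- |
| $p_1$ | $r_{11}$ | $r_{12}$ | ... | $r_{1v}$ | $t_1 = \sum_{q \in Q} \: r_{1q}$ |
| $p_2$ | $r_{21}$ | $r_{22}$ | ... | $r_{2v}$ | $t_2 = \sum_{q \in Q} \: r_{2q}$ |
| $p_n$ | $r_{n1}$ | $r_{n2}$ | ... | $r_{nv}$ | $t_n = \sum_{q \in Q} \: r_{nq}$ |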

Let $m$ and $s$ denote respectively the reported mean and sample SD. We assume that the papers calculated the mean and SD with

$$m = mean(T) = \frac{\sum T}{|P|} \tag{1}$$

and

$$s = sd(T) = \sqrt{Var(T)} = \sqrt{\frac{1}{n - 1} \sum_{p \in P} (t_p - m)^2}. \tag{2}$$

Note that Bessel's correction is applied here: the sum of squares is multiplied by $\frac{1}{n - 1}$ instead of $\frac{1}{n}$. This is the usual way to calculate a sample standard deviation.
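To make the difference between the two divisors concrete, here is a minimal Python sketch. The scores are made up for illustration and do not come from any paper; the standard library's `statistics.stdev` applies Bessel's correction, whereas `statistics.pstdev` divides by $n$.

```python
import math
import statistics

# Hypothetical summed scores t_1..t_n for n = 5 participants (illustration only).
scores = [12, 15, 9, 14, 10]

n = len(scores)
m = sum(scores) / n

# Sample SD with Bessel's correction: divide the sum of squares by n - 1.
s = math.sqrt(sum((t - m) ** 2 for t in scores) / (n - 1))

# Population SD for contrast: divide the sum of squares by n.
s_population = math.sqrt(sum((t - m) ** 2 for t in scores) / n)

print("mean:", m)
print("sd with n - 1:", s, "| statistics.stdev:", statistics.stdev(scores))
print("sd with n:", s_population, "| statistics.pstdev:", statistics.pstdev(scores))
```

If a paper does not state which divisor was used, the difference is largest for small samples, since the two values differ by a factor of $\sqrt{n / (n - 1)}$.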

An example with numbers

Let's consider one study consisting of only one question and three participants. Each response $u \in U$ is an integer ($\mathbb{Z}$) in the range [1, 3], that is, $\forall_u[u \in U : u \in \mathbb{Z} \land 1 \leq u \leq 3]$. So, the lower and upper bound of $u$ are respectively $u_l = 1$ and $u_u = 3$.


We can rescale these numbers to a normalized response $v \in V$ in the range [0, 1] by applying min-max normalization,

$$v = \frac{u - u_l}{u_u - u_l} = \frac{u - 1}{3 - 1} = \frac{u - 1}{2}. \tag{3}$$

The rescaled responses become


Now, suppose that the study had used a scale in the range [0, 5]. Let these responses be denoted by $w \in W$. We can rescale the normalized responses $v \in V$ in the range [0, 1] up to $w \in W$ in the range [0, 5] with

$$w = v \cdot (w_u - w_l) + w_l = v \cdot (5 - 0) + 0 = 5v. \tag{4}$$

This results in

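| Participant | Response | Summed score |
| --- | --- | --- |
| $\vdots$ | $\vdots$ | $\vdots$ |
| $p_3$ | $2 \frac{1}{2}$ | $2 \frac{1}{2}$ |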

Since we know all the responses, we can calculate the means and standard deviations:

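| Set | Mean | SD |
| --- | --- | --- |
| $U$ | $2$ | $1$ |
| $V$ | $\frac{1}{2}$ | $\frac{1}{2}$ |
| $W$ | $2 \frac{1}{2}$ | $2 \frac{1}{2}$ |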
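The numbers in this example can be reproduced with a few lines of Python. This is only a sketch: the responses `[1, 2, 3]` are an assumption, inferred from the fact that three integer responses in [1, 3] with mean 2 and sample SD 1 must be 1, 2 and 3 in some order. Which participant gave which response does not affect the means and SDs.

```python
import statistics

# Assumed responses on the original scale [1, 3] (order is arbitrary).
U = [1, 2, 3]
u_l, u_u = 1, 3  # bounds of the original scale
w_l, w_u = 0, 5  # bounds of the target scale

# Min-max normalization to [0, 1], applied per response.
V = [(u - u_l) / (u_u - u_l) for u in U]

# Rescale the normalized responses to [0, 5].
W = [v * (w_u - w_l) + w_l for v in V]

for name, values in (("U", U), ("V", V), ("W", W)):
    # statistics.stdev uses Bessel's correction (divides by n - 1).
    print(name, values, statistics.mean(values), statistics.stdev(values))
```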

Now, suppose that $U$ was part of a study reported in a paper and that the scale of $W$ is the scale we use in our own study. Of course, a typical paper doesn't give us all responses $u \in U$. Instead, we only have $mean(U)$ and $sd(U)$ and want to know $mean(W)$ and $sd(W)$. This can be done by using the equations derived below. We could first normalize the result: by Eq. (14),

$$mean(V) = \frac{mean(U) - u_l}{u_u - u_l} = \frac{mean(U) - 1}{3 - 1} = \frac{2 - 1}{2} = \frac{1}{2} \tag{5}$$

and, by Eq. (15),

$$sd(V) = \frac{sd(U)}{u_u - u_l} = \frac{sd(U)}{3 - 1} = \frac{1}{2}. \tag{6}$$

Next, we can rescale this to the range of $W$. By Eq. (16),

$$mean(W) = (w_u - w_l) \cdot mean(V) + w_l = (5 - 0) \cdot mean(V) + 0 = 5 \cdot \frac{1}{2} = 2 \frac{1}{2} \tag{7}$$

and, by Eq. (17),

$$sd(W) = (w_u - w_l) \cdot sd(V) = (5 - 0) \cdot \frac{1}{2} = 2 \frac{1}{2}. \tag{8}$$

We could also go from $U$ to $W$ in one step. By Eq. (18),

$$mean(W) = (w_u - w_l) \cdot \frac{mean(U) - u_l}{u_u - u_l} + w_l = (5 - 0) \cdot \frac{2 - 1}{3 - 1} + 0 = 2 \frac{1}{2} \tag{9}$$

and, by Eq. (19),

$$sd(W) = (w_u - w_l) \cdot \frac{sd(U)}{u_u - u_l} = (5 - 0) \cdot \frac{1}{3 - 1} = 2 \frac{1}{2}. \tag{10}$$
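The same calculation can be sketched in Python using only the summary statistics, which is the situation we are in when reading a paper. The variable names are my own; the values are the ones from the example above.

```python
import math

# Reported summary statistics on the original scale [1, 3].
mean_u, sd_u = 2.0, 1.0
u_l, u_u = 1.0, 3.0
w_l, w_u = 0.0, 5.0  # target scale

# Route 1: normalize to [0, 1] first, then rescale to [0, 5].
mean_v = (mean_u - u_l) / (u_u - u_l)
sd_v = sd_u / (u_u - u_l)
mean_w = (w_u - w_l) * mean_v + w_l
sd_w = (w_u - w_l) * sd_v

# Route 2: go from [1, 3] to [0, 5] in one step.
mean_w_direct = (w_u - w_l) * (mean_u - u_l) / (u_u - u_l) + w_l
sd_w_direct = (w_u - w_l) * sd_u / (u_u - u_l)

# Both routes should agree up to floating-point error.
assert math.isclose(mean_w, mean_w_direct)
assert math.isclose(sd_w, sd_w_direct)
print(mean_w, sd_w)
```

Note that the lower bound only shifts the mean; the SD is affected by the ratio of the two scale widths and nothing else.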

Linear transformations

Consider a random variable $X$ with a finite mean and variance, and some constants $a$ and $b$. Before we can derive the transformations, we need some equations to be able to move $a$ and $b$ out of $mean(aX + b)$ and $sd(aX + b)$.

For the mean, the transformation is quite straightforward,

$$
\begin{aligned}
mean(aX + b) &= \frac{\sum_{i=1}^{|X|} (ax_i + b)}{|X|} \\
&= \frac{\sum_{i=1}^{|X|} (ax_i) + |X|b}{|X|} \\
&= \frac{\sum_{i=1}^{|X|} (ax_i)}{|X|} + b \\
&= \frac{a \sum_{i=1}^{|X|}(x_i)}{|X|} + b \\
&= a \cdot \frac{\sum_{i=1}^{|X|}(x_i)}{|X|} + b \\
&= a \cdot mean(X) + b.
\end{aligned}
\tag{11}
$$

Note that the position of the constant $b$ makes intuitive sense: for example, if you add a constant $b$ to all the elements of a sample, then the mean will move by $b$. To scale the standard deviation, we can use the equation for a linear transformation of the variance (Hogg et al., 2018),

$$Var(aX + b) = a^2 \cdot Var(X). \tag{12}$$

We can use this to derive that

$$sd(aX + b) = \sqrt{Var(aX + b)} = \sqrt{a^2 \cdot Var(X)} = |a| \cdot sd(X). \tag{13}$$
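Both identities are easy to check numerically. The sketch below uses a made-up sample and a negative $a$ on purpose, to show why the absolute value is needed for the SD but not for the mean.

```python
import math
import statistics

# Hypothetical sample and constants (illustration only).
x = [2.0, 4.0, 4.0, 5.0, 7.0]
a, b = -3.0, 10.0

# Apply the linear transformation to every element.
y = [a * x_i + b for x_i in x]

# mean(aX + b) = a * mean(X) + b
lhs_mean, rhs_mean = statistics.mean(y), a * statistics.mean(x) + b

# sd(aX + b) = |a| * sd(X); b drops out and the sign of a is lost.
lhs_sd, rhs_sd = statistics.stdev(y), abs(a) * statistics.stdev(x)

print(lhs_mean, rhs_mean)
print(lhs_sd, rhs_sd)
```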


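Transformations

Next, we derive the equations for the transformations. Let $l$ and $u$ be respectively the lower and upper bound for the Likert scale over all the answers; specifically, $\forall_t [t \in T : l \leq t \leq u]$. Let $k_l$ and $k_u$ be respectively the lower and upper bound for the Likert scale per answer; specifically, $\forall_r [ r \in R : k_l \leq r \leq k_u]$. The equations here are written in terms of $k_l$ and $k_u$, which coincide with $l$ and $u$ when each participant's score is a single answer, as in the example above; for scores summed over $v$ questions, the bounds are $l = v \cdot k_l$ and $u = v \cdot k_u$, and these should be used in place of $k_l$ and $k_u$. Now, for the normalized mean $mean(T')$,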

$$mean(T') = mean \left( \frac{T - k_l}{k_u - k_l} \right) = \frac{mean(T - k_l)}{k_u - k_l} = \frac{mean(T) - k_l}{k_u - k_l} \tag{14}$$

and for the normalized SD $sd(T')$,

$$sd(T') = sd \left( \frac{T - k_l}{k_u - k_l} \right) = \frac{sd(T - k_l)}{|k_u - k_l|} = \frac{sd(T)}{k_u - k_l} \tag{15}$$

where $|k_u - k_l| = k_u - k_l$ since we know that $k_l < k_u$.

To change these normalized scores back to another scale in the range $[g_l, g_u]$, we can use

$$mean(T'') = mean((g_u - g_l) \cdot T' + g_l) = (g_u - g_l) \cdot mean(T') + g_l \tag{16}$$

and

$$sd(T'') = sd((g_u - g_l) \cdot T' + g_l) = (g_u - g_l) \cdot sd(T') \tag{17}$$

We can also transform the mean and SD in one step from the range $[k_l, k_u]$ to $[g_l, g_u]$ with

$$
\begin{aligned}
mean(T'') &= (g_u - g_l) \cdot mean(T') + g_l \\
&= (g_u - g_l) \cdot \frac{mean(T) - k_l}{k_u - k_l} + g_l
\end{aligned}
\tag{18}
$$


$$
\begin{aligned}
sd(T'') &= (g_u - g_l) \cdot sd(T') \\
&= (g_u - g_l) \cdot \frac{sd(T)}{k_u - k_l}
\end{aligned}
\tag{19}
$$
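As a rough sketch, the one-step equations could be wrapped in a small Python helper. The function name `rescale_mean_sd` and its argument names are hypothetical, chosen to mirror the symbols used in this post. The 5-point to 7-point numbers in the usage example are made up for illustration.

```python
def rescale_mean_sd(mean, sd, k_l, k_u, g_l, g_u):
    """Rescale a reported mean and SD from the range [k_l, k_u] to [g_l, g_u].

    Hypothetical helper implementing the one-step equations for the mean
    and the SD. The bounds must describe the same quantity as the reported
    statistics (for example, bounds of the summed score for summed scores).
    """
    if not (k_l < k_u and g_l < g_u):
        raise ValueError("lower bounds must be strictly below upper bounds")
    # Ratio of the target scale width to the original scale width.
    factor = (g_u - g_l) / (k_u - k_l)
    new_mean = (mean - k_l) * factor + g_l
    new_sd = sd * factor  # the shift terms drop out for the SD
    return new_mean, new_sd


# The worked example: mean 2 and SD 1 on [1, 3], rescaled to [0, 5].
print(rescale_mean_sd(2, 1, 1, 3, 0, 5))

# Made-up numbers: a 5-point Likert item (1 to 5) rescaled to 7 points (1 to 7).
print(rescale_mean_sd(3.2, 0.8, 1, 5, 1, 7))
```

Computing the width ratio once keeps the mean and SD consistent with each other, and the bounds check catches swapped arguments, which would otherwise silently flip the sign of the SD.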


References

Hogg, R. V., McKean, J., & Craig, A. T. (2018). Introduction to mathematical statistics. Pearson Education.