Experimental design—power analysis and its visualisation

Objectives

  • Be able to perform power calculations using computational simulation approaches.
  • Know how to create and use line and image plots.

Power relates to the ability to detect the presence of a true effect and is an important component of experimental design. In this lesson, we will consider a general-purpose simulation approach to estimating the power of an experimental design. We will also investigate how we can visualise such data using line plots and two-dimensional image plots.

Calculating power given effect size and sample size

We will begin by considering a scenario in which we have an effect size and sample size in mind and we would like to know the associated power. For our example experiment, we will use a between-subjects design with two factors, 30 participants per group, and we will assume a ‘large’ effect size (Cohen’s d = 0.8). You might recognise this design, as it was the same that we used in the previous lesson to demonstrate the operation of the independent-samples t-test. Here, we will determine the power of this test.

The key to determining power using a simulation approach is to again leverage the computational capacity that we have once we are able to write our own programs. We will perform a large number of simulated experiments, each time calculating our test statistic (independent samples t-test, in this case) and accumulating the number of times we reject the null hypothesis. Then, the power is simply the proportion of times that we are able to reject the null hypothesis (remembering that we control the population means and we know that there is a true difference).


import numpy as np

import scipy.stats

n_per_group = 30

# effect size = 0.8
group_means = [0.0, 0.8]
group_sigmas = [1.0, 1.0]

n_groups = len(group_means)

# number of simulations
n_sims = 10000

# store the p value for each simulation
sim_p = np.empty(n_sims)
sim_p.fill(np.nan)

for i_sim in range(n_sims):

    data = np.empty([n_per_group, n_groups])
    data.fill(np.nan)

    # simulate the data for this 'experiment'
    for i_group in range(n_groups):

        data[:, i_group] = np.random.normal(
            loc=group_means[i_group],
            scale=group_sigmas[i_group],
            size=n_per_group
        )

    result = scipy.stats.ttest_ind(data[:, 0], data[:, 1])

    sim_p[i_sim] = result[1]

# number of simulations where the null was rejected
n_rej = np.sum(sim_p < 0.05)

prop_rej = n_rej / float(n_sims)

print "Power: ", prop_rej
Power:  0.8619

We can see that our power to detect a large effect size with 30 participants per group in a between-subjects design is about 86%. That is, if a large effect size is truly present then we would expect to be able to reject the null hypothesis (at an alpha of 0.05) about 86% of the time.

The above code was written with clarity rather than speed of execution in mind, but in subsequent examples we would like to perform more computationally demanding analyses—so let’s make some tweaks to allow it to be performed quicker. In the code below, we generate the simulated data all at once and then use the axis argument to scipy.stats.ttest_ind.


import numpy as np

import scipy.stats

n_per_group = 30

# effect size = 0.8
group_means = [0.0, 0.8]
group_sigmas = [1.0, 1.0]

n_groups = len(group_means)

# number of simulations
n_sims = 10000

data = np.empty([n_sims, n_per_group, n_groups])
data.fill(np.nan)

for i_group in range(n_groups):

    data[:, :, i_group] = np.random.normal(
        loc=group_means[i_group],
        scale=group_sigmas[i_group],
        size=[n_sims, n_per_group]
    )

result = scipy.stats.ttest_ind(
    data[:, :, 0],
    data[:, :, 1],
    axis=1
)

sim_p = result[1]

# number of simulations where the null was rejected
n_rej = np.sum(sim_p < 0.05)

prop_rej = n_rej / float(n_sims)

print "Power: ", prop_rej
Power:  0.8596

Required sample size to achieve a given power for a given effect size

Now we will move on to a more common scenario, where you have a design and effect size in mind and would like to know what sample size you would need to achieve a particular power (80% is typical). This is a straightforward extension of the previous example: we begin with a sample size and calculate the associated power. We then perform such a calculation repeatedly, each time increasing the sample size, until the power has reached the desired level.


import numpy as np

import scipy.stats

# start at 20 participants
n_per_group = 20

# effect size = 0.8
group_means = [0.0, 0.8]
group_sigmas = [1.0, 1.0]

n_groups = len(group_means)

# number of simulations
n_sims = 10000

# power level that we would like to reach
desired_power = 0.8

# initialise the power for the current sample size to a small value
current_power = 0.0

# keep iterating until desired power is obtained
while current_power < desired_power:

    data = np.empty([n_sims, n_per_group, n_groups])
    data.fill(np.nan)

    for i_group in range(n_groups):

        data[:, :, i_group] = np.random.normal(
            loc=group_means[i_group],
            scale=group_sigmas[i_group],
            size=[n_sims, n_per_group]
        )

    result = scipy.stats.ttest_ind(
        data[:, :, 0],
        data[:, :, 1],
        axis=1
    )

    sim_p = result[1]

    # number of simulations where the null was rejected
    n_rej = np.sum(sim_p < 0.05)

    prop_rej = n_rej / float(n_sims)

    current_power = prop_rej

    print "With {n:d} samples per group, power = {p:.3f}".format(
        n=n_per_group,
        p=current_power
    )

    # increase the number of samples by one for the next iteration of the loop
    n_per_group += 1
With 20 samples per group, power = 0.697
With 21 samples per group, power = 0.714
With 22 samples per group, power = 0.738
With 23 samples per group, power = 0.755
With 24 samples per group, power = 0.769
With 25 samples per group, power = 0.791
With 26 samples per group, power = 0.801

We can see that we would reach the desired power with somewhere between 25 and 27 participants per group.

Visualising the sample size/power relationship

In the previous example, we sought to determine the sample size that would provide a given level of power. However, perhaps we do not have a single level of power in mind at the moment but would like to see the relationship between sample size and power so that we can see the costs and benefits of a particular sample size.

First, we will use a similar approach to the previous example, however we will perform the simulations across a fixed set of sample sizes. We will then save the power values to disk so that we can use them to create a visualisation without having to re-run all the simulations.


import numpy as np

import scipy.stats

# let's look at samples sizes of 10 per group up to 50 per group in steps of 5
ns_per_group = np.arange(10, 51, 5)

# a bit awkward - number of n's
n_ns = len(ns_per_group)

# effect size = 0.8
group_means = [0.0, 0.8]
group_sigmas = [1.0, 1.0]

n_groups = len(group_means)

# number of simulations
n_sims = 10000

power = np.empty(n_ns)
power.fill(np.nan)

for i_n in range(n_ns):

    n_per_group = ns_per_group[i_n]

    data = np.empty([n_sims, n_per_group, n_groups])
    data.fill(np.nan)

    for i_group in range(n_groups):

        data[:, :, i_group] = np.random.normal(
            loc=group_means[i_group],
            scale=group_sigmas[i_group],
            size=[n_sims, n_per_group]
        )

    result = scipy.stats.ttest_ind(
        data[:, :, 0],
        data[:, :, 1],
        axis=1
    )

    sim_p = result[1]

    # number of simulations where the null was rejected
    n_rej = np.sum(sim_p < 0.05)

    prop_rej = n_rej / float(n_sims)

    power[i_n] = prop_rej

# this is a bit tricky
# what we want to save is an array with 2 columns, but we have 2 different 1D
# arrays. here, we use the 'np.newaxis' property to add a column each of the
# two 1D arrays, and then 'stack' them horizontally
array_to_save = np.hstack(
    [
        ns_per_group[:, np.newaxis],
        power[:, np.newaxis]
    ]
)

power_path = "prog_data_vis_power_sim_data.tsv"

np.savetxt(
    power_path,
    array_to_save,
    delimiter="\t",
    header="Simulated power as a function of samples per group"
)

Now, we want to begin our visualisation code by loading the data we just saved.


import numpy as np

power_path = "prog_data_vis_power_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of n's is the number of rows
n_ns = power.shape[0]

ns_per_group = power[:, 0]
power_per_ns = power[:, 1]

We will now use familiar techniques to set up a one-panel figure. We will also apply some of the usual formatting changes.

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of n's is the number of rows
n_ns = power.shape[0]

ns_per_group = power[:, 0]
power_per_ns = power[:, 1]

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "8cm"

graph = page.Add("graph", autoadd=False)

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")


# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()

Alright, now we can plot the sample size per group on the horizontal axis and the power on the vertical axis. We will use the familiar xy figure type.

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of n's is the number of rows
n_ns = power.shape[0]

ns_per_group = power[:, 0]
power_per_ns = power[:, 1]

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "8cm"

graph = page.Add("graph", autoadd=False)

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")

# let veusz know about the data
embed.SetData("ns_per_group", ns_per_group)
embed.SetData("power_per_ns", power_per_ns)

xy = graph.Add("xy")

xy.xData.val = "ns_per_group"
xy.yData.val = "power_per_ns"

# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()

The figure is looking pretty good—the decelerating relationship between the number of samples in each group and the power is clearly evident. Let’s apply some more formatting changes to improve its appearance.

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of n's is the number of rows
n_ns = power.shape[0]

ns_per_group = power[:, 0]
power_per_ns = power[:, 1]

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "8cm"

graph = page.Add("graph", autoadd=False)

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")

# let veusz know about the data
embed.SetData("ns_per_group", ns_per_group)
embed.SetData("power_per_ns", power_per_ns)

xy = graph.Add("xy")

xy.xData.val = "ns_per_group"
xy.yData.val = "power_per_ns"

# set the x ticks to be at every 2nd location we sampled
x_axis.MajorTicks.manualTicks.val = ns_per_group[::2].tolist()
x_axis.MinorTicks.hide.val = True
x_axis.label.val = "Sample size per group"
x_axis.min.val = float(np.min(ns_per_group) - 5)
x_axis.max.val = float(np.max(ns_per_group) + 5)

# sensible to include the whole range here
y_axis.min.val = 0.0
y_axis.max.val = 1.0
y_axis.label.val = "Power (for d = 0.8)"

# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()

Calculating power across varying sample and effect sizes

In the previous examples, we have assumed a fixed (‘large’) effect size. However, perhaps we want to investigate how power changes with both effect size and sample size. This is again a straightforward extension of the previous example. We will again save the data to disk to facilitate subsequent visualisation.


import numpy as np

import scipy.stats

# let's look at samples sizes of 10 per group up to 50 per group in steps of 5
ns_per_group = np.arange(10, 51, 5)

# a bit awkward - number of n's
n_ns = len(ns_per_group)

# span the range from a bit less than a 'small' effect size to a bit bigger
# than a 'large' effect size
effect_sizes = np.arange(0.2, 0.91, 0.1)

n_effect_sizes = len(effect_sizes)

power = np.empty([n_effect_sizes, n_ns])
power.fill(np.nan)

# number of simulations
n_sims = 10000

for i_es in range(n_effect_sizes):

    group_means = [0.0, effect_sizes[i_es]]
    group_sigmas = [1.0, 1.0]

    n_groups = len(group_means)

    for i_n in range(n_ns):

        n_per_group = ns_per_group[i_n]

        data = np.empty([n_sims, n_per_group, n_groups])
        data.fill(np.nan)

        for i_group in range(n_groups):

            data[:, :, i_group] = np.random.normal(
                loc=group_means[i_group],
                scale=group_sigmas[i_group],
                size=[n_sims, n_per_group]
            )

        result = scipy.stats.ttest_ind(
            data[:, :, 0],
            data[:, :, 1],
            axis=1
        )

        sim_p = result[1]

        # number of simulations where the null was rejected
        n_rej = np.sum(sim_p < 0.05)

        prop_rej = n_rej / float(n_sims)

        power[i_es, i_n] = prop_rej

power_path = "prog_data_vis_power_es_x_ss_sim_data.tsv"

np.savetxt(
    power_path,
    power,
    delimiter="\t",
    header="Simulated power as a function of samples per group and ES"
)

Visualising power across varying sample and effect sizes

Now that we have our power calculations as a function of sample size and effect size saved to disk, we can think about how to visualise it. But first, let’s load the data.


import numpy as np

power_path = "prog_data_vis_power_es_x_ss_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of effect sizes is the number of rows
n_effect_sizes = power.shape[0]
# number of n's is the number of columns
n_ns = power.shape[1]

ns_per_group = np.arange(10, 51, 5)
effect_sizes = np.arange(0.2, 0.91, 0.1)

To visualise this two-dimensional data (effect size x sample size per group), we will investigate a new veusz figure type—an image. First, let’s set up our figure framework as per usual.

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_es_x_ss_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of effect sizes is the number of rows
n_effect_sizes = power.shape[0]
# number of n's is the number of columns
n_ns = power.shape[1]

ns_per_group = np.arange(10, 51, 5)
effect_sizes = np.arange(0.2, 0.91, 0.1)

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "8cm"

graph = page.Add("graph", autoadd=False)

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")

# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()

Now, we will add our image figure type to the graph. First, we tell veusz about the data using the SetData2D function, using the arguments xcent and ycent to tell veusz what the horizontal and vertical datapoints represent.

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_es_x_ss_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of effect sizes is the number of rows
n_effect_sizes = power.shape[0]
# number of n's is the number of columns
n_ns = power.shape[1]

ns_per_group = np.arange(10, 51, 5)
effect_sizes = np.arange(0.2, 0.91, 0.1)

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "8cm"

graph = page.Add("graph", autoadd=False)

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")

embed.SetData2D(
    "power",
    power,
    xcent=ns_per_group,
    ycent=effect_sizes
)

img = graph.Add("image")

img.data.val = "power"

# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()

Now we are starting to get somewhere—we can see that the power for a given combination of effect size and sample size per group is represented by the luminance of the relevant cell. Let’s first clean up the appearance of the figure a bit.

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_es_x_ss_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of effect sizes is the number of rows
n_effect_sizes = power.shape[0]
# number of n's is the number of columns
n_ns = power.shape[1]

ns_per_group = np.arange(10, 51, 5)
effect_sizes = np.arange(0.2, 0.91, 0.1)

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "8cm"

graph = page.Add("graph", autoadd=False)

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")

embed.SetData2D(
    "power",
    power,
    xcent=ns_per_group,
    ycent=effect_sizes
)

img = graph.Add("image")

img.data.val = "power"

# set 0.0 = black and 1.0 = white
img.min.val = 0.0
img.max.val = 1.0

x_axis.MajorTicks.manualTicks.val = ns_per_group[::2].tolist()
x_axis.MinorTicks.hide.val = True
x_axis.label.val = "Sample size per group"

y_axis.MajorTicks.manualTicks.val = effect_sizes.tolist()
y_axis.MinorTicks.hide.val = True
y_axis.label.val = "Effect size (Cohen's d)"
y_axis.max.val = float(np.max(effect_sizes) + 0.05)

# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()

It is looking better, but a major deficiency at the moment is a lack of information about what the intensity in each cell represents. To remedy this, we will add a colorbar to our graph. But first, we will make room for it by increasing our vertical page size and increasing the top margin of our graph.

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_es_x_ss_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of effect sizes is the number of rows
n_effect_sizes = power.shape[0]
# number of n's is the number of columns
n_ns = power.shape[1]

ns_per_group = np.arange(10, 51, 5)
effect_sizes = np.arange(0.2, 0.91, 0.1)

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "12cm"

graph = page.Add("graph", autoadd=False)

graph.topMargin.val = "3cm"

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")

embed.SetData2D(
    "power",
    power,
    xcent=ns_per_group,
    ycent=effect_sizes
)

img = graph.Add("image")

img.data.val = "power"

# set 0.0 = black and 1.0 = white
img.min.val = 0.0
img.max.val = 1.0

x_axis.MajorTicks.manualTicks.val = ns_per_group[::2].tolist()
x_axis.MinorTicks.hide.val = True
x_axis.label.val = "Sample size per group"

y_axis.MajorTicks.manualTicks.val = effect_sizes.tolist()
y_axis.MinorTicks.hide.val = True
y_axis.label.val = "Effect size (Cohen's d)"
y_axis.max.val = float(np.max(effect_sizes) + 0.05)

# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()

Now, we can add and position the colour bar:

import numpy as np

import veusz.embed

power_path = "prog_data_vis_power_es_x_ss_sim_data.tsv"

power = np.loadtxt(power_path, delimiter="\t")

# number of effect sizes is the number of rows
n_effect_sizes = power.shape[0]
# number of n's is the number of columns
n_ns = power.shape[1]

ns_per_group = np.arange(10, 51, 5)
effect_sizes = np.arange(0.2, 0.91, 0.1)

embed = veusz.embed.Embedded("veusz")

page = embed.Root.Add("page")
page.width.val = "8.4cm"
page.height.val = "12cm"

graph = page.Add("graph", autoadd=False)

graph.topMargin.val = "3cm"

x_axis = graph.Add("axis")
y_axis = graph.Add("axis")

embed.SetData2D(
    "power",
    power,
    xcent=ns_per_group,
    ycent=effect_sizes
)

# give the image a 'name', so that we can tell the colour bar what image it
# should be associated with
img = graph.Add("image", name="power_img")

img.data.val = "power"

# set 0.0 = black and 1.0 = white
img.min.val = 0.0
img.max.val = 1.0

cbar = graph.Add("colorbar")

# connect the colour bar with the power image
cbar.widgetName.val = "power_img"
cbar.horzPosn.val = "centre"
cbar.vertPosn.val = "manual"
cbar.vertManual.val = -0.35
cbar.label.val = "Power"

x_axis.MajorTicks.manualTicks.val = ns_per_group[::2].tolist()
x_axis.MinorTicks.hide.val = True
x_axis.label.val = "Sample size per group"

y_axis.MajorTicks.manualTicks.val = effect_sizes.tolist()
y_axis.MinorTicks.hide.val = True
y_axis.label.val = "Effect size (Cohen's d)"
y_axis.max.val = float(np.max(effect_sizes) + 0.05)

# do the typical manipulations to the axis apperances
graph.Border.hide.val = True

typeface = "Arial"

for curr_axis in [x_axis, y_axis, cbar]:

    curr_axis.Label.font.val = typeface
    curr_axis.TickLabels.font.val = typeface

    curr_axis.autoMirror.val = False
    curr_axis.outerticks.val = True

embed.WaitForClose()