# Fundamentals of data and coding for psychology

## Data associated with psychology research

These classes are about the fundamentals of digital data and its representation.
However, it is very useful to be aware of how such concepts are applied in everyday research in psychology - of the kind that you will be doing during your internship.

The learning objectives for this lesson are for you to be able to:

* Identify the data associated with a research study by reading the journal article reporting its results.
* Describe the types of data that are associated with conducting research in psychology across a range of specialities.
* Use a notebook to store and share written work.

### Categorising research data

Your task for this lesson will be to read and interpret the data associated with a published journal article.

To structure your thinking around this task, it is useful to categorise the data associated with a research study into three stages:

#### Pre-experiment

This concerns the sort of data that is required before you can run the experiment.
This might include a set of images, or the digital form of a survey, or the timing for a schedule of rat apparatus availability, etc.

#### Gathered ('raw') data

This concerns the data that are obtained from running the study, in close to their 'original' unprocessed form.
This might include key presses (which key and response time), presentation schedules, Likert scale selections, free text responses, etc.

#### Derived data

This concerns the data that are produced during the data analysis - typically derived from the pre-experiment and raw data.
This might include summaries of repeated trials (e.g. averaging), data transformations, statistical estimates, etc.
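
As a minimal sketch of how raw data become derived data, the Python snippet below averages response times across repeated trials within each condition. The trial values and the condition names (`congruent`, `incongruent`) are made up for illustration and do not come from any particular study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: one (condition, response time in seconds) pair per trial.
raw_trials = [
    ("congruent", 0.50),
    ("incongruent", 0.70),
    ("congruent", 0.60),
    ("incongruent", 0.80),
]

# Group the response times by condition.
times_by_condition = defaultdict(list)
for condition, response_time in raw_trials:
    times_by_condition[condition].append(response_time)

# Derived data: the mean response time for each condition.
mean_times = {
    condition: mean(times) for condition, times in times_by_condition.items()
}

print(mean_times)
```

The raw trials are left untouched and the summary is stored separately, which mirrors the distinction between gathered and derived data described above.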

### Evaluating published research studies

Your task is to read and evaluate a published study in terms of its associated data:

1. Pick a journal article that is related to the area of your internship project. This could be one of your supervisor's articles, or an article that has been recommended by your supervisor in relation to your project - or it could be an article that you have independently discovered. If the article has multiple experiments, pick one as the focus.
1. Create a new notebook (see the [course overview page](https://webutils.psy.unsw.edu.au/internship_coding/site/) for instructions) to store your work on this task.
1. Write a couple of sentences summarising the study.
1. Think carefully about the study in relation to the data categories above, and produce an inventory of the data involved in the study. It does not need to be exhaustive.
1. Export your notebook to HTML (see the [course overview page](https://webutils.psy.unsw.edu.au/internship_coding/site/) for instructions).
1. Create a new post on the `Coding forum` on the course Moodle site and attach your exported notebook from the previous step.
1. Review the posts from your classmates, *taking particular note of research from **outside** the area of your internship project*. Feel free to visit your classmates and discuss their posts.

Below, you will find a couple of examples from our own work. You can use these as a template for how you approach your task. Remember that you can use the various capacities of `Markdown` to produce formatted output if you think that will enhance the presentation of your notebook.

### Example 1

Mannion, D.J. (2015) [Sensitivity to the visual field origin of natural image patches in human low-level visual cortex.](https://peerj.com/articles/1038) *PeerJ, 3*, e1038.

##### Area

This article relates to visual neuroscience (perception).

##### Summary

Regions in low-level human visual cortex are responsive to patterns presented within a particular region of the visual field.
The aim of this study was to determine whether such responses are affected by whether the patterns are obtained from a matching or mismatching region of the visual field during natural navigation - that is, whether regions of visual cortex that are responsive to the upper visual field respond differently depending on whether the patterns were sourced from the upper or the lower visual field.
The study used functional magnetic resonance imaging (fMRI) to infer the level of neural activity elicited by patterns sourced from the upper and lower visual fields of a freely-navigating observer when presented to the upper and lower visual fields of human participants.

##### Pre-experiment data

We needed 30 images of natural scenes to present to observers in the MRI scanner.
These were upsampled, cropped to circular apertures, and combined with other patterns at runtime during the experiment.

<!-- note that I'm using a HTML tag directly because Markdown doesn't allow for setting image display size -->
<img src="https://dfzljdn9uc3pi.cloudfront.net/2015/1038/1/fig-1-2x.jpg" alt="Stimuli" style="width: 400px;"/>

We also needed to generate and store the information about each of the 83 x 2 'events' that occurred within a single scanning run.
This information included the time of the event and what conditions were active during the event.

We also needed to generate and store similar information for an unrelated task that participants were asked to do at the centre of their vision.
This information included the digit and its polarity (black or white) 3 times a second for the duration of a scanning run.
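
To give a feel for what this kind of pre-experiment data might look like, here is a rough Python sketch that generates a digit schedule for one run. It assumes a 332 second run (the duration reported for the scanning runs in this study), digits drawn from 0 to 9, and a uniformly random choice of digit and polarity; those last two choices, the field names, and the seed are assumptions for illustration rather than details of the actual experiment code.

```python
import random

RUN_DURATION_S = 332   # duration of one scanning run, in seconds
DIGITS_PER_SECOND = 3  # presentation rate of the central digit task

# Fixed (arbitrary) seed so that the same schedule can be regenerated.
rng = random.Random(2015)

n_digits = RUN_DURATION_S * DIGITS_PER_SECOND

# One entry per digit presentation: onset time, digit, and polarity.
digit_schedule = [
    {
        "onset_s": i / DIGITS_PER_SECOND,
        "digit": rng.randint(0, 9),  # assumed range of digits
        "polarity": rng.choice(["black", "white"]),
    }
    for i in range(n_digits)
]

print(n_digits)  # 996
print(digit_schedule[0])
```

Generating and saving the schedule before the run means that the analysis can later match each button press against what was actually on the screen.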

##### Raw data

The primary raw data that was collected was in the form of fMRI volumes.
An fMRI volume was acquired every 2 seconds, and was acquired from a 192 x 192 x 70 mm section of the brain with 2 mm resolution - giving 96 x 96 x 35 = 322,560 numbers every 2 seconds.
With each run being 332 seconds in duration, that means there were 322,560 x 166 = 53,544,960 numbers collected on every run.
With each participant doing 10 runs, that means there were 53,544,960 x 10 = 535,449,600 (= a lot of) numbers per participant.
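
The arithmetic above can be reproduced in a few lines of Python. This is only a sketch of the calculation; the variable names are our own.

```python
# Acquisition parameters taken from the description above.
fov_mm = (192, 192, 70)   # size of the imaged section of the brain
voxel_size_mm = 2         # resolution along each axis
seconds_per_volume = 2    # one volume is acquired every 2 seconds
run_duration_s = 332
runs_per_participant = 10

# Number of voxels along each axis, and then in one whole volume.
voxels_per_axis = [extent // voxel_size_mm for extent in fov_mm]
numbers_per_volume = voxels_per_axis[0] * voxels_per_axis[1] * voxels_per_axis[2]

# Scale up to a run, and then to a participant.
volumes_per_run = run_duration_s // seconds_per_volume
numbers_per_run = numbers_per_volume * volumes_per_run
numbers_per_participant = numbers_per_run * runs_per_participant

print(voxels_per_axis)                    # [96, 96, 35]
print(f"{numbers_per_volume:,}")          # 322,560
print(f"{numbers_per_run:,}")             # 53,544,960
print(f"{numbers_per_participant:,}")     # 535,449,600
```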

We also collected behavioural data, which included the timing of button presses during a run.

##### Derived data

The fMRI volumes were subjected to standard pre-processing routines to correct for slice acquisition timing and head motion, producing new data volumes.
They were then projected onto the cortical surface by averaging the signals arising from within the grey matter, producing one number per surface node for each participant, hemisphere, and timepoint.

Single-participant general linear model (GLM) analysis on the surface data produced regression weights and statistical estimates for each node.
These were then averaged within sub-regions of visual areas, as shown below, and converted into a percentage signal change measure.

<img src="https://dfzljdn9uc3pi.cloudfront.net/2015/1038/1/fig-2-2x.jpg" alt="Flat map" style="width: 400px;"/>

The input into the group statistical analysis was a 30 (images) x 2 (visual field locations) x 2 (source locations) x 3 (visual areas) x 7 (participants) set of numbers.
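
Compared with the raw data, this is a very small set of numbers. A quick sketch of the count, using dimension names of our own choosing:

```python
from math import prod

# Size of each dimension of the input to the group analysis.
design = {
    "images": 30,
    "visual_field_locations": 2,
    "source_locations": 2,
    "visual_areas": 3,
    "participants": 7,
}

n_values = prod(design.values())
print(n_values)  # 2520
```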


### Example 2

Peterson, L.M., Kersten, D.J., & Mannion, D.J. (2018) [Surface curvature from kinetic depth can affect lightness.](http://www.djmannion.net/docs/djm_sfm_lightness_web.pdf) *Journal of Experimental Psychology: Human Perception & Performance, 44(12)*, 1856-1864.

##### Area

This article relates to visual perception.

##### Summary

Our perception of an object's surface reflectance (the shade of grey of the object) is influenced by how we perceive the shape of the object's surface.
In this study, we tested whether dynamic cues to surface shape can affect our judgement of surface reflectance.
In a behavioural experiment, participants were shown images of two surfaces (which, depending on the experimental condition, contained dynamic shape cues) and asked to compare the brightness of two patches embedded within the surfaces.
We expected that the patches would be perceived differently when the surfaces appeared to be flat and perceived similarly when the surfaces appeared to be curved.


##### Pre-experiment data

We rendered our stimuli using a program called Mitsuba, controlled via Python.
The stimuli were two abutting cylinders that were either stationary or rotating, depending on the experimental condition.
For the rotating cylinders, we generated a sequence of 180 images which captured one cycle of rotation.

At the centre of each cylinder was a small patch; the reflectance of one of these patches varied throughout the experiment (the 'comparison patch').
On each trial, the reflectance of the comparison patch was chosen from 201 possible values based on a participant's responses on previous trials.

We also needed to add surface texture to the cylinders.
To do this, we generated a set of textures containing black dots which varied randomly in their size, position, orientation, and aspect ratio.

We ended up with 180 (cylinder rotation) X 201 (comparison patch reflectances) X 11 (surface textures) images.
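
One way to think about this image set is as every combination of a rotation frame, a reflectance level, and a texture. The sketch below enumerates those combinations as index triples and counts them; the zero-based indexing scheme is an assumption for illustration, not a description of how the files were actually named or stored.

```python
from itertools import islice, product

n_rotation_frames = 180
n_reflectances = 201
n_textures = 11

# Total number of images: one per combination of the three factors.
n_images = n_rotation_frames * n_reflectances * n_textures

# Lazily enumerate (frame, reflectance, texture) index triples and
# look at the first few without building the whole list in memory.
combinations = product(
    range(n_rotation_frames), range(n_reflectances), range(n_textures)
)
first_three = list(islice(combinations, 3))

print(f"{n_images:,}")  # 397,980
print(first_three)      # [(0, 0, 0), (0, 0, 1), (0, 0, 2)]
```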


##### Raw data

At the end of each participation session, we had a 544 row X 27 column data table.

Each session had 544 trials in total.
For each trial, we recorded the reflectance value of the comparison patch and whether the participant responded that they perceived the comparison patch as brighter than the reference patch.

The raw behavioural data also included response time, details of the stimulus presented on each trial (e.g. whether the surfaces were moving and/or had surface texture), the participant's ID and the experimental condition they were assigned to, and whether the trial was a 'catch' or practice trial.
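
A table like this is often stored as one row per trial in a CSV file. The sketch below writes such a table to an in-memory buffer using Python's `csv` module. The column names are hypothetical and cover only a handful of the 27 columns, and every row holds the same placeholder values, so it shows the shape of the data rather than the real contents.

```python
import csv
import io

N_TRIALS = 544  # trials per session

# Hypothetical column names; the real table had 27 columns.
fieldnames = [
    "participant_id",
    "condition",
    "trial_number",
    "trial_type",
    "comparison_reflectance",
    "responded_brighter",
    "response_time_s",
]

# Placeholder rows: one dictionary per trial.
rows = [
    {
        "participant_id": "p01",
        "condition": "A",
        "trial_number": trial_number,
        "trial_type": "experimental",
        "comparison_reflectance": 0.5,
        "responded_brighter": True,
        "response_time_s": 0.8,
    }
    for trial_number in range(1, N_TRIALS + 1)
]

# Write a header line followed by one line per trial.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)

lines = buffer.getvalue().splitlines()
print(len(lines))  # 545 (header plus 544 trials)
print(lines[0])
```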

##### Derived data

Using the raw behavioural data, we generated psychometric functions for each participant.
The psychometric function related the proportion of trials in which a participant judged the comparison patch as brighter than the reference to the reflectance of the comparison patch.
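
The raw ingredients of a psychometric function can be sketched as follows: group the trials by comparison reflectance and compute the proportion of 'brighter' responses at each level. The trial data here are invented, and the sketch stops at the raw proportions; it does not attempt the curve fitting that a full analysis would typically involve.

```python
from collections import defaultdict

# Hypothetical trials: (comparison patch reflectance, responded 'brighter'?).
trials = [
    (0.2, False), (0.2, False), (0.2, True), (0.2, False),
    (0.5, True), (0.5, False), (0.5, True), (0.5, False),
    (0.8, True), (0.8, True), (0.8, True), (0.8, False),
]

# Collect the responses made at each reflectance level.
responses_by_reflectance = defaultdict(list)
for reflectance, brighter in trials:
    responses_by_reflectance[reflectance].append(brighter)

# Proportion of 'brighter' responses at each level (True counts as 1).
proportion_brighter = {
    reflectance: sum(responses) / len(responses)
    for reflectance, responses in sorted(responses_by_reflectance.items())
}

print(proportion_brighter)  # {0.2: 0.25, 0.5: 0.5, 0.8: 0.75}
```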

The raw data for each participant (480 trials, excluding catch and practice trials) was summarised so that each participant had a single score.
This score represented the extent to which perception of the comparison and reference patches differed.
The average score was calculated for each of the four experimental conditions and these average scores were used for the planned contrasts analysis.