Last week I read an interesting post on validity, and this week I am going to talk about its brother, reliability. So, why is reliability important? It’s easy when learning about scientific research and writing to feel like we’re given a whole set of rules and standards to follow and stick to, extra bits to think about and boxes to tick, when what we really want to be doing is researching psychology.
Ask Joe Bloggs on the street why reliability matters, and he’ll probably say something like: “Because if your results aren’t reliable then they’ll be wrong.” And he’s right. (Do you not agree?) But are these methods helping us? Or are they a psychological form of ‘political correctness gone too far’?
Reliability and validity go hand in hand to ensure the results of an experiment are trustworthy, realistic and correctly obtained. Validity is defined as the “extent to which a measure assesses what it is claimed to measure” (Howitt & Cramer, 2008, p. 261), whereas reliability concerns consistency across different times or circumstances. An experiment could produce results that are valid, and therefore correctly measured, and that could help us draw a conclusion; however, they might still not be reliable.
Reliability tells us that if an experiment produces results that support hypothesis A one week, and another experiment, either with a different sample or with a similar (if not the same) sample at a different time, then contradicts hypothesis A, the results aren’t very reliable, and there is therefore insufficient evidence to draw a conclusion.
Reliability in psychology is often measured using statistical methods. ‘Internal reliability’ refers to how consistently the items on a scale measure the concept in question. If the data are reliable, then theoretically any one item will give the same result as any other item, or indeed as all the items together. One method is ‘split-half reliability’, where the items are separated into a first and a second half, each participant’s score on each half is totalled, and the Pearson correlation between the two sets of scores is calculated. Other statistics, such as the ‘Spearman-Brown formula’ and ‘Guttman reliability’, are also used.
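To make the split-half idea a bit more concrete, here is a rough sketch in Python. The questionnaire scores are made up purely for illustration, and the function names (`pearson`, `split_half_reliability`) are my own rather than anything from a textbook or library. It splits each participant's answers into a first and second half, correlates the two half-scores, and then applies the Spearman-Brown formula, 2r / (1 + r), to estimate the reliability of the full-length scale.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: this divides by zero if either list has no variation at all.
    return cov / sqrt(var_x * var_y)


def split_half_reliability(responses):
    """responses holds one list of item scores per participant.

    Returns the correlation between first-half and second-half totals,
    plus the Spearman-Brown corrected estimate for the whole scale.
    """
    half = len(responses[0]) // 2
    first_totals = [sum(row[:half]) for row in responses]
    second_totals = [sum(row[half:]) for row in responses]
    r = pearson(first_totals, second_totals)
    corrected = 2 * r / (1 + r)  # Spearman-Brown formula
    return r, corrected


# Hypothetical data: five participants answering a six-item scale (1 to 5).
responses = [
    [4, 5, 4, 5, 4, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 1, 2, 1],
]

r, corrected = split_half_reliability(responses)
print(f"split-half r = {r:.3f}, Spearman-Brown corrected = {corrected:.3f}")
```

The correction is there because each half is only half as long as the real scale, and shorter scales tend to look less reliable than longer ones. In practice you would probably use a statistics package rather than hand-rolled functions like these.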
More practically, tests can be repeated, either as a simple repeat (‘test-retest reliability’) or in a different form (‘alternate forms reliability’). However, repeating a test can itself adversely affect the results, since the circumstances of participants may change, or memories of the first test can affect how participants handle the second test. Alternate forms reliability attempts to overcome the latter problem by using a slightly different test, which resolves the issue to some extent.
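As a small illustration of test-retest reliability, the sketch below correlates the same participants' scores from two sessions and also reports the average change between sessions. The scores are invented, and `retest_reliability` is just a name I have picked for the example. The average change is included because a high correlation only tells us that people kept their rank order; if everyone improved by a similar amount (perhaps because they remembered the first test), the correlation alone would not show it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def retest_reliability(first_session, second_session):
    """Return (correlation between sessions, mean change in score).

    Both lists must hold the same participants in the same order.
    """
    r = pearson(first_session, second_session)
    changes = [b - a for a, b in zip(first_session, second_session)]
    mean_change = sum(changes) / len(changes)
    return r, mean_change


# Hypothetical scores for six participants tested twice, two weeks apart.
week_1 = [12, 18, 25, 9, 30, 21]
week_3 = [14, 19, 27, 12, 31, 24]

r, mean_change = retest_reliability(week_1, week_3)
print(f"test-retest r = {r:.3f}, mean change = {mean_change:.2f}")
```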
Internal reliability still works hand in hand with these practical methods of ensuring reliability. For example, if the value calculated for internal reliability after a repeat test is different from that of the original test, we can conclude that the results may not be reliable.
While statistics can seem lifeless, dull and uninteresting, we can see here how mathematical formulas can compensate where practical work falls short, and vice versa. Obviously, results must be reliable, and here we have a selection of methods that, when used in conjunction with our scientific judgement, can help us ensure both the validity and the reliability of our research.