By Gordon Rugg
Card sorts aren’t as widely known as they should be. They’re a neat, efficient method for finding out about how people categorise their world. They also make it possible to investigate topics that are hard to describe in words, which is particularly useful when you’re investigating perceptions of visual items such as web pages and physical artefacts.
This article is an introductory tutorial; there are links to further reading in the notes at the end. If you’re encountering problems when using interviews or questionnaires or focus groups, then you might find that using card sorts will give you new insights. Card sorts are popular with research participants and with novice researchers, because the procedures are easy to grasp; for experienced researchers, card sorts are a powerful, flexible tool.
There are various versions of card sorts, most of which have been around for decades. The core concept is that each card contains a picture of something, or a name of something, or a brief description of something. (You can also use the items themselves, rather than pictures of them.)
The participant’s task is to sort the cards into groups or a pattern, depending on the variety of card sorts involved.
Varieties involving a single sort
Several varieties of card sorts ask the participant to sort the cards only once. This has some advantages and some disadvantages. I’ll discuss these approaches briefly, before moving on to the variety that’s described in this article.
Typically the “single-sort” varieties of card sorts involve asking the participants to sort the cards into a specified pattern. For instance, in Q sorts, each card contains a different statement, and the participant has to rank these with the statements that they most strongly agree with at one end, the ones they most strongly disagree with at the other end, and the intermediate ones at appropriate points in between.
Other approaches are to ask the participants to sort cards into a two-dimensional array, or into groups of the participant’s choice, or into groups named by the researcher.
With these “single-sort” varieties of card sorts, you can then use powerful statistics to analyse the distributions and look for patterns.
The variety described here
The version that I usually use involves getting the participants to sort the cards into groups of their own choice, numerous times, using a different criterion for each sort. This continues until they run out of criteria for sorting, or for a predetermined time, whichever comes first. It’s a version pioneered by various researchers, such as Gammack.
This variety gives more flexibility than the “single-sort” varieties, both with regard to clarifying how the participants are mentally categorising the items depicted by the cards, and also with regard to analysis of the results.
This variety also has the advantage of being well documented online, including tutorial articles, and an online MSc thesis by Andy Hurd which gives an excellent example of how the method is used, including analysis of the results, and appendices with a range of useful templates and materials.
Preparation and procedure
The first step is to decide what will go on the cards. Most researchers use between about eight and twenty cards – many fewer, and it can get a bit silly; many more, and it can become unmanageable. The cards should all be at about the same semantic level, so that the participants are comparing like with like: for instance, it would usually be fine to have cards with names like “bicycle” and “car”, but not to have one card named “vehicle” and another named “Y registration Ford Mondeo hatchback”.
Once this has been done, you need to number each card to make recording easier; we usually put the number on the front of the card in the top right corner. We’ve used cards of various sizes, from playing card size up to A4 screenshots of web sites, without significant problems. It can be more challenging if you’re getting participants to sort items (e.g. mineral samples or engine parts) rather than cards, if the items are bulky, fragile or messy. An advantage of using images or physical items is that participants can usually sort them into groups, even if the participant doesn’t know exactly what the item is, or what it’s called.
Here’s an example, from a pack of images of drinking vessels.
It’s a modern replica of a classical Roman glass. Most participants wouldn’t know that, but they’d still be able to see that it’s made of glass, and that it’s decorated, etc, so they’d be able to sort it on those criteria if they wanted.
Instructions and demonstration
The next step is to put together instructions for the participants, explaining what they need to do, and emphasising that there are no right or wrong answers. There’s a copyleft set of instructions in the Rugg & McGeorge tutorials that you can adapt under the usual copyleft conditions, and a set in Andy Hurd’s thesis.
We normally supplement the instructions with a brief demonstration of card sorts from a domain that won’t suggest categories relevant to the domain you’re investigating. For instance, if you’re investigating people’s categorisations of men’s clothing, you could demonstrate card sorts with the domain of car types.
The data collection
The participant now has to sort the cards into groups of their own choice – as many or as few groups as they like, including “don’t know” and “not applicable” if they wish. They should only use one criterion at a time, so if they try using “big and expensive”, you should head them off and check whether they could sort once on the criterion of size, and then sort again on the criterion of cost. This quite often happens on the first sort, but usually participants have no problems after you’ve clarified this point.
Here’s what the cards might look like after being sorted. These cards are in three groups. The illustration shows why it’s a good idea to number the cards at the top; this makes it easy to see which cards are in which group when the cards are piled on top of each other.
Once the participant has sorted the cards into groups, you then ask the name of each group, and record the numbers of the cards in each group. It’s important to use the participant’s own words for the names of the groups, even if these sound inconsistent or wrong to you; often, the participants are making important distinctions that are reflected in unusual wording.
You then ask what the criterion was for the sorting (there’s more detail about this below).
Here’s what the results so far would look like in your recording sheet.
The record sheet has participant and session information at the top.
The next piece of information is the sort number, together with the criterion that the participant used for sorting in that sort. It’s a good idea to record the information in this way, because it makes the analysis easier later on – you can simply read off the number of sorts performed by that participant in that session.
Finding out what the criterion was for sorting isn’t always straightforward. For instance, if the groups were “small”, “medium” and “large”, then the participant might say that the criterion was “size”, but they might equally well call it something quite different. It is extremely unwise to make assumptions about what the participant will call the criterion, or to suggest names: the whole point of card sorting is to find out what criteria the participants use, and what they call them, not what you would call them.
It’s highly advisable to ask for the group names first and the criterion second. For some reason, if you do it the other way round, participants quite often change their minds and rearrange the cards, whereas if you do it with the groups first and the criterion second, this hardly ever happens.
The next set of information after the sort number on the recording sheet is the name of each group, in the participant’s own exact words.
Below the name of each group, you write the numbers of the cards that the participant put in that group. It’s a good idea to ask the participants to call out the numbers to you for each group, telling them the name of the group whose numbers you want. This is useful as a double check that you have the correct cards in the correct group; sometimes participants tell you the group names right-to-left, for instance, which can easily lead to chaos with the transcription. Asking participants to give you the numbers also makes the session more interactive – a lot of participants like helping in this way.
Each card number is followed by a comma and a space, so that you can distinguish between “1, 2” and “12”.
Each set of numbers is separated with a hash mark (particularly useful if your pack contains a lot of cards, making your recording cramped).
In practice, we find it easier to do the recording onto blank sheets of paper, rather than using pre-printed forms. This gives more flexibility for when participants change their minds, or when you want to make a note to yourself about something that cropped up in the session.
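If the paper record is later typed up for analysis, the same conventions (a comma and space after each card number, a hash mark between groups) are easy to read by program. The following is only a minimal sketch in Python. The one-line layout `group name: numbers # group name: numbers`, the function names and the card numbers in the example are all assumptions made for illustration, not part of the published method.

```python
def parse_sort(criterion, line):
    """Parse one typed-up sort into a dictionary.

    Assumed (hypothetical) layout: 'group name: 1, 4, # other group: 2, 3,'
    Group names are kept exactly as the participant gave them.
    """
    groups = {}
    for chunk in line.split("#"):
        chunk = chunk.strip()
        if not chunk:
            continue
        # Split on the last colon, so a group name may itself contain a colon.
        name, _, numbers = chunk.rpartition(":")
        cards = [int(n) for n in numbers.split(",") if n.strip()]
        groups[name.strip()] = cards
    return {"criterion": criterion, "groups": groups}


def check_sort(sort, pack_size):
    """Return (missing cards, cards recorded more than once) for one sort."""
    seen = [card for cards in sort["groups"].values() for card in cards]
    missing = sorted(set(range(1, pack_size + 1)) - set(seen))
    duplicated = sorted({card for card in seen if seen.count(card) > 1})
    return missing, duplicated


# Hypothetical transcription of a sort on "material" for a pack of eight cards.
example = parse_sort(
    "material",
    "glass: 1, 4, # ordinary pottery: 2, 3, 6, # stoneware: 5, 7, 8,",
)
print(example["groups"])
print(check_sort(example, pack_size=8))
```

The check is the program equivalent of asking the participant to call the numbers out: it flags any card that was left out of the record or written down twice.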
Recording the next sorts
Once you’ve recorded the first sort, you get the participant to re-sort using a different criterion. In this example, they decide to use the criterion of material.
Your record sheet would then look like this.
We’ve included the “Sort 3:” line as an indication that this is a session in progress, not a completed one.
One of the group names in Sort 2 is “ordinary pottery”. This is a classic example of where trying to change the participant’s wording would have led to introducing an error into the data. The participant here is making a distinction between stoneware and other types of pottery, which is a valid distinction based on technical knowledge about ceramics. Experts often have to make awkward decisions about terminology when explaining their knowledge to non-experts, and will sometimes use phrasing that’s designed to be more accessible to non-experts, but which may sound strange when first encountered.
After you’ve recorded the second sort, you repeat the sort-and-record procedure until the participant runs out of criteria for sorting.
Participants often reach a point where they report not being able to think of more criteria, but where it’s clear that they are having trouble articulating further criteria, rather than having exhausted their mental stock of criteria. When this happens, it’s worth trying to help them to generate some more criteria. One method is to show them a couple of randomly chosen cards, and ask what the main single difference is between them; another is to show them three cards and ask which is the odd one out, and why; another is to show them a card and ask them to describe it, to see whether any of the descriptors could be used for sorting.
What quite often happens is that the participant then thinks of two or three more criteria, uses these, and then says that there aren’t any more worth bothering with.
Typically, non-experts will perform a handful of sorts, and experts will perform somewhere between a handful of sorts and a dozen sorts, though these numbers are only rough approximations. Card sorts sessions usually take about forty minutes until the participant runs out of criteria (and less time if they only use a few criteria). Again, this figure is only approximate, but the key thing is that card sorts sessions normally reach a natural end in a reasonable amount of time, which normally fits within the usual one-hour slot that researchers often have to work within when doing data collection.
At the end of the sessions, you have a highly structured set of data which you can analyse in different ways.
Most of the research performed with this variety of card sorts uses a standard set of analysis procedures, including qualitative analysis, content analysis, and quantitative analysis. Andy Hurd’s thesis is a good example of this. Novice researchers usually find this standard set of analyses helpful because it provides a clear, well-defined set of basic steps. Expert researchers usually find the same set useful because it also includes several sophisticated forms of analysis, including qualitative analysis and multidimensional statistical analysis.
In terms of qualitative analysis, you can simply look at which criteria and categories are being mentioned, to identify what is salient to the participants, and also to identify what’s not being mentioned. The things that aren’t being mentioned are often significant absences, where a topic is either being actively excluded from the discussion, or where it hasn’t crossed the participant’s mind as something that’s feasible.
In terms of qualitative and quantitative analysis, you do content analysis on the criteria used (the group names are often meaningless without the criterion name: for instance, “high”, “medium” and “low”). Since the criteria are usually very terse phrases, you get a good signal to noise ratio, and the content analysis is a lot cleaner than with natural language transcripts, while still being in the participants’ own words. The degree of verbatim agreement in names of criteria can give you some useful insights into the degree of consensus within a domain, and the actual criteria used are often surprising, which can lead to some useful insights into the domain.
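As a rough illustration of the verbatim-agreement count, here is a minimal Python sketch that tallies how many participants used each criterion name. The participant labels and criterion names are invented for the example, and the decision to ignore differences in capitalisation and surrounding spaces is an assumption; for some studies even those differences may matter.

```python
from collections import Counter


def criterion_agreement(criteria_by_participant):
    """Count how many participants used each criterion name.

    Names are compared verbatim apart from case and surrounding spaces
    (an assumption); each participant counts at most once per name.
    """
    counts = Counter()
    for criteria in criteria_by_participant.values():
        counts.update({name.strip().lower() for name in criteria})
    return counts


# Hypothetical criterion names from three participants.
data = {
    "P1": ["material", "size"],
    "P2": ["Material", "age"],
    "P3": ["material", "size", "how fancy it is"],
}
for name, n in criterion_agreement(data).most_common():
    print(f"{name}: used by {n} of {len(data)} participants")
```

Anything beyond this light tidying, such as grouping “cost” with “price”, is a content analysis decision and is better made by a person than by code.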
Content analysis is a big topic, which we’ll cover in more detail in a later blog. There’s a chapter about it in A Gentle Guide to Research Methods, and it’s covered in numerous standard texts, so for brevity, I’m assuming that readers will either already be familiar with it, or will easily be able to find more detailed information about it.
In terms of quantitative analysis, there are various standard analyses.
One analysis is to count the number of criteria used by the different participants, to see whether one group uses a different number of criteria from the other (for instance, experts usually sort using more criteria than novices).
A second analysis is to count the number of groups used by each participant within each sort. This is often a proxy for the participants’ level of expertise, with experts tending to sort the cards into multiple groups each time, whereas novices tend to sort the cards into only two groups each time.
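These two counts can be read straight off the record sheets, but with more than a few participants a short script saves time. The sketch below assumes each session has been typed up as a list of sorts, each with a criterion and a dictionary of group names to card numbers; that structure, the function name and the example data are hypothetical.

```python
from statistics import mean


def summarise(sessions):
    """Per participant: number of criteria used and number of groups in each sort.

    sessions maps a participant label to a list of sorts, where each sort is
    {"criterion": str, "groups": {group name: [card numbers]}} (assumed layout).
    """
    summary = {}
    for participant, sorts in sessions.items():
        group_counts = [len(sort["groups"]) for sort in sorts]
        summary[participant] = {
            "n_criteria": len(sorts),
            "groups_per_sort": group_counts,
            "mean_groups": mean(group_counts) if group_counts else 0,
        }
    return summary


# Hypothetical data: one participant with three sorts, one with two.
sessions = {
    "expert_1": [
        {"criterion": "material", "groups": {"glass": [1, 4], "ordinary pottery": [2, 3], "stoneware": [5, 6]}},
        {"criterion": "period", "groups": {"Roman": [1], "medieval": [2, 5], "modern": [3, 6], "don't know": [4]}},
        {"criterion": "decoration", "groups": {"decorated": [1, 2, 3], "plain": [4, 5, 6]}},
    ],
    "novice_1": [
        {"criterion": "size", "groups": {"big": [1, 2, 3], "small": [4, 5, 6]}},
        {"criterion": "looks", "groups": {"nice": [1, 4], "not nice": [2, 3, 5, 6]}},
    ],
}
for participant, row in summarise(sessions).items():
    print(participant, row)
```

Comparing the resulting numbers between groups of participants is then an ordinary statistical question, and the choice of test depends on the study design.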
More advanced quantitative analysis includes various forms of statistical analysis derived from how often each card co-occurs in the same group as each other card. This can be done either on the basis of statistical distances between cards or between criteria for sorting the cards.
Statistical distance between the cards can be shown in a co-occurrence matrix, as in the Martine & Rugg article in the references below. Since this is based purely on how often each card occurs in the same group as each other card, you can directly compare data collected in different cultures – for instance, data collected from Egyptian and from Chinese participants – without needing to translate any of the wording from the data collection.
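A co-occurrence matrix of this kind can be built with a few lines of code. The following Python sketch counts, for every pair of cards, how many sorts placed them in the same group, using only the card numbers and ignoring the wording entirely. It is a minimal illustration with invented data and function names, and is not claimed to reproduce the exact procedure in the Martine & Rugg article.

```python
from itertools import combinations


def co_occurrence(sorts, pack_size):
    """Build a symmetric matrix of how often each pair of cards shares a group.

    sorts is a list of sorts, each a dictionary {group name: [card numbers]};
    sorts from several participants can simply be pooled into one list.
    Cards are numbered from 1, so card n is row/column n - 1.
    """
    matrix = [[0] * pack_size for _ in range(pack_size)]
    for groups in sorts:
        for cards in groups.values():
            for a, b in combinations(sorted(set(cards)), 2):
                matrix[a - 1][b - 1] += 1
                matrix[b - 1][a - 1] += 1
    return matrix


def percent_together(matrix, a, b, n_sorts):
    """Percentage of sorts in which cards a and b were placed in the same group."""
    return 100.0 * matrix[a - 1][b - 1] / n_sorts


# Hypothetical pooled data: three sorts of a four-card pack.
sorts = [
    {"glass": [1, 4], "pottery": [2, 3]},
    {"small": [1, 2], "large": [3, 4]},
    {"old": [1, 4], "new": [2, 3]},
]
matrix = co_occurrence(sorts, pack_size=4)
for row in matrix:
    print(row)
print(round(percent_together(matrix, 1, 4, len(sorts)), 2))
```

Because the group names never enter the calculation, the same function works unchanged on data collected in different languages.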
Statistical distance between the criteria can be shown using a minimum edit distance measure, as in the Deibel et al reference below.
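One common way to define an edit distance between two sorts of the same pack is the smallest number of cards that would have to be moved from one group to another to turn the first sort into the second, ignoring the group names. The Python sketch below computes that by trying every way of pairing up the groups. It is an illustration under that assumed definition, with invented data, and should not be taken as the algorithm used in the Deibel et al article.

```python
from itertools import permutations


def sort_edit_distance(sort_a, sort_b):
    """Minimum number of cards that must change group to turn sort_a into sort_b.

    Each sort is {group name: [card numbers]}; both must cover the same cards.
    Group names are ignored. Brute force over group pairings, so only suitable
    for sorts with a handful of groups.
    """
    a = [set(cards) for cards in sort_a.values()]
    b = [set(cards) for cards in sort_b.values()]
    # Pad the sort with fewer groups with empty groups so they can be paired up.
    size = max(len(a), len(b))
    a += [set()] * (size - len(a))
    b += [set()] * (size - len(b))
    n_cards = sum(len(group) for group in a)
    # The best pairing of groups leaves the largest number of cards in place.
    best_overlap = max(
        sum(len(ga & gb) for ga, gb in zip(a, pairing))
        for pairing in permutations(b)
    )
    return n_cards - best_overlap


# Hypothetical sorts of a five-card pack on two different criteria.
by_material = {"glass": [1, 4], "pottery": [2, 3, 5]}
by_size = {"small": [1, 2], "large": [3, 4, 5]}
print(sort_edit_distance(by_material, by_size))
```

A distance of zero means that two criteria divided the pack in exactly the same way, even if the participants gave the criteria and groups different names. For sorts with many groups, the brute-force pairing would need to be replaced by a proper assignment algorithm.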
Further forms of analysis are also possible, and we’ll write about those in later articles.
Card sorts have various advantages over related techniques such as repertory grids; for instance, they can handle nominal values for criteria such as “colour” easily, which repertory grids can’t. Most researchers who use card sorts find them a clean, simple, powerful method, and most participants enjoy performing card sorts.
Card sorts also have limitations. They have been described as “a tease” because they tend to raise a lot of further questions which need to be followed up – classic questions include “why did they use that criterion?” and “what do they mean by that term?” Most of these questions are well suited to laddering, so most researchers who use card sorts also consider laddering sooner or later.
There’s a downloadable tutorial article on laddering here:
We’ll be writing about laddering in later articles on this blog.
You can put some interesting things onto cards. Some examples:
- scenarios, e.g. involving people’s categorisation of risks or problems
- personality theory topics, using cards such as “myself now” and “myself as I would like to be”
- screenshots of Web sites
- pictures of people
- pictures of products
You can also do various forms of meta-analysis via card sorting, such as asking participants or other researchers to sort the criterion names used by other participants, or by themselves, in a previous card sorting session.
Some examples of how card sorts have been used:
- Sue Gerrard used card sorts to investigate perceptions of women’s working dress
- Linda Upchurch used them to investigate Web page quality metrics
- Satvere Sanghera used them to investigate perceptions of software project risks
- Andy Hurd used card sorts to investigate cross-cultural differences in perceptions of Web sites
There is plenty of scope for other applications. It’s well worth trying this technique; it’s powerful, simple and flexible, and most researchers who have tried it find it useful and pleasant to use, as well as giving fascinating new insights into how people view their world.
There’s various material about card sorts on my Keele website here:
This includes Andy Hurd’s MSc thesis, which is an excellent template for using card sorts:
There’s more about card sorts, and about elicitation methods in general, in my book with Marian Petre, A Gentle Guide to Research Methods. It contains worked examples of using each method, as well as guidance on which methods to use for which purposes.
There’s a description of it here:
It’s available on Amazon here:
Some references and further reading
There is a special issue on card sorts, guest edited by Sally Fincher and Josh Tenenberg, in the July 2005 issue of Expert Systems: the International Journal of Knowledge Engineering and Neural Nets (issue 22(3)).
There’s an online example of statistical analysis of card sorts here:
Using edit distance to analyze card sorts: Article by Deibel, Anderson & Anderson
Curran, M.J., Corr, S. & Rugg, G. (2005)
Attitudes to expert systems: a card sort study.
The Foot, December 2005, Volume 15, Issue 4, pp. 190-19
This is available online via ScienceDirect: http://authors.elsevier.com/sd/article/S095825920500060X
Rugg, G. & McGeorge, P. (2005)
The sorting techniques: a tutorial paper on card sorts, picture sorts and item sorts.
Expert Systems, 22(3) (NOTE: this is a reprint of our 1997 paper in Expert Systems)
Gerrard, S. & Dickinson, J. (2005)
Women’s working wardrobes: a study using card sorts
Expert Systems 22(3), pp.108-114
Martine, G. & Rugg, G. (2005)
That site looks 88.46% familiar: quantifying similarity of Web page design
Expert Systems 22(3), pp.115-120
Upchurch, L., Rugg, G. & Kitchenham, B. (2001)
Using Card Sorts to Elicit Web Page Quality Attributes.
IEEE Software, 18(4) 2001, pp. 84-89
Rugg, G. & McGeorge, P. (1999)
The concept sorting techniques.
The Encyclopedia of Library and Information Science, volume 65, supplement 28, pp. 43-71
Marcel Dekker, Inc, New York
Rugg, G. & McGeorge, P. (1997)
The sorting techniques: a tutorial paper on card sorts, picture sorts and item sorts.
Expert Systems, 14(2), pp 80-93
Rugg, G., Corbridge, C., Major, N.P., Burton, A.M. & Shadbolt, N.R. (1992)
A comparison of sorting techniques in knowledge elicitation.
Knowledge Acquisition, 4(3), pp. 279-291
This article is an updated and expanded version of an introductory tutorial on my Keele website here: http://www.scm.keele.ac.uk/research/knowledge_modelling/km/Blogs/Card_Sorts.php