In this paper, Pirolli and Card present a model of the “sensemaking loop”: the process by which analysts gather, filter, understand, and hypothesize about information. To build this model, they used cognitive task analysis and think-aloud protocol analysis to identify the techniques and processes the analysts were using. However, the authors provide no further description of the methodology used in the study. This is particularly unfortunate because cognitive task analysis does not refer to a single specific technique, but rather to a suite of approaches aimed at understanding the mental processes that go into a task.
Methodological questions aside, the model of the sensemaking loop that the authors derived is an interesting description of the process. It is composed of a series of interlocking loops, through which the data becomes increasingly structured. At one end of the structure spectrum are the external data sources: the things an analyst collects to analyze. At the other end is the presentation of hypotheses: the narratives the analyst has constructed about the data. Perhaps the most interesting point between these two extremes is the schema, a structure the analyst uses to organize the data. That structure might be internal to the analyst’s mind, or it might be external: a sketch the analyst has made, a map of data connections, or even a more formal tool like a computer visualization of the data.
The entire process, from external data sources to presentation, falls within two loops. The first, the foraging loop, “involves processes aimed at seeking information, searching and filtering it, and reading and extracting information”. The second, the sensemaking loop, involves iterative development of a mental model that best fits the evidence. These loops can be traversed from bottom to top, as we typically think of the analysis process operating, or from top to bottom, as when new analysis calls a previous hypothesis into question.
The authors identify several areas in which new tools or processes could improve the sensemaking process. In the foraging loop, they identify a tradeoff between exploration, enrichment, and exploitation. Exploration is the process of finding new information to analyze. Enrichment is the process of narrowing the set of items to be analyzed to those that are most relevant and useful. Exploitation is the process of reading through the collected and enriched information and extracting useful patterns from it. The authors write, “It will generally be desirable to explore as much of the information space as possible (because there may be a cost to missing something novel in the data), but this comes at the cost of having to actually work through the material and eventually exploit it.” This is an important trade-off to evaluate when doing research: if you are designing a marketing campaign for a breakfast cereal company and you miss something in the data, the cost of the omission will likely be relatively low. But if you are trying to figure out where a terrorist organization will strike next, missing something novel could be disastrous. The cost of missing something is therefore an important consideration when scoping a project, and it must be weighed against the cost of providing incomplete or inaccurate analysis.
Another leverage point the authors identify in the foraging loop is the amount of time spent scanning, assessing, and selecting items for further attention. They suggest developing techniques for reducing the cost of this task by highlighting important information (names, numbers, locations, etc.) with “pre-attentive codings” or by “re-representing documents”, such as by summary. These are both processes that ethnographers have developed to cope with the large volume of qualitative information they handle. When writing field notes, we apply “tags” or “codes” to the documents so that we can easily find all of our field notes in which, for instance, sloth bathing was discussed by 25-year-old Spanish speakers. (In that case we would be looking for the combination of tags “sloth bathing”, “25 year old”, and “Spanish”.) Summarizing documents is actually a technique that my friend Dhruv taught me: at the top of every set of field notes he makes, there is a short summary of the document, with the relevant details about the person or people he was interviewing or observing. Beneath that is generally a bulleted list of the highlights from the interaction. This kind of multi-pass processing makes the exploitation phase easier later.
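The tag lookup described above amounts to a simple set-intersection query. Here is a minimal sketch in Python; the note titles and tags are invented for illustration:

```python
# Hypothetical field notes, each carrying a set of descriptive tags.
notes = [
    {"title": "Riverbank observation",
     "tags": {"sloth bathing", "25 year old", "Spanish"}},
    {"title": "Market interview",
     "tags": {"food vendors", "Spanish"}},
    {"title": "Canopy survey",
     "tags": {"sloth bathing", "biologist"}},
]

def find_notes(notes, wanted):
    """Return every note whose tags include all of the wanted tags."""
    wanted = set(wanted)
    return [n for n in notes if wanted <= n["tags"]]

matches = find_notes(notes, {"sloth bathing", "25 year old", "Spanish"})
```

Only the first note carries all three tags, so it is the only match. The subset test (`wanted <= n["tags"]`) is exactly the "all of these codes at once" query ethnographers run by hand.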
I also see this leverage point as an opportunity for data mining. Much of the kind of assessment the authors discuss could be easily automated; one could simply ask the algorithm or tool to suggest information that is relevant to the information you are currently looking at, or to a particular set of concepts passed as parameters.
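As a toy illustration of that "suggest related items" idea, one could rank documents by word overlap with the item currently in focus. A real tool would use TF-IDF weighting or embeddings; this sketch uses bare word sets, and all the documents are made up:

```python
def tokenize(text):
    """Split text into a set of lowercase words (a crude stand-in for real tokenization)."""
    return set(text.lower().split())

def suggest_related(focus, corpus, top_n=2):
    """Rank corpus documents by shared-word count with the focus text."""
    focus_words = tokenize(focus)
    scored = [(len(focus_words & tokenize(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents with no overlap at all.
    return [doc for score, doc in scored[:top_n] if score > 0]

corpus = [
    "the sloth bathes in the river",
    "market vendors sell fruit",
    "river bathing habits of sloths",
]
suggestions = suggest_related("sloth bathing in the river", corpus, top_n=1)
```

The "concepts passed as parameters" variant is the same idea with `focus` replaced by a hand-picked set of keywords.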
The authors also identify starting a new task in the loop or performing follow-up searches as areas that could be leveraged.
In the sense making loop, many of the leverage points identified are associated with well-known cognitive biases, such as confirmation bias. The first point they discuss is the limited attention span for evidence and hypotheses. Human working memory is limited, and that places limits on how much evidence someone can consider at once. This can, however, be alleviated by pushing information patterns onto “external memory” like visual displays.
Another point the authors identify is in the generation of hypotheses. Humans are fabulous at fitting data into patterns they are familiar with; it’s one of the reasons we are so good at recognizing faces. However, that also means we are biased towards interpreting information using existing frameworks. In their analysis, the authors found that people failed to generate enough hypotheses: “Time pressures and data overload work against the individual analyst’s ability to rigorously follow effective methods for generating, managing, and evaluating hypotheses.”
This is a less obvious place to apply computer-aided techniques, but one where they may be especially needed. Even something as simple as taking the evidence an analyst has compiled and randomly reordering it might help the analyst look at things in new ways. More sophisticated approaches utilizing data mining and machine learning could even generate hypotheses for the analyst to validate or invalidate.
I enjoyed this paper because I could identify so much of my own analysis process in it; however, I am not convinced that the hypothesizing appears only so late in the process. I suspect that I start out with several hypotheses built from experience, before I even begin foraging. As I filter information, the hypotheses take shape, building or being discarded as I go. This all happens before the formal hypothesis building segment of the sensemaking loop (although that generally happens as well). How well does this model fit with your experiences with analysis?
Pirolli, P., and Card, S. The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. In Proceedings of the International Conference on Intelligence Analysis (McLean, VA, May 2005).
A lot of Pirolli and Card’s other work looks really interesting. I have a feeling we will be revisiting papers by these gentlemen in the near future.