Emotion is the very spark of feeling that makes our hearts flutter, our eyes tear up and our hands clench in fear. No doubt, we are all driven by emotions; they are the primary instinct that moves us to feel and act. In UX, people are paying more attention to the skill of empathy, and to emotion. But, as far as we can see, no one has defined a standard for how emotion in UX/usability should be measured. Mostly it comes down to a designer’s gut feel, previous mistakes and experience. Trial and error within an agile process is OK, but can emotion be measured?
Since, by nature, emotions are intangible, there isn’t a definite method to measure emotion yet. We have written this summary so that we can work out what would be best to do as a consultancy right now.
Neurometrics, Biometrics and Eye Tracking
Andrew Schall (the principal researcher and senior director at Key Lime Interactive) has written a comprehensive article describing various new methods for measuring emotions more accurately and objectively, along with their pros and cons. We briefly review some of the methods mentioned in his article below. Then we focus on some techniques you can adopt right now in your UX practice.
Facial response analysis
Traditional facial response analysis involves a few researchers observing participants and coming to an agreement on which emotions the participants are expressing. In recent years, software and algorithms have been developed to recognize facial expressions of different emotions with just a simple webcam set-up. However, the current state of this technology only recognizes a limited set of emotions (e.g., anger, fear, joy), and it is only accurate when the emotions are overtly expressed. An example of such software is AFFDEX by Affectiva. You can also check out this TED talk by Affectiva’s Chief Strategy and Science Officer, Rana el Kaliouby. Other similar software includes Noldus’ FaceReader and ZFace. Despite the limitations, more sophisticated and precise algorithms are rapidly being developed to raise the accuracy of the analysis.
Facial electromyography (EMG) can accurately measure more subtly expressed emotions by measuring signals from specific muscles known to react to specific emotions (check out this Scholarpedia article for a simple introduction). However, EMG is obtrusive and only works if you know beforehand which facial muscles to measure. It is possible to place electrodes across the participant’s entire face, but again, this is too intrusive for everyday usability testing.
Another limitation of facial response analysis and EMG is that they can only measure overt emotions, which are often under conscious control. As such, these emotions can be highly influenced by social settings. For example, people tend to show stronger facial expressions if they believe that they are being observed.
One of our UX consultants trying out the Empatica E3 wristband
GSR (Galvanic Skin Response)
GSR technology has traditionally been used to measure physiological arousal. It can accurately measure intensity (e.g., arousal, stress), but not emotional valence (positive or negative). Although some computational algorithms can be applied to GSR data to estimate valence (Monajati, Abbasi, Shabaninia, & Shamekhi, 2012), it is still far from being able to measure specific emotions.
Other limitations include a delay of 1-3 seconds (maybe more, depending on the equipment used) and sensitivity to external conditions (e.g., temperature, humidity) as well as internal bodily conditions (e.g., medications). We have a GSR unit and have experimented with it, but we found it rather difficult to correlate spikes in GSR with UI interactions. The temporal resolution of GSR is too crude to measure the emotional response to individual events.
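To make the correlation problem concrete, here is a minimal sketch of what attributing a GSR spike to a UI interaction looks like when the response lags by 1-3 seconds. The event log, the peak timestamps and the helper name `candidate_events` are all made up for illustration; this is not the procedure we or any GSR vendor use, only a way to show why several interactions can end up competing for the same spike.

```python
from dataclasses import dataclass


@dataclass
class UiEvent:
    t: float    # seconds since the start of the session
    label: str  # what the participant did or saw


def candidate_events(peak_t, events, min_lag=1.0, max_lag=3.0):
    """Return every UI event that could have triggered a GSR peak at peak_t.

    An event qualifies if the peak follows it by between min_lag and
    max_lag seconds (the 1-3 second delay mentioned above; an assumption,
    since the real delay depends on the equipment).
    """
    return [e for e in events if min_lag <= peak_t - e.t <= max_lag]


# Hypothetical session log and hypothetical peak times.
events = [
    UiEvent(10.0, "open menu"),
    UiEvent(11.2, "click 'Checkout'"),
    UiEvent(12.1, "error dialog shown"),
    UiEvent(20.0, "page loaded"),
]
peaks = [13.5, 22.0]

for p in peaks:
    labels = [e.label for e in candidate_events(p, events)]
    print(f"peak at {p:.1f}s -> {labels}")
```

In this made-up log the first peak has two plausible causes (the click and the error dialog), while the second has only one. Whenever interactions are closer together than the lag window, the spike cannot be pinned to a single event.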
EEG (Electroencephalography)
EEG is a neuroimaging method used to measure real-time changes in voltage caused by brain activity. Measuring brain activity means it has a much larger arsenal of measures for emotional responses than biometrics. Its excellent temporal resolution also means that it has the potential to measure real-time changes in emotional responses, which would be very useful for UX research. However, just like physiological response patterns, brain activity patterns are affected by many external and internal factors. Well-designed computational methods and trained algorithms are needed to extract information from the “noisy” EEG data. For example, movement can cause many artifacts that are not related to experienced emotions. Research into EEG as a measurement of emotions is still in its early stages, but it has shown more promising results than GSR in measuring emotional states.
EEG technology is now becoming increasingly accessible (check out this list on Wikipedia), and companies like Emotiv are already starting to produce lightweight and wireless EEG equipment for a simpler and less obtrusive set-up. It means, however, that there are fewer electrodes from which to precisely and reliably transform the data into meaningful insights. It is a trade-off between obtrusiveness and data sensitivity.
Eye tracking
Eye-tracking is not obtrusive and can measure arousal from blink activity, pupil size and dwell times. However, pupillometry like this suffers from the same problem of being affected by many external and internal factors. The environment must therefore be well controlled to avoid disturbances that could change participants’ pupillometry data.
With eye-tracking we can measure people’s unconscious eye gaze response to an interface they are using. Specific emotions, however, cannot be measured using eye-tracking alone; instead they are discovered only in the Retrospective Think Aloud (RTA) interview afterwards, which is susceptible to suggestibility effects.
Despite eye-tracking’s inability to measure emotional states meaningfully on its own, its main advantage lies in how flexibly it combines with other research methods and measurements to gather powerful insights. Eye-tracking helps us determine the user’s attention, focus or other mental states. Combined with other devices, it potentially lets us pinpoint the specific events or touch points that cause a change in emotional state during testing sessions. Lightweight eye-tracking equipment, such as our Tobii Glasses 2, also makes it possible to test participants in their own environment when a study requires more ecological validity.
How do we use all this?
One important piece of advice from Andrew Schall’s article is that EEG and GSR are not for everyone, as there can be potential for misinterpretation and misuse of the data. We believe that there is a need to understand the science behind the complexities of the technologies beforehand in order to avoid misusing them. This also applies to the eye-tracking technology, even if you are using it as a complementary research method to pinpoint specific events as mentioned above.
Andrew also warned that it is often insufficient to measure emotions with just a single technique, as the neurometric and biometric measurements of emotion described above are not yet fully mature. Using a variety of methods that complement each other would give better accuracy in identifying users’ specific emotional experiences. There are, however, still significant challenges in implementing a standard for measuring emotions using these technologies, especially in terms of economy and practicality. Given that neurometric and biometric measurements still have some way to go, is there any other way to measure emotions more economically and practically?
What else can we do to measure emotion?
We believe the answer to this question could be good old self-report questionnaires.
Questionnaires, unlike user interviews, are more objective and standardized, hence results can be compared across different contexts and projects. Our clients always want to compare scores like NPS or SUS for themselves against other projects across their organization. Although questionnaires still suffer from the same problem of relying on a user’s recall (which could be mitigated by the use of the eye-tracking + RTA research methodology), they are simple to implement and you do not need to be a neuroscientist to analyze the results. There might be countless questionnaires available online, but fret not, we have done a little research to identify the following, which are designed and empirically tested to measure aspects related to emotions:
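As an illustration of why standardized questionnaires are so easy to compare, here is a minimal sketch of how the two scores mentioned above are usually calculated. It follows the commonly published scoring rules for SUS (ten items rated 1 to 5, alternating positive and negative wording) and NPS (a single 0 to 10 rating). The function names and sample answers are ours and purely illustrative. Note that these two scores measure usability and loyalty rather than emotion; the point is only that a fixed scoring rule makes results comparable across projects.

```python
def sus_score(responses):
    """SUS score (0-100) from ten answers on a 1-5 scale, in questionnaire order."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten answers, each from 1 to 5")
    total = 0
    for i, r in enumerate(responses):
        # Items 1, 3, 5, 7, 9 are positively worded: contribution is r - 1.
        # Items 2, 4, 6, 8, 10 are negatively worded: contribution is 5 - r.
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5


def nps(ratings):
    """Net Promoter Score (-100 to 100) from 0-10 'would you recommend' ratings."""
    if not ratings:
        raise ValueError("NPS needs at least one rating")
    promoters = sum(r >= 9 for r in ratings)   # ratings of 9 or 10
    detractors = sum(r <= 6 for r in ratings)  # ratings of 0 to 6
    return 100.0 * (promoters - detractors) / len(ratings)


# Made-up answers from one participant (SUS) and five participants (NPS).
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
print(nps([10, 9, 9, 8, 6]))                      # 40.0
```

Because the rules are fixed, a score from one project means the same thing in the next one, which is exactly what interviews cannot offer.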
1. Geneva Emotion Wheel
This is an empirically tested instrument to measure emotional reactions to objects, events, and situations, based on Scherer’s Component Process Model. It assesses 20 emotions and can be used in 3 different ways, depending on your objective. You can download a standard template from their website, provided it is used for non-commercial research purposes.
2. Plutchik’s Wheel of Emotions
Source: Author/Copyright holder: Machine Elf 1735. Copyright terms and licence: Public Domain.
Plutchik’s wheel of emotions is an early model of an emotional wheel that was constructed based on 8 “basic” emotions and their “opposite emotions”. It was further expanded to include more complex emotions that are composed of 2 basic ones. Even though this model lacks empirical testing, some UX designers and researchers use it from time to time to map out user journeys, because it provides an organisational structure (e.g., intensity, complexity) when measuring emotions.
3. Self-Assessment Manikin (SAM)
This is a questionnaire that uses pictorial scales to measure 3 dimensions of experienced emotions: pleasure, arousal and dominance. It has often been used in evaluations of advertisements and, increasingly, in product evaluation. Because it is pictorial, it is suitable for a wider range of participants (e.g., children, or people from different language or cultural backgrounds).
4. PrEmo
This questionnaire also uses pictorial scales, but it is designed to measure more specific emotions for product evaluation purposes. It uses a set of 7 positive and 7 negative emotions to measure the emotional impact on users. Like eye tracking, PrEmo can be used either as a quantitative tool by itself, or as a qualitative tool to complement user interviews. Although PrEmo is available free of charge for academic (non-commercial) use, there is a charge for using it commercially.
5. AttrakDiff
AttrakDiff does not measure specific emotions, but it includes an assessment of emotional impact in product evaluation. It measures the attractiveness of a product based on 2 sets of scales:
- Pragmatic scale – basically usability, e.g., usefulness of a product
- Hedonic scale – this measures emotional reactions. It does not measure the distinct emotions themselves, but the user’s needs and behaviours arising from the emotions, e.g., curiosity, identification, joy, enthusiasm
Their website offers a pretty comprehensive overview of what it is about, and you can have a go at the demo there too.
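To give a feel for how results from a two-scale questionnaire like this can be summarised, here is a rough sketch that averages ratings per scale. It assumes each item is a pair of opposite words rated on a 7-point scale coded from -3 to +3. The item names and their assignment to the pragmatic or hedonic scale are hypothetical placeholders, not the official AttrakDiff item list or its scoring procedure.

```python
from statistics import mean

# Hypothetical word-pair items, grouped by the two scales described above.
SCALES = {
    "pragmatic": ["complicated_simple", "confusing_clear"],
    "hedonic": ["dull_captivating", "conventional_inventive"],
}


def scale_means(participants):
    """Average rating per scale over all participants and items.

    participants is a list of dicts mapping item name to a rating
    between -3 (left-hand word) and +3 (right-hand word).
    """
    result = {}
    for scale, items in SCALES.items():
        ratings = [p[item] for p in participants for item in items]
        result[scale] = mean(ratings)
    return result


# Made-up answers from two participants.
answers = [
    {"complicated_simple": 2, "confusing_clear": 3,
     "dull_captivating": -1, "conventional_inventive": 0},
    {"complicated_simple": 1, "confusing_clear": 2,
     "dull_captivating": 1, "conventional_inventive": -2},
]
print(scale_means(answers))
```

In this invented example the product scores well on the pragmatic side but slightly negative on the hedonic side, i.e. usable but not exciting, which is the kind of contrast a two-scale instrument is meant to surface.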
6. youxemotions
youxemotions offers a simple and easy-to-use solution for measuring emotions. Users choose what they felt from 9 emotions and 5 levels of intensity. Turning results into charts for presentation is extremely easy as well. It is currently in beta, and is free to use until the end of the beta period.
Even though there are various ways of measuring emotions in UX, it is important to understand the benefits and limitations of each method. After all, a research method is only useful if it helps you answer your research question or meet your design objective.
If specific emotions are too complicated for your needs, maybe an analysis of how users are using their mouse would be a good enough tool to infer negative emotions when users are browsing websites.
-Ying Ki, Shermaine & James
Monajati, M., Abbasi, S. H., Shabaninia, F., & Shamekhi, S. (2012). Emotions States Recognition Based on Physiological Parameters by Employing of Fuzzy-Adaptive Resonance Theory. International Journal of Intelligence Science, 2, 166-175.