Abstract

Research has shown gender differences in a number of cognitive domains, such as spatial ability and math performance. Research on gender differences in empathy has yielded inconsistent results. Studies have suggested the observed gender differences in empathy may arise from males’ reluctance to report empathy instead of a difference in ability. The present research investigated the effect that explicitly informing participants about the nature of an empathy task (empathy condition) or leading them to believe that the task evaluates social abilities (social abilities condition) has on performance on the Interpersonal Reactivity Index (IRI), a standard measure of empathy. Participants (20 males and 20 females) completed the IRI and the Marlowe-Crowne Social Desirability Scale form C, which measures the tendency to respond in a socially desirable way. The scores of females on the IRI were significantly higher than those of males in the empathy condition, and no significant gender difference was found in the social abilities condition. There was no significant difference between conditions for the social desirability score. Together, the results suggest gender differences on self-report measures of empathy do not arise from a difference in abilities.

Introduction

Empathy is the ability to identify what someone else is thinking or feeling and to respond to these thoughts and feelings with an appropriate emotion (Baron-Cohen, 2011). Empathy is thought to comprise two main components, cognitive empathy and emotional (or affective) empathy (Lawrence, Shaw, Baker, Baron-Cohen & David, 2004). Cognitive empathy is the ability to know what another person is thinking and is similar to “theory of mind,” and emotional empathy is the ability to experience an emotion similar to that of another individual (Lawrence et al., 2004; Rueckert & Naybar, 2008). Empathy has an impact on social and emotional health across cultures (Cassels, Chan, Chung, & Birch, 2010). It is correlated with prosocial behavior and altruism (Carlo, Hausmann, Christiansen, & Randall, 2003) and with the inhibition of antisocial and aggressive behavior (LeSure-Lester, 2000; Baron-Cohen, 2011). Higher levels of empathy and emotional management are also associated with better relationships with peers (Eisenberg, Miller, Shell, McNalley, & Shae, 1991). Lower levels of empathy have been associated with disorders such as Asperger syndrome and autism (Baron-Cohen & Wheelright, 2004; Baron-Cohen, 2011). Thus, empathy plays a vital role in an individual’s life, as it allows the individual to interact effectively in social situations.

Empathy is thought to be a critical cognitive difference between men and women due to females scoring significantly higher on measures of empathy. The existing research, however, has revealed mixed results regarding evidence for gender differences. Emotional judgment tasks such as “Reading the Mind in the Eyes,” which rely on the accurate judgment of emotion from observing the eyes, indicate female superiority (Baron-Cohen, Wheelright, Hill, Raste, & Plumb, 2001). Evaluation of nonverbal data using neuroimaging techniques such as functional magnetic resonance imaging (fMRI) provides less support for gender differences in empathy. In their meta-analysis, Wager et al. (2003) did not find a significant difference in brain activation between men and women in response to emotional stimuli. The use of physiological measures as indices of empathy results in men and women obtaining similar scores (Eisenberg & Lennon, 1983). Thus, in studies that use neuroscientific and physiological methods, gender differences in empathy do not appear. These findings strongly suggest that previously reported evidence of gender differences in empathy might be influenced by the methods adopted in the studies.

The most convincing evidence for gender differences in empathy is provided by studies using self-report measures to assess empathy (Rueckert, 2011). Women score significantly higher than men on the Emotional Quotient (Baron-Cohen & Wheelright, 2004) and the Interpersonal Reactivity Index (Davis, 1980). Culture and socialization play an important role in the development of empathy (Baron-Cohen, 2005) and thus may explain the discrepancy in findings reporting gender differences in empathy. Eisenberg and Lennon (1983) suggested biases in self-report measures could influence the observed gender differences. The differences may arise because men are reluctant to report empathic experiences due to social expectations. When a measure is thought to assess empathy, it may prompt responses influenced by an individual’s identification with gender stereotypes (Michalska et al., 2013). One of the most prevalent stereotypes in society is that women are more caring, people-oriented and empathetic than men (Rueckert, 2011). Therefore, it is possible that when a measure is thought to assess empathy, women feel they must respond more empathically, whereas men feel they must respond less empathically in order to conform to gender roles.

In studies using self-report measures of empathy, gender differences are found to increase between childhood and early adulthood (Mestre et al., 2009; Michalska et al., 2013). However, no significant gender difference in brain activation related to empathy has been found with increasing age (Eisenberg & Lennon, 1983). These findings lend more support to the role of culture and socialization in the observed gender differences in empathy. Previous research has suggested that knowledge of stereotypes in areas of personality traits and achievement increases in adolescence (Signorella, Bigler, & Liben, 1993, as cited in Berk, 2006). Increased awareness of stereotypes may lead to the observed increase in gender differences on self-report measures of empathy. The role of gender stereotypes in other cognitive abilities has also been evaluated. Dar-Nimrod and Heine (2006) found that telling participants that there are no gender differences or that gender differences arise due to differences in experience could attenuate gender differences in math performance. Moe (2009) investigated whether positive beliefs and explanations can influence performance on the mental rotation task, at which males are thought to be superior. It was found that leading women to believe that females perform better than males improved their performance on the task. Thus, changing the information given to participants prior to the task can reduce gender differences in tasks that are stereotypically associated with one gender.

In their experiment, Klein and Hodges (2002) found women performed significantly better than men on a measure of empathic accuracy only when participants were led to believe that the measure was related to empathy prior to completing the task. But when participants were paid in exchange for accuracy, leading participants to believe that the measure was related to empathy resulted in no gender differences. The authors suggest the gender differences observed in the performance of empathic accuracy arise from differences in motivation to perform well. The findings from this study strongly suggest that gender differences may be susceptible to test conditions and to the motivation to perform well, and may not be due to a difference in ability between the genders.

The current study aimed to investigate the influence of written instructions on gender differences in empathy. Instructions explicitly indicating a measure of empathy and instructions indicating a measure of social abilities were presented to participants prior to completing the Interpersonal Reactivity Index (IRI) (Davis, 1980). To investigate the tendency to respond in a manner considered socially acceptable, a measure of social desirability, the Marlowe-Crowne Social Desirability Scale Short Form C (MC-SDS form C) (Reynolds, 1982), was also used. Thus, if participants have a tendency to present themselves in a favorable (or unfavorable) light, their social desirability scores would reflect that. If gender differences in empathy are a result of differences in ability, changing the information about the nature of the task should have no impact on the empathy and social desirability scores. Based on evidence illuminating the impact of gender stereotypes on performance on self-report measures of empathy, there should be a significant difference between the IRI scores of males and females in the condition in which participants are explicitly told their empathy levels are being evaluated (empathy condition). No gender differences in IRI scores are predicted when participants are told that their social abilities are being evaluated (social abilities condition). As a significant gender difference is predicted in the empathy condition, both males and females in this condition may be responding in a way that they believe is socially acceptable. Thus, for the MC-SDS form C, participants in the empathy condition should display higher social desirability scores than participants in the social abilities condition.

Methods

Participants:

Forty participants between 18 and 31 years of age (M = 21.5, SD = 2.68) were recruited through the University of St Andrews Research Participation System (SONA). There were 20 females and 20 males.

Materials:

Participants completed the IRI (Davis, 1980), which contained 28 questions divided into four subscales (7 items each) that measure different aspects of empathy. Each question of the IRI including reversed scoring items (items 3, 7, 12, 13, 14, 15, 18, and 19) was assigned a score between 0 and 4. The total score for each subscale within the questionnaire and the overall scores for each participant were calculated. The Empathic Concern (EC) subscale measures an individual’s tendency to experience concern for others. The Fantasy Scale (FS) subscale measures an individual’s ability to imaginatively be involved with fictitious characters and situations. The Personal Distress (PD) subscale evaluates the tendency of an individual to become distressed as a result of witnessing another individual’s distress. Finally, the Perspective Taking (PT) subscale evaluates an individual’s ability to adopt the perspective of another person. Participants were required to respond to each item in the questionnaire on a five-point scale ranging from 0 (does not describe me well) to 4 (describes me very well).
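The scoring rule described above can be sketched in Python as follows. This is a minimal illustration rather than the scoring procedure actually used in the study: the function name `score_iri` is hypothetical, the reverse-scored items are those listed in the text, and the item-to-subscale key is not given in this paper, so the key below is an assumption that should be checked against Davis (1980).

```python
# Reverse-scored items as listed in the text (1-based item numbers).
REVERSED_ITEMS = {3, 7, 12, 13, 14, 15, 18, 19}

# ASSUMPTION: item-to-subscale key; not stated in this paper, verify against Davis (1980).
SUBSCALES = {
    "PT": [3, 8, 11, 15, 21, 25, 28],
    "FS": [1, 5, 7, 12, 16, 23, 26],
    "EC": [2, 4, 9, 14, 18, 20, 22],
    "PD": [6, 10, 13, 17, 19, 24, 27],
}


def score_iri(responses):
    """Return subscale totals and the overall total for one participant.

    responses: list of 28 integers in 0..4, given in item order.
    """
    if len(responses) != 28 or any(r not in range(5) for r in responses):
        raise ValueError("expected 28 responses, each between 0 and 4")
    # Reverse-scored items are flipped so that 0 becomes 4, 1 becomes 3, and so on.
    item_scores = {
        item: (4 - r if item in REVERSED_ITEMS else r)
        for item, r in enumerate(responses, start=1)
    }
    totals = {
        name: sum(item_scores[i] for i in items)
        for name, items in SUBSCALES.items()
    }
    totals["total"] = sum(item_scores.values())
    return totals
```

Each subscale total can range from 0 to 28 and the overall total from 0 to 112, since every one of the 28 items contributes between 0 and 4 points.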

Participants also completed the MC-SDS form C (Reynolds, 1982). This measure evaluates the tendency of individuals to respond in ways that are considered desirable based on culture and social norms. The questionnaire contains 13 statements based on culturally acceptable behaviors that are relatively unlikely to occur (Crowne & Marlowe, 1960). The items are to be answered with either “true” or “false” depending on how well the statement describes the individual. Each item, including the reversed scored items (items 1, 2, 3, 4, 6, 8, 11, and 12), was assigned a score of either 0 or 1. The total score for each participant was calculated.
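The true/false scoring described above could be implemented along the following lines. This is a sketch under stated assumptions: the function name `score_mc_sds_c` is hypothetical, and the code assumes that a reverse-scored item earns its point for a “false” answer while every other item earns its point for a “true” answer.

```python
# Items scored in the reversed direction, as listed in the text (1-based).
REVERSE_KEYED = {1, 2, 3, 4, 6, 8, 11, 12}


def score_mc_sds_c(answers):
    """Return the total social desirability score (0..13) for one participant.

    answers: list of 13 booleans in item order, True for "true", False for "false".
    ASSUMPTION: a reverse-keyed item earns 1 point for "false"; every other
    item earns 1 point for "true".
    """
    if len(answers) != 13:
        raise ValueError("expected 13 answers")
    return sum(
        1 if answer != (item in REVERSE_KEYED) else 0
        for item, answer in enumerate(answers, start=1)
    )
```

Under this keying, a participant who answers “true” to every item scores 5, and one who answers “false” to every item scores 8.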

Priming:

Participants were primed through the instructions accompanying the IRI. In the empathy condition, participants were explicitly told their empathy levels and reactions to emotional situations were being measured. In the social abilities condition, participants were told their social abilities and reactions to social situations were being measured. At the end of the instruction sheet, each participant was required to write his or her age and gender, thereby making gender salient before completing the questionnaire.

Procedure:

Participants were randomly assigned to either the empathy condition or the social abilities condition using an assignment system that ensured an equal number of males and females were in each condition. They were then presented with a description of the IRI and instructions on how to complete the questionnaire. The experimenter returned to the room after approximately eight minutes and presented participants with a second set of instructions on how to complete the social desirability scale. The experimenter returned after approximately four minutes, and the participants were given a full debrief explaining the purpose of the study including the predictions of the study. They were then compensated with GBP 1.50 for their time.
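The text does not specify the assignment system that was used. One common way to obtain equal numbers of males and females in each condition is to randomize separately within each gender, as in the sketch below. The function name `assign_conditions`, the participant identifiers, and the shuffle-then-alternate scheme are all assumptions made for illustration.

```python
import random


def assign_conditions(participants, seed=None):
    """Randomly assign participants to conditions, balanced within each gender.

    participants: list of (participant_id, gender) tuples.
    Returns a dict mapping participant_id to a condition label.
    """
    rng = random.Random(seed)
    conditions = ["empathy", "social abilities"]
    by_gender = {}
    for pid, gender in participants:
        by_gender.setdefault(gender, []).append(pid)
    assignment = {}
    for ids in by_gender.values():
        rng.shuffle(ids)
        # After shuffling, alternate conditions so each gender is split evenly
        # (counts differ by at most one when a gender has an odd number of people).
        for position, pid in enumerate(ids):
            assignment[pid] = conditions[position % 2]
    return assignment
```

With 20 males and 20 females, this scheme places 10 participants of each gender in each condition, which matches the balanced design described above.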

Statistical Analysis

The data were checked for normality using SPSS. The Shapiro-Wilk normality tests revealed that only the data of the females in the empathy condition were normally distributed (W = .790, p = 0.11). To check that parametric tests could still be interpreted accurately, the variances for each group of scores were calculated (social abilities females = 147.21, social abilities males = 98.44, empathy females = 102.5 and empathy males = 246.27). The largest variance was less than four times the smallest variance, which suggests that the analysis of variance is likely to be valid (Howell, 1992).
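The variance check can be reproduced directly from the four reported values. The short sketch below applies the rule of thumb cited in the text (largest variance less than four times the smallest); the variable names are illustrative.

```python
# Group variances of IRI scores as reported in the text.
variances = {
    "social abilities females": 147.21,
    "social abilities males": 98.44,
    "empathy females": 102.5,
    "empathy males": 246.27,
}

ratio = max(variances.values()) / min(variances.values())
# Rule of thumb cited in the text: largest variance under four times the smallest.
rule_satisfied = ratio < 4
print(f"variance ratio = {ratio:.2f}, rule satisfied: {rule_satisfied}")
# prints: variance ratio = 2.50, rule satisfied: True
```

The ratio of 246.27 to 98.44 is about 2.50, comfortably below the threshold of 4.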

A repeated-measures ANOVA was conducted with gender (male/female) and condition (empathy/social abilities) as the between-subjects factors and the subscales (EC, FS, PD, PT) as the repeated measures factors. Where sphericity was violated, appropriate corrections were used. For further analysis of the results, a simple effects analysis was conducted.

The data obtained from the MC-SDS form C were checked for normality, and the Shapiro-Wilk test revealed that the data were drawn from a normal distribution (W = .972, p > .05). A two-way between-subjects ANOVA was conducted with gender and condition as between-subjects variables and the social desirability score as the dependent variable.
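For readers unfamiliar with the two-way between-subjects ANOVA, the sketch below shows how the F ratios are formed in a balanced design (equal group sizes). It is an illustration only, not the SPSS procedure used in the study: the function name `two_way_anova_balanced` is hypothetical, and it returns F ratios and degrees of freedom without p values, which would require the F distribution.

```python
from statistics import mean


def two_way_anova_balanced(cells):
    """Two-way between-subjects ANOVA for a balanced design.

    cells: dict mapping (level_of_A, level_of_B) to a list of scores,
           with the same number of scores in every cell.
    Returns F ratios for A, B and the A x B interaction, plus degrees of freedom.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(scores) != n for scores in cells.values()):
        raise ValueError("design must be balanced")

    grand = mean(x for scores in cells.values() for x in scores)
    cell_mean = {key: mean(scores) for key, scores in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    # Sums of squares for the two main effects, the interaction and the error term.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_cells = n * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum(
        (x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores
    )

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_error = len(cells) * (n - 1)
    ms_error = ss_error / df_error
    return {
        "F_a": (ss_a / df_a) / ms_error,
        "F_b": (ss_b / df_b) / ms_error,
        "F_ab": (ss_ab / df_ab) / ms_error,
        "df_effects": (df_a, df_b, df_ab),
        "df_error": df_error,
    }
```

With 10 participants in each of the four gender-by-condition groups, the error degrees of freedom come to 4 × (10 − 1) = 36, which matches the F (1, 36) values reported in the Results.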

Results

Interpersonal Reactivity Index

Mauchly’s test indicated that the assumption of sphericity had been violated (χ² (5) = 25.64, p < .001). Therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε = 0.73). Tests of between-subjects effects revealed a significant effect of gender (F (1, 36) = 13.19, p = .001) and a significant interaction between gender and condition (F (1, 36) = 6.97, p = .013). The two-way repeated measures ANOVA revealed a significant main effect of subscale (F (2.20, 79.29) = 15.89, p < .001). Figure 1 shows the mean scores of participants on each of the subscales within the IRI. However, no significant interaction between subscale and gender (F (2.20, 79.29) = 1.19, p > .05) or subscale and condition (F (2.20, 79.29) = 2.06, p > .05) was found. Thus, gender and condition did not influence the scores on the four subscales.
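The degrees of freedom reported above follow from the design (four subscales, 40 participants, four gender-by-condition groups). The sketch below shows the arithmetic; the function names are hypothetical. Note that the rounded estimate of 0.73 gives 2.19 and 78.84, so the reported values of 2.20 and 79.29 presumably reflect an unrounded estimate. The value 0.7342 used in the example is back-calculated from the reported degrees of freedom and is an assumption, not a figure from the paper.

```python
def mauchly_df(k):
    """Degrees of freedom of Mauchly's chi-square test for k repeated measures."""
    return k * (k - 1) // 2 - 1


def corrected_df(k, n_subjects, n_groups, epsilon):
    """Greenhouse-Geisser corrected degrees of freedom for the within-subjects effect.

    The uncorrected values are (k - 1) and (k - 1) * (n_subjects - n_groups);
    both are multiplied by the sphericity estimate epsilon.
    """
    df_effect = (k - 1) * epsilon
    df_error = (k - 1) * (n_subjects - n_groups) * epsilon
    return df_effect, df_error


# Four subscales give 5 degrees of freedom for Mauchly's test, as reported.
print(mauchly_df(4))

# ASSUMPTION: 0.7342 is back-calculated from the reported degrees of freedom.
df_effect, df_error = corrected_df(k=4, n_subjects=40, n_groups=4, epsilon=0.7342)
print(round(df_effect, 2), round(df_error, 2))
```

Without the correction (ε = 1), the same function returns 3 and 108, the degrees of freedom that would apply if sphericity held.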

A simple effect analysis revealed no significant difference in scores of males and females in the social abilities condition (F (1,39) = 0.50, p > .05). The analysis revealed a significant difference between the scores of males and females in the empathy condition (F (1,39) = 19.01, p < .001) with males scoring significantly lower than females. This indicates gender differences in empathy were only present in the empathy condition.

Social Desirability Scale

The two-way ANOVA revealed no main effect of gender (F (1,36) = 0.39, p > .05) or condition (F (1,36) = 0.09, p > .05). The interaction between gender and condition was also not significant (F (1,36) = 1.81, p > .05), indicating there was no difference in social desirability scores in any of the four conditions.

Discussion

The current study aimed to investigate the influence of written instructions on gender differences in empathy and whether participants’ scores reflected a tendency to respond in a manner perceived to be socially acceptable. The results revealed that in the empathy condition, males scored significantly lower than females on the IRI, while no significant gender difference was found in the social abilities condition. Thus, the data support the hypothesis that gender differences on self-report measures of empathy do not arise due to a difference in ability. Although significant gender differences have been observed in each of the four subscales of the IRI, the differences on the four subscales in the current study were not influenced by the condition or gender of the participants. This result, however, is consistent with previous research suggesting there is a variation in scores obtained on each subscale (Davis, 1980). In contrast to the prediction that participants in the empathy condition would have significantly higher scores on the MC-SDS form C, no significant difference in social desirability scores was found between the four conditions. Although the results were not significant, males in the empathy condition scored the highest on the MC-SDS form C, whereas females in the empathy condition scored the second highest.

The results are consistent with previous findings demonstrating that when one gender is thought to perform better due to stereotypes, the information presented prior to the task can reduce gender differences (Moe, 2009). In the current study, information about the task that indicated that the measure evaluated empathy, a trait that is stereotypically thought to be feminine (Rueckert, 2011), resulted in gender differences consistent with earlier studies (Davis, 1980). But when participants were led to believe the measure evaluated an ability that had no gender stereotypes associated with it, gender differences on the same measure diminished. The absence of gender differences in empathy scores in the social abilities condition in the current study suggests a lack of difference in empathic ability between genders. The lower scores of males in the empathy condition compared to both females in the empathy condition and males in the social abilities condition suggest that when it was obvious what behavior trait was being evaluated, males may not have been motivated to perform well.

Females in both the empathy and the social abilities conditions performed at almost the same level (slightly higher in the empathy condition). Males’ mean performance was significantly worse than that of females in the empathy condition. In their experiment, Michalska et al. (2013) found that even though males and females did not differ in their hemodynamic and pupil dilation responses to painful stimuli, females reported being significantly more upset than males. With respect to the tendency of participants to display themselves in a socially favorable way, the slightly higher social desirability scores of males and females in the empathy condition lend further support to the role of gender stereotypes in the gender differences observed on self-report measures of empathy. Females in the empathy condition scored the highest on the IRI, and males in the empathy condition scored the lowest on the IRI. Taken together, the results obtained from the MC-SDS form C and the IRI suggest social expectations regarding gender roles and gender stereotypes may lead women to report higher levels of empathetic behavior and men to report lower levels of empathetic behavior.

In studies that have reported gender differences on self-report measures of empathy, a measure of social desirability has not been included (Mestre et al., 2009; Lawrence et al., 2004; Baron-Cohen & Wheelright, 2004; Davis, 1980). In this study, the MC-SDS form C was included to investigate the likelihood of participants presenting themselves in a socially acceptable manner when it was obvious what behavior was being evaluated. The MC-SDS form C can also be used to control for socially desirable responses in self-report measures. The results illuminate the importance of including a measure of social desirability, as scores on self-report measures may be influenced by the tendency of participants to present themselves in a way that is considered acceptable by society, and this may lead to incorrect interpretation of data.

Although the results obtained suggest gender differences on self-report measures of empathy arise from social expectations, the current study had certain limitations. Due to the small sample size within each condition, a reliable correlation between IRI and MC-SDS form C scores could not be computed. Future studies may benefit from a larger sample size to allow for analysis of the relationship between empathy scores and socially desirable responses. The current study also did not consider demographic information about participants. It is possible that the tendency to report empathy may vary across cultures due to the variance in gender roles across cultures. Future studies may benefit from investigating whether the observed differences are stronger in cultures that have stronger gender role beliefs. Finally, previous research has found gender role orientation is a better predictor of gender differences in empathy than gender itself (Karniol, Gabay, Ochion & Harari, 1998). Including a gender role orientation measure such as Bem’s gender role orientation inventory (Karniol et al., 1998) may be beneficial for further research. This would allow one to investigate whether participants who score higher on masculinity obtain lower empathy scores regardless of the instructions presented prior to the IRI.

In conclusion, the findings that gender differences do not arise from differences in ability are consistent with studies using implicit measures of empathy (Michalska et al., 2013). The current study suggests gender differences in self-report measures of empathy may be influenced by stereotypes and gender role expectations in society. It is possible that when the nature of the measure is made obvious, individuals may be motivated to perform according to social expectations. Gender differences in self-report measures of empathy previously reported may arise due to a tendency of women to over-report empathic behavior and a tendency of men to under-report it (Wager & Ochsner, 2005). The results suggest males and females can obtain the same level of empathy scores when the stereotype associated with a measure is removed. The results from the current study have important implications for future studies aiming to evaluate empathy, as information about the measure may influence the way participants respond. This study challenges the reliability of previously reported gender differences on self-report measures of empathy and adds to a growing area of literature suggesting no gender differences in ability in a number of cognitive domains (Moe, 2009; Dar-Nimrod & Heine, 2006).