You’ve probably been on a first date a few times, so the feeling of being curious about the person you’re meeting with for the first time is likely a familiar one.
You might wonder, for example: are they open-minded? Or are they extroverted and therefore perhaps “easier to get along with”? Questions like these are about people’s personalities.
Personalities also matter a lot when it comes to hiring new employees to join your company. In fact, more and more companies consider candidates’ personalities as early as the very first step of the hiring process (so during initial candidate screening).
From an employer’s viewpoint, this translates into the question of whether a candidate fits the company culture. In most cases, this is the main reason for screening candidates with a personality test.
Additionally, we tend to assume that personality tests help us get to know a person in a short amount of time, even without being in direct touch with them.
And yes, personality tests do offer some insight into candidates’ preferences regarding task completion and working style. However, scientific research paints a different picture of whether they can be used as a hiring tool.
In this blog, we’ll take a more in-depth look at the actual feasibility of personality tests as a means of pre-employment testing.
4 Reasons to reconsider using personality tests as the primary screening procedure
Low Validity when it comes to Predicting Future Job Performance
First things first: you should take into consideration that personality tests tend to have low criterion validity in relation to job performance (Morgeson et al., 2007; Barrick & Mount, 1991; Barrick et al., 2001). This means that personality test results do not always accurately predict job performance. Specifically, personality tests show a corrected validity coefficient of only 0.23, which is considered low (Morgeson et al., 2007). In contrast, cognitive tests provide validity coefficients of 0.62 (considered moderate to high) in predicting job proficiency.
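To put these two coefficients in perspective, it can help to square them: the square of a correlation is the share of variance in the outcome that the predictor accounts for. The short sketch below does that for the two values quoted above. The function name and the dictionary are only illustrative, and the calculation is a rough reading aid rather than a re-analysis of the cited studies.

```python
# Validity coefficients as quoted in the text (corrected values).
validity = {"personality tests": 0.23, "cognitive tests": 0.62}


def variance_explained(r: float) -> float:
    """Share of variance in job performance accounted for by a predictor
    whose correlation with job performance is r (that is, r squared)."""
    return r ** 2


for name, r in validity.items():
    share = variance_explained(r)
    print(f"{name}: r = {r:.2f}, variance explained = {share:.1%}")
```

Read this way, a coefficient of 0.23 corresponds to roughly 5% of the variance in job performance, against roughly 38% for a coefficient of 0.62.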
For example, while looking into the validity of the Big Five model, Barrick and Mount (1991) also examined the correlation between each Big Five dimension and job performance. They found that correlations between personality dimensions and job performance ranged from 0.04 (Openness to Experience) to 0.22 (Conscientiousness; corrected). And mind you, these are research findings from the early 90s.
In a later meta-analysis of the validity of the Big Five model, conscientiousness yielded an average predictive validity of only 0.12 for work performance (Barrick et al., 2001; Murphy, 2005).
These results make it quite clear that personality tests are a poor predictor of work performance. As Barrick once mentioned (Morgeson et al., 2007):
“If you took all the Big Five, measured well, and corrected for everything using the most optimistic corrections you could possibly get, you could account for about 15% of the variance in performance.”
In reality, results obtained from personality tests are rarely corrected. Specifically, we usually don’t correct for, or even think about, the measurement error introduced by administering the test when we interpret the results.
If we used such (inaccurate) results as the basis for choosing future employees, the probability of a person performing as the test results led us to expect would be very low.
Dangers of Self-reporting and Social desirability
On top of that, most personality tests rely solely on self-reported answers, which might invite faking behavior (Birkeland et al., 2006; Niessen et al., 2017; Donovan et al., 2003).
Simply put, faking is “saying what you think you ought to say rather than what you really want to say. We have a word for that — ‘civilization’.” (Kevin Murphy, In Morgeson et al., 2007a, p. 712).
Be honest with me: how would you answer this question during the hiring process? ‘Do you prefer familiarity over unfamiliarity?’
You might start wondering what this question is intended to measure, and how you should respond to improve your chances of proceeding to the next stage of recruitment.
When you engage in this kind of deliberate thinking, rather than responding naturally, you are probably engaging in impression management to present a better self-image to the hiring manager. This can be intentional or unintentional, conscious or unconscious. In other words, it can even happen without us being aware of it.
Studies have shown that self-assessment scores differ between low-stakes (e.g., experimental setting) and high-stakes (e.g., hiring procedure or academic admission) conditions.
For example, Niessen and colleagues (2017) found inflated scores on non-cognitive constructs (such as personality or behavioral tendencies), depending on whether participants were told that selection decisions would be based on their scores.
Similarly, Birkeland and colleagues (2006) demonstrated across 25 studies that applicants score significantly higher on personality tests than non-applicants, especially on emotional stability and conscientiousness (Big Five model).
Some researchers claim that social desirability has minimal effect on construct validity (Hogan et al., 2007; Ellingson et al., 2001). However, that does not mean that it is suitable for use in selection decisions. Particularly, faking can significantly change the rank orders and thus influence hiring decisions (Stewart et al., 2010; Rosse et al., 1998). This becomes even worse when the selection ratio is small and top-down selection is used (choosing from the top candidates first; Morgeson et al., 2007).
Let’s do some simple math: if candidate A responded in a socially desirable way to 3 questions in a 15-item personality test, 20% of their answers (3 out of 15) are inflated. That can already be enough to outscore candidate B, who has the same personality but answered the same test truthfully.
Although 20% may seem small, there might be only one position to fill. Thus, this small difference might result in a change in rank order and cause candidate B to lose the opportunity to proceed to the next round of the procedure.
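The scenario above can be sketched in a few lines of code. Everything in this example is a made-up assumption for illustration: a 5-point answer scale, a "true" profile of all 3s shared by both candidates, and candidate A pushing 3 of the 15 answers to the maximum. It is not taken from any of the cited studies; it only shows how a handful of inflated answers can decide a top-down selection with one open position.

```python
ITEMS = 15       # length of the hypothetical personality test
SCALE_MAX = 5    # assumed 5-point answer scale (1 = lowest, 5 = highest)


def total_score(answers):
    """Sum of all item answers, used here as the test score."""
    return sum(answers)


def inflate(answers, n_items, scale_max=SCALE_MAX):
    """Return a copy of the answers with the first n_items set to the scale maximum."""
    faked = list(answers)
    for i in range(n_items):
        faked[i] = scale_max
    return faked


# Both candidates share the same hypothetical "true" profile.
true_profile = [3] * ITEMS
candidate_b = true_profile              # answers truthfully
candidate_a = inflate(true_profile, 3)  # inflates 3 of 15 answers

scores = {
    "A (3 inflated answers)": total_score(candidate_a),
    "B (truthful)": total_score(candidate_b),
}

# Top-down selection: rank by score and fill the single open position.
ranking = sorted(scores, key=scores.get, reverse=True)
hired = ranking[:1]

print(scores)
print("Selected:", hired)
```

With these assumed numbers, candidate A ends up with 51 points and candidate B with 45, so A is selected even though both have the same underlying personality.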
Involves the use of Vague Statements
Most personality tests consist of vague items so that they can be conveniently generalized to as many fields as possible. As a result, test-takers are sometimes confused about the situations described in the questions or statements and might interpret them differently from how the author originally defined or intended them (Morgeson et al., 2007).
For instance, we might have different perspectives towards ‘many’, ‘often’, ‘sometimes’ or ‘always’.
Think of the statement “I read a lot.” We might both choose the option ‘strongly agree’, but it could be that I think reading 1 book per month is ‘a lot’, while you think it takes 1 book per week to count as ‘a lot’.
Another example is questions or statements containing words like ‘things’ and ‘others’, such as ‘I worry about things’ and ‘I get annoyed by others’ behavior’.
We might define these words in various ways depending on the context we have in mind.
Broad and Generalizable ways of Presenting Results
Have you ever taken an online psychological test that gave you a summary of what type of person you are? Did you look at the result, feel that those words were so “you”, and conclude that the test must be accurate?
The interesting thing is that when we take a sneak peek at another person’s result, we might realize that we fit that personality type as well. Why is that?
This is the so-called Barnum effect. Some personality-test publishers make very broad statements about each type of personality, causing us to think the report is scientifically precise (Guastello et al., 1989; Snyder, 2000).
In fact, the statements can apply to almost everyone, rather than uniquely describing who you are. In the hiring process, you need to be extra cautious when making decisions based on this kind of report. You might believe that these ‘accurate’ reports offer a reliable basis for making decisions, but they are too general to support a valid and fair judgment.
Test results could also be presented in a dimensional personality profile (e.g., 16 PF, NEO, MPA) or as a personality type (e.g., MBTI, DISC, HBDI; Diekmann & König, 2018).
Specifically, a dimensional personality profile is based on the idea that everyone has a certain degree of each personality trait on a continuous scale, whereas personality types categorize people into distinct groups (Gangestad & Snyder, 1985; Diekmann & König, 2018).
In the next part, I will mainly focus on the concerns about personality types using two case studies, the Myers-Briggs Type Indicator (MBTI) and the DiSC personality assessment, and compare the similarities and differences between the two tests.
Case study: MBTI & DISC
In the infographic below, you can see the main differences between MBTI and DISC.
Although both are personality type tests, they measure different facets of personality and yield different types of results.
Despite these differences, they are also similar in nature. When it comes to presenting results, both offer a less complex way of interpreting them: they summarize all the information and categorize people into different types, which makes the results easier to explain and more appealing to practitioners (Diekmann & König, 2018).
However, as with many other personality tests, there are some considerations we need to take into account if we want to use them in the hiring process.
In fact, the publishers of both tests have a disclaimer on their websites stating that the scores obtained from the test are not recommended for pre-employment screening (DiSC Profile; MBTIOnline; Haynie, 2021).
These tests focus more on measuring people’s preferences for doing things or their way of approaching certain kinds of work. They are not primarily intended to evaluate how well people will perform in terms of specific skills, aptitudes, abilities or other factors related to job success.
Furthermore, there are also some concerns about the way they present their results. Categorizing people into groups might lead to an (over)simplified view of people, ultimately generating unwanted stereotypes while screening candidates (e.g., Gangestad & Snyder, 1985). Forcing a dichotomy onto a trait that should be considered a continuum may lead to inaccurate interpretations for a large proportion of test-takers.
For example, categorizing people only as either introverted or extroverted leaves no room for an accurate interpretation of people who score in the middle of this categorization.
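A tiny sketch makes the problem with a forced dichotomy concrete. The 0 to 100 extraversion scale, the cutoff of 50 and the three scores below are all hypothetical values chosen for illustration, not the scoring rules of any real test.

```python
def classify(score, cutoff=50):
    """Force a continuous extraversion score into one of two types."""
    return "extrovert" if score >= cutoff else "introvert"


# Hypothetical scores on an assumed 0-100 extraversion scale.
scores = {"Person 1": 49, "Person 2": 51, "Person 3": 95}
labels = {person: classify(score) for person, score in scores.items()}

for person, score in scores.items():
    print(f"{person}: score {score} -> {labels[person]}")
```

Person 1 and Person 2 are only 2 points apart yet receive opposite labels, while Person 2 and Person 3 are 44 points apart and receive the same one. The type label hides exactly the information that would matter for people near the middle of the scale.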
These categorizations are, however, frequently used to infer other behaviors.
For instance, you can easily find a Compatibility Chart for all MBTI types on Google, which is neither provided by the official website nor based on solid research (see picture below).
You might think that candidates with certain MBTI types would be good colleagues or have skills that complement yours, and advance them to the next step of the hiring process.
However, you should be aware that this kind of compatibility matching has no scientific backing or even a strong theoretical foundation.
It would be risky to make hiring decisions, consciously or unconsciously, based on such results.
Last but not least, we are going to discuss some substantive but difficult-to-access properties. Apart from the low validity in predicting job performance mentioned before, you would still need to consider the cutoff points used for categorization. Even in other tests involving classification, it is often hard to set theoretically or empirically meaningful cutoff points. On top of that, it is doubtful whether a person can be assigned to one single type (Robins et al., 1998; Diekmann & König, 2018).
This would not matter much if we only wanted to know what a person’s preferences look like, but it becomes problematic if we need to make hiring decisions based on these categorizations.
Final word when it comes to personality tests…
Personality tests can be fascinating to use, as they seem to let you get to know people you have only just met within a very short period of time (well, the duration of the test, to be precise).
However, you should be cautious when using them in the pre-employment screening process, considering their low validity in predicting future job performance, scores distorted by social desirability, vague phrasing of questions and statements, and overly broad ways of presenting results.
With all the evidence provided, I hope it has become clear to you that personality tests are not the best option as a primary screening tool in the hiring process…
References
Alessandra, T. (2017, 7 December). DISC vs. MBTI Assessments. International Coaching Federation. Retrieved on 12 April 2022, from https://coachingfederation.org/blog/disc-vs-mbti-assessments
Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1-26. https://doi.org/10.1111/j.1744-6570.1991.tb00688.x
Barrick, M. R., Mount, M. K., & Judge, T. A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection and Assessment, 9, 9-30. https://doi.org/10.1111/1468-2389.00160
Birkeland, S. A., Manson, T. M., Kisamore, J. L., Brannick, M. T., & Smith, M. A. (2006). A meta‐analytic investigation of job applicant faking on personality measures. International Journal of Selection and Assessment, 14, 317-335.
Diekmann, J., & König, C. J. (2018). Personality testing in personnel selection: Love it? Leave it? Understand it! In Current issues in work and organizational psychology (pp. 17-31). Routledge.
Donovan, J. J., Dwight, S. A., & Hurtz, G. M. (2003). An assessment of the prevalence, severity, and verifiability of entry-level applicant faking using the randomized response technique. Human Performance, 16, 81-106. https://doi.org/10.1207/S15327043HUP1601_4
Ellingson, J. E., Smith, D. B., & Sackett, P. R. (2001). Investigating the influence of social desirability on personality factor structure. Journal of Applied Psychology, 86, 122. https://doi.org/10.1037/0021-9010.86.1.122
Gangestad, S., & Snyder, M. (1985). “To carve nature at its joints”: On the existence of discrete classes in personality. Psychological Review, 92, 317. https://doi.org/10.1037/0033-295X.92.3.317
Guastello, S. J., Guastello, D. D., & Craft, L. L. (1989). Assessment of the Barnum effect in computer-based test interpretations. The Journal of Psychology, 123, 477-484. https://doi.org/10.1080/00223980.1989.10543001
Haynie, S. (2021, 10 December). Should Personality Assessments Be Used In Hiring? Forbes. Retrieved on 12 April 2022, from https://www.forbes.com/sites/forbescoachescouncil/2021/06/03/should-personality-assessments-be-used-in-hiring/?sh=497a805737c0
Hiring tools: Is DiSC a good tool for hiring? (2022). DiSC Profile. https://www.discprofile.com/everything-disc/hiring
Hogan, J., Barrett, P., & Hogan, R. (2007). Personality measurement, faking, and employment selection. Journal of Applied Psychology, 92, 1270. https://doi.org/10.1037/0021-9010.92.5.1270
Kennedy, R. B., & Kennedy, D. A. (2004). Using the Myers-Briggs Type Indicator® in career counseling. Journal of Employment Counseling, 41, 38-43. https://doi.org/10.1002/j.2161-1920.2004.tb00876.x
Morgeson, F. P., Campion, M. A., Dipboye, R. L., Hollenbeck, J. R., Murphy, K., & Schmitt, N. (2007). Reconsidering the use of personality tests in personnel selection contexts. Personnel Psychology, 60, 683-729. https://doi.org/10.1111/j.1744-6570.2007.00089.x
Murphy, K. R. (2005). Why don’t measures of broad dimensions of personality perform better as predictors of job performance? Human Performance, 18, 343-357. https://doi.org/10.1207/s15327043hup1804_2
Niessen, A. S. M., Meijer, R. R., & Tendeiro, J. N. (2017). Measuring non-cognitive predictors in high-stakes contexts: The effect of self-presentation on self-report instruments used in admission to higher education. Personality and Individual Differences, 106, 183-189. https://doi.org/10.1016/j.paid.2016.11.014
Robins, R. W., John, O. P., & Caspi, A. (1998). The typological approach to studying personality. In R. B. Cairns, L. R. Bergman, & J. Kagan (Eds.), Methods and models for studying the individual (pp. 135–160). Sage Publications, Inc.
Rosse, J. G., Stecher, M. D., Miller, J. L., & Levin, R. A. (1998). The impact of response distortion on preemployment personality testing and hiring decisions. Journal of Applied Psychology, 83, 634. https://doi.org/10.1037/0021-9010.83.4.634
Snyder, D. K. (2000). Computer-assisted judgment: Defining strengths and liabilities. Psychological Assessment, 12, 52. https://doi.org/10.1037/1040-3590.12.1.52
Stewart, G. L., Darnold, T. C., Zimmerman, R. D., Parks, L., & Dustin, S. L. (2010). Exploring how response distortion of personality measures affects individuals. Personality and Individual Differences, 49, 622-628. https://doi.org/10.1016/j.paid.2010.05.035
Zerbe, W. J., & Paulhus, D. L. (1987). Socially desirable responding in organizational behavior: A reconception. Academy of Management Review, 12, 250-264. https://doi.org/10.5465/amr.1987.4307820