INTERVIEWER: “Good morning, nice to meet you. Are you going to get pregnant?” While this may seem like a shocking way to begin a conversation, some version of this dialog occurs in many surgical residency interviews. According to research presented at the 2021 Society for Clinical Surgery Symposium by Arash Fereydooni, MD, and colleagues, more than one-third of vascular applicants in the 2020 Match were asked about marital status and family planning. Females experienced a significantly higher proportion of these inquiries than their male counterparts.
Speak with a few female surgeons, and it becomes clear that bias in interviews extends far beyond training. As a specialty, our answer to this issue can’t be “try to stop asking sexist questions.” Don’t worry though—as usual, I have a solution. And you (as usual?) will probably hate it. The answer? Stop interviewing.
I have heard the resistance already. You are a chief, chair, program director, or some other thing of great significance. You have been interviewing candidates for decades. You have an unwavering confidence in your ability to spend 20 minutes with someone and “feel them out.” Well, no one else has told you this, so I will. Stop. You are not just bad at interviewing, but you are actively terrible. A literal lottery would produce better results. How do I know this? Because you are human.
A few years ago, the University of Texas Medical School at Houston increased its enrollment from 150 to 200 students. The expansion was authorized late in the application cycle, so the class had to be filled with 50 students whom most schools had already rejected, often based on low interview scores. Over the next four years, there was no difference in performance between the last 50 students and the 150 picked first. The interviews had failed to select better candidates. A 2012 study by Jason Dana and Robyn Dawes asked participants to predict students’ GPA for the upcoming year. The participants were given the students’ background information, including previous semesters’ GPA. They then conducted interviews with half of the students. While instructed that a student’s academic record is the best predictor of future performance, the participants often let their interview impressions outweigh these data. As a result, they were much worse at predicting future GPAs among those they interviewed.
Through these and many other studies, science has found that humans have a persistent, irrational confidence in their interviewing ability. In most cases, this misplaced conviction significantly hinders the process of selecting the best applicant. Renowned psychologist Scott Highhouse calls this “the greatest failure of industrial and organizational psychology.”
Most surgeons would point to the interview process as an integral tool for selecting applicants who will be the best fit with their culture. That is because we have a misunderstanding of the most critical components of work culture. We want to hire doctors who will work hard and succeed, and are also easy to get along with. So as a proxy for developing this rubric, we tend to hire people just like ourselves. After all, what is your model for success? Most likely yourself or some other archetype you associate with achievement. If you were a collegiate lacrosse player, you tend to value this characteristic in others. This process leads us to limit the potential paradigms for success. Interviewing tends to exclude outliers, to reject exceptionalism. Would you find the young doctor from the University of Baghdad who will become aortic disease titan Hazim Safi, MD? More likely you wouldn’t even interview him.
Organizational psychologists know that effective work cultures are not built on cohesive personalities. They are formed with congruent values. A department developed with doctors who value collaboration and learning will always outperform one with a great lacrosse team. Our current interview system, however, is poorly designed to identify core values. Most applicants already have set answers for our values-based questions. We think we are interviewing medical students, but, in reality, we are interviewing their publicists.
Interview scenario: The interviewer asks, “What is your greatest weakness?”
Applicant: “I think it’s that I care too much. I will never rest until I get the job done. I will stay single-minded on a task at all costs. I guess what I’m saying is that if you hire me, I will literally murder your enemies.”
In medicine, most of us conduct what are called unstructured interviews. The interviewer is free to explore random details they find relevant. Have you ever lived in the South? What are your hobbies? If you were a tree, what kind of tree would you be? (Obviously the type of tree that would murder your enemies!) The problem with unstructured interviews is that the interviewer rates them highly for perceived effectiveness, but, in reality, they are among the worst predictors of job performance. Our overconfidence in our expertise and experience leads us to cling to this broken system.
Unstructured interviews are not only ineffective, they are also a perfect environment for our implicit biases to run amok. Many of these biases focus on women. They will need to leave early to take care of their children. They will take an extended maternity leave. They will get married and not care about surgery anymore. Gender norms are powerful. Men who brag about their accomplishments are seen as confident, while in women this characteristic is seen as cockiness. Therefore women tend to downplay their achievements. Prior to conducting a skills course, I have the trainees complete a self-assessment regarding their ability to perform various procedures. One rated their confidence in performing an open thoracoabdominal aneurysm repair as “extreme.” The surprise was not to learn that the trainee was a male but rather that he was an intern. Extreme indeed.
A study by Jessi L. Smith and Meghan Huntoon at Montana State University examined women’s unease with self-promotion. They asked two groups of female students to write an essay describing their accomplishments to earn a scholarship. A black box was placed in the room with each group. The box was unexplained to the control group, but the others were told it was a subliminal noise generator that can’t be heard but may cause mild discomfort. The box itself was empty, of course, because there is no such thing as a subliminal noise generator. The group that was warned about possible discomfort performed significantly better on their essay scores. It seems that women may experience distress with self-promotion, but this can be overcome if acknowledged and explained, even by something that does not exist.
Over the past several years, I have conducted a semi-qualitative analysis of my own interview “instincts.” Not of my trainees; that is too small a sample size. When I travel to different institutions or meetings, I try to take stock of the progress of vascular residents I did not take. Particularly of those who did not impress me. The results were sobering enough that I stopped interviewing applicants altogether. Instead, I let them interview me. To their credit, they have all managed not to ask me about my family and marital plans.
If I can’t convince you to turn your hiring process into some megamillions-type lottery carnival, there are some alternatives rooted in science.
Structured interviews allow each candidate to face the same questions, in the same order, in the same amount of time. The interviewer should be blind to the applicant’s CV so the score can provide an independent data point. Each answer should be scored immediately and independently to eliminate a potential halo effect, in which one attribute of the applicant (either an answer or a personal characteristic) is viewed so favorably by the interviewer that all of their other scores are artificially buoyed. The pitchfork effect is the opposite, as a student from Alabama might experience with me during football season.
Structured interviews allow for qualitative assessments over time. Which questions best predicted job performance? If likability is really important to you, then grade it. This is obviously open to bias, but if given a numeric score, it becomes controllable. Other best practices include avoiding panel interviews and group assessments, which are subject to the dominant personality problem. Do you really want all of your hires to mirror the most overbearing and irritating doctor in your group? Finally, use the scores. Sure, it’s shady math, but at least it’s math?
According to psychologists, the best predictors of future job performance are work sample tests. These assessments require applicants to perform physical or psychological tasks similar to those they would experience on the job. So an aspiring sales executive might be given 20 minutes to design an ad campaign. Work sample tests also help to eliminate age, gender and appearance biases. Research in developing these tools for surgery is desperately needed.
Technical skill assessments might seem like a logical place to start. But after 14 years of experience testing trainee surgical proficiency, I would caution against using this metric to predict future ability among novices. The scatterplot of technical skill during the first year of residency is mostly noise, but it sharpens inexorably to competence as training and experience proceed. If we really consider ourselves master teachers, shouldn’t we be fighting over the true disasters anyway? I’ll take the hard worker who can’t tie a knot over the lazy prodigy any day.
It is time to reform our process for selecting surgical trainees and faculty. Questions about marriage and children are not only Match violations, they are also a defiance of federal laws. If we can’t even conduct interviews without committing actual crimes, it’s probably time to move on.
Malachi Sheahan III, MD, is the Claude C. Craighead Jr. professor and chair in the division of vascular and endovascular surgery at Louisiana State University Health Sciences Center in New Orleans.