Using ChatGPT to Create and Evaluate Technical Interviews

There are many use cases for OpenAI’s GPT large language models (LLMs). The more obvious ones include content generation, translation and text summarization.

However, there are also applications for technical recruiters and hiring managers, specifically when designing coding tests for interviews and evaluating candidate responses.

In fact, at CodeInterview we have tested this extensively while developing our AI Assist feature powered by OpenAI. We tried creating and refining questions, as well as asking ChatGPT to rank and score candidate responses automatically.

In this article, we’ll share some of our learnings so you can also utilize ChatGPT to save time when creating questions and evaluating candidates.

Ready? Let’s dive in.

Creating Technical Interview Questions with ChatGPT

First, let’s look at our best practices for designing effective interview questions and technical assessments.

Establishing the foundation

When using ChatGPT for creating technical questions, you need to feed it as much information as possible in advance. Make sure you have the below details laid out:

Question format: multiple choice, open-ended, timed or a classic coding problem.
Difficulty: specify the difficulty you want so it aligns with the role’s seniority and technical requirements.
Language: if you want to test the candidate’s abilities in a specific programming language or technology, specify this when creating the question.
Available time: for timed questions, include the target time to complete so ChatGPT takes this into account when designing questions.

While each question will have its own specifics, this is a basic starting point for you to keep in mind. Don’t start with general prompts – rather begin with your non-negotiable requirements for the question and iterate from there (more on this later).

Crafting the prompt

Now it’s time to design the question prompt based on the established parameters. You will likely need several attempts before the generated questions are clear and relevant to the technical skills you want to test. An example prompt you can start with is this:

“I’m hiring for [position] and looking to design a unique interview question. It should adhere to the below requirements:

1. Be strictly relevant to [position] at a [junior/mid/senior] level

2. Evaluate a [basic/advanced/expert] level of skill in [programming language/technology]

3. Have a target response time of [X] minutes, so the question is neither too time-consuming nor too quick to solve.

Can you generate 2-3 technical interview questions based on these requirements?”

Simply edit the intro or requirements to achieve a better level of detail if you’re not happy with the initial suggestions. When you have a question you’re happy with, it’s time to try it out in action.
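If you generate prompts for several roles, it can help to keep the wording in one place and fill in only the parameters. The snippet below is a minimal sketch of that idea, adapted from the example prompt above. The names QUESTION_PROMPT and build_question_prompt are our own illustrations, not part of any OpenAI or CodeInterview API, and the resulting text still has to be pasted into ChatGPT or sent through whichever client you use.

```python
from string import Template

# Hypothetical template adapted from the example prompt in this article.
QUESTION_PROMPT = Template(
    "I'm hiring for $position and looking to design a unique interview question. "
    "It should adhere to the below requirements:\n"
    "1. Be strictly relevant to $position at a $seniority level\n"
    "2. Evaluate a $skill_level level of skill in $technology\n"
    "3. Have a target response time of $minutes minutes, so the question is "
    "neither too time-consuming nor too quick to solve.\n"
    "Can you generate 2-3 technical interview questions based on these requirements?"
)


def build_question_prompt(position, seniority, skill_level, technology, minutes):
    """Fill the template; substitute() raises KeyError if a field is missing."""
    return QUESTION_PROMPT.substitute(
        position=position,
        seniority=seniority,
        skill_level=skill_level,
        technology=technology,
        minutes=minutes,
    )


if __name__ == "__main__":
    print(build_question_prompt("backend engineer", "senior", "advanced", "Python", 30))
```

Keeping the requirements as named parameters makes it easier to iterate on one detail at a time, such as seniority or time limit, without rewriting the whole prompt.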

Validating the question

Even if the question looks good on paper, you may encounter other issues while solving it. To avoid confusing candidates, make sure to test the question yourself or send it to a team member who can help you validate it.

The goal is to confirm if it aligns with the intended purpose and effectively assesses candidates’ abilities. If something seems wrong, you need to go back to the drawing board and repeat the process.

Building an effective question library

The value of this approach is that you can build a solid library of questions to use in different scenarios. If you’re just doing a few questions, you may be better off doing it manually.

But once you get the hang of ChatGPT for creating questions, you’ll be able to create a set of at least 5-10 questions for the roles you regularly hire for, different seniority levels and languages/technologies. The more you use the tool, the more time and effort you will save as the advantages stack up question by question.
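One way to picture such a library is as a collection keyed by role, seniority and technology, with a flag marking which questions have already been validated. The sketch below is only an illustration under that assumption; the Question and QuestionLibrary names are hypothetical and do not describe how CodeInterview stores templates.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    minutes: int             # target time to complete
    validated: bool = False  # set to True once a human has tested the question


class QuestionLibrary:
    """Hypothetical in-memory library grouped by (role, seniority, technology)."""

    def __init__(self):
        self._by_key = defaultdict(list)

    def add(self, role, seniority, technology, question):
        self._by_key[(role, seniority, technology)].append(question)

    def ready_for(self, role, seniority, technology):
        """Return only the questions that have passed validation."""
        stored = self._by_key.get((role, seniority, technology), [])
        return [q for q in stored if q.validated]


if __name__ == "__main__":
    library = QuestionLibrary()
    library.add("backend", "senior", "Python",
                Question("Design a rate limiter.", 30, validated=True))
    library.add("backend", "senior", "Python",
                Question("Implement an LRU cache.", 25))
    print(library.ready_for("backend", "senior", "Python"))
```

Filtering on the validated flag keeps untested, freshly generated questions out of real interviews.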

With CodeInterview, you get an out-of-the-box solution to create and store templates, ready to load when you start interviewing a candidate.

The key thing to remember is that this technology is still being developed and requires a decent amount of testing and editing before the output is good enough to use with real candidates. Be cautious and always validate questions before using them in actual recruiting.

Evaluating Technical Candidate Responses

You can also utilize ChatGPT to evaluate technical responses from coding assessments, take-home projects and even coding interviews (assuming you use a feature like CodeInterview’s Code Playback).

Let’s look into a few best practices we’d recommend for tapping into ChatGPT’s capabilities to assess candidate performance and provide rankings.

Crafting an effective evaluation prompt

The main goal here is to provide enough context in your prompt, taking into account the question’s goals and desired outcomes.

Make sure to include the actual question, relevant use case and the candidate’s response for evaluation by ChatGPT.

Lastly, ask ChatGPT to rank the response based on the provided criteria. As with generating questions, you may need to do several iterations to design an effective evaluation prompt.
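As a rough illustration, the pieces above (question, use case, criteria and candidate response) can be assembled from a single fixed template so that only the response changes between candidates. This is a sketch under our own assumptions: the wording, the 1 to 10 scale and the build_evaluation_prompt name are hypothetical and should be adapted to your process.

```python
# Hypothetical evaluation template; the wording and 1-10 scale are assumptions.
EVALUATION_TEMPLATE = (
    "You are helping evaluate a technical interview response.\n\n"
    "Question:\n{question}\n\n"
    "Use case:\n{use_case}\n\n"
    "Evaluation criteria:\n{criteria}\n\n"
    "Candidate response:\n{response}\n\n"
    "Score the response from 1 to 10 against each criterion "
    "and briefly explain each score."
)


def build_evaluation_prompt(question, use_case, criteria, response):
    """Build the prompt from fixed wording; only the inputs vary."""
    criteria_lines = "\n".join(f"- {item}" for item in criteria)
    return EVALUATION_TEMPLATE.format(
        question=question,
        use_case=use_case,
        criteria=criteria_lines,
        response=response,
    )


if __name__ == "__main__":
    print(build_evaluation_prompt(
        question="Write a function that reverses a linked list.",
        use_case="Screening for a mid-level backend role.",
        criteria=["Correctness", "Readability", "Handling of edge cases"],
        response="def reverse(head): ...",
    ))
```

Because every candidate is evaluated with identical instructions and criteria, differences in the output are more likely to reflect the responses than the prompt.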

Ensuring consistency

It’s important to maintain uniformity in the prompt structure across different candidate responses. This way, you will ensure a more accurate and fair comparison. Even small tweaks to the prompt may yield inconsistent results, so make sure you only compare candidates who applied for the same position and solved the same problems.

Manual review

We’d recommend performing a manual review of the candidate response and initial results obtained from ChatGPT. This way, you can verify the accuracy and reliability of ChatGPT’s evaluation. If you see big discrepancies, you may have to revise the prompt and/or questions you have designed.
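If you record a manual score alongside the score ChatGPT returned, a simple comparison can point you to the cases worth a second look. The sketch below assumes both scores use the same numeric scale and that a gap of more than two points counts as a big discrepancy; both are our assumptions, and flag_discrepancies is a hypothetical helper name.

```python
def flag_discrepancies(human_scores, model_scores, tolerance=2):
    """Return candidates whose model score is missing or differs from the
    human score by more than `tolerance` points (an assumed threshold)."""
    flagged = []
    for candidate, human in human_scores.items():
        model = model_scores.get(candidate)
        if model is None or abs(human - model) > tolerance:
            flagged.append(candidate)
    return flagged


if __name__ == "__main__":
    human = {"candidate_a": 8, "candidate_b": 4, "candidate_c": 6}
    model = {"candidate_a": 7, "candidate_b": 9}
    print(flag_discrepancies(human, model))
```

If many candidates are flagged, that suggests the evaluation prompt or the question itself needs revising, not the individual scores.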

Ultimately, you will need to adapt and refine the process to align with specific questions and evaluation goals.

Common Mistakes to Avoid in Question Design and Candidate Evaluation

Now, let’s examine some common issues that may arise when using ChatGPT to create technical questions and evaluate responses.

Overreliance on ChatGPT

When creating a technical interview assessment, avoid solely relying on ChatGPT. While it’s a useful tool to speed up the process, it cannot replace logical and innovative input by a human. Rather, recognize ChatGPT’s flaws and advantages and maintain a balanced approach.

Using ChatGPT in isolation

Looking at coding problems and candidate solutions from different perspectives can improve the overall selection. So, in addition to ChatGPT, consider getting feedback from other team members and reviewing responses manually, looking for candidate qualities beyond accuracy and speed of completion.

The thought and approach they put in may be valuable despite not arriving at the best solution. Combining ChatGPT’s insights with human judgment can help you recognize these traits.

Changing prompts too often

Refrain from changing prompts significantly from candidate to candidate during evaluation. Consistency in prompts ensures fair and comparable assessments across all candidates.

Conclusion

As with any technology, ChatGPT is continually evolving, so testing and validation are essential to ensure its effectiveness in real recruitment scenarios. By incorporating ChatGPT thoughtfully and avoiding common pitfalls, you can harness its capabilities to enhance the efficiency and quality of your remote technical interviews and candidate selection.
