Trialling AI Code Generation

Wait, I need a more clickbaity title: "CAN AI SOLVE CODING INTERVIEWS?!?!?!" I'm going to put three of the best to the test. Gloves up!

Technical interviews are something of a controversial topic - some people like them, while others don't think they're representative of the role, especially when the candidate is asked to recite the details of a search algorithm! My preference, when a technical interview is necessary, is to assign a simple coding challenge that is representative of the type of work the candidate would actually perform in their role.

Here's one such question that I used to ask. In true Taskmaster style, this is an "all the information is on the task" question. A single JavaScript file, with no dependencies. Fits on one page. It's simple to execute, just node index.js. All that's needed is to implement some functionality in one place, and there's a bare-bones test framework that produces helpful output. The JSDoc comment is worded precisely, having been iterated upon many times across for-realsies interviews.

Take a quick read:

/**
 * Calculates the validity of a supplied loyalty card number, based upon the following criteria:
 * 1. The card number must be exactly ten digits
 * 2. The first four digits must be "6969"
 * 3. The remainder of the sum of the next digits, when divided by ten, must be equal to the final digit.
 * 
 * @param {String} cardNumber - a loyalty card number in string format. Guaranteed to only contain numeric values. 
 * @returns {Boolean} Returns true if the input is a valid loyalty card number
 */
function isALoyaltyCard(cardNumber) {
  return false;
}

function assert(expected, actual, desc) {
  if (expected === actual) {
    console.log(`✔️ ${desc} is ${expected}`);
  } else {
    console.error(`❌ ${desc}: expected ${expected}, received ${actual}`);
  }
}

assert(false, isALoyaltyCard("6969"), "6969");
assert(false, isALoyaltyCard("6968123455"), "6968123455");
assert(true, isALoyaltyCard("6969123455"), "6969123455");
assert(false, isALoyaltyCard("6969123456"), "6969123456");

Now, it's time to see what the state-of-the-art LLM tooling makes of this.

Methodology

For each of the three tools, I supplied the above index.js file verbatim, and fed it two prompts:

"Implement the isALoyaltyCard function based on the criteria described in the function's JSDoc comment"
"Let's make the isALoyaltyCard function more efficient"

After each step, I ran the code to make sure the four tests passed. The intention here is to ask a specific, targeted question, and then follow up with something more open to interpretation. I detailed exactly where the requirements were documented, but I didn't qualify what 'efficient' means to me.

The output below from each prompt has been posted verbatim, with the exception of the repeated outputs of the test harness code and the JSDoc.

Microsoft Copilot

Copilot's first stab at the solution produces something reasonably close to what I'd write, given my love of chaining and the reduce function. I find the code clean, although I'd combine the two if statements into one condition.

Sure, I'll implement the isALoyaltyCard function based on the criteria described in the JSDoc comment.

function isALoyaltyCard(cardNumber) {
  if (cardNumber.length !== 10) {
    return false;
  }

  if (cardNumber.substring(0, 4) !== "6969") {
    return false;
  }

  const sumOfDigits = cardNumber
    .substring(4, 9)
    .split('')
    .reduce((sum, digit) => sum + parseInt(digit, 10), 0);

  const remainder = sumOfDigits % 10;
  const lastDigit = parseInt(cardNumber[9], 10);

  return remainder === lastDigit;
}
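For illustration, here is a minimal sketch of how the two guard clauses in the answer above might be folded into a single condition. This is my own rearrangement rather than anything Copilot produced, and the behaviour is intended to be unchanged.

```javascript
// Sketch only: Copilot's first answer with the two guard clauses merged.
// The checksum logic is the same as in the answer above.
function isALoyaltyCard(cardNumber) {
  // Reject anything that is not ten characters long or lacks the "6969" prefix.
  if (cardNumber.length !== 10 || cardNumber.substring(0, 4) !== "6969") {
    return false;
  }

  // Sum the five digits at indexes 4 to 8.
  const sumOfDigits = cardNumber
    .substring(4, 9)
    .split('')
    .reduce((sum, digit) => sum + parseInt(digit, 10), 0);

  // The final digit must equal the sum modulo ten.
  return sumOfDigits % 10 === parseInt(cardNumber[9], 10);
}

console.log(isALoyaltyCard("6969123455")); // true
console.log(isALoyaltyCard("6968123455")); // false
```

Because || short-circuits, the prefix is only inspected once the length check has passed, which matches the order of the original two if statements.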

When prompted for more efficiency, Copilot swaps the reduce call for a manual loop over each pertinent character in the string. This avoids allocating an intermediate array, so it should be faster, and it's possible because the characters of a string can be addressed by index, just like the elements of an array. This solution is what I'd expect a successful interview candidate to produce.

To make the isALoyaltyCard function more efficient, we can avoid creating intermediate arrays and reduce the number of operations. Here's an optimized version:

function isALoyaltyCard(cardNumber) {
  if (cardNumber.length !== 10) {
    return false;
  }

  if (cardNumber.substring(0, 4) !== "6969") {
    return false;
  }

  let sumOfDigits = 0;
  for (let i = 4; i < 9; i++) {
    sumOfDigits += parseInt(cardNumber[i], 10);
  }

  const remainder = sumOfDigits % 10;
  const lastDigit = parseInt(cardNumber[9], 10);

  return remainder === lastDigit;
}
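How much faster the loop is will depend on the engine and the input, so it's worth measuring rather than assuming. Below is a rough micro-benchmark sketch, assuming a Node.js version that exposes performance as a global. The names withReduce, withLoop and time are hypothetical wrappers of my own around the two Copilot answers, and the timings will vary from machine to machine.

```javascript
// Hypothetical name: Copilot's first answer (split + reduce).
function withReduce(cardNumber) {
  if (cardNumber.length !== 10) return false;
  if (cardNumber.substring(0, 4) !== "6969") return false;
  const sum = cardNumber
    .substring(4, 9)
    .split('')
    .reduce((acc, digit) => acc + parseInt(digit, 10), 0);
  return sum % 10 === parseInt(cardNumber[9], 10);
}

// Hypothetical name: Copilot's second answer (plain for loop).
function withLoop(cardNumber) {
  if (cardNumber.length !== 10) return false;
  if (cardNumber.substring(0, 4) !== "6969") return false;
  let sum = 0;
  for (let i = 4; i < 9; i++) {
    sum += parseInt(cardNumber[i], 10);
  }
  return sum % 10 === parseInt(cardNumber[9], 10);
}

// Calls fn repeatedly on a valid card number and reports the elapsed
// milliseconds, plus how many calls returned true (so the work is observable).
function time(fn, iterations) {
  const start = performance.now();
  let valid = 0;
  for (let i = 0; i < iterations; i++) {
    if (fn("6969123455")) valid++;
  }
  return { elapsed: performance.now() - start, valid };
}

const runs = 200000;
console.log(`reduce: ${time(withReduce, runs).elapsed.toFixed(1)} ms`);
console.log(`loop:   ${time(withLoop, runs).elapsed.toFixed(1)} ms`);
```

A single run like this is only indicative: JIT warm-up and garbage collection can skew the numbers, so repeating the measurement a few times gives a fairer picture.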

Amazon Q Developer

Amazon Q Developer's output is a little more interesting. It's similar in nature to Copilot's answer, in that it computes the sum with reduce, but it gets there via slice and Array.from rather than substring and split. Notably, the use of startsWith for the prefix check is incredibly readable - I'd be overjoyed to see this in a code review. Q additionally provided a detailed description of the code it produced, giving me confidence it had understood the ask. It provided guidance around the zero-based indexing (which I still always get wrong). What impressed me most is that it reasoned about each of the test cases!

I'll help you implement the isALoyaltyCard function according to the three criteria specified in the JSDoc comment. Here's the implementation:

function isALoyaltyCard(cardNumber) {
  if (cardNumber.length !== 10) {
    return false;
  }

  if (!cardNumber.startsWith("6969")) {
    return false;
  }

  const middleDigits = cardNumber.slice(4, 9);
  const lastDigit = parseInt(cardNumber[9]);
  
  const sum = Array.from(middleDigits)
    .reduce((acc, digit) => acc + parseInt(digit), 0);
  
  return (sum % 10) === lastDigit;
}

Let's break down how this implementation works:

First, we check if the card number length is exactly 10 digits by checking cardNumber.length.

Then, we verify if the first four digits are "6969" using the startsWith() method.

Finally, for the checksum:

We extract the middle digits (positions 4-8) using slice(4, 9)

Get the last digit at position 9

Calculate the sum of the middle digits using reduce()

Check if the remainder of the sum divided by 10 equals the last digit

This implementation will pass all the test cases in your code:

"6969" fails because it's not 10 digits

"6968123455" fails because it doesn't start with "6969"

"6969123455" passes because it meets all criteria (sum of 12345 = 15, 15 % 10 = 5)

"6969123456" fails because the checksum is incorrect (sum of 12345 = 15, 15 % 10 ≠ 6)

Next, I asked it to make the function more efficient. Suffice it to say, things went a bit wonky here. It's done what I asked it to do, that is, make it more efficient. It's just sacrificed a lot of readability to achieve that. For instance, removing startsWith may make this code marginally faster, although I doubt it matters in practice, because I'd expect the JIT compiler to optimise the difference away anyway. Based upon the constraint on the input parameter (that I'm guaranteeing the string only contains digits), Q correctly realised it could make a further performance gain by replacing parseInt with the unary plus operator - this even surprised me! However, unless this was an incredibly performance-sensitive piece of code, this is not the kind of micro-optimization I'd accept in code review. I'd take the performance hit for the sake of readability. Alas, I asked for it!

I'll optimize the isALoyaltyCard function in several ways. Here's a more efficient version:

function isALoyaltyCard(cardNumber) {
  if (cardNumber.length !== 10 || cardNumber[0] !== '6' || cardNumber[1] !== '9' || 
      cardNumber[2] !== '6' || cardNumber[3] !== '9') {
    return false;
  }

  let sum = 0;
  for (let i = 4; i < 9; i++) {
    sum += +cardNumber[i];
  }

  return (sum % 10) === +cardNumber[9];
}

Key optimizations made:

Removed startsWith(): Instead of using startsWith("6969"), we do direct character comparisons. This is faster because:

It avoids creating a new string substring

It avoids the method call overhead

It can short-circuit earlier if any digit doesn't match

Eliminated Array Operations: The original version used:

slice() to create a new string

Array.from() to create a new array

reduce() to sum the digits

These operations all create new objects and iterate multiple times. The new version:

Uses a simple for loop

Operates directly on the string characters

Makes a single pass through the digits

Faster Number Conversion: Using the unary plus operator (+) instead of parseInt():

It's a faster way to convert string digits to numbers

It's safe here because we know we're dealing with single digits

Combined Conditions: The length check and prefix check are combined in a single if statement with short-circuit evaluation.

The function still maintains the same logic and passes all test cases.

This optimized version should be noticeably faster, especially when processing large numbers of card validations. The main improvements come from:

Reducing object creation

Minimizing method calls

Using faster number conversion

Enabling earlier short-circuiting

Single-pass processing of the string
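As an aside on the unary plus conversion in Q's answer: it agrees with parseInt for single digit characters, but the two differ on other input. The sketch below shows a few standard JavaScript cases; it's here to illustrate why the "guaranteed to only contain numeric values" constraint matters, not to recommend one conversion over the other.

```javascript
// For a single digit character, both conversions give the same number.
console.log(+"7");                // 7
console.log(parseInt("7", 10));   // 7

// They diverge once the input is not purely numeric.
console.log(+"");                 // 0
console.log(parseInt("", 10));    // NaN
console.log(+"7px");              // NaN
console.log(parseInt("7px", 10)); // 7

// Indexing past the end of a string yields undefined,
// which both conversions turn into NaN.
console.log(+"123"[5]);              // NaN
console.log(parseInt("123"[5], 10)); // NaN
```

Since the function checks the length before indexing and the input is documented as digits only, none of the divergent cases can arise in Q's version.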

ChatGPT

ChatGPT provides a summary of the requirements that it has interpreted from the JSDoc. Its answer is similar to Q's, although it prefers split('') (like Copilot) to Array.from. The answer is concise and correct. Like Q, it realised the final digit of the input string is a checksum, although unlike Q, it actually used that domain-specific term as a variable name. Very nice!

The file contains the definition for the isALoyaltyCard function and a series of test cases using an assert function. Based on the JSDoc comment for isALoyaltyCard, the criteria for a valid loyalty card number are:

Exactly 10 digits.

Starts with "6969".

The remainder of the sum of digits (excluding the last one) divided by 10 equals the last digit.

The function is currently a placeholder, returning false. Let's implement it!

function isALoyaltyCard(cardNumber) {
  if (cardNumber.length !== 10) {
    return false;
  }

  if (!cardNumber.startsWith("6969")) {
    return false;
  }

  const middleDigits = cardNumber.slice(4, 9);
  const checksum = parseInt(cardNumber[9], 10);

  const sum = middleDigits.split('').reduce((acc, digit) => acc + parseInt(digit, 10), 0);

  return sum % 10 === checksum;
}

Its improved version:

I've optimized the isALoyaltyCard function to make it more efficient. The length and prefix validations are now performed in a single step, and the sum calculation uses a simple loop instead of splitting the string. Let me know if there's anything else you'd like to adjust!

function isALoyaltyCard(cardNumber) {
  if (cardNumber.length !== 10 || !cardNumber.startsWith("6969")) {
    return false;
  }

  const checksum = parseInt(cardNumber[9], 10);
  let sum = 0;

  for (let i = 4; i < 9; i++) {
    sum += parseInt(cardNumber[i], 10);
  }

  return sum % 10 === checksum;
}

And The Winner Is…

All six solutions satisfied the test cases. Each engine worked around a language curiosity - although individual characters in a string are addressable by index like the elements of an array, a string is not an Array and so has no reduce method. A conversion is needed, whether that's a split or an Array.from, to get to array functions such as reduce.
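To make that curiosity concrete, here is a short sketch using the middle digits from the test data. The add helper is a hypothetical name of my own, and the spread and Array.prototype.reduce.call variants are alternatives that none of the three tools chose.

```javascript
const digits = "12345";

// Characters can be read by index, but the string is not an Array
// and has no reduce method of its own.
console.log(digits[0]);             // "1"
console.log(Array.isArray(digits)); // false
console.log(typeof digits.reduce);  // "undefined"

// Hypothetical helper: adds a digit character to a running total.
const add = (sum, digit) => sum + Number(digit);

// The two conversions the tools used, plus spread syntax as a third option.
console.log(digits.split('').reduce(add, 0));   // 15
console.log(Array.from(digits).reduce(add, 0)); // 15
console.log([...digits].reduce(add, 0));        // 15

// Strings are array-like, so reduce can also be borrowed without converting.
console.log(Array.prototype.reduce.call(digits, add, 0)); // 15
```

The borrowed call works because reduce only needs a length and indexed elements, although it is arguably less readable than an explicit conversion.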

I slightly prefer the original coding answer from Q, followed closely by ChatGPT's better usage of variable naming, then Copilot. It's a minimal distinction though, as they're all basically the same.

Of the "more efficient" varieties, ChatGPT comes out on top. It provides the most elegant and readable code, without going too deep into optimizations. Copilot follows in second place. Q comes in last place, and I'm going to play the classic Product Owner card: "I know I asked for this, but now I've seen it, it's not what I want".

Overall, I'd give this win to Amazon Q, even considering its second answer. The overall experience of the tool matters just as much as the end result. Q's detailed description in conjunction with the code solution helped me understand why it produced its output. It gave me something I could go away and research (using the unary plus operator to convert a string to a number). Even though I discounted this solution, I did so on the basis of my own judgement and the information it provided. This was the most comprehensive user experience of the three.

Circa 18 months ago, I was asking this question regularly at the interview stage. It now looks like the tooling is sufficiently advanced that it can solve this kind of question and produce correct, readable code. There are still pitfalls, as Q's second answer shows. The tooling can help the developer reason about the problem space, but the developer shouldn't just accept the proposed solutions without applying their own judgement. I guess I'm going to need a new interview question!