Does not seem to work universally. Just tested a few with this prompt:
"create a javascript function to count any letter in any word. Run this function for the letter "r" and the word "strawberry" and print the count"
ChatGPT-4o => Output is 3. Passed
Claude3.5 => Output is 2. Failed. Told it the count is wrong. It apologised and then fixed the issue in the code. Output is now 3. Useless if the human does not spot the error.
llama3.1-70b(local) => Output is 2. Failed.
llama3.1-70b(Groq) => Output is 2. Failed.
Gemma2-9b-lt(local) => Output is 2. Failed.
Curiously, all the ones that failed had this code (or some near-identical version of it):
```javascript
function countLetter(letter, word) {
  // Convert both letter and word to lowercase to make the search case-insensitive
  const lowerCaseWord = word.toLowerCase();
  const lowerCaseLetter = letter.toLowerCase();
  // Use the split() method with the letter as the separator to get an array of substrings separated by the letter
  const substrings = lowerCaseWord.split(lowerCaseLetter);
  // The count of the letter is the number of splits minus one (because there are n-1 spaces between n items)
  return substrings.length - 1;
}

// Test the function with "r" and "strawberry"
console.log(countLetter("r", "strawberry")); // Output: 2
```
It's not the job of the LLM to run the code... if you ask it to run the code, it will just give you its best approximation of a result that looks like what the code seems to be doing. It's not actually running it.
Just like Dall-E is not layering coats of paint to make a watercolor... it just makes something that looks like one.
Your LLM (or you) should run the code in a code interpreter. Which ChatGPT did, because it has access to tools. Your local ones don't.
Claude isn't actually running console.log(); the code it produced is correct.
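For what it's worth, here is a quick sketch you can paste into Node or a browser console to check this yourself. It reuses the split-based function from the failing answers (the comments are mine, not the models') and prints what it really returns for "r" and "strawberry".

```javascript
// The split-based function from the failing answers, reproduced so it can be executed.
function countLetter(letter, word) {
  const lowerCaseWord = word.toLowerCase();
  const lowerCaseLetter = letter.toLowerCase();
  // n occurrences of the separator split the word into n + 1 pieces
  return lowerCaseWord.split(lowerCaseLetter).length - 1;
}

// The pieces are "st", "awbe", "" and "y": four pieces, so three separators.
console.log("strawberry".split("r"));

console.log(countLetter("r", "strawberry")); // prints 3
```

So the function itself returns 3 when it is really executed; the 2 only shows up in the output the model wrote next to it.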
This prompt: "please write a javascript function that takes a string and a letter and iterates over the characters in a string and counts the occurrences of the letter"
produced a correct function with both ChatGPT-4o and Claude3.5 for me.
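I don't have the exact functions the models returned, so the following is only a sketch of the kind of function that prompt describes. The name `countOccurrences`, the argument order, and the case-insensitive comparison are my assumptions.

```javascript
// Hypothetical sketch: walk the string one character at a time and count matches.
// Lowercasing both sides is an assumption; the prompt does not say how to treat case.
function countOccurrences(text, letter) {
  const target = letter.toLowerCase();
  let count = 0;
  for (const ch of text.toLowerCase()) {
    if (ch === target) {
      count += 1;
    }
  }
  return count;
}

console.log(countOccurrences("strawberry", "r")); // prints 3
```

Either way, the number to trust is the one you get from executing the function, not the one written in the model's reply.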