The Ultimate AI Showdown: ChatGPT vs. Bard vs. Bing
Chapter 1: Introduction to AI Coding Tools
The landscape of software development has been revolutionized by the advent of Large Language Models (LLMs) and generative AI technologies. The pressing question is: which tool excels at code generation? To answer this, we conducted a coding challenge involving three prominent web-based AI tools: ChatGPT, Bard, and Bing. While there are paid options like GitHub Copilot and Replit Ghostwriter that we will explore in a subsequent article, our focus here is on free alternatives.
We assessed each tool based on the accuracy of the generated code, overall performance, client/test code, and the quality of explanations or guidance provided alongside their solutions.
Section 1.1: Coding Challenge #1 - Implementing an LRU Cache
Our first coding challenge was to create a Least Recently Used (LRU) Cache. This data structure, which I encountered during an interview at Amazon, is essentially a fixed-size dictionary that evicts the least recently used key once it reaches capacity. After successfully landing the job at AWS, I was curious to see how our AI assistants would tackle this problem.
The coding prompt for this challenge was straightforward: develop a Python class named LRUCache with a constructor accepting the cache size. This class should manage string keys and values, ensuring that the cache never exceeds its maximum size and evicting the least recently used key as needed. The implementation requires using a dictionary to store data and another structure to track usage.
The AI tools produced varying solutions. ChatGPT's implementation was adequate, offering robust client code and an extensive explanation. I found its client code particularly useful for testing all three tools. Here’s a brief look at the client code:
cache = LRUCache(3)
cache.put('key1', 'value1')
cache.put('key2', 'value2')
cache.put('key3', 'value3')
print(cache.get('key1')) # Expected Output: value1
cache.put('key4', 'value4') # 'key2' gets evicted
print(cache.get('key2')) # Expected Output: None
print(cache.get('key4')) # Expected Output: value4
While ChatGPT's solution was functional, it performed significantly slower than its competitors, primarily due to using a list for managing usage order. This resulted in O(n) operations to move keys, which isn't ideal for performance.
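To illustrate the bottleneck, here is a minimal sketch of the list-based approach (my reconstruction for illustration, not ChatGPT's verbatim output): every cache hit pays for a linear scan to reposition the key.

class LRUCache:
    def __init__(self, max_size):
        self.max_size = max_size
        self.cache = {}   # key -> value
        self.usage = []   # least recently used key sits at index 0

    def get(self, key):
        if key not in self.cache:
            return None
        self.usage.remove(key)   # O(n) scan: the performance bottleneck
        self.usage.append(key)   # mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.usage.remove(key)             # O(n) again
        elif len(self.cache) >= self.max_size:
            del self.cache[self.usage.pop(0)]  # evict least recently used
        self.cache[key] = value
        self.usage.append(key)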
Section 1.2: Performance Analysis of LRU Cache Solutions
Bing provided a correct solution, optimizing performance with a doubly-linked list, but lacked client code. Its implementation efficiently managed recent usage, as highlighted in the following code snippet.
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    # Implementation goes here...
Bard emerged as the leader in this round, utilizing the collections.OrderedDict, which simplified the code significantly while maintaining performance. Bard's code harnessed Python's magic methods for a more elegant solution.
from collections import OrderedDict

class LRUCache:
    def __init__(self, max_size):
        self.max_size = max_size
        self.cache = OrderedDict()

    def put(self, key, value):
        # Implementation goes here...
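The published snippet stops short of the full logic, but an OrderedDict-based cache can be completed in a few lines. Here is a plain sketch using move_to_end and popitem; Bard's actual version reportedly leaned on magic methods, so treat this as an illustrative reconstruction rather than its exact code.

from collections import OrderedDict

class LRUCache:
    def __init__(self, max_size):
        self.max_size = max_size
        self.cache = OrderedDict()   # insertion order doubles as usage order

    def get(self, key):
        if key not in self.cache:
            return None
        self.cache.move_to_end(key)  # mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)  # refresh position on overwrite
        if len(self.cache) > self.max_size:
            self.cache.popitem(last=False)  # evict least recently used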
In terms of execution time, Bard triumphed, completing the task in 0.625 seconds, followed by Bing at 1.405 seconds, and ChatGPT lagging at 34.123 seconds.
Chapter 2: Coding Challenge #2 - Atbash Cipher Implementation
Moving on to our second challenge, we tasked the AI tools with implementing the Atbash cipher, a substitution scheme that maps each letter to its mirror in the alphabet ('A' to 'Z', 'B' to 'Y', and so on). The prompt was simple: create a function that applies the Atbash cipher to a given string.
ChatGPT delivered a correct solution along with comprehensive client code and explanations. In contrast, Bing also produced a valid implementation but lacked client code, opting for a hard-coded dictionary instead of calculating mirrored letters through ASCII math.
def atbash_cipher(text):
    # Implementation goes here...
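For reference, the ASCII-math approach mentioned above fits in a handful of lines; this sketch is my own illustration rather than any tool's exact output:

def atbash_cipher(text):
    result = []
    for ch in text:
        if 'a' <= ch <= 'z':
            # Mirror within the lowercase range: 'a' <-> 'z', 'b' <-> 'y', ...
            result.append(chr(ord('a') + ord('z') - ord(ch)))
        elif 'A' <= ch <= 'Z':
            result.append(chr(ord('A') + ord('Z') - ord(ch)))
        else:
            result.append(ch)  # pass non-letters through unchanged
    return ''.join(result)

print(atbash_cipher('Hello, World!'))  # Svool, Dliow!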
Bard's attempt, however, faltered, resulting in an "IndexError" due to an incorrect method of calculating mirrored letters.
The winner of round #2 is ChatGPT, thanks to its succinct code and thorough explanations.
Chapter 3: Coding Challenge #3 - Finding the Least Common Multiple
For our final challenge, we asked the tools to develop a function to find the least common multiple (LCM) of a list of integers. Bard excelled here, leveraging the built-in math.lcm function (available since Python 3.9), which simplified the code significantly.
import math

def lcm(numbers):
    # Implementation goes here...
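Since math.lcm accepts any number of integer arguments, the whole function can reduce to a single unpacking call. A minimal sketch (mine, though it likely mirrors Bard's approach):

import math

def lcm(numbers):
    # math.lcm (Python 3.9+) computes the LCM of any number of integers
    return math.lcm(*numbers)

print(lcm([4, 6, 10]))  # 60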
Both ChatGPT and Bing provided working solutions, but Bard's use of Python's standard library made its solution stand out.
Summary of Results
In the culmination of our AI coding competition, Bard won two rounds but stumbled in one. Its ability to utilize Python libraries effectively demonstrated a strong grasp of the language. ChatGPT excelled in providing detailed client code and explanations, though its performance in the LRU cache challenge was subpar. Bing maintained consistent performance across all challenges but lacked in-depth explanations.
Ultimately, as a senior engineer, I would recommend starting with Bard or ChatGPT, though all three tools show great promise as coding assistants. For further insights into rapid application development using ChatGPT, consider checking out my book, Rapid Software Engineering with ChatGPT.