Midjourney, DALL·E 3, and Stable Diffusion: A Comprehensive Review
Chapter 1: Introduction to AI Image Generators
OpenAI has announced the much-anticipated DALL·E 3, which is scheduled for release in early October and promises integration with ChatGPT. My experience with its predecessor, DALL·E 2, was less than satisfactory: although it was free to use, it failed to produce any usable images.
In contrast, I found Stable Diffusion to provide slightly improved results, although it still did not meet my expectations. Ultimately, I opted for a paid subscription to Midjourney, which, despite its drawbacks related to the Discord platform, consistently produced the highest quality images. My articles on Medium frequently feature images generated by Midjourney, especially when Unsplash lacks suitable options.
A recent Twitter thread comparing DALL·E 3 and Midjourney caught my attention. The images from DALL·E 3 looked just as impressive, if not better. Though DALL·E 3 isn't publicly available yet, OpenAI has been sharing sample images, along with the prompts used to create them, on its research page.
Since Stable Diffusion also updated its software recently, on July 26, I felt compelled to assess all three platforms. My hope is that DALL·E 3 surpasses Midjourney, allowing me to discontinue my subscription. For now, Midjourney leads the pack by a significant margin and is the one to beat. I will have to wait several weeks to explore DALL·E 3 for myself, so this evaluation is only a preliminary glimpse.
The ground rules for this comparison are simple: I took the example prompts from the DALL·E 3 website and copied them into Midjourney and Stable Diffusion, selecting the best outputs for evaluation. Midjourney generates four images for each prompt, while Stable Diffusion produces just one, so I ran the latter multiple times to get a comparable set to choose from. Stable Diffusion also offers a range of styles, which I selected based on what each prompt called for. Each image was rated on a scale from 1 to 7 for accuracy and overall quality; with seven prompts, each tool can score a maximum of 49.
It's worth noting that the DALL·E 3 examples shared by OpenAI are likely curated to showcase their tool in the best possible light.
Section 1.1: Prompt Comparisons
Prompt #1
An expressive oil painting depicting a basketball player dunking as an explosion of a nebula.
DALL·E 3
Rating: 6/7
Midjourney
Rating: 5/7
Stable Diffusion
Rating: 2/7
Prompt #2
Tiny potato kings adorned with majestic crowns, presiding over their potato kingdom filled with subjects and castles.
DALL·E 3
Rating: 6/7
Midjourney
Rating: 5/7
Stable Diffusion
Rating: 4/7
Prompt #3
An antique botanical illustration featuring a strange lily hybrid with a Venus flytrap, its petals poised as if ready to snap at unsuspecting insects.
DALL·E 3
Rating: 5/7
Midjourney
Rating: 5/7
Stable Diffusion
Rating: 3/7
Prompt #4
A chic chair designed like a pumpkin with deep orange cushioning in a modern loft environment.
DALL·E 3
Rating: 5/7
Midjourney
Rating: 5/7
Stable Diffusion
Rating: 6/7
Prompt #5
A surreal landscape made entirely of various meats, showcasing tender hills of roast beef, chicken drumstick trees, and bacon rivers under a pepperoni sun.
DALL·E 3
Rating: 5/7
Midjourney
Rating: 4/7
Stable Diffusion
Rating: 2/7
Prompt #6
A vintage travel poster for Venus featuring its thick, yellowish clouds and a vintage rocket ship silhouette approaching.
DALL·E 3
Rating: 6/7
Midjourney
Rating: 5/7
Stable Diffusion
Rating: 4/7
Prompt #7
A flat design illustration depicting a diverse family of monsters in a playful setting.
DALL·E 3
Rating: 6/7
Midjourney
Rating: 6/7
Stable Diffusion
Rating: 6/7
Totals
DALL·E 3: 39/49
Midjourney: 35/49
Stable Diffusion: 27/49
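As a quick check of the arithmetic, the totals can be recomputed from the per-prompt ratings listed above. The following is a minimal Python sketch. The `ratings` dictionary and the `tally` helper are illustrative names chosen for this example and are not part of any of the three tools.

```python
# Per-prompt ratings (1-7), transcribed from the comparison above
# in prompt order #1 through #7.
ratings = {
    "DALL·E 3": [6, 6, 5, 5, 5, 6, 6],
    "Midjourney": [5, 5, 5, 5, 4, 5, 6],
    "Stable Diffusion": [2, 4, 3, 6, 2, 4, 6],
}

MAX_PER_PROMPT = 7  # top of the rating scale


def tally(scores):
    """Return (total, maximum possible) for one tool's list of ratings."""
    return sum(scores), MAX_PER_PROMPT * len(scores)


# Hypothetical helper structure: tool name -> (total, maximum)
totals = {tool: tally(scores) for tool, scores in ratings.items()}

# Print the tools from highest to lowest total.
for tool, (total, maximum) in sorted(
    totals.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{tool}: {total}/{maximum}")
```

Keeping the ratings in one place like this makes it easy to add an eighth prompt or a fourth tool later without re-adding the columns by hand.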
These ratings reflect my personal perspective. While some might prefer the images generated by Stable Diffusion, I find that it still falls short of Midjourney and DALL·E 3 in terms of prompt interpretation. DALL·E 3 has the advantage that OpenAI showcases only its finest examples, so its lead holds only if the live tool matches these results. If DALL·E 3 can accurately incorporate text into images, as seen in the Venus example, it could become my top choice.
Feel free to share your thoughts about these comparisons in the comments below!
Chapter 2: Video Comparisons of AI Image Generators
For a deeper dive into the differences between these three AI tools, check out the following videos:
In the video titled "Which is better? Midjourney v6 vs. DALL-E 3 vs. Stable Diffusion XL," you can find a thorough exploration of the strengths and weaknesses of each image generator.
Another insightful video, "Stable Diffusion 3 vs ChatGPT Dalle-3 vs Midjourney [NEW Best Image Generator?]," presents a fresh perspective on the ongoing competition in AI image generation.
This article was originally published on Generative AI.