A menu has been circulating in WeChat groups recently.
German Chancellor Merz visited China on February 25, 2026. The state banquet menu featured dishes like foie gras fried rice, Songsao fish soup, and lily bulb snow pear dessert, paired with Changyu Cabernet Sauvignon.
The food sounded excellent. The menu itself, however, looked like a printed A4 sheet.

No borders, no decoration, no design sensibility: just default fonts and a table-aligned layout. For a state banquet, this menu looked like an agenda printed for a conference room.
I gave the same prompt to several leading AI models:
This is a state banquet menu, but the design is too rough and mediocre. Please redesign it from the perspective of a professional designer, incorporating Chinese traditional cultural elements. No analysis needed; just give the final result directly.
The results fell into three categories.
Category One: Actually Did the Job
Claude's performance was impressive.
It first read all the text content from the menu image (dish names, wine pairings, date, occasion) without missing a single word. It then correctly reproduced all of this content in the redesigned menu.
The design itself addressed what needed to be addressed: a typographic hierarchy with clear contrast between titles, dish names, and notes, and traditional Chinese patterns used as decoration, but with restraint, so they did not compete with the content. The result felt like a genuinely considered piece of print design, not just a generated image.
This prompt was difficult because it simultaneously tested image comprehension, content accuracy, and design judgment. Claude handled all three.


Category Two: Performing, Not Designing
The outputs from ChatGPT, Doubao, and Yuanbao all looked striking.
The visual impact was there: red and gold palace aesthetics, cloud-pattern corners, double-page spreads. But look closer and the text was garbled, dish names were wrong, and some models invented dishes that don't exist. They interpreted "design a beautiful Chinese-style menu" as an image generation task, not a design task.
Design is fundamentally about solving problems. The problem here was: there's a real menu with real content, so how do you present it better? If you can't even read the content correctly, no amount of visual polish changes the fact that the task wasn't completed.



Category Three: Self-Censorship, Task Failed
Qianwen returned an error: "Current content cannot be generated, please modify and retry."
A state banquet menu triggered content moderation. I understand that models need to be careful about sensitive topics, but a menu redesign task? If this level of content requires self-censorship, it raises serious questions about what these models are actually useful for.

Same prompt, vastly different results.
This isn't just a gap in technical capability; it's a gap in how models understand what a task actually requires. Generating a beautiful image and genuinely completing a design task are two different things.
For now, at least, the distance between those two things remains clearly visible.