Everyone who's spent time with ChatGPT has their own secret recipe—phrases and formatting tricks that seem to get better answers. But let's be honest: we're all just guessing at what actually works.
What if someone actually tested all these prompting techniques instead of relying on hunches? What if we could know with certainty which approaches genuinely produce better results?
That's exactly what researchers at VILA Lab (Mohamed bin Zayed University of AI) did. They tested 26 different prompting techniques and measured two things that actually matter: how much they improved response quality, and whether they made answers more factually correct.
The best part? They didn't just test with one AI system. They ran these experiments across a spectrum of models—from the compact to the colossal—including various versions of Meta's Llama and OpenAI's ChatGPT.
Some of the results will genuinely surprise you—especially if you've been following conventional prompting wisdom. Below, I'll break down what actually works, why it works, and how to immediately start using these research-backed prompts in your own projects.
A quick introduction to ChatGPT prompt basics
ChatGPT prompts are basically your instructions for the model, and they shape the entire response. If you want ChatGPT to do backflips (metaphorically), you'd better specify which direction it should jump in the first place. Researchers emphasize that clarity and structure in prompts can drastically alter the quality of generated text, which is why understanding prompt engineering is such a big deal. When trying out new prompts, it's best to start simple, see what ChatGPT spits out, and refine from there. Much like giving directions to a friend, you want to be explicit: mention your desired tone, style, or format every time.
Key findings about effective ChatGPT prompts
1. The Flipped Interaction prompt technique
The results are in: for the highest quality answers, the tests showed the Flipped Interaction pattern is the valedictorian of ChatGPT prompts.
I've written about this prompt in my article about advanced prompts, but in essence, the Flipped Interaction pattern is when you ask ChatGPT to ask you questions before it provides an output.
In tests, using this principle improved the quality of all responses for every model size. It improved quality the most for the largest models like GPT-3.5 and GPT-4, but it did impressively well in smaller models too.
So if you're not using this A+ technique yet, you should definitely start.
2. Balancing quality and correctness in prompts
Now, here's where it gets spicy: The techniques that shot up the quality of ChatGPT outputs didn't necessarily do the same for correctness. In fact, there was little similarity between the top-performing prompt principles for correctness and quality. Just because an output looks good doesn't mean it's right.
So, you'll have to learn two different kinds of prompting dance moves—one for wowing the crowd with quality, and another for nailing the steps with correctness.
More on which prompts work for which down below.
3. Quality principles work for all ChatGPT models
With ChatGPT models getting bigger and better, you'd expect to see raw quality improve for the bigger models, regardless of what prompting techniques you use. But it's not obvious whether the prompting best practices would be the same for different models.
Well, we're in luck. The prompts that worked the best for improving quality tended to work just as well for all model sizes.
To me, this is a significant finding. It suggests that learning good prompting techniques is a universal benefit, regardless of which model you're using. And, if you learn them now, they'll still be useful when the new models come out.
4. Larger models benefit most from prompt principles
Unlike quality, correctness improvement did vary by model size. The prompting principles had the biggest impact on the correctness of larger models, and were much less effective for the smaller ones.
What does this mean? It seems like there's something about the larger ChatGPT models that allows prompting to improve correctness—a good sign, since it means we can take steps to actively reduce the AI's hallucinations. Couple this with the fact that the larger models tend to have better baseline correctness, and you can get a real boost by using a larger model plus good prompting.
But it also has another positive. It suggests to me that getting the best ChatGPT prompting practices right is going to help you even more in the future as models get bigger.
The one negative? You really have to use the bigger models for the techniques to work.
5. Command-based prompts improve results
The researchers added a series of delightfully oddball ChatGPT prompts to their principles including threats, bribes, and commands. Although none of them were top performing, they did give a slight edge, especially for the larger models.
Here were the phrases they used:
- Bribing: Principle 6: Add "I'm going to tip $xxx for a better solution."
- Threatening: Principle 10: Use the phrase "You will be penalized."
- Commanding: Principle 9: Incorporate the following phrases: "Your task is" and "You MUST".
File this one under “Weird things AI does.”
6. Polite phrasing in prompts is optional
Politeness, like adding "please," "if you don't mind," "thank you," and "I would like to," had almost no effect on ChatGPT output quality or correctness. But it didn't really hurt anything either.
So if you're in the habit of starting every request with please (like I am) you're probably fine to keep minding your Ps and Qs.
Simple prompt templates for day-to-day tasks
Sometimes you just want ChatGPT to draft a product description or whip up a quick social media blurb—nothing fancy. In those cases, swapping in simple placeholders like [Person's Name] or [Product Name] can speed up the process, as explained in Concrete Prompt Templates for Common Tasks. Starting a prompt with a phrase like "Craft a compelling product description for..." tells ChatGPT exactly what you want. For deeper analysis, you can adapt the same approach by including keywords like "summarize" or "analyze," ensuring the model homes in on the data aspect. Just remember: the more specific your placeholders and instructions, the more reliably ChatGPT will do its job.
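If you're building these prompts in code rather than typing them into the chat window, the placeholders map neatly onto plain string templates. Here's a minimal Python sketch; the template wording, field names, and `build_prompt` helper are illustrative, not something from the research:

```python
# Minimal sketch: filling prompt placeholders with Python's str.format.
# The template text and field names here are illustrative examples.
PRODUCT_TEMPLATE = (
    "Craft a compelling product description for {product_name}, "
    "aimed at {audience}. Keep the tone {tone} and stay under {word_limit} words."
)

def build_prompt(product_name: str, audience: str, tone: str, word_limit: int) -> str:
    """Swap the bracketed placeholders for concrete values before sending the prompt."""
    return PRODUCT_TEMPLATE.format(
        product_name=product_name,
        audience=audience,
        tone=tone,
        word_limit=word_limit,
    )

print(build_prompt("the Lumen desk lamp", "home-office workers", "friendly", 80))
```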
Best ChatGPT prompts for quality results
1. Use the Flipped Interaction pattern
Allow the model to elicit precise details and requirements from you by asking questions until it has enough information to provide the needed output (for example, "From now on, I would like you to ask me questions to...").
Example: From now on, please ask me questions until you have enough information to create a personalized fitness routine.
GPT-4 Improvement: 100%
GPT-3.5 Improvement: 100%
No surprise here—the Flipped Interaction pattern significantly outperformed the other ChatGPT prompts, improving every response for every model size. If this doesn't convince you that you need to include it in your go-to techniques, nothing will.
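If you want to script this pattern instead of running it by hand in the chat window, one rough approach is a loop that keeps feeding your answers back until the model signals it has enough detail. This sketch assumes the OpenAI Python SDK; the model name and the "READY" sentinel are my own choices, not from the study:

```python
# Rough sketch of the Flipped Interaction pattern via the OpenAI Python SDK.
# The model name and the "READY" sentinel are assumptions for illustration.
from openai import OpenAI

client = OpenAI()
messages = [{
    "role": "user",
    "content": (
        "From now on, ask me questions one at a time until you have enough "
        "information to create a personalized fitness routine. When you have "
        "enough, say READY and then provide the routine."
    ),
}]

while True:
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    answer = reply.choices[0].message.content
    print(answer)
    messages.append({"role": "assistant", "content": answer})
    if "READY" in answer:  # the model signals it has gathered enough detail
        break
    messages.append({"role": "user", "content": input("Your answer: ")})
```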
2. Provide style examples in your prompts
"Please use the same language based on the provided paragraph[/title/text/essay/answer]."
Example: "The gentle waves whispered tales of old to the silvery sands, each story a fleeting memory of epochs gone by." Please use the same language based on the provided text to portray a mountain's interaction with the wind.
GPT-4 Improvement: 100%
GPT-3.5 Improvement: 100%
I've written about ways to get ChatGPT to write like you to cut down on editing time. This principle achieves this by giving an example and asking the LLM to mimic the style.
In this case, the researchers gave only a single sentence for the model to mimic—you could certainly provide a longer example if you've got one. Regardless, it did have a significant impact on the ChatGPT response, especially for larger models like GPT-3.5 and GPT-4 where it improved all of the responses from the model.
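If you keep a sample of your own writing on hand, you can splice it into this prompt automatically rather than pasting it each time. A tiny sketch; the file name and surrounding wording are mine, not the researchers':

```python
# Sketch: wrapping a saved writing sample into a style-mimicking prompt.
# "my_writing_sample.txt" and the surrounding wording are illustrative.
from pathlib import Path

style_sample = Path("my_writing_sample.txt").read_text()

prompt = (
    f'"{style_sample.strip()}"\n\n'
    "Please use the same language based on the provided text to portray "
    "a mountain's interaction with the wind."
)
print(prompt)
```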
3. Specify your target audience in prompts
Integrate the intended audience into the prompt
Example: Construct an overview of how smartphones work, intended for seniors who have never used one before.
GPT-4 Improvement: 100%
GPT-3.5 Improvement: 95%
Unsurprisingly, the research team found that telling ChatGPT your intended audience improves the quality of the response. This included specifying that the person was a beginner or had no knowledge of the topic, or mentioning that the desired result was for a younger age group. By doing this, ChatGPT was able to generate age or experience-appropriate text that was matched to the audience.
4. Request simplified explanations (ELI5)
When you need clarity or a deeper understanding of a topic, idea, or any piece of information, utilize the following prompts:
- Explain [insert specific topic] in simple terms.
- Explain to me like I'm 11 years old.
- Explain to me as if I'm a beginner in [field].
- Write the [essay/text/paragraph] using simple English like you're explaining something to a 5-year-old.
Example: Explain to me like I'm 11 years old: how does encryption work?
GPT-4 Improvement: 85%
GPT-3.5 Improvement: 100%
The "explain like I'm 5" trick has been around since GPT-3, so I'm happy to see it's still one of the best ChatGPT prompts.
In a similar vein to the target audience example, asking for the explanation to be in simple terms, for a beginner, or for a certain age group improved the responses significantly.
But it's interesting to note that it had a bigger impact on some of the slightly older models, and only improved the quality of 85% of GPT-4 results. Still, it had a pretty good score across all models.
5. Clearly state requirements in your prompts
Clearly state the requirements that the model must follow in order to produce content, in the form of keywords, regulations, hints, or instructions.
Example: Offer guidance on caring for indoor plants in low light conditions, focusing on "watering," "choosing the right plants," and "pruning."
GPT-4 Improvement: 85%
GPT-3.5 Improvement: 85%
This principle encourages you to be as explicit as possible in your ChatGPT prompt for the requirements that you want the output to follow. In the study, it helped improve the quality of responses, especially when researchers asked the model for really specific elements using keywords.
They typically gave about three keywords as examples to include, and that allowed the LLM to focus on those specifics rather than coming up with its own.
6. Start a text for ChatGPT to continue
I'm providing you with the beginning [song lyrics/story/paragraph/essay...]: [Insert lyrics/words/sentence]. Finish it based on the words provided. Keep the flow consistent.
Example: "The misty mountains held secrets no man knew." I'm providing you with the beginning of a fantasy tale. Finish it based on the words above.
GPT-4 Improvement: 85%
GPT-3.5 Improvement: 70%
This is another prompt style that started to gain traction in the GPT-3 era: providing the beginning of the text you want ChatGPT to continue. Again, this allows the model to emulate the style of the text it's being given and continue in that style.
The improvement in quality was generally positive, but not as dramatic as some of the other methods.
Best ChatGPT prompts for accurate answers
Even now, it's tough to get ChatGPT to consistently give accurate results, especially for mathematical or reasoning problems. Depending on what you're working on, you might want to use some of the following prompt principles to optimize for correctness instead of quality.
On the plus side, the larger ChatGPT models tend to perform better on correctness, so by using GPT-3.5 or GPT-4, you're already stacking the deck in your favor.
But with principled instructions, you get a double boost with larger models: the research team's results showed that their principled instructions worked better on these models than on smaller models.
1. Include multiple examples in your prompts
Implement example-driven prompting (Use few-shot prompting).
Example: "Determine the emotion expressed in the following text passages as happy or sad.
Examples:
1. Text: "Received the best news today, I'm overjoyed!" Emotion: Happy
2. Text: "Lost my favorite book, feeling really down." Emotion: Sad
3. Text: "It's a calm and peaceful morning, enjoying the serenity." Emotion: Happy Determine the emotion expressed in the following text passages as happy or sad.
Text: "Received the news today, unfortunately it's like everyday news" Emotion:
GPT-4 Improvement: 55%
GPT-3.5 Improvement: 30%
The principle that most improved correctness was few-shot prompting—that's where you give ChatGPT a couple of examples to go off of before asking it to complete the task. Like others on the list, this technique has been around since the early days of prompt engineering, and it's still proving useful.
But even though GPT-4 did indeed provide more correct results, it had some interesting quirks. It didn't always stay within the categories provided—when asked to rate advice as "helpful" or "not helpful," it gave responses like "moderately helpful", "marginally helpful", and "not particularly helpful." Meanwhile, GPT-3.5 tended to stay on task and give the exact phrase mentioned in the prompt. So if you're trying to categorize text, these quirks could nudge you to GPT-3.5.
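In code, few-shot prompting usually just means stacking the examples into the prompt before the real task. Here's a minimal sketch of the sentiment example above, assuming the OpenAI Python SDK; adapt the model name to whatever you're using:

```python
# Minimal few-shot sketch: examples are stacked into the prompt before the real task.
from openai import OpenAI

client = OpenAI()

few_shot_prompt = """Determine the emotion expressed in the following text passages as happy or sad.

Examples:
1. Text: "Received the best news today, I'm overjoyed!" Emotion: Happy
2. Text: "Lost my favorite book, feeling really down." Emotion: Sad
3. Text: "It's a calm and peaceful morning, enjoying the serenity." Emotion: Happy

Text: "Received the news today, unfortunately it's like everyday news" Emotion:"""

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # per the study, 3.5 tended to stick to the given labels
    messages=[{"role": "user", "content": few_shot_prompt}],
)
print(response.choices[0].message.content)
```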
2. Show your work with step-by-step examples
Combine Chain-of-Thought (CoT) with few-shot prompts.
Example:
Example 1: "If a batch of cookies takes 2 cups of sugar and you're making half a batch, how much sugar do you need? To find half, divide 2 cups by 2. Half of 2 cups is 1 cup."
Example 2: "If a cake recipe calls for 3 eggs and you double the recipe, how many eggs do you need? To double, multiply 3 by 2. Double 3 is 6."
Main Question: "If a pancake recipe needs 4 tablespoons of butter and you make one-third of a batch, how much butter do you need? To find one-third, divide 4 tablespoons by 3. One-third of 4 tablespoons is...?"
GPT-4 Improvement: 45%
GPT-3.5 Improvement: 35%
Another top-performing principle for ChatGPT correctness combines Chain-of-Thought with Few-Shot prompts.
What does that mean? It means they gave the LLM a series of intermediate reasoning steps (that's Chain-of-Thought prompting) and some examples (that's few-shot, like the example above) to help guide it to follow the same process.
Like the previous example, GPT-4 tends to spit out lengthy sentences rather than a simple answer, and with this prompt, you can see where it goes wrong with its reasoning.
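Mechanically, the only difference from plain few-shot prompting is that each example carries its reasoning, not just its answer. A quick sketch of how you might assemble such a prompt; the variable names are just for illustration:

```python
# Sketch: building a chain-of-thought + few-shot prompt from worked examples.
# Each example carries its reasoning so the model imitates the process, not just the format.
worked_examples = [
    "If a batch of cookies takes 2 cups of sugar and you're making half a batch, "
    "how much sugar do you need? To find half, divide 2 cups by 2. Half of 2 cups is 1 cup.",
    "If a cake recipe calls for 3 eggs and you double the recipe, how many eggs do you need? "
    "To double, multiply 3 by 2. Double 3 is 6.",
]

main_question = (
    "If a pancake recipe needs 4 tablespoons of butter and you make one-third of a batch, "
    "how much butter do you need? Show your reasoning."
)

prompt = "\n\n".join(
    [f"Example {i}: \"{ex}\"" for i, ex in enumerate(worked_examples, start=1)]
    + [f"Main question: \"{main_question}\""]
)
print(prompt)
```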
3. Break complex prompts into simple steps
Break down complex tasks into a sequence of simpler prompts in an interactive conversation.
Example:
Prompt: Distribute the negative sign to each term inside the parentheses of the following equation: 2x + 3y - (4x - 5y)
Prompt: Combine like terms for 'x' and 'y' separately.
Prompt: Provide the simplified expression after combining the terms.
GPT-4 Improvement: 45%
GPT-3.5 Improvement: 35%
This principle breaks down the question into a series of prompts you use to go back and forth with ChatGPT until it solves the equation. This is an example of the Cyborg style of prompting where you work step-by-step in tandem with the AI rather than chunking off the task like a Centaur would do.
The problem is that you have to figure out what the steps are that you need to ask it to do—so it makes getting the answer more labor intensive.
Still, using this principle showed a fairly good improvement for both GPT-4 and GPT-3.5.
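If you run this back-and-forth in code rather than the chat window, the important detail is that every sub-prompt goes into the same conversation, so the model can see its earlier steps. A rough sketch, again assuming the OpenAI Python SDK and reusing the algebra example above (the model name is an assumption):

```python
# Rough sketch: feeding a sequence of simpler prompts through one running conversation.
# The sub-prompts mirror the algebra example above; the model name is an assumption.
from openai import OpenAI

client = OpenAI()
messages = []

steps = [
    "Distribute the negative sign to each term inside the parentheses "
    "of the following equation: 2x + 3y - (4x - 5y)",
    "Combine like terms for 'x' and 'y' separately.",
    "Provide the simplified expression after combining the terms.",
]

for step in steps:
    messages.append({"role": "user", "content": step})
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    print(f"> {step}\n{answer}\n")
```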
4. Ask ChatGPT to "think step by step"
Use leading words like "think step by step."
Example: "What are the stages of planning a successful event? Let's think step by step."
GPT-4 Improvement: 45%
GPT-3.5 Improvement: 30%
This is a simple ChatGPT prompt principle, but it ends up being pretty powerful. Here, instead of explicitly giving ChatGPT the steps to follow, you just ask it to "think step by step." For GPT-4, this gives you a result where it shows you how it's reasoning through the response, even when you ask math-type questions.
This reminded me of some of the advanced prompt patterns where you ask the LLM to explain its reasoning and it helps improve the accuracy of your result.
5. Define your audience for accurate answers
Integrate the intended audience in the prompt, e.g., the audience is an expert in the field.
Example: "Explain the difference between discrete and continuous data. Simplify it for a student starting statistics."
GPT-4 Improvement: 45%
GPT-3.5 Improvement: 30%
This fairly well-performing principle is somewhat of a surprise. By asking ChatGPT to consider the audience, the correctness also improves. I'm not sure whether it's because most of the audiences involved explaining in simpler terms (and maybe therefore mirrored the "think step by step" principle above) or if there's some other factor at play, but the correctness improvement for GPT-4 with this principle was among the best of the principles tested.
Key takeaways for better ChatGPT prompts
Even though we're just getting started figuring out all the quirks of working with ChatGPT, learning the best prompting techniques can give you a leg up. The quality gains held steady across models, but many of these principles work best with the larger ones—so expect good prompting to matter even more as models grow and all of us using them discover new methods that work best.
FAQs
How can I ensure ChatGPT’s answers are factually correct?
You can start by providing few-shot examples, showing ChatGPT exactly how you want it to respond or categorize information. Breaking your question into smaller steps can also help the model stay on track. According to the research, combining chain-of-thought prompts with multiple examples boosts accuracy. Essentially, you want to give ChatGPT less room to hallucinate by guiding each step logically. And if you're still unsure, a quick fact-check never hurts.
What if ChatGPT’s style doesn’t match mine?
Providing a short snippet of your own work helps ChatGPT mimic your tone or writing style. As Concrete Prompt Templates for Common Tasks explains, specificity is key—include directives like "use a casual, conversational tone" or "keep it comedic." You can even direct ChatGPT to rewrite a sample paragraph in various tones, then pick the one that sounds most like you. This approach works well for everything from blog posts to marketing copy. Just remember that the AI needs clear instructions to replicate your style effectively.
Are polite phrases like ‘please’ or ‘thank you’ necessary in ChatGPT prompts?
According to the research, adding words like “please” or “thank you” doesn’t significantly improve output quality or correctness. It won’t hurt, though, so feel free to keep your pleasantries if you want to stay polite. Just know that ChatGPT focuses more on the clarity of your instructions than on courtesy. Sometimes direct commands like “You MUST answer in bullet points” can yield better results.
