
Prompt Engineering

Prompt Engineering | Kaggle

Prompt design is the process of creating a prompt tailored to the specific task the system is being asked to perform.

Prompt engineering is the iterative process of designing, testing, and refining prompts to improve the model's performance on that task.

Types of Prompts

  • User prompts: conversational requests that the user sends to the model.
  • System prompts: backend instructions that guide the LLM toward the desired output.

Prompting Principles

Principle 1: Write clear and specific instructions

Tactic 1: Use delimiters to clearly indicate distinct parts of the input

  • Delimiters can be anything like: ```, """, < >, <tag> </tag>, :
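For example, fencing the input keeps instructions and data cleanly separated (here {text} stands for the content to be summarized):

Summarize the text delimited by triple backticks into a single sentence.

```{text}```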

Tactic 2: Ask for a structured output

  • JSON, HTML
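For example, the prompt can pin down the exact schema it wants back:

Generate a list of three made-up book titles along with their authors and genres.
Provide them in JSON format with the following keys: book_id, title, author, genre.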

Tactic 3: Ask the model to check whether conditions are satisfied
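For example, telling the model what to do when the condition fails stops it from guessing:

You will be provided with text delimited by triple quotes.
If it contains a sequence of instructions, rewrite those instructions as numbered steps.
If the text does not contain a sequence of instructions, simply write "No steps provided."

"""{text}"""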

Tactic 4: "Few-shot" prompting
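For example, giving one completed exchange teaches the model the expected style before asking the real question:

Your task is to answer in a consistent style.

<child>: Teach me about patience.
<grandparent>: The river that carves the deepest valley flows from a modest spring; the grandest symphony originates from a single note.
<child>: Teach me about resilience.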

Principle 2: Give the model time to "think"

Tactic 1: Specify the steps required to complete a task
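For example, spelling out the steps and the output format leaves the model little room to skip ahead:

Perform the following actions:
1 - Summarize the text delimited by triple backticks in one sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a JSON object that contains the keys: french_summary, num_names.

```{text}```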

Tactic 2: Instruct the model to work out its own solution before rushing to a conclusion

Others

  • Imitating: "In the style of X, write about Y."
  • At the end of a prompt, add a modifier such as "make it catchy!"

Prompting Techniques


Chain-of-thought

Chain-of-thought (CoT) prompting is a technique that allows large language models (LLMs) to solve a problem as a series of intermediate steps before giving a final answer. It improves reasoning ability by inducing the model to answer a multi-step problem with explicit reasoning steps that mimic a train of thought, helping it overcome tasks that require logical thinking and multiple steps to solve, such as arithmetic or commonsense reasoning questions.
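The classic illustration (from Wei et al.'s chain-of-thought paper) provides one worked example so the model continues with the same step-by-step reasoning:

Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?
A: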

Least to most prompting

Least-to-most prompting is a prompt engineering technique where complex problems are broken down into smaller, simpler subproblems, and then solved sequentially. This approach is particularly effective in tasks involving symbolic manipulation, compositional generalization, and mathematical reasoning, often exceeding the performance of Chain-of-Thought prompting on more difficult problems.

Here's a breakdown of how it works:

  1. Problem Decomposition: The initial prompt guides the Large Language Model (LLM) to decompose a complex problem into a series of simpler subproblems.
  2. Sequential Solving: The LLM then solves each subproblem sequentially, utilizing the solutions to previous subproblems to guide the next step.
  3. Enhanced Reasoning: By breaking down complex tasks into simpler components, least-to-most prompting allows LLMs to leverage their reasoning capabilities more effectively, leading to improved performance, especially on challenging problems.
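A sketch of the two stages (example adapted from the least-to-most prompting paper; the wording is illustrative):

Stage 1 - Decomposition:
Q: Elsa has 5 apples. Anna has 2 more apples than Elsa. How many apples do they have together?
A: To answer "How many apples do they have together?", we first need to answer: "How many apples does Anna have?"

Stage 2 - Sequential solving:
Q: How many apples does Anna have?
A: Anna has 2 more apples than Elsa, so Anna has 5 + 2 = 7 apples.
Q: How many apples do they have together?
A: Elsa has 5 apples and Anna has 7 apples, so together they have 5 + 7 = 12 apples.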


Other techniques

  • Generated knowledge prompting
  • Self-consistency decoding
  • Complexity-based prompting
  • Self-refine
  • Tree-of-thought
  • Maieutic prompting
  • Directional-stimulus prompting

Prompt engineering - Wikipedia

Parameters

Temperature

Controls the randomness of the model's output. A higher temperature makes the output more random, while a lower temperature makes it more deterministic.

Understanding OpenAI's Temperature Parameter | Colt Steele

Temperature (0.0 to 2.0): Think of this as a creativity setting.

  • Temperature 0.1 = very focused and predictable (good for factual answers),
  • Temperature 0.9 = more creative and random (good for brainstorming).

Top P (0.0 to 1.0): Controls word choice diversity.

  • Lower = sticks to most likely words,
  • Higher = considers more unusual word options.

Top K (number): Limits how many word options the AI considers at each step.

Parameter: Temperature
Range: 0.0 to 2.0 (0.7 preferable)
Description: Controls randomness and creativity
Example (Weather):
  • Low (0.1): "Today's weather is very hot with clear skies and high humidity. The temperature is expected to reach 95°F with a UV index of 8. There is a 10% chance of precipitation."
  • High (0.9): "The scorching sun beats down mercilessly today, turning the sidewalks into sizzling griddles that could fry an egg! The sky, a brilliant azure canvas without a single cloud brushstroke, offers no respite from the relentless heat wave that has the whole city moving in slow motion."

Parameter: Top P
Range: 0.0 to 1.0
Description: Controls diversity by including less common words until reaching a probability threshold
Example (Weather):
  • Low (0.3): "Today's weather is very hot and sunny. The temperature will reach 95 degrees with no clouds in sight."
  • High (0.9): "Today's weather is very sweltering with a blinding sun and stifling humidity making the afternoon particularly oppressive for outdoor activities."

Parameter: Top K
Range: number (1-100)
Description: Directly limits the number of word options considered regardless of probability
Example (Weather): For "The weather today is very..."
  • Top K = 2: can only choose between "hot" (40%) and "cold" (30%), making output more predictable.
  • Top K = 5: can choose from "hot" (40%), "cold" (30%), "nice" (15%), "humid" (10%), or "pleasant" (5%), giving more variety.

The key differences:

  • Temperature affects overall randomness and creativity across all word choices
  • Top P controls diversity by including less common words until reaching a probability threshold
  • Top K directly limits the number of word options considered regardless of probability

These parameters can be combined - for instance, using a moderate Temperature (0.7) with a low Top K (5) would give creative but controlled outputs that don't go too far off track.
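As a sketch of how these knobs are set in practice with the OpenAI Python client (the model name and prompt are placeholders; note that the Chat Completions API exposes temperature and top_p but not top_k, which appears in other providers' APIs such as Anthropic's and Google's):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Moderate temperature for some creativity, low top_p to keep
# sampling within the most probable tokens.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Describe today's weather."}],
    temperature=0.7,
    top_p=0.3,
)
print(response.choices[0].message.content)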

Other Topics

  • Iterative
  • Summarizing
  • Inferring
  • Transforming
  • Expanding
  • Chatbot
  • Conclusion

ChatGPT Prompt Engineering for Developers - DeepLearning.AI

Summarization

{
"anthropic_version": "bedrock-2023-05-31",
"messages": [
{
"role" : "user",
"content" : "You will be given a conversation between a user and an AI assistant.
When available, in order to have more context, you will also be given summaries you previously generated.
Your goal is to summarize the input conversation.

When you generate summaries you ALWAYS follow the below guidelines:
<guidelines>
- Each summary MUST be formatted in XML format.
- Each summary must contain at least the following topics: 'user goals', 'assistant actions'.
- Each summary, whenever applicable, MUST cover every topic and be placed between <topic name='$TOPIC_NAME'></topic>.
- You ALWAYS output all applicable topics within <summary></summary>
- If nothing about a topic is mentioned, DO NOT produce a summary for that topic.
- You summarize in <topic name='user goals'></topic> ONLY what is related to User, e.g., user goals.
- You summarize in <topic name='assistant actions'></topic> ONLY what is related to Assistant, e.g., assistant actions.
- NEVER start with phrases like 'Here's the summary...', provide directly the summary in the format described below.
</guidelines>

The XML format of each summary is as follows:
<summary>
<topic name='$TOPIC_NAME'>
...
</topic>
...
</summary>

Here is the list of summaries you previously generated.

<previous_summaries>
$past_conversation_summary$
</previous_summaries>

And here is the current conversation session between a user and an AI assistant:

<conversation>
$conversation$
</conversation>

Please summarize the input conversation following above guidelines plus below additional guidelines:
<additional_guidelines>
- ALWAYS strictly follow above XML schema and ALWAYS generate well-formatted XML.
- NEVER forget any detail from the input conversation.
- You also ALWAYS follow below special guidelines for some of the topics.
<special_guidelines>
<user_goals>
- You ALWAYS report in <topic name='user goals'></topic> all details the user provided in formulating their request.
</user_goals>
<assistant_actions>
- You ALWAYS report in <topic name='assistant actions'></topic> all details about actions taken by the assistant, e.g., parameters used to invoke actions.
</assistant_actions>
</special_guidelines>
</additional_guidelines>
"
}
]
}
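As a sketch, this template could be sent to a Claude model through Amazon Bedrock's runtime API roughly as follows (the model ID, the variable substitution, and the max_tokens value are assumptions, not part of the original template):

import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Placeholders for the values substituted into the template above.
past_summaries = "<summary>...</summary>"
conversation_text = "User: ...\nAssistant: ..."

prompt_template = "You will be given a conversation..."  # full prompt text shown above
filled_prompt = (
    prompt_template
    .replace("$past_conversation_summary$", past_summaries)
    .replace("$conversation$", conversation_text)
)

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,  # assumed limit; required by the Messages API
    "messages": [{"role": "user", "content": filled_prompt}],
}

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # assumed model ID
    body=json.dumps(body),
)
print(json.loads(response["body"].read())["content"][0]["text"])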

Assistant APIs

The Assistants API allows you to build AI assistants within your own applications. An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries. The Assistants API currently supports three types of tools: Code Interpreter, Retrieval, and Function calling.

At a high level, a typical integration of the Assistants API has the following flow:

  1. Create an Assistant in the API by defining its custom instructions and picking a model. If helpful, enable tools like Code Interpreter, Retrieval, and Function calling.
  2. Create a Thread when a user starts a conversation.
  3. Add Messages to the Thread as the user asks questions.
  4. Run the Assistant on the Thread to trigger responses. This automatically calls the relevant tools.
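A minimal sketch of this flow with the OpenAI Python client (the model name, instructions, and question are placeholders):

from openai import OpenAI

client = OpenAI()

# 1. Create an Assistant with instructions, a model, and optional tools.
assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions step by step.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4o",  # placeholder model name
)

# 2. Create a Thread when a user starts a conversation.
thread = client.beta.threads.create()

# 3. Add the user's Message to the Thread.
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Solve 3x + 11 = 14.",
)

# 4. Run the Assistant on the Thread and wait for it to finish.
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

# The newest message (the Assistant's reply) comes first by default.
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)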

Create AI Assistants with OpenAI's Assistants API

Knowledge-based retrieval tool: platform.openai.com/docs/assistants/overview

Learning