Fundamentals of Prompt Engineering

Unlocking AI's Potential: A Guide to Effective Prompt Engineering

Prompt engineering is an emerging field that focuses on developing, designing, and optimizing prompts to enhance the output of LLMs for your needs. It gives you a way to guide the model's behavior to the outcomes you want to achieve.

TL;DR: Why use prompt engineering when we can fine-tune a model?

Prompt engineering is different from fine-tuning. In fine-tuning, the model's weights or parameters are adjusted using training data with the goal of optimizing a cost function. Fine-tuning can be an expensive process, both in computation time and actual cost. Prompt engineering, by contrast, guides an already-trained foundation model (FM), such as an LLM or a text-to-image model, toward more relevant and accurate answers without changing its weights.

Prompt engineering is the fastest way to harness the power of large language models. By interacting with an LLM through a series of questions, statements, or instructions, you can steer its output toward the specific context and results you want.

Effective prompt techniques can help your business realize the following benefits:

  • Boost a model's abilities and improve safety.

  • Augment the model with domain knowledge and external tools without changing model parameters or fine-tuning.

  • Interact with language models to grasp their full capabilities.

  • Achieve better quality outputs through better quality inputs.

Elements of a prompt

A prompt's form depends on the task you are giving to a model. As you explore prompt engineering examples, you will review prompts containing some or all of the following elements:

Instructions: The task you want the large language model to perform. This element provides a task description or directions for how the model should respond.

Context: This is external information to guide the model.

Input data: This is the input for which you want a response.

Output indicator: This is the output type or format.
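
For instance, a single prompt can combine all four elements. The short Python sketch below assembles such a prompt; the send_prompt helper is hypothetical and stands in for whichever model API you use.

```python
# A minimal sketch of a prompt that combines all four elements.
# send_prompt() is a hypothetical helper standing in for whichever model API you use.

instructions = "Summarize the customer review below in one sentence."      # Instructions
context = "The review was left on our outdoor-furniture store page."       # Context
input_data = ('Review: "The chairs arrived quickly, but one armrest was '
              'scratched and support took three days to reply."')          # Input data
output_indicator = "Answer with a single bullet point."                    # Output indicator

prompt = "\n".join([instructions, context, input_data, output_indicator])
print(prompt)
# response = send_prompt(prompt)
```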

Basic Prompt Techniques

Zero-shot prompting

Zero-shot prompting is a prompting technique where a user presents a task to an LLM without giving the model further examples. Here, the user expects the model to perform the task without a prior understanding, or shot, of the task. Modern LLMs demonstrate remarkable zero-shot performance.
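
As a sketch, a zero-shot prompt might look like the following; the complete() function is a hypothetical wrapper around whichever LLM API you use.

```python
# Zero-shot: the task is stated directly, with no worked examples in the prompt.
# complete() is a hypothetical wrapper around whichever LLM API you use.

prompt = (
    "Classify the sentiment of the following review as Positive, Negative, or Neutral.\n"
    'Review: "The battery lasts all day, but the screen scratches easily."\n'
    "Sentiment:"
)
# print(complete(prompt))
```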

Few-shot prompting

Few-shot prompting is a prompting technique where you give the model contextual information about the requested tasks. In this technique, you provide examples of both the task and the output you want. Providing this context, or a few shots, in the prompt conditions the model to follow the task guidance closely.
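
The sketch below shows the same sentiment task written as a few-shot prompt, with two worked examples conditioning the model on the task and the output format (complete() is again a hypothetical model call):

```python
# Few-shot: a handful of worked examples show the model the task and the format.
# complete() is a hypothetical wrapper around whichever LLM API you use.

prompt = (
    'Review: "Great value and fast shipping."\n'
    "Sentiment: Positive\n"
    "\n"
    'Review: "The item arrived broken and support never answered."\n'
    "Sentiment: Negative\n"
    "\n"
    'Review: "The battery lasts all day, but the screen scratches easily."\n'
    "Sentiment:"
)
# print(complete(prompt))
```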

Chain-of-thought prompting

Chain-of-thought (CoT) prompting breaks complex reasoning tasks down into intermediate reasoning steps. You can use both zero-shot and few-shot prompting techniques with CoT prompts.

Chain-of-thought prompts are specific to a problem type. You can use the phrase "Think step by step" to invoke CoT reasoning from your machine learning model.
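
For example, a zero-shot CoT prompt for a simple word problem might look like this (complete() is a hypothetical model call):

```python
# Zero-shot chain-of-thought: appending "Think step by step" nudges the model
# to write out intermediate reasoning before its final answer.
# complete() is a hypothetical wrapper around whichever LLM API you use.

prompt = (
    "A cafe bakes 7 trays of muffins each morning with 12 muffins per tray. "
    "If 15 muffins are left unsold at closing, how many muffins were sold?\n"
    "Think step by step."
)
# print(complete(prompt))
```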

Consider, for example, prompting the Jurassic-2 (J2) model in two ways: once with a single CoT prompt, and once with self-consistency, where the model generates several reasoning paths and the most frequent final answer is selected.

To learn more about the self-consistency prompting technique, see the paper "Self-Consistency Improves Chain of Thought Reasoning in Language Models": https://arxiv.org/abs/2203.11171
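
The following is a minimal sketch of self-consistency: several chain-of-thought completions are sampled and the most frequent final answer wins a majority vote. The complete() function is a hypothetical stand-in for a sampling call to your model (with temperature above zero so the reasoning paths differ), and the assumption that the short answer sits on the last line is illustrative only.

```python
# Self-consistency sketch: sample several chain-of-thought completions and take
# a majority vote over the final answers.
from collections import Counter

def complete(prompt: str) -> str:
    # Hypothetical placeholder: replace with a sampled call to your model's API.
    raise NotImplementedError

def self_consistent_answer(prompt: str, n_samples: int = 5) -> str:
    answers = []
    for _ in range(n_samples):
        reasoning = complete(prompt + "\nThink step by step.")
        # Illustrative assumption: the final line of each completion holds the answer.
        answers.append(reasoning.strip().splitlines()[-1])
    # Return the answer that the sampled reasoning paths agree on most often.
    return Counter(answers).most_common(1)[0][0]
```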

Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is a prompting technique that supplies domain-relevant data as context to produce responses grounded in that data and the prompt. This technique is similar to fine-tuning. However, rather than fine-tuning an FM with a small set of labeled examples, RAG retrieves a small set of relevant documents from a large corpus and uses them as context to answer the question. RAG does not change the weights of the foundation model, whereas fine-tuning does.

This approach can be more cost-efficient than regular fine-tuning because the RAG approach doesn't incur the cost of fine-tuning a model. RAG also addresses the challenge of frequent data changes because it retrieves updated and relevant information instead of relying on potentially outdated sets of data.

In RAG, the external data can come from multiple data sources, such as a document repository, databases, or APIs. Before using RAG with LLMs, you must prepare and keep the knowledge base updated. Conceptually, the flow at query time is: the user's question is turned into a retrieval query, relevant documents are fetched from the knowledge base, those documents are added to the prompt as context, and the LLM generates a response grounded in that context.
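
A minimal sketch of that flow is shown below. The keyword-overlap retriever is a toy stand-in for a real retriever (for example, embeddings in a vector store), and complete() is a hypothetical model call.

```python
# Minimal RAG sketch: retrieve the most relevant documents for a question and
# prepend them to the prompt as context.

documents = [
    "Our return policy allows refunds within 30 days of delivery.",
    "Standard shipping takes 3-5 business days within the US.",
    "Warranty claims require the original order number.",
]

def retrieve(question, docs, k=2):
    # Toy relevance score: number of words the document shares with the question.
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

question = "How long do I have to return an item?"
context = "\n".join(retrieve(question, documents))
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)
print(prompt)
# response = complete(prompt)  # complete() is a hypothetical model call
```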

Thank you for reading this far. If you want to learn more, ping me personally, and make sure you are following me for the latest updates.

Yours Sincerely,

Sai Aneesh

x.com/lhcee3

linkedin.com/in/saianeeshg90