
Design effective AI prompts with Microsoft Prompt Engine

The launch of Microsoft’s new AI-powered Bing shed new light on the company’s investments in OpenAI’s large language models and generative AI, turning them into a consumer-facing service. Early experiments with the service quickly revealed the details of the predefined prompts Microsoft was using to keep the Bing chatbot focused on delivering search results.

Large language models, such as the OpenAI GPT series, are best thought of as prompt-and-response tools. You give the model a prompt and it responds with a sequence of words that fits both the content and style of the prompt, and in some cases even the mood. Models are trained on large amounts of data and then fine-tuned for a specific task. By providing a well-designed prompt and limiting the size of the response, you can reduce the risk of the model producing grammatically correct but inherently false results.

Introduction to prompt engineering

The prompts uncovered in Microsoft’s Bing showed that the chatbot was constrained to simulating a helpful personality that creates content from search results, with Microsoft’s own Prometheus model acting as a set of additional feedback loops to keep results on topic and in context. Perhaps the most interesting thing about these prompts is that they make clear Microsoft has been investing in a new discipline of software engineering: prompt engineering.

It’s an approach you should also invest in, especially if you’re working with Microsoft’s Azure OpenAI APIs. Generative AI tools, such as large language models, will be part of the public face of your application and your business, and you’ll need to keep them on-brand and under control. That requires prompt engineering: designing an effective base prompt, tuning the model, and ensuring that user prompts don’t generate unwanted results.

Both Microsoft and OpenAI provide sandboxed environments in which you can create and test base prompts. You can paste in a prompt, add sample user content, and see typical output. Although there is an element of randomness in the model, you will get similar results for any given input, so you can test features and build out your model’s “personality.”

This approach isn’t only necessary for chat and text-based models; you’ll need some degree of prompt engineering in a Codex-based AI-powered development tool or in a DALL-E image generator used for slide clip art or as part of a low-code workflow. Adding structure and control to prompts keeps generative AI productive, helps prevent errors, and reduces the risk of misuse.

Using prompts with Azure OpenAI

It’s important to remember that you have other tools beyond the prompt to control both context and consistency with large language models. One option is to control the length of the response (or, in the case of a ChatGPT-based system, the responses) by limiting the number of tokens that can be used in an interaction. This keeps responses concise, so they are less likely to go off topic.

Working with the Azure OpenAI APIs is a relatively easy way to embed large language models in your code, but while the APIs make it easy to deliver strings to the model, what’s needed is a way to manage those strings. It takes a lot of code to apply prompt engineering disciplines to your application, applying appropriate patterns and practices beyond basic question-and-answer interactions.
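
As a minimal sketch of what that looks like in practice, the following JavaScript calls the Azure OpenAI completions endpoint with fetch, capping the response length with max_tokens; the resource name, deployment name, API version, and environment variable are placeholders you would replace with your own values.

// Minimal sketch: send a prompt string to an Azure OpenAI deployment and cap
// the response length. Resource, deployment, and api-version are placeholders.
const endpoint =
  "https://my-resource.openai.azure.com/openai/deployments/my-deployment" +
  "/completions?api-version=2022-12-01";

async function complete(prompt) {
  const response = await fetch(endpoint, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "api-key": process.env.AZURE_OPENAI_KEY, // assumes the key is set in the environment
    },
    body: JSON.stringify({
      prompt,          // the string your prompt engineering produces
      max_tokens: 100, // limiting tokens keeps responses concise and on topic
      temperature: 0.7,
    }),
  });
  const data = await response.json();
  return data.choices[0].text;
}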

Manage prompts with Prompt Engine

Microsoft has been working on an open source project, Prompt Engine, to manage prompts and get the expected results from a large language model, with versions for JavaScript, C#, and Python, each in its own GitHub repository. All three have the same basic functionality: managing the context of any interaction with a model.

If you use the JavaScript version, there is support for three different kinds of model: a generic prompt-based model, a code model, and a chat-based system. It’s a useful way to manage the various components of a well-designed prompt, supporting both your own inputs and user interactions (including the model’s responses). That last part is important as a way to manage context between interactions, ensuring state persists across chats and between lines of code in an application.
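
In code, those three model kinds map onto three engine classes; the sketch below uses the names from the project’s README, with constructor arguments beyond the description simplified, so check your installed version for the exact signatures.

// The three model kinds in the JavaScript library, per the project's README.
// Constructor arguments beyond the description are simplified here.
import { PromptEngine, CodeEngine, ChatEngine } from "prompt-engine";

const generic = new PromptEngine("Turn natural language into short summaries");
const code = new CodeEngine("Natural language commands to JavaScript code");
const chat = new ChatEngine("A polite assistant for product questions");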

The Python version gives you much the same options, letting you quickly apply the same processes from Python code. The C# version supports only the generic and text-parsing models, but these can easily be repurposed for your application of choice. The JavaScript option is good for web applications and Visual Studio Code extensions, while the Python tool is a logical choice for anyone working with the ecosystem’s many machine learning tools.

The intent is to treat the large language model as a contributor to the user experience, allowing you to build your own feedback loops around the AI, much like Microsoft’s Prometheus. Having a standard pattern for working with the model lets you iterate on your own base prompts, tracking the outputs and refining the inputs where necessary.

Manage GPT interactions with Prompt Engine

Prompt Engine installs as a library from familiar repositories such as npm and pip, with sample code in its GitHub repositories. Getting started is pretty easy: once the module and the appropriate libraries are imported, you start with a description of your prompt, followed by some sample interactions. For example, when you’re converting natural language to code, each interaction is a pair consisting of a sample query and the expected output code in the target language.

You’ll need several interactions to create the most effective prompt. The default target language is Python, but you can set the language of your choice using a CodeEngineConfig call.
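
Putting that together, a code-generation setup might look like the sketch below, which follows the pattern in the project’s README; the example pairs are illustrative, and the exact CodeEngineConfig options vary between versions, so treat the config comment as a pointer rather than a definitive signature.

import { CodeEngine } from "prompt-engine";

// Describe the task, then supply sample query/response pairs that show the
// model what the expected output code looks like.
const description = "Natural language commands to JavaScript math code";
const examples = [
  { input: "what's 10 plus 18", response: "console.log(10 + 18)" },
  { input: "what's 10 times 18", response: "console.log(10 * 18)" },
];

const codeEngine = new CodeEngine(description, examples);
// To target a different language, pass a CodeEngineConfig as a further
// constructor argument; see the repository for the options in your version.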

With a target language and a set of samples, you can now create a prompt from a user query. The resulting prompt string can be used in an Azure OpenAI API call. If you want to preserve context for your next call, simply add the response as a new interaction and it will carry over. Because it is not part of the original sample interactions, it will not persist beyond the current user’s session and cannot be used by another user or in another call. This approach simplifies the creation of dialogs, but it is important to keep track of the total tokens used so that your prompt does not exceed the token limits of the model. Prompt Engine includes a way to ensure that the prompt length stays below the maximum token count for your current model, removing the oldest dialogs when necessary. Because dialogs can lose context this way, you may need to help users understand that there are limits to the length of a conversation.
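
A single turn of that loop might look like the following sketch, which reuses the complete() helper from earlier; buildPrompt and addInteraction are the methods described in the project’s README.

// Build a prompt from the user's query, call the model, then cache the
// interaction so context carries over to the next call in this session.
const userQuery = "what's 10 to the power of 4";
const prompt = codeEngine.buildPrompt(userQuery);

const modelResponse = await complete(prompt); // Azure OpenAI helper sketched earlier

// The next buildPrompt() call will now include this query/response pair,
// until Prompt Engine trims it to stay under the model's token limit.
codeEngine.addInteraction(userQuery, modelResponse);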

If you are explicitly targeting a chat system, you can configure user and bot names, along with a contextual description that sets the bot’s behaviors and tone and can be reinforced with sample interactions, passing responses back to the prompt engine to build context for the next prompt.
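
A chat configuration might look something like this sketch; the ChatEngineConfig arguments shown (a model-config slot plus user and bot names) and the flow-reset slot in the ChatEngine constructor are assumptions based on the README, so verify them against your installed version.

import { ChatEngine, ChatEngineConfig } from "prompt-engine";

// The description establishes the bot's behavior and tone; sample
// interactions reinforce it.
const description = "A friendly support bot that answers product questions concisely";
const examples = [
  { input: "Do you ship internationally?", response: "Yes, we ship to most countries." },
];

// Assumed constructor shape: (modelConfig, userName, botName).
const config = new ChatEngineConfig(undefined, "User", "SupportBot");
// Assumed constructor shape: (description, examples, flowResetText, config).
const chatEngine = new ChatEngine(description, examples, undefined, config);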

You can use cached interactions to add a feedback loop to your app, for example, looking for unwanted terms and phrases, or using user ratings of responses to determine which interactions persist between prompts. Logging successful and unsuccessful prompts will let you build a more effective default prompt, adding new examples as needed. Microsoft suggests creating a dynamic bank of examples that can be matched against queries, using a set of similar examples to dynamically generate a prompt that approximates your user’s query and, hopefully, produces more accurate results.
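
As a sketch of such a feedback loop in application code (containsBlockedTerm and the denylist are hypothetical helpers, not part of Prompt Engine):

// Hypothetical feedback loop: only persist interactions that pass a content
// check, so unwanted output never becomes context for future prompts.
const blockedTerms = ["term-a", "term-b"]; // your own denylist

function containsBlockedTerm(text) {
  return blockedTerms.some((term) => text.toLowerCase().includes(term));
}

async function answer(engine, query) {
  const prompt = engine.buildPrompt(query);
  const response = await complete(prompt);
  if (!containsBlockedTerm(response)) {
    engine.addInteraction(query, response); // keep good turns as future context
  }
  return response;
}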

Prompt Engine is a simple tool that helps you build an appropriate pattern for generating prompts. It’s an effective way to manage the limitations of large language models like GPT-3 and Codex, while creating the necessary feedback loops that help prevent a model from behaving in unexpected ways.

Copyright © 2023 IDG Communications, Inc.
