Press "Enter" to skip to content

Getting started with Azure OpenAI

Modern machine learning and AI research has moved quickly from the lab to our IDEs, with tools like Azure Cognitive Services providing API-based access to pretrained models. There are many different approaches to delivering AI services, and one of the most promising for working with language is a technique known as generative pre-training, or GPT, which learns from large amounts of text.

OpenAI and Microsoft

The OpenAI research lab pioneered this technique, publishing the initial paper on the subject in 2018. The model it uses has gone through several iterations, beginning with the unsupervised GPT-2, which used unlabeled data to mimic human-written text. Built on 40 GB of public internet content, GPT-2 required significant training to produce a model with 1.5 billion parameters. It was followed by GPT-3, a much larger model with 175 billion parameters. Licensed exclusively to Microsoft, GPT-3 is the foundation for tools such as the programming-code-focused Codex used by GitHub Copilot and the DALL-E image generator.

With a model like GPT-3 requiring significant amounts of compute and memory, on the order of thousands of petaflop/s-days, it is an ideal candidate for cloud-based high-performance computing on specialized supercomputer hardware. Microsoft has built its own Nvidia-based servers for supercomputing on Azure, and its cloud instances appear on the TOP500 supercomputing list. Azure’s AI servers are built around Nvidia Ampere A100 Tensor Core GPUs, interconnected via a high-speed InfiniBand network.

Add OpenAI to Azure

OpenAI’s generative AI tools have been built and trained on Azure servers. As part of a long-term agreement between OpenAI and Microsoft, OpenAI tools will be available as part of Azure, with Azure-specific APIs and integration with Azure billing services. After some time in private preview, the Azure OpenAI API set is now generally available, with support for GPT-3 text generation and the Codex code model. Microsoft has said that it will add DALL-E imaging in a future update.

That doesn’t mean that just anyone can create an application that uses GPT-3; Microsoft still gates access to ensure that projects comply with its ethical AI use policies and are tightly scoped to specific use cases. You must also be a direct Microsoft customer to get access to Azure OpenAI. Microsoft uses a similar process for access to its Limited Access Cognitive Services, where there is a possibility of impersonation or privacy violations.

Those policies are likely to remain stringent, and some areas, such as healthcare, will likely require additional protection to meet regulatory requirements. Microsoft’s own experiences with AI language models have taught it a lesson it doesn’t want to repeat. As added protection, there are content filters on inputs and outputs, with alerts for both Microsoft and developers.

Explore Azure OpenAI Studio

Once your account has been approved to use Azure OpenAI, you can start building code that uses your API endpoints. The appropriate Azure resources can be created from the portal, the Azure CLI, or ARM templates. If you’re using the Azure portal, create a resource assigned to your account and the resource group you intend to use for your application and any associated Azure services and infrastructure. Next, give the resource a name and select a pricing tier. At the moment there is only one pricing option, but this will likely change as Microsoft rolls out new service levels.

With a resource in place, you can deploy a model using Azure OpenAI Studio. This is where you will do most of your work with OpenAI. Currently, you can choose from members of the GPT-3 family of models, including the code-focused Codex. Additional models provide embeddings, dense semantic representations of text that are optimized for search.

Within each family, there is a set of models with names indicating both cost and capability. If you use GPT-3, Ada is the lowest cost and least capable, and Davinci the highest. Each model is a superset of the previous one, so as tasks get more complex you don’t need to change your code; you just choose a different model. Interestingly, Microsoft recommends starting with the most capable model when designing an OpenAI-based application, as this lets you tune the underlying model for price and performance when it goes into production.

Work with model customization

Although GPT-3’s text completion features have gone viral, in practice your application will need to be much more focused on your specific use case. You don’t want GPT-3 powering a support service that regularly gives irrelevant advice. You must create a custom model using training examples with the desired inputs and outputs, which Azure OpenAI calls “completions.” It’s important to have a large training set, and Microsoft recommends using several hundred examples. You can wrap all your prompts and completions in a single JSONL file, with one record per line, to simplify the management of your training data.
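As a rough illustration, here is what assembling that training data might look like from Python. This is a minimal sketch: the file name, prompts, and completions are hypothetical examples, not required values.

```python
import json

# Hypothetical prompt/completion pairs for a support triage model.
# Fine-tuning data is written as JSONL: one JSON record per line.
examples = [
    {"prompt": "Customer cannot reset their password ->",
     "completion": " Route to: Account Security.\n"},
    {"prompt": "Invoice shows a duplicate charge ->",
     "completion": " Route to: Billing.\n"},
]

with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```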


With a custom model in place, you can use Azure OpenAI Studio to test how GPT-3 will work for your scenario. A basic playground lets you see how the model responds to specific prompts, with a simple console app that lets you type a prompt and returns an OpenAI completion. Microsoft describes good prompt design as “show, don’t tell,” suggesting prompts should be as explicit as possible to get the best results. The playground also helps train your model, so if you’re building a classifier, you can provide a list of texts and expected results before delivering an input and a trigger to get a response.
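A few-shot classifier prompt of that kind might look something like the sketch below; the labels and messages are invented for illustration.

```python
# Labeled examples first, then the new input and a trigger
# ("Sentiment:") that the model is expected to complete.
prompt = """Classify the sentiment of each message.

Message: The new dashboard is fantastic.
Sentiment: Positive

Message: My order arrived broken and support never replied.
Sentiment: Negative

Message: Setup took longer than the documentation said it would.
Sentiment:"""
```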

A useful feature of the playground is the ability to set an intent and expected behaviors ahead of time, so if you’re using OpenAI to power a help desk triage tool, you can set the expectation that the output will be polite and calm, ensuring it won’t imitate an angry user. The same tools can be used with the Codex model, so you can see how it works as a code completion tool or as a dynamic assistant.

Write code to work with Azure OpenAI

Once you’re ready to start coding, you can use your deployment’s REST endpoints, either directly or with the OpenAI Python libraries. The latter is probably the fastest route to live code. You will need the endpoint URL, an authentication key, and the name of your deployment. Once you have them, set the appropriate environment variables for your code. As always, in production it’s best not to hardcode keys but to manage them with a tool like Azure Key Vault.
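A minimal configuration sketch, assuming the OpenAI Python library’s Azure support: the environment variable names and the API version shown are my assumptions, so check the Azure OpenAI documentation for current values.

```python
import os
import openai

# Point the OpenAI library at an Azure deployment instead of openai.com.
openai.api_type = "azure"
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")  # e.g. https://<resource>.openai.azure.com/
openai.api_key = os.getenv("AZURE_OPENAI_KEY")        # in production, pull this from Azure Key Vault
openai.api_version = "2022-12-01"                     # assumed API version; confirm in the docs
```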

Calling an endpoint is quite easy: just use the openai.Completion.create method to get a response, setting the maximum number of tokens for the generated response (your prompt and the response together must fit within the model’s token limit). The response object returned by the API contains the text generated by your model, which can be extracted, formatted, and then used by the rest of your code. The basic calls are simple, and there are additional parameters your code can use to manage the response; these control how creative or deterministic the model’s output is, and you can use them to ensure that responses stay direct and accurate.
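Continuing the sketch above, a call might look like this. The deployment name is hypothetical, and temperature is one of the parameters that reins in the model’s creativity.

```python
ticket_text = "The app crashes every time I open the reports tab on Android."

response = openai.Completion.create(
    engine="my-davinci-deployment",  # hypothetical deployment name from Azure OpenAI Studio
    prompt="Summarize this support ticket in one sentence:\n" + ticket_text,
    max_tokens=100,    # maximum tokens to generate in the response
    temperature=0.2,   # low temperature keeps the output direct and predictable
)

print(response.choices[0].text.strip())
```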

If you are using another language, use its REST and JSON-parsing tools. You can find an API reference in the Azure OpenAI documentation, or take advantage of the Swagger specifications hosted on Azure’s GitHub to generate API calls and work with the returned data. This approach works well with IDEs like Visual Studio.
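If you’d rather hit the REST endpoint directly, the shape of the call is roughly as follows, shown here with Python’s requests for brevity; the deployment name and API version are assumptions, and the exact paths are defined in the API reference.

```python
import os
import requests

endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")  # e.g. https://<resource>.openai.azure.com
deployment = "my-davinci-deployment"           # hypothetical deployment name

response = requests.post(
    f"{endpoint}/openai/deployments/{deployment}/completions",
    params={"api-version": "2022-12-01"},      # assumed API version
    headers={"api-key": os.getenv("AZURE_OPENAI_KEY")},
    json={"prompt": "Say hello to Azure OpenAI.", "max_tokens": 50},
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["text"])
```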

Azure OpenAI pricing

A key element of the OpenAI models is their token-based pricing. Tokens in Azure OpenAI are not the familiar authentication token; they are tokenized sections of strings, created using an internal statistical model. OpenAI provides a tool on its site that shows how strings are tokenized, to help you understand how your queries are billed. You can expect a token to be approximately four characters of text, although it can be fewer or more; it works out at roughly 100 tokens for 75 words (about a normal paragraph of text).
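OpenAI has open sourced its tokenizer as the tiktoken library, so you can estimate counts locally before sending a query; the encoding name below is my assumption for the GPT-3-era models.

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("p50k_base")  # assumed encoding for GPT-3-era models
text = "Tokens are tokenized sections of strings, not authentication tokens."
token_count = len(enc.encode(text))
print(f"{token_count} tokens for {len(text.split())} words")
```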

The more capable the model, the more expensive the tokens. The base Ada model is around $0.0004 per 1,000 tokens, and the higher-end Davinci is $0.02. If you apply your own tuning, there is a storage cost, and if you use embeddings, costs can be an order of magnitude higher due to increased compute requirements. There are additional costs for fine-tuning models, starting at $20 per compute hour. The Azure website has sample prices, but actual prices may vary depending on your organization’s account relationship with Microsoft.
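As a back-of-envelope check using those sample rates (which, again, may not match your bill):

```python
# Sample per-1,000-token prices quoted above; confirm on the Azure pricing page.
PRICE_PER_1K_USD = {"ada": 0.0004, "davinci": 0.02}

def estimate_cost(total_tokens: int, model: str) -> float:
    """Rough USD cost for a given token count on a given model."""
    return total_tokens / 1000 * PRICE_PER_1K_USD[model]

# One million tokens (roughly 750,000 words) comes to about
# $0.40 on Ada and $20.00 on Davinci at the sample rates.
print(estimate_cost(1_000_000, "ada"), estimate_cost(1_000_000, "davinci"))
```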

Perhaps the most surprising thing about Azure OpenAI is how simple it is. Since you’re using prebuilt models (with the option of some fine-tuning), all you need to do is apply some basic prompt design, understand how prompts generate results, and link the tools to your code, generating text content or code whenever it’s needed.

Copyright © 2023 IDG Communications, Inc.
