Using Hugging Face machine learning models in Azure

Microsoft’s recent Azure Open Source Day showcased a new reference application built with cloud-native tools and services, with a focus on Microsoft’s own open source tools. The app is designed as a service that helps owners reunite with lost pets. It uses machine learning to quickly compare photos of a lost animal with images from animal shelters, rescues, and community sites. It’s a good example of how open source tools can be used to build complex sites and services, from infrastructure-as-code tools to application frameworks and the various tools that add functionality to code.

At the heart of the app is an open source machine learning model, part of a library of many thousands of models and data sets developed by the Hugging Face community and built on top of its large selection of tools and services. The scale of the community is a good reason to use Hugging Face models, whether you import them for inference in your own code, run them on your own servers, or access them via a cloud API.

Why use Hugging Face?

There’s another reason to consider working with Hugging Face on Azure: it allows you to apply AI to many different business problems. Although Microsoft’s own Cognitive Services APIs cover many common AI scenarios with well-defined APIs, they represent one company’s opinionated view of which machine learning services make sense for businesses. That makes them something of a jack of all trades, designed for general purposes rather than specific tasks. If your code needs to support an edge case, it can be a lot of work to adapt the APIs to fit.

Yes, there is the option to create your own specific models using Azure Machine Learning Studio, working with tools like PyTorch and TensorFlow to design and train models from scratch. But that requires a strong background in data science and machine learning. There are other problems with a “from scratch” approach. Azure has an increasing number of virtual machine options for machine learning training, but the process can have significant compute requirements and is expensive to run, especially if you’re building a large model that requires a lot of data. Not all of us are OpenAI, with the budget to build cloud-hosted supercomputers for training!

With over 40,000 models built on its Transformer model framework, Hugging Face can help avoid the customization problem by offering models that the community has created and trained for many more scenarios than Microsoft’s alone. You are also not limited to text; Hugging Face Transformers have been trained to work with natural language, audio, and computer vision. Hugging Face describes these capabilities as “tasks,” with, for example, over 2,000 different models for image classification and nearly 18,000 for text classification.

Hugging Face in Azure

Microsoft recently launched support for Hugging Face models in Azure, offering a set of endpoints that can be used in your code, with models imported from the Hugging Face Hub and its pipeline API. The models are created and tested by the Hugging Face community, and the endpoint approach means they are ready for inference.

The models are available at no cost; all you pay for is the Azure compute resources needed to run inference tasks. That cost is not insignificant, especially if you’re working with large amounts of data, so you should compare pricing with Azure Cognitive Services.

Creating endpoints for your code

Creating an endpoint is quite simple. In the Azure Marketplace, select Hugging Face Azure ML to add the service to your account. Add your endpoint to a resource group, then select a region and give it a name. You can now choose a Hugging Face Hub model and select the model ID and associated tasks. Next, choose an Azure compute instance for your service and a virtual network to keep your service secure. This is enough to create an endpoint, generating the URLs and keys needed to use it.

Helpfully, the service supports automatic scaling of endpoints as needed, based on the number of requests per minute. By default you are limited to a single instance, but you can use the sliders on the settings screen to set a minimum and maximum number of instances. Scaling is driven by the average number of requests over a five-minute period, with the goal of smoothing out spikes in demand that could otherwise cause unnecessary costs.
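To make the smoothing idea concrete, here is a minimal Python sketch of scaling on a five-minute average rather than on the latest reading. It is an illustration only, not Azure’s actual algorithm: the class name, the `target_rpm_per_instance` parameter, and the default values are all assumptions made for the example.

```python
import math
from collections import deque


class RequestRateScaler:
    """Toy autoscaler (hypothetical): sizes a pool from the mean
    requests-per-minute seen over a sliding window."""

    def __init__(self, min_instances=1, max_instances=4,
                 target_rpm_per_instance=60, window_minutes=5):
        if not 1 <= min_instances <= max_instances:
            raise ValueError("need 1 <= min_instances <= max_instances")
        self.min_instances = min_instances
        self.max_instances = max_instances
        # Assumed capacity of one instance; not a documented Azure figure.
        self.target_rpm_per_instance = target_rpm_per_instance
        # A deque with maxlen drops the oldest sample automatically.
        self.samples = deque(maxlen=window_minutes)

    def record_minute(self, request_count):
        """Store the number of requests seen in the most recent minute."""
        self.samples.append(request_count)

    def desired_instances(self):
        """Return the instance count for the current windowed average,
        clamped to the configured minimum and maximum."""
        if not self.samples:
            return self.min_instances
        average_rpm = sum(self.samples) / len(self.samples)
        needed = math.ceil(average_rpm / self.target_rpm_per_instance)
        return max(self.min_instances, min(self.max_instances, needed))
```

With these assumed settings, a single one-minute spike of 600 requests among four quiet minutes of 30 averages out to 144 requests per minute, so the sketch asks for three instances instead of jumping straight to the maximum.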

For now, there is very little documentation on the Azure integration, but you can get a feel for it from the Hugging Face AWS endpoint documentation. The Endpoint API builds on the existing Inference API, which you can use to work out how to structure payloads.

The service gives you a convenient URL to test your inference model. This includes sample Python and JavaScript code, as well as the option to use curl from the command line. Data is sent as JSON, and responses are returned as JSON too. You can use standard libraries to assemble and process the JSON, allowing you to embed REST API calls in your code. If you’re using Python, you can copy the sample code into a Jupyter notebook, where you can share tests with colleagues and collaboratively build a more complete application.
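As a rough idea of what such a call can look like, here is a minimal Python sketch that uses only the standard library. The URL and key are hypothetical placeholders, and the `{"inputs": ...}` payload shape and bearer-token header are assumptions based on Hugging Face’s usual inference conventions; the sample code generated for your own endpoint is the authority on the exact format.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute the URL and key shown for your endpoint.
ENDPOINT_URL = "https://example.invalid/my-endpoint"
API_KEY = "replace-with-your-endpoint-key"


def build_request(inputs, url=ENDPOINT_URL, api_key=API_KEY):
    """Assemble a JSON POST request without sending it."""
    # Assumed payload shape: a JSON object with an "inputs" field.
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query(inputs, url=ENDPOINT_URL, api_key=API_KEY, timeout=30):
    """Send the request and decode the JSON response body."""
    request = build_request(inputs, url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query("some text to classify")` against a real endpoint would return the decoded JSON response; its structure depends on the task the model was deployed for. Keeping `build_request` separate makes it easy to inspect the payload in a notebook before anything is sent.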

Customize Hugging Face models in Azure Machine Learning

You can now use Hugging Face foundation models in Azure Machine Learning with the same tools you use to build and train your own models. Although the capability is currently in preview, it is a useful way to work with the models using familiar tools and technologies, with Azure Machine Learning used to fine-tune and deploy Hugging Face models in your applications. You can search for models in the Azure Machine Learning registry, where they are ready to run.

This is a quick way to add additional pretrained model endpoints to your code. You also have the option to fine-tune models on your own data, using Azure storage for training and test data and working with Azure Machine Learning pipelines to manage the process. Treating the Hugging Face models as a base for your own makes a lot of sense; they have been tested in a variety of cases that might not be quite right for you. A model trained to recognize flaws in metalwork has some of the characteristics needed to handle glass or plastic, so additional training will reduce the risk of error.

There is a growing open source machine learning community, and it is important for companies like Microsoft to embrace it. They may have experience and skills, but they don’t have the scale of that larger community, or its specialization. By working with communities like Hugging Face, developers get more choice and more options. It is a win for everyone.

Copyright © 2023 IDG Communications, Inc.
