7/4/2023 Admin

Bring Your Own Data to Azure OpenAI


image

Azure OpenAI, using models created by OpenAI such as GPT 3 and GPT 4 can do amazing things using the data they were trained on.

This poses a challenge for users who want to customize Azure OpenAI with their own private data. How can you make Azure OpenAI more responsive and relevant to your specific needs? Basically, how can you get your own custom private data into Azure OpenAI?

The RAG Pattern

image

As described in the article: Use a Poor Developers Vector Database to Implement The Retrieval Augmented Generation (RAG) pattern is a technique for building natural language generation systems that can retrieve and use relevant information from external sources.

image

The concept is to first retrieve a set of passages that are related to the search query, then use them to supply grounding to the prompt, to finally generate a natural language response that incorporates the retrieved information.

To ground a model means to provide it with some factual or contextual information that can help it produce more accurate and coherent outputs.

For example, if the prompt is “Who is the president of France?”, the model needs to know some facts about the current political situation in France.

If the prompt is “How are you feeling today?”, the model needs to know some context about the previous conversation or the user’s mood.

One way to ground a model is to use the RAG pattern and retrieve relevant information from external sources, and supply that information to the prompt.

image

Microsoft Azure OpenAI provides a service that will implement this process called Bring your own data:

  • It allows you to run OpenAI models, such as ChatGPT and GPT-4, on your own data
  • It supports connecting to multiple data sources, such as Azure Cognitive Search index, Azure Blob storage container, or local files (but everything is imported into Azure Cognitive Search)

Set-up Azure OpenAI

image

See the article: What Is Azure OpenAI And Why Would You Want To Use It? for instructions on setting up Azure OpenAI.

Set Up Azure Cognitive Search

image

The Bring your own data feature provides several options to add your own data, but they all involve ultimately importing that data into Azure Cognitive Search.

It provides the following functionality:

  • Azure Cognitive Search is a search engine that allows full text search over a search index containing user-owned content.
  • It provides rich indexing with lexical analysis and optional AI enrichment for content extraction and transformation.
  • It has a rich query syntax for text search, fuzzy search, autocomplete, geo-search and more.
  • It is programmable through REST APIs and client libraries in Azure SDKs.
  • It integrates with Azure at the data layer, machine learning layer, and AI (Cognitive Services).

If you do not already have Azure Cognitive Search set up, go to:  https://portal.azure.com/#create/Microsoft.Search.

image

Fill in the project details, paying special attention to the Pricing tier, and press Next: Scale.

Note: To use this service with Bring your own data, you must use Basic Tier or higher (this is cost is a minimum $75 a month).

image

Set the number of replicas you want and press Review + create.

image

Click Create.

image

After the service is created, you can navigate to it.

image

Note: For the best search results, you will want to enable Semantic Search.

However, at the time of this writing, it costs a minimum of $499 a month.

Get Sample Data

image

For sample data, we will go to the Azure OpenAI documentation page at: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/overview and click the button to Download PDF.

image

If we don’t already have an Azure Storage account, we will go into the Azure Portal, click Create a resource, search for Storage account, select it, and click Create.

image

In the Storage account, create a container.

image

Select the container.

image

Select Upload.

image

Upload the file to the container.

Bring Your Own Data

image

Navigate to the Azure OpenAI Portal using: https://oai.azure.com/.

Click the Chat link.

image

Select the Add your data tab and then click the Add a data source button.

image

In the Add data dialog, select Azure Blob Storage for the data source and fill out the selections to indicate the storage container and Azure Cognitive Search resource created earlier.

Enter openai for the Index name.

image

Click Save and close.

image

You will see a status message that the data is being indexed.

image

It will indicate when the indexing is complete.

Note: There is a checkbox option to limit the responses returned by the Chat to the data supplied. Leave that checked for now.

image

If we open another web browser window and navigate to the Azure Cognitive Search resource created earlier, we will see that an index has been created.

image

If we ever need to restart the Azure OpenAI Bring your own data wizard, in the Azure OpenAI Studio, we can select this existing index by first selecting Azure Cognitive Search for the data source.

Chatting With The Data

image

Returning to the Azure OpenAI Studio, in the Chat session section, we can enter a query.

image

The response will be displayed along with links to the original .pdf document.

Creating A Web Application

image

You can create a web application by selecting the Deploy to dropdown, which will open a deployment wizard.

image

You can specify the settings to create a deployment to an Azure web app.

image

The app will be created and deploy.

image

The Notifications will let you know when the process is complete and provide a link to the web app.


image

When you navigate to the app you will need to log in and grant permission.

image

You will then have the ability to chat with your data source.

Other Options

There are other options to achieve the same results. See: Use a Poor Developers Vector Database to Implement The RAG Pattern

 

Links

What Is Azure OpenAI And Why Would You Want To Use It?

 

Azure Cognitive Search pricing

Semantic search in Azure Cognitive Search

 

Introducing Azure OpenAI Service On Your Data in Public Preview

Azure OpenAI on your data (preview)

(Video) New easy way to add your data to Azure OpenAI Service

(Video) Making Enterprise GPT Real with Azure Cognitive Search and Azure OpenAI Service

(Video) Azure OpenAI Service + Custom Data + Deploy Web App

An error has occurred. This application may no longer respond until reloaded. Reload 🗙