There’s no question that AI is a key technological advancement for the future. Everywhere you turn, ChatGPT or other models are being discussed as ways of improving productivity. With that, many people are trying to build their own solutions so they can feed these models their own private data to support chat-based or agentic solutions.
So the question is, how do I bring my own data to a chat solution? The answer is RAG.
What is RAG?
Not going to lie, when I think of RAG, I am the level of nerd where my non-neurotypical brain goes directly to “The Hitchhiker’s Guide to the Galaxy”.
But that’s not what we are thinking about here, even if I would say the idea of “Don’t Panic, bring RAG” works. RAG stands for “Retrieval Augmented Generation”, and refers to the idea of bringing data to your model in a way that doesn’t require retraining the model. The idea is that an LLM can be very costly to train and update, but if I can provide a vector database that can be searched at query time, the relevant data can be retrieved and handed to the model alongside the question.
RAG is an architectural pattern of providing your LLM with a data source for looking up information to help support the responses.
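To make the pattern a bit more concrete, here is a minimal sketch of the retrieval half in Python. It is not how Azure AI Search works internally. The three-dimensional vectors are hand-written stand-ins for real embeddings, and the names `INDEX`, `retrieve`, and `build_prompt` are hypothetical ones I made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, index, top_k=2):
    """Return the text of the top_k entries most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]


def build_prompt(question, chunks):
    """Put the retrieved chunks in front of the question for the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy index: made-up vectors standing in for real embeddings (assumption).
INDEX = [
    {"text": "Towels are the most useful thing a hitchhiker can carry.",
     "vector": [0.9, 0.1, 0.0]},
    {"text": "The deployment uses a virtual network.",
     "vector": [0.0, 0.2, 0.9]},
    {"text": "Always know where your towel is.",
     "vector": [0.8, 0.3, 0.1]},
]

query_vector = [1.0, 0.2, 0.0]  # pretend embedding of the question
chunks = retrieve(query_vector, INDEX, top_k=2)
prompt = build_prompt("Why carry a towel?", chunks)
print(prompt)
```

The model itself never changes here. Only the prompt does, which is why no retraining is needed when the data changes.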
What did I build to help?
To implement this, we need an architecture that supports using a vector database alongside a storage account. In short, an architecture like this:
architecture-beta
group vnet(cloud)[Virtual Network]
service aisearch(database)[AI Search] in vnet
service jumpbox(server)[Jump Box] in vnet
service blob(disk)[Blob Storage] in vnet
service logicapp(internet)[Logic App] in vnet
service openai(internet)[Azure OpenAI] in vnet
blob:L -- R:logicapp
openai:L -- R:logicapp
aisearch:T -- B:logicapp
And that will lead to the following flow:
sequenceDiagram
participant File
participant Blob
participant LogicApp
participant OpenAI
participant AiSearch
File->>Blob: File uploaded to blob storage
Blob->>LogicApp: Logic app triggered to perform index
LogicApp->>LogicApp: Chunk up the data
LogicApp->>OpenAI: Send chunks to embeddings model
OpenAI->>LogicApp: Receive back embedding results
LogicApp->>AiSearch: Output vectors to database
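If it helps to see that flow as code, here is a rough Python sketch of the chunk, embed, and store steps. This is only an illustration and not what the logic app actually runs. `fake_embed` is a hash-based placeholder I invented so the example runs offline (it carries no semantic meaning the way a real embeddings model does), and the in-memory `store` list stands in for the AI Search index.

```python
import hashlib


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def fake_embed(chunk, dimensions=4):
    """Placeholder for the embeddings call: NOT a real embedding."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dimensions]]


def index_document(name, text, store):
    """Chunk a document, embed each chunk, and append the records to store."""
    for position, chunk in enumerate(chunk_text(text)):
        store.append({
            "id": f"{name}-{position}",
            "text": chunk,
            "vector": fake_embed(chunk),
        })
    return store


# Hypothetical uploaded file contents; the list stands in for the index.
store = index_document("guide.txt", "Don't Panic. " * 8, [])
for record in store:
    print(record["id"], repr(record["text"]))
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence that straddles a boundary still shows up whole in at least one chunk.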
Now the good news is that, for the infrastructure, I’ve created Bicep templates that deploy the above architecture in a network-isolated manner.
Using those templates, you can deploy the entire environment with the following steps:
Deploying the repo
First, clone the repo, then log in to the Azure CLI. The commands below target the Azure Government cloud:
az cloud set --name AzureUSGovernment
az login --use-device-code
This code assumes you already have a virtual network. If you don’t, you will need to create one:
RESOURCE_GROUP_NAME="search-rag-demo-rg"
VNET_NAME="search-rag-vnet"
LOCATION="usgovvirginia"
SUBNET_NAME="default"
# Create the resource group
az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
# Create the virtual network
az network vnet create --name $VNET_NAME --resource-group $RESOURCE_GROUP_NAME --subnet-name $SUBNET_NAME
Then you will want to deploy the basic environment:
RESOURCE_GROUP_NAME="search-rag-demo-rg"
PROJECT_PREFIX="rag"
ENV_PREFIX="dev1"
EXISTING_NETWORK_NAME="search-rag-vnet"
DEFAULT_TAG_NAME="environment"
DEFAULT_TAG_VALUE="search-rag"
az deployment group create \
  --resource-group "$RESOURCE_GROUP_NAME" \
  --template-file ./main.bicep \
  --parameters \
    project_prefix="$PROJECT_PREFIX" \
    env_prefix="$ENV_PREFIX" \
    existing_network_name="$EXISTING_NETWORK_NAME" \
    default_tag_name="$DEFAULT_TAG_NAME" \
    default_tag_value="$DEFAULT_TAG_VALUE" \
    deploy_openai=true \
    deploy_jumpbox=true
NOTE: If you need only parts of this deployment, there are boolean parameters (such as deploy_openai and deploy_jumpbox above) that control which pieces get deployed, like blob storage, the logic app, etc.
Next, if you need the jumpbox, deploy it by running the following:
ADMIN_USERNAME=""
ADMIN_PASSWORD=""
JUMPBOX_SUBNET_ID=""
az deployment group create \
  --resource-group "$RESOURCE_GROUP_NAME" \
  --template-file ./main-jumpbox.bicep \
  --parameters \
    project_prefix="$PROJECT_PREFIX" \
    env_prefix="$ENV_PREFIX" \
    default_tag_name="$DEFAULT_TAG_NAME" \
    default_tag_value="$DEFAULT_TAG_VALUE" \
    admin_username="$ADMIN_USERNAME" \
    admin_password="$ADMIN_PASSWORD" \
    jumpbox_subnet_id="$JUMPBOX_SUBNET_ID"
Then run the following to deploy the Azure OpenAI (AOAI) instance:
ADMIN_EMAIL=""
AOAI_SUBNET_ID=""
az deployment group create \
  --resource-group "$RESOURCE_GROUP_NAME" \
  --template-file ./main-aoai.bicep \
  --parameters \
    project_prefix="$PROJECT_PREFIX" \
    env_prefix="$ENV_PREFIX" \
    default_tag_name="$DEFAULT_TAG_NAME" \
    default_tag_value="$DEFAULT_TAG_VALUE" \
    admin_email="$ADMIN_EMAIL" \
    subnet_id="$AOAI_SUBNET_ID"
From there, you can leverage the built-in logic app workflow to set up the indexing of the data. You can find more details on how to do this in this article from Microsoft.