Azure AI Search
What is Azure AI Search?
Azure AI Search (formerly Azure Cognitive Search) is a fully managed cloud search service that provides full-text search, vector search, semantic ranking, and AI enrichment over your own content — think of it as a smart, enterprise-grade search engine you point at your data.
```
Your Data                Azure AI Search              Your App
──────────               ───────────────              ────────
Blob Storage ──────────▶ Index + Embeddings ────────▶ Search Results
SQL Database  (indexing) Vector Store         (query) RAG Answers
SharePoint               AI Enrichment                Recommendations
CosmosDB                 Semantic Ranking             Autocomplete
Custom API               Hybrid Search
```
Core Concepts
```
┌─────────────────────────────────────────────────────────────┐
│                   AZURE AI SEARCH SERVICE                   │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │    Index     │  │   Indexer    │  │  Skillset (AI)   │   │
│  │              │  │              │  │                  │   │
│  │ - Fields     │  │ - Pulls from │  │ - OCR            │   │
│  │ - Embeddings │  │   data source│  │ - Entity extract │   │
│  │ - ACL fields │  │ - Schedules  │  │ - Translation    │   │
│  │ - Schema     │  │ - Transforms │  │ - Key phrases    │   │
│  └──────────────┘  └──────────────┘  │ - Custom skills  │   │
│                                      └──────────────────┘   │
│  ┌──────────────────────────────────────────────────────┐   │
│  │                     QUERY ENGINE                     │   │
│  │  Full-text │ Vector │ Hybrid │ Semantic │ Filters    │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```
| Concept | What it is |
|---|---|
| Index | A collection of searchable documents (like a table in a DB) |
| Field | A property of a document (searchable, filterable, facetable) |
| Indexer | Automated pipeline that pulls data from a source into the index |
| Data Source | Connection to where your raw data lives |
| Skillset | AI enrichment pipeline applied during indexing |
| Scoring Profile | Custom relevance boosting rules |
| Semantic Config | L2 reranking using language understanding |
Search Types
Azure AI Search supports four search modes — often combined:
```
Query: "What is our refund policy for international orders?"

Full-text Search       Vector Search           Hybrid Search
────────────────       ─────────────           ─────────────
Keyword matching       Semantic similarity     Full-text + Vector
BM25 algorithm         Embedding comparison    Combined score (RRF)
"refund policy"        Similar meaning docs    Best of both worlds
international          Even if words differ
orders

        + Semantic Reranking (L2)
          Re-orders results using deep language model understanding
```
Full-Text Search
Classic keyword search using the BM25 ranking algorithm:
```http
POST /indexes/documents/docs/search?api-version=2024-07-01
{
  "search": "refund policy international orders",
  "queryType": "full",
  "searchMode": "all",
  "searchFields": "content, title",
  "select": "id, title, content, source_file",
  "top": 5,
  "count": true
}
```
Supports:
```
Simple query:   "refund policy"
Phrase query:   "\"refund policy\""      exact phrase
Wildcard:       "refund*"                prefix match
Fuzzy:          "refund~1"               1 edit distance
Proximity:      "\"refund policy\"~5"    within 5 words
Boolean:        "refund AND (policy OR terms)"
Boosting:       "refund^3 policy"        boost refund term
```
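Under `queryType=full`, the operators above mean raw user input like `refund~1` is parsed as syntax, not text. A minimal sketch of neutralizing those characters before interpolation (`escape_lucene` is a hypothetical helper, not part of the SDK; it also escapes single `&` and `|`, which is harmless):

```python
import re

# Characters with special meaning in the full Lucene query syntax
LUCENE_SPECIAL = r'([+\-&|!(){}\[\]^"~*?:\\/])'

def escape_lucene(text: str) -> str:
    """Backslash-escape Lucene operators so user input matches literally."""
    return re.sub(LUCENE_SPECIAL, r"\\\1", text)

# User text is escaped; trusted boost clauses are appended unescaped
user_input = "refund~1 (international)"
safe_query = escape_lucene(user_input) + " AND policy^3"
```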
Vector Search
Search by semantic meaning using embeddings — finds relevant docs even when exact keywords don’t match:
```
"How do I get my money back?"
        ↓
Embedding model (text-embedding-3-large)
        ↓
[0.023, -0.412, 0.891, ...]   3072-dimension vector
        ↓
Cosine similarity search in index
        ↓
Finds: "Refund and return policy" (similar meaning)
       even though no words match
```
```http
POST /indexes/documents/docs/search?api-version=2024-07-01
{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [0.023, -0.412, 0.891, ...],
      "fields": "embedding",
      "k": 5,
      "exhaustive": false
    }
  ],
  "select": "id, title, content",
  "top": 5
}
```
Vector Algorithm Options
| Algorithm | Speed | Accuracy | Use case |
|---|---|---|---|
| HNSW | Fast | High | Production — approximate nearest neighbor |
| Exhaustive KNN | Slow | Perfect | Small indexes or testing |
```json
// Index vector field config
{
  "name": "embedding",
  "type": "Collection(Edm.Single)",
  "searchable": true,
  "dimensions": 3072,  // text-embedding-3-large
  "vectorSearchProfile": "hnsw-profile"
}
```
Hybrid Search (Best Quality)
Combines full-text + vector scores using Reciprocal Rank Fusion (RRF):
```
Query: "refund policy international"

BM25 Results:             Vector Results:
1. Refund Policy Doc      1. Return & Refund Guide
2. International FAQ      2. Customer Service Policy
3. Terms & Conditions     3. International Orders FAQ
            ↓
   RRF merges both ranked lists
            ↓
Hybrid Results (best quality):
1. Refund Policy Doc      (top in both)
2. Return & Refund Guide  (high vector score)
3. International FAQ      (high BM25 score)
```
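RRF itself is a one-liner per document: its fused score is the sum of 1/(k + rank) over every ranked list it appears in, where k is a smoothing constant (60 in the original RRF paper). A pure-Python sketch of the merge, using the lists from the example above:

```python
def rrf_merge(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum of 1 / (k + rank) per list."""
    scores: dict[str, float] = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

bm25   = ["Refund Policy Doc", "International FAQ", "Terms & Conditions"]
vector = ["Return & Refund Guide", "Customer Service Policy", "International Orders FAQ"]
fused  = rrf_merge([bm25, vector])
```

A document appearing in both lists accumulates both contributions, which is exactly what lifts "top in both" results to the front of the hybrid ranking.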
```http
POST /indexes/documents/docs/search?api-version=2024-07-01
{
  "search": "refund policy international orders",
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [0.023, -0.412, ...],
      "fields": "embedding",
      "k": 50
    }
  ],
  "queryType": "simple",
  "select": "id, title, content, source_file",
  "top": 5
}
```
Semantic Ranking
A second-pass reranking layer using a Microsoft language model — reads and understands the top results to re-order them by actual relevance:
```
Hybrid Search → Top 50 results
        ↓
  Semantic Ranker
  (reads each result, understands meaning,
   compares to query intent)
        ↓
  Reranked Top 5
  + Semantic captions
  + Semantic answers (extracted key passages)
```
```json
{
  "search": "refund policy international orders",
  "vectorQueries": [...],
  "queryType": "semantic",
  "semanticConfiguration": "my-semantic-config",
  "captions": "extractive",         // extract relevant snippets
  "answers": "extractive|count-3",  // extract direct answers
  "top": 5
}
```
Response includes:
```json
{
  "value": [
    {
      "@search.rerankerScore": 2.847,
      "@search.captions": [
        {
          "text": "International orders are eligible for refund within 30 days",
          "highlights": "International orders...refund within 30 days"
        }
      ],
      "content": "Full document content...",
      "title": "Refund Policy"
    }
  ],
  "@search.answers": [
    {
      "text": "International orders are eligible for refund within 30 days of purchase",
      "score": 0.94
    }
  ]
}
```
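Calling the REST API directly yields plain JSON in exactly that shape; a minimal sketch of consuming it (the 0.7 answer-score cutoff is an illustrative assumption, not a service default):

```python
response = {
    "value": [
        {
            "@search.rerankerScore": 2.847,
            "@search.captions": [
                {"text": "International orders are eligible for refund within 30 days",
                 "highlights": "International orders...refund within 30 days"}
            ],
            "content": "Full document content...",
            "title": "Refund Policy",
        }
    ],
    "@search.answers": [
        {"text": "International orders are eligible for refund within 30 days of purchase",
         "score": 0.94}
    ],
}

# Surface direct answers first, filtered by an assumed confidence cutoff
answers = [a["text"] for a in response.get("@search.answers", []) if a["score"] >= 0.7]

# Then the reranked documents with their best caption
hits = [
    {
        "title": doc["title"],
        "score": doc["@search.rerankerScore"],
        "caption": (doc.get("@search.captions") or [{}])[0].get("text", ""),
    }
    for doc in response["value"]
]
```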
Index Schema Design
```json
{
  "name": "rag-index",
  "fields": [
    {
      "name": "chunk_id",
      "type": "Edm.String",
      "key": true,
      "searchable": false,
      "filterable": true
    },
    {
      "name": "content",
      "type": "Edm.String",
      "searchable": true,         // full-text search
      "filterable": false,
      "retrievable": true,
      "analyzer": "en.microsoft"  // language analyzer
    },
    {
      "name": "title",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true
    },
    {
      "name": "source_file",
      "type": "Edm.String",
      "searchable": false,
      "filterable": true,   // filter by document
      "retrievable": true,
      "facetable": true     // faceted navigation
    },
    {
      "name": "sensitivity",
      "type": "Edm.String",
      "filterable": true,   // filter by label
      "facetable": true
    },
    {
      "name": "allowed_groups",
      "type": "Collection(Edm.String)",
      "filterable": true    // document-level security
    },
    {
      "name": "last_modified",
      "type": "Edm.DateTimeOffset",
      "filterable": true,
      "sortable": true
    },
    {
      "name": "embedding",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "retrievable": false,  // don't return raw vectors
      "dimensions": 3072,    // text-embedding-3-large
      "vectorSearchProfile": "hnsw-profile"
    }
  ],
  "vectorSearch": {
    "algorithms": [
      {
        "name": "hnsw-algo",
        "kind": "hnsw",
        "hnswParameters": {
          "metric": "cosine",
          "m": 4,                 // connections per layer
          "efConstruction": 400,  // build quality
          "efSearch": 500         // query quality
        }
      }
    ],
    "profiles": [
      {
        "name": "hnsw-profile",
        "algorithm": "hnsw-algo",
        "vectorizer": "azure-openai-vectorizer"
      }
    ],
    "vectorizers": [
      {
        "name": "azure-openai-vectorizer",
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "https://myoai.openai.azure.com",
          "deploymentId": "text-embedding-3-large",
          "modelName": "text-embedding-3-large"
        }
      }
    ]
  },
  "semantic": {
    "configurations": [
      {
        "name": "my-semantic-config",
        "prioritizedFields": {
          "titleField": { "fieldName": "title" },
          "contentFields": [
            { "fieldName": "content" }
          ]
        }
      }
    ]
  }
}
```
Indexers and Data Sources
Indexers automatically pull data from Azure sources on a schedule:
```
Data Sources supported:
├── Azure Blob Storage (PDF, Word, Excel, HTML, JSON)
├── Azure Data Lake Gen2
├── Azure SQL Database
├── Azure Cosmos DB
├── Azure Table Storage
├── SharePoint Online
└── OneLake (Fabric)
```
```json
// Data source connection
{
  "name": "blob-datasource",
  "type": "azureblob",
  "credentials": {
    "connectionString": "ResourceId=/subscriptions/.../storageAccounts/mystg"
  },
  "container": {
    "name": "documents",
    "query": "processed/"  // only index this folder
  },
  "dataDeletionDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    "softDeleteColumnName": "IsDeleted",
    "softDeleteMarkerValue": "true"
  }
}

// Indexer — runs on schedule
{
  "name": "blob-indexer",
  "dataSourceName": "blob-datasource",
  "targetIndexName": "rag-index",
  "skillsetName": "ai-enrichment-skillset",
  "schedule": {
    "interval": "PT1H"  // run every hour
  },
  "parameters": {
    "batchSize": 10,
    "configuration": {
      "dataToExtract": "contentAndMetadata",
      "parsingMode": "default"
    }
  },
  "fieldMappings": [
    {
      "sourceFieldName": "metadata_storage_name",
      "targetFieldName": "source_file"
    }
  ],
  "outputFieldMappings": [
    {
      "sourceFieldName": "/document/content/pages/*/embedding",
      "targetFieldName": "embedding"
    }
  ]
}
```
AI Enrichment Skillsets
Skillsets are AI pipelines applied at index time that transform raw documents into enriched, searchable content:
```
Raw PDF
   ↓
OCR Skill              → extracts text from scanned images
   ↓
Split Skill            → chunks text into 512-token pieces
   ↓
Entity Recognition     → extracts people, orgs, locations
   ↓
Key Phrase Extraction  → identifies main topics
   ↓
Language Detection     → detects document language
   ↓
Translation Skill      → translates to English if needed
   ↓
Embedding Skill        → generates vectors via Azure OpenAI
   ↓
Index
```
```json
{
  "name": "ai-enrichment-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
      "name": "ocr-skill",
      "inputs": [
        { "name": "image", "source": "/document/normalized_images/*" }
      ],
      "outputs": [
        { "name": "text", "targetName": "extracted_text" }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
      "name": "split-skill",
      "textSplitMode": "pages",
      "maximumPageLength": 512,
      "pageOverlapLength": 50,
      "inputs": [
        { "name": "text", "source": "/document/content" }
      ],
      "outputs": [
        { "name": "textItems", "targetName": "pages" }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
      "name": "embedding-skill",
      "resourceUri": "https://myoai.openai.azure.com",
      "deploymentId": "text-embedding-3-large",
      "modelName": "text-embedding-3-large",
      "inputs": [
        { "name": "text", "source": "/document/content/pages/*" }
      ],
      "outputs": [
        { "name": "embedding", "targetName": "embedding" }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.EntityRecognitionSkill",
      "name": "entity-skill",
      "categories": ["Person", "Organization", "Location"],
      "inputs": [
        { "name": "text", "source": "/document/content" }
      ],
      "outputs": [
        { "name": "persons", "targetName": "persons" },
        { "name": "organizations", "targetName": "organizations" }
      ]
    }
  ],
  "knowledgeStore": {
    "storageConnectionString": "...",
    "projections": [
      {
        "tables": [
          { "tableName": "enrichedDocuments", "source": "/document" }
        ]
      }
    ]
  }
}
```
Filtering and Facets
```python
# Security filter — document-level ACL
security_filter = (
    f"allowed_groups/any(g: search.in(g, '{','.join(user_groups)}'))"
    f" or allowed_users/any(u: u eq '{user_id}')"
)

# Combined search with filters
results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    # Filter — applied before scoring (fast)
    filter=f"sensitivity ne 'HighlyConfidential' and ({security_filter})",
    # Facets — for navigation UI
    facets=["sensitivity", "source_file", "last_modified,interval:year"],
    # Ordering (OData syntax uses the search.score() function)
    order_by=["search.score() desc", "last_modified desc"],
    # Pagination
    skip=0,
    top=10,
    # Which fields to return
    select=["id", "title", "content", "source_file", "last_modified"],
    # Highlight matching terms
    highlight_fields="content-3",  # 3 fragments
    highlight_pre_tag="<mark>",
    highlight_post_tag="</mark>"
)

# Facet results for navigation
for facet in results.get_facets().get("sensitivity", []):
    print(f"{facet['value']}: {facet['count']} docs")
```
Integrated Vectorization (Preview)
With integrated vectorization, the search service generates embeddings automatically, so your application no longer makes separate embedding calls:
```python
# Old way — embed the query yourself, then search
embedding = openai_client.embeddings.create(
    input=query,
    model="text-embedding-3-large"
).data[0].embedding

results = search_client.search(
    vector_queries=[VectorizedQuery(vector=embedding, ...)]
)

# New way — integrated vectorization:
# the search service embeds the query automatically
results = search_client.search(
    search_text=query,
    vector_queries=[VectorizableTextQuery(
        text=query,              # ← pass text, not a vector
        fields="embedding",
        k_nearest_neighbors=5
    )]
)
```
Scoring Profiles (Custom Relevance)
Boost certain fields or freshness in ranking:
```json
{
  "scoringProfiles": [
    {
      "name": "boost-recent-and-title",
      "text": {
        "weights": {
          "title": 5,    // title matches worth 5x
          "content": 1
        }
      },
      "functions": [
        {
          "type": "freshness",
          "fieldName": "last_modified",
          "boost": 3,
          "freshness": {
            "boostingDuration": "P30D"  // boost docs < 30 days old
          }
        },
        {
          "type": "tag",
          "fieldName": "tags",
          "boost": 2,
          "tag": {
            "tagsParameter": "userTags"  // boost matching user tags
          }
        }
      ],
      "functionAggregation": "sum"
    }
  ],
  "defaultScoringProfile": "boost-recent-and-title"
}
```
Python SDK — Complete RAG Example
```python
from azure.search.documents import SearchClient
from azure.search.documents.models import (
    VectorizedQuery, QueryType, QueryCaptionType, QueryAnswerType
)
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

credential = DefaultAzureCredential()

search_client = SearchClient(
    endpoint=SEARCH_ENDPOINT,
    index_name="rag-index",
    credential=credential
)

openai_client = AzureOpenAI(
    azure_endpoint=OPENAI_ENDPOINT,
    api_version="2024-06-01",
    azure_ad_token_provider=get_bearer_token_provider(
        credential, "https://cognitiveservices.azure.com/.default"
    )
)

def hybrid_search_with_security(
    query: str,
    user_groups: list,
    user_id: str,
    top: int = 5
) -> list:
    # 1. Embed query
    query_embedding = openai_client.embeddings.create(
        input=query,
        model="text-embedding-3-large"
    ).data[0].embedding

    # 2. Build security filter
    group_filter = " or ".join(
        [f"allowed_groups/any(g: g eq '{g}')" for g in user_groups]
    )
    security_filter = f"({group_filter}) or allowed_users/any(u: u eq '{user_id}')"

    # 3. Hybrid search with semantic reranking
    results = search_client.search(
        search_text=query,
        vector_queries=[
            VectorizedQuery(
                vector=query_embedding,
                k_nearest_neighbors=50,
                fields="embedding"
            )
        ],
        filter=security_filter,
        query_type=QueryType.SEMANTIC,
        semantic_configuration_name="my-semantic-config",
        query_caption=QueryCaptionType.EXTRACTIVE,
        query_answer=QueryAnswerType.EXTRACTIVE,
        top=top,
        select=["chunk_id", "content", "title", "source_file"]
    )

    # 4. Extract results
    chunks = []
    for result in results:
        # captions are QueryCaptionResult objects, not dicts
        captions = result.get("@search.captions") or []
        chunks.append({
            "content": result["content"],
            "title": result["title"],
            "source": result["source_file"],
            "score": result["@search.reranker_score"],
            "caption": captions[0].text if captions else ""
        })
    return chunks
```
SKU / Pricing Tiers
| Tier | Use case | Vector index size | Replicas |
|---|---|---|---|
| Free | Dev / POC | 0.5 GB | 1 |
| Basic | Small prod | 2 GB | 3 max |
| Standard S1 | General prod | 25 GB | 12 max |
| Standard S2 | Large prod | 100 GB | 12 max |
| Standard S3 | Enterprise | 200 GB | 12 max |
| Storage Optimized L1/L2 | Huge indexes | 2 TB | 12 max |
Scale with replicas (HA + throughput) and partitions (storage + index capacity):
```
Total capacity = replicas × partitions
S1 with 3 replicas and 2 partitions = 3 × 2 = 6 search units (SU)
```
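The same arithmetic in code, together with Azure's high-availability rule of thumb (the SLA applies at 2+ replicas for read-only workloads, 3+ for read-write):

```python
def search_units(replicas: int, partitions: int) -> int:
    """Billable search units (SU) = replicas × partitions."""
    return replicas * partitions

def meets_sla(replicas: int, read_write: bool = True) -> bool:
    # Azure's SLA threshold: 3 replicas for read-write, 2 for read-only
    return replicas >= (3 if read_write else 2)

su = search_units(3, 2)  # S1 with 3 replicas and 2 partitions -> 6 SU
```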
Best Practices
| Practice | Why |
|---|---|
| Always use hybrid search | Better quality than either alone |
| Add semantic reranking | Significant quality improvement for top results |
| Set security filter at retrieval | Never rely on post-filter security |
| Use retrievable: false on embeddings | Save bandwidth — raw vectors not needed in response |
| Index in batches of 1000 documents | Optimal indexing throughput |
| Use managed identity — no API keys | Security best practice |
| Set efSearch ≥ 500 for HNSW | Better recall at cost of slight latency |
| Use separate indexes per environment | Avoid dev data polluting prod |
| Monitor throttling (503 errors) | Add replicas if seeing throttle |
| Use @search.score threshold | Filter low-confidence results |
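On the last row: score cutoffs apply at both levels, and with semantic ranking enabled the reranker score (which falls in a 0 to 4 range) is the better signal. A sketch (the 1.5 threshold is an illustrative assumption; tune it against your corpus):

```python
RERANKER_THRESHOLD = 1.5  # assumed cutoff; reranker scores range 0-4

def filter_confident(chunks: list[dict], threshold: float = RERANKER_THRESHOLD) -> list[dict]:
    """Drop retrieved chunks whose reranker score is below the cutoff."""
    return [c for c in chunks if c["score"] >= threshold]

hits = [
    {"title": "Refund Policy", "score": 2.85},
    {"title": "Unrelated Memo", "score": 0.62},
]
confident = filter_confident(hits)  # keeps only "Refund Policy"
```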
Azure AI Search is the centerpiece of enterprise RAG on Azure — it handles full-text, vector, hybrid, and semantic search in one managed service, with built-in security filtering, AI enrichment, and deep Azure integration.