Understanding Azure AI Search: Features and Benefits

What is Azure AI Search?

Azure AI Search (formerly Azure Cognitive Search) is a fully managed cloud search service that provides full-text search, vector search, semantic ranking, and AI enrichment over your own content — think of it as a smart, enterprise-grade search engine you point at your data.

Your Data                  Azure AI Search                Your App
─────────                  ───────────────                ────────
Blob Storage  ──────────▶  Index + Embeddings  ────────▶  Search Results
SQL Database  (indexing)   Vector Store        (query)    RAG Answers
SharePoint                 AI Enrichment                  Recommendations
Cosmos DB                  Semantic Ranking               Autocomplete
Custom API                 Hybrid Search

Core Concepts

┌──────────────────────────────────────────────────────────────┐
│                   AZURE AI SEARCH SERVICE                    │
│                                                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐    │
│  │    Index     │  │   Indexer    │  │  Skillset (AI)   │    │
│  │              │  │              │  │                  │    │
│  │ - Fields     │  │ - Pulls from │  │ - OCR            │    │
│  │ - Embeddings │  │   data source│  │ - Entity extract │    │
│  │ - ACL fields │  │ - Schedules  │  │ - Translation    │    │
│  │ - Schema     │  │ - Transforms │  │ - Key phrases    │    │
│  └──────────────┘  └──────────────┘  │ - Custom skills  │    │
│                                      └──────────────────┘    │
│  ┌──────────────────────────────────────────────────────┐    │
│  │                     QUERY ENGINE                     │    │
│  │   Full-text │ Vector │ Hybrid │ Semantic │ Filters   │    │
│  └──────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘
Concept            What it is
───────            ──────────
Index              A collection of searchable documents (like a table in a DB)
Field              A property of a document (searchable, filterable, facetable)
Indexer            Automated pipeline that pulls data from a source into the index
Data Source        Connection to where your raw data lives
Skillset           AI enrichment pipeline applied during indexing
Scoring Profile    Custom relevance boosting rules
Semantic Config    L2 reranking using language understanding

Search Types

Azure AI Search supports four search modes — often combined:

Query: "What is our refund policy for international orders?"
Full-text Search       Vector Search            Hybrid Search
────────────────       ─────────────            ─────────────
Keyword matching       Semantic similarity      Full-text + Vector
BM25 algorithm         Embedding comparison     Combined score (RRF)
"refund policy"        Similar meaning docs     Best of both worlds
international          Even if words differ            +
orders                                          Semantic Reranking (L2)
                                                Re-orders results using
                                                deep language model
                                                understanding

Full-Text Search

Classic keyword search using the BM25 ranking algorithm:

POST /indexes/documents/docs/search?api-version=2024-07-01
{
  "search": "refund policy international orders",
  "queryType": "full",
  "searchMode": "all",
  "searchFields": ["content", "title"],
  "select": "id, title, content, source_file",
  "top": 5,
  "count": true
}

Supports:

Simple query:   "refund policy"
Phrase query:   "\"refund policy\""       exact phrase
Wildcard:       "refund*"                 prefix match
Fuzzy:          "refund~1"                1 edit distance
Proximity:      "\"refund policy\"~5"     within 5 words
Boolean:        "refund AND (policy OR terms)"
Boosting:       "refund^3 policy"         boost the refund term
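The same operators can be passed through the Python SDK as the search_text of a full-Lucene query. A minimal sketch — the commented-out call assumes a configured SearchClient named search_client:

```python
# The operator queries above, as Python strings ready to use with
# query_type="full" (full Lucene syntax).
lucene_queries = {
    "phrase":    '"refund policy"',               # exact phrase
    "wildcard":  "refund*",                       # prefix match
    "fuzzy":     "refund~1",                      # 1 edit distance
    "proximity": '"refund policy"~5',             # terms within 5 words
    "boolean":   "refund AND (policy OR terms)",
    "boosting":  "refund^3 policy",               # boost the "refund" term
}

# With a configured SearchClient (assumed to exist as search_client):
# results = search_client.search(
#     search_text=lucene_queries["proximity"],
#     query_type="full",
# )
```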

Vector Search

Search by semantic meaning using embeddings — finds relevant docs even when exact keywords don’t match:

"How do I get my money back?"
Embedding model (text-embedding-3-large)
[0.023, -0.412, 0.891, ...] 1536-dimension vector
Cosine similarity search in index
Finds: "Refund and return policy" (similar meaning)
even though no words match
POST /indexes/documents/docs/search?api-version=2024-07-01
{
"vectorQueries": [
{
"kind": "vector",
"vector": [0.023, -0.412, 0.891, ...],
"fields": "embedding",
"k": 5,
"exhaustive": false
}
],
"select": "id, title, content",
"top": 5
}
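The cosine-similarity step in the flow above can be sketched in a few lines of plain Python. This is illustrative only — the service computes similarity internally over the indexed vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product of the vectors divided by the
    product of their magnitudes; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0 regardless of magnitude
sim = cosine([0.0, 1.0, 0.5], [0.0, 2.0, 1.0])
```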

Vector Algorithm Options

Algorithm        Speed   Accuracy   Use case
─────────        ─────   ────────   ────────
HNSW             Fast    High       Production — approximate nearest neighbor
Exhaustive KNN   Slow    Exact      Small indexes or testing

// Index vector field config
{
  "name": "embedding",
  "type": "Collection(Edm.Single)",
  "searchable": true,
  "dimensions": 3072,
  "vectorSearchProfile": "hnsw-profile"
}

Hybrid Search (Best Quality)

Combines full-text + vector scores using Reciprocal Rank Fusion (RRF):

Query: "refund policy international"
BM25 Results: Vector Results:
1. Refund Policy Doc 1. Return & Refund Guide
2. International FAQ 2. Customer Service Policy
3. Terms & Conditions 3. International Orders FAQ
↓ RRF merges both ranked lists ↓
Hybrid Results (best quality):
1. Refund Policy Doc (top in both)
2. Return & Refund Guide (high vector score)
3. International FAQ (high BM25 score)
POST /indexes/documents/docs/search?api-version=2024-07-01
{
"search": "refund policy international orders",
"vectorQueries": [
{
"kind": "vector",
"vector": [0.023, -0.412, ...],
"fields": "embedding",
"k": 50
}
],
"queryType": "simple",
"select": "id, title, content, source_file",
"top": 5
}
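The RRF merge can be reproduced in a few lines. This is a sketch of the published RRF formula with its customary constant k = 60 — the service's internal implementation details may differ:

```python
def rrf_merge(ranked_lists, k=60):
    """Reciprocal Rank Fusion: each document scores sum(1 / (k + rank))
    over the ranked lists it appears in; higher is better."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["Refund Policy Doc", "International FAQ", "Terms & Conditions"]
vector = ["Return & Refund Guide", "Refund Policy Doc", "International Orders FAQ"]
merged = rrf_merge([bm25, vector])
# "Refund Policy Doc" wins: it is the only document ranked in both lists.
```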

Semantic Ranking

A second-pass reranking layer using a Microsoft language model — reads and understands the top results to re-order them by actual relevance:

Hybrid Search → Top 50 results
        │
        ▼
 Semantic Ranker
 (reads each result,
  understands meaning,
  compares to query intent)
        │
        ▼
 Reranked Top 5
 + Semantic captions
 + Semantic answers
   (extracted key passages)

{
  "search": "refund policy international orders",
  "vectorQueries": [...],
  "queryType": "semantic",
  "semanticConfiguration": "my-semantic-config",
  "captions": "extractive",          // extract relevant snippets
  "answers": "extractive|count-3",   // extract direct answers
  "top": 5
}

Response includes:

{
  "value": [
    {
      "@search.rerankerScore": 2.847,
      "@search.captions": [
        {
          "text": "International orders are eligible for refund within 30 days",
          "highlights": "International orders...refund within 30 days"
        }
      ],
      "content": "Full document content...",
      "title": "Refund Policy"
    }
  ],
  "@search.answers": [
    {
      "text": "International orders are eligible for refund within 30 days of purchase",
      "score": 0.94
    }
  ]
}
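Pulling captions and answers out of a response with this shape is straightforward. A sketch against the sample payload above, treated as a plain dict (the Python SDK surfaces the same data through result objects and `results.get_answers()`):

```python
# The sample response above, as a Python dict
response = {
    "value": [
        {
            "@search.rerankerScore": 2.847,
            "@search.captions": [
                {"text": "International orders are eligible for refund within 30 days"}
            ],
            "title": "Refund Policy",
        }
    ],
    "@search.answers": [
        {
            "text": "International orders are eligible for refund within 30 days of purchase",
            "score": 0.94,
        }
    ],
}

# Highest-scoring extracted answer, plus the top result's caption
best_answer = max(response["@search.answers"], key=lambda a: a["score"])
top_caption = response["value"][0]["@search.captions"][0]["text"]
```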

Index Schema Design

{
  "name": "rag-index",
  "fields": [
    {
      "name": "chunk_id",
      "type": "Edm.String",
      "key": true,
      "searchable": false,
      "filterable": true
    },
    {
      "name": "content",
      "type": "Edm.String",
      "searchable": true,           // full-text search
      "filterable": false,
      "retrievable": true,
      "analyzer": "en.microsoft"    // language analyzer
    },
    {
      "name": "title",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true
    },
    {
      "name": "source_file",
      "type": "Edm.String",
      "searchable": false,
      "filterable": true,           // filter by document
      "retrievable": true,
      "facetable": true             // faceted navigation
    },
    {
      "name": "sensitivity",
      "type": "Edm.String",
      "filterable": true,           // filter by label
      "facetable": true
    },
    {
      "name": "allowed_groups",
      "type": "Collection(Edm.String)",
      "filterable": true            // document-level security
    },
    {
      "name": "last_modified",
      "type": "Edm.DateTimeOffset",
      "filterable": true,
      "sortable": true
    },
    {
      "name": "embedding",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "retrievable": false,         // don't return raw vectors
      "dimensions": 3072,           // text-embedding-3-large
      "vectorSearchProfile": "hnsw-profile"
    }
  ],
  "vectorSearch": {
    "algorithms": [
      {
        "name": "hnsw-algo",
        "kind": "hnsw",
        "hnswParameters": {
          "metric": "cosine",
          "m": 4,                   // connections per layer
          "efConstruction": 400,    // build quality
          "efSearch": 500           // query quality
        }
      }
    ],
    "profiles": [
      {
        "name": "hnsw-profile",
        "algorithm": "hnsw-algo",
        "vectorizer": "azure-openai-vectorizer"
      }
    ],
    "vectorizers": [
      {
        "name": "azure-openai-vectorizer",
        "kind": "azureOpenAI",
        "azureOpenAIParameters": {
          "resourceUri": "https://myoai.openai.azure.com",
          "deploymentId": "text-embedding-3-large",
          "modelName": "text-embedding-3-large"
        }
      }
    ]
  },
  "semantic": {
    "configurations": [
      {
        "name": "my-semantic-config",
        "prioritizedFields": {
          "titleField": { "fieldName": "title" },
          "contentFields": [
            { "fieldName": "content" }
          ]
        }
      }
    ]
  }
}
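Before creating an index from a schema like this, it is worth checking a couple of invariants: exactly one key field, and every vector field pointing at a defined profile. A sketch over a trimmed copy of the schema above:

```python
# Trimmed copy of the schema above, as a Python dict
schema = {
    "name": "rag-index",
    "fields": [
        {"name": "chunk_id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "embedding", "type": "Collection(Edm.Single)",
         "dimensions": 3072, "vectorSearchProfile": "hnsw-profile"},
    ],
    "vectorSearch": {
        "profiles": [{"name": "hnsw-profile", "algorithm": "hnsw-algo"}]
    },
}

# Exactly one field may be the document key
key_fields = [f["name"] for f in schema["fields"] if f.get("key")]

# Every vector field must reference a profile defined under vectorSearch
profiles = {p["name"] for p in schema["vectorSearch"]["profiles"]}
vectors_ok = all(
    f["vectorSearchProfile"] in profiles
    for f in schema["fields"] if "vectorSearchProfile" in f
)
```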

Indexers and Data Sources

Indexers automatically pull data from Azure sources on a schedule:

Data Sources supported:
├── Azure Blob Storage (PDF, Word, Excel, HTML, JSON)
├── Azure Data Lake Gen2
├── Azure SQL Database
├── Azure Cosmos DB
├── Azure Table Storage
├── SharePoint Online
└── OneLake (Fabric)
// Data source connection
{
  "name": "blob-datasource",
  "type": "azureblob",
  "credentials": {
    "connectionString": "ResourceId=/subscriptions/.../storageAccounts/mystg"
  },
  "container": {
    "name": "documents",
    "query": "processed/"           // only index this folder
  },
  "dataDeletionDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.SoftDeleteColumnDeletionDetectionPolicy",
    "softDeleteColumnName": "IsDeleted",
    "softDeleteMarkerValue": "true"
  }
}

// Indexer — runs on a schedule
{
  "name": "blob-indexer",
  "dataSourceName": "blob-datasource",
  "targetIndexName": "rag-index",
  "skillsetName": "ai-enrichment-skillset",
  "schedule": {
    "interval": "PT1H"              // run every hour
  },
  "parameters": {
    "batchSize": 10,
    "configuration": {
      "dataToExtract": "contentAndMetadata",
      "parsingMode": "default"
    }
  },
  "fieldMappings": [
    {
      "sourceFieldName": "metadata_storage_name",
      "targetFieldName": "source_file"
    }
  ],
  "outputFieldMappings": [
    {
      "sourceFieldName": "/document/content/pages/*/embedding",
      "targetFieldName": "embedding"
    }
  ]
}
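The fieldMappings section above is essentially a rename step: copy a value from a source field into an index field. A minimal sketch of what the indexer does with it (the helper name is ours, for illustration):

```python
def apply_field_mappings(source_doc: dict, mappings: list) -> dict:
    """Copy source fields into their mapped index fields, skipping
    mappings whose source field is absent from the document."""
    return {
        m["targetFieldName"]: source_doc[m["sourceFieldName"]]
        for m in mappings
        if m["sourceFieldName"] in source_doc
    }

mappings = [
    {"sourceFieldName": "metadata_storage_name", "targetFieldName": "source_file"}
]
doc = apply_field_mappings({"metadata_storage_name": "refund-policy.pdf"}, mappings)
```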

AI Enrichment Skillsets

Skillsets are AI pipelines applied at index time — transform raw documents into enriched, searchable content:

Raw PDF
   ↓
OCR Skill               → extracts text from scanned images
   ↓
Split Skill             → chunks text into 512-token pieces
   ↓
Entity Recognition      → extracts people, orgs, locations
   ↓
Key Phrase Extraction   → identifies main topics
   ↓
Language Detection      → detects document language
   ↓
Translation Skill       → translates to English if needed
   ↓
Embedding Skill         → generates vectors via Azure OpenAI
   ↓
Index

{
  "name": "ai-enrichment-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
      "name": "ocr-skill",
      "inputs": [{ "name": "image", "source": "/document/normalized_images/*" }],
      "outputs": [{ "name": "text", "targetName": "extracted_text" }]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
      "name": "split-skill",
      "textSplitMode": "pages",
      "maximumPageLength": 512,
      "pageOverlapLength": 50,
      "inputs": [{ "name": "text", "source": "/document/content" }],
      "outputs": [{ "name": "textItems", "targetName": "pages" }]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
      "name": "embedding-skill",
      "resourceUri": "https://myoai.openai.azure.com",
      "deploymentId": "text-embedding-3-large",
      "modelName": "text-embedding-3-large",
      "inputs": [{ "name": "text", "source": "/document/content/pages/*" }],
      "outputs": [{ "name": "embedding", "targetName": "embedding" }]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.EntityRecognitionSkill",
      "name": "entity-skill",
      "categories": ["Person", "Organization", "Location"],
      "inputs": [{ "name": "text", "source": "/document/content" }],
      "outputs": [
        { "name": "persons", "targetName": "persons" },
        { "name": "organizations", "targetName": "organizations" }
      ]
    }
  ],
  "knowledgeStore": {
    "storageConnectionString": "...",
    "projections": [
      {
        "tables": [
          {
            "tableName": "enrichedDocuments",
            "source": "/document"
          }
        ]
      }
    ]
  }
}
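The Split Skill's chunking behavior (512-unit pages with 50 of overlap) can be approximated with a simple sliding window. This is a character-based sketch only — the real skill splits on token and sentence boundaries:

```python
def split_text(text: str, max_len: int = 512, overlap: int = 50) -> list:
    """Fixed-size chunks with overlapping tails, mimicking SplitSkill's
    maximumPageLength / pageOverlapLength (but over characters)."""
    chunks, start, step = [], 0, max_len - overlap
    while start < len(text):
        chunks.append(text[start:start + max_len])
        start += step
    return chunks

text = "".join(str(i % 10) for i in range(1000))
chunks = split_text(text)
# Each chunk starts with the last 50 characters of the previous one,
# so no sentence is cut off without context at a chunk boundary.
```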

Filtering and Facets

# Security filter — document-level ACL
security_filter = (
    f"allowed_groups/any(g: search.in(g, '{','.join(user_groups)}'))"
    f" or allowed_users/any(u: u eq '{user_id}')"
)

# Combined search with filters
results = search_client.search(
    search_text=query,
    vector_queries=[vector_query],
    # Filter — applied before scoring (fast)
    filter=f"sensitivity ne 'HighlyConfidential' and ({security_filter})",
    # Facets — for navigation UI
    facets=["sensitivity", "source_file", "last_modified,interval:year"],
    # Ordering
    order_by=["@search.score desc", "last_modified desc"],
    # Pagination
    skip=0,
    top=10,
    # Which fields to return
    select=["id", "title", "content", "source_file", "last_modified"],
    # Highlight matching terms
    highlight_fields="content-3",   # 3 fragments
    highlight_pre_tag="<mark>",
    highlight_post_tag="</mark>"
)

# Facet results for navigation
for facet in results.get_facets().get("sensitivity", []):
    print(f"{facet['value']}: {facet['count']} docs")

Integrated Vectorization (Preview)

The newest capability — the search service handles embeddings automatically, with no separate embedding calls:

# Old way — embed the query yourself, then search
embedding = openai_client.embeddings.create(
    input=query,
    model="text-embedding-3-large"
).data[0].embedding

results = search_client.search(
    vector_queries=[VectorizedQuery(vector=embedding, ...)]
)

# New way — integrated vectorization:
# the search service embeds the query automatically
results = search_client.search(
    search_text=query,
    vector_queries=[VectorizableTextQuery(
        text=query,                 # ← pass text, not a vector
        fields="embedding",
        k_nearest_neighbors=5
    )]
)

Scoring Profiles (Custom Relevance)

Boost certain fields or freshness in ranking:

{
  "scoringProfiles": [
    {
      "name": "boost-recent-and-title",
      "text": {
        "weights": {
          "title": 5,               // title matches worth 5x
          "content": 1
        }
      },
      "functions": [
        {
          "type": "freshness",
          "fieldName": "last_modified",
          "boost": 3,
          "freshness": {
            "boostingDuration": "P30D"   // boost docs < 30 days old
          }
        },
        {
          "type": "tag",
          "fieldName": "tags",
          "boost": 2,
          "tag": {
            "tagsParameter": "userTags"  // boost matching user tags
          }
        }
      ],
      "functionAggregation": "sum"
    }
  ],
  "defaultScoringProfile": "boost-recent-and-title"
}
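At query time you select the profile by name and supply the tag values in name-value1,value2 form. A sketch — the helper is ours, and the commented-out call assumes a configured search_client:

```python
def scoring_parameter(name: str, values: list) -> str:
    """Format a scoring parameter as '<name>-<v1>,<v2>', the form the
    REST API and Python SDK expect for tag functions."""
    return f"{name}-{','.join(values)}"

param = scoring_parameter("userTags", ["billing", "returns"])

# results = search_client.search(
#     search_text="refund policy",
#     scoring_profile="boost-recent-and-title",
#     scoring_parameters=[param],
# )
```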

Python SDK — Complete RAG Example

from azure.search.documents import SearchClient
from azure.search.documents.models import (
    VectorizedQuery,
    QueryType,
    QueryCaptionType,
    QueryAnswerType
)
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

credential = DefaultAzureCredential()

search_client = SearchClient(
    endpoint=SEARCH_ENDPOINT,
    index_name="rag-index",
    credential=credential
)

openai_client = AzureOpenAI(
    azure_endpoint=OPENAI_ENDPOINT,
    azure_ad_token_provider=get_bearer_token_provider(
        credential, "https://cognitiveservices.azure.com/.default"
    ),
    api_version="2024-06-01"
)

def hybrid_search_with_security(
    query: str,
    user_groups: list,
    user_id: str,
    top: int = 5
) -> list:
    # 1. Embed query
    query_embedding = openai_client.embeddings.create(
        input=query,
        model="text-embedding-3-large"
    ).data[0].embedding

    # 2. Build security filter
    group_filter = " or ".join(
        [f"allowed_groups/any(g: g eq '{g}')" for g in user_groups]
    )
    security_filter = (
        f"({group_filter}) or allowed_users/any(u: u eq '{user_id}')"
    )

    # 3. Hybrid search with semantic reranking
    results = search_client.search(
        search_text=query,
        vector_queries=[
            VectorizedQuery(
                vector=query_embedding,
                k_nearest_neighbors=50,
                fields="embedding"
            )
        ],
        filter=security_filter,
        query_type=QueryType.SEMANTIC,
        semantic_configuration_name="my-semantic-config",
        query_caption=QueryCaptionType.EXTRACTIVE,
        query_answer=QueryAnswerType.EXTRACTIVE,
        top=top,
        select=["chunk_id", "content", "title", "source_file"]
    )

    # 4. Extract results
    chunks = []
    for result in results:
        chunks.append({
            "content": result["content"],
            "title": result["title"],
            "source": result["source_file"],
            "score": result["@search.reranker_score"],
            "caption": result.get("@search.captions", [{}])[0].get("text", "")
        })
    return chunks


SKU / Pricing Tiers

Tier                      Use case       Vector index size   Replicas
────                      ────────       ─────────────────   ────────
Free                      Dev / POC      0.5 GB              1
Basic                     Small prod     2 GB                3 max
Standard S1               General prod   25 GB               12 max
Standard S2               Large prod     100 GB              12 max
Standard S3               Enterprise     200 GB              12 max
Storage Optimized L1/L2   Huge indexes   2 TB                12 max

Scale with replicas (HA + throughput) and partitions (storage + index capacity):

Total capacity = replicas × partitions
S1 with 3 replicas + 2 partitions = 6 search units (SU)
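The capacity rule as arithmetic — one extra fact worth encoding is the service-wide cap of 36 billable search units:

```python
def search_units(replicas: int, partitions: int) -> int:
    """Billable search units = replicas × partitions; a single service
    is capped at 36 SUs."""
    su = replicas * partitions
    if su > 36:
        raise ValueError("a single search service is capped at 36 SUs")
    return su

su = search_units(replicas=3, partitions=2)  # the S1 example above → 6 SUs
```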

Best Practices

Practice                                 Why
────────                                 ───
Always use hybrid search                 Better quality than either mode alone
Add semantic reranking                   Significant quality improvement for top results
Apply security filters at retrieval      Never rely on post-filter security
Set retrievable: false on embeddings     Saves bandwidth — raw vectors aren't needed in responses
Index in batches of ~1000 documents      Optimal indexing throughput
Use managed identity — no API keys       Security best practice
Set efSearch ≥ 500 for HNSW              Better recall at a slight latency cost
Use separate indexes per environment     Avoids dev data polluting prod
Monitor throttling (503 errors)          Add replicas if you see throttling
Apply a @search.score threshold          Filters out low-confidence results
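The score-threshold practice can be a one-liner over the result stream. A sketch using the semantic reranker score, which ranges 0 to 4 — the 2.0 cutoff here is an illustrative starting point to tune on your own data:

```python
def above_threshold(results, min_score=2.0, key="@search.reranker_score"):
    """Keep only results whose (re)ranker score clears the threshold;
    missing scores are treated as 0."""
    return [r for r in results if (r.get(key) or 0) >= min_score]

hits = above_threshold([
    {"title": "Refund Policy", "@search.reranker_score": 2.8},
    {"title": "Unrelated Doc", "@search.reranker_score": 0.9},
])
```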

Azure AI Search is the centerpiece of enterprise RAG on Azure — it handles full-text, vector, hybrid, and semantic search in one managed service, with built-in security filtering, AI enrichment, and deep Azure integration.
