Azure AI Search — Vector Indexes, Fields & Configurations Explained
The Big Picture
Think of Azure AI Search like a smart library system for AI:
    Your Documents
      ↓
    Convert to Vectors (embeddings)
      ↓
    Store in Vector Index
      ↓
    User asks a question → convert to vector → search → find similar docs
      ↓
    Return most relevant results
These three concepts — Vector Indexes, Vector Fields, and Vector Search Configurations — are the three layers that make this work.
1. Vector Index
A Vector Index is the overall container — like a database table — that holds all your documents and their vector representations.
What it is
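A named, structured storage unit in Azure AI Search where you define the schema (what fields exist) and store all your data. Strictly speaking, a "vector index" is a regular search index whose schema includes at least one vector field and a vectorSearch section.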
Analogy
A regular index = a filing cabinet with labeled folders.

A vector index = a filing cabinet that also stores the “meaning fingerprint” of every document, so you can search by meaning, not just keywords.
How it looks (schema definition)

    {
      "name": "my-document-index",
      "fields": [
        { "name": "chunk_id", "type": "Edm.String", "key": true },
        { "name": "text", "type": "Edm.String", "searchable": true },
        { "name": "source", "type": "Edm.String", "filterable": true },
        {
          "name": "embedding_vector",
          "type": "Collection(Edm.Single)",
          "dimensions": 1536,
          "vectorSearchProfile": "my-vector-profile"
        }
      ],
      "vectorSearch": { ... }
    }
Key properties of a vector index
| Property | What it means |
|---|---|
| name | Unique identifier for the index |
| fields | All the columns of data stored |
| key field | Unique ID per document (like a primary key) |
| vectorSearch | Configuration for how vector search behaves |
Lifecycle
    Create index (define schema)
      ↓
    Load documents + their vectors
      ↓
    Index is ready to search
      ↓
    Query it anytime via REST API
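The first two lifecycle steps can be sketched as plain HTTP request descriptions. This is a minimal sketch, not a full client: the helper names `create_index_request` and `upload_documents_request`, the service URL, and the `api-version` value are assumptions for illustration, and a real call would also need an authentication header.

```python
import json

# Assumption: "2024-07-01" is used only as an example api-version;
# check which versions your service supports.
API_VERSION = "2024-07-01"


def create_index_request(endpoint, schema):
    """Describe the PUT request that creates (or updates) an index."""
    return {
        "method": "PUT",
        "url": f"{endpoint}/indexes/{schema['name']}?api-version={API_VERSION}",
        "body": json.dumps(schema),
    }


def upload_documents_request(endpoint, index_name, documents):
    """Describe the POST request that loads documents and their vectors."""
    # Each document is wrapped in an "upload" action.
    actions = [{"@search.action": "upload", **doc} for doc in documents]
    return {
        "method": "POST",
        "url": f"{endpoint}/indexes/{index_name}/docs/index?api-version={API_VERSION}",
        "body": json.dumps({"value": actions}),
    }


endpoint = "https://example.search.windows.net"  # hypothetical service URL
schema = {
    "name": "my-document-index",
    "fields": [
        {"name": "chunk_id", "type": "Edm.String", "key": True},
        {"name": "text", "type": "Edm.String", "searchable": True},
    ],
}
create = create_index_request(endpoint, schema)
upload = upload_documents_request(
    endpoint,
    schema["name"],
    [{"chunk_id": "1", "text": "Refunds are available within 30 days"}],
)
print(create["method"], create["url"])
print(upload["method"], upload["url"])
```

Keeping the request description separate from the code that sends it makes the payloads easy to inspect before anything touches the service.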
2. Vector Fields
A Vector Field is a specific field inside the index that stores the actual vector (embedding) — the numerical representation of a piece of text’s meaning.
What it is
A special type of field that holds a list of floating-point numbers (e.g. 1,536 numbers for OpenAI’s text-embedding-ada-002 model). Each number encodes some aspect of the text’s meaning.
Analogy
Regular text field = stores “Refunds are available within 30 days”

Vector field = stores [0.023, -0.841, 0.334, 0.012, …] (1536 numbers representing the meaning of that sentence)
How a vector is generated

    "Refunds are available within 30 days"
      ↓
    Azure OpenAI Embedding Model
      ↓
    [0.023, -0.841, 0.334, 0.012, 0.776, ...] (1,536 floating-point numbers)
      ↓
    Stored in the vector field
Vector field definition

    {
      "name": "embedding_vector",
      "type": "Collection(Edm.Single)",
      "dimensions": 1536,
      "vectorSearchProfile": "my-vector-profile",
      "searchable": true,
      "retrievable": false
    }
Key properties explained
| Property | What it means |
|---|---|
| type: Collection(Edm.Single) | Array of 32-bit floats (the vector) |
| dimensions: 1536 | Must match the embedding model’s output size |
| vectorSearchProfile | Links to the algorithm config (see below) |
| searchable: true | This field can be used in vector queries |
| retrievable: false | Don’t return raw vector in results (saves bandwidth) |
Common embedding model dimensions
| Model | Dimensions |
|---|---|
| Azure OpenAI text-embedding-ada-002 | 1,536 |
| Azure OpenAI text-embedding-3-small | 1,536 (default) |
| Azure OpenAI text-embedding-3-large | 3,072 (default) |
| sentence-transformers (local) | 384 or 768 |
Why dimensions must match
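    Document embedded with ada-002 → 1536-dimensional vector
    Query embedded with ada-002 → 1536-dimensional vector
    ✅ Same model, same space → similarity search works

    Document embedded with ada-002 → 1536-dimensional vector
    Query embedded with text-embedding-3-large → 3072-dimensional vector
    ❌ Dimension mismatch → the query is rejected

Matching dimensions alone is not enough. Two different models that both output 1,536 dimensions (for example ada-002 and text-embedding-3-small) place text in different vector spaces, so mixing them returns results with meaningless scores.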
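A cheap guard against dimension mismatches is to compare the vector length with the field definition before sending anything. The sketch below assumes a hypothetical helper named `check_dimensions`. It only checks length, so it cannot detect two different models that share the same dimension count.

```python
def check_dimensions(vector, field):
    """Raise ValueError if the vector length differs from the field's dimensions."""
    expected = field["dimensions"]
    if len(vector) != expected:
        raise ValueError(
            f"vector has {len(vector)} dimensions, "
            f"field '{field['name']}' expects {expected}"
        )
    return vector


field = {
    "name": "embedding_vector",
    "type": "Collection(Edm.Single)",
    "dimensions": 1536,
}

# Placeholder vectors of the right and wrong length (not real embeddings).
ok_vector = check_dimensions([0.0] * 1536, field)

try:
    check_dimensions([0.0] * 3072, field)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

print(mismatch_error)
```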
3. Vector Search Configurations
A Vector Search Configuration defines how the similarity search algorithm works: it is the engine under the hood that finds the closest vectors.
What it is
A set of rules and parameters that control the search algorithm, the mathematical method for comparing vectors, and performance vs accuracy trade-offs.
It has two parts
Part A — Algorithm Configuration
Defines which algorithm to use for finding similar vectors.
Azure AI Search supports two algorithms:
HNSW (Hierarchical Navigable Small World): recommended for most use cases

    {
      "name": "my-hnsw-config",
      "kind": "hnsw",
      "hnswParameters": {
        "metric": "cosine",
        "m": 4,
        "efConstruction": 400,
        "efSearch": 500
      }
    }
Exhaustive KNN (K-Nearest Neighbors): brute-force, checks every vector

    {
      "name": "my-knn-config",
      "kind": "exhaustiveKnn",
      "exhaustiveKnnParameters": {
        "metric": "cosine"
      }
    }
HNSW vs Exhaustive KNN
| | HNSW | Exhaustive KNN |
|---|---|---|
| Speed | Very fast | Slow (checks everything) |
| Accuracy | Approximate (high recall, tunable) | Exact (100% recall) |
| Scale | Millions of vectors | Small datasets only |
| Use case | Production RAG | Testing / small indexes |
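To make "checks every vector" concrete, here is a minimal pure-Python sketch of exhaustive KNN with cosine similarity. It illustrates the idea only and is not how Azure AI Search implements it. The three-dimensional vectors and document names are toy values invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exhaustive_knn(query, vectors, k):
    """Score the query against every stored vector and keep the k best."""
    scored = [
        (doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
toy_vectors = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_policy": [0.1, 0.9, 0.0],
    "returns_guide": [0.7, 0.3, 0.1],
}

top = exhaustive_knn([1.0, 0.0, 0.0], toy_vectors, k=2)
print([doc_id for doc_id, _ in top])
```

The cost grows linearly with the number of stored vectors, which is why the table limits this approach to small indexes. HNSW avoids the full scan by walking a graph of neighbors instead.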
HNSW parameters explained
| Parameter | What it controls |
|---|---|
| metric | How similarity is measured (cosine, euclidean, dotProduct) |
| m | Number of links per node: higher = more accurate but uses more memory |
| efConstruction | Build-time accuracy: higher = better index quality, slower build |
| efSearch | Query-time accuracy: higher = more accurate results, slower query |
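One way to keep these parameters consistent is to build the algorithm config through a small helper that validates them first. The helper `hnsw_config` below is hypothetical. The allowed ranges in `HNSW_RANGES` are an assumption based on the service documentation at the time of writing, so verify them against the API version you target.

```python
# Assumed valid ranges (inclusive); confirm against the current service docs.
HNSW_RANGES = {
    "m": (4, 10),
    "efConstruction": (100, 1000),
    "efSearch": (100, 1000),
}
METRICS = {"cosine", "euclidean", "dotProduct"}


def hnsw_config(name, metric="cosine", m=4, ef_construction=400, ef_search=500):
    """Build an HNSW algorithm config dict after basic validation."""
    if metric not in METRICS:
        raise ValueError(f"unsupported metric: {metric}")
    params = {"m": m, "efConstruction": ef_construction, "efSearch": ef_search}
    for key, value in params.items():
        low, high = HNSW_RANGES[key]
        if not low <= value <= high:
            raise ValueError(f"{key}={value} is outside {low}..{high}")
    return {
        "name": name,
        "kind": "hnsw",
        "hnswParameters": {"metric": metric, **params},
    }


config = hnsw_config("my-hnsw-config")

try:
    hnsw_config("too-dense", m=64)
    rejected = False
except ValueError:
    rejected = True

print(config)
print("m=64 rejected:", rejected)
```

The defaults in the helper mirror the values used in the example config above, so calling it with only a name reproduces that config.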
Part B — Vector Search Profile
A profile links a field to an algorithm config. This is what a vector field references.

    {
      "vectorSearch": {
        "algorithms": [
          {
            "name": "my-hnsw-config",
            "kind": "hnsw",
            "hnswParameters": {
              "metric": "cosine",
              "m": 4,
              "efConstruction": 400,
              "efSearch": 500
            }
          }
        ],
        "profiles": [
          {
            "name": "my-vector-profile",
            "algorithm": "my-hnsw-config"
          }
        ]
      }
    }
The relationship:
    Vector Field
      └── references → Vector Search Profile
            └── references → Algorithm Config
                  └── defines metric, m, ef values
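The reference chain can be followed mechanically, which is a handy sanity check before creating an index. The sketch below assumes a hypothetical helper `resolve_algorithm` and works on a plain dict that mirrors the schema shown earlier.

```python
def resolve_algorithm(index, field_name):
    """Follow field -> profile -> algorithm and return the algorithm config."""
    field = next(f for f in index["fields"] if f["name"] == field_name)
    profile_name = field["vectorSearchProfile"]
    vector_search = index["vectorSearch"]
    profile = next(p for p in vector_search["profiles"] if p["name"] == profile_name)
    return next(
        a for a in vector_search["algorithms"] if a["name"] == profile["algorithm"]
    )


index = {
    "name": "my-document-index",
    "fields": [
        {"name": "chunk_id", "type": "Edm.String", "key": True},
        {
            "name": "embedding_vector",
            "type": "Collection(Edm.Single)",
            "dimensions": 1536,
            "vectorSearchProfile": "my-vector-profile",
        },
    ],
    "vectorSearch": {
        "algorithms": [
            {
                "name": "my-hnsw-config",
                "kind": "hnsw",
                "hnswParameters": {
                    "metric": "cosine",
                    "m": 4,
                    "efConstruction": 400,
                    "efSearch": 500,
                },
            }
        ],
        "profiles": [{"name": "my-vector-profile", "algorithm": "my-hnsw-config"}],
    },
}

algorithm = resolve_algorithm(index, "embedding_vector")
print(algorithm["kind"], algorithm["hnswParameters"]["metric"])
```

If a profile or algorithm name is misspelled, `next` raises `StopIteration`, which surfaces the broken reference locally instead of at index creation time.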
Similarity Metrics Explained
The metric property defines how distance between two vectors is calculated:
| Metric | Formula idea | Best for |
|---|---|---|
| cosine | Angle between vectors | Text similarity (most common) |
| euclidean | Straight-line distance | Image embeddings |
| dotProduct | Magnitude × direction | Normalized vectors |
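For RAG with text, cosine is almost always the right choice: it compares the direction of two vectors and ignores their magnitude, so similarity reflects meaning rather than vector length. When embeddings are already normalized to unit length, cosine and dotProduct produce the same ranking.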
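The differences between the three metrics show up clearly on two vectors that point in the same direction but differ in length. The functions and toy vectors below are illustrative only and use nothing beyond the standard library.

```python
import math


def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot_product(v, v))
    return [x / norm for x in v]


short_vec = [1.0, 2.0, 2.0]
long_vec = [2.0, 4.0, 4.0]  # same direction, twice the magnitude

cos = cosine_similarity(short_vec, long_vec)     # direction only
dist = euclidean_distance(short_vec, long_vec)   # sensitive to magnitude
dot = dot_product(short_vec, long_vec)           # grows with magnitude
unit_dot = dot_product(normalize(short_vec), normalize(long_vec))

print(cos, dist, dot, unit_dot)
```

Cosine treats the two vectors as identical, while Euclidean distance and dot product both react to the difference in length. After normalization the dot product matches the cosine value.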
How All Three Work Together
    ┌──────────────────────────────────────────────┐
    │ VECTOR INDEX                                 │
    │ "my-document-index"                          │
    │                                              │
    │ Fields:                                      │
    │ ┌──────────┐  ┌───────────────────┐          │
    │ │ chunk_id │  │ embedding_vector  │          │
    │ │ text     │  │ (VECTOR FIELD)    │          │
    │ │ source   │  │ dim: 1536         │          │
    │ └──────────┘  │ profile: →────────┼──────────┼──┐
    │               └───────────────────┘          │  │
    │                                              │  ▼
    │ Vector Search Config:                        │ ┌────────────────────┐
    │ ┌─────────────────────────────────┐          │ │ PROFILE            │
    │ │ Algorithm: HNSW                 │◄─────────┼─┤ "my-vector-        │
    │ │ metric: cosine                  │          │ │  profile"          │
    │ │ m: 4, efConstruction: 400       │          │ └────────────────────┘
    │ └─────────────────────────────────┘          │
    └──────────────────────────────────────────────┘
A Complete Query Example
When a user asks “What is the refund policy?”:
    1. Convert question to vector
       [0.021, -0.834, 0.291, ...] (1536 numbers)

    2. Send vector query to Azure AI Search
       POST /indexes/my-document-index/docs/search
       {
         "vectorQueries": [{
           "kind": "vector",
           "vector": [0.021, -0.834, 0.291, ...],
           "fields": "embedding_vector",
           "k": 5
         }]
       }

    3. HNSW algorithm runs
       → Finds 5 chunks whose vectors are most similar (cosine)

    4. Returns top matches
       → "refund_policy.pdf" chunk: score 0.97 ✅
       → "shipping_policy.pdf" chunk: score 0.61
       → "returns_guide.pdf" chunk: score 0.58
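Step 2 can be sketched as a small helper that builds the request body, along with a second helper for interpreting the returned scores. Both function names are hypothetical. The score conversion assumes the cosine metric and the documented relationship score = 1 / (1 + cosine distance), so treat it as an approximation to verify against your service. A real request would also need an `api-version` query parameter and an authentication header.

```python
import json
import math


def vector_query_body(vector, field="embedding_vector", k=5):
    """Build the JSON body for a single-vector query."""
    return {
        "vectorQueries": [
            {"kind": "vector", "vector": list(vector), "fields": field, "k": k}
        ]
    }


def score_to_cosine_similarity(score):
    """Convert a search score back to cosine similarity.

    Assumes score = 1 / (1 + cosine_distance), with
    cosine_distance = 1 - cosine_similarity.
    """
    cosine_distance = (1 - score) / score
    return 1 - cosine_distance


# Truncated toy vector; a real query vector has 1,536 values.
body = vector_query_body([0.021, -0.834, 0.291])
print(json.dumps(body, indent=2))
print(score_to_cosine_similarity(0.97))
```

Under that assumption, the 0.97 score from the example corresponds to a cosine similarity just under 0.97, and a score of 0.5 would mean the vectors are orthogonal.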
Key Takeaway
| Concept | Role | Analogy |
|---|---|---|
| Vector Index | Container for all data | Database table |
| Vector Field | Stores the meaning fingerprint | DNA of each document |
| Vector Search Config | Controls how similarity is found | Search engine settings |
Together they form a semantic search engine: instead of matching only keywords, Azure AI Search matches meaning, which makes it a common retrieval layer for production RAG systems.