Getting started with Azure Search

Feb 04, 2015

When implementing search in a product, Azure Search seems to be the way forward.

Last week, we decided we needed to get search working in the backend for one of our products. The database involved was rather small, so my initial thought was to simply do this with a database query over the tables involved.

Then I remembered that Azure had released their hosted search solution into preview a few weeks back, so I decided to give it a try.

After a few setbacks with creating a search service in the Azure Portal (something the PM for the product resolved for me after a question on Stack Overflow), I spent 3 hours going from nothing to a deployed solution.

At the moment, Azure Search has a REST-based API. There is no mention of a .NET SDK on the website, but a third-party implementation is available:

PM> Install-Package RedDog.Search

This is a great package that clearly mirrors the REST API, so I had no trouble getting going. First you’ll need to get your API key and store it in your configuration somewhere (I put it under <appSettings>):

<appSettings>
    <add key="AzureSearch-Endpoint" value="{serviceName}" />
    <add key="AzureSearch-Key" value="{key}" />
    <add key="AzureSearch-Analyzer" value="no.lucene" />
</appSettings>
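For reference, this is roughly how those settings end up on the `_configuration` object used in the snippets below. The class and property names here are my own sketch, not part of any library; only the appSettings keys come from the config above. The lookup is injected as a delegate so the class can be exercised without an app.config (in the web service itself you would pass in `ConfigurationManager.AppSettings.Get`):

```csharp
// Minimal sketch of a configuration wrapper for the appSettings above.
// The class name and shape are assumptions; adjust to your own setup.
using System;

public class AzureSearchConfiguration
{
    public string Endpoint { get; private set; }
    public string Key { get; private set; }
    public string Analyzer { get; private set; }

    public AzureSearchConfiguration(Func<string, string> appSetting)
    {
        this.Endpoint = Require(appSetting, "AzureSearch-Endpoint");
        this.Key = Require(appSetting, "AzureSearch-Key");
        this.Analyzer = Require(appSetting, "AzureSearch-Analyzer");
    }

    private static string Require(Func<string, string> appSetting, string key)
    {
        var value = appSetting(key);
        if (string.IsNullOrEmpty(value))
            throw new InvalidOperationException("Missing appSetting: " + key);
        return value;
    }
}
```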

Then you need to decide which fields to put in your index. The index defines the model of your search, and typically you would create a separate index for each type of query you want to run. I wanted to search our question database, so creating the index was easy-peasy:

public async Task EnsureIndexesExistAsync()
{
    using (var connection = ApiConnection.Create(
        this._configuration.AzureSearch.Endpoint,
        this._configuration.AzureSearch.Key))
    using (var client = new IndexManagementClient(connection))
    {
        var analyzer = this._configuration.AzureSearch.Analyzer;
        var index = await client.GetIndexAsync("questions");

        if (index.IsSuccess) return;

        await client.CreateIndexAsync(
            new Index("questions")
                .WithStringField(
                    "id",
                    f => f.IsKey().IsRetrievable())
                .WithStringField(
                    "title",
                    f => f.IsSearchable().Analyzer(analyzer))
                .WithStringCollectionField(
                    "alternatives",
                    f => f.IsSearchable().Analyzer(analyzer))
                .WithStringCollectionField(
                    "explanations",
                    f => f.IsSearchable().Analyzer(analyzer))
                .WithStringCollectionField(
                    "tags",
                    f => f.IsFacetable().IsSearchable().Analyzer(analyzer)));
    }
}

Only one thing stuck out here, namely that the key field in a search index has to be a string. I only figured that out because RedDog.Search only takes a string when you create an IndexOperation (in the next snippet).

The cool part about the code above is the Analyzer() call. The analyzer tells Azure Search how it should interpret the text it indexes. In my case, the text is in Norwegian, and the analyzer affects e.g. how stemming of words in queries works: if you search for “numbers” with the default analyzer, it will also find results for “number” and “numbering”. You need to set the language explicitly, as the rules for which endings can be removed from words differ from language to language. After you’ve set the analyzer, no extra work is required to make searches like this work.

The IsFacetable() call on the tags field lets you pivot your results on tags. This is a cool feature that I don’t need at the moment, but I decided to create the index properly in case I change my mind later.

This piece of code is only run when my web service starts up. When I want to index a question:

public async Task StoreAsync(Question question)
{
    using (var connection = ApiConnection.Create(
        this._configuration.AzureSearch.Endpoint,
        this._configuration.AzureSearch.Key))
    using (var client = new IndexManagementClient(connection))
    {
        await client.PopulateAsync(
            "questions",
            this.QuestionToIndexOperation(question));
    }
}

private IndexOperation QuestionToIndexOperation(Question question)
{
    var operation = new IndexOperation(
        IndexOperationType.MergeOrUpload,
        "id",
        question.Id.ToStringInvariant());

    operation.WithProperty("title", question.Title);

    if (question.Alternatives.Any())
    {
        operation.WithProperty("alternatives",
            question.Alternatives.Select(a => a.Text));
        operation.WithProperty("explanations",
            question.Alternatives.Select(a => a.Explanation));
    }

    if (question.Tags.Any())
    {
        operation.WithProperty("tags",
            question.Tags.Select(t => t.Name));
    }

    return operation;
}
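If you ever need to index a lot of questions in one go, keep in mind that the REST API caps an indexing batch (at the time of writing, 1000 documents per request), so you’ll want to split the operations into acceptable chunks first. A small helper like this does the splitting; it’s a sketch of my own, not part of RedDog.Search, and I’ve named it ChunkBy to avoid clashing with anything else:

```csharp
// Sketch: split a sequence of items (e.g. IndexOperations) into
// batches of at most `size` items each, so every batch stays under
// the service's per-request document limit.
using System;
using System.Collections.Generic;
using System.Linq;

public static class Batching
{
    public static IEnumerable<IReadOnlyList<T>> ChunkBy<T>(
        this IEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException("size");

        var batch = new List<T>(size);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size);
            }
        }
        if (batch.Count > 0) yield return batch;
    }
}

// Hypothetical usage inside StoreAsync, with `operations` being the
// IndexOperations for all questions:
//
//     foreach (var batch in operations.ChunkBy(1000))
//         await client.PopulateAsync("questions", batch.ToArray());
```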

Now, searching this index is quite easy. Since I am only interested in the keys of my questions, I select just the id field; once I know the relevant question IDs, I can efficiently look the questions up locally and convert them to DTOs:

public async Task<IEnumerable<int>> SearchAsync(string query)
{
    using (var connection = ApiConnection.Create(
        this._configuration.AzureSearch.Endpoint,
        this._configuration.AzureSearch.Key))
    using (var client = new IndexQueryClient(connection))
    {
        var results = await client.SearchAsync(
            "questions",
            new SearchQuery(query).Select("id"));
        return results.Body.Records
            .Select(r => int.Parse(r.Properties["id"].ToString()));
    }
}
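One detail worth noting when you re-fetch the questions locally: the search results come back ordered by relevance, and a typical database lookup by a set of IDs will not preserve that order. A small re-ordering step keeps the ranking intact; this is my own sketch, independent of RedDog.Search, and the Question class here is just a stand-in for the real domain type:

```csharp
// Sketch: reorder locally loaded questions to match the relevance
// order of the IDs returned by SearchAsync.
using System.Collections.Generic;
using System.Linq;

public class Question
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public static class SearchOrdering
{
    public static IEnumerable<Question> InSearchOrder(
        IEnumerable<Question> loaded, IEnumerable<int> rankedIds)
    {
        // Index the loaded questions by ID, then walk the ranked
        // ID list so the search's relevance order is preserved.
        var byId = loaded.ToDictionary(q => q.Id);
        foreach (var id in rankedIds)
        {
            Question question;
            if (byId.TryGetValue(id, out question))
                yield return question;
        }
    }
}
```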

Implementing this solution — including a front-end AngularJS search — took me a grand total of 3 hours. Not bad for a service that is free for small solutions and still in preview. Try it out today.

Feature image by n8.laverdure on Flickr.