Gauri Mahajan
Index Analyzer

Searching multi-lingual data in Azure SQL Databases

November 8, 2021 by

In this article, we will learn how to enable multi-lingual search on data hosted in Azure SQL Databases.

Introduction

Azure SQL Database is one of the most popular relational databases on the Azure cloud platform. Localization required data to be stored in multiple languages other than English. This leads to a variety of data in a variety of languages. While this is not so complex in terms of data storage or data management, as it’s like any other data with the difference in data collation and needs for a larger character set. But in terms of data consumption, it may get complex as users may want to query data using the language of their choice. While SQL or any other query language may be in regular English language, to enable users to find the data of their choice, a search may be one of the most convenient modes of consuming data using multiple languages. This article will help you understand the multi-lingual support in SQL Databases on Azure.

Adding search to data hosted in Azure SQL Database

The first thing we would need is an Azure account with administrative privileges to operate the Azure Search service. If one has an Azure SQL Database instance already in place or plans to create one to host sample data in this database, then we would need privileges to access the SQL Database service as well. For now, it is assumed that at least the Azure account is in place. Then navigate to the dashboard page of the Azure Search service and create a new search instance with the default configuration. Once the instance is created, it would look as shown below.

Search Dashboard

We intend to import data from an Azure SQL Database that is multi-lingual and then enable search on it. For this, we need to create a brand-new instance as well as populate it with such data as well. To keep our focus on the search functionality and keep the additional dependencies to minimal, we can use the samples that ship with Azure SQL Search. To start importing data in the search instance, click on the Import data button and it would invoke a new wizard as shown below. The first step here is to select an existing data source. Instead of selecting Azure SQL Database, we will select Samples from the list of supported data source types as shown below.

Import Data

In the next step, we would be presented with two options to select data source types from the available list of samples. One of them is an Azure SQL Database which is a sample data of real estate. This dataset contains attributes that store data in various languages, which will fit well with our use-case. Select this data source and move to the next step.

Samples

In the next step, we can optionally add cognitive resources and enrichment configuration during the import process. For now, we do not need these additional value-add capabilities in our search functionality. So, we will skip this part and move to the next step.

Cognitive Skills

In this step, we need to provide the name of the index being created as shown below. The key to the table being used is automatically considered as the key for the index. The suggester name and the search mode would be populated with default values as shown below. The section below would list all the fields with their data type, whether they are retrievable, filterable, sortable, facetable, and searchable. Most of the fields expect the id would be selected for most of these search capabilities by default.

Customize Index

Scroll down and you would be able to find the description field for each language. These fields hold the description of the record in different languages. The interesting part to observe here is the Analyzer associated with it. You would find that with the description field that has data in the English language, it has the English analyzer and the one in a different language like French or Italian for example, has a corresponding analyzer attached to it. This allows the search engine and index to parse the data depending on the language context. After configuring the fields and their settings in the context of search as desired, we can move to the next step.

Index Analyzer

In this step, we need to provide the name of the indexer, the schedule, and an optional description. Since the data is being sourced from a sample that is hosted in Azure SQL Database, we do not need to schedule the frequency of refresh as this data will not change. With this the configuration of the search index is complete and we can hit the Submit button to start creating the search index.

Index Scheduling

Once the index is being created, we can monitor the progress of the index as shown below. We can track the execution history, index definition as well as other related settings from the Index detail page.

Index status and progress

Once the index gets created, the execution history would show that it was completed successfully with the duration, number of records and any errors as shown below. This means the index got created successfully and is ready for querying.

Successful Index Creation

Navigate to the search explorer interface and add the query parameter to the query string as shown below. This would result in returning of top ten records from the index with all the attributes that we configured to be returned as part of the search result.

Search Results

Our intention is to also search based on different languages. If we enter a query string in a different language, it will still return the search result. Typically, any search engine or functionality would match the pattern or exact text that is provided as part of the search query and return the result. One should consider using different tenses, phrases and related keywords compared to the original text in a non-English language, and it would still return the result depending on the analyzer and search method configuration.

Multi-Lingual Search

In this way, we can enable multi-lingual search by importing data from an Azure SQL Database in the search index of the Azure Search instance.

Conclusion

In this article, we learned how to source data from an Azure SQL Database using a sample database available in Azure Search and populate the search index with this data. We also learned how to configure different settings related to the search index, different language analyzers that can be associated with fields hosting data in different languages, and then search the same using the search interface.

Gauri Mahajan
Azure, SQL Azure

About Gauri Mahajan

Gauri is a SQL Server Professional and has 6+ years experience of working with global multinational consulting and technology organizations. She is very passionate about working on SQL Server topics like Azure SQL Database, SQL Server Reporting Services, R, Python, Power BI, Database engine, etc. She has years of experience in technical documentation and is fond of technology authoring. She has a deep experience in designing data and analytics solutions and ensuring its stability, reliability, and performance. She is also certified in SQL Server and have passed certifications like 70-463: Implementing Data Warehouses with Microsoft SQL Server. View all posts by Gauri Mahajan

168 Views