Gauri Mahajan
pgAdmin Home Page

Centralizing data validation with domains in Azure Database for PostgreSQL

December 2, 2021 by

In this article, we will learn how to work with domains in Azure Database for PostgreSQL and learn how to use it with tables to centralize the constraints logic that would be applied across tables.

Introduction

Database systems use tables as the primary database object to host a variety of data. To accommodate the different nature of data and values in the table, we use a variety of data types with the attributes that form the definition of the table. We configure different specifications at the table and attribute level like precision, identity field, defaults, etc., which control the nature of data that would be hosted in any given attribute. One specific configuration that we apply at an attribute level is constraints, which are used to specify what different values are allowed to be accepted in any given attribute. This feature helps to maintain data validity when data is being inserted into a table. As this is one of the fundamental ways to control data validation during data ingestion, almost every table in a database would have constraints defined on it. Often there would be cases where the data validation rules are common across several tables, in which case one would have to manually ensure that the constraints applied on those attributes are identical. Else sync of constraints across identical attributes in various tables would have to be conducted which is not an efficient approach and is prone to manual errors.

Constraints are a very regular construct that is available in most database systems including databases like SQL Server and PostgreSQL. Azure offers PostgreSQL on its platform in the form of a Database for PostgreSQL service. It offers a unique construct known as domains where can help to centralize the data validations that are used in tables, which can help to reduce the potential of discrepancies in constraints on identical attributes across tables.

Domains in Azure Database for PostgreSQL

Domains can be considered like datatypes where it allows to optionally specify constraints. Like different database objects are grouped in the schemas, even domains are grouped in schemas in which they are created. A domain name is required to be unique in the schema in which it is created, but it can have the same name across schemas. A domain can be used to specify defaults as well as different types of constraints like not null, null as well as check constraints.

Now that we understand domains in Azure Database for PostgreSQL at a high level, we can now focus on the actual exercise to create and use domains in Database for PostgreSQL. We would need an Azure account with privileges to access Azure Database for PostgreSQL service. Using this service, we need to create an instance of Azure Database for PostgreSQL service and use a locally installed editor like pgAdmin to connect and work with this instance. It is assumed that such an instance and setup are already available and configured before proceeding with this exercise. Once we are connected to the instance using pgAdmin, it will look as shown below. Under the schemas tab, we should be able to find the default public schema already created and under it, there would be an item named Domains as shown below. By default, there would not be any domains already created.

pgAdmin Home Page

Consider a scenario where we have postal code used in multiple tables and we have a set of constraints that we need to apply while data is ingested in such attributes. We can start by creating a new domain by right clicking the domain item and selecting Create Domain menu item. This will invoke a new window as shown below.

CREATE DOMAIN

By default, the owner of the domain would be selected as sqladmin which is the administrator’s name that we would have selected during the creation of the Azure Database for PostgreSQL instance. This can be different depending on the username using which one has logged on to the instance. The schema selected here by default is public, though we have the option to change it as desired. Provide an appropriate name for this domain as shown above. In the definition tab, we need to provide the details as shown below. First, we need to select a base type using one of the data types supported by the Azure Database for PostgreSQL. In this case, we will use TEXT as the base type.

Domain Definition

In the Constraints tab, we need to add a check constraint to add an expression that will evaluate the data being entered wherever we use the domain as the datatype. We can use any type of expression for now. Shown below is a sample expression. In case we add multiple CHECK constraints, they would be evaluated in alphabetical order, which is the reason we should name the constraint accordingly.

Domain Constraints

Once done, click on the Save button to save the constraint and it would appear on the left pane under the Domains section as shown below.

Domain Created

Now that the domain has been created, we can start using it while creating tables. We can create a new table in Azure Database for PostgreSQL using CREATE TABLE command. When we specify the attributes with data types in the table definition, instead of specifying the constraint we will replace the data type with the domain name. We need to keep in view that the domain is based on the base data type with additional constraints. So, we can use the domain only with attributes where the data type, as well as constraints that are attached to the domain, matches the data validation requirements of the attribute. We will use the graphical features of the pgAdmin tool to create a new table. Right-click on the tables section and select CREATE TABLE option. It would open a new pop-up window as shown below. Provide an appropriate name for this table and select the schema in which this new table should be created.

Create Table

Navigate to the Columns tab and click on the plus button to add fields. We can add fields that can have constraints like primary key and not null as shown below. We have added fields like id and street_address where we used basic data types. In the field named postal code, we will be able to find the domain name itself as datatype as shown below. Once the configuration is complete, click on the Save button and that would result in the creation of a new table named customers.

Domains as Data Types

Now that the table is created, we can test the constraint on the table by attempting to insert a record that violates the constraint that we specified while creating the domain that we used in the table for the postal code field. Once we execute an INSERT command that has an invalid value for the postal code field, we will encounter a data validation error as shown below.

Domain error

This clearly establishes that the domain is active and enforces the constraint that we defined on it. As the domain is available as a data type itself, whenever newer tables are created, all of them would refer to the same domain which would have the logic of constraints centralized in it. In this way, we can use the domain in Azure Database for PostgreSQL to refactor the redundant data validation logic across several database objects.

Conclusion

In this article, we learned the concept of domains in the Azure Database for PostgreSQL. We learned how to create a domain, the specifics, and limitations on domains as well as how to use it with tables in Azure Database for PostgreSQL.

Gauri Mahajan
Azure, PostgreSQL

About Gauri Mahajan

Gauri is a SQL Server Professional and has 6+ years experience of working with global multinational consulting and technology organizations. She is very passionate about working on SQL Server topics like Azure SQL Database, SQL Server Reporting Services, R, Python, Power BI, Database engine, etc. She has years of experience in technical documentation and is fond of technology authoring. She has a deep experience in designing data and analytics solutions and ensuring its stability, reliability, and performance. She is also certified in SQL Server and have passed certifications like 70-463: Implementing Data Warehouses with Microsoft SQL Server. View all posts by Gauri Mahajan

168 Views