Copy Data tool to export data from Azure SQL Database into Azure Storage

October 29, 2021 by Rajendra Gupta

DBAs regularly import and export data across different environments. You might receive data in a specific format that needs to be imported into database tables.

The source data can be in various formats such as JSON, CSV, ORC, TXT, Excel, or Parquet. Azure SQL Database is a popular choice for storing structured data in the cloud, and you might keep these Excel or other files in Azure Blob Storage. You might also receive data from a third party on a regular schedule for import into the Azure database.

There are different ways to import data into the Azure database. In the article Azure Automation: Use Azure Logic Apps to import data into Azure SQL Database from Azure Blob Storage, we explored importing data using Logic Apps.

This article uses Azure Data Factory to export data from Azure SQL Database to Azure Blob Storage.

Pre-requisites

  • Destination storage in Azure: You should have a storage account and blob container for storing the data exported from the Azure database
    • Storage account: sourcedataimport
    • Blob container: sourcedata

    Sample Excel sheet

  • Source Azure SQL Database: You require an active Azure database to serve as the source of the data export
    • Server name: azuredemosqldemo.database.windows.net
    • Database: azuredemodatabase

    Azure SQL Database

Azure Data Factory

We require an Azure Data Factory instance to copy the data. For this purpose, navigate to Data Factories in the Azure services.

Data Factories

If you have an existing data factory instance, you can use it. Otherwise, click on Create. Enter the resource group, region, name, and version.

Create data factory
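
If you prefer to script this step instead of using the portal, the Azure SDK for Python offers an equivalent call. The following is a minimal sketch, assuming the azure-identity and azure-mgmt-datafactory packages are installed; the subscription ID, resource group, factory name, and region are placeholder values you would replace with your own.

```python
# Minimal sketch: create (or update) a Data Factory instance with the Azure SDK for Python.
# Assumes DefaultAzureCredential can authenticate (for example, via an Azure CLI login).
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

subscription_id = "<subscription-id>"        # placeholder
resource_group = "<resource-group-name>"     # placeholder
factory_name = "<data-factory-name>"         # placeholder

adf_client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# The location corresponds to the Region field on the Create data factory page
factory = adf_client.factories.create_or_update(
    resource_group, factory_name, Factory(location="eastus"))

print(factory.name, factory.provisioning_state)
```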

On the next page, Git configuration, select the option Configure Git later. This option skips the Git repository configuration with either Azure DevOps or GitHub.

Configure Git later

Deploy the Azure Data Factory instance, and you get the following overview page.

Azure data factory dashboard

Scroll down and click on Open Azure Data Factory Studio.

Open Azure Data Factory studio

It opens another browser tab and displays the following page.

Data Ingest

The Copy Data tool in Azure Data Factory can copy data from more than 90 data sources with an optimized experience.

  • It simplifies the copy process; you do not need a deep understanding of linked services, pipelines, datasets, or triggers
  • The Copy Data tool automatically creates the data factory resources required to copy data to the selected destination (sink) data store
  • You can validate data at authoring time, which helps to avoid potential errors early on
  • It also gives you the flexibility to edit the generated data factory resources to customize them and implement complex business logic

Click on the Ingest tile to launch the Copy Data tool. You get two options:

  1. Built-in copy task: The built-in copy task is pre-configured and does not require much customization
  2. Metadata-driven copy task (Preview): The metadata-driven task allows you to configure parameterized pipelines and an external control table for managing data copies at a large scale

Built-in copy task

Select the Built-in copy task and click Next.

Copy data tool source and destinations

Here, we need to create connections for both the source and the destination data store.

Source datastore

  • Source type: Select Azure SQL Database from the drop-down options
  • Connection: Click on +New connection and specify a connection name
    • Here, we select the Azure subscription, the logical SQL Server, and the Azure database name

      New Azure SQL Database connection

    • Select SQL authentication and enter the username and password for connecting to the Azure database
    • Click on Test connection to validate that the connection to the Azure database is successful

Test connection
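
As a side check, you can also verify connectivity to the logical SQL Server from outside the portal. The sketch below uses pyodbc with the server and database names from the pre-requisites; the SQL credentials are placeholders, and it assumes the ODBC Driver 18 for SQL Server is installed and your client IP is allowed by the server firewall.

```python
# Minimal connectivity check against the source Azure SQL Database using pyodbc.
import pyodbc

conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=tcp:azuredemosqldemo.database.windows.net,1433;"
    "Database=azuredemodatabase;"
    "Uid=<sql-user>;Pwd=<sql-password>;"    # placeholder SQL authentication credentials
    "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
)

with pyodbc.connect(conn_str) as conn:
    # A trivial query confirms that the login and database are reachable
    print(conn.cursor().execute("SELECT @@VERSION;").fetchone()[0])
```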

Click on Create. On the next page, it lists the existing database tables. You can select the tables to use as the data source. For example, I selected the following tables in the SalesLT schema.

Select tables for export

You can also use a query to export data from a table or join multiple tables.

Query tables
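
If you go the query route, it can help to test the statement against the database before pasting it into the Copy Data tool. The sketch below assumes the AdventureWorksLT sample schema (where the SalesLT tables come from) and reuses the pyodbc connection string from the earlier connectivity check; the join itself is only an illustrative example, not the exact query used in this demo.

```python
# Preview a cross-table query before using it as the source in the Copy Data tool.
import pyodbc

query = """
SELECT c.CustomerID, c.CompanyName, soh.SalesOrderID, soh.TotalDue
FROM SalesLT.Customer AS c
JOIN SalesLT.SalesOrderHeader AS soh
    ON soh.CustomerID = c.CustomerID;
"""

with pyodbc.connect(conn_str) as conn:   # conn_str as defined in the earlier sketch
    cursor = conn.cursor()
    cursor.execute(query)
    for row in cursor.fetchmany(5):      # print only the first few rows as a sanity check
        print(row)
```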

Click on Next. If you wish to filter data or specify properties such as the query timeout, isolation level, or partition option, configure them on the advanced page.

Apply filters

Destination datastore

For the destination data store, we select Azure Blob Storage to export these tables in CSV format.

Destination data store

Add a new connection and enter the connection name, Azure subscription, and storage account name. You can test the connection to validate that the Copy Data tool can connect to the destination Azure Blob Storage.

New connection for Azure blob storage
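
Similar to the source side, you can confirm that the destination storage account and container are reachable before running the copy. Below is a minimal sketch with the azure-storage-blob package; the connection string is a placeholder (you can copy it from the storage account's Access keys blade), and the container name comes from the pre-requisites.

```python
# Minimal check that the destination blob container exists and is reachable.
from azure.storage.blob import BlobServiceClient

blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")  # placeholder
container = blob_service.get_container_client("sourcedata")   # blob container from the pre-requisites

print("Container reachable:", container.exists())
```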

Click on Create, and it takes you back to the destination data store page. Here, you can see the target type and the connection that we configured. In the folder path, browse to the blob container to which we want to export the tables in CSV format.

Destination data connection

By default, it uses .txt as the file name suffix. As per our requirement, I changed it to CSV.

File format and suffix

On the next page, you can configure the file format settings. The available options are as below:

  • File format (Text, Avro, JSON, ORC, and Parquet)
  • Column delimiter (default: comma (,))
  • Row delimiter
  • Add header to file: uses the first row for column headers
  • Compression type
  • Encoding

File format settings

On the settings page, enter the task name and description. You can also configure the data consistency check and logging from this page. We will skip these options for the demonstration.

Task name and descriptions

On the last page, review the data flow and the configuration completed so far in the copy data tool of Azure Data Factory v2.

View and validate summary

Click Next, and it deploys the workflow. In the initial steps, it creates the datasets and pipelines and runs their validations.

Validate copy runtime environment

It finishes the data export from the Azure database and stores each table as a separate CSV file in the blob storage.

Deployment status

You can go to the storage account and blob container to view these exported tables.

View exported files

You can click on an individual file, download it, and open it in Microsoft Excel to view its data. For example, the screenshot below shows the data in SalesLTCustomer.CSV.

Download file and view its data
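
Instead of downloading the files through the portal, you can also list and read them programmatically. The sketch below uses azure-storage-blob and pandas; the connection string is a placeholder, and the blob name is taken from the screenshot above and may differ in your container.

```python
# List the exported CSV files and preview one of them with pandas.
import io

import pandas as pd
from azure.storage.blob import BlobServiceClient

blob_service = BlobServiceClient.from_connection_string("<storage-connection-string>")  # placeholder
container = blob_service.get_container_client("sourcedata")

# List every file the Copy Data tool exported into the container
for blob in container.list_blobs():
    print(blob.name, blob.size, "bytes")

# Download one exported table and load it into a DataFrame for a quick look
data = container.download_blob("SalesLTCustomer.csv").readall()
print(pd.read_csv(io.BytesIO(data)).head())
```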

Monitor pipeline runs in Azure Data Factory

You can monitor currently running pipelines or workflows from the Azure Data Factory Studio. Click on Monitor and then Pipeline runs. By default, it shows the last 24 hours of data. You can change the filter to get the required data.

In the following screenshot, we see the pipeline status. You can also distinguish between manual and scheduled triggers using the Triggered by column.

Monitor pipeline runs in Azure Data Factory
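
The same information is available programmatically. The sketch below queries pipeline runs for the last 24 hours with azure-mgmt-datafactory, reusing the adf_client, resource_group, and factory_name variables from the earlier factory-creation sketch.

```python
# Query pipeline runs from the last 24 hours, similar to the default filter in the monitoring view.
from datetime import datetime, timedelta, timezone

from azure.mgmt.datafactory.models import RunFilterParameters

filters = RunFilterParameters(
    last_updated_after=datetime.now(timezone.utc) - timedelta(days=1),
    last_updated_before=datetime.now(timezone.utc))

runs = adf_client.pipeline_runs.query_by_factory(resource_group, factory_name, filters)
for run in runs.value:
    print(run.pipeline_name, run.status, run.run_start, run.run_end)
```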

Click on the pipeline name to get a graphical view of the components.

pipeline name

Scroll down to view the activity run history. For example, we exported data from Azure SQL Database tables to Azure Blob Storage; therefore, the activity runs show the source and destination details for each individual table.

activity run history
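
Continuing the previous sketch, you can drill into the activity runs of a specific pipeline run to retrieve the per-table details shown above.

```python
# Retrieve the activity runs (one copy activity per exported table) for a pipeline run.
run_id = runs.value[0].run_id   # pick a pipeline run returned by the previous query

activity_runs = adf_client.activity_runs.query_by_pipeline_run(
    resource_group, factory_name, run_id, filters)

for activity in activity_runs.value:
    print(activity.activity_name, activity.status, activity.duration_in_ms, "ms")
```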

To get further details, hover the mouse pointer over any activity row. It gives the following options:

  • Input
  • Output
  • Details

activity row

Input: Click on the Input tab to view the JSON definition of the source, destination, and file format.

Input data

Output: In the output section, you can view information such as the number of rows read, rows copied, data read, data written, billable duration, throughput, and the source and destination details for the Azure SQL Database copy.

  • Note: You can refer to the Azure Data Factory pricing page for more details on pricing

Output data

Details: The details section gives a graphical overview of the source, destination, task status, Azure integration runtime, throughput, start time, and end time. You can use this window to review the details from both the Input and Output tabs in one place.

Details page

You can also view the activity information as a Gantt chart. Click on Gantt at the top, and it converts the tabular activity log into a graphical format. Click on the chart to display the information in JSON format, as shown below.

Activity runs graph

Conclusion

This article explored the Azure Data Factory Copy Data tool for exporting Azure SQL Database data in CSV format and storing the CSV files in Azure Blob Storage. The Copy Data tool has 90+ built-in connectors that you can configure and use in a pipeline for data transfer or transformation. You can also schedule the pipeline to execute automatically at a specified time and frequency.
