ETL

Understanding Transfer database objects in SSDT 2017

March 2, 2022 by Nisarg Upadhyay

In this article, we are going to learn how to copy database objects between two databases on different instances of SQL Server. Sometimes, we receive a request to copy specific database objects to another database. To fulfill this requirement, instead of generating a T-SQL script for each object and exporting the data with the import-export task, we can use the Transfer SQL Server Objects task of SSDT 2017. In this article, we will see how to transfer the database objects of a specific schema to another database.

Configure ODBC drivers for Azure Database for PostgreSQL

January 24, 2022 by Nisarg Upadhyay

This article helps you learn how to configure an ODBC driver to connect to Azure Database for PostgreSQL. In my previous article, Configure ODBC drivers for PostgreSQL, you learned the step-by-step process to download, install, and configure the ODBC driver for PostgreSQL. We also learned how to create a DSN that is used to connect to PostgreSQL and populate data from it.
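As a rough illustration of what such a configuration carries, the sketch below assembles a DSN-less ODBC connection string in Python. It assumes the psqlODBC driver is registered under the name "PostgreSQL Unicode"; the helper name `build_connection_string`, the server, database, user, and password values are all hypothetical placeholders, not taken from the article.

```python
def build_connection_string(server, database, user, password,
                            port=5432, driver="PostgreSQL Unicode"):
    """Assemble a DSN-less ODBC connection string (hypothetical helper).

    Assumes the psqlODBC driver is installed under the given name.
    Values containing ';' or '}' would need extra escaping, which this
    sketch does not handle.
    """
    parts = {
        "Driver": "{" + driver + "}",
        "Server": server,
        "Port": str(port),
        "Database": database,
        "Uid": user,
        "Pwd": password,
        # Assumption: the server enforces encrypted connections.
        "sslmode": "require",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Placeholder values for illustration only.
conn_str = build_connection_string(
    server="example-server.postgres.database.azure.com",
    database="exampledb",
    user="example_user",
    password="example_password",
)
print(conn_str)
```

A string like this could then be handed to an ODBC client library; a named DSN stores the same settings in the ODBC Data Source Administrator instead.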

Data Access Modes in SSIS OLE DB Destination: SQL Command vs. Table or View

October 5, 2021 by Hadi Fadlallah

This article compares the SSIS OLE DB Destination “SQL command” data access mode with the “Table or View” data access mode. In a previous article in this series, SSIS OLE DB Destination vs SQL Server Destination, we explained the OLE DB Destination component in SSIS and illustrated how it differs from the SQL Server Destination component. In this article, we focus on the OLE DB Destination data access modes, not the component itself.

Introduction to Apache Spark

April 12, 2021 by Aveek Das

In this article, I am going to discuss Apache Spark and how to create robust ETL pipelines for transforming big data. I will start from the very basics of Spark and then explain how to install Spark and start building pipelines. In the later part of the article, I will also discuss how to use the Spark APIs to load data into Spark data frames, transform it, and query it with SQL to continue with the data analysis.

Implementing a Modular ETL in SSIS

November 24, 2020 by Aveek Das

In this article, I am going to demonstrate how to implement a Modular ETL in SSIS in practice. In my previous article, Designing a Modular ETL Architecture, I explained in theory what a modular ETL solution is and how to design one. We also covered the concepts behind a modular ETL solution and its benefits in the world of data warehousing, and we related the concept of microservices architecture in software development to that of the modular ETL solution.

Designing a Modular ETL Architecture

November 17, 2020 by Aveek Das

In this article, I am going to explain the Modular ETL Architecture in detail. ETL is a broad concept that describes how data is moved from various sources to destinations while being transformed along the way. This is an advanced article that assumes the reader has a solid understanding of how ETL is implemented with tools like SSIS, of the underlying working principles, and of how to deploy multiple packages using SSIS. It is extremely important to implement a well-designed ETL architecture for your organization’s workload; otherwise, it might lead to performance degradation along with other challenges. To keep things simple, this article only explains the Modular ETL Architecture; a detailed hands-on tutorial follows in the next article, “Implementing Modular Architecture in ETL using SSIS”.
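To make the idea of modularity a little more concrete, here is a minimal Python sketch, not the article's SSIS design. It assumes that "modular" means splitting the ETL into small, single-purpose units that a master step runs in order, much like a master package executing child packages. All function names and sample rows are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for child packages; each one does a single job
# and shares state with the others only through the context dictionary.


def extract_customers(context: Dict) -> None:
    # Pretend these rows came from a source system.
    context["customers"] = [
        {"id": 1, "name": " Ada "},
        {"id": 2, "name": "Linus"},
    ]


def transform_customers(context: Dict) -> None:
    # Trim whitespace from the name column.
    context["customers"] = [
        {**row, "name": row["name"].strip()} for row in context["customers"]
    ]


def load_customers(context: Dict) -> None:
    # A real module would write to the warehouse; here we only count rows.
    context["loaded_rows"] = len(context["customers"])


def run_master(modules: List[Callable[[Dict], None]]) -> Dict:
    """Run each module in order, playing the role of a master package."""
    context: Dict = {}
    for module in modules:
        module(context)
    return context


result = run_master([extract_customers, transform_customers, load_customers])
print(result["loaded_rows"])
```

Because each module only reads from and writes to the shared context, one module can be replaced or rerun without touching the others, which is the property the microservices comparison points at.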

An introduction to SSIS Data Lineage concepts

September 3, 2020 by Aveek Das

In this article, I am going to discuss SSIS data lineage concepts, which are often used while designing ETL workloads on a data warehouse. Although this article focuses on implementing data lineage using SSIS, the concepts are not confined to SSIS; they apply to any ETL tool that moves data from a source to a destination. In my previous article, Understanding Data Lineage in ETL, I discussed the general importance of data lineage for any ETL tool. I suggest you have a look at it if you want to understand how data lineage helps to track the source of a single record in the warehouse.
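One common way to track the source of a single record is to stamp every row with lineage columns during the load. The Python sketch below illustrates that idea only; it is not the SSIS implementation from the article, and the column names (`lineage_source`, `lineage_batch_id`, `lineage_loaded_at`), the helper names, and the sample rows are assumptions made for this example.

```python
import uuid
from datetime import datetime, timezone


def add_lineage(rows, source_system, batch_id=None):
    """Return copies of the rows with hypothetical lineage columns added."""
    batch_id = batch_id or str(uuid.uuid4())
    loaded_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            **row,
            "lineage_source": source_system,
            "lineage_batch_id": batch_id,
            "lineage_loaded_at": loaded_at,
        }
        for row in rows
    ]


def trace(warehouse, customer_id):
    """Look up which source and batch a given record came from."""
    for row in warehouse:
        if row["customer_id"] == customer_id:
            return row["lineage_source"], row["lineage_batch_id"]
    return None


# Two illustrative loads from different source systems.
crm_rows = add_lineage([{"customer_id": 1}], "crm", batch_id="batch-001")
erp_rows = add_lineage([{"customer_id": 2}], "erp", batch_id="batch-002")
warehouse = crm_rows + erp_rows

print(trace(warehouse, 2))
```

With these columns in place, any row in the warehouse can be traced back to the system and the load run that produced it.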

Understanding Data Lineage in ETL

September 3, 2020 by Aveek Das

In this article, I am going to explain what Data Lineage in ETL is and how to implement it. Companies today deal with a huge amount of data every day, and managing and monitoring this data efficiently is a challenge. Some systems generate data every second, and that data is processed and delivered to a final reporting or monitoring tool for analysis. To process this data, we use a variety of ETL tools, which make it possible to transform the data in a managed way.

Security Testing with extreme data volume ranges

June 19, 2020 by Timothy Smith

When we develop security testing for situations with inconsistent data volumes, we should consider our use of anti-malware applications that use behavioral analysis. Many of these applications are designed to catch and flag unusual behavior. This may help prevent attacks, but it may also disrupt ETL flows and, in turn, our customers or clients. While we may have a consistent flow of data throughout a time period, which allows a normal window of behavior to occur, we may also have an inconsistent data schedule or an inconsistent amount of data that causes these applications to flag files, directories, or the process itself.
