Aveek Das

Learn NoSQL in Azure: An overview of Azure Cosmos DB

June 16, 2021 by

In this article, we are going to learn about Azure Cosmos DB. This article is part of the series “Learn NoSQL in Azure”, where we will explore the different types of non-relational databases currently supported in Azure. Azure is one of the most popular public cloud platforms, with a large market share worldwide. Cosmos DB is part of the Databases section in Azure and allows customers to create NoSQL, or non-relational, databases and consume them at scale. You can leverage Cosmos DB to build highly scalable and robust cloud-based applications that support modern big data workloads. Let us understand what a NoSQL database is and how it differs from a relational database. Although this article focuses on NoSQL in Azure, note that there are also open-source NoSQL databases, such as Apache Cassandra. However, these are out of the scope of this article, and we will focus mostly on Azure.

Installing the PostgreSQL management tool - PGAdmin

An overview of PGAdmin – PostgreSQL Management Tool

June 10, 2021 by

In this article, we are going to learn about PGAdmin, a PostgreSQL management tool. As you may be aware, SQL Server Management Studio (SSMS) and MySQL Workbench are the GUI management tools for SQL Server and MySQL respectively. Similarly, PGAdmin is used to manage Postgres databases and their services. PGAdmin is a web-based GUI tool used to interact with Postgres database sessions on both local and remote servers. You can use PGAdmin to perform any sort of database administration required for a Postgres database.

Connecting to the PostgreSQL instance using PGAdmin4

Install and upgrade PostgreSQL to support Spatial Data

May 26, 2021 by

Spatial data is data related to geography. In this article, we are going to understand the various concepts related to geographic or spatial data and how PostgreSQL can be leveraged as a database to store such geographic information. As you might already be aware, PostgreSQL is a popular and widely used open-source relational database management system that can handle production workloads. With the availability of the cloud, you can quickly spin up instances of Postgres on major public cloud providers such as AWS, Azure, and GCP.

A simple Regular Expression in PostgreSQL -

Working with Regular Expressions in PostgreSQL

May 14, 2021 by

In this article, I am going to talk about using regular expressions in a Postgres database. Regular expressions, also known as RegEx, are patterns used to match and filter data. They are heavily used to match string values against a specific pattern and then filter the results based on that condition. From a beginner’s perspective, regular expressions can seem quite complex at first; however, as you start using them on a daily basis, you will come to understand the underlying logic and can start writing your own RegEx statements.

A dataflow diagram in Power BI

An introduction to Power BI Dataflows

May 7, 2021 by

In this article, we are going to understand what Power BI Dataflows are all about and how we can get started building dataflows in the Power BI Service. A dataflow can simply be considered an extract, transform, and load (ETL) pipeline that connects to source data, transforms the data by applying business rules, and finally prepares the data for visualization. In a general data architecture, a dedicated ETL tool is used to prepare and transform the data, which is then loaded into a data warehouse. Power BI is then used to connect to this data warehouse and visualize the data.

Docker Desktop for Mac Download

How to set up and run SQL Server Docker image

April 28, 2021 by

In this article, I am going to discuss in depth how to set up Docker and run a SQL Server Docker image. As you might already be aware, SQL Server is one of the most popular relational database technologies in today’s world. It is widely used in applications that support transactional and analytical workloads. Docker, on the other hand, is a containerization technology with which you can bundle your applications within a container and distribute them. Docker helps users build applications independent of the underlying operating system. We will learn more about Docker and using the SQL Server Docker image in this article.

Using the PGAdmin Management Tool

Setting up a PostgreSQL Database on Mac

April 23, 2021 by

In this article, I am going to discuss different ways in which you can install and set up a Postgres database on a Mac. Postgres is an open-source relational database system that can be used to develop a wide variety of data-based applications. Postgres is also popular for analytical workloads, where extensions can add capabilities such as columnar storage. Postgres is available on all the major public cloud services, such as AWS, Azure, and GCP. Before deploying your databases directly to those cloud services, it is recommended that you also have Postgres installed on your local machine.

CloudFormation Template on AWS Console

Spinning up MySQL instances on RDS using CloudFormation Templates

April 20, 2021 by

In this article, we are going to discuss how to set up a MySQL instance on AWS RDS using CloudFormation templates. In my previous article, How to configure an Amazon RDS environment for MySQL, I provided a detailed walkthrough of how to set up a MySQL instance on Amazon RDS. You can use the AWS console to provide all the information required for setting up the instance and then use it. However, in this article, we will discuss an automated way of achieving the same result using CloudFormation templates.

Overview of Apache Spark Architecture

Introduction to Apache Spark

April 12, 2021 by

In this article, I am going to discuss Apache Spark and how to create robust ETL pipelines for transforming big data. I will start from the very basics of Spark and then provide details on how to install Spark and start building pipelines. In the later part of the article, I will also discuss how to leverage the Spark APIs to transform data and load it into Spark DataFrames and Spark SQL for further analysis.

Creating a new table in AWS Athena

Getting started with Amazon Athena and S3

April 7, 2021 by

In this article, I am going to discuss Amazon Athena and how we can analyze data stored in S3 using Athena. As you might know, AWS offers a lot of services in the fields of compute, databases, analytics, machine learning, and robotics. One of the most important and popular of these services is Amazon Athena. By the official definition, “Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.”

Using the iterable unpacking operator in python - arguments in python

Understanding *args and **kwargs arguments in Python

April 2, 2021 by

In this article, I am going to talk in detail about functions and arguments in Python. Python is one of the most popular and in-demand programming languages. Recently, a lot of programmers have become interested in working with Python, and as such, there is a huge and constantly evolving community around it. Python is also considered one of the most flexible languages: it can be used to develop web-based applications and REST APIs, and it is also used extensively in scientific computing for data analysis and machine learning.
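
As a small taste of the topic named in the title, here is a minimal sketch of how *args and **kwargs collect arguments, and how the unpacking operators pass them back in. The function name describe and the sample values are hypothetical and used only for illustration.

```python
def describe(*args, **kwargs):
    """Return the positional and keyword arguments that were passed in."""
    # args arrives as a tuple of positional arguments,
    # kwargs as a dictionary of keyword arguments.
    return args, kwargs


# Passing arguments directly.
positional, keyword = describe(1, 2, name="demo")

# Unpacking an existing list and dictionary into the same call.
numbers = [3, 4]
options = {"name": "unpacked"}
unpacked_positional, unpacked_keyword = describe(*numbers, **options)

print(positional, keyword)
print(unpacked_positional, unpacked_keyword)
```

The single asterisk collects or unpacks positional arguments, while the double asterisk does the same for keyword arguments.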

Console output from the above snippet

Working with JSON data in Python

March 30, 2021 by

In this article, I am going to write about the various ways we can work with JSON data in Python. JSON stands for JavaScript Object Notation and has become one of the most important data formats for storing and transferring data across systems. This is due to its easy-to-understand structure and because it is very lightweight. You can easily write simple and nested data structures using JSON, and programs can read it easily as well. In my opinion, JSON is much more human-readable than XML, although both are used to store and transfer data. In modern web applications, JSON is used by default to transfer information.
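
To make the idea of nested JSON structures concrete, here is a minimal sketch using Python's built-in json module. The sample record and its field names are assumptions chosen only for illustration.

```python
import json

# Hypothetical sample record with a nested object and a list.
article = {
    "title": "Working with JSON data in Python",
    "tags": ["python", "json"],
    "author": {"name": "Aveek Das"},
}

# Serialize the nested dictionary to a JSON-formatted string.
encoded = json.dumps(article, indent=2)

# Parse the JSON string back into Python dictionaries and lists.
decoded = json.loads(encoded)

print(encoded)
print(decoded["author"]["name"])
```

Here json.dumps turns Python objects into JSON text, and json.loads reverses the process, so the round trip returns a structure equal to the original.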

Generating plots in R

Setting up a Machine Learning environment using R and RStudio

March 23, 2021 by

In this article, I am going to introduce a few concepts on how to set up and get started with R and RStudio to perform machine learning workloads. Whether to choose Python or R for machine learning analysis has long been a heated discussion. In my opinion, both languages excel in their own space, and there is no direct point-to-point comparison between the two. Mathematicians and statisticians like to work within the R environment, while programmers choose to work with Python.

Amazon Athena uses

An introduction to AWS Athena

March 19, 2021 by

In this article, I am going to introduce AWS Athena, a service offered by Amazon that allows users to query data in S3 using standard SQL syntax. AWS is considered to be a leader in the cloud computing world. Amazon offers more than a hundred services, which provide competitive performance and cost-effective solutions for running workloads compared to on-premises architectures. The services range widely across compute, storage, databases, analytics, IoT, security, and a lot more. One of the popular areas among these services is the Analytics domain, which allows customers to build architectures that answer key questions for their business decisions.

Create REST APIs in Python using Flask

March 12, 2021 by

In this article, I am going to explain what a REST API is all about and how to get started with creating APIs in Python using Flask. In today’s software world, REST APIs play a major role as a communication channel between different services. REST has become the de facto standard for passing information across multiple systems in the JSON format. This is because it provides a uniform interface for exchanging messages between two different systems. Let us learn more about REST APIs in this article.

EC2 Instance Types

Overview of EC2 Instance Types in AWS

March 8, 2021 by

In this article, I am going to talk about the various EC2 instance types available in AWS. EC2, short for Elastic Compute Cloud, is an IaaS offering from AWS with which customers can provision virtual machines in the cloud using different combinations of CPU, RAM, disk, and networking. There are many predefined instance types already available in the AWS console, which makes it extremely easy to spin up a new EC2 instance.

AWS Certifications - AWS CCP

Preparing for the AWS Certified Cloud Practitioner (CCP) exam

March 1, 2021 by

In this article, I am going to discuss the AWS Certified Cloud Practitioner exam. Cloud computing is one of the fastest-moving technologies in today’s world. With the rising demand for cloud computing platforms, more and more companies have already started using the cloud or are in the process of moving their infrastructure to it. When it comes to cloud vendors, AWS is mostly preferred by major companies, with Azure in second place. With this demand, companies also continuously look for talented individuals who can help them lift and shift their infrastructure or advise them on their existing cloud infrastructure.

Specifying parameters while exporting data to an SQL table - Pandas

Exporting data with Pandas in Python

February 24, 2021 by

In this article, I am going to discuss the various ways in which we can use Pandas in Python to export data to a database table or a file. In my previous article, Getting started with Pandas in Python, I explained in detail how to get started with analyzing data in Python. Pandas is one of the most popular libraries for data analysis. It is very easy and intuitive to use. Personally, I love using the library because of its ease of use and the great documentation available online.

Stored Procedure for moving data

Advanced usages of Data-Tier applications

February 12, 2021 by

In this article, I am going to explain some of the advanced usages of data-tier applications in Visual Studio. In my previous article, Working with Database Projects, I explained how you can start building your database applications for SQL Server and Azure SQL Database using Visual Studio. This article will specifically focus on using SQLCMD variables and Publish Profiles in data-tier application development. For a better understanding, I would recommend reading the previous article first, as it will help clarify the basic concepts.

PEP Workflow - Programming in Python

Best practices to follow while programming in Python

February 9, 2021 by

In this article, I am going to discuss some of the best practices that a programmer should follow while programming in Python. Python as a language has evolved to a great extent over the last few decades and has gained popularity among software programmers, data enthusiasts, and system administrators. This is because of the ease of writing code in Python and the large community behind it.

Cloud Market Google Trend

Understanding AWS Billing services and concepts

February 3, 2021 by

In this article, I am going to explain AWS Billing services and the underlying concepts that one should be aware of while working with AWS. As more and more companies take the essential step of migrating their existing applications to the cloud, it has become important for engineers to keep pace and learn cloud technologies. In today’s market, AWS and Azure are the two most widely used cloud providers. Google Cloud Platform (GCP) is also becoming popular; however, the demand for AWS is the highest.

AWS Well-Architected Framework

An overview of AWS Well-Architected Principles

January 27, 2021 by

In this article, I am going to explain the AWS Well-Architected Framework, which helps AWS customers follow best practices while designing the architectures of their solutions. It enables users to design secure, reliable, and high-performing cloud applications and workloads. This is more of a theoretical concept that is often recommended when thinking about the architecture of any system. There are five pillars of the AWS Well-Architected Framework that enable customers to evaluate their existing architectures and implement scalable solutions. In this article, we will learn more about those five pillars and the best practices around them. The discussion below is a summarized form of the official whitepaper: AWS Well-Architected Framework.

Create table using Design Pane

Working with Database Projects

January 22, 2021 by

In this article, I am going to talk about developing and deploying a database project, also known as a data-tier application, using Visual Studio. In my previous article, Getting started with Data-Tier Applications using Visual Studio, I provided an overview of data-tier applications and how we can create one using Visual Studio. This article is a follow-up to that one, so I’d advise you to have a look at it before proceeding. For this article, I will be using Visual Studio 2019; however, you are free to use any other version of Visual Studio.

Selecting Target Platform

Getting started with Data-Tier applications in Visual Studio

January 15, 2021 by

In this article, I am going to talk about creating a data-tier application using Visual Studio. In my previous article, An introduction to Data-Tier applications in SQL Server, I explained in detail what a data-tier application is all about. I explained the different types of data-tier applications that are available and how we can create such applications from existing SQL Server databases. In this article, the primary focus will be on creating data-tier applications from scratch using Visual Studio. For this article, I am going to use Visual Studio 2019; however, the technique will remain similar for other editions of SQL Server as well.

AWS IAM Service in AWS Management Console

An overview of AWS IAM

January 13, 2021 by

In this article, I am going to introduce the concept of AWS IAM, also known as Identity and Access Management in AWS. In any cloud service, controlling who has access to the services and how each service accesses the others is an important task. If we do not control or restrict access, security breaches may occur within the services, and we might not be able to track them. So, as a best practice for restricting or controlling access within AWS, there is a dedicated service called IAM that can be used to manage and control almost everything in AWS. It is the permission control system that governs access to the various AWS resources and services.
