Today, data is the basis of any business. The world of enterprise computing is the epitome of data-driven businesses. The importance of structured storage of data is unquestionably in the focus. And at this time, with the computing model quickly shifting towards the cloud, and storage costs falling rapidly, enterprises are leveraging data more and more to tune their businesses.
It is, then, imperative that a ground-zero understanding of this structured model of storage and retrieval of data is of paramount importance. A beginner, however, naturally feels lost in the sea of information available out in the wild world. This series, Introduction to SQL, comes to the rescue, by starting at the very base of SQL — the very fabric — and then, launching from there, towards the bigger picture.
Why You Should Use SQL in 2020
The concept of SQL has been around for four decades now. However, it has been evolving and continues to evolve. We all studied SQL at some point or another in our education. We are here to make it better by moving away from the archaic tools and practices. When you learn a certain system using archaic tools and practices, you need to assimilate it and rethink the application of this system in the modern industry when you become part of it. With this book, on the other hand, you would be learning the concepts using industry-standard tools and practices.
Understand that SQL is not going anywhere. If anything, it has only become a hot cake with a further push in the competition by some strategic drivers like Microsoft, who have now made SQL available for Linux as well. The different cloud offerings such are Microsoft Azure and Amazon Web Services — the two largest players in the field — have a very dedicated approach to the relational database model, and SQL, subsequently. And this is just the beginning of a new era.
When you enter into this era, it is important that you know how what works, so that you can propel the business you own or support, to new heights.
Why is SQL important? What problem is it solving?
Data management or database administration is incomplete without the SQL. To feel comfortable using the powerful SQL as part of your administration or development requires that you understand the basics of SQL, which will take you a long way in your career.
In this book, therefore, we start with the introduction to the SQL itself, and then, understand the important features of SQL Server. The chapters will take you through a demonstration of the internal workings of SQL, starting from SQL standards, evolution, history (this is important so that you know how it grew and became the beast it is today, and so that you can leverage its potential powers) and progresses to creating tables, understanding and defining relationships, writing Transact‑SQL commands, and so on.
But isn’t studying SQL alone, restrictive? Turns out, it isn’t. Of course, we can use SQL on any kind of database or data source, but even if we cannot directly use SQL, most query languages of today have some relationship to SQL. In general, once you know SQL, you can effortlessly pick up other query languages.
Standards are vital because every relational database must build its framework around this framework in order to ensure compatibility. This means that the learning curve is greatly reduced. SQL is ANSI as well as ISO-compliant, along with other standards, which emphasizes the fact that you have to learn the concept only once.
What other tools are out there? Why is SQL better?
There are different ways in which data can be stored. One of the modes of storage is to establish a specific structure based on the blocks you would use to store data, and store the data in that structure. This model of storage is the one with a predefined schema. SQL is best suited for this model of data storage.
There are several tools available to manage, compare, administer and develop SQL databases. Different tools are built for different purposes, with their own sets of pros and cons. However, the underlying fabric is SQL.
SQL has been around for a long time: the first SQL product available for public use was launched in 1979—Oracle version 2— and Oracle remains one of the premier database systems today. And the underlying concepts are the same since then, with a majority of SQL operations (and commands) involving four basic verbs: Select, Insert, Update and Delete. Also, SQL is generally whitespace-independent, which means that adding spaces within or between clauses will not matter. Most SQL queries are standardized to seem like a question you direct towards a database object, which the said database object knows how to respond to. SQL uses a command interpreter to parse the SQL query. And since SQL, combined with the interpreter is so powerful that it has been adopted into many database products.
Of course, there are other SQL products, not among the conventionally used tools and systems. Some are barebones SQL, while others have other features on top of them. There are some exceptions, which may support only a part—not all—of the SQL standard. Therefore, when you start to specialize in a database product, you would need to learn those features, but a majority of what you do will revolve around what you learn in this book; the additional features will only act as the exterior coat.
How can SQL help developers?
In an industry where data is everything, data tools are among the must-haves. SQL is unquestionably on the list. Any data science professional will tell you, for example, how important SQL is. In a survey of analytics, data mining and data science software, conducted in 2015, SQL was placed third in the list of usage, after Python and the R language. Notice how SQL is the first database tool/framework on the list.
SQL can work on varied systems and platforms. Therefore, knowing SQL is important for developers who deal with the data. Understanding SQL will take you a long way in creating fast, efficient software, which leverages the benefits of well-structured data.
It can be said, without a doubt, that SQL is the only transferrable skill that can find a place in the toolbox of any developer or database administrator. The SQL is standardized, and therefore, is useful across systems that deal with databases. Once you know SQL, all you would need is to adopt very minor syntax adjustments from one database system to another—the model remains the same, the fabric remains the same.
Advice for SQL developers and administrators
Let me start with data munging. Data munging, or wrangling, is the process of transforming data into various states based on what state would be better suited for a situation at hand. In other words, how you should change the representation of data to make it more understandable to an application or user. For instance, when you use APIs to download content, or scrape a website or use an existing data set for prediction, how would you transform the data so it becomes a well-defined stream, that can be readily consumed by a certain data analysis tool?
The primary issue with data munging is data duplication. A few wrong SQL joins during munging can potentially generate thousands of duplicate data. These may be because of SQL code or because of issues in the backend database. A quality assessment after every step of data munging is important to avoid these issues.
More important than what good comes out of learning SQL is what bad comes out of not learning SQL. We see in the industry, that in general, very little time is spent in learning or practicing the skills required to manage SQL databases. This leads to a series of bad things, the biggest ones of which are:
Think of your data as the bricks you use to construct a building. Think of the SQL schema as the blueprint of your building. When you are constructing a simple structure of four walls, twelve feet in height and fifteen feet in width each, and no roof, you do not need much of a schematic. However, when your data gets more complicated, and a situation where you need a hundred tables (or walls, in our analogy), plus the roof, and if the blueprint does not account for the structural engineering and the myriad of standards and requirements, you end up building a structure that collapses in no time, which is, loads of duplicate data, over-complicated modeling, non-scalable structures, etc., which leads to a lot of overhead in the long run: on you, as well as the business you manage, support, or own.
Thousands of SQL queries
ORMs or Object-Relational Mapping is used in software code written in an object-oriented language with the relational database. In other words, ORMs convert data between the application and the database. In general, if the data is not modeled properly, these ORMs may create hundreds of SQL queries when mapping the data with the application, which will hammer the database and slow down the systems drastically.
Slow running queries
This is a contrasting situation: developers often do something complex with the ORM, ending up creating a monster SQL query. This query would take several seconds to run, and therefore, hinder the performance of the database, as well as the application.
This is also an important part of data modeling. Data in a SQL database is stored with constraints. These constraints are an integral part of the architecture. If your database does not have thought-out constraints, your data will get messy. Typically, on projects that span years, you usually end up with records that do not get cleaned up when not required anymore.
Every developer (and tomorrow it is going to be you) must take the time to understand SQL in full. Only then should you graduate to using an ORM. A deeper understanding of table modeling, relationships, constraints, and joins will help you avoid these pitfalls in the future.
Get started with SQL with this course. We understand the systems, we understand the industry, and we understand — through our years of experience — the different points-of-failure. This course covers key topics in SQL, such as aggregate functions, constraints, joins, subqueries, etc., which will be invaluable when working with databases. The course starts from the very basics — like “what CRUD is” — and assumes no knowledge of databases at the reader’s end. A perfect beginner’s course in SQL.
Top SQL Myths
The world’s favorite SQL myths are those when comparing SQL to NoSQL. It is often said that NoSQL will replace SQL. This is not true. In fact, NoSQL came into being in the sixties — even before SQL was born. SQL and NoSQL are two complementary solutions. One is better-suited than the other, based on the system that will utilize them.
Also, it is a common idea that SQL and NoSQL come with a clear distinction. The fact is, the line that divides NoSQL from SQL is blurring more and more as the two systems evolve. What is emerging is NewSQL, which is a hybrid between the two. And the fact remains: you need to know SQL to work with these new systems.
Outside of the SQL – NoSQL war is the idea that “SQL is all about fetching data”. This stems from the fact that most resources used to teach beginners concentrate only on the syntax part of SQL. The subtle message that beginners get is that SQL is all about querying. This leads to the idea that if a query executed without syntax errors, it is the right query for the job. This is a wrong notion that springs from a nonunderstanding of the SQL standards, and more importantly data modeling. The overlooking of these have placed SQL among the most underrated tools.
If you would like to test out this myth yourself, take a relational database with about three to ten tables on different topics (e-commerce, airline, retail, etc.) and ask some tough business questions. Try to find answers to them by executing several SQL blocks. These will form a good baseline to test your skills.
SQL is a beautiful beast, which has seen a significant growth in usage in the last couple of years. SQL, among languages, is like water among solvents. With an understanding of SQL, you enhance your capabilities and specialize as a programmer, developer, project manager and more. SQL is an open-source. A number of databases that use SQL — such as MySQL, MariaDB, and PostgreSQL — are open source with a thriving community of users.
SQL may be assumed to handle rows and columns of data, and I hear many undermining it to atrocious extents by comparing it with spreadsheets. Comparing spreadsheets to SQL databases in terms of handling millions of records is like comparing a pinwheel to a windfarm. Whether there are 1,000 records or a hundred million, SQL is equipped to handle data pools of virtually any size.
Also, SQL is omnipresent in the world of computers. Apart from large corporations, small companies, banks, hospitals and colleges, SQL is used virtually everywhere, including your iOS or Android smartphone—and I’m not talking just apps.
One major drawback with SQL is that the basic SQL standard does not have any decision structures or looping constructs such as if-else or for or while. Therefore, vendors have built solutions — or rather, language additions — that extend the vanilla SQL, such as Microsoft’s Transact-SQL (T-SQL) or Oracle’s Procedural Language SQL (PLSQL).
Surprisingly, however, there aren’t many individuals who have an understanding of SQL, beyond a mere working knowledge of it. Knowing SQL makes you invaluable in the field. Especially because more and more businesses are becoming digitized, and any and every business journal talks about analytics or business intelligence. As the usage of analytics and Bi grows (which it will, as the data grows), requirements for professionals to deal with data will increase. And SQL is the key skill that will empower you to be that professional that the businesses continue to seek.