This article is about the fundamental concept of Structured Query Language, SQL. Thus, if you are new to SQL and have no idea what it’s about or wish to refresh your SQL knowledge, this might be the article you’re looking for. This article will give an introduction to SQL and show how SQL works in creating databases and managing them. Then, it will also try to give a guide on which parts or concepts you should focus on based on your job title. Hope that this article gives you some idea of how SQL works.
Without any further ado, let’s get started.
Table of Contents
Introduction to Structured Query Language
Structured Query Langauge, also known as SQL is a programming language specifically designed for relational databases (a type of database that stores data in rows and columns like MS Excel). Its application on databases is so wide that we could use SQL to create and manage databases, tables, insert data, update data, and database security etc, just the same as how we create, manage and update folders and files in Windows or Mac.
But where do we use SQL though? This question might come to you at some point. We indeed need a specific platform for SQL to run, like how we need an operating system to see where our folders are. This is where Relational Database Management System (RDBMS) comes into play. Here is an example of one of the RDBMS so-called “SQL Server Management Studio, SSMS”.
(There are many different RDBMS with different properties, some excel in performance, and some offer more flexibility, but those are not the concerns of this article. So, we will stick with SSMS at this point.)
What is SQL Server Management Studio (SSMS)? by Microsoft
Imagine SQL as the terminal of Windows or MacOS on your computer, it is capable of every job the GUI can do. For example, if we want to create a new folder in Windows, we Right Click > New > Folder, whereas in the terminal, we do “mkdir NewFolderName“. Both execute the same action, just in different ways. In RDBMS, we can create a database with Right Click > New Database. with SQL, we simply run “CREATE DATABASE [Your Database Name]“.
Figure 1 – Create Database in SQL Server
(Of course, we cannot do everything sorely with SQL just as we can only drag-and-drop files between folders with Windows GUI but not terminal, but the core operation between them remains the same)
Types of SQL Commands
Now, you know what SQL is and what it is capable of. Now it is time to break SQL into different parts according to the types of SQL commands. In general, SQL commands consist of 5 types:
Commands Types | Description |
Data Definition Langauge (DDL) | To define the database schema |
Data Manipulation Langauge (DML) | To manipulate and process data in databases |
Data Query Langauge (DQL) | To query or retrieve data from databases |
Data Control Language (DCL) | To control access or permissions of the database system |
Transaction Control Langauge (TCL) | To control the execution and outcome of a transaction |
Familiar with SQL Command Types
After all, this article aims to share the usage of SQL during database development. So, it is better to work with some examples. We want a database to store what we purchased this year. So, the data we need to store is probably the product’s name, the amount purchased, and the price (per unit). Therefore, we need to create a database and a table for us to store the data.
A. Data Definition Langauge, DDL
First, we will create and define the database schema. You’ve already seen this before, we use “CREATE DATABASE [Your Database Name]“.
(If you have SSMS and want to try it yourself, you could click File > New > Query with Current Connection, and start typing the command)
Then, we create the table with the following command and now you’re ready to insert your data. If you somehow define your table wrongly, you could redefine your table by command “ALTER” or “DROP” to delete the table.
CREATE TABLE [Your Table Name] (
id int primary key,
product_name varchar(30),
category string,
amount int,
price_per_unit double,
purchased_date timestamp
);
Note: the “primary key” from the code is the column that makes each row unique. In practice, sometimes there will be more than one primary key column.
B. Data Manipulation Language, DML
Now we will insert our data with DML.
INSERT INTO TABLE [Your Table Name] VALUES (1, "Coffee", "Drink" 1, 5, "2024-01-14 13:23:44.000000");
INSERT command, the only command in SQL for adding data into tables. But what if anyone inserted the wrong data or inserted a few lines of wrong data? What should they do?
DML also have a command to delete data, which is DELETE and TRUNCATE. The “DELETE” command will allow a developer to delete data that matches a condition defined by developers. On the other hand, the “TRUNCATE” command will delete all the data in a table, but not the table itself.
C. Data Query Language, DQL
DQL is a simple and straightforward part, but it could also be very tricky in practice. In our example, we can select the data by simply executing:
SELECT * FROM [Your Table Name] OR
SELECT product_name from [Your Table Name]
Simple right? Unfortunately, it is simply only because we don’t have much data and the database structure is simple. Imagine if we have millions of records and several tables in one project, then we are probably interested in some more complicated scenario. For example,
- What is the total spend on a specific product each quarter year?
- What is the most purchased product for each month?
There are many techniques for advanced queries, but I will leave it for you to explore because that is not the objective of this article.
D. Data Control Langauge, DCL
DCL commands control the authorization of users to data stored in a database. For example, “GRANT” and “REVOKE”. The “GRANT” command is to permit the table for specified groups or users of a database. While “REVOKE” did the opposite, it takes permission away from specified groups or users. Here is an example:
Grant SELECT,INSERT,UPDATE,DELETE on [Your Table Name] To [Group / User Name]
Revoke INSERT On [Your Table Name] To [Group / User Name]
If you don’t know how to create a user or group in a database, you could refer to Users and Groups (Master Data Services) from Microsoft.
E. Transaction Control Language, TCL
TCL commands are used to control database transactions instead of data. For example, if we update data, or insert data, that’s all transactions. So, there are three commonly used TCL commands:
- COMMIT
- This command saves all the transactions a developer makes. Whether it is insert data or delete, the changes will only be permanent after executing this command. Otherwise, a developer could easily roll back the changes.
- SAVEPOINT
- SAVEPOINT allows developers to save their work separately. For example, we have all insert queries as one part of transactions and delete queries as another part.
- ROLLBACK
- ROLLBACK is just like an undo, which allows you to revert your mistakes to a particular savepoint. For example, you initialize a table with data. Then you created a savepoint. Then you mistakenly deleted some useful data. Therefore in such situations, you could roll back the deletion transaction using the “ROLLBACK” command.
Which Parts Should You Focus
Many jobs are using SQL, but I would like to group these jobs into two different groups. Here are the groups:
- Data Analyst / Data Scientist / Business Analyst
- Data Engineer / Database Administrator / Database Architect
Well, it is not too difficult to tell what’s the difference between these two groups. The first group are focused on analytics-kind of work, which means they need to write complex SQL queries to get the crucial information they need. Thus, if you are in the first group, you might want to focus on DQL or maybe some DML for gathering data.
Then, we have the second group which mainly focuses on databases. Thus, if you fall into this group, then you need to be an expert on every part of SQL except DQL as you might not need to do complex queries that much.