SQL stands for Structured Query Language. It was first developed in the 1970s by a team of IBM researchers, NoSQL databases, on the other hand, were first used in 1998 by Carlo Strozzi. The most common difference between these two databases (DB) systems is that SQL is relational and NoSQL is non-relational. Let’s deep dive into these two databases to better inform your decision when next you are considering a database for your project.
Database Structure
Let’s talk about structuring.
SQL
SQL database have a definite schema structure. A schema contains tables, and each table contains a definite number of columns. That means a user cannot update the table beyond the number of columns specified in the table. This is especially useful when you need to maintain data integrity and also to make sure of the kind of data that gets saved into your database. Each table in an SQL database can be related. i.e., You can have relationships between tables. These relationships can be One to Many, Many to Many or One to One. The type of relationship you implement depends on what you require. For instance, let’s consider the hypothetical situation; we have a company with users, and users can make orders for products. Then, we could decide that users can create multiple orders, but each order can only be created by one user. This would be one to many relationships, i.e., one user to many orders. Hence, the table structure for both tables will look similar to the below. In our DB we could have a users table, structured as below, Also, we could have an orders table The user_id on the orders table, makes it easy to map each order on the orders table to the user it belongs to. In the case of a One to One relationship, we could have the order_id also on the users_table if we decide to get the user by its related order id. For, Many to Many situations, an extra table, called a Pivot table, is usually involved. This enables multiple records to be mapped to each other. Using the above instance. We would have, and the order table will be and then the Pivot table will hold both IDs as foreign keys. Based on this structure provided by SQL, you can comfortably write Joins between tables that will provide data from different tables joined together in one query.
NoSQL
NoSQL databases were built to be more flexible than SQL DBs, also to contain larger amounts of data. In NoSQL DBs, there is no pre-defined schema or tables. There are Collections, and in each Collections, there are Documents. This enables you to save data in different forms as they come. You can choose to have multiple varying documents with varying fields in one Collection. It is also possible to manually forge relations between Collections. However, they are not suitable for such purpose. Instead, you could save all that is needed for a single query into the same Collection. If you are an SQL person, you may think of Collections as tables and Documents as rows with the tables. However, there are no restrictions on the columns of data you can add with the table. Going back to our earlier defined hypothetical instance of a company with users and orders. A Users Collection could be defined as, And the Orders Collection could be defined as, However, in this case, we want to avoid having to manually join both Collections (which we shouldn’t, in this case). We can save entries into the Collection that gets the most read. I have decided (for this example) that will be the Orders collection. In this case, we no longer need to read from the Users Collection and only read from the Orders Collection, which now contains all the data we need. A key thing to note here: If you are building an app that does a lot of reads than write, a NoSQL option is likely more suitable for you. Because you could have your data all saved on the same collection, and you could read from that source comfortably to get all the required data. However, for an application that requires a lot of writes (approx. 10,000 writes per second) at that scale, it is not a good idea to have NoSQL option where you need to write the same data to multiple locations. In this situation, an SQL option is likely more suitable, where you have relations existing to all tables, and the same data do not need to be written to multiple locations repeatedly, updating data in one location can be available to other tables via the exiting relationship. This, of course, does not mean that each of these databases can not handle scale.
Scaling
Let’s explore how scaling works.
SQL
SQL DBs can not be scaled horizontally but only vertically. What does this even mean? Horizontally scaling means splitting data from one DB into multiple databases to ease the load. SQL data can, however, not be split on separate DBs due to its strict nature. The proper to scale an SQL DB is to increase the CPU, Memory, and Disk space of the existing DB server, and this is what it means to scale it vertically.
NoSQL
NoSQL DBs can be scaled both horizontally and vertically. This is due to the flexibility in its data storage. This, therefore, allows its data to be split on multiple databases, as is the case with horizontal scaling. It can also be scaled vertically if required. A key thing to note here: When it comes to scaling, both SQL and NoSQL Databases can be scaled effectively. However, for SQL DBs, vertical scaling can be a limitation; a single DB server will have a limitation on the amount of computing power it can carry. It is also important to note here, that for most of the applications you will build you might not hit the maximum of your server’s computing ability, but it’s helpful to keep this in mind. However, for large business applications implementing SQL, a popular option to beat this limitation is by Sharding.
What is Sharding?
Sharding is the process of breaking the large tables into small chunks, which are referred to as shards. Sharding could be done by horizontally partitioning a database. This is not to be confused with Horizontal and Vertical Scaling. Horizontal partitioning refers to the process of storing rows of a table in multiple database nodes. Vertical partitioning, on the other hand, requires saving columns of a table on different nodes. This allows the database to scale effectively and boost performance.
Database Examples
SQL
MySQL – A very popular open-source database. Easily the database of choice for many PHP developers, however, could also be used with Node.js, C#, C++, Java, Perl, Ruby, and Python. MSSQL – Microsoft SQL provides a lot of stability as its development is directly from Microsoft, which also offer some support in terms of disaster recovery. MariaDB – This was built on MySQL by the makers of MySQL, intending to keep MariaDB as a free forever version. PostgresSQL – A very popular open-source database. Prides its self as the world’s most advanced open source database Oracle – This is usually tailored to Oracle’s enterprise solutions with some limitations on its free version.
NoSQL
MongoDB – Probably the most well known NoSQL DB, common amongst application developers who work with MERN stack (MongoDB, Express, React, Node) or MEAN stack (MongoDB, Express, Angular, Node). Firebase – Introduced in 2011 and acquired by Google in 2014, is being widely used by web and mobile application developers. Apache Couch DB – A document-based NoSQL DB which stores data as JSON. Redis: This is NoSQL DB, probably most well known for its use in storing data with optional time to live. It is also well-known for its speed.
Conclusion You can create any kind of application with either an SQL or NoSQL database. It depends on your requirements. If you are considering a database where you have more reads and fewer writes, a NoSQL might be a good option. If you are, however, considering building an app with more writes than reads, an SQL might be the better solution. On scalability, when your app gets to a very massive scale, you might end up using both DBs.