Part 8: Understanding System Design - Databases

In this episode, I will explore the next system design component, databases.

Hello “👋”

Welcome to another week, another opportunity to become a Great Backend Engineer.

Today’s issue is brought to you by Masteringbackend → A great resource for backend engineers. We offer next-level backend engineering training and exclusive resources.

Before we get down to the business of today. Part 8 of Understanding System Design.

I have a gift for you: You will love this one.

MaxAI.me - Do more Faster with 1-Click AI

MaxAI.me lets you chat with GPT-4, Claude 3, Gemini 1.5. You can also perfect your writing anywhere, save 90% of your reading & watching time with AI summary, and reply 10x faster on email & social media.

Now, back to the business of today.

In the previous edition, I discussed one of the system's components: Distributed Caching. I then discussed distributed caching further and introduced you to different caching strategies, how to implement them, and different caching eviction policies.

In this episode, I will explore the next system design component, databases. We will elucidate the following topics:

  • What is a Database?

  • Types of Database

    • Relational Databases

    • Non-Relational Databases

  • What is database replication?

  • Database Sharding

  • Database Indexing

What is a Database?

Databases are among the most important components of every system and are present in every system for 99.99% of the time.

So, it’s important to learn databases deeply and understand the differences between each database so you can make an informed decision when adding it to your system.

A database is a collection of data stored, organized, and retrieved electronically. It stores and manages large amounts of structured and unstructured data. Databases are crucial for data activities, such as data analysis, management, storage, etc.

Types of Database

Databases are broad, and there are numerous types of databases. However, there are two main and 2 less common groups of databases, viz:

  • Relational Databases

  • Non-relational Databases

  • Object-oriented Databases

  • Hierarchical Databases

We will start our discussion with the two less common groups and move further to the main groups of databases.

Hierarchical Databases

According to MongoDB, hierarchical databases developed in the 1960s resemble a family tree in which a single object (Parents) has one or more objects beneath it (Children).

No child can have more than one parent, and because of this easy navigation system, the database offers a high performance, as there’s easy access and a quick querying time. The Windows Registry is one example of this system.

Hierarchical Databases

Object-oriented Databases

Object-oriented databases store data on a database’s server disk. They integrate database capabilities with object-oriented programming language capabilities, which provide a more natural and intuitive way to manage complex data and relationships.

Object-oriented databases can persist object associations, making them unique. This enables rapid and powerful data querying across complex relationships in object-oriented programming.

Mongo Realm is a good example of an Object-oriented database where the query language constructs native objects through your chosen SDK.

A diagram illustrating the key components and relationships within an object-oriented database

Relational Databases

Relational databases are the most popular group of databases. They are also known as relational database management systems (RDMS) or SQL databases.

They store and organize data in a tabular format, which includes rows and columns. This format makes it easy to understand how data relate to one another. The most popular SQL databases are:

  • MySQL

  • SQL Server

  • PostgreSQL

  • Oracle

A simple relational database entity relationships diagram for a bank.

When designing a system, you must decide which database to use and which will align with your use case. Let’s discuss this further.

When should relational databases be used?

  • Need to store structured data: Relation databases should be used when there is a need to store and manage large amounts of data. For instance, a relational database will be a good choice if you have to store “Articles” and their associated “Authors.” Relational databases best fit interrelated data when multiple tables must be joined to retrieve information.

  • When dealing with financial transactions: The ACID properties in relational databases give you a guarantee around data atomicity, consistency, isolation, and durability when dealing with transaction operations.

  • When dealing with frequently changing data - Relational databases are well-suited for use cases where data needs to be queried and updated frequently and where data integrity and consistency are important. They are commonly used in business and enterprise applications.

Benefits of Relational Databases

  1. Query capabilities - It provides a powerful SQL query language for easy data retrieval and manipulation.

  2. Simplicity—The main benefit of relational databases is that they provide an intuitive way to store data and allow easy access to related data points.

  3. Normalization - Relational databases employ normalization to reduce data redundancy.

  4. Data Security - It provides multiple levels of security, from user-level to system-level security, which protects data stored in the database.

  5. ACID Compliance - Relational databases support ACID compliance. ACID is a concept that describes four properties of a transaction of a database system that ensure the accuracy and integrity of data. Below are the key properties of ACID

    1. Atomicity

    2. Consistency

    3. Isolation

    4. Durability

ACID is large, and I will create a dedicated article explaining ACID compliance with a relational database and its benefits.

Limitations of Relational Databases

No matter how popular relational databases are. Some limitations are involved with relational databases.

  • Expensive: The main drawback is managing a relational database, as it is very expensive to set up and maintain. It requires high-performing, costly hardware and a lot of storage.

  • Flexibility: Relational databases are designed to store structured data, which makes them problematic for unstructured data. Therefore, they are not flexible enough for all types of data.

  • Limited Indexing: Limited indexing capability in relational databases can lead to poor query performance for large datasets.

  • Scalability: Relational databases may struggle to handle large amounts of data or a high volume of concurrent transactions. They are typically designed to run on a single server and be scaled vertically. Vertical scaling is limited to processors and memory upgrades. To scale a relational database effectively, it must be distributed onto multiple servers. Handling tables across different servers is difficult.

  • Performance: There are performance issues associated with relational databases when executing complex queries or dealing with many joins.

Non-relational Databases

Non-relational databases are also known as non-relational database management systems or NoSQL data.

They are designed to support storing and managing large amounts of unstructured data. A NoSQL database does not use relational database systems' traditional tabular schema of rows and columns.

They store data in a non-tabular form and are more flexible than relational databases.

A diagram showing non-relational databases

There are four major types of NoSQL databases, viz:

  • Document databases

  • Key-value databases

  • Wide columns databases

  • Graph databases

Document Databases

Document databases store data in a document, a container for key-value pairs. The documents are stored in JSON or XML format and are unstructured without requiring a schema.

These documents are grouped to become a collection, which can be organized into a hierarchy. This allows you to model and retrieve relational data significantly without joins.

They are suitable for online games, IoT, mobile apps, etc. Below are the most popular ones, viz:

  • Firestore

  • DynamoDB

  • CouchDB

  • MongoDB

Key-Value databases

The key-value databases store data in a format where each item contains keys and values. They store data in a hash table where each key is unique, just like in dictionaries, and the value can be in JSON, String, BLOB, etc.

Key-value databases store data mainly in memory and not the disk. This limits the data you can store but makes it extremely fast.

Below are good examples of Key value databases:

  • Redis

  • Memcached

Key-value databases are mainly used for distributed caching, as discussed in the previous newsletters.

Wide-Column Database

A wide-column database, also known as a column-family database, is a type of NoSQL database that organizes data into tables with rows and dynamic columns.

This allows for more flexible and efficient storage of large amounts of unstructured data and faster querying and retrieval of specific data subsets.

Below are examples of wide-column databases:

  • Apache Cassandra

  • HBase

  • Amazon SimpleDB.

Graph Databases

Graph databases store data in nodes and edges. The nodes typically store information about people, places, and things, while edges store information about the relationship between the nodes.

They are a good alternative to relational databases, especially when you must make many joins and performance suffers.

Graphs are useful for identifying patterns in structured and semi-structured data. They are mainly used in fraud detection in finance and to power recommendation engines like AirBnB.

Below are popular examples of graphs databases:

  • Neo4j

  • Dgraph.

When do you use NoSQL databases?

NoSQL is designed to address the major limitations of relational databases. It can be used across large distributed systems since it is more scalable and can handle more data loads than relational databases.

NoSQL databases are the right choice if:

  • If your application requires super-low latency because access is very fast in key/value stores, it provides low latency and high throughput.

  • If your data are unstructured, or you do not have any relational data.

  • If you need to store a massive amount of data.

Benefits of NoSQL database

  • Flexibility: Relational databases strictly store data in a structured format. NoSQL provides support to flexible schemas that enable faster development. The flexible data model is perfect for unstructured and semi-structured data. You can change the structure of data without disrupting the current structure.

  • Scalability: The relational database architecture is mostly scale-up, which increases performance by upgrading CPUs and adding more memory. However, NoSQL databases use the scale-out approach by using distributed hardware clusters. The scale-out architecture makes NoSQL easily scalable when data and traffic grows.

  • Simplicity and Developer-friendly: Many NoSQL databases are simple, and developers find it easier to create various applications than SQL Databases.

  • Large amounts of data storage: Many NoSQL databases can handle large data sets, making them perfect for big data applications. NoSQL is great when you are dealing with large volumes of data.

Limitations of NoSQL database

  1. The lack of SQL—A major complaint about NoSQL is that it does not come with SQL, a mature technology for querying databases. It lacks the complex tool functionality found in SQL.

  2. Lack of Standardization—Every NoSQL database uses its own schema. In databases like MongoDB, there is no schema. In some other databases, like ElasticSearch, it is dynamic. Some other database designs resemble relational databases(for example, Cassandra). The problem here is that each database has its own unique strengths and weaknesses, which must be learned before choosing the right database for a project.

  3. The lack of ACID - I will discuss ACID as the properties that define a transaction. NoSQL does not support these properties. Transactions provide the strongest guarantees available around data consistency.

That will be all for this week. I like to keep this newsletter short.

Today, I discussed Databases as a component of System Design and introduced you to relational and non-relational databases.

Next week, I will start exploring ACID Compliance in Databases as one of the important features of Relational Databases.

Don’t miss it. Share with a friend

Did you learn any new things from this newsletter this week? Please reply to this email and let me know. Feedback like this encourages me to keep going.

See you on Saturday.

Remember to get the  Masteringbackend → A great resource for backend engineers. We offer next-level backend engineering training and exclusive resources.

Top 5 Remote Backend Jobs this week

Here are the top 5 Backend Jobs you can apply to now.

👨‍💻 Canva
✍️ Java Backend Engineer
📍Remote, Prague
💰 Click on Apply for salary details
Click here to Apply for this role.

👨‍💻 Docplanner
✍️ Backend Developer - mid
📍Remote, Italy
💰 Click on Apply for salary details
Click here to Apply for this role.

👨‍💻 BandLab Technologies
✍️ Senior Backend Engineer, Ads Team
📍Remote, Asia, Africa, Europe
💰 Click on Apply for salary details
Click here to Apply for this role.

👨‍💻 WunderGraph
✍️ Senior Golang Engineer - Remote (EMEA)
📍Remote, Go, Golang
💰 Click on Apply for salary details
Click here to Apply for this role.

Want more Remote Backend Jobs? Visit GetBackendJobs.com

Backend Engineering Resources

Whenever you're ready

There are 4 ways I can help you become a great backend engineer:

1. The MB Platform: Join 1000+ backend engineers learning backend engineering on the MB platform. Build real-world backend projects, track your learnings and set schedules, learn from expert-vetted courses and roadmaps, and solve backend engineering tasks, exercises, and challenges.

2. The MB Academy:​ The “MB Academy” is a 6-month intensive Advanced Backend Engineering BootCamp to produce great backend engineers.

3. MB Video-Based Courses: Join 1000+ backend engineers who learn from our meticulously crafted courses designed to empower you with the knowledge and skills you need to excel in backend development.

4. GetBackendJobs: Access 1000+ tailored backend engineering jobs, manage and track all your job applications, create a job streak, and never miss applying. Lastly, you can hire backend engineers anywhere in the world.

LAST WORD 👋 

How am I doing?

I love hearing from readers, and I'm always looking for feedback. How am I doing with The Backend Weekly? Is there anything you'd like to see more or less of? Which aspects of the newsletter do you enjoy the most?

Hit reply and say hello - I'd love to hear from you!

Stay awesome,
Solomon

I moved my newsletter from Substack to Beehiiv, and it's been an amazing journey. Start yours here.

Join the conversation

or to participate.