Different types of database management systems explained
The various types of database software come with advantages, limitations and optimal uses that prospective buyers should be aware of before choosing a DBMS.
Data is the lifeblood of organizations, and the database management system is the beating heart of most operational systems and analytics applications. The DBMS is the primary platform for processing, storing and managing data and serving it to applications and end users. But there are many different types of DBMS technologies, each with its own strengths and weaknesses.
The most prevalent type is the relational database management system (RDBMS). It became the norm for data management almost 40 years ago, after low-cost servers became powerful enough to make the technology widely practical and relatively affordable. Relational databases use the SQL programming language and are based on a data model that supports transactional consistency and reliability, making them a good fit for the traditional structured data that's common in business applications.
But some shortcomings with the relational model -- in particular, its rigidity and cost -- became more apparent in the early years of the web era and were brought to the fore by the emergence of big data technologies starting in 2005. Today, IT departments can consider several other DBMS product options as alternatives to the mainstream RDBMS, depending on the specific applications being run.
Types of DBMS technologies
There's a lot at stake in the DBMS evaluation and selection process. Choosing a DBMS technology can affect the success or failure of your IT systems and applications. Because the database landscape is complex and confusing to navigate, it's crucial that IT and data management teams fully understand the different types of DBMSes, along with when and why to use them. That applies both to on-premises DBMS platforms and the cloud databases that organizations are increasingly using.
To help with that, let's look more closely at the available options.
1. RDBMS
Until relatively recently, the RDBMS was the only product category worth considering for most applications. The big data trend, as well as the growth of technologies like IoT and real-time data streaming, have led to the development of new kinds of DBMSes that compete well with relational software for certain use cases. However, the RDBMS continues to be the undisputed market leader in terms of revenue and installed base.
Based on the mathematics of set theory and first developed for commercial release in the late 1970s, relational databases provide data management, access and protection with reasonable performance for most applications, whether operational or analytical in nature.
Since the mid-1990s, relational software has been the primary operational DBMS, led for most of that time by products from industry giants: Oracle Database, Microsoft SQL Server and IBM Db2, plus SAP HANA as a more recent addition. As more database workloads move to the cloud, AWS and Google Cloud have also become big factors in the RDBMS market, partly by offering cloud-based versions of MySQL, PostgreSQL and MariaDB -- three popular open source DBMSes.
Relational systems also became the most widely used DBMS platforms for the data warehouses that organizations began to deploy in the 1990s to support BI and enterprise reporting applications. In the cloud, Oracle, Microsoft, IBM, SAP and traditional data warehousing vendors like Teradata have similarly been joined by RDBMS technologies such as Amazon Redshift from AWS, Google's BigQuery and Snowflake, although those products use columnar storage to optimize analytical query performance.
The RDBMS is adaptable, stable and reliable. Its maturity has been bolstered by all the years of use in both large and small organizations. Support for the ACID properties -- atomicity, consistency, isolation and durability -- is one of the most compelling features of relational databases. ACID compliance guarantees that all transactions are completed correctly or that a database is returned to its previous state if a transaction fails to go through -- a key capability to ensure that data is always consistent.
Given the comprehensive nature of the RDBMS, why have other types of database systems gained more popularity? Web-scale data processing, big data requirements and AI and machine learning applications challenge the capabilities of conventional relational databases. Although they can be used in these realms, alternatives that offer more flexible database schemas, less stringent consistency models and lower processing overhead can be advantageous in such dynamic environments. Also, the stability and reliability of RDBMS technologies comes at a cost: They aren't cheap.
2. NoSQL DBMS
NoSQL systems began to appear in the mid-2000s. At first meant to be taken literally, NoSQL came to more commonly stand for "not only SQL," as many NoSQL vendors adopted some aspects of the programming language. However, while the SQL-based RDBMS requires a rigidly defined schema, NoSQL databases give users more flexibility, supporting schemas in which all of the different entities don't need to contain the same data elements. For loosely defined or varied data structures that might also evolve over time, a NoSQL DBMS can be a more practical solution than an RDBMS.
Another difference between NoSQL and relational DBMSes is how data consistency is provided. Most NoSQL DBMS products support eventual consistency, in which data might not always be consistent across the nodes of a distributed database but is made so when it isn't actively being updated. RDBMS platforms typically also offer varying levels of locking, consistency and isolation that can be used to implement eventual consistency, and some NoSQL vendors have added full ACID compliance. In general, though, NoSQL systems offer a more relaxed form of consistency, which can speed up processing.
Because of those attributes, NoSQL addresses some of the problems that RDBMS technologies encounter in working with unstructured and semistructured data, as well as varied data sets and large amounts of sparse data. Data is classified as sparse when not every element in a database is populated and there is a lot of "empty space" between the values that are there. For example, think of a matrix that contains numerous zeroes and only a few actual values.
But while certain types of data and use cases can benefit from NoSQL software, using it can come at the price of eliminating transactional integrity, flexible indexing and ease of querying. Further complicating matters is the fact that NoSQL itself includes multiple types of DBMS platforms. It's a broad descriptor for these four primary product categories:
- Key-value stores contain pairs of unique keys and associated values. Examples include Aerospike, Amazon DynamoDB, Redis and Riak.
- Document databases store data in document-like structures encoded in formats such as JSON and XML. Examples include Couchbase Server, CouchDB, MarkLogic Server and MongoDB.
- Wide-column stores hold data in tables that contain large numbers of columns. Examples include Accumulo, Bigtable, Cassandra, HBase and ScyllaDB.
- Graph databases store data in graph form to highlight the connections between different data elements. Examples include Amazon Neptune, ArangoDB, Memgraph, Neo4j and TigerGraph.
Each type of NoSQL DBMS is best suited to particular use cases and has individual pluses and minuses to consider.
3. In-memory DBMS
DBMS technologies also include the in-memory DBMS (IMDBMS), sometimes referred to as a main memory DBMS. An IMDBMS relies mostly on memory to store data, as opposed to using disk-based storage. That makes the data in a database more immediately accessible to end users.
The primary use case for in-memory databases is to improve performance in applications that require fast data throughput. Because data is maintained in memory, I/O latency is greatly reduced by eliminating mechanical disk movement, seek time and the transfer of data to a buffer. IMDBMS products can also reduce processing overhead because the internal algorithms they run usually are simpler to execute, with fewer CPU instructions than the ones in disk-based systems.
In-memory databases aren't a wholly distinct product category, though. SAP HANA is an in-memory RDBMS, as are technologies like Oracle TimesTen In-Memory Database, Volt Active Data and SingleStore, while Aerospike and Redis are examples of in-memory NoSQL DBMSes. In addition, Oracle, Microsoft and IBM have all added in-memory processing capabilities to their flagship RDBMS platforms.
4. Multimodel DBMS
Another technology category is the multimodel DBMS, which supports more than one type of data model. Many NoSQL offerings do so -- for example, combining document and key-value stores. Some, such as Microsoft's Azure Cosmos DB and Progress Software's MarkLogic Server, were initially developed as multimodel products, while multimodel capabilities were added to others as product upgrades. Likewise, some RDBMS platforms have evolved to also support NoSQL data models, such as adding document and graph stores to their core relational engine.
5. NewSQL DBMS
NewSQL databases are a somewhat informal DBMS category. They're RDBMS platforms developed to bridge the gap with NoSQL systems by supporting distributed databases with ACID-compliant capabilities for horizontal scaling, real-time processing and other high-volume data needs. Examples include CockroachDB, Google's Spanner, NuoDB, Volt Active Data and YugabyteDB. Some vendors of such technologies now eschew NewSQL as a label and describe their products as distributed SQL DBMSes.
6. Columnar DBMS
A columnar DBMS is a SQL database system tailored to data warehousing deployments because it's optimized for reading a few columns that contain many rows at once in order to speed up queries. Unlike a traditional RDBMS, which uses row-oriented storage, a columnar DBMS stores data by columns. Doing so is beneficial for complex analytical queries that aggregate data by column values. On the other hand, it isn't efficient for transaction processing or applications that involve data modification.
SAP IQ, developed in the 1990s and formerly named Sybase IQ, is an example of a relational column-store DBMS. While traditional columnar technologies weren't as widely used as relational databases in on-premises data warehouses, the use of columnar storage in the cloud data warehouse platforms mentioned above has expanded its adoption. In addition, some RDBMS products, such as IBM Db2 and Microsoft SQL Server, offer a choice of row storage or columnar storage to support both operational and analytics applications. Products with that capability are sometimes referred to as translytical databases or hybrid transaction and analytical processing ones, the latter commonly known as HTAP for short.
7. Cloud DBMS
As the term indicates, a cloud DBMS operates in a cloud computing environment. Unlike on-premises database systems that are installed and managed locally in an organization's data center, a cloud DBMS is hosted by a cloud service provider.
Similar to in-memory databases, cloud databases aren't a wholly distinct product category -- they encompass all the DBMS technologies listed above. Database deployments are clearly shifting toward the cloud overall, and most on-premises DBMSes now offer a cloud-based option or offshoot. In other cases, vendors have developed DBMS products specifically for the cloud.
Cloud DBMS offerings vary in terms of features, deployment models and pricing structures, but they generally share some common characteristics, including built-in high availability features and elastic scalability that enables users to easily scale systems up or down based on changing workloads and resource requirements. Organizations can also quickly deploy new databases in the cloud or shut down ones they no longer need, providing increased flexibility compared to on-premises DBMSes.
Flexible deployment options are another hallmark of cloud DBMS offerings, which typically enable users to choose between fully managed and self-managed services. In the former, the DBMS provider handles infrastructure provisioning, maintenance and routine database administration tasks for customers, although database administrators in user organizations typically are still involved in overseeing and managing systems. Users also have a choice of deploying databases in public, private or hybrid clouds and using pay-as-you-go, subscription-based or other pricing models.
Other DBMS categories that aren't as prevalent as the preceding ones include the following:
- Time series databases collect data generated on an ongoing basis and store it in successive order to enable analysis of how the data changes over time.
- Search databases are specialized data stores that are designed to support enterprise search applications, as well as application monitoring and other uses.
- XML DBMSes are architected to support XML data. However, most RDBMS platforms now provide XML support, as do NoSQL document databases.
- Most popular in the 1990s, object-oriented DBMSes represent data as objects and were designed to work with object-oriented programming languages. Hybrid object-relational databases that blended those two approaches were also developed.
- Pre-relational DBMS products include hierarchical and network database systems developed to run on mainframes. IBM Information Management System -- better known as IMS -- and IDMS, now owned by Broadcom, are two such technologies that are both still available.
Additional considerations for choosing a DBMS
As you examine the different types of database management system technologies and then specific DBMS products for a planned purchase, one issue that should be at the top of the list to consider is server hardware platform and OS support. The predominant computing environments today are Linux, Windows, Unix and the mainframe. Not every DBMS is supported on each of those platforms.
Another consideration is technical support from DBMS vendors. Software maintenance and support is a critical capability -- and it can be a significant ongoing expense. Many DBMS technologies are open source, particularly NoSQL ones. The open source approach increases flexibility and reduces the initial cost. However, open source software lacks support unless you buy a commercial subscription. Total cost of ownership can end up being higher when you factor in the related administration and support costs.
You might also choose to reduce implementation and administration pain by running a DBMS in the cloud. As mentioned above, you can either deploy a self-managed system in a private or public cloud or use a managed database service offered by a cloud provider or another database vendor. Also known as database as a service (DBaaS), the latter approach frees users from having to install, configure and administer DBMS technologies themselves. Similar data warehouse as a service (DWaaS) offerings are also available. In addition, some DBaaS and DWaaS technologies can now be installed in on-premises data centers and managed remotely by the vendor.
If your organization is considering a DBMS purchase, document your specific needs, determine which DBMS technology is the right fit and then examine the leading products in that category. Doing so will require additional details on the different types of DBMS software and the use cases for which they're optimized. Indeed, there are many variables that need to be evaluated to ensure you make a wise decision when buying a database management system.
Craig S. Mullins has extensive experience in the field of database management, having worked as an application developer, a DBA and an instructor with multiple database management systems including DB2, Sybase and SQL Server.