Skip to content

A comparison between AWS databases

Main databases types:

  • Relational: data are stored in tabular form (rows and columns), where each row represents a unique record. Tables can be put in relation with each other through joins and queried via SQL;
  • Key-value: non-relational database where each record stored as a unique key with its associated value, resembling a dictionary-like structure;
  • Document: semi-structured and hierarchical databases for catalogs and content management systems, often stored as JSON;
  • Graph: the way the data are stored is graph-based, with nodes and edges connecting each data source with the others;
  • Time-series: database optimized for records which indices are timestamps.
Service Type Query language Use cases
Athena Structured, semi-structured and unstructured SQL based on HiveQL DDL and Presto DML Log analysis, OLAP, BI
Aurora1 Relational MySQL, PostgreSQL eCommerce, CRM
DocumentDB Document compatible with MongoDB query language product catalogs, images and videos, application data platform
DynamoDB Key-value NoSQL Mobile and web apps, gaming, IoT
Neptune Graph GQL (Apache TinkerPop Gremlin, SPARQL) Fraud detection, social netowkrs, knowledge graph, recommendation engines
Timestream Time-series ANSI SQL IoT, DevOps, telemetry, forecasting, analytics
RDS Relational SQL DWH, CRM
Redshift Relational SQL based on PostgreSQL Large-scale DWH, data migration, OLAP

Warning

Athena and RDS are somewhat erroneously reported in the above table which refers to databases: a deep dive on the main differences is available below.

Amazon RDS

From the official docs:

Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the AWS Cloud. It provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks.

Amazon RDS is not therefore a database engine itself, rather than a tool that helps in managing relational databases on AWS. Essentially, can be thought as an AWS-managed, generally purpose RDBMS.

It allows to manage and run six database engines:

  • Amazon Aurora
  • MySQL
  • MariaDB
  • PostgreSQL
  • Oracle
  • Microsoft SQL Server

A brief comparison between open source databases

A brief comparison between open source relational databases is available below (full credits to Soufiane L).

Feature MariaDB2 MySQL PostgreSQL3
Materialized Views ✔️
Partial Indexes ✔️
Array Data Type ✔️
JSON Data Type ✔️ ✔️ ✔️
CHECK constraints ✔️ ✔️
Replication ✔️ ✔️ ✔️
Full-Text Search ✔️ ✔️ ✔️
UPSERT ✔️ ✔️ ✔️
Common Table Expressions ✔️ ✔️ ✔️
Sequences ✔️ ✔️
Table Partitioning ✔️ ✔️

Amazon Athena

Amazon Athena it's not a database engine itself but it's defined as an interactive query service based on Apache Presto that makes easy to analyze data stored into S3 via SQL. This means that compute and storage are separate and managed independently.

It is completely serverless and cost-effective, and can be used together with AWS Glue - fully managed ETL service - which takes care of managing a related data catalog as a central source for metadata while preparing data for querying.

In contrast, efficient querying requires data to be partitioned and purposely organized into S3 buckets upfront; moreover AWS Athena users compete for the same resources at the same time. While AWS provisions more resources as needed, it could mean that query performance fluctuates based on other users needs.

Resources


  1. can be run on both RDS and as Aurora serverless

  2. based on MySQL and created after its acquisition by Oracle. 

  3. PostgreSQL is generally considered to be the fastest one in terms of read/write speed.