Course - AWS Certified Database - Specialty

PTO - Performance Tuning and Optimization

1. Workload-specific database design

  • Select appropriate database services for specific types of data and workloads
  • Determine strategies for disaster recovery and high availability
  • Design database solutions for performance, compliance, and scalability
  • Compare the costs of database solutions

2. Deployment and migration

  • Automate database solution deployments
  • Determine data preparation and migration strategies
  • Execute and validate data migration

3. Management and operations

  • Determine maintenance tasks and processes
  • Determine backup and restore strategies
  • Manage the operational environment of a database solution

4. Monitoring and troubleshooting

  • Determine monitoring and alerting strategies
  • Troubleshoot and resolve common database issues
  • Optimize database performance

5. Database security

  • Encrypt data at rest and in transit
  • Evaluate auditing solutions
  • Determine access control and authentication mechanisms
  • Recognize potential security vulnerabilities within database solutions

https://www.aws.training/Details/eLearning?id=47245

Purpose Built Databases: Match your workload to the right database

Factors to consider when choosing a database

  1. Transactional compliance requirements of your workload
  2. Data longevity
  3. How strictly invalid data must be rejected before it reaches the database (ideally very strictly, with server-side validation performed before anything is persisted)

Structure of data

The structure of the data largely determines how we store and retrieve it. Because our applications deal with data in a variety of formats, selecting a database includes picking the right data structures for storing and retrieving that data. If we pick the wrong structures, the application will take longer to retrieve data and will require more development effort to work around data issues.
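To make the point about structure concrete, here is a minimal sketch that holds the same information in three shapes: normalised relational rows, a nested document, and a key/value entry. The table names, field names, and the in-memory SQLite database are assumptions chosen purely for illustration; they do not come from the course material.

```python
import json
import sqlite3

# Relational shape: normalised rows that are joined at query time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total_cents INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 2500), (11, 1, 400);
""")
rows = conn.execute(
    "SELECT c.name, o.id, o.total_cents FROM customers c "
    "JOIN orders o ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

# Document shape: the whole aggregate is nested and read in one fetch.
customer_doc = {
    "id": 1,
    "name": "Ada",
    "orders": [{"id": 10, "total_cents": 2500}, {"id": 11, "total_cents": 400}],
}

# Key/value shape: an opaque value looked up by a single key.
kv_store = {"customer:1": json.dumps(customer_doc)}

print(rows)
print(customer_doc["orders"])
print(kv_store["customer:1"])
```

The relational form makes ad hoc queries across customers easy, the document form returns a customer with its orders in one read, and the key/value form only supports lookup by key. Which trade-off is acceptable depends on how the application reads the data.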

Size of data to be stored

This factor concerns the quantity of critical application data we need to store and retrieve. The amount a database can handle varies with the data structure selected, the database's ability to partition data across multiple file systems and servers, and vendor-specific optimisations. We therefore need to choose a database with two figures in mind: the overall volume of data the application will have generated at any given time, and the size of the data retrieved from the database.
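A rough back-of-the-envelope estimate can help when sizing. The sketch below multiplies daily row count, average row size, and retention period, then applies an overhead factor for indexes and storage-engine bookkeeping. The function name, its parameters, and the default overhead of 1.5 are all assumptions for illustration; real overhead varies widely by engine and schema.

```python
def estimate_storage_gib(rows_per_day, avg_row_bytes, retention_days,
                         overhead_factor=1.5):
    """Estimate stored volume in GiB (hypothetical helper).

    overhead_factor is an assumed multiplier for indexes and engine
    overhead; measure it for a real workload rather than trusting 1.5.
    """
    raw_bytes = rows_per_day * avg_row_bytes * retention_days
    return raw_bytes * overhead_factor / 2**30


# One million 1 KiB rows per day, kept for a year.
print(round(estimate_storage_gib(1_000_000, 1024, 365), 1))
```

With these example inputs the estimate comes to roughly 522 GiB, which already hints at whether a single-node database is comfortable or whether partitioning should be considered.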

Speed and scalability

This factor concerns how quickly we need to read data from and write data to the database, that is, the time taken to service all incoming reads and writes to our application. Some databases are designed to optimise read-heavy applications, while others are designed to support write-heavy workloads. Selecting a database that can handle our application's input/output needs goes a long way towards a scalable architecture.
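One simple way to characterise a workload is to compare its read and write rates. The sketch below labels a workload as read-heavy, write-heavy, or balanced. The function name and the threshold of 3.0 are assumptions for illustration only; there is no standard cut-off, so pick one that suits your own measurements.

```python
def read_write_profile(reads_per_sec, writes_per_sec, heavy_threshold=3.0):
    """Classify a workload by its read/write ratio (hypothetical helper).

    heavy_threshold is an assumed cut-off: one side must outnumber the
    other by at least this factor to count as "heavy".
    """
    if reads_per_sec < 0 or writes_per_sec < 0:
        raise ValueError("rates must be non-negative")
    if reads_per_sec == 0 and writes_per_sec == 0:
        return "idle"
    if writes_per_sec == 0 or reads_per_sec / writes_per_sec >= heavy_threshold:
        return "read-heavy"
    if reads_per_sec == 0 or writes_per_sec / reads_per_sec >= heavy_threshold:
        return "write-heavy"
    return "balanced"


print(read_write_profile(900, 100))
print(read_write_profile(50, 400))
print(read_write_profile(100, 120))
```

A read-heavy result might point towards read replicas or caching, while a write-heavy result might point towards a store optimised for ingest.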

Accessibility of data

The number of users concurrently accessing the database and the amount of computation involved in accessing specific data are also important factors. If the chosen database cannot handle heavy load, the processing speed of the application suffers.

Data modelling

Data modelling maps our application's features onto the data structures we will need to implement. Starting with a conceptual model, we can identify the entities, their attributes, and the relationships between entities. As we go through the process, the types of data structure needed to implement the application become more apparent. We can then use these structural considerations to select the category of database that will serve our application best.
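A conceptual model can be sketched in code before any database is chosen. The example below uses Python dataclasses for two entities and the relationship between them. The Student, Course, and Enrollment names, and the helper function, are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Student:          # entity with two attributes
    student_id: int
    name: str


@dataclass(frozen=True)
class Course:           # entity with two attributes
    course_id: str
    title: str


@dataclass(frozen=True)
class Enrollment:       # many-to-many relationship between the entities
    student_id: int
    course_id: str


def courses_for(student_id: int, enrollments: List[Enrollment],
                courses: List[Course]) -> List[str]:
    """Return the titles of the courses a student is enrolled in."""
    by_id = {c.course_id: c for c in courses}
    return [by_id[e.course_id].title
            for e in enrollments if e.student_id == student_id]


courses = [Course("DB101", "Databases"), Course("NW201", "Networks")]
enrollments = [Enrollment(1, "DB101"), Enrollment(1, "NW201"),
               Enrollment(2, "DB101")]
print(courses_for(1, enrollments, courses))
```

Here the many-to-many relationship surfaces as its own Enrollment entity, which is the kind of structural signal that might steer the choice towards a relational or graph database rather than a plain key/value store.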

Scope for multiple databases

During the modelling process, we may realise that the data structure we need for storing our data leaves certain queries that cannot be fully optimised. Reasons include complex search requirements, the need for robust reporting capabilities, or the requirement for a data pipeline to accept and analyse incoming data. In such situations, the application may need more than one type of database. When using more than one database, it is important to select one database that owns any specific set of data. This database acts as the canonical database for those entities. Any additional databases that work with the same data may hold a copy, but are not considered its owner.
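The ownership rule can be sketched with two in-memory stand-ins: a dictionary acting as the canonical store and an inverted index acting as a secondary search database. The class and method names are hypothetical. Rebuilding the whole index on every write is a deliberate simplification to show that the copy can always be regenerated from the owner; a real system would more likely update the index incrementally or asynchronously.

```python
class ProductCatalog:
    """Canonical store plus a derived search index (illustrative sketch)."""

    def __init__(self):
        self._canonical = {}     # owner of the data: product_id -> title
        self._search_index = {}  # derived copy: word -> set of product ids

    def save(self, product_id, title):
        # Writes always go to the canonical store first.
        self._canonical[product_id] = title
        self.rebuild_index()

    def rebuild_index(self):
        # The index is derived entirely from the canonical store,
        # so it can be discarded and rebuilt at any time.
        index = {}
        for pid, title in self._canonical.items():
            for word in title.lower().split():
                index.setdefault(word, set()).add(pid)
        self._search_index = index

    def search(self, word):
        # The index finds ids; the canonical store supplies the data.
        ids = self._search_index.get(word.lower(), set())
        return sorted((pid, self._canonical[pid]) for pid in ids)


catalog = ProductCatalog()
catalog.save(1, "Red Shoes")
catalog.save(2, "Red Hat")
print(catalog.search("red"))
```

Because the search index never accepts writes of its own, any disagreement between the two stores is resolved in favour of the canonical one.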

Safety and security of data

We should also check the level of security a database provides for the data stored in it. Where the data is highly confidential, we need a highly secure database. The safety measures the database implements in case of a system crash or failure are also a significant factor when choosing a database.

Others

  1. Understand the data structure(s) you require, the amount of data you need to store/retrieve, and the speed/scaling requirements
  2. Model your data to determine if a relational, document, columnar, key/value, or graph database is most appropriate for your data
  3. During the modeling process, consider things such as the ratio of reads-to-writes, along with the throughput you will require to satisfy reads and writes
  4. Consider the use of multiple databases to manage data under different contexts/usage patterns
  5. Always use a master database to store and retrieve canonical data, with one or more additional databases to support additional features such as searching, data pipeline processing, and caching
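The caching case in point 5 is often handled with a cache-aside pattern, sketched below under stated assumptions. The class name, the plain dictionaries standing in for the master database and the cache, and the read counter are all hypothetical; the sketch only shows reads falling back to the master on a miss and writes invalidating the cached copy.

```python
class CachedReader:
    """Cache-aside reads in front of a master store (illustrative sketch)."""

    def __init__(self, primary):
        self._primary = primary  # stand-in for the master database
        self._cache = {}         # stand-in for a caching database
        self.primary_reads = 0   # counts how often the master is hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the master and remember the result.
        self.primary_reads += 1
        value = self._primary[key]
        self._cache[key] = value
        return value

    def put(self, key, value):
        # The master is written first; the cached copy is invalidated
        # so the next read fetches the canonical value.
        self._primary[key] = value
        self._cache.pop(key, None)


reader = CachedReader({"a": 1})
reader.get("a")
reader.get("a")
print(reader.primary_reads)
```

The second read is served from the cache, so the master is only queried once until the value changes.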