Page Properties

Activity

OPERATIONS

Department

SALES & MARKETING
ENGINEERING
CUSTOMER SUPPORT
SAST

Process

AWS Region outage

Table of Contents

...

📝 Description

🎯 Objective

To respond to an AWS region outage impacting Clinical Brain's infrastructure, ensuring rapid service restoration, minimal operational disruption, and clear communication with stakeholders. The goal is to maintain business continuity during such incidents.

...

This procedure is needed to keep Clinical Brain's services running without interruption. It ensures that our systems can recover quickly from an AWS region outage, avoiding prolonged downtime and keeping operations efficient. This aligns with our business goal of providing a reliable and consistent service to our users.

🧭 Definitions

N/A

...

🗒️ Activity list

  1. Incident identification

  2. Stakeholder communication

  3. Automated recovery process initiation

  4. DNS entry update for Disaster Recovery

  5. Service restoration verification

  6. Post-incident review

...

Activity descriptions

Activity #1 - Incident identification

...

Description

Initiate automated processes for disaster recovery

Resources

Responsible

ENGINEERING

Substitute

TODO

Step by step

  1. Go to Clinical Brain tags

  2. Look through the list of tags to find the one with the highest version number. This tag represents the version of the infrastructure currently running in production

  3. Go to Clinical Brain branches

  4. Launch the Create a branch wizard by clicking the New branch button

    1. In the Name field, enter disaster-recovery/<major.minor.patch>. Replace <major.minor.patch> with the version numbers of the highest tag you identified earlier. For example, if the highest tag was 1.0.0, your branch name should be disaster-recovery/1.0.0

    2. In the Based on field, select the "tags" tab and choose the same tag you identified earlier as having the highest value. This step ensures that your new branch is based on the current production version

    3. Click on the Create button. This action will not only create the new disaster-recovery branch but also initiate a pipeline that automatically deploys the infrastructure to the disaster recovery region

  5. After initiating the deployment, go to Clinical Brain pipeline to monitor the progress

  6. Keep an eye on the pipeline, as the following error is expected to occur:

    • This is due to a credentials mismatch. When the RDS instance is restored from a production snapshot into the disaster recovery region, it retains the roles from the original database. Consequently, the database still uses those roles' credentials from production, while new credentials are generated and stored in the disaster recovery region's AWS Parameter Store. These outdated role credentials prevent the Lambda functions that interact with the database from authenticating.

  7. To fix the error, navigate to Amazon Web Services (AWS)

  8. Log in to the medicineone_clinicalbrain-prod account, utilizing the Disaster_Recovery_Permissions role

  9. Select the Paris region from the region selection menu

  10. Access the Parameter Store service

  11. Locate and open the parameter /databases_connection_strings/clinical_brain/clinical_brain_user

  12. Click on Show decrypted value to reveal its content

    image-20231222-103749.png
  13. Note down the Server value, crucial for connecting to the disaster recovery database

  14. Note down the Password value. You'll need this for updating the database credentials in an SQL script, the details of which will be provided in the subsequent steps

  15. Return to the Parameter Store service

  16. Open the parameter /databases_connection_strings/clinical_brain/lambda_user

  17. Click on Show decrypted value and record the Password. This, too, will be required for the SQL script mentioned later

  18. Again, in the Parameter Store, find and open the /databases_connection_strings/master_user parameter

  19. Click on Show decrypted value and note the displayed Credentials, needed for authenticating against the disaster recovery database

  20. Launch the pgAdmin software

  21. Right-click on Servers and navigate to Register → Server

    • In the General tab, enter clinical-brain-dr in the Name field

    • Switch to the Connection tab

    • In the Host name/address field, input the Server value you noted earlier

    • Use the Credentials from the Parameter Store for the Username and Password fields

    • Click on Save

  22. Expand the clinical-brain-dr server

  23. Right-click on the clinical_brain database and select Query Tool

  24. Paste the following script:

    Code Block
    ALTER ROLE clinical_brain WITH PASSWORD '<replace_by_clinical_brain_password>'; -- replace with the password obtained from /databases_connection_strings/clinical_brain/clinical_brain_user
    ALTER ROLE lambda WITH PASSWORD '<replace_by_lambda_password>'; -- replace with the password obtained from /databases_connection_strings/clinical_brain/lambda_user
    • Replace <replace_by_clinical_brain_password> with the password obtained earlier for /databases_connection_strings/clinical_brain/clinical_brain_user

    • Replace <replace_by_lambda_password> with the password obtained earlier for /databases_connection_strings/clinical_brain/lambda_user

  25. Now that the database credentials are updated, navigate to Clinical Brain pipeline

  26. Click the Run pipeline button

  27. Select the previously created disaster recovery branch in the Branch/tag field and click on Run

  28. Monitor the pipeline and wait for it to complete successfully

  29. Access Amazon Web Services (AWS) again

  30. Navigate to the API Gateway service

  31. In the left menu, select Custom domain names

  32. Find and select the custom domain name clinicalbrain.medicineone.cloud

  33. In the Configurations tab, locate the API Gateway domain name and take note of its value. This information will be provided to SAST for updating the DNS entry, which is detailed in the communication plan
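
Steps 1 to 4 pick the highest tag and derive the branch name from it. The following is a minimal Python sketch of that selection, assuming tags are plain major.minor.patch strings as in the 1.0.0 example. The function names are hypothetical and not part of the Clinical Brain tooling.

```python
import re

# Assumption: tags look exactly like "major.minor.patch" (e.g. "1.0.0").
# Anything else in the tag list is ignored.
_TAG_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag):
    """Return (major, minor, patch) as integers, or None if the tag is not a version."""
    match = _TAG_RE.match(tag.strip())
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def highest_tag(tags):
    """Pick the tag with the highest version, compared numerically rather than as text."""
    versioned = [(parse_version(tag), tag) for tag in tags]
    versioned = [(version, tag) for version, tag in versioned if version is not None]
    if not versioned:
        raise ValueError("no major.minor.patch tags found")
    return max(versioned)[1]


def disaster_recovery_branch(tags):
    """Build the branch name described in step 4.1 from the highest tag."""
    version = parse_version(highest_tag(tags))
    return "disaster-recovery/{}.{}.{}".format(*version)


if __name__ == "__main__":
    # Prints: disaster-recovery/1.10.0
    print(disaster_recovery_branch(["1.0.0", "1.2.0", "1.10.0", "release-notes"]))
```

Comparing versions as integer tuples matters here because a plain text sort would rank 1.2.0 above 1.10.0 and point the branch at the wrong tag.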
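
The script pasted in step 24 embeds two passwords inside single-quoted SQL literals, so a password that itself contains a single quote would break the statement. The following Python sketch shows one way to generate the script with the quotes escaped. It assumes the PostgreSQL default of standard_conforming_strings = on, where backslashes need no escaping. The helper names are hypothetical, and the passwords in the example are placeholders.

```python
def quote_literal(value):
    """Quote a string as a PostgreSQL literal by doubling embedded single quotes."""
    if "\x00" in value:
        raise ValueError("PostgreSQL strings cannot contain NUL bytes")
    return "'" + value.replace("'", "''") + "'"


def build_alter_role_script(clinical_brain_password, lambda_password):
    """Return the two ALTER ROLE statements from step 24 with the passwords filled in."""
    return "\n".join([
        "ALTER ROLE clinical_brain WITH PASSWORD {};".format(
            quote_literal(clinical_brain_password)),
        "ALTER ROLE lambda WITH PASSWORD {};".format(
            quote_literal(lambda_password)),
    ])


if __name__ == "__main__":
    # Placeholder values only; use the passwords noted from the Parameter Store.
    print(build_alter_role_script("example'pass", "another-example"))
```

The generated text can be pasted into the pgAdmin Query Tool in place of the hand-edited script. Because it contains credentials, it should not be saved to disk or shared in chat.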

Stakeholders

SALES & MARKETING
CUSTOMER SUPPORT
SAST
Customers

...