MOR-PLN-039 Version 1 Last Review Date: Nov, 2024
Failure Scenarios
1. Failure Scenario List
| Failure ID | Failure Scenario | Description | Confirmation Method | Recovery Procedure | MoreCore Restoration Time |
|---|---|---|---|---|---|
| 1 | Azure Down (Global) | Microsoft Azure is down globally | Platform down. Can't sign in to portal | Plan | N/A |
| 2 | Azure Primary Region Down | When Australia South East is unavailable | | Plan | N/A |
| 3 | Azure Secondary Region Down | When Australia East is unavailable | | Plan | N/A |
| 4 | Azure Availability Zone Down | | | Plan | |
| 5 | Azure Specific Resource Corrupted | When a particular resource, e.g. a VM, is not available or not operating as expected | | Plan | |
| 5A | Azure Specific Resource Failure - App Service | CMMS is not available | | Plan | |
| 5B | Azure Database Corruption | CMMS is available but some functionality is not working | | Plan | > 12 hours |
| 5C | Azure Database Resource Failure | CMMS is not available | | Plan | > 30 seconds |
| 6 | Azure Service Down | | | Plan | |
| 7 | App Code Failure | CMMS is working but not as intended and/or error messages are occurring in the log | | Plan | |
2. Failure Verification Procedures
Failure ID 1 - Azure Down (Global)
Assessment Method
Attempts to reach MoreCore environments fail
Azure status monitor indicates a widespread or complete outage
Check Azure Status via https://status.azure.com/
Attempts to sign in to portal.azure.com fail
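The reachability checks above can be partly scripted. The following is a minimal sketch, not a prescribed procedure: the URL in `MORECORE_URLS` is a hypothetical placeholder that must be replaced with the real MoreCore environment addresses, and the helper names are illustrative. It treats "no HTTP response at all" as unreachable and reports whether every probed endpoint is unreachable, which is consistent with (but does not prove) a global outage. The Azure status page should still be checked manually.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, Optional

# Hypothetical placeholders: substitute the real MoreCore environment URLs.
MORECORE_URLS = ["https://cmms.example.invalid/"]
PORTAL_URL = "https://portal.azure.com"


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def all_unreachable(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[int]] = http_status,
) -> bool:
    """True only if at least one URL was probed and none of them responded."""
    results = [fetch(url) for url in urls]
    return bool(results) and all(status is None for status in results)


if __name__ == "__main__":
    targets = MORECORE_URLS + [PORTAL_URL]
    if all_unreachable(targets):
        print("No endpoint responded: consistent with Failure ID 1.")
    else:
        print("At least one endpoint responded: a global outage is unlikely.")
```

The `fetch` parameter exists so the decision logic can be exercised without network access, for example during a DR rehearsal.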
Recovery Plan
Failure ID 2 - Azure Primary Region Down
Assessment Method
A notification has been received from Azure Service Health
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Azure logs / Service Health indicate that the region is unavailable
CMMS is not working as expected
Certain functionality is not working as expected
Recovery Plan
Failure ID 3 - Azure Secondary Region Down
Assessment Method
A notification has been received from Azure Service Health
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Azure logs / Service Health indicate that the region is unavailable
CMMS is working as expected
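Failure IDs 2 and 3 share the same evidence and differ only in which region is reported unavailable and whether CMMS is affected. As an illustration only, the sketch below maps a reported region name to a Failure ID using the region roles given in the Failure Scenario List. The function and constant names are hypothetical, and the name normalisation (lowercasing and dropping spaces) is an assumption made so that "Australia South East", "Australia Southeast" and "australiasoutheast" compare equal.

```python
from typing import Optional

# Region roles as listed in the Failure Scenario List.
PRIMARY_REGION = "Australia South East"
SECONDARY_REGION = "Australia East"

# Expected CMMS behaviour per the assessment checklists for IDs 2 and 3.
CMMS_EXPECTED_TO_WORK = {2: False, 3: True}


def _normalize(region: str) -> str:
    """Lowercase and drop whitespace so spelling variants compare equal."""
    return "".join(region.lower().split())


def classify_region_outage(unavailable_region: str) -> Optional[int]:
    """Return Failure ID 2 or 3 for a region outage, or None for other regions."""
    region = _normalize(unavailable_region)
    if region == _normalize(PRIMARY_REGION):
        return 2
    if region == _normalize(SECONDARY_REGION):
        return 3
    return None


if __name__ == "__main__":
    failure_id = classify_region_outage("Australia Southeast")
    print(failure_id, CMMS_EXPECTED_TO_WORK.get(failure_id))
```

If the observed CMMS behaviour does not match the expectation for the returned ID, treat that as a prompt to re-check the other scenarios rather than as a confirmed diagnosis.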
Recovery Plan
Failure ID 5 - Azure Specific Resource Corrupted
Assessment Method
Check the Azure Portal for any alerts or error messages
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Check Azure logs / Service Health
CMMS is NOT working as expected
Certain functionality is not working
Recovery Plan
Failure ID 5A - Azure Specific Resource Failure - App Service
Assessment Method
Check the Azure Portal for any alerts or error messages
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Check Azure logs / Service Health
CMMS is NOT reachable
The most likely symptom is an HTTP 400 response
Recovery Plan
Failure ID 5B - Azure Database Corruption
Assessment Method
Check the Azure Portal for any alerts or error messages
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Check Azure logs / Service Health
CMMS is NOT reachable
The most likely symptom is an HTTP 400 response
Run Integrity Checks (for SQL Databases)
- DBCC CHECKDB: If you're using Azure SQL Database (or SQL Server in general), run the DBCC CHECKDB command to check the database for corruption. This command checks the logical and physical integrity of all the objects in the database:
  DBCC CHECKDB('YourDatabaseName');
  If any corruption is found, DBCC CHECKDB will provide error messages and, in some cases, suggest repair options.
- Automatic Fix: In cases where corruption is detected in SQL databases, you can use the DBCC CHECKDB command with the REPAIR_ALLOW_DATA_LOSS option, but this should only be done as a last resort because it may result in data loss:
  DBCC CHECKDB('YourDatabaseName', REPAIR_ALLOW_DATA_LOSS);
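When the integrity check is run from a script, its text output needs to be interpreted before deciding whether a repair or restore is required. The sketch below is a hedged example: it assumes the database-level summary line has the form "CHECKDB found N allocation errors and M consistency errors in database 'name'", which should be verified against the actual output in your environment. The names `CheckDbSummary` and `parse_checkdb_summary` are illustrative. It does not connect to the database; it only parses output that has already been captured.

```python
import re
from typing import NamedTuple, Optional


class CheckDbSummary(NamedTuple):
    allocation_errors: int
    consistency_errors: int
    database: str

    @property
    def is_clean(self) -> bool:
        """True when the summary reports no errors of either kind."""
        return self.allocation_errors == 0 and self.consistency_errors == 0


# Assumed format of the database-level summary line (verify before relying on it).
_SUMMARY = re.compile(
    r"CHECKDB found (\d+) allocation errors? and (\d+) consistency errors? "
    r"in database '([^']*)'"
)


def parse_checkdb_summary(output: str) -> Optional[CheckDbSummary]:
    """Return the last database-level summary found in output, or None."""
    matches = _SUMMARY.findall(output)
    if not matches:
        return None  # no summary line: the check may not have completed
    allocation, consistency, database = matches[-1]
    return CheckDbSummary(int(allocation), int(consistency), database)


if __name__ == "__main__":
    captured = (
        "CHECKDB found 0 allocation errors and 0 consistency errors "
        "in database 'YourDatabaseName'."
    )
    summary = parse_checkdb_summary(captured)
    print(summary, summary.is_clean if summary else "no summary found")
```

A missing summary is reported as `None` rather than as "clean", so an interrupted check is not mistaken for a healthy database.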
Recovery Plan
Failure ID 5C - Azure Database Resource Failure
Assessment Method
Check the Azure Portal for any alerts or error messages
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Check Azure logs / Service Health
CMMS is NOT reachable
The most likely symptom is an HTTP 400 response
Recovery Plan
Failure ID 6 - Azure Service Down
Assessment Method
Check the Azure Portal for any alerts or error messages
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Check Azure logs / Service Health
CMMS is reachable but not all functionality is working
Look in the browser console and see whether errors are present
Recovery Plan
Link TBC
Failure ID 7 - App Code Failure
Assessment Method
Check the Azure Portal for any alerts or error messages
Azure status monitor indicates an outage
Check Azure Status via https://status.azure.com/
Check Azure logs / Service Health
CMMS is NOT working as expected
Typical symptom: a button is pressed and nothing appears to happen.
- If you look in the browser's debug section, you may see errors in the console