Deploying a 'warm-standby' environment for Metric Insights
This article covers how to deploy a warm-standby environment for Metric Insights and the operational requirements to maintain such an environment. By 'warm-standby' we mean an active/passive high-availability deployment. For any additional questions, please contact [email protected].
A 'warm-standby' deployment consists of an active primary application server and an inactive secondary application server. The inactive server essentially serves as a standby server to be promoted to active in the event the primary goes down. Both servers maintain their own Metric Insights database locally.
In the diagram above, notice the Active and the Standby servers mirror each other in terms of architecture:
- Both servers consist of a basic LAMP stack (Linux, Apache, MySQL, PHP) to support the Metric Insights application
- Metric Insights is installed to the default /opt/mi location (this location can be changed)
- MySQL is running locally for the Metric Insights database
- All data files (json, png, etc.) sit in /opt/mi/iv/data
Two key differences distinguish the Active server from the Standby server:
- Only the primary serves user sessions
- Only the primary has cron enabled (the mi-cron utility)
This means all user traffic is directed to the primary server and all Metric Insights jobs run on the primary server only. Jobs include data collections and distributions.
To deploy a warm-standby environment, you need at the very least a second server to serve as a 'mirror' for the primary server. The servers should have the same hardware and software specs:
- Same number of CPU cores
- Identical RAM
- Identical Storage
- Identical Linux OS
- Identical Linux kernel
- Identical Apache, MySQL, PHP installations
- Identical MySQL parameters
- Identical Apache, Syslog, Cron configurations
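One way to check the list above is to print a short fingerprint on each server and compare the results. The following is a minimal sketch, assuming a Linux host; the function name host_fingerprint is hypothetical and not part of Metric Insights, and it covers only a few of the items (cores, kernel, RAM, OS, and tool versions).

```shell
#!/bin/sh
# Print a short fingerprint of this host so two servers can be compared.
# Assumption: a Linux host; a probe is skipped if its file or tool is missing.
host_fingerprint() {
    echo "cpu_cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)"
    echo "kernel=$(uname -r)"
    if [ -r /proc/meminfo ]; then
        echo "mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
    fi
    if [ -r /etc/os-release ]; then
        # Source the file inside a subshell so its variables do not leak out.
        echo "os=$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")"
    fi
    # Record versions only for tools that are installed.
    # Note: mysql --version reports the client binary, not the running server.
    for tool in mysql php; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "${tool}_version=$("$tool" --version 2>/dev/null | head -n 1)"
        fi
    done
}
```

Saving the output of host_fingerprint to a file on each server and running diff on the two files highlights any mismatch. Configuration files (MySQL parameters, Apache, Syslog, Cron) still need to be compared separately.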
On the active server, install Metric Insights with mi-cron enabled.
On the standby server, install Metric Insights with mi-cron disabled.
mi-cron enable
mi-cron disable
On the active primary server, create scheduled backups of Metric Insights using the mi-app-backup utility. Make sure to create full backups (full backups include the database and data files). Please see http://help.metricinsights.com/m/MI_System_Maintenance/l/104530-backup-your-metric-insights-instance for more.
As soon as the backup file is created on the active server, copy the backup to the standby server. There are a number of ways to do this including:
- Using a tool like rsync to copy the backup file (as seen in the diagram)
- Saving the backup file to a shared network drive (mounted on both the active and standby servers)
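The rsync option can be sketched as follows. This is an illustration only: BACKUP_DIR and STANDBY_TARGET are hypothetical values, the function names are made up for this example, and the directory that mi-app-backup actually writes to must be substituted.

```shell
#!/bin/sh
# Hypothetical locations; replace with the directory mi-app-backup writes to
# and the host:path of the standby server.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/mi}"
STANDBY_TARGET="${STANDBY_TARGET:-standby.example.com:/var/backups/mi/}"

# Print the path of the most recently modified entry in directory $1.
# Returns non-zero if the directory is empty or missing.
latest_backup() {
    newest=$(ls -t "$1" 2>/dev/null | head -n 1)
    [ -n "$newest" ] && echo "$1/$newest"
}

# Copy the newest backup file to the standby server.
sync_latest_backup() {
    file=$(latest_backup "$BACKUP_DIR") || {
        echo "no backup found in $BACKUP_DIR" >&2
        return 1
    }
    # -a preserves permissions and timestamps;
    # --partial keeps a partially transferred file so a retry can resume.
    rsync -a --partial "$file" "$STANDBY_TARGET"
}
```

Calling sync_latest_backup right after the backup job finishes keeps the copy on the standby server as fresh as the backup itself. The sketch assumes the backup directory contains only backup files and that SSH access from the active server to the standby server is already configured.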
On the standby server, create scheduled restores of Metric Insights using the mi-app-restore utility. Make sure the utility restores the latest backup from the active server. Please see http://help.metricinsights.com/m/MI_System_Maintenance/l/104531-restore-your-metric-insights-instance for more.
On restore completion, make sure to disable cron by running mi-cron disable. This ensures that triggers do not collect data on the standby server and that notification schedules do not deliver content from it.
A simple shell script can be created to both run the restore and disable cron on restore completion.
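A minimal sketch of such a script is shown here. RESTORE_CMD is a placeholder: the arguments that mi-app-restore needs are described in the restore article linked above and are not reproduced in this example. The function name and both variables are hypothetical.

```shell
#!/bin/sh
# Placeholder: set RESTORE_CMD to the full mi-app-restore invocation
# (with its arguments) as described in the restore article.
RESTORE_CMD="${RESTORE_CMD:-mi-app-restore}"
CRON_DISABLE_CMD="${CRON_DISABLE_CMD:-mi-cron disable}"

restore_and_disable_cron() {
    status=0
    # Left unquoted on purpose so a command with arguments is split into words.
    $RESTORE_CMD || status=$?
    if [ "$status" -ne 0 ]; then
        echo "restore failed with exit code $status" >&2
    fi
    # Disable cron even when the restore fails, so the standby
    # server never runs triggers or notification schedules.
    $CRON_DISABLE_CMD || return 1
    return "$status"
}
```

Disabling cron regardless of the restore result is a design choice in this sketch: a failed restore is still reported through the exit code, but the standby server is never left with jobs enabled.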
- Our general recommendation is to create backups daily. The standby server will therefore be up to a day behind the active server. Should the standby server have to be promoted to active, running the data collections for the day brings the data up to date.
- In the event the active server goes down and users have created new content since the most recent backup, then the new content will be lost when the standby server is promoted.
- Should the standby server be promoted to active, make sure to enable mi-cron so data collections and distributions can run.
- If the server hostname is mapped directly to the active server (active server's IP), and the active goes down, network administrators must remap the hostname to the standby server (DNS update).
- If using a Load Balancer instead, you can configure the LB to auto-detect server failure and auto-switch to the standby server.
- The active and standby servers cannot both serve user sessions at the same time. In other words, the standby server must truly be on standby.
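The promotion step described in the notes above can be sketched as a small script run on the standby server. This is an assumption-laden illustration, not a supported failover tool: HEALTH_URL is a hypothetical address for the active server, the function names are made up, and only mi-cron enable comes from this article. Redirecting user traffic (DNS update or load balancer) remains a separate step.

```shell
#!/bin/sh
# Hypothetical address of the active server; replace with the real one.
HEALTH_URL="${HEALTH_URL:-https://mi-active.example.com/}"
CRON_ENABLE_CMD="${CRON_ENABLE_CMD:-mi-cron enable}"

# Succeeds if the URL in $1 answers without an HTTP error within 10 seconds.
active_is_up() {
    # -f: treat HTTP errors as failure, -s: no progress output
    curl -fs --max-time 10 -o /dev/null "$1"
}

# Enable cron so data collections and distributions run on this server.
promote_standby() {
    $CRON_ENABLE_CMD
}

failover_if_down() {
    if active_is_up "$HEALTH_URL"; then
        echo "active server is responding; nothing to do"
    else
        echo "active server is not responding; promoting standby"
        promote_standby
    fi
}
```

A single failed probe is a weak signal, so in practice it is safer to require several consecutive failures, or a manual confirmation, before calling promote_standby.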
For any additional questions, please contact [email protected].