I designed internal monitoring of machine and application availability, combined with tiered paging, which allowed our small staff to cover 24x7 operations and to intervene, perform maintenance, or restart services from remote locations.
The design involved DNS servers at geographically distant locations, hosted on different network providers.
Daily data backups to tape and replication of data sets to different locations made fast restarts from another site possible in case of local damage from events such as fire, earthquake, or regional loss of power.
The design also included redundant routers, load balancers, and switches configured for active failover.
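
As an illustration of the tiered paging mentioned above, the following is a minimal sketch of how an escalation policy might be expressed. The tier names, contact labels, and timeouts are hypothetical and are not taken from the original system. It shows only the escalation logic: each tier gets a window in which to acknowledge an alert before the next tier is paged.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    contacts: list[str]
    ack_timeout_min: int  # minutes this tier has to acknowledge before escalation


# Hypothetical escalation policy; names and timeouts are illustrative only.
TIERS = [
    Tier("primary on-call", ["oncall-primary"], 10),
    Tier("secondary on-call", ["oncall-secondary"], 15),
    Tier("manager", ["ops-manager"], 30),
]


def tier_to_page(minutes_unacknowledged: float, tiers: list[Tier] = TIERS) -> Tier:
    """Return the tier responsible for an alert that has gone
    unacknowledged for the given number of minutes."""
    elapsed = 0
    for tier in tiers:
        elapsed += tier.ack_timeout_min
        if minutes_unacknowledged < elapsed:
            return tier
    # Past every window: keep paging the last tier.
    return tiers[-1]


if __name__ == "__main__":
    for minutes in (0, 12, 40):
        print(minutes, "->", tier_to_page(minutes).name)
```

With a policy like this, a small staff needs only one person actively on call at a time, and an unanswered page still reaches someone within a bounded delay.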