Disaster Recovery & High Availability for AGRESSO

Protecting your data and ensuring maximum system availability can be broken down into two distinct areas:

High Availability and Disaster Recovery

High Availability
This deals with how long the system will be unavailable to the end users if a server goes offline.

 

Disaster Recovery

This is possibly the more acute consideration for most AGRESSO customers. How much data would you lose if a server suffered a hardware or software failure?

 

Losing a server

Take the following scenario. At three o'clock in the afternoon on any given day, a problem on one of the live servers makes it inoperable. Two questions immediately spring to mind:

 

  • Have you lost any data?
  • How long will the system be unavailable for?

 

The answers to these two questions depend on which server is offline. For most AGRESSO users, the typical answers are as follows.

 

Loss of Database Server
This will nearly always be the worst scenario that could occur. The system is completely down and no one is able to perform any work on AGRESSO. More importantly, you may have suffered data loss.

 

In a typical setup, a full backup of the AGRESSO database is performed each night to the local disk, and this is then backed up to tape. Assuming that last night's tape backup worked, the data loss would be everything that has been carried out on the system in the current working day.
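The size of that loss can be estimated with simple date arithmetic. The sketch below is illustrative only: the function name and the backup completion time of 23:00 are assumptions, and the 15:00 failure time is taken from the scenario above.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup: datetime, failure_time: datetime) -> timedelta:
    """Return the span of work that cannot be restored from the last good backup."""
    if failure_time < last_good_backup:
        raise ValueError("failure_time must not be earlier than last_good_backup")
    return failure_time - last_good_backup


# Hypothetical times: the nightly backup finished at 23:00 and the
# database server failed at 15:00 the following afternoon.
last_backup = datetime(2024, 1, 8, 23, 0)
failure = datetime(2024, 1, 9, 15, 0)

window = data_loss_window(last_backup, failure)
print(f"Work at risk: {window.total_seconds() / 3600:.1f} hours")
```

If last night's backup turns out to be unusable, pass the time of the previous good backup instead and the window grows by a full day.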

 

For sites using Oracle, it may not be as bad as for MS SQL Server users. Oracle should be archiving the used redo logs. Depending on where the archived redo logs are being copied to and the nature of the disaster, you may not lose all of the current day's work.

 

If the AGRESSO database server is lost, the recovery process would be as follows:

  • Replace server
  • Install and patch Windows
  • Configure server on the network
  • Install Microsoft SQL Server or Oracle
  • Apply service packs to Microsoft SQL Server or patch Oracle (if required)
  • Rebuild and restore databases
  • Configure database security and performance settings
  • Set up the overnight jobs and ensure that they run correctly
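One way to turn the steps above into a downtime estimate is to attach a duration to each and add them up. The sketch below assumes the steps run one after another; every duration is a placeholder invented for illustration and should be replaced with figures measured at your own site.

```python
# Placeholder durations in hours. These are NOT measured values.
recovery_steps = [
    ("Replace server", 4.0),
    ("Install and patch Windows", 2.0),
    ("Configure server on the network", 0.5),
    ("Install Microsoft SQL Server or Oracle", 1.0),
    ("Apply service packs or patches", 1.0),
    ("Rebuild and restore databases", 3.0),
    ("Configure database security and performance settings", 1.0),
    ("Set up and check the overnight jobs", 0.5),
]


def total_recovery_hours(steps):
    """Sum the estimated duration of every recovery step."""
    return sum(hours for _, hours in steps)


def cumulative_timeline(steps):
    """Return (step name, hours elapsed once that step has finished) pairs."""
    elapsed = 0.0
    timeline = []
    for name, hours in steps:
        elapsed += hours
        timeline.append((name, elapsed))
    return timeline


for name, elapsed in cumulative_timeline(recovery_steps):
    print(f"{elapsed:5.1f} h  {name}")
print(f"Estimated total downtime: {total_recovery_hours(recovery_steps):.1f} hours")
```

Timing each step during a rehearsal on spare hardware gives far more reliable figures than guesses.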

 

Loss of Business Server
No data is stored on the Business Server. What you may lose are report templates, UK Products and possibly documentation. However, the files on the server are fairly static and should be retrievable from a tape backup.

 

The immediate impact of losing the Business Server is that no one will be able to log into the AGRESSO back office client. The system is built so that the Business Server hosts the AGRESSO central client and users run the program centrally. Exceptions may include customers who use Citrix (although not in every case) or who have built some redundancy into their environment.

 

The next question is: apart from the AGRESSO Web interface, how long will the AGRESSO system be unavailable? Will this be minutes, hours or days?

 

If the server requires replacing, is there a replacement readily available, or will the hardware vendor have to replace the server? If so, how long will this take, and how soon can Windows be installed, configured and patched?

 

This is a fairly standard part of disaster recovery policy and the I.T. department should be able to estimate it accurately. Depending on the I.T. department and the support contract with the hardware vendor, it should typically be hours rather than days.

 

Installing, configuring and patching AGRESSO on the Business Server may be beyond the knowledge available at a large percentage of sites. It is unreasonable to expect someone who may have had one day's formal training on the AGRESSO system to install, configure and patch it, set up the central client, install the UK Products and set up the correct shared folders and printers.

 

In an ideal scenario it may be possible to restore the relevant data from tape backup. Has this been tried? Would you know how long the process would take and what length of time the system would be offline?


Loss of Web Server
A lot of AGRESSO customers have more than one Web Server, set up in a load-balanced configuration. This normally takes the form of Microsoft Network Load Balancing (NLB) or DNS Round Robin. Some configuration changes may be required to stop problems occurring when AGRESSO Web users try to connect. However, if you have more than one AGRESSO Web Server, the loss of one will not stop people from working.
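To illustrate why a configuration change may be needed, the sketch below models a simple rotation across a pool of Web Servers. It is a simplified model only, not how Microsoft NLB or DNS is implemented, and the host names are hypothetical. While the failed server stays in the rotation, a share of connection attempts still land on it. Once it is removed, all attempts reach a working server.

```python
from itertools import cycle, islice


def distribute(servers, request_count):
    """Assign requests to servers in strict rotation, with no health checking."""
    return list(islice(cycle(servers), request_count))


def failed_requests(assignments, down):
    """Return the assignments that landed on a server that is offline."""
    return [server for server in assignments if server in down]


pool = ["web1", "web2"]  # hypothetical host names
down = {"web1"}          # web1 has failed

# Before the configuration change: web1 is still in the rotation.
before = distribute(pool, 4)
print(len(failed_requests(before, down)))  # 2 of the 4 attempts hit web1

# After the change: web1 has been removed from the rotation.
after = distribute([s for s in pool if s not in down], 4)
print(len(failed_requests(after, down)))   # 0
```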

 

Solutions
For AGRESSO database servers there is a range of high availability and disaster recovery solutions that can be put into place. These range from efficient changes which utilise the existing hardware, through to Server Clustering, Database Log Shipping or Database Mirroring (Microsoft SQL Server). The requirements for high availability and disaster recovery will vary, and largely depend on three things:

 

  • How much data loss is acceptable?
  • How long can the AGRESSO system remain unavailable?
  • How much investment is available to protect against it?
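These three questions can be written down as explicit limits and used to screen candidate solutions. The sketch below is one possible way of doing so. The class names, the generic option names and every figure are hypothetical placeholders, and none of the figures describes a real product.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    max_data_loss_hours: float  # how much data loss is acceptable
    max_downtime_hours: float   # how long the system can remain unavailable
    budget: float               # how much investment is available


@dataclass
class Option:
    name: str
    expected_data_loss_hours: float
    expected_downtime_hours: float
    cost: float


def acceptable_options(req, options):
    """Return the names of the options that satisfy all three limits."""
    return [
        option.name
        for option in options
        if option.expected_data_loss_hours <= req.max_data_loss_hours
        and option.expected_downtime_hours <= req.max_downtime_hours
        and option.cost <= req.budget
    ]


# Placeholder figures for illustration only.
requirements = Requirements(max_data_loss_hours=1.0, max_downtime_hours=4.0, budget=20000.0)
candidates = [
    Option("Option A", expected_data_loss_hours=16.0, expected_downtime_hours=24.0, cost=0.0),
    Option("Option B", expected_data_loss_hours=0.5, expected_downtime_hours=2.0, cost=15000.0),
    Option("Option C", expected_data_loss_hours=0.0, expected_downtime_hours=0.1, cost=50000.0),
]
print(acceptable_options(requirements, candidates))
```

With these placeholder figures only "Option B" passes: "Option A" loses too much data and takes too long to recover, and "Option C" exceeds the budget.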