Understanding Exchange Server 2003 Data Storage

Exchange Server 2003 uses the Extensible Storage Engine (ESE) database structure to store data. Data can be stored separately for messages and for transactions:

  • Messages are stored in .edb and .stm database files. Database files also contain a number of other components, including:
    • Rules
    • Folders
    • Attachments
    • Indexes
  • Transactions are stored in transaction log files.

A message that is created is first written to the transaction log files before it is written to the database files. Transactions are written sequentially to numbered log files, and the data in the log files is written to the database at a later stage.

Transaction log files are written to in a sequential order. Database files are written to in a random manner. They are also read in a random manner. Exchange Server 2003 automatically creates a new transaction log file when the current log file reaches 5 megabytes (MB) in size.
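
To get a feel for how quickly sequential log files accumulate, the following minimal sketch estimates how many 5 MB log files a given volume of logged data fills. The helper name and the 1 GB example figure are illustrative assumptions; 1 MB is taken as 1,048,576 bytes, and per-file headers are ignored, so a real count would be slightly higher.

```python
import math

LOG_FILE_SIZE_BYTES = 5 * 1024 * 1024  # 5 MB per transaction log file (from the text)


def log_files_needed(transaction_bytes: int) -> int:
    """Return how many 5 MB log files are needed to hold the given data.

    Hypothetical helper: treats the log as a plain byte stream and
    ignores any per-file header or padding.
    """
    if transaction_bytes < 0:
        raise ValueError("transaction_bytes must not be negative")
    return math.ceil(transaction_bytes / LOG_FILE_SIZE_BYTES)


if __name__ == "__main__":
    one_gb = 1024 * 1024 * 1024
    print(log_files_needed(one_gb))  # prints 205
```

A calculation like this can help when sizing the log drive between full backups, since the log files are kept until a full online backup completes.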

The transaction log files and database files should be located on disk systems that have the following characteristics:

  • The disk systems are optimized for the type of functions to be performed.
  • The disk systems do not compete with each other for disk I/O.

The disk setup that you use for transaction log files and database files should provide the following:

  • Best cost.
  • Best data protection.
  • Best performance.

You can use redundant array of independent disks (RAID) systems in Exchange Server 2003. There are software and hardware RAID implementations:

  • Hardware-based RAID: Hardware RAID puts you in a better position because of the extra fault-tolerant RAID configurations, dynamic expansion capabilities, hot sparing for online failover, hot swapping of failed hard disks, and dedicated cache memory it provides. A RAID controller that maintains the RAID volumes is added to the server. Performance is improved because the processing overhead is carried by the RAID controller and not the operating system (OS). With hardware RAID arrays, the logical disk constructed by the RAID controller is seen by the OS as one large basic disk. Hardware RAID controllers are also better at detecting cascading sector failures. Sophisticated hardware RAID controllers support RAID 0+1, which combines the fault tolerance of mirroring with the speed of RAID 0. Hardware RAID solutions therefore provide faster disk I/O than the software fault tolerance provided by Windows Server 2003. You should opt for hardware RAID if you can afford it. Implement software RAID when you cannot justify the expense of a hardware RAID implementation.
  • Software-based RAID: Windows Server 2003 provides a software implementation of RAID to maintain data access when a single disk failure occurs. Data redundancy occurs when a computer writes data to more than one disk. This in turn protects data from a single hard disk failure. The distinction between software RAID and hardware RAID is that software RAID is put into operation solely through software and needs no special hardware. The main advantage of software RAID is its price, because it is included with the operating system. Through the Disk Management utility of Windows Server 2003, you can create striped volumes (RAID 0), mirrored volumes (RAID 1), or RAID 5 volumes (disk striping with parity). If you plan to implement a Windows Server 2003 RAID solution, bear in mind that no fault tolerance exists after a failure; the fault has to be fixed to restore fault tolerance. In addition, data has to be restored from a backup if a second failure occurs before you correct the initial fault.

The commonly used RAID implementations in Exchange Server 2003 are listed here:

  • RAID-0 Disk Striping: You can use disk striping without parity, RAID 0, if you want to utilize space on multiple disks and simultaneously improve read and write performance. In Windows Server 2003, a RAID 0 volume is known as a striped volume. RAID 0 provides no fault tolerance: the data in the entire volume is lost if a disk in a striped volume fails. To perform striping, you need at least 2 dynamic disks, and no more than 32 dynamic disks are allowed. You create a striped volume set from the free space on the disks. Each member disk of the striped volume set is split into equal-sized stripes. When you save data, it is spread across the stripes in a sequential manner, so data is added to all disks at the same rate and is never stored on a single member disk. Because of the simultaneous reads and writes to each disk in the stripe, RAID 0 provides excellent performance. Striped volumes are ideal where performance and a large storage area are important, for instance, digital media applications and Computer Aided Design (CAD), and where the data is not mission critical or is backed up on a regular basis. A RAID 0 array cannot be used for the boot partition or system partition because the operating system (OS) has to be loaded before the striped volume can be started. RAID-0 disk striping is not recommended for mission-critical environments.
  • RAID-1 Mirroring (Duplexing): A RAID 1 volume is known as a mirrored volume. Two disks take part in a mirrored volume. This configuration is also known as a mirror set. With mirroring, two copies of all data are written to separate volumes on two different disks. If one disk fails, the remaining disk in the mirror volume set has an identical copy of the data. RAID 1 is frequently used to protect the drive where the OS is situated. It is good practice to mirror the boot and system volume to ensure that you can boot the server in the event of a single drive failure. You only need two physical disks to create a mirror volume set. It is also possible to mirror an existing simple volume, thereby making it fault tolerant. Disk mirroring is simpler to use in Windows Server 2003 because you can create mirrored volumes without rebooting.

Disk mirroring provides almost the same fault tolerance as disk striping with parity (RAID-5). Mirrored disks, however, provide better write performance because parity information is not written. On the other hand, disk striping with parity (RAID 5) typically presents better read performance because read operations are distributed across multiple drives.
Disk operations can continue when one of the mirrored disks in a mirror set fails. Data in this case is written to the remaining disk in the mirror set. Mirrored sets are twice as expensive as RAID 0 because you need twice as much storage space as the data you have. For each disk you want to mirror, you need a second disk of the same size because data is written to two locations. This in turn leads to mirrored volumes having a high overhead.
A variation of RAID 1 is duplexing. With duplexing, a separate disk controller is used for each mirrored disk. Duplexing can further improve performance and increase fault tolerance.

  • RAID-0+1 Mirrored Stripes: Nested levels of RAID combine the single levels of RAID. Top of the range RAID controllers support nested RAID, such as RAID-0+1, which provides excellent speed and fault tolerance. RAID-0+1 is made up of two striped arrays which are mirrored. The benefits of using RAID-0+1 include the performance improvements provided by striping and the redundancy provided by mirrored disks. With RAID-0+1, a single disk failure results in the array becoming a RAID-0 array, so no data protection exists. When another disk fails on the same side of the mirror, your data is retained. Data is lost, though, when disks on both sides of the mirror fail. A RAID-0+1 array should be a hardware implementation that supports hot swapping capabilities.
  • RAID-5 Striping with Parity: RAID-5 uses disk striping with parity. RAID-5 needs at least 3 hard disks to implement fault tolerance. A maximum of 32 disks can be used in RAID-5 volumes. To form a RAID-5 volume, you specify portions of unallocated space on the disks and group them into a RAID-5 volume. This level of RAID is also called a striped set with parity. Fault tolerance ensures that a single drive failure does not result in the whole set being down. To enable fault tolerance, RAID-5 writes parity information with the blocks of data. Whenever data is written to a RAID-5 volume, it is written across all the striped disks in the volume, and parity information for the data is also written to disk. Parity information is written to a different disk from the one holding the matching data, so the original data and its parity information are never stored on the same disk. Each disk in the RAID-5 volume therefore holds a portion of the data and a portion of the parity information, which is used to reconstruct the contents of a lost disk. When a disk in the striped set fails, the RAID-5 set continues to function because the remaining disks deal with disk functions. On the other hand, when two disks in the RAID-5 volume set fail, the parity information is inadequate to recover the data. When this occurs, you have to rebuild the striped set and restore the data from backup.

A RAID-5 volume cannot be used for the boot partition or system partition. This is due to the operating system needing to load initially to start the RAID-5 volume. RAID-5 is slower than RAID 1 because of its calculation of parity information. This makes RAID 5 processor intensive. RAID-5 volumes however need less space for redundant data than what RAID 1 needs. RAID-5 works where data availability is essential.
It is recommended that you use a hardware-based RAID-5 implementation in an Exchange Server 2003 organization.
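
The parity mechanism described above can be illustrated with a small sketch. RAID-5 parity is typically a bytewise XOR of the data blocks in a stripe, which is what the example assumes; the function names and the sample blocks are hypothetical, and the sketch ignores how a real controller rotates parity across the disks.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks: list[bytes]) -> bytes:
    """Rebuild the single missing block of a stripe.

    The survivors are the remaining data blocks plus the parity block.
    """
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    data = [b"MAIL", b"BOX1", b"DATA"]  # one stripe spread over three disks
    parity = xor_blocks(data)           # parity block held on a fourth disk
    # Simulate losing the disk that held data[1].
    recovered = rebuild_missing([data[0], data[2], parity])
    print(recovered)  # prints b'BOX1'
```

With two blocks missing, the XOR of the survivors no longer equals either lost block, which is why a second disk failure forces a restore from backup.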

How to implement a Striped Volume (RAID 0):

  1. Open the Disk Management console.
  2. Right-click the unallocated space on the disk where you want to create the volume, and select New Volume to launch the New Volume Wizard. Click Next.
  3. Select Striped on the Select Volume Type window. Click Next.
  4. On the Select Disks window, select the disk(s) to include in the striped volume, and the amount of space to be used. Remember that the space used on each disk is limited to the smallest free space available on any chosen disk. Click Next.
  5. On the Assign Drive Letter or Path window, assign a drive letter or mount the volume to an empty NTFS folder. Click Next.
  6. On the Format Volume window, select a format (NTFS) for the volume, or select the Do not format this volume option. Click Next.
  7. The Completing the New Volume Wizard window displays the options you have selected.
  8. Click Finish to create the striped volume.
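
The sizing rule in step 4 can be worked through with a short sketch. It assumes the wizard takes the same amount of space from every selected disk; the function name and the free-space figures are made up for illustration.

```python
def striped_volume_size_mb(free_space_mb: list[int]) -> int:
    """Largest striped volume (in MB) that the selected disks can form.

    Every member contributes the same amount, so the disk with the
    least free space sets the per-disk contribution.
    """
    if not 2 <= len(free_space_mb) <= 32:
        raise ValueError("a striped volume needs between 2 and 32 disks")
    return len(free_space_mb) * min(free_space_mb)


if __name__ == "__main__":
    # Three disks with 40,000 MB, 60,000 MB and 80,000 MB free.
    print(striped_volume_size_mb([40_000, 60_000, 80_000]))  # prints 120000
```

In this example, 20,000 MB and 40,000 MB remain unused on the two larger disks and are still available for other volumes.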

How to implement a Mirrored Volume (RAID 1):

  1. Open the Disk Management console.
  2. Right-click the volume you wish to mirror, and select Add mirror to open the Add Mirror window.
  3. Select the disk you want to use for a mirror. Click Add Mirror to create the mirror.

How to break a Mirrored Volume:

  1. Open the Disk Management console.
  2. Right-click the mirrored volume you wish to break, and select Break Mirror from the shortcut menu.
  3. A message asking you to verify your actions is displayed. The message also warns you that the volumes will not be fault tolerant if you continue.
  4. Click Yes to continue.
  5. The mirror breaks. The next accessible drive letter is allocated to the volume located on the secondary disk.

How to implement a RAID 5 Volume:

  1. Open the Disk Management console.
  2. Right-click the unallocated space on the disk where you want to create the RAID-5 volume, and select New Volume to launch the New Volume Wizard. Click Next.
  3. Select RAID-5 on the Select Volume Type window. Click Next.
  4. On the Select Disks window, select the disk(s) to include in the volume, and the amount of space to be used. Click Next.
  5. On the Assign Drive Letter or Path window, assign a drive letter or mount the volume to an empty NTFS folder. Click Next.
  6. On the Format Volume window, select a format (NTFS) for the RAID-5 volume, or select the Do not format this volume option. Click Next.
  7. The Completing the New Volume Wizard window displays the options you have selected.
  8. Click Finish to create the RAID-5 volume.
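
When choosing the amount of space in step 4, it helps to know how much of it ends up usable. The sketch below assumes each disk contributes an equal portion and that one portion's worth of space across the set holds parity; the function name and figures are illustrative only.

```python
def raid5_usable_mb(free_space_mb: list[int]) -> int:
    """Usable capacity (in MB) of the largest RAID-5 volume on these disks.

    With n disks contributing an equal portion each, the equivalent of
    one portion stores parity, leaving (n - 1) portions for data.
    """
    disk_count = len(free_space_mb)
    if not 3 <= disk_count <= 32:
        raise ValueError("a RAID-5 volume needs between 3 and 32 disks")
    return (disk_count - 1) * min(free_space_mb)


if __name__ == "__main__":
    # Three disks with 40,000 MB, 60,000 MB and 80,000 MB free.
    print(raid5_usable_mb([40_000, 60_000, 80_000]))  # prints 80000
```

The parity overhead is one disk's portion out of n, so it shrinks from a third with three disks to a fifth with five, compared with a constant half for a mirrored volume.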

Understanding How Transaction Logs Protect Data

When you store transaction log files separately from database files, the following benefits are achieved:

  • Disk performance is improved.
  • Data is protected from loss.

Each storage group has its own set of transaction log files. The Information Store is arranged into storage groups. A storage group is a group of separate databases which have a common set of transaction log files. It is these storage groups that contain the mailbox stores, public stores, or both.

From the transaction log file, the information is saved to the database file of the storage group. A checkpoint file indicates which transaction log entries have since been written to the database file. The information is not deleted from the transaction log file at this stage. It is only deleted when a full online backup of all the databases in the storage group is performed.
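
The ordering described above (log first, database later, checkpoint in between, purge only after a full backup) can be sketched with a toy model. This is not how ESE is implemented; the class and method names are hypothetical, and the log is a simple in-memory list.

```python
class ToyStorageGroup:
    """Toy write-ahead log that mirrors only the ordering in the text."""

    def __init__(self) -> None:
        self.log: list[tuple[str, str]] = []  # (key, value) entries in write order
        self.database: dict[str, str] = {}
        self.checkpoint = 0  # log entries before this index are in the database

    def write(self, key: str, value: str) -> None:
        """Record a change in the log; the database is not touched yet."""
        self.log.append((key, value))

    def flush(self) -> None:
        """Apply entries past the checkpoint, then advance the checkpoint."""
        for key, value in self.log[self.checkpoint:]:
            self.database[key] = value
        self.checkpoint = len(self.log)

    def full_backup(self) -> dict[str, str]:
        """Copy the database and only then purge the log entries."""
        self.flush()
        snapshot = dict(self.database)
        self.log.clear()
        self.checkpoint = 0
        return snapshot


if __name__ == "__main__":
    group = ToyStorageGroup()
    group.write("msg1", "hello")
    print(group.database)   # still empty: the change exists only in the log
    group.flush()
    print(len(group.log))   # the entry stays in the log after the flush
    group.full_backup()
    print(len(group.log))   # the log is purged after the full backup
```

The point of the model is that a flush moves the checkpoint without shrinking the log, so the log drive keeps growing until a full backup completes.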

The concepts of soft recovery and hard recovery are described below:

  • Soft recovery: When the hard disk containing the storage group databases is lost, the damaged disk can be replaced. You can use the latest database backup for the restore. When the checkpoint file is deleted, an automatic log file replay of all transactions that took place since the backup transfers the transactions from the log files to the databases. A soft recovery is also referred to as a roll-forward recovery.
  • Hard recovery: A hard recovery can be performed when you have backed up transaction log files since the last full backup. With a hard recovery, transaction log files from the backup medium are replayed after the database is restored from an online backup. If Exchange 2003 detects that additional log files are available on the server, soft recovery is initiated to restore these log files to the database.

If the disk you are using for the transaction log files fails and the disk storing the databases is still online, no storage group data needs to be restored. You cannot, however, replay any transactions that were recorded in the log files but not yet written to the database files on disk.
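
The roll-forward idea behind both recovery types can be sketched as a replay of log files, oldest first, on top of a restored database copy. The data structures and names below are illustrative assumptions and do not reflect the on-disk format of Exchange log files.

```python
def roll_forward(
    restored_db: dict[str, str],
    log_files: list[list[tuple[str, str]]],
) -> dict[str, str]:
    """Replay log files in order on top of a restored database copy.

    Each log file is modelled as a list of (key, value) changes; later
    entries overwrite earlier ones, as they would in the live database.
    """
    database = dict(restored_db)  # leave the backup copy untouched
    for log_file in log_files:
        for key, value in log_file:
            database[key] = value
    return database


if __name__ == "__main__":
    backup = {"msg1": "v1"}
    logs_since_backup = [[("msg2", "v1")], [("msg1", "v2")]]
    print(roll_forward(backup, logs_since_backup))
    # With the log disk lost there is nothing to replay, so only the
    # state captured in the restored copy is available.
    print(roll_forward(backup, []))
```

The second call shows why losing the log disk matters: changes that existed only in the logs cannot be reconstructed from the database copy alone.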

Storage Technologies used with Exchange Server 2003

The different storage technologies that can be used with Exchange Server 2003 are listed below. The storage technology that you choose will be determined by the size of your Exchange Server 2003 organization.

  • External storage array (ESA): The characteristics and features of an external storage array are listed here:
    • An external Small Computer System Interface (SCSI) drive cabinet hosts multiple SCSI disk drives that are typically set up as RAID sets.
    • SCSI cables connect the disk drives to an Exchange Server 2003 server.
    • External storage has to be managed on a per Exchange Server 2003 server basis.
    • External storage provides good performance.
    • Limited scalability is provided.
    • Recommended for small Exchange organizations.
  • Network attached storage (NAS): The characteristics and features of network attached storage are listed here:
    • SCSI or fiber channel connections can be used to connect the storage device to the Ethernet network.
    • Network attached storage has its own IP address and is not directly attached to the Exchange Server 2003 server.
    • File requests are mapped by the Exchange Server 2003 server to the network attached storage server.
    • This is not the recommended storage technology for Exchange Server 2003, simply because Exchange Server 2003 has local data access and bandwidth requirements that are not compatible with network attached storage products.
  • Storage area network (SAN): The characteristics and features of a storage area network are listed here:
    • For medium to large Exchange organizations, using a SAN as a storage technology is the recommended solution.
    • A SAN uses fiber channel switching to provide fast and reliable connections between storage and applications.
    • SANs optimize the performance and reliability of a server.
    • SAN packages are supplied by hardware vendors such as IBM and Intel. The SAN package includes all hardware, software and support functions.
    • The main components that make up a SAN are listed here:
      • Fiber channel switching.
      • Storage arrays that store and protect data.
      • Storage software and SAN management software.
    • The advantages of using a SAN technology in an Exchange Server 2003 organization are listed here:
      • You can connect multiple Exchange Server 2003 servers to multiple storage arrays and share storage between the servers.
      • The high I/O bandwidth needed by Exchange Server 2003 is supported by SAN solutions.
      • The Exchange organization can be easily expanded by adding servers.
      • SANs are highly scalable and disks can easily be added.

Managing Storage and Storage Groups

A few best practices for configuring storage groups and databases are summarized here:

  • You should never use circular logging.
  • Use only four databases for each storage group.
  • Before you create additional storage groups, create the databases that are needed in the existing storage group. This prevents overhead on the server for log file management.
  • To ensure short maintenance and restore times, you should strive to keep the size of the databases small.
  • When you configure your storage group limits, it is recommended that you do not enable the prohibit-send option.
  • You should not change the online system maintenance default setting of enabled.
  • It is recommended that you perform a full backup each day, if feasible.
  • Verify that backups have occurred successfully.
  • You should retain deleted mailboxes for a minimum of thirty days.
  • You should retain deleted items for a minimum of seven days.
  • Ensure that the logs are purged.
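
Several of the practices above are simple thresholds, so a planned layout can be checked against them before it is built. The sketch below is a hypothetical planning aid, not an Exchange API: the class, its field names, and the warning texts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StorageGroupPlan:
    """Hypothetical description of a planned storage group."""
    name: str
    database_count: int
    circular_logging: bool
    deleted_mailbox_retention_days: int
    deleted_item_retention_days: int


def check_plan(plan: StorageGroupPlan) -> list[str]:
    """Return warnings for settings that go against the practices listed above."""
    warnings = []
    if plan.circular_logging:
        warnings.append(f"{plan.name}: circular logging is enabled")
    if plan.database_count > 4:
        warnings.append(f"{plan.name}: more than four databases")
    if plan.deleted_mailbox_retention_days < 30:
        warnings.append(f"{plan.name}: deleted mailboxes kept under 30 days")
    if plan.deleted_item_retention_days < 7:
        warnings.append(f"{plan.name}: deleted items kept under 7 days")
    return warnings


if __name__ == "__main__":
    plan = StorageGroupPlan("SG2", 5, True, 10, 3)
    for warning in check_plan(plan):
        print(warning)
```

Practices that depend on operations rather than settings, such as verifying backups, are deliberately left out of the check.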

Additional storage groups need to be created under the following conditions:

  • There is a need to utilize separate physical transaction log drives so that performance can be improved.
  • There is a requirement that multiple databases need to be backed up simultaneously. If you use multiple storage groups, then each storage group can be backed up at the same time.
  • The existing storage group is already using the maximum number of databases supported, and you need another database. For this, you need to create an additional storage group.

How to create storage groups:

  1. Click Start, All Programs, Microsoft Exchange, and then select Exchange System Manager.
  2. Exchange System Manager opens.
  3. In the left pane, right-click the Exchange server and select New and then Storage Group from the shortcut menu.
  4. In the Properties dialog box which opens, in the Name textbox, provide a name for the new storage group. This is the name that will appear in Exchange System Manager and in the Active Directory Users And Computers management console.
  5. In the Transaction log location box, provide the location for storing the transaction logs. Click the Browse button to navigate to the location.
  6. In the System path location box, provide the location for storing temporary files. Click the Browse button to navigate to the location.
  7. In the Log file prefix box, the specific log file prefix is automatically assigned by the Exchange server.
  8. Enable the Zero out deleted database pages checkbox to have all deleted data removed from the drive.
  9. The Enable circular logging checkbox should not be enabled.
  10. Click OK.

How to add mailbox stores to a storage group:

  1. Click Start, All Programs, Microsoft Exchange, and then select Exchange System Manager.
  2. Exchange System Manager opens.
  3. In the left pane, right-click the storage group container and select New and then Mailbox Store from the shortcut menu.
  4. On the General tab, provide the following information:
    • Database name.
    • Default public store.
    • Offline address list to use.
    • Enable message archiving.
    • Specify whether clients support S/MIME signatures.
    • Specify whether plain-text should be displayed in fixed-sized font.
  5. On the Database tab, provide the following information:
    • Provide the location for the EDB database.
    • Provide the location for the STM database.
  6. On the Limits tab, provide the following information:
    • Specify the message storage limit.
    • Specify the deleted items policy.
    • Specify the deleted mailbox policy.
  7. On the Full-Text Indexing tab, provide the following information:
    • Specify the frequency at which the full-text index is updated/rebuilt.
  8. On the Details tab, provide the following information:
    • Specify which configuration information needs to be entered manually by administrators.
  9. On the Policies tab, provide the following information:
    • Specify the system mailbox store policies for the mailbox store.

How to delete mailbox stores:

  1. Click Start, All Programs, Microsoft Exchange, and then select Exchange System Manager.
  2. Exchange System Manager opens next.
  3. Expand the Administrative Groups node, the administrative group, Servers node, the server, and then Storage Groups.
  4. Right-click the store that you want to remove and then click Delete.
  5. Click Yes to delete the store.
  6. Click OK.
  7. Proceed to manually delete the associated database files.
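
Before carrying out the manual deletion in the last step, it can be useful to list the database files that are actually present. The sketch below only lists .edb and .stm files in a folder and deletes nothing; the function name is an assumption, and the folder in the example is a placeholder that has to be replaced with the location shown on the store's Database tab.

```python
from pathlib import Path


def find_store_files(folder: str) -> list[Path]:
    """List the .edb and .stm files in a folder, sorted by name.

    Returns an empty list when the folder does not exist. Nothing is
    deleted; the result is meant for review before manual cleanup.
    """
    base = Path(folder)
    if not base.is_dir():
        return []
    return sorted(
        path
        for path in base.iterdir()
        if path.is_file() and path.suffix.lower() in {".edb", ".stm"}
    )


if __name__ == "__main__":
    # Placeholder path: substitute the folder that held the deleted store.
    for path in find_store_files(r"D:\ExampleStoreFolder"):
        print(path.name, path.stat().st_size)
```

Transaction log files in the same folder are intentionally not listed, because they are shared by every database in the storage group.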

How to create a public store and associate it with the public folder tree:

  1. Click Start, All Programs, Microsoft Exchange, and then select Exchange System Manager.
  2. Exchange System Manager opens.
  3. In the left pane, right-click the storage group and select New and then Public Store from the shortcut menu.
  4. Provide a name for the public store.
  5. Click the Browse button to associate the public store with the public folder tree.
  6. In the Available Public Folder Trees list, select the public folder tree.
  7. Click OK.
  8. Click OK to create the public store.
  9. Click Yes to mount the database.

How to delete public folder stores:

  1. Open Exchange System Manager.
  2. Expand the Administrative Groups node, the administrative group, Servers node, the server, and then Storage Groups.
  3. Right-click the public folder store and then select Dismount from the shortcut menu.
  4. Click Yes to dismount the public folder store.
  5. Right-click the public folder store and select Delete from the shortcut menu.
  6. Click Yes to delete the public folder store.
  7. A message appears, stating that another public folder store has to be selected for the server’s system folders. Click OK.
  8. Select the appropriate public folder store for the system folders and click OK.
  9. Proceed to manually delete the database files.
  10. Right-click the mailbox store and select Dismount from the shortcut menu.
  11. Click Yes to dismount the mailbox store.
  12. Right-click the mailbox store and select Delete from the shortcut menu.
  13. Click Yes to delete the mailbox store.
  14. Click OK.