Overview of Windows Server 2003 cluster management
Server clusters introduce an environment where disks are managed differently than they are in a stand-alone server environment. For example, in server clusters, multiple initiators do not access a single disk. Clustering requires that only one server or only one node access a Logical Unit Number (LUN) at a time. This configuration guarantees that another server does not try to write to the same disk. If more than one server writes to the same disk, data on the disk may become corrupted.
Health monitoring checks on a cluster-managed LUN
A series of health monitoring checks is performed on cluster-managed LUNs to make sure that a LUN is available. If any one of these checks fails, the Cluster service assumes that there is a problem with the LUN and takes recovery action. Recovery actions may include the following:
- Trying to restart the resources and to mount the disk again on that same node.
- Assuming failover ownership of the disk.
- Trying to bring the disk online on another node in the cluster.
The following checks are performed on any disk that is online and that is managed by the cluster:
- File system level checks
At the file system level, the Physical Disk resource type performs the following checks:
- LooksAlive
By default, a brief check is performed every 5 seconds to verify that a disk is still available. The LooksAlive check determines whether a resource flag is set. This flag indicates that a device has failed. For example, a flag may indicate that periodic reservation has failed. The frequency of this check is user definable.
- IsAlive
A complete check is performed every 60 seconds to verify that the disk and the file system, or systems, can be accessed. The IsAlive check effectively performs the same functionality as a dir command that you type at a command prompt. The frequency of this check is user definable.
- Device level checks
At the device level, the Clusdisk.sys driver performs the following checks:
- SCSI Reserve
Every 3 seconds, a SCSI Reserve command is sent to the LUN to make sure that only the owning node has ownership and can access that drive.
- Private Sector
Every 3 seconds, the Clusdisk.sys driver performs a read and write operation to sector 12 on the LUN to make sure that the device is writeable.
Note The timing of these device level checks is not user definable. The Clusdisk.sys driver is very dependent on this timing for its algorithm when the Clusdisk.sys driver arbitrates for shared disks. If you modify or suppress this timing functionality, you affect the core multi-initiator management functionality of the cluster.
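To get a feel for how often these checks touch a disk, the following minimal Python sketch counts how many times each check would run during a given window, for example while another tool holds the disk. The interval values are the defaults stated above. The function name, and the assumption that each check first runs one full interval after the window starts, are illustrative only.

```python
# Check intervals in seconds, taken from the defaults described in this article.
CHECK_INTERVALS = {
    "LooksAlive": 5,       # file system level, user definable
    "IsAlive": 60,         # file system level, user definable
    "SCSI Reserve": 3,     # device level, not user definable
    "Private Sector": 3,   # device level, not user definable
}


def checks_in_window(window_seconds, intervals=CHECK_INTERVALS):
    """Return how many times each check runs in a window of the given length.

    Assumption: every check first runs one full interval after the window starts.
    """
    return {name: window_seconds // period for name, period in intervals.items()}


# Example: a 10-minute (600-second) window.
counts = checks_in_window(600)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Any one of these checks failing during the window is enough for the Cluster service to start recovery action.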
Maintenance mode in Windows Server 2003 Service Pack 1
Windows Server 2003 Service Pack 1 includes a new feature that is called maintenance mode. This mode lets you perform certain administrative and maintenance tasks in a clustered environment. These tasks must be performed on shared clustered disks. These shared clustered disks may require another application or mechanism to obtain an exclusive lock on a disk. When you perform administrative and maintenance tasks, a disk may appear in a regular online state, but health monitoring may be temporarily suppressed. Also, the disk may not be available to clients.
For example, if you type chkdsk /f on a disk that is managed and monitored by the cluster, the disk is locked for exclusive use to examine the consistency of the disk. This behavior may cause health monitoring checks for that disk to fail if the CHKDSK command takes longer than the time-out period that is set for the disk. For example, the IsAlive or LooksAlive checks may fail. Therefore, the group in which that disk resides may fail over to another node in the cluster. A failover interrupts the CHKDSK command and affects the availability of all other resources during the failover.
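One way to avoid this situation is to wrap the disk check in maintenance mode. The following Python sketch is a hedged illustration, not a supported tool: it builds the Cluster.exe maintenance mode commands that are documented later in this article, runs chkdsk /f between them, and always turns maintenance mode off again. The function names, the injectable run parameter, and the use of "." for the local cluster are assumptions made for the example.

```python
import subprocess


def maintenance_command(disk_name, mode, cluster="."):
    """Build the cluster.exe argument list that turns maintenance mode on or off."""
    if mode not in ("on", "off"):
        raise ValueError("mode must be 'on' or 'off'")
    return ["cluster.exe", cluster, "res", disk_name, f"/maint:{mode}"]


def run_chkdsk_in_maintenance(disk_name, drive, run=subprocess.run):
    """Turn maintenance mode on, run chkdsk /f, and always turn maintenance mode off.

    Returns the chkdsk exit code. The run parameter is hypothetical and exists
    only so that the sequence can be exercised without a real cluster.
    """
    run(maintenance_command(disk_name, "on"), check=True)
    try:
        # chkdsk can exit with a nonzero code for non-fatal outcomes,
        # so the code is returned to the caller instead of raising.
        return run(["chkdsk", drive, "/f"]).returncode
    finally:
        # Runs even if chkdsk could not be started, so the disk is not
        # left in maintenance mode by this script.
        run(maintenance_command(disk_name, "off"), check=True)
```

On a cluster node, a call might look like run_chkdsk_in_maintenance("Disk H:", "H:"). The try/finally block reflects the warning in this article about leaving a disk in maintenance mode.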
Extended maintenance mode
Windows Server 2003 SP1 added another feature to the Cluster.exe command-line tool. The tool now waits for the internal state of a resource to stabilize and to complete the online or offline process. This process can be scripted. For example, after you put a resource in extended maintenance mode, the script calls the /waitmaint parameter. The /waitmaint parameter blocks activity until the resource has gone into a full stable state internally. Use the following command syntax to call the /waitmaint parameter:
cluster resource_name /waitmaint[pending]
This update enables the maintenance mode process to perform the following extended functions:
- The Cluster service gives over control at the LUN level to other applications or functions that need exclusive access to the LUN.
- The Cluster service gives over complete control of the whole LUN to another application for a short time.
- The Cluster service suppresses low-level periodic SCSI reservations and the writes to reserved sectors to guarantee LUN availability.
- The Cluster service suppresses the LooksAlive and IsAlive health monitoring checks.
When you use extended maintenance mode, the disk is put into an offline functional state "internally". When an internal function takes the disk offline, the LUN is dismounted, and all health monitoring is stopped. However, the disk appears to be in an online state at a higher level in Cluster Administrator. Also, the disk appears to be in an online state to dependent resources. The disk does not fail and is not taken offline. Therefore, the dependent resources remain online.
You can use extended maintenance mode together with a hardware snapshot application to perform any functions that are required to complete a snapshot restore. For example, you can mask off a LUN, swap LUNs, put a LUN in read-only mode, and so on.
Important A backup requestor application must provide the timing coordination during a snapshot restore to make sure that the correct application VSS writers are called. Application VSS writers make sure that all handles are closed and that there is no disk usage. Timing coordination prevents an adverse effect on higher level applications when the disk dismount occurs. After the snapshot restore is completed, the disk must be brought out of extended maintenance mode to make sure that the disk is mounted and ready to be accessed by applications.
When extended maintenance mode hands over complete control of the LUN to another process, such as to the operating system, the disk is dismounted. In this case, the disk is inaccessible to applications. Use extended maintenance mode only together with a backup requestor application that stops all usage of a disk by applications before the backup requestor application invokes extended maintenance mode. If disk usage is not stopped, application failures and potential corruption of data may occur. Extended maintenance mode should be started only by backup requestor applications that are configured by independent software vendors (ISVs). These ISVs must be familiar with application VSS writers and similar mechanisms that help prevent timing problems.
Warning We do not recommend that you use maintenance mode to bypass health monitoring checks for a disk that is experiencing problems. When a disk is in maintenance mode, any failures or time-outs of the disk are ignored by the Cluster service. This behavior may prevent failover and recoverability if a disk is intentionally or accidentally left permanently in maintenance mode.
How to put a disk in maintenance mode
To start maintenance mode, use one of the following methods:
- Use the Cluster.exe command-line tool.
- Use the Cluster API. For more information about how to use the Cluster API, see the Microsoft Windows Server 2003 SDK.
Put a disk resource in maintenance mode only if it meets all the following requirements:
- The disk must be a "Physical Disk" type resource.
Note Resources that use other resource DLL types cannot be put in maintenance mode.
- The disk must not be designated as the quorum resource.
- The disk must be online.
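The three requirements above can be expressed as a simple pre-check. The following Python sketch is illustrative only: the DiskResource class and its field names are hypothetical and do not correspond to any Cluster API structure. A script would have to fill them in from its own inventory of cluster resources.

```python
from dataclasses import dataclass


@dataclass
class DiskResource:
    """Hypothetical description of a cluster resource, for illustration only."""
    name: str
    resource_type: str
    is_quorum: bool
    state: str


def maintenance_blockers(resource):
    """Return the reasons a resource cannot enter maintenance mode (empty if none)."""
    reasons = []
    if resource.resource_type != "Physical Disk":
        reasons.append("resource type is not Physical Disk")
    if resource.is_quorum:
        reasons.append("resource is the quorum resource")
    if resource.state != "Online":
        reasons.append("resource is not online")
    return reasons


disk_h = DiskResource("Disk H:", "Physical Disk", is_quorum=False, state="Online")
disk_q = DiskResource("Disk Q:", "Physical Disk", is_quorum=True, state="Online")
print(maintenance_blockers(disk_h))
print(maintenance_blockers(disk_q))
```

Returning a list of reasons instead of a single Boolean value makes it easier to report every unmet requirement at once.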
When you put a disk in maintenance mode, this setting is an in-memory state and is not saved in the cluster registry hive. This change is not a persistent change. The next time that a disk is brought offline and then back online, the disk reverts to its standard behavior.
If there is any change to the state of the disk resource in maintenance mode, the maintenance mode setting is disabled. The maintenance mode setting is disabled when the following conditions are true:
- You take a resource offline.
- You change the resource status to Offline Pending or to Failed.
Also, if the disk fails over to another node, or another similar action occurs, the disk switches from maintenance mode to regular operational behavior and continues health monitoring. This behavior makes sure that you do not accidentally leave disks in permanent maintenance mode.
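The non-persistent behavior that is described above can be modeled as a flag that is cleared by certain events. The following Python sketch is a simplified model for reasoning about scripts, not a description of how the Cluster service is implemented. The class name and the event names are assumptions chosen for the example.

```python
class DiskMaintenanceState:
    """In-memory model: the maintenance flag does not survive resource state changes."""

    # Events that, according to this article, end maintenance mode.
    CLEARING_EVENTS = {"offline", "offline_pending", "failed", "failover"}

    def __init__(self):
        self.maintenance = False

    def set_maintenance(self, enabled):
        """Model the effect of /maint:on or /maint:off."""
        self.maintenance = bool(enabled)

    def on_event(self, event):
        """Clear the flag when the resource changes state or fails over."""
        if event in self.CLEARING_EVENTS:
            self.maintenance = False


disk = DiskMaintenanceState()
disk.set_maintenance(True)
disk.on_event("failover")
print(disk.maintenance)
```

A script that relies on maintenance mode should therefore query the mode again after any such event instead of assuming that the mode is still set.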
The Cluster.exe command-line tool is used to query, to set, and to clear maintenance mode for a resource.
Note You cannot view, change, or set maintenance mode for a resource from the Cluster Administrator graphical user interface tool.
Maintenance mode command-line syntax
To put a disk resource in maintenance mode, use the following Cluster.exe command syntax:
cluster.exe . res "%disk_name%" /maint:on
Note For more information about Cluster.exe command syntax, type cluster /? at a command prompt.
The following is sample output that shows that a resource has been put in maintenance mode:
G:\>cluster TestCluster res "Disk H:" /maint:on
Setting maintenance mode for resource 'Disk H:'
Resource             Group                Node            Status
-------------------- -------------------- --------------- ------
Disk H:              Exchange             Node1           Online(Maintenance)
Use the following Cluster.exe command syntax to query a disk resource to determine whether the resource is in maintenance mode:
cluster.exe . res "%disk_name%" /maint
The following sample output shows a resource that is being queried for its maintenance mode setting:
G:\>cluster TestCluster res "Disk H:" /maint
Resource             Group                Node            Status
-------------------- -------------------- --------------- ------
Disk H:              Exchange             Node1           Online(Maintenance)
Use the following Cluster.exe command syntax to bring a disk resource out of maintenance mode:
cluster.exe . res "%disk_name%" /maint:off
The following sample output shows a resource that is out of maintenance mode:
G:\>cluster TestCluster res "Disk H:" /maint:off
Clearing maintenance mode for resource 'Disk H:'
Resource             Group                Node            Status
-------------------- -------------------- --------------- ------
Disk H:              Exchange             Node1           Online
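A script that calls Cluster.exe may want to interpret the Status value instead of displaying it. The following Python sketch parses only the status strings that appear in the sample output in this article. Other status strings may exist and are not handled specifically. The function name and the returned dictionary keys are assumptions made for the example.

```python
import re

# A state name, optionally followed by a parenthesized detail string.
STATUS_RE = re.compile(r"^(?P<state>[A-Za-z ]+?)(?:\((?P<detail>.*)\))?$")
INTERNAL_RE = re.compile(r"Internal State '([^']*)'")


def parse_status(status):
    """Split a status such as "Online(Maintenance)" into its parts."""
    match = STATUS_RE.match(status.strip())
    if match is None:
        raise ValueError(f"unrecognized status: {status!r}")
    detail = match.group("detail") or ""
    if detail.startswith("Ext Maintenance"):
        mode = "extended"
    elif detail.startswith("Maintenance"):
        mode = "maintenance"
    else:
        mode = "none"
    internal = INTERNAL_RE.search(detail)
    return {
        "state": match.group("state"),
        "mode": mode,
        "internal_state": internal.group(1) if internal else None,
    }


print(parse_status("Online(Maintenance)"))
print(parse_status("Online"))
```

The parser works on the Status value only. Extracting that value from a full output row is left to the caller, because resource and group names can contain spaces.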
Example scenarios
How to put a disk into extended maintenance mode
- Before you put a disk in extended maintenance mode, you must put the disk in maintenance mode. To do this, type cluster res "Disk H:" /maint:1, and then press ENTER.
Note Both the 1 value and the ON value turn on maintenance mode. In this step, 1 is used to turn on maintenance mode.
You receive the following output:
Setting maintenance mode for resource 'Disk H:'
Resource             Group                Node            Status
-------------------- -------------------- --------------- ------
Disk H:              Exchange             Node1           Online(Maintenance)
- Put the disk in extended maintenance mode. To do this, type cluster res "Disk H:" /extmaint:1, and then press ENTER.
You receive the following output:
Setting extended maintenance mode for resource 'Disk H:'
System error 997 has occurred (0x000003e5).
Overlapped I/O operation is in progress.
Note Error 997 is expected because the return status of the Cluster.exe program is not dynamic. Therefore, it may take several seconds to put the disk in extended maintenance mode.
- Type cluster res "Disk H:", and then press ENTER.
You receive the following output:
Listing status for resource 'Disk H:':
Resource             Group                Node            Status
-------------------- -------------------- --------------- ------
Disk H:              Exchange             Node1           Online(Ext Maintenance, Internal State 'Offline')
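Because the change to extended maintenance mode may take several seconds, a script typically has to check the status repeatedly before it continues. The following Python sketch shows one possible polling loop. The get_status callable is hypothetical: it stands for whatever code runs cluster res "Disk H:" and returns the Status text. The 30-second time-out and 1-second interval are arbitrary example values, not documented limits.

```python
import time


def wait_for_internal_state(get_status, wanted="Offline", timeout=30.0,
                            interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the status text reports the wanted internal state.

    Returns True if the state was seen, or False if the time-out expired first.
    """
    marker = f"Internal State '{wanted}'"
    deadline = clock() + timeout
    while True:
        if marker in get_status():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Example with canned status values in place of real Cluster.exe output.
samples = iter([
    "Online(Maintenance)",
    "Online(Maintenance)",
    "Online(Ext Maintenance, Internal State 'Offline')",
])
reached = wait_for_internal_state(lambda: next(samples), sleep=lambda seconds: None)
print(reached)
```

The sleep and clock parameters are injectable so that the loop can be tested without real delays.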
How to bring a disk out of extended maintenance mode
- Type cluster res "Disk H:" /extmaint:0, and then press ENTER.
You receive the following output:
Clearing extended maintenance mode for resource 'Disk H:'
System error 997 has occurred (0x000003e5).
Overlapped I/O operation is in progress.
- Type cluster res "Disk H:", and then press ENTER.
You receive the following output:
Listing status for resource 'Disk H:':
Resource             Group                Node            Status
-------------------- -------------------- --------------- ------
Disk H:              Exchange             Node1           Online(Maintenance)
The disk is now in maintenance mode.
- Bring the disk out of maintenance mode. To do this, type cluster res "Disk H:" /maint:0, and then press ENTER.
You receive the following output:
C:\WINDOWS\Cluster>
Clearing maintenance mode for resource 'Disk H:'
Resource             Group                Node            Status
-------------------- -------------------- --------------- ------
Disk H:              Exchange             Node1           Online
Extended maintenance mode limitations
A backup and restore application must be able to "hot swap" a cluster disk without taking the disk offline. Hot swapping occurs when you remove a device from a computer that is running and then replace that device with an identical device in the same slot. When a cluster disk is online, the Clusdisk.sys driver is attached to the disk and helps protect the disk. When a disk is removed from the system, all state that the Clusdisk.sys driver and the disk resource maintain for an online disk is invalidated. The hot swap disk has to be brought online by the Clusdisk.sys driver and by disk resource operations before the hot swap disk can be used by the applications. You must perform all the operations that are described in this article without taking the corresponding cluster resource for the disk offline.
To support hot swapping, cluster disk resource operations must be able to bring the disk offline and online internally. For example, the LUN must be dismounted, and all monitoring must stop. However, the corresponding cluster resource must remain online. When hot swapping is completed, the LUN is remounted, and monitoring resumes.
When you put a disk in extended maintenance mode, the state transitions are symmetric. When a disk is online, you can switch to maintenance mode. From maintenance mode, you can switch to extended maintenance mode. However, when a disk is online, you cannot switch directly to extended maintenance mode.
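The allowed transitions can be written down as a small table. The following Python sketch encodes the transitions that are described above and in the example scenarios, and then finds the sequence of modes that a script has to pass through. The mode names and function names are assumptions chosen for the example.

```python
from collections import deque

# Allowed mode changes for an online disk, as described in this article.
TRANSITIONS = {
    "online": {"maintenance"},
    "maintenance": {"online", "extended"},
    "extended": {"maintenance"},
}


def can_switch(current, target):
    """Return True if a disk can move directly from current to target."""
    return target in TRANSITIONS.get(current, set())


def mode_path(start, goal):
    """Return the shortest list of modes from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        route = queue.popleft()
        if route[-1] == goal:
            return route
        for nxt in sorted(TRANSITIONS.get(route[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(route + [nxt])
    return None


print(can_switch("online", "extended"))
print(mode_path("online", "extended"))
print(mode_path("extended", "online"))
```

The reverse path mirrors the forward path, which is what the article means by symmetric state transitions.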
A regression has been found in this update that prevents you from creating a quorum by using Majority Node Set (MNS). If you convert from a shared quorum resource to MNS, error 1 (invalid function) occurs. If the cluster is already using MNS for a quorum and you apply this update, the MNS resource cannot come online. Additionally, Cluster Administrator displays the following error message:
An error has occurred attempting to make <MNS_Resource> the quorum resource. Incorrect function
Error ID: 1 (00000001).
Snippet from the cluster log:
Majority Node Set <MNS>: Expanded path '\\fa67fd8c-7325-4\fa67fd8c-7325-4751-bf3b-d3f3131f32b6$'
[FM] FmSetQuorumResource: Entry, pszClusFileRootPath=\\fa67fd8c-7325-4\fa67fd8c-7325-4751-bf3b-d3f3131f32b6$\MSCS
000000ac.00001038::2006/10/01-03:38:13.370 ERR [FM] FmSetQuorumResource: Unable to get maintenance mode info for resource 'MNS', status 1
[FM] FmSetQuorumResource: Exit, status=1
[FM] FmSetQuorumResource: Entry, pszClusFileRootPath=\\fa67fd8c-7325-4\fa67fd8c-7325-4751-bf3b-d3f3131f32b6$\MSCS
000000ac.00001758::2006/10/01-03:38:59.730 ERR [FM] FmSetQuorumResource: Unable to get maintenance mode info for resource 'MNS', status 1
[FM] FmSetQuorumResource: Exit, status=1
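When you search a large cluster log for this regression, it can help to pick out the entries that carry the ERR level. The following Python sketch parses one complete entry of the kind shown in the snippet. It assumes that the two leading hexadecimal fields are the process ID and the thread ID and that the time stamp is in year/month/day order. These interpretations, and the function name, are assumptions and are not stated in this article.

```python
import re
from datetime import datetime

LOG_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>\w+)\s+\[(?P<component>\w+)\]\s+(?P<message>.*)$"
)


def parse_log_line(line):
    """Parse one cluster log entry, or return None if the line does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    return {
        "process_id": int(match["pid"], 16),   # assumed meaning of the first field
        "thread_id": int(match["tid"], 16),    # assumed meaning of the second field
        "timestamp": datetime.strptime(match["ts"], "%Y/%m/%d-%H:%M:%S.%f"),
        "level": match["level"],
        "component": match["component"],
        "message": match["message"],
    }


sample = ("000000ac.00001038::2006/10/01-03:38:13.370 ERR [FM] "
          "FmSetQuorumResource: Unable to get maintenance mode info "
          "for resource 'MNS', status 1")
entry = parse_log_line(sample)
print(entry["level"], entry["component"], entry["message"])
```

Lines that do not start with the identifier and time stamp prefix return None, so a caller can skip them.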
To resolve the issue with MNS, download the following update:
921181 An update is available that adds a file share witness feature and a configurable cluster heartbeats feature to Windows Server 2003 Service Pack 1-based server clusters
Service pack information
To resolve this problem, obtain the latest service pack for Windows Server 2003. For more information, click the following article number to view the article in the Microsoft Knowledge Base:
889100 How to obtain the latest service pack for Windows Server 2003
Update information
A supported hotfix is available from Microsoft. However, this hotfix is intended to correct only the problem that is described in this article. Apply this hotfix only to systems that are experiencing this specific problem. This hotfix might receive additional testing. Therefore, if you are not severely affected by this problem, we recommend that you wait for the next software update that contains this hotfix.
If the hotfix is available for download, there is a "Hotfix download available" section at the top of this Knowledge Base article. If this section does not appear, contact Microsoft Customer Service and Support to obtain the hotfix.
Note If additional issues occur or if any troubleshooting is required, you might have to create a separate service request. The usual support costs will apply to additional support questions and issues that do not qualify for this specific hotfix. For a complete list of Microsoft Customer Service and Support telephone numbers or to create a separate service request, visit the following Microsoft Web site:
Note The "Hotfix download available" form displays the languages for which the hotfix is available. If you do not see your language, it is because a hotfix is not available for that language.
Restart requirement
You must restart the computer after you apply this update.
Update replacement information
This update does not replace any other updates.
File information
The English version of this update has the file attributes (or later file attributes) that are listed in the following table. The dates and times for these files are listed in Coordinated Universal Time (UTC). When you view the file information, it is converted to local time. To find the difference between UTC and local time, use the Time Zone tab in the Date and Time item in Control Panel.
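To compare a listed time stamp with what a computer displays locally, the UTC value can be shifted by the local offset. The following Python sketch assumes a fixed offset in whole hours, which ignores daylight saving rules, and it assumes an English locale for the month abbreviation. The function name and the example offset of UTC-7 are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(date_text, time_text, utc_offset_hours):
    """Convert a UTC date and time from the file tables to a fixed local offset."""
    utc_value = datetime.strptime(
        f"{date_text} {time_text}", "%d-%b-%Y %H:%M"
    ).replace(tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return utc_value.astimezone(local_zone).strftime("%d-%b-%Y %H:%M")


# Example: the x86 Clusres.dll time stamp viewed at UTC-7.
local_stamp = utc_to_local("18-Aug-2005", "04:54", -7)
print(local_stamp)
```

In this example, the local date falls on the previous day, which is why a file can appear to have a different date from the one that is listed in the table.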
Windows Server 2003, x86-based versions
Date         Time   Version           Size  File name
--------------------------------------------------------------
18-Aug-2005  04:54  5.2.3790.2511  476,672  Clusres.dll
18-Aug-2005  02:39  5.2.3790.2511  838,144  Clussvc.exe
18-Aug-2005  02:40  5.2.3790.2511  181,248  Cluster.exe
18-Aug-2005  02:40  5.2.3790.2511   68,096  Resrcmon.exe
18-Aug-2005  02:33  5.2.3790.2511    7,168  W03a2409.dll
18-Aug-2005  02:48  5.2.3790.2511   32,256  Arpidfix.exe
Windows Server 2003, x64-based versions
Date         Time   Version             Size  File name      Platform
--------------------------------------------------------------------
18-Aug-2005  14:19  5.2.3790.2511    651,264  Clusres.dll    x64
18-Aug-2005  14:19  5.2.3790.2511  1,231,360  Clussvc.exe    x64
18-Aug-2005  14:19  5.2.3790.2511    338,432  Cluster.exe    x64
18-Aug-2005  14:19  5.2.3790.2511     97,280  Resrcmon.exe   x64
18-Aug-2005  14:19  5.2.3790.2511      7,680  W03a2409.dll   x64
18-Aug-2005  14:19  5.2.3790.2511    181,248  Wcluster.exe   x86
18-Aug-2005  14:19  5.2.3790.2511     68,096  Wresrcmon.exe  x86
18-Aug-2005  14:19  5.2.3790.2511      7,168  Ww03a2409.dll  x86
18-Aug-2005  14:19  5.2.3790.2511     43,008  Arpidfix.exe   x64
Windows Server 2003, Itanium-based versions
Date         Time   Version             Size  File name      Platform
--------------------------------------------------------------------
18-Aug-2005  14:19  5.2.3790.2511  1,162,240  Clusres.dll    IA-64
18-Aug-2005  14:19  5.2.3790.2511  2,068,992  Clussvc.exe    IA-64
18-Aug-2005  14:19  5.2.3790.2511    543,744  Cluster.exe    IA-64
18-Aug-2005  14:19  5.2.3790.2511    184,320  Resrcmon.exe   IA-64
18-Aug-2005  14:19  5.2.3790.2511      6,144  W03a2409.dll   IA-64
18-Aug-2005  14:19  5.2.3790.2511    181,248  Wcluster.exe   x86
18-Aug-2005  14:19  5.2.3790.2511     68,096  Wresrcmon.exe  x86
18-Aug-2005  14:19  5.2.3790.2511      7,168  Ww03a2409.dll  x86
18-Aug-2005  14:19  5.2.3790.2511     74,752  Arpidfix.exe   IA-64
For more information, click the following article number to view the article in the Microsoft Knowledge Base:
824684 Description of the standard terminology that is used to describe Microsoft software updates