
A Windows Server 2003-based computer that is running the Cluster service may be unable to join a cluster after the computer is first restarted



Symptoms

A Windows Server 2003-based computer that is running the Cluster service may be unable to join a server cluster on the first attempt after the computer is restarted. However, the computer can successfully join the cluster after the Cluster service restarts. When this problem occurs, entries that resemble the following are logged in the System log:

Event ID: 1209

Date: date
Time: time
Event ID: 1209
Event type: Error
Computer: ComputerName
Description: Cluster service is requesting a bus reset for device \Device\ClusDisk0.

Event ID: 118

Date: date
Time: time
Event ID: 118
Event type: Warning
Computer: ComputerName
Description: The driver for device \Device\RaidPort1 performed a bus reset upon request.

Event ID: 7031

Date: date
Time: time
Event ID: 7031
Event type: Error
Computer: ComputerName
Description: The Cluster Service service terminated unexpectedly. It has done this 1 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.

Event ID: 1062

Date: date
Time: time
Event ID: 1062
Event type: Information
Computer: ComputerName
Description: Cluster service successfully joined the server cluster ClusterName.

Event ID: 7036

Date: date
Time: time
Event ID: 7036
Event type: Information
Computer: ComputerName
Description: The Cluster Service service entered the running state.

Entries that resemble the following are logged in the Cluster.log file:
<Date> <Time> INFO [JOIN] Sponsor IP-Address is not available (JoinVersion), status=1722.
<Date> <Time> WARN [JOIN] JoinVersion data for sponsor IP-Address is invalid, status 1722.
<Date> <Time> WARN [JOIN] Unable to resolve JoinVersion endpoint for sponsor IP-Address, status 1726.
<Date> <Time> WARN [JOIN] JoinVersion data for sponsor IP-Address is invalid, status 1726.
<Date> <Time> WARN [JOIN] Unable to resolve JoinVersion endpoint for sponsor IP-Address, status 1726.
<Date> <Time> WARN [JOIN] JoinVersion data for sponsor IP-Address is invalid, status 1726.
<Date> <Time> WARN [JOIN] Unable to resolve JoinVersion endpoint for sponsor NodeName, status 1726.
<Date> <Time> WARN [JOIN] JoinVersion data for sponsor NodeName is invalid, status 1726.
<Date> <Time> INFO [JOIN] Got out of the join wait, CsJoinThreadCount = 1.
<Date> <Time> ERR [JOIN] Unable to connect to any sponsor node.
<Date> <Time> WARN [INIT] Failed to join cluster, status 53
<Date> <Time> INFO [INIT] Attempting to form cluster ClusterName
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Arbitrate for ownership of the disk by reading/writing various disk sectors.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Successful read (sector 12) [NodeName:353] (0,657b3ed8:01c5a19f).
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] No reservation found. Read'n'wait.
<Date> <Time> ERR Physical Disk <Disk Q:>: [DiskArb] Failed to read (sector 12), error 170.
<Date> <Time> WARN Physical Disk <Disk Q:>: [DiskArb] Retry arbitration, 4 attempts left
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Read the partition info to insure the disk is accessible.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Issuing GetPartInfo on signature d5fed4df.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] GetPartInfo completed, status 0.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Arbitrate for ownership of the disk by reading/writing various disk sectors.
<Date> <Time> ERR Physical Disk <Disk Q:>: [DiskArb] Failed to read (sector 12), error 170.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] We are about to break reserve.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Issuing BusReset on signature d5fed4df.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] BusReset completed, status 0.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Read the partition info from the disk to insure disk is accessible.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Issuing GetPartInfo on signature d5fed4df.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] GetPartInfo completed, status 0.
<Date> <Time> INFO Physical Disk <Disk Q:>: [DiskArb] Successful write (sector 12) [NodeName:0] (0,2007097e:01c5a1a7).
<Date> <Time> ERR Physical Disk <Disk Q:>: [DiskArb] Failed to read (sector 12), error 170.
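
The Cluster.log pattern above can be checked mechanically. The following is a minimal Python sketch, not a Microsoft tool. It scans log lines for the failed join and the failed quorum arbitration, and it annotates the numeric status codes with their standard Win32 meanings. The function name summarize_cluster_log and the sample lines are illustrative assumptions.

```python
import re

# Standard Win32 error codes that appear in the log excerpt above.
WIN32_STATUS = {
    53: "ERROR_BAD_NETPATH (network path not found)",
    170: "ERROR_BUSY (requested resource is in use)",
    1722: "RPC_S_SERVER_UNAVAILABLE (RPC server is unavailable)",
    1726: "RPC_S_CALL_FAILED (remote procedure call failed)",
}

# Matches "status 1726", "status=1722", and "error 170".
STATUS_RE = re.compile(r"(?:status[ =]|error )(\d+)")


def summarize_cluster_log(lines):
    """Return a dict describing join and arbitration failures in log lines."""
    summary = {"join_failed": False, "arbitration_failed": False, "codes": {}}
    for line in lines:
        if "[JOIN] Unable to connect to any sponsor node" in line:
            summary["join_failed"] = True
        if "[DiskArb] Failed to read" in line:
            summary["arbitration_failed"] = True
        match = STATUS_RE.search(line)
        if match:
            code = int(match.group(1))
            # Codes outside the table (for example, status 0) are ignored.
            if code in WIN32_STATUS:
                summary["codes"][code] = WIN32_STATUS[code]
    return summary


# Sample lines copied from the excerpt above (placeholders left as-is).
sample = [
    "<Date> <Time> INFO [JOIN] Sponsor IP-Address is not available (JoinVersion), status=1722.",
    "<Date> <Time> ERR [JOIN] Unable to connect to any sponsor node.",
    "<Date> <Time> WARN [INIT] Failed to join cluster, status 53",
    "<Date> <Time> ERR Physical Disk <Disk Q:>: [DiskArb] Failed to read (sector 12), error 170.",
]
result = summarize_cluster_log(sample)
print(result["join_failed"], result["arbitration_failed"], sorted(result["codes"]))
```

If both flags are set, the log is consistent with the symptom described in this article: the node could not reach a sponsor, tried to form a cluster, and then failed to arbitrate for the quorum disk.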



Cause

This problem may occur if any of the following items are true:
  • The Windows Firewall/Internet Connection Sharing service is enabled, or it is set to start automatically.
  • Windows Firewall is configured incorrectly to work with the server cluster.

    When the Cluster service on a computer starts, the computer tries to find a sponsor node so that it can join an existing cluster. If the computer that is running the Cluster service cannot connect to a sponsor, the computer tries to form a cluster.

    When this occurs, the node must take ownership of the quorum device. However, because the quorum device is being used by another node, the cluster node that tries to form a cluster cannot gain access to the quorum. Therefore, the Cluster service exits, and then later it tries to restart. For more information about how the Cluster service reserves a disk, click the following article number to view the article in the Microsoft Knowledge Base:
    309186 How the Cluster service reserves a disk and brings a disk online

    By default, when the Cluster service exits, the service tries to restart about a minute later. To determine whether the computer succeeds in joining the cluster on the second try, look for entries that resemble the following in the System log:

    Event ID: 1062

    Event Source: ClusSvc
    Date: date
    Time: time
    Event ID: 1062
    Level: Information
    Computer: ComputerName
    Description: Cluster service successfully joined the server cluster ClusterName.

    Event ID: 7036
    Event Source: Service Control Manager
    Date: date
    Time: time
    Event ID: 7036
    Level: Information
    Computer: ComputerName
    Description: The Cluster Service service entered the running state.
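
The sequence described above (an unexpected exit of the Cluster service, followed by a successful join when the service restarts) can be recognized from exported System log entries. The following is a minimal Python sketch, assuming the events have already been exported as (timestamp, event ID) pairs. The function name joined_on_retry, the five-minute window, and the sample timestamps are illustrative assumptions, not values from this article.

```python
from datetime import datetime, timedelta

SERVICE_TERMINATED = 7031  # The Cluster Service service terminated unexpectedly
JOINED_CLUSTER = 1062      # Cluster service successfully joined the server cluster


def joined_on_retry(events, window=timedelta(minutes=5)):
    """Return True if a 7031 event is followed by a 1062 event within window.

    events: iterable of (datetime, int event_id) pairs, in any order.
    """
    last_exit = None
    for when, event_id in sorted(events):
        if event_id == SERVICE_TERMINATED:
            last_exit = when
        elif event_id == JOINED_CLUSTER and last_exit is not None:
            if when - last_exit <= window:
                return True
    return False


# Hypothetical timestamps that mimic the order of events in the Symptoms section.
boot = datetime(2008, 1, 1, 8, 0, 0)
events = [
    (boot, 1209),
    (boot + timedelta(seconds=1), 118),
    (boot + timedelta(seconds=5), 7031),
    (boot + timedelta(seconds=70), 1062),
    (boot + timedelta(seconds=71), 7036),
]
print(joined_on_retry(events))
```

The window only needs to be longer than the service recovery delay, which event 7031 reports as 60000 milliseconds in the example above.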



Resolution

To resolve this problem, use either of the following methods.

Method 1: Disable the Windows Firewall/Internet Connection Sharing service

If the Windows Firewall/Internet Connection Sharing service is not required, disable the service. To determine whether the Windows Firewall/Internet Connection Sharing service is running and set to start automatically, follow these steps:
  1. Click Start, point to Administrative Tools, and then click Services.
  2. In the Services Microsoft Management Console (MMC) snap-in, right-click Windows Firewall/Internet Connection Sharing (ICS), and then click Properties.
  3. Make sure that the Startup type is set to Disabled and that the Service status is Stopped.
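
The same check can be scripted. The following is a minimal Python sketch, assuming that the short name of the Windows Firewall/Internet Connection Sharing (ICS) service is SharedAccess and that the output of "sc qc SharedAccess" and "sc query SharedAccess" is available as text. The function names are illustrative assumptions. The check_live_service helper assumes Python 3.7 or a later version on a Windows computer. If that is not available on the cluster node, the saved command output can be parsed on another computer.

```python
import re
import subprocess

SERVICE = "SharedAccess"  # assumed short name of Windows Firewall/ICS


def parse_field(sc_output, field):
    """Extract the symbolic value of a field such as START_TYPE or STATE."""
    match = re.search(rf"^\s*{field}\s*:\s*\d+\s+(\w+)", sc_output, re.MULTILINE)
    return match.group(1) if match else None


def is_disabled_and_stopped(qc_output, query_output):
    """Return True if the startup type is Disabled and the service is Stopped."""
    return (parse_field(qc_output, "START_TYPE") == "DISABLED"
            and parse_field(query_output, "STATE") == "STOPPED")


def check_live_service():
    """Run sc.exe against the local computer (Windows only)."""
    qc = subprocess.run(["sc", "qc", SERVICE],
                        capture_output=True, text=True).stdout
    query = subprocess.run(["sc", "query", SERVICE],
                           capture_output=True, text=True).stdout
    return is_disabled_and_stopped(qc, query)
```

A result of False means that the service is either still running or still configured to start, so the steps above have not been completed on that node.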

Method 2: Make sure that the Windows Firewall/Internet Connection Sharing service is configured so that the cluster nodes can communicate as required

Install the hotfix that is described in the following Microsoft Knowledge Base article:
897651 VPN clients can no longer access internal resources after you install Windows Server 2003 Service Pack 1 on a computer that is running ISA Server 2000
After you install this hotfix, you must follow the steps in the article to create the required registry key. These steps disable the boot time policy of Windows Firewall.

Note The version of the Ipnat.sys file that is included in Windows Server 2003 Service Pack 2 (SP2) also contains this fix. If Windows Server 2003 SP2 is installed on the computer, see the following Microsoft Knowledge Base article:
917730 You cannot create a network connection when you are starting a Windows XP Service Pack 2-based computer
For more information about how to use Windows Firewall together with a server cluster, visit the following Microsoft Web site:
For more information about Cluster service errors on Windows Server 2003 Service Pack 1-based clusters that have Internet Connection Firewall enabled, click the following article number to view the article in the Microsoft Knowledge Base:
883398 Cluster Services does not work correctly in a Windows Server 2003 Service Pack 1-based cluster that has the Internet Connection Firewall enabled



Keywords: KB938615, kbprb, kbtshoot, kbclustering


Article Info
Article ID : 938615
Revision : 2
Created on : 1/9/2008
Published on : 1/9/2008
Exists online : False