Troubleshooting Host Resolution
Hosts on a network
identify each other using a MAC address, which is a unique 48-bit number
programmed into every network interface card (NIC). When a host
needs to locate another host by hostname, the hostname is first resolved
into an IP address (typically by DNS) and then ARP resolves the IP
address into a MAC address.
How ARP Works
Typically, ARP
operation is invisible to the user. If anything does go wrong, however,
you need to examine the ARP cache or use Network Monitor to look at the
content of ARP frames. To make sense of the information that these tools
provide, you need to know how ARP works.
ARP
resolves IP addresses used by TCP/IP-based software to MAC addresses
used by network hardware, such as Ethernet. As each outgoing IP datagram
is encapsulated in a frame, source and destination MAC addresses must
be added. ARP determines the destination MAC address for each frame.
When ARP receives a request
to resolve an IP address, it first checks to ascertain whether it has
recently resolved that address or whether it has a permanent record of
the MAC address that corresponds to the IP address requested. This
information is held in the ARP cache. If it cannot resolve the IP
address from cache, ARP broadcasts a request that contains the source IP
and MAC addresses and the target IP address. When the ARP request is
answered, the responding PC and the original ARP requester record each
other’s IP address and MAC address in their ARP caches.
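The cache-first behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not how any operating system implements ARP: the class name `ArpCache`, the `resolve` helper, the `broadcast_request` callback, and the 120-second aging value are all assumptions made for the example.

```python
import time


class ArpCache:
    """Toy ARP cache: static entries never expire, dynamic ones age out."""

    def __init__(self, max_age_seconds=120.0):
        # max_age_seconds is an illustrative value, not a documented default.
        self.max_age_seconds = max_age_seconds
        self._entries = {}  # ip -> (mac, timestamp, or None for a static entry)

    def add_static(self, ip, mac):
        """Store a permanent record."""
        self._entries[ip] = (mac, None)

    def add_dynamic(self, ip, mac, now=None):
        """Store a recently resolved record with the time it was learned."""
        self._entries[ip] = (mac, time.monotonic() if now is None else now)

    def lookup(self, ip, now=None):
        """Return the cached MAC address, or None on a miss or expired entry."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, stamp = entry
        if stamp is None:
            return mac  # static entries are always valid
        now = time.monotonic() if now is None else now
        if now - stamp > self.max_age_seconds:
            del self._entries[ip]
            return None
        return mac


def resolve(cache, ip, broadcast_request):
    """Return the MAC address for ip, broadcasting only on a cache miss."""
    mac = cache.lookup(ip)
    if mac is not None:
        return mac
    # broadcast_request is a hypothetical callback standing in for the network.
    mac = broadcast_request(ip)
    if mac is not None:
        cache.add_dynamic(ip, mac)
    return mac
```

Calling `resolve` twice for the same address invokes the callback only once, which mirrors the way a second ping to the same host generates no new ARP broadcast while the entry is still cached.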
Resolving a Local Address
ARP operation is best illustrated by considering examples of local and
remote address resolution. In the first example, Host A, Host B, and
Host C are on the same subnet. A ping command is issued on Host A,
specifying the IP address of Host C. ICMP instructs ARP to resolve this
IP address.
ARP checks the cache on
Host A. If the IP address cannot be resolved from cached information,
then an ARP request is broadcast to all the hosts on the subnet. The ARP
broadcast supplies the source IP and MAC addresses and requests a MAC
address that corresponds with the IP address specified. Because the ARP
frame is a broadcast, all hosts on the subnet will process it. However,
hosts that do not have the corresponding IP address (such as Host B)
reject the broadcast frame. Host C recognizes the IP address as its own
and stores the IP address/MAC address pair for Host A in its cache. This
process is illustrated in Figure 1.
The target address shown in this figure is the Ethernet address for a
broadcast frame (FFFFFFFFFFFF). The MAC address of the target host is
not known and is assigned the value 000000000000.
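To make the two addresses concrete, the following Python sketch assembles the bytes of an ARP request as it would appear in an Ethernet frame. The field layout follows the standard ARP packet format for Ethernet and IPv4; the function name `build_arp_request` and the addresses used with it are made up for illustration, and the sketch only builds the bytes without sending anything.

```python
import socket
import struct

BROADCAST_MAC = bytes.fromhex("ffffffffffff")  # Ethernet broadcast address
UNKNOWN_MAC = bytes.fromhex("000000000000")    # placeholder for the unknown target


def build_arp_request(src_mac, src_ip, target_ip):
    """Return the bytes of an Ethernet frame that carries an ARP request."""
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet_header = struct.pack("!6s6sH", BROADCAST_MAC, src_mac, 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IPv4
        6,                            # hardware address length in bytes
        4,                            # protocol address length in bytes
        1,                            # opcode: 1 = request
        src_mac,                      # sender MAC address
        socket.inet_aton(src_ip),     # sender IP address
        UNKNOWN_MAC,                  # target MAC address, not yet known
        socket.inet_aton(target_ip),  # target IP address to be resolved
    )
    return ethernet_header + arp_payload
```

The broadcast address appears only in the Ethernet header, while the all-zero value sits in the target hardware address field of the ARP payload. A capture tool such as Network Monitor displays these as two separate fields.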
Host
C sends an ARP reply message that contains its MAC address directly
back to Host A. When Host A receives this message, it updates its ARP
cache with Host C’s address pair. Host A can now send the ICMP ping
datagram (or any IP datagram) directly to Host C. This process is
illustrated in Figure 2.
Resolving a Remote Address
When the target address of an IP datagram is on a remote subnet, ARP
will resolve the IP address to the MAC address of the NIC in the router
gateway that is on the source host’s local interface. In this example,
Host A and Host B are on different subnets. A ping command issued on
Host A specifies the IP address of Host B.
As in the previous
example, ARP first checks its cache on the source host (Host A). ARP
itself does not know that the target host is remote because routing is
an IP function, not an ARP function. Before calling ARP, IP on Host A
compares the destination address with its own address and subnet mask,
finds that Host B is on another subnet, and asks ARP to resolve the IP
address of the default gateway instead. If that address cannot be
resolved from cache, an ARP request is broadcast. The request is
therefore exactly the same as an ARP request to resolve any other local
address.
All the ordinary hosts on
the local subnet reject the request because none of them has a matching
IP address. The router, however, recognizes the IP address of its
gateway NIC. It then caches the IP address/MAC address pair for Host A
and sends back an ARP reply that specifies the MAC address of its
gateway NIC. On Host A, ARP caches that MAC address with the gateway IP
address. As far as ARP on Host A is concerned, it has done its job.
Thus, Host A sends datagrams for a remote IP address to the MAC address
of its default gateway.
At
this stage, ARP on the router takes over the task of IP address
resolution. First, it checks its cache for the target host’s interface.
If it cannot resolve the target host’s IP from cache, it broadcasts an
ARP request to the target host’s subnet, supplying the IP address and
MAC address of the gateway NIC that accesses the target host’s
interface.
In the example illustrated in Figure 3,
Host B recognizes its own IP address, caches the IP address and MAC
address of its default gateway, and returns its MAC address in an ARP
reply frame directed to that gateway. On the gateway, ARP caches Host
B’s MAC address along with the IP address it is resolving, and the
process is complete. The address pairs in the ARP caches shown in Figure 3 are the result of a successful resolution.
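Whether a datagram is delivered directly or through the default gateway depends on whether the target is on the sender's subnet. The following Python sketch shows that test using the standard `ipaddress` module. The function name `arp_target` and the addresses are hypothetical, and the sketch assumes a host with a single interface and a single default gateway.

```python
import ipaddress


def arp_target(source_interface, destination, default_gateway):
    """Return the IP address that ARP must resolve to reach destination.

    source_interface is given in "address/prefix" form, for example
    "192.168.1.10/24".
    """
    local_network = ipaddress.ip_interface(source_interface).network
    if ipaddress.ip_address(destination) in local_network:
        return destination       # same subnet: resolve the target itself
    return default_gateway       # remote subnet: resolve the gateway


# Host A pings a host on its own subnet, then a host on another subnet.
print(arp_target("192.168.1.10/24", "192.168.1.30", "192.168.1.1"))
print(arp_target("192.168.1.10/24", "192.168.2.20", "192.168.1.1"))
```

When troubleshooting, this explains why the ARP cache on a host normally contains an entry for the default gateway but none for hosts on remote subnets.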
Troubleshooting DNS
Several
methods are available for resolving a hostname to an IP address. If the
same hostname was resolved recently, the information will normally be
available in the host’s DNS cache. Cache resolution is quick and
efficient and is always the first resolution method that is attempted.
Static host files can resolve hostnames, but these require a lot of
administrative effort because you need to put them on every computer.
NetBIOS methods such as the Windows Internet Name Service (WINS) are
useful in mixed-mode domains. However, in Windows Server 2003 (and
Windows 2000 Server), dynamic DNS (DDNS) is available and is the
resolution method of choice. The remainder of this section assumes that
DDNS is used whenever hosts register their DNS records dynamically.
Failure of a DNS Server
It is unusual for DNS to
fail completely in an Active Directory domain. Typically, Active
Directory–integrated DNS is available on more than one domain controller
to provide failover support. If Active Directory DNS is not used, then a
secondary DNS server is used to back up the primary DNS server. A
primary DNS server that is not Active Directory–integrated is a single
point of failure. If it goes down, you cannot add new entries to the
DNS zone file. However, the secondary will continue to provide a name
resolution service, usually for a length of time sufficient to bring the
primary DNS server back on line.
However, the failure of a
DNS server can cause problems if a host is not configured with the IP
address of at least one alternative DNS server. If a host is configured
with only one DNS server’s IP address and that server goes down, then
the host is unable to resolve hostnames, even though the DNS service is
available on the other server. Typically, client machines are configured
through the Dynamic Host Configuration Protocol (DHCP) and receive a
list of all the available DNS servers. However, servers such as Exchange
Server 2003 servers are usually configured manually. It is easy to
forget to add alternative DNS servers, and everything will work
perfectly unless the DNS server fails.
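The effect of a missing alternative DNS server can be modeled with a short Python sketch. It is a simplified illustration rather than the behavior of the Windows resolver: the function name `resolve_with_failover` and the `query` callback are hypothetical, and the callback is assumed to raise `OSError` when a server cannot be reached.

```python
def resolve_with_failover(hostname, dns_servers, query):
    """Try each configured DNS server in order until one answers.

    query(server, hostname) is a hypothetical callback that returns an IP
    address string, or raises OSError if the server cannot be reached.
    """
    for server in dns_servers:
        try:
            return query(server, hostname)
        except OSError:
            continue  # this server is down, so try the next one in the list
    raise OSError(f"no configured DNS server answered for {hostname}")
```

With two servers in the list, the failure of the first is hidden from the caller. With only one server in the list, the same failure stops all name resolution on that host, which is the situation described above for a manually configured server.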
A Server Does Not Register in DDNS
When
a new server comes online, it takes some time (sometimes as long as 15
minutes) for it to register dynamically in DDNS. If the services
provided by that server are required immediately, then you can force
registration by opening the Command console and entering the following
commands in succession:
ipconfig /registerdns
net stop netlogon
net start netlogon
You need to check
Event Viewer for errors if registration fails to occur. However, unless
there are other errors, you should see the server’s A (host) record
appear in DNS almost immediately.
Negative Caching
If DNS resolves a
hostname to an IP address, the hostname/IP address pair is held in cache
on the host that originated the request (the resolver). However, if
resolution is unsuccessful, that information is also cached. This is to
stop the waste of resources when a user types in a hostname incorrectly.
Suppose, however, that the hostname is correct but because of some
fault, the resolution does not take place. This negative information is
cached. Suppose that the fault is then fixed. Now every client can
resolve the hostname except for the client that tried to do so earlier.
It attempts to resolve the hostname from cache, obtains the negative
information, and returns an error. You can solve the problem by opening
the Command console on that client and entering ipconfig /flushdns.
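The negative caching problem can be reproduced with a toy resolver cache in Python. This is a sketch under simplifying assumptions: the class name `ResolverCache` and the `query` callback are invented for the example, a failed lookup is assumed to raise `LookupError`, and the time-to-live handling that a real cache applies is left out.

```python
class ResolverCache:
    """Toy resolver cache that remembers failures as well as successes."""

    def __init__(self):
        # hostname -> IP address string, or None for a negative entry
        self._cache = {}

    def resolve(self, hostname, query):
        """Return the cached answer if there is one, otherwise ask query."""
        if hostname in self._cache:
            return self._cache[hostname]
        try:
            address = query(hostname)
        except LookupError:
            address = None  # cache the failure too (negative caching)
        self._cache[hostname] = address
        return address

    def flush(self):
        """Discard every entry; the toy counterpart of ipconfig /flushdns."""
        self._cache.clear()
```

If a lookup fails, the fault is repaired, and the lookup is repeated, `resolve` still returns `None` until `flush` is called. The client in the scenario above is in the same position until its cache is flushed.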
DHCP Problems
Sometimes
a client machine cannot access any servers on a network or resolve any
hostnames, when all the other clients are having no problems. In this
case, check the configuration of the client using the ipconfig utility.
There is a good chance that the client’s IP address will be in the
169.254.x.x
range. What has happened is that the DHCP service has stopped for some
reason, or has run out of leases, and the host has been configured
through automatic private IP addressing (APIPA). If you fix the DHCP
problem, then the client will obtain a DHCP lease in approximately five
minutes and the problem is solved. If you need an immediate solution,
then open the Command console on the client and enter the following
commands:
ipconfig /release
ipconfig /renew
Note
It is sufficient to enter only ipconfig /renew
when converting an APIPA address to a DHCP lease. However, it is good
practice always to release an IP configuration before you renew it.
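When many clients need to be checked, the test for an APIPA address can be scripted. The following Python sketch uses the standard `ipaddress` module; the function name `is_apipa` and the sample addresses are illustrative, and the sketch assumes that the addresses have already been collected, for example from ipconfig output.

```python
import ipaddress

# APIPA assigns addresses from the 169.254.0.0/16 range.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address):
    """Return True if address falls in the APIPA range."""
    return ipaddress.ip_address(address) in APIPA_NETWORK


for candidate in ("169.254.17.4", "192.168.1.10"):
    print(candidate, is_apipa(candidate))
```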
Troubleshooting Active Directory Issues
As with DNS, Active
Directory must be available in order to install Exchange Server 2003 and
create an Exchange Server 2003 organization. Active Directory is robust
because the Active Directory database is replicated between domain
controllers. Unless you have only one domain controller (not
recommended), there is no single point of failure for the entire Active
Directory.
However, Active Directory
uses operations masters, and the failure of an operations master affects
the functionality of the Active Directory directory service. Typically,
the following problems are associated with operations masters:
You cannot create security principals
Assuming that you have sufficient permissions to create a security
principal, then typically this problem occurs when the relative ID
(RID) master is not available or has failed to replicate. This may be
caused by a network connectivity problem or may be due to the failure of
the computer holding the RID master role. This fault can also occur
when the Access This Computer From The Network user right is not
assigned to the appropriate groups on the RID master.
You cannot change group membership
Assuming that you have the necessary administrative credentials to
manage group membership, this problem typically occurs when the
infrastructure master is not available. This may be caused by a network
connectivity problem. It may also be due to a failure of the computer
holding the infrastructure master role.
Users cannot authenticate
This
can be a problem in mixed mode domains in which some clients are not
Active Directory–aware. Typically, it happens when the user’s password
has expired and the primary domain controller (PDC) emulator master is
not available. This may be caused by a network connectivity problem. It
may also be due to a failure of the computer holding the PDC emulator
master role.
In all of these cases, you
can identify the computer holding the RID master role, the
infrastructure master role, or the PDC emulator role by issuing the netdom query fsmo command from the Command console of any host in the domain, as shown in Figure 4.
You can then repair or replace the computer holding the appropriate
operations master role. You may need to seize the operations master
role. Alternatively, you may need to resolve the network connectivity
problem.
Important
If
you have any Windows 2000 domain controllers in your domain, ensure
that SP3 or later is installed on them. Otherwise, Exchange Server 2003
cannot access them.