DFS Troubleshooting on Windows Server 2008 R2

The DFS Management MMC snap-in handles most common DFS Namespaces administration tasks. It appears under “Administrative Tools” after you add the DFS role service in Server Manager. You can also install just the snap-in to manage a DFS namespace server remotely: in Server Manager, go to Add Features, Remote Server Administration Tools (RSAT), Role Administration Tools, File Services Tools.

Another option is DFSUTIL.EXE, a command-line tool. It has many options and can perform almost any DFS-related activity, from creating a namespace and adding links to exporting the entire configuration and troubleshooting. This makes it handy for automating tasks in scripts or batch files. DFSUTIL.EXE is an in-box tool in Windows Server 2008.

What can go wrong?

  • Access to the DFS namespace
  • Finding shared folders
  • Access to DFS links and shared folders
  • Security-related issues
  • Replication latency
  • Failure to connect to a domain controller to obtain a DFSN namespace referral
  • Failure to connect to a DFS server
  • Failure of the DFS server to provide a folder referral

Methods of Troubleshooting

I have a very basic lab set up with DFS running on two servers. I will use it to demonstrate the troubleshooting methods.

My DFS namespace is \\dacmt.local\shared

Troubleshooting Commands

  • dfsutil.exe /spcinfo

Use the DFSUtil.exe /spcinfo command to determine whether the client was able to connect to a domain controller for domain information. The output describes the trusted domains and their domain controllers that the client has discovered through DFSN referral queries. This is known as the “Domain Cache”.

  • start \\10.1.1.160 (where 10.1.1.160 is your DC)

This should open an Explorer window listing the shares hosted by your domain controller.

  • net view \\10.1.1.160 (where 10.1.1.160 is your DC)

A successful connection lists all shares that are hosted by the domain controller.

  • net view \\10.1.1.200 (where 10.1.1.200 is your DFS server)

This lists the namespace and the shares held on your DFS server.
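
The net view tests above depend on SMB being reachable on the target. As a rough complement, here is a minimal Python sketch that only checks whether TCP port 445 accepts a connection. The function name smb_port_open is my own, and using port 445 assumes SMB over TCP rather than NetBIOS over TCP (port 139). A successful connection says nothing about permissions or shares.

```python
import socket


def smb_port_open(host: str, port: int = 445, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False
```

In my lab I would call smb_port_open("10.1.1.160") for the DC and smb_port_open("10.1.1.200") for the DFS server. If either returns False, look at firewalls and basic networking before going any further with DFS itself.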

  • dfsutil.exe /pktinfo

If the above connection tests are successful, determine whether a valid DFSN referral is returned to the client after it accesses the namespace. You can do this by viewing the referral cache (also known as the PKT cache) with the DFSUtil.exe /pktinfo command.

If you cannot find an entry for the desired namespace, this is evidence that the domain controller did not return a referral.
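
If you save the /pktinfo output to a text file, a few lines of Python can do the "is my namespace in there?" check for you. This is only a sketch: the function name is hypothetical, and it makes no assumption about the layout of the output beyond the namespace path appearing somewhere in it. Leading backslashes are stripped before comparing, so the search does not depend on how many the tool prints.

```python
def namespace_in_output(cache_output: str, namespace: str) -> bool:
    """Case-insensitive search for a namespace path in saved cache output."""
    # Drop leading backslashes so \\domain\root and \domain\root both match
    needle = namespace.strip().lstrip("\\").lower()
    return bool(needle) and needle in cache_output.lower()
```

For example, read the saved file into a string and call namespace_in_output(text, r"\\dacmt.local\shared"). Because this is a plain substring search, it would also match a longer root name that starts with the same characters, so treat a positive result as a hint rather than proof.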

  • dfsutil.exe cache domain flush
  • dfsutil.exe cache referral flush
  • dfsutil.exe cache provider flush

  • ipconfig /flushdns
  • dfsutil.exe /pktflush
  • dfsutil.exe /spcflush
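
Since these flush commands tend to be run together, they are an easy candidate for a small script. The sketch below is one way to do it in Python, assuming it runs on a Windows client where ipconfig and dfsutil.exe are on the PATH. The function name and the injectable run parameter are my own additions so the logic can be exercised without touching a real machine.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands taken from the list above, run in this order
FLUSH_COMMANDS: List[List[str]] = [
    ["ipconfig", "/flushdns"],
    ["dfsutil.exe", "cache", "domain", "flush"],
    ["dfsutil.exe", "cache", "referral", "flush"],
    ["dfsutil.exe", "cache", "provider", "flush"],
]


def flush_client_caches(
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> List[List[str]]:
    """Run every flush command in order and return the ones that failed."""
    failed = []
    for command in FLUSH_COMMANDS:
        # A non-zero exit code is treated as a failure
        if run(command) != 0:
            failed.append(command)
    return failed
```

Calling flush_client_caches() with no arguments runs the real commands. An empty list back means every command exited with code 0.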

By default, DFSN stores NetBIOS names for root servers. DFSN can also be configured to use DNS names for environments without WINS servers. For more information, see the related article in the Microsoft Knowledge Base.

  •  DFS and System Configuration

Even when connectivity and name resolution are functioning correctly, DFS configuration problems may cause the error to occur on a client. DFS relies on up-to-date DFS configuration data, correctly configured service settings, and Active Directory site configuration.

First, verify that the DFS service is started on all domain controllers and on DFS namespace/root servers. If the service is started in all locations, make sure that no DFS-related errors are reported in the system event logs of the servers.

  • repadmin /showrepl * dc=dacmt,dc=local

When an administrator changes a domain-based namespace, the change is made on the Primary Domain Controller (PDC) emulator master. Domain controllers and DFS root servers periodically poll the PDC emulator for configuration information. If the PDC emulator is unavailable, or if “Root Scalability Mode” is enabled, Active Directory replication latencies and failures may prevent servers from issuing correct referrals.
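
The repadmin command above takes the domain in distinguished-name form (dc=dacmt,dc=local) rather than as a DNS name. If you script this for more than one domain, a tiny helper can do the conversion. This is a sketch with a function name of my own choosing, and it only handles the simple case of a DNS domain name made of dot-separated labels.

```python
def domain_to_naming_context(dns_domain: str) -> str:
    """Convert a DNS domain name such as dacmt.local to dc=dacmt,dc=local."""
    # Ignore surrounding whitespace, a trailing dot and empty labels
    labels = [label for label in dns_domain.strip().strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    return ",".join(f"dc={label}" for label in labels)
```

For my lab, domain_to_naming_context("dacmt.local") gives the dc=dacmt,dc=local string used in the repadmin command.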

  • DFS and NTFS Permissions

If a client cannot gain access to a shared folder specified by a DFS link, check the following:

  • Use the DFS administrative tool to identify the underlying shared folder.
  • Check status to confirm that the DFS link and the shared folder (or replica set) to which it points are valid.
  • The user should go to the Windows Explorer DFS property page to determine the actual shared folder that he or she is attempting to connect to.
  • The user should attempt to connect to the shared folder directly by way of the physical namespace. By using a command such as ping, net view or net use, you can establish connectivity with the target computer and shared folder.
  • If the DFS link has a replica set configured, then be aware of the latency involved in content replication. Files and folders that have been modified on one replica might not yet have replicated to other replicas.

It is also worth checking you do not have any general networking issues on the server you are connecting from and also that there are no firewall rules or Group Policies blocking File and Printer Sharing!

  • DFS Tab on DFS folders accessed through the DFS Namespace

One of the first things to determine when tracking a DFS access issue is the name of the underlying shared folder that the client has been referred to. Since Windows 2000, Windows Explorer has included a shell extension for precisely this purpose. When you right-click a folder that is in the DFS namespace, there is a DFS tab available in the Properties window. From the DFS tab, you can see which shared folder you are referencing for the DFS link. In addition, you can see the list of replicas that refer to the DFS link, so you can disconnect from one replica and select another. Finally, you can also refresh the referral cache for the specified DFS link. This makes the client obtain a new referral for the link from the DFS server.

  • Replication Latency

Because the topology knowledge is stored in the domain’s Active Directory, there is some latency before any modification to the DFS namespace is replicated to all domain controllers.

From an administrator’s perspective, remember that the DFS administrative console connects directly to a domain controller. Therefore, the information that you see on one DFS administrative console might not be identical to the information on another DFS administrative console (which might be obtaining its information from a different domain controller).

From a client’s perspective, you have the additional possibility that the client itself might have cached the information before it was modified. So, even though the information about the modification might have replicated to all the domain controllers, and even if the DFS servers have obtained updates about the modification, the client might still be using an older cached copy. The ability to manually flush the cache before the referral time-out has expired, which is done from the DFS tab in the Properties window in Windows Explorer, can be useful in this situation.
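
To make the client-side caching behaviour concrete, here is a toy Python model of a referral cache with a time-out. It is not how the Windows client is implemented. The class name, the ask_server callback and the injectable clock are all hypothetical, and the time-out is whatever value you pass in.

```python
import time
from typing import Callable, Dict, Tuple


class ReferralCache:
    """Toy model of a client-side referral cache with a time-out."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # link -> (cached target, expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def lookup(self, link: str, ask_server: Callable[[str], str]) -> str:
        now = self._clock()
        cached = self._entries.get(link)
        if cached is not None and now < cached[1]:
            # Still inside the time-out: the answer may already be stale
            return cached[0]
        # Missing or expired: ask the server for a fresh referral
        target = ask_server(link)
        self._entries[link] = (target, now + self._ttl)
        return target

    def flush(self) -> None:
        """Forget everything, like a manual cache flush on the client."""
        self._entries.clear()
```

If the server-side target changes while an entry is still cached, lookup keeps returning the old target until the time-out passes or flush is called. That is the situation described above.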

  • DFSDiag /testdcs /domain:dacmt.local
  • DFSDiag /testsites /dfspath:"\\dacmt.local\Shared\Folder 1" /full
  • DFSDiag /testsites /dfspath:\\dacmt.local\Shared /recurse /full
  • DFSDiag /testdfsconfig /dfsroot:\\dacmt.local\Shared
  • DFSDiag /testdfsintegrity /dfsroot:\\dacmt.local\Shared
  • DFSDiag /testreferral /dfspath:\\dacmt.local\Shared

The /testdcs option checks the configuration of the domain controllers. It verifies that the DFS Namespace service is running on all the DCs and that its Startup Type is set to Automatic, checks for support of site-costed referrals for NETLOGON and SYSVOL, and verifies the consistency of site association by hostname and IP address on each DC.
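
Folder names with spaces (such as "Folder 1" in the list above) are an easy way to break a DFSDiag command line. If you drive DFSDiag from a script, one option is to build the arguments as a list and let Python handle the quoting. The helper below is a sketch with a name of my own. It only assembles the /testsites arguments shown above and does not run anything itself.

```python
from typing import List


def dfsdiag_testsites_args(dfs_path: str, recurse: bool = False,
                           full: bool = True) -> List[str]:
    """Build the argument list for a DFSDiag /testsites run."""
    # The whole /dfspath:... value stays one list element, spaces included
    args = ["DFSDiag", "/testsites", f"/dfspath:{dfs_path}"]
    if recurse:
        args.append("/recurse")
    if full:
        args.append("/full")
    return args
```

On a Windows machine with DFSDiag installed, the list can be passed to subprocess.run, which quotes the element containing the space when it builds the command line. I am assuming DFSDiag accepts the path quoted this way, so test it against a folder with a space in its name.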

DFSR and File Locking

DFS lacks a central feature important for a collaborative environment where inter-office file servers are mirrored and data is shared: file locking. Without integrated file locking, using DFS to mirror file servers exposes live documents to version conflicts. For example, if a colleague in Office A opens and edits a document at the same time as a colleague in Office B, DFS keeps only the changes made by the person who closes the file last.

A version conflict can also arise when the two colleagues are not working on the same file at the same time. DFS Replication is a single-threaded “pull” process, so synchronisation tasks can easily queue up and create a backlog. As a result, changes made at one location are not immediately replicated to the other, and this delay creates another opportunity for file version conflicts.

http://blogs.technet.com/b/askds/archive/2009/02/20/understanding-the-lack-of-distributed-file-locking-in-dfsr.aspx
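
The Office A and Office B example can be reduced to a few lines of Python. This is a deliberately simplified model of "the last person to close the file wins", not the actual DFSR conflict algorithm, and the Save class and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Save:
    office: str
    closed_at: float  # time at which the editor closed (saved) the file
    content: str


def surviving_version(saves: List[Save]) -> Save:
    """Without locking, the copy that was closed last replaces the others."""
    if not saves:
        raise ValueError("no saves to compare")
    return max(saves, key=lambda save: save.closed_at)
```

If Office A closes the document at time 10 and Office B at time 12, surviving_version returns Office B's copy and Office A's edits are lost from the live file.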

NETBIOS Considerations

By default, DFS uses NetBIOS names for all target servers in the namespace. This allows clients that support only NetBIOS name resolution to locate and connect to targets in a DFS namespace. Administrators can use NetBIOS names when specifying target names, and those exact paths are added to the DFS metadata. For example, an administrator can specify a target \\dacmt\Users, where dacmt is the NetBIOS name of a server whose DNS name (FQDN) is dacmt.local.

http://support.microsoft.com/kb/244380
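
When reviewing a list of folder targets, it helps to see at a glance which ones were entered with a short NetBIOS-style name and which with a DNS name. The Python sketch below pulls the server part out of a UNC path and applies a simple heuristic: no dots and at most 15 characters. Both function names are my own, and the heuristic cannot tell a short DNS host name from a NetBIOS name.

```python
def unc_server(path: str) -> str:
    """Return the server component of a UNC path such as \\\\server\\share."""
    if not path.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {path!r}")
    # Everything between the leading \\ and the next backslash
    server = path[2:].split("\\", 1)[0]
    if not server:
        raise ValueError(f"no server name in: {path!r}")
    return server


def looks_like_netbios_name(server: str) -> bool:
    """Heuristic: single label (no dots) and no longer than 15 characters."""
    return "." not in server and 0 < len(server) <= 15
```

With the example from the paragraph above, unc_server(r"\\dacmt\Users") returns dacmt, which passes the NetBIOS check, while the server part of \\dacmt.local\shared does not.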

Determine the root cause of a vSphere management or connectivity issue

Points to think about

  • Check Network Connectivity
  • Check Storage Connectivity
  • Check vCenter Connectivity
  • Check Host Connectivity
  • Check VM Connectivity
  • Check Host Logs
  • Check vCenter Logs
  • Check Monitoring Systems
  • Check Physical Switches
  • Check Cables
  • Check Virtual Switches
  • Check FC Switches
  • Check SAN Storage
  • Check vCenter DB Connectivity
  • Check Router Connectivity
  • Check Power Issues
  • Check KB Articles for Error IDs

Troubleshoot ESXi host management and connectivity issues

Troubleshooting

  • Verify that the network adapter and server hardware are supported. For more information, see Verifying ESX/ESXi host hardware (System, Storage, and I/O devices) are supported (1003916).
  • Verify that the network link is up. For more information, see Verifying a network link (1003724).
  • Verify that proper VLAN IDs exist on the portgroup. For more information, see Configuring a VLAN on a portgroup (1003825).
  • If you are using NIC teaming on the virtual switch, verify that the physical switch ports are configured consistently for each teamed network adapter and that the proper load balancing policy is configured on the virtual switch. VMware recommends using the default Route based on the originating virtual port ID load balancing policy. If link aggregation is configured on the physical switch, use the Route based on ip hash load balancing policy. For more information, see NIC teaming in ESX/ESXi (1004088) and ESX/ESXi host requirements for link aggregation (1001938).
  • Verify that the speed and duplex of the network links are consistent. For more information, see Configuring the speed and duplex of an ESX/ESXi Server network adapter (1004089).
  • Verify the ESX host networking configuration. For more information, see Verifying ESX Server host networking configuration on the service console (1003796).
  • Verify that port security is not configured on the physical switch ports. For more information, see Loss of network connectivity when port security is configured on the physical switch (1002811).
  • Verify that portfast (or equivalent) is enabled on all of the ESX host’s physical switch ports. For more information, see STP may cause temporary loss of network connectivity when a failover or failback event occurs (1003804).
  • Verify the integrity of the physical network adapter. For more information, see Verifying the integrity of the physical network adapter (1003686).
  • Verify that no duplicate IP addresses exist on the network. For more information, see Warning for Duplicate IP Address for VMware VMotion Port Group (10165) or Duplicate IP address detected (1020647).
  • Verify that all the NICs participating as uplinks on the vSS and VDS are observing all the network information. For more information, see Observed IP range does not show network in ESX or ESXi (1006744). Until the observed IP range issue is resolved on the external physical network, you can set the problematic NIC to unused and then verify the networking functionality again.
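
The NIC teaming rule in the list above (default policy unless the physical switch uses link aggregation, in which case ip hash) is simple enough to write down as a check. The sketch below is plain Python with hypothetical function names. It does not talk to vCenter or a host, so you would feed it values you have read from your own configuration.

```python
from typing import Optional

DEFAULT_POLICY = "Route based on the originating virtual port ID"
IP_HASH_POLICY = "Route based on ip hash"


def expected_policy(link_aggregation_on_switch: bool) -> str:
    """Policy named in the checklist for the given physical switch setup."""
    return IP_HASH_POLICY if link_aggregation_on_switch else DEFAULT_POLICY


def teaming_mismatch(configured_policy: str,
                     link_aggregation_on_switch: bool) -> Optional[str]:
    """Return a warning when the vSwitch policy does not fit the switch."""
    expected = expected_policy(link_aggregation_on_switch)
    if configured_policy == expected:
        return None
    return f"vSwitch uses '{configured_policy}' but '{expected}' is expected"
```

A return value of None means the configured policy matches what the checklist expects for that switch setup.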

Useful Link

vSphere_Troubleshooting

Analyse troubleshooting data to see if the problem lies in the Virtual or the Physical layer

Troubleshooting

Troubleshooting can be frustrating and challenging, and knowing where to look and what to do is the key to finding and resolving problems quickly. Don’t wait for an obvious failure before you read the log files, however: many problems are not obvious, and the logs are a good place to spot early signs of them. Keep a list of all the log files handy so that you do not waste time during an incident trying to remember their paths and filenames. You will not know how to resolve every problem you encounter, so rely on the resources available to you, including documentation, support forums, the knowledge base, and VMware’s technical support. Being properly prepared to handle problems when they occur is one of the best troubleshooting skills you can have.

What you can do

  • Check monitoring systems if you have them (SCOM, Nagios, etc.). Some companies have real-time screens showing monitoring.
  • Check with your network team, as they will more than likely be alerted to physical problems faster than you.
  • Can you isolate the problem to a VM, host, switch or router, or is the issue affecting the whole network?
  • Ensure that the port group names associated with the virtual machine’s network adapters exist in your vSwitch or Virtual Distributed Switch and are spelt correctly.
  • Check for any warning triangles or exclamation marks on the standard or distributed switches.
  • Verify the virtual network adapter is present and connected for all VMkernel ports
  • Verify that the networking within the virtual machine’s guest operating system is correct
  • Verify that the vSwitch has enough ports for the virtual machine
  • If the vSwitch uses the Route based on ip hash load balancing policy, ensure the physical switch ports are configured as a port-channel.
  • Shut down all but one of the physical ports the NICs are connected to, and toggle this between all the ports by keeping only one port connected at a time. Take note of the port/NIC combination where the virtual machines lose network connectivity.
  • Check Logs
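
The "one port at a time" test in the list above is easier to run methodically if you write the loop down. Here is a minimal Python sketch of the bookkeeping. The function name and the vm_reachable_with_only callback are hypothetical. The callback stands for the manual step of leaving only that uplink connected and testing whether the virtual machines still have connectivity.

```python
from typing import Callable, Iterable, List


def find_bad_uplinks(uplinks: Iterable[str],
                     vm_reachable_with_only: Callable[[str], bool]) -> List[str]:
    """Test each uplink on its own and collect the ones where VMs lose network."""
    bad = []
    for uplink in uplinks:
        # Only this uplink is connected while the check runs
        if not vm_reachable_with_only(uplink):
            bad.append(uplink)
    return bad
```

The returned list is the set of port/NIC combinations to raise with the network team. An empty list suggests the problem is not tied to a single uplink.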