Tuesday, 21 April 2015

Troubleshooting CIFS issues


• Use "sysstat -x 1" to determine how many CIFS ops/s are being served and how much CPU is being utilized.

• Check /etc/messages for any abnormal messages, especially for oplock break timeouts.

• Use "perfstat" to gather data for analysis (note the information from "ifstat", "statit", "cifs stat", and "smb_hist", along with messages and general CIFS info).

• "pktt" may be necessary to determine what is being sent/received over the network.

• "sio" can be used to determine how fast data can be written to and read from the filer.

• Client troubleshooting may include reviewing the event logs, pinging the filer, and testing against a different filer or Windows server.

• If it is a network issue, check "ifstat -a" and "netstat -in" for any I/O errors or collisions.

• If it is a gigabit issue, check whether flow control is set to FULL on both the filer and the switch.

• On the filer, if only one volume is having an issue, run "df" to see if the volume is full.

• Run "df -i" to see if a volume on the filer is running out of inodes.

• From "statit" output, if only one volume is having an issue, check for disk fragmentation.

• Try the "netdiag -dv" command to test for a filer-side duplex mismatch.

• It is important to find out what the benchmark is and whether it is a reasonable one.

• If the problem is poor performance, try a simple file copy using Explorer and compare it with the application's performance. If both are the same, the issue is probably not the application. Rule out client problems and make sure the test is run on multiple clients.

• If it is an application performance issue, get all the details about:
◦ The version of the application
◦ Which specific parts of the application are slow, if any
◦ How the application works
◦ Is this equally slow while using another Windows server over the network?
◦ The recipe for reproducing the problem in a NetApp lab.
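
The Explorer copy comparison above is easier to repeat across several clients if the copy is timed by a script. The following is a minimal sketch in Python, not a NetApp tool: the function name timed_copy is hypothetical, and the source and destination paths are whatever local file and CIFS share path apply in your environment.

```python
import os
import shutil
import time


def timed_copy(src, dst):
    """Copy src to dst and return (bytes_copied, seconds, MiB_per_second).

    Hypothetical helper for illustration; dst would normally be a path
    on the CIFS share being tested.
    """
    size = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    rate = (size / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return size, elapsed, rate
```

Run the same copy from more than one client and against another Windows server, then compare the rates with what the application achieves. Client-side caching can skew repeated runs, so a file that has not been read recently gives a fairer number.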

• If the slowness only happens at certain times of the day, check whether those times coincide with other heavy activity on the filer, such as SnapMirror, Snapshots, dump, etc.

• If normal file reads/writes are slow:
◦ Check for a duplex mismatch (both client side and filer side)
◦ Check whether oplocks are enabled (they may have been turned off)
◦ Check if there is an Anti-Virus application running on the client. This can cause performance issues especially when copying multiple small files.
◦ Check "cifs stat" to see if the Max Multiplex value is near the cifs.max_mpx option value. Common situations where this option may need to be increased are when the filer is being used by a Windows Terminal Server or any other kind of server that might have many users opening new connections to the filer.
◦ Check the value of OpLkBkNoBreakAck in "cifs stat". Non-zero numbers indicate oplock break timeouts, which cause performance problems.
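
The last two "cifs stat" checks can be scripted against saved output. This is a rough sketch in Python under stated assumptions: the sample text is a hypothetical excerpt (real "cifs stat" output varies between Data ONTAP releases, so the patterns may need adjusting), the function name check_cifs_stat is made up for illustration, and the 90% cutoff for "near" is an arbitrary choice rather than a NetApp recommendation.

```python
import re

# Hypothetical excerpt; verify the layout against your own "cifs stat" output.
SAMPLE = """\
Max Multiplex = 48, Max pBlk Exhaust = 0, Max pBlk Reserve Exhaust = 0
OpLkBkNoBreakAck  3
"""


def check_cifs_stat(text, max_mpx, near_ratio=0.9):
    """Return a list of warnings found in saved "cifs stat" text.

    max_mpx is the current value of the cifs.max_mpx option.
    near_ratio (assumed 0.9) defines how close counts as "near".
    """
    findings = []

    m = re.search(r"Max Multiplex\s*=\s*(\d+)", text)
    if m:
        peak = int(m.group(1))
        if peak >= near_ratio * max_mpx:
            findings.append(
                f"Max Multiplex {peak} is near cifs.max_mpx {max_mpx}"
            )

    m = re.search(r"OpLkBkNoBreakAck\s+(\d+)", text)
    if m and int(m.group(1)) > 0:
        findings.append(
            f"OpLkBkNoBreakAck is {m.group(1)}: oplock break timeouts occurred"
        )

    return findings


if __name__ == "__main__":
    for line in check_cifs_stat(SAMPLE, max_mpx=50):
        print(line)
```

An empty list means neither condition was found in the text that was passed in; it does not rule out other causes of slow reads and writes.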
