[Tfug] Poor NFS write performance with Linux

John Gruenenfelder jetpackjohn at gmail.com
Fri Jan 6 02:40:31 MST 2012


I'm looking for some tips and suggestions to improve the dismal
performance I am currently seeing with NFSv4 writes.

This setup is at home between my desktop PC and my file/Myth server.
As far as I can tell, everything is operating correctly from a
hardware perspective.  The machines are on a gigabit Ethernet network,
both are using the same 64-bit Linux 3.1.5 kernel, and both are running
Debian.  The desktop PC is configured with XFS file systems on LVM on
encryption on non-Linux software RAID-0.  The file server is similar
with XFS file systems on LVM on encryption on Linux software RAID-5.

NFS read performance is not super, though I could live with it.  NFS
write performance, however, is truly abysmal.  I've run some simple
read/write throughput and timing tests between the machines and I
think they show enough to rule out LVM, encryption, and RAID as
culprits.  The tests were done with two data sets: a single 160 MB
file and an 18 MB directory tree containing 920 files and directories.
I ran each test five times and averaged, but as these were simple
tests I did not do anything to factor out the cache beyond the
multiple runs.  The one exception is the large-file NFS copy from the
server to the desktop, where I had to fool the I/O layer into giving
me fresh data each time; otherwise the test was cache dominated and
meaningless.  In the tests, "TO remote" is a transfer from my desktop
machine to the server, "FROM remote" is a transfer from the server to
my desktop machine, and all commands were executed on the desktop
machine.  For comparison, alongside NFS I/O I also timed FTP and scp
transfers.  Times were acquired with the external 'time' utility, not
the bash built-in.  Here is the data (rates are given in
megabytes/second):

Copy 160.79 MB single file TO remote
FTP     108.3600 MB/s
scp      58.5560 MB/s
NFS cp   32.2751 MB/s

Copy 160.79 MB single file FROM remote
FTP     109.3809 MB/s
scp      55.3684 MB/s
NFS cp   77.6763 MB/s

Copy 17.84 MB tree (920 files and directories) TO remote
FTP       9.1206 MB/s
scp       2.5551 MB/s
NFS cp    0.2052 MB/s

Copy 17.84 MB tree (920 files and directories) FROM remote
FTP       8.1017 MB/s
scp       8.4710 MB/s
NFS cp   16.8056 MB/s

Delete 17.84 MB tree (920 files and directories) on remote
NFS rm -r   16.18 seconds
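For the curious, each run where caching mattered looked roughly like
the following sketch (file name and mount point are placeholders, not
my actual paths); dropping the page cache is how you force the next
read to come from disk rather than memory:

```shell
# Sketch of one timed run -- bigfile.bin and /mnt/nfs are placeholders.
# Flush dirty pages, then drop the page cache (requires root) so the
# following copy reads fresh data instead of hitting the cache.
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null
# Time the copy with the external time utility, not the bash built-in.
/usr/bin/time -p cp /mnt/nfs/bigfile.bin /tmp/bigfile.bin
```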

The write performance when copying the directory tree to the server is
the most telling: that test took about 87 seconds on average.  The
time to delete that same tree with 'rm -r' over NFS was also extremely
long, especially since the same operation, done locally, finishes in a
fraction of a second.

Google hasn't really been much help.  I found little information on
NFS performance tuning, and what I did find was very old (written when
NFSv3 was new, for example), so NFSv4-specific information was almost
nonexistent.  What I did find indicated that fiddling with the rsize
and wsize values is a good thing to try, but I also found mention that
these values are sufficiently large on modern kernels and also that
they may be negotiated somehow between client and server.  At any
rate, the old documents talked about defaults in the 4K to 8K range
and perhaps trying 32K as a high value, but my system shows a default
value of 524288 for both parameters.  Other tips included checking
network statistics for errors, retransmissions, etc. but all of those
values look fine on my system.  I'm using TCP for the NFS protocol,
but the NFS howto indicates that on a properly configured network the
performance difference between TCP and UDP is negligible.
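Incidentally, the values the client actually negotiated can be checked
directly; something like the following should work on any client with
an active NFS mount:

```shell
# Show the rsize/wsize (and other options) negotiated per NFS mount.
nfsstat -m
# Or read the options straight out of the kernel's mount table:
grep nfs4 /proc/mounts
```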

The most useful suggestion I found was to switch from synchronous
operation to asynchronous.  Async used to be the default, but it was
changed a number of years ago to sync because data integrity could be
subtly compromised without any noticeable error or indication under
certain circumstances.  I found a short thread from a couple of years
ago that said the difference between sync and async operation was
substantial in the write case, but it didn't give any numbers to back
it up.  That same thread also suggested using XFS over ext3 for better
write performance, perhaps a 30% improvement, but I am already using
XFS.
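If I do end up trying async, it is a one-word change on the server; a
hypothetical /etc/exports line (the path and subnet here are made up,
not my real ones) would look like:

```shell
# /etc/exports on the server -- placeholder path and subnet.
# 'async' lets the server acknowledge a write before it reaches disk,
# which is exactly the integrity trade-off described above.
/export/media  192.168.1.0/24(rw,async,no_subtree_check)
```

followed by 'exportfs -ra' on the server to reload the export table.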

I'm not sure if it is related, but I also encountered a "stale" data
issue while running these tests.  I read a text file on the server
over an NFS mount and then I edited it on the server via an SSH
session, but when I accessed it again (using diff) over the NFS mount
I was still seeing the old data.  Cat'ing it in the SSH session showed
the correct data, but diff kept showing the wrong thing several times.
I finally cat'ed the file over the NFS mount and only then saw the
new data, and after that diff was okay.  Again, I don't know if it is
in any way related to the other NFS issue, and I'm pretty sure it's
not a common occurrence since I think I would have noticed it before.
It's somewhat troubling, though.
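From what I've read, that staleness would be consistent with NFS
attribute caching on the client; if it keeps happening, mount options
along these lines (server name and mount point are placeholders)
should tighten coherence, at some cost in performance:

```shell
# Shorten the attribute cache timeout to 1 second (placeholder names;
# the client otherwise revalidates attributes only every 3-60 s).
sudo mount -o remount,actimeo=1 server:/export /mnt/nfs
# Or disable attribute caching entirely -- strict, but slower:
sudo mount -o remount,noac server:/export /mnt/nfs
```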

Okay, that was a bit long-winded, but I hope thorough.  Does anybody
have any suggestions on what I can try or check to improve the NFS
write performance?  If necessary, I suppose I could try switching from
sync to async.  It used to be the default and the world didn't end,
but if there is a better solution then I'd rather do that.  After all,
it somewhat defeats the purpose of having a RAID-5 array to guard
against hardware failures if the software on those drives is munching
on the data anyway.

Even though this is at home, once I get another workstation repaired
at work, 2+ computers there will be sharing a lot of data over NFS and
will be running similar software so I'd like to make sure I don't run
into the same problem there.


-- 
--John Gruenenfelder    Systems Manager, MKS Imaging Technology, LLC.
Try Weasel Reader for Palm OS  --  http://weaselreader.org
"This is the most fun I've had without being drenched in the blood
of my enemies!"
        --Sam of Sam & Max


