when comparing server identities, the previous commit removed the
comparison of the returned clientids.
however, running against the emc server, we ran into data
servers returning the same server major, minor, and scope identities
but different clientids. we decided to treat these as the same data
server.
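
a minimal sketch of the comparison described above. the struct layout, field sizes, and the name same_server() are assumptions for illustration, not the client's actual definitions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of the identity a server returns;
 * field names and sizes are assumptions for illustration only. */
struct server_identity {
    uint64_t clientid;      /* deliberately ignored by the comparison */
    uint64_t minor_id;
    char     major_id[64];  /* NUL-terminated for simplicity */
    char     scope[64];     /* NUL-terminated for simplicity */
};

/* Returns 1 when major id, minor id, and scope all match,
 * regardless of the clientid each server handed out. */
static int same_server(const struct server_identity *a,
                       const struct server_identity *b)
{
    assert(a != NULL && b != NULL);
    return a->minor_id == b->minor_id
        && strcmp(a->major_id, b->major_id) == 0
        && strcmp(a->scope, b->scope) == 0;
}
```

two identities that differ only in clientid compare equal here, which matches the emc case.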
adding -o timeout=value (in seconds) mount option to override the default
timeout value for non-io upcalls. Default timeout is
at least the lease_time (in seconds) of an established mount, or 20 seconds.
read/write upcalls also take into account the number of bytes in the request,
assuming 100MB/s network speed and 100MB/s disk speed
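
a rough sketch of how such a timeout could be computed. it assumes the default is the larger of lease_time and 20 seconds, that 100MB means 100 * 1024 * 1024 bytes, and that network and disk transfer times are simply added to a base timeout. the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define MIN_UPCALL_TIMEOUT_SEC 20u                    /* 20-second floor */
#define ASSUMED_BYTES_PER_SEC  (100ull * 1024 * 1024) /* assumed 100MB/s */

/* Timeout for non-io upcalls: the mount option wins when it is set
 * (non-zero); otherwise use the larger of lease_time and the floor. */
static uint32_t upcall_timeout(uint32_t mount_timeout_sec,
                               uint32_t lease_time_sec)
{
    if (mount_timeout_sec != 0)
        return mount_timeout_sec;
    return lease_time_sec > MIN_UPCALL_TIMEOUT_SEC
        ? lease_time_sec : MIN_UPCALL_TIMEOUT_SEC;
}

/* Timeout for read/write upcalls: the base timeout plus the time to move
 * the request once over the network and once to disk, both at the
 * assumed rate. The result saturates at UINT32_MAX. */
static uint32_t io_upcall_timeout(uint32_t base_sec, uint64_t bytes)
{
    uint64_t transfer_sec, total;

    assert(base_sec > 0); /* a zero timeout would expire immediately */
    transfer_sec = bytes / ASSUMED_BYTES_PER_SEC;
    total = (uint64_t)base_sec + 2 * transfer_sec;
    return total > UINT32_MAX ? UINT32_MAX : (uint32_t)total;
}
```

with these assumptions a 200MB write on a mount with a 20-second base timeout gets 24 seconds.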
when we are doing non-pnfs and writing to a netapp filer, even though we
write UNSTABLE it returns us FILE_SYNC4. since we were doing unstable
writes we were not sending getattrs, and since the data servers were returning
stable commits we didn't send commit. in doing so, we never updated our
attribute cache and had a wrong file size.
client was previously only sending RECLAIM_COMPLETE during the grace period. the spec mandates that RECLAIM_COMPLETE be sent before any non-reclaim locking operations, regardless of grace period. when recovery code detects an NFS4ERR_NO_GRACE error, send RECLAIM_COMPLETE immediately before attempting any out-of-grace recovery
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
CB_GETATTR specifies the delegation by filehandle, but deleg_fh_cmp() was comparing superblock and fileid. renamed deleg_fh_cmp() to deleg_file_cmp(), and created new deleg_fh_cmp() to compare actual filehandles
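
the difference between the two comparisons can be sketched as follows. the struct definitions and function names are simplified stand-ins, not the client's real types; the 128-byte limit is NFS4_FHSIZE from RFC 5661:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FH_MAX_LEN 128 /* NFS4_FHSIZE in RFC 5661 */

/* Hypothetical stand-ins carrying only the fields needed here. */
struct fh      { uint32_t len; unsigned char data[FH_MAX_LEN]; };
struct file_id { const void *superblock; uint64_t fileid; };

/* Filehandle comparison: same length and same opaque bytes.
 * This is what a CB_GETATTR lookup needs. */
static int fh_equal(const struct fh *a, const struct fh *b)
{
    assert(a->len <= FH_MAX_LEN && b->len <= FH_MAX_LEN);
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}

/* File comparison: same filesystem and same fileid. Two different
 * filehandles may still refer to the same file by this test. */
static int file_equal(const struct file_id *a, const struct file_id *b)
{
    return a->superblock == b->superblock && a->fileid == b->fileid;
}
```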
call to nfs41_attr_cache_lookup() was not setting flags in info.attrmask, so the change and size attributes were not being encoded in the CB_GETATTR response
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
fixes an issue introduced by commit 1f7e560a9a, "pnfs: fix for short write on ds error"
when a write is satisfied completely by pnfs, the upcall was returning out_len=0. the only time i noticed this was against python server, where cp in cygwin would succeed but print 'No space left on device'
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
added expiration field to struct attr_cache_entry. expiration timer is reset whenever attr_cache_update() is called with a 'change' attribute
new function attr_cache_entry_expired() is used by nfs41_attr_cache_lookup() and name_cache_lookup()/entry_invis() to prevent the use of expired attributes
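
a simplified sketch of the expiration logic. the struct, the 20-second lifetime, and the function names are assumptions for illustration; the real attr_cache_entry has more fields and the actual lifetime may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define ATTR_LIFETIME_SEC 20 /* assumed lifetime, not taken from the source */

/* Hypothetical, cut-down cache entry. */
struct attr_entry {
    time_t expiration; /* absolute time after which attributes are stale */
};

/* Reset the expiration timer only when the update carries a 'change'
 * attribute; other updates leave the timer alone. */
static void attr_entry_update(struct attr_entry *e, int has_change, time_t now)
{
    assert(e != NULL);
    if (has_change)
        e->expiration = now + ATTR_LIFETIME_SEC;
}

/* Lookups call this first and treat an expired entry as a cache miss. */
static int attr_entry_expired(const struct attr_entry *e, time_t now)
{
    assert(e != NULL);
    return now >= e->expiration;
}
```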
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
nfs41_delegation_remove_srvopen() gets a reference to a delegation via delegation_find(). this reference wasn't being released
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
given a delegation type received on open, set buffering flags accordingly.
however, provide a -o nocoherence mount option that will enable read/write
buffering regardless
when we receive a delegation recall, we need to make a downcall
to the kernel and change the buffering/caching policy on this file.
on each open and close we pass an srv_open pointer to the nfsd to
keep in case it receives a cb_recall. we store srv_open in the
delegation state if we got a delegation.
when open_update_cache() calls open_delegation_return() to return its delegation, it doesn't update its local variable 'delegation_type' to reflect the new delegation->type. this causes the next call to nfs41_name_cache_insert() to continue failing
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
when only the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag is set, we only have to test/free delegation stateids. nfs41_client_state_revoked() handles this by calling stateid_array() with an empty list of opens
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
adds a check for sequence flag SEQ4_STATUS_RESTART_RECLAIM_NEEDED in nfs41_recover_sequence_flags(). if other _STATE_REVOKED flags are set, call nfs41_client_state_revoked() as normal but follow it up with a RECLAIM_COMPLETE. otherwise, reclaim all locks with nfs41_recover_client_state(). like other callers of nfs41_recover_client_state(), we need to handle BADSESSION errors because the recovery operations are sent with try_recovery=FALSE. because this logic is similar to our default NFS4ERR_BADSESSION handling, i moved the common code into a new nfs41_recover_session() function in recovery.c
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
new function nfs41_recover_sequence_flags() in recovery.c takes care of checking status flags, entering recovery mode, and reclaiming state
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
the thread-by-server model breaks down when using dense layouts, because multiple stripes could be mapped to a single data server, and the per-data-server thread would have to send a COMMIT for each stripe
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
windows server does not allow removing an open file and instead
errors with nfs41err_file_open. we were failing to remove the silly
renamed file.
this addresses two error cases:
1) when the src entry is negative or does not exist: the dst entry could be negative or point to something else, so it needs to be removed
2) when the dst_parent entry is negative or does not exist: src needs to be made a negative entry
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
the namecache/delegation feedback code in nfs41_ops.c:open_update_cache() is interfering with delegation recovery. if its call to nfs41_name_cache_insert() fails with ERROR_TOO_MANY_OPEN_FILES, the delegation is returned instead of being recovered. this shouldn't happen during recovery, because we're replacing a delegation rather than adding a new one
nfs41_open() now remembers whether we already had a delegation. if we did, open_update_cache() will pass OPEN_DELEGATE_NONE to nfs41_name_cache_insert() to prevent it from returning ERROR_TOO_MANY_OPEN_FILES. if we didn't already have a delegation, nfs41_delegreturn() needs to be called with the same try_recovery flag from nfs41_open()
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
nfs41_name_cache_remove() needs to update the 'numlinks' attribute for other links, even if the file being removed is not found in the cache. to search for its attr cache entry, nfs41_name_cache_remove() now requires a fileid argument. nfs41_remove() only gets a pointer to the parent's filehandle, so it also needs the target fileid argument
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
if any of (SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED | SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED | SEQ4_STATUS_ADMIN_STATE_REVOKED) are set in the session flags returned by SEQUENCE:
* enter client recovery mode
* determine which state was lost with TEST_STATEID (consider all delegations, opens, locks, and layouts)
* send FREE_STATEID for each stateid revoked
* recall all layouts and forget devices (required by 12.7.2: Dealing with Lease Expiration on the Client)
* call recover_delegation(), recover_open() or recover_locks() to reclaim each lock
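
the flag check and the test/free pass above can be sketched as below. the flag values come from RFC 5661 (SEQUENCE). the tracked_state struct and both function names are hypothetical; setting 'freed' stands in for actually sending FREE_STATEID:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Status flag values from RFC 5661. */
#define SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED  0x00000008
#define SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED 0x00000010
#define SEQ4_STATUS_ADMIN_STATE_REVOKED        0x00000020

#define REVOKED_MASK (SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED  | \
                      SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED | \
                      SEQ4_STATUS_ADMIN_STATE_REVOKED)

/* Hypothetical record for one stateid (delegation, open, lock, layout). */
struct tracked_state {
    int revoked; /* result of TEST_STATEID for this stateid */
    int freed;   /* set once FREE_STATEID would have been sent */
};

/* True if any of the three flags asks for revoked-state recovery. */
static int needs_revoked_state_recovery(uint32_t sequence_flags)
{
    return (sequence_flags & REVOKED_MASK) != 0;
}

/* Mark every revoked stateid as freed; returns how many were freed,
 * which is also how many need to be reclaimed afterwards. */
static size_t free_revoked(struct tracked_state *states, size_t count)
{
    size_t i, freed = 0;

    assert(states != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (states[i].revoked && !states[i].freed) {
            states[i].freed = 1;
            freed++;
        }
    }
    return freed;
}
```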
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
in the case of ds writes returning FILE_SYNC, we don't need to send a COMMIT or LAYOUTCOMMIT to the mds. COMMIT and LAYOUTCOMMIT, however, are the places where we do GETATTR(size) to update the attribute cache. so we must add a separate call to GETATTR to accomplish this after ds writes return FILE_SYNC
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
small mds writes were using DATA_SYNC4 and were not followed by COMMITs, so there was no guarantee of an updated file size before returning success
use FILE_SYNC4 for these small writes, and send COMMITs for both UNSTABLE4 and DATA_SYNC4
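
a small sketch of the resulting policy. the stable_how4 values are from the NFSv4 spec; the 4096-byte threshold for a "small" write and the function names are assumptions, since the message does not give the actual cutoff:

```c
#include <assert.h>
#include <stddef.h>

/* stable_how4 as defined by the NFSv4 protocol. */
enum stable_how4 { UNSTABLE4 = 0, DATA_SYNC4 = 1, FILE_SYNC4 = 2 };

#define SMALL_WRITE_THRESHOLD 4096 /* assumed cutoff, not from the source */

/* Small mds writes ask for FILE_SYNC4 so no COMMIT is needed;
 * larger writes go out UNSTABLE4 and are committed afterwards. */
static enum stable_how4 choose_stable_how(size_t length)
{
    return length <= SMALL_WRITE_THRESHOLD ? FILE_SYNC4 : UNSTABLE4;
}

/* Decide from the level the server reports back: anything short of
 * FILE_SYNC4 (UNSTABLE4 or DATA_SYNC4) is followed by a COMMIT. */
static int needs_commit(enum stable_how4 committed)
{
    assert(committed == UNSTABLE4 || committed == DATA_SYNC4 ||
           committed == FILE_SYNC4);
    return committed != FILE_SYNC4;
}
```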
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>