Basic handling of owner and group security query (no dacl).
Added new upcall for NFS41_ACL_QUERY (driver and daemon code).
On receiving NFS41_ACL_QUERY, the daemon first sends a getattr that
requests the owner and group attributes. We currently don't cache them!!!
Then, we parse the nfs4name format (i.e. user@domain or group@domain)
into user and domain. We currently ignore the domain part!!!
Then, we assume that whatever we are mapping is "known" locally
(i.e. to the LookupAccountName() api, which retrieves a SID for a given
name). Mapping from name to SID can only be done in userland. We then
copy the bytes via the upcall pipe to the kernel. If the received
user or group can't be mapped via LookupAccountName(), we create a
well-known null SID as the reply.
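A minimal sketch of the nfs4name split described above, in portable C. The helper name split_nfs4name() is hypothetical, and splitting at the last '@' is an assumption; the commit does not say how the daemon's parser treats names with more than one '@'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the daemon: split "name@domain" in
 * place. On return *name points at the text before the last '@' and *domain
 * at the text after it, or NULL when the string contains no '@'. */
static void split_nfs4name(char *nfs4name, char **name, char **domain)
{
    char *at;

    assert(nfs4name != NULL && name != NULL && domain != NULL);
    at = strrchr(nfs4name, '@');   /* assumption: split at the last '@' */
    *name = nfs4name;
    if (at == NULL) {
        *domain = NULL;            /* no domain part present */
    } else {
        *at = '\0';                /* terminate the name part */
        *domain = at + 1;
    }
}
```

Only the name part would then be handed to the local lookup, since the domain is currently ignored.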
Kernel creates a security descriptor in the absolute format and adds
the owner and group SIDs to it. Important: the RtlSetOwner/Group functions
only work with an absolute-format security descriptor, but the reply to
the user needs to be in the self-relative format.
A security query passes us a buffer to be filled with the security
context. However, the user doesn't know how big the buffer should be,
so the user is allowed to pass a null buffer and have the kernel return
how much memory is needed. This leads to 2 security queries =>
2 NFS41_ACL_QUERY upcalls => 2 getattr rpcs. This should be improved.
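The two-pass pattern above can be sketched in portable C as follows. Every name here (query_security_sketch(), the SKETCH_ status codes, the stand-in payload) is hypothetical; the counter only illustrates why each caller query currently costs one upcall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SKETCH_OK = 0, SKETCH_BUFFER_TOO_SMALL = 1 }; /* hypothetical codes */

static int upcall_count; /* counts simulated NFS41_ACL_QUERY upcalls */

/* Hypothetical query: always reports the required size through *needed and
 * fills buf only when it is large enough. */
static int query_security_sketch(void *buf, size_t buf_len, size_t *needed)
{
    static const char descriptor[] = "owner+group"; /* stand-in payload */

    assert(needed != NULL);
    upcall_count++;                  /* every query costs one upcall */
    *needed = sizeof(descriptor);
    if (buf == NULL || buf_len < sizeof(descriptor))
        return SKETCH_BUFFER_TOO_SMALL;
    memcpy(buf, descriptor, sizeof(descriptor));
    return SKETCH_OK;
}
```

A caller first passes a null buffer to learn the size, then repeats the query with a buffer of that size, which is where the second upcall comes from.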
TODO:
- need to add caching of owner/group attributes for a file?
- need to add calls to LDAP for more general mapping?
- need to cache reply of the ACL if supplied length is 0?
adding a check to see if the destination filename is currently opened by
looking through the list of open states stored for a given client.
fail rename with ERROR_FILE_EXISTS if we find an open.
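A rough sketch of the rename check, assuming the client's open states form a singly linked list and can be matched by path. The struct, the function name, and the path comparison are all assumptions for illustration; the real client may identify files differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NO_ERROR          0
#define SKETCH_ERROR_FILE_EXISTS 80 /* same value as Win32 ERROR_FILE_EXISTS */

/* Hypothetical open-state record; only the fields the check needs. */
struct open_state_sketch {
    const char *path;
    struct open_state_sketch *next;
};

/* Walk the client's open states and refuse the rename if the destination
 * is currently open. */
static int check_rename_target(const struct open_state_sketch *opens,
                               const char *dst_path)
{
    const struct open_state_sketch *s;

    assert(dst_path != NULL);
    for (s = opens; s != NULL; s = s->next) {
        if (strcmp(s->path, dst_path) == 0)
            return SKETCH_ERROR_FILE_EXISTS;
    }
    return SKETCH_NO_ERROR;
}
```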
exclusive locks are no longer held over LAYOUTGET, LAYOUTRETURN, or GETDEVICEINFO rpcs. this prevents a deadlock when CB_LAYOUTRECALL needs an exclusive lock while another operation is on the wire
introduced a 'pending' condition variable to protect access to state->layout while the layout's lock is not held
updated file_layout_recall() to compare the stateid sequence numbers to determine if the server has processed an outstanding LAYOUTGET or LAYOUTRETURN, where we're required to reply with NFS4ERR_DELAY
LAYOUTGET, LAYOUTRETURN, and GETDEVICEINFO can now be sent with try_recovery=TRUE because they no longer hold an exclusive lock. this makes it possible for recover_client_state() to recall all of the client's layouts without deadlocking
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
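The seqid comparison mentioned in the commit above might look roughly like this. The rule used here is an assumption, not a copy of file_layout_recall(): a recall whose seqid is more than one ahead of ours is taken to mean the server has already processed a LAYOUTGET or LAYOUTRETURN whose reply we have not seen. Seqid wraparound is ignored for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NFS4_OK       0
#define SKETCH_NFS4ERR_DELAY 10008 /* NFS4ERR_DELAY as defined by NFSv4.1 */

/* Hypothetical check: compare the seqid carried by CB_LAYOUTRECALL with the
 * seqid of the layout stateid we currently hold. */
static int recall_check_seqid(uint32_t local_seqid, uint32_t recall_seqid)
{
    /* assumption: a gap larger than one means a reply is still in flight */
    if (recall_seqid > local_seqid + 1)
        return SKETCH_NFS4ERR_DELAY;
    return SKETCH_NFS4_OK;
}
```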
nfs41_lock_stateid_arg() is now called only once in handle_read()/handle_write(), and pnfs_read()/pnfs_write() no longer depend on nfs41_open_state
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
on a successful LAYOUTGET, file_layout_fetch() calls layout_update() to copy the first layout segment returned and update the layout stateid
on a successful LAYOUTRETURN, file_layout_return() frees the layout segment and updates/clears the stateid
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
LAYOUTGET xdr now supports decoding of multiple layout segments, which are returned in a list with pnfs_layoutget_res_ok
LAYOUTRETURN no longer operates on an existing pnfs_file_layout. it now takes a copy of the layout stateid, and returns the new stateid with pnfs_layoutreturn_res
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
moved state data (stateid, flags, locks, and reference counts) out of struct pnfs_layout, which should represent a layout segment returned by LAYOUTGET
struct pnfs_layout_state now holds this state, along with a pointer to a single pnfs_file_layout
struct pnfs_file_layout_list is now a list of pnfs_layout_states, and was renamed to pnfs_layout_list
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
we already drop the lock between sending and receiving the rpc packets. now receive no longer blocks for long (i.e. 100ms) before unlocking the socket. this is needed for callbacks: the original rpc is sent and triggers a callback from the server, and we fork another thread to handle it, which needs to send a deleg_return rpc. if the original rpc takes the socket and blocks waiting for its reply, the deleg_return can't be sent, so the wait times out and the original rpc returns an error. instead we block only briefly and let the deleg_return go through, so that the server can reply successfully to the original rpc.
added pnfs_layout.open_count to count open references, and only return the layout when pnfs_open_state_close() takes the open_count to 0
use InterlockedIncrement/Decrement to avoid an exclusive lock on the layout
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
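A portable sketch of the open_count scheme from the commit above. It swaps in C11 atomics for the Win32 InterlockedIncrement/InterlockedDecrement calls, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the layout; only the open counter is modeled. */
struct layout_sketch {
    atomic_long open_count;
};

/* Take an open reference without holding an exclusive lock. */
static void layout_open(struct layout_sketch *layout)
{
    atomic_fetch_add(&layout->open_count, 1);
}

/* Drop an open reference. Returns true when this close took the count to
 * zero, i.e. when the caller should return the layout. */
static bool layout_close(struct layout_sketch *layout)
{
    long previous = atomic_fetch_sub(&layout->open_count, 1);

    assert(previous > 0);   /* a close without a matching open is a bug */
    return previous == 1;
}
```

Because the decrement returns the previous value atomically, exactly one closer observes the transition to zero.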
connectathon locking tests trigger an interrupted UNLOCK upcall, which leads to a bugcheck in CloseSrvOpen() when freeing the security context
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
leaving the CLOSE upcall non-interruptible, as interrupting it leads to issues with the security context.
making all other upcalls interruptible so that when something goes wrong we can ctrl-c out of a user application. otherwise, the machine requires a reboot (i.e. because we made the wait non-interruptible, nothing can kill it).
note: privacy will not work when we have more than 1 outstanding rpc, because that generates out-of-order replies, which sspi does not allow when privacy is enabled.
adding auth_wrap() and auth_unwrap() for per-message gss token protection required adding these methods to auth_sys and auth_none.
linux server doesn't support v2 kerberos tokens that have rotated data. sspi will always produce such tokens for aes. thus this code was only tested with v1 kerberos tokens (i.e. des).
we were checking for an error result from send_null but not setting
status, then going to "out_unlock" and, since status was still NO_ERROR,
trying to send bind_conn_to_session
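The control flow behind this fix can be sketched as follows. The stub functions and the error value 5 are placeholders, not the daemon's code; the point is only that the result of the first call must land in status before jumping to the label.

```c
#include <assert.h>

#define SKETCH_NO_ERROR 0

static int bind_attempts; /* counts how often the bind step runs */

/* Placeholder for send_null: fails on request with an arbitrary error. */
static int send_null_stub(int fail)
{
    return fail ? 5 : SKETCH_NO_ERROR;
}

/* Placeholder for sending bind_conn_to_session. */
static int bind_conn_stub(void)
{
    bind_attempts++;
    return SKETCH_NO_ERROR;
}

static int connect_sketch(int null_fails)
{
    int status;

    /* the fix: store the result in status instead of only testing it */
    status = send_null_stub(null_fails);
    if (status != SKETCH_NO_ERROR)
        goto out_unlock;
    /* ... further setup would go here ... */
out_unlock:
    /* the real code would release its lock here */
    if (status == SKETCH_NO_ERROR)
        status = bind_conn_stub();
    return status;
}
```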
switching the user's upcall wait from UserMode and TRUE (interruptible) to KernelMode and FALSE. the msdn documentation recommends this to keep drivers simple.
it seems to no longer generate interrupts on close irps but we are still able to ctrl-c running tests.
instead of getting the security context on every upcall, acquire the security context on open and save it in fobx. the cache manager makes read and write calls in the system security context, not the user's, so we need to use the context of the open instead.
instead of ignoring errors from proc_cb_compound_args(), return NFS4ERR_BADXDR. note that we still need to allocate the cb_compound_res structure to return this error
added null checks to the end of handle_cb_compound(); if the cb_compound_res allocation fails, we'd crash trying to access res->status and res->resarray_count
also fixed some indenting
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
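A simplified sketch of the error path described in the commit above. The struct, the function, and the -1 return for a failed allocation are hypothetical; only the two behaviors from the message are modeled: a decode failure is reported as NFS4ERR_BADXDR in an allocated result, and the result pointer is null-checked before use.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NFS4_OK        0
#define SKETCH_NFS4ERR_BADXDR 10036 /* NFS4ERR_BADXDR as defined by NFSv4.1 */

/* Hypothetical reduced form of the callback compound result. */
struct cb_res_sketch {
    int status;
    unsigned resarray_count;
};

/* decode_ok and alloc_ok simulate argument decoding and allocation. */
static int handle_cb_sketch(int decode_ok, int alloc_ok,
                            struct cb_res_sketch **reply)
{
    struct cb_res_sketch *res;

    assert(reply != NULL);
    res = alloc_ok ? calloc(1, sizeof(*res)) : NULL;
    if (res == NULL)
        goto out;
    if (!decode_ok) {
        res->status = SKETCH_NFS4ERR_BADXDR; /* needs res to carry the error */
        goto out;
    }
    res->status = SKETCH_NFS4_OK;
    res->resarray_count = 1;
out:
    *reply = res;
    /* null check before touching res->status */
    return res != NULL ? res->status : -1;
}
```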
on failure to renew a session, we don't need to free the session (this leads to crashes). if we simply return the error to compound_encode_send_decode(), we'll fail any subsequent operations on the session, but still be able to unmount and remain stable
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
send CREATE_SESSION with compound_encode_send_decode() instead of nfs41_send_compound() for its NFS4ERR_DELAY and NFS4ERR_STALE_CLIENTID handling
added 'try_recovery' argument to nfs41_create_session(), which is passed on to compound_encode_send_decode(). nfs41_session_renew() uses try_recovery=FALSE, because it handles the NFS4ERR_STALE_CLIENTID error on its own. nfs41_session_create() uses try_recovery=TRUE to make use of the NFS4ERR_STALE_CLIENTID error handling. modified the NFS4ERR_STALE_CLIENTID block to call nfs41_client_renew() and retry the operation (i.e. CREATE_SESSION), instead of falling through to session recovery
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
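The try_recovery behavior from the commit above, reduced to a sketch. The server model, the function names, and the limit of two attempts are assumptions made for illustration; the actual retry logic lives in compound_encode_send_decode().

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NFS4_OK                0
#define SKETCH_NFS4ERR_STALE_CLIENTID 10022 /* as defined by NFSv4.1 */

/* Hypothetical model of the state the server holds for this client. */
struct server_sketch {
    bool clientid_valid;
    int renew_calls;
};

typedef int (*rpc_fn)(struct server_sketch *srv);

/* Stand-in for CREATE_SESSION: fails while the client id is stale. */
static int create_session_rpc(struct server_sketch *srv)
{
    return srv->clientid_valid ? SKETCH_NFS4_OK
                               : SKETCH_NFS4ERR_STALE_CLIENTID;
}

/* Stand-in for renewing the client id. */
static int client_renew_sketch(struct server_sketch *srv)
{
    srv->renew_calls++;
    srv->clientid_valid = true;
    return SKETCH_NFS4_OK;
}

/* Send op; on a stale client id, renew and retry only if try_recovery. */
static int send_with_recovery(struct server_sketch *srv, rpc_fn op,
                              bool try_recovery)
{
    int status;
    int attempts = 0;

    do {
        status = op(srv);
        if (status != SKETCH_NFS4ERR_STALE_CLIENTID || !try_recovery)
            break;
        if (client_renew_sketch(srv) != SKETCH_NFS4_OK)
            break;
    } while (++attempts < 2);
    return status;
}
```

With try_recovery=FALSE the stale error is handed straight back, which matches a caller that handles it on its own.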
removed unused variable 'buffer_size' in lookup_rpc()
renamed map_lookup_error()'s parameter 'is_last_component' to 'last_component' to avoid conflicting with function is_last_component()
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
changed goto out -> out_err, so the root is freed on buffer overflow
updated error messages for nfs41_root_create() and nfs41_root_mount_addrs()
if the root lookup fails, return ERROR_BAD_NETPATH instead of ERROR_FILE_NOT_FOUND
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
server_create() was ignoring the return value of nfs41_name_cache_create(), but it needs to be propagated all the way back through nfs41_server_find_or_create() to nfs41_client_create() and nfs41_client_renew()
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
12.7.4. Recovery from Metadata Server Restart
"The client MUST stop using layouts and delete the device ID to device address mappings it previously received from the metadata server."
during client state recovery, call pnfs_file_layout_recall() to revoke all layouts and devices held by the client
LAYOUTGET, LAYOUTRETURN, and GETDEVICEINFO are all sent under their respective locks, and pnfs_file_layout_recall() requires a lock on each layout and device it operates on, so this would cause a deadlock if one of those operations triggered the recovery. to avoid this, LAYOUTGET, LAYOUTRETURN, and GETDEVICEINFO are all sent with try_recovery=FALSE. this behavior is preferable for recovery, because errors in the pnfs path cause us to fall back to the metadata server
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>