when starting io, both pnfs_read() and pnfs_write() need a guarantee that their range is covered by layout segments. because we have to drop the lock for LAYOUTGET and GETDEVICEINFO, earlier layout segments may be recalled during this process. to avoid this, new function pnfs_layout_state_prepare() gets called repeatedly until it can verify under a single lock that 1) the entire desired range is covered with layouts and 2) each of these layouts has an associated device. whenever pnfs_layout_state_prepare() has to drop its lock for LAYOUTGET or GETDEVICEINFO, it returns PNFS_PENDING
on PNFS_SUCCESS, the caller knows that all segments in the range are valid and can dispatch io to those segments without worrying about recalls, because it still holds the pnfs_layout_state lock
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
the thread-by-server model breaks down when using dense layouts, because multiple stripes could be mapped to a single data server, and the per-data-server thread would have to send a COMMIT for each stripe
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
sorry, earlier Casey, but the patch 'threading by io unit instead of stripe' from 6/20/2010 was nuts! with PNFS_THREAD_BY_SERVER disabled, we definitely -don't- want to create a separate thread for each io unit (each READ/WRITE request to a ds). we just want the one per stripe, as the intended alternative to PNFS_THREAD_BY_SERVER
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
added 2011 year to the copyright line
added authors info to the license
added UofM license to libtirpc files that we modified
(but i probably missed some)
exclusive locks are no longer held over LAYOUTGET, LAYOUTRETURN, or GETDEVICEINFO rpcs. this prevents a deadlock when CB_LAYOUTRECALL needs an exclusive lock while another operation is on the wire
introduced a 'pending' condition variable to protect access to state->layout while the layout's lock is not held
updated file_layout_recall() to compare the stateid sequence numbers to determine if the server has processed an outstanding LAYOUTGET or LAYOUTRETURN, where we're required to reply with NFS4ERR_DELAY
LAYOUTGET, LAYOUTRETURN, and GETDEVICEINFO can now be sent with try_recovery=TRUE because they no longer hold an exclusive lock. this makes it possible for recover_client_state() to recall all of the client's layouts without deadlocking
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
nfs41_lock_stateid_arg() is now called only once in handle_read()/handle_write(), and pnfs_read()/pnfs_write() no longer depend on nfs41_open_state
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
on a successful LAYOUTGET, file_layout_fetch() calls layout_update() to copy the first layout segment returned and update the layout stateid
on a successful LAYOUTRETURN, file_layout_return() frees the layout segment and updates/clears the stateid
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
moved state data (stateid, flags, locks, and reference counts) out of struct pnfs_layout, which should represent a layout segment returned by LAYOUTGET
struct pnfs_layout_state now holds this state, along with a pointer to a single pnfs_file_layout
struct pnfs_file_layout_list is now a list of pnfs_layout_states, and was renamed to pnfs_layout_list
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
added pnfs_layout.open_count to count open references, and only return the layout when pnfs_open_state_close() takes the open_count to 0
use InterlockedIncrement/Decrement to avoid an exclusive lock on the layout
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
20.3. CB_LAYOUTRECALL
"LAYOUTRECALL4_FSID and LAYOUTRECALL4_ALL specify that all the storage device ID to storage device address mappings in the affected file system(s) are also recalled."
pnfs_file_layout_recall() now takes a nfs41_client instead of just the pnfs_file_layout_list, because both the layout list and device list are accessible from nfs41_client. for bulk recalls, calls new function pnfs_file_device_list_invalidate(). each device with layout_count=0 is removed and freed, and devices in use are flagged as REVOKED and freed when layout_count->0
layout_recall_return() now takes a pnfs_file_layout instead of pnfs_layout for access to pnfs_file_layout.device. pnfs_layout_io_start() and pnfs_layout_io_finish() do the same, because pnfs_layout_io_finish() calls layout_recall_return(). layout_recall_return() calls pnfs_file_device_put() to release its reference on the device
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
pnfs_device.status remembers whether a given device has been GRANTED/REVOKED
pnfs_device.layout_count tracks the number of layouts using the device, incremented by pnfs_file_device_get() and decremented by pnfs_file_device_put(). when pnfs_file_device_put() takes layout_count to 0, remove and free the device only if it's flagged as REVOKED
because pnfs_file_device_get() modifies pnfs_device.layout_count, we can no longer use a shared lock; changed pnfs_file_device.lock from SRWLOCK to CRITICAL_SECTION, and moved to pnfs_device.lock to document the fact that it's used for pnfs_device.status and pnfs_device.layout_count
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
operations that require a stateid now take stateid_arg for recovery information. these operations include close, setattr, lock/unlock, layoutget, and read/write (including pnfs)
nfs41_open_stateid_arg() locks nfs41_open_state and copies its stateid into a stateid_arg
nfs41_lock_stateid_arg() locks nfs41_open_state.last_lock and copies its stateid into a stateid_arg; if there is no lock state, it falls back to nfs41_open_stateid_arg()
pnfs_read/write() now take nfs41_open_state so they can generate stateid_args
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>
fixed a few cases of warning 4242: possible loss of data
wincrypt.h appears to come with windows.h in later versions of the ddk, but nfs41_client.c fails to compile in WDK 6001 without #include <wincrypt.h>
Signed-off-by: Casey Bodley <cbodley@citi.umich.edu>