Btrfs: keep processing bios for a given bdev if our proc is batching

Btrfs uses async helper threads to submit write bios so the checksumming helper threads don't block on the disk.

The submit bio threads may process bios for more than one block device, so when they find one device congested they try to move on to other devices instead of blocking in get_request_wait for one device.

This does a pretty good job of keeping multiple devices busy, but the congested flag has a number of problems. A congested device may still give you a request, and other procs that aren't backing off the congested device may starve you out.

This commit uses the io_context stored in current to decide if our process has been made a batching process by the block layer. If so, it keeps sending IO down for at least one batch. This helps make sure we do a good amount of work each time we visit a bdev, and avoids large IO stalls in multi-device workloads.

It's also very ugly. A better solution is in the works with Jens Axboe.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
author: Chris Mason <chris.mason@oracle.com> 2009-04-03 10:27:10 -0400
committer: Chris Mason <chris.mason@oracle.com> 2009-04-03 10:27:10 -0400
commit: b765ead57da62cccf7fa21e00e6eed65e9df62b0 (patch)
tree: 66541fd018482a8d0db0021c3a3f3e8611ddb6fe /fs/btrfs
parent: d57e62b89796f751c9422801cbcd407a9f8dcdc4 (diff)
1 files changed, 27 insertions, 0 deletions
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index dd06e18e5aa..cc01abff03d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -20,6 +20,7 @@
 #include <linux/buffer_head.h>
 #include <linux/blkdev.h>
 #include <linux/random.h>
+#include <linux/iocontext.h>
 #include <asm/div64.h>
 #include "compat.h"
 #include "ctree.h"
@@ -145,6 +146,7 @@ static noinline int run_scheduled_bios(struct btrfs_device *device)
 	int again = 0;
 	unsigned long num_run = 0;
 	unsigned long limit;
+	unsigned long last_waited = 0;
 
 	bdi = device->bdev->bd_inode->i_mapping->backing_dev_info;
 	fs_info = device->dev_root->fs_info;
@@ -207,7 +209,32 @@ loop_lock:
 		if (pending && bdi_write_congested(bdi) && num_run > 16 &&
 		    fs_info->fs_devices->open_devices > 1) {
 			struct bio *old_head;
+			struct io_context *ioc;
 
+			ioc = current->io_context;
+
+			/*
+			 * the main goal here is that we don't want to
+			 * block if we're going to be able to submit
+			 * more requests without blocking.
+			 *
+			 * This code does two great things, it pokes into
+			 * the elevator code from a filesystem _and_
+			 * it makes assumptions about how batching works.
+			 */
+			if (ioc && ioc->nr_batch_requests > 0 &&
+			    time_before(jiffies, ioc->last_waited + HZ/50UL) &&
+			    (last_waited == 0 ||
+			     ioc->last_waited == last_waited)) {
+				/*
+				 * we want to go through our batch of
+				 * requests and stop.  So, we copy out
+				 * the ioc->last_waited time and test
+				 * against it before looping
+				 */
+				last_waited = ioc->last_waited;
+				continue;
+			}
 			spin_lock(&device->io_lock);
 
 			old_head = device->pending_bios;
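
The condition added in the hunk above can be read as a three-part test: the process still has batch requests left, the batch was granted less than HZ/50 ticks ago, and it is the same batch seen on the previous pass through the loop. The following is a minimal user-space sketch of that test, not the kernel code. `struct ioc_sketch`, `tick_before`, `keep_submitting` and `SKETCH_HZ` are hypothetical names introduced here for illustration; only the field names `nr_batch_requests` and `last_waited` and the HZ/50 window come from the patch, and the tick rate of 250 is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed tick rate for this sketch; in the kernel, HZ is a build-time constant. */
#define SKETCH_HZ 250UL

/* Hypothetical stand-in for the two io_context fields the patch reads. */
struct ioc_sketch {
	int nr_batch_requests;      /* requests left in the current batch */
	unsigned long last_waited;  /* tick at which the batch was granted */
};

/* Wraparound-safe "a is earlier than b", the same idea as time_before(). */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Returns true if the caller should keep submitting to the same device.
 * *last_waited_seen plays the role of the local last_waited variable in
 * run_scheduled_bios(): 0 means "no batch recorded yet".
 */
static bool keep_submitting(const struct ioc_sketch *ioc, unsigned long now,
			    unsigned long *last_waited_seen)
{
	assert(last_waited_seen != NULL);

	/* No io context, or no batch credit left: fall through to backing off. */
	if (!ioc || ioc->nr_batch_requests <= 0)
		return false;

	/* The batch window (HZ/50 ticks, about 20 ms) has expired. */
	if (!tick_before(now, ioc->last_waited + SKETCH_HZ / 50UL))
		return false;

	/* A different batch than the one already followed: stop after one batch. */
	if (*last_waited_seen != 0 && ioc->last_waited != *last_waited_seen)
		return false;

	/* Remember which batch is being followed, then keep going. */
	*last_waited_seen = ioc->last_waited;
	return true;
}
```

A caller would keep one `unsigned long` initialized to 0 for the lifetime of the submit loop and pass its address on every iteration, so that a change in `last_waited` is what ends the run rather than the congestion flag alone.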