From 43a705076e51c5af21ec4260a35699775ea298f5 Mon Sep 17 00:00:00 2001
From: NeilBrown
Date: Mon, 14 Dec 2009 12:49:55 +1100
Subject: md: support updating bitmap parameters via sysfs.

A new attribute directory 'bitmap' in 'md' is created which
contains files for configuring the bitmap.
'location' identifies where the bitmap is, either 'none',
or 'file' or 'sector offset from metadata'.
Writing 'location' can create or remove a bitmap.
Adding a 'file' bitmap this way is not yet supported.
'chunksize' and 'time_base' must be set before 'location'
can be set.
'chunksize' can be set before creating a bitmap, but is
currently always over-ridden by the bitmap superblock.
'time_base' and 'backlog' can be updated at any time.

Signed-off-by: NeilBrown
Reviewed-by: Andre Noll
---
 Documentation/md.txt | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/md.txt b/Documentation/md.txt
index 4edd39ec7db..18fad687622 100644
--- a/Documentation/md.txt
+++ b/Documentation/md.txt
@@ -296,6 +296,35 @@ All md devices contain:
      active-idle
          like active, but no writes have been seen for a while (safe_mode_delay).
 
+   bitmap/location
+     This indicates where the write-intent bitmap for the array is
+     stored.
+     It can be one of "none", "file" or "[+-]N".
+     "file" may later be extended to "file:/file/name"
+     "[+-]N" means that many sectors from the start of the metadata.
+     This is replicated on all devices. For arrays with externally
+     managed metadata, the offset is from the beginning of the
+     device.
+   bitmap/chunksize
+     The size, in bytes, of the chunk which will be represented by a
+     single bit. For RAID456, it is a portion of an individual
+     device. For RAID10, it is a portion of the array. For RAID1, it
+     is both (they come to the same thing).
+   bitmap/time_base
+     The time, in seconds, between looking for bits in the bitmap to
+     be cleared. In the current implementation, a bit will be cleared
+     between 2 and 3 times "time_base" after all the covered blocks
+     are known to be in-sync.
+   bitmap/backlog
+     When write-mostly devices are active in a RAID1, write requests
+     to those devices proceed in the background - the filesystem (or
+     other user of the device) does not have to wait for them.
+     'backlog' sets a limit on the number of concurrent background
+     writes. If there are more than this, new writes will by
+     synchronous.
+
+
+
 
 As component devices are added to an md array, they appear in the 'md'
 directory as new directories named
-- 
cgit v1.2.3

From ece5cff0da9e696c360fff592cb5f51b6419e4d6 Mon Sep 17 00:00:00 2001
From: NeilBrown
Date: Mon, 14 Dec 2009 12:49:56 +1100
Subject: md: Support write-intent bitmaps with externally managed metadata.

In this case, the metadata needs to not be in the same
sector as the bitmap.
md will not read/write any bitmap metadata. Config must be
done via sysfs and when a recovery makes the array non-degraded
again, writing 'true' to 'bitmap/can_clear' will allow bits in
the bitmap to be cleared again.

Signed-off-by: NeilBrown
---
 Documentation/md.txt | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

(limited to 'Documentation')

diff --git a/Documentation/md.txt b/Documentation/md.txt
index 18fad687622..21d26fb5d02 100644
--- a/Documentation/md.txt
+++ b/Documentation/md.txt
@@ -322,6 +322,22 @@ All md devices contain:
      'backlog' sets a limit on the number of concurrent background
      writes. If there are more than this, new writes will by
      synchronous.
+   bitmap/metadata
+     This can be either 'internal' or 'external'.
+     'internal' is the default and means the metadata for the bitmap
+     is stored in the first 256 bytes of the allocated space and is
+     managed by the md module.
+     'external' means that bitmap metadata is managed externally to
+     the kernel (i.e. by some userspace program)
+   bitmap/can_clear
+     This is either 'true' or 'false'. If 'true', then bits in the
+     bitmap will be cleared when the corresponding blocks are thought
+     to be in-sync. If 'false', bits will never be cleared.
+     This is automatically set to 'false' if a write happens on a
+     degraded array, or if the array becomes degraded during a write.
+     When metadata is managed externally, it should be set to true
+     once the array becomes non-degraded, and this fact has been
+     recorded in the metadata.
 
 
 
-- 
cgit v1.2.3

From 06e3c817b750c131a20e82eed57a17841ea88ed2 Mon Sep 17 00:00:00 2001
From: Dan Williams
Date: Sat, 12 Dec 2009 21:17:12 -0700
Subject: md: add 'recovery_start' per-device sysfs attribute

Enable external metadata arrays to manage rebuild checkpointing via a
md/dev-XXX/recovery_start attribute which reflects rdev->recovery_offset

Also update resync_start_store to allow 'none' to be written, for
consistency.

Signed-off-by: Dan Williams
Signed-off-by: NeilBrown
---
 Documentation/md.txt | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

(limited to 'Documentation')

diff --git a/Documentation/md.txt b/Documentation/md.txt
index 21d26fb5d02..188f4768f1d 100644
--- a/Documentation/md.txt
+++ b/Documentation/md.txt
@@ -233,9 +233,9 @@ All md devices contain:
 
    resync_start
      The point at which resync should start. If no resync is needed,
-     this will be a very large number. At array creation it will
-     default to 0, though starting the array as 'clean' will
-     set it much larger.
+     this will be a very large number (or 'none' since 2.6.30-rc1). At
+     array creation it will default to 0, though starting the array as
+     'clean' will set it much larger.
 
    new_dev
      This file can be written but not read. The value written should
@@ -379,8 +379,9 @@ Each directory contains:
                Writing "writemostly" sets the writemostly flag.
                Writing "-writemostly" clears the writemostly flag.
                Writing "blocked" sets the "blocked" flag.
-               Writing "-blocked" clear the "blocked" flag and allows writes
+               Writing "-blocked" clears the "blocked" flag and allows writes
                to complete.
+               Writing "in_sync" sets the in_sync flag.
 
        This file responds to select/poll. Any change to 'faulty'
        or 'blocked' causes an event.
@@ -417,6 +418,24 @@ Each directory contains:
        array. If a value less than the current component_size is
        written, it will be rejected.
 
+      recovery_start
+
+        When the device is not 'in_sync', this records the number of
+        sectors from the start of the device which are known to be
+        correct. This is normally zero, but during a recovery
+        operation is will steadily increase, and if the recovery is
+        interrupted, restoring this value can cause recovery to
+        avoid repeating the earlier blocks. With v1.x metadata, this
+        value is saved and restored automatically.
+
+        This can be set whenever the device is not an active member of
+        the array, either before the array is activated, or before
+        the 'slot' is set.
+
+        Setting this to 'none' is equivalent to setting 'in_sync'.
+        Setting to any other value also clears the 'in_sync' flag.
+
+
+
 
 An active md device will also contain and entry for each active device
 in the array. These are named
-- 
cgit v1.2.3