aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2008-07-23md: delay notification of 'active_idle' to the recovery threadDan Williams
sysfs_notify might sleep, so do not call it from md_safemode_timeout. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-21md: Protect access to mddev->disks list using RCUNeilBrown
All modifications and most access to the mddev->disks list are made under the reconfig_mutex lock. However there are three places where the list is walked without any locking. If a reconfig happens at this time, havoc (and oops) can ensue. So use RCU to protect these accesses: - wrap them in rcu_read_{,un}lock() - use list_for_each_entry_rcu - add to the list with list_add_rcu - delete from the list with list_del_rcu - delay the 'free' with call_rcu rather than schedule_work Note that export_rdev did a list_del_init on this list. In almost all cases the entry was not in the list anymore so it was a no-op and so safe. It is no longer safe as after list_del_rcu we may not touch the list_head. An audit shows that export_rdev is called: - after unbind_rdev_from_array, in which case the delete has already been done, - after bind_rdev_to_array fails, in which case the delete isn't needed. - before the device has been put on a list at all (e.g. in add_new_disk where reading the superblock fails). - and in autorun devices after a failure when the device is on a different list. So remove the list_del_init call from export_rdev, and add it back immediately before the called to export_rdev for that last case. Note also that ->same_set is sometimes used for lists other than mddev->list (e.g. candidates). In these cases rcu is not needed. Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-21md: only count actual openers as access which prevent a 'stop'NeilBrown
Open isn't the only thing that increments ->active. e.g. reading /proc/mdstat will increment it briefly. So to avoid false positives in testing for concurrent access, introduce a new counter that counts just the number of times the md device it open. Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-21md: linear: Make array_size sector-based and rename it to array_sectors.Andre Noll
Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-21md: Make mddev->array_size sector-based.Andre Noll
This patch renames the array_size field of struct mddev_s to array_sectors and converts all instances to use units of 512 byte sectors instead of 1k blocks. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-11md: Remove some unused macros.Andre Noll
Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Neil Brown <neilb@suse.de>
2008-07-11md: Turn rdev->sb_offset into a sector-based quantity.Andre Noll
Rename it to sb_start to make sure all users have been converted. Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: Neil Brown <neilb@suse.de>
2008-07-11Merge branch 'master' into for-nextNeil Brown
2008-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits) tun: Persistent devices can get stuck in xoff state xfrm: Add a XFRM_STATE_AF_UNSPEC flag to xfrm_usersa_info ipv6: missed namespace context in ipv6_rthdr_rcv netlabel: netlink_unicast calls kfree_skb on error path by itself ipv4: fib_trie: Fix lookup error return tcp: correct kcalloc usage ip: sysctl documentation cleanup Documentation: clarify tcp_{r,w}mem sysctl docs netfilter: nf_nat_snmp_basic: fix a range check in NAT for SNMP netfilter: nf_conntrack_tcp: fix endless loop libertas: fix memory alignment problems on the blackfin zd1211rw: stop beacons on remove_interface rt2x00: Disable synchronization during initialization rc80211_pid: Fix fast_start parameter handling sctp: Add documentation for sctp sysctl variable ipv6: fix race between ipv6_del_addr and DAD timer irda: Fix netlink error path return value irda: New device ID for nsc-ircc irda: via-ircc proper dma freeing sctp: Mark the tsn as received after all allocations finish ...
2008-07-10xfrm: Add a XFRM_STATE_AF_UNSPEC flag to xfrm_usersa_infoSteffen Klassert
Add a XFRM_STATE_AF_UNSPEC flag to handle the AF_UNSPEC behavior for the selector family. Userspace applications can set this flag to leave the selector family of the xfrm_state unspecified. This can be used to to handle inter family tunnels if the selector is not set from userspace. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-08ide: add __ide_default_irq() inline helperBartlomiej Zolnierkiewicz
Add __ide_default_irq() inline helper and use it instead of ide_default_irq() in ide-probe.c and ns87415.c (all host drivers except IDE PCI ones always setup hwif->irq so it is enough to check only for I/O bases 0x1f0 and 0x170). This fixes post-2.6.25 regression since ide_default_irq() define could shadow ide_default_irq() inline. Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-08Merge branch 'for-neil' of ↵Neil Brown
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/md into for-next
2008-07-08Merge branch 'master' into for-nextNeil Brown
2008-07-05Move _RET_IP_ and _THIS_IP_ to include/linux/kernel.hEduard - Gabriel Munteanu
These two macros are useful beyond lock debugging. Moved definitions from include/linux/debug_locks.h to include/linux/kernel.h, so code that needs them does not have to include the former, which would have been a less intuitive choice of a header. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04cpumask: introduce new APIsStephen Rothwell
In linux-next there is a commit ("x86: Add performance variants of cpumask operators") which, as part of the 4096 cpu support work adds some new APIs for dealing with cpu masks. Add trivial versions of these now so that subsystems can update in a timely manner and avoid conflicts in linux-next and the next merge window. Cc: Mike Travis <travis@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04olpc: sdhci: add quirk for the Marvell CaFe's vdd/powerup issueAndres Salomon
This has been sitting around unloved for way too long.. The Marvell CaFe chip's SD implementation chokes during card insertion if one attempts to set the voltage and power up in the same SDHCI_POWER_CONTROL register write. This adds a quirk that does that particular dance in two steps. It also adds an entry to pci_ids.h for the CaFe chip's SD device. Signed-off-by: Andres Salomon <dilinger@debian.org> Cc: Pierre Ossman <drzeus-list@drzeus.cx> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04security: filesystem capabilities: fix fragile setuid fixup codeAndrew G. Morgan
This commit includes a bugfix for the fragile setuid fixup code in the case that filesystem capabilities are supported (in access()). The effect of this fix is gated on filesystem capability support because changing securebits is only supported when filesystem capabilities support is configured.) [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Acked-by: Serge Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04Introduce rculist.hStephen Rothwell
In linux-next there is a commit ("rcu: split list.h and move rcu-protected lists into rculist.h") that moved the rcu related list iterators from list.h to rculist.h. Add a trivial version of the file now so that various subsystem trees can start using it now for -next changes and so reduce the build errors caused by adding uses of the moved functions. Cc: Franck Bui-Huu <fbuihuu@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@kernel.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04Miguel Ojeda has movedMiguel Ojeda
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04firmware: fix the request_firmware() dummyJames Bottomley
> the build (.config attached) failed, make ends with : > ... > UPD include/linux/compile.h > CC init/version.o > LD init/built-in.o > LD vmlinux > drivers/built-in.o: In function `sas_request_addr': > (.text+0x33bab): undefined reference to `request_firmware' > drivers/built-in.o: In function `sas_request_addr': > (.text+0x33c3f): undefined reference to `release_firmware' > make: *** [vmlinux] Error 1 There's a slight fault in the stub logic. It fails for FW_LOADER=m and the user =y. This should fix it. This patch fixes the following 2.6.26-rc regression: http://bugzilla.kernel.org/show_bug.cgi?id=10730 Reviewed-by: Toralf Foerster <toralf.foerster@gmx.de> Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04Christoph has movedChristoph Lameter
Remove all clameter@sgi.com addresses from the kernel tree since they will become invalid on June 27th. Change my maintainer email address for the slab allocators to cl@linux-foundation.org (which will be the new email address for the future). Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-04Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: slub: Do not use 192 byte sized cache if minimum alignment is 128 byte
2008-07-03slub: Do not use 192 byte sized cache if minimum alignment is 128 byteChristoph Lameter
The 192 byte cache is not necessary if we have a basic alignment of 128 byte. If it would be used then the 192 would be aligned to the next 128 byte boundary which would result in another 256 byte cache. Two 256 kmalloc caches cause sysfs to complain about a duplicate entry. MIPS needs 128 byte aligned kmalloc caches and spits out warnings on boot without this patch. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2008-07-02Merge branch 'for-2.6.26' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds
* 'for-2.6.26' of git://git.kernel.dk/linux-2.6-block: Properly notify block layer of sync writes block: Fix the starving writes bug in the anticipatory IO scheduler
2008-07-02Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6Linus Torvalds
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: i2c: Fix bad hint about irqs in i2c.h i2c: Documentation: fix device matching description
2008-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (55 commits) net: fib_rules: fix error code for unsupported families netdevice: Fix wrong string handle in kernel command line parsing net: Tyop of sk_filter() comment netlink: Unneeded local variable net-sched: fix filter destruction in atm/hfsc qdisc destruction net-sched: change tcf_destroy_chain() to clear start of filter list ipv4: fix sysctl documentation of time related values mac80211: don't accept WEP keys other than WEP40 and WEP104 hostap: fix sparse warnings hostap: don't report useless WDS frames by default textsearch: fix Boyer-Moore text search bug netfilter: nf_conntrack_tcp: fixing to check the lower bound of valid ACK ipv6 route: Convert rt6_device_match() to use RT6_LOOKUP_F_xxx flags. netlabel: Fix a problem when dumping the default IPv6 static labels net/inet_lro: remove setting skb->ip_summed when not LRO-able inet fragments: fix race between inet_frag_find and inet_frag_secret_rebuild CONNECTOR: add a proc entry to list connectors netlink: Fix some doc comments in net/netlink/attr.c tcp: /proc/net/tcp rto,ato values not scaled properly (v2) include/linux/netdevice.h: don't export MAX_HEADER to userspace ...
2008-07-01i2c: Fix bad hint about irqs in i2c.hWolfram Sang
i2c.h mentions -1 as a not-issued irq. This false hint was taken by of_i2c and caused crashes. Don't give any advice as 'no irq' is not consistent across all architectures yet and it is not needed internally by the i2c-core. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-07-01Properly notify block layer of sync writesJens Axboe
fsync_buffers_list() and sync_dirty_buffer() both issue async writes and then immediately wait on them. Conceptually, that makes them sync writes and we should treat them as such so that the IO schedulers can handle them appropriately. This patch fixes a write starvation issue that Lin Ming reported, where xx is stuck for more than 2 minutes because of a large number of synchronous IO in the system: INFO: task kjournald:20558 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kjournald D ffff810010820978 6712 20558 2 ffff81022ddb1d10 0000000000000046 ffff81022e7baa10 ffffffff803ba6f2 ffff81022ecd0000 ffff8101e6dc9160 ffff81022ecd0348 000000008048b6cb 0000000000000086 ffff81022c4e8d30 0000000000000000 ffffffff80247537 Call Trace: [<ffffffff803ba6f2>] kobject_get+0x12/0x17 [<ffffffff80247537>] getnstimeofday+0x2f/0x83 [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d195>] io_schedule+0x5d/0x9f [<ffffffff8029c1e7>] sync_buffer+0x3b/0x3f [<ffffffff8066d3f0>] __wait_on_bit+0x40/0x6f [<ffffffff8029c1ac>] sync_buffer+0x0/0x3f [<ffffffff8066d48b>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff80243909>] wake_bit_function+0x0/0x23 [<ffffffff8029e3ad>] sync_dirty_buffer+0x98/0xcb [<ffffffff8030056b>] journal_commit_transaction+0x97d/0xcb6 [<ffffffff8023a676>] lock_timer_base+0x26/0x4b [<ffffffff8030300a>] kjournald+0xc1/0x1fb [<ffffffff802438db>] autoremove_wake_function+0x0/0x2e [<ffffffff80302f49>] kjournald+0x0/0x1fb [<ffffffff802437bb>] kthread+0x47/0x74 [<ffffffff8022de51>] schedule_tail+0x28/0x5d [<ffffffff8020cac8>] child_rip+0xa/0x12 [<ffffffff80243774>] kthread+0x0/0x74 [<ffffffff8020cabe>] child_rip+0x0/0x12 Lin Ming confirms that this patch fixes the issue. I've run tests with it for the past week and no ill effects have been observed, so I'm proposing it for inclusion into 2.6.26. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-30md: resolve external metadata handling deadlock in md_allow_writeDan Williams
md_allow_write() marks the metadata dirty while holding mddev->lock and then waits for the write to complete. For externally managed metadata this causes a deadlock as userspace needs to take the lock to communicate that the metadata update has completed. Change md_allow_write() in the 'external' case to start the 'mark active' operation and then return -EAGAIN. The expected side effects while waiting for userspace to write 'active' to 'array_state' are holding off reshape (code currently handles -ENOMEM), cause some 'stripe_cache_size' change requests to fail, cause some GET_BITMAP_FILE ioctl requests to fall back to GFP_NOIO, and cause updates to 'raid_disks' to fail. Except for 'stripe_cache_size' changes these failures can be mitigated by coordinating with mdmon. md_write_start() still prevents writes from occurring until the metadata handler has had a chance to take action as it unconditionally waits for MD_CHANGE_CLEAN to be cleared. [neilb@suse.de: return -EAGAIN, try GFP_NOIO] Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-06-30Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: fix locking in force-feedback core Input: add KEY_MEDIA_REPEAT definition
2008-06-30Input: add KEY_MEDIA_REPEAT definitionBastien Nocera
This patch adds the Repeat key to the input layer. The usage in the HUT is 0xBC (listed under "15.7 Transport Controls"). Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-06-29Merge branch 'release' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: dock: bay: Don't call acpi_walk_namespace() when ACPI is disabled. ACPI: don't walk tables if ACPI was disabled thermal: Create CONFIG_THERMAL_HWMON=n
2008-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixesLinus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes: kbuild: fix a.out.h export to userspace with O= build.
2008-06-29Merge branch 'audit.b52' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b52' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] remove useless argument type in audit_filter_user() [PATCH] audit: fix kernel-doc parameter notation [PATCH] kernel/audit.c: nlh->nlmsg_type is gotten more than once
2008-06-29Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [patch 2/3] vfs: dcache cleanups [patch 1/3] vfs: dcache sparse fixes [patch 3/3] vfs: make d_path() consistent across mount operations [patch 4/4] flock: remove unused fields from file_lock_operations [patch 3/4] vfs: fix ERR_PTR abuse in generic_readlink [patch 2/4] fs: make struct file arg to d_path const [patch 1/4] vfs: path_{get,put}() cleanups [patch for 2.6.26 4/4] vfs: utimensat(): fix write access check for futimens() [patch for 2.6.26 3/4] vfs: utimensat(): fix error checking for {UTIME_NOW,UTIME_OMIT} case [patch for 2.6.26 1/4] vfs: utimensat(): ignore tv_sec if tv_nsec == UTIME_OMIT or UTIME_NOW [patch for 2.6.26 2/4] vfs: utimensat(): be consistent with utime() for immutable and append-only files [PATCH] fix cgroup-inflicted breakage in block_dev.c
2008-06-27net/inet_lro: remove setting skb->ip_summed when not LRO-ableEli Cohen
When an SKB cannot be chained to a session, the current code attempts to "restore" its ip_summed field from lro_mgr->ip_summed. However, lro_mgr->ip_summed does not hold the original value; in fact, we'd better not touch skb->ip_summed since it is not modified by the code in the path leading to a failure to chain it. Also use a cleaer comment to the describe the ip_summed field of struct net_lro_mgr. Issue raised by Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27include/linux/netdevice.h: don't export MAX_HEADER to userspaceAdrian Bunk
Due to the CONFIG_'s the value is anyway not correct in userspace. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-28md: replace R5_WantPrexor with R5_WantDrain, add 'prexor' reconstruct_statesDan Williams
From: Dan Williams <dan.j.williams@intel.com> Currently ops_run_biodrain and other locations have extra logic to determine which blocks are processed in the prexor and non-prexor cases. This can be eliminated if handle_write_operations5 flags the blocks to be processed in all cases via R5_Wantdrain. The presence of the prexor operation is tracked in sh->reconstruct_state. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28md: replace STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} with 'reconstruct_states'Dan Williams
From: Dan Williams <dan.j.williams@intel.com> Track the state of reconstruct operations (recalculating the parity block usually due to incoming writes, or as part of array expansion) Reduces the scope of the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags to only tracking whether a reconstruct operation has been requested via the ops_request field of struct stripe_head_state. This is the final step in the removal of ops.{pending,ack,complete,count}, i.e. the STRIPE_OP_{BIODRAIN,PREXOR,POSTXOR} flags only request an operation and do not track the state of the operation. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28md: replace STRIPE_OP_CHECK with 'check_states'Dan Williams
From: Dan Williams <dan.j.williams@intel.com> The STRIPE_OP_* flags record the state of stripe operations which are performed outside the stripe lock. Their use in indicating which operations need to be run is straightforward; however, interpolating what the next state of the stripe should be based on a given combination of these flags is not straightforward, and has led to bugs. An easier to read implementation with minimal degrees of freedom is needed. Towards this goal, this patch introduces explicit states to replace what was previously interpolated from the STRIPE_OP_* flags. For now this only converts the handle_parity_checks5 path, removing a user of the ops.{pending,ack,complete,count} fields of struct stripe_operations. This conversion also found a remaining issue with the current code. There is a small window for a drive to fail between when we schedule a repair and when the parity calculation for that repair completes. When this happens we will writeback to 'failed_num' when we really want to write back to 'pd_idx'. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28md: kill STRIPE_OP_IO flagDan Williams
From: Dan Williams <dan.j.williams@intel.com> The R5_Want{Read,Write} flags already gate i/o. So, this flag is superfluous and we can unconditionally call ops_run_io(). Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28md: kill STRIPE_OP_MOD_DMA in raid5 offloadDan Williams
From: Dan Williams <dan.j.williams@intel.com> This micro-optimization allowed the raid code to skip a re-read of the parity block after checking parity. It took advantage of the fact that xor-offload-engines have their own internal result buffer and can check parity without writing to memory. Remove it for the following reasons: 1/ It is a layering violation for MD to need to manage the DMA and non-DMA paths within async_xor_zero_sum 2/ Bad precedent to toggle the 'ops' flags outside the lock 3/ Hard to realize a performance gain as reads will not need an updated parity block and writes will dirty it anyways. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28Make sure all changes to md/dev-XX/state are notifiedNeil Brown
The important state change happens during an interrupt in md_error. So just set a flag there and call sysfs_notify later in process context. Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28Make sure all changes to md/sync_action are notified.Neil Brown
When the 'resync' thread starts or stops, when we explicitly set sync_action, or when we determine that there is definitely nothing to do, we notify sync_action. To stop "sync_action" from occasionally showing the wrong value, we introduce a new flags - MD_RECOVERY_RECOVER - to say that a recovery is probably needed or happening, and we make sure that we set MD_RECOVERY_RUNNING before clearing MD_RECOVERY_NEEDED. Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28Allow setting start point for requested check/repairNeil Brown
This makes it possible to just resync a small part of an array. e.g. if a drive reports that it has questionable sectors, a 'repair' of just the region covering those sectors will cause them to be read and, if there is an error, re-written with correct data. Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-28Improve setting of "events_cleared" for write-intent bitmaps.Neil Brown
When an array is degraded, bits in the write-intent bitmap are not cleared, so that if the missing device is re-added, it can be synced by only updated those parts of the device that have changed since it was removed. The enable this a 'events_cleared' value is stored. It is the event counter for the array the last time that any bits were cleared. Sometimes - if a device disappears from an array while it is 'clean' - the events_cleared value gets updated incorrectly (there are subtle ordering issues between updateing events in the main metadata and the bitmap metadata) resulting in the missing device appearing to require a full resync when it is re-added. With this patch, we update events_cleared precisely when we are about to clear a bit in the bitmap. We record events_cleared when we clear the bit internally, and copy that to the superblock which is written out before the bit on storage. This makes it more "obviously correct". We also need to update events_cleared when the event_count is going backwards (as happens on a dirty->clean transition of a non-degraded array). Thanks to Mike Snitzer for identifying this problem and testing early "fixes". Cc: "Mike Snitzer" <snitzer@gmail.com> Signed-off-by: Neil Brown <neilb@suse.de>
2008-06-27kbuild: fix a.out.h export to userspace with O= build.David Woodhouse
We need to check for existence of the a.out.h header in the source tree, not the object tree, if we want it to get the right answer with O=. Signed-off-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-06-25thermal: Create CONFIG_THERMAL_HWMON=nRene Herman
A bug in libsensors <= 2.10.6 is exposed when this new hwmon I/F is enabled. Create CONFIG_THERMAL_HWMON=n until some time after libsensors 2.10.7 ships so those users can run the latest kernel. libsensors 3.x is already fixed -- those users can use CONFIG_THERMAL_HWMON=y now. Signed-off-by: Rene Herman <rene.herman@gmail.com> Acked-by: Mark M. Hoffman <mhoffman@lightlink.com> Signed-off-by: Len Brown <len.brown@intel.com>
2008-06-24[PATCH] remove useless argument type in audit_filter_user()Peng Haitao
The second argument "type" is not used in audit_filter_user(), so I think that type can be removed. If I'm wrong, please tell me. Signed-off-by: Peng Haitao <penght@cn.fujitsu.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-06-24KVM: close timer injection race window in __vcpu_runMarcelo Tosatti
If a timer fires after kvm_inject_pending_timer_irqs() but before local_irq_disable() the code will enter guest mode and only inject such timer interrupt the next time an unrelated event causes an exit. It would be simpler if the timer->pending irq conversion could be done with IRQ's disabled, so that the above problem cannot happen. For now introduce a new vcpu requests bit to cancel guest entry. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>