Age | Commit message (Collapse) | Author |
|
Vmalloc is called to allocate journal->j_cnode_free_list but
we hold the reiserfs lock at this time, which raises a
{RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} lock inversion.
Just drop the reiserfs lock at this time, as it's not even
needed but kept for paranoid reasons.
This fixes:
[ INFO: inconsistent lock state ]
2.6.33-rc5 #1
---------------------------------
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/313 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&REISERFS_SB(s)->lock){+.+.?.}, at: [<c11118c8>]
reiserfs_write_lock_once+0x28/0x50
{RECLAIM_FS-ON-W} state was registered at:
[<c104ee32>] mark_held_locks+0x62/0x90
[<c104eefa>] lockdep_trace_alloc+0x9a/0xc0
[<c108f7b6>] kmem_cache_alloc+0x26/0xf0
[<c108621c>] __get_vm_area_node+0x6c/0xf0
[<c108690e>] __vmalloc_node+0x7e/0xa0
[<c1086aab>] vmalloc+0x2b/0x30
[<c110e1fb>] journal_init+0x6cb/0xa10
[<c10f90a2>] reiserfs_fill_super+0x342/0xb80
[<c1095665>] get_sb_bdev+0x145/0x180
[<c10f68e1>] get_super_block+0x21/0x30
[<c1094520>] vfs_kern_mount+0x40/0xd0
[<c1094609>] do_kern_mount+0x39/0xd0
[<c10aaa97>] do_mount+0x2c7/0x6d0
[<c10aaf06>] sys_mount+0x66/0xa0
[<c16198a7>] mount_block_root+0xc4/0x245
[<c1619a81>] mount_root+0x59/0x5f
[<c1619b98>] prepare_namespace+0x111/0x14b
[<c1619269>] kernel_init+0xcf/0xdb
[<c100303a>] kernel_thread_helper+0x6/0x1c
irq event stamp: 63236801
hardirqs last enabled at (63236801): [<c134e7fa>]
__mutex_unlock_slowpath+0x9a/0x120
hardirqs last disabled at (63236800): [<c134e799>]
__mutex_unlock_slowpath+0x39/0x120
softirqs last enabled at (63218800): [<c102f451>] __do_softirq+0xc1/0x110
softirqs last disabled at (63218789): [<c102f4ed>] do_softirq+0x4d/0x60
other info that might help us debug this:
2 locks held by kswapd0/313:
#0: (shrinker_rwsem){++++..}, at: [<c1074bb4>] shrink_slab+0x24/0x170
#1: (&type->s_umount_key#19){++++..}, at: [<c10a2edd>]
shrink_dcache_memory+0xfd/0x1a0
stack backtrace:
Pid: 313, comm: kswapd0 Not tainted 2.6.33-rc5 #1
Call Trace:
[<c134db2c>] ? printk+0x18/0x1c
[<c104e7ef>] print_usage_bug+0x15f/0x1a0
[<c104ebcf>] mark_lock+0x39f/0x5a0
[<c104d66b>] ? trace_hardirqs_off+0xb/0x10
[<c1052c50>] ? check_usage_forwards+0x0/0xf0
[<c1050c24>] __lock_acquire+0x214/0xa70
[<c10438c5>] ? sched_clock_cpu+0x95/0x110
[<c10514fa>] lock_acquire+0x7a/0xa0
[<c11118c8>] ? reiserfs_write_lock_once+0x28/0x50
[<c134f03f>] mutex_lock_nested+0x5f/0x2b0
[<c11118c8>] ? reiserfs_write_lock_once+0x28/0x50
[<c11118c8>] ? reiserfs_write_lock_once+0x28/0x50
[<c11118c8>] reiserfs_write_lock_once+0x28/0x50
[<c10f05b0>] reiserfs_delete_inode+0x50/0x140
[<c10a653f>] ? generic_delete_inode+0x5f/0x150
[<c10f0560>] ? reiserfs_delete_inode+0x0/0x140
[<c10a657c>] generic_delete_inode+0x9c/0x150
[<c10a666d>] generic_drop_inode+0x3d/0x60
[<c10a5597>] iput+0x47/0x50
[<c10a2a4f>] dentry_iput+0x6f/0xf0
[<c10a2af4>] d_kill+0x24/0x50
[<c10a2d3d>] __shrink_dcache_sb+0x21d/0x2b0
[<c10a2f0f>] shrink_dcache_memory+0x12f/0x1a0
[<c1074c9e>] shrink_slab+0x10e/0x170
[<c1075177>] kswapd+0x477/0x6a0
[<c1072d10>] ? isolate_pages_global+0x0/0x1b0
[<c103e160>] ? autoremove_wake_function+0x0/0x40
[<c1074d00>] ? kswapd+0x0/0x6a0
[<c103de6c>] kthread+0x6c/0x80
[<c103de00>] ? kthread+0x0/0x80
[<c100303a>] kernel_thread_helper+0x6/0x1c
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Chris Mason <chris.mason@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
reiserfs: Relax reiserfs_xattr_set_handle() while acquiring xattr locks
reiserfs: Fix unreachable statement
reiserfs: Don't call reiserfs_get_acl() with the reiserfs lock
reiserfs: Relax lock on xattr removing
reiserfs: Relax the lock before truncating pages
reiserfs: Fix recursive lock on lchown
reiserfs: Fix mistake in down_write() conversion
|
|
Fix remaining xattr locks acquired in reiserfs_xattr_set_handle()
while we are holding the reiserfs lock to avoid lock inversions.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
Stanse found an unreachable statement in reiserfs_ioctl. There is a
if followed by error assignment and `break' with no braces. Add the
braces so that we don't break every time, but only in error case,
so that REISERFS_IOC_SETVERSION actually works when it returns no
error.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Reiserfs <reiserfs-devel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
|
reiserfs_get_acl is usually not called under the reiserfs lock,
as it doesn't need it. But it happens when it is called by
reiserfs_acl_chmod(), which creates a dependency inversion against
the private xattr inodes mutexes for the given inode.
We need to call it without the reiserfs lock, especially since
it's unnecessary.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
When we remove an xattr, we call lookup_and_delete_xattr()
that takes some private xattr inodes mutexes. But we hold
the reiserfs lock at this time, which leads to dependency
inversions.
We can safely call lookup_and_delete_xattr() without the
reiserfs lock, where xattr inodes lookups only need the
xattr inodes mutexes.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
While truncating a file, reiserfs_setattr() calls inode_setattr()
that will truncate the mapping for the given inode, but for that
it needs the pages locks.
In order to release these, the owners need the reiserfs lock to
complete their jobs. But they can't, as we don't release it before
calling inode_setattr().
We need to do that to fix the following softlockups:
INFO: task flush-8:0:2149 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
flush-8:0 D f51af998 0 2149 2 0x00000000
f51af9ac 00000092 00000002 f51af998 c2803304 00000000 c1894ad0 010f3000
f51af9cc c1462604 c189ef80 f51af974 c1710304 f715b450 f715b5ec c2807c40
00000000 0005bb00 c2803320 c102c55b c1710304 c2807c50 c2803304 00000246
Call Trace:
[<c1462604>] ? schedule+0x434/0xb20
[<c102c55b>] ? resched_task+0x4b/0x70
[<c106fa22>] ? mark_held_locks+0x62/0x80
[<c146414d>] ? mutex_lock_nested+0x1fd/0x350
[<c14640b9>] mutex_lock_nested+0x169/0x350
[<c1178cde>] ? reiserfs_write_lock+0x2e/0x40
[<c1178cde>] reiserfs_write_lock+0x2e/0x40
[<c11719a2>] do_journal_end+0xc2/0xe70
[<c1172912>] journal_end+0xb2/0x120
[<c11686b3>] ? pathrelse+0x33/0xb0
[<c11729e4>] reiserfs_end_persistent_transaction+0x64/0x70
[<c1153caa>] reiserfs_get_block+0x12ba/0x15f0
[<c106fa22>] ? mark_held_locks+0x62/0x80
[<c1154b24>] reiserfs_writepage+0xa74/0xe80
[<c1465a27>] ? _raw_spin_unlock_irq+0x27/0x50
[<c11f3d25>] ? radix_tree_gang_lookup_tag_slot+0x95/0xc0
[<c10b5377>] ? find_get_pages_tag+0x127/0x1a0
[<c106fa22>] ? mark_held_locks+0x62/0x80
[<c106fcd4>] ? trace_hardirqs_on_caller+0x124/0x170
[<c10bc1e0>] __writepage+0x10/0x40
[<c10bc9ab>] write_cache_pages+0x16b/0x320
[<c10bc1d0>] ? __writepage+0x0/0x40
[<c10bcb88>] generic_writepages+0x28/0x40
[<c10bcbd5>] do_writepages+0x35/0x40
[<c11059f7>] writeback_single_inode+0xc7/0x330
[<c11067b2>] writeback_inodes_wb+0x2c2/0x490
[<c1106a86>] wb_writeback+0x106/0x1b0
[<c1106cf6>] wb_do_writeback+0x106/0x1e0
[<c1106c18>] ? wb_do_writeback+0x28/0x1e0
[<c1106e0a>] bdi_writeback_task+0x3a/0xb0
[<c10cbb13>] bdi_start_fn+0x63/0xc0
[<c10cbab0>] ? bdi_start_fn+0x0/0xc0
[<c105d1f4>] kthread+0x74/0x80
[<c105d180>] ? kthread+0x0/0x80
[<c100327a>] kernel_thread_helper+0x6/0x10
3 locks held by flush-8:0/2149:
#0: (&type->s_umount_key#30){+++++.}, at: [<c110676f>] writeback_inodes_wb+0x27f/0x490
#1: (&journal->j_mutex){+.+...}, at: [<c117199a>] do_journal_end+0xba/0xe70
#2: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178cde>] reiserfs_write_lock+0x2e/0x40
INFO: task fstest:3813 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
fstest D 00000002 0 3813 3812 0x00000000
f5103c94 00000082 f5103c40 00000002 f5ad5450 00000007 f5103c28 011f3000
00000006 f5ad5450 c10bb005 00000480 c1710304 f5ad5450 f5ad55ec c2907c40
00000001 f5ad5450 f5103c74 00000046 00000002 f5ad5450 00000007 f5103c6c
Call Trace:
[<c10bb005>] ? free_hot_cold_page+0x1d5/0x280
[<c1462d64>] io_schedule+0x74/0xc0
[<c10b5a45>] sync_page+0x35/0x60
[<c146325a>] __wait_on_bit_lock+0x4a/0x90
[<c10b5a10>] ? sync_page+0x0/0x60
[<c10b59e5>] __lock_page+0x85/0x90
[<c105d660>] ? wake_bit_function+0x0/0x60
[<c10bf654>] truncate_inode_pages_range+0x1e4/0x2d0
[<c10bf75f>] truncate_inode_pages+0x1f/0x30
[<c10bf7cf>] truncate_pagecache+0x5f/0xa0
[<c10bf86a>] vmtruncate+0x5a/0x70
[<c10fdb7d>] inode_setattr+0x5d/0x190
[<c1150117>] reiserfs_setattr+0x1f7/0x2f0
[<c1464569>] ? down_write+0x49/0x70
[<c10fde01>] notify_change+0x151/0x330
[<c10e6f3d>] do_truncate+0x6d/0xa0
[<c10f4ce2>] do_filp_open+0x9a2/0xcf0
[<c1465aec>] ? _raw_spin_unlock+0x2c/0x50
[<c10fec50>] ? alloc_fd+0xe0/0x100
[<c10e602d>] do_sys_open+0x6d/0x130
[<c1002cfb>] ? sysenter_exit+0xf/0x16
[<c10e615e>] sys_open+0x2e/0x40
[<c1002ccc>] sysenter_do_call+0x12/0x32
3 locks held by fstest/3813:
#0: (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [<c10e6f33>] do_truncate+0x63/0xa0
#1: (&sb->s_type->i_alloc_sem_key#3){+.+.+.}, at: [<c10fdf07>] notify_change+0x257/0x330
#2: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178c8e>] reiserfs_write_lock_once+0x2e/0x50
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
On chown, reiserfs will call reiserfs_setattr() to change the owner
of the given inode, but it may also recursively call
reiserfs_setattr() to propagate the owner change to the private xattr
files for this inode.
Hence, the reiserfs lock may be acquired twice which is not wanted
as reiserfs_setattr() calls journal_begin() that is going to try to
relax the lock in order to safely acquire the journal mutex.
Using reiserfs_write_lock_once() from reiserfs_setattr() solves
the problem.
This fixes the following warning, that precedes a lockdep report.
WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3f/0x50()
Hardware name: MS-7418
Unwanted recursive reiserfs lock!
Pid: 4189, comm: fsstress Not tainted 2.6.33-rc2-tip-atom+ #195
Call Trace:
[<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
[<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
[<c103f7ac>] warn_slowpath_common+0x6c/0xc0
[<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50
[<c103f84b>] warn_slowpath_fmt+0x2b/0x30
[<c1178bff>] reiserfs_lock_check_recursive+0x3f/0x50
[<c1172ae3>] do_journal_begin_r+0x83/0x350
[<c1172f2d>] journal_begin+0x7d/0x140
[<c106509a>] ? in_group_p+0x2a/0x30
[<c10fda71>] ? inode_change_ok+0x91/0x140
[<c115007d>] reiserfs_setattr+0x15d/0x2e0
[<c10f9bf3>] ? dput+0xe3/0x140
[<c1465adc>] ? _raw_spin_unlock+0x2c/0x50
[<c117831d>] chown_one_xattr+0xd/0x10
[<c11780a3>] reiserfs_for_each_xattr+0x113/0x2c0
[<c1178310>] ? chown_one_xattr+0x0/0x10
[<c14641e9>] ? mutex_lock_nested+0x2a9/0x350
[<c117826f>] reiserfs_chown_xattrs+0x1f/0x60
[<c106509a>] ? in_group_p+0x2a/0x30
[<c10fda71>] ? inode_change_ok+0x91/0x140
[<c1150046>] reiserfs_setattr+0x126/0x2e0
[<c1177c20>] ? reiserfs_getxattr+0x0/0x90
[<c11b0d57>] ? cap_inode_need_killpriv+0x37/0x50
[<c10fde01>] notify_change+0x151/0x330
[<c10e659f>] chown_common+0x6f/0x90
[<c10e67bd>] sys_lchown+0x6d/0x80
[<c1002ccc>] sysenter_do_call+0x12/0x32
---[ end trace 7c2b77224c1442fc ]---
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
Fix a mistake in commit 0719d3434747889b314a1e8add776418c4148bcf
(reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion)
that has converted a down_write() into a down_read() accidentally.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
reiserfs: Safely acquire i_mutex from xattr_rmdir
reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr
reiserfs: Fix journal mutex <-> inode mutex lock inversion
reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink()
reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle()
reiserfs: Relax reiserfs lock while freeing the journal
reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr
reiserfs: Warn on lock relax if taken recursively
reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion
reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion
reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion
reiserfs: Fix reiserfs lock and journal lock inversion dependency
reiserfs: Fix possible recursive lock
|
|
Relax the reiserfs lock before taking the inode mutex from
xattr_rmdir() to avoid the usual reiserfs lock <-> inode mutex
bad dependency.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
Relax the reiserfs lock before taking the inode mutex from
reiserfs_for_each_xattr() to avoid the usual bad dependencies:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #179
-------------------------------------------------------
rm/3242 is trying to acquire lock:
(&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
but task is already holding lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401aab>] mutex_lock_nested+0x5b/0x340
[<c1143339>] reiserfs_write_lock_once+0x29/0x50
[<c1117022>] reiserfs_lookup+0x62/0x140
[<c10bd85f>] __lookup_hash+0xef/0x110
[<c10bf21d>] lookup_one_len+0x8d/0xc0
[<c1141e3a>] open_xa_dir+0xea/0x1b0
[<c1142720>] reiserfs_for_each_xattr+0x70/0x290
[<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
[<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c10c0b13>] sys_unlinkat+0x23/0x40
[<c1002ec4>] sysenter_do_call+0x12/0x32
-> #0 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401aab>] mutex_lock_nested+0x5b/0x340
[<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
[<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
[<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c10c0b13>] sys_unlinkat+0x23/0x40
[<c1002ec4>] sysenter_do_call+0x12/0x32
other info that might help us debug this:
1 lock held by rm/3242:
#0: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40
stack backtrace:
Pid: 3242, comm: rm Not tainted 2.6.32-atom #179
Call Trace:
[<c13ffa13>] ? printk+0x18/0x1a
[<c105d33a>] print_circular_bug+0xca/0xd0
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105c932>] ? mark_held_locks+0x62/0x80
[<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
[<c1401098>] ? mutex_unlock+0x8/0x10
[<c105f2c8>] lock_acquire+0x68/0x90
[<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
[<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
[<c1401aab>] mutex_lock_nested+0x5b/0x340
[<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290
[<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290
[<c1143180>] ? delete_one_xattr+0x0/0x100
[<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
[<c1143339>] ? reiserfs_write_lock_once+0x29/0x50
[<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
[<c11b0d4f>] ? _atomic_dec_and_lock+0x4f/0x70
[<c111e990>] ? reiserfs_delete_inode+0x0/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c1401098>] ? mutex_unlock+0x8/0x10
[<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
[<c10c3af0>] ? filldir64+0x0/0xf0
[<c1002ef3>] ? sysenter_exit+0xf/0x16
[<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
[<c10c0b13>] sys_unlinkat+0x23/0x40
[<c1002ec4>] sysenter_do_call+0x12/0x32
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
We need to relax the reiserfs lock before locking the inode mutex
from xattr_unlink(), otherwise we'll face the usual bad dependencies:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #178
-------------------------------------------------------
rm/3202 is trying to acquire lock:
(&journal->j_mutex){+.+...}, at: [<c113c234>] do_journal_begin_r+0x94/0x360
but task is already holding lock:
(&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&sb->s_type->i_mutex_key#4/2){+.+...}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401a7b>] mutex_lock_nested+0x5b/0x340
[<c1142a67>] xattr_unlink+0x57/0xb0
[<c1143179>] delete_one_xattr+0x29/0x100
[<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
[<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
[<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c10c0b13>] sys_unlinkat+0x23/0x40
[<c1002ec4>] sysenter_do_call+0x12/0x32
-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401a7b>] mutex_lock_nested+0x5b/0x340
[<c1143359>] reiserfs_write_lock+0x29/0x40
[<c113c23c>] do_journal_begin_r+0x9c/0x360
[<c113c680>] journal_begin+0x80/0x130
[<c1127363>] reiserfs_remount+0x223/0x4e0
[<c10b6dd6>] do_remount_sb+0xa6/0x140
[<c10ce6a0>] do_mount+0x560/0x750
[<c10ce914>] sys_mount+0x84/0xb0
[<c1002ec4>] sysenter_do_call+0x12/0x32
-> #0 (&journal->j_mutex){+.+...}:
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401a7b>] mutex_lock_nested+0x5b/0x340
[<c113c234>] do_journal_begin_r+0x94/0x360
[<c113c680>] journal_begin+0x80/0x130
[<c1116d63>] reiserfs_unlink+0x83/0x2e0
[<c1142a74>] xattr_unlink+0x64/0xb0
[<c1143179>] delete_one_xattr+0x29/0x100
[<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
[<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
[<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c10c0b13>] sys_unlinkat+0x23/0x40
[<c1002ec4>] sysenter_do_call+0x12/0x32
other info that might help us debug this:
2 locks held by rm/3202:
#0: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c114274b>] reiserfs_for_each_xattr+0x9b/0x290
#1: (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0
stack backtrace:
Pid: 3202, comm: rm Not tainted 2.6.32-atom #178
Call Trace:
[<c13ff9e3>] ? printk+0x18/0x1a
[<c105d33a>] print_circular_bug+0xca/0xd0
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c1142a67>] ? xattr_unlink+0x57/0xb0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c113c234>] ? do_journal_begin_r+0x94/0x360
[<c113c234>] ? do_journal_begin_r+0x94/0x360
[<c1401a7b>] mutex_lock_nested+0x5b/0x340
[<c113c234>] ? do_journal_begin_r+0x94/0x360
[<c113c234>] do_journal_begin_r+0x94/0x360
[<c10411b6>] ? run_timer_softirq+0x1a6/0x220
[<c103cb00>] ? __do_softirq+0x50/0x140
[<c113c680>] journal_begin+0x80/0x130
[<c103cba2>] ? __do_softirq+0xf2/0x140
[<c104f72f>] ? hrtimer_interrupt+0xdf/0x220
[<c1116d63>] reiserfs_unlink+0x83/0x2e0
[<c105c932>] ? mark_held_locks+0x62/0x80
[<c11b8d08>] ? trace_hardirqs_on_thunk+0xc/0x10
[<c1002fd8>] ? restore_all_notrace+0x0/0x18
[<c1142a67>] ? xattr_unlink+0x57/0xb0
[<c1142a74>] xattr_unlink+0x64/0xb0
[<c1143179>] delete_one_xattr+0x29/0x100
[<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290
[<c1143150>] ? delete_one_xattr+0x0/0x100
[<c1401cb9>] ? mutex_lock_nested+0x299/0x340
[<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60
[<c1143309>] ? reiserfs_write_lock_once+0x29/0x50
[<c111ea2f>] reiserfs_delete_inode+0x9f/0x150
[<c11b0d1f>] ? _atomic_dec_and_lock+0x4f/0x70
[<c111e990>] ? reiserfs_delete_inode+0x0/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c1401068>] ? mutex_unlock+0x8/0x10
[<c10c3e0d>] ? vfs_readdir+0x7d/0xb0
[<c10c3af0>] ? filldir64+0x0/0xf0
[<c1002ef3>] ? sysenter_exit+0xf/0x16
[<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
[<c10c0b13>] sys_unlinkat+0x23/0x40
[<c1002ec4>] sysenter_do_call+0x12/0x32
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
reiserfs_unlink() may or may not be called under the reiserfs
lock.
But it also takes the reiserfs lock and can then acquire it
recursively which leads to do_journal_begin_r() that fails to
relax the reiserfs lock before grabbing the journal mutex,
creating an unexpected lock inversion.
We need to ensure reiserfs_unlink() won't get the reiserfs lock
recursively using reiserfs_write_lock_once().
This fixes the following warning that precedes a lock inversion
report (reiserfs lock <-> journal mutex).
------------[ cut here ]------------
WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3a/0x50()
Hardware name: MS-7418
Unwanted recursive reiserfs lock!
Pid: 3208, comm: dbench Not tainted 2.6.32-atom #177
Call Trace:
[<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
[<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
[<c10373a7>] warn_slowpath_common+0x67/0xc0
[<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50
[<c1037446>] warn_slowpath_fmt+0x26/0x30
[<c114327a>] reiserfs_lock_check_recursive+0x3a/0x50
[<c113c213>] do_journal_begin_r+0x83/0x360
[<c105eb16>] ? __lock_acquire+0x1296/0x19e0
[<c1142a57>] ? xattr_unlink+0x57/0xb0
[<c113c670>] journal_begin+0x80/0x130
[<c1116d5d>] reiserfs_unlink+0x7d/0x2d0
[<c1142a57>] ? xattr_unlink+0x57/0xb0
[<c1142a57>] ? xattr_unlink+0x57/0xb0
[<c1142a57>] ? xattr_unlink+0x57/0xb0
[<c1142a64>] xattr_unlink+0x64/0xb0
[<c1143169>] delete_one_xattr+0x29/0x100
[<c11427ab>] reiserfs_for_each_xattr+0x10b/0x290
[<c1143140>] ? delete_one_xattr+0x0/0x100
[<c1401ca9>] ? mutex_lock_nested+0x299/0x340
[<c11429aa>] reiserfs_delete_xattrs+0x1a/0x60
[<c11432f9>] ? reiserfs_write_lock_once+0x29/0x50
[<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
[<c11b0d0f>] ? _atomic_dec_and_lock+0x4f/0x70
[<c111e980>] ? reiserfs_delete_inode+0x0/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c10505c6>] ? up_read+0x16/0x30
[<c1022ab7>] ? do_page_fault+0x187/0x330
[<c1002fd8>] ? restore_all_notrace+0x0/0x18
[<c1022930>] ? do_page_fault+0x0/0x330
[<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
[<c10c0a00>] sys_unlink+0x10/0x20
[<c1002ec4>] sysenter_do_call+0x12/0x32
---[ end trace 2e35d71a6cc69d0c ]---
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
We call xattr_lookup() from reiserfs_xattr_get(). We then hold
the reiserfs lock when we grab the i_mutex. But later, we may
relax the reiserfs lock, creating dependency inversion between
both locks.
The lookups and creation jobs ar already protected by the
inode mutex, so we can safely relax the reiserfs lock, dropping
the unwanted reiserfs lock -> i_mutex dependency, as shown
in the following lockdep report:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #173
-------------------------------------------------------
cp/3204 is trying to acquire lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
but task is already holding lock:
(&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401a2b>] mutex_lock_nested+0x5b/0x340
[<c1141d83>] open_xa_dir+0x43/0x1b0
[<c1142722>] reiserfs_for_each_xattr+0x62/0x260
[<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
[<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c10c0a00>] sys_unlink+0x10/0x20
[<c1002ec4>] sysenter_do_call+0x12/0x32
-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401a2b>] mutex_lock_nested+0x5b/0x340
[<c11432b9>] reiserfs_write_lock_once+0x29/0x50
[<c1117012>] reiserfs_lookup+0x62/0x140
[<c10bd85f>] __lookup_hash+0xef/0x110
[<c10bf21d>] lookup_one_len+0x8d/0xc0
[<c1141e2a>] open_xa_dir+0xea/0x1b0
[<c1141fe5>] xattr_lookup+0x15/0x160
[<c1142476>] reiserfs_xattr_get+0x56/0x2a0
[<c1144042>] reiserfs_get_acl+0xa2/0x360
[<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
[<c111789c>] reiserfs_mkdir+0x6c/0x2c0
[<c10bea96>] vfs_mkdir+0xd6/0x180
[<c10c0c10>] sys_mkdirat+0xc0/0xd0
[<c10c0c40>] sys_mkdir+0x20/0x30
[<c1002ec4>] sysenter_do_call+0x12/0x32
other info that might help us debug this:
2 locks held by cp/3204:
#0: (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
#1: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
stack backtrace:
Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
Call Trace:
[<c13ff993>] ? printk+0x18/0x1a
[<c105d33a>] print_circular_bug+0xca/0xd0
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105d3aa>] ? check_usage+0x6a/0x460
[<c105f2c8>] lock_acquire+0x68/0x90
[<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
[<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
[<c1401a2b>] mutex_lock_nested+0x5b/0x340
[<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
[<c11432b9>] reiserfs_write_lock_once+0x29/0x50
[<c1117012>] reiserfs_lookup+0x62/0x140
[<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
[<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
[<c10bd85f>] __lookup_hash+0xef/0x110
[<c10bf21d>] lookup_one_len+0x8d/0xc0
[<c1141e2a>] open_xa_dir+0xea/0x1b0
[<c1141fe5>] xattr_lookup+0x15/0x160
[<c1142476>] reiserfs_xattr_get+0x56/0x2a0
[<c1144042>] reiserfs_get_acl+0xa2/0x360
[<c10ca2e7>] ? new_inode+0x27/0xa0
[<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
[<c1402eb7>] ? _spin_unlock+0x27/0x40
[<c111789c>] reiserfs_mkdir+0x6c/0x2c0
[<c10c7cb8>] ? __d_lookup+0x108/0x190
[<c105c932>] ? mark_held_locks+0x62/0x80
[<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
[<c10bd17a>] ? generic_permission+0x1a/0xa0
[<c11788fe>] ? security_inode_permission+0x1e/0x20
[<c10bea96>] vfs_mkdir+0xd6/0x180
[<c10c0c10>] sys_mkdirat+0xc0/0xd0
[<c10505c6>] ? up_read+0x16/0x30
[<c1002fd8>] ? restore_all_notrace+0x0/0x18
[<c10c0c40>] sys_mkdir+0x20/0x30
[<c1002ec4>] sysenter_do_call+0x12/0x32
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
Keeping the reiserfs lock while freeing the journal on
umount path triggers a lock inversion between bdev->bd_mutex
and the reiserfs lock.
We don't need the reiserfs lock at this stage. The filesystem
is not usable anymore, and there are no more pending commits,
everything got flushed (even this operation was done in parallel
and didn't required the reiserfs lock from the current process).
This fixes the following lockdep report:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #172
-------------------------------------------------------
umount/3904 is trying to acquire lock:
(&bdev->bd_mutex){+.+.+.}, at: [<c10de2c2>] __blkdev_put+0x22/0x160
but task is already holding lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&REISERFS_SB(s)->lock){+.+.+.}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c140199b>] mutex_lock_nested+0x5b/0x340
[<c1143229>] reiserfs_write_lock_once+0x29/0x50
[<c111c485>] reiserfs_get_block+0x85/0x1620
[<c10e1040>] do_mpage_readpage+0x1f0/0x6d0
[<c10e1640>] mpage_readpages+0xc0/0x100
[<c1119b89>] reiserfs_readpages+0x19/0x20
[<c108f1ec>] __do_page_cache_readahead+0x1bc/0x260
[<c108f2b8>] ra_submit+0x28/0x40
[<c1087e3e>] filemap_fault+0x40e/0x420
[<c109b5fd>] __do_fault+0x3d/0x430
[<c109d47e>] handle_mm_fault+0x12e/0x790
[<c1022a65>] do_page_fault+0x135/0x330
[<c1403663>] error_code+0x6b/0x70
[<c10ef9ca>] load_elf_binary+0x82a/0x1a10
[<c10ba130>] search_binary_handler+0x90/0x1d0
[<c10bb70f>] do_execve+0x1df/0x250
[<c1001746>] sys_execve+0x46/0x70
[<c1002fa5>] syscall_call+0x7/0xb
-> #2 (&mm->mmap_sem){++++++}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c109b1ab>] might_fault+0x8b/0xb0
[<c11b8f52>] copy_to_user+0x32/0x70
[<c10c3b94>] filldir64+0xa4/0xf0
[<c1109116>] sysfs_readdir+0x116/0x210
[<c10c3e1d>] vfs_readdir+0x8d/0xb0
[<c10c3ea9>] sys_getdents64+0x69/0xb0
[<c1002ec4>] sysenter_do_call+0x12/0x32
-> #1 (sysfs_mutex){+.+.+.}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c140199b>] mutex_lock_nested+0x5b/0x340
[<c110951c>] sysfs_addrm_start+0x2c/0xb0
[<c1109aa0>] create_dir+0x40/0x90
[<c1109b1b>] sysfs_create_dir+0x2b/0x50
[<c11b2352>] kobject_add_internal+0xc2/0x1b0
[<c11b2531>] kobject_add_varg+0x31/0x50
[<c11b25ac>] kobject_add+0x2c/0x60
[<c1258294>] device_add+0x94/0x560
[<c11036ea>] add_partition+0x18a/0x2a0
[<c110418a>] rescan_partitions+0x33a/0x450
[<c10de5bf>] __blkdev_get+0x12f/0x2d0
[<c10de76a>] blkdev_get+0xa/0x10
[<c11034b8>] register_disk+0x108/0x130
[<c11a87a9>] add_disk+0xd9/0x130
[<c12998e5>] sd_probe_async+0x105/0x1d0
[<c10528af>] async_thread+0xcf/0x230
[<c104bfd4>] kthread+0x74/0x80
[<c1003aab>] kernel_thread_helper+0x7/0x3c
-> #0 (&bdev->bd_mutex){+.+.+.}:
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c140199b>] mutex_lock_nested+0x5b/0x340
[<c10de2c2>] __blkdev_put+0x22/0x160
[<c10de40a>] blkdev_put+0xa/0x10
[<c113ce22>] free_journal_ram+0xd2/0x130
[<c113ea18>] do_journal_release+0x98/0x190
[<c113eb2a>] journal_release+0xa/0x10
[<c1128eb6>] reiserfs_put_super+0x36/0x130
[<c10b776f>] generic_shutdown_super+0x4f/0xe0
[<c10b7825>] kill_block_super+0x25/0x40
[<c11255df>] reiserfs_kill_sb+0x7f/0x90
[<c10b7f4a>] deactivate_super+0x7a/0x90
[<c10cccd8>] mntput_no_expire+0x98/0xd0
[<c10ccfcc>] sys_umount+0x4c/0x310
[<c10cd2a9>] sys_oldumount+0x19/0x20
[<c1002ec4>] sysenter_do_call+0x12/0x32
other info that might help us debug this:
2 locks held by umount/3904:
#0: (&type->s_umount_key#30){+++++.}, at: [<c10b7f45>] deactivate_super+0x75/0x90
#1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40
stack backtrace:
Pid: 3904, comm: umount Not tainted 2.6.32-atom #172
Call Trace:
[<c13ff903>] ? printk+0x18/0x1a
[<c105d33a>] print_circular_bug+0xca/0xd0
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c108b66f>] ? free_pcppages_bulk+0x1f/0x250
[<c105f2c8>] lock_acquire+0x68/0x90
[<c10de2c2>] ? __blkdev_put+0x22/0x160
[<c10de2c2>] ? __blkdev_put+0x22/0x160
[<c140199b>] mutex_lock_nested+0x5b/0x340
[<c10de2c2>] ? __blkdev_put+0x22/0x160
[<c105c932>] ? mark_held_locks+0x62/0x80
[<c10afe12>] ? kfree+0x92/0xd0
[<c10de2c2>] __blkdev_put+0x22/0x160
[<c105cc3b>] ? trace_hardirqs_on+0xb/0x10
[<c10de40a>] blkdev_put+0xa/0x10
[<c113ce22>] free_journal_ram+0xd2/0x130
[<c113ea18>] do_journal_release+0x98/0x190
[<c113eb2a>] journal_release+0xa/0x10
[<c1128eb6>] reiserfs_put_super+0x36/0x130
[<c1050596>] ? up_write+0x16/0x30
[<c10b776f>] generic_shutdown_super+0x4f/0xe0
[<c10b7825>] kill_block_super+0x25/0x40
[<c10f41e0>] ? vfs_quota_off+0x0/0x20
[<c11255df>] reiserfs_kill_sb+0x7f/0x90
[<c10b7f4a>] deactivate_super+0x7a/0x90
[<c10cccd8>] mntput_no_expire+0x98/0xd0
[<c10ccfcc>] sys_umount+0x4c/0x310
[<c10cd2a9>] sys_oldumount+0x19/0x20
[<c1002ec4>] sysenter_do_call+0x12/0x32
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
While deleting the xattrs of an inode, we hold the reiserfs lock
and grab the inode->i_mutex of the targeted inode and the root
private xattr directory.
Later on, we may relax the reiserfs lock for various reasons, this
creates inverted dependencies.
We can remove the reiserfs lock -> i_mutex dependency by relaxing
the former before calling open_xa_dir(). This is fine because the
lookup and creation of xattr private directories done in
open_xa_dir() are covered by the targeted inode mutexes. And deeper
operations in the tree are still done under the write lock.
This fixes the following lockdep report:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-atom #173
-------------------------------------------------------
cp/3204 is trying to acquire lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50
but task is already holding lock:
(&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}:
[<c105ea7f>] __lock_acquire+0x11ff/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401a2b>] mutex_lock_nested+0x5b/0x340
[<c1141d83>] open_xa_dir+0x43/0x1b0
[<c1142722>] reiserfs_for_each_xattr+0x62/0x260
[<c114299a>] reiserfs_delete_xattrs+0x1a/0x60
[<c111ea1f>] reiserfs_delete_inode+0x9f/0x150
[<c10c9c32>] generic_delete_inode+0xa2/0x170
[<c10c9d4f>] generic_drop_inode+0x4f/0x70
[<c10c8b07>] iput+0x47/0x50
[<c10c0965>] do_unlinkat+0xd5/0x160
[<c10c0a00>] sys_unlink+0x10/0x20
[<c1002ec4>] sysenter_do_call+0x12/0x32
-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105f2c8>] lock_acquire+0x68/0x90
[<c1401a2b>] mutex_lock_nested+0x5b/0x340
[<c11432b9>] reiserfs_write_lock_once+0x29/0x50
[<c1117012>] reiserfs_lookup+0x62/0x140
[<c10bd85f>] __lookup_hash+0xef/0x110
[<c10bf21d>] lookup_one_len+0x8d/0xc0
[<c1141e2a>] open_xa_dir+0xea/0x1b0
[<c1141fe5>] xattr_lookup+0x15/0x160
[<c1142476>] reiserfs_xattr_get+0x56/0x2a0
[<c1144042>] reiserfs_get_acl+0xa2/0x360
[<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
[<c111789c>] reiserfs_mkdir+0x6c/0x2c0
[<c10bea96>] vfs_mkdir+0xd6/0x180
[<c10c0c10>] sys_mkdirat+0xc0/0xd0
[<c10c0c40>] sys_mkdir+0x20/0x30
[<c1002ec4>] sysenter_do_call+0x12/0x32
other info that might help us debug this:
2 locks held by cp/3204:
#0: (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0
#1: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0
stack backtrace:
Pid: 3204, comm: cp Not tainted 2.6.32-atom #173
Call Trace:
[<c13ff993>] ? printk+0x18/0x1a
[<c105d33a>] print_circular_bug+0xca/0xd0
[<c105f176>] __lock_acquire+0x18f6/0x19e0
[<c105d3aa>] ? check_usage+0x6a/0x460
[<c105f2c8>] lock_acquire+0x68/0x90
[<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
[<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
[<c1401a2b>] mutex_lock_nested+0x5b/0x340
[<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50
[<c11432b9>] reiserfs_write_lock_once+0x29/0x50
[<c1117012>] reiserfs_lookup+0x62/0x140
[<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140
[<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170
[<c10bd85f>] __lookup_hash+0xef/0x110
[<c10bf21d>] lookup_one_len+0x8d/0xc0
[<c1141e2a>] open_xa_dir+0xea/0x1b0
[<c1141fe5>] xattr_lookup+0x15/0x160
[<c1142476>] reiserfs_xattr_get+0x56/0x2a0
[<c1144042>] reiserfs_get_acl+0xa2/0x360
[<c10ca2e7>] ? new_inode+0x27/0xa0
[<c114461a>] reiserfs_cache_default_acl+0x3a/0x160
[<c1402eb7>] ? _spin_unlock+0x27/0x40
[<c111789c>] reiserfs_mkdir+0x6c/0x2c0
[<c10c7cb8>] ? __d_lookup+0x108/0x190
[<c105c932>] ? mark_held_locks+0x62/0x80
[<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340
[<c10bd17a>] ? generic_permission+0x1a/0xa0
[<c11788fe>] ? security_inode_permission+0x1e/0x20
[<c10bea96>] vfs_mkdir+0xd6/0x180
[<c10c0c10>] sys_mkdirat+0xc0/0xd0
[<c10505c6>] ? up_read+0x16/0x30
[<c1002fd8>] ? restore_all_notrace+0x0/0x18
[<c10c0c40>] sys_mkdir+0x20/0x30
[<c1002ec4>] sysenter_do_call+0x12/0x32
v2: Don't drop reiserfs_mutex_lock_nested_safe() as we'll still
need it later
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
When we relax the reiserfs lock to avoid creating unwanted
dependencies against others locks while grabbing these,
we want to ensure it has not been taken recursively, otherwise
the lock won't be really relaxed. Only its depth will be decreased.
The unwanted dependency would then actually happen.
To prevent from that, add a reiserfs_lock_check_recursive() call
in the places that need it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
i_xattr_sem depends on the reiserfs lock. But after we grab
i_xattr_sem, we may relax/relock the reiserfs lock while waiting
on a freezed filesystem, creating a dependency inversion between
the two locks.
In order to avoid the i_xattr_sem -> reiserfs lock dependency, let's
create a reiserfs_down_read_safe() that acts like
reiserfs_mutex_lock_safe(): relax the reiserfs lock while grabbing
another lock to avoid undesired dependencies induced by the
heivyweight reiserfs lock.
This fixes the following warning:
[ 990.005931] =======================================================
[ 990.012373] [ INFO: possible circular locking dependency detected ]
[ 990.013233] 2.6.33-rc1 #1
[ 990.013233] -------------------------------------------------------
[ 990.013233] dbench/1891 is trying to acquire lock:
[ 990.013233] (&REISERFS_SB(s)->lock){+.+.+.}, at: [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[ 990.013233]
[ 990.013233] but task is already holding lock:
[ 990.013233] (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[ 990.013233]
[ 990.013233] which lock already depends on the new lock.
[ 990.013233]
[ 990.013233]
[ 990.013233] the existing dependency chain (in reverse order) is:
[ 990.013233]
[ 990.013233] -> #1 (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}:
[ 990.013233] [<ffffffff81063afc>] __lock_acquire+0xf9c/0x1560
[ 990.013233] [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[ 990.013233] [<ffffffff814ac194>] down_write+0x44/0x80
[ 990.013233] [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[ 990.013233] [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[ 990.013233] [<ffffffff8115a6aa>] user_set+0x8a/0x90
[ 990.013233] [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[ 990.013233] [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[ 990.013233] [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[ 990.013233] [<ffffffff810e2780>] setxattr+0xc0/0x150
[ 990.013233] [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[ 990.013233] [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
[ 990.013233]
[ 990.013233] -> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
[ 990.013233] [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
[ 990.013233] [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[ 990.013233] [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
[ 990.013233] [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
[ 990.013233] [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[ 990.013233] [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
[ 990.013233] [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
[ 990.013233] [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[ 990.013233] [<ffffffff8115a6aa>] user_set+0x8a/0x90
[ 990.013233] [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[ 990.013233] [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[ 990.013233] [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[ 990.013233] [<ffffffff810e2780>] setxattr+0xc0/0x150
[ 990.013233] [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[ 990.013233] [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
[ 990.013233]
[ 990.013233] other info that might help us debug this:
[ 990.013233]
[ 990.013233] 2 locks held by dbench/1891:
[ 990.013233] #0: (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff810e2678>] vfs_setxattr+0x78/0xc0
[ 990.013233] #1: (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470
[ 990.013233]
[ 990.013233] stack backtrace:
[ 990.013233] Pid: 1891, comm: dbench Not tainted 2.6.33-rc1 #1
[ 990.013233] Call Trace:
[ 990.013233] [<ffffffff81061639>] print_circular_bug+0xe9/0xf0
[ 990.013233] [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560
[ 990.013233] [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
[ 990.013233] [<ffffffff8106414f>] lock_acquire+0x8f/0xb0
[ 990.013233] [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[ 990.013233] [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470
[ 990.013233] [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0
[ 990.013233] [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[ 990.013233] [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50
[ 990.013233] [<ffffffff81062592>] ? mark_held_locks+0x72/0xa0
[ 990.013233] [<ffffffff814ab81d>] ? __mutex_unlock_slowpath+0xbd/0x140
[ 990.013233] [<ffffffff810628ad>] ? trace_hardirqs_on_caller+0x14d/0x1a0
[ 990.013233] [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50
[ 990.013233] [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50
[ 990.013233] [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180
[ 990.013233] [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470
[ 990.013233] [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150
[ 990.013233] [<ffffffff814abcb4>] ? __mutex_lock_common+0x284/0x3b0
[ 990.013233] [<ffffffff8115a6aa>] user_set+0x8a/0x90
[ 990.013233] [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0
[ 990.013233] [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0
[ 990.013233] [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0
[ 990.013233] [<ffffffff810e2780>] setxattr+0xc0/0x150
[ 990.013233] [<ffffffff81056018>] ? sched_clock_cpu+0xb8/0x100
[ 990.013233] [<ffffffff8105eded>] ? trace_hardirqs_off+0xd/0x10
[ 990.013233] [<ffffffff810560a3>] ? cpu_clock+0x43/0x50
[ 990.013233] [<ffffffff810c6820>] ? fget+0xb0/0x110
[ 990.013233] [<ffffffff810c6770>] ? fget+0x0/0x110
[ 990.013233] [<ffffffff81002ddc>] ? sysret_check+0x27/0x62
[ 990.013233] [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0
[ 990.013233] [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b
Reported-and-tested-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
Commit 500f5a0bf5f0624dae34307010e240ec090e4cde
(reiserfs: Fix possible recursive lock) fixed a vmalloc under reiserfs
lock that triggered a lockdep warning because of a
IN-FS-RECLAIM <-> RECLAIM-FS-ON locking dependency inversion.
But this patch has ommitted another vmalloc call in the same path
that allocates the journal. Relax the lock for this one too.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
|
|
It can happen that write does not use all the blocks allocated in
write_begin either because of some filesystem error (like ENOSPC) or
because page with data to write has been removed from memory. We truncate
these blocks so that we don't have dangling blocks beyond i_size.
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This reverts commit e4c570c4cb7a95dbfafa3d016d2739bf3fdfe319, as
requested by Alexey:
"I think I gave a good enough arguments to not merge it.
To iterate:
* patch makes impossible to start using ext3 on EXT3_FS=n kernels
without reboot.
* this is done only for one pointer on task_struct"
None of config options which define task_struct are tristate directly
or effectively."
Requested-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The reiserfs lock -> inode mutex dependency gets inverted when we
relax the lock while walking to the tree.
To fix this, use a specialized version of reiserfs_mutex_lock_safe
that takes care of mutex subclasses. Then we can grab the inode
mutex with I_MUTEX_XATTR subclass without any reiserfs lock
dependency.
This fixes the following report:
[ INFO: possible circular locking dependency detected ]
2.6.32-06793-gf405425-dirty #2
-------------------------------------------------------
mv/18566 is trying to acquire lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1110708>] reiserfs_write_lock+0x28=
/0x40
but task is already holding lock:
(&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [<c111033c>]
reiserfs_for_each_xattr+0x10c/0x380
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&sb->s_type->i_mutex_key#5/3){+.+.+.}:
[<c104f723>] validate_chain+0xa23/0xf70
[<c1050155>] __lock_acquire+0x4e5/0xa70
[<c105075a>] lock_acquire+0x7a/0xa0
[<c134c76f>] mutex_lock_nested+0x5f/0x2b0
[<c11102b4>] reiserfs_for_each_xattr+0x84/0x380
[<c1110615>] reiserfs_delete_xattrs+0x15/0x50
[<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
[<c10a565c>] generic_delete_inode+0x9c/0x150
[<c10a574d>] generic_drop_inode+0x3d/0x60
[<c10a4667>] iput+0x47/0x50
[<c109cc0b>] do_unlinkat+0xdb/0x160
[<c109cca0>] sys_unlink+0x10/0x20
[<c1002c50>] sysenter_do_call+0x12/0x36
-> #0 (&REISERFS_SB(s)->lock){+.+.+.}:
[<c104fc68>] validate_chain+0xf68/0xf70
[<c1050155>] __lock_acquire+0x4e5/0xa70
[<c105075a>] lock_acquire+0x7a/0xa0
[<c134c76f>] mutex_lock_nested+0x5f/0x2b0
[<c1110708>] reiserfs_write_lock+0x28/0x40
[<c1103d6b>] search_by_key+0x1f7b/0x21b0
[<c10e73ef>] search_by_entry_key+0x1f/0x3b0
[<c10e77f7>] reiserfs_find_entry+0x77/0x400
[<c10e81e5>] reiserfs_lookup+0x85/0x130
[<c109a144>] __lookup_hash+0xb4/0x110
[<c109b763>] lookup_one_len+0xb3/0x100
[<c1110350>] reiserfs_for_each_xattr+0x120/0x380
[<c1110615>] reiserfs_delete_xattrs+0x15/0x50
[<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
[<c10a565c>] generic_delete_inode+0x9c/0x150
[<c10a574d>] generic_drop_inode+0x3d/0x60
[<c10a4667>] iput+0x47/0x50
[<c10a1c4f>] dentry_iput+0x6f/0xf0
[<c10a1d74>] d_kill+0x24/0x50
[<c10a396b>] dput+0x5b/0x120
[<c109ca89>] sys_renameat+0x1b9/0x230
[<c109cb28>] sys_rename+0x28/0x30
[<c1002c50>] sysenter_do_call+0x12/0x36
other info that might help us debug this:
2 locks held by mv/18566:
#0: (&sb->s_type->i_mutex_key#5/1){+.+.+.}, at: [<c109b6ac>]
lock_rename+0xcc/0xd0
#1: (&sb->s_type->i_mutex_key#5/3){+.+.+.}, at: [<c111033c>]
reiserfs_for_each_xattr+0x10c/0x380
stack backtrace:
Pid: 18566, comm: mv Tainted: G C 2.6.32-06793-gf405425-dirty #2
Call Trace:
[<c134b252>] ? printk+0x18/0x1e
[<c104e790>] print_circular_bug+0xc0/0xd0
[<c104fc68>] validate_chain+0xf68/0xf70
[<c104c8cb>] ? trace_hardirqs_off+0xb/0x10
[<c1050155>] __lock_acquire+0x4e5/0xa70
[<c105075a>] lock_acquire+0x7a/0xa0
[<c1110708>] ? reiserfs_write_lock+0x28/0x40
[<c134c76f>] mutex_lock_nested+0x5f/0x2b0
[<c1110708>] ? reiserfs_write_lock+0x28/0x40
[<c1110708>] ? reiserfs_write_lock+0x28/0x40
[<c134b60a>] ? schedule+0x27a/0x440
[<c1110708>] reiserfs_write_lock+0x28/0x40
[<c1103d6b>] search_by_key+0x1f7b/0x21b0
[<c1050176>] ? __lock_acquire+0x506/0xa70
[<c1051267>] ? lock_release_non_nested+0x1e7/0x340
[<c1110708>] ? reiserfs_write_lock+0x28/0x40
[<c104e354>] ? trace_hardirqs_on_caller+0x124/0x170
[<c104e3ab>] ? trace_hardirqs_on+0xb/0x10
[<c1042a55>] ? T.316+0x15/0x1a0
[<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
[<c10e73ef>] search_by_entry_key+0x1f/0x3b0
[<c134bf2a>] ? __mutex_unlock_slowpath+0x9a/0x120
[<c104e354>] ? trace_hardirqs_on_caller+0x124/0x170
[<c10e77f7>] reiserfs_find_entry+0x77/0x400
[<c10e81e5>] reiserfs_lookup+0x85/0x130
[<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
[<c109a144>] __lookup_hash+0xb4/0x110
[<c109b763>] lookup_one_len+0xb3/0x100
[<c1110350>] reiserfs_for_each_xattr+0x120/0x380
[<c110ffe0>] ? delete_one_xattr+0x0/0x1c0
[<c1003342>] ? math_error+0x22/0x150
[<c1110708>] ? reiserfs_write_lock+0x28/0x40
[<c1110615>] reiserfs_delete_xattrs+0x15/0x50
[<c1110708>] ? reiserfs_write_lock+0x28/0x40
[<c10ef57f>] reiserfs_delete_inode+0x8f/0x140
[<c10a561f>] ? generic_delete_inode+0x5f/0x150
[<c10ef4f0>] ? reiserfs_delete_inode+0x0/0x140
[<c10a565c>] generic_delete_inode+0x9c/0x150
[<c10a574d>] generic_drop_inode+0x3d/0x60
[<c10a4667>] iput+0x47/0x50
[<c10a1c4f>] dentry_iput+0x6f/0xf0
[<c10a1d74>] d_kill+0x24/0x50
[<c10a396b>] dput+0x5b/0x120
[<c109ca89>] sys_renameat+0x1b9/0x230
[<c1042d2d>] ? sched_clock_cpu+0x9d/0x100
[<c104c8cb>] ? trace_hardirqs_off+0xb/0x10
[<c1042dde>] ? cpu_clock+0x4e/0x60
[<c1350825>] ? do_page_fault+0x155/0x370
[<c1041816>] ? up_read+0x16/0x30
[<c1350825>] ? do_page_fault+0x155/0x370
[<c109cb28>] sys_rename+0x28/0x30
[<c1002c50>] sysenter_do_call+0x12/0x36
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits)
direct I/O fallback sync simplification
ocfs: stop using do_sync_mapping_range
cleanup blockdev_direct_IO locking
make generic_acl slightly more generic
sanitize xattr handler prototypes
libfs: move EXPORT_SYMBOL for d_alloc_name
vfs: force reval of target when following LAST_BIND symlinks (try #7)
ima: limit imbalance msg
Untangling ima mess, part 3: kill dead code in ima
Untangling ima mess, part 2: deal with counters
Untangling ima mess, part 1: alloc_file()
O_TRUNC open shouldn't fail after file truncation
ima: call ima_inode_free ima_inode_free
IMA: clean up the IMA counts updating code
ima: only insert at inode creation time
ima: valid return code from ima_inode_alloc
fs: move get_empty_filp() deffinition to internal.h
Sanitize exec_permission_lite()
Kill cached_lookup() and real_lookup()
Kill path_lookup_open()
...
Trivial conflicts in fs/direct-io.c
|
|
Add a flags argument to struct xattr_handler and pass it to all xattr
handler methods. This allows using the same methods for multiple
handlers, e.g. for the ACL methods which perform exactly the same action
for the access and default ACLs, just using a different underlying
attribute. With a little more groundwork it'll also allow sharing the
methods for the regular user/trusted/secure handlers in extN, ocfs2 and
jffs2 like it's already done for xfs in this patch.
Also change the inode argument to the handlers to a dentry to allow
using the handlers mechnism for filesystems that require it later,
e.g. cifs.
[with GFS2 bits updated by Steven Whitehouse <swhiteho@redhat.com>]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* small define cleanup in header
* fix #ifdeffery in procfs.c via Kconfig
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
/proc/fs/reiserfs/version is on the way of removing ->read_proc interface.
It's empty however, so simply remove it instead of doing dummy
conversion. It's hard to see what information userspace can extract from
empty file.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
journal_info in task_struct is used in journaling file system only. So
introduce CONFIG_FS_JOURNAL_INFO and make it conditional.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When we were using the bkl, we didn't care about dependencies against
other locks, but the mutex conversion created new ones, which is why
we have reiserfs_mutex_lock_safe(), which unlocks the reiserfs lock
before acquiring another mutex.
But this trick actually fails if we have acquired the reiserfs lock
recursively, as we try to unlock it to acquire the new mutex without
inverted dependency, but we eventually only decrease its depth.
This happens in the case of a nested inode creation/deletion.
Say we have no space left on the device, we create an inode
and tak the lock but fail to create its entry, then we release the
inode using iput(), which calls reiserfs_delete_inode() that takes
the reiserfs lock recursively. The path eventually ends up in
journal_begin() where we try to take the journal safely but we
fail because of the reiserfs lock recursion:
[ INFO: possible circular locking dependency detected ]
2.6.32-06486-g053fe57 #2
-------------------------------------------------------
vi/23454 is trying to acquire lock:
(&journal->j_mutex){+.+...}, at: [<c110dac4>] do_journal_begin_r+0x64/0x2f0
but task is already holding lock:
(&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11106a8>] reiserfs_write_lock+0x28/0x40
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[<c104f8f3>] validate_chain+0xa23/0xf70
[<c1050325>] __lock_acquire+0x4e5/0xa70
[<c105092a>] lock_acquire+0x7a/0xa0
[<c134c78f>] mutex_lock_nested+0x5f/0x2b0
[<c11106a8>] reiserfs_write_lock+0x28/0x40
[<c110dacb>] do_journal_begin_r+0x6b/0x2f0
[<c110ddcf>] journal_begin+0x7f/0x120
[<c10f76c2>] reiserfs_remount+0x212/0x4d0
[<c1093997>] do_remount_sb+0x67/0x140
[<c10a9ca6>] do_mount+0x436/0x6b0
[<c10a9f86>] sys_mount+0x66/0xa0
[<c1002c50>] sysenter_do_call+0x12/0x36
-> #0 (&journal->j_mutex){+.+...}:
[<c104fe38>] validate_chain+0xf68/0xf70
[<c1050325>] __lock_acquire+0x4e5/0xa70
[<c105092a>] lock_acquire+0x7a/0xa0
[<c134c78f>] mutex_lock_nested+0x5f/0x2b0
[<c110dac4>] do_journal_begin_r+0x64/0x2f0
[<c110ddcf>] journal_begin+0x7f/0x120
[<c10ef52f>] reiserfs_delete_inode+0x9f/0x140
[<c10a55fc>] generic_delete_inode+0x9c/0x150
[<c10a56ed>] generic_drop_inode+0x3d/0x60
[<c10a4607>] iput+0x47/0x50
[<c10e915c>] reiserfs_create+0x16c/0x1c0
[<c109a9c1>] vfs_create+0xc1/0x130
[<c109dbec>] do_filp_open+0x81c/0x920
[<c109004f>] do_sys_open+0x4f/0x110
[<c1090179>] sys_open+0x29/0x40
[<c1002c50>] sysenter_do_call+0x12/0x36
other info that might help us debug this:
2 locks held by vi/23454:
#0: (&sb->s_type->i_mutex_key#5){+.+.+.}, at: [<c109d64e>]
do_filp_open+0x27e/0x920
#1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11106a8>]
reiserfs_write_lock+0x28/0x40
stack backtrace:
Pid: 23454, comm: vi Not tainted 2.6.32-06486-g053fe57 #2
Call Trace:
[<c134b202>] ? printk+0x18/0x1e
[<c104e960>] print_circular_bug+0xc0/0xd0
[<c104fe38>] validate_chain+0xf68/0xf70
[<c104ca9b>] ? trace_hardirqs_off+0xb/0x10
[<c1050325>] __lock_acquire+0x4e5/0xa70
[<c105092a>] lock_acquire+0x7a/0xa0
[<c110dac4>] ? do_journal_begin_r+0x64/0x2f0
[<c134c78f>] mutex_lock_nested+0x5f/0x2b0
[<c110dac4>] ? do_journal_begin_r+0x64/0x2f0
[<c110dac4>] ? do_journal_begin_r+0x64/0x2f0
[<c110ff80>] ? delete_one_xattr+0x0/0x1c0
[<c110dac4>] do_journal_begin_r+0x64/0x2f0
[<c110ddcf>] journal_begin+0x7f/0x120
[<c11105b5>] ? reiserfs_delete_xattrs+0x15/0x50
[<c10ef52f>] reiserfs_delete_inode+0x9f/0x140
[<c10a55bf>] ? generic_delete_inode+0x5f/0x150
[<c10ef490>] ? reiserfs_delete_inode+0x0/0x140
[<c10a55fc>] generic_delete_inode+0x9c/0x150
[<c10a56ed>] generic_drop_inode+0x3d/0x60
[<c10a4607>] iput+0x47/0x50
[<c10e915c>] reiserfs_create+0x16c/0x1c0
[<c1099a5d>] ? inode_permission+0x7d/0xa0
[<c109a9c1>] vfs_create+0xc1/0x130
[<c10e8ff0>] ? reiserfs_create+0x0/0x1c0
[<c109dbec>] do_filp_open+0x81c/0x920
[<c104ca9b>] ? trace_hardirqs_off+0xb/0x10
[<c134dc0d>] ? _spin_unlock+0x1d/0x20
[<c10a6eea>] ? alloc_fd+0xba/0xf0
[<c109004f>] do_sys_open+0x4f/0x110
[<c1090179>] sys_open+0x29/0x40
[<c1002c50>] sysenter_do_call+0x12/0x36
To fix this, use reiserfs_lock_once() from reiserfs_delete_inode()
which prevents from adding reiserfs lock recursion.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
While allocating the bitmap using vmalloc, we hold the reiserfs lock,
which makes lockdep later reporting a possible deadlock as we may
swap out pages to allocate memory and then take the reiserfs lock
recursively:
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/312 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&REISERFS_SB(s)->lock){+.+.?.}, at: [<c11108a8>] reiserfs_write_lock+0x28/0x40
{RECLAIM_FS-ON-W} state was registered at:
[<c104e1c2>] mark_held_locks+0x62/0x90
[<c104e28a>] lockdep_trace_alloc+0x9a/0xc0
[<c108e396>] kmem_cache_alloc+0x26/0xf0
[<c10850ec>] __get_vm_area_node+0x6c/0xf0
[<c10857de>] __vmalloc_node+0x7e/0xa0
[<c108597b>] vmalloc+0x2b/0x30
[<c10e00b9>] reiserfs_init_bitmap_cache+0x39/0x70
[<c10f8178>] reiserfs_fill_super+0x2e8/0xb90
[<c1094345>] get_sb_bdev+0x145/0x180
[<c10f5a11>] get_super_block+0x21/0x30
[<c10931f0>] vfs_kern_mount+0x40/0xd0
[<c10932d9>] do_kern_mount+0x39/0xd0
[<c10a9857>] do_mount+0x2c7/0x6b0
[<c10a9ca6>] sys_mount+0x66/0xa0
[<c161589b>] mount_block_root+0xc4/0x245
[<c1615a75>] mount_root+0x59/0x5f
[<c1615b8c>] prepare_namespace+0x111/0x14b
[<c1615269>] kernel_init+0xcf/0xdb
[<c10031fb>] kernel_thread_helper+0x7/0x1c
This is actually fine for two reasons: we call vmalloc at mount time
then it's not in the swapping out path. Also the reiserfs lock can be
acquired recursively, but since its implementation depends on a mutex,
it's hard and not necessary worth it to teach that to lockdep.
The lock is useless at mount time anyway, at least until we replay the
journal. But let's remove it from this path later as this needs
more thinking and is a sensible change.
For now we can just relax the lock around vmalloc,
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
tree-wide: fix misspelling of "definition" in comments
reiserfs: fix misspelling of "journaled"
doc: Fix a typo in slub.txt.
inotify: remove superfluous return code check
hdlc: spelling fix in find_pvc() comment
doc: fix regulator docs cut-and-pasteism
mtd: Fix comment in Kconfig
doc: Fix IRQ chip docs
tree-wide: fix assorted typos all over the place
drivers/ata/libata-sff.c: comment spelling fixes
fix typos/grammos in Documentation/edac.txt
sysctl: add missing comments
fs/debugfs/inode.c: fix comment typos
sgivwfb: Make use of ARRAY_SIZE.
sky2: fix sky2_link_down copy/paste comment error
tree-wide: fix typos "couter" -> "counter"
tree-wide: fix typos "offest" -> "offset"
fix kerneldoc for set_irq_msi()
spidev: fix double "of of" in comment
comment typo fix: sybsystem -> subsystem
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: (31 commits)
kill-the-bkl/reiserfs: turn GFP_ATOMIC flag to GFP_NOFS in reiserfs_get_block()
kill-the-bkl/reiserfs: drop the fs race watchdog from _get_block_create_0()
kill-the-bkl/reiserfs: definitely drop the bkl from reiserfs_ioctl()
kill-the-bkl/reiserfs: always lock the ioctl path
kill-the-bkl/reiserfs: fix reiserfs lock to cpu_add_remove_lock dependency
kill-the-bkl/reiserfs: Fix induced mm->mmap_sem to sysfs_mutex dependency
kill-the-bkl/reiserfs: panic in case of lock imbalance
kill-the-bkl/reiserfs: fix recursive reiserfs write lock in reiserfs_commit_write()
kill-the-bkl/reiserfs: fix recursive reiserfs lock in reiserfs_mkdir()
kill-the-bkl/reiserfs: fix "reiserfs lock" / "inode mutex" lock inversion dependency
kill-the-bkl/reiserfs: move the concurrent tree accesses checks per superblock
kill-the-bkl/reiserfs: acquire the inode mutex safely
kill-the-bkl/reiserfs: unlock only when needed in search_by_key
kill-the-bkl/reiserfs: use mutex_lock in reiserfs_mutex_lock_safe
kill-the-bkl/reiserfs: factorize the locking in reiserfs_write_end()
kill-the-bkl/reiserfs: reduce number of contentions in search_by_key()
kill-the-bkl/reiserfs: don't hold the write recursively in reiserfs_lookup()
kill-the-bkl/reiserfs: lock only once on reiserfs_get_block()
kill-the-bkl/reiserfs: conditionaly release the write lock on fs_changed()
kill-the-BKL/reiserfs: add reiserfs_cond_resched()
...
|
|
Merge-reason: The tree was based 2.6.31. It's better to be up to date
with 2.6.32. Although no conflicting changes were made in between,
it gives benchmarking results closer to the lastest kernel behaviour.
|
|
"Journaled" is misspelled "journlaled" in an output string; this patch
fixed it. No changes in functionality.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
GFP_ATOMIC was used in reiserfs_get_block to not lose the Bkl so that
nobody can modify the tree in the middle of its work. Now that we
kicked out the bkl, we can use a more friendly flag. We use GFP_NOFS
here because we already hold the reiserfs lock.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
We had a watchdog in _get_block_create_0() that jumped to a fixup retry
path in case the bkl got relaxed while calling kmap().
This is not necessary anymore since we now have a reiserfs lock that is
not implicitly relaxed while sleeping.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
The reiserfs ioctl path doesn't need the big kernel lock anymore , now
that the filesystem synchronizes through its own lock.
We can then turn reiserfs_ioctl() into an unlocked_ioctl callback.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
Reiserfs uses the ioctl callback for its file operations, which means
that its ioctl path is still locked by the bkl, this was synchronizing
with the rest of the filsystem operations. We have changed that by
locking it with the new reiserfs lock but we do that only from the
compat_ioctl callback.
Fix that by locking reiserfs_ioctl() everytime.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>
|
|
While creating the reiserfs workqueue during the journal
initialization, we are holding the reiserfs lock, but
create_workqueue() also holds the cpu_add_remove_lock, creating
then the following dependency:
- reiserfs lock -> cpu_add_remove_lock
But we also have the following existing dependencies:
- mm->mmap_sem -> reiserfs lock
- cpu_add_remove_lock -> cpu_hotplug.lock -> slub_lock -> sysfs_mutex
The merged dependency chain then becomes:
- mm->mmap_sem -> reiserfs lock -> cpu_add_remove_lock ->
cpu_hotplug.lock -> slub_lock -> sysfs_mutex
But when we fill a dir entry in sysfs_readir(), we are holding the
sysfs_mutex and we also might fault while copying the directory entry
to the user, leading to the following dependency:
- sysfs_mutex -> mm->mmap_sem
The end result is then a lock inversion between sysfs_mutex and
mm->mmap_sem, as reported in the following lockdep warning:
[ INFO: possible circular locking dependency detected ]
2.6.31-07095-g25a3912 #4
-------------------------------------------------------
udevadm/790 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<c1098942>] might_fault+0x72/0xc0
but task is already holding lock:
(sysfs_mutex){+.+.+.}, at: [<c110813c>] sysfs_readdir+0x7c/0x260
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #5 (sysfs_mutex){+.+.+.}:
[...]
-> #4 (slub_lock){+++++.}:
[...]
-> #3 (cpu_hotplug.lock){+.+.+.}:
[...]
-> #2 (cpu_add_remove_lock){+.+.+.}:
[...]
-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[...]
-> #0 (&mm->mmap_sem){++++++}:
[...]
This can be fixed by relaxing the reiserfs lock while creating the
workqueue.
This is fine to relax the lock here, we just keep it around to pass
through reiserfs lock checks and for paranoid reasons.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
|
|
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Alexander Beregalov reported the following warning:
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-03149-gdcc030a #1
-------------------------------------------------------
udevadm/716 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<c107249a>] might_fault+0x4a/0xa0
but task is already holding lock:
(sysfs_mutex){+.+.+.}, at: [<c10cb9aa>] sysfs_readdir+0x5a/0x200
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (sysfs_mutex){+.+.+.}:
[...]
-> #2 (&bdev->bd_mutex){+.+.+.}:
[...]
-> #1 (&REISERFS_SB(s)->lock){+.+.+.}:
[...]
-> #0 (&mm->mmap_sem){++++++}:
[...]
On reiserfs mount path, we take the reiserfs lock and while
initializing the journal, we open the device, taking the
bdev->bd_mutex. Then rescan_partition() may signal the change
to sysfs.
We have then the following dependency:
reiserfs_lock -> bd_mutex -> sysfs_mutex
Later, while entering reiserfs_readpage() after a pagefault in an
mmaped reiserfs file, we are holding the mm->mmap_sem, and we are going
to take the reiserfs lock too.
We have then the following dependency:
mm->mmap_sem -> reiserfs_lock
which, expanded with the previous dependency gives us:
mm->mmap_sem -> reiserfs_lock -> bd_mutex -> sysfs_mutex
Now while entering the sysfs readdir path, we are holding the
sysfs_mutex. And when we copy a directory entry to the user buffer, we
might fault and then take the mm->mmap_sem lock. Which leads to the
circular locking dependency reported.
We can fix that by relaxing the reiserfs lock during the call to
journal_init_dev(), which is the place where we open the mounted
device.
This is fine to relax the lock here because we are in the begining of
the reiserfs mount path and there is nothing to protect at this time,
the journal is not intialized.
We just keep this lock around for paranoid reasons.
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
|
|
Until now, trying to unlock the reiserfs write lock whereas the current
task doesn't hold it lead to a simple warning.
We should actually warn and panic in this case to avoid the user datas
to reach an unstable state.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
|
|
reiserfs_commit_write()
reiserfs_commit_write() is always called with the write lock held.
Thus the current calls to reiserfs_write_lock() in this function are
acquiring the lock recursively.
We can safely drop them.
This also solves further assumptions for this lock to be really
released while calling reiserfs_write_unlock().
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
|
|
reiserfs_mkdir() acquires the reiserfs lock, assuming it has been called
from the dir inodes callbacks, without the lock held.
But it can also be called from other internal sites such as
reiserfs_xattr_init() which already holds the lock. This recursive
locking leads to further wrong assumptions. For example, later calls
to reiserfs_mutex_lock_safe() won't actually unlock the reiserfs lock
the time we acquire a given mutex, creating unexpected lock inversions.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
|
|
dependency
reiserfs_xattr_init is called with the reiserfs write lock held, but
if the ".reiserfs_priv" entry is not created, we take the superblock
root directory inode mutex until .reiserfs_priv is created.
This creates a lock dependency inversion against other sites such as
reiserfs_file_release() which takes an inode mutex and the reiserfs
lock after.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Laurent Riffard <laurent.riffard@free.fr>
|
|
When do_balance() balances the tree, a trick is performed to
provide the ability for other tree writers/readers to check whether
do_balance() is executing concurrently (requires CONFIG_REISERFS_CHECK).
This is done to protect concurrent accesses to the tree. The trick
is the following:
When do_balance is called, a unique global variable called cur_tb
takes a pointer to the current tree to be rebalanced.
Once do_balance finishes its work, cur_tb takes the NULL value.
Then, concurrent tree readers/writers just have to check the value
of cur_tb to ensure do_balance isn't executing concurrently.
If it is, then it proves that schedule() occured on do_balance(),
which then relaxed the bkl that protected the tree.
Now that the bkl has be turned into a mutex, this check is still
fine even though do_balance() becomes preemptible: the write lock
will not be automatically released on schedule(), so the tree is
still protected.
But this is only fine if we have a single reiserfs mountpoint.
Indeed, because the bkl is a global lock, it didn't allowed
concurrent executions between a tree reader/writer in a mount point
and a do_balance() on another tree from another mountpoint.
So assuming all these readers/writers weren't supposed to be
reentrant, the current check now sometimes detect false positives with
the current per-superblock mutex which allows this reentrancy.
This patch keeps the concurrent tree accesses check but moves it
per superblock, so that only trees from a same mount point are
checked to be not accessed concurrently.
[ Impact: fix spurious panic while running several reiserfs mount-points ]
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
|
While searching a pathname, an inode mutex can be acquired
in do_lookup() which calls reiserfs_lookup() which in turn
acquires the write lock.
On the other side reiserfs_fill_super() can acquire the write_lock
and then call reiserfs_lookup_privroot() which can acquire an
inode mutex (the root of the mount point).
So we theoretically risk an AB - BA lock inversion that could lead
to a deadlock.
As for other lock dependencies found since the bkl to mutex
conversion, the fix is to use reiserfs_mutex_lock_safe() which
drops the lock dependency to the write lock.
[ Impact: fix a possible deadlock with reiserfs ]
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
|
search_by_key() is the site which most requires the lock.
This is mostly because it is a very central function and also
because it releases/reaqcuires the write lock at least once each
time it is called.
Such release/reacquire creates a lot of contention in this place and
also opens more the window which let another thread changing the tree.
When it happens, the current path searching over the tree must be
retried from the beggining (the root) which is a wasteful and
time consuming recovery.
This patch factorizes two release/reacquire sequences:
- reading leaf nodes blocks
- reading current block
The latter immediately follows the former.
The whole sequence is safe as a single unlocked section because
we check just after if the tree has changed during these operations.
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
|
reiserfs_mutex_lock_safe() is a hack to avoid any dependency between
an internal reiserfs mutex and the write lock, it has been proposed
to follow the old bkl logic.
The code does the following:
while (!mutex_trylock(m)) {
reiserfs_write_unlock(s);
schedule();
reiserfs_write_lock(s);
}
It then imitate the implicit behaviour of the lock when it was
a Bkl and hadn't such dependency:
mutex_lock(m) {
if (fastpath)
let's go
else {
wait_for_mutex() {
schedule() {
unlock_kernel()
reacquire_lock_kernel()
}
}
}
}
The problem is that by using such explicit schedule(), we don't
benefit of the adaptive mutex spinning on owner.
The logic in use now is:
reiserfs_write_unlock(s);
mutex_lock(m); // -> possible adaptive spinning
reiserfs_write_lock(s);
[ Impact: restore the use of adaptive spinning mutexes in reiserfs ]
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|