Age | Commit message (Collapse) | Author |
|
Give waitqueue spinlocks their own lockdep classes when they
are initialised from init_waitqueue_head(). This means that
struct wait_queue::func functions can operate other waitqueues.
This is used by CacheFiles to catch the page from a backing fs
being unlocked and to wake up another thread to take a copy of
it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Takashi Iwai <tiwai@suse.de>
Cc: linux-cachefs@redhat.com
Cc: torvalds@osdl.org
Cc: akpm@linux-foundation.org
LKML-Reference: <20090810113305.17284.81508.stgit@warthog.procyon.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
* 'kvm-updates/2.6.31' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Avoid redelivery of edge interrupt before next edge
KVM: MMU: limit rmap chain length
KVM: ia64: fix build failures due to ia64/unsigned long mismatches
KVM: Make KVM_HPAGES_PER_HPAGE unsigned long to avoid build error on powerpc
KVM: fix ack not being delivered when msi present
KVM: s390: fix wait_queue handling
KVM: VMX: Fix locking imbalance on emulation failure
KVM: VMX: Fix locking order in handle_invalid_guest_state
KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages
KVM: SVM: force new asid on vcpu migration
KVM: x86: verify MTRR/PAT validity
KVM: PIT: fix kpit_elapsed division by zero
KVM: Fix KVM_GET_MSR_INDEX_LIST
|
|
This patch implements the kernel side support for ftrace event
record sampling.
A new counter sampling attribute is added:
PERF_SAMPLE_TP_RECORD
which requests ftrace events record sampling. In this case
if a PERF_TYPE_TRACEPOINT counter is active and a tracepoint
fires, we emit the tracepoint binary record to the
perfcounter event buffer, as a sample.
Result, after setting PERF_SAMPLE_TP_RECORD attribute from perf
record:
perf record -f -F 1 -a -e workqueue:workqueue_execution
perf report -D
0x21e18 [0x48]: event: 9
.
. ... raw event: size 72 bytes
. 0000: 09 00 00 00 01 00 48 00 d0 c7 00 81 ff ff ff ff ......H........
. 0010: 0a 00 00 00 0a 00 00 00 21 00 00 00 00 00 00 00 ........!......
. 0020: 2b 00 01 02 0a 00 00 00 0a 00 00 00 65 76 65 6e +...........eve
. 0030: 74 73 2f 31 00 00 00 00 00 00 00 00 0a 00 00 00 ts/1...........
. 0040: e0 b1 31 81 ff ff ff ff .......
.
0x21e18 [0x48]: PERF_EVENT_SAMPLE (IP, 1): 10: 0xffffffff8100c7d0 period: 33
The raw ftrace binary record starts at offset 0020.
Translation:
struct trace_entry {
type = 0x2b = 43;
flags = 1;
preempt_count = 2;
pid = 0xa = 10;
tgid = 0xa = 10;
}
thread_comm = "events/1"
thread_pid = 0xa = 10;
func = 0xffffffff8131b1e0 = flush_to_ldisc()
What will come next?
- Userspace support ('perf trace'), 'flight data recorder' mode
for perf trace, etc.
- The unconditional copy from the profiling callback brings
some costs however if someone wants no such sampling to
occur, and needs to be fixed in the future. For that we need
to have an instant access to the perf counter attribute.
This is a matter of a flag to add in the struct ftrace_event.
- Take care of the events recursivity! Don't ever try to record
a lock event for example, it seems some locking is used in
the profiling fast path and lead to a tracing recursivity.
That will be fixed using raw spinlock or recursivity
protection.
- [...]
- Profit! :-)
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/hch/xfs-icache-races:
xfs: fix freeing of inodes not yet added to the inode cache
vfs: add __destroy_inode
vfs: fix inode_init_always calling convention
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter: Fix double list iteration in per task precise stats
perf: Auto-detect libelf
perf symbol: Fix symbol parsing in certain cases: use the build-id as a symlink
perf_counter/powerpc: Check oprofile_cpu_type for NULL before using it
ftrace: Fix perf-tracepoint OOPS
perf report: Add missing command line options to man page
perf: Auto-detect libbfd
perf report: Make --sort comm,dso,symbol the default
|
|
* git://git.infradead.org/mtd-2.6:
jffs2: Fix return value from jffs2_do_readpage_nolock()
mtd: mtdblock: introduce mtdblks_lock
mtd: remove 'SBC8240 Wind River' Device Driver Code
mtd: OneNAND: OMAP2/3: free GPMC CS on module removal
mtd: OneNAND: fix incorrect bufferram offset
mtd: blkdevs: do not forget to get MTD devices
mtd: fix the conversion from dev to mtd_info
mtd: let include/linux/mtd/partitions.h stand on its own
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: matrix_keypad - make matrix keymap size dynamic
Input: wistron_btns - support Prestigio Wifi RF kill button
Input: i8042 - add Asus G1S to noloop exception list
|
|
Fix and improve comments in decompress/generic.h that describe the
decompressor API. Also remove an unused definition, and rename INBUF_LEN
in lib/decompress_inflate.c to conform to bzip2/lzma naming.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
At first, init_task's mems_allowed is initialized as this.
init_task->mems_allowed == node_state[N_POSSIBLE]
And cpuset's top_cpuset mask is initialized as this
top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]
Before 2.6.29:
policy's mems_allowed is initialized as this.
1. update tasks->mems_allowed by its cpuset->mems_allowed.
2. policy->mems_allowed = nodes_and(tasks->mems_allowed, user's mask)
Updating task's mems_allowed in reference to top_cpuset's one.
cpuset's mems_allowed is aware of N_HIGH_MEMORY, always.
In 2.6.30: After commit 58568d2a8215cb6f55caf2332017d7bdff954e1c
("cpuset,mm: update tasks' mems_allowed in time"), policy's mems_allowed
is initialized as this.
1. policy->mems_allowd = nodes_and(task->mems_allowed, user's mask)
Here, if task is in top_cpuset, task->mems_allowed is not updated from
init's one. Assume user excutes command as #numactrl --interleave=all
,....
policy->mems_allowd = nodes_and(N_POSSIBLE, ALL_SET_MASK)
Then, policy's mems_allowd can includes a possible node, which has no pgdat.
MPOL's INTERLEAVE just scans nodemask of task->mems_allowd and access this
directly.
NODE_DATA(nid)->zonelist even if NODE_DATA(nid)==NULL
Then, what's we need is making policy->mems_allowed be aware of
N_HIGH_MEMORY. This patch does that. But to do so, extra nodemask will
be on statck. Because I know cpumask has a new interface of
CPUMASK_ALLOC(), I added it to node.
This patch stands on old behavior. But I feel this fix itself is just a
Band-Aid. But to do fundametal fix, we have to take care of memory
hotplug and it takes time. (task->mems_allowd should be N_HIGH_MEMORY, I
think.)
mpol_set_nodemask() should be aware of N_HIGH_MEMORY and policy's nodemask
should be includes only online nodes.
In old behavior, this is guaranteed by frequent reference to cpuset's
code. Now, most of them are removed and mempolicy has to check it by
itself.
To do check, a few nodemask_t will be used for calculating nodemask. But,
size of nodemask_t can be big and it's not good to allocate them on stack.
Now, cpumask_t has CPUMASK_ALLOC/FREE an easy code for get scratch area.
NODEMASK_ALLOC/FREE shoudl be there.
[akpm@linux-foundation.org: cleanups & tweaks]
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Paul Menage <menage@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When we want to tear down an inode that lost the add to the cache race
in XFS we must not call into ->destroy_inode because that would delete
the inode that won the race from the inode cache radix tree.
This patch provides the __destroy_inode helper needed to fix this,
the actual fix will be in th next patch. As XFS was the only reason
destroy_inode was exported we shift the export to the new __destroy_inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Currently inode_init_always calls into ->destroy_inode if the additional
initialization fails. That's not only counter-intuitive because
inode_init_always did not allocate the inode structure, but in case of
XFS it's actively harmful as ->destroy_inode might delete the inode from
a radix-tree that has never been added. This in turn might end up
deleting the inode for the same inum that has been instanciated by
another process and cause lots of cause subtile problems.
Also in the case of re-initializing a reclaimable inode in XFS it would
free an inode we still want to keep alive.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
|
|
Remove assumption on the shift and size of rows/columns form
matrix_keypad driver.
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
|
|
Not all tracepoints are created equal, in specific the ftrace
tracepoints are created with TRACE_EVENT_FORMAT() which does
not generate the needed bits to tie them into perf counters.
For those events, don't create the 'id' file and fail
->profile_enable when their ID is specified through other
means.
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1249497664.5890.4.camel@laptop>
[ v2: fix build error in the !CONFIG_EVENT_PROFILE case ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
kvm_notify_acked_irq does not check irq type, so that it sometimes
interprets msi vector as irq. As a result, ack notifiers are not
called, which typially hangs the guest. The fix is to track and
check irq type.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
tty-ldisc: be more careful in 'put_ldisc' locking
tty-ldisc: turn ldisc user count into a proper refcount
tty-ldisc: make refcount be atomic_t 'users' count
|
|
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL dependency, since udev depends on BSG
block: Update topology documentation
block: Stack optimal I/O size
block: Add a wrapper for setting minimum request size without a queue
block: Make blk_queue_stack_limits use the new stacking interface
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
ehea: Fix napi list corruption on ifconfig down
igbvf: Allow VF driver to correctly recognize failure to set mac
3c59x: Fix build failure with gcc 3.2
sky2: Avoid transmits during sky2_down()
iwlagn: do not send key clear commands when rfkill enabled
libertas: Read buffer overflow
drivers/net/wireless: introduce missing kfree
drivers/net/wireless/iwlwifi: introduce missing kfree
zd1211rw: fix unaligned access in zd_mac_rx
cfg80211: fix regression on beacon world roaming feature
cfg80211: add two missing NULL pointer checks
ixgbe: Patch to modify 82598 PCIe completion timeout values
bluetooth: rfcomm_init bug fix
mlx4_en: Fix double pci unmapping.
mISDN: Fix handling of receive buffer size in L1oIP
pcnet32: VLB support fixes
pcnet32: remove superfluous NULL pointer check in pcnet32_probe1()
net: restore the original spinlock to protect unicast list
netxen: fix coherent dma mask setting
mISDN: Read buffer overflow
...
|
|
This is pure preparation of changing the ldisc reference counting to be
a true refcount that defines the lifetime of the ldisc. But this is a
purely syntactic change for now to make the next steps easier.
This patch should make no semantic changes at all. But I wanted to make
the ldisc refcount be an atomic (I will be touching it without locks
soon enough), and I wanted to rename it so that there isn't quite as
much confusion between 'ldo->refcount' (ldisk operations refcount) and
'ld->refcount' (ldisc refcount itself) in the same file.
So it's now an atomic 'ld->users' count. It still starts at zero,
despite having a reference from 'tty->ldisc', but that will change once
we turn it into a _real_ refcount.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@mail.by>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
The patch fixes a bug when converting dev to mtd_info by using the
drvdata of the dev, the previous code used
container_of(dev, struct mtd_info, dev), but won't work for the mtdXro
devices as they created without being contained inside mtd_info structure.
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
When declaring static MTD partitions in board specific code, only
including <include/linux/mtd/partitions.h> should suffice without
gcc nagging us with:
In file included from arch/arm/mach-kirkwood/sheevaplug-setup.c:14:
include/linux/mtd/partitions.h:50: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:50: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mtd/partitions.h:51: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:61: warning: 'struct mtd_info' declared inside parameter list
include/linux/mtd/partitions.h:67: warning: 'struct mtd_info' declared inside parameter list
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
In order to be able to distinguish between no samples due to
inactivity and no samples due to task ended, Arjan asked for
PERF_EVENT_EXIT events. This is useful to the boot delay
instrumentation (bootchart) app.
This patch changes the PERF_EVENT_FORK to be emitted on every
clone, and adds PERF_EVENT_EXIT to be emitted on task exit,
after the task's counters have been closed.
This task tracing is controlled through: attr.comm || attr.mmap
and through the new attr.task field.
Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[ cleaned up perf_counter.h a bit ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Introduce blk_limits_io_min() and make blk_queue_io_min() call it.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
io context: fix ref counting
block: make the end_io functions be non-GPL exports
block: fix improper kobject release in blk_integrity_unregister
block: always assign default lock to queues
mg_disk: Add missing ready status check on mg_write()
mg_disk: fix issue with data integrity on error in mg_write()
mg_disk: fix reading invalid status when use polling driver
mg_disk: remove prohibited sleep operation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clocksource: Save mult_orig in clocksource_disable()
|
|
To fix the common case where ->enable() does not set up
mult, make sure mult_orig is saved in mult on disable.
Also add comments to explain why we do this.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Cc: johnstul@us.ibm.com
Cc: lethal@linux-sh.org
Cc: akpm@linux-foundation.org
LKML-Reference: <20090618152432.10136.9932.sendpatchset@rx1.opensource.se>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
the code allready uses flush_kernel_dcache_page(). This patch updates the
driver to the recent sg API changes which require that either SG_MITER_TO_SG
or SG_MITER_FROM_SG is set. SG_MITER_TO_SG calls flush_kernel_dcache_page()
in sg_mitter_stop()
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>
|
|
sg_miter_start() is currently unaware of the direction of the copy
process (to or from the scatter list). It is important to know the
direction because the page has to be flushed in case the data written
is seen on a different mapping in user land on cache incoherent
architectures.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Pierre Ossman <pierre@ossman.eu>
|
|
Commit d9c7d394a8ebacb60097b192939ae9f15235225e
("block: prevent possible io_context->refcount overflow") mistakenly
changed atomic_inc(&ioc->nr_tasks) to atomic_long_inc(&ioc->refcount).
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
I've been doing this for years, and akpm picked me up on it about 12
months ago. lguest partly serves as example code, so let's do it Right.
Also, remove two unused fields in struct vblk_info in the example launcher.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
|
|
I don't really notice it (except to begrudge the extra vertical
space), but Ingo does. And he pointed out that one excuse of lguest
is as a teaching tool, it should set a good example.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM / Hibernate: Replace bdget call with simple atomic_inc of i_count
PM / ACPI: HP G7000 Notebook needs a SCI_EN resume quirk
|
|
To avoid userspace build failures such as:
.../linux/uio.h:37: error: expected `=', `,', `;', `asm' or `__attribute__' before `iov_length'
.../linux/uio.h:47: error: expected declaration specifiers or `...' before `size_t'
move uio functions inside a __KERNEL__ block.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Once a structure goes over PAGE_SIZE*2, we see occasional allocation
failures. Some people have chosen to switch over to things like vmalloc()
that will let them keep array-like access to such a large structures.
But, vmalloc() has plenty of downsides.
Here's an alternative. I think it's what Andrew was suggesting here:
http://lkml.org/lkml/2009/7/2/518
I call it a flexible array. It does all of its work in PAGE_SIZE bits, so
never does an order>0 allocation. The base level has
PAGE_SIZE-2*sizeof(int) bytes of storage for pointers to the second level.
So, with a 32-bit arch, you get about 4MB (4183112 bytes) of total
storage when the objects pack nicely into a page. It is half that on
64-bit because the pointers are twice the size. There's a table detailing
this in the code.
There are kerneldocs for the functions, but here's an
overview:
flex_array_alloc() - dynamically allocate a base structure
flex_array_free() - free the array and all of the
second-level pages
flex_array_free_parts() - free the second-level pages, but
not the base (for static bases)
flex_array_put() - copy into the array at the given index
flex_array_get() - copy out of the array at the given index
flex_array_prealloc() - preallocate the second-level pages
between the given indexes to
guarantee no allocs will occur at
put() time.
We could also potentially just pass the "element_size" into each of the
API functions instead of storing it internally. That would get us one
more base pointer on 32-bit.
I've been testing this by running it in userspace. The header and patch
that I've been using are here, as well as the little script I'm using to
generate the size table which goes in the kerneldocs.
http://sr71.net/~dave/linux/flexarray/
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Found with make headers_check
/usr/include/linux/pps.h:52: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Rodolfo Giometti <giometti@linux.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
After commit ec64f51545fffbc4cb968f0cea56341a4b07e85a ("cgroup: fix
frequent -EBUSY at rmdir"), cgroup's rmdir (especially against memcg)
doesn't return -EBUSY by temporary ref counts. That commit expects all
refs after pre_destroy() is temporary but...it wasn't. Then, rmdir can
wait permanently. This patch tries to fix that and change followings.
- set CGRP_WAIT_ON_RMDIR flag before pre_destroy().
- clear CGRP_WAIT_ON_RMDIR flag when the subsys finds racy case.
if there are sleeping ones, wakes them up.
- rmdir() sleeps only when CGRP_WAIT_ON_RMDIR flag is set.
Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reported-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Reviewed-by: Paul Menage <menage@google.com>
Acked-by: Balbir Sigh <balbir@linux.vnet.ibm.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The bug was introduced by commit cc31edceee04a7b87f2be48f9489ebb72d264844
("cgroups: convert tasks file to use a seq_file with shared pid array").
We cache a pid array for all threads that are opening the same "tasks"
file, but the pids in the array are always from the namespace of the
last process that opened the file, so all other threads will read pids
from that namespace instead of their own namespaces.
To fix it, we maintain a list of pid arrays, which is keyed by pid_ns.
The list will be of length 1 at most time.
Reported-by: Paul Menage <menage@google.com>
Idea-by: Paul Menage <menage@google.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Reviewed-by: Serge Hallyn <serue@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: accept late unlocking of HPA
libata: Updates and fixes for pata_at91 driver
ata_piix: Add new short cable ID
ata_piix: Add new laptop short cable IDs
ahci: add device IDs for Ibex Peak ahci controllers
libata: remove superfluous NULL pointer checks
libata: add missing NULL pointer check to ata_eh_reset()
pata_pcmcia: add CNF-CDROM-ID
|
|
We really don't want to mark the pty as a low-latency device, because as
Alan points out, the ->write method can be called from an IRQ (ppp?),
and that means we can't use ->low_latency=1 as we take mutexes in the
low_latency case.
So rather than using low_latency to force the written data to be pushed
to the ldisc handling at 'write()' time, just make the reader side (or
the poll function) do the flush when it checks whether there is data to
be had.
This also fixes the problem with lost data in an emacs compile buffer
(bugzilla 13815), and we can thus revert the low_latency pty hack
(commit 3a54297478e6578f96fd54bf4daa1751130aca86: "pty: quickfix for the
pty ENXIO timing problems").
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[ Modified to do the tty_flush_to_ldisc() inside input_available_p() so
that it triggers for both read and poll() - Linus]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Create bdgrab(). This function copies an existing reference to a
block_device. It is safe to call from any context.
Hibernation code wishes to copy a reference to the active swap device.
Right now it calls bdget() under a spinlock, but this is wrong because
bdget() can sleep. It doesn't need a full bdget() because we already
hold a reference to active swap devices (and the spinlock protects
against swapoff).
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13827
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
|
|
On certain configurations, HPA isn't or can't be unlocked during
probing but it somehow ends up unlocked afterwards. In the following
thread, the problem can be reliably reproduced after resuming from
STR. The BIOS turns on HPA during boot but forgets to do it during
resume.
http://thread.gmane.org/gmane.linux.kernel/858310
This patch updates libata revalidation such that it considers native
n_sectors. If the device size has increased to match native
n_sectors, it's assumed that HPA has been unlocked involuntarily and
the device is recognized as the same one. This should be fairly safe
while nicely working around the problem.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Christof Warlich <christof@warlich.name>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
|
|
Even though reverse path filter was changed from simple boolean to
trinary control, the loose mode only works if both all and device are
configured because of this logic error.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
inotify: use GFP_NOFS under potential memory pressure
fsnotify: fix inotify tail drop check with path entries
inotify: check filename before dropping repeat events
fsnotify: use def_bool in kconfig instead of letting the user choose
inotify: fix error paths in inotify_update_watch
inotify: do not leak inode marks in inotify_add_watch
inotify: drop user watch count when a watch is removed
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits)
cnic: Fix ISCSI_KEVENT_IF_DOWN message handling.
net: irda: init spinlock after memcpy
ixgbe: fix for 82599 errata marking UDP checksum errors
r8169: WakeOnLan fix for the 8168
netxen: reset ring consumer during cleanup
net/bridge: use kobject_put to release kobject in br_add_if error path
smc91x.h: add config for Nomadik evaluation kit
NET: ROSE: Don't use static buffer.
eepro: Read buffer overflow
tokenring: Read buffer overflow
at1700: Read buffer overflow
fealnx: Write outside array bounds
ixgbe: remove unnecessary call to device_init_wakeup
ixgbe: Don't priority tag control frames in DCB mode
ixgbe: Enable FCoE offload when DCB is enabled for 82599
net: Rework mdio-ofgpio driver to use of_mdio infrastructure
register at91_ether using platform_driver_probe
skge: Enable WoL by default if supported
net: KS8851 needs to depend on MII
be2net: Bug fix in the non-lro path. Size of received packet was not updated in statistics properly.
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (34 commits)
V4L/DVB (12303): cx23885: check pointers before dereferencing in dprintk macro
V4L/DVB (12302): cx23885-417: fix broken IOCTL handling
V4L/DVB (12300): bttv: fix regression: tvaudio must be loaded before tuner
V4L/DVB (12291): b2c2: fix frontends compiled into kernel
V4L/DVB (12286): sn9c20x: reorder includes to be like other drivers
V4L/DVB (12284): gspca - jpeg subdrivers: Check the result of kmalloc(jpeg header).
V4L/DVB (12283): gspca - sn9c20x: New subdriver for sn9c201 and sn9c202 bridges.
V4L/DVB (12282): gspca - main: Support for vidioc_g_chip_ident and vidioc_g/s_register.
V4L/DVB (12269): af9013: auto-detect parameters in case of garbage given by app
V4L/DVB (12267): gspca - sonixj: Bad sensor init of non ov76xx sensors.
V4L/DVB (12265): em28xx: fix tuning problem in HVR-900 (R1)
V4L/DVB (12263): em28xx: set demod profile for Pinnacle Hybrid Pro 320e
V4L/DVB (12262): em28xx: Make sure the tuner is initialized if generic empia USB id was used
V4L/DVB (12261): em28xx: set GPIO properly for Pinnacle Hybrid Pro analog support
V4L/DVB (12260): em28xx: make support work for the Pinnacle Hybrid Pro (eb1a:2881)
V4L/DVB (12258): em28xx: fix typo in mt352 init sequence for Terratec Cinergy T XS USB
V4L/DVB (12257): em28xx: make tuning work for Terratec Cinergy T XS USB (mt352 variant)
V4L/DVB (12245): em28xx: add support for mt9m001 webcams
V4L/DVB (12244): em28xx: adjust vinmode/vinctl based on the stream input format
V4L/DVB (12243): em28xx: allow specifying sensor xtal frequency
...
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm table: pass correct dev area size to device_area_is_valid
dm: remove queue next_ordered workaround for barriers
dm raid1: wake kmirrord when requeueing delayed bios after remote recovery
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
jbd: fix race between write_metadata_buffer and get_write_access
ext3: Get rid of extenddisksize parameter of ext3_get_blocks_handle()
jbd: Fix a race between checkpointing code and journal_get_write_access()
ext3: Fix truncation of symlinks after failed write
jbd: Fail to load a journal if it is too short
|
|
Signed-off-by: Brian Johnson <brijohn@gmail.com>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
|
|
Incorrect device area lengths are being passed to device_area_is_valid().
The regression appeared in 2.6.31-rc1 through commit
754c5fc7ebb417b23601a6222a6005cc2e7f2913.
With the dm-stripe target, the size of the target (ti->len) was used
instead of the stripe_width (ti->len/#stripes). An example of a
consequent incorrect error message is:
device-mapper: table: 254:0: sdb too small for target
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-perf
* 'perf-counters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-perf: (31 commits)
perf_counter tools: Give perf top inherit option
perf_counter tools: Fix vmlinux symbol generation breakage
perf_counter: Detect debugfs location
perf_counter: Add tracepoint support to perf list, perf stat
perf symbol: C++ demangling
perf: avoid structure size confusion by using a fixed size
perf_counter: Fix throttle/unthrottle event logging
perf_counter: Improve perf stat and perf record option parsing
perf_counter: PERF_SAMPLE_ID and inherited counters
perf_counter: Plug more stack leaks
perf: Fix stack data leak
perf_counter: Remove unused variables
perf_counter: Make call graph option consistent
perf_counter: Add perf record option to log addresses
perf_counter: Log vfork as a fork event
perf_counter: Synthesize VDSO mmap event
perf_counter: Make sure we dont leak kernel memory to userspace
perf_counter tools: Fix index boundary check
perf_counter: Fix the tracepoint channel to perfcounters
perf_counter, x86: Extend perf_counter Pentium M support
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
|