AgeCommit message (Collapse)Author
2006-12-27V4L/DVB (5014): Allyesconfig build fixes on some non x86 archDavid Brownell
- CAFE_CCIC needs to depend on PCI, else "allyesconfig" breaks on systems without PCI - em28xx-video can't udelay(2500) else "allyesconfig" breaks on systems that refuse to spin that long (I saw it on ARM) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4997): Bttv: delete duplicated ioremap()Akinobu Mita
ioremap() is called twice on the same resource. The return value of the first call is not error-checked, and the return value of the second call is completely ignored. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4996): Msp3400: fix kthread_run error checkAkinobu Mita
The return value of kthread_run() should be checked by IS_ERR(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
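The reason a NULL test is not enough here is that kthread_run() reports failure as an errno value encoded in the pointer, not as NULL. The following is a minimal userspace sketch of that encoding, not kernel code: err_ptr(), is_err(), ptr_err() and fake_kthread_run() are hypothetical stand-ins written for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

/* Userspace imitation of the ERR_PTR()/IS_ERR()/PTR_ERR() idea: the top
 * MAX_ERRNO addresses are reserved to carry negative errno values. */
#define MAX_ERRNO 4095

static void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for kthread_run(): returns a valid pointer on
 * success and an encoded error (never NULL) on failure. */
static int dummy_task;

void *fake_kthread_run(int fail)
{
    return fail ? err_ptr(-ENOMEM) : (void *)&dummy_task;
}

/* Correct caller: test with is_err(), then decode the errno. */
int start_worker(int fail)
{
    void *task = fake_kthread_run(fail);

    if (is_err(task))
        return (int)ptr_err(task);
    return 0;
}
```

A caller that only tested `task == NULL` would treat the encoded error as a valid task pointer, which is the class of bug this commit addresses.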
2006-12-27V4L/DVB (4995): Vivi: fix kthread_run() error checkAkinobu Mita
The return value of kthread_run() should be checked by IS_ERR(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4994): Vivi: fix use after free in list_for_each()Akinobu Mita
Freeing data including list_head in list_for_each() is not safe. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
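The problem with freeing inside a plain list walk is that the iterator reads the next pointer from the node that was just freed. A minimal userspace sketch of the safe pattern follows; struct node, push() and free_all() are hypothetical names for this illustration, and the kernel's own answer to the same problem is list_for_each_safe().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked node, standing in for a struct that embeds
 * a list_head. */
struct node {
    struct node *next;
    int value;
};

/* Push a new node at the front and return the new head. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));

    assert(n != NULL);
    n->next = head;
    n->value = value;
    return n;
}

/* Unsafe pattern (shown only as a comment):
 *     for (n = head; n; n = n->next)
 *             free(n);        // n->next is read after n was freed
 *
 * Safe pattern: remember the successor before freeing the current node. */
size_t free_all(struct node *head)
{
    struct node *n = head;
    struct node *next;
    size_t freed = 0;

    while (n) {
        next = n->next;   /* saved while n is still valid */
        free(n);
        n = next;
        freed++;
    }
    return freed;
}
```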
2006-12-27V4L/DVB (4992): Fix typo in saa7134-dvb.cStephan Berberig
Fix a typo (use_frontent -> use_frontend) in saa7134-dvb.c. Signed-off-by: Stephan Berberig <s.berberig@arcor.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4991): Cafe_ccic.c: fix NULL dereferenceAdrian Bunk
We shouldn't dereference "cam" when we already know it's NULL. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4990): Cpia2/cpia2_usb.c: fix error-path leakAmit Choudhary
Free previously allocated memory (in array elements) if kmalloc() returns NULL in submit_urbs(). Signed-off-by: Amit Choudhary <amit2030@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
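The unwinding described here can be sketched in userspace as follows. This is an illustration under stated assumptions, not the driver's code: test_alloc(), test_free(), the counters and alloc_buffers() are hypothetical helpers that let an allocation failure be injected.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical instrumented allocator, so a failure can be injected and
 * leaks can be counted. */
int live_allocs;        /* allocations not yet freed */
int allocs_allowed;     /* successful allocations left; negative = unlimited */

void *test_alloc(size_t size)
{
    void *p;

    if (allocs_allowed == 0)
        return NULL;
    if (allocs_allowed > 0)
        allocs_allowed--;
    p = malloc(size);
    if (p)
        live_allocs++;
    return p;
}

void test_free(void *p)
{
    if (p) {
        free(p);
        live_allocs--;
    }
}

/* Fill bufs[0..count-1]; on failure, release everything allocated so far
 * so the error path does not leak the earlier elements. */
int alloc_buffers(void *bufs[], int count, size_t size)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        bufs[i] = test_alloc(size);
        if (!bufs[i]) {
            while (--i >= 0) {
                test_free(bufs[i]);
                bufs[i] = NULL;
            }
            return -ENOMEM;
        }
    }
    return 0;
}
```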
2006-12-27V4L/DVB (4988): Cx2341x audio_properties is an u16, not u8Hans Verkuil
This bug broke the MPEG audio mode controls. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4984): LOG_STATUS should show the real temporal filter value.Hans Verkuil
The temporal filter is forced off when scaling. The VIDIOC_LOG_STATUS handler still showed the old temporal filter. It is now consistent with the real temporal filter value. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4983): Force temporal filter to 0 when scaling to prevent ghosting.Hans Verkuil
Change the code to unconditionally turn off the temporal filter when scaling. If the window is not full screen the filter will introduce a nasty ghosting effect. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4982): Fix broken audio mode handling for line-in in msp3400.Hans Verkuil
The wrong matrix was used when an external input was selected instead of the tuner input. The rxsubchans field was also not initialized to STEREO for an external input. And finally the msp34xxg_detect_stereo() should not try to detect stereo for an external input, that code is for the tuner input only. Together these bugs made it hit 'n miss whether you ever got stereo out of the msp3400 for an external input. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4980): Fixes bug 7267: PAL/60 is not workingMauro Carvalho Chehab
In the cx88 driver, the sampling rate should be at the chroma subcarrier frequency (FSC). However, the driver was programming wrong values for PAL/60, PAL/Nc and NTSC 4.43. This patch does the proper calculation. It also calculates the htotal, hdelay and hactive constants according to the sampling rate. It was tested with PAL/60 by Piotr Maksymuk and Olivier, and also with the already-supported standards. Testing is still required for PAL/Nc. Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4979): Fixes compilation when CONFIG_V4L1_COMPAT is not selectedDwaine Garden
- SYSFS: Replaced all to_video_device(cd), video_device_create_file, video_device_remove_file and add the proper checks at create_file
- Converted old norm values to V4L2 ones.
- Robustness on sysfs hue/contrast/saturation queries. Additional check in order to return 0 if the driver is not opened.
- Whitespace cleanups in usbvision-cards.c
This patch merges two fixes by Thierry MERLE and Mauro Chehab, and adds additional checks. Signed-off-by: Dwaine Garden <DwaineGarden@rogers.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4973): Dvb-core: fix printk type warningMichael Krufky
dvb_net.c: In function 'dvb_net_ule': dvb_net.c:628: warning: format '%#lx' expects type 'long unsigned int', but argument 3 has type 'u32' dvb_net.c:628: warning: format '%#lx' expects type 'long unsigned int', but argument 4 has type 'u32' Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4972): Dvb-core: fix bug in CRC-32 checking on 64-bit systemsAng Way Chuang
CRC-32 checking during ULE decapsulation always failed on x86_64 systems due to the size of a variable used to store CRC. This bug was discovered on Fedora Core 6 with kernel-2.6.18-1.2849. The i386 counterpart has no such problem. This patch has been tested on 64-bit system as well as 32-bit system. Signed-off-by: Ang Way Chuang <wcang@nrg.cs.usm.my> Signed-off-by: Michael Krufky <mkrufky@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
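The message does not say exactly how the variable's size broke the comparison. One plausible mechanism, shown here purely as an assumption, is that a 32-bit value with its top bit set gets sign-extended when it ends up in a 64-bit unsigned long, so it no longer equals the 32-bit CRC. All function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Assemble four big-endian trailer bytes into a 32-bit value. */
uint32_t be32(const uint8_t b[4])
{
    assert(b != NULL);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Model of the assumed failure: the same bytes combined in signed int
 * arithmetic and stored in a 64-bit unsigned long. If bit 31 is set, the
 * sign bit is copied into the upper 32 bits. */
uint64_t be32_sign_extended(const uint8_t b[4])
{
    uint32_t v = be32(b);

    return (v & 0x80000000u) ? (0xFFFFFFFF00000000ull | v) : (uint64_t)v;
}

/* Comparison done in a too-wide variable (assumed buggy form). */
int crc_matches_wide(uint32_t computed, const uint8_t trailer[4])
{
    return (uint64_t)computed == be32_sign_extended(trailer);
}

/* Comparison done in a 32-bit variable. */
int crc_matches_u32(uint32_t computed, const uint8_t trailer[4])
{
    return computed == be32(trailer);
}
```

On a 32-bit system both forms behave the same because unsigned long is 32 bits wide there, which would be consistent with the report that i386 was unaffected.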
2006-12-27V4L/DVB (4970): Usbvision memory fixesThierry MERLE
- fix decompression buffer allocation not done at first driver open - simplification of USB sbuf allocation (use of usb_buffer_alloc) - replaced vmalloc by vmalloc_32 (for homogeneity) - add of saa7111 (i2cAddr=0x48) detection printout in attach_inform Signed-off-by: Thierry MERLE <thierry.merle@free.fr> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4968): Add PAL-60 support for cx2584x.Hans Verkuil
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4967): Add missing tuner module option pal=60 for PAL-60 support.Hans Verkuil
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4964): VIDEO_PALETTE_YUYV and VIDEO_PALETTE_YUV422 are the same paletteaudetto@tiscali.it
Consistent handling of VIDEO_PALETTE_YUYV and VIDEO_PALETTE_YUV422 Signed-off-by: Andrea A Odetti <audetto@tiscali.it> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4960): Removal of unused code from usbvision-i2c.cMauro Carvalho Chehab
i2c_adap is almost not used. This patch removes it, cleaning the i2c support, and improving driver understanding. Thanks to Thierry Merle for testing it. Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4959): Usbvision: possible cleanupsAdrian Bunk
This patch contains the following possible cleanups: - make needlessly global functions static - remove the unused EXPORT_SYMBOL's Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4958): Fix namespace conflict between w9968cf.c on MIPSRalf Baechle
Both use __SC. Since __* is sort of a private namespace I've chosen to fix this in the driver. For consistency I decided to also change __UNSC to UNSC. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4956): [NOVA-T-USB2] Put remote-debugging in the right placeMario Rossi
This patch removes unnecessary (and misleading) debug output (it printed the values of the keys in the table up to the value of the key pressed). Signed-off-by: Mario Rossi <mariofutire@googlemail.com> Signed-off-by: Patrick Boettcher <pb@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-27V4L/DVB (4955): Fix autosearch indexMario Rossi
After rewriting the driver the wrong autosearch index was used when COFDM-parameter needed to be detected. Thanks to Mario Rossi who found it. Signed-off-by: Mario Rossi <mariofutire@googlemail.com> Signed-off-by: Patrick Boettcher <pb@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2006-12-23Linux 2.6.20-rc2Linus Torvalds
2006-12-23Fix up CIFS for "test_clear_page_dirty()" removalLinus Torvalds
This also adds the required page "writeback" flag handling, that cifs hasn't been doing and that the page dirty flag changes made obvious. Acked-by: Steve French <smfltc@us.ibm.com> Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-23[PATCH] arch/i386/pci/mmconfig.c tlb flush fixOGAWA Hirofumi
We use the fixmap for accessing PCI config space in pci_mmcfg_read/write(). The problem is in pci_exp_set_dev_base(). It caches the last accessed address to avoid calling set_fixmap_nocache() whenever pci_mmcfg_read/write() is used.

    static inline void pci_exp_set_dev_base(int bus, int devfn)
    {
            u32 dev_base = base | (bus << 20) | (devfn << 12);
            if (dev_base != mmcfg_last_accessed_device) {
                    mmcfg_last_accessed_device = dev_base;
                    set_fixmap_nocache(FIX_PCIE_MCFG, dev_base);
            }
    }

    cpu0                                        cpu1
    ---------------------------------------------------------------------------
    pci_mmcfg_read("device-A")
        pci_exp_set_dev_base()
            set_fixmap_nocache()
                                                pci_mmcfg_read("device-B")
                                                    pci_exp_set_dev_base()
                                                        set_fixmap_nocache()
    pci_mmcfg_read("device-B")
        pci_exp_set_dev_base()
            /* doesn't flush tlb */

But if the CPUs access the devices in the above order, the second pci_mmcfg_read() on cpu0 doesn't flush the TLB, because "mmcfg_last_accessed_device" is device-B. So the second pci_mmcfg_read() on cpu0 accesses device-A via a stale TLB entry. This problem was the cause of several kinds of strange behavior. This patch fixes the situation by adding a "mmcfg_last_accessed_cpu" check.

[ Alternatively, we could make a per-cpu mapping area or something. Not that it's probably worth it, but if we wanted to avoid all locking and instead just disable preemption, that would be the way to go. --Linus ]

Signed-off-by: OGAWA Hirofumi <hogawa@miraclelinux.com> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
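The race in this commit can be replayed with a toy model. This is a simplified sketch under assumptions: each CPU's TLB is modelled as a single cached value for the fixmap slot, set_fixmap() is assumed to refresh only the calling CPU's entry, and every name below is hypothetical, not a kernel symbol.

```c
#include <assert.h>
#include <stdint.h>

#define NCPUS 2

uint32_t tlb[NCPUS];     /* what each CPU's TLB maps the fixmap slot to */
uint32_t last_dev;       /* models mmcfg_last_accessed_device */
int last_cpu;            /* models the added mmcfg_last_accessed_cpu */

void reset_model(void)
{
    for (int i = 0; i < NCPUS; i++)
        tlb[i] = 0;
    last_dev = 0;
    last_cpu = -1;
}

/* Remap the slot; only the calling CPU's TLB entry is refreshed. */
static void set_fixmap(int cpu, uint32_t dev)
{
    assert(cpu >= 0 && cpu < NCPUS);
    tlb[cpu] = dev;
}

/* Old check: keyed on the device only. Returns the device the CPU
 * actually reaches through its TLB. */
uint32_t access_dev_only(int cpu, uint32_t dev)
{
    if (dev != last_dev) {
        last_dev = dev;
        set_fixmap(cpu, dev);
    }
    return tlb[cpu];
}

/* Fixed check: also remap when a different CPU did the last access. */
uint32_t access_dev_and_cpu(int cpu, uint32_t dev)
{
    if (dev != last_dev || cpu != last_cpu) {
        last_dev = dev;
        last_cpu = cpu;
        set_fixmap(cpu, dev);
    }
    return tlb[cpu];
}
```

Replaying the sequence cpu0 reads A, cpu1 reads B, cpu0 reads B with the device-only check leaves cpu0 looking at A; with the CPU check added, cpu0 remaps and reaches B.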
2006-12-23[PATCH] suspend: fix suspend on single-CPU systemsIngo Molnar
Clark Williams reported that suspend doesn't work on his laptop on 2.6.20-rc1-rt kernels. The bug was introduced by the following cleanup commit:

    commit 112cecb2cc0e7341db92281ba04b26c41bb8146d
    Author: Siddha, Suresh B <suresh.b.siddha@intel.com>
    Date:   Wed Dec 6 20:34:31 2006 -0800

        [PATCH] suspend: don't change cpus_allowed for task initiating the suspend

because with this change 'error' is not initialized to 0 anymore, if there are no other online CPUs (i.e. if the system is single-CPU). The fix is to initialize it to 0. The really weird thing is that my version of gcc does not warn about this non-initialized variable situation ... (also fix the kernel printk in the error branch, it was missing a newline) Reported-by: Clark Williams <williams@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-23Fix reiserfs after "test_clear_page_dirty()" removalLinus Torvalds
Thanks to Len Brown for testing this fix, since while they have in the past, none of my machines run reiserfs at the moment. Cc: Vladimir V. Saveliev <vs@namesys.com> Acked-by: Len Brown <lenb@kernel.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-23Clean up and export cancel_dirty_page() to modulesLinus Torvalds
Make cancel_dirty_page() act more like all the other dirty and writeback accounting functions: test for "mapping" being NULL, and do the NR_FILE_DIRTY accounting purely based on mapping_cap_account_dirty(). Also, add it to the exports, so that modular filesystems can use it. Acked-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6Linus Torvalds
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits)
  ACPI: replace kmalloc+memset with kzalloc
  ACPI: Add support for acpi_load_table/acpi_unload_table_id
  fbdev: update after backlight argument change
  ACPI: video: Add dev argument for backlight_device_register
  ACPI: Implement acpi_video_get_next_level()
  ACPI: Kconfig - depend on PM rather than selecting it
  ACPI: fix NULL check in drivers/acpi/osl.c
  ACPI: make drivers/acpi/ec.c:ec_ecdt static
  ACPI: prevent processor module from loading on failures
  ACPI: fix single linked list manipulation
  ACPI: ibm_acpi: allow clean removal
  ACPI: fix git automerge failure
  ACPI: ibm_acpi: respond to workqueue update
  ACPI: dock: add uevent to indicate change in device status
  ACPI: ec: Lindent once again
  ACPI: ec: Change #define to enums there possible.
  ACPI: ec: Style changes.
  ACPI: ec: Acquire Global Lock under EC mutex.
  ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead.
  ACPI: ec: Rename gpe_bit to gpe
  ...
2006-12-22[PATCH] Call init_timer() for ISDN PPP CCP reset state timerMarcel Holtmann
The function isdn_ppp_ccp_reset_alloc_state() sets ->timer.function and ->timer.data and later on calls add_timer() with no init_timer() ever done. Noted by Al Viro. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Karsten Keil <kkeil@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [UDP]: Fix reversed logic in udp_get_port().
  [IPV6]: Dumb typo in generic csum_ipv6_magic()
  [SCTP]: make 2 functions static
  [SCTP]: Fix typo adaption -> adaptation as per the latest API draft.
  [SCTP]: Don't export include/linux/sctp.h to userspace.
  [TCP]: Fix ambiguity in the `before' relation.
  [ATM] drivers/atm/fore200e.c: Cleanups.
  [ATM]: Remove dead ATM_TNETA1570 option.
  NetLabel: correctly fill in unused CIPSOv4 level and category mappings
  NetLabel: perform input validation earlier on CIPSOv4 DOI add ops
2006-12-22[PATCH] cfq-iosched: tighten allow merge criteriaJens Axboe
The logic in cfq_allow_merge() wasn't clear enough - basically allow merging for the same queues only. Do a fast check for 'rq and bio both sync/async' before doing the cfqq hash lookup. This is verified to work with the fixed elv_try_merge() from commit bb4067e34159648d394943d5e2a011f838bff22f. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[UDP]: Fix reversed logic in udp_get_port().David S. Miller
When this code was converted to use sk_for_each() the logic for the "best hash chain length" code was reversed, breaking everything. The original code was of the form:

            size = 0;
            do {
                    if (++size >= best_size_so_far)
                            goto next;
            } while ((sk = sk->next) != NULL);
            best_size_so_far = size;
            best = result;
    next:;

and this got converted into:

            sk_for_each(sk2, node, head)
                    if (++size < best_size_so_far) {
                            best_size_so_far = size;
                            best = result;
                    }

Which does something very very different from the original. Signed-off-by: David S. Miller <davem@davemloft.net>
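The difference between the two forms is easy to see when each chain is reduced to its length. The sketch below is a userspace model, not the kernel code: chains are given as an array of lengths (assumed to be at least 1 for the do/while form), the function names are hypothetical, and pick_fixed() is one way the for-each form could be repaired, not necessarily what the patch does.

```c
#include <assert.h>
#include <limits.h>

/* Original form: give up on a chain as soon as it is no shorter than the
 * best so far; record it only if the whole chain was walked. */
int pick_original(const int *len, int nchains)
{
    int best = -1, best_size_so_far = INT_MAX;

    for (int result = 0; result < nchains; result++) {
        int size = 0, remaining = len[result];

        assert(remaining >= 1);
        do {
            if (++size >= best_size_so_far)
                goto next;
        } while (--remaining > 0);
        best_size_so_far = size;
        best = result;
next:   ;
    }
    return best;
}

/* Broken conversion: records the chain after its first node, so the
 * first chain examined wins with a "size" of 1. */
int pick_broken(const int *len, int nchains)
{
    int best = -1, best_size_so_far = INT_MAX;

    for (int result = 0; result < nchains; result++) {
        int size = 0;

        for (int k = 0; k < len[result]; k++)
            if (++size < best_size_so_far) {
                best_size_so_far = size;
                best = result;
            }
    }
    return best;
}

/* For-each form with the original semantics restored. */
int pick_fixed(const int *len, int nchains)
{
    int best = -1, best_size_so_far = INT_MAX;

    for (int result = 0; result < nchains; result++) {
        int size = 0;

        for (int k = 0; k < len[result]; k++)
            if (++size >= best_size_so_far)
                goto next;
        best_size_so_far = size;
        best = result;
next:   ;
    }
    return best;
}
```

With chain lengths {3, 1, 2}, the original and fixed forms select index 1 (the shortest chain), while the broken form selects index 0.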
2006-12-22[IPV6]: Dumb typo in generic csum_ipv6_magic()Al Viro
... duh Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-22[SCTP]: make 2 functions staticAdrian Bunk
This patch makes the following needlessly global functions static: - ipv6.c: sctp_inet6addr_event() - protocol.c: sctp_inetaddr_event() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-22[SCTP]: Fix typo adaption -> adaptation as per the latest API draft.Ivan Skytte Jorgensen
Signed-off-by: Ivan Skytte Jorgensen <isj-sctp@i1.dk> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-22[SCTP]: Don't export include/linux/sctp.h to userspace.Sridhar Samudrala
This file contains protocol definitions and there are no SCTP apps that use this file. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-22[TCP]: Fix ambiguity in the `before' relation.Gerrit Renker
While looking at DCCP sequence numbers, I stumbled over a problem with the following definition of before in tcp.h:

    static inline int before(__u32 seq1, __u32 seq2)
    {
            return (__s32)(seq1-seq2) < 0;
    }

Problem: This definition suffers from an ambiguity, i.e. always

    before(a, (a + 2^31) % 2^32) = 1
    before((a + 2^31) % 2^32, a) = 1

In text: when the difference between a and b amounts to 2^31, a is always considered `before' b; the function cannot decide. The reason is that implicitly 0 is `before' 1 ... 2^31-1 ... 2^31

Solution: There is a simple fix, by defining before in such a way that 0 is no longer `before' 2^31, i.e. 0 `before' 1 ... 2^31-1. By not using the middle between 0 and 2^32, before can be made unambiguous. This is achieved by testing whether seq2-seq1 > 0 (using signed 32-bit arithmetic).

I attach a patch to codify this. Also the `after' relation is basically a redefinition of `before', it is now defined as a macro after before.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
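The ambiguity and the proposed fix can be checked directly. This is a userspace sketch using stdint types in place of __u32/__s32; before_old() and before_new() are names chosen for this illustration, and the casts assume the usual two's complement conversion from uint32_t to int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Definition quoted in the message: true when seq1 - seq2 is negative
 * as a signed 32-bit value. A distance of exactly 2^31 maps to INT32_MIN
 * in both directions, so both orderings come out as "before". */
int before_old(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq1 - seq2) < 0;
}

/* Form described as the fix: true when seq2 - seq1 is strictly positive
 * as a signed 32-bit value. A distance of exactly 2^31 is INT32_MIN,
 * which is not positive, so neither ordering is "before". */
int before_new(uint32_t seq1, uint32_t seq2)
{
    return (int32_t)(seq2 - seq1) > 0;
}

/* `after' expressed in terms of `before', as the message describes. */
#define after_new(seq2, seq1) before_new(seq1, seq2)
```

For any other distance the two definitions agree; they differ only at the midpoint 2^31.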
2006-12-22[ATM] drivers/atm/fore200e.c: Cleanups.Adrian Bunk
This patch contains the following transformations from custom functions to standard kernel version: - fore200e_kmalloc() -> kzalloc() - fore200e_kfree() -> kfree() - fore200e_swap() -> cpu_to_be32() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-22[ATM]: Remove dead ATM_TNETA1570 option.Adrian Bunk
This patch removes the unconverted ATM_TNETA1570 option that also lacks any code in the kernel. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2006-12-22NetLabel: correctly fill in unused CIPSOv4 level and category mappingsPaul Moore
Back when the original NetLabel patches were being changed to use Netlink attributes correctly some code was accidentally dropped which set all of the undefined CIPSOv4 level and category mappings to a sentinel value. The result is that the mapping data in the kernel contains bogus mappings which always map to zero. This patch restores the old/correct behavior by initializing the mapping data to the correct sentinel value. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2006-12-22NetLabel: perform input validation earlier on CIPSOv4 DOI add opsPaul Moore
There are a couple of cases where validation of the user input for a CIPSOv4 DOI add operation was not being done soon enough; the result was unexpected behavior which led to oops/panics/lockups on some platforms. This patch moves the existing input validation code earlier in the code path to protect against bogus user input. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
2006-12-22[PATCH] Fix up page_mkclean_one(): virtual caches, s390Peter Zijlstra
- add flush_cache_page() for all those virtual indexed cache architectures. - handle s390. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] serial/uartlite: Only enable port if request_port succeededPeter Korsgaard
The uartlite driver used to always enable the port even if request_port failed causing havoc. This patch fixes it. Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] Fix reparenting to the same thread group. (take 2)Eric W. Biederman
This patch fixes the case when we reparent to a different thread in the same thread group. This modifies the code so that we do not send signals and do not change the signal to send to SIGCHLD unless we have changed the thread group of our parents. It also suppresses sending pdeath_sig in this case as well since the result of getppid doesn't change. Thanks to Oleg for spotting my bug of only fixing this for non-ptraced tasks. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Albert Cahalan <acahalan@gmail.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Coywolf Qi Hunt <qiyong@fc-cn.com> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] build compile.h earlierAndrew Morton
compile.h is created super-late in the build. But proc_misc.c wants to include it, and it's generally not sane to have a header file in include/linux be created at the end of the build: it's either not present or, worse, wrong for most of the build. So the patch arranges for compile.h to be built at the start of the build process. It also consolidates the compile.h rules with those for version.h and utsname.h, so they all get built together. I hope. My chances of having got this right are about 2%. Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-22[PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and APM idle codeIngo Molnar
Fernando Lopez-Lezcano reported frequent scheduling latencies and audio xruns starting at the 2.6.18-rt kernel, and those problems persisted all until current -rt kernels. The latencies were serious and unjustified by system load, often in the milliseconds range. After a patient and heroic multi-month effort of Fernando, where he tested dozens of kernels, tried various configs, boot options, test-patches of mine and provided latency traces of those incidents, the following 'smoking gun' trace was captured by him:

                     _------=> CPU#
                    / _-----=> irqs-off
                   | / _----=> need-resched
                   || / _---=> hardirq/softirq
                   ||| / _--=> preempt-depth
                   |||| /
                   |||||     delay
       cmd     pid ||||| time  |   caller
          \   /    |||||   \   |   /
      IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (try_to_wake_up)
      IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup <<...>-5856> (37 0)
      IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (c01262ba 0 0)
      IRQ_19-1479  1D..1    0us : resched_task (try_to_wake_up)
      IRQ_19-1479  1D..1    0us : __spin_unlock_irqrestore (try_to_wake_up)
      ...
      <idle>-0     1...1   11us!: default_idle (cpu_idle)
      ...
      <idle>-0     0Dn.1  602us : smp_apic_timer_interrupt (c0103baf 1 0)
      ...
       <...>-5856  0D..2  618us : __switch_to (__schedule)
       <...>-5856  0D..2  618us : __schedule <<idle>-0> (20 162)
       <...>-5856  0D..2  619us : __spin_unlock_irq (__schedule)
       <...>-5856  0...1  619us : trace_stop_sched_switched (__schedule)
       <...>-5856  0D..1  619us : trace_stop_sched_switched <<...>-5856> (37 0)

what is visible in this trace is that CPU#1 ran try_to_wake_up() for PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task() for CPU#0. But it decided to not send an IPI to that CPU - due to TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set, and only rescheduled to PID:5856 upon the next lapic timer IRQ. The result was a 600+ usecs latency and a missed wakeup!
the bug turned out to be an idle-wakeup bug introduced into the mainline kernel this summer via an optimization in the x86_64 tree:

    commit 495ab9c045e1b0e5c82951b762257fe1c9d81564
    Author: Andi Kleen <ak@suse.de>
    Date:   Mon Jun 26 13:59:11 2006 +0200

        [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status

        During some profiling I noticed that default_idle causes a lot of memory traffic. I think that is caused by the atomic operations to clear/set the polling flag in thread_info. There is actually no reason to make this atomic - only the idle thread does it to itself, other CPUs only read it. So I moved it into ti->status.

the problem is this type of change:

            if (!hlt_counter && boot_cpu_data.hlt_works_ok) {
    -               clear_thread_flag(TIF_POLLING_NRFLAG);
    +               current_thread_info()->status &= ~TS_POLLING;
                    smp_mb__after_clear_bit();
                    while (!need_resched()) {
                            local_irq_disable();

this changes clear_thread_flag() to an explicit clearing of TS_POLLING. clear_thread_flag() is defined as:

            clear_bit(flag, &ti->flags);

and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms:

    static inline void clear_bit(int nr, volatile unsigned long * addr)
    {
            __asm__ __volatile__( LOCK_PREFIX
                    "btrl %1,%0"

hence smp_mb__after_clear_bit() is defined as a simple compile barrier:

    #define smp_mb__after_clear_bit()       barrier()

but the explicit TS_POLLING clearing introduced by the patch:

    +               current_thread_info()->status &= ~TS_POLLING;

is not an atomic op! So the clearing of the TS_POLLING bit is freely reorderable with the reading of the NEED_RESCHED bit - and both now reside in different memory addresses. CPU idle wakeup very much depends on ordered memory ops, the clearing of the TS_POLLING flag must always be done before we test need_resched() and hit the idle instruction(s). [Symmetrically, the wakeup code needs to set NEED_RESCHED before it tests the TS_POLLING flag, so memory ordering is paramount.]
Fernando's dual-core Athlon64 system has a sufficiently advanced memory ordering model so that it triggered this scenario very often. ( And it also turned out that the reason why these latencies never triggered on my test systems is that I routinely use idle=poll, which was the only idle variant not affected by this bug. ) The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to act as an absolute barrier between the TS_POLLING write and the NEED_RESCHED read. This affects almost all idling methods (default, ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64. Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
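The store-then-load ordering requirement on both sides can be sketched with C11 atomics. This is an illustration in portable userspace C, not the kernel code: the two flags and both function names are hypothetical, and atomic_thread_fence(memory_order_seq_cst) is used here in the role that smp_mb() plays in the fix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

atomic_bool polling = true;             /* stands in for TS_POLLING */
atomic_bool need_resched_flag = false;  /* stands in for NEED_RESCHED */

/* Idle side: announce "no longer polling", then look for pending work.
 * Returns true if it is safe to enter the idle instruction. Without the
 * full fence the store and the load touch different addresses and may be
 * reordered, so a wakeup could be missed. */
bool idle_may_halt(void)
{
    atomic_store_explicit(&polling, false, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return !atomic_load_explicit(&need_resched_flag, memory_order_relaxed);
}

/* Waker side, symmetric: set need_resched first, then test the polling
 * flag. Returns true if the target CPU must be kicked with an IPI. */
bool waker_needs_ipi(void)
{
    atomic_store_explicit(&need_resched_flag, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return !atomic_load_explicit(&polling, memory_order_relaxed);
}
```

With a full fence on both sides, at least one of the two CPUs is guaranteed to observe the other's store: either the idle side sees the pending reschedule and does not halt, or the waker sees that polling is off and sends the IPI.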