Age | Commit message | Author
2008-10-10 | [S390] 3215: Remove tasklet. | Heiko Carstens
The 3215 console irq handler used to schedule a tasklet. However, the console irq handler also gets called from the infamous cio_tpi() function, which in turn does something like:

    local_bh_disable()
    [call console irq handler]
    _local_bh_enable()

_local_bh_enable() prevents execution of softirqs, which is intended within cio_tpi(). However, there might be a new softirq pending because the irq handler scheduled a tasklet. In order to prevent this behaviour we just get rid of the tasklet. It's not doing much anyway.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] console flush on panic / reboot | Holger Smolinski
The s390 console drivers use the unblank callback of the console structure to flush the console buffer. In case of a panic or a reboot the CPU doing the callback can block on the console i/o. The other CPUs in the system continue to work. For panic this is not a good idea. Replace the unblank callback with proper panic/reboot notifier. These get called after all but one CPU have been stopped. Signed-off-by: Holger Smolinski <Holger.Smolinski@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] introduce dirty bit for kvm live migration | Florian Funke
This patch defines a dirty bit in the PGSTE that can be used to implement dirty pages logging for KVM's live migration. The bit is set in the ptep_rcp_copy function, which is called to save dirty and referenced information from the storage key in the PGSTE. The bit can be tested and reset by KVM using the kvm_s390_test_and_clear_page_dirty function that is introduced by this patch. Acked-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Florian Funke <ffunke@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
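The test-and-reset behaviour described above can be sketched in plain C. This is a minimal userspace illustration, not the kernel implementation: the bit position `EXAMPLE_PGSTE_DIRTY` and the helper `example_test_and_clear_dirty` are hypothetical names, and real code operating on a PGSTE would need appropriate locking or atomic operations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the real PGSTE layout is defined by the s390 code. */
#define EXAMPLE_PGSTE_DIRTY (UINT64_C(1) << 3)

/*
 * Report whether the dirty bit was set and clear it in the same call.
 * Not atomic: this sketch ignores concurrency entirely.
 */
static bool example_test_and_clear_dirty(uint64_t *pgste)
{
    bool was_dirty = (*pgste & EXAMPLE_PGSTE_DIRTY) != 0;

    *pgste &= ~EXAMPLE_PGSTE_DIRTY;
    return was_dirty;
}
```

A dirty-page logger would call such a helper once per page per migration round, so each page is reported dirty only if it was modified since the previous round.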
2008-10-10 | [S390] Add ioctl support for EMC Symmetrix Subsystem Control I/O | Nigel Hislop
EMC Symmetrix Subsystem Control I/O through CKD dasd requires a specific parameter list sent to the array via a Perform Subsystem Function CCW. The Symmetrix response is retrieved from the array via a Read Subsystem Data CCW. Signed-off-by: Nigel Hislop <hislop_nigel@emc.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] xpram: per device block request queues. | Martin Schwidefsky
The xpram driver uses a single block device queue for all of its devices so far. With recent kernels, removing the xpram module fails to clean up all sysfs files. The next time the xpram module is loaded you'll get warnings:

    WARNING: at fs/sysfs/dir.c:463 sysfs_add_one+0x5e/0x64()
    sysfs: duplicate filename '35:0' can not be created
    Modules linked in: xpram(+) [last unloaded: xpram]

Followed by the usual WARN_ON output, followed by an error message from kobject_add_internal, followed by a badness in genhd. Allocating a block queue per device fixes this.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] dasd: fix message flood for unsolicited interrupts | Stefan Haberland
In the unsolicited interrupt handler, fake IRBs from CIO have to be ignored because there is nothing to do. The function dump_sense should not be called if there is no sense data available. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] Move private simple udelay function to arch/s390/lib/delay.c. | Heiko Carstens
Move cio's private simple udelay function to lib/delay.c and turn it into something much more readable. So we have all implementations at one place. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] dcssblk: add >2G DCSSs support and stacked contiguous DCSSs support. | Hongjie Yang
The DCSS block device driver is modified to add >2G DCSSs support and allow a DCSS block device to map to a set of contiguous DCSSs. The extmem code is also modified to use new Diagnose x'64' subcodes for >2G DCSSs. Signed-off-by: Hongjie Yang <hongjie@us.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] ptrace changes | Martin Schwidefsky
* System call parameter and result access functions
* Add tracehook calls
* Split syscall_trace into two functions do_syscall_trace_enter and do_syscall_trace_exit
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] s390: use sys_pause for 31bit pause entry point | Christoph Hellwig
sys32_pause is a useless copy of the generic sys_pause. (and it's certainly not there for old sparc32 binaries..) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] qdio enhanced SIGA (iqdio) support. | Klaus-Dieter Wacker
Add support for z10 HiperSockets multiwrite SBALs on output queues. This is used on LPAR with EDDP enabled devices. Signed-off-by: Klaus-Dieter Wacker <kdwacker@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] cio: fix cio_tpi. | Heiko Carstens
In cio_tpi only disable bottom halves when not in interrupt context. Otherwise a WARN_ON gets triggered. Besides that, when we are in interrupt context bottom halves are disabled anyway. Fixes this one:

    Badness at kernel/softirq.c:77
    Modules linked in:
    CPU: 2 Not tainted 2.6.26 #4
    Process swapper (pid: 0, task: 000000003fe83db0, ksp: 000000003fea7d28)
    Krnl PSW : 0404c00180000000 0000000000053f4e (__local_bh_disable+0xbe/0xcc)
               R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
    Krnl GPRS: 0000000000008ee0 00000000005f95e0 0000000000000000 0000000000000001
               000000000020be92 0000000000000000 0000000000000210 00000000005d36c0
               000000003fb5f4d8 0000000000000000 000000000020bed0 000000003fb5f3c8
               00000000009be920 0000000000364898 000000003fb5f408 000000003fb5f3c8
    Krnl Code: 0000000000053f42: bf2f1000 icm %r2,15,0(%r1)
               0000000000053f46: a774ffc5 brc 7,53ed0
               0000000000053f4a: a7f40001 brc 15,53f4c
              >0000000000053f4e: a7280001 lhi %r2,1
               0000000000053f52: 50201000 st %r2,0(%r1)
               0000000000053f56: a7f4ffbd brc 15,53ed0
               0000000000053f5a: 0707 bcr 0,%r7
               0000000000053f5c: a7f13fc0 tmll %r15,16320
    Call Trace:
    ([<0000000000000210>] 0x210)
     [<0000000000053f86>] local_bh_disable+0x2a/0x38
     [<000000000020bed0>] wait_cons_dev+0xd4/0x154
     [<0000000000247cb2>] raw3215_make_room+0x6a/0x1a8
     [<000000000024861a>] raw3215_write+0x86/0x28c
     [<00000000002488a0>] con3215_write+0x80/0x110
     [<000000000004c3e0>] __call_console_drivers+0xc8/0xe4
     [<000000000004c47e>] _call_console_drivers+0x82/0xc4
     [<000000000004c744>] release_console_sem+0x218/0x2c0
     [<000000000004cf64>] vprintk+0x3c0/0x504
     [<0000000000354a4a>] printk+0x52/0x64
     [<0000000000088004>] __print_symbol+0x40/0x50
     [<0000000000071dbc>] print_stack_trace+0x78/0xac
     [<0000000000079e78>] print_lock_dependencies+0x148/0x208
     [<000000000007a050>] print_irq_inversion_bug+0x118/0x15c
     [<000000000007a106>] check_usage_forwards+0x72/0x84
     [<000000000007a36e>] mark_lock+0x1d2/0x594
     [<000000000007baca>] __lock_acquire+0x886/0xf48
     [<000000000007c234>] lock_acquire+0xa8/0xe0
     [<0000000000350316>] _write_lock+0x56/0x98
     [<000000000026cd92>] zfcp_erp_adapter_reopen+0x4e/0x8c
     [<000000000026f1e8>] zfcp_qdio_int_resp+0x2e4/0x2f4
     [<00000000002210f4>] qdio_int_handler+0x274/0x888
     [<00000000002177b6>] ccw_device_call_handler+0x6e/0xd8
     [<0000000000215336>] ccw_device_irq+0xd6/0x160
     [<0000000000212f88>] io_subchannel_irq+0x8c/0x118
     [<000000000020c120>] do_IRQ+0x1d0/0x1fc
     [<00000000000270b2>] io_return+0x0/0x8
     [<000000000001c8a4>] cpu_idle+0x178/0x21c
    ([<000000000001c884>] cpu_idle+0x158/0x21c)
     [<00000000003483a2>] start_secondary+0xb6/0xc8
    INFO: lockdep is turned off.
    Last Breaking-Event-Address:
     [<0000000000053f4a>] __local_bh_disable+0xba/0xcc

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
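The rule stated in this commit (disable bottom halves only when not already in interrupt context) can be modelled in userspace C. This is a sketch under stated assumptions, not the kernel code: `example_in_interrupt` stands in for the kernel's in_interrupt() check, the counter stands in for the softirq-disable depth, and all `example_*` names are hypothetical.

```c
#include <stdbool.h>

static int  example_bh_disable_depth; /* stand-in for the softirq-disable count */
static bool example_in_interrupt;     /* stand-in for the in_interrupt() check */
static int  example_depth_seen;       /* depth observed while the handler runs */

/* Stand-in for the device interrupt handler invoked from the tpi path. */
static void example_irq_handler(void)
{
    example_depth_seen = example_bh_disable_depth;
}

/* Touch the bottom-half state only when not already in interrupt context. */
static void example_tpi(void)
{
    bool irq_context = example_in_interrupt;

    if (!irq_context)
        example_bh_disable_depth++;   /* models local_bh_disable() */
    example_irq_handler();
    if (!irq_context)
        example_bh_disable_depth--;   /* models _local_bh_enable() */
}
```

In the model, a call from process context runs the handler with the depth raised by one, while a call from interrupt context leaves the counter untouched, which is the case that previously triggered the warning shown in the trace.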
2008-10-10 | [S390] cio: Correct use of ! and & | Julia Lawall
In commit e6bafba5b4765a5a252f1b8d31cbf6d2459da337, a bug was fixed that involved converting !x & y to !(x & y). The code below shows the same pattern, and thus should perhaps be fixed in the same way. In particular, the result of !scsw_stctl(&request->irb.scsw) & SCSW_STCTL_STATUS_PEND is always just !scsw_stctl(&request->irb.scsw).

The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/)

    // <smpl>
    @@ expression E; constant C; @@
    (
      !E & !C
    |
    - !E & C
    + !(E & C)
    )
    // </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
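The precedence problem is easy to reproduce outside the kernel, because unary `!` binds more tightly than binary `&` in C. The following is a small sketch: `EXAMPLE_STATUS_PEND` is a hypothetical stand-in for SCSW_STCTL_STATUS_PEND (its value of 0x01 is an assumption made here to match the behaviour the commit message describes), and both helper functions are illustrative only.

```c
#include <stdbool.h>

/* Hypothetical stand-in for SCSW_STCTL_STATUS_PEND; the value is assumed. */
#define EXAMPLE_STATUS_PEND 0x01u

/* Buggy form: parsed as (!stctl) & FLAG, so it only checks whether stctl is zero. */
static bool example_not_pending_buggy(unsigned int stctl)
{
    return !stctl & EXAMPLE_STATUS_PEND;
}

/* Fixed form: true exactly when the status-pending bit is clear. */
static bool example_not_pending_fixed(unsigned int stctl)
{
    return !(stctl & EXAMPLE_STATUS_PEND);
}
```

With other bits set but the pending bit clear (for example `0x04`), the buggy form returns false while the fixed form returns true; the two agree only when the whole value is zero or the pending bit itself is set.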
2008-10-10 | [S390] cio: inline assembly cleanup | Peter Oberparleiter
Fix incorrect in- and output constraints, remove volatile declaration of inline assembly parameters and reformat constraint declarations to be more consistent. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] bus_id -> dev_set_name() for css and ccw busses | Cornelia Huck
Convert remaining s390 users setting bus_id to dev_set_name() or init_name. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] bus_id ->dev_name() conversions in qdio | Martin Schwidefsky
Use dev_name() in the new qdio driver. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] Use s390_root_dev_* in kvm_virtio. | Cornelia Huck
No need to define a static device for the kvm_s390 root device, just use s390_root_dev_register(). This is needed for the bus_id rework Acked-by: Carsten Otte <cotte@de.ibm.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] more bus_id -> dev_name conversions | Cornelia Huck
Some further bus_id -> dev_name() conversions in s390 code. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] bus_id -> dev_set_name() changes | Cornelia Huck
Convert most s390 users setting bus_id to dev_set_name(). css and ccw busses are deferred since they need some special treatment. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] bus_id -> dev_name conversions | Kay Sievers
bus_id -> dev_name() conversions in s390 code. [cornelia.huck@de.ibm.com: minor adaptions] Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] cio: Exorcise cio_msg= from documentation. | Cornelia Huck
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] cio: Update cio_ignore documentation. | Cornelia Huck
Add documentation for the new "purge" cio_ignore parameter. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] cio: introduce purge function for /proc/cio_ignore | Peter Oberparleiter
Allow users to remove blacklisted ccw devices by using the /proc/cio_ignore interface: echo purge > /proc/cio_ignore will remove all devices which are offline and blacklisted. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] cio: move device unregistration to dedicated work queue | Peter Oberparleiter
Use dedicated slow path work queue when unregistering a device due to a user action. This ensures serialization of other register/unregister requests. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | [S390] qdio: speed up multicast traffic on full HiperSocket queue | Ursula Braun
If an asynchronous HiperSockets queue runs full, no further packet can be sent. In this case the next initiative to give transmitted skbs back to the stack is triggered only by a 10-seconds qdio timer. This timer has been introduced for low multicast traffic scenarios to guarantee freeing of skbs in a limited amount of time. For high HiperSocket multicast traffic scenarios progress checking on the outbound queue should be enforced by tasklet rescheduling. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-10-10 | ath9k: Fix return code when ath9k_hw_setpower() fails on reset | Luis R. Rodriguez
We were not reporting a status code back when ath9k_hw_setpower() failed during reset, so let's correct this. Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10 | ath9k: remove nasty FAIL macro from ath9k_hw_reset() | Luis R. Rodriguez
This is fucking horrible crap code so nuke it. There I cursed too in a commit log. Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10 | gre: minor cleanups in netlink interface | Patrick McHardy
- use typeful helpers for IFLA_GRE_LOCAL/IFLA_GRE_REMOTE
- replace magic value by FIELD_SIZEOF
- use MODULE_ALIAS_RTNL_LINK macro
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
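Replacing a magic value with FIELD_SIZEOF ties a size to the struct member it describes, so the two cannot drift apart. The sketch below defines the macro in the form commonly used in kernel headers of that era; `struct example_parm` and the helper functions are hypothetical and are not the structures touched by this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of member f of type t, computed without needing an instance of t. */
#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))

/* Hypothetical structure used only for illustration. */
struct example_parm {
    uint32_t key;
    uint16_t flags;
    uint8_t  ttl;
};

/* Instead of hard-coding 4 or 1, derive the sizes from the members. */
static size_t example_key_size(void)
{
    return FIELD_SIZEOF(struct example_parm, key);
}

static size_t example_ttl_size(void)
{
    return FIELD_SIZEOF(struct example_parm, ttl);
}
```

The operand of `sizeof` is not evaluated, so the null pointer in the macro is never dereferenced.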
2008-10-10 | gre: fix copy and paste error | Patrick McHardy
The flags are dumped twice, the keys not at all. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-10 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: skcipher - Use RNG interface instead of get_random_bytes
  crypto: rng - RNG interface and implementation
  crypto: api - Add fips_enable flag
  crypto: skcipher - Move IV generators into their own modules
  crypto: cryptomgr - Test ciphers using ECB
  crypto: api - Use test infrastructure
  crypto: cryptomgr - Add test infrastructure
  crypto: tcrypt - Add alg_test interface
  crypto: tcrypt - Abort and only log if there is an error
  crypto: crc32c - Use Intel CRC32 instruction
  crypto: tcrypt - Avoid using contiguous pages
  crypto: api - Display larval objects properly
  crypto: api - Export crypto_alg_lookup instead of __crypto_alg_lookup
  crypto: Kconfig - Replace leading spaces with tabs
2008-10-10 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (29 commits)
  RDMA/nes: Fix slab corruption
  IB/mlx4: Set RLKEY bit for kernel QPs
  RDMA/nes: Correct error_module bit mask
  RDMA/nes: Fix routed RDMA connections
  RDMA/nes: Enhanced PFT management scheme
  RDMA/nes: Handle AE bounds violation
  RDMA/nes: Limit critical error interrupts
  RDMA/nes: Stop spurious MAC interrupts
  RDMA/nes: Correct tso_wqe_length
  RDMA/nes: Fill in firmware version for ethtool
  RDMA/nes: Use ethtool timer value
  RDMA/nes: Correct MAX TSO frags value
  RDMA/nes: Enable MC/UC after changing MTU
  RDMA/nes: Free NIC TX buffers when destroying NIC QP
  RDMA/nes: Fix MDC setting
  RDMA/nes: Add wqm_quanta module option
  RDMA/nes: Module parameter permissions
  RDMA/cxgb3: Set active_mtu in ib_port_attr
  RDMA/nes: Add support for 4-port 1G HP blade card
  RDMA/nes: Make mini_cm_connect() static
  ...
2008-10-10 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm | Linus Torvalds
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
  dlm: choose better identifiers
  dlm: remove bkl
  dlm: fix address compare
  dlm: fix locking of lockspace list in dlm_scand
  dlm: detect available userspace daemon
  dlm: allow multiple lockspace creates
2008-10-10 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm: detect lost queue
  dm: publish dm_vcalloc
  dm: publish dm_table_unplug_all
  dm: publish dm_get_mapinfo
  dm: export struct dm_dev
  dm crypt: avoid unnecessary wait when splitting bio
  dm crypt: tidy ctx pending
  dm crypt: fix async inc_pending
  dm crypt: move dec_pending on error into write_io_submit
  dm crypt: remove inc_pending from write_io_submit
  dm crypt: tidy write loop pending
  dm crypt: tidy crypt alloc
  dm crypt: tidy inc pending
  dm exception store: use chunk_t for_areas
  dm exception store: introduce area_location function
  dm raid1: kcopyd should stop on error if errors handled
  dm mpath: remove is_active from struct dm_path
  dm mpath: use more error codes

Fixed up trivial conflict in drivers/md/dm-mpath.c manually.
2008-10-10 | Fix barrier fail detection in XFS | Christoph Hellwig
Currently we disable barriers as soon as we get a buffer in xlog_iodone that has the XBF_ORDERED flag cleared. But this can be the case not only for buffers where the barrier failed, but also the first buffer of a split log write in case of a log wraparound. Due to the disabled barriers we can easily get directory corruption on unclean shutdowns. So instead of using this check add a new buffer flag for failed barrier writes. This is a regression vs 2.6.26 caused by patch to use the right macro to check for the ORDERED flag, as we previously got true returned for every buffer. Thanks to Toei Rei for reporting the bug. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: David Chinner <david@fromorbit.com> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-10 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
  GFS2: Support for I/O barriers
  GFS2: Add UUID to GFS2 sb
  GFS2: high time to take some time over atime
  GFS2: The war on bloat
  GFS2: GFS2 will panic if you misspell any mount options
  GFS2: Direct IO write at end of file error
  GFS2: Use an IS_ERR test rather than a NULL test
  GFS2: Fix race relating to glock min-hold time
  GFS2: Fix & clean up GFS2 rename
  GFS2: rm on multiple nodes causes panic
  GFS2: Fix metafs mounts
  GFS2: Fix debugfs glock file iterator
2008-10-10 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 | Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits)
  [SCSI] zfcp: fix double dbf id usage
  [SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
  [SCSI] zfcp: fix erp list usage without using locks
  [SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
  [SCSI] zfcp: fix deadlock caused by shared work queue tasks
  [SCSI] zfcp: put threshold data in hba trace
  [SCSI] zfcp: Simplify zfcp data structures
  [SCSI] zfcp: Simplify get_adapter_by_busid
  [SCSI] zfcp: remove all typedefs and replace them with standards
  [SCSI] zfcp: attach and release SAN nameserver port on demand
  [SCSI] zfcp: remove unused references, declarations and flags
  [SCSI] zfcp: Update message with input from review
  [SCSI] zfcp: add queue_full sysfs attribute
  [SCSI] scsi_dh: suppress comparison warning
  [SCSI] scsi_dh: add Dell product information into rdac device handler
  [SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option
  [SCSI] qla2xxx: fix printk format warnings
  [SCSI] qla2xxx: Update version number to 8.02.01-k8.
  [SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing.
  [SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling.
  ...
2008-10-10 | Merge branch 'for-2.6.28' of git://git.kernel.dk/linux-2.6-block | Linus Torvalds
* 'for-2.6.28' of git://git.kernel.dk/linux-2.6-block: (132 commits)
  doc/cdrom: Trvial documentation error, file not present
  block_dev: fix kernel-doc in new functions
  block: add some comments around the bio read-write flags
  block: mark bio_split_pool static
  block: Find bio sector offset given idx and offset
  block: gendisk integrity wrapper
  block: Switch blk_integrity_compare from bdev to gendisk
  block: Fix double put in blk_integrity_unregister
  block: Introduce integrity data ownership flag
  block: revert part of d7533ad0e132f92e75c1b2eb7c26387b25a583c1
  bio.h: Remove unused conditional code
  block: remove end_{queued|dequeued}_request()
  block: change elevator to use __blk_end_request()
  gdrom: change to use __blk_end_request()
  memstick: change to use __blk_end_request()
  virtio_blk: change to use __blk_end_request()
  blktrace: use BLKTRACE_BDEV_SIZE as the name size for setup structure
  block: add lld busy state exporting interface
  block: Fix blk_start_queueing() to not kick a stopped queue
  include blktrace_api.h in headers_install
  ...
2008-10-10 | Merge branches 'core/iommu', 'x86/amd-iommu' and 'x86/iommu' into x86-v28-for-linus-phase3-B | Ingo Molnar
Conflicts:
  arch/x86/kernel/pci-gart_64.c
  include/asm-x86/dma-mapping.h
2008-10-10 | Merge branch 'linus' into x86/pat2 | Ingo Molnar
Conflicts:
  arch/x86/mm/init_64.c
2008-10-10 | x86, cpa: make the kernel physical mapping initialization a two pass sequence, fix | Suresh Siddha
Jeremy Fitzhardinge wrote:
> I'd noticed that current tip/master hasn't been booting under Xen, and I
> just got around to bisecting it down to this change.
>
> commit 065ae73c5462d42e9761afb76f2b52965ff45bd6
> Author: Suresh Siddha <suresh.b.siddha@intel.com>
>
> x86, cpa: make the kernel physical mapping initialization a two pass sequence
>
> This patch is causing Xen to fail various pagetable updates because it
> ends up remapping pagetables to RW, which Xen explicitly prohibits (as
> that would allow guests to make arbitrary changes to pagetables, rather
> than have them mediated by the hypervisor).

Instead of making init a two pass sequence, to satisfy Intel's TLB Application note (developer.intel.com/design/processor/applnots/317080.pdf Section 6 page 26), we preserve the original page permissions when fragmenting the large mappings and don't touch the existing memory mapping (which satisfies Xen's requirements).

The only open issue is: on a native linux kernel, we will go back to mapping the first 0-1GB kernel identity mapping as executable (because of the static mapping setup in head_64.S). We can fix this in a different patch if needed.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, pat: cleanups | Ingo Molnar
clean up recently added code to be more consistent with other x86 code. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86: fix pagetable init 64-bit breakage | Suresh Siddha
Fix _end alignment check - can trigger a crash if _end happens to be on a page boundary. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86: track memtype for RAM in page struct | Suresh Siddha
Track the memtype for RAM pages in page struct instead of using the memtype list. This avoids the explosion in the number of entries in memtype list (of the order of 20,000 with AGP) and makes the PAT tracking simpler. We are using PG_arch_1 bit in page->flags. We still use the memtype list for non RAM pages. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, cpa: srlz cpa(), global flush tlb after splitting big page and before doing cpa | Suresh Siddha
Do a global flush tlb after splitting the large page and before we do the actual change page attribute in the PTE. Without this, we violate the TLB application note, which says: "The TLBs may contain both ordinary and large-page translations for a 4-KByte range of linear addresses. This may occur if software modifies the paging structures so that the page size used for the address range changes. If the two translations differ with respect to page frame or attributes (e.g., permissions), processor behavior is undefined and may be implementation-specific." Also serialize cpa() (for !DEBUG_PAGEALLOC, which uses large identity mappings) using cpa_lock, so that we don't allow any other cpu with stale large tlb entries to change the page attribute in parallel to some other cpu splitting a large page entry along with changing the attribute. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, cpa: remove cpa pool code | Suresh Siddha
Interrupt context no longer splits large page in cpa(). So we can do away with cpa memory pool code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, cpa: no need to check alias for __set_pages_p/__set_pages_np | Suresh Siddha
No alias checking needed for setting present/not-present mapping. Otherwise, we may need to break large pages for 64-bit kernel text mappings (this adds to complexity if we want to do this from atomic context especially, for ex: with CONFIG_DEBUG_PAGEALLOC). Let's keep it simple! Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, cpa: dont use large pages for kernel identity mapping with DEBUG_PAGEALLOC | Suresh Siddha
Don't use large pages for kernel identity mapping with DEBUG_PAGEALLOC. This will remove the need to split the large page for the allocated kernel page in the interrupt context. This will simplify cpa code (as we don't do the split any more from the interrupt context). cpa code simplification follows in the subsequent patches. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, cpa: make the kernel physical mapping initialization a two pass sequence | Suresh Siddha
In the first pass, the kernel physical mapping will be set up using large or small pages, but with the same PTE attributes as those set up by the early boot code in head_[32|64].S. After flushing the TLBs, we go through the second pass, which sets up the direct mapped PTEs with the appropriate attributes (like NX, GLOBAL etc) which are runtime detectable. This two pass mechanism conforms to the TLB app note which says: "Software should not write to a paging-structure entry in a way that would change, for any linear address, both the page size and either the page frame or attributes." Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, cpa: remove USER permission from the very early identity mapping attribute | Suresh Siddha
Remove USER from the PTE/PDE attributes for the very early identity mapping. We overwrite these mappings with the KERNEL attribute later in the boot. Just being paranoid here as there is no need for the USER bit to be set. If this breaks something (don't know the history), then we can simply drop this change. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-10 | x86, cpa: rename PTE attribute macros for kernel direct mapping in early boot | Suresh Siddha
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: arjan@linux.intel.com Cc: venkatesh.pallipadi@intel.com Cc: jeremy@goop.org Signed-off-by: Ingo Molnar <mingo@elte.hu>