aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2008-02-04virtio: balloon driverRusty Russell
After discussions with Anthony Liguori, it seems that the virtio balloon can be made even simpler. Here's my attempt. The device configuration tells the driver how much memory it should take from the guest (ie. balloon size). The guest feeds the page numbers it has taken via one virtqueue. A second virtqueue feeds the page numbers the driver wants back: if the device has the VIRTIO_BALLOON_F_MUST_TELL_HOST bit, then this queue is compulsory, otherwise it's advisory (and the guest can simply fault the pages back in). This driver can be enhanced later to deflate the balloon via a shrinker, oom callback or we could even go for a complete set of in-guest regulators. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Use PCI revision field to indicate virtio PCI ABI versionAnthony Liguori
As Avi pointed out, as we continue to massage the virtio PCI ABI, we can make things a little more friendly to users by utilizing the PCI revision field to indicate which version of the ABI we're using. This is a hard ABI version and incrementing it will cause the guest driver to break. This is the necessary changes to virtio_pci to support this. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: PCI deviceAnthony Liguori
This is a PCI device that implements a transport for virtio. It allows virtio devices to be used by QEMU based VMMs like KVM or Xen. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio_blk: implement naming for vda-vdz,vdaa-vdzz,vdaaa-vdzzzChristian Borntraeger
Am Freitag, 1. Februar 2008 schrieb Christian Borntraeger: > Right. I will fix that with an additional patch. This patch goes on top of the minor number patch. Please let me know if you want a merged patch: Currently virtio_blk creates the disk name combinging "vd" with 'a'++. This will give strange names after vdz. I have implemented names up to vdzzz - inspired by the sd.c code. That should be sufficient for now. There is one driver in the kernel (driver/s390/block/dasd_genhd.c) that implements names from dasda-dasdzzzz allowing even more disks. Maybe a janitor can come up with a common implementation usable for all kind of block device drivers. I have tested this patch with 100 disks - seems to work. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio_blk: Dont waste major numbersChristian Borntraeger
Rusty, currently virtio_blk uses one major number per device. While this works quite well on most systems it is wasteful and will exhaust major numbers on larger installations. This patch allocates a major number on init and will use 16 minor numbers for each disk. That will allow ~64k virtio_blk disks. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio_blk: provide getgeoChristian Borntraeger
Rusty, I currently try to make my guest boot from an virtio root device without having an external kernel. Some of the tools that I tried expect HDIO_GETGEO to work. The most interesting value is likely the geo.start value to get the offset of a partition. This value is filled by block/ioctl.c if fops->getgeo is set. This patch also fills in some standard values for heads, sectors and cylinders. Makes sense? Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio_net: parametrize the napi_weight for virtio receive queue.Dor Laor
It is done in order to improve performance. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: free transmit skbs when notified, not on next xmit.Rusty Russell
This fixes a potential dangling xmit problem. We also suppress refill interrupts until we need them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: flush buffers on openRusty Russell
Fix bug found by Christian Borntraeger: if the other side fills all the registered network buffers before we enable NAPI, we will never get an interrupt. The simplest fix is to process the input queue once on open. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtnet: remove double ether_setupChristian Borntraeger
Hello Rusty, virtnet_probe already calls alloc_etherdev, which calls ether_setup. There is no need to do that again. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Allow virtio to be modular and used by modulesRusty Russell
This is needed for the virtio PCI device to be compiled as a module. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Use the sg_phys convenience function.Rusty Russell
Simple cleanup. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Put the virtio under the virtualization menuAnthony Liguori
This patch moves virtio under the virtualization menu and changes virtio devices to not claim to only be for lguest. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: handle interrupts after callbacks turned offRusty Russell
Anthony Liguori found double interrupt suppression in the virtio_net driver, triggered by two skb_recv_done's in a row. This is because virtio_ring's interrupt suppression is a best-effort optimization: it contains no synchronization so the host can miss it and still send interrupts. But it's certainly nicer for virtio users if calling disable_cb actually disables callbacks, so we check for the race in the interrupt routine. Note: SMP guests might require syncronization here, but since disable_cb is actually called from interrupt context, there has to be some form of synchronization before the next same interrupt handler is called (Linux guarantees that the same device's irq handler will never run simultanously on multiple CPUs). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: reset functionRusty Russell
A reset function solves three problems: 1) It allows us to renegotiate features, eg. if we want to upgrade a guest driver without rebooting the guest. 2) It gives us a clean way of shutting down virtqueues: after a reset, we know that the buffers won't be used by the host, and 3) It helps the guest recover from messed-up drivers. So we remove the ->shutdown hook, and the only way we now remove feature bits is via reset. We leave it to the driver to do the reset before it deletes queues: the balloon driver, for example, needs to chat to the host in its remove function. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: populate network rings in the probe routine, not openRusty Russell
Since we want to reset the device to remove them, this is simpler (device is reset for us on driver remove). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Tweak virtio_net definesRusty Russell
1) Turn GSO on virtio net into an all-or-nothing (keep checksumming separate). Having multiple bits is a pain: if you can't support something you should handle it in software, which is still a performance win. 2) Make VIRTIO_NET_HDR_GSO_ECN a flag in the header, so it can apply to IPv6 or v4. 3) Rename VIRTIO_NET_F_NO_CSUM to VIRTIO_NET_F_CSUM (ie. means we do checksumming). 4) Add csum and gso params to virtio_net to allow more testing. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Net header needs hdr_lenRusty Russell
It's far easier to deal with packets if we don't have to parse the packet to figure out the header length to know how much to pull into the skb data. Add the field to the virtio_net_hdr struct (and fix the spaces that somehow crept in there). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: remove unused id field from struct virtio_blk_outhdrRusty Russell
This field has been unused since an older version of virtio. Remove it now before we freeze the ABI. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au.
2008-02-04virtio: clarify NO_NOTIFY flag usageRusty Russell
The other side (host) can set the NO_NOTIFY flag as an optimization, to say "no need to kick me when you add things". Make it clear that this is advisory only; especially that we should always notify when the ring is full. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Fix vring_init/vring_size to take unsigned longAnthony Liguori
Using unsigned int resulted in silent truncation of the upper 32-bit on x86_64 resulting in an OOPS since the ring was being initialized wrong. Please reconsider my previous patch to just use PAGE_ALIGN(). Open coding this sort of stuff, no matter how simple it seems, is just asking for this sort of trouble. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: configuration change callbackRusty Russell
Various drivers want to know when their configuration information changes: the balloon driver is the immediate user, but the network driver may one day have a "carrier" status as well. This introduces that callback (lguest doesn't use it yet). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: explicit enable_cb/disable_cb rather than callback return.Rusty Russell
It seems that virtio_net wants to disable callbacks (interrupts) before calling netif_rx_schedule(), so we can't use the return value to do so. Rename "restart" to "cb_enable" and introduce "cb_disable" hook: callback now returns void, rather than a boolean. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: simplify config mechanism.Rusty Russell
Previously we used a type/len pair within the config space, but this seems overkill. We now simply define a structure which represents the layout in the config space: the config space can now only be extended at the end. The main driver-visible changes: 1) We indicate what fields are present with an explicit feature bit. 2) Virtqueues are explicitly numbered, and not in the config space. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-02-04virtio: Implement skb_partial_csum_set, for setting partial csums on ↵Rusty Russell
untrusted packets. Use it in virtio_net (replacing buggy version there), it's also going to be used by TAP for partial csum support. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net>
2008-02-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (95 commits) ide-tape: remove idetape_config_t typedef ide-tape: remove mtio.h related comments ide-tape: make function name more accurate ide-tape: remove unused sense packet commands. ide-tape: use generic byteorder macros ide-tape: remove EXPERIMENTAL driver status ide-tape: use generic scsi commands ide-tape: remove struct idetape_block_size_page_t ide-tape: remove structs os_partition_t, os_dat_entry_t, os_dat_t ide-tape: remove struct idetape_parameter_block_descriptor_t ide-tape: remove struct idetape_medium_partition_page_t ide-tape: remove struct idetape_data_compression_page_t ide-tape: remove struct idetape_inquiry_result_t ide-tape: remove struct idetape_capabilities_page_t ide-tape: remove IDETAPE_DEBUG_BUGS ide-tape: remove IDETAPE_DEBUG_INFO ide-tape: dump gcw fields on error in idetape_identify_device() ide-tape: remove struct idetape_mode_parameter_header_t ide-tape: remove struct idetape_request_sense_result_t ide-tape: remove dead code ...
2008-02-03fix writev regression: pan hanging unkillable and un-straceableNick Piggin
Frederik Himpe reported an unkillable and un-straceable pan process. Zero length iovecs can go into an infinite loop in writev, because the iovec iterator does not always advance over them. The sequence required to trigger this is not trivial. I think it requires that a zero-length iovec be followed by a non-zero-length iovec which causes a pagefault in the atomic usercopy. This causes the writev code to drop back into single-segment copy mode, which then tries to copy the 0 bytes of the zero-length iovec; a zero length copy looks like a failure though, so it loops. Put a test into iov_iter_advance to catch zero-length iovecs. We could just put the test in the fallback path, but I feel it is more robust to skip over zero-length iovecs throughout the code (iovec iterator may be used in filesystems too, so it should be robust). Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-03Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: ieee1394: sbp2: fix bogus s/g access change
2008-02-02ide-tape: remove idetape_config_t typedefBorislav Petkov
Since this is used only in idetape_blkdev_ioctl(), remove the typedef and make the struct function-local. Bart: - s/sizeof(struct idetape_config)/sizeof(config)/ Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove mtio.h related commentsBorislav Petkov
Those are already in mtio.h. Bart: - undo 'unsigned int/unsigned long' -> 'uint/ulong' conversion Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: make function name more accurateBorislav Petkov
idetape_active_next_stage() was rather ambiguous wrt its purpose. Make that more explicit and remove superfluous comment. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove unused sense packet commands.Borislav Petkov
Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: use generic byteorder macrosBorislav Petkov
This is not a network driver. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove EXPERIMENTAL driver statusBorislav Petkov
ide-tape has depended on EXPERIMENTAL for ages. Change that since the driver is being only maintained now. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: use generic scsi commandsBorislav Petkov
Also, remove those which weren't used. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_block_size_page_tBorislav Petkov
Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove structs os_partition_t, os_dat_entry_t, os_dat_tBorislav Petkov
They seem just to sit there completely unused. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_parameter_block_descriptor_tBorislav Petkov
Also, shorten function name idetape_get_blocksize_from_block_descriptor() and move its definition up thereby getting rid of its forward declaration. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_medium_partition_page_tBorislav Petkov
There should be no functional changes resulting from this patch. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_data_compression_page_tBorislav Petkov
There should be no functional changes resulting from this patch. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_inquiry_result_tBorislav Petkov
There should be no functional changes resulting from this patch. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_capabilities_page_tBorislav Petkov
All those 2-byte values denoting the different capabilities are being written to the local copy of the caps buffer without being converted to big endian for simplicity of usage and shorter code later. Also, we add some comments stating which are the fields of the caps page in question in order to alleviate the cryptic pointer casting exercises as in e.g. idetape_get_mode_sense_results(). There should be no functional changes resulting from this patch. Bart: - remove two needless "!!" Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove IDETAPE_DEBUG_BUGSBorislav Petkov
Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove IDETAPE_DEBUG_INFOBorislav Petkov
The device capabilities are probed for during device initialization so this info is available through proc/ioctl() und it is redundant here. Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: dump gcw fields on error in idetape_identify_device()Bartlomiej Zolnierkiewicz
Cc: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_mode_parameter_header_tBorislav Petkov
Bart: - remove 'capabilities->speed' chunk - re-add brackets to block_descrp assignment Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove struct idetape_request_sense_result_tBorislav Petkov
Bart: - remove unnecessary comment change - remove two needless "!!" Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: remove dead codeBorislav Petkov
Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-tape: move historical changelog to ↵Borislav Petkov
Documentation/ide/ChangeLog.ide-tape.1995-2002 Also, cleanup whitespace and update comments. Bart: - remove reference to drivers/block/ide.c - move driver documentation to Documentation/ide/ide-tape.txt Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-02-02ide-cs: use ide_std_init_ports()Bartlomiej Zolnierkiewicz
No reason to use ide_init_hwif_ports() in ide-cs (as a nice side-effect this makes ide-cs work on archs that don't define IDE_ARCH_OBSOLETE_INIT). Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>