From 86cf898e1d0fca245173980e3897580db38569a8 Mon Sep 17 00:00:00 2001
From: David Woodhouse <David.Woodhouse@intel.com>
Date: Mon, 9 Nov 2009 22:15:15 +0000
Subject: intel-iommu: Check for 'DMAR at zero' BIOS error earlier.

Chris Wright has some patches which let us fall back to swiotlb nicely
if IOMMU initialisation fails. But those are a bit much for 2.6.32.

Instead, let's shift the check for the biggest problem, the HP and Acer
BIOS bug which reports a DMAR at physical address zero. That one can
actually be checked much earlier -- before we even admit to having
detected an IOMMU in the first place. So the swiotlb init goes ahead as
we want.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
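
The early check described above can be modelled in userspace C. This is only a
sketch under stated assumptions: every `sketch_` name and the simplified entry
layout are hypothetical stand-ins, not the real ACPI DMAR structures, and the
bounds check on a truncated hardware-unit entry is an addition that the patch
itself does not make.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the ACPI DMAR entry types. */
enum { SKETCH_TYPE_HARDWARE_UNIT = 0, SKETCH_TYPE_OTHER = 1 };

struct sketch_entry_header {
	uint16_t type;
	uint16_t length;	/* total entry length, header included */
};

struct sketch_drhd {
	struct sketch_entry_header header;
	uint32_t reserved;
	uint64_t address;	/* register base reported by the BIOS */
};

/*
 * Walk the variable-length entries in table[0..table_len).
 * Returns 0 if an entry has zero length or the first hardware unit
 * reports address zero; returns 1 otherwise (including "no hardware
 * unit found"), mirroring the return values used in the patch.
 */
static int sketch_check_zero_address(const unsigned char *table,
				     size_t table_len)
{
	size_t off = 0;

	assert(table != NULL);

	while (off + sizeof(struct sketch_entry_header) <= table_len) {
		struct sketch_entry_header hdr;

		memcpy(&hdr, table + off, sizeof(hdr));

		/* Avoid looping forever on a zero-length entry. */
		if (hdr.length == 0)
			return 0;

		if (hdr.type == SKETCH_TYPE_HARDWARE_UNIT) {
			struct sketch_drhd drhd;

			/* Assumption: treat a truncated entry as broken. */
			if (off + sizeof(drhd) > table_len)
				return 0;
			memcpy(&drhd, table + off, sizeof(drhd));
			/* Only the first hardware unit is inspected. */
			return drhd.address != 0;
		}

		off += hdr.length;
	}
	return 1;
}
```

A caller would treat a return value of 0 as "no usable IOMMU detected", which
is what lets the swiotlb initialisation proceed.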
 drivers/pci/dmar.c | 49 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index 22b02c6df85..e5f8fc164fd 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -175,15 +175,6 @@ dmar_parse_one_drhd(struct acpi_dmar_header *header)
 	int ret = 0;
 
 	drhd = (struct acpi_dmar_hardware_unit *)header;
-	if (!drhd->address) {
-		/* Promote an attitude of violence to a BIOS engineer today */
-		WARN(1, "Your BIOS is broken; DMAR reported at address zero!\n"
-		     "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
-		     dmi_get_system_info(DMI_BIOS_VENDOR),
-		     dmi_get_system_info(DMI_BIOS_VERSION),
-		     dmi_get_system_info(DMI_PRODUCT_VERSION));
-		return -ENODEV;
-	}
 	dmaru = kzalloc(sizeof(*dmaru), GFP_KERNEL);
 	if (!dmaru)
 		return -ENOMEM;
@@ -591,12 +582,50 @@ int __init dmar_table_init(void)
 	return 0;
 }
 
+int __init check_zero_address(void)
+{
+	struct acpi_table_dmar *dmar;
+	struct acpi_dmar_header *entry_header;
+	struct acpi_dmar_hardware_unit *drhd;
+
+	dmar = (struct acpi_table_dmar *)dmar_tbl;
+	entry_header = (struct acpi_dmar_header *)(dmar + 1);
+
+	while (((unsigned long)entry_header) <
+			(((unsigned long)dmar) + dmar_tbl->length)) {
+		/* Avoid looping forever on bad ACPI tables */
+		if (entry_header->length == 0) {
+			printk(KERN_WARNING PREFIX
+				"Invalid 0-length structure\n");
+			return 0;
+		}
+
+		if (entry_header->type == ACPI_DMAR_TYPE_HARDWARE_UNIT) {
+			drhd = (void *)entry_header;
+			if (!drhd->address) {
+				/* Promote an attitude of violence to a BIOS engineer today */
+				WARN(1, "Your BIOS is broken; DMAR reported at address zero!\n"
+				     "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
+				     dmi_get_system_info(DMI_BIOS_VENDOR),
+				     dmi_get_system_info(DMI_BIOS_VERSION),
+				     dmi_get_system_info(DMI_PRODUCT_VERSION));
+				return 0;
+			}
+			break;
+		}
+
+		entry_header = ((void *)entry_header + entry_header->length);
+	}
+	return 1;
+}
+
 void __init detect_intel_iommu(void)
 {
 	int ret;
 
 	ret = dmar_table_detect();
-
+	if (ret)
+		ret = check_zero_address();
 	{
 #ifdef CONFIG_INTR_REMAP
 		struct acpi_table_dmar *dmar;
-- 
cgit v1.2.3


From e8bb910d1bbc65e7081e73aab4b3a3dd8630332c Mon Sep 17 00:00:00 2001
From: Alex Williamson <alex.williamson@hp.com>
Date: Wed, 4 Nov 2009 15:59:34 -0700
Subject: intel-iommu: Obey coherent_dma_mask for alloc_coherent on passthrough

The model for IOMMU passthrough is that decent devices that can cope
with DMA to all of memory get passthrough; crappy devices with a limited
dma_mask don't -- they get to use the IOMMU anyway.

This is done on the basis that IOMMU passthrough is usually wanted for
performance reasons, and it's only the decent PCI devices that you
really care about performance for, while the crappy 32-bit ones like
your USB controller can just use the IOMMU and you won't really care.

Unfortunately, the check for this was only looking at dev->dma_mask, not
at dev->coherent_dma_mask. And some devices have a 32-bit
coherent_dma_mask even though they have a full 64-bit dma_mask.

Even more unfortunately, fixing that simple oversight would upset
certain broken HP devices. Not only do they have a 32-bit
coherent_dma_mask, but they also have a tendency to do stray DMA to
unmapped addresses. And then they die when they take the DMA fault they
so richly deserve.

So if we do the 'correct' fix, it'll mean that affected users have to
disable IOMMU support completely on "a large percentage of servers from
a major vendor."

Personally, I have little sympathy -- given that this is the _same_
'major vendor' who is shipping machines which claim to have IOMMU
support but have obviously never _once_ booted a VT-d capable OS to do
any form of QA. But strictly speaking, it _would_ be a regression even
though it only ever worked by fluke.

For 2.6.33, we'll come up with a quirk which gives swiotlb support
for this particular device, and other devices with an inadequate
coherent_dma_mask will just get normal IOMMU mapping.

The simplest fix for 2.6.32, though, is just to jump through some hoops
to try to allocate coherent DMA memory for such devices in a place that
they can reach. We'd use dma_generic_alloc_coherent() for this if it
existed on IA64.

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
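
The flag selection this patch introduces can be sketched as a pure function in
userspace C. The `SKETCH_` constants and `sketch_coherent_gfp()` are
hypothetical names with made-up bit values, chosen only to make the decision
table testable; `no_mapping` stands in for the result of `iommu_no_mapping()`
and `required_mask` for the result of `dma_get_required_mask()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the real GFP_* values differ. */
#define SKETCH_GFP_DMA		0x1u
#define SKETCH_GFP_DMA32	0x4u
#define SKETCH_BIT_MASK(n) \
	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Decide which zone flags to use for a coherent allocation.
 *
 * - Device goes through the IOMMU: any page will do, so strip the
 *   zone restrictions.
 * - Device is in passthrough and its coherent mask cannot cover all
 *   of memory: restrict the allocation to a zone it can reach.
 * - Otherwise: leave the caller's flags untouched.
 */
static unsigned int sketch_coherent_gfp(unsigned int flags, int no_mapping,
					uint64_t coherent_mask,
					uint64_t required_mask)
{
	/* Sketch-only sanity check: some address bits are always needed. */
	assert(required_mask != 0);

	if (!no_mapping)
		return flags & ~(SKETCH_GFP_DMA | SKETCH_GFP_DMA32);

	if (coherent_mask < required_mask) {
		if (coherent_mask < SKETCH_BIT_MASK(32))
			flags |= SKETCH_GFP_DMA;
		else
			flags |= SKETCH_GFP_DMA32;
	}
	return flags;
}
```

For example, a passthrough device with a 32-bit coherent mask on a machine
that needs 36 address bits gets the DMA32 flag added, while the same device
behind the IOMMU has both zone flags cleared.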
 drivers/pci/intel-iommu.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index b1e97e68250..7fe5f7920ca 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -2767,7 +2767,15 @@ static void *intel_alloc_coherent(struct device *hwdev, size_t size,
 
 	size = PAGE_ALIGN(size);
 	order = get_order(size);
-	flags &= ~(GFP_DMA | GFP_DMA32);
+
+	if (!iommu_no_mapping(hwdev))
+		flags &= ~(GFP_DMA | GFP_DMA32);
+	else if (hwdev->coherent_dma_mask < dma_get_required_mask(hwdev)) {
+		if (hwdev->coherent_dma_mask < DMA_BIT_MASK(32))
+			flags |= GFP_DMA;
+		else
+			flags |= GFP_DMA32;
+	}
 
 	vaddr = (void *)__get_free_pages(flags, order);
 	if (!vaddr)
-- 
cgit v1.2.3


From 99dcadede42f8898d4c963ef69192ef4b9b76ba8 Mon Sep 17 00:00:00 2001
From: Fenghua Yu <fenghua.yu@intel.com>
Date: Wed, 11 Nov 2009 07:23:06 -0800
Subject: intel-iommu: Support PCIe hot-plug

To support PCIe hot-plug in the IOMMU driver, we register a bus notifier that
responds to device state changes.

When the notifier gets BUS_NOTIFY_UNBOUND_DRIVER, it removes the device
from its DMAR domain.

A hot-added device is attached to an IOMMU domain the first time it performs
an IOMMU operation, so hot-add needs no extra code.

Without the patch, after a hot-remove, a hot-added device in the same
slot will not work.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
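
The lifecycle described above (detach on unbind, lazy attach on first mapping)
can be modelled in userspace C. This is a sketch only: all `sketch_` types and
functions are hypothetical and merely mimic the roles of the notifier callback
and the first-use attach path; they are not the kernel's notifier API.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_action { SKETCH_BOUND_DRIVER, SKETCH_UNBOUND_DRIVER };

struct sketch_domain {
	int device_count;
};

struct sketch_device {
	struct sketch_domain *domain;	/* NULL until the first mapping */
};

/* Stand-in for the global passthrough switch. */
static int sketch_pass_through;

/* Lazy attach: a device joins a domain the first time it maps memory. */
static void sketch_map(struct sketch_device *dev, struct sketch_domain *dom)
{
	assert(dev != NULL && dom != NULL);

	if (!dev->domain) {
		dev->domain = dom;
		dom->device_count++;
	}
}

/* Notifier callback: only the unbind action changes any state. */
static int sketch_device_notifier(enum sketch_action action,
				  struct sketch_device *dev)
{
	assert(dev != NULL);

	if (!dev->domain)
		return 0;

	if (action == SKETCH_UNBOUND_DRIVER && !sketch_pass_through) {
		dev->domain->device_count--;
		dev->domain = NULL;
	}
	return 0;
}
```

Because the unbind action clears the device's domain pointer, a device later
added in the same slot starts from a clean state and is attached again on its
first mapping.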
 drivers/pci/intel-iommu.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 7fe5f7920ca..1840a0578a4 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -3215,6 +3215,33 @@ static int __init init_iommu_sysfs(void)
 }
 #endif	/* CONFIG_PM */
 
+/*
+ * Here we only respond to a device being unbound from its driver.
+ *
+ * A newly added device is not attached to its DMAR domain here yet. That will
+ * happen when the device is first mapped to an iova.
+ */
+static int device_notifier(struct notifier_block *nb,
+				  unsigned long action, void *data)
+{
+	struct device *dev = data;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct dmar_domain *domain;
+
+	domain = find_domain(pdev);
+	if (!domain)
+		return 0;
+
+	if (action == BUS_NOTIFY_UNBOUND_DRIVER && !iommu_pass_through)
+		domain_remove_one_dev_info(domain, pdev);
+
+	return 0;
+}
+
+static struct notifier_block device_nb = {
+	.notifier_call = device_notifier,
+};
+
 int __init intel_iommu_init(void)
 {
 	int ret = 0;
@@ -3267,6 +3294,8 @@ int __init intel_iommu_init(void)
 
 	register_iommu(&intel_iommu_ops);
 
+	bus_register_notifier(&pci_bus_type, &device_nb);
+
 	return 0;
 }
 
-- 
cgit v1.2.3