Added the pci_get_domain_and_slot_function which is analogous to
pci_get_bus_and_slot. It returns a pci_dev given a domain (segment) number,
bus number, and devnr. Like pci_get_bus_and_slot,
pci_get_domain_bus_and_slot holds a reference to the returned pci_dev.
Converted pci_get_bus_and_slot to a wrapper that calls
pci_get_domain_bus_and_slot with the domain hard-coded to 0.
This routine was patterned off code suggested by Bjorn Helgaas.
Acked-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Change to populate the subsystem vendor and subsytem device IDs for
PCI-PCI bridges that implement the PCI Subsystem Vendor ID capability.
Previously bridges left subsystem vendor IDs unpopulated.
Signed-off-by: Gabe Black <gabe.black@ni.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Feedback from Hidetoshi Seto and Kenji Kaneshige incorporated. This
correctly handles PCI-X bridges, PCIe root ports and endpoints, and
prints debug messages when invalid/reserved types are found in the
HEST. PCI devices not in domain/segment 0 are not represented in
HEST, thus will be ignored.
Today, the PCIe Advanced Error Reporting (AER) driver attaches itself
to every PCIe root port for which BIOS reports it should, via ACPI
_OSC.
However, _OSC alone is insufficient for newer BIOSes. Part of ACPI
4.0 is the new APEI (ACPI Platform Error Interfaces) which is a way
for OS and BIOS to handshake over which errors for which components
each will handle. One table in ACPI 4.0 is the Hardware Error Source
Table (HEST), where BIOS can define that errors for certain PCIe
devices (or all devices), should be handled by BIOS ("Firmware First
mode"), rather than be handled by the OS.
Dell PowerEdge 11G server BIOS defines Firmware First mode in HEST, so
that it may manage such errors, log them to the System Event Log, and
possibly take other actions. The aer driver should honor this, and
not attach itself to devices noted as such.
Furthermore, Kenji Kaneshige reminded us to disallow changing the AER
registers when respecting Firmware First mode. Platform firmware is
expected to manage these, and if changes to them are allowed, it could
break that firmware's behavior.
The HEST parsing code may be replaced in the future by a more
feature-rich implementation. This patch provides the minimum needed
to prevent breakage until that implementation is available.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This cleanup patch puts struct/union/enum opening braces,
in first line to ease grep games.
struct something
{
becomes :
struct something {
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All architectures in the kernel increment/decrement the stack pointer
before storing values on the stack.
On architectures which have the stack grow down sas_ss_sp == sp is not
on the alternate signal stack while sas_ss_sp + sas_ss_size == sp is
on the alternate signal stack.
On architectures which have the stack grow up sas_ss_sp == sp is on
the alternate signal stack while sas_ss_sp + sas_ss_size == sp is not
on the alternate signal stack.
The current implementation fails for architectures which have the
stack grow down on the corner case where sas_ss_sp == sp.This was
reported as Debian bug #544905 on AMD64.
Simplified test case: http://download.breakpoint.cc/tc-sig-stack.c
The test case creates the following stack scenario:
0xn0300 stack top
0xn0200 alt stack pointer top (when switching to alt stack)
0xn01ff alt stack end
0xn0100 alt stack start == stack pointer
If the signal is sent the stack pointer is pointing to the base
address of the alt stack and the kernel erroneously decides that it
has already switched to the alternate stack because of the current
check for "sp - sas_ss_sp < sas_ss_size"
On parisc (stack grows up) the scenario would be:
0xn0200 stack pointer
0xn01ff alt stack end
0xn0100 alt stack start = alt stack pointer base
(when switching to alt stack)
0xn0000 stack base
This is handled correctly by the current implementation.
[ tglx: Modified for archs which have the stack grow up (parisc) which
would fail with the correct implementation for stack grows
down. Added a check for sp >= current->sas_ss_sp which is
strictly not necessary but makes the code symetric for both
variants ]
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: stable@kernel.org
LKML-Reference: <20091025143758.GA6653@Chamillionaire.breakpoint.cc>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Note: dom0 checking in v4 has been separated out into 2/2.
This patch enables P2P upstream forwarding in ACS capable PCIe switches.
It solves two potential problems in virtualization environment where a PCIe
device is assigned to a guest domain using a HW iommu such as VT-d:
1) Unintentional failure caused by guest physical address programmed
into the device's DMA that happens to match the memory address range
of other downstream ports in the same PCIe switch. This causes the PCI
transaction to go to the matching downstream port instead of go to the
root complex to get translated by VT-d as it should be.
2) Malicious guest software intentionally attacks another downstream
PCIe device by programming the DMA address into the assigned device
that matches memory address range of the downstream PCIe port.
We are in process of implementing device filtering software in KVM/XEN
management software to allow device assignment of PCIe devices behind a PCIe
switch only if it has ACS capability and with the P2P upstream forwarding bits
enabled. This patch is intended to work for both KVM and Xen environments.
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Reviewed-by: Mathew Wilcox <willy@linux.intel.com>
Reviewed-by: Chris Wright <chris@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For non hotplug PCI devices, the system firmware usually configures
CLS correctly. For pccard devices system firmware can't do it and
Linux PCI layer doesn't do it either. Unfortunately this leads to
poor performance for certain devices (sata_sil). Unless MWI, which
requires separate configuration, is to be used, CLS doesn't affect
correctness, so the configuration should be harmless.
This patch makes pci_set_cacheline_size() always built and export it
and make pccard call it during attach.
Please note that some other PCI hotplug drivers (shpchp and pciehp)
also configure CLS on hotplug.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Daniel Ritz <daniel.ritz@gmx.ch>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg KH <greg@kroah.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: Axel Birndt <towerlexa@gmx.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Till now, CLS has been determined either by arch code or as
L1_CACHE_BYTES. Only x86 and ia64 set CLS explicitly and x86 doesn't
always get it right. On most configurations, the chance is that
firmware configures the correct value during boot.
This patch makes pci_init() determine CLS by looking at what firmware
has configured. It scans all devices and if all non-zero values
agree, the value is used. If none is configured or there is a
disagreement, pci_dfl_cache_line_size is used. arch can set the dfl
value (via PCI_CACHE_LINE_BYTES or pci_dfl_cache_line_size) or
override the actual one.
ia64, x86 and sparc64 updated to set the default cls instead of the
actual one.
While at it, declare pci_cache_line_size and pci_dfl_cache_line_size
in pci.h and drop private declarations from arch code.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Miller <davem@davemloft.net>
Acked-by: Greg KH <gregkh@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch adds support for the APBUART serial port from Aeroflex
Gaisler's IP library GRLIB. It is currently used in all LEON3 designs
(SPARC V8) but can be used on other platforms as well (which support OF).
Signed-off-by: Kristoffer Glembo <kristoffer@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds RCU management to the list of netdevices.
Convert some for_each_netdev() users to RCU version, if
it can avoid read_lock-ing dev_base_lock
Ie:
read_lock(&dev_base_loack);
for_each_netdev(net, dev)
some_action();
read_unlock(&dev_base_lock);
becomes :
rcu_read_lock();
for_each_netdev_rcu(net, dev)
some_action();
rcu_read_unlock();
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 74296a8ed added this function for debug purposes, but it was
never used for anything. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Currently partition_sched_domains() takes a 'struct cpumask
*doms_new' which is a kmalloc'ed array of cpumask_t. You can't
have such an array if 'struct cpumask' is undefined, as we plan
for CONFIG_CPUMASK_OFFSTACK=y.
So, we make this an array of cpumask_var_t instead: this is the
same for the CONFIG_CPUMASK_OFFSTACK=n case, but requires
multiple allocations for the CONFIG_CPUMASK_OFFSTACK=y case.
Hence we add alloc_sched_domains() and free_sched_domains()
functions.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <200911031453.40668.rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
tools/perf/Makefile
Merge reason: Resolve the conflict, merge to upstream and merge in
perf fixes so we can add a dependent patch.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are reasons for kernel code to ask for, and use, performance
counters.
For example, in CPU freq governors this tends to be a good idea, but
there are other examples possible as well of course.
This patch adds the needed bits to do enable this functionality; they
have been tested in an experimental cpufreq driver that I'm working on,
and the changes are all that I needed to access counters properly.
[fweisbec@gmail.com: added pid to perf_event_create_kernel_counter so
that we can profile a particular task too
TODO: Have a better error reporting, don't just return NULL in fail
case.]
v2: Remove the wrong comment about the fact
perf_event_create_kernel_counter must be called from a kernel
thread.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: "K.Prasad" <prasad@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jan Kiszka <jan.kiszka@web.de>
Cc: Avi Kivity <avi@redhat.com>
LKML-Reference: <20090925122556.2f8bd939@infradead.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
This patch provides basic hash rules programming via the ethtool
interface.
Signed-off-by: Sandeep Gopalpet <Sandeep.Kumar@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds code to disable the TXC and RXC reference clocks if link
is not available.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 5785 does not use the RXC reference clock. Turning it off is
desirable as it saves power.
By default, the 50610 enables the RXC reference clock and the 50610M
disables it. Presumably this is one of the reasons why the hardware
architect chose one over the other.
Adding a "rx reference clock disable" flag is not the ideal way to
describe the option, as it would force the MAC using a 50610M to set
the flag. Ideally we want the flags to represent opt-in behavior that
deviates from hardware defaults. Furthermore, the lack of a
"disable" flag implies that the requester wants the rx reference clock
enabled, which doesn't necessarily follow.
By presenting the option as a passive statement (rx reference clock
unused) rather than a command, I hope to convey an opt-in option to
disable the rx reference clock that falls back to hardware defaults if
not set. A secondary benefit of this is that it keeps the
intelligence about phy defaults in the broadcom module where it belongs
and allows the broadcom module more latitude should a bug arise.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Broadcom 50610M parts changed the default definitions of the RGMII mode
shadow register. The 5785 needs the RGMII mode selection bits [4:3]
cleared.
The default value of the remaining bits in this register are zero.
Rather than unnecessarily burn an extra bit in the dev_flags member in
an attempt to enumerate all possible combinations, this patch take a
more course grained approach and labels the option as "clear all bits".
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch moves all the dev_flags enumerations outside the broadcom.c
file to include/linux/brcmphy.h. The existing flags were not used yet
and have been re-enumerated to avoid conflicts.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The custom_data pointer in ff_periodict_effect structure is a
userspace pointer and should be marked as such.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Remove -Wcast-align
perf tools: Fix compatibility with libelf 0.8 and autodetect
perf events: Don't generate events for the idle task when exclude_idle is set
perf events: Fix swevent hrtimer sampling by keeping track of remaining time when enabling/disabling swevent hrtimers
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Remove cpu arg from the rb_time_stamp() function
tracing: Fix comment typo and documentation example
tracing: Fix trace_seq_printf() return value
tracing: Update *ppos instead of filp->f_pos
So remove both the comment and the inline requirement, going back to the
inline hint.
Signed-off-by: Alberto Bertogli <albertito@blitiri.com.ar>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Some workloads hit dev_base_lock rwlock pretty hard.
We can use RCU lookups to avoid touching this rwlock
(and avoid touching netdevice refcount)
netdevices are already freed after a RCU grace period, so this patch
adds no penalty at device dismantle time.
However, it adds a synchronize_rcu() call in dev_change_name()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Different CPUs will have different starting vectors, with varying
amounts of reserved or unusable vector space prior to the first slot.
This introduces a legacy vector reservation system that inserts itself in
between the CPU vector map registration and the platform specific IRQ
setup. This works fine in practice as the only new vectors that boards
need to establish on their own should be dynamically allocated rather
than arbitrarily assigned. As a plus, this also makes all of the
converted platforms sparseirq ready.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
I was told that there are obscure architectures with non-coherent DMA
which may DMA-map to bus address 0. We shall not use 0 as a magic
number of uninitialized bus address variables.
The packet->payload_length > 0 test cannot be used either (except in
at_context_queue_packet) because local requests are not DMA-mapped
regardless of payload_length. Hence add a state flag to struct
fw_packet.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
RDS currently supports a GET_MR sockopt to establish a
memory region (MR) for a chunk of memory. However, the fastreg
method ties a MR to a particular destination. The GET_MR_FOR_DEST
sockopt allows the remote machine to be specified, and thus
support for fastreg (aka FRWRs).
Note that this patch does *not* do all of this - it simply
implements the new sockopt in terms of the old one, so applications
can begin to use the new sockopt in preparation for cutover to
FRWRs.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Platform drivers registered via platform_driver_probe() can be bound
to devices only once, upon registration, because discard their probe()
routines to save memory. Unbinding the driver through sysfs 'unbind'
leaves the device stranded and confuses users so let's not create
bind and unbind attributes for such drivers.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Cc: Éric Piel <eric.piel@tremplin-utc.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We drop nullfunc frames, but not qos-nullfunc frames,
even though those could be used for PS state control
as well.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This isn't beautifully abstracted, but it is simple,
simplifies uses and so far is only needed for the bonding driver.
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.
Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
Reported-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct sk_buff kmemcheck annotations enlarged this structure by 8/16 bytes
Fix this by moving 'protocol' inside flags1 bitfield,
and queue_mapping inside flags2 bitfield.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will allow drivers to adjust their receive path dynamically
based on whether GRO is being applied successfully.
Currently all in-tree callers ignore the return values of these
functions and do not need to be changed.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hi James, would you mind taking the following into
security-testing?
The securebits are used by passing them to prctl with the
PR_{S,G}ET_SECUREBITS commands. But the defines must be
shifted to be used in prctl, which begs to be confused and
misused by userspace. So define some more convenient
values for userspace to specify. This way userspace does
prctl(PR_SET_SECUREBITS, SECBIT_NOROOT);
instead of
prctl(PR_SET_SECUREBITS, 1 << SECURE_NOROOT);
(Thanks to Michael for the idea)
This patch also adds include/linux/securebits to the installed headers.
Then perhaps it can be included by glibc's sys/prctl.h.
Changelog:
Oct 29: Stephen Rothwell points out that issecure can
be under __KERNEL__.
Oct 14: (Suggestions by Michael Kerrisk):
1. spell out SETUID in SECBIT_NO_SETUID*
2. SECBIT_X_LOCKED does not imply SECBIT_X
3. add definitions for keepcaps
Oct 14: As suggested by Michael Kerrisk, don't
use SB_* as that convention is already in
use. Use SECBIT_ prefix instead.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: James Morris <jmorris@namei.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-param-fixes:
param: fix setting arrays of bool
param: fix NULL comparison on oom
param: fix lots of bugs with writing charp params from sysfs, by leaking mem.
* 'hwpoison-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6:
HWPOISON: fix invalid page count in printk output
HWPOISON: Allow schedule_on_each_cpu() from keventd
HWPOISON: fix/proc/meminfo alignment
HWPOISON: fix oops on ksm pages
HWPOISON: Fix page count leak in hwpoison late kill in do_swap_page
HWPOISON: return early on non-LRU pages
HWPOISON: Add brief hwpoison description to Documentation
HWPOISON: Clean up PR_MCE_KILL interface