Make the uprobes code readable to me:
- improve the Kconfig text so that a mere mortal gets some idea
what CONFIG_UPROBES=y is really about
- do trivial renames to standardize around the uprobes_*() namespace
- clean up and simplify various code flow details
- separate basic blocks of functionality
- line break artifact and white space related removal
- use standard local varible definition blocks
- use vertical spacing to make things more readable
- remove unnecessary volatile
- restructure comment blocks to make them more uniform and
more readable in general
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Link: http://lkml.kernel.org/n/tip-ewbwhb8o6navvllsauu7k07p@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add uprobes support to the core kernel, with x86 support.
This commit adds the kernel facilities, the actual uprobes
user-space ABI and perf probe support comes in later commits.
General design:
Uprobes are maintained in an rb-tree indexed by inode and offset
(the offset here is from the start of the mapping). For a unique
(inode, offset) tuple, there can be at most one uprobe in the
rb-tree.
Since the (inode, offset) tuple identifies a unique uprobe, more
than one user may be interested in the same uprobe. This provides
the ability to connect multiple 'consumers' to the same uprobe.
Each consumer defines a handler and a filter (optional). The
'handler' is run every time the uprobe is hit, if it matches the
'filter' criteria.
The first consumer of a uprobe causes the breakpoint to be
inserted at the specified address and subsequent consumers are
appended to this list. On subsequent probes, the consumer gets
appended to the existing list of consumers. The breakpoint is
removed when the last consumer unregisters. For all other
unregisterations, the consumer is removed from the list of
consumers.
Given a inode, we get a list of the mms that have mapped the
inode. Do the actual registration if mm maps the page where a
probe needs to be inserted/removed.
We use a temporary list to walk through the vmas that map the
inode.
- The number of maps that map the inode, is not known before we
walk the rmap and keeps changing.
- extending vm_area_struct wasn't recommended, it's a
size-critical data structure.
- There can be more than one maps of the inode in the same mm.
We add callbacks to the mmap methods to keep an eye on text vmas
that are of interest to uprobes. When a vma of interest is mapped,
we insert the breakpoint at the right address.
Uprobe works by replacing the instruction at the address defined
by (inode, offset) with the arch specific breakpoint
instruction. We save a copy of the original instruction at the
uprobed address.
This is needed for:
a. executing the instruction out-of-line (xol).
b. instruction analysis for any subsequent fixups.
c. restoring the instruction back when the uprobe is unregistered.
We insert or delete a breakpoint instruction, and this
breakpoint instruction is assumed to be the smallest instruction
available on the platform. For fixed size instruction platforms
this is trivially true, for variable size instruction platforms
the breakpoint instruction is typically the smallest (often a
single byte).
Writing the instruction is done by COWing the page and changing
the instruction during the copy, this even though most platforms
allow atomic writes of the breakpoint instruction. This also
mirrors the behaviour of a ptrace() memory write to a PRIVATE
file map.
The core worker is derived from KSM's replace_page() logic.
In essence, similar to KSM:
a. allocate a new page and copy over contents of the page that
has the uprobed vaddr
b. modify the copy and insert the breakpoint at the required
address
c. switch the original page with the copy containing the
breakpoint
d. flush page tables.
replace_page() is being replicated here because of some minor
changes in the type of pages and also because Hugh Dickins had
plans to improve replace_page() for KSM specific work.
Instruction analysis on x86 is based on instruction decoder and
determines if an instruction can be probed and determines the
necessary fixups after singlestep. Instruction analysis is done
at probe insertion time so that we avoid having to repeat the
same analysis every time a probe is hit.
A lot of code here is due to the improvement/suggestions/inputs
from Peter Zijlstra.
Changelog:
(v10):
- Add code to clear REX.B prefix as suggested by Denys Vlasenko
and Masami Hiramatsu.
(v9):
- Use insn_offset_modrm as suggested by Masami Hiramatsu.
(v7):
Handle comments from Peter Zijlstra:
- Dont take reference to inode. (expect inode to uprobe_register to be sane).
- Use PTR_ERR to set the return value.
- No need to take reference to inode.
- use PTR_ERR to return error value.
- register and uprobe_unregister share code.
(v5):
- Modified del_consumer as per comments from Peter.
- Drop reference to inode before dropping reference to uprobe.
- Use i_size_read(inode) instead of inode->i_size.
- Ensure uprobe->consumers is NULL, before __uprobe_unregister() is called.
- Includes errno.h as recommended by Stephen Rothwell to fix a build issue
on sparc defconfig
- Remove restrictions while unregistering.
- Earlier code leaked inode references under some conditions while
registering/unregistering.
- Continue the vma-rmap walk even if the intermediate vma doesnt
meet the requirements.
- Validate the vma found by find_vma before inserting/removing the
breakpoint
- Call del_consumer under mutex_lock.
- Use hash locks.
- Handle mremap.
- Introduce find_least_offset_node() instead of close match logic in
find_uprobe
- Uprobes no more depends on MM_OWNER; No reference to task_structs
while inserting/removing a probe.
- Uses read_mapping_page instead of grab_cache_page so that the pages
have valid content.
- pass NULL to get_user_pages for the task parameter.
- call SetPageUptodate on the new page allocated in write_opcode.
- fix leaking a reference to the new page under certain conditions.
- Include Instruction Decoder if Uprobes gets defined.
- Remove const attributes for instruction prefix arrays.
- Uses mm_context to know if the application is 32 bit.
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Also-written-by: Jim Keniston <jkenisto@us.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Anton Arapov <anton@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linux-mm <linux-mm@kvack.org>
Link: http://lkml.kernel.org/r/20120209092642.GE16600@linux.vnet.ibm.com
[ Made various small edits to the commit log ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The SH7757 has 2 Fast Ethernet and 2 Gigabit Ethernet, and the first
Gigabit channel needs the initialization. So, this patch adds the
parameter of "needs_init", and if the sh_eth_plat_data is set it
to 1, the driver will initialize the channel.
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Our dprintk() debugging facility doesn't specify any verbosity level
for it's printk() calls, but it should.
The default verbosity for printk's is KERN_DEFAULT. You might argue
that these are debugging printk's and thus the verbosity should be
KERN_DEBUG. That would mean that to see NFS and SUNRPC debugging
output an admin would also have to boost the syslog verbosity, which
would be insufferably noisy.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
With static RPC slots, the xprt backlog queue stats were useful in showing
when the transport (TCP) was starved by lack of RPC slots. The new dynamic
RPC slot code, commit d9ba131d8f, always
provides an RPC slot and so only uses the xprt backlog queue when the
tcp_max_slot_table_entries value has been hit or when an allocation error
occurs. All requests are now placed on the xprt sending or pending queue which
need to be monitored for debugging.
The max_slot stat shows the maximum number of dynamic RPC slots reached which is
useful when debugging performance issues.
Add the new fields at the end of the mountstats xprt stanza so that mountstats
outputs the previous correct values and ignores the new fields. Bump
NFS_IOSTATS_VERS.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Make irq_domain_ops pointer a constant to make it safer for multiple
instances to share the same ops pointer and change the irq_domain code
so that it does not modify the ops.
v4: Fix mismatched type reference in powerpc code
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
irq_domain_add_simple() was a stop-gap measure until complete irq_domain
support was complete. This patch removes the irq_domain_add_simple()
interface.
This patch also drops the explicit irq_domain initialization performed
by the mach-versatile code because the versatile interrupt controller
already has irq_domain support built into it. This was a bug that was
hanging around quietly for a while, but with the full irq_domain which
actually verifies that irq_domain ranges are available it would cause
the registration to fail and the system wouldn't boot.
v4: Fixed number of irqs in mx5 gpio code
v2: Updated to pass in host_data pointer on irq_domain allocation.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Cc: Russell King <linux@arm.linux.org.uk>
Tested-by: Olof Johansson <olof@lixom.net>
This patch removes the simplistic implementation of irq_domains and enables
the powerpc infrastructure for all irq_domain users. The powerpc
infrastructure includes support for complex mappings between Linux and
hardware irq numbers, and can manage allocation of irq_descs.
This patch also converts the few users of irq_domain_add()/irq_domain_del()
to call irq_domain_add_legacy() instead.
v3: Fix bug that set up too many irqs in translation range.
v2: Fix removal of irq_alloc_descs() call in gic driver
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
As the title says, this patch adds empty implementations for the address
translation functions so that they can be used when CONFIG_OF is disabled.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Add support for a legacy mapping where irq = (hwirq - first_hwirq + first_irq)
so that a controller driver can allocate a fixed range of irq_descs and use
a simple calculation to translate back and forth between linux and hw irq
numbers. This is needed to use an irq_domain with many of the ARM interrupt
controller drivers that manage their own irq_desc allocations. Ultimately
the goal is to migrate those drivers to use the linear revmap, but doing it
this way allows each driver to be converted separately which makes the
migration path easier.
This patch generalizes the IRQ_DOMAIN_MAP_LEGACY method to use
(first_irq-first_hwirq) as the offset between hwirq and linux irq number,
and adds checks to make sure that the hwirq number does not exceed range
assigned to the controller.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
Each revmap type has different arguments for setting up the revmap.
This patch splits up the generator functions so that each revmap type
can do its own setup and the user doesn't need to keep track of how
each revmap type handles the arguments.
This patch also adds a host_data argument to the generators. There are
cases where the host_data pointer will be needed before the function returns.
ie. the legacy map calls the .map callback for each irq before returning.
v2: - Add void *host_data argument to irq_domain_add_*() functions
- fixed failure to compile
- Moved IRQ_DOMAIN_MAP_* defines into irqdomain.c
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
This patch only moves the code. It doesn't make any changes, and the
code is still only compiled for powerpc. Follow-on patches will generalize
the code for other architectures.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
Use standard ror64() instead of hand-written.
There is no standard ror64, so create it.
The difference is shift value being "unsigned int" instead of uint64_t
(for which there is no reason). gcc starts to emit native ROR instructions
which it doesn't do for some reason currently. This should make the code
faster.
Patch survives in-tree crypto test and ping flood with hmac(sha512) on.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
For a process to entirely disable Yama ptrace restrictions, it can use
the special PR_SET_PTRACER_ANY pid to indicate that any otherwise allowed
process may ptrace it. This is stronger than calling PR_SET_PTRACER with
pid "1" because it includes processes in external pid namespaces. This is
currently needed by the Chrome renderer, since its crash handler (Breakpad)
runs external to the renderer's pid namespace.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
While updating locking, b2efa05265 "block, cfq: unlink
cfq_io_context's immediately" moved elevator_exit_icq_fn() invocation
from exit_io_context() to the final ioc put. While this doesn't cause
catastrophic failure, it effectively removes task exit notification to
elevator and cause noticeable IO performance degradation with CFQ.
On task exit, CFQ used to immediately expire the slice if it was being
used by the exiting task as no more IO would be issued by the task;
however, after b2efa05265, the notification is lost and disk could sit
idle needlessly, leading to noticeable IO performance degradation for
certain workloads.
This patch renames ioc_exit_icq() to ioc_destroy_icq(), separates
elevator_exit_icq_fn() invocation into ioc_exit_icq() and invokes it
from exit_io_context(). ICQ_EXITED flag is added to avoid invoking
the callback more than once for the same icq.
Walking icq_list from ioc side and invoking elevator callback requires
reverse double locking. This may be better implemented using RCU;
unfortunately, using RCU isn't trivial. e.g. RCU protection would
need to cover request_queue and queue_lock switch on cleanup makes
grabbing queue_lock from RCU unsafe. Reverse double locking should
do, at least for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Shaohua Li <shli@kernel.org>
LKML-Reference: <CANejiEVzs=pUhQSTvUppkDcc2TNZyfohBRLygW5zFmXyk5A-xQ@mail.gmail.com>
Tested-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
icq->changed was used for ICQ_*_CHANGED bits. Rename it to flags and
access it under ioc->lock instead of using atomic bitops.
ioc_get_changed() is added so that the changed part can be fetched and
cleared as before.
icq->flags will be used to carry other flags.
Signed-off-by: Tejun Heo <tj@kernel.org>
Tested-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Lockd now managed in network namespace context. And this patch introduces
network namespace related NLM hosts shutdown in case of releasing per-net Lockd
resources.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This object depends on RPC client, and thus on network namespace.
So let's make it's allocation and lookup in network namespace context.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch introduces per-net Lockd initialization and destruction routines.
The logic is the same as in global Lockd up and down routines. Probably the
solution is not the best one. But at least it looks clear.
So per-net "up" routine are called only in case of lockd is running already. If
per-net resources are not allocated yet, then service is being registered with
local portmapper and lockd sockets created.
Per-net "down" routine is called on every lockd_down() call in case of global
users counter is not zero.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
v2: Added comment to BUG_ON's in svc_destroy() to make code looks clearer.
This patch introduces network namespace filter for service destruction
function.
Nothing special here - just do exactly the same operations, but only for
tranports in passed networks namespace context.
BTW, BUG_ON() checks for empty service transports lists were returned into
svc_destroy() function. This is because of swithing generic svc_close_all() to
networks namespace dependable svc_close_net().
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add the module parameter 'max_session_slots' to set the initial number
of slots that the NFSv4.1 client will attempt to negotiate with the
server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
It is perfectly legal to negotiate up to 2^32-1 slots in the protocol,
and with 10GigE, we are already seeing that 255 slots is far too limiting.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The bulk_write() supports the data transfer to multi
register which takes the data into cpu_endianness format
and does formatting of data to device format before
sending to device.
The transfer can be completed in single transfer or multiple
transfer based on data formatting.
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Part of the series to unify the irq remapping mechanisms in the
kernel. A follow up patch will copy the powerpc implementation into
kernel/irq/irqdomain.c, which will be a lot easier if the structures
are identical.
Where they differ, I've chose to use the powerpc names since there is
a lot more code using those names.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Cc: Rob Herring <rob.herring@calxeda.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Milton Miller <miltonm@bga.com>
Tested-by: Olof Johansson <olof@lixom.net>
<asm/posix_types.h> includes a set of macros that operate on file
descriptors. Way long ago those were exported to user space, but
nowadays they are #ifdef __KERNEL__.
However, they are nothing but standard (nonatomic) bit operations, and
we already have optimized versions of bit operations in the kernel.
We can't include <linux/bitops.h> in <asm/posix_types.h> but we can
move the definitions to <linux/time.h> and define them there in terms
of standard kernel bitops.
[ v2: folds the following fixes in:
a) Stray space in __FD_SET(), reported by Andrew Morton
b) #include <linux/string.h> needed for memset(), reported by Tony Luck ]
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1328677745-20121-22-git-send-email-hpa@zytor.com
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
USB 3.0 hubs don't have a port suspend change bit (that bit is now
reserved). Instead, when a host-initiated resume finishes, the hub sets
the port link state change bit.
When a USB 3.0 device initiates remote wakeup, the parent hubs with
their upstream links in U3 will pass the LFPS up the chain. The first
hub that has an upstream link in U0 (which may be the roothub) will
reflect that LFPS back down the path to the device.
However, the parent hubs in the resumed path will not set their link
state change bit. Instead, the device that initiated the resume has to
send an asynchronous "Function Wake" Device Notification up to the host
controller. Therefore, we need a way to notify the USB core of a device
resume without going through the normal hub URB completion method.
First, make the xHCI roothub act like an external USB 3.0 hub and not
pass up the port link state change bit when a device-initiated resume
finishes. Introduce a new xHCI bit field, port_remote_wakeup, so that
we can tell the difference between a port coming out of the U3Exit state
(host-initiated resume) and the RExit state (ending state of
device-initiated resume).
Since the USB core can't tell whether a port on a hub has resumed by
looking at the Hub Status buffer, we need to introduce a bitfield,
wakeup_bits, that indicates which ports have resumed. When the xHCI
driver notices a port finishing a device-initiated resume, we call into
a new USB core function, usb_wakeup_notification(), that will set
the right bit in wakeup_bits, and kick khubd for that hub.
We also call usb_wakeup_notification() when the Function Wake Device
Notification is received by the xHCI driver. This covers the case where
the link between the roothub and the first-tier hub is in U0, and the
hub reflects the resume signaling back to the device without giving any
indication it has done so until the device sends the Function Wake
notification.
Change the code in khubd that handles the remote wakeup to look at the
state the USB core thinks the device is in, and handle the remote wakeup
if the port's wakeup bit is set.
This patch only takes care of the case where the device is attached
directly to the roothub, or the USB 3.0 hub that is attached to the root
hub is the device sending the Function Wake Device Notification (e.g.
because a new USB device was attached). The other cases will be covered
in a second patch.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
USB 3.0 hubs have a different remote wakeup policy than USB 2.0 hubs.
USB 2.0 hubs, once they have remote wakeup enabled, will always send
remote wakes when anything changes on a port.
However, USB 3.0 hubs have a per-port remote wake up policy that is off
by default. The Set Feature remote wake mask can be changed for any
port, enabling remote wakeup for a connect, disconnect, or overcurrent
event, much like EHCI and xHCI host controller "wake on" port status
bits. The bits are cleared to zero on the initial hub power on, or
after the hub has been reset.
Without this patch, when a USB 3.0 hub gets suspended, it will not send
a remote wakeup on device connect or disconnect. This would show up to
the user as "dead ports" unless they ran lsusb -v (since newer versions
of lsusb use the sysfs files, rather than sending control transfers).
Change the hub driver's suspend method to enable remote wake up for
disconnect, connect, and overcurrent for all ports on the hub. Modify
the xHCI driver's roothub code to handle that request, and set the "wake
on" bits in the port status registers accordingly.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This allows us to move the definition of struct resource_list to
setup_bus.c and later convert resource_list to a regular list.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current rescan will not touch bridge MMIO and IO.
Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
resources, if child devices need more resources.
Only do that for bridges whose children are all removed already; i.e. don't
release resources that could already be in use by drivers on child devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.
The call chain when a user does "bind" looks as so:
echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind
and ends up calling:
driver_bind:
device_lock(dev); <=== TAKES LOCK
XXXX_probe:
.. pci_enable_device()
...__pci_reset_function(), which calls
pci_dev_reset(dev, 0):
if (!0) {
device_lock(dev) <==== DEADLOCK
The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch converts the underlying maintenance aspects of FW-assigned
BIOS BAR values from a statically allocated array within struct pci_dev
to a list of temporary, stand alone, entries.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 58c84eda07 introduced functionality to try and reinstate the
original BIOS BAR addresses of a PCI device when normal resource
assignment attempts fail. To keep track of the BIOS BAR addresses,
struct pci_dev was augmented with an array to hold the BAR addresses
of the PCI device: 'resource_size_t fw_addr[DEVICE_COUNT_RESOURCE]'.
The reinstatement of BAR addresses is an uncommon event leaving the
'fw_addr' array unused under normal circumstances. This functionality
is also currently architecture specific with an implementation limited
to x86. As the use of struct pci_dev is so prevalent, having the
'fw_addr' array residing within such seems somewhat wasteful.
This patch introduces a stand alone data structure and interfacing
routines for maintaining a list of FW-assigned BIOS BAR value entries.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
With this private ioctl it is possible to use the ivtv decoder without
requiring the dvb/video.h and dvb/audio.h headers.
Eventually support for those DVB APIs will be dropped from ivtv.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As discussed during the 2011 V4L-DVB workshop we want to create a proper V4L2
decoder API that replaces the DVBv5 API that has been used until now.
This adds the four controls necessary to be able to switch ivtv over to this
new API.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
As discussed during the 2011 V4L-DVB workshop, the API in dvb/video.h should
be replaced by a proper V4L2 API. This patch turns the VIDEO_(TRY_)DECODER_CMD
ioctls into proper V4L2 ioctls.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
In quota code we need to find a superblock corresponding to a device and wait
for superblock to be unfrozen. However this waiting has to happen without
s_umount semaphore because that is required for superblock to thaw. So provide
a function in VFS for this to keep dances with s_umount where they belong.
[AV: implementation switched to saner variant]
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Modified the mmc_poweroff to resume before sending the poweroff
notification command. In sleep mode only AWAKE and RESET commands are
allowed, so before sending the poweroff notification command resume from
sleep mode and then send the notification command.
PowerOff Notify is tested on a Synopsis Designware Host Controller
(eMMC 4.5). The suspend to RAM and resume works fine.
Signed-off-by: Girish K S <girish.shivananjappa@linaro.org>
Tested-by: Girish K S <girish.shivananjappa@linaro.org>
Reviewed-by: Saugata Das <saugata.das@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
There is an understood mismatch between the voltage the host controller is
set to and the voltage supplied to the card by a fixed voltage regulator.
Teaching the driver to accept the mismatch is overly complicated. Instead
just accept the regulator's voltage.
This patch adds MMC_CAP2_BROKEN_VOLTAGE.
If the voltage didn't satisfy between min_uV and max_uV, try to change
the voltage in core.c. When changing the voltage, maybe use
regulator_set_voltage().
In regulator_set_voltage(), check the below condition.
/* sanity check */
if (!rdev->desc->ops->set_voltage &&
!rdev->desc->ops->set_voltage_sel) {
ret = -EINVAL;
goto out;
}
If some board should use the fixed-regulator, always return -EINVAL.
Then, eMMC didn't initialize always.
So if use a fixed-regulator, we need to add the MMC_CAP2_BROKEN_VOLTAGE.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Ensure clocks are always enabled before any interaction with the
host controller driver. This makes sure that there is no race
between host execution and the core layer turning off clocks
in different context with clock gating framework.
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Per Forlin <per.forlin@stericsson.com>
Signed-off-by: Chris Ball <cjb@laptop.org>