Pull scheduler fix from Ingo Molnar:
"A delayacct statistics correctness fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
delayacct: Account blkio completion on the correct task
Pull x86 pti bits and fixes from Thomas Gleixner:
"This last update contains:
- An objtool fix to prevent a segfault with the gold linker by
changing the invocation order. That's not just for gold, it's a
general robustness improvement.
- An improved error message for objtool which spares tearing hairs.
- Make KASAN fail loudly if there is not enough memory instead of
oopsing at some random place later
- RSB fill on context switch to prevent RSB underflow and speculation
through other units.
- Make the retpoline/RSB functionality work reliably for both Intel
and AMD
- Add retpoline to the module version magic so mismatch can be
detected
- A small (non-fix) update for cpufeatures which prevents cpu feature
clashing for the upcoming extra mitigation bits to ease
backporting"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
module: Add retpoline tag to VERMAGIC
x86/cpufeature: Move processor tracing out of scattered features
objtool: Improve error message for bad file argument
objtool: Fix seg fault with gold linker
x86/retpoline: Add LFENCE to the retpoline/RSB filling RSB macros
x86/retpoline: Fill RSB on context switch for affected CPUs
x86/kasan: Panic if there is not enough memory to boot
Define the bit positions instead of macros using the magic values,
and move the expanded helpers to calculate the size and size unit into
the implementation C file.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
The latest merge between net and net-next introduced a complier assert in
mlx5 driver. In hca_cap_bits older fields are kept along with newer
fields that should have replaced them.
Fixes: c02b3741eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that no more drivers rely on arbitrary early initialisation via an
of_iommu_init_fn hook, let's clean up the redundant remnants. The
IOMMU_OF_DECLARE() macro needs to remain for now, as the probe-deferral
mechanism has no other nice way to detect built-in drivers before they
have registered themselves, such that it can make the right decision.
Reviewed-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If anyone is going to use dm_table_create(), they probably should be
able to use dm_table_destroy() too. Move the dm_table_destroy()
definition outside the private header, near dm_table_create()
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Add a check to verify IOMMU 5-level paging support. If the CPU supports
supports 5-level paging but the IOMMU does not support it then disable
SVM by not allocating PASID tables.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Add a check to verify IOMMU 1GB page support. If the CPU supports 1GB
pages but the IOMMU does not support it then disable SVM by not
allocating PASID tables.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
There doesn't seem to be any need to have the INIT_SIGNALS and INIT_SIGHAND
macros, so expand them in their single places of use and remove them.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Will Deacon <will.deacon@arm.com> (arm64)
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-01-17
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add initial BPF map offloading for nfp driver. Currently only
programs were supported so far w/o being able to access maps.
Offloaded programs are right now only allowed to perform map
lookups, and control path is responsible for populating the
maps. BPF core infrastructure along with nfp implementation is
provided, from Jakub.
2) Various follow-ups to Josef's BPF error injections. More
specifically that includes: properly check whether the error
injectable event is on function entry or not, remove the percpu
bpf_kprobe_override and rather compare instruction pointer
with original one, separate error-injection from kprobes since
it's not limited to it, add injectable error types in order to
specify what is the expected type of failure, and last but not
least also support the kernel's fault injection framework, all
from Masami.
3) Various misc improvements and cleanups to the libbpf Makefile.
That is, fix permissions when installing BPF header files, remove
unused variables and functions, and also install the libbpf.h
header, from Jesper.
4) When offloading to nfp JIT and the BPF insn is unsupported in the
JIT, then reject right at verification time. Also fix libbpf with
regards to ELF section name matching by properly treating the
program type as prefix. Both from Quentin.
5) Add -DPACKAGE to bpftool when including bfd.h for the disassembler.
This is needed, for example, when building libfd from source as
bpftool doesn't supply a config.h for bfd.h. Fix from Jiong.
6) xdp_convert_ctx_access() is simplified since it doesn't need to
set target size during verification, from Jesper.
7) Let bpftool properly recognize BPF_PROG_TYPE_CGROUP_DEVICE
program types, from Roman.
8) Various functions in BPF cpumap were not declared static, from Wei.
9) Fix a double semicolon in BPF samples, from Luis.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Two read past end of buffer fixes in AF_KEY, from Eric Biggers.
2) Memory leak in key_notify_policy(), from Steffen Klassert.
3) Fix overflow with bpf arrays, from Daniel Borkmann.
4) Fix RDMA regression with mlx5 due to mlx5 no longer using
pci_irq_get_affinity(), from Saeed Mahameed.
5) Missing RCU read locking in nl80211_send_iface() when it calls
ieee80211_bss_get_ie(), from Dominik Brodowski.
6) cfg80211 should check dev_set_name()'s return value, from Johannes
Berg.
7) Missing module license tag in 9p protocol, from Stephen Hemminger.
8) Fix crash due to too small MTU in udp ipv6 sendmsg, from Mike
Maloney.
9) Fix endless loop in netlink extack code, from David Ahern.
10) TLS socket layer sets inverted error codes, resulting in an endless
loop. From Robert Hering.
11) Revert openvswitch erspan tunnel support, it's mis-designed and we
need to kill it before it goes into a real release. From William Tu.
12) Fix lan78xx failures in full speed USB mode, from Yuiko Oshino.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
net, sched: fix panic when updating miniq {b,q}stats
qed: Fix potential use-after-free in qed_spq_post()
nfp: use the correct index for link speed table
lan78xx: Fix failure in USB Full Speed
sctp: do not allow the v4 socket to bind a v4mapped v6 address
sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
sctp: reinit stream if stream outcnt has been change by sinit in sendmsg
ibmvnic: Fix pending MAC address changes
netlink: extack: avoid parenthesized string constant warning
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
net: Allow neigh contructor functions ability to modify the primary_key
sh_eth: fix dumping ARSTR
Revert "openvswitch: Add erspan tunnel support."
net/tls: Fix inverted error codes to avoid endless loop
ipv6: ip6_make_skb() needs to clear cork.base.dst
sctp: avoid compiler warning on implicit fallthru
net: ipv4: Make "ip route get" match iif lo rules again.
netlink: extack needs to be reset each time through loop
tipc: fix a memory leak in tipc_nl_node_get_link()
ipv6: fix udpv6 sendmsg crash caused by too small MTU
...
Based on the recent introduction of phy_modify add helpers for setting
and clearing bits in PHY registers.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up: Over time, the industry has adopted the term "frwr"
instead of "frmr". The term "frwr" is now more widely recognized.
For the past couple of years I've attempted to add new code using
"frwr" , but there still remains plenty of older code that still
uses "frmr". Replace all usage of "frmr" to avoid confusion.
While we're churning code, rename variables unhelpfully called "f"
to "frwr", to improve code clarity.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Some older compilers (gcc-4.4 through 4.6 in particular) struggle
with the way that blkg_rwstat_read() returns a structure, leading
to excessive stack usage and rather inefficient code:
block/blk-cgroup.c: In function 'blkg_destroy':
block/blk-cgroup.c:354:1: error: the frame size of 1296 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
block/cfq-iosched.c: In function 'cfqg_stats_add_aux':
block/cfq-iosched.c:753:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
block/bfq-cgroup.c: In function 'bfqg_stats_add_aux':
block/bfq-cgroup.c:299:1: error: the frame size of 1928 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
I also notice that there is no point in using atomic accesses
for the local variables, so storing the temporaries in simple 'u64'
variables not only avoids the stack usage on older compilers but
also improves the object code on modern versions.
Fixes: e6269c4454 ("blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it")
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This patch brings basic support for the Linux Driver Model to the
NuBus subsystem.
For flexibility, the matching of boards with drivers is left up to the
drivers. This is also the approach taken by NetBSD. A board may have
many functions, and drivers may have to consider many functional
resources and board resources in order to match a device.
This implementation does not bind drivers to resources (nor does it bind
many drivers to the same board). Apple's NuBus declaration ROM design
is flexible enough to allow that, but I don't see a need to support it
as we don't use the "slot zero" resources (in the main logic board ROM).
Eliminate the global nubus_boards linked list by rewriting the procfs
board iterator around bus_for_each_dev(). Hence the nubus device refcount
can be used to determine the lifespan of board objects.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
It is misleading to call a functional resource a "device". In adopting
the Linux Driver Model, the struct device will be embedded in struct
nubus_board. That will compound the terminlogy problem because drivers
will bind with boards, not with functional resources. Avoid this by
renaming struct nubus_dev as struct nubus_rsrc. "Functional resource"
is the vendor's terminology so this helps avoid confusion.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
The /proc/bus/nubus/s/ directory tree for any slot s is missing a lot
of information. The struct file_operations methods have long been left
unimplemented (hence the familiar compile-time warning, "Need to set
some I/O handlers here").
Slot resources have a complex structure which varies depending on board
function. The logic for interpreting these ROM data structures is found
in nubus.c. Let's not duplicate that logic in proc.c.
Create the /proc/bus/nubus/s/ inodes while scanning slot s. During
descent through slot resource subdirectories, call the new
nubus_proc_add_foo() functions to create the procfs inodes.
Also add a new function, nubus_seq_write_rsrc_mem(), to write the
contents of a particular slot resource to a given seq_file. This is
used by the procfs file_operations methods, to finally give userspace
access to slot ROM information, such as the available video modes.
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Eliminate unused values from struct nubus_dev to save wasted memory
(a Radius PrecisionColor 24X card has about 95 functional resources
and up to six such cards may be fitted). Also remove redundant static
variable initialization, an unreachable !MACH_IS_MAC conditional,
the unused nubus_find_device() function, the bogus get_nubus_list()
prototype and the pointless card_present temporary variable.
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
This fixes a couple of warnings from 'make W=1':
drivers/nubus/nubus.c:790: warning: no previous prototype for 'nubus_probe_slot'
drivers/nubus/nubus.c:824: warning: no previous prototype for 'nubus_scan_bus'
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Due to the '#ifdef __KERNEL__' being located in the wrong place, some
definitions from the kernel API were placed in the UAPI header during
the scripted header split. Fix this. Also, remove the duplicate comment
which is only relevant to the UAPI header.
Fixes: 607ca46e97 ("UAPI: (Scripted) Disintegrate include/linux")
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Check array indices. Avoid sprintf. Use buffers of sufficient size.
Use appropriate types for array length parameters.
Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
This part of Secure Encrypted Virtualization (SEV) patch series focuses on KVM
changes required to create and manage SEV guests.
SEV is an extension to the AMD-V architecture which supports running encrypted
virtual machine (VMs) under the control of a hypervisor. Encrypted VMs have their
pages (code and data) secured such that only the guest itself has access to
unencrypted version. Each encrypted VM is associated with a unique encryption key;
if its data is accessed to a different entity using a different key the encrypted
guest's data will be incorrectly decrypted, leading to unintelligible data.
This security model ensures that hypervisor will no longer able to inspect or
alter any guest code or data.
The key management of this feature is handled by a separate processor known as
the AMD Secure Processor (AMD-SP) which is present on AMD SOCs. The SEV Key
Management Specification (see below) provides a set of commands which can be
used by hypervisor to load virtual machine keys through the AMD-SP driver.
The patch series adds a new ioctl in KVM driver (KVM_MEMORY_ENCRYPT_OP). The
ioctl will be used by qemu to issue SEV guest-specific commands defined in Key
Management Specification.
The following links provide additional details:
AMD Memory Encryption white paper:
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
AMD64 Architecture Programmer's Manual:
http://support.amd.com/TechDocs/24593.pdf
SME is section 7.10
SEV is section 15.34
SEV Key Management:
http://support.amd.com/TechDocs/55766_SEV-KM API_Specification.pdf
KVM Forum Presentation:
http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
SEV Guest BIOS support:
SEV support has been add to EDKII/OVMF BIOS
https://github.com/tianocore/edk2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Various CAN or CAN-FD IP may be able to run at a faster rate than
what the transceiver the CAN node is connected to. This can lead to
unexpected errors. However, CAN transceivers typically have fixed
limitations and provide no means to discover these limitations at
runtime. Therefore, add support for a can-transceiver node that
can be reused by other CAN peripheral drivers to determine for both
CAN and CAN-FD what the max bitrate that can be used. If the user
tries to configure CAN to pass these maximum bitrates it will throw
an error.
Also add support for reading bitrate_max via the netlink interface.
Reviewed-by: Suman Anna <s-anna@ti.com>
Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
[nsekhar@ti.com: fix build error with !CONFIG_OF]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
hrtimer callbacks are always invoked in hard interrupt context. Several
users in tree require soft interrupt context for their callbacks and
achieve this by combining a hrtimer with a tasklet. The hrtimer schedules
the tasklet in hard interrupt context and the tasklet callback gets invoked
in softirq context later.
That's suboptimal and aside of that the real-time patch moves most of the
hrtimers into softirq context. So adding native support for hrtimers
expiring in softirq context is a valuable extension for both mainline and
the RT patch set.
Each valid hrtimer clock id has two associated hrtimer clock bases: one for
timers expiring in hardirq context and one for timers expiring in softirq
context.
Implement the functionality to associate a hrtimer with the hard or softirq
related clock bases and update the relevant functions to take them into
account when the next expiry time needs to be evaluated.
Add a check into the hard interrupt context handler functions to check
whether the first expiring softirq based timer has expired. If it's expired
the softirq is raised and the accounting of softirq based timers to
evaluate the next expiry time for programming the timer hardware is skipped
until the softirq processing has finished. At the end of the softirq
processing the regular processing is resumed.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-29-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Before commit:
e33a9bba85 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
delayacct_blkio_end() was called after context-switching into the task which
completed I/O.
This resulted in double counting: the task would account a delay both waiting
for I/O and for time spent in the runqueue.
With e33a9bba85, delayacct_blkio_end() is called by try_to_wake_up().
In ttwu, we have not yet context-switched. This is more correct, in that
the delay accounting ends when the I/O is complete.
But delayacct_blkio_end() relies on 'get_current()', and we have not yet
context-switched into the task whose I/O completed. This results in the
wrong task having its delay accounting statistics updated.
Instead of doing that, pass the task_struct being woken to delayacct_blkio_end(),
so that it can update the statistics of the correct task.
Signed-off-by: Josh Snyder <joshs@netflix.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Brendan Gregg <bgregg@netflix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-block@vger.kernel.org
Fixes: e33a9bba85 ("sched/core: move IO scheduling accounting from io_schedule_timeout() into scheduler")
Link: http://lkml.kernel.org/r/1513613712-571-1-git-send-email-joshs@netflix.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently hrtimer callback functions are always executed in hard interrupt
context. Users of hrtimers, which need their timer function to be executed
in soft interrupt context, make use of tasklets to get the proper context.
Add additional hrtimer clock bases for timers which must expire in softirq
context, so the detour via the tasklet can be avoided. This is also
required for RT, where the majority of hrtimer is moved into softirq
hrtimer context.
The selection of the expiry mode happens via a mode bit. Introduce
HRTIMER_MODE_SOFT and the matching combinations with the ABS/REL/PINNED
bits and update the decoding of hrtimer_mode in tracepoints.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-27-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
hrtimer_cpu_base.next_timer stores the pointer to the next expiring timer
in a CPU base.
This pointer cannot be dereferenced and is solely used to check whether a
hrtimer which is removed is the hrtimer which is the first to expire in the
CPU base. If this is the case, then the timer hardware needs to be
reprogrammed to avoid an extra interrupt for nothing.
Again, this is conditional functionality, but there is no compelling reason
to make this conditional. As a preparation, hrtimer_cpu_base.next_timer
needs to be available unconditonally.
Aside of that the upcoming support for softirq based hrtimers requires access
to this pointer unconditionally as well, so our motivation is not entirely
simplicity based.
Make the update of hrtimer_cpu_base.next_timer unconditional and remove the
#ifdef cruft. The impact on CONFIG_HIGH_RES_TIMERS=n && CONFIG_NOHZ=n is
marginal as it's just a store on an already dirtied cacheline.
No functional change.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-17-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
hrtimer_cpu_base.expires_next is used to cache the next event armed in the
timer hardware. The value is used to check whether an hrtimer can be
enqueued remotely. If the new hrtimer is expiring before expires_next, then
remote enqueue is not possible as the remote hrtimer hardware cannot be
accessed for reprogramming to an earlier expiry time.
The remote enqueue check is currently conditional on
CONFIG_HIGH_RES_TIMERS=y and hrtimer_cpu_base.hres_active. There is no
compelling reason to make this conditional.
Move hrtimer_cpu_base.expires_next out of the CONFIG_HIGH_RES_TIMERS=y
guarded area and remove the conditionals in hrtimer_check_target().
The check is currently a NOOP for the CONFIG_HIGH_RES_TIMERS=n and the
!hrtimer_cpu_base.hres_active case because in these cases nothing updates
hrtimer_cpu_base.expires_next yet. This will be changed with later patches
which further reduce the #ifdef zoo in this code.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-16-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The pointer to the currently running timer is stored in hrtimer_cpu_base
before the base lock is dropped and the callback is invoked.
This results in two levels of indirections and the upcoming support for
softirq based hrtimer requires splitting the "running" storage into soft
and hard IRQ context expiry.
Storing both in the cpu base would require conditionals in all code paths
accessing that information.
It's possible to have a per clock base sequence count and running pointer
without changing the semantics of the related mechanisms because the timer
base pointer cannot be changed while a timer is running the callback.
Unfortunately this makes cpu_clock base larger than 32 bytes on 32-bit
kernels. Instead of having huge gaps due to alignment, remove the alignment
and let the compiler pack CPU base for 32-bit kernels. The resulting cache access
patterns are fortunately not really different from the current
behaviour. On 64-bit kernels the 64-byte alignment stays and the behaviour is
unchanged. This was determined by analyzing the resulting layout and
looking at the number of cache lines involved for the frequently used
clocks.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-12-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
schedule_hrtimeout_range_clock() uses an 'int clock' parameter for the
clock ID, instead of the customary predefined "clockid_t" type.
In hrtimer coding style the canonical variable name for the clock ID is
'clock_id', therefore change the name of the parameter here as well
to make it all consistent.
While at it, clean up the description for the 'clock_id' and 'mode'
function parameters. The clock modes and the clock IDs are not
restricted as the comment suggests.
Fix the mode description as well for the callers of schedule_hrtimeout_range_clock().
No functional changes intended.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@chromium.org
Link: http://lkml.kernel.org/r/20171221104205.7269-5-anna-maria@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The function copy_siginfo_from_user32 is used for two things, in ptrace
since the dawn of siginfo for arbirarily modifying a signal that
user space sees, and in sigqueueinfo to send a signal with arbirary
siginfo data.
Create a single copy of copy_siginfo_from_user32 that all architectures
share, and teach it to handle all of the cases in the siginfo union.
In the generic version of copy_siginfo_from_user32 ensure that all
of the fields in siginfo are initialized so that the siginfo structure
can be safely copied to userspace if necessary.
When copying the embedded sigval union copy the si_int member. That
ensures the 32bit values passes through the kernel unchanged.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
The addr_lsb fields is only valid and available when the
signal is SIGBUS and the si_code is BUS_MCEERR_AR or BUS_MCEERR_AO.
Document this with a comment and place the field in the _sigfault union
to make this clear.
All of the fields stay in the same physical location so both the old
and new definitions of struct siginfo will continue to work.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>