There's no driver currently using it; it is also not
documented about what it would be supposed to do.
So, get rid of it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
There's a flag defined for Digital TV demux that is not used
anywhere, called DMX_KERNEL_CLIENT. Get rid of it.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Now that frontend.h contains most documentation for the frontend,
remove the duplicated information from Documentation/ and use the
kernel-doc auto-generated one instead.
That should simplify maintainership of DVB frontend uAPI, as most
of the documentation will stick with the header file.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Most of the stuff at the Digital TV frontend header file
are documented only at the Documentation. However, a few
kernel-doc markups are there, several of them with parsing
issues.
Add the missing documentation, copying definitions from the
Documentation when it applies, fixing some bugs.
Please notice that DVBv3 stuff that were deprecated weren't
commented by purpose. Instead, they were clearly tagged as
such.
This patch prepares to move part of the documentation from
Documentation/ to kernel-doc comments.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
struct dtv_cmds_h is just an ancillary struct used by the
dvb_frontend.c to internally store frontend commands.
It doesn't belong to the userspace header, nor it is used anywhere,
except inside the DVB core. So, remove it from the header.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Using typedefs inside the Kernel is against CodingStyle, and
there's no good usage here.
Just like we did at frontend.h, at commit 0df289a209
("[media] dvb: Get rid of typedev usage for enums"), let's keep
those typedefs only to provide userspace backward compatibility.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Using typedefs inside the Kernel is against CodingStyle, and
there's no good usage here.
Just like we did at frontend.h, at commit 0df289a209 ("[media] dvb:
Get rid of typedev usage for enums"), let's keep those typedefs only
to provide userspace backward compatibility.
No functional changes.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Pull x86 asm updates from Ingo Molnar:
- Introduce the ORC unwinder, which can be enabled via
CONFIG_ORC_UNWINDER=y.
The ORC unwinder is a lightweight, Linux kernel specific debuginfo
implementation, which aims to be DWARF done right for unwinding.
Objtool is used to generate the ORC unwinder tables during build, so
the data format is flexible and kernel internal: there's no
dependency on debuginfo created by an external toolchain.
The ORC unwinder is almost two orders of magnitude faster than the
(out of tree) DWARF unwinder - which is important for perf call graph
profiling. It is also significantly simpler and is coded defensively:
there has not been a single ORC related kernel crash so far, even
with early versions. (knock on wood!)
But the main advantage is that enabling the ORC unwinder allows
CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel
measurably:
With frame pointers disabled, GCC does not have to add frame pointer
instrumentation code to every function in the kernel. The kernel's
.text size decreases by about 3.2%, resulting in better cache
utilization and fewer instructions executed, resulting in a broad
kernel-wide speedup. Average speedup of system calls should be
roughly in the 1-3% range - measurements by Mel Gorman [1] have shown
a speedup of 5-10% for some function execution intense workloads.
The main cost of the unwinder is that the unwinder data has to be
stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel
config - which is a modest cost on modern x86 systems.
Given how young the ORC unwinder code is it's not enabled by default
- but given the performance advantages the plan is to eventually make
it the default unwinder on x86.
See Documentation/x86/orc-unwinder.txt for more details.
- Remove lguest support: its intended role was that of a temporary
proof of concept for virtualization, plus its removal will enable the
reduction (removal) of the paravirt API as well, so Rusty agreed to
its removal. (Juergen Gross)
- Clean up and fix FSGS related functionality (Andy Lutomirski)
- Clean up IO access APIs (Andy Shevchenko)
- Enhance the symbol namespace (Jiri Slaby)
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
objtool: Handle GCC stack pointer adjustment bug
x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()
x86/fpu/math-emu: Add ENDPROC to functions
x86/boot/64: Extract efi_pe_entry() from startup_64()
x86/boot/32: Extract efi_pe_entry() from startup_32()
x86/lguest: Remove lguest support
x86/paravirt/xen: Remove xen_patch()
objtool: Fix objtool fallthrough detection with function padding
x86/xen/64: Fix the reported SS and CS in SYSCALL
objtool: Track DRAP separately from callee-saved registers
objtool: Fix validate_branch() return codes
x86: Clarify/fix no-op barriers for text_poke_bp()
x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs
selftests/x86/fsgsbase: Test selectors 1, 2, and 3
x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common
x86/asm: Fix UNWIND_HINT_REGS macro for older binutils
x86/asm/32: Fix regs_get_register() on segment registers
x86/xen/64: Rearrange the SYSCALL entries
x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads
...
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Add branch type profiling/tracing support. (Jin Yao)
- Add the PERF_SAMPLE_PHYS_ADDR ABI to allow the tracing/profiling of
physical memory addresses, where the PMU supports it. (Kan Liang)
- Export some PMU capability details in the new
/sys/bus/event_source/devices/cpu/caps/ sysfs directory. (Andi
Kleen)
- Aux data fixes and updates (Will Deacon)
- kprobes fixes and updates (Masami Hiramatsu)
- AMD uncore PMU driver fixes and updates (Janakarajan Natarajan)
On the tooling side, here's a (limited!) list of highlights - there
were many other changes that I could not list, see the shortlog and
git history for details:
UI improvements:
- Implement a visual marker for fused x86 instructions in the
annotate TUI browser, available now in 'perf report', more work
needed to have it available as well in 'perf top' (Jin Yao)
Further explanation from one of Jin's patches:
│ ┌──cmpl $0x0,argp_program_version_hook
81.93 │ ├──je 20
│ │ lock cmpxchg %esi,0x38a9a4(%rip)
│ │↓ jne 29
│ │↓ jmp 43
11.47 │20:└─→cmpxch %esi,0x38a999(%rip)
That means the cmpl+je is a fused instruction pair and they should
be considered together.
- Record the branch type and then show statistics and info about in
callchain entries (Jin Yao)
Example from one of Jin's patches:
# perf record -g -j any,save_type
# perf report --branch-history --stdio --no-children
38.50% div.c:45 [.] main div
|
---main div.c:42 (RET CROSS_2M cycles:2)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (RET CROSS_2M cycles:1)
rand rand.c:28 (cycles:1)
rand rand.c:28 (RET CROSS_2M cycles:1)
__random random.c:298 (cycles:1)
__random random.c:297 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (RET CROSS_2M cycles:9)
namespaces support:
- Add initial support for namespaces, using setns to access files in
namespaces, grabbing their build-ids, etc. (Krister Johansen)
perf trace enhancements:
- Beautify pkey_{alloc,free,mprotect} arguments in 'perf trace'
(Arnaldo Carvalho de Melo)
- Add initial 'clone' syscall args beautifier in 'perf trace'
(Arnaldo Carvalho de Melo)
- Ignore 'fd' and 'offset' args for MAP_ANONYMOUS in 'perf trace'
(Arnaldo Carvalho de Melo)
- Beautifiers for the 'cmd' arg of several ioctl types, including:
sound, DRM, KVM, vhost virtio and perf_events. (Arnaldo Carvalho de
Melo)
- Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data'
CTF conversion, allowing CTF trace visualization tools to show
callchains and to resolve symbols (Geneviève Bastien)
- Beautify the fcntl syscall, which is an interesting one in the
sense that infrastructure had to be put in place to change the
formatters of some arguments according to the value in a previous
one, i.e. cmd dictates how arg and the syscall return will be
formatted. (Arnaldo Carvalho de Melo
perf stat enhancements:
- Use group read for event groups in 'perf stat', reducing overhead
when groups are defined in the event specification, i.e. when using
{} to enclose a list of events, asking them to be read at the same
time, e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa)
pipe mode improvements:
- Process tracing data in 'perf annotate' pipe mode (David
Carrillo-Cisneros)
- Add header record types to pipe-mode, now this command:
$ perf record -o - -e cycles sleep 1 | perf report --stdio --header
Will show the same as in non-pipe mode, i.e. involving a perf.data
file (David Carrillo-Cisneros)
Vendor specific hardware event support updates/enhancements:
- Update POWER9 vendor events tables (Sukadev Bhattiprolu)
- Add POWER9 PMU events Sukadev (Bhattiprolu)
- Support additional POWER8+ PVR in PMU mapfile (Shriya)
- Add Skylake server uncore JSON vendor events (Andi Kleen)
- Support exporting Intel PT data to sqlite3 with python perf
scripts, this is in addition to the postgresql support that was
already there (Adrian Hunter)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (253 commits)
perf symbols: Fix plt entry calculation for ARM and AARCH64
perf probe: Fix kprobe blacklist checking condition
perf/x86: Fix caps/ for !Intel
perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR
perf/core, pt, bts: Get rid of itrace_started
perf trace beauty: Beautify pkey_{alloc,free,mprotect} arguments
tools headers: Sync cpu features kernel ABI headers with tooling headers
perf tools: Pass full path of FEATURES_DUMP
perf tools: Robustify detection of clang binary
tools lib: Allow external definition of CC, AR and LD
perf tools: Allow external definition of flex and bison binary names
tools build tests: Don't hardcode gcc name
perf report: Group stat values on global event id
perf values: Zero value buffers
perf values: Fix allocation check
perf values: Fix thread index bug
perf report: Add dump_read function
perf record: Set read_format for inherit_stat
perf c2c: Fix remote HITM detection for Skylake
perf tools: Fix static build with newer toolchains
...
In the last NFWS in Faro, Portugal, we discussed that netlink is lacking
the semantics to request non recursive deletions, ie. do not delete an
object iff it has child objects that hang from this parent object that
the user requests to be deleted.
We need this new flag to solve a problem for the iptables-compat
backward compatibility utility, that runs iptables commands using the
existing nf_tables netlink interface. Specifically, custom chains in
iptables cannot be deleted if there are rules in it, however, nf_tables
allows to remove any chain that is populated with content. To sort out
this asymmetry, iptables-compat userspace sets this new NLM_F_NONREC
flag to obtain the same semantics that iptables provides.
This new flag should only be used for deletion requests. Note this new
flag value overlaps with the existing:
* NLM_F_ROOT for get requests.
* NLM_F_REPLACE for new requests.
However, those flags should not ever be used in deletion requests.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull RCU updates from Ingo Molnad:
"The main RCU related changes in this cycle were:
- Removal of spin_unlock_wait()
- SRCU updates
- RCU torture-test updates
- RCU Documentation updates
- Extend the sys_membarrier() ABI with the MEMBARRIER_CMD_PRIVATE_EXPEDITED variant
- Miscellaneous RCU fixes
- CPU-hotplug fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
arch: Remove spin_unlock_wait() arch-specific definitions
locking: Remove spin_unlock_wait() generic definitions
drivers/ata: Replace spin_unlock_wait() with lock/unlock pair
ipc: Replace spin_unlock_wait() with lock/unlock pair
exit: Replace spin_unlock_wait() with lock/unlock pair
completion: Replace spin_unlock_wait() with lock/unlock pair
doc: Set down RCU's scheduling-clock-interrupt needs
doc: No longer allowed to use rcu_dereference on non-pointers
doc: Add RCU files to docbook-generation files
doc: Update memory-barriers.txt for read-to-write dependencies
doc: Update RCU documentation
membarrier: Provide expedited private command
rcu: Remove exports from rcu_idle_exit() and rcu_idle_enter()
rcu: Add warning to rcu_idle_enter() for irqs enabled
rcu: Make rcu_idle_enter() rely on callers disabling irqs
rcu: Add assertions verifying blocked-tasks list
rcu/tracing: Set disable_rcu_irq_enter on rcu_eqs_exit()
rcu: Add TPS() protection for _rcu_barrier_trace strings
rcu: Use idle versions of swait to make idle-hack clear
swait: Add idle variants which don't contribute to load average
...
Register a new limit stateful object type into the stateful object
infrastructure.
Signed-off-by: Pablo M. Bermudo Garay <pablombg@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds a new feature to hashlimit that allows matching on the
current packet/byte rate without rate limiting. This can be enabled
with a new flag --hashlimit-rate-match. The match returns true if the
current rate of packets is above/below the user specified value.
The main difference between the existing algorithm and the new one is
that the existing algorithm rate-limits the flow whereas the new
algorithm does not. Instead it *classifies* the flow based on whether
it is above or below a certain rate. I will demonstrate this with an
example below. Let us assume this rule:
iptables -A INPUT -m hashlimit --hashlimit-above 10/s -j new_chain
If the packet rate is 15/s, the existing algorithm would ACCEPT 10
packets every second and send 5 packets to "new_chain".
But with the new algorithm, as long as the rate of 15/s is sustained,
all packets will continue to match and every packet is sent to new_chain.
This new functionality will let us classify different flows based on
their current rate, so that further decisions can be made on them based on
what the current rate is.
This is how the new algorithm works:
We divide time into intervals of 1 (sec/min/hour) as specified by
the user. We keep track of the number of packets/bytes processed in the
current interval. After each interval we reset the counter to 0.
When we receive a packet for match, we look at the packet rate
during the current interval and the previous interval to make a
decision:
if [ prev_rate < user and cur_rate < user ]
return Below
else
return Above
Where cur_rate is the number of packets/bytes seen in the current
interval, prev is the number of packets/bytes seen in the previous
interval and 'user' is the rate specified by the user.
We also provide flexibility to the user for choosing the time
interval using the option --hashilmit-interval. For example the user can
keep a low rate like x/hour but still keep the interval as small as 1
second.
To preserve backwards compatibility we have to add this feature in a new
revision, so I've created revision 3 for hashlimit. The two new options
we add are:
--hashlimit-rate-match
--hashlimit-rate-interval
I have updated the help text to add these new options. Also added a few
tests for the new options.
Suggested-by: Igor Lubashev <ilubashe@akamai.com>
Reviewed-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next
tree. Basically, updates to the conntrack core, enhancements for
nf_tables, conversion of netfilter hooks from linked list to array to
improve memory locality and asorted improvements for the Netfilter
codebase. More specifically, they are:
1) Add expection to hashes after timer initialization to prevent
access from another CPU that walks on the hashes and calls
del_timer(), from Florian Westphal.
2) Don't update nf_tables chain counters from hot path, this is only
used by the x_tables compatibility layer.
3) Get rid of nested rcu_read_lock() calls from netfilter hook path.
Hooks are always guaranteed to run from rcu read side, so remove
nested rcu_read_lock() where possible. Patch from Taehee Yoo.
4) nf_tables new ruleset generation notifications include PID and name
of the process that has updated the ruleset, from Phil Sutter.
5) Use skb_header_pointer() from nft_fib, so we can reuse this code from
the nf_family netdev family. Patch from Pablo M. Bermudo.
6) Add support for nft_fib in nf_tables netdev family, also from Pablo.
7) Use deferrable workqueue for conntrack garbage collection, to reduce
power consumption, from Patch from Subash Abhinov Kasiviswanathan.
8) Add nf_ct_expect_iterate_net() helper and use it. From Florian
Westphal.
9) Call nf_ct_unconfirmed_destroy only from cttimeout, from Florian.
10) Drop references on conntrack removal path when skbuffs has escaped via
nfqueue, from Florian.
11) Don't queue packets to nfqueue with dying conntrack, from Florian.
12) Constify nf_hook_ops structure, from Florian.
13) Remove neededlessly branch in nf_tables trace code, from Phil Sutter.
14) Add nla_strdup(), from Phil Sutter.
15) Rise nf_tables objects name size up to 255 chars, people want to use
DNS names, so increase this according to what RFC 1035 specifies.
Patch series from Phil Sutter.
16) Kill nf_conntrack_default_on, it's broken. Default on conntrack hook
registration on demand, suggested by Eric Dumazet, patch from Florian.
17) Remove unused variables in compat_copy_entry_from_user both in
ip_tables and arp_tables code. Patch from Taehee Yoo.
18) Constify struct nf_conntrack_l4proto, from Julia Lawall.
19) Constify nf_loginfo structure, also from Julia.
20) Use a single rb root in connlimit, from Taehee Yoo.
21) Remove unused netfilter_queue_init() prototype, from Taehee Yoo.
22) Use audit_log() instead of open-coding it, from Geliang Tang.
23) Allow to mangle tcp options via nft_exthdr, from Florian.
24) Allow to fetch TCP MSS from nft_rt, from Florian. This includes
a fix for a miscalculation of the minimal length.
25) Simplify branch logic in h323 helper, from Nick Desaulniers.
26) Calculate netlink attribute size for conntrack tuple at compile
time, from Florian.
27) Remove protocol name field from nf_conntrack_{l3,l4}proto structure.
From Florian.
28) Remove holes in nf_conntrack_l4proto structure, so it becomes
smaller. From Florian.
29) Get rid of print_tuple() indirection for /proc conntrack listing.
Place all the code in net/netfilter/nf_conntrack_standalone.c.
Patch from Florian.
30) Do not built in print_conntrack() if CONFIG_NF_CONNTRACK_PROCFS is
off. From Florian.
31) Constify most nf_conntrack_{l3,l4}proto helper functions, from
Florian.
32) Fix broken indentation in ebtables extensions, from Colin Ian King.
33) Fix several harmless sparse warning, from Florian.
34) Convert netfilter hook infrastructure to use array for better memory
locality, joint work done by Florian and Aaron Conole. Moreover, add
some instrumentation to debug this.
35) Batch nf_unregister_net_hooks() calls, to call synchronize_net once
per batch, from Florian.
36) Get rid of noisy logging in ICMPv6 conntrack helper, from Florian.
37) Get rid of obsolete NFDEBUG() instrumentation, from Varsha Rao.
38) Remove unused code in the generic protocol tracker, from Davide
Caratti.
I think I will have material for a second Netfilter batch in my queue if
time allow to make it fit in this merge window.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull drm updates from Dave Airlie:
"This is the main drm pull request for 4.14 merge window.
I'm sending this early, as my continuing journey into fatherhood is
occurring really soon now, I'm going to be mostly useless for the next
couple of weeks, though I may be able to read email, I doubt I'll be
doing much patch applications or git sending. If anything urgent pops
up I've asked Daniel/Jani/Alex/Sean to try and direct stuff towards
you.
Outside drm changes:
Some rcar-du updates that touch the V4L tree, all acks should be in
place. It adds one export to the radix tree code for new i915 use
case. There are some minor AGP cleanups (don't see that too often).
Changes to the vbox driver in staging to avoid breaking compilation.
Summary:
core:
- Atomic helper fixes
- Atomic UAPI fixes
- Add YCBCR 4:2:0 support
- Drop set_busid hook
- Refactor fb_helper locking
- Remove a bunch of internal APIs
- Add a bunch of better default handlers
- Format modifier/blob plane property added
- More internal header refactoring
- Make more internal API names consistent
- Enhanced syncobj APIs (wait/signal/reset/create signalled)
bridge:
- Add Synopsys Designware MIPI DSI host bridge driver
tiny:
- Add Pervasive Displays RePaper displays
- Add support for LEGO MINDSTORMS EV3 LCD
i915:
- Lots of GEN10/CNL support patches
- drm syncobj support
- Skylake+ watermark refactoring
- GVT vGPU 48-bit ppgtt support
- GVT performance improvements
- NOA change ioctl
- CCS (color compression) scanout support
- GPU reset improvements
amdgpu:
- Initial hugepage support
- BO migration logic rework
- Vega10 improvements
- Powerplay fixes
- Stop reprogramming the MC
- Fixes for ACP audio on stoney
- SR-IOV fixes/improvements
- Command submission overhead improvements
amdkfd:
- Non-dGPU upstreaming patches
- Scratch VA ioctl
- Image tiling modes
- Update PM4 headers for new firmware
- Drop all BUG_ONs.
nouveau:
- GP108 modesetting support.
- Disable MSI on big endian.
vmwgfx:
- Add fence fd support.
msm:
- Runtime PM improvements
exynos:
- NV12MT support
- Refactor KMS drivers
imx-drm:
- Lock scanout channel to improve memory bw
- Cleanups
etnaviv:
- GEM object population fixes
tegra:
- Prep work for Tegra186 support
- PRIME mmap support
sunxi:
- HDMI support improvements
- HDMI CEC support
omapdrm:
- HDMI hotplug IRQ support
- Big driver cleanup
- OMAP5 DSI support
rcar-du:
- vblank fixes
- VSP1 updates
arcgpu:
- Minor fixes
stm:
- Add STM32 DSI controller driver
dw_hdmi:
- Add support for Rockchip RK3399
- HDMI CEC support
atmel-hlcdc:
- Add 8-bit color support
vc4:
- Atomic fixes
- New ioctl to attach a label to a buffer object
- HDMI CEC support
- Allow userspace to dictate rendering order on submit ioctl"
* tag 'drm-for-v4.14' of git://people.freedesktop.org/~airlied/linux: (1074 commits)
drm/syncobj: Add a signal ioctl (v3)
drm/syncobj: Add a reset ioctl (v3)
drm/syncobj: Add a syncobj_array_find helper
drm/syncobj: Allow wait for submit and signal behavior (v5)
drm/syncobj: Add a CREATE_SIGNALED flag
drm/syncobj: Add a callback mechanism for replace_fence (v3)
drm/syncobj: add sync obj wait interface. (v8)
i915: Use drm_syncobj_fence_get
drm/syncobj: Add a race-free drm_syncobj_fence_get helper (v2)
drm/syncobj: Rename fence_get to find_fence
drm: kirin: Add mode_valid logic to avoid mode clocks we can't generate
drm/vmwgfx: Bump the version for fence FD support
drm/vmwgfx: Add export fence to file descriptor support
drm/vmwgfx: Add support for imported Fence File Descriptor
drm/vmwgfx: Prepare to support fence fd
drm/vmwgfx: Fix incorrect command header offset at restart
drm/vmwgfx: Support the NOP_ERROR command
drm/vmwgfx: Restart command buffers after errors
drm/vmwgfx: Move irq bottom half processing to threads
drm/vmwgfx: Don't use drm_irq_[un]install
...
Pull misc fixes from Al Viro:
"Loose ends and regressions from the last merge window.
Strictly speaking, only binfmt_flat thing is a build regression per
se - the rest is 'only sparse cares about that' stuff"
[ This came in before the 4.13 release and could have gone there, but it
was late in the release and nothing seemed critical enough to care, so
I'm pulling it in the 4.14 merge window instead - Linus ]
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
binfmt_flat: fix arch/m32r and arch/microblaze flat_put_addr_at_rp()
compat_hdio_ioctl: Fix a declaration
<linux/uaccess.h>: Fix copy_in_user() declaration
annotate RWF_... flags
teach SYSCALL_DEFINE/COMPAT_SYSCALL_DEFINE to handle __bitwise arguments
Report TCP MD5 (RFC2385) signing keys, addresses and address prefixes to
processes with CAP_NET_ADMIN requesting INET_DIAG_INFO. Currently it is
not possible to retrieve these from the kernel once they have been
configured on sockets.
Signed-off-by: Ivan Delalande <colona@arista.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The FMR_OF_LAST flag is set on the last fsmap record being returned for
the dataset requested, contrary to what the header file says. Fix the
docs to reflect the behavior of all fsmap implementations.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Root in a non-initial user ns cannot be trusted to write a traditional
security.capability xattr. If it were allowed to do so, then any
unprivileged user on the host could map his own uid to root in a private
namespace, write the xattr, and execute the file with privilege on the
host.
However supporting file capabilities in a user namespace is very
desirable. Not doing so means that any programs designed to run with
limited privilege must continue to support other methods of gaining and
dropping privilege. For instance a program installer must detect
whether file capabilities can be assigned, and assign them if so but set
setuid-root otherwise. The program in turn must know how to drop
partial capabilities, and do so only if setuid-root.
This patch introduces v3 of the security.capability xattr. It builds a
vfs_ns_cap_data struct by appending a uid_t rootid to struct
vfs_cap_data. This is the absolute uid_t (that is, the uid_t in user
namespace which mounted the filesystem, usually init_user_ns) of the
root id in whose namespaces the file capabilities may take effect.
When a task asks to write a v2 security.capability xattr, if it is
privileged with respect to the userns which mounted the filesystem, then
nothing should change. Otherwise, the kernel will transparently rewrite
the xattr as a v3 with the appropriate rootid. This is done during the
execution of setxattr() to catch user-space-initiated capability writes.
Subsequently, any task executing the file which has the noted kuid as
its root uid, or which is in a descendent user_ns of such a user_ns,
will run the file with capabilities.
Similarly when asking to read file capabilities, a v3 capability will
be presented as v2 if it applies to the caller's namespace.
If a task writes a v3 security.capability, then it can provide a uid for
the xattr so long as the uid is valid in its own user namespace, and it
is privileged with CAP_SETFCAP over its namespace. The kernel will
translate that rootid to an absolute uid, and write that to disk. After
this, a task in the writer's namespace will not be able to use those
capabilities (unless rootid was 0), but a task in a namespace where the
given uid is root will.
Only a single security.capability xattr may exist at a time for a given
file. A task may overwrite an existing xattr so long as it is
privileged over the inode. Note this is a departure from previous
semantics, which required privilege to remove a security.capability
xattr. This check can be re-added if deemed useful.
This allows a simple setxattr to work, allows tar/untar to work, and
allows us to tar in one namespace and untar in another while preserving
the capability, without risking leaking privilege into a parent
namespace.
Example using tar:
$ cp /bin/sleep sleepx
$ mkdir b1 b2
$ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1
$ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2
$ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx
$ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar
$ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx
b2/sleepx = cap_sys_admin+ep
# /opt/ltp/testcases/bin/getv3xattr b2/sleepx
v3 xattr, rootid is 100001
A patch to linux-test-project adding a new set of tests for this
functionality is in the nsfscaps branch at github.com/hallyn/ltp
Changelog:
Nov 02 2016: fix invalid check at refuse_fcap_overwrite()
Nov 07 2016: convert rootid from and to fs user_ns
(From ebiederm: mar 28 2017)
commoncap.c: fix typos - s/v4/v3
get_vfs_caps_from_disk: clarify the fs_ns root access check
nsfscaps: change the code split for cap_inode_setxattr()
Apr 09 2017:
don't return v3 cap for caps owned by current root.
return a v2 cap for a true v2 cap in non-init ns
Apr 18 2017:
. Change the flow of fscap writing to support s_user_ns writing.
. Remove refuse_fcap_overwrite(). The value of the previous
xattr doesn't matter.
Apr 24 2017:
. incorporate Eric's incremental diff
. move cap_convert_nscap to setxattr and simplify its usage
May 8, 2017:
. fix leaking dentry refcount in cap_inode_getsecurity
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
The BINDER_GET_NODE_DEBUG_INFO ioctl will return debug info on
a node. Each successive call reusing the previous return value
will return the next node. The data will be used by
libmemunreachable to mark the pointers with kernel references
as reachable.
Signed-off-by: Colin Cross <ccross@android.com>
Signed-off-by: Martijn Coenen <maco@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add socket mark and priority to fields that can be set by
ebpf program when a socket is created.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used by the IPv6 host table which will be introduced in the
following patches. The fields in the header are added per-use. This header
is global and can be reused by many drivers.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a different approach from the first attempt in f2c6df7dbf
("loop: support 4k physical blocksize"). Rather than extending
LOOP_{GET,SET}_STATUS, add a separate ioctl just for setting the block
size.
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This adds information about storage keys to the struct returned by
the KVM_PPC_GET_SMMU_INFO ioctl. The new fields replace a pad field,
which was zeroed by previous kernel versions. Thus userspace that
knows about the new fields will see zeroes when running on an older
kernel, indicating that storage keys are not supported. The size of
the structure has not changed.
The number of keys is hard-coded for the CPUs supported by HV KVM,
which is just POWER7, POWER8 and POWER9.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Define the raw IP type. This is needed for raw IP net devices
like rmnet.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define the Qualcomm multiplexing and aggregation (MAP) ether type 0x00F9.
This is needed for receiving data in the MAP protocol like RMNET. This is
not an officially registered ID.
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 45f119bf93.
Eric Dumazet says:
We found at Google a significant regression caused by
45f119bf93 tcp: remove header prediction
In typical RPC (TCP_RR), when a TCP socket receives data, we now call
tcp_ack() while we used to not call it.
This touches enough cache lines to cause a slowdown.
so problem does not seem to be HP removal itself but the tcp_ack()
call. Therefore, it might be possible to remove HP after all, provided
one finds a way to elide tcp_ack for most cases.
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NSH draft says:
An IEEE EtherType, 0x894F, has been allocated for NSH.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the forces IFE lfb type according to IEEE registered
ethertypes. See http://standards-oui.ieee.org/ethertype/eth.txt for more
information. Since there exists the IFE subsystem it can be used there.
This patch also use the correct word "ForCES" instead of "FoRCES" which
is a spelling error inside the IEEE ethertype specification.
Signed-off-by: Alexander Aring <aring@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For understanding how the workload maps to memory channels and hardware
behavior, it's very important to collect address maps with physical
addresses. For example, 3D XPoint access can only be found by filtering
the physical address.
Add a new sample type for physical address.
perf already has a facility to collect data virtual address. This patch
introduces a function to convert the virtual address to physical address.
The function is quite generic and can be extended to any architecture as
long as a virtual address is provided.
- For kernel direct mapping addresses, virt_to_phys is used to convert
the virtual addresses to physical address.
- For user virtual addresses, __get_user_pages_fast is used to walk the
pages tables for user physical address.
- This does not work for vmalloc addresses right now. These are not
resolved, but code to do that could be added.
The new sample type requires collecting the virtual address. The
virtual address will not be output unless SAMPLE_ADDR is applied.
For security, the physical address can only be exposed to root or
privileged user.
Tested-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: mpe@ellerman.id.au
Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
MediaTek BTIF controller is the serial interface similar to UART but it
works only as the digital device which is mainly used to communicate with
the connectivity module called CONNSYS inside the SoC which could be mostly
found on those MediaTek SoCs with Bluetooth feature such as MT7622 and
MT7623 SoCs.
And the controller is made as being compatible with the 8250 register
layout with extra registers such as DMA enablement so it tends to be
integrated with reusing 8250 OF driver. However, DMA mode is not being
supported yet in the current driver.
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The UAPI has a global list of unique numbers for different port types.
The commit
a2d6a987bf ("serial: 8250: Add new port type for TI DA8xx/66AK2x")
introduced a new port type and brought the collision with two other port
types.
Reuse 95 for it instead.
Fixes: a2d6a987bf ("serial: 8250: Add new port type for TI DA8xx/66AK2x")
Cc: David Lechner <david@lechnology.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
PORT_MFD is not in use since commit
1bd187de53 ("x86, intel-mid: remove Intel MID specific serial support")
Remove leftover.
Fixes: 1bd187de53 ("x86, intel-mid: remove Intel MID specific serial support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It used to be a gap in port definitions after PORT_MAX_8250. Since the
new drivers are coming the gap become shorter and shorter until the
commit a2d6a987bf ("serial: 8250: Add new port type for TI DA8xx/66AK2x")
completely removed it.
So, while type here is just a formality, make things a little bit more
explicit for this driver and move port types to UAPI header. Note,
it uses two types for now.
Fixes: fddceb8b53 ("tty: 8250: Add 64byte UART support for FSL platforms")
Cc: Priyanka Jain <Priyanka.Jain@freescale.com>
Cc: Poonam Aggrwal <poonam.aggrwal@freescale.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The addition of map_flags BPF_SOCKMAP_STRPARSER flags was to handle a
specific use case where we want to have BPF parse program disabled on
an entry in a sockmap.
However, Alexei found the API a bit cumbersome and I agreed. Lets
remove the STRPARSER flag and support the use case by allowing socks
to be in multiple maps. This allows users to create two maps one with
programs attached and one without. When socks are added to maps they
now inherit any programs attached to the map. This is a nice
generalization and IMO improves the API.
The API rules are less ambiguous and do not need a flag:
- When a sock is added to a sockmap we have two cases,
i. The sock map does not have any attached programs so
we can add sock to map without inheriting bpf programs.
The sock may exist in 0 or more other maps.
ii. The sock map has an attached BPF program. To avoid duplicate
bpf programs we only add the sock entry if it does not have
an existing strparser/verdict attached, returning -EBUSY if
a program is already attached. Otherwise attach the program
and inherit strparser/verdict programs from the sock map.
This allows for socks to be in a multiple maps for redirects and
inherit a BPF program from a single map.
Also this patch simplifies the logic around BPF_{EXIST|NOEXIST|ANY}
flags. In the original patch I tried to be extra clever and only
update map entries when necessary. Now I've decided the complexity
is not worth it. If users constantly update an entry with the same
sock for no reason (i.e. update an entry without actually changing
any parameters on map or sock) we still do an alloc/release. Using
this and allowing multiple entries of a sock to exist in a map the
logic becomes much simpler.
Note: Now that multiple maps are supported the "maps" pointer called
when a socket is closed becomes a list of maps to remove the sock from.
To keep the map up to date when a sock is added to the sockmap we must
add the map/elem in the list. Likewise when it is removed we must
remove it from the list. This results in searching the per psock list
on delete operation. On TCP_CLOSE events we walk the list and remove
the psock from all map/entry locations. I don't see any perf
implications in this because at most I have a psock in two maps. If
a psock were to be in many maps its possibly this might be noticeable
on delete but I can't think of a reason to dup a psock in many maps.
The sk_callback_lock is used to protect read/writes to the list. This
was convenient because in all locations we were taking the lock
anyways just after working on the list. Also the lock is per sock so
in normal cases we shouldn't see any contention.
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Fixes: 174a79ff95 ("bpf: sockmap with sk redirect support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the initial sockmap API we provided strparser and verdict programs
using a single attach command by extending the attach API with a the
attach_bpf_fd2 field.
However, if we add other programs in the future we will be adding a
field for every new possible type, attach_bpf_fd(3,4,..). This
seems a bit clumsy for an API. So lets push the programs using two
new type fields.
BPF_SK_SKB_STREAM_PARSER
BPF_SK_SKB_STREAM_VERDICT
This has the advantage of having a readable name and can easily be
extended in the future.
Updates to samples and sockmap included here also generalize tests
slightly to support upcoming patch for multiple map support.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Fixes: 174a79ff95 ("bpf: sockmap with sk redirect support")
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the command payloads that do not have an associated libnvdimm
ioctl. I.e. remove the payloads that would only ever be carried in the
ND_CMD_CALL envelope. This prevents userspace from growing unnecessary
dependencies on this kernel header when userspace already has everything
it needs to craft and send these commands.
Cc: Jerry Hoemann <jerry.hoemann@hpe.com>
Reported-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Increase PPL area to 1MB and use it as circular buffer to store PPL. The
entry with highest generation number is the latest one. If PPL to be
written is larger then space left in a buffer, rewind the buffer to the
start (don't wrap it).
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
The fe_status variable s is not initialized meaning it can have any
random garbage status. This could be problematic if fe->ops.tune is
false as s is not updated by the call to fe->ops.tune() and a
subsequent check on the change status will using a garbage value.
Fix this by adding FE_NONE to the enum fe_status and initializing
s to this.
Detected by CoverityScan, CID#112887 ("Uninitialized scalar variable")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
These formats are compressed 12-bit raw bayer formats with four different
pixel orders. They are similar to 10-bit variants. The formats added by
this patch are
V4L2_PIX_FMT_SBGGR12P
V4L2_PIX_FMT_SGBRG12P
V4L2_PIX_FMT_SGRBG12P
V4L2_PIX_FMT_SRGGB12P
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This patch implements the L2 frame encapsulation mechanism, referred to
as T.Encaps.L2 in the SRv6 specifications [1].
A new type of SRv6 tunnel mode is added (SEG6_IPTUN_MODE_L2ENCAP). It only
accepts packets with an existing MAC header (i.e., it will not work for
locally generated packets). The resulting packet looks like IPv6 -> SRH ->
Ethernet -> original L3 payload. The next header field of the SRH is set to
NEXTHDR_NONE.
[1] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Skylake changed the encoding of the PEBS data source field.
Some combinations are not available anymore, but some new cases
e.g. for L4 cache hit are added.
Fix up the conversion table for Skylake, similar as had been done
for Nehalem.
On Skylake server the encoding for L4 actually means persistent
memory. Handle this case too.
To properly describe it in the abstracted perf format I had to add
some new fields. Since a hit can have only one level add a new
field that is an enumeration, not a bit field to describe
the level. It can describe any level. Some numbers are also
used to describe PMEM and LFB.
Also add a new generic remote flag that can be combined with
the generic level to signify a remote cache.
And there is an extension field for the snoop indication to handle
the Forward state.
I didn't add a generic flag for hops because it's not needed
for Skylake.
I changed the existing encodings for older CPUs to also fill in the
new level and remote fields.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Link: http://lkml.kernel.org/r/20170816222156.19953-3-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This will be used by the IPv4 host table which will be introduced in the
following patches. This header is global and can be reused by many
drivers.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used by the IPv4 host table which will be introduced in the
following patches. This header is global and can be reused by many
drivers.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>