Commit Graph

29483 Commits

Author SHA1 Message Date
Patrick Bellasi
b5c0ce7bd1 sched/fair: Add lsub_positive() and use it consistently
The following pattern:

   var -= min_t(typeof(var), var, val);

is used multiple times in fair.c.

The existing sub_positive() already captures that pattern, but it also
adds an explicit load-store to properly support lockless observations.
In other cases the pattern above is used to update local, and/or not
concurrently accessed, variables.

Let's add a simpler version of sub_positive(), targeted at local variables
updates, which gives the same readability benefits at calling sites,
without enforcing {READ,WRITE}_ONCE() barriers.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/lkml/20181031184527.GA3178@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-12 06:17:52 +01:00
Patrick Bellasi
92a801e5d5 sched/fair: Mask UTIL_AVG_UNCHANGED usages
The _task_util_est() is mainly used to add/remove the task contribution
to/from the rq's estimated utilization at task enqueue/dequeue time.
In both cases we ensure the UTIL_AVG_UNCHANGED flag is set to keep
consistency between enqueue and dequeue time while still being
transparent to update_load_avg calls which will eventually reset the
flag.

Let's move the flag forcing within _task_util_est() itself so that we
can simplify calling code by hiding that estimated utilization
implementation detail into one of its internal functions.

This will affect also the "public" API task_util_est() but we know that
the flag will (eventually) impact just on the LSB of the estimated
utilization, thus it's certainly acceptable.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: http://lkml.kernel.org/r/20181105145400.935-3-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-12 06:17:52 +01:00
Ingo Molnar
59e1678c29 Merge branch 'sched/urgent' into sched/core, to pick up dependent fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-12 05:15:33 +01:00
Patrick Bellasi
c469933e77 sched/fair: Fix cpu_util_wake() for 'execl' type workloads
A ~10% regression has been reported for UnixBench's execl throughput
test by Aaron Lu and Ye Xiaolong:

  https://lkml.org/lkml/2018/10/30/765

That test is pretty simple, it does a "recursive" execve() syscall on the
same binary. Starting from the syscall, this sequence is possible:

   do_execve()
     do_execveat_common()
       __do_execve_file()
         sched_exec()
           select_task_rq_fair()          <==| Task already enqueued
             find_idlest_cpu()
               find_idlest_group()
                 capacity_spare_wake()    <==| Functions not called from
		   cpu_util_wake()           | the wakeup path

which means we can end up calling cpu_util_wake() not only from the
"wakeup path", as its name would suggest. Indeed, the task doing an
execve() syscall is already enqueued on the CPU we want to get the
cpu_util_wake() for.

The estimated utilization for a CPU computed in cpu_util_wake() was
written under the assumption that function can be called only from the
wakeup path. If instead the task is already enqueued, we end up with a
utilization which does not remove the current task's contribution from
the estimated utilization of the CPU.
This will wrongly assume a reduced spare capacity on the current CPU and
increase the chances to migrate the task on execve.

The regression is tracked down to:

 commit d519329f72 ("sched/fair: Update util_est only on util_avg updates")

because in that patch we turn on by default the UTIL_EST sched feature.
However, the real issue is introduced by:

 commit f9be3e5961 ("sched/fair: Use util_est in LB and WU paths")

Let's fix this by ensuring to always discount the task estimated
utilization from the CPU's estimated utilization when the task is also
the current one. The same benchmark of the bug report, executed on a
dual socket 40 CPUs Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz machine,
reports these "Execl Throughput" figures (higher the better):

   mainline     : 48136.5 lps
   mainline+fix : 55376.5 lps

which correspond to a 15% speedup.

Moreover, since {cpu_util,capacity_spare}_wake() are not really only
used from the wakeup path, let's remove this ambiguity by using a better
matching name: {cpu_util,capacity_spare}_without().

Since we are at that, let's also improve the existing documentation.

Reported-by: Aaron Lu <aaron.lu@intel.com>
Reported-by: Ye Xiaolong <xiaolong.ye@intel.com>
Tested-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Perret <quentin.perret@arm.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Todd Kjos <tkjos@google.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Fixes: f9be3e5961 (sched/fair: Use util_est in LB and WU paths)
Link: https://lore.kernel.org/lkml/20181025093100.GB13236@e110439-lin/
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-12 05:00:46 +01:00
Linus Torvalds
08b5278650 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "Just the removal of a redundant call into the sched deadline overrun
  check"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Remove useless call to check_dl_overrun()
2018-11-11 16:37:41 -06:00
Linus Torvalds
024d4d4c0c Merge branch 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
 "Two small scheduler fixes:

   - Take hotplug lock in sched_init_smp(). Technically not really
     required, but lockdep will complain other.

   - Trivial comment fix in sched/fair"

* 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix a comment in task_numa_fault()
  sched/core: Take the hotplug lock in sched_init_smp()
2018-11-11 16:33:00 -06:00
Linus Torvalds
0b002cdd50 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
 "A couple of fixlets for the core:

   - Kernel doc function documentation fixes

   - Missing prototypes for weak watchdog functions"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  resource/docs: Complete kernel-doc style function documentation
  watchdog/core: Add missing prototypes for weak functions
  resource/docs: Fix new kernel-doc warnings
2018-11-11 16:14:05 -06:00
Paul E. McKenney
9cac83a57e rcu: Stop expedited grace periods from relying on stop-machine
The CPU-selection code in sync_rcu_exp_select_cpus() disables preemption
to prevent the cpu_online_mask from changing.  However, this relies on
the stop-machine mechanism in the CPU-hotplug offline code, which is not
desirable (it would be good to someday remove the stop-machine mechanism).

This commit therefore instead uses the relevant leaf rcu_node structure's
->ffmask, which has a bit set for all CPUs that are fully functional.
A given CPU's bit is cleared very early during offline processing by
rcutree_offline_cpu() and set very late during online processing by
rcutree_online_cpu().  Therefore, if a CPU's bit is set in this mask, and
preemption is disabled, we have to be before the synchronize_sched() in
the CPU-hotplug offline code, which means that the CPU is guaranteed to be
workqueue-ready throughout the duration of the enclosing preempt_disable()
region of code.

This also has the side-effect of using WORK_CPU_UNBOUND if all the CPUs for
this leaf rcu_node structure are offline, which is an acceptable difference
in behavior.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-11 11:23:01 -08:00
Andrey Ignatov
46f53a65d2 bpf: Allow narrow loads with offset > 0
Currently BPF verifier allows narrow loads for a context field only with
offset zero. E.g. if there is a __u32 field then only the following
loads are permitted:
  * off=0, size=1 (narrow);
  * off=0, size=2 (narrow);
  * off=0, size=4 (full).

On the other hand LLVM can generate a load with offset different than
zero that make sense from program logic point of view, but verifier
doesn't accept it.

E.g. tools/testing/selftests/bpf/sendmsg4_prog.c has code:

  #define DST_IP4			0xC0A801FEU /* 192.168.1.254 */
  ...
  	if ((ctx->user_ip4 >> 24) == (bpf_htonl(DST_IP4) >> 24) &&

where ctx is struct bpf_sock_addr.

Some versions of LLVM can produce the following byte code for it:

       8:       71 12 07 00 00 00 00 00         r2 = *(u8 *)(r1 + 7)
       9:       67 02 00 00 18 00 00 00         r2 <<= 24
      10:       18 03 00 00 00 00 00 fe 00 00 00 00 00 00 00 00         r3 = 4261412864 ll
      12:       5d 32 07 00 00 00 00 00         if r2 != r3 goto +7 <LBB0_6>

where `*(u8 *)(r1 + 7)` means narrow load for ctx->user_ip4 with size=1
and offset=3 (7 - sizeof(ctx->user_family) = 3). This load is currently
rejected by verifier.

Verifier code that rejects such loads is in bpf_ctx_narrow_access_ok()
what means any is_valid_access implementation, that uses the function,
works this way, e.g. bpf_skb_is_valid_access() for __sk_buff or
sock_addr_is_valid_access() for bpf_sock_addr.

The patch makes such loads supported. Offset can be in [0; size_default)
but has to be multiple of load size. E.g. for __u32 field the following
loads are supported now:
  * off=0, size=1 (narrow);
  * off=1, size=1 (narrow);
  * off=2, size=1 (narrow);
  * off=3, size=1 (narrow);
  * off=0, size=2 (narrow);
  * off=2, size=2 (narrow);
  * off=0, size=4 (full).

Reported-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 22:29:59 -08:00
Quentin Monnet
16a8cb5cff bpf: do not pass netdev to translate() and prepare() offload callbacks
The kernel functions to prepare verifier and translate for offloaded
program retrieve "offload" from "prog", and "netdev" from "offload".
Then both "prog" and "netdev" are passed to the callbacks.

Simplify this by letting the drivers retrieve the net device themselves
from the offload object attached to prog - if they need it at all. There
is currently no need to pass the netdev as an argument to those
functions.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
a40a26322a bpf: pass prog instead of env to bpf_prog_offload_verifier_prep()
Function bpf_prog_offload_verifier_prep(), called from the kernel BPF
verifier to run a driver-specific callback for preparing for the
verification step for offloaded programs, takes a pointer to a struct
bpf_verifier_env object. However, no driver callback needs the whole
structure at this time: the two drivers supporting this, nfp and
netdevsim, only need a pointer to the struct bpf_prog instance held by
env.

Update the callback accordingly, on kernel side and in these two
drivers.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
eb9119471e bpf: pass destroy() as a callback and remove its ndo_bpf subcommand
As part of the transition from ndo_bpf() to callbacks attached to struct
bpf_offload_dev for some of the eBPF offload operations, move the
functions related to program destruction to the struct and remove the
subcommand that was used to call them through the NDO.

Remove function __bpf_offload_ndo(), which is no longer used.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
b07ade27e9 bpf: pass translate() as a callback and remove its ndo_bpf subcommand
As part of the transition from ndo_bpf() to callbacks attached to struct
bpf_offload_dev for some of the eBPF offload operations, move the
functions related to code translation to the struct and remove the
subcommand that was used to call them through the NDO.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
00db12c3d1 bpf: call verifier_prep from its callback in struct bpf_offload_dev
In a way similar to the change previously brought to the verify_insn
hook and to the finalize callback, switch to the newly added ops in
struct bpf_prog_offload for calling the functions used to prepare driver
verifiers.

Since the dev_ops pointer in struct bpf_prog_offload is no longer used
by any callback, we can now remove it from struct bpf_prog_offload.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:54 -08:00
Quentin Monnet
6dc18fa6f4 bpf: call finalize() from its callback in struct bpf_offload_dev
In a way similar to the change previously brought to the verify_insn
hook, switch to the newly added ops in struct bpf_prog_offload for
calling the functions used to perform final verification steps for
offloaded programs.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:53 -08:00
Quentin Monnet
341b3e7b7b bpf: call verify_insn from its callback in struct bpf_offload_dev
We intend to remove the dev_ops in struct bpf_prog_offload, and to only
keep the ops in struct bpf_offload_dev instead, which is accessible from
more locations for passing function pointers.

But dev_ops is used for calling the verify_insn hook. Switch to the
newly added ops in struct bpf_prog_offload instead.

To avoid table lookups for each eBPF instruction to verify, we remember
the offdev attached to a netdev and modify bpf_offload_find_netdev() to
avoid performing more than once a lookup for a given offload object.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:53 -08:00
Quentin Monnet
1385d755cf bpf: pass a struct with offload callbacks to bpf_offload_dev_create()
For passing device functions for offloaded eBPF programs, there used to
be no place where to store the pointer without making the non-offloaded
programs pay a memory price.

As a consequence, three functions were called with ndo_bpf() through
specific commands. Now that we have struct bpf_offload_dev, and since
none of those operations rely on RTNL, we can turn these three commands
into hooks inside the struct bpf_prog_offload_ops, and pass them as part
of bpf_offload_dev_create().

This commit effectively passes a pointer to the struct to
bpf_offload_dev_create(). We temporarily have two struct
bpf_prog_offload_ops instances, one under offdev->ops and one under
offload->dev_ops. The next patches will make the transition towards the
former, so that offload->dev_ops can be removed, and callbacks relying
on ndo_bpf() added to offdev->ops as well.

While at it, rename "nfp_bpf_analyzer_ops" as "nfp_bpf_dev_ops" (and
similarly for netdevsim).

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-11-10 15:39:53 -08:00
Linus Torvalds
1de4f2ef21 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace fixes from Eric Biederman:
 "I believe all of these are simple obviously correct bug fixes. These
  fall into two groups:

   - Fixing the implementation of MNT_LOCKED which prevents lesser
     privileged users from seeing unders mounts created by more
     privileged users.

   - Fixing the extended uid and group mapping in user namespaces.

  As well as ensuring the code looks correct I have spot tested these
  changes as well and in my testing the fixes are working"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  mount: Prevent MNT_DETACH from disconnecting locked mounts
  mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts
  mount: Retest MNT_LOCKED in do_umount
  userns: also map extents in the reverse map to kernel IDs
2018-11-10 13:27:58 -06:00
Jiong Wang
e647815a4d bpf: let verifier to calculate and record max_pkt_offset
In check_packet_access, update max_pkt_offset after the offset has passed
__check_packet_access.

It should be safe to use u32 for max_pkt_offset as explained in code
comment.

Also, when there is tail call, the max_pkt_offset of the called program is
unknown, so conservatively set max_pkt_offset to MAX_PACKET_OFF for such
case.

Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-11-09 09:16:31 +01:00
Paul E. McKenney
0607ba8403 srcu: Prevent __call_srcu() counter wrap with read-side critical section
Ever since cdf7abc461 ("srcu: Allow use of Tiny/Tree SRCU from
both process and interrupt context"), it has been permissible
to use SRCU read-side critical sections in interrupt context.
This allows __call_srcu() to use SRCU read-side critical sections to
prevent a new SRCU grace period from ending before the call to either
srcu_funnel_gp_start() or srcu_funnel_exp_start completes, thus preventing
SRCU grace-period counter overflow during that time.

Note that this does not permit removal of the counter-wrap checks in
srcu_gp_end().  These check are necessary to handle the case where
a given CPU does not interact at all with SRCU for an extended time
period.

This commit therefore adds an SRCU read-side critical section to
__call_srcu() in order to prevent grace period counter wrap during
the funnel-locking process.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-08 21:54:14 -08:00
Paul E. McKenney
78d125d338 sched/membarrier: Replace synchronize_sched() with synchronize_rcu()
Now that synchronize_rcu() waits for preempt-disable regions of code
as well as RCU read-side critical sections, the synchronize_sched()
in sys_membarrier() can be replaced by synchronize_rcu().  This commit
therefore makes this change.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
2018-11-08 21:43:20 -08:00
Paul E. McKenney
d3ff3891b2 rcu: Consolidate the RCU update functions invoked by sync.c
This commit retains all the various gp_ops[] entries, but makes their
update functions all be synchronize_rcu(), call_rcu() and rcu_barrier().
The read-side checks remain consistent with the various RCU flavors,
which still exist on the read side.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2018-11-08 21:43:20 -08:00
Paul E. McKenney
309ba859b9 rcu: Eliminate synchronize_rcu_mult()
Now that synchronize_rcu() waits for both RCU read-side critical
sections and preempt-disabled regions of code, the sole caller of
synchronize_rcu_mult() can be replaced by synchronize_rcu().
This patch makes this change and removes synchronize_rcu_mult().
Note that _wait_rcu_gp() still supports synchronize_rcu_mult(),
and thus might be simplified in the future to take only take
a single call_rcu() function rather than the current list of them.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-11-08 21:43:20 -08:00
Joel Fernandes (Google)
adbccddb4a rcu: Fix rcu_{node,data} comments about gp_seq_needed
Recent changes have removed the old ->gp_seq_needed field from the
rcu_state structure, which in turn obsoleted a couple of comments in
the rcu_node and rcu_data structures.  This commit therefore updates
these comments accordingly.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:43:20 -08:00
Joel Fernandes (Google)
75a8f72245 rcu: Remove unused rcu_state externs
The rcu_bh_state and rcu_sched_state variables were removed during the
RCU flavor consolidations, but external declarations remain in tree.h.
This commit therefore removes these obsolete declarations.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <kernel-team@android.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:43:20 -08:00
Paul E. McKenney
08543bda42 rcu: Eliminate BUG_ON() for kernel/rcu/tree.c
The tree.c file has a number of calls to BUG_ON(), which panics the
kernel, which is not a good strategy for devices (like embedded) that
don't have a way to capture console output.  This commit therefore
converts these BUG_ON() calls to WARN_ON_ONCE() and WARN_ONCE().

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08 21:41:57 -08:00
Paul E. McKenney
042d4c70a2 rcu: Eliminate BUG_ON() for sync.c
The sync.c file has a number of calls to BUG_ON(), which panics the
kernel, which is not a good strategy for devices (like embedded) that
don't have a way to capture console output.  This commit therefore
changes these BUG_ON() calls to WARN_ON_ONCE(), but does so quite naively.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
2018-11-08 21:41:57 -08:00
Waiman Long
5cf8114d6e cpuset: Expose cpuset.cpus.subpartitions with cgroup_debug
For debugging purpose, it will be useful to expose the content of the
subparts_cpus as a read-only file to see if the code work correctly.
However, subparts_cpus will not be used at all in most use cases. So
adding a new cpuset file that clutters the cgroup directory may not be
desirable.  This is now being done by using the hidden "cgroup_debug"
kernel command line option to expose a new "cpuset.cpus.subpartitions"
file.

That option was originally used by the debug controller to expose
itself when configured into the kernel. This is now extended to set an
internal flag used by cgroup_addrm_files(). A new CFTYPE_DEBUG flag
can now be used to specify that a cgroup file should only be created
when the "cgroup_debug" option is specified.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:32 -08:00
Waiman Long
bb5b553c33 cpuset: Use descriptive text when reading/writing cpuset.sched.partition
Currently, cpuset.sched.partition returns the values, 0, 1 or -1 on
read. A person who is not familiar with the partition code may not
understand what they mean.

In order to make cpuset.sched.partition more user-friendly, it will
now display the following descriptive text on read:

  "root" - A partition root (top cpuset of a partition)
  "member" - A non-root member of a partition
  "root invalid" - An invalid partition root

Note that there is at least one partition in the whole cgroup hierarchy.
The top cpuset is the root of that partition.  The rests are either a
root if it starts a new partition or a member of a partition.

The cpuset.sched.partition file will now also accept "root" and
"member" besides 1 and 0 as valid input values. The "root invalid"
value is internal only and cannot be written to the file.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:31 -08:00
Waiman Long
5776ceccd4 cpuset: Expose cpus.effective and mems.effective on cgroup v2 root
Because of the fact that setting the "cpuset.sched.partition" in
a direct child of root can remove CPUs from the root's effective CPU
list, it makes sense to know what CPUs are left in the root cgroup for
scheduling purpose. So the "cpuset.cpus.effective" control file is now
exposed in the v2 cgroup root.

For consistency, the "cpuset.mems.effective" control file is exposed
as well.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:31 -08:00
Waiman Long
0ccea8feb9 cpuset: Make generate_sched_domains() work with partition
The generate_sched_domains() function is modified to make it work
correctly with the newly introduced subparts_cpus mask for scheduling
domains generation.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:30 -08:00
Waiman Long
4b842da276 cpuset: Make CPU hotplug work with partition
When there is a cpu hotplug event (CPU online or offline), the partitions
may need to be reconfigured and regenerated. So code is added to the
hotplug functions to make them work with new subparts_cpus mask to
compute the right effective_cpus for each of the affected cpusets.
It may also change the state of a partition root from real one to an
erroneous one or vice versa.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:30 -08:00
Waiman Long
4716909cc5 cpuset: Track cpusets that use parent's effective_cpus
In the default hierarchy, a cpuset will use the parent's effective_cpus
if none of the requested CPUs can be granted from the parent. That can
be a problem if a parent is a partition root with children partition
roots. Changes to a parent's effective_cpus list due to changes in a
child partition root may not be properly reflected in a child cpuset
that use parent's effective_cpus because the cpu_exclusive rule of a
partition root will not guard against that.

In order to avoid the mismatch, two new tracking variables are added to
the cpuset structure to track if a cpuset uses parent's effective_cpus
and the number of children cpusets that use its effective_cpus. So
whenever cpumask changes are made to a parent, it will also check to
see if it has other children cpusets that use its effective_cpus and
call update_cpumasks_hier() if that is the case.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:29 -08:00
Waiman Long
3881b86128 cpuset: Add an error state to cpuset.sched.partition
When external events like CPU offlining or user events like changing
the cpu list of an ancestor cpuset happen, update_cpumasks_hier()
will be called to update the effective cpus of each of the affected
cpusets. That will then call update_parent_subparts_cpumask() if
partitions are impacted.

Currently, these events may cause update_parent_subparts_cpumask()
to return error if none of the requested cpus are available or it will
consume all the cpus in the parent partition root. Handling these errors
is problematic as the states may become inconsistent.

Instead of letting update_parent_subparts_cpumask() return error, a new
error state (-1) is added to the partition_root_state flag to designate
the fact that the partition is no longer valid. IOW, it is no longer a
real partition root, but the CS_CPU_EXCLUSIVE flag will still be set
as it can be changed back to a real one if favorable change happens
later on.

This new error state is set internally and user cannot write this new
value to "cpuset.sched.partition".

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:29 -08:00
Waiman Long
ee8dde0cd2 cpuset: Add new v2 cpuset.sched.partition flag
A new cpuset.sched.partition boolean flag is added to cpuset v2.
This new flag, if set, indicates that the cgroup is the root of a
new scheduling domain or partition that includes itself and all its
descendants except those that are scheduling domain roots themselves
and their descendants.

With this new flag, one can directly create as many partitions as
necessary without ever using the v1 trick of turning off load balancing
in specific cpusets to create partitions as a side effect.

This new flag is owned by the parent and will cause the CPUs in the
cpuset to be removed from the effective CPUs of its parent.

This is implemented internally by adding a new subparts_cpus mask that
holds the CPUs belonging to child partitions so that:

        subparts_cpus | effective_cpus = cpus_allowed
        subparts_cpus & effective_cpus = 0

This new flag can only be turned on in a cpuset if its parent is a
partition root itself. The state of this flag cannot be changed if the
cpuset has children.

Once turned on, further changes to "cpuset.cpus" is allowed as long
as there is at least one CPU left that can be granted from the parent
and a child partition root cannot use up all the CPUs in the parent's
effective_cpus.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:28 -08:00
Waiman Long
bf92370c03 cpuset: Simply allocation and freeing of cpumasks
The previous commit introduces a new subparts_cpus mask into the cpuset
data structure and a new tmpmasks structure.  Managing the allocation
and freeing of those cpumasks is becoming more complex.

So a number of helper functions are added to simplify and streamline
the management of those cpumasks. To make it simple, all the cpumasks
are now pre-cleared on allocation.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:28 -08:00
Waiman Long
58b7484250 cpuset: Define data structures to support scheduling partition
>From a cpuset point of view, a scheduling partition is a group of
cpusets with their own set of exclusive CPUs that are not shared by
other tasks outside the scheduling partition.

In the legacy hierarchy, scheduling partitions are supported indirectly
via the right use of the load balancing and the exclusive CPUs flag
which is not intuitive and can be hard to use.

To fully support the concept of scheduling partitions in the default
hierarchy, we need to add some new field into the cpuset structure as
well as a new tmpmasks structure that is used to pre-allocate cpumasks
at the top level cpuset functions to avoid memory allocation in inner
functions as memory allocation failure in those inner functions may
cause a cpuset to have inconsistent states.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:27 -08:00
Waiman Long
4ec22e9c5a cpuset: Enable cpuset controller in default hierarchy
Given the fact that thread mode had been merged into 4.14, it is now
time to enable cpuset to be used in the default hierarchy (cgroup v2)
as it is clearly threaded.

The cpuset controller had experienced feature creep since its
introduction more than a decade ago. Besides the core cpus and mems
control files to limit cpus and memory nodes, there are a bunch of
additional features that can be controlled from the userspace. Some of
the features are of doubtful usefulness and may not be actively used.

This patch enables cpuset controller in the default hierarchy with
a minimal set of features, namely just the cpus and mems and their
effective_* counterparts.  We can certainly add more features to the
default hierarchy in the future if there is a real user need for them
later on.

Alternatively, with the unified hiearachy, it may make more sense
to move some of those additional cpuset features, if desired, to
memory controller or may be to the cpu controller instead of staying
with cpuset.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-08 12:27:27 -08:00
Juri Lelli
e6a2d72c10 posix-cpu-timers: Remove useless call to check_dl_overrun()
check_dl_overrun() is used to send a SIGXCPU to users that asked to be
informed when a SCHED_DEADLINE runtime overruns occur.

The function is called by check_thread_timers() already, so the call in
check_process_timers() is redundant/wrong (even though harmless).

Remove it.

Fixes: 34be39305a ("sched/deadline: Implement "runtime overrun signal" support")
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: linux-rt-users@vger.kernel.org
Cc: mtk.manpages@gmail.com
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Claudio Scordino <claudio@evidence.eu.com>
Link: https://lkml.kernel.org/r/20181107111032.32291-1-juri.lelli@redhat.com
2018-11-08 07:43:35 +01:00
Jann Horn
d2f007dbe7 userns: also map extents in the reverse map to kernel IDs
The current logic first clones the extent array and sorts both copies, then
maps the lower IDs of the forward mapping into the lower namespace, but
doesn't map the lower IDs of the reverse mapping.

This means that code in a nested user namespace with >5 extents will see
incorrect IDs. It also breaks some access checks, like
inode_owner_or_capable() and privileged_wrt_inode_uidgid(), so a process
can incorrectly appear to be capable relative to an inode.

To fix it, we have to make sure that the "lower_first" members of extents
in both arrays are translated; and we have to make sure that the reverse
map is sorted *after* the translation (since otherwise the translation can
break the sorting).

This is CVE-2018-18955.

Fixes: 6397fac491 ("userns: bump idmap limits to 340")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Tested-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2018-11-07 23:51:16 -06:00
Jens Axboe
a0fedc857d Merge branch 'irq/for-block' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-4.21/block
Pull in the irq affinity commits, that are staged through Thomas's
tree.

* 'irq/for-block' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/affinity: Add support for allocating interrupt sets
  genirq/affinity: Pass first vector to __irq_build_affinity_masks()
  genirq/affinity: Move two stage affinity spreading into a helper function
  genirq/affinity: Spread IRQs to all available NUMA nodes
2018-11-07 13:43:54 -07:00
Borislav Petkov
f26621e60b resource/docs: Complete kernel-doc style function documentation
Add the missing kernel-doc style function parameters documentation.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: linux-tip-commits@vger.kernel.org
Cc: rdunlap@infradead.org
Fixes: b69c2e20f6 ("resource: Clean it up a bit")
Link: http://lkml.kernel.org/r/20181105093307.GA12445@zn.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-11-07 16:47:47 +01:00
Ke Wu
e84cd7ee63 modsign: use all trusted keys to verify module signature
Make mod_verify_sig to use all trusted keys. This allows keys in
secondary_trusted_keys to be used to verify PKCS#7 signature on a
kernel module.

Signed-off-by: Ke Wu <mikewu@google.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2018-11-07 14:41:41 +01:00
Long Li
e8da8794a7 genirq/matrix: Improve target CPU selection for managed interrupts.
On large systems with multiple devices of the same class (e.g. NVMe disks,
using managed interrupts), the kernel can affinitize these interrupts to a
small subset of CPUs instead of spreading them out evenly.

irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask
of possible target CPUs which has the lowest number of interrupt vectors
allocated.

This is done by searching the CPU with the highest number of available
vectors. While this is correct for non-managed CPUs it can select the wrong
CPU for managed interrupts. Under certain constellations this results in
affinitizing the managed interrupts of several devices to a single CPU in
a set.

The book keeping of available vectors works the following way:

 1) Non-managed interrupts:

    available is decremented when the interrupt is actually requested by
    the device driver and a vector is assigned. It's incremented when the
    interrupt and the vector are freed.

 2) Managed interrupts:

    Managed interrupts guarantee vector reservation when the MSI/MSI-X
    functionality of a device is enabled, which is achieved by reserving
    vectors in the bitmaps of the possible target CPUs. This reservation
    decrements the available count on each possible target CPU.

    When the interrupt is requested by the device driver then a vector is
    allocated from the reserved region. The operation is reversed when the
    interrupt is freed by the device driver. Neither of these operations
    affect the available count.

    The reservation persist up to the point where the MSI/MSI-X
    functionality is disabled and only this operation increments the
    available count again.

For non-managed interrupts the available count is the correct selection
criterion because the guaranteed reservations need to be taken into
account. Using the allocated counter could lead to a failing allocation in
the following situation (total vector space of 10 assumed):

		 CPU0	CPU1
 available:	    2	   0
 allocated:	    5	   3   <--- CPU1 is selected, but available space = 0
 managed reserved:  3	   7

 while available yields the correct result.

For managed interrupts the available count is not the appropriate
selection criterion because as explained above the available count is not
affected by the actual vector allocation.

The following example illustrates that. Total vector space of 10
assumed. The starting point is:

		 CPU0	CPU1
 available:	    5	   4
 allocated:	    2	   3
 managed reserved:  3	   3

 Allocating vectors for three non-managed interrupts will result in
 affinitizing the first two to CPU0 and the third one to CPU1 because the
 available count is adjusted with each allocation:

		  CPU0	CPU1
 available:	     5	   4	<- Select CPU0 for 1st allocation
 --> allocated:	     3	   3

 available:	     4	   4	<- Select CPU0 for 2nd allocation
 --> allocated:	     4	   3

 available:	     3	   4	<- Select CPU1 for 3rd allocation
 --> allocated:	     4	   4

 But the allocation of three managed interrupts starting from the same
 point will affinitize all of them to CPU0 because the available count is
 not affected by the allocation (see above). So the end result is:

		  CPU0	CPU1
 available:	     5	   4
 allocated:	     5	   3

Introduce a "managed_allocated" field in struct cpumap to track the vector
allocation for managed interrupts separately. Use this information to
select the target CPU when a vector is allocated for a managed interrupt,
which results in more evenly distributed vector assignments. The above
example results in the following allocations:

		 CPU0	CPU1
 managed_allocated: 0	   0	<- Select CPU0 for 1st allocation
 --> allocated:	    3	   3

 managed_allocated: 1	   0	<- Select CPU1 for 2nd allocation
 --> allocated:	    3	   4

 managed_allocated: 1	   1	<- Select CPU0 for 3rd allocation
 --> allocated:	    4	   4

The allocation of non-managed interrupts is not affected by this change and
is still evaluating the available count.

The overall distribution of interrupt vectors for both types of interrupts
might still not be perfectly even depending on the number of non-managed
and managed interrupts in a system, but due to the reservation guarantee
for managed interrupts this cannot be avoided.

Expose the new field in debugfs as well.

[ tglx: Clarified the background of the problem in the changelog and
  	described it independent of NVME ]

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Kelley <mikelley@microsoft.com>
Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.com
2018-11-06 23:20:13 +01:00
Linus Torvalds
8053e5b93e Merge tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "Masami found a slight bug in his code where he transposed the
  arguments of a call to strpbrk.

  The reason this wasn't detected in our tests is that the only way this
  would transpire is when a kprobe event with a symbol offset is
  attached to a function that belongs to a module that isn't loaded yet.
  When the kprobe trace event is added, the offset would be truncated
  after it was parsed, and when the module is loaded, it would use the
  symbol without the offset (as the nul character added by the parsing
  would not be replaced with the original character)"

* tag 'trace-v4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/kprobes: Fix strpbrk() argument order
2018-11-06 08:12:10 -08:00
Linus Torvalds
a13511dfa8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Handle errors mid-stream of an all dump, from Alexey Kodanev.

 2) Fix build of openvswitch with certain combinations of netfilter
    options, from Arnd Bergmann.

 3) Fix interactions between GSO and BQL, from Eric Dumazet.

 4) Don't put a '/' in RTL8201F's sysfs file name, from Holger
    Hoffstätte.

 5) S390 qeth driver fixes from Julian Wiedmann.

 6) Allow ipv6 link local addresses for netconsole when both source and
    destination are link local, from Matwey V. Kornilov.

 7) Fix the BPF program address seen in /proc/kallsyms, from Song Liu.

 8) Initialize mutex before use in dsa microchip driver, from Tristram
    Ha.

 9) Out-of-bounds access in hns3, from Yunsheng Lin.

10) Various netfilter fixes from Stefano Brivio, Jozsef Kadlecsik, Jiri
    Slaby, Florian Westphal, Eric Westbrook, Andrey Ryabinin, and Pablo
    Neira Ayuso.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (50 commits)
  net: alx: make alx_drv_name static
  net: bpfilter: fix iptables failure if bpfilter_umh is disabled
  sock_diag: fix autoloading of the raw_diag module
  net: core: netpoll: Enable netconsole IPv6 link local address
  ipv6: properly check return value in inet6_dump_all()
  rtnetlink: restore handling of dumpit return value in rtnl_dump_all()
  net/ipv6: Move anycast init/cleanup functions out of CONFIG_PROC_FS
  bonding/802.3ad: fix link_failure_count tracking
  net: phy: realtek: fix RTL8201F sysfs name
  sctp: define SCTP_SS_DEFAULT for Stream schedulers
  sctp: fix strchange_flags name for Stream Change Event
  mlxsw: spectrum: Fix IP2ME CPU policer configuration
  openvswitch: fix linking without CONFIG_NF_CONNTRACK_LABELS
  qed: fix link config error handling
  net: hns3: Fix for out-of-bounds access when setting pfc back pressure
  net/mlx4_en: use __netdev_tx_sent_queue()
  net: do not abort bulk send on BQL status
  net: bql: add __netdev_tx_sent_queue()
  s390/qeth: report 25Gbit link speed
  s390/qeth: sanitize ARP requests
  ...
2018-11-06 07:44:04 -08:00
Richard Guy Briggs
ea956d8be9 audit: print empty EXECVE args
Empty executable arguments were being skipped when printing out the list
of arguments in an EXECVE record, making it appear they were somehow
lost.  Include empty arguments as an itemized empty string.

Reproducer:
	autrace /bin/ls "" "/etc"
	ausearch --start recent -m execve -i | grep EXECVE
	type=EXECVE msg=audit(10/03/2018 13:04:03.208:1391) : argc=3 a0=/bin/ls a2=/etc

With fix:
	type=EXECVE msg=audit(10/03/2018 21:51:38.290:194) : argc=3 a0=/bin/ls a1= a2=/etc
	type=EXECVE msg=audit(1538617898.290:194): argc=3 a0="/bin/ls" a1="" a2="/etc"

Passes audit-testsuite.  GH issue tracker at
https://github.com/linux-audit/audit-kernel/issues/99

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: cleaned up the commit metadata]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2018-11-05 16:41:49 -05:00
Yangtao Li
4d9ebbe2b0 cgroup: remove unnecessary unlikely()
WARN_ON() already contains an unlikely(), so it's not necessary to use
unlikely.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2018-11-05 08:28:11 -08:00
Masami Hiramatsu
ee474b81fe tracing/kprobes: Fix strpbrk() argument order
Fix strpbrk()'s argument order, it must pass acceptable string
in 2nd argument. Note that this can cause a kernel panic where
it recovers backup character to code->data.

Link: http://lkml.kernel.org/r/154108256792.2604.1816052586385217811.stgit@devbox

Fixes: a6682814f3 ("tracing/kprobes: Allow kprobe-events to record module symbol")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-05 09:47:14 -05:00
Yangtao Li
7d9df98be6 clockevents: Remove unnecessary unlikely()
WARN_ON() and WARN_ON_ONCE() already contains an unlikely(), so it's not
necessary to use unlikely.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20181104023104.2572-1-tiny.windzz@gmail.com
2018-11-05 14:22:54 +01:00