Commit Graph

649779 Commits

Author SHA1 Message Date
Amir Vadai
95c2027bfe net/sched: pedit: make sure that offset is valid
Add a validation function to make sure offset is valid:
1. Not below skb head (could happen when offset is negative).
2. Validate both 'offset' and 'at'.

Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:46:00 -05:00
David S. Miller
a152c91889 Merge branch 'eee-broken-modes'
Jerome Brunet says:

====================
Fix OdroidC2 Gigabit Tx link issue

This patchset fixes an issue with the OdroidC2 board (DWMAC + RTL8211F).
The platform seems to enter LPI on the Rx path too often while performing
relatively high TX transfer. This eventually break the link (both Tx and
Rx), and require to bring the interface down and up again to get the Rx
path working again.

The root cause of this issue is not fully understood yet but disabling EEE
advertisement on the PHY prevent this feature to be negotiated.
With this change, the link is stable and reliable, with the expected
throughput performance.

The patchset adds options in the generic phy driver to disable EEE
advertisement, through device tree. The way it is done is very similar
to the handling of the max-speed property.

Changes since V2: [2]
 - Rename "eee-advert-disable" to "eee-broken-modes" to make the intended
   purpose of this option clear (flag broken configuration, not a
   configuration option)
 - Add DT bindings constants so the DT configuration is more user friendly
 - Submit to net-next instead of net.

Changes since V1: [1]
 - Disable the advertisement of EEE in the generic code instead of the
   realtek driver.

[1] : http://lkml.kernel.org/r/1479220154-25851-1-git-send-email-jbrunet@baylibre.com
[2] : http://lkml.kernel.org/r/1479742524-30222-1-git-send-email-jbrunet@baylibre.com
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:38:31 -05:00
jbrunet
c44a3bc142 dt: bindings: add ethernet phy eee-broken-modes option documentation
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:38:31 -05:00
jbrunet
1fc31357ad dt-bindings: net: add EEE capability constants
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:38:31 -05:00
jbrunet
d853d145ea net: phy: add an option to disable EEE advertisement
This patch adds an option to disable EEE advertisement in the generic PHY
by providing a mask of prohibited modes corresponding to the value found in
the MDIO_AN_EEE_ADV register.

On some platforms, PHY Low power idle seems to be causing issues, even
breaking the link some cases. The patch provides a convenient way for these
platforms to disable EEE advertisement and work around the issue.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Yegor Yefremov <yegorslists@googlemail.com>
Tested-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:38:31 -05:00
Niklas Cassel
436feafe95 net: stmmac: enable tx queue 0 for gmac4 IPs synthesized with multiple TX queues
The dwmac4 IP can synthesized with 1-8 number of tx queues.
On an IP synthesized with DWC_EQOS_NUM_TXQ > 1, all txqueues are disabled
by default. For these IPs, the bitfield TXQEN is R/W.

Always enable tx queue 0. The write will have no effect on IPs synthesized
with DWC_EQOS_NUM_TXQ == 1.

The driver does still not utilize more than one tx queue in the IP.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 19:10:47 -05:00
Bjorn Helgaas
0b457dde3c PCI: Add comments about ROM BAR updating
pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR.  We only update
those when we enable the ROM.

It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled.  That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.

Add comments and references to explain why we can't make the code look more
rational.

The code changes are from 755528c860 ("Ignore disabled ROM resources at
setup") and 8085ce084c ("[PATCH] Fix PCI ROM mapping").

Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
7a6d312b50 PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE
Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
286c2378aa PCI: Remove pci_resource_bar() and pci_iov_resource_bar()
pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.

Compute the BAR address inline and remove pci_resource_bar().  That makes
pci_iov_resource_bar() unused, so remove that as well.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
546ba9f8f2 PCI: Don't update VF BARs while VF memory space is enabled
If we update a VF BAR while it's enabled, there are two potential problems:

  1) Any driver that's using the VF has a cached BAR value that is stale
     after the update, and

  2) We can't update 64-bit BARs atomically, so the intermediate state
     (new lower dword with old upper dword) may conflict with another
     device, and an access by a driver unrelated to the VF may cause a bus
     error.

Warn about attempts to update VF BARs while they are enabled.  This is a
programming error, so use dev_WARN() to get a backtrace.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Bjorn Helgaas
6ffa2489c5 PCI: Separate VF BAR updates from standard BAR updates
Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.

Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).

This patch:

  - Renames pci_update_resource() to pci_std_update_resource(),
  - Adds pci_iov_update_resource(),
  - Makes pci_update_resource() a wrapper that calls the appropriate one,

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
2016-11-29 18:05:09 -06:00
Linus Torvalds
faaae2a581 Re-enable CONFIG_MODVERSIONS in a slightly weaker form
This enables CONFIG_MODVERSIONS again, but allows for missing symbol CRC
information in order to work around the issue that newer binutils
versions seem to occasionally drop the CRC on the floor.  binutils 2.26
seems to work fine, while binutils 2.27 seems to break MODVERSIONS of
symbols that have been defined in assembler files.

[ We've had random missing CRC's before - it may be an old problem that
  just is now reliably triggered with the weak asm symbols and a new
  version of binutils ]

Some day I really do want to remove MODVERSIONS entirely.  Sadly, today
does not appear to be that day: Debian people apparently do want the
option to enable MODVERSIONS to make it easier to have external modules
across kernel versions, and this seems to be a fairly minimal fix for
the annoying problem.

Cc: Ben Hutchings <ben@decadent.org.uk>
Acked-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-11-29 16:01:30 -08:00
Peter Robinson
530742e707 net: arc_emac: add dependencies on associated arches and compile test
Add dependencies on the architectures that support these devices and
add compile test to ensure ongoing code build coverage.

Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 18:57:36 -05:00
Andreas Färber
b26bff6e52 MAINTAINERS: Add device tree bindings to mv88e6xx section
Also include the netdev list for convenience, as done elsewhere.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Cc: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-29 18:55:52 -05:00
Konstantin Khlebnikov
e8d7c33232 md/raid5: limit request size according to implementation limits
Current implementation employ 16bit counter of active stripes in lower
bits of bio->bi_phys_segments. If request is big enough to overflow
this counter bio will be completed and freed too early.

Fortunately this not happens in default configuration because several
other limits prevent that: stripe_cache_size * nr_disks effectively
limits count of active stripes. And small max_sectors_kb at lower
disks prevent that during normal read/write operations.

Overflow easily happens in discard if it's enabled by module parameter
"devices_handle_discard_safely" and stripe_cache_size is set big enough.

This patch limits requests size with 256Mb - 8Kb to prevent overflows.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Shaohua Li <shli@kernel.org>
Cc: Neil Brown <neilb@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-29 15:53:21 -08:00
Chao Yu
0002b61bda f2fs: return AOP_WRITEPAGE_ACTIVATE for writepage
We should use AOP_WRITEPAGE_ACTIVATE when we bypass writing pages.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-29 15:43:00 -08:00
Jaegeuk Kim
26787236b3 f2fs: do not activate auto_recovery for fallocated i_size
If a file needs to keep its i_size by fallocate, we need to turn off auto
recovery during roll-forward recovery.

This will resolve the below scenario.

1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync"
2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync"
3. md5sum /mnt/f2fs/file;
4. godown /mnt/f2fs/
5. umount /mnt/f2fs/
6. mount -t f2fs /dev/sdx /mnt/f2fs
7. md5sum /mnt/f2fs/file

Reported-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-11-29 15:42:58 -08:00
Derek Foreman
26fc78f6fe drm/vc4: Fix race between page flip completion event and clean-up
There was a small window where a userspace program could submit
a pageflip after receiving a pageflip completion event yet still
receive EBUSY.

Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2016-11-29 15:39:45 -08:00
Long Li
0de8ce3ee8 PCI: hv: Allocate physically contiguous hypercall params buffer
hv_do_hypercall() assumes that we pass a segment from a physically
contiguous buffer.  A buffer allocated on the stack may not work if
CONFIG_VMAP_STACK=y is set.

Use kmalloc() to allocate this buffer.

Reported-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
2016-11-29 17:22:43 -06:00
JackieLiu
1a0ec5c30c md/raid5-cache: do not need to set STRIPE_PREREAD_ACTIVE repeatedly
R5c_make_stripe_write_out has set this flag, do not need to set again.

Signed-off-by: JackieLiu <liuyun01@kylinos.cn>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-29 14:46:23 -08:00
JackieLiu
dbd22c8d7f md/raid5-cache: remove the unnecessary next_cp_seq field from the r5l_log
The next_cp_seq field is useless, remove it.

Signed-off-by: JackieLiu <liuyun01@kylinos.cn>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-29 14:46:22 -08:00
JackieLiu
bc8f167f9c md/raid5-cache: release the stripe_head at the appropriate location
If we released the 'stripe_head' in r5c_recovery_flush_log,
ctx->cached_list will both release the data-parity stripes and
data-only stripes, which will become empty.
And we also need to use the data-only stripes in
r5c_recovery_rewrite_data_only_stripes, so we should wait util rewrite
data-only stripes is done before releasing them.

Reviewed-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
Reviewed-by: Song Liu <songliubraving@fb.com>
Signed-off-by: JackieLiu <liuyun01@kylinos.cn>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-29 14:46:22 -08:00
JackieLiu
fc833c2a2f md/raid5-cache: use ring add to prevent overflow
'write_pos' must be protected with 'r5l_ring_add', or it may overflow

Signed-off-by: JackieLiu <liuyun01@kylinos.cn>
Reviewed-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-29 14:46:21 -08:00
JackieLiu
9b69173e5c md/raid5-cache: remove unnecessary function parameters
The function parameter 'recovery_list' is not used in
body, we can delete it

Signed-off-by: JackieLiu <liuyun01@kylinos.cn>
Reviewed-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-29 14:46:21 -08:00
Zhengyuan Liu
462eb7d872 raid5-cache: don't set STRIPE_R5C_PARTIAL_STRIPE flag while load stripe into cache
r5c_recovery_load_one_stripe should not set STRIPE_R5C_PARTIAL_STRIPE flag,as
the data-only stripe may be STRIPE_R5C_FULL_STRIPE stripe. The state machine
would release the stripe later and add it into neither r5c_cached_full_stripes
list or r5c_cached_partial_stripes list and set correct flag.

Reviewed-by: JackieLiu <liuyun01@kylinos.cn>
Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn>
Signed-off-by: Shaohua Li <shli@fb.com>
2016-11-29 14:45:14 -08:00
Daniel Vetter
661a375561 drm: Fix locking cargo-cult in encoder/plane init/cleanup
Encoders&planes can't be hotplugged, we dont need locking for this
since it's all single-threaded driver setup/teardown code. CRTCs
already don't grab locks.

While at it I noticed that plane's are missing the
drm_modeset_lock_fini() call, so add it.

Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129094538.9650-1-daniel.vetter@ffwll.ch
2016-11-29 23:34:36 +01:00
Daniel Vetter
b07c42547b drm/doc: Fix indenting in drm_modeset_lock.c comment
This isn't part of the code snippet anymore ...

Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129092440.6940-1-daniel.vetter@ffwll.ch
2016-11-29 23:34:36 +01:00
Jacob Pan
feb6cd6a0f thermal/intel_powerclamp: stop sched tick in forced idle
With the introduction of play_idle(), idle injection kthread can
go through the normal idle task processing to get correct accounting
and turn off scheduler tick when possible.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 23:34:11 +01:00
Sebastian Andrzej Siewior
cb91fef1b7 thermal/intel_powerclamp: Convert to CPU hotplug state
This is a conversation to the new hotplug state machine with
the difference that CPU_DEAD becomes CPU_PREDOWN.

At the same time it makes the handling of the two states symmetrical.
stop_power_clamp_worker() is called unconditionally and the controversial
error message is removed.

Finally, the hotplug state callbacks are removed after the powerclamping
is stopped to avoid a potential race.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[pmladek@suse.com: Fixed the possible race in powerclamp_exit()]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 23:34:11 +01:00
Petr Mladek
8d962ac7f3 thermal/intel_powerclamp: Convert the kthread to kthread worker API
Kthreads are currently implemented as an infinite loop. Each
has its own variant of checks for terminating, freezing,
awakening. In many cases it is unclear to say in which state
it is and sometimes it is done a wrong way.

The plan is to convert kthreads into kthread_worker or workqueues
API. It allows to split the functionality into separate operations.
It helps to make a better structure. Also it defines a clean state
where no locks are taken, IRQs blocked, the kthread might sleep
or even be safely migrated.

The kthread worker API is useful when we want to have a dedicated
single thread for the work. It helps to make sure that it is
available when needed. Also it allows a better control, e.g.
define a scheduling priority.

This patch converts the intel powerclamp kthreads into the kthread
worker because they need to have a good control over the assigned
CPUs.

IMHO, the most natural way is to split one cycle into two works.
First one does some balancing and let the CPU work normal
way for some time. The second work checks what the CPU has done
in the meantime and put it into C-state to reach the required
idle time ratio. The delay between the two works is achieved
by the delayed kthread work.

The two works have to share some data that used to be local
variables of the single kthread function. This is achieved
by the new per-CPU struct kthread_worker_data. It might look
as a complication. On the other hand, the long original kthread
function was not nice either.

The patch tries to avoid extra init and cleanup works. All the
actions might be done outside the thread. They are moved
to the functions that create or destroy the worker. Especially,
I checked that the timers are assigned to the right CPU.

The two works are queuing each other. It makes it a bit tricky to
break it when we want to stop the worker. We use the global and
per-worker "clamping" variables to make sure that the re-queuing
eventually stops. We also cancel the works to make it faster.
Note that the canceling is not reliable because the handling
of the two variables and queuing is not synchronized via a lock.
But it is not a big deal because it is just an optimization.
The job is stopped faster than before in most cases.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 23:34:10 +01:00
Petr Mladek
14f3f7d8cb thermal/intel_powerclamp: Remove duplicated code that starts the kthread
This patch removes code duplication. It does not modify
the functionality.

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 23:34:10 +01:00
Hans de Goede
6276e53fa8 ACPI / video: Add force_native quirk for HP Pavilion dv6
The HP Pavilion dv6 has a non-working acpi_video0 backlight interface
and an intel_backlight interface which works fine. Add a force_native
quirk for it so that the non-working acpi_video0 interface does not get
registered.

Note that there are quite a few HP Pavilion dv6 variants, some
woth ATI and some with NVIDIA hybrid gfx, both seem to need this
quirk to have working backlight control. There are also some versions
with only Intel integrated gfx, these may not need this quirk, but it
should not hurt there.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1204476
Link: https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1416940
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 23:27:17 +01:00
Hans de Goede
350fa038c3 ACPI / video: Add force_native quirk for Dell XPS 17 L702X
The Dell XPS 17 L702X has a non-working acpi_video0 backlight interface
and an intel_backlight interface which works fine. Add a force_native
quirk for it so that the non-working acpi_video0 interface does not get
registered.

Note that there also is an issue with the brightnesskeys on this laptop,
they do not generate key-press events in anyway. That is not solved by
this patch.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1123661
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 23:27:16 +01:00
Zhang Rui
7020bcb828 ACPI: do not warn if _BQC does not exist
Starting from ACPI spec 3.0, it's only clarified that _BCM control
method is required if _BCL is implemented. There is no word
saying _BQC is required.

And in ACPI spec 6.1 B.5.4, for _BQC, it is explicitly stated that
"This optional method returns the current brightness level of a
built-in display output device. If present, it must be set by
the platform for initial brightness."

Thus remove the obsolete warning message.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 23:19:46 +01:00
Alan Tull
e499807737 MAINTAINERS: add git url for fpga
Add git url for fpga stuff.

Signed-off-by: Alan Tull <atull@opensource.altera.com>
2016-11-29 15:51:50 -06:00
Jason Gunthorpe
1d7f1589d3 fpga: Clarify how write_init works streaming modes
This interface was designed for streaming, but write_init's buf
argument has an unclear purpose. Define it to be the first bytes
of the bitstream. Each driver gets to set how many bytes (at most)
it wants to see. Short bitstreams will be passed through as-is, while
long ones will be truncated.

The intent is to allow drivers to peek at the header before the transfer
actually starts.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
2016-11-29 15:51:49 -06:00
Jason Gunthorpe
340c0c53ea fpga zynq: Fix incorrect ISR state on bootup
It is best practice to clear and mask all interrupts before
associating the IRQ, and this should be done after the clock
is enabled.

This corrects a bad result from zynq_fpga_ops_state on bootup
where left over latched values in INT_STS_OFFSET caused it to
report an unconfigured FPGA as configured.

After this change the boot up operating state for an unconfigured
FPGA reports 'unknown'.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
Acked-by: Moritz Fischer <moritz.fischer@ettus.com>
2016-11-29 15:51:48 -06:00
Jason Gunthorpe
80baf649c2 fpga zynq: Remove priv->dev
socfpga uses mgr->dev for debug prints, there should be consistency
here, so standardize on that. The only other use was for dma
which can be replaced with mgr->dev.parent.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
Acked-by: Moritz Fischer <moritz.fischer@ettus.com>
2016-11-29 15:51:46 -06:00
Jason Gunthorpe
1930c28651 fpga zynq: Add missing \n to messages
Function dev_err doesn't add a newline at the end of the string. This will
lead to a hard to read kernel log.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Moritz Fischer <moritz.fischer@ettus.com>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
2016-11-29 15:51:45 -06:00
Jason Gunthorpe
a0e1b61858 fpga: Add COMPILE_TEST to all drivers
Like Zynq the Altera drivers compile fine on x86 and others too,
so make it easier to compile test this stuff.

A10 requires REGMAP_MMIO to compile, so be explicit rather than
relying on it via ARCH_SOCFPGA.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Alan Tull <atull@opensource.altera.com>
2016-11-29 15:51:44 -06:00
Chuck Lever
3a72dc771c xprtrdma: Relocate connection helper functions
Clean up: Disentangle connection helpers from RPC-over-RDMA reply
decoding functions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
c351f94387 xprtrdma: Update dprintk in rpcrdma_count_chunks
Clean up: offset and handle should be zero-filled, just like in the
chunk encoders.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
2f6922ca33 xprtrdma: Shorten QP access error message
Clean up: The convention for this type of warning message is not to
show the function name or "RPC:       ".

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
6d6bf72de9 xprtrdma: Squelch "max send, max recv" messages at connect time
Clean up: This message was intended to be a dprintk, as it is on the
server-side.

Fixes: 87cfb9a0c8 ('xprtrdma: Client-side support for ...')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
289400af2b xprtrdma: Update documenting comment
Clean up: If reset fails, FRMRs are no longer abandoned, rather
they are released immediately. Update the comment to reflect this.

Fixes: 2ffc871a57 ('xprtrdma: Release orphaned MRs immediately')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
a100fda1a2 xprtrdma: Refactor FRMR invalidation
Clean up: After some recent updates, clarifications can be made to
the FRMR invalidation logic.

- Both the remote and local invalidation case mark the frmr INVALID,
  so make that a common path.

- Manage the WR list more "tastefully" by replacing the conditional
  that discriminates between the list head and ->next pointers.

- Use mw->mw_handle in all cases, since that has the same value as
  f->fr_mr->rkey, and is already in cache.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
48016dce46 xprtrdma: Avoid calls to ro_unmap_safe()
Micro-optimization: Most of the time, calls to ro_unmap_safe are
expensive no-ops. Call only when there is work to do.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
109b88ab9d xprtrdma: Address coverity complaint about wait_for_completion()
> ** CID 114101:  Error handling issues  (CHECKED_RETURN)
> /net/sunrpc/xprtrdma/verbs.c: 355 in rpcrdma_create_id()

Commit 5675add36e ("RPC/RDMA: harden connection logic against
missing/late rdma_cm upcalls.") replaced wait_for_completion() calls
with these two call sites.

The original wait_for_completion() calls were added in the initial
commit of verbs.c, which was commit c56c65fb67 ("RPCRDMA: rpc rdma
verbs interface implementation"), but these returned void.

rpcrdma_create_id() is called by the RDMA connect worker, which
probably won't ever be interrupted. It is also called by
rpcrdma_ia_open which is in the synchronous mount path, and ^C is
possible there.

Add a bit of logic at those two call sites to return if the waits
return ERESTARTSYS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
ae09531d3c SUNRPC: Proper metric accounting when RPC is not transmitted
I noticed recently that during an xfstests on a krb5i mount, the
retransmit count for certain operations had gone negative, and the
backlog value became unreasonably large. I recall that Andy has
pointed this out to me in the past.

When call_refresh fails to find a valid credential for an RPC, the
RPC exits immediately without sending anything on the wire. This
leaves rq_ntrans, rq_xtime, and rq_rtt set to zero.

The solution for om_queue is to not add the to RPC's running backlog
queue total whenever rq_xtime is zero.

For om_ntrans, it's a bit more difficult. A zero rq_ntrans causes
om_ops to become larger than om_ntrans. The design of the RPC
metrics API assumes that ntrans will always be equal to or larger
than the ops count. The result is that when an RPC fails to find
credentials, the RPC operation's reported retransmit count, which is
computed in user space as the difference between ops and ntrans,
goes negative.

Ideally the kernel API should report a separate retransmit and
"exited before initial transmission" metric, so that user space can
sort out the difference properly.

To avoid kernel API changes and changes to the way rq_ntrans is used
when performing transport locking, account for untransmitted RPCs
so that om_ntrans keeps up with om_ops: always add one or more to
om_ntrans.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00
Chuck Lever
5e9fc6a06b xprtrdma: Support for SG_GAP devices
Some devices (such as the Mellanox CX-4) can register, under a
single R_key, a set of memory regions that are not contiguous. When
this is done, all the segments in a Reply list, say, can then be
invalidated in a single LocalInv Work Request (or via Remote
Invalidation, which can invalidate exactly one R_key when completing
a Receive).

This means a single FastReg WR is used to register, and one or zero
LocalInv WRs can invalidate, the memory involved with RDMA transfers
on behalf of an RPC.

In addition, xprtrdma constructs some Reply chunks from three or
more segments. By registering them with SG_GAP, only one segment
is needed for the Reply chunk, allowing the whole chunk to be
invalidated remotely.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2016-11-29 16:45:44 -05:00