- Add AMD Zen Raven and Renoir Root Ports to P2PDMA whitelist (Alex
Deucher)
* pci/p2pdma:
PCI/P2PDMA: Add AMD Zen Raven and Renoir Root Ports to whitelist
- Clarify that platform_get_irq() should never return 0 (Bjorn Helgaas)
- Check for platform_get_irq() failure consistently (Bjorn Helgaas)
- Replace zero-length array with flexible-array (Gustavo A. R. Silva)
- Unify pcie_find_root_port() and pci_find_pcie_root_port() (Yicong Yang)
- Quirk Intel C620 MROMs, which have non-BARs in BAR locations (Xiaochun
Lee)
- Fix pcie_pme_resume() and pcie_pme_remove() kernel-doc (Jay Fang)
- Rename _DSM constants to align with spec (Krzysztof Wilczyński)
* pci/misc:
PCI: Rename _DSM constants to align with spec
PCI/PME: Fix kernel-doc of pcie_pme_resume() and pcie_pme_remove()
x86/PCI: Mark Intel C620 MROMs as having non-compliant BARs
PCI: Unify pcie_find_root_port() and pci_find_pcie_root_port()
PCI: Replace zero-length array with flexible-array
PCI: Check for platform_get_irq() failure consistently
driver core: platform: Clarify that IRQ 0 is invalid
- Remove unused pciehp EMI() and HP_SUPR_RM() macros (Ani Sinha)
- Use of_node_name_eq() for node name comparisons (Rob Herring)
- Convert shpchp_unconfigure_device() to void (Krzysztof Wilczynski)
* pci/hotplug:
PCI: shpchp: Make shpchp_unconfigure_device() void
PCI: Use of_node_name_eq() for node name comparisons
PCI: pciehp: Remove unused EMI() and HP_SUPR_RM() macros
- Log only ACPI_NOTIFY_DISCONNECT_RECOVER events for EDR, not all ACPI
SYSTEM-level events (Kuppuswamy Sathyanarayanan)
- Rely only on _OSC (not _OSC + HEST FIRMWARE_FIRST) to negotiate AER
Capability ownership (Alexandru Gagniuc)
- Remove HEST/FIRMWARE_FIRST parsing that was previously used to help
intuit AER Capability ownership (Kuppuswamy Sathyanarayanan)
- Remove redundant pci_is_pcie() and dev->aer_cap checks (Kuppuswamy
Sathyanarayanan)
- Print IRQ number used by DPC (Yicong Yang)
* pci/error:
PCI/DPC: Print IRQ number used by port
PCI/AER: Use "aer" variable for capability offset
PCI/AER: Remove redundant dev->aer_cap checks
PCI/AER: Remove redundant pci_is_pcie() checks
PCI/AER: Remove HEST/FIRMWARE_FIRST parsing for AER ownership
PCI/AER: Use only _OSC to determine AER ownership
PCI/EDR: Log only ACPI_NOTIFY_DISCONNECT_RECOVER events
Pull chrome platform updates from Benson Leung:
"cros_ec_typec:
- Add notifier for update, and register port partner
Sensors/iio:
- Fixes to cros_ec_sensorhub around allocation of resources, and
send_sample
Wilco EC:
- Fix to output format of h1_gpio
Misc:
- Misc fixes to appease kernel-doc and other warnings
- Set user space log size in chromeos_pstore"
* tag 'tag-chrome-platform-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_usbpd_logger: Add __printf annotation to append_str()
platform/chrome: cros_ec_i2c: Appease the kernel-doc deity
platform/chrome: typec: Fix ret value check error
platform/chrome: cros_ec_typec: Register port partner
platform/chrome: cros_ec_typec: Add struct for port data
platform/chrome: cros_ec_typec: Use notifier for updates
platform/chrome: cros_ec_ishtp: free ishtp buffer before sending event
platform/chrome: cros_ec_ishtp: skip old cros_ec responses
platform/chrome: wilco_ec: Provide correct output format to 'h1_gpio' file
platform/chrome: chromeos_pstore: set user space log size
Loongson-3 overrides lwc2 instructions to implement CPUCFG and CSR
read/write functions. These instructions all cause guest exit so CSR
doesn't benifit KVM guest (and there are always legacy methods to
provide the same functions as CSR). So, we only emulate CPUCFG and let
it return a reduced feature list (which means the virtual CPU doesn't
have any other advanced features, including CSR) in KVM.
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <1590220602-3547-12-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch add Loongson-3 Virtual IPI interrupt support in the kernel.
The current implementation of IPI emulation in QEMU is based on GIC for
MIPS, but Loongson-3 doesn't use GIC. Furthermore, IPI emulation in QEMU
is too expensive for performance (because of too many context switches
between Host and Guest). With current solution, the IPI delay may even
cause RCU stall warnings in a multi-core Guest. So, we design a faster
solution that emulate IPI interrupt in kernel (only used by Loongson-3
now).
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <1590220602-3547-11-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull watchdog updates from Wim Van Sebroeck:
- add new arm_smc_wdt watchdog driver
- da9062 and da9063 improvements
- clarify documentation about stop() that became optional
- document r8a7742 support
- some overall fixes and improvements
* tag 'linux-watchdog-5.8-rc1' of git://www.linux-watchdog.org/linux-watchdog:
watchdog: m54xx: Add missing include
dt-bindings: watchdog: renesas,wdt: Document r8a7742 support
watchdog: Fix runtime PM imbalance on error
watchdog: riowd: remove unneeded semicolon
watchdog: Add new arm_smc_wdt watchdog driver
dt-bindings: watchdog: Add ARM smc wdt for mt8173 watchdog
watchdog: imx2_wdt: update contact email
watchdog: iTCO: fix link error
watchdog: da9062: No need to ping manually before setting timeout
watchdog: da9063: Make use of pre-configured timeout during probe
watchdog: da9062: Initialize timeout during probe
watchdog: clarify that stop() is optional
watchdog: imx_sc_wdt: Fix reboot on crash
watchdog: ts72xx_wdt: fix build error
In current implementation, MIPS KVM uses IP2, IP3, IP4 and IP7 for
external interrupt, two kinds of IPIs and timer interrupt respectively,
but Loongson-3 based machines prefer to use IP2, IP3, IP6 and IP7 for
two kinds of external interrupts, IPI and timer interrupt. So we define
two priority-irq mapping tables: kvm_loongson3_priority_to_irq[] for
Loongson-3, and kvm_default_priority_to_irq[] for others. The virtual
interrupt infrastructure is updated to deliver all types of interrupts
from IP2, IP3, IP4, IP6 and IP7.
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <1590220602-3547-10-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM guest has two levels of address translation: guest tlb translates
GVA to GPA, and root tlb translates GPA to HPA. By default guest's CCA
is controlled by guest tlb, but Loongson-3 maintains all cache coherency
by hardware (including multi-core coherency and I/O DMA coherency) so it
prefers all guest mappings be cacheable mappings. Thus, we use root tlb
to control guest's CCA for Loongson-3.
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Co-developed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <1590220602-3547-8-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a CPU support more than 32bit vmbits (which is true for 64bit CPUs),
VPN2_MASK set to fixed 0xffffe000 will lead to a wrong EntryHi in some
functions such as _kvm_mips_host_tlb_inv().
The cpu_vmbits definition of 32bit CPU in cpu-features.h is 31, so we
still use the old definition.
Cc: Stable <stable@vger.kernel.org>
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Signed-off-by: Xing Li <lixing@loongson.cn>
[Huacai: Improve commit messages]
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Message-Id: <1590220602-3547-3-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The code in decode_config4() of arch/mips/kernel/cpu-probe.c
asid_mask = MIPS_ENTRYHI_ASID;
if (config4 & MIPS_CONF4_AE)
asid_mask |= MIPS_ENTRYHI_ASIDX;
set_cpu_asid_mask(c, asid_mask);
set asid_mask to cpuinfo->asid_mask.
So in order to support variable ASID_MASK, KVM_ENTRYHI_ASID should also
be changed to cpu_asid_mask(&boot_cpu_data).
Cc: Stable <stable@vger.kernel.org> #4.9+
Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>
Signed-off-by: Xing Li <lixing@loongson.cn>
[Huacai: Change current_cpu_data to boot_cpu_data for optimization]
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Message-Id: <1590220602-3547-2-git-send-email-chenhc@lemote.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We already have the buffer selected, but we should set the iter list
again.
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull backlight updates from Lee Jones:
"Core Framework:
- Add backlight_device_get_by_name() to the API
New Device Support:
- Add support for WLED5 to Qualcomm WLED
Fix-ups:
- Convert to GPIO descriptors in l4f00242t03
- Device Tree fix-ups for qcom-wled
Bug Fixes:
- Properly disable regulators on .probe() failure"
* tag 'backlight-next-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
backlight: Add backlight_device_get_by_name()
backlight: qcom-wled: Add support for WLED5 peripheral that is present on PM8150L PMICs
dt-bindings: backlight: qcom-wled: Add WLED5 bindings
backlight: qcom-wled: Add callback functions
dt-bindings: backlight: qcom-wled: Convert the wled bindings to .yaml format
backlight: l4f00242t03: Convert to GPIO descriptors
backlight: lp855x: Ensure regulators are disabled on probe failure
Pull MFD updates from Lee Jones:
"Core Frameworks:
- Constify 'properties' attribute in core header file
New Drivers:
- Add support for Gateworks System Controller
- Add support for MediaTek MT6358 PMIC
- Add support for Mediatek MT6360 PMIC
- Add support for Monolithic Power Systems MP2629 ADC and Battery charger
Fix-ups:
- Use new I2C API in htc-i2cpld
- Remove superfluous code in sprd-sc27xx-spi
- Improve error handling in stm32-timers
- Device Tree additions/fixes in mt6397
- Defer probe betterment in wm8994-core
- Improve module handling in wm8994-core
- Staticify in stpmic1
- Trivial (spelling, formatting) in tqmx86
Bug Fixes:
- Fix incorrect register/PCI IDs in intel-lpss-pci
- Fix unbalanced Regulator API calls in wm8994-core
- Fix double free() in wcd934x
- Remove IRQ domain on failure in stmfx
- Reset chip on resume in stmfx
- Disable/enable IRQs on suspend/resume in stmfx
- Do not use bulk writes on H/W which does not support them in max77620"
* tag 'mfd-next-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (29 commits)
mfd: mt6360: Remove duplicate REGMAP_IRQ_REG_LINE() entry
mfd: Add support for PMIC MT6360
mfd: max77620: Use single-byte writes on MAX77620
mfd: wcd934x: Drop kfree for memory allocated with devm_kzalloc
mfd: stmfx: Disable IRQ in suspend to avoid spurious interrupt
mfd: stmfx: Fix stmfx_irq_init error path
mfd: stmfx: Reset chip on resume as supply was disabled
mfd: wm8994: Silence warning about supplies during deferred probe
mfd: wm8994: Fix unbalanced calls to regulator_bulk_disable()
mfd: wm8994: Fix driver operation if loaded as modules
dt-bindings: mfd: mediatek: Add MT6397 Pin Controller
mfd: Constify properties in mfd_cell
mfd: stm32-timers: Use dma_request_chan() instead dma_request_slave_channel()
mfd: sprd: Remove unnecessary spi_bus_type setting
mfd: intel-lpss: Update LPSS UART #2 PCI ID for Jasper Lake
mfd: tqmx86: Fix a typo in MODULE_DESCRIPTION
mfd: stpmic1: Make stpmic1_regmap_config static
mfd: htc-i2cpld: Convert to use i2c_new_client_device()
MAINTAINERS: Add entry for mp2629 Battery Charger driver
power: supply: mp2629: Add impedance compensation config
...
Pull smack updates from Casey Schaufler:
"Clean out dead code and repair an out-of-bounds warning"
* tag 'Smack-for-5.8' of git://github.com/cschaufler/smack-next:
Smack: Remove unused inline function smk_ad_setfield_u_fs_path_mnt
Smack:- Remove redundant inode_smack cache
Smack:- Remove mutex lock "smk_lock" from inode_smack
Smack: slab-out-of-bounds in vsscanf
smack: remove redundant structure variable from header.
smack: avoid unused 'sip' variable warning
Pull keyring updates from David Howells:
- Fix a documentation warning.
- Replace a zero-length array with a flexible one
- Make the big_key key type use ChaCha20Poly1305 and use the crypto
algorithm directly rather than going through the crypto layer.
- Implement the update op for the big_key type.
* tag 'keys-next-20200602' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
keys: Implement update for the big_key type
security/keys: rewrite big_key crypto to use library interface
KEYS: Replace zero-length array with flexible-array
Documentation: security: core.rst: add missing argument
Pull perf tooling updates from Arnaldo Carvalho de Melo:
"These are additional changes to the perf tools, on top of what Ingo
already submitted.
- Further Intel PT call-trace fixes
- Improve SELinux docs and tool warnings
- Fix race at exit in 'perf record' using eventfd.
- Add missing build tests to the default set of 'make -C tools/perf
build-test'
- Sync msr-index.h getting new AMD MSRs to decode and filter in 'perf
trace'.
- Fix fallback to libaudit in 'perf trace' for arches not using
per-arch *.tbl files.
- Fixes for 'perf ftrace'.
- Fixes and improvements for the 'perf stat' metrics.
- Use dummy event to get PERF_RECORD_{FORK,MMAP,etc} while
synthesizing those metadata events for pre-existing threads.
- Fix leaks detected using clang tooling.
- Improvements to PMU event metric testing.
- Report summary for 'perf stat' interval mode at the end, summing up
all the intervals.
- Improve pipe mode, i.e. this now works as expected, continuously
dumping samples:
# perf record -g -e raw_syscalls:sys_enter | perf --no-pager script
- Fixes for event grouping, detecting incompatible groups such as:
# perf stat -e '{cycles,power/energy-cores/}' -v
WARNING: group events cpu maps do not match, disabling group:
anon group { power/energy-cores/, cycles }
power/energy-cores/: 0
cycles: 0-7
- Fixes for 'perf probe': blacklist address checking, number of
kretprobe instances, etc.
- JIT processing improvements and fixes plus the addition of a 'perf
test' entry for the java demangler.
- Add support for synthesizing first/last level cache, TLB and remove
access events from HW tracing in the auxtrace code, first to use is
ARM SPE.
- Vendor events updates and fixes, including for POWER9 and Intel.
- Allow using ~/.perfconfig for removing the ',' separators in 'perf
stat' output.
- Opt-in support for libpfm4"
* tag 'perf-tools-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (120 commits)
perf tools: Remove some duplicated includes
perf symbols: Fix kernel maps for kcore and eBPF
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf stat: Ensure group is defined on top of the same cpu mask
perf libdw: Fix off-by 1 relative directory includes
perf arm-spe: Support synthetic events
perf auxtrace: Add four itrace options
perf tools: Move arm-spe-pkt-decoder.h/c to the new dir
perf test: Initialize memory in dwarf-unwind
perf tests: Don't tail call optimize in unwind test
tools compiler.h: Add attribute to disable tail calls
perf build: Add a LIBPFM4=1 build test entry
perf tools: Add optional support for libpfm4
perf tools: Correct license on jsmn JSON parser
perf jit: Fix inaccurate DWARF line table
perf jvmti: Remove redundant jitdump line table entries
perf build: Add NO_SDT=1 to the default set of build tests
perf build: Add NO_LIBCRYPTO=1 to the default set of build tests
perf build: Add NO_SYSCALL_TABLE=1 to the build tests
perf build: Remove libaudit from the default feature checks
...
Fail recv/send in case of IORING_SETUP_IOPOLL earlier during prep,
so it'd be done only once. Removes duplication as well
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_openat_prep() and io_openat2_prep() are identical except for how
struct open_how is built. Deduplicate it with a helper.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
build_open_how() is just adjusting open_flags/mode. Do it once during
prep. It looks better than storing raw values for the future.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
IORING_SETUP_IOPOLL is defined only for read/write, other opcodes should
be disallowed, otherwise it'll get an error as below. Also refuse
open/close with SQPOLL, as the polling thread wouldn't know which file
table to use.
RIP: 0010:io_iopoll_getevents+0x111/0x5a0
Call Trace:
? _raw_spin_unlock_irqrestore+0x24/0x40
? do_send_sig_info+0x64/0x90
io_iopoll_reap_events.part.0+0x5e/0xa0
io_ring_ctx_wait_and_kill+0x132/0x1c0
io_uring_release+0x20/0x30
__fput+0xcd/0x230
____fput+0xe/0x10
task_work_run+0x67/0xa0
do_exit+0x353/0xb10
? handle_mm_fault+0xd4/0x200
? syscall_trace_enter+0x18c/0x2c0
do_group_exit+0x43/0xa0
__x64_sys_exit_group+0x18/0x20
do_syscall_64+0x60/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: allow provide/remove buffers and files update]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Unconditionally return true when querying the validity of
MSR_IA32_PERF_CAPABILITIES so as to defer the validity check to
intel_pmu_{get,set}_msr(), which can properly give the MSR a pass when
the access is initiated from host userspace. The MSR is emulated so
there is no underlying hardware dependency to worry about.
Fixes: 27461da310 ("KVM: x86/pmu: Support full width counting")
Cc: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200603203303.28545-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the SCS debug usage check so that we report the number of bytes
used, rather than the number of entries.
Fixes: 5bbaf9d1fc ("scs: Add support for stack usage debugging")
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Will Deacon <will@kernel.org>
After commit 63d0434 ("KVM: x86: move kvm_create_vcpu_debugfs after
last failure point") we are creating the pre-vCPU debugfs files
after the creation of the vCPU file descriptor. This makes it
possible for userspace to reach kvm_vcpu_release before
kvm_create_vcpu_debugfs has finished. The vcpu->debugfs_dentry
then does not have any associated inode anymore, and this causes
a NULL-pointer dereference in debugfs_create_file.
The solution is simply to avoid removing the files; they are
cleaned up when the VM file descriptor is closed (and that must be
after KVM_CREATE_VCPU returns). We can stop storing the dentry
in struct kvm_vcpu too, because it is not needed anywhere after
kvm_create_vcpu_debugfs returns.
Reported-by: syzbot+705f4401d5a93a59b87d@syzkaller.appspotmail.com
Fixes: 63d0434837 ("KVM: x86: move kvm_create_vcpu_debugfs after last failure point")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adjust the fileserver rotation algorithm so that if we've tried all the
addresses on a server (cumulatively over multiple operations) until we've
run out of untried addresses, immediately reprobe all that server's
interfaces and retry the op at least once before we move onto the next
server.
Signed-off-by: David Howells <dhowells@redhat.com>
Display more information about the state of a server record, including the
flags, rtt and break counter plus the probe state for each server in
/proc/net/afs/servers.
Rearrange the server flags a bit to make them easier to read at a glance in
the proc file.
Signed-off-by: David Howells <dhowells@redhat.com>
Don't use the running state for fileserver probes to make decisions about
which server to use as the state is cleared at the start of a probe and
also intermediate values might be misleading.
Instead, add a separate 'latest known' rtt in the afs_server struct and a
flag to indicate if the server is known to be responding and update these
as and when we know what to change them to.
Signed-off-by: David Howells <dhowells@redhat.com>
Fix afs_statfs() so that the value for f_bavail and f_bfree don't go
"negative" if the number of blocks in use by a volume exceeds the max quota
for that volume.
Signed-off-by: David Howells <dhowells@redhat.com>
Whilst it shouldn't happen, it is possible for multiple fileservers to
share a UUID, particularly if an entire cell has been duplicated, UUIDs and
all. In such a case, it's not necessarily possible to map the effect of
the CB.InitCallBackState3 incoming RPC to a specific server unambiguously
by UUID and thus to a specific cell.
Indeed, there's a problem whereby multiple server records may need to
occupy the same spot in the rb_tree rooted in the afs_net struct.
Fix this by allowing servers to form a list, with the head of the list in
the tree. When the front entry in the list is removed, the second in the
list just replaces it. afs_init_callback_state() then just goes down the
line, poking each server in the list.
This means that some servers will be unnecessarily poked, unfortunately.
An alternative would be to route by call parameters.
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Reorganise afs_volume objects such that they're in a tree keyed on volume
ID, rooted at on an afs_cell object rather than being in multiple trees,
each of which is rooted on an afs_server object.
afs_server structs become per-cell and acquire a pointer to the cell.
The process of breaking a callback then starts with finding the server by
its network address, following that to the cell and then looking up each
volume ID in the volume tree.
This is simpler than the afs_vol_interest/afs_cb_interest N:M mapping web
and allows those structs and the code for maintaining them to be simplified
or removed.
It does make a couple of things a bit more tricky, though:
(1) Operations now start with a volume, not a server, so there can be more
than one answer as to whether or not the server we'll end up using
supports the FS.InlineBulkStatus RPC.
(2) CB RPC operations that specify the server UUID. There's still a tree
of servers by UUID on the afs_net struct, but the UUIDs in it aren't
guaranteed unique.
Signed-off-by: David Howells <dhowells@redhat.com>
YFS Volume Location servers have an operation by which the cell name may be
queried. Use this to find out what a YFS server thinks the canonical cell
name should be.
Signed-off-by: David Howells <dhowells@redhat.com>
Implement the second phase of cell alias detection. This part handles
alias detection for cells that don't have root.cell volumes and so we have
to find some other volume or fileserver to query.
We take the first volume from each such cell and attempt to look it up in
the new cell. If found, we compare the records, if they are the same, we
judge the cell names to be aliases.
Signed-off-by: David Howells <dhowells@redhat.com>
Put in the first phase of cell alias detection. This part handles alias
detection for cells that have root.cell volumes (which is expected to be
likely).
When a cell becomes newly active, it is probed for its root.cell volume,
and if it has one, this volume is compared against other root.cell volumes
to find out if the list of fileserver UUIDs have any in common - and if
that's the case, do the address lists of those fileservers have any
addresses in common. If they do, the new cell is adjudged to be an alias
of the old cell and the old cell is used instead.
Comparing is aided by the server list in struct afs_server_list being
sorted in UUID order and the addresses in the fileserver address lists
being sorted in address order.
The cell then retains the afs_volume object for the root.cell volume, even
if it's not mounted for future alias checking.
This necessary because:
(1) Whilst fileservers have UUIDs that are meant to be globally unique, in
practice they are not because cells get cloned without changing the
UUIDs - so afs_server records need to be per cell.
(2) Sometimes the DNS is used to make cell aliases - but if we don't know
they're the same, we may end up with multiple superblocks and multiple
afs_server records for the same thing, impairing our ability to
deliver callback notifications of third party changes
(3) The fileserver RPC API doesn't contain the cell name, so it can't tell
us which cell it's notifying and can't see that a change made to to
one cell should notify the same client that's also accessed as the
other cell.
Reported-by: Jeffrey Altman <jaltman@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>