Andrii Nakryiko says:
====================
This patch set adds ability to memory-map BPF array maps (single- and
multi-element). The primary use case is memory-mapping BPF array maps, created
to back global data variables, created by libbpf implicitly. This allows for
much better usability, along with avoiding syscalls to read or update data
completely.
Due to memory-mapping requirements, BPF array map that is supposed to be
memory-mapped, has to be created with special BPF_F_MMAPABLE attribute, which
triggers slightly different memory allocation strategy internally. See
patch 1 for details.
Libbpf is extended to detect kernel support for this flag, and if supported,
will specify it for all global data maps automatically.
Patch #1 refactors bpf_map_inc() and converts bpf_map's refcnt to atomic64_t
to make refcounting never fail. Patch #2 does similar refactoring for
bpf_prog_add()/bpf_prog_inc().
v5->v6:
- add back uref counting (Daniel);
v4->v5:
- change bpf_prog's refcnt to atomic64_t (Daniel);
v3->v4:
- add mmap's open() callback to fix refcounting (Johannes);
- switch to remap_vmalloc_pages() instead of custom fault handler (Johannes);
- converted bpf_map's refcnt/usercnt into atomic64_t;
- provide default bpf_map_default_vmops handling open/close properly;
v2->v3:
- change allocation strategy to avoid extra pointer dereference (Jakub);
v1->v2:
- fix map lookup code generation for BPF_F_MMAPABLE case;
- prevent BPF_F_MMAPABLE flag for all but plain array map type;
- centralize ref-counting in generic bpf_map_mmap();
- don't use uref counting (Alexei);
- use vfree() directly;
- print flags with %x (Song);
- extend tests to verify bpf_map_{lookup,update}_elem() logic as well.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add selftests validating mmap()-ing BPF array maps: both single-element and
multi-element ones. Check that plain bpf_map_update_elem() and
bpf_map_lookup_elem() work correctly with memory-mapped array. Also convert
CO-RE relocation tests to use memory-mapped views of global data.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-6-andriin@fb.com
Add ability to memory-map contents of BPF array map. This is extremely useful
for working with BPF global data from userspace programs. It allows to avoid
typical bpf_map_{lookup,update}_elem operations, improving both performance
and usability.
There had to be special considerations for map freezing, to avoid having
writable memory view into a frozen map. To solve this issue, map freezing and
mmap-ing is happening under mutex now:
- if map is already frozen, no writable mapping is allowed;
- if map has writable memory mappings active (accounted in map->writecnt),
map freezing will keep failing with -EBUSY;
- once number of writable memory mappings drops to zero, map freezing can be
performed again.
Only non-per-CPU plain arrays are supported right now. Maps with spinlocks
can't be memory mapped either.
For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc()
to be mmap()'able. We also need to make sure that array data memory is
page-sized and page-aligned, so we over-allocate memory in such a way that
struct bpf_array is at the end of a single page of memory with array->value
being aligned with the start of the second page. On deallocation we need to
accomodate this memory arrangement to free vmalloc()'ed memory correctly.
One important consideration regarding how memory-mapping subsystem functions.
Memory-mapping subsystem provides few optional callbacks, among them open()
and close(). close() is called for each memory region that is unmapped, so
that users can decrease their reference counters and free up resources, if
necessary. open() is *almost* symmetrical: it's called for each memory region
that is being mapped, **except** the very first one. So bpf_map_mmap does
initial refcnt bump, while open() will do any extra ones after that. Thus
number of close() calls is equal to number of open() calls plus one more.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
Similarly to bpf_map's refcnt/usercnt, convert bpf_prog's refcnt to atomic64
and remove artificial 32k limit. This allows to make bpf_prog's refcounting
non-failing, simplifying logic of users of bpf_prog_add/bpf_prog_inc.
Validated compilation by running allyesconfig kernel build.
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-3-andriin@fb.com
92117d8443 ("bpf: fix refcnt overflow") turned refcounting of bpf_map into
potentially failing operation, when refcount reaches BPF_MAX_REFCNT limit
(32k). Due to using 32-bit counter, it's possible in practice to overflow
refcounter and make it wrap around to 0, causing erroneous map free, while
there are still references to it, causing use-after-free problems.
But having a failing refcounting operations are problematic in some cases. One
example is mmap() interface. After establishing initial memory-mapping, user
is allowed to arbitrarily map/remap/unmap parts of mapped memory, arbitrarily
splitting it into multiple non-contiguous regions. All this happening without
any control from the users of mmap subsystem. Rather mmap subsystem sends
notifications to original creator of memory mapping through open/close
callbacks, which are optionally specified during initial memory mapping
creation. These callbacks are used to maintain accurate refcount for bpf_map
(see next patch in this series). The problem is that open() callback is not
supposed to fail, because memory-mapped resource is set up and properly
referenced. This is posing a problem for using memory-mapping with BPF maps.
One solution to this is to maintain separate refcount for just memory-mappings
and do single bpf_map_inc/bpf_map_put when it goes from/to zero, respectively.
There are similar use cases in current work on tcp-bpf, necessitating extra
counter as well. This seems like a rather unfortunate and ugly solution that
doesn't scale well to various new use cases.
Another approach to solve this is to use non-failing refcount_t type, which
uses 32-bit counter internally, but, once reaching overflow state at UINT_MAX,
stays there. This utlimately causes memory leak, but prevents use after free.
But given refcounting is not the most performance-critical operation with BPF
maps (it's not used from running BPF program code), we can also just switch to
64-bit counter that can't overflow in practice, potentially disadvantaging
32-bit platforms a tiny bit. This simplifies semantics and allows above
described scenarios to not worry about failing refcount increment operation.
In terms of struct bpf_map size, we are still good and use the same amount of
space:
BEFORE (3 cache lines, 8 bytes of padding at the end):
struct bpf_map {
const struct bpf_map_ops * ops __attribute__((__aligned__(64))); /* 0 8 */
struct bpf_map * inner_map_meta; /* 8 8 */
void * security; /* 16 8 */
enum bpf_map_type map_type; /* 24 4 */
u32 key_size; /* 28 4 */
u32 value_size; /* 32 4 */
u32 max_entries; /* 36 4 */
u32 map_flags; /* 40 4 */
int spin_lock_off; /* 44 4 */
u32 id; /* 48 4 */
int numa_node; /* 52 4 */
u32 btf_key_type_id; /* 56 4 */
u32 btf_value_type_id; /* 60 4 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct btf * btf; /* 64 8 */
struct bpf_map_memory memory; /* 72 16 */
bool unpriv_array; /* 88 1 */
bool frozen; /* 89 1 */
/* XXX 38 bytes hole, try to pack */
/* --- cacheline 2 boundary (128 bytes) --- */
atomic_t refcnt __attribute__((__aligned__(64))); /* 128 4 */
atomic_t usercnt; /* 132 4 */
struct work_struct work; /* 136 32 */
char name[16]; /* 168 16 */
/* size: 192, cachelines: 3, members: 21 */
/* sum members: 146, holes: 1, sum holes: 38 */
/* padding: 8 */
/* forced alignments: 2, forced holes: 1, sum forced holes: 38 */
} __attribute__((__aligned__(64)));
AFTER (same 3 cache lines, no extra padding now):
struct bpf_map {
const struct bpf_map_ops * ops __attribute__((__aligned__(64))); /* 0 8 */
struct bpf_map * inner_map_meta; /* 8 8 */
void * security; /* 16 8 */
enum bpf_map_type map_type; /* 24 4 */
u32 key_size; /* 28 4 */
u32 value_size; /* 32 4 */
u32 max_entries; /* 36 4 */
u32 map_flags; /* 40 4 */
int spin_lock_off; /* 44 4 */
u32 id; /* 48 4 */
int numa_node; /* 52 4 */
u32 btf_key_type_id; /* 56 4 */
u32 btf_value_type_id; /* 60 4 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct btf * btf; /* 64 8 */
struct bpf_map_memory memory; /* 72 16 */
bool unpriv_array; /* 88 1 */
bool frozen; /* 89 1 */
/* XXX 38 bytes hole, try to pack */
/* --- cacheline 2 boundary (128 bytes) --- */
atomic64_t refcnt __attribute__((__aligned__(64))); /* 128 8 */
atomic64_t usercnt; /* 136 8 */
struct work_struct work; /* 144 32 */
char name[16]; /* 176 16 */
/* size: 192, cachelines: 3, members: 21 */
/* sum members: 154, holes: 1, sum holes: 38 */
/* forced alignments: 2, forced holes: 1, sum forced holes: 38 */
} __attribute__((__aligned__(64)));
This patch, while modifying all users of bpf_map_inc, also cleans up its
interface to match bpf_map_put with separate operations for bpf_map_inc and
bpf_map_inc_with_uref (to match bpf_map_put and bpf_map_put_with_uref,
respectively). Also, given there are no users of bpf_map_inc_not_zero
specifying uref=true, remove uref flag and default to uref=false internally.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-2-andriin@fb.com
xdr_shrink_pagelen() BUG's when @len is larger than buf->page_len.
This can happen when xdr_buf_read_mic() is given an xdr_buf with
a small page array (like, only a few bytes).
Instead, just cap the number of bytes that xdr_shrink_pagelen()
will move.
Fixes: 5f1bc39979 ("SUNRPC: Fix buffer handling of GSS MIC ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
One of the most frustrating messages our sustaining team sees is
the "Lock reclaim failed!" message. Add some observability in the
client's lock reclaim logic so we can capture better data the
first time a problem occurs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Add a trace point in the main state manager loop to observe state
recovery operation. Help track down state recovery bugs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
NFSoRDMA Client Updates for Linux 5.5
New Features:
- New tracepoints for congestion control and Local Invalidate WRs
Bugfixes and Cleanups:
- Eliminate log noise in call_reserveresult
- Fix unstable connections after a reconnect
- Clean up some code duplication
- Close race between waking a sender and posting a receive
- Fix MR list corruption, and clean up MR usage
- Remove unused rpcrdma_sendctx fields
- Try to avoid DMA mapping pages if it is too costly
- Wake pending tasks if connection fails
- Replace some dprintk()s with tracepoints
The removed calgary IOMMU driver was the only user of this header file.
Reported-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christoph Hellwig <hch@lst.de>
Fix sparse warning:
fs/nfs/nfs42proc.c:527:5: warning:
symbol '_nfs42_proc_copy_notify' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Changing a sparse file could have an effect not only on the file size,
but also on the number of blocks used by the file in the underlying
filesystem. The server's cache_consistency_bitmap doesn't update the
SPACE_USED attribute, so let's switch to the nfs4_fattr_bitmap to catch
this update whenever we do an ALLOCATE or DEALLOCATE.
This patch fixes xfstests generic/568, which tests that fallocating an
unaligned range allocates all blocks touched by that range. Without this
patch, `stat` reports 0 bytes used immediately after the fallocate.
Adding a `sleep 5` to the test also catches the update, but it's better
to do so when we know something has changed.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
My understanding is that -EBUSY refers to the underlying device, and
that -ETXTBSY is used when attempting to access a file in use by the
kernel (like a swapfile). Changing this return code helps us pass
xfstests generic/569
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
For imx chipidea controllers, if they use mxs PHY, they need pinctrl
for HSIC. Otherwise, it doesn't need pinctrl and usbmisc control. Like
imx7d and imx8mm.
Reported-by: André Draszik <git@andred.net>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
The current_stateid is exported from nfs4state.c but not
declared in any of the headers. Add to nfs4_fs.h to
remove the following warning:
fs/nfs/nfs4state.c:80:20: warning: symbol 'current_stateid' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
In the event that the RMI device is unreachable, the calls to rmi_set_mode() or
rmi_set_page() will fail before registering the RMI transport device. When the
device is removed, rmi_remove() will call rmi_unregister_transport_device()
which will attempt to access the rmi_dev pointer which was not set.
This patch adds a check of the RMI_STARTED bit before calling
rmi_unregister_transport_device(). The RMI_STARTED bit is only set
after rmi_register_transport_device() completes successfully.
The kernel oops was reported in this message:
https://www.spinics.net/lists/linux-input/msg58433.html
[jkosina@suse.cz: reworded changelog as agreed with Andrew]
Signed-off-by: Andrew Duggan <aduggan@synaptics.com>
Reported-by: Federico Cerutti <federico@ceres-c.it>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Autoloading of Falcon IDE driver modules requires converting these
drivers to platform drivers.
Add platform device for Falcon IDE interface in Atari platform setup
code. Use this in the pata_falcon driver in place of the simple
platform device set up on the fly.
Convert falconide driver to use the same platform device that is used
by pata_falcon also. (With the introduction of a platform device for
the Atari Falcon IDE interface, the old Falcon IDE driver no longer
loads (resource already claimed by the platform device)).
Tested (as built-in driver) on my Atari Falcon.
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Link: https://lore.kernel.org/r/1573008449-8226-1-git-send-email-schmitzmic@gmail.com
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Since e04a0442d3 ("HID: core: remove the absolute need of
hid_have_special_driver[]") it's no longer needed to list these LED
devices in hid_have_special_driver[]. This allows libraries needing
access to the hidraw device to work properly.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
With large eMMC cards, it is possible to create general purpose
partitions that are bigger than 4GB. The size member of the mmc_part
struct is only an unsigned int which overflows for gp partitions larger
than 4GB. Change this to a u64 to handle the overflow.
Signed-off-by: Bradley Bolen <bradleybolen@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Isolated initially to renesas_sdhi_internal_dmac [1], Ulf suggested
adding MMC_CAP_ERASE to the TMIO mmc core:
On Fri, Nov 15, 2019 at 10:27:25AM +0100, Ulf Hansson wrote:
-- snip --
This test and due to the discussions with Wolfram and you in this
thread, I would actually suggest that you enable MMC_CAP_ERASE for all
tmio variants, rather than just for this particular one.
In other words, set the cap in tmio_mmc_host_probe() should be fine,
as it seems none of the tmio variants supports HW busy detection at
this point.
-- snip --
Testing on R-Car H3ULCB-KF doesn't reveal any issues (v5.4-rc7):
root@rcar-gen3:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
mmcblk0 179:0 0 59.2G 0 disk <--- eMMC
mmcblk0boot0 179:8 0 4M 1 disk
mmcblk0boot1 179:16 0 4M 1 disk
mmcblk1 179:24 0 30G 0 disk <--- SD card
root@rcar-gen3:~# time blkdiscard /dev/mmcblk0
real 0m8.659s
user 0m0.001s
sys 0m1.920s
root@rcar-gen3:~# time blkdiscard /dev/mmcblk1
real 0m1.176s
user 0m0.001s
sys 0m0.124s
[1] https://lore.kernel.org/linux-renesas-soc/20191112134808.23546-1-erosca@de.adit-jv.com/
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Andrew Gabbasov <andrew_gabbasov@mentor.com>
Originally-by: Harish Jenny K N <harish_kandiga@mentor.com>
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
- -EPROBE_DEFER is an error, but without need show error message
- If pintrol is not existed, as pintrol is NULL
Signed-off-by: Peter Chen <peter.chen@nxp.com>
As usbmisc_data is optional, so add the check before access its member,
this fix below static checker warning:
drivers/usb/chipidea/ci_hdrc_imx.c:438 ci_hdrc_imx_probe()
warn: 'data->usbmisc_data' can also be NULL
which is introduced by Patch 15b80f7c3a7f:
"usb: chipidea: imx: enable vbus and id wakeup only for OTG events"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Li Jun <jun.li@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Vbus regualtor is an optional regulator, for platforms, which
doesn't have this regulator, it will get a dummy regulator and
show warning message.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
If ID or VBUS is from external block, don't enable its wakeup
because it isn't used at all.
Signed-off-by: Li Jun <jun.li@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
We hit the problem with below sequence:
- ci_udc_vbus_session() update vbus_active flag and ci->driver
is valid,
- before calling the ci_hdrc_gadget_connect(),
usb_gadget_udc_stop() is called by application remove gadget
driver,
- ci_udc_vbus_session() will contine do ci_hdrc_gadget_connect() as
gadget_ready is 1, so udc interrupt is enabled, but ci->driver is
NULL.
- USB connection irq generated but ci->driver is NULL.
As udc irq only should be enabled when gadget driver is binded, so
add spinlock to protect the usb irq enable for vbus session handling.
Signed-off-by: Jun Li <jun.li@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
This API is used enable device function, it is called at below
situations:
- VBUS is connected during boots up
- Hot plug occurs during runtime
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Jun Li <jun.li@nxp.com>
If the clone3() syscall is not implemented we should skip the tests.
Fixes: 41585bbeee ("selftests: add tests for clone3() with *set_tid")
Fixes: 17a810699c ("selftests: add tests for clone3()")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Felipe writes:
USB: changes for v5.5
We have TI's glue layer for the Cadence USB3 controller going
upstream. Tegra's XUDC driver is also going upstream with this pull
request.
Apart from these two big features, we have a bunch of patches switching
over to devm_platform_ioremap_resource() in order to simplify code a
little; and a non-critical fix for DWC3 usage via kexec.
* tag 'usb-for-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: (44 commits)
usb: dwc3: of-simple: add a shutdown
usb: cdns3: Add TI specific wrapper driver
dt-bindings: usb: Add binding for the TI wrapper for Cadence USB3 controller
usb: mtu3: fix race condition about delayed_status
usb: gadget: Add UDC driver for tegra XUSB device mode controller
usb: dwc3: debug: Remove newline printout
usb: dwc2: use a longer core rest timeout in dwc2_core_reset()
usb: gadget: udc: lpc32xx: Use devm_platform_ioremap_resource() in lpc32xx_udc_probe()
USB: gadget: udc: clean up an indentation issue
usb: gadget: Quieten gadget config message
phy: renesas: rcar-gen3-usb2: Use platform_get_irq_optional() for optional irq
usb: gadget: Remove set but not used variable 'opts' in msg_do_config
usb: gadget: Remove set but not used variable 'opts' in acm_ms_do_config
usb: mtu3: add a new function to do status stage
usb: gadget: configfs: fix concurrent issue between composite APIs
usb: gadget: f_tcm: Provide support to get alternate setting in tcm function
usb: gadget: Correct NULL pointer checking in fsl gadget
usb: fsl: Remove unused variable
USB: dummy-hcd: use usb_urb_dir_in instead of usb_pipein
USB: dummy-hcd: increase max number of devices to 32
...
This allows just loading the kernel at a pre-set address without
qemu going bonkers trying to map the ELF file.
Contains a contribution from Aurabindo Jayamohanan to reuse the
PAGE_OFFSET definition.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: fixed checkpatch issue; minor commit
message fix]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
The kernel runs in M-mode without using page tables, and thus can't run
bare metal without help from additional firmware.
Most of the patch is just stubbing out code not needed without page
tables, but there is an interesting detail in the signals implementation:
- The normal RISC-V syscall ABI only implements rt_sigreturn as VDSO
entry point, but the ELF VDSO is not supported for nommu Linux.
We instead copy the code to call the syscall onto the stack.
In addition to enabling the nommu code a new defconfig for a small
kernel image that can run in nommu mode on qemu is also provided, to run
a kernel in qemu you can use the following command line:
qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \
-kernel arch/riscv/boot/loader \
-drive file=rootfs.ext2,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0
Contains contributions from Damien Le Moal <Damien.LeMoal@wdc.com>.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: updated to apply; add CONFIG_MMU guards
around PCI_IOBASE definition to fix build issues; fixed checkpatch
issues; move the PCI_IO_* and VMEMMAP address space macros along
with the others; resolve sparse warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
When we get booted we want a clear slate without any leaks from previous
supervisors or the firmware. Flush the instruction cache and then clear
all registers to known good values. This is really important for the
upcoming nommu support that runs on M-mode, but can't really harm when
running in S-mode either. Vaguely based on the concepts from opensbi.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
When in M-Mode, we can use the mhartid CSR to get the ID of the running
HART. Doing so, direct M-Mode boot without firmware is possible.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
RISC-V has the concept of a cpu level interrupt controller. The
interface for it is split between a standardized part that is exposed
as bits in the mstatus/sstatus register and the mie/mip/sie/sip
CRS. But the bit to actually trigger IPIs is not standardized and
just mentioned as implementable using MMIO.
Add support for IPIs using MMIO using the SiFive clint layout (which
is also shared by Ariane, Kendryte and the Qemu virt platform).
Additionally the MMIO block also supports the time value and timer
compare registers, so they are also set up using the same OF node.
Support for other layouts should also be relatively easy to add in the
future.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: update include guard format; fix checkpatch
issues; minor commit message cleanup]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>