docs/vm: cleancache.txt: convert to ReST format

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

Committed by: Jonathan Corbet
Parent: d04f9f5a78
Commit: 5ef829e056
@@ -1,4 +1,11 @@
-MOTIVATION
+.. _cleancache:
+
+==========
+Cleancache
+==========
+
+Motivation
+==========
 
 Cleancache is a new optional feature provided by the VFS layer that
 potentially dramatically increases page cache effectiveness for
@@ -21,9 +28,10 @@ Transcendent memory "drivers" for cleancache are currently implemented
 in Xen (using hypervisor memory) and zcache (using in-kernel compressed
 memory) and other implementations are in development.
 
-FAQs are included below.
+:ref:`FAQs <faq>` are included below.
 
-IMPLEMENTATION OVERVIEW
+Implementation Overview
+=======================
 
 A cleancache "backend" that provides transcendent memory registers itself
 to the kernel's cleancache "frontend" by calling cleancache_register_ops,
@@ -80,22 +88,33 @@ different Linux threads are simultaneously putting and invalidating a page
 with the same handle, the results are indeterminate. Callers must
 lock the page to ensure serial behavior.
 
-CLEANCACHE PERFORMANCE METRICS
+Cleancache Performance Metrics
+==============================
 
 If properly configured, monitoring of cleancache is done via debugfs in
-the /sys/kernel/debug/cleancache directory. The effectiveness of cleancache
+the `/sys/kernel/debug/cleancache` directory. The effectiveness of cleancache
 can be measured (across all filesystems) with:
 
-succ_gets - number of gets that were successful
-failed_gets - number of gets that failed
-puts - number of puts attempted (all "succeed")
-invalidates - number of invalidates attempted
+``succ_gets``
+number of gets that were successful
+
+``failed_gets``
+number of gets that failed
+
+``puts``
+number of puts attempted (all "succeed")
+
+``invalidates``
+number of invalidates attempted
 
 A backend implementation may provide additional metrics.
 
-FAQ
+.. _faq:
 
-1) Where's the value? (Andrew Morton)
+FAQ
+===
+
+* Where's the value? (Andrew Morton)
 
 Cleancache provides a significant performance benefit to many workloads
 in many environments with negligible overhead by improving the
@@ -137,8 +156,8 @@ device that stores pages of data in a compressed state. And
 the proposed "RAMster" driver shares RAM across multiple physical
 systems.
 
-2) Why does cleancache have its sticky fingers so deep inside the
+* Why does cleancache have its sticky fingers so deep inside the
 filesystems and VFS? (Andrew Morton and Christoph Hellwig)
 
 The core hooks for cleancache in VFS are in most cases a single line
 and the minimum set are placed precisely where needed to maintain
@@ -168,9 +187,9 @@ filesystems in the future.
 The total impact of the hooks to existing fs and mm files is only
 about 40 lines added (not counting comments and blank lines).
 
-3) Why not make cleancache asynchronous and batched so it can
-more easily interface with real devices with DMA instead
-of copying each individual page? (Minchan Kim)
+* Why not make cleancache asynchronous and batched so it can more
+easily interface with real devices with DMA instead of copying each
+individual page? (Minchan Kim)
 
 The one-page-at-a-time copy semantics simplifies the implementation
 on both the frontend and backend and also allows the backend to
@@ -182,8 +201,8 @@ are avoided. While the interface seems odd for a "real device"
 or for real kernel-addressable RAM, it makes perfect sense for
 transcendent memory.
 
-4) Why is non-shared cleancache "exclusive"? And where is the
+* Why is non-shared cleancache "exclusive"? And where is the
 page "invalidated" after a "get"? (Minchan Kim)
 
 The main reason is to free up space in transcendent memory and
 to avoid unnecessary cleancache_invalidate calls. If you want inclusive,
@@ -193,7 +212,7 @@ be easily extended to add a "get_no_invalidate" call.
 
 The invalidate is done by the cleancache backend implementation.
 
-5) What's the performance impact?
+* What's the performance impact?
 
 Performance analysis has been presented at OLS'09 and LCA'10.
 Briefly, performance gains can be significant on most workloads,
@@ -206,7 +225,7 @@ single-core systems with slow memory-copy speeds, cleancache
 has little value, but in newer multicore machines, especially
 consolidated/virtualized machines, it has great value.
 
-6) How do I add cleancache support for filesystem X? (Boaz Harrash)
+* How do I add cleancache support for filesystem X? (Boaz Harrash)
 
 Filesystems that are well-behaved and conform to certain
 restrictions can utilize cleancache simply by making a call to
@@ -217,26 +236,26 @@ not enable the optional cleancache.
 
 Some points for a filesystem to consider:
 
 - The FS should be block-device-based (e.g. a ram-based FS such
 as tmpfs should not enable cleancache)
 - To ensure coherency/correctness, the FS must ensure that all
 file removal or truncation operations either go through VFS or
 add hooks to do the equivalent cleancache "invalidate" operations
 - To ensure coherency/correctness, either inode numbers must
 be unique across the lifetime of the on-disk file OR the
 FS must provide an "encode_fh" function.
 - The FS must call the VFS superblock alloc and deactivate routines
 or add hooks to do the equivalent cleancache calls done there.
 - To maximize performance, all pages fetched from the FS should
 go through the do_mpag_readpage routine or the FS should add
 hooks to do the equivalent (cf. btrfs)
 - Currently, the FS blocksize must be the same as PAGESIZE. This
 is not an architectural restriction, but no backends currently
 support anything different.
 - A clustered FS should invoke the "shared_init_fs" cleancache
 hook to get best performance for some backends.
 
-7) Why not use the KVA of the inode as the key? (Christoph Hellwig)
+* Why not use the KVA of the inode as the key? (Christoph Hellwig)
 
 If cleancache would use the inode virtual address instead of
 inode/filehandle, the pool id could be eliminated. But, this
@@ -251,7 +270,7 @@ of cleancache would be lost because the cache of pages in cleanache
 is potentially much larger than the kernel pagecache and is most
 useful if the pages survive inode cache removal.
 
-8) Why is a global variable required?
+* Why is a global variable required?
 
 The cleancache_enabled flag is checked in all of the frequently-used
 cleancache hooks. The alternative is a function call to check a static
@@ -262,14 +281,14 @@ global variable allows cleancache to be enabled by default at compile
 time, but have insignificant performance impact when cleancache remains
 disabled at runtime.
 
-9) Does cleanache work with KVM?
+* Does cleanache work with KVM?
 
 The memory model of KVM is sufficiently different that a cleancache
 backend may have less value for KVM. This remains to be tested,
 especially in an overcommitted system.
 
-10) Does cleancache work in userspace? It sounds useful for
+* Does cleancache work in userspace? It sounds useful for
 memory hungry caches like web browsers. (Jamie Lokier)
 
 No plans yet, though we agree it sounds useful, at least for
 apps that bypass the page cache (e.g. O_DIRECT).
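The "Cleancache Performance Metrics" section touched by this commit lists four debugfs counters but does not show how to combine them. As a rough illustration only, the sketch below reads the four counters and derives a get hit ratio. It assumes each counter file holds a single decimal integer and that debugfs is mounted at the path the document names; the function names `read_counters` and `get_hit_ratio` are hypothetical helpers, not part of any kernel interface.

```python
from pathlib import Path

# Counter file names as listed in the documentation's metrics section.
COUNTERS = ("succ_gets", "failed_gets", "puts", "invalidates")


def read_counters(debug_dir="/sys/kernel/debug/cleancache"):
    """Read the four cleancache counters from a directory as integers.

    Assumption: each file contains one decimal integer, optionally
    followed by a newline.
    """
    base = Path(debug_dir)
    return {name: int((base / name).read_text().strip()) for name in COUNTERS}


def get_hit_ratio(counters):
    """Return succ_gets / (succ_gets + failed_gets), or None if no gets occurred."""
    gets = counters["succ_gets"] + counters["failed_gets"]
    if gets == 0:
        return None
    return counters["succ_gets"] / gets
```

Calling `get_hit_ratio(read_counters())` on a machine where the directory exists (reading debugfs typically needs root) gives the fraction of gets served from cleancache. A ratio near zero alongside a large `puts` count would suggest that pages are being stored but rarely reused.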