docs: admin-guide: add a series of orphaned documents

There are lots of documents that belong to the admin-guide but
are in random places (mostly under the Documentation root dir).

Move them to the admin guide.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Mauro Carvalho Chehab
2019-06-27 14:56:51 -03:00
parent da82c92f11
commit 4f4cfa6c56
40 changed files with 47 additions and 33 deletions


@@ -0,0 +1,124 @@
=============
btmrvl driver
=============
All commands are used via the debugfs interface.
Set/get driver configurations
=============================
Path: /debug/btmrvl/config/
gpiogap=[n], hscfgcmd
These commands are used to configure the host sleep parameters::
bit 7:0 -- Gap
bit 15:8 -- GPIO
where GPIO is the pin number of the GPIO used to wake up the host.
It can be any valid GPIO pin number (e.g. 0-7) or 0xff (the SDIO interface
wakeup will be used instead).
where Gap is the gap in milliseconds between the wakeup signal and
the wakeup event, or 0xff for the special host sleep setting.
Usage::
# Use SDIO interface to wake up the host and set GAP to 0x80:
echo 0xff80 > /debug/btmrvl/config/gpiogap
echo 1 > /debug/btmrvl/config/hscfgcmd
# Use GPIO pin #3 to wake up the host and set GAP to 0xff:
echo 0x03ff > /debug/btmrvl/config/gpiogap
echo 1 > /debug/btmrvl/config/hscfgcmd
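As a minimal sketch of the encoding used in the examples above, the helper below packs a GPIO pin number and a gap value into the 16-bit word written to ``gpiogap``. The function name ``btmrvl_gpiogap`` is hypothetical and not part of the driver; the layout (GPIO in the high byte, gap in the low byte) is inferred from the two usage examples, 0xff80 and 0x03ff.

```shell
# Hypothetical helper, not part of the btmrvl driver.
# $1 = GPIO pin number (or 0xff for SDIO interface wakeup)
# $2 = gap in milliseconds (or 0xff for the special host sleep setting)
btmrvl_gpiogap() {
    # GPIO goes into the high byte, gap into the low byte.
    printf '0x%04x\n' $(( (($1 & 0xff) << 8) | ($2 & 0xff) ))
}
```

For instance, ``btmrvl_gpiogap 0xff 0x80`` prints the 0xff80 value used in the first example, which can then be written to /debug/btmrvl/config/gpiogap.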
psmode=[n], pscmd
These commands are used to enable/disable auto sleep mode
where the option is::
1 -- Enable auto sleep mode
0 -- Disable auto sleep mode
Usage::
# Enable auto sleep mode
echo 1 > /debug/btmrvl/config/psmode
echo 1 > /debug/btmrvl/config/pscmd
# Disable auto sleep mode
echo 0 > /debug/btmrvl/config/psmode
echo 1 > /debug/btmrvl/config/pscmd
hsmode=[n], hscmd
These commands are used to enable host sleep or wake up firmware
where the option is::
1 -- Enable host sleep
0 -- Wake up firmware
Usage::
# Enable host sleep
echo 1 > /debug/btmrvl/config/hsmode
echo 1 > /debug/btmrvl/config/hscmd
# Wake up firmware
echo 0 > /debug/btmrvl/config/hsmode
echo 1 > /debug/btmrvl/config/hscmd
Get driver status
=================
Path: /debug/btmrvl/status/
Usage::
cat /debug/btmrvl/status/<args>
where the args are:
curpsmode
This command displays current auto sleep status.
psstate
This command displays the power save state.
hsstate
This command displays the host sleep state.
txdnldrdy
This command displays the value of Tx download ready flag.
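The four status entries can be read in one pass. The sketch below assumes the debugfs path given above; the function name ``btmrvl_status`` and its optional directory argument are illustrative only.

```shell
# Hypothetical helper: print every btmrvl status entry that is readable.
# $1 = status directory (defaults to the path used in this document)
btmrvl_status() {
    dir=${1:-/debug/btmrvl/status}
    for f in curpsmode psstate hsstate txdnldrdy; do
        if [ -r "$dir/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        fi
    done
}
```

Entries that do not exist or cannot be read are skipped silently, so the helper can also be pointed at a different debugfs mount point.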
Issuing a raw hci command
=========================
Use hcitool to issue raw HCI commands; refer to the hcitool manual.
Usage::
hcitool cmd <ogf> <ocf> [Parameters]
Interface Control Command::
hcitool cmd 0x3f 0x5b 0xf5 0x01 0x00 --Enable All interface
hcitool cmd 0x3f 0x5b 0xf5 0x01 0x01 --Enable Wlan interface
hcitool cmd 0x3f 0x5b 0xf5 0x01 0x02 --Enable BT interface
hcitool cmd 0x3f 0x5b 0xf5 0x00 0x00 --Disable All interface
hcitool cmd 0x3f 0x5b 0xf5 0x00 0x01 --Disable Wlan interface
hcitool cmd 0x3f 0x5b 0xf5 0x00 0x02 --Disable BT interface
SD8688 firmware
===============
Images:
- /lib/firmware/sd8688_helper.bin
- /lib/firmware/sd8688.bin
The images can be downloaded from:
git.infradead.org/users/dwmw2/linux-firmware.git/libertas/


@@ -0,0 +1,9 @@
Clearing WARN_ONCE
------------------
WARN_ONCE / WARN_ON_ONCE / printk_once only emit a message once.
echo 1 > /sys/kernel/debug/clear_warn_once
clears the state and allows the warnings to print once again.
This can be useful after test suite runs to reproduce problems.


@@ -0,0 +1,114 @@
========
CPU load
========
Linux exports various bits of information via ``/proc/stat`` and
``/proc/uptime`` that userland tools, such as top(1), use to calculate
the average time the system spent in a particular state, for example::
$ iostat
Linux 2.6.18.3-exp (linmac) 02/20/2007
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.01    0.00    2.92    5.44    0.00   81.63
...
Here the system thinks that over the default sampling period the
system spent 10.01% of the time doing work in user space, 2.92% in the
kernel, and was overall 81.63% of the time idle.
In most cases the ``/proc/stat`` information reflects the reality quite
closely; however, due to the nature of how and when the kernel collects
this data, sometimes it cannot be trusted at all.
So how is this information collected? Whenever a timer interrupt is
signalled, the kernel looks at what kind of task was running at that
moment and increments the counter that corresponds to this task's
kind/state. The problem with this is that the system could have
switched between various states multiple times between two timer
interrupts, yet the counter is incremented only for the last state.
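Userland tools turn these counters into percentages by sampling ``/proc/stat`` twice and dividing the differences. The sketch below illustrates that arithmetic for the idle share; it is not the code of any particular tool. The function names are hypothetical, and the field order of the ``cpu`` line (user, nice, system, idle, iowait, irq, softirq, steal) is taken from proc(5).

```shell
# Sum the first eight counters of a "cpu ..." line from /proc/stat:
# user nice system idle iowait irq softirq steal (missing ones count as 0).
cpu_total() {
    set -- $1
    echo $(( ${2:-0} + ${3:-0} + ${4:-0} + ${5:-0} + ${6:-0} + ${7:-0} + ${8:-0} + ${9:-0} ))
}

# Idle counter: the fourth value after the "cpu" label.
cpu_idle() {
    set -- $1
    echo "${5:-0}"
}

# Percentage of ticks counted as idle between two samples
# ($1 = older "cpu" line, $2 = newer "cpu" line).
cpu_idle_pct() {
    dt=$(( $(cpu_total "$2") - $(cpu_total "$1") ))
    di=$(( $(cpu_idle "$2") - $(cpu_idle "$1") ))
    if [ "$dt" -le 0 ]; then
        echo 0
    else
        echo $(( 100 * di / dt ))
    fi
}
```

A caller would capture ``grep '^cpu ' /proc/stat`` twice, a sampling interval apart, and pass both lines to ``cpu_idle_pct``. Because every counter is only advanced on a timer interrupt, the result inherits the sampling error described above.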
Example
-------
If we imagine the system with one task that periodically burns cycles
in the following manner::
 time line between two timer interrupts
|--------------------------------------|
 ^                                    ^
 |_ something begins working          |
                                      |_ something goes to sleep
                                     (only to be awakened quite soon)
In the above situation the system will be 0% loaded according to
``/proc/stat`` (since the timer interrupt will always happen when the
system is executing the idle handler), but in reality the load is
closer to 99%.
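The effect can be reproduced with a toy model. The sketch below is an illustration only, not a description of kernel code: time is divided into unit steps, a "timer interrupt" fires at phase 0 of every period, and the task is busy between two chosen phases. The function name and its parameters are hypothetical.

```shell
# Toy model of tick-based accounting.
# $1 = steps per tick period, $2 = phase where work starts,
# $3 = phase where work stops, $4 = number of periods to simulate.
simulate_tick_sampling() {
    period=$1 start=$2 stop=$3 ticks=$4
    sampled=0 actual=0 t=0
    total=$((period * ticks))
    while [ "$t" -lt "$total" ]; do
        phase=$((t % period))
        if [ "$phase" -ge "$start" ] && [ "$phase" -lt "$stop" ]; then
            actual=$((actual + 1))
            # Only a tick that lands on a busy step is charged to the task.
            if [ "$phase" -eq 0 ]; then
                sampled=$((sampled + 1))
            fi
        fi
        t=$((t + 1))
    done
    echo "sampled=$((100 * sampled / ticks))% actual=$((100 * actual / total))%"
}
```

With a period of 100 steps and a task that works from phase 1 up to phase 100, ``simulate_tick_sampling 100 1 100 10`` reports a sampled load of 0% against an actual load of 99%, which is the situation drawn in the diagram.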
One can imagine many more situations where this behavior of the kernel
will lead to quite erratic information inside ``/proc/stat``::
/* gcc -o hog smallhog.c */
#include <time.h>
#include <limits.h>
#include <signal.h>
#include <sys/time.h>

#define HIST 10

static volatile sig_atomic_t stop;

/* SIGALRM handler: tell hog() to stop spinning. */
static void sighandler (int signr)
{
    (void) signr;
    stop = 1;
}

/* Spin until niters runs out or the timer fires; return what is left. */
static unsigned long hog (unsigned long niters)
{
    stop = 0;
    while (!stop && --niters)
        ;
    return niters;
}

int main (void)
{
    int i;
    struct itimerval it = {
        .it_interval = { .tv_sec = 0, .tv_usec = 1 },
        .it_value    = { .tv_sec = 0, .tv_usec = 1 }
    };
    sigset_t set;
    unsigned long v[HIST];
    double tmp = 0.0;
    unsigned long n;

    signal (SIGALRM, &sighandler);
    setitimer (ITIMER_REAL, &it, NULL);

    /* Wait for a first timer signal, then count iterations per period. */
    hog (ULONG_MAX);
    for (i = 0; i < HIST; ++i)
        v[i] = ULONG_MAX - hog (ULONG_MAX);
    for (i = 0; i < HIST; ++i)
        tmp += v[i];
    tmp /= HIST;

    /* Use about two thirds of the measured per-period iteration count. */
    n = tmp - (tmp / 3.0);

    sigemptyset (&set);
    sigaddset (&set, SIGALRM);

    /* Burn cycles, then sleep until the next SIGALRM, forever. */
    for (;;) {
        hog (n);
        sigwait (&set, &i);
    }
    return 0;
}
References
----------
- http://lkml.org/lkml/2007/2/12/6
- Documentation/filesystems/proc.txt (1.8)
Thanks
------
Con Kolivas, Pavel Machek


@@ -0,0 +1,177 @@
===========================================
How CPU topology info is exported via sysfs
===========================================
Export CPU topology info via sysfs. Items (attributes) are similar
to /proc/cpuinfo output of some architectures. They reside in
/sys/devices/system/cpu/cpuX/topology/:
physical_package_id:
physical package id of cpuX. Typically corresponds to a physical
socket number, but the actual value is architecture and platform
dependent.
die_id:
the CPU die ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.
core_id:
the CPU core ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.
book_id:
the book ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.
drawer_id:
the drawer ID of cpuX. Typically it is the hardware platform's
identifier (rather than the kernel's). The actual value is
architecture and platform dependent.
core_cpus:
internal kernel map of CPUs within the same core.
(deprecated name: "thread_siblings")
core_cpus_list:
human-readable list of CPUs within the same core.
(deprecated name: "thread_siblings_list")
package_cpus:
internal kernel map of the CPUs sharing the same physical_package_id.
(deprecated name: "core_siblings")
package_cpus_list:
human-readable list of CPUs sharing the same physical_package_id.
(deprecated name: "core_siblings_list")
die_cpus:
internal kernel map of CPUs within the same die.
die_cpus_list:
human-readable list of CPUs within the same die.
book_siblings:
internal kernel map of cpuX's hardware threads within the same
book_id.
book_siblings_list:
human-readable list of cpuX's hardware threads within the same
book_id.
drawer_siblings:
internal kernel map of cpuX's hardware threads within the same
drawer_id.
drawer_siblings_list:
human-readable list of cpuX's hardware threads within the same
drawer_id.
The architecture-neutral drivers/base/topology.c exports these attributes.
However, the book and drawer related sysfs files will only be created if
CONFIG_SCHED_BOOK and CONFIG_SCHED_DRAWER are selected, respectively.
CONFIG_SCHED_BOOK and CONFIG_SCHED_DRAWER are currently only used on s390,
where they reflect the cpu and cache hierarchy.
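The attributes listed above are plain text files, so they can be inspected with ordinary shell tools. The following sketch prints the package and core ID of every CPU that has a topology directory; the function name ``show_topology`` and its optional root argument are hypothetical.

```shell
# Hypothetical helper: list physical_package_id and core_id per CPU.
# $1 = sysfs CPU directory (defaults to the path given in this document)
show_topology() {
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*; do
        t=$dir/topology
        # Skip entries without a topology directory (for example, offline CPUs).
        [ -d "$t" ] || continue
        printf '%s package=%s core=%s\n' "${dir##*/}" \
            "$(cat "$t/physical_package_id")" "$(cat "$t/core_id")"
    done
}
```

The ``cpu[0-9]*`` pattern is there to skip other entries that may live in the same directory; CPUs are listed in glob order, not numeric order.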
For an architecture to support this feature, it must define some of
these macros in include/asm-XXX/topology.h::
#define topology_physical_package_id(cpu)
#define topology_die_id(cpu)
#define topology_core_id(cpu)
#define topology_book_id(cpu)
#define topology_drawer_id(cpu)
#define topology_sibling_cpumask(cpu)
#define topology_core_cpumask(cpu)
#define topology_die_cpumask(cpu)
#define topology_book_cpumask(cpu)
#define topology_drawer_cpumask(cpu)
The type of ``**_id macros`` is int.
The type of ``**_cpumask macros`` is ``(const) struct cpumask *``. The latter
correspond with appropriate ``**_siblings`` sysfs attributes (except for
topology_sibling_cpumask() which corresponds with thread_siblings).
To be consistent on all architectures, include/linux/topology.h
provides default definitions for any of the above macros that are
not defined by include/asm-XXX/topology.h:
1) topology_physical_package_id: -1
2) topology_die_id: -1
3) topology_core_id: 0
4) topology_sibling_cpumask: just the given CPU
5) topology_core_cpumask: just the given CPU
6) topology_die_cpumask: just the given CPU
For architectures that don't support books (CONFIG_SCHED_BOOK) there are no
default definitions for topology_book_id() and topology_book_cpumask().
For architectures that don't support drawers (CONFIG_SCHED_DRAWER) there are
no default definitions for topology_drawer_id() and topology_drawer_cpumask().
Additionally, CPU topology information is provided under
/sys/devices/system/cpu and includes these files. The internal
source for the output is in brackets ("[]").
=========== ==========================================================
kernel_max: the maximum CPU index allowed by the kernel configuration.
            [NR_CPUS-1]

offline:    CPUs that are not online because they have been
            HOTPLUGGED off (see cpu-hotplug.txt) or exceed the limit
            of CPUs allowed by the kernel configuration (kernel_max
            above). [~cpu_online_mask + cpus >= NR_CPUS]

online:     CPUs that are online and being scheduled [cpu_online_mask]

possible:   CPUs that have been allocated resources and can be
            brought online if they are present. [cpu_possible_mask]

present:    CPUs that have been identified as being present in the
            system. [cpu_present_mask]
=========== ==========================================================
The format for the above output is compatible with cpulist_parse()
[see <linux/cpumask.h>]. Some examples follow.
In this example, there are 64 CPUs in the system but cpus 32-63 exceed
the kernel max which is limited to 0..31 by the NR_CPUS config option
being 32. Note also that CPUs 2 and 4-31 are not online but could be
brought online as they are both present and possible::
kernel_max: 31
offline: 2,4-31,32-63
online: 0-1,3
possible: 0-31
present: 0-31
In this example, the NR_CPUS config option is 128, but the kernel was
started with possible_cpus=144. There are 4 CPUs in the system and cpu2
was manually taken offline (and is the only CPU that can be brought
online.)::
kernel_max: 127
offline: 2,4-127,128-143
online: 0-1,3
possible: 0-127
present: 0-3
See cpu-hotplug.txt for the possible_cpus=NUM kernel start parameter
as well as more information on the various cpumasks.
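Scripts that consume files such as ``online`` or ``offline`` usually need the individual CPU numbers rather than the range notation. The sketch below expands a list such as ``0-1,3``. It handles only single numbers and ``N-M`` ranges, as in the examples above; any other syntax the kernel parser may accept is not covered. The function name is hypothetical.

```shell
# Hypothetical helper: expand "0-1,3" into "0 1 3".
# $1 = CPU list made of comma-separated numbers and N-M ranges
expand_cpulist() {
    out=
    old_ifs=$IFS
    IFS=,
    set -- $1          # split the list on commas
    IFS=$old_ifs
    for range in "$@"; do
        first=${range%-*}   # text before the dash (or the whole item)
        last=${range#*-}    # text after the dash (or the whole item)
        i=$first
        while [ "$i" -le "$last" ]; do
            out="$out${out:+ }$i"
            i=$((i + 1))
        done
    done
    echo "$out"
}
```

For example, ``expand_cpulist "$(cat /sys/devices/system/cpu/online)"`` prints one number per online CPU, separated by spaces.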


@@ -13,7 +13,7 @@ the range specified.
The I/O statistics counters for each step-sized area of a region are
in the same format as `/sys/block/*/stat` or `/proc/diskstats` (see:
Documentation/iostats.txt). But two extra counters (12 and 13) are
Documentation/admin-guide/iostats.rst). But two extra counters (12 and 13) are
provided: total time spent reading and writing. When the histogram
argument is used, the 14th parameter is reported that represents the
histogram of latencies. All these counters may be accessed by sending
@@ -151,7 +151,7 @@ Messages
The first 11 counters have the same meaning as
`/sys/block/*/stat or /proc/diskstats`.
Please refer to Documentation/iostats.txt for details.
Please refer to Documentation/admin-guide/iostats.rst for details.
1. the number of reads completed
2. the number of reads merged


@@ -0,0 +1,100 @@
=================
The EFI Boot Stub
=================
On the x86 and ARM platforms, a kernel zImage/bzImage can masquerade
as a PE/COFF image, thereby convincing EFI firmware loaders to load
it as an EFI executable. The code that modifies the bzImage header,
along with the EFI-specific entry point that the firmware loader
jumps to, are collectively known as the "EFI boot stub", and live in
arch/x86/boot/header.S and arch/x86/boot/compressed/eboot.c,
respectively. For ARM the EFI stub is implemented in
arch/arm/boot/compressed/efi-header.S and
arch/arm/boot/compressed/efi-stub.c. EFI stub code that is shared
between architectures is in drivers/firmware/efi/libstub.
For arm64, there is no compressed kernel support, so the Image itself
masquerades as a PE/COFF image and the EFI stub is linked into the
kernel. The arm64 EFI stub lives in arch/arm64/kernel/efi-entry.S
and drivers/firmware/efi/libstub/arm64-stub.c.
By using the EFI boot stub it's possible to boot a Linux kernel
without the use of a conventional EFI boot loader, such as grub or
elilo. Since the EFI boot stub performs the jobs of a boot loader, in
a certain sense it *IS* the boot loader.
The EFI boot stub is enabled with the CONFIG_EFI_STUB kernel option.
How to install bzImage.efi
--------------------------
The bzImage located in arch/x86/boot/bzImage must be copied to the EFI
System Partition (ESP) and renamed with the extension ".efi". Without
the extension the EFI firmware loader will refuse to execute it. It's
not possible to execute bzImage.efi from the usual Linux file systems
because EFI firmware doesn't have support for them. For ARM the
arch/arm/boot/zImage should be copied to the system partition, and it
may not need to be renamed. Similarly for arm64, arch/arm64/boot/Image
should be copied but not necessarily renamed.
Passing kernel parameters from the EFI shell
--------------------------------------------
Arguments to the kernel can be passed after bzImage.efi, e.g.::
fs0:> bzImage.efi console=ttyS0 root=/dev/sda4
The "initrd=" option
--------------------
Like most boot loaders, the EFI stub allows the user to specify
multiple initrd files using the "initrd=" option. This is the only EFI
stub-specific command line parameter; everything else is passed to the
kernel when it boots.
The path to the initrd file must be an absolute path from the
beginning of the ESP; relative path names do not work. Also, the path
is an EFI-style path and directory elements must be separated with
backslashes (\). For example, given the following directory layout::
fs0:>
    Kernels\
        bzImage.efi
        initrd-large.img
    Ramdisks\
        initrd-small.img
        initrd-medium.img
to boot with the initrd-large.img file if the current working
directory is fs0:\Kernels, the following command must be used::
fs0:\Kernels> bzImage.efi initrd=\Kernels\initrd-large.img
Notice how bzImage.efi can be specified with a relative path. That's
because the image we're executing is interpreted by the EFI shell,
which understands relative paths, whereas the rest of the command line
is passed to bzImage.efi.
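When the command line is prepared from a running Linux system, the initrd path has to be converted from the mounted view of the ESP into the absolute, backslash-separated form described above. The following sketch does that conversion; the function name ``to_efi_path`` is hypothetical, and the mount point used in the usage note (``/boot/efi``) is an assumption that varies between distributions.

```shell
# Hypothetical helper: convert a file path under the mounted ESP into an
# EFI-style absolute path.
# $1 = directory where the ESP is mounted, $2 = file below that directory
to_efi_path() {
    esp=${1%/}
    file=$2
    case $file in
        "$esp"/*) rel=${file#"$esp"} ;;
        *) echo "to_efi_path: $file is not under $esp" >&2; return 1 ;;
    esac
    # Replace every forward slash with a backslash.
    printf '%s\n' "$rel" | tr '/' '\\'
}
```

With the ESP mounted on ``/boot/efi``, ``to_efi_path /boot/efi /boot/efi/Kernels/initrd-large.img`` prints ``\Kernels\initrd-large.img``, the value used with "initrd=" in the example above.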
The "dtb=" option
-----------------
For the ARM and arm64 architectures, a device tree must be provided to
the kernel. Normally firmware shall supply the device tree via the
EFI CONFIGURATION TABLE. However, the "dtb=" command line option can
be used to override the firmware supplied device tree, or to supply
one when firmware is unable to.
Please note: Firmware adds runtime configuration information to the
device tree before booting the kernel. If dtb= is used to override
the device tree, then any runtime data provided by firmware will be
lost. The dtb= option should only be used either as a debug tool, or
as a last resort when a device tree is not provided in the EFI
CONFIGURATION TABLE.
"dtb=" is processed in the same manner as the "initrd=" option that is
described above.


@@ -0,0 +1,80 @@
===================================================
Notes on the change from 16-bit UIDs to 32-bit UIDs
===================================================
:Author: Chris Wing <wingc@umich.edu>
:Last updated: January 11, 2000
- kernel code MUST take into account __kernel_uid_t and __kernel_uid32_t
when communicating between user and kernel space in an ioctl or data
structure.
- kernel code should use uid_t and gid_t in kernel-private structures and
code.
What's left to be done for 32-bit UIDs on all Linux architectures:
- Disk quotas have an interesting limitation that is not related to the
maximum UID/GID. They are limited by the maximum file size on the
underlying filesystem, because quota records are written at offsets
corresponding to the UID in question.
Further investigation is needed to see if the quota system can cope
properly with huge UIDs. If it can deal with 64-bit file offsets on all
architectures, this should not be a problem.
- Decide whether or not to keep backwards compatibility with the system
accounting file, or if we should break it as the comments suggest
(currently, the old 16-bit UID and GID are still written to disk, and
part of the former pad space is used to store separate 32-bit UID and
GID)
- Need to validate that OS emulation calls the 16-bit UID
compatibility syscalls, if the OS being emulated used 16-bit UIDs, or
uses the 32-bit UID system calls properly otherwise.
This affects at least:
- iBCS on Intel
- sparc32 emulation on sparc64
(need to support whatever new 32-bit UID system calls are added to
sparc32)
- Validate that all filesystems behave properly.
At present, 32-bit UIDs _should_ work for:
- ext2
- ufs
- isofs
- nfs
- coda
- udf
Ioctl() fixups have been made for:
- ncpfs
- smbfs
Filesystems with simple fixups to prevent 16-bit UID wraparound:
- minix
- sysv
- qnx4
Other filesystems have not been checked yet.
- The ncpfs and smbfs filesystems cannot presently use 32-bit UIDs in
all ioctl()s. Some new ioctl()s have been added with 32-bit UIDs, but
more are needed. (as well as new user<->kernel data structures)
- The ELF core dump format only supports 16-bit UIDs on arm, i386, m68k,
sh, and sparc32. Fixing this is probably not that important, but would
require adding a new ELF section.
- The ioctl()s used to control the in-kernel NFS server only support
16-bit UIDs on arm, i386, m68k, sh, and sparc32.
- make sure that the UID mapping feature of AX25 networking works properly
(it should be safe because it's always used a 32-bit integer to
communicate between user and kernel)
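The "16-bit UID wraparound" mentioned in the filesystem notes above can be shown with plain arithmetic. The sketch below is an illustration under stated assumptions, not the kernel's code: this document does not spell out what the simple fixups do, so the second helper assumes they substitute a reserved overflow UID, and the default of 65534 is likewise an assumption. Both function names are hypothetical.

```shell
# Illustration only: store a 32-bit UID in a 16-bit field by truncation.
# Large UIDs silently alias onto small ones.
uid16_truncate() {
    echo $(( $1 & 0xffff ))
}

# Illustration only: substitute an overflow UID instead of truncating.
# $1 = UID, $2 = overflow UID (assumed default: 65534)
uid16_clamp() {
    if [ "$1" -gt 65535 ]; then
        echo "${2:-65534}"
    else
        echo "$1"
    fi
}
```

Truncation maps UID 65536 to 0, so a file owned by an ordinary user would appear to belong to root; substituting an overflow value avoids that alias at the cost of losing the real owner.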


@@ -241,7 +241,7 @@ Guest mitigation mechanisms
For further information about confining guests to a single or to a group
of cores consult the cpusets documentation:
https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.rst
https://www.kernel.org/doc/Documentation/admin-guide/cgroup-v1/cpusets.rst
.. _interrupt_isolation:


@@ -0,0 +1,105 @@
==========================================================
Linux support for random number generator in i8xx chipsets
==========================================================
Introduction
============
The hw_random framework is software that makes use of a
special hardware feature on your CPU or motherboard,
a Random Number Generator (RNG). The software has two parts:
a core providing the /dev/hwrng character device and its
sysfs support, plus a hardware-specific driver that plugs
into that core.
To make the most effective use of these mechanisms, you
should download the support software as well. Download the
latest version of the "rng-tools" package from the
hw_random driver's official Web site:
http://sourceforge.net/projects/gkernel/
Those tools use /dev/hwrng to fill the kernel entropy pool,
which is used internally and exported by the /dev/urandom and
/dev/random special files.
Theory of operation
===================
CHARACTER DEVICE. Using the standard open()
and read() system calls, you can read random data from
the hardware RNG device. This data is NOT CHECKED by any
fitness tests, and could potentially be bogus (if the
hardware is faulty or has been tampered with). Data is only
output if the hardware "has-data" flag is set, but nevertheless
a security-conscious person would run fitness tests on the
data before assuming it is truly random.
The rng-tools package uses such tests in "rngd", and lets you
run them by hand with a "rngtest" utility.
/dev/hwrng is char device major 10, minor 183.
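Reading the character device needs nothing more than open() and read(), so standard tools are enough for a quick look at the raw output. The sketch below dumps a few bytes as hexadecimal; the function name ``rng_read_hex`` is hypothetical, and, as noted above, the data it prints has not passed any fitness test.

```shell
# Hypothetical helper: read $1 bytes from the RNG device and print them as hex.
# $1 = number of bytes, $2 = device node (defaults to /dev/hwrng)
rng_read_hex() {
    count=$1
    dev=${2:-/dev/hwrng}
    dd if="$dev" bs=1 count="$count" 2>/dev/null | od -An -v -tx1 | tr -d ' \n'
    echo
}
```

``rng_read_hex 16`` prints 32 hexadecimal digits when the device delivers data; reading the device normally requires root privileges.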
CLASS DEVICE. There is a /sys/class/misc/hw_random node with
two unique attributes, "rng_available" and "rng_current". The
"rng_available" attribute lists the hardware-specific drivers
available, while "rng_current" lists the one which is currently
connected to /dev/hwrng. If your system has more than one
RNG available, you may change the one used by writing a name from
the list in "rng_available" into "rng_current".
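Switching the active RNG therefore comes down to one write, preferably after checking that the name is actually offered. A minimal sketch, assuming the sysfs node described above; the function name ``rng_select`` and the driver names in the usage note are hypothetical.

```shell
# Hypothetical helper: make $1 the current RNG if it is listed as available.
# $1 = driver name, $2 = sysfs node (defaults to the path given above)
rng_select() {
    name=$1
    dir=${2:-/sys/class/misc/hw_random}
    for avail in $(cat "$dir/rng_available"); do
        if [ "$avail" = "$name" ]; then
            echo "$name" > "$dir/rng_current"
            return 0
        fi
    done
    echo "rng_select: $name is not listed in rng_available" >&2
    return 1
}
```

If ``rng_available`` lists two drivers, say ``rng-a`` and ``rng-b``, then ``rng_select rng-b`` connects the second one to /dev/hwrng, while an unknown name leaves ``rng_current`` untouched.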
==========================================================================
Hardware driver for Intel/AMD/VIA Random Number Generators (RNG)
- Copyright 2000,2001 Jeff Garzik <jgarzik@pobox.com>
- Copyright 2000,2001 Philipp Rumpf <prumpf@mandrakesoft.com>
About the Intel RNG hardware, from the firmware hub datasheet
=============================================================
The Firmware Hub integrates a Random Number Generator (RNG)
using thermal noise generated from inherently random quantum
mechanical properties of silicon. When not generating new random
bits the RNG circuitry will enter a low power state. Intel will
provide a binary software driver to give third party software
access to our RNG for use as a security feature. At this time,
the RNG is only to be used with a system in an OS-present state.
Intel RNG Driver notes
======================
FIXME: support poll(2)
.. note::
request_mem_region was removed, for three reasons:
1) Only one RNG is supported by this driver;
2) The location used by the RNG is a fixed location in
MMIO-addressable memory;
3) users with properly working BIOS e820 handling will always
have the region in which the RNG is located reserved, so
request_mem_region calls always fail for proper setups.
However, for people who use mem=XX, BIOS e820 information is
**not** in /proc/iomem, and request_mem_region(RNG_ADDR) can
succeed.
Driver details
==============
Based on:
Intel 82802AB/82802AC Firmware Hub (FWH) Datasheet
May 1999 Order Number: 290658-002 R
Intel 82802 Firmware Hub:
Random Number Generator
Programmer's Reference Manual
December 1999 Order Number: 298029-001 R
Intel 82802 Firmware HUB Random Number Generator Driver
Copyright (c) 2000 Matt Sottek <msottek@quiknet.com>
Special thanks to Matt Sottek. I did the "guts", he
did the "brains" and all the testing.


@@ -85,8 +85,25 @@ configure specific aspects of kernel behavior to your liking.
perf-security
acpi/index
aoe/index
btmrvl
clearing-warn-once
cpu-load
cputopology
device-mapper/index
efi-stub
highuid
hw_random
iostats
kernel-per-CPU-kthreads
laptops/index
lcd-panel-cgram
ldm
lockup-watchdogs
numastat
pnp
rtc
svga
video-output
.. only:: subproject and html


@@ -0,0 +1,197 @@
=====================
I/O statistics fields
=====================
Since 2.4.20 (and some versions before, with patches), and 2.5.45,
more extensive disk statistics have been introduced to help measure disk
activity. Tools such as ``sar`` and ``iostat`` typically interpret these and do
the work for you, but in case you are interested in creating your own
tools, the fields are explained here.
In 2.4, the information is found as additional fields in
``/proc/partitions``. In 2.6 and later, the same information is found in two
places: one is in the file ``/proc/diskstats``, and the other is within
the sysfs file system, which must be mounted in order to obtain
the information. Throughout this document we'll assume that sysfs
is mounted on ``/sys``, although of course it may be mounted anywhere.
Both ``/proc/diskstats`` and sysfs use the same source for the information
and so should not differ.
Here are examples of these different formats::
2.4:
3 0 39082680 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
3 1 9221278 hda1 35486 0 35496 38030 0 0 0 0 0 38030 38030
2.6+ sysfs:
446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
35486 38030 38030 38030
2.6+ diskstats:
3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
3 1 hda1 35486 38030 38030 38030
4.18+ diskstats:
3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160 0 0 0 0
On 2.4 you might execute ``grep 'hda ' /proc/partitions``. On 2.6+, you have
a choice of ``cat /sys/block/hda/stat`` or ``grep 'hda ' /proc/diskstats``.
The advantage of one over the other is that the sysfs choice works well
if you are watching a known, small set of disks. ``/proc/diskstats`` may
be a better choice if you are watching a large number of disks because
you'll avoid the overhead of 50, 100, or 500 or more opens/closes with
each snapshot of your disk statistics.
In 2.4, the statistics fields are those after the device name. In
the above example, the first field of statistics would be 446216.
By contrast, in 2.6+ if you look at ``/sys/block/hda/stat``, you'll
find just the eleven fields, beginning with 446216. If you look at
``/proc/diskstats``, the eleven fields will be preceded by the major and
minor device numbers, and device name. Each of these formats provides
eleven fields of statistics, each meaning exactly the same things.
All fields except field 9 are cumulative since boot. Field 9 should
go to zero as I/Os complete; all others only increase (unless they
overflow and wrap). Yes, these are (32-bit or 64-bit) unsigned long
(native word size) numbers, and on a very busy or long-lived system they
may wrap. Applications should be prepared to deal with that; unless
your observations are measured in large numbers of minutes or hours,
they should not wrap twice before you notice them.
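A tool that samples these counters can absorb a single wrap when it computes the difference between two readings. The sketch below shows one way to do so; it assumes the counter wrapped at most once between samples, that the counter is narrower than the shell's own integers (so it suits 32-bit counters on a shell with 64-bit arithmetic), and the function name is hypothetical.

```shell
# Hypothetical helper: difference between two samples of a wrapping counter.
# $1 = previous sample, $2 = current sample,
# $3 = counter width in bits (default 32)
counter_delta() {
    old=$1 new=$2 bits=${3:-32}
    if [ "$new" -ge "$old" ]; then
        echo $(( new - old ))
    else
        # The counter wrapped once: add the full counter range back in.
        echo $(( new + (1 << bits) - old ))
    fi
}
```

For a 32-bit counter that went from 4294967290 to 5, ``counter_delta 4294967290 5`` prints 11 rather than a large negative number.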
Each set of stats only applies to the indicated device; if you want
system-wide stats you'll have to find all the devices and sum them all up.
Field 1 -- # of reads completed
This is the total number of reads completed successfully.
Field 2 -- # of reads merged, field 6 -- # of writes merged
Reads and writes which are adjacent to each other may be merged for
efficiency. Thus two 4K reads may become one 8K read before it is
ultimately handed to the disk, and so it will be counted (and queued)
as only one I/O. This field lets you know how often this was done.
Field 3 -- # of sectors read
This is the total number of sectors read successfully.
Field 4 -- # of milliseconds spent reading
This is the total number of milliseconds spent by all reads (as
measured from __make_request() to end_that_request_last()).
Field 5 -- # of writes completed
This is the total number of writes completed successfully.
Field 6 -- # of writes merged
See the description of field 2.
Field 7 -- # of sectors written
This is the total number of sectors written successfully.
Field 8 -- # of milliseconds spent writing
This is the total number of milliseconds spent by all writes (as
measured from __make_request() to end_that_request_last()).
Field 9 -- # of I/Os currently in progress
The only field that should go to zero. Incremented as requests are
given to appropriate struct request_queue and decremented as they finish.
Field 10 -- # of milliseconds spent doing I/Os
This field increases so long as field 9 is nonzero.
Since 5.0 this field counts jiffies when at least one request was
started or completed. If a request runs for more than 2 jiffies, some
I/O time will not be accounted for unless there are other requests.
Field 11 -- weighted # of milliseconds spent doing I/Os
This field is incremented at each I/O start, I/O completion, I/O
merge, or read of these stats by the number of I/Os in progress
(field 9) times the number of milliseconds spent doing I/O since the
last update of this field. This can provide an easy measure of both
I/O completion time and the backlog that may be accumulating.
Field 12 -- # of discards completed
This is the total number of discards completed successfully.
Field 13 -- # of discards merged
See the description of field 2.
Field 14 -- # of sectors discarded
This is the total number of sectors discarded successfully.
Field 15 -- # of milliseconds spent discarding
This is the total number of milliseconds spent by all discards (as
measured from __make_request() to end_that_request_last()).
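The field numbers above map directly onto a ``/proc/diskstats`` line once the major number, minor number and device name are skipped. The sketch below extracts a field by its number and derives one simple figure, the average time per completed read (field 4 divided by field 1). The function names are hypothetical, and the result is an average since boot, not a current rate.

```shell
# Hypothetical helper: print statistics field $2 (numbered as above) from
# the /proc/diskstats line $1, skipping major, minor and device name.
diskstat_field() {
    printf '%s\n' "$1" | awk -v n="$2" '{ print $(n + 3) }'
}

# Average milliseconds per completed read: field 4 divided by field 1.
avg_read_ms() {
    reads=$(diskstat_field "$1" 1)
    ms=$(diskstat_field "$1" 4)
    if [ "$reads" -gt 0 ]; then
        echo $(( ms / reads ))
    else
        echo 0
    fi
}
```

Applied to the ``hda`` line from the 2.6+ diskstats example, ``avg_read_ms`` yields 9, that is, roughly 9 ms per read (4382310 ms over 446216 reads, rounded down).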
To avoid introducing performance bottlenecks, no locks are held while
modifying these counters. This implies that minor inaccuracies may be
introduced when changes collide, so (for instance) adding up all the
read I/Os issued per partition should equal those made to the disks ...
but due to the lack of locking it may only be very close.
In 2.6+, there are counters for each CPU, which make the lack of locking
almost a non-issue. When the statistics are read, the per-CPU counters
are summed (possibly overflowing the unsigned long variable they are
summed to) and the result given to the user. There is no convenient
user interface for accessing the per-CPU counters themselves.
Disks vs Partitions
-------------------
There were significant changes between 2.4 and 2.6+ in the I/O subsystem.
As a result, some statistic information disappeared. The translation from
a disk address relative to a partition to the disk address relative to
the host disk happens much earlier. All merges and timings now happen
at the disk level rather than at both the disk and partition level as
in 2.4. Consequently, you'll see a different statistics output on 2.6+ for
partitions from that for disks. There are only *four* fields available
for partitions on 2.6+ machines. This is reflected in the examples above.
Field 1 -- # of reads issued
This is the total number of reads issued to this partition.
Field 2 -- # of sectors read
This is the total number of sectors requested to be read from this
partition.
Field 3 -- # of writes issued
This is the total number of writes issued to this partition.
Field 4 -- # of sectors written
This is the total number of sectors requested to be written to
this partition.
Note that since the address is translated to a disk-relative one, and no
record of the partition-relative address is kept, the subsequent success
or failure of the read cannot be attributed to the partition. In other
words, the number of reads for partitions is counted slightly before time
of queuing for partitions, and at completion for whole disks. This is
a subtle distinction that is probably uninteresting for most cases.
More significant is the error induced by counting the numbers of
reads/writes before merges for partitions and after for disks. Since a
typical workload usually contains a lot of successive and adjacent requests,
the number of reads/writes issued can be several times higher than the
number of reads/writes completed.
In 2.6.25, the full statistic set is again available for partitions and
disk and partition statistics are consistent again. Since we still don't
keep record of the partition-relative address, an operation is attributed to
the partition which contains the first sector of the request after the
eventual merges. As requests can be merged across partitions, this could lead
to some (probably insignificant) inaccuracy.
Additional notes
----------------
In 2.6+, sysfs is not mounted by default. If your distribution of
Linux hasn't added it already, here's the line you'll want to add to
your ``/etc/fstab``::
none /sys sysfs defaults 0 0
In 2.6+, all disk statistics were removed from ``/proc/stat``. In 2.4, they
appear in both ``/proc/partitions`` and ``/proc/stat``, although the ones in
``/proc/stat`` take a very different format from those in ``/proc/partitions``
(see proc(5), if your system has it.)
-- ricklind@us.ibm.com

@@ -5066,7 +5066,7 @@
vga= [BOOT,X86-32] Select a particular video mode
See Documentation/x86/boot.rst and
Documentation/svga.txt.
Documentation/admin-guide/svga.rst.
Use vga=ask for menu.
This is actually a boot loader parameter; the value is
passed to the kernel using a special protocol.

@@ -0,0 +1,356 @@
==========================================
Reducing OS jitter due to per-cpu kthreads
==========================================
This document lists per-CPU kthreads in the Linux kernel and presents
options to control their OS jitter. Note that non-per-CPU kthreads are
not listed here. To reduce OS jitter from non-per-CPU kthreads, bind
them to a "housekeeping" CPU dedicated to such work.
References
==========
- Documentation/IRQ-affinity.txt: Binding interrupts to sets of CPUs.
- Documentation/admin-guide/cgroup-v1: Using cgroups to bind tasks to sets of CPUs.
- man taskset: Using the taskset command to bind tasks to sets
of CPUs.
- man sched_setaffinity: Using the sched_setaffinity() system
call to bind tasks to sets of CPUs.
- /sys/devices/system/cpu/cpuN/online: Control CPU N's hotplug state,
writing "0" to offline and "1" to online.
- In order to locate kernel-generated OS jitter on CPU N:
cd /sys/kernel/debug/tracing
echo 1 > max_graph_depth # Increase the "1" for more detail
echo function_graph > current_tracer
# run workload
cat per_cpu/cpuN/trace
kthreads
========
Name:
ehca_comp/%u
Purpose:
Periodically process Infiniband-related work.
To reduce its OS jitter, do any of the following:
1. Don't use eHCA Infiniband hardware, instead choosing hardware
that does not require per-CPU kthreads. This will prevent these
kthreads from being created in the first place. (This will
work for most people, as this hardware, though important, is
relatively old and is produced in relatively low unit volumes.)
2. Do all eHCA-Infiniband-related work on other CPUs, including
interrupts.
3. Rework the eHCA driver so that its per-CPU kthreads are
provisioned only on selected CPUs.
Name:
irq/%d-%s
Purpose:
Handle threaded interrupts.
To reduce its OS jitter, do the following:
1. Use irq affinity to force the irq threads to execute on
some other CPU.
Name:
kcmtpd_ctr_%d
Purpose:
Handle Bluetooth work.
To reduce its OS jitter, do one of the following:
1. Don't use Bluetooth, in which case these kthreads won't be
created in the first place.
2. Use irq affinity to force Bluetooth-related interrupts to
occur on some other CPU and furthermore initiate all
Bluetooth activity on some other CPU.
Name:
ksoftirqd/%u
Purpose:
Execute softirq handlers when threaded or when under heavy load.
To reduce its OS jitter, each softirq vector must be handled
separately as follows:
TIMER_SOFTIRQ
-------------
Do all of the following:
1. To the extent possible, keep the CPU out of the kernel when it
is non-idle, for example, by avoiding system calls and by forcing
both kernel threads and interrupts to execute elsewhere.
2. Build with CONFIG_HOTPLUG_CPU=y. After boot completes, force
the CPU offline, then bring it back online. This forces
recurring timers to migrate elsewhere. If you are concerned
with multiple CPUs, force them all offline before bringing the
first one back online. Once you have onlined the CPUs in question,
do not offline any other CPUs, because doing so could force the
timer back onto one of the CPUs in question.
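The offline/online cycle in step 2 can be scripted. The following is a minimal sketch rather than a definitive tool: the function name cycle_cpus and the SYSFS_CPU_ROOT override are hypothetical names introduced here for illustration, and the control files are the /sys/devices/system/cpu/cpuN/online files listed under References. It needs root privileges and a kernel built with CONFIG_HOTPLUG_CPU=y.

```shell
# Sketch: force every listed CPU offline first, then bring them all back
# online, following the ordering recommended in the text above.
# SYSFS_CPU_ROOT is a hypothetical override that only exists to allow
# dry runs against a scratch directory.
cycle_cpus() {
    root=${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}
    for n in "$@"; do
        echo 0 > "$root/cpu$n/online" || return 1
    done
    for n in "$@"; do
        echo 1 > "$root/cpu$n/online" || return 1
    done
}
```

For example, "cycle_cpus 2 3" would cycle CPUs 2 and 3 together. After that, avoid offlining any other CPU, for the reason given above.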
NET_TX_SOFTIRQ and NET_RX_SOFTIRQ
---------------------------------
Do all of the following:
1. Force networking interrupts onto other CPUs.
2. Initiate any network I/O on other CPUs.
3. Once your application has started, prevent CPU-hotplug operations
from being initiated from tasks that might run on the CPU to
be de-jittered. (It is OK to force this CPU offline and then
bring it back online before you start your application.)
BLOCK_SOFTIRQ
-------------
Do all of the following:
1. Force block-device interrupts onto some other CPU.
2. Initiate any block I/O on other CPUs.
3. Once your application has started, prevent CPU-hotplug operations
from being initiated from tasks that might run on the CPU to
be de-jittered. (It is OK to force this CPU offline and then
bring it back online before you start your application.)
IRQ_POLL_SOFTIRQ
----------------
Do all of the following:
1. Force block-device interrupts onto some other CPU.
2. Initiate any block I/O and block-I/O polling on other CPUs.
3. Once your application has started, prevent CPU-hotplug operations
from being initiated from tasks that might run on the CPU to
be de-jittered. (It is OK to force this CPU offline and then
bring it back online before you start your application.)
TASKLET_SOFTIRQ
---------------
Do one or more of the following:
1. Avoid use of drivers that use tasklets. (Such drivers will contain
calls to things like tasklet_schedule().)
2. Convert all drivers that you must use from tasklets to workqueues.
3. Force interrupts for drivers using tasklets onto other CPUs,
and also do I/O involving these drivers on other CPUs.
SCHED_SOFTIRQ
-------------
Do all of the following:
1. Avoid sending scheduler IPIs to the CPU to be de-jittered,
for example, ensure that at most one runnable kthread is present
on that CPU. If a thread that expects to run on the de-jittered
CPU awakens, the scheduler will send an IPI that can result in
a subsequent SCHED_SOFTIRQ.
2. Build with CONFIG_NO_HZ_FULL=y and ensure that the CPU to be de-jittered
is marked as an adaptive-ticks CPU using the "nohz_full="
boot parameter. This reduces the number of scheduler-clock
interrupts that the de-jittered CPU receives, minimizing its
chances of being selected to do the load balancing work that
runs in SCHED_SOFTIRQ context.
3. To the extent possible, keep the CPU out of the kernel when it
is non-idle, for example, by avoiding system calls and by
forcing both kernel threads and interrupts to execute elsewhere.
This further reduces the number of scheduler-clock interrupts
received by the de-jittered CPU.
HRTIMER_SOFTIRQ
---------------
Do all of the following:
1. To the extent possible, keep the CPU out of the kernel when it
is non-idle. For example, avoid system calls and force both
kernel threads and interrupts to execute elsewhere.
2. Build with CONFIG_HOTPLUG_CPU=y. Once boot completes, force the
CPU offline, then bring it back online. This forces recurring
timers to migrate elsewhere. If you are concerned with multiple
CPUs, force them all offline before bringing the first one
back online. Once you have onlined the CPUs in question, do not
offline any other CPUs, because doing so could force the timer
back onto one of the CPUs in question.
RCU_SOFTIRQ
-----------
Do at least one of the following:
1. Offload callbacks and keep the CPU in either dyntick-idle or
adaptive-ticks state by doing all of the following:
a. Build with CONFIG_NO_HZ_FULL=y and ensure that the CPU to be
de-jittered is marked as an adaptive-ticks CPU using the
"nohz_full=" boot parameter. Bind the rcuo kthreads to
housekeeping CPUs, which can tolerate OS jitter.
b. To the extent possible, keep the CPU out of the kernel
when it is non-idle, for example, by avoiding system
calls and by forcing both kernel threads and interrupts
to execute elsewhere.
2. Enable RCU to do its processing remotely via dyntick-idle by
doing all of the following:
a. Build with CONFIG_NO_HZ=y and CONFIG_RCU_FAST_NO_HZ=y.
b. Ensure that the CPU goes idle frequently, allowing other
CPUs to detect that it has passed through an RCU quiescent
state. If the kernel is built with CONFIG_NO_HZ_FULL=y,
userspace execution also allows other CPUs to detect that
the CPU in question has passed through a quiescent state.
c. To the extent possible, keep the CPU out of the kernel
when it is non-idle, for example, by avoiding system
calls and by forcing both kernel threads and interrupts
to execute elsewhere.
Name:
kworker/%u:%d%s (cpu, id, priority)
Purpose:
Execute workqueue requests.
To reduce its OS jitter, do any of the following:
1. Run your workload at a real-time priority, which will allow
preempting the kworker daemons.
2. A given workqueue can be made visible in the sysfs filesystem
by passing the WQ_SYSFS flag to that workqueue's alloc_workqueue().
Such a workqueue can be confined to a given subset of the
CPUs using the ``/sys/devices/virtual/workqueue/*/cpumask`` sysfs
files. The set of WQ_SYSFS workqueues can be displayed using
"ls /sys/devices/virtual/workqueue". That said, the workqueues
maintainer would like to caution people against indiscriminately
sprinkling WQ_SYSFS across all the workqueues. The reason for
caution is that it is easy to add WQ_SYSFS, but because sysfs is
part of the formal user/kernel API, it can be nearly impossible
to remove it, even if its addition was a mistake.
3. Do any of the following needed to avoid jitter that your
application cannot tolerate:
a. Build your kernel with CONFIG_SLUB=y rather than
CONFIG_SLAB=y, thus avoiding the slab allocator's periodic
use of each CPU's workqueues to run its cache_reap()
function.
b. Avoid using oprofile, thus avoiding OS jitter from
wq_sync_buffer().
c. Limit your CPU frequency so that a CPU-frequency
governor is not required, possibly enlisting the aid of
special heatsinks or other cooling technologies. If done
correctly, and if your CPU architecture permits, you should
be able to build your kernel with CONFIG_CPU_FREQ=n to
avoid the CPU-frequency governor periodically running
on each CPU, including cs_dbs_timer() and od_dbs_timer().
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
d. As of v3.18, Christoph Lameter's on-demand vmstat workers
commit prevents OS jitter due to vmstat_update() on
CONFIG_SMP=y systems. Before v3.18, it is not possible
to entirely get rid of the OS jitter, but you can
decrease its frequency by writing a large value to
/proc/sys/vm/stat_interval. The default value is HZ,
for an interval of one second. Of course, larger values
will make your virtual-memory statistics update more
slowly. You can also run your workload at
a real-time priority, thus preempting vmstat_update(),
but if your workload is CPU-bound, this is a bad idea.
However, there is an RFC patch from Christoph Lameter
(based on an earlier one from Gilad Ben-Yossef) that
reduces or even eliminates vmstat overhead for some
workloads at https://lkml.org/lkml/2013/9/4/379.
e. Boot with "elevator=noop" to avoid workqueue use by
the block layer.
f. If running on high-end powerpc servers, build with
CONFIG_PPC_RTAS_DAEMON=n. This prevents the RTAS
daemon from running on each CPU every second or so.
(This will require editing Kconfig files and will defeat
this platform's RAS functionality.) This avoids jitter
due to the rtas_event_scan() function.
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
g. If running on Cell Processor, build your kernel with
CBE_CPUFREQ_SPU_GOVERNOR=n to avoid OS jitter from
spu_gov_work().
WARNING: Please check your CPU specifications to
make sure that this is safe on your particular system.
h. If running on PowerMAC, build your kernel with
CONFIG_PMAC_RACKMETER=n to disable the CPU-meter,
avoiding OS jitter from rackmeter_do_timer().
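For item 2 above, the per-workqueue cpumask files are, as far as we know, written as a hexadecimal CPU bitmask. The following sketch builds such a mask from a list of housekeeping CPU numbers. The helper name cpus_to_mask is hypothetical, and the sketch assumes CPU numbers below 63 so that the mask fits in ordinary shell arithmetic.

```shell
# Sketch: turn a list of CPU numbers into a hexadecimal bitmask,
# with bit N set for each CPU N given on the command line.
# Assumes CPU numbers below 63 (the limit of shell integer arithmetic).
cpus_to_mask() {
    mask=0
    for n in "$@"; do
        mask=$(( mask | (1 << n) ))
    done
    printf '%x\n' "$mask"
}
```

The output of, say, "cpus_to_mask 0 1" can then be written into the cpumask file of the WQ_SYSFS workqueue that you want to confine to CPUs 0 and 1.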
Name:
rcuc/%u
Purpose:
Execute RCU callbacks in CONFIG_RCU_BOOST=y kernels.
To reduce its OS jitter, do at least one of the following:
1. Build the kernel with CONFIG_PREEMPT=n. This prevents these
kthreads from being created in the first place, and also obviates
the need for RCU priority boosting. This approach is feasible
for workloads that do not require high degrees of responsiveness.
2. Build the kernel with CONFIG_RCU_BOOST=n. This prevents these
kthreads from being created in the first place. This approach
is feasible only if your workload never requires RCU priority
boosting, for example, if you ensure frequent idle time on all
CPUs that might execute within the kernel.
3. Build with CONFIG_RCU_NOCB_CPU=y and boot with the rcu_nocbs=
boot parameter offloading RCU callbacks from all CPUs susceptible
to OS jitter. This approach prevents the rcuc/%u kthreads from
having any work to do, so that they are never awakened.
4. Ensure that the CPU never enters the kernel, and, in particular,
avoid initiating any CPU hotplug operations on this CPU. This is
another way of preventing any callbacks from being queued on the
CPU, again preventing the rcuc/%u kthreads from having any work
to do.
Name:
rcuop/%d and rcuos/%d
Purpose:
Offload RCU callbacks from the corresponding CPU.
To reduce its OS jitter, do at least one of the following:
1. Use affinity, cgroups, or other mechanism to force these kthreads
to execute on some other CPU.
2. Build with CONFIG_RCU_NOCB_CPU=n, which will prevent these
kthreads from being created in the first place. However, please
note that this will not eliminate OS jitter, but will instead
shift it to RCU_SOFTIRQ.
Name:
watchdog/%u
Purpose:
Detect software lockups on each CPU.
To reduce its OS jitter, do at least one of the following:
1. Build with CONFIG_LOCKUP_DETECTOR=n, which will prevent these
kthreads from being created in the first place.
2. Boot with "nosoftlockup=0", which will also prevent these kthreads
from being created. Other related watchdog and softlockup boot
parameters may be found in Documentation/admin-guide/kernel-parameters.rst
and Documentation/watchdog/watchdog-parameters.rst.
3. Echo a zero to /proc/sys/kernel/watchdog to disable the
watchdog timer.
4. Echo a large number to /proc/sys/kernel/watchdog_thresh in
order to reduce the frequency of OS jitter due to the watchdog
timer down to a level that is acceptable for your workload.

@@ -0,0 +1,27 @@
======================================
Parallel port LCD/Keypad Panel support
======================================
Some LCDs allow you to define up to 8 characters, mapped to ASCII
characters 0 to 7. The escape code to define a new character is
'\e[LG' followed by one digit from 0 to 7, representing the character
number, and up to 8 pairs of hex digits terminated by a semi-colon
(';'). Each pair of digits represents a line, with 1-bits for each
illuminated pixel with LSB on the right. Lines are numbered from the
top of the character to the bottom. On a 5x7 matrix, only the 5 lower
bits of the first 7 bytes are used for each character. If the string
is incomplete, only complete lines will be redefined. Here are some
examples::
printf "\e[LG0010101050D1F0C04;" => 0 = [enter]
printf "\e[LG1040E1F0000000000;" => 1 = [up]
printf "\e[LG2000000001F0E0400;" => 2 = [down]
printf "\e[LG3040E1F001F0E0400;" => 3 = [up-down]
printf "\e[LG40002060E1E0E0602;" => 4 = [left]
printf "\e[LG500080C0E0F0E0C08;" => 5 = [right]
printf "\e[LG60016051516141400;" => 6 = "IP"
printf "\e[LG00103071F1F070301;" => big speaker
printf "\e[LG00002061E1E060200;" => small speaker
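The encoding described above can be generated rather than typed by hand. The following is a minimal sketch; the helper name lcd_char_seq is hypothetical. It prints the sequence in the same textual form as the examples (with a literal backslash-e), so its output is meant to be used as a printf format string.

```shell
# Sketch: build the '\e[LG' character-definition sequence.
# $1 is the character number (0-7); the remaining arguments are the
# rows, top to bottom, as integers whose low 5 bits are the pixels.
lcd_char_seq() {
    idx=$1
    shift
    out="\\e[LG${idx}"
    for row in "$@"; do
        # Each row becomes one pair of upper-case hex digits.
        out="${out}$(printf '%02X' "$row")"
    done
    printf '%s;\n' "$out"
}
```

For instance, "lcd_char_seq 1 4 14 31 0 0 0 0 0" reproduces the [up] example: rows 4, 14 and 31 are the bit patterns 00100, 01110 and 11111. The result can be passed to printf and redirected to the LCD device of your setup.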
Willy

@@ -0,0 +1,121 @@
==========================================
LDM - Logical Disk Manager (Dynamic Disks)
==========================================
:Author: Originally Written by FlatCap - Richard Russon <ldm@flatcap.org>.
:Last Updated: Anton Altaparmakov on 30 March 2007 for Windows Vista.
Overview
--------
Windows 2000, XP, and Vista use a new partitioning scheme. It is a complete
replacement for the MSDOS style partitions. It stores its information in a
1MiB journalled database at the end of the physical disk. The size of
partitions is limited only by disk space. The maximum number of partitions is
nearly 2000.
Any partitions created under the LDM are called "Dynamic Disks". There are no
longer any primary or extended partitions. Normal MSDOS style partitions are
now known as Basic Disks.
If you wish to use Spanned, Striped, Mirrored or RAID 5 Volumes, you must use
Dynamic Disks. The journalling allows Windows to make changes to these
partitions and filesystems without the need to reboot.
Once the LDM driver has divided up the disk, you can use the MD driver to
assemble any multi-partition volumes, e.g. Stripes, RAID5.
To prevent legacy applications from repartitioning the disk, the LDM creates a
dummy MSDOS partition containing one disk-sized partition. This is what is
supported with the Linux LDM driver.
A newer approach that has been implemented with Vista is to put LDM on top of a
GPT label disk. This is not supported by the Linux LDM driver yet.
Example
-------
Below we have a 50MiB disk, divided into seven partitions.
.. note::
The missing 1MiB at the end of the disk is where the LDM database is
stored.
+-------++--------------+---------+-----++--------------+---------+----+
|Device || Offset Bytes | Sectors | MiB || Size Bytes | Sectors | MiB|
+=======++==============+=========+=====++==============+=========+====+
|hda || 0 | 0 | 0 || 52428800 | 102400 | 50|
+-------++--------------+---------+-----++--------------+---------+----+
|hda1 || 51380224 | 100352 | 49 || 1048576 | 2048 | 1|
+-------++--------------+---------+-----++--------------+---------+----+
|hda2 || 16384 | 32 | 0 || 6979584 | 13632 | 6|
+-------++--------------+---------+-----++--------------+---------+----+
|hda3 || 6995968 | 13664 | 6 || 10485760 | 20480 | 10|
+-------++--------------+---------+-----++--------------+---------+----+
|hda4 || 17481728 | 34144 | 16 || 4194304 | 8192 | 4|
+-------++--------------+---------+-----++--------------+---------+----+
|hda5 || 21676032 | 42336 | 20 || 5242880 | 10240 | 5|
+-------++--------------+---------+-----++--------------+---------+----+
|hda6 || 26918912 | 52576 | 25 || 10485760 | 20480 | 10|
+-------++--------------+---------+-----++--------------+---------+----+
|hda7 || 37404672 | 73056 | 35 || 13959168 | 27264 | 13|
+-------++--------------+---------+-----++--------------+---------+----+
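The byte and MiB columns of the table follow from the sector columns. A small sketch of that arithmetic is shown here; the helper names are hypothetical, and the 512-byte sector size is an assumption inferred from the table's own numbers (102400 sectors for 52428800 bytes).

```shell
# Sketch: reproduce the table's unit conversions.
# Assumes 512-byte sectors, as the table's numbers imply.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Whole MiB, rounded down, matching the table's MiB columns.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}
```

For hda1, "sectors_to_bytes 100352" gives the offset in bytes, and feeding that result to bytes_to_mib gives the 49 MiB shown, which leaves the final 1 MiB of the 50 MiB disk for the LDM database.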
The LDM Database may not store the partitions in the order that they appear on
disk, but the driver will sort them.
When Linux boots, you will see something like::
hda: 102400 sectors w/32KiB Cache, CHS=50/64/32
hda: [LDM] hda1 hda2 hda3 hda4 hda5 hda6 hda7
Compiling LDM Support
---------------------
To enable LDM, choose the following two options:
- "Advanced partition selection" CONFIG_PARTITION_ADVANCED
- "Windows Logical Disk Manager (Dynamic Disk) support" CONFIG_LDM_PARTITION
If you believe the driver isn't working as it should, you can enable the extra
debugging code. This will produce a LOT of output. The option is:
- "Windows LDM extra logging" CONFIG_LDM_DEBUG
N.B. The partition code cannot be compiled as a module.
As with all the partition code, if the driver doesn't see signs of its type of
partition, it will pass control to another driver, so there is no harm in
enabling it.
If you have Dynamic Disks but don't enable the driver, then all you will see
is a dummy MSDOS partition filling the whole disk. You won't be able to mount
any of the volumes on the disk.
Booting
-------
If you enable LDM support, then lilo is capable of booting from any of the
discovered partitions. However, grub does not understand the LDM partitioning
and cannot boot from a Dynamic Disk.
More Documentation
------------------
There is an Overview of the LDM together with complete Technical Documentation.
It is available for download.
http://www.linux-ntfs.org/
If you have any LDM questions that aren't answered in the documentation, email
me.
Cheers,
FlatCap - Richard Russon
ldm@flatcap.org

@@ -0,0 +1,83 @@
===============================================================
Softlockup detector and hardlockup detector (aka nmi_watchdog)
===============================================================
The Linux kernel can act as a watchdog to detect both soft and hard
lockups.
A 'softlockup' is defined as a bug that causes the kernel to loop in
kernel mode for more than 20 seconds (see "Implementation" below for
details), without giving other tasks a chance to run. The current
stack trace is displayed upon detection and, by default, the system
will stay locked up. Alternatively, the kernel can be configured to
panic; a sysctl, "kernel.softlockup_panic", a kernel parameter,
"softlockup_panic" (see "Documentation/admin-guide/kernel-parameters.rst" for
details), and a compile option, "BOOTPARAM_SOFTLOCKUP_PANIC", are
provided for this.
A 'hardlockup' is defined as a bug that causes the CPU to loop in
kernel mode for more than 10 seconds (see "Implementation" below for
details), without letting other interrupts have a chance to run.
Similarly to the softlockup case, the current stack trace is displayed
upon detection and the system will stay locked up unless the default
behavior is changed, which can be done through a sysctl,
'hardlockup_panic', a compile time knob, "BOOTPARAM_HARDLOCKUP_PANIC",
and a kernel parameter, "nmi_watchdog"
(see "Documentation/admin-guide/kernel-parameters.rst" for details).
The panic option can be used in combination with panic_timeout (this
timeout is set through the confusingly named "kernel.panic" sysctl),
to cause the system to reboot automatically after a specified amount
of time.
Implementation
==============
The soft and hard lockup detectors are built on top of the hrtimer and
perf subsystems, respectively. A direct consequence of this is that,
in principle, they should work in any architecture where these
subsystems are present.
A periodic hrtimer runs to generate interrupts and kick the watchdog
task. An NMI perf event is generated every "watchdog_thresh"
(compile-time initialized to 10 and configurable through sysctl of the
same name) seconds to check for hardlockups. If any CPU in the system
does not receive any hrtimer interrupt during that time, the
'hardlockup detector' (the handler for the NMI perf event) will
generate a kernel warning or call panic, depending on the
configuration.
The watchdog task is a high priority kernel thread that updates a
timestamp every time it is scheduled. If that timestamp is not updated
for 2*watchdog_thresh seconds (the softlockup threshold) the
'softlockup detector' (coded inside the hrtimer callback function)
will dump useful debug information to the system log, after which it
will call panic if it was instructed to do so or resume execution of
other kernel code.
The period of the hrtimer is 2*watchdog_thresh/5, which means it has
two or three chances to generate an interrupt before the hardlockup
detector kicks in.
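The relationship between these intervals can be worked through with a little arithmetic. This is only a sketch with hypothetical helper names; it uses whole seconds in shell integer arithmetic, so values of watchdog_thresh that are not multiples of 5 are rounded down here.

```shell
# Sketch: derive the intervals described above from watchdog_thresh.
# Integer arithmetic in whole seconds; results are rounded down.

# Softlockup threshold: 2 * watchdog_thresh seconds.
softlockup_threshold() {
    echo $(( 2 * $1 ))
}

# hrtimer period: 2 * watchdog_thresh / 5 seconds.
hrtimer_period() {
    echo $(( 2 * $1 / 5 ))
}
```

With the default watchdog_thresh of 10, the softlockup threshold is 20 seconds and the hrtimer period is 4 seconds. A 10 second hardlockup window therefore spans 2.5 hrtimer periods, which is where the "two or three chances" comes from. The current value can be read from /proc/sys/kernel/watchdog_thresh.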
As explained above, a kernel knob is provided that allows
administrators to configure the period of the hrtimer and the perf
event. The right value for a particular environment is a trade-off
between fast response to lockups and detection overhead.
By default, the watchdog runs on all online cores. However, on a
kernel configured with NO_HZ_FULL, by default the watchdog runs only
on the housekeeping cores, not the cores specified in the "nohz_full"
boot argument. If we allowed the watchdog to run by default on
the "nohz_full" cores, we would have to run timer ticks to activate
the scheduler, which would prevent the "nohz_full" functionality
from protecting the user code on those cores from the kernel.
Of course, disabling it by default on the nohz_full cores means that
when those cores do enter the kernel, by default we will not be
able to detect if they lock up. However, allowing the watchdog
to continue to run on the housekeeping (non-tickless) cores means
that we will continue to detect lockups properly on those cores.
In either case, the set of cores excluded from running the watchdog
may be adjusted via the kernel.watchdog_cpumask sysctl. For
nohz_full cores, this may be useful for debugging a case where the
kernel seems to be hanging on the nohz_full cores.

@@ -0,0 +1,25 @@
=====================
CMA Debugfs Interface
=====================
The CMA debugfs interface is useful to retrieve basic information out of the
different CMA areas and to test allocation/release in each of the areas.
Each CMA zone represents a directory under <debugfs>/cma/, indexed by the
kernel's CMA index. So the first CMA zone would be:
<debugfs>/cma/cma-0
The structure of the files created under that directory is as follows:
- [RO] base_pfn: The base PFN (Page Frame Number) of the zone.
- [RO] count: Amount of memory in the CMA area.
- [RO] order_per_bit: Order of pages represented by one bit.
- [RO] bitmap: The bitmap of page states in the zone.
- [WO] alloc: Allocate N pages from that CMA area. For example::
echo 5 > <debugfs>/cma/cma-2/alloc
would try to allocate 5 pages from the cma-2 area.
- [WO] free: Free N pages from that CMA area, similar to the above.
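The alloc and free files can be wrapped in a small helper that checks that the CMA area exists before writing to it. This is a sketch only: the function name cma_request is hypothetical, and the debugfs mount point is passed in explicitly because it depends on the system (it is commonly /sys/kernel/debug).

```shell
# Sketch: write a page count to a CMA area's alloc or free file.
# $1 = debugfs mount point, $2 = CMA index,
# $3 = "alloc" or "free",   $4 = number of pages
cma_request() {
    file="$1/cma/cma-$2/$3"
    if [ ! -e "$file" ]; then
        echo "missing: $file" >&2
        return 1
    fi
    echo "$4" > "$file"
}
```

For example, "cma_request /sys/kernel/debug 2 alloc 5" corresponds to the echo command shown above, and the same call with "free" would release 5 pages again.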

@@ -26,6 +26,7 @@ the Linux memory management.
:maxdepth: 1
concepts
cma_debugfs
hugetlbpage
idle_page_tracking
ksm

@@ -0,0 +1,30 @@
===============================
Numa policy hit/miss statistics
===============================
/sys/devices/system/node/node*/numastat
All units are pages. Hugepages have separate counters.
=============== ============================================================
numa_hit A process wanted to allocate memory from this node,
and succeeded.
numa_miss A process wanted to allocate memory from another node,
but ended up with memory from this node.
numa_foreign A process wanted to allocate on this node,
but ended up with memory from another one.
local_node A process ran on this node and got memory from it.
other_node A process ran on this node and got memory from another node.
interleave_hit Interleaving wanted to allocate from this node
and succeeded.
=============== ============================================================
For easier reading you can use the numastat utility from the numactl package
(http://oss.sgi.com/projects/libnuma/). Note that it only works
well right now on machines with a small number of CPUs.
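The counters can also be processed directly. The sketch below assumes that a numastat file contains one "name value" pair per line, and computes what share of the pages allocated on a node were originally intended for another node, that is numa_miss / (numa_hit + numa_miss). The function name numa_miss_pct is hypothetical.

```shell
# Sketch: percentage of a node's allocations that were misses.
# Reads a numastat file on standard input, assuming one
# "name value" pair per line; prints a whole-number percentage.
numa_miss_pct() {
    awk '$1 == "numa_hit"  { hit = $2 }
         $1 == "numa_miss" { miss = $2 }
         END {
             total = hit + miss
             if (total > 0)
                 printf "%d\n", 100 * miss / total
             else
                 print 0
         }'
}
```

It could be used as "numa_miss_pct < /sys/devices/system/node/node0/numastat". A persistently high value suggests that processes preferring other nodes are spilling over onto this one.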

@@ -0,0 +1,292 @@
=================================
Linux Plug and Play Documentation
=================================
:Author: Adam Belay <ambx1@neo.rr.com>
:Last updated: Oct. 16, 2002
Overview
--------
Plug and Play provides a means of detecting and setting resources for legacy or
otherwise unconfigurable devices. The Linux Plug and Play Layer provides these
services to compatible drivers.
The User Interface
------------------
The Linux Plug and Play user interface provides a means to activate PnP devices
for legacy and user level drivers that do not support Linux Plug and Play. The
user interface is integrated into sysfs.
In addition to the standard sysfs file the following are created in each
device's directory:
- id - displays a list of supported EISA IDs
- options - displays possible resource configurations
- resources - displays currently allocated resources and allows resource changes
activating a device
^^^^^^^^^^^^^^^^^^^
::
# echo "auto" > resources
This will invoke the automatic resource config system to activate the device.
manually activating a device
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
::
# echo "manual <depnum> <mode>" > resources
<depnum> - the configuration number
<mode> - static or dynamic
static = for next boot
dynamic = now
disabling a device
^^^^^^^^^^^^^^^^^^
::
# echo "disable" > resources
EXAMPLE:
Suppose you need to activate the floppy disk controller.
1. change to the proper directory, in my case it is
/driver/bus/pnp/devices/00:0f::
# cd /driver/bus/pnp/devices/00:0f
# cat name
PC standard floppy disk controller
2. check if the device is already active::
# cat resources
DISABLED
- Notice the string "DISABLED". This means the device is not active.
3. check the device's possible configurations (optional)::
# cat options
Dependent: 01 - Priority acceptable
port 0x3f0-0x3f0, align 0x7, size 0x6, 16-bit address decoding
port 0x3f7-0x3f7, align 0x0, size 0x1, 16-bit address decoding
irq 6
dma 2 8-bit compatible
Dependent: 02 - Priority acceptable
port 0x370-0x370, align 0x7, size 0x6, 16-bit address decoding
port 0x377-0x377, align 0x0, size 0x1, 16-bit address decoding
irq 6
dma 2 8-bit compatible
4. now activate the device::
# echo "auto" > resources
5. finally check if the device is active::
# cat resources
io 0x3f0-0x3f5
io 0x3f7-0x3f7
irq 6
dma 2
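Steps 2, 4 and 5 above can be combined into one helper. This is a sketch under stated assumptions: the function name pnp_activate is hypothetical, and the device directory is passed as an argument because its location depends on where the PnP devices appear on your system.

```shell
# Sketch: activate a PnP device only if it is currently disabled,
# then show the resulting resources.
# $1 = the device's directory (the one containing "resources")
pnp_activate() {
    res="$1/resources"
    if grep -q DISABLED "$res"; then
        # Ask the automatic resource config system to activate it.
        echo auto > "$res"
    fi
    cat "$res"
}
```

Called on the floppy controller's directory from the example, this would be expected to print the io, irq and dma lines shown in step 5. A device that is already active is left untouched and its current resources are printed.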
There is also a series of kernel parameters::
pnp_reserve_irq=irq1[,irq2] ....
pnp_reserve_dma=dma1[,dma2] ....
pnp_reserve_io=io1,size1[,io2,size2] ....
pnp_reserve_mem=mem1,size1[,mem2,size2] ....
The Unified Plug and Play Layer
-------------------------------
All Plug and Play drivers, protocols, and services meet at a central location
called the Plug and Play Layer. This layer is responsible for the exchange of
information between PnP drivers and PnP protocols. Thus it automatically
forwards commands to the proper protocol. This makes writing PnP drivers
significantly easier.
The following functions are available from the Plug and Play Layer:
pnp_get_protocol
increments the number of uses by one
pnp_put_protocol
decrements the number of uses by one
pnp_register_protocol
use this to register a new PnP protocol
pnp_unregister_protocol
use this function to remove a PnP protocol from the Plug and Play Layer
pnp_register_driver
adds a PnP driver to the Plug and Play Layer
this includes driver model integration
returns zero for success or a negative error number for failure; count
calls to the .add() method if you need to know how many devices bind to
the driver
pnp_unregister_driver
removes a PnP driver from the Plug and Play Layer
Plug and Play Protocols
-----------------------
This section contains information for PnP protocol developers.
The following Protocols are currently available in the computing world:
- PNPBIOS:
used for system devices such as serial and parallel ports.
- ISAPNP:
provides PnP support for the ISA bus
- ACPI:
among its many uses, ACPI provides information about system level
devices.
It is meant to replace the PNPBIOS. It is not currently supported by Linux
Plug and Play but it is planned to be in the near future.
Requirements for a Linux PnP protocol:
1. the protocol must use EISA IDs
2. the protocol must inform the PnP Layer of a device's current configuration
- the ability to set resources is optional but preferred.
The following are PnP protocol related functions:
pnp_add_device
use this function to add a PnP device to the PnP layer
only call this function when all wanted values are set in the pnp_dev
structure
pnp_init_device
call this to initialize the PnP structure
pnp_remove_device
call this to remove a device from the Plug and Play Layer.
it will fail if the device is still in use.
automatically frees the memory used by the device and related structures
pnp_add_id
adds an EISA ID to the list of supported IDs for the specified device
For more information consult the source of a protocol such as
/drivers/pnp/pnpbios/core.c.
Linux Plug and Play Drivers
---------------------------
This section contains information for Linux PnP driver developers.
The New Way
^^^^^^^^^^^
1. first make a list of supported EISA IDs
ex::
static const struct pnp_id pnp_dev_table[] = {
/* Standard LPT Printer Port */
{.id = "PNP0400", .driver_data = 0},
/* ECP Printer Port */
{.id = "PNP0401", .driver_data = 0},
{.id = ""}
};
Please note that the character 'X' can be used as a wild card in the function
portion (last four characters).
ex::
/* Unknown PnP modems */
{ "PNPCXXX", UNKNOWN_DEV },
Supported PnP card IDs can optionally be defined.
ex::
static const struct pnp_id pnp_card_table[] = {
{ "ANYDEVS", 0 },
{ "", 0 }
};
2. Optionally define probe and remove functions. It may make sense not to
define these functions if the driver already has a reliable method of detecting
the resources, such as the parport_pc driver.
ex::
static int
serial_pnp_probe(struct pnp_dev * dev, const struct pnp_id *card_id, const
struct pnp_id *dev_id)
{
. . .
ex::
static void serial_pnp_remove(struct pnp_dev * dev)
{
. . .
Consult /drivers/serial/8250_pnp.c for more information.
3. Create a driver structure
ex::
static struct pnp_driver serial_pnp_driver = {
.name = "serial",
.card_id_table = pnp_card_table,
.id_table = pnp_dev_table,
.probe = serial_pnp_probe,
.remove = serial_pnp_remove,
};
* name and id_table cannot be NULL.
4. Register the driver
ex::
static int __init serial8250_pnp_init(void)
{
return pnp_register_driver(&serial_pnp_driver);
}
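The wildcard rule from step 1 can be illustrated outside the kernel. The following is a minimal userspace sketch, not the PnP layer's own matching code: the function name ``pnp_id_matches`` is hypothetical, and the case-insensitive comparison is an assumption made for convenience.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Illustrative helper (hypothetical name, not a kernel API): returns true
 * when a 7-character EISA ID matches a table entry in which 'X' may stand
 * for any character in the function portion (last four characters).
 */
bool pnp_id_matches(const char *pattern, const char *id)
{
	size_t i;

	if (!pattern || !id || strlen(pattern) != 7 || strlen(id) != 7)
		return false;

	/* The three-letter vendor prefix must match exactly. */
	for (i = 0; i < 3; i++)
		if (toupper((unsigned char)pattern[i]) !=
		    toupper((unsigned char)id[i]))
			return false;

	/* In the function portion, 'X' in the table entry matches anything. */
	for (i = 3; i < 7; i++) {
		int p = toupper((unsigned char)pattern[i]);

		if (p != 'X' && p != toupper((unsigned char)id[i]))
			return false;
	}

	return true;
}
```

With this sketch, the table entry "PNPCXXX" matches "PNPC001" but not "PNP0501", because the fourth character is not a wildcard.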
The Old Way
^^^^^^^^^^^
A series of compatibility functions has been created to make it easy to convert
ISAPNP drivers. They should serve as a temporary solution only.
They are as follows::
struct pnp_card *pnp_find_card(unsigned short vendor,
unsigned short device,
struct pnp_card *from)
struct pnp_dev *pnp_find_dev(struct pnp_card *card,
unsigned short vendor,
unsigned short function,
struct pnp_dev *from)


@@ -0,0 +1,140 @@
=======================================
Real Time Clock (RTC) Drivers for Linux
=======================================
When Linux developers talk about a "Real Time Clock", they usually mean
something that tracks wall clock time and is battery backed so that it
works even with system power off. Such clocks will normally not track
the local time zone or daylight savings time -- unless they dual boot
with MS-Windows -- but will instead be set to Coordinated Universal Time
(UTC, formerly "Greenwich Mean Time").
The newest non-PC hardware tends to just count seconds, like the time(2)
system call reports, but RTCs also very commonly represent time using
the Gregorian calendar and 24 hour time, as reported by gmtime(3).
Linux has two largely-compatible userspace RTC API families you may
need to know about:
* /dev/rtc ... is the RTC provided by PC compatible systems,
so it's not very portable to non-x86 systems.
* /dev/rtc0, /dev/rtc1 ... are part of a framework that's
supported by a wide variety of RTC chips on all systems.
Programmers need to understand that the PC/AT functionality is not
always available, and some systems can do much more. That is, the
RTCs use the same API to make requests in both RTC frameworks (using
different filenames of course), but the hardware may not offer the
same functionality. For example, not every RTC is hooked up to an
IRQ, so they can't all issue alarms; and where standard PC RTCs can
only issue an alarm up to 24 hours in the future, other hardware may
be able to schedule one any time in the upcoming century.
Old PC/AT-Compatible driver: /dev/rtc
--------------------------------------
All PCs (even Alpha machines) have a Real Time Clock built into them.
Usually they are built into the chipset of the computer, but some may
actually have a Motorola MC146818 (or clone) on the board. This is the
clock that keeps the date and time while your computer is turned off.
ACPI has standardized that MC146818 functionality, and extended it in
a few ways (enabling longer alarm periods, and wake-from-hibernate).
That functionality is NOT exposed in the old driver.
However it can also be used to generate signals from a slow 2Hz to a
relatively fast 8192Hz, in increments of powers of two. These signals
are reported by interrupt number 8. (Oh! So *that* is what IRQ 8 is
for...) It can also function as a 24hr alarm, raising IRQ 8 when the
alarm goes off. The alarm can also be programmed to only check any
subset of the three programmable values, meaning that it could be set to
ring on the 30th second of the 30th minute of every hour, for example.
The clock can also be set to generate an interrupt upon every clock
update, thus generating a 1Hz signal.
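The periodic rate rule described above (2Hz to 8192Hz, powers of two only) can be checked in user space before asking the driver for a rate. This is only a sketch under that stated rule; the function and macro names are made up for illustration and are not part of the driver.

```c
#include <stdbool.h>

/* Limits taken from the text above: 2Hz up to 8192Hz, powers of two only. */
#define SKETCH_RTC_MIN_HZ 2UL
#define SKETCH_RTC_MAX_HZ 8192UL

/* Hypothetical helper: true if hz is a rate the PC/AT RTC can generate. */
bool rtc_periodic_freq_ok(unsigned long hz)
{
	if (hz < SKETCH_RTC_MIN_HZ || hz > SKETCH_RTC_MAX_HZ)
		return false;

	/* A power of two has exactly one bit set. */
	return (hz & (hz - 1)) == 0;
}
```

A request for 100Hz, for instance, is rejected by this check, while 64Hz and 128Hz pass.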
The interrupts are reported via /dev/rtc (major 10, minor 135, read only
character device) in the form of an unsigned long. The low byte contains
the type of interrupt (update-done, alarm-rang, or periodic) that was
raised, and the remaining bytes contain the number of interrupts since
the last read. Status information is reported through the pseudo-file
/proc/driver/rtc if the /proc filesystem was enabled. The driver has
built in locking so that only one process is allowed to have the /dev/rtc
interface open at a time.
A user process can monitor these interrupts by doing a read(2) or a
select(2) on /dev/rtc -- either will block/stop the user process until
the next interrupt is received. This is useful for things like
reasonably high frequency data acquisition where one doesn't want to
burn up 100% CPU by polling gettimeofday etc. etc.
At high frequencies, or under high loads, the user process should check
the number of interrupts received since the last read to determine if
there has been any interrupt "pileup" so to speak. Just for reference, a
typical 486-33 running a tight read loop on /dev/rtc will start to suffer
occasional interrupt pileup (i.e. > 1 IRQ event since last read) for
frequencies above 1024Hz. So you really should check the high bytes
of the value you read, especially at frequencies above that of the
normal timer interrupt, which is 100Hz.
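The layout of the value read from /dev/rtc, and the pileup check recommended above, can be sketched as follows. The flag values are local copies of RTC_UF, RTC_AF and RTC_PF from the kernel's rtc.h header, renamed so the example compiles on its own; the helper names are hypothetical.

```c
/*
 * Local copies of the RTC_UF, RTC_AF and RTC_PF flag values from the
 * kernel's rtc.h header, renamed so this sketch stands alone.
 */
#define SKETCH_RTC_UF 0x10UL	/* update-done interrupt */
#define SKETCH_RTC_AF 0x20UL	/* alarm interrupt */
#define SKETCH_RTC_PF 0x40UL	/* periodic interrupt */

/* Low byte of the value read from /dev/rtc: interrupt type flags. */
unsigned long rtc_irq_flags(unsigned long data)
{
	return data & 0xffUL;
}

/* Remaining bytes: number of interrupts since the last read. */
unsigned long rtc_irq_count(unsigned long data)
{
	return data >> 8;
}

/* More than one interrupt since the last read means events piled up. */
int rtc_irq_pileup(unsigned long data)
{
	return rtc_irq_count(data) > 1;
}
```

In a real program, ``data`` would be filled in by a read(2) of sizeof(unsigned long) bytes from the device; that part is omitted here because it needs RTC hardware.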
Programming and/or enabling interrupt frequencies greater than 64Hz is
only allowed by root. This is perhaps a bit conservative, but we don't want
an evil user generating lots of IRQs on a slow 386sx-16, where it might have
a negative impact on performance. This 64Hz limit can be changed by writing
a different value to /proc/sys/dev/rtc/max-user-freq. Note that the
interrupt handler is only a few lines of code to minimize any possibility
of this effect.
Also, if the kernel time is synchronized with an external source, the
kernel will write the time back to the CMOS clock every 11 minutes. In
the process of doing this, the kernel briefly turns off RTC periodic
interrupts, so be aware of this if you are doing serious work. If you
don't synchronize the kernel time with an external source (via ntp or
whatever) then the kernel will keep its hands off the RTC, allowing you
exclusive access to the device for your applications.
The alarm and/or interrupt frequency are programmed into the RTC via
various ioctl(2) calls as listed in ./include/linux/rtc.h
Rather than write 50 pages describing the ioctl() and so on, it is
perhaps more useful to include a small test program that demonstrates
how to use them, and demonstrates the features of the driver. This is
probably a lot more useful to people interested in writing applications
that will be using this driver. See the code at the end of this document.
(The original /dev/rtc driver was written by Paul Gortmaker.)
New portable "RTC Class" drivers: /dev/rtcN
--------------------------------------------
Because Linux supports many non-ACPI and non-PC platforms, some of which
have more than one RTC style clock, it needed a more portable solution
than expecting a single battery-backed MC146818 clone on every system.
Accordingly, a new "RTC Class" framework has been defined. It offers
three different userspace interfaces:
* /dev/rtcN ... much the same as the older /dev/rtc interface
* /sys/class/rtc/rtcN ... sysfs attributes support read-only
access to some RTC attributes.
* /proc/driver/rtc ... the system clock RTC may expose itself
using a procfs interface. If there is no RTC for the system clock,
rtc0 is used by default. More information is (currently) shown
here than through sysfs.
The RTC Class framework supports a wide variety of RTCs, ranging from those
integrated into embeddable system-on-chip (SOC) processors to discrete chips
using I2C, SPI, or some other bus to communicate with the host CPU. There's
even support for PC-style RTCs ... including the features exposed on newer PCs
through ACPI.
The new framework also removes the "one RTC per system" restriction. For
example, maybe the low-power battery-backed RTC is a discrete I2C chip, but
a high functionality RTC is integrated into the SOC. That system might read
the system clock from the discrete RTC, but use the integrated one for all
other tasks, because of its greater functionality.
Check out tools/testing/selftests/rtc/rtctest.c for an example usage of the
ioctl interface.


@@ -0,0 +1,249 @@
.. include:: <isonum.txt>
=================================
Video Mode Selection Support 2.13
=================================
:Copyright: |copy| 1995--1999 Martin Mares, <mj@ucw.cz>
Intro
~~~~~
This small document describes the "Video Mode Selection" feature which
allows the use of various special video modes supported by the video BIOS. Due
to usage of the BIOS, the selection is limited to boot time (before the
kernel decompression starts) and works only on 80X86 machines.
.. note::
Short intro for the impatient: Just use vga=ask for the first time,
enter ``scan`` on the video mode prompt, pick the mode you want to use,
remember its mode ID (the four-digit hexadecimal number) and then
set the vga parameter to this number (converted to decimal first).
The video mode to be used is selected by a kernel parameter which can be
specified in the kernel Makefile (the SVGA_MODE=... line) or by the "vga=..."
option of LILO (or some other boot loader you use) or by the "vidmode" utility
(present in standard Linux utility packages). You can use the following values
of this parameter::
NORMAL_VGA - Standard 80x25 mode available on all display adapters.
EXTENDED_VGA - Standard 8-pixel font mode: 80x43 on EGA, 80x50 on VGA.
ASK_VGA - Display a video mode menu upon startup (see below).
0..35 - Menu item number (when you have used the menu to view the list of
modes available on your adapter, you can specify the menu item you want
to use). 0..9 correspond to "0".."9", 10..35 to "a".."z". Warning: the
mode list displayed may vary as the kernel version changes, because the
modes are listed in a "first detected -- first displayed" manner. It's
better to use absolute mode numbers instead.
0x.... - Hexadecimal video mode ID (also displayed on the menu, see below
for exact meaning of the ID). Warning: rdev and LILO don't support
hexadecimal numbers -- you have to convert it to decimal manually.
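For tools that only accept decimal numbers, the manual conversion mentioned above is a plain base-16 parse. A small sketch, assuming the mode ID fits in 16 bits as described later in this document; the function name is illustrative only.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse a mode ID written in hexadecimal, with or
 * without a "0x" prefix, and return its numeric value so it can be printed
 * in decimal. Returns -1 if the string is not a valid 16-bit hex number.
 */
long mode_id_hex_to_decimal(const char *hex)
{
	char *end;
	unsigned long id;

	if (!hex || !*hex)
		return -1;

	id = strtoul(hex, &end, 16);	/* base 16 also accepts a "0x" prefix */
	if (end == hex || *end != '\0' || id > 0xffffUL)
		return -1;

	return (long)id;
}
```

For example, the mode ID 0x0f01 converts to 3841, which is the value to pass where hexadecimal is not accepted.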
Menu
~~~~
The ASK_VGA mode causes the kernel to offer a video mode menu upon
bootup. It displays a "Press <RETURN> to see video modes available, <SPACE>
to continue or wait 30 secs" message. If you press <RETURN>, you enter the
menu, if you press <SPACE> or wait 30 seconds, the kernel will boot up in
the standard 80x25 mode.
The menu looks like::
Video adapter: <name-of-detected-video-adapter>
Mode: COLSxROWS:
0 0F00 80x25
1 0F01 80x50
2 0F02 80x43
3 0F03 80x28
....
Enter mode number or ``scan``: <flashing-cursor-here>
<name-of-detected-video-adapter> tells which video adapter Linux detected
-- it's either a generic adapter name (MDA, CGA, HGC, EGA, VGA, VESA VGA [a VGA
with VESA-compliant BIOS]) or a chipset name (e.g., Trident). Direct detection
of chipsets is turned off by default as it's inherently unreliable due to
absolutely insane PC design.
"0 0F00 80x25" means that the first menu item (the menu items are numbered
from "0" to "9" and from "a" to "z") is an 80x25 mode with ID=0x0f00 (see the
next section for a description of mode IDs).
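The menu numbering ("0".."9" for items 0..9, "a".."z" for items 10..35) maps to item numbers as in the sketch below. The function name is made up for illustration, and the code assumes an ASCII-compatible character set in which the lowercase letters are contiguous.

```c
/*
 * Hypothetical helper: translate a menu key into its item number.
 * '0'..'9' give 0..9 and 'a'..'z' give 10..35; anything else gives -1.
 * Assumes an ASCII-compatible character set.
 */
int menu_key_to_item(char key)
{
	if (key >= '0' && key <= '9')
		return key - '0';
	if (key >= 'a' && key <= 'z')
		return key - 'a' + 10;
	return -1;
}
```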
<flashing-cursor-here> encourages you to enter the item number or mode ID
you wish to set and press <RETURN>. If the computer complains about
"Unknown mode ID", it is trying to tell you that it isn't possible to set such
a mode. It's also possible to press only <RETURN> which leaves the current mode.
The mode list usually contains a few basic modes and some VESA modes. In
case your chipset has been detected, some chipset-specific modes are shown as
well (some of these might be missing or unusable on your machine as different
BIOSes are often shipped with the same card and the mode numbers depend purely
on the VGA BIOS).
The modes displayed on the menu are partially sorted: The list starts with
the standard modes (80x25 and 80x50) followed by "special" modes (80x28 and
80x43), local modes (if the local modes feature is enabled), VESA modes and
finally SVGA modes for the auto-detected adapter.
If you are not happy with the mode list offered (e.g., if you think your card
is able to do more), you can enter "scan" instead of item number / mode ID. The
program will try to ask the BIOS for all possible video mode numbers and test
what happens then. The screen will probably flash wildly for some time,
strange noises will be heard from inside the monitor, and then
all consistent video modes supported by your BIOS will appear (plus maybe some
``ghost modes``). If you are afraid this could damage your monitor, don't use
this function.
After scanning, the mode ordering is a bit different: the auto-detected SVGA
modes are not listed at all and the modes revealed by ``scan`` are shown before
all VESA modes.
Mode IDs
~~~~~~~~
Because of the complexity of all the video stuff, the video mode IDs
used here are also a bit complex. A video mode ID is a 16-bit number usually
expressed in a hexadecimal notation (starting with "0x"). You can set a mode
by entering its ID directly if you know it, even if it isn't shown on the menu.
The ID numbers can be divided into the following regions::
0x0000 to 0x00ff - menu item references. 0x0000 is the first item. Don't use
outside the menu as this can change from boot to boot (especially if you
have used the ``scan`` feature).
0x0100 to 0x017f - standard BIOS modes. The ID is a BIOS video mode number
(as presented to INT 10, function 00) increased by 0x0100.
0x0200 to 0x08ff - VESA BIOS modes. The ID is a VESA mode ID increased by
0x0100. All VESA modes should be autodetected and shown on the menu.
0x0900 to 0x09ff - Video7 special modes. Set by calling INT 0x10, AX=0x6f05.
(Usually 940=80x43, 941=132x25, 942=132x44, 943=80x60, 944=100x60,
945=132x28 for the standard Video7 BIOS)
0x0f00 to 0x0fff - special modes (they are set by various tricks -- usually
by modifying one of the standard modes). Currently available:
0x0f00 standard 80x25, don't reset mode if already set (=FFFF)
0x0f01 standard with 8-point font: 80x43 on EGA, 80x50 on VGA
0x0f02 VGA 80x43 (VGA switched to 350 scanlines with an 8-point font)
0x0f03 VGA 80x28 (standard VGA scans, but 14-point font)
0x0f04 leave current video mode
0x0f05 VGA 80x30 (480 scans, 16-point font)
0x0f06 VGA 80x34 (480 scans, 14-point font)
0x0f07 VGA 80x60 (480 scans, 8-point font)
0x0f08 Graphics hack (see the VIDEO_GFX_HACK paragraph below)
0x1000 to 0x7fff - modes specified by resolution. The code has a "0xRRCC"
form where RR is a number of rows and CC is a number of columns.
E.g., 0x1950 corresponds to an 80x25 mode, 0x2b84 to 132x43 etc.
This is the only fully portable way to refer to a non-standard mode,
but it relies on the mode being found and displayed on the menu
(remember that mode scanning is not done automatically).
0xff00 to 0xffff - aliases for backward compatibility:
0xffff equivalent to 0x0f00 (standard 80x25)
0xfffe equivalent to 0x0f01 (EGA 80x43 or VGA 80x50)
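The "0xRRCC" form used for modes specified by resolution is simple bit packing: rows in the high byte, columns in the low byte. The sketch below shows both directions; the function names are illustrative, and the range check only reflects the 0x1000 to 0x7fff region given above.

```c
/*
 * Hypothetical helper: build a "0xRRCC" mode ID from a text resolution.
 * Returns -1 if the result would fall outside the 0x1000..0x7fff region.
 */
int mode_id_from_resolution(unsigned int cols, unsigned int rows)
{
	if (cols > 0xffu || rows < 0x10u || rows > 0x7fu)
		return -1;
	return (int)((rows << 8) | cols);
}

/* Rows are stored in the high byte of the ID. */
unsigned int mode_id_rows(unsigned int id)
{
	return (id >> 8) & 0xffu;
}

/* Columns are stored in the low byte of the ID. */
unsigned int mode_id_cols(unsigned int id)
{
	return id & 0xffu;
}
```

An 80x25 mode gives 0x1950 (25 is 0x19, 80 is 0x50), and 0x2b84 decodes back to 132 columns by 43 rows.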
If you add 0x8000 to the mode ID, the program will try to recalculate
vertical display timing according to mode parameters, which can be used to
eliminate some annoying bugs of certain VGA BIOSes (usually those used for
cards with S3 chipsets and old Cirrus Logic BIOSes) -- mainly extra lines at the
end of the display.
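The regions and the 0x8000 flag described in this section can be summarised in a small classifier. This is a sketch, not the boot code's own logic: the enum and function names are invented here, and checking the 0xff00 to 0xffff aliases before stripping the 0x8000 flag is an assumption made so that the two do not collide.

```c
/* Region names are made up for this sketch. */
enum mode_region {
	MODE_MENU_ITEM,		/* 0x0000 to 0x00ff */
	MODE_BIOS,		/* 0x0100 to 0x017f */
	MODE_VESA,		/* 0x0200 to 0x08ff */
	MODE_VIDEO7,		/* 0x0900 to 0x09ff */
	MODE_SPECIAL,		/* 0x0f00 to 0x0fff */
	MODE_RESOLUTION,	/* 0x1000 to 0x7fff */
	MODE_ALIAS,		/* 0xff00 to 0xffff */
	MODE_UNASSIGNED		/* anything not listed above */
};

#define SKETCH_RECALC_FLAG 0x8000u

enum mode_region classify_mode_id(unsigned int id)
{
	id &= 0xffffu;

	/* Assumption: aliases are recognised before the flag is stripped. */
	if (id >= 0xff00u)
		return MODE_ALIAS;

	id &= ~SKETCH_RECALC_FLAG;

	if (id <= 0x00ffu)
		return MODE_MENU_ITEM;
	if (id >= 0x0100u && id <= 0x017fu)
		return MODE_BIOS;
	if (id >= 0x0200u && id <= 0x08ffu)
		return MODE_VESA;
	if (id >= 0x0900u && id <= 0x09ffu)
		return MODE_VIDEO7;
	if (id >= 0x0f00u && id <= 0x0fffu)
		return MODE_SPECIAL;
	if (id >= 0x1000u && id <= 0x7fffu)
		return MODE_RESOLUTION;
	return MODE_UNASSIGNED;
}
```

With this sketch, 0x8f01 is classified like 0x0f01 (a special mode with the recalculation flag set), while 0xffff stays an alias.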
Options
~~~~~~~
Build options for arch/x86/boot/* are selected by the kernel kconfig
utility and the kernel .config file.
VIDEO_GFX_HACK - includes special hack for setting of graphics modes
to be used later by special drivers.
Allows setting _any_ BIOS mode, including graphic ones, and forcing a specific
text screen resolution instead of reading it from BIOS variables. Don't use
unless you think you know what you're doing. To activate this setup, use
mode number 0x0f08 (see the Mode IDs section above).
Still doesn't work?
~~~~~~~~~~~~~~~~~~~
When the mode detection doesn't work (e.g., the mode list is incorrect or
the machine hangs instead of displaying the menu), try to switch off some of
the configuration options listed under "Options". If it fails, you can still use
your kernel with the video mode set directly via the kernel parameter.
In either case, please send me a bug report containing what _exactly_
happens and how the configuration switches affect the behaviour of the bug.
If you start Linux from M$-DOS, you might also use some DOS tools for
video mode setting. In this case, you must specify the 0x0f04 mode ("leave
current settings") to Linux, because if you don't and you use any non-standard
mode, Linux will switch to 80x25 automatically.
If you set some extended mode and there's one or more extra lines on the
bottom of the display containing already scrolled-out text, your VGA BIOS
contains the most common video BIOS bug called "incorrect vertical display
end setting". Adding 0x8000 to the mode ID might fix the problem. Unfortunately,
this must be done manually -- no autodetection mechanisms are available.
History
~~~~~~~
=============== ================================================================
1.0 (??-Nov-95) First version supporting all adapters supported by the old
setup.S + Cirrus Logic 54XX. Present in some 1.3.4? kernels
and then removed due to instability on some machines.
2.0 (28-Jan-96) Rewritten from scratch. Cirrus Logic 64XX support added, almost
everything is configurable, the VESA support should be much more
stable, explicit mode numbering allowed, "scan" implemented etc.
2.1 (30-Jan-96) VESA modes moved to 0x200-0x3ff. Mode selection by resolution
supported. Few bugs fixed. VESA modes are listed prior to
modes supplied by SVGA autodetection as they are more reliable.
CLGD autodetect works better. Doesn't depend on 80x25 being
active when started. Scanning fixed. 80x43 (any VGA) added.
Code cleaned up.
2.2 (01-Feb-96) EGA 80x43 fixed. VESA extended to 0x200-0x4ff (non-standard 02XX
VESA modes work now). Display end bug workaround supported.
Special modes renumbered to allow adding of the "recalculate"
flag, 0xffff and 0xfffe became aliases instead of real IDs.
Screen contents retained during mode changes.
2.3 (15-Mar-96) Changed to work with 1.3.74 kernel.
2.4 (18-Mar-96) Added patches by Hans Lermen fixing a memory overwrite problem
with some boot loaders. Memory management rewritten to reflect
these changes. Unfortunately, screen contents retaining works
only with some loaders now.
Added a Tseng 132x60 mode.
2.5 (19-Mar-96) Fixed a VESA mode scanning bug introduced in 2.4.
2.6 (25-Mar-96) Some VESA BIOS errors not reported -- it fixes error reports on
several cards with broken VESA code (e.g., ATI VGA).
2.7 (09-Apr-96) - Accepted all VESA modes in range 0x100 to 0x7ff, because some
cards use very strange mode numbers.
- Added Realtek VGA modes (thanks to Gonzalo Tornaria).
- Hardware testing order slightly changed, tests based on ROM
contents done as first.
- Added support for special Video7 mode switching functions
(thanks to Tom Vander Aa).
- Added 480-scanline modes (especially useful for notebooks,
original version written by hhanemaa@cs.ruu.nl, patched by
Jeff Chua, rewritten by me).
- Screen store/restore fixed.
2.8 (14-Apr-96) - Previous release was not compilable without CONFIG_VIDEO_SVGA.
- Better recognition of text modes during mode scan.
2.9 (12-May-96) - Ignored VESA modes 0x80 - 0xff (more VESA BIOS bugs!)
2.10(11-Nov-96) - The whole thing made optional.
- Added the CONFIG_VIDEO_400_HACK switch.
- Added the CONFIG_VIDEO_GFX_HACK switch.
- Code cleanup.
2.11(03-May-97) - Yet another cleanup, now including also the documentation.
- Direct testing of SVGA adapters turned off by default, ``scan``
offered explicitly on the prompt line.
- Removed the doc section describing adding of new probing
functions as I try to get rid of _all_ hardware probing here.
2.12(25-May-98) Added support for VESA frame buffer graphics.
2.13(14-May-99) Minor documentation fixes.
=============== ================================================================


@@ -327,7 +327,7 @@ when a hard lockup is detected.
0 - don't panic on hard lockup
1 - panic on hard lockup
See Documentation/lockup-watchdogs.txt for more information. This can
See Documentation/admin-guide/lockup-watchdogs.rst for more information. This can
also be set using the nmi_watchdog kernel parameter.


@@ -0,0 +1,34 @@
Video Output Switcher Control
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2006 luming.yu@intel.com
The output sysfs class driver provides an abstract video output layer that
can be used to hook platform-specific methods to enable/disable a video output
device through a common sysfs interface. For example, on my IBM ThinkPad T42
laptop, the ACPI video driver registered its output devices and read/write
method for 'state' with the output sysfs class. The user interface under sysfs is::
linux:/sys/class/video_output # tree .
.
|-- CRT0
| |-- device -> ../../../devices/pci0000:00/0000:00:01.0
| |-- state
| |-- subsystem -> ../../../class/video_output
| `-- uevent
|-- DVI0
| |-- device -> ../../../devices/pci0000:00/0000:00:01.0
| |-- state
| |-- subsystem -> ../../../class/video_output
| `-- uevent
|-- LCD0
| |-- device -> ../../../devices/pci0000:00/0000:00:01.0
| |-- state
| |-- subsystem -> ../../../class/video_output
| `-- uevent
`-- TV0
|-- device -> ../../../devices/pci0000:00/0000:00:01.0
|-- state
|-- subsystem -> ../../../class/video_output
`-- uevent