Merge tag 'docs-4.10' of git://git.lwn.net/linux

Pull documentation update from Jonathan Corbet:
 "These are the documentation changes for 4.10.

  It's another busy cycle for the docs tree, as the sphinx conversion
  continues. Highlights include:

   - Further work on PDF output, which remains a bit of a pain but
     should be more solid now.

   - Five more DocBook template files converted to Sphinx. Only 27 to
     go... Lots of plain-text files have also been converted and
     integrated.

   - Images in binary formats have been replaced with more
     source-friendly versions.

   - Various bits of organizational work, including the renaming of
     various files discussed at the kernel summit.

   - New documentation for the device_link mechanism.

  ... and, of course, lots of typo fixes and small updates"

* tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits)
  dma-buf: Extract dma-buf.rst
  Update Documentation/00-INDEX
  docs: 00-INDEX: document directories/files with no docs
  docs: 00-INDEX: remove non-existing entries
  docs: 00-INDEX: add missing entries for documentation files/dirs
  docs: 00-INDEX: consolidate process/ and admin-guide/ description
  scripts: add a script to check if Documentation/00-INDEX is sane
  Docs: change sh -> awk in REPORTING-BUGS
  Documentation/core-api/device_link: Add initial documentation
  core-api: remove an unexpected unident
  ppc/idle: Add documentation for powersave=off
  Doc: Correct typo, "Introdution" => "Introduction"
  Documentation/atomic_ops.txt: convert to ReST markup
  Documentation/local_ops.txt: convert to ReST markup
  Documentation/assoc_array.txt: convert to ReST markup
  docs-rst: parse-headers.pl: cleanup the documentation
  docs-rst: fix media cleandocs target
  docs-rst: media/Makefile: reorganize the rules
  docs-rst: media: build SVG from graphviz files
  docs-rst: replace bayer.png by a SVG image
  ...
Linus Torvalds
2016-12-12 21:58:13 -08:00
344 changed files with 39950 additions and 21197 deletions


@@ -0,0 +1,411 @@
Linux kernel release 4.x <http://kernel.org/>
=============================================
These are the release notes for Linux version 4. Read them carefully,
as they tell you what this is all about, explain how to install the
kernel, and what to do if something goes wrong.
What is Linux?
--------------
Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across
the Net. It aims towards POSIX and Single UNIX Specification compliance.
It has all the features you would expect in a modern fully-fledged Unix,
including true multitasking, virtual memory, shared libraries, demand
loading, shared copy-on-write executables, proper memory management,
and multistack networking including IPv4 and IPv6.
It is distributed under the GNU General Public License - see the
accompanying COPYING file for more details.
On what hardware does it run?
-----------------------------
Although originally developed first for 32-bit x86-based PCs (386 or higher),
today Linux also runs on (at least) the Compaq Alpha AXP, Sun SPARC and
UltraSPARC, Motorola 68000, PowerPC, PowerPC64, ARM, Hitachi SuperH, Cell,
IBM S/390, MIPS, HP PA-RISC, Intel IA-64, DEC VAX, AMD x86-64, AXIS CRIS,
Xtensa, Tilera TILE, AVR32, ARC and Renesas M32R architectures.
Linux is easily portable to most general-purpose 32- or 64-bit architectures
as long as they have a paged memory management unit (PMMU) and a port of the
GNU C compiler (gcc) (part of The GNU Compiler Collection, GCC). Linux has
also been ported to a number of architectures without a PMMU, although
functionality is then obviously somewhat limited.
Linux has also been ported to itself. You can now run the kernel as a
userspace application - this is called UserMode Linux (UML).
Documentation
-------------
- There is a lot of documentation available both in electronic form on
the Internet and in books, both Linux-specific and pertaining to
general UNIX questions. I'd recommend looking into the documentation
subdirectories on any Linux FTP site for the LDP (Linux Documentation
Project) books. This README is not meant to be documentation on the
system: there are much better sources available.
- There are various README files in the Documentation/ subdirectory:
these typically contain kernel-specific installation notes for some
drivers for example. See Documentation/00-INDEX for a list of what
is contained in each file. Please read the
:ref:`Documentation/process/changes.rst <changes>` file, as it
contains information about problems that may result from upgrading
your kernel.
- The Documentation/DocBook/ subdirectory contains several guides for
kernel developers and users. These guides can be rendered in a
number of formats: PostScript (.ps), PDF, HTML, & man-pages, among others.
After installation, ``make psdocs``, ``make pdfdocs``, ``make htmldocs``,
or ``make mandocs`` will render the documentation in the requested format.
Installing the kernel source
----------------------------
- If you install the full sources, put the kernel tarball in a
directory where you have permissions (e.g. your home directory) and
unpack it::
xz -cd linux-4.X.tar.xz | tar xvf -
Replace "X" with the version number of the latest kernel.
Do NOT use the /usr/src/linux area! This area has a (usually
incomplete) set of kernel headers that are used by the library header
files. They should match the library, and not get messed up by
whatever the kernel-du-jour happens to be.
- You can also upgrade between 4.x releases by patching. Patches are
distributed in the xz format. To install by patching, get all the
newer patch files, enter the top level directory of the kernel source
(linux-4.X) and execute::
xz -cd ../patch-4.x.xz | patch -p1
Replace "x" for all versions bigger than the version "X" of your current
source tree, **in_order**, and you should be ok. You may want to remove
the backup files (some-file-name~ or some-file-name.orig), and make sure
that there are no failed patches (some-file-name# or some-file-name.rej).
If there are, either you or I have made a mistake.
Unlike patches for the 4.x kernels, patches for the 4.x.y kernels
(also known as the -stable kernels) are not incremental but instead apply
directly to the base 4.x kernel. For example, if your base kernel is 4.0
and you want to apply the 4.0.3 patch, you must not first apply the 4.0.1
and 4.0.2 patches. Similarly, if you are running kernel version 4.0.2 and
want to jump to 4.0.3, you must first reverse the 4.0.2 patch (that is,
patch -R) **before** applying the 4.0.3 patch. You can read more on this in
:ref:`Documentation/process/applying-patches.rst <applying_patches>`.
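For instance, the 4.0.2 to 4.0.3 jump described above could look like this
(a sketch; it assumes the stable patch files have already been downloaded,
still compressed, into the parent directory of your 4.0-based tree)::

    xz -cd ../patch-4.0.2.xz | patch -R -p1   # back out 4.0.2, leaving plain 4.0
    xz -cd ../patch-4.0.3.xz | patch -p1      # apply the 4.0.3 patch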
Alternatively, the script patch-kernel can be used to automate this
process. It determines the current kernel version and applies any
patches found::
linux/scripts/patch-kernel linux
The first argument in the command above is the location of the
kernel source. Patches are applied from the current directory, but
an alternative directory can be specified as the second argument.
- Make sure you have no stale .o files and dependencies lying around::
cd linux
make mrproper
You should now have the sources correctly installed.
Software requirements
---------------------
Compiling and running the 4.x kernels requires up-to-date
versions of various software packages. Consult
:ref:`Documentation/process/changes.rst <changes>` for the minimum version numbers
required and how to get updates for these packages. Beware that using
excessively old versions of these packages can cause indirect
errors that are very difficult to track down, so don't assume that
you can just update packages when obvious problems arise during
build or operation.
Build directory for the kernel
------------------------------
When compiling the kernel, all output files will by default be
stored together with the kernel source code.
Using the option ``make O=output/dir`` allows you to specify an alternate
place for the output files (including .config).
Example::
kernel source code: /usr/src/linux-4.X
build directory: /home/name/build/kernel
To configure and build the kernel, use::
cd /usr/src/linux-4.X
make O=/home/name/build/kernel menuconfig
make O=/home/name/build/kernel
sudo make O=/home/name/build/kernel modules_install install
Please note: If the ``O=output/dir`` option is used, then it must be
used for all invocations of make.
Configuring the kernel
----------------------
Do not skip this step even if you are only upgrading one minor
version. New configuration options are added in each release, and
odd problems will turn up if the configuration files are not set up
as expected. If you want to carry your existing configuration to a
new version with minimal work, use ``make oldconfig``, which will
only ask you for the answers to new questions.
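For example, to carry over the configuration of the kernel you are currently
running (a sketch; it assumes your distribution installs its kernel config as
``/boot/config-<version>``, which not every distribution does)::

    cp /boot/config-$(uname -r) .config
    make oldconfig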
- Alternative configuration commands are::
"make config" Plain text interface.
"make menuconfig" Text based color menus, radiolists & dialogs.
"make nconfig" Enhanced text based color menus.
"make xconfig" Qt based configuration tool.
"make gconfig" GTK+ based configuration tool.
"make oldconfig" Default all questions based on the contents of
your existing ./.config file and asking about
new config symbols.
"make silentoldconfig"
Like above, but avoids cluttering the screen
with questions already answered.
Additionally updates the dependencies.
"make olddefconfig"
Like above, but sets new symbols to their default
values without prompting.
"make defconfig" Create a ./.config file by using the default
symbol values from either arch/$ARCH/defconfig
or arch/$ARCH/configs/${PLATFORM}_defconfig,
depending on the architecture.
"make ${PLATFORM}_defconfig"
Create a ./.config file by using the default
symbol values from
arch/$ARCH/configs/${PLATFORM}_defconfig.
Use "make help" to get a list of all available
platforms of your architecture.
"make allyesconfig"
Create a ./.config file by setting symbol
values to 'y' as much as possible.
"make allmodconfig"
Create a ./.config file by setting symbol
values to 'm' as much as possible.
"make allnoconfig" Create a ./.config file by setting symbol
values to 'n' as much as possible.
"make randconfig" Create a ./.config file by setting symbol
values to random values.
"make localmodconfig" Create a config based on current config and
loaded modules (lsmod). Disables any module
option that is not needed for the loaded modules.
To create a localmodconfig for another machine,
store the lsmod of that machine into a file
and pass it in as a LSMOD parameter.
target$ lsmod > /tmp/mylsmod
target$ scp /tmp/mylsmod host:/tmp
host$ make LSMOD=/tmp/mylsmod localmodconfig
The above also works when cross compiling.
"make localyesconfig" Similar to localmodconfig, except it will convert
all module options to built in (=y) options.
You can find more information on using the Linux kernel config tools
in Documentation/kbuild/kconfig.txt.
- NOTES on ``make config``:
- Having unnecessary drivers will make the kernel bigger, and can
under some circumstances lead to problems: probing for a
nonexistent controller card may confuse your other controllers
- A kernel with math-emulation compiled in will still use the
coprocessor if one is present: the math emulation will just
never get used in that case. The kernel will be slightly larger,
but will work on different machines regardless of whether they
have a math coprocessor or not.
- The "kernel hacking" configuration details usually result in a
bigger or slower kernel (or both), and can even make the kernel
less stable by configuring some routines to actively try to
break bad code to find kernel problems (kmalloc()). Thus you
should probably answer 'n' to the questions for "development",
"experimental", or "debugging" features.
Compiling the kernel
--------------------
- Make sure you have at least gcc 3.2 available.
For more information, refer to :ref:`Documentation/process/changes.rst <changes>`.
Please note that you can still run a.out user programs with this kernel.
- Do a ``make`` to create a compressed kernel image. It is also
possible to do ``make install`` if you have lilo installed to suit the
kernel makefiles, but you may want to check your particular lilo setup first.
To do the actual install, you have to be root, but none of the normal
build should require that. Don't take the name of root in vain.
- If you configured any of the parts of the kernel as ``modules``, you
will also have to do ``make modules_install``.
- Verbose kernel compile/build output:
Normally, the kernel build system runs in a fairly quiet mode (but not
totally silent). However, sometimes you or other kernel developers need
to see compile, link, or other commands exactly as they are executed.
For this, use "verbose" build mode. This is done by passing
``V=1`` to the ``make`` command, e.g.::
make V=1 all
To have the build system also tell the reason for the rebuild of each
target, use ``V=2``. The default is ``V=0``.
- Keep a backup kernel handy in case something goes wrong. This is
especially true for the development releases, since each new release
contains new code which has not been debugged. Make sure you keep a
backup of the modules corresponding to that kernel, as well. If you
are installing a new kernel with the same version number as your
working kernel, make a backup of your modules directory before you
do a ``make modules_install``.
Alternatively, before compiling, use the kernel config option
"LOCALVERSION" to append a unique suffix to the regular kernel version.
LOCALVERSION can be set in the "General Setup" menu.
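As an illustration (a sketch; the ``-backup`` suffix is an arbitrary choice),
the same thing can be done non-interactively with the ``scripts/config``
helper before building::

    ./scripts/config --set-str LOCALVERSION "-backup"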
- In order to boot your new kernel, you'll need to copy the kernel
image (e.g. .../linux/arch/x86/boot/bzImage after compilation)
to the place where your regular bootable kernel is found.
- Booting a kernel directly from a floppy without the assistance of a
bootloader such as LILO, is no longer supported.
If you boot Linux from the hard drive, chances are you use LILO, which
uses the kernel image as specified in the file /etc/lilo.conf. The
kernel image file is usually /vmlinuz, /boot/vmlinuz, /bzImage or
/boot/bzImage. To use the new kernel, save a copy of the old image
and copy the new image over the old one. Then, you MUST RERUN LILO
to update the loading map! If you don't, you won't be able to boot
the new kernel image.
Reinstalling LILO is usually a matter of running /sbin/lilo.
You may wish to edit /etc/lilo.conf to specify an entry for your
old kernel image (say, /vmlinux.old) in case the new one does not
work. See the LILO docs for more information.
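Such an entry might look roughly like this (a sketch only; the image path and
label will differ on your system, so consult the LILO documentation for the
exact syntax)::

    image = /vmlinuz.old
            label = old
            read-only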
After reinstalling LILO, you should be all set. Shutdown the system,
reboot, and enjoy!
If you ever need to change the default root device, video mode,
ramdisk size, etc. in the kernel image, use the ``rdev`` program (or
alternatively the LILO boot options when appropriate). No need to
recompile the kernel to change these parameters.
- Reboot with the new kernel and enjoy.
If something goes wrong
-----------------------
- If you have problems that seem to be due to kernel bugs, please check
the file MAINTAINERS to see if there is a particular person associated
with the part of the kernel that you are having trouble with. If there
isn't anyone listed there, then the second best thing is to mail
them to me (torvalds@linux-foundation.org), and possibly to any other
relevant mailing-list or to the newsgroup.
- In all bug-reports, *please* tell what kernel you are talking about,
how to duplicate the problem, and what your setup is (use your common
sense). If the problem is new, tell me so, and if the problem is
old, please try to tell me when you first noticed it.
- If the bug results in a message like::
unable to handle kernel paging request at address C0000010
Oops: 0002
EIP: 0010:XXXXXXXX
eax: xxxxxxxx ebx: xxxxxxxx ecx: xxxxxxxx edx: xxxxxxxx
esi: xxxxxxxx edi: xxxxxxxx ebp: xxxxxxxx
ds: xxxx es: xxxx fs: xxxx gs: xxxx
Pid: xx, process nr: xx
xx xx xx xx xx xx xx xx xx xx
or similar kernel debugging information on your screen or in your
system log, please duplicate it *exactly*. The dump may look
incomprehensible to you, but it does contain information that may
help debugging the problem. The text above the dump is also
important: it tells something about why the kernel dumped code (in
the above example, it's due to a bad kernel pointer). More information
on making sense of the dump is in Documentation/admin-guide/oops-tracing.rst
- If you compiled the kernel with CONFIG_KALLSYMS you can send the dump
as is, otherwise you will have to use the ``ksymoops`` program to make
sense of the dump (but compiling with CONFIG_KALLSYMS is usually preferred).
This utility can be downloaded from
ftp://ftp.<country>.kernel.org/pub/linux/utils/kernel/ksymoops/ .
Alternatively, you can do the dump lookup by hand:
- In debugging dumps like the above, it helps enormously if you can
look up what the EIP value means. The hex value as such doesn't help
me or anybody else very much: it will depend on your particular
kernel setup. What you should do is take the hex value from the EIP
line (ignore the ``0010:``), and look it up in the kernel namelist to
see which kernel function contains the offending address.
To find out the kernel function name, you'll need to find the system
binary associated with the kernel that exhibited the symptom. This is
the file 'linux/vmlinux'. To extract the namelist and match it against
the EIP from the kernel crash, do::
nm vmlinux | sort | less
This will give you a list of kernel addresses sorted in ascending
order, from which it is simple to find the function that contains the
offending address. Note that the address given by the kernel
debugging messages will not necessarily match exactly with the
function addresses (in fact, that is very unlikely), so you can't
just 'grep' the list: the list will, however, give you the starting
point of each kernel function, so by looking for the function that
has a starting address lower than the one you are searching for but
is followed by a function with a higher address you will find the one
you want. In fact, it may be a good idea to include a bit of
"context" in your problem report, giving a few lines around the
interesting one.
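For instance (with made-up addresses and function names), if the EIP was
``c01a2f34`` and the sorted namelist contained::

    c01a2e00 T do_foo
    c01a3100 T do_bar

then the crash happened inside ``do_foo``, at offset ``0x134`` from its start.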
If you for some reason cannot do the above (you have a pre-compiled
kernel image or similar), telling me as much about your setup as
possible will help. Please read the :ref:`admin-guide/reporting-bugs.rst <reportingbugs>`
document for details.
- Alternatively, you can use gdb on a running kernel. (read-only; i.e. you
cannot change values or set break points.) To do this, first compile the
kernel with -g; edit arch/x86/Makefile appropriately, then do a ``make
clean``. You'll also need to enable CONFIG_PROC_FS (via ``make config``).
After you've rebooted with the new kernel, do ``gdb vmlinux /proc/kcore``.
You can now use all the usual gdb commands. The command to look up the
point where your system crashed is ``l *0xXXXXXXXX``. (Replace the XXXes
with the EIP value.)
gdb'ing a non-running kernel currently fails because ``gdb`` (wrongly)
disregards the starting offset for which the kernel is compiled.


@@ -0,0 +1,151 @@
Kernel Support for miscellaneous (your favourite) Binary Formats v1.1
=====================================================================
This Kernel feature allows you to invoke almost (for restrictions see below)
every program by simply typing its name in the shell.
This includes for example compiled Java(TM), Python or Emacs programs.
To achieve this you must tell binfmt_misc which interpreter has to be invoked
with which binary. Binfmt_misc recognises the binary-type by matching some bytes
at the beginning of the file with a magic byte sequence (masking out specified
bits) you have supplied. Binfmt_misc can also recognise a filename extension
such as ``.com`` or ``.exe``.
First you must mount binfmt_misc::
mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
To actually register a new binary type, you have to set up a string looking like
``:name:type:offset:magic:mask:interpreter:flags`` (where you can choose a
different delimiter than ``:`` to suit your needs) and echo it to
``/proc/sys/fs/binfmt_misc/register``.
Here is what the fields mean:
- ``name``
is an identifier string. A new /proc file will be created with this
name below ``/proc/sys/fs/binfmt_misc``; it cannot contain slashes ``/`` for
obvious reasons.
- ``type``
is the type of recognition. Give ``M`` for magic and ``E`` for extension.
- ``offset``
is the offset of the magic/mask in the file, counted in bytes. This
defaults to 0 if you omit it (i.e. you write ``:name:type::magic...``).
Ignored when using filename extension matching.
- ``magic``
is the byte sequence binfmt_misc is matching for. The magic string
may contain hex-encoded characters like ``\x0a`` or ``\xA4``. Note that you
must escape any NUL bytes; parsing halts at the first one. In a shell
environment you might have to write ``\\x0a`` to prevent the shell from
eating your ``\``.
If you chose filename extension matching, this is the extension to be
recognised (without the ``.``, the ``\x0a`` specials are not allowed).
Extension matching is case sensitive, and slashes ``/`` are not allowed!
- ``mask``
is an (optional, defaults to all 0xff) mask. You can mask out some
bits from matching by supplying a string in the same format as ``magic``
and of the same length.
The mask is anded with the byte sequence of the file. Note that you must
escape any NUL bytes; parsing halts at the first one. Ignored when using
filename extension matching.
- ``interpreter``
is the program that should be invoked with the binary as first
argument (specify the full path)
- ``flags``
is an optional field that controls several aspects of the invocation
of the interpreter. It is a string of capital letters, each controls a
certain aspect. The following flags are supported:
``P`` - preserve-argv[0]
Legacy behavior of binfmt_misc is to overwrite
the original argv[0] with the full path to the binary. When this
flag is included, binfmt_misc will add an argument to the argument
vector for this purpose, thus preserving the original ``argv[0]``.
e.g. If your interp is set to ``/bin/foo`` and you run ``blah``
(which is in ``/usr/local/bin``), then the kernel will execute
``/bin/foo`` with ``argv[]`` set to ``["/bin/foo", "/usr/local/bin/blah", "blah"]``.
The interp has to be aware of this so it can execute ``/usr/local/bin/blah``
with ``argv[]`` set to ``["blah"]``.
``O`` - open-binary
Legacy behavior of binfmt_misc is to pass the full path
of the binary to the interpreter as an argument. When this flag is
included, binfmt_misc will open the file for reading and pass its
descriptor as an argument, instead of the full path, thus allowing
the interpreter to execute non-readable binaries. This feature
should be used with care - the interpreter has to be trusted not to
emit the contents of the non-readable binary.
``C`` - credentials
Currently, the behavior of binfmt_misc is to calculate
the credentials and security token of the new process according to
the interpreter. When this flag is included, these attributes are
calculated according to the binary. It also implies the ``O`` flag.
This feature should be used with care as the interpreter
will run with root permissions when a setuid binary owned by root
is run with binfmt_misc.
``F`` - fix binary
The usual behaviour of binfmt_misc is to spawn the
binary lazily when the misc format file is invoked. However,
this doesn't work very well in the face of mount namespaces and
changeroots, so the ``F`` mode opens the binary as soon as the
emulation is installed and uses the opened image to spawn the
emulator, meaning it is always available once installed,
regardless of how the environment changes.
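User-mode emulators are a common case where this matters; such an entry might
be registered like this (a sketch: the magic/mask pair is the one commonly
used for 32-bit ARM ELF binaries, and the interpreter path is an assumption)::

    echo ':qemu-arm:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00:\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff:/usr/bin/qemu-arm-static:F' > register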
There are some restrictions:
- the whole register string may not exceed 1920 characters
- the magic must reside in the first 128 bytes of the file, i.e.
offset+size(magic) has to be less than 128
- the interpreter string may not exceed 127 characters
To use binfmt_misc you have to mount it first. You can mount it with
``mount -t binfmt_misc none /proc/sys/fs/binfmt_misc`` command, or you can add
a line ``none /proc/sys/fs/binfmt_misc binfmt_misc defaults 0 0`` to your
``/etc/fstab`` so it auto mounts on boot.
You may want to add the binary formats in one of your ``/etc/rc`` scripts during
boot-up. Read the manual of your init program to figure out how to do this
right.
Think about the order of adding entries! Later added entries are matched first!
A few examples (assuming you are in ``/proc/sys/fs/binfmt_misc``):
- enable support for em86 (like binfmt_em86, for Alpha AXP only)::
echo ':i386:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register
echo ':i486:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x06:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register
- enable support for packed DOS applications (pre-configured dosemu hdimages)::
echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register
- enable support for Windows executables using wine::
echo ':DOSWin:M::MZ::/usr/local/bin/wine:' > register
For java support see Documentation/admin-guide/java.rst
You can enable/disable binfmt_misc or one binary type by echoing 0 (to disable)
or 1 (to enable) to ``/proc/sys/fs/binfmt_misc/status`` or
``/proc/.../the_name``.
Catting the file tells you the current status of ``binfmt_misc/the_entry``.
You can remove one entry or all entries by echoing -1 to ``/proc/.../the_name``
or ``/proc/sys/fs/binfmt_misc/status``.
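For example, with the ``DOSWin`` entry registered above (a sketch)::

    echo 0 > /proc/sys/fs/binfmt_misc/DOSWin      # disable just this entry
    cat /proc/sys/fs/binfmt_misc/DOSWin           # show its current status
    echo 1 > /proc/sys/fs/binfmt_misc/DOSWin      # enable it again
    echo -1 > /proc/sys/fs/binfmt_misc/status     # remove all entries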
Hints
-----
If you want to pass special arguments to your interpreter, you can
write a wrapper script for it. See Documentation/admin-guide/java.rst for an
example.
Your interpreter should NOT look in the PATH for the filename; the kernel
passes it the full filename (or the file descriptor) to use. Using ``$PATH`` can
cause unexpected behaviour and can be a security hazard.
Richard Günther <rguenth@tat.physik.uni-tuebingen.de>


@@ -0,0 +1,38 @@
Linux Braille Console
=====================
To get early boot messages on a braille device (before userspace screen
readers can start), you first need to compile the support for the usual serial
console (see :ref:`Documentation/admin-guide/serial-console.rst <serial_console>`), and
for braille device
(in :menuselection:`Device Drivers --> Accessibility support --> Console on braille device`).
Then you need to specify a ``console=brl`` option on the kernel command line; the
format is::
console=brl,serial_options...
where ``serial_options...`` are the same as described in
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`.
So for instance you can use ``console=brl,ttyS0`` if the braille device is connected to the first serial port, and ``console=brl,ttyS0,115200`` to
override the baud rate to 115200, etc.
By default, the braille device will just show the last kernel message (console
mode). To review previous messages, press the Insert key to switch to the VT
review mode. In review mode, the arrow keys let you browse the VT content,
the :kbd:`PAGE-UP`/:kbd:`PAGE-DOWN` keys go to the top/bottom of the screen, and
the :kbd:`HOME` key goes back
to the cursor, hence providing very basic screen reviewing facility.
Sound feedback can be obtained by adding the ``braille_console.sound=1`` kernel
parameter.
For simplicity, only one braille console can be enabled, other uses of
``console=brl,...`` will be discarded. Also note that it does not interfere with
the console selection mechanism described in
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`.
For now, only the VisioBraille device is supported.
Samuel Thibault <samuel.thibault@ens-lyon.org>


@@ -0,0 +1,76 @@
Bisecting a bug
+++++++++++++++
Last updated: 28 October 2016
Introduction
============
Always try the latest kernel from kernel.org and build from source. If you are
not confident in doing that please report the bug to your distribution vendor
instead of to a kernel developer.
Finding bugs is not always easy. Have a go though. If you can't find it don't
give up. Report as much as you have found to the relevant maintainer. See
MAINTAINERS for who that is for the subsystem you have worked on.
Before you submit a bug report read
:ref:`Documentation/admin-guide/reporting-bugs.rst <reportingbugs>`.
Devices not appearing
=====================
Often this is caused by udev/systemd. Check that first before blaming it
on the kernel.
Finding patch that caused a bug
===============================
Using the provided tools with ``git`` makes finding bugs easy provided the bug
is reproducible.
Steps to do it:
- build the Kernel from its git source
- start bisect with [#f1]_::
$ git bisect start
- mark the broken changeset with::
$ git bisect bad [commit]
- mark a changeset where the code is known to work with::
$ git bisect good [commit]
- rebuild the Kernel and test
- interact with git bisect by using either::
$ git bisect good
or::
$ git bisect bad
depending on whether the bug happened in the changeset you're testing
- After some interactions, git bisect will give you the changeset that
likely caused the bug.
- For example, if you know that the current version is bad, and version
4.8 is good, you could do::
$ git bisect start
$ git bisect bad # Current version is bad
$ git bisect good v4.8
.. [#f1] You can, optionally, provide both good and bad arguments at git
start with ``git bisect start [BAD] [GOOD]``
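If the failure can be detected by a script, the whole search can also be
automated with ``git bisect run`` (a sketch; ``test.sh`` is a hypothetical
script that builds and tests the kernel, exiting 0 when it is good and
non-zero when it is bad)::

    $ git bisect start [BAD] [GOOD]
    $ git bisect run ./test.sh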
For further references, please read:
- The man page for ``git-bisect``
- `Fighting regressions with git bisect <https://www.kernel.org/pub/software/scm/git/docs/git-bisect-lk2009.html>`_
- `Fully automated bisecting with "git bisect run" <https://lwn.net/Articles/317154>`_
- `Using Git bisect to figure out when brokenness was introduced <http://webchick.net/node/99>`_


@@ -0,0 +1,369 @@
Bug hunting
===========
Kernel bug reports often come with a stack dump like the one below::
------------[ cut here ]------------
WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
CPU: 1 PID: 28102 Comm: rmmod Tainted: P WC O 4.8.4-build.1 #1
Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
00000000 c12ba080 00000000 00000000 c103ed6a c1616014 00000001 00006dc6
c1615862 00000454 c109e8a7 c109e8a7 00000009 ffffffff 00000000 f13f6a10
f5f5a600 c103ee33 00000009 00000000 00000000 c109e8a7 f80ca4d0 c109f617
Call Trace:
[<c12ba080>] ? dump_stack+0x44/0x64
[<c103ed6a>] ? __warn+0xfa/0x120
[<c109e8a7>] ? module_put+0x57/0x70
[<c109e8a7>] ? module_put+0x57/0x70
[<c103ee33>] ? warn_slowpath_null+0x23/0x30
[<c109e8a7>] ? module_put+0x57/0x70
[<f80ca4d0>] ? gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
[<c109f617>] ? symbol_put_addr+0x27/0x50
[<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
[<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
[<c13d03bc>] ? usb_disable_endpoint+0x7c/0xb0
[<f80bb48a>] ? dvb_usb_device_exit+0x2a/0x50 [dvb_usb]
[<c13d2882>] ? usb_unbind_interface+0x62/0x250
[<c136b514>] ? __pm_runtime_idle+0x44/0x70
[<c13620d8>] ? __device_release_driver+0x78/0x120
[<c1362907>] ? driver_detach+0x87/0x90
[<c1361c48>] ? bus_remove_driver+0x38/0x90
[<c13d1c18>] ? usb_deregister+0x58/0xb0
[<c109fbb0>] ? SyS_delete_module+0x130/0x1f0
[<c1055654>] ? task_work_run+0x64/0x80
[<c1000fa5>] ? exit_to_usermode_loop+0x85/0x90
[<c10013f0>] ? do_fast_syscall_32+0x80/0x130
[<c1549f43>] ? sysenter_past_esp+0x40/0x6a
---[ end trace 6ebc60ef3981792f ]---
Such stack traces provide enough information to identify the line inside the
Kernel's source code where the bug happened. Depending on the severity of
the issue, it may also contain the word **Oops**, as in this one::
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c06969d4>] iret_exc+0x7d0/0xa59
*pdpt = 000000002258a001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP
...
Whether it is an **Oops** or some other sort of stack trace, the offending
line is usually required to identify and handle the bug. Throughout this chapter,
we'll use "Oops" for all kinds of stack traces that need to be analyzed.
.. note::
``ksymoops`` is useless on 2.6 or upper. Please use the Oops in its original
format (from ``dmesg``, etc). Ignore any references in this or other docs to
"decoding the Oops" or "running it through ksymoops".
If you post an Oops from 2.6+ that has been run through ``ksymoops``,
people will just tell you to repost it.
Where is the Oops message located?
----------------------------------
Normally the Oops text is read from the kernel buffers by klogd and
handed to ``syslogd`` which writes it to a syslog file, typically
``/var/log/messages`` (depends on ``/etc/syslog.conf``). On systems with
systemd, it may also be stored by the ``journald`` daemon, and accessed
by running the ``journalctl`` command.
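On such systems, the kernel messages from the previous boot can usually be
retrieved with (a sketch; how many old boots are available depends on the
journald configuration)::

    journalctl -k -b -1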
Sometimes ``klogd`` dies, in which case you can run ``dmesg > file`` to
read the data from the kernel buffers and save it. Or you can
``cat /proc/kmsg > file``; however, you have to interrupt it to stop the
transfer, since ``kmsg`` is a "never ending file".
If the machine has crashed so badly that you cannot enter commands or
the disk is not available then you have three options:
(1) Hand copy the text from the screen and type it in after the machine
has restarted. Messy but it is the only option if you have not
planned for a crash. Alternatively, you can take a picture of
the screen with a digital camera - not nice, but better than
nothing. If the messages scroll off the top of the console, you
may find that booting with a higher resolution (eg, ``vga=791``)
will allow you to read more of the text. (Caveat: This needs ``vesafb``,
so won't help for 'early' oopses)
(2) Boot with a serial console (see
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`),
run a null modem to a second machine and capture the output there
using your favourite communication program. Minicom works well.
(3) Use Kdump (see Documentation/kdump/kdump.txt),
extract the kernel ring buffer from the old memory using the dmesg
gdbmacro in Documentation/kdump/gdbmacros.txt.
Finding the bug's location
--------------------------
Reporting a bug works best if you point to the location of the bug in the
Kernel source file. There are two methods for doing that. Usually, using
``gdb`` is easier, but the Kernel should be pre-compiled with debug info.
gdb
^^^
The GNU debugger (``gdb``) is the best way to figure out the exact file and line
number of the OOPS from the ``vmlinux`` file.
The usage of gdb works best on a kernel compiled with ``CONFIG_DEBUG_INFO``.
This can be set by running::
$ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
On a kernel compiled with ``CONFIG_DEBUG_INFO``, you can simply copy the
EIP value from the OOPS::
EIP: 0060:[<c021e50e>] Not tainted VLI
And use GDB to translate that to human-readable form::
$ gdb vmlinux
(gdb) l *0xc021e50e
If you don't have ``CONFIG_DEBUG_INFO`` enabled, you can use the function
offset from the OOPS::
EIP is at vt_ioctl+0xda8/0x1482
And recompile the kernel with ``CONFIG_DEBUG_INFO`` enabled::
$ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
$ make vmlinux
$ gdb vmlinux
(gdb) l *vt_ioctl+0xda8
0x1888 is in vt_ioctl (drivers/tty/vt/vt_ioctl.c:293).
288 {
289 struct vc_data *vc = NULL;
290 int ret = 0;
291
292 console_lock();
293 if (VT_BUSY(vc_num))
294 ret = -EBUSY;
295 else if (vc_num)
296 vc = vc_deallocate(vc_num);
297 console_unlock();
or, if you want to be more verbose::
(gdb) p vt_ioctl
$1 = {int (struct tty_struct *, unsigned int, unsigned long)} 0xae0 <vt_ioctl>
(gdb) l *0xae0+0xda8
You could, instead, use the object file::
$ make drivers/tty/
$ gdb drivers/tty/vt/vt_ioctl.o
(gdb) l *vt_ioctl+0xda8
If you have a call trace, such as::
Call Trace:
[<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
[<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
[<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
...
this shows that the problem is likely in the :jbd: module. You can load that module
in gdb and list the relevant code::
$ gdb fs/jbd/jbd.ko
(gdb) l *log_wait_commit+0xa3
.. note::
You can also do the same for any function call at the stack trace,
like this one::
[<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
The position where the above call happened can be seen with::
$ gdb drivers/media/usb/dvb-usb/dvb-usb.o
(gdb) l *dvb_usb_adapter_frontend_exit+0x3a
objdump
^^^^^^^
To debug a kernel, use objdump and look for the hex offset from the crash
output to find the valid line of code/assembler. Without debug symbols, you
will see the assembler code for the routine shown, but if your kernel has
debug symbols the C code will also be available. (Debug symbols can be enabled
in the kernel hacking menu of the menu configuration.) For example::
$ objdump -r -S -l --disassemble net/dccp/ipv4.o
.. note::
You need to be at the top level of the kernel tree for this to pick up
your C files.
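Where debug symbols are available, ``addr2line`` from binutils offers a quicker
alternative to reading the full disassembly (a sketch; the address is the EIP
value from the earlier OOPS example)::

    $ addr2line -e vmlinux -f -i 0xc021e50e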
If you don't have access to the code, you can also debug some crash dumps,
e.g. crash dump output as shown by Dave Miller::
EIP is at +0x14/0x4c0
...
Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
<8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
Put the bytes into a "foo.s" file like this:
.text
.globl foo
foo:
.byte .... /* bytes from Code: part of OOPS dump */
Compile it with "gcc -c -o foo.o foo.s" then look at the output of
"objdump --disassemble foo.o".
Output:
ip_queue_xmit:
push %ebp
push %edi
push %esi
push %ebx
sub $0xbc, %esp
mov 0xd0(%esp), %ebp ! %ebp = arg0 (skb)
mov 0x8(%ebp), %ebx ! %ebx = skb->sk
mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
Reporting the bug
-----------------
Once you find where the bug happened, by inspecting its location,
you could either try to fix it yourself or report it upstream.
In order to report it upstream, you should identify the mailing list
used for the development of the affected code. This can be done by using
the ``get_maintainer.pl`` script.
For example, if you find a bug in the gspca's sonixj.c file, you can get
its maintainers with::
$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
Bhaktipriya Shridhar <bhaktipriya96@gmail.com> (commit_signer:1/1=100%,authored:1/1=100%,added_lines:4/4=100%,removed_lines:9/9=100%)
linux-media@vger.kernel.org (open list:GSPCA USB WEBCAM DRIVER)
linux-kernel@vger.kernel.org (open list)
Please note that it will point to:
- The last developers that touched the source code. In the above example,
Tejun and Bhaktipriya (in this specific case, neither was really involved in the
development of this file);
- The driver maintainer (Hans Verkuil);
- The subsystem maintainer (Mauro Carvalho Chehab)
- The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
- the Linux Kernel mailing list (linux-kernel@vger.kernel.org).
Usually, the fastest way to have your bug fixed is to report it to the mailing
list used for the development of the code (the linux-media ML), copying the driver maintainer (Hans).
If you are totally stumped as to whom to send the report, and
``get_maintainer.pl`` didn't provide you anything useful, send it to
linux-kernel@vger.kernel.org.
Thanks for your help in making Linux as stable as humanly possible.
Fixing the bug
--------------
If you know programming, you could help us by not only reporting the bug,
but also providing us with a solution. After all, open source is about
sharing what you do; and don't you want to be recognised for your genius?
If you decide to take this way, once you have worked out a fix please submit
it upstream.
Please do read
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` though
to help your code get accepted.
---------------------------------------------------------------------------
Notes on Oops tracing with ``klogd``
------------------------------------
In order to help Linus and the other kernel developers there has been
substantial support incorporated into ``klogd`` for processing protection
faults. In order to have full support for address resolution at least
version 1.3-pl3 of the ``sysklogd`` package should be used.
When a protection fault occurs the ``klogd`` daemon automatically
translates important addresses in the kernel log messages to their
symbolic equivalents. This translated kernel message is then
forwarded through whatever reporting mechanism ``klogd`` is using. The
protection fault message can be simply cut out of the message files
and forwarded to the kernel developers.
Two types of address resolution are performed by ``klogd``. The first is
static translation and the second is dynamic translation. Static
translation uses the System.map file in much the same manner that
ksymoops does. In order to do static translation the ``klogd`` daemon
must be able to find a system map file at daemon initialization time.
See the klogd man page for information on how ``klogd`` searches for map
files.
Dynamic address translation is important when kernel loadable modules
are being used. Since memory for kernel modules is allocated from the
kernel's dynamic memory pools there are no fixed locations for either
the start of the module or for functions and symbols in the module.
The kernel supports system calls which allow a program to determine
which modules are loaded and their location in memory. Using these
system calls the klogd daemon builds a symbol table which can be used
to debug a protection fault which occurs in a loadable kernel module.
At the very minimum klogd will provide the name of the module which
generated the protection fault. There may be additional symbolic
information available if the developer of the loadable module chose to
export symbol information from the module.
Since the kernel module environment can be dynamic there must be a
mechanism for notifying the ``klogd`` daemon when a change in module
environment occurs. There are command line options available which
allow klogd to signal the currently executing daemon that symbol
information should be refreshed. See the ``klogd`` manual page for more
information.
A patch is included with the sysklogd distribution which modifies the
``modules-2.0.0`` package to automatically signal klogd whenever a module
is loaded or unloaded. Applying this patch provides essentially
seamless support for debugging protection faults which occur with
kernel loadable modules.
The following is an example of a protection fault in a loadable module
processed by ``klogd``::
Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc
Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000
Aug 29 09:51:01 blizard kernel: *pde = 00000000
Aug 29 09:51:01 blizard kernel: Oops: 0002
Aug 29 09:51:01 blizard kernel: CPU: 0
Aug 29 09:51:01 blizard kernel: EIP: 0010:[oops:_oops+16/3868]
Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212
Aug 29 09:51:01 blizard kernel: eax: 315e97cc ebx: 003a6f80 ecx: 001be77b edx: 00237c0c
Aug 29 09:51:01 blizard kernel: esi: 00000000 edi: bffffdb3 ebp: 00589f90 esp: 00589f8c
Aug 29 09:51:01 blizard kernel: ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000)
Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001
Aug 29 09:51:01 blizard kernel: 00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00
Aug 29 09:51:01 blizard kernel: bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036
Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128]
Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3
---------------------------------------------------------------------------
::
Dr. G.W. Wettstein Oncology Research Div. Computing Facility
Roger Maris Cancer Center INTERNET: greg@wind.rmcc.com
820 4th St. N.
Fargo, ND 58122
Phone: 701-234-7556


@@ -0,0 +1,10 @@
# -*- coding: utf-8; mode: python -*-
project = 'Linux Kernel User Documentation'
tags.add("subproject")
latex_documents = [
    ('index', 'linux-user.tex', 'Linux Kernel User Documentation',
     'The kernel development community', 'manual'),
]


@@ -0,0 +1,268 @@
Linux allocated devices (4.x+ version)
======================================
This list is the Linux Device List, the official registry of allocated
device numbers and ``/dev`` directory nodes for the Linux operating
system.
The LaTeX version of this document is no longer maintained, nor is
the document that used to reside at lanana.org. This version in the
mainline Linux kernel is the master document. Updates shall be sent
as patches to the kernel maintainers (see the
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` document).
Specifically explore the sections titled "CHAR and MISC DRIVERS", and
"BLOCK LAYER" in the MAINTAINERS file to find the right maintainers
to involve for character and block devices.
This document is included by reference into the Filesystem Hierarchy
Standard (FHS). The FHS is available from http://www.pathname.com/fhs/.
Allocations marked (68k/Amiga) apply to Linux/68k on the Amiga
platform only. Allocations marked (68k/Atari) apply to Linux/68k on
the Atari platform only.
This document is in the public domain. The authors request, however,
that semantically altered versions are not distributed without
permission of the authors, assuming the authors can be contacted without
an unreasonable effort.
.. attention::
DEVICE DRIVERS AUTHORS PLEASE READ THIS
Linux now has extensive support for dynamic allocation of device numbering
and can use ``sysfs`` and ``udev`` (``systemd``) to handle the naming needs.
There are still some exceptions in the serial and boot device area. Before
asking for a device number make sure you actually need one.
To have a major number allocated, or a minor number in situations
where that applies (e.g. busmice), please submit a patch and send to
the authors as indicated above.
Keep the description of the device *in the same format
as this list*. The reason for this is that it is the only way we have
found to ensure we have all the requisite information to publish your
device and avoid conflicts.
Finally, sometimes we have to play "namespace police." Please don't be
offended. We often get submissions for ``/dev`` names that would be bound
to cause conflicts down the road. We are trying to avoid getting in a
situation where we would have to suffer an incompatible forward
change. Therefore, please consult with us **before** you make your
device names and numbers in any way public, at least to the point
where it would be at all difficult to get them changed.
Your cooperation is appreciated.
.. include:: devices.txt
:literal:
Additional ``/dev/`` directory entries
--------------------------------------
This section details additional entries that should or may exist in
the /dev directory. It is preferred that symbolic links use the same
form (absolute or relative) as is indicated here. Links are
classified as "hard" or "symbolic" depending on the preferred type of
link; if possible, the indicated type of link should be used.
Compulsory links
++++++++++++++++
These links should exist on all systems:
=============== =============== =============== ===============================
/dev/fd /proc/self/fd symbolic File descriptors
/dev/stdin fd/0 symbolic stdin file descriptor
/dev/stdout fd/1 symbolic stdout file descriptor
/dev/stderr fd/2 symbolic stderr file descriptor
/dev/nfsd socksys symbolic Required by iBCS-2
/dev/X0R null symbolic Required by iBCS-2
=============== =============== =============== ===============================
Note: ``/dev/X0R`` is <letter X>-<digit 0>-<letter R>.
Recommended links
+++++++++++++++++
It is recommended that these links exist on all systems:
=============== =============== =============== ===============================
/dev/core /proc/kcore symbolic Backward compatibility
/dev/ramdisk ram0 symbolic Backward compatibility
/dev/ftape qft0 symbolic Backward compatibility
/dev/bttv0 video0 symbolic Backward compatibility
/dev/radio radio0 symbolic Backward compatibility
/dev/i2o* /dev/i2o/* symbolic Backward compatibility
/dev/scd? sr? hard Alternate SCSI CD-ROM name
=============== =============== =============== ===============================
Locally defined links
+++++++++++++++++++++
The following links may be established locally to conform to the
configuration of the system. This is merely a tabulation of existing
practice, and does not constitute a recommendation. However, if they
exist, they should have the following uses.
=============== =============== =============== ===============================
/dev/mouse mouse port symbolic Current mouse device
/dev/tape tape device symbolic Current tape device
/dev/cdrom CD-ROM device symbolic Current CD-ROM device
/dev/cdwriter CD-writer symbolic Current CD-writer device
/dev/scanner scanner symbolic Current scanner device
/dev/modem modem port symbolic Current dialout device
/dev/root root device symbolic Current root filesystem
/dev/swap swap device symbolic Current swap device
=============== =============== =============== ===============================
``/dev/modem`` should not be used for a modem which supports dialin as
well as dialout, as it tends to cause lock file problems. If it
exists, ``/dev/modem`` should point to the appropriate primary TTY device
(the use of the alternate callout devices is deprecated).
For SCSI devices, ``/dev/tape`` and ``/dev/cdrom`` should point to the
*cooked* devices (``/dev/st*`` and ``/dev/sr*``, respectively), whereas
``/dev/cdwriter`` and /dev/scanner should point to the appropriate generic
SCSI devices (/dev/sg*).
``/dev/mouse`` may point to a primary serial TTY device, a hardware mouse
device, or a socket for a mouse driver program (e.g. ``/dev/gpmdata``).
Sockets and pipes
+++++++++++++++++
Non-transient sockets and named pipes may exist in /dev. Common entries are:
=============== =============== ===============================================
/dev/printer socket lpd local socket
/dev/log socket syslog local socket
/dev/gpmdata socket gpm mouse multiplexer
=============== =============== ===============================================
Mount points
++++++++++++
The following names are reserved for mounting special filesystems
under /dev. These special filesystems provide kernel interfaces that
cannot be provided with standard device nodes.
=============== =============== ===============================================
/dev/pts devpts PTY slave filesystem
/dev/shm tmpfs POSIX shared memory maintenance access
=============== =============== ===============================================
Terminal devices
----------------
Terminal, or TTY devices are a special class of character devices. A
terminal device is any device that could act as a controlling terminal
for a session; this includes virtual consoles, serial ports, and
pseudoterminals (PTYs).
All terminal devices share a common set of capabilities known as line
disciplines; these include the common terminal line discipline as well
as SLIP and PPP modes.
All terminal devices are named similarly; this section explains the
naming and use of the various types of TTYs. Note that the naming
conventions include several historical warts; some of these are
Linux-specific, some were inherited from other systems, and some
reflect Linux outgrowing a borrowed convention.
A hash mark (``#``) in a device name is used here to indicate a decimal
number without leading zeroes.
Virtual consoles and the console device
+++++++++++++++++++++++++++++++++++++++
Virtual consoles are full-screen terminal displays on the system video
monitor. Virtual consoles are named ``/dev/tty#``, with numbering
starting at ``/dev/tty1``; ``/dev/tty0`` is the current virtual console.
``/dev/tty0`` is the device that should be used to access the system video
card on those architectures for which the frame buffer devices
(``/dev/fb*``) are not applicable. Do not use ``/dev/console``
for this purpose.
The console device, ``/dev/console``, is the device to which system
messages should be sent, and on which logins should be permitted in
single-user mode. Starting with Linux 2.1.71, ``/dev/console`` is managed
by the kernel; for previous versions it should be a symbolic link to
either ``/dev/tty0``, a specific virtual console such as ``/dev/tty1``, or to
a serial port primary (``tty*``, not ``cu*``) device, depending on the
configuration of the system.
Serial ports
++++++++++++
Serial ports are RS-232 serial ports and any device which simulates
one, either in hardware (such as internal modems) or in software (such
as the ISDN driver.) Under Linux, each serial port has two device
names, the primary or callin device and the alternate or callout one.
Each kind of device is indicated by a different letter. For any
letter X, the names of the devices are ``/dev/ttyX#`` and ``/dev/cux#``,
respectively; for historical reasons, ``/dev/ttyS#`` and ``/dev/ttyC#``
correspond to ``/dev/cua#`` and ``/dev/cub#``. In the future, it should be
expected that multiple letters will be used; all letters will be upper
case for the "tty" device (e.g. ``/dev/ttyDP#``) and lower case for the
"cu" device (e.g. ``/dev/cudp#``).
The names ``/dev/ttyQ#`` and ``/dev/cuq#`` are reserved for local use.
The alternate devices provide for kernel-based exclusion and somewhat
different defaults than the primary devices. Their main purpose is to
allow the use of serial ports with programs with no inherent or broken
support for serial ports. Their use is deprecated, and they may be
removed from a future version of Linux.
Arbitration of serial ports is provided by the use of lock files with
the names ``/var/lock/LCK..ttyX#``. The contents of the lock file should
be the PID of the locking process as an ASCII number.
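For example, a process with PID 1234 holding ``/dev/ttyS2`` would create a
lock file whose contents look roughly like this (a sketch; some
implementations pad the number to a fixed width)::

    $ cat /var/lock/LCK..ttyS2
    1234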
It is common practice to install links such as /dev/modem
which point to serial ports. In order to ensure proper locking in the
presence of these links, it is recommended that software chase
symlinks and lock all possible names; additionally, it is recommended
that a lock file be installed with the corresponding alternate
device. In order to avoid deadlocks, it is recommended that the locks
are acquired in the following order, and released in the reverse:
1. The symbolic link name, if any (``/var/lock/LCK..modem``)
2. The "tty" name (``/var/lock/LCK..ttyS2``)
3. The alternate device name (``/var/lock/LCK..cua2``)
In the case of nested symbolic links, the lock files should be
installed in the order the symlinks are resolved.
Under no circumstances should an application hold a lock while waiting
for another to be released. In addition, applications which attempt
to create lock files for the corresponding alternate device names
should take into account the possibility of being used on a non-serial
port TTY, for which no alternate device would exist.
Pseudoterminals (PTYs)
++++++++++++++++++++++
Pseudoterminals, or PTYs, are used to create login sessions or provide
other capabilities requiring a TTY line discipline (including SLIP or
PPP capability) to arbitrary data-generation processes. Each PTY has
a master side, named ``/dev/pty[p-za-e][0-9a-f]``, and a slave side, named
``/dev/tty[p-za-e][0-9a-f]``. The kernel arbitrates the use of PTYs by
allowing each master side to be opened only once.
Once the master side has been opened, the corresponding slave device
can be used in the same manner as any TTY device. The master and
slave devices are connected by the kernel, generating the equivalent
of a bidirectional pipe with TTY capabilities.
Recent versions of the Linux kernels and GNU libc contain support for
the System V/Unix98 naming scheme for PTYs, which assigns a common
device, ``/dev/ptmx``, to all the masters (opening it will automatically
give you a previously unassigned PTY) and a subdirectory, ``/dev/pts``,
for the slaves; the slaves are named with decimal integers (``/dev/pts/#``
in our notation). This removes the problem of exhausting the
namespace and enables the kernel to automatically create the device
nodes for the slaves on demand using the "devpts" filesystem.

File diff suppressed because it is too large


@@ -0,0 +1,353 @@
Dynamic debug
+++++++++++++
Introduction
============
This document describes how to use the dynamic debug (dyndbg) feature.
Dynamic debug is designed to allow you to dynamically enable/disable
kernel code to obtain additional kernel information. Currently, if
``CONFIG_DYNAMIC_DEBUG`` is set, then all ``pr_debug()``/``dev_dbg()`` and
``print_hex_dump_debug()``/``print_hex_dump_bytes()`` calls can be dynamically
enabled per-callsite.
If ``CONFIG_DYNAMIC_DEBUG`` is not set, ``print_hex_dump_debug()`` is just a
shortcut for ``print_hex_dump(KERN_DEBUG)``.
For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, the format string is
its ``prefix_str`` argument if it is a constant string, or ``hexdump``
in case ``prefix_str`` is built dynamically.
Dynamic debug has even more useful features:
* A simple query language allows turning debugging statements on and
off by matching any combination of 0 or 1 of:
- source filename
- function name
- line number (including ranges of line numbers)
- module name
- format string
* Provides a debugfs control file: ``<debugfs>/dynamic_debug/control``
which can be read to display the complete list of known debug
statements, to help guide you
Controlling Dynamic Debug Behaviour
===================================
The behaviour of ``pr_debug()``/``dev_dbg()`` is controlled via writing to a
control file in the 'debugfs' filesystem. Thus, you must first mount
the debugfs filesystem, in order to make use of this feature.
Subsequently, we refer to the control file as:
``<debugfs>/dynamic_debug/control``. For example, if you want to enable
printing from source file ``svcsock.c``, line 1603 you simply do::
nullarbor:~ # echo 'file svcsock.c line 1603 +p' >
<debugfs>/dynamic_debug/control
If you make a mistake with the syntax, the write will fail thus::
nullarbor:~ # echo 'file svcsock.c wtf 1 +p' >
<debugfs>/dynamic_debug/control
-bash: echo: write error: Invalid argument
Viewing Dynamic Debug Behaviour
===============================
You can view the currently configured behaviour of all the debug
statements via::
nullarbor:~ # cat <debugfs>/dynamic_debug/control
# filename:lineno [module]function flags format
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:323 [svcxprt_rdma]svc_rdma_cleanup =_ "SVCRDMA Module Removed, deregister RPC RDMA transport\012"
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:341 [svcxprt_rdma]svc_rdma_init =_ "\011max_inline : %d\012"
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:340 [svcxprt_rdma]svc_rdma_init =_ "\011sq_depth : %d\012"
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:338 [svcxprt_rdma]svc_rdma_init =_ "\011max_requests : %d\012"
...
You can also apply standard Unix text manipulation filters to this
data, e.g.::
nullarbor:~ # grep -i rdma <debugfs>/dynamic_debug/control | wc -l
62
nullarbor:~ # grep -i tcp <debugfs>/dynamic_debug/control | wc -l
42
The third column shows the currently enabled flags for each debug
statement callsite (see below for definitions of the flags). The
default value, with no flags enabled, is ``=_``. So you can view all
the debug statement callsites with any non-default flags::
nullarbor:~ # awk '$3 != "=_"' <debugfs>/dynamic_debug/control
# filename:lineno [module]function flags format
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c:1603 [sunrpc]svc_send p "svc_process: st_sendto returned %d\012"
Command Language Reference
==========================
At the lexical level, a command comprises a sequence of words separated
by spaces or tabs. So these are all equivalent::
nullarbor:~ # echo 'file svcsock.c line 1603 +p' >
<debugfs>/dynamic_debug/control
nullarbor:~ # echo '  file   svcsock.c     line  1603 +p  ' >
<debugfs>/dynamic_debug/control
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
<debugfs>/dynamic_debug/control
Command submissions are bounded by a write() system call.
Multiple commands can be written together, separated by ``;`` or ``\n``::
~# echo "func pnpacpi_get_resources +p; func pnp_assign_mem +p" \
> <debugfs>/dynamic_debug/control
If your query set is big, you can batch them too::
~# cat query-batch-file > <debugfs>/dynamic_debug/control
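Such a batch file is simply a list of queries, one per line. For example
(a sketch only; the function, file and module names below are taken from
the examples elsewhere in this document)::

  ~# cat > query-batch-file <<'EOF'
  func svc_process +p
  file svcsock.c +p
  module nfsd +p
  EOF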
Another way is to use wildcards. The match rules support ``*`` (matches
zero or more characters) and ``?`` (matches exactly one character). For
example, you can match all usb drivers::
~# echo "file drivers/usb/* +p" > <debugfs>/dynamic_debug/control
At the syntactical level, a command comprises a sequence of match
specifications, followed by a flags change specification::
command ::= match-spec* flags-spec
The match-spec's are used to choose a subset of the known pr_debug()
callsites to which to apply the flags-spec. Think of them as a query
with implicit ANDs between each pair. Note that an empty list of
match-specs will select all debug statement callsites.
A match specification comprises a keyword, which controls the
attribute of the callsite to be compared, and a value to compare
against. Possible keywords are::
match-spec ::= 'func' string |
'file' string |
'module' string |
'format' string |
'line' line-range
line-range ::= lineno |
'-'lineno |
lineno'-' |
lineno'-'lineno
lineno ::= unsigned-int
.. note::
``line-range`` cannot contain spaces, e.g.
"1-30" is a valid range but "1 - 30" is not.
The meanings of each keyword are:
func
The given string is compared against the function name
of each callsite. Example::
func svc_tcp_accept
file
The given string is compared against either the full pathname, the
src-root relative pathname, or the basename of the source file of
each callsite. Examples::
file svcsock.c
file kernel/freezer.c
file /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c
module
The given string is compared against the module name
of each callsite. The module name is the string as
seen in ``lsmod``, i.e. without the directory or the ``.ko``
suffix and with ``-`` changed to ``_``. Examples::
module sunrpc
module nfsd
format
The given string is searched for in the dynamic debug format
string. Note that the string does not need to match the
entire format, only some part. Whitespace and other
special characters can be escaped using C octal character
escape ``\ooo`` notation, e.g. the space character is ``\040``.
Alternatively, the string can be enclosed in double quote
characters (``"``) or single quote characters (``'``).
Examples::
format svcrdma: // many of the NFS/RDMA server pr_debugs
format readahead // some pr_debugs in the readahead cache
format nfsd:\040SETATTR // one way to match a format with whitespace
format "nfsd: SETATTR" // a neater way to match a format with whitespace
format 'nfsd: SETATTR' // yet another way to match a format with whitespace
line
The given line number or range of line numbers is compared
against the line number of each ``pr_debug()`` callsite. A single
line number matches the callsite line number exactly. A
range of line numbers matches any callsite between the first
and last line number inclusive. An empty first number means
the first line in the file, and an empty last number means the
last line in the file. Examples::
line 1603 // exactly line 1603
line 1600-1605 // the six lines from line 1600 to line 1605
line -1605 // the 1605 lines from line 1 to line 1605
line 1600- // all lines from line 1600 to the end of the file
The flags specification comprises a change operation followed
by one or more flag characters. The change operation is one
of the characters::
- remove the given flags
+ add the given flags
= set the flags to the given flags
The flags are::
p enables the pr_debug() callsite.
f Include the function name in the printed message
l Include line number in the printed message
m Include module name in the printed message
t Include thread ID in messages not generated from interrupt context
_ No flags are set. (Or'd with others on input)
For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only the ``p`` flag
has meaning; the other flags are ignored.
For display, the flags are preceded by ``=``
(mnemonic: what the flags are currently equal to).
Note the regexp ``^[-+=][flmpt_]+$`` matches a flags specification.
To clear all flags at once, use ``=_`` or ``-flmpt``.
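For example, to enable the ``svc_process()`` callsites (as used in the
examples above) and have each message include the function name, line
number and module name, the flags can be set in a single operation::

  nullarbor:~ # echo -n 'func svc_process =pflm' > <debugfs>/dynamic_debug/control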
Debug messages during Boot Process
==================================
To activate debug messages for core code and built-in modules during
the boot process, even before userspace and debugfs exist, use
``dyndbg="QUERY"``, ``module.dyndbg="QUERY"``, or ``ddebug_query="QUERY"``
(``ddebug_query`` is obsoleted by ``dyndbg``, and deprecated). QUERY follows
the syntax described above, but must not exceed 1023 characters. Your
bootloader may impose lower limits.
These ``dyndbg`` params are processed just after the ddebug tables are
processed, as part of the arch_initcall. Thus you can enable debug
messages in all code run after this arch_initcall via this boot
parameter.
On an x86 system for example ACPI enablement is a subsys_initcall and::
dyndbg="file ec.c +p"
will show early Embedded Controller transactions during ACPI setup if
your machine (typically a laptop) has an Embedded Controller.
PCI (or other devices) initialization also is a hot candidate for using
this boot parameter for debugging purposes.
If the ``foo`` module is not built in, ``foo.dyndbg`` will still be processed at
boot time, without effect, but will be reprocessed when the module is
loaded later. ``ddebug_query=`` and bare ``dyndbg=`` are only processed at
boot.
Debug Messages at Module Initialization Time
============================================
When ``modprobe foo`` is called, modprobe scans ``/proc/cmdline`` for
``foo.params``, strips ``foo.``, and passes them to the kernel along with
params given in modprobe args or ``/etc/modprobe.d/*.conf`` files,
in the following order:
1. parameters given via ``/etc/modprobe.d/*.conf``::
options foo dyndbg=+pt
options foo dyndbg # defaults to +p
2. ``foo.dyndbg`` as given in boot args, ``foo.`` is stripped and passed::
foo.dyndbg=" func bar +p; func buz +mp"
3. args to modprobe::
modprobe foo dyndbg==pmf # override previous settings
These ``dyndbg`` queries are applied in order, with last having final say.
This allows boot args to override or modify those from ``/etc/modprobe.d``
(sensible, since 1 is system wide, 2 is kernel or boot specific), and
modprobe args to override both.
In the ``foo.dyndbg="QUERY"`` form, the query must exclude ``module foo``.
``foo`` is extracted from the param-name, and applied to each query in
``QUERY``, and only 1 match-spec of each type is allowed.
The ``dyndbg`` option is a "fake" module parameter, which means:
- modules do not need to define it explicitly
- every module gets it tacitly, whether they use pr_debug or not
- it doesn't appear in ``/sys/module/$module/parameters/``
To see it, grep the control file, or inspect ``/proc/cmdline``.
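For example, to see which flags are currently applied to the callsites of a
given module (``nfsd`` is used here purely as an example), grep for the module
name in the second column of the control file::

  nullarbor:~ # grep '\[nfsd\]' <debugfs>/dynamic_debug/control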
For ``CONFIG_DYNAMIC_DEBUG`` kernels, any settings given at boot-time (or
enabled by the ``-DDEBUG`` flag during compilation) can be disabled later via
the debugfs interface if the debug messages are no longer needed::
echo "module module_name -p" > <debugfs>/dynamic_debug/control
Examples
========
::
// enable the message at line 1603 of file svcsock.c
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
<debugfs>/dynamic_debug/control
// enable all the messages in file svcsock.c
nullarbor:~ # echo -n 'file svcsock.c +p' >
<debugfs>/dynamic_debug/control
// enable all the messages in the NFS server module
nullarbor:~ # echo -n 'module nfsd +p' >
<debugfs>/dynamic_debug/control
// enable all 12 messages in the function svc_process()
nullarbor:~ # echo -n 'func svc_process +p' >
<debugfs>/dynamic_debug/control
// disable all 12 messages in the function svc_process()
nullarbor:~ # echo -n 'func svc_process -p' >
<debugfs>/dynamic_debug/control
// enable messages for NFS calls READ, READLINK, READDIR and READDIR+.
nullarbor:~ # echo -n 'format "nfsd: READ" +p' >
<debugfs>/dynamic_debug/control
// enable messages in files of which the paths include string "usb"
nullarbor:~ # echo -n '*usb* +p' > <debugfs>/dynamic_debug/control
// enable all messages
nullarbor:~ # echo -n '+p' > <debugfs>/dynamic_debug/control
// add module, function to all enabled messages
nullarbor:~ # echo -n '+mf' > <debugfs>/dynamic_debug/control
// boot-args example, with newlines and comments for readability
Kernel command line: ...
// see what's going on in dyndbg=value processing
dynamic_debug.verbose=1
// enable pr_debugs in 2 builtins, #cmt is stripped
dyndbg="module params +p #cmt ; module sys +p"
// enable pr_debugs in 2 functions in a module loaded later
pc87360.dyndbg="func pc87360_init_device +p; func pc87360_find +p"


@@ -0,0 +1,68 @@
The Linux kernel user's and administrator's guide
=================================================
The following is a collection of user-oriented documents that have been
added to the kernel over time. There is, as yet, little overall order or
organization here — this material was not written to be a single, coherent
document! With luck things will improve quickly over time.
This initial section contains overall information, including the README
file describing the kernel as a whole, documentation on kernel parameters,
etc.
.. toctree::
:maxdepth: 1
README
kernel-parameters
devices
Here is a set of documents aimed at users who are trying to track down
problems and bugs in particular.
.. toctree::
:maxdepth: 1
reporting-bugs
security-bugs
bug-hunting
bug-bisect
tainted-kernels
ramoops
dynamic-debug-howto
init
This is the beginning of a section with information of interest to
application developers. Documents covering various aspects of the kernel
ABI will be found here.
.. toctree::
:maxdepth: 1
sysfs-rules
The rest of this manual consists of various unordered guides on how to
configure specific aspects of kernel behavior to your liking.
.. toctree::
:maxdepth: 1
initrd
serial-console
braille-console
parport
md
module-signing
sysrq
unicode
vga-softcursor
binfmt-misc
mono
java
.. only:: subproject and html
Indices
=======
* :ref:`genindex`


@@ -0,0 +1,52 @@
Explaining the dreaded "No init found." boot hang message
=========================================================
OK, so you've got this pretty unintuitive message (currently located
in init/main.c) and are wondering what the H*** went wrong.
Some high-level reasons for failure (listed roughly in order of execution)
to load the init binary are:
A) Unable to mount root FS
B) init binary doesn't exist on rootfs
C) broken console device
D) binary exists but dependencies not available
E) binary cannot be loaded
Detailed explanations:
A) Set "debug" kernel parameter (in bootloader config file or CONFIG_CMDLINE)
to get more detailed kernel messages.
B) make sure you have the correct root FS type
(and ``root=`` kernel parameter points to the correct partition),
required drivers such as storage hardware (such as SCSI or USB!)
and filesystem (ext3, jffs2 etc.) are builtin (alternatively as modules,
to be pre-loaded by an initrd)
C) Possibly a conflict in ``console=`` setup --> initial console unavailable.
E.g. some serial consoles are unreliable due to serial IRQ issues (e.g.
missing interrupt-based configuration).
Try using a different ``console=`` device or e.g. ``netconsole=``.
D) e.g. required library dependencies of the init binary such as
``/lib/ld-linux.so.2`` missing or broken. Use
``readelf -d <INIT>|grep NEEDED`` to find out which libraries are required.
E) make sure the binary's architecture matches your hardware.
E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM hardware.
In case you tried loading a non-binary file here (shell script?),
you should make sure that the script specifies an interpreter in its shebang
header line (``#!/...``) that is fully working (including its library
dependencies). And before tackling scripts, better first test a simple
non-script binary such as ``/bin/sh`` and confirm its successful execution.
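For cases D) and E) above, a quick sanity check from a rescue environment
might look like the following sketch (``/mnt/root`` is only an example mount
point for the root filesystem in question; ``file`` and ``readelf`` must be
available in the rescue system)::

  # which architecture was the init binary built for?
  file /mnt/root/sbin/init
  # which shared libraries does it need?
  readelf -d /mnt/root/sbin/init | grep NEEDED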
To find out more, add code to ``init/main.c`` to display kernel_execve()'s
return values.
Please extend this explanation whenever you find new failure causes
(after all, loading the init binary is a CRITICAL and hard transition step
which needs to be made as painless as possible), then submit a patch to LKML.
Further TODOs:
- Implement the various ``run_init_process()`` invocations via a struct array
which can then store the ``kernel_execve()`` result value and on failure
log it all by iterating over **all** results (very important usability fix).
- try to make the implementation itself more helpful in general,
e.g. by providing additional error messages at affected places.
Andreas Mohr <andi at lisas period de>


@@ -0,0 +1,383 @@
Using the initial RAM disk (initrd)
===================================
Written 1996,2000 by Werner Almesberger <werner.almesberger@epfl.ch> and
Hans Lermen <lermen@fgan.de>
initrd provides the capability to load a RAM disk by the boot loader.
This RAM disk can then be mounted as the root file system and programs
can be run from it. Afterwards, a new root file system can be mounted
from a different device. The previous root (from initrd) is then moved
to a directory and can be subsequently unmounted.
initrd is mainly designed to allow system startup to occur in two phases,
where the kernel comes up with a minimum set of compiled-in drivers, and
where additional modules are loaded from initrd.
This document gives a brief overview of the use of initrd. A more detailed
discussion of the boot process can be found in [#f1]_.
Operation
---------
When using initrd, the system typically boots as follows:
1) the boot loader loads the kernel and the initial RAM disk
2) the kernel converts initrd into a "normal" RAM disk and
frees the memory used by initrd
3) if the root device is not ``/dev/ram0``, the old (deprecated)
change_root procedure is followed. See the "Obsolete root change
mechanism" section below.
4) root device is mounted. if it is ``/dev/ram0``, the initrd image is
then mounted as root
5) /sbin/init is executed (this can be any valid executable, including
shell scripts; it is run with uid 0 and can do basically everything
init can do).
6) init mounts the "real" root file system
7) init places the root file system at the root directory using the
pivot_root system call
8) init execs the ``/sbin/init`` on the new root filesystem, performing
the usual boot sequence
9) the initrd file system is removed
Note that changing the root directory does not involve unmounting it.
It is therefore possible to leave processes running on initrd during that
procedure. Also note that file systems mounted under initrd continue to
be accessible.
Boot command-line options
-------------------------
initrd adds the following new options::
initrd=<path> (e.g. LOADLIN)
Loads the specified file as the initial RAM disk. When using LILO, you
have to specify the RAM disk image file in /etc/lilo.conf, using the
INITRD configuration variable.
noinitrd
initrd data is preserved but it is not converted to a RAM disk and
the "normal" root file system is mounted. initrd data can be read
from /dev/initrd. Note that the data in initrd can have any structure
in this case and doesn't necessarily have to be a file system image.
This option is used mainly for debugging.
Note: /dev/initrd is read-only and it can only be used once. As soon
as the last process has closed it, all data is freed and /dev/initrd
can't be opened anymore.
root=/dev/ram0
initrd is mounted as root, and the normal boot procedure is followed,
with the RAM disk mounted as root.
Compressed cpio images
----------------------
Recent kernels have support for populating a ramdisk from a compressed cpio
archive. On such systems, the creation of a ramdisk image doesn't need to
involve special block devices or loopbacks; you merely create a directory on
disk with the desired initrd content, cd to that directory, and run (as an
example)::
find . | cpio --quiet -H newc -o | gzip -9 -n > /boot/imagefile.img
Examining the contents of an existing image file is just as simple::
mkdir /tmp/imagefile
cd /tmp/imagefile
gzip -cd /boot/imagefile.img | cpio -imd --quiet
Installation
------------
First, a directory for the initrd file system has to be created on the
"normal" root file system, e.g.::
# mkdir /initrd
The name is not relevant. More details can be found on the
:manpage:`pivot_root(2)` man page.
If the root file system is created during the boot procedure (i.e. if
you're building an install floppy), the root file system creation
procedure should create the ``/initrd`` directory.
If initrd will not be mounted in some cases, its content is still
accessible if the following device has been created::
# mknod /dev/initrd b 1 250
# chmod 400 /dev/initrd
Second, the kernel has to be compiled with RAM disk support and with
support for the initial RAM disk enabled. Also, at least all components
needed to execute programs from initrd (e.g. executable format and file
system) must be compiled into the kernel.
Third, you have to create the RAM disk image. This is done by creating a
file system on a block device, copying files to it as needed, and then
copying the content of the block device to the initrd file. With recent
kernels, at least three types of devices are suitable for that:
- a floppy disk (works everywhere but it's painfully slow)
- a RAM disk (fast, but allocates physical memory)
- a loopback device (the most elegant solution)
We'll describe the loopback device method:
1) make sure loopback block devices are configured into the kernel
2) create an empty file system of the appropriate size, e.g.::
# dd if=/dev/zero of=initrd bs=300k count=1
# mke2fs -F -m0 initrd
(if space is critical, you may want to use the Minix FS instead of Ext2)
3) mount the file system, e.g.::
# mount -t ext2 -o loop initrd /mnt
4) create the console device::
# mkdir /mnt/dev
# mknod /mnt/dev/console c 5 1
5) copy all the files that are needed to properly use the initrd
environment. Don't forget the most important file, ``/sbin/init``
.. note:: ``/sbin/init`` permissions must include "x" (execute).
6) correct operation of the initrd environment can frequently be tested
even without rebooting with the command::
# chroot /mnt /sbin/init
This is of course limited to initrds that do not interfere with the
general system state (e.g. by reconfiguring network interfaces,
overwriting mounted devices, trying to start already running daemons,
etc. Note however that it is usually possible to use pivot_root in
such a chroot'ed initrd environment.)
7) unmount the file system::
# umount /mnt
8) the initrd is now in the file "initrd". Optionally, it can now be
compressed::
# gzip -9 initrd
For experimenting with initrd, you may want to take a rescue floppy and
only add a symbolic link from ``/sbin/init`` to ``/bin/sh``. Alternatively, you
can try the experimental newlib environment [#f2]_ to create a small
initrd.
Finally, you have to boot the kernel and load initrd. Almost all Linux
boot loaders support initrd. Since the boot process is still compatible
with an older mechanism, the following boot command line parameters
have to be given::
root=/dev/ram0 rw
(rw is only necessary if writing to the initrd file system.)
With LOADLIN, you simply execute::
LOADLIN <kernel> initrd=<disk_image>
e.g.::
LOADLIN C:\LINUX\BZIMAGE initrd=C:\LINUX\INITRD.GZ root=/dev/ram0 rw
With LILO, you add the option ``INITRD=<path>`` to either the global section
or to the section of the respective kernel in ``/etc/lilo.conf``, and pass
the options using APPEND, e.g.::
image = /bzImage
initrd = /boot/initrd.gz
append = "root=/dev/ram0 rw"
and run ``/sbin/lilo``
For other boot loaders, please refer to the respective documentation.
Now you can boot and enjoy using initrd.
Changing the root device
------------------------
When finished with its duties, init typically changes the root device
and proceeds with starting the Linux system on the "real" root device.
The procedure involves the following steps:
- mounting the new root file system
- turning it into the root file system
- removing all accesses to the old (initrd) root file system
- unmounting the initrd file system and de-allocating the RAM disk
Mounting the new root file system is easy: it just needs to be mounted on
a directory under the current root. Example::
# mkdir /new-root
# mount -o ro /dev/hda1 /new-root
The root change is accomplished with the pivot_root system call, which
is also available via the ``pivot_root`` utility (see :manpage:`pivot_root(8)`
man page; ``pivot_root`` is distributed with util-linux version 2.10h or higher
[#f3]_). ``pivot_root`` moves the current root to a directory under the new
root, and puts the new root at its place. The directory for the old root
must exist before calling ``pivot_root``. Example::
# cd /new-root
# mkdir initrd
# pivot_root . initrd
Now, the init process may still access the old root via its
executable, shared libraries, standard input/output/error, and its
current root directory. All these references are dropped by the
following command::
# exec chroot . what-follows <dev/console >dev/console 2>&1
Where ``what-follows`` is a program under the new root, e.g. ``/sbin/init``.
If the new root file system will be used with udev and has no valid
``/dev`` directory, udev must be initialized before invoking chroot in order
to provide ``/dev/console``.
Note: implementation details of pivot_root may change with time. In order
to ensure compatibility, the following points should be observed:
- before calling pivot_root, the current directory of the invoking
process should point to the new root directory
- use . as the first argument, and the _relative_ path of the directory
for the old root as the second argument
- a chroot program must be available under the old and the new root
- chroot to the new root afterwards
- use relative paths for dev/console in the exec command
Now, the initrd can be unmounted and the memory allocated by the RAM
disk can be freed::
# umount /initrd
# blockdev --flushbufs /dev/ram0
It is also possible to use initrd with an NFS-mounted root, see the
:manpage:`pivot_root(8)` man page for details.
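Putting the steps above together, a minimal ``/sbin/init`` on the initrd might
look roughly as follows. This is only a sketch: ``/dev/hda1`` and the
read-only mount are taken from the examples above, error handling is omitted,
and a real init would first load any modules needed to access the new root::

  #! /bin/sh
  # mount the "real" root file system under the current root
  mkdir /new-root
  mount -o ro /dev/hda1 /new-root
  # make it the root file system; the old (initrd) root moves to /new-root/initrd
  cd /new-root
  mkdir initrd
  pivot_root . initrd
  # drop the remaining references to the old root and start the real init
  exec chroot . /sbin/init <dev/console >dev/console 2>&1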
Usage scenarios
---------------
The main motivation for implementing initrd was to allow for modular
kernel configuration at system installation. The procedure would work
as follows:
1) system boots from floppy or other media with a minimal kernel
(e.g. support for RAM disks, initrd, a.out, and the Ext2 FS) and
loads initrd
2) ``/sbin/init`` determines what is needed to (1) mount the "real" root FS
(i.e. device type, device drivers, file system) and (2) the
distribution media (e.g. CD-ROM, network, tape, ...). This can be
done by asking the user, by auto-probing, or by using a hybrid
approach.
3) ``/sbin/init`` loads the necessary kernel modules
4) ``/sbin/init`` creates and populates the root file system (this doesn't
have to be a very usable system yet)
5) ``/sbin/init`` invokes ``pivot_root`` to change the root file system and
execs - via chroot - a program that continues the installation
6) the boot loader is installed
7) the boot loader is configured to load an initrd with the set of
modules that was used to bring up the system (e.g. ``/initrd`` can be
modified, then unmounted, and finally, the image is written from
``/dev/ram0`` or ``/dev/rd/0`` to a file)
8) now the system is bootable and additional installation tasks can be
performed
The key role of initrd here is to re-use the configuration data during
normal system operation without requiring the use of a bloated "generic"
kernel or re-compiling or re-linking the kernel.
A second scenario is for installations where Linux runs on systems with
different hardware configurations in a single administrative domain. In
such cases, it is desirable to generate only a small set of kernels
(ideally only one) and to keep the system-specific part of configuration
information as small as possible. In this case, a common initrd could be
generated with all the necessary modules. Then, only ``/sbin/init`` or a file
read by it would have to be different.
A third scenario is more convenient recovery disks, because information
like the location of the root FS partition doesn't have to be provided at
boot time, but the system loaded from initrd can invoke a user-friendly
dialog and it can also perform some sanity checks (or even some form of
auto-detection).
Last but not least, CD-ROM distributors may use it for better installation
from CD, e.g. by using a boot floppy and bootstrapping a bigger RAM disk
via initrd from CD; or by booting via a loader like ``LOADLIN`` or directly
from the CD-ROM, and loading the RAM disk from CD without need of
floppies.
Obsolete root change mechanism
------------------------------
The following mechanism was used before the introduction of pivot_root.
Current kernels still support it, but you should _not_ rely on its
continued availability.
It works by mounting the "real" root device (i.e. the one set with rdev
in the kernel image or with root=... at the boot command line) as the
root file system when linuxrc exits. The initrd file system is then
unmounted, or, if it is still busy, moved to a directory ``/initrd``, if
such a directory exists on the new root file system.
In order to use this mechanism, you do not have to specify the boot
command options root, init, or rw. (If specified, they will affect
the real root file system, not the initrd environment.)
If /proc is mounted, the "real" root device can be changed from within
linuxrc by writing the number of the new root FS device to the special
file /proc/sys/kernel/real-root-dev, e.g.::
# echo 0x301 >/proc/sys/kernel/real-root-dev
Note that the mechanism is incompatible with NFS and similar file
systems.
This old, deprecated mechanism is commonly called ``change_root``, while
the new, supported mechanism is called ``pivot_root``.
Mixed change_root and pivot_root mechanism
------------------------------------------
In case you did not want to use ``root=/dev/ram0`` to trigger the pivot_root
mechanism, you may create both ``/linuxrc`` and ``/sbin/init`` in your initrd
image.
``/linuxrc`` would contain only the following::
#! /bin/sh
mount -n -t proc proc /proc
echo 0x0100 >/proc/sys/kernel/real-root-dev
umount -n /proc
Once linuxrc has exited, the kernel would mount your initrd as root again,
this time executing ``/sbin/init``. Again, it would be the duty of this init
to build the right environment (maybe using the ``root=`` device passed on
the cmdline) before the final execution of the real ``/sbin/init``.
Resources
---------
.. [#f1] Almesberger, Werner; "Booting Linux: The History and the Future"
http://www.almesberger.net/cv/papers/ols2k-9.ps.gz
.. [#f2] newlib package (experimental), with initrd example
https://www.sourceware.org/newlib/
.. [#f3] util-linux: Miscellaneous utilities for Linux
https://www.kernel.org/pub/linux/utils/util-linux/


@@ -0,0 +1,423 @@
Java(tm) Binary Kernel Support for Linux v1.03
----------------------------------------------
Linux beats them ALL! While all other OS's are TALKING about direct
support of Java Binaries in the OS, Linux is doing it!
You can execute Java applications and Java Applets just like any
other program after you have done the following:
1) You MUST FIRST install the Java Developers Kit for Linux.
The Java on Linux HOWTO gives the details on getting and
installing this. This HOWTO can be found at:
ftp://sunsite.unc.edu/pub/Linux/docs/HOWTO/Java-HOWTO
You should also set up a reasonable CLASSPATH environment
variable to use Java applications that make use of any
nonstandard classes (not included in the same directory
as the application itself).
2) You have to compile BINFMT_MISC either as a module or into
the kernel (``CONFIG_BINFMT_MISC``) and set it up properly.
If you choose to compile it as a module, you will have
to insert it manually with modprobe/insmod, as kmod
cannot easily be supported with binfmt_misc.
Read the file 'binfmt_misc.txt' in this directory to know
more about the configuration process.
3) Add the following configuration items to binfmt_misc
(you should really have read ``binfmt_misc.txt`` now):
support for Java applications::
':Java:M::\xca\xfe\xba\xbe::/usr/local/bin/javawrapper:'
support for executable Jar files::
':ExecutableJAR:E::jar::/usr/local/bin/jarwrapper:'
support for Java Applets::
':Applet:E::html::/usr/bin/appletviewer:'
or the following, if you want to be more selective::
':Applet:M::<!--applet::/usr/bin/appletviewer:'
Of course you have to fix the path names. The path/file names given in this
document match the Debian 2.1 system. (i.e. jdk installed in ``/usr``,
custom wrappers from this document in ``/usr/local``)
Note that for the more selective applet support you have to modify
existing html-files to contain ``<!--applet-->`` in the first line
(``<`` has to be the first character!) to let this work!
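As a quick illustration only (the full procedure is described in
``binfmt_misc.txt``; the mount point below is the usual one but is shown here
as an assumption), the entries above can be registered at run time by writing
them to the ``register`` file::

  # mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc   # if not already mounted
  # echo ':Java:M::\xca\xfe\xba\xbe::/usr/local/bin/javawrapper:' > /proc/sys/fs/binfmt_misc/register
  # echo ':ExecutableJAR:E::jar::/usr/local/bin/jarwrapper:' > /proc/sys/fs/binfmt_misc/register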
For the compiled Java programs you need a wrapper script like the
following (this is because Java is broken in its handling of file
names); again, fix the path names, both in the script and in the
configuration string given above.
You also need the little program that follows the script. Compile it like::
gcc -O2 -o javaclassname javaclassname.c
and put it in ``/usr/local/bin``.
Both the javawrapper shellscript and the javaclassname program
were supplied by Colin J. Watson <cjw44@cam.ac.uk>.
Javawrapper shell script:
.. code-block:: sh
#!/bin/bash
# /usr/local/bin/javawrapper - the wrapper for binfmt_misc/java
if [ -z "$1" ]; then
exec 1>&2
echo Usage: $0 class-file
exit 1
fi
CLASS=$1
FQCLASS=`/usr/local/bin/javaclassname $1`
FQCLASSN=`echo $FQCLASS | sed -e 's/^.*\.\([^.]*\)$/\1/'`
FQCLASSP=`echo $FQCLASS | sed -e 's-\.-/-g' -e 's-^[^/]*$--' -e 's-/[^/]*$--'`
# for example:
# CLASS=Test.class
# FQCLASS=foo.bar.Test
# FQCLASSN=Test
# FQCLASSP=foo/bar
unset CLASSBASE
declare -i LINKLEVEL=0
while :; do
if [ "`basename $CLASS .class`" == "$FQCLASSN" ]; then
# See if this directory works straight off
cd -L `dirname $CLASS`
CLASSDIR=$PWD
cd $OLDPWD
if echo $CLASSDIR | grep -q "$FQCLASSP$"; then
CLASSBASE=`echo $CLASSDIR | sed -e "s.$FQCLASSP$.."`
break;
fi
# Try dereferencing the directory name
cd -P `dirname $CLASS`
CLASSDIR=$PWD
cd $OLDPWD
if echo $CLASSDIR | grep -q "$FQCLASSP$"; then
CLASSBASE=`echo $CLASSDIR | sed -e "s.$FQCLASSP$.."`
break;
fi
# If no other possible filename exists
if [ ! -L $CLASS ]; then
exec 1>&2
echo $0:
echo " $CLASS should be in a" \
"directory tree called $FQCLASSP"
exit 1
fi
fi
if [ ! -L $CLASS ]; then break; fi
# Go down one more level of symbolic links
let LINKLEVEL+=1
if [ $LINKLEVEL -gt 5 ]; then
exec 1>&2
echo $0:
echo " Too many symbolic links encountered"
exit 1
fi
CLASS=`ls --color=no -l $CLASS | sed -e 's/^.* \([^ ]*\)$/\1/'`
done
if [ -z "$CLASSBASE" ]; then
if [ -z "$FQCLASSP" ]; then
GOODNAME=$FQCLASSN.class
else
GOODNAME=$FQCLASSP/$FQCLASSN.class
fi
exec 1>&2
echo $0:
echo " $FQCLASS should be in a file called $GOODNAME"
exit 1
fi
if ! echo $CLASSPATH | grep -q "^\(.*:\)*$CLASSBASE\(:.*\)*"; then
# class is not in CLASSPATH, so prepend dir of class to CLASSPATH
if [ -z "${CLASSPATH}" ] ; then
export CLASSPATH=$CLASSBASE
else
export CLASSPATH=$CLASSBASE:$CLASSPATH
fi
fi
shift
/usr/bin/java $FQCLASS "$@"
javaclassname.c:
.. code-block:: c
/* javaclassname.c
*
* Extracts the class name from a Java class file; intended for use in a Java
* wrapper of the type supported by the binfmt_misc option in the Linux kernel.
*
* Copyright (C) 1999 Colin J. Watson <cjw44@cam.ac.uk>.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <sys/types.h>
/* From Sun's Java VM Specification, as tag entries in the constant pool. */
#define CP_UTF8 1
#define CP_INTEGER 3
#define CP_FLOAT 4
#define CP_LONG 5
#define CP_DOUBLE 6
#define CP_CLASS 7
#define CP_STRING 8
#define CP_FIELDREF 9
#define CP_METHODREF 10
#define CP_INTERFACEMETHODREF 11
#define CP_NAMEANDTYPE 12
#define CP_METHODHANDLE 15
#define CP_METHODTYPE 16
#define CP_INVOKEDYNAMIC 18
/* Define some commonly used error messages */
#define seek_error() error("%s: Cannot seek\n", program)
#define corrupt_error() error("%s: Class file corrupt\n", program)
#define eof_error() error("%s: Unexpected end of file\n", program)
#define utf8_error() error("%s: Only ASCII 1-255 supported\n", program);
char *program;
long *pool;
u_int8_t read_8(FILE *classfile);
u_int16_t read_16(FILE *classfile);
void skip_constant(FILE *classfile, u_int16_t *cur);
void error(const char *format, ...);
int main(int argc, char **argv);
/* Reads in an unsigned 8-bit integer. */
u_int8_t read_8(FILE *classfile)
{
int b = fgetc(classfile);
if(b == EOF)
eof_error();
return (u_int8_t)b;
}
/* Reads in an unsigned 16-bit integer. */
u_int16_t read_16(FILE *classfile)
{
int b1, b2;
b1 = fgetc(classfile);
if(b1 == EOF)
eof_error();
b2 = fgetc(classfile);
if(b2 == EOF)
eof_error();
return (u_int16_t)((b1 << 8) | b2);
}
/* Reads in a value from the constant pool. */
void skip_constant(FILE *classfile, u_int16_t *cur)
{
u_int16_t len;
int seekerr = 1;
pool[*cur] = ftell(classfile);
switch(read_8(classfile))
{
case CP_UTF8:
len = read_16(classfile);
seekerr = fseek(classfile, len, SEEK_CUR);
break;
case CP_CLASS:
case CP_STRING:
case CP_METHODTYPE:
seekerr = fseek(classfile, 2, SEEK_CUR);
break;
case CP_METHODHANDLE:
seekerr = fseek(classfile, 3, SEEK_CUR);
break;
case CP_INTEGER:
case CP_FLOAT:
case CP_FIELDREF:
case CP_METHODREF:
case CP_INTERFACEMETHODREF:
case CP_NAMEANDTYPE:
case CP_INVOKEDYNAMIC:
seekerr = fseek(classfile, 4, SEEK_CUR);
break;
case CP_LONG:
case CP_DOUBLE:
seekerr = fseek(classfile, 8, SEEK_CUR);
++(*cur);
break;
default:
corrupt_error();
}
if(seekerr)
seek_error();
}
void error(const char *format, ...)
{
va_list ap;
va_start(ap, format);
vfprintf(stderr, format, ap);
va_end(ap);
exit(1);
}
int main(int argc, char **argv)
{
FILE *classfile;
u_int16_t cp_count, i, this_class, classinfo_ptr;
u_int8_t length;
program = argv[0];
if(!argv[1])
error("%s: Missing input file\n", program);
classfile = fopen(argv[1], "rb");
if(!classfile)
error("%s: Error opening %s\n", program, argv[1]);
if(fseek(classfile, 8, SEEK_SET)) /* skip magic and version numbers */
seek_error();
cp_count = read_16(classfile);
pool = calloc(cp_count, sizeof(long));
if(!pool)
error("%s: Out of memory for constant pool\n", program);
for(i = 1; i < cp_count; ++i)
skip_constant(classfile, &i);
if(fseek(classfile, 2, SEEK_CUR)) /* skip access flags */
seek_error();
this_class = read_16(classfile);
if(this_class < 1 || this_class >= cp_count)
corrupt_error();
if(!pool[this_class] || pool[this_class] == -1)
corrupt_error();
if(fseek(classfile, pool[this_class] + 1, SEEK_SET))
seek_error();
classinfo_ptr = read_16(classfile);
if(classinfo_ptr < 1 || classinfo_ptr >= cp_count)
corrupt_error();
if(!pool[classinfo_ptr] || pool[classinfo_ptr] == -1)
corrupt_error();
if(fseek(classfile, pool[classinfo_ptr] + 1, SEEK_SET))
seek_error();
length = read_16(classfile);
for(i = 0; i < length; ++i)
{
u_int8_t x = read_8(classfile);
if((x & 0x80) || !x)
{
if((x & 0xE0) == 0xC0)
{
u_int8_t y = read_8(classfile);
if((y & 0xC0) == 0x80)
{
int c = ((x & 0x1f) << 6) + (y & 0x3f);
if(c) putchar(c);
else utf8_error();
}
else utf8_error();
}
else utf8_error();
}
else if(x == '/') putchar('.');
else putchar(x);
}
putchar('\n');
free(pool);
fclose(classfile);
return 0;
}
jarwrapper::
#!/bin/bash
# /usr/local/java/bin/jarwrapper - the wrapper for binfmt_misc/jar
java -jar $1
Now simply ``chmod +x`` the ``.class``, ``.jar`` and/or ``.html`` files you
want to execute.
To add a Java program to your path, it is best to put a symbolic link to the
main ``.class`` file into ``/usr/bin`` (or another place you like), omitting
the ``.class`` extension. The directory containing the original ``.class``
file will be added to your CLASSPATH during execution.
To test your new setup, enter the following simple Java app, and name
it "HelloWorld.java":
.. code-block:: java
class HelloWorld {
public static void main(String args[]) {
System.out.println("Hello World!");
}
}
Now compile the application with::
javac HelloWorld.java
Set the executable permissions of the binary file, with::
chmod 755 HelloWorld.class
And then execute it::
./HelloWorld.class
To execute Java Jar files, simply chmod the ``*.jar`` files to include
the execution bit, then just do::
./Application.jar
To execute Java Applets, simply chmod the ``*.html`` files to include
the execution bit, then just do::
./Applet.html
originally by Brian A. Lantz, brian@lantz.com
heavily edited for binfmt_misc by Richard Günther
new scripts by Colin J. Watson <cjw44@cam.ac.uk>
added executable Jar file support by Kurt Huwig <kurt@iku-netz.de>


@@ -0,0 +1,209 @@
The kernel's command-line parameters
====================================
The following is a consolidated list of the kernel parameters as
implemented by the __setup(), core_param() and module_param() macros
and sorted into English Dictionary order (defined as ignoring all
punctuation and sorting digits before letters in a case insensitive
manner), and with descriptions where known.
The kernel parses parameters from the kernel command line up to "--";
if it doesn't recognize a parameter and it doesn't contain a '.', the
parameter gets passed to init: parameters with '=' go into init's
environment, others are passed as command line arguments to init.
Everything after "--" is passed as an argument to init.
Module parameters can be specified in two ways: via the kernel command
line with a module name prefix, or via modprobe, e.g.::
(kernel command line) usbcore.blinkenlights=1
(modprobe command line) modprobe usbcore blinkenlights=1
Parameters for modules which are built into the kernel need to be
specified on the kernel command line. modprobe looks through the
kernel command line (/proc/cmdline) and collects module parameters
when it loads a module, so the kernel command line can be used for
loadable modules too.
Hyphens (dashes) and underscores are equivalent in parameter names, so::
log_buf_len=1M print-fatal-signals=1
can also be entered as::
log-buf-len=1M print_fatal_signals=1
Double-quotes can be used to protect spaces in values, e.g.::
param="spaces in here"
cpu lists:
----------
Some kernel parameters take a list of CPUs as a value, e.g. isolcpus,
nohz_full, irqaffinity, rcu_nocbs. The format of this list is:
<cpu number>,...,<cpu number>
or
<cpu number>-<cpu number>
(must be a positive range in ascending order)
or a mixture
<cpu number>,...,<cpu number>-<cpu number>
Note that for the special case of a range one can split the range into equal
sized groups and for each group use some amount from the beginning of that
group:
<cpu number>-<cpu number>:<used size>/<group size>
For example one can add the following parameter to the command line:
isolcpus=1,2,10-20,100-2000:2/25
where the final item splits the range 100-2000 into groups of 25 CPUs and
uses the first 2 of each group, i.e. CPUs 100,101,125,126,150,151,...
This document may not be entirely up to date and comprehensive. The command
"modinfo -p ${modulename}" shows a current list of all parameters of a loadable
module. Loadable modules, after being loaded into the running kernel, also
reveal their parameters in /sys/module/${modulename}/parameters/. Some of these
parameters may be changed at runtime by the command
``echo -n ${value} > /sys/module/${modulename}/parameters/${parm}``.
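For instance (``usbcore`` and its ``autosuspend`` parameter are used here
purely as an illustration; substitute any module and writable parameter
present on your system)::

  modinfo -p usbcore
  cat /sys/module/usbcore/parameters/autosuspend
  echo -n 5 > /sys/module/usbcore/parameters/autosuspend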
The parameters listed below are only valid if certain kernel build options were
enabled and if respective hardware is present. The text in square brackets at
the beginning of each description states the restrictions within which a
parameter is applicable::
ACPI ACPI support is enabled.
AGP AGP (Accelerated Graphics Port) is enabled.
ALSA ALSA sound support is enabled.
APIC APIC support is enabled.
APM Advanced Power Management support is enabled.
ARM ARM architecture is enabled.
AVR32 AVR32 architecture is enabled.
AX25 Appropriate AX.25 support is enabled.
BLACKFIN Blackfin architecture is enabled.
CLK Common clock infrastructure is enabled.
CMA Contiguous Memory Area support is enabled.
DRM Direct Rendering Management support is enabled.
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
EIDE EIDE/ATAPI support is enabled.
EVM Extended Verification Module
FB The frame buffer device is enabled.
FTRACE Function tracing enabled.
GCOV GCOV profiling is enabled.
HW Appropriate hardware is enabled.
IA-64 IA-64 architecture is enabled.
IMA Integrity measurement architecture is enabled.
IOSCHED More than one I/O scheduler is enabled.
IP_PNP IP DHCP, BOOTP, or RARP is enabled.
IPV6 IPv6 support is enabled.
ISAPNP ISA PnP code is enabled.
ISDN Appropriate ISDN support is enabled.
JOY Appropriate joystick support is enabled.
KGDB Kernel debugger support is enabled.
KVM Kernel Virtual Machine support is enabled.
LIBATA Libata driver is enabled
LP Printer support is enabled.
LOOP Loopback device support is enabled.
M68k M68k architecture is enabled.
These options have more detailed description inside of
Documentation/m68k/kernel-options.txt.
MDA MDA console support is enabled.
MIPS MIPS architecture is enabled.
MOUSE Appropriate mouse support is enabled.
MSI Message Signaled Interrupts (PCI).
MTD MTD (Memory Technology Device) support is enabled.
NET Appropriate network support is enabled.
NUMA NUMA support is enabled.
NFS Appropriate NFS support is enabled.
OSS OSS sound support is enabled.
PV_OPS A paravirtualized kernel is enabled.
PARIDE The ParIDE (parallel port IDE) subsystem is enabled.
PARISC The PA-RISC architecture is enabled.
PCI PCI bus support is enabled.
PCIE PCI Express support is enabled.
PCMCIA The PCMCIA subsystem is enabled.
PNP Plug & Play support is enabled.
PPC PowerPC architecture is enabled.
PPT Parallel port support is enabled.
PS2 Appropriate PS/2 support is enabled.
RAM RAM disk support is enabled.
S390 S390 architecture is enabled.
SCSI Appropriate SCSI support is enabled.
A lot of drivers have their options described inside
the Documentation/scsi/ sub-directory.
SECURITY Different security models are enabled.
SELINUX SELinux support is enabled.
APPARMOR AppArmor support is enabled.
SERIAL Serial support is enabled.
SH SuperH architecture is enabled.
SMP The kernel is an SMP kernel.
SPARC Sparc architecture is enabled.
SWSUSP Software suspend (hibernation) is enabled.
SUSPEND System suspend states are enabled.
TPM TPM drivers are enabled.
TS Appropriate touchscreen support is enabled.
UMS USB Mass Storage support is enabled.
USB USB support is enabled.
USBHID USB Human Interface Device support is enabled.
V4L Video For Linux support is enabled.
VMMIO Driver for memory mapped virtio devices is enabled.
VGA The VGA console has been enabled.
VT Virtual terminal support is enabled.
WDT Watchdog support is enabled.
XT IBM PC/XT MFM hard disk support is enabled.
X86-32 X86-32, aka i386 architecture is enabled.
X86-64 X86-64 architecture is enabled.
More X86-64 boot options can be found in
Documentation/x86/x86_64/boot-options.txt .
X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
X86_UV SGI UV support is enabled.
XEN Xen support is enabled
In addition, the following text indicates that the option::
BUGS= Relates to possible processor bugs on the said processor.
KNL Is a kernel start-up parameter.
BOOT Is a boot loader parameter.
Parameters denoted with BOOT are actually interpreted by the boot
loader, and have no meaning to the kernel directly.
Do not modify the syntax of boot loader parameters without extreme
need or coordination with <Documentation/x86/boot.txt>.
There are also arch-specific kernel-parameters not documented here.
See for example <Documentation/x86/x86_64/boot-options.txt>.
Note that ALL kernel parameters listed below are CASE SENSITIVE, and that
a trailing = on the name of any parameter states that that parameter will
be entered as an environment variable, whereas its absence indicates that
it will appear as a kernel argument readable via /proc/cmdline by programs
running once the system is up.
The number of kernel parameters is not limited, but the length of the
complete command line (parameters including spaces etc.) is limited to
a fixed number of characters. This limit depends on the architecture
and is between 256 and 4096 characters. It is defined in the file
./include/asm/setup.h as COMMAND_LINE_SIZE.
Finally, the [KMG] suffix is commonly described after a number of kernel
parameter values. These 'K', 'M', and 'G' letters represent the _binary_
multipliers 'Kilo', 'Mega', and 'Giga', equalling 2^10, 2^20, and 2^30
bytes respectively. Such letter suffixes can also be entirely omitted:
.. include:: kernel-parameters.txt
:literal:
Todo
----
Add more DRM drivers.

File diff suppressed because it is too large


@@ -0,0 +1,727 @@
RAID arrays
===========
Boot time assembly of RAID arrays
---------------------------------
Tools that manage md devices can be found at
http://www.kernel.org/pub/linux/utils/raid/
You can boot with your md device with the following kernel command
lines:
for old raid arrays without persistent superblocks::
md=<md device no.>,<raid level>,<chunk size factor>,<fault level>,dev0,dev1,...,devn
for raid arrays with persistent superblocks::
md=<md device no.>,dev0,dev1,...,devn
or, to assemble a partitionable array::
md=d<md device no.>,dev0,dev1,...,devn
``md device no.``
+++++++++++++++++
The number of the md device
================= =========
``md device no.`` device
================= =========
0 md0
1 md1
2 md2
3 md3
4 md4
================= =========
``raid level``
++++++++++++++
level of the RAID array
=============== =============
``raid level`` level
=============== =============
-1 linear mode
0 striped mode
=============== =============
other modes are only supported with persistent superblocks
``chunk size factor``
+++++++++++++++++++++
(raid-0 and raid-1 only)
Set the chunk size as 4k << n.
``fault level``
+++++++++++++++
Totally ignored
``dev0`` to ``devn``
++++++++++++++++++++
e.g. ``/dev/hda1``, ``/dev/hdc1``, ``/dev/sda1``, ``/dev/sdb1``
A possible loadlin line (Harald Hoyer <HarryH@Royal.Net>) looks like this::
e:\loadlin\loadlin e:\zimage root=/dev/md0 md=0,0,4,0,/dev/hdb2,/dev/hdc3 ro
Boot time autodetection of RAID arrays
--------------------------------------
When md is compiled into the kernel (not as module), partitions of
type 0xfd are scanned and automatically assembled into RAID arrays.
This autodetection may be suppressed with the kernel parameter
``raid=noautodetect``. As of kernel 2.6.9, only drives with a type 0
superblock can be autodetected and run at boot time.
The kernel parameter ``raid=partitionable`` (or ``raid=part``) means
that all auto-detected arrays are assembled as partitionable.
Boot time assembly of degraded/dirty arrays
-------------------------------------------
If a raid5 or raid6 array is both dirty and degraded, it could have
undetectable data corruption. This is because being ``dirty`` means
that the parity cannot be trusted, and being degraded means that some
data blocks are missing and cannot reliably be reconstructed (since the
parity needed to reconstruct them cannot be trusted).
For this reason, md will normally refuse to start such an array. This
requires the sysadmin to take action to explicitly start the array
despite possible corruption. This is normally done with::
mdadm --assemble --force ....
This option is not really available if the array has the root
filesystem on it. In order to support booting from such an
array, md supports a module parameter ``start_dirty_degraded`` which,
when set to 1, bypasses the checks and allows dirty degraded
arrays to be started.
So, to boot with a root filesystem of a dirty degraded raid 5 or 6, use::
md-mod.start_dirty_degraded=1
Superblock formats
------------------
The md driver can support a variety of different superblock formats.
Currently, it supports superblock formats ``0.90.0`` and the ``md-1`` format
introduced in the 2.5 development series.
The kernel will autodetect which format superblock is being used.
Superblock format ``0`` is treated differently to others for legacy
reasons - it is the original superblock format.
General Rules - apply for all superblock formats
------------------------------------------------
An array is ``created`` by writing appropriate superblocks to all
devices.
It is ``assembled`` by associating each of these devices with a
particular md virtual device. Once it is completely assembled, it can
be accessed.
An array should be created by a user-space tool. This will write
superblocks to all devices. It will usually mark the array as
``unclean``, or with some devices missing so that the kernel md driver
can create appropriate redundancy (copying in raid 1, parity
calculation in raid 4/5).
When an array is assembled, it is first initialized with the
SET_ARRAY_INFO ioctl. This contains, in particular, a major and minor
version number. The major version number selects which superblock
format is to be used. The minor number might be used to tune handling
of the format, such as suggesting where on each device to look for the
superblock.
Then each device is added using the ADD_NEW_DISK ioctl. This
provides, in particular, a major and minor number identifying the
device to add.
The array is started with the RUN_ARRAY ioctl.
Once started, new devices can be added. They should have an
appropriate superblock written to them, and then be passed in with
ADD_NEW_DISK.
Devices that have failed or are not yet active can be detached from an
array using HOT_REMOVE_DISK.
Specific Rules that apply to format-0 super block arrays, and arrays with no superblock (non-persistent)
--------------------------------------------------------------------------------------------------------
An array can be ``created`` by describing the array (level, chunksize
etc) in a SET_ARRAY_INFO ioctl. This must have ``major_version==0`` and
``raid_disks != 0``.
Then uninitialized devices can be added with ADD_NEW_DISK. The
structure passed to ADD_NEW_DISK must specify the state of the device
and its role in the array.
Once started with RUN_ARRAY, uninitialized spares can be added with
HOT_ADD_DISK.
MD devices in sysfs
-------------------
md devices appear in sysfs (``/sys``) as regular block devices,
e.g.::
/sys/block/md0
Each ``md`` device will contain a subdirectory called ``md`` which
contains further md-specific information about the device.
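For example, assuming an assembled array ``md0`` exists, these attributes can
be inspected (and, where permitted, changed) with ordinary file operations::

  # list the md-specific attributes of the array
  ls /sys/block/md0/md/
  # read one of them, e.g. the raid level
  cat /sys/block/md0/md/level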
All md devices contain:
level
a text file indicating the ``raid level``. e.g. raid0, raid1,
raid5, linear, multipath, faulty.
If no raid level has been set yet (array is still being
assembled), the value will reflect whatever has been written
to it, which may be a name like the above, or may be a number
such as ``0``, ``5``, etc.
raid_disks
a text file with a simple number indicating the number of devices
in a fully functional array. If this is not yet known, the file
will be empty. If an array is being resized this will contain
the new number of devices.
Some raid levels allow this value to be set while the array is
active. This will reconfigure the array. Otherwise it can only
be set while assembling an array.
A change to this attribute will not be permitted if it would
reduce the size of the array. To reduce the number of drives
in an e.g. raid5, the array size must first be reduced by
setting the ``array_size`` attribute.
chunk_size
This is the size in bytes for ``chunks`` and is only relevant to
raid levels that involve striping (0,4,5,6,10). The address space
of the array is conceptually divided into chunks and consecutive
chunks are striped onto neighbouring devices.
The size should be at least PAGE_SIZE (4k) and should be a power
of 2. This can only be set while assembling an array.
layout
The ``layout`` for the array for the particular level. This is
simply a number that is interpreted differently by different
levels. It can be written while assembling an array.
array_size
This can be used to artificially constrain the available space in
the array to be less than is actually available on the combined
devices. Writing a number (in Kilobytes) which is less than
the available size will set the size. Any reconfiguration of the
array (e.g. adding devices) will not cause the size to change.
Writing the word ``default`` will cause the effective size of the
array to be whatever size is actually available based on
``level``, ``chunk_size`` and ``component_size``.
This can be used to reduce the size of the array before reducing
the number of devices in a raid4/5/6, or to support external
metadata formats which mandate such clipping.
reshape_position
This is either ``none`` or a sector number within the devices of
the array where ``reshape`` is up to. If this is set, the three
attributes mentioned above (raid_disks, chunk_size, layout) can
potentially have 2 values, an old and a new value. If these
values differ, reading the attribute returns::
new (old)
and writing will effect the ``new`` value, leaving the ``old``
unchanged.
component_size
For arrays with data redundancy (i.e. not raid0, linear, faulty,
multipath), all components must be the same size - or at least
there must be a size that they all provide space for. This is a key
part of the geometry of the array. It is measured in sectors
and can be read from here. Writing to this value may resize
the array if the personality supports it (raid1, raid5, raid6),
and if the component drives are large enough.
metadata_version
This indicates the format that is being used to record metadata
about the array. It can be 0.90 (traditional format), 1.0, 1.1,
1.2 (newer format in varying locations) or ``none`` indicating that
the kernel isn't managing metadata at all.
Alternately it can be ``external:`` followed by a string which
is set by user-space. This indicates that metadata is managed
by a user-space program. Any device failure or other event that
requires a metadata update will cause array activity to be
suspended until the event is acknowledged.
resync_start
The point at which resync should start. If no resync is needed,
this will be a very large number (or ``none`` since 2.6.30-rc1). At
array creation it will default to 0, though starting the array as
``clean`` will set it much larger.
new_dev
This file can be written but not read. The value written should
be a block device number as major:minor. e.g. 8:0
This will cause that device to be attached to the array, if it is
available. It will then appear at md/dev-XXX (depending on the
name of the device) and further configuration is then possible.
safe_mode_delay
When an md array has seen no write requests for a certain period
of time, it will be marked as ``clean``. When another write
request arrives, the array is marked as ``dirty`` before the write
commences. This is known as ``safe_mode``.
The ``certain period`` is controlled by this file which stores the
period as a number of seconds. The default is 200msec (0.200).
Writing a value of 0 disables safemode.
array_state
This file contains a single word which describes the current
state of the array. In many cases, the state can be set by
writing the word for the desired state, however some states
cannot be explicitly set, and some transitions are not allowed.
Select/poll works on this file. All changes except between
active-idle and active (which can be frequent and are not
very interesting) are notified. The active->active-idle transition
is reported if the metadata is externally managed.
clear
No devices, no size, no level
Writing is equivalent to STOP_ARRAY ioctl
inactive
May have some settings, but array is not active
all IO results in error
When written, doesn't tear down array, but just stops it
suspended (not supported yet)
All IO requests will block. The array can be reconfigured.
Writing this, if accepted, will block until the array is quiescent
readonly
no resync can happen. no superblocks get written.
Write requests fail
read-auto
like readonly, but behaves like ``clean`` on a write request.
clean
no pending writes, but otherwise active.
When written to inactive array, starts without resync
If a write request arrives then
if metadata is known, mark ``dirty`` and switch to ``active``.
if not known, block and switch to write-pending
If written to an active array that has pending writes, then fails.
active
fully active: IO and resync can be happening.
When written to inactive array, starts with resync
write-pending
clean, but writes are blocked waiting for ``active`` to be written.
active-idle
like active, but no writes have been seen for a while (safe_mode_delay).
bitmap/location
This indicates where the write-intent bitmap for the array is
stored.
It can be one of ``none``, ``file`` or ``[+-]N``.
``file`` may later be extended to ``file:/file/name``
``[+-]N`` means that many sectors from the start of the metadata.
This is replicated on all devices. For arrays with externally
managed metadata, the offset is from the beginning of the
device.
bitmap/chunksize
The size, in bytes, of the chunk which will be represented by a
single bit. For RAID456, it is a portion of an individual
device. For RAID10, it is a portion of the array. For RAID1, it
is both (they come to the same thing).
bitmap/time_base
The time, in seconds, between looking for bits in the bitmap to
be cleared. In the current implementation, a bit will be cleared
between 2 and 3 times ``time_base`` after all the covered blocks
are known to be in-sync.
bitmap/backlog
When write-mostly devices are active in a RAID1, write requests
to those devices proceed in the background - the filesystem (or
other user of the device) does not have to wait for them.
``backlog`` sets a limit on the number of concurrent background
writes. If there are more than this, new writes will be
synchronous.
bitmap/metadata
This can be either ``internal`` or ``external``.
``internal``
is the default and means the metadata for the bitmap
is stored in the first 256 bytes of the allocated space and is
managed by the md module.
``external``
means that bitmap metadata is managed externally to
the kernel (i.e. by some userspace program)
bitmap/can_clear
This is either ``true`` or ``false``. If ``true``, then bits in the
bitmap will be cleared when the corresponding blocks are thought
to be in-sync. If ``false``, bits will never be cleared.
This is automatically set to ``false`` if a write happens on a
degraded array, or if the array becomes degraded during a write.
When metadata is managed externally, it should be set to true
once the array becomes non-degraded, and this fact has been
recorded in the metadata.
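As a rough usage sketch (the array name and the value written are
illustrative only), these attributes are read and written like any other
sysfs files::

    cd /sys/block/md0/md
    cat level raid_disks array_state     # e.g. "raid5", "4", "clean"
    echo 0.5 > safe_mode_delay           # mark the array clean after 0.5s without writes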
As component devices are added to an md array, they appear in the ``md``
directory as new directories named::
dev-XXX
where ``XXX`` is a name that the kernel knows for the device, e.g. hdb1.
Each directory contains:
block
a symlink to the block device in /sys/block, e.g.::
/sys/block/md0/md/dev-hdb1/block -> ../../../../block/hdb/hdb1
super
A file containing an image of the superblock read from, or
written to, that device.
state
A file recording the current state of the device in the array
which can be a comma separated list of:
faulty
device has been kicked from active use due to
a detected fault, or it has unacknowledged bad
blocks
in_sync
device is a fully in-sync member of the array
writemostly
device will only be subject to read
requests if there are no other options.
This applies only to raid1 arrays.
blocked
device has failed, and the failure hasn't been
acknowledged yet by the metadata handler.
Writes that would write to this device if
it were not faulty are blocked.
spare
device is working, but not a full member.
This includes spares that are in the process
of being recovered to
write_error
device has ever seen a write error.
want_replacement
device is (mostly) working but probably
should be replaced, either due to errors or
due to user request.
replacement
device is a replacement for another active
device with same raid_disk.
This list may grow in future.
This can be written to (a brief example of writing these flags follows the list of per-device files).
Writing ``faulty`` simulates a failure on the device.
Writing ``remove`` removes the device from the array.
Writing ``writemostly`` sets the writemostly flag.
Writing ``-writemostly`` clears the writemostly flag.
Writing ``blocked`` sets the ``blocked`` flag.
Writing ``-blocked`` clears the ``blocked`` flags and allows writes
to complete and possibly simulates an error.
Writing ``in_sync`` sets the in_sync flag.
Writing ``write_error`` sets writeerrorseen flag.
Writing ``-write_error`` clears writeerrorseen flag.
Writing ``want_replacement`` is allowed at any time except to a
replacement device or a spare. It sets the flag.
Writing ``-want_replacement`` is allowed at any time. It clears
the flag.
Writing ``replacement`` or ``-replacement`` is only allowed before
starting the array. It sets or clears the flag.
This file responds to select/poll. Any change to ``faulty``
or ``blocked`` causes an event.
errors
An approximate count of read errors that have been detected on
this device but have not caused the device to be evicted from
the array (either because they were corrected or because they
happened while the array was read-only). When using version-1
metadata, this value persists across restarts of the array.
This value can be written while assembling an array thus
providing an ongoing count for arrays with metadata managed by
userspace.
slot
This gives the role that the device has in the array. It will
either be ``none`` if the device is not active in the array
(i.e. is a spare or has failed) or an integer less than the
``raid_disks`` number for the array indicating which position
it currently fills. This can only be set while assembling an
array. A device for which this is set is assumed to be working.
offset
This gives the location in the device (in sectors from the
start) where data from the array will be stored. Any part of
the device before this offset is not touched, unless it is
used for storing metadata (Formats 1.1 and 1.2).
size
The amount of the device, after the offset, that can be used
for storage of data. This will normally be the same as the
component_size. This can be written while assembling an
array. If a value less than the current component_size is
written, it will be rejected.
recovery_start
When the device is not ``in_sync``, this records the number of
sectors from the start of the device which are known to be
correct. This is normally zero, but during a recovery
operation it will steadily increase, and if the recovery is
interrupted, restoring this value can cause recovery to
avoid repeating the earlier blocks. With v1.x metadata, this
value is saved and restored automatically.
This can be set whenever the device is not an active member of
the array, either before the array is activated, or before
the ``slot`` is set.
Setting this to ``none`` is equivalent to setting ``in_sync``.
Setting to any other value also clears the ``in_sync`` flag.
bad_blocks
This gives the list of all known bad blocks in the form of
start address and length (both in sectors). If output
is too big to fit in a page, it will be truncated. Writing
``sector length`` to this file adds new acknowledged (i.e.
recorded to disk safely) bad blocks.
unacknowledged_bad_blocks
This gives the list of known-but-not-yet-saved-to-disk bad
blocks in the same form of ``bad_blocks``. If output is too big
to fit in a page, it will be truncated. Writing to this file
adds bad blocks without acknowledging them. This is largely
for testing.
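As a brief sketch of writing these flags (array and device names are
illustrative), a component can be failed and then detached by writing to
its ``state`` file::

    echo faulty > /sys/block/md0/md/dev-sdb1/state   # simulate a failure on sdb1
    echo remove > /sys/block/md0/md/dev-sdb1/state   # detach the now-failed device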
An active md device will also contain an entry for each active device
in the array. These are named::
rdNN
where ``NN`` is the position in the array, starting from 0.
So for a 3 drive array there will be rd0, rd1, rd2.
These are symbolic links to the appropriate ``dev-XXX`` entry.
Thus, for example::
cat /sys/block/md*/md/rd*/state
will show ``in_sync`` on every line.
Active md devices for levels that support data redundancy (1,4,5,6,10)
also have
sync_action
a text file that can be used to monitor and control the rebuild
process. It contains one word which can be one of:
resync
redundancy is being recalculated after unclean
shutdown or creation
recover
a hot spare is being built to replace a
failed/missing device
idle
nothing is happening
check
A full check of redundancy was requested and is
happening. This reads all blocks and checks
them. A repair may also happen for some raid
levels.
repair
A full check and repair is happening. This is
similar to ``resync``, but was requested by the
user, and the write-intent bitmap is NOT used to
optimise the process.
This file is writable, and each of the strings that can be read is
also meaningful for writing (a usage sketch follows this list of attributes).
``idle`` will stop an active resync/recovery etc. There is no
guarantee that another resync/recovery may not be automatically
started again, though some event will be needed to trigger
this.
``resync`` or ``recovery`` can be used to restart the
corresponding operation if it was stopped with ``idle``.
``check`` and ``repair`` will start the appropriate process
providing the current state is ``idle``.
This file responds to select/poll. Any important change in the value
triggers a poll event. Sometimes the value will briefly be
``recover`` if a recovery seems to be needed, but cannot be
achieved. In that case, the transition to ``recover`` isn't
notified, but the transition away is.
degraded
This contains a count of the number of devices by which the
array is degraded. So an optimal array will show ``0``. A
single failed/missing drive will show ``1``, etc.
This file responds to select/poll, any increase or decrease
in the count of missing devices will trigger an event.
mismatch_count
When performing ``check`` and ``repair``, and possibly when
performing ``resync``, md will count the number of errors that are
found. The count in ``mismatch_cnt`` is the number of sectors
that were re-written, or (for ``check``) would have been
re-written. As most raid levels work in units of pages rather
than sectors, this may be larger than the number of actual errors
by a factor of the number of sectors in a page.
bitmap_set_bits
If the array has a write-intent bitmap, then writing to this
attribute can set bits in the bitmap, indicating that a resync
would need to check the corresponding blocks. Either individual
numbers or start-end pairs can be written. Multiple numbers
can be separated by a space.
Note that the numbers are ``bit`` numbers, not ``block`` numbers.
They should be scaled by the bitmap_chunksize.
sync_speed_min, sync_speed_max
These are similar to ``/proc/sys/dev/raid/speed_limit_{min,max}``
however they only apply to the particular array.
If no value has been written to these, or if the word ``system``
is written, then the system-wide value is used. If a value,
in kibibytes per second, is written, then it is used.
When the files are read, they show the currently active value
followed by ``(local)`` or ``(system)`` depending on whether it is
a locally set or system-wide value.
sync_completed
This shows the number of sectors that have been completed of
whatever the current sync_action is, followed by the number of
sectors in total that could need to be processed. The two
numbers are separated by a ``/`` thus effectively showing one
value, a fraction of the process that is complete.
A ``select`` on this attribute will return when resync completes,
when it reaches the current sync_max (below) and possibly at
other times.
sync_speed
This shows the current actual speed, in K/sec, of the current
sync_action. It is averaged over the last 30 seconds.
suspend_lo, suspend_hi
The two values, given as numbers of sectors, indicate a range
within the array where IO will be blocked. This is currently
only supported for raid4/5/6.
sync_min, sync_max
The two values, given as numbers of sectors, indicate a range
within the array where ``check``/``repair`` will operate. Must be
a multiple of chunk_size. When it reaches ``sync_max`` it will
pause, rather than complete.
You can use ``select`` or ``poll`` on ``sync_completed`` to wait for
that number to reach sync_max. Then you can either increase
``sync_max``, or can write ``idle`` to ``sync_action``.
The value of ``max`` for ``sync_max`` effectively disables the limit.
When a resync is active, the value can only ever be increased,
never decreased.
The value of ``0`` is the minimum for ``sync_min``.
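As a sketch of how these attributes work together (array name and sector
counts are illustrative), a bounded consistency check might be run like
this::

    cd /sys/block/md0/md
    echo 0 > sync_min            # start checking from the beginning
    echo 20971520 > sync_max     # stop after this many sectors (a multiple of chunk_size)
    echo check > sync_action     # begin the check
    cat sync_completed           # e.g. "20971520 / 41943040"
    echo idle > sync_action      # stop the check early, or after it pauses at sync_max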
Each active md device may also have attributes specific to the
personality module that manages it.
These are specific to the implementation of the module and could
change substantially if the implementation changes.
These currently include (a usage example follows the list):
stripe_cache_size (currently raid5 only)
number of entries in the stripe cache. This is writable, but
there are upper and lower limits (32768, 17). Default is 256.
stripe_cache_active (currently raid5 only)
number of active entries in the stripe cache
preread_bypass_threshold (currently raid5 only)
number of times a stripe requiring preread will be bypassed by
a stripe that does not require preread. For fairness defaults
to 1. Setting this to 0 disables bypass accounting and
requires preread stripes to wait until all full-width stripe-
writes are complete. Valid values are 0 to stripe_cache_size.
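As an example of working with these personality-specific attributes (array
name and value are illustrative; a larger cache uses more memory), the raid5
stripe cache can be inspected and enlarged through sysfs::

    cat /sys/block/md0/md/stripe_cache_size           # default is 256
    echo 4096 > /sys/block/md0/md/stripe_cache_size   # must stay within the 17..32768 limits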

Kernel module signing facility
------------------------------
.. CONTENTS
..
.. - Overview.
.. - Configuring module signing.
.. - Generating signing keys.
.. - Public keys in the kernel.
.. - Manually signing modules.
.. - Signed modules and stripping.
.. - Loading signed modules.
.. - Non-valid signatures and unsigned modules.
.. - Administering/protecting the private key.
========
Overview
========
The kernel module signing facility cryptographically signs modules during
installation and then checks the signature upon loading the module. This
allows increased kernel security by disallowing the loading of unsigned modules
or modules signed with an invalid key. Module signing increases security by
making it harder to load a malicious module into the kernel. The module
signature checking is done by the kernel so that it is not necessary to have
trusted userspace bits.
This facility uses X.509 ITU-T standard certificates to encode the public keys
involved. The signatures are not themselves encoded in any industry-standard
format. The facility currently only supports the RSA public key encryption
standard (though it is pluggable and permits others to be used). The possible
hash algorithms that can be used are SHA-1, SHA-224, SHA-256, SHA-384, and
SHA-512 (the algorithm is selected by data in the signature).
==========================
Configuring module signing
==========================
The module signing facility is enabled by going to the
:menuselection:`Enable Loadable Module Support` section of
the kernel configuration and turning on::
CONFIG_MODULE_SIG "Module signature verification"
This has a number of options available:
(1) :menuselection:`Require modules to be validly signed`
(``CONFIG_MODULE_SIG_FORCE``)
This specifies how the kernel should deal with a module that has a
signature for which the key is not known or a module that is unsigned.
If this is off (ie. "permissive"), then modules for which the key is not
available and modules that are unsigned are permitted, but the kernel will
be marked as being tainted, and the concerned modules will be marked as
tainted, shown with the character 'E'.
If this is on (ie. "restrictive"), only modules that have a valid
signature that can be verified by a public key in the kernel's possession
will be loaded. All other modules will generate an error.
Irrespective of the setting here, if the module has a signature block that
cannot be parsed, it will be rejected out of hand.
(2) :menuselection:`Automatically sign all modules`
(``CONFIG_MODULE_SIG_ALL``)
If this is on then modules will be automatically signed during the
modules_install phase of a build. If this is off, then the modules must
be signed manually using::
scripts/sign-file
(3) :menuselection:`Which hash algorithm should modules be signed with?`
This presents a choice of which hash algorithm the installation phase will
sign the modules with:
=============================== ==========================================
``CONFIG_MODULE_SIG_SHA1`` :menuselection:`Sign modules with SHA-1`
``CONFIG_MODULE_SIG_SHA224`` :menuselection:`Sign modules with SHA-224`
``CONFIG_MODULE_SIG_SHA256`` :menuselection:`Sign modules with SHA-256`
``CONFIG_MODULE_SIG_SHA384`` :menuselection:`Sign modules with SHA-384`
``CONFIG_MODULE_SIG_SHA512`` :menuselection:`Sign modules with SHA-512`
=============================== ==========================================
The algorithm selected here will also be built into the kernel (rather
than being a module) so that modules signed with that algorithm can have
their signatures checked without causing a dependency loop.
(4) :menuselection:`File name or PKCS#11 URI of module signing key`
(``CONFIG_MODULE_SIG_KEY``)
Setting this option to something other than its default of
``certs/signing_key.pem`` will disable the autogeneration of signing keys
and allow the kernel modules to be signed with a key of your choosing.
The string provided should identify a file containing both a private key
and its corresponding X.509 certificate in PEM form, or — on systems where
the OpenSSL ENGINE_pkcs11 is functional — a PKCS#11 URI as defined by
RFC7512. In the latter case, the PKCS#11 URI should reference both a
certificate and a private key.
If the PEM file containing the private key is encrypted, or if the
PKCS#11 token requires a PIN, this can be provided at build time by
means of the ``KBUILD_SIGN_PIN`` variable.
(5) :menuselection:`Additional X.509 keys for default system keyring`
(``CONFIG_SYSTEM_TRUSTED_KEYS``)
This option can be set to the filename of a PEM-encoded file containing
additional certificates which will be included in the system keyring by
default.
Note that enabling module signing adds a dependency on the OpenSSL devel
packages to the kernel build processes for the tool that does the signing.
=======================
Generating signing keys
=======================
Cryptographic keypairs are required to generate and check signatures. A
private key is used to generate a signature and the corresponding public key is
used to check it. The private key is only needed during the build, after which
it can be deleted or stored securely. The public key gets built into the
kernel so that it can be used to check the signatures as the modules are
loaded.
Under normal conditions, when ``CONFIG_MODULE_SIG_KEY`` is unchanged from its
default, the kernel build will automatically generate a new keypair using
openssl if one does not exist in the file::
certs/signing_key.pem
during the building of vmlinux (the public part of the key needs to be built
into vmlinux) using parameters in the::
certs/x509.genkey
file (which is also generated if it does not already exist).
It is strongly recommended that you provide your own x509.genkey file.
Most notably, in the x509.genkey file, the req_distinguished_name section
should be altered from the default::
[ req_distinguished_name ]
#O = Unspecified company
CN = Build time autogenerated kernel key
#emailAddress = unspecified.user@unspecified.company
The generated RSA key size can also be set with::
[ req ]
default_bits = 4096
It is also possible to manually generate the key private/public files using the
x509.genkey key generation configuration file in the root node of the Linux
kernel sources tree and the openssl command. The following is an example to
generate the public/private key files::
openssl req -new -nodes -utf8 -sha256 -days 36500 -batch -x509 \
-config x509.genkey -outform PEM -out kernel_key.pem \
-keyout kernel_key.pem
The full pathname for the resulting kernel_key.pem file can then be specified
in the ``CONFIG_MODULE_SIG_KEY`` option, and the certificate and key therein will
be used instead of an autogenerated keypair.
=========================
Public keys in the kernel
=========================
The kernel contains a ring of public keys that can be viewed by root. They're
in a keyring called ".system_keyring" that can be seen by::
[root@deneb ~]# cat /proc/keys
...
223c7853 I------ 1 perm 1f030000 0 0 keyring .system_keyring: 1
302d2d52 I------ 1 perm 1f010000 0 0 asymmetri Fedora kernel signing key: d69a84e6bce3d216b979e9505b3e3ef9a7118079: X509.RSA a7118079 []
...
Beyond the public key generated specifically for module signing, additional
trusted certificates can be provided in a PEM-encoded file referenced by the
``CONFIG_SYSTEM_TRUSTED_KEYS`` configuration option.
Further, the architecture code may take public keys from a hardware store and
add those in also (e.g. from the UEFI key database).
Finally, it is possible to add additional public keys by doing::
keyctl padd asymmetric "" [.system_keyring-ID] <[key-file]
e.g.::
keyctl padd asymmetric "" 0x223c7853 <my_public_key.x509
Note, however, that the kernel will only permit keys to be added to
``.system_keyring`` *if* the new key's X.509 wrapper is validly signed by a key
that is already resident in the ``.system_keyring`` at the time the key was added.
========================
Manually signing modules
========================
To manually sign a module, use the scripts/sign-file tool available in
the Linux kernel source tree. The script requires 4 arguments:
1. The hash algorithm (e.g., sha256)
2. The private key filename or PKCS#11 URI
3. The public key filename
4. The kernel module to be signed
The following is an example to sign a kernel module::
scripts/sign-file sha512 kernel-signkey.priv \
kernel-signkey.x509 module.ko
The hash algorithm used does not have to match the one configured, but if it
doesn't, you should make sure that the hash algorithm is either built into the
kernel or can be loaded without requiring itself.
If the private key requires a passphrase or PIN, it can be provided in the
$KBUILD_SIGN_PIN environment variable.
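As an illustration, reusing the example file names above, a
passphrase-protected key can be used by setting the variable just for the
signing invocation::

    KBUILD_SIGN_PIN="my passphrase" scripts/sign-file sha512 \
            kernel-signkey.priv kernel-signkey.x509 module.ko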
============================
Signed modules and stripping
============================
A signed module has a digital signature simply appended at the end. The string
``~Module signature appended~.`` at the end of the module's file confirms that a
signature is present but it does not confirm that the signature is valid!
Signed modules are BRITTLE as the signature is outside of the defined ELF
container. Thus they MAY NOT be stripped once the signature is computed and
attached. Note the entire module is the signed payload, including any and all
debug information present at the time of signing.
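A quick way to check whether a module file appears to carry an appended
signature is to look for that marker string near the end of the file (this
only confirms the marker is present, not that the signature is valid)::

    grep -q '~Module signature appended~' module.ko && echo "signature marker present"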
======================
Loading signed modules
======================
Modules are loaded with insmod, modprobe, ``init_module()`` or
``finit_module()``, exactly as for unsigned modules as no processing is
done in userspace. The signature checking is all done within the kernel.
=========================================
Non-valid signatures and unsigned modules
=========================================
If ``CONFIG_MODULE_SIG_FORCE`` is enabled or module.sig_enforce=1 is supplied on
the kernel command line, the kernel will only load validly signed modules
for which it has a public key. Otherwise, it will also load modules that are
unsigned. Any module for which the kernel has a key, but which proves to have
a signature mismatch will not be permitted to load.
Any module that has an unparseable signature will be rejected.
=========================================
Administering/protecting the private key
=========================================
Since the private key is used to sign modules, viruses and malware could use
the private key to sign modules and compromise the operating system. The
private key must be either destroyed or moved to a secure location and not kept
in the root node of the kernel source tree.
If you use the same private key to sign modules for multiple kernel
configurations, you must ensure that the module version information is
sufficient to prevent loading a module into a different kernel. Either
set ``CONFIG_MODVERSIONS=y`` or ensure that each configuration has a different
kernel release string by changing ``EXTRAVERSION`` or ``CONFIG_LOCALVERSION``.

Mono(tm) Binary Kernel Support for Linux
-----------------------------------------
To configure Linux to automatically execute Mono-based .NET binaries
(in the form of .exe files) without the need to use the mono CLR
wrapper, you can use the BINFMT_MISC kernel support.
This will allow you to execute Mono-based .NET binaries just like any
other program after you have done the following:
1) You MUST FIRST install the Mono CLR support, either by downloading
a binary package, a source tarball or by installing from CVS. Binary
packages for several distributions can be found at:
http://go-mono.com/download.html
Instructions for compiling Mono can be found at:
http://www.go-mono.com/compiling.html
Once the Mono CLR support has been installed, just check that
``/usr/bin/mono`` (which could be located elsewhere, for example
``/usr/local/bin/mono``) is working.
2) You have to compile BINFMT_MISC either as a module or into
the kernel (``CONFIG_BINFMT_MISC``) and set it up properly.
If you choose to compile it as a module, you will have
to insert it manually with modprobe/insmod, as kmod
cannot be easily supported with binfmt_misc.
Read the file ``binfmt_misc.txt`` in this directory to know
more about the configuration process.
3) Add the following entries to ``/etc/rc.local`` or similar script
to be run at system startup:
.. code-block:: sh
# Insert BINFMT_MISC module into the kernel
if [ ! -e /proc/sys/fs/binfmt_misc/register ]; then
/sbin/modprobe binfmt_misc
# Some distributions, like Fedora Core, perform
# the following command automatically when the
# binfmt_misc module is loaded into the kernel
# or during normal boot up (systemd-based systems).
# Thus, it is possible that the following line
# is not needed at all.
mount -t binfmt_misc none /proc/sys/fs/binfmt_misc
fi
# Register support for .NET CLR binaries
if [ -e /proc/sys/fs/binfmt_misc/register ]; then
# Replace /usr/bin/mono with the correct pathname to
# the Mono CLR runtime (usually /usr/local/bin/mono
# when compiling from sources or CVS).
echo ':CLR:M::MZ::/usr/bin/mono:' > /proc/sys/fs/binfmt_misc/register
else
echo "No binfmt_misc support"
exit 1
fi
4) Check that ``.exe`` binaries can be run without the need for a
wrapper script, simply by launching the ``.exe`` file directly
from a command prompt, for example::
/usr/bin/xsd.exe
.. note::
If this fails with a permission denied error, check
that the ``.exe`` file has execute permissions.
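Assuming the handler was registered under the name ``CLR`` as in the example
above, its status can be inspected (and the entry temporarily disabled)
through binfmt_misc::

    cat /proc/sys/fs/binfmt_misc/CLR        # shows enabled/disabled, interpreter and magic
    echo 0 > /proc/sys/fs/binfmt_misc/CLR   # temporarily disable the handler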

Parport
+++++++
The ``parport`` code provides parallel-port support under Linux. This
includes the ability to share one port between multiple device
drivers.
You can pass parameters to the ``parport`` code to override its automatic
detection of your hardware. This is particularly useful if you want
to use IRQs, since in general these can't be autoprobed successfully.
By default IRQs are not used even if they **can** be probed. This is
because there are a lot of people using the same IRQ for their
parallel port and a sound card or network card.
The ``parport`` code is split into two parts: generic (which deals with
port-sharing) and architecture-dependent (which deals with actually
using the port).
Parport as modules
==================
If you load the ``parport`` code as a module, say::
# insmod parport
to load the generic ``parport`` code. You then must load the
architecture-dependent code with (for example)::
# insmod parport_pc io=0x3bc,0x378,0x278 irq=none,7,auto
to tell the ``parport`` code that you want three PC-style ports, one at
0x3bc with no IRQ, one at 0x378 using IRQ 7, and one at 0x278 with an
auto-detected IRQ. Currently, PC-style (``parport_pc``), Sun ``bpp``,
Amiga, Atari, and MFC3 hardware is supported.
PCI parallel I/O card support comes from ``parport_pc``. Base I/O
addresses should not be specified for supported PCI cards since they
are automatically detected.
modprobe
--------
If you use modprobe, you will find it useful to add lines like the following
to a configuration file in the /etc/modprobe.d/ directory::
alias parport_lowlevel parport_pc
options parport_pc io=0x378,0x278 irq=7,auto
modprobe will load ``parport_pc`` (with the options ``io=0x378,0x278 irq=7,auto``)
whenever a parallel port device driver (such as ``lp``) is loaded.
Note that these are example lines only! You shouldn't in general need
to specify any options to ``parport_pc`` in order to be able to use a
parallel port.
Parport probe [optional]
------------------------
In 2.2 kernels there was a module called ``parport_probe``, which was used
for collecting IEEE 1284 device ID information. This has now been
enhanced and now lives with the IEEE 1284 support. When a parallel
port is detected, the devices that are connected to it are analysed,
and information is logged like this::
parport0: Printer, BJC-210 (Canon)
The probe information is available from files in ``/proc/sys/dev/parport/``.
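For example, the collected device ID information and the port's hardware
modes can be read directly (assuming the first port is ``parport0``)::

    cat /proc/sys/dev/parport/parport0/autoprobe
    cat /proc/sys/dev/parport/parport0/modes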
Parport linked into the kernel statically
=========================================
If you compile the ``parport`` code into the kernel, then you can use
kernel boot parameters to get the same effect. Add something like the
following to your LILO command line::
parport=0x3bc parport=0x378,7 parport=0x278,auto,nofifo
You can have many ``parport=...`` statements, one for each port you want
to add. Adding ``parport=0`` to the kernel command-line will disable
parport support entirely. Adding ``parport=auto`` to the kernel
command-line will make ``parport`` use any IRQ lines or DMA channels that
it auto-detects.
Files in /proc
==============
If you have configured the ``/proc`` filesystem into your kernel, you will
see a new directory entry: ``/proc/sys/dev/parport``. In there will be a
directory entry for each parallel port for which parport is
configured. In each of those directories are a collection of files
describing that parallel port.
The ``/proc/sys/dev/parport`` directory tree looks like::
parport
|-- default
| |-- spintime
| `-- timeslice
|-- parport0
| |-- autoprobe
| |-- autoprobe0
| |-- autoprobe1
| |-- autoprobe2
| |-- autoprobe3
| |-- devices
| | |-- active
| | `-- lp
| | `-- timeslice
| |-- base-addr
| |-- irq
| |-- dma
| |-- modes
| `-- spintime
`-- parport1
|-- autoprobe
|-- autoprobe0
|-- autoprobe1
|-- autoprobe2
|-- autoprobe3
|-- devices
| |-- active
| `-- ppa
| `-- timeslice
|-- base-addr
|-- irq
|-- dma
|-- modes
`-- spintime
.. tabularcolumns:: |p{4.0cm}|p{13.5cm}|
======================= =======================================================
File Contents
======================= =======================================================
``devices/active`` A list of the device drivers using that port. A "+"
will appear by the name of the device currently using
the port (it might not appear against any). The
string "none" means that there are no device drivers
using that port.
``base-addr`` Parallel port's base address, or addresses if the port
has more than one in which case they are separated
with tabs. These values might not have any sensible
meaning for some ports.
``irq`` Parallel port's IRQ, or -1 if none is being used.
``dma`` Parallel port's DMA channel, or -1 if none is being
used.
``modes`` Parallel port's hardware modes, comma-separated,
meaning:
- PCSPP
PC-style SPP registers are available.
- TRISTATE
Port is bidirectional.
- COMPAT
Hardware acceleration for printers is
available and will be used.
- EPP
Hardware acceleration for EPP protocol
is available and will be used.
- ECP
Hardware acceleration for ECP protocol
is available and will be used.
- DMA
DMA is available and will be used.
Note that the current implementation will only take
advantage of COMPAT and ECP modes if it has an IRQ
line to use.
``autoprobe`` Any IEEE-1284 device ID information that has been
acquired from the (non-IEEE 1284.3) device.
``autoprobe[0-3]`` IEEE 1284 device ID information retrieved from
daisy-chain devices that conform to IEEE 1284.3.
``spintime`` The number of microseconds to busy-loop while waiting
for the peripheral to respond. You might find that
adjusting this improves performance, depending on your
peripherals. This is a port-wide setting, i.e. it
applies to all devices on a particular port.
``timeslice`` The number of milliseconds that a device driver is
allowed to keep a port claimed for. This is advisory,
and driver can ignore it if it must.
``default/*`` The defaults for spintime and timeslice. When a new
port is registered, it picks up the default spintime.
When a new device is registered, it picks up the
default timeslice.
======================= =======================================================
Device drivers
==============
Once the parport code is initialised, you can attach device drivers to
specific ports. Normally this happens automatically; if the lp driver
is loaded it will create one lp device for each port found. You can
override this, though, by using parameters either when you load the lp
driver::
# insmod lp parport=0,2
or on the LILO command line::
lp=parport0 lp=parport2
Both the above examples would inform lp that you want ``/dev/lp0`` to be
the first parallel port, and ``/dev/lp1`` to be the **third** parallel port,
with no lp device associated with the second port (parport1). Note
that this is different to the way older kernels worked; there used to
be a static association between the I/O port address and the device
name, so ``/dev/lp0`` was always the port at 0x3bc. This is no longer the
case - if you only have one port, it will default to being ``/dev/lp0``,
regardless of base address.
Also:
* If you selected the IEEE 1284 support at compile time, you can say
``lp=auto`` on the kernel command line, and lp will create devices
only for those ports that seem to have printers attached.
* If you give PLIP the ``timid`` parameter, either with ``plip=timid`` on
the command line, or with ``insmod plip timid=1`` when using modules,
it will avoid any ports that seem to be in use by other devices.
* IRQ autoprobing works only for a few port types at the moment.
Reporting printer problems with parport
=======================================
If you are having problems printing, please go through these steps to
try to narrow down where the problem area is.
When reporting problems with parport, really you need to give all of
the messages that ``parport_pc`` spits out when it initialises. There are
several code paths:
- polling
- interrupt-driven, protocol in software
- interrupt-driven, protocol in hardware using PIO
- interrupt-driven, protocol in hardware using DMA
The kernel messages that ``parport_pc`` logs give an indication of which
code path is being used. (They could be a lot better actually..)
For normal printer protocol, having IEEE 1284 modes enabled or not
should not make a difference.
To turn off the 'protocol in hardware' code paths, disable
``CONFIG_PARPORT_PC_FIFO``. Note that when they are enabled they are not
necessarily **used**; it depends on whether the hardware is available,
enabled by the BIOS, and detected by the driver.
So, to start with, disable ``CONFIG_PARPORT_PC_FIFO``, and load ``parport_pc``
with ``irq=none``. See if printing works then. It really should,
because this is the simplest code path.
If that works fine, try with ``io=0x378 irq=7`` (adjust for your
hardware), to make it use interrupt-driven in-software protocol.
If **that** works fine, then one of the hardware modes isn't working
right. Enable ``CONFIG_PARPORT_PC_FIFO`` (no, it isn't a module option,
and yes, it should be), set the port to ECP mode in the BIOS and note
the DMA channel, and try with::
io=0x378 irq=7 dma=none (for PIO)
io=0x378 irq=7 dma=3 (for DMA)
----------
philb@gnu.org
tim@cyberelk.net

Ramoops oops/panic logger
=========================
Sergiu Iordache <sergiu@chromium.org>
Updated: 17 November 2011
Introduction
------------
Ramoops is an oops/panic logger that writes its logs to RAM before the system
crashes. It works by logging oopses and panics in a circular buffer. Ramoops
needs a system with persistent RAM so that the content of that area can
survive after a restart.
Ramoops concepts
----------------
Ramoops uses a predefined memory area to store the dump. The start and size
and type of the memory area are set using three variables:
* ``mem_address`` for the start
* ``mem_size`` for the size. The memory size will be rounded down to a
power of two.
* ``mem_type`` to specify the memory type (default is pgprot_writecombine).
Typically the default value of ``mem_type=0`` should be used as that sets the pstore
mapping to pgprot_writecombine. Setting ``mem_type=1`` attempts to use
``pgprot_noncached``, which only works on some platforms. This is because pstore
depends on atomic operations. At least on ARM, pgprot_noncached causes the
memory to be mapped strongly ordered, and atomic operations on strongly ordered
memory are implementation defined, and won't work on many ARMs such as omaps.
The memory area is divided into ``record_size`` chunks (also rounded down to
power of two) and each oops/panic writes a ``record_size`` chunk of
information.
Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
variable while setting 0 in that variable dumps only the panics.
The module uses a counter to record multiple dumps but the counter gets reset
on restart (i.e. new dumps after the restart will overwrite old ones).
Ramoops also supports software ECC protection of persistent memory regions.
This might be useful when a hardware reset was used to bring the machine back
to life (i.e. a watchdog triggered). In such cases, RAM may be somewhat
corrupt, but usually it is restorable.
Setting the parameters
----------------------
Setting the ramoops parameters can be done in several different manners:
A. Use the module parameters (which have the names of the variables described
as before). For quick debugging, you can also reserve parts of memory during
boot and then use the reserved memory for ramoops. For example, assuming a
machine with > 128 MB of memory, the following kernel command line will tell
the kernel to use only the first 128 MB of memory, and place ECC-protected
ramoops region at 128 MB boundary::
mem=128M ramoops.mem_address=0x8000000 ramoops.ecc=1
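If ramoops is built as a module, the same parameters can instead be passed
at load time; for example (the address and sizes are illustrative)::

    modprobe ramoops mem_address=0x8000000 mem_size=0x100000 \
            record_size=0x4000 dump_oops=1 ecc=1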
B. Use Device Tree bindings, as described in
``Documentation/devicetree/bindings/reserved-memory/ramoops.txt``.
For example::
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;
ramoops@8f000000 {
compatible = "ramoops";
reg = <0 0x8f000000 0 0x100000>;
record-size = <0x4000>;
console-size = <0x4000>;
};
};
C. Use a platform device and set the platform data. The parameters can then
be set through that platform data. An example of doing that is:
.. code-block:: c
#include <linux/pstore_ram.h>
[...]
static struct ramoops_platform_data ramoops_data = {
.mem_size = <...>,
.mem_address = <...>,
.mem_type = <...>,
.record_size = <...>,
.dump_oops = <...>,
.ecc = <...>,
};
static struct platform_device ramoops_dev = {
.name = "ramoops",
.dev = {
.platform_data = &ramoops_data,
},
};
[... inside a function ...]
int ret;
ret = platform_device_register(&ramoops_dev);
if (ret) {
printk(KERN_ERR "unable to register platform device\n");
return ret;
}
You can specify either RAM memory or peripheral devices' memory. However, when
specifying RAM, be sure to reserve the memory by issuing memblock_reserve()
very early in the architecture code, e.g.::
#include <linux/memblock.h>
memblock_reserve(ramoops_data.mem_address, ramoops_data.mem_size);
Dump format
-----------
The data dump begins with a header, currently defined as ``====`` followed by a
timestamp and a new line. The dump then continues with the actual data.
Reading the data
----------------
The dump data can be read from the pstore filesystem. The format for these
files is ``dmesg-ramoops-N``, where N is the record number in memory. To delete
a stored record from RAM, simply unlink the respective pstore file.
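A minimal sketch of retrieving and then clearing a stored dump (the mount
point and record number are illustrative)::

    mount -t pstore pstore /sys/fs/pstore
    ls /sys/fs/pstore                       # e.g. dmesg-ramoops-0
    cat /sys/fs/pstore/dmesg-ramoops-0      # read the captured oops/panic log
    rm /sys/fs/pstore/dmesg-ramoops-0       # free the record in persistent RAM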
Persistent function tracing
---------------------------
Persistent function tracing might be useful for debugging software or hardware
related hangs. The functions call chain log is stored in a ``ftrace-ramoops``
file. Here is an example of usage::
# mount -t debugfs debugfs /sys/kernel/debug/
# echo 1 > /sys/kernel/debug/pstore/record_ftrace
# reboot -f
[...]
# mount -t pstore pstore /mnt/
# tail /mnt/ftrace-ramoops
0 ffffffff8101ea64 ffffffff8101bcda native_apic_mem_read <- disconnect_bsp_APIC+0x6a/0xc0
0 ffffffff8101ea44 ffffffff8101bcf6 native_apic_mem_write <- disconnect_bsp_APIC+0x86/0xc0
0 ffffffff81020084 ffffffff8101a4b5 hpet_disable <- native_machine_shutdown+0x75/0x90
0 ffffffff81005f94 ffffffff8101a4bb iommu_shutdown_noop <- native_machine_shutdown+0x7b/0x90
0 ffffffff8101a6a1 ffffffff8101a437 native_machine_emergency_restart <- native_machine_restart+0x37/0x40
0 ffffffff811f9876 ffffffff8101a73a acpi_reboot <- native_machine_emergency_restart+0xaa/0x1e0
0 ffffffff8101a514 ffffffff8101a772 mach_reboot_fixups <- native_machine_emergency_restart+0xe2/0x1e0
0 ffffffff811d9c54 ffffffff8101a7a0 __const_udelay <- native_machine_emergency_restart+0x110/0x1e0
0 ffffffff811d9c34 ffffffff811d9c80 __delay <- __const_udelay+0x30/0x40
0 ffffffff811d9d14 ffffffff811d9c3f delay_tsc <- __delay+0xf/0x20

.. _reportingbugs:
Reporting bugs
++++++++++++++
Background
==========
The upstream Linux kernel maintainers only fix bugs for specific kernel
versions. Those versions include the current "release candidate" (or -rc)
kernel, any "stable" kernel versions, and any "long term" kernels.
Please see https://www.kernel.org/ for a list of supported kernels. Any
kernel marked with [EOL] is "end of life" and will not have any fixes
backported to it.
If you've found a bug on a kernel version that isn't listed on kernel.org,
contact your Linux distribution or embedded vendor for support.
Alternatively, you can attempt to run one of the supported stable or -rc
kernels, and see if you can reproduce the bug on that. It's preferable
to reproduce the bug on the latest -rc kernel.
How to report Linux kernel bugs
===============================
Identify the problematic subsystem
----------------------------------
Identifying which part of the Linux kernel might be causing your issue
increases your chances of getting your bug fixed. Simply posting to the
generic linux-kernel mailing list (LKML) may cause your bug report to be
lost in the noise of a mailing list that gets 1000+ emails a day.
Instead, try to figure out which kernel subsystem is causing the issue,
and email that subsystem's maintainer and mailing list. If the subsystem
maintainer doesn't answer, then expand your scope to mailing lists like
LKML.
Identify who to notify
----------------------
Once you know the subsystem that is causing the issue, you should send a
bug report. Some maintainers prefer bugs to be reported via bugzilla
(https://bugzilla.kernel.org), while others prefer that bugs be reported
via the subsystem mailing list.
To find out where to send an emailed bug report, find your subsystem or
device driver in the MAINTAINERS file. Search in the file for relevant
entries, and send your bug report to the person(s) listed in the "M:"
lines, making sure to Cc the mailing list(s) in the "L:" lines. When the
maintainer replies to you, make sure to 'Reply-all' in order to keep the
public mailing list(s) in the email thread.
If you know which driver is causing issues, you can pass one of the driver
files to the get_maintainer.pl script::
perl scripts/get_maintainer.pl -f <filename>
If it is a security bug, please copy the Security Contact listed in the
MAINTAINERS file. They can help coordinate bugfix and disclosure. See
:ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>` for more information.
If you can't figure out which subsystem caused the issue, you should file
a bug in kernel.org bugzilla and send email to
linux-kernel@vger.kernel.org, referencing the bugzilla URL. (For more
information on the linux-kernel mailing list see
http://www.tux.org/lkml/).
Tips for reporting bugs
-----------------------
If you haven't reported a bug before, please read:
http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
http://www.catb.org/esr/faqs/smart-questions.html
It's REALLY important to report bugs that seem unrelated as separate email
threads or separate bugzilla entries. If you report several unrelated
bugs at once, it's difficult for maintainers to tease apart the relevant
data.
Gather information
------------------
The most important information in a bug report is how to reproduce the
bug. This includes system information, and (most importantly)
step-by-step instructions for how a user can trigger the bug.
If the failure includes an "OOPS:", take a picture of the screen, capture
a netconsole trace, or type the message from your screen into the bug
report. Please read "Documentation/admin-guide/oops-tracing.rst" before posting your
bug report. This explains what you should do with the "Oops" information
to make it useful to the recipient.
This is a suggested format for a bug report sent via email or bugzilla.
Having a standardized bug report form makes it easier for you not to
overlook things, and easier for the developers to find the pieces of
information they're really interested in. If some information is not
relevant to your bug, feel free to exclude it.
First run the ver_linux script included as scripts/ver_linux, which
reports the version of some important subsystems. Run this script with
the command ``awk -f scripts/ver_linux``.
Use that information to fill in all fields of the bug report form, and
post it to the mailing list with a subject of "PROBLEM: <one line
summary from [1.]>" for easy identification by the developers::
[1.] One line summary of the problem:
[2.] Full description of the problem/report:
[3.] Keywords (i.e., modules, networking, kernel):
[4.] Kernel information
[4.1.] Kernel version (from /proc/version):
[4.2.] Kernel .config file:
[5.] Most recent kernel version which did not have the bug:
[6.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/admin-guide/oops-tracing.rst)
[7.] A small shell script or example program which triggers the
problem (if possible)
[8.] Environment
[8.1.] Software (add the output of the ver_linux script here)
[8.2.] Processor information (from /proc/cpuinfo):
[8.3.] Module information (from /proc/modules):
[8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
[8.5.] PCI information ('lspci -vvv' as root)
[8.6.] SCSI information (from /proc/scsi/scsi)
[8.7.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):
[X.] Other notes, patches, fixes, workarounds:
Follow up
=========
Expectations for bug reporters
------------------------------
Linux kernel maintainers expect bug reporters to be able to follow up on
bug reports. That may include running new tests, applying patches,
recompiling your kernel, and/or re-triggering your bug. The most
frustrating thing for maintainers is for someone to report a bug, and then
never follow up on a request to try out a fix.
That said, it's still useful for a kernel maintainer to know a bug exists
on a supported kernel, even if you can't follow up with retests. Follow
up reports, such as replying to the email thread with "I tried the latest
kernel and I can't reproduce my bug anymore" are also helpful, because
maintainers have to assume silence means things are still broken.
Expectations for kernel maintainers
-----------------------------------
Linux kernel maintainers are busy, overworked human beings. Sometimes
they may not be able to address your bug in a day, a week, or two weeks.
If they don't answer your email, they may be on vacation, or at a Linux
conference. Check the conference schedule at https://LWN.net for more info:
https://lwn.net/Calendar/
In general, kernel maintainers take 1 to 5 business days to respond to
bugs. The majority of kernel maintainers are employed to work on the
kernel, and they may not work on the weekends. Maintainers are scattered
around the world, and they may not work in your time zone. Unless you
have a high priority bug, please wait at least a week after the first bug
report before sending the maintainer a reminder email.
The exceptions to this rule are regressions, kernel crashes, security holes,
or userspace breakage caused by new kernel behavior. Those bugs should be
addressed by the maintainers ASAP. If you suspect a maintainer is not
responding to these types of bugs in a timely manner (especially during a
merge window), escalate the bug to LKML and Linus Torvalds.
Thank you!
[Some of this is taken from Frohwalt Egerer's original linux-kernel FAQ]

.. _securitybugs:
Security bugs
=============
Linux kernel developers take security very seriously. As such, we'd
like to know when a security bug is found so that it can be fixed and
disclosed as quickly as possible. Please report security bugs to the
Linux kernel security team.
Contact
-------
The Linux kernel security team can be contacted by email at
<security@kernel.org>. This is a private list of security officers
who will help verify the bug report and develop and release a fix.
It is possible that the security team will bring in extra help from
area maintainers to understand and fix the security vulnerability.
As it is with any bug, the more information provided the easier it
will be to diagnose and fix. Please review the procedure outlined in
admin-guide/reporting-bugs.rst if you are unclear about what information is helpful.
Any exploit code is very helpful and will not be released without
consent from the reporter unless it has already been made public.
Disclosure
----------
The goal of the Linux kernel security team is to work with the
bug submitter to bug resolution as well as disclosure. We prefer
to fully disclose the bug as soon as possible. It is reasonable to
delay disclosure when the bug or the fix is not yet fully understood,
the solution is not well-tested or for vendor coordination. However, we
expect these delays to be short, measurable in days, not weeks or months.
A disclosure date is negotiated by the security team working with the
bug submitter as well as vendors. However, the kernel security team
holds the final say when setting a disclosure date. The timeframe for
disclosure is from immediate (esp. if it's already publicly known)
to a few weeks. As a basic default policy, we expect report date to
disclosure date to be on the order of 7 days.
Non-disclosure agreements
-------------------------
The Linux kernel security team is not a formal body and therefore unable
to enter any non-disclosure agreements.

.. _serial_console:
Linux Serial Console
====================
To use a serial port as console you need to compile the support into your
kernel - by default it is not compiled in. For PC style serial ports
it's the config option next to menu option:
:menuselection:`Character devices --> Serial drivers --> 8250/16550 and compatible serial support --> Console on 8250/16550 and compatible serial port`
You must compile serial support into the kernel and not as a module.
It is possible to specify multiple devices for console output. You can
define a new kernel command line option to select which device(s) to
use for console output.
The format of this option is::
console=device,options
device: tty0 for the foreground virtual console
ttyX for any other virtual console
ttySx for a serial port
lp0 for the first parallel port
ttyUSB0 for the first USB serial device
options: depend on the driver. For the serial port this
defines the baudrate/parity/bits/flow control of
the port, in the format BBBBPNF, where BBBB is the
speed, P is parity (n/o/e), N is number of bits,
and F is flow control ('r' for RTS). Default is
9600n8. The maximum baudrate is 115200.
You can specify multiple console= options on the kernel command line.
Output will appear on all of them. The last device will be used when
you open ``/dev/console``. So, for example::
console=ttyS1,9600 console=tty0
defines that opening ``/dev/console`` will get you the current foreground
virtual console, and kernel messages will appear on both the VGA
console and the 2nd serial port (ttyS1 or COM2) at 9600 baud.
Note that you can only define one console per device type (serial, video).
If no console device is specified, the first device found capable of
acting as a system console will be used. At this time, the system
first looks for a VGA card and then for a serial port. So if you don't
have a VGA card in your system the first serial port will automatically
become the console.
You will need to create a new device to use ``/dev/console``. The official
``/dev/console`` is now character device 5,1.
(You can also use a network device as a console. See
``Documentation/networking/netconsole.txt`` for information on that.)
Here's an example that will use ``/dev/ttyS1`` (COM2) as the console.
Replace the sample values as needed.
1. Create ``/dev/console`` (real console) and ``/dev/tty0`` (master virtual
console)::
cd /dev
rm -f console tty0
mknod -m 622 console c 5 1
mknod -m 622 tty0 c 4 0
2. LILO can also take input from a serial device. This is a very
useful option. To tell LILO to use the serial port:
In lilo.conf (global section)::
serial = 1,9600n8 (ttyS1, 9600 bd, no parity, 8 bits)
3. Adjust the kernel flags for the new kernel,
again in lilo.conf (kernel section)::
append = "console=ttyS1,9600"
4. Make sure a getty runs on the serial port so that you can login to
it once the system is done booting. This is done by adding a line
like this to ``/etc/inittab`` (exact syntax depends on your getty)::
S1:23:respawn:/sbin/getty -L ttyS1 9600 vt100
5. Init and ``/etc/ioctl.save``
Sysvinit remembers its stty settings in a file in ``/etc``, called
``/etc/ioctl.save``. REMOVE THIS FILE before using the serial
console for the first time, because otherwise init will probably
set the baudrate to 38400 (baudrate of the virtual console).
6. ``/dev/console`` and X
Programs that want to do something with the virtual console usually
open ``/dev/console``. If you have created the new ``/dev/console`` device,
and your console is NOT the virtual console some programs will fail.
Those are programs that want to access the VT interface, and use
``/dev/console`` instead of ``/dev/tty0``. Some of those programs are::
Xfree86, svgalib, gpm, SVGATextMode
It should be fixed in modern versions of these programs though.
Note that if you boot without a ``console=`` option (or with
``console=/dev/tty0``), ``/dev/console`` is the same as ``/dev/tty0``.
In that case everything will still work.
7. Thanks
Thanks to Geert Uytterhoeven <geert@linux-m68k.org>
for porting the patches from 2.1.4x to 2.1.6x and for taking care of
the integration of these patches into m68k, ppc and alpha.
Miquel van Smoorenburg <miquels@cistron.nl>, 11-Jun-2000

View File

@@ -0,0 +1,192 @@
Rules on how to access information in sysfs
===========================================
The kernel-exported sysfs exports internal kernel implementation details
and depends on internal kernel structures and layout. It is agreed upon
by the kernel developers that the Linux kernel does not provide a stable
internal API. Therefore, there are aspects of the sysfs interface that
may not be stable across kernel releases.
To minimize the risk that a new kernel release breaks users of sysfs,
which are in most cases low-level userspace applications, such users
must follow some rules and access this filesystem in as abstract a way
as possible. The current udev and HAL programs already
implement this and users are encouraged to plug, if possible, into the
abstractions these programs provide instead of accessing sysfs directly.
But if you really do want or need to access sysfs directly, please follow
the following rules and then your programs should work with future
versions of the sysfs interface.
- Do not use libsysfs
It makes assumptions about sysfs which are not true. Its API does not
offer any abstraction, it exposes all the kernel driver-core
implementation details in its own API. Therefore it is not better than
reading directories and opening the files yourself.
Also, it is not actively maintained, in the sense of reflecting the
current kernel development. The goal of providing a stable interface
to sysfs has failed; it causes more problems than it solves. It
violates many of the rules in this document.
- sysfs is always at ``/sys``
Parsing ``/proc/mounts`` is a waste of time. Other mount points are a
system configuration bug you should not try to solve. For test cases,
possibly support a ``SYSFS_PATH`` environment variable to override the
application's behavior, but never try to search for sysfs. Never try
to mount it, if you are not an early boot script.
- devices are only "devices"
There is no such thing as class, bus, or physical devices, interfaces,
and so on that you can rely on in userspace. Everything is
just simply a "device". Class-, bus-, physical, ... types are just
kernel implementation details which should not be expected by
applications that look for devices in sysfs.
The properties of a device are:
- devpath (``/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0``)
- identical to the DEVPATH value in the event sent from the kernel
at device creation and removal
- the unique key to the device at that point in time
- the kernel's path to the device directory without the leading
``/sys``, and always starting with a slash
- all elements of a devpath must be real directories. Symlinks
pointing to /sys/devices must always be resolved to their real
target and the target path must be used to access the device.
That way the devpath to the device matches the devpath of the
kernel used at event time.
- using or exposing symlink values as elements in a devpath string
is a bug in the application
- kernel name (``sda``, ``tty``, ``0000:00:1f.2``, ...)
- a directory name, identical to the last element of the devpath
- applications need to handle spaces and characters like ``!`` in
the name
- subsystem (``block``, ``tty``, ``pci``, ...)
- simple string, never a path or a link
- retrieved by reading the "subsystem"-link and using only the
last element of the target path
- driver (``tg3``, ``ata_piix``, ``uhci_hcd``)
- a simple string, which may contain spaces, never a path or a
link
- it is retrieved by reading the "driver"-link and using only the
last element of the target path
- devices which do not have "driver"-link just do not have a
driver; copying the driver value in a child device context is a
bug in the application
- attributes
- the files in the device directory or files below subdirectories
of the same device directory
- accessing attributes reached by a symlink pointing to another device,
like the "device"-link, is a bug in the application
Everything else is just a kernel driver-core implementation detail
that should not be assumed to be stable across kernel releases.
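As an illustration only (not part of the rules themselves), here is a
minimal shell sketch that derives the subsystem and driver of a device
exactly as described above, by resolving the links and keeping only the
last path element; the device path is just an example::

    # example devpath; substitute the device you are interested in
    dev=/sys/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0

    # subsystem: last element of the resolved "subsystem" link
    subsystem=$(basename "$(readlink -f "$dev/subsystem")")

    # driver: last element of the resolved "driver" link; a missing
    # link simply means the device has no driver
    if [ -e "$dev/driver" ]; then
        driver=$(basename "$(readlink -f "$dev/driver")")
    else
        driver=""
    fi

    echo "subsystem=$subsystem driver=$driver"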
- Properties of parent devices never belong in a child device.
Always look at the parent devices themselves for determining device
context properties. If the device ``eth0`` or ``sda`` does not have a
"driver"-link, then this device does not have a driver. Its value is empty.
Never copy any property of the parent-device into a child-device. Parent
device properties may change dynamically without any notice to the
child device.
- Hierarchy in a single device tree
There is only one valid place in sysfs where hierarchy can be examined
and this is below: ``/sys/devices``.
It is planned that all device directories will end up in the tree
below this directory.
- Classification by subsystem
There are currently three places for classification of devices:
``/sys/block``, ``/sys/class`` and ``/sys/bus``. It is planned that these will
not contain any device directories themselves, but only flat lists of
symlinks pointing to the unified ``/sys/devices`` tree.
All three places have completely different rules on how to access
device information. It is planned to merge all three
classification directories into one place at ``/sys/subsystem``,
following the layout of the bus directories. All buses and
classes, including the converted block subsystem, will show up
there.
The devices belonging to a subsystem will create a symlink in the
"devices" directory at ``/sys/subsystem/<name>/devices``,
If ``/sys/subsystem`` exists, ``/sys/bus``, ``/sys/class`` and ``/sys/block``
can be ignored. If it does not exist, you always have to scan all three
places, as the kernel is free to move a subsystem from one place to
the other, as long as the devices are still reachable by the same
subsystem name.
Assuming ``/sys/class/<subsystem>`` and ``/sys/bus/<subsystem>``, or
``/sys/block`` and ``/sys/class/block`` are not interchangeable is a bug in
the application.
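A hedged sketch of that lookup order, using ``net`` as an example
subsystem name::

    sub=net

    # prefer the unified location if it exists ...
    if [ -d /sys/subsystem ]; then
        ls "/sys/subsystem/$sub/devices"
    # ... otherwise the subsystem may live in any of the three places
    elif [ -d "/sys/bus/$sub/devices" ]; then
        ls "/sys/bus/$sub/devices"
    elif [ -d "/sys/class/$sub" ]; then
        ls "/sys/class/$sub"
    elif [ "$sub" = block ] && [ -d /sys/block ]; then
        ls /sys/block
    fi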
- Block
The converted block subsystem at ``/sys/class/block`` or
``/sys/subsystem/block`` will contain the links for disks and partitions
at the same level, never in a hierarchy. Assuming the block subsystem to
contain only disks and not partition devices in the same flat list is
a bug in the application.
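One way to tell the two apart without assuming any hierarchy is to look
at the attributes of each entry: partition devices carry a ``partition``
attribute, whole disks do not (a sketch, not part of the original
rules)::

    for dev in /sys/class/block/*; do
        if [ -e "$dev/partition" ]; then
            echo "partition: $(basename "$dev")"
        else
            echo "disk:      $(basename "$dev")"
        fi
    done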
- "device"-link and <subsystem>:<kernel name>-links
Never depend on the "device"-link. The "device"-link is a workaround
for the old layout, where class devices are not created in
``/sys/devices/`` like the bus devices. If the link-resolving of a
device directory does not end in ``/sys/devices/``, you can use the
"device"-link to find the parent devices in ``/sys/devices/``, That is the
single valid use of the "device"-link; it must never appear in any
path as an element. Assuming the existence of the "device"-link for
a device in ``/sys/devices/`` is a bug in the application.
Accessing ``/sys/class/net/eth0/device`` is a bug in the application.
Never depend on the class-specific links back to the ``/sys/class``
directory. These links are also a workaround for the design mistake
that class devices are not created in ``/sys/devices``. If a device
directory does not contain directories for child devices, these links
may be used to find the child devices in ``/sys/class``. That is the single
valid use of these links; they must never appear in any path as an
element. Assuming the existence of these links for devices which are
real child device directories in the ``/sys/devices`` tree is a bug in
the application.
It is planned to remove all these links when all class device
directories live in ``/sys/devices``.
- Position of devices along device chain can change.
Never depend on a specific parent device position in the devpath,
or the chain of parent devices. The kernel is free to insert devices into
the chain. You must always request the parent device you are looking for
by its subsystem value. You need to walk up the chain until you find
the device that matches the expected subsystem. Depending on a specific
position of a parent device or exposing relative paths using ``../`` to
access the chain of parents is a bug in the application.
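A minimal sketch of that walk, assuming you start from a network device
and are looking for its PCI ancestor (both names are placeholders)::

    # resolve the symlink first so we are working with the real devpath
    dev=$(readlink -f /sys/class/net/eth0)
    want=pci

    while [ -n "$dev" ] && [ "$dev" != /sys/devices ]; do
        sub=$(basename "$(readlink -f "$dev/subsystem" 2>/dev/null)")
        if [ "$sub" = "$want" ]; then
            echo "parent in subsystem $want: $dev"
            break
        fi
        # go one level up; never assume how many levels there are
        dev=$(dirname "$dev")
    done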
- When reading and writing sysfs device attribute files, avoid dependency
on specific error codes wherever possible. This minimizes coupling to
the error handling implementation within the kernel.
In general, failures to read or write sysfs device attributes shall
propagate errors wherever possible. Common errors include, but are not
limited to:
``-EIO``: The read or store operation is not supported, typically
returned by the sysfs system itself if the read or store pointer
is ``NULL``.
``-ENXIO``: The read or store operation failed.
Error codes will not be changed without good reason, and should a change
to error codes result in user-space breakage, it will be fixed, or the
offending change will be reverted.
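In practice that means a careful application only distinguishes success
from failure and reports whatever error it got, rather than switching on
a particular error code; a small sketch, with a made-up attribute path::

    attr=/sys/devices/example/some_attribute   # placeholder path

    if ! echo 1 > "$attr"; then
        # report the failure; do not try to interpret the exact errno
        echo "writing $attr failed" >&2
    fi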
Userspace applications can, however, expect the format and contents of
the attribute files to remain consistent in the absence of a version
attribute change in the context of a given attribute.

View File

@@ -0,0 +1,289 @@
Linux Magic System Request Key Hacks
====================================
Documentation for sysrq.c
What is the magic SysRq key?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is a 'magical' key combo you can hit which the kernel will respond to
regardless of whatever else it is doing, unless it is completely locked up.
How do I enable the magic SysRq key?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You need to say "yes" to 'Magic SysRq key (CONFIG_MAGIC_SYSRQ)' when
configuring the kernel. When running a kernel with SysRq compiled in,
/proc/sys/kernel/sysrq controls the functions allowed to be invoked via
the SysRq key. The default value in this file is set by the
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE config symbol, which itself defaults
to 1. Here is the list of possible values in /proc/sys/kernel/sysrq:
- 0 - disable sysrq completely
- 1 - enable all functions of sysrq
- >1 - bitmask of allowed sysrq functions (see below for detailed function
description)::
2 = 0x2 - enable control of console logging level
4 = 0x4 - enable control of keyboard (SAK, unraw)
8 = 0x8 - enable debugging dumps of processes etc.
16 = 0x10 - enable sync command
32 = 0x20 - enable remount read-only
64 = 0x40 - enable signalling of processes (term, kill, oom-kill)
128 = 0x80 - allow reboot/poweroff
256 = 0x100 - allow nicing of all RT tasks
You can set the value in the file by the following command::
echo "number" >/proc/sys/kernel/sysrq
The number may be written here either as decimal or as hexadecimal
with the 0x prefix. CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE must always be
written in hexadecimal.
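For example, to allow only the sync, remount read-only and
reboot/poweroff functions you would add up the corresponding bits
(16 + 32 + 128 = 176, i.e. 0xb0)::

    # decimal ...
    echo 176 > /proc/sys/kernel/sysrq

    # ... or the same mask written as hexadecimal
    echo 0xb0 > /proc/sys/kernel/sysrq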
Note that the value of ``/proc/sys/kernel/sysrq`` influences only the invocation
via a keyboard. Invocation of any operation via ``/proc/sysrq-trigger`` is
always allowed (by a user with admin privileges).
How do I use the magic SysRq key?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
On x86 - You press the key combo :kbd:`ALT-SysRq-<command key>`.
.. note::
Some
keyboards may not have a key labeled 'SysRq'. The 'SysRq' key is
also known as the 'Print Screen' key. Also some keyboards cannot
handle so many keys being pressed at the same time, so you might
have better luck with: press :kbd:`Alt`, press :kbd:`SysRq`,
release :kbd:`SysRq`, press :kbd:`<command key>`, release everything.
On SPARC - You press :kbd:`ALT-STOP-<command key>`, I believe.
On the serial console (PC style standard serial ports only)
You send a ``BREAK``, then within 5 seconds a command key. Sending
``BREAK`` twice is interpreted as a normal BREAK.
On PowerPC
Press :kbd:`ALT - Print Screen` (or :kbd:`F13`) - :kbd:`<command key>`,
:kbd:`Print Screen` (or :kbd:`F13`) - :kbd:`<command key>` may suffice.
On other
If you know of the key combos for other architectures, please
let me know so I can add them to this section.
On all
write a character to /proc/sysrq-trigger. e.g.::
echo t > /proc/sysrq-trigger
What are the 'command' keys?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
=========== ===================================================================
Command Function
=========== ===================================================================
``b`` Will immediately reboot the system without syncing or unmounting
your disks.
``c`` Will perform a system crash by a NULL pointer dereference.
A crashdump will be taken if configured.
``d`` Shows all locks that are held.
``e`` Send a SIGTERM to all processes, except for init.
``f`` Will call the oom killer to kill a memory hog process, but do not
panic if nothing can be killed.
``g`` Used by kgdb (kernel debugger)
``h`` Will display help (actually any other key than those listed
here will display help, but ``h`` is easy to remember :-)
``i`` Send a SIGKILL to all processes, except for init.
``j`` Forcibly "Just thaw it" - filesystems frozen by the FIFREEZE ioctl.
``k`` Secure Access Key (SAK) Kills all programs on the current virtual
console. NOTE: See important comments below in SAK section.
``l`` Shows a stack backtrace for all active CPUs.
``m`` Will dump current memory info to your console.
``n`` Used to make RT tasks nice-able
``o`` Will shut your system off (if configured and supported).
``p`` Will dump the current registers and flags to your console.
``q`` Will dump per CPU lists of all armed hrtimers (but NOT regular
timer_list timers) and detailed information about all
clockevent devices.
``r`` Turns off keyboard raw mode and sets it to XLATE.
``s`` Will attempt to sync all mounted filesystems.
``t`` Will dump a list of current tasks and their information to your
console.
``u`` Will attempt to remount all mounted filesystems read-only.
``v`` Forcefully restores framebuffer console
``v`` Causes ETM buffer dump [ARM-specific]
``w`` Dumps tasks that are in uninterruptible (blocked) state.
``x`` Used by xmon interface on ppc/powerpc platforms.
Show global PMU Registers on sparc64.
Dump all TLB entries on MIPS.
``y`` Show global CPU Registers [SPARC-64 specific]
``z`` Dump the ftrace buffer
``0``-``9`` Sets the console log level, controlling which kernel messages
will be printed to your console. (``0``, for example would make
it so that only emergency messages like PANICs or OOPSes would
make it to your console.)
=========== ===================================================================
Okay, so what can I use them for?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Well, unraw(r) is very handy when your X server or a svgalib program crashes.
sak(k) (Secure Access Key) is useful when you want to be sure there is no
trojan program running at console which could grab your password
when you would try to login. It will kill all programs on given console,
thus letting you make sure that the login prompt you see is actually
the one from init, not some trojan program.
.. important::
In its true form it is not a true SAK like the one in a
C2 compliant system, and it should not be mistaken for
such.
It seems others find it useful as a System Attention Key, which is
useful when you want to exit a program that will not let you switch consoles.
(For example, X or a svgalib program.)
``reboot(b)`` is good when you're unable to shut down. But you should also
``sync(s)`` and ``umount(u)`` first.
``crash(c)`` can be used to manually trigger a crashdump when the system is hung.
Note that this just triggers a crash if there is no dump mechanism available.
``sync(s)`` is great when your system is locked up: it allows you to sync your
disks and will certainly lessen the chance of data loss and fscking. Note
that the sync hasn't taken place until you see the "OK" and "Done" appear
on the screen. (If the kernel is really in strife, you may not ever get the
OK or Done message...)
``umount(u)`` is basically useful in the same ways as ``sync(s)``. I generally
``sync(s)``, ``umount(u)``, then ``reboot(b)`` when my system locks. It's saved
me many a fsck. Again, the unmount (remount read-only) hasn't taken place until
you see the "OK" and "Done" message appear on the screen.
The loglevels ``0``-``9`` are useful when your console is being flooded with
kernel messages you do not want to see. Selecting ``0`` will prevent all but
the most urgent kernel messages from reaching your console. (They will
still be logged if syslogd/klogd are alive, though.)
``term(e)`` and ``kill(i)`` are useful if you have some sort of runaway process
you are unable to kill any other way, especially if it's spawning other
processes.
"just thaw ``it(j)``" is useful if your system becomes unresponsive due to a
frozen (probably root) filesystem via the FIFREEZE ioctl.
Sometimes SysRq seems to get 'stuck' after using it, what can I do?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
That happens to me, also. I've found that tapping shift, alt, and control
on both sides of the keyboard, and hitting an invalid sysrq sequence again
will fix the problem. (i.e., something like :kbd:`alt-sysrq-z`). Switching to
another virtual console (:kbd:`ALT+Fn`) and then back again should also help.
I hit SysRq, but nothing seems to happen, what's wrong?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are some keyboards that produce a different keycode for SysRq than the
pre-defined value of 99 (see ``KEY_SYSRQ`` in ``include/linux/input.h``), or
which don't have a SysRq key at all. In these cases, run ``showkey -s`` to find
an appropriate scancode sequence, and use ``setkeycodes <sequence> 99`` to map
this sequence to the usual SysRq code (e.g., ``setkeycodes e05b 99``). It's
probably best to put this command in a boot script. Oh, and by the way, you
exit ``showkey`` by not typing anything for ten seconds.
I want to add SysRQ key events to a module, how does it work?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to register a basic function with the table, you must first include
the header ``include/linux/sysrq.h``, this will define everything else you need.
Next, you must create a ``sysrq_key_op`` struct, and populate it with A) the key
handler function you will use, B) a help_msg string, that will print when SysRQ
prints help, and C) an action_msg string, that will print right before your
handler is called. Your handler must conform to the prototype in 'sysrq.h'.
After the ``sysrq_key_op`` is created, you can call the kernel function
``register_sysrq_key(int key, struct sysrq_key_op *op_p);`` this will
register the operation pointed to by ``op_p`` at table key 'key',
if that slot in the table is blank. At module unload time, you must call
the function ``unregister_sysrq_key(int key, struct sysrq_key_op *op_p)``, which
will remove the key op pointed to by 'op_p' from the key 'key', if and only if
it is currently registered in that slot. This is in case the slot has been
overwritten since you registered it.
The Magic SysRQ system works by registering key operations against a key op
lookup table, which is defined in 'drivers/tty/sysrq.c'. This key table has
a number of operations registered into it at compile time, but is mutable,
and 2 functions are exported for interface to it::
register_sysrq_key and unregister_sysrq_key.
Of course, never ever leave an invalid pointer in the table. I.e., when
your module that called register_sysrq_key() exits, it must call
unregister_sysrq_key() to clean up the sysrq key table entry that it used.
Null pointers in the table are always safe. :)
If for some reason you feel the need to call the handle_sysrq function from
within a function called by handle_sysrq, you must be aware that you are in
a lock (you are also in an interrupt handler, which means don't sleep!), so
you must call ``__handle_sysrq_nolock`` instead.
When I hit a SysRq key combination only the header appears on the console?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sysrq output is subject to the same console loglevel control as all
other console output. This means that if the kernel was booted 'quiet'
as is common on distro kernels, the output may not appear on the actual
console, even though it will appear in the dmesg buffer, and be accessible
via the dmesg command and to the consumers of ``/proc/kmsg``. As a specific
exception the header line from the sysrq command is passed to all console
consumers as if the current loglevel was maximum. If only the header
is emitted it is almost certain that the kernel loglevel is too low.
Should you require the output on the console channel then you will need
to temporarily up the console loglevel using :kbd:`alt-sysrq-8` or::
echo 8 > /proc/sysrq-trigger
Remember to return the loglevel to normal after triggering the sysrq
command you are interested in.
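For example, if your usual console loglevel is 4, the following brings
it back down once you are done (4 is only an example value)::

    echo 4 > /proc/sysrq-trigger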
I have more questions, who can I ask?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Just ask them on the linux-kernel mailing list:
linux-kernel@vger.kernel.org
Credits
~~~~~~~
Written by Mydraal <vulpyne@vulpyne.net>
Updated by Adam Sulmicki <adam@cfar.umd.edu>
Updated by Jeremy M. Dolan <jmd@turbogeek.org> 2001/01/28 10:15:59
Added to by Crutcher Dunnavant <crutcher+kernel@datastacks.com>

View File

@@ -0,0 +1,59 @@
Tainted kernels
---------------
Some oops reports contain the string **'Tainted: '** after the program
counter. This indicates that the kernel has been tainted by some
mechanism. The string is followed by a series of position-sensitive
characters, each representing a particular tainted value.
1) ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
any proprietary module has been loaded. Modules without a
MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
insmod as GPL compatible are assumed to be proprietary.
2) ``F`` if any module was force loaded by ``insmod -f``, ``' '`` if all
modules were loaded normally.
3) ``S`` if the oops occurred on an SMP kernel running on hardware that
hasn't been certified as safe to run multiprocessor.
Currently this occurs only on various Athlons that are not
SMP capable.
4) ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all
modules were unloaded normally.
5) ``M`` if any processor has reported a Machine Check Exception,
``' '`` if no Machine Check Exceptions have occurred.
6) ``B`` if a page-release function has found a bad page reference or
some unexpected page flags.
7) ``U`` if a user or user application specifically requested that the
Tainted flag be set, ``' '`` otherwise.
8) ``D`` if the kernel has died recently, i.e. there was an OOPS or BUG.
9) ``A`` if the ACPI table has been overridden.
10) ``W`` if a warning has previously been issued by the kernel.
(Though some warnings may set more specific taint flags.)
11) ``C`` if a staging driver has been loaded.
12) ``I`` if the kernel is working around a severe bug in the platform
firmware (BIOS or similar).
13) ``O`` if an externally-built ("out-of-tree") module has been loaded.
14) ``E`` if an unsigned module has been loaded in a kernel supporting
module signature.
15) ``L`` if a soft lockup has previously occurred on the system.
16) ``K`` if the kernel has been live patched.
The primary reason for the **'Tainted: '** string is to tell kernel
debuggers if this is a clean kernel or if anything unusual has
occurred. Tainting is permanent: even if an offending module is
unloaded, the tainted value remains to indicate that the kernel is not
trustworthy.
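The same flags are also exported as a bitmask in
``/proc/sys/kernel/tainted``, where entry N in the list above
corresponds to bit N-1. A small shell sketch checking a few of them
(the bit mapping is stated here as an aside, not taken from the text
above)::

    t=$(cat /proc/sys/kernel/tainted)

    # entry 1 (proprietary module) is bit 0, entry 10 (warning) is
    # bit 9, entry 13 (out-of-tree module) is bit 12, and so on
    [ $(( t & (1 << 0) ))  -ne 0 ] && echo "P: proprietary module loaded"
    [ $(( t & (1 << 9) ))  -ne 0 ] && echo "W: kernel warning occurred"
    [ $(( t & (1 << 12) )) -ne 0 ] && echo "O: out-of-tree module loaded"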

View File

@@ -0,0 +1,189 @@
Unicode support
===============
Last update: 2005-01-17, version 1.4
This file is maintained by H. Peter Anvin <unicode@lanana.org> as part
of the Linux Assigned Names And Numbers Authority (LANANA) project.
The current version can be found at:
http://www.lanana.org/docs/unicode/admin-guide/unicode.rst
Introduction
------------
The Linux kernel code has been rewritten to use Unicode to map
characters to fonts. By downloading a single Unicode-to-font table,
both the eight-bit character sets and UTF-8 mode are changed to use
the font as indicated.
This changes the semantics of the eight-bit character tables subtly.
The four character tables are now:
=============== =============================== ================
Map symbol Map name Escape code (G0)
=============== =============================== ================
LAT1_MAP Latin-1 (ISO 8859-1) ESC ( B
GRAF_MAP DEC VT100 pseudographics ESC ( 0
IBMPC_MAP IBM code page 437 ESC ( U
USER_MAP User defined ESC ( K
=============== =============================== ================
In particular, ESC ( U is no longer "straight to font", since the font
might be completely different from the IBM character set. This
permits for example the use of block graphics even with a Latin-1 font
loaded.
Note that although these codes are similar to ISO 2022, neither the
codes nor their uses match ISO 2022; Linux has two 8-bit codes (G0 and
G1), whereas ISO 2022 has four 7-bit codes (G0-G3).
In accordance with the Unicode standard/ISO 10646 the range U+F000 to
U+F8FF has been reserved for OS-wide allocation (the Unicode Standard
refers to this as a "Corporate Zone"; since this is inaccurate for
Linux, we call it the "Linux Zone"). U+F000 was picked as the starting
point since it lets the direct-mapping area start on a large power of
two (in case 1024- or 2048-character fonts ever become necessary).
This leaves U+E000 to U+EFFF as End User Zone.
[v1.2]: The Unicode range from U+F000 up to U+F7FF has been
hard-coded to map directly to the loaded font, bypassing the
translation table. The user-defined map now defaults to U+F000 to
U+F0FF, emulating the previous behaviour. In practice, this range
might be shorter; for example, vgacon can only handle 256-character
(U+F000..U+F0FF) or 512-character (U+F000..U+F1FF) fonts.
Actual characters assigned in the Linux Zone
--------------------------------------------
In addition, the following characters not present in Unicode 1.1.4
have been defined; these are used by the DEC VT graphics map. [v1.2]
THIS USE IS OBSOLETE AND SHOULD NO LONGER BE USED; PLEASE SEE BELOW.
====== ======================================
U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1
U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3
U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7
U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9
====== ======================================
The DEC VT220 uses a 6x10 character matrix, and these characters form
a smooth progression in the DEC VT graphics character set. I have
omitted the scan 5 line, since it is also used as a block-graphics
character, and hence has been coded as U+2500 FORMS LIGHT HORIZONTAL.
[v1.3]: These characters have been officially added to Unicode 3.2.0;
they are added at U+23BA, U+23BB, U+23BC, U+23BD. Linux now uses the
new values.
[v1.2]: The following characters have been added to represent common
keyboard symbols that are unlikely to ever be added to Unicode proper
since they are horribly vendor-specific. This, of course, is an
excellent example of horrible design.
====== ======================================
U+F810 KEYBOARD SYMBOL FLYING FLAG
U+F811 KEYBOARD SYMBOL PULLDOWN MENU
U+F812 KEYBOARD SYMBOL OPEN APPLE
U+F813 KEYBOARD SYMBOL SOLID APPLE
====== ======================================
Klingon language support
------------------------
In 1996, Linux was the first operating system in the world to add
support for the artificial language Klingon, created by Marc Okrand
for the "Star Trek" television series. This encoding was later
adopted by the ConScript Unicode Registry and proposed (but ultimately
rejected) for inclusion in Unicode Plane 1. Thus, it remains as a
Linux/CSUR private assignment in the Linux Zone.
This encoding has been endorsed by the Klingon Language Institute.
For more information, contact them at:
http://www.kli.org/
Since the characters in the beginning of the Linux CZ have been more
of the dingbats/symbols/forms type and this is a language, I have
located it at the end, on a 16-cell boundary in keeping with standard
Unicode practice.
.. note::
This range is now officially managed by the ConScript Unicode
Registry. The normative reference is at:
http://www.evertype.com/standards/csur/klingon.html
Klingon has an alphabet of 26 characters, a positional numeric writing
system with 10 digits, and is written left-to-right, top-to-bottom.
Several glyph forms for the Klingon alphabet have been proposed.
However, since the set of symbols appear to be consistent throughout,
with only the actual shapes being different, in keeping with standard
Unicode practice these differences are considered font variants.
====== =======================================================
U+F8D0 KLINGON LETTER A
U+F8D1 KLINGON LETTER B
U+F8D2 KLINGON LETTER CH
U+F8D3 KLINGON LETTER D
U+F8D4 KLINGON LETTER E
U+F8D5 KLINGON LETTER GH
U+F8D6 KLINGON LETTER H
U+F8D7 KLINGON LETTER I
U+F8D8 KLINGON LETTER J
U+F8D9 KLINGON LETTER L
U+F8DA KLINGON LETTER M
U+F8DB KLINGON LETTER N
U+F8DC KLINGON LETTER NG
U+F8DD KLINGON LETTER O
U+F8DE KLINGON LETTER P
U+F8DF KLINGON LETTER Q
- Written <q> in standard Okrand Latin transliteration
U+F8E0 KLINGON LETTER QH
- Written <Q> in standard Okrand Latin transliteration
U+F8E1 KLINGON LETTER R
U+F8E2 KLINGON LETTER S
U+F8E3 KLINGON LETTER T
U+F8E4 KLINGON LETTER TLH
U+F8E5 KLINGON LETTER U
U+F8E6 KLINGON LETTER V
U+F8E7 KLINGON LETTER W
U+F8E8 KLINGON LETTER Y
U+F8E9 KLINGON LETTER GLOTTAL STOP
U+F8F0 KLINGON DIGIT ZERO
U+F8F1 KLINGON DIGIT ONE
U+F8F2 KLINGON DIGIT TWO
U+F8F3 KLINGON DIGIT THREE
U+F8F4 KLINGON DIGIT FOUR
U+F8F5 KLINGON DIGIT FIVE
U+F8F6 KLINGON DIGIT SIX
U+F8F7 KLINGON DIGIT SEVEN
U+F8F8 KLINGON DIGIT EIGHT
U+F8F9 KLINGON DIGIT NINE
U+F8FD KLINGON COMMA
U+F8FE KLINGON FULL STOP
U+F8FF KLINGON SYMBOL FOR EMPIRE
====== =======================================================
Other Fictional and Artificial Scripts
--------------------------------------
Since the assignment of the Klingon Linux Unicode block, a registry of
fictional and artificial scripts has been established by John Cowan
<jcowan@reutershealth.com> and Michael Everson <everson@evertype.com>.
The ConScript Unicode Registry is accessible at:
http://www.evertype.com/standards/csur/
The ranges used fall at the low end of the End User Zone and hence
cannot be normatively assigned, but it is recommended that people who
wish to encode fictional scripts use these codes, in the interest of
interoperability. For Klingon, CSUR has adopted the Linux encoding.
The CSUR people are driving adding Tengwar and Cirth into Unicode
Plane 1; the addition of Klingon to Unicode Plane 1 has been rejected
and so the above encoding remains official.

View File

@@ -0,0 +1,66 @@
Software cursor for VGA
=======================
by Pavel Machek <pavel@atrey.karlin.mff.cuni.cz>
and Martin Mares <mj@atrey.karlin.mff.cuni.cz>
Linux now has some ability to manipulate cursor appearance. Normally, you
can set the size of the hardware cursor (and also work around some ugly bugs
in those miserable Trident cards [#f1]_). You can now play a few new tricks:
you can make your cursor look
like a non-blinking red block, make it show the inverse of the background of
the character it's over or highlight that character, and still choose whether the original
hardware cursor should remain visible or not. There may be other things I have
never thought of.
The cursor appearance is controlled by a ``<ESC>[?1;2;3c`` escape sequence
where 1, 2 and 3 are parameters described below. If you omit any of them,
they will default to zeroes.
first Parameter
specifies cursor size::
0=default
1=invisible
2=underline
...
8=full block
+ 16 if you want the software cursor to be applied
+ 32 if you want to always change the background color
+ 64 if you dislike having the background the same as the
foreground.
Highlights are ignored for the last two flags.
second parameter
selects character attribute bits you want to change
(by simply XORing them with the value of this parameter). On standard
VGA, the high four bits specify background and the low four the
foreground. In both groups, low three bits set color (as in normal
color codes used by the console) and the most significant one turns
on highlight (or sometimes blinking -- it depends on the configuration
of your VGA).
third parameter
consists of character attribute bits you want to set.
Bit setting takes place before bit toggling, so you can simply clear a
bit by including it in both the set mask and the toggle mask.
.. [#f1] see ``#define TRIDENT_GLITCH`` in ``drivers/video/vgacon.c``.
Examples
--------
To get normal blinking underline, use::
echo -e '\033[?2c'
To get blinking block, use::
echo -e '\033[?6c'
To get red non-blinking block, use::
echo -e '\033[?17;0;64c'
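To go back to the default hardware cursor (all parameters zero, which by
the description above means default size and no attribute changes), the
following should work::

    echo -e '\033[?0c'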