Merge tag 'docs-4.10' of git://git.lwn.net/linux

Pull documentation update from Jonathan Corbet:
 "These are the documentation changes for 4.10.

  It's another busy cycle for the docs tree, as the sphinx conversion
  continues. Highlights include:

   - Further work on PDF output, which remains a bit of a pain but
     should be more solid now.

   - Five more DocBook template files converted to Sphinx. Only 27 to
     go... Lots of plain-text files have also been converted and
     integrated.

   - Images in binary formats have been replaced with more
     source-friendly versions.

   - Various bits of organizational work, including the renaming of
     various files discussed at the kernel summit.

   - New documentation for the device_link mechanism.

  ... and, of course, lots of typo fixes and small updates"

* tag 'docs-4.10' of git://git.lwn.net/linux: (193 commits)
  dma-buf: Extract dma-buf.rst
  Update Documentation/00-INDEX
  docs: 00-INDEX: document directories/files with no docs
  docs: 00-INDEX: remove non-existing entries
  docs: 00-INDEX: add missing entries for documentation files/dirs
  docs: 00-INDEX: consolidate process/ and admin-guide/ description
  scripts: add a script to check if Documentation/00-INDEX is sane
  Docs: change sh -> awk in REPORTING-BUGS
  Documentation/core-api/device_link: Add initial documentation
  core-api: remove an unexpected unident
  ppc/idle: Add documentation for powersave=off
  Doc: Correct typo, "Introdution" => "Introduction"
  Documentation/atomic_ops.txt: convert to ReST markup
  Documentation/local_ops.txt: convert to ReST markup
  Documentation/assoc_array.txt: convert to ReST markup
  docs-rst: parse-headers.pl: cleanup the documentation
  docs-rst: fix media cleandocs target
  docs-rst: media/Makefile: reorganize the rules
  docs-rst: media: build SVG from graphviz files
  docs-rst: replace bayer.png by a SVG image
  ...
@@ -14,13 +14,8 @@ Following translations are available on the WWW:
  - this file.
  ABI/
  - info on kernel <-> userspace ABI and relative interface stability.
-
- BUG-HUNTING
- - brute force method of doing binary search of patches to find bug.
- Changes
- - list of changes that break older software packages.
  CodingStyle
- - how the maintainers expect the C code in the kernel to look.
+ - nothing here, just a pointer to process/coding-style.rst.
  DMA-API.txt
  - DMA API, pci_ API & extensions for non-consistent memory machines.
  DMA-API-HOWTO.txt
@@ -33,8 +28,6 @@ DocBook/
  - directory with DocBook templates etc. for kernel documentation.
  EDID/
  - directory with info on customizing EDID for broken gfx/displays.
- HOWTO
- - the process and procedures of how to do Linux kernel development.
  IPMI.txt
  - info on Linux Intelligent Platform Management Interface (IPMI) Driver.
  IRQ-affinity.txt
@@ -46,62 +39,43 @@ IRQ.txt
  Intel-IOMMU.txt
  - basic info on the Intel IOMMU virtualization support.
  Makefile
- - This file does nothing. Removing it breaks make htmldocs and
- make distclean.
- ManagementStyle
- - how to (attempt to) manage kernel hackers.
+ - It's not of interest for those who aren't touching the build system.
+ Makefile.sphinx
+ - It's not of interest for those who aren't touching the build system.
+ PCI/
+ - info related to PCI drivers.
  RCU/
  - directory with info on RCU (read-copy update).
  SAK.txt
  - info on Secure Attention Keys.
  SM501.txt
  - Silicon Motion SM501 multimedia companion chip
- SecurityBugs
- - procedure for reporting security bugs found in the kernel.
- SubmitChecklist
- - Linux kernel patch submission checklist.
- SubmittingDrivers
- - procedure to get a new driver source included into the kernel tree.
  SubmittingPatches
- - procedure to get a source patch included into the kernel tree.
- VGA-softcursor.txt
- - how to change your VGA cursor from a blinking underscore.
+ - nothing here, just a pointer to process/coding-style.rst.
  accounting/
  - documentation on accounting and taskstats.
  acpi/
  - info on ACPI-specific hooks in the kernel.
+ admin-guide/
+ - info related to Linux users and system admins.
  aoe/
  - description of AoE (ATA over Ethernet) along with config examples.
- applying-patches.txt
- - description of various trees and how to apply their patches.
  arm/
  - directory with info about Linux on the ARM architecture.
  arm64/
  - directory with info about Linux on the 64 bit ARM architecture.
- assoc_array.txt
- - generic associative array intro.
- atomic_ops.txt
- - semantics and behavior of atomic and bitmask operations.
  auxdisplay/
  - misc. LCD driver documentation (cfag12864b, ks0108).
  backlight/
  - directory with info on controlling backlights in flat panel displays
- bad_memory.txt
- - how to use kernel parameters to exclude bad RAM regions.
- basic_profiling.txt
- - basic instructions for those who wants to profile Linux kernel.
  bcache.txt
  - Block-layer cache on fast SSDs to improve slow (raid) I/O performance.
- binfmt_misc.txt
- - info on the kernel support for extra binary formats.
  blackfin/
  - directory with documentation for the Blackfin arch.
  block/
  - info on the Block I/O (BIO) layer.
  blockdev/
  - info on block devices & drivers
- braille-console.txt
- - info on how to use serial devices for Braille support.
  bt8xxgpio.txt
  - info on how to modify a bt8xx video card for GPIO usage.
  btmrvl.txt
@@ -114,18 +88,24 @@ cachetlb.txt
  - describes the cache/TLB flushing interfaces Linux uses.
  cdrom/
  - directory with information on the CD-ROM drivers that Linux has.
- cgroups/
- - cgroups features, including cpusets and memory controller.
+ cgroup-v1/
+ - cgroups v1 features, including cpusets and memory controller.
+ cgroup-v2.txt
+ - cgroups v2 features, including cpusets and memory controller.
  circular-buffers.txt
  - how to make use of the existing circular buffer infrastructure
  clk.txt
  - info on the common clock framework
- coccinelle.txt
- - info on how to get and use the Coccinelle code checking tool.
+ cma/
+ - Continuous Memory Area (CMA) debugfs interface.
+ conf.py
+ - It's not of interest for those who aren't touching the build system.
  connector/
  - docs on the netlink based userspace<->kernel space communication mod.
  console/
  - documentation on Linux console drivers.
+ core-api/
+ - documentation on kernel core components.
  cpu-freq/
  - info on CPU frequency and voltage scaling.
  cpu-hotplug.txt
@@ -150,26 +130,26 @@ debugging-via-ohci1394.txt
  - how to use firewire like a hardware debugger memory reader.
  dell_rbu.txt
  - document demonstrating the use of the Dell Remote BIOS Update driver.
- development-process/
- - how to work with the mainline kernel development process.
+ dev-tools/
+ - directory with info on development tools for the kernel.
  device-mapper/
  - directory with info on Device Mapper.
- devices.txt
- - plain ASCII listing of all the nodes in /dev/ with major minor #'s.
+ dmaengine/
+ - the DMA engine and controller API guides.
  devicetree/
  - directory with info on device tree files used by OF/PowerPC/ARM
  digsig.txt
  -info on the Digital Signature Verification API
  dma-buf-sharing.txt
  - the DMA Buffer Sharing API Guide
+ docutils.conf
+ - nothing here. Just a configuration file for docutils.
  dontdiff
  - file containing a list of files that should never be diff'ed.
+ driver-api/
+ - the Linux driver implementer's API guide.
  driver-model/
  - directory with info about Linux driver model.
- dvb/
- - info on Linux Digital Video Broadcast (DVB) subsystem.
- dynamic-debug-howto.txt
- - how to use the dynamic debug (dyndbg) feature.
  early-userspace/
  - info about initramfs, klibc, and userspace early during boot.
  edac.txt
@@ -178,14 +158,16 @@ efi-stub.txt
  - How to use the EFI boot stub to bypass GRUB or elilo on EFI systems.
  eisa.txt
  - info on EISA bus support.
- email-clients.txt
- - info on how to use e-mail to send un-mangled (git) patches.
  extcon/
  - directory with porting guide for Android kernel switch driver.
+ isa.txt
+ - info on EISA bus support.
  fault-injection/
  - dir with docs about the fault injection capabilities infrastructure.
  fb/
  - directory with info on the frame buffer graphics abstraction layer.
+ features/
+ - status of feature implementation on different architectures.
  filesystems/
  - info on the vfs and the various filesystems that Linux supports.
  firmware_class/
@@ -194,20 +176,22 @@ flexible-arrays.txt
  - how to make use of flexible sized arrays in linux
  fmc/
  - information about the FMC bus abstraction
+ fpga/
+ - FPGA Manager Core.
  frv/
  - Fujitsu FR-V Linux documentation.
  futex-requeue-pi.txt
  - info on requeueing of tasks from a non-PI futex to a PI futex
- gcov.txt
- - use of GCC's coverage testing tool "gcov" with the Linux kernel
+ gcc-plugins.txt
+ - GCC plugin infrastructure.
  gpio/
  - gpio related documentation
+ gpu/
+ - directory with information on GPU driver developer's guide.
  hid/
  - directory with information on human interface devices
  highuid.txt
  - notes on the change from 16 bit to 32 bit user/group IDs.
- hsi.txt
- - HSI subsystem overview.
  hwspinlock.txt
  - hardware spinlock provides hardware assistance for synchronization
  timers/
@@ -218,18 +202,18 @@ hwmon/
  - directory with docs on various hardware monitoring drivers.
  i2c/
  - directory with info about the I2C bus/protocol (2 wire, kHz speed).
- i2o/
- - directory with info about the Linux I2O subsystem.
  x86/i386/
  - directory with info about Linux on Intel 32 bit architecture.
  ia64/
  - directory with info about Linux on Intel 64 bit architecture.
+ ide/
+ - Information regarding the Enhanced IDE drive.
+ iio/
+ - info on industrial IIO configfs support.
+ index.rst
+ - main index for the documentation at ReST format.
  infiniband/
  - directory with documents concerning Linux InfiniBand support.
- init.txt
- - what to do when the kernel can't find the 1st process to run.
- initrd.txt
- - how to use the RAM disk as an initial/temporary root filesystem.
  input/
  - info on Linux input device support.
  intel_txt.txt
@@ -248,28 +232,16 @@ isapnp.txt
  - info on Linux ISA Plug & Play support.
  isdn/
  - directory with info on the Linux ISDN support, and supported cards.
- java.txt
- - info on the in-kernel binary support for Java(tm).
- ja_JP/
- - directory with Japanese translations of various documents
  kbuild/
  - directory with info about the kernel build process.
+ kernel-doc-nano-HOWTO.txt
+ - outdated info about kernel-doc documentation.
  kdump/
  - directory with mini HowTo on getting the crash dump code to work.
- kernel-docs.txt
- - listing of various WWW + books that document kernel internals.
- kernel-documentation.rst
+ doc-guide/
  - how to write and format reStructuredText kernel documentation
- kernel-parameters.txt
- - summary listing of command line / boot prompt args for the kernel.
  kernel-per-CPU-kthreads.txt
  - List of all per-CPU kthreads and how they introduce jitter.
- kmemcheck.txt
- - info on dynamic checker that detects uses of uninitialized memory.
- kmemleak.txt
- - info on how to make use of the kernel memory leak detection system
- ko_KR/
- - directory with Korean translations of various documents
  kobject.txt
  - info of the kobject infrastructure of the Linux kernel.
  kprobes.txt
@@ -284,8 +256,8 @@ ldm.txt
  - a brief description of LDM (Windows Dynamic Disks).
  leds/
  - directory with info about LED handling under Linux.
- local_ops.txt
- - semantics and behavior of local atomic operations.
+ livepatch/
+ - info on kernel live patching.
  locking/
  - directory with info about kernel locking primitives
  lockup-watchdogs.txt
@@ -298,22 +270,24 @@ lzo.txt
  - kernel LZO decompressor input formats
  m68k/
  - directory with info about Linux on Motorola 68k architecture.
- magic-number.txt
- - list of magic numbers used to mark/protect kernel data structures.
  mailbox.txt
  - How to write drivers for the common mailbox framework (IPC).
- md.txt
- - info on boot arguments for the multiple devices driver.
- media-framework.txt
- - info on media framework, its data structures, functions and usage.
+ md-cluster.txt
+ - info on shared-device RAID MD cluster.
+ media/
+ - info on media drivers: uAPI, kAPI and driver documentation.
  memory-barriers.txt
  - info on Linux kernel memory barriers.
  memory-devices/
  - directory with info on parts like the Texas Instruments EMIF driver
  memory-hotplug.txt
  - Hotpluggable memory support, how to use and current status.
+ men-chameleon-bus.txt
+ - info on MEN chameleon bus.
  metag/
  - directory with info about Linux on Meta architecture.
+ mic/
+ - Intel Many Integrated Core (MIC) architecture device driver.
  mips/
  - directory with info about Linux on MIPS architecture.
  misc-devices/
@@ -322,12 +296,8 @@ mmc/
  - directory with info about the MMC subsystem
  mn10300/
  - directory with info about the mn10300 architecture port
- module-signing.txt
- - Kernel module signing for increased security when loading modules.
  mtd/
  - directory with info about memory technology devices (flash)
- mono.txt
- - how to execute Mono-based .NET binaries with the help of BINFMT_MISC.
  namespaces/
  - directory with various information about namespaces
  netlabel/
@@ -336,30 +306,42 @@ networking/
  - directory with info on various aspects of networking with Linux.
  nfc/
  - directory relating info about Near Field Communications support.
+ nios2/
+ - Linux on the Nios II architecture.
  nommu-mmap.txt
  - documentation about no-mmu memory mapping support.
  numastat.txt
  - info on how to read Numa policy hit/miss statistics in sysfs.
- oops-tracing.txt
- - how to decode those nasty internal kernel error dump messages.
+ ntb.txt
+ - info on Non-Transparent Bridge (NTB) drivers.
+ nvdimm/
+ - info on non-volatile devices.
+ nvmem/
+ - info on non volatile memory framework.
+ output/
+ - default directory where html/LaTeX/pdf files will be written.
  padata.txt
  - An introduction to the "padata" parallel execution API
  parisc/
  - directory with info on using Linux on PA-RISC architecture.
- parport.txt
- - how to use the parallel-port driver.
  parport-lowlevel.txt
  - description and usage of the low level parallel port functions.
  pcmcia/
  - info on the Linux PCMCIA driver.
  percpu-rw-semaphore.txt
  - RCU based read-write semaphore optimized for locking for reading
+ perf/
+ - info about the APM X-Gene SoC Performance Monitoring Unit (PMU).
+ phy/
+ - ino on Samsung USB 2.0 PHY adaptation layer.
  phy.txt
  - Description of the generic PHY framework.
  pi-futex.txt
  - documentation on lightweight priority inheritance futexes.
  pinctrl.txt
  - info on pinctrl subsystem and the PINMUX/PINCONF and drivers
+ platform/
+ - List of supported hardware by compal and Dell laptop.
  pnp.txt
  - Linux Plug and Play documentation.
  power/
@@ -372,14 +354,16 @@ preempt-locking.txt
  - info on locking under a preemptive kernel.
  printk-formats.txt
  - how to get printk format specifiers right
+ process/
+ - how to work with the mainline kernel development process.
  pps/
  - directory with information on the pulse-per-second support
+ pti/
+ - directory with info on Intel MID PTI.
  ptp/
  - directory with info on support for IEEE 1588 PTP clocks in Linux.
  pwm.txt
  - info on the pulse width modulation driver subsystem
- ramoops.txt
- - documentation of the ramoops oops/panic logging module.
  rapidio/
  - directory with info on RapidIO packet-based fabric interconnect
  rbtree.txt
@@ -406,8 +390,6 @@ security/
  - directory that contains security-related info
  serial/
  - directory with info on the low level serial API.
- serial-console.txt
- - how to set up Linux with a serial line console as the default.
  sgi-ioc4.txt
  - description of the SGI IOC4 PCI (multi function) device.
  sh/
@@ -416,24 +398,20 @@ smsc_ece1099.txt
  -info on the smsc Keyboard Scan Expansion/GPIO Expansion device.
  sound/
  - directory with info on sound card support.
- sparse.txt
- - info on how to obtain and use the sparse tool for typechecking.
  spi/
  - overview of Linux kernel Serial Peripheral Interface (SPI) support.
- stable_api_nonsense.txt
- - info on why the kernel does not have a stable in-kernel api or abi.
- stable_kernel_rules.txt
- - rules and procedures for the -stable kernel releases.
+ sphinx/
+ - no documentation here, just files required by Sphinx toolchain.
+ sphinx-static/
+ - no documentation here, just files required by Sphinx toolchain.
  static-keys.txt
  - info on how static keys allow debug code in hotpaths via patching
  svga.txt
  - short guide on selecting video modes at boot via VGA BIOS.
- sysfs-rules.txt
- - How not to use sysfs.
+ sync_file.txt
+ - Sync file API guide.
  sysctl/
  - directory with info on the /proc/sys/* files.
- sysrq.txt
- - info on the magic SysRq key.
  target/
  - directory with info on generating TCM v4 fabric .ko modules
  this_cpu_ops.txt
@@ -442,39 +420,29 @@ thermal/
  - directory with information on managing thermal issues (CPU/temp)
  trace/
  - directory with info on tracing technologies within linux
+ translations/
+ - translations of this document from English to another language
  unaligned-memory-access.txt
  - info on how to avoid arch breaking unaligned memory access in code.
- unicode.txt
- - info on the Unicode character/font mapping used in Linux.
  unshare.txt
  - description of the Linux unshare system call.
  usb/
  - directory with info regarding the Universal Serial Bus.
- vDSO/
- - directory with info regarding virtual dynamic shared objects
  vfio.txt
  - info on Virtual Function I/O used in guest/hypervisor instances.
- vgaarbiter.txt
- - info on enable/disable the legacy decoding on different VGA devices
  video-output.txt
  - sysfs class driver interface to enable/disable a video output device.
- video4linux/
- - directory with info regarding video/TV/radio cards and linux.
  virtual/
  - directory with information on the various linux virtualizations.
  vm/
  - directory with info on the Linux vm code.
- vme_api.txt
- - file relating info on the VME bus API in linux
- volatile-considered-harmful.txt
- - Why the "volatile" type class should not be used
  w1/
  - directory with documents regarding the 1-wire (w1) subsystem.
  watchdog/
  - how to auto-reboot Linux if it has "fallen and can't get up". ;-)
  wimax/
  - directory with info about Intel Wireless Wimax Connections
- workqueue.txt
+ core-api/workqueue.rst
  - information on the Concurrency Managed Workqueue implementation
  x86/x86_64/
  - directory with info on Linux support for AMD x86-64 (Hammer) machines.
@@ -484,7 +452,5 @@ xtensa/
  - directory with documents relating to arch/xtensa port/implementation
  xz.txt
  - how to make use of the XZ data compression within linux kernel
- zh_CN/
- - directory with Chinese translations of various documents
  zorro.txt
  - info on writing drivers for Zorro bus devices found on Amigas.

@@ -84,4 +84,4 @@ stable:

  - Kernel-internal symbols. Do not rely on the presence, absence, location, or
  type of any kernel symbol, either in System.map files or the kernel binary
- itself. See Documentation/stable_api_nonsense.txt.
+ itself. See Documentation/process/stable-api-nonsense.rst.

@@ -347,7 +347,7 @@ Description:
  because of fragmentation, SLUB will retry with the minimum order
  possible depending on its characteristics.
  When debug_guardpage_minorder=N (N > 0) parameter is specified
- (see Documentation/kernel-parameters.txt), the minimum possible
+ (see Documentation/admin-guide/kernel-parameters.rst), the minimum possible
  order is used and this sysfs entry can not be used to change
  the order at run time.

|
|||||||
@@ -1,246 +0,0 @@
|
|||||||
Table of contents
|
|
||||||
=================
|
|
||||||
|
|
||||||
Last updated: 20 December 2005
|
|
||||||
|
|
||||||
Contents
|
|
||||||
========
|
|
||||||
|
|
||||||
- Introduction
|
|
||||||
- Devices not appearing
|
|
||||||
- Finding patch that caused a bug
|
|
||||||
-- Finding using git-bisect
|
|
||||||
-- Finding it the old way
|
|
||||||
- Fixing the bug
|
|
||||||
|
|
||||||
Introduction
|
|
||||||
============
|
|
||||||
|
|
||||||
Always try the latest kernel from kernel.org and build from source. If you are
|
|
||||||
not confident in doing that please report the bug to your distribution vendor
|
|
||||||
instead of to a kernel developer.
|
|
||||||
|
|
||||||
Finding bugs is not always easy. Have a go though. If you can't find it don't
|
|
||||||
give up. Report as much as you have found to the relevant maintainer. See
|
|
||||||
MAINTAINERS for who that is for the subsystem you have worked on.
|
|
||||||
|
|
||||||
Before you submit a bug report read REPORTING-BUGS.
|
|
||||||
|
|
||||||
Devices not appearing
|
|
||||||
=====================
|
|
||||||
|
|
||||||
Often this is caused by udev. Check that first before blaming it on the
|
|
||||||
kernel.
|
|
||||||
|
|
||||||
Finding patch that caused a bug
|
|
||||||
===============================
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Finding using git-bisect
|
|
||||||
------------------------
|
|
||||||
|
|
||||||
Using the provided tools with git makes finding bugs easy provided the bug is
|
|
||||||
reproducible.
|
|
||||||
|
|
||||||
Steps to do it:
|
|
||||||
- start using git for the kernel source
|
|
||||||
- read the man page for git-bisect
|
|
||||||
- have fun
|
|
||||||
|
|
||||||
Finding it the old way
|
|
||||||
----------------------
|
|
||||||
|
|
||||||
[Sat Mar 2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)]
|
|
||||||
|
|
||||||
This is how to track down a bug if you know nothing about kernel hacking.
|
|
||||||
It's a brute force approach but it works pretty well.
|
|
||||||
|
|
||||||
You need:
|
|
||||||
|
|
||||||
. A reproducible bug - it has to happen predictably (sorry)
|
|
||||||
. All the kernel tar files from a revision that worked to the
|
|
||||||
revision that doesn't
|
|
||||||
|
|
||||||
You will then do:
|
|
||||||
|
|
||||||
. Rebuild a revision that you believe works, install, and verify that.
|
|
||||||
. Do a binary search over the kernels to figure out which one
|
|
||||||
introduced the bug. I.e., suppose 1.3.28 didn't have the bug, but
|
|
||||||
you know that 1.3.69 does. Pick a kernel in the middle and build
|
|
||||||
that, like 1.3.50. Build & test; if it works, pick the mid point
|
|
||||||
between .50 and .69, else the mid point between .28 and .50.
|
|
||||||
. You'll narrow it down to the kernel that introduced the bug. You
|
|
||||||
can probably do better than this but it gets tricky.
|
|
||||||
|
|
||||||
. Narrow it down to a subdirectory
|
|
||||||
|
|
||||||
- Copy kernel that works into "test". Let's say that 3.62 works,
|
|
||||||
but 3.63 doesn't. So you diff -r those two kernels and come
|
|
||||||
up with a list of directories that changed. For each of those
|
|
||||||
directories:
|
|
||||||
|
|
||||||
Copy the non-working directory next to the working directory
|
|
||||||
as "dir.63".
|
|
||||||
One directory at a time, try moving the working directory to
|
|
||||||
"dir.62" and moving "dir.63" to "dir":
|
|
||||||
|
|
||||||
mv dir dir.62
|
|
||||||
mv dir.63 dir
|
|
||||||
find dir -name '*.[oa]' -print | xargs rm -f
|
|
||||||
|
|
||||||
And then rebuild and retest. Assuming that all related
|
|
||||||
changes were contained in the sub directory, this should
|
|
||||||
isolate the change to a directory.
|
|
||||||
|
|
||||||
Problems: changes in header files may have occurred; I've
|
|
||||||
found in my case that they were self explanatory - you may
|
|
||||||
or may not want to give up when that happens.
|
|
||||||
|
|
||||||
. Narrow it down to a file
|
|
||||||
|
|
||||||
- You can apply the same technique to each file in the directory,
|
|
||||||
hoping that the changes in that file are self contained.
|
|
||||||
|
|
||||||
. Narrow it down to a routine
|
|
||||||
|
|
||||||
- You can take the old file and the new file and manually create
|
|
||||||
a merged file that has
|
|
||||||
|
|
||||||
#ifdef VER62
|
|
||||||
routine()
|
|
||||||
{
|
|
||||||
...
|
|
||||||
}
|
|
||||||
#else
|
|
||||||
routine()
|
|
||||||
{
|
|
||||||
...
|
|
||||||
}
|
|
||||||
#endif
|
|
||||||
|
|
||||||
And then walk through that file, one routine at a time and
|
|
||||||
prefix it with
|
|
||||||
|
|
||||||
#define VER62
|
|
||||||
/* both routines here */
|
|
||||||
#undef VER62
|
|
||||||
|
|
||||||
Then recompile, retest, move the ifdefs until you find the one
|
|
||||||
that makes the difference.
|
|
||||||
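The binary search over kernel revisions described in the steps above can be sketched in a few lines. This is only an illustration, not part of the original text: the names `first_bad` and `is_bad` are hypothetical, the revision where the bug appears (1.3.57) is made up, and in practice `is_bad` stands for the manual "build, install, test" cycle rather than a function call. It assumes the first revision is known good, the last is known bad, and the bug stays present once introduced.

```python
def first_bad(versions, is_bad):
    """Return the first version for which is_bad() is true.

    Assumes versions[0] is known good, versions[-1] is known bad, and
    every version after the first bad one is also bad.
    """
    good, bad = 0, len(versions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(versions[mid]):
            bad = mid    # bug already present: look earlier
        else:
            good = mid   # still works: look later
    return versions[bad]


# Hypothetical range from the text: 1.3.28 works, 1.3.69 does not.
versions = [f"1.3.{n}" for n in range(28, 70)]


def is_bad(version):
    # Stand-in for "build, boot and try to reproduce the bug".
    return int(version.split(".")[2]) >= 57


print(first_bad(versions, is_bad))
```

With 42 revisions in the range, this needs at most six build-and-test rounds instead of up to forty.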
|
|
||||||
Finally, you take all the info that you have, kernel revisions, bug
|
|
||||||
description, the extent to which you have narrowed it down, and pass
|
|
||||||
that off to whomever you believe is the maintainer of that section.
|
|
||||||
A post to linux.dev.kernel isn't such a bad idea if you've done some
|
|
||||||
work to narrow it down.
|
|
||||||
|
|
||||||
If you get it down to a routine, you'll probably get a fix in 24 hours.
|
|
||||||
|
|
||||||
My apologies to Linus and the other kernel hackers for describing this
|
|
||||||
brute force approach, it's hardly what a kernel hacker would do. However,
|
|
||||||
it does work and it lets non-hackers help fix bugs. And it is cool
|
|
||||||
because Linux snapshots will let you do this - something that you can't
|
|
||||||
do with vendor supplied releases.
|
|
||||||
|
|
||||||
Fixing the bug
|
|
||||||
==============
|
|
||||||
|
|
||||||
Nobody is going to tell you how to fix bugs. Seriously. You need to work it
|
|
||||||
out. But below are some hints on how to use the tools.
|
|
||||||
|
|
||||||
To debug a kernel, use objdump and look for the hex offset from the crash
|
|
||||||
output to find the valid line of code/assembler. Without debug symbols, you
|
|
||||||
will see the assembler code for the routine shown, but if your kernel has
|
|
||||||
debug symbols the C code will also be available. (Debug symbols can be enabled
|
|
||||||
in the kernel hacking menu of the menu configuration.) For example:
|
|
||||||
|
|
||||||
objdump -r -S -l --disassemble net/dccp/ipv4.o
|
|
||||||
|
|
||||||
NB.: you need to be at the top level of the kernel tree for this to pick up
|
|
||||||
your C files.
|
|
||||||
|
|
||||||
If you don't have access to the code you can also debug on some crash dumps
|
|
||||||
e.g. crash dump output as shown by Dave Miller.
|
|
||||||
|
|
||||||
> EIP is at ip_queue_xmit+0x14/0x4c0
|
|
||||||
> ...
|
|
||||||
> Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
|
|
||||||
> 00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
|
|
||||||
> <8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
|
|
||||||
>
|
|
||||||
> Put the bytes into a "foo.s" file like this:
|
|
||||||
>
|
|
||||||
> .text
|
|
||||||
> .globl foo
|
|
||||||
> foo:
|
|
||||||
> .byte .... /* bytes from Code: part of OOPS dump */
|
|
||||||
>
|
|
||||||
> Compile it with "gcc -c -o foo.o foo.s" then look at the output of
|
|
||||||
> "objdump --disassemble foo.o".
|
|
||||||
>
|
|
||||||
> Output:
|
|
||||||
>
|
|
||||||
> ip_queue_xmit:
|
|
||||||
> push %ebp
|
|
||||||
> push %edi
|
|
||||||
> push %esi
|
|
||||||
> push %ebx
|
|
||||||
> sub $0xbc, %esp
|
|
||||||
> mov 0xd0(%esp), %ebp ! %ebp = arg0 (skb)
|
|
||||||
> mov 0x8(%ebp), %ebx ! %ebx = skb->sk
|
|
||||||
> mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
|
|
||||||
|
|
||||||
In addition, you can use GDB to figure out the exact file and line
|
|
||||||
number of the OOPS from the vmlinux file. If you have
|
|
||||||
CONFIG_DEBUG_INFO enabled, you can simply copy the EIP value from the
|
|
||||||
OOPS:
|
|
||||||
|
|
||||||
EIP: 0060:[<c021e50e>] Not tainted VLI
|
|
||||||
|
|
||||||
And use GDB to translate that to human-readable form:
|
|
||||||
|
|
||||||
gdb vmlinux
|
|
||||||
(gdb) l *0xc021e50e
|
|
||||||
|
|
||||||
If you don't have CONFIG_DEBUG_INFO enabled, you use the function
|
|
||||||
offset from the OOPS:
|
|
||||||
|
|
||||||
EIP is at vt_ioctl+0xda8/0x1482
|
|
||||||
|
|
||||||
And recompile the kernel with CONFIG_DEBUG_INFO enabled:
|
|
||||||
|
|
||||||
make vmlinux
|
|
||||||
gdb vmlinux
|
|
||||||
(gdb) p vt_ioctl
|
|
||||||
(gdb) l *(0x<address of vt_ioctl> + 0xda8)
|
|
||||||
or, as one command
|
|
||||||
(gdb) l *(vt_ioctl + 0xda8)
|
|
||||||
|
|
||||||
If you have a call trace, such as :-
|
|
||||||
>Call Trace:
|
|
||||||
> [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
|
|
||||||
> [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
|
|
||||||
> [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
|
|
||||||
> ...
|
|
||||||
this shows the problem in the :jbd: module. You can load that module in gdb
|
|
||||||
and list the relevant code.
|
|
||||||
gdb fs/jbd/jbd.ko
|
|
||||||
(gdb) p log_wait_commit
|
|
||||||
(gdb) l *(0x<address> + 0xa3)
|
|
||||||
or
|
|
||||||
(gdb) l *(log_wait_commit + 0xa3)
|
|
||||||
|
|
||||||
|
|
||||||
Another very useful option of the Kernel Hacking section in menuconfig is
|
|
||||||
Debug memory allocations. This will help you see whether data has been
|
|
||||||
initialised and not set before use etc. To see the values that get assigned
|
|
||||||
with this look at mm/slab.c and search for POISON_INUSE. When using this an
|
|
||||||
Oops will often show the poisoned data instead of zero which is the default.
|
|
||||||
|
|
||||||
Once you have worked out a fix please submit it upstream. After all open
|
|
||||||
source is about sharing what you do and don't you want to be recognised for
|
|
||||||
your genius?
|
|
||||||
|
|
||||||
Please do read Documentation/SubmittingPatches though to help your code get
|
|
||||||
accepted.
|
|
||||||
File diff suppressed because it is too large
@@ -9,12 +9,10 @@
|
|||||||
DOCBOOKS := z8530book.xml \
|
DOCBOOKS := z8530book.xml \
|
||||||
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
|
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
|
||||||
writing_usb_driver.xml networking.xml \
|
writing_usb_driver.xml networking.xml \
|
||||||
kernel-api.xml filesystems.xml lsm.xml usb.xml kgdb.xml \
|
kernel-api.xml filesystems.xml lsm.xml kgdb.xml \
|
||||||
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
|
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
|
||||||
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
|
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
|
||||||
debugobjects.xml sh.xml regulator.xml \
|
80211.xml sh.xml regulator.xml w1.xml \
|
||||||
alsa-driver-api.xml writing-an-alsa-driver.xml \
|
|
||||||
tracepoint.xml w1.xml \
|
|
||||||
writing_musb_glue_layer.xml crypto-API.xml iio.xml
|
writing_musb_glue_layer.xml crypto-API.xml iio.xml
|
||||||
|
|
||||||
ifeq ($(DOCBOOKS),)
|
ifeq ($(DOCBOOKS),)
|
||||||
@@ -264,6 +262,7 @@ clean-files := $(DOCBOOKS) \
|
|||||||
$(patsubst %.xml, %.aux.xml, $(DOCBOOKS)) \
|
$(patsubst %.xml, %.aux.xml, $(DOCBOOKS)) \
|
||||||
$(patsubst %.xml, %.xml.db, $(DOCBOOKS)) \
|
$(patsubst %.xml, %.xml.db, $(DOCBOOKS)) \
|
||||||
$(patsubst %.xml, %.xml, $(DOCBOOKS)) \
|
$(patsubst %.xml, %.xml, $(DOCBOOKS)) \
|
||||||
|
$(patsubst %.xml, .%.xml.cmd, $(DOCBOOKS)) \
|
||||||
$(index)
|
$(index)
|
||||||
|
|
||||||
clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man
|
clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man
|
||||||
|
|||||||
@@ -1,142 +0,0 @@
|
|||||||
<?xml version="1.0" encoding="UTF-8"?>
|
|
||||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
|
||||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
|
||||||
|
|
||||||
<!-- ****************************************************** -->
|
|
||||||
<!-- Header -->
|
|
||||||
<!-- ****************************************************** -->
|
|
||||||
<book id="ALSA-Driver-API">
|
|
||||||
<bookinfo>
|
|
||||||
<title>The ALSA Driver API</title>
|
|
||||||
|
|
||||||
<legalnotice>
|
|
||||||
<para>
|
|
||||||
This document is free; you can redistribute it and/or modify it
|
|
||||||
under the terms of the GNU General Public License as published by
|
|
||||||
the Free Software Foundation; either version 2 of the License, or
|
|
||||||
(at your option) any later version.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
This document is distributed in the hope that it will be useful,
|
|
||||||
but <emphasis>WITHOUT ANY WARRANTY</emphasis>; without even the
|
|
||||||
implied warranty of <emphasis>MERCHANTABILITY or FITNESS FOR A
|
|
||||||
PARTICULAR PURPOSE</emphasis>. See the GNU General Public License
|
|
||||||
for more details.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
You should have received a copy of the GNU General Public
|
|
||||||
License along with this program; if not, write to the Free
|
|
||||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
|
||||||
MA 02111-1307 USA
|
|
||||||
</para>
|
|
||||||
</legalnotice>
|
|
||||||
|
|
||||||
</bookinfo>
|
|
||||||
|
|
||||||
<toc></toc>
|
|
||||||
|
|
||||||
<chapter><title>Management of Cards and Devices</title>
|
|
||||||
<sect1><title>Card Management</title>
|
|
||||||
!Esound/core/init.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Device Components</title>
|
|
||||||
!Esound/core/device.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Module requests and Device File Entries</title>
|
|
||||||
!Esound/core/sound.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Memory Management Helpers</title>
|
|
||||||
!Esound/core/memory.c
|
|
||||||
!Esound/core/memalloc.c
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter><title>PCM API</title>
|
|
||||||
<sect1><title>PCM Core</title>
|
|
||||||
!Esound/core/pcm.c
|
|
||||||
!Esound/core/pcm_lib.c
|
|
||||||
!Esound/core/pcm_native.c
|
|
||||||
!Iinclude/sound/pcm.h
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>PCM Format Helpers</title>
|
|
||||||
!Esound/core/pcm_misc.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>PCM Memory Management</title>
|
|
||||||
!Esound/core/pcm_memory.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>PCM DMA Engine API</title>
|
|
||||||
!Esound/core/pcm_dmaengine.c
|
|
||||||
!Iinclude/sound/dmaengine_pcm.h
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter><title>Control/Mixer API</title>
|
|
||||||
<sect1><title>General Control Interface</title>
|
|
||||||
!Esound/core/control.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>AC97 Codec API</title>
|
|
||||||
!Esound/pci/ac97/ac97_codec.c
|
|
||||||
!Esound/pci/ac97/ac97_pcm.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Virtual Master Control API</title>
|
|
||||||
!Esound/core/vmaster.c
|
|
||||||
!Iinclude/sound/control.h
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter><title>MIDI API</title>
|
|
||||||
<sect1><title>Raw MIDI API</title>
|
|
||||||
!Esound/core/rawmidi.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>MPU401-UART API</title>
|
|
||||||
!Esound/drivers/mpu401/mpu401_uart.c
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter><title>Proc Info API</title>
|
|
||||||
<sect1><title>Proc Info Interface</title>
|
|
||||||
!Esound/core/info.c
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter><title>Compress Offload</title>
|
|
||||||
<sect1><title>Compress Offload API</title>
|
|
||||||
!Esound/core/compress_offload.c
|
|
||||||
!Iinclude/uapi/sound/compress_offload.h
|
|
||||||
!Iinclude/uapi/sound/compress_params.h
|
|
||||||
!Iinclude/sound/compress_driver.h
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter><title>ASoC</title>
|
|
||||||
<sect1><title>ASoC Core API</title>
|
|
||||||
!Iinclude/sound/soc.h
|
|
||||||
!Esound/soc/soc-core.c
|
|
||||||
<!-- !Esound/soc/soc-cache.c no docbook comments here -->
|
|
||||||
!Esound/soc/soc-devres.c
|
|
||||||
!Esound/soc/soc-io.c
|
|
||||||
!Esound/soc/soc-pcm.c
|
|
||||||
!Esound/soc/soc-ops.c
|
|
||||||
!Esound/soc/soc-compress.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>ASoC DAPM API</title>
|
|
||||||
!Esound/soc/soc-dapm.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>ASoC DMA Engine API</title>
|
|
||||||
!Esound/soc/soc-generic-dmaengine-pcm.c
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter><title>Miscellaneous Functions</title>
|
|
||||||
<sect1><title>Hardware-Dependent Devices API</title>
|
|
||||||
!Esound/core/hwdep.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Jack Abstraction Layer API</title>
|
|
||||||
!Iinclude/sound/jack.h
|
|
||||||
!Esound/core/jack.c
|
|
||||||
!Esound/soc/soc-jack.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>ISA DMA Helpers</title>
|
|
||||||
!Esound/core/isadma.c
|
|
||||||
</sect1>
|
|
||||||
<sect1><title>Other Helper Macros</title>
|
|
||||||
!Iinclude/sound/core.h
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
</book>
|
|
||||||
@@ -1,443 +0,0 @@
|
|||||||
<?xml version="1.0" encoding="UTF-8"?>
|
|
||||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
|
||||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
|
||||||
|
|
||||||
<book id="debug-objects-guide">
|
|
||||||
<bookinfo>
|
|
||||||
<title>Debug objects life time</title>
|
|
||||||
|
|
||||||
<authorgroup>
|
|
||||||
<author>
|
|
||||||
<firstname>Thomas</firstname>
|
|
||||||
<surname>Gleixner</surname>
|
|
||||||
<affiliation>
|
|
||||||
<address>
|
|
||||||
<email>tglx@linutronix.de</email>
|
|
||||||
</address>
|
|
||||||
</affiliation>
|
|
||||||
</author>
|
|
||||||
</authorgroup>
|
|
||||||
|
|
||||||
<copyright>
|
|
||||||
<year>2008</year>
|
|
||||||
<holder>Thomas Gleixner</holder>
|
|
||||||
</copyright>
|
|
||||||
|
|
||||||
<legalnotice>
|
|
||||||
<para>
|
|
||||||
This documentation is free software; you can redistribute
|
|
||||||
it and/or modify it under the terms of the GNU General Public
|
|
||||||
License version 2 as published by the Free Software Foundation.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
This program is distributed in the hope that it will be
|
|
||||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
|
||||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
|
||||||
See the GNU General Public License for more details.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
You should have received a copy of the GNU General Public
|
|
||||||
License along with this program; if not, write to the Free
|
|
||||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
|
||||||
MA 02111-1307 USA
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
For more details see the file COPYING in the source
|
|
||||||
distribution of Linux.
|
|
||||||
</para>
|
|
||||||
</legalnotice>
|
|
||||||
</bookinfo>
|
|
||||||
|
|
||||||
<toc></toc>
|
|
||||||
|
|
||||||
<chapter id="intro">
|
|
||||||
<title>Introduction</title>
|
|
||||||
<para>
|
|
||||||
debugobjects is a generic infrastructure to track the life time
|
|
||||||
of kernel objects and validate the operations on those.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
debugobjects is useful to check for the following error patterns:
|
|
||||||
<itemizedlist>
|
|
||||||
<listitem><para>Activation of uninitialized objects</para></listitem>
|
|
||||||
<listitem><para>Initialization of active objects</para></listitem>
|
|
||||||
<listitem><para>Usage of freed/destroyed objects</para></listitem>
|
|
||||||
</itemizedlist>
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
debugobjects is not changing the data structure of the real
|
|
||||||
object so it can be compiled in with a minimal runtime impact
|
|
||||||
and enabled on demand with a kernel command line option.
|
|
||||||
</para>
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="howto">
|
|
||||||
<title>Howto use debugobjects</title>
|
|
||||||
<para>
|
|
||||||
A kernel subsystem needs to provide a data structure which
|
|
||||||
describes the object type and add calls into the debug code at
|
|
||||||
appropriate places. The data structure to describe the object
|
|
||||||
type needs at minimum the name of the object type. Optional
|
|
||||||
functions can and should be provided to fixup detected problems
|
|
||||||
so the kernel can continue to work and the debug information can
|
|
||||||
be retrieved from a live system instead of hard core debugging
|
|
||||||
with serial consoles and stack trace transcripts from the
|
|
||||||
monitor.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The debug calls provided by debugobjects are:
|
|
||||||
<itemizedlist>
|
|
||||||
<listitem><para>debug_object_init</para></listitem>
|
|
||||||
<listitem><para>debug_object_init_on_stack</para></listitem>
|
|
||||||
<listitem><para>debug_object_activate</para></listitem>
|
|
||||||
<listitem><para>debug_object_deactivate</para></listitem>
|
|
||||||
<listitem><para>debug_object_destroy</para></listitem>
|
|
||||||
<listitem><para>debug_object_free</para></listitem>
|
|
||||||
<listitem><para>debug_object_assert_init</para></listitem>
|
|
||||||
</itemizedlist>
|
|
||||||
Each of these functions takes the address of the real object and
|
|
||||||
a pointer to the object type specific debug description
|
|
||||||
structure.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Each detected error is reported in the statistics and a limited
|
|
||||||
number of errors are printk'ed including a full stack trace.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The statistics are available via /sys/kernel/debug/debug_objects/stats.
|
|
||||||
They provide information about the number of warnings and the
|
|
||||||
number of successful fixups along with information about the
|
|
||||||
usage of the internal tracking objects and the state of the
|
|
||||||
internal tracking objects pool.
|
|
||||||
</para>
|
|
||||||
</chapter>
|
|
||||||
<chapter id="debugfunctions">
|
|
||||||
<title>Debug functions</title>
|
|
||||||
<sect1 id="prototypes">
|
|
||||||
<title>Debug object function reference</title>
|
|
||||||
!Elib/debugobjects.c
|
|
||||||
</sect1>
|
|
||||||
<sect1 id="debug_object_init">
|
|
||||||
<title>debug_object_init</title>
|
|
||||||
<para>
|
|
||||||
This function is called whenever the initialization function
|
|
||||||
of a real object is called.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is already tracked by debugobjects it is
|
|
||||||
checked, whether the object can be initialized. Initializing
|
|
||||||
is not allowed for active and destroyed objects. When
|
|
||||||
debugobjects detects an error, then it calls the fixup_init
|
|
||||||
function of the object type description structure if provided
|
|
||||||
by the caller. The fixup function can correct the problem
|
|
||||||
before the real initialization of the object happens. E.g. it
|
|
||||||
can deactivate an active object in order to prevent damage to
|
|
||||||
the subsystem.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is not yet tracked by debugobjects,
|
|
||||||
debugobjects allocates a tracker object for the real object
|
|
||||||
and sets the tracker object state to ODEBUG_STATE_INIT. It
|
|
||||||
verifies that the object is not on the caller's stack. If it is
|
|
||||||
on the caller's stack then a limited number of warnings
|
|
||||||
including a full stack trace is printk'ed. The calling code
|
|
||||||
must use debug_object_init_on_stack() and remove the object
|
|
||||||
before leaving the function which allocated it. See next
|
|
||||||
section.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="debug_object_init_on_stack">
|
|
||||||
<title>debug_object_init_on_stack</title>
|
|
||||||
<para>
|
|
||||||
This function is called whenever the initialization function
|
|
||||||
of a real object which resides on the stack is called.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is already tracked by debugobjects it is
|
|
||||||
checked, whether the object can be initialized. Initializing
|
|
||||||
is not allowed for active and destroyed objects. When
|
|
||||||
debugobjects detects an error, then it calls the fixup_init
|
|
||||||
function of the object type description structure if provided
|
|
||||||
by the caller. The fixup function can correct the problem
|
|
||||||
before the real initialization of the object happens. E.g. it
|
|
||||||
can deactivate an active object in order to prevent damage to
|
|
||||||
the subsystem.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is not yet tracked by debugobjects
|
|
||||||
debugobjects allocates a tracker object for the real object
|
|
||||||
and sets the tracker object state to ODEBUG_STATE_INIT. It
|
|
||||||
verifies that the object is on the caller's stack.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
An object which is on the stack must be removed from the
|
|
||||||
tracker by calling debug_object_free() before the function
|
|
||||||
which allocates the object returns. Otherwise we keep track of
|
|
||||||
stale objects.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="debug_object_activate">
|
|
||||||
<title>debug_object_activate</title>
|
|
||||||
<para>
|
|
||||||
This function is called whenever the activation function of a
|
|
||||||
real object is called.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is already tracked by debugobjects it is
|
|
||||||
checked, whether the object can be activated. Activating is
|
|
||||||
not allowed for active and destroyed objects. When
|
|
||||||
debugobjects detects an error, then it calls the
|
|
||||||
fixup_activate function of the object type description
|
|
||||||
structure if provided by the caller. The fixup function can
|
|
||||||
correct the problem before the real activation of the object
|
|
||||||
happens. E.g. it can deactivate an active object in order to
|
|
||||||
prevent damage to the subsystem.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is not yet tracked by debugobjects then
|
|
||||||
the fixup_activate function is called if available. This is
|
|
||||||
necessary to allow the legitimate activation of statically
|
|
||||||
allocated and initialized objects. The fixup function checks
|
|
||||||
whether the object is valid and calls the debug_object_init()
|
|
||||||
function to initialize the tracking of this object.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the activation is legitimate, then the state of the
|
|
||||||
associated tracker object is set to ODEBUG_STATE_ACTIVE.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="debug_object_deactivate">
|
|
||||||
<title>debug_object_deactivate</title>
|
|
||||||
<para>
|
|
||||||
This function is called whenever the deactivation function of
|
|
||||||
a real object is called.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is tracked by debugobjects it is checked,
|
|
||||||
whether the object can be deactivated. Deactivating is not
|
|
||||||
allowed for untracked or destroyed objects.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the deactivation is legitimate, then the state of the
|
|
||||||
associated tracker object is set to ODEBUG_STATE_INACTIVE.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="debug_object_destroy">
|
|
||||||
<title>debug_object_destroy</title>
|
|
||||||
<para>
|
|
||||||
This function is called to mark an object destroyed. This is
|
|
||||||
useful to prevent the usage of invalid objects, which are
|
|
||||||
still available in memory: either statically allocated objects
|
|
||||||
or objects which are freed later.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is tracked by debugobjects it is checked,
|
|
||||||
whether the object can be destroyed. Destruction is not
|
|
||||||
allowed for active and destroyed objects. When debugobjects
|
|
||||||
detects an error, then it calls the fixup_destroy function of
|
|
||||||
the object type description structure if provided by the
|
|
||||||
caller. The fixup function can correct the problem before the
|
|
||||||
real destruction of the object happens. E.g. it can deactivate
|
|
||||||
an active object in order to prevent damage to the subsystem.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the destruction is legitimate, then the state of the
|
|
||||||
associated tracker object is set to ODEBUG_STATE_DESTROYED.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="debug_object_free">
|
|
||||||
<title>debug_object_free</title>
|
|
||||||
<para>
|
|
||||||
This function is called before an object is freed.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is tracked by debugobjects it is checked,
|
|
||||||
whether the object can be freed. Free is not allowed for
|
|
||||||
active objects. When debugobjects detects an error, then it
|
|
||||||
calls the fixup_free function of the object type description
|
|
||||||
structure if provided by the caller. The fixup function can
|
|
||||||
correct the problem before the real free of the object
|
|
||||||
happens. E.g. it can deactivate an active object in order to
|
|
||||||
prevent damage to the subsystem.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Note that debug_object_free removes the object from the
|
|
||||||
tracker. Later usage of the object is detected by the other
|
|
||||||
debug checks.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="debug_object_assert_init">
|
|
||||||
<title>debug_object_assert_init</title>
|
|
||||||
<para>
|
|
||||||
This function is called to assert that an object has been
|
|
||||||
initialized.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is not tracked by debugobjects, it calls
|
|
||||||
fixup_assert_init of the object type description structure
|
|
||||||
provided by the caller, with the hardcoded object state
|
|
||||||
ODEBUG_STATE_NOTAVAILABLE. The fixup function can correct the problem
|
|
||||||
by calling debug_object_init and other specific initializing
|
|
||||||
functions.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
When the real object is already tracked by debugobjects it is
|
|
||||||
ignored.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter id="fixupfunctions">
|
|
||||||
<title>Fixup functions</title>
|
|
||||||
<sect1 id="debug_obj_descr">
|
|
||||||
<title>Debug object type description structure</title>
|
|
||||||
!Iinclude/linux/debugobjects.h
|
|
||||||
</sect1>
|
|
||||||
<sect1 id="fixup_init">
|
|
||||||
<title>fixup_init</title>
|
|
||||||
<para>
|
|
||||||
This function is called from the debug code whenever a problem
|
|
||||||
in debug_object_init is detected. The function takes the
|
|
||||||
address of the object and the state which is currently
|
|
||||||
recorded in the tracker.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Called from debug_object_init when the object state is:
|
|
||||||
<itemizedlist>
|
|
||||||
<listitem><para>ODEBUG_STATE_ACTIVE</para></listitem>
|
|
||||||
</itemizedlist>
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The function returns true when the fixup was successful,
|
|
||||||
otherwise false. The return value is used to update the
|
|
||||||
statistics.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Note that the function needs to call the debug_object_init()
|
|
||||||
function again, after the damage has been repaired in order to
|
|
||||||
keep the state consistent.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="fixup_activate">
|
|
||||||
<title>fixup_activate</title>
|
|
||||||
<para>
|
|
||||||
This function is called from the debug code whenever a problem
|
|
||||||
in debug_object_activate is detected.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Called from debug_object_activate when the object state is:
|
|
||||||
<itemizedlist>
|
|
||||||
<listitem><para>ODEBUG_STATE_NOTAVAILABLE</para></listitem>
|
|
||||||
<listitem><para>ODEBUG_STATE_ACTIVE</para></listitem>
|
|
||||||
</itemizedlist>
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The function returns true when the fixup was successful,
|
|
||||||
otherwise false. The return value is used to update the
|
|
||||||
statistics.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Note that the function needs to call the debug_object_activate()
|
|
||||||
function again after the damage has been repaired in order to
|
|
||||||
keep the state consistent.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The activation of statically initialized objects is a special
|
|
||||||
case. When debug_object_activate() has no tracked object for
|
|
||||||
this object address then fixup_activate() is called with
|
|
||||||
object state ODEBUG_STATE_NOTAVAILABLE. The fixup function
|
|
||||||
needs to check whether this is a legitimate case of a
|
|
||||||
statically initialized object or not. If it is, it calls
|
|
||||||
debug_object_init() and debug_object_activate() to make the
|
|
||||||
object known to the tracker and marked active. In this case
|
|
||||||
the function should return false because this is not a real
|
|
||||||
fixup.
|
|
||||||
</para>
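The two cases described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: the state names come from the text, while struct my_obj, its fields, and the model_debug_object_*() helpers are hypothetical stand-ins for a real object type and for the tracker calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* State names follow the text; the rest is a hypothetical userspace model. */
enum debug_obj_state {
	ODEBUG_STATE_NONE,
	ODEBUG_STATE_INIT,
	ODEBUG_STATE_ACTIVE,
	ODEBUG_STATE_NOTAVAILABLE,
};

/* Hypothetical object type. */
struct my_obj {
	bool statically_initialized;  /* set by a static initializer */
	bool running;                 /* stands in for "active in the subsystem" */
	enum debug_obj_state tracked; /* stands in for the tracker's record */
};

/* Stand-ins for debug_object_init() and debug_object_activate(). */
static void model_debug_object_init(struct my_obj *obj)
{
	obj->tracked = ODEBUG_STATE_INIT;
}

static void model_debug_object_activate(struct my_obj *obj)
{
	obj->tracked = ODEBUG_STATE_ACTIVE;
}

/* Returns true only when a real problem was repaired. */
static bool my_fixup_activate(struct my_obj *obj, enum debug_obj_state state)
{
	assert(obj != NULL);

	switch (state) {
	case ODEBUG_STATE_NOTAVAILABLE:
		if (obj->statically_initialized) {
			/* Legitimate case: make the object known, mark it active. */
			model_debug_object_init(obj);
			model_debug_object_activate(obj);
		}
		return false; /* not a real fixup */
	case ODEBUG_STATE_ACTIVE:
		/* Real problem: stop the stale activation, then activate again
		 * so the tracker state stays consistent. */
		obj->running = false;
		obj->tracked = ODEBUG_STATE_INIT;
		model_debug_object_activate(obj);
		obj->running = true;
		return true;
	default:
		return false;
	}
}
```

In this model the return value separates the bookkeeping-only path (false) from the path that actually repaired something (true), which is what the statistics are said to depend on.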
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="fixup_destroy">
|
|
||||||
<title>fixup_destroy</title>
|
|
||||||
<para>
|
|
||||||
This function is called from the debug code whenever a problem
|
|
||||||
in debug_object_destroy is detected.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Called from debug_object_destroy when the object state is:
|
|
||||||
<itemizedlist>
|
|
||||||
<listitem><para>ODEBUG_STATE_ACTIVE</para></listitem>
|
|
||||||
</itemizedlist>
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The function returns true when the fixup was successful,
|
|
||||||
otherwise false. The return value is used to update the
|
|
||||||
statistics.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
<sect1 id="fixup_free">
|
|
||||||
<title>fixup_free</title>
|
|
||||||
<para>
|
|
||||||
This function is called from the debug code whenever a problem
|
|
||||||
in debug_object_free is detected. Further it can be called
|
|
||||||
from the debug checks in kfree/vfree, when an active object is
|
|
||||||
detected from the debug_check_no_obj_freed() sanity checks.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Called from debug_object_free() or debug_check_no_obj_freed()
|
|
||||||
when the object state is:
|
|
||||||
<itemizedlist>
|
|
||||||
<listitem><para>ODEBUG_STATE_ACTIVE</para></listitem>
|
|
||||||
</itemizedlist>
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The function returns true when the fixup was successful,
|
|
||||||
otherwise false. The return value is used to update the
|
|
||||||
statistics.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
<sect1 id="fixup_assert_init">
|
|
||||||
<title>fixup_assert_init</title>
|
|
||||||
<para>
|
|
||||||
This function is called from the debug code whenever a problem
|
|
||||||
in debug_object_assert_init is detected.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Called from debug_object_assert_init() with a hardcoded state
|
|
||||||
ODEBUG_STATE_NOTAVAILABLE when the object is not found in the
|
|
||||||
debug bucket.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The function returns true when the fixup was successful,
|
|
||||||
otherwise false. The return value is used to update the
|
|
||||||
statistics.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Note, this function should make sure debug_object_init() is
|
|
||||||
called before returning.
|
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
The handling of statically initialized objects is a special
|
|
||||||
case. The fixup function should check if this is a legitimate
|
|
||||||
case of a statically initialized object or not. If it is, only
|
|
||||||
debug_object_init() should be called to make the object known to
|
|
||||||
the tracker. Then the function should return false because this
|
|
||||||
is not
|
|
||||||
a real fixup.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
</chapter>
|
|
||||||
<chapter id="bugs">
|
|
||||||
<title>Known Bugs And Assumptions</title>
|
|
||||||
<para>
|
|
||||||
None (knock on wood).
|
|
||||||
</para>
|
|
||||||
</chapter>
|
|
||||||
</book>
|
|
||||||
@@ -1208,8 +1208,8 @@ static struct block_device_operations opt_fops = {
|
|||||||
|
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
Finally, don't forget to read <filename>Documentation/SubmittingPatches</filename>
|
Finally, don't forget to read <filename>Documentation/process/submitting-patches.rst</filename>
|
||||||
and possibly <filename>Documentation/SubmittingDrivers</filename>.
|
and possibly <filename>Documentation/process/submitting-drivers.rst</filename>.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|||||||
@@ -1,112 +0,0 @@
|
|||||||
<?xml version="1.0" encoding="UTF-8"?>
|
|
||||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
|
||||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
|
||||||
|
|
||||||
<book id="Tracepoints">
|
|
||||||
<bookinfo>
|
|
||||||
<title>The Linux Kernel Tracepoint API</title>
|
|
||||||
|
|
||||||
<authorgroup>
|
|
||||||
<author>
|
|
||||||
<firstname>Jason</firstname>
|
|
||||||
<surname>Baron</surname>
|
|
||||||
<affiliation>
|
|
||||||
<address>
|
|
||||||
<email>jbaron@redhat.com</email>
|
|
||||||
</address>
|
|
||||||
</affiliation>
|
|
||||||
</author>
|
|
||||||
<author>
|
|
||||||
<firstname>William</firstname>
|
|
||||||
<surname>Cohen</surname>
|
|
||||||
<affiliation>
|
|
||||||
<address>
|
|
||||||
<email>wcohen@redhat.com</email>
|
|
||||||
</address>
|
|
||||||
</affiliation>
|
|
||||||
</author>
|
|
||||||
</authorgroup>
|
|
||||||
|
|
||||||
<legalnotice>
|
|
||||||
<para>
|
|
||||||
This documentation is free software; you can redistribute
|
|
||||||
it and/or modify it under the terms of the GNU General Public
|
|
||||||
License as published by the Free Software Foundation; either
|
|
||||||
version 2 of the License, or (at your option) any later
|
|
||||||
version.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
This program is distributed in the hope that it will be
|
|
||||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
|
||||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
|
||||||
See the GNU General Public License for more details.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
You should have received a copy of the GNU General Public
|
|
||||||
License along with this program; if not, write to the Free
|
|
||||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
|
||||||
MA 02111-1307 USA
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
For more details see the file COPYING in the source
|
|
||||||
distribution of Linux.
|
|
||||||
</para>
|
|
||||||
</legalnotice>
|
|
||||||
</bookinfo>
|
|
||||||
|
|
||||||
<toc></toc>
|
|
||||||
<chapter id="intro">
|
|
||||||
<title>Introduction</title>
|
|
||||||
<para>
|
|
||||||
Tracepoints are static probe points that are located in strategic points
|
|
||||||
throughout the kernel. 'Probes' register/unregister with tracepoints
|
|
||||||
via a callback mechanism. The 'probes' are strictly typed functions that
|
|
||||||
are passed a unique set of parameters defined by each tracepoint.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
From this simple callback mechanism, 'probes' can be used to profile, debug,
|
|
||||||
and understand kernel behavior. There are a number of tools that provide a
|
|
||||||
framework for using 'probes'. These tools include Systemtap, ftrace, and
|
|
||||||
LTTng.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
Tracepoints are defined in a number of header files via various macros. Thus,
|
|
||||||
the purpose of this document is to provide a clear accounting of the available
|
|
||||||
tracepoints. The intention is to understand not only what tracepoints are
|
|
||||||
available but also to understand where future tracepoints might be added.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
The API presented has functions of the form:
|
|
||||||
<function>trace_tracepointname(function parameters)</function>. These are the
|
|
||||||
tracepoint callbacks that are found throughout the code. Registering and
|
|
||||||
unregistering probes with these callback sites is covered in the
|
|
||||||
<filename>Documentation/trace/*</filename> directory.
|
|
||||||
</para>
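The callback mechanism described in this chapter can be modelled in a few lines of userspace C. This is a sketch, not the kernel implementation: the tracepoint name trace_irq_handled, the probe table, and the register/unregister helpers are all hypothetical and only show how a strictly typed probe is attached to and detached from a call site.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROBES 4

/* Probe signature for one hypothetical tracepoint carrying (irq, ret). */
typedef void (*irq_probe_fn)(void *data, int irq, int ret);

struct probe_slot {
	irq_probe_fn fn;
	void *data;
};

static struct probe_slot irq_probes[MAX_PROBES];

/* Register a probe; returns 0 on success, -1 when the table is full. */
static int register_irq_probe(irq_probe_fn fn, void *data)
{
	assert(fn != NULL);
	for (size_t i = 0; i < MAX_PROBES; i++) {
		if (irq_probes[i].fn == NULL) {
			irq_probes[i].fn = fn;
			irq_probes[i].data = data;
			return 0;
		}
	}
	return -1;
}

/* Unregister a probe; returns 0 if it was found, -1 otherwise. */
static int unregister_irq_probe(irq_probe_fn fn, void *data)
{
	assert(fn != NULL);
	for (size_t i = 0; i < MAX_PROBES; i++) {
		if (irq_probes[i].fn == fn && irq_probes[i].data == data) {
			irq_probes[i].fn = NULL;
			irq_probes[i].data = NULL;
			return 0;
		}
	}
	return -1;
}

/* The tracepoint site: calls every registered probe with typed arguments. */
static void trace_irq_handled(int irq, int ret)
{
	for (size_t i = 0; i < MAX_PROBES; i++)
		if (irq_probes[i].fn)
			irq_probes[i].fn(irq_probes[i].data, irq, ret);
}

/* Example probe: counts how many times the tracepoint fired. */
static void count_probe(void *data, int irq, int ret)
{
	(void)irq;
	(void)ret;
	(*(int *)data)++;
}
```

With no probe registered, the call site does nothing beyond walking an empty table, which mirrors the idea that a tracepoint is cheap until something attaches to it.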
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="irq">
|
|
||||||
<title>IRQ</title>
|
|
||||||
!Iinclude/trace/events/irq.h
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="signal">
|
|
||||||
<title>SIGNAL</title>
|
|
||||||
!Iinclude/trace/events/signal.h
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="block">
|
|
||||||
<title>Block IO</title>
|
|
||||||
!Iinclude/trace/events/block.h
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="workqueue">
|
|
||||||
<title>Workqueue</title>
|
|
||||||
!Iinclude/trace/events/workqueue.h
|
|
||||||
</chapter>
|
|
||||||
</book>
|
|
||||||
@@ -45,6 +45,13 @@ GPL version 2.
|
|||||||
</abstract>
|
</abstract>
|
||||||
|
|
||||||
<revhistory>
|
<revhistory>
|
||||||
|
<revision>
|
||||||
|
<revnumber>0.10</revnumber>
|
||||||
|
<date>2016-10-17</date>
|
||||||
|
<authorinitials>sch</authorinitials>
|
||||||
|
<revremark>Added generic hyperv driver
|
||||||
|
</revremark>
|
||||||
|
</revision>
|
||||||
<revision>
|
<revision>
|
||||||
<revnumber>0.9</revnumber>
|
<revnumber>0.9</revnumber>
|
||||||
<date>2009-07-16</date>
|
<date>2009-07-16</date>
|
||||||
@@ -1033,6 +1040,61 @@ int main()
|
|||||||
|
|
||||||
</chapter>
|
</chapter>
|
||||||
|
|
||||||
|
<chapter id="uio_hv_generic" xreflabel="Using Generic driver for Hyper-V VMBUS">
|
||||||
|
<?dbhtml filename="uio_hv_generic.html"?>
|
||||||
|
<title>Generic Hyper-V UIO driver</title>
|
||||||
|
<para>
|
||||||
|
The generic driver is a kernel module named uio_hv_generic.
|
||||||
|
It supports devices on the Hyper-V VMBus similar to uio_pci_generic
|
||||||
|
on PCI bus.
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<sect1 id="uio_hv_generic_binding">
|
||||||
|
<title>Making the driver recognize the device</title>
|
||||||
|
<para>
|
||||||
|
Since the driver does not declare any device GUIDs, it will not get loaded
|
||||||
|
automatically and will not automatically bind to any devices; you must load it
|
||||||
|
and assign an ID to the driver yourself. For example, to use the network device
|
||||||
|
GUID:
|
||||||
|
<programlisting>
|
||||||
|
modprobe uio_hv_generic
|
||||||
|
echo "f8615163-df3e-46c5-913f-f2d2f965ed0e" > /sys/bus/vmbus/drivers/uio_hv_generic/new_id
|
||||||
|
</programlisting>
|
||||||
|
</para>
|
||||||
|
<para>
|
||||||
|
If there already is a hardware specific kernel driver for the device, the
|
||||||
|
generic driver still won't bind to it. In this case, if you want to use the
|
||||||
|
generic driver (why would you?) you'll have to manually unbind the hardware
|
||||||
|
specific driver and bind the generic driver, like this:
|
||||||
|
<programlisting>
|
||||||
|
echo -n vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3 > /sys/bus/vmbus/drivers/hv_netvsc/unbind
|
||||||
|
echo -n vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3 > /sys/bus/vmbus/drivers/uio_hv_generic/bind
|
||||||
|
</programlisting>
|
||||||
|
</para>
|
||||||
|
<para>
|
||||||
|
You can verify that the device has been bound to the driver
|
||||||
|
by looking for it in sysfs, for example like the following:
|
||||||
|
<programlisting>
|
||||||
|
ls -l /sys/bus/vmbus/devices/vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3/driver
|
||||||
|
</programlisting>
|
||||||
|
which, if successful, should print:
|
||||||
|
<programlisting>
|
||||||
|
.../vmbus-ed963694-e847-4b2a-85af-bc9cfc11d6f3/driver -> ../../../bus/vmbus/drivers/uio_hv_generic
|
||||||
|
</programlisting>
|
||||||
|
</para>
|
||||||
|
</sect1>
|
||||||
|
|
||||||
|
<sect1 id="uio_hv_generic_internals">
|
||||||
|
<title>Things to know about uio_hv_generic</title>
|
||||||
|
<para>
|
||||||
|
On each interrupt, uio_hv_generic sets the Interrupt Disable bit.
|
||||||
|
This prevents the device from generating further interrupts
|
||||||
|
until the bit is cleared. The userspace driver should clear this
|
||||||
|
bit before blocking and waiting for more interrupts.
|
||||||
|
</para>
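A userspace driver might follow that rule roughly as sketched below. The sketch assumes the usual UIO file interface, where writing a 32-bit 1 to the device file asks the kernel driver to re-enable interrupts and a 4-byte read blocks until the next interrupt and returns the event count; whether uio_hv_generic clears its bit on that write is an assumption here, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Ask the kernel driver to re-enable interrupts by writing a 32-bit 1.
 * Returns 0 on success, -1 on a short or failed write. */
static int uio_irq_enable(int fd)
{
	uint32_t on = 1;

	return write(fd, &on, sizeof(on)) == (ssize_t)sizeof(on) ? 0 : -1;
}

/* Block until the next interrupt; the 32-bit value read is the event
 * count. Returns 0 on success, -1 on a short or failed read. */
static int uio_irq_wait(int fd, uint32_t *count)
{
	assert(count != NULL);
	return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A driver loop would call uio_irq_enable() on the opened /dev/uioX descriptor immediately before each uio_irq_wait(), so that the device is never left masked while the process sleeps.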
|
||||||
|
</sect1>
|
||||||
|
</chapter>
|
||||||
|
|
||||||
<appendix id="app1">
|
<appendix id="app1">
|
||||||
<title>Further information</title>
|
<title>Further information</title>
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
|||||||
@@ -1,992 +0,0 @@
|
|||||||
<?xml version="1.0" encoding="UTF-8"?>
|
|
||||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
|
|
||||||
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
|
|
||||||
|
|
||||||
<book id="Linux-USB-API">
|
|
||||||
<bookinfo>
|
|
||||||
<title>The Linux-USB Host Side API</title>
|
|
||||||
|
|
||||||
<legalnotice>
|
|
||||||
<para>
|
|
||||||
This documentation is free software; you can redistribute
|
|
||||||
it and/or modify it under the terms of the GNU General Public
|
|
||||||
License as published by the Free Software Foundation; either
|
|
||||||
version 2 of the License, or (at your option) any later
|
|
||||||
version.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
This program is distributed in the hope that it will be
|
|
||||||
useful, but WITHOUT ANY WARRANTY; without even the implied
|
|
||||||
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
|
||||||
See the GNU General Public License for more details.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
You should have received a copy of the GNU General Public
|
|
||||||
License along with this program; if not, write to the Free
|
|
||||||
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
|
|
||||||
MA 02111-1307 USA
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>
|
|
||||||
For more details see the file COPYING in the source
|
|
||||||
distribution of Linux.
|
|
||||||
</para>
|
|
||||||
</legalnotice>
|
|
||||||
</bookinfo>
|
|
||||||
|
|
||||||
<toc></toc>
|
|
||||||
|
|
||||||
<chapter id="intro">
|
|
||||||
<title>Introduction to USB on Linux</title>
|
|
||||||
|
|
||||||
<para>A Universal Serial Bus (USB) is used to connect a host,
|
|
||||||
such as a PC or workstation, to a number of peripheral
|
|
||||||
devices. USB uses a tree structure, with the host as the
|
|
||||||
root (the system's master), hubs as interior nodes, and
|
|
||||||
peripherals as leaves (and slaves).
|
|
||||||
Modern PCs support several such trees of USB devices, usually
|
|
||||||
one USB 2.0 tree (480 Mbit/sec each) with
|
|
||||||
a few USB 1.1 trees (12 Mbit/sec each) that are used when you
|
|
||||||
connect a USB 1.1 device directly to the machine's "root hub".
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>That master/slave asymmetry was designed-in for a number of
|
|
||||||
reasons, one being ease of use. It is not physically possible to
|
|
||||||
assemble (legal) USB cables incorrectly: all upstream "to the host"
|
|
||||||
connectors are the rectangular type (matching the sockets on
|
|
||||||
root hubs), and all downstream connectors are the squarish type
|
|
||||||
(or they are built into the peripheral).
|
|
||||||
Also, the host software doesn't need to deal with distributed
|
|
||||||
auto-configuration since the pre-designated master node manages all that.
|
|
||||||
And finally, at the electrical level, bus protocol overhead is reduced by
|
|
||||||
eliminating arbitration and moving scheduling into the host software.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>USB 1.0 was announced in January 1996 and was revised
|
|
||||||
as USB 1.1 (with improvements in hub specification and
|
|
||||||
support for interrupt-out transfers) in September 1998.
|
|
||||||
USB 2.0 was released in April 2000, adding high-speed
|
|
||||||
transfers and transaction-translating hubs (used for USB 1.1
|
|
||||||
and 1.0 backward compatibility).
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Kernel developers added USB support to Linux early in the 2.2 kernel
|
|
||||||
series, shortly before 2.3 development forked. Updates from 2.3 were
|
|
||||||
regularly folded back into 2.2 releases, which improved reliability and
|
|
||||||
brought <filename>/sbin/hotplug</filename> support as well as more drivers.
|
|
||||||
Such improvements were continued in the 2.5 kernel series, where they added
|
|
||||||
USB 2.0 support, improved performance, and made the host controller drivers
|
|
||||||
(HCDs) more consistent. They also simplified the API (to make bugs less
|
|
||||||
likely) and added internal "kerneldoc" documentation.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Linux can run inside USB devices as well as on
|
|
||||||
the hosts that control the devices.
|
|
||||||
But USB device drivers running inside those peripherals
|
|
||||||
don't do the same things as the ones running inside hosts,
|
|
||||||
so they've been given a different name:
|
|
||||||
<emphasis>gadget drivers</emphasis>.
|
|
||||||
This document does not cover gadget drivers.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="host">
|
|
||||||
<title>USB Host-Side API Model</title>
|
|
||||||
|
|
||||||
<para>Host-side drivers for USB devices talk to the "usbcore" APIs.
|
|
||||||
There are two. One is intended for
|
|
||||||
<emphasis>general-purpose</emphasis> drivers (exposed through
|
|
||||||
driver frameworks), and the other is for drivers that are
|
|
||||||
<emphasis>part of the core</emphasis>.
|
|
||||||
Such core drivers include the <emphasis>hub</emphasis> driver
|
|
||||||
(which manages trees of USB devices) and several different kinds
|
|
||||||
of <emphasis>host controller drivers</emphasis>,
|
|
||||||
which control individual busses.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>The device model seen by USB drivers is relatively complex.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<itemizedlist>
|
|
||||||
|
|
||||||
<listitem><para>USB supports four kinds of data transfers
|
|
||||||
(control, bulk, interrupt, and isochronous). Two of them (control
|
|
||||||
and bulk) use bandwidth as it's available,
|
|
||||||
while the other two (interrupt and isochronous)
|
|
||||||
are scheduled to provide guaranteed bandwidth.
|
|
||||||
</para></listitem>
|
|
||||||
|
|
||||||
<listitem><para>The device description model includes one or more
|
|
||||||
"configurations" per device, only one of which is active at a time.
|
|
||||||
Devices that are capable of high-speed operation must also support
|
|
||||||
full-speed configurations, along with a way to ask about the
|
|
||||||
"other speed" configurations which might be used.
|
|
||||||
</para></listitem>
|
|
||||||
|
|
||||||
<listitem><para>Configurations have one or more "interfaces", each
|
|
||||||
of which may have "alternate settings". Interfaces may be
|
|
||||||
standardized by USB "Class" specifications, or may be specific to
|
|
||||||
a vendor or device.</para>
|
|
||||||
|
|
||||||
<para>USB device drivers actually bind to interfaces, not devices.
|
|
||||||
Think of them as "interface drivers", though you
|
|
||||||
may not see many devices where the distinction is important.
|
|
||||||
<emphasis>Most USB devices are simple, with only one configuration,
|
|
||||||
one interface, and one alternate setting.</emphasis>
|
|
||||||
</para></listitem>
|
|
||||||
|
|
||||||
<listitem><para>Interfaces have one or more "endpoints", each of
|
|
||||||
which supports one type and direction of data transfer such as
|
|
||||||
"bulk out" or "interrupt in". The entire configuration may have
|
|
||||||
up to sixteen endpoints in each direction, allocated as needed
|
|
||||||
among all the interfaces.
|
|
||||||
</para></listitem>
|
|
||||||
|
|
||||||
<listitem><para>Data transfer on USB is packetized; each endpoint
|
|
||||||
has a maximum packet size.
|
|
||||||
Drivers must often be aware of conventions such as flagging the end
|
|
||||||
of bulk transfers using "short" (including zero length) packets.
|
|
||||||
</para></listitem>
|
|
||||||
|
|
||||||
<listitem><para>The Linux USB API supports synchronous calls for
|
|
||||||
control and bulk messages.
|
|
||||||
It also supports asynchronous calls for all kinds of data transfer,
|
|
||||||
using request structures called "URBs" (USB Request Blocks).
|
|
||||||
</para></listitem>
|
|
||||||
|
|
||||||
</itemizedlist>
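The "short packet" convention from the list above comes down to simple arithmetic on the transfer length and the endpoint's maximum packet size. The helpers below are a hypothetical sketch of that arithmetic only; whether a given device protocol expects the terminating zero-length packet is defined by that protocol, not by this code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of data packets needed to carry len bytes, not counting any
 * terminating zero-length packet. */
static size_t data_packets(size_t len, size_t maxpacket)
{
	assert(maxpacket > 0);
	return (len + maxpacket - 1) / maxpacket;
}

/* True when the transfer is an exact multiple of the packet size, so the
 * last data packet is full and only an extra zero-length packet can
 * signal the end of the transfer. */
static bool needs_zero_length_packet(size_t len, size_t maxpacket)
{
	assert(maxpacket > 0);
	return len % maxpacket == 0;
}
```

For example, with a 512-byte maximum packet size a 1000-byte transfer ends on a short 488-byte packet, while a 1024-byte transfer fills two packets and has no short packet of its own.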
|
|
||||||
|
|
||||||
<para>Accordingly, the USB Core API exposed to device drivers
|
|
||||||
covers quite a lot of territory. You'll probably need to consult
|
|
||||||
the USB 2.0 specification, available online from www.usb.org at
|
|
||||||
no cost, as well as class or device specifications.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>The only host-side drivers that actually touch hardware
|
|
||||||
(reading/writing registers, handling IRQs, and so on) are the HCDs.
|
|
||||||
In theory, all HCDs provide the same functionality through the same
|
|
||||||
API. In practice, that's becoming more true on the 2.5 kernels,
|
|
||||||
but there are still differences that crop up especially with
|
|
||||||
fault handling. Different controllers don't necessarily report
|
|
||||||
the same aspects of failures, and recovery from faults (including
|
|
||||||
software-induced ones like unlinking an URB) isn't yet fully
|
|
||||||
consistent.
|
|
||||||
Device driver authors should make a point of doing disconnect
|
|
||||||
testing (while the device is active) with each different host
|
|
||||||
controller driver, to make sure drivers don't have bugs of
|
|
||||||
their own as well as to make sure they aren't relying on some
|
|
||||||
HCD-specific behavior.
|
|
||||||
(You will need external USB 1.1 and/or
|
|
||||||
USB 2.0 hubs to perform all those tests.)
|
|
||||||
</para>
|
|
||||||
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="types"><title>USB-Standard Types</title>
|
|
||||||
|
|
||||||
<para>In <filename>&lt;linux/usb/ch9.h&gt;</filename> you will find
|
|
||||||
the USB data types defined in chapter 9 of the USB specification.
|
|
||||||
These data types are used throughout USB, and in APIs including
|
|
||||||
this host side API, gadget APIs, and usbfs.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
!Iinclude/linux/usb/ch9.h
|
|
||||||
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="hostside"><title>Host-Side Data Types and Macros</title>
|
|
||||||
|
|
||||||
<para>The host side API exposes several layers to drivers, some of
|
|
||||||
which are more necessary than others.
|
|
||||||
These support lifecycle models for host side drivers
|
|
||||||
and devices, and support passing buffers through usbcore to
|
|
||||||
some HCD that performs the I/O for the device driver.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
|
|
||||||
!Iinclude/linux/usb.h
|
|
||||||
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="usbcore"><title>USB Core APIs</title>
|
|
||||||
|
|
||||||
<para>There are two basic I/O models in the USB API.
|
|
||||||
The most elemental one is asynchronous: drivers submit requests
|
|
||||||
in the form of an URB, and the URB's completion callback
|
|
||||||
handles the next step.
|
|
||||||
All USB transfer types support that model, although there
|
|
||||||
are special cases for control URBs (which always have setup
|
|
||||||
and status stages, but may not have a data stage) and
|
|
||||||
isochronous URBs (which allow large packets and include
|
|
||||||
per-packet fault reports).
|
|
||||||
Built on top of that is synchronous API support, where a
|
|
||||||
driver calls a routine that allocates one or more URBs,
|
|
||||||
submits them, and waits until they complete.
|
|
||||||
There are synchronous wrappers for single-buffer control
|
|
||||||
and bulk transfers (which are awkward to use in some
|
|
||||||
driver disconnect scenarios), and for scatterlist based
|
|
||||||
streaming I/O (bulk or interrupt).
|
|
||||||
</para>
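The relationship between the two I/O models can be illustrated with a userspace model. Everything here is a hypothetical stand-in (struct model_urb, model_submit_urb(), model_hcd_run(), model_bulk_msg()); it only shows how a synchronous call can be layered on an asynchronous submit-and-callback interface, not how usbcore implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PENDING_MAX 8

/* Hypothetical request block: a buffer plus a completion callback. */
struct model_urb {
	void *buffer;
	size_t length;
	size_t actual_length;
	int status;
	void (*complete)(struct model_urb *urb);
	void *context;
};

static struct model_urb *pending[PENDING_MAX];
static size_t pending_count;

/* Asynchronous submit: queue the request and return immediately. */
static int model_submit_urb(struct model_urb *urb)
{
	assert(urb != NULL && urb->complete != NULL);
	if (pending_count == PENDING_MAX)
		return -1;
	pending[pending_count++] = urb;
	return 0;
}

/* Stand-in for the host controller: finish queued requests in order and
 * invoke each completion callback. */
static void model_hcd_run(void)
{
	for (size_t i = 0; i < pending_count; i++) {
		struct model_urb *urb = pending[i];

		urb->actual_length = urb->length;
		urb->status = 0;
		urb->complete(urb);
	}
	pending_count = 0;
}

/* Completion callback used by the synchronous wrapper. */
static void sync_done(struct model_urb *urb)
{
	*(bool *)urb->context = true;
}

/* Synchronous wrapper: submit one request and wait until it completes. */
static int model_bulk_msg(void *buf, size_t len, size_t *actual)
{
	bool done = false;
	struct model_urb urb = {
		.buffer = buf, .length = len,
		.complete = sync_done, .context = &done,
	};

	if (model_submit_urb(&urb) != 0)
		return -1;
	while (!done)
		model_hcd_run(); /* a real driver would sleep here instead */
	if (actual)
		*actual = urb.actual_length;
	return urb.status;
}
```

The wrapper owns the request for the whole call, which hints at why synchronous calls are awkward during disconnect: the caller is stuck waiting until the completion callback runs.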
|
|
||||||
|
|
||||||
<para>USB drivers need to provide buffers that can be
|
|
||||||
used for DMA, although they don't necessarily need to
|
|
||||||
provide the DMA mapping themselves.
|
|
||||||
There are APIs to use when allocating DMA buffers,
|
|
||||||
which can prevent use of bounce buffers on some systems.
|
|
||||||
In some cases, drivers may be able to rely on 64-bit DMA
|
|
||||||
to eliminate another kind of bounce buffer.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
!Edrivers/usb/core/urb.c
|
|
||||||
!Edrivers/usb/core/message.c
|
|
||||||
!Edrivers/usb/core/file.c
|
|
||||||
!Edrivers/usb/core/driver.c
|
|
||||||
!Edrivers/usb/core/usb.c
|
|
||||||
!Edrivers/usb/core/hub.c
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="hcd"><title>Host Controller APIs</title>
|
|
||||||
|
|
||||||
<para>These APIs are only for use by host controller drivers,
|
|
||||||
most of which implement standard register interfaces such as
|
|
||||||
EHCI, OHCI, or UHCI.
|
|
||||||
UHCI was one of the first interfaces, designed by Intel and
|
|
||||||
also used by VIA; it doesn't do much in hardware.
|
|
||||||
OHCI was designed later, to have the hardware do more work
|
|
||||||
(bigger transfers, tracking protocol state, and so on).
|
|
||||||
EHCI was designed with USB 2.0; its design has features that
|
|
||||||
resemble OHCI (hardware does much more work) as well as
|
|
||||||
UHCI (some parts of ISO support, TD list processing).
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>There are host controllers other than the "big three",
|
|
||||||
although most PCI based controllers (and a few non-PCI based
|
|
||||||
ones) use one of those interfaces.
|
|
||||||
Not all host controllers use DMA; some use PIO, and there
|
|
||||||
is also a simulator.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>The same basic APIs are available to drivers for all
|
|
||||||
those controllers.
|
|
||||||
For historical reasons they are in two layers:
|
|
||||||
<structname>struct usb_bus</structname> is a rather thin
|
|
||||||
layer that became available in the 2.2 kernels, while
|
|
||||||
<structname>struct usb_hcd</structname> is a more featureful
|
|
||||||
layer (available in later 2.4 kernels and in 2.5) that
|
|
||||||
lets HCDs share common code, to shrink driver size
|
|
||||||
and significantly reduce hcd-specific behaviors.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
!Edrivers/usb/core/hcd.c
|
|
||||||
!Edrivers/usb/core/hcd-pci.c
|
|
||||||
!Idrivers/usb/core/buffer.c
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
<chapter id="usbfs">
|
|
||||||
<title>The USB Filesystem (usbfs)</title>
|
|
||||||
|
|
||||||
<para>This chapter presents the Linux <emphasis>usbfs</emphasis>.
|
|
||||||
You may prefer to avoid writing new kernel code for your
|
|
||||||
USB driver; that's the problem that usbfs set out to solve.
|
|
||||||
User mode device drivers are usually packaged as applications
|
|
||||||
or libraries, and may use usbfs through some programming library
|
|
||||||
that wraps it. Such libraries include
|
|
||||||
<ulink url="http://libusb.sourceforge.net">libusb</ulink>
|
|
||||||
for C/C++, and
|
|
||||||
<ulink url="http://jUSB.sourceforge.net">jUSB</ulink> for Java.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<note><title>Unfinished</title>
|
|
||||||
<para>This particular documentation is incomplete,
|
|
||||||
especially with respect to the asynchronous mode.
|
|
||||||
As of kernel 2.5.66 the code and this (new) documentation
|
|
||||||
need to be cross-reviewed.
|
|
||||||
</para>
|
|
||||||
</note>
|
|
||||||
|
|
||||||
<para>Configure usbfs into Linux kernels by enabling the
|
|
||||||
<emphasis>USB filesystem</emphasis> option (CONFIG_USB_DEVICEFS),
|
|
||||||
and you get basic support for user mode USB device drivers.
|
|
||||||
Until relatively recently it was often (confusingly) called
|
|
||||||
<emphasis>usbdevfs</emphasis> although it wasn't solving what
|
|
||||||
<emphasis>devfs</emphasis> was.
|
|
||||||
Every USB device will appear in usbfs, regardless of whether or
|
|
||||||
not it has a kernel driver.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<sect1 id="usbfs-files">
|
|
||||||
<title>What files are in "usbfs"?</title>
|
|
||||||
|
|
||||||
<para>Conventionally mounted at
|
|
||||||
<filename>/proc/bus/usb</filename>, usbfs
|
|
||||||
features include:
|
|
||||||
<itemizedlist>
|
|
||||||
<listitem><para><filename>/proc/bus/usb/devices</filename>
|
|
||||||
... a text file
|
|
||||||
showing each of the USB devices known to the kernel,
|
|
||||||
and their configuration descriptors.
|
|
||||||
You can also poll() this to learn about new devices.
|
|
||||||
</para></listitem>
|
|
||||||
<listitem><para><filename>/proc/bus/usb/BBB/DDD</filename>
|
|
||||||
... magic files
|
|
||||||
exposing each device's configuration descriptors, and
|
|
||||||
supporting a series of ioctls for making device requests,
|
|
||||||
including I/O to devices. (Purely for access by programs.)
|
|
||||||
</para></listitem>
|
|
||||||
</itemizedlist>
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para> Each bus is given a number (BBB) based on when it was
|
|
||||||
enumerated; within each bus, each device is given a similar
|
|
||||||
number (DDD).
|
|
||||||
Those BBB/DDD paths are not "stable" identifiers;
|
|
||||||
expect them to change even if you always leave the devices
|
|
||||||
plugged in to the same hub port.
|
|
||||||
<emphasis>Don't even think of saving these in application
|
|
||||||
configuration files.</emphasis>
|
|
||||||
Stable identifiers are available, for user mode applications
|
|
||||||
that want to use them. HID and networking devices expose
|
|
||||||
these stable IDs, so that for example you can be sure that
|
|
||||||
you told the right UPS to power down its second server.
|
|
||||||
"usbfs" doesn't (yet) expose those IDs.
|
|
||||||
</para>
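Because the BBB/DDD numbers are not stable, a program would typically rebuild the path each time it has looked up the current bus and device numbers. The helper below is a hypothetical sketch that assumes BBB and DDD are three-digit, zero-padded decimal numbers, as the placeholders suggest.

```c
#include <assert.h>
#include <stdio.h>

/* Build the usbfs device file path for the given bus and device numbers.
 * Returns 0 on success, -1 if a number does not fit three digits or the
 * buffer is too small. */
static int usbfs_device_path(char *buf, size_t size, int bus, int dev)
{
	int n;

	assert(buf != NULL);
	if (bus < 1 || bus > 999 || dev < 1 || dev > 999)
		return -1;
	n = snprintf(buf, size, "/proc/bus/usb/%03d/%03d", bus, dev);
	return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```

The result is meant to be used immediately and discarded, never written to a configuration file.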
|
|
||||||
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="usbfs-fstab">
|
|
||||||
<title>Mounting and Access Control</title>
|
|
||||||
|
|
||||||
<para>There are a number of mount options for usbfs, which will
|
|
||||||
be of most interest to you if you need to override the default
|
|
||||||
access control policy.
|
|
||||||
That policy is that only root may read or write device files
|
|
||||||
(<filename>/proc/bus/usb/BBB/DDD</filename>) although anyone may read
|
|
||||||
the <filename>devices</filename>
|
|
||||||
or <filename>drivers</filename> files.
|
|
||||||
I/O requests to the device also need the CAP_SYS_RAWIO capability.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>The significance of that is that by default, all user mode
|
|
||||||
device drivers need super-user privileges.
|
|
||||||
You can change modes or ownership in a driver setup
|
|
||||||
when the device hotplugs, or maybe just start the
|
|
||||||
driver right then, as a privileged server (or some activity
|
|
||||||
within one).
|
|
||||||
That's the most secure approach for multi-user systems,
|
|
||||||
but for single user systems ("trusted" by that user)
|
|
||||||
it's more convenient just to grant everyone all access
|
|
||||||
(using the <emphasis>devmode=0666</emphasis> option)
|
|
||||||
so the driver can start whenever it's needed.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>The mount options for usbfs, usable in /etc/fstab or
|
|
||||||
in command line invocations of <emphasis>mount</emphasis>, are:
|
|
||||||
|
|
||||||
<variablelist>
|
|
||||||
<varlistentry>
|
|
||||||
<term><emphasis>busgid</emphasis>=NNNNN</term>
|
|
||||||
<listitem><para>Controls the GID used for the
|
|
||||||
/proc/bus/usb/BBB
|
|
||||||
directories. (Default: 0)</para></listitem></varlistentry>
|
|
||||||
<varlistentry><term><emphasis>busmode</emphasis>=MMM</term>
|
|
||||||
<listitem><para>Controls the file mode used for the
|
|
||||||
/proc/bus/usb/BBB
|
|
||||||
directories. (Default: 0555)
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
<varlistentry><term><emphasis>busuid</emphasis>=NNNNN</term>
|
|
||||||
<listitem><para>Controls the UID used for the
|
|
||||||
/proc/bus/usb/BBB
|
|
||||||
directories. (Default: 0)</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term><emphasis>devgid</emphasis>=NNNNN</term>
|
|
||||||
<listitem><para>Controls the GID used for the
|
|
||||||
/proc/bus/usb/BBB/DDD
|
|
||||||
files. (Default: 0)</para></listitem></varlistentry>
|
|
||||||
<varlistentry><term><emphasis>devmode</emphasis>=MMM</term>
|
|
||||||
<listitem><para>Controls the file mode used for the
|
|
||||||
/proc/bus/usb/BBB/DDD
|
|
||||||
files. (Default: 0644)</para></listitem></varlistentry>
|
|
||||||
<varlistentry><term><emphasis>devuid</emphasis>=NNNNN</term>
|
|
||||||
<listitem><para>Controls the UID used for the
|
|
||||||
/proc/bus/usb/BBB/DDD
|
|
||||||
files. (Default: 0)</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term><emphasis>listgid</emphasis>=NNNNN</term>
|
|
||||||
<listitem><para>Controls the GID used for the
|
|
||||||
/proc/bus/usb/devices and drivers files.
|
|
||||||
(Default: 0)</para></listitem></varlistentry>
|
|
||||||
<varlistentry><term><emphasis>listmode</emphasis>=MMM</term>
|
|
||||||
<listitem><para>Controls the file mode used for the
|
|
||||||
/proc/bus/usb/devices and drivers files.
|
|
||||||
(Default: 0444)</para></listitem></varlistentry>
|
|
||||||
<varlistentry><term><emphasis>listuid</emphasis>=NNNNN</term>
|
|
||||||
<listitem><para>Controls the UID used for the
|
|
||||||
/proc/bus/usb/devices and drivers files.
|
|
||||||
(Default: 0)</para></listitem></varlistentry>
|
|
||||||
</variablelist>
|
|
||||||
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Note that many Linux distributions hard-wire the mount options
|
|
||||||
for usbfs in their init scripts, such as
|
|
||||||
<filename>/etc/rc.d/rc.sysinit</filename>,
|
|
||||||
rather than making it easy to set this per-system
|
|
||||||
policy in <filename>/etc/fstab</filename>.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="usbfs-devices">
|
|
||||||
<title>/proc/bus/usb/devices</title>
|
|
||||||
|
|
||||||
<para>This file is handy for status viewing tools in user
|
|
||||||
mode, which can scan the text format and ignore most of it.
|
|
||||||
More detailed device status (including class and vendor
|
|
||||||
status) is available from device-specific files.
|
|
||||||
For information about the current format of this file,
|
|
||||||
see the
|
|
||||||
<filename>Documentation/usb/proc_usb_info.txt</filename>
|
|
||||||
file in your Linux kernel sources.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>This file, in combination with the poll() system call, can
|
|
||||||
also be used to detect when devices are added or removed:
|
|
||||||
<programlisting>int fd;
|
|
||||||
struct pollfd pfd;
|
|
||||||
|
|
||||||
fd = open("/proc/bus/usb/devices", O_RDONLY);
|
|
||||||
pfd.fd = fd; pfd.events = POLLIN; pfd.revents = 0;
|
|
||||||
for (;;) {
|
|
||||||
/* The first time through, this call will return immediately. */
|
|
||||||
poll(&pfd, 1, -1);
|
|
||||||
|
|
||||||
/* To see what's changed, compare the file's previous and current
|
|
||||||
contents or scan the filesystem. (Scanning is more precise.) */
|
|
||||||
}</programlisting>
|
|
||||||
Note that this behavior is intended to be used for informational
|
|
||||||
and debug purposes. It would be more appropriate to use programs
|
|
||||||
such as udev or HAL to initialize a device or start a user-mode
|
|
||||||
helper program, for instance.
|
|
||||||
</para>
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="usbfs-bbbddd">
|
|
||||||
<title>/proc/bus/usb/BBB/DDD</title>
|
|
||||||
|
|
||||||
<para>Use these files in one of these basic ways:
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para><emphasis>They can be read,</emphasis>
|
|
||||||
producing first the device descriptor
|
|
||||||
(18 bytes) and then the descriptors for the current configuration.
|
|
||||||
See the USB 2.0 spec for details about those binary data formats.
|
|
||||||
You'll need to convert most multibyte values from little endian
|
|
||||||
format to your native host byte order, although a few of the
|
|
||||||
fields in the device descriptor (both of the BCD-encoded fields,
|
|
||||||
and the vendor and product IDs) will be byteswapped for you.
|
|
||||||
Note that configuration descriptors include descriptors for
|
|
||||||
interfaces, altsettings, endpoints, and maybe additional
|
|
||||||
class descriptors.
|
|
||||||
</para>
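<para>To make the layout described above more concrete, here is a minimal sketch that decodes a little-endian 16-bit field and walks the descriptors that follow the 18-byte device descriptor. It relies only on the bLength and bDescriptorType bytes that start every descriptor. The names <function>le16_at</function>, <function>count_descriptors</function>, and <emphasis>struct desc_counts</emphasis> are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 16-bit field regardless of host byte order. */
static uint16_t le16_at(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Standard descriptor type codes from Chapter 9 of the USB 2.0 spec. */
enum { DT_CONFIG = 2, DT_INTERFACE = 4, DT_ENDPOINT = 5 };

struct desc_counts {
    unsigned interfaces;   /* interface descriptors, altsettings included */
    unsigned endpoints;    /* endpoint descriptors */
    unsigned other;        /* class- or vendor-specific descriptors */
};

/*
 * Walk a buffer holding the descriptors of the current configuration.
 * Returns 0 on success, -1 if a descriptor is truncated or malformed.
 */
static int count_descriptors(const unsigned char *buf, size_t len,
                             struct desc_counts *out)
{
    size_t off = 0;

    out->interfaces = out->endpoints = out->other = 0;
    while (off < len) {
        unsigned char blen, btype;

        if (len - off < 2)
            return -1;              /* no room for bLength + bDescriptorType */
        blen = buf[off];
        btype = buf[off + 1];
        if (blen < 2 || blen > len - off)
            return -1;              /* descriptor runs past the buffer */

        switch (btype) {
        case DT_CONFIG:
            break;                  /* the configuration header itself */
        case DT_INTERFACE:
            out->interfaces++;
            break;
        case DT_ENDPOINT:
            out->endpoints++;
            break;
        default:
            out->other++;
            break;
        }
        off += blen;
    }
    return 0;
}
```

A caller would typically read the file into a buffer, skip the first 18 bytes, and pass the remainder to <function>count_descriptors</function>; wTotalLength in the configuration descriptor can be decoded with <function>le16_at</function>.
</para>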
|
|
||||||
|
|
||||||
<para><emphasis>Perform USB operations</emphasis> using
|
|
||||||
<emphasis>ioctl()</emphasis> requests to make endpoint I/O
|
|
||||||
requests (synchronously or asynchronously) or manage
|
|
||||||
the device.
|
|
||||||
These requests need the CAP_SYS_RAWIO capability,
|
|
||||||
as well as filesystem access permissions.
|
|
||||||
Only one ioctl request can be made on one of these
|
|
||||||
device files at a time.
|
|
||||||
This means that if you are synchronously reading an endpoint
|
|
||||||
from one thread, you won't be able to write to a different
|
|
||||||
endpoint from another thread until the read completes.
|
|
||||||
This works for <emphasis>half duplex</emphasis> protocols,
|
|
||||||
but otherwise you'd use asynchronous I/O requests.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
|
|
||||||
<sect1 id="usbfs-lifecycle">
|
|
||||||
<title>Life Cycle of User Mode Drivers</title>
|
|
||||||
|
|
||||||
<para>Such a driver first needs to find a device file
|
|
||||||
for a device it knows how to handle.
|
|
||||||
Maybe it was told about it because a
|
|
||||||
<filename>/sbin/hotplug</filename> event handling agent
|
|
||||||
chose that driver to handle the new device.
|
|
||||||
Or maybe it's an application that scans all the
|
|
||||||
/proc/bus/usb device files, and ignores most devices.
|
|
||||||
In either case, it should <function>read()</function> all
|
|
||||||
the descriptors from the device file,
|
|
||||||
and check them against what it knows how to handle.
|
|
||||||
It might just reject everything except a particular
|
|
||||||
vendor and product ID, or need a more complex policy.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Never assume there will only be one such device
|
|
||||||
on the system at a time!
|
|
||||||
If your code can't handle more than one device at
|
|
||||||
a time, at least detect when there's more than one, and
|
|
||||||
have your users choose which device to use.
|
|
||||||
</para>
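<para>The matching policy and the "more than one device" check described above could look roughly like the following sketch. The vendor and product IDs are made up, and <function>is_supported</function> and <function>pick_device</function> are hypothetical names, not part of any kernel or library interface.

```c
#include <stddef.h>

struct usb_id {
    unsigned short vendor;
    unsigned short product;
};

/* Made-up IDs for illustration only. */
static const struct usb_id supported[] = {
    { 0x1234, 0x5678 },
    { 0x1234, 0x5679 },
};

/* Return nonzero if this vendor/product pair is one the driver handles. */
static int is_supported(unsigned short vendor, unsigned short product)
{
    size_t i;

    for (i = 0; i < sizeof supported / sizeof supported[0]; i++)
        if (supported[i].vendor == vendor && supported[i].product == product)
            return 1;
    return 0;
}

/*
 * Given the IDs of every device found during a scan, return the index of
 * the single supported device, -1 if there is none, or -2 if more than one
 * matches and the user has to choose.
 */
static int pick_device(const struct usb_id *found, size_t n)
{
    int match = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!is_supported(found[i].vendor, found[i].product))
            continue;
        if (match >= 0)
            return -2;
        match = (int)i;
    }
    return match;
}
```
</para>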
|
|
||||||
|
|
||||||
<para>Once your user mode driver knows what device to use,
|
|
||||||
it interacts with it in either of two styles.
|
|
||||||
The simple style is to make only control requests; some
|
|
||||||
devices don't need more complex interactions than those.
|
|
||||||
(An example might be software using vendor-specific control
|
|
||||||
requests for some initialization or configuration tasks,
|
|
||||||
with a kernel driver for the rest.)
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>More likely, you need a more complex style driver:
|
|
||||||
one using non-control endpoints, reading or writing data
|
|
||||||
and claiming exclusive use of an interface.
|
|
||||||
<emphasis>Bulk</emphasis> transfers are easiest to use,
|
|
||||||
but only their sibling <emphasis>interrupt</emphasis> transfers
|
|
||||||
work with low speed devices.
|
|
||||||
Both interrupt and <emphasis>isochronous</emphasis> transfers
|
|
||||||
offer service guarantees because their bandwidth is reserved.
|
|
||||||
Such "periodic" transfers are awkward to use through usbfs,
|
|
||||||
unless you're using the asynchronous calls. However, interrupt
|
|
||||||
transfers can also be used in a synchronous "one shot" style.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Your user-mode driver should never need to worry
|
|
||||||
about cleaning up request state when the device is
|
|
||||||
disconnected, although it should close its open file
|
|
||||||
descriptors as soon as it starts seeing the ENODEV
|
|
||||||
errors.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
<sect1 id="usbfs-ioctl"><title>The ioctl() Requests</title>
|
|
||||||
|
|
||||||
<para>To use these ioctls, you need to include the following
|
|
||||||
headers in your userspace program:
|
|
||||||
<programlisting>#include <linux/usb.h>
|
|
||||||
#include <linux/usbdevice_fs.h>
|
|
||||||
#include <asm/byteorder.h></programlisting>
|
|
||||||
The standard USB device model requests, from "Chapter 9" of
|
|
||||||
the USB 2.0 specification, are automatically included from
|
|
||||||
the <filename><linux/usb/ch9.h></filename> header.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Unless noted otherwise, the ioctl requests
|
|
||||||
described here will
|
|
||||||
update the modification time on the usbfs file to which
|
|
||||||
they are applied (unless they fail).
|
|
||||||
A return of zero indicates success; otherwise, a
|
|
||||||
standard USB error code is returned. (These are
|
|
||||||
documented in
|
|
||||||
<filename>Documentation/usb/error-codes.txt</filename>
|
|
||||||
in your kernel sources.)
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Each of these files multiplexes access to several
|
|
||||||
I/O streams, one per endpoint.
|
|
||||||
Each device has one control endpoint (endpoint zero)
|
|
||||||
which supports limited RPC-style access.
|
|
||||||
Devices are configured
|
|
||||||
by hub_wq (in the kernel) setting a device-wide
|
|
||||||
<emphasis>configuration</emphasis> that affects things
|
|
||||||
like power consumption and basic functionality.
|
|
||||||
The endpoints are part of USB <emphasis>interfaces</emphasis>,
|
|
||||||
which may have <emphasis>altsettings</emphasis>
|
|
||||||
affecting things like which endpoints are available.
|
|
||||||
Many devices only have a single configuration and interface,
|
|
||||||
so drivers for them will ignore configurations and altsettings.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
|
|
||||||
<sect2 id="usbfs-mgmt">
|
|
||||||
<title>Management/Status Requests</title>
|
|
||||||
|
|
||||||
<para>A number of usbfs requests don't deal very directly
|
|
||||||
with device I/O.
|
|
||||||
They mostly relate to device management and status.
|
|
||||||
These are all synchronous requests.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<variablelist>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_CLAIMINTERFACE</term>
|
|
||||||
<listitem><para>This is used to force usbfs to
|
|
||||||
claim a specific interface,
|
|
||||||
which has not previously been claimed by usbfs or any other
|
|
||||||
kernel driver.
|
|
||||||
The ioctl parameter is an integer holding the number of
|
|
||||||
the interface (bInterfaceNumber from descriptor).
|
|
||||||
</para><para>
|
|
||||||
Note that if your driver doesn't claim an interface
|
|
||||||
before trying to use one of its endpoints, and no
|
|
||||||
other driver has bound to it, then the interface is
|
|
||||||
automatically claimed by usbfs.
|
|
||||||
</para><para>
|
|
||||||
This claim will be released by a RELEASEINTERFACE ioctl,
|
|
||||||
or by closing the file descriptor.
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_CONNECTINFO</term>
|
|
||||||
<listitem><para>Says whether the device is lowspeed.
|
|
||||||
The ioctl parameter points to a structure like this:
|
|
||||||
<programlisting>struct usbdevfs_connectinfo {
|
|
||||||
unsigned int devnum;
|
|
||||||
unsigned char slow;
|
|
||||||
}; </programlisting>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><para>
|
|
||||||
<emphasis>You can't tell whether a "not slow"
|
|
||||||
device is connected at high speed (480 MBit/sec)
|
|
||||||
or just full speed (12 MBit/sec).</emphasis>
|
|
||||||
You should know the devnum value already;
|
|
||||||
it's the DDD value of the device file name.
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_GETDRIVER</term>
|
|
||||||
<listitem><para>Returns the name of the kernel driver
|
|
||||||
bound to a given interface (a string). Parameter
|
|
||||||
is a pointer to this structure, which is modified:
|
|
||||||
<programlisting>struct usbdevfs_getdriver {
|
|
||||||
unsigned int interface;
|
|
||||||
char driver[USBDEVFS_MAXDRIVERNAME + 1];
|
|
||||||
};</programlisting>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_IOCTL</term>
|
|
||||||
<listitem><para>Passes a request from userspace through
|
|
||||||
to a kernel driver that has an ioctl entry in the
|
|
||||||
<emphasis>struct usb_driver</emphasis> it registered.
|
|
||||||
<programlisting>struct usbdevfs_ioctl {
|
|
||||||
int ifno;
|
|
||||||
int ioctl_code;
|
|
||||||
void *data;
|
|
||||||
};
|
|
||||||
|
|
||||||
/* user mode call looks like this.
|
|
||||||
* 'request' becomes the driver->ioctl() 'code' parameter.
|
|
||||||
* the size of 'param' is encoded in 'request', and that data
|
|
||||||
* is copied to or from the driver->ioctl() 'buf' parameter.
|
|
||||||
*/
|
|
||||||
static int
|
|
||||||
usbdev_ioctl (int fd, int ifno, unsigned request, void *param)
|
|
||||||
{
|
|
||||||
struct usbdevfs_ioctl wrapper;
|
|
||||||
|
|
||||||
wrapper.ifno = ifno;
|
|
||||||
wrapper.ioctl_code = request;
|
|
||||||
wrapper.data = param;
|
|
||||||
|
|
||||||
return ioctl (fd, USBDEVFS_IOCTL, &wrapper);
|
|
||||||
} </programlisting>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><para>
|
|
||||||
This request lets kernel drivers talk to user mode code
|
|
||||||
through filesystem operations even when they don't create
|
|
||||||
a character or block special device.
|
|
||||||
It's also been used to do things like ask devices what
|
|
||||||
device special file should be used.
|
|
||||||
Two pre-defined ioctls are used
|
|
||||||
to disconnect and reconnect kernel drivers, so
|
|
||||||
that user mode code can completely manage binding
|
|
||||||
and configuration of devices.
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_RELEASEINTERFACE</term>
|
|
||||||
<listitem><para>This is used to release the claim usbfs
|
|
||||||
made on interface, either implicitly or because of a
|
|
||||||
USBDEVFS_CLAIMINTERFACE call, before the file
|
|
||||||
descriptor is closed.
|
|
||||||
The ioctl parameter is an integer holding the number of
|
|
||||||
the interface (bInterfaceNumber from descriptor).
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><warning><para>
|
|
||||||
<emphasis>No security check is made to ensure
|
|
||||||
that the task which made the claim is the one
|
|
||||||
which is releasing it.
|
|
||||||
This means that a user mode driver may interfere with
|
|
||||||
other ones. </emphasis>
|
|
||||||
</para></warning></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_RESETEP</term>
|
|
||||||
<listitem><para>Resets the data toggle value for an endpoint
|
|
||||||
(bulk or interrupt) to DATA0.
|
|
||||||
The ioctl parameter is an integer endpoint number
|
|
||||||
(1 to 15, as identified in the endpoint descriptor),
|
|
||||||
with USB_DIR_IN added if the device's endpoint sends
|
|
||||||
data to the host.
|
|
||||||
</para><warning><para>
|
|
||||||
<emphasis>Avoid using this request.
|
|
||||||
It should probably be removed.</emphasis>
|
|
||||||
Using it typically means the device and driver will lose
|
|
||||||
toggle synchronization. If you really lost synchronization,
|
|
||||||
you likely need to completely handshake with the device,
|
|
||||||
using a request like CLEAR_HALT
|
|
||||||
or SET_INTERFACE.
|
|
||||||
</para></warning></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_DROP_PRIVILEGES</term>
|
|
||||||
<listitem><para>This is used to relinquish the ability
|
|
||||||
to do certain operations which are considered to be
|
|
||||||
privileged on a usbfs file descriptor.
|
|
||||||
This includes claiming arbitrary interfaces, resetting
|
|
||||||
a device on which there are currently claimed interfaces
|
|
||||||
from other users, and issuing USBDEVFS_IOCTL calls.
|
|
||||||
The ioctl parameter is a 32 bit mask of interfaces
|
|
||||||
the user is allowed to claim on this file descriptor.
|
|
||||||
You may issue this ioctl more than one time to narrow
|
|
||||||
said mask.
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
</variablelist>
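<para>For the USBDEVFS_DROP_PRIVILEGES mask described above, a minimal sketch of how the 32 bit value might be assembled is shown below. It assumes that bit N of the mask stands for interface number N, and that repeating the request can only clear bits (which is how "narrowing" is interpreted here). Both <function>interface_mask</function> and <function>narrow_mask</function> are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Build a 32-bit interface mask from a list of interface numbers.
 * Assumption: bit N stands for interface number N.
 * Returns 0 and stores the mask, or -1 if a number does not fit in 32 bits.
 */
static int interface_mask(const unsigned *ifnums, size_t n, uint32_t *mask)
{
    uint32_t m = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (ifnums[i] > 31)
            return -1;
        m |= (uint32_t)1 << ifnums[i];
    }
    *mask = m;
    return 0;
}

/* Model of a repeated request: the new mask can only remove interfaces. */
static uint32_t narrow_mask(uint32_t current, uint32_t requested)
{
    return current & requested;
}
```

The resulting value would be what the caller passes as the ioctl parameter.
</para>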
|
|
||||||
|
|
||||||
</sect2>
|
|
||||||
|
|
||||||
<sect2 id="usbfs-sync">
|
|
||||||
<title>Synchronous I/O Support</title>
|
|
||||||
|
|
||||||
<para>Synchronous requests involve the kernel blocking
|
|
||||||
until the user mode request completes, either by
|
|
||||||
finishing successfully or by reporting an error.
|
|
||||||
In most cases this is the simplest way to use usbfs,
|
|
||||||
although as noted above it does prevent performing I/O
|
|
||||||
to more than one endpoint at a time.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<variablelist>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_BULK</term>
|
|
||||||
<listitem><para>Issues a bulk read or write request to the
|
|
||||||
device.
|
|
||||||
The ioctl parameter is a pointer to this structure:
|
|
||||||
<programlisting>struct usbdevfs_bulktransfer {
|
|
||||||
unsigned int ep;
|
|
||||||
unsigned int len;
|
|
||||||
unsigned int timeout; /* in milliseconds */
|
|
||||||
void *data;
|
|
||||||
};</programlisting>
|
|
||||||
</para><para>The "ep" value identifies a
|
|
||||||
bulk endpoint number (1 to 15, as identified in an endpoint
|
|
||||||
descriptor),
|
|
||||||
ORed with USB_DIR_IN when referring to an endpoint which
|
|
||||||
sends data to the host from the device.
|
|
||||||
The length of the data buffer is identified by "len".
|
|
||||||
Recent kernels support requests up to about 128KBytes.
|
|
||||||
<emphasis>FIXME say how read length is returned,
|
|
||||||
and how short reads are handled.</emphasis>
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_CLEAR_HALT</term>
|
|
||||||
<listitem><para>Clears endpoint halt (stall) and
|
|
||||||
resets the endpoint toggle. This is only
|
|
||||||
meaningful for bulk or interrupt endpoints.
|
|
||||||
The ioctl parameter is an integer endpoint number
|
|
||||||
(1 to 15, as identified in an endpoint descriptor),
|
|
||||||
ORed with USB_DIR_IN when referring to an endpoint which
|
|
||||||
sends data to the host from the device.
|
|
||||||
</para><para>
|
|
||||||
Use this on bulk or interrupt endpoints which have
|
|
||||||
stalled, returning <emphasis>-EPIPE</emphasis> status
|
|
||||||
to a data transfer request.
|
|
||||||
Do not issue the control request directly, since
|
|
||||||
that could invalidate the host's record of the
|
|
||||||
data toggle.
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_CONTROL</term>
|
|
||||||
<listitem><para>Issues a control request to the device.
|
|
||||||
The ioctl parameter points to a structure like this:
|
|
||||||
<programlisting>struct usbdevfs_ctrltransfer {
|
|
||||||
__u8 bRequestType;
|
|
||||||
__u8 bRequest;
|
|
||||||
__u16 wValue;
|
|
||||||
__u16 wIndex;
|
|
||||||
__u16 wLength;
|
|
||||||
__u32 timeout; /* in milliseconds */
|
|
||||||
void *data;
|
|
||||||
};</programlisting>
|
|
||||||
</para><para>
|
|
||||||
The first eight bytes of this structure are the contents
|
|
||||||
of the SETUP packet to be sent to the device; see the
|
|
||||||
USB 2.0 specification for details.
|
|
||||||
The bRequestType value is composed by combining a
|
|
||||||
USB_TYPE_* value, a USB_DIR_* value, and a
|
|
||||||
USB_RECIP_* value (from
|
|
||||||
<emphasis><linux/usb.h></emphasis>).
|
|
||||||
If wLength is nonzero, it describes the length of the data
|
|
||||||
buffer, which is either written to the device
|
|
||||||
(USB_DIR_OUT) or read from the device (USB_DIR_IN).
|
|
||||||
</para><para>
|
|
||||||
At this writing, you can't transfer more than 4 KBytes
|
|
||||||
of data to or from a device; usbfs has a limit, and
|
|
||||||
some host controller drivers have a limit.
|
|
||||||
(That's not usually a problem.)
|
|
||||||
<emphasis>Also</emphasis> there's no way to say it's
|
|
||||||
not OK to get a short read back from the device.
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_RESET</term>
|
|
||||||
<listitem><para>Does a USB level device reset.
|
|
||||||
The ioctl parameter is ignored.
|
|
||||||
After the reset, this rebinds all device interfaces.
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><warning><para>
|
|
||||||
<emphasis>Avoid using this call</emphasis>
|
|
||||||
until some usbcore bugs get fixed,
|
|
||||||
since it does not fully synchronize device, interface,
|
|
||||||
and driver (not just usbfs) state.
|
|
||||||
</para></warning></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_SETINTERFACE</term>
|
|
||||||
<listitem><para>Sets the alternate setting for an
|
|
||||||
interface. The ioctl parameter is a pointer to a
|
|
||||||
structure like this:
|
|
||||||
<programlisting>struct usbdevfs_setinterface {
|
|
||||||
unsigned int interface;
|
|
||||||
unsigned int altsetting;
|
|
||||||
}; </programlisting>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><para>
|
|
||||||
Those struct members are from some interface descriptor
|
|
||||||
applying to the current configuration.
|
|
||||||
The interface number is the bInterfaceNumber value, and
|
|
||||||
the altsetting number is the bAlternateSetting value.
|
|
||||||
(This resets each endpoint in the interface.)
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_SETCONFIGURATION</term>
|
|
||||||
<listitem><para>Issues the
|
|
||||||
<function>usb_set_configuration</function> call
|
|
||||||
for the device.
|
|
||||||
The parameter is an integer holding the number of
|
|
||||||
a configuration (bConfigurationValue from descriptor).
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><warning><para>
|
|
||||||
<emphasis>Avoid using this call</emphasis>
|
|
||||||
until some usbcore bugs get fixed,
|
|
||||||
since it does not fully synchronize device, interface,
|
|
||||||
and driver (not just usbfs) state.
|
|
||||||
</para></warning></listitem></varlistentry>
|
|
||||||
|
|
||||||
</variablelist>
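<para>The "ep" and bRequestType values used by the synchronous requests above are simple bit combinations. The sketch below shows one way to compose them. The DEMO_* constants repeat the values from Chapter 9 of the USB 2.0 specification so that the example compiles on its own; real code would use the USB_DIR_*, USB_TYPE_*, and USB_RECIP_* names from the kernel headers instead. The helper names are invented.

```c
#include <stdint.h>

/* Values from Chapter 9 of the USB 2.0 specification, redefined locally
 * only so this sketch is self-contained. */
#define DEMO_DIR_OUT         0x00
#define DEMO_DIR_IN          0x80
#define DEMO_TYPE_STANDARD   (0x00 << 5)
#define DEMO_TYPE_CLASS      (0x01 << 5)
#define DEMO_TYPE_VENDOR     (0x02 << 5)
#define DEMO_RECIP_DEVICE    0x00
#define DEMO_RECIP_INTERFACE 0x01
#define DEMO_RECIP_ENDPOINT  0x02

/*
 * Compose an endpoint address for a bulk or interrupt endpoint.
 * number must be 1..15; is_in is nonzero when the device sends data
 * to the host. Returns the address, or -1 for an invalid number.
 */
static int ep_address(unsigned number, int is_in)
{
    if (number < 1 || number > 15)
        return -1;
    return (int)(number | (is_in ? DEMO_DIR_IN : DEMO_DIR_OUT));
}

/* Compose bRequestType from a direction, a type, and a recipient. */
static uint8_t request_type(uint8_t dir, uint8_t type, uint8_t recip)
{
    return (uint8_t)(dir | type | recip);
}
```

For instance, a vendor-specific read addressed to an interface combines DEMO_DIR_IN, DEMO_TYPE_VENDOR, and DEMO_RECIP_INTERFACE.
</para>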
|
|
||||||
</sect2>
|
|
||||||
|
|
||||||
<sect2 id="usbfs-async">
|
|
||||||
<title>Asynchronous I/O Support</title>
|
|
||||||
|
|
||||||
<para>As mentioned above, there are situations where it may be
|
|
||||||
important to initiate concurrent operations from user mode code.
|
|
||||||
This is particularly important for periodic transfers
|
|
||||||
(interrupt and isochronous), but it can be used for other
|
|
||||||
kinds of USB requests too.
|
|
||||||
In such cases, the asynchronous requests described here
|
|
||||||
are essential. Rather than submitting one request and having
|
|
||||||
the kernel block until it completes, the blocking is separate.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>These requests are packaged into a structure that
|
|
||||||
resembles the URB used by kernel device drivers.
|
|
||||||
(No POSIX Async I/O support here, sorry.)
|
|
||||||
It identifies the endpoint type (USBDEVFS_URB_TYPE_*),
|
|
||||||
endpoint (number, ORed with USB_DIR_IN as appropriate),
|
|
||||||
buffer and length, and a user "context" value serving to
|
|
||||||
uniquely identify each request.
|
|
||||||
(It's usually a pointer to per-request data.)
|
|
||||||
Flags can modify requests (not as many as supported for
|
|
||||||
kernel drivers).
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>Each request can specify a realtime signal number
|
|
||||||
(between SIGRTMIN and SIGRTMAX, inclusive) to request a
|
|
||||||
signal be sent when the request completes.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<para>When usbfs returns these urbs, the status value
|
|
||||||
is updated, and the buffer may have been modified.
|
|
||||||
Except for isochronous transfers, the actual_length is
|
|
||||||
updated to say how many bytes were transferred. When the
|
|
||||||
USBDEVFS_URB_DISABLE_SPD flag is set
|
|
||||||
("short packets are not OK"), if fewer bytes were read
|
|
||||||
than were requested then you get an error report.
|
|
||||||
</para>
|
|
||||||
|
|
||||||
<programlisting>struct usbdevfs_iso_packet_desc {
|
|
||||||
unsigned int length;
|
|
||||||
unsigned int actual_length;
|
|
||||||
unsigned int status;
|
|
||||||
};
|
|
||||||
|
|
||||||
struct usbdevfs_urb {
|
|
||||||
unsigned char type;
|
|
||||||
unsigned char endpoint;
|
|
||||||
int status;
|
|
||||||
unsigned int flags;
|
|
||||||
void *buffer;
|
|
||||||
int buffer_length;
|
|
||||||
int actual_length;
|
|
||||||
int start_frame;
|
|
||||||
int number_of_packets;
|
|
||||||
int error_count;
|
|
||||||
unsigned int signr;
|
|
||||||
void *usercontext;
|
|
||||||
struct usbdevfs_iso_packet_desc iso_frame_desc[];
|
|
||||||
};</programlisting>
|
|
||||||
|
|
||||||
<para> For these asynchronous requests, the file modification
|
|
||||||
time reflects when the request was initiated.
|
|
||||||
This contrasts with their use with the synchronous requests,
|
|
||||||
where it reflects when requests complete.
|
|
||||||
</para>
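<para>Because the urb structure above ends in a variable-length array of packet descriptors, an isochronous request has to be allocated with room for them. The following is a minimal sketch of that allocation. The two structures are local copies of the listing above so that the example compiles without kernel headers, and <function>alloc_iso_urb</function> is a hypothetical helper, not a usbfs interface.

```c
#include <stdlib.h>

/* Local copies of the structures listed above. */
struct usbdevfs_iso_packet_desc {
    unsigned int length;
    unsigned int actual_length;
    unsigned int status;
};

struct usbdevfs_urb {
    unsigned char type;
    unsigned char endpoint;
    int status;
    unsigned int flags;
    void *buffer;
    int buffer_length;
    int actual_length;
    int start_frame;
    int number_of_packets;
    int error_count;
    unsigned int signr;
    void *usercontext;
    struct usbdevfs_iso_packet_desc iso_frame_desc[];
};

/*
 * Allocate a zeroed urb with room for npackets packet descriptors, each
 * describing packet_len bytes. The caller still has to set type, endpoint,
 * and buffer before submitting, and must free() the result.
 */
static struct usbdevfs_urb *alloc_iso_urb(int npackets, unsigned int packet_len)
{
    struct usbdevfs_urb *urb;
    int i;

    if (npackets <= 0)
        return NULL;
    urb = calloc(1, sizeof(*urb) +
                    (size_t)npackets * sizeof(urb->iso_frame_desc[0]));
    if (!urb)
        return NULL;

    urb->number_of_packets = npackets;
    urb->buffer_length = (int)(npackets * packet_len);
    for (i = 0; i < npackets; i++)
        urb->iso_frame_desc[i].length = packet_len;
    return urb;
}
```

The usercontext member is left NULL here; a driver would normally point it at its own per-request data.
</para>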
|
|
||||||
|
|
||||||
<variablelist>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_DISCARDURB</term>
|
|
||||||
<listitem><para>
|
|
||||||
<emphasis>TBS</emphasis>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><para>
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_DISCSIGNAL</term>
|
|
||||||
<listitem><para>
|
|
||||||
<emphasis>TBS</emphasis>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><para>
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_REAPURB</term>
|
|
||||||
<listitem><para>
|
|
||||||
<emphasis>TBS</emphasis>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><para>
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_REAPURBNDELAY</term>
|
|
||||||
<listitem><para>
|
|
||||||
<emphasis>TBS</emphasis>
|
|
||||||
File modification time is not updated by this request.
|
|
||||||
</para><para>
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
<varlistentry><term>USBDEVFS_SUBMITURB</term>
|
|
||||||
<listitem><para>
|
|
||||||
<emphasis>TBS</emphasis>
|
|
||||||
</para><para>
|
|
||||||
</para></listitem></varlistentry>
|
|
||||||
|
|
||||||
</variablelist>
|
|
||||||
</sect2>
|
|
||||||
|
|
||||||
</sect1>
|
|
||||||
|
|
||||||
</chapter>
|
|
||||||
|
|
||||||
</book>
|
|
||||||
<!-- vim:syntax=sgml:sw=4
|
|
||||||
-->
|
|
||||||
File diff suppressed because it is too large
@@ -10,6 +10,8 @@ _SPHINXDIRS = $(patsubst $(srctree)/Documentation/%/conf.py,%,$(wildcard $(src
|
|||||||
SPHINX_CONF = conf.py
|
SPHINX_CONF = conf.py
|
||||||
PAPER =
|
PAPER =
|
||||||
BUILDDIR = $(obj)/output
|
BUILDDIR = $(obj)/output
|
||||||
|
PDFLATEX = xelatex
|
||||||
|
LATEXOPTS = -interaction=batchmode
|
||||||
|
|
||||||
# User-friendly check for sphinx-build
|
# User-friendly check for sphinx-build
|
||||||
HAVE_SPHINX := $(shell if which $(SPHINXBUILD) >/dev/null 2>&1; then echo 1; else echo 0; fi)
|
HAVE_SPHINX := $(shell if which $(SPHINXBUILD) >/dev/null 2>&1; then echo 1; else echo 0; fi)
|
||||||
@@ -29,7 +31,7 @@ else ifneq ($(DOCBOOKS),)
|
|||||||
else # HAVE_SPHINX
|
else # HAVE_SPHINX
|
||||||
|
|
||||||
# User-friendly check for pdflatex
|
# User-friendly check for pdflatex
|
||||||
HAVE_PDFLATEX := $(shell if which xelatex >/dev/null 2>&1; then echo 1; else echo 0; fi)
|
HAVE_PDFLATEX := $(shell if which $(PDFLATEX) >/dev/null 2>&1; then echo 1; else echo 0; fi)
|
||||||
|
|
||||||
# Internal variables.
|
# Internal variables.
|
||||||
PAPEROPT_a4 = -D latex_paper_size=a4
|
PAPEROPT_a4 = -D latex_paper_size=a4
|
||||||
@@ -51,8 +53,8 @@ loop_cmd = $(echo-cmd) $(cmd_$(1))
|
|||||||
# $5 reST source folder relative to $(srctree)/$(src),
|
# $5 reST source folder relative to $(srctree)/$(src),
|
||||||
# e.g. "media" for the linux-tv book-set at ./Documentation/media
|
# e.g. "media" for the linux-tv book-set at ./Documentation/media
|
||||||
|
|
||||||
quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4);
|
quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
|
||||||
cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media all;\
|
cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media $2;\
|
||||||
BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(srctree)/$(src)/$5/$(SPHINX_CONF)) \
|
BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(srctree)/$(src)/$5/$(SPHINX_CONF)) \
|
||||||
$(SPHINXBUILD) \
|
$(SPHINXBUILD) \
|
||||||
-b $2 \
|
-b $2 \
|
||||||
@@ -67,16 +69,19 @@ htmldocs:
|
|||||||
@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
|
@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
|
||||||
|
|
||||||
latexdocs:
|
latexdocs:
|
||||||
ifeq ($(HAVE_PDFLATEX),0)
|
|
||||||
$(warning The 'xelatex' command was not found. Make sure you have it installed and in PATH to produce PDF output.)
|
|
||||||
@echo " SKIP Sphinx $@ target."
|
|
||||||
else # HAVE_PDFLATEX
|
|
||||||
@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))
|
@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))
|
||||||
endif # HAVE_PDFLATEX
|
|
||||||
|
ifeq ($(HAVE_PDFLATEX),0)
|
||||||
|
|
||||||
|
pdfdocs:
|
||||||
|
$(warning The '$(PDFLATEX)' command was not found. Make sure you have it installed and in PATH to produce PDF output.)
|
||||||
|
@echo " SKIP Sphinx $@ target."
|
||||||
|
|
||||||
|
else # HAVE_PDFLATEX
|
||||||
|
|
||||||
pdfdocs: latexdocs
|
pdfdocs: latexdocs
|
||||||
ifneq ($(HAVE_PDFLATEX),0)
|
$(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX=$(PDFLATEX) LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex;)
|
||||||
$(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX=xelatex LATEXOPTS="-interaction=nonstopmode" -C $(BUILDDIR)/$(var)/latex)
|
|
||||||
endif # HAVE_PDFLATEX
|
endif # HAVE_PDFLATEX
|
||||||
|
|
||||||
epubdocs:
|
epubdocs:
|
||||||
@@ -93,6 +98,7 @@ installmandocs:
|
|||||||
|
|
||||||
cleandocs:
|
cleandocs:
|
||||||
$(Q)rm -rf $(BUILDDIR)
|
$(Q)rm -rf $(BUILDDIR)
|
||||||
|
$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) -C Documentation/media clean
|
||||||
|
|
||||||
endif # HAVE_SPHINX
|
endif # HAVE_SPHINX
|
||||||
|
|
||||||
|
|||||||
@@ -1,841 +1 @@
|
|||||||
.. _submittingpatches:
|
This file has moved to process/submitting-patches.rst
|
||||||
|
|
||||||
How to Get Your Change Into the Linux Kernel or Care And Operation Of Your Linus Torvalds
|
|
||||||
=========================================================================================
|
|
||||||
|
|
||||||
For a person or company who wishes to submit a change to the Linux
|
|
||||||
kernel, the process can sometimes be daunting if you're not familiar
|
|
||||||
with "the system." This text is a collection of suggestions which
|
|
||||||
can greatly increase the chances of your change being accepted.
|
|
||||||
|
|
||||||
This document contains a large number of suggestions in a relatively terse
|
|
||||||
format. For detailed information on how the kernel development process
|
|
||||||
works, see :ref:`Documentation/development-process <development_process_main>`.
|
|
||||||
Also, read :ref:`Documentation/SubmitChecklist <submitchecklist>`
|
|
||||||
for a list of items to check before
|
|
||||||
submitting code. If you are submitting a driver, also read
|
|
||||||
:ref:`Documentation/SubmittingDrivers <submittingdrivers>`;
|
|
||||||
for device tree binding patches, read
|
|
||||||
Documentation/devicetree/bindings/submitting-patches.txt.
|
|
||||||
|
|
||||||
Many of these steps describe the default behavior of the ``git`` version
|
|
||||||
control system; if you use ``git`` to prepare your patches, you'll find much
|
|
||||||
of the mechanical work done for you, though you'll still need to prepare
|
|
||||||
and document a sensible set of patches. In general, use of ``git`` will make
|
|
||||||
your life as a kernel developer easier.
|
|
||||||
|
|
||||||
Creating and Sending your Change
|
|
||||||
********************************
|
|
||||||
|
|
||||||
|
|
||||||
0) Obtain a current source tree
|
|
||||||
-------------------------------
|
|
||||||
|
|
||||||
If you do not have a repository with the current kernel source handy, use
|
|
||||||
``git`` to obtain one. You'll want to start with the mainline repository,
|
|
||||||
which can be grabbed with::
|
|
||||||
|
|
||||||
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
|
|
||||||
|
|
||||||
Note, however, that you may not want to develop against the mainline tree
|
|
||||||
directly. Most subsystem maintainers run their own trees and want to see
|
|
||||||
patches prepared against those trees. See the **T:** entry for the subsystem
|
|
||||||
in the MAINTAINERS file to find that tree, or simply ask the maintainer if
|
|
||||||
the tree is not listed there.
|
|
||||||
|
|
||||||
It is still possible to download kernel releases via tarballs (as described
|
|
||||||
in the next section), but that is the hard way to do kernel development.
|
|
||||||
|
|
||||||
1) ``diff -up``
|
|
||||||
---------------
|
|
||||||
|
|
||||||
If you must generate your patches by hand, use ``diff -up`` or ``diff -uprN``
|
|
||||||
to create patches. Git generates patches in this form by default; if
|
|
||||||
you're using ``git``, you can skip this section entirely.
|
|
||||||
|
|
||||||
All changes to the Linux kernel occur in the form of patches, as
|
|
||||||
generated by :manpage:`diff(1)`. When creating your patch, make sure to
|
|
||||||
create it in "unified diff" format, as supplied by the ``-u`` argument
|
|
||||||
to :manpage:`diff(1)`.
|
|
||||||
Also, please use the ``-p`` argument which shows which C function each
|
|
||||||
change is in - that makes the resultant ``diff`` a lot easier to read.
|
|
||||||
Patches should be based in the root kernel source directory,
|
|
||||||
not in any lower subdirectory.
|
|
||||||
|
|
||||||
To create a patch for a single file, it is often sufficient to do::
|
|
||||||
|
|
||||||
SRCTREE= linux
|
|
||||||
MYFILE= drivers/net/mydriver.c
|
|
||||||
|
|
||||||
cd $SRCTREE
|
|
||||||
cp $MYFILE $MYFILE.orig
|
|
||||||
vi $MYFILE # make your change
|
|
||||||
cd ..
|
|
||||||
diff -up $SRCTREE/$MYFILE{.orig,} > /tmp/patch
|
|
||||||
|
|
||||||
To create a patch for multiple files, you should unpack a "vanilla",
|
|
||||||
or unmodified kernel source tree, and generate a ``diff`` against your
|
|
||||||
own source tree. For example::
|
|
||||||
|
|
||||||
MYSRC= /devel/linux
|
|
||||||
|
|
||||||
tar xvfz linux-3.19.tar.gz
|
|
||||||
mv linux-3.19 linux-3.19-vanilla
|
|
||||||
diff -uprN -X linux-3.19-vanilla/Documentation/dontdiff \
|
|
||||||
linux-3.19-vanilla $MYSRC > /tmp/patch
|
|
||||||
|
|
||||||
``dontdiff`` is a list of files which are generated by the kernel during
|
|
||||||
the build process, and should be ignored in any :manpage:`diff(1)`-generated
|
|
||||||
patch.
|
|
||||||
|
|
||||||
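The tree-against-tree recipe above can be tried on a throwaway directory before running it on a real kernel tree. The following is a minimal sketch, assuming GNU diffutils (for ``-p`` and ``-X``); the ``demo-*`` directories, ``demo.c`` and ``dontdiff.demo`` are invented names that stand in for the vanilla tree, your tree and ``Documentation/dontdiff``.

```shell
#!/bin/sh
# Sketch: diff a pristine tree against a modified copy, skipping build
# products. Every file and directory name here is made up for the demo.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work" || exit 1

mkdir -p demo-vanilla/drivers demo-mine/drivers
printf 'int answer(void)\n{\n\treturn 41;\n}\n' > demo-vanilla/drivers/demo.c
printf 'int answer(void)\n{\n\treturn 42;\n}\n' > demo-mine/drivers/demo.c
: > demo-mine/drivers/demo.o       # pretend build product
printf '*.o\n' > dontdiff.demo      # stand-in for Documentation/dontdiff

# diff exits with status 1 when differences were found; that is expected here.
diff -uprN -X dontdiff.demo demo-vanilla demo-mine > demo.patch
status=$?

cat demo.patch
echo "diff exit status: $status"
```

The resulting patch should mention only ``demo.c``; the object file is skipped because its name matches a pattern in the exclude list.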
Make sure your patch does not include any extra files which do not
|
|
||||||
belong in a patch submission. Make sure to review your patch -after-
|
|
||||||
generating it with :manpage:`diff(1)`, to ensure accuracy.
|
|
||||||
|
|
||||||
If your changes produce a lot of deltas, you need to split them into
|
|
||||||
individual patches which modify things in logical stages; see
|
|
||||||
:ref:`split_changes`. This will facilitate review by other kernel developers,
|
|
||||||
very important if you want your patch accepted.
|
|
||||||
|
|
||||||
If you're using ``git``, ``git rebase -i`` can help you with this process. If
|
|
||||||
you're not using ``git``, ``quilt`` <http://savannah.nongnu.org/projects/quilt>
|
|
||||||
is another popular alternative.
|
|
||||||
|
|
||||||
.. _describe_changes:
|
|
||||||
|
|
||||||
2) Describe your changes
|
|
||||||
------------------------
|
|
||||||
|
|
||||||
Describe your problem. Whether your patch is a one-line bug fix or
|
|
||||||
5000 lines of a new feature, there must be an underlying problem that
|
|
||||||
motivated you to do this work. Convince the reviewer that there is a
|
|
||||||
problem worth fixing and that it makes sense for them to read past the
|
|
||||||
first paragraph.
|
|
||||||
|
|
||||||
Describe user-visible impact. Straight up crashes and lockups are
|
|
||||||
pretty convincing, but not all bugs are that blatant. Even if the
|
|
||||||
problem was spotted during code review, describe the impact you think
|
|
||||||
it can have on users. Keep in mind that the majority of Linux
|
|
||||||
installations run kernels from secondary stable trees or
|
|
||||||
vendor/product-specific trees that cherry-pick only specific patches
|
|
||||||
from upstream, so include anything that could help route your change
|
|
||||||
downstream: provoking circumstances, excerpts from dmesg, crash
|
|
||||||
descriptions, performance regressions, latency spikes, lockups, etc.
|
|
||||||
|
|
||||||
Quantify optimizations and trade-offs. If you claim improvements in
|
|
||||||
performance, memory consumption, stack footprint, or binary size,
|
|
||||||
include numbers that back them up. But also describe non-obvious
|
|
||||||
costs. Optimizations usually aren't free but trade-offs between CPU,
|
|
||||||
memory, and readability; or, when it comes to heuristics, between
|
|
||||||
different workloads. Describe the expected downsides of your
|
|
||||||
optimization so that the reviewer can weigh costs against benefits.
|
|
||||||
|
|
||||||
Once the problem is established, describe what you are actually doing
|
|
||||||
about it in technical detail. It's important to describe the change
|
|
||||||
in plain English for the reviewer to verify that the code is behaving
|
|
||||||
as you intend it to.
|
|
||||||
|
|
||||||
The maintainer will thank you if you write your patch description in a
|
|
||||||
form which can be easily pulled into Linux's source code management
|
|
||||||
system, ``git``, as a "commit log". See :ref:`explicit_in_reply_to`.
|
|
||||||
|
|
||||||
Solve only one problem per patch. If your description starts to get
|
|
||||||
long, that's a sign that you probably need to split up your patch.
|
|
||||||
See :ref:`split_changes`.
|
|
||||||
|
|
||||||
When you submit or resubmit a patch or patch series, include the
|
|
||||||
complete patch description and justification for it. Don't just
|
|
||||||
say that this is version N of the patch (series). Don't expect the
|
|
||||||
subsystem maintainer to refer back to earlier patch versions or referenced
|
|
||||||
URLs to find the patch description and put that into the patch.
|
|
||||||
I.e., the patch (series) and its description should be self-contained.
|
|
||||||
This benefits both the maintainers and reviewers. Some reviewers
|
|
||||||
probably didn't even receive earlier versions of the patch.
|
|
||||||
|
|
||||||
Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
|
|
||||||
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
|
|
||||||
to do frotz", as if you are giving orders to the codebase to change
|
|
||||||
its behaviour.
|
|
||||||
|
|
||||||
If the patch fixes a logged bug entry, refer to that bug entry by
|
|
||||||
number and URL. If the patch follows from a mailing list discussion,
|
|
||||||
give a URL to the mailing list archive; use the https://lkml.kernel.org/
|
|
||||||
redirector with a ``Message-Id``, to ensure that the links cannot become
|
|
||||||
stale.
|
|
||||||
|
|
||||||
However, try to make your explanation understandable without external
|
|
||||||
resources. In addition to giving a URL to a mailing list archive or
|
|
||||||
bug, summarize the relevant points of the discussion that led to the
|
|
||||||
patch as submitted.
|
|
||||||
|
|
||||||
If you want to refer to a specific commit, don't just refer to the
|
|
||||||
SHA-1 ID of the commit. Please also include the oneline summary of
|
|
||||||
the commit, to make it easier for reviewers to know what it is about.
|
|
||||||
Example::
|
|
||||||
|
|
||||||
Commit e21d2170f36602ae2708 ("video: remove unnecessary
|
|
||||||
platform_set_drvdata()") removed the unnecessary
|
|
||||||
platform_set_drvdata(), but left the variable "dev" unused,
|
|
||||||
delete it.
|
|
||||||
|
|
||||||
You should also be sure to use at least the first twelve characters of the
|
|
||||||
SHA-1 ID. The kernel repository holds a *lot* of objects, making
|
|
||||||
collisions with shorter IDs a real possibility. Bear in mind that, even if
|
|
||||||
there is no collision with your six-character ID now, that condition may
|
|
||||||
change five years from now.
|
|
||||||
|
|
||||||
If your patch fixes a bug in a specific commit, e.g. you found an issue using
|
|
||||||
``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
|
|
||||||
the SHA-1 ID, and the one line summary. For example::
|
|
||||||
|
|
||||||
Fixes: e21d2170f366 ("video: remove unnecessary platform_set_drvdata()")
|
|
||||||
|
|
||||||
The following ``git config`` settings can be used to add a pretty format for
|
|
||||||
outputting the above style in the ``git log`` or ``git show`` commands::
|
|
||||||
|
|
||||||
[core]
|
|
||||||
abbrev = 12
|
|
||||||
[pretty]
|
|
||||||
fixes = Fixes: %h (\"%s\")
|
|
||||||
|
|
||||||
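With those settings in place, ``git log -1 --pretty=fixes <commit>`` prints a ready-made tag. If you type the tag by hand, a quick format check can catch a too-short ID or missing quotes. The helper below is a hypothetical sketch, not a script from the kernel tree; it only checks the shape of the line, not that the commit exists.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): succeed only if the
# argument looks like a well-formed Fixes: tag, i.e. "Fixes: ", 12 to 40
# lowercase hex digits, then the one-line summary as ("...").
is_fixes_tag() {
    printf '%s\n' "$1" |
        grep -Eq '^Fixes: [0-9a-f]{12,40} \(".+"\)$'
}

for line in \
    'Fixes: e21d2170f366 ("video: remove unnecessary platform_set_drvdata()")' \
    'Fixes: e21d21 ("video: remove unnecessary platform_set_drvdata()")'
do
    if is_fixes_tag "$line"; then
        echo "ok:  $line"
    else
        echo "bad: $line"
    fi
done
```

The second line is rejected because its ID has only six characters.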
.. _split_changes:
|
|
||||||
|
|
||||||
3) Separate your changes
|
|
||||||
------------------------
|
|
||||||
|
|
||||||
Separate each **logical change** into a separate patch.
|
|
||||||
|
|
||||||
For example, if your changes include both bug fixes and performance
|
|
||||||
enhancements for a single driver, separate those changes into two
|
|
||||||
or more patches. If your changes include an API update, and a new
|
|
||||||
driver which uses that new API, separate those into two patches.
|
|
||||||
|
|
||||||
On the other hand, if you make a single change to numerous files,
|
|
||||||
group those changes into a single patch. Thus a single logical change
|
|
||||||
is contained within a single patch.
|
|
||||||
|
|
||||||
The point to remember is that each patch should make an easily understood
|
|
||||||
change that can be verified by reviewers. Each patch should be justifiable
|
|
||||||
on its own merits.
|
|
||||||
|
|
||||||
If one patch depends on another patch in order for a change to be
|
|
||||||
complete, that is OK. Simply note **"this patch depends on patch X"**
|
|
||||||
in your patch description.
|
|
||||||
|
|
||||||
When dividing your change into a series of patches, take special care to
|
|
||||||
ensure that the kernel builds and runs properly after each patch in the
|
|
||||||
series. Developers using ``git bisect`` to track down a problem can end up
|
|
||||||
splitting your patch series at any point; they will not thank you if you
|
|
||||||
introduce bugs in the middle.
|
|
||||||
|
|
||||||
If you cannot condense your patch set into a smaller set of patches,
|
|
||||||
then only post, say, 15 or so at a time and wait for review and integration.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
4) Style-check your changes
|
|
||||||
---------------------------
|
|
||||||
|
|
||||||
Check your patch for basic style violations, details of which can be
|
|
||||||
found in
|
|
||||||
:ref:`Documentation/CodingStyle <codingstyle>`.
|
|
||||||
Failure to do so simply wastes
|
|
||||||
the reviewer's time and will get your patch rejected, probably
|
|
||||||
without even being read.
|
|
||||||
|
|
||||||
One significant exception is when moving code from one file to
|
|
||||||
another -- in this case you should not modify the moved code at all in
|
|
||||||
the same patch which moves it. This clearly delineates the act of
|
|
||||||
moving the code and your changes. This greatly aids review of the
|
|
||||||
actual differences and allows tools to better track the history of
|
|
||||||
the code itself.
|
|
||||||
|
|
||||||
Check your patches with the patch style checker prior to submission
|
|
||||||
(scripts/checkpatch.pl). Note, though, that the style checker should be
|
|
||||||
viewed as a guide, not as a replacement for human judgment. If your code
|
|
||||||
looks better with a violation then it's probably best left alone.
|
|
||||||
|
|
||||||
The checker reports at three levels:
|
|
||||||
- ERROR: things that are very likely to be wrong
|
|
||||||
- WARNING: things requiring careful review
|
|
||||||
- CHECK: things requiring thought
|
|
||||||
|
|
||||||
You should be able to justify all violations that remain in your
|
|
||||||
patch.
|
|
||||||
|
|
||||||
|
|
||||||
5) Select the recipients for your patch
|
|
||||||
---------------------------------------
|
|
||||||
|
|
||||||
You should always copy the appropriate subsystem maintainer(s) on any patch
|
|
||||||
to code that they maintain; look through the MAINTAINERS file and the
|
|
||||||
source code revision history to see who those maintainers are. The
|
|
||||||
script scripts/get_maintainer.pl can be very useful at this step. If you
|
|
||||||
cannot find a maintainer for the subsystem you are working on, Andrew
|
|
||||||
Morton (akpm@linux-foundation.org) serves as a maintainer of last resort.
|
|
||||||
|
|
||||||
You should also normally choose at least one mailing list to receive a copy
|
|
||||||
of your patch set. linux-kernel@vger.kernel.org functions as a list of
|
|
||||||
last resort, but the volume on that list has caused a number of developers
|
|
||||||
to tune it out. Look in the MAINTAINERS file for a subsystem-specific
|
|
||||||
list; your patch will probably get more attention there. Please do not
|
|
||||||
spam unrelated lists, though.
|
|
||||||
|
|
||||||
Many kernel-related lists are hosted on vger.kernel.org; you can find a
|
|
||||||
list of them at http://vger.kernel.org/vger-lists.html. There are
|
|
||||||
kernel-related lists hosted elsewhere as well, though.
|
|
||||||
|
|
||||||
Do not send more than 15 patches at once to the vger mailing lists!!!
|
|
||||||
|
|
||||||
Linus Torvalds is the final arbiter of all changes accepted into the
|
|
||||||
Linux kernel. His e-mail address is <torvalds@linux-foundation.org>.
|
|
||||||
He gets a lot of e-mail, and, at this point, very few patches go through
|
|
||||||
Linus directly, so typically you should do your best to -avoid-
|
|
||||||
sending him e-mail.
|
|
||||||
|
|
||||||
If you have a patch that fixes an exploitable security bug, send that patch
|
|
||||||
to security@kernel.org. For severe bugs, a short embargo may be considered
|
|
||||||
to allow distributors to get the patch out to users; in such cases,
|
|
||||||
obviously, the patch should not be sent to any public lists.
|
|
||||||
|
|
||||||
Patches that fix a severe bug in a released kernel should be directed
|
|
||||||
toward the stable maintainers by putting a line like this::
|
|
||||||
|
|
||||||
Cc: stable@vger.kernel.org
|
|
||||||
|
|
||||||
into the sign-off area of your patch (note, NOT an email recipient). You
|
|
||||||
should also read
|
|
||||||
:ref:`Documentation/stable_kernel_rules.txt <stable_kernel_rules>`
|
|
||||||
in addition to this file.
|
|
||||||
|
|
||||||
Note, however, that some subsystem maintainers want to come to their own
|
|
||||||
conclusions on which patches should go to the stable trees. The networking
|
|
||||||
maintainer, in particular, would rather not see individual developers
|
|
||||||
adding lines like the above to their patches.
|
|
||||||
|
|
||||||
If changes affect userland-kernel interfaces, please send the MAN-PAGES
|
|
||||||
maintainer (as listed in the MAINTAINERS file) a man-pages patch, or at
|
|
||||||
least a notification of the change, so that some information makes its way
|
|
||||||
into the manual pages. User-space API changes should also be copied to
|
|
||||||
linux-api@vger.kernel.org.
|
|
||||||
|
|
||||||
For small patches you may want to CC the Trivial Patch Monkey
|
|
||||||
trivial@kernel.org which collects "trivial" patches. Have a look
|
|
||||||
into the MAINTAINERS file for its current manager.
|
|
||||||
|
|
||||||
Trivial patches must qualify for one of the following rules:
|
|
||||||
|
|
||||||
- Spelling fixes in documentation
|
|
||||||
- Spelling fixes for errors which could break :manpage:`grep(1)`
|
|
||||||
- Warning fixes (cluttering with useless warnings is bad)
|
|
||||||
- Compilation fixes (only if they are actually correct)
|
|
||||||
- Runtime fixes (only if they actually fix things)
|
|
||||||
- Removing use of deprecated functions/macros
|
|
||||||
- Contact detail and documentation fixes
|
|
||||||
- Non-portable code replaced by portable code (even in arch-specific,
|
|
||||||
since people copy, as long as it's trivial)
|
|
||||||
- Any fix by the author/maintainer of the file (i.e. patch monkey
|
|
||||||
in re-transmission mode)
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
6) No MIME, no links, no compression, no attachments. Just plain text
|
|
||||||
----------------------------------------------------------------------
|
|
||||||
|
|
||||||
Linus and other kernel developers need to be able to read and comment
|
|
||||||
on the changes you are submitting. It is important for a kernel
|
|
||||||
developer to be able to "quote" your changes, using standard e-mail
|
|
||||||
tools, so that they may comment on specific portions of your code.
|
|
||||||
|
|
||||||
For this reason, all patches should be submitted by e-mail "inline".
|
|
||||||
|
|
||||||
.. warning::
|
|
||||||
|
|
||||||
Be wary of your editor's word-wrap corrupting your patch,
|
|
||||||
if you choose to cut-n-paste your patch.
|
|
||||||
|
|
||||||
Do not attach the patch as a MIME attachment, compressed or not.
|
|
||||||
Many popular e-mail applications will not always transmit a MIME
|
|
||||||
attachment as plain text, making it impossible to comment on your
|
|
||||||
code. A MIME attachment also takes Linus a bit more time to process,
|
|
||||||
decreasing the likelihood of your MIME-attached change being accepted.
|
|
||||||
|
|
||||||
Exception: If your mailer is mangling patches then someone may ask
|
|
||||||
you to re-send them using MIME.
|
|
||||||
|
|
||||||
See :ref:`Documentation/email-clients.txt <email_clients>`
|
|
||||||
for hints about configuring your e-mail client so that it sends your patches
|
|
||||||
untouched.
|
|
||||||
|
|
||||||
7) E-mail size
|
|
||||||
--------------
|
|
||||||
|
|
||||||
Large changes are not appropriate for mailing lists, and some
|
|
||||||
maintainers. If your patch, uncompressed, exceeds 300 kB in size,
|
|
||||||
it is preferred that you store your patch on an Internet-accessible
|
|
||||||
server, and provide instead a URL (link) pointing to your patch. But note
|
|
||||||
that if your patch exceeds 300 kB, it almost certainly needs to be broken up
|
|
||||||
anyway.
|
|
||||||
|
|
||||||
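The size guideline is easy to check before sending. This is a minimal sketch that assumes 1 kB means 1000 bytes (the text does not say which unit is meant); ``patch_too_large`` and the two sample files are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: flag patch files above the 300 kB guideline.
# Assumption: 1 kB is taken as 1000 bytes; the text does not say which.
LIMIT_BYTES=300000

patch_too_large() {
    # Arithmetic expansion strips any padding that wc adds around the count.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -gt "$LIMIT_BYTES" ]
}

tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
printf 'a tiny patch\n' > "$tmp/small.patch"
dd if=/dev/zero of="$tmp/big.patch" bs=1000 count=301 2>/dev/null

for f in "$tmp/small.patch" "$tmp/big.patch"; do
    if patch_too_large "$f"; then
        echo "${f##*/}: over 300 kB, split it up or host it and send a URL"
    else
        echo "${f##*/}: fine to send inline"
    fi
done
```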
8) Respond to review comments
|
|
||||||
-----------------------------
|
|
||||||
|
|
||||||
Your patch will almost certainly get comments from reviewers on ways in
|
|
||||||
which the patch can be improved. You must respond to those comments;
|
|
||||||
ignoring reviewers is a good way to get ignored in return. Review comments
|
|
||||||
or questions that do not lead to a code change should almost certainly
|
|
||||||
bring about a comment or changelog entry so that the next reviewer better
|
|
||||||
understands what is going on.
|
|
||||||
|
|
||||||
Be sure to tell the reviewers what changes you are making and to thank them
|
|
||||||
for their time. Code review is a tiring and time-consuming process, and
|
|
||||||
reviewers sometimes get grumpy. Even in that case, though, respond
|
|
||||||
politely and address the problems they have pointed out.
|
|
||||||
|
|
||||||
|
|
||||||
9) Don't get discouraged - or impatient
|
|
||||||
---------------------------------------
|
|
||||||
|
|
||||||
After you have submitted your change, be patient and wait. Reviewers are
|
|
||||||
busy people and may not get to your patch right away.
|
|
||||||
|
|
||||||
Once upon a time, patches used to disappear into the void without comment,
|
|
||||||
but the development process works more smoothly than that now. You should
|
|
||||||
receive comments within a week or so; if that does not happen, make sure
|
|
||||||
that you have sent your patches to the right place. Wait for a minimum of
|
|
||||||
one week before resubmitting or pinging reviewers - possibly longer during
|
|
||||||
busy times like merge windows.
|
|
||||||
|
|
||||||
|
|
||||||
10) Include PATCH in the subject
|
|
||||||
--------------------------------
|
|
||||||
|
|
||||||
Due to high e-mail traffic to Linus, and to linux-kernel, it is common
|
|
||||||
convention to prefix your subject line with [PATCH]. This lets Linus
|
|
||||||
and other kernel developers more easily distinguish patches from other
|
|
||||||
e-mail discussions.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
11) Sign your work
|
|
||||||
------------------
|
|
||||||
|
|
||||||
To improve tracking of who did what, especially with patches that can
|
|
||||||
percolate to their final resting place in the kernel through several
|
|
||||||
layers of maintainers, we've introduced a "sign-off" procedure on
|
|
||||||
patches that are being emailed around.
|
|
||||||
|
|
||||||
The sign-off is a simple line at the end of the explanation for the
|
|
||||||
patch, which certifies that you wrote it or otherwise have the right to
|
|
||||||
pass it on as an open-source patch. The rules are pretty simple: if you
|
|
||||||
can certify the below:
|
|
||||||
|
|
||||||
Developer's Certificate of Origin 1.1
|
|
||||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
||||||
|
|
||||||
By making a contribution to this project, I certify that:
|
|
||||||
|
|
||||||
(a) The contribution was created in whole or in part by me and I
|
|
||||||
have the right to submit it under the open source license
|
|
||||||
indicated in the file; or
|
|
||||||
|
|
||||||
(b) The contribution is based upon previous work that, to the best
|
|
||||||
of my knowledge, is covered under an appropriate open source
|
|
||||||
license and I have the right under that license to submit that
|
|
||||||
work with modifications, whether created in whole or in part
|
|
||||||
by me, under the same open source license (unless I am
|
|
||||||
permitted to submit under a different license), as indicated
|
|
||||||
in the file; or
|
|
||||||
|
|
||||||
(c) The contribution was provided directly to me by some other
|
|
||||||
person who certified (a), (b) or (c) and I have not modified
|
|
||||||
it.
|
|
||||||
|
|
||||||
(d) I understand and agree that this project and the contribution
|
|
||||||
are public and that a record of the contribution (including all
|
|
||||||
personal information I submit with it, including my sign-off) is
|
|
||||||
maintained indefinitely and may be redistributed consistent with
|
|
||||||
this project or the open source license(s) involved.
|
|
||||||
|
|
||||||
then you just add a line saying::
|
|
||||||
|
|
||||||
Signed-off-by: Random J Developer <random@developer.example.org>
|
|
||||||
|
|
||||||
using your real name (sorry, no pseudonyms or anonymous contributions.)
|
|
||||||
|
|
||||||
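If you commit with ``git commit -s``, git appends the sign-off line for you. For patches prepared by hand, a small check along these lines can confirm the line is present and well formed. It is only a sketch: ``has_signoff`` is a hypothetical helper, and the pattern tests the shape of the line, not whether the name is real.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the commit message read from stdin has
# at least one line of the form "Signed-off-by: Name <address@host>".
has_signoff() {
    grep -Eq '^Signed-off-by: [^<>]+ <[^<>@[:space:]]+@[^<>@[:space:]]+>$'
}

printf '%s\n' \
    'demo: make xyzzy do frotz' \
    '' \
    'Explain the problem and the fix here.' \
    '' \
    'Signed-off-by: Random J Developer <random@developer.example.org>' |
    has_signoff && echo "sign-off present"
```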
Some people also put extra tags at the end. They'll just be ignored for
|
|
||||||
now, but you can do this to mark internal company procedures or just
|
|
||||||
point out some special detail about the sign-off.
|
|
||||||
|
|
||||||
If you are a subsystem or branch maintainer, sometimes you need to slightly
|
|
||||||
modify patches you receive in order to merge them, because the code is not
|
|
||||||
exactly the same in your tree and the submitters'. If you stick strictly to
|
|
||||||
rule (c), you should ask the submitter to rediff, but this is a totally
|
|
||||||
counter-productive waste of time and energy. Rule (b) allows you to adjust
|
|
||||||
the code, but then it is very impolite to change one submitter's code and
|
|
||||||
make him endorse your bugs. To solve this problem, it is recommended that
|
|
||||||
you add a line between the last Signed-off-by header and yours, indicating
|
|
||||||
the nature of your changes. While there is nothing mandatory about this, it
|
|
||||||
seems like prepending the description with your mail and/or name, all
|
|
||||||
enclosed in square brackets, is noticeable enough to make it obvious that
|
|
||||||
you are responsible for last-minute changes. Example::
|
|
||||||
|
|
||||||
Signed-off-by: Random J Developer <random@developer.example.org>
|
|
||||||
[lucky@maintainer.example.org: struct foo moved from foo.c to foo.h]
|
|
||||||
Signed-off-by: Lucky K Maintainer <lucky@maintainer.example.org>
|
|
||||||
|
|
||||||
This practice is particularly helpful if you maintain a stable branch and
|
|
||||||
want at the same time to credit the author, track changes, merge the fix,
|
|
||||||
and protect the submitter from complaints. Note that under no circumstances
|
|
||||||
can you change the author's identity (the From header), as it is the one
|
|
||||||
which appears in the changelog.
|
|
||||||
|
|
||||||
Special note to back-porters: It seems to be a common and useful practice
|
|
||||||
to insert an indication of the origin of a patch at the top of the commit
|
|
||||||
message (just after the subject line) to facilitate tracking. For instance,
|
|
||||||
here's what we see in a 3.x-stable release::
|
|
||||||
|
|
||||||
Date: Tue Oct 7 07:26:38 2014 -0400
|
|
||||||
|
|
||||||
libata: Un-break ATA blacklist
|
|
||||||
|
|
||||||
commit 1c40279960bcd7d52dbdf1d466b20d24b99176c8 upstream.
|
|
||||||
|
|
||||||
And here's what might appear in an older kernel once a patch is backported::
|
|
||||||
|
|
||||||
Date: Tue May 13 22:12:27 2008 +0200
|
|
||||||
|
|
||||||
wireless, airo: waitbusy() won't delay
|
|
||||||
|
|
||||||
[backport of 2.6 commit b7acbdfbd1f277c1eb23f344f899cfa4cd0bf36a]
|
|
||||||
|
|
||||||
Whatever the format, this information provides valuable help to people
|
|
||||||
tracking your trees, and to people trying to troubleshoot bugs in your
|
|
||||||
tree.
|
|
||||||
|
|
||||||
|
|
||||||
12) When to use Acked-by: and Cc:
|
|
||||||
---------------------------------
|
|
||||||
|
|
||||||
The Signed-off-by: tag indicates that the signer was involved in the
|
|
||||||
development of the patch, or that he/she was in the patch's delivery path.
|
|
||||||
|
|
||||||
If a person was not directly involved in the preparation or handling of a
|
|
||||||
patch but wishes to signify and record their approval of it then they can
|
|
||||||
ask to have an Acked-by: line added to the patch's changelog.
|
|
||||||
|
|
||||||
Acked-by: is often used by the maintainer of the affected code when that
|
|
||||||
maintainer neither contributed to nor forwarded the patch.
|
|
||||||
|
|
||||||
Acked-by: is not as formal as Signed-off-by:. It is a record that the acker
|
|
||||||
has at least reviewed the patch and has indicated acceptance. Hence patch
|
|
||||||
mergers will sometimes manually convert an acker's "yep, looks good to me"
|
|
||||||
into an Acked-by: (but note that it is usually better to ask for an
|
|
||||||
explicit ack).
|
|
||||||
|
|
||||||
Acked-by: does not necessarily indicate acknowledgement of the entire patch.
|
|
||||||
For example, if a patch affects multiple subsystems and has an Acked-by: from
|
|
||||||
one subsystem maintainer then this usually indicates acknowledgement of just
|
|
||||||
the part which affects that maintainer's code. Judgement should be used here.
|
|
||||||
When in doubt people should refer to the original discussion in the mailing
|
|
||||||
list archives.
|
|
||||||
|
|
||||||
If a person has had the opportunity to comment on a patch, but has not
|
|
||||||
provided such comments, you may optionally add a ``Cc:`` tag to the patch.
|
|
||||||
This is the only tag which might be added without an explicit action by the
|
|
||||||
person it names - but it should indicate that this person was copied on the
|
|
||||||
patch. This tag documents that potentially interested parties
|
|
||||||
have been included in the discussion.
|
|
||||||
|
|
||||||
|
|
||||||
13) Using Reported-by:, Tested-by:, Reviewed-by:, Suggested-by: and Fixes:
|
|
||||||
--------------------------------------------------------------------------
|
|
||||||
|
|
||||||
The Reported-by tag gives credit to people who find bugs and report them and it
|
|
||||||
hopefully inspires them to help us again in the future. Please note that if
|
|
||||||
the bug was reported in private, then ask for permission first before using the
|
|
||||||
Reported-by tag.
|
|
||||||
|
|
||||||
A Tested-by: tag indicates that the patch has been successfully tested (in
|
|
||||||
some environment) by the person named. This tag informs maintainers that
|
|
||||||
some testing has been performed, provides a means to locate testers for
|
|
||||||
future patches, and ensures credit for the testers.
|
|
||||||
|
|
||||||
Reviewed-by:, instead, indicates that the patch has been reviewed and found
|
|
||||||
acceptable according to the Reviewer's Statement:
|
|
||||||
|
|
||||||
Reviewer's statement of oversight
|
|
||||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
||||||
|
|
||||||
By offering my Reviewed-by: tag, I state that:
|
|
||||||
|
|
||||||
(a) I have carried out a technical review of this patch to
|
|
||||||
evaluate its appropriateness and readiness for inclusion into
|
|
||||||
the mainline kernel.
|
|
||||||
|
|
||||||
(b) Any problems, concerns, or questions relating to the patch
|
|
||||||
have been communicated back to the submitter. I am satisfied
|
|
||||||
with the submitter's response to my comments.
|
|
||||||
|
|
||||||
(c) While there may be things that could be improved with this
|
|
||||||
submission, I believe that it is, at this time, (1) a
|
|
||||||
worthwhile modification to the kernel, and (2) free of known
|
|
||||||
issues which would argue against its inclusion.
|
|
||||||
|
|
||||||
(d) While I have reviewed the patch and believe it to be sound, I
|
|
||||||
do not (unless explicitly stated elsewhere) make any
|
|
||||||
warranties or guarantees that it will achieve its stated
|
|
||||||
purpose or function properly in any given situation.
|
|
||||||
|
|
||||||
A Reviewed-by tag is a statement of opinion that the patch is an
|
|
||||||
appropriate modification of the kernel without any remaining serious
|
|
||||||
technical issues. Any interested reviewer (who has done the work) can
|
|
||||||
offer a Reviewed-by tag for a patch. This tag serves to give credit to
|
|
||||||
reviewers and to inform maintainers of the degree of review which has been
|
|
||||||
done on the patch. Reviewed-by: tags, when supplied by reviewers known to
|
|
||||||
understand the subject area and to perform thorough reviews, will normally
|
|
||||||
increase the likelihood of your patch getting into the kernel.
|
|
||||||
|
|
||||||
A Suggested-by: tag indicates that the patch idea is suggested by the person
|
|
||||||
named and ensures credit to the person for the idea. Please note that this
|
|
||||||
tag should not be added without the reporter's permission, especially if the
|
|
||||||
idea was not posted in a public forum. That said, if we diligently credit our
|
|
||||||
idea reporters, they will, hopefully, be inspired to help us again in the
|
|
||||||
future.
|
|
||||||
|
|
||||||
A Fixes: tag indicates that the patch fixes an issue in a previous commit. It
|
|
||||||
is used to make it easy to determine where a bug originated, which can help
|
|
||||||
review a bug fix. This tag also assists the stable kernel team in determining
|
|
||||||
which stable kernel versions should receive your fix. This is the preferred
|
|
||||||
method for indicating a bug fixed by the patch. See :ref:`describe_changes`
|
|
||||||
for more details.
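As a rough illustration of where these tags sit in a changelog, the sketch below extracts them from a commit message with ``grep``. The helper names ``list_tags`` and ``sample_message`` are invented for this example, and every name, address, and commit ID in the sample message is a placeholder. The layout of the sample ``Fixes:`` line (abbreviated commit ID plus quoted summary) is an assumption here; the section referenced above is the authoritative description.

```shell
#!/bin/sh
# list_tags: print the credit tags and Fixes: lines found in a commit
# message read from standard input. (Hypothetical helper name.)
list_tags() {
    grep -E '^(Reported-by|Tested-by|Reviewed-by|Suggested-by|Fixes):'
}

# sample_message: emit a made-up changelog. All names, addresses and
# the commit ID are placeholders, not taken from a real patch.
sample_message() {
    cat <<'EOF'
subsystem: summary phrase

Explanation of the change.

Fixes: 0123456789ab ("subsystem: earlier summary phrase")
Reported-by: Random Reporter <reporter@example.com>
Tested-by: Random Tester <tester@example.com>
Signed-off-by: Original Author <author@example.com>
EOF
}

# Show only the tags discussed in this section; Signed-off-by: is
# deliberately not part of the pattern.
sample_message | list_tags
```

A filter like this is only a convenience for reading history; it does not validate that permission was obtained for a Reported-by: or Suggested-by: tag.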
|
|
||||||
|
|
||||||
|
|
||||||
14) The canonical patch format
|
|
||||||
------------------------------
|
|
||||||
|
|
||||||
This section describes how the patch itself should be formatted. Note
|
|
||||||
that, if you have your patches stored in a ``git`` repository, proper patch
|
|
||||||
formatting can be had with ``git format-patch``. The tools cannot create
|
|
||||||
the necessary text, though, so read the instructions below anyway.
|
|
||||||
|
|
||||||
The canonical patch subject line is::
|
|
||||||
|
|
||||||
Subject: [PATCH 001/123] subsystem: summary phrase
|
|
||||||
|
|
||||||
The canonical patch message body contains the following:
|
|
||||||
|
|
||||||
- A ``from`` line specifying the patch author (only needed if the person
|
|
||||||
sending the patch is not the author).
|
|
||||||
|
|
||||||
- An empty line.
|
|
||||||
|
|
||||||
- The body of the explanation, line wrapped at 75 columns, which will
|
|
||||||
be copied to the permanent changelog to describe this patch.
|
|
||||||
|
|
||||||
- The ``Signed-off-by:`` lines, described above, which will
|
|
||||||
also go in the changelog.
|
|
||||||
|
|
||||||
- A marker line containing simply ``---``.
|
|
||||||
|
|
||||||
- Any additional comments not suitable for the changelog.
|
|
||||||
|
|
||||||
- The actual patch (``diff`` output).
|
|
||||||
|
|
||||||
The Subject line format makes it very easy to sort the emails
|
|
||||||
alphabetically by subject line - pretty much any email reader will
|
|
||||||
support that - since the sequence number is zero-padded,
|
|
||||||
the numerical and alphabetic sort is the same.
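The zero-padding rule can be sketched in a few lines of shell. The function name ``patch_subject`` and its argument order are made up for this illustration; in practice ``git format-patch`` produces the numbering for you.

```shell
#!/bin/bash
# patch_subject N TOTAL SUBSYSTEM SUMMARY   (hypothetical helper)
# Print a Subject line whose sequence number is zero-padded to the
# number of digits in TOTAL, so that an alphabetic sort by subject
# matches the numerical order of the series.
patch_subject() {
    local n=$1 total=$2 subsystem=$3 summary=$4
    local width=${#total}          # digits in TOTAL, e.g. 3 for 123
    printf 'Subject: [PATCH %0*d/%d] %s: %s\n' \
        "$width" "$n" "$total" "$subsystem" "$summary"
}

patch_subject 1 123 subsystem 'summary phrase'
patch_subject 2 5 ext2 'improve scalability of bitmap searching'
```

Padding to the width of the total keeps short series unpadded (``2/5``) while longer ones get leading zeros (``001/123``).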
|
|
||||||
|
|
||||||
The ``subsystem`` in the email's Subject should identify which
|
|
||||||
area or subsystem of the kernel is being patched.
|
|
||||||
|
|
||||||
The ``summary phrase`` in the email's Subject should concisely
|
|
||||||
describe the patch which that email contains. The ``summary
|
|
||||||
phrase`` should not be a filename. Do not use the same ``summary
|
|
||||||
phrase`` for every patch in a whole patch series (where a ``patch
|
|
||||||
series`` is an ordered sequence of multiple, related patches).
|
|
||||||
|
|
||||||
Bear in mind that the ``summary phrase`` of your email becomes a
|
|
||||||
globally-unique identifier for that patch. It propagates all the way
|
|
||||||
into the ``git`` changelog. The ``summary phrase`` may later be used in
|
|
||||||
developer discussions which refer to the patch. People will want to
|
|
||||||
google for the ``summary phrase`` to read discussion regarding that
|
|
||||||
patch. It will also be the only thing that people may quickly see
|
|
||||||
when, two or three months later, they are going through perhaps
|
|
||||||
thousands of patches using tools such as ``gitk`` or ``git log
|
|
||||||
--oneline``.
|
|
||||||
|
|
||||||
For these reasons, the ``summary`` must be no more than 70-75
|
|
||||||
characters, and it must describe what the patch changes, as well
|
|
||||||
as why the patch might be necessary. It is challenging to be both
|
|
||||||
succinct and descriptive, but that is what a well-written summary
|
|
||||||
should do.
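A length check of this kind is easy to script before sending. The sketch below uses a hypothetical helper, ``summary_ok``, and assumes the upper end of the range (75 characters) as the default limit; whether the limit should also count the ``subsystem:`` prefix is left to the caller.

```shell
#!/bin/bash
# summary_ok SUMMARY [LIMIT]   (hypothetical helper)
# Succeed when SUMMARY is at most LIMIT characters long.
# LIMIT defaults to 75, the upper end of the range given above.
summary_ok() {
    local summary=$1 limit=${2:-75}
    [ "${#summary}" -le "$limit" ]
}

if summary_ok 'improve scalability of bitmap searching'; then
    echo 'summary length ok'
else
    echo 'summary too long'
fi
```

The check covers length only; whether the summary says what changed and why still needs a human reader.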
|
|
||||||
|
|
||||||
The ``summary phrase`` may be prefixed by tags enclosed in square
|
|
||||||
brackets: "Subject: [PATCH <tag>...] <summary phrase>". The tags are
|
|
||||||
not considered part of the summary phrase, but describe how the patch
|
|
||||||
should be treated. Common tags might include a version descriptor if
|
|
||||||
multiple versions of the patch have been sent out in response to
|
|
||||||
comments (e.g., "v1, v2, v3"), or "RFC" to indicate a request for
|
|
||||||
comments. If there are four patches in a patch series the individual
|
|
||||||
patches may be numbered like this: 1/4, 2/4, 3/4, 4/4. This ensures
|
|
||||||
that developers understand the order in which the patches should be
|
|
||||||
applied and that they have reviewed or applied all of the patches in
|
|
||||||
the patch series.
|
|
||||||
|
|
||||||
A couple of example Subjects::
|
|
||||||
|
|
||||||
Subject: [PATCH 2/5] ext2: improve scalability of bitmap searching
|
|
||||||
Subject: [PATCH v2 01/27] x86: fix eflags tracking
|
|
||||||
|
|
||||||
The ``from`` line must be the very first line in the message body,
|
|
||||||
and has the form:
|
|
||||||
|
|
||||||
From: Original Author <author@example.com>
|
|
||||||
|
|
||||||
The ``from`` line specifies who will be credited as the author of the
|
|
||||||
patch in the permanent changelog. If the ``from`` line is missing,
|
|
||||||
then the ``From:`` line from the email header will be used to determine
|
|
||||||
the patch author in the changelog.
|
|
||||||
|
|
||||||
The explanation body will be committed to the permanent source
|
|
||||||
changelog, so should make sense to a competent reader who has long
|
|
||||||
since forgotten the immediate details of the discussion that might
|
|
||||||
have led to this patch. Including symptoms of the failure which the
|
|
||||||
patch addresses (kernel log messages, oops messages, etc.) is
|
|
||||||
especially useful for people who might be searching the commit logs
|
|
||||||
looking for the applicable patch. If a patch fixes a compile failure,
|
|
||||||
it may not be necessary to include _all_ of the compile failures; just
|
|
||||||
enough that it is likely that someone searching for the patch can find
|
|
||||||
it. As in the ``summary phrase``, it is important to be succinct as
|
|
||||||
well as descriptive.
|
|
||||||
|
|
||||||
The ``---`` marker line serves the essential purpose of marking for patch
|
|
||||||
handling tools where the changelog message ends.
|
|
||||||
|
|
||||||
One good use for the additional comments after the ``---`` marker is for
|
|
||||||
a ``diffstat``, to show what files have changed, and the number of
|
|
||||||
inserted and deleted lines per file. A ``diffstat`` is especially useful
|
|
||||||
on bigger patches. Other comments relevant only to the moment or the
|
|
||||||
maintainer, not suitable for the permanent changelog, should also go
|
|
||||||
here. A good example of such comments might be ``patch changelogs``
|
|
||||||
which describe what has changed between the v1 and v2 version of the
|
|
||||||
patch.
|
|
||||||
|
|
||||||
If you are going to include a ``diffstat`` after the ``---`` marker, please
|
|
||||||
use ``diffstat`` options ``-p 1 -w 70`` so that filenames are listed from
|
|
||||||
the top of the kernel source tree and don't use too much horizontal
|
|
||||||
space (easily fit in 80 columns, maybe with some indentation). (``git``
|
|
||||||
generates appropriate diffstats by default.)
|
|
||||||
|
|
||||||
See more details on the proper patch format in the following
|
|
||||||
references.
|
|
||||||
|
|
||||||
.. _explicit_in_reply_to:
|
|
||||||
|
|
||||||
15) Explicit In-Reply-To headers
|
|
||||||
--------------------------------
|
|
||||||
|
|
||||||
It can be helpful to manually add In-Reply-To: headers to a patch
|
|
||||||
(e.g., when using ``git send-email``) to associate the patch with
|
|
||||||
previous relevant discussion, e.g. to link a bug fix to the email with
|
|
||||||
the bug report. However, for a multi-patch series, it is generally
|
|
||||||
best to avoid using In-Reply-To: to link to older versions of the
|
|
||||||
series. This way multiple versions of the patch don't become an
|
|
||||||
unmanageable forest of references in email clients. If a link is
|
|
||||||
helpful, you can use the https://lkml.kernel.org/ redirector (e.g., in
|
|
||||||
the cover email text) to link to an earlier version of the patch series.
|
|
||||||
|
|
||||||
|
|
||||||
16) Sending ``git pull`` requests
|
|
||||||
---------------------------------
|
|
||||||
|
|
||||||
If you have a series of patches, it may be most convenient to have the
|
|
||||||
maintainer pull them directly into the subsystem repository with a
|
|
||||||
``git pull`` operation. Note, however, that pulling patches from a developer
|
|
||||||
requires a higher degree of trust than taking patches from a mailing list.
|
|
||||||
As a result, many subsystem maintainers are reluctant to take pull
|
|
||||||
requests, especially from new, unknown developers. If in doubt you can use
|
|
||||||
the pull request as the cover letter for a normal posting of the patch
|
|
||||||
series, giving the maintainer the option of using either.
|
|
||||||
|
|
||||||
A pull request should have [GIT] or [PULL] in the subject line. The
|
|
||||||
request itself should include the repository name and the branch of
|
|
||||||
interest on a single line; it should look something like::
|
|
||||||
|
|
||||||
Please pull from
|
|
||||||
|
|
||||||
git://jdelvare.pck.nerim.net/jdelvare-2.6 i2c-for-linus
|
|
||||||
|
|
||||||
to get these changes:
|
|
||||||
|
|
||||||
A pull request should also include an overall message saying what will be
|
|
||||||
included in the request, a ``git shortlog`` listing of the patches
|
|
||||||
themselves, and a ``diffstat`` showing the overall effect of the patch series.
|
|
||||||
The easiest way to get all this information together is, of course, to let
|
|
||||||
``git`` do it for you with the ``git request-pull`` command.
|
|
||||||
|
|
||||||
Some maintainers (including Linus) want to see pull requests from signed
|
|
||||||
commits; that increases their confidence that the request actually came
|
|
||||||
from you. Linus, in particular, will not pull from public hosting sites
|
|
||||||
like GitHub in the absence of a signed tag.
|
|
||||||
|
|
||||||
The first step toward creating such tags is to make a GNUPG key and get it
|
|
||||||
signed by one or more core kernel developers. This step can be hard for
|
|
||||||
new developers, but there is no way around it. Attending conferences can
|
|
||||||
be a good way to find developers who can sign your key.
|
|
||||||
|
|
||||||
Once you have prepared a patch series in ``git`` that you wish to have somebody
|
|
||||||
pull, create a signed tag with ``git tag -s``. This will create a new tag
|
|
||||||
identifying the last commit in the series and containing a signature
|
|
||||||
created with your private key. You will also have the opportunity to add a
|
|
||||||
changelog-style message to the tag; this is an ideal place to describe the
|
|
||||||
effects of the pull request as a whole.
|
|
||||||
|
|
||||||
If the tree the maintainer will be pulling from is not the repository you
|
|
||||||
are working from, don't forget to push the signed tag explicitly to the
|
|
||||||
public tree.
|
|
||||||
|
|
||||||
When generating your pull request, use the signed tag as the target. A
|
|
||||||
command like this will do the trick::
|
|
||||||
|
|
||||||
git request-pull master git://my.public.tree/linux.git my-signed-tag
|
|
||||||
|
|
||||||
|
|
||||||
REFERENCES
|
|
||||||
**********
|
|
||||||
|
|
||||||
Andrew Morton, "The perfect patch" (tpp).
|
|
||||||
<http://www.ozlabs.org/~akpm/stuff/tpp.txt>
|
|
||||||
|
|
||||||
Jeff Garzik, "Linux kernel patch submission format".
|
|
||||||
<http://linux.yyz.us/patch-format.html>
|
|
||||||
|
|
||||||
Greg Kroah-Hartman, "How to piss off a kernel subsystem maintainer".
|
|
||||||
<http://www.kroah.com/log/linux/maintainer.html>
|
|
||||||
|
|
||||||
<http://www.kroah.com/log/linux/maintainer-02.html>
|
|
||||||
|
|
||||||
<http://www.kroah.com/log/linux/maintainer-03.html>
|
|
||||||
|
|
||||||
<http://www.kroah.com/log/linux/maintainer-04.html>
|
|
||||||
|
|
||||||
<http://www.kroah.com/log/linux/maintainer-05.html>
|
|
||||||
|
|
||||||
<http://www.kroah.com/log/linux/maintainer-06.html>
|
|
||||||
|
|
||||||
NO!!!! No more huge patch bombs to linux-kernel@vger.kernel.org people!
|
|
||||||
<https://lkml.org/lkml/2005/7/11/336>
|
|
||||||
|
|
||||||
Kernel Documentation/CodingStyle:
|
|
||||||
:ref:`Documentation/CodingStyle <codingstyle>`
|
|
||||||
|
|
||||||
Linus Torvalds's mail on the canonical patch format:
|
|
||||||
<http://lkml.org/lkml/2005/4/7/183>
|
|
||||||
|
|
||||||
Andi Kleen, "On submitting kernel patches"
|
|
||||||
Some strategies to get difficult or controversial changes in.
|
|
||||||
|
|
||||||
http://halobates.de/on-submitting-patches.pdf
|
|
||||||
|
|
||||||
|
|||||||
@@ -1,39 +0,0 @@
|
|||||||
Software cursor for VGA by Pavel Machek <pavel@atrey.karlin.mff.cuni.cz>
|
|
||||||
======================= and Martin Mares <mj@atrey.karlin.mff.cuni.cz>
|
|
||||||
|
|
||||||
Linux now has some ability to manipulate cursor appearance. Normally, you
|
|
||||||
can set the size of hardware cursor (and also work around some ugly bugs in
|
|
||||||
those miserable Trident cards--see #define TRIDENT_GLITCH in drivers/video/
|
|
||||||
vgacon.c). You can now play a few new tricks: you can make your cursor look
|
|
||||||
like a non-blinking red block, make it inverse background of the character it's
|
|
||||||
over or to highlight that character and still choose whether the original
|
|
||||||
hardware cursor should remain visible or not. There may be other things I have
|
|
||||||
never thought of.
|
|
||||||
|
|
||||||
The cursor appearance is controlled by a "<ESC>[?1;2;3c" escape sequence
|
|
||||||
where 1, 2 and 3 are parameters described below. If you omit any of them,
|
|
||||||
they will default to zeroes.
|
|
||||||
|
|
||||||
Parameter 1 specifies cursor size (0=default, 1=invisible, 2=underline, ...,
|
|
||||||
8=full block) + 16 if you want the software cursor to be applied + 32 if you
|
|
||||||
want to always change the background color + 64 if you dislike having the
|
|
||||||
background the same as the foreground. Highlights are ignored for the last two
|
|
||||||
flags.
|
|
||||||
|
|
||||||
The second parameter selects character attribute bits you want to change
|
|
||||||
(by simply XORing them with the value of this parameter). On standard VGA,
|
|
||||||
the high four bits specify background and the low four the foreground. In both
|
|
||||||
groups, low three bits set color (as in normal color codes used by the console)
|
|
||||||
and the most significant one turns on highlight (or sometimes blinking--it
|
|
||||||
depends on the configuration of your VGA).
|
|
||||||
|
|
||||||
The third parameter consists of character attribute bits you want to set.
|
|
||||||
Bit setting takes place before bit toggling, so you can simply clear a bit by
|
|
||||||
including it in both the set mask and the toggle mask.
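The set-then-toggle order can be worked through with shell arithmetic. This is only a model of the rule as described above, not the kernel's code: the helper name ``softcursor_attr`` is invented, and the extra adjustments selected by the +32 and +64 flags of parameter 1 are ignored.

```shell
#!/bin/bash
# softcursor_attr ATTR TOGGLE SET   (hypothetical helper)
# Model the attribute byte under the cursor: OR in the set mask
# (third parameter) first, then XOR with the toggle mask (second
# parameter). Prints the result as a two-digit hex value.
softcursor_attr() {
    local attr=$1 toggle=$2 setmask=$3
    printf '0x%02x\n' $(( (attr | setmask) ^ toggle ))
}

# Attribute 0x07 with toggle 0 and set 64 (0x40): bit 6 is set.
softcursor_attr 0x07 0 64

# Listing bit 6 in both masks clears it, whatever its old value:
# it is set first and then toggled back off.
softcursor_attr 0x47 0x40 0x40
```

With the values from the red-block example (second parameter 0, third parameter 64), an attribute of 0x07 becomes 0x47, that is, the background field in the high four bits changes from 0 to 4.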
|
|
||||||
|
|
||||||
Examples:
|
|
||||||
=========
|
|
||||||
|
|
||||||
To get normal blinking underline, use: echo -e '\033[?2c'
|
|
||||||
To get blinking block, use: echo -e '\033[?6c'
|
|
||||||
To get red non-blinking block, use: echo -e '\033[?17;0;64c'
|
|
||||||
@@ -101,6 +101,6 @@ received a notification, it will set the backlight level accordingly. This does
|
|||||||
not affect the sending of event to user space, they are always sent to user
|
not affect the sending of event to user space, they are always sent to user
|
||||||
space regardless of whether or not the video module controls the backlight level
|
space regardless of whether or not the video module controls the backlight level
|
||||||
directly. This behaviour can be controlled through the brightness_switch_enabled
|
directly. This behaviour can be controlled through the brightness_switch_enabled
|
||||||
module parameter as documented in kernel-parameters.txt. It is recommended to
|
module parameter as documented in admin-guide/kernel-parameters.rst. It is recommended to
|
||||||
disable this behaviour once a GUI environment starts up and wants to have full
|
disable this behaviour once a GUI environment starts up and wants to have full
|
||||||
control of the backlight level.
|
control of the backlight level.
|
||||||
|
|||||||
411
Documentation/admin-guide/README.rst
Normal file
@@ -0,0 +1,411 @@
|
|||||||
|
Linux kernel release 4.x <http://kernel.org/>
|
||||||
|
=============================================
|
||||||
|
|
||||||
|
These are the release notes for Linux version 4. Read them carefully,
|
||||||
|
as they tell you what this is all about, explain how to install the
|
||||||
|
kernel, and what to do if something goes wrong.
|
||||||
|
|
||||||
|
What is Linux?
|
||||||
|
--------------
|
||||||
|
|
||||||
|
Linux is a clone of the operating system Unix, written from scratch by
|
||||||
|
Linus Torvalds with assistance from a loosely-knit team of hackers across
|
||||||
|
the Net. It aims towards POSIX and Single UNIX Specification compliance.
|
||||||
|
|
||||||
|
It has all the features you would expect in a modern fully-fledged Unix,
|
||||||
|
including true multitasking, virtual memory, shared libraries, demand
|
||||||
|
loading, shared copy-on-write executables, proper memory management,
|
||||||
|
and multistack networking including IPv4 and IPv6.
|
||||||
|
|
||||||
|
It is distributed under the GNU General Public License - see the
|
||||||
|
accompanying COPYING file for more details.
|
||||||
|
|
||||||
|
On what hardware does it run?
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
Although originally developed for 32-bit x86-based PCs (386 or higher),
|
||||||
|
today Linux also runs on (at least) the Compaq Alpha AXP, Sun SPARC and
|
||||||
|
UltraSPARC, Motorola 68000, PowerPC, PowerPC64, ARM, Hitachi SuperH, Cell,
|
||||||
|
IBM S/390, MIPS, HP PA-RISC, Intel IA-64, DEC VAX, AMD x86-64, AXIS CRIS,
|
||||||
|
Xtensa, Tilera TILE, AVR32, ARC and Renesas M32R architectures.
|
||||||
|
|
||||||
|
Linux is easily portable to most general-purpose 32- or 64-bit architectures
|
||||||
|
as long as they have a paged memory management unit (PMMU) and a port of the
|
||||||
|
GNU C compiler (gcc) (part of The GNU Compiler Collection, GCC). Linux has
|
||||||
|
also been ported to a number of architectures without a PMMU, although
|
||||||
|
functionality is then obviously somewhat limited.
|
||||||
|
Linux has also been ported to itself. You can now run the kernel as a
|
||||||
|
userspace application - this is called UserMode Linux (UML).
|
||||||
|
|
||||||
|
Documentation
|
||||||
|
-------------
|
||||||
|
|
||||||
|
- There is a lot of documentation available both in electronic form on
|
||||||
|
the Internet and in books, both Linux-specific and pertaining to
|
||||||
|
general UNIX questions. I'd recommend looking into the documentation
|
||||||
|
subdirectories on any Linux FTP site for the LDP (Linux Documentation
|
||||||
|
Project) books. This README is not meant to be documentation on the
|
||||||
|
system: there are much better sources available.
|
||||||
|
|
||||||
|
- There are various README files in the Documentation/ subdirectory:
|
||||||
|
these typically contain kernel-specific installation notes for some
|
||||||
|
drivers for example. See Documentation/00-INDEX for a list of what
|
||||||
|
is contained in each file. Please read the
|
||||||
|
:ref:`Documentation/process/changes.rst <changes>` file, as it
|
||||||
|
contains information about the problems which may result from upgrading
|
||||||
|
your kernel.
|
||||||
|
|
||||||
|
- The Documentation/DocBook/ subdirectory contains several guides for
|
||||||
|
kernel developers and users. These guides can be rendered in a
|
||||||
|
number of formats: PostScript (.ps), PDF, HTML, & man-pages, among others.
|
||||||
|
After installation, ``make psdocs``, ``make pdfdocs``, ``make htmldocs``,
|
||||||
|
or ``make mandocs`` will render the documentation in the requested format.
|
||||||
|
|
||||||
|
Installing the kernel source
|
||||||
|
----------------------------
|
||||||
|
|
||||||
|
- If you install the full sources, put the kernel tarball in a
|
||||||
|
directory where you have permissions (e.g. your home directory) and
|
||||||
|
unpack it::
|
||||||
|
|
||||||
|
xz -cd linux-4.X.tar.xz | tar xvf -
|
||||||
|
|
||||||
|
Replace "X" with the version number of the latest kernel.
|
||||||
|
|
||||||
|
Do NOT use the /usr/src/linux area! This area has a (usually
|
||||||
|
incomplete) set of kernel headers that are used by the library header
|
||||||
|
files. They should match the library, and not get messed up by
|
||||||
|
whatever the kernel-du-jour happens to be.
|
||||||
|
|
||||||
|
- You can also upgrade between 4.x releases by patching. Patches are
|
||||||
|
distributed in the xz format. To install by patching, get all the
|
||||||
|
newer patch files, enter the top level directory of the kernel source
|
||||||
|
(linux-4.X) and execute::
|
||||||
|
|
||||||
|
xz -cd ../patch-4.x.xz | patch -p1
|
||||||
|
|
||||||
|
Replace "x" with each version newer than the version "X" of your current
|
||||||
|
source tree, **in_order**, and you should be ok. You may want to remove
|
||||||
|
the backup files (some-file-name~ or some-file-name.orig), and make sure
|
||||||
|
that there are no failed patches (some-file-name# or some-file-name.rej).
|
||||||
|
If there are, either you or I have made a mistake.
|
||||||
|
|
||||||
|
Unlike patches for the 4.x kernels, patches for the 4.x.y kernels
|
||||||
|
(also known as the -stable kernels) are not incremental but instead apply
|
||||||
|
directly to the base 4.x kernel. For example, if your base kernel is 4.0
|
||||||
|
and you want to apply the 4.0.3 patch, you must not first apply the 4.0.1
|
||||||
|
and 4.0.2 patches. Similarly, if you are running kernel version 4.0.2 and
|
||||||
|
want to jump to 4.0.3, you must first reverse the 4.0.2 patch (that is,
|
||||||
|
patch -R) **before** applying the 4.0.3 patch. You can read more on this in
|
||||||
|
:ref:`Documentation/process/applying-patches.rst <applying_patches>`.
|
||||||
|
|
||||||
|
Alternatively, the script patch-kernel can be used to automate this
|
||||||
|
process. It determines the current kernel version and applies any
|
||||||
|
patches found::
|
||||||
|
|
||||||
|
linux/scripts/patch-kernel linux
|
||||||
|
|
||||||
|
The first argument in the command above is the location of the
|
||||||
|
kernel source. Patches are applied from the current directory, but
|
||||||
|
an alternative directory can be specified as the second argument.
|
||||||
|
|
||||||
|
- Make sure you have no stale .o files and dependencies lying around::
|
||||||
|
|
||||||
|
cd linux
|
||||||
|
make mrproper
|
||||||
|
|
||||||
|
You should now have the sources correctly installed.
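The rule for -stable patches described above (reverse the stable patch you have, then apply the one you want) can be sketched as a small planner that only prints the commands. The function name ``stable_patch_plan`` is hypothetical, and the patch file names and their location in the parent directory are assumptions modelled on the ``../patch-4.x.xz`` example.

```shell
#!/bin/bash
# stable_patch_plan BASE CURRENT TARGET   (hypothetical helper)
# Print, without running them, the commands that move a source tree
# from stable release BASE.CURRENT to BASE.TARGET. A CURRENT of 0
# means the tree is the unpatched base release.
stable_patch_plan() {
    local base=$1 current=$2 target=$3
    if [ "$current" -ne 0 ]; then
        # Stable patches are not incremental: undo the one in place.
        echo "xz -cd ../patch-$base.$current.xz | patch -R -p1"
    fi
    # Then apply the target patch directly on top of the base tree.
    echo "xz -cd ../patch-$base.$target.xz | patch -p1"
}

# From 4.0.2 to 4.0.3: reverse 4.0.2 first, then apply 4.0.3.
stable_patch_plan 4.0 2 3
```

Printing the plan rather than executing it leaves room to check that the patch files exist before anything touches the tree.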
|
||||||
|
|
||||||
|
Software requirements
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
Compiling and running the 4.x kernels requires up-to-date
|
||||||
|
versions of various software packages. Consult
|
||||||
|
:ref:`Documentation/process/changes.rst <changes>` for the minimum version numbers
|
||||||
|
required and how to get updates for these packages. Beware that using
|
||||||
|
excessively old versions of these packages can cause indirect
|
||||||
|
errors that are very difficult to track down, so don't assume that
|
||||||
|
you can just update packages when obvious problems arise during
|
||||||
|
build or operation.
|
||||||
|
|
||||||
|
Build directory for the kernel
|
||||||
|
------------------------------
|
||||||
|
|
||||||
|
When compiling the kernel, all output files will by default be
|
||||||
|
stored together with the kernel source code.
|
||||||
|
Using the option ``make O=output/dir`` allows you to specify an alternate
|
||||||
|
place for the output files (including .config).
|
||||||
|
Example::
|
||||||
|
|
||||||
|
kernel source code: /usr/src/linux-4.X
|
||||||
|
build directory: /home/name/build/kernel
|
||||||
|
|
||||||
|
To configure and build the kernel, use::
|
||||||
|
|
||||||
|
cd /usr/src/linux-4.X
|
||||||
|
make O=/home/name/build/kernel menuconfig
|
||||||
|
make O=/home/name/build/kernel
|
||||||
|
sudo make O=/home/name/build/kernel modules_install install
|
||||||
|
|
||||||
|
Please note: If the ``O=output/dir`` option is used, then it must be
|
||||||
|
used for all invocations of make.
|
||||||
|
|
||||||
|
Configuring the kernel
|
||||||
|
----------------------
|
||||||
|
|
||||||
|
Do not skip this step even if you are only upgrading one minor
|
||||||
|
version. New configuration options are added in each release, and
|
||||||
|
odd problems will turn up if the configuration files are not set up
|
||||||
|
as expected. If you want to carry your existing configuration to a
|
||||||
|
new version with minimal work, use ``make oldconfig``, which will
|
||||||
|
only ask you for the answers to new questions.
|
||||||
|
|
||||||
|
- Alternative configuration commands are::
|
||||||
|
|
||||||
|
"make config" Plain text interface.
|
||||||
|
|
||||||
|
"make menuconfig" Text based color menus, radiolists & dialogs.
|
||||||
|
|
||||||
|
"make nconfig" Enhanced text based color menus.
|
||||||
|
|
||||||
|
"make xconfig" Qt based configuration tool.
|
||||||
|
|
||||||
|
"make gconfig" GTK+ based configuration tool.
|
||||||
|
|
||||||
|
"make oldconfig" Default all questions based on the contents of
|
||||||
|
your existing ./.config file and asking about
|
||||||
|
new config symbols.
|
||||||
|
|
||||||
|
"make silentoldconfig"
|
||||||
|
Like above, but avoids cluttering the screen
|
||||||
|
with questions already answered.
|
||||||
|
Additionally updates the dependencies.
|
||||||
|
|
||||||
|
"make olddefconfig"
|
||||||
|
Like above, but sets new symbols to their default
|
||||||
|
values without prompting.
|
||||||
|
|
||||||
|
"make defconfig" Create a ./.config file by using the default
|
||||||
|
symbol values from either arch/$ARCH/defconfig
|
||||||
|
or arch/$ARCH/configs/${PLATFORM}_defconfig,
|
||||||
|
depending on the architecture.
|
||||||
|
|
||||||
|
"make ${PLATFORM}_defconfig"
|
||||||
|
Create a ./.config file by using the default
|
||||||
|
symbol values from
|
||||||
|
arch/$ARCH/configs/${PLATFORM}_defconfig.
|
||||||
|
Use "make help" to get a list of all available
|
||||||
|
platforms of your architecture.
|
||||||
|
|
||||||
|
"make allyesconfig"
|
||||||
|
Create a ./.config file by setting symbol
|
||||||
|
values to 'y' as much as possible.
|
||||||
|
|
||||||
|
"make allmodconfig"
|
||||||
|
Create a ./.config file by setting symbol
|
||||||
|
values to 'm' as much as possible.
|
||||||
|
|
||||||
|
"make allnoconfig" Create a ./.config file by setting symbol
|
||||||
|
values to 'n' as much as possible.
|
||||||
|
|
||||||
|
"make randconfig" Create a ./.config file by setting symbol
|
||||||
|
values to random values.
|
||||||
|
|
||||||
|
"make localmodconfig" Create a config based on current config and
|
||||||
|
loaded modules (lsmod). Disables any module
|
||||||
|
option that is not needed for the loaded modules.
|
||||||
|
|
||||||
|
To create a localmodconfig for another machine,
|
||||||
|
store the lsmod of that machine into a file
|
||||||
|
and pass it in as a LSMOD parameter.
|
||||||
|
|
||||||
|
target$ lsmod > /tmp/mylsmod
|
||||||
|
target$ scp /tmp/mylsmod host:/tmp
|
||||||
|
|
||||||
|
host$ make LSMOD=/tmp/mylsmod localmodconfig
|
||||||
|
|
||||||
|
The above also works when cross compiling.
|
||||||
|
|
||||||
|
"make localyesconfig" Similar to localmodconfig, except it will convert
|
||||||
|
all module options to built in (=y) options.
|
||||||
|
|
||||||
|
You can find more information on using the Linux kernel config tools
|
||||||
|
in Documentation/kbuild/kconfig.txt.
|
||||||
|
|
||||||
|
- NOTES on ``make config``:
|
||||||
|
|
||||||
|
- Having unnecessary drivers will make the kernel bigger, and can
|
||||||
|
under some circumstances lead to problems: probing for a
|
||||||
|
nonexistent controller card may confuse your other controllers.
|
||||||
|
|
||||||
|
- A kernel with math-emulation compiled in will still use the
|
||||||
|
coprocessor if one is present: the math emulation will just
|
||||||
|
never get used in that case. The kernel will be slightly larger,
|
||||||
|
but will work on different machines regardless of whether they
|
||||||
|
have a math coprocessor or not.
|
||||||
|
|
||||||
|
- The "kernel hacking" configuration details usually result in a
|
||||||
|
bigger or slower kernel (or both), and can even make the kernel
|
||||||
|
less stable by configuring some routines to actively try to
|
||||||
|
break bad code to find kernel problems (kmalloc()). Thus you
|
||||||
|
should probably answer 'n' to the questions for "development",
|
||||||
|
"experimental", or "debugging" features.
|
||||||
|
|
||||||
|
Compiling the kernel
|
||||||
|
--------------------
|
||||||
|
|
||||||
|
- Make sure you have at least gcc 3.2 available.
|
||||||
|
For more information, refer to :ref:`Documentation/process/changes.rst <changes>`.
|
||||||
|
|
||||||
|
Please note that you can still run a.out user programs with this kernel.
|
||||||
|
|
||||||
|
- Do a ``make`` to create a compressed kernel image. It is also
|
||||||
|
possible to do ``make install`` if you have lilo installed to suit the
|
||||||
|
kernel makefiles, but you may want to check your particular lilo setup first.
|
||||||
|
|
||||||
|
To do the actual install, you have to be root, but none of the normal
|
||||||
|
build should require that. Don't take the name of root in vain.
|
||||||
|
|
||||||
|
- If you configured any of the parts of the kernel as ``modules``, you
|
||||||
|
will also have to do ``make modules_install``.
|
||||||
|
|
||||||
|
- Verbose kernel compile/build output:
|
||||||
|
|
||||||
|
Normally, the kernel build system runs in a fairly quiet mode (but not
|
||||||
|
totally silent). However, sometimes you or other kernel developers need
|
||||||
|
to see compile, link, or other commands exactly as they are executed.
|
||||||
|
For this, use "verbose" build mode. This is done by passing
|
||||||
|
``V=1`` to the ``make`` command, e.g.::
|
||||||
|
|
||||||
|
make V=1 all
|
||||||
|
|
||||||
|
To have the build system also tell the reason for the rebuild of each
|
||||||
|
target, use ``V=2``. The default is ``V=0``.
|
||||||
|
|
||||||
|
- Keep a backup kernel handy in case something goes wrong. This is
|
||||||
|
especially true for the development releases, since each new release
|
||||||
|
contains new code which has not been debugged. Make sure you keep a
|
||||||
|
backup of the modules corresponding to that kernel, as well. If you
|
||||||
|
are installing a new kernel with the same version number as your
|
||||||
|
working kernel, make a backup of your modules directory before you
|
||||||
|
do a ``make modules_install``.
|
||||||
|
|
||||||
|
Alternatively, before compiling, use the kernel config option
|
||||||
|
"LOCALVERSION" to append a unique suffix to the regular kernel version.
|
||||||
|
LOCALVERSION can be set in the "General Setup" menu.
|
||||||
|
|
||||||
|
- In order to boot your new kernel, you'll need to copy the kernel
|
||||||
|
image (e.g. .../linux/arch/x86/boot/bzImage after compilation)
|
||||||
|
to the place where your regular bootable kernel is found.
|
||||||
|
|
||||||
|
- Booting a kernel directly from a floppy without the assistance of a
|
||||||
|
bootloader such as LILO is no longer supported.
|
||||||
|
|
||||||
|
If you boot Linux from the hard drive, chances are you use LILO, which
|
||||||
|
uses the kernel image as specified in the file /etc/lilo.conf. The
|
||||||
|
kernel image file is usually /vmlinuz, /boot/vmlinuz, /bzImage or
|
||||||
|
/boot/bzImage. To use the new kernel, save a copy of the old image
|
||||||
|
and copy the new image over the old one. Then, you MUST RERUN LILO
|
||||||
|
to update the loading map! If you don't, you won't be able to boot
|
||||||
|
the new kernel image.
|
||||||
|
|
||||||
|
Reinstalling LILO is usually a matter of running /sbin/lilo.
|
||||||
|
You may wish to edit /etc/lilo.conf to specify an entry for your
|
||||||
|
old kernel image (say, /vmlinux.old) in case the new one does not
|
||||||
|
work. See the LILO docs for more information.
|
||||||
|
|
||||||
|
After reinstalling LILO, you should be all set. Shut down the system,
|
||||||
|
reboot, and enjoy!
|
||||||
|
|
||||||
|
If you ever need to change the default root device, video mode,
|
||||||
|
ramdisk size, etc. in the kernel image, use the ``rdev`` program (or
|
||||||
|
alternatively the LILO boot options when appropriate). No need to
|
||||||
|
recompile the kernel to change these parameters.
|
||||||
|
|
||||||
|
- Reboot with the new kernel and enjoy.
|
||||||
|
|
||||||
|
If something goes wrong
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
- If you have problems that seem to be due to kernel bugs, please check
|
||||||
|
the file MAINTAINERS to see if there is a particular person associated
|
||||||
|
with the part of the kernel that you are having trouble with. If there
|
||||||
|
isn't anyone listed there, then the second best thing is to mail
|
||||||
|
them to me (torvalds@linux-foundation.org), and possibly to any other
|
||||||
|
relevant mailing-list or to the newsgroup.
|
||||||
|
|
||||||
|
- In all bug-reports, *please* tell what kernel you are talking about,
|
||||||
|
how to duplicate the problem, and what your setup is (use your common
|
||||||
|
sense). If the problem is new, tell me so, and if the problem is
|
||||||
|
old, please try to tell me when you first noticed it.
|
||||||
|
|
||||||
|
- If the bug results in a message like::
|
||||||
|
|
||||||
|
unable to handle kernel paging request at address C0000010
|
||||||
|
Oops: 0002
|
||||||
|
EIP: 0010:XXXXXXXX
|
||||||
|
eax: xxxxxxxx ebx: xxxxxxxx ecx: xxxxxxxx edx: xxxxxxxx
|
||||||
|
esi: xxxxxxxx edi: xxxxxxxx ebp: xxxxxxxx
|
||||||
|
ds: xxxx es: xxxx fs: xxxx gs: xxxx
|
||||||
|
Pid: xx, process nr: xx
|
||||||
|
xx xx xx xx xx xx xx xx xx xx
|
||||||
|
|
||||||
|
or similar kernel debugging information on your screen or in your
|
||||||
|
system log, please duplicate it *exactly*. The dump may look
|
||||||
|
incomprehensible to you, but it does contain information that may
|
||||||
|
help debugging the problem. The text above the dump is also
|
||||||
|
important: it tells something about why the kernel dumped code (in
|
||||||
|
the above example, it's due to a bad kernel pointer). More information
|
||||||
|
on making sense of the dump is in Documentation/admin-guide/oops-tracing.rst.
|
||||||
|
|
||||||
|
- If you compiled the kernel with CONFIG_KALLSYMS you can send the dump
|
||||||
|
as is, otherwise you will have to use the ``ksymoops`` program to make
|
||||||
|
sense of the dump (but compiling with CONFIG_KALLSYMS is usually preferred).
|
||||||
|
This utility can be downloaded from
|
||||||
|
ftp://ftp.<country>.kernel.org/pub/linux/utils/kernel/ksymoops/ .
|
||||||
|
Alternatively, you can do the dump lookup by hand:
|
||||||
|
|
||||||
|
- In debugging dumps like the above, it helps enormously if you can
|
||||||
|
look up what the EIP value means. The hex value as such doesn't help
|
||||||
|
me or anybody else very much: it will depend on your particular
|
||||||
|
kernel setup. What you should do is take the hex value from the EIP
|
||||||
|
line (ignore the ``0010:``), and look it up in the kernel namelist to
|
||||||
|
see which kernel function contains the offending address.
|
||||||
|
|
||||||
|
To find out the kernel function name, you'll need to find the system
|
||||||
|
binary associated with the kernel that exhibited the symptom. This is
|
||||||
|
the file 'linux/vmlinux'. To extract the namelist and match it against
|
||||||
|
the EIP from the kernel crash, do::
|
||||||
|
|
||||||
|
nm vmlinux | sort | less
|
||||||
|
|
||||||
|
This will give you a list of kernel addresses sorted in ascending
|
||||||
|
order, from which it is simple to find the function that contains the
|
||||||
|
offending address. Note that the address given by the kernel
|
||||||
|
debugging messages will not necessarily match exactly with the
|
||||||
|
function addresses (in fact, that is very unlikely), so you can't
|
||||||
|
just 'grep' the list: the list will, however, give you the starting
|
||||||
|
point of each kernel function, so by looking for the function that
|
||||||
|
has a starting address lower than the one you are searching for but
|
||||||
|
is followed by a function with a higher address you will find the one
|
||||||
|
you want. In fact, it may be a good idea to include a bit of
|
||||||
|
"context" in your problem report, giving a few lines around the
|
||||||
|
interesting one.
|
||||||
|
|
||||||
|
If you for some reason cannot do the above (you have a pre-compiled
|
||||||
|
kernel image or similar), telling me as much about your setup as
|
||||||
|
possible will help. Please read the :ref:`admin-guide/reporting-bugs.rst <reportingbugs>`
|
||||||
|
document for details.
|
||||||
|
|
||||||
|
- Alternatively, you can use gdb on a running kernel. (read-only; i.e. you
|
||||||
|
cannot change values or set break points.) To do this, first compile the
|
||||||
|
kernel with -g; edit arch/x86/Makefile appropriately, then do a ``make
|
||||||
|
clean``. You'll also need to enable CONFIG_PROC_FS (via ``make config``).
|
||||||
|
|
||||||
|
After you've rebooted with the new kernel, do ``gdb vmlinux /proc/kcore``.
|
||||||
|
You can now use all the usual gdb commands. The command to look up the
|
||||||
|
point where your system crashed is ``l *0xXXXXXXXX``. (Replace the XXXes
|
||||||
|
with the EIP value.)
|
||||||
|
|
||||||
|
gdb'ing a non-running kernel currently fails because ``gdb`` (wrongly)
|
||||||
|
disregards the starting offset for which the kernel is compiled.
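The by-hand ``nm`` lookup described above can be sketched as a small shell function. This is only an illustration: ``find_symbol`` is a hypothetical helper, not something shipped with the kernel, and it assumes 32-bit addresses like those in the example dump.

```shell
# find_symbol ADDR: read sorted "nm" output ("address type name") on stdin
# and print the last text symbol whose start address is <= ADDR.
# ADDR is hex without the "0x" prefix. Hypothetical helper; assumes
# 32-bit addresses so they fit comfortably in shell arithmetic.
find_symbol() {
    target=$((0x$1))
    found=
    while read -r addr kind name; do
        case $kind in
            [Tt]) ;;            # keep only code (text) symbols
            *) continue ;;
        esac
        # The list is sorted, so stop at the first symbol past the target.
        [ $((0x$addr)) -le "$target" ] || break
        found=$name
    done
    printf '%s\n' "$found"
}
```

With a real kernel this would be run as ``nm vmlinux | sort | find_symbol XXXXXXXX``, with the EIP value in place of the X's.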
|
||||||
151
Documentation/admin-guide/binfmt-misc.rst
Normal file
@@ -0,0 +1,151 @@
|
|||||||
|
Kernel Support for miscellaneous (your favourite) Binary Formats v1.1
|
||||||
|
=====================================================================
|
||||||
|
|
||||||
|
This Kernel feature allows you to invoke almost (for restrictions see below)
|
||||||
|
every program by simply typing its name in the shell.
|
||||||
|
This includes for example compiled Java(TM), Python or Emacs programs.
|
||||||
|
|
||||||
|
To achieve this you must tell binfmt_misc which interpreter has to be invoked
|
||||||
|
with which binary. Binfmt_misc recognises the binary-type by matching some bytes
|
||||||
|
at the beginning of the file with a magic byte sequence (masking out specified
|
||||||
|
bits) you have supplied. Binfmt_misc can also recognise a filename extension
|
||||||
|
such as ``.com`` or ``.exe``.
|
||||||
|
|
||||||
|
First you must mount binfmt_misc::
|
||||||
|
|
||||||
|
mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
|
||||||
|
|
||||||
|
To actually register a new binary type, you have to set up a string looking like
|
||||||
|
``:name:type:offset:magic:mask:interpreter:flags`` (where you can choose the
|
||||||
|
``:`` separator to suit your needs) and echo it to ``/proc/sys/fs/binfmt_misc/register``.
|
||||||
|
|
||||||
|
Here is what the fields mean:
|
||||||
|
|
||||||
|
- ``name``
|
||||||
|
is an identifier string. A new /proc file will be created with this
|
||||||
|
name below ``/proc/sys/fs/binfmt_misc``; cannot contain slashes ``/`` for
|
||||||
|
obvious reasons.
|
||||||
|
- ``type``
|
||||||
|
is the type of recognition. Give ``M`` for magic and ``E`` for extension.
|
||||||
|
- ``offset``
|
||||||
|
is the offset of the magic/mask in the file, counted in bytes. This
|
||||||
|
defaults to 0 if you omit it (i.e. you write ``:name:type::magic...``).
|
||||||
|
Ignored when using filename extension matching.
|
||||||
|
- ``magic``
|
||||||
|
is the byte sequence binfmt_misc matches against. The magic string
|
||||||
|
may contain hex-encoded characters like ``\x0a`` or ``\xA4``. Note that you
|
||||||
|
must escape any NUL bytes; parsing halts at the first one. In a shell
|
||||||
|
environment you might have to write ``\\x0a`` to prevent the shell from
|
||||||
|
eating your ``\``.
|
||||||
|
If you chose filename extension matching, this is the extension to be
|
||||||
|
recognised (without the ``.``, the ``\x0a`` specials are not allowed).
|
||||||
|
Extension matching is case sensitive, and slashes ``/`` are not allowed!
|
||||||
|
- ``mask``
|
||||||
|
is an (optional, defaults to all 0xff) mask. You can mask out some
|
||||||
|
bits from matching by supplying a string like magic and as long as magic.
|
||||||
|
The mask is anded with the byte sequence of the file. Note that you must
|
||||||
|
escape any NUL bytes; parsing halts at the first one. Ignored when using
|
||||||
|
filename extension matching.
|
||||||
|
- ``interpreter``
|
||||||
|
is the program that should be invoked with the binary as first
|
||||||
|
argument (specify the full path).
|
||||||
|
- ``flags``
|
||||||
|
is an optional field that controls several aspects of the invocation
|
||||||
|
of the interpreter. It is a string of capital letters, each of which controls a
|
||||||
|
certain aspect. The following flags are supported:
|
||||||
|
|
||||||
|
``P`` - preserve-argv[0]
|
||||||
|
Legacy behavior of binfmt_misc is to overwrite
|
||||||
|
the original argv[0] with the full path to the binary. When this
|
||||||
|
flag is included, binfmt_misc will add an argument to the argument
|
||||||
|
vector for this purpose, thus preserving the original ``argv[0]``.
|
||||||
|
e.g. If your interp is set to ``/bin/foo`` and you run ``blah``
|
||||||
|
(which is in ``/usr/local/bin``), then the kernel will execute
|
||||||
|
``/bin/foo`` with ``argv[]`` set to ``["/bin/foo", "/usr/local/bin/blah", "blah"]``. The interp has to be aware of this so it can
|
||||||
|
execute ``/usr/local/bin/blah``
|
||||||
|
with ``argv[]`` set to ``["blah"]``.
|
||||||
|
``O`` - open-binary
|
||||||
|
Legacy behavior of binfmt_misc is to pass the full path
|
||||||
|
of the binary to the interpreter as an argument. When this flag is
|
||||||
|
included, binfmt_misc will open the file for reading and pass its
|
||||||
|
descriptor as an argument, instead of the full path, thus allowing
|
||||||
|
the interpreter to execute non-readable binaries. This feature
|
||||||
|
should be used with care - the interpreter has to be trusted not to
|
||||||
|
emit the contents of the non-readable binary.
|
||||||
|
``C`` - credentials
|
||||||
|
Currently, the behavior of binfmt_misc is to calculate
|
||||||
|
the credentials and security token of the new process according to
|
||||||
|
the interpreter. When this flag is included, these attributes are
|
||||||
|
calculated according to the binary. It also implies the ``O`` flag.
|
||||||
|
This feature should be used with care as the interpreter
|
||||||
|
will run with root permissions when a setuid binary owned by root
|
||||||
|
is run with binfmt_misc.
|
||||||
|
``F`` - fix binary
|
||||||
|
The usual behaviour of binfmt_misc is to spawn the
|
||||||
|
binary lazily when the misc format file is invoked. However,
|
||||||
|
this doesn't work very well in the face of mount namespaces and
|
||||||
|
changeroots, so the ``F`` mode opens the binary as soon as the
|
||||||
|
emulation is installed and uses the opened image to spawn the
|
||||||
|
emulator, meaning it is always available once installed,
|
||||||
|
regardless of how the environment changes.
|
||||||
|
|
||||||
|
|
||||||
|
There are some restrictions:
|
||||||
|
|
||||||
|
- the whole register string may not exceed 1920 characters
|
||||||
|
- the magic must reside in the first 128 bytes of the file, i.e.
|
||||||
|
offset+size(magic) has to be less than 128
|
||||||
|
- the interpreter string may not exceed 127 characters
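The field layout and the restrictions above can be combined into a small pre-flight check. The sketch below is an assumption-laden illustration: ``binfmt_entry`` is a hypothetical helper that only assembles and prints the register string; it never writes to ``/proc``, and it counts each ``\xHH`` escape in the magic as one byte.

```shell
# binfmt_entry NAME TYPE OFFSET MAGIC MASK INTERPRETER FLAGS
# Hypothetical helper: build a register string and check it against the
# documented limits. Prints the string on success, returns 1 otherwise.
binfmt_entry() {
    name=$1 type=$2 offset=$3 magic=$4 mask=$5 interp=$6 flags=$7
    entry=":$name:$type:$offset:$magic:$mask:$interp:$flags"

    # Replace every \xHH escape by a single character to measure the
    # magic in bytes rather than in typed characters.
    decoded=$(printf '%s' "$magic" | sed 's/\\x[0-9A-Fa-f][0-9A-Fa-f]/X/g')

    [ "${#entry}" -le 1920 ] ||
        { echo "register string exceeds 1920 characters" >&2; return 1; }
    [ "${#interp}" -le 127 ] ||
        { echo "interpreter string exceeds 127 characters" >&2; return 1; }
    if [ "$type" = M ]; then
        # An omitted offset defaults to 0.
        [ $(( ${offset:-0} + ${#decoded} )) -lt 128 ] ||
            { echo "magic must lie within the first 128 bytes" >&2; return 1; }
    fi
    printf '%s\n' "$entry"
}
```

For instance, ``binfmt_entry DOSWin M '' MZ '' /usr/local/bin/wine ''`` prints ``:DOSWin:M::MZ::/usr/local/bin/wine:``, which could then be echoed to the ``register`` file as root.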
|
||||||
|
|
||||||
|
To use binfmt_misc you have to mount it first. You can mount it with
|
||||||
|
``mount -t binfmt_misc none /proc/sys/fs/binfmt_misc`` command, or you can add
|
||||||
|
a line ``none /proc/sys/fs/binfmt_misc binfmt_misc defaults 0 0`` to your
|
||||||
|
``/etc/fstab`` so it auto mounts on boot.
|
||||||
|
|
||||||
|
You may want to add the binary formats in one of your ``/etc/rc`` scripts during
|
||||||
|
boot-up. Read the manual of your init program to figure out how to do this
|
||||||
|
right.
|
||||||
|
|
||||||
|
Think about the order of adding entries! Later added entries are matched first!
|
||||||
|
|
||||||
|
|
||||||
|
A few examples (assuming you are in ``/proc/sys/fs/binfmt_misc``):
|
||||||
|
|
||||||
|
- enable support for em86 (like binfmt_em86, for Alpha AXP only)::
|
||||||
|
|
||||||
|
echo ':i386:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register
|
||||||
|
echo ':i486:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x06:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register
|
||||||
|
|
||||||
|
- enable support for packed DOS applications (pre-configured dosemu hdimages)::
|
||||||
|
|
||||||
|
echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register
|
||||||
|
|
||||||
|
- enable support for Windows executables using wine::
|
||||||
|
|
||||||
|
echo ':DOSWin:M::MZ::/usr/local/bin/wine:' > register
|
||||||
|
|
||||||
|
For java support see Documentation/admin-guide/java.rst
|
||||||
|
|
||||||
|
|
||||||
|
You can enable/disable binfmt_misc or one binary type by echoing 0 (to disable)
|
||||||
|
or 1 (to enable) to ``/proc/sys/fs/binfmt_misc/status`` or
|
||||||
|
``/proc/.../the_name``.
|
||||||
|
Catting the file tells you the current status of ``binfmt_misc/the_entry``.
|
||||||
|
|
||||||
|
You can remove one entry or all entries by echoing -1 to ``/proc/.../the_name``
|
||||||
|
or ``/proc/sys/fs/binfmt_misc/status``.
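The enable, disable and remove operations described above all amount to writing ``1``, ``0`` or ``-1`` to the right file. A minimal sketch follows; ``binfmt_ctl`` and the ``BINFMT_DIR`` variable are hypothetical names introduced here so the function can be pointed at a scratch directory for a dry run.

```shell
# Hypothetical wrapper around the status files described above.
# BINFMT_DIR is an assumption of this sketch; it defaults to the real
# mount point but can be overridden for testing.
BINFMT_DIR=${BINFMT_DIR:-/proc/sys/fs/binfmt_misc}

# binfmt_ctl enable|disable|remove [NAME]
# Without NAME the action applies to binfmt_misc as a whole (the
# "status" file); with NAME it applies to that one entry.
binfmt_ctl() {
    action=$1
    target=${2:-status}
    case $action in
        enable)  value=1 ;;
        disable) value=0 ;;
        remove)  value=-1 ;;
        *) echo "usage: binfmt_ctl enable|disable|remove [name]" >&2
           return 2 ;;
    esac
    printf '%s\n' "$value" > "$BINFMT_DIR/$target"
}
```

On a live system these writes require root, and ``binfmt_ctl remove`` without a name removes all entries.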
|
||||||
|
|
||||||
|
|
||||||
|
Hints
|
||||||
|
-----
|
||||||
|
|
||||||
|
If you want to pass special arguments to your interpreter, you can
|
||||||
|
write a wrapper script for it. See Documentation/admin-guide/java.rst for an
|
||||||
|
example.
|
||||||
|
|
||||||
|
Your interpreter should NOT look in the PATH for the filename; the kernel
|
||||||
|
passes it the full filename (or the file descriptor) to use. Using ``$PATH`` can
|
||||||
|
cause unexpected behaviour and can be a security hazard.
|
||||||
|
|
||||||
|
|
||||||
|
Richard Günther <rguenth@tat.physik.uni-tuebingen.de>
|
||||||
38
Documentation/admin-guide/braille-console.rst
Normal file
@@ -0,0 +1,38 @@
|
|||||||
|
Linux Braille Console
|
||||||
|
=====================
|
||||||
|
|
||||||
|
To get early boot messages on a braille device (before userspace screen
|
||||||
|
readers can start), you first need to compile the support for the usual serial
|
||||||
|
console (see :ref:`Documentation/admin-guide/serial-console.rst <serial_console>`), and
|
||||||
|
for braille device
|
||||||
|
(in :menuselection:`Device Drivers --> Accessibility support --> Console on braille device`).
|
||||||
|
|
||||||
|
Then you need to specify a ``console=brl`` option on the kernel command line. The
|
||||||
|
format is::
|
||||||
|
|
||||||
|
console=brl,serial_options...
|
||||||
|
|
||||||
|
where ``serial_options...`` are the same as described in
|
||||||
|
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`.
|
||||||
|
|
||||||
|
So for instance you can use ``console=brl,ttyS0`` if the braille device is connected to the first serial port, and ``console=brl,ttyS0,115200`` to
|
||||||
|
override the baud rate to 115200, etc.
|
||||||
|
|
||||||
|
By default, the braille device will just show the last kernel message (console
|
||||||
|
mode). To review previous messages, press the Insert key to switch to the VT
|
||||||
|
review mode. In review mode, the arrow keys let you browse the VT content,
|
||||||
|
:kbd:`PAGE-UP`/:kbd:`PAGE-DOWN` keys go to the top/bottom of the screen, and
|
||||||
|
the :kbd:`HOME` key goes back
|
||||||
|
to the cursor, hence providing a very basic screen reviewing facility.
|
||||||
|
|
||||||
|
Sound feedback can be obtained by adding the ``braille_console.sound=1`` kernel
|
||||||
|
parameter.
|
||||||
|
|
||||||
|
For simplicity, only one braille console can be enabled; other uses of
|
||||||
|
``console=brl,...`` will be discarded. Also note that it does not interfere with
|
||||||
|
the console selection mechanism described in
|
||||||
|
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`.
|
||||||
|
|
||||||
|
For now, only the VisioBraille device is supported.
|
||||||
|
|
||||||
|
Samuel Thibault <samuel.thibault@ens-lyon.org>
|
||||||
76
Documentation/admin-guide/bug-bisect.rst
Normal file
@@ -0,0 +1,76 @@
|
|||||||
|
Bisecting a bug
|
||||||
|
+++++++++++++++
|
||||||
|
|
||||||
|
Last updated: 28 October 2016
|
||||||
|
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
Always try the latest kernel from kernel.org and build from source. If you are
|
||||||
|
not confident in doing that please report the bug to your distribution vendor
|
||||||
|
instead of to a kernel developer.
|
||||||
|
|
||||||
|
Finding bugs is not always easy. Have a go though. If you can't find it don't
|
||||||
|
give up. Report as much as you have found to the relevant maintainer. See
|
||||||
|
MAINTAINERS for who that is for the subsystem you have worked on.
|
||||||
|
|
||||||
|
Before you submit a bug report read
|
||||||
|
:ref:`Documentation/admin-guide/reporting-bugs.rst <reportingbugs>`.
|
||||||
|
|
||||||
|
Devices not appearing
|
||||||
|
=====================
|
||||||
|
|
||||||
|
Often this is caused by udev/systemd. Check that first before blaming it
|
||||||
|
on the kernel.
|
||||||
|
|
||||||
|
Finding patch that caused a bug
|
||||||
|
===============================
|
||||||
|
|
||||||
|
Using the provided tools with ``git`` makes finding bugs easy provided the bug
|
||||||
|
is reproducible.
|
||||||
|
|
||||||
|
Steps to do it:
|
||||||
|
|
||||||
|
- build the Kernel from its git source
|
||||||
|
- start bisect with [#f1]_::
|
||||||
|
|
||||||
|
$ git bisect start
|
||||||
|
|
||||||
|
- mark the broken changeset with::
|
||||||
|
|
||||||
|
$ git bisect bad [commit]
|
||||||
|
|
||||||
|
- mark a changeset where the code is known to work with::
|
||||||
|
|
||||||
|
$ git bisect good [commit]
|
||||||
|
|
||||||
|
- rebuild the Kernel and test
|
||||||
|
- interact with git bisect by using either::
|
||||||
|
|
||||||
|
$ git bisect good
|
||||||
|
|
||||||
|
or::
|
||||||
|
|
||||||
|
$ git bisect bad
|
||||||
|
|
||||||
|
depending on whether the bug happens on the changeset you're testing
|
||||||
|
- After some interactions, git bisect will give you the changeset that
|
||||||
|
likely caused the bug.
|
||||||
|
|
||||||
|
- For example, if you know that the current version is bad, and version
|
||||||
|
4.8 is good, you could do::
|
||||||
|
|
||||||
|
$ git bisect start
|
||||||
|
$ git bisect bad # Current version is bad
|
||||||
|
$ git bisect good v4.8
|
||||||
|
|
||||||
|
|
||||||
|
.. [#f1] You can, optionally, provide both good and bad arguments at bisect
|
||||||
|
start with ``git bisect start [BAD] [GOOD]``
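The rebuild-and-test loop in the steps above can be handed over to ``git bisect run``, which treats an exit status of 0 as good, 125 as "cannot test this changeset, skip it", and any other status from 1 to 127 as bad. The function below is a sketch of that mapping; ``bisect_verdict`` and the commands passed to it are hypothetical.

```shell
# bisect_verdict BUILD_CMD TEST_CMD
# Hypothetical helper that maps a build step and a reproducer onto the
# exit statuses understood by "git bisect run".
bisect_verdict() {
    # A changeset that does not build cannot be judged: ask git to skip it.
    sh -c "$1" >/dev/null 2>&1 || return 125
    # The reproducer fails: the bug is present, so the changeset is bad.
    sh -c "$2" >/dev/null 2>&1 || return 1
    # Build and reproducer both succeeded: the changeset is good.
    return 0
}
```

A wrapper script that defines this function and ends with something like ``bisect_verdict 'make' './reproduce.sh'`` (where ``reproduce.sh`` stands for whatever test exposes the bug) could then be passed to ``git bisect run`` after marking the initial good and bad changesets.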
|
||||||
|
|
||||||
|
For further references, please read:
|
||||||
|
|
||||||
|
- The man page for ``git-bisect``
|
||||||
|
- `Fighting regressions with git bisect <https://www.kernel.org/pub/software/scm/git/docs/git-bisect-lk2009.html>`_
|
||||||
|
- `Fully automated bisecting with "git bisect run" <https://lwn.net/Articles/317154>`_
|
||||||
|
- `Using Git bisect to figure out when brokenness was introduced <http://webchick.net/node/99>`_
|
||||||
369
Documentation/admin-guide/bug-hunting.rst
Normal file
@@ -0,0 +1,369 @@
|
|||||||
|
Bug hunting
|
||||||
|
===========
|
||||||
|
|
||||||
|
Kernel bug reports often come with a stack dump like the one below::
|
||||||
|
|
||||||
|
------------[ cut here ]------------
|
||||||
|
WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
|
||||||
|
Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
|
||||||
|
CPU: 1 PID: 28102 Comm: rmmod Tainted: P WC O 4.8.4-build.1 #1
|
||||||
|
Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
|
||||||
|
00000000 c12ba080 00000000 00000000 c103ed6a c1616014 00000001 00006dc6
|
||||||
|
c1615862 00000454 c109e8a7 c109e8a7 00000009 ffffffff 00000000 f13f6a10
|
||||||
|
f5f5a600 c103ee33 00000009 00000000 00000000 c109e8a7 f80ca4d0 c109f617
|
||||||
|
Call Trace:
|
||||||
|
[<c12ba080>] ? dump_stack+0x44/0x64
|
||||||
|
[<c103ed6a>] ? __warn+0xfa/0x120
|
||||||
|
[<c109e8a7>] ? module_put+0x57/0x70
|
||||||
|
[<c109e8a7>] ? module_put+0x57/0x70
|
||||||
|
[<c103ee33>] ? warn_slowpath_null+0x23/0x30
|
||||||
|
[<c109e8a7>] ? module_put+0x57/0x70
|
||||||
|
[<f80ca4d0>] ? gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
|
||||||
|
[<c109f617>] ? symbol_put_addr+0x27/0x50
|
||||||
|
[<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
|
||||||
|
[<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
|
||||||
|
[<c13d03bc>] ? usb_disable_endpoint+0x7c/0xb0
|
||||||
|
[<f80bb48a>] ? dvb_usb_device_exit+0x2a/0x50 [dvb_usb]
|
||||||
|
[<c13d2882>] ? usb_unbind_interface+0x62/0x250
|
||||||
|
[<c136b514>] ? __pm_runtime_idle+0x44/0x70
|
||||||
|
[<c13620d8>] ? __device_release_driver+0x78/0x120
|
||||||
|
[<c1362907>] ? driver_detach+0x87/0x90
|
||||||
|
[<c1361c48>] ? bus_remove_driver+0x38/0x90
|
||||||
|
[<c13d1c18>] ? usb_deregister+0x58/0xb0
|
||||||
|
[<c109fbb0>] ? SyS_delete_module+0x130/0x1f0
|
||||||
|
[<c1055654>] ? task_work_run+0x64/0x80
|
||||||
|
[<c1000fa5>] ? exit_to_usermode_loop+0x85/0x90
|
||||||
|
[<c10013f0>] ? do_fast_syscall_32+0x80/0x130
|
||||||
|
[<c1549f43>] ? sysenter_past_esp+0x40/0x6a
|
||||||
|
---[ end trace 6ebc60ef3981792f ]---
|
||||||
|
|
||||||
|
Such stack traces provide enough information to identify the line inside the
|
||||||
|
Kernel's source code where the bug happened. Depending on the severity of
|
||||||
|
the issue, it may also contain the word **Oops**, as on this one::
|
||||||
|
|
||||||
|
BUG: unable to handle kernel NULL pointer dereference at (null)
|
||||||
|
IP: [<c06969d4>] iret_exc+0x7d0/0xa59
|
||||||
|
*pdpt = 000000002258a001 *pde = 0000000000000000
|
||||||
|
Oops: 0002 [#1] PREEMPT SMP
|
||||||
|
...
|
||||||
|
|
||||||
|
Whether it is an **Oops** or some other sort of stack trace, the offending
|
||||||
|
line is usually required to identify and handle the bug. Throughout this chapter,
|
||||||
|
we'll use "Oops" to refer to all kinds of stack traces that need to be analyzed.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
``ksymoops`` is useless on 2.6 or later. Please use the Oops in its original
|
||||||
|
format (from ``dmesg``, etc). Ignore any references in this or other docs to
|
||||||
|
"decoding the Oops" or "running it through ksymoops".
|
||||||
|
If you post an Oops from 2.6+ that has been run through ``ksymoops``,
|
||||||
|
people will just tell you to repost it.
|
||||||
|
|
||||||
|
Where is the Oops message located?
|
||||||
|
-------------------------------------
|
||||||
|
|
||||||
|
Normally the Oops text is read from the kernel buffers by klogd and
|
||||||
|
handed to ``syslogd`` which writes it to a syslog file, typically
|
||||||
|
``/var/log/messages`` (depends on ``/etc/syslog.conf``). On systems with
|
||||||
|
systemd, it may also be stored by the ``journald`` daemon, and accessed
|
||||||
|
by running ``journalctl`` command.
|
||||||
|
|
||||||
|
Sometimes ``klogd`` dies, in which case you can run ``dmesg > file`` to
|
||||||
|
read the data from the kernel buffers and save it. Or you can
|
||||||
|
``cat /proc/kmsg > file``; however, you have to break in to stop the transfer, since
|
||||||
|
``kmsg`` is a "never ending file".
|
||||||
|
|
||||||
|
If the machine has crashed so badly that you cannot enter commands or
|
||||||
|
the disk is not available then you have three options:
|
||||||
|
|
||||||
|
(1) Hand copy the text from the screen and type it in after the machine
|
||||||
|
has restarted. Messy but it is the only option if you have not
|
||||||
|
planned for a crash. Alternatively, you can take a picture of
|
||||||
|
the screen with a digital camera - not nice, but better than
|
||||||
|
nothing. If the messages scroll off the top of the console, you
|
||||||
|
may find that booting with a higher resolution (eg, ``vga=791``)
|
||||||
|
will allow you to read more of the text. (Caveat: This needs ``vesafb``,
|
||||||
|
so won't help for 'early' oopses)
|
||||||
|
|
||||||
|
(2) Boot with a serial console (see
|
||||||
|
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`),
|
||||||
|
run a null modem to a second machine and capture the output there
|
||||||
|
using your favourite communication program. Minicom works well.
|
||||||
|
|
||||||
|
(3) Use Kdump (see Documentation/kdump/kdump.txt),
|
||||||
|
extract the kernel ring buffer from old memory using the dmesg
|
||||||
|
gdbmacro in Documentation/kdump/gdbmacros.txt.
|
||||||
|
|
||||||
|
Finding the bug's location
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
Reporting a bug works best if you point out the location of the bug in the
|
||||||
|
Kernel source file. There are two methods for doing that. Usually, using
|
||||||
|
``gdb`` is easier, but the Kernel should be pre-compiled with debug info.
|
||||||
|
|
||||||
|
gdb
|
||||||
|
^^^
|
||||||
|
|
||||||
|
The GNU debugger (``gdb``) is the best way to figure out the exact file and line
|
||||||
|
number of the OOPS from the ``vmlinux`` file.
|
||||||
|
|
||||||
|
The usage of gdb works best on a kernel compiled with ``CONFIG_DEBUG_INFO``.
|
||||||
|
This can be set by running::
|
||||||
|
|
||||||
|
$ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
|
||||||
|
|
||||||
|
On a kernel compiled with ``CONFIG_DEBUG_INFO``, you can simply copy the
|
||||||
|
EIP value from the OOPS::
|
||||||
|
|
||||||
|
EIP: 0060:[<c021e50e>] Not tainted VLI
|
||||||
|
|
||||||
|
And use GDB to translate that to human-readable form::
|
||||||
|
|
||||||
|
$ gdb vmlinux
|
||||||
|
(gdb) l *0xc021e50e
|
||||||
|
|
||||||
|
If you don't have ``CONFIG_DEBUG_INFO`` enabled, you use the function
|
||||||
|
offset from the OOPS::
|
||||||
|
|
||||||
|
EIP is at vt_ioctl+0xda8/0x1482
|
||||||
|
|
||||||
|
And recompile the kernel with ``CONFIG_DEBUG_INFO`` enabled::
|
||||||
|
|
||||||
|
$ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
|
||||||
|
$ make vmlinux
|
||||||
|
$ gdb vmlinux
|
||||||
|
(gdb) l *vt_ioctl+0xda8
|
||||||
|
0x1888 is in vt_ioctl (drivers/tty/vt/vt_ioctl.c:293).
|
||||||
|
288 {
|
||||||
|
289 struct vc_data *vc = NULL;
|
||||||
|
290 int ret = 0;
|
||||||
|
291
|
||||||
|
292 console_lock();
|
||||||
|
293 if (VT_BUSY(vc_num))
|
||||||
|
294 ret = -EBUSY;
|
||||||
|
295 else if (vc_num)
|
||||||
|
296 vc = vc_deallocate(vc_num);
|
||||||
|
297 console_unlock();
|
||||||
|
|
||||||
|
or, if you want to be more verbose::
|
||||||
|
|
||||||
|
(gdb) p vt_ioctl
|
||||||
|
$1 = {int (struct tty_struct *, unsigned int, unsigned long)} 0xae0 <vt_ioctl>
|
||||||
|
(gdb) l *0xae0+0xda8
|
||||||
|
|
||||||
|
You could, instead, use the object file::
|
||||||
|
|
||||||
|
$ make drivers/tty/
|
||||||
|
$ gdb drivers/tty/vt/vt_ioctl.o
|
||||||
|
(gdb) l *vt_ioctl+0xda8
|
||||||
|
|
||||||
|
If you have a call trace, such as::
|
||||||
|
|
||||||
|
Call Trace:
|
||||||
|
[<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
|
||||||
|
[<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
|
||||||
|
[<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
|
||||||
|
...
|
||||||
|
|
||||||
|
this shows that the problem is likely in the :jbd: module. You can load that module
|
||||||
|
in gdb and list the relevant code::
|
||||||
|
|
||||||
|
$ gdb fs/jbd/jbd.ko
|
||||||
|
(gdb) l *log_wait_commit+0xa3
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
You can also do the same for any function call in the stack trace,
|
||||||
|
like this one::
|
||||||
|
|
||||||
|
[<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
|
||||||
|
|
||||||
|
The position where the above call happened can be seen with::
|
||||||
|
|
||||||
|
$ gdb drivers/media/usb/dvb-usb/dvb-usb.o
|
||||||
|
(gdb) l *dvb_usb_adapter_frontend_exit+0x3a
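The gdb lookups above all follow the same pattern: take the ``symbol+offset`` part of a trace line, drop the ``/size`` suffix, and pass it to ``l *``. As a rough sketch, the hypothetical filter below turns trace lines into such commands; it assumes the line formats shown in this chapter and lower-case hex offsets.

```shell
# trace_to_gdb: read Oops or call-trace lines on stdin and print one
# "l *symbol+offset" gdb command per line that contains a
# symbol+0xOFF/0xSIZE reference. Hypothetical helper; other lines are
# dropped silently.
trace_to_gdb() {
    # The first expression adds a leading space so that a symbol at the
    # very start of the line is still preceded by a separator.
    sed -n \
        -e 's/^/ /' \
        -e 's|.*[^A-Za-z0-9_.]\([A-Za-z_][A-Za-z0-9_.]*+0x[0-9a-f][0-9a-f]*\)/0x[0-9a-f].*|l *\1|p'
}
```

Each printed line can be typed at the ``(gdb)`` prompt after loading ``vmlinux`` or the relevant module object, as in the examples above.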
|
||||||
|
|
||||||
|
objdump
|
||||||
|
^^^^^^^
|
||||||
|
|
||||||
|
To debug a kernel, use objdump and look for the hex offset from the crash
|
||||||
|
output to find the valid line of code/assembler. Without debug symbols, you
|
||||||
|
will see the assembler code for the routine shown, but if your kernel has
|
||||||
|
debug symbols the C code will also be available. (Debug symbols can be enabled
|
||||||
|
in the kernel hacking menu of the menu configuration.) For example::
|
||||||
|
|
||||||
|
$ objdump -r -S -l --disassemble net/dccp/ipv4.o
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
You need to be at the top level of the kernel tree for this to pick up
|
||||||
|
your C files.
|
||||||
|
|
||||||
|
If you don't have access to the code you can also debug on some crash dumps
|
||||||
|
e.g. crash dump output as shown by Dave Miller::
|
||||||
|
|
||||||
|
EIP is at +0x14/0x4c0
|
||||||
|
...
|
||||||
|
Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
|
||||||
|
00 00 55 57 56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
|
||||||
|
<8b> 83 3c 01 00 00 89 44 24 14 8b 45 28 85 c0 89 44 24 18 0f 85
|
||||||
|
|
||||||
|
Put the bytes into a "foo.s" file like this::
|
||||||
|
|
||||||
|
.text
|
||||||
|
.globl foo
|
||||||
|
foo:
|
||||||
|
.byte .... /* bytes from Code: part of OOPS dump */
|
||||||
|
|
||||||
|
Compile it with "gcc -c -o foo.o foo.s" then look at the output of
|
||||||
|
"objdump --disassemble foo.o".
|
||||||
|
|
||||||
|
Output::
|
||||||
|
|
||||||
|
ip_queue_xmit:
|
||||||
|
push %ebp
|
||||||
|
push %edi
|
||||||
|
push %esi
|
||||||
|
push %ebx
|
||||||
|
sub $0xbc, %esp
|
||||||
|
mov 0xd0(%esp), %ebp ! %ebp = arg0 (skb)
|
||||||
|
mov 0x8(%ebp), %ebx ! %ebx = skb->sk
|
||||||
|
mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
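Typing the ``.byte`` line by hand is error-prone, so the conversion from the ``Code:`` line to "foo.s" can be scripted. The function below is a minimal sketch; ``code_to_asm`` is a hypothetical name, and it simply drops the ``<`` ``>`` markers around the faulting byte, so that position has to be noted separately. The kernel tree also carries ``scripts/decodecode``, which handles this task more thoroughly.

```shell
# code_to_asm "HEX BYTES": print an assembler file containing the bytes
# from the "Code:" part of an Oops. Hypothetical helper; expects
# lower-case hex pairs separated by spaces, with the faulting byte
# optionally wrapped in <>.
code_to_asm() {
    bytes=$(printf '%s\n' "$1" | tr -d '<>' | sed \
        -e 's/  */ /g' \
        -e 's/^ //' \
        -e 's/ $//' \
        -e 's/[0-9a-f][0-9a-f]/0x&/g' \
        -e 's/ /, /g')
    printf '.text\n.globl foo\nfoo:\n.byte %s\n' "$bytes"
}
```

The output can be redirected to "foo.s" and then assembled and disassembled with the ``gcc`` and ``objdump`` commands given above.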
|
||||||
|
|
||||||
|
Reporting the bug
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
Once you find where the bug happened, by inspecting its location,
|
||||||
|
you could either try to fix it yourself or report it upstream.
|
||||||
|
|
||||||
|
In order to report it upstream, you should identify the mailing list
|
||||||
|
used for the development of the affected code. This can be done by using
|
||||||
|
the ``get_maintainer.pl`` script.
|
||||||
|
|
||||||
|
For example, if you find a bug at the gspca's sonixj.c file, you can get
|
||||||
|
its maintainers with::
|
||||||
|
|
||||||
|
$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
|
||||||
|
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
|
||||||
|
Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
|
||||||
|
Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
|
||||||
|
Bhaktipriya Shridhar <bhaktipriya96@gmail.com> (commit_signer:1/1=100%,authored:1/1=100%,added_lines:4/4=100%,removed_lines:9/9=100%)
|
||||||
|
linux-media@vger.kernel.org (open list:GSPCA USB WEBCAM DRIVER)
|
||||||
|
linux-kernel@vger.kernel.org (open list)
|
||||||
|
|
||||||
|
Please notice that it will point to:
|
||||||
|
|
||||||
|
- The last developers that touched the source code. In the above example,
|
||||||
|
Tejun and Bhaktipriya (in this specific case, neither was really involved in the
|
||||||
|
development of this file);
|
||||||
|
- The driver maintainer (Hans Verkuil);
|
||||||
|
- The subsystem maintainer (Mauro Carvalho Chehab);
|
||||||
|
- The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
|
||||||
|
- The Linux kernel mailing list (linux-kernel@vger.kernel.org).
|
||||||
|
|
||||||
|
Usually, the fastest way to have your bug fixed is to report it to the mailing
|
||||||
|
list used for the development of the code (linux-media ML), copying the driver maintainer (Hans).
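
When the output is long, it can help to separate people from mailing lists mechanically. The following is a minimal sketch that assumes the output looks like the sample shown above (``Name <address> (roles)`` for people, ``address (roles)`` for lists); the function name ``parse_get_maintainer`` is made up for this illustration and is not part of the script:

```python
import re

# Either "Name <address> (roles)" or "address (roles)".
_LINE_RE = re.compile(
    r"^(?:(?P<name>.*?)\s*<(?P<email>[^>]+)>|(?P<list>\S+@\S+))"
    r"\s*\((?P<roles>.*)\)\s*$"
)


def parse_get_maintainer(output):
    """Split get_maintainer.pl style output into people and mailing lists.

    Returns (people, lists): people is a list of (name, email, roles)
    tuples, lists is a list of (address, roles) tuples. Lines that do
    not match the assumed format are ignored.
    """
    people, lists = [], []
    for line in output.splitlines():
        match = _LINE_RE.match(line.strip())
        if match is None:
            continue
        roles = match.group("roles")
        if match.group("list"):
            lists.append((match.group("list"), roles))
        else:
            people.append((match.group("name"), match.group("email"), roles))
    return people, lists
```

The roles string is kept verbatim, so entries such as ``maintainer:...`` or ``odd fixer:...`` can still be told apart from plain ``commit_signer`` entries when deciding whom to copy.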
|
||||||
|
|
||||||
|
If you are totally stumped as to whom to send the report, and
|
||||||
|
``get_maintainer.pl`` didn't provide you anything useful, send it to
|
||||||
|
linux-kernel@vger.kernel.org.
|
||||||
|
|
||||||
|
Thanks for your help in making Linux as stable as humanly possible.
|
||||||
|
|
||||||
|
Fixing the bug
|
||||||
|
--------------
|
||||||
|
|
||||||
|
If you know programming, you could help us by not only reporting the bug,
|
||||||
|
but also providing us with a solution. After all, open source is about
|
||||||
|
sharing what you do, and don't you want to be recognised for your genius?
|
||||||
|
|
||||||
|
If you decide to go this way, once you have worked out a fix, please submit
|
||||||
|
it upstream.
|
||||||
|
|
||||||
|
Please do read
|
||||||
|
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` though
|
||||||
|
to help your code get accepted.
|
||||||
|
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
Notes on Oops tracing with ``klogd``
|
||||||
|
------------------------------------
|
||||||
|
|
||||||
|
In order to help Linus and the other kernel developers there has been
|
||||||
|
substantial support incorporated into ``klogd`` for processing protection
|
||||||
|
faults. In order to have full support for address resolution at least
|
||||||
|
version 1.3-pl3 of the ``sysklogd`` package should be used.
|
||||||
|
|
||||||
|
When a protection fault occurs the ``klogd`` daemon automatically
|
||||||
|
translates important addresses in the kernel log messages to their
|
||||||
|
symbolic equivalents. This translated kernel message is then
|
||||||
|
forwarded through whatever reporting mechanism ``klogd`` is using. The
|
||||||
|
protection fault message can be simply cut out of the message files
|
||||||
|
and forwarded to the kernel developers.
|
||||||
|
|
||||||
|
Two types of address resolution are performed by ``klogd``. The first is
|
||||||
|
static translation and the second is dynamic translation. Static
|
||||||
|
translation uses the System.map file in much the same manner that
|
||||||
|
ksymoops does. In order to do static translation the ``klogd`` daemon
|
||||||
|
must be able to find a system map file at daemon initialization time.
|
||||||
|
See the ``klogd`` man page for information on how ``klogd`` searches for map
|
||||||
|
files.
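
Static translation boils down to finding the last symbol whose address is not greater than the address being resolved. The following is a minimal sketch of that lookup, not the actual ``klogd`` code; it assumes the usual three-column ``System.map`` layout (address, type, name), and the function names are made up for this illustration:

```python
import bisect


def load_system_map(text):
    """Parse System.map text into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "address type name"
        address, _type, name = parts
        symbols.append((int(address, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for address, or None if it is below all symbols."""
    addresses = [entry[0] for entry in symbols]
    # Index of the last symbol starting at or before the address.
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return "%s+0x%x" % (name, address - base)
```

This sketch has no notion of symbol sizes, so an address past the end of the last symbol is still attributed to that symbol.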
|
||||||
|
|
||||||
|
Dynamic address translation is important when kernel loadable modules
|
||||||
|
are being used. Since memory for kernel modules is allocated from the
|
||||||
|
kernel's dynamic memory pools there are no fixed locations for either
|
||||||
|
the start of the module or for functions and symbols in the module.
|
||||||
|
|
||||||
|
The kernel supports system calls which allow a program to determine
|
||||||
|
which modules are loaded and their location in memory. Using these
|
||||||
|
system calls the ``klogd`` daemon builds a symbol table which can be used
|
||||||
|
to debug a protection fault which occurs in a loadable kernel module.
|
||||||
|
|
||||||
|
At the very minimum ``klogd`` will provide the name of the module which
|
||||||
|
generated the protection fault. There may be additional symbolic
|
||||||
|
information available if the developer of the loadable module chose to
|
||||||
|
export symbol information from the module.
|
||||||
|
|
||||||
|
Since the kernel module environment can be dynamic there must be a
|
||||||
|
mechanism for notifying the ``klogd`` daemon when a change in module
|
||||||
|
environment occurs. There are command line options available which
|
||||||
|
allow ``klogd`` to signal the currently executing daemon that symbol
|
||||||
|
information should be refreshed. See the ``klogd`` manual page for more
|
||||||
|
information.
|
||||||
|
|
||||||
|
A patch is included with the ``sysklogd`` distribution which modifies the
|
||||||
|
``modules-2.0.0`` package to automatically signal ``klogd`` whenever a module
|
||||||
|
is loaded or unloaded. Applying this patch provides essentially
|
||||||
|
seamless support for debugging protection faults which occur with
|
||||||
|
kernel loadable modules.
|
||||||
|
|
||||||
|
The following is an example of a protection fault in a loadable module
|
||||||
|
processed by ``klogd``::
|
||||||
|
|
||||||
|
Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc
|
||||||
|
Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000
|
||||||
|
Aug 29 09:51:01 blizard kernel: *pde = 00000000
|
||||||
|
Aug 29 09:51:01 blizard kernel: Oops: 0002
|
||||||
|
Aug 29 09:51:01 blizard kernel: CPU: 0
|
||||||
|
Aug 29 09:51:01 blizard kernel: EIP: 0010:[oops:_oops+16/3868]
|
||||||
|
Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212
|
||||||
|
Aug 29 09:51:01 blizard kernel: eax: 315e97cc ebx: 003a6f80 ecx: 001be77b edx: 00237c0c
|
||||||
|
Aug 29 09:51:01 blizard kernel: esi: 00000000 edi: bffffdb3 ebp: 00589f90 esp: 00589f8c
|
||||||
|
Aug 29 09:51:01 blizard kernel: ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
|
||||||
|
Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000)
|
||||||
|
Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001
|
||||||
|
Aug 29 09:51:01 blizard kernel: 00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00
|
||||||
|
Aug 29 09:51:01 blizard kernel: bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036
|
||||||
|
Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128]
|
||||||
|
Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3
|
||||||
|
|
||||||
|
---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
Dr. G.W. Wettstein Oncology Research Div. Computing Facility
|
||||||
|
Roger Maris Cancer Center INTERNET: greg@wind.rmcc.com
|
||||||
|
820 4th St. N.
|
||||||
|
Fargo, ND 58122
|
||||||
|
Phone: 701-234-7556
|
||||||
10
Documentation/admin-guide/conf.py
Normal file
@@ -0,0 +1,10 @@
|
|||||||
|
# -*- coding: utf-8; mode: python -*-
|
||||||
|
|
||||||
|
project = 'Linux Kernel User Documentation'
|
||||||
|
|
||||||
|
tags.add("subproject")
|
||||||
|
|
||||||
|
latex_documents = [
|
||||||
|
('index', 'linux-user.tex', 'Linux Kernel User Documentation',
|
||||||
|
'The kernel development community', 'manual'),
|
||||||
|
]
|
||||||
268
Documentation/admin-guide/devices.rst
Normal file
@@ -0,0 +1,268 @@
|
|||||||
|
|
||||||
|
Linux allocated devices (4.x+ version)
|
||||||
|
======================================
|
||||||
|
|
||||||
|
This list is the Linux Device List, the official registry of allocated
|
||||||
|
device numbers and ``/dev`` directory nodes for the Linux operating
|
||||||
|
system.
|
||||||
|
|
||||||
|
The LaTeX version of this document is no longer maintained, nor is
|
||||||
|
the document that used to reside at lanana.org. This version in the
|
||||||
|
mainline Linux kernel is the master document. Updates shall be sent
|
||||||
|
as patches to the kernel maintainers (see the
|
||||||
|
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` document).
|
||||||
|
Specifically explore the sections titled "CHAR and MISC DRIVERS", and
|
||||||
|
"BLOCK LAYER" in the MAINTAINERS file to find the right maintainers
|
||||||
|
to involve for character and block devices.
|
||||||
|
|
||||||
|
This document is included by reference into the Filesystem Hierarchy
|
||||||
|
Standard (FHS). The FHS is available from http://www.pathname.com/fhs/.
|
||||||
|
|
||||||
|
Allocations marked (68k/Amiga) apply to Linux/68k on the Amiga
|
||||||
|
platform only. Allocations marked (68k/Atari) apply to Linux/68k on
|
||||||
|
the Atari platform only.
|
||||||
|
|
||||||
|
This document is in the public domain. The authors request, however,
|
||||||
|
that semantically altered versions are not distributed without
|
||||||
|
permission of the authors, assuming the authors can be contacted without
|
||||||
|
an unreasonable effort.
|
||||||
|
|
||||||
|
|
||||||
|
.. attention::
|
||||||
|
|
||||||
|
DEVICE DRIVERS AUTHORS PLEASE READ THIS
|
||||||
|
|
||||||
|
Linux now has extensive support for dynamic allocation of device numbering
|
||||||
|
and can use ``sysfs`` and ``udev`` (``systemd``) to handle the naming needs.
|
||||||
|
There are still some exceptions in the serial and boot device area. Before
|
||||||
|
asking for a device number make sure you actually need one.
|
||||||
|
|
||||||
|
To have a major number allocated, or a minor number in situations
|
||||||
|
where that applies (e.g. busmice), please submit a patch and send it to
|
||||||
|
the authors as indicated above.
|
||||||
|
|
||||||
|
Keep the description of the device *in the same format
|
||||||
|
as this list*. The reason for this is that it is the only way we have
|
||||||
|
found to ensure we have all the requisite information to publish your
|
||||||
|
device and avoid conflicts.
|
||||||
|
|
||||||
|
Finally, sometimes we have to play "namespace police." Please don't be
|
||||||
|
offended. We often get submissions for ``/dev`` names that would be bound
|
||||||
|
to cause conflicts down the road. We are trying to avoid getting in a
|
||||||
|
situation where we would have to suffer an incompatible forward
|
||||||
|
change. Therefore, please consult with us **before** you make your
|
||||||
|
device names and numbers in any way public, at least to the point
|
||||||
|
where it would be at all difficult to get them changed.
|
||||||
|
|
||||||
|
Your cooperation is appreciated.
|
||||||
|
|
||||||
|
.. include:: devices.txt
|
||||||
|
:literal:
|
||||||
|
|
||||||
|
Additional ``/dev/`` directory entries
|
||||||
|
--------------------------------------
|
||||||
|
|
||||||
|
This section details additional entries that should or may exist in
|
||||||
|
the ``/dev`` directory. It is preferred that symbolic links use the same
|
||||||
|
form (absolute or relative) as is indicated here. Links are
|
||||||
|
classified as "hard" or "symbolic" depending on the preferred type of
|
||||||
|
link; if possible, the indicated type of link should be used.
|
||||||
|
|
||||||
|
Compulsory links
|
||||||
|
++++++++++++++++
|
||||||
|
|
||||||
|
These links should exist on all systems:
|
||||||
|
|
||||||
|
=============== =============== =============== ===============================
|
||||||
|
/dev/fd /proc/self/fd symbolic File descriptors
|
||||||
|
/dev/stdin fd/0 symbolic stdin file descriptor
|
||||||
|
/dev/stdout fd/1 symbolic stdout file descriptor
|
||||||
|
/dev/stderr fd/2 symbolic stderr file descriptor
|
||||||
|
/dev/nfsd socksys symbolic Required by iBCS-2
|
||||||
|
/dev/X0R null symbolic Required by iBCS-2
|
||||||
|
=============== =============== =============== ===============================
|
||||||
|
|
||||||
|
Note: ``/dev/X0R`` is <letter X>-<digit 0>-<letter R>.
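
As a minimal sketch, the table above can be checked against a directory with a few lines of Python. The function name ``check_compulsory_links`` is made up for this illustration, and the comparison is deliberately literal: a link that reaches the same place through a different path (for example an absolute target instead of ``fd/0``) is reported as a difference, even though a given system may have good reasons for it.

```python
import os

# Link name under /dev -> expected symbolic link target, from the table above.
COMPULSORY_LINKS = {
    "fd": "/proc/self/fd",
    "stdin": "fd/0",
    "stdout": "fd/1",
    "stderr": "fd/2",
    "nfsd": "socksys",
    "X0R": "null",
}


def check_compulsory_links(dev_dir="/dev"):
    """Return a list of (name, problem) for links that differ from the table."""
    problems = []
    for name, target in COMPULSORY_LINKS.items():
        path = os.path.join(dev_dir, name)
        if not os.path.islink(path):
            problems.append((name, "not a symbolic link"))
            continue
        actual = os.readlink(path)
        if actual != target:
            problems.append((name, "points to " + actual))
    return problems
```

An empty list means every entry exists as a symbolic link with exactly the listed target.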
|
||||||
|
|
||||||
|
Recommended links
|
||||||
|
+++++++++++++++++
|
||||||
|
|
||||||
|
It is recommended that these links exist on all systems:
|
||||||
|
|
||||||
|
|
||||||
|
=============== =============== =============== ===============================
|
||||||
|
/dev/core /proc/kcore symbolic Backward compatibility
|
||||||
|
/dev/ramdisk ram0 symbolic Backward compatibility
|
||||||
|
/dev/ftape qft0 symbolic Backward compatibility
|
||||||
|
/dev/bttv0 video0 symbolic Backward compatibility
|
||||||
|
/dev/radio radio0 symbolic Backward compatibility
|
||||||
|
/dev/i2o* /dev/i2o/* symbolic Backward compatibility
|
||||||
|
/dev/scd? sr? hard Alternate SCSI CD-ROM name
|
||||||
|
=============== =============== =============== ===============================
|
||||||
|
|
||||||
|
Locally defined links
|
||||||
|
+++++++++++++++++++++
|
||||||
|
|
||||||
|
The following links may be established locally to conform to the
|
||||||
|
configuration of the system. This is merely a tabulation of existing
|
||||||
|
practice, and does not constitute a recommendation. However, if they
|
||||||
|
exist, they should have the following uses.
|
||||||
|
|
||||||
|
=============== =============== =============== ===============================
|
||||||
|
/dev/mouse mouse port symbolic Current mouse device
|
||||||
|
/dev/tape tape device symbolic Current tape device
|
||||||
|
/dev/cdrom CD-ROM device symbolic Current CD-ROM device
|
||||||
|
/dev/cdwriter CD-writer symbolic Current CD-writer device
|
||||||
|
/dev/scanner scanner symbolic Current scanner device
|
||||||
|
/dev/modem modem port symbolic Current dialout device
|
||||||
|
/dev/root root device symbolic Current root filesystem
|
||||||
|
/dev/swap swap device symbolic Current swap device
|
||||||
|
=============== =============== =============== ===============================
|
||||||
|
|
||||||
|
``/dev/modem`` should not be used for a modem which supports dialin as
|
||||||
|
well as dialout, as it tends to cause lock file problems. If it
|
||||||
|
exists, ``/dev/modem`` should point to the appropriate primary TTY device
|
||||||
|
(the use of the alternate callout devices is deprecated).
|
||||||
|
|
||||||
|
For SCSI devices, ``/dev/tape`` and ``/dev/cdrom`` should point to the
|
||||||
|
*cooked* devices (``/dev/st*`` and ``/dev/sr*``, respectively), whereas
|
||||||
|
``/dev/cdwriter`` and ``/dev/scanner`` should point to the appropriate generic
|
||||||
|
SCSI devices (``/dev/sg*``).
|
||||||
|
|
||||||
|
``/dev/mouse`` may point to a primary serial TTY device, a hardware mouse
|
||||||
|
device, or a socket for a mouse driver program (e.g. ``/dev/gpmdata``).
|
||||||
|
|
||||||
|
Sockets and pipes
|
||||||
|
+++++++++++++++++
|
||||||
|
|
||||||
|
Non-transient sockets and named pipes may exist in ``/dev``. Common entries are:
|
||||||
|
|
||||||
|
=============== =============== ===============================================
|
||||||
|
/dev/printer socket lpd local socket
|
||||||
|
/dev/log socket syslog local socket
|
||||||
|
/dev/gpmdata socket gpm mouse multiplexer
|
||||||
|
=============== =============== ===============================================
|
||||||
|
|
||||||
|
Mount points
|
||||||
|
++++++++++++
|
||||||
|
|
||||||
|
The following names are reserved for mounting special filesystems
|
||||||
|
under ``/dev``. These special filesystems provide kernel interfaces that
|
||||||
|
cannot be provided with standard device nodes.
|
||||||
|
|
||||||
|
=============== =============== ===============================================
|
||||||
|
/dev/pts devpts PTY slave filesystem
|
||||||
|
/dev/shm tmpfs POSIX shared memory maintenance access
|
||||||
|
=============== =============== ===============================================
|
||||||
|
|
||||||
|
Terminal devices
|
||||||
|
----------------
|
||||||
|
|
||||||
|
Terminal, or TTY, devices are a special class of character devices. A
|
||||||
|
terminal device is any device that could act as a controlling terminal
|
||||||
|
for a session; this includes virtual consoles, serial ports, and
|
||||||
|
pseudoterminals (PTYs).
|
||||||
|
|
||||||
|
All terminal devices share a common set of capabilities known as line
|
||||||
|
disciplines; these include the common terminal line discipline as well
|
||||||
|
as SLIP and PPP modes.
|
||||||
|
|
||||||
|
All terminal devices are named similarly; this section explains the
|
||||||
|
naming and use of the various types of TTYs. Note that the naming
|
||||||
|
conventions include several historical warts; some of these are
|
||||||
|
Linux-specific, some were inherited from other systems, and some
|
||||||
|
reflect Linux outgrowing a borrowed convention.
|
||||||
|
|
||||||
|
A hash mark (``#``) in a device name is used here to indicate a decimal
|
||||||
|
number without leading zeroes.
|
||||||
|
|
||||||
|
Virtual consoles and the console device
|
||||||
|
+++++++++++++++++++++++++++++++++++++++
|
||||||
|
|
||||||
|
Virtual consoles are full-screen terminal displays on the system video
|
||||||
|
monitor. Virtual consoles are named ``/dev/tty#``, with numbering
|
||||||
|
starting at ``/dev/tty1``; ``/dev/tty0`` is the current virtual console.
|
||||||
|
``/dev/tty0`` is the device that should be used to access the system video
|
||||||
|
card on those architectures for which the frame buffer devices
|
||||||
|
(``/dev/fb*``) are not applicable. Do not use ``/dev/console``
|
||||||
|
for this purpose.
|
||||||
|
|
||||||
|
The console device, ``/dev/console``, is the device to which system
|
||||||
|
messages should be sent, and on which logins should be permitted in
|
||||||
|
single-user mode. Starting with Linux 2.1.71, ``/dev/console`` is managed
|
||||||
|
by the kernel; for previous versions it should be a symbolic link to
|
||||||
|
either ``/dev/tty0``, a specific virtual console such as ``/dev/tty1``, or to
|
||||||
|
a serial port primary (``tty*``, not ``cu*``) device, depending on the
|
||||||
|
configuration of the system.
|
||||||
|
|
||||||
|
Serial ports
|
||||||
|
++++++++++++
|
||||||
|
|
||||||
|
Serial ports are RS-232 serial ports and any device which simulates
|
||||||
|
one, either in hardware (such as internal modems) or in software (such
|
||||||
|
as the ISDN driver). Under Linux, each serial port has two device
|
||||||
|
names, the primary or callin device and the alternate or callout one.
|
||||||
|
Each kind of device is indicated by a different letter. For any
|
||||||
|
letter X, the names of the devices are ``/dev/ttyX#`` and ``/dev/cux#``,
|
||||||
|
respectively; for historical reasons, ``/dev/ttyS#`` and ``/dev/ttyC#``
|
||||||
|
correspond to ``/dev/cua#`` and ``/dev/cub#``. In the future, it should be
|
||||||
|
expected that multiple letters will be used; all letters will be upper
|
||||||
|
case for the "tty" device (e.g. ``/dev/ttyDP#``) and lower case for the
|
||||||
|
"cu" device (e.g. ``/dev/cudp#``).
|
||||||
|
|
||||||
|
The names ``/dev/ttyQ#`` and ``/dev/cuq#`` are reserved for local use.
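
The naming rule above, including the two historical exceptions, can be written down as a small mapping from the primary name to the alternate name. This is a minimal sketch for illustration only; the function name ``alternate_device`` is made up here:

```python
import re

# Historical exceptions: ttyS# pairs with cua#, ttyC# pairs with cub#.
_HISTORICAL = {"S": "a", "C": "b"}


def alternate_device(primary):
    """Map a primary name such as 'ttyS2' to its alternate name ('cua2').

    Returns None for names without a letter part, such as the virtual
    console 'tty1', which has no alternate device.
    """
    match = re.fullmatch(r"tty([A-Z]+)(\d+)", primary)
    if match is None:
        return None
    letters, number = match.groups()
    # Upper case letters for "tty", lower case for "cu".
    suffix = _HISTORICAL.get(letters, letters.lower())
    return "cu" + suffix + number
```

For example, ``alternate_device("ttyDP3")`` gives ``cudp3`` under the general rule, while ``alternate_device("ttyS2")`` gives ``cua2`` because of the historical exception.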
|
||||||
|
|
||||||
|
The alternate devices provide for kernel-based exclusion and somewhat
|
||||||
|
different defaults than the primary devices. Their main purpose is to
|
||||||
|
allow the use of serial ports with programs with no inherent or broken
|
||||||
|
support for serial ports. Their use is deprecated, and they may be
|
||||||
|
removed from a future version of Linux.
|
||||||
|
|
||||||
|
Arbitration of serial ports is provided by the use of lock files with
|
||||||
|
the names ``/var/lock/LCK..ttyX#``. The contents of the lock file should
|
||||||
|
be the PID of the locking process as an ASCII number.
|
||||||
|
|
||||||
|
It is common practice to install links such as ``/dev/modem``
|
||||||
|
which point to serial ports. In order to ensure proper locking in the
|
||||||
|
presence of these links, it is recommended that software chase
|
||||||
|
symlinks and lock all possible names; additionally, it is recommended
|
||||||
|
that a lock file be installed with the corresponding alternate
|
||||||
|
device. In order to avoid deadlocks, it is recommended that the locks
|
||||||
|
be acquired in the following order, and released in the reverse:
|
||||||
|
|
||||||
|
1. The symbolic link name, if any (``/var/lock/LCK..modem``)
|
||||||
|
2. The "tty" name (``/var/lock/LCK..ttyS2``)
|
||||||
|
3. The alternate device name (``/var/lock/LCK..cua2``)
|
||||||
|
|
||||||
|
In the case of nested symbolic links, the lock files should be
|
||||||
|
installed in the order the symlinks are resolved.
|
||||||
|
|
||||||
|
Under no circumstances should an application hold a lock while waiting
|
||||||
|
for another to be released. In addition, applications which attempt
|
||||||
|
to create lock files for the corresponding alternate device names
|
||||||
|
should take into account the possibility of being used on a non-serial
|
||||||
|
port TTY, for which no alternate device would exist.
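
The locking rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are made up here, the PID is written as plain decimal digits followed by a newline, and stale lock files left behind by dead processes are not handled. If any lock in the sequence is already taken, the locks acquired so far are released and the caller is expected to retry later, so that no lock is held while waiting for another.

```python
import os


def lock_path(device_name, lock_dir="/var/lock"):
    """Return the lock file path for a name such as 'ttyS2' or 'modem'."""
    return os.path.join(lock_dir, "LCK.." + device_name)


def release_locks(held):
    """Remove lock files in the reverse of the order they were acquired."""
    for path in reversed(held):
        os.unlink(path)


def acquire_locks(device_names, lock_dir="/var/lock"):
    """Create the lock files in the given order.

    Returns the list of created paths, or None if any lock was taken.
    """
    held = []
    for name in device_names:
        path = lock_path(name, lock_dir)
        try:
            # O_EXCL makes the creation fail if the file already exists.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            # Never wait while holding locks: back out completely.
            release_locks(held)
            return None
        with os.fdopen(fd, "w") as lock_file:
            lock_file.write("%d\n" % os.getpid())
        held.append(path)
    return held
```

For the example in the list above, the call would be ``acquire_locks(["modem", "ttyS2", "cua2"])``, followed by ``release_locks()`` on the returned list once the port is no longer needed.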
|
||||||
|
|
||||||
|
Pseudoterminals (PTYs)
|
||||||
|
++++++++++++++++++++++
|
||||||
|
|
||||||
|
Pseudoterminals, or PTYs, are used to create login sessions or provide
|
||||||
|
other capabilities requiring a TTY line discipline (including SLIP or
|
||||||
|
PPP capability) to arbitrary data-generation processes. Each PTY has
|
||||||
|
a master side, named ``/dev/pty[p-za-e][0-9a-f]``, and a slave side, named
|
||||||
|
``/dev/tty[p-za-e][0-9a-f]``. The kernel arbitrates the use of PTYs by
|
||||||
|
allowing each master side to be opened only once.
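
The two patterns above describe a fixed, finite namespace: sixteen letters (``p`` through ``z``, then ``a`` through ``e``) times sixteen hexadecimal digits. A minimal sketch that enumerates the master/slave name pairs, with a function name made up for this illustration:

```python
def bsd_pty_pairs():
    """Return (master, slave) device name pairs in pattern order."""
    first = "pqrstuvwxyzabcde"   # [p-za-e]
    second = "0123456789abcdef"  # [0-9a-f]
    return [
        ("/dev/pty" + a + b, "/dev/tty" + a + b)
        for a in first
        for b in second
    ]
```

The list has 256 entries, which is the upper bound on the number of PTYs this naming scheme can express.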
|
||||||
|
|
||||||
|
Once the master side has been opened, the corresponding slave device
|
||||||
|
can be used in the same manner as any TTY device. The master and
|
||||||
|
slave devices are connected by the kernel, generating the equivalent
|
||||||
|
of a bidirectional pipe with TTY capabilities.
|
||||||
|
|
||||||
|
Recent versions of the Linux kernel and GNU libc contain support for
|
||||||
|
the System V/Unix98 naming scheme for PTYs, which assigns a common
|
||||||
|
device, ``/dev/ptmx``, to all the masters (opening it will automatically
|
||||||
|
give you a previously unassigned PTY) and a subdirectory, ``/dev/pts``,
|
||||||
|
for the slaves; the slaves are named with decimal integers (``/dev/pts/#``
|
||||||
|
in our notation). This removes the problem of exhausting the
|
||||||
|
namespace and enables the kernel to automatically create the device
|
||||||
|
nodes for the slaves on demand using the "devpts" filesystem.
|
||||||
@@ -1,58 +1,3 @@
|
|||||||
|
|
||||||
LINUX ALLOCATED DEVICES (4.x+ version)
|
|
||||||
|
|
||||||
This list is the Linux Device List, the official registry of allocated
|
|
||||||
device numbers and /dev directory nodes for the Linux operating
|
|
||||||
system.
|
|
||||||
|
|
||||||
The LaTeX version of this document is no longer maintained, nor is
|
|
||||||
the document that used to reside at lanana.org. This version in the
|
|
||||||
mainline Linux kernel is the master document. Updates shall be sent
|
|
||||||
as patches to the kernel maintainers (see the SubmittingPatches document).
|
|
||||||
Specifically explore the sections titled "CHAR and MISC DRIVERS", and
|
|
||||||
"BLOCK LAYER" in the MAINTAINERS file to find the right maintainers
|
|
||||||
to involve for character and block devices.
|
|
||||||
|
|
||||||
This document is included by reference into the Filesystem Hierarchy
|
|
||||||
Standard (FHS). The FHS is available from http://www.pathname.com/fhs/.
|
|
||||||
|
|
||||||
Allocations marked (68k/Amiga) apply to Linux/68k on the Amiga
|
|
||||||
platform only. Allocations marked (68k/Atari) apply to Linux/68k on
|
|
||||||
the Atari platform only.
|
|
||||||
|
|
||||||
This document is in the public domain. The authors requests, however,
|
|
||||||
that semantically altered versions are not distributed without
|
|
||||||
permission of the authors, assuming the authors can be contacted without
|
|
||||||
an unreasonable effort.
|
|
||||||
|
|
||||||
|
|
||||||
**** DEVICE DRIVERS AUTHORS PLEASE READ THIS ****
|
|
||||||
|
|
||||||
Linux now has extensive support for dynamic allocation of device numbering
|
|
||||||
and can use sysfs and udev (systemd) to handle the naming needs. There are
|
|
||||||
still some exceptions in the serial and boot device area. Before asking
|
|
||||||
for a device number make sure you actually need one.
|
|
||||||
|
|
||||||
To have a major number allocated, or a minor number in situations
|
|
||||||
where that applies (e.g. busmice), please submit a patch and send to
|
|
||||||
the authors as indicated above.
|
|
||||||
|
|
||||||
Keep the description of the device *in the same format
|
|
||||||
as this list*. The reason for this is that it is the only way we have
|
|
||||||
found to ensure we have all the requisite information to publish your
|
|
||||||
device and avoid conflicts.
|
|
||||||
|
|
||||||
Finally, sometimes we have to play "namespace police." Please don't be
|
|
||||||
offended. We often get submissions for /dev names that would be bound
|
|
||||||
to cause conflicts down the road. We are trying to avoid getting in a
|
|
||||||
situation where we would have to suffer an incompatible forward
|
|
||||||
change. Therefore, please consult with us *before* you make your
|
|
||||||
device names and numbers in any way public, at least to the point
|
|
||||||
where it would be at all difficult to get them changed.
|
|
||||||
|
|
||||||
Your cooperation is appreciated.
|
|
||||||
|
|
||||||
|
|
||||||
0 Unnamed devices (e.g. non-device mounts)
|
0 Unnamed devices (e.g. non-device mounts)
|
||||||
0 = reserved as null device number
|
0 = reserved as null device number
|
||||||
See block major 144, 145, 146 for expansion areas.
|
See block major 144, 145, 146 for expansion areas.
|
||||||
@@ -3134,192 +3079,3 @@ Your cooperation is appreciated.
|
|||||||
1 = /dev/osd1 Second OSD Device
|
1 = /dev/osd1 Second OSD Device
|
||||||
...
|
...
|
||||||
255 = /dev/osd255 256th OSD Device
|
255 = /dev/osd255 256th OSD Device
|
||||||
|
|
||||||
**** ADDITIONAL /dev DIRECTORY ENTRIES
|
|
||||||
|
|
||||||
This section details additional entries that should or may exist in
|
|
||||||
the /dev directory. It is preferred that symbolic links use the same
|
|
||||||
form (absolute or relative) as is indicated here. Links are
|
|
||||||
classified as "hard" or "symbolic" depending on the preferred type of
|
|
||||||
link; if possible, the indicated type of link should be used.
|
|
||||||
|
|
||||||
|
|
||||||
Compulsory links
|
|
||||||
|
|
||||||
These links should exist on all systems:
|
|
||||||
|
|
||||||
/dev/fd /proc/self/fd symbolic File descriptors
|
|
||||||
/dev/stdin fd/0 symbolic stdin file descriptor
|
|
||||||
/dev/stdout fd/1 symbolic stdout file descriptor
|
|
||||||
/dev/stderr fd/2 symbolic stderr file descriptor
|
|
||||||
/dev/nfsd socksys symbolic Required by iBCS-2
|
|
||||||
/dev/X0R null symbolic Required by iBCS-2
|
|
||||||
|
|
||||||
Note: /dev/X0R is <letter X>-<digit 0>-<letter R>.
|
|
||||||
|
|
||||||
Recommended links
|
|
||||||
|
|
||||||
It is recommended that these links exist on all systems:
|
|
||||||
|
|
||||||
/dev/core /proc/kcore symbolic Backward compatibility
|
|
||||||
/dev/ramdisk ram0 symbolic Backward compatibility
|
|
||||||
/dev/ftape qft0 symbolic Backward compatibility
|
|
||||||
/dev/bttv0 video0 symbolic Backward compatibility
|
|
||||||
/dev/radio radio0 symbolic Backward compatibility
|
|
||||||
/dev/i2o* /dev/i2o/* symbolic Backward compatibility
|
|
||||||
/dev/scd? sr? hard Alternate SCSI CD-ROM name
|
|
||||||
|
|
||||||
Locally defined links
|
|
||||||
|
|
||||||
The following links may be established locally to conform to the
|
|
||||||
configuration of the system. This is merely a tabulation of existing
|
|
||||||
practice, and does not constitute a recommendation. However, if they
|
|
||||||
exist, they should have the following uses.
|
|
||||||
|
|
||||||
/dev/mouse mouse port symbolic Current mouse device
|
|
||||||
/dev/tape tape device symbolic Current tape device
|
|
||||||
/dev/cdrom CD-ROM device symbolic Current CD-ROM device
|
|
||||||
/dev/cdwriter CD-writer symbolic Current CD-writer device
|
|
||||||
/dev/scanner scanner symbolic Current scanner device
|
|
||||||
/dev/modem modem port symbolic Current dialout device
|
|
||||||
/dev/root root device symbolic Current root filesystem
|
|
||||||
/dev/swap swap device symbolic Current swap device
|
|
||||||
|
|
||||||
/dev/modem should not be used for a modem which supports dialin as
|
|
||||||
well as dialout, as it tends to cause lock file problems. If it
|
|
||||||
exists, /dev/modem should point to the appropriate primary TTY device
|
|
||||||
(the use of the alternate callout devices is deprecated).
|
|
||||||
|
|
||||||
For SCSI devices, /dev/tape and /dev/cdrom should point to the
|
|
||||||
``cooked'' devices (/dev/st* and /dev/sr*, respectively), whereas
|
|
||||||
/dev/cdwriter and /dev/scanner should point to the appropriate generic
|
|
||||||
SCSI devices (/dev/sg*).
|
|
||||||
|
|
||||||
/dev/mouse may point to a primary serial TTY device, a hardware mouse
|
|
||||||
device, or a socket for a mouse driver program (e.g. /dev/gpmdata).
|
|
||||||
|
|
||||||
Sockets and pipes
|
|
||||||
|
|
||||||
Non-transient sockets and named pipes may exist in /dev. Common entries are:
|
|
||||||
|
|
||||||
/dev/printer socket lpd local socket
|
|
||||||
/dev/log socket syslog local socket
|
|
||||||
/dev/gpmdata socket gpm mouse multiplexer
|
|
||||||
|
|
||||||
Mount points
|
|
||||||
|
|
||||||
The following names are reserved for mounting special filesystems
|
|
||||||
under /dev. These special filesystems provide kernel interfaces that
|
|
||||||
cannot be provided with standard device nodes.
|
|
||||||
|
|
||||||
/dev/pts devpts PTY slave filesystem
|
|
||||||
/dev/shm tmpfs POSIX shared memory maintenance access
|
|
||||||
|
|
||||||
**** TERMINAL DEVICES
|
|
||||||
|
|
||||||
Terminal, or TTY devices are a special class of character devices. A
|
|
||||||
terminal device is any device that could act as a controlling terminal
|
|
||||||
for a session; this includes virtual consoles, serial ports, and
|
|
||||||
pseudoterminals (PTYs).
|
|
||||||
|
|
||||||
All terminal devices share a common set of capabilities known as line
|
|
||||||
disciplines; these include the common terminal line discipline as well
|
|
||||||
as SLIP and PPP modes.
|
|
||||||
|
|
||||||
All terminal devices are named similarly; this section explains the
|
|
||||||
naming and use of the various types of TTYs. Note that the naming
|
|
||||||
conventions include several historical warts; some of these are
|
|
||||||
Linux-specific, some were inherited from other systems, and some
|
|
||||||
reflect Linux outgrowing a borrowed convention.
|
|
||||||
|
|
||||||
A hash mark (#) in a device name is used here to indicate a decimal
|
|
||||||
number without leading zeroes.
|
|
||||||
|
|
||||||
Virtual consoles and the console device
|
|
||||||
|
|
||||||
Virtual consoles are full-screen terminal displays on the system video
|
|
||||||
monitor. Virtual consoles are named /dev/tty#, with numbering
|
|
||||||
starting at /dev/tty1; /dev/tty0 is the current virtual console.
|
|
||||||
/dev/tty0 is the device that should be used to access the system video
|
|
||||||
card on those architectures for which the frame buffer devices
|
|
||||||
(/dev/fb*) are not applicable. Do not use /dev/console
|
|
||||||
for this purpose.
|
|
||||||
|
|
||||||
The console device, /dev/console, is the device to which system
|
|
||||||
messages should be sent, and on which logins should be permitted in
|
|
||||||
single-user mode. Starting with Linux 2.1.71, /dev/console is managed
|
|
||||||
by the kernel; for previous versions it should be a symbolic link to
|
|
||||||
either /dev/tty0, a specific virtual console such as /dev/tty1, or to
|
|
||||||
a serial port primary (tty*, not cu*) device, depending on the
|
|
||||||
configuration of the system.
|
|
||||||
|
|
||||||
Serial ports
|
|
||||||
|
|
||||||
Serial ports are RS-232 serial ports and any device which simulates
|
|
||||||
one, either in hardware (such as internal modems) or in software (such
|
|
||||||
as the ISDN driver). Under Linux, each serial port has two device
|
|
||||||
names, the primary or callin device and the alternate or callout one.
|
|
||||||
Each kind of device is indicated by a different letter. For any
|
|
||||||
letter X, the names of the devices are /dev/ttyX# and /dev/cux#,
|
|
||||||
respectively; for historical reasons, /dev/ttyS# and /dev/ttyC#
|
|
||||||
correspond to /dev/cua# and /dev/cub#. In the future, it should be
|
|
||||||
expected that multiple letters will be used; all letters will be upper
|
|
||||||
case for the "tty" device (e.g. /dev/ttyDP#) and lower case for the
|
|
||||||
"cu" device (e.g. /dev/cudp#).
|
|
||||||
|
|
||||||
The names /dev/ttyQ# and /dev/cuq# are reserved for local use.
|
|
||||||
|
|
||||||
The alternate devices provide for kernel-based exclusion and somewhat
|
|
||||||
different defaults than the primary devices. Their main purpose is to
|
|
||||||
allow the use of serial ports with programs with no inherent or broken
|
|
||||||
support for serial ports. Their use is deprecated, and they may be
|
|
||||||
removed from a future version of Linux.
|
|
||||||
|
|
||||||
Arbitration of serial ports is provided by the use of lock files with
|
|
||||||
the names /var/lock/LCK..ttyX#. The contents of the lock file should
|
|
||||||
be the PID of the locking process as an ASCII number.
|
|
||||||
|
|
||||||
It is common practice to install links such as /dev/modem
|
|
||||||
which point to serial ports. In order to ensure proper locking in the
|
|
||||||
presence of these links, it is recommended that software chase
|
|
||||||
symlinks and lock all possible names; additionally, it is recommended
|
|
||||||
that a lock file be installed with the corresponding alternate
|
|
||||||
device. In order to avoid deadlocks, it is recommended that the locks
|
|
||||||
be acquired in the following order, and released in the reverse:
|
|
||||||
|
|
||||||
1. The symbolic link name, if any (/var/lock/LCK..modem)
|
|
||||||
2. The "tty" name (/var/lock/LCK..ttyS2)
|
|
||||||
3. The alternate device name (/var/lock/LCK..cua2)
|
|
||||||
|
|
||||||
In the case of nested symbolic links, the lock files should be
|
|
||||||
installed in the order the symlinks are resolved.
|
|
||||||
|
|
||||||
Under no circumstances should an application hold a lock while waiting
|
|
||||||
for another to be released. In addition, applications which attempt
|
|
||||||
to create lock files for the corresponding alternate device names
|
|
||||||
should take into account the possibility of being used on a non-serial
|
|
||||||
port TTY, for which no alternate device would exist.
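The locking order described above can be sketched in shell as follows.
This is only an illustration under stated assumptions: the helper names
(acquire_lock, release_lock, lock_all, unlock_all) and the LOCK_DIR
variable are hypothetical and do not belong to any existing tool, the
caller is assumed to have resolved symlinks already and to pass the names
in the documented order (link name, "tty" name, alternate name), and
stale-lock detection is left out.

```shell
# Hypothetical helpers; LOCK_DIR can be overridden, e.g. for testing.
LOCK_DIR="${LOCK_DIR:-/var/lock}"

# acquire_lock NAME: create LCK..NAME containing our PID as an ASCII number.
# noclobber makes the redirection fail if the lock file already exists.
acquire_lock() {
    ( set -o noclobber; echo "$$" > "$LOCK_DIR/LCK..$1" ) 2>/dev/null
}

# release_lock NAME: remove the lock file.
release_lock() {
    rm -f "$LOCK_DIR/LCK..$1"
}

# lock_all NAME...: take the locks in the order given. If one is already
# held by someone else, release everything taken so far (most recent first)
# and fail rather than wait while holding locks.
lock_all() {
    held=""
    for name in "$@"; do
        if acquire_lock "$name"; then
            held="$name $held"
        else
            for h in $held; do release_lock "$h"; done
            return 1
        fi
    done
    return 0
}

# unlock_all NAME...: release in the reverse of the order given.
unlock_all() {
    rev=""
    for name in "$@"; do rev="$name $rev"; done
    for name in $rev; do release_lock "$name"; done
}
```

With these helpers, "lock_all modem ttyS2 cua2" would create the three
lock files from the list above in order, and "unlock_all modem ttyS2 cua2"
would remove them in reverse. An application running on a non-serial TTY
would simply omit the alternate device name.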
|
|
||||||
|
|
||||||
Pseudoterminals (PTYs)
|
|
||||||
|
|
||||||
Pseudoterminals, or PTYs, are used to create login sessions or provide
|
|
||||||
other capabilities requiring a TTY line discipline (including SLIP or
|
|
||||||
PPP capability) to arbitrary data-generation processes. Each PTY has
|
|
||||||
a master side, named /dev/pty[p-za-e][0-9a-f], and a slave side, named
|
|
||||||
/dev/tty[p-za-e][0-9a-f]. The kernel arbitrates the use of PTYs by
|
|
||||||
allowing each master side to be opened only once.
|
|
||||||
|
|
||||||
Once the master side has been opened, the corresponding slave device
|
|
||||||
can be used in the same manner as any TTY device. The master and
|
|
||||||
slave devices are connected by the kernel, generating the equivalent
|
|
||||||
of a bidirectional pipe with TTY capabilities.
|
|
||||||
|
|
||||||
Recent versions of the Linux kernels and GNU libc contain support for
|
|
||||||
the System V/Unix98 naming scheme for PTYs, which assigns a common
|
|
||||||
device, /dev/ptmx, to all the masters (opening it will automatically
|
|
||||||
give you a previously unassigned PTY) and a subdirectory, /dev/pts,
|
|
||||||
for the slaves; the slaves are named with decimal integers (/dev/pts/#
|
|
||||||
in our notation). This removes the problem of exhausting the
|
|
||||||
namespace and enables the kernel to automatically create the device
|
|
||||||
nodes for the slaves on demand using the "devpts" filesystem.
|
|
||||||
|
|
||||||
353
Documentation/admin-guide/dynamic-debug-howto.rst
Normal file
@@ -0,0 +1,353 @@
|
|||||||
|
Dynamic debug
|
||||||
|
+++++++++++++
|
||||||
|
|
||||||
|
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
This document describes how to use the dynamic debug (dyndbg) feature.
|
||||||
|
|
||||||
|
Dynamic debug is designed to allow you to dynamically enable/disable
|
||||||
|
kernel code to obtain additional kernel information. Currently, if
|
||||||
|
``CONFIG_DYNAMIC_DEBUG`` is set, then all ``pr_debug()``/``dev_dbg()`` and
|
||||||
|
``print_hex_dump_debug()``/``print_hex_dump_bytes()`` calls can be dynamically
|
||||||
|
enabled per-callsite.
|
||||||
|
|
||||||
|
If ``CONFIG_DYNAMIC_DEBUG`` is not set, ``print_hex_dump_debug()`` is just
|
||||||
|
a shortcut for ``print_hex_dump(KERN_DEBUG)``.
|
||||||
|
|
||||||
|
For ``print_hex_dump_debug()``/``print_hex_dump_bytes()``, the format string is
its ``prefix_str`` argument if it is a constant string, or ``hexdump``
if ``prefix_str`` is built dynamically.
|
||||||
|
|
||||||
|
Dynamic debug has even more useful features:
|
||||||
|
|
||||||
|
* Simple query language allows turning on and off debugging
|
||||||
|
statements by matching any combination of 0 or 1 of:
|
||||||
|
|
||||||
|
- source filename
|
||||||
|
- function name
|
||||||
|
- line number (including ranges of line numbers)
|
||||||
|
- module name
|
||||||
|
- format string
|
||||||
|
|
||||||
|
* Provides a debugfs control file: ``<debugfs>/dynamic_debug/control``
|
||||||
|
which can be read to display the complete list of known debug
|
||||||
|
statements, to help guide you.
|
||||||
|
|
||||||
|
Controlling dynamic debug Behaviour
|
||||||
|
===================================
|
||||||
|
|
||||||
|
The behaviour of ``pr_debug()``/``dev_dbg()`` is controlled by writing to a
|
||||||
|
control file in the 'debugfs' filesystem. Thus, you must first mount
|
||||||
|
the debugfs filesystem, in order to make use of this feature.
|
||||||
|
Subsequently, we refer to the control file as:
|
||||||
|
``<debugfs>/dynamic_debug/control``. For example, if you want to enable
|
||||||
|
printing from source file ``svcsock.c``, line 1603 you simply do::
|
||||||
|
|
||||||
|
nullarbor:~ # echo 'file svcsock.c line 1603 +p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
If you make a mistake with the syntax, the write will fail thus::
|
||||||
|
|
||||||
|
nullarbor:~ # echo 'file svcsock.c wtf 1 +p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
-bash: echo: write error: Invalid argument
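To avoid retyping the control-file path, the write can be wrapped in a
small shell function. This is only a sketch: the name ``ddcmd`` and the
``DDCTL`` variable are made up for illustration, and the default path
assumes debugfs is mounted at ``/sys/kernel/debug``, which may differ on
your system.

```shell
# Assumed debugfs mount point; override DDCTL if yours differs.
DDCTL="${DDCTL:-/sys/kernel/debug/dynamic_debug/control}"

# ddcmd WORDS...: join the arguments with single spaces and submit them
# to the control file. Returns non-zero if the write is rejected
# (for example on a syntax error).
ddcmd() {
    echo "$*" > "$DDCTL"
}
```

With this in place, ``ddcmd file svcsock.c line 1603 +p`` is equivalent to
the first example above. A query containing quoted whitespace should be
passed as one quoted argument, e.g. ``ddcmd 'format "nfsd: READ" +p'``.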
|
||||||
|
|
||||||
|
Viewing Dynamic Debug Behaviour
|
||||||
|
===============================
|
||||||
|
|
||||||
|
You can view the currently configured behaviour of all the debug
|
||||||
|
statements via::
|
||||||
|
|
||||||
|
nullarbor:~ # cat <debugfs>/dynamic_debug/control
|
||||||
|
# filename:lineno [module]function flags format
|
||||||
|
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:323 [svcxprt_rdma]svc_rdma_cleanup =_ "SVCRDMA Module Removed, deregister RPC RDMA transport\012"
|
||||||
|
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:341 [svcxprt_rdma]svc_rdma_init =_ "\011max_inline : %d\012"
|
||||||
|
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:340 [svcxprt_rdma]svc_rdma_init =_ "\011sq_depth : %d\012"
|
||||||
|
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:338 [svcxprt_rdma]svc_rdma_init =_ "\011max_requests : %d\012"
|
||||||
|
...
|
||||||
|
|
||||||
|
|
||||||
|
You can also apply standard Unix text manipulation filters to this
|
||||||
|
data, e.g.::
|
||||||
|
|
||||||
|
nullarbor:~ # grep -i rdma <debugfs>/dynamic_debug/control | wc -l
|
||||||
|
62
|
||||||
|
|
||||||
|
nullarbor:~ # grep -i tcp <debugfs>/dynamic_debug/control | wc -l
|
||||||
|
42
|
||||||
|
|
||||||
|
The third column shows the currently enabled flags for each debug
|
||||||
|
statement callsite (see below for definitions of the flags). The
|
||||||
|
default value, with no flags enabled, is ``=_``. So you can view all
|
||||||
|
the debug statement callsites with any non-default flags::
|
||||||
|
|
||||||
|
nullarbor:~ # awk '$3 != "=_"' <debugfs>/dynamic_debug/control
|
||||||
|
# filename:lineno [module]function flags format
|
||||||
|
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c:1603 [sunrpc]svc_send p "svc_process: st_sendto returned %d\012"
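The same column layout can be used to summarise which modules currently
have callsites with non-default flags. The function below is a sketch and
not part of the kernel tree: the name ``enabled_per_module`` is
hypothetical, and it assumes that source paths contain no whitespace, so
that the flags are always the third field.

```shell
# enabled_per_module FILE: print "module count" for every module that has
# callsites with non-default flags. FILE is the control file (or a copy).
enabled_per_module() {
    awk '
        $1 ~ /^#/   { next }            # skip the header line
        $3 != "=_"  {
            m = $2                      # e.g. "[sunrpc]svc_send"
            sub(/^\[/, "", m)           # drop the leading "["
            sub(/\].*$/, "", m)         # drop "]" and the function name
            count[m]++
        }
        END { for (m in count) print m, count[m] }
    ' "$1"
}
```

It would be called as ``enabled_per_module <debugfs>/dynamic_debug/control``;
the order of the output lines is unspecified, so pipe it through ``sort``
if a stable listing is needed.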
|
||||||
|
|
||||||
|
Command Language Reference
|
||||||
|
==========================
|
||||||
|
|
||||||
|
At the lexical level, a command comprises a sequence of words separated
|
||||||
|
by spaces or tabs. So these are all equivalent::
|
||||||
|
|
||||||
|
nullarbor:~ # echo 'file svcsock.c line 1603 +p' >
<debugfs>/dynamic_debug/control
nullarbor:~ # echo '  file   svcsock.c     line  1603 +p  ' >
<debugfs>/dynamic_debug/control
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
Command submissions are bounded by a write() system call.
|
||||||
|
Multiple commands can be written together, separated by ``;`` or ``\n``::
|
||||||
|
|
||||||
|
~# echo "func pnpacpi_get_resources +p; func pnp_assign_mem +p" \
|
||||||
|
> <debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
If your query set is big, you can batch them too::
|
||||||
|
|
||||||
|
~# cat query-batch-file > <debugfs>/dynamic_debug/control
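As a sketch of what such a batch file might contain, the snippet below
writes two queries, one per line, to a scratch file. The use of ``mktemp``
and the variable name ``batch`` are arbitrary choices for illustration;
the queries are the two from the ``;``-separated example above.

```shell
# Write two queries, one per line, to a scratch file (name is arbitrary).
batch=$(mktemp)
cat > "$batch" <<'EOF'
func pnpacpi_get_resources +p
func pnp_assign_mem +p
EOF

# Then, as root, submit the whole file:
#   cat "$batch" > <debugfs>/dynamic_debug/control
```

Keeping one query per line makes the file easy to review and to edit
before it is submitted.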
|
||||||
|
|
||||||
|
Another way is to use wildcards. The match rule supports ``*`` (matches
zero or more characters) and ``?`` (matches exactly one character). For
example, you can match all USB drivers::
|
||||||
|
|
||||||
|
~# echo "file drivers/usb/* +p" > <debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
At the syntactical level, a command comprises a sequence of match
|
||||||
|
specifications, followed by a flags change specification::
|
||||||
|
|
||||||
|
command ::= match-spec* flags-spec
|
||||||
|
|
||||||
|
The match-spec's are used to choose a subset of the known pr_debug()
|
||||||
|
callsites to which to apply the flags-spec. Think of them as a query
|
||||||
|
with implicit ANDs between each pair. Note that an empty list of
|
||||||
|
match-specs will select all debug statement callsites.
|
||||||
|
|
||||||
|
A match specification comprises a keyword, which controls the
|
||||||
|
attribute of the callsite to be compared, and a value to compare
|
||||||
|
against. Possible keywords are::
|
||||||
|
|
||||||
|
match-spec ::= 'func' string |
|
||||||
|
'file' string |
|
||||||
|
'module' string |
|
||||||
|
'format' string |
|
||||||
|
'line' line-range
|
||||||
|
|
||||||
|
line-range ::= lineno |
|
||||||
|
'-'lineno |
|
||||||
|
lineno'-' |
|
||||||
|
lineno'-'lineno
|
||||||
|
|
||||||
|
lineno ::= unsigned-int
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
``line-range`` cannot contain space, e.g.
|
||||||
|
"1-30" is valid range but "1 - 30" is not.
|
||||||
|
|
||||||
|
|
||||||
|
The meaning of each keyword is:
|
||||||
|
|
||||||
|
func
|
||||||
|
The given string is compared against the function name
|
||||||
|
of each callsite. Example::
|
||||||
|
|
||||||
|
func svc_tcp_accept
|
||||||
|
|
||||||
|
file
|
||||||
|
The given string is compared against either the full pathname, the
|
||||||
|
src-root relative pathname, or the basename of the source file of
|
||||||
|
each callsite. Examples::
|
||||||
|
|
||||||
|
file svcsock.c
|
||||||
|
file kernel/freezer.c
|
||||||
|
file /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c
|
||||||
|
|
||||||
|
module
|
||||||
|
The given string is compared against the module name
|
||||||
|
of each callsite. The module name is the string as
|
||||||
|
seen in ``lsmod``, i.e. without the directory or the ``.ko``
|
||||||
|
suffix and with ``-`` changed to ``_``. Examples::
|
||||||
|
|
||||||
|
module sunrpc
|
||||||
|
module nfsd
|
||||||
|
|
||||||
|
format
|
||||||
|
The given string is searched for in the dynamic debug format
|
||||||
|
string. Note that the string does not need to match the
|
||||||
|
entire format, only some part. Whitespace and other
|
||||||
|
special characters can be escaped using C octal character
|
||||||
|
escape ``\ooo`` notation, e.g. the space character is ``\040``.
|
||||||
|
Alternatively, the string can be enclosed in double quote
|
||||||
|
characters (``"``) or single quote characters (``'``).
|
||||||
|
Examples::
|
||||||
|
|
||||||
|
format svcrdma: // many of the NFS/RDMA server pr_debugs
|
||||||
|
format readahead // some pr_debugs in the readahead cache
|
||||||
|
format nfsd:\040SETATTR // one way to match a format with whitespace
|
||||||
|
format "nfsd: SETATTR" // a neater way to match a format with whitespace
|
||||||
|
format 'nfsd: SETATTR' // yet another way to match a format with whitespace
|
||||||
|
|
||||||
|
line
|
||||||
|
The given line number or range of line numbers is compared
|
||||||
|
against the line number of each ``pr_debug()`` callsite. A single
|
||||||
|
line number matches the callsite line number exactly. A
|
||||||
|
range of line numbers matches any callsite between the first
|
||||||
|
and last line number inclusive. An empty first number means
|
||||||
|
the first line in the file; an empty last number means the
last line in the file. Examples::
|
||||||
|
|
||||||
|
line 1603 // exactly line 1603
|
||||||
|
line 1600-1605 // the six lines from line 1600 to line 1605
|
||||||
|
line -1605 // the 1605 lines from line 1 to line 1605
|
||||||
|
line 1600- // all lines from line 1600 to the end of the file
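The range rules above can be restated as a small shell function. This is
a sketch that mirrors the prose, not the kernel's own parser; the name
``line_in_range`` is hypothetical, and the input is assumed to be a valid
``line-range`` without spaces.

```shell
# line_in_range LINE SPEC: succeed if LINE falls inside SPEC, where SPEC
# is "N", "-N", "N-" or "N-M" as in the line-range grammar.
line_in_range() {
    line=$1
    spec=$2
    case $spec in
        *-*) first=${spec%%-*}; last=${spec#*-} ;;   # split at the dash
        *)   first=$spec; last=$spec ;;              # single line number
    esac
    # An empty bound means "no limit" on that side.
    [ -z "$first" ] || [ "$line" -ge "$first" ] || return 1
    [ -z "$last" ]  || [ "$line" -le "$last" ]  || return 1
    return 0
}
```

For instance, ``line_in_range 1603 1600-1605`` succeeds, while
``line_in_range 1606 -1605`` fails.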
|
||||||
|
|
||||||
|
The flags specification comprises a change operation followed
|
||||||
|
by one or more flag characters. The change operation is one
|
||||||
|
of the characters::
|
||||||
|
|
||||||
|
- remove the given flags
|
||||||
|
+ add the given flags
|
||||||
|
= set the flags to the given flags
|
||||||
|
|
||||||
|
The flags are::
|
||||||
|
|
||||||
|
p    enables the pr_debug() callsite.
f    Include the function name in the printed message
l    Include line number in the printed message
m    Include module name in the printed message
t    Include thread ID in messages not generated from interrupt context
_    No flags are set. (Or'd with others on input)
|
||||||
|
|
||||||
|
For ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``, only the ``p``
flag has meaning; other flags are ignored.
|
||||||
|
|
||||||
|
For display, the flags are preceded by ``=``
|
||||||
|
(mnemonic: what the flags are currently equal to).
|
||||||
|
|
||||||
|
Note the regexp ``^[-+=][flmpt_]+$`` matches a flags specification.
|
||||||
|
To clear all flags at once, use ``=_`` or ``-flmpt``.
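A script that builds queries can use that regexp to reject a malformed
flags specification before writing it to the control file. The function
below is a minimal sketch; the name ``is_flags_spec`` is hypothetical, and
the pattern is the one quoted above, passed to ``grep -E``.

```shell
# is_flags_spec STRING: succeed if STRING is a change operation (-, + or =)
# followed by one or more flag characters.
is_flags_spec() {
    printf '%s\n' "$1" | grep -Eq '^[-+=][flmpt_]+$'
}
```

``is_flags_spec '+pt'`` and ``is_flags_spec '=_'`` succeed, whereas a bare
``p`` (no change operation) or ``+x`` (unknown flag) is rejected.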
|
||||||
|
|
||||||
|
|
||||||
|
Debug messages during Boot Process
|
||||||
|
==================================
|
||||||
|
|
||||||
|
To activate debug messages for core code and built-in modules during
|
||||||
|
the boot process, even before userspace and debugfs exist, use
|
||||||
|
``dyndbg="QUERY"``, ``module.dyndbg="QUERY"``, or ``ddebug_query="QUERY"``
|
||||||
|
(``ddebug_query`` is obsoleted by ``dyndbg``, and deprecated). QUERY follows
|
||||||
|
the syntax described above, but must not exceed 1023 characters. Your
|
||||||
|
bootloader may impose lower limits.
|
||||||
|
|
||||||
|
These ``dyndbg`` params are processed just after the ddebug tables are
|
||||||
|
processed, as part of the arch_initcall. Thus you can enable debug
|
||||||
|
messages in all code run after this arch_initcall via this boot
|
||||||
|
parameter.
|
||||||
|
|
||||||
|
On an x86 system for example ACPI enablement is a subsys_initcall and::
|
||||||
|
|
||||||
|
dyndbg="file ec.c +p"
|
||||||
|
|
||||||
|
will show early Embedded Controller transactions during ACPI setup if
|
||||||
|
your machine (typically a laptop) has an Embedded Controller.
|
||||||
|
PCI (or other devices) initialization also is a hot candidate for using
|
||||||
|
this boot parameter for debugging purposes.
|
||||||
|
|
||||||
|
If ``foo`` module is not built-in, ``foo.dyndbg`` will still be processed at
|
||||||
|
boot time, without effect, but will be reprocessed when module is
|
||||||
|
loaded later. ``ddebug_query=`` and bare ``dyndbg=`` are only processed at
|
||||||
|
boot.
|
||||||
|
|
||||||
|
|
||||||
|
Debug Messages at Module Initialization Time
|
||||||
|
============================================
|
||||||
|
|
||||||
|
When ``modprobe foo`` is called, modprobe scans ``/proc/cmdline`` for
|
||||||
|
``foo.params``, strips ``foo.``, and passes them to the kernel along with
|
||||||
|
params given in modprobe args or ``/etc/modprobe.d/*.conf`` files,
|
||||||
|
in the following order:
|
||||||
|
|
||||||
|
1. parameters given via ``/etc/modprobe.d/*.conf``::
|
||||||
|
|
||||||
|
options foo dyndbg=+pt
|
||||||
|
options foo dyndbg # defaults to +p
|
||||||
|
|
||||||
|
2. ``foo.dyndbg`` as given in boot args, ``foo.`` is stripped and passed::
|
||||||
|
|
||||||
|
foo.dyndbg=" func bar +p; func buz +mp"
|
||||||
|
|
||||||
|
3. args to modprobe::
|
||||||
|
|
||||||
|
modprobe foo dyndbg==pmf # override previous settings
|
||||||
|
|
||||||
|
These ``dyndbg`` queries are applied in order, with last having final say.
|
||||||
|
This allows boot args to override or modify those from ``/etc/modprobe.d``
|
||||||
|
(sensible, since 1 is system wide, 2 is kernel or boot specific), and
|
||||||
|
modprobe args to override both.
|
||||||
|
|
||||||
|
In the ``foo.dyndbg="QUERY"`` form, the query must exclude ``module foo``.
|
||||||
|
``foo`` is extracted from the param-name, and applied to each query in
|
||||||
|
``QUERY``, and only 1 match-spec of each type is allowed.
|
||||||
|
|
||||||
|
The ``dyndbg`` option is a "fake" module parameter, which means:
|
||||||
|
|
||||||
|
- modules do not need to define it explicitly
|
||||||
|
- every module gets it tacitly, whether they use pr_debug or not
|
||||||
|
- it doesn't appear in ``/sys/module/$module/parameters/``
|
||||||
|
To see it, grep the control file, or inspect ``/proc/cmdline``.
|
||||||
|
|
||||||
|
For ``CONFIG_DYNAMIC_DEBUG`` kernels, any settings given at boot-time (or
|
||||||
|
enabled by ``-DDEBUG`` flag during compilation) can be disabled later via
|
||||||
|
the debugfs interface if the debug messages are no longer needed::
|
||||||
|
|
||||||
|
echo "module module_name -p" > <debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
Examples
|
||||||
|
========
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
// enable the message at line 1603 of file svcsock.c
|
||||||
|
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// enable all the messages in file svcsock.c
|
||||||
|
nullarbor:~ # echo -n 'file svcsock.c +p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// enable all the messages in the NFS server module
|
||||||
|
nullarbor:~ # echo -n 'module nfsd +p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// enable all 12 messages in the function svc_process()
|
||||||
|
nullarbor:~ # echo -n 'func svc_process +p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// disable all 12 messages in the function svc_process()
|
||||||
|
nullarbor:~ # echo -n 'func svc_process -p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// enable messages for NFS calls READ, READLINK, READDIR and READDIR+.
|
||||||
|
nullarbor:~ # echo -n 'format "nfsd: READ" +p' >
|
||||||
|
<debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// enable messages in files of which the paths include string "usb"
|
||||||
|
nullarbor:~ # echo -n 'file *usb* +p' > <debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// enable all messages
|
||||||
|
nullarbor:~ # echo -n '+p' > <debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// add module, function to all enabled messages
|
||||||
|
nullarbor:~ # echo -n '+mf' > <debugfs>/dynamic_debug/control
|
||||||
|
|
||||||
|
// boot-args example, with newlines and comments for readability
|
||||||
|
Kernel command line: ...
|
||||||
|
// see whats going on in dyndbg=value processing
|
||||||
|
dynamic_debug.verbose=1
|
||||||
|
// enable pr_debugs in 2 builtins, #cmt is stripped
|
||||||
|
dyndbg="module params +p #cmt ; module sys +p"
|
||||||
|
// enable pr_debugs in 2 functions in a module loaded later
|
||||||
|
pc87360.dyndbg="func pc87360_init_device +p; func pc87360_find +p"
|
||||||
68
Documentation/admin-guide/index.rst
Normal file
@@ -0,0 +1,68 @@
|
|||||||
|
The Linux kernel user's and administrator's guide
|
||||||
|
=================================================
|
||||||
|
|
||||||
|
The following is a collection of user-oriented documents that have been
|
||||||
|
added to the kernel over time. There is, as yet, little overall order or
|
||||||
|
organization here — this material was not written to be a single, coherent
|
||||||
|
document! With luck things will improve quickly over time.
|
||||||
|
|
||||||
|
This initial section contains overall information, including the README
|
||||||
|
file describing the kernel as a whole, documentation on kernel parameters,
|
||||||
|
etc.
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
README
|
||||||
|
kernel-parameters
|
||||||
|
devices
|
||||||
|
|
||||||
|
Here is a set of documents aimed at users who are trying to track down
|
||||||
|
problems and bugs in particular.
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
reporting-bugs
|
||||||
|
security-bugs
|
||||||
|
bug-hunting
|
||||||
|
bug-bisect
|
||||||
|
tainted-kernels
|
||||||
|
ramoops
|
||||||
|
dynamic-debug-howto
|
||||||
|
init
|
||||||
|
|
||||||
|
This is the beginning of a section with information of interest to
|
||||||
|
application developers. Documents covering various aspects of the kernel
|
||||||
|
ABI will be found here.
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
sysfs-rules
|
||||||
|
|
||||||
|
The rest of this manual consists of various unordered guides on how to
|
||||||
|
configure specific aspects of kernel behavior to your liking.
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
initrd
|
||||||
|
serial-console
|
||||||
|
braille-console
|
||||||
|
parport
|
||||||
|
md
|
||||||
|
module-signing
|
||||||
|
sysrq
|
||||||
|
unicode
|
||||||
|
vga-softcursor
|
||||||
|
binfmt-misc
|
||||||
|
mono
|
||||||
|
java
|
||||||
|
|
||||||
|
.. only:: subproject and html
|
||||||
|
|
||||||
|
Indices
|
||||||
|
=======
|
||||||
|
|
||||||
|
* :ref:`genindex`
|
||||||
@@ -5,6 +5,7 @@ OK, so you've got this pretty unintuitive message (currently located
|
|||||||
in init/main.c) and are wondering what the H*** went wrong.
|
in init/main.c) and are wondering what the H*** went wrong.
|
||||||
Some high-level reasons for failure (listed roughly in order of execution)
|
Some high-level reasons for failure (listed roughly in order of execution)
|
||||||
to load the init binary are:
|
to load the init binary are:
|
||||||
|
|
||||||
A) Unable to mount root FS
|
A) Unable to mount root FS
|
||||||
B) init binary doesn't exist on rootfs
|
B) init binary doesn't exist on rootfs
|
||||||
C) broken console device
|
C) broken console device
|
||||||
@@ -12,37 +13,39 @@ D) binary exists but dependencies not available
|
|||||||
E) binary cannot be loaded
|
E) binary cannot be loaded
|
||||||
|
|
||||||
Detailed explanations:
|
Detailed explanations:
|
||||||
0) Set "debug" kernel parameter (in bootloader config file or CONFIG_CMDLINE)
|
|
||||||
|
A) Set "debug" kernel parameter (in bootloader config file or CONFIG_CMDLINE)
|
||||||
to get more detailed kernel messages.
|
to get more detailed kernel messages.
|
||||||
A) make sure you have the correct root FS type
|
B) make sure you have the correct root FS type
|
||||||
(and root= kernel parameter points to the correct partition),
|
(and ``root=`` kernel parameter points to the correct partition),
|
||||||
required drivers such as storage hardware (such as SCSI or USB!)
|
required drivers such as storage hardware (such as SCSI or USB!)
|
||||||
and filesystem (ext3, jffs2 etc.) are builtin (alternatively as modules,
|
and filesystem (ext3, jffs2 etc.) are builtin (alternatively as modules,
|
||||||
to be pre-loaded by an initrd)
|
to be pre-loaded by an initrd)
|
||||||
C) Possibly a conflict in console= setup --> initial console unavailable.
|
C) Possibly a conflict in ``console= setup`` --> initial console unavailable.
|
||||||
E.g. some serial consoles are unreliable due to serial IRQ issues (e.g.
|
E.g. some serial consoles are unreliable due to serial IRQ issues (e.g.
|
||||||
missing interrupt-based configuration).
|
missing interrupt-based configuration).
|
||||||
Try using a different console= device or e.g. netconsole= .
|
Try using a different ``console= device`` or e.g. ``netconsole=``.
|
||||||
D) e.g. required library dependencies of the init binary such as
|
D) e.g. required library dependencies of the init binary such as
|
||||||
/lib/ld-linux.so.2 missing or broken. Use readelf -d <INIT>|grep NEEDED
|
``/lib/ld-linux.so.2`` missing or broken. Use
|
||||||
to find out which libraries are required.
|
``readelf -d <INIT>|grep NEEDED`` to find out which libraries are required.
|
||||||
E) make sure the binary's architecture matches your hardware.
|
E) make sure the binary's architecture matches your hardware.
|
||||||
E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM hardware.
|
E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM hardware.
|
||||||
In case you tried loading a non-binary file here (shell script?),
|
In case you tried loading a non-binary file here (shell script?),
|
||||||
you should make sure that the script specifies an interpreter in its shebang
|
you should make sure that the script specifies an interpreter in its shebang
|
||||||
header line (#!/...) that is fully working (including its library
|
header line (``#!/...``) that is fully working (including its library
|
||||||
dependencies). And before tackling scripts, better first test a simple
|
dependencies). And before tackling scripts, better first test a simple
|
||||||
non-script binary such as /bin/sh and confirm its successful execution.
|
non-script binary such as ``/bin/sh`` and confirm its successful execution.
|
||||||
To find out more, add code to init/main.c to display kernel_execve()s
|
To find out more, add code ``to init/main.c`` to display kernel_execve()s
|
||||||
return values.
|
return values.
|
||||||
|
|
||||||
Please extend this explanation whenever you find new failure causes
|
Please extend this explanation whenever you find new failure causes
|
||||||
(after all loading the init binary is a CRITICAL and hard transition step
|
(after all loading the init binary is a CRITICAL and hard transition step
|
||||||
which needs to be made as painless as possible), then submit patch to LKML.
|
which needs to be made as painless as possible), then submit patch to LKML.
|
||||||
Further TODOs:
|
Further TODOs:
|
||||||
- Implement the various run_init_process() invocations via a struct array
|
|
||||||
which can then store the kernel_execve() result value and on failure
|
- Implement the various ``run_init_process()`` invocations via a struct array
|
||||||
log it all by iterating over _all_ results (very important usability fix).
|
which can then store the ``kernel_execve()`` result value and on failure
|
||||||
|
log it all by iterating over **all** results (very important usability fix).
|
||||||
- try to make the implementation itself more helpful in general,
|
- try to make the implementation itself more helpful in general,
|
||||||
e.g. by providing additional error messages at affected places.
|
e.g. by providing additional error messages at affected places.
|
||||||
|
|
||||||
@@ -16,7 +16,7 @@ where the kernel comes up with a minimum set of compiled-in drivers, and
|
|||||||
where additional modules are loaded from initrd.
|
where additional modules are loaded from initrd.
|
||||||
|
|
||||||
This document gives a brief overview of the use of initrd. A more detailed
|
This document gives a brief overview of the use of initrd. A more detailed
|
||||||
discussion of the boot process can be found in [1].
|
discussion of the boot process can be found in [#f1]_.
|
||||||
|
|
||||||
|
|
||||||
Operation
|
Operation
|
||||||
@@ -27,10 +27,10 @@ When using initrd, the system typically boots as follows:
|
|||||||
1) the boot loader loads the kernel and the initial RAM disk
|
1) the boot loader loads the kernel and the initial RAM disk
|
||||||
2) the kernel converts initrd into a "normal" RAM disk and
|
2) the kernel converts initrd into a "normal" RAM disk and
|
||||||
frees the memory used by initrd
|
frees the memory used by initrd
|
||||||
3) if the root device is not /dev/ram0, the old (deprecated)
|
3) if the root device is not ``/dev/ram0``, the old (deprecated)
|
||||||
change_root procedure is followed. see the "Obsolete root change
|
change_root procedure is followed. see the "Obsolete root change
|
||||||
mechanism" section below.
|
mechanism" section below.
|
||||||
4) root device is mounted. if it is /dev/ram0, the initrd image is
|
4) root device is mounted. if it is ``/dev/ram0``, the initrd image is
|
||||||
then mounted as root
|
then mounted as root
|
||||||
5) /sbin/init is executed (this can be any valid executable, including
|
5) /sbin/init is executed (this can be any valid executable, including
|
||||||
shell scripts; it is run with uid 0 and can do basically everything
|
shell scripts; it is run with uid 0 and can do basically everything
|
||||||
@@ -38,7 +38,7 @@ When using initrd, the system typically boots as follows:
|
|||||||
6) init mounts the "real" root file system
|
6) init mounts the "real" root file system
|
||||||
7) init places the root file system at the root directory using the
|
7) init places the root file system at the root directory using the
|
||||||
pivot_root system call
|
pivot_root system call
|
||||||
8) init execs the /sbin/init on the new root filesystem, performing
|
8) init execs the ``/sbin/init`` on the new root filesystem, performing
|
||||||
the usual boot sequence
|
the usual boot sequence
|
||||||
9) the initrd file system is removed
|
9) the initrd file system is removed
|
||||||
|
|
||||||
@@ -51,7 +51,7 @@ be accessible.
|
|||||||
Boot command-line options
|
Boot command-line options
|
||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
initrd adds the following new options:
|
initrd adds the following new options::
|
||||||
|
|
||||||
initrd=<path> (e.g. LOADLIN)
|
initrd=<path> (e.g. LOADLIN)
|
||||||
|
|
||||||
@@ -83,11 +83,11 @@ Recent kernels have support for populating a ramdisk from a compressed cpio
|
|||||||
archive. On such systems, the creation of a ramdisk image doesn't need to
|
archive. On such systems, the creation of a ramdisk image doesn't need to
|
||||||
involve special block devices or loopbacks; you merely create a directory on
|
involve special block devices or loopbacks; you merely create a directory on
|
||||||
disk with the desired initrd content, cd to that directory, and run (as an
|
disk with the desired initrd content, cd to that directory, and run (as an
|
||||||
example):
|
example)::
|
||||||
|
|
||||||
find . | cpio --quiet -H newc -o | gzip -9 -n > /boot/imagefile.img
|
find . | cpio --quiet -H newc -o | gzip -9 -n > /boot/imagefile.img
|
||||||
|
|
||||||
Examining the contents of an existing image file is just as simple:
|
Examining the contents of an existing image file is just as simple::
|
||||||
|
|
||||||
mkdir /tmp/imagefile
|
mkdir /tmp/imagefile
|
||||||
cd /tmp/imagefile
|
cd /tmp/imagefile
|
||||||
@@ -97,19 +97,19 @@ Installation
|
|||||||
------------
|
------------
|
||||||
|
|
||||||
First, a directory for the initrd file system has to be created on the
|
First, a directory for the initrd file system has to be created on the
|
||||||
"normal" root file system, e.g.
|
"normal" root file system, e.g.::
|
||||||
|
|
||||||
# mkdir /initrd
|
# mkdir /initrd
|
||||||
|
|
||||||
The name is not relevant. More details can be found on the pivot_root(2)
|
The name is not relevant. More details can be found on the
|
||||||
man page.
|
:manpage:`pivot_root(2)` man page.
|
||||||
|
|
||||||
If the root file system is created during the boot procedure (i.e. if
|
If the root file system is created during the boot procedure (i.e. if
|
||||||
you're building an install floppy), the root file system creation
|
you're building an install floppy), the root file system creation
|
||||||
procedure should create the /initrd directory.
|
procedure should create the ``/initrd`` directory.
|
||||||
|
|
||||||
If initrd will not be mounted in some cases, its content is still
|
If initrd will not be mounted in some cases, its content is still
|
||||||
accessible if the following device has been created:
|
accessible if the following device has been created::
|
||||||
|
|
||||||
# mknod /dev/initrd b 1 250
|
# mknod /dev/initrd b 1 250
|
||||||
# chmod 400 /dev/initrd
|
# chmod 400 /dev/initrd
|
||||||
@@ -131,60 +131,76 @@ kernels, at least three types of devices are suitable for that:
|
|||||||
We'll describe the loopback device method:
|
We'll describe the loopback device method:
|
||||||
|
|
||||||
1) make sure loopback block devices are configured into the kernel
|
1) make sure loopback block devices are configured into the kernel
|
||||||
2) create an empty file system of the appropriate size, e.g.
|
2) create an empty file system of the appropriate size, e.g.::
|
||||||
|
|
||||||
# dd if=/dev/zero of=initrd bs=300k count=1
|
# dd if=/dev/zero of=initrd bs=300k count=1
|
||||||
# mke2fs -F -m0 initrd
|
# mke2fs -F -m0 initrd
|
||||||
|
|
||||||
(if space is critical, you may want to use the Minix FS instead of Ext2)
|
(if space is critical, you may want to use the Minix FS instead of Ext2)
|
||||||
3) mount the file system, e.g.
|
3) mount the file system, e.g.::
|
||||||
|
|
||||||
# mount -t ext2 -o loop initrd /mnt
|
# mount -t ext2 -o loop initrd /mnt
|
||||||
4) create the console device:
|
|
||||||
|
4) create the console device::
|
||||||
|
|
||||||
# mkdir /mnt/dev
|
# mkdir /mnt/dev
|
||||||
# mknod /mnt/dev/console c 5 1
|
# mknod /mnt/dev/console c 5 1
|
||||||
|
|
||||||
5) copy all the files that are needed to properly use the initrd
|
5) copy all the files that are needed to properly use the initrd
|
||||||
environment. Don't forget the most important file, /sbin/init
|
environment. Don't forget the most important file, ``/sbin/init``
|
||||||
Note that /sbin/init's permissions must include "x" (execute).
|
|
||||||
|
.. note:: ``/sbin/init`` permissions must include "x" (execute).
|
||||||
|
|
||||||
6) correct operation the initrd environment can frequently be tested
|
6) correct operation the initrd environment can frequently be tested
|
||||||
even without rebooting with the command
|
even without rebooting with the command::
|
||||||
|
|
||||||
# chroot /mnt /sbin/init
|
# chroot /mnt /sbin/init
|
||||||
|
|
||||||
This is of course limited to initrds that do not interfere with the
|
This is of course limited to initrds that do not interfere with the
|
||||||
general system state (e.g. by reconfiguring network interfaces,
|
general system state (e.g. by reconfiguring network interfaces,
|
||||||
overwriting mounted devices, trying to start already running demons,
|
overwriting mounted devices, trying to start already running demons,
|
||||||
etc. Note however that it is usually possible to use pivot_root in
|
etc. Note however that it is usually possible to use pivot_root in
|
||||||
such a chroot'ed initrd environment.)
|
such a chroot'ed initrd environment.)
|
||||||
7) unmount the file system
|
7) unmount the file system::
|
||||||
|
|
||||||
# umount /mnt
|
# umount /mnt
|
||||||
|
|
||||||
8) the initrd is now in the file "initrd". Optionally, it can now be
|
8) the initrd is now in the file "initrd". Optionally, it can now be
|
||||||
compressed
|
compressed::
|
||||||
|
|
||||||
# gzip -9 initrd
|
# gzip -9 initrd
|
||||||
|
|
||||||
For experimenting with initrd, you may want to take a rescue floppy and
|
For experimenting with initrd, you may want to take a rescue floppy and
|
||||||
only add a symbolic link from /sbin/init to /bin/sh. Alternatively, you
|
only add a symbolic link from ``/sbin/init`` to ``/bin/sh``. Alternatively, you
|
||||||
can try the experimental newlib environment [2] to create a small
|
can try the experimental newlib environment [#f2]_ to create a small
|
||||||
initrd.
|
initrd.
|
||||||
|
|
||||||
Finally, you have to boot the kernel and load initrd. Almost all Linux
|
Finally, you have to boot the kernel and load initrd. Almost all Linux
|
||||||
boot loaders support initrd. Since the boot process is still compatible
|
boot loaders support initrd. Since the boot process is still compatible
|
||||||
with an older mechanism, the following boot command line parameters
|
with an older mechanism, the following boot command line parameters
|
||||||
have to be given:
|
have to be given::
|
||||||
|
|
||||||
root=/dev/ram0 rw
|
root=/dev/ram0 rw
|
||||||
|
|
||||||
(rw is only necessary if writing to the initrd file system.)
|
(rw is only necessary if writing to the initrd file system.)
|
||||||
|
|
||||||
With LOADLIN, you simply execute
|
With LOADLIN, you simply execute::
|
||||||
|
|
||||||
LOADLIN <kernel> initrd=<disk_image>
|
LOADLIN <kernel> initrd=<disk_image>
|
||||||
e.g. LOADLIN C:\LINUX\BZIMAGE initrd=C:\LINUX\INITRD.GZ root=/dev/ram0 rw
|
|
||||||
|
|
||||||
With LILO, you add the option INITRD=<path> to either the global section
|
e.g.::
|
||||||
or to the section of the respective kernel in /etc/lilo.conf, and pass
|
|
||||||
the options using APPEND, e.g.
|
LOADLIN C:\LINUX\BZIMAGE initrd=C:\LINUX\INITRD.GZ root=/dev/ram0 rw
|
||||||
|
|
||||||
|
With LILO, you add the option ``INITRD=<path>`` to either the global section
|
||||||
|
or to the section of the respective kernel in ``/etc/lilo.conf``, and pass
|
||||||
|
the options using APPEND, e.g.::
|
||||||
|
|
||||||
image = /bzImage
|
image = /bzImage
|
||||||
initrd = /boot/initrd.gz
|
initrd = /boot/initrd.gz
|
||||||
append = "root=/dev/ram0 rw"
|
append = "root=/dev/ram0 rw"
|
||||||
|
|
||||||
and run /sbin/lilo
|
and run ``/sbin/lilo``
|
||||||
|
|
||||||
For other boot loaders, please refer to the respective documentation.
|
For other boot loaders, please refer to the respective documentation.
|
||||||
|
|
||||||
@@ -204,17 +220,17 @@ The procedure involves the following steps:
|
|||||||
- unmounting the initrd file system and de-allocating the RAM disk
|
- unmounting the initrd file system and de-allocating the RAM disk
|
||||||
|
|
||||||
Mounting the new root file system is easy: it just needs to be mounted on
|
Mounting the new root file system is easy: it just needs to be mounted on
|
||||||
a directory under the current root. Example:
|
a directory under the current root. Example::
|
||||||
|
|
||||||
# mkdir /new-root
|
# mkdir /new-root
|
||||||
# mount -o ro /dev/hda1 /new-root
|
# mount -o ro /dev/hda1 /new-root
|
||||||
|
|
||||||
The root change is accomplished with the pivot_root system call, which
|
The root change is accomplished with the pivot_root system call, which
|
||||||
is also available via the pivot_root utility (see pivot_root(8) man
|
is also available via the ``pivot_root`` utility (see :manpage:`pivot_root(8)`
|
||||||
page; pivot_root is distributed with util-linux version 2.10h or higher
|
man page; ``pivot_root`` is distributed with util-linux version 2.10h or higher
|
||||||
[3]). pivot_root moves the current root to a directory under the new
|
[#f3]_). ``pivot_root`` moves the current root to a directory under the new
|
||||||
root, and puts the new root at its place. The directory for the old root
|
root, and puts the new root at its place. The directory for the old root
|
||||||
must exist before calling pivot_root. Example:
|
must exist before calling ``pivot_root``. Example::
|
||||||
|
|
||||||
# cd /new-root
|
# cd /new-root
|
||||||
# mkdir initrd
|
# mkdir initrd
|
||||||
@@ -223,14 +239,14 @@ must exist before calling pivot_root. Example:
|
|||||||
Now, the init process may still access the old root via its
|
Now, the init process may still access the old root via its
|
||||||
executable, shared libraries, standard input/output/error, and its
|
executable, shared libraries, standard input/output/error, and its
|
||||||
current root directory. All these references are dropped by the
|
current root directory. All these references are dropped by the
|
||||||
following command:
|
following command::
|
||||||
|
|
||||||
# exec chroot . what-follows <dev/console >dev/console 2>&1
|
# exec chroot . what-follows <dev/console >dev/console 2>&1
|
||||||
|
|
||||||
Where what-follows is a program under the new root, e.g. /sbin/init
|
Where what-follows is a program under the new root, e.g. ``/sbin/init``
|
||||||
If the new root file system will be used with udev and has no valid
|
If the new root file system will be used with udev and has no valid
|
||||||
/dev directory, udev must be initialized before invoking chroot in order
|
``/dev`` directory, udev must be initialized before invoking chroot in order
|
||||||
to provide /dev/console.
|
to provide ``/dev/console``.
|
||||||
|
|
||||||
Note: implementation details of pivot_root may change with time. In order
|
Note: implementation details of pivot_root may change with time. In order
|
||||||
to ensure compatibility, the following points should be observed:
|
to ensure compatibility, the following points should be observed:
|
||||||
@@ -244,13 +260,13 @@ to ensure compatibility, the following points should be observed:
|
|||||||
- use relative paths for dev/console in the exec command
|
- use relative paths for dev/console in the exec command
|
||||||
|
|
||||||
Now, the initrd can be unmounted and the memory allocated by the RAM
|
Now, the initrd can be unmounted and the memory allocated by the RAM
|
||||||
disk can be freed:
|
disk can be freed::
|
||||||
|
|
||||||
# umount /initrd
|
# umount /initrd
|
||||||
# blockdev --flushbufs /dev/ram0
|
# blockdev --flushbufs /dev/ram0
|
||||||
|
|
||||||
It is also possible to use initrd with an NFS-mounted root, see the
|
It is also possible to use initrd with an NFS-mounted root, see the
|
||||||
pivot_root(8) man page for details.
|
:manpage:`pivot_root(8)` man page for details.
|
||||||
|
|
||||||
|
|
||||||
Usage scenarios
|
Usage scenarios
|
||||||
@@ -263,21 +279,21 @@ as follows:
|
|||||||
1) system boots from floppy or other media with a minimal kernel
|
1) system boots from floppy or other media with a minimal kernel
|
||||||
(e.g. support for RAM disks, initrd, a.out, and the Ext2 FS) and
|
(e.g. support for RAM disks, initrd, a.out, and the Ext2 FS) and
|
||||||
loads initrd
|
loads initrd
|
||||||
2) /sbin/init determines what is needed to (1) mount the "real" root FS
|
2) ``/sbin/init`` determines what is needed to (1) mount the "real" root FS
|
||||||
(i.e. device type, device drivers, file system) and (2) the
|
(i.e. device type, device drivers, file system) and (2) the
|
||||||
distribution media (e.g. CD-ROM, network, tape, ...). This can be
|
distribution media (e.g. CD-ROM, network, tape, ...). This can be
|
||||||
done by asking the user, by auto-probing, or by using a hybrid
|
done by asking the user, by auto-probing, or by using a hybrid
|
||||||
approach.
|
approach.
|
||||||
3) /sbin/init loads the necessary kernel modules
|
3) ``/sbin/init`` loads the necessary kernel modules
|
||||||
4) /sbin/init creates and populates the root file system (this doesn't
|
4) ``/sbin/init`` creates and populates the root file system (this doesn't
|
||||||
have to be a very usable system yet)
|
have to be a very usable system yet)
|
||||||
5) /sbin/init invokes pivot_root to change the root file system and
|
5) ``/sbin/init`` invokes ``pivot_root`` to change the root file system and
|
||||||
execs - via chroot - a program that continues the installation
|
execs - via chroot - a program that continues the installation
|
||||||
6) the boot loader is installed
|
6) the boot loader is installed
|
||||||
7) the boot loader is configured to load an initrd with the set of
|
7) the boot loader is configured to load an initrd with the set of
|
||||||
modules that was used to bring up the system (e.g. /initrd can be
|
modules that was used to bring up the system (e.g. ``/initrd`` can be
|
||||||
modified, then unmounted, and finally, the image is written from
|
modified, then unmounted, and finally, the image is written from
|
||||||
/dev/ram0 or /dev/rd/0 to a file)
|
``/dev/ram0`` or ``/dev/rd/0`` to a file)
|
||||||
8) now the system is bootable and additional installation tasks can be
|
8) now the system is bootable and additional installation tasks can be
|
||||||
performed
|
performed
|
||||||
|
|
||||||
@@ -290,7 +306,7 @@ different hardware configurations in a single administrative domain. In
|
|||||||
such cases, it is desirable to generate only a small set of kernels
|
such cases, it is desirable to generate only a small set of kernels
|
||||||
(ideally only one) and to keep the system-specific part of configuration
|
(ideally only one) and to keep the system-specific part of configuration
|
||||||
information as small as possible. In this case, a common initrd could be
|
information as small as possible. In this case, a common initrd could be
|
||||||
generated with all the necessary modules. Then, only /sbin/init or a file
|
generated with all the necessary modules. Then, only ``/sbin/init`` or a file
|
||||||
read by it would have to be different.
|
read by it would have to be different.
|
||||||
|
|
||||||
A third scenario is more convenient recovery disks, because information
|
A third scenario is more convenient recovery disks, because information
|
||||||
@@ -301,7 +317,7 @@ auto-detection).
|
|||||||
|
|
||||||
Last not least, CD-ROM distributors may use it for better installation
|
Last not least, CD-ROM distributors may use it for better installation
|
||||||
from CD, e.g. by using a boot floppy and bootstrapping a bigger RAM disk
|
from CD, e.g. by using a boot floppy and bootstrapping a bigger RAM disk
|
||||||
via initrd from CD; or by booting via a loader like LOADLIN or directly
|
via initrd from CD; or by booting via a loader like ``LOADLIN`` or directly
|
||||||
from the CD-ROM, and loading the RAM disk from CD without need of
|
from the CD-ROM, and loading the RAM disk from CD without need of
|
||||||
floppies.
|
floppies.
|
||||||
|
|
||||||
@@ -316,7 +332,7 @@ continued availability.
|
|||||||
It works by mounting the "real" root device (i.e. the one set with rdev
|
It works by mounting the "real" root device (i.e. the one set with rdev
|
||||||
in the kernel image or with root=... at the boot command line) as the
|
in the kernel image or with root=... at the boot command line) as the
|
||||||
root file system when linuxrc exits. The initrd file system is then
|
root file system when linuxrc exits. The initrd file system is then
|
||||||
unmounted, or, if it is still busy, moved to a directory /initrd, if
|
unmounted, or, if it is still busy, moved to a directory ``/initrd``, if
|
||||||
such a directory exists on the new root file system.
|
such a directory exists on the new root file system.
|
||||||
|
|
||||||
In order to use this mechanism, you do not have to specify the boot
|
In order to use this mechanism, you do not have to specify the boot
|
||||||
@@ -325,24 +341,25 @@ the real root file system, not the initrd environment.)
|
|||||||
|
|
||||||
If /proc is mounted, the "real" root device can be changed from within
|
If /proc is mounted, the "real" root device can be changed from within
|
||||||
linuxrc by writing the number of the new root FS device to the special
|
linuxrc by writing the number of the new root FS device to the special
|
||||||
file /proc/sys/kernel/real-root-dev, e.g.
|
file /proc/sys/kernel/real-root-dev, e.g.::
|
||||||
|
|
||||||
# echo 0x301 >/proc/sys/kernel/real-root-dev
|
# echo 0x301 >/proc/sys/kernel/real-root-dev
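The value written above is a device number. As a minimal sketch, assuming the traditional 16-bit encoding (major number in the high byte, minor number in the low byte), ``0x301`` corresponds to major 3, minor 1, and ``0x0100`` to major 1, minor 0. The helper names below are hypothetical and exist only for illustration.

```python
def encode_old_dev(major, minor):
    """Pack major/minor into the traditional 16-bit device number (assumed layout)."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in 8 bits")
    return (major << 8) | minor


def decode_old_dev(dev):
    """Split a traditional 16-bit device number into (major, minor)."""
    return dev >> 8, dev & 0xFF


print(hex(encode_old_dev(3, 1)))   # 0x301
print(decode_old_dev(0x0100))      # (1, 0)
```

The same arithmetic can be used to work out the number to write for any other block device whose major and minor numbers fit in 8 bits.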
|
||||||
|
|
||||||
Note that the mechanism is incompatible with NFS and similar file
|
Note that the mechanism is incompatible with NFS and similar file
|
||||||
systems.
|
systems.
|
||||||
|
|
||||||
This old, deprecated mechanism is commonly called "change_root", while
|
This old, deprecated mechanism is commonly called ``change_root``, while
|
||||||
the new, supported mechanism is called "pivot_root".
|
the new, supported mechanism is called ``pivot_root``.
|
||||||
|
|
||||||
|
|
||||||
Mixed change_root and pivot_root mechanism
|
Mixed change_root and pivot_root mechanism
|
||||||
------------------------------------------
|
------------------------------------------
|
||||||
|
|
||||||
In case you did not want to use root=/dev/ram0 to trigger the pivot_root
|
In case you did not want to use ``root=/dev/ram0`` to trigger the pivot_root
|
||||||
mechanism, you may create both /linuxrc and /sbin/init in your initrd image.
|
mechanism, you may create both ``/linuxrc`` and ``/sbin/init`` in your initrd
|
||||||
|
image.
|
||||||
|
|
||||||
/linuxrc would contain only the following:
|
``/linuxrc`` would contain only the following::
|
||||||
|
|
||||||
#! /bin/sh
|
#! /bin/sh
|
||||||
mount -n -t proc proc /proc
|
mount -n -t proc proc /proc
|
||||||
@@ -350,17 +367,17 @@ echo 0x0100 >/proc/sys/kernel/real-root-dev
|
|||||||
umount -n /proc
|
umount -n /proc
|
||||||
|
|
||||||
Once linuxrc exited, the kernel would mount again your initrd as root,
|
Once linuxrc exited, the kernel would mount again your initrd as root,
|
||||||
this time executing /sbin/init. Again, it would be the duty of this init
|
this time executing ``/sbin/init``. Again, it would be the duty of this init
|
||||||
to build the right environment (maybe using the root= device passed on
|
to build the right environment (maybe using the ``root= device`` passed on
|
||||||
the cmdline) before the final execution of the real /sbin/init.
|
the cmdline) before the final execution of the real ``/sbin/init``.
|
||||||
|
|
||||||
|
|
||||||
Resources
|
Resources
|
||||||
---------
|
---------
|
||||||
|
|
||||||
[1] Almesberger, Werner; "Booting Linux: The History and the Future"
|
.. [#f1] Almesberger, Werner; "Booting Linux: The History and the Future"
|
||||||
http://www.almesberger.net/cv/papers/ols2k-9.ps.gz
|
http://www.almesberger.net/cv/papers/ols2k-9.ps.gz
|
||||||
[2] newlib package (experimental), with initrd example
|
.. [#f2] newlib package (experimental), with initrd example
|
||||||
http://sources.redhat.com/newlib/
|
https://www.sourceware.org/newlib/
|
||||||
[3] util-linux: Miscellaneous utilities for Linux
|
.. [#f3] util-linux: Miscellaneous utilities for Linux
|
||||||
http://www.kernel.org/pub/linux/utils/util-linux/
|
https://www.kernel.org/pub/linux/utils/util-linux/
|
||||||
@@ -19,7 +19,7 @@ other program after you have done the following:
|
|||||||
as the application itself).
|
as the application itself).
|
||||||
|
|
||||||
2) You have to compile BINFMT_MISC either as a module or into
|
2) You have to compile BINFMT_MISC either as a module or into
|
||||||
the kernel (CONFIG_BINFMT_MISC) and set it up properly.
|
the kernel (``CONFIG_BINFMT_MISC``) and set it up properly.
|
||||||
If you choose to compile it as a module, you will have
|
If you choose to compile it as a module, you will have
|
||||||
to insert it manually with modprobe/insmod, as kmod
|
to insert it manually with modprobe/insmod, as kmod
|
||||||
cannot easily be supported with binfmt_misc.
|
cannot easily be supported with binfmt_misc.
|
||||||
@@ -27,37 +27,49 @@ other program after you have done the following:
|
|||||||
more about the configuration process.
|
more about the configuration process.
|
||||||
|
|
||||||
3) Add the following configuration items to binfmt_misc
|
3) Add the following configuration items to binfmt_misc
|
||||||
(you should really have read binfmt_misc.txt now):
|
(you should really have read ``binfmt_misc.txt`` now):
|
||||||
support for Java applications:
|
support for Java applications::
|
||||||
|
|
||||||
':Java:M::\xca\xfe\xba\xbe::/usr/local/bin/javawrapper:'
|
':Java:M::\xca\xfe\xba\xbe::/usr/local/bin/javawrapper:'
|
||||||
support for executable Jar files:
|
|
||||||
|
support for executable Jar files::
|
||||||
|
|
||||||
':ExecutableJAR:E::jar::/usr/local/bin/jarwrapper:'
|
':ExecutableJAR:E::jar::/usr/local/bin/jarwrapper:'
|
||||||
support for Java Applets:
|
|
||||||
|
support for Java Applets::
|
||||||
|
|
||||||
':Applet:E::html::/usr/bin/appletviewer:'
|
':Applet:E::html::/usr/bin/appletviewer:'
|
||||||
or the following, if you want to be more selective:
|
|
||||||
|
or the following, if you want to be more selective::
|
||||||
|
|
||||||
':Applet:M::<!--applet::/usr/bin/appletviewer:'
|
':Applet:M::<!--applet::/usr/bin/appletviewer:'
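The registration strings above follow the binfmt_misc field layout ``:name:type:offset:magic:mask:interpreter:flags``. The following is a small sketch that assembles such a string from its fields; the function name ``binfmt_entry`` is hypothetical, and the wrapper paths are the ones used in this document and will need adjusting for your system.

```python
def binfmt_entry(name, kind, magic, interpreter, offset="", mask="", flags=""):
    """Build a binfmt_misc registration string from its colon-separated fields."""
    if kind not in ("M", "E"):
        raise ValueError("type must be 'M' (magic) or 'E' (extension)")
    fields = [name, kind, offset, magic, mask, interpreter, flags]
    # The string starts with the delimiter, followed by the seven fields.
    return ":" + ":".join(fields)


# Magic-number match for compiled Java classes (bytes CA FE BA BE).
print(binfmt_entry("Java", "M", r"\xca\xfe\xba\xbe", "/usr/local/bin/javawrapper"))
# Extension match for executable Jar files.
print(binfmt_entry("ExecutableJAR", "E", "jar", "/usr/local/bin/jarwrapper"))
```

Building the string programmatically makes it harder to miscount the empty fields (offset, mask, flags), which is an easy mistake when typing the entries by hand.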
|
||||||
|
|
||||||
Of course you have to fix the path names. The path/file names given in this
|
Of course you have to fix the path names. The path/file names given in this
|
||||||
document match the Debian 2.1 system. (i.e. jdk installed in /usr,
|
document match the Debian 2.1 system. (i.e. jdk installed in ``/usr``,
|
||||||
custom wrappers from this document in /usr/local)
|
custom wrappers from this document in ``/usr/local``)
|
||||||
|
|
||||||
Note, that for the more selective applet support you have to modify
|
Note, that for the more selective applet support you have to modify
|
||||||
existing html-files to contain <!--applet--> in the first line
|
existing html-files to contain ``<!--applet-->`` in the first line
|
||||||
('<' has to be the first character!) to let this work!
|
(``<`` has to be the first character!) to let this work!
|
||||||
|
|
||||||
For the compiled Java programs you need a wrapper script like the
|
For the compiled Java programs you need a wrapper script like the
|
||||||
following (this is because Java is broken in case of the filename
|
following (this is because Java is broken in case of the filename
|
||||||
handling), again fix the path names, both in the script and in the
|
handling), again fix the path names, both in the script and in the
|
||||||
above given configuration string.
|
above given configuration string.
|
||||||
|
|
||||||
You, too, need the little program after the script. Compile like
|
You, too, need the little program after the script. Compile like::
|
||||||
|
|
||||||
gcc -O2 -o javaclassname javaclassname.c
|
gcc -O2 -o javaclassname javaclassname.c
|
||||||
and stick it to /usr/local/bin.
|
|
||||||
|
and stick it to ``/usr/local/bin``.
|
||||||
|
|
||||||
Both the javawrapper shellscript and the javaclassname program
|
Both the javawrapper shellscript and the javaclassname program
|
||||||
were supplied by Colin J. Watson <cjw44@cam.ac.uk>.
|
were supplied by Colin J. Watson <cjw44@cam.ac.uk>.
|
||||||
|
|
||||||
====================== Cut here ===================
|
Javawrapper shell script:
|
||||||
|
|
||||||
|
.. code-block:: sh
|
||||||
|
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
# /usr/local/bin/javawrapper - the wrapper for binfmt_misc/java
|
# /usr/local/bin/javawrapper - the wrapper for binfmt_misc/java
|
||||||
|
|
||||||
@@ -144,10 +156,11 @@ fi
|
|||||||
|
|
||||||
shift
|
shift
|
||||||
/usr/bin/java $FQCLASS "$@"
|
/usr/bin/java $FQCLASS "$@"
|
||||||
====================== Cut here ===================
|
|
||||||
|
|
||||||
|
javaclassname.c:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
====================== Cut here ===================
|
|
||||||
/* javaclassname.c
|
/* javaclassname.c
|
||||||
*
|
*
|
||||||
* Extracts the class name from a Java class file; intended for use in a Java
|
* Extracts the class name from a Java class file; intended for use in a Java
|
||||||
@@ -350,18 +363,18 @@ int main(int argc, char **argv)
|
|||||||
fclose(classfile);
|
fclose(classfile);
|
||||||
return 0;
|
return 0;
|
||||||
}
|
}
|
||||||
====================== Cut here ===================
|
|
||||||
|
|
||||||
|
jarwrapper::
|
||||||
|
|
||||||
====================== Cut here ===================
|
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
# /usr/local/java/bin/jarwrapper - the wrapper for binfmt_misc/jar
|
# /usr/local/java/bin/jarwrapper - the wrapper for binfmt_misc/jar
|
||||||
|
|
||||||
java -jar $1
|
java -jar $1
|
||||||
====================== Cut here ===================
|
|
||||||
|
|
||||||
|
|
||||||
Now simply chmod +x the .class, .jar and/or .html files you want to execute.
|
Now simply ``chmod +x`` the ``.class``, ``.jar`` and/or ``.html`` files you
|
||||||
|
want to execute.
|
||||||
|
|
||||||
To add a Java program to your path best put a symbolic link to the main
|
To add a Java program to your path best put a symbolic link to the main
|
||||||
.class file into /usr/bin (or another place you like) omitting the .class
|
.class file into /usr/bin (or another place you like) omitting the .class
|
||||||
extension. The directory containing the original .class file will be
|
extension. The directory containing the original .class file will be
|
||||||
@@ -371,29 +384,36 @@ added to your CLASSPATH during execution.
|
|||||||
To test your new setup, enter in the following simple Java app, and name
|
To test your new setup, enter in the following simple Java app, and name
|
||||||
it "HelloWorld.java":
|
it "HelloWorld.java":
|
||||||
|
|
||||||
|
.. code-block:: java
|
||||||
|
|
||||||
class HelloWorld {
|
class HelloWorld {
|
||||||
public static void main(String args[]) {
|
public static void main(String args[]) {
|
||||||
System.out.println("Hello World!");
|
System.out.println("Hello World!");
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
Now compile the application with:
|
Now compile the application with::
|
||||||
|
|
||||||
javac HelloWorld.java
|
javac HelloWorld.java
|
||||||
|
|
||||||
Set the executable permissions of the binary file, with:
|
Set the executable permissions of the binary file, with::
|
||||||
|
|
||||||
chmod 755 HelloWorld.class
|
chmod 755 HelloWorld.class
|
||||||
|
|
||||||
And then execute it:
|
And then execute it::
|
||||||
|
|
||||||
./HelloWorld.class
|
./HelloWorld.class
|
||||||
|
|
||||||
|
|
||||||
To execute Java Jar files, simple chmod the *.jar files to include
|
To execute Java Jar files, simple chmod the ``*.jar`` files to include
|
||||||
the execution bit, then just do
|
the execution bit, then just do::
|
||||||
|
|
||||||
./Application.jar
|
./Application.jar
|
||||||
|
|
||||||
|
|
||||||
To execute Java Applets, simple chmod the *.html files to include
|
To execute Java Applets, simple chmod the ``*.html`` files to include
|
||||||
the execution bit, then just do
|
the execution bit, then just do::
|
||||||
|
|
||||||
./Applet.html
|
./Applet.html
|
||||||
|
|
||||||
|
|
||||||
@@ -401,4 +421,3 @@ originally by Brian A. Lantz, brian@lantz.com
|
|||||||
heavily edited for binfmt_misc by Richard Günther
|
heavily edited for binfmt_misc by Richard Günther
|
||||||
new scripts by Colin J. Watson <cjw44@cam.ac.uk>
|
new scripts by Colin J. Watson <cjw44@cam.ac.uk>
|
||||||
added executable Jar file support by Kurt Huwig <kurt@iku-netz.de>
|
added executable Jar file support by Kurt Huwig <kurt@iku-netz.de>
|
||||||
|
|
||||||
209
Documentation/admin-guide/kernel-parameters.rst
Normal file
@@ -0,0 +1,209 @@
|
|||||||
|
The kernel's command-line parameters
|
||||||
|
====================================
|
||||||
|
|
||||||
|
The following is a consolidated list of the kernel parameters as
|
||||||
|
implemented by the __setup(), core_param() and module_param() macros
|
||||||
|
and sorted into English Dictionary order (defined as ignoring all
|
||||||
|
punctuation and sorting digits before letters in a case insensitive
|
||||||
|
manner), and with descriptions where known.
|
||||||
|
|
||||||
|
The kernel parses parameters from the kernel command line up to "--";
|
||||||
|
if it doesn't recognize a parameter and it doesn't contain a '.', the
|
||||||
|
parameter gets passed to init: parameters with '=' go into init's
|
||||||
|
environment, others are passed as command line arguments to init.
|
||||||
|
Everything after "--" is passed as an argument to init.
|
||||||
|
|
||||||
|
Module parameters can be specified in two ways: via the kernel command
|
||||||
|
line with a module name prefix, or via modprobe, e.g.::
|
||||||
|
|
||||||
|
(kernel command line) usbcore.blinkenlights=1
|
||||||
|
(modprobe command line) modprobe usbcore blinkenlights=1
|
||||||
|
|
||||||
|
Parameters for modules which are built into the kernel need to be
|
||||||
|
specified on the kernel command line. modprobe looks through the
|
||||||
|
kernel command line (/proc/cmdline) and collects module parameters
|
||||||
|
when it loads a module, so the kernel command line can be used for
|
||||||
|
loadable modules too.
|
||||||
|
|
||||||
|
Hyphens (dashes) and underscores are equivalent in parameter names, so::
|
||||||
|
|
||||||
|
log_buf_len=1M print-fatal-signals=1
|
||||||
|
|
||||||
|
can also be entered as::
|
||||||
|
|
||||||
|
log-buf-len=1M print_fatal_signals=1
|
||||||
|
|
||||||
|
Double-quotes can be used to protect spaces in values, e.g.::
|
||||||
|
|
||||||
|
param="spaces in here"
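The rules described so far (the ``--`` separator, hyphen/underscore equivalence in names, and double-quoted values) can be illustrated with a rough sketch. This is not the kernel's parser: it uses Python's ``shlex`` for quoting, which also interprets single quotes and backslashes, and it normalizes every name before ``--`` regardless of whether the kernel would recognize it. The function name ``split_cmdline`` is hypothetical.

```python
import shlex


def split_cmdline(cmdline):
    """Split a command line into (name, value) parameters and init arguments."""
    tokens = shlex.split(cmdline)
    if "--" in tokens:
        cut = tokens.index("--")
        kernel_part, init_args = tokens[:cut], tokens[cut + 1:]
    else:
        kernel_part, init_args = tokens, []

    params = []
    for token in kernel_part:
        name, sep, value = token.partition("=")
        # Hyphens and underscores are equivalent in names, but not in values.
        params.append((name.replace("-", "_"), value if sep else None))
    return params, init_args


params, init_args = split_cmdline(
    'log-buf-len=1M param="spaces in here" quiet -- single'
)
print(params)     # [('log_buf_len', '1M'), ('param', 'spaces in here'), ('quiet', None)]
print(init_args)  # ['single']
```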
|
||||||
|
|
||||||
|
cpu lists:
|
||||||
|
----------
|
||||||
|
|
||||||
|
Some kernel parameters take a list of CPUs as a value, e.g. isolcpus,
|
||||||
|
nohz_full, irqaffinity, rcu_nocbs. The format of this list is:
|
||||||
|
|
||||||
|
<cpu number>,...,<cpu number>
|
||||||
|
|
||||||
|
or
|
||||||
|
|
||||||
|
<cpu number>-<cpu number>
|
||||||
|
(must be a positive range in ascending order)
|
||||||
|
|
||||||
|
or a mixture
|
||||||
|
|
||||||
|
<cpu number>,...,<cpu number>-<cpu number>
|
||||||
|
|
||||||
|
Note that for the special case of a range one can split the range into equal
|
||||||
|
sized groups and for each group use some amount from the beginning of that
|
||||||
|
group:
|
||||||
|
|
||||||
|
<cpu number>-<cpu number>:<used size>/<group size>
|
||||||
|
|
||||||
|
For example, one can add the following parameter to the command line:
|
||||||
|
|
||||||
|
isolcpus=1,2,10-20,100-2000:2/25
|
||||||
|
|
||||||
|
where the final item represents CPUs 100,101,125,126,150,151,...
|
||||||
|
|
||||||
|
|
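As an illustration of the CPU list format described above, the following sketch expands a list into explicit CPU numbers. It is a user-space approximation, not the kernel's own parser; the function name `expand_cpu_list` and the exact validation rules are assumptions made for this example.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as '1,2,10-20,100-2000:2/25' into CPU numbers."""
    cpus = set()
    for item in spec.split(","):
        used, group = 1, 1  # defaults: take every CPU in the range
        if ":" in item:
            item, pattern = item.split(":", 1)
            used_text, group_text = pattern.split("/", 1)
            used, group = int(used_text), int(group_text)
        first, _, last = item.partition("-")
        start = int(first)
        end = int(last) if last else start
        # Assumed checks: ascending range, and a sensible used/group pair.
        if end < start or used < 1 or group < used:
            raise ValueError("invalid CPU list item: %r" % item)
        # Walk the range group by group, taking the first `used` CPUs of each.
        for base in range(start, end + 1, group):
            for cpu in range(base, min(base + used, end + 1)):
                cpus.add(cpu)
    return sorted(cpus)
```

With this sketch, the final item of the example, `100-2000:2/25`, starts with CPUs 100, 101, 125, 126, 150, 151, matching the description above.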
||||||
|
|
||||||
|
This document may not be entirely up to date and comprehensive. The command
|
||||||
|
"modinfo -p ${modulename}" shows a current list of all parameters of a loadable
|
||||||
|
module. Loadable modules, after being loaded into the running kernel, also
|
||||||
|
reveal their parameters in /sys/module/${modulename}/parameters/. Some of these
|
||||||
|
parameters may be changed at runtime by the command
|
||||||
|
``echo -n ${value} > /sys/module/${modulename}/parameters/${parm}``.
|
||||||
|
|
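The sysfs layout mentioned above can also be read programmatically. The sketch below lists the parameters of a loaded module by reading the files under its parameters directory; the helper name `read_module_parameters` and the `sysfs_root` argument are hypothetical and exist only so the example can be pointed at a different directory tree.

```python
import os


def read_module_parameters(module, sysfs_root="/sys"):
    """Return {parameter name: current value} for a loaded module."""
    param_dir = os.path.join(sysfs_root, "module", module, "parameters")
    values = {}
    if not os.path.isdir(param_dir):
        return values  # module not loaded, or it exposes no parameters
    for name in sorted(os.listdir(param_dir)):
        try:
            with open(os.path.join(param_dir, name)) as handle:
                values[name] = handle.read().strip()
        except OSError:
            values[name] = None  # not readable with current permissions
    return values
```

Writing a new value back works the same way as the echo command shown above, and only for parameters whose sysfs file is writable.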
||||||
|
The parameters listed below are only valid if certain kernel build options were
|
||||||
|
enabled and if respective hardware is present. The text in square brackets at
|
||||||
|
the beginning of each description states the restrictions within which a
|
||||||
|
parameter is applicable::
|
||||||
|
|
||||||
|
ACPI ACPI support is enabled.
|
||||||
|
AGP AGP (Accelerated Graphics Port) is enabled.
|
||||||
|
ALSA ALSA sound support is enabled.
|
||||||
|
APIC APIC support is enabled.
|
||||||
|
APM Advanced Power Management support is enabled.
|
||||||
|
ARM ARM architecture is enabled.
|
||||||
|
AVR32 AVR32 architecture is enabled.
|
||||||
|
AX25 Appropriate AX.25 support is enabled.
|
||||||
|
BLACKFIN Blackfin architecture is enabled.
|
||||||
|
CLK Common clock infrastructure is enabled.
|
||||||
|
CMA Contiguous Memory Area support is enabled.
|
||||||
|
DRM Direct Rendering Management support is enabled.
|
||||||
|
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
|
||||||
|
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
|
||||||
|
EFI EFI Partitioning (GPT) is enabled
|
||||||
|
EIDE EIDE/ATAPI support is enabled.
|
||||||
|
EVM Extended Verification Module
|
||||||
|
FB The frame buffer device is enabled.
|
||||||
|
FTRACE Function tracing enabled.
|
||||||
|
GCOV GCOV profiling is enabled.
|
||||||
|
HW Appropriate hardware is enabled.
|
||||||
|
IA-64 IA-64 architecture is enabled.
|
||||||
|
IMA Integrity measurement architecture is enabled.
|
||||||
|
IOSCHED More than one I/O scheduler is enabled.
|
||||||
|
IP_PNP IP DHCP, BOOTP, or RARP is enabled.
|
||||||
|
IPV6 IPv6 support is enabled.
|
||||||
|
ISAPNP ISA PnP code is enabled.
|
||||||
|
ISDN Appropriate ISDN support is enabled.
|
||||||
|
JOY Appropriate joystick support is enabled.
|
||||||
|
KGDB Kernel debugger support is enabled.
|
||||||
|
KVM Kernel Virtual Machine support is enabled.
|
||||||
|
LIBATA Libata driver is enabled
|
||||||
|
LP Printer support is enabled.
|
||||||
|
LOOP Loopback device support is enabled.
|
||||||
|
M68k M68k architecture is enabled.
|
||||||
|
These options have more detailed description inside of
|
||||||
|
Documentation/m68k/kernel-options.txt.
|
||||||
|
MDA MDA console support is enabled.
|
||||||
|
MIPS MIPS architecture is enabled.
|
||||||
|
MOUSE Appropriate mouse support is enabled.
|
||||||
|
MSI Message Signaled Interrupts (PCI).
|
||||||
|
MTD MTD (Memory Technology Device) support is enabled.
|
||||||
|
NET Appropriate network support is enabled.
|
||||||
|
NUMA NUMA support is enabled.
|
||||||
|
NFS Appropriate NFS support is enabled.
|
||||||
|
OSS OSS sound support is enabled.
|
||||||
|
PV_OPS A paravirtualized kernel is enabled.
|
||||||
|
PARIDE The ParIDE (parallel port IDE) subsystem is enabled.
|
||||||
|
PARISC The PA-RISC architecture is enabled.
|
||||||
|
PCI PCI bus support is enabled.
|
||||||
|
PCIE PCI Express support is enabled.
|
||||||
|
PCMCIA The PCMCIA subsystem is enabled.
|
||||||
|
PNP Plug & Play support is enabled.
|
||||||
|
PPC PowerPC architecture is enabled.
|
||||||
|
PPT Parallel port support is enabled.
|
||||||
|
PS2 Appropriate PS/2 support is enabled.
|
||||||
|
RAM RAM disk support is enabled.
|
||||||
|
S390 S390 architecture is enabled.
|
||||||
|
SCSI Appropriate SCSI support is enabled.
|
||||||
|
A lot of drivers have their options described inside
|
||||||
|
the Documentation/scsi/ sub-directory.
|
||||||
|
SECURITY Different security models are enabled.
|
||||||
|
SELINUX SELinux support is enabled.
|
||||||
|
APPARMOR AppArmor support is enabled.
|
||||||
|
SERIAL Serial support is enabled.
|
||||||
|
SH SuperH architecture is enabled.
|
||||||
|
SMP The kernel is an SMP kernel.
|
||||||
|
SPARC Sparc architecture is enabled.
|
||||||
|
SWSUSP Software suspend (hibernation) is enabled.
|
||||||
|
SUSPEND System suspend states are enabled.
|
||||||
|
TPM TPM drivers are enabled.
|
||||||
|
TS Appropriate touchscreen support is enabled.
|
||||||
|
UMS USB Mass Storage support is enabled.
|
||||||
|
USB USB support is enabled.
|
||||||
|
USBHID USB Human Interface Device support is enabled.
|
||||||
|
V4L Video For Linux support is enabled.
|
||||||
|
VMMIO Driver for memory mapped virtio devices is enabled.
|
||||||
|
VGA The VGA console has been enabled.
|
||||||
|
VT Virtual terminal support is enabled.
|
||||||
|
WDT Watchdog support is enabled.
|
||||||
|
XT IBM PC/XT MFM hard disk support is enabled.
|
||||||
|
X86-32 X86-32, aka i386 architecture is enabled.
|
||||||
|
X86-64 X86-64 architecture is enabled.
|
||||||
|
More X86-64 boot options can be found in
|
||||||
|
Documentation/x86/x86_64/boot-options.txt .
|
||||||
|
X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
|
||||||
|
X86_UV SGI UV support is enabled.
|
||||||
|
XEN Xen support is enabled
|
||||||
|
|
||||||
|
In addition, the following text indicates that the option::
|
||||||
|
|
||||||
|
BUGS= Relates to possible processor bugs on the said processor.
|
||||||
|
KNL Is a kernel start-up parameter.
|
||||||
|
BOOT Is a boot loader parameter.
|
||||||
|
|
||||||
|
Parameters denoted with BOOT are actually interpreted by the boot
|
||||||
|
loader, and have no meaning to the kernel directly.
|
||||||
|
Do not modify the syntax of boot loader parameters without extreme
|
||||||
|
need or coordination with <Documentation/x86/boot.txt>.
|
||||||
|
|
||||||
|
There are also arch-specific kernel-parameters not documented here.
|
||||||
|
See for example <Documentation/x86/x86_64/boot-options.txt>.
|
||||||
|
|
||||||
|
Note that ALL kernel parameters listed below are CASE SENSITIVE, and that
|
||||||
|
a trailing = on the name of any parameter states that that parameter will
|
||||||
|
be entered as an environment variable, whereas its absence indicates that
|
||||||
|
it will appear as a kernel argument readable via /proc/cmdline by programs
|
||||||
|
running once the system is up.
|
||||||
|
|
||||||
|
The number of kernel parameters is not limited, but the length of the
|
||||||
|
complete command line (parameters including spaces etc.) is limited to
|
||||||
|
a fixed number of characters. This limit depends on the architecture
|
||||||
|
and is between 256 and 4096 characters. It is defined in the file
|
||||||
|
./include/asm/setup.h as COMMAND_LINE_SIZE.
|
||||||
|
|
||||||
|
Finally, the [KMG] suffix is commonly described after a number of kernel
|
||||||
|
parameter values. These 'K', 'M', and 'G' letters represent the _binary_
|
||||||
|
multipliers 'Kilo', 'Mega', and 'Giga', equalling 2^10, 2^20, and 2^30
|
||||||
|
bytes respectively. Such letter suffixes can also be entirely omitted:
|
||||||
|
|
||||||
|
.. include:: kernel-parameters.txt
|
||||||
|
:literal:
|
||||||
|
|
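The binary [KMG] suffixes described above can be sketched as a small conversion helper. This is an illustration only, not the kernel's parsing routine; the name `parse_size` is hypothetical, and accepting lower-case suffixes is an assumption of this example.

```python
# Binary multipliers: K = 2^10, M = 2^20, G = 2^30.
_MULTIPLIERS = {"K": 2**10, "M": 2**20, "G": 2**30}


def parse_size(value):
    """Convert a value such as '1M' or '4096' to a number of bytes."""
    value = value.strip()
    if not value:
        raise ValueError("empty size")
    suffix = value[-1].upper()
    if suffix in _MULTIPLIERS:
        return int(value[:-1]) * _MULTIPLIERS[suffix]
    return int(value)  # the suffix may be omitted entirely
```

For example, `log_buf_len=1M` corresponds to `parse_size("1M")`, which is 1048576 bytes.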
||||||
|
Todo
|
||||||
|
----
|
||||||
|
|
||||||
|
Add more DRM drivers.
|
||||||
@@ -1,202 +1,3 @@
|
|||||||
Kernel Parameters
|
|
||||||
~~~~~~~~~~~~~~~~~
|
|
||||||
|
|
||||||
The following is a consolidated list of the kernel parameters as
|
|
||||||
implemented by the __setup(), core_param() and module_param() macros
|
|
||||||
and sorted into English Dictionary order (defined as ignoring all
|
|
||||||
punctuation and sorting digits before letters in a case insensitive
|
|
||||||
manner), and with descriptions where known.
|
|
||||||
|
|
||||||
The kernel parses parameters from the kernel command line up to "--";
|
|
||||||
if it doesn't recognize a parameter and it doesn't contain a '.', the
|
|
||||||
parameter gets passed to init: parameters with '=' go into init's
|
|
||||||
environment, others are passed as command line arguments to init.
|
|
||||||
Everything after "--" is passed as an argument to init.
|
|
||||||
|
|
||||||
Module parameters can be specified in two ways: via the kernel command
|
|
||||||
line with a module name prefix, or via modprobe, e.g.:
|
|
||||||
|
|
||||||
(kernel command line) usbcore.blinkenlights=1
|
|
||||||
(modprobe command line) modprobe usbcore blinkenlights=1
|
|
||||||
|
|
||||||
Parameters for modules which are built into the kernel need to be
|
|
||||||
specified on the kernel command line. modprobe looks through the
|
|
||||||
kernel command line (/proc/cmdline) and collects module parameters
|
|
||||||
when it loads a module, so the kernel command line can be used for
|
|
||||||
loadable modules too.
|
|
||||||
|
|
||||||
Hyphens (dashes) and underscores are equivalent in parameter names, so
|
|
||||||
log_buf_len=1M print-fatal-signals=1
|
|
||||||
can also be entered as
|
|
||||||
log-buf-len=1M print_fatal_signals=1
|
|
||||||
|
|
||||||
Double-quotes can be used to protect spaces in values, e.g.:
|
|
||||||
param="spaces in here"
|
|
||||||
|
|
||||||
cpu lists:
|
|
||||||
----------
|
|
||||||
|
|
||||||
Some kernel parameters take a list of CPUs as a value, e.g. isolcpus,
|
|
||||||
nohz_full, irqaffinity, rcu_nocbs. The format of this list is:
|
|
||||||
|
|
||||||
<cpu number>,...,<cpu number>
|
|
||||||
|
|
||||||
or
|
|
||||||
|
|
||||||
<cpu number>-<cpu number>
|
|
||||||
(must be a positive range in ascending order)
|
|
||||||
|
|
||||||
or a mixture
|
|
||||||
|
|
||||||
<cpu number>,...,<cpu number>-<cpu number>
|
|
||||||
|
|
||||||
Note that for the special case of a range one can split the range into equal
|
|
||||||
sized groups and for each group use some amount from the beginning of that
|
|
||||||
group:
|
|
||||||
|
|
||||||
<cpu number>-<cpu number>:<used size>/<group size>
|
|
||||||
|
|
||||||
For example one can add to the command line following parameter:
|
|
||||||
|
|
||||||
isolcpus=1,2,10-20,100-2000:2/25
|
|
||||||
|
|
||||||
where the final item represents CPUs 100,101,125,126,150,151,...
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
This document may not be entirely up to date and comprehensive. The command
|
|
||||||
"modinfo -p ${modulename}" shows a current list of all parameters of a loadable
|
|
||||||
module. Loadable modules, after being loaded into the running kernel, also
|
|
||||||
reveal their parameters in /sys/module/${modulename}/parameters/. Some of these
|
|
||||||
parameters may be changed at runtime by the command
|
|
||||||
"echo -n ${value} > /sys/module/${modulename}/parameters/${parm}".
|
|
||||||
|
|
||||||
The parameters listed below are only valid if certain kernel build options were
|
|
||||||
enabled and if respective hardware is present. The text in square brackets at
|
|
||||||
the beginning of each description states the restrictions within which a
|
|
||||||
parameter is applicable:
|
|
||||||
|
|
||||||
ACPI ACPI support is enabled.
|
|
||||||
AGP AGP (Accelerated Graphics Port) is enabled.
|
|
||||||
ALSA ALSA sound support is enabled.
|
|
||||||
APIC APIC support is enabled.
|
|
||||||
APM Advanced Power Management support is enabled.
|
|
||||||
ARM ARM architecture is enabled.
|
|
||||||
AVR32 AVR32 architecture is enabled.
|
|
||||||
AX25 Appropriate AX.25 support is enabled.
|
|
||||||
BLACKFIN Blackfin architecture is enabled.
|
|
||||||
CLK Common clock infrastructure is enabled.
|
|
||||||
CMA Contiguous Memory Area support is enabled.
|
|
||||||
DRM Direct Rendering Management support is enabled.
|
|
||||||
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
|
|
||||||
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
|
|
||||||
EFI EFI Partitioning (GPT) is enabled
|
|
||||||
EIDE EIDE/ATAPI support is enabled.
|
|
||||||
EVM Extended Verification Module
|
|
||||||
FB The frame buffer device is enabled.
|
|
||||||
FTRACE Function tracing enabled.
|
|
||||||
GCOV GCOV profiling is enabled.
|
|
||||||
HW Appropriate hardware is enabled.
|
|
||||||
IA-64 IA-64 architecture is enabled.
|
|
||||||
IMA Integrity measurement architecture is enabled.
|
|
||||||
IOSCHED More than one I/O scheduler is enabled.
|
|
||||||
IP_PNP IP DHCP, BOOTP, or RARP is enabled.
|
|
||||||
IPV6 IPv6 support is enabled.
|
|
||||||
ISAPNP ISA PnP code is enabled.
|
|
||||||
ISDN Appropriate ISDN support is enabled.
|
|
||||||
JOY Appropriate joystick support is enabled.
|
|
||||||
KGDB Kernel debugger support is enabled.
|
|
||||||
KVM Kernel Virtual Machine support is enabled.
|
|
||||||
LIBATA Libata driver is enabled
|
|
||||||
LP Printer support is enabled.
|
|
||||||
LOOP Loopback device support is enabled.
|
|
||||||
M68k M68k architecture is enabled.
|
|
||||||
These options have more detailed description inside of
|
|
||||||
Documentation/m68k/kernel-options.txt.
|
|
||||||
MDA MDA console support is enabled.
|
|
||||||
MIPS MIPS architecture is enabled.
|
|
||||||
MOUSE Appropriate mouse support is enabled.
|
|
||||||
MSI Message Signaled Interrupts (PCI).
|
|
||||||
MTD MTD (Memory Technology Device) support is enabled.
|
|
||||||
NET Appropriate network support is enabled.
|
|
||||||
NUMA NUMA support is enabled.
|
|
||||||
NFS Appropriate NFS support is enabled.
|
|
||||||
OSS OSS sound support is enabled.
|
|
||||||
PV_OPS A paravirtualized kernel is enabled.
|
|
||||||
PARIDE The ParIDE (parallel port IDE) subsystem is enabled.
|
|
||||||
PARISC The PA-RISC architecture is enabled.
|
|
||||||
PCI PCI bus support is enabled.
|
|
||||||
PCIE PCI Express support is enabled.
|
|
||||||
PCMCIA The PCMCIA subsystem is enabled.
|
|
||||||
PNP Plug & Play support is enabled.
|
|
||||||
PPC PowerPC architecture is enabled.
|
|
||||||
PPT Parallel port support is enabled.
|
|
||||||
PS2 Appropriate PS/2 support is enabled.
|
|
||||||
RAM RAM disk support is enabled.
|
|
||||||
S390 S390 architecture is enabled.
|
|
||||||
SCSI Appropriate SCSI support is enabled.
|
|
||||||
A lot of drivers have their options described inside
|
|
||||||
the Documentation/scsi/ sub-directory.
|
|
||||||
SECURITY Different security models are enabled.
|
|
||||||
SELINUX SELinux support is enabled.
|
|
||||||
APPARMOR AppArmor support is enabled.
|
|
||||||
SERIAL Serial support is enabled.
|
|
||||||
SH SuperH architecture is enabled.
|
|
||||||
SMP The kernel is an SMP kernel.
|
|
||||||
SPARC Sparc architecture is enabled.
|
|
||||||
SWSUSP Software suspend (hibernation) is enabled.
|
|
||||||
SUSPEND System suspend states are enabled.
|
|
||||||
TPM TPM drivers are enabled.
|
|
||||||
TS Appropriate touchscreen support is enabled.
|
|
||||||
UMS USB Mass Storage support is enabled.
|
|
||||||
USB USB support is enabled.
|
|
||||||
USBHID USB Human Interface Device support is enabled.
|
|
||||||
V4L Video For Linux support is enabled.
|
|
||||||
VMMIO Driver for memory mapped virtio devices is enabled.
|
|
||||||
VGA The VGA console has been enabled.
|
|
||||||
VT Virtual terminal support is enabled.
|
|
||||||
WDT Watchdog support is enabled.
|
|
||||||
XT IBM PC/XT MFM hard disk support is enabled.
|
|
||||||
X86-32 X86-32, aka i386 architecture is enabled.
|
|
||||||
X86-64 X86-64 architecture is enabled.
|
|
||||||
More X86-64 boot options can be found in
|
|
||||||
Documentation/x86/x86_64/boot-options.txt .
|
|
||||||
X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
|
|
||||||
X86_UV SGI UV support is enabled.
|
|
||||||
XEN Xen support is enabled
|
|
||||||
|
|
||||||
In addition, the following text indicates that the option:
|
|
||||||
|
|
||||||
BUGS= Relates to possible processor bugs on the said processor.
|
|
||||||
KNL Is a kernel start-up parameter.
|
|
||||||
BOOT Is a boot loader parameter.
|
|
||||||
|
|
||||||
Parameters denoted with BOOT are actually interpreted by the boot
|
|
||||||
loader, and have no meaning to the kernel directly.
|
|
||||||
Do not modify the syntax of boot loader parameters without extreme
|
|
||||||
need or coordination with <Documentation/x86/boot.txt>.
|
|
||||||
|
|
||||||
There are also arch-specific kernel-parameters not documented here.
|
|
||||||
See for example <Documentation/x86/x86_64/boot-options.txt>.
|
|
||||||
|
|
||||||
Note that ALL kernel parameters listed below are CASE SENSITIVE, and that
|
|
||||||
a trailing = on the name of any parameter states that that parameter will
|
|
||||||
be entered as an environment variable, whereas its absence indicates that
|
|
||||||
it will appear as a kernel argument readable via /proc/cmdline by programs
|
|
||||||
running once the system is up.
|
|
||||||
|
|
||||||
The number of kernel parameters is not limited, but the length of the
|
|
||||||
complete command line (parameters including spaces etc.) is limited to
|
|
||||||
a fixed number of characters. This limit depends on the architecture
|
|
||||||
and is between 256 and 4096 characters. It is defined in the file
|
|
||||||
./include/asm/setup.h as COMMAND_LINE_SIZE.
|
|
||||||
|
|
||||||
Finally, the [KMG] suffix is commonly described after a number of kernel
|
|
||||||
parameter values. These 'K', 'M', and 'G' letters represent the _binary_
|
|
||||||
multipliers 'Kilo', 'Mega', and 'Giga', equalling 2^10, 2^20, and 2^30
|
|
||||||
bytes respectively. Such letter suffixes can also be entirely omitted.
|
|
||||||
|
|
||||||
|
|
||||||
acpi= [HW,ACPI,X86,ARM64]
|
acpi= [HW,ACPI,X86,ARM64]
|
||||||
Advanced Configuration and Power Interface
|
Advanced Configuration and Power Interface
|
||||||
Format: { force | on | off | strict | noirq | rsdt |
|
Format: { force | on | off | strict | noirq | rsdt |
|
||||||
@@ -811,7 +612,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
|
|||||||
bits, and "f" is flow control ("r" for RTS or
|
bits, and "f" is flow control ("r" for RTS or
|
||||||
omit it). Default is "9600n8".
|
omit it). Default is "9600n8".
|
||||||
|
|
||||||
See Documentation/serial-console.txt for more
|
See Documentation/admin-guide/serial-console.rst for more
|
||||||
information. See
|
information. See
|
||||||
Documentation/networking/netconsole.txt for an
|
Documentation/networking/netconsole.txt for an
|
||||||
alternative.
|
alternative.
|
||||||
@@ -2231,7 +2032,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
|
|||||||
mce=option [X86-64] See Documentation/x86/x86_64/boot-options.txt
|
mce=option [X86-64] See Documentation/x86/x86_64/boot-options.txt
|
||||||
|
|
||||||
md= [HW] RAID subsystems devices and level
|
md= [HW] RAID subsystems devices and level
|
||||||
See Documentation/md.txt.
|
See Documentation/admin-guide/md.rst.
|
||||||
|
|
||||||
mdacon= [MDA]
|
mdacon= [MDA]
|
||||||
Format: <first>,<last>
|
Format: <first>,<last>
|
||||||
@@ -3235,6 +3036,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
|
|||||||
may be specified.
|
may be specified.
|
||||||
Format: <port>,<port>....
|
Format: <port>,<port>....
|
||||||
|
|
||||||
|
powersave=off [PPC] This option disables power saving features.
|
||||||
|
It specifically disables cpuidle and sets the
|
||||||
|
platform machine description specific power_save
|
||||||
|
function to NULL. On Idle the CPU just reduces
|
||||||
|
execution priority.
|
||||||
|
|
||||||
ppc_strict_facility_enable
|
ppc_strict_facility_enable
|
||||||
[PPC] This option catches any kernel floating point,
|
[PPC] This option catches any kernel floating point,
|
||||||
Altivec, VSX and SPE outside of regions specifically
|
Altivec, VSX and SPE outside of regions specifically
|
||||||
@@ -3318,7 +3125,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
|
|||||||
r128= [HW,DRM]
|
r128= [HW,DRM]
|
||||||
|
|
||||||
raid= [HW,RAID]
|
raid= [HW,RAID]
|
||||||
See Documentation/md.txt.
|
See Documentation/admin-guide/md.rst.
|
||||||
|
|
||||||
ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
|
ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
|
||||||
See Documentation/blockdev/ramdisk.txt.
|
See Documentation/blockdev/ramdisk.txt.
|
||||||
@@ -4558,9 +4365,3 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
|
|||||||
xirc2ps_cs= [NET,PCMCIA]
|
xirc2ps_cs= [NET,PCMCIA]
|
||||||
Format:
|
Format:
|
||||||
<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
|
<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
|
||||||
|
|
||||||
______________________________________________________________________
|
|
||||||
|
|
||||||
TODO:
|
|
||||||
|
|
||||||
Add more DRM drivers.
|
|
||||||
@@ -1,40 +1,75 @@
|
|||||||
Tools that manage md devices can be found at
|
RAID arrays
|
||||||
http://www.kernel.org/pub/linux/utils/raid/
|
===========
|
||||||
|
|
||||||
|
|
||||||
Boot time assembly of RAID arrays
|
Boot time assembly of RAID arrays
|
||||||
---------------------------------
|
---------------------------------
|
||||||
|
|
||||||
|
Tools that manage md devices can be found at
|
||||||
|
http://www.kernel.org/pub/linux/utils/raid/
|
||||||
|
|
||||||
|
|
||||||
You can boot with your md device with the following kernel command
|
You can boot with your md device with the following kernel command
|
||||||
lines:
|
lines:
|
||||||
|
|
||||||
for old raid arrays without persistent superblocks:
|
for old raid arrays without persistent superblocks::
|
||||||
|
|
||||||
md=<md device no.>,<raid level>,<chunk size factor>,<fault level>,dev0,dev1,...,devn
|
md=<md device no.>,<raid level>,<chunk size factor>,<fault level>,dev0,dev1,...,devn
|
||||||
|
|
||||||
for raid arrays with persistent superblocks
|
for raid arrays with persistent superblocks::
|
||||||
|
|
||||||
md=<md device no.>,dev0,dev1,...,devn
|
md=<md device no.>,dev0,dev1,...,devn
|
||||||
or, to assemble a partitionable array:
|
|
||||||
|
or, to assemble a partitionable array::
|
||||||
|
|
||||||
md=d<md device no.>,dev0,dev1,...,devn
|
md=d<md device no.>,dev0,dev1,...,devn
|
||||||
|
|
||||||
md device no. = the number of the md device ...
|
``md device no.``
|
||||||
0 means md0,
|
+++++++++++++++++
|
||||||
1 md1,
|
|
||||||
2 md2,
|
|
||||||
3 md3,
|
|
||||||
4 md4
|
|
||||||
|
|
||||||
raid level = -1 linear mode
|
The number of the md device
|
||||||
|
|
||||||
|
================= =========
|
||||||
|
``md device no.`` device
|
||||||
|
================= =========
|
||||||
|
0 md0
|
||||||
|
1 md1
|
||||||
|
2 md2
|
||||||
|
3 md3
|
||||||
|
4 md4
|
||||||
|
================= =========
|
||||||
|
|
||||||
|
``raid level``
|
||||||
|
++++++++++++++
|
||||||
|
|
||||||
|
level of the RAID array
|
||||||
|
|
||||||
|
=============== =============
|
||||||
|
``raid level`` level
|
||||||
|
=============== =============
|
||||||
|
-1 linear mode
|
||||||
0 striped mode
|
0 striped mode
|
||||||
|
=============== =============
|
||||||
|
|
||||||
other modes are only supported with persistent super blocks
|
other modes are only supported with persistent super blocks
|
||||||
|
|
||||||
chunk size factor = (raid-0 and raid-1 only)
|
``chunk size factor``
|
||||||
|
+++++++++++++++++++++
|
||||||
|
|
||||||
|
(raid-0 and raid-1 only)
|
||||||
|
|
||||||
Set the chunk size as 4k << n.
|
Set the chunk size as 4k << n.
|
||||||
|
|
||||||
fault level = totally ignored
|
``fault level``
|
||||||
|
+++++++++++++++
|
||||||
|
|
||||||
dev0-devn: e.g. /dev/hda1,/dev/hdc1,/dev/sda1,/dev/sdb1
|
Totally ignored
|
||||||
|
|
||||||
A possible loadlin line (Harald Hoyer <HarryH@Royal.Net>) looks like this:
|
``dev0`` to ``devn``
|
||||||
|
++++++++++++++++++++
|
||||||
|
|
||||||
|
e.g. ``/dev/hda1``, ``/dev/hdc1``, ``/dev/sda1``, ``/dev/sdb1``
|
||||||
|
|
||||||
|
A possible loadlin line (Harald Hoyer <HarryH@Royal.Net>) looks like this::
|
||||||
|
|
||||||
e:\loadlin\loadlin e:\zimage root=/dev/md0 md=0,0,4,0,/dev/hdb2,/dev/hdc3 ro
|
e:\loadlin\loadlin e:\zimage root=/dev/md0 md=0,0,4,0,/dev/hdb2,/dev/hdc3 ro
|
||||||
|
|
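To make the md= formats above concrete, the following sketch decodes a parameter value into its fields. It is not the kernel's implementation: the function name `parse_md` is hypothetical, and telling the two forms apart by checking whether the second field is an integer is an assumption of this example.

```python
def parse_md(value):
    """Decode the value of an md= parameter into a dictionary."""
    fields = value.split(",")
    partitionable = fields[0].startswith("d")
    minor = int(fields[0][1:] if partitionable else fields[0])
    rest = fields[1:]
    try:
        level = int(rest[0])
    except (IndexError, ValueError):
        # Second field is not a number: array with persistent superblocks.
        return {"minor": minor, "partitionable": partitionable,
                "persistent": True, "devices": rest}
    chunk_factor = int(rest[1])
    # rest[2] is the fault level, which is ignored.
    return {"minor": minor, "partitionable": partitionable,
            "persistent": False, "level": level,
            "chunk_size": 4096 << chunk_factor,  # 4k << n
            "devices": rest[3:]}
```

Applied to the loadlin example above, `md=0,0,4,0,/dev/hdb2,/dev/hdc3` describes md0 in striped mode with a chunk size of 4k << 4, i.e. 65536 bytes, built from two devices.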
||||||
@@ -45,10 +80,10 @@ Boot time autodetection of RAID arrays
|
|||||||
When md is compiled into the kernel (not as module), partitions of
|
When md is compiled into the kernel (not as module), partitions of
|
||||||
type 0xfd are scanned and automatically assembled into RAID arrays.
|
type 0xfd are scanned and automatically assembled into RAID arrays.
|
||||||
This autodetection may be suppressed with the kernel parameter
|
This autodetection may be suppressed with the kernel parameter
|
||||||
"raid=noautodetect". As of kernel 2.6.9, only drives with a type 0
|
``raid=noautodetect``. As of kernel 2.6.9, only drives with a type 0
|
||||||
superblock can be autodetected and run at boot time.
|
superblock can be autodetected and run at boot time.
|
||||||
|
|
||||||
The kernel parameter "raid=partitionable" (or "raid=part") means
|
The kernel parameter ``raid=partitionable`` (or ``raid=part``) means
|
||||||
that all auto-detected arrays are assembled as partitionable.
|
that all auto-detected arrays are assembled as partitionable.
|
||||||
|
|
||||||
Boot time assembly of degraded/dirty arrays
|
Boot time assembly of degraded/dirty arrays
|
||||||
@@ -56,22 +91,23 @@ Boot time assembly of degraded/dirty arrays
|
|||||||
|
|
||||||
If a raid5 or raid6 array is both dirty and degraded, it could have
|
If a raid5 or raid6 array is both dirty and degraded, it could have
|
||||||
undetectable data corruption. This is because the fact that it is
|
undetectable data corruption. This is because the fact that it is
|
||||||
'dirty' means that the parity cannot be trusted, and the fact that it
|
``dirty`` means that the parity cannot be trusted, and the fact that it
|
||||||
is degraded means that some datablocks are missing and cannot reliably
|
is degraded means that some datablocks are missing and cannot reliably
|
||||||
be reconstructed (due to no parity).
|
be reconstructed (due to no parity).
|
||||||
|
|
||||||
For this reason, md will normally refuse to start such an array. This
|
For this reason, md will normally refuse to start such an array. This
|
||||||
requires the sysadmin to take action to explicitly start the array
|
requires the sysadmin to take action to explicitly start the array
|
||||||
despite possible corruption. This is normally done with
|
despite possible corruption. This is normally done with::
|
||||||
|
|
||||||
mdadm --assemble --force ....
|
mdadm --assemble --force ....
|
||||||
|
|
||||||
This option is not really available if the array has the root
|
This option is not really available if the array has the root
|
||||||
filesystem on it. In order to support booting from such an
|
filesystem on it. In order to support booting from such an
|
||||||
array, md supports a module parameter "start_dirty_degraded" which,
|
array, md supports a module parameter ``start_dirty_degraded`` which,
|
||||||
when set to 1, bypasses the checks and allows dirty degraded
|
when set to 1, bypasses the checks and allows dirty degraded
|
||||||
arrays to be started.
|
arrays to be started.
|
||||||
|
|
||||||
So, to boot with a root filesystem of a dirty degraded raid[56], use
|
So, to boot with a root filesystem of a dirty degraded raid 5 or 6, use::
|
||||||
|
|
||||||
md-mod.start_dirty_degraded=1
|
md-mod.start_dirty_degraded=1
|
||||||
|
|
||||||
@@ -80,28 +116,28 @@ Superblock formats
|
|||||||
------------------
|
------------------
|
||||||
|
|
||||||
The md driver can support a variety of different superblock formats.
|
The md driver can support a variety of different superblock formats.
|
||||||
Currently, it supports superblock formats "0.90.0" and the "md-1" format
|
Currently, it supports superblock formats ``0.90.0`` and the ``md-1`` format
|
||||||
introduced in the 2.5 development series.
|
introduced in the 2.5 development series.
|
||||||
|
|
||||||
The kernel will autodetect which format superblock is being used.
|
The kernel will autodetect which format superblock is being used.
|
||||||
|
|
||||||
Superblock format '0' is treated differently to others for legacy
|
Superblock format ``0`` is treated differently to others for legacy
|
||||||
reasons - it is the original superblock format.
|
reasons - it is the original superblock format.
|
||||||
|
|
||||||
|
|
||||||
General Rules - apply for all superblock formats
|
General Rules - apply for all superblock formats
|
||||||
------------------------------------------------
|
------------------------------------------------
|
||||||
|
|
||||||
An array is 'created' by writing appropriate superblocks to all
|
An array is ``created`` by writing appropriate superblocks to all
|
||||||
devices.
|
devices.
|
||||||
|
|
||||||
It is 'assembled' by associating each of these devices with an
|
It is ``assembled`` by associating each of these devices with an
|
||||||
particular md virtual device. Once it is completely assembled, it can
|
particular md virtual device. Once it is completely assembled, it can
|
||||||
be accessed.
|
be accessed.
|
||||||
|
|
||||||
An array should be created by a user-space tool. This will write
|
An array should be created by a user-space tool. This will write
|
||||||
superblocks to all devices. It will usually mark the array as
|
superblocks to all devices. It will usually mark the array as
|
||||||
'unclean', or with some devices missing so that the kernel md driver
|
``unclean``, or with some devices missing so that the kernel md driver
|
||||||
can create appropriate redundancy (copying in raid 1, parity
|
can create appropriate redundancy (copying in raid 1, parity
|
||||||
calculation in raid 4/5).
|
calculation in raid 4/5).
|
||||||
|
|
||||||
@@ -126,13 +162,12 @@ Devices that have failed or are not yet active can be detached from an
|
|||||||
array using HOT_REMOVE_DISK.
|
array using HOT_REMOVE_DISK.
|
||||||
|
|
||||||
|
|
||||||
Specific Rules that apply to format-0 super block arrays, and
|
Specific Rules that apply to format-0 super block arrays, and arrays with no superblock (non-persistent)
|
||||||
arrays with no superblock (non-persistent).
|
--------------------------------------------------------------------------------------------------------
|
||||||
-------------------------------------------------------------
|
|
||||||
|
|
||||||
An array can be 'created' by describing the array (level, chunksize
|
An array can be ``created`` by describing the array (level, chunksize
|
||||||
etc) in a SET_ARRAY_INFO ioctl. This must have major_version==0 and
|
etc) in a SET_ARRAY_INFO ioctl. This must have ``major_version==0`` and
|
||||||
raid_disks != 0.
|
``raid_disks != 0``.
|
||||||
|
|
||||||
Then uninitialized devices can be added with ADD_NEW_DISK. The
|
Then uninitialized devices can be added with ADD_NEW_DISK. The
|
||||||
structure passed to ADD_NEW_DISK must specify the state of the device
|
structure passed to ADD_NEW_DISK must specify the state of the device
|
||||||
@@ -142,24 +177,26 @@ Once started with RUN_ARRAY, uninitialized spares can be added with
|
|||||||
HOT_ADD_DISK.
|
HOT_ADD_DISK.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
MD devices in sysfs
|
MD devices in sysfs
|
||||||
-------------------
|
-------------------
|
||||||
md devices appear in sysfs (/sys) as regular block devices,
|
|
||||||
e.g.
|
md devices appear in sysfs (``/sys``) as regular block devices,
|
||||||
|
e.g.::
|
||||||
|
|
||||||
/sys/block/md0
|
/sys/block/md0
|
||||||
|
|
||||||
Each 'md' device will contain a subdirectory called 'md' which
|
Each ``md`` device will contain a subdirectory called ``md`` which
|
||||||
contains further md-specific information about the device.
|
contains further md-specific information about the device.
|
||||||
|
|
||||||
All md devices contain:
|
All md devices contain:
|
||||||
|
|
||||||
level
|
level
|
||||||
a text file indicating the 'raid level'. e.g. raid0, raid1,
|
a text file indicating the ``raid level``. e.g. raid0, raid1,
|
||||||
raid5, linear, multipath, faulty.
|
raid5, linear, multipath, faulty.
|
||||||
If no raid level has been set yet (array is still being
|
If no raid level has been set yet (array is still being
|
||||||
assembled), the value will reflect whatever has been written
|
assembled), the value will reflect whatever has been written
|
||||||
to it, which may be a name like the above, or may be a number
|
to it, which may be a name like the above, or may be a number
|
||||||
such as '0', '5', etc.
|
such as ``0``, ``5``, etc.
|
||||||
|
|
||||||
raid_disks
|
raid_disks
|
||||||
a text file with a simple number indicating the number of devices
|
a text file with a simple number indicating the number of devices
|
||||||
@@ -172,10 +209,10 @@ All md devices contain:
|
|||||||
A change to this attribute will not be permitted if it would
|
A change to this attribute will not be permitted if it would
|
||||||
reduce the size of the array. To reduce the number of drives
|
reduce the size of the array. To reduce the number of drives
|
||||||
in an e.g. raid5, the array size must first be reduced by
|
in an e.g. raid5, the array size must first be reduced by
|
||||||
setting the 'array_size' attribute.
|
setting the ``array_size`` attribute.
|
||||||
|
|
||||||
chunk_size
|
chunk_size
|
||||||
This is the size in bytes for 'chunks' and is only relevant to
|
This is the size in bytes for ``chunks`` and is only relevant to
|
||||||
raid levels that involve striping (0,4,5,6,10). The address space
|
raid levels that involve striping (0,4,5,6,10). The address space
|
||||||
of the array is conceptually divided into chunks and consecutive
|
of the array is conceptually divided into chunks and consecutive
|
||||||
chunks are striped onto neighbouring devices.
|
chunks are striped onto neighbouring devices.
|
||||||
@@ -183,7 +220,7 @@ All md devices contain:
|
|||||||
of 2. This can only be set while assembling an array
|
of 2. This can only be set while assembling an array
|
||||||
|
|
||||||
layout
|
layout
|
||||||
The "layout" for the array for the particular level. This is
|
The ``layout`` for the array for the particular level. This is
|
||||||
simply a number that is interpreted differently by different
|
simply a number that is interpreted differently by different
|
||||||
levels. It can be written while assembling an array.
|
levels. It can be written while assembling an array.
|
||||||
|
|
||||||
@@ -193,22 +230,24 @@ All md devices contain:
|
|||||||
devices. Writing a number (in Kilobytes) which is less than
|
devices. Writing a number (in Kilobytes) which is less than
|
||||||
the available size will set the size. Any reconfiguration of the
|
the available size will set the size. Any reconfiguration of the
|
||||||
array (e.g. adding devices) will not cause the size to change.
|
array (e.g. adding devices) will not cause the size to change.
|
||||||
Writing the word 'default' will cause the effective size of the
|
Writing the word ``default`` will cause the effective size of the
|
||||||
array to be whatever size is actually available based on
|
array to be whatever size is actually available based on
|
||||||
'level', 'chunk_size' and 'component_size'.
|
``level``, ``chunk_size`` and ``component_size``.
|
||||||
|
|
||||||
This can be used to reduce the size of the array before reducing
|
This can be used to reduce the size of the array before reducing
|
||||||
the number of devices in a raid4/5/6, or to support external
|
the number of devices in a raid4/5/6, or to support external
|
||||||
metadata formats which mandate such clipping.
|
metadata formats which mandate such clipping.
|
||||||
|
|
||||||
reshape_position
|
reshape_position
|
||||||
This is either "none" or a sector number within the devices of
|
This is either ``none`` or a sector number within the devices of
|
||||||
the array where "reshape" is up to. If this is set, the three
|
the array where ``reshape`` is up to. If this is set, the three
|
||||||
attributes mentioned above (raid_disks, chunk_size, layout) can
|
attributes mentioned above (raid_disks, chunk_size, layout) can
|
||||||
potentially have 2 values, an old and a new value. If these
|
potentially have 2 values, an old and a new value. If these
|
||||||
values differ, reading the attribute returns
|
values differ, reading the attribute returns::
|
||||||
|
|
||||||
new (old)
|
new (old)
|
||||||
and writing will effect the 'new' value, leaving the 'old'
|
|
||||||
|
and writing will effect the ``new`` value, leaving the ``old``
|
||||||
unchanged.
|
unchanged.
|
||||||
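As an illustration of the ``new (old)`` form described for raid_disks, chunk_size and layout, here is a minimal Python sketch that splits such a value. The function name `parse_reshape_attr` is hypothetical, and the exact whitespace in the kernel's output is an assumption; the code only relies on the documented "new (old)" shape.

```python
import re


def parse_reshape_attr(text):
    """Split an md attribute value into (new, old).

    During a reshape the attribute may read as "new (old)"; otherwise a
    single value is present and old is returned as None.
    """
    # One token, optionally followed by a second token in parentheses.
    match = re.fullmatch(r"\s*(\S+)(?:\s+\((\S+)\))?\s*", text)
    if match is None:
        raise ValueError(f"unrecognised attribute value: {text!r}")
    return match.group(1), match.group(2)
```

A caller would typically pass the contents of a file such as /sys/block/md0/md/raid_disks and compare the two values to see whether they differ.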
|
|
||||||
component_size
|
component_size
|
||||||
@@ -223,9 +262,9 @@ All md devices contain:
|
|||||||
metadata_version
|
metadata_version
|
||||||
This indicates the format that is being used to record metadata
|
This indicates the format that is being used to record metadata
|
||||||
about the array. It can be 0.90 (traditional format), 1.0, 1.1,
|
about the array. It can be 0.90 (traditional format), 1.0, 1.1,
|
||||||
1.2 (newer format in varying locations) or "none" indicating that
|
1.2 (newer format in varying locations) or ``none`` indicating that
|
||||||
the kernel isn't managing metadata at all.
|
the kernel isn't managing metadata at all.
|
||||||
Alternately it can be "external:" followed by a string which
|
Alternately it can be ``external:`` followed by a string which
|
||||||
is set by user-space. This indicates that metadata is managed
|
is set by user-space. This indicates that metadata is managed
|
||||||
by a user-space program. Any device failure or other event that
|
by a user-space program. Any device failure or other event that
|
||||||
requires a metadata update will cause array activity to be
|
requires a metadata update will cause array activity to be
|
||||||
@@ -233,9 +272,9 @@ All md devices contain:
|
|||||||
|
|
||||||
resync_start
|
resync_start
|
||||||
The point at which resync should start. If no resync is needed,
|
The point at which resync should start. If no resync is needed,
|
||||||
this will be a very large number (or 'none' since 2.6.30-rc1). At
|
this will be a very large number (or ``none`` since 2.6.30-rc1). At
|
||||||
array creation it will default to 0, though starting the array as
|
array creation it will default to 0, though starting the array as
|
||||||
'clean' will set it much larger.
|
``clean`` will set it much larger.
|
||||||
|
|
||||||
new_dev
|
new_dev
|
||||||
This file can be written but not read. The value written should
|
This file can be written but not read. The value written should
|
||||||
@@ -246,10 +285,10 @@ All md devices contain:
|
|||||||
|
|
||||||
safe_mode_delay
|
safe_mode_delay
|
||||||
When an md array has seen no write requests for a certain period
|
When an md array has seen no write requests for a certain period
|
||||||
of time, it will be marked as 'clean'. When another write
|
of time, it will be marked as ``clean``. When another write
|
||||||
request arrives, the array is marked as 'dirty' before the write
|
request arrives, the array is marked as ``dirty`` before the write
|
||||||
commences. This is known as 'safe_mode'.
|
commences. This is known as ``safe_mode``.
|
||||||
The 'certain period' is controlled by this file which stores the
|
The ``certain period`` is controlled by this file which stores the
|
||||||
period as a number of seconds. The default is 200msec (0.200).
|
period as a number of seconds. The default is 200msec (0.200).
|
||||||
Writing a value of 0 disables safemode.
|
Writing a value of 0 disables safemode.
|
||||||
|
|
||||||
@@ -260,38 +299,50 @@ All md devices contain:
|
|||||||
cannot be explicitly set, and some transitions are not allowed.
|
cannot be explicitly set, and some transitions are not allowed.
|
||||||
|
|
||||||
Select/poll works on this file. All changes except between
|
Select/poll works on this file. All changes except between
|
||||||
active_idle and active (which can be frequent and are not
|
active_idle and active (which can be frequent and are not
|
||||||
very interesting) are notified. active->active_idle is
|
very interesting) are notified. active->active_idle is
|
||||||
reported if the metadata is externally managed.
|
reported if the metadata is externally managed.
|
||||||
|
|
||||||
clear
|
clear
|
||||||
No devices, no size, no level
|
No devices, no size, no level
|
||||||
|
|
||||||
Writing is equivalent to STOP_ARRAY ioctl
|
Writing is equivalent to STOP_ARRAY ioctl
|
||||||
|
|
||||||
inactive
|
inactive
|
||||||
May have some settings, but array is not active
|
May have some settings, but array is not active
|
||||||
all IO results in error
|
all IO results in error
|
||||||
|
|
||||||
When written, doesn't tear down array, but just stops it
|
When written, doesn't tear down array, but just stops it
|
||||||
|
|
||||||
suspended (not supported yet)
|
suspended (not supported yet)
|
||||||
All IO requests will block. The array can be reconfigured.
|
All IO requests will block. The array can be reconfigured.
|
||||||
|
|
||||||
Writing this, if accepted, will block until array is quiescent
|
Writing this, if accepted, will block until array is quiescent
|
||||||
|
|
||||||
readonly
|
readonly
|
||||||
no resync can happen. no superblocks get written.
|
no resync can happen. no superblocks get written.
|
||||||
write requests fail
|
|
||||||
read-auto
|
|
||||||
like readonly, but behaves like 'clean' on a write request.
|
|
||||||
|
|
||||||
clean - no pending writes, but otherwise active.
|
Write requests fail
|
||||||
|
|
||||||
|
read-auto
|
||||||
|
like readonly, but behaves like ``clean`` on a write request.
|
||||||
|
|
||||||
|
clean
|
||||||
|
no pending writes, but otherwise active.
|
||||||
|
|
||||||
When written to inactive array, starts without resync
|
When written to inactive array, starts without resync
|
||||||
|
|
||||||
If a write request arrives then
|
If a write request arrives then
|
||||||
if metadata is known, mark 'dirty' and switch to 'active'.
|
if metadata is known, mark ``dirty`` and switch to ``active``.
|
||||||
if not known, block and switch to write-pending
|
if not known, block and switch to write-pending
|
||||||
|
|
||||||
If written to an active array that has pending writes, then fails.
|
If written to an active array that has pending writes, then fails.
|
||||||
active
|
active
|
||||||
fully active: IO and resync can be happening.
|
fully active: IO and resync can be happening.
|
||||||
When written to inactive array, starts with resync
|
When written to inactive array, starts with resync
|
||||||
|
|
||||||
write-pending
|
write-pending
|
||||||
clean, but writes are blocked waiting for 'active' to be written.
|
clean, but writes are blocked waiting for ``active`` to be written.
|
||||||
|
|
||||||
active-idle
|
active-idle
|
||||||
like active, but no writes have been seen for a while (safe_mode_delay).
|
like active, but no writes have been seen for a while (safe_mode_delay).
|
||||||
@@ -299,41 +350,52 @@ All md devices contain:
|
|||||||
bitmap/location
|
bitmap/location
|
||||||
This indicates where the write-intent bitmap for the array is
|
This indicates where the write-intent bitmap for the array is
|
||||||
stored.
|
stored.
|
||||||
It can be one of "none", "file" or "[+-]N".
|
|
||||||
"file" may later be extended to "file:/file/name"
|
It can be one of ``none``, ``file`` or ``[+-]N``.
|
||||||
"[+-]N" means that many sectors from the start of the metadata.
|
``file`` may later be extended to ``file:/file/name``
|
||||||
|
``[+-]N`` means that many sectors from the start of the metadata.
|
||||||
|
|
||||||
This is replicated on all devices. For arrays with externally
|
This is replicated on all devices. For arrays with externally
|
||||||
managed metadata, the offset is from the beginning of the
|
managed metadata, the offset is from the beginning of the
|
||||||
device.
|
device.
|
||||||
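The three forms of bitmap/location listed above can be told apart with a small classifier. This is only a sketch: `parse_bitmap_location` is a hypothetical helper, and accepting an offset without a leading sign is an assumption, not something the text above states.

```python
import re


def parse_bitmap_location(text):
    """Classify a bitmap/location value.

    Returns ("none", None), ("file", None) or ("offset", sectors), where
    sectors is a signed count of sectors from the start of the metadata.
    """
    value = text.strip()
    if value in ("none", "file"):
        return value, None
    # Assumption: the sign is treated as optional here.
    if re.fullmatch(r"[+-]?\d+", value):
        return "offset", int(value)
    raise ValueError(f"unrecognised bitmap location: {text!r}")
```

Note that for arrays with externally managed metadata the returned offset is relative to the beginning of the device, so the caller has to know which kind of array it is dealing with.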
|
|
||||||
bitmap/chunksize
|
bitmap/chunksize
|
||||||
The size, in bytes, of the chunk which will be represented by a
|
The size, in bytes, of the chunk which will be represented by a
|
||||||
single bit. For RAID456, it is a portion of an individual
|
single bit. For RAID456, it is a portion of an individual
|
||||||
device. For RAID10, it is a portion of the array. For RAID1, it
|
device. For RAID10, it is a portion of the array. For RAID1, it
|
||||||
is both (they come to the same thing).
|
is both (they come to the same thing).
|
||||||
|
|
||||||
bitmap/time_base
|
bitmap/time_base
|
||||||
The time, in seconds, between looking for bits in the bitmap to
|
The time, in seconds, between looking for bits in the bitmap to
|
||||||
be cleared. In the current implementation, a bit will be cleared
|
be cleared. In the current implementation, a bit will be cleared
|
||||||
between 2 and 3 times "time_base" after all the covered blocks
|
between 2 and 3 times ``time_base`` after all the covered blocks
|
||||||
are known to be in-sync.
|
are known to be in-sync.
|
||||||
|
|
||||||
bitmap/backlog
|
bitmap/backlog
|
||||||
When write-mostly devices are active in a RAID1, write requests
|
When write-mostly devices are active in a RAID1, write requests
|
||||||
to those devices proceed in the background - the filesystem (or
|
to those devices proceed in the background - the filesystem (or
|
||||||
other user of the device) does not have to wait for them.
|
other user of the device) does not have to wait for them.
|
||||||
'backlog' sets a limit on the number of concurrent background
|
``backlog`` sets a limit on the number of concurrent background
|
||||||
writes. If there are more than this, new writes will be
|
writes. If there are more than this, new writes will be
|
||||||
synchronous.
|
synchronous.
|
||||||
|
|
||||||
bitmap/metadata
|
bitmap/metadata
|
||||||
This can be either 'internal' or 'external'.
|
This can be either ``internal`` or ``external``.
|
||||||
'internal' is the default and means the metadata for the bitmap
|
|
||||||
|
``internal``
|
||||||
|
is the default and means the metadata for the bitmap
|
||||||
is stored in the first 256 bytes of the allocated space and is
|
is stored in the first 256 bytes of the allocated space and is
|
||||||
managed by the md module.
|
managed by the md module.
|
||||||
'external' means that bitmap metadata is managed externally to
|
|
||||||
|
``external``
|
||||||
|
means that bitmap metadata is managed externally to
|
||||||
the kernel (i.e. by some userspace program)
|
the kernel (i.e. by some userspace program)
|
||||||
|
|
||||||
bitmap/can_clear
|
bitmap/can_clear
|
||||||
This is either 'true' or 'false'. If 'true', then bits in the
|
This is either ``true`` or ``false``. If ``true``, then bits in the
|
||||||
bitmap will be cleared when the corresponding blocks are thought
|
bitmap will be cleared when the corresponding blocks are thought
|
||||||
to be in-sync. If 'false', bits will never be cleared.
|
to be in-sync. If ``false``, bits will never be cleared.
|
||||||
This is automatically set to 'false' if a write happens on a
|
This is automatically set to ``false`` if a write happens on a
|
||||||
degraded array, or if the array becomes degraded during a write.
|
degraded array, or if the array becomes degraded during a write.
|
||||||
When metadata is managed externally, it should be set to true
|
When metadata is managed externally, it should be set to true
|
||||||
once the array becomes non-degraded, and this fact has been
|
once the array becomes non-degraded, and this fact has been
|
||||||
@@ -342,14 +404,17 @@ All md devices contain:
|
|||||||
|
|
||||||
|
|
||||||
|
|
||||||
As component devices are added to an md array, they appear in the 'md'
|
As component devices are added to an md array, they appear in the ``md``
|
||||||
directory as new directories named
|
directory as new directories named::
|
||||||
|
|
||||||
dev-XXX
|
dev-XXX
|
||||||
where XXX is a name that the kernel knows for the device, e.g. hdb1.
|
|
||||||
|
where ``XXX`` is a name that the kernel knows for the device, e.g. hdb1.
|
||||||
Each directory contains:
|
Each directory contains:
|
||||||
|
|
||||||
block
|
block
|
||||||
a symlink to the block device in /sys/block, e.g.
|
a symlink to the block device in /sys/block, e.g.::
|
||||||
|
|
||||||
/sys/block/md0/md/dev-hdb1/block -> ../../../../block/hdb/hdb1
|
/sys/block/md0/md/dev-hdb1/block -> ../../../../block/hdb/hdb1
|
||||||
|
|
||||||
super
|
super
|
||||||
@@ -358,51 +423,83 @@ Each directory contains:
|
|||||||
|
|
||||||
state
|
state
|
||||||
A file recording the current state of the device in the array
|
A file recording the current state of the device in the array
|
||||||
which can be a comma separated list of
|
which can be a comma separated list of:
|
||||||
faulty - device has been kicked from active use due to
|
|
||||||
|
faulty
|
||||||
|
device has been kicked from active use due to
|
||||||
a detected fault, or it has unacknowledged bad
|
a detected fault, or it has unacknowledged bad
|
||||||
blocks
|
blocks
|
||||||
in_sync - device is a fully in-sync member of the array
|
|
||||||
writemostly - device will only be subject to read
|
in_sync
|
||||||
|
device is a fully in-sync member of the array
|
||||||
|
|
||||||
|
writemostly
|
||||||
|
device will only be subject to read
|
||||||
requests if there are no other options.
|
requests if there are no other options.
|
||||||
|
|
||||||
This applies only to raid1 arrays.
|
This applies only to raid1 arrays.
|
||||||
blocked - device has failed, and the failure hasn't been
|
|
||||||
|
blocked
|
||||||
|
device has failed, and the failure hasn't been
|
||||||
acknowledged yet by the metadata handler.
|
acknowledged yet by the metadata handler.
|
||||||
|
|
||||||
Writes that would write to this device if
|
Writes that would write to this device if
|
||||||
it were not faulty are blocked.
|
it were not faulty are blocked.
|
||||||
spare - device is working, but not a full member.
|
|
||||||
|
spare
|
||||||
|
device is working, but not a full member.
|
||||||
|
|
||||||
This includes spares that are in the process
|
This includes spares that are in the process
|
||||||
of being recovered to
|
of being recovered to
|
||||||
write_error - device has ever seen a write error.
|
|
||||||
want_replacement - device is (mostly) working but probably
|
write_error
|
||||||
|
device has ever seen a write error.
|
||||||
|
|
||||||
|
want_replacement
|
||||||
|
device is (mostly) working but probably
|
||||||
should be replaced, either due to errors or
|
should be replaced, either due to errors or
|
||||||
due to user request.
|
due to user request.
|
||||||
replacement - device is a replacement for another active
|
|
||||||
|
replacement
|
||||||
|
device is a replacement for another active
|
||||||
device with same raid_disk.
|
device with same raid_disk.
|
||||||
|
|
||||||
|
|
||||||
This list may grow in future.
|
This list may grow in future.
|
||||||
|
|
||||||
This can be written to.
|
This can be written to.
|
||||||
Writing "faulty" simulates a failure on the device.
|
|
||||||
Writing "remove" removes the device from the array.
|
Writing ``faulty`` simulates a failure on the device.
|
||||||
Writing "writemostly" sets the writemostly flag.
|
|
||||||
Writing "-writemostly" clears the writemostly flag.
|
Writing ``remove`` removes the device from the array.
|
||||||
Writing "blocked" sets the "blocked" flag.
|
|
||||||
Writing "-blocked" clears the "blocked" flags and allows writes
|
Writing ``writemostly`` sets the writemostly flag.
|
||||||
|
|
||||||
|
Writing ``-writemostly`` clears the writemostly flag.
|
||||||
|
|
||||||
|
Writing ``blocked`` sets the ``blocked`` flag.
|
||||||
|
|
||||||
|
Writing ``-blocked`` clears the ``blocked`` flags and allows writes
|
||||||
to complete and possibly simulates an error.
|
to complete and possibly simulates an error.
|
||||||
Writing "in_sync" sets the in_sync flag.
|
|
||||||
Writing "write_error" sets writeerrorseen flag.
|
Writing ``in_sync`` sets the in_sync flag.
|
||||||
Writing "-write_error" clears writeerrorseen flag.
|
|
||||||
Writing "want_replacement" is allowed at any time except to a
|
Writing ``write_error`` sets writeerrorseen flag.
|
||||||
|
|
||||||
|
Writing ``-write_error`` clears writeerrorseen flag.
|
||||||
|
|
||||||
|
Writing ``want_replacement`` is allowed at any time except to a
|
||||||
replacement device or a spare. It sets the flag.
|
replacement device or a spare. It sets the flag.
|
||||||
Writing "-want_replacement" is allowed at any time. It clears
|
|
||||||
|
Writing ``-want_replacement`` is allowed at any time. It clears
|
||||||
the flag.
|
the flag.
|
||||||
Writing "replacement" or "-replacement" is only allowed before
|
|
||||||
|
Writing ``replacement`` or ``-replacement`` is only allowed before
|
||||||
starting the array. It sets or clears the flag.
|
starting the array. It sets or clears the flag.
|
||||||
|
|
||||||
|
|
||||||
This file responds to select/poll. Any change to 'faulty'
|
This file responds to select/poll. Any change to ``faulty``
|
||||||
or 'blocked' causes an event.
|
or ``blocked`` causes an event.
|
||||||
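Because the state file holds a comma separated list of flags, a reader will usually want it as a set. The following Python sketch shows one way to do that; `parse_state` is a hypothetical name, and the flag names in the usage note are the ones listed above.

```python
def parse_state(text):
    """Return the set of flags found in a dev-XXX/state value."""
    # Empty fields (for example from a trailing newline) are dropped.
    return {flag.strip() for flag in text.split(",") if flag.strip()}
```

With the result in hand, a check such as `"faulty" in parse_state(value)` is enough to react to the flags that trigger select/poll events. Since the list of flags may grow in future, the sketch deliberately does not reject names it does not know.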
|
|
||||||
errors
|
errors
|
||||||
An approximate count of read errors that have been detected on
|
An approximate count of read errors that have been detected on
|
||||||
@@ -417,9 +514,9 @@ Each directory contains:
|
|||||||
|
|
||||||
slot
|
slot
|
||||||
This gives the role that the device has in the array. It will
|
This gives the role that the device has in the array. It will
|
||||||
either be 'none' if the device is not active in the array
|
either be ``none`` if the device is not active in the array
|
||||||
(i.e. is a spare or has failed) or an integer less than the
|
(i.e. is a spare or has failed) or an integer less than the
|
||||||
'raid_disks' number for the array indicating which position
|
``raid_disks`` number for the array indicating which position
|
||||||
it currently fills. This can only be set while assembling an
|
it currently fills. This can only be set while assembling an
|
||||||
array. A device for which this is set is assumed to be working.
|
array. A device for which this is set is assumed to be working.
|
||||||
|
|
||||||
@@ -437,7 +534,7 @@ Each directory contains:
|
|||||||
written, it will be rejected.
|
written, it will be rejected.
|
||||||
|
|
||||||
recovery_start
|
recovery_start
|
||||||
When the device is not 'in_sync', this records the number of
|
When the device is not ``in_sync``, this records the number of
|
||||||
sectors from the start of the device which are known to be
|
sectors from the start of the device which are known to be
|
||||||
correct. This is normally zero, but during a recovery
|
correct. This is normally zero, but during a recovery
|
||||||
operation it will steadily increase, and if the recovery is
|
operation it will steadily increase, and if the recovery is
|
||||||
@@ -447,21 +544,21 @@ Each directory contains:
|
|||||||
|
|
||||||
This can be set whenever the device is not an active member of
|
This can be set whenever the device is not an active member of
|
||||||
the array, either before the array is activated, or before
|
the array, either before the array is activated, or before
|
||||||
the 'slot' is set.
|
the ``slot`` is set.
|
||||||
|
|
||||||
Setting this to 'none' is equivalent to setting 'in_sync'.
|
Setting this to ``none`` is equivalent to setting ``in_sync``.
|
||||||
Setting to any other value also clears the 'in_sync' flag.
|
Setting to any other value also clears the ``in_sync`` flag.
|
||||||
|
|
||||||
bad_blocks
|
bad_blocks
|
||||||
This gives the list of all known bad blocks in the form of
|
This gives the list of all known bad blocks in the form of
|
||||||
start address and length (in sectors respectively). If output
|
start address and length (in sectors respectively). If output
|
||||||
is too big to fit in a page, it will be truncated. Writing
|
is too big to fit in a page, it will be truncated. Writing
|
||||||
"sector length" to this file adds new acknowledged (i.e.
|
``sector length`` to this file adds new acknowledged (i.e.
|
||||||
recorded to disk safely) bad blocks.
|
recorded to disk safely) bad blocks.
|
||||||
|
|
||||||
unacknowledged_bad_blocks
|
unacknowledged_bad_blocks
|
||||||
This gives the list of known-but-not-yet-saved-to-disk bad
|
This gives the list of known-but-not-yet-saved-to-disk bad
|
||||||
blocks in the same form of 'bad_blocks'. If output is too big
|
blocks in the same form of ``bad_blocks``. If output is too big
|
||||||
to fit in a page, it will be truncated. Writing to this file
|
to fit in a page, it will be truncated. Writing to this file
|
||||||
adds bad blocks without acknowledging them. This is largely
|
adds bad blocks without acknowledging them. This is largely
|
||||||
for testing.
|
for testing.
|
||||||
@@ -469,16 +566,18 @@ Each directory contains:
|
|||||||
|
|
||||||
|
|
||||||
An active md device will also contain an entry for each active device
|
An active md device will also contain an entry for each active device
|
||||||
in the array. These are named
|
in the array. These are named::
|
||||||
|
|
||||||
rdNN
|
rdNN
|
||||||
|
|
||||||
where 'NN' is the position in the array, starting from 0.
|
where ``NN`` is the position in the array, starting from 0.
|
||||||
So for a 3 drive array there will be rd0, rd1, rd2.
|
So for a 3 drive array there will be rd0, rd1, rd2.
|
||||||
These are symbolic links to the appropriate 'dev-XXX' entry.
|
These are symbolic links to the appropriate ``dev-XXX`` entry.
|
||||||
Thus, for example,
|
Thus, for example::
|
||||||
|
|
||||||
cat /sys/block/md*/md/rd*/state
|
cat /sys/block/md*/md/rd*/state
|
||||||
will show 'in_sync' on every line.
|
|
||||||
|
will show ``in_sync`` on every line.
|
||||||
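The same check as the ``cat`` example can be sketched in Python, reporting only the devices that are not ``in_sync``. The function name `devices_not_in_sync` and its `sys_block` parameter are hypothetical; the parameter exists so that the sketch can be pointed at a directory other than /sys/block.

```python
import glob
import os


def devices_not_in_sync(sys_block="/sys/block"):
    """Map each rdNN state file lacking the in_sync flag to its flag set."""
    pattern = os.path.join(sys_block, "md*", "md", "rd*", "state")
    bad = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            # The state file is a comma separated list of flags.
            flags = {f.strip() for f in handle.read().split(",") if f.strip()}
        if "in_sync" not in flags:
            bad[path] = flags
    return bad
```

An empty result corresponds to the case where every line of the ``cat`` output reads in_sync.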
|
|
||||||
|
|
||||||
|
|
||||||
@@ -488,50 +587,62 @@ also have
|
|||||||
sync_action
|
sync_action
|
||||||
a text file that can be used to monitor and control the rebuild
|
a text file that can be used to monitor and control the rebuild
|
||||||
process. It contains one word which can be one of:
|
process. It contains one word which can be one of:
|
||||||
resync - redundancy is being recalculated after unclean
|
|
||||||
|
resync
|
||||||
|
redundancy is being recalculated after unclean
|
||||||
shutdown or creation
|
shutdown or creation
|
||||||
recover - a hot spare is being built to replace a
|
|
||||||
|
recover
|
||||||
|
a hot spare is being built to replace a
|
||||||
failed/missing device
|
failed/missing device
|
||||||
idle - nothing is happening
|
|
||||||
check - A full check of redundancy was requested and is
|
idle
|
||||||
|
nothing is happening
|
||||||
|
check
|
||||||
|
A full check of redundancy was requested and is
|
||||||
happening. This reads all blocks and checks
|
happening. This reads all blocks and checks
|
||||||
them. A repair may also happen for some raid
|
them. A repair may also happen for some raid
|
||||||
levels.
|
levels.
|
||||||
repair - A full check and repair is happening. This is
|
|
||||||
similar to 'resync', but was requested by the
|
repair
|
||||||
|
A full check and repair is happening. This is
|
||||||
|
similar to ``resync``, but was requested by the
|
||||||
user, and the write-intent bitmap is NOT used to
|
user, and the write-intent bitmap is NOT used to
|
||||||
optimise the process.
|
optimise the process.
|
||||||
|
|
||||||
This file is writable, and each of the strings that could be
|
This file is writable, and each of the strings that could be
|
||||||
read are meaningful for writing.
|
read are meaningful for writing.
|
||||||
|
|
||||||
'idle' will stop an active resync/recovery etc. There is no
|
``idle`` will stop an active resync/recovery etc. There is no
|
||||||
guarantee that another resync/recovery may not be automatically
|
guarantee that another resync/recovery may not be automatically
|
||||||
started again, though some event will be needed to trigger
|
started again, though some event will be needed to trigger
|
||||||
this.
|
this.
|
||||||
'resync' or 'recovery' can be used to restart the
|
|
||||||
corresponding operation if it was stopped with 'idle'.
|
``resync`` or ``recovery`` can be used to restart the
|
||||||
'check' and 'repair' will start the appropriate process
|
corresponding operation if it was stopped with ``idle``.
|
||||||
providing the current state is 'idle'.
|
|
||||||
|
``check`` and ``repair`` will start the appropriate process
|
||||||
|
providing the current state is ``idle``.
|
||||||
|
|
||||||
This file responds to select/poll. Any important change in the value
|
This file responds to select/poll. Any important change in the value
|
||||||
triggers a poll event. Sometimes the value will briefly be
|
triggers a poll event. Sometimes the value will briefly be
|
||||||
"recover" if a recovery seems to be needed, but cannot be
|
``recover`` if a recovery seems to be needed, but cannot be
|
||||||
achieved. In that case, the transition to "recover" isn't
|
achieved. In that case, the transition to ``recover`` isn't
|
||||||
notified, but the transition away is.
|
notified, but the transition away is.
|
||||||
|
|
||||||
degraded
|
degraded
|
||||||
This contains a count of the number of devices by which the
|
This contains a count of the number of devices by which the
|
||||||
array is degraded. So an optimal array will show '0'. A
|
array is degraded. So an optimal array will show ``0``. A
|
||||||
single failed/missing drive will show '1', etc.
|
single failed/missing drive will show ``1``, etc.
|
||||||
|
|
||||||
This file responds to select/poll, any increase or decrease
|
This file responds to select/poll, any increase or decrease
|
||||||
in the count of missing devices will trigger an event.
|
in the count of missing devices will trigger an event.
|
||||||
|
|
||||||
mismatch_count
|
mismatch_count
|
||||||
When performing 'check' and 'repair', and possibly when
|
When performing ``check`` and ``repair``, and possibly when
|
||||||
performing 'resync', md will count the number of errors that are
|
performing ``resync``, md will count the number of errors that are
|
||||||
found. The count in 'mismatch_cnt' is the number of sectors
|
found. The count in ``mismatch_cnt`` is the number of sectors
|
||||||
that were re-written, or (for 'check') would have been
|
that were re-written, or (for ``check``) would have been
|
||||||
re-written. As most raid levels work in units of pages rather
|
re-written. As most raid levels work in units of pages rather
|
||||||
than sectors, this may be larger than the number of actual errors
|
than sectors, this may be larger than the number of actual errors
|
||||||
by a factor of the number of sectors in a page.
|
by a factor of the number of sectors in a page.
|
||||||
@@ -542,27 +653,30 @@ also have
|
|||||||
would need to check the corresponding blocks. Either individual
|
would need to check the corresponding blocks. Either individual
|
||||||
numbers or start-end pairs can be written. Multiple numbers
|
numbers or start-end pairs can be written. Multiple numbers
|
||||||
can be separated by a space.
|
can be separated by a space.
|
||||||
Note that the numbers are 'bit' numbers, not 'block' numbers.
|
|
||||||
|
Note that the numbers are ``bit`` numbers, not ``block`` numbers.
|
||||||
They should be scaled by the bitmap_chunksize.
|
They should be scaled by the bitmap_chunksize.
|
||||||
|
|
||||||
sync_speed_min
|
sync_speed_min, sync_speed_max
|
||||||
sync_speed_max
|
These are similar to ``/proc/sys/dev/raid/speed_limit_{min,max}``
|
||||||
These are similar to /proc/sys/dev/raid/speed_limit_{min,max}
|
|
||||||
however they only apply to the particular array.
|
however they only apply to the particular array.
|
||||||
If no value has been written to these, or if the word 'system'
|
|
||||||
|
If no value has been written to these, or if the word ``system``
|
||||||
is written, then the system-wide value is used. If a value,
|
is written, then the system-wide value is used. If a value,
|
||||||
in kibibytes-per-second is written, then it is used.
|
in kibibytes-per-second is written, then it is used.
|
||||||
|
|
||||||
When the files are read, they show the currently active value
|
When the files are read, they show the currently active value
|
||||||
followed by "(local)" or "(system)" depending on whether it is
|
followed by ``(local)`` or ``(system)`` depending on whether it is
|
||||||
a locally set or system-wide value.
|
a locally set or system-wide value.
|
||||||
|
|
||||||
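A reader of sync_speed_min or sync_speed_max has to split the value from its ``(local)`` or ``(system)`` suffix. The following is a minimal sketch of that parsing; the exact whitespace in the file is an assumption, so the pattern is deliberately tolerant, and the function name is hypothetical.

```python
import re
from typing import Tuple


def parse_sync_speed(text: str) -> Tuple[int, str]:
    """Split a value such as '1000 (system)' into (1000, 'system')."""
    match = re.fullmatch(r"\s*(\d+)\s*\((local|system)\)\s*", text)
    if match is None:
        raise ValueError(f"unexpected sync_speed value: {text!r}")
    # group(1) is the speed in kibibytes per second, group(2) its origin.
    return int(match.group(1)), match.group(2)
```

The second element tells the caller whether writing ``system`` to the file would change anything.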
sync_completed
|
sync_completed
|
||||||
This shows the number of sectors that have been completed of
|
This shows the number of sectors that have been completed of
|
||||||
whatever the current sync_action is, followed by the number of
|
whatever the current sync_action is, followed by the number of
|
||||||
sectors in total that could need to be processed. The two
|
sectors in total that could need to be processed. The two
|
||||||
numbers are separated by a '/' thus effectively showing one
|
numbers are separated by a ``/`` thus effectively showing one
|
||||||
value, a fraction of the process that is complete.
|
value, a fraction of the process that is complete.
|
||||||
A 'select' on this attribute will return when resync completes,
|
|
||||||
|
A ``select`` on this attribute will return when resync completes,
|
||||||
when it reaches the current sync_max (below) and possibly at
|
when it reaches the current sync_max (below) and possibly at
|
||||||
other times.
|
other times.
|
||||||
|
|
||||||
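Since sync_completed is effectively a fraction, a monitoring script might convert it to a number. This is a sketch only: treating any value without a ``/`` (for example when nothing is running) as "no progress to report" is an assumption, and the function name is hypothetical.

```python
from typing import Optional


def sync_fraction(text: str) -> Optional[float]:
    """Convert 'done / total' into a fraction between 0 and 1."""
    text = text.strip()
    if "/" not in text:
        # Assumption: no '/' means there is no current progress to report.
        return None
    done, total = (int(part) for part in text.split("/", 1))
    if total == 0:
        return None
    return done / total
```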
@@ -570,26 +684,24 @@ also have
|
|||||||
This shows the current actual speed, in K/sec, of the current
|
This shows the current actual speed, in K/sec, of the current
|
||||||
sync_action. It is averaged over the last 30 seconds.
|
sync_action. It is averaged over the last 30 seconds.
|
||||||
|
|
||||||
suspend_lo
|
suspend_lo, suspend_hi
|
||||||
suspend_hi
|
|
||||||
The two values, given as numbers of sectors, indicate a range
|
The two values, given as numbers of sectors, indicate a range
|
||||||
within the array where IO will be blocked. This is currently
|
within the array where IO will be blocked. This is currently
|
||||||
only supported for raid4/5/6.
|
only supported for raid4/5/6.
|
||||||
|
|
||||||
sync_min
|
sync_min, sync_max
|
||||||
sync_max
|
|
||||||
The two values, given as numbers of sectors, indicate a range
|
The two values, given as numbers of sectors, indicate a range
|
||||||
within the array where 'check'/'repair' will operate. Must be
|
within the array where ``check``/``repair`` will operate. Must be
|
||||||
a multiple of chunk_size. When it reaches "sync_max" it will
|
a multiple of chunk_size. When it reaches ``sync_max`` it will
|
||||||
pause, rather than complete.
|
pause, rather than complete.
|
||||||
You can use 'select' or 'poll' on "sync_completed" to wait for
|
You can use ``select`` or ``poll`` on ``sync_completed`` to wait for
|
||||||
that number to reach sync_max. Then you can either increase
|
that number to reach sync_max. Then you can either increase
|
||||||
"sync_max", or can write 'idle' to "sync_action".
|
``sync_max``, or can write ``idle`` to ``sync_action``.
|
||||||
|
|
||||||
The value of 'max' for "sync_max" effectively disables the limit.
|
The value of ``max`` for ``sync_max`` effectively disables the limit.
|
||||||
When a resync is active, the value can only ever be increased,
|
When a resync is active, the value can only ever be increased,
|
||||||
never decreased.
|
never decreased.
|
||||||
The value of '0' is the minimum for "sync_min".
|
The value of ``0`` is the minimum for ``sync_min``.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
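The pause-and-advance pattern for sync_max can be sketched as a generator of successive limits. This only computes the values to write; it does not touch sysfs. It assumes the chunk size has already been converted to sectors, and the function name is hypothetical. Values only ever increase, matching the rule that sync_max cannot be decreased while a resync is active.

```python
from typing import Iterator, Union


def sync_max_checkpoints(total_sectors: int, step_sectors: int,
                         chunk_sectors: int) -> Iterator[Union[int, str]]:
    """Yield increasing sync_max values, each a multiple of the chunk size."""
    if chunk_sectors <= 0:
        raise ValueError("chunk_sectors must be positive")
    # Round the step down to whole chunks, but never below one chunk.
    step = max(chunk_sectors, (step_sectors // chunk_sectors) * chunk_sectors)
    position = step
    while position < total_sectors:
        yield position
        position += step
    # Writing 'max' removes the limit so the operation can run to completion.
    yield "max"
```

A caller would write each value to ``sync_max``, wait on ``sync_completed`` with ``select`` or ``poll``, and then move on to the next value or write ``idle`` to ``sync_action``.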
@@ -598,13 +710,15 @@ personality module that manages it.
|
|||||||
These are specific to the implementation of the module and could
|
These are specific to the implementation of the module and could
|
||||||
change substantially if the implementation changes.
|
change substantially if the implementation changes.
|
||||||
|
|
||||||
These currently include
|
These currently include:
|
||||||
|
|
||||||
stripe_cache_size (currently raid5 only)
|
stripe_cache_size (currently raid5 only)
|
||||||
number of entries in the stripe cache. This is writable, but
|
number of entries in the stripe cache. This is writable, but
|
||||||
there are upper and lower limits (32768, 17). Default is 256.
|
there are upper and lower limits (32768, 17). Default is 256.
|
||||||
|
|
||||||
stripe_cache_active (currently raid5 only)
|
stripe_cache_active (currently raid5 only)
|
||||||
number of active entries in the stripe cache
|
number of active entries in the stripe cache
|
||||||
|
|
||||||
preread_bypass_threshold (currently raid5 only)
|
preread_bypass_threshold (currently raid5 only)
|
||||||
number of times a stripe requiring preread will be bypassed by
|
number of times a stripe requiring preread will be bypassed by
|
||||||
a stripe that does not require preread. For fairness defaults
|
a stripe that does not require preread. For fairness defaults
|
||||||
@@ -1,22 +1,21 @@
|
|||||||
==============================
|
Kernel module signing facility
|
||||||
KERNEL MODULE SIGNING FACILITY
|
------------------------------
|
||||||
==============================
|
|
||||||
|
|
||||||
CONTENTS
|
.. CONTENTS
|
||||||
|
..
|
||||||
- Overview.
|
.. - Overview.
|
||||||
- Configuring module signing.
|
.. - Configuring module signing.
|
||||||
- Generating signing keys.
|
.. - Generating signing keys.
|
||||||
- Public keys in the kernel.
|
.. - Public keys in the kernel.
|
||||||
- Manually signing modules.
|
.. - Manually signing modules.
|
||||||
- Signed modules and stripping.
|
.. - Signed modules and stripping.
|
||||||
- Loading signed modules.
|
.. - Loading signed modules.
|
||||||
- Non-valid signatures and unsigned modules.
|
.. - Non-valid signatures and unsigned modules.
|
||||||
- Administering/protecting the private key.
|
.. - Administering/protecting the private key.
|
||||||
|
|
||||||
|
|
||||||
========
|
========
|
||||||
OVERVIEW
|
Overview
|
||||||
========
|
========
|
||||||
|
|
||||||
The kernel module signing facility cryptographically signs modules during
|
The kernel module signing facility cryptographically signs modules during
|
||||||
@@ -36,17 +35,19 @@ SHA-512 (the algorithm is selected by data in the signature).
|
|||||||
|
|
||||||
|
|
||||||
==========================
|
==========================
|
||||||
CONFIGURING MODULE SIGNING
|
Configuring module signing
|
||||||
==========================
|
==========================
|
||||||
|
|
||||||
The module signing facility is enabled by going to the "Enable Loadable Module
|
The module signing facility is enabled by going to the
|
||||||
Support" section of the kernel configuration and turning on
|
:menuselection:`Enable Loadable Module Support` section of
|
||||||
|
the kernel configuration and turning on::
|
||||||
|
|
||||||
CONFIG_MODULE_SIG "Module signature verification"
|
CONFIG_MODULE_SIG "Module signature verification"
|
||||||
|
|
||||||
This has a number of options available:
|
This has a number of options available:
|
||||||
|
|
||||||
(1) "Require modules to be validly signed" (CONFIG_MODULE_SIG_FORCE)
|
(1) :menuselection:`Require modules to be validly signed`
|
||||||
|
(``CONFIG_MODULE_SIG_FORCE``)
|
||||||
|
|
||||||
This specifies how the kernel should deal with a module that has a
|
This specifies how the kernel should deal with a module that has a
|
||||||
signature for which the key is not known or a module that is unsigned.
|
signature for which the key is not known or a module that is unsigned.
|
||||||
@@ -64,35 +65,39 @@ This has a number of options available:
|
|||||||
cannot be parsed, it will be rejected out of hand.
|
cannot be parsed, it will be rejected out of hand.
|
||||||
|
|
||||||
|
|
||||||
(2) "Automatically sign all modules" (CONFIG_MODULE_SIG_ALL)
|
(2) :menuselection:`Automatically sign all modules`
|
||||||
|
(``CONFIG_MODULE_SIG_ALL``)
|
||||||
|
|
||||||
If this is on then modules will be automatically signed during the
|
If this is on then modules will be automatically signed during the
|
||||||
modules_install phase of a build. If this is off, then the modules must
|
modules_install phase of a build. If this is off, then the modules must
|
||||||
be signed manually using:
|
be signed manually using::
|
||||||
|
|
||||||
scripts/sign-file
|
scripts/sign-file
|
||||||
|
|
||||||
|
|
||||||
(3) "Which hash algorithm should modules be signed with?"
|
(3) :menuselection:`Which hash algorithm should modules be signed with?`
|
||||||
|
|
||||||
This presents a choice of which hash algorithm the installation phase will
|
This presents a choice of which hash algorithm the installation phase will
|
||||||
sign the modules with:
|
sign the modules with:
|
||||||
|
|
||||||
CONFIG_MODULE_SIG_SHA1 "Sign modules with SHA-1"
|
=============================== ==========================================
|
||||||
CONFIG_MODULE_SIG_SHA224 "Sign modules with SHA-224"
|
``CONFIG_MODULE_SIG_SHA1`` :menuselection:`Sign modules with SHA-1`
|
||||||
CONFIG_MODULE_SIG_SHA256 "Sign modules with SHA-256"
|
``CONFIG_MODULE_SIG_SHA224`` :menuselection:`Sign modules with SHA-224`
|
||||||
CONFIG_MODULE_SIG_SHA384 "Sign modules with SHA-384"
|
``CONFIG_MODULE_SIG_SHA256`` :menuselection:`Sign modules with SHA-256`
|
||||||
CONFIG_MODULE_SIG_SHA512 "Sign modules with SHA-512"
|
``CONFIG_MODULE_SIG_SHA384`` :menuselection:`Sign modules with SHA-384`
|
||||||
|
``CONFIG_MODULE_SIG_SHA512`` :menuselection:`Sign modules with SHA-512`
|
||||||
|
=============================== ==========================================
|
||||||
|
|
||||||
The algorithm selected here will also be built into the kernel (rather
|
The algorithm selected here will also be built into the kernel (rather
|
||||||
than being a module) so that modules signed with that algorithm can have
|
than being a module) so that modules signed with that algorithm can have
|
||||||
their signatures checked without causing a dependency loop.
|
their signatures checked without causing a dependency loop.
|
||||||
|
|
||||||
|
|
||||||
(4) "File name or PKCS#11 URI of module signing key" (CONFIG_MODULE_SIG_KEY)
|
(4) :menuselection:`File name or PKCS#11 URI of module signing key`
|
||||||
|
(``CONFIG_MODULE_SIG_KEY``)
|
||||||
|
|
||||||
Setting this option to something other than its default of
|
Setting this option to something other than its default of
|
||||||
"certs/signing_key.pem" will disable the autogeneration of signing keys
|
``certs/signing_key.pem`` will disable the autogeneration of signing keys
|
||||||
and allow the kernel modules to be signed with a key of your choosing.
|
and allow the kernel modules to be signed with a key of your choosing.
|
||||||
The string provided should identify a file containing both a private key
|
The string provided should identify a file containing both a private key
|
||||||
and its corresponding X.509 certificate in PEM form, or — on systems where
|
and its corresponding X.509 certificate in PEM form, or — on systems where
|
||||||
@@ -102,10 +107,11 @@ This has a number of options available:
|
|||||||
|
|
||||||
If the PEM file containing the private key is encrypted, or if the
|
If the PEM file containing the private key is encrypted, or if the
|
||||||
PKCS#11 token requires a PIN, this can be provided at build time by
|
PKCS#11 token requires a PIN, this can be provided at build time by
|
||||||
means of the KBUILD_SIGN_PIN variable.
|
means of the ``KBUILD_SIGN_PIN`` variable.
|
||||||
|
|
||||||
|
|
||||||
(5) "Additional X.509 keys for default system keyring" (CONFIG_SYSTEM_TRUSTED_KEYS)
|
(5) :menuselection:`Additional X.509 keys for default system keyring`
|
||||||
|
(``CONFIG_SYSTEM_TRUSTED_KEYS``)
|
||||||
|
|
||||||
This option can be set to the filename of a PEM-encoded file containing
|
This option can be set to the filename of a PEM-encoded file containing
|
||||||
additional certificates which will be included in the system keyring by
|
additional certificates which will be included in the system keyring by
|
||||||
@@ -116,7 +122,7 @@ packages to the kernel build processes for the tool that does the signing.
|
|||||||
|
|
||||||
|
|
||||||
=======================
|
=======================
|
||||||
GENERATING SIGNING KEYS
|
Generating signing keys
|
||||||
=======================
|
=======================
|
||||||
|
|
||||||
Cryptographic keypairs are required to generate and check signatures. A
|
Cryptographic keypairs are required to generate and check signatures. A
|
||||||
@@ -126,14 +132,14 @@ it can be deleted or stored securely. The public key gets built into the
|
|||||||
kernel so that it can be used to check the signatures as the modules are
|
kernel so that it can be used to check the signatures as the modules are
|
||||||
loaded.
|
loaded.
|
||||||
|
|
||||||
Under normal conditions, when CONFIG_MODULE_SIG_KEY is unchanged from its
|
Under normal conditions, when ``CONFIG_MODULE_SIG_KEY`` is unchanged from its
|
||||||
default, the kernel build will automatically generate a new keypair using
|
default, the kernel build will automatically generate a new keypair using
|
||||||
openssl if one does not exist in the file:
|
openssl if one does not exist in the file::
|
||||||
|
|
||||||
certs/signing_key.pem
|
certs/signing_key.pem
|
||||||
|
|
||||||
during the building of vmlinux (the public part of the key needs to be built
|
during the building of vmlinux (the public part of the key needs to be built
|
||||||
into vmlinux) using parameters in the:
|
into vmlinux) using parameters in the::
|
||||||
|
|
||||||
certs/x509.genkey
|
certs/x509.genkey
|
||||||
|
|
||||||
@@ -142,14 +148,14 @@ file (which is also generated if it does not already exist).
|
|||||||
It is strongly recommended that you provide your own x509.genkey file.
|
It is strongly recommended that you provide your own x509.genkey file.
|
||||||
|
|
||||||
Most notably, in the x509.genkey file, the req_distinguished_name section
|
Most notably, in the x509.genkey file, the req_distinguished_name section
|
||||||
should be altered from the default:
|
should be altered from the default::
|
||||||
|
|
||||||
[ req_distinguished_name ]
|
[ req_distinguished_name ]
|
||||||
#O = Unspecified company
|
#O = Unspecified company
|
||||||
CN = Build time autogenerated kernel key
|
CN = Build time autogenerated kernel key
|
||||||
#emailAddress = unspecified.user@unspecified.company
|
#emailAddress = unspecified.user@unspecified.company
|
||||||
|
|
||||||
The generated RSA key size can also be set with:
|
The generated RSA key size can also be set with::
|
||||||
|
|
||||||
[ req ]
|
[ req ]
|
||||||
default_bits = 4096
|
default_bits = 4096
|
||||||
@@ -158,23 +164,23 @@ The generated RSA key size can also be set with:
|
|||||||
It is also possible to manually generate the key private/public files using the
|
It is also possible to manually generate the key private/public files using the
|
||||||
x509.genkey key generation configuration file in the root node of the Linux
|
x509.genkey key generation configuration file in the root node of the Linux
|
||||||
kernel sources tree and the openssl command. The following is an example to
|
kernel sources tree and the openssl command. The following is an example to
|
||||||
generate the public/private key files:
|
generate the public/private key files::
|
||||||
|
|
||||||
openssl req -new -nodes -utf8 -sha256 -days 36500 -batch -x509 \
|
openssl req -new -nodes -utf8 -sha256 -days 36500 -batch -x509 \
|
||||||
-config x509.genkey -outform PEM -out kernel_key.pem \
|
-config x509.genkey -outform PEM -out kernel_key.pem \
|
||||||
-keyout kernel_key.pem
|
-keyout kernel_key.pem
|
||||||
|
|
||||||
The full pathname for the resulting kernel_key.pem file can then be specified
|
The full pathname for the resulting kernel_key.pem file can then be specified
|
||||||
in the CONFIG_MODULE_SIG_KEY option, and the certificate and key therein will
|
in the ``CONFIG_MODULE_SIG_KEY`` option, and the certificate and key therein will
|
||||||
be used instead of an autogenerated keypair.
|
be used instead of an autogenerated keypair.
|
||||||
|
|
||||||
|
|
||||||
=========================
|
=========================
|
||||||
PUBLIC KEYS IN THE KERNEL
|
Public keys in the kernel
|
||||||
=========================
|
=========================
|
||||||
|
|
||||||
The kernel contains a ring of public keys that can be viewed by root. They're
|
The kernel contains a ring of public keys that can be viewed by root. They're
|
||||||
in a keyring called ".system_keyring" that can be seen by:
|
in a keyring called ".system_keyring" that can be seen by::
|
||||||
|
|
||||||
[root@deneb ~]# cat /proc/keys
|
[root@deneb ~]# cat /proc/keys
|
||||||
...
|
...
|
||||||
@@ -184,27 +190,27 @@ in a keyring called ".system_keyring" that can be seen by:
|
|||||||
|
|
||||||
Beyond the public key generated specifically for module signing, additional
|
Beyond the public key generated specifically for module signing, additional
|
||||||
trusted certificates can be provided in a PEM-encoded file referenced by the
|
trusted certificates can be provided in a PEM-encoded file referenced by the
|
||||||
CONFIG_SYSTEM_TRUSTED_KEYS configuration option.
|
``CONFIG_SYSTEM_TRUSTED_KEYS`` configuration option.
|
||||||
|
|
||||||
Further, the architecture code may take public keys from a hardware store and
|
Further, the architecture code may take public keys from a hardware store and
|
||||||
add those in also (e.g. from the UEFI key database).
|
add those in also (e.g. from the UEFI key database).
|
||||||
|
|
||||||
Finally, it is possible to add additional public keys by doing:
|
Finally, it is possible to add additional public keys by doing::
|
||||||
|
|
||||||
keyctl padd asymmetric "" [.system_keyring-ID] <[key-file]
|
keyctl padd asymmetric "" [.system_keyring-ID] <[key-file]
|
||||||
|
|
||||||
e.g.:
|
e.g.::
|
||||||
|
|
||||||
keyctl padd asymmetric "" 0x223c7853 <my_public_key.x509
|
keyctl padd asymmetric "" 0x223c7853 <my_public_key.x509
|
||||||
|
|
||||||
Note, however, that the kernel will only permit keys to be added to
|
Note, however, that the kernel will only permit keys to be added to
|
||||||
.system_keyring _if_ the new key's X.509 wrapper is validly signed by a key
|
``.system_keyring`` *if* the new key's X.509 wrapper is validly signed by a key
|
||||||
that is already resident in the .system_keyring at the time the key was added.
|
that is already resident in the .system_keyring at the time the key was added.
|
||||||
|
|
||||||
|
|
||||||
=========================
|
========================
|
||||||
MANUALLY SIGNING MODULES
|
Manually signing modules
|
||||||
=========================
|
========================
|
||||||
|
|
||||||
To manually sign a module, use the scripts/sign-file tool available in
|
To manually sign a module, use the scripts/sign-file tool available in
|
||||||
the Linux kernel source tree. The script requires 4 arguments:
|
the Linux kernel source tree. The script requires 4 arguments:
|
||||||
@@ -214,7 +220,7 @@ the Linux kernel source tree. The script requires 4 arguments:
|
|||||||
3. The public key filename
|
3. The public key filename
|
||||||
4. The kernel module to be signed
|
4. The kernel module to be signed
|
||||||
|
|
||||||
The following is an example to sign a kernel module:
|
The following is an example to sign a kernel module::
|
||||||
|
|
||||||
scripts/sign-file sha512 kernel-signkey.priv \
|
scripts/sign-file sha512 kernel-signkey.priv \
|
||||||
kernel-signkey.x509 module.ko
|
kernel-signkey.x509 module.ko
|
||||||
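A build script that signs several modules might assemble the same command programmatically. The sketch below follows the argument order of the example above; the function name and the ``sign_file`` default path are illustrative assumptions, and the accepted hash names are taken from the configuration options listed earlier.

```python
from typing import List

# Hash names corresponding to the CONFIG_MODULE_SIG_SHA* options.
KNOWN_HASHES = ("sha1", "sha224", "sha256", "sha384", "sha512")


def sign_file_command(hash_algo: str, private_key: str, x509_cert: str,
                      module: str,
                      sign_file: str = "scripts/sign-file") -> List[str]:
    """Build the argument vector for one sign-file invocation."""
    if hash_algo not in KNOWN_HASHES:
        raise ValueError(f"unknown hash algorithm: {hash_algo!r}")
    return [sign_file, hash_algo, private_key, x509_cert, module]
```

The resulting list can be passed to ``subprocess.run(cmd, check=True)`` from the top of the kernel source tree.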
@@ -228,11 +234,11 @@ $KBUILD_SIGN_PIN environment variable.
|
|||||||
|
|
||||||
|
|
||||||
============================
|
============================
|
||||||
SIGNED MODULES AND STRIPPING
|
Signed modules and stripping
|
||||||
============================
|
============================
|
||||||
|
|
||||||
A signed module has a digital signature simply appended at the end. The string
|
A signed module has a digital signature simply appended at the end. The string
|
||||||
"~Module signature appended~." at the end of the module's file confirms that a
|
``~Module signature appended~.`` at the end of the module's file confirms that a
|
||||||
signature is present but it does not confirm that the signature is valid!
|
signature is present but it does not confirm that the signature is valid!
|
||||||
|
|
||||||
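Checking for the trailing marker is straightforward, as the sketch below shows. It assumes the final byte after the second ``~`` is a newline (rendered as ``.`` in the text above), and the function name is hypothetical. As noted, finding the marker says nothing about whether the signature is valid.

```python
# Assumed on-disk form of the marker: the visible text plus a newline.
SIGNATURE_MARKER = b"~Module signature appended~\n"


def has_signature_marker(data: bytes) -> bool:
    """Return True if the module image ends with the signature marker."""
    return data.endswith(SIGNATURE_MARKER)
```

In practice ``data`` would be the module's bytes, for example ``open("module.ko", "rb").read()``.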
Signed modules are BRITTLE as the signature is outside of the defined ELF
|
Signed modules are BRITTLE as the signature is outside of the defined ELF
|
||||||
@@ -242,19 +248,19 @@ debug information present at the time of signing.
|
|||||||
|
|
||||||
|
|
||||||
======================
|
======================
|
||||||
LOADING SIGNED MODULES
|
Loading signed modules
|
||||||
======================
|
======================
|
||||||
|
|
||||||
Modules are loaded with insmod, modprobe, init_module() or finit_module(),
|
Modules are loaded with insmod, modprobe, ``init_module()`` or
|
||||||
exactly as for unsigned modules as no processing is done in userspace. The
|
``finit_module()``, exactly as for unsigned modules as no processing is
|
||||||
signature checking is all done within the kernel.
|
done in userspace. The signature checking is all done within the kernel.
|
||||||
|
|
||||||
|
|
||||||
=========================================
|
=========================================
|
||||||
NON-VALID SIGNATURES AND UNSIGNED MODULES
|
Non-valid signatures and unsigned modules
|
||||||
=========================================
|
=========================================
|
||||||
|
|
||||||
If CONFIG_MODULE_SIG_FORCE is enabled or module.sig_enforce=1 is supplied on
|
If ``CONFIG_MODULE_SIG_FORCE`` is enabled or module.sig_enforce=1 is supplied on
|
||||||
the kernel command line, the kernel will only load validly signed modules
|
the kernel command line, the kernel will only load validly signed modules
|
||||||
for which it has a public key. Otherwise, it will also load modules that are
|
for which it has a public key. Otherwise, it will also load modules that are
|
||||||
unsigned. Any module for which the kernel has a key, but which proves to have
|
unsigned. Any module for which the kernel has a key, but which proves to have
|
||||||
@@ -264,7 +270,7 @@ Any module that has an unparseable signature will be rejected.
|
|||||||
|
|
||||||
|
|
||||||
=========================================
|
=========================================
|
||||||
ADMINISTERING/PROTECTING THE PRIVATE KEY
|
Administering/protecting the private key
|
||||||
=========================================
|
=========================================
|
||||||
|
|
||||||
Since the private key is used to sign modules, viruses and malware could use
|
Since the private key is used to sign modules, viruses and malware could use
|
||||||
@@ -275,5 +281,5 @@ in the root node of the kernel source tree.
|
|||||||
If you use the same private key to sign modules for multiple kernel
|
If you use the same private key to sign modules for multiple kernel
|
||||||
configurations, you must ensure that the module version information is
|
configurations, you must ensure that the module version information is
|
||||||
sufficient to prevent loading a module into a different kernel. Either
|
sufficient to prevent loading a module into a different kernel. Either
|
||||||
set CONFIG_MODVERSIONS=y or ensure that each configuration has a different
|
set ``CONFIG_MODVERSIONS=y`` or ensure that each configuration has a different
|
||||||
kernel release string by changing EXTRAVERSION or CONFIG_LOCALVERSION.
|
kernel release string by changing ``EXTRAVERSION`` or ``CONFIG_LOCALVERSION``.
|
||||||
@@ -19,20 +19,22 @@ other program after you have done the following:
|
|||||||
http://www.go-mono.com/compiling.html
|
http://www.go-mono.com/compiling.html
|
||||||
|
|
||||||
Once the Mono CLR support has been installed, just check that
|
Once the Mono CLR support has been installed, just check that
|
||||||
/usr/bin/mono (which could be located elsewhere, for example
|
``/usr/bin/mono`` (which could be located elsewhere, for example
|
||||||
/usr/local/bin/mono) is working.
|
``/usr/local/bin/mono``) is working.
|
||||||
|
|
||||||
2) You have to compile BINFMT_MISC either as a module or into
|
2) You have to compile BINFMT_MISC either as a module or into
|
||||||
the kernel (CONFIG_BINFMT_MISC) and set it up properly.
|
the kernel (``CONFIG_BINFMT_MISC``) and set it up properly.
|
||||||
If you choose to compile it as a module, you will have
|
If you choose to compile it as a module, you will have
|
||||||
to insert it manually with modprobe/insmod, as kmod
|
to insert it manually with modprobe/insmod, as kmod
|
||||||
cannot be easily supported with binfmt_misc.
|
cannot be easily supported with binfmt_misc.
|
||||||
Read the file 'binfmt_misc.txt' in this directory to know
|
Read the file ``binfmt_misc.txt`` in this directory to know
|
||||||
more about the configuration process.
|
more about the configuration process.
|
||||||
|
|
||||||
3) Add the following entries to /etc/rc.local or similar script
|
3) Add the following entries to ``/etc/rc.local`` or similar script
|
||||||
to be run at system startup:
|
to be run at system startup:
|
||||||
|
|
||||||
|
.. code-block:: sh
|
||||||
|
|
||||||
# Insert BINFMT_MISC module into the kernel
|
# Insert BINFMT_MISC module into the kernel
|
||||||
if [ ! -e /proc/sys/fs/binfmt_misc/register ]; then
|
if [ ! -e /proc/sys/fs/binfmt_misc/register ]; then
|
||||||
/sbin/modprobe binfmt_misc
|
/sbin/modprobe binfmt_misc
|
||||||
@@ -56,11 +58,13 @@ else
|
|||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
4) Check that .exe binaries can be run without the need of a
|
4) Check that ``.exe`` binaries can be run without the need of a
|
||||||
wrapper script, simply by launching the .exe file directly
|
wrapper script, simply by launching the ``.exe`` file directly
|
||||||
from a command prompt, for example:
|
from a command prompt, for example::
|
||||||
|
|
||||||
/usr/bin/xsd.exe
|
/usr/bin/xsd.exe
|
||||||
|
|
||||||
NOTE: If this fails with a permission denied error, check
|
.. note::
|
||||||
that the .exe file has execute permissions.
|
|
||||||
|
If this fails with a permission denied error, check
|
||||||
|
that the ``.exe`` file has execute permissions.
|
||||||
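The permission check mentioned in the note can be automated. This is a minimal sketch with hypothetical function names: it only tests whether any execute bit is set, which is a simplification of the kernel's real access checks.

```python
import os
import stat


def mode_is_executable(mode: int) -> bool:
    """Return True if any of the user/group/other execute bits is set."""
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def file_is_executable(path: str) -> bool:
    """Apply the same test to a file on disk."""
    return mode_is_executable(os.stat(path).st_mode)
```

If ``file_is_executable("/usr/bin/xsd.exe")`` returns ``False``, ``chmod +x`` on the file is the likely fix.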
286
Documentation/admin-guide/parport.rst
Normal file
@@ -0,0 +1,286 @@
|
|||||||
|
Parport
|
||||||
|
+++++++
|
||||||
|
|
||||||
|
The ``parport`` code provides parallel-port support under Linux. This
|
||||||
|
includes the ability to share one port between multiple device
|
||||||
|
drivers.
|
||||||
|
|
||||||
|
You can pass parameters to the ``parport`` code to override its automatic
|
||||||
|
detection of your hardware. This is particularly useful if you want
|
||||||
|
to use IRQs, since in general these can't be autoprobed successfully.
|
||||||
|
By default IRQs are not used even if they **can** be probed. This is
|
||||||
|
because there are a lot of people using the same IRQ for their
|
||||||
|
parallel port and a sound card or network card.
|
||||||
|
|
||||||
|
The ``parport`` code is split into two parts: generic (which deals with
|
||||||
|
port-sharing) and architecture-dependent (which deals with actually
|
||||||
|
using the port).
|
||||||
|
|
||||||
|
|
||||||
|
Parport as modules
|
||||||
|
==================
|
||||||
|
|
||||||
|
If you load the ``parport`` code as a module, say::
|
||||||
|
|
||||||
|
# insmod parport
|
||||||
|
|
||||||
|
to load the generic ``parport`` code. You then must load the
|
||||||
|
architecture-dependent code with (for example)::
|
||||||
|
|
||||||
|
# insmod parport_pc io=0x3bc,0x378,0x278 irq=none,7,auto
|
||||||
|
|
||||||
|
to tell the ``parport`` code that you want three PC-style ports, one at
|
||||||
|
0x3bc with no IRQ, one at 0x378 using IRQ 7, and one at 0x278 with an
|
||||||
|
auto-detected IRQ. Currently, PC-style (``parport_pc``), Sun ``bpp``,
|
||||||
|
Amiga, Atari, and MFC3 hardware is supported.
|
||||||
|
|
||||||
|
PCI parallel I/O card support comes from ``parport_pc``. Base I/O
|
||||||
|
addresses should not be specified for supported PCI cards since they
|
||||||
|
are automatically detected.
|
||||||
|
|
||||||
|
|
||||||
|
modprobe
|
||||||
|
--------
|
||||||
|
|
||||||
|
If you use modprobe, you will find it useful to add lines as below to a
|
||||||
|
configuration file in the ``/etc/modprobe.d/`` directory::
|
||||||
|
|
||||||
|
alias parport_lowlevel parport_pc
|
||||||
|
options parport_pc io=0x378,0x278 irq=7,auto
|
||||||
|
|
||||||
|
modprobe will load ``parport_pc`` (with the options ``io=0x378,0x278 irq=7,auto``)
|
||||||
|
whenever a parallel port device driver (such as ``lp``) is loaded.
|
||||||
|
|
||||||
|
Note that these are example lines only! You shouldn't in general need
|
||||||
|
to specify any options to ``parport_pc`` in order to be able to use a
|
||||||
|
parallel port.
|
||||||
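The ``io=`` and ``irq=`` lists in an ``options parport_pc`` line are matched up by position. The sketch below illustrates that pairing; the function name is hypothetical, and filling a missing IRQ with ``none`` is an assumption based on IRQs not being used by default.

```python
from typing import Dict, List


def parse_parport_options(line: str) -> List[Dict[str, str]]:
    """Pair each io= address with the irq= entry at the same position."""
    fields = line.split()
    if fields[:2] != ["options", "parport_pc"]:
        raise ValueError("not a parport_pc options line")
    params = dict(field.split("=", 1) for field in fields[2:])
    io_list = params["io"].split(",") if "io" in params else []
    irq_list = params["irq"].split(",") if "irq" in params else []
    ports = []
    for index, io in enumerate(io_list):
        # Assumption: a port with no irq entry behaves as irq=none.
        irq = irq_list[index] if index < len(irq_list) else "none"
        ports.append({"io": io, "irq": irq})
    return ports
```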
|
|
||||||
|
|
||||||
|
Parport probe [optional]
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
In 2.2 kernels there was a module called ``parport_probe``, which was used
|
||||||
|
for collecting IEEE 1284 device ID information. This has now been
|
||||||
|
enhanced and now lives with the IEEE 1284 support. When a parallel
|
||||||
|
port is detected, the devices that are connected to it are analysed,
|
||||||
|
and information is logged like this::
|
||||||
|
|
||||||
|
parport0: Printer, BJC-210 (Canon)
|
||||||
|
|
||||||
|
The probe information is available from files in ``/proc/sys/dev/parport/``.
|
||||||
|
|
||||||
|
|
||||||
|
Parport linked into the kernel statically
|
||||||
|
=========================================
|
||||||
|
|
||||||
|
If you compile the ``parport`` code into the kernel, then you can use
|
||||||
|
kernel boot parameters to get the same effect. Add something like the
|
||||||
|
following to your LILO command line::
|
||||||
|
|
||||||
|
parport=0x3bc parport=0x378,7 parport=0x278,auto,nofifo
|
||||||
|
|
||||||
|
You can have many ``parport=...`` statements, one for each port you want
|
||||||
|
to add. Adding ``parport=0`` to the kernel command-line will disable
|
||||||
|
parport support entirely. Adding ``parport=auto`` to the kernel
|
||||||
|
command-line will make ``parport`` use any IRQ lines or DMA channels that
|
||||||
|
it auto-detects.
|
||||||
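To see how several ``parport=`` statements on one command line break down, the following sketch splits them into per-port field lists. It does not interpret the fields (address, IRQ, extra flags); the function name is hypothetical.

```python
from typing import List


def parse_parport_cmdline(cmdline: str) -> List[List[str]]:
    """Return one list of comma-separated fields per parport= statement."""
    prefix = "parport="
    ports = []
    for token in cmdline.split():
        if token.startswith(prefix):
            ports.append(token[len(prefix):].split(","))
    return ports
```

For the example above this yields three entries, one per port, while unrelated boot parameters are ignored.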
|
|
||||||
|
|
||||||
|
Files in /proc
==============

If you have configured the ``/proc`` filesystem into your kernel, you will
see a new directory entry: ``/proc/sys/dev/parport``. In there will be a
directory entry for each parallel port for which parport is
configured. In each of those directories is a collection of files
describing that parallel port.

The ``/proc/sys/dev/parport`` directory tree looks like::

        parport
        |-- default
        |   |-- spintime
        |   `-- timeslice
        |-- parport0
        |   |-- autoprobe
        |   |-- autoprobe0
        |   |-- autoprobe1
        |   |-- autoprobe2
        |   |-- autoprobe3
        |   |-- devices
        |   |   |-- active
        |   |   `-- lp
        |   |       `-- timeslice
        |   |-- base-addr
        |   |-- irq
        |   |-- dma
        |   |-- modes
        |   `-- spintime
        `-- parport1
            |-- autoprobe
            |-- autoprobe0
            |-- autoprobe1
            |-- autoprobe2
            |-- autoprobe3
            |-- devices
            |   |-- active
            |   `-- ppa
            |       `-- timeslice
            |-- base-addr
            |-- irq
            |-- dma
            |-- modes
            `-- spintime

.. tabularcolumns:: |p{4.0cm}|p{13.5cm}|

======================= =======================================================
File                    Contents
======================= =======================================================
``devices/active``      A list of the device drivers using that port. A "+"
                        will appear by the name of the device currently using
                        the port (it might not appear against any). The
                        string "none" means that there are no device drivers
                        using that port.

``base-addr``           Parallel port's base address, or addresses if the port
                        has more than one in which case they are separated
                        with tabs. These values might not have any sensible
                        meaning for some ports.

``irq``                 Parallel port's IRQ, or -1 if none is being used.

``dma``                 Parallel port's DMA channel, or -1 if none is being
                        used.

``modes``               Parallel port's hardware modes, comma-separated,
                        meaning:

                        - PCSPP
                            PC-style SPP registers are available.

                        - TRISTATE
                            Port is bidirectional.

                        - COMPAT
                            Hardware acceleration for printers is
                            available and will be used.

                        - EPP
                            Hardware acceleration for EPP protocol
                            is available and will be used.

                        - ECP
                            Hardware acceleration for ECP protocol
                            is available and will be used.

                        - DMA
                            DMA is available and will be used.

                        Note that the current implementation will only take
                        advantage of COMPAT and ECP modes if it has an IRQ
                        line to use.

``autoprobe``           Any IEEE-1284 device ID information that has been
                        acquired from the (non-IEEE 1284.3) device.

``autoprobe[0-3]``      IEEE 1284 device ID information retrieved from
                        daisy-chain devices that conform to IEEE 1284.3.

``spintime``            The number of microseconds to busy-loop while waiting
                        for the peripheral to respond. You might find that
                        adjusting this improves performance, depending on your
                        peripherals. This is a port-wide setting, i.e. it
                        applies to all devices on a particular port.

``timeslice``           The number of milliseconds that a device driver is
                        allowed to keep a port claimed for. This is advisory,
                        and a driver can ignore it if it must.

``default/*``           The defaults for spintime and timeslice. When a new
                        port is registered, it picks up the default spintime.
                        When a new device is registered, it picks up the
                        default timeslice.
======================= =======================================================
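
The per-port files in the table above are plain text, so they are easy to read from a script. The Python sketch below is one possible way to do it, not an official tool: ``read_port_info`` is a hypothetical helper, the sample values are invented, and the assumption that ``base-addr`` holds plain numbers is mine. To keep the example runnable anywhere, it reads from a temporary directory laid out like ``/proc/sys/dev/parport/parport0``.

```python
import os
import tempfile


def read_port_info(port_dir):
    """Read the per-port files described in the table above."""
    def read(name):
        with open(os.path.join(port_dir, name)) as f:
            return f.read().strip()

    def number_or_none(text):
        # The table says -1 means that none is being used.
        value = int(text)
        return None if value == -1 else value

    return {
        # Several base addresses are separated with tabs; split() with no
        # argument splits on any whitespace, tabs included.
        "base_addr": [int(a, 0) for a in read("base-addr").split()],
        "irq": number_or_none(read("irq")),
        "dma": number_or_none(read("dma")),
        # Hardware modes are comma-separated, e.g. PCSPP,TRISTATE.
        "modes": read("modes").split(","),
    }


# Build a stand-in directory with invented sample values.
with tempfile.TemporaryDirectory() as root:
    port = os.path.join(root, "parport0")
    os.mkdir(port)
    sample = {
        "base-addr": "888\t1912\n",
        "irq": "7\n",
        "dma": "-1\n",
        "modes": "PCSPP,TRISTATE,COMPAT,EPP,ECP\n",
    }
    for name, text in sample.items():
        with open(os.path.join(port, name), "w") as f:
            f.write(text)
    print(read_port_info(port))
```

On a real system the same function could be pointed at a directory such as ``/proc/sys/dev/parport/parport0``, provided the files are present and readable.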
|
||||||
|
|
||||||
|
Device drivers
==============

Once the parport code is initialised, you can attach device drivers to
specific ports. Normally this happens automatically; if the lp driver
is loaded it will create one lp device for each port found. You can
override this, though, by using parameters either when you load the lp
driver::

        # insmod lp parport=0,2

or on the LILO command line::

        lp=parport0 lp=parport2

Both the above examples would inform lp that you want ``/dev/lp0`` to be
the first parallel port, and ``/dev/lp1`` to be the **third** parallel port,
with no lp device associated with the second port (parport1). Note
that this is different to the way older kernels worked; there used to
be a static association between the I/O port address and the device
name, so ``/dev/lp0`` was always the port at 0x3bc. This is no longer the
case - if you only have one port, it will default to being ``/dev/lp0``,
regardless of base address.

Also:

* If you selected the IEEE 1284 support at compile time, you can say
  ``lp=auto`` on the kernel command line, and lp will create devices
  only for those ports that seem to have printers attached.

* If you give PLIP the ``timid`` parameter, either with ``plip=timid`` on
  the command line, or with ``insmod plip timid=1`` when using modules,
  it will avoid any ports that seem to be in use by other devices.

* IRQ autoprobing works only for a few port types at the moment.
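
The mapping between ``/dev/lpN`` names and ports described in this section can be sketched in a few lines of Python. This is only an illustration of the numbering rule (lp devices are numbered in the order the ports are listed); the helper names are hypothetical, and special values such as ``lp=auto`` are not handled.

```python
def lp_device_map(port_numbers):
    """Map /dev/lpN names to parport names, in the order given."""
    return {f"/dev/lp{i}": f"parport{n}" for i, n in enumerate(port_numbers)}


def ports_from_module_option(option):
    """Parse the module form, e.g. ``parport=0,2``."""
    return [int(n) for n in option.split("=", 1)[1].split(",")]


def ports_from_cmdline(cmdline):
    """Parse the boot form, e.g. ``lp=parport0 lp=parport2``."""
    prefix = "lp=parport"
    return [int(w[len(prefix):]) for w in cmdline.split() if w.startswith(prefix)]


# Both forms from the text describe the same assignment.
print(lp_device_map(ports_from_module_option("parport=0,2")))
print(lp_device_map(ports_from_cmdline("lp=parport0 lp=parport2")))
```

Both calls yield the same mapping, with ``/dev/lp1`` bound to ``parport2`` and nothing bound to ``parport1``, which matches the description above.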
|
||||||
|
|
||||||
|
Reporting printer problems with parport
=======================================

If you are having problems printing, please go through these steps to
try to narrow down where the problem area is.

When reporting problems with parport, really you need to give all of
the messages that ``parport_pc`` spits out when it initialises. There are
several code paths:

- polling
- interrupt-driven, protocol in software
- interrupt-driven, protocol in hardware using PIO
- interrupt-driven, protocol in hardware using DMA

The kernel messages that ``parport_pc`` logs give an indication of which
code path is being used. (They could be a lot better actually..)

For normal printer protocol, having IEEE 1284 modes enabled or not
should not make a difference.

To turn off the 'protocol in hardware' code paths, disable
``CONFIG_PARPORT_PC_FIFO``. Note that when they are enabled they are not
necessarily **used**; it depends on whether the hardware is available,
enabled by the BIOS, and detected by the driver.

So, to start with, disable ``CONFIG_PARPORT_PC_FIFO``, and load ``parport_pc``
with ``irq=none``. See if printing works then. It really should,
because this is the simplest code path.

If that works fine, try with ``io=0x378 irq=7`` (adjust for your
hardware), to make it use interrupt-driven in-software protocol.

If **that** works fine, then one of the hardware modes isn't working
right. Enable ``CONFIG_PARPORT_PC_FIFO`` (no, it isn't a module option,
and yes, it should be), set the port to ECP mode in the BIOS and note
the DMA channel, and try with::

        io=0x378 irq=7 dma=none (for PIO)
        io=0x378 irq=7 dma=3 (for DMA)

----------

philb@gnu.org

tim@cyberelk.net
|
||||||
@@ -5,34 +5,37 @@ Sergiu Iordache <sergiu@chromium.org>
|
|||||||
|
|
||||||
Updated: 17 November 2011
|
Updated: 17 November 2011
|
||||||
|
|
||||||
0. Introduction
|
Introduction
|
||||||
|
------------
|
||||||
|
|
||||||
Ramoops is an oops/panic logger that writes its logs to RAM before the system
|
Ramoops is an oops/panic logger that writes its logs to RAM before the system
|
||||||
crashes. It works by logging oopses and panics in a circular buffer. Ramoops
|
crashes. It works by logging oopses and panics in a circular buffer. Ramoops
|
||||||
needs a system with persistent RAM so that the content of that area can
|
needs a system with persistent RAM so that the content of that area can
|
||||||
survive after a restart.
|
survive after a restart.
|
||||||
|
|
||||||
1. Ramoops concepts
|
Ramoops concepts
|
||||||
|
----------------
|
||||||
|
|
||||||
Ramoops uses a predefined memory area to store the dump. The start and size
|
Ramoops uses a predefined memory area to store the dump. The start and size
|
||||||
and type of the memory area are set using three variables:
|
and type of the memory area are set using three variables:
|
||||||
* "mem_address" for the start
|
|
||||||
* "mem_size" for the size. The memory size will be rounded down to a
|
|
||||||
power of two.
|
|
||||||
* "mem_type" to specifiy if the memory type (default is pgprot_writecombine).
|
|
||||||
|
|
||||||
Typically the default value of mem_type=0 should be used as that sets the pstore
|
* ``mem_address`` for the start
|
||||||
mapping to pgprot_writecombine. Setting mem_type=1 attempts to use
|
* ``mem_size`` for the size. The memory size will be rounded down to a
|
||||||
pgprot_noncached, which only works on some platforms. This is because pstore
|
power of two.
|
||||||
|
* ``mem_type`` to specify the memory type (default is pgprot_writecombine).
|
||||||
|
|
||||||
|
Typically the default value of ``mem_type=0`` should be used as that sets the pstore
|
||||||
|
mapping to pgprot_writecombine. Setting ``mem_type=1`` attempts to use
|
||||||
|
``pgprot_noncached``, which only works on some platforms. This is because pstore
|
||||||
depends on atomic operations. At least on ARM, pgprot_noncached causes the
|
depends on atomic operations. At least on ARM, pgprot_noncached causes the
|
||||||
memory to be mapped strongly ordered, and atomic operations on strongly ordered
|
memory to be mapped strongly ordered, and atomic operations on strongly ordered
|
||||||
memory are implementation defined, and won't work on many ARMs such as omaps.
|
memory are implementation defined, and won't work on many ARMs such as omaps.
|
||||||
|
|
||||||
The memory area is divided into "record_size" chunks (also rounded down to
|
The memory area is divided into ``record_size`` chunks (also rounded down to
|
||||||
power of two) and each oops/panic writes a "record_size" chunk of
|
power of two) and each oops/panic writes a ``record_size`` chunk of
|
||||||
information.
|
information.
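
The rounding and chunking described above can be worked through with a small Python sketch. It only follows the text of this section (round both sizes down to a power of two, then divide); the helper names are hypothetical, and a real ramoops region may also set space aside for other record types, so treat the count as an upper bound.

```python
def round_down_pow2(n):
    """Largest power of two that is <= n (n must be positive)."""
    if n <= 0:
        raise ValueError("size must be positive")
    return 1 << (n.bit_length() - 1)


def oops_record_count(mem_size, record_size):
    """Number of record_size chunks that fit, after rounding both down."""
    mem_size = round_down_pow2(mem_size)
    record_size = round_down_pow2(record_size)
    return mem_size // record_size


print(round_down_pow2(0x140000))            # 1048576
print(oops_record_count(0x140000, 0x5000))  # 64
```

In this example a 0x140000-byte area is treated as 1 MiB and a 0x5000-byte record as 16 KiB, which leaves room for 64 records.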
|
||||||
|
|
||||||
Dumping both oopses and panics can be done by setting 1 in the "dump_oops"
|
Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
|
||||||
variable while setting 0 in that variable dumps only the panics.
|
variable while setting 0 in that variable dumps only the panics.
|
||||||
|
|
||||||
The module uses a counter to record multiple dumps but the counter gets reset
|
The module uses a counter to record multiple dumps but the counter gets reset
|
||||||
@@ -43,7 +46,8 @@ This might be useful when a hardware reset was used to bring the machine back
|
|||||||
to life (i.e. a watchdog triggered). In such cases, RAM may be somewhat
|
to life (i.e. a watchdog triggered). In such cases, RAM may be somewhat
|
||||||
corrupt, but usually it is restorable.
|
corrupt, but usually it is restorable.
|
||||||
|
|
||||||
2. Setting the parameters
|
Setting the parameters
|
||||||
|
----------------------
|
||||||
|
|
||||||
Setting the ramoops parameters can be done in several different manners:
|
Setting the ramoops parameters can be done in several different manners:
|
||||||
|
|
||||||
@@ -52,12 +56,13 @@ Setting the ramoops parameters can be done in several different manners:
|
|||||||
boot and then use the reserved memory for ramoops. For example, assuming a
|
boot and then use the reserved memory for ramoops. For example, assuming a
|
||||||
machine with > 128 MB of memory, the following kernel command line will tell
|
machine with > 128 MB of memory, the following kernel command line will tell
|
||||||
the kernel to use only the first 128 MB of memory, and place ECC-protected
|
the kernel to use only the first 128 MB of memory, and place ECC-protected
|
||||||
ramoops region at 128 MB boundary:
|
ramoops region at 128 MB boundary::
|
||||||
"mem=128M ramoops.mem_address=0x8000000 ramoops.ecc=1"
|
|
||||||
|
mem=128M ramoops.mem_address=0x8000000 ramoops.ecc=1
|
||||||
|
|
||||||
B. Use Device Tree bindings, as described in
|
B. Use Device Tree bindings, as described in
|
||||||
Documentation/device-tree/bindings/reserved-memory/ramoops.txt.
|
``Documentation/device-tree/bindings/reserved-memory/ramoops.txt``.
|
||||||
For example:
|
For example::
|
||||||
|
|
||||||
reserved-memory {
|
reserved-memory {
|
||||||
#address-cells = <2>;
|
#address-cells = <2>;
|
||||||
@@ -75,6 +80,8 @@ Setting the ramoops parameters can be done in several different manners:
|
|||||||
C. Use a platform device and set the platform data. The parameters can then
|
C. Use a platform device and set the platform data. The parameters can then
|
||||||
be set through that platform data. An example of doing that is:
|
be set through that platform data. An example of doing that is:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
#include <linux/pstore_ram.h>
|
#include <linux/pstore_ram.h>
|
||||||
[...]
|
[...]
|
||||||
|
|
||||||
@@ -105,28 +112,31 @@ if (ret) {
|
|||||||
|
|
||||||
You can specify either RAM memory or peripheral devices' memory. However, when
|
You can specify either RAM memory or peripheral devices' memory. However, when
|
||||||
specifying RAM, be sure to reserve the memory by issuing memblock_reserve()
|
specifying RAM, be sure to reserve the memory by issuing memblock_reserve()
|
||||||
very early in the architecture code, e.g.:
|
very early in the architecture code, e.g.::
|
||||||
|
|
||||||
#include <linux/memblock.h>
|
#include <linux/memblock.h>
|
||||||
|
|
||||||
memblock_reserve(ramoops_data.mem_address, ramoops_data.mem_size);
|
memblock_reserve(ramoops_data.mem_address, ramoops_data.mem_size);
|
||||||
|
|
||||||
3. Dump format
|
Dump format
|
||||||
|
-----------
|
||||||
|
|
||||||
The data dump begins with a header, currently defined as "====" followed by a
|
The data dump begins with a header, currently defined as ``====`` followed by a
|
||||||
timestamp and a new line. The dump then continues with the actual data.
|
timestamp and a new line. The dump then continues with the actual data.
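
A reader of such a dump only needs to peel off the first line. The Python sketch below shows the idea under stated assumptions: ``split_dump`` is a hypothetical helper, the sample record is invented, and the timestamp is kept as text because its exact layout is not specified here.

```python
def split_dump(record):
    """Split a record into (timestamp_text, data) using the ==== header."""
    marker = "===="
    if not record.startswith(marker):
        raise ValueError("record does not start with the ==== header")
    # Everything up to the first newline is the header line.
    header, _, data = record.partition("\n")
    return header[len(marker):], data


timestamp, data = split_dump("====1322000000.123456\nOops: example text\n")
print(timestamp)
print(data, end="")
```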
|
||||||
|
|
||||||
4. Reading the data
|
Reading the data
|
||||||
|
----------------
|
||||||
|
|
||||||
The dump data can be read from the pstore filesystem. The format for these
|
The dump data can be read from the pstore filesystem. The format for these
|
||||||
files is "dmesg-ramoops-N", where N is the record number in memory. To delete
|
files is ``dmesg-ramoops-N``, where N is the record number in memory. To delete
|
||||||
a stored record from RAM, simply unlink the respective pstore file.
|
a stored record from RAM, simply unlink the respective pstore file.
|
||||||
|
|
||||||
5. Persistent function tracing
|
Persistent function tracing
|
||||||
|
---------------------------
|
||||||
|
|
||||||
Persistent function tracing might be useful for debugging software or hardware
|
Persistent function tracing might be useful for debugging software or hardware
|
||||||
related hangs. The functions call chain log is stored in a "ftrace-ramoops"
|
related hangs. The functions call chain log is stored in a ``ftrace-ramoops``
|
||||||
file. Here is an example of usage:
|
file. Here is an example of usage::
|
||||||
|
|
||||||
# mount -t debugfs debugfs /sys/kernel/debug/
|
# mount -t debugfs debugfs /sys/kernel/debug/
|
||||||
# echo 1 > /sys/kernel/debug/pstore/record_ftrace
|
# echo 1 > /sys/kernel/debug/pstore/record_ftrace
|
||||||
@@ -1,3 +1,8 @@
|
|||||||
|
.. _reportingbugs:
|
||||||
|
|
||||||
|
Reporting bugs
|
||||||
|
++++++++++++++
|
||||||
|
|
||||||
Background
|
Background
|
||||||
==========
|
==========
|
||||||
|
|
||||||
@@ -50,12 +55,13 @@ maintainer replies to you, make sure to 'Reply-all' in order to keep the
|
|||||||
public mailing list(s) in the email thread.
|
public mailing list(s) in the email thread.
|
||||||
|
|
||||||
If you know which driver is causing issues, you can pass one of the driver
|
If you know which driver is causing issues, you can pass one of the driver
|
||||||
files to the get_maintainer.pl script:
|
files to the get_maintainer.pl script::
|
||||||
|
|
||||||
perl scripts/get_maintainer.pl -f <filename>
|
perl scripts/get_maintainer.pl -f <filename>
|
||||||
|
|
||||||
If it is a security bug, please copy the Security Contact listed in the
|
If it is a security bug, please copy the Security Contact listed in the
|
||||||
MAINTAINERS file. They can help coordinate bugfix and disclosure. See
|
MAINTAINERS file. They can help coordinate bugfix and disclosure. See
|
||||||
Documentation/SecurityBugs for more information.
|
:ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>` for more information.
|
||||||
|
|
||||||
If you can't figure out which subsystem caused the issue, you should file
|
If you can't figure out which subsystem caused the issue, you should file
|
||||||
a bug in kernel.org bugzilla and send email to
|
a bug in kernel.org bugzilla and send email to
|
||||||
@@ -70,6 +76,7 @@ Tips for reporting bugs
|
|||||||
If you haven't reported a bug before, please read:
|
If you haven't reported a bug before, please read:
|
||||||
|
|
||||||
http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
|
http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
|
||||||
|
|
||||||
http://www.catb.org/esr/faqs/smart-questions.html
|
http://www.catb.org/esr/faqs/smart-questions.html
|
||||||
|
|
||||||
It's REALLY important to report bugs that seem unrelated as separate email
|
It's REALLY important to report bugs that seem unrelated as separate email
|
||||||
@@ -87,7 +94,7 @@ step-by-step instructions for how a user can trigger the bug.
|
|||||||
|
|
||||||
If the failure includes an "OOPS:", take a picture of the screen, capture
|
If the failure includes an "OOPS:", take a picture of the screen, capture
|
||||||
a netconsole trace, or type the message from your screen into the bug
|
a netconsole trace, or type the message from your screen into the bug
|
||||||
report. Please read "Documentation/oops-tracing.txt" before posting your
|
report. Please read "Documentation/admin-guide/oops-tracing.rst" before posting your
|
||||||
bug report. This explains what you should do with the "Oops" information
|
bug report. This explains what you should do with the "Oops" information
|
||||||
to make it useful to the recipient.
|
to make it useful to the recipient.
|
||||||
|
|
||||||
@@ -99,11 +106,11 @@ relevant to your bug, feel free to exclude it.
|
|||||||
|
|
||||||
First run the ver_linux script included as scripts/ver_linux, which
|
First run the ver_linux script included as scripts/ver_linux, which
|
||||||
reports the version of some important subsystems. Run this script with
|
reports the version of some important subsystems. Run this script with
|
||||||
the command "sh scripts/ver_linux".
|
the command ``awk -f scripts/ver_linux``.
|
||||||
|
|
||||||
Use that information to fill in all fields of the bug report form, and
|
Use that information to fill in all fields of the bug report form, and
|
||||||
post it to the mailing list with a subject of "PROBLEM: <one line
|
post it to the mailing list with a subject of "PROBLEM: <one line
|
||||||
summary from [1.]>" for easy identification by the developers.
|
summary from [1.]>" for easy identification by the developers::
|
||||||
|
|
||||||
[1.] One line summary of the problem:
|
[1.] One line summary of the problem:
|
||||||
[2.] Full description of the problem/report:
|
[2.] Full description of the problem/report:
|
||||||
@@ -113,7 +120,7 @@ summary from [1.]>" for easy identification by the developers.
|
|||||||
[4.2.] Kernel .config file:
|
[4.2.] Kernel .config file:
|
||||||
[5.] Most recent kernel version which did not have the bug:
|
[5.] Most recent kernel version which did not have the bug:
|
||||||
[6.] Output of Oops.. message (if applicable) with symbolic information
|
[6.] Output of Oops.. message (if applicable) with symbolic information
|
||||||
resolved (see Documentation/oops-tracing.txt)
|
resolved (see Documentation/admin-guide/oops-tracing.rst)
|
||||||
[7.] A small shell script or example program which triggers the
|
[7.] A small shell script or example program which triggers the
|
||||||
problem (if possible)
|
problem (if possible)
|
||||||
[8.] Environment
|
[8.] Environment
|
||||||
@@ -153,7 +160,8 @@ Expectations for kernel maintainers
|
|||||||
Linux kernel maintainers are busy, overworked human beings. Sometimes
|
Linux kernel maintainers are busy, overworked human beings. Sometimes
|
||||||
they may not be able to address your bug in a day, a week, or two weeks.
|
they may not be able to address your bug in a day, a week, or two weeks.
|
||||||
If they don't answer your email, they may be on vacation, or at a Linux
|
If they don't answer your email, they may be on vacation, or at a Linux
|
||||||
conference. Check the conference schedule at LWN.net for more info:
|
conference. Check the conference schedule at https://LWN.net for more info:
|
||||||
|
|
||||||
https://lwn.net/Calendar/
|
https://lwn.net/Calendar/
|
||||||
|
|
||||||
In general, kernel maintainers take 1 to 5 business days to respond to
|
In general, kernel maintainers take 1 to 5 business days to respond to
|
||||||
@@ -8,8 +8,8 @@ like to know when a security bug is found so that it can be fixed and
|
|||||||
disclosed as quickly as possible. Please report security bugs to the
|
disclosed as quickly as possible. Please report security bugs to the
|
||||||
Linux kernel security team.
|
Linux kernel security team.
|
||||||
|
|
||||||
1) Contact
|
Contact
|
||||||
----------
|
-------
|
||||||
|
|
||||||
The Linux kernel security team can be contacted by email at
|
The Linux kernel security team can be contacted by email at
|
||||||
<security@kernel.org>. This is a private list of security officers
|
<security@kernel.org>. This is a private list of security officers
|
||||||
@@ -19,12 +19,12 @@ area maintainers to understand and fix the security vulnerability.
|
|||||||
|
|
||||||
As it is with any bug, the more information provided the easier it
|
As it is with any bug, the more information provided the easier it
|
||||||
will be to diagnose and fix. Please review the procedure outlined in
|
will be to diagnose and fix. Please review the procedure outlined in
|
||||||
REPORTING-BUGS if you are unclear about what information is helpful.
|
admin-guide/reporting-bugs.rst if you are unclear about what information is helpful.
|
||||||
Any exploit code is very helpful and will not be released without
|
Any exploit code is very helpful and will not be released without
|
||||||
consent from the reporter unless it has already been made public.
|
consent from the reporter unless it has already been made public.
|
||||||
|
|
||||||
2) Disclosure
|
Disclosure
|
||||||
-------------
|
----------
|
||||||
|
|
||||||
The goal of the Linux kernel security team is to work with the
|
The goal of the Linux kernel security team is to work with the
|
||||||
bug submitter to bug resolution as well as disclosure. We prefer
|
bug submitter to bug resolution as well as disclosure. We prefer
|
||||||
@@ -39,8 +39,8 @@ disclosure is from immediate (esp. if it's already publicly known)
|
|||||||
to a few weeks. As a basic default policy, we expect report date to
|
to a few weeks. As a basic default policy, we expect report date to
|
||||||
disclosure date to be on the order of 7 days.
|
disclosure date to be on the order of 7 days.
|
||||||
|
|
||||||
3) Non-disclosure agreements
|
Non-disclosure agreements
|
||||||
----------------------------
|
-------------------------
|
||||||
|
|
||||||
The Linux kernel security team is not a formal body and therefore unable
|
The Linux kernel security team is not a formal body and therefore unable
|
||||||
to enter any non-disclosure agreements.
|
to enter any non-disclosure agreements.
|
||||||
@@ -1,15 +1,21 @@
|
|||||||
|
.. _serial_console:
|
||||||
|
|
||||||
Linux Serial Console
|
Linux Serial Console
|
||||||
|
====================
|
||||||
|
|
||||||
To use a serial port as console you need to compile the support into your
|
To use a serial port as console you need to compile the support into your
|
||||||
kernel - by default it is not compiled in. For PC style serial ports
|
kernel - by default it is not compiled in. For PC style serial ports
|
||||||
it's the config option next to "Standard/generic (dumb) serial support".
|
it's the config option found at the following menu path:
|
||||||
|
|
||||||
|
:menuselection:`Character devices --> Serial drivers --> 8250/16550 and compatible serial support --> Console on 8250/16550 and compatible serial port`
|
||||||
|
|
||||||
You must compile serial support into the kernel and not as a module.
|
You must compile serial support into the kernel and not as a module.
|
||||||
|
|
||||||
It is possible to specify multiple devices for console output. You can
|
It is possible to specify multiple devices for console output. You can
|
||||||
define a new kernel command line option to select which device(s) to
|
define a new kernel command line option to select which device(s) to
|
||||||
use for console output.
|
use for console output.
|
||||||
|
|
||||||
The format of this option is:
|
The format of this option is::
|
||||||
|
|
||||||
console=device,options
|
console=device,options
|
||||||
|
|
||||||
@@ -28,11 +34,11 @@ The format of this option is:
|
|||||||
|
|
||||||
You can specify multiple console= options on the kernel command line.
|
You can specify multiple console= options on the kernel command line.
|
||||||
Output will appear on all of them. The last device will be used when
|
Output will appear on all of them. The last device will be used when
|
||||||
you open /dev/console. So, for example:
|
you open ``/dev/console``. So, for example::
|
||||||
|
|
||||||
console=ttyS1,9600 console=tty0
|
console=ttyS1,9600 console=tty0
|
||||||
|
|
||||||
defines that opening /dev/console will get you the current foreground
|
defines that opening ``/dev/console`` will get you the current foreground
|
||||||
virtual console, and kernel messages will appear on both the VGA
|
virtual console, and kernel messages will appear on both the VGA
|
||||||
console and the 2nd serial port (ttyS1 or COM2) at 9600 baud.
|
console and the 2nd serial port (ttyS1 or COM2) at 9600 baud.
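
The rule above (every ``console=`` device receives output, the last one backs ``/dev/console``) can be illustrated with a short Python sketch. ``parse_consoles`` is a hypothetical helper that only splits the command line; it does not interpret the driver-specific options.

```python
def parse_consoles(cmdline):
    """Return (device, options) pairs for every console= argument, in order."""
    consoles = []
    for word in cmdline.split():
        if word.startswith("console="):
            # The device name ends at the first comma; the rest is options.
            device, _, options = word[len("console="):].partition(",")
            consoles.append((device, options or None))
    return consoles


consoles = parse_consoles("root=/dev/sda1 console=ttyS1,9600 console=tty0")
print("kernel messages go to:", [dev for dev, _ in consoles])
print("/dev/console refers to:", consoles[-1][0])
```

With the example command line, messages go to both ``ttyS1`` and ``tty0``, and ``/dev/console`` refers to ``tty0`` because it is listed last.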
|
||||||
|
|
||||||
@@ -44,17 +50,17 @@ first looks for a VGA card and then for a serial port. So if you don't
|
|||||||
have a VGA card in your system the first serial port will automatically
|
have a VGA card in your system the first serial port will automatically
|
||||||
become the console.
|
become the console.
|
||||||
|
|
||||||
You will need to create a new device to use /dev/console. The official
|
You will need to create a new device to use ``/dev/console``. The official
|
||||||
/dev/console is now character device 5,1.
|
``/dev/console`` is now character device 5,1.
|
||||||
|
|
||||||
(You can also use a network device as a console. See
|
(You can also use a network device as a console. See
|
||||||
Documentation/networking/netconsole.txt for information on that.)
|
``Documentation/networking/netconsole.txt`` for information on that.)
|
||||||
|
|
||||||
Here's an example that will use /dev/ttyS1 (COM2) as the console.
|
Here's an example that will use ``/dev/ttyS1`` (COM2) as the console.
|
||||||
Replace the sample values as needed.
|
Replace the sample values as needed.
|
||||||
|
|
||||||
1. Create /dev/console (real console) and /dev/tty0 (master virtual
|
1. Create ``/dev/console`` (real console) and ``/dev/tty0`` (master virtual
|
||||||
console):
|
console)::
|
||||||
|
|
||||||
cd /dev
|
cd /dev
|
||||||
rm -f console tty0
|
rm -f console tty0
|
||||||
@@ -63,42 +69,42 @@ Replace the sample values as needed.
|
|||||||
|
|
||||||
2. LILO can also take input from a serial device. This is a very
|
2. LILO can also take input from a serial device. This is a very
|
||||||
useful option. To tell LILO to use the serial port:
|
useful option. To tell LILO to use the serial port:
|
||||||
In lilo.conf (global section):
|
In lilo.conf (global section)::
|
||||||
|
|
||||||
serial = 1,9600n8 (ttyS1, 9600 bd, no parity, 8 bits)
|
serial = 1,9600n8 (ttyS1, 9600 bd, no parity, 8 bits)
|
||||||
|
|
||||||
3. Adjust to kernel flags for the new kernel,
|
3. Adjust to kernel flags for the new kernel,
|
||||||
again in lilo.conf (kernel section)
|
again in lilo.conf (kernel section)::
|
||||||
|
|
||||||
append = "console=ttyS1,9600"
|
append = "console=ttyS1,9600"
|
||||||
|
|
||||||
4. Make sure a getty runs on the serial port so that you can login to
|
4. Make sure a getty runs on the serial port so that you can login to
|
||||||
it once the system is done booting. This is done by adding a line
|
it once the system is done booting. This is done by adding a line
|
||||||
like this to /etc/inittab (exact syntax depends on your getty):
|
like this to ``/etc/inittab`` (exact syntax depends on your getty)::
|
||||||
|
|
||||||
S1:23:respawn:/sbin/getty -L ttyS1 9600 vt100
|
S1:23:respawn:/sbin/getty -L ttyS1 9600 vt100
|
||||||
|
|
||||||
5. Init and /etc/ioctl.save
|
5. Init and ``/etc/ioctl.save``
|
||||||
|
|
||||||
Sysvinit remembers its stty settings in a file in /etc, called
|
Sysvinit remembers its stty settings in a file in ``/etc``, called
|
||||||
`/etc/ioctl.save'. REMOVE THIS FILE before using the serial
|
``/etc/ioctl.save``. REMOVE THIS FILE before using the serial
|
||||||
console for the first time, because otherwise init will probably
|
console for the first time, because otherwise init will probably
|
||||||
set the baudrate to 38400 (baudrate of the virtual console).
|
set the baudrate to 38400 (baudrate of the virtual console).
|
||||||
|
|
||||||
6. /dev/console and X
|
6. ``/dev/console`` and X
|
||||||
Programs that want to do something with the virtual console usually
|
Programs that want to do something with the virtual console usually
|
||||||
open /dev/console. If you have created the new /dev/console device,
|
open ``/dev/console``. If you have created the new ``/dev/console`` device,
|
||||||
and your console is NOT the virtual console some programs will fail.
|
and your console is NOT the virtual console some programs will fail.
|
||||||
Those are programs that want to access the VT interface, and use
|
Those are programs that want to access the VT interface, and use
|
||||||
/dev/console instead of /dev/tty0. Some of those programs are:
|
``/dev/console`` instead of ``/dev/tty0``. Some of those programs are::
|
||||||
|
|
||||||
Xfree86, svgalib, gpm, SVGATextMode
|
Xfree86, svgalib, gpm, SVGATextMode
|
||||||
|
|
||||||
It should be fixed in modern versions of these programs though.
|
It should be fixed in modern versions of these programs though.
|
||||||
|
|
||||||
Note that if you boot without a console= option (or with
|
Note that if you boot without a ``console=`` option (or with
|
||||||
console=/dev/tty0), /dev/console is the same as /dev/tty0. In that
|
``console=/dev/tty0``), ``/dev/console`` is the same as ``/dev/tty0``.
|
||||||
case everything will still work.
|
In that case everything will still work.
|
||||||
|
|
||||||
7. Thanks
|
7. Thanks
|
||||||
|
|
||||||
192
Documentation/admin-guide/sysfs-rules.rst
Normal file
@@ -0,0 +1,192 @@
|
|||||||
|
Rules on how to access information in sysfs
|
||||||
|
===========================================
|
||||||
|
|
||||||
|
The kernel-exported sysfs exports internal kernel implementation details
|
||||||
|
and depends on internal kernel structures and layout. It is agreed upon
|
||||||
|
by the kernel developers that the Linux kernel does not provide a stable
|
||||||
|
internal API. Therefore, there are aspects of the sysfs interface that
|
||||||
|
may not be stable across kernel releases.
|
||||||
|
|
||||||
|
To minimize the risk of breaking users of sysfs, which are in most cases
|
||||||
|
low-level userspace applications, with a new kernel release, the users
|
||||||
|
of sysfs must follow some rules to use an as-abstract-as-possible way to
|
||||||
|
access this filesystem. The current udev and HAL programs already
|
||||||
|
implement this and users are encouraged to plug, if possible, into the
|
||||||
|
abstractions these programs provide instead of accessing sysfs directly.
|
||||||
|
|
||||||
|
But if you really do want or need to access sysfs directly, please follow
|
||||||
|
the following rules and then your programs should work with future
|
||||||
|
versions of the sysfs interface.
|
||||||
|
|
||||||
|
- Do not use libsysfs
|
||||||
|
It makes assumptions about sysfs which are not true. Its API does not
|
||||||
|
offer any abstraction, it exposes all the kernel driver-core
|
||||||
|
implementation details in its own API. Therefore it is not better than
|
||||||
|
reading directories and opening the files yourself.
|
||||||
|
Also, it is not actively maintained, in the sense of reflecting the
|
||||||
|
current kernel development. The goal of providing a stable interface
|
||||||
|
to sysfs has failed; it causes more problems than it solves. It
|
||||||
|
violates many of the rules in this document.
|
||||||
|
|
||||||
|
- sysfs is always at ``/sys``
|
||||||
|
Parsing ``/proc/mounts`` is a waste of time. Other mount points are a
|
||||||
|
system configuration bug you should not try to solve. For test cases,
|
||||||
|
possibly support a ``SYSFS_PATH`` environment variable to overwrite the
|
||||||
|
application's behavior, but never try to search for sysfs. Never try
|
||||||
|
to mount it, if you are not an early boot script.
|
||||||
|
|
||||||
|
- devices are only "devices"
|
||||||
|
There is no such thing as class-, bus-, physical devices,
|
||||||
|
interfaces, and such that you can rely on in userspace. Everything is
|
||||||
|
just simply a "device". Class-, bus-, physical, ... types are just
|
||||||
|
kernel implementation details which should not be expected by
|
||||||
|
applications that look for devices in sysfs.
|
||||||
|
|
||||||
|
The properties of a device are:
|
||||||
|
|
||||||
|
- devpath (``/devices/pci0000:00/0000:00:1d.1/usb2/2-2/2-2:1.0``)
|
||||||
|
|
||||||
|
- identical to the DEVPATH value in the event sent from the kernel
|
||||||
|
at device creation and removal
|
||||||
|
- the unique key to the device at that point in time
|
||||||
|
- the kernel's path to the device directory without the leading
|
||||||
|
``/sys``, and always starting with a slash
|
||||||
|
- all elements of a devpath must be real directories. Symlinks
|
||||||
|
pointing to /sys/devices must always be resolved to their real
|
||||||
|
target and the target path must be used to access the device.
|
||||||
|
That way the devpath to the device matches the devpath of the
|
||||||
|
kernel used at event time.
|
||||||
|
- using or exposing symlink values as elements in a devpath string
|
||||||
|
is a bug in the application
|
||||||
|
|
||||||
|
- kernel name (``sda``, ``tty``, ``0000:00:1f.2``, ...)
|
||||||
|
|
||||||
|
- a directory name, identical to the last element of the devpath
|
||||||
|
- applications need to handle spaces and characters like ``!`` in
|
||||||
|
the name
|
||||||
|
|
||||||
|
- subsystem (``block``, ``tty``, ``pci``, ...)
|
||||||
|
|
||||||
|
- simple string, never a path or a link
|
||||||
|
- retrieved by reading the "subsystem"-link and using only the
|
||||||
|
last element of the target path
|
||||||
|
|
||||||
|
- driver (``tg3``, ``ata_piix``, ``uhci_hcd``)
|
||||||
|
|
||||||
|
- a simple string, which may contain spaces, never a path or a
|
||||||
|
link
|
||||||
|
- it is retrieved by reading the "driver"-link and using only the
|
||||||
|
last element of the target path
|
||||||
|
- devices which do not have "driver"-link just do not have a
|
||||||
|
driver; copying the driver value in a child device context is a
|
||||||
|
bug in the application
|
||||||
|
|
||||||
|
- attributes
|
||||||
|
|
||||||
|
- the files in the device directory or files below subdirectories
|
||||||
|
of the same device directory
|
||||||
|
- accessing attributes reached by a symlink pointing to another device,
|
||||||
|
like the "device"-link, is a bug in the application
|
||||||
|
|
||||||
|
Everything else is just a kernel driver-core implementation detail
|
||||||
|
that should not be assumed to be stable across kernel releases.
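The property rules above can be sketched in a few lines of Python. This is only an illustration, not an existing library: the function name ``device_properties`` and the ``sysfs_root`` parameter (handy for test trees) are assumptions made for this example.

```python
import os


def device_properties(path, sysfs_root="/sys"):
    """Return devpath, kernel name, subsystem and driver for a sysfs device.

    Hypothetical helper; `sysfs_root` exists only so a fake tree can be tested.
    """
    root = os.path.realpath(sysfs_root)
    # Resolve every symlink so the result matches the kernel's own DEVPATH.
    real = os.path.realpath(path)
    if not real.startswith(root + os.sep):
        raise ValueError("path does not resolve to a location below sysfs")
    devpath = real[len(root):]  # no leading /sys, always starts with a slash

    def link_name(name):
        link = os.path.join(real, name)
        if not os.path.islink(link):
            return None  # no link means no value; never borrow one from a parent
        # Only the last element of the link target is meaningful.
        return os.path.basename(os.readlink(link))

    return {
        "devpath": devpath,
        "kernel_name": os.path.basename(real),
        "subsystem": link_name("subsystem"),
        "driver": link_name("driver"),
    }
```

A device without a "driver"-link simply reports ``None`` for the driver; the helper never looks at a parent directory to fill the gap.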
|
||||||
|
|
||||||
|
- Properties of parent devices never belong in a child device.
|
||||||
|
Always look at the parent devices themselves for determining device
|
||||||
|
context properties. If the device ``eth0`` or ``sda`` does not have a
|
||||||
|
"driver"-link, then this device does not have a driver. Its value is empty.
|
||||||
|
Never copy any property of the parent-device into a child-device. Parent
|
||||||
|
device properties may change dynamically without any notice to the
|
||||||
|
child device.
|
||||||
|
|
||||||
|
- Hierarchy in a single device tree
|
||||||
|
There is only one valid place in sysfs where hierarchy can be examined
|
||||||
|
and this is below ``/sys/devices``.
|
||||||
|
It is planned that all device directories will end up in the tree
|
||||||
|
below this directory.
|
||||||
|
|
||||||
|
- Classification by subsystem
|
||||||
|
There are currently three places for classification of devices:
|
||||||
|
``/sys/block``, ``/sys/class`` and ``/sys/bus``. It is planned that these will
|
||||||
|
not contain any device directories themselves, but only flat lists of
|
||||||
|
symlinks pointing to the unified ``/sys/devices`` tree.
|
||||||
|
All three places have completely different rules on how to access
|
||||||
|
device information. It is planned to merge all three
|
||||||
|
classification directories into one place at ``/sys/subsystem``,
|
||||||
|
following the layout of the bus directories. All buses and
|
||||||
|
classes, including the converted block subsystem, will show up
|
||||||
|
there.
|
||||||
|
The devices belonging to a subsystem will create a symlink in the
|
||||||
|
"devices" directory at ``/sys/subsystem/<name>/devices``.
|
||||||
|
|
||||||
|
If ``/sys/subsystem`` exists, ``/sys/bus``, ``/sys/class`` and ``/sys/block``
|
||||||
|
can be ignored. If it does not exist, you always have to scan all three
|
||||||
|
places, as the kernel is free to move a subsystem from one place to
|
||||||
|
the other, as long as the devices are still reachable by the same
|
||||||
|
subsystem name.
|
||||||
|
|
||||||
|
Assuming ``/sys/class/<subsystem>`` and ``/sys/bus/<subsystem>``, or
|
||||||
|
``/sys/block`` and ``/sys/class/block`` are not interchangeable is a bug in
|
||||||
|
the application.
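As a rough sketch of the lookup order described above, the following Python function prefers ``/sys/subsystem`` and otherwise scans all three legacy locations. The function name, the ``sysfs_root`` parameter and the assumption that class devices appear directly below ``/sys/class/<name>`` are choices made for this example only.

```python
import os


def subsystem_devices(name, sysfs_root="/sys"):
    """Return the resolved device directories belonging to subsystem `name`.

    Hypothetical helper; `sysfs_root` exists only so a fake tree can be tested.
    """
    unified = os.path.join(sysfs_root, "subsystem")
    if os.path.isdir(unified):
        # The unified layout exists, so the legacy locations can be ignored.
        candidates = [os.path.join(unified, name, "devices")]
    else:
        # Otherwise scan every legacy location; a subsystem may live in any.
        candidates = [
            os.path.join(sysfs_root, "bus", name, "devices"),
            os.path.join(sysfs_root, "class", name),
        ]
        if name == "block":
            candidates.append(os.path.join(sysfs_root, "block"))

    found = set()
    for directory in candidates:
        if not os.path.isdir(directory):
            continue
        for entry in os.listdir(directory):
            full = os.path.join(directory, entry)
            # Keep device directories only, whether real or reached by symlink.
            if os.path.isdir(full):
                found.add(os.path.realpath(full))
    return sorted(found)
```

Because every entry is resolved to its real path, a device that is reachable from both ``/sys/block`` and ``/sys/class/block`` is reported once.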
|
||||||
|
|
||||||
|
- Block
|
||||||
|
The converted block subsystem at ``/sys/class/block`` or
|
||||||
|
``/sys/subsystem/block`` will contain the links for disks and partitions
|
||||||
|
at the same level, never in a hierarchy. Assuming the block subsystem to
|
||||||
|
contain only disks and not partition devices in the same flat list is
|
||||||
|
a bug in the application.
|
||||||
|
|
||||||
|
- "device"-link and <subsystem>:<kernel name>-links
|
||||||
|
Never depend on the "device"-link. The "device"-link is a workaround
|
||||||
|
for the old layout, where class devices are not created in
|
||||||
|
``/sys/devices/`` like the bus devices. If the link-resolving of a
|
||||||
|
device directory does not end in ``/sys/devices/``, you can use the
|
||||||
|
"device"-link to find the parent devices in ``/sys/devices/``. That is the
|
||||||
|
single valid use of the "device"-link; it must never appear in any
|
||||||
|
path as an element. Assuming the existence of the "device"-link for
|
||||||
|
a device in ``/sys/devices/`` is a bug in the application.
|
||||||
|
Accessing ``/sys/class/net/eth0/device`` is a bug in the application.
|
||||||
|
|
||||||
|
Never depend on the class-specific links back to the ``/sys/class``
|
||||||
|
directory. These links are also a workaround for the design mistake
|
||||||
|
that class devices are not created in ``/sys/devices``. If a device
|
||||||
|
directory does not contain directories for child devices, these links
|
||||||
|
may be used to find the child devices in ``/sys/class``. That is the single
|
||||||
|
valid use of these links; they must never appear in any path as an
|
||||||
|
element. Assuming the existence of these links for devices which are
|
||||||
|
real child device directories in the ``/sys/devices`` tree is a bug in
|
||||||
|
the application.
|
||||||
|
|
||||||
|
It is planned to remove all these links when all class device
|
||||||
|
directories live in ``/sys/devices``.
|
||||||
|
|
||||||
|
- Position of devices along device chain can change.
|
||||||
|
Never depend on a specific parent device position in the devpath,
|
||||||
|
or the chain of parent devices. The kernel is free to insert devices into
|
||||||
|
the chain. You must always request the parent device you are looking for
|
||||||
|
by its subsystem value. You need to walk up the chain until you find
|
||||||
|
the device that matches the expected subsystem. Depending on a specific
|
||||||
|
position of a parent device or exposing relative paths using ``../`` to
|
||||||
|
access the chain of parents is a bug in the application.
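Walking up the chain by subsystem, rather than by position, might look like the following Python sketch. The function name and the ``sysfs_root`` parameter are assumptions for this example; the subsystem is read from the "subsystem"-link as described earlier in this document.

```python
import os


def parent_with_subsystem(device_dir, subsystem, sysfs_root="/sys"):
    """Return the nearest parent of device_dir that belongs to `subsystem`.

    Hypothetical helper; returns None if no parent matches.
    """
    top = os.path.realpath(os.path.join(sysfs_root, "devices"))
    current = os.path.dirname(os.path.realpath(device_dir))
    # Stop once the walk leaves the /sys/devices tree.
    while current.startswith(top + os.sep):
        link = os.path.join(current, "subsystem")
        if os.path.islink(link):
            # Only the last element of the link target names the subsystem.
            if os.path.basename(os.readlink(link)) == subsystem:
                return current
        current = os.path.dirname(current)
    return None
```

The caller asks for "the parent in subsystem ``pci``" and never counts directory levels, so devices inserted into the chain by a newer kernel do not break the lookup.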
|
||||||
|
|
||||||
|
- When reading and writing sysfs device attribute files, avoid dependency
|
||||||
|
on specific error codes wherever possible. This minimizes coupling to
|
||||||
|
the error handling implementation within the kernel.
|
||||||
|
|
||||||
|
In general, failures to read or write sysfs device attributes shall
|
||||||
|
propagate errors wherever possible. Common errors include, but are not
|
||||||
|
limited to:
|
||||||
|
|
||||||
|
``-EIO``: The read or store operation is not supported, typically
|
||||||
|
returned by the sysfs system itself if the read or store pointer
|
||||||
|
is ``NULL``.
|
||||||
|
|
||||||
|
``-ENXIO``: The read or store operation failed.
|
||||||
|
|
||||||
|
Error codes will not be changed without good reason, and should a change
|
||||||
|
to error codes result in user-space breakage, it will be fixed, or the
|
||||||
|
offending change will be reverted.
|
||||||
|
|
||||||
|
Userspace applications can, however, expect the format and contents of
|
||||||
|
the attribute files to remain consistent in the absence of a version
|
||||||
|
attribute change in the context of a given attribute.
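One way for an application to avoid coupling to specific error codes is to treat every read failure alike. The Python sketch below assumes a hypothetical helper name, ``read_attribute``, and assumes that "not readable" can be reported to the caller as ``None``.

```python
import os


def read_attribute(device_dir, name):
    """Return the stripped text of a sysfs attribute, or None if unreadable.

    Hypothetical helper, not part of any existing library.
    """
    try:
        with open(os.path.join(device_dir, name), "r") as handle:
            return handle.read().strip()
    except OSError:
        # Handle all failures the same way instead of branching on
        # EIO, ENXIO or any other specific error code.
        return None
```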
|
||||||
289
Documentation/admin-guide/sysrq.rst
Normal file
@@ -0,0 +1,289 @@
|
|||||||
|
Linux Magic System Request Key Hacks
|
||||||
|
====================================
|
||||||
|
|
||||||
|
Documentation for sysrq.c
|
||||||
|
|
||||||
|
What is the magic SysRq key?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
It is a 'magical' key combo you can hit which the kernel will respond to
|
||||||
|
regardless of whatever else it is doing, unless it is completely locked up.
|
||||||
|
|
||||||
|
How do I enable the magic SysRq key?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
You need to say "yes" to 'Magic SysRq key (CONFIG_MAGIC_SYSRQ)' when
|
||||||
|
configuring the kernel. When running a kernel with SysRq compiled in,
|
||||||
|
/proc/sys/kernel/sysrq controls the functions allowed to be invoked via
|
||||||
|
the SysRq key. The default value in this file is set by the
|
||||||
|
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE config symbol, which itself defaults
|
||||||
|
to 1. Here is the list of possible values in /proc/sys/kernel/sysrq:
|
||||||
|
|
||||||
|
- 0 - disable sysrq completely
|
||||||
|
- 1 - enable all functions of sysrq
|
||||||
|
- >1 - bitmask of allowed sysrq functions (see below for detailed function
|
||||||
|
description)::
|
||||||
|
|
||||||
|
2 = 0x2 - enable control of console logging level
|
||||||
|
4 = 0x4 - enable control of keyboard (SAK, unraw)
|
||||||
|
8 = 0x8 - enable debugging dumps of processes etc.
|
||||||
|
16 = 0x10 - enable sync command
|
||||||
|
32 = 0x20 - enable remount read-only
|
||||||
|
64 = 0x40 - enable signalling of processes (term, kill, oom-kill)
|
||||||
|
128 = 0x80 - allow reboot/poweroff
|
||||||
|
256 = 0x100 - allow nicing of all RT tasks
|
||||||
|
|
||||||
|
You can set the value in the file by the following command::
|
||||||
|
|
||||||
|
echo "number" >/proc/sys/kernel/sysrq
|
||||||
|
|
||||||
|
The number may be written here either as decimal or as hexadecimal
|
||||||
|
with the 0x prefix. CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE must always be
|
||||||
|
written in hexadecimal.
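To make the bitmask concrete, here is a small Python sketch that decodes a value as it would be written to ``/proc/sys/kernel/sysrq``. The bit meanings are copied from the list above; the names ``SYSRQ_BITS``, ``parse_value`` and ``allowed_functions`` are made up for this example.

```python
# Bit meanings as listed in this document.
SYSRQ_BITS = {
    0x2: "control of console logging level",
    0x4: "control of keyboard (SAK, unraw)",
    0x8: "debugging dumps of processes etc.",
    0x10: "sync command",
    0x20: "remount read-only",
    0x40: "signalling of processes (term, kill, oom-kill)",
    0x80: "reboot/poweroff",
    0x100: "nicing of all RT tasks",
}


def parse_value(text):
    """Accept decimal ("176") or 0x-prefixed hexadecimal ("0xb0") input."""
    return int(text.strip(), 0)


def allowed_functions(value):
    """Return the descriptions of the sysrq functions enabled by `value`."""
    if value == 0:
        return []  # sysrq disabled completely
    if value == 1:
        return list(SYSRQ_BITS.values())  # all functions enabled
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]
```

For instance, 176 (``0xb0``) is 128 + 32 + 16, so it allows reboot/poweroff, remount read-only and sync, and nothing else.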
|
||||||
|
|
||||||
|
Note that the value of ``/proc/sys/kernel/sysrq`` influences only the invocation
|
||||||
|
via a keyboard. Invocation of any operation via ``/proc/sysrq-trigger`` is
|
||||||
|
always allowed (by a user with admin privileges).
|
||||||
|
|
||||||
|
How do I use the magic SysRq key?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
On x86 - You press the key combo :kbd:`ALT-SysRq-<command key>`.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
Some
|
||||||
|
keyboards may not have a key labeled 'SysRq'. The 'SysRq' key is
|
||||||
|
also known as the 'Print Screen' key. Also some keyboards cannot
|
||||||
|
handle so many keys being pressed at the same time, so you might
|
||||||
|
have better luck if you press :kbd:`Alt`, press :kbd:`SysRq`,
|
||||||
|
release :kbd:`SysRq`, press :kbd:`<command key>`, release everything.
|
||||||
|
|
||||||
|
On SPARC - You press :kbd:`ALT-STOP-<command key>`, I believe.
|
||||||
|
|
||||||
|
On the serial console (PC style standard serial ports only)
|
||||||
|
You send a ``BREAK``, then within 5 seconds a command key. Sending
|
||||||
|
``BREAK`` twice is interpreted as a normal BREAK.
|
||||||
|
|
||||||
|
On PowerPC
|
||||||
|
Press :kbd:`ALT - Print Screen` (or :kbd:`F13`) - :kbd:`<command key>`,
|
||||||
|
:kbd:`Print Screen` (or :kbd:`F13`) - :kbd:`<command key>` may suffice.
|
||||||
|
|
||||||
|
On other
|
||||||
|
If you know of the key combos for other architectures, please
|
||||||
|
let me know so I can add them to this section.
|
||||||
|
|
||||||
|
On all
|
||||||
|
Write a character to ``/proc/sysrq-trigger``, e.g.::
|
||||||
|
|
||||||
|
echo t > /proc/sysrq-trigger
|
||||||
|
|
||||||
|
What are the 'command' keys?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
=========== ===================================================================
|
||||||
|
Command Function
|
||||||
|
=========== ===================================================================
|
||||||
|
``b`` Will immediately reboot the system without syncing or unmounting
|
||||||
|
your disks.
|
||||||
|
|
||||||
|
``c`` Will perform a system crash by a NULL pointer dereference.
|
||||||
|
A crashdump will be taken if configured.
|
||||||
|
|
||||||
|
``d`` Shows all locks that are held.
|
||||||
|
|
||||||
|
``e`` Send a SIGTERM to all processes, except for init.
|
||||||
|
|
||||||
|
``f`` Will call the oom killer to kill a memory hog process, but do not
|
||||||
|
panic if nothing can be killed.
|
||||||
|
|
||||||
|
``g`` Used by kgdb (kernel debugger)
|
||||||
|
|
||||||
|
``h`` Will display help (actually any other key than those listed
|
||||||
|
here will display help, but ``h`` is easy to remember :-)
|
||||||
|
|
||||||
|
``i`` Send a SIGKILL to all processes, except for init.
|
||||||
|
|
||||||
|
``j`` Forcibly "Just thaw it" - filesystems frozen by the FIFREEZE ioctl.
|
||||||
|
|
||||||
|
``k`` Secure Access Key (SAK) Kills all programs on the current virtual
|
||||||
|
console. NOTE: See important comments below in SAK section.
|
||||||
|
|
||||||
|
``l`` Shows a stack backtrace for all active CPUs.
|
||||||
|
|
||||||
|
``m`` Will dump current memory info to your console.
|
||||||
|
|
||||||
|
``n`` Used to make RT tasks nice-able
|
||||||
|
|
||||||
|
``o`` Will shut your system off (if configured and supported).
|
||||||
|
|
||||||
|
``p`` Will dump the current registers and flags to your console.
|
||||||
|
|
||||||
|
``q`` Will dump per CPU lists of all armed hrtimers (but NOT regular
|
||||||
|
timer_list timers) and detailed information about all
|
||||||
|
clockevent devices.
|
||||||
|
|
||||||
|
``r`` Turns off keyboard raw mode and sets it to XLATE.
|
||||||
|
|
||||||
|
``s`` Will attempt to sync all mounted filesystems.
|
||||||
|
|
||||||
|
``t`` Will dump a list of current tasks and their information to your
|
||||||
|
console.
|
||||||
|
|
||||||
|
``u`` Will attempt to remount all mounted filesystems read-only.
|
||||||
|
|
||||||
|
``v`` Forcefully restores framebuffer console
|
||||||
|
``v`` Causes ETM buffer dump [ARM-specific]
|
||||||
|
|
||||||
|
``w`` Dumps tasks that are in uninterruptible (blocked) state.
|
||||||
|
|
||||||
|
``x`` Used by xmon interface on ppc/powerpc platforms.
|
||||||
|
Show global PMU Registers on sparc64.
|
||||||
|
Dump all TLB entries on MIPS.
|
||||||
|
|
||||||
|
``y`` Show global CPU Registers [SPARC-64 specific]
|
||||||
|
|
||||||
|
``z`` Dump the ftrace buffer
|
||||||
|
|
||||||
|
``0``-``9`` Sets the console log level, controlling which kernel messages
|
||||||
|
will be printed to your console. (``0``, for example would make
|
||||||
|
it so that only emergency messages like PANICs or OOPSes would
|
||||||
|
make it to your console.)
|
||||||
|
=========== ===================================================================
|
||||||
|
|
||||||
|
Okay, so what can I use them for?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Well, unraw(r) is very handy when your X server or a svgalib program crashes.
|
||||||
|
|
||||||
|
sak(k) (Secure Access Key) is useful when you want to be sure there is no
|
||||||
|
trojan program running at console which could grab your password
|
||||||
|
when you would try to login. It will kill all programs on given console,
|
||||||
|
thus letting you make sure that the login prompt you see is actually
|
||||||
|
the one from init, not some trojan program.
|
||||||
|
|
||||||
|
.. important::
|
||||||
|
|
||||||
|
In its true form it is not a true SAK like the one in a
|
||||||
|
c2 compliant system, and it should not be mistaken as
|
||||||
|
such.
|
||||||
|
|
||||||
|
It seems others find it useful as a System Attention Key, which is
|
||||||
|
useful when you want to exit a program that will not let you switch consoles.
|
||||||
|
(For example, X or a svgalib program.)
|
||||||
|
|
||||||
|
``reboot(b)`` is good when you're unable to shut down. But you should also
|
||||||
|
``sync(s)`` and ``umount(u)`` first.
|
||||||
|
|
||||||
|
``crash(c)`` can be used to manually trigger a crashdump when the system is hung.
|
||||||
|
Note that this just triggers a crash if there is no dump mechanism available.
|
||||||
|
|
||||||
|
``sync(s)`` is great when your system is locked up; it allows you to sync your
|
||||||
|
disks and will certainly lessen the chance of data loss and fscking. Note
|
||||||
|
that the sync hasn't taken place until you see the "OK" and "Done" appear
|
||||||
|
on the screen. (If the kernel is really in strife, you may not ever get the
|
||||||
|
OK or Done message...)
|
||||||
|
|
||||||
|
``umount(u)`` is basically useful in the same ways as ``sync(s)``. I generally
|
||||||
|
``sync(s)``, ``umount(u)``, then ``reboot(b)`` when my system locks. It's saved
|
||||||
|
me many a fsck. Again, the unmount (remount read-only) hasn't taken place until
|
||||||
|
you see the "OK" and "Done" message appear on the screen.
|
||||||
|
|
||||||
|
The loglevels ``0``-``9`` are useful when your console is being flooded with
|
||||||
|
kernel messages you do not want to see. Selecting ``0`` will prevent all but
|
||||||
|
the most urgent kernel messages from reaching your console. (They will
|
||||||
|
still be logged if syslogd/klogd are alive, though.)
|
||||||
|
|
||||||
|
``term(e)`` and ``kill(i)`` are useful if you have some sort of runaway process
|
||||||
|
you are unable to kill any other way, especially if it's spawning other
|
||||||
|
processes.
|
||||||
|
|
||||||
|
"just thaw ``it(j)``" is useful if your system becomes unresponsive due to a
|
||||||
|
frozen (probably root) filesystem via the FIFREEZE ioctl.
|
||||||
|
|
||||||
|
Sometimes SysRq seems to get 'stuck' after using it, what can I do?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
That happens to me, also. I've found that tapping shift, alt, and control
|
||||||
|
on both sides of the keyboard, and hitting an invalid sysrq sequence again
|
||||||
|
will fix the problem. (i.e., something like :kbd:`alt-sysrq-z`). Switching to
|
||||||
|
another virtual console (:kbd:`ALT+Fn`) and then back again should also help.
|
||||||
|
|
||||||
|
I hit SysRq, but nothing seems to happen, what's wrong?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
There are some keyboards that produce a different keycode for SysRq than the
|
||||||
|
pre-defined value of 99 (see ``KEY_SYSRQ`` in ``include/linux/input.h``), or
|
||||||
|
which don't have a SysRq key at all. In these cases, run ``showkey -s`` to find
|
||||||
|
an appropriate scancode sequence, and use ``setkeycodes <sequence> 99`` to map
|
||||||
|
this sequence to the usual SysRq code (e.g., ``setkeycodes e05b 99``). It's
|
||||||
|
probably best to put this command in a boot script. Oh, and by the way, you
|
||||||
|
exit ``showkey`` by not typing anything for ten seconds.
|
||||||
|
|
||||||
|
I want to add SysRQ key events to a module, how does it work?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
In order to register a basic function with the table, you must first include
|
||||||
|
the header ``include/linux/sysrq.h``; this will define everything else you need.
|
||||||
|
Next, you must create a ``sysrq_key_op`` struct, and populate it with A) the key
|
||||||
|
handler function you will use, B) a help_msg string, that will print when SysRQ
|
||||||
|
prints help, and C) an action_msg string, that will print right before your
|
||||||
|
handler is called. Your handler must conform to the prototype in 'sysrq.h'.
|
||||||
|
|
||||||
|
After the ``sysrq_key_op`` is created, you can call the kernel function
|
||||||
|
``register_sysrq_key(int key, struct sysrq_key_op *op_p);`` this will
|
||||||
|
register the operation pointed to by ``op_p`` at table key 'key',
|
||||||
|
if that slot in the table is blank. At module unload time, you must call
|
||||||
|
the function ``unregister_sysrq_key(int key, struct sysrq_key_op *op_p)``, which
|
||||||
|
will remove the key op pointed to by 'op_p' from the key 'key', if and only if
|
||||||
|
it is currently registered in that slot. This is in case the slot has been
|
||||||
|
overwritten since you registered it.
|
||||||
|
|
||||||
|
The Magic SysRQ system works by registering key operations against a key op
|
||||||
|
lookup table, which is defined in 'drivers/tty/sysrq.c'. This key table has
|
||||||
|
a number of operations registered into it at compile time, but is mutable,
|
||||||
|
and two functions are exported as the interface to it::
|
||||||
|
|
||||||
|
register_sysrq_key and unregister_sysrq_key.
|
||||||
|
|
||||||
|
Of course, never ever leave an invalid pointer in the table. I.e., when
|
||||||
|
your module that called register_sysrq_key() exits, it must call
|
||||||
|
unregister_sysrq_key() to clean up the sysrq key table entry that it used.
|
||||||
|
Null pointers in the table are always safe. :)
|
||||||
|
|
||||||
|
If for some reason you feel the need to call the handle_sysrq function from
|
||||||
|
within a function called by handle_sysrq, you must be aware that you are in
|
||||||
|
a lock (you are also in an interrupt handler, which means don't sleep!), so
|
||||||
|
you must call ``__handle_sysrq_nolock`` instead.
|
||||||
|
|
||||||
|
When I hit a SysRq key combination only the header appears on the console?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Sysrq output is subject to the same console loglevel control as all
|
||||||
|
other console output. This means that if the kernel was booted 'quiet'
|
||||||
|
as is common on distro kernels the output may not appear on the actual
|
||||||
|
console, even though it will appear in the dmesg buffer, and be accessible
|
||||||
|
via the dmesg command and to the consumers of ``/proc/kmsg``. As a specific
|
||||||
|
exception the header line from the sysrq command is passed to all console
|
||||||
|
consumers as if the current loglevel was maximum. If only the header
|
||||||
|
is emitted it is almost certain that the kernel loglevel is too low.
|
||||||
|
Should you require the output on the console channel then you will need
|
||||||
|
to temporarily up the console loglevel using :kbd:`alt-sysrq-8` or::
|
||||||
|
|
||||||
|
echo 8 > /proc/sysrq-trigger
|
||||||
|
|
||||||
|
Remember to return the loglevel to normal after triggering the sysrq
|
||||||
|
command you are interested in.
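The raise, trigger, restore sequence can be scripted. The Python sketch below is an illustration only: the function name is hypothetical, the caller has to supply the loglevel that counts as "normal" on the system, and writing to the real trigger file requires admin privileges.

```python
def trigger_with_full_output(command_key, normal_level,
                             trigger_path="/proc/sysrq-trigger"):
    """Raise the console loglevel to 8, send one sysrq command, restore.

    Hypothetical helper; `normal_level` (0-9) is chosen by the caller.
    """
    if not 0 <= normal_level <= 9:
        raise ValueError("loglevel keys only cover 0 through 9")

    def send(char):
        # Each sysrq command is a single character written to the trigger.
        with open(trigger_path, "w") as handle:
            handle.write(char)

    send("8")  # same effect as alt-sysrq-8
    try:
        send(command_key)
    finally:
        # Return the loglevel to normal even if the command write failed.
        send(str(normal_level))
```

Calling ``trigger_with_full_output("t", 4)`` would dump the task list at full verbosity and then drop the console loglevel back to 4.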
|
||||||
|
|
||||||
|
I have more questions, who can I ask?
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Just ask them on the linux-kernel mailing list:
|
||||||
|
linux-kernel@vger.kernel.org
|
||||||
|
|
||||||
|
Credits
|
||||||
|
~~~~~~~
|
||||||
|
|
||||||
|
Written by Mydraal <vulpyne@vulpyne.net>
|
||||||
|
Updated by Adam Sulmicki <adam@cfar.umd.edu>
|
||||||
|
Updated by Jeremy M. Dolan <jmd@turbogeek.org> 2001/01/28 10:15:59
|
||||||
|
Added to by Crutcher Dunnavant <crutcher+kernel@datastacks.com>
|
||||||
59
Documentation/admin-guide/tainted-kernels.rst
Normal file
@@ -0,0 +1,59 @@
|
|||||||
|
Tainted kernels
|
||||||
|
---------------
|
||||||
|
|
||||||
|
Some oops reports contain the string **'Tainted: '** after the program
|
||||||
|
counter. This indicates that the kernel has been tainted by some
|
||||||
|
mechanism. The string is followed by a series of position-sensitive
|
||||||
|
characters, each representing a particular tainted value.
|
||||||
|
|
||||||
|
1) ``G`` if all modules loaded have a GPL or compatible license, ``P`` if
|
||||||
|
any proprietary module has been loaded. Modules without a
|
||||||
|
MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
|
||||||
|
insmod as GPL compatible are assumed to be proprietary.
|
||||||
|
|
||||||
|
2) ``F`` if any module was force loaded by ``insmod -f``, ``' '`` if all
|
||||||
|
modules were loaded normally.
|
||||||
|
|
||||||
|
3) ``S`` if the oops occurred on an SMP kernel running on hardware that
|
||||||
|
hasn't been certified as safe to run multiprocessor.
|
||||||
|
Currently this occurs only on various Athlons that are not
|
||||||
|
SMP capable.
|
||||||
|
|
||||||
|
4) ``R`` if a module was force unloaded by ``rmmod -f``, ``' '`` if all
|
||||||
|
modules were unloaded normally.
|
||||||
|
|
||||||
|
5) ``M`` if any processor has reported a Machine Check Exception,
|
||||||
|
``' '`` if no Machine Check Exceptions have occurred.
|
||||||
|
|
||||||
|
6) ``B`` if a page-release function has found a bad page reference or
|
||||||
|
some unexpected page flags.
|
||||||
|
|
||||||
|
7) ``U`` if a user or user application specifically requested that the
|
||||||
|
Tainted flag be set, ``' '`` otherwise.
|
||||||
|
|
||||||
|
8) ``D`` if the kernel has died recently, i.e. there was an OOPS or BUG.
|
||||||
|
|
||||||
|
9) ``A`` if the ACPI table has been overridden.
|
||||||
|
|
||||||
|
10) ``W`` if a warning has previously been issued by the kernel.
|
||||||
|
(Though some warnings may set more specific taint flags.)
|
||||||
|
|
||||||
|
11) ``C`` if a staging driver has been loaded.
|
||||||
|
|
||||||
|
12) ``I`` if the kernel is working around a severe bug in the platform
|
||||||
|
firmware (BIOS or similar).
|
||||||
|
|
||||||
|
13) ``O`` if an externally-built ("out-of-tree") module has been loaded.
|
||||||
|
|
||||||
|
14) ``E`` if an unsigned module has been loaded in a kernel supporting
|
||||||
|
module signature.
|
||||||
|
|
||||||
|
15) ``L`` if a soft lockup has previously occurred on the system.
|
||||||
|
|
||||||
|
16) ``K`` if the kernel has been live patched.
|
||||||
|
|
||||||
|
The primary reason for the **'Tainted: '** string is to tell kernel
|
||||||
|
debuggers if this is a clean kernel or if anything unusual has
|
||||||
|
occurred. Tainting is permanent: even if an offending module is
|
||||||
|
unloaded, the tainted value remains to indicate that the kernel is not
|
||||||
|
trustworthy.
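A small Python sketch can turn the flag characters into readable text. It assumes the caller has already cut out just the characters that follow ``Tainted:`` in the oops report; the names ``TAINT_FLAGS`` and ``describe_taint`` are made up for this example, and the descriptions paraphrase the list above.

```python
# Flag letters and meanings, paraphrased from the list in this document.
TAINT_FLAGS = {
    "G": "all loaded modules have a GPL or compatible license",
    "P": "a proprietary module has been loaded",
    "F": "a module was force loaded",
    "S": "SMP kernel on hardware not certified as SMP safe",
    "R": "a module was force unloaded",
    "M": "a processor reported a Machine Check Exception",
    "B": "a bad page reference or unexpected page flags were found",
    "U": "a user or application requested the taint",
    "D": "the kernel has died recently (OOPS or BUG)",
    "A": "the ACPI table has been overridden",
    "W": "a warning has previously been issued by the kernel",
    "C": "a staging driver has been loaded",
    "I": "working around a severe platform firmware bug",
    "O": "an out-of-tree module has been loaded",
    "E": "an unsigned module has been loaded",
    "L": "a soft lockup has previously occurred",
    "K": "the kernel has been live patched",
}


def describe_taint(flags):
    """Return (letter, meaning) pairs for the flag characters that are set."""
    # A space in a position means that particular flag is not set.
    return [(ch, TAINT_FLAGS[ch]) for ch in flags if ch in TAINT_FLAGS]
```

Passing only the flag characters keeps the helper from misreading capital letters in whatever else appears on the same line of the report.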
|
||||||
@@ -1,12 +1,16 @@
|
|||||||
|
Unicode support
|
||||||
|
===============
|
||||||
|
|
||||||
Last update: 2005-01-17, version 1.4
|
Last update: 2005-01-17, version 1.4
|
||||||
|
|
||||||
This file is maintained by H. Peter Anvin <unicode@lanana.org> as part
|
This file is maintained by H. Peter Anvin <unicode@lanana.org> as part
|
||||||
of the Linux Assigned Names And Numbers Authority (LANANA) project.
|
of the Linux Assigned Names And Numbers Authority (LANANA) project.
|
||||||
The current version can be found at:
|
The current version can be found at:
|
||||||
|
|
||||||
http://www.lanana.org/docs/unicode/unicode.txt
|
http://www.lanana.org/docs/unicode/unicode.txt
|
||||||
|
|
||||||
------------------------
|
Introduction
|
||||||
|
------------
|
||||||
|
|
||||||
The Linux kernel code has been rewritten to use Unicode to map
|
The Linux kernel code has been rewritten to use Unicode to map
|
||||||
characters to fonts. By downloading a single Unicode-to-font table,
|
characters to fonts. By downloading a single Unicode-to-font table,
|
||||||
@@ -16,12 +20,14 @@ the font as indicated.
|
|||||||
This changes the semantics of the eight-bit character tables subtly.
|
This changes the semantics of the eight-bit character tables subtly.
|
||||||
The four character tables are now:
|
The four character tables are now:
|
||||||
|
|
||||||
|
=============== =============================== ================
|
||||||
Map symbol Map name Escape code (G0)
|
Map symbol Map name Escape code (G0)
|
||||||
|
=============== =============================== ================
|
||||||
LAT1_MAP Latin-1 (ISO 8859-1) ESC ( B
|
LAT1_MAP Latin-1 (ISO 8859-1) ESC ( B
|
||||||
GRAF_MAP DEC VT100 pseudographics ESC ( 0
|
GRAF_MAP DEC VT100 pseudographics ESC ( 0
|
||||||
IBMPC_MAP IBM code page 437 ESC ( U
|
IBMPC_MAP IBM code page 437 ESC ( U
|
||||||
USER_MAP User defined ESC ( K
|
USER_MAP User defined ESC ( K
|
||||||
|
=============== =============================== ================
|
||||||
|
|
||||||
In particular, ESC ( U is no longer "straight to font", since the font
|
In particular, ESC ( U is no longer "straight to font", since the font
|
||||||
might be completely different than the IBM character set. This
|
might be completely different than the IBM character set. This
|
||||||
@@ -55,10 +61,12 @@ In addition, the following characters not present in Unicode 1.1.4
|
|||||||
have been defined; these are used by the DEC VT graphics map. [v1.2]
|
have been defined; these are used by the DEC VT graphics map. [v1.2]
|
||||||
THIS USE IS OBSOLETE AND SHOULD NO LONGER BE USED; PLEASE SEE BELOW.
|
THIS USE IS OBSOLETE AND SHOULD NO LONGER BE USED; PLEASE SEE BELOW.
|
||||||
|
|
||||||
|
====== ======================================
|
||||||
U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1
|
U+F800 DEC VT GRAPHICS HORIZONTAL LINE SCAN 1
|
||||||
U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3
|
U+F801 DEC VT GRAPHICS HORIZONTAL LINE SCAN 3
|
||||||
U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7
|
U+F803 DEC VT GRAPHICS HORIZONTAL LINE SCAN 7
|
||||||
U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9
|
U+F804 DEC VT GRAPHICS HORIZONTAL LINE SCAN 9
|
||||||
|
====== ======================================
|
||||||
|
|
||||||
The DEC VT220 uses a 6x10 character matrix, and these characters form
|
The DEC VT220 uses a 6x10 character matrix, and these characters form
|
||||||
a smooth progression in the DEC VT graphics character set. I have
|
a smooth progression in the DEC VT graphics character set. I have
|
||||||
@@ -74,10 +82,12 @@ keyboard symbols that are unlikely to ever be added to Unicode proper
|
|||||||
since they are horribly vendor-specific. This, of course, is an
|
since they are horribly vendor-specific. This, of course, is an
|
||||||
excellent example of horrible design.
|
excellent example of horrible design.
|
||||||
|
|
||||||
|
====== ======================================
|
||||||
U+F810 KEYBOARD SYMBOL FLYING FLAG
|
U+F810 KEYBOARD SYMBOL FLYING FLAG
|
||||||
U+F811 KEYBOARD SYMBOL PULLDOWN MENU
|
U+F811 KEYBOARD SYMBOL PULLDOWN MENU
|
||||||
U+F812 KEYBOARD SYMBOL OPEN APPLE
|
U+F812 KEYBOARD SYMBOL OPEN APPLE
|
||||||
U+F813 KEYBOARD SYMBOL SOLID APPLE
|
U+F813 KEYBOARD SYMBOL SOLID APPLE
|
||||||
|
====== ======================================
|
||||||
|
|
||||||
Klingon language support
|
Klingon language support
|
||||||
------------------------
|
------------------------
|
||||||
@@ -99,7 +109,9 @@ of the dingbats/symbols/forms type and this is a language, I have
|
|||||||
located it at the end, on a 16-cell boundary in keeping with standard
|
located it at the end, on a 16-cell boundary in keeping with standard
|
||||||
Unicode practice.
|
Unicode practice.
|
||||||
|
|
||||||
NOTE: This range is now officially managed by the ConScript Unicode
|
.. note::
|
||||||
|
|
||||||
|
This range is now officially managed by the ConScript Unicode
|
||||||
Registry. The normative reference is at:
|
Registry. The normative reference is at:
|
||||||
|
|
||||||
http://www.evertype.com/standards/csur/klingon.html
|
http://www.evertype.com/standards/csur/klingon.html
|
||||||
@@ -112,6 +124,7 @@ However, since the set of symbols appear to be consistent throughout,
|
|||||||
with only the actual shapes being different, in keeping with standard
|
with only the actual shapes being different, in keeping with standard
|
||||||
Unicode practice these differences are considered font variants.
|
Unicode practice these differences are considered font variants.
|
||||||
|
|
||||||
|
====== =======================================================
|
||||||
U+F8D0 KLINGON LETTER A
|
U+F8D0 KLINGON LETTER A
|
||||||
U+F8D1 KLINGON LETTER B
|
U+F8D1 KLINGON LETTER B
|
||||||
U+F8D2 KLINGON LETTER CH
|
U+F8D2 KLINGON LETTER CH
|
||||||
@@ -155,6 +168,7 @@ U+F8F9 KLINGON DIGIT NINE
|
|||||||
U+F8FD KLINGON COMMA
|
U+F8FD KLINGON COMMA
|
||||||
U+F8FE KLINGON FULL STOP
|
U+F8FE KLINGON FULL STOP
|
||||||
U+F8FF KLINGON SYMBOL FOR EMPIRE
|
U+F8FF KLINGON SYMBOL FOR EMPIRE
|
||||||
|
====== =======================================================
|
||||||
|
|
||||||
Other Fictional and Artificial Scripts
|
Other Fictional and Artificial Scripts
|
||||||
--------------------------------------
|
--------------------------------------
|
||||||
66
Documentation/admin-guide/vga-softcursor.rst
Normal file
@@ -0,0 +1,66 @@
|
|||||||
|
Software cursor for VGA
|
||||||
|
=======================
|
||||||
|
|
||||||
|
by Pavel Machek <pavel@atrey.karlin.mff.cuni.cz>
|
||||||
|
and Martin Mares <mj@atrey.karlin.mff.cuni.cz>
|
||||||
|
|
||||||
|
Linux now has some ability to manipulate cursor appearance. Normally, you
|
||||||
|
can set the size of hardware cursor (and also work around some ugly bugs in
|
||||||
|
those miserable Trident cards [#f1]_). You can now play a few new tricks:
|
||||||
|
you can make your cursor look
|
||||||
|
|
||||||
|
like a non-blinking red block, make it inverse background of the character it's
|
||||||
|
over or to highlight that character and still choose whether the original
|
||||||
|
hardware cursor should remain visible or not. There may be other things I have
|
||||||
|
never thought of.
|
||||||
|
|
||||||
|
The cursor appearance is controlled by a ``<ESC>[?1;2;3c`` escape sequence
|
||||||
|
where 1, 2 and 3 are parameters described below. If you omit any of them,
|
||||||
|
they will default to zeroes.
|
||||||
|
|
||||||
|
first Parameter
|
||||||
|
specifies cursor size::
|
||||||
|
|
||||||
|
0=default
|
||||||
|
1=invisible
|
||||||
|
2=underline,
|
||||||
|
...
|
||||||
|
8=full block
|
||||||
|
+ 16 if you want the software cursor to be applied
|
||||||
|
+ 32 if you want to always change the background color
|
||||||
|
+ 64 if you dislike having the background the same as the
|
||||||
|
foreground.
|
||||||
|
|
||||||
|
Highlights are ignored for the last two flags.
|
||||||
|
|
||||||
|
second parameter
|
||||||
|
selects character attribute bits you want to change
|
||||||
|
(by simply XORing them with the value of this parameter). On standard
|
||||||
|
VGA, the high four bits specify background and the low four the
|
||||||
|
foreground. In both groups, low three bits set color (as in normal
|
||||||
|
color codes used by the console) and the most significant one turns
|
||||||
|
on highlight (or sometimes blinking -- it depends on the configuration
|
||||||
|
of your VGA).
|
||||||
|
|
||||||
|
third parameter
|
||||||
|
consists of character attribute bits you want to set.
|
||||||
|
|
||||||
|
Bit setting takes place before bit toggling, so you can simply clear a
|
||||||
|
bit by including it in both the set mask and the toggle mask.
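The set-then-toggle order described above can be modelled in a few lines. This is a minimal sketch rather than kernel code: ``apply_cursor_attr`` and its argument names are hypothetical, and the attribute is treated as a plain 8-bit VGA attribute byte.

```python
def apply_cursor_attr(attr, toggle_mask, set_mask):
    """Model the second and third parameters on an 8-bit attribute byte.

    Bits in set_mask are set first, then bits in toggle_mask are XORed,
    which is the order given in the text.
    """
    attr |= set_mask      # third parameter: bits to set
    attr ^= toggle_mask   # second parameter: bits to toggle
    return attr & 0xFF


# Clearing bit 3 (foreground highlight): include it in both masks.
cleared = apply_cursor_attr(0x0F, toggle_mask=0x08, set_mask=0x08)

# Setting only bit 6 (0x40) on an attribute of 0x07.
with_bit6 = apply_cursor_attr(0x07, toggle_mask=0x00, set_mask=0x40)
```

Because the set step runs first, a bit present in both masks always ends up cleared, whatever its previous value was.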
|
||||||
|
|
||||||
|
.. [#f1] see ``#define TRIDENT_GLITCH`` in ``drivers/video/vgacon.c``.
|
||||||
|
|
||||||
|
Examples
|
||||||
|
--------
|
||||||
|
|
||||||
|
To get normal blinking underline, use::
|
||||||
|
|
||||||
|
echo -e '\033[?2c'
|
||||||
|
|
||||||
|
To get blinking block, use::
|
||||||
|
|
||||||
|
echo -e '\033[?6c'
|
||||||
|
|
||||||
|
To get red non-blinking block, use::
|
||||||
|
|
||||||
|
echo -e '\033[?17;0;64c'
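The same sequences can be assembled programmatically. The sketch below only builds the string; ``cursor_sequence`` and the two constant names are hypothetical helpers, and the numeric values are the ones quoted in the text above.

```python
INVISIBLE = 1          # first-parameter value for an invisible hardware cursor
SOFTWARE_CURSOR = 16   # flag added to the first parameter


def cursor_sequence(size=0, toggle_mask=0, set_mask=0):
    """Return the <ESC>[?1;2;3c string for the three parameters."""
    return "\033[?{};{};{}c".format(size, toggle_mask, set_mask)


# Same parameters as the red non-blinking block example: 17;0;64.
red_block = cursor_sequence(INVISIBLE + SOFTWARE_CURSOR, 0, 64)
```

Writing ``red_block`` to a VGA text console should have the same effect as the ``echo -e`` command above. Omitted parameters default to zero, so spelling them out as zeroes is expected to be equivalent.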
|
||||||
@@ -51,7 +51,7 @@ As an alternative, the boot loader can pass the relevant 'console='
|
|||||||
option to the kernel via the tagged lists specifying the port, and
|
option to the kernel via the tagged lists specifying the port, and
|
||||||
serial format options as described in
|
serial format options as described in
|
||||||
|
|
||||||
Documentation/kernel-parameters.txt.
|
Documentation/admin-guide/kernel-parameters.rst.
|
||||||
|
|
||||||
|
|
||||||
3. Detect the machine type
|
3. Detect the machine type
|
||||||
|
|||||||
@@ -1,574 +0,0 @@
|
|||||||
========================================
|
|
||||||
GENERIC ASSOCIATIVE ARRAY IMPLEMENTATION
|
|
||||||
========================================
|
|
||||||
|
|
||||||
Contents:
|
|
||||||
|
|
||||||
- Overview.
|
|
||||||
|
|
||||||
- The public API.
|
|
||||||
- Edit script.
|
|
||||||
- Operations table.
|
|
||||||
- Manipulation functions.
|
|
||||||
- Access functions.
|
|
||||||
- Index key form.
|
|
||||||
|
|
||||||
- Internal workings.
|
|
||||||
- Basic internal tree layout.
|
|
||||||
- Shortcuts.
|
|
||||||
- Splitting and collapsing nodes.
|
|
||||||
- Non-recursive iteration.
|
|
||||||
- Simultaneous alteration and iteration.
|
|
||||||
|
|
||||||
|
|
||||||
========
|
|
||||||
OVERVIEW
|
|
||||||
========
|
|
||||||
|
|
||||||
This associative array implementation is an object container with the following
|
|
||||||
properties:
|
|
||||||
|
|
||||||
(1) Objects are opaque pointers. The implementation does not care where they
|
|
||||||
point (if anywhere) or what they point to (if anything).
|
|
||||||
|
|
||||||
[!] NOTE: Pointers to objects _must_ be zero in the least significant bit.
|
|
||||||
|
|
||||||
(2) Objects do not need to contain linkage blocks for use by the array. This
|
|
||||||
permits an object to be located in multiple arrays simultaneously.
|
|
||||||
Rather, the array is made up of metadata blocks that point to objects.
|
|
||||||
|
|
||||||
(3) Objects require index keys to locate them within the array.
|
|
||||||
|
|
||||||
(4) Index keys must be unique. Inserting an object with the same key as one
|
|
||||||
already in the array will replace the old object.
|
|
||||||
|
|
||||||
(5) Index keys can be of any length and can be of different lengths.
|
|
||||||
|
|
||||||
(6) Index keys should encode the length early on, before any variation due to
|
|
||||||
length is seen.
|
|
||||||
|
|
||||||
(7) Index keys can include a hash to scatter objects throughout the array.
|
|
||||||
|
|
||||||
(8) The array can be iterated over. The objects will not necessarily come out in
|
|
||||||
key order.
|
|
||||||
|
|
||||||
(9) The array can be iterated over whilst it is being modified, provided the
|
|
||||||
RCU readlock is being held by the iterator. Note, however, under these
|
|
||||||
circumstances, some objects may be seen more than once. If this is a
|
|
||||||
problem, the iterator should lock against modification. Objects will not
|
|
||||||
be missed, however, unless deleted.
|
|
||||||
|
|
||||||
(10) Objects in the array can be looked up by means of their index key.
|
|
||||||
|
|
||||||
(11) Objects can be looked up whilst the array is being modified, provided the
|
|
||||||
RCU readlock is being held by the thread doing the look up.
|
|
||||||
|
|
||||||
The implementation uses a tree of 16-pointer nodes internally that are indexed
|
|
||||||
on each level by nibbles from the index key in the same manner as in a radix
|
|
||||||
tree. To improve memory efficiency, shortcuts can be emplaced to skip over
|
|
||||||
what would otherwise be a series of single-occupancy nodes. Further, nodes
|
|
||||||
pack leaf object pointers into spare space in the node rather than making an
|
|
||||||
extra branch until such time as an object needs to be added to a full node.
|
|
||||||
|
|
||||||
|
|
||||||
==============
|
|
||||||
THE PUBLIC API
|
|
||||||
==============
|
|
||||||
|
|
||||||
The public API can be found in <linux/assoc_array.h>. The associative array is
|
|
||||||
rooted on the following structure:
|
|
||||||
|
|
||||||
struct assoc_array {
|
|
||||||
...
|
|
||||||
};
|
|
||||||
|
|
||||||
The code is selected by enabling CONFIG_ASSOCIATIVE_ARRAY.
|
|
||||||
|
|
||||||
|
|
||||||
EDIT SCRIPT
|
|
||||||
-----------
|
|
||||||
|
|
||||||
The insertion and deletion functions produce an 'edit script' that can later be
|
|
||||||
applied to effect the changes without risking ENOMEM. This retains the
|
|
||||||
preallocated metadata blocks that will be installed in the internal tree and
|
|
||||||
keeps track of the metadata blocks that will be removed from the tree when the
|
|
||||||
script is applied.
|
|
||||||
|
|
||||||
This is also used to keep track of dead blocks and dead objects after the
|
|
||||||
script has been applied so that they can be freed later. The freeing is done
|
|
||||||
after an RCU grace period has passed - thus allowing access functions to
|
|
||||||
proceed under the RCU read lock.
|
|
||||||
|
|
||||||
The script appears outside of the API as a pointer of the type:
|
|
||||||
|
|
||||||
struct assoc_array_edit;
|
|
||||||
|
|
||||||
There are two functions for dealing with the script:
|
|
||||||
|
|
||||||
(1) Apply an edit script.
|
|
||||||
|
|
||||||
void assoc_array_apply_edit(struct assoc_array_edit *edit);
|
|
||||||
|
|
||||||
This will perform the edit functions, interpolating various write barriers
|
|
||||||
to permit accesses under the RCU read lock to continue. The edit script
|
|
||||||
will then be passed to call_rcu() to free it and any dead stuff it points
|
|
||||||
to.
|
|
||||||
|
|
||||||
(2) Cancel an edit script.
|
|
||||||
|
|
||||||
void assoc_array_cancel_edit(struct assoc_array_edit *edit);
|
|
||||||
|
|
||||||
This frees the edit script and all preallocated memory immediately. If
|
|
||||||
this was for insertion, the new object is _not_ released by this function,
|
|
||||||
but must rather be released by the caller.
|
|
||||||
|
|
||||||
These functions are guaranteed not to fail.
|
|
||||||
|
|
||||||
|
|
||||||
OPERATIONS TABLE
|
|
||||||
----------------
|
|
||||||
|
|
||||||
Various functions take a table of operations:
|
|
||||||
|
|
||||||
struct assoc_array_ops {
|
|
||||||
...
|
|
||||||
};
|
|
||||||
|
|
||||||
This points to a number of methods, all of which need to be provided:
|
|
||||||
|
|
||||||
(1) Get a chunk of index key from caller data:
|
|
||||||
|
|
||||||
unsigned long (*get_key_chunk)(const void *index_key, int level);
|
|
||||||
|
|
||||||
This should return a chunk of caller-supplied index key starting at the
|
|
||||||
*bit* position given by the level argument. The level argument will be a
|
|
||||||
multiple of ASSOC_ARRAY_KEY_CHUNK_SIZE and the function should return
|
|
||||||
ASSOC_ARRAY_KEY_CHUNK_SIZE bits. No error is possible.
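As an illustration of the contract described above, here is a userspace model of a key-chunk getter. It is a sketch under stated assumptions: ``CHUNK_BITS`` stands in for ASSOC_ARRAY_KEY_CHUNK_SIZE (the value 64 is assumed, not taken from the header), the key is a byte string, and keys shorter than the requested range are zero-padded.

```python
CHUNK_BITS = 64  # assumed stand-in for ASSOC_ARRAY_KEY_CHUNK_SIZE


def get_key_chunk(index_key: bytes, level: int) -> int:
    """Return CHUNK_BITS bits of index_key starting at bit position level."""
    if level % CHUNK_BITS != 0:
        raise ValueError("level must be a multiple of CHUNK_BITS")
    start = level // 8
    chunk = index_key[start:start + CHUNK_BITS // 8]
    # Assumption: bits past the end of the key read as zero.
    return int.from_bytes(chunk.ljust(CHUNK_BITS // 8, b"\0"), "little")
```

The little-endian interpretation is a choice made for this sketch; a real ops table is free to lay the key out differently as long as it does so consistently.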
|
|
||||||
|
|
||||||
|
|
||||||
(2) Get a chunk of an object's index key.
|
|
||||||
|
|
||||||
unsigned long (*get_object_key_chunk)(const void *object, int level);
|
|
||||||
|
|
||||||
As the previous function, but gets its data from an object in the array
|
|
||||||
rather than from a caller-supplied index key.
|
|
||||||
|
|
||||||
|
|
||||||
(3) See if this is the object we're looking for.
|
|
||||||
|
|
||||||
bool (*compare_object)(const void *object, const void *index_key);
|
|
||||||
|
|
||||||
Compare the object against an index key and return true if it matches and
|
|
||||||
false if it doesn't.
|
|
||||||
|
|
||||||
|
|
||||||
(4) Diff the index keys of two objects.
|
|
||||||
|
|
||||||
int (*diff_objects)(const void *object, const void *index_key);
|
|
||||||
|
|
||||||
Return the bit position at which the index key of the specified object
|
|
||||||
differs from the given index key or -1 if they are the same.
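The return convention (a bit position, or -1 when the keys match) can be sketched as follows. This is a hypothetical model and not the kernel method: both sides are passed as raw byte-string keys, bits are numbered from the least significant bit of the first byte, and the shorter key is zero-padded.

```python
def diff_keys(key_a: bytes, key_b: bytes) -> int:
    """Return the lowest bit position at which two keys differ, or -1."""
    width = max(len(key_a), len(key_b))
    a = int.from_bytes(key_a.ljust(width, b"\0"), "little")
    b = int.from_bytes(key_b.ljust(width, b"\0"), "little")
    delta = a ^ b
    if delta == 0:
        return -1
    # Isolate the lowest set bit and convert it to a bit index.
    return (delta & -delta).bit_length() - 1
```

Note that zero-padding makes a key compare equal to the same key with trailing zero bytes, which is one reason the text recommends encoding the length early in the key.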
|
|
||||||
|
|
||||||
|
|
||||||
(5) Free an object.
|
|
||||||
|
|
||||||
void (*free_object)(void *object);
|
|
||||||
|
|
||||||
Free the specified object. Note that this may be called an RCU grace
|
|
||||||
period after assoc_array_apply_edit() was called, so synchronize_rcu() may
|
|
||||||
be necessary on module unloading.
|
|
||||||
|
|
||||||
|
|
||||||
MANIPULATION FUNCTIONS
|
|
||||||
----------------------
|
|
||||||
|
|
||||||
There are a number of functions for manipulating an associative array:
|
|
||||||
|
|
||||||
(1) Initialise an associative array.
|
|
||||||
|
|
||||||
void assoc_array_init(struct assoc_array *array);
|
|
||||||
|
|
||||||
This initialises the base structure for an associative array. It can't
|
|
||||||
fail.
|
|
||||||
|
|
||||||
|
|
||||||
(2) Insert/replace an object in an associative array.
|
|
||||||
|
|
||||||
struct assoc_array_edit *
|
|
||||||
assoc_array_insert(struct assoc_array *array,
|
|
||||||
const struct assoc_array_ops *ops,
|
|
||||||
const void *index_key,
|
|
||||||
void *object);
|
|
||||||
|
|
||||||
This inserts the given object into the array. Note that the least
|
|
||||||
significant bit of the pointer must be zero as it's used to type-mark
|
|
||||||
pointers internally.
|
|
||||||
|
|
||||||
If an object already exists for that key then it will be replaced with the
|
|
||||||
new object and the old one will be freed automatically.
|
|
||||||
|
|
||||||
The index_key argument should hold index key information and is
|
|
||||||
passed to the methods in the ops table when they are called.
|
|
||||||
|
|
||||||
This function makes no alteration to the array itself, but rather returns
|
|
||||||
an edit script that must be applied. -ENOMEM is returned in the case of
|
|
||||||
an out-of-memory error.
|
|
||||||
|
|
||||||
The caller should lock exclusively against other modifiers of the array.
|
|
||||||
|
|
||||||
|
|
||||||
(3) Delete an object from an associative array.
|
|
||||||
|
|
||||||
struct assoc_array_edit *
|
|
||||||
assoc_array_delete(struct assoc_array *array,
|
|
||||||
const struct assoc_array_ops *ops,
|
|
||||||
const void *index_key);
|
|
||||||
|
|
||||||
This deletes an object that matches the specified data from the array.
|
|
||||||
|
|
||||||
The index_key argument should hold index key information and is
|
|
||||||
passed to the methods in the ops table when they are called.
|
|
||||||
|
|
||||||
This function makes no alteration to the array itself, but rather returns
|
|
||||||
an edit script that must be applied. -ENOMEM is returned in the case of
|
|
||||||
an out-of-memory error. NULL will be returned if the specified object is
|
|
||||||
not found within the array.
|
|
||||||
|
|
||||||
The caller should lock exclusively against other modifiers of the array.
|
|
||||||
|
|
||||||
|
|
||||||
(4) Delete all objects from an associative array.
|
|
||||||
|
|
||||||
struct assoc_array_edit *
|
|
||||||
assoc_array_clear(struct assoc_array *array,
|
|
||||||
const struct assoc_array_ops *ops);
|
|
||||||
|
|
||||||
This deletes all the objects from an associative array and leaves it
|
|
||||||
completely empty.
|
|
||||||
|
|
||||||
This function makes no alteration to the array itself, but rather returns
|
|
||||||
an edit script that must be applied. -ENOMEM is returned in the case of
|
|
||||||
an out-of-memory error.
|
|
||||||
|
|
||||||
The caller should lock exclusively against other modifiers of the array.
|
|
||||||
|
|
||||||
|
|
||||||
(5) Destroy an associative array, deleting all objects.
|
|
||||||
|
|
||||||
void assoc_array_destroy(struct assoc_array *array,
|
|
||||||
const struct assoc_array_ops *ops);
|
|
||||||
|
|
||||||
This destroys the contents of the associative array and leaves it
|
|
||||||
completely empty. It is not permitted for another thread to be traversing
|
|
||||||
the array under the RCU read lock at the same time as this function is
|
|
||||||
destroying it as no RCU deferral is performed on memory release -
|
|
||||||
something that would require memory to be allocated.
|
|
||||||
|
|
||||||
The caller should lock exclusively against other modifiers and accessors
|
|
||||||
of the array.
|
|
||||||
|
|
||||||
|
|
||||||
(6) Garbage collect an associative array.
|
|
||||||
|
|
||||||
int assoc_array_gc(struct assoc_array *array,
|
|
||||||
const struct assoc_array_ops *ops,
|
|
||||||
bool (*iterator)(void *object, void *iterator_data),
|
|
||||||
void *iterator_data);
|
|
||||||
|
|
||||||
This iterates over the objects in an associative array and passes each one
|
|
||||||
to iterator(). If iterator() returns true, the object is kept. If it
|
|
||||||
returns false, the object will be freed. If the iterator() function
|
|
||||||
returns true, it must perform any appropriate refcount incrementing on the
|
|
||||||
object before returning.
|
|
||||||
|
|
||||||
The internal tree will be packed down if possible as part of the iteration
|
|
||||||
to reduce the number of nodes in it.
|
|
||||||
|
|
||||||
The iterator_data is passed directly to iterator() and is otherwise
|
|
||||||
ignored by the function.
|
|
||||||
|
|
||||||
The function will return 0 if successful and -ENOMEM if there wasn't
|
|
||||||
enough memory.
|
|
||||||
|
|
||||||
It is possible for other threads to iterate over or search the array under
|
|
||||||
the RCU read lock whilst this function is in progress. The caller should
|
|
||||||
lock exclusively against other modifiers of the array.
|
|
||||||
|
|
||||||
|
|
||||||
ACCESS FUNCTIONS
|
|
||||||
----------------
|
|
||||||
|
|
||||||
There are two functions for accessing an associative array:
|
|
||||||
|
|
||||||
(1) Iterate over all the objects in an associative array.
|
|
||||||
|
|
||||||
int assoc_array_iterate(const struct assoc_array *array,
|
|
||||||
int (*iterator)(const void *object,
|
|
||||||
void *iterator_data),
|
|
||||||
void *iterator_data);
|
|
||||||
|
|
||||||
This passes each object in the array to the iterator callback function.
|
|
||||||
iterator_data is private data for that function.
|
|
||||||
|
|
||||||
This may be used on an array at the same time as the array is being
|
|
||||||
modified, provided the RCU read lock is held. Under such circumstances,
|
|
||||||
it is possible for the iteration function to see some objects twice. If
|
|
||||||
this is a problem, then modification should be locked against. The
|
|
||||||
iteration algorithm should not, however, miss any objects.
|
|
||||||
|
|
||||||
The function will return 0 if no objects were in the array or else it will
|
|
||||||
return the result of the last iterator function called. Iteration stops
|
|
||||||
immediately if any call to the iteration function results in a non-zero
|
|
||||||
return.
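The return-value rule above can be summarised with a small model. This is only a sketch of the described behaviour over a plain Python list; ``iterate`` and ``stop_at_b`` are hypothetical names, and no RCU or tree walking is modelled.

```python
def iterate(objects, iterator, iterator_data):
    """Call iterator on each object; stop at the first non-zero result."""
    ret = 0  # returned unchanged if there are no objects
    for obj in objects:
        ret = iterator(obj, iterator_data)
        if ret != 0:
            break
    return ret


def stop_at_b(obj, seen):
    """Example callback: record each object and stop when "b" is seen."""
    seen.append(obj)
    return -1 if obj == "b" else 0


seen = []
result = iterate(["a", "b", "c"], stop_at_b, seen)
```

Here the third object is never visited because the callback returned a non-zero value for the second one.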
|
|
||||||
|
|
||||||
|
|
||||||
(2) Find an object in an associative array.
|
|
||||||
|
|
||||||
void *assoc_array_find(const struct assoc_array *array,
|
|
||||||
const struct assoc_array_ops *ops,
|
|
||||||
const void *index_key);
|
|
||||||
|
|
||||||
This walks through the array's internal tree directly to the object
|
|
||||||
specified by the index key.
|
|
||||||
|
|
||||||
This may be used on an array at the same time as the array is being
|
|
||||||
modified, provided the RCU read lock is held.
|
|
||||||
|
|
||||||
The function will return the object if found (and set *_type to the object
|
|
||||||
type) or will return NULL if the object was not found.
|
|
||||||
|
|
||||||
|
|
||||||
INDEX KEY FORM
|
|
||||||
--------------
|
|
||||||
|
|
||||||
The index key can be of any form, but since the algorithms aren't told how long
|
|
||||||
the key is, it is strongly recommended that the index key includes its length
|
|
||||||
very early on before any variation due to the length would have an effect on
|
|
||||||
comparisons.
|
|
||||||
|
|
||||||
This will cause leaves with different length keys to scatter away from each
|
|
||||||
other - and those with the same length keys to cluster together.
|
|
||||||
|
|
||||||
It is also recommended that the index key begin with a hash of the rest of the
|
|
||||||
key to maximise scattering throughout keyspace.
|
|
||||||
|
|
||||||
The better the scattering, the wider and lower the internal tree will be.
|
|
||||||
|
|
||||||
Poor scattering isn't too much of a problem as there are shortcuts and nodes
|
|
||||||
can contain mixtures of leaves and metadata pointers.
|
|
||||||
|
|
||||||
The index key is read in chunks of machine word. Each chunk is subdivided into
|
|
||||||
one nibble (4 bits) per level, so on a 32-bit CPU this is good for 8 levels and
|
|
||||||
on a 64-bit CPU, 16 levels. Unless the scattering is really poor, it is
|
|
||||||
unlikely that more than one word of any particular index key will have to be
|
|
||||||
used.
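The arithmetic behind the level counts is simply the word size divided by four. The sketch below also shows one way to pull a slot number out of a chunk; both function names are hypothetical, and using the lowest nibble for the first level is an assumption of this example.

```python
def levels_per_chunk(word_bits: int) -> int:
    """Number of 4-bit levels covered by one machine-word chunk."""
    return word_bits // 4


def slot_for_level(chunk: int, level: int) -> int:
    """Return the 4-bit slot index used at the given level within a chunk.

    Assumption: level 0 uses the least significant nibble.
    """
    return (chunk >> (4 * level)) & 0xF
```

With these definitions, a 32-bit chunk yields 8 slot numbers and a 64-bit chunk yields 16, matching the figures in the text.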
|
|
||||||
|
|
||||||
|
|
||||||
=================
|
|
||||||
INTERNAL WORKINGS
|
|
||||||
=================
|
|
||||||
|
|
||||||
The associative array data structure has an internal tree. This tree is
|
|
||||||
constructed of two types of metadata blocks: nodes and shortcuts.
|
|
||||||
|
|
||||||
A node is an array of slots. Each slot can contain one of four things:
|
|
||||||
|
|
||||||
(*) A NULL pointer, indicating that the slot is empty.
|
|
||||||
|
|
||||||
(*) A pointer to an object (a leaf).
|
|
||||||
|
|
||||||
(*) A pointer to a node at the next level.
|
|
||||||
|
|
||||||
(*) A pointer to a shortcut.
|
|
||||||
|
|
||||||
|
|
||||||
BASIC INTERNAL TREE LAYOUT
|
|
||||||
--------------------------
|
|
||||||
|
|
||||||
Ignoring shortcuts for the moment, the nodes form a multilevel tree. The index
|
|
||||||
key space is strictly subdivided by the nodes in the tree and nodes occur on
|
|
||||||
fixed levels. For example:
|
|
||||||
|
|
||||||
Level: 0 1 2 3
|
|
||||||
=============== =============== =============== ===============
|
|
||||||
NODE D
|
|
||||||
NODE B NODE C +------>+---+
|
|
||||||
+------>+---+ +------>+---+ | | 0 |
|
|
||||||
NODE A | | 0 | | | 0 | | +---+
|
|
||||||
+---+ | +---+ | +---+ | : :
|
|
||||||
| 0 | | : : | : : | +---+
|
|
||||||
+---+ | +---+ | +---+ | | f |
|
|
||||||
| 1 |---+ | 3 |---+ | 7 |---+ +---+
|
|
||||||
+---+ +---+ +---+
|
|
||||||
: : : : | 8 |---+
|
|
||||||
+---+ +---+ +---+ | NODE E
|
|
||||||
| e |---+ | f | : : +------>+---+
|
|
||||||
+---+ | +---+ +---+ | 0 |
|
|
||||||
| f | | | f | +---+
|
|
||||||
+---+ | +---+ : :
|
|
||||||
| NODE F +---+
|
|
||||||
+------>+---+ | f |
|
|
||||||
| 0 | NODE G +---+
|
|
||||||
+---+ +------>+---+
|
|
||||||
: : | | 0 |
|
|
||||||
+---+ | +---+
|
|
||||||
| 6 |---+ : :
|
|
||||||
+---+ +---+
|
|
||||||
: : | f |
|
|
||||||
+---+ +---+
|
|
||||||
| f |
|
|
||||||
+---+
|
|
||||||
|
|
||||||
In the above example, there are 7 nodes (A-G), each with 16 slots (0-f).
|
|
||||||
Assuming no other metadata nodes in the tree, the key space is divided thusly:
|
|
||||||
|
|
||||||
KEY PREFIX NODE
|
|
||||||
========== ====
|
|
||||||
137* D
|
|
||||||
138* E
|
|
||||||
13[0-69-f]* C
|
|
||||||
1[0-24-f]* B
|
|
||||||
e6* G
|
|
||||||
e[0-57-f]* F
|
|
||||||
[02-df]* A
|
|
||||||
|
|
||||||
So, for instance, keys with the following example index keys will be found in
|
|
||||||
the appropriate nodes:
|
|
||||||
|
|
||||||
INDEX KEY PREFIX NODE
|
|
||||||
=============== ======= ====
|
|
||||||
13694892892489 13 C
|
|
||||||
13795289025897 137 D
|
|
||||||
13889dde88793 138 E
|
|
||||||
138bbb89003093 138 E
|
|
||||||
1394879524789 13 C
|
|
||||||
1458952489 1 B
|
|
||||||
9431809de993ba - A
|
|
||||||
b4542910809cd - A
|
|
||||||
e5284310def98 e F
|
|
||||||
e68428974237 e6 G
|
|
||||||
e7fffcbd443 e F
|
|
||||||
f3842239082 - A
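The mapping from index key to node can be reproduced by following metadata pointers one hex digit at a time. This is a sketch that assumes the example layout (nodes A to G and the slots shown in the diagram) and treats each key as a string of hex digits read left to right; ``TREE`` and ``locate`` are hypothetical names.

```python
# Metadata pointers from the example: node -> {slot digit: child node}.
TREE = {
    "A": {"1": "B", "e": "F"},
    "B": {"3": "C"},
    "C": {"7": "D", "8": "E"},
    "F": {"6": "G"},
}


def locate(index_key: str):
    """Return (prefix consumed, node holding the leaf) for a hex-digit key."""
    node, depth = "A", 0
    while depth < len(index_key):
        child = TREE.get(node, {}).get(index_key[depth])
        if child is None:
            break  # no metadata pointer in this slot: the leaf lives here
        node, depth = child, depth + 1
    return index_key[:depth], node
```

For ``1394879524789`` the walk consumes the digits ``1`` and ``3`` and then stops in node C, since node C has no metadata pointer in slot 9. Keys that stay in node A return an empty prefix, which the table writes as ``-``.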
|
|
||||||
|
|
||||||
To save memory, if a node can hold all the leaves in its portion of keyspace,
|
|
||||||
then the node will have all those leaves in it and will not have any metadata
|
|
||||||
pointers - even if some of those leaves would like to be in the same slot.
|
|
||||||
|
|
||||||
A node can contain a heterogeneous mix of leaves and metadata pointers.
|
|
||||||
Metadata pointers must be in the slots that match their subdivisions of key
|
|
||||||
space. The leaves can be in any slot not occupied by a metadata pointer. It
|
|
||||||
is guaranteed that none of the leaves in a node will match a slot occupied by a
|
|
||||||
metadata pointer. If the metadata pointer is there, any leaf whose key matches
|
|
||||||
the metadata key prefix must be in the subtree that the metadata pointer points
|
|
||||||
to.
|
|
||||||
|
|
||||||
In the above example list of index keys, node A will contain:
|
|
||||||
|
|
||||||
SLOT CONTENT INDEX KEY (PREFIX)
|
|
||||||
==== =============== ==================
|
|
||||||
1 PTR TO NODE B 1*
|
|
||||||
any LEAF 9431809de993ba
|
|
||||||
any LEAF b4542910809cd
|
|
||||||
e PTR TO NODE F e*
|
|
||||||
any LEAF f3842239082
|
|
||||||
|
|
||||||
and node B:
|
|
||||||
|
|
||||||
3 PTR TO NODE C 13*
|
|
||||||
any LEAF 1458952489
|
|
||||||
|
|
||||||
|
|
||||||
SHORTCUTS
|
|
||||||
---------
|
|
||||||
|
|
||||||
Shortcuts are metadata records that jump over a piece of keyspace. A shortcut
|
|
||||||
is a replacement for a series of single-occupancy nodes ascending through the
|
|
||||||
levels. Shortcuts exist to save memory and to speed up traversal.
|
|
||||||
|
|
||||||
It is possible for the root of the tree to be a shortcut - say, for example,
|
|
||||||
the tree contains at least 17 nodes all with key prefix '1111'. The insertion
|
|
||||||
algorithm will insert a shortcut to skip over the '1111' keyspace in a single
|
|
||||||
bound and get to the fourth level where these actually become different.
|
|
||||||
|
|
||||||
|
|
||||||
SPLITTING AND COLLAPSING NODES
|
|
||||||
------------------------------
|
|
||||||
|
|
||||||
Each node has a maximum capacity of 16 leaves and metadata pointers. If the
|
|
||||||
insertion algorithm finds that it is trying to insert a 17th object into a
|
|
||||||
node, that node will be split such that at least two leaves that have a common
|
|
||||||
key segment at that level end up in a separate node rooted on that slot for
|
|
||||||
that common key segment.
|
|
||||||
|
|
||||||
If the leaves in a full node and the leaf that is being inserted are
|
|
||||||
sufficiently similar, then a shortcut will be inserted into the tree.
|
|
||||||
|
|
||||||
When the number of objects in the subtree rooted at a node falls to 16 or
|
|
||||||
fewer, then the subtree will be collapsed down to a single node - and this will
|
|
||||||
ripple towards the root if possible.
|
|
||||||
|
|
||||||
|
|
||||||
NON-RECURSIVE ITERATION
|
|
||||||
-----------------------
|
|
||||||
|
|
||||||
Each node and shortcut contains a back pointer to its parent and the number of
|
|
||||||
slot in that parent that points to it. Non-recursive iteration uses these to
|
|
||||||
proceed rootwards through the tree, going to the parent node, slot N + 1 to
|
|
||||||
make sure progress is made without the need for a stack.
|
|
||||||
|
|
||||||
The backpointers, however, make simultaneous alteration and iteration tricky.
|
|
||||||
|
|
||||||
|
|
||||||
SIMULTANEOUS ALTERATION AND ITERATION
|
|
||||||
-------------------------------------
|
|
||||||
|
|
||||||
There are a number of cases to consider:
|
|
||||||
|
|
||||||
(1) Simple insert/replace. This involves simply replacing a NULL or old
|
|
||||||
matching leaf pointer with the pointer to the new leaf after a barrier.
|
|
||||||
The metadata blocks don't change otherwise. An old leaf won't be freed
|
|
||||||
until after the RCU grace period.
|
|
||||||
|
|
||||||
(2) Simple delete. This involves just clearing an old matching leaf. The
|
|
||||||
metadata blocks don't change otherwise. The old leaf won't be freed until
|
|
||||||
after the RCU grace period.
|
|
||||||
|
|
||||||
(3) Insertion replacing part of a subtree that we haven't yet entered. This
|
|
||||||
may involve replacement of part of that subtree - but that won't affect
|
|
||||||
the iteration as we won't have reached the pointer to it yet and the
|
|
||||||
ancestry blocks are not replaced (the layout of those does not change).
|
|
||||||
|
|
||||||
(4) Insertion replacing nodes that we're actively processing. This isn't a
|
|
||||||
problem as we've passed the anchoring pointer and won't switch onto the
|
|
||||||
new layout until we follow the back pointers - at which point we've
|
|
||||||
already examined the leaves in the replaced node (we iterate over all the
|
|
||||||
leaves in a node before following any of its metadata pointers).
|
|
||||||
|
|
||||||
We might, however, re-see some leaves that have been split out into a new
|
|
||||||
branch that's in a slot further along than we were at.
|
|
||||||
|
|
||||||
(5) Insertion replacing nodes that we're processing a dependent branch of.
|
|
||||||
This won't affect us until we follow the back pointers. Similar to (4).
|
|
||||||
|
|
||||||
(6) Deletion collapsing a branch under us. This doesn't affect us because the
|
|
||||||
back pointers will get us back to the parent of the new node before we
|
|
||||||
could see the new node. The entire collapsed subtree is thrown away
|
|
||||||
unchanged - and will still be rooted on the same slot, so we shouldn't
|
|
||||||
process it a second time as we'll go back to slot + 1.
|
|
||||||
|
|
||||||
Note:
|
|
||||||
|
|
||||||
(*) Under some circumstances, we need to simultaneously change the parent
|
|
||||||
pointer and the parent slot pointer on a node (say, for example, we
|
|
||||||
inserted another node before it and moved it up a level). We cannot do
|
|
||||||
this without locking against a read - so we have to replace that node too.
|
|
||||||
|
|
||||||
However, when we're changing a shortcut into a node this isn't a problem
|
|
||||||
as shortcuts only have one slot and so the parent slot number isn't used
|
|
||||||
when traversing backwards over one. This means that it's okay to change
|
|
||||||
the slot number first - provided suitable barriers are used to make sure
|
|
||||||
the parent slot number is read after the back pointer.
|
|
||||||
|
|
||||||
Obsolete blocks and leaves are freed up after an RCU grace period has passed,
|
|
||||||
so as long as anyone doing walking or iteration holds the RCU read lock, the
|
|
||||||
old superstructure should not go away on them.
|
|
||||||
@@ -1,45 +0,0 @@
|
|||||||
March 2008
|
|
||||||
Jan-Simon Moeller, dl9pf@gmx.de
|
|
||||||
|
|
||||||
|
|
||||||
How to deal with bad memory e.g. reported by memtest86+ ?
|
|
||||||
#########################################################
|
|
||||||
|
|
||||||
There are three possibilities I know of:
|
|
||||||
|
|
||||||
1) Reinsert/swap the memory modules
|
|
||||||
|
|
||||||
2) Buy new modules (best!) or try to exchange the memory
|
|
||||||
if you have spare-parts
|
|
||||||
|
|
||||||
3) Use BadRAM or memmap
|
|
||||||
|
|
||||||
This Howto is about number 3) .
|
|
||||||
|
|
||||||
|
|
||||||
BadRAM
|
|
||||||
######
|
|
||||||
BadRAM is actively developed and available as a kernel patch
|
|
||||||
here: http://rick.vanrein.org/linux/badram/
|
|
||||||
|
|
||||||
For more details see the BadRAM documentation.
|
|
||||||
|
|
||||||
memmap
|
|
||||||
######
|
|
||||||
|
|
||||||
memmap is already in the kernel and usable as kernel-parameter at
|
|
||||||
boot-time. Its syntax is slightly strange and you may need to
|
|
||||||
calculate the values by yourself!
|
|
||||||
|
|
||||||
Syntax to exclude a memory area (see kernel-parameters.txt for details):
|
|
||||||
memmap=<size>$<address>
|
|
||||||
|
|
||||||
Example: memtest86+ reported errors here at addresses 0x18691458, 0x18698424 and
|
|
||||||
some others. All had 0x1869xxxx in common, so I chose a pattern of
|
|
||||||
0x18690000,0xffff0000.
|
|
||||||
|
|
||||||
With the numbers of the example above:
|
|
||||||
memmap=64K$0x18690000
|
|
||||||
or
|
|
||||||
memmap=0x10000$0x18690000
|
|
||||||
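The arithmetic behind the example can be sketched in a few lines of Python. This is only an illustration, not part of the original howto: the helper name `memmap_exclude` is hypothetical, and it assumes 32-bit addresses that all share the same prefix under the chosen mask.

```python
def memmap_exclude(bad_addresses, mask=0xFFFF0000):
    """Return a memmap=<size>$<address> string covering all bad addresses.

    Hypothetical helper: assumes every address shares the same masked
    prefix, as in the memtest86+ example above.
    """
    bases = {addr & mask for addr in bad_addresses}
    if len(bases) != 1:
        raise ValueError("addresses do not share a common masked prefix")
    base = bases.pop()
    # Size of the region selected by the mask (32-bit addresses assumed).
    size = (~mask & 0xFFFFFFFF) + 1
    return "memmap=0x%x$0x%x" % (size, base)


print(memmap_exclude([0x18691458, 0x18698424]))
```

With the addresses from the example this yields the second form shown above (a size of 0x10000 at 0x18690000).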
|
|
||||||
@@ -1,56 +0,0 @@
|
|||||||
These instructions are deliberately very basic. If you want something clever,
|
|
||||||
go read the real docs ;-) Please don't add more stuff, but feel free to
|
|
||||||
correct my mistakes ;-) (mbligh@aracnet.com)
|
|
||||||
Thanks to John Levon, Dave Hansen, et al. for help writing this.
|
|
||||||
|
|
||||||
<test> is the thing you're trying to measure.
|
|
||||||
Make sure you have the correct System.map / vmlinux referenced!
|
|
||||||
|
|
||||||
It is probably easiest to use "make install" for linux and hack
|
|
||||||
/sbin/installkernel to copy vmlinux to /boot, in addition to vmlinuz,
|
|
||||||
config, System.map, which are usually installed by default.
|
|
||||||
|
|
||||||
Readprofile
|
|
||||||
-----------
|
|
||||||
A recent readprofile command is needed for 2.6, such as found in util-linux
|
|
||||||
2.12a, which can be downloaded from:
|
|
||||||
|
|
||||||
http://www.kernel.org/pub/linux/utils/util-linux/
|
|
||||||
|
|
||||||
Most distributions will ship it already.
|
|
||||||
|
|
||||||
Add "profile=2" to the kernel command line.
|
|
||||||
|
|
||||||
clear readprofile -r
|
|
||||||
<test>
|
|
||||||
dump output readprofile -m /boot/System.map > captured_profile
|
|
||||||
|
|
||||||
Oprofile
|
|
||||||
--------
|
|
||||||
|
|
||||||
Get the source (see Changes for required version) from
|
|
||||||
http://oprofile.sourceforge.net/ and add "idle=poll" to the kernel command
|
|
||||||
line.
|
|
||||||
|
|
||||||
Configure with CONFIG_PROFILING=y and CONFIG_OPROFILE=y & reboot on new kernel
|
|
||||||
|
|
||||||
./configure --with-kernel-support
|
|
||||||
make install
|
|
||||||
|
|
||||||
For superior results, be sure to enable the local APIC. If opreport sees
|
|
||||||
a 0Hz CPU, APIC was not on. Be aware that idle=poll may mean a performance
|
|
||||||
penalty.
|
|
||||||
|
|
||||||
One time setup:
|
|
||||||
opcontrol --setup --vmlinux=/boot/vmlinux
|
|
||||||
|
|
||||||
clear opcontrol --reset
|
|
||||||
start opcontrol --start
|
|
||||||
<test>
|
|
||||||
stop opcontrol --stop
|
|
||||||
dump output opreport > output_file
|
|
||||||
|
|
||||||
To only report on the kernel, run opreport -l /boot/vmlinux > output_file
|
|
||||||
|
|
||||||
A reset is needed to clear old statistics, which survive a reboot.
|
|
||||||
|
|
||||||
@@ -1,131 +0,0 @@
|
|||||||
Kernel Support for miscellaneous (your favourite) Binary Formats v1.1
|
|
||||||
=====================================================================
|
|
||||||
|
|
||||||
This Kernel feature allows you to invoke almost (for restrictions see below)
|
|
||||||
every program by simply typing its name in the shell.
|
|
||||||
This includes for example compiled Java(TM), Python or Emacs programs.
|
|
||||||
|
|
||||||
To achieve this you must tell binfmt_misc which interpreter has to be invoked
|
|
||||||
with which binary. Binfmt_misc recognises the binary-type by matching some bytes
|
|
||||||
at the beginning of the file with a magic byte sequence (masking out specified
|
|
||||||
bits) you have supplied. Binfmt_misc can also recognise a filename extension
|
|
||||||
such as '.com' or '.exe'.
|
|
||||||
|
|
||||||
First you must mount binfmt_misc:
|
|
||||||
mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
|
|
||||||
|
|
||||||
To actually register a new binary type, you have to set up a string looking like
|
|
||||||
:name:type:offset:magic:mask:interpreter:flags (where you can choose the ':'
|
|
||||||
to suit your needs) and echo it to /proc/sys/fs/binfmt_misc/register.
|
|
||||||
|
|
||||||
Here is what the fields mean:
|
|
||||||
- 'name' is an identifier string. A new /proc file will be created with this
|
|
||||||
name below /proc/sys/fs/binfmt_misc; cannot contain slashes '/' for obvious
|
|
||||||
reasons.
|
|
||||||
- 'type' is the type of recognition. Give 'M' for magic and 'E' for extension.
|
|
||||||
- 'offset' is the offset of the magic/mask in the file, counted in bytes. This
|
|
||||||
defaults to 0 if you omit it (i.e. you write ':name:type::magic...'). Ignored
|
|
||||||
when using filename extension matching.
|
|
||||||
- 'magic' is the byte sequence binfmt_misc is matching for. The magic string
|
|
||||||
may contain hex-encoded characters like \x0a or \xA4. Note that you must
|
|
||||||
escape any NUL bytes; parsing halts at the first one. In a shell environment
|
|
||||||
you might have to write \\x0a to prevent the shell from eating your \.
|
|
||||||
If you chose filename extension matching, this is the extension to be
|
|
||||||
recognised (without the '.', the \x0a specials are not allowed). Extension
|
|
||||||
matching is case sensitive, and slashes '/' are not allowed!
|
|
||||||
- 'mask' is an (optional, defaults to all 0xff) mask. You can mask out some
|
|
||||||
bits from matching by supplying a string like magic and as long as magic.
|
|
||||||
The mask is anded with the byte sequence of the file. Note that you must
|
|
||||||
escape any NUL bytes; parsing halts at the first one. Ignored when using
|
|
||||||
filename extension matching.
|
|
||||||
- 'interpreter' is the program that should be invoked with the binary as first
|
|
||||||
argument (specify the full path)
|
|
||||||
- 'flags' is an optional field that controls several aspects of the invocation
|
|
||||||
of the interpreter. It is a string of capital letters, each of which controls a
|
|
||||||
certain aspect. The following flags are supported -
|
|
||||||
'P' - preserve-argv[0]. Legacy behavior of binfmt_misc is to overwrite
|
|
||||||
the original argv[0] with the full path to the binary. When this
|
|
||||||
flag is included, binfmt_misc will add an argument to the argument
|
|
||||||
vector for this purpose, thus preserving the original argv[0].
|
|
||||||
e.g. If your interp is set to /bin/foo and you run `blah` (which is
|
|
||||||
in /usr/local/bin), then the kernel will execute /bin/foo with
|
|
||||||
argv[] set to ["/bin/foo", "/usr/local/bin/blah", "blah"]. The
|
|
||||||
interp has to be aware of this so it can execute /usr/local/bin/blah
|
|
||||||
with argv[] set to ["blah"].
|
|
||||||
'O' - open-binary. Legacy behavior of binfmt_misc is to pass the full path
|
|
||||||
of the binary to the interpreter as an argument. When this flag is
|
|
||||||
included, binfmt_misc will open the file for reading and pass its
|
|
||||||
descriptor as an argument, instead of the full path, thus allowing
|
|
||||||
the interpreter to execute non-readable binaries. This feature
|
|
||||||
should be used with care - the interpreter has to be trusted not to
|
|
||||||
emit the contents of the non-readable binary.
|
|
||||||
'C' - credentials. Currently, the behavior of binfmt_misc is to calculate
|
|
||||||
the credentials and security token of the new process according to
|
|
||||||
the interpreter. When this flag is included, these attributes are
|
|
||||||
calculated according to the binary. It also implies the 'O' flag.
|
|
||||||
This feature should be used with care as the interpreter
|
|
||||||
will run with root permissions when a setuid binary owned by root
|
|
||||||
is run with binfmt_misc.
|
|
||||||
'F' - fix binary. The usual behaviour of binfmt_misc is to spawn the
|
|
||||||
binary lazily when the misc format file is invoked. However,
|
|
||||||
this doesn't work very well in the face of mount namespaces and
|
|
||||||
changeroots, so the F mode opens the binary as soon as the
|
|
||||||
emulation is installed and uses the opened image to spawn the
|
|
||||||
emulator, meaning it is always available once installed,
|
|
||||||
regardless of how the environment changes.
|
|
||||||
|
|
||||||
|
|
||||||
There are some restrictions:
|
|
||||||
- the whole register string may not exceed 1920 characters
|
|
||||||
- the magic must reside in the first 128 bytes of the file, i.e.
|
|
||||||
offset+size(magic) has to be less than 128
|
|
||||||
- the interpreter string may not exceed 127 characters
|
|
||||||
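As a rough illustration of the field layout and the length limits listed above, the following Python sketch assembles a register string. The function `build_register_string` is a hypothetical helper, not something the kernel provides, and it does not decode `\x` escapes, so it does not check the 128-byte magic window.

```python
def build_register_string(name, kind, magic, interpreter,
                          offset="", mask="", flags="", sep=":"):
    """Build a ':name:type:offset:magic:mask:interpreter:flags' string.

    Hypothetical helper for illustration; only the limits that can be
    checked on the raw strings are enforced here.
    """
    if kind not in ("M", "E"):
        raise ValueError("type must be 'M' (magic) or 'E' (extension)")
    if "/" in name:
        raise ValueError("name cannot contain '/'")
    if len(interpreter) > 127:
        raise ValueError("interpreter string may not exceed 127 characters")
    fields = [name, kind, str(offset), magic, mask, interpreter, flags]
    if any(sep in field for field in fields):
        raise ValueError("separator appears inside a field")
    line = sep + sep.join(fields)
    if len(line) > 1920:
        raise ValueError("register string may not exceed 1920 characters")
    return line


print(build_register_string("DOSWin", "M", "MZ", "/usr/local/bin/wine"))
```

The printed string would then be written to the `register` file, as in the wine example further down in this file.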
|
|
||||||
To use binfmt_misc you have to mount it first. You can mount it with
|
|
||||||
"mount -t binfmt_misc none /proc/sys/fs/binfmt_misc" command, or you can add
|
|
||||||
a line "none /proc/sys/fs/binfmt_misc binfmt_misc defaults 0 0" to your
|
|
||||||
/etc/fstab so it auto mounts on boot.
|
|
||||||
|
|
||||||
You may want to add the binary formats in one of your /etc/rc scripts during
|
|
||||||
boot-up. Read the manual of your init program to figure out how to do this
|
|
||||||
right.
|
|
||||||
|
|
||||||
Think about the order of adding entries! Later added entries are matched first!
|
|
||||||
|
|
||||||
|
|
||||||
A few examples (assuming you are in /proc/sys/fs/binfmt_misc):
|
|
||||||
|
|
||||||
- enable support for em86 (like binfmt_em86, for Alpha AXP only):
|
|
||||||
echo ':i386:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register
|
|
||||||
echo ':i486:M::\x7fELF\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x06:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff:/bin/em86:' > register
|
|
||||||
|
|
||||||
- enable support for packed DOS applications (pre-configured dosemu hdimages):
|
|
||||||
echo ':DEXE:M::\x0eDEX::/usr/bin/dosexec:' > register
|
|
||||||
|
|
||||||
- enable support for Windows executables using wine:
|
|
||||||
echo ':DOSWin:M::MZ::/usr/local/bin/wine:' > register
|
|
||||||
|
|
||||||
For java support see Documentation/java.txt
|
|
||||||
|
|
||||||
|
|
||||||
You can enable/disable binfmt_misc or one binary type by echoing 0 (to disable)
|
|
||||||
or 1 (to enable) to /proc/sys/fs/binfmt_misc/status or /proc/.../the_name.
|
|
||||||
Catting the file tells you the current status of binfmt_misc/the entry.
|
|
||||||
|
|
||||||
You can remove one entry or all entries by echoing -1 to /proc/.../the_name
|
|
||||||
or /proc/sys/fs/binfmt_misc/status.
|
|
||||||
|
|
||||||
|
|
||||||
HINTS:
|
|
||||||
======
|
|
||||||
|
|
||||||
If you want to pass special arguments to your interpreter, you can
|
|
||||||
write a wrapper script for it. See Documentation/java.txt for an
|
|
||||||
example.
|
|
||||||
|
|
||||||
Your interpreter should NOT look in the PATH for the filename; the kernel
|
|
||||||
passes it the full filename (or the file descriptor) to use. Using $PATH can
|
|
||||||
cause unexpected behaviour and can be a security hazard.
|
|
||||||
|
|
||||||
|
|
||||||
Richard Günther <rguenth@tat.physik.uni-tuebingen.de>
|
|
||||||
@@ -184,7 +184,7 @@ infrequently used and the primary purpose of Smart Array controllers is to
 act as a RAID controller for disk drives, so the vast majority of commands
 are allocated for disk devices. However, if you have more than a few tape
 drives attached to a smart array, the default number of commands may not be
-enought (for example, if you have 8 tape drives, you could only rewind 6
+enough (for example, if you have 8 tape drives, you could only rewind 6
 at one time with the default number of commands.) The cciss_tape_cmds module
 parameter allows more commands (up to 16 more) to be allocated for use by
 tape drives. For example:
@@ -14,7 +14,7 @@ Contents:
 
 The RAM disk driver is a way to use main system memory as a block device. It
 is required for initrd, an initial filesystem used if you need to load modules
-in order to access the root filesystem (see Documentation/initrd.txt). It can
+in order to access the root filesystem (see Documentation/admin-guide/initrd.rst). It can
 also be used for a temporary filesystem for crypto work, since the contents
 are erased on reboot.
 
@@ -1,34 +0,0 @@
|
|||||||
Linux Braille Console
|
|
||||||
|
|
||||||
To get early boot messages on a braille device (before userspace screen
|
|
||||||
readers can start), you first need to compile the support for the usual serial
|
|
||||||
console (see serial-console.txt), and for braille device (in Device Drivers -
|
|
||||||
Accessibility).
|
|
||||||
|
|
||||||
Then you need to specify a console=brl, option on the kernel command line, the
|
|
||||||
format is:
|
|
||||||
|
|
||||||
console=brl,serial_options...
|
|
||||||
|
|
||||||
where serial_options... are the same as described in serial-console.txt
|
|
||||||
|
|
||||||
So for instance you can use console=brl,ttyS0 if the braille device is connected
|
|
||||||
to the first serial port, and console=brl,ttyS0,115200 to override the baud rate
|
|
||||||
to 115200, etc.
|
|
||||||
|
|
||||||
By default, the braille device will just show the last kernel message (console
|
|
||||||
mode). To review previous messages, press the Insert key to switch to the VT
|
|
||||||
review mode. In review mode, the arrow keys let you browse the VT content, the
|
|
||||||
page up/down keys go to the top/bottom of the screen, and the home key goes back
|
|
||||||
to the cursor, hence providing very basic screen reviewing facility.
|
|
||||||
|
|
||||||
Sound feedback can be obtained by adding the braille_console.sound=1 kernel
|
|
||||||
parameter.
|
|
||||||
|
|
||||||
For simplicity, only one braille console can be enabled, other uses of
|
|
||||||
console=brl,... will be discarded. Also note that it does not interfere with
|
|
||||||
the console selection mechanism described in serial-console.txt
|
|
||||||
|
|
||||||
For now, only the VisioBraille device is supported.
|
|
||||||
|
|
||||||
Samuel Thibault <samuel.thibault@ens-lyon.org>
|
|
||||||
@@ -8,7 +8,7 @@ cpuacct.txt
 - CPU Accounting Controller; account CPU usage for groups of tasks.
 cpusets.txt
 - documents the cpusets feature; assign CPUs and Mem to a set of tasks.
-devices.txt
+admin-guide/devices.rst
 - Device Whitelist Controller; description, interface and security.
 freezer-subsystem.txt
 - checkpointing; rationale to not use signals, interface.
@@ -161,7 +161,7 @@ The producer will look something like this:
 
 unsigned long head = buffer->head;
 /* The spin_unlock() and next spin_lock() provide needed ordering. */
-unsigned long tail = ACCESS_ONCE(buffer->tail);
+unsigned long tail = READ_ONCE(buffer->tail);
 
 if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
 /* insert one item into the buffer */
@@ -222,7 +222,7 @@ This will instruct the CPU to make sure the index is up to date before reading
 the new item, and then it shall make sure the CPU has finished reading the item
 before it writes the new tail pointer, which will erase the item.
 
-Note the use of ACCESS_ONCE() and smp_load_acquire() to read the
+Note the use of READ_ONCE() and smp_load_acquire() to read the
 opposition index. This prevents the compiler from discarding and
 reloading its cached value - which some compilers will do across
 smp_read_barrier_depends(). This isn't strictly needed if you can
@@ -34,10 +34,10 @@ from load_config import loadConfig
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
-extensions = ['kernel-doc', 'rstFlatTable', 'kernel_include', 'cdomain']
+extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain']
 
 # The name of the math extension changed on Sphinx 1.4
-if minor > 3:
+if major == 1 and minor > 3:
     extensions.append("sphinx.ext.imgmath")
 else:
     extensions.append("sphinx.ext.pngmath")
@@ -136,7 +136,7 @@ pygments_style = 'sphinx'
 todo_include_todos = False
 
 primary_domain = 'C'
-highlight_language = 'guess'
+highlight_language = 'none'
 
 # -- Options for HTML output ----------------------------------------------
 
@@ -332,18 +332,32 @@ latex_elements = {
 '''
 }
 
+# Fix reference escape troubles with Sphinx 1.4.x
+if major == 1 and minor > 3:
+    latex_elements['preamble'] += '\\renewcommand*{\\DUrole}[2]{ #2 }\n'
+
 # Grouping the document tree into LaTeX files. List of tuples
 # (source start file, target name, title,
 # author, documentclass [howto, manual, or own class]).
 latex_documents = [
+    ('doc-guide/index', 'kernel-doc-guide.tex', 'Linux Kernel Documentation Guide',
+     'The kernel development community', 'manual'),
+    ('admin-guide/index', 'linux-user.tex', 'Linux Kernel User Documentation',
+     'The kernel development community', 'manual'),
+    ('core-api/index', 'core-api.tex', 'The kernel core API manual',
+     'The kernel development community', 'manual'),
+    ('driver-api/index', 'driver-api.tex', 'The kernel driver API manual',
+     'The kernel development community', 'manual'),
     ('kernel-documentation', 'kernel-documentation.tex', 'The Linux Kernel Documentation',
      'The kernel development community', 'manual'),
-    ('development-process/index', 'development-process.tex', 'Linux Kernel Development Documentation',
+    ('process/index', 'development-process.tex', 'Linux Kernel Development Documentation',
      'The kernel development community', 'manual'),
     ('gpu/index', 'gpu.tex', 'Linux GPU Driver Developer\'s Guide',
      'The kernel development community', 'manual'),
     ('media/index', 'media.tex', 'Linux Media Subsystem Documentation',
      'The kernel development community', 'manual'),
+    ('security/index', 'security.tex', 'The kernel security subsystem manual',
+     'The kernel development community', 'manual'),
 ]
 
 # The name of an image file (relative to this directory) to place at the top of
551
Documentation/core-api/assoc_array.rst
Normal file
@@ -0,0 +1,551 @@
|
|||||||
|
========================================
|
||||||
|
Generic Associative Array Implementation
|
||||||
|
========================================
|
||||||
|
|
||||||
|
Overview
|
||||||
|
========
|
||||||
|
|
||||||
|
This associative array implementation is an object container with the following
|
||||||
|
properties:
|
||||||
|
|
||||||
|
1. Objects are opaque pointers. The implementation does not care where they
|
||||||
|
point (if anywhere) or what they point to (if anything).
|
||||||
|
.. note:: Pointers to objects _must_ be zero in the least significant bit.
|
||||||
|
|
||||||
|
2. Objects do not need to contain linkage blocks for use by the array. This
|
||||||
|
permits an object to be located in multiple arrays simultaneously.
|
||||||
|
Rather, the array is made up of metadata blocks that point to objects.
|
||||||
|
|
||||||
|
3. Objects require index keys to locate them within the array.
|
||||||
|
|
||||||
|
4. Index keys must be unique. Inserting an object with the same key as one
|
||||||
|
already in the array will replace the old object.
|
||||||
|
|
||||||
|
5. Index keys can be of any length and can be of different lengths.
|
||||||
|
|
||||||
|
6. Index keys should encode the length early on, before any variation due to
|
||||||
|
length is seen.
|
||||||
|
|
||||||
|
7. Index keys can include a hash to scatter objects throughout the array.
|
||||||
|
|
||||||
|
8. The array can be iterated over. The objects will not necessarily come out in
|
||||||
|
key order.
|
||||||
|
|
||||||
|
9. The array can be iterated over whilst it is being modified, provided the
|
||||||
|
RCU readlock is being held by the iterator. Note, however, under these
|
||||||
|
circumstances, some objects may be seen more than once. If this is a
|
||||||
|
problem, the iterator should lock against modification. Objects will not
|
||||||
|
be missed, however, unless deleted.
|
||||||
|
|
||||||
|
10. Objects in the array can be looked up by means of their index key.
|
||||||
|
|
||||||
|
11. Objects can be looked up whilst the array is being modified, provided the
|
||||||
|
RCU readlock is being held by the thread doing the look up.
|
||||||
|
|
||||||
|
The implementation uses a tree of 16-pointer nodes internally that are indexed
|
||||||
|
on each level by nibbles from the index key in the same manner as in a radix
|
||||||
|
tree. To improve memory efficiency, shortcuts can be emplaced to skip over
|
||||||
|
what would otherwise be a series of single-occupancy nodes. Further, nodes
|
||||||
|
pack leaf object pointers into spare space in the node rather than making an
|
||||||
|
extra branch until such time as an object needs to be added to a full node.
|
||||||
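To make the nibble indexing concrete, here is a small Python sketch of how a key selects one of 16 slots at each level. It is an illustration only: the name `slot_path` is hypothetical, the key is modelled as a plain integer, and consuming the least significant nibble first is an assumption of this sketch rather than something the text above states.

```python
FAN_OUT = 16          # slots per node, per the description above
BITS_PER_LEVEL = 4    # one nibble selects one of 16 slots


def slot_path(key, levels):
    """Return the slot index used at each of the first `levels` tree levels.

    Assumption for this sketch: `key` is an integer and the least
    significant nibble is consumed first.
    """
    return [(key >> (BITS_PER_LEVEL * i)) & (FAN_OUT - 1)
            for i in range(levels)]


print(slot_path(0x3A7, 3))  # [7, 10, 3]
```

Two keys that agree on their first few nibbles follow the same slots for those levels, which is the situation the shortcuts are meant to compress.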
|
|
||||||
|
|
||||||
|
The Public API
|
||||||
|
==============
|
||||||
|
|
||||||
|
The public API can be found in ``<linux/assoc_array.h>``. The associative
|
||||||
|
array is rooted on the following structure::
|
||||||
|
|
||||||
|
struct assoc_array {
|
||||||
|
...
|
||||||
|
};
|
||||||
|
|
||||||
|
The code is selected by enabling ``CONFIG_ASSOCIATIVE_ARRAY`` with::
|
||||||
|
|
||||||
|
./scripts/config -e ASSOCIATIVE_ARRAY
|
||||||
|
|
||||||
|
|
||||||
|
Edit Script
|
||||||
|
-----------
|
||||||
|
|
||||||
|
The insertion and deletion functions produce an 'edit script' that can later be
|
||||||
|
applied to effect the changes without risking ``ENOMEM``. This retains the
|
||||||
|
preallocated metadata blocks that will be installed in the internal tree and
|
||||||
|
keeps track of the metadata blocks that will be removed from the tree when the
|
||||||
|
script is applied.
|
||||||
|
|
||||||
|
This is also used to keep track of dead blocks and dead objects after the
|
||||||
|
script has been applied so that they can be freed later. The freeing is done
|
||||||
|
after an RCU grace period has passed - thus allowing access functions to
|
||||||
|
proceed under the RCU read lock.
|
||||||
|
|
||||||
|
The script appears outside of the API as a pointer of the type::
|
||||||
|
|
||||||
|
struct assoc_array_edit;
|
||||||
|
|
||||||
|
There are two functions for dealing with the script:
|
||||||
|
|
||||||
|
1. Apply an edit script::
|
||||||
|
|
||||||
|
void assoc_array_apply_edit(struct assoc_array_edit *edit);
|
||||||
|
|
||||||
|
This will perform the edit functions, interpolating various write barriers
|
||||||
|
to permit accesses under the RCU read lock to continue. The edit script
|
||||||
|
will then be passed to ``call_rcu()`` to free it and any dead stuff it points
|
||||||
|
to.
|
||||||
|
|
||||||
|
2. Cancel an edit script::
|
||||||
|
|
||||||
|
void assoc_array_cancel_edit(struct assoc_array_edit *edit);
|
||||||
|
|
||||||
|
This frees the edit script and all preallocated memory immediately. If
|
||||||
|
this was for insertion, the new object is _not_ released by this function,
|
||||||
|
but must rather be released by the caller.
|
||||||
|
|
||||||
|
These functions are guaranteed not to fail.
|
||||||
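The prepare-then-apply pattern described above can be modelled with a toy Python sketch. This is not the kernel implementation: a dict stands in for the array, the names `Edit`, `prepare_insert`, `apply_edit` and `cancel_edit` are hypothetical, and no RCU deferral is modelled.

```python
class Edit:
    """Toy stand-in for an edit script: all changes are prepared up front."""

    def __init__(self, target, changes):
        self.target = target
        self.changes = changes   # prepared (key, new_object) pairs
        self.dead = []           # replaced objects, to be freed later


def prepare_insert(array, key, obj):
    # The fallible step: anything that could fail (such as allocation)
    # would happen here, before the array itself is touched.
    return Edit(array, [(key, obj)])


def apply_edit(edit):
    # Cannot fail: it only installs what was prepared earlier.
    for key, obj in edit.changes:
        if key in edit.target:
            edit.dead.append(edit.target[key])
        edit.target[key] = obj
    return edit.dead   # the kernel would free these after a grace period


def cancel_edit(edit):
    # Drops the prepared changes; the caller still owns the new object.
    edit.changes = []


array = {}
apply_edit(prepare_insert(array, "k", "old"))
dead = apply_edit(prepare_insert(array, "k", "new"))
print(array, dead)
```

The point of the split is that the only step that can run out of memory happens before any reader-visible change is made.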
|
|
||||||
|
|
||||||
|
Operations Table
|
||||||
|
----------------
|
||||||
|
|
||||||
|
Various functions take a table of operations::
|
||||||
|
|
||||||
|
struct assoc_array_ops {
|
||||||
|
...
|
||||||
|
};
|
||||||
|
|
||||||
|
This points to a number of methods, all of which need to be provided:
|
||||||
|
|
||||||
|
1. Get a chunk of index key from caller data::
|
||||||
|
|
||||||
|
unsigned long (*get_key_chunk)(const void *index_key, int level);
|
||||||
|
|
||||||
|
This should return a chunk of caller-supplied index key starting at the
|
||||||
|
*bit* position given by the level argument. The level argument will be a
|
||||||
|
multiple of ``ASSOC_ARRAY_KEY_CHUNK_SIZE`` and the function should return
|
||||||
|
``ASSOC_ARRAY_KEY_CHUNK_SIZE`` bits. No error is possible.
|
||||||
|
|
||||||
|
|
||||||
|
2. Get a chunk of an object's index key::
|
||||||
|
|
||||||
|
unsigned long (*get_object_key_chunk)(const void *object, int level);
|
||||||
|
|
||||||
|
As with the previous function, but this gets its data from an object in the array
|
||||||
|
rather than from a caller-supplied index key.
|
||||||
|
|
||||||
|
|
||||||
|
3. See if this is the object we're looking for::
|
||||||
|
|
||||||
|
bool (*compare_object)(const void *object, const void *index_key);
|
||||||
|
|
||||||
|
Compare the object against an index key and return ``true`` if it matches and
|
||||||
|
``false`` if it doesn't.
|
||||||
|
|
||||||
|
|
||||||
|
4. Diff the index keys of two objects::
|
||||||
|
|
||||||
|
int (*diff_objects)(const void *object, const void *index_key);
|
||||||
|
|
||||||
|
Return the bit position at which the index key of the specified object
|
||||||
|
differs from the given index key or -1 if they are the same.
|
||||||
|
|
||||||
|
|
||||||
|
5. Free an object::
|
||||||
|
|
||||||
|
void (*free_object)(void *object);
|
||||||
|
|
||||||
|
Free the specified object. Note that this may be called an RCU grace period
|
||||||
|
after ``assoc_array_apply_edit()`` was called, so ``synchronize_rcu()`` may be
|
||||||
|
necessary on module unloading.
|
||||||
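The key-handling methods in the table above can be sketched in Python as follows. This is a model, not kernel code: keys are plain integers, objects are dicts with a `"key"` field, `CHUNK_BITS` is an assumed stand-in for ``ASSOC_ARRAY_KEY_CHUNK_SIZE``, and bit 0 is taken to be the least significant bit.

```python
CHUNK_BITS = 64  # assumed stand-in for ASSOC_ARRAY_KEY_CHUNK_SIZE


def get_key_chunk(index_key, level):
    """Return CHUNK_BITS bits of an integer key starting at bit `level`."""
    assert level % CHUNK_BITS == 0
    return (index_key >> level) & ((1 << CHUNK_BITS) - 1)


def get_object_key_chunk(obj, level):
    """Same as get_key_chunk(), but reads the key stored in the object."""
    return get_key_chunk(obj["key"], level)


def compare_object(obj, index_key):
    """Return True if the object's key matches the index key."""
    return obj["key"] == index_key


def diff_objects(obj, index_key):
    """Return the lowest bit position where the keys differ, or -1."""
    delta = obj["key"] ^ index_key
    if delta == 0:
        return -1
    # Isolate the lowest set bit and convert it to a bit position.
    return (delta & -delta).bit_length() - 1


obj = {"key": 0b1010}
print(diff_objects(obj, 0b1110))  # 2
print(diff_objects(obj, 0b1010))  # -1
```

A real implementation would read the key bits out of whatever structure the caller uses for its index keys; only the return conventions are taken from the text above.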
|
|
||||||
|
|
||||||
|
Manipulation Functions
|
||||||
|
----------------------
|
||||||
|
|
||||||
|
There are a number of functions for manipulating an associative array:
|
||||||
|
|
||||||
|
1. Initialise an associative array::
|
||||||
|
|
||||||
|
void assoc_array_init(struct assoc_array *array);
|
||||||
|
|
||||||
|
This initialises the base structure for an associative array. It can't fail.
|
||||||
|
|
||||||
|
|
||||||
|
2. Insert/replace an object in an associative array::
|
||||||
|
|
||||||
|
struct assoc_array_edit *
|
||||||
|
assoc_array_insert(struct assoc_array *array,
|
||||||
|
const struct assoc_array_ops *ops,
|
||||||
|
const void *index_key,
|
||||||
|
void *object);
|
||||||
|
|
||||||
|
This inserts the given object into the array. Note that the least
|
||||||
|
significant bit of the pointer must be zero as it's used to type-mark
|
||||||
|
pointers internally.
|
||||||
|
|
||||||
|
If an object already exists for that key then it will be replaced with the
|
||||||
|
new object and the old one will be freed automatically.
|
||||||
|
|
||||||
|
The ``index_key`` argument should hold index key information and is
|
||||||
|
passed to the methods in the ops table when they are called.
|
||||||
|
|
||||||
|
This function makes no alteration to the array itself, but rather returns
|
||||||
|
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
|
||||||
|
an out-of-memory error.
|
||||||
|
|
||||||
|
The caller should lock exclusively against other modifiers of the array.
|
||||||
|
|
||||||
|
|
||||||
|
3. Delete an object from an associative array::
|
||||||
|
|
||||||
|
struct assoc_array_edit *
|
||||||
|
assoc_array_delete(struct assoc_array *array,
|
||||||
|
const struct assoc_array_ops *ops,
|
||||||
|
const void *index_key);
|
||||||
|
|
||||||
|
This deletes an object that matches the specified data from the array.
|
||||||
|
|
||||||
|
The ``index_key`` argument should hold index key information and is
|
||||||
|
passed to the methods in the ops table when they are called.
|
||||||
|
|
||||||
|
This function makes no alteration to the array itself, but rather returns
|
||||||
|
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
|
||||||
|
an out-of-memory error. ``NULL`` will be returned if the specified object is
|
||||||
|
not found within the array.
|
||||||
|
|
||||||
|
The caller should lock exclusively against other modifiers of the array.
|
||||||
|
|
||||||
|
|
||||||
|
4. Delete all objects from an associative array::
|
||||||
|
|
||||||
|
struct assoc_array_edit *
|
||||||
|
assoc_array_clear(struct assoc_array *array,
|
||||||
|
const struct assoc_array_ops *ops);
|
||||||
|
|
||||||
|
This deletes all the objects from an associative array and leaves it
|
||||||
|
completely empty.
|
||||||
|
|
||||||
|
This function makes no alteration to the array itself, but rather returns
|
||||||
|
an edit script that must be applied. ``-ENOMEM`` is returned in the case of
|
||||||
|
an out-of-memory error.
|
||||||
|
|
||||||
|
The caller should lock exclusively against other modifiers of the array.
|
||||||
|
|
||||||
|
|
||||||
|
5. Destroy an associative array, deleting all objects::
|
||||||
|
|
||||||
|
void assoc_array_destroy(struct assoc_array *array,
|
||||||
|
const struct assoc_array_ops *ops);
|
||||||
|
|
||||||
|
This destroys the contents of the associative array and leaves it
|
||||||
|
completely empty. It is not permitted for another thread to be traversing
|
||||||
|
the array under the RCU read lock at the same time as this function is
|
||||||
|
destroying it as no RCU deferral is performed on memory release -
|
||||||
|
something that would require memory to be allocated.
|
||||||
|
|
||||||
|
The caller should lock exclusively against other modifiers and accessors
|
||||||
|
of the array.
|
||||||
|
|
||||||
|
|
||||||
|
6. Garbage collect an associative array::
|
||||||
|
|
||||||
|
int assoc_array_gc(struct assoc_array *array,
|
||||||
|
const struct assoc_array_ops *ops,
|
||||||
|
bool (*iterator)(void *object, void *iterator_data),
|
||||||
|
void *iterator_data);
|
||||||
|
|
||||||
|
This iterates over the objects in an associative array and passes each one to
|
||||||
|
``iterator()``. If ``iterator()`` returns ``true``, the object is kept. If it
|
||||||
|
returns ``false``, the object will be freed. If the ``iterator()`` function
|
||||||
|
returns ``true``, it must perform any appropriate refcount incrementing on the
|
||||||
|
object before returning.
|
||||||
|
|
||||||
|
The internal tree will be packed down if possible as part of the iteration
|
||||||
|
to reduce the number of nodes in it.
|
||||||
|
|
||||||
|
The ``iterator_data`` is passed directly to ``iterator()`` and is otherwise
|
||||||
|
ignored by the function.
|
||||||
|
|
||||||
|
The function will return ``0`` if successful and ``-ENOMEM`` if there wasn't
|
||||||
|
enough memory.
|
||||||
|
|
||||||
|
It is possible for other threads to iterate over or search the array under
|
||||||
|
the RCU read lock whilst this function is in progress. The caller should
|
||||||
|
lock exclusively against other modifiers of the array.
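
The keep-or-free contract of the garbage collection callback described above can be modelled in userspace. This is only a sketch on a flat array, not the kernel implementation: ``model_gc()``, its ``free_object`` parameter, ``struct item`` and the two callbacks are hypothetical names invented for illustration, and the real function frees objects through the ops table rather than through a separate argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model of the keep/drop decision, on a flat array rather than
 * the real tree. Returns the number of objects kept. All names here are
 * hypothetical.
 */
size_t model_gc(void **objects, size_t nr,
		bool (*iterator)(void *object, void *iterator_data),
		void *iterator_data,
		void (*free_object)(void *object))
{
	size_t kept = 0;
	size_t i;

	assert(iterator != NULL && free_object != NULL);
	for (i = 0; i < nr; i++) {
		if (iterator(objects[i], iterator_data))
			objects[kept++] = objects[i];	/* keep and pack down */
		else
			free_object(objects[i]);	/* drop */
	}
	return kept;
}

/* Example object with a reference count and an expiry flag. */
struct item {
	int refcount;
	bool expired;
};

/* Keep unexpired items, taking a reference on each one that is kept. */
bool keep_unexpired(void *object, void *iterator_data)
{
	struct item *item = object;

	(void)iterator_data;
	if (item->expired)
		return false;
	item->refcount++;
	return true;
}

/* Drop one reference; stands in for the object-freeing method. */
void put_item(void *object)
{
	struct item *item = object;

	item->refcount--;
}
```

The callback takes the extra reference itself before returning ``true``, which mirrors the requirement stated for ``iterator()`` above.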
|
||||||
|
|
||||||
|
|
||||||
|
Access Functions
|
||||||
|
----------------
|
||||||
|
|
||||||
|
There are two functions for accessing an associative array:
|
||||||
|
|
||||||
|
1. Iterate over all the objects in an associative array::
|
||||||
|
|
||||||
|
int assoc_array_iterate(const struct assoc_array *array,
|
||||||
|
int (*iterator)(const void *object,
|
||||||
|
void *iterator_data),
|
||||||
|
void *iterator_data);
|
||||||
|
|
||||||
|
This passes each object in the array to the iterator callback function.
|
||||||
|
``iterator_data`` is private data for that function.
|
||||||
|
|
||||||
|
This may be used on an array at the same time as the array is being
|
||||||
|
modified, provided the RCU read lock is held. Under such circumstances,
|
||||||
|
it is possible for the iteration function to see some objects twice. If
|
||||||
|
this is a problem, then modification should be locked against. The
|
||||||
|
iteration algorithm should not, however, miss any objects.
|
||||||
|
|
||||||
|
The function will return ``0`` if no objects were in the array or else it will
|
||||||
|
return the result of the last iterator function called. Iteration stops
|
||||||
|
immediately if any call to the iteration function results in a non-zero
|
||||||
|
return.
|
||||||
|
|
||||||
|
|
||||||
|
2. Find an object in an associative array::
|
||||||
|
|
||||||
|
void *assoc_array_find(const struct assoc_array *array,
|
||||||
|
const struct assoc_array_ops *ops,
|
||||||
|
const void *index_key);
|
||||||
|
|
||||||
|
This walks through the array's internal tree directly to the object
|
||||||
|
specified by the index key.
|
||||||
|
|
||||||
|
This may be used on an array at the same time as the array is being
|
||||||
|
modified, provided the RCU read lock is held.
|
||||||
|
|
||||||
|
The function will return the object if found or ``NULL`` if it was not found.
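
The return convention of ``assoc_array_iterate()`` described above (``0`` for an empty array, otherwise the result of the last callback, stopping at the first non-zero result) can be modelled in userspace. This is a sketch over a plain array of pointers, not the kernel's tree walk; ``model_iterate()``, ``struct find_ctx`` and ``stop_at_target()`` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of the documented return convention, using a plain
 * array of object pointers instead of the real tree.
 */
int model_iterate(void *const *objects, size_t nr,
		  int (*iterator)(const void *object, void *iterator_data),
		  void *iterator_data)
{
	int ret = 0;	/* returned unchanged if the array is empty */
	size_t i;

	assert(iterator != NULL);
	for (i = 0; i < nr; i++) {
		ret = iterator(objects[i], iterator_data);
		if (ret != 0)
			break;	/* a non-zero result stops the walk at once */
	}
	return ret;
}

/* Private data for the example callback below. */
struct find_ctx {
	const void *target;
	unsigned int visited;
};

/* Count each object seen; return 1 to stop when the target is reached. */
int stop_at_target(const void *object, void *iterator_data)
{
	struct find_ctx *ctx = iterator_data;

	ctx->visited++;
	return object == ctx->target;
}
```

A caller can therefore use a non-zero return value both to end the walk early and to report why it ended.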
|
||||||
|
|
||||||
|
|
||||||
|
Index Key Form
|
||||||
|
--------------
|
||||||
|
|
||||||
|
The index key can be of any form, but since the algorithms aren't told how long
|
||||||
|
the key is, it is strongly recommended that the index key includes its length
|
||||||
|
very early on before any variation due to the length would have an effect on
|
||||||
|
comparisons.
|
||||||
|
|
||||||
|
This will cause leaves with different length keys to scatter away from each
|
||||||
|
other - and those with the same length keys to cluster together.
|
||||||
|
|
||||||
|
It is also recommended that the index key begin with a hash of the rest of the
|
||||||
|
key to maximise scattering throughout keyspace.
|
||||||
|
|
||||||
|
The better the scattering, the wider and lower the internal tree will be.
|
||||||
|
|
||||||
|
Poor scattering isn't too much of a problem as there are shortcuts and nodes
|
||||||
|
can contain mixtures of leaves and metadata pointers.
|
||||||
|
|
||||||
|
The index key is read in machine-word-sized chunks. Each chunk is subdivided into
|
||||||
|
one nibble (4 bits) per level, so on a 32-bit CPU this is good for 8 levels and
|
||||||
|
on a 64-bit CPU, 16 levels. Unless the scattering is really poor, it is
|
||||||
|
unlikely that more than one word of any particular index key will have to be
|
||||||
|
used.
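
The chunk-and-nibble arithmetic above can be sketched as follows. The constant names and ``slot_for_level()`` are hypothetical, and taking the least significant nibble of each word first is an assumption of this sketch; the text above does not state the order.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical constants for this sketch. */
#define LEVEL_STEP_BITS 4	/* one nibble consumed per tree level */
#define LEVELS_PER_WORD ((sizeof(unsigned long) * CHAR_BIT) / LEVEL_STEP_BITS)

/*
 * Return the slot number (0x0-0xf) selected at @level by an index key
 * given as an array of machine words. The least significant nibble of
 * each word is consumed first (an assumption).
 */
unsigned int slot_for_level(const unsigned long *key_words, size_t level)
{
	unsigned long chunk;
	unsigned int shift;

	assert(key_words != NULL);
	chunk = key_words[level / LEVELS_PER_WORD];
	shift = (unsigned int)(level % LEVELS_PER_WORD) * LEVEL_STEP_BITS;

	return (unsigned int)((chunk >> shift) & 0xf);
}
```

``LEVELS_PER_WORD`` evaluates to 8 where ``unsigned long`` is 32 bits wide and to 16 where it is 64 bits wide, matching the figures given above.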
|
||||||
|
|
||||||
|
|
||||||
|
Internal Workings
|
||||||
|
=================
|
||||||
|
|
||||||
|
The associative array data structure has an internal tree. This tree is
|
||||||
|
constructed of two types of metadata blocks: nodes and shortcuts.
|
||||||
|
|
||||||
|
A node is an array of slots. Each slot can contain one of four things:
|
||||||
|
|
||||||
|
* A NULL pointer, indicating that the slot is empty.
|
||||||
|
* A pointer to an object (a leaf).
|
||||||
|
* A pointer to a node at the next level.
|
||||||
|
* A pointer to a shortcut.
|
||||||
|
|
||||||
|
|
||||||
|
Basic Internal Tree Layout
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
Ignoring shortcuts for the moment, the nodes form a multilevel tree. The index
|
||||||
|
key space is strictly subdivided by the nodes in the tree and nodes occur on
|
||||||
|
fixed levels. For example::
|
||||||
|
|
||||||
|
Level: 0 1 2 3
|
||||||
|
=============== =============== =============== ===============
|
||||||
|
NODE D
|
||||||
|
NODE B NODE C +------>+---+
|
||||||
|
+------>+---+ +------>+---+ | | 0 |
|
||||||
|
NODE A | | 0 | | | 0 | | +---+
|
||||||
|
+---+ | +---+ | +---+ | : :
|
||||||
|
| 0 | | : : | : : | +---+
|
||||||
|
+---+ | +---+ | +---+ | | f |
|
||||||
|
| 1 |---+ | 3 |---+ | 7 |---+ +---+
|
||||||
|
+---+ +---+ +---+
|
||||||
|
: : : : | 8 |---+
|
||||||
|
+---+ +---+ +---+ | NODE E
|
||||||
|
| e |---+ | f | : : +------>+---+
|
||||||
|
+---+ | +---+ +---+ | 0 |
|
||||||
|
| f | | | f | +---+
|
||||||
|
+---+ | +---+ : :
|
||||||
|
| NODE F +---+
|
||||||
|
+------>+---+ | f |
|
||||||
|
| 0 | NODE G +---+
|
||||||
|
+---+ +------>+---+
|
||||||
|
: : | | 0 |
|
||||||
|
+---+ | +---+
|
||||||
|
| 6 |---+ : :
|
||||||
|
+---+ +---+
|
||||||
|
: : | f |
|
||||||
|
+---+ +---+
|
||||||
|
| f |
|
||||||
|
+---+
|
||||||
|
|
||||||
|
In the above example, there are 7 nodes (A-G), each with 16 slots (0-f).
|
||||||
|
Assuming no other metadata nodes in the tree, the key space is divided
|
||||||
|
thusly::
|
||||||
|
|
||||||
|
KEY PREFIX NODE
|
||||||
|
========== ====
|
||||||
|
137* D
|
||||||
|
138* E
|
||||||
|
13[0-69-f]* C
|
||||||
|
1[0-24-f]* B
|
||||||
|
e6* G
|
||||||
|
e[0-57-f]* F
|
||||||
|
[02-df]* A
|
||||||
|
|
||||||
|
So, for instance, objects with the following example index keys will be found in
|
||||||
|
the appropriate nodes::
|
||||||
|
|
||||||
|
INDEX KEY PREFIX NODE
|
||||||
|
=============== ======= ====
|
||||||
|
13694892892489 13 C
|
||||||
|
13795289025897 137 D
|
||||||
|
13889dde88793 138 E
|
||||||
|
138bbb89003093 138 E
|
||||||
|
1394879524789 13 C
|
||||||
|
1458952489 1 B
|
||||||
|
9431809de993ba - A
|
||||||
|
b4542910809cd - A
|
||||||
|
e5284310def98 e F
|
||||||
|
e68428974237 e6 G
|
||||||
|
e7fffcbd443 e F
|
||||||
|
f3842239082 - A
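
The mapping in the table above can be written out as a small lookup. ``node_for_key()`` is a hypothetical helper that hard-codes this one example layout and takes the key as a hex string purely for readability; it is not how the kernel walks the tree.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Map an index key, written as a NUL-terminated hex string, to the node
 * letter from the example tree above. Only that layout is handled.
 */
char node_for_key(const char *key)
{
	assert(key != NULL);

	if (key[0] == '1') {		/* slot 1 of A points to node B */
		if (key[1] != '3')
			return 'B';
		if (key[2] == '7')	/* slot 7 of C points to node D */
			return 'D';
		if (key[2] == '8')	/* slot 8 of C points to node E */
			return 'E';
		return 'C';		/* slot 3 of B points to node C */
	}
	if (key[0] == 'e')		/* slot e of A points to node F */
		return key[1] == '6' ? 'G' : 'F';
	return 'A';			/* every other prefix stays in A */
}
```

Each test consumes one more hex digit than the last, which is the one-nibble-per-level descent shown in the diagram.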
|
||||||
|
|
||||||
|
To save memory, if a node can hold all the leaves in its portion of keyspace,
|
||||||
|
then the node will have all those leaves in it and will not have any metadata
|
||||||
|
pointers - even if some of those leaves would like to be in the same slot.
|
||||||
|
|
||||||
|
A node can contain a heterogeneous mix of leaves and metadata pointers.
|
||||||
|
Metadata pointers must be in the slots that match their subdivisions of key
|
||||||
|
space. The leaves can be in any slot not occupied by a metadata pointer. It
|
||||||
|
is guaranteed that none of the leaves in a node will match a slot occupied by a
|
||||||
|
metadata pointer. If the metadata pointer is there, any leaf whose key matches
|
||||||
|
the metadata key prefix must be in the subtree that the metadata pointer points
|
||||||
|
to.
|
||||||
|
|
||||||
|
In the above example list of index keys, node A will contain::
|
||||||
|
|
||||||
|
SLOT CONTENT INDEX KEY (PREFIX)
|
||||||
|
==== =============== ==================
|
||||||
|
1 PTR TO NODE B 1*
|
||||||
|
any LEAF 9431809de993ba
|
||||||
|
any LEAF b4542910809cd
|
||||||
|
e PTR TO NODE F e*
|
||||||
|
any LEAF f3842239082
|
||||||
|
|
||||||
|
and node B::
|
||||||
|
|
||||||
|
3 PTR TO NODE C 13*
|
||||||
|
any LEAF 1458952489
|
||||||
|
|
||||||
|
|
||||||
|
Shortcuts
|
||||||
|
---------
|
||||||
|
|
||||||
|
Shortcuts are metadata records that jump over a piece of keyspace. A shortcut
|
||||||
|
is a replacement for a series of single-occupancy nodes ascending through the
|
||||||
|
levels. Shortcuts exist to save memory and to speed up traversal.
|
||||||
|
|
||||||
|
It is possible for the root of the tree to be a shortcut - say, for example,
|
||||||
|
the tree contains at least 17 leaves all with key prefix ``1111``. The
|
||||||
|
insertion algorithm will insert a shortcut to skip over the ``1111`` keyspace
|
||||||
|
in a single bound and get to the fourth level where these actually become
|
||||||
|
different.
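
How far such a shortcut would jump can be estimated by counting the leading nibbles that all the keys share. The sketch below does that for keys written as hex strings; ``common_prefix_levels()`` is a hypothetical helper for illustration and is not part of the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the leading hex digits (one nibble, hence one tree level, each)
 * shared by all @nr keys. Keys are NUL-terminated hex strings.
 */
size_t common_prefix_levels(const char *const *keys, size_t nr)
{
	size_t level, i;

	assert(keys != NULL && nr > 0);
	for (level = 0; keys[0][level] != '\0'; level++)
		for (i = 1; i < nr; i++)
			if (keys[i][level] != keys[0][level])
				return level;	/* first level that differs */
	return level;
}
```

For keys that all begin with ``1111`` and then diverge, this yields 4, the number of levels the shortcut in the example above skips.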
|
||||||
|
|
||||||
|
|
||||||
|
Splitting And Collapsing Nodes
|
||||||
|
------------------------------
|
||||||
|
|
||||||
|
Each node has a maximum capacity of 16 leaves and metadata pointers. If the
|
||||||
|
insertion algorithm finds that it is trying to insert a 17th object into a
|
||||||
|
node, that node will be split such that at least two leaves that have a common
|
||||||
|
key segment at that level end up in a separate node rooted on that slot for
|
||||||
|
that common key segment.
|
||||||
|
|
||||||
|
If the leaves in a full node and the leaf that is being inserted are
|
||||||
|
sufficiently similar, then a shortcut will be inserted into the tree.
|
||||||
|
|
||||||
|
When the number of objects in the subtree rooted at a node falls to 16 or
|
||||||
|
fewer, then the subtree will be collapsed down to a single node - and this will
|
||||||
|
ripple towards the root if possible.
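
The split rule above relies on the fact that 17 entries cannot all occupy distinct slots in a 16-slot node, so at least two must select the same slot. A minimal sketch of finding such a slot follows; ``find_shared_slot()`` and ``NODE_FAN_OUT`` are hypothetical names, and the real insertion code does considerably more than this.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_FAN_OUT 16	/* slots per node, as stated above */

/*
 * Given the slot number that each of @nr leaves selects at this level,
 * return a slot shared by at least two of them, or -1 if none is shared.
 * With nr > NODE_FAN_OUT a shared slot always exists.
 */
int find_shared_slot(const unsigned char *slots, size_t nr)
{
	unsigned int count[NODE_FAN_OUT] = { 0 };
	size_t i;

	for (i = 0; i < nr; i++) {
		assert(slots[i] < NODE_FAN_OUT);
		if (++count[slots[i]] == 2)
			return slots[i];
	}
	return -1;
}
```

If every one of the 17 leaves selects the same slot, a split at this level separates nothing, which corresponds to the shortcut case described above.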
|
||||||
|
|
||||||
|
|
||||||
|
Non-Recursive Iteration
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
Each node and shortcut contains a back pointer to its parent and the number of the
|
||||||
|
slot in that parent that points to it. Non-recursive iteration uses these to
|
||||||
|
proceed rootwards through the tree, going to the parent node, slot N + 1 to
|
||||||
|
make sure progress is made without the need for a stack.
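
A stackless walk of this kind can be sketched in userspace as below. The structures are deliberately simplified and hypothetical: a slot holds separate leaf and child fields instead of a type-marked pointer, shortcuts are left out, and no RCU or barriers are involved. The walk assumes it is started at the true root, whose back pointer is ``NULL``.

```c
#include <assert.h>
#include <stddef.h>

#define FAN_OUT 16

struct node;

/* A slot is empty (both NULL), a leaf, or a pointer to a child node. */
struct slot {
	const void *leaf;
	struct node *child;
};

struct node {
	struct node *parent;	/* back pointer; NULL at the root */
	int parent_slot;	/* slot in @parent that points here */
	struct slot slots[FAN_OUT];
};

/* Call @visit on every leaf under @root without recursion or a stack. */
void walk_leaves(struct node *root,
		 void (*visit)(const void *leaf, void *data), void *data)
{
	struct node *node = root;
	int slot;

	assert(visit != NULL);
	while (node) {
		/* Entering a node: report all of its leaves first. */
		for (slot = 0; slot < FAN_OUT; slot++)
			if (node->slots[slot].leaf)
				visit(node->slots[slot].leaf, data);
		slot = 0;

		for (;;) {
			/* Descend into the next child at or after @slot. */
			while (slot < FAN_OUT && !node->slots[slot].child)
				slot++;
			if (slot < FAN_OUT) {
				node = node->slots[slot].child;
				break;
			}
			/* Node exhausted: resume in the parent at slot N + 1. */
			slot = node->parent_slot + 1;
			node = node->parent;
			if (!node)
				return;
		}
	}
}

/* Example callback: record the int that each leaf points to. */
struct leaf_log {
	int ids[8];
	int nr;
};

void log_leaf(const void *leaf, void *data)
{
	struct leaf_log *log = data;

	if (log->nr < 8)
		log->ids[log->nr++] = *(const int *)leaf;
}
```

Resuming at ``parent_slot + 1`` is what guarantees forward progress: the slot just finished is never re-entered, so no record of the path taken needs to be kept.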
|
||||||
|
|
||||||
|
The backpointers, however, make simultaneous alteration and iteration tricky.
|
||||||
|
|
||||||
|
|
||||||
|
Simultaneous Alteration And Iteration
|
||||||
|
-------------------------------------
|
||||||
|
|
||||||
|
There are a number of cases to consider:
|
||||||
|
|
||||||
|
1. Simple insert/replace. This involves simply replacing a NULL or old
|
||||||
|
matching leaf pointer with the pointer to the new leaf after a barrier.
|
||||||
|
The metadata blocks don't change otherwise. An old leaf won't be freed
|
||||||
|
until after the RCU grace period.
|
||||||
|
|
||||||
|
2. Simple delete. This involves just clearing an old matching leaf. The
|
||||||
|
metadata blocks don't change otherwise. The old leaf won't be freed until
|
||||||
|
after the RCU grace period.
|
||||||
|
|
||||||
|
3. Insertion replacing part of a subtree that we haven't yet entered. This
|
||||||
|
may involve replacement of part of that subtree - but that won't affect
|
||||||
|
the iteration as we won't have reached the pointer to it yet and the
|
||||||
|
ancestry blocks are not replaced (the layout of those does not change).
|
||||||
|
|
||||||
|
4. Insertion replacing nodes that we're actively processing. This isn't a
|
||||||
|
problem as we've passed the anchoring pointer and won't switch onto the
|
||||||
|
new layout until we follow the back pointers - at which point we've
|
||||||
|
already examined the leaves in the replaced node (we iterate over all the
|
||||||
|
leaves in a node before following any of its metadata pointers).
|
||||||
|
|
||||||
|
We might, however, re-see some leaves that have been split out into a new
|
||||||
|
branch that's in a slot further along than we were at.
|
||||||
|
|
||||||
|
5. Insertion replacing nodes that we're processing a dependent branch of.
|
||||||
|
This won't affect us until we follow the back pointers. Similar to (4).
|
||||||
|
|
||||||
|
6. Deletion collapsing a branch under us. This doesn't affect us because the
|
||||||
|
back pointers will get us back to the parent of the new node before we
|
||||||
|
could see the new node. The entire collapsed subtree is thrown away
|
||||||
|
unchanged - and will still be rooted on the same slot, so we shouldn't
|
||||||
|
process it a second time as we'll go back to slot + 1.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
Under some circumstances, we need to simultaneously change the parent
|
||||||
|
pointer and the parent slot pointer on a node (say, for example, we
|
||||||
|
inserted another node before it and moved it up a level). We cannot do
|
||||||
|
this without locking against a read - so we have to replace that node too.
|
||||||
|
|
||||||
|
However, when we're changing a shortcut into a node this isn't a problem
|
||||||
|
as shortcuts only have one slot and so the parent slot number isn't used
|
||||||
|
when traversing backwards over one. This means that it's okay to change
|
||||||
|
the slot number first - provided suitable barriers are used to make sure
|
||||||
|
the parent slot number is read after the back pointer.
|
||||||
|
|
||||||
|
Obsolete blocks and leaves are freed up after an RCU grace period has passed,
|
||||||
|
so as long as anyone doing walking or iteration holds the RCU read lock, the
|
||||||
|
old superstructure should not go away on them.
|
||||||
@@ -1,34 +1,40 @@
|
|||||||
Semantics and Behavior of Atomic and
|
=======================================================
|
||||||
Bitmask Operations
|
Semantics and Behavior of Atomic and Bitmask Operations
|
||||||
|
=======================================================
|
||||||
|
|
||||||
David S. Miller
|
:Author: David S. Miller
|
||||||
|
|
||||||
This document is intended to serve as a guide to Linux port
|
This document is intended to serve as a guide to Linux port
|
||||||
maintainers on how to implement atomic counter, bitops, and spinlock
|
maintainers on how to implement atomic counter, bitops, and spinlock
|
||||||
interfaces properly.
|
interfaces properly.
|
||||||
|
|
||||||
|
Atomic Type And Operations
|
||||||
|
==========================
|
||||||
|
|
||||||
The atomic_t type should be defined as a signed integer and
|
The atomic_t type should be defined as a signed integer and
|
||||||
the atomic_long_t type as a signed long integer. Also, they should
|
the atomic_long_t type as a signed long integer. Also, they should
|
||||||
be made opaque such that any kind of cast to a normal C integer type
|
be made opaque such that any kind of cast to a normal C integer type
|
||||||
will fail. Something like the following should suffice:
|
will fail. Something like the following should suffice::
|
||||||
|
|
||||||
typedef struct { int counter; } atomic_t;
|
typedef struct { int counter; } atomic_t;
|
||||||
typedef struct { long counter; } atomic_long_t;
|
typedef struct { long counter; } atomic_long_t;
|
||||||
|
|
||||||
Historically, counter has been declared volatile. This is now discouraged.
|
Historically, counter has been declared volatile. This is now discouraged.
|
||||||
See Documentation/volatile-considered-harmful.txt for the complete rationale.
|
See :ref:`Documentation/process/volatile-considered-harmful.rst
|
||||||
|
<volatile_considered_harmful>` for the complete rationale.
|
||||||
|
|
||||||
local_t is very similar to atomic_t. If the counter is per CPU and only
|
local_t is very similar to atomic_t. If the counter is per CPU and only
|
||||||
updated by one CPU, local_t is probably more appropriate. Please see
|
updated by one CPU, local_t is probably more appropriate. Please see
|
||||||
Documentation/local_ops.txt for the semantics of local_t.
|
:ref:`Documentation/core-api/local_ops.rst <local_ops>` for the semantics of
|
||||||
|
local_t.
|
||||||
|
|
||||||
The first operations to implement for atomic_t's are the initializers and
|
The first operations to implement for atomic_t's are the initializers and
|
||||||
plain reads.
|
plain reads. ::
|
||||||
|
|
||||||
#define ATOMIC_INIT(i) { (i) }
|
#define ATOMIC_INIT(i) { (i) }
|
||||||
#define atomic_set(v, i) ((v)->counter = (i))
|
#define atomic_set(v, i) ((v)->counter = (i))
|
||||||
|
|
||||||
The first macro is used in definitions, such as:
|
The first macro is used in definitions, such as::
|
||||||
|
|
||||||
static atomic_t my_counter = ATOMIC_INIT(1);
|
static atomic_t my_counter = ATOMIC_INIT(1);
|
||||||
|
|
||||||
@@ -38,10 +44,10 @@ initializer is used before runtime. If the initializer is used at runtime, a
|
|||||||
proper implicit or explicit read memory barrier is needed before reading the
|
proper implicit or explicit read memory barrier is needed before reading the
|
||||||
value with atomic_read from another thread.
|
value with atomic_read from another thread.
|
||||||
|
|
||||||
As with all of the atomic_ interfaces, replace the leading "atomic_"
|
As with all of the ``atomic_`` interfaces, replace the leading ``atomic_``
|
||||||
with "atomic_long_" to operate on atomic_long_t.
|
with ``atomic_long_`` to operate on atomic_long_t.
|
||||||
|
|
||||||
The second interface can be used at runtime, as in:
|
The second interface can be used at runtime, as in::
|
||||||
|
|
||||||
struct foo { atomic_t counter; };
|
struct foo { atomic_t counter; };
|
||||||
...
|
...
|
||||||
@@ -59,7 +65,7 @@ been set with this operation or set with another operation. A proper implicit
|
|||||||
or explicit memory barrier is needed before the value set with the operation
|
or explicit memory barrier is needed before the value set with the operation
|
||||||
is guaranteed to be readable with atomic_read from another thread.
|
is guaranteed to be readable with atomic_read from another thread.
|
||||||
|
|
||||||
Next, we have:
|
Next, we have::
|
||||||
|
|
||||||
#define atomic_read(v) ((v)->counter)
|
#define atomic_read(v) ((v)->counter)
|
||||||
|
|
||||||
@@ -73,36 +79,37 @@ initialization by any other thread is visible yet, so the user of the
|
|||||||
interface must take care of that with a proper implicit or explicit memory
|
interface must take care of that with a proper implicit or explicit memory
|
||||||
barrier.
|
barrier.
|
||||||
|
|
||||||
*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***
|
.. warning::
|
||||||
|
|
||||||
Some architectures may choose to use the volatile keyword, barriers, or inline
|
``atomic_read()`` and ``atomic_set()`` DO NOT IMPLY BARRIERS!
|
||||||
assembly to guarantee some degree of immediacy for atomic_read() and
|
|
||||||
atomic_set(). This is not uniformly guaranteed, and may change in the future,
|
|
||||||
so all users of atomic_t should treat atomic_read() and atomic_set() as simple
|
|
||||||
C statements that may be reordered or optimized away entirely by the compiler
|
|
||||||
or processor, and explicitly invoke the appropriate compiler and/or memory
|
|
||||||
barrier for each use case. Failure to do so will result in code that may
|
|
||||||
suddenly break when used with different architectures or compiler
|
|
||||||
optimizations, or even changes in unrelated code which changes how the
|
|
||||||
compiler optimizes the section accessing atomic_t variables.
|
|
||||||
|
|
||||||
*** YOU HAVE BEEN WARNED! ***
|
Some architectures may choose to use the volatile keyword, barriers, or
|
||||||
|
inline assembly to guarantee some degree of immediacy for atomic_read()
|
||||||
|
and atomic_set(). This is not uniformly guaranteed, and may change in
|
||||||
|
the future, so all users of atomic_t should treat atomic_read() and
|
||||||
|
atomic_set() as simple C statements that may be reordered or optimized
|
||||||
|
away entirely by the compiler or processor, and explicitly invoke the
|
||||||
|
appropriate compiler and/or memory barrier for each use case. Failure
|
||||||
|
to do so will result in code that may suddenly break when used with
|
||||||
|
different architectures or compiler optimizations, or even changes in
|
||||||
|
unrelated code which changes how the compiler optimizes the section
|
||||||
|
accessing atomic_t variables.
|
||||||
|
|
||||||
Properly aligned pointers, longs, ints, and chars (and unsigned
|
Properly aligned pointers, longs, ints, and chars (and unsigned
|
||||||
equivalents) may be atomically loaded from and stored to in the same
|
equivalents) may be atomically loaded from and stored to in the same
|
||||||
sense as described for atomic_read() and atomic_set(). The ACCESS_ONCE()
|
sense as described for atomic_read() and atomic_set(). The READ_ONCE()
|
||||||
macro should be used to prevent the compiler from using optimizations
|
and WRITE_ONCE() macros should be used to prevent the compiler from using
|
||||||
that might otherwise optimize accesses out of existence on the one hand,
|
optimizations that might otherwise optimize accesses out of existence on
|
||||||
or that might create unsolicited accesses on the other.
|
the one hand, or that might create unsolicited accesses on the other.
|
||||||
|
|
||||||
For example consider the following code:
|
For example consider the following code::
|
||||||
|
|
||||||
while (a > 0)
|
while (a > 0)
|
||||||
do_something();
|
do_something();
|
||||||
|
|
||||||
If the compiler can prove that do_something() does not store to the
|
If the compiler can prove that do_something() does not store to the
|
||||||
variable a, then the compiler is within its rights transforming this to
|
variable a, then the compiler is within its rights transforming this to
|
||||||
the following:
|
the following::
|
||||||
|
|
||||||
tmp = a;
|
tmp = a;
|
||||||
if (a > 0)
|
if (a > 0)
|
||||||
@@ -110,14 +117,14 @@ the following:
|
|||||||
do_something();
|
do_something();
|
||||||
|
|
||||||
If you don't want the compiler to do this (and you probably don't), then
|
If you don't want the compiler to do this (and you probably don't), then
|
||||||
you should use something like the following:
|
you should use something like the following::
|
||||||
|
|
||||||
while (ACCESS_ONCE(a) < 0)
|
while (READ_ONCE(a) < 0)
|
||||||
do_something();
|
do_something();
|
||||||
|
|
||||||
Alternatively, you could place a barrier() call in the loop.
|
Alternatively, you could place a barrier() call in the loop.
|
||||||
|
|
||||||
For another example, consider the following code:
|
For another example, consider the following code::
|
||||||
|
|
||||||
tmp_a = a;
|
tmp_a = a;
|
||||||
do_something_with(tmp_a);
|
do_something_with(tmp_a);
|
||||||
@@ -125,7 +132,7 @@ For another example, consider the following code:
|
|||||||
|
|
||||||
If the compiler can prove that do_something_with() does not store to the
|
If the compiler can prove that do_something_with() does not store to the
|
||||||
variable a, then the compiler is within its rights to manufacture an
|
variable a, then the compiler is within its rights to manufacture an
|
||||||
additional load as follows:
|
additional load as follows::
|
||||||
|
|
||||||
tmp_a = a;
|
tmp_a = a;
|
||||||
do_something_with(tmp_a);
|
do_something_with(tmp_a);
|
||||||
@@ -139,15 +146,15 @@ The compiler would be likely to manufacture this additional load if
|
|||||||
do_something_with() was an inline function that made very heavy use
|
do_something_with() was an inline function that made very heavy use
|
||||||
of registers: reloading from variable a could save a flush to the
|
of registers: reloading from variable a could save a flush to the
|
||||||
stack and later reload. To prevent the compiler from attacking your
|
stack and later reload. To prevent the compiler from attacking your
|
||||||
code in this manner, write the following:
|
code in this manner, write the following::
|
||||||
|
|
||||||
tmp_a = ACCESS_ONCE(a);
|
tmp_a = READ_ONCE(a);
|
||||||
do_something_with(tmp_a);
|
do_something_with(tmp_a);
|
||||||
do_something_else_with(tmp_a);
|
do_something_else_with(tmp_a);
|
||||||
|
|
||||||
For a final example, consider the following code, assuming that the
|
For a final example, consider the following code, assuming that the
|
||||||
variable a is set at boot time before the second CPU is brought online
|
variable a is set at boot time before the second CPU is brought online
|
||||||
and never changed later, so that memory barriers are not needed:
|
and never changed later, so that memory barriers are not needed::
|
||||||
|
|
||||||
if (a)
|
if (a)
|
||||||
b = 9;
|
b = 9;
|
||||||
@@ -155,7 +162,7 @@ and never changed later, so that memory barriers are not needed:
|
|||||||
b = 42;
|
b = 42;
|
||||||
|
|
||||||
The compiler is within its rights to manufacture an additional store
|
The compiler is within its rights to manufacture an additional store
|
||||||
by transforming the above code into the following:
|
by transforming the above code into the following::
|
||||||
|
|
||||||
b = 42;
|
b = 42;
|
||||||
if (a)
|
if (a)
|
||||||
@@ -163,20 +170,22 @@ by transforming the above code into the following:
|
|||||||
|
|
||||||
This could come as a fatal surprise to other code running concurrently
|
This could come as a fatal surprise to other code running concurrently
|
||||||
that expected b to never have the value 42 if a was zero. To prevent
|
that expected b to never have the value 42 if a was zero. To prevent
|
||||||
the compiler from doing this, write something like:
|
the compiler from doing this, write something like::
|
||||||
|
|
||||||
if (a)
|
if (a)
|
||||||
ACCESS_ONCE(b) = 9;
|
WRITE_ONCE(b, 9);
|
||||||
else
|
else
|
||||||
ACCESS_ONCE(b) = 42;
|
WRITE_ONCE(b, 42);
|
||||||
|
|
||||||
Don't even -think- about doing this without proper use of memory barriers,
|
Don't even -think- about doing this without proper use of memory barriers,
|
||||||
locks, or atomic operations if variable a can change at runtime!
|
locks, or atomic operations if variable a can change at runtime!
|
||||||
|
|
||||||
*** WARNING: ACCESS_ONCE() DOES NOT IMPLY A BARRIER! ***
|
.. warning::
|
||||||
|
|
||||||
|
``READ_ONCE()`` OR ``WRITE_ONCE()`` DO NOT IMPLY A BARRIER!
|
||||||
|
|
||||||
Now, we move onto the atomic operation interfaces typically implemented with
|
Now, we move onto the atomic operation interfaces typically implemented with
|
||||||
the help of assembly code.
|
the help of assembly code. ::
|
||||||
|
|
||||||
void atomic_add(int i, atomic_t *v);
|
void atomic_add(int i, atomic_t *v);
|
||||||
void atomic_sub(int i, atomic_t *v);
|
void atomic_sub(int i, atomic_t *v);
|
||||||
@@ -192,7 +201,7 @@ One very important aspect of these two routines is that they DO NOT
|
|||||||
require any explicit memory barriers. They need only perform the
|
require any explicit memory barriers. They need only perform the
|
||||||
atomic_t counter update in an SMP safe manner.
|
atomic_t counter update in an SMP safe manner.
|
||||||
|
|
||||||
Next, we have:
|
Next, we have::
|
||||||
|
|
||||||
int atomic_inc_return(atomic_t *v);
|
int atomic_inc_return(atomic_t *v);
|
||||||
int atomic_dec_return(atomic_t *v);
|
int atomic_dec_return(atomic_t *v);
|
||||||
@@ -214,7 +223,7 @@ If the atomic instructions used in an implementation provide explicit
|
|||||||
memory barrier semantics which satisfy the above requirements, that is
|
memory barrier semantics which satisfy the above requirements, that is
|
||||||
fine as well.
|
fine as well.
|
||||||
|
|
||||||
Let's move on:
|
Let's move on::
|
||||||
|
|
||||||
int atomic_add_return(int i, atomic_t *v);
|
int atomic_add_return(int i, atomic_t *v);
|
||||||
int atomic_sub_return(int i, atomic_t *v);
|
int atomic_sub_return(int i, atomic_t *v);
|
||||||
@@ -224,7 +233,7 @@ explicit counter adjustment is given instead of the implicit "1".
|
|||||||
This means that like atomic_{inc,dec}_return(), the memory barrier
|
This means that like atomic_{inc,dec}_return(), the memory barrier
|
||||||
semantics are required.
|
semantics are required.
|
||||||
|
|
||||||
Next:
|
Next::
|
||||||
|
|
||||||
int atomic_inc_and_test(atomic_t *v);
|
int atomic_inc_and_test(atomic_t *v);
|
||||||
int atomic_dec_and_test(atomic_t *v);
|
int atomic_dec_and_test(atomic_t *v);
|
||||||
@@ -234,13 +243,13 @@ given atomic counter. They return a boolean indicating whether the
|
|||||||
resulting counter value was zero or not.
|
resulting counter value was zero or not.
|
||||||
|
|
||||||
Again, these primitives provide explicit memory barrier semantics around
|
Again, these primitives provide explicit memory barrier semantics around
|
||||||
the atomic operation.
|
the atomic operation::
|
||||||
|
|
||||||
int atomic_sub_and_test(int i, atomic_t *v);
|
int atomic_sub_and_test(int i, atomic_t *v);
|
||||||
|
|
||||||
This is identical to atomic_dec_and_test() except that an explicit
|
This is identical to atomic_dec_and_test() except that an explicit
|
||||||
decrement is given instead of the implicit "1". This primitive must
|
decrement is given instead of the implicit "1". This primitive must
|
||||||
provide explicit memory barrier semantics around the operation.
|
provide explicit memory barrier semantics around the operation::
|
||||||
|
|
||||||
int atomic_add_negative(int i, atomic_t *v);
|
int atomic_add_negative(int i, atomic_t *v);
|
||||||
|
|
||||||
@@ -249,7 +258,7 @@ is return which indicates whether the resulting counter value is negative.
|
|||||||
This primitive must provide explicit memory barrier semantics around
|
This primitive must provide explicit memory barrier semantics around
|
||||||
the operation.
|
the operation.
|
||||||
|
|
||||||
Then:
|
Then::
|
||||||
|
|
||||||
int atomic_xchg(atomic_t *v, int new);
|
int atomic_xchg(atomic_t *v, int new);
|
||||||
|
|
||||||
@@ -257,14 +266,14 @@ This performs an atomic exchange operation on the atomic variable v, setting
|
|||||||
the given new value. It returns the old value that the atomic variable v had
|
the given new value. It returns the old value that the atomic variable v had
|
||||||
just before the operation.
|
just before the operation.
|
||||||
|
|
||||||
atomic_xchg must provide explicit memory barriers around the operation.
|
atomic_xchg must provide explicit memory barriers around the operation. ::
|
||||||
|
|
||||||
int atomic_cmpxchg(atomic_t *v, int old, int new);
|
int atomic_cmpxchg(atomic_t *v, int old, int new);
|
||||||
|
|
||||||
This performs an atomic compare exchange operation on the atomic value v,
|
This performs an atomic compare exchange operation on the atomic value v,
|
||||||
with the given old and new values. Like all atomic_xxx operations,
|
with the given old and new values. Like all atomic_xxx operations,
|
||||||
atomic_cmpxchg will only satisfy its atomicity semantics as long as all
|
atomic_cmpxchg will only satisfy its atomicity semantics as long as all
|
||||||
other accesses of *v are performed through atomic_xxx operations.
|
other accesses of \*v are performed through atomic_xxx operations.
|
||||||
|
|
||||||
atomic_cmpxchg must provide explicit memory barriers around the operation,
|
atomic_cmpxchg must provide explicit memory barriers around the operation,
|
||||||
although if the comparison fails then no memory ordering guarantees are
|
although if the comparison fails then no memory ordering guarantees are
|
||||||
@@ -273,7 +282,7 @@ required.
|
|||||||
The semantics for atomic_cmpxchg are the same as those defined for 'cas'
|
The semantics for atomic_cmpxchg are the same as those defined for 'cas'
|
||||||
below.
|
below.
|
||||||
|
|
||||||
Finally:
|
Finally::
|
||||||
|
|
||||||
int atomic_add_unless(atomic_t *v, int a, int u);
|
int atomic_add_unless(atomic_t *v, int a, int u);
|
||||||
|
|
||||||
@@ -289,12 +298,12 @@ atomic_inc_not_zero, equivalent to atomic_add_unless(v, 1, 0)
|
|||||||
|
|
||||||
If a caller requires memory barrier semantics around an atomic_t
|
If a caller requires memory barrier semantics around an atomic_t
|
||||||
operation which does not return a value, a set of interfaces are
|
operation which does not return a value, a set of interfaces are
|
||||||
defined which accomplish this:
|
defined which accomplish this::
|
||||||
|
|
||||||
void smp_mb__before_atomic(void);
|
void smp_mb__before_atomic(void);
|
||||||
void smp_mb__after_atomic(void);
|
void smp_mb__after_atomic(void);
|
||||||
|
|
||||||
For example, smp_mb__before_atomic() can be used like so:
|
For example, smp_mb__before_atomic() can be used like so::
|
||||||
|
|
||||||
obj->dead = 1;
|
obj->dead = 1;
|
||||||
smp_mb__before_atomic();
|
smp_mb__before_atomic();
|
||||||
@@ -315,7 +324,7 @@ atomic_t implementation above can have disastrous results. Here is
|
|||||||
an example, which follows a pattern occurring frequently in the Linux
|
an example, which follows a pattern occurring frequently in the Linux
|
||||||
kernel. It is the use of atomic counters to implement reference
|
kernel. It is the use of atomic counters to implement reference
|
||||||
counting, and it works such that once the counter falls to zero it can
|
counting, and it works such that once the counter falls to zero it can
|
||||||
be guaranteed that no other entity can be accessing the object:
|
be guaranteed that no other entity can be accessing the object::
|
||||||
|
|
||||||
static void obj_list_add(struct obj *obj, struct list_head *head)
|
static void obj_list_add(struct obj *obj, struct list_head *head)
|
||||||
{
|
{
|
||||||
@@ -372,10 +381,12 @@ void obj_timeout(struct obj *obj)
|
|||||||
obj_destroy(obj);
|
obj_destroy(obj);
|
||||||
}
|
}
|
||||||
|
|
||||||
(This is a simplification of the ARP queue management in the
|
.. note::
|
||||||
generic neighbour discover code of the networking. Olaf Kirch
|
|
||||||
found a bug wrt. memory barriers in kfree_skb() that exposed
|
This is a simplification of the ARP queue management in the generic
|
||||||
the atomic_t memory barrier requirements quite clearly.)
|
neighbour discovery code of the networking stack. Olaf Kirch found a bug wrt.
|
||||||
|
memory barriers in kfree_skb() that exposed the atomic_t memory barrier
|
||||||
|
requirements quite clearly.
|
||||||
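The reference counting pattern above can be modelled in user space. The following is only a minimal sketch, assuming C11 ``<stdatomic.h>`` as a stand-in for the kernel's atomic_t; the struct layout and the names obj_init(), obj_get(), obj_put() and obj_deactivate() are hypothetical and are not the kernel code the text simplifies. The default sequentially consistent ordering of atomic_fetch_sub() is used as the closest C11 analogue of the required barrier semantics; it is not claimed to be identical to them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the "struct obj" of the text. */
struct obj {
	atomic_int refcnt;
	int active;
	bool destroyed;
};

void obj_init(struct obj *obj)
{
	atomic_init(&obj->refcnt, 1);
	obj->active = 1;
	obj->destroyed = false;
}

void obj_get(struct obj *obj)
{
	/* Taking a reference needs atomicity, but no ordering. */
	atomic_fetch_add_explicit(&obj->refcnt, 1, memory_order_relaxed);
}

void obj_deactivate(struct obj *obj)
{
	/* Must be visible before the final decrement in obj_put(). */
	obj->active = 0;
}

void obj_destroy(struct obj *obj)
{
	/* The assertion the text refers to: nobody may see a live object. */
	assert(!obj->active);
	obj->destroyed = true;
}

/* Analogue of: if (atomic_dec_and_test(&obj->refcnt)) obj_destroy(obj); */
void obj_put(struct obj *obj)
{
	/* fetch_sub returns the OLD value, so 1 means "dropped to zero". */
	if (atomic_fetch_sub(&obj->refcnt, 1) == 1)
		obj_destroy(obj);
}
```

If the decrement carried no ordering, the store in obj_deactivate() could become visible after the counter reaches zero, which is exactly the failure the assertion in obj_destroy() is meant to catch.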
|
|
||||||
Given the above scheme, it must be the case that the obj->active
|
Given the above scheme, it must be the case that the obj->active
|
||||||
update done by the obj list deletion be visible to other processors
|
update done by the obj list deletion be visible to other processors
|
||||||
@@ -383,7 +394,7 @@ before the atomic counter decrement is performed.
|
|||||||
|
|
||||||
Otherwise, the counter could fall to zero, yet obj->active would still
|
Otherwise, the counter could fall to zero, yet obj->active would still
|
||||||
be set, thus triggering the assertion in obj_destroy(). The error
|
be set, thus triggering the assertion in obj_destroy(). The error
|
||||||
sequence looks like this:
|
sequence looks like this::
|
||||||
|
|
||||||
cpu 0 cpu 1
|
cpu 0 cpu 1
|
||||||
obj_poke() obj_timeout()
|
obj_poke() obj_timeout()
|
||||||
@@ -420,6 +431,10 @@ same scheme.
|
|||||||
Another note is that the atomic_t operations returning values are
|
Another note is that the atomic_t operations returning values are
|
||||||
extremely slow on an old 386.
|
extremely slow on an old 386.
|
||||||
|
|
||||||
|
|
||||||
|
Atomic Bitmask
|
||||||
|
==============
|
||||||
|
|
||||||
We will now cover the atomic bitmask operations. You will find that
|
We will now cover the atomic bitmask operations. You will find that
|
||||||
their SMP and memory barrier semantics are similar in shape and scope
|
their SMP and memory barrier semantics are similar in shape and scope
|
||||||
to the atomic_t ops above.
|
to the atomic_t ops above.
|
||||||
@@ -427,7 +442,7 @@ to the atomic_t ops above.
|
|||||||
Native atomic bit operations are defined to operate on objects aligned
|
Native atomic bit operations are defined to operate on objects aligned
|
||||||
to the size of an "unsigned long" C data type, and are at least of that
|
to the size of an "unsigned long" C data type, and are at least of that
|
||||||
size. The endianness of the bits within each "unsigned long" is the
|
size. The endianness of the bits within each "unsigned long" is the
|
||||||
native endianness of the cpu.
|
native endianness of the cpu. ::
|
||||||
|
|
||||||
void set_bit(unsigned long nr, volatile unsigned long *addr);
|
void set_bit(unsigned long nr, volatile unsigned long *addr);
|
||||||
void clear_bit(unsigned long nr, volatile unsigned long *addr);
|
void clear_bit(unsigned long nr, volatile unsigned long *addr);
|
||||||
@@ -437,7 +452,7 @@ These routines set, clear, and change, respectively, the bit number
|
|||||||
indicated by "nr" on the bit mask pointed to by "ADDR".
|
indicated by "nr" on the bit mask pointed to by "ADDR".
|
||||||
|
|
||||||
They must execute atomically, yet there are no implicit memory barrier
|
They must execute atomically, yet there are no implicit memory barrier
|
||||||
semantics required of these interfaces.
|
semantics required of these interfaces. ::
|
||||||
|
|
||||||
int test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
|
int test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
|
||||||
int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
|
int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
|
||||||
@@ -466,7 +481,7 @@ must provide explicit memory barrier semantics around their execution.
|
|||||||
All memory operations before the atomic bit operation call must be
|
All memory operations before the atomic bit operation call must be
|
||||||
made visible globally before the atomic bit operation is made visible.
|
made visible globally before the atomic bit operation is made visible.
|
||||||
Likewise, the atomic bit operation must be visible globally before any
|
Likewise, the atomic bit operation must be visible globally before any
|
||||||
subsequent memory operation is made visible. For example:
|
subsequent memory operation is made visible. For example::
|
||||||
|
|
||||||
obj->dead = 1;
|
obj->dead = 1;
|
||||||
if (test_and_set_bit(0, &obj->flags))
|
if (test_and_set_bit(0, &obj->flags))
|
||||||
@@ -479,7 +494,7 @@ done by test_and_set_bit() becomes visible. Likewise, the atomic
|
|||||||
memory operation done by test_and_set_bit() must become visible before
|
memory operation done by test_and_set_bit() must become visible before
|
||||||
"obj->killed = 1;" is visible.
|
"obj->killed = 1;" is visible.
|
||||||
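To make the shape of these interfaces concrete, here is a rough user-space sketch, assuming C11 ``<stdatomic.h>``. The sketch_ prefixed names and the BITS_PER_LONG macro defined here are illustrative assumptions, not the kernel's implementation. The value-returning operations use the default sequentially consistent ordering as the nearest C11 analogue of the barrier semantics described above, while the plain test uses a relaxed load.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Returns the previous value of bit "nr" and sets it atomically. */
int sketch_test_and_set_bit(unsigned long nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long old;

	assert(addr);
	old = atomic_fetch_or(addr + nr / BITS_PER_LONG, mask);
	return (old & mask) != 0;
}

/* Returns the previous value of bit "nr" and clears it atomically. */
int sketch_test_and_clear_bit(unsigned long nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long old;

	assert(addr);
	old = atomic_fetch_and(addr + nr / BITS_PER_LONG, ~mask);
	return (old & mask) != 0;
}

/* Plain test: atomic read of the word, no ordering implied. */
int sketch_test_bit(unsigned long nr, atomic_ulong *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long word;

	assert(addr);
	word = atomic_load_explicit(addr + nr / BITS_PER_LONG,
				    memory_order_relaxed);
	return (word & mask) != 0;
}
```

Bit numbers larger than the width of one "unsigned long" simply index into the following words of the bitmask, which is why the address is treated as an array.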
|
|
||||||
Finally there is the basic operation:
|
Finally there is the basic operation::
|
||||||
|
|
||||||
int test_bit(unsigned long nr, __const__ volatile unsigned long *addr);
|
int test_bit(unsigned long nr, __const__ volatile unsigned long *addr);
|
||||||
|
|
||||||
@@ -488,13 +503,13 @@ pointed to by "addr".
|
|||||||
|
|
||||||
If explicit memory barriers are required around {set,clear}_bit() (which do
|
If explicit memory barriers are required around {set,clear}_bit() (which do
|
||||||
not return a value, and thus do not need to provide memory barrier
|
not return a value, and thus do not need to provide memory barrier
|
||||||
semantics), two interfaces are provided:
|
semantics), two interfaces are provided::
|
||||||
|
|
||||||
void smp_mb__before_atomic(void);
|
void smp_mb__before_atomic(void);
|
||||||
void smp_mb__after_atomic(void);
|
void smp_mb__after_atomic(void);
|
||||||
|
|
||||||
They are used as follows, and are akin to their atomic_t operation
|
They are used as follows, and are akin to their atomic_t operation
|
||||||
brothers:
|
brothers::
|
||||||
|
|
||||||
/* All memory operations before this call will
|
/* All memory operations before this call will
|
||||||
* be globally visible before the clear_bit().
|
* be globally visible before the clear_bit().
|
||||||
@@ -511,7 +526,7 @@ There are two special bitops with lock barrier semantics (acquire/release,
|
|||||||
same as spinlocks). These operate in the same way as their non-_lock/unlock
|
same as spinlocks). These operate in the same way as their non-_lock/unlock
|
||||||
postfixed variants, except that they are to provide acquire/release semantics,
|
postfixed variants, except that they are to provide acquire/release semantics,
|
||||||
respectively. This means they can be used for bit_spin_trylock and
|
respectively. This means they can be used for bit_spin_trylock and
|
||||||
bit_spin_unlock type operations without specifying any more barriers.
|
bit_spin_unlock type operations without specifying any more barriers. ::
|
||||||
|
|
||||||
int test_and_set_bit_lock(unsigned long nr, unsigned long *addr);
|
int test_and_set_bit_lock(unsigned long nr, unsigned long *addr);
|
||||||
void clear_bit_unlock(unsigned long nr, unsigned long *addr);
|
void clear_bit_unlock(unsigned long nr, unsigned long *addr);
|
||||||
@@ -526,7 +541,7 @@ provided. They are used in contexts where some other higher-level SMP
|
|||||||
locking scheme is being used to protect the bitmask, and thus less
|
locking scheme is being used to protect the bitmask, and thus less
|
||||||
expensive non-atomic operations may be used in the implementation.
|
expensive non-atomic operations may be used in the implementation.
|
||||||
They have names similar to the above bitmask operation interfaces,
|
They have names similar to the above bitmask operation interfaces,
|
||||||
except that two underscores are prefixed to the interface name.
|
except that two underscores are prefixed to the interface name. ::
|
||||||
|
|
||||||
void __set_bit(unsigned long nr, volatile unsigned long *addr);
|
void __set_bit(unsigned long nr, volatile unsigned long *addr);
|
||||||
void __clear_bit(unsigned long nr, volatile unsigned long *addr);
|
void __clear_bit(unsigned long nr, volatile unsigned long *addr);
|
||||||
@@ -542,9 +557,11 @@ The routines xchg() and cmpxchg() must provide the same exact
|
|||||||
memory-barrier semantics as the atomic and bit operations returning
|
memory-barrier semantics as the atomic and bit operations returning
|
||||||
values.
|
values.
|
||||||
|
|
||||||
Note: If someone wants to use xchg(), cmpxchg() and their variants,
|
.. note::
|
||||||
linux/atomic.h should be included rather than asm/cmpxchg.h, unless
|
|
||||||
the code is in arch/* and can take care of itself.
|
If someone wants to use xchg(), cmpxchg() and their variants,
|
||||||
|
linux/atomic.h should be included rather than asm/cmpxchg.h, unless the
|
||||||
|
code is in arch/* and can take care of itself.
|
||||||
|
|
||||||
Spinlocks and rwlocks have memory barrier expectations as well.
|
Spinlocks and rwlocks have memory barrier expectations as well.
|
||||||
The rule to follow is simple:
|
The rule to follow is simple:
|
||||||
@@ -558,7 +575,7 @@ The rule to follow is simple:
|
|||||||
|
|
||||||
Which finally brings us to _atomic_dec_and_lock(). There is an
|
Which finally brings us to _atomic_dec_and_lock(). There is an
|
||||||
architecture-neutral version implemented in lib/dec_and_lock.c,
|
architecture-neutral version implemented in lib/dec_and_lock.c,
|
||||||
but most platforms will wish to optimize this in assembler.
|
but most platforms will wish to optimize this in assembler. ::
|
||||||
|
|
||||||
int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock);
|
int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock);
|
||||||
|
|
||||||
@@ -573,7 +590,7 @@ sure the spinlock operation is globally visible before any
|
|||||||
subsequent memory operation.
|
subsequent memory operation.
|
||||||
|
|
||||||
We can demonstrate this operation more clearly if we define
|
We can demonstrate this operation more clearly if we define
|
||||||
an abstract atomic operation:
|
an abstract atomic operation::
|
||||||
|
|
||||||
long cas(long *mem, long old, long new);
|
long cas(long *mem, long old, long new);
|
||||||
|
|
||||||
@@ -584,7 +601,7 @@ an abstract atomic operation:
|
|||||||
3) Regardless, the current value at "mem" is returned.
|
3) Regardless, the current value at "mem" is returned.
|
||||||
|
|
||||||
As an example usage, here is what an atomic counter update
|
As an example usage, here is what an atomic counter update
|
||||||
might look like:
|
might look like::
|
||||||
|
|
||||||
void example_atomic_inc(long *counter)
|
void example_atomic_inc(long *counter)
|
||||||
{
|
{
|
||||||
@@ -600,7 +617,7 @@ void example_atomic_inc(long *counter)
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
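For readers who want to run the idea, the abstract cas() and the counter update can be approximated in user space. This is a sketch under stated assumptions: C11 ``<stdatomic.h>`` replaces the abstract primitive, the counter type is atomic_long rather than a plain long, and the parameter name new_val is chosen here for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Model of the abstract cas(): store "new_val" only if "*mem" equals
 * "old", and return the value "*mem" held before the operation.
 */
long cas(atomic_long *mem, long old, long new_val)
{
	long expected = old;

	assert(mem);
	/* On failure, "expected" is overwritten with the current value. */
	atomic_compare_exchange_strong(mem, &expected, new_val);
	return expected;
}

void example_atomic_inc(atomic_long *counter)
{
	long old, ret;

	for (;;) {
		old = atomic_load_explicit(counter, memory_order_relaxed);
		ret = cas(counter, old, old + 1);
		if (ret == old)
			break;	/* nobody raced with us; the update landed */
	}
}
```

The loop retries only when another updater changed the counter between the load and the cas(), which is why the returned value is compared against the value that was read.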
|
|
||||||
Let's use cas() in order to build a pseudo-C atomic_dec_and_lock():
|
Let's use cas() in order to build a pseudo-C atomic_dec_and_lock()::
|
||||||
|
|
||||||
int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock)
|
int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock)
|
||||||
{
|
{
|
||||||
@@ -635,6 +652,7 @@ Said another way, _atomic_dec_and_lock() must guarantee that
|
|||||||
a counter dropping to zero is never made visible before the
|
a counter dropping to zero is never made visible before the
|
||||||
spinlock being acquired.
|
spinlock being acquired.
|
||||||
|
|
||||||
Note that this also means that for the case where the counter
|
.. note::
|
||||||
is not dropping to zero, there are no memory ordering
|
|
||||||
requirements.
|
This also means that for the case where the counter is not
|
||||||
|
dropping to zero, there are no memory ordering requirements.
|
||||||
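The two cases described above, ordering required when the counter drops to zero and none required otherwise, can be illustrated with a user-space model. This is only a sketch, assuming C11 ``<stdatomic.h>``; the toy spinlock and all sketch_ prefixed names are hypothetical and do not reproduce lib/dec_and_lock.c or any architecture's assembler version.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical minimal spinlock, only for this sketch. */
struct sketch_spinlock {
	atomic_flag held;
};

void sketch_spin_lock(struct sketch_spinlock *lock)
{
	while (atomic_flag_test_and_set_explicit(&lock->held,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

void sketch_spin_unlock(struct sketch_spinlock *lock)
{
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/* Returns 1 with the lock held if the counter dropped to zero, else 0. */
int sketch_dec_and_lock(atomic_int *cnt, struct sketch_spinlock *lock)
{
	int old;

	assert(cnt && lock);
	old = atomic_load_explicit(cnt, memory_order_relaxed);

	/* Fast path: the result stays non-zero, so no ordering is needed. */
	while (old != 1) {
		if (atomic_compare_exchange_weak_explicit(cnt, &old, old - 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return 0;
	}

	/* Slow path: take the lock first, then do the final decrement. */
	sketch_spin_lock(lock);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return 1;
	sketch_spin_unlock(lock);
	return 0;
}
```

Because the lock is acquired before the decrement that can reach zero, no observer can see a zero counter without the lock already being held.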
10
Documentation/core-api/conf.py
Normal file
@@ -0,0 +1,10 @@
|
|||||||
|
# -*- coding: utf-8; mode: python -*-
|
||||||
|
|
||||||
|
project = "Core-API Documentation"
|
||||||
|
|
||||||
|
tags.add("subproject")
|
||||||
|
|
||||||
|
latex_documents = [
|
||||||
|
('index', 'core-api.tex', project,
|
||||||
|
'The kernel development community', 'manual'),
|
||||||
|
]
|
||||||
310
Documentation/core-api/debug-objects.rst
Normal file
@@ -0,0 +1,310 @@
|
|||||||
|
============================================
|
||||||
|
The object-lifetime debugging infrastructure
|
||||||
|
============================================
|
||||||
|
|
||||||
|
:Author: Thomas Gleixner
|
||||||
|
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
debugobjects is a generic infrastructure to track the lifetime of
|
||||||
|
kernel objects and validate the operations on those.
|
||||||
|
|
||||||
|
debugobjects is useful to check for the following error patterns:
|
||||||
|
|
||||||
|
- Activation of uninitialized objects
|
||||||
|
|
||||||
|
- Initialization of active objects
|
||||||
|
|
||||||
|
- Usage of freed/destroyed objects
|
||||||
|
|
||||||
|
debugobjects does not change the data structure of the real object, so it
|
||||||
|
can be compiled in with a minimal runtime impact and enabled on demand
|
||||||
|
with a kernel command line option.
|
||||||
|
|
||||||
|
How to use debugobjects
|
||||||
|
=======================
|
||||||
|
|
||||||
|
A kernel subsystem needs to provide a data structure which describes the
|
||||||
|
object type and add calls into the debug code at appropriate places. The
|
||||||
|
data structure to describe the object type needs at minimum the name of
|
||||||
|
the object type. Optional functions can and should be provided to fix up
|
||||||
|
detected problems so the kernel can continue to work and the debug
|
||||||
|
information can be retrieved from a live system instead of hard core
|
||||||
|
debugging with serial consoles and stack trace transcripts from the
|
||||||
|
monitor.
|
||||||
|
|
||||||
|
The debug calls provided by debugobjects are:
|
||||||
|
|
||||||
|
- debug_object_init
|
||||||
|
|
||||||
|
- debug_object_init_on_stack
|
||||||
|
|
||||||
|
- debug_object_activate
|
||||||
|
|
||||||
|
- debug_object_deactivate
|
||||||
|
|
||||||
|
- debug_object_destroy
|
||||||
|
|
||||||
|
- debug_object_free
|
||||||
|
|
||||||
|
- debug_object_assert_init
|
||||||
|
|
||||||
|
Each of these functions takes the address of the real object and a
|
||||||
|
pointer to the object type specific debug description structure.
|
||||||
|
|
||||||
|
Each detected error is reported in the statistics and a limited number
|
||||||
|
of errors are printk'ed including a full stack trace.
|
||||||
|
|
||||||
|
The statistics are available via /sys/kernel/debug/debug_objects/stats.
|
||||||
|
They provide information about the number of warnings and the number of
|
||||||
|
successful fixups along with information about the usage of the internal
|
||||||
|
tracking objects and the state of the internal tracking objects pool.
|
||||||
|
|
||||||
|
Debug functions
|
||||||
|
===============
|
||||||
|
|
||||||
|
.. kernel-doc:: lib/debugobjects.c
|
||||||
|
:functions: debug_object_init
|
||||||
|
|
||||||
|
This function is called whenever the initialization function of a real
|
||||||
|
object is called.
|
||||||
|
|
||||||
|
When the real object is already tracked by debugobjects, it is checked
|
||||||
|
whether the object can be initialized. Initializing is not allowed for
|
||||||
|
active and destroyed objects. When debugobjects detects an error, then
|
||||||
|
it calls the fixup_init function of the object type description
|
||||||
|
structure if provided by the caller. The fixup function can correct the
|
||||||
|
problem before the real initialization of the object happens. E.g. it
|
||||||
|
can deactivate an active object in order to prevent damage to the
|
||||||
|
subsystem.
|
||||||
|
|
||||||
|
When the real object is not yet tracked by debugobjects, debugobjects
|
||||||
|
allocates a tracker object for the real object and sets the tracker
|
||||||
|
object state to ODEBUG_STATE_INIT. It verifies that the object is not
|
||||||
|
on the caller's stack. If it is on the caller's stack, then a limited
|
||||||
|
number of warnings including a full stack trace is printk'ed. The
|
||||||
|
calling code must use debug_object_init_on_stack() and remove the
|
||||||
|
object before leaving the function which allocated it. See next section.
|
||||||
|
|
||||||
|
.. kernel-doc:: lib/debugobjects.c
|
||||||
|
:functions: debug_object_init_on_stack
|
||||||
|
|
||||||
|
This function is called whenever the initialization function of a real
|
||||||
|
object which resides on the stack is called.
|
||||||
|
|
||||||
|
When the real object is already tracked by debugobjects, it is checked
|
||||||
|
whether the object can be initialized. Initializing is not allowed for
|
||||||
|
active and destroyed objects. When debugobjects detects an error, then
|
||||||
|
it calls the fixup_init function of the object type description
|
||||||
|
structure if provided by the caller. The fixup function can correct the
|
||||||
|
problem before the real initialization of the object happens. E.g. it
|
||||||
|
can deactivate an active object in order to prevent damage to the
|
||||||
|
subsystem.
|
||||||
|
|
||||||
|
When the real object is not yet tracked by debugobjects, debugobjects
|
||||||
|
allocates a tracker object for the real object and sets the tracker
|
||||||
|
object state to ODEBUG_STATE_INIT. It verifies that the object is on
|
||||||
|
the caller's stack.
|
||||||
|
|
||||||
|
An object which is on the stack must be removed from the tracker by
|
||||||
|
calling debug_object_free() before the function which allocates the
|
||||||
|
object returns. Otherwise we keep track of stale objects.
|
||||||
|
|
||||||
|
.. kernel-doc:: lib/debugobjects.c
|
||||||
|
:functions: debug_object_activate
|
||||||
|
|
||||||
|
This function is called whenever the activation function of a real
|
||||||
|
object is called.
|
||||||
|
|
||||||
|
When the real object is already tracked by debugobjects, it is checked
|
||||||
|
whether the object can be activated. Activating is not allowed for
|
||||||
|
active and destroyed objects. When debugobjects detects an error, then
|
||||||
|
it calls the fixup_activate function of the object type description
|
||||||
|
structure if provided by the caller. The fixup function can correct the
|
||||||
|
problem before the real activation of the object happens. E.g. it can
|
||||||
|
deactivate an active object in order to prevent damage to the subsystem.
|
||||||
|
|
||||||
|
When the real object is not yet tracked by debugobjects then the
|
||||||
|
fixup_activate function is called if available. This is necessary to
|
||||||
|
allow the legitimate activation of statically allocated and initialized
|
||||||
|
objects. The fixup function checks whether the object is valid and calls
|
||||||
|
the debug_object_init() function to initialize the tracking of this
|
||||||
|
object.
|
||||||
|
|
||||||
|
When the activation is legitimate, then the state of the associated
|
||||||
|
tracker object is set to ODEBUG_STATE_ACTIVE.
|
||||||
|
|
||||||
|
|
||||||
|
.. kernel-doc:: lib/debugobjects.c
|
||||||
|
:functions: debug_object_deactivate
|
||||||
|
|
||||||
|
This function is called whenever the deactivation function of a real
|
||||||
|
object is called.
|
||||||
|
|
||||||
|
When the real object is tracked by debugobjects, it is checked whether
|
||||||
|
the object can be deactivated. Deactivating is not allowed for untracked
|
||||||
|
or destroyed objects.
|
||||||
|
|
||||||
|
When the deactivation is legitimate, then the state of the associated
|
||||||
|
tracker object is set to ODEBUG_STATE_INACTIVE.
|
||||||
|
|
||||||
|
.. kernel-doc:: lib/debugobjects.c
|
||||||
|
:functions: debug_object_destroy
|
||||||
|
|
||||||
|
This function is called to mark an object destroyed. This is useful to
|
||||||
|
prevent the usage of invalid objects, which are still available in
|
||||||
|
memory: either statically allocated objects or objects which are freed
|
||||||
|
later.
|
||||||
|
|
||||||
|
When the real object is tracked by debugobjects, it is checked whether
|
||||||
|
the object can be destroyed. Destruction is not allowed for active and
|
||||||
|
destroyed objects. When debugobjects detects an error, then it calls the
|
||||||
|
fixup_destroy function of the object type description structure if
|
||||||
|
provided by the caller. The fixup function can correct the problem
|
||||||
|
before the real destruction of the object happens. E.g. it can
|
||||||
|
deactivate an active object in order to prevent damage to the subsystem.
|
||||||
|
|
||||||
|
When the destruction is legitimate, then the state of the associated
|
||||||
|
tracker object is set to ODEBUG_STATE_DESTROYED.
|
||||||
|
|
||||||
|
.. kernel-doc:: lib/debugobjects.c
|
||||||
|
:functions: debug_object_free
|
||||||
|
|
||||||
|
This function is called before an object is freed.
|
||||||
|
|
||||||
|
When the real object is tracked by debugobjects, it is checked whether
|
||||||
|
the object can be freed. Free is not allowed for active objects. When
|
||||||
|
debugobjects detects an error, then it calls the fixup_free function of
|
||||||
|
the object type description structure if provided by the caller. The
|
||||||
|
fixup function can correct the problem before the real free of the
|
||||||
|
object happens. E.g. it can deactivate an active object in order to
|
||||||
|
prevent damage to the subsystem.
|
||||||
|
|
||||||
|
Note that debug_object_free removes the object from the tracker. Later
|
||||||
|
usage of the object is detected by the other debug checks.
|
||||||
|
|
||||||
|
|
||||||
|
.. kernel-doc:: lib/debugobjects.c
|
||||||
|
:functions: debug_object_assert_init
|
||||||
|
|
||||||
|
This function is called to assert that an object has been initialized.
|
||||||
|
|
||||||
|
When the real object is not tracked by debugobjects, it calls
|
||||||
|
fixup_assert_init of the object type description structure provided by
|
||||||
|
the caller, with the hardcoded object state ODEBUG_STATE_NOTAVAILABLE. The
|
||||||
|
fixup function can correct the problem by calling debug_object_init
|
||||||
|
and other specific initializing functions.
|
||||||
|
|
||||||
|
When the real object is already tracked by debugobjects it is ignored.
|
||||||
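The rules stated for the debug calls above amount to a small state machine. The following toy model is a sketch only: every toy_ prefixed name is hypothetical, the enum merely mirrors the ODEBUG_STATE_* names mentioned in the text, and a counter of warnings stands in for the fixup callbacks and printk output. It is not derived from lib/debugobjects.c. Two simplifications are assumptions of this sketch: activating an untracked object is counted as an error instead of being handed to fixup_activate, and destroying an untracked object is allowed.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_state {
	TOY_NOTAVAILABLE,	/* object is not tracked */
	TOY_INIT,
	TOY_INACTIVE,
	TOY_ACTIVE,
	TOY_DESTROYED,
};

struct toy_tracker {
	enum toy_state state;
	unsigned int warnings;
};

/* Record a detected error; a real implementation would try a fixup. */
static bool toy_warn(struct toy_tracker *t)
{
	t->warnings++;
	return false;
}

bool toy_init(struct toy_tracker *t)
{
	assert(t);
	if (t->state == TOY_ACTIVE || t->state == TOY_DESTROYED)
		return toy_warn(t);
	t->state = TOY_INIT;
	return true;
}

bool toy_activate(struct toy_tracker *t)
{
	assert(t);
	if (t->state == TOY_NOTAVAILABLE || t->state == TOY_ACTIVE ||
	    t->state == TOY_DESTROYED)
		return toy_warn(t);
	t->state = TOY_ACTIVE;
	return true;
}

bool toy_deactivate(struct toy_tracker *t)
{
	assert(t);
	if (t->state == TOY_NOTAVAILABLE || t->state == TOY_DESTROYED)
		return toy_warn(t);
	t->state = TOY_INACTIVE;
	return true;
}

bool toy_destroy(struct toy_tracker *t)
{
	assert(t);
	if (t->state == TOY_ACTIVE || t->state == TOY_DESTROYED)
		return toy_warn(t);
	t->state = TOY_DESTROYED;
	return true;
}

bool toy_free(struct toy_tracker *t)
{
	assert(t);
	if (t->state == TOY_ACTIVE)
		return toy_warn(t);
	t->state = TOY_NOTAVAILABLE;	/* removed from the tracker */
	return true;
}
```

Walking an object through init, activate, deactivate, destroy and free succeeds at every step in this model, while re-initializing or freeing an active object, or activating a destroyed one, is counted as an error.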
|
|
||||||
|
Fixup functions
|
||||||
|
===============
|
||||||
|
|
||||||
|
Debug object type description structure
|
||||||
|
---------------------------------------
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/debugobjects.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
fixup_init
|
||||||
|
-----------
|
||||||
|
|
||||||
|
This function is called from the debug code whenever a problem in
|
||||||
|
debug_object_init is detected. The function takes the address of the
|
||||||
|
object and the state which is currently recorded in the tracker.
|
||||||
|
|
||||||
|
Called from debug_object_init when the object state is:
|
||||||
|
|
||||||
|
- ODEBUG_STATE_ACTIVE
|
||||||
|
|
||||||
|
The function returns true when the fixup was successful, otherwise
|
||||||
|
false. The return value is used to update the statistics.
|
||||||
|
|
||||||
|
Note that the function needs to call the debug_object_init() function
|
||||||
|
again, after the damage has been repaired in order to keep the state
|
||||||
|
consistent.
|
||||||
|
|
||||||
|
fixup_activate
|
||||||
|
---------------
|
||||||
|
|
||||||
|
This function is called from the debug code whenever a problem in
|
||||||
|
debug_object_activate is detected.
|
||||||
|
|
||||||
|
Called from debug_object_activate when the object state is:
|
||||||
|
|
||||||
|
- ODEBUG_STATE_NOTAVAILABLE
|
||||||
|
|
||||||
|
- ODEBUG_STATE_ACTIVE
|
||||||
|
|
||||||
|
The function returns true when the fixup was successful, otherwise
|
||||||
|
false. The return value is used to update the statistics.
|
||||||
|
|
||||||
|
Note that the function needs to call the debug_object_activate()
|
||||||
|
function again after the damage has been repaired in order to keep the
|
||||||
|
state consistent.
|
||||||
|
|
||||||
|
The activation of statically initialized objects is a special case. When
|
||||||
|
debug_object_activate() has no tracked object for this object address
|
||||||
|
then fixup_activate() is called with object state
|
||||||
|
ODEBUG_STATE_NOTAVAILABLE. The fixup function needs to check whether
|
||||||
|
this is a legitimate case of a statically initialized object or not. In
|
||||||
|
case it is, it calls debug_object_init() and debug_object_activate()
|
||||||
|
to make the object known to the tracker and marked active. In this case
|
||||||
|
the function should return false because this is not a real fixup.
|
||||||
|
|
||||||
|
fixup_destroy
|
||||||
|
--------------
|
||||||
|
|
||||||
|
This function is called from the debug code whenever a problem in
|
||||||
|
debug_object_destroy is detected.
|
||||||
|
|
||||||
|
Called from debug_object_destroy when the object state is:
|
||||||
|
|
||||||
|
- ODEBUG_STATE_ACTIVE
|
||||||
|
|
||||||
|
The function returns true when the fixup was successful, otherwise
|
||||||
|
false. The return value is used to update the statistics.
|
||||||
|
|
||||||
|
fixup_free
|
||||||
|
-----------
|
||||||
|
|
||||||
|
This function is called from the debug code whenever a problem in
|
||||||
|
debug_object_free is detected. Further it can be called from the debug
|
||||||
|
checks in kfree/vfree, when an active object is detected from the
|
||||||
|
debug_check_no_obj_freed() sanity checks.
|
||||||
|
|
||||||
|
Called from debug_object_free() or debug_check_no_obj_freed() when
|
||||||
|
the object state is:
|
||||||
|
|
||||||
|
- ODEBUG_STATE_ACTIVE
|
||||||
|
|
||||||
|
The function returns true when the fixup was successful, otherwise
|
||||||
|
false. The return value is used to update the statistics.
|
||||||
|
|
||||||
|
fixup_assert_init
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
This function is called from the debug code whenever a problem in
|
||||||
|
debug_object_assert_init is detected.
|
||||||
|
|
||||||
|
Called from debug_object_assert_init() with a hardcoded state
|
||||||
|
ODEBUG_STATE_NOTAVAILABLE when the object is not found in the debug
|
||||||
|
bucket.
|
||||||
|
|
||||||
|
The function returns true when the fixup was successful, otherwise
|
||||||
|
false. The return value is used to update the statistics.
|
||||||
|
|
||||||
|
Note, this function should make sure debug_object_init() is called
|
||||||
|
before returning.
|
||||||
|
|
||||||
|
The handling of statically initialized objects is a special case. The
|
||||||
|
fixup function should check if this is a legitimate case of a statically
|
||||||
|
initialized object or not. In this case only debug_object_init()
|
||||||
|
should be called to make the object known to the tracker. Then the
|
||||||
|
function should return false because this is not a real fixup.
|
||||||
|
|
||||||
|
Known Bugs And Assumptions
|
||||||
|
==========================
|
||||||
|
|
||||||
|
None (knock on wood).
|
||||||
33
Documentation/core-api/index.rst
Normal file
@@ -0,0 +1,33 @@
|
|||||||
|
======================
|
||||||
|
Core API Documentation
|
||||||
|
======================
|
||||||
|
|
||||||
|
This is the beginning of a manual for core kernel APIs. The conversion
|
||||||
|
(and writing!) of documents for this manual is much appreciated!
|
||||||
|
|
||||||
|
Core utilities
|
||||||
|
==============
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
assoc_array
|
||||||
|
atomic_ops
|
||||||
|
local_ops
|
||||||
|
workqueue
|
||||||
|
|
||||||
|
Interfaces for kernel debugging
|
||||||
|
===============================
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
debug-objects
|
||||||
|
tracepoint
|
||||||
|
|
||||||
|
.. only:: subproject
|
||||||
|
|
||||||
|
Indices
|
||||||
|
=======
|
||||||
|
|
||||||
|
* :ref:`genindex`
|
||||||
206
Documentation/core-api/local_ops.rst
Normal file
@@ -0,0 +1,206 @@
|
|||||||
|
|
||||||
|
.. _local_ops:
|
||||||
|
|
||||||
|
=================================================
|
||||||
|
Semantics and Behavior of Local Atomic Operations
|
||||||
|
=================================================
|
||||||
|
|
||||||
|
:Author: Mathieu Desnoyers
|
||||||
|
|
||||||
|
|
||||||
|
This document explains the purpose of the local atomic operations, how
to implement them for any given architecture, and how to use them
properly. It also stresses the precautions that must be taken when reading
those local variables across CPUs when the order of memory writes matters.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
Note that ``local_t`` based operations are not recommended for general
|
||||||
|
kernel use. Please use the ``this_cpu`` operations instead unless there is
|
||||||
|
really a special purpose. Most uses of ``local_t`` in the kernel have been
|
||||||
|
replaced by ``this_cpu`` operations. ``this_cpu`` operations combine the
|
||||||
|
relocation with the ``local_t`` like semantics in a single instruction and
|
||||||
|
yield more compact and faster executing code.
|
||||||
|
|
||||||
|
|
||||||
|
Purpose of local atomic operations
|
||||||
|
==================================
|
||||||
|
|
||||||
|
Local atomic operations are meant to provide fast and highly reentrant per CPU
|
||||||
|
counters. They minimize the performance cost of standard atomic operations by
|
||||||
|
removing the LOCK prefix and memory barriers normally required to synchronize
|
||||||
|
across CPUs.
|
||||||
|
|
||||||
|
Having fast per CPU atomic counters is interesting in many cases: it does not
|
||||||
|
require disabling interrupts to protect from interrupt handlers and it permits
|
||||||
|
coherent counters in NMI handlers. It is especially useful for tracing purposes
|
||||||
|
and for various performance monitoring counters.
|
||||||
|
|
||||||
|
Local atomic operations only guarantee variable modification atomicity wrt the
CPU which owns the data. Therefore, care must be taken to make sure that only one
CPU writes to the ``local_t`` data. This is done by using per cpu data and
making sure that we modify it from within a preemption safe context. It is
however permitted to read ``local_t`` data from any CPU: it will then appear to
be written out of order wrt other memory writes by the owner CPU.
|
||||||
|
|
||||||
|
|
||||||
|
Implementation for a given architecture
|
||||||
|
=======================================
|
||||||
|
|
||||||
|
Local atomic operations can be implemented by slightly modifying the standard
atomic operations: only their UP variant must be kept. This typically means
removing the LOCK prefix (on i386 and x86_64) and any SMP synchronization
barrier. If the architecture does not have a different behavior between SMP
and UP, including ``asm-generic/local.h`` in your architecture's ``local.h``
is sufficient.
|
||||||
|
|
||||||
|
The ``local_t`` type is defined as an opaque ``signed long`` by embedding an
``atomic_long_t`` inside a structure. This is done so that a cast from this
type to a ``long`` fails. The definition looks like::
|
||||||
|
|
||||||
|
typedef struct { atomic_long_t a; } local_t;
|
||||||
|
|
||||||
|
|
||||||
|
Rules to follow when using local atomic operations
|
||||||
|
==================================================
|
||||||
|
|
||||||
|
* Variables touched by local ops must be per cpu variables.
|
||||||
|
* *Only* the CPU owner of these variables must write to them.
|
||||||
|
* This CPU can use local ops from any context (process, irq, softirq, nmi, ...)
|
||||||
|
to update its ``local_t`` variables.
|
||||||
|
* Preemption (or interrupts) must be disabled when using local ops in
|
||||||
|
process context to make sure the process won't be migrated to a
|
||||||
|
different CPU between getting the per-cpu variable and doing the
|
||||||
|
actual local op.
|
||||||
|
* When using local ops in interrupt context, no special care needs to be
  taken on a mainline kernel, since they will run on the local CPU with
  preemption already disabled. It is still advisable, however, to explicitly
  disable preemption anyway to make sure it will still work correctly on
  -rt kernels.
|
||||||
|
* Reading the local cpu variable will provide the current copy of the
|
||||||
|
variable.
|
||||||
|
* Reads of these variables can be done from any CPU, because updates to
|
||||||
|
"``long``", aligned, variables are always atomic. Since no memory
|
||||||
|
synchronization is done by the writer CPU, an outdated copy of the
|
||||||
|
variable can be read when reading some *other* cpu's variables.
|
||||||
|
|
||||||
|
|
||||||
|
How to use local atomic operations
|
||||||
|
==================================
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
#include <linux/percpu.h>
|
||||||
|
#include <asm/local.h>
|
||||||
|
|
||||||
|
static DEFINE_PER_CPU(local_t, counters) = LOCAL_INIT(0);
|
||||||
|
|
||||||
|
|
||||||
|
Counting
|
||||||
|
========
|
||||||
|
|
||||||
|
Counting is done on all the bits of a signed long.
|
||||||
|
|
||||||
|
In preemptible context, use ``get_cpu_var()`` and ``put_cpu_var()`` around
|
||||||
|
local atomic operations: it makes sure that preemption is disabled around write
|
||||||
|
access to the per cpu variable. For instance::
|
||||||
|
|
||||||
|
local_inc(&get_cpu_var(counters));
|
||||||
|
put_cpu_var(counters);
|
||||||
|
|
||||||
|
If you are already in a preemption-safe context, you can use
|
||||||
|
``this_cpu_ptr()`` instead::
|
||||||
|
|
||||||
|
local_inc(this_cpu_ptr(&counters));
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
Reading the counters
|
||||||
|
====================
|
||||||
|
|
||||||
|
Those local counters can be read from foreign CPUs to sum the count. Note that
the data seen by local_read across CPUs must be considered to be out of order
relative to other memory writes happening on the CPU that owns the data::
|
||||||
|
|
||||||
|
long sum = 0;
|
||||||
|
for_each_online_cpu(cpu)
|
||||||
|
sum += local_read(&per_cpu(counters, cpu));
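
The counting and summing pattern can also be tried outside the kernel. The following is a userspace model only, assuming C11 atomics with relaxed ordering as a rough stand-in for ``local_t``; ``model_local_t``, ``MODEL_NR_CPUS`` and the ``model_*`` helpers are hypothetical names. Unlike the kernel's local ops, a C11 atomic add may still compile to a locked instruction, so the sketch shows the access pattern, not the performance benefit.

```c
#include <stdatomic.h>

#define MODEL_NR_CPUS 4     /* hypothetical number of CPUs */

/* Opaque wrapper, mirroring how local_t hides its atomic_long_t. */
typedef struct { atomic_long a; } model_local_t;

/* One slot per modelled CPU; only the owner is meant to write its slot. */
static model_local_t model_counters[MODEL_NR_CPUS];

/* Owner-side increment: relaxed ordering, no cross-CPU synchronization. */
static void model_local_inc(model_local_t *l)
{
    atomic_fetch_add_explicit(&l->a, 1, memory_order_relaxed);
}

/* Read from any CPU: the value is atomic but possibly outdated. */
static long model_local_read(model_local_t *l)
{
    return atomic_load_explicit(&l->a, memory_order_relaxed);
}

/* Sum all slots, as the for_each_online_cpu() loop does. */
static long model_sum_counters(void)
{
    long sum = 0;

    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += model_local_read(&model_counters[cpu]);
    return sum;
}
```

Because the reads are relaxed, the sum is a snapshot that may lag behind concurrent increments, which matches the "out of order" caveat in the text.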
|
||||||
|
|
||||||
|
If you want to use a remote local_read to synchronize access to a resource
|
||||||
|
between CPUs, explicit ``smp_wmb()`` and ``smp_rmb()`` memory barriers must be used
|
||||||
|
respectively on the writer and the reader CPUs. It would be the case if you use
|
||||||
|
the ``local_t`` variable as a counter of bytes written in a buffer: there should
|
||||||
|
be a ``smp_wmb()`` between the buffer write and the counter increment and also a
|
||||||
|
``smp_rmb()`` between the counter read and the buffer read.
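
The barrier placement described above can be sketched in userspace with C11 fences. This is an analogy under stated assumptions, not kernel code: a release fence plays the role of ``smp_wmb()`` and an acquire fence the role of ``smp_rmb()`` (both are somewhat stronger than the kernel barriers they stand in for), and ``model_buf``, ``model_bytes_written`` and the ``model_*`` functions are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_BUF_SIZE 64

static char model_buf[MODEL_BUF_SIZE];
static atomic_long model_bytes_written;   /* counter of bytes in model_buf */

/* Writer (owner) side: returns 0 on success, -1 if the buffer is full. */
static int model_write_byte(char c)
{
    long n = atomic_load_explicit(&model_bytes_written,
                                  memory_order_relaxed);

    if (n >= MODEL_BUF_SIZE)
        return -1;

    model_buf[n] = c;                           /* buffer write */
    atomic_thread_fence(memory_order_release);  /* role of smp_wmb() */
    atomic_store_explicit(&model_bytes_written, n + 1,
                          memory_order_relaxed);    /* counter increment */
    return 0;
}

/* Reader side: copies the bytes published so far, returns how many. */
static size_t model_read_all(char *out, size_t out_len)
{
    long n = atomic_load_explicit(&model_bytes_written,
                                  memory_order_relaxed);    /* counter read */
    size_t i;

    atomic_thread_fence(memory_order_acquire);  /* role of smp_rmb() */
    for (i = 0; i < (size_t)n && i < out_len; i++)
        out[i] = model_buf[i];                  /* buffer read */
    return i;
}
```

The ordering is the point: the writer publishes the counter only after the data is in place, and the reader looks at the data only after it has read the counter.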
|
||||||
|
|
||||||
|
|
||||||
|
Here is a sample module which implements a basic per cpu counter using
|
||||||
|
``local.h``::
|
||||||
|
|
||||||
|
    /* test-local.c
     *
     * Sample module for local.h usage.
     */


    #include <asm/local.h>
    #include <linux/module.h>
    #include <linux/timer.h>

    static DEFINE_PER_CPU(local_t, counters) = LOCAL_INIT(0);

    static struct timer_list test_timer;

    /* IPI called on each CPU. */
    static void test_each(void *info)
    {
            /* Increment the counter from a non preemptible context */
            printk("Increment on cpu %d\n", smp_processor_id());
            local_inc(this_cpu_ptr(&counters));

            /* This is what incrementing the variable would look like within a
             * preemptible context (it disables preemption) :
             *
             * local_inc(&get_cpu_var(counters));
             * put_cpu_var(counters);
             */
    }

    static void do_test_timer(unsigned long data)
    {
            int cpu;

            /* Increment the counters */
            on_each_cpu(test_each, NULL, 1);
            /* Read all the counters */
            printk("Counters read from CPU %d\n", smp_processor_id());
            for_each_online_cpu(cpu) {
                    printk("Read : CPU %d, count %ld\n", cpu,
                            local_read(&per_cpu(counters, cpu)));
            }
            del_timer(&test_timer);
            test_timer.expires = jiffies + 1000;
            add_timer(&test_timer);
    }

    static int __init test_init(void)
    {
            /* initialize the timer that will increment the counter */
            init_timer(&test_timer);
            test_timer.function = do_test_timer;
            test_timer.expires = jiffies + 1;
            add_timer(&test_timer);

            return 0;
    }

    static void __exit test_exit(void)
    {
            del_timer_sync(&test_timer);
    }

    module_init(test_init);
    module_exit(test_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Mathieu Desnoyers");
    MODULE_DESCRIPTION("Local Atomic Ops");
|
||||||
55
Documentation/core-api/tracepoint.rst
Normal file
@@ -0,0 +1,55 @@
|
|||||||
|
===============================
|
||||||
|
The Linux Kernel Tracepoint API
|
||||||
|
===============================
|
||||||
|
|
||||||
|
:Author: Jason Baron
|
||||||
|
:Author: William Cohen
|
||||||
|
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
Tracepoints are static probe points that are located in strategic points
|
||||||
|
throughout the kernel. 'Probes' register/unregister with tracepoints via
|
||||||
|
a callback mechanism. The 'probes' are strictly typed functions that are
|
||||||
|
passed a unique set of parameters defined by each tracepoint.
|
||||||
|
|
||||||
|
From this simple callback mechanism, 'probes' can be used to profile,
|
||||||
|
debug, and understand kernel behavior. There are a number of tools that
|
||||||
|
provide a framework for using 'probes'. These tools include Systemtap,
|
||||||
|
ftrace, and LTTng.
|
||||||
|
|
||||||
|
Tracepoints are defined in a number of header files via various macros.
|
||||||
|
Thus, the purpose of this document is to provide a clear accounting of
|
||||||
|
the available tracepoints. The intention is to understand not only what
|
||||||
|
tracepoints are available but also to understand where future
|
||||||
|
tracepoints might be added.
|
||||||
|
|
||||||
|
The API presented has functions of the form:
|
||||||
|
``trace_tracepointname(function parameters)``. These are the tracepoints
|
||||||
|
callbacks that are found throughout the code. Registering and
|
||||||
|
unregistering probes with these callback sites is covered in the
|
||||||
|
``Documentation/trace/*`` directory.
|
||||||
|
|
||||||
|
IRQ
|
||||||
|
===
|
||||||
|
|
||||||
|
.. kernel-doc:: include/trace/events/irq.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
SIGNAL
|
||||||
|
======
|
||||||
|
|
||||||
|
.. kernel-doc:: include/trace/events/signal.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
Block IO
|
||||||
|
========
|
||||||
|
|
||||||
|
.. kernel-doc:: include/trace/events/block.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
Workqueue
|
||||||
|
=========
|
||||||
|
|
||||||
|
.. kernel-doc:: include/trace/events/workqueue.h
|
||||||
|
:internal:
|
||||||
@@ -1,21 +1,14 @@
|
|||||||
|
====================================
|
||||||
Concurrency Managed Workqueue (cmwq)
|
Concurrency Managed Workqueue (cmwq)
|
||||||
|
====================================
|
||||||
|
|
||||||
September, 2010 Tejun Heo <tj@kernel.org>
|
:Date: September, 2010
|
||||||
Florian Mickler <florian@mickler.org>
|
:Author: Tejun Heo <tj@kernel.org>
|
||||||
|
:Author: Florian Mickler <florian@mickler.org>
|
||||||
CONTENTS
|
|
||||||
|
|
||||||
1. Introduction
|
|
||||||
2. Why cmwq?
|
|
||||||
3. The Design
|
|
||||||
4. Application Programming Interface (API)
|
|
||||||
5. Example Execution Scenarios
|
|
||||||
6. Guidelines
|
|
||||||
7. Debugging
|
|
||||||
|
|
||||||
|
|
||||||
1. Introduction
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
There are many cases where an asynchronous process execution context
|
There are many cases where an asynchronous process execution context
|
||||||
is needed and the workqueue (wq) API is the most commonly used
|
is needed and the workqueue (wq) API is the most commonly used
|
||||||
@@ -32,7 +25,8 @@ there is no work item left on the workqueue the worker becomes idle.
|
|||||||
When a new work item gets queued, the worker begins executing again.
|
When a new work item gets queued, the worker begins executing again.
|
||||||
|
|
||||||
|
|
||||||
2. Why cmwq?
|
Why cmwq?
|
||||||
|
=========
|
||||||
|
|
||||||
In the original wq implementation, a multi threaded (MT) wq had one
|
In the original wq implementation, a multi threaded (MT) wq had one
|
||||||
worker thread per CPU and a single threaded (ST) wq had one worker
|
worker thread per CPU and a single threaded (ST) wq had one worker
|
||||||
@@ -71,7 +65,8 @@ focus on the following goals.
|
|||||||
the API users don't need to worry about such details.
|
the API users don't need to worry about such details.
|
||||||
|
|
||||||
|
|
||||||
3. The Design
|
The Design
|
||||||
|
==========
|
||||||
|
|
||||||
In order to ease the asynchronous execution of functions a new
|
In order to ease the asynchronous execution of functions a new
|
||||||
abstraction, the work item, is introduced.
|
abstraction, the work item, is introduced.
|
||||||
@@ -102,7 +97,7 @@ aspects of the way the work items are executed by setting flags on the
|
|||||||
workqueue they are putting the work item on. These flags include
|
workqueue they are putting the work item on. These flags include
|
||||||
things like CPU locality, concurrency limits, priority and more. To
|
things like CPU locality, concurrency limits, priority and more. To
|
||||||
get a detailed overview refer to the API description of
|
get a detailed overview refer to the API description of
|
||||||
alloc_workqueue() below.
|
``alloc_workqueue()`` below.
|
||||||
|
|
||||||
When a work item is queued to a workqueue, the target worker-pool is
|
When a work item is queued to a workqueue, the target worker-pool is
|
||||||
determined according to the queue parameters and workqueue attributes
|
determined according to the queue parameters and workqueue attributes
|
||||||
@@ -136,7 +131,7 @@ them.
|
|||||||
|
|
||||||
For unbound workqueues, the number of backing pools is dynamic.
|
For unbound workqueues, the number of backing pools is dynamic.
|
||||||
Unbound workqueue can be assigned custom attributes using
|
Unbound workqueue can be assigned custom attributes using
|
||||||
apply_workqueue_attrs() and workqueue will automatically create
|
``apply_workqueue_attrs()`` and workqueue will automatically create
|
||||||
backing worker pools matching the attributes. The responsibility of
|
backing worker pools matching the attributes. The responsibility of
|
||||||
regulating concurrency level is on the users. There is also a flag to
|
regulating concurrency level is on the users. There is also a flag to
|
||||||
mark a bound wq to ignore the concurrency management. Please refer to
|
mark a bound wq to ignore the concurrency management. Please refer to
|
||||||
@@ -151,23 +146,25 @@ pressure. Else it is possible that the worker-pool deadlocks waiting
|
|||||||
for execution contexts to free up.
|
for execution contexts to free up.
|
||||||
|
|
||||||
|
|
||||||
4. Application Programming Interface (API)
|
Application Programming Interface (API)
|
||||||
|
=======================================
|
||||||
|
|
||||||
alloc_workqueue() allocates a wq. The original create_*workqueue()
|
``alloc_workqueue()`` allocates a wq. The original
|
||||||
functions are deprecated and scheduled for removal. alloc_workqueue()
|
``create_*workqueue()`` functions are deprecated and scheduled for
|
||||||
takes three arguments - @name, @flags and @max_active. @name is the
|
removal. ``alloc_workqueue()`` takes three arguments - ``@name``,
|
||||||
name of the wq and also used as the name of the rescuer thread if
|
``@flags`` and ``@max_active``. ``@name`` is the name of the wq and
|
||||||
there is one.
|
also used as the name of the rescuer thread if there is one.
|
||||||
|
|
||||||
A wq no longer manages execution resources but serves as a domain for
|
A wq no longer manages execution resources but serves as a domain for
|
||||||
forward progress guarantee, flush and work item attributes. @flags
|
forward progress guarantee, flush and work item attributes. ``@flags``
|
||||||
and @max_active control how work items are assigned execution
|
and ``@max_active`` control how work items are assigned execution
|
||||||
resources, scheduled and executed.
|
resources, scheduled and executed.
|
||||||
|
|
||||||
@flags:
|
|
||||||
|
|
||||||
WQ_UNBOUND
|
``flags``
|
||||||
|
---------
|
||||||
|
|
||||||
|
``WQ_UNBOUND``
|
||||||
Work items queued to an unbound wq are served by the special
|
Work items queued to an unbound wq are served by the special
|
||||||
worker-pools which host workers which are not bound to any
|
worker-pools which host workers which are not bound to any
|
||||||
specific CPU. This makes the wq behave as a simple execution
|
specific CPU. This makes the wq behave as a simple execution
|
||||||
@@ -184,20 +181,17 @@ resources, scheduled and executed.
|
|||||||
* Long running CPU intensive workloads which can be better
|
* Long running CPU intensive workloads which can be better
|
||||||
managed by the system scheduler.
|
managed by the system scheduler.
|
||||||
|
|
||||||
WQ_FREEZABLE
|
``WQ_FREEZABLE``
|
||||||
|
|
||||||
A freezable wq participates in the freeze phase of the system
|
A freezable wq participates in the freeze phase of the system
|
||||||
suspend operations. Work items on the wq are drained and no
|
suspend operations. Work items on the wq are drained and no
|
||||||
new work item starts execution until thawed.
|
new work item starts execution until thawed.
|
||||||
|
|
||||||
WQ_MEM_RECLAIM
|
``WQ_MEM_RECLAIM``
|
||||||
|
All wq which might be used in the memory reclaim paths **MUST**
|
||||||
All wq which might be used in the memory reclaim paths _MUST_
|
|
||||||
have this flag set. The wq is guaranteed to have at least one
|
have this flag set. The wq is guaranteed to have at least one
|
||||||
execution context regardless of memory pressure.
|
execution context regardless of memory pressure.
|
||||||
|
|
||||||
WQ_HIGHPRI
|
``WQ_HIGHPRI``
|
||||||
|
|
||||||
Work items of a highpri wq are queued to the highpri
|
Work items of a highpri wq are queued to the highpri
|
||||||
worker-pool of the target cpu. Highpri worker-pools are
|
worker-pool of the target cpu. Highpri worker-pools are
|
||||||
served by worker threads with elevated nice level.
|
served by worker threads with elevated nice level.
|
||||||
@@ -206,8 +200,7 @@ resources, scheduled and executed.
|
|||||||
each other. Each maintain its separate pool of workers and
|
each other. Each maintain its separate pool of workers and
|
||||||
implements concurrency management among its workers.
|
implements concurrency management among its workers.
|
||||||
|
|
||||||
WQ_CPU_INTENSIVE
|
``WQ_CPU_INTENSIVE``
|
||||||
|
|
||||||
Work items of a CPU intensive wq do not contribute to the
|
Work items of a CPU intensive wq do not contribute to the
|
||||||
concurrency level. In other words, runnable CPU intensive
|
concurrency level. In other words, runnable CPU intensive
|
||||||
work items will not prevent other work items in the same
|
work items will not prevent other work items in the same
|
||||||
@@ -223,22 +216,25 @@ resources, scheduled and executed.
|
|||||||
|
|
||||||
This flag is meaningless for unbound wq.
|
This flag is meaningless for unbound wq.
|
||||||
|
|
||||||
Note that the flag WQ_NON_REENTRANT no longer exists as all workqueues
|
Note that the flag ``WQ_NON_REENTRANT`` no longer exists as all
|
||||||
are now non-reentrant - any work item is guaranteed to be executed by
|
workqueues are now non-reentrant - any work item is guaranteed to be
|
||||||
at most one worker system-wide at any given time.
|
executed by at most one worker system-wide at any given time.
|
||||||
|
|
||||||
@max_active:
|
|
||||||
|
|
||||||
@max_active determines the maximum number of execution contexts per
|
``max_active``
|
||||||
CPU which can be assigned to the work items of a wq. For example,
|
--------------
|
||||||
with @max_active of 16, at most 16 work items of the wq can be
|
|
||||||
|
``@max_active`` determines the maximum number of execution contexts
|
||||||
|
per CPU which can be assigned to the work items of a wq. For example,
|
||||||
|
with ``@max_active`` of 16, at most 16 work items of the wq can be
|
||||||
executing at the same time per CPU.
|
executing at the same time per CPU.
|
||||||
|
|
||||||
Currently, for a bound wq, the maximum limit for @max_active is 512
|
Currently, for a bound wq, the maximum limit for ``@max_active`` is
|
||||||
and the default value used when 0 is specified is 256. For an unbound
|
512 and the default value used when 0 is specified is 256. For an
|
||||||
wq, the limit is higher of 512 and 4 * num_possible_cpus(). These
|
unbound wq, the limit is the higher of 512 and 4 *
|
||||||
values are chosen sufficiently high such that they are not the
|
``num_possible_cpus()``. These values are chosen sufficiently high
|
||||||
limiting factor while providing protection in runaway cases.
|
such that they are not the limiting factor while providing protection
|
||||||
|
in runaway cases.
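
The limits stated above can be written out as a small calculation. This is a sketch derived only from the numbers in the text, not the kernel's implementation: the ``MODEL_*`` constants and ``model_effective_max_active()`` are hypothetical names, and two details are assumptions, namely that the default of 256 for a value of 0 applies to unbound wq as well, and that values below 1 are raised to 1.

```c
#include <stdbool.h>

#define MODEL_MAX_ACTIVE        512   /* limit for a bound wq */
#define MODEL_DFL_ACTIVE        256   /* used when 0 is specified */
#define MODEL_UNBOUND_PER_CPU   4     /* factor applied to possible CPUs */

/*
 * Return the max_active value that would be in effect for the given
 * request, following the limits described in the text.
 */
static int model_effective_max_active(int max_active, bool unbound,
                                      int possible_cpus)
{
    int limit = MODEL_MAX_ACTIVE;

    /* Unbound wq: the limit is the higher of 512 and 4 * possible CPUs. */
    if (unbound && MODEL_UNBOUND_PER_CPU * possible_cpus > limit)
        limit = MODEL_UNBOUND_PER_CPU * possible_cpus;

    if (max_active == 0)
        max_active = MODEL_DFL_ACTIVE;   /* assumed for both wq kinds */
    if (max_active < 1)
        max_active = 1;                  /* assumed lower bound */
    if (max_active > limit)
        max_active = limit;

    return max_active;
}
```

For example, on a machine with 256 possible CPUs an unbound wq could go up to 1024 under this model, while a bound wq stays capped at 512.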
|
||||||
|
|
||||||
The number of active work items of a wq is usually regulated by the
|
The number of active work items of a wq is usually regulated by the
|
||||||
users of the wq, more specifically, by how many work items the users
|
users of the wq, more specifically, by how many work items the users
|
||||||
@@ -247,13 +243,14 @@ throttling the number of active work items, specifying '0' is
|
|||||||
recommended.
|
recommended.
|
||||||
|
|
||||||
Some users depend on the strict execution ordering of ST wq. The
|
Some users depend on the strict execution ordering of ST wq. The
|
||||||
combination of @max_active of 1 and WQ_UNBOUND is used to achieve this
|
combination of ``@max_active`` of 1 and ``WQ_UNBOUND`` is used to
|
||||||
behavior. Work items on such wq are always queued to the unbound
|
achieve this behavior. Work items on such wq are always queued to the
|
||||||
worker-pools and only one work item can be active at any given time thus
|
unbound worker-pools and only one work item can be active at any given
|
||||||
achieving the same ordering property as ST wq.
|
time thus achieving the same ordering property as ST wq.
|
||||||
|
|
||||||
|
|
||||||
5. Example Execution Scenarios
|
Example Execution Scenarios
|
||||||
|
===========================
|
||||||
|
|
||||||
The following example execution scenarios try to illustrate how cmwq
|
The following example execution scenarios try to illustrate how cmwq
|
||||||
behave under different configurations.
|
behave under different configurations.
|
||||||
@@ -265,7 +262,7 @@ behave under different configurations.
|
|||||||
|
|
||||||
Ignoring all other tasks, works and processing overhead, and assuming
|
Ignoring all other tasks, works and processing overhead, and assuming
|
||||||
simple FIFO scheduling, the following is one highly simplified version
|
simple FIFO scheduling, the following is one highly simplified version
|
||||||
of possible sequences of events with the original wq.
|
of possible sequences of events with the original wq. ::
|
||||||
|
|
||||||
TIME IN MSECS EVENT
|
TIME IN MSECS EVENT
|
||||||
0 w0 starts and burns CPU
|
0 w0 starts and burns CPU
|
||||||
@@ -279,7 +276,7 @@ of possible sequences of events with the original wq.
|
|||||||
40 w2 sleeps
|
40 w2 sleeps
|
||||||
50 w2 wakes up and finishes
|
50 w2 wakes up and finishes
|
||||||
|
|
||||||
And with cmwq with @max_active >= 3,
|
And with cmwq with ``@max_active`` >= 3, ::
|
||||||
|
|
||||||
TIME IN MSECS EVENT
|
TIME IN MSECS EVENT
|
||||||
0 w0 starts and burns CPU
|
0 w0 starts and burns CPU
|
||||||
@@ -293,7 +290,7 @@ And with cmwq with @max_active >= 3,
|
|||||||
20 w1 wakes up and finishes
|
20 w1 wakes up and finishes
|
||||||
25 w2 wakes up and finishes
|
25 w2 wakes up and finishes
|
||||||
|
|
||||||
If @max_active == 2,
|
If ``@max_active`` == 2, ::
|
||||||
|
|
||||||
TIME IN MSECS EVENT
|
TIME IN MSECS EVENT
|
||||||
0 w0 starts and burns CPU
|
0 w0 starts and burns CPU
|
||||||
@@ -308,7 +305,7 @@ If @max_active == 2,
|
|||||||
35 w2 wakes up and finishes
|
35 w2 wakes up and finishes
|
||||||
|
|
||||||
Now, let's assume w1 and w2 are queued to a different wq q1 which has
|
Now, let's assume w1 and w2 are queued to a different wq q1 which has
|
||||||
WQ_CPU_INTENSIVE set,
|
``WQ_CPU_INTENSIVE`` set, ::
|
||||||
|
|
||||||
TIME IN MSECS EVENT
|
TIME IN MSECS EVENT
|
||||||
0 w0 starts and burns CPU
|
0 w0 starts and burns CPU
|
||||||
@@ -322,13 +319,15 @@ WQ_CPU_INTENSIVE set,
|
|||||||
25 w2 wakes up and finishes
|
25 w2 wakes up and finishes
|
||||||
|
|
||||||
|
|
||||||
6. Guidelines
|
Guidelines
|
||||||
|
==========
|
||||||
|
|
||||||
* Do not forget to use WQ_MEM_RECLAIM if a wq may process work items
|
* Do not forget to use ``WQ_MEM_RECLAIM`` if a wq may process work
|
||||||
which are used during memory reclaim. Each wq with WQ_MEM_RECLAIM
|
items which are used during memory reclaim. Each wq with
|
||||||
set has an execution context reserved for it. If there is
|
``WQ_MEM_RECLAIM`` set has an execution context reserved for it. If
|
||||||
dependency among multiple work items used during memory reclaim,
|
there is dependency among multiple work items used during memory
|
||||||
they should be queued to separate wq each with WQ_MEM_RECLAIM.
|
reclaim, they should be queued to separate wq each with
|
||||||
|
``WQ_MEM_RECLAIM``.
|
||||||
|
|
||||||
* Unless strict ordering is required, there is no need to use ST wq.
|
* Unless strict ordering is required, there is no need to use ST wq.
|
||||||
|
|
||||||
@@ -337,25 +336,26 @@ WQ_CPU_INTENSIVE set,
|
|||||||
well under the default limit.
|
well under the default limit.
|
||||||
|
|
||||||
* A wq serves as a domain for forward progress guarantee
|
* A wq serves as a domain for forward progress guarantee
|
||||||
(WQ_MEM_RECLAIM, flush and work item attributes. Work items which
|
(``WQ_MEM_RECLAIM``), flush and work item attributes. Work items
|
||||||
are not involved in memory reclaim and don't need to be flushed as a
|
which are not involved in memory reclaim and don't need to be
|
||||||
part of a group of work items, and don't require any special
|
flushed as a part of a group of work items, and don't require any
|
||||||
attribute, can use one of the system wq. There is no difference in
|
special attribute, can use one of the system wq. There is no
|
||||||
execution characteristics between using a dedicated wq and a system
|
difference in execution characteristics between using a dedicated wq
|
||||||
wq.
|
and a system wq.
|
||||||
|
|
||||||
* Unless work items are expected to consume a huge amount of CPU
|
* Unless work items are expected to consume a huge amount of CPU
|
||||||
cycles, using a bound wq is usually beneficial due to the increased
|
cycles, using a bound wq is usually beneficial due to the increased
|
||||||
level of locality in wq operations and work item execution.
|
level of locality in wq operations and work item execution.
|
||||||
|
|
||||||
|
|
||||||
7. Debugging
|
Debugging
|
||||||
|
=========
|
||||||
|
|
||||||
Because the work functions are executed by generic worker threads
|
Because the work functions are executed by generic worker threads
|
||||||
there are a few tricks needed to shed some light on misbehaving
|
there are a few tricks needed to shed some light on misbehaving
|
||||||
workqueue users.
|
workqueue users.
|
||||||
|
|
||||||
Worker threads show up in the process list as:
|
Worker threads show up in the process list as: ::
|
||||||
|
|
||||||
root 5671 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/0:1]
|
root 5671 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/0:1]
|
||||||
root 5672 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/1:2]
|
root 5672 0.0 0.0 0 0 ? S 12:07 0:00 [kworker/1:2]
|
||||||
@@ -368,7 +368,7 @@ of possible problems:
|
|||||||
1. Something being scheduled in rapid succession
|
1. Something being scheduled in rapid succession
|
||||||
2. A single work item that consumes lots of cpu cycles
|
2. A single work item that consumes lots of cpu cycles
|
||||||
|
|
||||||
The first one can be tracked using tracing:
|
The first one can be tracked using tracing: ::
|
||||||
|
|
||||||
$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
|
$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
|
||||||
$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
|
$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
|
||||||
@@ -380,9 +380,15 @@ the output and the offender can be determined with the work item
|
|||||||
function.
|
function.
|
||||||
|
|
||||||
For the second type of problems it should be possible to just check
|
For the second type of problems it should be possible to just check
|
||||||
the stack trace of the offending worker thread.
|
the stack trace of the offending worker thread. ::
|
||||||
|
|
||||||
$ cat /proc/THE_OFFENDING_KWORKER/stack
|
$ cat /proc/THE_OFFENDING_KWORKER/stack
|
||||||
|
|
||||||
The work item's function should be trivially visible in the stack
|
The work item's function should be trivially visible in the stack
|
||||||
trace.
|
trace.
|
||||||
|
|
||||||
|
|
||||||
|
Kernel Inline Documentations Reference
|
||||||
|
======================================
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/workqueue.h
|
||||||
@@ -84,9 +84,9 @@ are added or removed anytime. Trimming it accurately for your system needs
|
|||||||
upfront can save some boot time memory. See below for how we use heuristics
|
upfront can save some boot time memory. See below for how we use heuristics
|
||||||
in x86_64 case to keep this under check.
|
in x86_64 case to keep this under check.
|
||||||
|
|
||||||
cpu_online_mask: Bitmap of all CPUs currently online. Its set in __cpu_up()
|
cpu_online_mask: Bitmap of all CPUs currently online. It's set in __cpu_up()
|
||||||
after a cpu is available for kernel scheduling and ready to receive
|
after a CPU is available for kernel scheduling and ready to receive
|
||||||
interrupts from devices. Its cleared when a cpu is brought down using
|
interrupts from devices. It's cleared when a CPU is brought down using
|
||||||
__cpu_disable(), before which all OS services including interrupts are
|
__cpu_disable(), before which all OS services including interrupts are
|
||||||
migrated to another target CPU.
|
migrated to another target CPU.
|
||||||
|
|
||||||
@@ -181,7 +181,7 @@ To support physical addition/removal, one would need some BIOS hooks and
|
|||||||
the platform should have something like an attention button in PCI hotplug.
|
the platform should have something like an attention button in PCI hotplug.
|
||||||
CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
|
CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
|
||||||
|
|
||||||
Q: How do i logically offline a CPU?
|
Q: How do I logically offline a CPU?
|
||||||
A: Do the following.
|
A: Do the following.
|
||||||
|
|
||||||
#echo 0 > /sys/devices/system/cpu/cpuX/online
|
#echo 0 > /sys/devices/system/cpu/cpuX/online
|
||||||
@@ -191,15 +191,15 @@ Once the logical offline is successful, check
|
|||||||
#cat /proc/interrupts
|
#cat /proc/interrupts
|
||||||
|
|
||||||
You should now not see the CPU that you removed. Also online file will report
|
You should now not see the CPU that you removed. Also online file will report
|
||||||
the state as 0 when a cpu if offline and 1 when its online.
|
the state as 0 when a CPU is offline and 1 when it's online.
|
||||||
|
|
||||||
#To display the current cpu state.
|
#To display the current cpu state.
|
||||||
#cat /sys/devices/system/cpu/cpuX/online
|
#cat /sys/devices/system/cpu/cpuX/online
|
||||||
|
|
||||||
Q: Why can't i remove CPU0 on some systems?
|
Q: Why can't I remove CPU0 on some systems?
|
||||||
A: Some architectures may have some special dependency on a certain CPU.
|
A: Some architectures may have some special dependency on a certain CPU.
|
||||||
|
|
||||||
For e.g in IA64 platforms we have ability to sent platform interrupts to the
|
For e.g in IA64 platforms we have ability to send platform interrupts to the
|
||||||
OS. a.k.a Corrected Platform Error Interrupts (CPEI). In current ACPI
|
OS. a.k.a Corrected Platform Error Interrupts (CPEI). In current ACPI
|
||||||
specifications, we didn't have a way to change the target CPU. Hence if the
|
specifications, we didn't have a way to change the target CPU. Hence if the
|
||||||
current ACPI version doesn't support such re-direction, we disable that CPU
|
current ACPI version doesn't support such re-direction, we disable that CPU
|
||||||
@@ -231,7 +231,7 @@ either by CONFIG_BOOTPARAM_HOTPLUG_CPU0 or by kernel parameter cpu0_hotplug.
|
|||||||
|
|
||||||
--Fenghua Yu <fenghua.yu@intel.com>
|
--Fenghua Yu <fenghua.yu@intel.com>
|
||||||
|
|
||||||
Q: How do i find out if a particular CPU is not removable?
|
Q: How do I find out if a particular CPU is not removable?
|
||||||
A: Depending on the implementation, some architectures may show this by the
|
A: Depending on the implementation, some architectures may show this by the
|
||||||
absence of the "online" file. This is done if it can be determined ahead of
|
absence of the "online" file. This is done if it can be determined ahead of
|
||||||
time that this CPU cannot be removed.
|
time that this CPU cannot be removed.
|
||||||
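The sysfs ``online`` file discussed in the answers above can also be read from a script. The following is a minimal sketch, not part of the kernel tree: the function name ``cpu_hotplug_state`` and its return strings are made up for illustration, and it assumes only what the text states, namely that the file reports 0 for an offline CPU and 1 for an online one, and that some architectures omit the file for CPUs that cannot be removed.

```python
import os


def cpu_hotplug_state(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Classify one CPU by its sysfs "online" file.

    Returns "online", "offline", or "no-online-file". The last value is
    returned when the file (or the whole cpuX directory) is missing; per
    the text, some architectures use this to signal a non-removable CPU.
    """
    path = os.path.join(sysfs_root, "cpu%d" % cpu, "online")
    try:
        with open(path) as f:
            value = f.read().strip()
    except FileNotFoundError:
        return "no-online-file"
    if value == "1":
        return "online"
    if value == "0":
        return "offline"
    # Anything else is unexpected; report it instead of guessing.
    raise ValueError("unexpected content in %s: %r" % (path, value))
```

Calling ``cpu_hotplug_state(1)`` on a running system reads ``/sys/devices/system/cpu/cpu1/online``; the ``sysfs_root`` argument exists only so the helper can be pointed at a test directory.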
@@ -250,7 +250,7 @@ A: The following happen, listed in no particular order :-)
|
|||||||
- All processes are migrated away from this outgoing CPU to new CPUs.
|
- All processes are migrated away from this outgoing CPU to new CPUs.
|
||||||
The new CPU is chosen from each process' current cpuset, which may be
|
The new CPU is chosen from each process' current cpuset, which may be
|
||||||
a subset of all online CPUs.
|
a subset of all online CPUs.
|
||||||
- All interrupts targeted to this CPU is migrated to a new CPU
|
- All interrupts targeted to this CPU are migrated to a new CPU
|
||||||
- timers/bottom half/task lets are also migrated to a new CPU
|
- timers/bottom half/task lets are also migrated to a new CPU
|
||||||
- Once all services are migrated, kernel calls an arch specific routine
|
- Once all services are migrated, kernel calls an arch specific routine
|
||||||
__cpu_disable() to perform arch specific cleanup.
|
__cpu_disable() to perform arch specific cleanup.
|
||||||
@@ -259,10 +259,10 @@ A: The following happen, listed in no particular order :-)
|
|||||||
CPU is being offlined).
|
CPU is being offlined).
|
||||||
|
|
||||||
"It is expected that each service cleans up when the CPU_DOWN_PREPARE
|
"It is expected that each service cleans up when the CPU_DOWN_PREPARE
|
||||||
notifier is called, when CPU_DEAD is called its expected there is nothing
|
notifier is called, when CPU_DEAD is called it's expected there is nothing
|
||||||
running on behalf of this CPU that was offlined"
|
running on behalf of this CPU that was offlined"
|
||||||
|
|
||||||
Q: If i have some kernel code that needs to be aware of CPU arrival and
|
Q: If I have some kernel code that needs to be aware of CPU arrival and
|
||||||
departure, how to i arrange for proper notification?
|
departure, how to i arrange for proper notification?
|
||||||
A: This is what you would need in your kernel code to receive notifications.
|
A: This is what you would need in your kernel code to receive notifications.
|
||||||
|
|
||||||
@@ -311,7 +311,7 @@ things will happen if a notifier in path sent a BAD notify code.
|
|||||||
|
|
||||||
Q: I don't see my action being called for all CPUs already up and running?
|
Q: I don't see my action being called for all CPUs already up and running?
|
||||||
A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
|
A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
|
||||||
If you need to perform some action for each cpu already in the system, then
|
If you need to perform some action for each CPU already in the system, then
|
||||||
do this:
|
do this:
|
||||||
|
|
||||||
for_each_online_cpu(i) {
|
for_each_online_cpu(i) {
|
||||||
@@ -363,8 +363,8 @@ A: Yes, CPU notifiers are called only when new CPUs are on-lined or offlined.
|
|||||||
callbacks as well as initialize the already online CPUs.
|
callbacks as well as initialize the already online CPUs.
|
||||||
|
|
||||||
|
|
||||||
Q: If i would like to develop cpu hotplug support for a new architecture,
|
Q: If I would like to develop CPU hotplug support for a new architecture,
|
||||||
what do i need at a minimum?
|
what do I need at a minimum?
|
||||||
A: The following are what is required for CPU hotplug infrastructure to work
|
A: The following are what is required for CPU hotplug infrastructure to work
|
||||||
correctly.
|
correctly.
|
||||||
|
|
||||||
@@ -382,8 +382,8 @@ A: The following are what is required for CPU hotplug infrastructure to work
|
|||||||
per_cpu state to be set, to ensure the processor
|
per_cpu state to be set, to ensure the processor
|
||||||
dead routine is called to be sure positively.
|
dead routine is called to be sure positively.
|
||||||
|
|
||||||
Q: I need to ensure that a particular cpu is not removed when there is some
|
Q: I need to ensure that a particular CPU is not removed when there is some
|
||||||
work specific to this cpu is in progress.
|
work specific to this CPU in progress.
|
||||||
A: There are two ways. If your code can be run in interrupt context, use
|
A: There are two ways. If your code can be run in interrupt context, use
|
||||||
smp_call_function_single(), otherwise use work_on_cpu(). Note that
|
smp_call_function_single(), otherwise use work_on_cpu(). Note that
|
||||||
work_on_cpu() is slow, and can fail due to out of memory:
|
work_on_cpu() is slow, and can fail due to out of memory:
|
||||||
|
|||||||
10
Documentation/dev-tools/conf.py
Normal file
@@ -0,0 +1,10 @@
|
|||||||
|
# -*- coding: utf-8; mode: python -*-
|
||||||
|
|
||||||
|
project = "Development tools for the kernel"
|
||||||
|
|
||||||
|
tags.add("subproject")
|
||||||
|
|
||||||
|
latex_documents = [
|
||||||
|
('index', 'dev-tools.tex', project,
|
||||||
|
'The kernel development community', 'manual'),
|
||||||
|
]
|
||||||
@@ -201,7 +201,9 @@ Appendix A: gather_on_build.sh
|
|||||||
------------------------------
|
------------------------------
|
||||||
|
|
||||||
Sample script to gather coverage meta files on the build machine
|
Sample script to gather coverage meta files on the build machine
|
||||||
(see 6a)::
|
(see 6a):
|
||||||
|
|
||||||
|
.. code-block:: sh
|
||||||
|
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
|
|
||||||
@@ -232,7 +234,9 @@ Appendix B: gather_on_test.sh
|
|||||||
-----------------------------
|
-----------------------------
|
||||||
|
|
||||||
Sample script to gather coverage data files on the test machine
|
Sample script to gather coverage data files on the test machine
|
||||||
(see 6b)::
|
(see 6b):
|
||||||
|
|
||||||
|
.. code-block:: sh
|
||||||
|
|
||||||
#!/bin/bash -e
|
#!/bin/bash -e
|
||||||
|
|
||||||
|
|||||||
@@ -23,3 +23,11 @@ whole; patches welcome!
|
|||||||
kmemleak
|
kmemleak
|
||||||
kmemcheck
|
kmemcheck
|
||||||
gdb-kernel-debugging
|
gdb-kernel-debugging
|
||||||
|
|
||||||
|
|
||||||
|
.. only:: subproject and html
|
||||||
|
|
||||||
|
Indices
|
||||||
|
=======
|
||||||
|
|
||||||
|
* :ref:`genindex`
|
||||||
@@ -24,7 +24,9 @@ Profiling data will only become accessible once debugfs has been mounted::
|
|||||||
|
|
||||||
mount -t debugfs none /sys/kernel/debug
|
mount -t debugfs none /sys/kernel/debug
|
||||||
|
|
||||||
The following program demonstrates kcov usage from within a test program::
|
The following program demonstrates kcov usage from within a test program:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
#include <stdio.h>
|
#include <stdio.h>
|
||||||
#include <stddef.h>
|
#include <stddef.h>
|
||||||
|
|||||||
@@ -1,9 +0,0 @@
|
|||||||
Linux Kernel Development Documentation
|
|
||||||
======================================
|
|
||||||
|
|
||||||
Contents:
|
|
||||||
|
|
||||||
.. toctree::
|
|
||||||
:maxdepth: 2
|
|
||||||
|
|
||||||
development-process
|
|
||||||
@@ -17,7 +17,7 @@ The target is named "raid" and it accepts the following parameters:
|
|||||||
raid0 RAID0 striping (no resilience)
|
raid0 RAID0 striping (no resilience)
|
||||||
raid1 RAID1 mirroring
|
raid1 RAID1 mirroring
|
||||||
raid4 RAID4 with dedicated last parity disk
|
raid4 RAID4 with dedicated last parity disk
|
||||||
raid5_n RAID5 with dedicated last parity disk suporting takeover
|
raid5_n RAID5 with dedicated last parity disk supporting takeover
|
||||||
Same as raid4
|
Same as raid4
|
||||||
-Transitory layout
|
-Transitory layout
|
||||||
raid5_la RAID5 left asymmetric
|
raid5_la RAID5 left asymmetric
|
||||||
@@ -36,7 +36,7 @@ The target is named "raid" and it accepts the following parameters:
|
|||||||
- rotating parity N (right-to-left) with data continuation
|
- rotating parity N (right-to-left) with data continuation
|
||||||
raid6_n_6 RAID6 with dedicate parity disks
|
raid6_n_6 RAID6 with dedicate parity disks
|
||||||
- parity and Q-syndrome on the last 2 disks;
|
- parity and Q-syndrome on the last 2 disks;
|
||||||
laylout for takeover from/to raid4/raid5_n
|
layout for takeover from/to raid4/raid5_n
|
||||||
raid6_la_6 Same as "raid_la" plus dedicated last Q-syndrome disk
|
raid6_la_6 Same as "raid_la" plus dedicated last Q-syndrome disk
|
||||||
- layout for takeover from raid5_la from/to raid6
|
- layout for takeover from raid5_la from/to raid6
|
||||||
raid6_ra_6 Same as "raid5_ra" dedicated last Q-syndrome disk
|
raid6_ra_6 Same as "raid5_ra" dedicated last Q-syndrome disk
|
||||||
@@ -137,8 +137,8 @@ The target is named "raid" and it accepts the following parameters:
|
|||||||
device removal (negative value) or device addition (positive
|
device removal (negative value) or device addition (positive
|
||||||
value) to any reshape supporting raid levels 4/5/6 and 10.
|
value) to any reshape supporting raid levels 4/5/6 and 10.
|
||||||
RAID levels 4/5/6 allow for addition of devices (metadata
|
RAID levels 4/5/6 allow for addition of devices (metadata
|
||||||
and data device tupel), raid10_near and raid10_offset only
|
and data device tuple), raid10_near and raid10_offset only
|
||||||
allow for device addtion. raid10_far does not support any
|
allow for device addition. raid10_far does not support any
|
||||||
reshaping at all.
|
reshaping at all.
|
||||||
A minimum of devices have to be kept to enforce resilience,
|
A minimum of devices have to be kept to enforce resilience,
|
||||||
which is 3 devices for raid4/5 and 4 devices for raid6.
|
which is 3 devices for raid4/5 and 4 devices for raid6.
|
||||||
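The reshape constraints in the hunk above can be restated as a small check. This is a sketch of one reading of the text, not dm-raid code: the helper name ``check_device_delta`` and the prefix matching on level names are assumptions, and it assumes that raid4/5/6 accept both addition and removal as long as the stated minimum number of devices is kept.

```python
# Minimum device counts stated in the text: 3 for raid4/5, 4 for raid6.
MIN_DEVICES = {"raid4": 3, "raid5": 3, "raid6": 4}


def check_device_delta(level, current, delta):
    """Return None if the change looks acceptable, else a reason string.

    level   -- raid type name such as "raid5_la" or "raid10_near"
    current -- number of devices currently in the array
    delta   -- negative for device removal, positive for device addition
    """
    if level.startswith("raid10_far"):
        return "raid10_far does not support reshaping"
    if level.startswith(("raid10_near", "raid10_offset")):
        return None if delta > 0 else "only device addition is allowed"
    for prefix, minimum in MIN_DEVICES.items():
        if level.startswith(prefix):
            if current + delta < minimum:
                return "at least %d devices must be kept" % minimum
            return None
    # Levels other than 4/5/6 and 10 are not listed as reshape capable.
    return "level does not support reshaping"
```

For example, removing one device from a four-device ``raid5_la`` set passes this check, while the same removal from a four-device ``raid6_n_6`` set is rejected because it would drop below four devices.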
|
|||||||
@@ -1,7 +1,7 @@
|
|||||||
* Maxim DS3231 Real Time Clock
|
* Maxim DS3231 Real Time Clock
|
||||||
|
|
||||||
Required properties:
|
Required properties:
|
||||||
see: Documentation/devicetree/bindings/i2c/trivial-devices.txt
|
see: Documentation/devicetree/bindings/i2c/trivial-admin-guide/devices.rst
|
||||||
|
|
||||||
Optional property:
|
Optional property:
|
||||||
- #clock-cells: Should be 1.
|
- #clock-cells: Should be 1.
|
||||||
|
|||||||
@@ -3,7 +3,7 @@
|
|||||||
Philips PCF8563/Epson RTC8564 Real Time Clock
|
Philips PCF8563/Epson RTC8564 Real Time Clock
|
||||||
|
|
||||||
Required properties:
|
Required properties:
|
||||||
see: Documentation/devicetree/bindings/i2c/trivial-devices.txt
|
see: Documentation/devicetree/bindings/i2c/trivial-admin-guide/devices.rst
|
||||||
|
|
||||||
Optional property:
|
Optional property:
|
||||||
- #clock-cells: Should be 0.
|
- #clock-cells: Should be 0.
|
||||||
|
|||||||
@@ -3,7 +3,7 @@
|
|||||||
|
|
||||||
I. For patch submitters
|
I. For patch submitters
|
||||||
|
|
||||||
0) Normal patch submission rules from Documentation/SubmittingPatches
|
0) Normal patch submission rules from Documentation/process/submitting-patches.rst
|
||||||
applies.
|
applies.
|
||||||
|
|
||||||
1) The Documentation/ portion of the patch should be a separate patch.
|
1) The Documentation/ portion of the patch should be a separate patch.
|
||||||
|
|||||||
10
Documentation/doc-guide/conf.py
Normal file
@@ -0,0 +1,10 @@
|
|||||||
|
# -*- coding: utf-8; mode: python -*-
|
||||||
|
|
||||||
|
project = 'Linux Kernel Documentation Guide'
|
||||||
|
|
||||||
|
tags.add("subproject")
|
||||||
|
|
||||||
|
latex_documents = [
|
||||||
|
('index', 'kernel-doc-guide.tex', 'Linux Kernel Documentation Guide',
|
||||||
|
'The kernel development community', 'manual'),
|
||||||
|
]
|
||||||
90
Documentation/doc-guide/docbook.rst
Normal file
@@ -0,0 +1,90 @@
|
|||||||
|
DocBook XML [DEPRECATED]
|
||||||
|
========================
|
||||||
|
|
||||||
|
.. attention::
|
||||||
|
|
||||||
|
This section describes the deprecated DocBook XML toolchain. Please do not
|
||||||
|
create new DocBook XML template files. Please consider converting existing
|
||||||
|
DocBook XML templates files to Sphinx/reStructuredText.
|
||||||
|
|
||||||
|
Converting DocBook to Sphinx
|
||||||
|
----------------------------
|
||||||
|
|
||||||
|
Over time, we expect all of the documents under ``Documentation/DocBook`` to be
|
||||||
|
converted to Sphinx and reStructuredText. For most DocBook XML documents, a good
|
||||||
|
enough solution is to use the simple ``Documentation/sphinx/tmplcvt`` script,
|
||||||
|
which uses ``pandoc`` under the hood. For example::
|
||||||
|
|
||||||
|
$ cd Documentation/sphinx
|
||||||
|
$ ./tmplcvt ../DocBook/in.tmpl ../out.rst
|
||||||
|
|
||||||
|
Then edit the resulting rst files to fix any remaining issues, and add the
|
||||||
|
document in the ``toctree`` in ``Documentation/index.rst``.
|
||||||
|
|
||||||
|
Components of the kernel-doc system
|
||||||
|
-----------------------------------
|
||||||
|
|
||||||
|
Many places in the source tree have extractable documentation in the form of
|
||||||
|
block comments above functions. The components of this system are:
|
||||||
|
|
||||||
|
- ``scripts/kernel-doc``
|
||||||
|
|
||||||
|
This is a perl script that hunts for the block comments and can mark them up
|
||||||
|
directly into reStructuredText, DocBook, man, text, and HTML. (No, not
|
||||||
|
texinfo.)
|
||||||
|
|
||||||
|
- ``Documentation/DocBook/*.tmpl``
|
||||||
|
|
||||||
|
These are XML template files, which are normal XML files with special
|
||||||
|
place-holders for where the extracted documentation should go.
|
||||||
|
|
||||||
|
- ``scripts/docproc.c``
|
||||||
|
|
||||||
|
This is a program for converting XML template files into XML files. When a
|
||||||
|
file is referenced it is searched for symbols exported (EXPORT_SYMBOL), to be
|
||||||
|
able to distinguish between internal and external functions.
|
||||||
|
|
||||||
|
It invokes kernel-doc, giving it the list of functions that are to be
|
||||||
|
documented.
|
||||||
|
|
||||||
|
Additionally it is used to scan the XML template files to locate all the files
|
||||||
|
referenced herein. This is used to generate dependency information as used by
|
||||||
|
make.
|
||||||
|
|
||||||
|
- ``Makefile``
|
||||||
|
|
||||||
|
The targets 'xmldocs', 'psdocs', 'pdfdocs', and 'htmldocs' are used to build
|
||||||
|
DocBook XML files, PostScript files, PDF files, and html files in
|
||||||
|
Documentation/DocBook. The older target 'sgmldocs' is equivalent to 'xmldocs'.
|
||||||
|
|
||||||
|
- ``Documentation/DocBook/Makefile``
|
||||||
|
|
||||||
|
This is where C files are associated with SGML templates.
|
||||||
|
|
||||||
|
How to use kernel-doc comments in DocBook XML template files
|
||||||
|
------------------------------------------------------------
|
||||||
|
|
||||||
|
DocBook XML template files (\*.tmpl) are like normal XML files, except that they
|
||||||
|
can contain escape sequences where extracted documentation should be inserted.
|
||||||
|
|
||||||
|
``!E<filename>`` is replaced by the documentation, in ``<filename>``, for
|
||||||
|
functions that are exported using ``EXPORT_SYMBOL``: the function list is
|
||||||
|
collected from files listed in ``Documentation/DocBook/Makefile``.
|
||||||
|
|
||||||
|
``!I<filename>`` is replaced by the documentation for functions that are **not**
|
||||||
|
exported using ``EXPORT_SYMBOL``.
|
||||||
|
|
||||||
|
``!D<filename>`` is used to name additional files to search for functions
|
||||||
|
exported using ``EXPORT_SYMBOL``.
|
||||||
|
|
||||||
|
``!F<filename> <function [functions...]>`` is replaced by the documentation, in
|
||||||
|
``<filename>``, for the functions listed.
|
||||||
|
|
||||||
|
``!P<filename> <section title>`` is replaced by the contents of the ``DOC:``
|
||||||
|
section titled ``<section title>`` from ``<filename>``. Spaces are allowed in
|
||||||
|
``<section title>``; do not quote the ``<section title>``.
|
||||||
|
|
||||||
|
``!C<filename>`` is replaced by nothing, but makes the tools check that all DOC:
|
||||||
|
sections and documented functions, symbols, etc. are used. This makes sense to
|
||||||
|
use when you use ``!F`` or ``!P`` only and want to verify that all documentation
|
||||||
|
is included.
|
||||||
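To make the escape sequences listed in docbook.rst above more concrete, here is a minimal sketch of how such a line could be split into its parts. It is not how ``scripts/docproc.c`` is implemented; the function name ``parse_tmpl_directive`` is hypothetical, and the sketch assumes the ``!`` sits at the very start of the line.

```python
import re

# "!", one directive letter, the file name, then an optional remainder.
_DIRECTIVE = re.compile(r"^!([EIDFPC])(\S+)(?:\s+(.*\S))?\s*$")


def parse_tmpl_directive(line):
    """Split a template line into (kind, filename, args), or return None.

    For "F" the args are the listed function names; for "P" the single
    arg is the section title, kept whole because spaces are allowed in
    it and it is not quoted. The other directives take no arguments.
    """
    m = _DIRECTIVE.match(line)
    if m is None:
        return None
    kind, filename, rest = m.groups()
    if kind == "F":
        args = rest.split() if rest else []
    elif kind == "P":
        args = [rest] if rest else []
    else:
        args = []
    return kind, filename, args
```

With a made-up file name, ``parse_tmpl_directive("!Pdrivers/foo.c Overview of foo")`` yields the kind ``"P"``, the file ``"drivers/foo.c"`` and the title ``"Overview of foo"`` as one argument, while ordinary XML lines return ``None``.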
20
Documentation/doc-guide/index.rst
Normal file
@@ -0,0 +1,20 @@
|
|||||||
|
.. _doc_guide:
|
||||||
|
|
||||||
|
=================================
|
||||||
|
How to write kernel documentation
|
||||||
|
=================================
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
sphinx.rst
|
||||||
|
kernel-doc.rst
|
||||||
|
parse-headers.rst
|
||||||
|
docbook.rst
|
||||||
|
|
||||||
|
.. only:: subproject and html
|
||||||
|
|
||||||
|
Indices
|
||||||
|
=======
|
||||||
|
|
||||||
|
* :ref:`genindex`
|
||||||
@@ -1,228 +1,3 @@
|
|||||||
==========================
|
|
||||||
Linux Kernel Documentation
|
|
||||||
==========================
|
|
||||||
|
|
||||||
Introduction
|
|
||||||
============
|
|
||||||
|
|
||||||
The Linux kernel uses `Sphinx`_ to generate pretty documentation from
|
|
||||||
`reStructuredText`_ files under ``Documentation``. To build the documentation in
|
|
||||||
HTML or PDF formats, use ``make htmldocs`` or ``make pdfdocs``. The generated
|
|
||||||
documentation is placed in ``Documentation/output``.
|
|
||||||
|
|
||||||
.. _Sphinx: http://www.sphinx-doc.org/
|
|
||||||
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
|
|
||||||
|
|
||||||
The reStructuredText files may contain directives to include structured
|
|
||||||
documentation comments, or kernel-doc comments, from source files. Usually these
|
|
||||||
are used to describe the functions and types and design of the code. The
|
|
||||||
kernel-doc comments have some special structure and formatting, but beyond that
|
|
||||||
they are also treated as reStructuredText.
|
|
||||||
|
|
||||||
There is also the deprecated DocBook toolchain to generate documentation from
|
|
||||||
DocBook XML template files under ``Documentation/DocBook``. The DocBook files
|
|
||||||
are to be converted to reStructuredText, and the toolchain is slated to be
|
|
||||||
removed.
|
|
||||||
|
|
||||||
Finally, there are thousands of plain text documentation files scattered around
|
|
||||||
``Documentation``. Some of these will likely be converted to reStructuredText
|
|
||||||
over time, but the bulk of them will remain in plain text.
|
|
||||||
|
|
||||||
Sphinx Build
|
|
||||||
============
|
|
||||||
|
|
||||||
The usual way to generate the documentation is to run ``make htmldocs`` or
|
|
||||||
``make pdfdocs``. There are also other formats available, see the documentation
|
|
||||||
section of ``make help``. The generated documentation is placed in
|
|
||||||
format-specific subdirectories under ``Documentation/output``.
|
|
||||||
|
|
||||||
To generate documentation, Sphinx (``sphinx-build``) must obviously be
|
|
||||||
installed. For prettier HTML output, the Read the Docs Sphinx theme
|
|
||||||
(``sphinx_rtd_theme``) is used if available. For PDF output, ``rst2pdf`` is also
|
|
||||||
needed. All of these are widely available and packaged in distributions.
|
|
||||||
|
|
||||||
To pass extra options to Sphinx, you can use the ``SPHINXOPTS`` make
|
|
||||||
variable. For example, use ``make SPHINXOPTS=-v htmldocs`` to get more verbose
|
|
||||||
output.
|
|
||||||
|
|
||||||
To remove the generated documentation, run ``make cleandocs``.
|
|
||||||
|
|
||||||
Writing Documentation
|
|
||||||
=====================
|
|
||||||
|
|
||||||
Adding new documentation can be as simple as:
|
|
||||||
|
|
||||||
1. Add a new ``.rst`` file somewhere under ``Documentation``.
|
|
||||||
2. Refer to it from the Sphinx main `TOC tree`_ in ``Documentation/index.rst``.
|
|
||||||
|
|
||||||
.. _TOC tree: http://www.sphinx-doc.org/en/stable/markup/toctree.html
|
|
||||||
|
|
||||||
This is usually good enough for simple documentation (like the one you're
|
|
||||||
reading right now), but for larger documents it may be advisable to create a
|
|
||||||
subdirectory (or use an existing one). For example, the graphics subsystem
|
|
||||||
documentation is under ``Documentation/gpu``, split to several ``.rst`` files,
|
|
||||||
and has a separate ``index.rst`` (with a ``toctree`` of its own) referenced from
|
|
||||||
the main index.
|
|
||||||
|
|
||||||
See the documentation for `Sphinx`_ and `reStructuredText`_ on what you can do
|
|
||||||
with them. In particular, the Sphinx `reStructuredText Primer`_ is a good place
|
|
||||||
to get started with reStructuredText. There are also some `Sphinx specific
|
|
||||||
markup constructs`_.
|
|
||||||
|
|
||||||
.. _reStructuredText Primer: http://www.sphinx-doc.org/en/stable/rest.html
|
|
||||||
.. _Sphinx specific markup constructs: http://www.sphinx-doc.org/en/stable/markup/index.html
|
|
||||||
|
|
||||||
Specific guidelines for the kernel documentation
|
|
||||||
------------------------------------------------
|
|
||||||
|
|
||||||
Here are some specific guidelines for the kernel documentation:
|
|
||||||
|
|
||||||
* Please don't go overboard with reStructuredText markup. Keep it simple.
|
|
||||||
|
|
||||||
* Please stick to this order of heading adornments:
|
|
||||||
|
|
||||||
1. ``=`` with overline for document title::
|
|
||||||
|
|
||||||
==============
|
|
||||||
Document title
|
|
||||||
==============
|
|
||||||
|
|
||||||
2. ``=`` for chapters::
|
|
||||||
|
|
||||||
Chapters
|
|
||||||
========
|
|
||||||
|
|
||||||
3. ``-`` for sections::
|
|
||||||
|
|
||||||
Section
|
|
||||||
-------
|
|
||||||
|
|
||||||
4. ``~`` for subsections::
|
|
||||||
|
|
||||||
Subsection
|
|
||||||
~~~~~~~~~~
|
|
||||||
|
|
||||||
Although RST doesn't mandate a specific order ("Rather than imposing a fixed
|
|
||||||
number and order of section title adornment styles, the order enforced will be
|
|
||||||
the order as encountered."), having the higher levels the same overall makes
|
|
||||||
it easier to follow the documents.
|
|
||||||
|
|
||||||
|
|
||||||
the C domain
|
|
||||||
------------
|
|
||||||
|
|
||||||
The `Sphinx C Domain`_ (name c) is suited for documentation of C API. E.g. a
|
|
||||||
function prototype:
|
|
||||||
|
|
||||||
.. code-block:: rst
|
|
||||||
|
|
||||||
.. c:function:: int ioctl( int fd, int request )
|
|
||||||
|
|
||||||
The C domain of the kernel-doc has some additional features. E.g. you can
|
|
||||||
*rename* the reference name of a function with a common name like ``open`` or
|
|
||||||
``ioctl``:
|
|
||||||
|
|
||||||
.. code-block:: rst
|
|
||||||
|
|
||||||
.. c:function:: int ioctl( int fd, int request )
|
|
||||||
:name: VIDIOC_LOG_STATUS
|
|
||||||
|
|
||||||
The func-name (e.g. ioctl) remains in the output but the ref-name changed from
|
|
||||||
``ioctl`` to ``VIDIOC_LOG_STATUS``. The index entry for this function is also
|
|
||||||
changed to ``VIDIOC_LOG_STATUS`` and the function can now referenced by:
|
|
||||||
|
|
||||||
.. code-block:: rst
|
|
||||||
|
|
||||||
:c:func:`VIDIOC_LOG_STATUS`
|
|
||||||
|
|
||||||
|
|
||||||
list tables
|
|
||||||
-----------
|
|
||||||
|
|
||||||
We recommend the use of *list table* formats. The *list table* formats are
|
|
||||||
double-stage lists. Compared to the ASCII-art they might not be as
|
|
||||||
comfortable for
|
|
||||||
readers of the text files. Their advantage is that they are easy to
|
|
||||||
create or modify and that the diff of a modification is much more meaningful,
|
|
||||||
because it is limited to the modified content.
|
|
||||||
|
|
||||||
The ``flat-table`` is a double-stage list similar to the ``list-table`` with
|
|
||||||
some additional features:
|
|
||||||
|
|
||||||
* column-span: with the role ``cspan`` a cell can be extended through
|
|
||||||
additional columns
|
|
||||||
|
|
||||||
* row-span: with the role ``rspan`` a cell can be extended through
|
|
||||||
additional rows
|
|
||||||
|
|
||||||
* auto span rightmost cell of a table row over the missing cells on the right
|
|
||||||
side of that table-row. With Option ``:fill-cells:`` this behavior can
|
|
||||||
changed from *auto span* to *auto fill*, which automatically inserts (empty)
|
|
||||||
cells instead of spanning the last cell.
|
|
||||||
|
|
||||||
options:
|
|
||||||
|
|
||||||
* ``:header-rows:`` [int] count of header rows
|
|
||||||
* ``:stub-columns:`` [int] count of stub columns
|
|
||||||
* ``:widths:`` [[int] [int] ... ] widths of columns
|
|
||||||
* ``:fill-cells:`` instead of auto-spanning missing cells, insert missing cells
|
|
||||||
|
|
||||||
roles:
|
|
||||||
|
|
||||||
* ``:cspan:`` [int] additional columns (*morecols*)
|
|
||||||
* ``:rspan:`` [int] additional rows (*morerows*)
|
|
||||||
|
|
||||||
The example below shows how to use this markup. The first level of the staged
|
|
||||||
list is the *table-row*. In the *table-row* there is only one markup allowed,
|
|
||||||
the list of the cells in this *table-row*. Exceptions are *comments* ( ``..`` )
|
|
||||||
and *targets* (e.g. a ref to ``:ref:`last row <last row>``` / :ref:`last row
|
|
||||||
<last row>`).
|
|
||||||
|
|
||||||
.. code-block:: rst
|
|
||||||
|
|
||||||
.. flat-table:: table title
|
|
||||||
:widths: 2 1 1 3
|
|
||||||
|
|
||||||
* - head col 1
|
|
||||||
- head col 2
|
|
||||||
- head col 3
|
|
||||||
- head col 4
|
|
||||||
|
|
||||||
* - column 1
|
|
||||||
- field 1.1
|
|
||||||
- field 1.2 with autospan
|
|
||||||
|
|
||||||
* - column 2
|
|
||||||
- field 2.1
|
|
||||||
- :rspan:`1` :cspan:`1` field 2.2 - 3.3
|
|
||||||
|
|
||||||
* .. _`last row`:
|
|
||||||
|
|
||||||
- column 3
|
|
||||||
|
|
||||||
Rendered as:
|
|
||||||
|
|
||||||
.. flat-table:: table title
|
|
||||||
:widths: 2 1 1 3
|
|
||||||
|
|
||||||
* - head col 1
|
|
||||||
- head col 2
|
|
||||||
- head col 3
|
|
||||||
- head col 4
|
|
||||||
|
|
||||||
* - column 1
|
|
||||||
- field 1.1
|
|
||||||
- field 1.2 with autospan
|
|
||||||
|
|
||||||
* - column 2
|
|
||||||
- field 2.1
|
|
||||||
- :rspan:`1` :cspan:`1` field 2.2 - 3.3
|
|
||||||
|
|
||||||
* .. _`last row`:
|
|
||||||
|
|
||||||
- column 3
|
|
||||||
|
|
||||||
|
|
||||||
Including kernel-doc comments
|
Including kernel-doc comments
|
||||||
=============================
|
=============================
|
||||||
|
|
||||||
@@ -484,7 +259,10 @@ span multiple lines. The continuation lines may contain indentation.
|
|||||||
In-line member documentation comments
|
In-line member documentation comments
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The structure members may also be documented in-line within the definition::
|
The structure members may also be documented in-line within the definition.
|
||||||
|
There are two styles, single-line comments where both the opening ``/**`` and
|
||||||
|
closing ``*/`` are on the same line, and multi-line comments where they are each
|
||||||
|
on a line of their own, like all other kernel-doc comments::
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* struct foo - Brief description.
|
* struct foo - Brief description.
|
||||||
@@ -502,6 +280,8 @@ The structure members may also be documented in-line within the definition::
|
|||||||
* Here, the member description may contain several paragraphs.
|
* Here, the member description may contain several paragraphs.
|
||||||
*/
|
*/
|
||||||
int baz;
|
int baz;
|
||||||
|
/** @foobar: Single line description. */
|
||||||
|
int foobar;
|
||||||
}
|
}
|
||||||
|
|
||||||
Private members
|
Private members
|
||||||
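The single-line in-line member style added in the hunk above (opening ``/**`` and closing ``*/`` on the same line) is regular enough to recognise with one pattern. The sketch below is illustrative only and is not taken from ``scripts/kernel-doc``; the function name ``parse_single_line_member_comment`` is an assumption.

```python
import re

# "/**", "@member:", the description, and "*/", all on one line.
_SINGLE_LINE = re.compile(r"^\s*/\*\*\s*@(\w+):\s*(.*?)\s*\*/\s*$")


def parse_single_line_member_comment(line):
    """Return (member, description) for a single-line in-line member
    comment, or None if the line is not in that form."""
    m = _SINGLE_LINE.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2)
```

Applied to the line ``/** @foobar: Single line description. */`` from the example, this returns the member name ``foobar`` and its description; lines belonging to a multi-line comment do not match and return ``None``.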
@@ -586,94 +366,3 @@ file.
|
|||||||
|
|
||||||
Data structures visible in kernel include files should also be documented using
|
Data structures visible in kernel include files should also be documented using
|
||||||
kernel-doc formatted comments.
|
kernel-doc formatted comments.
|
||||||
|
|
||||||
DocBook XML [DEPRECATED]
|
|
||||||
========================
|
|
||||||
|
|
||||||
.. attention::
|
|
||||||
|
|
||||||
This section describes the deprecated DocBook XML toolchain. Please do not
|
|
||||||
create new DocBook XML template files. Please consider converting existing
|
|
||||||
DocBook XML templates files to Sphinx/reStructuredText.
|
|
||||||
|
|
||||||
Converting DocBook to Sphinx
|
|
||||||
----------------------------
|
|
||||||
|
|
||||||
Over time, we expect all of the documents under ``Documentation/DocBook`` to be
|
|
||||||
converted to Sphinx and reStructuredText. For most DocBook XML documents, a good
|
|
||||||
enough solution is to use the simple ``Documentation/sphinx/tmplcvt`` script,
|
|
||||||
which uses ``pandoc`` under the hood. For example::
|
|
||||||
|
|
||||||
$ cd Documentation/sphinx
|
|
||||||
$ ./tmplcvt ../DocBook/in.tmpl ../out.rst
|
|
||||||
|
|
||||||
Then edit the resulting rst files to fix any remaining issues, and add the
|
|
||||||
document in the ``toctree`` in ``Documentation/index.rst``.
|
|
||||||
|
|
||||||
Components of the kernel-doc system
|
|
||||||
-----------------------------------
|
|
||||||
|
|
||||||
Many places in the source tree have extractable documentation in the form of
|
|
||||||
block comments above functions. The components of this system are:
|
|
||||||
|
|
||||||
- ``scripts/kernel-doc``
|
|
||||||
|
|
||||||
This is a perl script that hunts for the block comments and can mark them up
|
|
||||||
directly into reStructuredText, DocBook, man, text, and HTML. (No, not
|
|
||||||
texinfo.)
|
|
||||||
|
|
||||||
- ``Documentation/DocBook/*.tmpl``
|
|
||||||
|
|
||||||
These are XML template files, which are normal XML files with special
|
|
||||||
place-holders for where the extracted documentation should go.
|
|
||||||
|
|
||||||
- ``scripts/docproc.c``
|
|
||||||
|
|
||||||
This is a program for converting XML template files into XML files. When a
|
|
||||||
file is referenced it is searched for symbols exported (EXPORT_SYMBOL), to be
|
|
||||||
able to distinguish between internal and external functions.
|
|
||||||
|
|
||||||
It invokes kernel-doc, giving it the list of functions that are to be
|
|
||||||
documented.
|
|
||||||
|
|
||||||
Additionally it is used to scan the XML template files to locate all the files
|
|
||||||
referenced herein. This is used to generate dependency information as used by
|
|
||||||
make.
|
|
||||||
|
|
||||||
- ``Makefile``
|
|
||||||
|
|
||||||
The targets 'xmldocs', 'psdocs', 'pdfdocs', and 'htmldocs' are used to build
|
|
||||||
DocBook XML files, PostScript files, PDF files, and html files in
|
|
||||||
Documentation/DocBook. The older target 'sgmldocs' is equivalent to 'xmldocs'.
|
|
||||||
|
|
||||||
- ``Documentation/DocBook/Makefile``
|
|
||||||
|
|
||||||
This is where C files are associated with SGML templates.
|
|
||||||
|
|
||||||
How to use kernel-doc comments in DocBook XML template files
|
|
||||||
------------------------------------------------------------
|
|
||||||
|
|
||||||
DocBook XML template files (\*.tmpl) are like normal XML files, except that they
|
|
||||||
can contain escape sequences where extracted documentation should be inserted.
|
|
||||||
|
|
||||||
``!E<filename>`` is replaced by the documentation, in ``<filename>``, for
|
|
||||||
functions that are exported using ``EXPORT_SYMBOL``: the function list is
|
|
||||||
collected from files listed in ``Documentation/DocBook/Makefile``.
|
|
||||||
|
|
||||||
``!I<filename>`` is replaced by the documentation for functions that are **not**
|
|
||||||
exported using ``EXPORT_SYMBOL``.
|
|
||||||
|
|
||||||
``!D<filename>`` is used to name additional files to search for functions
|
|
||||||
exported using ``EXPORT_SYMBOL``.
|
|
||||||
|
|
||||||
``!F<filename> <function [functions...]>`` is replaced by the documentation, in
|
|
||||||
``<filename>``, for the functions listed.
|
|
||||||
|
|
||||||
``!P<filename> <section title>`` is replaced by the contents of the ``DOC:``
|
|
||||||
section titled ``<section title>`` from ``<filename>``. Spaces are allowed in
|
|
||||||
``<section title>``; do not quote the ``<section title>``.
|
|
||||||
|
|
||||||
``!C<filename>`` is replaced by nothing, but makes the tools check that all DOC:
|
|
||||||
sections and documented functions, symbols, etc. are used. This makes sense to
|
|
||||||
use when you use ``!F`` or ``!P`` only and want to verify that all documentation
|
|
||||||
is included.
|
|
||||||
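The escape sequences above share one shape: an exclamation mark, a one-letter kind, a file name, and optional arguments. The following Python sketch shows how such a line could be classified. It is an illustration only, not how ``scripts/docproc.c`` is implemented; ``Escape``, ``KINDS`` and ``parse_escape`` are hypothetical names.

```python
from typing import NamedTuple, Optional

# One-letter escape kinds described above.
KINDS = {
    "E": "exported functions",
    "I": "non-exported functions",
    "D": "additional file to search for exported symbols",
    "F": "listed functions",
    "P": "DOC: section",
    "C": "check that everything is used",
}


class Escape(NamedTuple):
    kind: str        # one of the keys of KINDS
    filename: str
    args: list       # function names for !F, [section title] for !P


def parse_escape(line: str) -> Optional[Escape]:
    """Return the parsed escape, or None for an ordinary XML line."""
    text = line.strip()
    if len(text) < 3 or text[0] != "!" or text[1] not in KINDS:
        return None
    kind = text[1]
    # The file name follows the kind letter directly, with no space.
    filename, _, rest = text[2:].partition(" ")
    rest = rest.strip()
    if kind == "F":
        args = rest.split()             # one or more function names
    elif kind == "P":
        args = [rest] if rest else []   # title may contain spaces, unquoted
    else:
        args = []
    return Escape(kind, filename, args)
```

For ``!P`` the whole remainder is kept as one argument, which mirrors the rule that the section title may contain spaces and must not be quoted.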
192
Documentation/doc-guide/parse-headers.rst
Normal file
@@ -0,0 +1,192 @@
|
|||||||
|
===========================
|
||||||
|
Including uAPI header files
|
||||||
|
===========================
|
||||||
|
|
||||||
|
Sometimes, it is useful to include header files and C example code in
|
||||||
|
order to describe the userspace API and to generate cross-references
|
||||||
|
between the code and the documentation. Adding cross-references for
|
||||||
|
userspace API files has an additional advantage: Sphinx will generate warnings
|
||||||
|
if a symbol is not found in the documentation. That helps to keep the
|
||||||
|
uAPI documentation in sync with the Kernel changes.
|
||||||
|
The :ref:`parse_headers.pl <parse_headers>` provides a way to generate such
|
||||||
|
cross-references. It has to be called via Makefile, while building the
|
||||||
|
documentation. Please see ``Documentation/media/Makefile`` for an example
|
||||||
|
of how to use it inside the kernel tree.
|
||||||
|
|
||||||
|
.. _parse_headers:
|
||||||
|
|
||||||
|
parse_headers.pl
|
||||||
|
^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
NAME
|
||||||
|
****
|
||||||
|
|
||||||
|
|
||||||
|
parse_headers.pl - parse a C file, in order to identify functions, structs,
|
||||||
|
enums and defines and create cross-references to a Sphinx book.
|
||||||
|
|
||||||
|
|
||||||
|
SYNOPSIS
|
||||||
|
********
|
||||||
|
|
||||||
|
|
||||||
|
\ **parse_headers.pl**\ [<options>] <C_FILE> <OUT_FILE> [<EXCEPTIONS_FILE>]
|
||||||
|
|
||||||
|
Where <options> can be: --debug, --usage or --help.
|
||||||
|
|
||||||
|
|
||||||
|
OPTIONS
|
||||||
|
*******
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **--debug**\
|
||||||
|
|
||||||
|
Put the script in verbose mode, useful for debugging.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **--usage**\
|
||||||
|
|
||||||
|
Prints a brief help message and exits.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **--help**\
|
||||||
|
|
||||||
|
Prints a more detailed help message and exits.
|
||||||
|
|
||||||
|
|
||||||
|
DESCRIPTION
|
||||||
|
***********
|
||||||
|
|
||||||
|
|
||||||
|
Convert a C header or source file (C_FILE) into reStructuredText
|
||||||
|
included via a ``.. parsed-literal::`` block, with cross-references for the
|
||||||
|
documentation files that describe the API. It accepts an optional
|
||||||
|
EXCEPTIONS_FILE which describes which elements will be either ignored or
|
||||||
|
be pointed to a non-default reference.
|
||||||
|
|
||||||
|
The output is written to OUT_FILE.
|
||||||
|
|
||||||
|
It is capable of identifying defines, functions, structs, typedefs,
|
||||||
|
enums and enum symbols and creating cross-references for all of them.
|
||||||
|
It is also capable of distinguishing #defines used to specify a Linux
|
||||||
|
ioctl.
|
||||||
|
|
||||||
|
The EXCEPTIONS_FILE contains two types of statements: \ **ignore**\ or \ **replace**\ .
|
||||||
|
|
||||||
|
The syntax for the ignore tag is:
|
||||||
|
|
||||||
|
|
||||||
|
ignore \ **type**\ \ **name**\
|
||||||
|
|
||||||
|
The \ **ignore**\ means that it won't generate cross references for a
|
||||||
|
\ **name**\ symbol of type \ **type**\ .
|
||||||
|
|
||||||
|
The syntax for the replace tag is:
|
||||||
|
|
||||||
|
|
||||||
|
replace \ **type**\ \ **name**\ \ **new_value**\
|
||||||
|
|
||||||
|
The \ **replace**\ means that it will generate cross references for a
|
||||||
|
\ **name**\ symbol of type \ **type**\ , but, instead of using the default
|
||||||
|
replacement rule, it will use \ **new_value**\ .
|
||||||
|
|
||||||
|
For both statements, \ **type**\ can be one of the following:
|
||||||
|
|
||||||
|
|
||||||
|
\ **ioctl**\
|
||||||
|
|
||||||
|
The ignore or replace statement will apply to ioctl definitions like:
|
||||||
|
|
||||||
|
#define VIDIOC_DBG_S_REGISTER _IOW('V', 79, struct v4l2_dbg_register)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **define**\
|
||||||
|
|
||||||
|
The ignore or replace statement will apply to any other #define found
|
||||||
|
at C_FILE.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **typedef**\
|
||||||
|
|
||||||
|
The ignore or replace statement will apply to typedef statements at C_FILE.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **struct**\
|
||||||
|
|
||||||
|
The ignore or replace statement will apply to the name of struct statements
|
||||||
|
at C_FILE.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **enum**\
|
||||||
|
|
||||||
|
The ignore or replace statement will apply to the name of enum statements
|
||||||
|
at C_FILE.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\ **symbol**\
|
||||||
|
|
||||||
|
The ignore or replace statement will apply to the names of enum values
|
||||||
|
at C_FILE.
|
||||||
|
|
||||||
|
For replace statements, \ **new_value**\ will automatically use :c:type:
|
||||||
|
references for \ **typedef**\ , \ **enum**\ and \ **struct**\ types. It will use :ref:
|
||||||
|
for \ **ioctl**\ , \ **define**\ and \ **symbol**\ types. The type of reference can
|
||||||
|
also be explicitly defined at the replace statement.
|
||||||
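The two statement forms can be read with very little code. The Python sketch below is only an illustration of the syntax described above, not the logic of the Perl script itself; ``parse_exceptions`` and ``default_role`` are hypothetical names, and treating ``#`` lines as comments is an assumption.

```python
TYPES = {"ioctl", "define", "typedef", "struct", "enum", "symbol"}


def parse_exceptions(text):
    """Split EXCEPTIONS_FILE text into an ignore set and a replace table."""
    ignore = set()    # {(type, name)}
    replace = {}      # {(type, name): new_value}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):   # assumption: '#' is a comment
            continue
        fields = line.split(None, 3)
        if fields[0] == "ignore" and len(fields) == 3 and fields[1] in TYPES:
            ignore.add((fields[1], fields[2]))
        elif fields[0] == "replace" and len(fields) == 4 and fields[1] in TYPES:
            replace[(fields[1], fields[2])] = fields[3]
        else:
            raise ValueError("cannot parse: %r" % line)
    return ignore, replace


def default_role(kind):
    """Reference role used when a replace statement does not name one."""
    return ":c:type:" if kind in {"typedef", "enum", "struct"} else ":ref:"
```

Keying both tables on the ``(type, name)`` pair keeps a ``define`` and a ``symbol`` of the same name apart, which matches the way the statements are written.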
|
|
||||||
|
|
||||||
|
|
||||||
|
EXAMPLES
|
||||||
|
********
|
||||||
|
|
||||||
|
|
||||||
|
ignore define _VIDEODEV2_H
|
||||||
|
|
||||||
|
|
||||||
|
Ignore a #define _VIDEODEV2_H at the C_FILE.
|
||||||
|
|
||||||
|
ignore symbol PRIVATE
|
||||||
|
|
||||||
|
|
||||||
|
On an enum like:
|
||||||
|
|
||||||
|
enum foo { BAR1, BAR2, PRIVATE };
|
||||||
|
|
||||||
|
It won't generate cross-references for \ **PRIVATE**\ .
|
||||||
|
|
||||||
|
replace symbol BAR1 :c:type:\`foo\`
|
||||||
|
replace symbol BAR2 :c:type:\`foo\`
|
||||||
|
|
||||||
|
|
||||||
|
On an enum like:
|
||||||
|
|
||||||
|
enum foo { BAR1, BAR2, PRIVATE };
|
||||||
|
|
||||||
|
It will make the BAR1 and BAR2 enum symbols cross-reference the foo
|
||||||
|
symbol at the C domain.
|
||||||
|
|
||||||
|
|
||||||
|
BUGS
|
||||||
|
****
|
||||||
|
|
||||||
|
|
||||||
|
Report bugs to Mauro Carvalho Chehab <mchehab@s-opensource.com>
|
||||||
|
|
||||||
|
|
||||||
|
COPYRIGHT
|
||||||
|
*********
|
||||||
|
|
||||||
|
|
||||||
|
Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@s-opensource.com>.
|
||||||
|
|
||||||
|
License GPLv2: GNU GPL version 2 <http://gnu.org/licenses/gpl.html>.
|
||||||
|
|
||||||
|
This is free software: you are free to change and redistribute it.
|
||||||
|
There is NO WARRANTY, to the extent permitted by law.
|
||||||
219
Documentation/doc-guide/sphinx.rst
Normal file
@@ -0,0 +1,219 @@
|
|||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
The Linux kernel uses `Sphinx`_ to generate pretty documentation from
|
||||||
|
`reStructuredText`_ files under ``Documentation``. To build the documentation in
|
||||||
|
HTML or PDF formats, use ``make htmldocs`` or ``make pdfdocs``. The generated
|
||||||
|
documentation is placed in ``Documentation/output``.
|
||||||
|
|
||||||
|
.. _Sphinx: http://www.sphinx-doc.org/
|
||||||
|
.. _reStructuredText: http://docutils.sourceforge.net/rst.html
|
||||||
|
|
||||||
|
The reStructuredText files may contain directives to include structured
|
||||||
|
documentation comments, or kernel-doc comments, from source files. Usually these
|
||||||
|
are used to describe the functions and types and design of the code. The
|
||||||
|
kernel-doc comments have some special structure and formatting, but beyond that
|
||||||
|
they are also treated as reStructuredText.
|
||||||
|
|
||||||
|
There is also the deprecated DocBook toolchain to generate documentation from
|
||||||
|
DocBook XML template files under ``Documentation/DocBook``. The DocBook files
|
||||||
|
are to be converted to reStructuredText, and the toolchain is slated to be
|
||||||
|
removed.
|
||||||
|
|
||||||
|
Finally, there are thousands of plain text documentation files scattered around
|
||||||
|
``Documentation``. Some of these will likely be converted to reStructuredText
|
||||||
|
over time, but the bulk of them will remain in plain text.
|
||||||
|
|
||||||
|
Sphinx Build
|
||||||
|
============
|
||||||
|
|
||||||
|
The usual way to generate the documentation is to run ``make htmldocs`` or
|
||||||
|
``make pdfdocs``. There are also other formats available, see the documentation
|
||||||
|
section of ``make help``. The generated documentation is placed in
|
||||||
|
format-specific subdirectories under ``Documentation/output``.
|
||||||
|
|
||||||
|
To generate documentation, Sphinx (``sphinx-build``) must obviously be
|
||||||
|
installed. For prettier HTML output, the Read the Docs Sphinx theme
|
||||||
|
(``sphinx_rtd_theme``) is used if available. For PDF output, ``rst2pdf`` is also
|
||||||
|
needed. All of these are widely available and packaged in distributions.
|
||||||
|
|
||||||
|
To pass extra options to Sphinx, you can use the ``SPHINXOPTS`` make
|
||||||
|
variable. For example, use ``make SPHINXOPTS=-v htmldocs`` to get more verbose
|
||||||
|
output.
|
||||||
|
|
||||||
|
To remove the generated documentation, run ``make cleandocs``.
|
||||||
|
|
||||||
|
Writing Documentation
|
||||||
|
=====================
|
||||||
|
|
||||||
|
Adding new documentation can be as simple as:
|
||||||
|
|
||||||
|
1. Add a new ``.rst`` file somewhere under ``Documentation``.
|
||||||
|
2. Refer to it from the Sphinx main `TOC tree`_ in ``Documentation/index.rst``.
|
||||||
|
|
||||||
|
.. _TOC tree: http://www.sphinx-doc.org/en/stable/markup/toctree.html
|
||||||
|
|
||||||
|
This is usually good enough for simple documentation (like the one you're
|
||||||
|
reading right now), but for larger documents it may be advisable to create a
|
||||||
|
subdirectory (or use an existing one). For example, the graphics subsystem
|
||||||
|
documentation is under ``Documentation/gpu``, split to several ``.rst`` files,
|
||||||
|
and has a separate ``index.rst`` (with a ``toctree`` of its own) referenced from
|
||||||
|
the main index.
|
||||||
|
|
||||||
|
See the documentation for `Sphinx`_ and `reStructuredText`_ on what you can do
|
||||||
|
with them. In particular, the Sphinx `reStructuredText Primer`_ is a good place
|
||||||
|
to get started with reStructuredText. There are also some `Sphinx specific
|
||||||
|
markup constructs`_.
|
||||||
|
|
||||||
|
.. _reStructuredText Primer: http://www.sphinx-doc.org/en/stable/rest.html
|
||||||
|
.. _Sphinx specific markup constructs: http://www.sphinx-doc.org/en/stable/markup/index.html
|
||||||
|
|
||||||
|
Specific guidelines for the kernel documentation
|
||||||
|
------------------------------------------------
|
||||||
|
|
||||||
|
Here are some specific guidelines for the kernel documentation:
|
||||||
|
|
||||||
|
* Please don't go overboard with reStructuredText markup. Keep it simple.
|
||||||
|
|
||||||
|
* Please stick to this order of heading adornments:
|
||||||
|
|
||||||
|
1. ``=`` with overline for document title::
|
||||||
|
|
||||||
|
==============
|
||||||
|
Document title
|
||||||
|
==============
|
||||||
|
|
||||||
|
2. ``=`` for chapters::
|
||||||
|
|
||||||
|
Chapters
|
||||||
|
========
|
||||||
|
|
||||||
|
3. ``-`` for sections::
|
||||||
|
|
||||||
|
Section
|
||||||
|
-------
|
||||||
|
|
||||||
|
4. ``~`` for subsections::
|
||||||
|
|
||||||
|
Subsection
|
||||||
|
~~~~~~~~~~
|
||||||
|
|
||||||
|
Although RST doesn't mandate a specific order ("Rather than imposing a fixed
|
||||||
|
number and order of section title adornment styles, the order enforced will be
|
||||||
|
the order as encountered."), having the higher levels the same overall makes
|
||||||
|
it easier to follow the documents.
|
||||||
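The adornment order above is mechanical enough to generate. As a small sketch (the ``heading`` helper and the level names are hypothetical, not part of any kernel tooling), the following Python function renders a heading at one of the four recommended levels:

```python
# Adornment characters in the order recommended above.
ADORNMENTS = {
    "title": ("=", True),         # overline and underline
    "chapter": ("=", False),
    "section": ("-", False),
    "subsection": ("~", False),
}


def heading(text, level):
    """Return the reStructuredText lines for a heading of the given level."""
    char, overline = ADORNMENTS[level]
    rule = char * len(text)       # the adornment is as long as the title
    lines = [rule, text, rule] if overline else [text, rule]
    return "\n".join(lines)
```

For example, ``print(heading("Section", "section"))`` prints the title followed by a line of seven dashes.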
|
|
||||||
|
|
||||||
|
the C domain
|
||||||
|
------------
|
||||||
|
|
||||||
|
The `Sphinx C Domain`_ (name c) is suited for documentation of C API. E.g. a
|
||||||
|
function prototype:
|
||||||
|
|
||||||
|
.. code-block:: rst
|
||||||
|
|
||||||
|
.. c:function:: int ioctl( int fd, int request )
|
||||||
|
|
||||||
|
The C domain of the kernel-doc has some additional features. E.g. you can
|
||||||
|
*rename* the reference name of a function with a common name like ``open`` or
|
||||||
|
``ioctl``:
|
||||||
|
|
||||||
|
.. code-block:: rst
|
||||||
|
|
||||||
|
.. c:function:: int ioctl( int fd, int request )
|
||||||
|
:name: VIDIOC_LOG_STATUS
|
||||||
|
|
||||||
|
The func-name (e.g. ioctl) remains in the output but the ref-name changed from
|
||||||
|
``ioctl`` to ``VIDIOC_LOG_STATUS``. The index entry for this function is also
|
||||||
|
changed to ``VIDIOC_LOG_STATUS`` and the function can now be referenced by:
|
||||||
|
|
||||||
|
.. code-block:: rst
|
||||||
|
|
||||||
|
:c:func:`VIDIOC_LOG_STATUS`
|
||||||
|
|
||||||
|
|
||||||
|
list tables
|
||||||
|
-----------
|
||||||
|
|
||||||
|
We recommend the use of *list table* formats. The *list table* formats are
|
||||||
|
double-stage lists. Compared to the ASCII-art they might not be as
|
||||||
|
comfortable for
|
||||||
|
readers of the text files. Their advantage is that they are easy to
|
||||||
|
create or modify and that the diff of a modification is much more meaningful,
|
||||||
|
because it is limited to the modified content.
|
||||||
|
|
||||||
|
The ``flat-table`` is a double-stage list similar to the ``list-table`` with
|
||||||
|
some additional features:
|
||||||
|
|
||||||
|
* column-span: with the role ``cspan`` a cell can be extended through
|
||||||
|
additional columns
|
||||||
|
|
||||||
|
* row-span: with the role ``rspan`` a cell can be extended through
|
||||||
|
additional rows
|
||||||
|
|
||||||
|
* auto span rightmost cell of a table row over the missing cells on the right
|
||||||
|
side of that table-row. With Option ``:fill-cells:`` this behavior can
|
||||||
|
be changed from *auto span* to *auto fill*, which automatically inserts (empty)
|
||||||
|
cells instead of spanning the last cell.
|
||||||
|
|
||||||
|
options:
|
||||||
|
|
||||||
|
* ``:header-rows:`` [int] count of header rows
|
||||||
|
* ``:stub-columns:`` [int] count of stub columns
|
||||||
|
* ``:widths:`` [[int] [int] ... ] widths of columns
|
||||||
|
* ``:fill-cells:`` instead of auto-spanning missing cells, insert missing cells
|
||||||
|
|
||||||
|
roles:
|
||||||
|
|
||||||
|
* ``:cspan:`` [int] additional columns (*morecols*)
|
||||||
|
* ``:rspan:`` [int] additional rows (*morerows*)
|
||||||
|
|
||||||
|
The example below shows how to use this markup. The first level of the staged
|
||||||
|
list is the *table-row*. In the *table-row* there is only one markup allowed,
|
||||||
|
the list of the cells in this *table-row*. Exceptions are *comments* ( ``..`` )
|
||||||
|
and *targets* (e.g. a ref to ``:ref:`last row <last row>``` / :ref:`last row
|
||||||
|
<last row>`).
|
||||||
|
|
||||||
|
.. code-block:: rst
|
||||||
|
|
||||||
|
.. flat-table:: table title
|
||||||
|
:widths: 2 1 1 3
|
||||||
|
|
||||||
|
* - head col 1
|
||||||
|
- head col 2
|
||||||
|
- head col 3
|
||||||
|
- head col 4
|
||||||
|
|
||||||
|
* - column 1
|
||||||
|
- field 1.1
|
||||||
|
- field 1.2 with autospan
|
||||||
|
|
||||||
|
* - column 2
|
||||||
|
- field 2.1
|
||||||
|
- :rspan:`1` :cspan:`1` field 2.2 - 3.3
|
||||||
|
|
||||||
|
* .. _`last row`:
|
||||||
|
|
||||||
|
- column 3
|
||||||
|
|
||||||
|
Rendered as:
|
||||||
|
|
||||||
|
.. flat-table:: table title
|
||||||
|
:widths: 2 1 1 3
|
||||||
|
|
||||||
|
* - head col 1
|
||||||
|
- head col 2
|
||||||
|
- head col 3
|
||||||
|
- head col 4
|
||||||
|
|
||||||
|
* - column 1
|
||||||
|
- field 1.1
|
||||||
|
- field 1.2 with autospan
|
||||||
|
|
||||||
|
* - column 2
|
||||||
|
- field 2.1
|
||||||
|
- :rspan:`1` :cspan:`1` field 2.2 - 3.3
|
||||||
|
|
||||||
|
* .. _`last row`:
|
||||||
|
|
||||||
|
- column 3
|
||||||
@@ -3,3 +3,8 @@
|
|||||||
project = "Linux 802.11 Driver Developer's Guide"
|
project = "Linux 802.11 Driver Developer's Guide"
|
||||||
|
|
||||||
tags.add("subproject")
|
tags.add("subproject")
|
||||||
|
|
||||||
|
latex_documents = [
|
||||||
|
('index', '80211.tex', project,
|
||||||
|
'The kernel development community', 'manual'),
|
||||||
|
]
|
||||||
@@ -9,7 +9,7 @@ Linux 802.11 Driver Developer's Guide
|
|||||||
mac80211
|
mac80211
|
||||||
mac80211-advanced
|
mac80211-advanced
|
||||||
|
|
||||||
.. only:: subproject
|
.. only:: subproject and html
|
||||||
|
|
||||||
Indices
|
Indices
|
||||||
=======
|
=======
|
||||||
10
Documentation/driver-api/conf.py
Normal file
@@ -0,0 +1,10 @@
|
|||||||
|
# -*- coding: utf-8; mode: python -*-
|
||||||
|
|
||||||
|
project = "The Linux driver implementer's API guide"
|
||||||
|
|
||||||
|
tags.add("subproject")
|
||||||
|
|
||||||
|
latex_documents = [
|
||||||
|
('index', 'driver-api.tex', project,
|
||||||
|
'The kernel development community', 'manual'),
|
||||||
|
]
|
||||||
279
Documentation/driver-api/device_link.rst
Normal file
@@ -0,0 +1,279 @@
|
|||||||
|
============
|
||||||
|
Device links
|
||||||
|
============
|
||||||
|
|
||||||
|
By default, the driver core only enforces dependencies between devices
|
||||||
|
that are borne out of a parent/child relationship within the device
|
||||||
|
hierarchy: When suspending, resuming or shutting down the system, devices
|
||||||
|
are ordered based on this relationship, i.e. children are always suspended
|
||||||
|
before their parent, and the parent is always resumed before its children.
|
||||||
|
|
||||||
|
Sometimes there is a need to represent device dependencies beyond the
|
||||||
|
mere parent/child relationship, e.g. between siblings, and have the
|
||||||
|
driver core automatically take care of them.
|
||||||
|
|
||||||
|
Secondly, the driver core by default does not enforce any driver presence
|
||||||
|
dependencies, i.e. that one device must be bound to a driver before
|
||||||
|
another one can probe or function correctly.
|
||||||
|
|
||||||
|
Often these two dependency types come together, so a device depends on
|
||||||
|
another one both with regards to driver presence *and* with regards to
|
||||||
|
suspend/resume and shutdown ordering.
|
||||||
|
|
||||||
|
Device links allow representation of such dependencies in the driver core.
|
||||||
|
|
||||||
|
In its standard form, a device link combines *both* dependency types:
|
||||||
|
It guarantees correct suspend/resume and shutdown ordering between a
|
||||||
|
"supplier" device and its "consumer" devices, and it guarantees driver
|
||||||
|
presence on the supplier. The consumer devices are not probed before the
|
||||||
|
supplier is bound to a driver, and they're unbound before the supplier
|
||||||
|
is unbound.
|
||||||
|
|
||||||
|
When driver presence on the supplier is irrelevant and only correct
|
||||||
|
suspend/resume and shutdown ordering is needed, the device link may
|
||||||
|
simply be set up with the ``DL_FLAG_STATELESS`` flag. In other words,
|
||||||
|
enforcing driver presence on the supplier is optional.
|
||||||
|
|
||||||
|
Another optional feature is runtime PM integration: By setting the
|
||||||
|
``DL_FLAG_PM_RUNTIME`` flag on addition of the device link, the PM core
|
||||||
|
is instructed to runtime resume the supplier and keep it active
|
||||||
|
whenever and for as long as the consumer is runtime resumed.
|
||||||
|
|
||||||
|
Usage
|
||||||
|
=====
|
||||||
|
|
||||||
|
The earliest point in time when device links can be added is after
|
||||||
|
:c:func:`device_add()` has been called for the supplier and
|
||||||
|
:c:func:`device_initialize()` has been called for the consumer.
|
||||||
|
|
||||||
|
It is legal to add them later, but care must be taken that the system
|
||||||
|
remains in a consistent state: E.g. a device link cannot be added in
|
||||||
|
the midst of a suspend/resume transition, so either commencement of
|
||||||
|
such a transition needs to be prevented with :c:func:`lock_system_sleep()`,
|
||||||
|
or the device link needs to be added from a function which is guaranteed
|
||||||
|
not to run in parallel to a suspend/resume transition, such as from a
|
||||||
|
device ``->probe`` callback or a boot-time PCI quirk.
|
||||||
|
|
||||||
|
Another example for an inconsistent state would be a device link that
|
||||||
|
represents a driver presence dependency, yet is added from the consumer's
|
||||||
|
``->probe`` callback while the supplier hasn't probed yet: Had the driver
|
||||||
|
core known about the device link earlier, it wouldn't have probed the
|
||||||
|
consumer in the first place. The onus is thus on the consumer to check
|
||||||
|
presence of the supplier after adding the link, and defer probing on
|
||||||
|
non-presence.
|
||||||
|
|
||||||
|
If a device link is added in the ``->probe`` callback of the supplier or
|
||||||
|
consumer driver, it is typically deleted in its ``->remove`` callback for
|
||||||
|
symmetry. That way, if the driver is compiled as a module, the device
|
||||||
|
link is added on module load and orderly deleted on unload. The same
|
||||||
|
restrictions that apply to device link addition (e.g. exclusion of a
|
||||||
|
parallel suspend/resume transition) apply equally to deletion.
|
||||||
|
|
||||||
|
Several flags may be specified on device link addition, two of which
|
||||||
|
have already been mentioned above: ``DL_FLAG_STATELESS`` to express that no
|
||||||
|
driver presence dependency is needed (but only correct suspend/resume and
|
||||||
|
shutdown ordering) and ``DL_FLAG_PM_RUNTIME`` to express that runtime PM
|
||||||
|
integration is desired.
|
||||||
|
|
||||||
|
Two other flags are specifically targeted at use cases where the device
|
||||||
|
link is added from the consumer's ``->probe`` callback: ``DL_FLAG_RPM_ACTIVE``
|
||||||
|
can be specified to runtime resume the supplier upon addition of the
|
||||||
|
device link. ``DL_FLAG_AUTOREMOVE`` causes the device link to be automatically
|
||||||
|
purged when the consumer fails to probe or later unbinds. This obviates
|
||||||
|
the need to explicitly delete the link in the ``->remove`` callback or in
|
||||||
|
the error path of the ``->probe`` callback.
|
||||||
|
|
||||||
|
Limitations
|
||||||
|
===========
|
||||||
|
|
||||||
|
Driver authors should be aware that a driver presence dependency (i.e. when
|
||||||
|
``DL_FLAG_STATELESS`` is not specified on link addition) may cause probing of
|
||||||
|
the consumer to be deferred indefinitely. This can become a problem if the
|
||||||
|
consumer is required to probe before a certain initcall level is reached.
|
||||||
|
Worse, if the supplier driver is blacklisted or missing, the consumer will
|
||||||
|
never be probed.
|
||||||
|
|
||||||
|
Sometimes drivers depend on optional resources. They are able to operate
|
||||||
|
in a degraded mode (reduced feature set or performance) when those resources
|
||||||
|
are not present. An example is an SPI controller that can use a DMA engine
|
||||||
|
or work in PIO mode. The controller can determine presence of the optional
|
||||||
|
resources at probe time but on non-presence there is no way to know whether
|
||||||
|
they will become available in the near future (due to a supplier driver
|
||||||
|
probing) or never. Consequently it cannot be determined whether to defer
|
||||||
|
probing or not. It would be possible to notify drivers when optional
|
||||||
|
resources become available after probing, but it would come at a high cost
|
||||||
|
for drivers as switching between modes of operation at runtime based on the
|
||||||
|
availability of such resources would be much more complex than a mechanism
|
||||||
|
based on probe deferral. In any case optional resources are beyond the
|
||||||
|
scope of device links.
|
||||||
|
|
||||||
|
Examples
|
||||||
|
========
|
||||||
|
|
||||||
|
* An MMU device exists alongside a busmaster device, both are in the same
|
||||||
|
power domain. The MMU implements DMA address translation for the busmaster
|
||||||
|
device and shall be runtime resumed and kept active whenever and as long
|
||||||
|
as the busmaster device is active. The busmaster device's driver shall
|
||||||
|
not bind before the MMU is bound. To achieve this, a device link with
|
||||||
|
runtime PM integration is added from the busmaster device (consumer)
|
||||||
|
to the MMU device (supplier). The effect with regards to runtime PM
|
||||||
|
is the same as if the MMU was the parent of the master device.
|
||||||
|
|
||||||
|
The fact that both devices share the same power domain would normally
|
||||||
|
suggest usage of a :c:type:`struct dev_pm_domain` or :c:type:`struct
|
||||||
|
generic_pm_domain`, however these are not independent devices that
|
||||||
|
happen to share a power switch, but rather the MMU device serves the
|
||||||
|
busmaster device and is useless without it. A device link creates a
|
||||||
|
synthetic hierarchical relationship between the devices and is thus
|
||||||
|
more apt.
|
||||||
|
|
||||||
|
* A Thunderbolt host controller comprises a number of PCIe hotplug ports
|
||||||
|
and an NHI device to manage the PCIe switch. On resume from system sleep,
|
||||||
|
the NHI device needs to re-establish PCI tunnels to attached devices
|
||||||
|
before the hotplug ports can resume. If the hotplug ports were children
|
||||||
|
of the NHI, this resume order would automatically be enforced by the
|
||||||
|
PM core, but unfortunately they're aunts. The solution is to add
|
||||||
|
device links from the hotplug ports (consumers) to the NHI device
|
||||||
|
(supplier). A driver presence dependency is not necessary for this
|
||||||
|
use case.
|
||||||
|
|
||||||
|
* Discrete GPUs in hybrid graphics laptops often feature an HDA controller
|
||||||
|
for HDMI/DP audio. In the device hierarchy the HDA controller is a sibling
|
||||||
|
of the VGA device, yet both share the same power domain and the HDA
|
||||||
|
controller is only ever needed when an HDMI/DP display is attached to the
|
||||||
|
VGA device. A device link from the HDA controller (consumer) to the
|
||||||
|
VGA device (supplier) aptly represents this relationship.
|
||||||
|
|
||||||
|
* ACPI allows definition of a device start order by way of _DEP objects.
|
||||||
|
A classical example is when ACPI power management methods on one device
|
||||||
|
are implemented in terms of I\ :sup:`2`\ C accesses and require a specific
|
||||||
|
I\ :sup:`2`\ C controller to be present and functional for the power
|
||||||
|
management of the device in question to work.
|
||||||
|
|
||||||
|
* In some SoCs a functional dependency exists from display, video codec and
|
||||||
|
video processing IP cores on transparent memory access IP cores that handle
|
||||||
|
burst access and compression/decompression.
|
||||||
|
|
||||||
|
Alternatives
|
||||||
|
============
|
||||||
|
|
||||||
|
* A :c:type:`struct dev_pm_domain` can be used to override the bus,
|
||||||
|
class or device type callbacks. It is intended for devices sharing
|
||||||
|
a single on/off switch, however it does not guarantee a specific
|
||||||
|
suspend/resume ordering, this needs to be implemented separately.
|
||||||
|
It also does not by itself track the runtime PM status of the involved
|
||||||
|
devices and turn off the power switch only when all of them are runtime
|
||||||
|
suspended. Furthermore it cannot be used to enforce a specific shutdown
|
||||||
|
ordering or a driver presence dependency.
|
||||||
|
|
||||||
|
* A :c:type:`struct generic_pm_domain` is a lot more heavyweight than a
|
||||||
|
device link and does not allow for shutdown ordering or driver presence
|
||||||
|
dependencies. It also cannot be used on ACPI systems.
|
||||||
|
|
||||||
|
Implementation
|
||||||
|
==============
|
||||||
|
|
||||||
|
The device hierarchy, which -- as the name implies -- is a tree,
|
||||||
|
becomes a directed acyclic graph once device links are added.
|
||||||
|
|
||||||
|
Ordering of these devices during suspend/resume is determined by the
|
||||||
|
dpm_list. During shutdown it is determined by the devices_kset. With
|
||||||
|
no device links present, the two lists are flattened, one-dimensional
|
||||||
|
representations of the device tree such that a device is placed behind
|
||||||
|
all its ancestors. That is achieved by traversing the ACPI namespace
|
||||||
|
or OpenFirmware device tree top-down and appending devices to the lists
|
||||||
|
as they are discovered.
|
||||||
|
|
||||||
|
Once device links are added, the lists need to satisfy the additional
|
||||||
|
constraint that a device is placed behind all its suppliers, recursively.
|
||||||
|
To ensure this, upon addition of the device link the consumer and the
|
||||||
|
entire sub-graph below it (all children and consumers of the consumer)
|
||||||
|
are moved to the end of the list. (Call to :c:func:`device_reorder_to_tail()`
|
||||||
|
from :c:func:`device_link_add()`.)
|
||||||
|
|
||||||
|
To prevent introduction of dependency loops into the graph, it is
|
||||||
|
verified upon device link addition that the supplier is not dependent
|
||||||
|
on the consumer or any children or consumers of the consumer.
|
||||||
|
(Call to :c:func:`device_is_dependent()` from :c:func:`device_link_add()`.)
|
||||||
|
If that constraint is violated, :c:func:`device_link_add()` will return
|
||||||
|
``NULL`` and a ``WARNING`` will be logged.
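
The loop check described above can be pictured with a small userspace model. This is only a sketch under stated assumptions: ``struct model_dev``, ``model_add_child()``, ``model_is_dependent()`` and ``model_link_add()`` are hypothetical names invented for illustration, not the driver core's data structures, and the fixed-size arrays stand in for the kernel's lists and locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_EDGES 8

/* Hypothetical userspace stand-in for a device; this is not struct device. */
struct model_dev {
	const char *name;
	struct model_dev *children[MODEL_MAX_EDGES];
	size_t nr_children;
	struct model_dev *consumers[MODEL_MAX_EDGES];
	size_t nr_consumers;
};

/* Attach @child below @parent in the tree-shaped device hierarchy. */
static void model_add_child(struct model_dev *parent, struct model_dev *child)
{
	assert(parent->nr_children < MODEL_MAX_EDGES);
	parent->children[parent->nr_children++] = child;
}

/*
 * Return true if @target is @dev itself or can be reached from @dev by
 * following child and consumer edges, i.e. if @target depends on @dev.
 */
static bool model_is_dependent(const struct model_dev *dev,
			       const struct model_dev *target)
{
	size_t i;

	if (dev == target)
		return true;
	for (i = 0; i < dev->nr_children; i++)
		if (model_is_dependent(dev->children[i], target))
			return true;
	for (i = 0; i < dev->nr_consumers; i++)
		if (model_is_dependent(dev->consumers[i], target))
			return true;
	return false;
}

/*
 * Record a link from @consumer to @supplier. Returns 0 on success and -1
 * if the supplier already depends on the consumer, which would form a loop.
 */
static int model_link_add(struct model_dev *consumer,
			  struct model_dev *supplier)
{
	if (model_is_dependent(consumer, supplier))
		return -1;
	assert(supplier->nr_consumers < MODEL_MAX_EDGES);
	supplier->consumers[supplier->nr_consumers++] = consumer;
	return 0;
}
```

In this model a link with a parent as consumer and its child as supplier is refused, while the converse is accepted, which matches the parent/child behavior the text goes on to describe.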
|
||||||
|
|
||||||
|
Notably this also prevents the addition of a device link from a parent
|
||||||
|
device to a child. However the converse is allowed, i.e. a device link
|
||||||
|
from a child to a parent. Since the driver core already guarantees
|
||||||
|
correct suspend/resume and shutdown ordering between parent and child,
|
||||||
|
such a device link only makes sense if a driver presence dependency is
|
||||||
|
needed on top of that. In this case driver authors should weigh
|
||||||
|
carefully if a device link is at all the right tool for the purpose.
|
||||||
|
A more suitable approach might be to simply use deferred probing or
|
||||||
|
add a device flag causing the parent driver to be probed before the
|
||||||
|
child one.
|
||||||
|
|
||||||
|
State machine
|
||||||
|
=============
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/device.h
|
||||||
|
:functions: device_link_state
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
.=============================.
|
||||||
|
| |
|
||||||
|
v |
|
||||||
|
DORMANT <=> AVAILABLE <=> CONSUMER_PROBE => ACTIVE
|
||||||
|
^ |
|
||||||
|
| |
|
||||||
|
'============ SUPPLIER_UNBIND <============'
|
||||||
|
|
||||||
|
* The initial state of a device link is automatically determined by
|
||||||
|
:c:func:`device_link_add()` based on the driver presence on the supplier
|
||||||
|
and consumer. If the link is created before any devices are probed, it
|
||||||
|
is set to ``DL_STATE_DORMANT``.
|
||||||
|
|
||||||
|
* When a supplier device is bound to a driver, links to its consumers
|
||||||
|
progress to ``DL_STATE_AVAILABLE``.
|
||||||
|
(Call to :c:func:`device_links_driver_bound()` from
|
||||||
|
:c:func:`driver_bound()`.)
|
||||||
|
|
||||||
|
* Before a consumer device is probed, presence of supplier drivers is
|
||||||
|
verified by checking that links to suppliers are in ``DL_STATE_AVAILABLE``
|
||||||
|
state. The state of the links is updated to ``DL_STATE_CONSUMER_PROBE``.
|
||||||
|
(Call to :c:func:`device_links_check_suppliers()` from
|
||||||
|
:c:func:`really_probe()`.)
|
||||||
|
This prevents the supplier from unbinding.
|
||||||
|
(Call to :c:func:`wait_for_device_probe()` from
|
||||||
|
:c:func:`device_links_unbind_consumers()`.)
|
||||||
|
|
||||||
|
* If the probe fails, links to suppliers revert back to ``DL_STATE_AVAILABLE``.
|
||||||
|
(Call to :c:func:`device_links_no_driver()` from :c:func:`really_probe()`.)
|
||||||
|
|
||||||
|
* If the probe succeeds, links to suppliers progress to ``DL_STATE_ACTIVE``.
|
||||||
|
(Call to :c:func:`device_links_driver_bound()` from :c:func:`driver_bound()`.)
|
||||||
|
|
||||||
|
* When the consumer's driver is later on removed, links to suppliers revert
|
||||||
|
back to ``DL_STATE_AVAILABLE``.
|
||||||
|
(Call to :c:func:`__device_links_no_driver()` from
|
||||||
|
:c:func:`device_links_driver_cleanup()`, which in turn is called from
|
||||||
|
:c:func:`__device_release_driver()`.)
|
||||||
|
|
||||||
|
* Before a supplier's driver is removed, links to consumers that are not
|
||||||
|
bound to a driver are updated to ``DL_STATE_SUPPLIER_UNBIND``.
|
||||||
|
(Call to :c:func:`device_links_busy()` from
|
||||||
|
:c:func:`__device_release_driver()`.)
|
||||||
|
This prevents the consumers from binding.
|
||||||
|
(Call to :c:func:`device_links_check_suppliers()` from
|
||||||
|
:c:func:`really_probe()`.)
|
||||||
|
Consumers that are bound are freed from their driver; consumers that are
|
||||||
|
probing are waited for until they are done.
|
||||||
|
(Call to :c:func:`device_links_unbind_consumers()` from
|
||||||
|
:c:func:`__device_release_driver()`.)
|
||||||
|
Once all links to consumers are in ``DL_STATE_SUPPLIER_UNBIND`` state,
|
||||||
|
the supplier driver is released and the links revert to ``DL_STATE_DORMANT``.
|
||||||
|
(Call to :c:func:`device_links_driver_cleanup()` from
|
||||||
|
:c:func:`__device_release_driver()`.)
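
The transitions listed above can be summarized as a lookup table. The following is a userspace sketch, not kernel code: the ``MODEL_*`` names and ``model_link_set_state()`` are hypothetical, and the table encodes only the transitions spelled out in the bullets plus the ACTIVE to SUPPLIER_UNBIND edge shown in the diagram.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the link states named in the text; names are invented. */
enum model_link_state {
	MODEL_DORMANT,
	MODEL_AVAILABLE,
	MODEL_CONSUMER_PROBE,
	MODEL_ACTIVE,
	MODEL_SUPPLIER_UNBIND,
	MODEL_NR_STATES
};

/* model_allowed[from][to] is true if the text describes that transition. */
static const bool model_allowed[MODEL_NR_STATES][MODEL_NR_STATES] = {
	/* supplier bound to a driver */
	[MODEL_DORMANT][MODEL_AVAILABLE] = true,
	/* consumer probe starts */
	[MODEL_AVAILABLE][MODEL_CONSUMER_PROBE] = true,
	/* consumer probe fails */
	[MODEL_CONSUMER_PROBE][MODEL_AVAILABLE] = true,
	/* consumer probe succeeds */
	[MODEL_CONSUMER_PROBE][MODEL_ACTIVE] = true,
	/* consumer driver removed */
	[MODEL_ACTIVE][MODEL_AVAILABLE] = true,
	/* supplier unbinding, consumer not bound */
	[MODEL_AVAILABLE][MODEL_SUPPLIER_UNBIND] = true,
	/* supplier unbinding, consumer bound (edge taken from the diagram) */
	[MODEL_ACTIVE][MODEL_SUPPLIER_UNBIND] = true,
	/* supplier driver released */
	[MODEL_SUPPLIER_UNBIND][MODEL_DORMANT] = true,
};

/*
 * Move @state to @next if the table allows it. Returns 0 on success and
 * -1, leaving @state untouched, if the transition is not listed.
 */
static int model_link_set_state(enum model_link_state *state,
				enum model_link_state next)
{
	assert(*state < MODEL_NR_STATES && next < MODEL_NR_STATES);
	if (!model_allowed[*state][next])
		return -1;
	*state = next;
	return 0;
}
```

A table keeps the model easy to compare against the diagram; the kernel itself updates the state at the call sites named in the bullets rather than through a single helper like this.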
|
||||||
|
|
||||||
|
API
|
||||||
|
===
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/base/core.c
|
||||||
|
:functions: device_link_add device_link_del
|
||||||
73
Documentation/driver-api/dma-buf.rst
Normal file
@@ -0,0 +1,73 @@
|
|||||||
|
Buffer Sharing and Synchronization
|
||||||
|
==================================
|
||||||
|
|
||||||
|
The dma-buf subsystem provides the framework for sharing buffers for
|
||||||
|
hardware (DMA) access across multiple device drivers and subsystems, and
|
||||||
|
for synchronizing asynchronous hardware access.
|
||||||
|
|
||||||
|
This is used, for example, by drm "prime" multi-GPU support, but is of
|
||||||
|
course not limited to GPU use cases.
|
||||||
|
|
||||||
|
The three main components of this are: (1) dma-buf, representing a
|
||||||
|
sg_table and exposed to userspace as a file descriptor to allow passing
|
||||||
|
between devices, (2) fence, which provides a mechanism to signal when
|
||||||
|
one device has finished access, and (3) reservation, which manages the
|
||||||
|
shared or exclusive fence(s) associated with the buffer.
|
||||||
|
|
||||||
|
Shared DMA Buffers
|
||||||
|
------------------
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/dma-buf/dma-buf.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/dma-buf.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
Reservation Objects
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/dma-buf/reservation.c
|
||||||
|
:doc: Reservation Object Overview
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/dma-buf/reservation.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/reservation.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
DMA Fences
|
||||||
|
----------
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/dma-buf/dma-fence.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/dma-fence.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
Seqno Hardware Fences
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/dma-buf/seqno-fence.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/seqno-fence.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
DMA Fence Array
|
||||||
|
~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/dma-buf/dma-fence-array.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/dma-fence-array.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
DMA Fence uABI/Sync File
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/dma-buf/sync_file.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/sync_file.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
@@ -16,11 +16,23 @@ available subsections can be seen below.
|
|||||||
|
|
||||||
basics
|
basics
|
||||||
infrastructure
|
infrastructure
|
||||||
|
dma-buf
|
||||||
|
device_link
|
||||||
message-based
|
message-based
|
||||||
sound
|
sound
|
||||||
frame-buffer
|
frame-buffer
|
||||||
input
|
input
|
||||||
|
usb
|
||||||
spi
|
spi
|
||||||
i2c
|
i2c
|
||||||
hsi
|
hsi
|
||||||
miscellaneous
|
miscellaneous
|
||||||
|
vme
|
||||||
|
80211/index
|
||||||
|
|
||||||
|
.. only:: subproject and html
|
||||||
|
|
||||||
|
Indices
|
||||||
|
=======
|
||||||
|
|
||||||
|
* :ref:`genindex`
|
||||||
|
|||||||
@@ -46,76 +46,6 @@ Device Drivers Base
|
|||||||
.. kernel-doc:: drivers/base/bus.c
|
.. kernel-doc:: drivers/base/bus.c
|
||||||
:export:
|
:export:
|
||||||
|
|
||||||
Buffer Sharing and Synchronization
|
|
||||||
----------------------------------
|
|
||||||
|
|
||||||
The dma-buf subsystem provides the framework for sharing buffers for
|
|
||||||
hardware (DMA) access across multiple device drivers and subsystems, and
|
|
||||||
for synchronizing asynchronous hardware access.
|
|
||||||
|
|
||||||
This is used, for example, by drm "prime" multi-GPU support, but is of
|
|
||||||
course not limited to GPU use cases.
|
|
||||||
|
|
||||||
The three main components of this are: (1) dma-buf, representing a
|
|
||||||
sg_table and exposed to userspace as a file descriptor to allow passing
|
|
||||||
between devices, (2) fence, which provides a mechanism to signal when
|
|
||||||
one device as finished access, and (3) reservation, which manages the
|
|
||||||
shared or exclusive fence(s) associated with the buffer.
|
|
||||||
|
|
||||||
dma-buf
|
|
||||||
~~~~~~~
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/dma-buf.c
|
|
||||||
:export:
|
|
||||||
|
|
||||||
.. kernel-doc:: include/linux/dma-buf.h
|
|
||||||
:internal:
|
|
||||||
|
|
||||||
reservation
|
|
||||||
~~~~~~~~~~~
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/reservation.c
|
|
||||||
:doc: Reservation Object Overview
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/reservation.c
|
|
||||||
:export:
|
|
||||||
|
|
||||||
.. kernel-doc:: include/linux/reservation.h
|
|
||||||
:internal:
|
|
||||||
|
|
||||||
fence
|
|
||||||
~~~~~
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/fence.c
|
|
||||||
:export:
|
|
||||||
|
|
||||||
.. kernel-doc:: include/linux/fence.h
|
|
||||||
:internal:
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/seqno-fence.c
|
|
||||||
:export:
|
|
||||||
|
|
||||||
.. kernel-doc:: include/linux/seqno-fence.h
|
|
||||||
:internal:
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/fence-array.c
|
|
||||||
:export:
|
|
||||||
|
|
||||||
.. kernel-doc:: include/linux/fence-array.h
|
|
||||||
:internal:
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/reservation.c
|
|
||||||
:export:
|
|
||||||
|
|
||||||
.. kernel-doc:: include/linux/reservation.h
|
|
||||||
:internal:
|
|
||||||
|
|
||||||
.. kernel-doc:: drivers/dma-buf/sync_file.c
|
|
||||||
:export:
|
|
||||||
|
|
||||||
.. kernel-doc:: include/linux/sync_file.h
|
|
||||||
:internal:
|
|
||||||
|
|
||||||
Device Drivers DMA Management
|
Device Drivers DMA Management
|
||||||
-----------------------------
|
-----------------------------
|
||||||
|
|
||||||
|
|||||||
748
Documentation/driver-api/usb.rst
Normal file
@@ -0,0 +1,748 @@
|
|||||||
|
===========================
|
||||||
|
The Linux-USB Host Side API
|
||||||
|
===========================
|
||||||
|
|
||||||
|
Introduction to USB on Linux
|
||||||
|
============================
|
||||||
|
|
||||||
|
A Universal Serial Bus (USB) is used to connect a host, such as a PC or
|
||||||
|
workstation, to a number of peripheral devices. USB uses a tree
|
||||||
|
structure, with the host as the root (the system's master), hubs as
|
||||||
|
interior nodes, and peripherals as leaves (and slaves). Modern PCs
|
||||||
|
support several such trees of USB devices, usually
|
||||||
|
a few USB 3.0 (5 GBit/s) or USB 3.1 (10 GBit/s) and some legacy
|
||||||
|
USB 2.0 (480 MBit/s) busses just in case.
|
||||||
|
|
||||||
|
That master/slave asymmetry was designed-in for a number of reasons, one
|
||||||
|
being ease of use. It is not physically possible to mistake upstream and
|
||||||
|
downstream or it does not matter with a type C plug (or they are built into the
|
||||||
|
peripheral). Also, the host software doesn't need to deal with
|
||||||
|
distributed auto-configuration since the pre-designated master node
|
||||||
|
manages all that.
|
||||||
|
|
||||||
|
Kernel developers added USB support to Linux early in the 2.2 kernel
|
||||||
|
series and have been developing it further since then. Besides support
|
||||||
|
for each new generation of USB, various host controllers gained support,
|
||||||
|
new drivers for peripherals have been added and advanced features for latency
|
||||||
|
measurement and improved power management introduced.
|
||||||
|
|
||||||
|
Linux can run inside USB devices as well as on the hosts that control
|
||||||
|
the devices. But USB device drivers running inside those peripherals
|
||||||
|
don't do the same things as the ones running inside hosts, so they've
|
||||||
|
been given a different name: *gadget drivers*. This document does not
|
||||||
|
cover gadget drivers.
|
||||||
|
|
||||||
|
USB Host-Side API Model
|
||||||
|
=======================
|
||||||
|
|
||||||
|
Host-side drivers for USB devices talk to the "usbcore" APIs. There are
|
||||||
|
two. One is intended for *general-purpose* drivers (exposed through
|
||||||
|
driver frameworks), and the other is for drivers that are *part of the
|
||||||
|
core*. Such core drivers include the *hub* driver (which manages trees
|
||||||
|
of USB devices) and several different kinds of *host controller
|
||||||
|
drivers*, which control individual busses.
|
||||||
|
|
||||||
|
The device model seen by USB drivers is relatively complex.
|
||||||
|
|
||||||
|
- USB supports four kinds of data transfers (control, bulk, interrupt,
|
||||||
|
and isochronous). Two of them (control and bulk) use bandwidth as
|
||||||
|
it's available, while the other two (interrupt and isochronous) are
|
||||||
|
scheduled to provide guaranteed bandwidth.
|
||||||
|
|
||||||
|
- The device description model includes one or more "configurations"
|
||||||
|
per device, only one of which is active at a time. Devices are supposed
|
||||||
|
to be capable of operating at lower than their top
|
||||||
|
speeds and may provide a BOS descriptor showing the lowest speed they
|
||||||
|
remain fully operational at.
|
||||||
|
|
||||||
|
- From USB 3.0 on, configurations have one or more "functions", which
|
||||||
|
provide a common functionality and are grouped together for purposes
|
||||||
|
of power management.
|
||||||
|
|
||||||
|
- Configurations or functions have one or more "interfaces", each of which may have
|
||||||
|
"alternate settings". Interfaces may be standardized by USB "Class"
|
||||||
|
specifications, or may be specific to a vendor or device.
|
||||||
|
|
||||||
|
USB device drivers actually bind to interfaces, not devices. Think of
|
||||||
|
them as "interface drivers", though you may not see many devices
|
||||||
|
where the distinction is important. *Most USB devices are simple,
|
||||||
|
with only one function, one configuration, one interface, and one alternate
|
||||||
|
setting.*
|
||||||
|
|
||||||
|
- Interfaces have one or more "endpoints", each of which supports one
|
||||||
|
type and direction of data transfer such as "bulk out" or "interrupt
|
||||||
|
in". The entire configuration may have up to sixteen endpoints in
|
||||||
|
each direction, allocated as needed among all the interfaces.
|
||||||
|
|
||||||
|
- Data transfer on USB is packetized; each endpoint has a maximum
|
||||||
|
packet size. Drivers must often be aware of conventions such as
|
||||||
|
flagging the end of bulk transfers using "short" (including zero
|
||||||
|
length) packets.
|
||||||
|
|
||||||
|
- The Linux USB API supports synchronous calls for control and bulk
|
||||||
|
messages. It also supports asynchronous calls for all kinds of data
|
||||||
|
transfer, using request structures called "URBs" (USB Request
|
||||||
|
Blocks).
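
The endpoint properties in the list above (number, direction, transfer type) are packed into two bytes of each endpoint descriptor. The following userspace sketch decodes them. The bit positions follow chapter 9 of the USB specification; the ``MODEL_*`` macros and ``model_ep_*()`` helpers are hypothetical names chosen for this example, although ``<linux/usb/ch9.h>`` defines equivalent masks.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout of bEndpointAddress and bmAttributes per USB chapter 9. */
#define MODEL_EP_DIR_IN		0x80	/* bit 7 set: device to host */
#define MODEL_EP_NUMBER_MASK	0x0f	/* bits 3..0: endpoint number */
#define MODEL_EP_XFERTYPE_MASK	0x03	/* bits 1..0: transfer type */

enum model_ep_xfer {
	MODEL_EP_XFER_CONTROL = 0,
	MODEL_EP_XFER_ISOC = 1,
	MODEL_EP_XFER_BULK = 2,
	MODEL_EP_XFER_INT = 3,
};

/* Endpoint number (0..15) encoded in bEndpointAddress. */
static unsigned int model_ep_number(uint8_t bEndpointAddress)
{
	return bEndpointAddress & MODEL_EP_NUMBER_MASK;
}

/* Nonzero for an IN endpoint, zero for an OUT endpoint. */
static int model_ep_is_in(uint8_t bEndpointAddress)
{
	return (bEndpointAddress & MODEL_EP_DIR_IN) != 0;
}

/* Transfer type encoded in the low two bits of bmAttributes. */
static enum model_ep_xfer model_ep_xfer_type(uint8_t bmAttributes)
{
	return (enum model_ep_xfer)(bmAttributes & MODEL_EP_XFERTYPE_MASK);
}

/* Interrupt and isochronous endpoints are scheduled ("periodic"). */
static int model_ep_is_periodic(uint8_t bmAttributes)
{
	enum model_ep_xfer type = model_ep_xfer_type(bmAttributes);

	return type == MODEL_EP_XFER_INT || type == MODEL_EP_XFER_ISOC;
}

/* Human-readable name such as "bulk" for diagnostics. */
static const char *model_ep_xfer_name(uint8_t bmAttributes)
{
	static const char *const names[] = {
		"control", "isochronous", "bulk", "interrupt",
	};

	return names[bmAttributes & MODEL_EP_XFERTYPE_MASK];
}
```

For instance, an address byte of ``0x81`` with attributes ``0x02`` describes endpoint 1, direction IN, transfer type bulk, i.e. a "bulk in" endpoint.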
|
||||||
|
|
||||||
|
Accordingly, the USB Core API exposed to device drivers covers quite a
|
||||||
|
lot of territory. You'll probably need to consult the USB 3.0
|
||||||
|
specification, available online from www.usb.org at no cost, as well as
|
||||||
|
class or device specifications.
|
||||||
|
|
||||||
|
The only host-side drivers that actually touch hardware (reading/writing
|
||||||
|
registers, handling IRQs, and so on) are the HCDs. In theory, all HCDs
|
||||||
|
provide the same functionality through the same API. In practice, that's
|
||||||
|
becoming more true, but there are still differences
|
||||||
|
that crop up especially with fault handling on the less common controllers.
|
||||||
|
Different controllers don't
|
||||||
|
necessarily report the same aspects of failures, and recovery from
|
||||||
|
faults (including software-induced ones like unlinking an URB) isn't yet
|
||||||
|
fully consistent. Device driver authors should make a point of doing
|
||||||
|
disconnect testing (while the device is active) with each different host
|
||||||
|
controller driver, to make sure drivers don't have bugs of their own as
|
||||||
|
well as to make sure they aren't relying on some HCD-specific behavior.
|
||||||
|
|
||||||
|
USB-Standard Types
|
||||||
|
==================
|
||||||
|
|
||||||
|
In ``<linux/usb/ch9.h>`` you will find the USB data types defined in
|
||||||
|
chapter 9 of the USB specification. These data types are used throughout
|
||||||
|
USB, and in APIs including this host side API, gadget APIs, and usbfs.
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/usb/ch9.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
Host-Side Data Types and Macros
|
||||||
|
===============================
|
||||||
|
|
||||||
|
The host side API exposes several layers to drivers, some of which are
|
||||||
|
more necessary than others. These support lifecycle models for host side
|
||||||
|
drivers and devices, and support passing buffers through usbcore to some
|
||||||
|
HCD that performs the I/O for the device driver.
|
||||||
|
|
||||||
|
.. kernel-doc:: include/linux/usb.h
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
USB Core APIs
|
||||||
|
=============
|
||||||
|
|
||||||
|
There are two basic I/O models in the USB API. The most elemental one is
|
||||||
|
asynchronous: drivers submit requests in the form of an URB, and the
|
||||||
|
URB's completion callback handles the next step. All USB transfer types
|
||||||
|
support that model, although there are special cases for control URBs
|
||||||
|
(which always have setup and status stages, but may not have a data
|
||||||
|
stage) and isochronous URBs (which allow large packets and include
|
||||||
|
per-packet fault reports). Built on top of that is synchronous API
|
||||||
|
support, where a driver calls a routine that allocates one or more URBs,
|
||||||
|
submits them, and waits until they complete. There are synchronous
|
||||||
|
wrappers for single-buffer control and bulk transfers (which are awkward
|
||||||
|
to use in some driver disconnect scenarios), and for scatterlist based
|
||||||
|
streaming i/o (bulk or interrupt).
|
||||||
|
|
||||||
|
USB drivers need to provide buffers that can be used for DMA, although
|
||||||
|
they don't necessarily need to provide the DMA mapping themselves. There
|
||||||
|
are APIs to use when allocating DMA buffers, which can prevent use
|
||||||
|
of bounce buffers on some systems. In some cases, drivers may be able to
|
||||||
|
rely on 64-bit DMA to eliminate another kind of bounce buffer.
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/urb.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/message.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/file.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/driver.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/usb.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/hub.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
Host Controller APIs
|
||||||
|
====================
|
||||||
|
|
||||||
|
These APIs are only for use by host controller drivers, most of which
|
||||||
|
implement standard register interfaces such as XHCI, EHCI, OHCI, or UHCI. UHCI
|
||||||
|
was one of the first interfaces, designed by Intel and also used by VIA;
|
||||||
|
it doesn't do much in hardware. OHCI was designed later, to have the
|
||||||
|
hardware do more work (bigger transfers, tracking protocol state, and so
|
||||||
|
on). EHCI was designed with USB 2.0; its design has features that
|
||||||
|
resemble OHCI (hardware does much more work) as well as UHCI (some parts
|
||||||
|
of ISO support, TD list processing). XHCI was designed with USB 3.0. It
|
||||||
|
continues to shift support for functionality into hardware.
|
||||||
|
|
||||||
|
There are host controllers other than the "big three", although most PCI
|
||||||
|
based controllers (and a few non-PCI based ones) use one of those
|
||||||
|
interfaces. Not all host controllers use DMA; some use PIO, and there is
|
||||||
|
also a simulator and a virtual host controller to pipe USB over the network.
|
||||||
|
|
||||||
|
The same basic APIs are available to drivers for all those controllers.
|
||||||
|
For historical reasons they are in two layers: :c:type:`struct
|
||||||
|
usb_bus <usb_bus>` is a rather thin layer that became available
|
||||||
|
in the 2.2 kernels, while :c:type:`struct usb_hcd <usb_hcd>`
|
||||||
|
is a more featureful layer
|
||||||
|
that lets HCDs share common code, to shrink driver size and
|
||||||
|
significantly reduce hcd-specific behaviors.
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/hcd.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/hcd-pci.c
|
||||||
|
:export:
|
||||||
|
|
||||||
|
.. kernel-doc:: drivers/usb/core/buffer.c
|
||||||
|
:internal:
|
||||||
|
|
||||||
|
The USB Filesystem (usbfs)
|
||||||
|
==========================
|
||||||
|
|
||||||
|
This chapter presents the Linux *usbfs*. You may prefer to avoid writing
|
||||||
|
new kernel code for your USB driver; that's the problem that usbfs set
|
||||||
|
out to solve. User mode device drivers are usually packaged as
|
||||||
|
applications or libraries, and may use usbfs through some programming
|
||||||
|
library that wraps it. Such libraries include
|
||||||
|
`libusb <http://libusb.sourceforge.net>`__ for C/C++, and
|
||||||
|
`jUSB <http://jUSB.sourceforge.net>`__ for Java.
|
||||||
|
|
||||||
|
**Note**
|
||||||
|
|
||||||
|
This particular documentation is incomplete, especially with respect
|
||||||
|
to the asynchronous mode. As of kernel 2.5.66 the code and this
|
||||||
|
(new) documentation need to be cross-reviewed.
|
||||||
|
|
||||||
|
Configure usbfs into Linux kernels by enabling the *USB filesystem*
|
||||||
|
option (CONFIG_USB_DEVICEFS), and you get basic support for user mode
|
||||||
|
USB device drivers. Until relatively recently it was often (confusingly)
|
||||||
|
called *usbdevfs* although it wasn't solving what *devfs* was. Every USB
|
||||||
|
device will appear in usbfs, regardless of whether or not it has a
|
||||||
|
kernel driver.
|
||||||
|
|
||||||
|
What files are in "usbfs"?
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
Conventionally mounted at ``/proc/bus/usb``, usbfs features include:
|
||||||
|
|
||||||
|
- ``/proc/bus/usb/devices`` ... a text file showing each of the USB
|
||||||
|
devices known to the kernel, and their configuration descriptors.
|
||||||
|
You can also poll() this to learn about new devices.
|
||||||
|
|
||||||
|
- ``/proc/bus/usb/BBB/DDD`` ... magic files exposing each device's
|
||||||
|
configuration descriptors, and supporting a series of ioctls for
|
||||||
|
making device requests, including I/O to devices. (Purely for access
|
||||||
|
by programs.)
|
||||||
|
|
||||||
|
Each bus is given a number (BBB) based on when it was enumerated; within
|
||||||
|
each bus, each device is given a similar number (DDD). Those BBB/DDD
|
||||||
|
paths are not "stable" identifiers; expect them to change even if you
|
||||||
|
always leave the devices plugged in to the same hub port. *Don't even
|
||||||
|
think of saving these in application configuration files.* Stable
|
||||||
|
identifiers are available, for user mode applications that want to use
|
||||||
|
them. HID and networking devices expose these stable IDs, so that for
|
||||||
|
example you can be sure that you told the right UPS to power down its
|
||||||
|
second server. "usbfs" doesn't (yet) expose those IDs.
|
||||||
|
|
||||||
|
Mounting and Access Control
|
||||||
|
---------------------------
|
||||||
|
|
||||||
|
There are a number of mount options for usbfs, which will be of most
|
||||||
|
interest to you if you need to override the default access control
|
||||||
|
policy. That policy is that only root may read or write device files
|
||||||
|
(``/proc/bus/usb/BBB/DDD``) although anyone may read the ``devices`` or
|
||||||
|
``drivers`` files. I/O requests to the device also need the
|
||||||
|
CAP_SYS_RAWIO capability.
|
||||||
|
|
||||||
|
The significance of that is that by default, all user mode device
|
||||||
|
drivers need super-user privileges. You can change modes or ownership in
|
||||||
|
a driver setup when the device hotplugs, or maybe just start the driver
|
||||||
|
right then, as a privileged server (or some activity within one). That's
|
||||||
|
the most secure approach for multi-user systems, but for single user
|
||||||
|
systems ("trusted" by that user) it's more convenient just to grant
|
||||||
|
everyone all access (using the *devmode=0666* option) so the driver can
|
||||||
|
start whenever it's needed.
|
||||||
|
|
||||||
|
The mount options for usbfs, usable in /etc/fstab or in command line
|
||||||
|
invocations of *mount*, are:
|
||||||
|
|
||||||
|
*busgid*\ =NNNNN
|
||||||
|
Controls the GID used for the /proc/bus/usb/BBB directories.
|
||||||
|
(Default: 0)
|
||||||
|
|
||||||
|
*busmode*\ =MMM
|
||||||
|
Controls the file mode used for the /proc/bus/usb/BBB directories.
|
||||||
|
(Default: 0555)
|
||||||
|
|
||||||
|
*busuid*\ =NNNNN
|
||||||
|
Controls the UID used for the /proc/bus/usb/BBB directories.
|
||||||
|
(Default: 0)
|
||||||
|
|
||||||
|
*devgid*\ =NNNNN
|
||||||
|
Controls the GID used for the /proc/bus/usb/BBB/DDD files. (Default:
|
||||||
|
0)
|
||||||
|
|
||||||
|
*devmode*\ =MMM
|
||||||
|
Controls the file mode used for the /proc/bus/usb/BBB/DDD files.
|
||||||
|
(Default: 0644)
|
||||||
|
|
||||||
|
*devuid*\ =NNNNN
|
||||||
|
Controls the UID used for the /proc/bus/usb/BBB/DDD files. (Default:
|
||||||
|
0)
|
||||||
|
|
||||||
|
*listgid*\ =NNNNN
|
||||||
|
Controls the GID used for the /proc/bus/usb/devices and drivers
|
||||||
|
files. (Default: 0)
|
||||||
|
|
||||||
|
*listmode*\ =MMM
|
||||||
|
Controls the file mode used for the /proc/bus/usb/devices and
|
||||||
|
drivers files. (Default: 0444)
|
||||||
|
|
||||||
|
*listuid*\ =NNNNN
|
||||||
|
Controls the UID used for the /proc/bus/usb/devices and drivers
|
||||||
|
files. (Default: 0)
|
||||||
|
|
||||||
|
Note that many Linux distributions hard-wire the mount options for usbfs
|
||||||
|
in their init scripts, such as ``/etc/rc.d/rc.sysinit``, rather than
|
||||||
|
making it easy to set this per-system policy in ``/etc/fstab``.
|
||||||
|
|
||||||
|
/proc/bus/usb/devices
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
This file is handy for status viewing tools in user mode, which can scan
|
||||||
|
the text format and ignore most of it. More detailed device status
|
||||||
|
(including class and vendor status) is available from device-specific
|
||||||
|
files. For information about the current format of this file, see the
|
||||||
|
``Documentation/usb/proc_usb_info.txt`` file in your Linux kernel
|
||||||
|
sources.
|
||||||
|
|
||||||
|
This file, in combination with the poll() system call, can also be used
|
||||||
|
to detect when devices are added or removed:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
int fd;
|
||||||
|
struct pollfd pfd;
|
||||||
|
|
||||||
|
fd = open("/proc/bus/usb/devices", O_RDONLY);
|
||||||
|
pfd = (struct pollfd){ fd, POLLIN, 0 };
|
||||||
|
for (;;) {
|
||||||
|
/* The first time through, this call will return immediately. */
|
||||||
|
poll(&pfd, 1, -1);
|
||||||
|
|
||||||
|
/* To see what's changed, compare the file's previous and current
|
||||||
|
contents or scan the filesystem. (Scanning is more precise.) */
|
||||||
|
}
|
||||||
|
|
||||||
|
Note that this behavior is intended to be used for informational and
|
||||||
|
debug purposes. It would be more appropriate to use programs such as
|
||||||
|
udev or HAL to initialize a device or start a user-mode helper program,
|
||||||
|
for instance.
|
||||||
|
|
||||||
|
/proc/bus/usb/BBB/DDD
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
Use these files in one of these basic ways:
|
||||||
|
|
||||||
|
*They can be read,* producing first the device descriptor (18 bytes) and
|
||||||
|
then the descriptors for the current configuration. See the USB 2.0 spec
|
||||||
|
for details about those binary data formats. You'll need to convert most
|
||||||
|
multibyte values from little endian format to your native host byte
|
||||||
|
order, although a few of the fields in the device descriptor (both of
|
||||||
|
the BCD-encoded fields, and the vendor and product IDs) will be
|
||||||
|
byteswapped for you. Note that configuration descriptors include
|
||||||
|
descriptors for interfaces, altsettings, endpoints, and maybe additional
|
||||||
|
class descriptors.
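
As a rough illustration of the layout described above, the following userspace sketch walks a buffer of configuration data and converts little-endian fields by hand. It assumes only the generic descriptor framing from chapter 9 of the USB specification (each descriptor starts with ``bLength`` and ``bDescriptorType``); the helper names ``model_get_le16()`` and ``model_count_descriptors()`` and the ``MODEL_DT_*`` macros are hypothetical, and reading the buffer from a usbfs file is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Descriptor type codes from chapter 9 of the USB specification. */
#define MODEL_DT_INTERFACE	0x04
#define MODEL_DT_ENDPOINT	0x05

struct model_desc_counts {
	unsigned int interfaces;	/* one per interface altsetting */
	unsigned int endpoints;
	unsigned int other;		/* configuration, class-specific, ... */
};

/* Convert a little-endian 16-bit field to host byte order. */
static uint16_t model_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Walk @len bytes of descriptors in @buf and count them by type.
 * Returns 0 on success, -1 if a descriptor is truncated or malformed.
 */
static int model_count_descriptors(const uint8_t *buf, size_t len,
				   struct model_desc_counts *out)
{
	size_t off = 0;

	out->interfaces = 0;
	out->endpoints = 0;
	out->other = 0;

	while (off < len) {
		uint8_t blen, type;

		if (len - off < 2)
			return -1;	/* no room for the 2-byte header */
		blen = buf[off];
		type = buf[off + 1];
		if (blen < 2 || blen > len - off)
			return -1;	/* bogus or truncated descriptor */

		switch (type) {
		case MODEL_DT_INTERFACE:
			out->interfaces++;
			break;
		case MODEL_DT_ENDPOINT:
			out->endpoints++;
			break;
		default:
			out->other++;
			break;
		}
		off += blen;
	}
	return 0;
}
```

Because every altsetting carries its own interface descriptor, the interface count here is the number of interface descriptors seen, not the number of distinct interfaces. Multibyte fields such as ``wTotalLength`` (bytes 2 and 3 of a configuration descriptor) can be read with ``model_get_le16()``.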
|
||||||
|
|
||||||
|
*Perform USB operations* using *ioctl()* requests to make endpoint I/O
|
||||||
|
requests (synchronously or asynchronously) or manage the device. These
|
||||||
|
requests need the CAP_SYS_RAWIO capability, as well as filesystem
|
||||||
|
access permissions. Only one ioctl request can be made on one of these
|
||||||
|
device files at a time. This means that if you are synchronously reading
|
||||||
|
an endpoint from one thread, you won't be able to write to a different
|
||||||
|
endpoint from another thread until the read completes. This works for
|
||||||
|
*half duplex* protocols, but otherwise you'd use asynchronous i/o
|
||||||
|
requests.
|
||||||
|
|
||||||
|
Life Cycle of User Mode Drivers
|
||||||
|
-------------------------------
|
||||||
|
|
||||||
|
Such a driver first needs to find a device file for a device it knows
|
||||||
|
how to handle. Maybe it was told about it because a ``/sbin/hotplug``
|
||||||
|
event handling agent chose that driver to handle the new device. Or
|
||||||
|
maybe it's an application that scans all the /proc/bus/usb device files,
|
||||||
|
and ignores most devices. In either case, it should :c:func:`read()`
|
||||||
|
all the descriptors from the device file, and check them against what it
|
||||||
|
knows how to handle. It might just reject everything except a particular
|
||||||
|
vendor and product ID, or need a more complex policy.
|
||||||
|
|
||||||
|
Never assume there will only be one such device on the system at a time!
|
||||||
|
If your code can't handle more than one device at a time, at least
|
||||||
|
detect when there's more than one, and have your users choose which
|
||||||
|
device to use.
|
||||||
|
|
||||||
|
Once your user mode driver knows what device to use, it interacts with
|
||||||
|
it in either of two styles. The simple style is to make only control
|
||||||
|
requests; some devices don't need more complex interactions than those.
|
||||||
|
(An example might be software using vendor-specific control requests for
|
||||||
|
some initialization or configuration tasks, with a kernel driver for the
|
||||||
|
rest.)
|
||||||
|
|
||||||
|
More likely, you need a more complex style driver: one using non-control
|
||||||
|
endpoints, reading or writing data and claiming exclusive use of an
|
||||||
|
interface. *Bulk* transfers are easiest to use, but only their sibling
|
||||||
|
*interrupt* transfers work with low speed devices. Both interrupt and
|
||||||
|
*isochronous* transfers offer service guarantees because their bandwidth
|
||||||
|
is reserved. Such "periodic" transfers are awkward to use through usbfs,
|
||||||
|
unless you're using the asynchronous calls. However, interrupt transfers
|
||||||
|
can also be used in a synchronous "one shot" style.
|
||||||
|
|
||||||
|
Your user-mode driver should never need to worry about cleaning up
|
||||||
|
request state when the device is disconnected, although it should close
|
||||||
|
its open file descriptors as soon as it starts seeing the ENODEV errors.
|
||||||
|
|
||||||
|
The ioctl() Requests
|
||||||
|
--------------------
|
||||||
|
|
||||||
|
To use these ioctls, you need to include the following headers in your
|
||||||
|
userspace program:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
#include <linux/usb.h>
|
||||||
|
#include <linux/usbdevice_fs.h>
|
||||||
|
#include <asm/byteorder.h>
|
||||||
|
|
||||||
|
The standard USB device model requests, from "Chapter 9" of the USB 2.0
|
||||||
|
specification, are automatically included from the ``<linux/usb/ch9.h>``
|
||||||
|
header.
|
||||||
|
|
||||||
|
Unless noted otherwise, the ioctl requests described here will update
|
||||||
|
the modification time on the usbfs file to which they are applied
|
||||||
|
(unless they fail). A return of zero indicates success; otherwise, a
|
||||||
|
standard USB error code is returned. (These are documented in
|
||||||
|
``Documentation/usb/error-codes.txt`` in your kernel sources.)
|
||||||
|
|
||||||
|
Each of these files multiplexes access to several I/O streams, one per
|
||||||
|
endpoint. Each device has one control endpoint (endpoint zero) which
|
||||||
|
supports limited RPC-style access. Devices are configured by
|
||||||
|
hub_wq (in the kernel) setting a device-wide *configuration* that
|
||||||
|
affects things like power consumption and basic functionality. The
|
||||||
|
endpoints are part of USB *interfaces*, which may have *altsettings*
|
||||||
|
affecting things like which endpoints are available. Many devices only
|
||||||
|
have a single configuration and interface, so drivers for them will
|
||||||
|
ignore configurations and altsettings.
|
||||||
|
|
||||||
|
Management/Status Requests
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
A number of usbfs requests don't deal very directly with device I/O.
|
||||||
|
They mostly relate to device management and status. These are all
|
||||||
|
synchronous requests.
|
||||||
|
|
||||||
|
USBDEVFS_CLAIMINTERFACE
|
||||||
|
This is used to force usbfs to claim a specific interface, which has
|
||||||
|
not previously been claimed by usbfs or any other kernel driver. The
|
||||||
|
ioctl parameter is an integer holding the number of the interface
|
||||||
|
(bInterfaceNumber from descriptor).
|
||||||
|
|
||||||
|
Note that if your driver doesn't claim an interface before trying to
|
||||||
|
use one of its endpoints, and no other driver has bound to it, then
|
||||||
|
the interface is automatically claimed by usbfs.
|
||||||
|
|
||||||
|
This claim will be released by a RELEASEINTERFACE ioctl, or by
|
||||||
|
closing the file descriptor. File modification time is not updated
|
||||||
|
by this request.
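The claim/release pair can be sketched as two thin wrappers. This is an illustration under assumptions, not a reference implementation: the wrapper names are hypothetical, the interface number is passed by pointer (our reading of how the request is declared in ``<linux/usbdevice_fs.h>``), and errors are reported as a negative ``errno`` value.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/usbdevice_fs.h>

/* Hypothetical wrapper: claim interface ifno on an open usbfs descriptor.
 * Returns 0 on success or a negative errno value on failure. */
static int claim_interface(int fd, unsigned int ifno)
{
    if (ioctl(fd, USBDEVFS_CLAIMINTERFACE, &ifno) < 0)
        return -errno;
    return 0;
}

/* Hypothetical wrapper: release a claim made earlier on the same descriptor. */
static int release_interface(int fd, unsigned int ifno)
{
    if (ioctl(fd, USBDEVFS_RELEASEINTERFACE, &ifno) < 0)
        return -errno;
    return 0;
}
```

Claiming explicitly, rather than relying on the automatic claim, lets the driver find out that another driver owns the interface before it starts any I/O.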
|
||||||
|
|
||||||
|
USBDEVFS_CONNECTINFO
|
||||||
|
Says whether the device is lowspeed. The ioctl parameter points to a
|
||||||
|
structure like this:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct usbdevfs_connectinfo {
|
||||||
|
unsigned int devnum;
|
||||||
|
unsigned char slow;
|
||||||
|
};
|
||||||
|
|
||||||
|
File modification time is not updated by this request.
|
||||||
|
|
||||||
|
*You can't tell whether a "not slow" device is connected at high
|
||||||
|
speed (480 MBit/sec) or just full speed (12 MBit/sec).* You should
|
||||||
|
know the devnum value already; it's the DDD value of the device file
|
||||||
|
name.
|
||||||
|
|
||||||
|
USBDEVFS_GETDRIVER
|
||||||
|
Returns the name of the kernel driver bound to a given interface (a
|
||||||
|
string). Parameter is a pointer to this structure, which is
|
||||||
|
modified:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct usbdevfs_getdriver {
|
||||||
|
unsigned int interface;
|
||||||
|
char driver[USBDEVFS_MAXDRIVERNAME + 1];
|
||||||
|
};
|
||||||
|
|
||||||
|
File modification time is not updated by this request.
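As a rough sketch, the driver name for an interface could be fetched as shown here. The function name and its buffer-copy convention are assumptions made for illustration; only the structure and the request code come from ``<linux/usbdevice_fs.h>``.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/usbdevice_fs.h>

/*
 * Hypothetical helper: copy the name of the kernel driver bound to
 * interface ifno into name (at most size bytes, always terminated).
 * Returns 0 on success or a negative errno value on failure.
 */
static int get_driver_name(int fd, unsigned int ifno, char *name, size_t size)
{
    struct usbdevfs_getdriver gd;

    memset(&gd, 0, sizeof(gd));
    gd.interface = ifno;
    if (ioctl(fd, USBDEVFS_GETDRIVER, &gd) < 0)
        return -errno;

    snprintf(name, size, "%s", gd.driver);
    return 0;
}
```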
|
||||||
|
|
||||||
|
USBDEVFS_IOCTL
|
||||||
|
Passes a request from userspace through to a kernel driver that has
|
||||||
|
an ioctl entry in the *struct usb_driver* it registered.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct usbdevfs_ioctl {
|
||||||
|
int ifno;
|
||||||
|
int ioctl_code;
|
||||||
|
void *data;
|
||||||
|
};
|
||||||
|
|
||||||
|
/* user mode call looks like this.
|
||||||
|
* 'request' becomes the driver->ioctl() 'code' parameter.
|
||||||
|
* the size of 'param' is encoded in 'request', and that data
|
||||||
|
* is copied to or from the driver->ioctl() 'buf' parameter.
|
||||||
|
*/
|
||||||
|
static int
|
||||||
|
usbdev_ioctl (int fd, int ifno, unsigned request, void *param)
|
||||||
|
{
|
||||||
|
struct usbdevfs_ioctl wrapper;
|
||||||
|
|
||||||
|
wrapper.ifno = ifno;
|
||||||
|
wrapper.ioctl_code = request;
|
||||||
|
wrapper.data = param;
|
||||||
|
|
||||||
|
return ioctl (fd, USBDEVFS_IOCTL, &wrapper);
|
||||||
|
}
|
||||||
|
|
||||||
|
File modification time is not updated by this request.
|
||||||
|
|
||||||
|
This request lets kernel drivers talk to user mode code through
|
||||||
|
filesystem operations even when they don't create a character or
|
||||||
|
block special device. It's also been used to do things like ask
|
||||||
|
devices what device special file should be used. Two pre-defined
|
||||||
|
ioctls are used to disconnect and reconnect kernel drivers, so that
|
||||||
|
user mode code can completely manage binding and configuration of
|
||||||
|
devices.
|
||||||
|
|
||||||
|
USBDEVFS_RELEASEINTERFACE
|
||||||
|
This is used to release the claim usbfs made on an interface, either
|
||||||
|
implicitly or because of a USBDEVFS_CLAIMINTERFACE call, before the
|
||||||
|
file descriptor is closed. The ioctl parameter is an integer holding
|
||||||
|
the number of the interface (bInterfaceNumber from descriptor). File
|
||||||
|
modification time is not updated by this request.
|
||||||
|
|
||||||
|
**Warning**
|
||||||
|
|
||||||
|
*No security check is made to ensure that the task which made
|
||||||
|
the claim is the one which is releasing it. This means that a user
|
||||||
|
mode driver may interfere with other ones.*
|
||||||
|
|
||||||
|
USBDEVFS_RESETEP
|
||||||
|
Resets the data toggle value for an endpoint (bulk or interrupt) to
|
||||||
|
DATA0. The ioctl parameter is an integer endpoint number (1 to 15,
|
||||||
|
as identified in the endpoint descriptor), with USB_DIR_IN added
|
||||||
|
if the device's endpoint sends data to the host.
|
||||||
|
|
||||||
|
**Warning**
|
||||||
|
|
||||||
|
*Avoid using this request. It should probably be removed.* Using
|
||||||
|
it typically means the device and driver will lose toggle
|
||||||
|
synchronization. If you really lost synchronization, you likely
|
||||||
|
need to completely handshake with the device, using a request
|
||||||
|
like CLEAR_HALT or SET_INTERFACE.
|
||||||
|
|
||||||
|
USBDEVFS_DROP_PRIVILEGES
|
||||||
|
This is used to relinquish the ability to do certain operations
|
||||||
|
which are considered to be privileged on a usbfs file descriptor.
|
||||||
|
This includes claiming arbitrary interfaces, resetting a device on
|
||||||
|
which there are currently claimed interfaces from other users, and
|
||||||
|
issuing USBDEVFS_IOCTL calls. The ioctl parameter is a 32 bit mask
|
||||||
|
of interfaces the user is allowed to claim on this file descriptor.
|
||||||
|
You may issue this ioctl more than one time to narrow said mask.
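To illustrate the mask, here is a small sketch that builds it from a list of interface numbers. It assumes that bit N of the mask stands for interface number N, which the text above does not spell out; the helper name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a 32 bit mask with one bit set per allowed
 * interface number. Assumption: bit N corresponds to interface N.
 * Interface numbers that do not fit in 32 bits are ignored.
 */
static uint32_t interface_mask(const unsigned int *ifnos, size_t count)
{
    uint32_t mask = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (ifnos[i] < 32)
            mask |= UINT32_C(1) << ifnos[i];
    }
    return mask;
}
```

Because repeated calls can only narrow the mask, a privileged helper would typically compute the widest mask it is willing to grant once, apply it, and then hand the file descriptor to less trusted code.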
|
||||||
|
|
||||||
|
Synchronous I/O Support
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Synchronous requests involve the kernel blocking until the user mode
|
||||||
|
request completes, either by finishing successfully or by reporting an
|
||||||
|
error. In most cases this is the simplest way to use usbfs, although as
|
||||||
|
noted above it does prevent performing I/O to more than one endpoint at
|
||||||
|
a time.
|
||||||
|
|
||||||
|
USBDEVFS_BULK
|
||||||
|
Issues a bulk read or write request to the device. The ioctl
|
||||||
|
parameter is a pointer to this structure:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct usbdevfs_bulktransfer {
|
||||||
|
unsigned int ep;
|
||||||
|
unsigned int len;
|
||||||
|
unsigned int timeout; /* in milliseconds */
|
||||||
|
void *data;
|
||||||
|
};
|
||||||
|
|
||||||
|
The "ep" value identifies a bulk endpoint number (1 to 15, as
|
||||||
|
identified in an endpoint descriptor), masked with USB_DIR_IN when
|
||||||
|
referring to an endpoint which sends data to the host from the
|
||||||
|
device. The length of the data buffer is identified by "len". Recent
|
||||||
|
kernels support requests up to about 128KBytes. *FIXME say how read
|
||||||
|
length is returned, and how short reads are handled.*
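A hedged sketch of how the structure above might be filled for a bulk IN transfer. The helper name is hypothetical; ``USB_DIR_IN`` and ``USB_ENDPOINT_NUMBER_MASK`` are taken from ``<linux/usb/ch9.h>``. The sketch only prepares the request and deliberately says nothing about what the ioctl returns.

```c
#include <string.h>
#include <linux/usbdevice_fs.h>
#include <linux/usb/ch9.h>

/*
 * Hypothetical helper: describe a bulk read of up to len bytes from
 * endpoint ep_num (1 to 15) into buf, with a timeout in milliseconds.
 */
static void fill_bulk_in(struct usbdevfs_bulktransfer *xfer,
                         unsigned int ep_num, void *buf,
                         unsigned int len, unsigned int timeout_ms)
{
    memset(xfer, 0, sizeof(*xfer));
    /* The direction bit marks an endpoint that sends data to the host. */
    xfer->ep = (ep_num & USB_ENDPOINT_NUMBER_MASK) | USB_DIR_IN;
    xfer->len = len;
    xfer->timeout = timeout_ms;
    xfer->data = buf;
}
```

The filled structure would then be passed as ``ioctl(fd, USBDEVFS_BULK, &xfer)`` on an open usbfs file descriptor; for a write, the direction bit is simply left clear.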
|
||||||
|
|
||||||
|
USBDEVFS_CLEAR_HALT
|
||||||
|
Clears endpoint halt (stall) and resets the endpoint toggle. This is
|
||||||
|
only meaningful for bulk or interrupt endpoints. The ioctl parameter
|
||||||
|
is an integer endpoint number (1 to 15, as identified in an endpoint
|
||||||
|
descriptor), masked with USB_DIR_IN when referring to an endpoint
|
||||||
|
which sends data to the host from the device.
|
||||||
|
|
||||||
|
Use this on bulk or interrupt endpoints which have stalled,
|
||||||
|
returning *-EPIPE* status to a data transfer request. Do not issue
|
||||||
|
the control request directly, since that could invalidate the host's
|
||||||
|
record of the data toggle.
|
||||||
|
|
||||||
|
USBDEVFS_CONTROL
|
||||||
|
Issues a control request to the device. The ioctl parameter points
|
||||||
|
to a structure like this:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct usbdevfs_ctrltransfer {
|
||||||
|
__u8 bRequestType;
|
||||||
|
__u8 bRequest;
|
||||||
|
__u16 wValue;
|
||||||
|
__u16 wIndex;
|
||||||
|
__u16 wLength;
|
||||||
|
__u32 timeout; /* in milliseconds */
|
||||||
|
void *data;
|
||||||
|
};
|
||||||
|
|
||||||
|
The first eight bytes of this structure are the contents of the
|
||||||
|
SETUP packet to be sent to the device; see the USB 2.0 specification
|
||||||
|
for details. The bRequestType value is composed by combining a
|
||||||
|
USB_TYPE_\* value, a USB_DIR_\* value, and a USB_RECIP_\*
|
||||||
|
value (from *<linux/usb.h>*). If wLength is nonzero, it describes
|
||||||
|
the length of the data buffer, which is either written to the device
|
||||||
|
(USB_DIR_OUT) or read from the device (USB_DIR_IN).
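As an illustration of how bRequestType is composed, this sketch prepares a vendor-specific IN request addressed to the device. The helper name and the request values used with it are hypothetical; the ``USB_DIR_*``, ``USB_TYPE_*`` and ``USB_RECIP_*`` constants are taken here from ``<linux/usb/ch9.h>``, which is where userspace programs usually find them.

```c
#include <string.h>
#include <linux/usbdevice_fs.h>
#include <linux/usb/ch9.h>

/*
 * Hypothetical helper: describe a vendor-specific control read of up to
 * len bytes into buf. The first five fields mirror the SETUP packet.
 */
static void fill_vendor_in(struct usbdevfs_ctrltransfer *ct,
                           unsigned char request, unsigned short value,
                           unsigned short index, void *buf,
                           unsigned short len, unsigned int timeout_ms)
{
    memset(ct, 0, sizeof(*ct));
    ct->bRequestType = USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE;
    ct->bRequest = request;
    ct->wValue = value;
    ct->wIndex = index;
    ct->wLength = len;          /* size of the buffer behind data */
    ct->timeout = timeout_ms;
    ct->data = buf;
}
```

The result would be handed to ``ioctl(fd, USBDEVFS_CONTROL, &ct)``. Swapping ``USB_DIR_IN`` for ``USB_DIR_OUT`` turns the same layout into a write to the device.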
|
||||||
|
|
||||||
|
At this writing, you can't transfer more than 4 KBytes of data to or
|
||||||
|
from a device; usbfs has a limit, and some host controller drivers
|
||||||
|
have a limit. (That's not usually a problem.) *Also* there's no way
|
||||||
|
to say it's not OK to get a short read back from the device.
|
||||||
|
|
||||||
|
USBDEVFS_RESET
|
||||||
|
Does a USB level device reset. The ioctl parameter is ignored. After
|
||||||
|
the reset, this rebinds all device interfaces. File modification
|
||||||
|
time is not updated by this request.
|
||||||
|
|
||||||
|
**Warning**
|
||||||
|
|
||||||
|
*Avoid using this call* until some usbcore bugs get fixed, since
|
||||||
|
it does not fully synchronize device, interface, and driver (not
|
||||||
|
just usbfs) state.
|
||||||
|
|
||||||
|
USBDEVFS_SETINTERFACE
|
||||||
|
Sets the alternate setting for an interface. The ioctl parameter is
|
||||||
|
a pointer to a structure like this:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct usbdevfs_setinterface {
|
||||||
|
unsigned int interface;
|
||||||
|
unsigned int altsetting;
|
||||||
|
};
|
||||||
|
|
||||||
|
File modification time is not updated by this request.
|
||||||
|
|
||||||
|
Those struct members are from some interface descriptor applying to
|
||||||
|
the current configuration. The interface number is the
|
||||||
|
bInterfaceNumber value, and the altsetting number is the
|
||||||
|
bAlternateSetting value. (This resets each endpoint in the
|
||||||
|
interface.)
|
||||||
|
|
||||||
|
USBDEVFS_SETCONFIGURATION
|
||||||
|
Issues the :c:func:`usb_set_configuration()` call for the
|
||||||
|
device. The parameter is an integer holding the number of a
|
||||||
|
configuration (bConfigurationValue from descriptor). File
|
||||||
|
modification time is not updated by this request.
|
||||||
|
|
||||||
|
**Warning**
|
||||||
|
|
||||||
|
*Avoid using this call* until some usbcore bugs get fixed, since
|
||||||
|
it does not fully synchronize device, interface, and driver (not
|
||||||
|
just usbfs) state.
|
||||||
|
|
||||||
|
Asynchronous I/O Support
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
As mentioned above, there are situations where it may be important to
|
||||||
|
initiate concurrent operations from user mode code. This is particularly
|
||||||
|
important for periodic transfers (interrupt and isochronous), but it can
|
||||||
|
be used for other kinds of USB requests too. In such cases, the
|
||||||
|
asynchronous requests described here are essential. Rather than
|
||||||
|
submitting one request and having the kernel block until it completes,
|
||||||
|
the blocking is separate.
|
||||||
|
|
||||||
|
These requests are packaged into a structure that resembles the URB used
|
||||||
|
by kernel device drivers. (No POSIX Async I/O support here, sorry.) It
|
||||||
|
identifies the endpoint type (USBDEVFS_URB_TYPE_\*), endpoint
|
||||||
|
(number, masked with USB_DIR_IN as appropriate), buffer and length,
|
||||||
|
and a user "context" value serving to uniquely identify each request.
|
||||||
|
(It's usually a pointer to per-request data.) Flags can modify requests
|
||||||
|
(not as many as supported for kernel drivers).
|
||||||
|
|
||||||
|
Each request can specify a realtime signal number (between SIGRTMIN and
|
||||||
|
SIGRTMAX, inclusive) to request a signal be sent when the request
|
||||||
|
completes.
|
||||||
|
|
||||||
|
When usbfs returns these urbs, the status value is updated, and the
|
||||||
|
buffer may have been modified. Except for isochronous transfers, the
|
||||||
|
actual_length is updated to say how many bytes were transferred; if the
|
||||||
|
USBDEVFS_URB_DISABLE_SPD flag is set ("short packets are not OK") and
|
||||||
|
fewer bytes were read than were requested then you get an error report.
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct usbdevfs_iso_packet_desc {
|
||||||
|
unsigned int length;
|
||||||
|
unsigned int actual_length;
|
||||||
|
unsigned int status;
|
||||||
|
};
|
||||||
|
|
||||||
|
struct usbdevfs_urb {
|
||||||
|
unsigned char type;
|
||||||
|
unsigned char endpoint;
|
||||||
|
int status;
|
||||||
|
unsigned int flags;
|
||||||
|
void *buffer;
|
||||||
|
int buffer_length;
|
||||||
|
int actual_length;
|
||||||
|
int start_frame;
|
||||||
|
int number_of_packets;
|
||||||
|
int error_count;
|
||||||
|
unsigned int signr;
|
||||||
|
void *usercontext;
|
||||||
|
struct usbdevfs_iso_packet_desc iso_frame_desc[];
|
||||||
|
};
|
||||||
|
|
||||||
|
For these asynchronous requests, the file modification time reflects
|
||||||
|
when the request was initiated. This contrasts with the
|
||||||
|
synchronous requests, where it reflects when requests complete.
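A minimal sketch of preparing one of these URBs for a bulk endpoint, assuming the structure layout shown above. The helper name is hypothetical, and leaving ``signr`` at zero is assumed to mean that no completion signal is requested. Submission and reaping are not shown, since the requests for them are still marked TBS here.

```c
#include <string.h>
#include <linux/usbdevice_fs.h>

/*
 * Hypothetical helper: prepare a bulk URB for endpoint address ep_addr
 * (endpoint number, with the direction bit set for IN endpoints).
 * ctx is the user "context" value that identifies this request later.
 */
static void fill_bulk_urb(struct usbdevfs_urb *urb, unsigned char ep_addr,
                          void *buf, int len, void *ctx)
{
    memset(urb, 0, sizeof(*urb));   /* flags, signr and status start at 0 */
    urb->type = USBDEVFS_URB_TYPE_BULK;
    urb->endpoint = ep_addr;
    urb->buffer = buf;
    urb->buffer_length = len;
    urb->usercontext = ctx;
}
```

Pointing ``usercontext`` at per-request data lets the code that collects completed URBs find its own bookkeeping without a lookup table.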
|
||||||
|
|
||||||
|
USBDEVFS_DISCARDURB
|
||||||
|
*TBS* File modification time is not updated by this request.
|
||||||
|
|
||||||
|
USBDEVFS_DISCSIGNAL
|
||||||
|
*TBS* File modification time is not updated by this request.
|
||||||
|
|
||||||
|
USBDEVFS_REAPURB
|
||||||
|
*TBS* File modification time is not updated by this request.
|
||||||
|
|
||||||
|
USBDEVFS_REAPURBNDELAY
|
||||||
|
*TBS* File modification time is not updated by this request.
|
||||||
|
|
||||||
|
USBDEVFS_SUBMITURB
|
||||||
|
*TBS*
|
||||||
@@ -1,13 +1,15 @@
|
|||||||
VME Device Driver API
|
VME Device Drivers
|
||||||
=====================
|
==================
|
||||||
|
|
||||||
Driver registration
|
Driver registration
|
||||||
===================
|
-------------------
|
||||||
|
|
||||||
As with other subsystems within the Linux kernel, VME device drivers register
|
As with other subsystems within the Linux kernel, VME device drivers register
|
||||||
with the VME subsystem, typically called from the devices init routine. This is
|
with the VME subsystem, typically called from the devices init routine. This is
|
||||||
achieved via a call to the following function:
|
achieved via a call to the following function:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_register_driver (struct vme_driver *driver, unsigned int ndevs);
|
int vme_register_driver (struct vme_driver *driver, unsigned int ndevs);
|
||||||
|
|
||||||
If driver registration is successful this function returns zero, if an error
|
If driver registration is successful this function returns zero, if an error
|
||||||
@@ -17,6 +19,8 @@ A pointer to a structure of type 'vme_driver' must be provided to the
|
|||||||
registration function. Along with ndevs, which is the number of devices your
|
registration function. Along with ndevs, which is the number of devices your
|
||||||
driver is able to support. The structure is as follows:
|
driver is able to support. The structure is as follows:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_driver {
|
struct vme_driver {
|
||||||
struct list_head node;
|
struct list_head node;
|
||||||
const char *name;
|
const char *name;
|
||||||
@@ -38,6 +42,8 @@ with the driver. The match function should return 1 if a device should be
|
|||||||
probed and 0 otherwise. This example match function (from vme_user.c) limits
|
probed and 0 otherwise. This example match function (from vme_user.c) limits
|
||||||
the number of devices probed to one:
|
the number of devices probed to one:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
#define USER_BUS_MAX 1
|
#define USER_BUS_MAX 1
|
||||||
...
|
...
|
||||||
static int vme_user_match(struct vme_dev *vdev)
|
static int vme_user_match(struct vme_dev *vdev)
|
||||||
@@ -51,6 +57,8 @@ The '.probe' element should contain a pointer to the probe routine. The
|
|||||||
probe routine is passed a 'struct vme_dev' pointer as an argument. The
|
probe routine is passed a 'struct vme_dev' pointer as an argument. The
|
||||||
'struct vme_dev' structure looks like the following:
|
'struct vme_dev' structure looks like the following:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_dev {
|
struct vme_dev {
|
||||||
int num;
|
int num;
|
||||||
struct vme_bridge *bridge;
|
struct vme_bridge *bridge;
|
||||||
@@ -66,11 +74,13 @@ dev->bridge->num.
|
|||||||
A function is also provided to unregister the driver from the VME core and is
|
A function is also provided to unregister the driver from the VME core and is
|
||||||
usually called from the device driver's exit routine:
|
usually called from the device driver's exit routine:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
void vme_unregister_driver (struct vme_driver *driver);
|
void vme_unregister_driver (struct vme_driver *driver);
|
||||||
|
|
||||||
|
|
||||||
Resource management
|
Resource management
|
||||||
===================
|
-------------------
|
||||||
|
|
||||||
Once a driver has registered with the VME core the provided match routine will
|
Once a driver has registered with the VME core the provided match routine will
|
||||||
be called the number of times specified during the registration. If a match
|
be called the number of times specified during the registration. If a match
|
||||||
@@ -86,6 +96,8 @@ specific window or DMA channel (which may be used by a different driver) this
|
|||||||
driver allows a resource to be assigned based on the required attributes of the
|
driver allows a resource to be assigned based on the required attributes of the
|
||||||
driver in question:
|
driver in question:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_resource * vme_master_request(struct vme_dev *dev,
|
struct vme_resource * vme_master_request(struct vme_dev *dev,
|
||||||
u32 aspace, u32 cycle, u32 width);
|
u32 aspace, u32 cycle, u32 width);
|
||||||
|
|
||||||
@@ -112,6 +124,8 @@ Functions are also provided to free window allocations once they are no longer
|
|||||||
required. These functions should be passed the pointer to the resource provided
|
required. These functions should be passed the pointer to the resource provided
|
||||||
during resource allocation:
|
during resource allocation:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
void vme_master_free(struct vme_resource *res);
|
void vme_master_free(struct vme_resource *res);
|
||||||
|
|
||||||
void vme_slave_free(struct vme_resource *res);
|
void vme_slave_free(struct vme_resource *res);
|
||||||
@@ -120,7 +134,7 @@ during resource allocation:
|
|||||||
|
|
||||||
|
|
||||||
Master windows
|
Master windows
|
||||||
==============
|
--------------
|
||||||
|
|
||||||
Master windows provide access from the local processor[s] out onto the VME bus.
|
Master windows provide access from the local processor[s] out onto the VME bus.
|
||||||
The number of windows available and the available access modes is dependent on
|
The number of windows available and the available access modes is dependent on
|
||||||
@@ -128,11 +142,13 @@ the underlying chipset. A window must be configured before it can be used.
|
|||||||
|
|
||||||
|
|
||||||
Master window configuration
|
Master window configuration
|
||||||
---------------------------
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Once a master window has been assigned the following functions can be used to
|
Once a master window has been assigned the following functions can be used to
|
||||||
configure it and retrieve the current settings:
|
configure it and retrieve the current settings:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_master_set (struct vme_resource *res, int enabled,
|
int vme_master_set (struct vme_resource *res, int enabled,
|
||||||
unsigned long long base, unsigned long long size, u32 aspace,
|
unsigned long long base, unsigned long long size, u32 aspace,
|
||||||
u32 cycle, u32 width);
|
u32 cycle, u32 width);
|
||||||
@@ -149,11 +165,13 @@ These functions return 0 on success or an error code should the call fail.
|
|||||||
|
|
||||||
|
|
||||||
Master window access
|
Master window access
|
||||||
--------------------
|
~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The following functions can be used to read from and write to configured master
|
The following functions can be used to read from and write to configured master
|
||||||
windows. These functions return the number of bytes copied:
|
windows. These functions return the number of bytes copied:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
ssize_t vme_master_read(struct vme_resource *res, void *buf,
|
ssize_t vme_master_read(struct vme_resource *res, void *buf,
|
||||||
size_t count, loff_t offset);
|
size_t count, loff_t offset);
|
||||||
|
|
||||||
@@ -164,6 +182,8 @@ In addition to simple reads and writes, a function is provided to do a
|
|||||||
read-modify-write transaction. This function returns the original value of the
|
read-modify-write transaction. This function returns the original value of the
|
||||||
VME bus location :
|
VME bus location :
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
unsigned int vme_master_rmw (struct vme_resource *res,
|
unsigned int vme_master_rmw (struct vme_resource *res,
|
||||||
unsigned int mask, unsigned int compare, unsigned int swap,
|
unsigned int mask, unsigned int compare, unsigned int swap,
|
||||||
loff_t offset);
|
loff_t offset);
|
||||||
@@ -175,12 +195,14 @@ the value of swap is written the specified offset.
|
|||||||
Parts of a VME window can be mapped into user space memory using the following
|
Parts of a VME window can be mapped into user space memory using the following
|
||||||
function:
|
function:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_master_mmap(struct vme_resource *resource,
|
int vme_master_mmap(struct vme_resource *resource,
|
||||||
struct vm_area_struct *vma)
|
struct vm_area_struct *vma)
|
||||||
|
|
||||||
|
|
||||||
Slave windows
|
Slave windows
|
||||||
=============
|
-------------
|
||||||
|
|
||||||
Slave windows provide devices on the VME bus access into mapped portions of the
|
Slave windows provide devices on the VME bus access into mapped portions of the
|
||||||
local memory. The number of windows available and the access modes that can be
|
local memory. The number of windows available and the access modes that can be
|
||||||
@@ -189,11 +211,13 @@ it can be used.
|
|||||||
|
|
||||||
|
|
||||||
Slave window configuration
|
Slave window configuration
|
||||||
--------------------------
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Once a slave window has been assigned the following functions can be used to
|
Once a slave window has been assigned the following functions can be used to
|
||||||
configure it and retrieve the current settings:
|
configure it and retrieve the current settings:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_slave_set (struct vme_resource *res, int enabled,
|
int vme_slave_set (struct vme_resource *res, int enabled,
|
||||||
unsigned long long base, unsigned long long size,
|
unsigned long long base, unsigned long long size,
|
||||||
dma_addr_t mem, u32 aspace, u32 cycle);
|
dma_addr_t mem, u32 aspace, u32 cycle);
|
||||||
@@ -210,13 +234,15 @@ These functions return 0 on success or an error code should the call fail.
|
|||||||
|
|
||||||
|
|
||||||
Slave window buffer allocation
|
Slave window buffer allocation
|
||||||
------------------------------
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Functions are provided to allow the user to allocate and free a contiguous
|
Functions are provided to allow the user to allocate and free a contiguous
|
||||||
buffers which will be accessible by the VME bridge. These functions do not have
|
buffers which will be accessible by the VME bridge. These functions do not have
|
||||||
to be used, other methods can be used to allocate a buffer, though care must be
|
to be used, other methods can be used to allocate a buffer, though care must be
|
||||||
taken to ensure that they are contiguous and accessible by the VME bridge:
|
taken to ensure that they are contiguous and accessible by the VME bridge:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
void * vme_alloc_consistent(struct vme_resource *res, size_t size,
|
void * vme_alloc_consistent(struct vme_resource *res, size_t size,
|
||||||
dma_addr_t *mem);
|
dma_addr_t *mem);
|
||||||
|
|
||||||
@@ -225,14 +251,14 @@ taken to ensure that they are contiguous and accessible by the VME bridge:
|
|||||||
|
|
||||||
|
|
||||||
Slave window access
|
Slave window access
|
||||||
-------------------
|
~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Slave windows map local memory onto the VME bus, the standard methods for
|
Slave windows map local memory onto the VME bus, the standard methods for
|
||||||
accessing memory should be used.
|
accessing memory should be used.
|
||||||
|
|
||||||
|
|
||||||
DMA channels
|
DMA channels
|
||||||
============
|
------------
|
||||||
|
|
||||||
The VME DMA transfer provides the ability to run link-list DMA transfers. The
|
The VME DMA transfer provides the ability to run link-list DMA transfers. The
|
||||||
API introduces the concept of DMA lists. Each DMA list is a link-list which can
|
API introduces the concept of DMA lists. Each DMA list is a link-list which can
|
||||||
@@ -241,29 +267,35 @@ executed, reused and destroyed.
|
|||||||
|
|
||||||
|
|
||||||
List Management
|
List Management
|
||||||
---------------
|
~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The following functions are provided to create and destroy DMA lists. Execution
|
The following functions are provided to create and destroy DMA lists. Execution
|
||||||
of a list will not automatically destroy the list, thus enabling a list to be
|
of a list will not automatically destroy the list, thus enabling a list to be
|
||||||
reused for repetitive tasks:
|
reused for repetitive tasks:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_dma_list *vme_new_dma_list(struct vme_resource *res);
|
struct vme_dma_list *vme_new_dma_list(struct vme_resource *res);
|
||||||
|
|
||||||
int vme_dma_list_free(struct vme_dma_list *list);
|
int vme_dma_list_free(struct vme_dma_list *list);
|
||||||
|
|
||||||
|
|
||||||
List Population
|
List Population
|
||||||
---------------
|
~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
An item can be added to a list using the following function ( the source and
|
An item can be added to a list using the following function ( the source and
|
||||||
destination attributes need to be created before calling this function, this is
|
destination attributes need to be created before calling this function, this is
|
||||||
covered under "Transfer Attributes"):
|
covered under "Transfer Attributes"):
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_dma_list_add(struct vme_dma_list *list,
|
int vme_dma_list_add(struct vme_dma_list *list,
|
||||||
struct vme_dma_attr *src, struct vme_dma_attr *dest,
|
struct vme_dma_attr *src, struct vme_dma_attr *dest,
|
||||||
size_t count);
|
size_t count);
|
||||||
|
|
||||||
NOTE: The detailed attributes of the transfers source and destination
|
.. note::
|
||||||
|
|
||||||
|
The detailed attributes of the transfers source and destination
|
||||||
are not checked until an entry is added to a DMA list, the request
|
are not checked until an entry is added to a DMA list, the request
|
||||||
for a DMA channel purely checks the directions in which the
|
for a DMA channel purely checks the directions in which the
|
||||||
controller is expected to transfer data. As a result it is
|
controller is expected to transfer data. As a result it is
|
||||||
@@ -271,7 +303,7 @@ NOTE: The detailed attributes of the transfers source and destination
|
|||||||
source or destination is in an unsupported VME address space.
|
source or destination is in an unsupported VME address space.
|
||||||
|
|
||||||
Transfer Attributes
|
Transfer Attributes
|
||||||
-------------------
|
~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The attributes for the source and destination are handled separately from adding
|
The attributes for the source and destination are handled separately from adding
|
||||||
an item to a list. This is due to the diverse attributes required for each type
|
an item to a list. This is due to the diverse attributes required for each type
|
||||||
@@ -280,33 +312,43 @@ and pattern sources and destinations (where appropriate):
|
|||||||
|
|
||||||
Pattern source:
|
Pattern source:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_dma_attr *vme_dma_pattern_attribute(u32 pattern, u32 type);
|
struct vme_dma_attr *vme_dma_pattern_attribute(u32 pattern, u32 type);
|
||||||
|
|
||||||
PCI source or destination:
|
PCI source or destination:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_dma_attr *vme_dma_pci_attribute(dma_addr_t mem);
|
struct vme_dma_attr *vme_dma_pci_attribute(dma_addr_t mem);
|
||||||
|
|
||||||
VME source or destination:
|
VME source or destination:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_dma_attr *vme_dma_vme_attribute(unsigned long long base,
|
struct vme_dma_attr *vme_dma_vme_attribute(unsigned long long base,
|
||||||
u32 aspace, u32 cycle, u32 width);
|
u32 aspace, u32 cycle, u32 width);
|
||||||
|
|
||||||
The following function should be used to free an attribute:
|
The following function should be used to free an attribute:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
void vme_dma_free_attribute(struct vme_dma_attr *attr);
|
void vme_dma_free_attribute(struct vme_dma_attr *attr);
|
||||||
|
|
||||||
|
|
||||||
List Execution
|
List Execution
|
||||||
--------------
|
~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The following function queues a list for execution. The function will return
|
The following function queues a list for execution. The function will return
|
||||||
once the list has been executed:
|
once the list has been executed:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_dma_list_exec(struct vme_dma_list *list);
|
int vme_dma_list_exec(struct vme_dma_list *list);
|
||||||
|
|
||||||
|
|
||||||
Interrupts
|
Interrupts
|
||||||
==========
|
----------
|
||||||
|
|
||||||
The VME API provides functions to attach and detach callbacks to specific VME
|
The VME API provides functions to attach and detach callbacks to specific VME
|
||||||
level and status ID combinations and for the generation of VME interrupts with
|
level and status ID combinations and for the generation of VME interrupts with
|
||||||
@@ -314,13 +356,15 @@ specific VME level and status IDs.
|
|||||||
|
|
||||||
|
|
||||||
Attaching Interrupt Handlers
|
Attaching Interrupt Handlers
|
||||||
----------------------------
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The following functions can be used to attach and free a specific VME level and
|
The following functions can be used to attach and free a specific VME level and
|
||||||
status ID combination. Any given combination can only be assigned a single
|
status ID combination. Any given combination can only be assigned a single
|
||||||
callback function. A void pointer parameter is provided, the value of which is
|
callback function. A void pointer parameter is provided, the value of which is
|
||||||
passed to the callback function; the use of this pointer is user undefined:
|
passed to the callback function; the use of this pointer is user undefined:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_irq_request(struct vme_dev *dev, int level, int statid,
|
int vme_irq_request(struct vme_dev *dev, int level, int statid,
|
||||||
void (*callback)(int, int, void *), void *priv);
|
void (*callback)(int, int, void *), void *priv);
|
||||||
|
|
||||||
@@ -329,31 +373,37 @@ passed to the callback function, the use of this pointer is user undefined:
|
|||||||
The callback parameters are as follows. Care must be taken in writing a callback
|
The callback parameters are as follows. Care must be taken in writing a callback
|
||||||
function, as callback functions run in interrupt context:
|
function, as callback functions run in interrupt context:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
void callback(int level, int statid, void *priv);
|
void callback(int level, int statid, void *priv);
|
||||||
|
|
||||||
|
|
||||||
Interrupt Generation
|
Interrupt Generation
|
||||||
--------------------
|
~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The following function can be used to generate a VME interrupt at a given VME
|
The following function can be used to generate a VME interrupt at a given VME
|
||||||
level and VME status ID:
|
level and VME status ID:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_irq_generate(struct vme_dev *dev, int level, int statid);
|
int vme_irq_generate(struct vme_dev *dev, int level, int statid);
|
||||||
|
|
||||||
|
|
||||||
Location monitors
|
Location monitors
|
||||||
=================
|
-----------------
|
||||||
|
|
||||||
The VME API provides the following functionality to configure the location
|
The VME API provides the following functionality to configure the location
|
||||||
monitor.
|
monitor.
|
||||||
|
|
||||||
|
|
||||||
Location Monitor Management
|
Location Monitor Management
|
||||||
---------------------------
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The following functions are provided to request the use of a block of location
|
The following functions are provided to request the use of a block of location
|
||||||
monitors and to free them after they are no longer required:
|
monitors and to free them after they are no longer required:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
struct vme_resource * vme_lm_request(struct vme_dev *dev);
|
struct vme_resource * vme_lm_request(struct vme_dev *dev);
|
||||||
|
|
||||||
void vme_lm_free(struct vme_resource * res);
|
void vme_lm_free(struct vme_resource * res);
|
||||||
@@ -362,15 +412,19 @@ Each block may provide a number of location monitors, monitoring adjacent
|
|||||||
locations. The following function can be used to determine how many locations
|
locations. The following function can be used to determine how many locations
|
||||||
are provided:
|
are provided:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_lm_count(struct vme_resource * res);
|
int vme_lm_count(struct vme_resource * res);
|
||||||
|
|
||||||
|
|
||||||
Location Monitor Configuration
|
Location Monitor Configuration
|
||||||
------------------------------
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
Once a bank of location monitors has been allocated, the following functions
|
Once a bank of location monitors has been allocated, the following functions
|
||||||
are provided to configure the location and mode of the location monitor:
|
are provided to configure the location and mode of the location monitor:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_lm_set(struct vme_resource *res, unsigned long long base,
|
int vme_lm_set(struct vme_resource *res, unsigned long long base,
|
||||||
u32 aspace, u32 cycle);
|
u32 aspace, u32 cycle);
|
||||||
|
|
||||||
@@ -379,12 +433,14 @@ are provided to configure the location and mode of the location monitor:
|
|||||||
|
|
||||||
|
|
||||||
Location Monitor Use
|
Location Monitor Use
|
||||||
--------------------
|
~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
The following functions allow a callback to be attached and detached from each
|
The following functions allow a callback to be attached and detached from each
|
||||||
location monitor location. Each location monitor can monitor a number of
|
location monitor location. Each location monitor can monitor a number of
|
||||||
adjacent locations:
|
adjacent locations:
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_lm_attach(struct vme_resource *res, int num,
|
int vme_lm_attach(struct vme_resource *res, int num,
|
||||||
void (*callback)(void *));
|
void (*callback)(void *));
|
||||||
|
|
||||||
@@ -392,22 +448,27 @@ adjacent locations:
|
|||||||
|
|
||||||
The callback function is declared as follows.
|
The callback function is declared as follows.
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
void callback(void *data);
|
void callback(void *data);
|
||||||
|
|
||||||
|
|
||||||
Slot Detection
|
Slot Detection
|
||||||
==============
|
--------------
|
||||||
|
|
||||||
This function returns the slot ID of the provided bridge.
|
This function returns the slot ID of the provided bridge.
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_slot_num(struct vme_dev *dev);
|
int vme_slot_num(struct vme_dev *dev);
|
||||||
|
|
||||||
|
|
||||||
Bus Detection
|
Bus Detection
|
||||||
=============
|
-------------
|
||||||
|
|
||||||
This function returns the bus ID of the provided bridge.
|
This function returns the bus ID of the provided bridge.
|
||||||
|
|
||||||
|
.. code-block:: c
|
||||||
|
|
||||||
int vme_bus_num(struct vme_dev *dev);
|
int vme_bus_num(struct vme_dev *dev);
|
||||||
|
|
||||||
|
|
||||||
@@ -1,340 +0,0 @@
|
|||||||
|
|
||||||
Introduction
|
|
||||||
============
|
|
||||||
|
|
||||||
This document describes how to use the dynamic debug (dyndbg) feature.
|
|
||||||
|
|
||||||
Dynamic debug is designed to allow you to dynamically enable/disable
|
|
||||||
kernel code to obtain additional kernel information. Currently, if
|
|
||||||
CONFIG_DYNAMIC_DEBUG is set, then all pr_debug()/dev_dbg() and
|
|
||||||
print_hex_dump_debug()/print_hex_dump_bytes() calls can be dynamically
|
|
||||||
enabled per-callsite.
|
|
||||||
|
|
||||||
If CONFIG_DYNAMIC_DEBUG is not set, print_hex_dump_debug() is just a
|
|
||||||
shortcut for print_hex_dump(KERN_DEBUG).
|
|
||||||
|
|
||||||
For print_hex_dump_debug()/print_hex_dump_bytes(), the format string is
|
|
||||||
its 'prefix_str' argument, if it is a constant string; or "hexdump"
|
|
||||||
in case 'prefix_str' is built dynamically.
|
|
||||||
|
|
||||||
Dynamic debug has even more useful features:
|
|
||||||
|
|
||||||
* Simple query language allows turning on and off debugging
|
|
||||||
statements by matching any combination of 0 or 1 of:
|
|
||||||
|
|
||||||
- source filename
|
|
||||||
- function name
|
|
||||||
- line number (including ranges of line numbers)
|
|
||||||
- module name
|
|
||||||
- format string
|
|
||||||
|
|
||||||
* Provides a debugfs control file: <debugfs>/dynamic_debug/control
|
|
||||||
which can be read to display the complete list of known debug
|
|
||||||
statements, to help guide you.
|
|
||||||
|
|
||||||
Controlling dynamic debug Behaviour
|
|
||||||
===================================
|
|
||||||
|
|
||||||
The behaviour of pr_debug()/dev_dbg() calls is controlled by writing to a
|
|
||||||
control file in the 'debugfs' filesystem. Thus, you must first mount
|
|
||||||
the debugfs filesystem, in order to make use of this feature.
|
|
||||||
Subsequently, we refer to the control file as:
|
|
||||||
<debugfs>/dynamic_debug/control. For example, if you want to enable
|
|
||||||
printing from source file 'svcsock.c', line 1603 you simply do:
|
|
||||||
|
|
||||||
nullarbor:~ # echo 'file svcsock.c line 1603 +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
If you make a mistake with the syntax, the write will fail thus:
|
|
||||||
|
|
||||||
nullarbor:~ # echo 'file svcsock.c wtf 1 +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
-bash: echo: write error: Invalid argument
|
|
||||||
|
|
||||||
Viewing Dynamic Debug Behaviour
|
|
||||||
===============================
|
|
||||||
|
|
||||||
You can view the currently configured behaviour of all the debug
|
|
||||||
statements via:
|
|
||||||
|
|
||||||
nullarbor:~ # cat <debugfs>/dynamic_debug/control
|
|
||||||
# filename:lineno [module]function flags format
|
|
||||||
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:323 [svcxprt_rdma]svc_rdma_cleanup =_ "SVCRDMA Module Removed, deregister RPC RDMA transport\012"
|
|
||||||
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:341 [svcxprt_rdma]svc_rdma_init =_ "\011max_inline : %d\012"
|
|
||||||
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:340 [svcxprt_rdma]svc_rdma_init =_ "\011sq_depth : %d\012"
|
|
||||||
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svc_rdma.c:338 [svcxprt_rdma]svc_rdma_init =_ "\011max_requests : %d\012"
|
|
||||||
...
|
|
||||||
|
|
||||||
|
|
||||||
You can also apply standard Unix text manipulation filters to this
|
|
||||||
data, e.g.
|
|
||||||
|
|
||||||
nullarbor:~ # grep -i rdma <debugfs>/dynamic_debug/control | wc -l
|
|
||||||
62
|
|
||||||
|
|
||||||
nullarbor:~ # grep -i tcp <debugfs>/dynamic_debug/control | wc -l
|
|
||||||
42
|
|
||||||
|
|
||||||
The third column shows the currently enabled flags for each debug
|
|
||||||
statement callsite (see below for definitions of the flags). The
|
|
||||||
default value, with no flags enabled, is "=_". So you can view all
|
|
||||||
the debug statement callsites with any non-default flags:
|
|
||||||
|
|
||||||
nullarbor:~ # awk '$3 != "=_"' <debugfs>/dynamic_debug/control
|
|
||||||
# filename:lineno [module]function flags format
|
|
||||||
/usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c:1603 [sunrpc]svc_send p "svc_process: st_sendto returned %d\012"
|
|
||||||
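The control file layout shown above (``filename:lineno [module]function flags format``) can also be parsed programmatically. The following is a minimal Python sketch, not part of the kernel tree: the names ``parse_control_line`` and ``non_default`` are hypothetical, the sample paths are shortened for illustration, and the regular expression assumes the format string is always double-quoted as in the listings above.

```python
import re

# Layout taken from the header line of the control file:
#   filename:lineno [module]function flags format
CONTROL_LINE = re.compile(
    r'^(?P<filename>\S+):(?P<lineno>\d+) '
    r'\[(?P<module>[^\]]+)\](?P<function>\S+) '
    r'(?P<flags>\S+) '
    r'"(?P<format>.*)"$'
)


def parse_control_line(line):
    """Return a dict of fields for one callsite line, or None otherwise.

    Header lines (starting with '#') and blank lines do not match the
    pattern and therefore yield None.
    """
    match = CONTROL_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    entry = match.groupdict()
    entry["lineno"] = int(entry["lineno"])
    return entry


def non_default(lines):
    """Yield callsites whose flags differ from the default "=_".

    This mirrors the awk filter above, except that the header line is
    skipped instead of printed.
    """
    for line in lines:
        entry = parse_control_line(line)
        if entry is not None and entry["flags"] != "=_":
            yield entry


# Sample input modelled on the listings above (paths shortened).
sample = [
    '# filename:lineno [module]function flags format',
    'net/sunrpc/svc_rdma.c:323 [svcxprt_rdma]svc_rdma_cleanup =_ '
    '"SVCRDMA Module Removed, deregister RPC RDMA transport\\012"',
    'net/sunrpc/svcsock.c:1603 [sunrpc]svc_send p '
    '"svc_process: st_sendto returned %d\\012"',
]
enabled = list(non_default(sample))
```

On a live system the same functions could be fed from ``open("<debugfs>/dynamic_debug/control")``, with ``<debugfs>`` replaced by the actual mount point.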
|
|
||||||
|
|
||||||
Command Language Reference
|
|
||||||
==========================
|
|
||||||
|
|
||||||
At the lexical level, a command comprises a sequence of words separated
|
|
||||||
by spaces or tabs. So these are all equivalent:
|
|
||||||
|
|
||||||
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
nullarbor:~ # echo -n ' file svcsock.c line 1603 +p ' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
Command submissions are bounded by a write() system call.
|
|
||||||
Multiple commands can be written together, separated by ';' or '\n'.
|
|
||||||
|
|
||||||
~# echo "func pnpacpi_get_resources +p; func pnp_assign_mem +p" \
|
|
||||||
> <debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
If your query set is big, you can batch them too:
|
|
||||||
|
|
||||||
~# cat query-batch-file > <debugfs>/dynamic_debug/control
|
|
||||||
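The splitting rule above (commands separated by ';' or '\n', each command a sequence of whitespace-separated words) can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not the kernel's parser: ``split_commands`` is a hypothetical name, quoting of format strings is ignored, and '#' is assumed to start a comment that runs to the end of the command, as the boot-args example in this document suggests.

```python
def split_commands(text):
    """Split a query string into commands, each a list of words.

    Simplified model: ';' and newline both end a command, '#' starts a
    comment (assumption), and quoted strings are NOT handled.
    """
    commands = []
    for chunk in text.replace(";", "\n").split("\n"):
        chunk = chunk.split("#", 1)[0]   # drop a trailing comment, if any
        words = chunk.split()            # spaces and tabs both separate words
        if words:                        # ignore empty commands
            commands.append(words)
    return commands


batch = "func pnpacpi_get_resources +p; func pnp_assign_mem +p"
commands = split_commands(batch)
```

Because leading, trailing and repeated whitespace is discarded by the word split, differently spaced spellings of the same command produce the same word list.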
|
|
||||||
Another way is to use wildcards. The match rule supports '*' (matches
|
|
||||||
zero or more characters) and '?' (matches exactly one character). For
|
|
||||||
example, you can match all usb drivers:
|
|
||||||
|
|
||||||
~# echo "file drivers/usb/* +p" > <debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
At the syntactical level, a command comprises a sequence of match
|
|
||||||
specifications, followed by a flags change specification.
|
|
||||||
|
|
||||||
command ::= match-spec* flags-spec
|
|
||||||
|
|
||||||
The match-spec's are used to choose a subset of the known pr_debug()
|
|
||||||
callsites to which to apply the flags-spec. Think of them as a query
|
|
||||||
with implicit ANDs between each pair. Note that an empty list of
|
|
||||||
match-specs will select all debug statement callsites.
|
|
||||||
|
|
||||||
A match specification comprises a keyword, which controls the
|
|
||||||
attribute of the callsite to be compared, and a value to compare
|
|
||||||
against. Possible keywords are:
|
|
||||||
|
|
||||||
match-spec ::= 'func' string |
|
|
||||||
'file' string |
|
|
||||||
'module' string |
|
|
||||||
'format' string |
|
|
||||||
'line' line-range
|
|
||||||
|
|
||||||
line-range ::= lineno |
|
|
||||||
'-'lineno |
|
|
||||||
lineno'-' |
|
|
||||||
lineno'-'lineno
|
|
||||||
// Note: line-range cannot contain space, e.g.
|
|
||||||
// "1-30" is valid range but "1 - 30" is not.
|
|
||||||
|
|
||||||
lineno ::= unsigned-int
|
|
||||||
|
|
||||||
The meanings of each keyword are:
|
|
||||||
|
|
||||||
func
|
|
||||||
The given string is compared against the function name
|
|
||||||
of each callsite. Example:
|
|
||||||
|
|
||||||
func svc_tcp_accept
|
|
||||||
|
|
||||||
file
|
|
||||||
The given string is compared against either the full pathname, the
|
|
||||||
src-root relative pathname, or the basename of the source file of
|
|
||||||
each callsite. Examples:
|
|
||||||
|
|
||||||
file svcsock.c
|
|
||||||
file kernel/freezer.c
|
|
||||||
file /usr/src/packages/BUILD/sgi-enhancednfs-1.4/default/net/sunrpc/svcsock.c
|
|
||||||
|
|
||||||
module
|
|
||||||
The given string is compared against the module name
|
|
||||||
of each callsite. The module name is the string as
|
|
||||||
seen in "lsmod", i.e. without the directory or the .ko
|
|
||||||
suffix and with '-' changed to '_'. Examples:
|
|
||||||
|
|
||||||
module sunrpc
|
|
||||||
module nfsd
|
|
||||||
|
|
||||||
format
|
|
||||||
The given string is searched for in the dynamic debug format
|
|
||||||
string. Note that the string does not need to match the
|
|
||||||
entire format, only some part. Whitespace and other
|
|
||||||
special characters can be escaped using C octal character
|
|
||||||
escape \ooo notation, e.g. the space character is \040.
|
|
||||||
Alternatively, the string can be enclosed in double quote
|
|
||||||
characters (") or single quote characters (').
|
|
||||||
Examples:
|
|
||||||
|
|
||||||
format svcrdma: // many of the NFS/RDMA server pr_debugs
|
|
||||||
format readahead // some pr_debugs in the readahead cache
|
|
||||||
format nfsd:\040SETATTR // one way to match a format with whitespace
|
|
||||||
format "nfsd: SETATTR" // a neater way to match a format with whitespace
|
|
||||||
format 'nfsd: SETATTR' // yet another way to match a format with whitespace
|
|
||||||
|
|
||||||
line
|
|
||||||
The given line number or range of line numbers is compared
|
|
||||||
against the line number of each pr_debug() callsite. A single
|
|
||||||
line number matches the callsite line number exactly. A
|
|
||||||
range of line numbers matches any callsite between the first
|
|
||||||
and last line number inclusive. An empty first number means
|
|
||||||
the first line in the file, an empty last number means the
|
|
||||||
last line in the file. Examples:
|
|
||||||
|
|
||||||
line 1603 // exactly line 1603
|
|
||||||
line 1600-1605 // the six lines from line 1600 to line 1605
|
|
||||||
line -1605 // the 1605 lines from line 1 to line 1605
|
|
||||||
line 1600- // all lines from line 1600 to the end of the file
|
|
||||||
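The line-range grammar and the examples above can be modelled in a few lines of Python. This is only an illustrative sketch: ``parse_line_range`` and ``matches`` are hypothetical helper names, and ``None`` is used here to stand for an open end of the range (first or last line of the file).

```python
def _lineno(text):
    """Convert a lineno token (unsigned-int) to int, rejecting anything else."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("not an unsigned integer: %r" % text)
    return int(text)


def parse_line_range(text):
    """Parse a line-range into a (first, last) tuple.

    None means "open": the first line of the file on the left side,
    the last line of the file on the right side.
    """
    if not text or any(ch.isspace() for ch in text):
        raise ValueError("line-range must be non-empty and contain no spaces")
    if "-" not in text:
        number = _lineno(text)
        return (number, number)          # single line: exact match
    first, _, last = text.partition("-")
    if not first and not last:
        raise ValueError("a bare '-' is not a line-range")
    return (_lineno(first) if first else None,
            _lineno(last) if last else None)


def matches(line_range, lineno):
    """True if lineno falls inside the (inclusive) range."""
    first, last = line_range
    return ((first is None or lineno >= first) and
            (last is None or lineno <= last))


exact = parse_line_range("1603")
closed = parse_line_range("1600-1605")
```

As in the grammar note above, ``"1-30"`` parses but ``"1 - 30"`` is rejected because a line-range cannot contain spaces.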
|
|
||||||
The flags specification comprises a change operation followed
|
|
||||||
by one or more flag characters. The change operation is one
|
|
||||||
of the characters:
|
|
||||||
|
|
||||||
- remove the given flags
|
|
||||||
+ add the given flags
|
|
||||||
= set the flags to the given flags
|
|
||||||
|
|
||||||
The flags are:
|
|
||||||
|
|
||||||
p enables the pr_debug() callsite.
|
|
||||||
f Include the function name in the printed message
|
|
||||||
l Include line number in the printed message
|
|
||||||
m Include module name in the printed message
|
|
||||||
t Include thread ID in messages not generated from interrupt context
|
|
||||||
_ No flags are set. (Or'd with others on input)
|
|
||||||
|
|
||||||
For print_hex_dump_debug() and print_hex_dump_bytes(), only the 'p' flag
|
|
||||||
has meaning; other flags are ignored.
|
|
||||||
|
|
||||||
For display, the flags are preceded by '='
|
|
||||||
(mnemonic: what the flags are currently equal to).
|
|
||||||
|
|
||||||
Note the regexp ^[-+=][flmpt_]+$ matches a flags specification.
|
|
||||||
To clear all flags at once, use "=_" or "-flmpt".
|
|
||||||
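The flags specification described above can be sketched in Python to show how the three change operations behave. This is a model for illustration only, not kernel code: ``apply_flags_spec`` is a hypothetical name, flags are represented as a Python set of single characters, and the validation pattern is the regexp quoted above.

```python
import re

# Same pattern as the regexp quoted in the text: ^[-+=][flmpt_]+$
FLAGS_SPEC = re.compile(r'[-+=][flmpt_]+')


def apply_flags_spec(current, spec):
    """Return the flag set that results from applying spec to current.

    current: iterable of flag characters that are set now
    spec:    change operation ('-', '+' or '=') followed by flag characters
    """
    if FLAGS_SPEC.fullmatch(spec) is None:
        raise ValueError("invalid flags specification: %r" % spec)
    operation = spec[0]
    given = set(spec[1:]) - {"_"}        # '_' stands for "no flags"
    if operation == "+":
        return set(current) | given      # add the given flags
    if operation == "-":
        return set(current) - given      # remove the given flags
    return given                         # '=': set exactly the given flags


flags = apply_flags_spec(set(), "+pt")   # {'p', 't'}
flags = apply_flags_spec(flags, "=pmf")  # {'p', 'm', 'f'}
```

Applying several specifications in sequence, as in the usage lines, also models the "last one has the final say" ordering that applies when the same callsites are matched by successive queries.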
|
|
||||||
|
|
||||||
Debug messages during Boot Process
|
|
||||||
==================================
|
|
||||||
|
|
||||||
To activate debug messages for core code and built-in modules during
|
|
||||||
the boot process, even before userspace and debugfs exist, use
|
|
||||||
dyndbg="QUERY", module.dyndbg="QUERY", or ddebug_query="QUERY"
|
|
||||||
(ddebug_query is obsoleted by dyndbg, and deprecated). QUERY follows
|
|
||||||
the syntax described above, but must not exceed 1023 characters. Your
|
|
||||||
bootloader may impose lower limits.
|
|
||||||
|
|
||||||
These dyndbg params are processed just after the ddebug tables are
|
|
||||||
processed, as part of the arch_initcall. Thus you can enable debug
|
|
||||||
messages in all code run after this arch_initcall via this boot
|
|
||||||
parameter.
|
|
||||||
|
|
||||||
On an x86 system for example ACPI enablement is a subsys_initcall and
|
|
||||||
dyndbg="file ec.c +p"
|
|
||||||
will show early Embedded Controller transactions during ACPI setup if
|
|
||||||
your machine (typically a laptop) has an Embedded Controller.
|
|
||||||
PCI (or other devices) initialization also is a hot candidate for using
|
|
||||||
this boot parameter for debugging purposes.
|
|
||||||
|
|
||||||
If foo module is not built-in, foo.dyndbg will still be processed at
|
|
||||||
boot time, without effect, but will be reprocessed when module is
|
|
||||||
loaded later. ddebug_query= and bare dyndbg= are only processed at
|
|
||||||
boot.
|
|
||||||
|
|
||||||
|
|
||||||
Debug Messages at Module Initialization Time
|
|
||||||
============================================
|
|
||||||
|
|
||||||
When "modprobe foo" is called, modprobe scans /proc/cmdline for
|
|
||||||
foo.params, strips "foo.", and passes them to the kernel along with
|
|
||||||
params given in modprobe args or /etc/modprobe.d/*.conf files,
|
|
||||||
in the following order:
|
|
||||||
|
|
||||||
1. # parameters given via /etc/modprobe.d/*.conf
|
|
||||||
options foo dyndbg=+pt
|
|
||||||
options foo dyndbg # defaults to +p
|
|
||||||
|
|
||||||
2. # foo.dyndbg as given in boot args, "foo." is stripped and passed
|
|
||||||
foo.dyndbg=" func bar +p; func buz +mp"
|
|
||||||
|
|
||||||
3. # args to modprobe
|
|
||||||
modprobe foo dyndbg==pmf # override previous settings
|
|
||||||
|
|
||||||
These dyndbg queries are applied in order, with last having final say.
|
|
||||||
This allows boot args to override or modify those from /etc/modprobe.d
|
|
||||||
(sensible, since 1 is system wide, 2 is kernel or boot specific), and
|
|
||||||
modprobe args to override both.
|
|
||||||
|
|
||||||
In the foo.dyndbg="QUERY" form, the query must exclude "module foo".
|
|
||||||
"foo" is extracted from the param-name, and applied to each query in
|
|
||||||
"QUERY", and only 1 match-spec of each type is allowed.
|
|
||||||
|
|
||||||
The dyndbg option is a "fake" module parameter, which means:
|
|
||||||
|
|
||||||
- modules do not need to define it explicitly
|
|
||||||
- every module gets it tacitly, whether they use pr_debug or not
|
|
||||||
- it doesn't appear in /sys/module/$module/parameters/
|
|
||||||
To see it, grep the control file, or inspect /proc/cmdline.
|
|
||||||
|
|
||||||
For CONFIG_DYNAMIC_DEBUG kernels, any settings given at boot-time (or
|
|
||||||
enabled by -DDEBUG flag during compilation) can be disabled later via
|
|
||||||
the debugfs interface if the debug messages are no longer needed:
|
|
||||||
|
|
||||||
echo "module module_name -p" > <debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
Examples
|
|
||||||
========
|
|
||||||
|
|
||||||
// enable the message at line 1603 of file svcsock.c
|
|
||||||
nullarbor:~ # echo -n 'file svcsock.c line 1603 +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// enable all the messages in file svcsock.c
|
|
||||||
nullarbor:~ # echo -n 'file svcsock.c +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// enable all the messages in the NFS server module
|
|
||||||
nullarbor:~ # echo -n 'module nfsd +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// enable all 12 messages in the function svc_process()
|
|
||||||
nullarbor:~ # echo -n 'func svc_process +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// disable all 12 messages in the function svc_process()
|
|
||||||
nullarbor:~ # echo -n 'func svc_process -p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// enable messages for NFS calls READ, READLINK, READDIR and READDIR+.
|
|
||||||
nullarbor:~ # echo -n 'format "nfsd: READ" +p' >
|
|
||||||
<debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// enable messages in files of which the paths include string "usb"
|
|
||||||
nullarbor:~ # echo -n 'file *usb* +p' > <debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// enable all messages
|
|
||||||
nullarbor:~ # echo -n '+p' > <debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// add module, function to all enabled messages
|
|
||||||
nullarbor:~ # echo -n '+mf' > <debugfs>/dynamic_debug/control
|
|
||||||
|
|
||||||
// boot-args example, with newlines and comments for readability
|
|
||||||
Kernel command line: ...
|
|
||||||
// see what's going on in dyndbg=value processing
|
|
||||||
dynamic_debug.verbose=1
|
|
||||||
// enable pr_debugs in 2 builtins, #cmt is stripped
|
|
||||||
dyndbg="module params +p #cmt ; module sys +p"
|
|
||||||
// enable pr_debugs in 2 functions in a module loaded later
|
|
||||||
pc87360.dyndbg="func pc87360_init_device +p; func pc87360_find +p"
|
|
||||||
@@ -19,7 +19,7 @@ forever.
|
|||||||
|
|
||||||
This should not cause problems for anybody, since everybody using a
|
This should not cause problems for anybody, since everybody using a
|
||||||
2.1.x kernel should have updated their C library to a suitable version
|
2.1.x kernel should have updated their C library to a suitable version
|
||||||
anyway (see the file "Documentation/Changes".)
|
anyway (see the file "Documentation/process/changes.rst".)
|
||||||
|
|
||||||
1.2 Allow Mixed Locks Again
|
1.2 Allow Mixed Locks Again
|
||||||
---------------------------
|
---------------------------
|
||||||
|
|||||||
@@ -11,7 +11,7 @@ Updated 2006 by Horms <horms@verge.net.au>
|
|||||||
In order to use a diskless system, such as an X-terminal or printer server
|
In order to use a diskless system, such as an X-terminal or printer server
|
||||||
for example, it is necessary for the root filesystem to be present on a
|
for example, it is necessary for the root filesystem to be present on a
|
||||||
non-disk device. This may be an initramfs (see Documentation/filesystems/
|
non-disk device. This may be an initramfs (see Documentation/filesystems/
|
||||||
ramfs-rootfs-initramfs.txt), a ramdisk (see Documentation/initrd.txt) or a
|
ramfs-rootfs-initramfs.txt), a ramdisk (see Documentation/admin-guide/initrd.rst) or a
|
||||||
filesystem mounted via NFS. The following text describes how to use NFS
|
filesystem mounted via NFS. The following text describes how to use NFS
|
||||||
for the root filesystem. For the rest of this text 'client' means the
|
for the root filesystem. For the rest of this text 'client' means the
|
||||||
diskless system, and 'server' means the NFS server.
|
diskless system, and 'server' means the NFS server.
|
||||||
@@ -284,7 +284,7 @@ They depend on various facilities being available:
|
|||||||
"kernel <relative-path-below /tftpboot>". The nfsroot parameters
|
"kernel <relative-path-below /tftpboot>". The nfsroot parameters
|
||||||
are passed to the kernel by adding them to the "append" line.
|
are passed to the kernel by adding them to the "append" line.
|
||||||
It is common to use serial console in conjunction with pxelinux,
|
It is common to use serial console in conjunction with pxelinux,
|
||||||
see Documentation/serial-console.txt for more information.
|
see Documentation/admin-guide/serial-console.rst for more information.
|
||||||
|
|
||||||
For more information on isolinux, including how to create bootdisks
|
For more information on isolinux, including how to create bootdisks
|
||||||
for prebuilt kernels, see http://syslinux.zytor.com/
|
for prebuilt kernels, see http://syslinux.zytor.com/
|
||||||
|
|||||||
@@ -1307,7 +1307,16 @@ second). The meanings of the columns are as follows, from left to right:
|
|||||||
- nice: niced processes executing in user mode
|
- nice: niced processes executing in user mode
|
||||||
- system: processes executing in kernel mode
|
- system: processes executing in kernel mode
|
||||||
- idle: twiddling thumbs
|
- idle: twiddling thumbs
|
||||||
- iowait: waiting for I/O to complete
|
- iowait: In a word, iowait stands for waiting for I/O to complete. But there
|
||||||
|
are several problems:
|
||||||
|
1. Cpu will not wait for I/O to complete, iowait is the time that a task is
|
||||||
|
waiting for I/O to complete. When cpu goes into idle state for
|
||||||
|
outstanding task io, another task will be scheduled on this CPU.
|
||||||
|
2. In a multi-core CPU, the task waiting for I/O to complete is not running
|
||||||
|
on any CPU, so the iowait of each CPU is difficult to calculate.
|
||||||
|
3. The value of iowait field in /proc/stat will decrease in certain
|
||||||
|
conditions.
|
||||||
|
So, the iowait is not reliable by reading from /proc/stat.
|
||||||
- irq: servicing interrupts
|
- irq: servicing interrupts
|
||||||
- softirq: servicing softirqs
|
- softirq: servicing softirqs
|
||||||
- steal: involuntary wait
|
- steal: involuntary wait
|
||||||
|
|||||||
@@ -119,7 +119,7 @@ separated by spaces:
|
|||||||
253:0 Device with major 253 and minor 0
|
253:0 Device with major 253 and minor 0
|
||||||
|
|
||||||
Authoritative information can be found in
|
Authoritative information can be found in
|
||||||
"Documentation/kernel-parameters.txt".
|
"Documentation/admin-guide/kernel-parameters.rst".
|
||||||
|
|
||||||
(*) rw
|
(*) rw
|
||||||
|
|
||||||
|
|||||||
@@ -3,3 +3,8 @@
|
|||||||
project = "Linux GPU Driver Developer's Guide"
|
project = "Linux GPU Driver Developer's Guide"
|
||||||
|
|
||||||
tags.add("subproject")
|
tags.add("subproject")
|
||||||
|
|
||||||
|
latex_documents = [
|
||||||
|
('index', 'gpu.tex', project,
|
||||||
|
'The kernel development community', 'manual'),
|
||||||
|
]
|
||||||
|
|||||||
@@ -215,7 +215,7 @@ Connectors state change detection must be cleanup up with a call to
|
|||||||
Output discovery and initialization example
|
Output discovery and initialization example
|
||||||
-------------------------------------------
|
-------------------------------------------
|
||||||
|
|
||||||
::
|
.. code-block:: c
|
||||||
|
|
||||||
void intel_crt_init(struct drm_device *dev)
|
void intel_crt_init(struct drm_device *dev)
|
||||||
{
|
{
|
||||||
|
|||||||
Some files were not shown because too many files have changed in this diff.