Emergency boot
S
=> single
1 => runlevel 1 (same as Single)
init=/bin/bash
text (remove " splash" or set splash=off)
vga=ask / normal / extended
Ctrl-Alt-F1 exits window manager
ESC enables verbose mode (or use "debug text nosplash" instead of "quiet splash")
for upstart output, add --verbose
after login:
sudo start lightdm
startx
sudo /etc/init.d/lightdm start
runlevel
telinit 1 (telinit S)
telinit 5
Make sure journaling is enabled on mounted file systems:
tune2fs -l /dev/… | grep -i features
(enable journal, better on unmounted file system:) tune2fs -j /dev/…
tune2fs -o^journal_data,journal_data_ordered,^journal_data_writeback /dev/...
(can also specify in /etc/fstab mount options: data=ordered)
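The journaling check above can be wrapped in a small filter. This is a minimal sketch, assuming the usual "Filesystem features:" line printed by tune2fs -l; the function name has_journal is hypothetical.

```sh
#!/bin/sh
# has_journal: read "tune2fs -l" output on stdin and succeed if the
# "Filesystem features:" line lists has_journal.
has_journal() {
    grep -i '^Filesystem features:' | grep -qw 'has_journal'
}
```

Possible use: tune2fs -l /dev/… | has_journal && echo "journal enabled"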
Alt-SysRq keys:
b | reboot without sync or unmounts |
s | sync |
c | crash |
u | try to remount all file systems as read-only |
0-9 | set log level |
d | show all held locks |
g | enter kgdb |
h | help |
l | show stack backtrace for all active CPUs |
m | dump memory info |
p | dump registers |
q | dump armed hrtimers |
t | dump tasks |
echo 1 >/proc/sys/kernel/sysrq
echo c >/proc/sysrq-trigger
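The value in /proc/sys/kernel/sysrq is a bitmask: 1 enables all functions, 0 disables them, and larger values enable individual groups. A minimal sketch for checking a group, assuming the group bits from the kernel sysrq documentation (for example 16 = sync, 32 = remount read-only, 128 = reboot/poweroff); the helper name sysrq_allows is hypothetical.

```sh
#!/bin/sh
# sysrq_allows MASK BIT: succeed if the sysrq bitmask MASK enables group BIT.
# MASK=1 means "all functions enabled"; MASK=0 means "all disabled".
sysrq_allows() {
    mask=$1
    bit=$2
    [ "$mask" -eq 1 ] && return 0
    [ $(( mask & bit )) -ne 0 ]
}
```

Possible use: sysrq_allows "$(cat /proc/sys/kernel/sysrq)" 16 || echo "Alt-SysRq-s (sync) is disabled"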
Taking kernel dump on bare-metal machine:
See http://www.kernel.org/doc/Documentation/kdump/kdump.txt
Also see http://seife.kernalert.de/blog/wp-content/uploads/linuxtag-2012-crashdump-seyfried.pdf
Development or production kernel
"my"
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_HIGHMEM4G=y (for 32-bit systems,
optional to allocate beyond 4GB)
CONFIG_SYSFS=y
CONFIG_PROC_VMCORE=y
CONFIG_DEBUG_INFO=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y (if want to use main kernel as dump kernel)
Dump kernel "my-kdump" (unless double-using main kernel)
same as above, plus:
CONFIG_SMP=n *OR* specify in /etc/sysconfig/kdump or /etc/default/kdump-tools
KDUMP_COMMANDLINE_APPEND="maxcpus=1 " (note space at the end!)
CONFIG_LOCALVERSION="-kdump" (alternatively, can add in Makefile)
on x64, if CONFIG_RELOCATABLE=y, can use vmlinuz, otherwise kernel should be uncompressed, i.e. not vmlinuz-id, but vmlinux-id-kdump
apt-get install [kdump] kexec-tools
Or:
Download
http://kernel.org/pub/linux/utils/kernel/kexec/kexec-tools.tar.gz
tar xvpzf kexec-tools.tar.gz
cd kexec-tools-VERSION
./configure
make
make install
SUSE | edit /etc/sysconfig/kdump | load once: rckdump start |
openSUSE | edit /etc/sysconfig/kdump | load once: sudo systemctl start kdump.service |
RedHat / Fedora | install package crash; edit /etc/kdump.conf | load once: service kdump start |
Debian (Ubuntu) | edit: /etc/default/kdump-tools; tool: kdump-config | see in the text below |
Fedora | edit /etc/kdump.conf; tool: system-config-kdump | load once: systemctl start kdump.service; load always: systemctl enable kdump.service |
On SUSE: /etc/sysconfig/kdump (see man 5 kdump)
KDUMP_KERNELVER="id-kdump"
KDUMP_COMMANDLINE="ro ... "
(see grub.cfg)
KDUMP_COMMANDLINE_APPEND="maxcpus=1 " (note space at the end!)
KEXEC_OPTIONS="..." (precede by --args-linux, e.g. "--args-linux … ", space at end)
KDUMP_RUNLEVEL="1"
KDUMP_SAVEDIR="file:///var/crash"
(alternatively, can save to block device via KDUMP_DUMPDEV="/dev/…")
KDUMP_DUMPLEVEL="0" (all memory) or "16" (no free pages) or "17" (no free, no zero pages), see man 8 makedumpfile
KDUMP_DUMPFORMAT="ELF"
On openSUSE:
zypper install -y kexec-tools yast2-kdump
tool: YaST -> Kernel Kdump
or edit manually /etc/sysconfig/kdump
KDUMP_COPY_KERNEL="no"
KDUMP_DUMPLEVEL="31"
load always: sudo systemctl enable kdump.service
load once: sudo systemctl start kdump.service
openSUSE kdump is broken: it tries to write to /kdump/mnt0/var/crash, but /kdump/mnt0 does not get mounted,
so if it says "Dump too large. Aborting. Check KDUMP_FREE_DISK_SIZE", save the dump manually:
mount /dev/sda1 /mnt
makedumpfile --dump-dmesg [-x /path/vmlinux] /proc/vmcore /mnt/var/crash/qqq.txt
cp /proc/vmcore /mnt/var/crash/qqq.dmp
umount /mnt
reboot -f
or instead of cp:
makedumpfile -l -d 31 /proc/vmcore /mnt/var/crash/qqq.dmp
compression: -c => zlib, -l => lzo, -p => snappy
-f => overwrite
-x /path/vmlinux => path to vmlinux
--message-level 23 (or 31; default is 7)
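The dump level passed to makedumpfile -d (and KDUMP_DUMPLEVEL) is a bitmask of page types to leave out, which is why 16 means "no free pages" and 17 means "no free, no zero pages". A minimal sketch that decodes a level, assuming the bit assignments from man 8 makedumpfile (1 = zero, 2 = cache, 4 = cache private, 8 = user data, 16 = free); the helper name dumplevel_excludes is hypothetical.

```sh
#!/bin/sh
# dumplevel_excludes LEVEL: list the page types that
# "makedumpfile -d LEVEL" leaves out of the dump.
dumplevel_excludes() {
    level=$1
    out=""
    [ $(( level & 1 )) -ne 0 ]  && out="$out zero"
    [ $(( level & 2 )) -ne 0 ]  && out="$out cache"
    [ $(( level & 4 )) -ne 0 ]  && out="$out cache-private"
    [ $(( level & 8 )) -ne 0 ]  && out="$out user"
    [ $(( level & 16 )) -ne 0 ] && out="$out free"
    echo "${out# }"
}
```

For example, dumplevel_excludes 31 lists all five page types, so -d 31 produces the smallest dump.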
On Fedora:
RedHat Kernel Crash Dump Guide: html, html-onepage, PDF
dnf install -y kexec-tools system-config-kdump
in grub.cfg, add: crashkernel=256M
in /etc/kdump.conf:
path /var/crash
ext4 LABEL=...
core_collector makedumpfile -l --message-level 7 -d 14 (to also exclude free and zero pages: -d 31)
in /etc/sysconfig/kdump:
KDUMP_KERNELVER="id-kdump"
KDUMP_COMMANDLINE="ro ... " (see grub.cfg)
KDUMP_COMMANDLINE_APPEND="maxcpus=1 " (note space at the end!)
systemctl enable kdump.service
reboot
OR: system-config-kdump
reboot
warning:
grub2-mkconfig/update-grub will overwrite edited grub.cfg
In GRUB, add to "my" linux boot kernel parameters:
crashkernel=128M (if base is unspecified, will place automatically)
crashkernel=128M@16M (base = @16 MB)
crashkernel=128M,high (allow auto-allocation above 4GB, ok for x64)
with bigger initrd, may want to increase to 256M or even 512M
holds compressed initrd + uncompressed initrd (while unpacking) + unpacked kernel
Optionally test that dump kernel can be executed manually:
kexec --load /boot/vmlinux-id-kdump \
  --initrd=/boot/initrd-id-kdump.img \
  --command-line="`cat /proc/cmdline`" \
  --append="1 maxcpus=1 reset_devices "
(note space at the end of --append)
kexec --exec
Optionally test that dump kernel can be executed by crash:
kexec --load-panic (… same arguments as for --load …)
sysctl kernel.sysrq=1
Alt-SysRq-c
Mark kdump to start automatically on boot:
RedHat: chkconfig kdump on
SUSE: chkconfig boot.kdump on
Debian: sysv-rc-config dump on
or start it manually:
start kdump
/etc/init.d/kdump start
(OpenSUSE:) /etc/init.d/boot.kdump { start | stop | status }
systemctl start kdump.service
(Ubuntu:) sudo kdump-config load
Writing dump manually from dump kernel:
cp /proc/vmcore dump-file
On Ubuntu:
apt-get install kdump-tools linux-crashdump
edit /etc/default/kdump-tools (see man kdump-tools, man makedumpfile)
USE_KDUMP=1
KDUMP_KERNEL="/boot/..."
KDUMP_INITRD="/boot/..."
DEBUG_KERNEL="/.../vmlinux..."
KDUMP_COREDIR="/var/crash"
KDUMP_CMDLINE_APPEND="1 irqpoll maxcpus=1 nousb reset_devices text " (space at end!)
or KDUMP_CMDLINE_APPEND="irqpoll maxcpus=1 nousb"
MAKEDUMP_ARGS="-d 31 -E -x /.../vmlinux"
(-d 16 = no free pages; -d 17 = no free, no zero pages)
#KDUMP_EXEC_ARGS=""
also see /usr/share/doc/kdump-tools/README*
and http://wiki.ubuntu.com/Kernel/CrashdumpRecipe
Remember to set "crashkernel" in GRUB
kdump-config test
kdump-config show
kdump-config status
kdump-config load <==== required for dump to work
to test:
echo 1 >/proc/sys/kernel/sysrq
echo c >/proc/sysrq-trigger
http://www.admin-magazine.com/Articles/Analyzing-Kernel-Crash-Dumps
http://doc.opensuse.org/documentation/html/openSUSE_122/opensuse-tuning/cha.tuning.kexec.html
http://www.kernel.org/doc/Documentation/kdump/kdump.txt
http://ftp.suse.com/pub/people/tiwai/kdump-training/kdump-training.pdf
http://lse.sourceforge.net/kdump/documentation/ols2oo5-kdump-paper.pdf
http://people.redhat.com/anderson/crash_whitepaper
if makedumpfile says "the kernel version is not supported":
makedumpfile -v
· borrow makedumpfile from another dist (but check that ldd makedumpfile links fine to deployed DSOs)
· or download and build the latest version of makedumpfile:
git clone http://git.code.sf.net/p/makedumpfile/code
make sure LATEST_VERSION in makedumpfile.h is high enough
edit: LIBS = -ldw -lbz2 -lebl -ldl -lelf -lz -llzma
install packages:
elfutils elfutils-devel libbz2-1 libz1 glibc-devel-static libbz2-devel zlib-devel-static lzo-devel-static
Fedora: lzo lzo-devel snappy snappy-devel
Ubuntu: liblzo2-2 liblzo2-dev libsnappy libsnappy-dev
openSUSE: liblzo2-2 lzo-devel libsnappy1 snappy-devel
openSUSE 13.2 package libbz2-devel misses static version of libbz2.a
may borrow it from another distribution
but still missing liblzma, that also has no static equivalent under openSUSE
so build with dynamic linking (not default static)
both Fedora and openSUSE build it this way anyway
make USELZO=on LINKTYPE=dynamic
make install
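When makedumpfile is borrowed from another distribution, the ldd check mentioned above can be reduced to a list of unresolved libraries. A minimal sketch, assuming the usual "libname => not found" lines in ldd output; the function name missing_libs is hypothetical.

```sh
#!/bin/sh
# missing_libs: read "ldd" output on stdin and print the names of
# shared objects that the dynamic loader could not resolve.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Possible use: ldd ./makedumpfile | missing_libs (empty output means every DSO was found).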
Sample /etc/grub.d/12_devel (chmod a+x)
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries. Simply type the
# menu entries you want to add after this comment. Be careful not to change
# the 'exec tail' line above.
menuentry 'Linux 3.14.3-iocp-0 text' {
    recordfail
    gfxmode $linux_gfx_mode
    insmod gzio
    insmod part_msdos
    insmod ext2
    set root='hd0,msdos7'
    if [ x$feature_platform_search_hint = xy ]; then
        search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos7 --hint-efi=hd0,msdos7 --hint-baremetal=ahci0,msdos7 7178466b-a71b-41e3-b632-aaade102262e
    else
        search --no-floppy --fs-uuid --set=root 7178466b-a71b-41e3-b632-aaade102262e
    fi
    echo 'Loading Linux 3.14.3-iocp-0 ...'
    linux /boot/vmlinuz-3.14.3-iocp-0 root=UUID=7178466b-a71b-41e3-b632-aaade102262e ro quiet text fbcon=scrollback:1024k sysrq_always_enabled crashkernel=384M-2G:64M,2G-:128M $vt_handoff
    echo 'Loading initial ramdisk ...'
    initrd /boot/initrd.img-3.14.3-iocp-0
}
menuentry 'Linux 3.14.3-iocp-0 text-devel' {
    recordfail
    gfxmode $linux_gfx_mode
    insmod gzio
    insmod part_msdos
    insmod ext2
    set root='hd0,msdos7'
    if [ x$feature_platform_search_hint = xy ]; then
        search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos7 --hint-efi=hd0,msdos7 --hint-baremetal=ahci0,msdos7 7178466b-a71b-41e3-b632-aaade102262e
    else
        search --no-floppy --fs-uuid --set=root 7178466b-a71b-41e3-b632-aaade102262e
    fi
    echo 'Loading Linux 3.14.3-iocp-0 ...'
    linux /boot/vmlinuz-3.14.3-iocp-0 root=UUID=7178466b-a71b-41e3-b632-aaade102262e ro quiet text fbcon=scrollback:1024k sysrq_always_enabled log_buf_len=1M crashkernel=384M-2G:64M,2G-:128M $vt_handoff
    echo 'Loading initial ramdisk ...'
    initrd /boot/initrd.img-3.14.3-iocp-0
}
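The crashkernel=384M-2G:64M,2G-:128M form used in these entries selects the reservation by the amount of installed RAM: each comma-separated entry is range:size, and the first range containing the RAM size wins. A minimal sketch of that selection, assuming sizes are given only with M or G suffixes, no @offset part, and a range end that is exclusive; the function names are hypothetical.

```sh
#!/bin/sh
# to_mb SIZE: convert a size such as 64M or 2G to MiB (empty stays empty).
to_mb() {
    case "$1" in
        *G) echo $(( ${1%G} * 1024 )) ;;
        *M) echo "${1%M}" ;;
        *)  echo "$1" ;;
    esac
}

# crashkernel_for RAM_MB SPEC: print the reservation in MiB that SPEC
# selects for a machine with RAM_MB of memory, or 0 if no range matches.
crashkernel_for() {
    ram=$1
    old_ifs=$IFS
    IFS=,
    set -- $2            # split SPEC into "range:size" entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}
        size=${entry#*:}
        start=$(to_mb "${range%%-*}")
        end=$(to_mb "${range#*-}")      # empty means "no upper limit"
        if [ "$ram" -ge "$start" ] && { [ -z "$end" ] || [ "$ram" -lt "$end" ]; }; then
            to_mb "$size"
            return 0
        fi
    done
    echo 0
}
```

With the specification from the sample entries, a 1024 MB machine gets 64 MB, a 4096 MB machine gets 128 MB, and anything below 384 MB gets no reservation.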
Taking kernel dump on KVM:
virsh dump domain domain.crash --bypass-cache --crash --verbose --memory-only
crash vmlinux domain.crash
--crash => shut off virtual machine after taking dump
--reset => reset virtual machine after taking dump
--live => minimize interruption
Taking kernel dump on Xen:
xl dump-core dom-id domain.crash
Taking kernel dump on VMWare:
suspend VM
vmss2core -N6 file.vmss
Taking kernel dump with NMI dump switch:
sysctl -w kernel.unknown_nmi_panic=1
Dump analysis:
# gdb-kdump
gdb vmlinux dump-file
Ubuntu: apt-get install crash
openSUSE: zypper install crash
RHEL/Fedora: dnf install crash
building:
latest: http://github.com/crash-utility
Fedora: dnf install -y crash-devel lzo lzo-devel snappy snappy-devel
Ubuntu: apt-get install liblzo2-2 liblzo2-dev libsnappy libsnappy-dev
openSUSE: zypper install -y crash-doc crash-devel liblzo2-2 lzo-devel libsnappy1 snappy-devel
(may also need to install packages required for kernel build environment)
git clone http://github.com/crash-utility/crash
cd crash
make lzo
sudo make install
crash /path/vmlinux-id /var/crash/…/vmcore
man crash
http://www.dedoimedo.com/computers/crash-analyze.html
http://www.dedoimedo.com/computers/crash-book.html#download
http://people.redhat.com/anderson/crash_whitepaper
http://magazine.redhat.com/2007/08/15/a-quick-overview-of-linux-kernel-crash-dump-analysis
crash can also be used on the running kernel
crash /path/vmlinux-id
crash commands:
<script // input from file
cmd >file // output to a file
cmd | pipe-cmd // pipe output to shell command
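Input from a file makes it possible to collect a first-look report non-interactively. A minimal sketch that writes such a command file; the choice of commands and the function name write_crash_script are illustrative, and it assumes the installed crash accepts -i to run a command file at startup.

```sh
#!/bin/sh
# write_crash_script FILE: write a small batch of crash commands to FILE.
# The trailing "exit" ends the session after the commands have run.
write_crash_script() {
    cat > "$1" <<'EOF'
sys
log
bt
ps
exit
EOF
}
```

Possible use: write_crash_script /tmp/first-look.crash, then crash -i /tmp/first-look.crash /path/vmlinux-id /var/crash/…/vmcore >report.txt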
gdb <command> |
crash is a wrapper around gdb, so gdb commands can be used |
help |
|
log |
dmesg from crashed kernel |
sys |
crash info kernel .config data syscall table |
mach mach -m mach -c |
machine info (format
options: -x, -d), incl. stack
locations … memory map |
bt |
backtrace options: -l => file + line
number -sx => symbol + offset -f => expand each
stack frame |
sym pipe |
find symbol |
whatis <type|symbol> |
display type or symbol definition |
eval <expr> |
evaluate expression |
p <expr> p
<symbol>[:cpuspec] px = p -x pd = p -d |
print value percpu var, cpuspec can be "all" or "1-3, 5"
or ":" for current task cpu formats: -x, -d userspace: -u |
[struct] <type> <addr>, [struct] <type>.<member> <addr>, <member-addr>, *<struct-typename> <addr> |
dump "struct type" located at <addr>; dump structure by enclosed member address; options: -r dump raw structure data, -o dump field offsets (metadata, not data); after <addr> can also add: [:cpuspec] a[ll] or "1-3, 5" or ":" for current task cpu; [count] dump array of structures at <addr> |
set 0xtaskp |
set context to task |
dis <expr> [instr-count] |
disassembly |
ascii … |
translate hex string to ascii, or display ascii table |
rd <addr> [count] |
dump memory; formats: -x -8 -16 -32 -64 -a -d -D; -p => physical address |
wr <variable> <value> |
write memory |
search … |
search memory for value |
tree … |
display rb-tree or radix tree |
list … |
dump list |
waitq <expr> |
display wait queue |
runq [-t | -m | -g] |
display run queue |
timer |
timers |
mod |
display modules load module symbols |
ps [-k | -u | -G] -g -a -t -l
-m -C cpus [<pid> | 0xtaskp | command-string] |
task list, -k =>
kernel only, -u => user only, -G => thread group leaders only … by thread group … kernel stack pointer … times |
task [<pid> |
0xtaskp] |
display task_struct and thread_info (format:
-x, -d) … members to display |
sig
[<pid> | 0xtaskp] |
signal handlers |
vtop [-c <pid> |
0xtaskp] <addr> ptov <addr> pte <value> |
translate virtual address to physical, display PTE dump <value> as PTE |
vm [<pid> |
0xtaskp] -m -v vm -f vm_flags |
vm map for the task
(format: -d, -x) … translate virt to phys … all mm_struct for the task … all vm_area_struct for the task |
kmem
-i
-z
-s [-S]
-g
-v |
kernel memory usage slab data regions allocated by vmalloc |
irq [-u] irq -b irq -d irq -s [-c cpu] |
irq data (-u => in-use
only) bottom halves x86 IDT CPU affinity for in-use irqs statistics, (cpu can be “all”, “1-3,5”) |
dev, dev -p, dev -d |
device info, I/O port usage, disk I/O statistics |
mount … |
display mounted file systems and data (inodes, dentries, etc.) |
swap |
swap info |
fuser <path> |
<inode> |
display tasks using specified file or socket |
files files <pid> |
0xtaskp files -d <dentry> |
files opened by current task … by task <pid>
or taskp |
net … |
network data (net devices, IP addresses, sockets, ARP cache) |
ipcs … |
semaphores, shared mem segments, IPC message queues |
foreach <cmd> |
perform command on all tasks … all user tasks … all kernel threads … active threads on each CPU … for all tasks in <state>: RU, IN, UN, ST, ZO, TR, SW, DE … for each task with matching name, e.g. can use bt, vm, task, files, net, set, ps, sig, vtop |
repeat [-seconds] <cmd> |
repeat command every “seconds” |
extend <dso.so> |
load command extension library |
"Oops: 0002 [#1]
PREEMPT SMP "
0002 is flags:
flag | bit # | bit = 0 | bit = 1 |
PF_PROT | 0 | page not mapped | access violation |
PF_WRITE | 1 | read or execute | write |
PF_USER | 2 | kernel mode | user mode |
PF_RSVD | 3 | | |
PF_INSTR | 4 | data access | instruction fetch |
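The flag table can be applied mechanically to the code printed after "Oops:". A minimal sketch, assuming the code is the four-digit hex value from the oops line; the function name decode_oops is hypothetical, and bit 3 (PF_RSVD) is skipped because the table gives no wording for it.

```sh
#!/bin/sh
# decode_oops CODE: decode the hex error code from an x86 "Oops: CODE" line
# using the PF_PROT, PF_WRITE, PF_USER and PF_INSTR bits.
decode_oops() {
    code=$(( 0x$1 ))
    if [ $(( code & 1 )) -ne 0 ];  then prot="access violation";  else prot="page not mapped"; fi
    if [ $(( code & 2 )) -ne 0 ];  then rw="write";               else rw="read or execute";   fi
    if [ $(( code & 4 )) -ne 0 ];  then mode="user mode";         else mode="kernel mode";     fi
    if [ $(( code & 16 )) -ne 0 ]; then kind="instruction fetch"; else kind="data access";     fi
    echo "$prot, $rw, $mode, $kind"
}
```

For the sample line, decode_oops 0002 reports a kernel-mode write to a page that is not mapped.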
Disassembling executable image:
objdump -S module.ko
Debugging with GDB: http://sourceware.org/gdb/onlinedocs/gdb
Using cscope:
make cscope
cscope -R (for help type ?)
GUI is kscope
Using GNU GLOBAL
apt-get install global
cd ....
gtags [dbpath]
htags [-d dbpath] [-t title] [outdir] // default outdir is "./HTML"
no web-based search, only command line
Using LXR:
perl -v
ctags --version => must be "exuberant" // apt-get install exuberant-ctags
need database
need web server
...
Similar: OpenGrok
Debugger backends and connections:
iron machine, Linux kgdb: COM port, USB debug port
virtual machine, Linux kgdb: virtual COM port (mapped to socket, named pipe etc.)
virtual machine, built-in gdb stub: VMWare, KVM/QEMU, Virtual Box etc.
in the latter case: no KDB, and limited task info access (each CPU is a gdb thread, instead of each task being a gdb thread)
Debugger front-ends:
command-line
GDB
Eclipse
SlickEdit
Code::Blocks
Visual Studio + Visual Kernel
KGDB set up:
http://kgdb.geeksofpune.in/downloads/kgdb-2/kgdb_full_2.2.pdf
http://kgdb.geeksofpune.in/downloads/kgdb-2/kgdbquickstart-2.3.pdf
Serial line checking:
setserial -g /dev/ttyS*
CSA and CSB:
ttyS0 0x3f8 irq=4 // on-board
ttyS4 0xf0a0 irq=19 // Intel AMT
CSB:
ttyS5 0xd010 irq=16 // PCI (right RS232 looking from front)
ttyS6 0xd000 irq=117 // PCI (left RS232 looking from front)
CSX:
ttyS0 // left DB9 looking from front
ttyS1 // right DB9 looking from front
stty -a -F /dev/ttyS0
stty sane -F /dev/ttyS0
stty raw -F /dev/ttyS0
stty 115200 raw -echo pass8 -cstopb -ignpar -ixon -ixoff -F /dev/ttyS0
minicom -s [-c on] => setup
minicom
Intel AMT serial-over-LAN (does not expose client end as COM device, but
mappable via amtterm)
https://software.intel.com/en-us/articles/using-intel-amt-serial-over-lan-to-the-fullest
https://www.youtube.com/watch?v=MqiCjopZVR4
https://software.intel.com/en-us/articles/intel-active-management-technology-start-here-guide-intel-amt-9
Early kernel debugging:
earlyprintk=vga ekgdboc=kbd
Break to debugger at boot:
kgdbwait
Debug via serial ("over console"):
kgdboc=ttyS0,115200 [kgdbwait]
https://www.kernel.org/pub/linux/kernel/people/jwessel/kgdb
https://www.kernel.org/pub/linux/kernel/people/jwessel/kdb
Enable KGDB after boot:
echo "ttyS0,115200" >/sys/module/kgdboc/parameters/kgdboc
echo "1" >/proc/sys/kernel/sysrq
Alt-SysRq-g to enter debugger (or echo g >/proc/sysrq-trigger)
to disable: echo "" >/sys/module/kgdboc/parameters/kgdboc
To send printk messages to gdb stream:
boot parameter: kgdbcon
dynamic: echo 1 >/sys/module/kgdb/parameters/kgdb_use_con
(then must re-configure kgdb IO driver)
Debug via USB (requires USB debug port):
kgdbdbgp=0 // use USB debug port on EHCI USB controller
0, as it is probed via PCI
http://www.coreboot.org/EHCI_Debug_Port
http://www.coreboot.org/EHCI_Gadget_Debug
Most motherboards (including Intel DQ77MK and ASUS CS-B) do not route debug port to motherboard connectors, either internal or external
Debugging via Ethernet (kgdboe) is not in the main tree, abandoned
Debugging via Firewire (1394) was always for memory read/write only, and is no longer working or maintained
Debugging with (k)gdb in Visual Studio: see below
Debugging with VMWare (or KVM/QEMU):
· Via virtual serial port mapped to TCP socket
· Or via built-in gdb server
Note: knows nothing about guest OS, only one thread per CPU, very much like in-circuit debugger. No KDB commands.
http://www.evilfingers.com/publications/research_RU/unix-kernel-debug.pdf
KDB commands in KGDB:
(gdb) monitor lsmod // list loaded modules
(gdb) monitor ps // active processes
(gdb) monitor ps A // all processes
(gdb) monitor summary // kernel version and memory usage
(gdb) monitor dmesg // syslog buffer
(gdb) monitor bt // stack of current process using dump_stack (better than gdb bt)
more in http://dev.man-online.org/man1/kdb
md <vaddr> | Display Memory Contents, also mdWcN, e.g. md8c1 |
mdr <vaddr> <bytes> | Display Raw Memory |
mdp <paddr> <bytes> | Display Physical Memory |
mds <vaddr> | Display Memory Symbolically |
mm <vaddr> <contents> | Modify Memory Contents |
rd | Display Registers |
rm <reg> <contents> | Modify Registers |
ef <vaddr> | Display exception frame |
bt [<vaddr>] | Stack traceback |
btp <pid> | Display stack for process <pid> |
bta [D|R|S|T|C|Z|E|U|I|M|A] | Backtrace all processes matching state flag |
btc | Backtrace current process on each cpu |
btt <vaddr> | Backtrace process given its struct task address |
cpu <cpunum> | Display CPUs or switch to new cpu |
pid <pidnum> | Switch to another task |
ps [<flags>|A] | Display active task list |
per_cpu <sym> [<bytes>] [<cpu>] | Display per_cpu variables |
bp [<vaddr>] | Set/Display breakpoints |
bl [<vaddr>] | Display breakpoints |
bph [<vaddr>] [datar [length]|dataw [length]] | Set hw brk |
bc <bpnum> | Clear Breakpoint |
be <bpnum> | Enable Breakpoint |
bd <bpnum> | Disable Breakpoint |
ss | Single Step |
go [<vaddr>] | Continue Execution |
kill <-signal> <pid> | Send a signal to a process |
reboot | Reboot the machine immediately |
kgdb | Enter kgdb mode |
ftdump [skip_#lines] [cpu] | Dump ftrace log |
env | Show environment variables |
set ... | Set environment variables |
dumpcommon | Common kdb debugging |
dumpall | First line debugging |
dumpcpu | Same as dumpall but only tasks on cpus |
sample: LINUX_ROOT/samples/kdb
kdb_register(...), kdb_unregister(...)
Build parameters
CONFIG_KGDB
CONFIG_HAVE_ARCH_KGDB
CONFIG_KGDB_SERIAL_CONSOLE
CONFIG_FRAME_POINTER
#CONFIG_DEBUG_RODATA // turn off,
otherwise breakpoint may not work, x86 has only 4 debug registers
(DEBUG_RODATA limitation had been fixed in 3.0, see "Fix DEBUG_RODATA
limitation using text_poke")
For KDB additionally:
CONFIG_KGDB_KDB
CONFIG_KDB_KEYBOARD
Overall:
CONFIG_DEBUG_INFO=y
On host side:
apt-get install python python-libs python-dev
gedit ~/.gdbinit
enter:
set auto-load safe-path /
or: add-auto-load-safe-path /path/to/vmlinux-gdb.py
gdb ./vmlinux
(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0
When debugging, it is better to avoid step/next while interrupts are enabled.
Rather go from breakpoint to breakpoint (as in "continue").
Otherwise the debugger may land in an interrupt handler or a stack switch, e.g. schedule().
Specifically, avoid stepping over functions that may
cause scheduling, such as:
queue_work()
spin_lock()
spin_unlock()
Programmatic break-to-debugger:
breakpoint() ==>
asm( " int $3");
kgdb_breakpoint()
Debugging loadable module (for kgdb
before 1.9):
see http://kgdb.geeksofpune.in/initmodule.htm
# insmod mymodule.ko
# cd /sys/module/mymodule/sections/
# cat .data
# cat .rodata
# cat .bss
# cat .text
(gdb) add-symbol-file <module_name> <.text-address> \
-s .bss <address> \
-s .rodata <address> \
-s .data <address>
other similar sections:
.sdata .sbss
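The section addresses can be collected into a ready-to-paste gdb command instead of being copied by hand. A minimal sketch, assuming the sections directory layout shown above (one file per section holding its load address); the function name gen_add_symbol_file is hypothetical, and sections that are absent are skipped.

```sh
#!/bin/sh
# gen_add_symbol_file MODULE_FILE SECTIONS_DIR: print an add-symbol-file
# command built from the per-section address files in SECTIONS_DIR
# (normally /sys/module/<name>/sections, readable by root on the target).
gen_add_symbol_file() {
    ko=$1
    dir=$2
    cmd="add-symbol-file $ko $(cat "$dir/.text")"
    for sec in .data .rodata .bss; do
        if [ -r "$dir/$sec" ]; then
            cmd="$cmd -s $sec $(cat "$dir/$sec")"
        fi
    done
    echo "$cmd"
}
```

Possible use on the target: gen_add_symbol_file mymodule.ko /sys/module/mymodule/sections, then paste the printed line at the (gdb) prompt on the host.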
Opening current kernel in gdb:
gdb ./vmlinux /proc/kcore
to refresh: (gdb) core-file /proc/kcore
Disassembling function (helps with asm debug and function calls inlined
from header files):
disasfun.sh vmlinux do_fork
#!/bin/sh
#
# from kgdb
#
if [ $# != 2 ]
then
    echo disasfun objectfile functionname
    exit 1
fi
OBJFILE=$1
FUNNAME=$2
# look up the function's start address ($1) and size ($5) in the symbol table
ADDRSZ=$(objdump -t "$OBJFILE" | gawk -v fn="$FUNNAME" '$3 == "F" && $6 == fn { printf("%s %s", $1, $5) }')
ADDR=$(echo $ADDRSZ | gawk '{ printf("%s", $1) }')
SIZE=$(echo $ADDRSZ | gawk '{ printf("%s", $2) }')
if [ -z "$ADDR" -o -z "$SIZE" ]
then
    echo Cannot find address or size of function $FUNNAME
    exit 2
fi
# disassemble only the address range covered by the function
objdump -S "$OBJFILE" --start-address=0x$ADDR --stop-address=$((0x$ADDR + 0x$SIZE))
Using Code::Blocks
sudo chown sergey /dev/ttyS0
cd /src-path
create empty project
Settings -> Debugging
watch = all
evaluate expression under cursor = on
do *not* run the debuggee = on
debugger initialization commands:
file /src-path/vmlinux
dir /src-path
set remotebaud 115200
target remote /dev/ttyS0
Has gdb console window
Using SlickEdit
sudo chown sergey /dev/ttyS0
Debug -> Attach Debugger -> Attach to Remote Process
File = /path/vmlinux
Device = /dev/ttyS0
Speed = 115200
Standard SlickEdit does not have gdb console window, but there are third-party plugins
Using Eclipse (Kepler, Mars):
Run -> Debug Configurations -> new config "C/C++ Remote Application"
Main tab: at the bottom, select Using GDB (DSF) Manual Remote Launcher
enter path to vmlinux
disable autobuild
Debugger tab: stop on startup = off
debugger command: ~/.gdbinit
Debugger/Connection subtab: set device/speed, e.g. /dev/ttyS0, 115200
click "Debug"
for GDB console: use Console view (but first Run->Suspend, or use yellow "Pause" button)
Slow due to reading task/thread list
Using VMWare (built-in gdb server)
VMWare creates gdb port, works like in-circuit emulator (OS-agnostic), thread ≡ CPU, no KDB commands.
Does not know about modules and does not auto-load module symbols/relocations, however can do "lx-symbols".
Add to VMX file:
debugStub.listen.guest64 = "TRUE" # enable listener for 64 bit guest
debugStub.listen.guest64.remote = "TRUE" # allow remote connection
#debugStub.port.guest64 = "8864" # listen on specified port (default: 8864)
#monitor.debugOnStartGuest64 = "TRUE" # pause on power-up
debugStub.listen.guest32 = "TRUE" # enable listener for 32 bit guest
debugStub.listen.guest32.remote = "TRUE" # allow remote connection
#debugStub.port.guest32 = "8832" # listen on specified port (default: 8832)
#monitor.debugOnStartGuest32 = "TRUE" # pause on power-up
debugStub.hideBreakpoints = "TRUE" # Set hardware breakpoints -- limited by HW
See in vmware.log when VM starts up
gdb ./vmlinux
(gdb) set arch i386
(gdb) target remote localhost:8832
(gdb) set arch i386:x86_64
(gdb) target remote somehost:8864
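The architecture and port have to match the guest bitness, so the two gdb commands can be generated as a pair. A minimal sketch, assuming the default debug stub ports from the VMX settings above (8832 and 8864); the function name vmware_gdb_cmds is hypothetical.

```sh
#!/bin/sh
# vmware_gdb_cmds BITS [HOST]: print the gdb commands for attaching to the
# VMWare debug stub of a 32- or 64-bit guest on its default port.
vmware_gdb_cmds() {
    bits=$1
    host=${2:-localhost}
    case "$bits" in
        32) arch=i386;        port=8832 ;;
        64) arch=i386:x86_64; port=8864 ;;
        *)  echo "usage: vmware_gdb_cmds 32|64 [host]" >&2; return 1 ;;
    esac
    printf 'set arch %s\ntarget remote %s:%s\n' "$arch" "$host" "$port"
}
```

Possible use: vmware_gdb_cmds 64 somehost >/tmp/vmw.gdb, then gdb -x /tmp/vmw.gdb ./vmlinux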
Replay debugging on Linux: https://www.vmware.com/pdf/ws7_replay_linux_technote.pdf
Using KDbg
apt-get install kdbg
cd /usr/share/kde4/apps/kdbg/icons/hicolor/22x22/actions
mv pulse.mng pulse.mng-bak
kdbg -r /dev/ttyS0 /path/vmlinux
· does not correctly read gdbinit
· no net connection option
· no gdb console
· no locals view
· all buttons are disabled, cannot stop or continue
Using
Visual Studio (connect to built-in gdb server in VMWare/KVM or kgdb)
http://sysprogs.com/VisualKernel/tutorials/quickdebug
in target:
apt-get install openssh-server
(edit /etc/ssh/sshd_config to change port number if
desired)
in Visual Studio (to copy sources/vmlinux):
Debug -> Quick Debug Linux Kernel
Machine to Debug
Host name
User name
Password
Setup public key
[Create]
Machine with GDB = (local computer)
Kernel symbols for debugging: Install...
import manually built kernel
source directory
use included pre-built gdb
index kernel modules in a custom directory
in Visual Studio (when sources/vmlinux are local):
Debug -> Quick Debug Linux Kernel
Machine to Debug
Host name
User name
Password
Setup public key
[Create]
Machine with GDB = (local computer)
Kernel symbols for debugging: Install...
specify kernel symbols and sources manually
kernel file with symbols -> vmlinux
source directory
use included pre-built gdb
index kernel modules in a custom directory
Importing existing module into VisualKernel project: http://sysprogs.com/VisualKernel/tutorials/import
Managing symbols:
http://sysprogs.com/VisualKernel/documentation/kernelsymbols
Documentation: http://sysprogs.com/VisualKernel/tutorials
Using
KVM/QEMU (built-in gdb server)
https://help.ubuntu.com/community/KVM/Installation
https://help.ubuntu.com/community/KVM/Networking
https://help.ubuntu.com/community/KVM/CreateGuests
https://help.ubuntu.com/community/KVM/Managing
https://help.ubuntu.com/community/KVM/Access
apt-get install qemu-kvm libvirt-bin ubuntu-vm-builder bridge-utils python-virtinst
apt-get install virtinst or python-virtinst
apt-get install qemu-system spice-client python-spice-client-gtk
adduser sergey libvirtd
virsh -c qemu:///system list
apt-get install ubuntu-virt-server => kvm libvirt-bin openssh-server
from X11 console:
apt-get install ubuntu-virt-mgmt => virt-manager python-vm-builder and virt-viewer
apt-get install virt-manager
(reboot)
Optional:
Define private bridge (guest-only networking):
add to /etc/network/interfaces:
----- add begin -----
auto privatebr0
iface privatebr0 inet static
address 10.48.51.135
netmask 255.255.254.0
pre-up brctl addbr privatebr0
post-down brctl delbr privatebr0
----- add end -----
/etc/init.d/networking restart
echo '<network> <name>privatenet</name> <bridge name="privatebr0" /> </network>' >> /tmp/net.xml
virsh net-define /tmp/net.xml
virsh net-start privatenet
virsh net-autostart privatenet
virsh net-list --all
Now can add a virtual network device to any of the guests,
select "privatenet" as its source device
and this guest will be connected to the virtual switch.
Create VM with virt-manager
or:
sudo virt-install --connect qemu:///system
--name=ub64z
--ram=3072
--vcpus=2
--cdrom=ubuntu-14.04.2-desktop-amd64.iso
--os-type linux
--os-variant ubuntutrusty
--disk path=ub64z.qcow2,size=30,cache=writethrough
--graphics sdl
--graphics vnc,password=xxxx --noautoconsole
--network=network:privatenet
--network=network:default # NAT
May want to select spice/QXL as video
export EDITOR=`which nano`
export EDITOR=`which gedit`
List VMs:
virsh --connect qemu:///system
virsh# list --all
virsh# edit ub64z
change: <domain type='kvm'> => <domain type='kvm'
xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
and add:
<qemu:commandline>
<qemu:arg value='-gdb'/>
<qemu:arg value='tcp::1234'/>
</qemu:commandline>
virsh qemu-monitor-command ub64z --hmp help
virsh qemu-monitor-command ub64z --hmp info
virsh qemu-monitor-command ub64z --hmp info [ pci | qtree | mtree | usb | network | cpus | registers ]
virsh qemu-monitor-command ub64z --hmp { x | xp } -> memory access
virsh qemu-monitor-command ub64z --hmp { i | o } -> I/O port access
gdb ./vmlinux
(gdb) target remote localhost:1234
Raw QEMU debugging is similar,
but raw QEMU is much slower than KVM/QEMU (which uses HVM).
Using
KVM/QEMU (via virtual serial port)
create serial port: TCP network console (as the server)
bind to 0.0.0.0 if connecting debugger from remote host
bind to 127.0.0.1 if connecting debugger locally
on VM startup, can see port creation in /var/log/libvirt/qemu/<guest>.log
to test do
"nc localhost 4555" on the host
"cat >/dev/ttySx" in the guest
should see output on the host
can now connect (gdb) target remote localhost:4555
Tunnel GDB connection to /dev/ttyS0 via socket
./stty.sh
netcat -l 4444 </dev/ttyS0 >/dev/ttyS0
after this, can connect GDB to port 4444 instead of /dev/ttyS0
alternative: ser2net
Tunnel GDB connection via AMT serial-over-LAN
./amt-redir.sh
after this, can connect GDB to port 4444
(gdb) set debug remote 1
(gdb) set remotelogfile q.log
(gdb) target remote localhost:4444
caution: AMT drops the connection after a few minutes of inactivity
#!/bin/bash
#set -x
HOST=csb
XPWD="xxxxxx"
netcat_pid=
amtterm_pid=
if [ "$1" != "" ]; then
    HOST=$1
fi
if [ "$2" != "" ]; then
    XPWD=$2
fi
do_cleanup()
{
    # kill whichever helper processes were actually started
    if [ "$netcat_pid" != "" ]; then
        kill -9 $netcat_pid
    fi
    if [ "$amtterm_pid" != "" ]; then
        kill -9 $amtterm_pid
    fi
    cd /tmp
    rm -rf /tmp/amt-$$
}
on_control_c()
{
echo -en "\n*** Exiting ***\n"
do_cleanup
exit 0
}
trap on_control_c SIGINT
mkdir /tmp/amt-$$
cd /tmp/amt-$$
mkfifo xin
mkfifo xout
netcat -l 4444 >xin <xout &
netcat_pid=$!
echo netcat: $netcat_pid
amtterm -q -u admin -p "$XPWD" $HOST <xin >xout &
amtterm_pid=$!
echo amtterm: $amtterm_pid
wait $netcat_pid
wait $amtterm_pid
do_cleanup
https://software.intel.com/en-us/articles/using-intel-amt-serial-over-lan-to-the-fullest
https://www.youtube.com/watch?v=MqiCjopZVR4
https://software.intel.com/en-us/articles/intel-active-management-technology-start-here-guide-intel-amt-9
https://software.intel.com/en-us/articles/intel-active-management-technology-downloads
https://software.intel.com/en-us/forums/topic/297602
http://linux.die.net/man/1/amtterm
http://linux.die.net/man/1/amttool
http://linux.die.net/man/7/amt-howto
https://www.kraxel.org/cgit/amtterm/tree/amtterm.c
GDB:
CONFIG_GDB_SCRIPTS - generate vmlinux_xxx.py
disable CONFIG_DEBUG_INFO_REDUCED
enable CONFIG_FRAME_POINTER
lx-symbols |
load symbols for all modules and vmlinux |
lx-dmesg |
display kernel log buffer |
lx-lsmod |
list loaded modules |
p/x $lx_current().pid |
access current task (on a particular CPU) |
p/x $lx_per_cpu("runqueues").nr_running |
access per-cpu variable value (for a particular CPU) |
p/x $lx_task_by_pid(1) |
print task for specified pid |
p/x $lx_thread_info($lx_task_by_pid(1)) |
print thread_info structure for the task |
set $next = $lx_per_cpu("hrtimer_bases").clock_base[0].active.next |
container_of |
Compile with optimization disabled:
In Makefile:
CFLAGS_xxx.o = -O0
ccflags-y := -O0
subdir-ccflags-y := -O0
Debug printing
__schedule_bug(task)
debug_show_held_locks(task)
print_modules()
print_irqtrace_events(task)
dump_stack()
Hardware breakpoints to intercept data access/code execution:
sample: LINUX_ROOT/samples/hw_breakpoint
register_wide_hw_breakpoint(...), unregister_wide_hw_breakpoint(...)
Ftrace
http://lwn.net/Articles/365835
http://lwn.net/Articles/366796
Documentation/trace/ftrace.txt
Documentation/trace/ftrace-design.txt
gcc -pg causes each function to call mcount().
CONFIG_TRACING
CONFIG_FUNCTION_TRACER
CONFIG_FUNCTION_GRAPH_TRACER
CONFIG_STACK_TRACER
CONFIG_DYNAMIC_FTRACE ==> convert all mcount() calls to NOPs at boot
# cd /sys/kernel/debug/tracing
# cat available_tracers
>>> function_graph function sched_switch nop
# echo 100 > buffer_size_kb // buffer size per CPU
# echo 1 > /proc/sys/kernel/ftrace_dump_on_oops
function tracer (trace kernel functions on entry)
# echo function > current_tracer // call tracing
# cat trace // repeatable read
# cat trace_pipe // once read, traces are removed
# cat per_cpu/cpu2/trace // per-CPU sub-selection
function graph tracer (trace kernel functions on entry and exit)
# echo function_graph > current_tracer // nested call tracing, with offset
# echo nop > current_tracer
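The steps above can be wrapped in a small helper. This is only a sketch: capture_trace is a hypothetical name, and the tracing directory is passed in as an argument rather than hard-coded.

```shell
# Capture a trace with the given tracer while a command runs.
# $1: tracing directory (normally /sys/kernel/debug/tracing)
# $2: tracer name from available_tracers (function, function_graph, ...)
# remaining arguments: the command to run while tracing is on
capture_trace() {
    tracing_dir=$1
    tracer=$2
    shift 2
    echo "$tracer" > "$tracing_dir/current_tracer"
    echo 1 > "$tracing_dir/tracing_on"
    "$@" > /dev/null                           # keep command output out of the trace dump
    echo 0 > "$tracing_dir/tracing_on"
    cat "$tracing_dir/trace"                   # repeatable read of the buffer
    echo nop > "$tracing_dir/current_tracer"   # switch the tracer back off
}

# Usage (as root):
#   capture_trace /sys/kernel/debug/tracing function_graph sleep 1 > /tmp/trace.txt
```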
stack tracer
# echo 1 > /proc/sys/kernel/stack_tracer_enabled // record max stack usage and deepest stack trace
# cat stack_max_size
# echo 0 > stack_max_size // reset
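A sketch of a stack-usage measurement around one command. The function name stack_usage is hypothetical. Both paths are arguments, and the stack_trace file (deepest trace recorded) is assumed to sit next to stack_max_size in the tracing directory.

```shell
# Report the largest kernel stack usage seen while a command runs.
# $1: tracing directory (normally /sys/kernel/debug/tracing)
# $2: enable file (normally /proc/sys/kernel/stack_tracer_enabled)
# remaining arguments: the command to run while the stack tracer is enabled
stack_usage() {
    tracing_dir=$1
    enable_file=$2
    shift 2
    echo 0 > "$tracing_dir/stack_max_size"     # reset the recorded maximum
    echo 1 > "$enable_file"
    "$@" > /dev/null
    echo 0 > "$enable_file"
    echo "max stack bytes: $(cat "$tracing_dir/stack_max_size")"
    cat "$tracing_dir/stack_trace"             # deepest stack trace recorded
}

# Usage (as root):
#   stack_usage /sys/kernel/debug/tracing /proc/sys/kernel/stack_tracer_enabled find / -xdev
```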
irqsoff (max time interrupts are disabled)
preemptoff (max time preemption is disabled)
preemptirqsoff (max time preemption and/or interrupts are disabled)
wakeup (max latency for the highest-priority task to be scheduled after it is woken up)
wakeup_rt (same as wakeup, but for RT tasks only)
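The latency tracers record their maximum in tracing_max_latency (microseconds). A minimal sketch of measuring it around a workload, assuming the chosen tracer is built into the kernel and listed in available_tracers; measure_max_latency is a hypothetical name:

```shell
# Print the maximum latency recorded by a latency tracer while a command runs.
# $1: tracing directory (normally /sys/kernel/debug/tracing)
# $2: tracer name (irqsoff, preemptoff, preemptirqsoff, wakeup, wakeup_rt)
# remaining arguments: the workload to run
measure_max_latency() {
    tracing_dir=$1
    tracer=$2
    shift 2
    echo "$tracer" > "$tracing_dir/current_tracer"
    echo 0 > "$tracing_dir/tracing_max_latency"   # reset the recorded maximum
    "$@" > /dev/null
    cat "$tracing_dir/tracing_max_latency"        # value is in microseconds
    echo nop > "$tracing_dir/current_tracer"
}

# Usage (as root):
#   measure_max_latency /sys/kernel/debug/tracing irqsoff sleep 5
```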
# echo 0 > tracing_on // stop the trace
# echo 1 > tracing_on // start the trace
trace_printk("foo %d bar %p", bar->foo, bar) // enter data into trace from kernel space
Write to /sys/kernel/debug/tracing/trace_marker // enter data into trace from userspace
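Writing a marker from userspace is a plain file write. A sketch, with trace_mark as a hypothetical helper name and the tracing directory passed as an argument:

```shell
# Insert a text marker into the trace buffer from userspace.
# $1: tracing directory (normally /sys/kernel/debug/tracing)
# remaining arguments: the marker text
trace_mark() {
    tracing_dir=$1
    shift
    echo "$*" > "$tracing_dir/trace_marker"
}

# Usage (as root):
#   trace_mark /sys/kernel/debug/tracing "test start"
#   ./run_test
#   trace_mark /sys/kernel/debug/tracing "test end"
```

The markers show up inline in the trace output, which makes it easier to line up kernel events with phases of a userspace test.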
Kernel functions:
void tracing_on(void);
void tracing_off(void);
if (tracing_is_on()) ...
tracing_snapshot_alloc(); // allocate snapshot buffer (if not yet allocated) and take snapshot of current trace buffer; returns void
tracing_snapshot(); // take snapshot of current trace buffer (snapshot buffer must be pre-allocated)
void tracing_start(void);
void tracing_stop(void);
void ftrace_off_permanent(void);
ftrace_dump(DUMP_ALL);
boot parameters:
ftrace=[tracer] | Start the specified tracer as early as possible |
ftrace_dump_on_oops[=orig_cpu] | Dump the trace buffers on oops: all CPUs by default, or only the CPU that triggered the oops with =orig_cpu. |
trace_buf_size=nn[KMG] | Set tracing buffer size |
ftrace_filter=[function-list] | Limit the functions traced by the function tracer. function-list is a comma-separated list of functions. Can be changed at run time by the set_ftrace_filter file. |
ftrace_notrace=[function-list] | Do not trace the functions specified in function-list. |
ftrace_graph_filter=[function-list] | Limit the top-level caller functions traced by the function graph tracer. Can be changed at run time by the set_graph_function file. |
stacktrace | Enable the stack tracer on boot up |
stacktrace_filter=[function-list] | Limit the functions that the stack tracer will trace. Can be changed at run time by the stack_trace_filter file. |
trace_event=[event-list] | Trace events. See Documentation/trace/events.txt |
trace_options=[option-list] | Also: /sys/kernel/debug/tracing/trace_options |
traceoff_on_warning | Disable tracing when a warning is hit, to avoid flooding the trace with warning code. Changed via sysctl kernel.traceoff_on_warning. |
alloc_snapshot | Allocate the ftrace snapshot buffer on boot up when the main buffer is allocated. Handy when debugging if you need tracing_snapshot() during boot and cannot use tracing_snapshot_alloc(), which must be called where GFP_KERNEL allocations are allowed. |
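Several of these parameters are usually combined on one kernel command line. A sketch of assembling such a fragment; ftrace_boot_args is a hypothetical helper, and the function names in the usage line are only examples:

```shell
# Build a kernel command-line fragment for early function tracing.
# $1: tracer name, $2: buffer size (nn[KMG])
# remaining arguments (optional): functions for ftrace_filter
ftrace_boot_args() {
    tracer=$1
    buf_size=$2
    shift 2
    args="ftrace=$tracer trace_buf_size=$buf_size"
    if [ "$#" -gt 0 ]; then
        # ftrace_filter takes a comma-separated list
        filter=$(IFS=,; echo "$*")
        args="$args ftrace_filter=$filter"
    fi
    echo "$args"
}

# Usage:
#   ftrace_boot_args function 1M schedule vfs_read
```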
See more in the tracing/profiling document.
pstore / ramoops
https://www.kernel.org/doc/Documentation/ramoops.txt
persistently stores dmesg and/or ftrace buffer in memory that survives a crash:
· ACPI ERST table (if available on the system)
· UEFI variables
· reserved RAM above kernel allocation (boot with reduced "mem" parameter)
To check for ACPI ERST availability:
apt-get install acpidump
acpidump -s | grep ERST
To enable EFI store use (UEFI boot only):
CONFIG_EFI_VARS_PSTORE=y
caution: can overflow EFI vars space
To store in reserved RAM above max address:
mem=16G ramoops.mem_address=0x400000000 ramoops.ecc=1 ramoops.mem_size=0x200000
Do not use the final 1MB of RAM.
CONFIG_PSTORE_RAM=y (or m)
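The mem_address above is the first byte past the "mem=" limit (16G = 0x400000000). A sketch that derives the parameters for other sizes. It assumes physical RAM really extends beyond that address and that the shell has 64-bit arithmetic; ramoops_boot_args is a hypothetical name:

```shell
# Print boot parameters that reserve a ramoops region just above the mem= limit.
# $1: memory left to the kernel, in GiB
# $2: size of the ramoops region, in MiB
ramoops_boot_args() {
    mem_gib=$1
    size_mib=$2
    addr=$((mem_gib * 1024 * 1024 * 1024))   # first byte above the mem= limit
    size=$((size_mib * 1024 * 1024))
    printf 'mem=%dG ramoops.mem_address=0x%x ramoops.ecc=1 ramoops.mem_size=0x%x\n' \
        "$mem_gib" "$addr" "$size"
}

# Usage:
#   ramoops_boot_args 16 2
```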
Check for pstore / ramoops availability:
dmesg | grep Registered | grep "as persistent store backend"
dmesg | grep ramoops
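The same check can extract just the backend name. A sketch that reads kernel log text on stdin; pstore_backend is a hypothetical name, and the match relies on the "Registered ... as persistent store backend" wording used above:

```shell
# Print the name of the registered pstore backend, given kernel log text on stdin.
# Prints nothing if no backend registration line is found.
pstore_backend() {
    sed -n 's/.*Registered \([^ ]*\) as persistent store backend.*/\1/p'
}

# Usage:
#   dmesg | pstore_backend
```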
To enable pstore:
CONFIG_PSTORE=y
CONFIG_PSTORE_CONSOLE=y
CONFIG_PSTORE_PMSG=y
CONFIG_PSTORE_FTRACE=y
Then:
mkdir /pstore
# if using RAM
# caution: boot with mem=16G, or will overwrite reserved memory!!
modprobe ramoops
modprobe ramoops mem_address=0x400000000 ecc=1 mem_size=0x200000
dmesg | tail
more /sys/module/ramoops/parameters/*
mount -t pstore - /pstore -o kmsg_bytes=32000
mount -t debugfs debugfs /sys/kernel/debug (unless already mounted)
echo 1 >/sys/kernel/debug/pstore/record_ftrace
echo function > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_on
echo b >/proc/sysrq-trigger
or: echo c >/proc/sysrq-trigger
....
mount -t pstore - /pstore
ls /pstore
....
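After the reboot, the saved records appear as regular files under the pstore mount point. A sketch for a quick look at each of them; show_pstore_records is a hypothetical name, and the record file names depend on the backend:

```shell
# Print the tail of every record found in a mounted pstore directory.
# $1: pstore mount point (e.g. /pstore)
# $2: number of lines to show per record (default 20)
show_pstore_records() {
    pstore_dir=$1
    lines=${2:-20}
    for record in "$pstore_dir"/*; do
        [ -f "$record" ] || continue     # skip if the directory is empty
        echo "== ${record##*/}"
        tail -n "$lines" "$record"
    done
}

# Usage:
#   show_pstore_records /pstore 50
```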
Debugging techniques:
printk, pr_debug (dynamic)
assertions (BUG_ON)
kgdb / gdb / custom kdb commands
debugfs
crash / kexec
lockdep
kmemleak, SLAB debug flags, /proc/slabinfo
various CONFIG_DEBUG_xxx
SystemTap, ftrace, trace-cmd, kernelshark, lttng
kprobes / jprobes / TRACE_EVENT / hardware breakpoints
profiler (perf/oprofile)
Gprof2dot
FlameGraph
custom QEMU