Emergency boot

S => single
1 => runlevel 1 (same as Single)
init=/bin/bash
text (remove " splash" or set splash=off)
vga=ask / normal / extended

Ctrl-Alt-F1 exits window manager
ESC enables verbose mode (or use "debug text nosplash" instead of "quiet splash")
for upstart output, add --verbose

after login:

sudo start lightdm
startx
sudo /etc/init.d/lightdm start
runlevel
telinit 1 (telinit S)
telinit 5

Make sure journaling is enabled on mounted file systems:

tune2fs -l /dev/… | grep -i features

(enable journal, better on unmounted file system: ) tune2fs -j /dev/…

tune2fs -o^journal_data,journal_data_ordered,^journal_data_writeback /dev/...
(can also specify in /etc/fstab mount options: data=ordered)
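
The feature check above can be wrapped in a small helper. This is only a sketch: it assumes the "Filesystem features:" line format printed by tune2fs -l, and the function names are made up for illustration (they are not part of e2fsprogs).

```shell
#!/bin/sh
# features_have_journal TEXT: succeed if TEXT (the "Filesystem features:"
# line from tune2fs -l) lists has_journal. Hypothetical helper name.
features_have_journal() {
    case " $1 " in
        *" has_journal "*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_journal DEVICE: query a real device (needs root) and report.
check_journal() {
    features=$(tune2fs -l "$1" | grep -i 'features')
    if features_have_journal "$features"; then
        echo "$1: journal enabled"
    else
        echo "$1: no journal (enable with: tune2fs -j $1)"
    fi
}
```

Usage: check_journal /dev/… for each mounted ext3/ext4 file system.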

Alt-SysRq keys:

b	Reboot without sync or unmounts
s	sync
c	crash
u	try to remount all file systems as readonly
0-9	set log level DEBUG=7, INFO=6, NOTICE=5, WARNING=4, ERR=3, CRIT=2, ALERT=1, EMERG=0
d	show all held locks
g	enter kgdb
h	help
l	show stack backtrace for all active CPUs
m	dump memory
p	dump registers
q	dump armed hrtimers
t	dump tasks

echo 1 >/proc/sys/kernel/sysrq
echo c >/proc/sysrq-trigger
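
The same keys can be sent from a script. A minimal sketch: the function name and the SYSRQ_TRIGGER override are assumptions added here so the helper can be dry-run against an ordinary file instead of procfs.

```shell
#!/bin/sh
# sysrq KEY: send one SysRq command key through procfs (needs root).
# SYSRQ_TRIGGER is a hypothetical override for dry runs; by default the
# real /proc/sysrq-trigger is written.
sysrq() {
    trigger=${SYSRQ_TRIGGER:-/proc/sysrq-trigger}
    case $1 in
        [0-9a-z]) printf '%s\n' "$1" > "$trigger" ;;
        *) echo "sysrq: expected one key (0-9, a-z), got '$1'" >&2
           return 2 ;;
    esac
}

# Example (destructive, not run here): sync, remount read-only, reboot:
#   sysrq s; sysrq u; sysrq b
```

Nothing is triggered when the block is sourced; the function only writes when called.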

Taking kernel dump on bare-metal machine:

See http://www.kernel.org/doc/Documentation/kdump/kdump.txt
Also see http://seife.kernalert.de/blog/wp-content/uploads/linuxtag-2012-crashdump-seyfried.pdf

Development or production kernel "my"

CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_HIGHMEM4G=y (for 32-bit systems, optional to allocate beyond 4GB)
CONFIG_SYSFS=y
CONFIG_PROC_VMCORE=y
CONFIG_DEBUG_INFO=y
CONFIG_PHYSICAL_START=0x1000000

CONFIG_RELOCATABLE=y (if want to use main kernel as dump kernel)

Dump kernel "my-kdump" (unless double-using main kernel)

same as above, plus:

CONFIG_SMP=n *OR*
specify in /etc/sysconfig/kdump or /etc/default/kdump-tools
KDUMP_COMMANDLINE_APPEND="maxcpus=1 " (note space at the end!)

CONFIG_LOCALVERSION="-kdump" (alternatively, can add in Makefile)

on x64, if CONFIG_RELOCATABLE=y, can use vmlinuz,
otherwise kernel should be uncompressed, i.e.
not vmlinuz-id, but vmlinux-id-kdump

apt-get install [kdump] kexec-tools

Or:

Download http://kernel.org/pub/linux/utils/kernel/kexec/kexec-tools.tar.gz
tar xvpzf kexec-tools.tar.gz
cd kexec-tools-VERSION
./configure
make
make install

Distribution	Configuration	Loading
SUSE	edit /etc/sysconfig/kdump	load once: rckdump start; load always: chkconfig boot.kdump on
openSUSE	edit /etc/sysconfig/kdump; tool: YaST -> Kernel Kdump	load once: sudo systemctl start kdump.service; load always: sudo systemctl enable kdump.service
RedHat / Fedora	install package crash; edit /etc/kdump.conf; config tool: system-config-kdump	load once: service kdump start; load always: chkconfig kdump on
Debian (Ubuntu)	edit /etc/default/kdump-tools; tool: kdump-config	see in the text below
Fedora	edit /etc/kdump.conf, /etc/sysconfig/kdump; tool: system-config-kdump	load once: systemctl start kdump.service; load always: systemctl enable kdump.service

On SUSE: /etc/sysconfig/kdump (see man 5 kdump)

KDUMP_KERNELVER="id-kdump"
KDUMP_COMMANDLINE="ro ... " (see grub.cfg)
KDUMP_COMMANDLINE_APPEND="maxcpus=1 " (note space at the end!)
KEXEC_OPTIONS="..."   (precede by --args-linux, e.g. "--args-linux … ", space at end)
KDUMP_RUNLEVEL="1"
KDUMP_SAVEDIR="file:///var/crash"
(alternatively, can save to block device via KDUMP_DUMPDEV="/dev/…")
KDUMP_DUMPLEVEL="0" (all memory)
                                           or "16" (no free pages)
                                           or "17" (no free, no zero pages),
                                           see man 8 makedumpfile
KDUMP_DUMPFORMAT="ELF"
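
The dump level is a bit mask of page types to leave out. A small sketch that decodes it, assuming the bit values given in man 8 makedumpfile (1 = zero pages, 2 = cache, 4 = cache private, 8 = user data, 16 = free pages); the function name is made up.

```shell
#!/bin/sh
# decode_dump_level N: list the page types makedumpfile leaves out
# for dump level N (bit values as documented in makedumpfile(8)).
decode_dump_level() {
    level=$1
    excluded=""
    if [ $((level & 1)) -ne 0 ];  then excluded="$excluded zero"; fi
    if [ $((level & 2)) -ne 0 ];  then excluded="$excluded cache"; fi
    if [ $((level & 4)) -ne 0 ];  then excluded="$excluded cache-private"; fi
    if [ $((level & 8)) -ne 0 ];  then excluded="$excluded user"; fi
    if [ $((level & 16)) -ne 0 ]; then excluded="$excluded free"; fi
    if [ -z "$excluded" ]; then
        echo "nothing excluded"
    else
        echo "excluded:$excluded"
    fi
}

decode_dump_level 0    # nothing excluded
decode_dump_level 17   # excluded: zero free
decode_dump_level 31   # excluded: zero cache cache-private user free
```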

On openSUSE:

Documentation: SUSE openSUSE

zypper install -y kexec-tools yast2-kdump

tool: YaST -> Kernel Kdump

or edit manually /etc/sysconfig/kdump

KDUMP_COPY_KERNEL="no"

KDUMP_DUMPLEVEL="31"

load always: sudo systemctl enable kdump.service

load once: sudo systemctl start kdump.service

openSUSE kdump is screwed up, it tries to write to /kdump/mnt0/var/crash,
but /kdump/mnt0 does not get mounted

so if it says "Dump too large. Aborting. Check KDUMP_FREE_DISK_SIZE"

then save dump manually:

mount /dev/sda1 /mnt
makedumpfile --dump-dmesg [-x /path/vmlinux] /proc/vmcore /mnt/var/crash/qqq.txt

cp /proc/vmcore /mnt/var/crash/qqq.dmp

umount /mnt

reboot -f

or instead of cp:

makedumpfile -l -d 31 /proc/vmcore /mnt/var/crash/qqq.dmp

compression: -c => zlib, -l => lzo, -p => snappy

-f => overwrite

-x /path/vmlinux => path to vmlinux

--message-level 23 (or 31; default is 7)

On Fedora:

RedHat Kernel Crash Dump Guide: html, html-onepage, PDF

dnf install -y kexec-tools system-config-kdump
in grub.cfg, add: crashkernel=256M

in /etc/kdump.conf:

path /var/crash
ext4 LABEL=...
core_collector makedumpfile -l --message-level 7 -d 14 (to also exclude zero and free pages: -d 31)

in /etc/sysconfig/kdump:

KDUMP_KERNELVER="id-kdump"
KDUMP_COMMANDLINE="ro ... " (see grub.cfg)
KDUMP_COMMANDLINE_APPEND="maxcpus=1 " (note space at the end!)

systemctl enable kdump.service
reboot

OR: system-config-kdump
reboot

warning: grub2-mkconfig/update-grub will overwrite edited grub.cfg

In GRUB, add to "my" linux boot kernel parameters

crashkernel=128M (if base is unspecified, will place automatically)
crashkernel=128M@16M (base = @16 MB)
crashkernel=128M,high (allow auto-allocation above 4GB, ok for x64)

with bigger initrd, may want to increase to 256M or even 512M
holds compressed initrd + uncompressed initrd (while unpacking) + unpacked kernel
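
To check after reboot that the reservation took effect, something like the following may help. A sketch only: the function names are made up, and it assumes the kernel exposes /sys/kernel/kexec_crash_size and /sys/kernel/kexec_crash_loaded.

```shell
#!/bin/sh
# crashkernel_arg CMDLINE: print the value of the first crashkernel=
# parameter found in a kernel command line; fail if there is none.
crashkernel_arg() {
    for word in $1; do
        case $word in
            crashkernel=*) echo "${word#crashkernel=}"; return 0 ;;
        esac
    done
    return 1
}

# kdump_reservation: report on the running kernel (reads procfs/sysfs).
kdump_reservation() {
    if value=$(crashkernel_arg "$(cat /proc/cmdline)"); then
        echo "crashkernel=$value"
        echo "reserved bytes: $(cat /sys/kernel/kexec_crash_size)"
        echo "dump kernel loaded: $(cat /sys/kernel/kexec_crash_loaded)"
    else
        echo "no crashkernel= parameter on the command line"
    fi
}
```

Usage: run kdump_reservation on the booted system; a reserved size of 0 means the kernel did not set memory aside.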

Optionally test that dump kernel can be executed manually:

kexec --load /boot/vmlinux-id-kdump \
            --initrd=/boot/initrd-id-kdump.img \
            --command-line="`cat /proc/cmdline`" \
            --append="1 maxcpus=1 reset_devices "   (note space at the end)
kexec --exec

Optionally test that dump kernel can be executed by crash:

kexec --load-panic (… same arguments as for --load …)
sysctl kernel.sysrq=1
Alt-SysRq-c

Mark kdump to start automatically on boot:

RedHat:    chkconfig kdump on
SUSE:        chkconfig boot.kdump on
Debian:    sysv-rc-conf kdump on

or start it manually:

                start kdump
                /etc/init.d/kdump   start
                (OpenSUSE:) /etc/init.d/boot.kdump { start | stop | status }
                systemctl   start kdump.service
                (Ubuntu:) sudo kdump-config load

Writing dump manually from dump kernel:

cp /proc/vmcore dump-file

On Ubuntu:

apt-get install kdump-tools linux-crashdump

edit /etc/default/kdump-tools (see man kdump-tools, man makedumpfile)
                USE_KDUMP=1
                KDUMP_KERNEL="/boot/..."
                KDUMP_INITRD="/boot/..."
                DEBUG_KERNEL="/.../vmlinux..."
                KDUMP_COREDIR="/var/crash"
                KDUMP_CMDLINE_APPEND="1 irqpoll maxcpus=1 nousb reset_devices text " (space at end!)
                or KDUMP_CMDLINE_APPEND="irqpoll maxcpus=1 nousb"
                MAKEDUMP_ARGS="-d 31 -E -x /.../vmlinux"
                                (-d 16 = no free pages; -d 17=no free, no zero pages)
                #KDUMP_EXEC_ARGS=""
also see /usr/share/doc/kdump-tools/README*
and http://wiki.ubuntu.com/Kernel/CrashdumpRecipe

Remember to set "crashkernel" in GRUB

kdump-config test
kdump-config show
kdump-config status
kdump-config load <==== required for dump to work

to test:

echo 1 >/proc/sys/kernel/sysrq
echo c >/proc/sysrq-trigger

http://www.admin-magazine.com/Articles/Analyzing-Kernel-Crash-Dumps
http://doc.opensuse.org/documentation/html/openSUSE_122/opensuse-tuning/cha.tuning.kexec.html
http://www.kernel.org/doc/Documentation/kdump/kdump.txt
http://ftp.suse.com/pub/people/tiwai/kdump-training/kdump-training.pdf
http://lse.sourceforge.net/kdump/documentation/ols2oo5-kdump-paper.pdf
http://people.redhat.com/anderson/crash_whitepaper

if makedumpfile says "the kernel version is not supported":

makedumpfile -v

· borrow makedumpfile from another dist (but check that ldd makedumpfile links fine to deployed DSOs)

· or download and build the latest version of makedumpfile:

git clone http://git.code.sf.net/p/makedumpfile/code

make sure LATEST_VERSION in makedumpfile.h is high enough

edit: LIBS = -ldw -lbz2 -lebl -ldl -lelf -lz -llzma

install packages:

elfutils elfutils-devel libbz2-1 libz1 glibc-devel-static libbz2-devel

zlib-devel-static lzo-devel-static

Fedora: lzo lzo-devel snappy snappy-devel

Ubuntu: liblzo2-2 liblzo2-dev libsnappy libsnappy-dev
openSUSE: liblzo2-2 lzo-devel libsnappy1 snappy-devel

opensuse 13.2 package libbz2-devel misses static version of libbz2.a

may borrow it from another distribution

but still missing liblzma, that also has no static equivalent under opensuse

so build with dynamic linking (not default static)

both Fedora and openSUSE build it this way anyway

make USELZO=on LINKTYPE=dynamic

make install

Sample /etc/grub.d/12_devel (chmod a+x)

#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries. Simply type the
# menu entries you want to add after this comment. Be careful not to change
# the 'exec tail' line above.
menuentry 'Linux 3.14.3-iocp-0 text' {
	recordfail
	gfxmode $linux_gfx_mode
	insmod gzio
	insmod part_msdos
	insmod ext2
	set root='hd0,msdos7'
	if [ x$feature_platform_search_hint = xy ]; then
	  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos7 --hint-efi=hd0,msdos7 --hint-baremetal=ahci0,msdos7 7178466b-a71b-41e3-b632-aaade102262e
	else
	  search --no-floppy --fs-uuid --set=root 7178466b-a71b-41e3-b632-aaade102262e
	fi
	echo 'Loading Linux 3.14.3-iocp-0 ...'
	linux /boot/vmlinuz-3.14.3-iocp-0 root=UUID=7178466b-a71b-41e3-b632-aaade102262e ro quiet text fbcon=scrollback:1024k sysrq_always_enabled crashkernel=384M-2G:64M,2G-:128M $vt_handoff
	echo 'Loading initial ramdisk ...'
	initrd /boot/initrd.img-3.14.3-iocp-0
}

menuentry 'Linux 3.14.3-iocp-0 text-devel' {
	recordfail
	gfxmode $linux_gfx_mode
	insmod gzio
	insmod part_msdos
	insmod ext2
	set root='hd0,msdos7'
	if [ x$feature_platform_search_hint = xy ]; then
	  search --no-floppy --fs-uuid --set=root --hint-bios=hd0,msdos7 --hint-efi=hd0,msdos7 --hint-baremetal=ahci0,msdos7 7178466b-a71b-41e3-b632-aaade102262e
	else
	  search --no-floppy --fs-uuid --set=root 7178466b-a71b-41e3-b632-aaade102262e
	fi
	echo 'Loading Linux 3.14.3-iocp-0 ...'
	linux /boot/vmlinuz-3.14.3-iocp-0 root=UUID=7178466b-a71b-41e3-b632-aaade102262e ro quiet text fbcon=scrollback:1024k sysrq_always_enabled log_buf_len=1M crashkernel=384M-2G:64M,2G-:128M $vt_handoff
	echo 'Loading initial ramdisk ...'
	initrd /boot/initrd.img-3.14.3-iocp-0
}

Taking kernel dump on KVM:

virsh dump domain domain.crash --bypass-cache --crash --verbose --memory-only
crash vmlinux domain.crash

--crash => shut off virtual machine after taking dump
--reset => reset virtual machine after taking dump
--live => minimize interruption

Taking kernel dump on Xen:

xl dump-core dom-id domain.crash

Taking kernel dump on VMWare:

suspend VM
vmss2core -N6 file.vmss

Taking kernel dump with NMI dump switch:

sysctl kernel.unknown_nmi_panic=1

Dump analysis:

# gdb-kdump

gdb vmlinux dump-file

Ubuntu:                   apt-get install crash
openSUSE:              zypper install crash
RHEL/Fedora:         dnf install crash

building:

latest: http://github.com/crash-utility

Fedora: dnf install -y crash-devel lzo lzo-devel snappy snappy-devel

Ubuntu: apt-get install liblzo2-2 liblzo2-dev libsnappy libsnappy-dev
openSUSE: zypper install -y crash-doc crash-devel liblzo2-2 lzo-devel libsnappy1 snappy-devel

(may also need to install packages required for kernel build environment)

git clone http://github.com/crash-utility/crash

cd crash

make lzo

sudo make install

crash /path/vmlinux-id /var/crash/…/vmcore
man crash

http://www.dedoimedo.com/computers/crash-analyze.html
http://www.dedoimedo.com/computers/crash-book.html#download
http://people.redhat.com/anderson/crash_whitepaper
http://magazine.redhat.com/2007/08/15/a-quick-overview-of-linux-kernel-crash-dump-analysis

crash can also be used on the running kernel

crash /path/vmlinux-id

crash commands:

gdb <command>	crash is a wrapper around gdb, so gdb commands can be used
help
log	dmesg from crashed kernel
sys sys config sys -c	crash info kernel .config data syscall table
mach mach -m mach -c	machine info (format options: -x, -d), incl. stack locations … memory map … cpuinfo
bt bt -a (all CPUs) bt -c 1,8,9-14 (some CPUs) bt <pid> (task) bt 0x<taskp> (task)	backtrace options: -l => file + line number -sx => symbol + offset -f => expand each stack frame
sym pipe sym -q pipe sym 0x...	find symbol (symbols including substring "pipe") (symbol just below the address)
whatis <type\|symbol>	display type or symbol definition
eval <expr>	evaluate expression
p <expr> p <symbol>[:cpuspec] px = p -x pd = p -d	print value percpu var, cpuspec can be "all" or "1-3, 5" or ":" for current task cpu formats: -x, -d userspace: -u
[struct] <type> <addr> [struct] <type>.<member> <addr> [struct] <type> -l <type>.<member> <member-addr> [union] … same use as [struct] *<struct-typename> <addr>	dump “struct type” located at <addr> dump <member> in “struct type” located at <addr> (can also use t1.m1,m2,m3) dump structure by enclosed member address options: -x -d (hex/decimal) -p expand pointer members -r dump raw structure data -o dump field offsets (metadata, not data) after <addr> can also add: [:cpuspec] a[ll] or "1-3, 5" or ":" for current task cpu [count] dump array of structures at <addr>
set 0xtaskp set -p set <pid> set -c <cpu>	set context to task … to panic-ing task … to task with <pid> … to task currently active on <cpu>
dis <expr> [instr-count]	disassembly
ascii …	translate hex string to ascii, or display ascii table
rd <addr> [count]	dump memory formats: -x -8 -16 -32 -64 -a -d -D -p => physical address
wr <variable> <value>	write memory
search …	search memory for value
tree …	display rb-tree or radix tree
list …	dump list
waitq <expr> waitq struct.member <s-addr>	display wait queue
runq [-t \| -m \| -g]	display run queue
timer timer -r	timers hrtimers
mod mod …	display modules load module symbols
ps [-k \| -u \| -G] -p [pid] -c [pid] -g -s -a -S -t -l -m -C cpus [<pid> \| 0xtaskp \| command-string]	task list, -k => kernel only, -u => user only, -G => thread group leaders only … parental hierarchy … children … by thread group … kernel stack pointer ... command line and environment … summary, task in various states … times
task [<pid> \| 0xtaskp] -R m1,m2,m3	display task_struct and thread_info (format: -x, -d) … members to display
sig [<pid> \| 0xtaskp] sig -l	signal handlers signal numbers
vtop [-c <pid> \| 0xtaskp] <addr> ptov <addr> ptov offset:cpu pte <value>	translate virtual address to physical, display PTE translate physical to virtual translate per-cpu offset to KVA (cpu: a[ll], “1-3,5”, empty for current) dump <value> as PTE
vm [<pid> \| 0xtaskp] -p -m -v vm -f vm_flags	vm map for the task (format: -d, -x) … translate virt to phys … all mm_struct for the task … all vm_area_struct for the task decode flags
kmem -i -z -s [-S] -h -g -n -v -V -f -F -c -C	kernel memory usage zone use slab data huge pages page.flags bits NUMA nodes regions allocated by vmalloc vm_stat table etc. free list and page hash table
irq [-u] irq -b irq -d irq -a irq -s [-c cpu]	irq data (-u => in-use only) bottom halves x86 IDT CPU affinity for in-use irqs statistics, (cpu can be “all”, “1-3,5”)
dev dev -i dev -p dev -d	device info IO port usage PCI data disk IO statistics
mount …	display mounted file systems and data (inodes, dentries, etc.)
swap	swap info
fuser <path> \| <inode> foreach files -R <path>	display tasks using specified file or socket
files files <pid> \| 0xtaskp files -d <dentry>	files opened by current task … by task <pid> or taskp
net …	network data (net devices, IP addresses, sockets, ARP cache)
ipcs …	semaphores, shared mem segments, IPC message queues
foreach <cmd> foreach user <cmd> foreach kernel <cmd> foreach active <cmd> foreach <state> <cmd> foreach <name> <cmd>	perform command on all tasks … all user tasks … all kernel threads … active threads on each CPU … for all tasks in <state>: RU, IN, UN, ST, ZO, TR, SW, DE … for each task with matching name, e.g. …… foreach bash bt, foreach ‘event.*’ task -R state can use bt, vm, task, files, net, set, ps, sig, vtop
repeat [-seconds] <cmd>	repeat command every “seconds”
extend <dso.so>	load command extension library

"Oops: 0002 [#1] PREEMPT SMP "
0002 is flags:

	bit #	bit = 0	bit = 1
PF_PROT	0	page not mapped	access violation
PF_WRITE	1	read or execute	write
PF_USER	2	kernel mode	user mode
PF_RSVD	3	no reserved-bit violation	reserved bit set in a page-table entry
PF_INSTR	4	data access	instruction fetch
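
The table can be applied mechanically. A minimal sketch (function name made up) that takes the hex code printed after "Oops:" and spells out bits 0, 1, 2 and 4 using the wording of the table above:

```shell
#!/bin/sh
# decode_oops_code HEX: explain the x86 page-fault error code printed
# after "Oops:", e.g. 0002. Bit 3 (PF_RSVD) is not reported here.
decode_oops_code() {
    code=$((0x$1))
    if [ $((code & 1)) -ne 0 ];  then prot="access violation"; else prot="page not mapped"; fi
    if [ $((code & 2)) -ne 0 ];  then rw="write"; else rw="read or execute"; fi
    if [ $((code & 4)) -ne 0 ];  then mode="user mode"; else mode="kernel mode"; fi
    if [ $((code & 16)) -ne 0 ]; then kind="instruction fetch"; else kind="data access"; fi
    echo "$prot, $rw, $mode, $kind"
}

decode_oops_code 0002   # page not mapped, write, kernel mode, data access
```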

Disassembling executable image:

objdump -S module.ko

Debugging with GDB: http://sourceware.org/gdb/onlinedocs/gdb

Using cscope:

make cscope
cscope -R (for help type ?)

GUI is kscope

Using GNU GLOBAL

apt-get install global
cd ....
gtags [dbpath]
htags [-d dbpath] [-t title] [outdir] // default outdir is "./HTML"

no web-based search, only command line

Using LXR:

perl -v
ctags --version => must be "exuberant" // apt-get install exuberant-ctags
need database
need web server
...

Similar: OpenGrok

Debugger backends and connections:

iron machine, Linux kgdb:                       COM port, USB debug port
virtual machine, Linux kgdb:                   virtual COM port (mapped to socket, named pipe etc.)
virtual machine, built-in gdb stub:         VMWare, KVM/QEMU, Virtual Box etc.

in the latter case: no KDB, and limited task info access (each CPU is a gdb thread, instead of each task being a gdb thread)

Debugger front-ends:

command-line GDB
Eclipse
SlickEdit
Code::Blocks
Visual Studio + Visual Kernel

KGDB set up:

http://kgdb.geeksofpune.in/downloads/kgdb-2/kgdb_full_2.2.pdf
http://kgdb.geeksofpune.in/downloads/kgdb-2/kgdbquickstart-2.3.pdf

Serial line checking:

setserial -g /dev/ttyS*

CSA and CSB:

ttyS0 0x3f8 irq=4 // on-board
ttyS4 0xf0a0 irq=19 // Intel AMT

CSB:

ttyS5 0xd010 irq=16 // PCI (right RS232 looking from front)
ttyS6 0xd000 irq=117 // PCI (left RS232 looking from front)

CSX:

ttyS0 // left DB9 looking from front

ttyS1 // right DB9 looking from front

stty -a -F /dev/ttyS0

stty sane -F /dev/ttyS0
stty raw -F /dev/ttyS0

stty 115200 raw -echo pass8 -cstopb -ignpar -ixon -ixoff -F /dev/ttyS0

minicom -s [-c on] => setup

minicom

Intel AMT serial-over-LAN (does not expose client end as COM device, but mappable via amtterm)

https://software.intel.com/en-us/articles/using-intel-amt-serial-over-lan-to-the-fullest
https://www.youtube.com/watch?v=MqiCjopZVR4
https://software.intel.com/en-us/articles/intel-active-management-technology-start-here-guide-intel-amt-9

Early kernel debugging:

earlyprintk=vga ekgdboc=kbd

Break to debugger at boot:

kgdbwait

Debug via serial ("over console"):

kgdboc=ttyS0,115200 [kgdbwait]

https://www.kernel.org/pub/linux/kernel/people/jwessel/kgdb
https://www.kernel.org/pub/linux/kernel/people/jwessel/kdb

Enable KGDB after boot:

echo "ttyS0,115200" >/sys/module/kgdboc/parameters/kgdboc

echo "1" >/proc/sys/kernel/sysrq

Alt-SysRq-g to enter debugger (or echo g > /proc/sysrq-trigger)

to disable: echo "" >/sys/module/kgdboc/parameters/kgdboc

To send printk messages to gdb stream:

boot parameter: kgdbcon

dynamic: echo 1 >/sys/module/kgdb/parameters/kgdb_use_con
(then must re-configure kgdb IO driver)

Debug via USB (requires USB debug port):

kgdbdbgp=0 // use USB debug port on EHCI USB controller 0, as it is probed via PCI

http://www.coreboot.org/EHCI_Debug_Port
http://www.coreboot.org/EHCI_Gadget_Debug

Most motherboards (including Intel DQ77MK and ASUS CS-B) do not route debug port to motherboard connectors, either internal or external

Debugging via Ethernet (kgdboe) is not in the main tree, abandoned

Debugging via Firewire (1394) was always for memory read/write only, and is no longer working or maintained

Debugging with (k)gdb in Visual Studio: see below

Debugging with VMWare (or KVM/QEMU):

· Via virtual serial port mapped to TCP socket

· Or via built-in gdb server

Note: knows nothing about guest OS, only one thread per CPU, very much like in-circuit debugger.
No KDB commands.

http://www.evilfingers.com/publications/research_RU/unix-kernel-debug.pdf

KDB commands in KGDB:

(gdb) monitor lsmod // list loaded modules
(gdb) monitor ps // active processes

(gdb) monitor ps A // all processes

(gdb) monitor summary // kernel version and memory usage

(gdb) monitor dmesg // syslog buffer

(gdb) monitor bt // stack of current process using dump_stack (better than gdb bt)

more in http://dev.man-online.org/man1/kdb

md <vaddr>	Display Memory Contents, also mdWcN, e.g. md8c1
mdr <vaddr> <bytes>	Display Raw Memory
mdp <paddr> <bytes>	Display Physical Memory
mds <vaddr>	Display Memory Symbolically
mm <vaddr> <contents>	Modify Memory Contents

rd	Display Registers
rm <reg> <contents>	Modify Registers
ef <vaddr>	Display exception frame

bt [<vaddr>]	Stack traceback
btp <pid>	Display stack for process <pid>
bta [D|R|S|T|C|Z|E|U|I|M|A]	Backtrace all processes matching state flag
btc	Backtrace current process on each cpu
btt <vaddr>	Backtrace process given its struct task address

cpu <cpunum>	Display CPUs or switch to new cpu
pid <pidnum>	Switch to another task
ps [<flags>|A]	Display active task list
per_cpu <sym> [<bytes>] [<cpu>]	Display per_cpu variables

bp [<vaddr>]	Set/Display breakpoints
bl [<vaddr>]	Display breakpoints
bph [<vaddr>] [datar [length]|dataw [length]]	Set hw brk
bc <bpnum>	Clear Breakpoint
be <bpnum>	Enable Breakpoint
bd <bpnum>	Disable Breakpoint

ss	Single Step
go [<vaddr>]	Continue Execution

kill <-signal> <pid>	Send a signal to a process
reboot	Reboot the machine immediately
kgdb	Enter kgdb mode

ftdump [skip_#lines] [cpu]	Dump ftrace log

env	Show environment variables
set ...	Set environment variables

dumpcommon	Common kdb debugging
dumpall	First line debugging
dumpcpu	Same as dumpall but only tasks on cpus

Define custom KDB commands:

sample: LINUX_ROOT/samples/kdb

kdb_register(...), kdb_unregister(...)

Build parameters

CONFIG_KGDB
CONFIG_HAVE_ARCH_KGDB
CONFIG_KGDB_SERIAL_CONSOLE
CONFIG_FRAME_POINTER
#CONFIG_DEBUG_RODATA // turn off, otherwise breakpoint may not work, x86 has only 4 debug registers

(DEBUG_RODATA limitation had been fixed in 3.0, see "Fix DEBUG_RODATA limitation using text_poke")

For KDB additionally:

CONFIG_KGDB_KDB
CONFIG_KDB_KEYBOARD

Overall:

CONFIG_DEBUG_INFO=y

On host side:

apt-get install python python-libs python-dev

gedit ~/.gdbinit

enter: set auto-load safe-path /

or: add-auto-load-safe-path /path/to/vmlinux-gdb.py

gdb ./vmlinux
(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0

When debugging, better to avoid using step/next when interrupts are enabled.
Rather go from a breakpoint to a breakpoint (as in "continue").
Otherwise may throw into an interrupt handler and stack switch, e.g. schedule().

Specifically, avoid stepping over functions that may cause scheduling, such as:

queue_work()

spin_lock()

spin_unlock()

Programmatic break-to-debugger:

breakpoint() ==> asm( " int $3");

kgdb_breakpoint()

Debugging loadable module (for kgdb before 1.9):

see http://kgdb.geeksofpune.in/initmodule.htm

# insmod mymodule.ko
# cd /sys/module/mymodule/sections/

# cat .data
# cat .rodata

# cat .bss
# cat .text

(gdb) add-symbol-file <module_name> <.text - address> \

-s .bss <address> \

-s .rodata <address> \

-s .data <address>

other similar sections: .sdata .sbss
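
Typing the section addresses by hand is error-prone, so the command can be generated. A sketch under the assumption that the sections directory holds one address per file, as /sys/module/<name>/sections does (reading it usually needs root); the function name is made up.

```shell
#!/bin/sh
# module_symbol_cmd MODULE_FILE SECTIONS_DIR: print the gdb
# add-symbol-file command for a loaded module, reading section
# addresses from SECTIONS_DIR (e.g. /sys/module/mymodule/sections).
module_symbol_cmd() {
    module_file=$1
    sections_dir=$2
    text_addr=$(cat "$sections_dir/.text") || return 1
    cmd="add-symbol-file $module_file $text_addr"
    for section in .data .rodata .bss; do
        # Not every module has every section; skip the missing ones.
        if [ -r "$sections_dir/$section" ]; then
            cmd="$cmd -s $section $(cat "$sections_dir/$section")"
        fi
    done
    printf '%s\n' "$cmd"
}

# Example (on the target, as root):
#   module_symbol_cmd mymodule.ko /sys/module/mymodule/sections
```

Paste the printed line at the (gdb) prompt on the host.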

Opening current kernel in gdb:

gdb ./vmlinux /proc/kcore

to refresh: (gdb) core-file /proc/kcore

Disassembling function (helps with asm debug and function calls inlined from header files):

disasfun.sh vmlinux do_fork

#!/bin/sh
# from kgdb
if [ $# != 2 ]
then
	echo disasfun objectfile functionname
	exit 1
fi
OBJFILE=$1
FUNNAME=$2
ADDRSZ=`objdump -t $OBJFILE | gawk -- "{
	if (\\\$3 == \\"F\\" && \\\$6 == \\"$FUNNAME\\") {
		printf(\\"%s %s\\", \\\$1, \\\$5)
	}
}"`
ADDR=`echo $ADDRSZ | gawk "{ printf(\\"%s\\",\\\$1)}"`
SIZE=`echo $ADDRSZ | gawk "{ printf(\\"%s\\",\\\$2)}"`
if [ -z "$ADDR" -o -z "$SIZE" ]
then
	echo Cannot find address or size of function $FUNNAME
	exit 2
fi
objdump -S $OBJFILE --start-address=0x$ADDR --stop-address=$((0x$ADDR + 0x$SIZE))

Using Code::Blocks

sudo chown sergey /dev/ttyS0

cd /src-path

create empty project

Settings -> Debugging

watch = all

evaluate expression under cursor = on

do *not* run the debuggee = on

debugger initialization commands:

file /src-path/vmlinux

dir /src-path

set remotebaud 115200

target remote /dev/ttyS0

Has gdb console window

Using SlickEdit

sudo chown sergey /dev/ttyS0

Debug -> Attach Debugger -> Attach to Remote Process

File = /path/vmlinux

Device = /dev/ttyS0

Speed = 115200

Standard SlickEdit does not have a gdb console window, but there are third-party plugins

Using Eclipse (Kepler, Mars):

Run -> Debug Configurations -> new config "C/C++ Remote Application"

Main tab: at the bottom, select Using GDB (DSF) Manual Remote Launcher

enter path to vmlinux

disable autobuild

Debugger tab: stop on startup = off

debugger command: ~/.gdbinit

Debugger/Connection subtab: set device/speed, e.g. /dev/ttyS0, 115200

click "Debug"

for GDB console: use Console view (but first Run->Suspend, or use yellow "Pause" button)

Slow due to reading task/thread list

Using VMWare (built-in gdb server)

VMWare creates gdb port, works like in-circuit emulator (OS-agnostic), thread ≡ CPU, no KDB commands.
Does not know about modules and does not auto-load modules symbols/relocations, however can do "lx-symbols".

Add to VMX file:

debugStub.listen.guest64 = "TRUE" # enable listener for 64 bit guest

debugStub.listen.guest64.remote = "TRUE" # allow remote connection

#debugStub.port.guest64 = "8864" # listen on specified port (default: 8864)

#monitor.debugOnStartGuest64 = "TRUE" # pause on power-up

debugStub.listen.guest32 = "TRUE" # enable listener for 32 bit guest

debugStub.listen.guest32.remote = "TRUE" # allow remote connection

#debugStub.port.guest32 = "8832" # listen on specified port (default: 8832)

#monitor.debugOnStartGuest32 = "TRUE" # pause on power-up

debugStub.hideBreakpoints= "TRUE" # Set hardware breakpoints -- limited by HW

See in vmware.log when VM starts up

gdb ./vmlinux

(gdb) set arch i386

(gdb) target remote localhost:8832

(gdb) set arch i386:x86_64

(gdb) target remote somehost:8864

http://wiki.osdev.org/VMware

Replay debugging on Linux: https://www.vmware.com/pdf/ws7_replay_linux_technote.pdf

Using KDbg

apt-get install kdbg

cd /usr/share/kde4/apps/kdbg/icons/hicolor/22x22/actions

mv pulse.mng pulse.mng-bak

kdbg -r /dev/ttyS0 /path/vmlinux

· does not correctly read gdbinit

· no net connection option

· no gdb console

· no locals view

· all buttons are disabled, cannot stop or continue

Using Visual Studio (connect to built-in gdb server in VMWare/KVM or kgdb)

http://sysprogs.com/VisualKernel/tutorials/quickdebug

in target:

apt-get install openssh-server

(edit /etc/ssh/sshd_config to change port number if desired)

in Visual Studio (to copy sources/vmlinux):

Debug -> Quick Debug Linux Kernel

Machine to Debug

Host name

User name

Password

Setup public key

[Create]

Machine with GDB = (local computer)

Kernel symbols for debugging: Install...

import manually built kernel

source directory

use included pre-built gdb

index kernel modules in a custom directory

in Visual Studio (when sources/vmlinux are local):

Debug -> Quick Debug Linux Kernel

Machine to Debug

Host name

User name

Password

Setup public key

[Create]

Machine with GDB = (local computer)

Kernel symbols for debugging: Install...

specify kernel symbols and sources manually

kernel file with symbols -> vmlinux

source directory

use included pre-built gdb

index kernel modules in a custom directory

Importing existing module into VisualKernel project: http://sysprogs.com/VisualKernel/tutorials/import

Managing symbols: http://sysprogs.com/VisualKernel/documentation/kernelsymbols

Documentation: http://sysprogs.com/VisualKernel/tutorials

Using KVM/QEMU (built-in gdb server)

https://help.ubuntu.com/community/KVM/Installation
https://help.ubuntu.com/community/KVM/Networking
https://help.ubuntu.com/community/KVM/CreateGuests
https://help.ubuntu.com/community/KVM/Managing
https://help.ubuntu.com/community/KVM/Access

apt-get install qemu-kvm libvirt-bin ubuntu-vm-builder bridge-utils python-virtinst
apt-get install virtinst or python-virtinst
apt-get install qemu-system spice-client python-spice-client-gtk

adduser sergey libvirtd
virsh -c qemu:///system list

apt-get install ubuntu-virt-server => kvm libvirt-bin openssh-server

from X11 console:

apt-get install ubuntu-virt-mgmt => virt-manager python-vm-builder and virt-viewer
apt-get install virt-manager

(reboot)

Optional:

Define private bridge (guest-only networking):

add to /etc/network/interfaces:
----- add begin -----
auto privatebr0
iface privatebr0 inet static
        address 10.48.51.135
        netmask 255.255.254.0
        pre-up    brctl addbr privatebr0
        post-down brctl delbr privatebr0
----- add end -----

/etc/init.d/networking restart

echo '<network> <name>privatenet</name> <bridge name="privatebr0" /> </network>' >> /tmp/net.xml
virsh net-define /tmp/net.xml
virsh net-start privatenet
virsh net-autostart privatenet
virsh net-list --all

Now can add a virtual network device to any of the guests,
select "privatenet" as its source device
and this guest will be connected to the virtual switch.

Create VM with virt-manager
or:

sudo virt-install --connect qemu:///system
--name=ub64z
--ram=3072
--vcpus=2
--cdrom=ubuntu-14.04.2-desktop-amd64.iso
--os-type linux
--os-variant ubuntutrusty
--disk path=ub64z.qcow2,size=30,cache=writethrough
--graphics sdl
--graphics vnc,password=xxxx --noautoconsole
--network=network:privatenet
--network=network:default # NAT

May want to select spice/QXL as video

export EDITOR=`which nano`
export EDITOR=`which gedit`

List VMs:
virsh --connect qemu:///system
virsh# list --all
virsh# edit ub64z

change: <domain type='kvm'> => <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
and add:
<qemu:commandline>
<qemu:arg value='-gdb'/>
<qemu:arg value='tcp::1234'/>
</qemu:commandline>

virsh qemu-monitor-command ub64z --hmp help
virsh qemu-monitor-command ub64z --hmp info
virsh qemu-monitor-command ub64z --hmp info [ pci | qtree | mtree | usb | network | cpus | registers ]
virsh qemu-monitor-command ub64z --hmp { x | xp } -> memory access
virsh qemu-monitor-command ub64z --hmp { i | o } -> I/O port access

gdb ./vmlinux

(gdb) target remote localhost:1234

Raw QEMU debugging is similar, but raw QEMU is much slower than KVM/QEMU (which uses HVM).

Using KVM/QEMU (via virtual serial port)

create serial port: TCP network console (as the server)

bind to 0.0.0.0 if connecting debugger from remote host

bind to 127.0.0.1 if connecting debugger locally

on VM startup, can see port creation in /var/log/libvirt/qemu/<guest>.log

to test do "nc localhost 4555" on the host

"cat >/dev/ttySx" in the guest

should see output on the host

can now connect (gdb) target remote localhost:4555

Tunnel GDB connection to /dev/ttyS0 via socket

./stty.sh

netcat -l 4444 </dev/ttyS0 >/dev/ttyS0

after this, can connect GDB to port 4444 instead of /dev/ttyS0

alternative: ser2net

Tunnel GDB connection via AMT serial-over-LAN

./amt-redir.sh

after this, can connect GDB to port 4444

(gdb) set debug remote 1

(gdb) set remotelogfile q.log

(gdb) target remote localhost:4444

caution: AMT drops the connection after few minutes of inactivity

#!/bin/bash
#set -x
HOST=csb
XPWD="xxxxxx"
netcat_pid=
amtterm_pid=

# optional arguments: host name, then AMT password
if [ "$1" != "" ]; then
	HOST=$1
fi
if [ "$2" != "" ]; then
	XPWD=$2
fi

# kill both background helpers and remove the FIFO directory
do_cleanup()
{
	if [ "$netcat_pid" != "" ]; then
		kill -9 $netcat_pid
	fi
	if [ "$amtterm_pid" != "" ]; then
		kill -9 $amtterm_pid
	fi
	cd /tmp
	rm -rf /tmp/amt-$$
}

on_control_c()
{
	echo -en "\n*** Exiting ***\n"
	do_cleanup
	exit 0
}

trap on_control_c SIGINT

mkdir /tmp/amt-$$
cd /tmp/amt-$$
mkfifo xin
mkfifo xout

# netcat listens on port 4444; the two FIFOs cross-connect it with amtterm
netcat -l 4444 >xin <xout &
netcat_pid=$!
echo netcat: $netcat_pid

amtterm -q -u admin -p "$XPWD" $HOST <xin >xout &
amtterm_pid=$!
echo amtterm: $amtterm_pid

wait $netcat_pid
wait $amtterm_pid
do_cleanup

https://software.intel.com/en-us/articles/intel-active-management-technology-downloads

https://software.intel.com/en-us/forums/topic/297602

http://linux.die.net/man/1/amtterm

http://linux.die.net/man/1/amttool

http://linux.die.net/man/7/amt-howto

https://www.kraxel.org/cgit/amtterm/tree/amtterm.c

GDB:

CONFIG_GDB_SCRIPTS – generate vmlinux_xxx.py

disable CONFIG_DEBUG_INFO_REDUCED

enable CONFIG_FRAME_POINTER

lx-symbols	load symbols for all modules and vmlinux
lx-dmesg	display kernel log buffer
lx-lsmod	list loaded modules
p/x $lx_current().pid	access current task
p/x $lx_current(2).pid	access current task on a particular CPU
p/x $lx_per_cpu("runqueues").nr_running	access per-cpu variable value
p/x $lx_per_cpu("runqueues", 2).nr_running	access per-cpu variable value for a particular CPU
p/x $lx_task_by_pid(1)	print task for specified pid
p/x $lx_thread_info($lx_task_by_pid(1))	print thread_info structure for the task
set $next = $lx_per_cpu("hrtimer_bases").clock_base[0].active.next
p *$container_of($next, "struct hrtimer", "node")	container_of

Debugging with GDB

Compile with optimization disabled:

In Makefile:

CFLAGS_xxx.o = -O0

ccflags-y := -O0

subdir-ccflags-y := -O0

Debug printing

__schedule_bug(task)
debug_show_held_locks(task)

print_modules()

print_irqtrace_events(task)

dump_stack()

Hardware breakpoints to intercept data access/code execution:

sample: LINUX_ROOT/samples/hw_breakpoint
register_wide_hw_breakpoint(...), unregister_wide_hw_breakpoint(...)

Ftrace

http://lwn.net/Articles/365835
http://lwn.net/Articles/366796

Documentation/trace/ftrace.txt
Documentation/trace/ftrace-design.txt

gcc -pg causes each function to call mcount().

CONFIG_TRACING
CONFIG_FUNCTION_TRACER

CONFIG_FUNCTION_GRAPH_TRACER

CONFIG_STACK_TRACER

CONFIG_DYNAMIC_FTRACE ==> convert all mcount() calls to NOPs at boot

# cd /sys/kernel/debug/tracing

# cat available_tracers
>>> function_graph function sched_switch nop

# echo 100 > buffer_size_kb // buffer size per CPU

# echo 1 > ftrace_dump_on_oops

function tracer (trace kernel functions on entry)

# echo function > current_tracer   // call tracing
# cat trace                        // repeatable read
# cat trace_pipe                   // once read, traces are removed
# cat per_cpu/cpu2/trace           // per-CPU sub-selection

function graph tracer (trace kernel functions on entry and exit)

# echo function_graph > current_tracer // nested call tracing, with offset
# echo nop > current_tracer
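
A minimal sketch of wrapping the steps above around one command, so the saved trace covers only the time window in which that command ran (tracing is still system-wide). The helper name trace_graph and the overridable TRACEFS variable are assumptions made for this example, not part of ftrace; run as root against the real tracing directory.

```shell
# TRACEFS defaults to the debugfs path used in these notes.
# The override is an assumption of this sketch, handy for dry runs on a scratch directory.
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

# trace_graph <output-file> <command> [args...]
trace_graph() {
    tg_out="$1"; shift
    [ -w "$TRACEFS/current_tracer" ] || {
        echo "cannot write $TRACEFS/current_tracer (root? debugfs mounted?)" >&2
        return 1
    }
    echo 0 > "$TRACEFS/tracing_on"                  # pause while switching tracers
    echo function_graph > "$TRACEFS/current_tracer"
    echo 1 > "$TRACEFS/tracing_on"
    "$@"                                            # run the workload
    tg_rc=$?
    echo 0 > "$TRACEFS/tracing_on"                  # freeze the buffer
    cat "$TRACEFS/trace" > "$tg_out"                # repeatable read, buffer is kept
    echo nop > "$TRACEFS/current_tracer"
    echo 1 > "$TRACEFS/tracing_on"                  # leave tracing enabled, as by default
    return $tg_rc
}
```

Usage: trace_graph /tmp/graph.txt ls /proc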

stack tracer

# echo 1 > /proc/sys/kernel/stack_tracer_enabled // record max stack usage and deepest stack trace

# cat stack_max_size

# echo 0 > stack_max_size // reset

irqsoff (max time interrupts are disabled)

preemptoff (max time preemption is disabled)
preemptirqsoff (max time preemption and interrupts are disabled)

wakeup (max latency for the highest priority task to be scheduled after it is woken up)

wakeup_rt (same as wakeup, but for RT tasks only)
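
The latency tracers above report their result in tracing_max_latency (microseconds). A rough sketch of measuring the maximum while a workload runs; the helper name max_latency and the TRACEFS override are assumptions of this example, and a tracer is only usable if it appears in available_tracers for the running kernel.

```shell
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

# max_latency <tracer> <command> [args...]
# Prints the maximum latency (usecs) recorded while the command ran.
max_latency() {
    ml_tracer="$1"; shift
    grep -qw "$ml_tracer" "$TRACEFS/available_tracers" || {
        echo "$ml_tracer is not in available_tracers" >&2
        return 1
    }
    echo "$ml_tracer" > "$TRACEFS/current_tracer"
    echo 0 > "$TRACEFS/tracing_max_latency"     # reset the recorded maximum
    "$@"                                        # run the workload
    cat "$TRACEFS/tracing_max_latency"
    echo nop > "$TRACEFS/current_tracer"
}
```

Usage: max_latency irqsoff sleep 10. The trace file read before switching back to nop holds the call path for the worst case.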

# echo 0 > tracing_on // stop and start the trace
# echo 1 > tracing_on

trace_printk("foo %d bar %p", bar->foo, bar) // enter data into trace from kernel space

Write to trace_marker (in /sys/kernel/debug/tracing) // enter data into trace from userspace
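
A small sketch of the userspace side: each write to trace_marker becomes one entry in the trace, which helps correlate kernel events with steps of a test script. The function name trace_mark and the TRACEFS override are assumptions of this example.

```shell
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

# trace_mark <text...>: insert one annotation line into the trace buffer
trace_mark() {
    [ -w "$TRACEFS/trace_marker" ] || {
        echo "cannot write $TRACEFS/trace_marker" >&2
        return 1
    }
    printf '%s\n' "$*" > "$TRACEFS/trace_marker"
}
```

Usage: trace_mark "test step 3 begin"; ./step3; trace_mark "test step 3 end"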

Kernel functions:

void tracing_on(void);

void tracing_off(void);

if (tracing_is_on()) ...

if (tracing_snapshot_alloc() == 0) ... // allocate snapshot buffer and take snapshot of current trace buffer

tracing_snapshot(); // take snapshot of current trace buffer ...

// ... (snapshot buffer must be pre-allocated)

void tracing_start(void);

void tracing_stop(void);

void ftrace_off_permanent(void);
ftrace_dump(DUMP_ALL);

boot parameters:

ftrace=[tracer]	Start the specified tracer as early as possible
ftrace_dump_on_oops[=cpu]	Dump the trace buffers on oops, for specified CPU, or all CPUs.
trace_buf_size=nn[KMG]	Set tracing buffer size
ftrace_filter=[function-list]	Limit the functions traced by the function tracer. function-list is a comma separated list of functions. Can be changed at run time by the set_ftrace_filter file.
ftrace_notrace=[function-list]	Do not trace the functions specified in function-list.
ftrace_graph_filter=[function-list]	Limit the top level callers functions traced by the function graph tracer. Can be changed at run time by the set_graph_function file.
stacktrace	Enable the stack tracer on boot up
stacktrace_filter=[function-list]	Limit the functions that the stack tracer will trace. Can be changed at run time by the stack_trace_filter file.
trace_event=[event-list]	Trace events. See Documentation/trace/events.txt
trace_options=[option-list]	Also: /sys/kernel/debug/tracing/trace_options
traceoff_on_warning	Disable tracing when a warning is hit, to avoid flooding the trace with warning code. Changed via sysctl kernel.traceoff_on_warning.
alloc_snapshot	Allocate the ftrace snapshot buffer on boot up when the main buffer is allocated. This is handy if debugging and you need to use tracing_snapshot() on boot up, and do not want to use tracing_snapshot_alloc() as it needs to be done where GFP_KERNEL allocations are allowed.
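
To confirm which of the parameters above the running kernel was actually booted with, something like the following could be used. It is a sketch only: the function name ftrace_boot_params is made up for this example, and the space-splitting does not handle quoted parameter values that contain spaces.

```shell
# ftrace_boot_params [cmdline-file]
# Lists the tracing-related boot parameters, one per line.
# Exit status is non-zero when none are present (grep found no match).
ftrace_boot_params() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" |
        grep -E '^(ftrace|ftrace_dump_on_oops|trace_buf_size|ftrace_filter|ftrace_notrace|ftrace_graph_filter|stacktrace|stacktrace_filter|trace_event|trace_options|traceoff_on_warning|alloc_snapshot)(=|$)'
}
```

Usage: ftrace_boot_params (reads /proc/cmdline by default).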

See more in the tracing/profiling document.

pstore / ramoops

https://www.kernel.org/doc/Documentation/ramoops.txt

persistently stores dmesg and/or ftrace buffer in memory that survives crash:

· ACPI ERST table (if available on the system)

· UEFI variables

· reserved RAM above kernel allocation (boot with reduced "mem" parameter)

To check for ACPI ERST availability:

apt-get install acpidump

acpidump -s | grep ERST

To enable EFI store use (UEFI boot only):

CONFIG_EFI_VARS_PSTORE=y

caution: can overflow EFI vars space

To store in reserved RAM above max address:

mem=16G ramoops.mem_address=0x400000000 ramoops.ecc=1 ramoops.mem_size=0x200000

Do not use the final 1MB of RAM.
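
The address in the example is just the mem= limit expressed in bytes: 16G = 16 << 30 = 0x400000000. A sketch of that arithmetic, assuming the machine really has RAM at that physical address (more than 16G installed, no hole in the memory map there); the function name ramoops_args is hypothetical.

```shell
# ramoops_args <usable RAM in GiB> <reserved size in bytes>
# Prints boot parameters that place the ramoops region right above the mem= limit.
ramoops_args() {
    ra_gib="$1"
    ra_size="$2"
    ra_addr=$(( ra_gib << 30 ))     # GiB -> bytes
    printf 'mem=%dG ramoops.mem_address=0x%x ramoops.mem_size=0x%x ramoops.ecc=1\n' \
        "$ra_gib" "$ra_addr" "$ra_size"
}

ramoops_args 16 2097152
# prints: mem=16G ramoops.mem_address=0x400000000 ramoops.mem_size=0x200000 ramoops.ecc=1
```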

CONFIG_PSTORE_RAM=y (or m)

Check for pstore / ramoops availability:

dmesg | grep Registered | grep "as persistent store backend"

dmesg | grep ramoops

To enable pstore:

CONFIG_PSTORE=y

CONFIG_PSTORE_CONSOLE=y

CONFIG_PSTORE_PMSG=y

CONFIG_PSTORE_FTRACE=y

Then:

mkdir /pstore

# if using RAM

# caution: boot with mem=16G, or will overwrite reserved memory!!

modprobe ramoops

modprobe ramoops mem_address=0x400000000 ecc=1 mem_size=0x200000

dmesg | tail

more /sys/module/ramoops/parameters/*

mount -t pstore - /pstore -o kmsg_bytes=32000

mount -t debugfs debugfs /sys/kernel/debug (unless already mounted)

echo 1 >/sys/kernel/debug/pstore/record_ftrace

echo function > /sys/kernel/debug/tracing/current_tracer

echo 1 > /sys/kernel/debug/tracing/tracing_on

echo b >/proc/sysrq-trigger

or: echo c >/proc/sysrq-trigger

....

mount -t pstore - /pstore

ls /pstore

....
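
After the reboot, records can be copied out of the pstore mount and then deleted. Deleting a file in pstore erases the record from the backend, which matters when the store is small, as with EFI variables. A minimal sketch; the function name save_pstore is an assumption, and file names such as dmesg-ramoops-0 depend on the backend and kernel version.

```shell
# save_pstore <destination-dir> [pstore-mount]
# Copies every record out of the pstore mount, removing each one
# only after it was copied successfully.
save_pstore() {
    sp_dst="$1"
    sp_src="${2:-/pstore}"
    mkdir -p "$sp_dst" || return 1
    for sp_f in "$sp_src"/*; do
        [ -f "$sp_f" ] || continue          # empty mount: glob stays unexpanded
        cp "$sp_f" "$sp_dst/" && rm -f "$sp_f"
    done
}
```

Usage: save_pstore /var/crash/pstore-$(date +%Y%m%d-%H%M%S)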

Debugging techniques:

printk, pr_debug (dynamic)

assertions (BUG_ON)

kgdb / gdb / custom kdb commands

debugfs

crash / kexec

lockdep

kmemleak, SLAB debug flags, /proc/slabinfo

various CONFIG_DEBUG_xxx

SystemTap, ftrace, trace-cmd, kernelshark, lttng

kprobes / jprobes / TRACE_EVENT / hardware breakpoints

profiler (perf/oprofile)

Gprof2dot

FlameGraph

custom QEMU