Linux KVM set-up

Ubuntu:

https://help.ubuntu.com/community/KVM/Installation
https://help.ubuntu.com/community/KVM/Networking
https://help.ubuntu.com/community/KVM/CreateGuests
https://help.ubuntu.com/community/KVM/Managing
https://help.ubuntu.com/community/KVM/Access

apt-get install qemu-kvm libvirt-bin ubuntu-vm-builder bridge-utils python-virtinst
apt-get install virtinst    # or python-virtinst, depending on the release
apt-get install qemu-system spice-client python-spice-client-gtk

adduser sergey libvirtd
virsh -c qemu:///system list

apt-get install ubuntu-virt-server    # pulls in kvm, libvirt-bin, openssh-server

From an X11 console:
apt-get install ubuntu-virt-mgmt      # pulls in virt-manager, python-vm-builder, virt-viewer
apt-get install virt-manager

(reboot)

Fedora / RHEL:

Virtualization Deployment and Administration Guide
Virtualization Tuning and Optimization Guide
Virtualization Security Guide

openSUSE:

zypper install -y -t pattern kvm_server
zypper install -y spice-client virt-utils

systemctl start libvirtd.service
systemctl enable libvirtd.service

On some versions of Fedora and SUSE, the user may need to be added to PolicyKit.

virsh
virt-manager

When creating virtual machines:
http://wiki.libvirt.org/page/VirtualNetworking
http://wiki.libvirt.org/page/Networking
https://libvirt.org/formatnetwork.html

virsh --connect qemu:///system
virsh # list --all

virt-top

Remote connection (assuming listener is enabled):

virt-viewer --connect qemu+ssh://user@host/system domain
virt-viewer --connect qemu+tcp://host/system domain

more on URIs: https://libvirt.org/remote.html

To pass Alt+right-click through to the guest, first disable the local (Compiz) binding:

ccsm -> General -> General Options -> Key bindings -> change Window Menu to <Super>Button3



Disk image utilities:

To install:

Ubuntu:

apt-get update
apt-get install libguestfs-tools
update-guestfs-appliance

RHEL/Fedora:

dnf install -y libguestfs guestfish libguestfs-tools libguestfs-mount libguestfs-winsupport

openSUSE:

zypper install -y guestfs-tools

Utilities:

virt-filesystems --domain vmname --all -l
virt-filesystems --add disk.img --all -l

virt-inspector --domain vmname
virt-inspector --add disk.img

guestmount [--ro] --add disk.img [-m /dev/sda1] /mnt

guestfish

virt-df --add q.img
virt-df --domain <vmname>

virt-ls
virt-cat
virt-diff
virt-edit
virt-log
virt-make-fs
virt-rescue
virt-resize
virt-sparsify
virt-tar-in
virt-tar-out
virt-win-reg



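Attaching a PCI device to a virtual machine:
With iommu=pt on the kernel command line, "pt" (pass-through) limits IOMMU device remapping to devices assigned to KVM guests (not those used by the host kernel itself).
That's it: with managed='yes', libvirt (and hence virt-manager) unbinds the device from the host driver automatically when the VM starts,
and rebinds it to the host driver when the VM exits.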
<devices>

    <hostdev mode='subsystem' type='pci' managed='yes'>
        <source>
            <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
        </source>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </hostdev>

    <hostdev mode='subsystem' type='pci' managed='yes'>
       <source>
            <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
        </source>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>

</devices>

The guest-side <address> element (the one outside <source>) is optional:

<devices>

    <hostdev mode='subsystem' type='pci' managed='yes'>
        <source>
            <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
        </source>
    </hostdev>

    <hostdev mode='subsystem' type='pci' managed='yes'>
        <source>
            <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
        </source>
    </hostdev>

</devices>
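
The <source> address is just the host PCI address (as printed by lspci) split into domain, bus, slot and function. A minimal sketch that builds the element from such an address; the function name pci_to_hostdev_xml is made up for this note, and a missing domain is assumed to be 0000:

```shell
# Print a libvirt <hostdev> element for a host PCI address such as
# 0000:04:00.1 (domain:bus:slot.function).
# Assumption: an address without a domain (04:00.1) is in domain 0000.
pci_to_hostdev_xml() {
    addr=$1
    case $addr in
        *:*:*) ;;                     # domain already present
        *) addr="0000:$addr" ;;       # lspci short form, prepend the domain
    esac
    domain=${addr%%:*}                # 0000
    rest=${addr#*:}                   # 04:00.1
    bus=${rest%%:*}                   # 04
    devfn=${rest#*:}                  # 00.1
    slot=${devfn%%.*}                 # 00
    func=${devfn#*.}                  # 1
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
        <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
    </source>
</hostdev>
EOF
}

pci_to_hostdev_xml 04:00.1
```

The output can be pasted under <devices> with virsh edit, or saved to a file and passed to virsh attach-device <vmid> <file> --config.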

Documentation:  SUSE   libvirt   RHEL   more libvirt

Note: some devices can be attached to a VM only as a group.
Devices in such a group cannot be split between the VM and the host. It is, however, possible to assign only some of them to the VM, provided the rest are detached from the host.

These are devices that belong to the same IOMMU group.
Typically, PFs of the card belong to the same single IOMMU group, and VFs each have their own IOMMU group.

To check which devices belong to the same IOMMU group, execute

ls /sys/bus/pci/devices/0000:04:00.0/iommu_group/devices

The output will be something like

0000:04:00.0  0000:04:00.1

To assign only one to the guest, the unused endpoint should be detached from the host before starting the guest:

virsh nodedev-detach pci_0000_04_00_1

To reattach it back to the host:

virsh nodedev-reattach pci_0000_04_00_1
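
The node-device name used by virsh is the PCI address with ':' and '.' replaced by '_'. A rough sketch that lists the node-device names of everything in a device's IOMMU group, so the unused ones can be fed to virsh nodedev-detach. Both function names are made up for this note; the optional sysfs-root argument exists only so the function can be tried against a mock directory tree:

```shell
# Convert a PCI address (0000:04:00.1) to the libvirt node-device
# name (pci_0000_04_00_1).
pci_to_nodedev() {
    printf 'pci_%s\n' "$(printf '%s' "$1" | tr ':.' '__')"
}

# List node-device names of all devices sharing an IOMMU group with $1.
# $2 = sysfs root, defaults to /sys (hypothetical parameter for dry runs).
iommu_group_nodedevs() {
    dir="${2:-/sys}/bus/pci/devices/$1/iommu_group/devices"
    [ -d "$dir" ] || { echo "no IOMMU group found for $1" >&2; return 1; }
    for dev in "$dir"/*; do
        [ -e "$dev" ] || continue     # empty directory, nothing to list
        pci_to_nodedev "${dev##*/}"
    done
}

pci_to_nodedev 0000:04:00.1
```

On a real host: iommu_group_nodedevs 0000:04:00.0, then virsh nodedev-detach each name that is not going to the guest.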



Attaching SR-IOV device to a virtual machine:
Ubuntu (depending on version):  here and here

RHEL/Fedora: here (ch. 8)

/etc/sysconfig/network-scripts/ifcfg-xxx, specify DEVICE and HWADDR

BIOS naming:

https://en.wikipedia.org/wiki/Consistent_Network_Device_Naming
http://linux.dell.com/files/whitepapers/consistent_network_device_naming_in_linux.pdf

Often, BIOS device naming is best:

Install (or build) package biosdevname.

Verify that BIOS provides enumeration data.
One of the following is required, and a set must cover all of the devices, otherwise BIOS naming is not possible:

dmidecode -t 41    # on-board devices
dmidecode -t 9     # slot devices
biosdecode         # interrupt routing table

SMBIOS 2.6 or later is required.
If biosdevname -i xxx cannot display a name, BIOS naming won't work.
BIOS naming of SR-IOV functions located on the motherboard (as opposed to slots) is not supported.

If BIOS naming is supported:
UDEV way:

Edit NAME fields in /etc/udev/rules.d/70-persistent-net.rules

To regenerate /etc/udev/rules.d/70-persistent-net.rules from scratch:

rm /etc/udev/rules.d/70-persistent-net.rules

# for each interface:
export INTERFACE=eth1
export MATCHADDR=$(ip addr show "$INTERFACE" | awk '/ether/ {print $2}')

/lib/udev/write_net_rules    # appends a rule for $INTERFACE to the file

# now edit NAME fields in /etc/udev/rules.d/70-persistent-net.rules

Reboot

Replace references to the old interface names with the new names (e.g. eth1 -> em1).
To change existing virtual machine definitions, use virsh edit <vmid>.

The maximum number of virtual functions each device supports can be read from

cat /sys/bus/pci/devices/0000:04:00.0/sriov_totalvfs
cat /sys/bus/pci/devices/0000:04:00.1/sriov_totalvfs

The number of virtual functions currently created can be seen with lspci or via

cat /sys/bus/pci/devices/0000:04:00.0/sriov_numvfs
cat /sys/bus/pci/devices/0000:04:00.1/sriov_numvfs
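
To avoid typing each address, the same two sysfs files can be read in a loop. A small sketch, assuming the standard sysfs layout shown above; the function name sriov_report and its optional sysfs-root argument are made up for this note:

```shell
# Print "<pci-address> <created>/<supported>" for every SR-IOV capable
# PCI device. $1 = sysfs root, defaults to /sys (hypothetical parameter,
# only there so the function can be run against a mock tree).
sriov_report() {
    root=${1:-/sys}
    for dev in "$root"/bus/pci/devices/*; do
        [ -r "$dev/sriov_totalvfs" ] || continue    # not SR-IOV capable
        printf '%s %s/%s\n' "${dev##*/}" \
            "$(cat "$dev/sriov_numvfs")" "$(cat "$dev/sriov_totalvfs")"
    done
}
```

Run it as sriov_report; devices without sriov_totalvfs are skipped silently.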
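
To create virtual functions, reload the driver with the max_vfs parameter:

modprobe  -rv  igb
modprobe  igb  max_vfs=7

The pci=assign-busses kernel parameter may also be needed. In that case:

make sure the kernel is built with
CONFIG_PCI_IOV=y
CONFIG_PCI_REALLOC_ENABLE_AUTO=y

add pci=assign-busses to the kernel command line in /etc/default/grub, e.g.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt pci=assign-busses"
and rebuild grub.cfg
sync; reboot

To make max_vfs persistent, add this line to a file under /etc/modprobe.d/:
options igb max_vfs=7
and rebuild the initramfs:
Ubuntu:    update-initramfs  -u  -k `uname -r` [-v]
Fedora:    dracut --force /boot/initramfs-`uname -r`.img  `uname -r`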

sync; reboot
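
After the reboot it is worth confirming that the parameters actually reached the kernel. A minimal sketch that checks a command-line string for whole-word parameters; the function name cmdline_has is made up for this note:

```shell
# Succeed if every parameter given in $2.. appears as a whole word in
# the kernel command line passed as $1; report the first one missing.
cmdline_has() {
    cmdline=" $1 "
    shift
    for param in "$@"; do
        case $cmdline in
            *" $param "*) ;;          # present
            *) echo "missing kernel parameter: $param" >&2; return 1 ;;
        esac
    done
}

# On a live host:
#   cmdline_has "$(cat /proc/cmdline)" intel_iommu=on iommu=pt pci=assign-busses
```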
On Ubuntu:

Edit /etc/network/interfaces like this:

# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

auto p4p1
iface p4p1 inet manual
    up ifconfig $IFACE 0.0.0.0 up
    up sysctl -w net.ipv6.conf.$IFACE.disable_ipv6=1
    up ip link set $IFACE up
    down ifconfig $IFACE down

auto p4p2
iface p4p2 inet manual
    up ifconfig $IFACE 0.0.0.0 up
    up sysctl -w net.ipv6.conf.$IFACE.disable_ipv6=1
    up ip link set $IFACE up
    down ifconfig $IFACE down

iface p4p1_0 inet manual
iface p4p1_1 inet manual
iface p4p1_2 inet manual
iface p4p1_3 inet manual
iface p4p1_4 inet manual
iface p4p1_5 inet manual
iface p4p1_6 inet manual
iface p4p2_0 inet manual
iface p4p2_1 inet manual
iface p4p2_2 inet manual
iface p4p2_3 inet manual
iface p4p2_4 inet manual
iface p4p2_5 inet manual
iface p4p2_6 inet manual

You may also want to disable these interfaces in the Ubuntu NetworkManager GUI ("Automatically connect to this network" = off)
or in /etc/NetworkManager/system-connections. See https://wiki.debian.org/NetworkConfiguration.

Reboot

On RHEL/Fedora/SUSE:

Edit /etc/sysconfig/network-scripts/ifcfg-xxx

DEVICE=xxx
BOOTPROTO=none

Reboot

Make sure (ifconfig -a) that all interfaces show up and do not have DHCP-assigned IPv4 or IPv6 addresses.

To assign a VF to a guest as a network interface:

export EDITOR=$(which vi)
virsh edit <vmid>

and change the hostdev descriptors to interface type='hostdev':
<devices>

    <interface type='hostdev' managed='yes'>
        <source>
            <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x2'/>
        </source>
        <mac address='52:54:00:6d:90:02' />
        <driver name='vfio' />
    </interface>

    <interface type='hostdev' managed='yes'>
        <source>
            <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x3'/>
        </source>
        <mac address='52:54:00:6d:90:03' />
        <driver name='vfio' />
    </interface>

</devices>
<mac> elements are optional.
Other optional tags are <vlan> and <virtualport>.

The <driver> element is optional, but desirable for performance, when <mac> needs to be specified, and in some other cases (see 19.1.6).
Its name defaults to vfio when VFIO is available, and to kvm otherwise.

To generate a unique MAC address, use http://www.marsching.com/2009/mac-address-generator or this script.
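
The script linked above is not reproduced here. As a stand-in, a minimal sketch that prints a random address under the 52:54:00 prefix used in the examples in these notes; the function name random_kvm_mac is made up, and a random address is only probably unique, so check it against the MACs already in use on the network:

```shell
# Print a random MAC address with the 52:54:00 prefix.
# Three random bytes are read from /dev/urandom and printed as hex.
# Uniqueness is NOT guaranteed, only likely.
random_kvm_mac() {
    od -An -N3 -tx1 /dev/urandom |
        awk '{ printf "52:54:00:%s:%s:%s\n", $1, $2, $3 }'
}

random_kvm_mac
```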

To find the relationship between PCI addresses and network interfaces:

lspci
systool -c net -v | grep --color=never -E "Device =|address"
lshw -C network -businfo
run netif-enum.py
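
netif-enum.py is not reproduced in these notes. A rough shell equivalent of the idea (not that script) follows the device symlink of each interface in sysfs; the function name netif_pci_map and the optional sysfs-root argument are made up for this note:

```shell
# Print "<device> <interface>" for every network interface backed by a
# device. For PCI NICs the first column is the PCI address; for other
# bus types (USB, virtio) it is whatever the device directory is called.
# $1 = sysfs root, defaults to /sys (hypothetical parameter for dry runs).
netif_pci_map() {
    root=${1:-/sys}
    for nic in "$root"/class/net/*; do
        [ -e "$nic/device" ] || continue    # lo, bridges, taps have no device
        target=$(readlink -f "$nic/device")
        printf '%s %s\n' "${target##*/}" "${nic##*/}"
    done
}
```

Run it as netif_pci_map; readlink -f is assumed to be the GNU coreutils version.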
Create a passthrough network definition file that references the interface providing the pool of VFs (e.g. sriov-p4p1.xml):
<network>
    <name>sriov-p4p1</name>
    <forward mode='hostdev' managed='yes'>
        <pf dev='p4p1'/>
    </forward>
</network>
Define the passthrough network:

virsh net-define sriov-p4p1.xml
virsh net-autostart sriov-p4p1
virsh net-start sriov-p4p1

Execute virsh edit <vmid> and add a section to the VM definition under the <devices> tag:

<devices>

    <interface type='network'>
        <source network='sriov-p4p1' />
        <mac address='52:54:a8:2e:18:c8' />
    </interface>

</devices>

After starting the first guest, execute virsh net-dumpxml sriov-p4p1; it should display approximately the following:

<network connections='1'>
    <name>sriov-p4p1</name>
    <uuid>a6b49429-d353-d7ad-3185-4451cc786437</uuid>
    <forward mode='hostdev' managed='yes'>
        <pf dev='p4p1'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x0' function='0x1'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x0' function='0x3'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x0' function='0x5'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x0' function='0x7'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x1' function='0x1'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x1' function='0x3'/>
        <address type='pci' domain='0x0000' bus='0x04' slot='0x1' function='0x5'/>
    </forward>
</network>

Alternatively, a VF's network interface can be attached in macvtap passthrough mode:

<devices>

    <interface type='direct'>
        <source dev='p4p1_6' mode='passthrough' />
        <mac address='52:54:00:6d:90:02' />
    </interface>

</devices>



Set up fixed DHCP addresses in libvirt:

See more here:  http://libvirt.org/formatnetwork.html#elementsAddress

virsh net-edit <netname>

e.g. virsh net-edit default

Add <host> elements inside <dhcp>:
<network>
    <name>default</name>
    <uuid>5c0448c7-4240-4325-9a80-eb9f575c962e</uuid>
    <forward mode='nat'/>
    <bridge name='virbr0' stp='on' delay='0' />
    <mac address='52:54:00:BE:58:83'/>
    <ip address='192.168.122.1' netmask='255.255.255.0'>
        <dhcp>
            <range start='192.168.122.100' end='192.168.122.254' />
            <host mac='52:54:00:c9:06:c5' name='foo' ip='192.168.122.10' />
            <host mac='52:54:00:c9:06:c6' name='bar' ip='192.168.122.11' />
        </dhcp>
    </ip>
</network>
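
A single <host> entry can also be added without opening the editor, using virsh net-update. A minimal sketch that builds the element; the function name dhcp_host_xml is made up for this note, and the values are the sample ones from the listing above:

```shell
# Print a libvirt DHCP <host> element for the given MAC, name and IP.
dhcp_host_xml() {
    printf "<host mac='%s' name='%s' ip='%s'/>\n" "$1" "$2" "$3"
}

dhcp_host_xml 52:54:00:c9:06:c5 foo 192.168.122.10

# To apply it to the running network and its saved definition:
#   virsh net-update default add ip-dhcp-host \
#       "$(dhcp_host_xml 52:54:00:c9:06:c5 foo 192.168.122.10)" --live --config
```

With --live the change is applied to the running network, which may make the restart described below unnecessary.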

The libvirtd and dnsmasq processes may also need to be restarted:

# caution!
# stopping libvirt-bin will kill all running VMs

stop libvirt-bin
killall -9 dnsmasq
start libvirt-bin



AppArmor:

http://wiki.apparmor.net/index.php/QuickProfileLanguage
http://wiki.apparmor.net/index.php/Documentation
https://help.ubuntu.com/community/AppArmor

# disable apparmor (Ubuntu pre-systemd)
sudo invoke-rc.d apparmor kill  [or stop]
sudo update-rc.d -f apparmor remove

# enable apparmor (Ubuntu pre-systemd)
sudo invoke-rc.d apparmor start
sudo update-rc.d apparmor start 37 S .

# reload all profiles (Ubuntu pre-systemd)
sudo invoke-rc.d apparmor reload


/etc/apparmor.d/libvirt/libvirt-<guid>
/etc/apparmor.d/libvirt/libvirt-<guid>.files


# if profile does not exist:
export VM=foo
virsh dumpxml $VM | sudo /usr/lib/libvirt/virt-aa-helper -c -u libvirt-`virsh domuuid $VM`

# if profile already exists:
export VM=foo
virsh dumpxml $VM | sudo /usr/lib/libvirt/virt-aa-helper -r -u libvirt-`virsh domuuid $VM`


/var/log/libvirt/...
/var/run/libvirt/...
/var/lib/libvirt/...

For example:

libvirt-e83aba02-6a56-e4b7-5f4f-f64f4171aa98

#
# This profile is for the domain whose UUID matches this file.
#

#include <tunables/global>

profile libvirt-e83aba02-6a56-e4b7-5f4f-f64f4171aa98 {
  #include <abstractions/libvirt-qemu>
  #include <libvirt/libvirt-e83aba02-6a56-e4b7-5f4f-f64f4171aa98.files>

}

libvirt-bfa46efa-96d0-e063-05bb-1ecbffb19216.files

# DO NOT EDIT THIS FILE DIRECTLY. IT IS MANAGED BY LIBVIRT.
  "/var/log/libvirt/**/w7a.log" w,
  "/var/lib/libvirt/**/w7a.monitor" rw,
  "/var/run/libvirt/**/w7a.pid" rwk,
  "/run/libvirt/**/w7a.pid" rwk,
  "/var/run/libvirt/**/*.tunnelmigrate.dest.w7a" rw,
  "/run/libvirt/**/*.tunnelmigrate.dest.w7a" rw,
  "/home/sergey/vms/w7a/disk1" rw,
  "/home/sergey/iso/en_windows_7_ultimate_with_sp1_x64_dvd_u_677332.iso" r,
  # don't audit writes to readonly files
  deny "/home/sergey/iso/en_windows_7_ultimate_with_sp1_x64_dvd_u_677332.iso" w,