http://www.osronline.com
http://www.osronline.com/ddkx/kmarch/k112_49bm.htm
http://www.osronline.com/ddkx/kmarch/irps_8lgn.htm
http://www.kernelmode.info
http://msdn.microsoft.com/en-us/library/windows/hardware/ff552185%28v=vs.85%29.aspx
http://fsfilters.blogspot.com
http://www.codemachine.com/articles.html
http://msdn.microsoft.com/en-us/library/ms810047.aspx
http://www.osronline.com/showthread.cfm?link=202323
http://msdn.microsoft.com/en-us/library/windows/hardware/ff544652%28v=vs.85%29.aspx
Kernel Transaction Manager: http://msdn.microsoft.com/en-us/library/windows/hardware/ff565408%28v=vs.85%29.aspx
Transactional NTFS: http://msdn.microsoft.com/en-us/library/bb968806%28VS.85%29.aspx
All Driver Support Routines: http://msdn.microsoft.com/en-us/library/windows/hardware/ff544200%28v=vs.85%29.aspx
Code analysis for drivers: http://msdn.microsoft.com/en-us/library/windows/hardware/gg487345.aspx
Root: http://msdn.microsoft.com/en-us/library/windows/hardware/ff960953.aspx
Installable file system drivers:
http://msdn.microsoft.com/en-us/library/windows/hardware/ff551834%28v=vs.85%29.aspx
Useful macros:
NTSTATUS status;
if (NT_SUCCESS(status)) …
if (NT_INFORMATION(status)) …
if (NT_WARNING(status)) …
if (NT_ERROR(status)) …
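The four macros test the severity field in the top two bits of an NTSTATUS value. A minimal user-mode sketch follows, assuming the usual layout (0 = success, 1 = informational, 2 = warning, 3 = error). The typedef and macro bodies are portable stand-ins written for this example, not copies of the WDK headers, and `status_severity` and `status_examples` are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-ins for the WDK definitions (assumed equivalents). */
typedef int32_t NTSTATUS;

#define NT_SUCCESS(s)     (((NTSTATUS)(s)) >= 0)
#define NT_INFORMATION(s) ((((uint32_t)(s)) >> 30) == 1)
#define NT_WARNING(s)     ((((uint32_t)(s)) >> 30) == 2)
#define NT_ERROR(s)       ((((uint32_t)(s)) >> 30) == 3)

/* A few well-known status values, spelled out for the demonstration. */
#define STATUS_SUCCESS            ((NTSTATUS)0x00000000)
#define STATUS_TIMEOUT            ((NTSTATUS)0x00000102)
#define STATUS_OBJECT_NAME_EXISTS ((NTSTATUS)0x40000000)
#define STATUS_BUFFER_OVERFLOW    ((NTSTATUS)0x80000005)
#define STATUS_UNSUCCESSFUL       ((NTSTATUS)0xC0000001)

/* Hypothetical helper: extract the 2-bit severity field. */
static unsigned status_severity(NTSTATUS s)
{
    return (unsigned)(((uint32_t)s) >> 30);
}

/* Hypothetical helper: the classifications as executable checks. */
static void status_examples(void)
{
    assert(NT_SUCCESS(STATUS_SUCCESS));
    assert(NT_SUCCESS(STATUS_TIMEOUT));            /* severity 0 */
    assert(NT_SUCCESS(STATUS_OBJECT_NAME_EXISTS)); /* informational still succeeds */
    assert(NT_INFORMATION(STATUS_OBJECT_NAME_EXISTS));
    assert(NT_WARNING(STATUS_BUFFER_OVERFLOW));    /* warnings fail NT_SUCCESS */
    assert(!NT_SUCCESS(STATUS_BUFFER_OVERFLOW));
    assert(NT_ERROR(STATUS_UNSUCCESSFUL));
}
```

Note that NT_SUCCESS is true for both success and informational values, so a routine that returns STATUS_TIMEOUT still passes an NT_SUCCESS check.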
max(a, b)
min(a, b)
RTL_NUMBER_OF(array) ≡ countof(array)
RTL_NUMBER_OF_FIELD(type, field) ≡ countof(type.field)
typedef struct {
FOO bar;
LIST_ENTRY list;
LAST behind;
} TS, *PTS;
LIST_ENTRY* plist = &pts->list;
FIELD_OFFSET(TS, list)
CONTAINING_RECORD(plist, TS, list)
RTL_SIZEOF_THROUGH_FIELD(TS, list)
RTL_CONTAINS_FIELD(pts, size, behind) – check if field is present in a possibly truncated instance of the structure (pts points to the instance, size is its actual size in bytes)
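CONTAINING_RECORD is what makes embedded LIST_ENTRY fields usable: given a pointer to the field, it subtracts the field offset to recover the enclosing structure. The sketch below reproduces the idea in portable C using `offsetof`. The macro bodies are stand-ins written for this example, the record type is a simplified version of TS with `int` fields, and `record_from_link` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-ins; the WDK supplies its own definitions of these. */
typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink;
    struct _LIST_ENTRY *Blink;
} LIST_ENTRY;

#define FIELD_OFFSET(type, field) ((long)offsetof(type, field))
#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

/* Simplified record modeled on the TS structure (field types assumed). */
typedef struct {
    int        bar;
    LIST_ENTRY list;
    int        behind;
} TS;

/* Given a pointer to the embedded list field, recover the enclosing record. */
static TS *record_from_link(LIST_ENTRY *link)
{
    assert(link != NULL);
    return CONTAINING_RECORD(link, TS, list);
}
```

The second argument of both macros is the structure type (TS), not the pointer type (PTS).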
const static UNICODE_STRING strU = RTL_CONSTANT_STRING(L"Foo");
const static STRING str = RTL_CONSTANT_STRING("Foo");
const static ANSI_STRING strA = RTL_CONSTANT_STRING("Foo");
Instead of:
UNICODE_STRING strU;
STRING str;
ANSI_STRING strA;
RtlInitUnicodeString(&strU, L"Foo");
RtlInitString(&str, "Foo");
RtlInitAnsiString(&strA, "Foo");
Note: the Unicode <-> ANSI conversion routines are pageable, so conversion cannot be performed at IRQL >= DISPATCH_LEVEL.
ALIGN_UP(length, type)
ALIGN_DOWN(length, type)
p = ALIGN_UP_POINTER(p, ULONG)
p = ALIGN_DOWN_POINTER(p, ULONG)
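A sketch of the rounding these macros perform, assuming they align to a multiple of sizeof(type) and that the size is a power of two. The helpers `align_down_by`, `align_up_by` and the `MY_ALIGN_*` macros are hypothetical names for this example, not WDK identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round value down to a multiple of alignment (must be a power of two). */
static uintptr_t align_down_by(uintptr_t value, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return value & ~((uintptr_t)alignment - 1);
}

/* Round value up to a multiple of alignment (must be a power of two). */
static uintptr_t align_up_by(uintptr_t value, size_t alignment)
{
    return align_down_by(value + alignment - 1, alignment);
}

/* Type-based wrappers in the style of ALIGN_DOWN / ALIGN_UP. */
#define MY_ALIGN_DOWN(length, type) align_down_by((uintptr_t)(length), sizeof(type))
#define MY_ALIGN_UP(length, type)   align_up_by((uintptr_t)(length), sizeof(type))
```

With a 4-byte type, a length of 13 rounds up to 16 and down to 12; a value that is already aligned is returned unchanged.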
C_ASSERT(SMALL_BUFFER < LARGEST_BUFFER); // compile-time assertion
VOID MyDpcRoutine(IN PKDPC Dpc, IN PVOID DeferredContext,
IN PVOID SystemArgument1, IN PVOID SystemArgument2)
{
UNREFERENCED_PARAMETER(Dpc); // Dpc parameter is not used
if (ARGUMENT_PRESENT(SystemArgument1)) // same as if (SystemArgument1 != NULL)
{
// Perform work on SystemArgument1
}
}
Objects:
IO objects:
FILE_OBJECT // IoCreateFile, ZwCreateFile, ZwOpenFile
DEVICE_OBJECT
DRIVER_OBJECT
KINTERRUPT (interrupt object)
ADAPTER_OBJECT
IRP
Other:
Symbolic links // ZwOpenSymbolicLinkObject
Directory // ZwCreateDirectoryObject
Registry keys // IoOpenDeviceInterfaceRegistryKey, IoOpenDeviceRegistryKey, ZwCreateKey, ZwOpenKey
Callback objects
Section objects // ZwOpenSection
Threads and processes // PsCreateSystemThread
Control objects:
KDPC (DPC object)
KAPC (APC object)
Spinlocks
Dispatcher (waitable) objects:
kernel mutex/mutant
event // IoCreateSynchronizationEvent, IoCreateNotificationEvent
timer
semaphore
threads, processes, files
Waitable at <= APC_LEVEL, settable at <= DISPATCH_LEVEL.
To manipulate dispatcher objects, the system must internally synchronize at (elevate to) DISPATCH_LEVEL and acquire the system-wide dispatcher spinlock.
All kernel dispatcher objects are based on a common header (DISPATCHER_HEADER).
Kernel dispatcher objects can be referred to by either a handle or a pointer.
If a handle is used to refer to a kernel dispatcher object created in an arbitrary thread context, the OBJ_KERNEL_HANDLE attribute must be set for the object.
Setting this attribute prevents user-mode threads from accessing the handle.
If (Timeout != NULL && Timeout->QuadPart == 0), KeWaitFor... can also be called at DISPATCH_LEVEL.
Otherwise only at <= APC_LEVEL.
Note that if WaitMode == UserMode, the kernel stack can be swapped out.
In this case no such object may be on the stack: neither the object being waited for nor any other object actively exposed to the system (KEVENT, KDPC, KTIMER etc).
If Alertable is TRUE, may need to remember to cancel the object (such as KTIMER).
Effects of the Alertable and WaitMode parameters on APC delivery.
Each cell reads: terminate wait? / deliver and run APC?
Alertable, WaitMode | Special kernel APC | Normal kernel APC | User-mode APC
TRUE, UserMode | No / (1) | No / (2) | Yes / Yes, after thread returns to user mode
TRUE, KernelMode | No / (1) | No / (2) | No / No
FALSE, UserMode | No / (1) | No / (2) | No / No (with exceptions, such as CTRL+C to terminate)
FALSE, KernelMode | No / (1) | No / (2) | No / No
(1) if IRQL == PASSIVE_LEVEL
(2) if IRQL == PASSIVE_LEVEL, thread not already in an APC, and thread not in a critical region
Waiting in UserMode is safe only if the waiting driver
is the only driver on the stack.
If one or more other drivers are on the stack, one of those drivers might try to
update a stack variable, thereby causing a page fault. If that driver is
running at IRQL=DISPATCH_LEVEL or higher, the page fault will cause the system
to crash.
Because PnP driver stacks often include filter drivers, PnP drivers rarely set WaitMode
to UserMode.
NTSTATUS KeWaitForSingleObject(
PVOID Object, // event, mutex, semaphore, thread, timer
KWAIT_REASON WaitReason, // Executive or UserRequest, the latter only if on behalf of the user and in the context of the requestor user thread
KPROCESSOR_MODE WaitMode, // KernelMode or UserMode; intermediate and lower-level drivers should use KernelMode; for a mutex must be KernelMode
BOOLEAN Alertable, // for drivers, usually FALSE
LARGE_INTEGER* Timeout); // optional, may be NULL
// positive value is absolute time in 100-ns units since January 1, 1601, subject to changes in system time
// negative value is an interval relative to current time, not affected by system time changes
// zero means no waiting
STATUS_SUCCESS
STATUS_ALERTED
STATUS_USER_APC
STATUS_TIMEOUT
STATUS_ABANDONED_WAIT_0
other condition codes
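The sign convention of Timeout is easy to get wrong, so here is a small user-mode sketch of the arithmetic: a relative timeout is the negated number of 100-ns units. The helper names `relative_timeout_from_ms` and `classify_timeout` are hypothetical; in a driver the computed value would be stored in the QuadPart field of a LARGE_INTEGER and its address passed as Timeout.

```c
#include <assert.h>
#include <stdint.h>

#define HUNDRED_NS_PER_MS 10000LL  /* 1 ms = 10,000 units of 100 ns */

typedef enum {
    TIMEOUT_POLL,      /* zero: test the object without waiting */
    TIMEOUT_RELATIVE,  /* negative: interval from now */
    TIMEOUT_ABSOLUTE   /* positive: absolute system time */
} timeout_kind;

/* Build a relative timeout: a negative count of 100-ns units. */
static int64_t relative_timeout_from_ms(int64_t milliseconds)
{
    assert(milliseconds >= 0);
    return -(milliseconds * HUNDRED_NS_PER_MS);
}

/* Interpret a timeout value according to the sign rules above. */
static timeout_kind classify_timeout(int64_t quad_part)
{
    if (quad_part == 0)
        return TIMEOUT_POLL;
    return (quad_part < 0) ? TIMEOUT_RELATIVE : TIMEOUT_ABSOLUTE;
}
```

A one-second relative wait is therefore -10,000,000, and a zero-millisecond request degenerates into the "no waiting" case.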
NTSTATUS KeWaitForMultipleObjects(
ULONG Count, // must be <= MAXIMUM_WAIT_OBJECTS (64)
PVOID Objects[],
WAIT_TYPE WaitType, // WaitAll or WaitAny
KWAIT_REASON WaitReason,
KPROCESSOR_MODE WaitMode, // if any of the objects is a mutex, must specify KernelMode
BOOLEAN Alertable,
LARGE_INTEGER* Timeout,
KWAIT_BLOCK* WaitBlockArray); // scratch array, must be non-NULL if Count > THREAD_WAIT_OBJECTS (3)
// if WaitMode == UserMode, must not allocate it on the stack since the stack can get swapped out
STATUS_SUCCESS
STATUS_ALERTED
STATUS_USER_APC
STATUS_TIMEOUT
STATUS_WAIT_0 ... STATUS_WAIT_63
STATUS_ABANDONED_WAIT_0 ... STATUS_ABANDONED_WAIT_63
other condition codes
If kernel-mode APC delivery is disabled, user-mode APC
delivery is disabled as well.
If WaitMode == KernelMode, then user-mode APC does not interrupt wait
regardless of Alertable.
If WaitMode == UserMode and Alertable == TRUE, wait can be
interrupted (with status STATUS_USER_APC) and user-mode APC will be delivered.
If WaitMode == UserMode and Alertable == FALSE, wait can be
interrupted for thread termination, but not for user-mode APC.
Kernel APCs (if enabled) do not cause KeWaitForXxx and KeDelayExecution
to return: the system interrupts and resumes the wait internally (but it is
possible for a waiter to miss a transient signal such as KePulseEvent).
Alerts, a very seldom used mechanism that is internal to the operating system
(NtAlertThread, NtAlertResumeThread), can also interrupt wait
states if Alertable = TRUE, regardless of the WaitMode; the
waiting routine then returns STATUS_ALERTED.
IRQL levels:
amd64 | x86-32 | ia64 | name | description
15 | 31 | 15 | HIGH_LEVEL | NMI, machine check, catastrophic errors
15 | 27 | 15 | PROFILE_LEVEL | profiling timer for releases earlier than Windows 2000
14 | 30 | 15 | POWER_LEVEL | power failure
14 | | 14 | DRS_LEVEL | deferred recovery service
14 | 29 | 14 | IPI_LEVEL | IPI interrupts
13 | 28 | 13 | CLOCK_LEVEL | interval clock
13 | 27 | 13 | SYNCH_LEVEL | synchronization of code and instruction streams across processors
n/a | n/a | 12 | PC_LEVEL | performance counter
3-11 | 3-26 | 4-11 | (DIRQL) | device interrupts
3 | 3 | 4 | DEVICE_LEVEL_BASE | devices: from here to below clock
n/a | n/a | 3 | CMCI_LEVEL | correctable machine check
2 | 2 | 2 | DISPATCH_LEVEL | DPC: disables preemption, but not ISRs
1 | 1 | 1 | APC_LEVEL | blocks APC delivery, similar to ASTDEL
0 | 0 | 0 | PASSIVE_LEVEL | aka LOW_LEVEL
#include <wdm.h>
KIRQL KeGetCurrentIrql()
VOID KeRaiseIrql(KIRQL newIrql, OUT KIRQL* oldIrql)
VOID KeLowerIrql(KIRQL newIrql)
Driver routines are called at IRQL described in “Scheduling, Thread Context, and IRQL” (http://msdn.microsoft.com/en-us/library/ms810029.aspx)
KAPC:
Special kernel APC: dispatched at APC_LEVEL, preempts
user-mode code, kernel-mode code and normal kernel APCs.
Normal kernel APC: dispatched at PASSIVE_LEVEL, preempts user-mode code,
kernel-mode.
Guarded region: disables all APCs.
VOID KeEnterGuardedRegion();
VOID KeLeaveGuardedRegion();
Also holding a guarded mutex implicitly places holder within a guarded region (see KeAcquireGuardedMutex).
Critical region: disables normal APCs, but special kernel APCs are still delivered.
VOID KeEnterCriticalRegion();
VOID KeLeaveCriticalRegion();
Also holding a mutex implicitly places holder within a critical region.
While in critical region, cannot open files on storage media or throw errors.
Thread killing and suspension is performed via normal KAPC, so code that acquires a resource or lock must be called within a critical region at a minimum or at APC_LEVEL.
Note that any routine that relies on IO completion must be called with special KAPC enabled.
Raising to APC_LEVEL disables all KAPCs.
Holding a fast mutex implicitly raises IRQL to APC_LEVEL (see ExAcquireFastMutex).
BOOLEAN KeAreApcsDisabled(void); // inside critical region or guarded region (callable at <= DISPATCH_LEVEL)
BOOLEAN KeAreAllApcsDisabled(void); // inside guarded region or IRQL >= APC_LEVEL (callable at <= DISPATCH_LEVEL)
Undocumented KeInitializeApc, KeInsertQueueApc.
See http://www.microsoft.com/msj/0799/nerd/nerd0799.aspx
See http://www.codeproject.com/Articles/5812/Yet-Another-Thread-Monitor
See http://www.codeproject.com/Articles/13572/Starting-a-Process-from-KernelMode
Executive Spinlocks (Ordinary and Queued):
May be acquired at <= DISPATCH_LEVEL.
Elevates to DISPATCH_LEVEL.
Non-recursive.
Timeout in checked build.
On uniprocessor just elevates IRQL.
VOID KeInitializeSpinLock(KSPIN_LOCK* SpinLock)
VOID KeAcquireSpinLock(KSPIN_LOCK* SpinLock, KIRQL* oldIrql)
VOID KeReleaseSpinLock(KSPIN_LOCK* SpinLock, KIRQL newIrql)
VOID KeAcquireSpinLockAtDpcLevel(KSPIN_LOCK* SpinLock)
VOID KeReleaseSpinLockFromDpcLevel(KSPIN_LOCK* SpinLock)
Queued spinlocks are more efficient for a high-contention lock on a multiprocessor.
They also provide first come, first served ordering.
KSPIN_LOCK SpinLock;
KLOCK_QUEUE_HANDLE LockHandle; // usually on stack
VOID KeAcquireInStackQueuedSpinLock(& SpinLock, & LockHandle);
VOID KeReleaseInStackQueuedSpinLock(& LockHandle);
VOID KeAcquireInStackQueuedSpinLockAtDpcLevel(& SpinLock, & LockHandle);
VOID KeReleaseInStackQueuedSpinLockFromDpcLevel(& LockHandle);
Note that ExInterlockedXxx routines can use spinlocks at any IRQL, while KeXxx routines only at DISPATCH_LEVEL.
Therefore if ExInterlockedXxx is called from an ISR (i.e. from an interrupt above DISPATCH_LEVEL), it should not use the same spinlock as used by KeXxx.
A driver can pass the same spinlock to KeXxx and to ExInterlockedXxx as long as both routines use the spinlock at the same IRQL.
Interrupt Spinlocks:
Elevates to DIRQL (if device is multi-level, must be
the highest of device interrupt levels).
Non-recursive.
NTSTATUS IoConnectInterrupt(
KINTERRUPT** InterruptObject, // will store here the pointer to created KINTERRUPT object
PKSERVICE_ROUTINE ServiceRoutine, // ISR: BOOLEAN InterruptService(KINTERRUPT* Interrupt, PVOID ServiceContext)
PVOID ServiceContext, // passed to ISR
PKSPIN_LOCK SpinLock, // optional: required if the driver handles more than one vector or has multiple ISRs that need to be serialized
ULONG Vector,
KIRQL Irql, // IRQL for this interrupt as returned by HalGetInterruptVector
KIRQL SynchronizeIrql, // DIRQL; if multiple vectors, then the highest of them, otherwise the same as Irql; ISR will execute at this IRQL
KINTERRUPT_MODE InterruptMode, // LevelSensitive or Latched (Latched = edge-triggered; PCI is LevelSensitive)
BOOLEAN ShareVector, // if driver is willing to share interrupt vector with other devices
KAFFINITY ProcessorEnableMask, // set of processors that can process this interrupt, usually the same as Affinity returned by HalGetInterruptVector
BOOLEAN FloatingSave) // whether to save floating point registers, for x86 must be FALSE
STATUS_SUCCESS
STATUS_INVALID_PARAMETER
STATUS_INSUFFICIENT_RESOURCES
SpinLock is normally set to NULL. This results in kernel using spinlock internal to KINTERRUPT to ensure ISR is active on one CPU at a time.
However if driver uses multiple interrupt sources and wishes to synchronize across them (serialize ISRs), it can use external spinlock shared across multiple KINTERRUPT objects.
Note: interrupt spinlock must never be acquired with KeAcquireSpinLock since it does not elevate IRQL to DIRQL (only to DISPATCH_LEVEL).
Must use KeSynchronizeExecution or KeAcquireInterruptSpinLock.
If a driver that is not willing to share the interrupt (ShareVector = FALSE) attaches to Vector, another IoConnectInterrupt to the same Vector will fail.
If an interrupt occurs on a vector that has been attached to as LevelSensitive, the kernel calls the ISRs attached to this vector until some ISR returns TRUE or until all have been called.
This is because LevelSensitive devices will continue to interrupt until an interrupt is acknowledged on each device.
For a Latched vector the kernel always calls all ISRs attached to the vector, because there is no way to determine how many Latched interrupts have been requested.
Furthermore, ISRs need to be called iteratively until all return FALSE to ensure no interrupts have been lost.
BOOLEAN InterruptServiceRoutine(KINTERRUPT* Interrupt, VOID* ServiceContext)
Called at SynchronizeIrql with the external SpinLock held if one was specified, otherwise with the KINTERRUPT internal spinlock held.
NTSTATUS IoConnectInterruptEx(IO_CONNECT_INTERRUPT_PARAMETERS* Parameters)
VOID IoDisconnectInterrupt(KINTERRUPT* InterruptObject)
BOOLEAN KeSynchronizeExecution(
KINTERRUPT* Interrupt,
PKSYNCHRONIZE_ROUTINE SynchronizeRoutine,
PVOID SynchronizeContext)
BOOLEAN SynchronizeRoutine(PVOID SynchronizeContext) // executed at SynchronizeIrql and with interrupt object spinlock
KIRQL KeAcquireInterruptSpinLock(KINTERRUPT* Interrupt)
VOID KeReleaseInterruptSpinLock(KINTERRUPT* Interrupt, KIRQL OldIrql)
###
http://download.microsoft.com/download/5/7/7/577a5684-8a83-43ae-9272-ff260a9c20e2/MSI.doc
### ch 15
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff544079%28v=vs.85%29.aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff565513%28v=vs.85%29.aspx
Reader/Writer spinlocks:
Must be acquired from <= DISPATCH_LEVEL.
Elevates to DISPATCH_LEVEL.
Non-recursive.
EX_SPIN_LOCK ExSpinLock = 0;
KIRQL ExAcquireSpinLockExclusive(& ExSpinLock);
KIRQL ExAcquireSpinLockShared(& ExSpinLock);
VOID ExAcquireSpinLockExclusiveAtDpcLevel(& ExSpinLock);
VOID ExAcquireSpinLockSharedAtDpcLevel(& ExSpinLock);
VOID ExReleaseSpinLockExclusive(& ExSpinLock, KIRQL OldIrql);
VOID ExReleaseSpinLockExclusiveFromDpcLevel(& ExSpinLock);
VOID ExReleaseSpinLockShared(& ExSpinLock, KIRQL OldIrql);
VOID ExReleaseSpinLockSharedFromDpcLevel(& ExSpinLock);
BOOLEAN ExTryConvertSharedSpinLockExclusive(& ExSpinLock);
Fast mutexes:
Elevates to APC_LEVEL.
Can only be acquired at <= APC_LEVEL.
Non-recursive.
Low-overhead (but guarded mutexes are more efficient before Win8; since Win8 guarded mutex == fast mutex).
FAST_MUTEX fm;
VOID ExInitializeFastMutex(& fm);
VOID ExAcquireFastMutex(& fm);
VOID ExReleaseFastMutex(& fm);
BOOLEAN ExTryToAcquireFastMutex(& fm);
VOID ExAcquireFastMutexUnsafe(& fm);
VOID ExReleaseFastMutexUnsafe(& fm);
“Unsafe” routines do not change IRQL and can be called at APC_LEVEL.
Guarded mutexes:
Before Win8: enters guarded region with all APCs disabled,
IRQL is unchanged.
Since Win8: implemented as fast mutex, i.e. elevates to APC_LEVEL.
Can only be acquired at <= APC_LEVEL.
Non-recursive.
Even more efficient than fast mutexes (but since Win8 are the same).
KGUARDED_MUTEX gm;
VOID KeInitializeGuardedMutex(& gm);
VOID KeAcquireGuardedMutex(& gm);
VOID KeReleaseGuardedMutex(& gm);
BOOLEAN KeTryToAcquireGuardedMutex(& gm);
VOID KeAcquireGuardedMutexUnsafe(& gm);
VOID KeReleaseGuardedMutexUnsafe(& gm);
“Unsafe” routines do not enter/exit guarded region and can be used only either within an existing guarded region or at APC_LEVEL.
Kernel Mutexes (Mutants):
While acquired, thread is in a critical region: no kernel APCs except special kernel-mode APCs.
Not as performant as fast or guarded mutexes.
Recursive.
Does not change IRQL.
Wait at <= APC_LEVEL.
Signal (release) or state checking at <= DISPATCH_LEVEL.
KMUTEX mtx;
VOID KeInitializeMutex(& mtx, 0)
LONG KeReadStateMutex(& mtx) // 1 -> signalled
LONG KeReleaseMutex(& mtx, BOOLEAN Wait) // if Wait == TRUE, does not restore caller's IRQL, and caller must immediately call KeWaitxxx
KeWaitForSingleObject
KeWaitForMultipleObjects
NTSTATUS KeWaitForMutexObject(
PVOID Mutex,
KWAIT_REASON WaitReason,
KPROCESSOR_MODE WaitMode, // must be KernelMode
BOOLEAN Alertable,
PLARGE_INTEGER Timeout);
Events:
NotificationEvent: (level, manual-resetting) releases all waiting threads, remains signaled
SynchronizationEvent: (edge, auto-resetting) releases single waiting thread for execution and auto-resets to non-signaled
Wait at <= APC_LEVEL, signal or check at <= DISPATCH_LEVEL.
KEVENT ev;
VOID KeInitializeEvent(& ev,
EVENT_TYPE Type, // NotificationEvent or SynchronizationEvent
BOOLEAN InitialState);
LONG KeSetEvent(& ev, // if previously was signaled, returns non-zero; if Wait is FALSE, callable at <= DISPATCH_LEVEL
KPRIORITY Increment,
BOOLEAN Wait); // if TRUE, does not restore caller's IRQL, and caller must immediately call KeWaitxxx
LONG KePulseEvent(& ev, // if previously was signaled, returns non-zero
KPRIORITY Increment,
BOOLEAN Wait); // if TRUE, does not restore caller's IRQL, and caller must immediately call KeWaitxxx
Temporarily signal event, satisfy as many waiters as possible, and reset event.
Warning: if a thread waiting for the event is currently running a kernel APC, then its wait is not satisfied, and after the APC completes, the thread remains in the wait state.
LONG KeResetEvent(& ev); // if previously was signaled, returns non-zero; callable at <= DISPATCH_LEVEL
VOID KeClearEvent(& ev); // faster than KeResetEvent, but does not return previous state
LONG KeReadStateEvent(& ev); // if currently set to signaled state, returns non-zero
KeWaitForSingleObject
KeWaitForMultipleObjects
PKEVENT IoCreateNotificationEvent(PUNICODE_STRING EventName, OUT PHANDLE EventHandle)
PKEVENT IoCreateSynchronizationEvent (PUNICODE_STRING EventName, OUT PHANDLE EventHandle)
To use event handles passed from user:
· Validate the handle received in the IOCTL by calling ObReferenceObjectByHandle. In the DesiredAccess parameter, specify SYNCHRONIZE access, and in the ObjectType parameter, specify *ExEventObjectType.
· To signal the event, call KeSetEvent; to reset a notification event, call KeResetEvent.
· Call ObDereferenceObject to free the handle when the event is no longer needed.
Timers:
NotificationTimer: (level) release all waiting threads, remain signaled
SynchronizationTimer: (edge) releases single waiting thread for execution and auto-resets to non-signaled
Wait at <= APC_LEVEL, schedule and cancel at <= DISPATCH_LEVEL.
KTIMER tmr;
VOID KeInitializeTimer(& tmr); // notification timer
VOID KeInitializeTimerEx(& tmr,
TIMER_TYPE Type); // NotificationTimer or SynchronizationTimer
BOOLEAN KeSetTimer(& tmr, // if timer was already in the queue, returns TRUE
LARGE_INTEGER DueTime, // positive is absolute time (affected by system time changes), negative is interval (not affected by system time changes)
KDPC* Dpc); // optional, on timer expiration inserted into DPC queue
BOOLEAN KeSetTimerEx(& tmr,
LARGE_INTEGER DueTime,
LONG Period, // optional repetition period in milliseconds (must be <= MAXLONG)
KDPC* Dpc);
If timer was already in the queue, it is implicitly cancelled and re-inserted with the new values.
A call to KeSetTimer(Ex) before the previously specified DueTime expires cancels both the timer and the call to the DPC.
On timer expiration, the timer is set to signaled and the DPC, if one was specified, is inserted into the DPC queue.
DPC routine cannot deallocate a periodic timer object, but can deallocate a non-periodic timer object.
Dpc sysargs 1 and 2 are: systime.LowPart, systime.HighPart.
BOOLEAN KeReadStateTimer(& tmr); // returns TRUE if tmr is signaled
BOOLEAN KeCancelTimer(& tmr); // if was in the queue, return TRUE
If timer was in the queue, it is removed from the queue.
If a DPC object is associated with the timer, it is also cancelled.
If timer was still in the queue, return TRUE.
Periodic timers are always in the queue, so KeCancelTimer always returns TRUE for them (and thus tells nothing about the DPC state).
For a non-periodic timer that has expired and whose DPC has been queued (and is possibly still in flight), KeCancelTimer returns FALSE.
Call KeFlushQueuedDpcs to block until the DPC completes.
KeWaitForSingleObject
KeWaitForMultipleObjects
Once-a-second device timer routine:
NTSTATUS IoInitializeTimer(DEVICE_OBJECT* DeviceObject, IO_TIMER_ROUTINE* TimerRoutine, PVOID Context) // at PASSIVE_LEVEL
VOID IoStartTimer(DEVICE_OBJECT* DeviceObject) // at <= DISPATCH_LEVEL
VOID IoStopTimer(DEVICE_OBJECT* DeviceObject) // at <= DISPATCH_LEVEL
IO_TIMER_ROUTINE IoTimer;
VOID IoTimer(DEVICE_OBJECT* DeviceObject, VOID* Context) // called at DISPATCH_LEVEL
Semaphores:
Wait at <= APC_LEVEL, signal at <= DISPATCH_LEVEL
KSEMAPHORE sem;
VOID KeInitializeSemaphore(& sem,
LONG Count, // initial count
LONG Limit); // maximum count
LONG KeReleaseSemaphore(& sem,
KPRIORITY Increment, // priority increment to give to waiters
LONG Adjustment, // units to release, if too much (would exceed Limit) will fail with STATUS_SEMAPHORE_LIMIT_EXCEEDED
BOOLEAN Wait); // if TRUE, returns at elevated IRQL and caller must immediately execute KeWaitForXxx
LONG KeReadStateSemaphore(& sem); // if sem is signaled, returns non-zero
KeWaitForSingleObject
KeWaitForMultipleObjects
Callbacks:
Notification can happen at <= DISPATCH_LEVEL, all other calls at <= APC_LEVEL.
OBJECT_ATTRIBUTES oa;
PCALLBACK_OBJECT cbk; // pointer to the callback object, filled in by ExCreateCallback
VOID InitializeObjectAttributes(& oa,
PUNICODE_STRING ObjectName,
ULONG Attributes, // typically OBJ_CASE_INSENSITIVE
HANDLE RootDirectory, // if ObjectName is fully qualified, RootDirectory is NULL
PSECURITY_DESCRIPTOR SecurityDescriptor); // optional
NTSTATUS ExCreateCallback(& cbk, & oa,
BOOLEAN Create, // TRUE -> create if does not exist yet
BOOLEAN AllowMultipleCallbackRoutines);
PVOID ExRegisterCallback(cbk, // returns registration handle
PCALLBACK_FUNCTION CallbackFunction,
PVOID CallbackContext);
VOID (*PCALLBACK_FUNCTION) (PVOID CallbackContext, PVOID Argument1, PVOID Argument2);
VOID ExNotifyCallback(cbk, PVOID Argument1, PVOID Argument2);
Notify at <= DISPATCH_LEVEL; callback routines are called in the context of the notifying thread at the same IRQL at which notification occurred.
VOID ExUnregisterCallback(PVOID RegistrationHandle);
Some predefined callback names:
\Callback\SetSystemTime
\Callback\PowerState
\Callback\ProcessorAdd
Executive resources (RW locks):
Acquire at <= APC_LEVEL.
Normal kernel APC delivery must be disabled while holding a resource, i.e. acquire within a critical region.
ERESOURCE res;
On 32-bit platforms ERESOURCE must be 4-byte aligned, on 64-bit platforms must be 8-byte aligned.
Recursive count?
NTSTATUS ExInitializeResourceLite(& res);
NTSTATUS ExDeleteResourceLite(& res); // IRQL <= APC_LEVEL
NTSTATUS ExReinitializeResourceLite(& res); // equivalent to Delete + Initialize
Below: Wait == TRUE -> do wait, otherwise just "try" and return status
BOOLEAN ExAcquireResourceSharedLite(& res, BOOLEAN Wait);
If already acquired as shared, grant recursively as
shared.
If already acquired as exclusive, grant recursively as exclusive.
BOOLEAN ExAcquireResourceExclusiveLite(& res, BOOLEAN Wait);
If already acquired as exclusive, grant recursively as
exclusive.
If already acquired as shared, must first release the lock before trying to
acquire exclusive.
BOOLEAN ExAcquireSharedStarveExclusive(& res, BOOLEAN Wait);
Like ExAcquireResourceSharedLite, but take priority over any pending exclusive waiters.
BOOLEAN ExAcquireSharedWaitForExclusive(& res, BOOLEAN Wait);
If already
acquired as exclusive, grant recursively as exclusive.
If resource is already owned as shared and there are no pending exclusive
waiters, grant shared access (if caller already held this lock, then
recursively).
If resource is owned as shared (but not by current holder) and there is
exclusive waiter, give priority to exclusive waiter and get shared access only
after it.
If resource is owned as shared by current holder and there is exclusive waiter,
behavior is unclear from the documentation.
VOID ExReleaseResourceLite(& res); // unclear: decrements one or release recursive all acquisitions
VOID ExReleaseResourceForThreadLite(& res, ERESOURCE_THREAD ResourceThreadId); // release resource on behalf of the indicated thread
BOOLEAN ExIsResourceAcquiredExclusiveLite(& res); // check if the caller acquired res for exclusive access
ULONG ExIsResourceAcquiredSharedLite(& res); // number of times the caller has acquired res for shared or exclusive (doc?) access
ULONG ExIsResourceAcquiredLite(& res); // number of times the caller has acquired res for shared or exclusive access
ERESOURCE_THREAD ExGetCurrentResourceThread(); // thread is for subsequent call to ExReleaseResourceForThreadLite
ULONG ExGetExclusiveWaiterCount(& res); // not interlocked: just an estimate
ULONG ExGetSharedWaiterCount(& res); // not interlocked: just an estimate
VOID ExResourceOwnerPointer(Ex)(…) // change resource ownership to another thread or non-thread owner
### more:
http://msdn.microsoft.com/en-us/library/windows/hardware/hh454220%28v=vs.85%29.aspx
Singly-linked lists:
SINGLE_LIST_ENTRY head;
SINGLE_LIST_ENTRY entry;
KSPIN_LOCK spinlock;
entry.Next
macro CONTAINING_RECORD(structaddr, typename, fieldname)
macro FIELD_OFFSET(typename, fieldname)
VOID PushEntryList(& head, & entry); // insert at the head
SINGLE_LIST_ENTRY* PopEntryList(& head); // remove from the head
SINGLE_LIST_ENTRY* ExInterlockedPushEntryList(& head, & entry, & spinlock); // returns: old head before insertion
SINGLE_LIST_ENTRY* ExInterlockedPopEntryList(& head, & spinlock); // returns: removed from the head or NULL
Can be called at any IRQL.
Note that if the list is accessed at DIRQL, code must not access it by locking spinlock at DISPATCH_LEVEL.
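The non-interlocked push and pop operations amount to a few pointer assignments on a head whose Next field points at the first element. The sketch below is a portable user-mode re-implementation for illustration only (same names as the WDK routines, but written for this example, without any locking).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the WDK type: a single forward link. */
typedef struct _SINGLE_LIST_ENTRY {
    struct _SINGLE_LIST_ENTRY *Next;
} SINGLE_LIST_ENTRY;

/* Insert entry at the head of the list. */
static void PushEntryList(SINGLE_LIST_ENTRY *head, SINGLE_LIST_ENTRY *entry)
{
    assert(head != NULL && entry != NULL);
    entry->Next = head->Next;
    head->Next = entry;
}

/* Remove and return the first entry, or NULL if the list is empty. */
static SINGLE_LIST_ENTRY *PopEntryList(SINGLE_LIST_ENTRY *head)
{
    SINGLE_LIST_ENTRY *first;

    assert(head != NULL);
    first = head->Next;
    if (first != NULL)
        head->Next = first->Next;
    return first;
}
```

Because both operations work on the head, the list behaves as a LIFO stack: the entry pushed last is popped first. An empty list is simply a head whose Next is NULL.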
Doubly-linked list:
LIST_ENTRY head;
LIST_ENTRY entry;
entry.Flink, entry.Blink
VOID InitializeListHead(& head);
VOID InsertHeadList(& head, & entry); // insert at the front of list
VOID InsertTailList(& head, & entry); // insert at the tail of list
LIST_ENTRY* RemoveHeadList(& head); // remove entry from the front of the list
LIST_ENTRY* RemoveTailList(& head); // remove entry from the tail of the list
BOOLEAN RemoveEntryList(& entry); // TRUE if the list is empty after the removal, FALSE otherwise
BOOLEAN IsListEmpty(& head); // TRUE if the list has no entries
VOID AppendTailList(& head, LIST_ENTRY* ListToAppend);
LIST_ENTRY* ExInterlockedInsertHeadList(& head, & entry, & spinlock); // insert at the head of the list
LIST_ENTRY* ExInterlockedInsertTailList(& head, & entry, & spinlock); // insert at the tail of the list
LIST_ENTRY* ExInterlockedRemoveHeadList(& head, & spinlock); // empty -> returns "&head"
Can be called at any IRQL.
Note that if the list is accessed at DIRQL, code must not access it by locking
spinlock at DISPATCH_LEVEL.
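A doubly-linked list is circular: an empty head points to itself, and entries are embedded in the records they link. The following user-mode sketch re-implements a few of the non-interlocked routines to show the mechanics. The bodies are written for this example rather than taken from the WDK, and WORK_ITEM and `drain_queue` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink;  /* forward link */
    struct _LIST_ENTRY *Blink;  /* backward link */
} LIST_ENTRY;

#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

/* An empty list is a head that points to itself in both directions. */
static void InitializeListHead(LIST_ENTRY *head)
{
    head->Flink = head->Blink = head;
}

static bool IsListEmpty(const LIST_ENTRY *head)
{
    return head->Flink == head;
}

static void InsertTailList(LIST_ENTRY *head, LIST_ENTRY *entry)
{
    LIST_ENTRY *last = head->Blink;

    entry->Flink = head;
    entry->Blink = last;
    last->Flink = entry;
    head->Blink = entry;
}

/* This sketch requires a non-empty list; callers check IsListEmpty first. */
static LIST_ENTRY *RemoveHeadList(LIST_ENTRY *head)
{
    LIST_ENTRY *first = head->Flink;
    LIST_ENTRY *next = first->Flink;

    assert(!IsListEmpty(head));
    head->Flink = next;
    next->Blink = head;
    return first;
}

/* Hypothetical record with an embedded link. */
typedef struct {
    int        id;
    LIST_ENTRY link;
} WORK_ITEM;

/* Drain the queue in FIFO order, folding the ids into a decimal number. */
static int drain_queue(LIST_ENTRY *head)
{
    int digits = 0;

    while (!IsListEmpty(head)) {
        WORK_ITEM *item = CONTAINING_RECORD(RemoveHeadList(head), WORK_ITEM, link);
        digits = digits * 10 + item->id;
    }
    return digits;
}
```

Inserting at the tail and removing from the head gives queue (FIFO) behavior; CONTAINING_RECORD turns each removed link back into its record.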
S-Lists (sequenced, singly linked list):
More efficient for atomic operations than singly-linked list.
Usable at <= DISPATCH_LEVEL only.
SLIST_HEADER head;
SLIST_ENTRY entry;
entry.Next
head and entry must be 16-byte aligned, use DECLSPEC_ALIGN(MEMORY_ALLOCATION_ALIGNMENT)
VOID ExInitializeSListHead(& head);
SLIST_ENTRY* ExInterlockedPushEntrySList(& head, & entry, & spinlock); // returns previous first entry or NULL if was empty
SLIST_ENTRY* ExInterlockedPopEntrySList(& head, & spinlock); // if was empty, return NULL
SLIST_ENTRY* ExInterlockedFlushSList(& head); // return previous first element (or NULL) and resets the head.Next to empty
USHORT ExQueryDepthSList(& head); // current number of elements in the list
Can be called at any IRQL.
Note that if the list is accessed at DIRQL, code must not access it by locking
spinlock at DISPATCH_LEVEL.
###
InterlockedXxx at any IRQL
DPC:
Some KDPC fields:
PKDEFERRED_ROUTINE DeferredRoutine;
PVOID DeferredContext;
PVOID SystemArgument1;
PVOID SystemArgument2;
volatile USHORT Number; // target CPU
UCHAR Importance;
UCHAR Type; // DPC_NORMAL or DPC_THREADED
typedef VOID KDEFERRED_ROUTINE (KDPC *Dpc, PVOID DeferredContext, PVOID SystemArgument1, PVOID SystemArgument2);
VOID KeInitializeDpc(KDPC* Dpc, PKDEFERRED_ROUTINE DeferredRoutine, PVOID DeferredContext)
VOID KeSetImportanceDpc(KDPC* Dpc, LowImportance / MediumImportance (default) / HighImportance)
VOID KeSetTargetProcessorDpc(KDPC* Dpc, CCHAR Number); // default is the current processor for KeInsertQueueDpc
BOOLEAN KeInsertQueueDpc(KDPC* Dpc, PVOID SystemArgument1, PVOID SystemArgument2) // if already queued, do nothing and return FALSE
BOOLEAN KeRemoveQueueDpc(KDPC* Dpc); // TRUE if was on the queue and was successfully removed before started to run
VOID KeFlushQueuedDpcs(); // wait until all currently queued DPCs on all CPUs complete; may take a long time
OK to call KeInsertQueueDpc from the DPC routine for this DPC.
The same DPC can run in parallel on multiple CPUs, no implicit interlocking.
If LowImportance & queue is short & request rate is high, DPC interrupt
is delayed until either of two conditions goes away.
Note: multiple requests to queue a particular DPC (while DPC
is already queued) will result in only a single invocation of DPC routine.
In drivers that can have multiple requests outstanding on the device this can result
in interrupt notifications being lost if the driver relies on passing
information only via arg1/arg2 or Irp/Context to convey information from ISR to
DPC routine. Should rather use flags in device extension.
Single DPC object can be active at multiple processors at the same time. While DPC is executing on CPU1, ISR on CPU2 can queue the same DPC object again and it will start executing on CPU2. Should use spinlocks to synchronize against this case.
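The "flags in device extension" advice can be sketched in user-mode C with C11 atomics: the ISR ORs event bits into a shared mask, and the DPC atomically takes the whole mask, so coalesced DPC requests do not lose events. All names here (DEVICE_EXT, the EVT_* bits, `isr_note_event`, `dpc_process`) are hypothetical. In a driver the equivalent operations would presumably be InterlockedOr and InterlockedExchange on a field of the device extension; this sketch covers only the lost-notification problem, not serialization of the DPC body across processors.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical event bits a device might report. */
#define EVT_RX_DONE 0x1u
#define EVT_TX_DONE 0x2u
#define EVT_ERROR   0x4u

/* Stand-in for a device extension. */
typedef struct {
    atomic_uint pending_events; /* bits set by the ISR, consumed by the DPC */
    unsigned    dpc_runs;       /* how many times the DPC body ran */
    unsigned    handled;        /* union of event bits processed so far */
} DEVICE_EXT;

/* ISR side: record the event. Queuing the DPC afterwards may be a no-op
 * if the DPC is already queued, which is why the bits must accumulate. */
static void isr_note_event(DEVICE_EXT *ext, unsigned event_bits)
{
    assert(ext != NULL);
    atomic_fetch_or(&ext->pending_events, event_bits);
}

/* DPC side: atomically take everything accumulated since the last run. */
static void dpc_process(DEVICE_EXT *ext)
{
    unsigned events;

    assert(ext != NULL);
    events = atomic_exchange(&ext->pending_events, 0u);
    ext->dpc_runs++;
    ext->handled |= events;
}
```

Two interrupts that arrive before the DPC runs produce a single DPC invocation that still sees both event bits.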
Threaded DPC structure: RKDPC ≡ KDPC.
Threaded DPCs by default execute at PASSIVE_LEVEL unless HKLM\System\CCS\Control\SessionManager\Kernel\ThreadDpcEnable
is set to zero, then as regular DPCs.
Threaded DPC can be preempted by ordinary DPCs but not by other threads.
Threaded DPC's deferred routine must be designed to execute equally well both
at PASSIVE_LEVEL and DISPATCH_LEVEL, for example must not try to enter wait
states, such as wait for kernel event objects.
VOID KeInitializeThreadedDpc(KDPC*
Dpc, PKDEFERRED_ROUTINE DeferredRoutine, PVOID DeferredContext)
VOID IoInitializeDpcRequest(DEVICE_OBJECT* DeviceObject, IO_DPC_ROUTINE* DpcForIsr) // at any IRQL level
VOID IoRequestDpc(DEVICE_OBJECT* DeviceObject, IRP* Irp, VOID* Context) // at DIRQL, from ISR; KeInsertQueueDpc on DPC object embedded in DeviceObject
IO_DPC_ROUTINE DpcForIsr;
VOID DpcForIsr(KDPC* Dpc, DEVICE_OBJECT* DeviceObject, IRP* Irp, VOID* Context) // arguments from IoRequestDpc
typical DpcForIsr:
if StartIo does not reset cancel routine to NULL, handle it here
Irp->IoStatus.Status = ...
Irp->IoStatus.Information = ...
IoStartNextPacket(DeviceObject, TRUE)
IoCompleteRequest(Irp, IO_NO_INCREMENT)
if using system queueing, there is only one request, so do not need to synchronize across DpcForIsr invocations
Delays:
NTSTATUS KeDelayExecutionThread(
KPROCESSOR_MODE WaitMode, // KernelMode or UserMode (if UserMode, kernel stack can be swapped out)
BOOLEAN Alertable,
LARGE_INTEGER* Interval); // positive = abstime, negative = interval (100 ns units)
STATUS_SUCCESS
STATUS_ALERTED
STATUS_USER_APC
other condition codes
VOID KeStallExecutionProcessor(ULONG MicroSeconds); // busy-wait
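The sign and unit convention of Interval is easy to get wrong, so here is a minimal sketch of the arithmetic in plain C. The helper names are hypothetical; in a driver the resulting value would be stored in the QuadPart member of the LARGE_INTEGER passed to KeDelayExecutionThread.

```c
#include <stdint.h>

/* Number of 100 ns units in one millisecond. */
#define HUNDRED_NS_PER_MS 10000LL

/* Build a relative interval: negative value, magnitude in 100 ns units. */
static int64_t relative_interval_from_ms(int64_t milliseconds)
{
    return -(milliseconds * HUNDRED_NS_PER_MS);
}

/* Negative means "relative to now", positive means absolute time. */
static int interval_is_relative(int64_t interval)
{
    return interval < 0;
}
```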
Time:
VOID KeQuerySystemTime(LARGE_INTEGER* CurrentTime) // any IRQL, return is GMT time zone, in 100ns units, subject to adjustments
VOID ExSystemTimeToLocalTime(const LARGE_INTEGER* SystemTime, LARGE_INTEGER* LocalTime) // GMT to local, any IRQL
VOID ExLocalTimeToSystemTime(const LARGE_INTEGER* LocalTime, LARGE_INTEGER* SystemTime) // local to GMT, any IRQL
VOID KeQueryTickCount(LARGE_INTEGER* TickCount) // ticks since startup, any IRQL
ULONG KeQueryTimeIncrement() // tick size in 100ns units
LARGE_INTEGER KeQueryPerformanceCounter(OPTIONAL LARGE_INTEGER* PerformanceFrequency)
// return performance counter and its frequency (per second), any IRQL,
// finest grain timing available in the system, but costly so do not call often
// or can distort the results and hamper system performance
ULONGLONG KeQueryInterruptTime()
// current interrupt time in 100 ns units, any IRQL
ULONGLONG KeQueryUnbiasedInterruptTime()
// ...
Amount of time since the system last started (on checked builds: +49 days), incremented at each tick.
Not affected by system time adjustments.
Finer grained measurement than KeQueryTickCount.
Much less overhead than KeQueryPerformanceCounter.
During power-state sleep the clock is not running, but on wakeup an estimated "bias" is calculated and added.
KeQueryInterruptTime returns the value with this bias accounted for; KeQueryUnbiasedInterruptTime returns the raw value without the sleep-bias time added.
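Relating KeQueryTickCount to KeQueryTimeIncrement is a single multiplication, sketched below in plain C. The function names are hypothetical, and the tick length used in the usage note (156250, i.e. 15.625 ms) is only an example value, not something the notes above state.

```c
#include <stdint.h>

/* Elapsed time in 100 ns units = tick count * length of one tick (100 ns units). */
static uint64_t ticks_to_100ns(uint64_t tick_count, uint32_t time_increment)
{
    return tick_count * (uint64_t)time_increment;
}

/* Same elapsed time expressed in whole milliseconds. */
static uint64_t ticks_to_ms(uint64_t tick_count, uint32_t time_increment)
{
    return ticks_to_100ns(tick_count, time_increment) / 10000u;
}
```

With the example increment of 156250, 64 ticks correspond to 10,000,000 units of 100 ns, i.e. one second.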
Memory allocation:
POOL_TYPE:
NonPagedPool allocated and freed at IRQL <= DISPATCH_LEVEL
PagedPool allocated and accessed only at IRQL < DISPATCH_LEVEL
NonPagedPoolCacheAligned aligned on processor cache line
PagedPoolCacheAligned
Extra flag: POOL_COLD_ALLOCATION = use memory that will be paged out quickly
PVOID ExAllocatePool(POOL_TYPE PoolType, SIZE_T NumberOfBytes)
PVOID ExAllocatePoolWithTag(POOL_TYPE PoolType, SIZE_T NumberOfBytes, ULONG Tag)
PVOID ExAllocatePoolWithQuotaTag(POOL_QUOTA_FAIL_INSTEAD_OF_RAISE | POOL_TYPE PoolType, SIZE_T NumberOfBytes, ULONG Tag)
PVOID ExAllocatePoolWithTagPriority(POOL_TYPE PoolType, SIZE_T NumberOfBytes, ULONG Tag, Low/Normal/HighPoolPriority)
VOID ExFreePool(PVOID p)
VOID ExFreePoolWithTag(PVOID P, ULONG Tag)
Blocks of size >= PAGE_SIZE are page-aligned (last page beyond block limit can contain other block).
Blocks of size <= PAGE_SIZE do not cross page boundary and are aligned to 8-byte (on x86) and 16-byte (on x64).
STATUS_INSUFFICIENT_RESOURCES
Pool tag is 4 chars starting from low byte, e.g. '1gaT'.
WinDbg can display block tag.
Gflags tool can make certain tag allocatable from special pool.
Poolmon tracks memory usage by pool tag.
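The byte order of a pool tag can be made explicit with a short plain-C sketch. The helper names are hypothetical; the point is only that the first character of the readable tag goes into the low byte, which is why the literal is written reversed as '1gaT' (the value of a multi-character literal is implementation-defined, so the sketch builds the value by shifting instead).

```c
#include <stdint.h>

/* Pack four characters so that the first one lands in the low byte. */
static uint32_t make_pool_tag(char c0, char c1, char c2, char c3)
{
    return  (uint32_t)(unsigned char)c0
         | ((uint32_t)(unsigned char)c1 << 8)
         | ((uint32_t)(unsigned char)c2 << 16)
         | ((uint32_t)(unsigned char)c3 << 24);
}

/* Recover the readable form: low byte first, NUL-terminated. */
static void pool_tag_to_text(uint32_t tag, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((tag >> (8 * i)) & 0xFFu);
    out[4] = '\0';
}
```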
ExInitializeNPagedLookasideList
ExInitializePagedLookasideList
ExDeleteNPagedLookasideList
ExDeletePagedLookasideList
ExAllocateFromNPagedLookasideList
ExAllocateFromPagedLookasideList
ExFreeToNPagedLookasideList
ExFreeToPagedLookasideList
Common mistake: releasing the memory that contains an active lookaside list.
When you create a memory lookaside list by calling ExInitialize[N]PagedLookasideList, the system places your lookaside list object on an internal queue that it traverses every so often in order to adjust the list depth based on recent usage. Be sure to make the matching call to ExDelete[N]PagedLookasideList before allowing the list object to pass out of scope.
Memory probing:
Check access to user-mode address (do not if Irp->RequestorMode == KernelMode).
VOID ProbeForRead(PVOID Address, SIZE_T Length, ULONG Alignment) // only validates that the VA range lies in the user-mode range, not that it is valid
VOID ProbeForWrite(PVOID Address, SIZE_T Length, ULONG Alignment) // also tries to access the first byte in each page to verify that these are valid addresses
__try
{
ProbeForWrite(Buffer, BufferSize, BufferAlignment);
… access buffer … // should be done inside __try/__except: a malicious application could have another thread manipulating the address space
}
__except (GetExceptionCode() == STATUS_DATATYPE_MISALIGNMENT ||
GetExceptionCode() == STATUS_ACCESS_VIOLATION ?
EXCEPTION_EXECUTE_HANDLER : EXCEPTION_CONTINUE_SEARCH)
{
/* Error handling code */
...
}
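The range and alignment validation attributed to ProbeForRead above can be modeled in user mode as follows. This is a sketch under stated assumptions, not the kernel's code: model_probe_range and its result codes are hypothetical, the user_limit parameter stands in for the boundary of the user-mode address range, and treating a zero length as "nothing to check" is an assumption of the model. Instead of raising an exception, the model returns which exception would be raised.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    MODEL_PROBE_OK,
    MODEL_PROBE_MISALIGNED,  /* would raise STATUS_DATATYPE_MISALIGNMENT */
    MODEL_PROBE_ACCESS       /* would raise STATUS_ACCESS_VIOLATION */
} model_probe_result;

/* alignment must be a power of two (1, 2, 4, 8, ...). */
static model_probe_result model_probe_range(uintptr_t address, size_t length,
                                            uint32_t alignment, uintptr_t user_limit)
{
    uintptr_t end;

    if (length == 0)
        return MODEL_PROBE_OK;          /* model assumption: empty range is not checked */

    if ((address & ((uintptr_t)alignment - 1u)) != 0)
        return MODEL_PROBE_MISALIGNED;

    end = address + length;
    if (end < address)                  /* address + length wrapped around */
        return MODEL_PROBE_ACCESS;
    if (end > user_limit)               /* range reaches past user space */
        return MODEL_PROBE_ACCESS;

    return MODEL_PROBE_OK;
}
```

Passing this check says nothing about whether the pages are actually mapped, which is why the buffer access itself still belongs inside the exception handler.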
VOID MmProbeAndLockPages(MDL* MemoryDescriptorList, KernelMode / UserMode, IoReadAccess / IoWriteAccess / IoModifyAccess)
PVOID MmGetSystemAddressForMdlSafe(PMDL Mdl, Low/Normal/HighPagePriority)
VOID MmUnlockPages(MDL* MemoryDescriptorList)
Kernel stack:
On x86, 12K
On x64, 24K
VOID IoGetStackLimits(ULONG_PTR* LowLimit, ULONG_PTR* HighLimit)
ULONG_PTR IoGetRemainingStackSize() // bytes remaining
PVOID IoGetInitialStack() // base address
Objects and object names:
SysInternals WinObj
\KnownDlls\*.dll
\KnownDlls32\*.dll
\BaseNamedObjects\* – mutexes, events, sections etc.
\ArcName – symlinks for partitions
\GLOBAL?? – symlinks
\GLOBAL??\COM1 -> \Device\ProlificSerial0
\GLOBAL??\D: -> \Device\HarddiskVolume4
\Device\...
(DosDevices appears to be gone in Win7)
\DosDevices\C:\Directory\File
\DosDevices\Global\X:\Directory\File
\DosDevices\COM1
UNICODE_STRING ustr; // most Rtl string routines should be called at low IRQL
VOID RtlInitUnicodeString(& ustr, PCWSTR SourceString)
NTSTATUS RtlAnsiStringToUnicodeString(& ustr, PCANSI_STRING SourceString, BOOLEAN AllocateDestinationString)
VOID RtlFreeUnicodeString(PUNICODE_STRING UnicodeString)
OBJECT_ATTRIBUTES oa;
InitializeObjectAttributes(&oa, & ustr, OBJ_KERNEL_HANDLE, NULL /*root directory for relative naming*/, NULL /*security descriptor*/);
OBJ_KERNEL_HANDLE handle inaccessible to user mode
OBJ_FORCE_ACCESS_CHECK perform access check even if being opened from kernel mode
OBJ_INHERIT handle should be inherited by child processes
OBJ_PERMANENT if object is named in the Object Manager, do not delete it when last handle is closed (created with extra reference)
OBJ_EXCLUSIVE only one handle can be opened for this object
OBJ_CASE_INSENSITIVE use case-insensitive comparison during name matching
OBJ_OPENIF in create request, if an object with the same name already exists, then open this object
NTSTATUS ObReferenceObjectByHandle( // validate object handle access and, if access can be granted, return object pointer and increment reference count
_In_ HANDLE Handle,
_In_ ACCESS_MASK DesiredAccess, // e.g. a combo of GENERIC_ALL, STANDARD_RIGHTS_ALL, SYNCHRONIZE, DELETE
_In_opt_ POBJECT_TYPE ObjectType, // if not NULL, system verifies that object is of required type
_In_ KPROCESSOR_MODE AccessMode, // UserMode for handles received from user mode, otherwise typically KernelMode
_Out_ PVOID* Object, // pointer to receive object address
_Out_opt_ POBJECT_HANDLE_INFORMATION HandleInformation) // drivers usually set this to NULL
ObjectType can be *ExEventObjectType (KEVENT), *ExSemaphoreObjectType (KSEMAPHORE), *IoFileObjectType (FILE_OBJECT), *PsProcessType (EPROCESS or KPROCESS), *PsThreadType (ETHREAD or KTHREAD), *SeTokenObjectType (ACCESS_TOKEN), *TmEnlistmentObjectType (KENLISTMENT), *TmResourceManagerObjectType (KRESOURCEMANAGER), *TmTransactionManagerObjectType (KTM), *TmTransactionObjectType (KTRANSACTION)
NTSTATUS ObReferenceObjectByPointer( // validate access to the object and, if access can be granted, increment reference count
_In_ PVOID Object,
_In_ ACCESS_MASK DesiredAccess,
_In_opt_ POBJECT_TYPE ObjectType,
_In_ KPROCESSOR_MODE AccessMode)
VOID ObReferenceObject(PVOID Object) // increment reference count with no access checks
VOID ObDereferenceObject(PVOID Object) // decrement reference count, and if zero, delete it unless OBJ_PERMANENT was specified at creation
VOID ObDereferenceObjectDeferDelete(PVOID Object) // use instead of ObDereferenceObject when the latter can cause a deadlock (especially for KTM objects),
// deletion is performed later on a worker thread and at PASSIVE_LEVEL
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff557728%28v=vs.85%29.aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff557758%28v=vs.85%29.aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff557755%28v=vs.85%29.aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff563705%28v=vs.85%29.aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff558678%28v=vs.85%29.aspx
Threads:
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff564633%28v=vs.85%29.aspx
### http://www-user.tu-chemnitz.de/~heha/oney_wdm/ch09e.htm
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff559932%28v=vs.85%29.aspx
KTHREAD* KeGetCurrentThread()
ETHREAD* PsGetCurrentThread()
// ETHREAD is an extension of KTHREAD, can cast between the two
KTHREAD* kthread = NULL;
OBJECT_ATTRIBUTES oa;
InitializeObjectAttributes(&oa, NULL, OBJ_KERNEL_HANDLE, NULL, NULL);
NTSTATUS PsCreateSystemThread(
HANDLE* pThreadHandle,
ULONG DesiredAccess, // THREAD_ALL_ACCESS
OBJECT_ATTRIBUTES* ObjectAttributes, // must set OBJ_KERNEL_HANDLE
HANDLE ProcessHandle, // if NULL, will use system process
CLIENT_ID* ClientId, // NULL for driver-created thread
PKSTART_ROUTINE StartRoutine, // VOID ThreadStartRoutine(PVOID StartContext)
VOID* StartContext)
if (! NT_SUCCESS(status)) …
ObReferenceObjectByHandle(ThreadHandle, THREAD_ALL_ACCESS, *PsThreadType, KernelMode, (PVOID*) &kthread, NULL);
ZwClose(ThreadHandle);
. . . .
KeWaitForSingleObject(kthread, Executive, KernelMode, FALSE, NULL);
ObDereferenceObject(kthread);
http://www-user.tu-chemnitz.de/~heha/oney_wdm/ch09e.htm
VOID PsTerminateSystemThread(NTSTATUS status)
Termination handler for thread: PsSetCreateThreadNotifyRoutine.
Called before cancelling IRPs, the latter is done by IoCancelIrp for each
routine i.e. still may be calls in the context of the thread
(but can mark thread as "being deleted", so do not have to
re-create/re-delete context on each entry).
### what can do with handle?
PsLookupProcessByProcessId, PsLookupThreadByThreadId
### PsGetThreadId
### ZwQueryInformationThread
### IoCreateSystemThread
LONG KeSetBasePriorityThread(PKTHREAD Thread, LONG Increment)
KPRIORITY KeSetPriorityThread(PKTHREAD Thread, KPRIORITY Priority)
### get priority
Work items:
The kernel maintains three queues for work items:
· Delayed work queue. Items in this queue are processed by a system worker thread that has a variable, dynamic thread priority. Drivers should use this queue.
· Critical work queue. Processed by a system worker thread at a higher thread priority than the items in the delayed work queue.
· Hypercritical work queue. Processed by a system worker thread at a higher priority than items in the critical work queue. This work queue is reserved for use by the operating system and must not be used by drivers.
Items are executed at PASSIVE_LEVEL.
System makes sure not to unload the driver while the callback is running.
### ExInitializeWorkItem, ExQueueWorkItem (watch out for driver unloading)
### IoAllocateWorkItem, IoQueueWorkItem, IoFreeWorkItem
### IoAcquireRemoveLock before IoCallDriver
### IoReleaseRemoveLock in completion routine
### see http://www.wd-3.com/archive/WalkPlank.htm
Rundown protection:
Reference count based.
To avoid suspending run down for long, desirable to hold run-down protection
within a critical or guarded region or at APC_LEVEL.
EX_RUNDOWN_REF rr; // PEX_RUNDOWN_REF is the pointer type taken by the routines below
VOID ExInitializeRundownProtection(& rr);
BOOLEAN ExAcquireRundownProtection(& rr);
BOOLEAN ExAcquireRundownProtectionEx(& rr, ULONG Count);
VOID ExReleaseRundownProtection(& rr);
VOID ExReleaseRundownProtectionEx(& rr, ULONG Count);
VOID ExWaitForRundownProtectionRelease(& rr); // wait till all protection holders release it, no further acquires are granted
VOID ExRundownCompleted(& rr); // called after ExWaitForRundownProtectionRelease; all subsequent calls to ExWaitForRundownProtectionRelease complete immediately
VOID ExReInitializeRundownProtection(& rr); // reinitialize structure for reuse after ExRundownCompleted, so acquires are possible again
Debugging:
VOID DbgBreakPoint()
VOID DbgBreakPointWithStatus(ULONG Status)
VOID KdBreakPoint() // same as DbgBreakPoint if compiled for debug (#if DBG), no-op if for release
VOID KdBreakPointWithStatus(ULONG Status)
ULONG DbgPrint(PCHAR Format, arguments) = DbgPrintEx(DPFLTR_DEFAULT_ID, DPFLTR_INFO_LEVEL, Format, arguments)
ULONG DbgPrintEx(ULONG ComponentId, ULONG Level, PCSTR Format, arguments)
Send output to debugger if passes filtering.
Can use any printf formats, but Unicode formats can be used only at PASSIVE_LEVEL (this includes %C, %S, %lc, %ls, %wc, %ws, and %wZ).
Other formats can be used up to DIRQL.
Output limited to 512 bytes. Total output limited to DbgPrint buffer (4 KB on free build, 32 KB on checked build, alterable with KDbgCtrl tool).
ComponentId = DPFLTR_xxx_ID, e.g. DPFLTR_DEFAULT_ID
Level = DPFLTR_ERROR_LEVEL, DPFLTR_WARNING_LEVEL, DPFLTR_TRACE_LEVEL, DPFLTR_INFO_LEVEL
When compiled for debug KdPrint = DbgPrint, KdPrintEx = DbgPrintEx.
When compiled for release, no-op.
ULONG vDbgPrintEx(ULONG ComponentId, ULONG Level, PCCH Format, va_list arglist)
ULONG vDbgPrintExWithPrefix(PCCH Prefix, ULONG ComponentId, ULONG Level, PCCH Format, va_list arglist)
ULONG DbgPrompt(PCCH Prompt, PCHAR Response, ULONG MaximumResponseLength) // returns response size, incl. terminating newline character; or zero if no response
BOOLEAN KdRefreshDebuggerNotPresent()
if (KdRefreshDebuggerNotPresent() == FALSE)
{
// A kernel debugger is active.
DbgPrint("A problem occurred\n");
DbgBreakPoint();
}
else
{
// No kernel debugger attached, or kernel debugging not enabled.
KeBugCheckEx(...);
}
Bugchecks:
VOID KeBugCheck(ULONG BugCheckCode) // e.g. FILE_SYSTEM or DRIVER_VIOLATION
VOID KeBugCheckEx(ULONG BugCheckCode, ULONG_PTR param1, ULONG_PTR param2, ULONG_PTR param3, ULONG_PTR param4)
KeInitializeCallbackRecord
KeRegisterBugCheckCallback
KeDeregisterBugCheckCallback
KeRegisterBugCheckReasonCallback // may use for panic data, via BugCheckSecondaryDumpDataCallback, may display with !bugdata command
KeDeregisterBugCheckReasonCallback
Logging:
IO_ERROR_LOG_PACKET
PVOID IoAllocateErrorLogEntry(
// callable at <= DISPATCH_LEVEL, may return NULL
PVOID IoObject, // DEVICE_OBJECT or DRIVER_OBJECT
UCHAR EntrySize) // sizeof(IO_ERROR_LOG_PACKET) + size of the DumpData member + combined size of any driver-supplied insertion strings
Use example here: http://www.osronline.com/showThread.cfm?link=95878, but make sure not to overflow ERROR_LOG_MAXIMUM_SIZE with insertion strings.
VOID IoWriteErrorLogEntry(PVOID entry) // entry allocated with IoAllocateErrorLogEntry, automatically freed after recording
VOID IoFreeErrorLogEntry(PVOID entry) // entry allocated with IoAllocateErrorLogEntry and not passed yet to IoWriteErrorLogEntry
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff554312%28v=vs.85%29.aspx
Devices:
DEVICE_OBJECT
CSHORT Type;
USHORT Size;
LONG ReferenceCount; // open handles count
DRIVER_OBJECT* DriverObject;
DEVICE_OBJECT* NextDevice; // in driver devices list
DEVICE_OBJECT* AttachedDevice; // up the stack
IRP* CurrentIrp; // used if driver has StartIO routine
PIO_TIMER Timer; // timer object, see IoInitializeTimer
ULONG Flags; // DO_xxx
ULONG Characteristics; // FILE_xxx
volatile VPB* Vpb; // volume parameters block
VOID* DeviceExtension; // extension requested by driver in IoCreateDevice
DEVICE_TYPE DeviceType; // FILE_DEVICE_xxx or custom
CCHAR StackSize; // required number of stack locations in IRP
union { // used internally by IO Manager to manipulate the device
LIST_ENTRY ListEntry;
WAIT_CONTEXT_BLOCK Wcb;
} Queue;
ULONG AlignmentRequirement; // FILE_XXX_ALIGNMENT
KDEVICE_QUEUE DeviceQueue; // IRPs
KDPC Dpc; // DPC for the device
SECURITY_DESCRIPTOR* SecurityDescriptor;
KEVENT DeviceLock; // used during mount or mount verification
USHORT SectorSize; // volume sector size
DO_BUFFERED_IO
DO_DIRECT_IO
DO_DEVICE_INITIALIZING // cleared in AddDevice routine
DO_EXCLUSIVE
DO_POWER_INRUSH
DO_POWER_PAGABLE
DO_VERIFY_VOLUME
FILE_AUTOGENERATED_DEVICE_NAME
FILE_CHARACTERISTIC_PNP_DEVICE
FILE_DEVICE_IS_MOUNTED
FILE_DEVICE_SECURE_OPEN
FILE_READ_ONLY_DEVICE
FILE_REMOTE_DEVICE
FILE_REMOVABLE_MEDIA
FILE_VIRTUAL_VOLUME
FILE_WRITE_ONCE_MEDIA
NTSTATUS IoCreateDevice(
DRIVER_OBJECT* DriverObject, // from DriverEntry
ULONG DeviceExtensionSize, // in bytes
UNICODE_STRING* DeviceName OPTIONAL, // zero-terminated unicode string, typically \Device\mydevname, but can be any
DEVICE_TYPE DeviceType, // FILE_DEVICE_XXX, e.g. FILE_DEVICE_DISK, FILE_DEVICE_DISK_FILE_SYSTEM etc. or FILE_DEVICE_UNKNOWN or 32768...65535
ULONG DeviceCharacteristics, // FILE_DEVICE_SECURE_OPEN, FILE_REMOVABLE_MEDIA, FILE_READ_ONLY_DEVICE, FILE_WRITE_ONCE_MEDIA etc.
BOOLEAN Exclusive, // exclusive access (only one handle can be opened on the device), normally FALSE, otherwise requires specifying in the INF file
DEVICE_OBJECT** DeviceObject); // created device object
Some returned codes:
STATUS_SUCCESS
STATUS_INSUFFICIENT_RESOURCES
STATUS_OBJECT_NAME_EXISTS
STATUS_OBJECT_NAME_COLLISION
Legacy (non-PnP) drivers call IoCreateDevice from DriverEntry.
PnP drivers usually call IoCreateDevice from AddDevice routine, but they also might create control devices in DriverEntry.
Physical device objects (PDOs) are named.
Functional device objects (FDOs) and filter DOs are not named.
Callable at <= APC_LEVEL.
Extension address is stored in DeviceObject->DeviceExtension.
May want to use FILE_DEVICE_SECURE_OPEN characteristic in the call to IoCreateDevice; this directs IO Manager to perform security checks against the device object for all open requests. If FILE_DEVICE_SECURE_OPEN is set, IO Manager applies the security descriptor of the device object to any relative opens or trailing-filename opens. For example, if FILE_DEVICE_SECURE_OPEN is set for \Device\foo and if \Device\foo can only be opened by the administrator, then \Device\foo\bar can also be opened only by the administrator.
Set DeviceObject->StackSize, unless default value is fine:
· IoCreateDevice sets it to 1
· Lowest-level driver can ignore the field
· When a higher-level driver attaches itself to lower level driver, IoAttachDeviceToDeviceStack sets StackSize to that of lower-level device plus 1.
Set DeviceObject->AlignmentRequirement, unless default value is fine:
· IoCreateDevice sets it to processor's data cacheline size minus 1
· Lowest-level device must set it to max(old value, device required alignment - 1), can use symbols FILE_BYTE_ALIGNMENT ... FILE_512_BYTE_ALIGNMENT
· Higher-level driver that layers itself over lower-level driver must copy the value of lower-level device; this is what is done automatically by IoAttachDevice and IoAttachDeviceToDeviceStack
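The StackSize and AlignmentRequirement rules above reduce to a few lines of arithmetic. The following user-mode model is a sketch only: model_device and the model_* functions are hypothetical stand-ins, not the DEVICE_OBJECT structure or the IoAttach* routines.

```c
#include <stdint.h>

typedef struct {
    int8_t   stack_size;            /* models DeviceObject->StackSize */
    uint32_t alignment_requirement; /* models AlignmentRequirement: a mask, alignment - 1 */
} model_device;

/* Lowest-level device: keep the larger of the current mask and the mask
   the hardware needs, i.e. max(old value, required alignment - 1). */
static void model_set_hw_alignment(model_device *dev, uint32_t required_alignment_bytes)
{
    uint32_t mask = required_alignment_bytes - 1u;
    if (mask > dev->alignment_requirement)
        dev->alignment_requirement = mask;
}

/* Attaching on top of a lower device: one more IRP stack location,
   and the lower device's alignment is copied upward. */
static void model_attach(model_device *higher, const model_device *lower)
{
    higher->stack_size = (int8_t)(lower->stack_size + 1);
    higher->alignment_requirement = lower->alignment_requirement;
}
```

For example, a lower device that starts with a mask of 0x3F and needs 512-byte alignment ends up with 511, and a filter attached above it gets StackSize 2 and the same mask.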
File system drivers may also set DeviceObject->SectorSize to volume's sector size. If the device object represents a volume, SectorSize specifies the volume's sector size, in bytes. The I/O manager uses SectorSize to make sure that all read operations, write operations, and set file position operations that are issued are aligned correctly when intermediate buffering is disabled. A default system bytes-per-sector value is used when the device object is created; however, file system drivers and, more rarely, legacy and minifilter drivers can update this value based on the geometry of the underlying volume hardware when a mount occurs.
Define flags in the device extension to track certain PnP states of the device, such as device being paused, removed or surprise-removed. For example, a flag may indicate that incoming IRPs should be held while the device is in a paused state. Create a queue for holding IRPs if the driver does not already have a mechanism for queueing IRPs (see "Queueing and Dequeueing IRPs"). Also allocate IO_REMOVE_LOCK in the device extension and call IoInitializeRemoveLock (see "Using Remove Locks").
Set the DO_BUFFERED_IO or DO_DIRECT_IO flag bit in the device object to specify the type of buffering that the I/O manager is to use for I/O requests that are sent to the device stack. Medium-level drivers typically use modes of the lower-level driver.
if (FlagOn( lowerDeviceObject->Flags, DO_BUFFERED_IO ))
SetFlag( myDeviceObject->Flags, DO_BUFFERED_IO );
if (FlagOn( lowerDeviceObject->Flags, DO_DIRECT_IO ))
SetFlag(myDeviceObject->Flags, DO_DIRECT_IO );
or: myDeviceObject->Flags |= lowerDeviceObject->Flags & (DO_BUFFERED_IO | DO_DIRECT_IO)
Exception: file system drivers use Neither method (however they still do set AlignmentRequirement and SectorSize as described above).
Set power-management flags such as DO_POWER_INRUSH or DO_POWER_PAGABLE if needed.
Create and initialize required spinlocks, event objects, buffers etc. stored in or pointed to from the FDO/DO device extension.
Call IoInitializeDpcRequest and IoInitializeTimer.
Function and filter drivers must clear DO_DEVICE_INITIALIZING flag: ClearFlag(FunctionalDeviceObject->Flags, DO_DEVICE_INITIALIZING).
While DO_DEVICE_INITIALIZING is set, IO Manager will not let other components open the device via functions such as CreateFile, OpenFile or IoGetDeviceObjectPointer.
IO Manager will clear DO_DEVICE_INITIALIZING for devices created inside DriverEntry.
For devices created inside any other routine, such as AddDevice, the driver is responsible for clearing DO_DEVICE_INITIALIZING.
IO Manager will allow devices created inside DriverEntry to be opened as soon as DriverEntry returns.
For devices created by PnP drivers inside AddDevice, the driver must complete IRP_MN_START_DEVICE first before IO Manager allows the device to be opened. Driver may also need to handle certain requests from drivers up the stack (such as port settings) before it receives IRP_MN_START_DEVICE, but driver does not receive any create or file-based requests until after IRP_MN_START_DEVICE.
NTSTATUS IoCreateSymbolicLink(UNICODE_STRING* SymbolicLinkName, UNICODE_STRING* DeviceName)
NTSTATUS IoDeleteSymbolicLink(UNICODE_STRING* SymbolicLinkName)
Callable at PASSIVE_LEVEL.
WDM drivers do not name device objects. Instead a WDM driver should call IoRegisterDeviceInterface to set up a symbolic link.
Place Win32-visible names in name directory "??" (or "GLOBAL??").
When Win32 processes a call such as CreateFile, it prefixes name with "\??\" or "\DosDevices".
"\DosDevices" is a symbolic link to "\??".
Device is also openable from user-level as "\\.\MyDevice" (double the number of backslashes for C string) -> "\DosDevices\MyDevice".
To create session-local link name, use "??" or "\DosDevices".
To create global link name, use "GLOBAL??" or "\DosDevices\Global".
Links can also be defined with Win32 function DefineDosDevice.
RtlInitUnicodeString(& linkName, L"\\??\\MyDevice")
status = IoCreateSymbolicLink(& linkName, & deviceName)
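The mapping from the user-level name "\\.\MyDevice" to the object-manager name "\??\MyDevice" described above can be sketched as a plain string rewrite. This is an illustration only: model_win32_to_nt_device_path is a hypothetical helper, it handles just the "\\.\" prefix, and the real Win32 name translation covers many more cases.

```c
#include <stddef.h>
#include <string.h>

/* Rewrite a Win32 device path such as \\.\MyDevice as \??\MyDevice.
   Returns 0 on success, -1 if the prefix is missing or the buffer is too small. */
static int model_win32_to_nt_device_path(const char *win32_path, char *out, size_t out_size)
{
    static const char win32_prefix[] = "\\\\.\\"; /* the four characters \\.\ */
    static const char nt_prefix[]    = "\\??\\";  /* the four characters \??\ */
    const size_t win32_len = sizeof(win32_prefix) - 1;
    const size_t nt_len    = sizeof(nt_prefix) - 1;
    size_t rest_len;

    if (strncmp(win32_path, win32_prefix, win32_len) != 0)
        return -1;

    rest_len = strlen(win32_path + win32_len);
    if (nt_len + rest_len + 1 > out_size)
        return -1;

    memcpy(out, nt_prefix, nt_len);
    memcpy(out + nt_len, win32_path + win32_len, rest_len + 1); /* copies the NUL too */
    return 0;
}
```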
VOID IoDeleteDevice(DEVICE_OBJECT* DeviceObject)
Called at <= APC_LEVEL.
PnP driver calls it when handling IRP_MN_REMOVE_DEVICE.
Legacy driver calls it when being unloaded.
Driver must first:
· release external references stored in device object (or extension), such as pointers to other device objects or to interrupt objects
· if external device (such as next-lower device) was linked via IoGetDeviceObjectPointer, release it with ObDereferenceObject
· if external device was attached with IoAttachDevice or IoAttachDeviceToDeviceStack, detach it with IoDetachDevice
If there are no outstanding references to device object, it is deleted immediately; otherwise marked "delete pending" and deleted when the last reference is released.
NTSTATUS IoGetDeviceObjectPointer( // must be called at PASSIVE_LEVEL
PUNICODE_STRING ObjectName, // name of device object
ACCESS_MASK DesiredAccess, // FILE_READ_DATA, FILE_READ_ATTRIBUTES, FILE_WRITE_DATA, FILE_WRITE_ATTRIBUTES, FILE_ALL_ACCESS, STANDARD_RIGHTS_ALL etc.
PFILE_OBJECT* FileObject, // returns file object
PDEVICE_OBJECT* DeviceObject) // returns device object
Specifying FILE_READ_ATTRIBUTES does not cause the file system of the target device to be mounted. Must specify FILE_READ_DATA or FILE_WRITE_ATTRIBUTES for this.
Sends Create request to target device.
If request fails, either object does not exist or access cannot be granted.
If request succeeds, a file object is created, which results in incrementing the reference count on the device object.
IO Manager then increments reference count on the file object by 1 and sends Close request to the device.
When device is no longer needed (for a
layered driver that would be on unload, for example) call ObDereferenceObject(FileObject),
which also causes DeviceObject to be indirectly dereferenced. ObDereferenceObject(FileObject)
must be called before IoDetachDevice(DeviceObject).
If want to close FileObject but to keep DeviceObject, do ObReferenceObject(DeviceObject) first.
CONTROLLER_OBJECT* IoCreateController(ULONG ExtensionSize)
VOID IoDeleteController(CONTROLLER_OBJECT* ControllerObject)
VOID IoAllocateController(...) // allocate controller to the device for the duration of IO operation
VOID IoFreeController(...) // release controller previously allocated to the device for the duration of IO operation
NTSTATUS IoAttachDevice(DEVICE_OBJECT* HigherDevice, UNICODE_STRING* LowerDeviceName, DEVICE_OBJECT** AttachedDevice)
Attach higher-device over the top of existing chain above lower-device (so requests sent to lower-device are routed to higher-device).
Called at PASSIVE_LEVEL.
Must be called before any device is layered on top of higher-device.
Higher-device is layered on top of existing chain above lower-device, and previous top of the chain is returned to *AttachedDevice.
AttachedDevice must be global memory location (such as in higher-device device extension), perhaps cannot be caller's stack.
Sets higher-device AlignmentRequirement and StackSize from lower-device (i.e. attached device, that was previous top of chain).
For file system drivers and drivers in the storage stack, opens the target device with FILE_READ_ATTRIBUTES and then calls IoGetRelatedDeviceObject. This does not cause a file system to be mounted. Thus, a successful call to IoAttachDevice returns the device object of the storage driver, not that of the file system driver.
DEVICE_OBJECT* IoAttachDeviceToDeviceStack(DEVICE_OBJECT* HigherDevice, DEVICE_OBJECT* LowerDevice)
Attach higher-device over the top of existing chain above lower-device.
Called at <= DISPATCH_LEVEL.
Must be called before any device is layered on top of higher-device.
Returns device object that was previously top of the chain above lower-device.
On failure (e.g. lower device driver being unloaded) returns NULL.
Sets higher-device AlignmentRequirement and StackSize.
NTSTATUS IoAttachDeviceToDeviceStackSafe(DEVICE_OBJECT* HigherDevice, DEVICE_OBJECT* LowerDevice, DEVICE_OBJECT** AttachedToDeviceObject)
Similar to IoAttachDeviceToDeviceStack.
Called at <= DISPATCH_LEVEL.
AttachedToDeviceObject must be in HigherDevice extension and be
NULL at input.
AttachedToDeviceObject is updated while IO database lock is still held.
This avoids race condition whereby higher-device may receive an IRP before its AttachedToDeviceObject is updated.
VOID IoDetachDevice(DEVICE_OBJECT* LowerDevice)
Called at PASSIVE_LEVEL.
Detaches lower-device from upchain.
Decrements reference count on lower-device. If reference count goes to zero and driver had been marked for unload, driver is unloaded.
Remove locks are used to prevent device objects being detached from the device stack or deleted.
The remove lock routines provide a way to track the number of outstanding I/O operations on a device, and to determine when it is safe to detach and delete a driver's device object.
A driver can use this mechanism for two purposes:
1) To ensure that the driver's DispatchPnP routine will not complete an IRP_MN_REMOVE_DEVICE request while the lock is held (for example, while another driver routine is accessing the device).
2) To count the number of reasons why the driver should not delete its device object, and to set an event when that count goes to zero.
Driver must call IoAcquireRemoveLock each time
it starts an IO operation.
Driver must call IoReleaseRemoveLock each time it finishes an IO
operation.
Driver should also call IoAcquireRemoveLock each
time it passes out a reference to its code (for timers, DPCs and so on).
The driver must then call IoReleaseRemoveLock when the reference is
returned.
In the dispatch code for IRP_MN_REMOVE_DEVICE, the
driver must acquire the lock once more and then call IoReleaseRemoveLockAndWait.
This routine does not return until all outstanding acquisitions of the lock
have been released.
After IoReleaseRemoveLockAndWait returns, the driver should consider the
device to be in a state in which it is ready to be removed and cannot perform
I/O operations.
To allow queued I/O operations to complete, each driver should call IoReleaseRemoveLockAndWait
before it passes the IRP_MN_REMOVE_DEVICE request to the next-lower driver, and
before it releases memory, calls IoDetachDevice, or calls IoDeleteDevice.
After IoReleaseRemoveLockAndWait has been called for a particular remove
lock, all subsequent calls to IoAcquireRemoveLock for the same remove
lock will fail.
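The counting rules of the remove lock described above can be followed in a toy, single-threaded user-mode model. Everything here is an assumption made for illustration: the model_* names and status values are hypothetical, there is no real waiting or synchronization, and model_release_and_wait merely reports whether the real routine would have been able to return immediately.

```c
#include <stdbool.h>

#define MODEL_STATUS_SUCCESS        0
#define MODEL_STATUS_DELETE_PENDING 1 /* stand-in value, not the real NTSTATUS */

typedef struct {
    long io_count; /* outstanding acquisitions */
    bool removed;  /* set once release-and-wait has been called */
} model_remove_lock;

static void model_init_remove_lock(model_remove_lock *lock)
{
    lock->io_count = 0;
    lock->removed = false;
}

/* Models IoAcquireRemoveLock: fails once removal has started. */
static int model_acquire(model_remove_lock *lock)
{
    if (lock->removed)
        return MODEL_STATUS_DELETE_PENDING;
    lock->io_count++;
    return MODEL_STATUS_SUCCESS;
}

/* Models IoReleaseRemoveLock. */
static void model_release(model_remove_lock *lock)
{
    lock->io_count--;
}

/* Models IoReleaseRemoveLockAndWait: drops the caller's own acquisition
   and refuses new ones. The real routine blocks until the count reaches
   zero; the model returns true if no other acquisitions are outstanding. */
static bool model_release_and_wait(model_remove_lock *lock)
{
    lock->removed = true;
    lock->io_count--;
    return lock->io_count == 0;
}
```

In the model, a remove handler that acquires the lock while one I/O is still in flight sees release-and-wait report "not drained", and every later acquire fails with the delete-pending status.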
IO_REMOVE_LOCK rmvlock; // typically held in device object extension
VOID IoInitializeRemoveLock(& rmvlock, // typically called in AddDevice
ULONG AllocateTag, // 4-character tag in reverse order, similar to ExAllocatePoolWithTag
ULONG MaxLockedMinutes, // maximum number of minutes the lock should be held (0 = no limit), on exceeding crashes
ULONG HighWatermark) // maximum number of outstanding acquisitions, 0 = no maximum; must be <= 0x7FFFFFFF
NTSTATUS IoAcquireRemoveLock(& rmvlock, OPTIONAL VOID* Tag)
Callable at <= DISPATCH_LEVEL.
Returns STATUS_SUCCESS, or STATUS_DELETE_PENDING if the driver has called IoReleaseRemoveLockAndWait.
VOID IoReleaseRemoveLock(& rmvlock, VOID* Tag)
Callable at <= DISPATCH_LEVEL.
Tag should match one supplied to IoAcquireRemoveLock.
For I/O operations (including power and PnP IRPs) that set an IoCompletion
routine, a driver should call IoReleaseRemoveLock in the IoCompletion
routine, after calling IoCompleteRequest.
For I/O operations that do not set an IoCompletion routine, a driver should call IoReleaseRemoveLock after passing the current IRP to the next-lower driver, but before exiting the dispatch routine.
VOID IoReleaseRemoveLockAndWait(& rmvlock, VOID* Tag)
Callable at PASSIVE_LEVEL.
A driver typically calls this routine in its dispatch code for an
IRP_MN_REMOVE_DEVICE request or in the Unload routine. To allow queued
I/O requests to complete, each driver should call IoReleaseRemoveLockAndWait
before it passes the remove IRP to the next-lower driver, and before it
releases memory, calls IoDetachDevice, or calls IoDeleteDevice.
### IoGetRelatedDeviceObject
### ObDereferenceObject
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff547807%28v=vs.85%29.aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff543153(v=vs.85).aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff542862(v=vs.85).aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff547807(v=vs.85).aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff559925(v=vs.85).aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff544156%28v=vs.85%29.aspx
IRP:
IRP main body:
MdlAddress – Memory Descriptor List (MDL) describing the requestor’s buffer when the driver uses Direct IO
Flags – IRP_PAGING_IO = paging request
AssociatedIrp.MasterIrp – master IRP for this child IRP (overlays AssociatedIrp.SystemBuffer as union)
AssociatedIrp.SystemBuffer – in case of Buffered IO, SVA in non-paged pool for the buffer
IoStatus.Status – completion status of the IO operation
RequestorMode – Kernel or User
Cancel – when cancellation is requested, IO Manager sets Cancel to TRUE. CancelRoutine is set by the driver and is called by IO Manager at DISPATCH_LEVEL, but it should return at CancelIrql level (IRQL at which cancel spinlock was acquired)
UserBuffer – requestor’s VA of the buffer
Tail.Overlay.DeviceQueueEntry – link in per-device queue, used by IO Manager when System Queueing is used
Tail.Overlay.ListEntry – when a driver owns this IRP, it can use this field to link IRPs
I/O stack location:
MajorFunction
MinorFunction
Flags
Control
Parameters
DeviceObject – pointer to DEVICE_OBJECT that is the target of the request
FileObject – pointer to FILE_OBJECT associated with the request
CompletionRoutine
Context – context for the CompletionRoutine
IO method is chosen and set in device Flags when DEVICE_OBJECT
is created and applies to all read and write requests (but not IOCTL requests).
When driver is intermediate driver layered on top of another driver, must use
the same method as the driver below.
Direct IO: Requestor’s buffer is locked in memory and described by MDL. Some
drivers (e.g. network) may use MDL chain.
Buffered IO: Data is copied via intermediate non-paged buffer in system
space.
Neither IO: Driver is provided with the requestor’s virtual address of
the buffer.
For Direct IO:
MmGetSystemAddressForMdl() – map buffer described by MDL into SVA and return
mapped address. On first call invokes MmMapLockedPages to map the
buffer.
MmGetMdlVirtualAddress() – get requestor’s virtual address in
requestor’s space.
MmGetMdlByteCount() – get byte count for the MDL.
MmGetMdlByteOffset() – get buffer byte offset into the first page.
IoMapTransfer() – get logical (bus) address and size for DMA.
For Buffered IO:
buffer address (SVA) = AssociatedIrp.SystemBuffer
buffer length = Parameters.Read.Length or Parameters.Write.Length
Note that a driver written for Buffered IO can easily be switched to the Direct IO (MDL) interface: call MmGetSystemAddressForMdl() and use the returned SVA in place of AssociatedIrp.SystemBuffer.
For Neither IO:
requestor’s buffer address = UserBuffer (unvalidated !!), usable only in the context of requestor’s thread
If Irp->RequestorMode == UserMode, must at a minimum
validate that address range <= MM_USER_PROBE_ADDRESS and that length does
not cause buffer to wrap.
Should also attempt to access each page.
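The range check described above can be sketched in portable C. This is only an illustration of the arithmetic: USER_PROBE_LIMIT is a hypothetical stand-in for MM_USER_PROBE_ADDRESS (its value here is arbitrary), and user_range_ok is an illustrative helper, not a kernel routine. A real driver would normally rely on ProbeForRead / ProbeForWrite inside a __try/__except block rather than a hand-written check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MM_USER_PROBE_ADDRESS; the value is illustrative only. */
#define USER_PROBE_LIMIT ((uintptr_t)0x7FFF0000u)

/* Returns true when [base, base + length) neither wraps around the address
 * space nor extends past the user-mode limit. A zero-length range is accepted
 * because there is nothing to touch. */
static bool user_range_ok(uintptr_t base, size_t length)
{
    if (length == 0)
        return true;
    if (base > USER_PROBE_LIMIT)
        return false;
    /* Compare against the remaining room instead of computing base + length,
     * so the check itself cannot overflow. */
    return length <= (size_t)(USER_PROBE_LIMIT - base);
}
```

Comparing the length with the remaining room, rather than adding it to the base, is what keeps a huge length from wrapping past zero and slipping through.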
IO_STACK_LOCATION* IoStack = IoGetCurrentIrpStackLocation(IRP* Irp);
IRP_MJ_CREATE | Create new file object. Issued e.g. by CreateFile(). Parameters: SecurityContext = IO_SECURITY_CONTEXT*, Options = ULONG, FileAttributes = USHORT, ShareAccess = USHORT, EaLength = ULONG.
IRP_MJ_READ | Read, issued e.g. by ReadFile().
IRP_MJ_WRITE | Write, issued e.g. by WriteFile().
IRP_MJ_DEVICE_CONTROL | Issued by DeviceIoControl().
IRP_MJ_INTERNAL_DEVICE_CONTROL | IOCTL issued by another driver; not available from user level.
IRP_MJ_QUERY_INFORMATION, IRP_MJ_LOCK_CONTROL, IRP_MJ_QUERY_EA, IRP_MJ_QUERY_QUOTA, IRP_MJ_QUERY_SECURITY, IRP_MJ_QUERY_VOLUME_INFORMATION, IRP_MJ_SET_EA, IRP_MJ_SET_QUOTA, IRP_MJ_SET_SECURITY, IRP_MJ_SET_VOLUME_INFORMATION | See MSDN.
MinorFunction: usually IRP_MN_NORMAL, but can be IRP_MN_COMPRESSED etc.
IOCTL code is composed as CTL_CODE(DeviceType, Function,
Method, Access)
DeviceType: for standard devices FILE_DEVICE_DISK, FILE_DEVICE_TAPE, for
custom devices in range 32768…65535.
Also specified in IoCreateDevice(), but does not have to be the same value.
Method: METHOD_BUFFERED (in and out are both buffered), METHOD_NEITHER
(in and out are neither),
METHOD_IN_DIRECT or METHOD_OUT_DIRECT (in both cases in is buffered and out
is direct IO, however
METHOD_IN_DIRECT checks out buffer for read access and
METHOD_OUT_DIRECT checks it for write access)
Function: custom function range is 2048…4095
Access: access on handle: FILE_ANY_ACCESS, FILE_READ_ACCESS,
FILE_WRITE_ACCESS
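A short sketch of how the four fields are packed into one 32-bit IOCTL code. The CTL_CODE macro and the METHOD_xxx / FILE_xxx_ACCESS values are re-declared locally so the example builds without the WDK. MY_DEVICE_TYPE, MY_FUNC_PING, MY_FUNC_READ, the two IOCTL_MY_xxx codes and the ioctl_xxx extractor functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the WDK values so that the sketch builds without the WDK. */
#define METHOD_BUFFERED    0u
#define METHOD_IN_DIRECT   1u
#define METHOD_OUT_DIRECT  2u
#define METHOD_NEITHER     3u
#define FILE_ANY_ACCESS    0u
#define FILE_READ_ACCESS   1u
#define FILE_WRITE_ACCESS  2u

/* Bit layout: DeviceType in bits 31..16, Access in 15..14,
 * Function in 13..2, Method in 1..0. */
#define CTL_CODE(DeviceType, Function, Method, Access) \
    (((uint32_t)(DeviceType) << 16) | ((uint32_t)(Access) << 14) | \
     ((uint32_t)(Function) << 2) | (uint32_t)(Method))

/* Hypothetical device type and function numbers taken from the custom ranges. */
#define MY_DEVICE_TYPE 0x8000u   /* 32768, first custom device type */
#define MY_FUNC_PING   0x800u    /* 2048, first custom function code */
#define MY_FUNC_READ   0x801u

#define IOCTL_MY_PING CTL_CODE(MY_DEVICE_TYPE, MY_FUNC_PING, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_MY_READ CTL_CODE(MY_DEVICE_TYPE, MY_FUNC_READ, METHOD_OUT_DIRECT, FILE_READ_ACCESS)

/* Illustrative field extractors (not WDK names). */
static uint32_t ioctl_device_type(uint32_t code) { return code >> 16; }
static uint32_t ioctl_access(uint32_t code)      { return (code >> 14) & 0x3u; }
static uint32_t ioctl_function(uint32_t code)    { return (code >> 2) & 0xFFFu; }
static uint32_t ioctl_method(uint32_t code)      { return code & 0x3u; }
```

A dispatch routine typically switches on the whole code; the extractors are mainly useful when logging or when deciding which buffer fields of the IRP are meaningful for a given transfer method.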
| METHOD_BUFFERED | METHOD_IN_DIRECT | METHOD_OUT_DIRECT | METHOD_NEITHER
InBuffer uses | buffered IO | buffered IO | buffered IO | requestor’s VA
InBuffer (if not NULL) | SVA in Irp->AssociatedIrp.SystemBuffer | SVA in Irp->AssociatedIrp.SystemBuffer | SVA in Irp->AssociatedIrp.SystemBuffer | requestor’s VA in
InBuffer length (bytes) | Parameters.DeviceIoControl.InputBufferLength of current I/O Stack Location (all methods)
OutBuffer uses | buffered IO | direct IO | direct IO | requestor’s VA
OutBuffer (if not NULL) | SVA in Irp-> | MDL pointed by Irp->MdlAddress | MDL pointed by Irp->MdlAddress | requestor’s VA
OutBuffer length (bytes) | Parameters.DeviceIoControl.OutputBufferLength of current I/O Stack Location (all methods)
Note that IO Manager considers 0 to be a valid buffer length for an IO
operation.
When requestor's buffer length is 0, IO Manager does not allocate MDL
(for Direct I/O drivers) or system buffer (for Buffered I/O drivers).
Driver must always check for zero transfer size before attempting to access MDL
or system buffer.
Layered (stacked) drivers:
Example (stack of device objects):
(unnamed) FAT on \Device\HardDisk0\Partition2
\Device\HardDisk0\Partition2
\Device\ScsiPort0
Two methods: (1) driver A can call driver B's dispatch routine, (2) use a private interface, e.g. by exchanging private call vectors and other private structures – the latter can be done by passing them via IoCallDriver() or by linking against a same-mode kernel DLL that aids in the exchange.
IO_STACK_LOCATION:
UCHAR MajorFunction
UCHAR MinorFunction
UCHAR Flags
UCHAR Control // SL_PENDING_RETURNED, SL_ERROR_RETURNED, SL_INVOKE_ON_CANCEL, SL_INVOKE_ON_SUCCESS, SL_INVOKE_ON_ERROR
DEVICE_OBJECT* DeviceObject // for completion routine
FILE_OBJECT* FileObject
IO_COMPLETION_ROUTINE CompletionRoutine
VOID* Context // for completion routine
union Parameters (request-specific):
    Create, Read, Write, QueryDirectory, NotifyDirectory, QueryFile, SetFile, QueryEa, SetEa, QueryVolume, SetVolume, FileSystemControl, LockControl, DeviceIoControl, QuerySecurity, SetSecurity, MountVolume, VerifyVolume, Scsi, QueryQuota, SetQuota, QueryDeviceRelations, QueryInterface, DeviceCapabilities, FilterResourceRequirements, ReadWriteConfig, SetLock, QueryId, QueryDeviceText, UsageNotification, WaitWake, PowerSequence, Power, StartDevice, WMI, Others
    Others:
        PVOID Argument1
        PVOID Argument2
        PVOID Argument3
        PVOID Argument4
IO_STACK_LOCATION* IoGetCurrentIrpStackLocation(IRP* Irp)
IO_STACK_LOCATION* IoGetNextIrpStackLocation(IRP* Irp)
Used to set up parameters for call to lower-level driver.
VOID IoSetNextIrpStackLocation(IRP* Irp)
Seldom used routine.
Used by a driver if it needs an extra location for
passing context information that cannot be passed in the context for a
completion routine.
Remember to increase requested stack depth in IoAllocateIrp or IoMakeAssociatedIrp.
Cannot be used with IRPs built by IoBuildDeviceIoControlRequest, IoBuildAsynchronousFsdRequest, IoBuildSynchronousFsdRequest.
VOID IoCopyCurrentIrpStackLocationToNext(IRP* irp)
Copy current IO_STACK_LOCATION parameters to next-down slot.
After calling this routine, driver typically calls IoSetCompletionRoutine(Ex)
before calling IoCallDriver.
If the driver passes down its parameters but does not set a completion routine, it is
better to call IoSkipCurrentIrpStackLocation instead.
VOID IoSkipCurrentIrpStackLocation(IRP* irp)
Modifies IO_STACK_LOCATION pointer so that next-lower
driver receives current slot.
Can only be used if current level does not utilize completion routine.
Must not be used if the driver has called or intends to call IoMarkIrpPending
before passing the IRP to the next-lower driver.
In those cases use IoCopyCurrentIrpStackLocationToNext.
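The difference between copying and skipping can be illustrated with a small user-mode model of the IRP stack pointer. Everything here is a simplified sketch: SimIrp, SimStackLocation and the sim_xxx functions are hypothetical stand-ins for the IRP, IO_STACK_LOCATION and the corresponding Io routines, and request parameters are reduced to a single field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_STACK_DEPTH 3

typedef int (*SimCompletion)(void *context);

typedef struct {
    int major_function;        /* request parameters, reduced to one field */
    SimCompletion completion;  /* routine registered by the driver above */
    void *context;
} SimStackLocation;

typedef struct {
    SimStackLocation stack[SIM_STACK_DEPTH];
    int current;   /* index of the current location; lower drivers use lower indices */
} SimIrp;

static void sim_init(SimIrp *irp)
{
    memset(irp, 0, sizeof *irp);
    irp->current = SIM_STACK_DEPTH;   /* no driver has received the IRP yet */
}

static SimStackLocation *sim_get_current(SimIrp *irp) { return &irp->stack[irp->current]; }
static SimStackLocation *sim_get_next(SimIrp *irp)    { return &irp->stack[irp->current - 1]; }

/* Models IoCopyCurrentIrpStackLocationToNext: parameters are copied,
 * the completion routine registration is not carried over. */
static void sim_copy_current_to_next(SimIrp *irp)
{
    SimStackLocation *next = sim_get_next(irp);
    next->major_function = sim_get_current(irp)->major_function;
    next->completion = NULL;
    next->context = NULL;
}

/* Models IoSkipCurrentIrpStackLocation: step the pointer back up so that the
 * push done by the next call hands the same slot to the lower driver. */
static void sim_skip_current(SimIrp *irp) { irp->current++; }

/* Models the push that IoCallDriver performs before invoking the dispatch routine. */
static void sim_call_driver(SimIrp *irp) { irp->current--; }

/* Returns 1 when both behaviours are observed as described. */
static int sim_skip_vs_copy_demo(void)
{
    SimIrp irp;
    sim_init(&irp);
    sim_call_driver(&irp);                         /* top driver receives the IRP */
    SimStackLocation *top = sim_get_current(&irp);
    top->major_function = 3;

    sim_skip_current(&irp);
    sim_call_driver(&irp);
    int same_slot = (sim_get_current(&irp) == top);   /* skip: the slot is shared */

    sim_copy_current_to_next(&irp);
    sim_call_driver(&irp);
    SimStackLocation *lower = sim_get_current(&irp);
    int new_slot = (lower != top) && lower->major_function == 3
                   && lower->completion == NULL;      /* copy: fresh slot, same parameters */
    return same_slot && new_slot;
}
```

Because a skipped slot is shared with the lower driver, anything the upper driver stored there (a completion routine, the pending flag) would be seen as belonging to the lower driver, which is why skipping is restricted to the pure pass-through case.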
NTSTATUS IoCallDriver(DEVICE_OBJECT* DeviceObject, IRP* Irp)
Called at <= DISPATCH_LEVEL, but mostly at PASSIVE_LEVEL (as target driver's dispatch routine expects).
Before calling must set next stack location.
In Windows 2003 and earlier, power IRPs should be passed down with PoCallDriver,
rather than IoCallDriver.
Calls dispatch routine of lower-level driver.
Before the dispatch routine is invoked, the IO stack in the IRP is pushed and the "Next" location becomes
"Current".
Unless the driver set up IoSetCompletionRoutine(Ex),
it can no longer access the IRP.
If it did set up completion routine, when the latter is invoked, it sees IO
status block in IRP as set by next-below driver, and all lower-level IO stack
locations zeroed out.
Can return STATUS_PENDING, or other success/error status codes.
If lower-level driver returns STATUS_PENDING, higher-level driver should not call IoCompleteRequest for the IRP.
One exception to this: The higher-level driver can use an event to synchronize between its IoCompletion routine and its dispatch routine, in which case the IoCompletion routine signals the event and returns STATUS_MORE_PROCESSING_REQUIRED. The dispatch routine waits for the event and then calls IoCompleteRequest to complete the IRP.
Note that IO Manager runs Attached list only when processing Create request (such as during IoGetDeviceObjectPointer that uses Create request). No redirection occurs when IoCallDriver is called.
IRP* IoAllocateIrp(CCHAR StackSize, BOOLEAN ChargeQuota)
Called at <= DISPATCH_LEVEL.
ChargeQuota means "charge quota to current process".
On failure returns NULL.
IoAllocateIrp does not associate IRP with a thread; allocating driver
must use completion routine (IoSetCompletionRoutine(Ex)) either to free
IRP (IoFreeIrp) or reuse it (IoReuseIrp) instead of completing it
back to IO Manager (i.e. completion routine should return
STATUS_MORE_PROCESSING_REQUIRED). Driver should also free all other resources
it set up for the operation, such as MDLs built with IoBuildPartialMdl.
Also:
IRP* IoMakeAssociatedIrp(...)
IRP* IoBuildDeviceIoControlRequest(...)
IRP* IoBuildAsynchronousFsdRequest(...)
IRP* IoBuildSynchronousFsdRequest(...)
IRP* IoMakeAssociatedIrp(IRP* Irp, CCHAR StackSize)
Allocate and initialize an IRP associated with a master IRP, which allows the driver to split the original request and send associated IRPs to lower-level drivers.
Only a highest-level driver (such as file system
driver) can call IoMakeAssociatedIrp.
IO Manager completes master IRP automatically when lower drivers have completed
all associated IRPs as long as the caller has not set its IoCompletion
routine in associated IRP and returned STATUS_MORE_PROCESSING_REQUIRED from
that routine. In the latter case the caller must explicitly complete the master
IRP when the driver determines that all associated IRPs have completed.
Only the master IRP is linked with a thread; associated IRPs are not. For this
reason when master thread exits, IO Manager does not call Cancel
routines for associated IRPs. When the master IRP's thread exits, the I/O manager
calls the master IRP's Cancel routine. This Cancel routine is
responsible for tracking down all associated IRPs and calling IoCancelIrp
to cancel them.
VOID IoMarkIrpPending(IRP* Irp)
Unless driver dispatch routine completes IRP, it must
mark it pending with IoMarkIrpPending and return STATUS_PENDING.
Otherwise IO Manager attempts to complete IRP as soon as dispatch routine
returns.
After calling IoMarkIrpPending, dispatch routine must return STATUS_PENDING even if it or some other routine it calls completes the request (with IoCompleteRequest) before the dispatch routine (that did IoMarkIrpPending) returns.
If driver queues the IRP, it must call IoMarkIrpPending before queuing it (otherwise it might get dequeued, completed and released before IoMarkIrpPending gets called).
If the driver sets an IoCompletion routine and then
passes the IRP to a lower driver, the IoCompletion routine must check the flag
Irp->PendingReturned.
If it is set, the IoCompletion routine must call IoMarkIrpPending(Irp).
A driver that passes down the IRP and then waits on an event should not
mark the IRP pending.
Instead, its IoCompletion routine should signal the event and return
STATUS_MORE_PROCESSING_REQUIRED.
VOID IoStartPacket(DEVICE_OBJECT* DeviceObject, IRP* Irp, OPTIONAL ULONG* Key, OPTIONAL DRIVER_CANCEL CancelFunction)
Call StartIo (if device is not busy) or insert IRP into device queue (if device is busy).
Callable at <= DISPATCH_LEVEL.
If CancelFunction != NULL, calls IoSetCancelRoutine.
If Key is specified, inserted into the queue according to the value,
otherwise at the end.
Used by drivers that utilize system queueing.
Note that such drivers can have only one IRP in progress at a time.
VOID IoStartNextPacket(DEVICE_OBJECT* DeviceObject, BOOLEAN Cancelable)
Dequeue the next IRP and, if available, call StartIo.
If Cancelable is TRUE, uses the cancel spinlock to protect the device queue and the current IRP.
If CancelFunction in IoStartPacket was not NULL, must set Cancelable to TRUE.
Callable at DISPATCH_LEVEL.
DeviceObject.Busy is set TRUE right before StartIo is called and is cleared any time IoStartNextPacket is called and there are no pending IRPs in the device queue.
Warning: if StartIo itself can call IoStartXXX, this can lead to a recursion and stack overflow (in case of deep recursive invocations).
In this case should call IoSetStartIoAttributes(DeviceObject, TRUE, ...) to set deferred StartIo invocation mode. This setting ensures that the next packet won't be issued until the previous StartIo call returns.
VOID IoStartNextPacketByKey(DEVICE_OBJECT* DeviceObject, BOOLEAN Cancelable, ULONG Key)
Similar to IoStartNextPacket.
Callable at <= DISPATCH_LEVEL.
Sort Key defines which IRP to remove.
VOID IoSetStartIoAttributes(DEVICE_OBJECT* DeviceObject, BOOLEAN DeferredStartIo, BOOLEAN NonCancelable)
If DeferredStartIo is TRUE, IO Manager will
defer any call to the driver's StartIo while the driver is already inside
it.
Default is FALSE.
If NonCancelable is TRUE, IRP cannot be
cancelled once it had been dequeued by IoStartNextPacket.
Default is FALSE.
Drivers that use NonCancelable = FALSE must synchronize their IRP
handling with the cancel spinlock.
Higher-level driver does:
set up next stack location, such as with IoCopyCurrentIrpStackLocationToNext
optionally IoSetCompletionRoutine(Ex)
return IoCallDriver
Lower-level driver does:
if checks fail, complete immediately with error status
(IoCompleteRequest, return status)
IoMarkIrpPending
for system-managed queue and StartIo
call IoStartPacket
for driver-managed IRP queue
if driver does not have a StartIo, but handles cancellable IRPs, it must either register a Cancel routine (with IoSetCancelRoutine) or implement a cancel-safe IRP queue (IoCsqInsertIrp)
return STATUS_PENDING
NTSTATUS IoRegisterShutdownNotification(DEVICE_OBJECT* DeviceObject)
NTSTATUS IoRegisterLastChanceShutdownNotification(DEVICE_OBJECT* DeviceObject)
VOID IoUnregisterShutdownNotification(DEVICE_OBJECT* DeviceObject)
Register/unregister the driver to receive the IRP_MJ_SHUTDOWN notification IRP.
Callable at PASSIVE_LEVEL.
"LastChance" notification is delivered after file systems are flushed.
Driver must declare a DispatchShutdown routine.
IRP completion:
When the driver finishes processing a request or cancels it, the driver calls IoCompleteRequest(Irp, PriorityBoost).
IO Manager checks the IRP to see whether any higher-level drivers have set up an IoCompletion routine for it.
If so, IoCompletion routines are called in turn from the bottom of the stack upwards,
until all have been executed or one returns STATUS_MORE_PROCESSING_REQUIRED.
VOID IoCompleteRequest(IRP* Irp, CCHAR PriorityBoost)
Callable at <= DISPATCH_LEVEL.
Caller must not hold a spinlock, or deadlock can result.
Always release all spinlocks before calling IoCompleteRequest.
Before completing, set Irp->IoStatus.Status (= STATUS_xxx) and Irp->IoStatus.Information
(= usually number of transferred bytes).
Must not access the IRP after calling IoCompleteRequest (it is gone), not even for Irp->IoStatus.Status.
check(Irp->CancelRoutine == NULL)
check(Irp->IoStatus.Status != STATUS_PENDING)
foreach (stack location bottom to top)
{
    Irp->PendingReturned = iostk->Control & SL_PENDING_RETURNED
    if (must invoke completion routine due to IoStatus.Status and Control flags)
    {
        zero stack location
        if (STATUS_MORE_PROCESSING_REQUIRED == iostk->CompletionRoutine(iostk->DeviceObject, Irp, iostk->Context))
            return
    }
    else
    {
        if (Irp->PendingReturned)
            IoMarkIrpPending(Irp)
        zero stack location
    }
}
if (Irp is an associated IRP)
{
    decrement refcnt on masterIrp
    if (Irp->MdlAddress != NULL)
    {
        IoFreeMdl
        and all other MDLs in the chain
    }
    IoFreeIrp(Irp)
    if (masterIrp refcnt went down to zero)
        IoCompleteRequest(masterIrp, ...)
    return
}
special handling for STATUS_REPARSE
{
    in particular, may save Irp->Tail.Overlay.AuxiliaryBuffer
}
if (Irp->Tail.Overlay.AuxiliaryBuffer != NULL)
{
    ExFreePool it
    pointer = NULL
}
if (is close operation)
{
    KeSetEvent(Irp->UserEvent, ...)
    return
}
if (special handling for paging IO)
{
    may IoFreeIrp
    may fire APC to thread
}
if (Irp->MdlAddress != NULL)
{
    unlock pages in it
    and all other MDLs in the chain
}
special handling for STATUS_REPARSE
{
    in particular, may restore Irp->Tail.Overlay.AuxiliaryBuffer
}
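The first part of this walk (calling completion routines bottom to top and stopping on STATUS_MORE_PROCESSING_REQUIRED) can be modelled in user-mode C. This is a simplified sketch, not the IO Manager's code: SimIrp, SimStackLocation, SimLog, the SIM_STATUS_xxx values and all sim_xxx functions are hypothetical, index 0 stands for the lowest driver, and the invoke-on-success/error/cancel flags are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values, not the real NTSTATUS codes. */
#define SIM_STATUS_SUCCESS                  0
#define SIM_STATUS_MORE_PROCESSING_REQUIRED 1
#define SIM_MAX_DEPTH 4

struct SimIrp;
typedef int (*SimCompletion)(struct SimIrp *irp, void *context);

typedef struct {
    SimCompletion completion;   /* registered by the driver one level above */
    void *context;
    bool pending_returned;      /* models SL_PENDING_RETURNED */
} SimStackLocation;

typedef struct SimIrp {
    SimStackLocation stack[SIM_MAX_DEPTH];  /* index 0 = lowest driver */
    int depth;                  /* number of locations in use */
    int current;                /* location of the driver now completing the IRP */
    bool pending_returned;
    bool completed;             /* set once the walk passes the top location */
} SimIrp;

/* Walks the stack upwards from the current location. Returns early, leaving
 * the IRP owned by the driver whose routine asked for more processing. */
static void sim_complete_request(SimIrp *irp)
{
    while (irp->current < irp->depth) {
        int i = irp->current;
        SimStackLocation loc = irp->stack[i];      /* keep a copy, then zero the slot */
        irp->stack[i] = (SimStackLocation){0};
        irp->pending_returned = loc.pending_returned;
        irp->current = i + 1;                      /* routine runs in the upper driver's context */
        if (loc.completion != NULL) {
            if (loc.completion(irp, loc.context) == SIM_STATUS_MORE_PROCESSING_REQUIRED)
                return;
        } else if (irp->pending_returned && irp->current < irp->depth) {
            irp->stack[irp->current].pending_returned = true;  /* implicit IoMarkIrpPending */
        }
    }
    irp->completed = true;
}

typedef struct { char tags[8]; int count; } SimLog;

static int sim_pass(SimIrp *irp, void *context)
{
    SimLog *log = context;
    (void)irp;
    log->tags[log->count++] = 'p';
    return SIM_STATUS_SUCCESS;
}

static int sim_hold(SimIrp *irp, void *context)
{
    SimLog *log = context;
    (void)irp;
    log->tags[log->count++] = 'h';
    return SIM_STATUS_MORE_PROCESSING_REQUIRED;
}

/* Returns 1 when the walk stops at the holding routine and resumes above it. */
static int sim_completion_demo(void)
{
    SimLog log = { {0}, 0 };
    SimIrp irp = { .depth = 3 };
    irp.stack[0].completion = sim_pass; irp.stack[0].context = &log;
    irp.stack[1].completion = sim_hold; irp.stack[1].context = &log;

    sim_complete_request(&irp);   /* lowest driver completes the IRP */
    int stopped = !irp.completed && log.count == 2
                  && log.tags[0] == 'p' && log.tags[1] == 'h';

    sim_complete_request(&irp);   /* the holding driver completes it again */
    return stopped && irp.completed && log.count == 2;
}
```

The second call resumes from the holding driver's own location, so routines that already ran are not invoked again.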
VOID IoSetCompletionRoutine(IRP* Irp, IO_COMPLETION_ROUTINE CompletionRoutine, VOID* Context, BOOLEAN InvokeOnSuccess, BOOLEAN InvokeOnError, BOOLEAN InvokeOnCancel)
Stores the completion routine in the stack location of the next-lower driver.
This means that IoCompleteRequest called within the current layer will not
execute that completion routine, since it processes IO stack locations starting from
the current (not the next) location.
The lowest-level driver cannot register an IoCompletion routine and must
never try to do so: not only would the routine never be called, but the IO stack would
overflow, because there is no valid IO stack location following the one
used by the lowest driver in the stack.
Only a driver that can guarantee it will not be
unloaded before its completion routine finishes can use IoSetCompletionRoutine.
Otherwise must use IoSetCompletionRoutineEx.
NTSTATUS IoSetCompletionRoutineEx(DEVICE_OBJECT*
DeviceObject, IRP* Irp, IO_COMPLETION_ROUTINE IoCompletion, VOID* Context,
BOOLEAN InvokeOnSuccess, BOOLEAN InvokeOnError, BOOLEAN InvokeOnCancel)
Non-PnP drivers that may be unloaded before their IoCompletion
routine executes must use IoSetCompletionRoutineEx.
DeviceObject must belong to the driver that contains IoCompletion routine.
Returns STATUS_SUCCESS on success, or STATUS_INSUFFICIENT_RESOURCES if insufficient memory is available for the operation.
Once IoSetCompletionRoutineEx succeeds, must
ensure that IoCompletion routine gets executed, by calling IoCallDriver,
otherwise will leak memory.
### Using IoSetCompletionRoutineEx to install the completion routine will
prevent this problem, but you still need to make sure by other means that your
driver doesn't unload before the system calls your completion routine.
### http://www.wd-3.com/archive/WalkPlank.htm
IO_COMPLETION_ROUTINE IoCompletion;
NTSTATUS IoCompletion(DEVICE_OBJECT* DeviceObject, IRP* Irp, VOID* Context)
Called within IoCompleteRequest.
Lower-level drivers' IoCompletion routines are called before those of higher-level drivers.
If additional processing is required, return
STATUS_MORE_PROCESSING_REQUIRED.
Otherwise return STATUS_SUCCESS.
No other status codes may be returned.
Can be executed in arbitrary thread or DPC context.
May be called at any IRQL <= DISPATCH_LEVEL.
Should be designed to be executable at DISPATCH_LEVEL; however may be called at a lower level.
If the driver sets an IoCompletion routine and then passes the IRP to a lower driver, the IoCompletion routine must check the flag Irp->PendingReturned.
If it is set, the IoCompletion routine must call IoMarkIrpPending(Irp).
However, a driver that passes the IRP down and then waits on an event should not mark the IRP pending.
Instead, its IoCompletion routine should signal the event and return
STATUS_MORE_PROCESSING_REQUIRED.
If IoCompletion returns STATUS_SUCCESS, it must first release per-IRP resources.
For example, if dispatch routine allocated MDL with IoAllocateMdl and calls IoBuildPartialMdl for a partial-transfer IRP, this MDL must be released with IoFreeMdl.
If IoCompletion returns STATUS_MORE_PROCESSING_REQUIRED, the lower driver's call to IoCompleteRequest
terminates immediately (i.e. scanning of the chain of IoCompletion routines stops).
Higher-level driver will then have to call IoCompleteRequest later again
to complete the IRP (if that IRP was allocated by IO Manager or higher driver)
or call IoFreeIrp / IoReuseIrp (if that IRP was allocated by current
driver with IoAllocateIrp or IoBuildAsynchronousFsdRequest).
When performing IoFreeIrp, IoCompletion routine should:
· release per-IRP resources
· call IoFreeIrp
· if there is any "parent" IRP and this is final child request on its behalf, call IoCompleteRequest for parent IRP
· IoCompletion routine must return STATUS_MORE_PROCESSING_REQUIRED
To retry IRP (e.g. for error recovery):
· save original error information, if needed
· increment retry count
· set Irp->IoStatus.Status = STATUS_SUCCESS and Irp->IoStatus.Information = 0
· call IoSetCompletionRoutine(Ex)
· call IoCallDriver
· return STATUS_MORE_PROCESSING_REQUIRED
However calling IoCallDriver is not always
possible from IoCompletion routine, since IoCompletion can
execute at elevated IRQL up to DISPATCH_LEVEL.
If that is the case, IoCompletion routine should send IRP to a worker
thread that will then call IoCallDriver to resubmit IRP to a lower-level
driver.
When IoCompletion routine is called, the following information is available to it:
· Irp->IoStatus, that can be also altered by IoCompletion if desired
· the IO stack location used by the dispatch routine; it can be used to pass information from the dispatch routine to the completion routine and is available in both via IoGetCurrentIrpStackLocation
· note that the content of the next IO stack location is not available in the IoCompletion routine, since it is zeroed out by IO Manager
IRP cancellation:
If a driver has StartIo routine, its dispatch routines can register a Cancel routine by supplying its address as input to IoStartPacket.
If a driver does not have StartIo routine, its dispatch routines must do the following before queuing an IRP internally:
IoAcquireCancelSpinLock();
IoSetCancelRoutine();
IoReleaseCancelSpinLock();
VOID IoAcquireCancelSpinLock(KIRQL* Irql) // callable at <= DISPATCH_LEVEL
VOID IoReleaseCancelSpinLock(KIRQL Irql) // must be at DISPATCH_LEVEL
Driver that uses IO-Manager supplied IRP queues to device object (system queueing):
· Must hold the cancel spinlock when it changes the cancellable state of IRP with IoSetCancelRoutine.
· Only the holder of the spinlock can change cancel state of the IRP.
· StartIo and any other routine that dequeues IRP or is called with an IRP that may be held in cancellable state should do the following:
IoAcquireCancelSpinLock
if (Irp != DeviceObject->CurrentIrp)
{
    // IRP was cancelled between the time IoStartPacket released the cancel spinlock and this routine acquired it
    IoReleaseCancelSpinLock
    return
}
// remove from cancellable state
IoSetCancelRoutine(Irp, NULL)
// was cancellation requested?
if (Irp->Cancel)
{
    IoReleaseCancelSpinLock
    Irp->IoStatus.Status = STATUS_CANCELLED
    Irp->IoStatus.Information = 0
    IoStartNextPacket(deviceObject, TRUE) // in a StartIo routine only
    IoCompleteRequest(Irp, IO_NO_INCREMENT)
    return
}
IoReleaseCancelSpinLock
start processing the request
Drivers that manage their own IRP queues do not need to hold the cancel
spinlock when calling IoSetCancelRoutine.
They can use their own spinlock instead and reduce the contention.
However they should check return value of IoSetCancelRoutine to
determine whether Cancel routine has already started (see below).
Pseudo-code for such driver's routines to queue IRP, to dequeue IRP and for
cancellation routine is as follows:
To queue IRP:
acquire private spinlock
IoMarkIrpPending(Irp)
insert into private queue, e.g. InsertTailList(&deviceContext->irpQueue, &Irp->Tail.Overlay.ListEntry)
oldCancelRoutine = IoSetCancelRoutine(Irp, MyCancelRoutine)
assert (oldCancelRoutine == NULL)
if (Irp->Cancel)
{
    // IRP was cancelled. Check whether our cancel routine had been called.
    oldCancelRoutine = IoSetCancelRoutine(Irp, NULL)
    if (oldCancelRoutine != NULL)
    {
        // cancel routine had not been called
        // so dequeue IRP now and complete it
        remove from private queue, e.g. RemoveEntryList(&Irp->Tail.Overlay.ListEntry)
        release private spinlock
        Irp->IoStatus.Status = STATUS_CANCELLED
        Irp->IoStatus.Information = 0
        IoStartNextPacket(deviceObject, TRUE) ??? // in a StartIo routine only
        IoCompleteRequest(Irp, IO_NO_INCREMENT)
        return STATUS_CANCELLED
    }
    else
    {
        // cancel routine was already called
        // as soon as we release the spinlock, it will dequeue and complete the IRP
        // return STATUS_PENDING as we are not completing IRP here
        release private spinlock
        return STATUS_PENDING
    }
}
release private spinlock
return STATUS_PENDING
To dequeue IRP:
acquire private spinlock
irp = NULL
while (queue is not empty)
{
    irp = remove irp off queue
    oldCancelRoutine = IoSetCancelRoutine(irp, NULL);
    if (oldCancelRoutine != NULL)
    {
        // was not cancelled
        assert (oldCancelRoutine == MyCancelRoutine)
        break;
    }
    // this IRP was just cancelled and cancel routine is being called
    // cancel routine will complete this irp as soon as we release the spinlock,
    // so don't do anything with it
    // However cancel routine will try to dequeue the IRP, so make its ListEntry point to itself
    assert (irp->Cancel)
    InitializeListHead(&irp->Tail.Overlay.ListEntry);
    irp = NULL
}
release private spinlock
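The step that makes the entry point to itself is what keeps the cancel routine's later RemoveEntryList from corrupting the queue. A user-mode sketch of that effect follows. ListEntry and the list_xxx helpers are simplified local re-creations of LIST_ENTRY, InitializeListHead, InsertTailList, RemoveEntryList and RemoveHeadList; SimIrp and cancel_race_demo are hypothetical, and locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ListEntry { struct ListEntry *Flink, *Blink; } ListEntry;

#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

/* Counterpart of InitializeListHead: the entry becomes an empty list of its own. */
static void list_init(ListEntry *head) { head->Flink = head->Blink = head; }

/* Counterpart of InsertTailList. */
static void list_insert_tail(ListEntry *head, ListEntry *entry)
{
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

/* Counterpart of RemoveEntryList (return value omitted for brevity). */
static void list_remove(ListEntry *entry)
{
    entry->Blink->Flink = entry->Flink;
    entry->Flink->Blink = entry->Blink;
}

/* Counterpart of RemoveHeadList; the caller must check that the list is not empty. */
static ListEntry *list_remove_head(ListEntry *head)
{
    ListEntry *first = head->Flink;
    list_remove(first);
    return first;
}

typedef struct { int id; ListEntry entry; } SimIrp;

/* Returns 1 when the queue is still intact after the redundant removal. */
static int cancel_race_demo(void)
{
    ListEntry queue;
    SimIrp a = { 1, { NULL, NULL } }, b = { 2, { NULL, NULL } };
    list_init(&queue);
    list_insert_tail(&queue, &a.entry);
    list_insert_tail(&queue, &b.entry);

    /* Dequeue path: takes IRP 1 off the queue, then learns that its
     * cancel routine is already running. */
    ListEntry *e = list_remove_head(&queue);
    SimIrp *taken = CONTAINING_RECORD(e, SimIrp, entry);
    list_init(&taken->entry);        /* make the entry point to itself */

    /* Cancel routine, running later, still tries to unlink the IRP. */
    list_remove(&taken->entry);      /* touches only the entry itself */

    return taken->id == 1
        && queue.Flink == &b.entry && queue.Blink == &b.entry
        && b.entry.Flink == &queue && b.entry.Blink == &queue;
}
```

Without the list_init call, the stale Flink/Blink of the removed entry would still reference the queue head and IRP 2, and the second removal would rewrite their links.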
For cancellation routine:
// release the global cancel spin lock
// do this while not holding any other spin locks so that we exit at the right IRQL
IoReleaseCancelSpinLock(Irp->CancelIrql)
lock private spinlock
// find on private list, e.g.
RemoveEntryList(&Irp->Tail.Overlay.ListEntry)
// if not found on private list, just release the spinlock and return
// in assumption IRP processing already had started
release private spinlock
Irp->IoStatus.Status = STATUS_CANCELLED
Irp->IoStatus.Information = 0
IoCompleteRequest(Irp, IO_NO_INCREMENT)
PDRIVER_CANCEL IoSetCancelRoutine(IRP* Irp, PDRIVER_CANCEL CancelRoutine)
Callable at DISPATCH_LEVEL.
If driver uses system queueing, must hold the cancel spinlock.
If driver does not use system queueing, need not hold a spinlock. IoSetCancelRoutine internally uses interlocked exchange to set cancel routine.
If no routine was previously set or cancellation is
already in process, returns NULL.
When passing IRP to lower-level driver, must reset cancel routine to
NULL before IoCallDriver.
Before completing IRP, driver must reset cancel routine to NULL.
BOOLEAN IoCancelIrp(IRP* Irp)
Callable at <= DISPATCH_LEVEL.
Returns TRUE if IRP had a cancel routine (Irp->CancelRoutine != NULL) and this routine was called.
Otherwise returns FALSE.
In either case IRP's cancel bit is set to TRUE.
lock cancel spinlock
Irp->Cancel = TRUE
cancelRoutine = interlocked exchange (Irp->CancelRoutine, NULL)
if (cancelRoutine == NULL)
{
    release cancel spinlock
    return FALSE;
}
else
{
    cancelRoutine(deviceObject, Irp)
    return TRUE;
}
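The interlocked exchange is what decides who owns the IRP when dequeuing and cancellation race: whichever side swaps out a non-NULL routine wins. A user-mode sketch with C11 atomics follows. SimIrp, sim_set_cancel_routine, sim_cancel_irp and the other sim_xxx names are hypothetical models of the IRP, IoSetCancelRoutine and IoCancelIrp; the cancel spinlock is omitted and the two sides run sequentially rather than on separate threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct SimIrp;
typedef void (*SimCancelFn)(struct SimIrp *irp);

typedef struct SimIrp {
    _Atomic(SimCancelFn) cancel_routine;
    atomic_bool cancel;       /* models Irp->Cancel */
    int cancel_calls;         /* how many times the cancel routine ran */
} SimIrp;

static void sim_irp_init(SimIrp *irp)
{
    atomic_init(&irp->cancel_routine, (SimCancelFn)0);
    atomic_init(&irp->cancel, false);
    irp->cancel_calls = 0;
}

/* Models IoSetCancelRoutine: swap in the new routine, hand back the old one. */
static SimCancelFn sim_set_cancel_routine(SimIrp *irp, SimCancelFn fn)
{
    return atomic_exchange(&irp->cancel_routine, fn);
}

/* Models IoCancelIrp: set the cancel flag, then call the routine if one was set. */
static bool sim_cancel_irp(SimIrp *irp)
{
    atomic_store(&irp->cancel, true);
    SimCancelFn fn = atomic_exchange(&irp->cancel_routine, (SimCancelFn)0);
    if (fn == NULL)
        return false;
    fn(irp);
    return true;
}

static void sim_my_cancel(SimIrp *irp) { irp->cancel_calls++; }

/* Returns 1 when each ordering yields exactly one owner of the IRP. */
static int cancel_ownership_demo(void)
{
    SimIrp first, second;
    sim_irp_init(&first);
    sim_irp_init(&second);

    /* Case 1: the driver dequeues before cancellation arrives. */
    sim_set_cancel_routine(&first, sim_my_cancel);
    bool driver_owns = (sim_set_cancel_routine(&first, NULL) == sim_my_cancel);
    bool called = sim_cancel_irp(&first);
    int case1 = driver_owns && !called && first.cancel_calls == 0
                && atomic_load(&first.cancel);

    /* Case 2: cancellation arrives first; the driver's exchange returns NULL,
     * so it must leave the IRP to the cancel routine. */
    sim_set_cancel_routine(&second, sim_my_cancel);
    called = sim_cancel_irp(&second);
    bool driver_lost = (sim_set_cancel_routine(&second, NULL) == NULL);
    int case2 = called && driver_lost && second.cancel_calls == 1;

    return case1 && case2;
}
```

In case 1 the cancel flag is still set even though no routine ran, which is why a driver that wins the exchange should still look at Irp->Cancel before starting the request.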
A higher-level driver can call IoCancelIrp with any pending IRP that the driver has allocated. However, making this call does not ensure that the driver-allocated IRP will be completed with its I/O status block set to STATUS_CANCELLED; another thread might already be completing the IRP. To check whether the IRP was canceled, the higher-level driver must call IoSetCompletionRoutine(Ex) with the InvokeOnCancel parameter set to TRUE before passing the IRP on to the next lower driver.
VOID CancelRoutine(DEVICE_OBJECT* DeviceObject, IRP* Irp)
Called at DISPATCH_LEVEL with cancel spinlock already
held.
CancelRoutine should:
· determine if IRP is currently being held in a cancellable state within the driver
· release the cancel spinlock once it no longer needs it by IoReleaseCancelSpinLock(Irp->CancelIrql)
· if IRP is not held in cancellable state within the driver, return
· otherwise cancel IRP and:
· set Irp->IoStatus.Status = STATUS_CANCELLED and Irp->IoStatus.Information = 0
· call IoCompleteRequest(Irp, boost)
If driver uses StartIo routine, Cancel routine should:
if (Irp == deviceObject->CurrentIrp)
{
    // current, already being processed
    IoReleaseCancelSpinLock
}
else if (KeRemoveEntryDeviceQueue(&DeviceObject->DeviceQueue, &Irp->Tail.Overlay.DeviceQueueEntry))
{
    // was in the queue and was removed
    IoReleaseCancelSpinLock
    Irp->IoStatus.Status = STATUS_CANCELLED
    Irp->IoStatus.Information = 0
    IoSetCancelRoutine(Irp, NULL)
    IoCompleteRequest(Irp, IO_NO_INCREMENT)
}
else
{
    // not in the queue, apparently being processed
    IoReleaseCancelSpinLock
}
if being processed (case 1 and possibly 3) and can actually abort it, do:
    do abort processing
    IoStartNextPacket(DeviceObject, TRUE)
    Irp->IoStatus.Status = STATUS_CANCELLED
    Irp->IoStatus.Information = 0
    IoSetCancelRoutine(Irp, NULL)
    IoCompleteRequest(Irp, IO_NO_INCREMENT)
If driver does not use StartIo routine, Cancel routine should follow the pseudocode in the beginning of this section.
Under XP and later versions of Windows, drivers that
implement their own queueing can use cancel-safe IRP queues rather than
implement their own cancel routines.
See MSDN “Cancel-Safe IRP Queues”.
Driver routines (entry points) overview:
Basic (used by most drivers, hardware drivers at least):
DriverEntry | Called when a driver is first loaded.
Dispatch entry points | Called to request the driver to initiate a particular IO operation.
ISR | Called when one of the driver’s devices generates an interrupt.
DpcForIsr and/or CustomDpc | Called to let the driver complete its work as a result of an interrupt or other special condition.
Routines applicable to processing of a specific IRP or group of IRPs:
Cancel | Per-IRP. Driver can define a cancel routine for each IRP it holds in the internal queue.
Completion routine | Higher-level driver can store a Completion routine in the IRP.
Routines applicable to specific functionality, if it is supported by the driver:
Reinitialize | Called to allow the driver to perform secondary initialization (“stage 2” of initialization, after other drivers have initialized). Can be called multiple times, as needed. Can be used to wait for other drivers to initialize or, for drivers that take a long time to initialize, to perform initialization in steps.
StartIo | IO Manager utilizes this routine only in drivers that use System Queueing.
Unload | Called when the driver is about to be unloaded.
IoTimer | Called about once a second in drivers that have initialized and started IoTimer support.
Fast I/O | Exposed by file system drivers.
AdapterControl | Called to indicate that shared DMA resources are available for use.
Timer DPC | Called when a driver-requested timer expires.
Synchronize function | Called in response to a driver’s request to acquire one of its ISR spinlocks.
IRQLs are described in “Scheduling, Thread Context, and IRQL” (http://msdn.microsoft.com/en-us/library/ms810029.aspx)
General flow:
DispatchSome(deviceObject, irp)
{
    NTSTATUS status;
    CCHAR priorityBoost = IO_NO_INCREMENT; // or IO_DISK_INCREMENT etc.
    IO_STACK_LOCATION* ios = IoGetCurrentIrpStackLocation(irp);
    if (! request parameters valid)
    {
        status = STATUS_some_error;
        irp->IoStatus.Status = status;
        irp->IoStatus.Information = 0;
        IoCompleteRequest(irp, priorityBoost);
        return status;
    }
    if (device is busy)
    {
        mark irp pending
        queue irp
        return STATUS_PENDING;
    }
    start request described by IRP on the device within the context of DispatchSome
    if (need time to complete the request)
    {
        if (uses system queuing)
        {
            IoMarkIrpPending
            indicate device is busy
            IoSetCancelRoutine
            IoStartPacket
            return STATUS_PENDING;
        }
        else if (uses private queueing)
        {
            IoMarkIrpPending
            indicate device is busy
            IoSetCancelRoutine
            queue it internally
            if queue was empty, start IO
            return STATUS_PENDING;
        }
    }
    else
    {
        status = STATUS_result;
        irp->IoStatus.Status = status;
        irp->IoStatus.Information = usually number of bytes;
        IoCompleteRequest(irp, priorityBoost);
        return status;
    }
}
ISR(...)
{
    if (our interrupt)
    {
        save some state
        ack interrupt to device
        IoRequestDpc()
    }
}
DpcForIsr(...)
{
    // called at DISPATCH_LEVEL
    if (! is here to complete the request)
    {
        do something else
        return
    }
    complete processing request:
        things like moving data from the device to requestor's buffer (for programmed IO devices)
        reenable interrupt on the device
        can also retry failed operation or set up next part of large transfer
    complete irp:
        irp->IoStatus.Status = status;
        irp->IoStatus.Information = usually number of bytes;
        IoCompleteRequest(irp, priorityBoost);
    if (more irp's pending)
    {
        start request described by irp on the device:
        if the driver uses system IRP queue, call IoStartNextPacket or IoStartNextPacketByKey,
        so driver's StartIo routine will be called
        if the driver uses internal IRP queue, dequeue next IRP and begin processing the request
    }
    else
    {
        indicate device is not busy
    }
}
If the request needs to be passed down the stack, call IoSetCompletionRoutine(Ex) and then IoCallDriver.
Driver routines (entry points):
DRIVER_INITIALIZE DriverEntry;
NTSTATUS DriverEntry(DRIVER_OBJECT* DriverObject, UNICODE_STRING* RegistryPath)
Called when a driver is being loaded.
Responsible for driver initialization.
In PnP case initializes just the driver itself, not the device
(DRIVER_ADD_DEVICE is responsible for device initialization for PnP).
Called at PASSIVE_LEVEL within a context of system thread.
Return STATUS_SUCCESS if ok.
Required purpose:
· export driver entry points, including AddDevice, StartIo, Unload, and dispatch routines
· initialize driver-wide objects and resources that the driver uses
· for example:
DriverObject->DriverUnload = MyUnload;
DriverObject->DriverExtension->AddDevice = MyAddDevice;
DriverObject->MajorFunction[IRP_MJ_PNP] = MyDispatchPnp;
DriverObject->MajorFunction[IRP_MJ_POWER] = MyDispatchPower;
DriverObject->DriverStartIo = MyStartIo; // optional
DriverObject->FastIoDispatch = &MyFastIoDispatch; // for file system drivers and network transport drivers
· additional routines, such as ISRs or IoCompletion are specified by calling system support routines
Optional purpose:
· call IoAllocateDriverObjectExtension to allocate and initialize driver object extension, if such storage is needed
· create system threads if needed
· register DRIVER_REINITIALIZE routine
· for non-PnP drivers (mostly):
o determine device configuration and dynamically locate the devices that the driver needs to support
o create device objects by calling IoCreateDevice etc. (see more in DRIVER_ADD_DEVICE description below)
o claim hardware resources (IO ports, IRQs, shared memory, DMA channels) for the driver's use
o translate bus addresses from bus-specific to system-wide values suppliable to HAL interfaces
o connect to interrupt and register DpcForIsr routine
o get a pointer to Adapter object
o perform device initialization
Legacy (non-PnP) drivers call IoCreateDevice from DriverEntry.
PnP drivers usually call IoCreateDevice from the AddDevice routine, but they might also create control devices in DriverEntry.
If DriverEntry returns error (not success or informational status):
· driver is not loaded
· DRIVER_UNLOAD is not called
· before returning DriverEntry must deallocate all objects it created and registry entries it set up
· entry points for IRP_MJ_FLUSH_BUFFERS and IRP_MJ_SHUTDOWN must be reset to NULL
· must record error log entry
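The "error (not success or informational status)" wording follows from the NTSTATUS layout: the top two bits encode severity, and NT_SUCCESS is true for both success and informational codes. A minimal user-mode sketch, assuming the usual macro definitions re-declared locally so it compiles outside the kernel (in a driver they come from the WDK headers); the helper driver_stays_loaded is hypothetical and only models the rule:

```c
#include <assert.h>
#include <stdint.h>

/* Local re-declarations so the sketch builds in user mode;
   in a driver these come from the WDK headers. */
typedef int32_t NTSTATUS;

#define NT_SUCCESS(s)     (((NTSTATUS)(s)) >= 0)
#define NT_INFORMATION(s) ((((uint32_t)(s)) >> 30) == 1)
#define NT_WARNING(s)     ((((uint32_t)(s)) >> 30) == 2)
#define NT_ERROR(s)       ((((uint32_t)(s)) >> 30) == 3)

#define STATUS_SUCCESS            ((NTSTATUS)0x00000000) /* severity 0: success       */
#define STATUS_OBJECT_NAME_EXISTS ((NTSTATUS)0x40000000) /* severity 1: informational */
#define STATUS_BUFFER_OVERFLOW    ((NTSTATUS)0x80000005) /* severity 2: warning       */
#define STATUS_UNSUCCESSFUL       ((NTSTATUS)0xC0000001) /* severity 3: error         */

/* Hypothetical helper: models whether the driver remains loaded
   for a given DriverEntry return value. Warnings and errors are
   negative as signed values, so NT_SUCCESS rejects both. */
int driver_stays_loaded(NTSTATUS driver_entry_status)
{
    return NT_SUCCESS(driver_entry_status);
}
```

In this model a warning status is treated the same as an error: only severities 0 and 1 keep the driver loaded.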
The DriverEntry routine can be pageable and may be placed in the INIT segment (use alloc_text) so it can be discarded after initialization.
Must be named DriverEntry (possible to rename, but DDK build tools assume this name).
Must fill DriverObject with pointers to driver standard routines.
DriverObject->HardwareDatabase (counted Unicode string) is "\Registry\Machine\Hardware".
RegistryPath is something like "\Registry\Machine\System\CurrentControlSet\Services\DriverName"; the driver should save a copy of this string (not just the pointer) if it needs it after DriverEntry returns.
Must not call IoRegisterDriverReinitialization unless it will return STATUS_SUCCESS.
DRIVER_UNLOAD Unload;
VOID Unload(DRIVER_OBJECT* DriverObject)
Called when the driver is about to be unloaded.
Called at PASSIVE_LEVEL in the context of the system process.
Before Unload is called, the driver is marked as "unload pending".
When Unload is called:
· no references remain to any of the device objects for the driver
· this means: no new IRPs can be created for the driver's devices (incl. no files can be opened) and there are no outstanding IRPs on any of the driver's devices
· also means: no other driver remains attached to this driver and no new drivers can attach anymore to this driver
· the driver has called IoUnregisterPlugPlayNotification to unregister all PnP notifications for which it previously registered
Not called during shutdown, rather DispatchShutdown routine is called in this case.
Purpose:
· for PnP driver:
o Unload is called after all driver's devices had been removed (PnP drivers free device-specific resources and device objects in response to PnP device-removal IRPs. The PnP manager sends these IRPs on behalf of each PnP device it enumerates as well as any root-enumerated legacy devices a driver reports using IoReportDetectedDevice)
· for non-PnP driver:
o disable interrupts from device (so ISR is no longer called) and call IoDisconnectInterrupt for each device and controller that has interrupt object
o free the hardware resources that the DriverEntry or Reinitialize routine claimed for the driver's physical devices, if any, in the registry under the \Registry\Machine\Hardware\ResourceMap tree
o IoStopTimer, if driver's IoTimer routine is enabled on the device object
o also make sure that no timer objects are queued for calls to CustomTimerDpc routine
o KeRemoveQueueDpc if ISR queued DPC
o if driver called IoQueueWorkItem, wait till it completes
o release any device-specific resources that the driver allocated (format: if DriverEntry/Reinitialize did XXX -> do YYY)
§ PsCreateSystemThread -> PsTerminateSystemThread
§ IoCreateSymbolicLink, IoCreateUnprotectedSymbolicLink -> IoDeleteSymbolicLink
§ IoAssignArcName -> IoDeassignArcName
§ ExAllocatePool -> ExFreePool
§ MmMapIoSpace -> MmUnmapIoSpace
§ MmAllocateContiguousMemory -> MmFreeContiguousMemory
§ AllocateCommonBuffer -> FreeCommonBuffer
§ HalAssignSlotResources -> IoAssignResources, IoReportResourceUsage
o remove any names for driver's devices that DriverEntry or Reinitialize stored in the registry under \Registry...\DeviceMap
o free device objects and controller objects:
for each controller
{
for each device on the controller
{
release external references stored in device object (or extension),
such as pointers to other device objects or to interrupt objects;
if external device (such as next-lower device) was linked via IoGetDeviceObjectPointer, release it with ObDereferenceObject;
if external device was attached with IoAttachDevice or IoAttachDeviceToDeviceStack, detach it with IoDetachDevice
IoDeleteDevice()
}
release external references stored in controller object (or extension)
IoDeleteController() // for non-WDM drivers only
}
can also traverse devices via DriverObject->DeviceObject and then via DeviceObject->NextDevice.
· release allocated driver-wide resources, such as memory, threads, events etc.
· i.e. undo the work performed by DriverEntry or Reinitialize routines
IO Manager will free the driver object, including the extension allocated with IoAllocateDriverObjectExtension.
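The traversal via DriverObject->DeviceObject and DeviceObject->NextDevice has one pitfall worth spelling out: the next pointer has to be read before the current device is deleted. A minimal user-mode sketch of that pattern; the *_MODEL types and model_* functions are hypothetical stand-ins for DEVICE_OBJECT, DRIVER_OBJECT, IoCreateDevice and IoDeleteDevice, not the real kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins: only the two link fields named in the notes are kept. */
typedef struct DEVICE_OBJECT_MODEL {
    struct DEVICE_OBJECT_MODEL* NextDevice;
} DEVICE_OBJECT_MODEL;

typedef struct DRIVER_OBJECT_MODEL {
    DEVICE_OBJECT_MODEL* DeviceObject;   /* head of the driver's device list */
} DRIVER_OBJECT_MODEL;

/* Stand-in for IoCreateDevice: allocate a device and link it into the list. */
DEVICE_OBJECT_MODEL* model_create_device(DRIVER_OBJECT_MODEL* driver)
{
    DEVICE_OBJECT_MODEL* dev = malloc(sizeof *dev);
    if (dev == NULL)
        return NULL;
    dev->NextDevice = driver->DeviceObject;
    driver->DeviceObject = dev;
    return dev;
}

/* Stand-in for IoDeleteDevice: unlink the device from the list and free it. */
void model_delete_device(DRIVER_OBJECT_MODEL* driver, DEVICE_OBJECT_MODEL* dev)
{
    DEVICE_OBJECT_MODEL** link = &driver->DeviceObject;
    while (*link != NULL && *link != dev)
        link = &(*link)->NextDevice;
    assert(*link == dev);                /* the device must belong to this driver */
    *link = dev->NextDevice;
    free(dev);
}

/* Unload-style teardown: save NextDevice BEFORE deleting the current device,
   because the current device's memory is gone after the delete call. */
int model_delete_all_devices(DRIVER_OBJECT_MODEL* driver)
{
    int deleted = 0;
    DEVICE_OBJECT_MODEL* dev = driver->DeviceObject;
    while (dev != NULL) {
        DEVICE_OBJECT_MODEL* next = dev->NextDevice;
        model_delete_device(driver, dev);
        dev = next;
        deleted++;
    }
    return deleted;
}
```

The model leaves out everything else Unload has to do per device (detaching, releasing references, symbolic links); it only illustrates the iteration order.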
DRIVER_REINITIALIZE Reinitialize;
VOID Reinitialize(DRIVER_OBJECT* DriverObject, PVOID Context, ULONG Count)
Called to allow the driver to perform secondary initialization (“stage 2” of initialization), after DriverEntry has returned STATUS_SUCCESS and other drivers have initialized themselves.
Can be called multiple times, as needed.
Mostly useful for drivers loaded at the boot-start or system-start phases.
Called at PASSIVE_LEVEL.
Count is the number of times this Reinitialize has been called, including the current call, i.e. >= 1.
To schedule its execution, DriverEntry must call (but only if it returns STATUS_SUCCESS):
VOID IoRegisterDriverReinitialization(DRIVER_OBJECT* DriverObject, MyReinitialize, PVOID Context)
If reinitialization needs to be repeated, Reinitialize can call IoRegisterDriverReinitialization again, as many times as required (Count will increment on each new invocation).
If Reinitialize needs to use the registry, the caller (such as DriverEntry) should pass a copy of the RegistryPath string as a part of the Context.
DRIVER_ADD_DEVICE AddDevice;
NTSTATUS AddDevice(DRIVER_OBJECT* DriverObject, DEVICE_OBJECT* PhysicalDeviceObject)
Defined by drivers that support PnP.
Called by PnP Manager to create a functional device object (FDO) or filter device object (filter DO) for devices enumerated by PnP Manager.
Declared by DriverObject->DriverExtension->AddDevice = MyAddDevice.
Called at PASSIVE_LEVEL.
Determine if device needs service by this driver. May use routines such as IoGetDeviceProperty.
If a filter driver determines its AddDevice routine was called for a device it does not need to service, the filter driver does not create a device object nor attach it to the device stack, instead it must return STATUS_SUCCESS to allow the rest of the device stack to be loaded for the device.
Otherwise call IoCreateDevice to create functional or filter device object (FDO or filter DO) for the device being added.
Optionally create one or more symbolic links for the device: call IoRegisterDeviceInterface to register device functionality and create symbolic link; the driver should enable the interface by calling IoSetDeviceInterfaceState when it handles IRP_MN_START_DEVICE request.
Store the pointer to device's PDO in the FDO/DO device object extension.
Attach the device object to the device stack with IoAttachDeviceToDeviceStack. Specify a pointer to the device's PDO as the TargetDevice parameter. Store the pointer returned by IoAttachDeviceToDeviceStack (i.e. the pointer to the device object of the next-lower driver for the device) for use in IoCallDriver and PoCallDriver when passing IRPs down the stack.
Be prepared to handle PnP IRPs for the device, such as IRP_MN_QUERY_RESOURCE_REQUIREMENTS and IRP_MN_START_DEVICE.
A driver must not start controlling the device until it receives an IRP_MN_START_DEVICE containing the list of hardware resources assigned to the device by the PnP manager.
DRIVER_STARTIO StartIo;
VOID StartIo(DEVICE_OBJECT* DeviceObject, IRP* Irp)
Optional routine.
IO Manager utilizes this routine only in drivers that use System Queueing.
For such drivers, IO Manager calls the StartIo routine to start a new I/O request.
Called to start an operation described by IRP.
If exported, must be declared as DriverObject->DriverStartIo = MyStartIo.
Called at DISPATCH_LEVEL in an unpredictable thread context.
DeviceObject.Busy is set TRUE right before StartIo is called and is cleared any time IoStartNextPacket is called and there are no pending IRPs in the device queue.
When StartIo can be called is controlled by VOID IoSetStartIoAttributes(DEVICE_OBJECT* DeviceObject, BOOLEAN DeferredStartIo, BOOLEAN NonCancelable).
DeferredStartIo means IO Manager will defer any call to the driver's StartIo while the driver is already inside this routine. In particular, if StartIo calls IoStartNextPacket, then StartIo will not be called again until the current invocation completes. The default is FALSE.
NonCancelable means the IRP cannot be cancelled once it has been dequeued by a call to IoStartNextPacket. The default is FALSE; drivers that set it to FALSE must synchronize their IRP handling with the cancel spinlock.
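The Busy behaviour described above (set right before StartIo is called, cleared when IoStartNextPacket finds no pending IRPs) can be modelled with a small state machine. This is a user-mode sketch under stated assumptions: requests are plain integer ids, the queue is a fixed-size FIFO, and the model_* functions are hypothetical stand-ins for IoStartPacket, IoStartNextPacket and the driver's StartIo. Cancellation, IRQL and locking are deliberately left out.

```c
#include <assert.h>

#define MODEL_QUEUE_CAPACITY 8

typedef struct DEVICE_MODEL {
    int busy;                          /* models the Busy state from the notes */
    int current_irp;                   /* id handed to StartIo, -1 if none */
    int queue[MODEL_QUEUE_CAPACITY];   /* FIFO of pending request ids */
    int head;
    int count;
    int startio_calls;                 /* how many times StartIo ran */
} DEVICE_MODEL;

void model_init(DEVICE_MODEL* dev)
{
    dev->busy = 0;
    dev->current_irp = -1;
    dev->head = 0;
    dev->count = 0;
    dev->startio_calls = 0;
}

/* Stand-in for the driver's StartIo routine. */
void model_start_io(DEVICE_MODEL* dev, int irp)
{
    dev->current_irp = irp;
    dev->startio_calls++;
}

/* Models IoStartPacket: queue the request if the device is busy,
   otherwise mark the device busy and call StartIo right away. */
void model_start_packet(DEVICE_MODEL* dev, int irp)
{
    if (dev->busy) {
        assert(dev->count < MODEL_QUEUE_CAPACITY);
        dev->queue[(dev->head + dev->count) % MODEL_QUEUE_CAPACITY] = irp;
        dev->count++;
        return;
    }
    dev->busy = 1;
    model_start_io(dev, irp);
}

/* Models IoStartNextPacket: start the next queued request,
   or clear the busy state if nothing is pending. */
void model_start_next_packet(DEVICE_MODEL* dev)
{
    if (dev->count == 0) {
        dev->busy = 0;
        dev->current_irp = -1;
        return;
    }
    int irp = dev->queue[dev->head];
    dev->head = (dev->head + 1) % MODEL_QUEUE_CAPACITY;
    dev->count--;
    model_start_io(dev, irp);
}
```

In the model, StartIo runs exactly once per request, and the device only becomes idle when model_start_next_packet is called on an empty queue, which mirrors how DpcForIsr calls IoStartNextPacket after completing each IRP.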
### startio: http://msdn.microsoft.com/en-us/library/windows/hardware/ff566404%28v=vs.85%29.aspx
DRIVER_DISPATCH DispatchSome;
NTSTATUS DispatchSome(DEVICE_OBJECT* DeviceObject, IRP* Irp)
Defined one per major IO function (IRP_MJ_xxx) supported by the driver.
Most (but not all, see below) dispatch routines are called at PASSIVE_LEVEL.
For top-level drivers (such as file system drivers) the dispatch routine is called at PASSIVE_LEVEL in the context of the thread originating the IO request:
System Service Dispatcher calls IO Manager, which
calls IoAllocateIrp
sets up IRP and IO Stack in the IRP
calls IoCallDriver => which calls DispatchSome
For drivers layered below other drivers, DispatchRead, DispatchWrite and DispatchDeviceControl routines can be invoked in the context of an arbitrary thread and at IRQL <= APC_LEVEL.
For example, DispatchRead and DispatchWrite for storage drivers in the paging path can be invoked at APC_LEVEL.
DispatchRead and DispatchWrite routines of intermediate and lowest-level drivers are required to be non-pageable and should not execute any blocking calls (such as KeWaitForXxx with non-zero timeout).
For drivers in the hibernation and/or paging path, DispatchPower can be called at DISPATCH_LEVEL.
DispatchPnP of such drivers must also handle IRP_MN_DEVICE_USAGE_NOTIFICATION.
DispatchPower of drivers that require inrush power at start-up can be called at DISPATCH_LEVEL.
If driver queued the IRP, returns STATUS_PENDING.
DRIVER_DISPATCH DispatchCleanup;
NTSTATUS DispatchCleanup(DEVICE_OBJECT* DeviceObject, IRP* Irp)
Dispatch routine for IRP_MJ_CLEANUP.
The request indicates that an application is being terminated or has closed a file handle for the file object that represents the driver's device object.
When DispatchCleanup returns, usually DispatchClose is called next.
Driver must complete every IRP currently queued to target device object, for the file specified in the cleanup IRP's IO stack location, and complete the cleanup IRP.
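The cleanup rule above (complete every queued IRP that belongs to the file object named in the cleanup IRP, leave the rest alone) amounts to a filtered sweep over the pending queue. A minimal user-mode sketch, assuming a flat array as the queue and integer ids in place of FILE_OBJECT pointers; PENDING_IRP_MODEL and model_cleanup are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

typedef struct PENDING_IRP_MODEL {
    int file_id;     /* which file object the request was issued on */
    int completed;   /* set once the model "completes" the request */
} PENDING_IRP_MODEL;

/* Completes every still-pending entry that belongs to file_id and returns
   how many were completed. Entries for other files are left untouched. */
size_t model_cleanup(PENDING_IRP_MODEL* queue, size_t n, int file_id)
{
    size_t done = 0;
    assert(queue != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!queue[i].completed && queue[i].file_id == file_id) {
            queue[i].completed = 1;
            done++;
        }
    }
    return done;
}
```

A real driver would also have to hold its queue lock and clear each IRP's cancel routine before completing it; the sketch only shows the per-file selection.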
DRIVER_DISPATCH DispatchShutdown;
NTSTATUS DispatchShutdown(DEVICE_OBJECT* DeviceObject, IRP* Irp)
Dispatch routine for IRP_MJ_SHUTDOWN.
Must be declared in DriverObject->MajorFunction[IRP_MJ_SHUTDOWN].
Additionally, driver must call IoRegisterShutdownNotification or IoRegisterLastChanceShutdownNotification.
Called at PASSIVE_LEVEL.
Last-chance routine must not cause any paging or IO operations.
Called before IO Manager sends IRP_MN_SET_POWER for PowerSystemShutdown.
DRIVER_DISPATCH DispatchDeviceControl;
NTSTATUS DispatchDeviceControl(DEVICE_OBJECT* DeviceObject, IRP* Irp)
DRIVER_DISPATCH DispatchInternalDeviceControl;
NTSTATUS DispatchInternalDeviceControl(DEVICE_OBJECT* DeviceObject, IRP* Irp)
DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = MyDispatchDeviceControl
DriverObject->MajorFunction[IRP_MJ_INTERNAL_DEVICE_CONTROL] = MyDispatchInternalDeviceControl
Can validate access rights of non-internal caller with NTSTATUS IoValidateDeviceIoControlAccess(Irp, FILE_READ_ACCESS / FILE_WRITE_ACCESS).
If a higher-level driver receives IRP_MJ_DEVICE_CONTROL with the control code it does not recognize as its own, it should pass it to lower-level driver.
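Both points above depend on how a control code is put together: the device type, required access, function number and transfer method are packed into one 32-bit value. A user-mode sketch, assuming the standard CTL_CODE layout re-declared locally so it builds outside the WDK; IOCTL_MY_SET_CONFIG, the ioctl_* helpers and classify_ioctl are hypothetical names made up for the example:

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the standard definitions so the sketch builds in user mode. */
#define CTL_CODE(DeviceType, Function, Method, Access) \
    (((uint32_t)(DeviceType) << 16) | ((uint32_t)(Access) << 14) | \
     ((uint32_t)(Function) << 2) | (uint32_t)(Method))

#define METHOD_BUFFERED     0
#define FILE_ANY_ACCESS     0
#define FILE_READ_ACCESS    0x0001
#define FILE_WRITE_ACCESS   0x0002
#define FILE_DEVICE_UNKNOWN 0x00000022

/* Hypothetical control code owned by "our" driver. */
#define IOCTL_MY_SET_CONFIG \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_WRITE_ACCESS)

/* Field extraction, the inverse of CTL_CODE. */
uint32_t ioctl_device_type(uint32_t code)     { return code >> 16; }
uint32_t ioctl_required_access(uint32_t code) { return (code >> 14) & 0x3; }
uint32_t ioctl_function(uint32_t code)        { return (code >> 2) & 0xFFF; }
uint32_t ioctl_method(uint32_t code)          { return code & 0x3; }

typedef enum { IOCTL_HANDLE_HERE, IOCTL_PASS_DOWN } IOCTL_ACTION;

/* Models the rule for a higher-level driver: handle the codes it
   recognizes as its own, pass everything else to the lower driver. */
IOCTL_ACTION classify_ioctl(uint32_t code)
{
    return (code == IOCTL_MY_SET_CONFIG) ? IOCTL_HANDLE_HERE : IOCTL_PASS_DOWN;
}
```

The access bits extracted by ioctl_required_access are the same FILE_READ_ACCESS / FILE_WRITE_ACCESS values that IoValidateDeviceIoControlAccess takes as its second argument.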
BOOLEAN FAST_IO_DEVICE_CONTROL (
FILE_OBJECT* FileObject,
BOOLEAN Wait, // TRUE if driver can wait while processing the request
OPTIONAL VOID* InputBuffer,
ULONG InputBufferLength,
OPTIONAL VOID* OutputBuffer,
ULONG OutputBufferLength,
ULONG IoControlCode,
IO_STATUS_BLOCK* IoStatus, // to be filled if routine returns TRUE
DEVICE_OBJECT* DeviceObject)
The only Fast I/O routine that can be declared by drivers other than file systems.
Handles IRP_MJ_DEVICE_CONTROL.
Note that it can be invoked by IO Manager only if the driver is at the top of the stack.
### device interfaces: http://msdn.microsoft.com/en-us/library/windows/hardware/ff541339%28v=vs.85%29.aspx
### device objects: http://msdn.microsoft.com/en-us/library/windows/hardware/ff547807%28v=vs.85%29.aspx
### dispatch routines: http://msdn.microsoft.com/en-us/library/windows/hardware/ff566407%28v=vs.85%29.aspx
### all routines: http://msdn.microsoft.com/en-us/library/windows/hardware/ff564886%28v=vs.85%29.aspx
### all routines: http://msdn.microsoft.com/en-us/library/windows/hardware/ff544652%28v=vs.85%29.aspx
### dispatch, shutdown
File system drivers:
Example (stack of device objects):
(unnamed) FAT on \Device\HardDisk0\Partition2
\Device\HardDisk0\Partition2
\Device\ScsiPort0
DO for \Device\HardDisk0\Partition2 points to VPB (DO.Vpb -> VPB).
VPB structure:
CSHORT Type
CSHORT Size
USHORT Flags
USHORT VolumeLabelLength // in bytes
DEVICE_OBJECT* DeviceObject -> FSDO: DO for (unnamed) FAT on \Device\HardDisk0\Partition2
DEVICE_OBJECT* RealDevice -> PDO: DO for \Device\HardDisk0\Partition2
ULONG SerialNumber
ULONG ReferenceCount
WCHAR VolumeLabel[MAXIMUM_VOLUME_LABEL_LENGTH / sizeof(WCHAR)]
On first access to device \Device\HardDisk0\Partition2 IO Manager checks if VPB exists and if VPB.DeviceObject is NULL.
If so, IO Manager initiates Mount operation (aka Volume Recognition).
Registered disk type file systems will be asked one at a time if they recognize the structure of the partition.
The first file system to recognize the partition will create an unnamed FSDO and stack it on top of PDO.
Then the request is passed to FSDO; this is likely to be a Create request.
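The mount sequence above is a first-match search that runs only while the VPB has no file system device attached. A minimal user-mode sketch of that control flow; VPB_MODEL, FS_MODEL, the toy recognizers and model_mount_if_needed are all hypothetical, and the "on-disk structure" is reduced to a single signature byte:

```c
#include <assert.h>
#include <stddef.h>

typedef struct VPB_MODEL {
    const char* mounted_fs;   /* models VPB.DeviceObject: NULL means "not mounted" */
} VPB_MODEL;

/* A registered file system: a name plus a recognizer callback. */
typedef struct FS_MODEL {
    const char* name;
    int (*recognizes)(const unsigned char* boot_sector);
    int times_asked;          /* counts recognition requests, for illustration */
} FS_MODEL;

/* Toy recognizers: each "file system" just looks at the first byte. */
int toy_recognize_a(const unsigned char* boot_sector) { return boot_sector[0] == 'A'; }
int toy_recognize_b(const unsigned char* boot_sector) { return boot_sector[0] == 'B'; }

/* Returns 1 if the volume is mounted after the call, 0 if nobody recognized it.
   File systems are asked one at a time, in registration order, and only when
   the VPB does not already point at a file system. */
int model_mount_if_needed(VPB_MODEL* vpb, FS_MODEL* fs, size_t fs_count,
                          const unsigned char* boot_sector)
{
    assert(vpb != NULL && boot_sector != NULL);
    if (vpb->mounted_fs != NULL)
        return 1;                         /* already mounted: no recognition pass */
    for (size_t i = 0; i < fs_count; i++) {
        fs[i].times_asked++;
        if (fs[i].recognizes(boot_sector)) {
            vpb->mounted_fs = fs[i].name; /* first match wins */
            return 1;
        }
    }
    return 0;
}
```

Once the model's VPB points at a file system, later accesses skip recognition entirely, which corresponds to requests going straight to the FSDO.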
FSDs use "Neither IO" method.
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff551834%28v=vs.85%29.aspx
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff540382%28v=vs.85%29.aspx
### FsRtlEnterFileSystem, FsRtlExitFileSystem
IoReadPartitionTableEx
Fast IO Dispatch routines:
Intended for optimized processing of operations that are likely to result only in references to Cache Manager and not involve lower-level drivers, i.e. rapid synchronous IO on cached files.
In fast I/O operations, data is transferred directly between user buffers and the system cache, bypassing the file system and the storage driver stack.
For certain operations, IO Manager checks whether the file system driver supports Fast I/O for this function.
If it does not, IO Manager builds an IRP and performs the operation (invokes file system) in the regular way.
If it does, IO Manager calls the driver via the Fast I/O entry point in the context of the requesting thread and with the parameters supplied in the request.
If the driver can completely handle this request in its Fast I/O routine, it does so and returns TRUE, otherwise FALSE.
If the driver returns FALSE, IO Manager proceeds as if no Fast I/O entry had been supplied, i.e. it builds an IRP in the normal way and calls the driver with that IRP.
File systems and file system filters are required to support IRPs, but they are not required to support fast I/O. However, they must still implement fast I/O routines: even if they do not support fast I/O, they must define a fast I/O routine that returns FALSE.
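The TRUE/FALSE contract described above can be sketched as a two-step dispatch: try the fast entry point if one is supplied, and fall back to the IRP path when it is missing or returns FALSE. This is a user-mode model only; the *_MODEL types and model_* functions are hypothetical, and "cached" stands in for whatever conditions let a real driver satisfy the request from the cache:

```c
#include <assert.h>
#include <stddef.h>

typedef enum { PATH_FAST_IO, PATH_IRP } IO_PATH;

typedef struct FILE_MODEL {
    int cached;   /* nonzero if the data can be served from the cache */
} FILE_MODEL;

/* BOOLEAN-style fast routine: nonzero means "request fully handled". */
typedef int (*FAST_READ_MODEL)(const FILE_MODEL* file);

typedef struct FAST_IO_DISPATCH_MODEL {
    FAST_READ_MODEL FastIoRead;   /* may be NULL if not supplied */
} FAST_IO_DISPATCH_MODEL;

/* A fast routine that succeeds only for cached files. */
int model_fast_read(const FILE_MODEL* file)
{
    return file->cached;
}

/* The "do not support fast I/O" variant: defined, but always returns FALSE. */
int model_fast_read_always_false(const FILE_MODEL* file)
{
    (void)file;
    return 0;
}

/* Models the IO Manager's choice: fast path first, IRP path as the fallback. */
IO_PATH model_io_manager_read(const FAST_IO_DISPATCH_MODEL* fast,
                              const FILE_MODEL* file)
{
    assert(file != NULL);
    if (fast != NULL && fast->FastIoRead != NULL && fast->FastIoRead(file))
        return PATH_FAST_IO;
    return PATH_IRP;   /* build an IRP and call the driver the regular way */
}
```

Note that a routine which always returns FALSE and a missing routine lead to the same outcome in this model, which is the behaviour the notes describe for IO Manager.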
FAST_IO_DISPATCH MyFastIoDispatch;
MyFastIoDispatch.SizeOfFastIoDispatch = sizeof(FAST_IO_DISPATCH);
MyFastIoDispatch.FastIoCheckIfPossible = CdFastIoCheckIfPossible;
MyFastIoDispatch.FastIoRead = FsRtlCopyRead;
MyFastIoDispatch.FastIoQueryBasicInfo = CdFastQueryBasicInfo;
MyFastIoDispatch.FastIoQueryStandardInfo = CdFastQueryStdInfo;
MyFastIoDispatch.FastIoLock = CdFastLock;
MyFastIoDispatch.FastIoUnlockSingle = CdFastUnlockSingle;
MyFastIoDispatch.FastIoUnlockAll = CdFastUnlockAll;
MyFastIoDispatch.FastIoUnlockAllByKey = CdFastUnlockAllByKey;
MyFastIoDispatch.AcquireFileForNtCreateSection = NULL;
MyFastIoDispatch.ReleaseFileForNtCreateSection = CdReleaseForCreateSection;
MyFastIoDispatch.FastIoQueryNetworkOpenInfo = CdFastQueryNetworkInfo;
MyFastIoDispatch.MdlRead = FsRtlMdlReadDev;
MyFastIoDispatch.MdlReadComplete = FsRtlMdlReadCompleteDev;
MyFastIoDispatch.PrepareMdlWrite = FsRtlPrepareMdlWriteDev;
MyFastIoDispatch.MdlWriteComplete = FsRtlMdlWriteCompleteDev;
… and other routines …
DriverObject->FastIoDispatch = &MyFastIoDispatch;
IRQL and Thread Context for Standard Driver Routines:
Routine | Caller's IRQL | Thread context
AdapterControl | DISPATCH_LEVEL | Arbitrary
AdapterListControl | DISPATCH_LEVEL | Arbitrary
AddDevice | PASSIVE_LEVEL | System
BugCheckCallback | HIGH_LEVEL | Arbitrary: depends on state of operating system when the bug check occurred
BugCheckDumpIoCallback | HIGH_LEVEL | Arbitrary: depends on state of operating system when the bug check occurred
BugCheckSecondaryDumpDataCallback | HIGH_LEVEL | Arbitrary: depends on state of operating system when the bug check occurred
Cancel | DISPATCH_LEVEL | Arbitrary
ControllerControl | DISPATCH_LEVEL | Arbitrary
CsqAcquireLock | IRQL of the routine that called IoCsqXxx. Usually <= DISPATCH_LEVEL | Arbitrary
CsqCompleteCanceledIrp | <= DISPATCH_LEVEL | Arbitrary
CsqInsertIrp | IRQL of the lock acquired by CsqAcquireLock. Usually <= DISPATCH_LEVEL | Arbitrary
CsqInsertIrpEx | IRQL of the lock acquired by CsqAcquireLock. Usually <= DISPATCH_LEVEL | Arbitrary
CsqPeekNextIrp | IRQL of the lock acquired by CsqAcquireLock. Usually <= DISPATCH_LEVEL | Arbitrary
CsqReleaseLock | IRQL of the lock acquired by CsqAcquireLock. Usually <= DISPATCH_LEVEL | Arbitrary
CsqRemoveIrp | IRQL of the lock acquired by CsqAcquireLock. Usually <= DISPATCH_LEVEL | Arbitrary
CustomDpc | DISPATCH_LEVEL | Arbitrary
CustomTimerDpc | DISPATCH_LEVEL | Arbitrary
DispatchCleanup | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchClose (for FSD, FS filters, and other highest-level drivers) | APC_LEVEL | Arbitrary
DispatchClose (for all other drivers) | PASSIVE_LEVEL | Arbitrary
DispatchCreate | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchCreateClose | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchDeviceControl (for devices not in paging path) | PASSIVE_LEVEL | Non-arbitrary for FSD and FS filters; arbitrary for other drivers
DispatchDeviceControl (for devices in paging path) | <= DISPATCH_LEVEL | Arbitrary
DispatchFlushBuffers | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchInternalDeviceControl | Depends on the device type, but always <= DISPATCH_LEVEL | Arbitrary
DispatchPnp | PASSIVE_LEVEL | Arbitrary
DispatchPower (if the DO_POWER_PAGABLE flag is not set in the device object) | <= DISPATCH_LEVEL | Arbitrary
DispatchPower (if the DO_POWER_PAGABLE flag is set in the device object) | PASSIVE_LEVEL | Arbitrary
DispatchQueryInformation | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchRead | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchRead (for devices in paging path) | APC_LEVEL | Arbitrary
DispatchRead and DispatchWrite routines of drivers in the storage stack | <= DISPATCH_LEVEL | Arbitrary
DispatchReadWrite (for devices not in paging path) | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchReadWrite (for devices in paging path) | APC_LEVEL | Arbitrary
DispatchSetInformation | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchShutdown | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DispatchSystemControl | PASSIVE_LEVEL | Arbitrary
DispatchWrite (for devices in paging path) | APC_LEVEL | Arbitrary
DispatchWrite (for devices not in paging path) | PASSIVE_LEVEL | Non-arbitrary for FSD, FS filter, and other highest-level drivers; arbitrary for other drivers
DllInitialize | PASSIVE_LEVEL | System or arbitrary
DllUnload | PASSIVE_LEVEL | Arbitrary
DpcForIsr | DISPATCH_LEVEL | Arbitrary
DriverEntry | PASSIVE_LEVEL | System
InterruptService | DIRQL for the associated interrupt object | Arbitrary
IoCompletion | <= DISPATCH_LEVEL | Arbitrary
IoTimer | DISPATCH_LEVEL | Arbitrary
Reinitialize | PASSIVE_LEVEL | System
StartIo | DISPATCH_LEVEL | Arbitrary
SynchCritSection | DIRQL for the associated interrupt object | Arbitrary
Unload | PASSIVE_LEVEL | System
Misc:
ExIsProcessorFeaturePresent(….)
ExSetTimerResolution(…)
PVOID MmGetSystemRoutineAddress(UNICODE_STRING* SystemRoutineName)
ULONG KeGetCurrentProcessorNumber() // zero-based; the result is stable only at IRQL >= DISPATCH_LEVEL, at lower IRQL the thread can move to another processor
ULONG KeGetCurrentProcessorNumberEx(OPTIONAL OUT PROCESSOR_NUMBER* ProcNumber)
ULONG KeQueryActiveProcessorCount(OPTIONAL OUT KAFFINITY* ActiveProcessors)
ULONG KeQueryActiveProcessorCountEx(USHORT GroupNumber)
USHORT KeQueryMaximumGroupCount(void)
USHORT KeQueryActiveGroupCount(void)
NTSTATUS KeGetProcessorNumberFromIndex(IN ULONG ProcIndex, OUT PROCESSOR_NUMBER* ProcNumber)
ULONG KeGetProcessorIndexFromNumber(IN PROCESSOR_NUMBER* ProcNumber)
NTSTATUS KeQueryLogicalProcessorRelationship(...)
### KeMemoryBarrier
### KeMemoryBarrierWithoutFence
### KeSaveFloatingPointState, KeRestoreFloatingPointState
### PEX_PUSH_LOCK
Annotations:
__drv_dispatchType(IRP_MJ_xxx) | Driver dispatch routine
__drv_setsIRQL(irql) | Exits at IRQL irql
__drv_raisesIRQL(irql) | Exits at irql, but this may only raise IRQL
__drv_requiresIRQL(irql) | Must be entered at irql
__drv_maxIRQL(irql) | Maximum IRQL at which the function may be called
__drv_minIRQL(irql) | Minimum IRQL at which the function may be called
__drv_savesIRQL | Current IRQL is saved in the annotated parameter
__drv_restoresIRQL | Current IRQL is restored from the annotated parameter
__drv_savesIRQLGlobal(kind,param) | Current IRQL is saved in global object
__drv_restoresIRQLGlobal(kind,param) | Current IRQL is restored from global object
__drv_minFunctionIRQL(irql) | Minimum IRQL to which the function can lower itself
__drv_maxFunctionIRQL(irql) | Maximum IRQL to which the function can raise itself
__drv_useCancelIRQL | Annotated parameter contains the cancelIRQL which will be restored by the called function
__drv_sameIRQL | The function must exit with the same IRQL it was called with
__drv_acquiresResource(kind) |
__drv_releasesResource(kind) |
__drv_acquiresResourceGlobal(kind,param) |
__drv_releasesResourceGlobal(kind,param) |
__drv_mustHold(kind) |
__drv_neverHold(kind) |
__drv_mustHoldGlobal(kind,param) |
__drv_neverHoldGlobal(kind,param) |
__drv_floatSaved | Floating point hardware was saved (available to kernel)
__drv_floatRestored | Floating point hardware was restored (no longer available)
__drv_floatUsed | The function uses floating point.
__drv_interlocked | The parameter is used for interlocked instructions
__drv_inTry | The function must be called inside a try block
__drv_notInTry | The function must not be called inside a try block
__drv_acquiresExclusiveResource(kind) |
__drv_releasesExclusiveResource(kind) |
__drv_acquiresExclusiveResourceGlobal(kind, param) |
__drv_releasesExclusiveResourceGlobal(kind, param) |
__drv_acquiresCancelSpinLock |
__drv_releasesCancelSpinLock |
__drv_mustHoldCancelSpinLock |
__drv_holdsCancelSpinLock |
__drv_neverHoldCancelSpinLock |
__drv_acquiresCriticalRegion |
__drv_releasesCriticalRegion |
__drv_mustHoldCriticalRegion |
__drv_neverHoldCriticalRegion |
__drv_holdsCriticalRegion |
__drv_acquiresPriorityRegion |
__drv_releasesPriorityRegion |
__drv_mustHoldPriorityRegion |
__drv_neverHoldPriorityRegion |
__drv_holdsPriorityRegion |
__drv_maxIRQL(DISPATCH_LEVEL)
void myfunc(__out __deref __drv_savesIRQL PUCHAR OldIrql)
__drv_maxIRQL(DISPATCH_LEVEL)
__drv_savesIRQLGlobal(QueuedSpinLock,LockHandle)
__drv_setsIRQL(DISPATCH_LEVEL)
VOID KeAcquireInStackQueuedSpinLock (
__inout PKSPIN_LOCK SpinLock,
__out __deref __drv_acquiresExclusiveResource(KeQueuedSpinLockType)
PKLOCK_QUEUE_HANDLE LockHandle)
__drv_at(return, __drv_innerMustHoldGlobal(CancelSpinLock,)
__drv_innerReleasesGlobal(CancelSpinLock,)
__drv_minFunctionIRQL(DISPATCH_LEVEL)
__drv_requiresIRQL(DISPATCH_LEVEL))
void myfunc2()
__drv_acquiresCriticalRegion
__drv_maxIRQL(APC_LEVEL)
VOID FltAcquireResourceExclusive(
__inout __deref __drv_neverHold(ResourceLite)
__deref __drv_acquiresResource(ResourceLite) PERESOURCE Resource)
Signing for testing:
See also: http://msdn.microsoft.com/en-us/library/windows/hardware/ff546236%28v=vs.85%29.aspx
On build machine:
// generate self-signed (-r) certificate MyTestCert
// installs it in the MyTestStore certificate store
// creates a copy of the certificate in MyTestCert.cer
makecert -r -pe -ss MyTestStore -n "CN=MyTestCert" MyTestCert.cer
// signs $(TARGET) file with certificate from the store
SIGNCODE_CMD=signtool sign /v /s MyTestStore /n MyTestCert $(TARGET)
On second build machine:
certmgr /add MyTestCert.cer /s /r localMachine MyTestStore
On target machine:
// install certificate in the Trusted Root Certification Authorities certificate store of the local computer
certmgr /add MyTestCert.cer /s /r localMachine root
// install certificate in the Trusted Publishers certificate store of the local computer
certmgr /add MyTestCert.cer /s /r localMachine trustedpublisher
Or:
bcdedit -debug on
### http://www.osronline.com/article.cfm?article=205
### Building a driver: http://msdn.microsoft.com/en-us/library/windows/hardware/ff554644%28v=vs.85%29.aspx
Debugger:
"C:\WinKernel\WinDDK\7600.16385.1\Debuggers\windbg.exe" -v -y "C:\WinKernel\Downloads\Symbols\Windows7SP1-x64-REL" -i "c:\windows;c:\windows\system32;c:\windows\system32\drivers" -z "C:\Windows\MEMORY.DMP"
### http://msdn.microsoft.com/en-us/library/windows/hardware/ff564717%28v=vs.85%29.aspx
CRTL routines
_i64toa_s
_i64tow_s
_itoa
_itoa_s
_itow
_itow_s
_ltoa_s
_ltow_s
_makepath_s
_purecall
_setjmp
_setjmpex
_snprintf
_snprintf_s
_snscanf_s
_snwprintf
_snwprintf_s
_snwscanf_s
_splitpath_s
_stricmp
_strlwr
_strnicmp
_strnset
_strnset_s
_strrev
_strset
_strset_s
_strtoui64
_strupr
_swprintf
_ui64toa_s
_ui64tow_s
_ultoa_s
_ultow_s
_vsnprintf
_vsnprintf_s
_vsnwprintf
_vsnwprintf_s
_vswprintf
_wcsicmp
_wcslwr
_wcsnicmp
_wcsnset
_wcsnset_s
_wcsrev
_wcsset_s
_wcsupr
_wmakepath_s
_wsplitpath_s
_wtoi
_wtol
atoi
atol
bsearch
isdigit
islower
isprint
isspace
isupper
isxdigit
longjmp
mbstowcs
mbtowc
memchr
memcmp
memcpy
memcpy_s
memmove
memmove_s
memset
qsort
rand
sprintf
sprintf_s
srand
sscanf_s
strcat
strcat_s
strchr
strcmp
strcpy
strcpy_s
strlen
strncat
strncat_s
strncmp
strncpy
strncpy_s
strnlen
strrchr
strspn
strstr
strtok_s
swprintf
swprintf_s
swscanf_s
tolower
toupper
towlower
towupper
vsprintf
vsprintf_s
vswprintf_s
wcscat
wcscat_s
wcschr
wcscmp
wcscpy
wcscpy_s
wcscspn
wcslen
wcsncat
wcsncat_s
wcsncmp
wcsncpy
wcsncpy_s
wcsnlen
wcsrchr
wcsspn
wcsstr
wcstombs
wcstoul
wctomb
Facility prefixes:
Cache Manager: Cc
Configuration Manager: Cm (subscribe to notifications about changes in registry)
Event tracing: Etw (trace and log events raised by kernel-mode components and user-mode applications)
File system runtime library: FsRtl
Runtime library: Rtl
Security Reference Monitor: Se
Kernel Transaction Manager: Tm
Driver support routines: Zw
Windows Management Instrumentation: Wmi
Unknown: Em
http://www.osronline.com/ddkx/kmarch/devobjts_7mav.htm
http://www.osronline.com/ddkx/kmarch/devobjts_1qxz.htm
http://www.osronline.com/ddkx/kmarch/k104_1ycy.htm
http://www.osronline.com/ddkx/kmarch/k104_8piq.htm
http://www.osronline.com/ddkx/kmarch/k112_3z5e.htm
http://www.osronline.com/ddkx/kmarch/k104_61pu.htm
http://www.osronline.com/ddkx/kmarch/k104_42pe.htm
http://www.osronline.com/ddkx/kmarch/irps_1e3r.htm#ddk__bmc_2girpeg.wmf__kg
http://www.osronline.com/ddkx/kmarch/irps_8lgn.htm
http://www.osronline.com/ddkx/kmarch/k104_89pu.htm
http://www.osronline.com/ddkx/kmarch/irps_1oh3.htm
http://www.osronline.com/ddkx/kmarch/k104_1agi.htm
http://msdn.microsoft.com/en-us/library/windows/hardware/ff554368%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-us/library/windows/hardware/ff545910%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-us/library/windows/hardware/ff565388%28v=vs.85%29.aspx
### sleep/wakeup
### IoBuildSynchronousFsdRequest, IoBuildDeviceIoControlRequest
### KeGetPreviousMode
### load/unload/quiescent
### KeSetEvent (115), IoCreateNotificationEvent
### KeNumberProcessors
### FsRtlEnterFileSystem
### Ke functions http://msdn.microsoft.com/en-us/library/windows/hardware/ff553350%28v=vs.85%29.aspx
### Ex functions http://msdn.microsoft.com/en-us/library/windows/hardware/ff544363%28v=vs.85%29.aspx
### Mm functions http://msdn.microsoft.com/en-us/library/windows/hardware/ff554479%28v=vs.85%29.aspx
### all functions http://msdn.microsoft.com/en-us/library/windows/hardware/ff544200%28v=vs.85%29.aspx
### ntdef.h wdm.h
### use of floating point in driver
### debugger !ready n !thread !pcr
### Windows Installable File System kit