The Definitive KVM (Kernel-based Virtual Machine) API Documentation
===================================================================

1. General description

The kvm API is a set of ioctls that are issued to control various aspects
of a virtual machine. The ioctls belong to three classes:

 - System ioctls: These query and set global attributes which affect the
   whole kvm subsystem. In addition a system ioctl is used to create
   virtual machines.

 - VM ioctls: These query and set attributes that affect an entire virtual
   machine, for example memory layout. In addition a VM ioctl is used to
   create virtual cpus (vcpus).

   Only run VM ioctls from the same process (address space) that was used
   to create the VM.

 - vcpu ioctls: These query and set attributes that control the operation
   of a single virtual cpu.

   Only run vcpu ioctls from the same thread that was used to create the
   vcpu.

2. File descriptors

The kvm API is centered around file descriptors. An initial
open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
can be used to issue system ioctls. A KVM_CREATE_VM ioctl on this
handle will create a VM file descriptor which can be used to issue VM
ioctls. A KVM_CREATE_VCPU ioctl on a VM fd will create a virtual cpu
and return a file descriptor pointing to it. Finally, ioctls on a vcpu
fd can be used to control the vcpu, including the important task of
actually running guest code.

In general file descriptors can be migrated among processes by means
of fork() and the SCM_RIGHTS facility of Unix domain sockets. These
kinds of tricks are explicitly not supported by kvm. While they will
not cause harm to the host, their actual behavior is not guaranteed by
the API. The only supported use is one virtual machine per process,
and one vcpu per thread.
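
The descriptor hierarchy described in this section can be sketched as
follows. This is only an illustration under stated assumptions: struct
kvm_fds and kvm_open_chain() are hypothetical names, and passing 0 as the
KVM_CREATE_VM argument and as the vcpu id is an assumption.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Hypothetical helper type: the three descriptors of one VM. */
struct kvm_fds {
	int kvm;	/* /dev/kvm handle, for system ioctls */
	int vm;		/* VM fd, for vm ioctls */
	int vcpu;	/* vcpu fd, for vcpu ioctls */
};

/* Returns 0 on success; -1 on error, with every field set to -1. */
int kvm_open_chain(struct kvm_fds *fds)
{
	fds->kvm = fds->vm = fds->vcpu = -1;

	fds->kvm = open("/dev/kvm", O_RDWR);
	if (fds->kvm < 0)
		return -1;

	fds->vm = ioctl(fds->kvm, KVM_CREATE_VM, 0);
	if (fds->vm < 0)
		goto fail;

	fds->vcpu = ioctl(fds->vm, KVM_CREATE_VCPU, 0);	/* vcpu id 0 (assumed) */
	if (fds->vcpu < 0)
		goto fail;

	return 0;
fail:
	if (fds->vm >= 0)
		close(fds->vm);
	close(fds->kvm);
	fds->kvm = fds->vm = fds->vcpu = -1;
	return -1;
}
```

A caller would keep all three descriptors in the creating process and issue
vcpu ioctls only from the thread that called kvm_open_chain().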

3. Extensions

As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
incompatible changes are allowed. However, there is an extension
facility that allows backward-compatible extensions to the API to be
queried and used.

The extension mechanism is not based on the Linux version number.
Instead, kvm defines extension identifiers and a facility to query
whether a particular extension identifier is available. If it is, a
set of ioctls is available for application use.

4. API description

This section describes ioctls that can be used to control kvm guests.
For each ioctl, the following information is provided along with a
description:

  Capability: which KVM extension provides this ioctl. Can be 'basic',
      which means that it will be provided by any kernel that supports
      API version 12 (see section 4.1), or a KVM_CAP_xyz constant, which
      means availability needs to be checked with KVM_CHECK_EXTENSION
      (see section 4.4).

  Architectures: which instruction set architectures provide this ioctl.
      x86 includes both i386 and x86_64.

  Type: system, vm, or vcpu.

  Parameters: what parameters are accepted by the ioctl.

  Returns: the return value. General error numbers (EBADF, ENOMEM, EINVAL)
      are not detailed, but errors with specific meanings are.
77
784.1 KVM_GET_API_VERSION
79
80Capability: basic
81Architectures: all
82Type: system ioctl
83Parameters: none
84Returns: the constant KVM_API_VERSION (=12)
85
86This identifies the API version as the stable kvm API. It is not
87expected that this number will change. However, Linux 2.6.20 and
882.6.21 report earlier versions; these are not documented and not
89supported. Applications should refuse to run if KVM_GET_API_VERSION
90returns a value other than 12. If this check passes, all ioctls
91described as 'basic' will be available.
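
The version check recommended above might look like the sketch below.
kvm_api_is_stable() is a hypothetical helper name; the value 12 is taken
from this section.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns 1 if kvm_fd (an open /dev/kvm handle) reports API version 12,
 * 0 otherwise. An ioctl failure yields -1 and therefore also counts as
 * "not stable".
 */
int kvm_api_is_stable(int kvm_fd)
{
	int version = ioctl(kvm_fd, KVM_GET_API_VERSION, 0);

	return version == 12;
}
```

An application would call this once, right after opening /dev/kvm, and
refuse to continue if it returns 0.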
92
934.2 KVM_CREATE_VM
94
95Capability: basic
96Architectures: all
97Type: system ioctl
98Parameters: none
99Returns: a VM fd that can be used to control the new virtual machine.
100
101The new VM has no virtual cpus and no memory. An mmap() of a VM fd
102will access the virtual machine's physical address space; offset zero
103corresponds to guest physical address zero. Use of mmap() on a VM fd
104is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is
105available.
106
1074.3 KVM_GET_MSR_INDEX_LIST
108
109Capability: basic
110Architectures: x86
111Type: system
112Parameters: struct kvm_msr_list (in/out)
113Returns: 0 on success; -1 on error
Errors:
  E2BIG:     the msr index list is too big to fit in the array specified by
             the user.
117
118struct kvm_msr_list {
119 __u32 nmsrs; /* number of msrs in entries */
120 __u32 indices[0];
121};
122
123This ioctl returns the guest msrs that are supported. The list varies
124by kvm version and host processor, but does not change otherwise. The
125user fills in the size of the indices array in nmsrs, and in return
126kvm adjusts nmsrs to reflect the actual number of msrs and fills in
127the indices array with their numbers.
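
One way to size the indices array is to call the ioctl twice, as in the
sketch below. It assumes that a first call with nmsrs set to 0 fails with
E2BIG while still reporting the required count in nmsrs; the text above
does not state this explicitly. kvm_fetch_msr_list() is a hypothetical
helper name, and the code is compiled only on x86.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#if defined(__x86_64__) || defined(__i386__)
/*
 * Returns a malloc()ed list of supported MSR indices, or NULL on error.
 * The caller frees the result.
 */
struct kvm_msr_list *kvm_fetch_msr_list(int kvm_fd)
{
	struct kvm_msr_list probe = { .nmsrs = 0 };
	struct kvm_msr_list *list;

	/* First call: learn the count. E2BIG is the expected outcome. */
	if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, &probe) < 0 && errno != E2BIG)
		return NULL;

	list = calloc(1, sizeof(*list) + probe.nmsrs * sizeof(list->indices[0]));
	if (!list)
		return NULL;
	list->nmsrs = probe.nmsrs;

	/* Second call: the array is now large enough to be filled in. */
	if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, list) < 0) {
		free(list);
		return NULL;
	}
	return list;
}
#endif
```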
128
Note: if kvm indicates support for MCE (KVM_CAP_MCE), then the MCE bank MSRs are
not returned in the MSR list, as different vcpus can have a different number
of banks, as set via the KVM_X86_SETUP_MCE ioctl.

4.4 KVM_CHECK_EXTENSION
134
135Capability: basic
136Architectures: all
137Type: system ioctl
138Parameters: extension identifier (KVM_CAP_*)
139Returns: 0 if unsupported; 1 (or some other positive integer) if supported
140
141The API allows the application to query about extensions to the core
142kvm API. Userspace passes an extension identifier (an integer) and
143receives an integer that describes the extension availability.
144Generally 0 means no and 1 means yes, but some extensions may report
145additional information in the integer return value.
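
A small wrapper can fold errors into "unsupported", as sketched below.
kvm_has_extension() is a hypothetical helper name, and treating an ioctl
failure as "unsupported" is a design choice of this sketch.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Returns the positive value reported by kvm for extension 'cap',
 * or 0 if the extension is unsupported or the ioctl failed.
 */
int kvm_has_extension(int kvm_fd, int cap)
{
	int ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);

	return ret > 0 ? ret : 0;
}
```

For example, kvm_has_extension(kvm_fd, KVM_CAP_USER_MEMORY) could be
checked before relying on userspace memory allocation.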
146
1474.5 KVM_GET_VCPU_MMAP_SIZE
148
149Capability: basic
150Architectures: all
151Type: system ioctl
152Parameters: none
153Returns: size of vcpu mmap area, in bytes
154
The KVM_RUN ioctl communicates with userspace via a shared
memory region. This ioctl returns the size of that region. See the
KVM_RUN documentation for details.
158
1594.6 KVM_SET_MEMORY_REGION
160
161Capability: basic
162Architectures: all
163Type: vm ioctl
164Parameters: struct kvm_memory_region (in)
165Returns: 0 on success, -1 on error
166
This ioctl is obsolete and has been removed.
168
1694.6 KVM_CREATE_VCPU
170
171Capability: basic
172Architectures: all
173Type: vm ioctl
174Parameters: vcpu id (apic id on x86)
175Returns: vcpu fd on success, -1 on error
176
177This API adds a vcpu to a virtual machine. The vcpu id is a small integer
178in the range [0, max_vcpus).
179
1804.7 KVM_GET_DIRTY_LOG (vm ioctl)
181
182Capability: basic
183Architectures: x86
184Type: vm ioctl
185Parameters: struct kvm_dirty_log (in/out)
186Returns: 0 on success, -1 on error
187
/* for KVM_GET_DIRTY_LOG */
struct kvm_dirty_log {
	__u32 slot;
	__u32 padding1;
	union {
		void __user *dirty_bitmap; /* one bit per page */
		__u64 padding2;
	};
};
197
198Given a memory slot, return a bitmap containing any pages dirtied
199since the last call to this ioctl. Bit 0 is the first page in the
200memory slot. Ensure the entire structure is cleared to avoid padding
201issues.
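
Once the bitmap has been filled in, it can be decoded as in the sketch
below. The helpers are hypothetical and operate on a plain byte buffer;
reading the bitmap byte by byte with the least significant bit first is an
assumption that matches "bit 0 is the first page" on little-endian hosts
such as x86.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if page number 'page' (0 = first page of the slot) is dirty. */
int dirty_log_page_is_dirty(const uint8_t *bitmap, size_t page)
{
	return (bitmap[page / 8] >> (page % 8)) & 1;
}

/* Counts the dirty pages among the first 'npages' pages of the slot. */
size_t dirty_log_count(const uint8_t *bitmap, size_t npages)
{
	size_t page, count = 0;

	for (page = 0; page < npages; page++)
		count += dirty_log_page_is_dirty(bitmap, page);
	return count;
}
```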
202
2034.8 KVM_SET_MEMORY_ALIAS
204
205Capability: basic
206Architectures: x86
207Type: vm ioctl
208Parameters: struct kvm_memory_alias (in)
209Returns: 0 (success), -1 (error)
210
This ioctl is obsolete and has been removed.
212
2134.9 KVM_RUN
214
215Capability: basic
216Architectures: all
217Type: vcpu ioctl
218Parameters: none
219Returns: 0 on success, -1 on error
220Errors:
221 EINTR: an unmasked signal is pending
222
223This ioctl is used to run a guest virtual cpu. While there are no
224explicit parameters, there is an implicit parameter block that can be
225obtained by mmap()ing the vcpu fd at offset 0, with the size given by
226KVM_GET_VCPU_MMAP_SIZE. The parameter block is formatted as a 'struct
227kvm_run' (see below).
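
Obtaining the parameter block might look like the sketch below.
kvm_map_run() is a hypothetical helper name; the protection and sharing
flags passed to mmap() are assumptions, since this section does not
specify them.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

/*
 * Maps the kvm_run block of vcpu_fd. kvm_fd is the /dev/kvm handle used
 * to query the mapping size. Returns NULL on error; on success the size
 * of the mapping is stored in *size_out.
 */
struct kvm_run *kvm_map_run(int kvm_fd, int vcpu_fd, size_t *size_out)
{
	int size = ioctl(kvm_fd, KVM_GET_VCPU_MMAP_SIZE, 0);
	void *p;

	if (size < 0)
		return NULL;

	p = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE, MAP_SHARED,
		 vcpu_fd, 0);
	if (p == MAP_FAILED)
		return NULL;

	*size_out = (size_t)size;
	return p;
}
```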
228
2294.10 KVM_GET_REGS
230
231Capability: basic
232Architectures: all
233Type: vcpu ioctl
234Parameters: struct kvm_regs (out)
235Returns: 0 on success, -1 on error
236
237Reads the general purpose registers from the vcpu.
238
239/* x86 */
240struct kvm_regs {
241 /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
242 __u64 rax, rbx, rcx, rdx;
243 __u64 rsi, rdi, rsp, rbp;
244 __u64 r8, r9, r10, r11;
245 __u64 r12, r13, r14, r15;
246 __u64 rip, rflags;
247};
248
2494.11 KVM_SET_REGS
250
251Capability: basic
252Architectures: all
253Type: vcpu ioctl
254Parameters: struct kvm_regs (in)
255Returns: 0 on success, -1 on error
256
257Writes the general purpose registers into the vcpu.
258
259See KVM_GET_REGS for the data structure.
260
2614.12 KVM_GET_SREGS
262
263Capability: basic
264Architectures: x86
265Type: vcpu ioctl
266Parameters: struct kvm_sregs (out)
267Returns: 0 on success, -1 on error
268
269Reads special registers from the vcpu.
270
271/* x86 */
272struct kvm_sregs {
273 struct kvm_segment cs, ds, es, fs, gs, ss;
274 struct kvm_segment tr, ldt;
275 struct kvm_dtable gdt, idt;
276 __u64 cr0, cr2, cr3, cr4, cr8;
277 __u64 efer;
278 __u64 apic_base;
279 __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
280};
281
282interrupt_bitmap is a bitmap of pending external interrupts. At most
283one bit may be set. This interrupt has been acknowledged by the APIC
284but not yet injected into the cpu core.
285
2864.13 KVM_SET_SREGS
287
288Capability: basic
289Architectures: x86
290Type: vcpu ioctl
291Parameters: struct kvm_sregs (in)
292Returns: 0 on success, -1 on error
293
294Writes special registers into the vcpu. See KVM_GET_SREGS for the
295data structures.
296
2974.14 KVM_TRANSLATE
298
299Capability: basic
300Architectures: x86
301Type: vcpu ioctl
302Parameters: struct kvm_translation (in/out)
303Returns: 0 on success, -1 on error
304
305Translates a virtual address according to the vcpu's current address
306translation mode.
307
308struct kvm_translation {
309 /* in */
310 __u64 linear_address;
311
312 /* out */
313 __u64 physical_address;
314 __u8 valid;
315 __u8 writeable;
316 __u8 usermode;
317 __u8 pad[5];
318};
319
3204.15 KVM_INTERRUPT
321
322Capability: basic
323Architectures: x86
324Type: vcpu ioctl
325Parameters: struct kvm_interrupt (in)
326Returns: 0 on success, -1 on error
327
328Queues a hardware interrupt vector to be injected. This is only
329useful if in-kernel local APIC is not used.
330
331/* for KVM_INTERRUPT */
332struct kvm_interrupt {
333 /* in */
334 __u32 irq;
335};
336
337Note 'irq' is an interrupt vector, not an interrupt pin or line.
338
3394.16 KVM_DEBUG_GUEST
340
341Capability: basic
342Architectures: none
343Type: vcpu ioctl
Parameters: none
345Returns: -1 on error
346
347Support for this has been removed. Use KVM_SET_GUEST_DEBUG instead.
348
3494.17 KVM_GET_MSRS
350
351Capability: basic
352Architectures: x86
353Type: vcpu ioctl
354Parameters: struct kvm_msrs (in/out)
355Returns: 0 on success, -1 on error
356
357Reads model-specific registers from the vcpu. Supported msr indices can
358be obtained using KVM_GET_MSR_INDEX_LIST.
359
360struct kvm_msrs {
361 __u32 nmsrs; /* number of msrs in entries */
362 __u32 pad;
363
364 struct kvm_msr_entry entries[0];
365};
366
367struct kvm_msr_entry {
368 __u32 index;
369 __u32 reserved;
370 __u64 data;
371};
372
373Application code should set the 'nmsrs' member (which indicates the
374size of the entries array) and the 'index' member of each array entry.
375kvm will fill in the 'data' member.
376
3774.18 KVM_SET_MSRS
378
379Capability: basic
380Architectures: x86
381Type: vcpu ioctl
382Parameters: struct kvm_msrs (in)
383Returns: 0 on success, -1 on error
384
385Writes model-specific registers to the vcpu. See KVM_GET_MSRS for the
386data structures.
387
388Application code should set the 'nmsrs' member (which indicates the
389size of the entries array), and the 'index' and 'data' members of each
390array entry.
391
3924.19 KVM_SET_CPUID
393
394Capability: basic
395Architectures: x86
396Type: vcpu ioctl
397Parameters: struct kvm_cpuid (in)
398Returns: 0 on success, -1 on error
399
400Defines the vcpu responses to the cpuid instruction. Applications
401should use the KVM_SET_CPUID2 ioctl if available.
402
403
404struct kvm_cpuid_entry {
405 __u32 function;
406 __u32 eax;
407 __u32 ebx;
408 __u32 ecx;
409 __u32 edx;
410 __u32 padding;
411};
412
413/* for KVM_SET_CPUID */
414struct kvm_cpuid {
415 __u32 nent;
416 __u32 padding;
417 struct kvm_cpuid_entry entries[0];
418};
419
4204.20 KVM_SET_SIGNAL_MASK
421
422Capability: basic
423Architectures: x86
424Type: vcpu ioctl
425Parameters: struct kvm_signal_mask (in)
426Returns: 0 on success, -1 on error
427
Defines which signals are blocked during execution of KVM_RUN. This
signal mask temporarily overrides the thread's signal mask. Any
unblocked signal received (except SIGKILL and SIGSTOP, which retain
their traditional behaviour) will cause KVM_RUN to return with -EINTR.
432
433Note the signal will only be delivered if not blocked by the original
434signal mask.
435
436/* for KVM_SET_SIGNAL_MASK */
437struct kvm_signal_mask {
438 __u32 len;
439 __u8 sigset[0];
440};
441
4424.21 KVM_GET_FPU
443
444Capability: basic
445Architectures: x86
446Type: vcpu ioctl
447Parameters: struct kvm_fpu (out)
448Returns: 0 on success, -1 on error
449
450Reads the floating point state from the vcpu.
451
452/* for KVM_GET_FPU and KVM_SET_FPU */
453struct kvm_fpu {
454 __u8 fpr[8][16];
455 __u16 fcw;
456 __u16 fsw;
457 __u8 ftwx; /* in fxsave format */
458 __u8 pad1;
459 __u16 last_opcode;
460 __u64 last_ip;
461 __u64 last_dp;
462 __u8 xmm[16][16];
463 __u32 mxcsr;
464 __u32 pad2;
465};
466
4674.22 KVM_SET_FPU
468
469Capability: basic
470Architectures: x86
471Type: vcpu ioctl
472Parameters: struct kvm_fpu (in)
473Returns: 0 on success, -1 on error
474
475Writes the floating point state to the vcpu.
476
477/* for KVM_GET_FPU and KVM_SET_FPU */
478struct kvm_fpu {
479 __u8 fpr[8][16];
480 __u16 fcw;
481 __u16 fsw;
482 __u8 ftwx; /* in fxsave format */
483 __u8 pad1;
484 __u16 last_opcode;
485 __u64 last_ip;
486 __u64 last_dp;
487 __u8 xmm[16][16];
488 __u32 mxcsr;
489 __u32 pad2;
490};
491
4.23 KVM_CREATE_IRQCHIP

Capability: KVM_CAP_IRQCHIP
Architectures: x86, ia64
Type: vm ioctl
Parameters: none
Returns: 0 on success, -1 on error

Creates an interrupt controller model in the kernel. On x86, creates a virtual
ioapic, a virtual PIC (two PICs, nested), and sets up future vcpus to have a
local APIC. IRQ routing for GSIs 0-15 is set to both PIC and IOAPIC; GSI 16-23
only go to the IOAPIC. On ia64, an IOSAPIC is created.
504
5054.24 KVM_IRQ_LINE
506
507Capability: KVM_CAP_IRQCHIP
508Architectures: x86, ia64
509Type: vm ioctl
510Parameters: struct kvm_irq_level
511Returns: 0 on success, -1 on error
512
513Sets the level of a GSI input to the interrupt controller model in the kernel.
514Requires that an interrupt controller model has been previously created with
515KVM_CREATE_IRQCHIP. Note that edge-triggered interrupts require the level
516to be set to 1 and then back to 0.
517
518struct kvm_irq_level {
519 union {
520 __u32 irq; /* GSI */
521 __s32 status; /* not used for KVM_IRQ_LEVEL */
522 };
523 __u32 level; /* 0 or 1 */
524};
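
The "set to 1 and then back to 0" sequence for edge-triggered interrupts
can be sketched as follows. kvm_set_irq_level() and kvm_pulse_irq() are
hypothetical helper names; the structure is cleared first so that unused
bits are zero.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Drives GSI 'gsi' of the in-kernel interrupt controller to 'level'. */
int kvm_set_irq_level(int vm_fd, unsigned int gsi, unsigned int level)
{
	struct kvm_irq_level irq;

	memset(&irq, 0, sizeof(irq));
	irq.irq = gsi;
	irq.level = level;
	return ioctl(vm_fd, KVM_IRQ_LINE, &irq);
}

/* Raises and then lowers the line, as an edge-triggered source would. */
int kvm_pulse_irq(int vm_fd, unsigned int gsi)
{
	if (kvm_set_irq_level(vm_fd, gsi, 1) < 0)
		return -1;
	return kvm_set_irq_level(vm_fd, gsi, 0);
}
```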
525
5264.25 KVM_GET_IRQCHIP
527
528Capability: KVM_CAP_IRQCHIP
529Architectures: x86, ia64
530Type: vm ioctl
531Parameters: struct kvm_irqchip (in/out)
532Returns: 0 on success, -1 on error
533
534Reads the state of a kernel interrupt controller created with
535KVM_CREATE_IRQCHIP into a buffer provided by the caller.
536
537struct kvm_irqchip {
538 __u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
539 __u32 pad;
540 union {
541 char dummy[512]; /* reserving space */
542 struct kvm_pic_state pic;
543 struct kvm_ioapic_state ioapic;
544 } chip;
545};
546
5474.26 KVM_SET_IRQCHIP
548
549Capability: KVM_CAP_IRQCHIP
550Architectures: x86, ia64
551Type: vm ioctl
552Parameters: struct kvm_irqchip (in)
553Returns: 0 on success, -1 on error
554
555Sets the state of a kernel interrupt controller created with
556KVM_CREATE_IRQCHIP from a buffer provided by the caller.
557
558struct kvm_irqchip {
559 __u32 chip_id; /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
560 __u32 pad;
561 union {
562 char dummy[512]; /* reserving space */
563 struct kvm_pic_state pic;
564 struct kvm_ioapic_state ioapic;
565 } chip;
566};
567
4.27 KVM_XEN_HVM_CONFIG
569
570Capability: KVM_CAP_XEN_HVM
571Architectures: x86
572Type: vm ioctl
573Parameters: struct kvm_xen_hvm_config (in)
574Returns: 0 on success, -1 on error
575
576Sets the MSR that the Xen HVM guest uses to initialize its hypercall
577page, and provides the starting address and size of the hypercall
578blobs in userspace. When the guest writes the MSR, kvm copies one
579page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
580memory.
581
582struct kvm_xen_hvm_config {
583 __u32 flags;
584 __u32 msr;
585 __u64 blob_addr_32;
586 __u64 blob_addr_64;
587 __u8 blob_size_32;
588 __u8 blob_size_64;
589 __u8 pad2[30];
590};
591
4.27 KVM_GET_CLOCK

Capability: KVM_CAP_ADJUST_CLOCK
Architectures: x86
Type: vm ioctl
Parameters: struct kvm_clock_data (out)
Returns: 0 on success, -1 on error

Gets the current timestamp of kvmclock as seen by the current guest. In
conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.
603
604struct kvm_clock_data {
605 __u64 clock; /* kvmclock current value */
606 __u32 flags;
607 __u32 pad[9];
608};
609
6104.28 KVM_SET_CLOCK
611
612Capability: KVM_CAP_ADJUST_CLOCK
613Architectures: x86
614Type: vm ioctl
615Parameters: struct kvm_clock_data (in)
616Returns: 0 on success, -1 on error
617
Sets the current timestamp of kvmclock to the value specified in its parameter.
In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity in scenarios
such as migration.
621
622struct kvm_clock_data {
623 __u64 clock; /* kvmclock current value */
624 __u32 flags;
625 __u32 pad[9];
626};
627
4.29 KVM_GET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (out)
Returns: 0 on success, -1 on error
636
637Gets currently pending exceptions, interrupts, and NMIs as well as related
638states of the vcpu.
639
struct kvm_vcpu_events {
	struct {
		__u8 injected;
		__u8 nr;
		__u8 has_error_code;
		__u8 pad;
		__u32 error_code;
	} exception;
	struct {
		__u8 injected;
		__u8 nr;
		__u8 soft;
		__u8 shadow;
	} interrupt;
	struct {
		__u8 injected;
		__u8 pending;
		__u8 masked;
		__u8 pad;
	} nmi;
	__u32 sipi_vector;
	__u32 flags;
};

KVM_VCPUEVENT_VALID_SHADOW may be set in the flags field to signal that
interrupt.shadow contains a valid state. Otherwise, this field is undefined.

4.30 KVM_SET_VCPU_EVENTS

Capability: KVM_CAP_VCPU_EVENTS
Extended by: KVM_CAP_INTR_SHADOW
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_vcpu_events (in)
Returns: 0 on success, -1 on error
675
676Set pending exceptions, interrupts, and NMIs as well as related states of the
677vcpu.
678
679See KVM_GET_VCPU_EVENTS for the data structure.
680
681Fields that may be modified asynchronously by running VCPUs can be excluded
682from the update. These fields are nmi.pending and sipi_vector. Keep the
683corresponding bits in the flags field cleared to suppress overwriting the
684current in-kernel state. The bits are:
685
686KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
687KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector
688
689If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
690the flags field to signal that interrupt.shadow contains a valid state and
691shall be written into the VCPU.
692
4.32 KVM_GET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (out)
Returns: 0 on success, -1 on error
700
701Reads debug registers from the vcpu.
702
703struct kvm_debugregs {
704 __u64 db[4];
705 __u64 dr6;
706 __u64 dr7;
707 __u64 flags;
708 __u64 reserved[9];
709};
710
4.33 KVM_SET_DEBUGREGS

Capability: KVM_CAP_DEBUGREGS
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_debugregs (in)
Returns: 0 on success, -1 on error

Writes debug registers into the vcpu.

See KVM_GET_DEBUGREGS for the data structure. The flags field is not yet
used and must be cleared on entry.
723
4.34 KVM_SET_USER_MEMORY_REGION

Capability: KVM_CAP_USER_MEMORY
727Architectures: all
728Type: vm ioctl
729Parameters: struct kvm_userspace_memory_region (in)
730Returns: 0 on success, -1 on error
731
732struct kvm_userspace_memory_region {
733 __u32 slot;
734 __u32 flags;
735 __u64 guest_phys_addr;
736 __u64 memory_size; /* bytes */
737 __u64 userspace_addr; /* start of the userspace allocated memory */
738};
739
740/* for kvm_memory_region::flags */
741#define KVM_MEM_LOG_DIRTY_PAGES 1UL
742
743This ioctl allows the user to create or modify a guest physical memory
744slot. When changing an existing slot, it may be moved in the guest
745physical memory space, or its flags may be modified. It may not be
746resized. Slots may not overlap in guest physical address space.
747
748Memory for the region is taken starting at the address denoted by the
749field userspace_addr, which must point at user addressable memory for
750the entire memory slot size. Any object may back this memory, including
751anonymous memory, ordinary files, and hugetlbfs.
752
753It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
754be identical. This allows large pages in the guest to be backed by large
755pages in the host.
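
The alignment recommendation can be checked before a slot is registered,
as in the sketch below. slot_large_page_friendly() is a hypothetical
helper name; the 21-bit mask comes from the recommendation above.

```c
#include <stdint.h>

#define LARGE_PAGE_MASK ((UINT64_C(1) << 21) - 1)

/*
 * Returns 1 if the lower 21 bits of both addresses are identical,
 * so that large guest pages can be backed by large host pages.
 */
int slot_large_page_friendly(uint64_t guest_phys_addr, uint64_t userspace_addr)
{
	return ((guest_phys_addr ^ userspace_addr) & LARGE_PAGE_MASK) == 0;
}
```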
756
757The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
758instructs kvm to keep track of writes to memory within the slot. See
759the KVM_GET_DIRTY_LOG ioctl.
760
When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DROP).

It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
The KVM_SET_MEMORY_REGION ioctl does not allow fine-grained control over
memory allocation and is deprecated.

4.35 KVM_SET_TSS_ADDR
771
772Capability: KVM_CAP_SET_TSS_ADDR
773Architectures: x86
774Type: vm ioctl
775Parameters: unsigned long tss_address (in)
776Returns: 0 on success, -1 on error
777
778This ioctl defines the physical address of a three-page region in the guest
779physical address space. The region must be within the first 4GB of the
780guest physical address space and must not conflict with any memory slot
781or any mmio address. The guest may malfunction if it accesses this memory
782region.
783
784This ioctl is required on Intel-based hosts. This is needed on Intel hardware
785because of a quirk in the virtualization implementation (see the internals
786documentation when it pops into existence).
787
4.36 KVM_ENABLE_CAP

Capability: KVM_CAP_ENABLE_CAP
Architectures: ppc
Type: vcpu ioctl
Parameters: struct kvm_enable_cap (in)
Returns: 0 on success; -1 on error

Not all extensions are enabled by default. Using this ioctl the application
can enable an extension, making it available to the guest.
798
799On systems that do not support this ioctl, it always fails. On systems that
800do support it, it only works for extensions that are supported for enablement.
801
802To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
803be used.
804
805struct kvm_enable_cap {
806 /* in */
807 __u32 cap;
808
809The capability that is supposed to get enabled.
810
811 __u32 flags;
812
813A bitfield indicating future enhancements. Has to be 0 for now.
814
815 __u64 args[4];
816
817Arguments for enabling a feature. If a feature needs initial values to
818function properly, this is the place to put them.
819
820 __u8 pad[64];
821};
822
4.37 KVM_GET_MP_STATE
824
825Capability: KVM_CAP_MP_STATE
826Architectures: x86, ia64
827Type: vcpu ioctl
828Parameters: struct kvm_mp_state (out)
829Returns: 0 on success; -1 on error
830
831struct kvm_mp_state {
832 __u32 mp_state;
833};
834
835Returns the vcpu's current "multiprocessing state" (though also valid on
836uniprocessor guests).
837
838Possible values are:
839
 - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running
 - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
                                 which has not yet received an INIT signal
 - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
                                 now ready for a SIPI
 - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
                                 is waiting for an interrupt
 - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
                                 accessible via KVM_GET_VCPU_EVENTS)
849
850This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
851irqchip, the multiprocessing state must be maintained by userspace.
852
8534.38 KVM_SET_MP_STATE
854
855Capability: KVM_CAP_MP_STATE
856Architectures: x86, ia64
857Type: vcpu ioctl
858Parameters: struct kvm_mp_state (in)
859Returns: 0 on success; -1 on error
860
861Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
862arguments.
863
864This ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel
865irqchip, the multiprocessing state must be maintained by userspace.
866
4.39 KVM_SET_IDENTITY_MAP_ADDR
868
869Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
870Architectures: x86
871Type: vm ioctl
872Parameters: unsigned long identity (in)
873Returns: 0 on success, -1 on error
874
875This ioctl defines the physical address of a one-page region in the guest
876physical address space. The region must be within the first 4GB of the
877guest physical address space and must not conflict with any memory slot
878or any mmio address. The guest may malfunction if it accesses this memory
879region.
880
881This ioctl is required on Intel-based hosts. This is needed on Intel hardware
882because of a quirk in the virtualization implementation (see the internals
883documentation when it pops into existence).
884
4.40 KVM_SET_BOOT_CPU_ID
886
887Capability: KVM_CAP_SET_BOOT_CPU_ID
888Architectures: x86, ia64
889Type: vm ioctl
890Parameters: unsigned long vcpu_id
891Returns: 0 on success, -1 on error
892
893Define which vcpu is the Bootstrap Processor (BSP). Values are the same
894as the vcpu id in KVM_CREATE_VCPU. If this ioctl is not called, the default
895is vcpu 0.
896
4.41 KVM_GET_XSAVE
898
899Capability: KVM_CAP_XSAVE
900Architectures: x86
901Type: vcpu ioctl
902Parameters: struct kvm_xsave (out)
903Returns: 0 on success, -1 on error
904
905struct kvm_xsave {
906 __u32 region[1024];
907};
908
This ioctl copies the current vcpu's xsave struct to userspace.
910
9114.42 KVM_SET_XSAVE
912
913Capability: KVM_CAP_XSAVE
914Architectures: x86
915Type: vcpu ioctl
916Parameters: struct kvm_xsave (in)
917Returns: 0 on success, -1 on error
918
919struct kvm_xsave {
920 __u32 region[1024];
921};
922
This ioctl copies userspace's xsave struct to the kernel.
924
9254.43 KVM_GET_XCRS
926
927Capability: KVM_CAP_XCRS
928Architectures: x86
929Type: vcpu ioctl
930Parameters: struct kvm_xcrs (out)
931Returns: 0 on success, -1 on error
932
933struct kvm_xcr {
934 __u32 xcr;
935 __u32 reserved;
936 __u64 value;
937};
938
939struct kvm_xcrs {
940 __u32 nr_xcrs;
941 __u32 flags;
942 struct kvm_xcr xcrs[KVM_MAX_XCRS];
943 __u64 padding[16];
944};
945
This ioctl copies the current vcpu's xcrs to userspace.
947
9484.44 KVM_SET_XCRS
949
950Capability: KVM_CAP_XCRS
951Architectures: x86
952Type: vcpu ioctl
953Parameters: struct kvm_xcrs (in)
954Returns: 0 on success, -1 on error
955
956struct kvm_xcr {
957 __u32 xcr;
958 __u32 reserved;
959 __u64 value;
960};
961
962struct kvm_xcrs {
963 __u32 nr_xcrs;
964 __u32 flags;
965 struct kvm_xcr xcrs[KVM_MAX_XCRS];
966 __u64 padding[16];
967};
968
This ioctl sets the vcpu's xcrs to the values specified by userspace.
970
4.45 KVM_GET_SUPPORTED_CPUID
972
973Capability: KVM_CAP_EXT_CPUID
974Architectures: x86
975Type: system ioctl
976Parameters: struct kvm_cpuid2 (in/out)
977Returns: 0 on success, -1 on error
978
979struct kvm_cpuid2 {
980 __u32 nent;
981 __u32 padding;
982 struct kvm_cpuid_entry2 entries[0];
983};
984
985#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX 1
986#define KVM_CPUID_FLAG_STATEFUL_FUNC 2
987#define KVM_CPUID_FLAG_STATE_READ_NEXT 4
988
989struct kvm_cpuid_entry2 {
990 __u32 function;
991 __u32 index;
992 __u32 flags;
993 __u32 eax;
994 __u32 ebx;
995 __u32 ecx;
996 __u32 edx;
997 __u32 padding[3];
998};
999
1000This ioctl returns x86 cpuid features which are supported by both the hardware
1001and kvm. Userspace can use the information returned by this ioctl to
1002construct cpuid information (for KVM_SET_CPUID2) that is consistent with
1003hardware, kernel, and userspace capabilities, and with user requirements (for
1004example, the user may wish to constrain cpuid to emulate older hardware,
1005or for feature consistency across a cluster).
1006
1007Userspace invokes KVM_GET_SUPPORTED_CPUID by passing a kvm_cpuid2 structure
1008with the 'nent' field indicating the number of entries in the variable-size
1009array 'entries'. If the number of entries is too low to describe the cpu
1010capabilities, an error (E2BIG) is returned. If the number is too high,
1011the 'nent' field is adjusted and an error (ENOMEM) is returned. If the
1012number is just right, the 'nent' field is adjusted to the number of valid
1013entries in the 'entries' array, which is then filled.
1014
1015The entries returned are the host cpuid as returned by the cpuid instruction,
1016with unknown or unsupported features masked out. The fields in each entry
1017are defined as follows:
1018
1019 function: the eax value used to obtain the entry
1020 index: the ecx value used to obtain the entry (for entries that are
1021 affected by ecx)
1022 flags: an OR of zero or more of the following:
1023 KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
1024 if the index field is valid
1025 KVM_CPUID_FLAG_STATEFUL_FUNC:
1026 if cpuid for this function returns different values for successive
1027 invocations; there will be several entries with the same function,
1028 all with this flag set
1029 KVM_CPUID_FLAG_STATE_READ_NEXT:
1030 for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
1031 the first entry to be read by a cpu
1032 eax, ebx, ecx, edx: the values returned by the cpuid instruction for
1033 this function/index combination
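
The sizing rules for 'nent' suggest a retry loop such as the sketch below.
kvm_fetch_supported_cpuid() is a hypothetical helper name; the starting
guess of 16 entries and the retry limit are arbitrary assumptions, and the
code is compiled only on x86.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#if defined(__x86_64__) || defined(__i386__)
/*
 * Returns a malloc()ed kvm_cpuid2 whose 'nent' field holds the number of
 * valid entries, or NULL on error. The caller frees the result.
 */
struct kvm_cpuid2 *kvm_fetch_supported_cpuid(int kvm_fd)
{
	unsigned int nent = 16;	/* arbitrary starting guess */
	int tries;

	for (tries = 0; tries < 16; tries++) {
		struct kvm_cpuid2 *cpuid;
		unsigned int adjusted;
		int err;

		cpuid = calloc(1, sizeof(*cpuid) +
				  nent * sizeof(cpuid->entries[0]));
		if (!cpuid)
			return NULL;
		cpuid->nent = nent;

		if (ioctl(kvm_fd, KVM_GET_SUPPORTED_CPUID, cpuid) == 0)
			return cpuid;

		err = errno;		/* save before free() can change it */
		adjusted = cpuid->nent;
		free(cpuid);

		if (err == E2BIG)
			nent *= 2;		/* array too small: grow */
		else if (err == ENOMEM && adjusted < nent)
			nent = adjusted;	/* too large: use kvm's value */
		else
			return NULL;
	}
	return NULL;
}
#endif
```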
1034
5. The kvm_run structure
1036
1037Application code obtains a pointer to the kvm_run structure by
1038mmap()ing a vcpu fd. From that point, application code can control
1039execution by changing fields in kvm_run prior to calling the KVM_RUN
1040ioctl, and obtain information about the reason KVM_RUN returned by
1041looking up structure members.
1042
struct kvm_run {
	/* in */
	__u8 request_interrupt_window;

Request that KVM_RUN return when it becomes possible to inject external
interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.

	__u8 padding1[7];

	/* out */
	__u32 exit_reason;

When KVM_RUN has returned successfully (return value 0), this informs
application code why KVM_RUN has returned. Allowable values for this
field are detailed below.

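The rule that exit_reason is meaningful only after a successful KVM_RUN
can be sketched as a tiny wrapper. The name kvm_run_once() is made up for
this example, and 'run' is assumed to point at the mmap()ed kvm_run area
of the same vcpu.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: run the vcpu once.
 * Returns the exit reason, or -1 (with errno set) if KVM_RUN failed,
 * in which case exit_reason is not examined.
 */
int kvm_run_once(int vcpu_fd, const struct kvm_run *run)
{
	if (ioctl(vcpu_fd, KVM_RUN, 0) != 0)
		return -1;
	return (int)run->exit_reason;
}
```
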
	__u8 ready_for_interrupt_injection;

If request_interrupt_window has been specified, this field indicates
that an interrupt can be injected now with KVM_INTERRUPT.

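One possible way to combine request_interrupt_window,
ready_for_interrupt_injection and KVM_INTERRUPT is sketched here. The
helper name try_inject_irq() and its return convention are assumptions,
and the sketch presumes that struct kvm_interrupt from <linux/kvm.h> is
the argument KVM_INTERRUPT expects.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper, called between KVM_RUN invocations.
 * Returns 1 if the interrupt was injected, 0 if a window was requested
 * instead (try again after the next KVM_RUN), -1 if the ioctl failed.
 */
int try_inject_irq(int vcpu_fd, struct kvm_run *run, __u32 irq)
{
	struct kvm_interrupt intr = { .irq = irq };

	if (!run->ready_for_interrupt_injection) {
		/* Ask KVM_RUN to return once injection becomes possible. */
		run->request_interrupt_window = 1;
		return 0;
	}

	if (ioctl(vcpu_fd, KVM_INTERRUPT, &intr) < 0)
		return -1;

	run->request_interrupt_window = 0;	/* window no longer needed */
	return 1;
}
```
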
	__u8 if_flag;

The value of the current interrupt flag. Only valid if in-kernel
local APIC is not used.

	__u8 padding2[2];

	/* in (pre_kvm_run), out (post_kvm_run) */
	__u64 cr8;

The value of the cr8 register. Only valid if in-kernel local APIC is
not used. Both input and output.

	__u64 apic_base;

The value of the APIC BASE msr. Only valid if in-kernel local
APIC is not used. Both input and output.

	union {
		/* KVM_EXIT_UNKNOWN */
		struct {
			__u64 hardware_exit_reason;
		} hw;

If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
reasons. Further architecture-specific information is available in
hardware_exit_reason.

		/* KVM_EXIT_FAIL_ENTRY */
		struct {
			__u64 hardware_entry_failure_reason;
		} fail_entry;

If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
to unknown reasons. Further architecture-specific information is
available in hardware_entry_failure_reason.

		/* KVM_EXIT_EXCEPTION */
		struct {
			__u32 exception;
			__u32 error_code;
		} ex;

Unused.

		/* KVM_EXIT_IO */
		struct {
#define KVM_EXIT_IO_IN  0
#define KVM_EXIT_IO_OUT 1
			__u8 direction;
			__u8 size; /* bytes */
			__u16 port;
			__u32 count;
			__u64 data_offset; /* relative to kvm_run start */
		} io;

If exit_reason is KVM_EXIT_IO, then the vcpu has
executed a port I/O instruction which could not be satisfied by kvm.
data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
where kvm expects application code to place the data for the next
KVM_RUN invocation (KVM_EXIT_IO_IN). The data is a packed array of
'count' items, each 'size' bytes long.

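A sketch of how application code might locate and walk the I/O data
follows. The callback type pio_fn, the function handle_io_exit() and the
toy device demo_pio() are all hypothetical; only the kvm_run fields come
from the structure above. The data is treated as 'count' items of 'size'
bytes each, starting at data_offset bytes from the beginning of the
kvm_run mapping.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* Hypothetical device callback: called once per item in the array. */
typedef void (*pio_fn)(uint16_t port, int is_out, uint8_t *item, size_t size);

void handle_io_exit(struct kvm_run *run, pio_fn emulate)
{
	/* data_offset is relative to the start of the kvm_run mapping. */
	uint8_t *item = (uint8_t *)run + run->io.data_offset;
	int is_out = (run->io.direction == KVM_EXIT_IO_OUT);

	for (uint32_t i = 0; i < run->io.count; i++) {
		emulate(run->io.port, is_out, item, run->io.size);
		item += run->io.size;	/* items are packed back to back */
	}
}

/* Toy device for illustration: remembers the last byte written to it
 * and returns 0xff bytes on reads. */
uint8_t demo_last_out;

void demo_pio(uint16_t port, int is_out, uint8_t *item, size_t size)
{
	(void)port;
	if (is_out)
		demo_last_out = item[size - 1];
	else
		memset(item, 0xff, size);
}
```

For KVM_EXIT_IO_IN the callback writes into the buffer, and the guest
sees that data once KVM_RUN is invoked again.
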
		struct {
			struct kvm_debug_exit_arch arch;
		} debug;

Unused.

		/* KVM_EXIT_MMIO */
		struct {
			__u64 phys_addr;
			__u8 data[8];
			__u32 len;
			__u8 is_write;
		} mmio;

If exit_reason is KVM_EXIT_MMIO, then the vcpu has
executed a memory-mapped I/O instruction which could not be satisfied
by kvm. The 'data' member contains the written data if 'is_write' is
true, and should be filled by application code otherwise.

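The read and write directions can be sketched with a toy device. The
register demo_reg and the function handle_mmio_exit() are hypothetical.
A real application would first select a device by looking at phys_addr,
which this sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* Hypothetical device state: a single register of up to 8 bytes. */
uint8_t demo_reg[8];

void handle_mmio_exit(struct kvm_run *run)
{
	size_t len = run->mmio.len;

	/* Never copy more than the 'data' member can hold. */
	if (len > sizeof(run->mmio.data))
		len = sizeof(run->mmio.data);

	if (run->mmio.is_write)
		memcpy(demo_reg, run->mmio.data, len);	/* guest wrote */
	else
		memcpy(run->mmio.data, demo_reg, len);	/* guest reads */
}
```
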
NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO and KVM_EXIT_OSI, the corresponding
operations are complete (and guest state is consistent) only after userspace
has re-entered the kernel with KVM_RUN. The kernel side will first finish
incomplete operations and then check for pending signals. Userspace
can re-enter the guest with an unmasked signal pending to complete
pending operations.

		/* KVM_EXIT_HYPERCALL */
		struct {
			__u64 nr;
			__u64 args[6];
			__u64 ret;
			__u32 longmode;
			__u32 pad;
		} hypercall;

Unused. This was once used for 'hypercall to userspace'. To implement
such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
Note that KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.

		/* KVM_EXIT_TPR_ACCESS */
		struct {
			__u64 rip;
			__u32 is_write;
			__u32 pad;
		} tpr_access;

To be documented (KVM_TPR_ACCESS_REPORTING).

		/* KVM_EXIT_S390_SIEIC */
		struct {
			__u8 icptcode;
			__u64 mask; /* psw upper half */
			__u64 addr; /* psw lower half */
			__u16 ipa;
			__u32 ipb;
		} s390_sieic;

s390 specific.

		/* KVM_EXIT_S390_RESET */
#define KVM_S390_RESET_POR       1
#define KVM_S390_RESET_CLEAR     2
#define KVM_S390_RESET_SUBSYSTEM 4
#define KVM_S390_RESET_CPU_INIT  8
#define KVM_S390_RESET_IPL       16
		__u64 s390_reset_flags;

s390 specific.

		/* KVM_EXIT_DCR */
		struct {
			__u32 dcrn;
			__u32 data;
			__u8 is_write;
		} dcr;

powerpc specific.

		/* KVM_EXIT_OSI */
		struct {
			__u64 gprs[32];
		} osi;

MOL (Mac-on-Linux) uses a special hypercall interface it calls 'OSI'. To
enable it, we catch hypercalls and exit with this exit struct, which
contains all the guest gprs.

If exit_reason is KVM_EXIT_OSI, then the vcpu has triggered such a hypercall.
Userspace can now handle the hypercall and, when it is done, modify the gprs
as necessary. Upon guest entry all guest GPRs will then be replaced by the
values in this struct.

		/* Fix the size of the union. */
		char padding[256];
	};
};