CPU steal

Author: h | 2025-04-24



This was interesting, as %st represents CPU steal time. Here is IBM's definition of CPU steal time: steal time is the percentage of time a virtual CPU waits for a
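To make the "percentage" in that definition concrete, here is a minimal sketch of how a steal percentage can be derived from two samples of the aggregate "cpu" line in /proc/stat. The struct and function names (cpu_sample, steal_percent, read_cpu_sample) are hypothetical, invented for this illustration, and the calculation is only an approximation of what tools such as top report, not their actual implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One sample of the aggregate "cpu" line of /proc/stat, in clock ticks.
 * Hypothetical type for this sketch; guest time is not listed because
 * the kernel already includes it in the user field. */
struct cpu_sample {
    uint64_t user, nice, sys, idle, iowait, irq, softirq, steal;
};

/* Sum of all counters in one sample. */
uint64_t cpu_sample_total(const struct cpu_sample *s)
{
    return s->user + s->nice + s->sys + s->idle +
           s->iowait + s->irq + s->softirq + s->steal;
}

/* Steal ticks between two samples as a percentage of all elapsed ticks.
 * Returns 0.0 when no ticks elapsed between the samples. */
double steal_percent(const struct cpu_sample *prev,
                     const struct cpu_sample *cur)
{
    assert(prev != NULL && cur != NULL);
    uint64_t total = cpu_sample_total(cur) - cpu_sample_total(prev);
    if (total == 0)
        return 0.0;
    return 100.0 * (double)(cur->steal - prev->steal) / (double)total;
}

/* Parse the first line of a /proc/stat style file.
 * Returns 0 on success and -1 on failure. */
int read_cpu_sample(const char *path, struct cpu_sample *s)
{
    unsigned long long v[8];
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int n = fscanf(f, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
                   &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6], &v[7]);
    fclose(f);
    if (n != 8)
        return -1;
    s->user = v[0];    s->nice = v[1];  s->sys = v[2];     s->idle = v[3];
    s->iowait = v[4];  s->irq = v[5];   s->softirq = v[6]; s->steal = v[7];
    return 0;
}
```

A caller would take one sample with read_cpu_sample("/proc/stat", &prev), wait for an interval, take a second sample, and pass both to steal_percent.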


cpu-steal-diff/cpu-steal-diff.pl at main - GitHub

        *st = &per_cpu(steal_time, cpu);

        if (!has_steal_clock)
            return;

        memset(st, 0, sizeof(*st));

        wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
    }

At this point the physical address of steal_time has been written to MSR_KVM_STEAL_TIME, and the guest-side initialization is complete.

As mentioned earlier, kvm_steal_clock is registered as steal_clock. kvm_steal_clock simply returns the steal time of the specified CPU, and the code shows clearly that the value comes from the per-CPU variable. Assume for now that the per-CPU variable is filled in correctly. The registered steal_clock hook is invoked through paravirt_steal_clock, and one level up is steal_account_process_tick, which computes the st value shown by top. Both account_process_tick and irqtime_account_process_tick call steal_account_process_tick, so the steal time update is integrated into the kernel's timer tick accounting.

In steal_account_process_tick, paravirt_steal_clock returns the steal clock as a cumulative value. Subtracting this_rq()->prev_steal_time gives the steal time since the last update, which is added to kcpustat_this_cpu->cpustat[CPUTIME_STEAL], and this_rq()->prev_steal_time is refreshed at the same time. kcpustat_this_cpu is the source of the st figure that top displays.

    static __always_inline bool steal_account_process_tick(void)
    {
    #ifdef CONFIG_PARAVIRT
        if (static_key_false(&paravirt_steal_enabled)) {
            u64 steal;
            cputime_t steal_ct;

            /* read the cumulative steal time */
            steal = paravirt_steal_clock(smp_processor_id());
            /* subtract what was already accounted to get the steal time
             * of this "time slice" (a loose notion) */
            steal -= this_rq()->prev_steal_time;

            /*
             * cputime_t may be less precise than nsecs (eg: if it's
             * based on jiffies). Lets cast the result to cputime
             * granularity and account the rest on the next rounds.
             */
            steal_ct = nsecs_to_cputime(steal);
            /* advance prev_steal_time by the amount actually accounted;
             * the remainder carries over to the next round */
            this_rq()->prev_steal_time += cputime_to_nsecs(steal_ct);

            /* add the result to kcpustat_this_cpu */
            account_steal_time(steal_ct);
            return steal_ct;
        }
    #endif
        return false;
    }

The other caller of paravirt_steal_clock is update_rq_clock_task, which updates the clock_task of a run queue. Look first at its caller, update_rq_clock: the delta argument passed to update_rq_clock_task is sched_clock_cpu(cpu_of(rq)) - rq->clock. If the kernel is configured with CONFIG_PARAVIRT_TIME_ACCOUNTING=y, update_rq_clock_task subtracts the steal time from delta before adding it to clock_task.

    void update_rq_clock(struct rq *rq)
    {
        s64 delta;

        lockdep_assert_held(&rq->lock);

        if (rq->clock_skip_update & RQCF_ACT_SKIP)
            return;

        delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
        if (delta < 0)
            return;
        rq->clock += delta;
        update_rq_clock_task(rq, delta);
    }

    static void update_rq_clock_task(struct rq *rq, s64 delta)
    {
    #if defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
        s64 steal = 0, irq_delta = 0;
    #endif
    #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
        if (static_key_false((&paravirt_steal_rq_enabled))) {
            steal = paravirt_steal_clock(cpu_of(rq));
            steal -= rq->prev_steal_time_rq;

            if (unlikely(steal > delta))
                steal = delta;

            rq->prev_steal_time_rq += steal;
            delta -= steal;
        }
    #endif
        rq->clock_task += delta;
    }

Looking at the users of clock_task shows that it is the key value the scheduler uses to compute how long tasks on a run queue have run. So steal time is time the guest is aware of, but it is not counted toward the run time of any scheduling queue, and task scheduling inside a virtualized guest stays correct.

The remaining question is how the per-CPU steal_time variable gets refreshed. kvm_vcpu_arch contains the following structure:

    struct {
        u64 msr_val;
        u64 last_steal;
        u64 accum_steal;
        struct gfn_to_hva_cache stime;
        struct kvm_steal_time steal;
    } st;

vcpu_enter_guest calls record_steal_time, and there is a second function elsewhere, accumulate_steal_time. Their call chains are:

    kvm_vcpu_ioctl -> vcpu_load -> kvm_arch_vcpu_load -> accumulate_steal_time
    kvm_vcpu_ioctl -> kvm_arch_vcpu_ioctl_run -> vcpu_run -> vcpu_enter_guest -> record_steal_time

In other words, accumulate_steal_time always runs before record_steal_time; the latest 4.4 code moves accumulate_steal_time to the very beginning of record_steal_time. As its name suggests, accumulate_steal_time computes the steal time, and only three lines matter:

    delta = current->sched_info.run_delay - vcpu->arch.st.last_steal;
    vcpu->arch.st.last_steal = current->sched_info.run_delay;
    vcpu->arch.st.accum_steal = delta;

First, what is run_delay? It is the "time spent waiting on a runqueue":

    #ifdef CONFIG_SCHED_INFO
    struct sched_info {
        /* cumulative counters */
        unsigned long pcount;         /* # of times run on this cpu */
        unsigned long long run_delay; /* time spent waiting on a runqueue */

        /* timestamps */
        unsigned long long last_arrival, /* when we last ran on a cpu */
                           last_queued;  /* when we were last queued to run */
    };
    #endif /* CONFIG_SCHED_INFO */

So current->sched_info.run_delay is the run_delay of the QEMU vCPU thread, that is, the time the thread was runnable on the host but waiting on a run queue instead of executing the guest. At time t5 (each time the guest is entered), current->sched_info.run_delay = (t2 - t1) + (t4 - t3), while vcpu->arch.st.last_steal is the run_delay recorded at the previous guest entry (time t3), namely t2 - t1. The steal time of the current interval is therefore vcpu->arch.st.accum_steal = t4 - t3, and this is how the steal time of the current interval ends up in accum_steal.

Now look at record_steal_time. It uses kvm_read_guest_cached and kvm_write_guest_cached, which read or write a region of guest memory directly through a gfn_to_hva_cache structure. Because the gfn-to-hva mapping for this address generally does not change, there is no need to waste cycles repeating the translation on every access. The gfn_to_hva_cache in vcpu->arch.st.stime is initialized in kvm_set_msr_common, in the MSR_KVM_STEAL_TIME case:

    case MSR_KVM_STEAL_TIME:

        if (unlikely(!sched_info_on()))
            return 1;

        if (data & KVM_STEAL_RESERVED_MASK)
            return 1;

        /* The third argument of kvm_gfn_to_hva_cache_init below is a gpa.
         * As described earlier, the guest registers the physical address
         * of the per-CPU steal_time variable in MSR_KVM_STEAL_TIME; data
         * here is the value written to that MSR, i.e. that physical
         * address. */
        if (kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.st.stime,
                                      data & KVM_STEAL_VALID_BITS,
                                      sizeof(struct kvm_steal_time)))
            return 1;

        vcpu->arch.st.msr_val = data;
        vcpu->arch.st.last_steal = current->sched_info.run_delay;
        break;

kvm_gfn_to_hva_cache_init is a pure address translation and is not discussed further. Back in record_steal_time, kvm_read_guest_cached is essentially a __copy_from_user that copies the guest's structure, located through vcpu->arch.st.stime, into vcpu->arch.st.steal. vcpu->arch.st.accum_steal is then added to vcpu->arch.st.steal.steal, and kvm_write_guest_cached writes the result back to the mapped address in the guest, which is exactly the guest's per-CPU steal_time variable.

    if (unlikely(kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
            &vcpu->arch.st.steal, sizeof(struct kvm_steal_time))))
        return;

    vcpu->arch.st.steal.steal += vcpu->arch.st.accum_steal;
    vcpu->arch.st.steal.version += 2;
    vcpu->arch.st.accum_steal = 0;

    kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
        &vcpu->arch.st.steal, sizeof(struct kvm_steal_time));

"Source code analysis of steal_time under KVM" (KVM下steal_time源代码分析) comes from OenHan; the link is:
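The delta-and-clamp step that update_rq_clock_task performs can be tried outside the kernel. The following is a user-space sketch under stated assumptions: rq_model and rq_model_update are hypothetical names made up for this example, time values are plain nanosecond counters, and nothing here is kernel API. It only mirrors the arithmetic of the quoted code: take the new portion of a cumulative steal counter, cap it at the elapsed time, and credit the rest to clock_task.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the two run queue fields involved. */
struct rq_model {
    uint64_t clock_task;         /* ns of time credited to tasks */
    uint64_t prev_steal_time_rq; /* cumulative steal already deducted, ns */
};

/* Advance clock_task by delta_ns minus the newly observed steal time.
 * steal_clock_ns is the cumulative steal counter (monotonic).
 * Returns the amount of steal deducted in this update. */
uint64_t rq_model_update(struct rq_model *rq, uint64_t delta_ns,
                         uint64_t steal_clock_ns)
{
    assert(steal_clock_ns >= rq->prev_steal_time_rq);

    uint64_t steal = steal_clock_ns - rq->prev_steal_time_rq;

    /* Never deduct more than the time that actually elapsed; any excess
     * stays pending and is deducted by a later update. */
    if (steal > delta_ns)
        steal = delta_ns;

    rq->prev_steal_time_rq += steal;
    rq->clock_task += delta_ns - steal;
    return steal;
}
```

Because prev_steal_time_rq only advances by the amount actually deducted, steal time that exceeds one interval is not lost; it is carried into the next call, which matches the behaviour of the quoted kernel code.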

Comments

User4980


2025-04-16
User6128

Priority of a given thread arbitrarily, as long as it is equal to or greater than the BasePriority value for that thread.

On Windows NT, numeric priority values range between 0 and 31, although the value 0 is reserved by the OS. Thus, no threads, except specially designated OS threads, may use this priority. This range is divided into two categories: dynamic priorities and real-time priorities.

Dynamic priorities are values between 1 and 15. They are referred to as "dynamic" because the OS varies the priority of threads in this range. Thus, for example, it is not possible for a thread in this range to steal the CPU and cause starvation of other threads that are waiting to run.

Real-time priorities are values between 16 and 31. They are referred to as real-time because the OS does not vary the priority of threads in this range. Real-time range threads can continue to control the CPU, as long as no other threads of equal or higher priority are scheduled. Thus, it is possible for a real-time thread to steal the CPU and cause starvation of other threads that are waiting to run.

For either dynamic or real-time priorities, the BasePriority is established when the thread is first created and may be programmatically adjusted via such calls as KeSetBasePriorityThread().

For dynamic threads, the Priority starts out equal to the BasePriority, but may be adjusted by the OS, for example during I/O completion (IoCompleteRequest()), KeSetEvent(), or quantum exhaustion.

For real-time threads, the OS never adjusts the Priority
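The ranges described above can be written down as a small classifier. This is only an illustration of the numbers in the text: the enum and the function names nt_classify_priority and nt_os_may_adjust are hypothetical and are not part of any Windows API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the ranges named in the text. */
enum nt_priority_class {
    NT_PRIO_INVALID,  /* outside 0..31 */
    NT_PRIO_RESERVED, /* 0, reserved by the OS */
    NT_PRIO_DYNAMIC,  /* 1..15, the OS may vary the priority */
    NT_PRIO_REALTIME  /* 16..31, the OS does not vary the priority */
};

/* Map a numeric priority to the class described in the text. */
enum nt_priority_class nt_classify_priority(int priority)
{
    if (priority < 0 || priority > 31)
        return NT_PRIO_INVALID;
    if (priority == 0)
        return NT_PRIO_RESERVED;
    if (priority <= 15)
        return NT_PRIO_DYNAMIC;
    return NT_PRIO_REALTIME;
}

/* True when, per the text, the OS may adjust the thread's Priority. */
bool nt_os_may_adjust(int priority)
{
    enum nt_priority_class c = nt_classify_priority(priority);
    assert(c != NT_PRIO_INVALID);
    return c == NT_PRIO_DYNAMIC;
}
```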

2025-04-15
User5078

Cores by the system and user processes.
The percentage of CPU used for interrupt requests ('irq').
The next value is the idle percentage for all the cores combined.
The following value denotes the waiting each CPU core had to do.
Next up is the percentage for the steal time.
'guest' denotes the guest percentage, which is the CPU time spent on other virtual machines.
The last two values indicate the current frequency of the CPU.
atop also displays the above statistics for each core independently.

CPL - CPU load
The first three values are the load averages over different periods: 1, 5, and 15 minutes.
This is followed by the number of context switches ('csw').
Next up is the number of interrupts ('intr').
The last value is the number of available CPUs.

MEM - Memory utilization
The total physical memory supported.
The memory currently free.
The current cache memory.
'buff', as in "buffer", is the amount of memory consumed by filesystem metadata.
The sum of memory for the kernel's memory allocation, shown as 'slab'.
The amount of shared memory.

SWP - Swap memory

DSK - Disk usage
The first value denotes the percentage of time the system is busy handling requests.
The read requests issued.
The write requests issued.
The amount of data (in KB) read per read request.
The amount of data (in KB) written per write request.
The next two values are the throughput rates for reading from and writing to the disk, in megabytes.
The last value is the average number of milliseconds spent handling a request.

NET - Network statistics at the transport layer
'transport' signifies the transport layer in networking, which deals with the data protocols.
The number of segments received by the system over TCP ('tcpi').
The number of segments transmitted ('tcpo').
The same statistics for UDP ('udpi' for UDP in and 'udpo' for UDP out).
'tcpao' is the number of active TCP open connections.
Opposite to the previous, 'tcppo' is the number of

2025-04-22
User8472

Source code analysis of steal_time under KVM (KVM下steal_time源代码分析)

Code version: branch v4.3

Someone asked about the steal_time mechanism in a comment on another article, so I took a look at it; here is a summary.

steal_time originally refers to the time the hypervisor steals from a VM in a virtualized environment. Strictly speaking, it is the time during which a VCPU was not running. Running top in the guest shows an st figure:

    ~$ top
    top - 21:04:12 up 1:24, 2 users, load average: 0.45, 0.31, 0.22
    Tasks: 268 total, 1 running, 267 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 0.5 us, 0.2 sy, 0.3 ni, 98.0 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st

The purpose of st is to show the guest what share of CPU time it really gets, so that the guest can adjust its behavior and avoid hurting its workload. A high st value means that the share of host CPU given to this VM is too small and the hypervisor as a whole is heavily loaded; compute-heavy tasks can throttle themselves accordingly.

The KVM documentation describes it as follows:

    MSR_KVM_STEAL_TIME: 0x4b564d03

    data: 64-byte alignment physical address of a memory area which must be
    in guest RAM, plus an enable bit in bit 0. This memory is expected to
    hold a copy of the following structure:

        struct kvm_steal_time {
            __u64 steal;
            __u32 version;
            __u32 flags;
            __u32 pad[12];
        }

    whose data will be filled in by the hypervisor periodically. Only one
    write, or registration, is needed for each VCPU. The interval between
    updates of this structure is arbitrary and implementation-dependent.
    The hypervisor may update this structure at any time it sees fit until
    anything with bit0 == 0 is written to it. Guest is required to make sure
    this structure is initialized to zero.

    Fields have the following meanings:

    version: a sequence counter. In other words, guest has to check this
    field before and after grabbing time information and make sure they are
    both equal and even. An odd version indicates an in-progress update.

    flags: At this point, always zero. May be used to indicate changes in
    this structure in the future.

    steal: the amount of time in which this vCPU did not run, in
    nanoseconds. Time during which the vcpu is idle, will not be reported
    as steal time.

Now for the actual source code, starting with the guest side. steal_time is itself a paravirtual (PV) feature. It was probably developed for AWS (Xen) first and ported to KVM afterwards. Since the stock kernel had no such feature and supporting it means modifying the guest kernel, it is classed as PV, so when building the kernel you normally need to make sure the switch CONFIG_PARAVIRT=y is on.

    cat /boot/config-4.2.6 | grep CONFIG_PARAVIRT
    CONFIG_PARAVIRT=y

During guest kernel boot, kernel initialization calls setup_arch and then kvm_guest_init. kvm_guest_init first calls kvm_para_available to determine whether it is running under KVM, which works by checking whether the string returned by CPUID contains "KVMKVMKVM". It then registers kvm_steal_clock as steal_clock:

    if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
        has_steal_clock = 1;
        pv_time_ops.steal_clock = kvm_steal_clock;
    }

The path forks depending on CONFIG_SMP, but both branches end up calling kvm_guest_cpu_init, which enters kvm_register_steal_time. kvm_register_steal_time does one thing: it registers the physical address of the per-CPU steal_time variable in MSR_KVM_STEAL_TIME, as follows:

    static DEFINE_PER_CPU(struct kvm_steal_time, steal_time) __aligned(64);

    static void kvm_register_steal_time(void)
    {
        int cpu = smp_processor_id();
        struct kvm_steal_time
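The version rule quoted from the KVM documentation (read the version before and after, accept the value only if both reads are equal and even) can be sketched in user space as follows. This is an illustrative model, not the guest kernel's code: steal_time_area and read_steal_ns are hypothetical names, the retry limit is an assumption added so the example cannot spin forever, and C11 fences stand in for whatever barriers a real guest would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space copy of the layout quoted in the KVM documentation.
 * Hypothetical type name, chosen to avoid clashing with kernel headers. */
struct steal_time_area {
    uint64_t steal;   /* ns during which the vCPU did not run */
    uint32_t version; /* sequence counter, odd while an update is in progress */
    uint32_t flags;
    uint32_t pad[12];
};

_Static_assert(sizeof(struct steal_time_area) == 64,
               "structure is expected to occupy 64 bytes");

/* Try up to max_tries times to take a consistent snapshot of steal.
 * Returns true and stores the value in *out on success; returns false
 * if every attempt raced with an update. */
bool read_steal_ns(const volatile struct steal_time_area *st,
                   unsigned max_tries, uint64_t *out)
{
    for (unsigned i = 0; i < max_tries; i++) {
        uint32_t before = st->version;
        atomic_thread_fence(memory_order_acquire);

        uint64_t steal = st->steal;

        atomic_thread_fence(memory_order_acquire);
        uint32_t after = st->version;

        /* Consistent only if no update started or finished in between. */
        if (before == after && (before & 1u) == 0) {
            *out = steal;
            return true;
        }
    }
    return false;
}
```

This also explains why the host-side code quoted elsewhere in this article bumps version by 2: the counter stays even whenever the structure is in a stable state.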

2025-04-10
User1557

Stay away from fake Afterburner sites

Lately, we have heard about many phishing Afterburner sites that will steal your data for improper purposes. Please note that the correct Afterburner site only exists on msi.com and Guru3D; any other is a fake site. Be careful and stay away from those sites to protect your digital assets.

Protection for PC gaming

Multiple layers of protection for your devices, game accounts and digital assets. Norton 360 provides powerful layers of protection for your devices and online privacy. It helps guard against malware and other online threats as you bank, browse and shop online. Password Manager tools help you manage your passwords and online credentials, and PC Cloud Backup helps prevent data loss due to hard drive failures and ransomware. Plus, with our 100% Virus Protection Promise you get your money back if your device gets a virus we can't remove!

Maximize your gaming performance with Norton Game Optimizer

Level up your protection without compromising your game. Game Optimizer dedicates the CPU power needed for optimal performance in your game by isolating non-essential apps to a single CPU core. Boost performance and strengthen your PC's security at the same time. Try Game Optimizer and Norton 360 for Gamers for 30 days free.

Game Optimizer
Automatically Optimize - Detects full-screen games and feeds them maximum CPU power.
Smooth Performance - Helps eliminate FPS lags and slowdowns from your other apps for smooth visuals.
Maximize Resources - Free your PC from power-hungry programs running in the background that eat up your system's resources. Get more performance out of your rig!

The MSI trial offer is not available if you already have other cyber security software installed. If you have other cyber security software installed, you will not be able to use our product. Please uninstall

2025-04-06
