What exactly is "iowait"?

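This article explains the meaning of the iowait metric: the
percentage of time a CPU is idle while at least one I/O is in
progress. It describes the four CPU states (user, sys, idle,
iowait), how their counters are updated and reported as
percentages, and uses concrete examples to show how iowait
relates to the efficiency of the I/O subsystem and to program
performance.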

To summarize it in one sentence, 'iowait' is the percentage
of time the CPU is idle AND there is at least one I/O
in progress.

Each CPU can be in one of four states: user, sys, idle, iowait.
Performance tools such as vmstat, iostat, sar, etc. print
out these four states as a percentage.  The sar tool can
print out the states on a per CPU basis (-P flag) but most
other tools print out the average values across all the CPUs.
Since these are percentage values, the four state values
should add up to 100%.

The tools print out the statistics using counters that the
kernel updates periodically. On AIX, these CPU state counters
are incremented at every clock interrupt, which occurs
at 10 millisecond intervals.
When the clock interrupt occurs on a CPU, the kernel
checks the CPU to see if it is idle or not. If it's not
idle, the kernel then determines if the instruction being
executed at that point is in user space or in kernel space.
If user, then it increments the 'user' counter by one. If
the instruction is in kernel space, then the 'sys' counter
is incremented by one.

If the CPU is idle, the kernel then determines if there is
at least one I/O currently in progress to either a local disk
or a remotely mounted disk (NFS) which had been initiated
from that CPU. If there is, then the 'iowait' counter is
incremented by one. If there is no I/O in progress that was
initiated from that CPU, the 'idle' counter is incremented
by one.
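
The per-tick decision described above can be sketched as a small
function. This is only an illustration of the logic in the text, not
kernel code; the name classify_tick and its three boolean inputs are
hypothetical.

```python
from collections import Counter

def classify_tick(cpu_busy: bool, in_kernel: bool, io_pending: bool) -> str:
    """Return the counter that one clock tick would increment.

    cpu_busy   - the CPU was executing an instruction at the tick
    in_kernel  - that instruction was in kernel space
    io_pending - at least one I/O initiated from this CPU is in progress
    """
    if cpu_busy:
        # A busy CPU is never counted as iowait, even if I/O is pending.
        return "sys" if in_kernel else "user"
    return "iowait" if io_pending else "idle"

# Hypothetical sequence of four ticks on one CPU.
counters = Counter()
ticks = [
    (True, False, False),   # running user code
    (True, True, True),     # running kernel code while an I/O is pending
    (False, False, True),   # idle with an I/O pending
    (False, False, False),  # idle with nothing pending
]
for cpu_busy, in_kernel, io_pending in ticks:
    counters[classify_tick(cpu_busy, in_kernel, io_pending)] += 1

print(dict(counters))
```

Note that the second tick is counted as 'sys' even though an I/O is
pending: a pending I/O only matters when the CPU is otherwise idle.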

When a performance tool such as vmstat is invoked, it reads
the current values of these four counters. Then it sleeps
for the number of seconds the user specified as the interval
time and then reads the counters again. Then vmstat will
subtract the previous values from the current values to
get the delta value for this sampling period. Since vmstat
knows that the counters are incremented at each clock
tick (10ms), it then divides the delta value of
each counter by the number of clock ticks in the sampling
period. For example, if you run 'vmstat 2', this makes
vmstat sample the counters every 2 seconds. Since the
clock ticks at 10ms intervals, then there are 100 ticks
per second or 200 ticks per vmstat interval (if the interval
value is 2 seconds).   The delta values of each counter
are divided by the total ticks in the interval and
multiplied by 100 to get the percentage value in that
interval.
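
The calculation above can be sketched as follows for a single CPU.
This is a minimal illustration, assuming the 10ms tick from the text;
the function name state_percentages and the sample counter values are
made up for the example and are not how vmstat is actually written.

```python
TICK_MS = 10  # clock tick length given in the text
STATES = ("user", "sys", "idle", "iowait")

def state_percentages(prev: dict, curr: dict, interval_s: int) -> dict:
    """Convert two counter snapshots into per-state percentages."""
    ticks_in_interval = interval_s * 1000 // TICK_MS
    return {
        state: 100.0 * (curr[state] - prev[state]) / ticks_in_interval
        for state in STATES
    }

# Hypothetical snapshots taken 2 seconds apart (200 ticks).
before = {"user": 1000, "sys": 400, "idle": 7000, "iowait": 600}
after = {"user": 1050, "sys": 410, "idle": 7100, "iowait": 640}

pct = state_percentages(before, after, interval_s=2)
print(pct)
```

With these sample values the deltas are 50, 10, 100 and 40 ticks out
of 200, which gives 25% user, 5% sys, 50% idle and 20% iowait.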

iowait can in some cases be an indicator of a limiting factor
to transaction throughput whereas in other cases, iowait may
be completely meaningless.
Some examples here will help to explain this. The first
example is one where high iowait is a direct cause
of a performance issue.

Example 1:
Let's say that a program needs to perform transactions on behalf of
a batch job. For each transaction, the program will perform some
computations which takes 10 milliseconds and then does a synchronous
write of the results to disk. Since the file it is writing to was
opened synchronously, the write does not return until the I/O has
made it all the way to the disk. Let's say the disk subsystem does
not have a cache and that each physical write I/O takes 20ms.
This means that the program completes a transaction every 30ms.
Over a period of 1 second (1000ms), the program can do 33
transactions (33 tps).  If this program is the only one running
on a 1-CPU system, then the CPU would be busy 1/3 of the
time and waiting on I/O the rest of the time - so about 67% iowait
and 33% CPU busy.

If the I/O subsystem is improved (let's say a disk cache is
added) such that a write I/O takes only 1ms, then
it takes 11ms to complete a transaction, and the program can
now do around 90-91 transactions a second. Here the iowait time
would be around 9%. Notice that a lower iowait time directly
affects the throughput of the program.
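
The arithmetic in this example can be checked with a short sketch. It
assumes, as the example does, a single program on a single CPU that
alternates between computing and waiting on one synchronous write;
the function name transaction_profile is hypothetical.

```python
def transaction_profile(compute_ms: float, io_ms: float):
    """Return (transactions per second, % CPU busy, % iowait)."""
    cycle_ms = compute_ms + io_ms          # time for one transaction
    tps = 1000.0 / cycle_ms
    busy_pct = 100.0 * compute_ms / cycle_ms
    iowait_pct = 100.0 * io_ms / cycle_ms  # CPU idle with the write pending
    return tps, busy_pct, iowait_pct

for io_ms in (20, 1):
    tps, busy, iowait = transaction_profile(compute_ms=10, io_ms=io_ms)
    print(f"write={io_ms}ms  tps={tps:.1f}  busy={busy:.1f}%  iowait={iowait:.1f}%")
```

For the 20ms write this works out to roughly 33 tps with about 67%
iowait; for the 1ms write, roughly 91 tps with about 9% iowait.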

Example 2:

Let's say that there is one program running on the system - let's assume
that this is the 'dd' program, and it is reading from the disk 4KB at
a time. Let's say that the subroutine in 'dd' is called main() and it
invokes read() to do a read. Both main() and read() are user space
subroutines. read() is a libc.a subroutine which will then invoke
the kread() system call at which point it enters kernel space.
kread() will then initiate a physical I/O to the device and the 'dd'
program is then put to sleep until the physical I/O completes.
The time to execute the code in main, read, and kread is very small -
probably around 50 microseconds at most. The time it takes for
the disk to complete the I/O request will probably be around 2-20
milliseconds depending on how far the disk arm had to seek. This
means that when the clock interrupt occurs, the chances are that
the 'dd' program is asleep and that the I/O is in progress. Therefore,
the 'iowait' counter is incremented. If the I/O completes in
2 milliseconds, then the 'dd' program runs again to do another read.
But since 50 microseconds is so small compared to 2ms (2000 microseconds),
the chances are that when the clock interrupt occurs, the CPU will
again be idle with an I/O in progress.  So again, 'iowait' is
incremented.  If 'sar -P <cpunumber>' is run to show the CPU
utilization for this CPU, it will most likely show 97-98% iowait.
If each I/O takes 20ms, then the iowait would be 99-100%.
Even though the I/O wait is extremely high in either case,
the throughput is 10 times better in one case.
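
A rough calculation makes the point concrete. The sketch below assumes
the figures from this example (about 50 microseconds of CPU time per
4KB read, and 2ms or 20ms per physical I/O); the function name
dd_profile is hypothetical.

```python
def dd_profile(cpu_us: float, io_us: float, block_kb: int = 4):
    """Return (throughput in KB/s, % iowait) for a serial read loop."""
    cycle_us = cpu_us + io_us              # one read: CPU work plus disk wait
    reads_per_s = 1_000_000 / cycle_us
    iowait_pct = 100.0 * io_us / cycle_us
    return reads_per_s * block_kb, iowait_pct

fast_kb, fast_wait = dd_profile(cpu_us=50, io_us=2_000)
slow_kb, slow_wait = dd_profile(cpu_us=50, io_us=20_000)

print(f"2ms I/O:  {fast_wait:.1f}% iowait, {fast_kb:.0f} KB/s")
print(f"20ms I/O: {slow_wait:.1f}% iowait, {slow_kb:.0f} KB/s")
print(f"throughput ratio: {fast_kb / slow_kb:.1f}x")
```

Under these assumptions the two iowait values differ by only about two
percentage points, while the throughput differs by nearly a factor of
ten.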



Example 3:

Let's say that there are two programs running on a CPU. One is a 'dd'
program reading from the disk. The other is a program that does no
I/O but is spending 100% of its time doing computational work.
Now assume that there is a problem with the I/O subsystem and that
physical I/Os are taking over a second to complete. Whenever the
'dd' program is asleep while waiting for its I/Os to complete,
the other program is able to run on that CPU. When the clock
interrupt occurs, there will always be a program running in
either user mode or system mode. Therefore, the %idle and %iowait
values will be 0. Even though iowait is 0 now, that does not
mean there is NOT an I/O problem because there obviously is one
if physical I/Os are taking over a second to complete.



Example 4:

Let's say that there is a 4-CPU system where there are 6 programs
running. Let's assume that four of the programs spend 70% of their
time waiting on physical read I/Os and the other 30% actually using CPU time.
Since these four programs do have to enter kernel space to execute the
kread system calls, they will spend a percentage of their time in
the kernel; let's assume that 25% of the time is in user mode,
and 5% of the time in kernel mode.
Let's also assume that the other two programs spend 100% of their
time in user code doing computations and no I/O so that two CPUs
will always be 100% busy. Since the other four programs are busy
only 30% of the time, they can share the two CPUs that are not busy.

If we run 'sar -P ALL 1 10' to run 'sar' at 1-second intervals
for 10 intervals, then we'd expect to see this for each interval:

         cpu    %usr    %sys    %wio   %idle
          0       50      10      40       0
          1       50      10      40       0
          2      100       0       0       0
          3      100       0       0       0
          -       75       5      20       0

Notice that the average CPU utilization will be 75% user, 5% sys,
and 20% iowait. The values one sees with 'vmstat' or 'iostat' or
most tools are the average across all CPUs.

Now let's say we take this exact same workload (same 6 programs
with same behavior) to another machine that has 6 CPUs (same
CPU speeds and same I/O subsystem).  Now each program can be
running on its own CPU. Therefore, the CPU usage breakdown
would be as follows:

         cpu    %usr    %sys    %wio   %idle
          0       25       5      70       0
          1       25       5      70       0
          2       25       5      70       0
          3       25       5      70       0
          4      100       0       0       0
          5      100       0       0       0
          -       50       3      47       0

So now the average CPU utilization will be 50% user, 3% sys,
and 47% iowait.  Notice that the same workload on another
machine has more than double the iowait value.
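
The averages in the two tables can be reproduced with a few lines. The
per-CPU rows are copied from the tables above; the function name
average_states is hypothetical, and rounding to whole percentages is
an assumption made to match the printed values.

```python
def average_states(per_cpu):
    """Average (%usr, %sys, %wio, %idle) rows across all CPUs."""
    n = len(per_cpu)
    return tuple(round(sum(column) / n) for column in zip(*per_cpu))

four_cpu = [(50, 10, 40, 0), (50, 10, 40, 0),
            (100, 0, 0, 0), (100, 0, 0, 0)]
six_cpu = [(25, 5, 70, 0)] * 4 + [(100, 0, 0, 0)] * 2

print("4 CPUs:", average_states(four_cpu))
print("6 CPUs:", average_states(six_cpu))
```

The workload is identical in both cases; only the number of CPUs
available to absorb the idle-with-I/O time changes the reported
iowait.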



Conclusion:

The iowait statistic may or may not be a useful indicator of
I/O performance - but it does tell us that the system can
handle more computational work. Just because a CPU is in
iowait state does not mean that it can't run other threads
on that CPU; that is, iowait is simply a form of idle time.

Original link: https://blog.pregos.info/wp-content/uploads/2010/09/iowait.txt

