Chapter7 Deadlock

第七章 死锁!

Posted by LvKouKou on June 1, 2023

死锁

死锁是同步时遇到的问题。

死锁的定义:互为条件的进程发生的无限等待状态。

Chapter Objectives:To develop a description of deadlocks, which prevent sets of concurrent processes from completing their tasksTo present a number of different methods for preventing or avoiding deadlocks in a computer system

本章目标:了解死锁的描述,死锁阻止并发进程集完成其任务提出多种不同的方法来防止或避免计算机系统中的死锁

Bridge Crossing Example

image-20230601215756089

1. System Model 系统模型

  • System consists of resources

  • Resource types R1, R2, . . ., Rm
    • CPU cycles, memory space, I/O devices
  • Each resource type Ri has Wi instances.
  • Each process utilizes a resource as follows:
    • request
    • use
    • release
  • 系统由资源组成
  • 资源类型 R1, R2, . . Rm
    • CPU 周期、内存空间、I/O 设备
  • 每种资源类型 Ri 都有 Wi 实例。
  • 每个进程利用的资源如下:
    • request
    • use
    • release

2. Deadlock Characterization 死锁特征

2.1 必要条件

==死锁发生的四个必要条件(必须背下来)==

Deadlock can arise if four conditions hold simultaneously.

  • Mutual exclusion: only one process at a time can use a resource
  • Hold and wait: a process holding at least one resource is waiting toacquire additional resources held by other processes
  • No preemption: a resource can be released only voluntarily by theprocess holding it, after that process has completed its task
  • Circular wait: there exists a set {P0, P1, …, Pn} of waiting processessuch that P0 is waiting for a resource that is held by P1, P1 is waitingfor a resource that is held by P2, …, Pn–1 is waiting for a resource thatis held by Pn, and Pn is waiting for a resource that is held by P0.

如果四个条件同时成立,则可能会出现死锁。

  • 互斥:一次只能有一个进程使用资源
  • 占有并等待:持有至少一个资源的进程正在等待获取其他进程持有的其他资源
  • 非抢占:资源只能由持有资源的进程在完成其任务后自愿释放
  • 循环等待:存在一组 {P0, P1, …, Pn} 等待进程,使得 P0 正在等待由 P1 持有的资源,P1 正在等待由 P2 持有的资源,…,Pn–1 正在等待由 Pn 持有的资源,而 Pn 正在等待由 P0 持有的资源。

2.2 资源分配图 Resource-Allocation Graph

  • A set of vertices V and a set of edges E

    • V is partitioned into two types:
      • P = {P1, P2, …, Pn}, the set consisting of all the processes in the system
      • R = {R1, R2, …, Rm}, the set consisting of all resource types in the system

    request edge – directed edge Pi → Rj

    assignment edge – directed edge Rj → Pi

image-20230601220634885 image-20230601220710596

一组顶点 V 和一组边 E

  • V 分为两种类型:
    • P = {P1, P2, …, Pn},由系统中所有进程组成的集合
    • R = {R1, R2, …, Rm},由系统中所有资源类型组成的集合

请求边 – 有向边 Pi → Rj

分配边 – 有向边 Rj → Pi

image-20230601220738978

图一:无死锁,因为P3完成后可以释放R3;图二:有死锁,P3占有R3等待R2;图三:无死锁,P2可以完成后释放R1

2.3 ⭐ Basic Facts

  • If graph contains no cycles -> no deadlock
  • If graph contains a cycle
    • if only one instance per resource type, then deadlock
    • if several instances per resource type, possibility of deadlock
  • 如果资源分配图不包含环 ->则没有死锁
  • 如果资源分配图包含环->
    • 如果每种资源类型只有一个实例,则会死锁
    • 如果每种资源类型有多个实例,则==可能==会死锁

3. 死锁处理方法 Methods for Handling Deadlocks

  1. Ensure that the system will never enter a deadlock state:
    1. Deadlock prevention
    2. Deadlock avoidence
  2. Allow the system to enter a deadlock state and then recover
  3. Ignore the problem and pretend that deadlocks never occur in the system; used by most operating systems, including UNIX

  4. 确保系统永远不会进入死锁状态:
    1. 防止死锁
    2. 避免死锁
  5. 允许系统进入死锁状态,然后恢复
  6. 忽略问题,假装系统中从未发生过死锁;被大多数操作系统使用,包括 UNIX

4. 死锁预防 Deadlock Prevention

想要预防死锁,破坏四个必要条件中一个即可

Restrain the ways request can be made

  • Mutual Exclusion – not required for sharable resources (e.g., readonly files); must hold for non-sharable resources
  • Hold and Wait – must guarantee that whenever a process requests a resource, it does not hold any other resources
    • Require process to request and be allocated all its resources before it begins execution, or allow process to request resources only when the process has none allocated to it.
    • Low resource utilization; starvation possible
  • No Preemption
    • If a process that is holding some resources requests another resource that cannot be immediately allocated to it, then all resources currently being held are released
    • Preempted resources are added to the list of resources for which the process is waiting
    • Process will be restarted only when it can regain its old resources, as well as the new ones that it is requesting
  • Circular Wait – impose a total ordering of all resource types, and require that each process requests resources in an increasing order of enumeration

限制提出请求的方式

  • 互斥——(非必要不互斥)共享资源不需要(例如,只读文件);必须持有非共享资源(但是有一些资源天生就是互斥的,无法解决,所以这不是通用的方法)
  • 占有并等待——必须保证每当进程请求资源时,它不会持有任何其他资源
    • 要求进程在开始执行之前请求并分配所有资源,或者仅当进程没有分配给它时才允许进程请求资源。
    • 资源利用率低;可能饿死
  • 非抢占
    • 如果持有某些资源的进程请求另一个不能立即分配给它的资源,则释放当前持有的所有资源
    • 被抢占的资源被添加到进程正在等待的资源列表中
    • 只有当进程可以重新获得其旧资源以及它正在请求的新资源时,它才会重新启动
  • 循环等待 – 强加所有资源类型的总排序,要求每个进程按照枚举递增的顺序请求资源(如规定优先级高的先执行)

image-20230601232122808image-20230601232134010

5. 死锁避免 Deadlock Avoidance

我们在上一节中发现死锁预防比较困难,所以考虑死锁避免。

Requires that the system has some additional a priori information available

  • Simplest and most useful model requires that each process declare the maximum number of resources of each type that it may need
  • The deadlock-avoidance algorithm dynamically examines the resource-allocation state to ensure that there can never be a circular-wait condition
  • Resource-allocation state is defined by the number of available and allocated resources, and the maximum demands of the processes

要求系统具有一些额外的先验可用信息

  • 最简单和最有用的模型要求每个进程声明它可能需要的每种类型的最大资源数
  • 避免死锁算法动态检查资源分配状态,以确保永远不会有循环等待条件
  • 资源分配状态由可用和已分配资源的数量以及进程的最大需求定义

5.1 安全状态

When a process requests an available resource, system must decide if immediate allocation leaves the system in a safe state

System is in safe state if there exists a sequence <P1, P2, …, Pn> of ALL the processes in the systems such that for each Pi, the resources that Pi can still request can be satisfied by currently available resources + resources held by all the Pj, with j < I

That is:

  • If P i resource needs are not immediately available, then Pi can wait until all P j have finished
  • When P j is finished, Pi can obtain needed resources, execute, return allocated resources, and terminate
  • When Pi terminates, Pi +1 can obtain its needed resources, and so on

如果系统能按一定顺序为每个进程分配资源 (不超过它的最大需求),仍然避免死锁,那么系统的状态就是安全的(safe)。更为正式地说,只有存在一个安全序列(safesequence),系统才处于安全状态。

进程序列<P1,P2,…,Pn>在当前分配状态下为安全序列是指:对于每个 Pi,Pi仍然可以申请的资源数小于当前可用资源加上所有进程Pj(其中j<i)所占有的资源。在这种情况下,进程 Pi需要的资源即使不能立即可用,那么Pi可以等待直到所有 Pj释放资源。当它们完成时,Pi可得到需要的所有资源,完成给定任务,返回分配的资源,最后终止。当Pi终止时,Pi+1 可得到它需要的资源,如此进行。如果没有这样的序列存在,那么系统状态就是非安全的(unsafe)。


当进程请求可用资源时,系统必须确定立即分配是否使系统处于安全状态

如果系统中存在所有进程的序列<P1,P2,…,Pn>,则系统处于安全状态,对于每个Pi,Pi仍然可以请求的资源可以通过所有Pj持有的当前可用资源来满足,j < I

image-20230601233119469

那是:

  • 如果 P i 资源需求不能立即可用,则 Pi 可以等到所有 P j 都完成
  • 当 P j 完成后,Pi 可以获取所需的资源,执行,返回分配的资源,并终止
  • 当 Pi 终止时,Pi 1 可以获得所需的资源,依此类推

Basic Facts :

  • If a system is in safe state -> no deadlocks
  • If a system is in unsafe state -> possibility of deadlock
  • Avoidance -> ensure that a system will never enter an unsafe state

基本事实 :

  • 如果系统处于安全状态 -> 无死锁
  • 如果系统处于不安全状态 - >有死锁的可能性
  • 如何避免 - >确保系统永远不会进入不安全状态

Avoidance Algorithms :(确保系统始终处于安全状态)

Single instance of a resource type (只有单个实例)

  • Use a resource-allocation graph

Multiple instances of a resource type (有多个实例)

  • Use the banker’s algorithm

回避算法▁:(确保系统始终处于安全状态)

资源类型的单个实例

  • 使用资源分配图

资源类型的多个实例

  • 使用银行家算法

5.2 资源分配图算法 Resource-Allocation Graph Scheme

Claim edge Pi → Rj indicated that process Pj may request resource Rj ; represented by a dashed line

Claim edge converts to request edge when a process requests a resource

Request edge converted to an assignment edge when the resource is allocated to the process

When a resource is released by a process, assignment edge reconverts to a claim edge

Resources must be claimed a priori in the system

声明边缘 Pi → Rj 表示进程 Pj 可以请求资源 Rj ;用虚线表示

当进程请求资源时,声明边缘转换为请求边缘

将资源分配给流程时,请求边转换为工作分配边

当资源由流程释放时,分配边缘将重新转换为声明边缘

必须在系统中先验地声明资源

image-20230601234013279image-20230601234027535

Suppose that process Pi requests a resource Rj

The request can be granted only if converting the request edge to an assignment edge does not result in the formation of a cycle in the resource allocation graph

假设进程 Pi 请求资源 Rj

仅当将请求边转换为分配边不会导致在资源分配图中形成环时,才能授予请求(即先给一个进程分配,看是否安全,如果不安全则返回上一个状态,然后把资源分配给下一个需要的进程)

5.3 银行家算法 Banker’s Algorithm

Multiple instances

Each process must a priori claim maximum use

When a process requests a resource it may have to wait

When a process gets all its resources it must return them in a finite amount of time

多个实例

每个过程都必须先验地声称最大限度地使用

当进程请求资源时,可能需要等待

当一个进程获得所有资源时,它必须在有限的时间内返回它们。

先定义一些数据结构:

image-20230601235410849image-20230601235427257

Resource-Request Algorithm for Process Pi :

Requesti = request vector for process Pi. If Requesti [j] = k then process Pi wants k instances of resource type Rj

  1. If Requesti <= Needi go to step 2. Otherwise, raise error condition, sinceprocess has exceeded its maximum claim

  2. If Requesti <= Available, go to step 3. Otherwise Pi must wait, sinceresources are not available(没有可用资源)

  3. Pretend to allocate requested resources to Pi by modifying the state asfollows:

    • Available = Available – Requesti; (满足需求)
    • Allocationi = Allocationi + Requesti;(分配资源)
    • Needi = Needi – Requesti;(进程剩余所需)

    If safe -> the resources are allocated to Pi

    If unsafe -> Pi must wait, and the old resource-allocation state is restored

进程 Pi 的资源请求算法:

请求 = 进程 Pi 的请求向量。如果 Requesti [j] = k,则进程 Pi 需要 k 个资源类型 Rj 的实例

  1. 如果请求 <= 需要,请转到步骤 2。否则,引发错误条件,因为进程已超过其最大索赔
  2. 如果请求 <= 可用,请转到步骤 3。否则 Pi 必须等待,因为资源不可用(没有可用资源)
  3. 假设通过修改状态将请求的资源分配给 Pi,如下所示:
    • 可用 = 可用 – 请求;(满足需求)
    • Allocationi = Allocationi Requesti;(分配资源)
    • Needi = Needi – Requesti;(进程剩余所需) 如果安全 ->则资源分配给 Pi 如果不安全 -> Pi 必须等待,并且恢复旧的资源分配状态

image-20230601235756171image-20230601235808340

image-20230601235828605

6. 死锁检测 Deadlock Detection

Allow system to enter deadlock state

Detection algorithm

Recovery scheme

允许系统进入死锁状态

检测算法

恢复方案

6.1 每种资源类型只有单个实例 Single Instance of Each Resource Type

Maintain wait-for graph

  • Nodes are processes
  • Pi → P j if Pi is waiting for Pj

Periodically invoke an algorithm that searches for a cycle in the graph. If there is a cycle, there exists a deadlock

An algorithm to detect a cycle in a graph requires an order of n2 operations, where n is the number of vertices in the graph

维护等待图

  • 节点是进程
  • Pi → P j 如果 Pi 正在等待 Pj

定期调用在图形中搜索周期的算法。如果存在循环,则存在死锁

检测图中周期的算法需要 n2 次运算的顺序,其中 n 是图中的顶点数

image-20230602000316928

6.2 每种资源类型只有多个实例 Several Instances of a Resource Type

Available: A vector of length m indicates the number of available resources of each type

Allocation: An n x m matrix defines the number of resources of each type currently allocated to each process

Request: An n x m matrix indicates the current request of each process. If Request [i][j] = k, then process Pi is requesting k more instances of resource type Rj.

Available:长度为 m 的向量表示每种类型的可用资源数

Allocation:n x m 矩阵定义当前分配给每个进程的每种类型的资源数量

Request:n x m 矩阵表示每个进程的当前请求。如果请求 [i][j] = k,则进程 Pi 正在请求更多资源类型 Rj 的实例。

6.3 检测算法 Detection Algorithm

  1. Let Work and Finish be vectors of length m and n, respectivelyInitialize:

    (a) Work = Available

    (b) For i = 1,2, …, n, if Allocationi != 0, thenFinish[i] = false; otherwise, Finish[i] = true (是否完成)

  2. Find an index i such that both:(a) Finish[i] == false(b) Requesti  WorkIf no such i exists, go to step 4

  3. Work = Work + AllocationiFinish[i] = truego to step 2
  4. If Finish[i] == false, for some i, 1 <= i <= n, then the system is indeadlock state. Moreover, if Finish[i] == false, then Pi isdeadlocked

Algorithm requires an order of O(m x n2) operations to detect whether the system is in deadlocked state

  1. 让 Work 和 Finish 分别是长度为 m 和 n 的向量初始化: (a) 工作 = 可用 (b) 对于 i = 1,2, …, n,如果 Allocationi != 0,则 Finish[i] = false;否则,完成 [i] = 真 (是否完成)
  2. 找到一个索引 i,以便两者:(a) 完成[i] == false(b) 请求  工作如果不存在这样的 i,请转到步骤 4

  3. 工时 = 工时分配iFinish[i] = true转到步骤 2
  4. 如果 Finish[i] == false,对于某些 i,1 <= i <= n,则系统处于死锁状态。此外,如果 Finish[i] == 假,则 Pi 死锁

算法需要一阶 O(m x n2) 操作来检测系统是否处于死锁状态

image-20230602000744259image-20230602000758079

Detection-Algorithm Usage :

When, and how often, to invoke depends on:

  • How often a deadlock is likely to occur?
  • How many processes will need to be rolled back?
    • one for each disjoint cycle

If detection algorithm is invoked arbitrarily, there may be many cycles in the resource graph and so we would not be able to tell which of the many deadlocked processes “ caused ” the deadlock.

7. 死锁恢复

7.1 进程终止 Recovery from Deadlock: Process Termination

Abort all deadlocked processes

Abort one process at a time until the deadlock cycle is eliminated

In which order should we choose to abort?

  1. Priority of the process
  2. How long process has computed, and how much longer to completion
  3. Resources the process has used
  4. Resources process needs to complete
  5. How many processes will need to be terminated
  6. Is process interactive or batch?

中止所有死锁的进程

一次中止一个进程,直到消除死锁循环,此时引出问题:我们应该选择以什么顺序中止?

  1. 流程的优先级
  2. 计算了多长时间的过程,以及完成的时间
  3. 流程使用的资源
  4. 资源流程需要完成
  5. 需要终止多少个进程
  6. 流程是交互式的还是批处理的?

7.2 资源抢占 Recovery from Deadlock: Resource Preemption

Selecting a victim – minimize cost

Rollback – return to some safe state, restart process for that state

Starvation – same process may always be picked as victim, include number of rollback in cost factor

选择受害者 – 成本最小化

回滚 – 返回到某个安全状态,重新启动该状态的过程

饥饿 – 可能总是选择相同的过程作为受害者,包括成本因素中的回滚次数