صفحه 1:
Distributed Process
Management
Chapter 15
صفحه 2:
Process Migration
° Transfer of sufficient amount of
the state of a process from one
computer to another
° The process executes on the
target machine
صفحه 3:
Motivation
* Load sharing
- Move processes from heavily loaded
to lightly load systems
* Communications performance
~ Processes that interact intensively
can be moved to the same node to
reduce communications cost
~ May be better to move process to
where the data reside when the data
is large
صفحه 4:
Motivation
° Availability
~ Long-running process may need to
move because the machine it is
running on will be down
° Utilizing special capabilities
~ Process can take advantage of
unique hardware or software
capabilities
صفحه 5:
Initiation of Migration
° Operating system
~ When goal is load balancing
° Process
~ When goal is to reach a particular
resource
صفحه 6:
What is Migrated?
° Must destroy the process on the
source system and create it on
the target system
° Process image and process
control block and any links
must be moved
صفحه 7:
Example of Process
Migration
Machines
صفحه 8:
Example of Process
Migration
(by After migration
صفحه 9:
What is Migrated?
° Eager (all):Transfer entire
address space
~ No trace of process is left behind
~ If address space is large and if the
process does not need most of it,
then this approach my be
unnecessarily expensive
صفحه 10:
What is Migrated?
° Precopy: Process continues to
execute on the source node while
the address space is copied
~ Pages modified on the source
during precopy operation have to
be copied a second time
~ Reduces the time that a process is
frozen and cannot execute during
migration
صفحه 11:
What is Migrated?
° Eager (dirty): Transfer only that
portion of the address space
that is in main memory and
have been modified
~ Any additional blocks of the
virtual address space are
transferred on demand
~ The source machine is involved
throughout the life of the process
11
صفحه 12:
What is Migrated?
* Copy-on-reference: Pages are only
brought over when referenced
~ Has lowest initial cost of process
migration
° Flushing: Pages are cleared from
main memory by flushing dirty
pages to disk
~ Relieves the source of holding any
pages of the migrated process in main
memory
12
صفحه 13:
Negotiation of
Migration
* Migration policy is responsibility
of Starter utility
° Starter utility is also responsible
for long-term scheduling and
memory allocation
° Decision to migrate must be
reached jointly by two Starter
processes (one on the source and
one on the destination)
صفحه 14:
14
Negotiation of Process Migration
Figure 15.2
صفحه 15:
Eviction
° Destination system may refuse
to accept the migration of a
process to itself
° If a workstation is idle, process
may have been migrated to it
~ Once the workstation is active, it
may be necessary to evict the
migrated processes to provide
adequate response time
صفحه 16:
Distributed Global
States
* Operating system cannot know the
current state of all process in the
distributed system
۰ A process can only know the current
state of all processes on the local
system
» Remote processes only know state
information that is received by
messages
~ These messages represent the state in
the past
16
صفحه 17:
Example
° Bank account is distributed
over two branches
¢ The total amount in the account
is the sum at each branch
° At 3 PM the account balance is
determined
° Messages are sent to request
the information
صفحه 18:
Example
(aya foo
صفحه 19:
Example
° If at the time of balance
determination, the balance from
branch A is in transit to branch
B
° The result is a false reading
صفحه 20:
Example
صفحه 21:
Example
° All messages in transit must be
examined at time of observation
¢ Total consists of balance at both
branches and amount in
message
صفحه 22:
Example
° If clocks at the two branches
are not perfectly synchronized
¢ Transfer amount at 3:01 from
branch A
° Amount arrives at branch B at
2:59
e At 3:00 the amount is counted
twice
صفحه 23:
Example
صفحه 24:
Some Terms
° Channel
- Exists between two processes if
they exchange messages
° State
~ Sequence of messages that have
been sent and received along
channels incident with the process
صفحه 25:
Some Terms
° Snapshot
~ Records the state of a process
° Global state
~ The combined state of all
processes
° Distributed Snapshot
- A collection of snapshots, one for
each process
صفحه 26:
Inconsistent Global
State
صفحه 27:
Consistent Global State
رع تح 5
سس تام
3<
صفحه 28:
Distributed Snapshot
Algorithm
صفحه 29:
29
Distributed Snapshot
Algorithm
Process 3
Outgoing channels
2sent1, 2,3,4,5,6,7,8
Incoming channels
1 received 1,2,3
stored 4, 5, 6
2 received 1,2,3
stored 4
4 received 1,2,3
Process 4
Outgoing channels
3 sent 1,2,3
Incoming channels
2 received 1, 2 stored
3,4
Process 1
Outgoing channels
2sent 1,2,3,4,5,6
3sent 1,2,3,4,5,6
Incoming channels,
Process 2
Outgoing channels
3sent1, 2,3,4
456011, 2, 3, 4
Incoming channels
lreceived 1,2,3,4
stored 5, 6
Breceived 1, 2,3, 4,
5, 6, 7,8
صفحه 30:
Distributed Mutual
Exclusion Concepts
* Mutual exclusion must be enforced:
only one process at a time is allowed
in its critical section
A process that halts in its noncritical
section must do so without
interfering with other processes
It must not be possible for a process
requiring access to a critical section
to be delayed indefinitely: no
deadlock or starvation
صفحه 31:
Distributed Mutual
Exclusion Concepts
° When no process is in a critical
section, any process that requests
entry to its critical section must be
permitted to enter without delay
* No assumptions are made about
relative process speeds or number
of processors
* A process remains inside its critical
section for a finite time only
صفحه 32:
32
‘Figure 15.7 Model for Mutual Exclusion Problem in Distributed Process Management
صفحه 33:
Centralized Algorithm
for Mutual Exclusion
* One node is designated as the control
node
° This node control access to all shared
objects
* Only the control node makes
resource-allocation decision
° All necessary information is
concentrated in the control node
° If control node fails, mutual exclusion
breaks down
صفحه 34:
Distributed Algorithm
° All nodes have equal amount of
information, on average
° Each node has only a partial
picture of the total system and
must make decisions based on
this information
° All nodes bear equal
responsibility for the final
decision
صفحه 35:
Distributed Algorithm
° All nodes expend equal effort, on
average, in effecting a final
decision
° Failure of a node, in general,
does not result in a total system
collapse
۰ There exits no systemwide
common clock with which to
regulate the time of events
صفحه 36:
Ordering of Events
° Events must be order to ensure
mutual exclusion and avoid
deadlock
° Clocks are not synchronized
* Communication delays
صفحه 37:
Ordering of Events
° Need to consistently say that one
event occurs before another event
° Messages are sent when want to
enter critical section and when
leaving critical section
° Time-stamping
~ Orders events on a distributed
system
- System clock is not used
صفحه 38:
Time-Stamping
° Each system on the network
maintains a counter which
functions as a clock
° Each site has a numerical identifier
° When a message is received, the
receiving system sets is counter to
one more than the maximum of its
current value and the incoming
time-stamp (counter)
38
صفحه 39:
Time-Stamping
° If two messages have the same
time-stamp, they are ordered by
the number of their sites
° For this method to work, each
message is sent from one process
to all other processes
~ Ensures all sites have same ordering
of messages
~ For mutual exclusion and deadlock all
processes must be aware of the
situation
39
صفحه 40:
40
Time
امس oelo
Figure 15.8 Example of Operation of Timestamping Algorithm
صفحه 41:
41
Figure 15.9 Another Example of Operation
of Timestamping Algorithm
صفحه 42:
42
‘Mutua
Exclusion
Request
Return Replies
or Waiting Requests
Figure 15.10 State Diagram for Algorithm in [RICA81]
صفحه 43:
Token-Passing
Approach
Pass a token among the participating
processes
The token is an entity that at any
time is held by one process
The process holding the token may
enter its critical section without
asking permission
When a process leaves its critical
section, it passes the token to
another process
صفحه 44:
Deadlock in Resource
Allocation
° Mutual exclusion
° Hold and wait
° No preemption
¢ Circular wait
صفحه 45:
Phantom Deadlock
Figure 15.12 Phantom Deadlock
صفحه 46:
Deadlock Prevention
* Circular-wait condition can be
prevented by defining a linear
ordering of resource types
° Hold-and-wait condition can be
prevented by requiring that a
process request all of its required
resource at one time, and blocking
the process until all requests can
be granted simultaneously
صفحه 47:
Deadlock Avoidance
° Distributed deadlock avoidance is
impractical
~ Every node must keep track of the
global state of the system
~ The process of checking for a safe
global state must be mutually
exclusive
~ Checking for safe states involves
considerable processing overhead
for a distributed system with a large
number of processes and resources
47
صفحه 48:
Distributed Deadlock
Detection
* Each site only knows about its own
resources
~ Deadlock may involve distributed resources
* Centralized control - one site is
responsible for deadlock detection
° Hierarchical control - lowest node
above the nodes involved in deadlock
° Distributed control - all processes
cooperate in the deadlock detection
function
48
صفحه 49:
Deadlock in Message
Communication
° Mutual Waiting
~ Deadlock occurs in message
communication when each of a
group of processes is waiting for a
message from another member of
the group and there are no
messages in transit
صفحه 50:
50
(a) No deadlock مسد رسا
Figure 15.16 Deadlock in Message Communication
صفحه 51:
Deadlock in Message
Communication
° Unavailability of Message
Buffers
~ Well known in packet-switching
data networks
~ Example: buffer space for A is
filled with packets destined for B.
The reverse is true at B.
صفحه 52:
Direct Store-and-
Forward Deadlock
صفحه 53:
Deadlock in Message
Communication
° Unavailability of Message
Buffers
~ For each node, the queue to the
adjacent node in one direction is
full with packets destined for the
next node beyond
صفحه 54:
54
let ith Eg wit
دصر packs
صفحه 55:
Structured Buffer Pool
Figure 15.18 Structured Buffer Pool for Deadlock Prevention
صفحه 56:
Finite Channels Lead to
Deadlock
Figure 15.19 Communication Deadlock in a Distributed System