Operating Systems

Processes

  • Most basic function of kernel is to run programs

  • How do we achieve that with just 1 (or a small number of) CPU and one memory?

  • Want to create illusion that there are many CPUs; 1 for every program

  • A process is a program in execution

  • - A program is static; a process is the activity of executing that program

  • Processes have states that change over time

  • Processes need resources (CPU, memory, I/O, ...) to actually execute

  • Processes have contexts - all of the machine and kernel-related states

  • A context is comprised of CPU context (values of registers, stack pointer, program counter, ...) and memory context (code, variables, things in memory, ...)

  • A process has 3 memory areas: stack, text, data

  • Text contains all of the code

  • Data contains global variables, heap

  • Stack contains activation records

  • An activation record is information pertaining to a procedure call

  • - If a procedure calls itself 3 times, there will be 3 activation records

  • Activation records store:

  • - pointer where to return to

  • - link to previous record (so we know where the new top of the stack is after this is popped off the stack)

  • - local variables

  • Stack pointer register contains the location of the top of the stack

  • Return address keeps track of where to return to after procedure
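
  • A minimal C sketch of how calls create activation records; calling fact(3) puts 3 records on the stack:

    int fact(int n) {                /* each call pushes an activation record: */
        int result;                  /* - local variables (n, result)          */
        if (n <= 1)                  /* - return address back to the caller    */
            return 1;                /* - link to the previous record          */
        result = n * fact(n - 1);    /* fact(3) => 3 records on the stack      */
        return result;
    }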

  • All this is for 1 process; but we want to run many processes

  • In reality, if only 1 cpu, only 1 process could be running

  • Multiprogramming: voluntarily giving up CPU to another process

  • A running process may need a resource (a keystroke, anything needed to get work done). If it doesn't get that resource, the process just sits there making no progress, so give the CPU to another process that can make progress

  • yield(p) - give CPU to process p

  • Context switching: reallocating the CPU from one process to another

  • - Save the context of one process, restore context of process we want to run

  • Switch text, data, and stack because each process has its own text, data, and stack

  • Yield is so important, we don't want it written by programmers. We want it written by OS implementer and put in the kernel

  • Kernel is code that executes. Is it a process then? No it is not!

  • - Kernel supports processes; can be thought of as an extension of all processes. Processes run inside the kernel.

  • Kernel has its own text and data. It has 1 stack per process

  • Yield is in the kernel, so calling yield causes a jump into the kernel

  • Yield is a system call - a function in the kernel that is callable by a process

Timesharing

  • If a process doesn't give up CPU, how does the system keep working?

  • Quantum: amount of CPU time a process gets to run before the CPU is switched to another process

  • Each process is getting a quantum of CPU time. This creates the illusion of parallel progress by rapidly switching CPU

How Timesharing Is Implemented
  • Kernel keeps track of progress of each process and characterizes their states

  • - Running: actually using CPU

  • - Ready: able to use CPU, but not using

  • - Blocked: does not have CPU, not able to run

  • Running `->` Blocked (sleep): when process tries to access a resource, but the resource is being used by another process

  • Blocked `->` Ready (wake up): when resource it was waiting for becomes available

  • Ready `->` Running (dispatch): when CPU is given to process by kernel

  • Running `->` Ready (preempt): kernel takes the CPU away from a process that never blocks (e.g. its quantum expires)

Process vs. Kernel
  • Kernel includes functionality for system calls

  • Kernel runs when process makes a system call or a hardware interrupt occurs

  • When process is running its own code, it runs in user space. When it makes a system call, it jumps into kernel space

  • While running in kernel space, variables are accessible via the kernel's data area and stack

  • Within the kernel, stuff in kernel space and user space is accessible. In user space, only the user space stuff is accessible

Kernel Maintains List of Processes
  • Kernel has a process table of unique process ID's and their states

  • The table also has contents of all CPU contexts, areas of memory being used, reasons for being blocked
How Kernel Gets Control
  • Kernel can get control when process voluntarily gives up control by making a system call that blocks

  • Kernel can take away CPU from currently running process - preemption

  • There is a hardware clock that is programmed to go off at a certain time. When it goes off - that's a hardware interrupt - and the kernel gets control

  • - When kernel gives control to a new process, it resets timer

How a Context Switch Occurs
  • Process makes system call or hardware interrupt occurs to get into kernel

  • Kernel expects hardware to do some things on its behalf:

  • - hardware switches from user mode to kernel mode (characteristics of hardware)

  • - in user mode, process has less power because it can't execute all instructions

  • - by going into kernel mode, its power is amplified

  • - hardware arranges for control to go to a fixed location, which is the handler: the code that runs to handle the trap (system call) or interrupt

  • Kernel now saves context of currently running process, selects a ready process, restores its context, and returns from interrupt/trap

  • rti instruction: return from interrupt; need a special return because going from kernel space to user space; rti turns kernel mode off

  • Kernel has access to all processes, but only 1 process is running at a time
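
  • A rough sketch of the kernel's side of a context switch; all names here are made up for illustration:

    /* invoked via the trap/interrupt handler, in kernel mode */
    void switch_process(void) {
        save_context(&current->ctx);     /* registers, SP, PC into process table */
        current = pick_ready_process();  /* scheduling policy selects next       */
        reset_timer();                   /* fresh quantum for the new process    */
        restore_context(&current->ctx);  /* load its registers, SP, PC           */
        rti();                           /* return from interrupt; kernel mode off */
    }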

How To Get Parallelism Within a Process
  • What if we want multiple things going on within the same process?

  • Want a single text and data, but a separate stack for each path of execution

  • Thread: single sequential path of execution

  • - like a process, but no memory; lives in a process

  • Can have multiple threads in a process

  • To the user: a thread is a unit of parallelism; more threads -> more parallelism

  • To the kernel: a thread is a unit of schedulability; it is an object that the kernel assigns to a CPU

  • User level thread: support for threads is part of the program; implemented by programmer or by including a thread library; kernel thinks it is a single process

  • Kernel level thread: support for threads is provided by kernel; kernel can put threads on separate CPU's

  • User level picture: each process has 1 stack for each thread; each process has code for context switching, scheduling; kernel has 1 stack for each process

  • Kernel level picture: each process has 1 stack for each thread; kernel has code for context switching, scheduling; kernel has 1 stack for each thread

Pros And Cons
  • Pros of user level threads:

  • - Programmers should write programs independent of OS; portability - works on any kernel

  • - Context switching in user space is efficient; going into kernel takes time

  • - Programmer can control scheduling policy

  • Cons of user level threads:

  • - No true parallelism

  • Pros of kernel level threads:

  • - true parallelism - different thread on each CPU

  • Cons of kernel level threads:

  • - overhead - jumping into kernel is expensive

Scheduling

  • CPU scheduling problem: if there are multiple processes and only 1 CPU, how much time should each process get?

  • No single best policy; depends on goals of the system

  • - personal computer: active window gets most CPU

  • - large time-shared computer: everyone gets equal amount

  • Arrival time: when process is created

  • Service time: amount of CPU time for a process

  • Turnaround time: time between arriving and departing

  • - want to be as small as possible

Processes Arrive at the Same Time
  • Longest First: order processes by amount of CPU time needed from greatest to least

  • Shortest First: order from least time needed to greatest time needed

  • - provably optimal

  • Longest First

  • 1 2 3 4 5 6 7 8 9 T T
    P1 S1 S1 S1 S1 S1 5
    P2 S2 S2 S2 8
    P3 S3 9

    TT: turnaround time

    Average Turnaround Time: `22/3`

    Process 1, Process 2, Process 3 all arrive at the same time. P1 is scheduled first and it has service time S1. P2 and P3 are waiting for P1 to finish. Once P1 is done, P2, which is the next longest process, is scheduled. Once P2 is done, P3 runs.

    P2, P3, ..., Pn all have to wait S1 time. So it is best to make it as small as possible. Hence, shortest first.

  • Shortest First

  • 1 2 3 4 5 6 7 8 9 T T
    P1 S1 S1 S1 S1 S1 9
    P2 S2 S2 S2 4
    P3 S3 1

    TT: turnaround time

    Average Turnaround Time: `14/3`

  • Math proof of shortest first being optimal:

  • Average TT = `[S_1 + (S_1 + S_2) + ... + (S_1 + ... + S_n)] / n`

  • `= [nS_1 + (n-1)S_2 + ... + S_n] / n`

  • S1 has the most weight (`n`) so it should be as small as possible to minimize average turnaround time.

Processes Arrive at Different Times
  • Non-preemptive: CPU is not taken away from a process once it's given

  • Starvation: when a process wants a resource but never gets it

  • First Come First Served: first process created gets all the CPU, then the next process created gets the CPU when the previous process is done

  • - non-preemptive

  • - no starvation (all processes eventually get CPU time)

  • - bad for short processes (have to wait for long processes to finish if they (long processes) are created first)

  • Round Robin: every process gets a time-slice (quantum) of CPU

  • - generally does better than FCFS; a process waits at most (n-1)*quantum units of time

  • - preemptive

  • - no starvation

  • Shortest Process Next: select the shortest process to run next

  • - theoretical; have to know how long processes will run ahead of time

  • - optimal among non-preemptive policies

  • - allows for starvation; long process is waiting, but shorter processes keep coming in

  • Shortest Remaining Time: select process with shortest remaining time to run next

  • - theoretical

  • - optimal among preemptive policies

  • - allows for starvation

  • Multi-Level Feedback Queues

  • - levels represent priority; processes in queue 0 have the highest priority and processes in queue n have the lowest priority

  • - new processes enter queue 0

  • - select process with highest priority

  • - process runs for up to `2^k` quanta, where k is the level of the queue (see the sketch after this list)

  • - higher priority processes get the CPU more often, but have shorter running times; lower priority processes don't get the CPU as often, but have longer running times

  • - if a process ran for less than `2^k` quanta (it gave up the CPU before using all the time it was given), it gets put back in queue k

  • - if a process ran for the full `2^k` quanta (able to use all the time it was given), it gets put in queue k+1

  • - potential starvation; fix by periodically raising everyone's priority

  • - favors short processes, so it does well

  • - learns over time; sorts processes
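
  • A sketch of the queue-movement rule above; the queue count and helper names are assumptions:

    #define NQUEUES 8

    /* quantum allotment at level k is 2^k base quanta */
    int allotment(int k) { return 1 << k; }

    /* where a process goes after it gives up or exhausts the CPU */
    int next_level(int k, int quanta_used) {
        if (quanta_used < allotment(k))
            return k;                          /* blocked early: stay at level k */
        return (k + 1 < NQUEUES) ? k + 1 : k;  /* used it all: drop a level */
    }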

  • Priority Scheduling: label each process with a priority and select the highest one

  • - tie priorities to something external (1/CPU time used)

  • Fair Share (Proportional Share): every process gets a proportion of what they ask for

  • - select process with minimum actual/request ratio

  • - inefficient because have to compute fractions and minimum at every quantum

Real Time Scheduling
  • Real time system is correct if computations are correct and done within a certain amount of time

  • Non-real time system only needs to make correct computations

  • Hard real time system must meet all deadlines (e.g. nuclear reactor)

  • Soft real time system - ok to miss a few deadlines

  • Why not use hard real time for everything?

  • - requires a lot of resources; might not be able to run anything else

Periodic Tasks
  • Periodic: schedule repeats every period

  • Can processes be ordered to meet all deadlines?

  • Earliest Deadline First: schedule process with earliest deadline first

  • - works for periodic and aperiodic processes

  • - expensive - sort deadlines

  • C: CPU burst - how long process runs for each period

  • T: period - deadline (e.g. every 30 seconds)

  • U: utilization = C/T

  • Rate Monotonic Scheduling: select process with highest rate (1/period)

  • - only works for periodic processes

  • - if sum of utilizations `<= n(2^(1/n)-1)` then all deadlines will be met

  • - preempt if new period starts and there is a higher priority process

  • - if process completes burst, it goes to sleep until next period

  • - static: priorities of processes don't change over time

  • - RMS is optimal for static priority scheduling

  • - no matter how many processes there are, if they're not asking for > 69% of CPU, RMS will work (the bound `n(2^(1/n)-1)` decreases toward `ln 2 ~ 0.693` as n grows)

Synchronization

  • Synchronize: when events happen at the same time

  • Process synchronization: when events in processes occur at the same time

  • - every instruction, function call is an event

  • Use synchronization to prevent race conditions and wait for resources

  • Critical section: code that may be executed by different processes, but needs to be atomic: cannot be divided; cannot interleave critical sections that are related to each other

  • Mutual exclusion: only 1 process active in a critical section

  • Programmer must identify critical sections and surround them with entry/exit code, which makes the critical sections atomic

  • Entry code should allow processes to enter critical section if there are no other processes inside

  • - none inside `->` open; process enters `->` close; process leaves `->` open

  • Requirements for a good solution for entry/exit code (mutual exclusion):

  • - at most 1 process in critical section

  • - can't block entry if no process inside critical section

  • - a process that is waiting should eventually be able to enter critical section

  • - no assumptions about CPU speed or number

Software Lock
    shared int lock = OPEN;

    P1

    while (lock == CLOSED);
    lock = CLOSED;
    < critical section >
    lock = OPEN;

    P2

    while (lock == CLOSED);
    lock = CLOSED;
    < critical section >
    lock = OPEN;
  • At first, lock is open, so while loop will be skipped. Then lock will be closed. Then process will enter critical section. Since lock is closed, the other process will be stuck in the while loop, and thus, be unable to enter critical section. Once the process exits the critical section, the lock is open, and the other process will be able to enter the critical section.

  • Problem: P1 checks lock, lock is open so skip while loop, but just before it closes the lock, there is a context switch to P2. The lock is still open, so P2 also skips the while loop and enters the critical section. There is a context switch back to P1 and it also enters the critical section. Broke rule: at most 1 process in critical section

Take Turns
    shared int turn = 0;

    P1

    while (turn != 0);
    < critical section >
    turn = 1;

    P2

    while (turn != 1);
    < critical section >
    turn = 0;
  • At first, turn is 0, so P1 gets to go inside the critical section first. By doing this, P2 won't go inside the critical section since it is not P2's turn.

  • Problem: if P2 is scheduled to go before P1, it will be stuck in the while loop and won't enter the critical section. Broke rule: can't block entry if no process inside critical section

State Intention
    shared boolean intent[2] = {FALSE, FALSE};

    P1

    intent[0] = TRUE;
    while (intent[1]);
    < critical section >
    intent[0] = FALSE;

    P2

    intent[1] = TRUE;
    while (intent[0]);
    < critical section >
    intent[1] = FALSE;
  • When a process starts, it will declare its intention to go into the critical section. But before it goes in, it checks if the other process was intending to go inside the critical section. If it was, it will let the other process go first.

  • Problem: P1 sets intent = TRUE, there is a context switch to P2, P2 sets intent = TRUE, now both are stuck in the while loop (busy-waiting). Broke rule: a process that is waiting should be able to enter critical section

Peterson's Solution
    shared int turn;
    shared boolean intent[2] = {FALSE, FALSE};

    P1

    intent[0] = TRUE;
    turn = 1;
    while (intent[1] && turn == 1);
    < critical section >
    intent[0] = FALSE;

    P2

    intent[1] = TRUE;
    turn = 0;
    while (intent[0] && turn == 0);
    < critical section >
    intent[1] = FALSE;
  • Combines take turns and state intention

  • Problem: none!

  • An all-software solution

Disable Interrupts
  • Context switching causes problems; interrupts cause context switching; so disable interrupts before entering critical sections and enable when leaving

  • Problem: interrupts are turned off on a per-CPU basis. Broke rule: no assumptions about CPU speed or number

Test-and-Set Lock
  • Takes a memory location, tests whether its contents == 0 and sets contents = 1 atomically (as one indivisible operation)

  • Typical hardware solution
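
  • A spin lock built on test-and-set; C11's atomic_flag exposes exactly this operation:

    #include <stdatomic.h>

    atomic_flag lock = ATOMIC_FLAG_INIT;   /* starts open (clear) */

    void enter_critical(void) {
        /* atomically set the flag and return its old value;
           spin (busy-wait) while someone else already holds it */
        while (atomic_flag_test_and_set(&lock))
            ;
    }

    void leave_critical(void) {
        atomic_flag_clear(&lock);          /* open the lock */
    }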

Semaphores
  • Variable that helps us achieve synchronization

  • Can declare and initialize it to some integer value

  • Can call wait or call signal on it

  • - wait will decrement value of semaphore and check if value < 0; if yes, block the currently running process, else, return (see the sketch at the end of this section)

  • - signal will increment value of semaphore and unblock 1 process, if any are blocked

  • Synchronization is more than mutual exclusion

  • Can use semaphores to control order of process execution (want P1 before P2)

  • - conditional synchronization or general synchronization

  • Semaphores only provide synchronization; they don't allow any information to be transferred from one process to another

  • - A and B use semaphores; B can never learn anything about behavior of A (e.g. whether A is blocked)

  • Can't check value of semaphore; only initialize, wait, signal

  • Wait and signal must be atomic, so need TSL or Peterson's solution

  • Synchronization: one process waiting for another one
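
  • A sketch of how a kernel might implement wait and signal as described above; block, unblock_one, and current are assumed kernel helpers, and the two bodies must themselves be made atomic (TSL or Peterson's):

    struct semaphore {
        int value;
        struct queue blocked;            /* processes waiting on this semaphore */
    };

    void wait(struct semaphore *s) {     /* often called P or down */
        s->value--;
        if (s->value < 0)
            block(current, &s->blocked); /* sleep until signaled */
    }

    void signal(struct semaphore *s) {   /* often called V or up */
        s->value++;
        if (s->value <= 0)
            unblock_one(&s->blocked);    /* wake one blocked process, if any */
    }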

InterProcess Communication

  • Used when we want to build a multiprocess program

  • Want to support cooperating processes for speed and modularity

  • - speed: exploit parallelism; one process waits for resources and other ones can proceed

  • - modularity: reusable, self-contained programs

Examples of Cooperating Processes
  • Pipeline: P1 -> P2 -> P3; output of P1 is input of P2 ...

  • Client/Server: client needs a service (result of a function), sends message to server, server computes answer and sends it back to client

  • - good for cooperating processes built by different people (e.g. web browser and web server)

  • Parent/Child: parent creates numerous children that compute subresults; parent combines the results

  • Cooperate: processes need to talk to each other to organize their activity

  • IPC requires data transfer mechanism and synchronization mechanism

  • - semaphores are not an IPC mechanism because no data transfer

  • There are three abstractions for IPC: shared memory + semaphores, monitors, and message passing

Producer/Consumer Problem
  • Producer produces data; consumer consumes what was produced (both are processes)

  • Work on a shared buffer (array)

Shared Memory + Semaphores
  • Shared variables to tell where things are in buffer

  • Won't work if consumer goes first (nothing in buffer)

  • Consumer has semaphore for waiting until there are filled slots

  • Producer has semaphore for waiting until slots are empty before inputting

  • Won't work for multiple producers; producers might overwrite values

  • - inputting into array is a critical section; surround with additional semaphores

  • One semaphore solves synchronization and the other solves mutual exclusion
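
  • The bounded-buffer solution as a sketch, in the same pseudo-C style as the lock examples; N, produce, and consume are placeholders:

    #define N 10
    shared int buffer[N];
    shared int in = 0, out = 0;
    shared semaphore empty = N;    /* empty slots: producer waits on this  */
    shared semaphore full = 0;     /* filled slots: consumer waits on this */
    shared semaphore mutex = 1;    /* protects buffer with multiple producers/consumers */

    Producer

    while (1) {
        item = produce();
        wait(empty);               /* wait for an empty slot */
        wait(mutex);
        buffer[in] = item;
        in = (in + 1) % N;
        signal(mutex);
        signal(full);              /* one more filled slot */
    }

    Consumer

    while (1) {
        wait(full);                /* wait for a filled slot */
        wait(mutex);
        item = buffer[out];
        out = (out + 1) % N;
        signal(mutex);
        signal(empty);             /* one more empty slot */
        consume(item);
    }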

Monitors
  • Programming language constructs; if programming language doesn't support monitors, can't use them

  • Consists of shared variables, accessed via procedures that make up the monitor

  • Have condition variables for blocking and unblocking processes

  • - wait: block; signal: unblock

  • Only 1 process can be active inside a monitor

  • Producer produces n times then blocks; consumer consumes n times then blocks

  • No race condition because only 1 process active inside monitor

  • - since monitors are a programming language construct, compiler adds code to make them atomic

  • - semaphores are not a programming language construct

  • Think of monitor as an apartment with 2 rooms (active area and waiting area) and 1 door (between active area and outside the apartment)

  • When process calls monitor function, it is entering the monitor

  • - door is open by default; closed when process enters

  • If process is done, it exits and the door opens

  • If process is inside monitor and calls wait, it moves from active area to waiting area; then door opens and another process can come in to the active area

  • Condition variables have no memory; can't "remember" that wait/signal was called previously

  • Monitors bring additional structure to IPC
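
  • C has no monitors, but a pthread mutex plus condition variables approximates one; a minimal one-slot producer/consumer sketch under that assumption:

    #include <pthread.h>

    pthread_mutex_t door = PTHREAD_MUTEX_INITIALIZER;  /* the monitor's one door */
    pthread_cond_t nonfull = PTHREAD_COND_INITIALIZER;
    pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;
    int slot, filled = 0;                              /* shared variables */

    void deposit(int item) {
        pthread_mutex_lock(&door);               /* enter the monitor */
        while (filled)
            pthread_cond_wait(&nonfull, &door);  /* move to the waiting area */
        slot = item;
        filled = 1;
        pthread_cond_signal(&nonempty);          /* unblock a waiting consumer */
        pthread_mutex_unlock(&door);             /* exit; the door opens */
    }

    int fetch(void) {
        pthread_mutex_lock(&door);
        while (!filled)
            pthread_cond_wait(&nonempty, &door);
        filled = 0;
        pthread_cond_signal(&nonfull);
        pthread_mutex_unlock(&door);
        return slot;
    }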

Message Passing
  • Two functions: send and receive

  • - send: give name of process and pointer to buffer

  • Kernel copies data into kernel memory; records that it was from P1 and supposed to go to P2

  • - happens when P1 calls send, which is a system call

  • If P2 calls receive, kernel copies data into P2's buffer

  • How data transfer is implemented: kernel transfers data from one place to another

  • How synchronization is achieved: process waits until message is sent before receiving

  • No shared memory - good for multiprocess program over a network, multiple computers have no shared memory

  • Kernel is in control, so it can block producer or not schedule it to run if producing too much

  • Who should messages be sent to? Since the client and server may be two programs written by different people, how does client know process ID of server?

  • - port: like a mailbox; has a number everyone recognizes

  • Kernel buffering: messages that have been sent but not yet received pile up in kernel

  • - kernel could block/not allow to run process producing messages

  • Safer than shared memory paradigms because shared memory can lead to programmer abuse/error
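
  • A hypothetical send/receive interface matching this description; the names, signatures, and port numbers are illustrative, not a real API:

    /* send: kernel copies len bytes from buf into kernel memory */
    int send(int port, const void *buf, int len);

    /* receive: blocks until a message arrives, then copies it into buf */
    int receive(int port, void *buf, int maxlen);

    #define SERVER_PORT 80   /* a port number everyone recognizes (made up) */
    #define REPLY_PORT  81

    void client(void) {
        char request[64], reply[64];
        send(SERVER_PORT, request, sizeof request);  /* system call; data copied in */
        receive(REPLY_PORT, reply, sizeof reply);    /* blocks until server replies */
    }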

Deadlock

  • Set of processes that are permanently blocked; not able to run because of logical reason

  • - unblocking one process requires another to run, but no others can run

  • Processes ask for resources and kernel gives them out; a process sometimes holds on to resources while asking for more

  • Ex: memory = 200; P1 asks for 80, memory = 120; P2 asks for 70, memory = 50; P1 asks for 60, P1 gets blocked; P2 asks for 80, P2 gets blocked

Four Conditions For Deadlock
  • Deadlock is possible only if all four conditions are present:

  • - mutual exclusion: only 1 process may use a resource at a time

  • - hold-and-wait: process holds 1 resource and is waiting for another

  • - no preemption: resources can't be taken away from a process

  • - circular wait: hold-and-wait pattern is circular/cyclic

Attack the Deadlock Problem
  • Deadlock prevention: make deadlock impossible by removing 1 condition

  • Deadlock avoidance: temporarily get rid of 1 condition, look at every situation and see if it's possible for a deadlock

  • Deadlock detection: let them happen, maybe can recover

Deadlock Prevention
  • Remove mutual exclusion: why not allow all resources to be shared?

  • - some resources can't be shared (e.g. printer, critical section)

  • Remove hold-and-wait: what if we make processes ask for resources all at once? (e.g. if need resources A and B, need to ask for both at the same time, not A then later B)

  • - processes don't know what they will want; may realize need more later

  • Remove no preemption: what if we allow resources to be taken away?

  • - bad for some things like printing; jobs need to be completed

  • Remove circular wait: what if we numbered all resources and processes have to ask for them in order? (e.g. if process wants resources 3, 5, and 9, it has to get 3, then 5, then 9; not 9, then 3, then 5)

  • - there are an unknown number of resources; some are created on demand

Deadlock Avoidance
  • Avoid getting into situations that lead to deadlock; remove condition only when deadlock is possible

  • Works when resources are given out in increments (e.g. memory, CPU)

  • Processes would have to declare upfront the max amount they need, but they often don't know how much they need

  • If they did know, could use Banker's Algorithm

Banker's Algorithm
  • Works for fixed number of processes and resources

  • Characterize system as either in a safe state or unsafe state

  • Safe state: deadlock is avoidable

  • Unsafe state: deadlock is possible; not certain

  • System always begins in a safe state; every time there is a request, kernel asks if granting the resource makes the system go from safe to unsafe; if so, it won't grant the request

  • Would need 3 data structures:

  • - process/resource claim matrix: what is the max amount a process may ask for

  • - process/resource allocation matrix: what did they actually ask for and what do they currently own

  • - resource availability vector: how many resources are available/free

  • Can look at contents of these data structures and classify system as safe or unsafe

  • Safe state means at least 1 process can run to completion; once it finishes and releases its resources, the argument repeats, so all processes can finish in some order

  • If process may need more resources but none are available, don't run that process
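
  • A sketch of the safety check over these three data structures (C99 variable-length arrays; the exact matrix layout is an assumption):

    #include <string.h>

    /* returns 1 if some completion order exists for all n processes */
    int is_safe(int n, int r, int claim[n][r], int alloc[n][r], int avail[r]) {
        int work[r], done[n], progress = 1;
        memcpy(work, avail, sizeof work);      /* resources free right now */
        memset(done, 0, sizeof done);
        while (progress) {
            progress = 0;
            for (int p = 0; p < n; p++) {
                if (done[p]) continue;
                int ok = 1;                    /* can p's worst case be satisfied? */
                for (int j = 0; j < r; j++)
                    if (claim[p][j] - alloc[p][j] > work[j]) { ok = 0; break; }
                if (ok) {                      /* run p to completion, reclaim all */
                    for (int j = 0; j < r; j++)
                        work[j] += alloc[p][j];
                    done[p] = progress = 1;
                }
            }
        }
        for (int p = 0; p < n; p++)
            if (!done[p]) return 0;            /* someone can never finish: unsafe */
        return 1;                              /* safe */
    }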

Detection and Recovery
  • Do nothing to prevent/avoid deadlocks; if they happen, they happen

  • Periodically try to see if deadlock did occur and try to do something about it

  • Deadlocks rarely happen; why put all this complexity (of prevention and avoidance) into kernel for something unusual?

  • Cost of putting them in may be very high

  • Most general purpose operating systems (e.g. laptops) take this approach

  • Detect deadlock by constructing resource allocation graph and see if there's a cycle

  • Can kill all processes that are deadlocked

  • Can terminate them one at a time and see if that solves it

  • - process can be left in inconsistent state (writing/sending message may contain garbage)

Memory Management

  • Process has to store code (text), global/static variables and heap (data), activation records (stack)

  • Text is fixed; stack and data grow/shrink

  • What it would look like if process owns whole memory:

  • - process thinks it does own whole memory

  • Address space: set of addresses needed to access memory

  • Compiler has this view of memory as well

  • Compiler generates all of the memory addresses for variables

  • The problem is the compiler works with incomplete information; it doesn't know where the program will run in physical memory

  • Typically there are many processes in physical memory; only 1 can live at location 0

  • Compiler doesn't know that; pretends that the process's memory goes from 0 to some max value and allocates addresses based on that

  • The operating system figures out how to map those addresses into real physical addresses

  • Abstract - each process needs its own CPU and memory

  • Reality - single CPU whose time is divided up; single memory whose space is divided up

  • Why do all processes have to live in memory?

  • Consider an alternative - have all the processes live in a disk (secondary storage); when we want to run a process, we have to move its memory into physical memory

  • - takes too long to load from/to disk and to perform a context switch

  • The solution is to keep all processes inside memory so we just have to change context of CPU and move pointers around

  • We have a number of processes and want to fit them into memory

  • Memory starts empty; one big hole

  • Over time, memory gets allocated into blocks (put process into block and put block into hole)

  • Every allocation requires finding a hole large enough to fit the process

  • Usually, an allocation leaves a smaller hole

  • When memory is not needed, it is released and holes next to each other combine

Selecting the Best Hole
  • First Fit: find the first hole that fits

  • - simple, fast

  • Best Fit: find the hole that fits as perfectly as possible

  • - have to look at every hole

  • - leaves very small holes - fragments - that are likely unusable

  • Worst Fit: find the hole that leaves the largest resulting hole

  • - have to check every hole

  • Which one is the best?

  • - consider tradeoff: how well blocks fit vs search time

  • Prefer wasting memory space to wasting CPU time

  • - memory is cheap; time is precious

  • First Fit is typically used because it is fast
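
  • A first-fit search over a free list, as a sketch; the hole node layout is made up:

    #include <stddef.h>

    struct hole { size_t size; struct hole *next; };

    /* return the first hole big enough; caller splits off the remainder */
    struct hole *first_fit(struct hole *free_list, size_t request) {
        for (struct hole *h = free_list; h != NULL; h = h->next)
            if (h->size >= request)
                return h;
        return NULL;    /* no hole fits */
    }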

  • Eventually, memory becomes fragmented - has unused holes

  • Internal fragmentation: memory not being used inside block (room between process's stack and data)

  • - unused memory inside process

  • External fragmentation: outside of allocated regions; usable for new requests (processes)

  • - unused memory inside overall memory

  • Compaction: combine all external fragments into a big hole; hopefully it is large enough for another process

  • - simple, but time consuming

  • Break block (process's memory) into little pieces and fit those pieces into fragments

  • - easier to fit, but complicated

  • Prefer breaking over compaction; prefer complexity over time

50% Rule
  • On average, how many holes are there going to be?

  • If there are n blocks allocated, there will be `m = n/2` holes, on average

Unused Memory Rule
  • On average, how much memory is lost to holes?

  • b: average size of blocks; h: average size of holes; f: fraction of memory lost to holes

  • M: size of memory; m: number of holes; n: number of blocks

  • `k = h/b` (ratio of average hole-to-block size)

  • `M = mh + nb`

  • `f = (mh)/M = (mh)/(mh + nb)`

  • `m = n/2 => n = 2m` (by the 50% rule)

  • `f = (mh)/(mh + 2mb) = h/(h + 2b)`

  • `k = h/b => h = kb`

  • `f = (kb)/(kb + 2b) = k/(k + 2)`

  • If k = 1, then the average hole size = average block size. Then `f = 1/3`, which means there is 33% wasted memory if the average hole size = average block size

  • If k = 2, then the average hole size = 2x average block size. Then `f = 2/4 = 1/2`, which means there is 50% wasted memory if the average hole size = 2x average block size

  • As `k -> oo, f -> 1`. This means the bigger the average hole size, the more wasted memory there is

  • As `k -> 0, f -> 0`. This means the smaller the average hole size, the less wasted memory there is

Pre-sized Holes
  • Variable-sized allocations cause fragmentation

  • So let's use pre-sized holes

  • Make all holes the same size

  • - easy to allocate blocks, but the holes may be too small

  • Make holes in different sizes (e.g. small, medium, large)

  • - more flexible, but complex

Buddy System
  • Partition the holes into sizes of powers of 2

  • allocation: r = size of request

  • - find hole larger than r

  • - break hole into 2 buddies and see if half still fits the request; keep breaking until the next split would be too small

  • free: check if buddy is free and coalesce
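
  • The arithmetic that makes the buddy system cheap, as a sketch: blocks of size `2^k` sit at addresses that are multiples of `2^k`, so a block's buddy differs from it in exactly bit k:

    #include <stdint.h>

    /* address of the buddy of a 2^k-sized block at addr */
    uintptr_t buddy_of(uintptr_t addr, unsigned k) {
        return addr ^ ((uintptr_t)1 << k);
    }
    /* on free: if buddy_of(block, k) is also free, coalesce the
       pair into one 2^(k+1)-sized block and repeat one level up */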

Logical Memory

  • There are three problems with sharing memory among processes

  • Addressing problem: every process thinks it owns its own memory and every process's memory starts at 0

  • - in physical memory, there is only one address 0 and only the kernel gets it

  • - processes' memories don't actually start at 0

  • Protection problem: because multiple processes live in memory, we don't want one process to touch the memory of another process

  • - possible because we are forced to keep processes in memory

  • Space problem: one physical memory shared among a lot of processes means each process only gets a little; even less as more processes come in

  • Address space: set of addresses for memory

  • - usually linear; starting at 0 and going to some max value n-1 where n is the size of memory

  • Physical address space: address space that applies to physical memory

  • Logical address: address for logical memory

  • Logical memory: memory that belongs to process; process thinks it owns that memory; the compiler generates addresses for that memory and can reason about that memory

  • - can say memory always starts at 0; structured a certain way; goes up by increments of x; independent of physical memory

  • Machine doesn't care about logical memory, so we need to convert logical addresses to physical addresses

  • There are two possible times when we can do that conversion

  • Load time: when we load executable into memory; convert via software

  • - ex: can add 100 to every address

  • - doesn't work with dynamic allocation (addresses created at run time aren't known at load time)

  • Access time: every time memory is accessed, we can convert logical address at that point just before it's sent to memory; convert via hardware

Hardware for Logical Addressing
  • Base register: hardware register; when there is a context switch to a process, the base register gets filled with the start address of that process's memory; every time a logical address is supplied, the hardware adds it to the base register and sends the result to physical memory

Protection
  • Bound register: filled with size of logical memory

  • Every time a logical address is supplied, it's checked against the bound register (see the sketch at the end of this section)

  • - is this address between 0 and n-1?

  • - if yes, add it to the base register to get physical address

  • - if no, it is an invalid address; hardware causes TRAP into kernel; kernel kills the process

  • On every context switch, load the base and bound registers for the selected process

  • Only the kernel can do this, so kernel memory has to be protected

  • - advantage of putting kernel at 0 (or low address)

  • - if process can't issue addresses that are negative, they can't reach behind themselves; kernel is behind all processes, so kernel is automatically protected
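
  • What the hardware does on every access under base/bound, as a sketch; trap_to_kernel stands in for the hardware trap:

    /* logical -> physical with protection; done in hardware on every access */
    unsigned long translate(unsigned long addr,
                            unsigned long base, unsigned long bound) {
        if (addr >= bound)       /* outside 0..n-1: invalid address */
            trap_to_kernel();    /* kernel kills the process */
        return base + addr;      /* relocate into physical memory */
    }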

Fitting Processes Into Memory
  • Take advantage of the fact that a process is already broken up into text, data, stack; would like to put each of those into a hole that is available

  • There are two approaches of breaking up memory into pieces

  • Segmentation: break them up into variable size pieces that are fitted exactly for what we need; results in segmented address space; segments for text, data, stack

  • Paging: break them up into fixed size pieces regardless of what we actually need; results in paged address space

  • - fixed size pieces are called pages

  • - text will be same number of pages, data will be same number of pages, ...

Segmented Address Space
  • Address space is a set of segments

  • Segment: linearly addressed memory; mini logical memory

  • - typically contains logically-related information (e.g. all stuff in this segment is code, data, activation records, ...)

  • Each segment is identified by an integer s, where 0 ≤ s ≤ S-1, S: max number of segments

  • Hardware determines max number of segments

  • Logical address is a pair of numbers (s, i), s: segment number, i: offset

  • - which segment (denoted by s) and where within that segment (denoted by i)

  • Convenient for compiler; it doesn't have to know anything about physical addresses

  • No segment for hole between data and stack; not wasting logical memory

  • How does hardware convert logical addresses based on segmentation into physical addresses?

  • Segment table: describes how to convert segments into physical addresses that indicate the start of the segment in physical memory

  • If we know where a segment starts in physical memory, we could just add the offset and get the physical address

  • Physical address = ST(s) + i

  • There is one segment table per process

  • In each entry there are a set of fields:

  • - valid bit: says whether that entry is valid

  • - base: where segment is located in physical memory

  • - bound: how big segment is

  • - permissions: read, write, execute

  • Table located in kernel because it has to be protected

  • Hardware has to be told where table is located, because hardware uses it to automatically convert logical to physical addresses

  • The kernel loads two hardware registers: segment table base register and segment table size register

Address Translation
  • Logical address: string of bits (bits for segment number followed by bits for offset)

  • Hardware takes segment number, indexes into segment table, extracts the base value, and adds offset to get physical address

  • - have to check if offset is within bound

  • First check whether trying to access segment that is within segment table size

  • Add segment number to segment table base value

  • Check whether offset is within bounds

  • Check permissions

  • Add offset to base value
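
  • The same steps as a C sketch; the field names and trap() are illustrative:

    struct seg_entry { int valid; unsigned base, bound, perms; };

    unsigned translate(unsigned s, unsigned i,
                       struct seg_entry table[], unsigned table_size) {
        if (s >= table_size) trap();      /* segment beyond table size  */
        if (!table[s].valid) trap();      /* entry not valid            */
        if (i >= table[s].bound) trap();  /* offset outside the segment */
        /* permission check against the attempted operation goes here */
        return table[s].base + i;         /* add offset to base         */
    }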

Sizing the Segment Table
  • How do we determine the size of the segment table?

  • If there are n bits for the segment number in logical address, then there are `2^n` segments, so there are `2^n` entries in the table

  • The width of the table is the number of bits in each entry

  • Also need 1 bit for the valid bit

  • The base value can reference any location in physical memory, so we need enough bits to reference any part of memory; if physical memory holds `2^m` bytes, the base needs m bits

  • Bound tells us size of a segment; need enough bits to express largest possible segment

  • - offset indexes into segment; however many bits offset is, we can't reference any more than that within segment

  • - number of bits for offset determines the max segment size, and the bound must be able to express that size, so the bound gets the same number of bits as the offset

  • Pad remaining space to make entry size a power of 2
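
  • Worked example with made-up numbers: 4 bits of segment number gives `2^4 = 16` entries; with `2^30` bytes of physical memory the base needs 30 bits; with a 28-bit offset the bound needs 28 bits; 1 valid bit + 30 + 28 + 3 permission bits = 62 bits, padded to a 64-bit entry (a power of 2)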

Pros and Cons of Segmentation
  • Pros of segmentation:

  • - each segment can be located independently; can put segments wherever we want because all we have to do is change the base value

  • - can separately protect them because each segment can be bounded

  • - can grow and shrink (increase/decrease bound)

  • - segments can be shared by processes

  • -- if everyone uses the same compiler, can store text at same address for everyone

  • Cons of segmentation

  • - variable size allocation is difficult to deal with

  • -- hard to find holes

  • -- external fragmentation

Paged Address Space
  • Break logical and physical memory into fixed-size pieces

  • Pages of logical memory fit into frames of physical memory

  • Frame: physical unit of information

  • A page fits exactly into a frame; makes allocation easy

  • Logical address: string of bits (bits for page number followed by bits for offset)

  • Size of logical address space is (max number of pages) x (page size)

Frame-Based Physical Addressing
  • Form of physical address is frame number and offset (f, i)

  • Offset is within a frame

  • Translate logical address to physical: convert page number to frame number and concatenate with offset

  • There is a page table, which is very similar to the segment table

  • f = PT(p)

  • Logical memory is a sequence of pages; physical memory is a sequence of frames

  • Page table tells where to find pages in frames

  • Page table is a sequence of entries with elements

  • - valid bit; demand paging bits; frame number: which frame is this page located in; permission bits

  • Page table is located in the kernel

  • Tell hardware where the page table is located with:

  • - page table base register: physical address of page table

  • - page table size register: number of entries

  • Hardware takes page number, checks to make sure it's valid to index into the page table by checking against the size register, then indexes into page table

  • Then checks if valid bit is on, extracts frame number that tells us which frame in physical memory page is located, and concatenates with offset to get physical address
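
  • A sketch of the lookup and concatenation; the 12-bit offset (4 KB pages) and trap() are assumptions:

    #define OFFSET_BITS 12    /* assumes 4 KB pages/frames */

    unsigned translate(unsigned vaddr, unsigned pt[], unsigned pt_size) {
        unsigned p = vaddr >> OFFSET_BITS;               /* page number */
        unsigned i = vaddr & ((1u << OFFSET_BITS) - 1);  /* offset */
        if (p >= pt_size) trap();          /* check against size register */
        /* valid-bit check omitted; pt[p] holds the frame number */
        return (pt[p] << OFFSET_BITS) | i; /* concatenate, don't add */
    }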

Sizing the Page Table
  • If there are n bits for page number, then there are `2^n` entries in the table

  • How wide is each entry?

  • - need to know how big physical memory is

  • - need to know frame number

  • To size the frame number, divide the maximum physical memory size by the frame size to get the number of frames; the frame number field needs log2 of that many bits

  • Number of bits in the offset is log2 of the frame size

Segments vs Pages
  • Segments are good logical units of information

  • If we need to put all instructions in memory, it's nice to be able to say, "we need a memory of this size"

  • Memory contains same kind of information

  • However, different sizes lead to expensive allocations and external fragmentation

  • Pages are good physical units of information

  • Memory management is simple; any page will fit in any frame

  • Can't size them to what is wanted; have to spread code over many pages

Combining Segments and Pages
  • Convert logical memory to physical memory by using segment tables and page tables

  • A segment is a group of pages

  • A segment table entry points to the page table

  • Page table entry tells where to find page in what frame of memory

  • Logical address has three fields: segment number, page number, offset

  • Use segment number to index into segment table; base value in segment table tells where to find page table; use page number to index into page table; concatenate frame number with offset to get physical address

  • Every memory reference costs 2 additional memory references

  • Locality of reference: if we just referenced memory location 1000, there is a good chance that next memory reference will be the next address 1001

  • - if we know page of 1000, then we know page of 1001; we only need to do translation once

  • Add high-speed memory to keep cache of most frequent translations

  • TLB: translation look-aside buffer

  • Each entry is a pair of key and frame number

  • Key is a bitstring of page number or segment+page number

  • The more entries there are in the TLB, the higher the hit rate but the slower and more expensive memory references become

  • TLB has to be flushed on every context switch

  • - context switch changes which logical memory we are using; need new segment, page tables

  • At the beginning of a quantum, all memory references are slow; later in the quantum, the TLB fills up, which makes the machine faster

Virtual Memory

  • Segments/pages allow partitioning memory for convenient allocation; can reorganize memory to have a convenient way of using it

  • Address translation used to achieve relocation: put piece of logical memory with own logical address into any physical address

  • Protection achieved by matching permissions with object (text - execute, data - read)

  • Break logical memory into little pieces; some don't need to be in physical memory

  • - if error handler is never invoked, then don't need in memory

  • Only bring into physical memory what is needed; allows for large logical memory

  • Virtual memory is logical memory with some pieces not in physical memory

  • Part of logical memory is kept in physical; the rest is kept on disk

Virtual Memory Based on Paging
  • Virtual memory is a sequence of pages; every page corresponds to a page table entry; every table entry tells us where to find the page in physical memory

  • Valid bit: whether page is in physical memory or not

  • Reference bit: has this page been referenced?

  • Modified bit: has this page been modified?

  • Frame number: where to find page in physical memory

  • Protection bits: allowable memory operations

Address Translation and Page Faults
  • Translate logical to physical: take page number; index into page table; now it's possible to index into entry with valid bit off, which results in a page fault

  • When page fault happens, the kernel is entered; brings in page from disk to memory; updates page table entry

  • If physical memory is full of pages, have to kick something out

Faults Under Segmentation/Paging
  • Virtual address consists of segment number, page number, and offset

  • Take segment number, index into segment table to get page table

  • - possible that page table is on disk; segment fault; bring into physical memory

  • Check if page number is within bound of page table; index into page table

  • - may get segmentation violation if not

  • Page faults are more expensive than context switches and TLB misses

  • Tolerable if page faults are really rare

  • Principle of locality: memory is not referenced uniformly; some a lot, little, not at all

  • Keep popular pieces in memory and rare pieces on disk

  • Access memory location 1000, high probability that next reference will be 1000 or 1001; if we access a variable, it is likely we will access it again

  • Use locality to keep page faults down

Page Replacement Policy
  • Key to virtual memory is finding a good page replacement policy

  • If memory is full, need to kick out a page to bring in a new page

  • FIFO: kick out oldest page, order in which they came

  • - simple, but oldest one may be popular

  • OPT: kick out page that is furthest out in future, will not be referenced anytime soon

  • - theoretically optimal, but need knowledge of future

  • LRU: kick out least recently used, used furthest in the past

  • All algorithms have compulsory page faults (the first reference to each page must fault)

  • FIFO typically does bad; LRU typically does better than FIFO

  • LRU performs well if there is locality, but it is complex - need time stamps

  • - if no locality, it may be better, worse, or the same as FIFO

Clock
  • Clock algorithm: kick page that is old and not recently used (not least recently used)

  • Use the reference bit: hardware sets the bit to 1 when the page is referenced; kernel clears it to 0

  • Take frames and arrange in circle; clock hand points to frame that is considered for replacement

  • If reference bit is 1, don't kick out, set bit to 0, move clock hand

  • If reference bit is 0, kick out and bring in a new page

  • If clock hand goes all the way around, it will kick out the oldest page

  • - defaults to FIFO in the worst case

  • If frame being replaced has been modified, need to write it to the disk
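
  • One pass of the clock algorithm as a sketch; ref[] holds the per-frame reference bits and hand persists between calls:

    /* pick a victim frame to replace */
    int clock_select(int ref[], int nframes, int *hand) {
        for (;;) {
            if (ref[*hand]) {                  /* recently used: second chance */
                ref[*hand] = 0;                /* kernel clears the bit */
                *hand = (*hand + 1) % nframes;
            } else {
                int victim = *hand;            /* old and not recently used */
                *hand = (*hand + 1) % nframes;
                return victim;
            }
        }
    }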

Resident Set Management
  • Resident set: process's pages in physical memory

  • The bigger it is, the fewer pages other processes get

  • Try to make sure pages in resident set are valuable

  • Who provides frames to a process when it needs memory?

  • Local: when a process needs to bring in a page, it has to kick out one of its own pages

  • - if one process is page faulting a lot, it won't affect other processes

  • Global: process grabs frame from another process

  • - more efficient; if other processes are not using a page, then give it away

Multiprogramming Level
  • Number of processes in physical memory

  • Processor utilization: how busy processor is doing useful work, e.g. running code of user program (not overhead like context switches)

  • Typically one process cannot use all of CPU; waiting for input, writing to display

  • Utilization goes up as multiprogramming level goes up

  • After a certain point, processes have just enough; one more process causes other processes to page fault; CPU spends time replacing pages instead of running code (thrashing)

Denning's Working Set Model
  • Working set: set of pages referenced from t - Δ to t

  • Represents pages that have been recently used

  • Put process in memory only if working set can be kept in physical memory

  • Paging: bringing in single pages

  • Swapping: bringing in as many pages as needed for process to get minimal page faults

  • Local page replacement policy; one process page faulting does not affect others

  • Difficult to implement; need time stamps, determine Δ

  • Clock is easy to implement, but it is a global policy

File System

  • File: logical unit of storage; container of data

  • File system: collection of files that is structured

  • Want file system to be persistent, large, shareable, secure

  • Any object that can be accessed by name, read/written, protected, or shared can be put in a file system (e.g. keyboard, display, processes)

  • Name space is organized as a tree (more strictly, a directed acyclic graph)

  • A file has attributes: characteristics that are relevant to the OS

  • - type (system, user), time (creation, accessed), size, permissions

  • Read/write model: open, read, write, close; read/write into buffers

  • Memory-mapped model: map file into memory, access file by memory operations

  • Efficient for processes that share memory; all open some file, map file into address space, one modifies, everyone else gets to see it

  • Access control: how we share information

  • Want to share with this group; give some read, some write

  • Who can access a file, what operations are allowed

  • Want file system to support archival storage: saved file should never go away unless we want it to; want earlier versions of a file

  • Support various storage technologies; disks are slow

  • Want to achieve balance between performance, reliability, and security

  • All storage technologies support ability to read and write blocks of data

  • Abstract storage device: array of blocks; all fixed size

  • - read block number into memory address; write block number to abstract storage device

  • Divide array into three regions: file system metadata, file metadata, data blocks

  • File system metadata: data about entire file system

  • - number of total files

  • File metadata: data about a particular file

  • - attributes

  • Data blocks: actual contents of data

File System Metadata
  • First thing in storage array; know exactly where it is, starts at 0

  • Other parts know where it is only if they know how big file system metadata is

  • Keep pointers to start of file metadata and data block areas

  • Keep things like number of files in file system, number free/in-use (files, blocks)

File Metadata
  • Made up of array of file control blocks: small amount of metadata for a file

  • Contains attributes (type, size, permissions), references to data blocks: where to find data blocks that correspond to a file

Keeping Track of Allocated Blocks
  • Methods for keeping track of blocks that belong to a file

  • Contiguous block method: need pointer pointing to first data block and a number indicating number of contiguous blocks

  • Non-contiguous block method: pointer for each block

  • Extent method: keep pointer and number for several contiguous blocks

Keeping Track of Free Blocks
  • When creating a new file, need to know where the free blocks are

  • Free block map: array of pairs (pointer, number of contiguous blocks); like extents

  • Doubly linked list of extents

  • Bit map: string of bits; location of bit corresponds to location of block

  • - 1: used; 0: not being used

File Name to File Control Block
  • User wants to use file names; OS wants to use numbers to index into file control block array

  • Need to translate file name to control block number

  • Directories are implemented as files organized as a table; array of entries

  • - each entry has a name and attributes

  • i-node number: file control block index

  • Parsing a file name involves repeatedly indexing directory tables, one path component at a time

  • Read/write model has an open method so the name lookup is done once; later reads/writes use the returned handle, giving a faster way of accessing files

Big Picture
  • Abstract storage device split into file system metadata, file metadata, and data blocks

  • File metadata is array of file control blocks/i-nodes

  • There is a file control block for every file

  • A block map is made up of 13 pointers (classically 10 direct pointers to data blocks, plus 1 single-, 1 double-, and 1 triple-indirect pointer)
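
  • A sketch of finding a file's b-th data block under that 13-pointer layout; bread and the pointers-per-block count are assumptions:

    #define NDIRECT 10
    #define NINDIR  128   /* block pointers per indirect block (assumed) */

    struct inode { unsigned block[13]; /* ...attributes... */ };

    unsigned block_of(struct inode *ip, unsigned b) {
        if (b < NDIRECT)
            return ip->block[b];                        /* direct pointer */
        b -= NDIRECT;
        if (b < NINDIR) {
            unsigned *ind = bread(ip->block[NDIRECT]);  /* single indirect */
            return ind[b];
        }
        /* double (block[11]) and triple (block[12]) indirect work the
           same way, with one or two more levels of lookup */
        return 0;
    }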

Disks For Storage
  • Disk is made up of platters stacked on top of each other with space in between

  • Each platter is coated with magnetic material

  • Disk head: arm with reader that can magnetize or demagnetize

  • - write 0 or 1 on platter

  • Track: circle of bits

  • Sector: portion of a track

  • Cylinder: group of tracks at the same position on all platters

  • Rotational latency: amount of time waiting for the desired sector to rotate under the head

  • Seek time: amount of time it takes to move arm

  • Transfer time: read bits and transfer into memory

Solid State Drives
  • NAND-based flash memory: no moving parts

  • Non-volatile: doesn't forget when take power off

  • Unaffected by shock, magnetic fields

  • Limited number of writes; wears out with age

  • Faster access time; move more data per unit time; less power; less capacity; more cost

Performance: Caching
  • Keep things in memory for reuse instead of accessing disk

  • Cache data blocks of files, file system metadata, file metadata, file names

Performance: Clustering
  • Putting data blocks that are related physically close to each other

  • - reduce disk head movement

Performance: Block Size
  • Bigger: better speed; on disk access, able to bring in or put out more data

  • Smaller: less wasted space

Reliability: Consistency
  • Buffer cache reduces disk accesses, but a system crash loses whatever data hasn't been written back

  • Some blocks are written to disk immediately (file system metadata); others are allowed to stay in the cache longer

Reliability: Journaling
  • Create log entry for each update

  • If crash, there is a record

Input/Output System

  • A system has CPU/memory and devices; devices communicate with the CPU and memory over the I/O bus

  • Problem: many different types of I/O devices; all operate differently

  • How does process initiate I/O? achieve synchronization? transfer data?

  • - IPC

Background
  • A system is composed of CPU, memory, devices

  • Devices attach to I/O bus via controller

  • CPU and devices communicate through:

  • - I/O instructions: hardware instruction designed for I/O

  • - Memory-mapped I/O: device registers appear at memory addresses; move values into/out of them with ordinary loads and stores

  • Data transfer via:

  • - programmed I/O: CPU is involved with moving data from register into device

  • - DMA: CPU tells device to move data to certain memory area

  • Synchronization via:

  • - polling: checking register/memory which says whether device is ready and busy wait if it isn't

  • - interrupts: device does its thing, generates interrupt that causes kernel to run

Buffered and Unbuffered I/O
  • Buffered I/O: buffer in kernel

  • Buffer: temporary data storage

  • Pro: devices can write data without worrying about whether the process's memory is swapped out

  • Pro: Process can output to device without busy waiting if device is busy

  • Con: Takes time to copy data

  • Try to classify devices by shared characteristics

  • - lowers code size and complexity

  • Variable vs. fixed size units: data transfer

  • Sequential vs. random access

  • Synchronous vs. asynchronous: wait for or produce data spontaneously

  • Speed of operation: high-speed or not

I/O System
  • Software that deals with I/O

  • Mostly in kernel

  • Split into 2 parts: device-dependent, device-independent

  • Device drivers: device-dependent code; specific to device

Device Dependent
  • Implements standard interface: open, close, read, write

  • Kernel invokes device driver through these functions

  • Interrupt handler: when device completes, tells kernel it is done

Device-Independent I/O
  • Does everything I/O related but not specific to any device; for all devices

  • All devices are modeled as files

  • Anything that can be abstracted out of device driver

  • Device driver is code run by kernel; bugs could mess up whole OS

  • Implement device-independent functionality (e.g. buffering) once so there is less code and, therefore, fewer bugs in drivers

User-Space I/O
  • Part of I/O running in process; user-space

  • - e.g. printf

  • Devices in UNIX are categorized as block or character

  • Block: operates in fixed size, randomly addressable blocks

  • - can use buffer cache

  • Character: not block (e.g. keyboard, mouse)

UNIX I/O System Call Interface
  • ioctl: special operations for certain devices

  • - rewind for tape

Software Block Cache Design
  • Blocks of data on disk; devote certain amount of RAM to store blocks

  • Check if block is in cache

  • - if not, then bring it into cache

  • Use hash table to search if block is in cache

  • Kick out LRU - least recently used - block

  • Maintain LRU list that is independent of hash table
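
  • A sketch of the lookup path combining the hash table and the LRU list; all helper functions are illustrative:

    struct buf {
        unsigned blockno;
        int dirty;
        struct buf *hash_next;            /* chain within a hash bucket */
        struct buf *lru_prev, *lru_next;  /* position in the LRU list   */
        char data[4096];
    };

    struct buf *getblk(unsigned blockno) {
        struct buf *b = hash_lookup(blockno);  /* in cache? */
        if (b == NULL) {
            b = lru_evict();                   /* reuse least recently used block */
            if (b->dirty)
                write_to_disk(b);              /* flush before reuse */
            read_from_disk(blockno, b);
            hash_insert(b);
        }
        lru_move_to_front(b);                  /* now most recently used */
        return b;
    }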

Protection

  • Processes need to access resources

  • Resources are shared; want all processes to be able to access them

  • Resources need to be protected

  • - don't want process to take it and use it forever

  • - don't want to use without permission

  • The kernel enforces protection

  • At the beginning, the kernel owns all the resources; then processes ask the kernel for resources

  • Once a process is given access, the kernel can prevent others from gaining access; the kernel may or may not be able to take the resource away

  • The kernel itself needs to be protected

  • Typically, the kernel is protected by the hardware

  • Protection: how to limit access to a resource

  • Resource: anything that requires protection

  • Domain: set of (resource, permissions) pairs; where a process lives; defines what a process is allowed to do

  • Process: the ones accessing resources

Protection Matrix
         A      B      C      D
    X    r, w   r, w          w
    Y    w      r      r, w
  • Rows are domains

  • Columns are resources

  • Matrix entry contains permissions/rights (read, write)

  • Inefficient because most of the entries would be empty (since there are many domains and resources)

Access Control Lists
  • Associated with resources; for every resource there is an access control list

  • ACL for B
    X: r, w
    Y: r
  • Check whether the domain of the process is on the list; like a registry: if the name is on the list, then access is ok

  • Inefficient because the list must be searched on every resource access, and the process's domain must be checked each time

  • However, it is easy to revoke access; just take the name off the list

Capability Lists
  • Associated with each domain

  • CL for X
    A: r, w
    B: r, w
    D: w
  • Like a key or ticket: if you have it, you get access

  • Efficient because the process just presents the capability on each access; no search is needed

  • However, it is hard to revoke access (capabilities are spread across all the domains that hold them)

Protection in UNIX
  • UNIX uses both access control lists and capability lists for resource protection

  • The first time a process accesses a file, the kernel checks the process against the access control list. If the process has permission to open the file, the kernel gives the process a file descriptor. That file descriptor then acts like a capability list for future file accesses
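
  • A small demonstration of this two-step check, using the real UNIX calls: the permission (ACL) check happens once, at open(); afterwards the returned file descriptor is simply presented, like a capability, on every read:

```c
/* ACL check at open(), file descriptor as capability afterwards. */
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
    /* the kernel consults the file's permissions (its ACL) here, once */
    int fd = open("/etc/passwd", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    char buf[64];
    /* no permission lookup now: holding fd is the proof of access */
    ssize_t n = read(fd, buf, sizeof buf);
    printf("read %zd bytes\n", n);
    close(fd);
    return 0;
}
```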

Security

  • How to protect computer system (contents, operations, services) from threats (theft, damage, disruption)

  • Confidentiality: only those authorized should be able to access

  • Integrity: only authorized changes should be allowed

  • Authenticity: is the person who they claim to be actually that person?

  • Availability: able to access service

  • Threats: interception, interruption, modification, fabrication

User Authentication
  • Password: a secret only you and the computer system know

  • Store passwords encrypted so that others can't learn them even if they can see the stored copies

  • Longer passwords are harder to break but harder to remember

  • A password is data that is secret

  • Challenge/response protocol: a program (algorithm) that is secret

  • User and computer know secret algorithm

  • Computer challenges user; user must return the correct value based on the algorithm (a toy sketch follows)
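
  • A toy sketch in C: the shared secret is a function rather than stored data. The function below is invented and trivially weak; only the shape of the protocol matters:

```c
/* Toy challenge/response: both sides know a secret algorithm. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* the secret algorithm both user and computer know (hypothetical) */
static long secret(long x) { return (x * x + 7) % 9973; }

int main(void) {
    srand((unsigned)time(NULL));
    long challenge = rand() % 10000;     /* computer picks a fresh challenge */
    long response  = secret(challenge);  /* in reality the user computes this
                                            on their own side and sends it */

    /* the computer verifies the response against its own computation */
    if (response == secret(challenge))
        printf("challenge %ld answered correctly: user authenticated\n",
               challenge);
    return 0;
}
```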

Trojan Horse
  • Program that contains malicious code; pretends to be useful

  • Runs in user domain; has access to all user files

Trap Door
  • Secret access point in program

  • E.g.: a bank hires a programmer; the programmer puts something in the program that later allows the programmer access

  • Hard to tell if trap door is in program

  • The trap door may not be in the source code; the compiler can put it in. It may not be in the compiler's source code either; the compiler's designer can put it in

Virus
  • Code attached to a legitimate program

  • When the program runs, the virus also runs; it causes damage and spreads to other programs

  • Disinfectant: compares a known-good copy of the program against this (possibly infected) program

Internet Worm
  • Program that copies itself over a network

Denial of Service
  • Prevent others from accessing system

  • Defense: have many servers, distributing the different ways of accessing the system

Intrusion Detection
  • Signature-based: looking for a pattern

  • - e.g. repeated failed logins

  • Anomaly-based: looking for unusual behavior

  • Honey pot: a decoy system set up to attract intruders

Cryptography
  • Sender and receiver; send messages to each other in a way so that no one else can tell what the messages are

  • Plaintext, encryption key, encryption method, decryption method, decryption key

  • Secret-key (symmetric) cryptography: sender and receiver use the same key for encryption and decryption

  • How do the two parties agree on the same key?

  • Public-key cryptography: use two keys, one private and one public; one encrypts, the other decrypts

  • Encrypt using the public key; decrypt using the private key

  • Secret key is fast, but it is hard to distribute keys

  • Public key is slow, but it is easy to distribute keys

  • Combine both: start with public key to exchange a secret (session) key; then use the secret key (see the sketch below)
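
  • A toy sketch in C of the hybrid scheme: a textbook-RSA key exchange (tiny, insecure numbers chosen for the example) followed by a trivial XOR "stream cipher". Nothing here is real cryptography; it only illustrates the public-key-then-secret-key pattern:

```c
/* Toy hybrid encryption: wrap a session key with RSA, then use it. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* modular exponentiation: (base^exp) mod m */
static uint64_t modexp(uint64_t base, uint64_t exp, uint64_t m) {
    uint64_t result = 1;
    base %= m;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* XOR "stream cipher": the same call encrypts and decrypts */
static void xor_crypt(uint8_t key, char *buf, size_t n) {
    for (size_t i = 0; i < n; i++) buf[i] ^= key;
}

int main(void) {
    /* receiver's toy RSA key pair: n = 61*53 = 3233, e = 17, d = 2753 */
    uint64_t n = 3233, e = 17, d = 2753;

    /* sender picks a session key and wraps it with the PUBLIC key (e, n) */
    uint64_t session_key = 42;
    uint64_t wrapped = modexp(session_key, e, n);

    /* receiver unwraps the session key with its PRIVATE key (d, n) */
    uint64_t unwrapped = modexp(wrapped, d, n);

    /* both sides now share the fast secret key; use it for the bulk data */
    char msg[] = "attack at dawn";
    xor_crypt((uint8_t)unwrapped, msg, strlen(msg));   /* encrypt */
    xor_crypt((uint8_t)unwrapped, msg, strlen(msg));   /* decrypt */
    printf("session key %llu -> %llu, message: %s\n",
           (unsigned long long)session_key,
           (unsigned long long)unwrapped, msg);
    return 0;
}
```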

Networks

  • Set of computing nodes connected by communication links that allow data to be transferred between sender and receiver

  • Internetwork: network of networks

  • Many different types of networks; categorize them in different ways

  • - topology: the way they're connected (ring, star, bus, graph)

  • Ring: nodes are connected in a circle; information travels around the ring

  • Star: sending/receiving nodes are on the outside; the center is a hub

  • Bus: all nodes attach to a shared bus; a message sent on the bus is heard by everyone, but only the node with the matching address accepts it

  • Geographic coverage: how much territory they span

  • - LAN: local area network; span floor, building

  • - WAN: wide area network; span state, country

  • Circuit switching: if A wants to send something to B, A establishes a path from A to B, then all bits flow over that path

  • Packet switching: if A wants to send to B, the information hops from node to node until it reaches B, if possible

  • Protocol: agreed upon message format and transfer procedure

  • Multiparty: no central controller; involves cooperation of at least 2 nodes

  • Protocol is a set of expectations of how operation will happen

  • A message contains the data you want to transmit plus additional information that helps run the protocol (payload plus header)

  • Header contains source and destination addresses, state of the protocol operation, and error control (see the sketch below)
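
  • A sketch in C of what such a header might carry; the field names and sizes are invented, while real protocols (IP, TCP, ...) define their own exact layouts:

```c
/* Hypothetical protocol header carrying the fields listed above. */
#include <stdint.h>

struct msg_header {
    uint32_t src_addr;   /* source address */
    uint32_t dst_addr;   /* destination address */
    uint16_t seq_num;    /* state of the protocol operation */
    uint16_t checksum;   /* error control over the message */
    uint16_t length;     /* length of the payload that follows the header */
};
```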

  • Network has many layers

  • Physical: how to get 0's, 1's over wire

  • - format: electrical signal

  • - transfer procedure: change voltage at one end, go through wire to other end

  • Link: breaks the message into frames; can tell if anything went wrong with a frame; physical addressing decides who gets the message

  • Network: transfer from one network to another by translating logical address to physical address

  • Transport: break message into logical sections

  • Session: start, stop, manage connections

  • Presentation: format of message (e.g. JPG)

  • Application: actual application; everything has a protocol (email, browser)

  • Internet uses 3 levels of addresses

  • Domain names: convenient for humans; text name

  • Logical address: something internet understands

  • Physical address: e.g. an Ethernet address; the address on the underlying physical network

Size of Addresses
  • IPv4 uses 32-bit addresses

  • Internet grew and we ran out of addresses

  • IPv6 uses 128-bit addresses

Routing
  • How to get packet from A to B

  • Static: always make same decision

  • Dynamic: decision can change

Scalability
  • How growable something is

  • Network is scalable based on how much effort is needed to add a node

  • Make a network scalable by limiting the effect that adding a node has on the rest of the network

Error Control
  • Used to determine whether what was sent arrived intact

  • Parity: add a bit so that the number of 1's is even; a received word with an odd number of 1's signals an error

  • CRC: cyclic redundancy code; bits attached to the message that let the receiver detect corruption

  • Checksum: take the message, interpret it as an array of bytes, add them up, and attach the sum; the receiver recomputes the sum and compares (see the sketch below)

  • ARQ: automatic repeat request; send a message and, if the sender doesn't receive an acknowledgment within a certain amount of time, send it again
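
  • A minimal sketch in C of the checksum idea: sum the bytes, send the sum along, recompute on the other side. (Real protocols prefer stronger codes such as CRCs; the "arrival" below is simulated in one program):

```c
/* Byte-sum checksum sketch: sender computes, receiver recomputes. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static uint16_t checksum(const uint8_t *data, size_t n) {
    uint16_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];          /* wraps around mod 2^16 */
    return sum;
}

int main(void) {
    const char *msg = "hello, network";
    uint16_t sent = checksum((const uint8_t *)msg, strlen(msg));

    /* receiver side: recompute over what arrived and compare;
     * here the message arrived intact, so the sums match */
    uint16_t got = checksum((const uint8_t *)msg, strlen(msg));
    if (got == sent) printf("message ok\n");
    else             printf("error detected\n");
    return 0;
}
```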

Two Generals Problem
  • Two armies; whoever has more soldiers will win

  • Two hills; half of one army on each hill, other army in valley

  • If the two halves on the hills attack at different times, they lose; if they attack at the same time, they win

  • Only way to communicate is to send a soldier to other side

  • Every time a soldier leaves, the sender worries about whether the message got through

  • Moral: can't create perfect channel out of faulty one

Distributed Systems

  • Set of cooperating processes over a network

  • Characterize distributed systems by how integrated they are

  • - loose: e.g. Internet; possible to send a message somewhere and have a program run remotely

  • - tight: nodes know more about each other; support process migration

Advantages
  • Speed: lots of machines; lots of parallelism

  • Reliability: if one machine fails, whole system doesn't fail

  • - NSPF: no single point of failure

  • Scalability: easy to grow; add a machine to get more processing power

  • Geographic distribution: spread out machines; machines interact with closest ones

Disadvantages
  • Two fundamental problems with decentralized control

  • - fundamental: inherent in system; can't get rid of them

  • State uncertainty: no node knows exactly what's going on everywhere else

  • - can send messages, but sending takes time; there could be a change while the message is being sent

  • - no shared memory; no shared clock

  • Action uncertainty: even when everyone finds out about a change and everyone applies the same decision rule, the independently made decisions can conflict with each other

Is Distribution Better?
  • A single fast server with a single queue is better than multiple slower servers each with its own queue

  • An improvement over separate queues: a single queue feeding multiple slow servers; arrive in the one queue, go to the next available server

  • Little's Law: N = λW

  • - N = average number of things in the system needing work, λ = arrival rate, W = average time spent in the system
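
  • Worked example (made-up numbers): if requests arrive at a rate of λ = 10 per second and each spends W = 0.5 seconds in the system, then on average N = 10 × 0.5 = 5 requests are in the system at any time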

Client/Server Model
  • Processes are either clients or servers

  • Clients are small, lightweight, short-lived; make requests

  • Servers are big, long-lived; wait for requests

Peer-to-peer
  • Everyone is equal; asking each other for information

  • Processes alternate from being a client, server

Event Ordering
  • No shared clock; every process runs on its own local clock

  • All processes have to agree on ordering of events

  • - e.g. have to send before receive

  • All events are timestamped based on local clock

  • If a message arrives carrying a timestamp at or ahead of the local clock, advance the local clock past that timestamp (see the sketch below)
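
  • A minimal sketch in C of this rule (Lamport's logical clocks); the process setup is assumed for the example:

```c
/* Logical clocks: stamp events, and catch up on receive. */
#include <stdio.h>

typedef struct { long clock; } process;

/* internal event or message send: tick, and stamp the event */
static long local_event(process *p) { return ++p->clock; }

/* on receive: jump ahead of the message's timestamp if needed */
static long receive_event(process *p, long msg_ts) {
    if (msg_ts >= p->clock) p->clock = msg_ts;  /* catch up */
    return ++p->clock;                          /* then tick */
}

int main(void) {
    process a = {0}, b = {0};
    long t_send = local_event(&a);           /* a sends at time 1 */
    long t_recv = receive_event(&b, t_send); /* b stamps the receive */
    printf("send=%ld recv=%ld\n", t_send, t_recv); /* recv > send, always */
    return 0;
}
```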

Leader Election
  • Elect a leader in distributed system

  • One process takes a leadership role and coordinates the others

  • If one process fails, new one takes over

  • A process sends a request to the leader; if there is no reply, it tries to elect itself: it sends a request to all processes with a higher id

  • - if there are no replies, the process becomes the leader and tells all processes with a lower id

  • - if there is a reply, that higher process tries to become the leader instead (see the sketch below)
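
  • A single-machine sketch in C of this scheme (the bully algorithm); the process table and failure pattern are made up, and "messages" are just function calls:

```c
/* Bully election sketch: the highest-id live process wins. */
#include <stdio.h>

#define N 5
static int alive[N] = {1, 1, 0, 1, 0};  /* processes 2 and 4 have failed */
static int leader = -1;

static void elect(int i) {
    /* ask every higher-id process to take over */
    for (int j = i + 1; j < N; j++) {
        if (alive[j]) {   /* a "reply": the live higher process runs
                             its own election instead */
            elect(j);
            return;
        }
    }
    /* no replies: i becomes leader and tells the lower-id processes */
    leader = i;
    printf("process %d is the new leader\n", i);
}

int main(void) {
    elect(0);   /* process 0 noticed the old leader (4) stopped answering */
    return 0;   /* prints: process 3 is the new leader */
}
```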

Mutual Exclusion
  • Can't use semaphores since there is no shared memory

  • Centralized approach: one process acts as leader; processes that want to enter a critical section have to get permission from the leader; leader knows if others are in critical section

  • Distributed approach: process that wants to enter critical section has to get permission from everybody

Atomic Transactions
  • How do we let a piece of code run atomically, so that either the whole thing succeeds or nothing happened at all?

  • The pieces of a transaction may have to execute on different machines; a distributed transaction

  • How to ensure that if one piece fails, then whole transaction fails?

Two-Phase Commit Protocol
  • Protocol: set of rules

  • Phase 1: the coordinator logs a message and sends it to all sites involved in the transaction; each site tries to complete its piece, logs success or failure, and sends a reply back

  • Phase 2: the coordinator looks at all the replies; only if all were successful is the transaction committed as successful (see the sketch below)

  • All logging is done to a stable storage
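
  • A single-machine sketch in C of the two phases; the sites and the failure in this run are invented, and real systems do this over a network, logging to stable storage at each step:

```c
/* Two-phase commit sketch: collect votes, then one decision for all. */
#include <stdio.h>

#define SITES 3

/* each site tries its piece of the transaction and votes yes (1) or no (0);
 * a real site would log "prepared" to stable storage before voting */
static int prepare(int site) {
    return site != 1;   /* site 1 fails in this made-up run */
}

static void decide(int commit) {
    for (int s = 0; s < SITES; s++)
        printf("site %d: %s\n", s, commit ? "commit" : "abort");
}

int main(void) {
    int all_yes = 1;
    for (int s = 0; s < SITES; s++)   /* phase 1: collect every vote */
        if (!prepare(s)) all_yes = 0;
    decide(all_yes);                  /* phase 2: commit only if all yes */
    return 0;
}
```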

Stable Storage
  • Uses redundancy to build a robust storage system

  • Instead of one disk, use many disks; have multiple copies of everything

  • Disks have independent failures; if one fails, it doesn't affect others

  • We view the file we want to write as a set of logical blocks

  • Every logical block is implemented using 2 or more physical blocks

  • Write first block; if successful, then write second block

  • If one block has an error, copy good version to overwrite bad version

  • If both blocks are good but different, the crash happened between the two writes; copy the second (older) block onto the first to roll the write back
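
  • A minimal sketch in C of the write and recovery rules above; the block layout and the simulated crash are invented for the example:

```c
/* Stable storage sketch: two physical copies per logical block. */
#include <stdio.h>
#include <string.h>

struct physical { char data[32]; int valid; };  /* valid=0: detectable error */
struct logical  { struct physical copy[2]; };

static void stable_write(struct logical *b, const char *data) {
    strcpy(b->copy[0].data, data); b->copy[0].valid = 1;  /* write 1st... */
    strcpy(b->copy[1].data, data); b->copy[1].valid = 1;  /* ...then 2nd */
}

static void recover(struct logical *b) {
    if (!b->copy[0].valid)                       /* first copy bad */
        b->copy[0] = b->copy[1];
    else if (!b->copy[1].valid)                  /* second copy bad */
        b->copy[1] = b->copy[0];
    else if (strcmp(b->copy[0].data, b->copy[1].data) != 0)
        b->copy[0] = b->copy[1];                 /* differ: roll back */
}

int main(void) {
    struct logical b = {0};
    stable_write(&b, "version 1");
    strcpy(b.copy[0].data, "version 2");  /* simulate a crash mid-write */
    recover(&b);
    printf("%s / %s\n", b.copy[0].data, b.copy[1].data);  /* both version 1 */
    return 0;
}
```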

Common Knowledge Problem
  • n people wearing either a black or red hat

  • Each person can't see their own hat but can see others

  • Everyone is told publicly that at least one person has a black hat (hint)

  • Have to figure out if own hat is black

  • Everyone asked publicly if their hat is black

  • Everyone has to respond at the same time yes, no, or don't know

  • If there are m black hats, m rounds of questions are needed: with m = 1, the lone black-hat wearer sees no black hats and, because of the hint, answers yes immediately; with m = 2, each black-hat wearer sees one black hat, and when nobody answers yes in round 1, each concludes in round 2 that their own hat must also be black; and so on

  • The hint was necessary to tell the difference between all red hats and exactly one black hat

  • Public hint provides information about what others know

  • Common knowledge: K(x), K²(x), ..., Kⁿ(x) as n goes to ∞

  • - K(x): everyone knows x

  • - K²(x): everyone knows that everyone knows x

  • Impossible to achieve common knowledge with imperfect communication; Two Generals problem

  • Only way to have common knowledge in system is to have all machines start out with it