Operating System Concepts
- Processes and Threads
- System calls and Process scheduling
- Inter-process communication
- File system
- Virtual machines
1. Processes and Threads
Purpose of an OS
The purpose of the operating system is to provide an environment
in which a user can execute programs in a convenient
(abstraction between user software and hardware) and efficient
(resource management between users and programs) manner.
Different OS types
- Embedded - Does not load and execute arbitrary applications; the system runs a single, fixed application.
- Distributed - Computations are carried out on more than one machine, with the computers in a group working in cooperation.
- Network - To allow shared file and printer access among multiple computers in a network, to enable the sharing of data, users, groups, security, applications etc.
- Workstation - Workstation operating system is primarily designed to run applications. (e.g. Windows 7)
In order to ensure the proper execution of an operating system,
a distinction should be made between the execution of operating-system code and user-defined code.
We need two separate modes of operation: user mode and kernel mode.
Kernel mode can execute all machine instructions and reference all memory locations while user mode can't.
Process descriptor (PCB)
The process descriptor (PCB) serves as the repository for any information that may vary from process to process including register values,
logical state, type & location of resources it holds, allocated I/O devices, and the list of open files.
A process consists of:
- the object program to be executed (called the program text);
- the data on which the program will execute;
- resources required by the program;
- the process status.
Process states:
- New: the process is being created
- Running: instructions are being executed
- Ready: the process is waiting for the CPU (and is prepared to run at any time)
- Blocked: the process is waiting for some event to occur (and cannot run until it does)
- Exit: the process has finished execution
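The PCB fields and the five states above can be sketched as a toy model (the field and class names are illustrative, not from any real kernel):

```python
from dataclasses import dataclass, field

# Legal transitions in the five-state process model above.
TRANSITIONS = {
    "New": {"Ready"},
    "Ready": {"Running"},
    "Running": {"Ready", "Blocked", "Exit"},
    "Blocked": {"Ready"},
    "Exit": set(),
}

@dataclass
class PCB:
    """Toy process control block: per-process bookkeeping."""
    pid: int
    state: str = "New"
    registers: dict = field(default_factory=dict)
    open_files: list = field(default_factory=list)

    def set_state(self, new_state: str) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

p = PCB(pid=1)
p.set_state("Ready")
p.set_state("Running")
p.set_state("Blocked")   # e.g. the process waits for I/O
p.set_state("Ready")
```

Note that a process cannot go straight from New to Running: the model forces it through the ready queue, which is exactly what the short-term scheduler dispatches from.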
Process vs thread
Threads are used for small tasks, whereas processes are used for more 'heavyweight' tasks
– basically the execution of applications.
Another difference between a thread and a process is that threads within the same process share the same address space,
whereas different processes do not.
User level threads
- More configurable (any scheduling algorithm)
- Does not need kernel support (can run on a variety of OSes)
- Much faster (no switching between user and kernel space)
- I/O must be non-blocking (extra checks are needed to handle calls that would normally block)
- Cannot use multiple CPUs (the OS cannot dispatch threads to different processors, so a blocking thread blocks the entire process)
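The shared-address-space point can be demonstrated directly: in this sketch two threads append to the very same list object, something separate processes could not do without an explicit IPC mechanism.

```python
import threading

# Threads within one process share the same address space:
# both workers append to the same list object.
shared = []

def worker(tag):
    for i in range(3):
        shared.append((tag, i))

t1 = threading.Thread(target=worker, args=("a",))
t2 = threading.Thread(target=worker, args=("b",))
t1.start(); t2.start()
t1.join(); t2.join()

assert len(shared) == 6  # both threads wrote into the same memory
```

Had `worker` run in two separate processes instead, each would have appended to its own private copy of `shared`, and the parent's list would have stayed empty.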
2. System calls and Process scheduling
System call procedure
- A program in user space makes a function call to (for example) the C library.
- This will set up a system call with the function parameters
- which traps to the kernel to serve the request.
- Once the kernel gets control, the user state gets saved.
- The kernel does security and sanity checks and then attempts to fulfill the request.
- Then the user state is restored, the return value is placed on the stack or in a register, and control returns to user space.
- The C library routine reads the stack/register that the kernel just wrote and returns the result to the user-space program.
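As a small illustration of the wrapper idea, Python's `os.write` is a thin user-space wrapper around the write(2) system call; the byte count it returns is the value the kernel handed back after the trap:

```python
import os

# os.write sets up the write(2) system call, traps into the kernel,
# and the kernel's return value (bytes written) comes back to user space.
r, w = os.pipe()
n = os.write(w, b"hello")    # the kernel performs the actual I/O
assert n == 5                # return value produced in kernel mode
assert os.read(r, 5) == b"hello"
os.close(r); os.close(w)
```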
Types of schedulers
- The long-term scheduler (LTS) executes rarely (in CPU terms) and decides whether to admit new jobs and which ones to take.
- The medium-term scheduler (MTS) executes from time to time to make swapping decisions.
- The short-term scheduler, also known as the CPU scheduler or dispatcher,
executes very frequently to decide which process to run next on the CPU.
Scheduling criteria
- CPU utilization
- Turnaround time - amount of time to execute each process
- Waiting time - time in the ready queue
- Response time - time between request and the first response produced
The times should also be predictable, processes should not starve, and priorities and deadlines should be respected.
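The timing metrics above can be computed for a simple first-come first-served schedule; `fcfs_metrics` below is an illustrative helper, not a standard API:

```python
# Turnaround = completion - arrival; waiting = turnaround - burst.
def fcfs_metrics(jobs):
    """jobs: list of (arrival, burst) pairs, sorted by arrival time."""
    t, out = 0, []
    for arrival, burst in jobs:
        start = max(t, arrival)       # CPU may be busy with earlier jobs
        t = start + burst             # completion time of this job
        turnaround = t - arrival
        waiting = turnaround - burst  # time spent in the ready queue
        out.append({"turnaround": turnaround, "waiting": waiting})
    return out

m = fcfs_metrics([(0, 5), (1, 3), (2, 2)])
# job 0: finishes at 5  -> turnaround 5, waiting 0
# job 1: starts at 5, finishes at 8  -> turnaround 7, waiting 4
# job 2: starts at 8, finishes at 10 -> turnaround 8, waiting 6
```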
Non-preemptive vs preemptive schedulers
In a non-preemptive system, a
process is only replaced when it becomes blocked as a result of requesting an I/O
operation or voluntarily gives up control of the CPU. In a preemptive system, the
dispatcher uses a clock interrupt to stop processes after a fixed amount of execution
time (a timeslice).
Round-robin algorithm is a pre-emptive algorithm as the scheduler forces the process out of the CPU once the time quota expires.
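A minimal round-robin simulation (the burst lengths and quantum are made-up numbers) shows the preemption at quantum expiry:

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate RR scheduling; returns the completion order of process indices."""
    queue = deque((i, b) for i, b in enumerate(bursts))
    order = []
    while queue:
        i, remaining = queue.popleft()
        if remaining <= quantum:
            order.append(i)                         # finishes within its slice
        else:
            queue.append((i, remaining - quantum))  # preempted and requeued
    return order

round_robin([4, 1, 3], quantum=2)   # -> [1, 0, 2]
```

Process 1 (burst 1) finishes in its first slice, while processes 0 and 2 are each forced off the CPU once before completing.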
3. Inter-process communication
- Data transfer
- Sharing data
- Event notification
- Resource sharing
- Process control
- Message passing - useful for exchanging smaller amounts of data; it is also easier to implement for inter-computer communication
- Shared memory - allows maximum speed and convenience of communication – it does not require a system call for each message, which would slow the entire process down
- Signals - A system message sent from one process to another, not usually used to transfer data but instead used to remotely command the partnered process
- Pipes - A unidirectional data channel.
Data written to the write end of the pipe is buffered by the operating system until it is read from the read end of the pipe.
Two-way data streams between processes can be achieved by creating two pipes utilizing standard input and output.
A named pipe does not terminate when the processes that use it terminate.
- Sockets - A data stream sent over a network interface,
either to a different process on the same computer or to another computer on the network.
Typically byte-oriented, sockets rarely preserve message boundaries, so data written through a socket requires application-level framing to restore them.
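Because a stream socket does not preserve message boundaries, applications typically add their own framing. This sketch uses a 4-byte length prefix (one common convention, not a standard API) over a local socket pair:

```python
import socket, struct

# A stream socket is one long byte sequence: two send() calls may arrive
# as a single blob, so each message is prefixed with its length.
def send_msg(sock, payload: bytes):
    sock.sendall(struct.pack("!I", len(payload)) + payload)

def recv_msg(sock) -> bytes:
    header = sock.recv(4, socket.MSG_WAITALL)
    (length,) = struct.unpack("!I", header)
    return sock.recv(length, socket.MSG_WAITALL)

a, b = socket.socketpair()
send_msg(a, b"first")
send_msg(a, b"second")            # both messages share one byte stream
assert recv_msg(b) == b"first"
assert recv_msg(b) == b"second"   # the length prefix restores boundaries
a.close(); b.close()
```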
It is more secure to pass values in registers than in memory, because other processes cannot corrupt a register's contents.
DDE and OLE
The two fundamental IPC methods for the Windows OS are DDE and OLE.
Dynamic Data Exchange (DDE) is a message-based communication system between two applications. It is inefficient and should no longer be used.
Object Linking and Embedding (OLE), designed for creating compound documents by combining objects obtained from different application programs. With linking the application contains only a reference to an object, with embedding the object is actually stored as part of the source document data.
From OLE, the Component Object Model (COM) has been developed.
- Used for network computing in modern applications. Comes from DCOM (Distributed COM) that comes from COM.
- Aims to move computing power from a single computer to distributed computers over the network.
- Provides a platform- and language-neutral development environment for components that can easily inter-operate.
- Based on web services, which use an XML format standard that defines how data should be transported between applications over the HTTP protocol.
Memory protection hardware
Checks that each memory reference is greater than or equal to the base register and less than base plus limit. If not, the hardware traps to the OS with an addressing error. The base and limit registers can only be updated in kernel mode.
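The base-and-limit check reduces to a single comparison; `check_access` below is an illustrative helper, with the hardware trap modeled as a Python exception:

```python
# Base-and-limit protection: a reference is legal iff
# base <= address < base + limit; otherwise the hardware traps to the OS.
def check_access(address, base, limit):
    if base <= address < base + limit:
        return True
    raise MemoryError(
        f"addressing error: {address:#x} outside [{base:#x}, {base + limit:#x})"
    )

assert check_access(0x3100, base=0x3000, limit=0x800)   # inside the region
```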
Thrashing
Thrashing occurs when a computer's virtual memory subsystem is in a constant state of swapping and the page fault rate becomes high.
It arises when there is not enough physical memory, due to too many programs and/or programs with poor locality of reference.
The CPU then spends less time on productive work and more time swapping.
Thrashing can be detected by monitoring the page fault frequency and CPU utilisation.
If an increase in the number of processes leads to an increasing rate of page faults and decreasing CPU utilisation
at the same time, then the system is thrashing.
Working set and thrashing prevention
A working set is the set of pages that a process needs in memory at the same time; it varies between processes and during execution.
In the working-set model, if the sum of the working-set sizes exceeds the available physical memory, a process is suspended, freeing its memory for other processes.
Alternatively, the OS can check if the page fault frequency is too high and in that case allocate more frames to the process.
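The working set over the last delta references can be computed directly from the reference string; `working_set` is an illustrative helper, and the reference string is invented:

```python
# Working set W(t, delta): the distinct pages touched in the last
# delta memory references, sampled at reference index t.
def working_set(refs, t, delta):
    window = refs[max(0, t - delta + 1): t + 1]
    return set(window)

refs = [1, 2, 1, 3, 2, 2, 4]
working_set(refs, t=6, delta=4)   # last 4 refs are 3, 2, 2, 4 -> {2, 3, 4}
```

A small working set relative to delta indicates good locality of reference; a working set that keeps growing with delta is the kind of process that drives a system toward thrashing.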
Virtual memory main three strategies
- fetch - when to move a page/segment
- placement - where in real memory a process piece should be placed
- replacement - which processes or their parts to remove
For certain memory access patterns, increasing the number of page frames results in an increase in the number of page faults. This phenomenon, known as Belady's anomaly, is commonly experienced when using the first-in first-out (FIFO) page replacement algorithm.
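Belady's anomaly can be reproduced with the classic reference string 1,2,3,4,1,2,5,1,2,3,4,5: FIFO incurs 9 faults with 3 frames but 10 faults with 4.

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults under FIFO page replacement."""
    resident, queue, faults = set(), deque(), 0
    for page in refs:
        if page in resident:
            continue                               # hit: page already resident
        faults += 1
        if len(resident) == frames:
            resident.discard(queue.popleft())      # evict the oldest page
        resident.add(page)
        queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
fifo_faults(refs, 3)   # -> 9
fifo_faults(refs, 4)   # -> 10: more frames, yet more faults
```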
Paging stores and retrieves data from secondary storage for use in main memory. In this scheme, the operating system retrieves data from secondary storage in same-size blocks called pages. Paging is an important part of virtual memory implementations in modern operating systems, using secondary storage to let programs exceed the size of available physical memory.
Demand paging follows that pages should only be brought into memory if the executing process demands them. This is often referred to as lazy evaluation as only those pages demanded by the process are swapped from secondary storage to main memory. Contrast this to pure swapping, where all memory for a process is swapped from secondary storage to main memory during the process startup.
Segmentation is a memory management scheme that supports the logical view of memory. A logical-address space is a collection of segments. A logical address consists of: [segment-number, offset] where segment-number represents segment name.
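Translation of a [segment-number, offset] address can be sketched with a toy segment table; the base and limit values are invented for illustration:

```python
# Per-process segment table: segment-number -> (base, limit).
SEGMENT_TABLE = {0: (0x4000, 0x1000), 1: (0x8000, 0x400)}

def translate(segment, offset):
    """Map a logical [segment, offset] address to a physical address."""
    base, limit = SEGMENT_TABLE[segment]
    if offset >= limit:
        raise MemoryError("segmentation fault: offset beyond segment limit")
    return base + offset

assert translate(0, 0x20) == 0x4020
```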
Translation lookaside buffer (TLB)
- A fast hardware implementation of the page-table lookup.
- Recently used page table entries are loaded into the TLB in order to speed up the virtual-to- physical address conversion process.
- When a virtual address cannot be translated by the TLB (a TLB miss), the page table is consulted to find the appropriate frame; a page fault is raised only if the page is not in memory.
- Each entry contains a page number with the matching frame, privileges and sometimes an ASID (address-space ID). The ASID is used to check whether a TLB entry belongs to the currently running process.
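The TLB-as-cache behaviour can be sketched with a dictionary in front of a toy page table (the page-to-frame mappings are invented):

```python
# TLB as a small cache in front of the page table: a hit returns the
# frame immediately; a miss walks the page table and refills the TLB.
PAGE_TABLE = {0: 7, 1: 3, 2: 9}   # page -> frame (illustrative)
tlb = {}

def lookup(page):
    if page in tlb:
        return tlb[page], "hit"    # fast path: translation is cached
    frame = PAGE_TABLE[page]       # slow path: page-table walk
    tlb[page] = frame              # cache the translation for next time
    return frame, "miss"

lookup(1)   # first access walks the page table
lookup(1)   # repeated access hits the TLB
```

On a context switch, a real TLB must either be flushed or rely on ASIDs so that stale translations from the previous process are not used.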
5. File system
- field - group of characters
- record - group of fields
- file - group of related records/collection of data
- directory - special files containing the names and locations of other files
- filesystem - collection of files
Soft link (symbolic link/shortcut) - a directory entry containing the path to a file somewhere else. Moving or renaming the target file breaks a soft link.
Hard link - a directory entry pointing directly to the file's data on the storage device. A hard link remains valid even if the original name is removed; the data is freed only when the last hard link is deleted.
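Hard-link semantics can be observed with `os.link`, assuming a filesystem that supports hard links (e.g. ext4 or NTFS):

```python
import os, tempfile

# A hard link is a second directory entry for the same underlying data,
# so the content survives removal of the original name.
d = tempfile.mkdtemp()
original = os.path.join(d, "original.txt")
link = os.path.join(d, "hardlink.txt")

with open(original, "w") as f:
    f.write("still here")
os.link(original, link)   # create the hard link
os.remove(original)       # remove the original name

with open(link) as f:
    assert f.read() == "still here"   # data still reachable via the link
```

A symbolic link created with `os.symlink` instead would dangle after the `os.remove`, since it stores only the path, not the data.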
File related system calls
Manipulated as a unit: e.g. create, delete, open, close.
Manipulated as a data item from a file: e.g. read, write, seek.
Recovery capability of NTFS
The file system NTFS used by the Windows family of operating systems enables recovering to a consistent state following a system crash or disk failure.
The essence of the NTFS recovery capability is logging. Each operation that alters the file system is treated as a transaction. Each sub-operation of a transaction that alters important file-system data structures is recorded in a log file before being recorded on the disk volume. Using the log, a transaction that was only partially completed at the time of a crash can later be redone or undone when the system recovers.
This can be achieved by the following steps:
1. NTFS first calls the log file system to record in the log file (in the cache) any transactions that will modify the volume structure
2. NTFS modifies the volume (in the cache)
3. The cache manager calls the log file system to prompt it to flush the log file to disk
4. Once the log file updates are safely on disk, the cache manager flushes the volume changes to disk
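The four steps above can be sketched as write-ahead logging in miniature, with dictionaries standing in for the on-disk log and volume (a toy model, not NTFS's actual format):

```python
# Write-ahead logging: every change is appended to the log before it is
# applied, so a crash mid-transaction can be redone from the log.
log = []       # stands in for the on-disk log file
volume = {}    # stands in for the on-disk volume structures

def apply_transaction(txn_id, changes):
    for key, value in changes.items():
        log.append((txn_id, key, value))   # 1. record intent in the log
    log.append((txn_id, "COMMIT", None))   #    mark the transaction complete
    volume.update(changes)                 # 2. only then modify the volume

def redo(log, volume):
    """Recovery: replay only the committed transactions from the log."""
    committed = {t for t, k, _ in log if k == "COMMIT"}
    for t, k, v in log:
        if t in committed and k != "COMMIT":
            volume[k] = v

apply_transaction(1, {"mft_entry": "fileA"})
crashed_volume = {}                # pretend the volume write was lost
redo(log, crashed_volume)
assert crashed_volume == {"mft_entry": "fileA"}
```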
Security requirements and example threats
- Confidentiality - information on a system should only be accessible by authorised parties. Ex. information disclosure.
- Integrity - assets can only be modified by authorised parties. Ex. tampering with information
- Availability - assets are available. Ex. DoS
- Authentication. Ex. Spoofing.
- Access control. Ex. Unauthorized access
Intruder classes
- Masquerader: An individual who is not authorised to use a computer and who penetrates a system's access control to exploit a legitimate user's account. The masquerader is likely to be an outsider.
- Misfeasor: A legitimate user who accesses data, programs, or resources for which such access is not authorised, or who is authorised for such access but misuses his or her privileges. The misfeasor generally is an insider.
- Clandestine user: An individual who seizes supervisory control of the system and uses this control to evade auditing and access control or to suppress audit collection. The clandestine user can be either an outsider or insider.
Malicious software
- Bacteria: Program that consumes system resources by replicating itself.
- Logic bomb: Logic embedded in a computer program that checks for a certain set of conditions to be present on the system. When these conditions are met, it executes some function resulting in unauthorised actions.
- Trapdoor (backdoor): Secret undocumented entry point into a program, used to grant access without the normal methods of access authentication.
- Trojan horse: Secret undocumented routine embedded within a useful program. Execution of the program results in execution of the secret routine.
- Virus: Code embedded within a program that causes a copy of itself to be inserted in one or more other programs. In addition to propagation, the virus usually performs some unwanted function.
- Worm: Program that can replicate itself and send copies from computer to computer across network connections. Upon arrival, the worm may be activated to replicate and propagate again. In addition to propagation, the worm usually performs some unwanted function.
7. Virtual machines
Purpose with virtualization
Abstraction/replication, isolation/encapsulation, cross-compatibility/legacy applications, and software development/training.
The guest OS needs to call privileged instructions, manipulate page tables, and believe that it is running on a real machine.
Virtualization advantages and disadvantages
The advantages are efficient use of resources, cost and energy savings, fault and threat isolation, and simple backups.
The disadvantages are compromised performance, increased complexity, licensing costs and a single point of failure.
Virtual Machine Monitor (VMM)
The VMM (or hypervisor) is the software responsible for hosting and managing all virtual machines.
Functionality of hypervisor or Virtual Machine Monitor (VMM) varies greatly based on architecture and implementation.
The VMM implements VM hardware abstraction and is responsible for running guest OS.
The VMM has to partition and share CPU, memory, I/O devices.
- Full virtualization - No sensitive instructions issued by the guest OS are ever executed by the true hardware. Sensitive instructions are caught and replaced with calls to VMM procedures that handle them. User-level code is executed directly on the processor.
- Paravirtualization - Refers to communication between guest OS and hypervisor to improve performance and efficiency. It involves modifying OS kernel to replace non-virtualizable instructions with hypercalls that communicate directly with virtualization layer hypervisor.
- Hardware-assisted virtualization - Hardware vendors develop new features to simplify virtualization techniques. Privileged and sensitive calls are set to automatically trap to hypervisor, removing need for either binary translation or paravirtualization.
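Trap-and-emulate, the core idea behind full virtualization, can be sketched as follows; the instruction names and the `VMM` class are invented for illustration:

```python
# Toy trap-and-emulate: sensitive "instructions" are never run on the
# real hardware; the VMM intercepts them and emulates their effect on
# per-VM state, while ordinary instructions run natively.
class VMM:
    SENSITIVE = {"cli", "sti"}   # toy interrupt enable/disable instructions

    def __init__(self):
        self.vm_state = {"interrupts_enabled": True}

    def execute(self, instruction):
        if instruction in self.SENSITIVE:
            return self.emulate(instruction)    # trap into the VMM
        return f"ran {instruction} directly"    # user-level code: native

    def emulate(self, instruction):
        # Update the virtual machine's state instead of the real CPU's.
        self.vm_state["interrupts_enabled"] = (instruction == "sti")
        return f"emulated {instruction}"

vmm = VMM()
vmm.execute("add")   # ordinary instruction, executed directly
vmm.execute("cli")   # sensitive instruction, emulated by the VMM
```

The guest believes it disabled interrupts; in reality only its virtual state changed, which is precisely the illusion the VMM must maintain.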