CPU Scheduling

School Projects

Since I started school last August I haven’t had a lot of time to devote to working on Colin and, consequently, I haven’t produced much new material for this site. Suffice it to say that working full time and going to school in the evenings leaves me with very little free time and I’m looking forward to the day when I can just do one or the other.

Winter quarter is over and, while I haven’t done much work on Colin, I have done a couple of interesting projects for school and I’d like to post one of them here. I’ll won’t be doing an in-depth tutorial. Instead I’ll just provide an outline of the project, the source code, and some comments one the more interesting aspects and some possible improvements.


Project Specification

My favorite project of last quarter was a CPU scheduler simulator. If you’re not familiar with CPU scheduling wikipedia has a great article that outlines the basics. It’s a topic that sounds kinda dull until you realize just about every computer you’ve ever used is completely dependent on good scheduling algorithms to function properly.

My assignment was to create a program that simulates a short term scheduler. The program needs two arguments when it’s called from the command line: the name of a file specifying the processes to be scheduled and run, and the identifier of one of the following scheduling policies:

The input file contains definitions for an arbitrary number of jobs. Each line in the file defines one job with the process id (PID), arrival time, CPU burst time, and priority of the job. The following line defines a job with a PID of 1, arrival time of 10 milliseconds, burst time of 20 milliseconds, and a priority of 3. Note that lower numbers indicate higher priority.

1 10 20 3

The program prints a notification to the console when a context switch occurs (when a new process is scheduled) or if the CPU goes idle. When all of the processes have completed it prints the CPU utilization, average waiting time, and worst case waiting time to the console.


Design Concerns

It would have been easy enough to simply write a different function for each of the four scheduling policies but this is a bad idea since they can share most of their code. Shortest Job First is nearly the same as Nonpreemptive priority scheduling, it just determines priority based on CPU burst time rather than an independent priority variable. The same is true for Shortest Remaining Time First and Preemptive Priority. So really I just need to write two functions: one for preemptive and one for nonpreemptive scheduling. As long as I can change what the two functions use to evaluate priority it should work like a charm!

Of course that last point is harder than it sounds, especially since I’m a relative newbie with C++. That’s one of the things I’ve really loved about getting more familiar with C++ though. If I’m writing code and I think, “man it would be really cool if I could make this do X,” I can always find a way to do it if I’m willing to put in the time to learn how. For instance, getting my scheduling functions to evaluate job priority involved learning to write comparison functions for the structs I used to represent jobs, and then to creating pointers to those comparison functions. Then I could just pass the function pointer to the scheduling functions. Easy peasy, right?


The Code

// Andrew Kramer
// 2/20/2016

// hw3.cpp
// simulates a CPU scheduler with one of four scheduling strategies:
// shortest job first, shortest remaining time first, nonpreemptive
// priority, and preemptive priority scheduling

// command line arguments:
// (1) the name of the input file
// (2) the name of the scheduling policy
//      - SJF  for shortest job first
//      - SRTF for shortest remaining time first
//      - NP   for nonpreemptive priority scheduing
//      - PP   for preemptive priority scheduling

// input file formatting:
//    file should contain definitions for an arbitrary number of jobs
//    each line should define one job with the following info:
//       (int) a unique process id
//       (int) arrival time
//       (int) CPU burst time
//       (int) priority (lower numbers have higher priority)

// exits with an error if an invalid filename or scheduling 
// policy is given in command line


#include<iostream>
#include<sstream>
#include<fstream>
#include<string>
#include<vector>
#include<queue>
#include<algorithm>
#include<iomanip>
using namespace std;

struct Process {
  int pid;
  int arrivalTime;
  int burstTime;
  int priority;
  int waitTime;
  bool complete;
};

// type definition for the comparison function pointers
// makes it possible to pass function pointers to the 
// scheduling functions
typedef bool (*compareFunc)(const Process*, const Process*);

bool comparePID(const Process *a, const Process *b) {
  return (a->pid > b->pid);
}

bool compareBurst(const Process *a, const Process *b) {
  if (a->burstTime == b->burstTime) {
    return comparePID(a, b);
  }
  return (a->burstTime > b->burstTime);
}

bool comparePriority(const Process *a, const Process *b) {
  if (a->priority == b->priority) {
    return comparePID(a, b);
  }
  return (a->priority > b->priority);
}

// print all of the processes in the given vector
// for testing purposes only
void printProcesses(vector<Process*> processes) {
  for (int i = 0; i < processes.size(); i++) {
    cout << processes[i]->pid << " "
	 << processes[i]->arrivalTime << " "
	 << processes[i]->burstTime << " "
	 << processes[i]->priority << endl;
  }
}

// accepts a string, the name of the input file
// returns a vector of pointers to processes
// checks if the file exists and reads the processes into
// a vector if so
// exits with an error if file does not exist
vector<Process*> readFile(string s) {
  const char *fileName = s.c_str();
  ifstream inFile(fileName);
  // check if input file exists
  if (!inFile) {
    cerr << "Input file " << fileName << " not found" << endl;
    exit(1);
  }
  string line;
  vector<Process*> processes;
  while (getline(inFile, line)) {
    istringstream iss(line);
    int thisPid, thisArrival, thisBurst, thisPriority;
    Process *thisProcess = new Process();
    iss >> thisProcess->pid >> thisProcess->arrivalTime
	>> thisProcess->burstTime >> thisProcess->priority;
    thisProcess->complete = false;
    // prevents invalid input from being added if input file
    // contains an extra \n at the end
    if (thisProcess->burstTime > 0) {
      processes.push_back(thisProcess);
    }
  }
  return processes;
}

// accepts a pointer to the running process, a vector of processes
// containing the ready heap, and an int: the current time
// checks if the running process has completed and marks it complete if so
// if the ready heap is empty prints a CPU idle message
bool runProcess(Process *running, 
		vector<Process*> &readyHeap, int time) {
  // if the running process is not marked complete
  if (!running->complete) {
    running->burstTime--;
    if (running->burstTime == 0) {
      running->complete = true;
      if (readyHeap.empty()) { // if no process is available on the ready heap
        cout << "Time " << time << " Idle" << endl;      
      }
    }
  }
}

// accepts two vectors of pointers of processes, a process comparison
// function, and two ints, the current time and number of processes 
// that have been run
// checks all processes to see if their arrival time has come and adds
// a process to the ready heap if so
void addProcess(vector<Process*> &processes, 
		vector<Process*> &readyHeap,
		compareFunc compare,
		int &time, int &curProcess) {
  for (int i = 0; i < processes.size(); i++) {
    // add process to ready heap if its arrival time has come
    if (processes[i]->arrivalTime == time) {
      readyHeap.push_back(processes[i]);
      push_heap(readyHeap.begin(), readyHeap.end(),
		compare);
      curProcess++;
    }
  }
}

// accepts a pointer to the running process, a vector of processes
// that stores the ready heap, a process comparison function, and
// two ints, the active time and idle time for the CPU
// if the running process has completed and another process is 
// available on the ready heap, starts running the new process
void completed(Process *&running, 
	       vector<Process*> &readyHeap,
	       compareFunc compare,
	       int &time, int &idleTime) {
  if (running->complete) {
    // check to see if another process is available on the heap
    // and run the top process on the heap if so
    if (!readyHeap.empty()) {
      running = readyHeap.front();
      pop_heap(readyHeap.begin(), readyHeap.end(),
	       compare);
      readyHeap.pop_back();
      cout << "Time " << time << " Process " << running->pid << endl;
    } else { // increment the CPU idle time
      idleTime++;
    }
  }
}

// accepts a vector of pointers to processes, a pointer to a 
// function that compares processes, and two ints representing
// the cumulative active time and idle time for the processes
// uses non-preemptive scheduling and the given comparison funciton
// to simulate a CPU scheduler
void nonPreemptive(vector<Process*> &processes, 
		  compareFunc compare,
		  int &activeTime, int &idleTime) {
  int time = 0, curProcess = 0;
  // create a dummy process to ensure the first process is
  // properly scheduled
  Process dummy; 
  dummy.complete = false;
  dummy.burstTime = 1;
  Process *running = &dummy;
  vector<Process*> readyHeap;
  // determine active CPU time
  for (int i = 0; i < processes.size(); i++) {
    activeTime += processes[i]->burstTime;
  }
  // while there are still processes to be run
  while (curProcess < processes.size() ||
	 !readyHeap.empty()) {
    addProcess(processes, readyHeap, compare, time, curProcess);
    runProcess(running, readyHeap, time); 
    // if the running process has completed
    completed(running, readyHeap, compare, time, idleTime);
    for (int i = 0; i < readyHeap.size(); i++) {
      readyHeap[i]->waitTime++;
    }
    time++;
  }
}

// accepts a vector of pointers to processes, a pointer to
// a function that compares pointers to processes, and two ints
// the active time and idle time for the processes
// uses preemptive scheduling and the given comparison function to
// to simulate process scheduling
void preemptive(vector<Process*> &processes, 
		compareFunc compare,
		int &activeTime, int &idleTime) {
  int time = 0, curProcess = 0;
  // dummy process to ensure first process is scheduled
  Process dummy;
  dummy.complete = false;
  dummy.burstTime = 1;
  Process *running = &dummy;
  vector<Process*> readyHeap;
  // sum burst times to find CPU active time
  for (int i = 0; i < processes.size(); i++) {
    activeTime += processes[i]->burstTime;
  }
  while (curProcess < processes.size() ||
	 !readyHeap.empty()) {
    addProcess(processes, readyHeap, compare, time, curProcess);
    runProcess(running, readyHeap, time);
    completed(running, readyHeap, compare, time, idleTime);
    // check if next process should preempt the running process
    if (!readyHeap.empty() && !running->complete) {
      if (compare(running, readyHeap.front())) {
	readyHeap.push_back(running);
	push_heap(readyHeap.begin(), readyHeap.end(),
		  compare);
	running = readyHeap.front();
	pop_heap(readyHeap.begin(), readyHeap.end(),
		 compare);
	readyHeap.pop_back();
	cout << "Time " << time << " Process " << running->pid << endl;
      }
    }
    // increment wait time for processes in the readyHeap
    for (int i = 0; i < readyHeap.size(); i++) {
      readyHeap[i]->waitTime++;
    }
    time++;
  }
}

// accepts a vector of pointers to Processes that have been run 
// and two ints, the active time and idle time for the processes
// calculates and prints the average wait time, worst case wait
// time, and CPU utilization for the given processes
void printStats(vector<Process*> &processes, 
		int activeTime, int idleTime) {
  int totalWait = 0, worstWait = -1;
  for (int i = 0; i < processes.size(); i++) {
    totalWait += processes[i]->waitTime;
    if (processes[i]->waitTime > worstWait) {
      worstWait = processes[i]->waitTime;
    }
  }
  double utilization = (double)activeTime / (double)(activeTime + idleTime);
  double averageWait = (double)totalWait / (double)processes.size();
  cout << "CPU utilization: " << setprecision(2) 
       << (utilization * 100) << "%" << endl;
  cout << "Average waiting time: " << fixed << setprecision(2) 
       << averageWait << endl;
  cout << "Worst-case waiting time: " << worstWait << endl;
}


int main(int argc, char *argv[]) {
  cout << endl;
  string fileName = argv[1], algorithm = argv[2];
  vector<Process*> processes = readFile(fileName);
  int idleTime = 0, activeTime = 0;
  if (algorithm.compare("SJF") == 0) {
    nonPreemptive(processes, compareBurst, activeTime, idleTime);
  } else if (algorithm.compare("SRTF") == 0) {
    preemptive(processes, compareBurst, activeTime, idleTime);
  } else if (algorithm.compare("NP") == 0) {
    nonPreemptive(processes, comparePriority, activeTime, idleTime);
  } else if (algorithm.compare("PP") == 0) {
    preemptive(processes, comparePriority, activeTime, idleTime);
  } else {
    cerr << "Invalid parameter:" << endl
	 << "2nd command line parameter must be "
	 << "\"SJF\", \"SRTF\", \"NP\", or \"PP\"." << endl;
    exit(1);
  }
  printStats(processes, activeTime, idleTime);
  cout << endl;
  processes.clear();
  return 0;
}

 

Possible Improvements

There are a couple of improvements that could be made to improve readability and generally cut down on the amount of code.

First, I think there’s probably a way to combine the functions preemptive() and nonpreemptive() . They’re basically the same apart from a few extra lines allowing  preemptive() to, you know, preempt. I wrote functions like runProcess() and addProcess()  for most of the behaviors common to both preemptive() and nonpreemptive() but I’m pretty sure I could combine them into a single function with a little extra work.

Second, and more importantly, I think readability could be vastly improved if I rewrote this using an object-oriented approach rather than the procedural approach seen above. For instance, if the ready queue of processes were it’s own object it could encapsulate all of the data about each of the processes as well as information like average waiting time, turnaround time, and so forth. It could also handle all of the pushing, popping, and heapifying behaviors automatically.

I would like to have implemented these improvements and others in my project but, due to my limited command of C++ and the short amount of time allocated for the project I wasn’t able to.

Overall it was a great project that probably taught me a lot more about C++ than it taught me about CPU scheduling. I had a lot of fun with it. If anyone has any recommendations to improve my code I’d love to hear them. Like I said I’m still pretty new to C++ so all help is appreciated!