Raspberry Pi to Arduino Communication For Robot Control


Arduino to Raspberry Pi communicationGood news, everyone! I’ve come up with a good way for Colin’s Raspberry Pi to talk to his Arduino. To review, the idea was that the Arduino could handle low-level control functions like speed control, odometry, and sensor reading. This leaves the Raspberry Pi free to handle high-level control like obstacle avoidance, motion planning, and state estimation.

There are a few steps in this process:


Freeing Up the Raspberry Pi’s Serial Port

The first problem we have to deal with is that the Raspbian reserves its serial port for use by a serial console. So the Raspberry Pi’s GPIO serial port is totally useless until you free it up. I’m not going into all the details here, but I found this guide really helpful. The important things to remember are that you need to enable uart in config.txt and disable the serial console. This will allow our program to use the serial port.

Top of page


The Communication Protocol

The communication protocol works like this:

  • The Raspberry Pi sends 2 16-bit ints to the Arduino
    • The first int is the commanded translational velocity
    • The second int is the commanded angular velocity
  • The Arduino sets its speeds accordingly and then updates its sonar sensors
  • After the sensors are updated, the Arduino sends 11 16-bit ints back to the Raspberry Pi
    • The first 8 ints are the distance readings from the 8 sonar sensors
    • The last 3 ints are the Colin’s x and y position and his heading, calculated from odometry

Using this protocol, Colin will run at the commanded speeds until he receives another command. This causes a problem: Colin will continue to run even if the serial communication fails and the Raspberry Pi stops sending commands all together. This means he could run off a cliff and there would be nothing to stop him!

To fix this I have the Raspberry Pi send commands at a regular interval. If, for example, Colin expects to get a new command every quarter second and he doesn’t get a command at the expected time, he will know there’s a communication problem. If Colin detects a communication fault he can respond in a fail-safe manner by stopping.

Communication Formatting

How do we send ints over serial though? Arduino’s Serial.print()  function converts int and float values to strings of chars before sending them. At first this might seem really convenient, but it actually causes more problems than it solves. First, Arduino and C++ don’t have a great function for converting char strings to int or float values. Second, C++ doesn’t automatically convert numerical types to strings before sending them via serial, so you’d need to write your own function for that. Lastly, if we convert numbers to strings, their lengths will be variable. This means we would need to define some way to tell when one value ends and the next begins.

The good news is none of that is necessary! You know why? Every command and sensor packet will be exactly the same length: 2 16-bit ints for commands and 11 16-bit ints for responses. Further, the ints will always come in the same order. So the meaning of each byte is predictable.

The only problem is you can only send individual bytes over serial. But this is easily solved by splitting the ints into their component bytes before sending:

char firstByte = (byte)(value & 0xFF);
char secondByte = (byte)((value >> 8) & 0xFF);

and reassembling them on the other end:

int value = (secondByte << 8) | firstByte;

Top of page


Wiring It Up

Wiring up the Raspberry Pi to the Arduino is pretty simple, but there’s an important catch. The Raspberry Pi uses 3.3 volt logic and the Arduino uses 5 volt logic. So we need to use a level shifter to allow communication between the two devices. If the level shifter gets a 3.3 volt signal on the low side, it sends out a 5 volt signal on the high side. If it gets a 5 volt signal on the high side it sends out a 3.3 volt signal on the low side. Pretty simple, right? Wire it up as shown below:

wiring for serial between an rPi and an Arduino

Wiring for serial between a Raspberry Pi and an Arduino


Top of page


The Code!


Okay, enough talk. Let’s get into the code. I’m going present the code for the Raspberry Pi side first, and follow it up with the Arduino code. The complete code for the Raspberry Pi can be found here and the Arduino code can be found here.

Raspberry Pi Code

Opening the Serial connection

First, we need to open a serial connection with the Arduino. This is handled by the following function, which I adapted from this extremely helpful site. Check out that site if you want details on how all of this works.

void SerialBot::openSerial()
{
	serialFd_ = open("/dev/serial0", O_RDWR);
																						// to allow blocking read
	if (serialFd_ == -1)
	{
		cerr << "Error - unable to open uart" << endl;
		exit(-1);
	}	
	
	struct termios options;
	tcgetattr(serialFd_, &options);
	options.c_cflag = B9600 | CS8 | CLOCAL | CREAD;
	options.c_iflag = IGNPAR;
	options.c_oflag = 0;
	options.c_lflag = 0;
	tcflush(serialFd_, TCIFLUSH);
	tcsetattr(serialFd_, TCSANOW, &options);
}

There is one important difference between my function above and the one it’s adapted from: I dropped the O_NOCTTY and O_NDELAY flags from the open command in line 3. This means my serial connection will be blocking. In other words, when I call the read() function the program execution stops until there is data to read in the serial buffer. In other words, my program will wait for a response from the Arduino before continuing.

Sending Commands

Sending a command works as follows:

int SerialBot::transmit(char* commandPacket)
{
	int result = -1;
	if (serialFd_ != -1) 
	{
		result = write(serialFd_, commandPacket, commandPacketSize);
	}
	return result;
}

And, in case you’re wondering, the function that assembles the commandPacket is below.

void SerialBot::makeCommandPacket(char* commandPacket)
{
	int16_t intAngular = (int)(angular_ * 1000.0);
	commandPacket[0] = (char)(translational_ & 0xFF);
	commandPacket[1] = (char)((translational_ >> 8) & 0xFF);
	commandPacket[2] = (char)(intAngular & 0xFF);
	commandPacket[3] = (char)((intAngular >> 8) & 0xFF);
}

Note that angular_ is a double representing Colin’s commanded angular velocity. The size and representation of doubles is inconsistent, however, so it’s difficult to break them up and reassemble them on a different machine. Int representations are very consistent, however, so I just multiply angular_ by 1000 to save the first three decimal places and cast it to an int. The loss of accuracy is pretty negligible for our purposes.

Receiving Data

The function below is how we receive data from the Arduino. Note that I’ve set the read to time out after 0.25 seconds. The Raspberry Pi expects to get a response from the Arduino after every command is sent. If it doesn’t receive a response before it’s time to send the next command, it throws an error.

int SerialBot::receive(char* sensorPacket)
{
	memset(sensorPacket, '\0', sensorPacketSize_);
	int rxBytes;
	if (serialFd_ != -1)
	{
		// set up blocking read with timeout at .25 seconds
		fd_set set;
		FD_ZERO(&set); // clear the file descriptor set
		FD_SET(serialFd_, &set); // add serial file descriptor to the set
		struct timeval timeout;
		timeout.tv_sec = 0;
		timeout.tv_usec = 250000;
		
		// wait for serial to become available
		int selectResult = select(serialFd_ + 1, &set, NULL, NULL, &timeout);
		if (selectResult < 0)
		{
			cerr << "blocking read failed" << endl;
			return -1;
		}
		else if (selectResult == 0)
		{
			cerr << "read failed: timeout occurred" << endl;
			return 0;
		}
		else
		{
			rxBytes = read(serialFd_, sensorPacket, numSonar_ + numPoseVariables);
		}
	}
	return rxBytes;
}

Once we’ve read data from the Arduino, we need to parse it:

int SerialBot::parseSensorPacket(char* sensorPacket)
{
	int16_t firstByte;
	int16_t secondByte;
	int16_t inValues[numSonar_ + numPoseVariables];
	for (int i = 0; i < numSonar_ + numPoseVariables; i++)
	{
		firstByte = sensorPacket[2 * i];
		secondByte = sensorPacket[(2 * i) + 1];
		inValues[i] = (secondByte << 8) | firstByte;
	}

	for (int i = 0; i < numSonar_; i++)
	{
		distances_[i] = inValues[i];
	}
	
	x_ = inValues[8];
	y_ = inValues[9];
	theta_ = ((double)inValues[10]) / 1000.0;
}

Note again that Colin’s heading, theta_ , is a double. To save some bother in programming, the double value is multiplied by 1000 and casted to an int before it’s sent. So it needs to be casted to a double and divided by 1000 after it’s received.

Putting It All Together

Okay, last thing: we’ll put all of these things together in a communication function that runs every 0.25 seconds in its own thread:

void SerialBot::commThreadFunction()
{
	while (true) 
	{
		char commandPacket[commandPacketSize];
		makeCommandPacket(commandPacket);
		if (transmit(commandPacket) < 1)
			cerr << "command packet transmission failed" << endl;
		char sensorPacket[sensorPacketSize_];
		memset(sensorPacket, '\0', sensorPacketSize_);
		int receiveResult = receive(sensorPacket);
		if (receiveResult < 1)
		{
			cerr << "sensor packet not received" << endl;
		}
		else if (receiveResult < commandPacketSize)
		{
			cerr << "incomplete sensor packet received" << endl;
		}
		else
		{
			parseSensorPacket(sensorPacket);
		}
		usleep(readPeriod_);
	}
}

Top of page


The Arduino Code

Are you still with me? That took a while, but we got one side of it done. So we just have the Arduino code left to deal with.

Receiving

Let’s start with receiving a command from the Raspberry Pi:

void readCommandPacket()
{
  byte buffer[4];
  int result = Serial.readBytes((char*)buffer, 4);

  if (result == 4) // if the correct number of bytes has been received
  {
    int commands[2];
    
    // assemble 16 bit ints from the received bytes in the buffer
    for (int i = 0; i < 2; i++)
    {
      int firstByte = buffer[2 * i];
      int secondByte = buffer[(2 * i) + 1];
      commands[i] = (secondByte << 8) | firstByte;
    }
    translational = commands[0]; 
    angular = (double)commands[1] / 1000.0; // convert received int to double angular velocity
    colin.drive(translational, angular); // set Colin's speeds
    commandReceived = true; // note that a command has been received
    lastCommandTime = millis();
  }
  else if (result > 0)
  {
    Serial.println("incomplete command");
  }
  // else do nothing and try again later
}

Note that I’m using the Serial.readBytes() function rather than the more common Serial.read() function. There’s a couple of reasons for this. First, Serial.read() only reads a char at a time, but we know we need 4 bytes. Serial.readBytes() also blocks the program’s execution until it receives the requested number of bytes. This is perfect, since it means we’ll get a complete packet, instead of just receiving part of one.

Transmitting

The transmit function first puts all the data that needs to be sent into an array, buffer. Then the buffer is sent to the Raspberry Pi using Serial.write() . Note that I’m not using Serial.print() because it automatically converts int values to characters, and we want to send the bytes exactly as-is.

void sendSensorPacket()
{
  colin.getPosition(x, y, theta); // updates Colin's position
  byte buffer[22];
  addDistances(buffer); // adds sonar readings to buffer
  int sendX = (int)x;
  int sendY = (int)y;
  int sendTheta = (int)(theta * 1000.0);
  buffer[16] = (byte)(sendX & 0xFF);
  buffer[17] = (byte)((sendX >> 8) & 0xFF);
  buffer[18] = (byte)(sendY & 0xFF);
  buffer[19] = (byte)((sendY >> 8) & 0xFF);
  buffer[20] = (byte)(sendTheta & 0xFF);
  buffer[21] = (byte)((sendTheta >> 8) & 0xFF);
  Serial.write(buffer, 22);
}

void addDistances(byte* buffer)
{
  for (int i = 0; i < NUM_SONAR; i++)
  {
    buffer[2 * i] = (byte)(sonarDistances[i] & 0xFF);
    buffer[(2 * i) + 1] = (byte)((sonarDistances[i] >> 8) & 0xFF);
  }
}

Bringing It All Together

The loop() function below brings everything together. It checks to see if there is data in the serial buffer and, if so, attempts to interpret it as a command. If it successfully reads a command, it requests an update from the sonar controller. After the Arduino gets updated sensor readings it assembles a response packet and sends it back to the Raspberry Pi.

Lastly, the Arduino checks to see if more than 1 second has passed since the last command was received. If so, it assumes that a communication fault has occurred and it stops Colin.

void loop() 
{ 
  // check if a command packet is available to read
  readCommandPacket();
  
  // request a sensor update if a command has been received
  if (commandReceived)
  {
    commandReceived = false;
    requestSonarUpdate(SONAR_ADDRESS);
  }
  // send sensor packet if sonar has finished updating
  if (distancesRead)
  {
    distancesRead = false; 
    sendSensorPacket();
  }
  currentTime = millis();
  
  // stop colin if a command packet has not been received for 1 second
  if (currentTime - lastCommandTime > 1000)
  {
    Serial.println("command not received for 1 second");
    lastCommandTime = millis();
    colin.drive(0, 0.0);
  }
}

Top of page


Where Do We Go From Here?

Now we have a good way to communicate between Colin’s Raspberry Pi and Arduino. Colin doesn’t have a way to perceive the world around him, however, so that’s our next step. I designed an independent controller to read Colin’s sonar sensors and relay the information to the Arduino via I2C. My next post will cover the finer details of my sonar controller and the associated communication protocol.

After we sort these details out and get the system working like we want, we can get to programming higher level behaviors. For example, I’m currently working on a wall-following program. I’m hoping it will be ready to present in a couple of weeks! Lots of good stuff to come, stay tuned!

Top of page

I’m Learning Multithreading!

Since this quarter started I’ve been doing a weekly interview practice session with friends from school. Last week we spent most of our hour on this:

Rotate Matrix: Given an image represented by an NxN matrix, where each pixel in the image is 4 bytes, write a method to rotate the image by 90 degrees. Can you do this in place?

For those interested, it’s exercise 1.7 in the sixth edition of Cracking the Coding Interview by Gayle Laakmann McDowell. It’s a great interview prep book that’s been really helpful since I started job hunting. I had a lot of fun with this one so I’d like to do a short write up on our thought process.


Row-to-Column Method

The obvious thing to do here is create a new NxN square matrix and transpose the rows of the old matrix into the columns of the new matrix as in the graphic below:

MatrixRotateRtC

The above method takes O(n) time and O(n) space where n is the number of entries in the matrix. Coding this up should be pretty simple so I’m going to skip ahead to the thing I’m actually interested in: the in-place method.


In-Place Method

We can’t get below O(n) time to rotate a matrix since every entry must be moved and there are n entries. However, it is possible to cut the space usage down from O(n) to O(1) by doing the rotation in place. See the graphic below for an illustration of the method:

MatrixRotateIP

How do we implement this in code? It’s a little difficult but the key insight is that it’s possible to find the destination index for any given entry in the matrix from its starting index. In general terms the destination index can be found using this relation

Destination for { i, j } = { j, n – i }

where n is the last index in the matrix. The really cool thing about this is that it works for any starting index. The graphic above only shows a rotation of the outer layer of the matrix but the above method works for any index in any size of square matrix.

The code I post on this blog has been getting longer and longer lately, so I’m only going to post the relevant functions on the site. If you want to see the whole program you can find it on my gitHub repo.

The program is called matrixRotate.cpp. It takes a .txt file containing the matrix to rotate as a command line argument. The text file should have one int  representing the size of the matrix on the first line. Each successive line should have a space delimited string of ints representing each row of the matrix. It prints the rotated matrix to the console.

The function below will take care of a single step in the matrix rotation. After the function completes 4 matrix entries are moved.

// accepts three ints, the starting row and column indices
// and the highest index in the matrix, NOT the size
// rotates 4 matrix entries by 90 degrees counterclockwise
void rotateLeftHelper(int row, int column, int n) 
{
  int temp = matrix[row][column];
  for (int i = 0; i < 3; i++) 
  {
    int nextRow = column;
    int nextColumn = n - row;
    matrix[row][column] = matrix[nextRow][nextColumn];
    row = nextRow;
    column = nextColumn;
  }
  matrix[row][column] = temp;
}

To rotate the whole matrix we need to run the above function on every entry in the first row from { 1, 1 } to { 1, n-1 }. Then every entry in the second row from { 2, 2 } to { 2, n-1 }. This needs to be repeated until we get to the starting index: { n/2, n/2 }. The function below will do this for us:

// accepts an int, the matrix size
// rotates the matrix by 90 degrees counterclockwise
void rotateLeft(int n) 
{
  int level = n - 1; // switch n from matrix size to highest index
  for (int i = 0; i < n/2; i++) 
  {
    for (int j = i; j < level; j++) 
    {
      rotateLeftHelper(i, j, n - 1);
    }
    level--;
  }
}

So when they work in concert, the above functions will rotate a matrix by 90 degrees counterclockwise and it’ll do it in place. That’s pretty cool, right? If you’ll recall though, the problem from Cracking the Coding Interview says the matrix represents an image. If it’s a raw image from a run-of-the-mill camera it will have 20 million entries. Doing each rotation in series will tie up the CPU for a long time. Fortunately, using the above  rotateLeftHelper() function, one instance of the function doesn’t affect the others. They never operate on the same part of the matrix! Do you know what that means? They can run in parallel and asynchronously!


Multi-Threaded Version

Really all we need to do here is create a new thread for every call to the rotateLeftHelper() function instead of running them in series. Cool, right?

The function below implements the in-place matrix rotation method using pthreads. See matrixRotate.cpp on my gitHub repo to see the whole program.

// accepts a pointer to a threadParams struct containing three ints, 
// the starting row and column indices
// and the highest index in the matrix, NOT the size
void* rotateLeftHelper(void* start) 
{
  threadParams* startState = (threadParams*) start;
  int row = startState->row;
  int column = startState->column;
  // cout << "starting thread on {" << row <<  ", " << column << "}" << endl;
  int n = startState->n;
  int temp = matrix[row][column];
  for (int i = 0; i < 3; i++) 
  {
    int nextRow = column;
    int nextColumn = n - row;
    matrix[row][column] = matrix[nextRow][nextColumn];
    row = nextRow;
    column = nextColumn;
  }
  matrix[row][column] = temp;
}

// accepts an int, the matrix size
// rotates a matrix by 90 degrees counterclockwise
void rotateLeft(int n) 
{
  int level = n - 1; // switch n from matrix size to highest index
  // calculate number of threads
  int numThreads = 0;
  for (int i = level; i >= 1; i -= 2) {
    numThreads += i;
  }
  pthread_t *threads;
  threadParams *parameters;
  threads = new pthread_t[numThreads];
  parameters = new threadParams[numThreads];
  int curThread = 0;
  for (int i = 0; i < n/2; i++) 
  {
    for (int j = i; j < level; j++) 
    {
      parameters[curThread].row = i;
      parameters[curThread].column = j;
      parameters[curThread].n = n - 1;
      pthread_create(&threads[curThread], NULL, 
		     &rotateLeftHelper, (void*) &parameters[curThread]);
      curThread++;
    }
    level--;
  }
  // join all threads
  for (int i = 0; i < numThreads; i++) 
  {
    pthread_join(threads[i], NULL);
  }
  delete[] threads;
  delete[] parameters;
}

Further Improvements

So the above code will make use of parallel processing to speed up the matrix rotation. However, creating and destroying a thread for each instance of the rotateLeftHelper() function involves an unnecessary amount of overhead. We could avoid that overhead by using a thread pool.

Also, we’re still running the threads on the CPU. In the best case scenario the CPU has 8 cores, so we’ll be getting a speedup factor of much less than 8. That’s good but not great. Also, we’re tying up the CPU to do a lot of extremely simple tasks. It would be better if we had a much larger number of much worse processors to do this task. Does that sound familiar? It’s called a graphics card and it’s what image processing programs on computers actually use to do tasks like this.

I think the best thing I could do for this program would be to figure out how to create threads that run on my graphics card using openGL.

That’s all for now though! Feel free to email me with questions.

CPU Scheduling

School Projects

Since I started school last August I haven’t had a lot of time to devote to working on Colin and, consequently, I haven’t produced much new material for this site. Suffice it to say that working full time and going to school in the evenings leaves me with very little free time and I’m looking forward to the day when I can just do one or the other.

Winter quarter is over and, while I haven’t done much work on Colin, I have done a couple of interesting projects for school and I’d like to post one of them here. I’ll won’t be doing an in-depth tutorial. Instead I’ll just provide an outline of the project, the source code, and some comments one the more interesting aspects and some possible improvements.


Project Specification

My favorite project of last quarter was a CPU scheduler simulator. If you’re not familiar with CPU scheduling wikipedia has a great article that outlines the basics. It’s a topic that sounds kinda dull until you realize just about every computer you’ve ever used is completely dependent on good scheduling algorithms to function properly.

My assignment was to create a program that simulates a short term scheduler. The program needs two arguments when it’s called from the command line: the name of a file specifying the processes to be scheduled and run, and the identifier of one of the following scheduling policies:

The input file contains definitions for an arbitrary number of jobs. Each line in the file defines one job with the process id (PID), arrival time, CPU burst time, and priority of the job. The following line defines a job with a PID of 1, arrival time of 10 milliseconds, burst time of 20 milliseconds, and a priority of 3. Note that lower numbers indicate higher priority.

1 10 20 3

The program prints a notification to the console when a context switch occurs (when a new process is scheduled) or if the CPU goes idle. When all of the processes have completed it prints the CPU utilization, average waiting time, and worst case waiting time to the console.


Design Concerns

It would have been easy enough to simply write a different function for each of the four scheduling policies but this is a bad idea since they can share most of their code. Shortest Job First is nearly the same as Nonpreemptive priority scheduling, it just determines priority based on CPU burst time rather than an independent priority variable. The same is true for Shortest Remaining Time First and Preemptive Priority. So really I just need to write two functions: one for preemptive and one for nonpreemptive scheduling. As long as I can change what the two functions use to evaluate priority it should work like a charm!

Of course that last point is harder than it sounds, especially since I’m a relative newbie with C++. That’s one of the things I’ve really loved about getting more familiar with C++ though. If I’m writing code and I think, “man it would be really cool if I could make this do X,” I can always find a way to do it if I’m willing to put in the time to learn how. For instance, getting my scheduling functions to evaluate job priority involved learning to write comparison functions for the structs I used to represent jobs, and then to creating pointers to those comparison functions. Then I could just pass the function pointer to the scheduling functions. Easy peasy, right?


The Code

// Andrew Kramer
// 2/20/2016

// hw3.cpp
// simulates a CPU scheduler with one of four scheduling strategies:
// shortest job first, shortest remaining time first, nonpreemptive
// priority, and preemptive priority scheduling

// command line arguments:
// (1) the name of the input file
// (2) the name of the scheduling policy
//      - SJF  for shortest job first
//      - SRTF for shortest remaining time first
//      - NP   for nonpreemptive priority scheduing
//      - PP   for preemptive priority scheduling

// input file formatting:
//    file should contain definitions for an arbitrary number of jobs
//    each line should define one job with the following info:
//       (int) a unique process id
//       (int) arrival time
//       (int) CPU burst time
//       (int) priority (lower numbers have higher priority)

// exits with an error if an invalid filename or scheduling 
// policy is given in command line


#include<iostream>
#include<sstream>
#include<fstream>
#include<string>
#include<vector>
#include<queue>
#include<algorithm>
#include<iomanip>
using namespace std;

struct Process {
  int pid;
  int arrivalTime;
  int burstTime;
  int priority;
  int waitTime;
  bool complete;
};

// type definition for the comparison function pointers
// makes it possible to pass function pointers to the 
// scheduling functions
typedef bool (*compareFunc)(const Process*, const Process*);

bool comparePID(const Process *a, const Process *b) {
  return (a->pid > b->pid);
}

bool compareBurst(const Process *a, const Process *b) {
  if (a->burstTime == b->burstTime) {
    return comparePID(a, b);
  }
  return (a->burstTime > b->burstTime);
}

bool comparePriority(const Process *a, const Process *b) {
  if (a->priority == b->priority) {
    return comparePID(a, b);
  }
  return (a->priority > b->priority);
}

// print all of the processes in the given vector
// for testing purposes only
void printProcesses(vector<Process*> processes) {
  for (int i = 0; i < processes.size(); i++) {
    cout << processes[i]->pid << " "
	 << processes[i]->arrivalTime << " "
	 << processes[i]->burstTime << " "
	 << processes[i]->priority << endl;
  }
}

// accepts a string, the name of the input file
// returns a vector of pointers to processes
// checks if the file exists and reads the processes into
// a vector if so
// exits with an error if file does not exist
vector<Process*> readFile(string s) {
  const char *fileName = s.c_str();
  ifstream inFile(fileName);
  // check if input file exists
  if (!inFile) {
    cerr << "Input file " << fileName << " not found" << endl;
    exit(1);
  }
  string line;
  vector<Process*> processes;
  while (getline(inFile, line)) {
    istringstream iss(line);
    int thisPid, thisArrival, thisBurst, thisPriority;
    Process *thisProcess = new Process();
    iss >> thisProcess->pid >> thisProcess->arrivalTime
	>> thisProcess->burstTime >> thisProcess->priority;
    thisProcess->complete = false;
    // prevents invalid input from being added if input file
    // contains an extra \n at the end
    if (thisProcess->burstTime > 0) {
      processes.push_back(thisProcess);
    }
  }
  return processes;
}

// accepts a pointer to the running process, a vector of processes
// containing the ready heap, and an int: the current time
// checks if the running process has completed and marks it complete if so
// if the ready heap is empty prints a CPU idle message
bool runProcess(Process *running, 
		vector<Process*> &readyHeap, int time) {
  // if the running process is not marked complete
  if (!running->complete) {
    running->burstTime--;
    if (running->burstTime == 0) {
      running->complete = true;
      if (readyHeap.empty()) { // if no process is available on the ready heap
        cout << "Time " << time << " Idle" << endl;      
      }
    }
  }
}

// accepts two vectors of pointers of processes, a process comparison
// function, and two ints, the current time and number of processes 
// that have been run
// checks all processes to see if their arrival time has come and adds
// a process to the ready heap if so
void addProcess(vector<Process*> &processes, 
		vector<Process*> &readyHeap,
		compareFunc compare,
		int &time, int &curProcess) {
  for (int i = 0; i < processes.size(); i++) {
    // add process to ready heap if its arrival time has come
    if (processes[i]->arrivalTime == time) {
      readyHeap.push_back(processes[i]);
      push_heap(readyHeap.begin(), readyHeap.end(),
		compare);
      curProcess++;
    }
  }
}

// accepts a pointer to the running process, a vector of processes
// that stores the ready heap, a process comparison function, and
// two ints, the active time and idle time for the CPU
// if the running process has completed and another process is 
// available on the ready heap, starts running the new process
void completed(Process *&running, 
	       vector<Process*> &readyHeap,
	       compareFunc compare,
	       int &time, int &idleTime) {
  if (running->complete) {
    // check to see if another process is available on the heap
    // and run the top process on the heap if so
    if (!readyHeap.empty()) {
      running = readyHeap.front();
      pop_heap(readyHeap.begin(), readyHeap.end(),
	       compare);
      readyHeap.pop_back();
      cout << "Time " << time << " Process " << running->pid << endl;
    } else { // increment the CPU idle time
      idleTime++;
    }
  }
}

// accepts a vector of pointers to processes, a pointer to a 
// function that compares processes, and two ints representing
// the cumulative active time and idle time for the processes
// uses non-preemptive scheduling and the given comparison funciton
// to simulate a CPU scheduler
void nonPreemptive(vector<Process*> &processes, 
		  compareFunc compare,
		  int &activeTime, int &idleTime) {
  int time = 0, curProcess = 0;
  // create a dummy process to ensure the first process is
  // properly scheduled
  Process dummy; 
  dummy.complete = false;
  dummy.burstTime = 1;
  Process *running = &dummy;
  vector<Process*> readyHeap;
  // determine active CPU time
  for (int i = 0; i < processes.size(); i++) {
    activeTime += processes[i]->burstTime;
  }
  // while there are still processes to be run
  while (curProcess < processes.size() ||
	 !readyHeap.empty()) {
    addProcess(processes, readyHeap, compare, time, curProcess);
    runProcess(running, readyHeap, time); 
    // if the running process has completed
    completed(running, readyHeap, compare, time, idleTime);
    for (int i = 0; i < readyHeap.size(); i++) {
      readyHeap[i]->waitTime++;
    }
    time++;
  }
}

// accepts a vector of pointers to processes, a pointer to
// a function that compares pointers to processes, and two ints
// the active time and idle time for the processes
// uses preemptive scheduling and the given comparison function to
// to simulate process scheduling
void preemptive(vector<Process*> &processes, 
		compareFunc compare,
		int &activeTime, int &idleTime) {
  int time = 0, curProcess = 0;
  // dummy process to ensure first process is scheduled
  Process dummy;
  dummy.complete = false;
  dummy.burstTime = 1;
  Process *running = &dummy;
  vector<Process*> readyHeap;
  // sum burst times to find CPU active time
  for (int i = 0; i < processes.size(); i++) {
    activeTime += processes[i]->burstTime;
  }
  while (curProcess < processes.size() ||
	 !readyHeap.empty()) {
    addProcess(processes, readyHeap, compare, time, curProcess);
    runProcess(running, readyHeap, time);
    completed(running, readyHeap, compare, time, idleTime);
    // check if next process should preempt the running process
    if (!readyHeap.empty() && !running->complete) {
      if (compare(running, readyHeap.front())) {
	readyHeap.push_back(running);
	push_heap(readyHeap.begin(), readyHeap.end(),
		  compare);
	running = readyHeap.front();
	pop_heap(readyHeap.begin(), readyHeap.end(),
		 compare);
	readyHeap.pop_back();
	cout << "Time " << time << " Process " << running->pid << endl;
      }
    }
    // increment wait time for processes in the readyHeap
    for (int i = 0; i < readyHeap.size(); i++) {
      readyHeap[i]->waitTime++;
    }
    time++;
  }
}

// accepts a vector of pointers to Processes that have been run 
// and two ints, the active time and idle time for the processes
// calculates and prints the average wait time, worst case wait
// time, and CPU utilization for the given processes
void printStats(vector<Process*> &processes, 
		int activeTime, int idleTime) {
  int totalWait = 0, worstWait = -1;
  for (int i = 0; i < processes.size(); i++) {
    totalWait += processes[i]->waitTime;
    if (processes[i]->waitTime > worstWait) {
      worstWait = processes[i]->waitTime;
    }
  }
  double utilization = (double)activeTime / (double)(activeTime + idleTime);
  double averageWait = (double)totalWait / (double)processes.size();
  cout << "CPU utilization: " << setprecision(2) 
       << (utilization * 100) << "%" << endl;
  cout << "Average waiting time: " << fixed << setprecision(2) 
       << averageWait << endl;
  cout << "Worst-case waiting time: " << worstWait << endl;
}


int main(int argc, char *argv[]) {
  cout << endl;
  string fileName = argv[1], algorithm = argv[2];
  vector<Process*> processes = readFile(fileName);
  int idleTime = 0, activeTime = 0;
  if (algorithm.compare("SJF") == 0) {
    nonPreemptive(processes, compareBurst, activeTime, idleTime);
  } else if (algorithm.compare("SRTF") == 0) {
    preemptive(processes, compareBurst, activeTime, idleTime);
  } else if (algorithm.compare("NP") == 0) {
    nonPreemptive(processes, comparePriority, activeTime, idleTime);
  } else if (algorithm.compare("PP") == 0) {
    preemptive(processes, comparePriority, activeTime, idleTime);
  } else {
    cerr << "Invalid parameter:" << endl
	 << "2nd command line parameter must be "
	 << "\"SJF\", \"SRTF\", \"NP\", or \"PP\"." << endl;
    exit(1);
  }
  printStats(processes, activeTime, idleTime);
  cout << endl;
  processes.clear();
  return 0;
}

 

Possible Improvements

There are a couple of improvements that could be made to improve readability and generally cut down on the amount of code.

First, I think there’s probably a way to combine the functions preemptive() and nonpreemptive() . They’re basically the same apart from a few extra lines allowing  preemptive() to, you know, preempt. I wrote functions like runProcess() and addProcess()  for most of the behaviors common to both preemptive() and nonpreemptive() but I’m pretty sure I could combine them into a single function with a little extra work.

Second, and more importantly, I think readability could be vastly improved if I rewrote this using an object-oriented approach rather than the procedural approach seen above. For instance, if the ready queue of processes were it’s own object it could encapsulate all of the data about each of the processes as well as information like average waiting time, turnaround time, and so forth. It could also handle all of the pushing, popping, and heapifying behaviors automatically.

I would like to have implemented these improvements and others in my project but, due to my limited command of C++ and the short amount of time allocated for the project I wasn’t able to.

Overall it was a great project that probably taught me a lot more about C++ than it taught me about CPU scheduling. I had a lot of fun with it. If anyone has any recommendations to improve my code I’d love to hear them. Like I said I’m still pretty new to C++ so all help is appreciated!