Computer Systems Organization

CSCI-UA.0201(005), Spring 2018

Lab-5: Thread lab

Due: 5/6

Introduction

In this lab, you are going to implement a hash table that allows multiple threads to concurrently perform insert and lookup operations on it.

This is an individual lab.

Obtaining the lab

$ cd cso-labs/ 
$ git fetch upstream 
$ git merge upstream/master

This creates a number of new files in subdirectory threadlab/.

Change these files only

The only files you should modify for this lab are htable.{h,c} and rwlock.{h,c}

Step 1: Make a fixed-size hash table thread-safe

Let's first solve the easy problem of adding locks to a hash table that does not do resize.

Read the given hash table implementation in files htable.{h,c}. We can see that the hash table contains an array of pointers. Each slot of the array contains the head of a singly-linked list that chains together those inserted items that have hashed to the same slot. Function htable_init initializes the hash table with a certain number of slots. It also remembers whether the initialized hash table is allowed to resize itself or not. For step 1, the hash table does not resize.

To insert a tuple whose key is a C string and whose value is a (void *) pointer, function htable_insert computes a hashcode based on the key and finds the tuple's corresponding slot based on the hashcode. It then inserts the tuple into the linked list at that slot.

Add locks to the hash table code (you should only change htable.c and htable.h) so that multiple threads can insert and lookup items in the hash table concurrently. The easiest locking strategy is to use a single big lock. However, this strategy does not leave any room for concurrency (resulting in bad performance with multiple threads). You are expected to use a fine-grained locking strategy.

Hint: you only need to use these pthread functions: pthread_mutex_init, pthread_mutex_destroy, pthread_mutex_lock, pthread_mutex_unlock.

To test your code, run ./tester -t htable. This test uses a fixed-size hash table. Below is an example of a successful test.

$ make
$ ./tester -t htable 
Initialized hash table of size 10007
Spawned 4 threads, each to insert 500000 tuples
All 4 threads finished. Throughput is 1.578250 inserts/sec
Spawned 4 threads, each to insert or lookup 500000 tuples
All 4 threads finished. Throughput is 1.195037 lookups/sec
--- HTABLE TEST PASSED (final htable size 10007) 

You can change the number of threads that perform concurrent hash table operations by using the -n option. For example, ./tester -t htable -n 1 uses a single thread to perform hash table operations (i.e. there are no concurrent operations in this test). As another example, ./tester -t htable -n 2 performs hash table operations using two threads.

Step 2: Implement a reader-writer lock

In order to make hash table resize thread-safe, we are going to need an utility called the reader-writer lock. Read the skeleton code of the reader-writer lock in rwlock.{c,h}.

A reader-writer lock can be acquired in either the "read" (also referred to as "shared") or "write" (also referred to as "excluse") mode. In the given code skeleton, function rwl_rlock attempts to grab the lock in "read" mode, rwl_runlock releases the grabbed lock in "read" mode. Similarly, function rwl_wlock and rwl_wunlock locks and unlocks in "write" mode.

The reader-writer lock allows multiple readers but only one writer at a time. In other words, If the lock has been acquired in the "read" mode, then other threads can also grab the lock in the "read" mode without blocking. If the lock has been acquired in the "write" mode, then no other thread should be able to acquire the lock in either "read" or "write" mode.

Hint: you need to use conditional variable(s) to implement the reader-writer locks. The pthread functions related to conditional variables that you may need to use are: pthread_cond_init, pthread_cond_wait, pthread_cond_timedwait, pthread_cond_signal, pthread_cond_broadcast.

In order to faciliate testing, function rwl_rlock and rwl_wlock take a second argument "expire" and requires any conditional wait inside the function to last no later than absolute time "expire". Therefore, we ask you to use our wrapper function cond_timedwait instead of pthread_cond_wait . When cond_wait returns ETIMEDOUT if timeout happens. It returns 0 if it has been woken up by a pthread_cond_signal or pthread_cond_broadcast. In both cases, the corresponding mutex (the second argument of cond_wait) is locked.

The reader-writer lock can have different scheduling policies. For this lab, we ask you to prioritize the writer. In other words, as long as some writer is waiting to acquire the lock in "write" mode, no subsequent readers are allowed to acquire the lock in "read" mode.

To facilitate testing, we also ask you to implement rwl_nwaiter to return the number of threads who are currently waiting to acquire the lock in either mode.

To test your implementation, run ./tester -t rwl. Below is an example of a successful test:

$ ./tester -t rwl
Reader grabbed the lock
Another reader grabbed the lock
One reader unlocked
Another reader unlocked
Writer grabbed the lock
Writer unlocked
--- RWLOCK TEST PASSED basic correctness
Reader has locked but not yet unlocked
Spawned a writer who is waiting to lock
Spawned 8 readers who are waiting to lock
Writer locked before readers
--- RWLOCK TEST TEST PASSED (writer is given priority)

Step 3: Use reader-writer lock to make a resizable hash table thread-safe

Now that you have the reader-write lock, we can proceed to handle the scenario of hash table resize. In particular, we can think of the hash table resize operation as an "exclusive" activity: if a hash table resize is ongoing, no other threads can perform any other other hash table operations, such as lookups, inserts, and resize. On the other hands, hash table lookups and inserts can be viewed as "shared" activities: multiple hash table lookups and inserts can occur at the same time (provided that you have already locked properly in step 1) in the absence of resize.

Use your reader-writer lock in htable.{cc,h} to deal with resize. When you are done, test using ./tester -t resize. An example of a successful run is as follows:

Initialized hash table of size 10007
Spawned 4 threads, each to insert 500000 tuples
All 4 threads finished. Throughput is 2.007357 inserts/sec
Spawned 4 threads, each to insert or lookup 500000 tuples
All 4 threads finished. Throughput is 1.923626 lookups/sec
--- RESIZE TEST PASSED (final htable size 1282529) 

Run all tests

To run all tests together, type ./test -t all. Make sure you run the tests many times as concurrency bugs are non-deterministic and may show up rarely.

Handin Procedure

To handin your files, simply commit and push them to github.com
$ git commit -am "Finish lab5"
$ git push origin 
We will fetching your lab files from Github.com at the specified deadline and grade them.