Thursday, February 28, 2013

What's the minimum number of steps to do a read-only lock?

Let's start by remembering what the read-lock looks like for the ScalableReentrantRWLock:




    public void sharedLock() {
        // Initialize a new Reader-state for this thread if needed        
        if (entry.get() == null) addState();     
       
        final AtomicInteger readersCount = entry.get().state;
       
        while (true) {
            if (writerOwner.get() == SRWL_INVALID_TID) {
                readersCount.set(1);
               if (writerOwner.get() == SRWL_INVALID_TID) {
                   // Acquired lock in read-only mode
                   return;
               } else {
                   // A Writer has acquired the lock, must reset to 0 and wait
                   readersCount.set(0);
               }
            }
            Thread.yield();
        }        
    }



Looking into the while() loop, we can see that the minimum number of operations needed to acquire the read-lock are three: writerOwner.get(), readersCount.set(1), writerOwner.get().
This means that in order to obtain a read-lock, you will need to do at least the following Atomic operations: two reads (loads) and one write (store). An interesting detail about this, is that the second writerOwner.get() should be a very quick operation because if there was no cache-line invalidation of writerOwner the value will still be in cache L1 due to the first writerOwner.get(), and not only that, on x86 the instruction is a simple mov (plus the compiler-level acquire-barrier). If the previous sentence didn't make sense, just go and listen to this starting at minute 31.

And now you ask: Dude, that's just 3 operations! Can we still improved it?
Yes we can!

Here's how: Instead of reading the state of the Writer and then setting the Reader's state to 1, we simply set the Reader's state to 1, and then if there is a Writer waiting, we rollback to state 0.
The problem with that approach is that if you repeat that logic then you might get into a livelock state between a Reader and a Writer, which is not good. We can only use this technique once (for the first time) and then we default to the regular technique that uses three operations. Sounds confusing? Let's look at some code:



    public void sharedLock() {
        // Initialize a new Reader-state for this thread if needed        
        if (entry.get() == null) addState();     
       
        final AtomicInteger readersCount = entry.get().state;

        // Optimistic approach: we hope that most of the time
        // the writerOwner will be SRWL_INVALID_TID and that will
        // save us one extra .get()
        readersCount.set(1);
        if (writerOwner.get() == SRWL_INVALID_TID) {
            // Acquired lock in read-only mode
            return;
        } else {
            // A Writer has acquired the lock, must set the reader's state
            // to 0 and go into the "regular" execution
            readersCount.set(0);
        }

        while (true) {
            if (writerOwner.get() == SRWL_INVALID_TID) {
                readersCount.set(1);
               if (writerOwner.get() == SRWL_INVALID_TID) {
                   // Acquired lock in read-only mode
                   return;
               } else {
                   // A Writer has acquired the lock, must reset to 0 and wait
                   readersCount.set(0);
               }
            }
            Thread.yield();
        }        
    }

Do we gain a performance improvement by doing this?
Not on x86, but maybe yes on architectures where the get() forces a full synchronization and saving a single get() may increase the performance noticeably. Anyone has a PowerPC or ARMv7 system that runs Java?

By the way, this whole idea came from Andreia's head... and I'm offering dinner to whoever can do a read-lock with less operations (than one get() and one set())   :)





No comments:

Post a Comment