Sunday, May 07, 2006

Lock free Concurrent Operations on Linked List

It's possible to have a linked list data structure that is "non blocking" in a uni-processor multitasking environment. Though "block free" the data structure is not wait free. Block free datastructures are more performant than "synchronized" data structure.

"Non-blocking" data structures depends on the hardware capability to perform "Compare and Swap" (CAS) as atomic / uninterruptible operations.

Le'ts see the linked list creation and insertions.

Class LinkedList
{
LinkedList next=null;
Data data=null;
.....


}

LinkedList root = null;


Here "root" is global object and let's assume we have two process (P1 & P2) which work on this object concurrently.

First let's see the functions to manipulate LinkedList in single task environment:

boolean createRootNode(LinkedList root, Data dt)
{
LinkedList root =new LinkedList();
root.next=null;
root.data=dt;
return true;
}

boolean insertAfterNode(LinkedList item, Data dt)
{
LinkedList newItem = new LinkedList();
newItem.next=item.next;
newItem.data = dt;
item.next=newItem;
return true;
}


The same functions will look like below in a multi task environment using "synchronized" blocks

1. boolean createRootNode(LinkedList root, Data dt)
2. {
3. LinkedList newItem =new LinkedList();
4. newItem.next=null;
5. newItem.data=dt;
6. synchronized(root)
7. {
8. if (root==null) // Possible that prior executed thread could have created the Root item.
9. {
10. root=newItem;
11. return true;
12. }
13. }
14. return false;
15. }
16.
17.
18. boolean insertAfterNode(LinkedList itemAfter, Data dt)
19. {
20. LinkedList newItem = new LinkedList();
21. newItem.data = dt;
22. synchronized(itemAfter)
23. //Technically this is not foolproof implemenations.
23 // Other threads can delete itemAfter
24. // before it's synchronized here.

25. //Le'ts assume that the "entire datastructure" is locked
26. {
27. newItem.next=itemAfter.next;
28. itemAfter.next=newItem;
29. }
30. return true;
31. }


Synchronized blocks avoids data corruption/inconsistency in data but it blocks other threads from updating the data structure when one thread is inside the synchronized block. This delay is generally considered inefficient in real time systems as faulty thread will block non-faulty thread and the resources are not utilized to the maximum extent. So much research as be done on formulating "Lock free" datastructures. Below is one such implementation (Single Processor - Multitasking System).

CAS Code Snippet:

1. boolean createRootNode(LinkedList root, Data dt)
2. {
3. LinkedList newValue =new LinkedList();
4. LinkedList oldValue=root;
4. newValue.next=null;
5. newValue.data=dt;
6. return
atomicreferencefieldupdater.compareAndSet(root,oldValue,newValue);
7. }
8.
9.
10. boolean insertAfterNode(LinkedList itemAfter, Data dt)
11. {
12. LinkedList newItem = new LinkedList();
13. newItem.data = dt;
14. newItem.next=itemAfter.next;
15. if (atomicreferencefieldupdater.compareAndSet(itemAfter.next
,newItem.next,newItem))
16. {
17. return true;
18. }else
19. {
19. newItem.next=null;
20. newItem=null;
21. return false;
21. }
22.
23. }


Here is how both the above functions works.

RootNode Creation:

createRootNode()
  • In this function a new node is created and initialized to the passed "Data".
  • The function then calls "compareAndSet(root,oldValue,newValue)" function which does:
    • compare the value stored in root with oldValue. If the values are same then "newValue" is assigned to root and CAS returns "true" if not CAS just returns "false".

Now assume that two process P1 & P2 tries to create root node concurrently.

Case- 1:

"Context Switch" happens on line 5 in P1
P2 completes creating the root

Result:
P1 fails in updating the root reference. Data structure is in consistent state.

Similarly for insertAfterNode() functions there are two possible inconsistent states

(*) itemAfter.next has been changed (by other process)
(*) itemAfter has been deleted (by other process)

Concurrent Insertion:

Initial structure

A -> E -> H

P1 - insert(A,"B")
P2 - insert(A,"D")

Both the process try to insert after A and only one can succeed.

Combination of context switch snapshot will be something like

P1 (started)
12. LinkedList newItem = new LinkedList();
13. newItem.data = dt;
14. newItem.next=itemAfter.next;
/*
newItem->next & itemAfter-> next = "E"
*/
-----------------------------------------------------------------------------
P2 (Context switch)

12. LinkedList newItem = new LinkedList();
13. newItem.data = dt;
14. newItem.next=itemAfter.next;
/*
newItem->next & itemAfter->next = "E"
*/

------------------------------------------------------------------------------
P1 (Context switch)

15. if (atomicreferencefieldupdater.compareAndSet
(itemAfter.next,newItem.next,newItem))
16. {
17. return true;
18. }

/*
nextItem->next="E" , itemAfter->next="B"
*/
------------------------------------------------------------------------------
P2 (Context switch)
15. if(atomicreferencefieldupdater.compareAndSet(itemAfter.next,newItem.next
, newItem))
/* fails nextItem->next="E" but itemAfter->next="B" */


Similarly if Process 2 (P2) is executed to completion in one execution P1 CAS will fail and returns "false" and the linked list will look like

"A" -> "D" -> "E" -> "H"

Concurrent Insertion and Deletion:

Now let's see how "Deletion" works:

P1 inserts new element "F" after "E".
P2 deletes the element "E"


P1 (Started)

12. LinkedList newItem = new LinkedList();
13. newItem.data = dt;
14. newItem.next=itemAfter.next;
/*
newItem->next & itemAfter->next = "E"
*/
------------------------------------------------------------------------------
P2 (Context switch)
xx. LinkedList item2Del=param1.next;
xx. return CAS(param1.next,item2Del,item2Del.next);

/* P2 executed to the completion
"A"->"B"->"D"->"H" ->"E->next=null"
Note: E cannot be GC'ed as the reference is still in P1
*/


-------------------------------------------------------------------------------------
P1 (Context Switch)

15. return CAS(reference,oldValue,newValue);

/* The above operations fails as
newItem.next="H" but itemAfter.next=null
*/

P1 has to be performed again.

References:
http://en.wikipedia.org/wiki/Compare-and-swap
http://citeseer.ist.psu.edu/herlihy93waitfree.html

It's quite unclear for me as how all this will work in a multiprocessor environment except to know that Multiprocessor environments uses some sort of semaphore/mutex to access the shared memory (if it's so then it's not truly lock free).