HashSet is based on HashMap. If we look at the HashSet<E> implementation, everything is managed by a HashMap<E, Object>, with <E> used as the key of the HashMap.
We know that HashMap is not thread-safe, which is why Java has ConcurrentHashMap.
Given this, I am confused about why there is no ConcurrentHashSet based on ConcurrentHashMap.
Is there anything else that I am missing? I need to use a Set in a multi-threaded environment.
Also, if I want to create my own ConcurrentHashSet, can I achieve it by just replacing the HashMap with a ConcurrentHashMap and leaving the rest as is?
ConcurrentSkipListSet is built on ConcurrentSkipListMap, which implements ConcurrentNavigableMap and ConcurrentMap.
There's no built-in type for ConcurrentHashSet because you can always derive a set from a map. Since there are many types of maps, you use a method to produce a set from a given map (or map class).
Prior to Java 8, you produce a concurrent hash set backed by a concurrent hash map by using Collections.newSetFromMap(map).
In Java 8 (pointed out by @Matt), you can get a concurrent hash set view via ConcurrentHashMap.newKeySet(). This is a bit simpler than the old newSetFromMap, which required you to pass in an empty map object. But it is specific to ConcurrentHashMap.
Anyway, the Java designers could have created a new set interface every time a new map interface was created, but that pattern would be impossible to enforce when third parties create their own maps. It is better to have the static methods that derive new sets; that approach always works, even when you create your own map implementations.
Set<String> mySet = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
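As a rough sketch of the two factory approaches described above, the following builds a concurrent set both ways and lets several threads add the same values. The class name ConcurrentSetDemo and the helper countDistinct are hypothetical, chosen only for illustration; the second factory assumes Java 8 or later.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the JDK.
class ConcurrentSetDemo {

    // Each of `threads` workers adds the values 0..perThread-1 to the shared set.
    // Returns the size of the set once all workers have finished.
    static int countDistinct(Set<Integer> set, int threads, int perThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(i); // every worker adds the same values
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return set.size();
    }

    public static void main(String[] args) {
        // Java 6+: wrap an empty ConcurrentHashMap.
        Set<Integer> viaWrapper =
                Collections.newSetFromMap(new ConcurrentHashMap<Integer, Boolean>());
        // Java 8+: static factory on ConcurrentHashMap itself.
        Set<Integer> viaKeySet = ConcurrentHashMap.newKeySet();

        System.out.println(countDistinct(viaWrapper, 4, 1000)); // 1000
        System.out.println(countDistinct(viaKeySet, 4, 1000));  // 1000
    }
}
```

Both sets end up with exactly one copy of each value, even though four threads raced to insert them.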
With Guava 15 you can also simply use:
Set<String> s = Sets.newConcurrentHashSet();
Like Ray Toal mentioned it is as easy as:
Set<String> myConcurrentSet = ConcurrentHashMap.newKeySet();
It looks like Java provides a concurrent Set implementation with its ConcurrentSkipListSet. A SkipList Set is just a special kind of set implementation. It still implements the Serializable, Cloneable, Iterable, Collection, NavigableSet, Set, SortedSet interfaces. This might work for you if you only need the Set interface.
ConcurrentSkipListSet's elements should be Comparable (or the set must be constructed with a Comparator).
Avoid ConcurrentSkipListSet unless you want a SortedSet. A usual operation like add or remove should be O(1) for a HashSet, but O(log(n)) for a SortedSet.
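To illustrate the point about ordering and Comparable, here is a small sketch. The class SkipListOrderingDemo and its helpers are hypothetical names used only for this example. It shows that a ConcurrentSkipListSet iterates in sorted order and that elements which are not Comparable are rejected when no Comparator is supplied.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical demo class; not part of the JDK.
class SkipListOrderingDemo {

    // Copies the set into a list, preserving the set's iteration order.
    static List<String> iterationOrder(Set<String> set) {
        return new ArrayList<>(set);
    }

    // Returns true if adding plain Objects (not Comparable) fails.
    static boolean rejectsNonComparable() {
        Set<Object> set = new ConcurrentSkipListSet<>();
        try {
            set.add(new Object());
            set.add(new Object()); // must be compared with the first element
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> natural = new ConcurrentSkipListSet<>();
        natural.add("pear");
        natural.add("apple");
        natural.add("fig");
        System.out.println(iterationOrder(natural)); // [apple, fig, pear]

        // An explicit Comparator replaces the natural ordering.
        Set<String> reversed =
                new ConcurrentSkipListSet<>(Comparator.<String>reverseOrder());
        reversed.addAll(natural);
        System.out.println(iterationOrder(reversed)); // [pear, fig, apple]

        System.out.println(rejectsNonComparable()); // true
    }
}
```

If you only need membership tests and no ordering, a set backed by ConcurrentHashMap avoids paying for the sorted structure.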
As pointed out by this, the best way to obtain a thread-safe HashSet is by means of Collections.synchronizedSet():
Set s = Collections.synchronizedSet(new HashSet(...));
This worked for me and I haven't seen anybody really pointing to it.
EDIT: This is less efficient than the currently approved solution, as Eugene points out, since it just wraps your set in a synchronized decorator, while a ConcurrentHashMap actually implements low-level concurrency and can back your Set just as well. So thanks to Mr. Stepanenkov for making that clear.
http://docs.oracle.com/javase/8/docs/api/java/util/Collections.html#synchronizedSet-java.util.Set-
The synchronizedSet method just creates a decorator in Collections that wraps the set's methods, making them thread-safe by synchronizing on the whole collection. ConcurrentHashMap, by contrast, is implemented using non-blocking algorithms and "low-level" synchronization without any lock on the whole collection. So the wrappers from Collections.synchronized... perform worse in multi-threaded environments.
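Beyond raw performance, the two approaches also differ in how iteration behaves. The following single-threaded sketch (IterationSemanticsDemo and its method names are hypothetical) shows that the synchronized wrapper still exposes the fail-fast iterator of the underlying HashSet, so iteration has to be guarded by a lock on the wrapper as its Javadoc requires, while a set backed by ConcurrentHashMap tolerates modification during iteration.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; not part of the JDK.
class IterationSemanticsDemo {

    // Fills the set with 0..9, then modifies it while iterating.
    // Returns true if the iterator fails with ConcurrentModificationException.
    static boolean throwsOnModificationDuringIteration(Set<Integer> set) {
        for (int i = 0; i < 10; i++) {
            set.add(i);
        }
        try {
            for (Integer ignored : set) {
                set.add(-1); // structural change on the first pass, no-op afterwards
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iteration over a synchronized wrapper must hold the wrapper's lock.
    static int sumUnderLock(Set<Integer> syncSet) {
        int sum = 0;
        synchronized (syncSet) {
            for (int value : syncSet) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Set<Integer> wrapped = Collections.synchronizedSet(new HashSet<Integer>());
        Set<Integer> concurrent = ConcurrentHashMap.newKeySet();

        System.out.println(throwsOnModificationDuringIteration(wrapped));    // true
        System.out.println(throwsOnModificationDuringIteration(concurrent)); // false
        System.out.println(sumUnderLock(wrapped)); // 44 (0..9 plus -1)
    }
}
```

The demo is deliberately single-threaded so that it behaves the same on every run; with several threads the same failure on the wrapped set would show up only intermittently.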
You can use Guava's Sets.newSetFromMap(map) to get one. Java 6 also has that method in java.util.Collections.
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// A Set backed by a ConcurrentHashMap: elements are the keys, values are a shared placeholder.
public class ConcurrentHashSet<E> extends AbstractSet<E> implements Set<E> {
    private final ConcurrentMap<E, Object> theMap;
    // Placeholder value stored for every key; only the keys matter.
    private static final Object dummy = new Object();

    public ConcurrentHashSet() {
        theMap = new ConcurrentHashMap<E, Object>();
    }

    @Override
    public int size() {
        return theMap.size();
    }

    @Override
    public Iterator<E> iterator() {
        return theMap.keySet().iterator();
    }

    @Override
    public boolean isEmpty() {
        return theMap.isEmpty();
    }

    // Returns true if the element was not already present.
    @Override
    public boolean add(final E o) {
        return theMap.put(o, ConcurrentHashSet.dummy) == null;
    }

    @Override
    public boolean contains(final Object o) {
        return theMap.containsKey(o);
    }

    @Override
    public void clear() {
        theMap.clear();
    }

    // Returns true if the element was present and has been removed.
    @Override
    public boolean remove(final Object o) {
        return theMap.remove(o) == ConcurrentHashSet.dummy;
    }

    // Same result as add(), but never overwrites an existing mapping.
    public boolean addIfAbsent(final E o) {
        Object obj = theMap.putIfAbsent(o, ConcurrentHashSet.dummy);
        return obj == null;
    }
}
Why not use: CopyOnWriteArraySet from java.util.concurrent?
CopyOnWriteArraySet.contains() has a run-time of O(n) (it has to check every entry), whereas HashSet/HashMap has O(1).
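For context on that trade-off, CopyOnWriteArraySet is documented as best suited to small sets where reads and traversals greatly outnumber writes. A minimal sketch follows; the class CopyOnWriteSetDemo and the helper iterateWhileAdding are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical demo class; not part of the JDK.
class CopyOnWriteSetDemo {

    // Iterates over the set while adding to it and returns what the iterator saw.
    static List<String> iterateWhileAdding(Set<String> set) {
        List<String> seen = new ArrayList<>();
        for (String name : set) {
            set.add(name + "-copy"); // each successful add copies the backing array
            seen.add(name);
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<String> listeners = new CopyOnWriteArraySet<>();
        listeners.add("a");
        listeners.add("b");

        // The iterator works on a snapshot, so it does not see the new entries.
        System.out.println(iterateWhileAdding(listeners)); // [a, b]
        System.out.println(listeners);                     // [a, b, a-copy, b-copy]

        // contains() scans the backing array element by element: O(n).
        System.out.println(listeners.contains("b-copy"));  // true
    }
}
```

For a handful of elements the linear scan is negligible; for large sets with frequent lookups or writes, a set backed by ConcurrentHashMap is the better fit.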
... ConcurrentHashMap, you lose the benefits you'd get from ConcurrentHashMap?
newSetFromMap's implementation is found starting on line 3841 in docjar.com/html/api/java/util/Collections.java.html. It's just a wrapper....
Collections.newSetFromMap creates a SetFromMap. E.g. the SetFromMap.removeAll method delegates to the KeySetView.removeAll, which inherits from ConcurrentHashMap$CollectionView.removeAll. This method is highly inefficient at bulk-removing elements. Imagine removeAll(Collections.emptySet()) traverses all elements in the Map without doing anything. Having a ConcurrentHashSet that is correctly implemented will be better in most cases.