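Guava: Collections (New Collection Types)

Date 2019-12-22 Tags guava, collections, new collection types

1. New Collection Types

Guava introduces a number of collection types that the JDK lacks but that have proven broadly useful. They are designed to coexist with the JDK collections framework rather than shoehorn extra concepts into the JDK collection abstractions. As a general rule, Guava collections follow the JDK interface contracts very precisely.

1.1 Multiset

The traditional way to count how many times each word occurs in a document looks like this: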

Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : words) {
	Integer count = counts.get(word);
	if (count == null) {
		counts.put(word, 1);
	} else {
		counts.put(word, count + 1);
	}
}

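This is awkward, error-prone, and does not support collecting several statistics at once, such as the total number of words.

Guava provides a new collection type, Multiset, which allows equal elements to be added multiple times. Wikipedia defines a multiset mathematically as "a generalization of the notion of set in which members are allowed to appear more than once... In multisets, as in sets and in contrast to tuples, the order of elements is irrelevant: the multisets {a, a, b} and {a, b, a} are equal." Translator's note: "set" here is the mathematical concept. Multiset extends the JDK Collection interface, not the Set interface, so holding duplicate elements does not violate any existing interface contract.

The difference between a Multiset and a Set is that a Multiset can hold multiple equal objects. In the JDK, List and Set differ fundamentally: a List may contain the same object more than once and is ordered, while a Set contains no duplicates and does not guarantee order (some implementations are ordered, such as LinkedHashSet and SortedSet). Multiset therefore occupies a grey area between List and Set: duplicates are allowed, but order is not guaranteed.

Common use case: a Multiset tracks how many times each object occurs, so it is a natural fit for counting. Without it, a typical hand-written implementation looks like this: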

public void testWordCount() {
	String strWorld = "wer|dffd|ddsa|dfd|dreg|de|dr|ce|ghrt|cf|gt|ser|tg|ghrt|cf|gt|"
			+ "ser|tg|gt|kldf|dfg|vcd|fg|gt|ls|lser|dfr|wer|dffd|ddsa|dfd|dreg|de|dr|"
			+ "ce|ghrt|cf|gt|ser|tg|gt|kldf|dfg|vcd|fg|gt|ls|lser|dfr";
	String[] words = strWorld.split("\\|");
	Map<String, Integer> countMap = new HashMap<String, Integer>();
	for (String word : words) {
		Integer count = countMap.get(word);
		if (count == null) {
			countMap.put(word, 1);
		} else {
			countMap.put(word, count + 1);
		}
	}
	System.out.println("countMap：");
	for (String key : countMap.keySet()) {
		System.out.println(key + " count：" + countMap.get(key));
	}
}

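The code above does something very simple: it records how many times each string occurs in an array. This scenario comes up often in practice, and a concrete class implementing the Multiset interface handles it with much less code: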

public void testMultsetWordCount() {
	String strWorld = "wer|dfd|dd|dfd|dda|de|dr";
	String[] words = strWorld.split("\\|");
	List<String> wordList = new ArrayList<String>();
	for (String word : words) {
		wordList.add(word);
	}
	Multiset<String> wordsMultiset = HashMultiset.create();
	wordsMultiset.addAll(wordList);

	for (String key : wordsMultiset.elementSet()) {
		System.out.println(key + " count：" + wordsMultiset.count(key));
	}
}

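The main methods defined by the Multiset interface are:

add(E element): adds a single occurrence of the element
add(E element, int occurrences): adds the given number of occurrences of the element
count(Object element): returns the number of occurrences of the given element
remove(Object element): removes one occurrence of the element, decreasing its count by one
remove(Object element, int occurrences): removes the given number of occurrences of the element
elementSet(): returns the distinct elements as a Set
entrySet(): similar to Map.entrySet(); returns a Set<Multiset.Entry<E>> whose entries support getElement() and getCount()
setCount(E element, int count): sets the count of an element
setCount(E element, int oldCount, int newCount): sets the count to newCount only if the current count equals oldCount
retainAll(Collection c): keeps only the elements that are contained in the given collection
removeAll(Collection c): removes all elements that are contained in the given collection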
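
The methods listed above can be exercised together in a short sketch. It assumes Guava is on the classpath; the class name MultisetOpsSketch and the sample elements are made up for illustration.

```java
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;

// Hypothetical demo class; only the Multiset calls come from Guava.
class MultisetOpsSketch {
    // Builds a small multiset by applying the operations one after another.
    static Multiset<String> build() {
        Multiset<String> words = HashMultiset.create();
        words.add("a");            // a x1
        words.add("b", 3);         // b x3
        words.remove("b");         // b x2 (one occurrence removed)
        words.remove("b", 1);      // b x1
        words.setCount("c", 4);    // c x4
        words.setCount("c", 4, 2); // applied only because the current count is 4: c x2
        return words;
    }

    public static void main(String[] args) {
        Multiset<String> words = build();
        // entrySet() yields one entry per distinct element
        for (Multiset.Entry<String> entry : words.entrySet()) {
            System.out.println(entry.getElement() + " x" + entry.getCount());
        }
        System.out.println("distinct: " + words.elementSet().size());
        System.out.println("total: " + words.size());
    }
}
```

With these calls the multiset ends up holding one "a", one "b", and two "c", so it has three distinct elements and four occurrences in total.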

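Multiset is not a Map

Note that a Multiset<E> is not a Map<E, Integer>, although it offers some similar functionality. Other differences worth noting:

An element's count in a Multiset is always positive and never exceeds Integer.MAX_VALUE. An element whose count is set to 0 does not appear in the multiset, nor in the results of elementSet() or entrySet().

multiset.size() returns the total number of occurrences, that is, the sum of all counts. For the number of distinct elements, use elementSet().size(). (Calling add(E) therefore increases multiset.size() by one.)

multiset.iterator() iterates over each occurrence of each element, so the length of the iteration equals multiset.size().

Multiset supports adding and removing multiple occurrences and setting an element's count directly. setCount(element, 0) is equivalent to removing all occurrences of the element.

multiset.count(elem) returns 0 for an element that is not in the multiset.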
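
The differences above can be checked with a small sketch. It assumes Guava is on the classpath; the class name MultisetVsMapSketch and its helper methods are hypothetical.

```java
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;

// Hypothetical demo class illustrating how Multiset differs from Map<E, Integer>.
class MultisetVsMapSketch {
    static Multiset<String> sample() {
        Multiset<String> m = HashMultiset.create();
        m.add("x", 2);
        m.add("y");
        return m;
    }

    // Counts how many elements the iterator yields (one per occurrence).
    static int iterationLength(Multiset<String> m) {
        int n = 0;
        for (String ignored : m) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Multiset<String> m = sample();
        System.out.println(m.size());              // total occurrences: 3
        System.out.println(m.elementSet().size()); // distinct elements: 2
        System.out.println(iterationLength(m));    // same as size(): 3
        System.out.println(m.count("absent"));     // 0, never null
        m.setCount("x", 0);                        // removes every occurrence of "x"
        System.out.println(m.contains("x"));       // false
    }
}
```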

Multiset implementations

Guava provides several Multiset implementations, which roughly correspond to the JDK Map implementations:

Map	Corresponding Multiset	Supports null elements
HashMap	HashMultiset	Yes
TreeMap	TreeMultiset	Yes (if the comparator supports it)
LinkedHashMap	LinkedHashMultiset	Yes
ConcurrentHashMap	ConcurrentHashMultiset	No
ImmutableMap	ImmutableMultiset	No

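1.2 Multimap

In day-to-day development we sometimes need to build more complex collection structures such as Map<K, List<V>> or Map<K, Set<V>> to support our business logic.

A structure like Map<String, List<StudentScore>> StudentScoreMap = new HashMap<String, List<StudentScore>>() is tedious to maintain by hand: you have to check whether the key exists, create the list if it does not, and append to the list if it does. Checking whether an object is in a list, removing an object, or traversing the whole structure takes still more code.

Guava's Multimap offers a convenient data structure that maps one key to multiple values. It lets us implement the structure above simply and cleanly, so that time and effort go into business logic rather than data-structure plumbing. Let's look at Multimap in detail, starting with the hand-written version for comparison: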

public class MultimapTest {

    Map<String, List<StudentScore>> StudentScoreMap = new HashMap<String, List<StudentScore>>();
    
    public void testStudentScore(){
        
        for(int i=10;i<20;i++){
            StudentScore studentScore=new StudentScore();
            studentScore.CourseId=1001+i;
            studentScore.score=100-i;
            addStudentScore("peida",studentScore);
        }
        
        System.out.println("StudentScoreMap:"+StudentScoreMap.size());
        System.out.println("StudentScoreMap:"+StudentScoreMap.containsKey("peida"));
            
        System.out.println("StudentScoreMap:"+StudentScoreMap.containsKey("jerry"));
        System.out.println("StudentScoreMap:"+StudentScoreMap.size());
        System.out.println("StudentScoreMap:"+StudentScoreMap.get("peida").size());

        List<StudentScore> StudentScoreList=StudentScoreMap.get("peida");
        if(StudentScoreList!=null&&StudentScoreList.size()>0){
            for(StudentScore stuScore:StudentScoreList){
                System.out.println("stuScore one:"+stuScore.CourseId+" score:"+stuScore.score);
            }
        }
    }
    
    public void addStudentScore(final String stuName,final StudentScore studentScore) {
        List<StudentScore> stuScore = StudentScoreMap.get(stuName);
        if (stuScore == null) {
            stuScore = new ArrayList<StudentScore>();
            StudentScoreMap.put(stuName, stuScore);
        }
        stuScore.add(studentScore);
    }
}

class StudentScore{
    int CourseId;
    int score;
}

The same code and data structure implemented with Multimap:

public class MultimapTest {

	public void teststuScoreMultimap(){
	    Multimap<String,StudentScore> scoreMultimap = ArrayListMultimap.create();
	    for(int i=10;i<20;i++){
	        StudentScore studentScore=new StudentScore();
	        studentScore.CourseId=1001+i;
	        studentScore.score=100-i;
	        scoreMultimap.put("peida",studentScore);
	    }
	    // size() counts key-value pairs (10 here), not distinct keys
	    System.out.println("scoreMultimap:"+scoreMultimap.size());
	    // keys() is a Multiset of keys: each key appears once per value
	    System.out.println("scoreMultimap:"+scoreMultimap.keys());
	}
}

class StudentScore{
    int CourseId;
    int score;
}

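Multimap also supports a number of powerful views:

asMap() views a Multimap<K, V> as a Map<K, Collection<V>>. The returned map supports remove and changes to the returned collections, but not put or putAll. Notably, when you want null for an absent key rather than a fresh, writable empty collection, you can call asMap().get(key). (You can cast the result of asMap().get(key) to the appropriate collection type, a Set for a SetMultimap or a List for a ListMultimap, but a ListMultimap cannot be cast directly to a Map<K, List<V>>.)
entries() views all key-value pairs in the Multimap as a Collection<Map.Entry<K, V>>.
keySet() views the distinct keys in the Multimap as a Set.
keys() views the keys as a Multiset, in which each key occurs as many times as it has associated values. Elements can be removed from this Multiset, which changes the underlying Multimap, but not added.
values() views all values in the Multimap as one "flattened" Collection<V>. This is similar to Iterables.concat(multimap.asMap().values()), except that it returns a full Collection.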
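
A minimal sketch of these views follows. It assumes Guava is on the classpath; the class name MultimapViewsSketch, its helper methods, and the sample scores are invented for illustration.

```java
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.ListMultimap;

// Hypothetical demo class; the view methods are the ones described above.
class MultimapViewsSketch {
    static ListMultimap<String, Integer> sample() {
        ListMultimap<String, Integer> scores = ArrayListMultimap.create();
        scores.put("peida", 90);
        scores.put("peida", 85);
        scores.put("jerry", 70);
        return scores;
    }

    // Removes entries through two different views and reports what is left.
    static int sizeAfterRemovals() {
        ListMultimap<String, Integer> scores = sample();
        scores.keys().remove("peida");  // removes one key-value pair for "peida"
        scores.asMap().remove("jerry"); // removes every value for "jerry"
        return scores.size();
    }

    public static void main(String[] args) {
        ListMultimap<String, Integer> scores = sample();
        System.out.println(scores.asMap());   // key -> collection of values
        System.out.println(scores.entries()); // one entry per key-value pair
        System.out.println(scores.keySet());  // distinct keys
        System.out.println(scores.keys());    // keys with multiplicity
        System.out.println(scores.values());  // all values, flattened
        System.out.println(scores.asMap().get("nobody")); // null for an absent key
        System.out.println(sizeAfterRemovals());
    }
}
```

Because the views are live, both removals in sizeAfterRemovals() change the multimap itself, leaving a single entry.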

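Although Multimap implementations use a Map internally, a Multimap<K, V> is not a Map<K, Collection<V>>. The differences are significant:

Multimap.get(key) always returns a non-null, possibly empty collection. This does not mean the multimap spends memory on that key; the returned collection is a view through which you can add associations with the key.
If you prefer the Map-like behavior of returning null for absent keys rather than an empty collection, use the asMap() view to get a Map<K, Collection<V>>. (In that case you have to cast the returned Collection<V> to a List or Set.)
Multimap.containsKey(key) returns true only if at least one value is associated with the key.
Multimap.entries() returns all key-value pairs in the Multimap. If you want key-collection pairs instead, use asMap().entrySet().
Multimap.size() returns the number of entries in the whole multimap, not the number of distinct keys. Use Multimap.keySet().size() for the number of distinct keys.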
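
The behavior of get(key) in particular is easy to misread, so here is a small sketch. It assumes Guava is on the classpath; the class name MultimapGetSketch and the tag values are hypothetical.

```java
import com.google.common.collect.HashMultimap;
import com.google.common.collect.SetMultimap;
import java.util.Set;

// Hypothetical demo class showing that get(key) returns a live, non-null view.
class MultimapGetSketch {
    static SetMultimap<String, String> addThroughView() {
        SetMultimap<String, String> tags = HashMultimap.create();
        Set<String> view = tags.get("guava"); // empty but never null, even for an absent key
        view.add("collections");              // writes through to the multimap
        view.add("collections");              // duplicate ignored: values behave like a HashSet
        view.add("cache");
        return tags;
    }

    public static void main(String[] args) {
        SetMultimap<String, String> empty = HashMultimap.create();
        System.out.println(empty.get("guava").isEmpty()); // true
        System.out.println(empty.containsKey("guava"));   // false: calling get() did not create the key

        SetMultimap<String, String> tags = addThroughView();
        System.out.println(tags.containsKey("guava"));    // true
        System.out.println(tags.size());                  // 2 entries
        System.out.println(tags.keySet().size());         // 1 distinct key
        System.out.println(tags.asMap().get("missing"));  // null
    }
}
```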

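Multimap comes with a rich set of implementations, so you can use it in place of Map<K, Collection<V>> in your programs:

Implementation	Keys behave like	Values behave like
ArrayListMultimap	HashMap	ArrayList
HashMultimap	HashMap	HashSet
LinkedListMultimap*	LinkedHashMap*	LinkedList*
LinkedHashMultimap**	LinkedHashMap	LinkedHashSet
TreeMultimap	TreeMap	TreeSet
ImmutableListMultimap	ImmutableMap	ImmutableList
ImmutableSetMultimap	ImmutableMap	ImmutableSet

All of these implementations except the two immutable ones support null keys and null values.

*LinkedListMultimap.entries() preserves iteration order across all keys and values. See the Javadoc for details.

**LinkedHashMultimap preserves the insertion order of entries, including the insertion order of keys and of the values mapped to each key.

Note that not all Multimap implementations are actually backed by a Map<K, Collection<V>> as the table suggests. (In particular, some implementations use custom hash tables to minimize overhead.) If you need more customization, use Multimaps.newMultimap(Map, Supplier), or the list and set versions, to build a Multimap backed by custom Collection, List, or Set implementations.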
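
As one possible illustration of the customization hook, the sketch below uses the list version, Multimaps.newListMultimap, to combine sorted keys (a TreeMap) with LinkedList values. It assumes Guava is on the classpath; the class name CustomMultimapSketch and this particular choice of backing collections are made up for the example.

```java
import com.google.common.collect.ListMultimap;
import com.google.common.collect.Multimaps;
import java.util.Collection;
import java.util.LinkedList;
import java.util.TreeMap;

// Hypothetical demo class: a ListMultimap with sorted keys and LinkedList values.
class CustomMultimapSketch {
    static ListMultimap<String, Integer> create() {
        // The map must be empty; the supplier is called whenever a new key needs a value list.
        ListMultimap<String, Integer> multimap = Multimaps.newListMultimap(
                new TreeMap<String, Collection<Integer>>(),
                () -> new LinkedList<Integer>());
        multimap.put("b", 2);
        multimap.put("a", 1);
        multimap.put("b", 3);
        return multimap;
    }

    public static void main(String[] args) {
        ListMultimap<String, Integer> multimap = create();
        System.out.println(multimap.keySet());  // keys in sorted order, from the TreeMap
        System.out.println(multimap.get("b"));  // values in insertion order
    }
}
```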

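1.3 BiMap

Traditionally, a bidirectional key-value mapping is implemented by maintaining two separate maps and keeping them in sync. This is error-prone and becomes very confusing when a value is already present in the map.

For example, the following scenario requires extra code. Start with a map from sequence numbers to file names: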

Map<Integer,String> logfileMap = Maps.newHashMap();
logfileMap.put(1,"a.log");
logfileMap.put(2,"b.log");
logfileMap.put(3,"c.log");
System.out.println("logfileMap:"+logfileMap);

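Looking up a file name by its sequence number is easy. Looking up the sequence number by file name, however, means iterating over the map. Alternatively, we can write a method that inverts the map: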

public static <S,T> Map<T,S> getInverseMap(Map<S,T> map) {
	 Map<T,S> inverseMap = new HashMap<T,S>();
	 for(Entry<S,T> entry: map.entrySet()) {
		 inverseMap.put(entry.getValue(), entry.getKey());
	 }
	 return inverseMap;
 }

public void logMapTest(){
	Map<Integer,String> logfileMap = Maps.newHashMap();
	logfileMap.put(1,"a.log");
	logfileMap.put(2,"b.log");
	logfileMap.put(3,"c.log");

	System.out.println("logfileMap:"+logfileMap);

	Map<String,Integer> logfileInverseMap = Maps.newHashMap();

	logfileInverseMap=getInverseMap(logfileMap);

	System.out.println("logfileInverseMap:"+logfileInverseMap);
}

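The code above inverts the map, but several questions remain:

How should duplicate values be handled? If they are ignored, entries overwrite each other during inversion.
If a new key is added to the inverted map, should the original map be updated too?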
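
Both problems can be reproduced with plain JDK maps. The sketch below uses only java.util; the class name InverseMapPitfall and the sample data are hypothetical, and invert() mirrors the getInverseMap method shown earlier.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class showing what goes wrong with a hand-rolled inverse map.
class InverseMapPitfall {
    static <S, T> Map<T, S> invert(Map<S, T> map) {
        Map<T, S> inverse = new HashMap<>();
        for (Map.Entry<S, T> entry : map.entrySet()) {
            inverse.put(entry.getValue(), entry.getKey());
        }
        return inverse;
    }

    // Two keys share the value "a.log"; LinkedHashMap keeps insertion order.
    static Map<Integer, String> sample() {
        Map<Integer, String> logfileMap = new LinkedHashMap<>();
        logfileMap.put(1, "a.log");
        logfileMap.put(2, "b.log");
        logfileMap.put(3, "a.log");
        return logfileMap;
    }

    public static void main(String[] args) {
        Map<Integer, String> logfileMap = sample();
        Map<String, Integer> inverse = invert(logfileMap);
        System.out.println(inverse.size());       // 2: one mapping was silently lost
        System.out.println(inverse.get("a.log")); // 3: the later key overwrote key 1

        inverse.put("c.log", 4);                  // the two maps are now out of sync
        System.out.println(logfileMap.containsKey(4)); // false
    }
}
```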

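Concerns like these have nothing to do with the business logic, yet they add work and make the code harder to read. This is where Guava's BiMap comes in.

BiMap is very simple to use. The scenario above takes only a few lines: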

public void BimapTest(){
	BiMap<Integer,String> logfileMap = HashBiMap.create(); 
	logfileMap.put(1,"a.log");
	logfileMap.put(2,"b.log");
	logfileMap.put(3,"c.log");
	System.out.println("logfileMap:"+logfileMap);
	BiMap<String,Integer> filelogMap = logfileMap.inverse();
	System.out.println("filelogMap:"+filelogMap);
}

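If "d.log" is already mapped to another key, logfileMap.put(5,"d.log") throws java.lang.IllegalArgumentException: value already present: d.log. If you really need to insert a duplicate value, use forcePut instead. Note, however, that the entry previously holding that value is removed, so its key disappears from the map.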

public void BimapTest(){
	BiMap<Integer,String> logfileMap = HashBiMap.create();
	logfileMap.put(1,"a.log");
	logfileMap.put(2,"b.log");
	logfileMap.put(3,"c.log");

	logfileMap.put(4,"d.log");
	logfileMap.forcePut(5,"d.log");
	System.out.println("logfileMap:"+logfileMap);
}

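inverse() returns the inverted BiMap. Note that the inverted map is not a new map object but a view of the original, so every operation on the inverted map also affects the original map.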
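
The view relationship can be seen in a short sketch. It assumes Guava is on the classpath; the class name BiMapInverseSketch is hypothetical.

```java
import com.google.common.collect.BiMap;
import com.google.common.collect.HashBiMap;

// Hypothetical demo class: changes made through inverse() show up in the original BiMap.
class BiMapInverseSketch {
    static BiMap<Integer, String> demo() {
        BiMap<Integer, String> logfileMap = HashBiMap.create();
        logfileMap.put(1, "a.log");

        BiMap<String, Integer> inverse = logfileMap.inverse();
        inverse.put("b.log", 2);  // adds 2 -> "b.log" to the original
        inverse.remove("a.log");  // removes 1 -> "a.log" from the original
        return logfileMap;
    }

    public static void main(String[] args) {
        BiMap<Integer, String> logfileMap = demo();
        System.out.println(logfileMap);           // only the mapping for key 2 remains
        System.out.println(logfileMap.inverse()); // the same mapping seen from the other side
    }
}
```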

Key-value map impl	Value-key map impl	Corresponding BiMap
HashMap	HashMap	HashBiMap
ImmutableMap	ImmutableMap	ImmutableBiMap
EnumMap	EnumMap	EnumBiMap
EnumMap	HashMap	EnumHashBiMap
Note: the Maps class also provides BiMap utilities such as synchronizedBiMap.

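Typically, when you need to index on more than one key at a time, you end up with something like Map<FirstName, Map<LastName, Person>>, which is ugly and awkward to use. Guava provides a new collection type for this, Table, which supports two keys of any type: a "row" and a "column". Table offers several views so you can work with the data from different angles:

rowMap(): views a Table<R, C, V> as a Map<R, Map<C, V>>. Similarly, rowKeySet() returns a Set<R> of the row keys.
row(r): returns a Map<C, V> of all columns for the given row. Writes to this map write through to the underlying Table.
Analogous column methods are provided: columnMap(), columnKeySet(), and column(c). (Column-based access is somewhat less efficient than row-based access.)
cellSet(): views the Table<R, C, V> as a Set of Table.Cell<R, C, V>. A Cell is much like a Map.Entry, but is distinguished by both the row key and the column key.

Table has several implementations:

HashBasedTable: essentially backed by a HashMap<R, HashMap<C, V>>.
TreeBasedTable: essentially backed by a TreeMap<R, TreeMap<C, V>>.
ImmutableTable: essentially backed by an ImmutableMap<R, ImmutableMap<C, V>>. Note: ImmutableTable has optimized implementations for both sparse and dense data sets.
ArrayTable: requires the complete set of row and column keys to be specified at construction time, and is backed by a two-dimensional array to improve speed and memory efficiency for dense tables. ArrayTable works somewhat differently from the other implementations; see the Javadoc for details.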

public void TableTest(){
	Table<String, Integer, String> aTable = HashBasedTable.create();

	for (char a = 'A'; a <= 'C'; ++a) {
		for (Integer b = 1; b <= 3; ++b) {
			aTable.put(Character.toString(a), b, String.format("%c%d", a, b));
		}
	}

	System.out.println(aTable.column(2));
	System.out.println(aTable.row("B"));
	System.out.println(aTable.get("B", 2));

	System.out.println(aTable.contains("D", 1));
	System.out.println(aTable.containsColumn(3));
	System.out.println(aTable.containsRow("C"));
	System.out.println(aTable.columnMap());
	System.out.println(aTable.rowMap());

	System.out.println(aTable.remove("B", 3));
}
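
The example above only reads from the views. As a complement, the following sketch writes through row(r) and column(c) and then walks cellSet(). It assumes Guava is on the classpath; the class name TableViewsSketch and the name/age data are invented for illustration.

```java
import com.google.common.collect.HashBasedTable;
import com.google.common.collect.Table;

// Hypothetical demo class: row and column views write through to the table.
class TableViewsSketch {
    static Table<String, String, Integer> demo() {
        // Rows are last names, columns are first names, values are ages.
        Table<String, String, Integer> ages = HashBasedTable.create();
        ages.put("Smith", "Anna", 31);
        ages.put("Smith", "Bob", 45);
        ages.row("Jones").put("Cara", 28);  // adds a cell through the row view
        ages.column("Bob").remove("Smith"); // removes a cell through the column view
        return ages;
    }

    public static void main(String[] args) {
        Table<String, String, Integer> ages = demo();
        // Each cell carries its row key, column key, and value
        for (Table.Cell<String, String, Integer> cell : ages.cellSet()) {
            System.out.println(cell.getRowKey() + ", " + cell.getColumnKey() + " = " + cell.getValue());
        }
    }
}
```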

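1.4 ClassToInstanceMap

ClassToInstanceMap is a special kind of Map: its keys are types, and its values are instances of the type given by the key.

In addition to extending the Map interface, ClassToInstanceMap declares two extra methods, T getInstance(Class<T>) and T putInstance(Class<T>, T), which eliminate the need for casts while preserving type safety.

ClassToInstanceMap has a single type parameter, usually named B, which is the upper bound on the types managed by the map.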

ClassToInstanceMap<Number> numberDefaults = MutableClassToInstanceMap.create();
numberDefaults.putInstance(Integer.class, Integer.valueOf(0));

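In other words:

Sometimes the keys of a map are not all of one type: they are types themselves, for example Integer and BigDecimal, which all fall under Number. To map each such type to a value of that type, you can use Guava's ClassToInstanceMap.

Points worth noting:

Besides extending the Map interface, ClassToInstanceMap provides T getInstance(Class<T>) and T putInstance(Class<T>, T), which remove the need for casts while enforcing type safety.
ClassToInstanceMap has a single type parameter, and no key may fall outside the range of that type.
Technically, ClassToInstanceMap<B> implements Map<Class<? extends B>, B>, in other words, a map from subclasses of B to instances of B. This can make the generic types involved in ClassToInstanceMap mildly confusing, but just remember that B is always the upper bound on the types in the map. Usually, B is just Object.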

public void classToInstanceMapTest(){
    // A map that can hold any numeric type
    ClassToInstanceMap<Number> numberDefaults= MutableClassToInstanceMap.create();
    numberDefaults.putInstance(Integer.class, 1);
    numberDefaults.putInstance(Double.class, 1.00);
    numberDefaults.putInstance(BigDecimal.class, new BigDecimal("52"));
    numberDefaults.put(Integer.class, 2);
    numberDefaults.replace(Integer.class, 3); // update the value
    numberDefaults.getInstance(Integer.class); // returns 3
    // merge() combines the old value with the new one: 3 + 2 = 5 for Integer.class
    numberDefaults.merge(Integer.class,2,(x,y)->x.intValue()+y.intValue());
    numberDefaults.forEach((x,y)-> System.out.println(x+","+y));
}

1.5 RangeSet

A RangeSet describes a set of disconnected, nonempty ranges. When a range is added to a mutable RangeSet, any connected ranges are merged together and empty ranges are ignored. For example:

RangeSet<Integer> rangeSet = TreeRangeSet.create();
rangeSet.add(Range.closed(1, 10)); // {[1,10]}
rangeSet.add(Range.closedOpen(11, 15)); // disconnected range: {[1,10], [11,15)}
rangeSet.add(Range.closedOpen(15, 20)); // connected range: {[1,10], [11,20)}
rangeSet.add(Range.openClosed(0, 0)); // empty range: {[1,10], [11,20)}
rangeSet.remove(Range.open(5, 10)); // splits [1,10]: {[1,5], [10,10], [11,20)}

Note that to merge ranges such as Range.closed(1, 10) and Range.closedOpen(11, 15), you must first preprocess them with Range.canonical(DiscreteDomain), for example with DiscreteDomain.integers().
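
The effect of canonical() can be sketched as follows, assuming Guava is on the classpath. The class name CanonicalRangeSketch and its two helper methods are hypothetical.

```java
import com.google.common.collect.DiscreteDomain;
import com.google.common.collect.Range;
import com.google.common.collect.RangeSet;
import com.google.common.collect.TreeRangeSet;

// Hypothetical demo class comparing raw and canonicalized integer ranges.
class CanonicalRangeSketch {
    // [1,10] and [11,15) are not connected, so they stay separate.
    static RangeSet<Integer> raw() {
        RangeSet<Integer> set = TreeRangeSet.create();
        set.add(Range.closed(1, 10));
        set.add(Range.closedOpen(11, 15));
        return set;
    }

    // canonical() rewrites [1,10] as [1,11), which touches [11,15), so the two merge.
    static RangeSet<Integer> canonicalized() {
        RangeSet<Integer> set = TreeRangeSet.create();
        set.add(Range.closed(1, 10).canonical(DiscreteDomain.integers()));
        set.add(Range.closedOpen(11, 15).canonical(DiscreteDomain.integers()));
        return set;
    }

    public static void main(String[] args) {
        System.out.println(raw().asRanges().size());           // 2
        System.out.println(canonicalized().asRanges().size()); // 1
        System.out.println(canonicalized());
    }
}
```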

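Note: RangeSet is not supported under GWT, nor in JDK 5 and earlier, because RangeSet makes full use of the NavigableMap features introduced in JDK 6.

RangeSet implementations support an extremely wide range of views:

complement(): views the complement of the RangeSet. The complement is also a RangeSet, as it contains disconnected, nonempty ranges.
subRangeSet(Range<C>): returns a view of the intersection of the RangeSet with the given Range. This generalizes the headSet, subSet, and tailSet operations of traditional sorted collections.
asRanges(): views the RangeSet as a Set<Range<C>>, which can be iterated over.
asSet(DiscreteDomain<C>) (ImmutableRangeSet only): views the RangeSet<C> as an ImmutableSortedSet<C>, exposing the elements in the ranges rather than the ranges themselves. (This operation is unsupported if the DiscreteDomain and the RangeSet are both unbounded above or both unbounded below.)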
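
A brief sketch of the first three views, assuming Guava is on the classpath. The class name RangeSetViewsSketch and the sample ranges are made up for the example.

```java
import com.google.common.collect.Range;
import com.google.common.collect.RangeSet;
import com.google.common.collect.TreeRangeSet;

// Hypothetical demo class for complement(), subRangeSet() and asRanges().
class RangeSetViewsSketch {
    static RangeSet<Integer> sample() {
        RangeSet<Integer> set = TreeRangeSet.create();
        set.add(Range.closed(1, 5));
        set.add(Range.closedOpen(11, 20));
        return set;
    }

    // Intersection of the sample with [3, 12]: the pieces [3,5] and [11,12].
    static RangeSet<Integer> clipped() {
        return sample().subRangeSet(Range.closed(3, 12));
    }

    public static void main(String[] args) {
        RangeSet<Integer> set = sample();
        for (Range<Integer> range : set.asRanges()) {
            System.out.println(range);         // iterate over the individual ranges
        }
        System.out.println(set.complement());  // everything outside the two ranges
        System.out.println(clipped());
    }
}
```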

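For convenience, RangeSet also supports several query operations directly, the most prominent of which are:

contains(C): the most fundamental operation on a RangeSet, checking whether any range in the RangeSet contains the given element.
rangeContaining(C): returns the Range that contains the given element, or null if there is none.
encloses(Range<C>): straightforwardly enough, tests whether any Range in the RangeSet encloses the given range.
span(): returns the minimal Range that encloses every range in the RangeSet.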
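
The query methods above can be tried out with a sketch like this one. It assumes Guava is on the classpath; the class name RangeSetQuerySketch and the sample ranges are hypothetical.

```java
import com.google.common.collect.Range;
import com.google.common.collect.RangeSet;
import com.google.common.collect.TreeRangeSet;

// Hypothetical demo class for the RangeSet query operations.
class RangeSetQuerySketch {
    static RangeSet<Integer> sample() {
        RangeSet<Integer> set = TreeRangeSet.create();
        set.add(Range.closed(1, 5));
        set.add(Range.closedOpen(11, 20));
        return set;
    }

    // True only if a single range in the set covers the whole argument.
    static boolean enclosesClosed(int lower, int upper) {
        return sample().encloses(Range.closed(lower, upper));
    }

    public static void main(String[] args) {
        RangeSet<Integer> set = sample();
        System.out.println(set.contains(3));          // true
        System.out.println(set.contains(7));          // false: 7 falls in the gap
        System.out.println(set.rangeContaining(12));  // the range starting at 11
        System.out.println(set.rangeContaining(7));   // null
        System.out.println(enclosesClosed(2, 4));     // true
        System.out.println(enclosesClosed(4, 12));    // false: it spans the gap
        System.out.println(set.span());               // from 1 up to (but excluding) 20
    }
}
```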

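1.6 RangeMap

A RangeMap describes a mapping from disjoint, nonempty ranges to values. Unlike RangeSet, RangeMap never merges adjacent mappings, even when adjacent ranges map to the same value. For example: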

RangeMap<Integer, String> rangeMap = TreeRangeMap.create();
rangeMap.put(Range.closed(1, 10), "foo"); //{[1,10] => "foo"}
rangeMap.put(Range.open(3, 6), "bar"); //{[1,3] => "foo", (3,6) => "bar", [6,10] => "foo"}
rangeMap.put(Range.open(10, 20), "foo"); //{[1,3] => "foo", (3,6) => "bar", [6,10] => "foo", (10,20) => "foo"}
rangeMap.remove(Range.closed(5, 11)); //{[1,3] => "foo", (3,5) => "bar", (11,20) => "foo"}

Iteration:

Map<Range<Integer>, String> map = rangeMap.asMapOfRanges();
Set<Map.Entry<Range<Integer>, String>> entrySet = map.entrySet();
Set<Range<Integer>> keySet = map.keySet();
Collection<String> values = map.values();

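RangeMap provides two views:

asMapOfRanges(): views the RangeMap as a Map<Range<K>, V>. This can be used, for example, to iterate over the RangeMap.
subRangeMap(Range<K>): returns a view of the intersection of the RangeMap with the given Range, as a RangeMap. This generalizes the traditional headMap, subMap, and tailMap operations.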
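
Both views, together with point lookups via get(K), can be sketched as follows. The sketch assumes Guava is on the classpath; the class name RangeMapViewsSketch and its helpers are hypothetical.

```java
import com.google.common.collect.Range;
import com.google.common.collect.RangeMap;
import com.google.common.collect.TreeRangeMap;
import java.util.Map;

// Hypothetical demo class for asMapOfRanges() and subRangeMap().
class RangeMapViewsSketch {
    // After both puts: [1,3] => foo, (3,6) => bar, [6,10] => foo
    static RangeMap<Integer, String> sample() {
        RangeMap<Integer, String> map = TreeRangeMap.create();
        map.put(Range.closed(1, 10), "foo");
        map.put(Range.open(3, 6), "bar");
        return map;
    }

    // Number of mappings that intersect [5, 7]: part of "bar" and part of "foo".
    static int subSize() {
        return sample().subRangeMap(Range.closed(5, 7)).asMapOfRanges().size();
    }

    public static void main(String[] args) {
        RangeMap<Integer, String> map = sample();
        System.out.println(map.get(4));   // bar
        System.out.println(map.get(8));   // foo
        System.out.println(map.get(11));  // null: no range contains 11
        for (Map.Entry<Range<Integer>, String> entry : map.asMapOfRanges().entrySet()) {
            System.out.println(entry.getKey() + " => " + entry.getValue());
        }
        System.out.println(subSize());    // 2
    }
}
```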