[Java] Java 8 - (3) Stream : Collectors

제이동 개발자 2023. 7. 24. 00:22

Collectors is a utility class introduced in Java 8 that collects the elements of a Stream into various collection types or performs aggregate operations on them. The methods it provides fall into three broad categories.

Summarizing elements
Grouping elements
Partitioning elements

The example code is available on GitHub.

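Category	Factory method	Return type	Description
Summarizing	counting	Long	Number of elements in the stream
	summingInt summingLong summingDouble	Integer Long Double	Sum of the stream's elements
	averagingInt averagingLong averagingDouble	Double	Average of the stream's elements
	summarizingInt summarizingLong summarizingDouble	IntSummaryStatistics LongSummaryStatistics DoubleSummaryStatistics	Statistics over the stream's elements (maximum, minimum, sum, average, count)
	minBy	Optional<T>	Minimum element of the stream according to the given comparator, wrapped in an Optional
	maxBy	Optional<T>	Maximum element of the stream according to the given comparator, wrapped in an Optional
	joining	String	Concatenates the stream's elements into a single string
	reducing	Result type of the reduction	Combines the stream's elements with the given reduction operation
Grouping	toList	List<T>	Collects the stream's elements into a List
	toSet	Set<T>	Collects the stream's elements into a Set
	toCollection	Collection<T>	Collects the stream's elements into the collection created by the given supplier
	groupingBy	Map<K, List<T>>	Groups the stream's elements by a key
Partitioning	partitioningBy	Map<Boolean, List<T>>	Splits the stream's elements into two groups keyed by a Boolean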
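
The table lists toList, toSet, and toCollection, which the sections below do not demonstrate on their own. The following is a minimal sketch of the three, assuming a hypothetical list of brand names (the class name and sample data are not from the example repository).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class CollectionCollectorsSketch {
    // Hypothetical sample data: brand names with one duplicate
    static final List<String> BRANDS = Arrays.asList("MEGA", "STARBUCKS", "MEGA", "EDIYA");

    static List<String> collectToList() {
        // Keeps every element, duplicates included
        return BRANDS.stream().collect(Collectors.toList());
    }

    static Set<String> collectToSet() {
        // Duplicates collapse into a single entry
        return BRANDS.stream().collect(Collectors.toSet());
    }

    static TreeSet<String> collectToSortedSet() {
        // toCollection takes a Supplier, so the caller chooses the concrete type
        return BRANDS.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        System.out.println(collectToList());
        System.out.println(collectToSet());
        System.out.println(collectToSortedSet());
    }
}
```

toList and toSet make no promise about the concrete implementation they return, so toCollection is the one to reach for when a specific type such as TreeSet is required.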

1. Summarizing elements

Summarizing means reducing the stream's elements to a single value. Typical examples are the count, maximum, minimum, sum, and average.

1-1. Collectors.counting()

A Collectors method that counts the stream's elements. It may look identical to Stream's 'count' method, but 'Collectors.counting' shows its real value when combined with other collectors (see 2-4. Collecting data into subgroups).

@Test
void collectors_counting_test() {
    Long counting = coffeeList.stream()
            .collect(Collectors.counting());
    long count = coffeeList.stream()
            .count();
    Assertions.assertThat(counting).isEqualTo(coffeeList.size());
    Assertions.assertThat(count).isEqualTo(coffeeList.size());
}

1-2.
Collectors.minBy(Comparator)
Collectors.maxBy(Comparator)

Collectors methods that find the minimum and maximum elements of a stream. Note that they return an Optional, because the stream may have no elements.

@Test
void collectors_max_min_test() {
    Optional<Coffee> minOp = coffeeList.stream()
            .collect(Collectors.minBy(Comparator.comparing(Coffee::getPrice)));
    Optional<Coffee> maxOp = coffeeList.stream()
            .collect(Collectors.maxBy(Comparator.comparing(Coffee::getPrice)));

    Assertions.assertThat(minOp.get().getPrice()).isEqualTo(2000);
    Assertions.assertThat(maxOp.get().getPrice()).isEqualTo(5500);
}
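
The test above calls get() directly, which is fine when the list is known to be non-empty. As a minimal sketch of the empty case, assuming plain Integer prices instead of the article's Coffee objects (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MinByEmptySketch {
    static Optional<Integer> cheapest(List<Integer> prices) {
        // minBy yields Optional.empty() when the stream has no elements
        return prices.stream()
                .collect(Collectors.minBy(Comparator.<Integer>naturalOrder()));
    }

    static int cheapestOrDefault(List<Integer> prices, int fallback) {
        // orElse avoids the NoSuchElementException that get() throws on an empty Optional
        return cheapest(prices).orElse(fallback);
    }

    public static void main(String[] args) {
        System.out.println(cheapest(Arrays.asList(4500, 2000, 5500)));
        System.out.println(cheapest(Collections.emptyList()));
        System.out.println(cheapestOrDefault(Collections.emptyList(), -1));
    }
}
```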

1-3.
Collectors.summingInt()
Collectors.summingLong()
Collectors.summingDouble()

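Collectors methods that compute the sum of the stream's elements. The accumulation starts from 0, so an empty stream produces 0 rather than an Optional, and the result is returned as the wrapper type of the primitive (Integer, Long, or Double).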

@Test
void collectors_summingInt_test() {
    Integer priceSum = coffeeList.stream()
            .collect(Collectors.summingInt(Coffee::getPrice));
    log.info("priceSum = {}", priceSum);
}

// Result
priceSum = 41900
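
Two properties of the summing collectors are worth checking by hand: an empty stream gives 0, and summingInt accumulates in an int, so a very large total can overflow where summingLong does not. A minimal sketch, assuming plain Integer inputs (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class SummingSketch {
    static Integer sumAsInt(List<Integer> values) {
        // Accumulates in an int, so the total wraps around past Integer.MAX_VALUE
        return values.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    static Long sumAsLong(List<Integer> values) {
        // Accumulates in a long, which leaves room for much larger totals
        return values.stream().collect(Collectors.summingLong(Integer::longValue));
    }

    public static void main(String[] args) {
        List<Integer> big = Arrays.asList(2_000_000_000, 2_000_000_000);
        System.out.println(sumAsInt(Collections.emptyList())); // no Optional involved
        System.out.println(sumAsInt(big));
        System.out.println(sumAsLong(big));
    }
}
```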

1-4.
Collectors.averagingInt()
Collectors.averagingLong()
Collectors.averagingDouble()

Collectors methods that compute the average of the stream's elements. An empty stream produces 0.0 rather than an Optional, and all three variants return a Double regardless of the input type.

@Test
void collectors_averagingInt_test() {
    // averagingInt returns a Double even though the price is an integer
    Double priceAverage = coffeeList.stream()
            .collect(Collectors.averagingInt(Coffee::getPrice));
    log.info("priceAverage = {}", priceAverage);
}
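
A minimal sketch of the two points above, assuming plain Integer prices (the class and method names are hypothetical): the result type is Double, and an empty stream gives 0.0.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class AveragingSketch {
    static Double averagePrice(List<Integer> prices) {
        // The collector's result type is Double for all averaging variants
        return prices.stream().collect(Collectors.averagingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        System.out.println(averagePrice(Arrays.asList(2000, 3000)));
        System.out.println(averagePrice(Collections.emptyList()));
    }
}
```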

1-5.
Collectors.summarizingInt()
Collectors.summarizingLong()
Collectors.summarizingDouble()

Collectors methods that return an IntSummaryStatistics, LongSummaryStatistics, or DoubleSummaryStatistics object holding the count, sum, minimum, maximum, and average of the stream's elements.

@Test
void collectors_summarizingInt() {
    IntSummaryStatistics summary = coffeeList.stream()
            .collect(Collectors.summarizingInt(Coffee::getPrice));
    log.info("summary = {}", summary);
}

// Result
summary = IntSummaryStatistics{count=11, sum=41900, min=2000, average=3809.090909, max=5500}
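
Instead of logging the whole object, each statistic can be read through its getter. A minimal sketch, assuming plain Integer prices (the class and method names are hypothetical). Note that on an empty stream getMin() returns Integer.MAX_VALUE and getMax() returns Integer.MIN_VALUE, so check getCount() first when the input may be empty.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class SummarizingSketch {
    static IntSummaryStatistics summarize(List<Integer> prices) {
        // One pass over the stream produces all five statistics
        return prices.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = summarize(Arrays.asList(2000, 5500, 3000));
        System.out.println("count   = " + stats.getCount());   // long
        System.out.println("sum     = " + stats.getSum());     // long
        System.out.println("min     = " + stats.getMin());     // int
        System.out.println("max     = " + stats.getMax());     // int
        System.out.println("average = " + stats.getAverage()); // double
    }
}
```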

1-6. Collectors.joining()

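A Collectors method that concatenates the stream's strings into a single string. Internally it appends to a StringBuilder (through a StringJoiner when a delimiter is given) instead of repeatedly concatenating strings, so it performs well. Overloaded versions accept a delimiter, and optionally a prefix and a suffix.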

@Test
void collectors_joining_test() {
    String joinBrands1 = coffeeList.stream()
            .map(Coffee::getBrands)
            .map(Brands::getDesc)
            .collect(Collectors.joining());
    log.info("joinBrands1 = {}", joinBrands1);

    String joinBrands2 = coffeeList.stream()
            .map(Coffee::getBrands)
            .map(Brands::getDesc)
            .distinct()
            .collect(Collectors.joining());
    log.info("joinBrands2 = {}", joinBrands2);

    // Use ", " as the delimiter
    String joinBrands3 = coffeeList.stream()
            .map(Coffee::getBrands)
            .map(Brands::getDesc)
            .distinct()
            .collect(Collectors.joining(", "));
    log.info("joinBrands3 = {}", joinBrands3);
}

// Result
joinBrands1 = 메가커피메가커피스타벅스스타벅스스타벅스빽다방빽다방투썸플레이스투썸플레이스이디야이디야
joinBrands2 = 메가커피스타벅스빽다방투썸플레이스이디야
joinBrands3 = 메가커피, 스타벅스, 빽다방, 투썸플레이스, 이디야
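
The test above covers the no-argument and delimiter-only forms. A minimal sketch of the three-argument overload (delimiter, prefix, suffix), assuming a plain list of brand names (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class JoiningSketch {
    static String bracketed(List<String> brands) {
        // Arguments: delimiter, prefix, suffix
        return brands.stream()
                .distinct()
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(Arrays.asList("MEGA", "MEGA", "EDIYA"))); // [MEGA, EDIYA]
        System.out.println(bracketed(Collections.emptyList()));                // []
    }
}
```

The prefix and suffix are emitted even for an empty stream, which is convenient when the output must always be well-formed.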

1-7. Collectors.reducing()

A Collectors method for combining the stream's elements with a custom reduction operation.

@Test
void collectors_reducing_test() {
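    // 1st argument: the identity (initial) value
    // 2nd argument: the mapper applied to each element before reducing
    // 3rd argument: the reduction operation
    // Note: this operation is not associative, so rely on it only with a sequential stream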
    Integer customPriceSum = coffeeList.stream()
            .collect(Collectors.reducing(0, Coffee::getPrice, (a, b) -> b >= 3000 ? a + b : a));
    log.info("customPriceSum = {}", customPriceSum);
}
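
Because the conditional operator above is not associative, the same "sum of prices of 3000 or more" can be expressed more safely by filtering first and then summing. A minimal sketch comparing the two, assuming plain Integer prices (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReducingSketch {
    static Integer sumWithReducing(List<Integer> prices) {
        // Same shape as the article's example: the condition lives inside the reduction
        return prices.stream()
                .collect(Collectors.reducing(0, p -> p, (a, b) -> b >= 3000 ? a + b : a));
    }

    static Integer sumWithFilter(List<Integer> prices) {
        // Filter first, then sum with an associative operation
        return prices.stream()
                .filter(p -> p >= 3000)
                .collect(Collectors.summingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        List<Integer> prices = Arrays.asList(2000, 3000, 4500);
        System.out.println(sumWithReducing(prices));
        System.out.println(sumWithFilter(prices));
    }
}
```

On a sequential stream both methods return the same total; the filtered version stays correct if the stream is later made parallel.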

2. Grouping elements

Like GROUP BY in a database, elements can be grouped with the Collectors 'groupingBy' method.

@Test
void collectors_groupingBy_test() {

    // Grouping with a plain loop, without streams
    Map<Brands, List<Coffee>> coffeeMap = new HashMap<>();
    for (Coffee coffee : coffeeList) {
        // Named "group" so the local variable does not shadow the source coffeeList
        List<Coffee> group = coffeeMap.get(coffee.getBrands());
        if (group == null) {
            group = new ArrayList<>();
            coffeeMap.put(coffee.getBrands(), group);
        }
        group.add(coffee);
    }
    System.out.println("coffeeMap         = " + coffeeMap);

    // Grouping with the Java 8 Stream API
    Map<Brands, List<Coffee>> coffeeMapByStream = coffeeList.stream()
            .collect(Collectors.groupingBy(Coffee::getBrands));
    System.out.println("coffeeMapByStream = " + coffeeMapByStream);
}

// Result
coffeeMap         = {
    TWOSOME=[Coffee(price=4500, capacity=355, brands=TWOSOME), Coffee(price=5000, capacity=414, brands=TWOSOME)], 
    STARBUCKS=[Coffee(price=4500, capacity=355, brands=STARBUCKS), Coffee(price=5000, capacity=473, brands=STARBUCKS), Coffee(price=5500, capacity=592, brands=STARBUCKS)], 
    MEGA=[Coffee(price=2000, capacity=680, brands=MEGA), Coffee(price=3000, capacity=1000, brands=MEGA)], 
    PAIKDABANG=[Coffee(price=2000, capacity=625, brands=PAIKDABANG), Coffee(price=3000, capacity=946, brands=PAIKDABANG)],
    EDIYA=[Coffee(price=3200, capacity=420, brands=EDIYA), Coffee(price=4200, capacity=650, brands=EDIYA)]
}
coffeeMapByStream = {
    TWOSOME=[Coffee(price=4500, capacity=355, brands=TWOSOME), Coffee(price=5000, capacity=414, brands=TWOSOME)], 
    STARBUCKS=[Coffee(price=4500, capacity=355, brands=STARBUCKS), Coffee(price=5000, capacity=473, brands=STARBUCKS), Coffee(price=5500, capacity=592, brands=STARBUCKS)], 
    MEGA=[Coffee(price=2000, capacity=680, brands=MEGA), Coffee(price=3000, capacity=1000, brands=MEGA)], 
    PAIKDABANG=[Coffee(price=2000, capacity=625, brands=PAIKDABANG), Coffee(price=3000, capacity=946, brands=PAIKDABANG)], 
    EDIYA=[Coffee(price=3200, capacity=420, brands=EDIYA), Coffee(price=4200, capacity=650, brands=EDIYA)]
}

Combined with downstream collectors, 'groupingBy' supports filtering, mapping, multi-level grouping, and more, so the groups can be shaped into exactly the data you need.
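
The printed maps above come out in HashMap order. When a predictable key order matters, the three-argument groupingBy overload accepts a map factory. A minimal sketch, assuming a simplified stand-in for the Coffee class with String brands (the nested class, sample data, and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedGroupingSketch {
    // Hypothetical stand-in for the article's Coffee class
    static final class Coffee {
        final String brand;
        final int price;
        Coffee(String brand, int price) { this.brand = brand; this.price = price; }
    }

    static final List<Coffee> COFFEES = Arrays.asList(
            new Coffee("MEGA", 2000), new Coffee("STARBUCKS", 4500),
            new Coffee("MEGA", 3000), new Coffee("EDIYA", 3200));

    static TreeMap<String, List<Coffee>> groupSortedByBrand() {
        // Arguments: classifier, map factory, downstream collector
        return COFFEES.stream()
                .collect(Collectors.groupingBy(c -> c.brand, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        // Keys are iterated in sorted order because the result is a TreeMap
        System.out.println(groupSortedByBrand().keySet());
    }
}
```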

2-1. Collectors.filtering()

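When grouping with 'groupingBy', the Collectors 'filtering' method (added in Java 9) filters the elements inside each group. Stream's 'filter' and Collectors' 'filtering' apply the same kind of predicate, but at different points, so the grouped result can differ: 'filter' drops elements before grouping, so a group whose elements are all rejected never appears, whereas 'filtering' runs after the key has been assigned, so the key remains with an empty list.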

@Test
void collectors_groupingBy_test2() {
    // No TWOSOME or STARBUCKS coffee costs less than 3500,
    // so the Map contains no TWOSOME or STARBUCKS key.
    Map<Brands, List<Coffee>> filterGroupingMap = coffeeList.stream()
            .filter(coffee -> coffee.getPrice() < 3500)
            .collect(Collectors.groupingBy(Coffee::getBrands));

    // The Map contains the TWOSOME and STARBUCKS keys, each with an empty list.
    Map<Brands, List<Coffee>> groupingFilterMap = coffeeList.stream()
            .collect(Collectors.groupingBy(Coffee::getBrands,
                    Collectors.filtering(coffee -> coffee.getPrice() < 3500, Collectors.toList())));
    System.out.println("filterGroupingMap = " + filterGroupingMap);
    System.out.println("groupingFilterMap = " + groupingFilterMap);
}

// Result
filterGroupingMap = {
    PAIKDABANG=[Coffee(price=2000, capacity=625, brands=PAIKDABANG), Coffee(price=3000, capacity=946, brands=PAIKDABANG)], 
    EDIYA=[Coffee(price=3200, capacity=420, brands=EDIYA)], 
    MEGA=[Coffee(price=2000, capacity=680, brands=MEGA), Coffee(price=3000, capacity=1000, brands=MEGA)]}
groupingFilterMap = {
    PAIKDABANG=[Coffee(price=2000, capacity=625, brands=PAIKDABANG), Coffee(price=3000, capacity=946, brands=PAIKDABANG)], 
    TWOSOME=[], 
    EDIYA=[Coffee(price=3200, capacity=420, brands=EDIYA)], 
    STARBUCKS=[], 
    MEGA=[Coffee(price=2000, capacity=680, brands=MEGA), Coffee(price=3000, capacity=1000, brands=MEGA)]}
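
The same difference shows up with any downstream collector. A minimal self-contained sketch using counting, assuming Java 9 or later and a simplified stand-in for the Coffee class (the nested class, sample data, and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FilteringSketch {
    // Hypothetical stand-in for the article's Coffee class
    static final class Coffee {
        final String brand;
        final int price;
        Coffee(String brand, int price) { this.brand = brand; this.price = price; }
    }

    static final List<Coffee> COFFEES = Arrays.asList(
            new Coffee("MEGA", 2000), new Coffee("STARBUCKS", 4500),
            new Coffee("EDIYA", 3200), new Coffee("EDIYA", 4200));

    static Map<String, Long> filterThenGroup() {
        // Elements are dropped before grouping, so STARBUCKS never becomes a key
        return COFFEES.stream()
                .filter(c -> c.price < 3500)
                .collect(Collectors.groupingBy(c -> c.brand, Collectors.counting()));
    }

    static Map<String, Long> groupThenFilter() {
        // Every brand becomes a key first; filtering then applies inside each group
        return COFFEES.stream()
                .collect(Collectors.groupingBy(c -> c.brand,
                        Collectors.filtering(c -> c.price < 3500, Collectors.counting())));
    }

    public static void main(String[] args) {
        System.out.println(filterThenGroup());
        System.out.println(groupThenFilter());
    }
}
```

With counting as the downstream collector, a brand with no matching coffee is reported as 0 instead of being absent, which is often what a report needs.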

2-2.
Collectors.mapping()
Collectors.flatMapping()

Among the useful ways to manipulate grouped elements, 'mapping' and 'flatMapping' (the latter added in Java 9) transform the elements of each group while grouping.

@Test
void collectors_groupingBy_test3() {
    // Group by Brands, converting each Coffee element to its capacity
    Map<Brands, List<Integer>> groupingMappingMap = coffeeList.stream()
            .collect(Collectors.groupingBy(Coffee::getBrands,
                    Collectors.mapping(Coffee::getCapacity, Collectors.toList())));
    System.out.println("groupingMappingMap = " + groupingMappingMap);

    final List<Dish> menu = asList(
            new Dish("pork", false, 800, Dish.Type.MEAT),
            new Dish("beef", false, 700, Dish.Type.MEAT),
            new Dish("chicken", false, 400, Dish.Type.MEAT),
            new Dish("french fries", true, 530, Dish.Type.OTHER),
            new Dish("rice", true, 350, Dish.Type.OTHER),
            new Dish("season fruit", true, 120, Dish.Type.OTHER),
            new Dish("pizza", true, 550, Dish.Type.OTHER),
            new Dish("prawns", false, 400, Dish.Type.FISH),
            new Dish("salmon", false, 450, Dish.Type.FISH)
    );

    final Map<String, List<String>> dishTags = new HashMap<>();
    dishTags.put("pork", asList("greasy", "salty"));
    dishTags.put("beef", asList("salty", "roasted"));
    dishTags.put("chicken", asList("fried", "crisp"));
    dishTags.put("french fries", asList("greasy", "fried"));
    dishTags.put("rice", asList("light", "natural"));
    dishTags.put("season fruit", asList("fresh", "natural"));
    dishTags.put("pizza", asList("tasty", "salty"));
    dishTags.put("prawns", asList("tasty", "roasted"));
    dishTags.put("salmon", asList("delicious", "fresh"));

    // Group by Dish.Type, flattening each dish's tag list into one set per type
    Map<Dish.Type, Set<String>> flatMappingMap = menu.stream().collect(
            groupingBy(Dish::getType,
                    flatMapping(dish -> dishTags.get(dish.getName()).stream(), toSet())));
    System.out.println("flatMappingMap = " + flatMappingMap);
}

// Result
groupingMappingMap = {
    TWOSOME=[355, 414], 
    MEGA=[680, 1000], 
    STARBUCKS=[355, 473, 592], 
    PAIKDABANG=[625, 946], 
    EDIYA=[420, 650]
}
flatMappingMap = {
    MEAT=[salty, greasy, roasted, fried, crisp], 
    FISH=[roasted, tasty, fresh, delicious], 
    OTHER=[salty, greasy, natural, light, tasty, fresh, fried]
}
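
The flatMapping example above depends on a Dish class that is not shown here. A smaller self-contained sketch of the same idea, assuming Java 9 or later and a hypothetical Order type that holds a customer name and a list of ordered items:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class FlatMappingSketch {
    // Hypothetical type: one customer with several ordered items
    static final class Order {
        final String customer;
        final List<String> items;
        Order(String customer, String... items) {
            this.customer = customer;
            this.items = Arrays.asList(items);
        }
    }

    static final List<Order> ORDERS = Arrays.asList(
            new Order("kim", "latte", "mocha"),
            new Order("lee", "latte"),
            new Order("kim", "latte", "tea"));

    static Map<String, Set<String>> itemsByCustomer() {
        // Each order contributes a stream of items; flatMapping merges them per customer
        return ORDERS.stream()
                .collect(Collectors.groupingBy(o -> o.customer,
                        Collectors.flatMapping(o -> o.items.stream(), Collectors.toSet())));
    }

    public static void main(String[] args) {
        System.out.println(itemsByCustomer());
    }
}
```

Using mapping here instead would produce a set of lists per customer; flatMapping is what flattens them into a single set of items.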

2-3. Multi-level grouping
Collectors.groupingBy(.. , Collectors.groupingBy(..))

To group already grouped elements once more, pass a second 'groupingBy' as the downstream collector of the two-argument 'groupingBy' method. This produces a multi-level grouping.

@Test
void collectors_grouping_test4() {

    // Group by Brands first,
    // then group each brand's elements again by size
    Map<Brands, Map<Size, List<Coffee>>> twoGroupingMap = coffeeList.stream()
            .collect(groupingBy(Coffee::getBrands,
                    groupingBy(coffee -> {
                        if (coffee.getCapacity() < 400) {
                            return Size.SMALL;
                        } else if (coffee.getCapacity() < 750) {
                            return Size.MEDIUM;
                        } else {
                            return Size.LARGE;
                        }
                    })));
    System.out.println("twoGroupingMap = " + twoGroupingMap);
}

// Result
twoGroupingMap = {
    STARBUCKS={
        MEDIUM=[Coffee(price=5000, capacity=473, brands=STARBUCKS), Coffee(price=5500, capacity=592, brands=STARBUCKS)], 
        SMALL=[Coffee(price=4500, capacity=355, brands=STARBUCKS)]
    }, 
    MEGA={
        MEDIUM=[Coffee(price=2000, capacity=680, brands=MEGA)], 
        LARGE=[Coffee(price=3000, capacity=1000, brands=MEGA)]
    }, 
    TWOSOME={
        MEDIUM=[Coffee(price=5000, capacity=414, brands=TWOSOME)],
        SMALL=[Coffee(price=4500, capacity=355, brands=TWOSOME)]
    }, 
    PAIKDABANG={
        MEDIUM=[Coffee(price=2000, capacity=625, brands=PAIKDABANG)], 
        LARGE=[Coffee(price=3000, capacity=946, brands=PAIKDABANG)]
    }, 
    EDIYA={
        MEDIUM=[Coffee(price=3200, capacity=420, brands=EDIYA), Coffee(price=4200, capacity=650, brands=EDIYA)]
    }
}
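
The inner classifier becomes easier to read when it is moved into a named method, and the innermost level can be any collector, not only a list. A minimal sketch that counts coffees per brand and size, assuming a simplified stand-in for the Coffee class and String labels in place of the Size enum (all names here are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MultiLevelSketch {
    // Hypothetical stand-in for the article's Coffee class
    static final class Coffee {
        final String brand;
        final int capacity;
        Coffee(String brand, int capacity) { this.brand = brand; this.capacity = capacity; }
    }

    static final List<Coffee> COFFEES = Arrays.asList(
            new Coffee("MEGA", 680), new Coffee("MEGA", 1000),
            new Coffee("STARBUCKS", 355), new Coffee("STARBUCKS", 473),
            new Coffee("STARBUCKS", 592));

    // Same thresholds as the test above, extracted into a named classifier
    static String sizeOf(Coffee c) {
        if (c.capacity < 400) return "SMALL";
        if (c.capacity < 750) return "MEDIUM";
        return "LARGE";
    }

    static Map<String, Map<String, Long>> countByBrandAndSize() {
        // Level 1: brand, level 2: size, innermost collector: counting
        return COFFEES.stream()
                .collect(Collectors.groupingBy(c -> c.brand,
                        Collectors.groupingBy(MultiLevelSketch::sizeOf, Collectors.counting())));
    }

    public static void main(String[] args) {
        System.out.println(countByBrandAndSize());
    }
}
```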

2-4. Collecting data into subgroups
Collectors.maxBy()
Collectors.minBy()
Collectors.counting()

Grouped data can be reduced to a single result per group by passing the Collectors methods introduced earlier, such as 'minBy', 'maxBy', and 'counting', as the downstream collector.

@Test
void collectors_groupingBy_test5() {

    // Group by Brands, then
    // keep only the Coffee with the highest price in each group
    Map<Brands, Optional<Coffee>> groupingMaxByMap = coffeeList.stream()
            .collect(groupingBy(Coffee::getBrands, maxBy(Comparator.comparing(Coffee::getPrice))));
    System.out.println("collect = " + groupingMaxByMap);

    // Group by Brands, then
    // return the number of elements in each group
    Map<Brands, Long> groupingCountingMap = coffeeList.stream()
            .collect(groupingBy(Coffee::getBrands, counting()));
    System.out.println("groupingCountingMap = " + groupingCountingMap);
}

// Result
collect = {
    TWOSOME=Optional[Coffee(price=5000, capacity=414, brands=TWOSOME)], 
    EDIYA=Optional[Coffee(price=4200, capacity=650, brands=EDIYA)], 
    PAIKDABANG=Optional[Coffee(price=3000, capacity=946, brands=PAIKDABANG)], 
    STARBUCKS=Optional[Coffee(price=5500, capacity=592, brands=STARBUCKS)], 
    MEGA=Optional[Coffee(price=3000, capacity=1000, brands=MEGA)]
}
groupingCountingMap = {
    TWOSOME=2, 
    EDIYA=2,
    PAIKDABANG=2, 
    STARBUCKS=3, 
    MEGA=2
}
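
The other summarizing collectors from section 1 work as downstream collectors in the same way. A minimal sketch with summingInt and averagingInt per brand, assuming a simplified stand-in for the Coffee class (the nested class, sample data, and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SubgroupSummarySketch {
    // Hypothetical stand-in for the article's Coffee class
    static final class Coffee {
        final String brand;
        final int price;
        Coffee(String brand, int price) { this.brand = brand; this.price = price; }
    }

    static final List<Coffee> COFFEES = Arrays.asList(
            new Coffee("STARBUCKS", 4500), new Coffee("STARBUCKS", 5000),
            new Coffee("STARBUCKS", 5500), new Coffee("MEGA", 2000),
            new Coffee("MEGA", 3000));

    static Map<String, Integer> totalPriceByBrand() {
        // Sum of prices within each brand
        return COFFEES.stream()
                .collect(Collectors.groupingBy(c -> c.brand, Collectors.summingInt(c -> c.price)));
    }

    static Map<String, Double> averagePriceByBrand() {
        // Average price within each brand
        return COFFEES.stream()
                .collect(Collectors.groupingBy(c -> c.brand, Collectors.averagingInt(c -> c.price)));
    }

    public static void main(String[] args) {
        System.out.println(totalPriceByBrand());
        System.out.println(averagePriceByBrand());
    }
}
```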

2-5. Converting the grouping result to another type
Collectors.collectingAndThen()

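The Collectors 'collectingAndThen()' method applies a finishing function to a collector's result, so each group can be converted into the form you want. The following example changes the grouping result from 2-4, Map<Brands, Optional<Coffee>>, into Map<Brands, Coffee>. Calling Optional::get is safe here because groupingBy creates a key only when at least one element maps to it.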

@Test
void collectors_groupingBy_test6() {

    // Use collectingAndThen to unwrap Optional<Coffee> so each group maps to a Coffee
    Map<Brands, Coffee> coffeeMap = coffeeList.stream()
            .collect(groupingBy(Coffee::getBrands, 
                    collectingAndThen(maxBy(Comparator.comparing(Coffee::getPrice)), Optional::get)));
    System.out.println("coffeeMap = " + coffeeMap);
}

// Result
coffeeMap = {
    TWOSOME=Coffee(price=5000, capacity=414, brands=TWOSOME), 
    EDIYA=Coffee(price=4200, capacity=650, brands=EDIYA), 
    PAIKDABANG=Coffee(price=3000, capacity=946, brands=PAIKDABANG), 
    STARBUCKS=Coffee(price=5500, capacity=592, brands=STARBUCKS), 
    MEGA=Coffee(price=3000, capacity=1000, brands=MEGA)
}
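
As an alternative that never creates an Optional, the same "most expensive coffee per brand" map can be built with toMap and a merge function. This is a different technique from the collectingAndThen approach above, shown as a minimal sketch with a simplified stand-in for the Coffee class (the nested class, sample data, and names are hypothetical):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class BestPerBrandSketch {
    // Hypothetical stand-in for the article's Coffee class
    static final class Coffee {
        final String brand;
        final int price;
        Coffee(String brand, int price) { this.brand = brand; this.price = price; }
    }

    static final Comparator<Coffee> BY_PRICE = Comparator.comparingInt(c -> c.price);

    static final List<Coffee> COFFEES = Arrays.asList(
            new Coffee("STARBUCKS", 4500), new Coffee("STARBUCKS", 5500),
            new Coffee("MEGA", 3000), new Coffee("MEGA", 2000));

    static Map<String, Coffee> mostExpensiveByBrand() {
        // toMap's third argument merges two values that share a key;
        // BinaryOperator.maxBy keeps the greater one according to BY_PRICE
        return COFFEES.stream()
                .collect(Collectors.toMap(c -> c.brand, Function.identity(),
                        BinaryOperator.maxBy(BY_PRICE)));
    }

    public static void main(String[] args) {
        mostExpensiveByBrand().forEach((brand, coffee) ->
                System.out.println(brand + " = " + coffee.price));
    }
}
```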

3. Partitioning elements
Collectors.partitioningBy(..)

'partitioningBy' splits the stream's elements into two groups according to a given condition (Predicate): the elements for which the Predicate is true and those for which it is false.

@Test
void collectors_groupingBy_test7() {

    // Split into two groups: capacity of 600 or more, and capacity below 600
    Map<Boolean, List<Coffee>> partitioningMap = coffeeList.stream()
            .collect(partitioningBy(coffee -> coffee.getCapacity() >= 600));
    System.out.println("partitioningMap = " + partitioningMap);
}

// Result
partitioningMap = {
    false=[Coffee(price=4500, capacity=355, brands=STARBUCKS), Coffee(price=5000, capacity=473, brands=STARBUCKS), Coffee(price=5500, capacity=592, brands=STARBUCKS), Coffee(price=4500, capacity=355, brands=TWOSOME), Coffee(price=5000, capacity=414, brands=TWOSOME), Coffee(price=3200, capacity=420, brands=EDIYA)], 
    true=[Coffee(price=2000, capacity=680, brands=MEGA), Coffee(price=3000, capacity=1000, brands=MEGA), Coffee(price=2000, capacity=625, brands=PAIKDABANG), Coffee(price=3000, capacity=946, brands=PAIKDABANG), Coffee(price=4200, capacity=650, brands=EDIYA)]
}
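
Like groupingBy, partitioningBy accepts a downstream collector, and its result always contains both the true and the false key, even when one side is empty. A minimal sketch, assuming plain Integer capacities (the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningSketch {
    static Map<Boolean, Long> countByLarge(List<Integer> capacities) {
        // true: capacity of 600 or more, false: capacity below 600
        return capacities.stream()
                .collect(Collectors.partitioningBy(c -> c >= 600, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByLarge(Arrays.asList(355, 680, 1000)));
        // Both keys are present even though nothing falls on the false side
        System.out.println(countByLarge(Arrays.asList(680, 1000)));
    }
}
```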