Binning by value is the only original binning method implemented in this package. It is inspired by a common marketing task: binning accounts by their sales. For example, create 10 bins where each bin represents 10% of total market sales. The first bin contains the highest-sales accounts and therefore has the smallest number of accounts, whereas the last bin contains the lowest-sales accounts and therefore needs the most accounts to reach 10% of market sales.
tibble::tibble(SALES = as.integer(rnorm(1000L, mean = 10000L, sd = 3000))) -> sales_data
sales_data %>%
  bin_cols(SALES, bin_type = "value") -> sales_data1
sales_data1
#> # A tibble: 1,000 × 2
#> SALES SALES_va10
#> <int> <int>
#> 1 11159 6
#> 2 10510 5
#> 3 11642 7
#> 4 6813 1
#> 5 11377 6
#> 6 12396 8
#> 7 8848 3
#> 8 7471 2
#> 9 13750 9
#> 10 8247 2
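#> # ℹ 990 more rows

Notice that the sum (column .sum) is approximately equal across bins, while the number of accounts per bin (.count) grows as the sales values get smaller.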
sales_data1 %>%
  bin_summary() %>%
  print(width = Inf)
#> # A tibble: 10 × 14
#> column method n_bins .rank .min .mean .max .count .uniques
#> <chr> <chr> <int> <int> <int> <dbl> <int> <int> <int>
#> 1 SALES equal value 10 10 14500 15702. 20805 64 62
#> 2 SALES equal value 10 9 13168 13730. 14479 72 70
#> 3 SALES equal value 10 8 12279 12712. 13158 78 74
#> 4 SALES equal value 10 7 11565 11932. 12275 83 81
#> 5 SALES equal value 10 6 10895 11246. 11560 88 84
#> 6 SALES equal value 10 5 10198 10509. 10893 94 91
#> 7 SALES equal value 10 4 9352 9767. 10196 102 95
#> 8 SALES equal value 10 3 8368 8855. 9344 112 111
#> 9 SALES equal value 10 2 7065 7727. 8348 128 122
#> 10 SALES equal value 10 1 1865 5533. 7063 179 176
#> relative_value .sum .med .sd width
#> <dbl> <int> <dbl> <dbl> <int>
#> 1 100 1004944 15340. 1254. 6305
#> 2 87.4 988586 13704. 378. 1311
#> 3 81.0 991532 12690. 278. 879
#> 4 76.0 990368 11935 213. 710
#> 5 71.6 989685 11251 190. 665
#> 6 66.9 987842 10458. 209. 695
#> 7 62.2 996266 9762 258. 844
#> 8 56.4 991748 8832. 290. 976
#> 9 49.2 989066 7756. 383. 1283
#> 10 35.2 990422 5863 1170. 5198
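
The idea behind value binning can be illustrated with a small base R sketch. This is not the package's internal code: the helper name bin_by_value and its n_bins argument are hypothetical, and the approach (sort descending, take the running share of the total, cut that share into equal slices) is just one plausible way to get bins with roughly equal sums.

```r
# A base R sketch of value binning; an illustration under stated assumptions,
# not the implementation used by bin_cols().
set.seed(1)
sales <- as.integer(rnorm(1000L, mean = 10000, sd = 3000))

# Hypothetical helper: assign each value a bin rank in 1..n_bins so that
# every bin holds roughly the same share of sum(x).
bin_by_value <- function(x, n_bins = 10L) {
  ord <- order(x, decreasing = TRUE)        # largest values first
  cum_share <- cumsum(x[ord]) / sum(x)      # running share of the total
  # Slice the running share into n_bins equal parts, clamped to 1..n_bins.
  slice <- pmin(pmax(ceiling(cum_share * n_bins), 1), n_bins)
  # The slice holding the largest values gets the highest rank.
  rank_sorted <- n_bins - slice + 1
  out <- integer(length(x))
  out[ord] <- as.integer(rank_sorted)       # restore the original row order
  out
}

bins <- bin_by_value(sales)

# Per-bin account count and total sales, ordered by rank 1..10.
summary_tbl <- data.frame(
  rank  = sort(unique(bins)),
  count = as.vector(table(bins)),
  sum   = as.vector(tapply(sales, bins, sum))
)
summary_tbl
```

Each bin's sum can differ from one tenth of the total by at most about one account's sales, because a bin boundary always falls between two whole accounts. This sketch also lets tied values land in different bins; whether the package does the same is not stated here.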