Binning by value is the only original binning method implemented in this package. It is inspired by a common marketing task: binning accounts by their sales. For example, you might create 10 bins where each bin represents 10% of all market sales. The first bin contains the highest-sales accounts and therefore has the smallest number of accounts, whereas the last bin contains the lowest-sales accounts and therefore needs the largest number of accounts to reach 10% of market sales.
tibble::tibble(SALES = as.integer(rnorm(1000L, mean = 10000L, sd = 3000))) -> sales_data
sales_data %>%
  bin_cols(SALES, bin_type = "value") -> sales_data1
sales_data1
#> # A tibble: 1,000 × 2
#> SALES SALES_va10
#> <int> <int>
#> 1 8835 3
#> 2 13663 9
#> 3 7100 2
#> 4 12844 8
#> 5 10709 5
#> 6 12138 7
#> 7 9584 4
#> 8 9762 4
#> 9 14492 9
#> 10 6649 1
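#> # ℹ 990 more rows

Notice that the sum of sales (the .sum column) is approximately equal across bins, while the number of accounts per bin (.count) grows as the sales values get smaller.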
sales_data1 %>%
  bin_summary() %>%
  print(width = Inf)
#> # A tibble: 10 × 14
#> column method n_bins .rank .min .mean .max .count .uniques
#> <chr> <chr> <int> <int> <int> <dbl> <int> <int> <int>
#> 1 SALES equal value 10 10 14621 15944. 19440 63 63
#> 2 SALES equal value 10 9 13357 13952. 14603 71 70
#> 3 SALES equal value 10 8 12418 12827. 13325 77 73
#> 4 SALES equal value 10 7 11639 12062. 12409 82 76
#> 5 SALES equal value 10 6 10857 11260. 11634 88 86
#> 6 SALES equal value 10 5 10096 10481. 10849 95 90
#> 7 SALES equal value 10 4 9307 9665. 10092 102 96
#> 8 SALES equal value 10 3 8273 8800. 9306 113 107
#> 9 SALES equal value 10 2 6902 7589. 8266 130 123
#> 10 SALES equal value 10 1 723 5526. 6897 179 172
#> relative_value .sum .med .sd width
#> <dbl> <int> <dbl> <dbl> <int>
#> 1 100 1004492 15708 1111. 4819
#> 2 87.5 990617 13945 350. 1246
#> 3 80.4 987661 12819 266. 907
#> 4 75.7 989098 12054. 215. 770
#> 5 70.6 990844 11249 219. 777
#> 6 65.7 995681 10496 211. 753
#> 7 60.6 985825 9670. 228. 785
#> 8 55.2 994448 8766 318. 1033
#> 9 47.6 986589 7612 411. 1364
#> 10 34.7 989176 5835 1177. 6174
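
The idea behind value binning can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper `bin_by_value()` and the simulated log-normal sales figures are assumptions made for illustration only. The sketch sorts the values, tracks the running share of the total, and cuts that share into equal slices, so bin 1 holds the smallest values and bin 10 the largest, matching the `.rank` ordering in the summary above.

```r
# Hypothetical helper (not part of the package): assign value-based bins
# so that each bin holds roughly 1 / n_bins of the column total.
# Assumes all values are positive.
bin_by_value <- function(x, n_bins = 10L) {
  stopifnot(all(x > 0))
  ord <- order(x)                        # smallest values first
  cum_share <- cumsum(x[ord]) / sum(x)   # running share of the total
  # Cut the running share into n_bins equal slices; the small tolerance
  # guards against floating-point error at exact slice boundaries.
  slice <- ceiling(cum_share * n_bins - 1e-9)
  slice <- pmin(n_bins, pmax(1, slice))  # clamp to 1..n_bins
  bins <- integer(length(x))
  bins[ord] <- as.integer(slice)         # restore the original row order
  bins
}

# Simulated, strictly positive sales figures (illustrative data only)
set.seed(1)
sales <- round(rlnorm(1000L, meanlog = 9, sdlog = 0.5))

bins <- bin_by_value(sales)
bin_sums <- tapply(sales, bins, sum)     # total sales per bin
bin_counts <- table(bins)                # number of accounts per bin

print(bin_sums)
print(bin_counts)
```

Because each slice boundary can only be missed by at most one account's sales, every bin total stays within one maximum value of a tenth of the overall total, while the bin of smallest values needs far more accounts than the bin of largest values.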