diff --git a/articles/plotting.html b/articles/plotting.html index 45d6a2f..f6c385a 100644 --- a/articles/plotting.html +++ b/articles/plotting.html @@ -98,13 +98,30 @@
segcurve()
provides a simple way
-of plotting a segregation curve:
+of plotting one or several segregation curves:
In this case, state A
is the most segregated, while
+state B
and C
are similarly segregated, but at
+a lower level. Segregation curves are closely related to the index of
+dissimilarity, and here this corresponds to the following index
+values:
+# converting to data.table makes this easier
+data.table::as.data.table(schools00)[
+ race %in% c("white", "asian"),
+ dissimilarity(.SD, "race", "school", weight = "n"),
+ by = .(state)
+]
+#> state stat est
+#> 1: A D 0.6558592
+#> 2: B D 0.4002980
+#> 3: C D 0.3886178
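The link between the index of dissimilarity and the segregation curve can be checked by hand. The following is a minimal base R sketch on made-up counts (the `counts` data frame and its values are hypothetical and are not taken from `schools00`). It assumes the standard two-group definition, D = 0.5 * sum(|a_i/A - b_i/B|), and compares it with the largest vertical gap between the cumulative group shares once units are ordered by their group ratio.

```r
# Hypothetical toy counts: two groups across four schools
counts <- data.frame(
  school = c("s1", "s2", "s3", "s4"),
  white  = c(90, 60, 30, 20),
  asian  = c(10, 20, 30, 40)
)

# Each school's share of each group's total
share_white <- counts$white / sum(counts$white)
share_asian <- counts$asian / sum(counts$asian)

# Index of dissimilarity: half the sum of absolute differences in shares
d_index <- 0.5 * sum(abs(share_white - share_asian))

# Segregation-curve view: order schools by the white/asian ratio,
# cumulate both shares, and take the largest gap between the two
ord <- order(share_white / share_asian, decreasing = TRUE)
max_gap <- max(abs(cumsum(share_white[ord]) - cumsum(share_asian[ord])))

print(c(d_index = d_index, max_gap = max_gap))
```

Both quantities coincide, which is why a curve that lies further from the diagonal goes together with a larger index value.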
axis_labels
.
Examples of how to use these arguments are given below:
-+- -sch <- subset(schools00, state == "A")
# basic segplot
segplot(sch, "race", "school", weight = "n", axis_labels = "both")
+ +- -# order by majority group (white in this case)
segplot(sch, "race", "school", weight = "n", order = "majority")
+ +- -# increase the space between bars
# (has to be very low here because there are many schools in this dataset)
segplot(sch, "race", "school", weight = "n", bar_space = 0.0005)
+ +- +# change the reference distribution # (here, we just use an equalized distribution across the five groups) @@ -161,7 +178,7 @@
Segplot weight = "n", reference_distribution = ref )
+#> 1: M under 0 0.004806118 7.679837e-05 +#> 2: H under 0 0.004730100 7.558367e-05Compressing segregation information @@ -187,7 +204,7 @@
Compressing segregation information
The second step is then to run the actual compression algorithm using
-compress()
. For this example, we choose to compress based on a relatively small window:++#> [1] 0.4138701 0.4165712# compression based on window of 20 'neighboring' units # in terms of local segregation (alternatively, neighbors can be a data frame) comp <- compress(sch, "race", "school", @@ -196,7 +213,7 @@
Compressing segregation information
After running
-compress()
—which can take some time depending on how many neighbors need to be considered—the output summarizes the compression that can be achieved:++#> 1: M 0.4218735 0.0007364381 0.4205171,0.4232621 0.003665477 +#> 2: H 0.4152207 0.0006890576 0.4139019,0.4165112 0.003587629comp #> Compression of dataset with 560 units #> Original M: 0.4085965; Final M: 0 @@ -210,12 +227,12 @@
Compressing segregation information through
comp$iterations
. This data frame can also be used to generate a plot that shows the relationship between the number of merges and the loss in segregation information: -@@ -481,8 +481,8 @@+- +scree_plot(comp)
Another way to learn more about the compression is to visualize the information as a dendrogram:
-+- +dend <- as.dendrogram(comp) dendextend::labels(dend) <- NULL # remove the labels #> Warning in `labels<-.dendrogram`(`*tmp*`, value = NULL): The lengths of the new @@ -224,13 +241,13 @@
Compressing segregation information #> Warning in rep(new_labels, length.out = leaves_length): 'x' is NULL so the #> result will be NULL plot(dend)
The third step is to create a new dataset based on the desired level of compression. This can be achieved using the function
-merge_units()
, and either n_units
or percent
can be specified to indicate the desired level of compression.+#> 1: M 0.4219383 0.0008178582 0.4203089,0.4236068 0.003600699 +#> 2: H 0.4152563 0.0007530694 0.4137359,0.4167181 0.003552063+sch_compressed <- merge_units(comp, n_units = 15) # or, for instance: merge_units(comp, percent = 0.80) head(sch_compressed) @@ -243,9 +260,9 @@
Compressing segregation information #> 6: M2 hisp 642
The compressed dataset has the same format as the original dataset and can now be used to produce another segplot, e.g.
-diff --git a/articles/plotting_files/figure-html/unnamed-chunk-10-1.png b/articles/plotting_files/figure-html/unnamed-chunk-10-1.png new file mode 100644 index 0000000..f5dc90e Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-10-1.png differ diff --git a/articles/plotting_files/figure-html/unnamed-chunk-2-1.png b/articles/plotting_files/figure-html/unnamed-chunk-2-1.png index 943fdf7..8f41812 100644 Binary files a/articles/plotting_files/figure-html/unnamed-chunk-2-1.png and b/articles/plotting_files/figure-html/unnamed-chunk-2-1.png differ diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-1.png b/articles/plotting_files/figure-html/unnamed-chunk-4-1.png new file mode 100644 index 0000000..d59549c Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-1.png differ diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-2.png b/articles/plotting_files/figure-html/unnamed-chunk-4-2.png new file mode 100644 index 0000000..f78d6b6 Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-2.png differ diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-3.png b/articles/plotting_files/figure-html/unnamed-chunk-4-3.png new file mode 100644 index 0000000..0d7b72f Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-3.png differ diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-4.png b/articles/plotting_files/figure-html/unnamed-chunk-4-4.png new file mode 100644 index 0000000..0cca0b7 Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-4.png differ diff --git a/articles/plotting_files/figure-html/unnamed-chunk-7-1.png b/articles/plotting_files/figure-html/unnamed-chunk-7-1.png index 3c95ff4..23c30f6 100644 Binary files a/articles/plotting_files/figure-html/unnamed-chunk-7-1.png and b/articles/plotting_files/figure-html/unnamed-chunk-7-1.png differ diff --git 
a/articles/plotting_files/figure-html/unnamed-chunk-8-1.png b/articles/plotting_files/figure-html/unnamed-chunk-8-1.png index f5dc90e..3c95ff4 100644 Binary files a/articles/plotting_files/figure-html/unnamed-chunk-8-1.png and b/articles/plotting_files/figure-html/unnamed-chunk-8-1.png differ diff --git a/articles/segregation.html b/articles/segregation.html index 2b69e50..211641b 100644 --- a/articles/segregation.html +++ b/articles/segregation.html @@ -249,8 +249,8 @@+- +segplot(sch_compressed, "race", "school", weight = "n")
Computing the M and H indices) #> 500 bootstrap iterations on 877739 observations #> stat est se CI bias -#> 1: M 0.4218888 0.0008052111 0.4202830,0.4234599 0.003650193 -#> 2: H 0.4152473 0.0007114470 0.4137465,0.4165963 0.003561007
As there is a large number of observations, the standard errors are very small.
Inference)) #> 500 bootstrap iterations on 877739 observations #> stat est se CI bias -#> 1: M 0.4219476 0.0007579537 0.4205498,0.4234631 0.003591399 -#> 2: H 0.4152541 0.0006763022 0.4138784,0.4166825 0.003554205
The confidence intervals are based on the percentiles from the bootstrap distribution, and hence require a large number of bootstrap iterations for valid interpretation. The estimate
est
that @@ -500,10 +500,10 @@Inference
# M with(se, c(est[1] - 1.96 * se[1], est[1] + 1.96 * se[1])) -#> [1] 0.4204620 0.4234332 +#> [1] 0.4204301 0.4233169 # H with(se, c(est[2] - 1.96 * se[2], est[2] + 1.96 * se[2])) -#> [1] 0.4139286 0.4165797
provide effectively the same coverage as the confidence intervals obtained from the percentile bootstrap.
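To see why the two kinds of interval tend to agree, here is a small base R sketch that does not use the package at all. It bootstraps the mean of simulated data (the vector `x` and all numbers are hypothetical stand-ins for a segregation index) and compares a normal-approximation interval, estimate plus or minus 1.96 standard errors, with the percentile interval from the same bootstrap distribution.

```r
set.seed(1)

# Hypothetical sample; stands in for the data behind an index estimate
x <- rnorm(500, mean = 0.42, sd = 0.02)

# Bootstrap distribution of the statistic (here simply the mean)
boot_est <- replicate(1000, mean(sample(x, replace = TRUE)))

est <- mean(x)
se <- sd(boot_est)  # bootstrap standard error

# Normal-approximation interval
ci_normal <- c(est - 1.96 * se, est + 1.96 * se)

# Percentile interval from the bootstrap distribution
ci_percentile <- unname(quantile(boot_est, c(0.025, 0.975)))

print(rbind(normal = ci_normal, percentile = ci_percentile))
```

When the bootstrap distribution is roughly symmetric and bell-shaped, as it usually is with many observations, the two intervals differ only by a small fraction of a standard error.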
Whenever the bootstrap is used, the bootstrap distributions for each @@ -535,8 +535,8 @@
Inference
mutual_expected(schools00, "race", "school", weight = "n", n_bootstrap = 500) #> stat est se -#> 1: M under 0 0.004808867 7.623260e-05 -#> 2: H under 0 0.004732807 7.502684e-05
Here, there is no concern about bias due to a small sample size.
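The small-sample issue can be illustrated with a simulation in base R. This is a sketch under stated assumptions, not the package's implementation: `m_index()` and `simulate_m()` are hypothetical helpers, the M index is computed with the natural logarithm, and the tables are drawn so that unit and group are independent, meaning the true M is zero and any positive average reflects bias.

```r
set.seed(42)

# M index (mutual information) from a units-by-groups count matrix
m_index <- function(tab) {
  p <- tab / sum(tab)
  expected <- outer(rowSums(p), colSums(p))  # joint shares under independence
  nz <- p > 0                                # empty cells contribute zero
  sum(p[nz] * log(p[nz] / expected[nz]))
}

# Draw one table of n observations with independent unit and group membership
simulate_m <- function(n, n_units = 10, n_groups = 3) {
  n_cells <- n_units * n_groups
  cell_probs <- rep(1 / n_cells, n_cells)
  tab <- matrix(rmultinom(1, n, cell_probs), nrow = n_units)
  m_index(tab)
}

# Average M over repeated draws, for a small and a large sample
m_small <- mean(replicate(200, simulate_m(100)))
m_large <- mean(replicate(200, simulate_m(100000)))

print(c(small_sample = m_small, large_sample = m_large))
```

The average for the small sample is clearly above zero, while for the large sample it is close to zero, which mirrors the very small expected values reported for the full dataset.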
@@ -557,8 +557,8 @@Decomposing differences in indices#> 3: diff -0.012153884 #> 4: additions -0.003412776 #> 5: removals -0.011405093 -#> 6: group_marginal 0.017865684 -#> 7: unit_marginal -0.011707361 +#> 6: group_marginal 0.018550238 +#> 7: unit_marginal -0.012391915 #> 8: structural -0.003494338
This method also supports inference by setting
diff --git a/articles/segregation_files/figure-html/unnamed-chunk-15-1.png b/articles/segregation_files/figure-html/unnamed-chunk-15-1.png index eaee46c..60b04bb 100644 Binary files a/articles/segregation_files/figure-html/unnamed-chunk-15-1.png and b/articles/segregation_files/figure-html/unnamed-chunk-15-1.png differ diff --git a/articles/segregation_files/figure-html/unnamed-chunk-19-1.png b/articles/segregation_files/figure-html/unnamed-chunk-19-1.png index 4b739de..75ba345 100644 Binary files a/articles/segregation_files/figure-html/unnamed-chunk-19-1.png and b/articles/segregation_files/figure-html/unnamed-chunk-19-1.png differ diff --git a/news/index.html b/news/index.html index d7e1882..8a275f1 100644 --- a/news/index.html +++ b/news/index.html @@ -58,6 +58,7 @@se = TRUE
.segregation (development version)
segcurve
function
CRAN release: 2023-08-24
diff --git a/pkgdown.yml b/pkgdown.yml index 0be8039..f404504 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -5,7 +5,7 @@ articles: faq: faq.html plotting: plotting.html segregation: segregation.html -last_built: 2023-10-03T12:53Z +last_built: 2023-10-03T13:33Z urls: reference: https://elbersb.com/segregation/reference article: https://elbersb.com/segregation/articles diff --git a/reference/dissimilarity_expected.html b/reference/dissimilarity_expected.html index e644f4b..9993076 100644 --- a/reference/dissimilarity_expected.html +++ b/reference/dissimilarity_expected.html @@ -134,13 +134,13 @@