diff --git a/articles/plotting.html b/articles/plotting.html
index 45d6a2f..f6c385a 100644
--- a/articles/plotting.html
+++ b/articles/plotting.html
@@ -98,13 +98,30 @@ <h2 id="segregation-curve">Segregation curve<a class="anchor" aria-label="anchor
 </h2>
 <p>The segregation curve was first introduced by <a href="https://www.jstor.org/stable/2088328" class="external-link">Duncan and Duncan
 (1955)</a>. The function <code><a href="../reference/segcurve.html">segcurve()</a></code> provides a simple way
-of plotting a segregation curve:</p>
+of plotting one or several segregation curves:</p>
 <div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
-<code class="sourceCode R"><span><span class="fu"><a href="../reference/segcurve.html">segcurve</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/subset.html" class="external-link">subset</a></span><span class="op">(</span><span class="va">schools00</span>, <span class="va">race</span> <span class="op"><a href="https://rdrr.io/r/base/match.html" class="external-link">%in%</a></span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"white"</span>, <span class="st">"black"</span><span class="op">)</span><span class="op">)</span>,</span>
+<code class="sourceCode R"><span><span class="fu"><a href="../reference/segcurve.html">segcurve</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/subset.html" class="external-link">subset</a></span><span class="op">(</span><span class="va">schools00</span>, <span class="va">race</span> <span class="op"><a href="https://rdrr.io/r/base/match.html" class="external-link">%in%</a></span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"white"</span>, <span class="st">"asian"</span><span class="op">)</span><span class="op">)</span>,</span>
 <span>  <span class="st">"race"</span>, <span class="st">"school"</span>,</span>
-<span>  weight <span class="op">=</span> <span class="st">"n"</span></span>
+<span>  weight <span class="op">=</span> <span class="st">"n"</span>,</span>
+<span>  segment <span class="op">=</span> <span class="st">"state"</span> <span class="co"># leave this out to produce a single curve</span></span>
 <span><span class="op">)</span></span></code></pre></div>
 <p><img src="plotting_files/figure-html/unnamed-chunk-2-1.png" width="700"></p>
+<p>In this case, state <code>A</code> is the most segregated, while
+states <code>B</code> and <code>C</code> are similarly segregated, but at
+a lower level. Segregation curves are closely related to the index of
+dissimilarity, and here this corresponds to the following index
+values:</p>
+<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
+<code class="sourceCode R"><span><span class="co"># converting to data.table makes this easier</span></span>
+<span><span class="fu">data.table</span><span class="fu">::</span><span class="fu"><a href="https://Rdatatable.gitlab.io/data.table/reference/as.data.table.html" class="external-link">as.data.table</a></span><span class="op">(</span><span class="va">schools00</span><span class="op">)</span><span class="op">[</span></span>
+<span>  <span class="va">race</span> <span class="op"><a href="https://rdrr.io/r/base/match.html" class="external-link">%in%</a></span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"white"</span>, <span class="st">"asian"</span><span class="op">)</span>,</span>
+<span>  <span class="fu"><a href="../reference/dissimilarity.html">dissimilarity</a></span><span class="op">(</span><span class="va">.SD</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span><span class="op">)</span>,</span>
+<span>  by <span class="op">=</span> <span class="fu">.</span><span class="op">(</span><span class="va">state</span><span class="op">)</span></span>
+<span><span class="op">]</span></span>
+<span><span class="co">#&gt;    state stat       est</span></span>
+<span><span class="co">#&gt; 1:     A    D 0.6558592</span></span>
+<span><span class="co">#&gt; 2:     B    D 0.4002980</span></span>
+<span><span class="co">#&gt; 3:     C    D 0.3886178</span></span></code></pre></div>
 </div>
 <div class="section level2">
 <h2 id="segplot">Segplot<a class="anchor" aria-label="anchor" href="#segplot"></a>
@@ -129,24 +146,24 @@ <h2 id="segplot">Segplot<a class="anchor" aria-label="anchor" href="#segplot"></
 either on the left side only, the right side only, or on both sides, use
 the argument <code>axis_labels</code>.</p>
 <p>Examples of how to use these arguments are given below:</p>
-<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
+<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">sch</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/subset.html" class="external-link">subset</a></span><span class="op">(</span><span class="va">schools00</span>, <span class="va">state</span> <span class="op">==</span> <span class="st">"A"</span><span class="op">)</span></span>
 <span></span>
 <span><span class="co"># basic segplot</span></span>
 <span><span class="fu"><a href="../reference/segplot.html">segplot</a></span><span class="op">(</span><span class="va">sch</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span>, axis_labels <span class="op">=</span> <span class="st">"both"</span><span class="op">)</span></span></code></pre></div>
-<p><img src="plotting_files/figure-html/unnamed-chunk-3-1.png" width="700"></p>
-<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
+<p><img src="plotting_files/figure-html/unnamed-chunk-4-1.png" width="700"></p>
+<div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span></span>
 <span><span class="co"># order by majority group (white in this case)</span></span>
 <span><span class="fu"><a href="../reference/segplot.html">segplot</a></span><span class="op">(</span><span class="va">sch</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span>, order <span class="op">=</span> <span class="st">"majority"</span><span class="op">)</span></span></code></pre></div>
-<p><img src="plotting_files/figure-html/unnamed-chunk-3-2.png" width="700"></p>
-<div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
+<p><img src="plotting_files/figure-html/unnamed-chunk-4-2.png" width="700"></p>
+<div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span></span>
 <span><span class="co"># increase the space between bars</span></span>
 <span><span class="co"># (has to be very low here because there are many schools in this dataset)</span></span>
 <span><span class="fu"><a href="../reference/segplot.html">segplot</a></span><span class="op">(</span><span class="va">sch</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span>, bar_space <span class="op">=</span> <span class="fl">0.0005</span><span class="op">)</span></span></code></pre></div>
-<p><img src="plotting_files/figure-html/unnamed-chunk-3-3.png" width="700"></p>
-<div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
+<p><img src="plotting_files/figure-html/unnamed-chunk-4-3.png" width="700"></p>
+<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span></span>
 <span><span class="co"># change the reference distribution</span></span>
 <span><span class="co"># (here, we just use an equalized distribution across the five groups)</span></span>
@@ -161,7 +178,7 @@ <h2 id="segplot">Segplot<a class="anchor" aria-label="anchor" href="#segplot"></
 <span>  weight <span class="op">=</span> <span class="st">"n"</span>,</span>
 <span>  reference_distribution <span class="op">=</span> <span class="va">ref</span></span>
 <span><span class="op">)</span></span></code></pre></div>
-<p><img src="plotting_files/figure-html/unnamed-chunk-3-4.png" width="700"></p>
+<p><img src="plotting_files/figure-html/unnamed-chunk-4-4.png" width="700"></p>
 </div>
 <div class="section level2">
 <h2 id="compressing-segregation-information">Compressing segregation information<a class="anchor" aria-label="anchor" href="#compressing-segregation-information"></a>
@@ -187,7 +204,7 @@ <h2 id="compressing-segregation-information">Compressing segregation information
 <p>The second step is then to run the actual compression algorithm using
 <code><a href="../reference/compress.html">compress()</a></code>. For this example, we choose to compress based
 on a relatively small window:</p>
-<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
+<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="co"># compression based on window of 20 'neighboring' units</span></span>
 <span><span class="co"># in terms of local segregation (alternatively, neighbors can be a data frame)</span></span>
 <span><span class="va">comp</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/compress.html">compress</a></span><span class="op">(</span><span class="va">sch</span>, <span class="st">"race"</span>, <span class="st">"school"</span>,</span>
@@ -196,7 +213,7 @@ <h2 id="compressing-segregation-information">Compressing segregation information
 <p>After running <code><a href="../reference/compress.html">compress()</a></code>—which can take some time
 depending on how many neighbors need to be considered—the output
 summarizes the compression that can be achieved:</p>
-<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
+<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">comp</span></span>
 <span><span class="co">#&gt; Compression of dataset with 560 units</span></span>
 <span><span class="co">#&gt; Original M: 0.4085965; Final M: 0</span></span>
@@ -210,12 +227,12 @@ <h2 id="compressing-segregation-information">Compressing segregation information
 through <code>comp$iterations</code>. This data frame can also be used
 to generate a plot that shows the relationship between the number of
 merges and the loss in segregation information:</p>
-<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
+<div class="sourceCode" id="cb9"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="fu"><a href="../reference/scree_plot.html">scree_plot</a></span><span class="op">(</span><span class="va">comp</span><span class="op">)</span></span></code></pre></div>
-<p><img src="plotting_files/figure-html/unnamed-chunk-6-1.png" width="700"></p>
+<p><img src="plotting_files/figure-html/unnamed-chunk-7-1.png" width="700"></p>
 <p>Another way to learn more about the compression is to visualize the
 information as a dendrogram:</p>
-<div class="sourceCode" id="cb9"><pre class="downlit sourceCode r">
+<div class="sourceCode" id="cb10"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">dend</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/stats/dendrogram.html" class="external-link">as.dendrogram</a></span><span class="op">(</span><span class="va">comp</span><span class="op">)</span></span>
 <span><span class="fu">dendextend</span><span class="fu">::</span><span class="fu">labels</span><span class="op">(</span><span class="va">dend</span><span class="op">)</span> <span class="op">&lt;-</span> <span class="cn">NULL</span> <span class="co"># remove the labels</span></span>
 <span><span class="co">#&gt; Warning in `labels&lt;-.dendrogram`(`*tmp*`, value = NULL): The lengths of the new</span></span>
@@ -224,13 +241,13 @@ <h2 id="compressing-segregation-information">Compressing segregation information
 <span><span class="co">#&gt; Warning in rep(new_labels, length.out = leaves_length): 'x' is NULL so the</span></span>
 <span><span class="co">#&gt; result will be NULL</span></span>
 <span><span class="fu"><a href="https://rdrr.io/r/graphics/plot.default.html" class="external-link">plot</a></span><span class="op">(</span><span class="va">dend</span><span class="op">)</span></span></code></pre></div>
-<p><img src="plotting_files/figure-html/unnamed-chunk-7-1.png" width="700"></p>
+<p><img src="plotting_files/figure-html/unnamed-chunk-8-1.png" width="700"></p>
 <p>The third step is to create a new dataset based on the desired level
 of compression. This can be achieved using the function
 <code><a href="../reference/merge_units.html">merge_units()</a></code>, and either <code>n_units</code> or
 <code>percent</code> can be specified to indicate the desired level of
 compression.</p>
-<div class="sourceCode" id="cb10"><pre class="downlit sourceCode r">
+<div class="sourceCode" id="cb11"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="va">sch_compressed</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/merge_units.html">merge_units</a></span><span class="op">(</span><span class="va">comp</span>, n_units <span class="op">=</span> <span class="fl">15</span><span class="op">)</span></span>
 <span><span class="co"># or, for instance: merge_units(comp, percent = 0.80)</span></span>
 <span><span class="fu"><a href="https://rdrr.io/r/utils/head.html" class="external-link">head</a></span><span class="op">(</span><span class="va">sch_compressed</span><span class="op">)</span></span>
@@ -243,9 +260,9 @@ <h2 id="compressing-segregation-information">Compressing segregation information
 <span><span class="co">#&gt; 6:     M2  hisp  642</span></span></code></pre></div>
 <p>The compressed dataset has the same format as the original dataset
 and can now be used to produce another segplot, e.g.</p>
-<div class="sourceCode" id="cb11"><pre class="downlit sourceCode r">
+<div class="sourceCode" id="cb12"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="fu"><a href="../reference/segplot.html">segplot</a></span><span class="op">(</span><span class="va">sch_compressed</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span><span class="op">)</span></span></code></pre></div>
-<p><img src="plotting_files/figure-html/unnamed-chunk-9-1.png" width="700"></p>
+<p><img src="plotting_files/figure-html/unnamed-chunk-10-1.png" width="700"></p>
 </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
     </nav></aside>
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-10-1.png b/articles/plotting_files/figure-html/unnamed-chunk-10-1.png
new file mode 100644
index 0000000..f5dc90e
Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-10-1.png differ
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-2-1.png b/articles/plotting_files/figure-html/unnamed-chunk-2-1.png
index 943fdf7..8f41812 100644
Binary files a/articles/plotting_files/figure-html/unnamed-chunk-2-1.png and b/articles/plotting_files/figure-html/unnamed-chunk-2-1.png differ
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-1.png b/articles/plotting_files/figure-html/unnamed-chunk-4-1.png
new file mode 100644
index 0000000..d59549c
Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-1.png differ
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-2.png b/articles/plotting_files/figure-html/unnamed-chunk-4-2.png
new file mode 100644
index 0000000..f78d6b6
Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-2.png differ
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-3.png b/articles/plotting_files/figure-html/unnamed-chunk-4-3.png
new file mode 100644
index 0000000..0d7b72f
Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-3.png differ
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-4-4.png b/articles/plotting_files/figure-html/unnamed-chunk-4-4.png
new file mode 100644
index 0000000..0cca0b7
Binary files /dev/null and b/articles/plotting_files/figure-html/unnamed-chunk-4-4.png differ
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-7-1.png b/articles/plotting_files/figure-html/unnamed-chunk-7-1.png
index 3c95ff4..23c30f6 100644
Binary files a/articles/plotting_files/figure-html/unnamed-chunk-7-1.png and b/articles/plotting_files/figure-html/unnamed-chunk-7-1.png differ
diff --git a/articles/plotting_files/figure-html/unnamed-chunk-8-1.png b/articles/plotting_files/figure-html/unnamed-chunk-8-1.png
index f5dc90e..3c95ff4 100644
Binary files a/articles/plotting_files/figure-html/unnamed-chunk-8-1.png and b/articles/plotting_files/figure-html/unnamed-chunk-8-1.png differ
diff --git a/articles/segregation.html b/articles/segregation.html
index 2b69e50..211641b 100644
--- a/articles/segregation.html
+++ b/articles/segregation.html
@@ -249,8 +249,8 @@ <h2 id="computing-the-m-and-h-indices">Computing the M and H indices<a class="an
 <span><span class="op">)</span></span>
 <span><span class="co">#&gt; 500 bootstrap iterations on 877739 observations</span></span>
 <span><span class="co">#&gt;    stat       est           se                  CI        bias</span></span>
-<span><span class="co">#&gt; 1:    M 0.4218888 0.0008052111 0.4202830,0.4234599 0.003650193</span></span>
-<span><span class="co">#&gt; 2:    H 0.4152473 0.0007114470 0.4137465,0.4165963 0.003561007</span></span></code></pre></div>
+<span><span class="co">#&gt; 1:    M 0.4219383 0.0008178582 0.4203089,0.4236068 0.003600699</span></span>
+<span><span class="co">#&gt; 2:    H 0.4152563 0.0007530694 0.4137359,0.4167181 0.003552063</span></span></code></pre></div>
 <p>As there are a large number of observations, the standard errors are very
 small.</p>
 </div>
@@ -481,8 +481,8 @@ <h2 id="inference">Inference<a class="anchor" aria-label="anchor" href="#inferen
 <span><span class="op">)</span><span class="op">)</span></span>
 <span><span class="co">#&gt; 500 bootstrap iterations on 877739 observations</span></span>
 <span><span class="co">#&gt;    stat       est           se                  CI        bias</span></span>
-<span><span class="co">#&gt; 1:    M 0.4219476 0.0007579537 0.4205498,0.4234631 0.003591399</span></span>
-<span><span class="co">#&gt; 2:    H 0.4152541 0.0006763022 0.4138784,0.4166825 0.003554205</span></span></code></pre></div>
+<span><span class="co">#&gt; 1:    M 0.4218735 0.0007364381 0.4205171,0.4232621 0.003665477</span></span>
+<span><span class="co">#&gt; 2:    H 0.4152207 0.0006890576 0.4139019,0.4165112 0.003587629</span></span></code></pre></div>
 <p>The confidence intervals are based on the percentiles from the
 bootstrap distribution, and hence require a large number of bootstrap
 iterations for valid interpretation. The estimate <code>est</code> that
@@ -500,10 +500,10 @@ <h2 id="inference">Inference<a class="anchor" aria-label="anchor" href="#inferen
 <div class="sourceCode" id="cb17"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="co"># M</span></span>
 <span><span class="fu"><a href="https://rdrr.io/r/base/with.html" class="external-link">with</a></span><span class="op">(</span><span class="va">se</span>, <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="va">est</span><span class="op">[</span><span class="fl">1</span><span class="op">]</span> <span class="op">-</span> <span class="fl">1.96</span> <span class="op">*</span> <span class="va">se</span><span class="op">[</span><span class="fl">1</span><span class="op">]</span>, <span class="va">est</span><span class="op">[</span><span class="fl">1</span><span class="op">]</span> <span class="op">+</span> <span class="fl">1.96</span> <span class="op">*</span> <span class="va">se</span><span class="op">[</span><span class="fl">1</span><span class="op">]</span><span class="op">)</span><span class="op">)</span></span>
-<span><span class="co">#&gt; [1] 0.4204620 0.4234332</span></span>
+<span><span class="co">#&gt; [1] 0.4204301 0.4233169</span></span>
 <span><span class="co"># H</span></span>
 <span><span class="fu"><a href="https://rdrr.io/r/base/with.html" class="external-link">with</a></span><span class="op">(</span><span class="va">se</span>, <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="va">est</span><span class="op">[</span><span class="fl">2</span><span class="op">]</span> <span class="op">-</span> <span class="fl">1.96</span> <span class="op">*</span> <span class="va">se</span><span class="op">[</span><span class="fl">2</span><span class="op">]</span>, <span class="va">est</span><span class="op">[</span><span class="fl">2</span><span class="op">]</span> <span class="op">+</span> <span class="fl">1.96</span> <span class="op">*</span> <span class="va">se</span><span class="op">[</span><span class="fl">2</span><span class="op">]</span><span class="op">)</span><span class="op">)</span></span>
-<span><span class="co">#&gt; [1] 0.4139286 0.4165797</span></span></code></pre></div>
+<span><span class="co">#&gt; [1] 0.4138701 0.4165712</span></span></code></pre></div>
 <p>provide effectively the same coverage as the confidence intervals
 obtained from the percentile bootstrap.</p>
 <p>Whenever the bootstrap is used, the bootstrap distributions for each
@@ -535,8 +535,8 @@ <h2 id="inference">Inference<a class="anchor" aria-label="anchor" href="#inferen
 <div class="sourceCode" id="cb19"><pre class="downlit sourceCode r">
 <code class="sourceCode R"><span><span class="fu"><a href="../reference/mutual_expected.html">mutual_expected</a></span><span class="op">(</span><span class="va">schools00</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span>, n_bootstrap <span class="op">=</span> <span class="fl">500</span><span class="op">)</span></span>
 <span><span class="co">#&gt;         stat         est           se</span></span>
-<span><span class="co">#&gt; 1: M under 0 0.004808867 7.623260e-05</span></span>
-<span><span class="co">#&gt; 2: H under 0 0.004732807 7.502684e-05</span></span></code></pre></div>
+<span><span class="co">#&gt; 1: M under 0 0.004806118 7.679837e-05</span></span>
+<span><span class="co">#&gt; 2: H under 0 0.004730100 7.558367e-05</span></span></code></pre></div>
 <p>Here, there is no concern about bias due to a small sample size.</p>
 </div>
 <div class="section level2">
@@ -557,8 +557,8 @@ <h2 id="decomposing-differences-in-indices">Decomposing differences in indices<a
 <span><span class="co">#&gt; 3:           diff -0.012153884</span></span>
 <span><span class="co">#&gt; 4:      additions -0.003412776</span></span>
 <span><span class="co">#&gt; 5:       removals -0.011405093</span></span>
-<span><span class="co">#&gt; 6: group_marginal  0.017865684</span></span>
-<span><span class="co">#&gt; 7:  unit_marginal -0.011707361</span></span>
+<span><span class="co">#&gt; 6: group_marginal  0.018550238</span></span>
+<span><span class="co">#&gt; 7:  unit_marginal -0.012391915</span></span>
 <span><span class="co">#&gt; 8:     structural -0.003494338</span></span></code></pre></div>
 <p>This method also supports inference by setting
 <code>se = TRUE</code>.</p>
diff --git a/articles/segregation_files/figure-html/unnamed-chunk-15-1.png b/articles/segregation_files/figure-html/unnamed-chunk-15-1.png
index eaee46c..60b04bb 100644
Binary files a/articles/segregation_files/figure-html/unnamed-chunk-15-1.png and b/articles/segregation_files/figure-html/unnamed-chunk-15-1.png differ
diff --git a/articles/segregation_files/figure-html/unnamed-chunk-19-1.png b/articles/segregation_files/figure-html/unnamed-chunk-19-1.png
index 4b739de..75ba345 100644
Binary files a/articles/segregation_files/figure-html/unnamed-chunk-19-1.png and b/articles/segregation_files/figure-html/unnamed-chunk-19-1.png differ
diff --git a/news/index.html b/news/index.html
index d7e1882..8a275f1 100644
--- a/news/index.html
+++ b/news/index.html
@@ -58,6 +58,7 @@
 <h2 class="pkg-version" data-toc-text="development version" id="segregation-development-version">segregation (development version)<a class="anchor" aria-label="anchor" href="#segregation-development-version"></a></h2>
 <ul><li>various improvements to compression algorithm</li>
 <li>add dendrogram visualization</li>
+<li>allow multiple curves in <code>segcurve</code> function</li>
 </ul></div>
     <div class="section level2">
 <h2 class="pkg-version" data-toc-text="1.0.0" id="segregation-100">segregation 1.0.0<a class="anchor" aria-label="anchor" href="#segregation-100"></a></h2><p class="text-muted">CRAN release: 2023-08-24</p>
diff --git a/pkgdown.yml b/pkgdown.yml
index 0be8039..f404504 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -5,7 +5,7 @@ articles:
   faq: faq.html
   plotting: plotting.html
   segregation: segregation.html
-last_built: 2023-10-03T12:53Z
+last_built: 2023-10-03T13:33Z
 urls:
   reference: https://elbersb.com/segregation/reference
   article: https://elbersb.com/segregation/articles
diff --git a/reference/dissimilarity_expected.html b/reference/dissimilarity_expected.html
index e644f4b..9993076 100644
--- a/reference/dissimilarity_expected.html
+++ b/reference/dissimilarity_expected.html
@@ -134,13 +134,13 @@ <h2 id="ref-examples">Examples<a class="anchor" aria-label="anchor" href="#ref-e
 <span class="r-in"><span>    n <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/rep.html" class="external-link">rep</a></span><span class="op">(</span><span class="fl">1</span>, <span class="fl">10</span><span class="op">)</span>, <span class="fu"><a href="https://rdrr.io/r/base/rep.html" class="external-link">rep</a></span><span class="op">(</span><span class="fl">9</span>, <span class="fl">10</span><span class="op">)</span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="op">)</span></span></span>
 <span class="r-in"><span><span class="fu">dissimilarity_expected</span><span class="op">(</span><span class="va">small</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span><span class="op">)</span></span></span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>         stat       est         se</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 1: D under 0 0.3788889 0.09618616</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>         stat       est       se</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 1: D under 0 0.3755556 0.117949</span>
 <span class="r-in"><span><span class="co"># with an increase in sample size (n=1000), the values improve</span></span></span>
 <span class="r-in"><span><span class="va">small</span><span class="op">$</span><span class="va">n</span> <span class="op">&lt;-</span> <span class="va">small</span><span class="op">$</span><span class="va">n</span> <span class="op">*</span> <span class="fl">10</span></span></span>
 <span class="r-in"><span><span class="fu">dissimilarity_expected</span><span class="op">(</span><span class="va">small</span>, <span class="st">"race"</span>, <span class="st">"school"</span>, weight <span class="op">=</span> <span class="st">"n"</span><span class="op">)</span></span></span>
-<span class="r-out co"><span class="r-pr">#&gt;</span>         stat       est         se</span>
-<span class="r-out co"><span class="r-pr">#&gt;</span> 1: D under 0 0.1232222 0.02899553</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span>         stat   est         se</span>
+<span class="r-out co"><span class="r-pr">#&gt;</span> 1: D under 0 0.121 0.02762111</span>
 </code></pre></div>
     </div>
   </main><aside class="col-md-3"><nav id="toc"><h2>On this page</h2>
diff --git a/reference/segcurve.html b/reference/segcurve.html
index af6f8d9..c2ef25b 100644
--- a/reference/segcurve.html
+++ b/reference/segcurve.html
@@ -1,5 +1,5 @@
 <!DOCTYPE html>
-<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><meta name="description" content="Produces a segregation curve, as defined in Duncan and Duncan (1955)"><title>A visual representation of two-group segregation — segcurve • segregation</title><script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><link href="../deps/bootstrap-5.2.2/bootstrap.min.css" rel="stylesheet"><script src="../deps/bootstrap-5.2.2/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous"><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous"><!-- bootstrap-toc --><script src="https://cdn.jsdelivr.net/gh/afeld/bootstrap-toc@v1.0.1/dist/bootstrap-toc.min.js" integrity="sha256-4veVQbu7//Lk5TSmc7YV48MxtMy98e26cf5MrgZYnwo=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" 
crossorigin="anonymous"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="../pkgdown.js"></script><meta property="og:title" content="A visual representation of two-group segregation — segcurve"><meta property="og:description" content="Produces a segregation curve, as defined in Duncan and Duncan (1955)"><!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
+<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><meta name="description" content="Produces one or several segregation curves, as defined in Duncan and Duncan (1955)"><title>A visual representation of two-group segregation — segcurve • segregation</title><script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><link href="../deps/bootstrap-5.2.2/bootstrap.min.css" rel="stylesheet"><script src="../deps/bootstrap-5.2.2/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous"><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous"><!-- bootstrap-toc --><script src="https://cdn.jsdelivr.net/gh/afeld/bootstrap-toc@v1.0.1/dist/bootstrap-toc.min.js" integrity="sha256-4veVQbu7//Lk5TSmc7YV48MxtMy98e26cf5MrgZYnwo=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" 
crossorigin="anonymous"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="../pkgdown.js"></script><meta property="og:title" content="A visual representation of two-group segregation — segcurve"><meta property="og:description" content="Produces one or several segregation curves, as defined in Duncan and Duncan (1955)"><!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
 <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
 <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
 <![endif]--></head><body>
@@ -56,12 +56,12 @@
     </div>
 
     <div class="ref-description section level2">
-    <p>Produces a segregation curve, as defined in Duncan and Duncan (1955)</p>
+    <p>Produces one or several segregation curves, as defined in Duncan and Duncan (1955).</p>
     </div>
 
     <div class="section level2">
     <h2 id="ref-usage">Usage<a class="anchor" aria-label="anchor" href="#ref-usage"></a></h2>
-    <div class="sourceCode"><pre class="sourceCode r"><code><span><span class="fu">segcurve</span><span class="op">(</span><span class="va">data</span>, <span class="va">group</span>, <span class="va">unit</span>, <span class="va">weight</span><span class="op">)</span></span></code></pre></div>
+    <div class="sourceCode"><pre class="sourceCode r"><code><span><span class="fu">segcurve</span><span class="op">(</span><span class="va">data</span>, <span class="va">group</span>, <span class="va">unit</span>, weight <span class="op">=</span> <span class="cn">NULL</span>, segment <span class="op">=</span> <span class="cn">NULL</span><span class="op">)</span></span></code></pre></div>
     </div>
 
     <div class="section level2">
@@ -71,20 +71,23 @@ <h2 id="arguments">Arguments<a class="anchor" aria-label="anchor" href="#argumen
 
 
 <dt>group</dt>
-<dd><p>A categorical variable or a vector of variables
-contained in <code>data</code>. Defines the first dimension
-over which segregation is computed.</p></dd>
+<dd><p>A categorical variable contained in <code>data</code>.
+Defines the first dimension over which segregation is computed.</p></dd>
 
 
 <dt>unit</dt>
-<dd><p>A categorical variable or a vector of variables
-contained in <code>data</code>. Defines the second dimension
-over which segregation is computed.</p></dd>
+<dd><p>A categorical variable contained in <code>data</code>.
+Defines the second dimension over which segregation is computed.</p></dd>
 
 
 <dt>weight</dt>
 <dd><p>Numeric. (Default <code>NULL</code>)</p></dd>
 
+
+<dt>segment</dt>
+<dd><p>A categorical variable contained in <code>data</code>. (Default <code>NULL</code>.)
+If given, a separate segregation curve is drawn for each segment.</p></dd>
+
 </dl></div>
     <div class="section level2">
     <h2 id="value">Value<a class="anchor" aria-label="anchor" href="#value"></a></h2>
diff --git a/search.json b/search.json
index 7a8cac9..8b6d286 100644
--- a/search.json
+++ b/search.json
@@ -1 +1 @@
-[{"path":"https://elbersb.com/segregation/articles/faq.html","id":"can-index-x-be-added-to-the-package","dir":"Articles","previous_headings":"","what":"Can index X be added to the package?","title":"FAQ","text":"Adding new segregation indices big trouble. Please open issue GitHub request index added.","code":""},{"path":"https://elbersb.com/segregation/articles/faq.html","id":"how-can-i-compute-indices-for-different-areas-at-once","dir":"Articles","previous_headings":"","what":"How can I compute indices for different areas at once?","title":"FAQ","text":"use dplyr package, one pattern works well use group_modify. , compute pairwise Black-White dissimilarity index state separately: similar pattern works also well data.table: compute many decompositions , ’s easiest combine data two time points. instance, ’s dplyr solution decompose state-specific M indices 2000 2005: , ’s also data.table solution:","code":"library(\"segregation\") library(\"dplyr\")  schools00 %>%   filter(race %in% c(\"black\", \"white\")) %>%   group_by(state) %>%   group_modify(~ dissimilarity(     data = .x,     group = \"race\",     unit = \"school\",     weight = \"n\"   )) #> # A tibble: 3 × 3 #> # Groups:   state [3] #>   state stat    est #>   <fct> <chr> <dbl> #> 1 A     D     0.706 #> 2 B     D     0.655 #> 3 C     D     0.704 library(\"data.table\")  schools00 <- as.data.table(schools00) schools00[   race %in% c(\"black\", \"white\"),   dissimilarity(data = .SD, group = \"race\", unit = \"school\", weight = \"n\"),   by = .(state) ] #>    state stat       est #> 1:     A    D 0.7063595 #> 2:     B    D 0.6548485 #> 3:     C    D 0.7042057 # helper function for decomposition diff <- function(df, group) {   data1 <- filter(df, year == 2000)   data2 <- filter(df, year == 2005)   mutual_difference(data1, data2, group = \"race\", unit = \"school\", weight = \"n\") }  # add year indicators schools00$year <- 2000 schools05$year <- 2005 combine <- bind_rows(schools00, schools05)  combine %>%   
group_by(state) %>%   group_modify(diff) %>%   head(5) #> # A tibble: 5 × 3 #> # Groups:   state [1] #>   state stat          est #>   <fct> <chr>       <dbl> #> 1 A     M1         0.409  #> 2 A     M2         0.445  #> 3 A     diff       0.0359 #> 4 A     additions -0.0159 #> 5 A     removals   0.0390 setDT(combine) combine[, diff(.SD), by = .(state)] %>% head(5) #>    state      stat         est #> 1:     A        M1  0.40859652 #> 2:     A        M2  0.44454379 #> 3:     A      diff  0.03594727 #> 4:     A additions -0.01585879 #> 5:     A  removals  0.03903106"},{"path":"https://elbersb.com/segregation/articles/faq.html","id":"how-can-i-use-census-data-from-tidycensus-to-compute-segregation-indices","dir":"Articles","previous_headings":"","what":"How can I use Census data from tidycensus to compute segregation indices?","title":"FAQ","text":"examples thanks Kyle Walker, author tidycensus package. First, download data: data “long” format, ’s easy compute segregation indices: Producing map local segregation scores also hard:","code":"library(\"tidycensus\")  cook_data <- get_acs(   geography = \"tract\",   variables = c(     white = \"B03002_003\",     black = \"B03002_004\",     asian = \"B03002_006\",     hispanic = \"B03002_012\"   ),   state = \"IL\",   county = \"Cook\" ) #> Getting data from the 2017-2021 5-year ACS # compute index of dissimilarity cook_data %>%   filter(variable %in% c(\"black\", \"white\")) %>%   dissimilarity(     group = \"variable\",     unit = \"GEOID\",     weight = \"estimate\"   ) #>    stat       est #> 1:    D 0.7855711  # compute multigroup M/H indices cook_data %>%   mutual_total(     group = \"variable\",     unit = \"GEOID\",     weight = \"estimate\"   ) #>    stat       est #> 1:    M 0.5114435 #> 2:    H 0.4089561 library(\"tigris\") library(\"ggplot2\")  local_seg <- mutual_local(cook_data,   group = \"variable\",   unit = \"GEOID\",   weight = \"estimate\",   wide = TRUE )  # download shapefile seg_geom <- tracts(\"IL\", 
\"Cook\", cb = TRUE, progress_bar = FALSE) %>%   left_join(local_seg, by = \"GEOID\") #> Retrieving data for the year 2021  ggplot(seg_geom, aes(fill = ls)) +   geom_sf(color = NA) +   coord_sf(crs = 3435) +   scale_fill_viridis_c() +   theme_void() +   labs(     title = \"Local segregation scores for Cook County, IL\",     fill = NULL   )"},{"path":"https://elbersb.com/segregation/articles/faq.html","id":"how-can-i-compute-margins-adjusted-local-segregation-scores","dir":"Articles","previous_headings":"","what":"How can I compute margins-adjusted local segregation scores?","title":"FAQ","text":"using mutual_difference, supply method = \"shapley_detailed\" get two different local segregation scores margins-adjusted (one coming adjusting forward, adjusting backwards). averaging can create single margins-adjusted local segregation score:","code":"diff <- mutual_difference(schools00, schools05, \"race\", \"school\",   weight = \"n\", method = \"shapley_detailed\" )  diff[stat %in% c(\"ls_diff1\", \"ls_diff2\"),   .(ls_diff_adjusted = mean(est)),   by = .(school) ] #>       school ls_diff_adjusted #>    1:   A1_3     -0.088983164 #>    2:   A2_2     -0.044338042 #>    3:   A2_3     -0.101696519 #>    4:   A2_4     -0.020134162 #>    5:   A2_6     -0.138567163 #>   ---                         #> 1706: C164_2     -0.031329845 #> 1707: C165_1     -0.023978101 #> 1708: C165_3      0.003781632 #> 1709: C166_1      0.010270713 #> 1710: C167_1     -0.002663687"},{"path":"https://elbersb.com/segregation/articles/plotting.html","id":"segregation-curve","dir":"Articles","previous_headings":"","what":"Segregation curve","title":"Visualizing and compressing segregation","text":"segregation curve first introduced Duncan Duncan (1955). 
function segcurve() provides simple way plotting segregation curve:","code":"segcurve(subset(schools00, race %in% c(\"white\", \"black\")),   \"race\", \"school\",   weight = \"n\" )"},{"path":"https://elbersb.com/segregation/articles/plotting.html","id":"segplot","dir":"Articles","previous_headings":"","what":"Segplot","title":"Visualizing and compressing segregation","text":"function segplot() provided generate segplots. Segplots described detail working paper. function requires dataset, group, unit variables, , required, variable identifies weight (n case). options customize look segplot given argument order. default, units segplot ordered local segregation score, also possible order entropy (.e., diversity) share majority population. last option can useful two-group case. argument bar_space can used increase space units default zero space bars. plotting subset dataset, reference distribution shown right segplot can changed supplying two-column data frame reference_distribution argument. One column frame contain group identifiers, include reference proportion group. show axis labels either left side , right side , sides, use argument axis_labels. 
Examples use arguments given :","code":"sch <- subset(schools00, state == \"A\")  # basic segplot segplot(sch, \"race\", \"school\", weight = \"n\", axis_labels = \"both\") # order by majority group (white in this case) segplot(sch, \"race\", \"school\", weight = \"n\", order = \"majority\") # increase the space between bars # (has to be very low here because there are many schools in this dataset) segplot(sch, \"race\", \"school\", weight = \"n\", bar_space = 0.0005) # change the reference distribution # (here, we just use an equalized distribution across the five groups) (ref <- data.frame(race = unique(schools00$race), p = rep(0.2, 5))) #>     race   p #> 1  asian 0.2 #> 2  black 0.2 #> 3   hisp 0.2 #> 4  white 0.2 #> 5 native 0.2 segplot(sch, \"race\", \"school\",   weight = \"n\",   reference_distribution = ref )"},{"path":"https://elbersb.com/segregation/articles/plotting.html","id":"compressing-segregation-information","dir":"Articles","previous_headings":"","what":"Compressing segregation information","title":"Visualizing and compressing segregation","text":"compression algorithm requires three steps taken. First, important decide units permitted merge: residential segregation, may want allow neighboring units (tracts) mergeable. case, first step consists compiling data frame exactly two columns, row identifies pair neighboring units. cases, may want allow units mergeable, principle. However, can time-consuming requires unit compared others every step merging operation. speed compression, therefore implement option allows units merged within window “neighboring” units, definition window based similarities local segregation. Hence, given unit, n_neighbors considered every step, neighbors based similarities local segregation. Smaller n_neighbors values result faster run times, increase probability non-optimal merges. method merging can specified compress() function supplying argument neighbors. second step run actual compression algorithm using compress(). 
example, choose compress based relatively small window: running compress()—can take time depending many neighbors need considered—output summarizes compression can achieved: results indicate 99% segregation information can retained 98 units (560 original dataset), 95% 24 units, 90% 10 units. percentage information retained iteration can accessed via data frame available comp$iterations. data frame can also used generate plot shows relationship number merges loss segregation information:  Another way learn compression visualize information dendrogram:  third step create new dataset based desired level compression. can achieved using function merge_units(), either n_units percent can specified indicate desired level compression. compressed dataset format original dataset can now used produce another segplot, e.g.","code":"# compression based on window of 20 'neighboring' units # in terms of local segregation (alternatively, neighbors can be a data frame) comp <- compress(sch, \"race\", \"school\",   weight = \"n\", neighbors = \"local\", n_neighbors = 20 ) comp #> Compression of dataset with 560 units #> Original M: 0.4085965; Final M: 0 #> - Threshold 99%: M = 0.4045242; Units = 92 #> - Threshold 95%: M = 0.388871; Units = 22 #> - Threshold 90%: M = 0.3695035; Units = 9 scree_plot(comp) dend <- as.dendrogram(comp) dendextend::labels(dend) <- NULL # remove the labels #> Warning in `labels<-.dendrogram`(`*tmp*`, value = NULL): The lengths of the new #> labels is shorter than the number of leaves in the dendrogram - labels are #> recycled. 
#> Warning in rep(new_labels, length.out = leaves_length): 'x' is NULL so the #> result will be NULL plot(dend) sch_compressed <- merge_units(comp, n_units = 15) # or, for instance: merge_units(comp, percent = 0.80) head(sch_compressed) #>    school  race    n #> 1:    M12 asian  143 #> 2:    M12 black  445 #> 3:    M12  hisp  472 #> 4:    M12 white 6174 #> 5:     M2 black   67 #> 6:     M2  hisp  642 segplot(sch_compressed, \"race\", \"school\", weight = \"n\")"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"the-basic-mathematics","dir":"Articles","previous_headings":"","what":"The basic mathematics","title":"A walkthrough of the segregation package","text":"idea segregation index summarize contingency table single number. instance, may table \\(U\\) units, say schools occupations, \\(G\\) groups, say gender racial groups. combination unit group count, \\(t_{ug}\\). Arranged \\(U\\times G\\) matrix \\(\\mathbf{T}\\), structure data looks like: matrix, can define \\(t=\\sum_{u=1}^U\\sum_{g=1}^G t_{ug}\\), total population size. joint probability unit \\(u\\) racial group \\(g\\) \\(p_{ug}=t_{ug}/t\\). Also define \\(p_{u \\cdot}=\\sum_{g=1}^{G}t_{ug}/t\\) \\(p_{\\cdot g}=\\sum_{u=1}^{U}t_{ug}/t\\) marginal probabilities units groups, respectively. Mutual Information Index defined \\[ M(\\mathbf{T})=\\sum_{u=1}^U\\sum_{g=1}^Gp_{ug}\\log\\frac{p_{ug}}{p_{u \\cdot}p_{\\cdot g}}. \\] Theil Index closely related M index, just normalized version Mutual Information Index: \\[ H(\\mathbf{T})=\\frac{M(\\mathbf{T})}{E(\\mathbf{T})}, \\] \\(E(\\mathbf{T})\\) denotes entropy group marginal distribution \\(\\mathbf{T}\\), .e. \\(E(\\mathbf{T})=-\\sum_{g=1}^{G}p_{\\cdot g}\\log p_{\\cdot g}\\). 
Dividing group entropy effect constraining H 0 1.","code":""},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"data-format","dir":"Articles","previous_headings":"","what":"Data format","title":"A walkthrough of the segregation package","text":"examples, use dataset built segregation package, schools00. dataset contains data 2,045 schools across 429 school districts three U.S. states. school, dataset records number Asian, Black, Hispanic, White, Native American students. segregation package requires data long form (segregation data comes form), form contingency tables. Hence, row schools00 dataset unique combination given school racial group, column n records number students combination: Note first school, A1_1, Native American students. Hence, row missing. data form contingency tables, can use matrix_to_long() convert long format required package. example: group unit arguments optional.","code":"library(\"segregation\") head(schools00[, c(\"school\", \"race\", \"n\")]) #>   school  race   n #> 1   A1_1 asian   2 #> 2   A1_1 black  14 #> 3   A1_1  hisp  30 #> 4   A1_1 white 351 #> 5   A1_2 black   9 #> 6   A1_2  hisp 101 (m <- matrix(c(10, 20, 30, 30, 20, 10), nrow = 3)) #>      [,1] [,2] #> [1,]   10   30 #> [2,]   20   20 #> [3,]   30   10 colnames(m) <- c(\"Black\", \"White\") matrix_to_long(m, group = \"race\", unit = \"school\") #>    school  race  n #> 1:      1 Black 10 #> 2:      2 Black 20 #> 3:      3 Black 30 #> 4:      1 White 30 #> 5:      2 White 20 #> 6:      3 White 10"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"computing-the-m-and-h-indices","dir":"Articles","previous_headings":"","what":"Computing the M and H indices","title":"A walkthrough of the segregation package","text":"Compute M H indices using mutual_total(): Interpreting M easy, normalized. However, H can range 0 1, value 0.419 indicate moderate segregation. second argument mutual_total() refers groups, third argument refers units. 
Switching groups units affect M index, change H index: segregation package always divides marginal group entropy, hence divide entropy school distribution, expect much larger (many schools racial groups). check, can use entropy() function: Therefore, H index used, important specify groups units correctly. inference (discussed detail ), can use bootstrapping obtain standard errors confidence intervals: large number observations, standard errors small.","code":"mutual_total(schools00, \"race\", \"school\", weight = \"n\") #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083 mutual_total(schools00, \"school\", \"race\", weight = \"n\") #>    stat        est #> 1:    M 0.42553898 #> 2:    H 0.05642991 (entropy(schools00, \"race\", weight = \"n\")) #> [1] 1.016071 (entropy(schools00, \"school\", weight = \"n\")) #> [1] 7.541018 mutual_total(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, CI = .95, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations #>    stat       est           se                  CI        bias #> 1:    M 0.4218888 0.0008052111 0.4202830,0.4234599 0.003650193 #> 2:    H 0.4152473 0.0007114470 0.4137465,0.4165963 0.003561007"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"between-within-decomposition","dir":"Articles","previous_headings":"","what":"Between-Within decomposition","title":"A walkthrough of the segregation package","text":"might wonder whether segregation different across three different states. can compute segregation indices manually (just showing M simplicity): Clearly, state segregated state C, turn shows higher school segregation B. One advantages entropy-based segregation indices three state-specific indices simple relationship overall index. called /within decomposition: Total segregation can decomposed term measures much distribution racial groups differs states, term measures segregation within states. 
\\(S\\) states (“super-units” generally), school belongs exactly one state, M index can decomposed follows: \\[ M(\\mathbf{T})=M(\\mathbf{S}) + \\sum_{s=1}^S p_s M(\\mathbf{T}_s), \\] \\(\\mathbf{T}\\) full \\(U \\times G\\) contingency table, \\(\\mathbf{S}\\) aggregated contingency table dimension \\(S\\times G\\), \\(p_s\\) population proportion state \\(s\\) (\\(\\sum_{s=1}^S p_s=1\\)), \\(\\mathbf{T}_s\\) subset rows \\(\\mathbf{T}\\) belonging state \\(s\\). Put simple terms, M index can decomposed -state segregation index, plus weighted average within-state M indices. H index, dividing formula \\(E(\\mathbf{T})\\). makes formula bit complicated, normalization offset decomposition: \\[ H(\\mathbf{T})=H(\\mathbf{S}) + \\sum_{s=1}^S \\frac{E(\\mathbf{T}_s)}{E(\\mathbf{T})} p_s H(\\mathbf{T}_s), \\] \\(E(\\cdot)\\) entropy marginal group distribution. Note \\(E(\\mathbf{T})=E(\\mathbf{S})\\), group marginal distributions identical. compute decomposition using segregation package, use: Note \\(0.426 = 0.0992 + 0.326\\) \\(0.419 = 0.0977 + 0.321\\). results indicate 75% segregation within states. words, differences racial composition three different states account less 25% segregation. using mutual_total() within argument, can obtain overall within component, obtain decomposition state. , can use mutual_within(): much simpler way obtain state-specific segregation scores compared subsetting manually, shown beginning section. addition M H indices, also obtain p, population proportion state (\\(p_s\\) ), ent_ratio, \\(E(\\mathbf{T}_s)/E(\\mathbf{T})\\) . Hence, can recover total within-component using exactly . quantity \\(p_s M(\\mathbf{T}_s)\\) interest, shows much states contribute segregation total, taking account size. adding component, can calculate contribution four components: four components contributes quarter total segregation 0.426. Note state smallest state (27.7% population), contributes largest percentage (26.6%) total segregation. 
Hence, decomposition shows important look \\(p_s\\), state sizes, well \\(M(\\mathbf{T}_s)\\), within-state segregation. -within decomposition can also applied repeatedly hierarchical setting. instance, schools00 dataset, schools nested within districts, districts nested within states. Therefore, can ask: much segregation due segregation states, much segregation due -district segregation within states, much segregation due -school segregation within districts? package provides convenience function use case:","code":"split_schools <- split(schools00, schools00$state) mutual_total(split_schools$A, \"race\", \"school\", weight = \"n\")[1, ] #>    stat       est #> 1:    M 0.4085965 mutual_total(split_schools$B, \"race\", \"school\", weight = \"n\")[1, ] #>    stat       est #> 1:    M 0.2549959 mutual_total(split_schools$C, \"race\", \"school\", weight = \"n\")[1, ] #>    stat       est #> 1:    M 0.3450221 # total segregation (total <- mutual_total(schools00, \"race\", \"school\", weight = \"n\")) #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083 # between-state segregation: #     how much does the racial distributions differ across states? (between <- mutual_total(schools00, \"race\", \"state\", weight = \"n\")) #>    stat        est #> 1:    M 0.09924370 #> 2:    H 0.09767398 # within-state segregation: #     how much segregation exist within states? 
(mutual_total(schools00, \"race\", \"school\", within = \"state\", weight = \"n\")) #>    stat       est #> 1:    M 0.3262953 #> 2:    H 0.3211343 (within <- mutual_within(schools00, \"race\", \"school\",   within = \"state\", weight = \"n\", wide = TRUE )) #>    state         M         p         H ent_ratio #> 1:     A 0.4085965 0.2768819 0.4969216 0.8092501 #> 2:     B 0.2549959 0.4035425 0.2680884 0.9361190 #> 3:     C 0.3450221 0.3195756 0.3611257 0.9402955 with(within, sum(M * p)) #> [1] 0.3262953 with(within, sum(H * p * ent_ratio)) #> [1] 0.3211343 # merge into a vector components <- c(between$est[1], within$M * within$p) names(components) <- c(\"Between\", \"A\", \"B\", \"C\") signif(100 * components / total$est[1], 3) #> Between       A       B       C  #>    23.3    26.6    24.2    25.9 mutual_total_nested(schools00, \"race\", c(\"state\", \"district\", \"school\"),   weight = \"n\" ) #>     between          within stat        est #> 1:    state                    M 0.09924370 #> 2:    state                    H 0.09767398 #> 3: district           state    M 0.23870880 #> 4: district           state    H 0.23493319 #> 5:   school state, district    M 0.08758648 #> 6:   school state, district    H 0.08620114 # This is a simpler way of running the following three decompositions manually: # mutual_total(schools00, \"race\", \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"district\", within = \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"school\", within = c(\"state\", \"district\"), weight = \"n\")"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"local-segregation","dir":"Articles","previous_headings":"","what":"Local segregation","title":"A walkthrough of the segregation package","text":"M index (H index) allows another decomposition, local segregation scores. define decomposition, let \\(p_{g|u} = t_{ug} / t_{u \\cdot}\\) conditional probability group \\(g\\), given one unit \\(u\\). 
can define local segregation score unit \\(u\\) \\[L_u = \\sum_{g=1}^G p_{g|u}\\log\\frac{p_{g|u}}{p_{\\cdot g}}\\] weighted average \\(L_u\\) \\(M(\\mathbf{T})\\), .e. \\(M(\\mathbf{T}) = \\sum_{u=1}^U p_{u\\cdot}L_u\\). obtain local segregation scores \\(L_u\\), along marginal weights \\(p_{u\\cdot}\\), use mutual_local(): Local segregation scores based much less data full M index, often makes sense obtain confidence intervals. following code plots length 95% confidence interval relation size school:  Although relationship deterministic, larger schools shorter confidence intervals. M symmetric, local segregation scores can also obtained groups. equivalent definition local segregation score group \\(g\\) \\[L_g = \\sum_{u=1}^U p_{u|g}\\log\\frac{p_{u|g}}{p_{u \\cdot}},\\] , expected, \\(M(\\mathbf{T}) = \\sum_{g=1}^G p_{\\cdot g}L_g\\). obtain scores, switch group unit arguments mutual_local: results show racial groups experience different levels segregation: White students less segregated Asian, Black, Hispanic, , especially, Native American students.","code":"mutual_local(schools00, \"race\", \"school\", weight = \"n\", wide = TRUE) #>       school        ls            p #>    1:   A1_1 0.1826710 0.0004522985 #>    2:   A1_2 0.1825592 0.0004978701 #>    3:   A1_3 0.2756157 0.0006642066 #>    4:   A1_4 0.1368034 0.0005685061 #>    5:   A2_1 0.3585546 0.0004260948 #>   ---                               #> 2041: C165_1 0.3174930 0.0004568556 #> 2042: C165_2 0.3835477 0.0005297702 #> 2043: C165_3 0.2972550 0.0005650883 #> 2044: C166_1 0.3072281 0.0011586588 #> 2045: C167_1 0.3166498 0.0005354667 localse <- mutual_local(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, wide = TRUE, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations localse$lengthCI <- sapply(localse$ls_CI, base::diff) with(localse, plot(x = p, y = lengthCI, pch = 16, cex = 0.3)) (localg <- mutual_local(schools00, \"school\", \"race\", weight = \"n\", wide = TRUE)) #> 
     race        ls           p #> 1:  asian 0.6287673 0.022553401 #> 2:  black 0.8805413 0.190149919 #> 3:   hisp 0.7766327 0.151696575 #> 4:  white 0.1836393 0.628092178 #> 5: native 1.4342644 0.007507927"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"inference","dir":"Articles","previous_headings":"","what":"Inference","title":"A walkthrough of the segregation package","text":"four main functions packages, mutual_total(), mutual_within(), mutual_local(), mutual_difference() support inference bootstrapping. Inference segregation indices tricky, standard error estimates confidence intervals trusted much little data, especially segregation index close either 0 maximum segregation. estimate standard errors confidence intervals, use se = TRUE. coverage confidence interval can specified CI argument. number bootstrap iterations can specified well: confidence intervals based percentiles bootstrap distribution, hence require large number bootstrap iterations valid interpretation. estimate est reported results already “debiased”, .e. bias estimated bootstrap distribution (reported bias) subtracted usual maximum-likelihood estimate obtain mutual_total se = FALSE. confidence interval centered around debiased estimate. balance, confidence intervals preferred standard error bootstrap distribution can skewed, especially segregation low high. example, can see standard errors provide almost identical coverage confidence intervals, provide effectively coverage confidence intervals obtained percentile bootstrap. Whenever bootstrap used, bootstrap distributions parameter reported attribute bootstrap returned object. can used, instance, check whether bootstrap distribution skewed. following code computes local segregation scores schools, shows histogram bootstrap distribution school C137_9, low local segregation score:  school, bootstrap distribution skewed. 
precise inference specific school needed, standard error interpreted, confidence interval interpreted number bootstrap iterations large. concerned contingency table small provide reliable segregation estimates, package also provides function mutual_expected() simulates random cell counts independence marginal distributions table. schools00 dataset: , concern bias due small sample size.","code":"(se <- mutual_total(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, CI = .95, n_bootstrap = 500 )) #> 500 bootstrap iterations on 877739 observations #>    stat       est           se                  CI        bias #> 1:    M 0.4219476 0.0007579537 0.4205498,0.4234631 0.003591399 #> 2:    H 0.4152541 0.0006763022 0.4138784,0.4166825 0.003554205 # M with(se, c(est[1] - 1.96 * se[1], est[1] + 1.96 * se[1])) #> [1] 0.4204620 0.4234332 # H with(se, c(est[2] - 1.96 * se[2], est[2] + 1.96 * se[2])) #> [1] 0.4139286 0.4165797 local <- mutual_local(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, CI = .95, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations # pick bootstrap distribution of local segregation scores for school C137_9 ls_school <- attr(local, \"bootstrap\")[school == \"C137_9\" & stat == \"ls\", boot_est] hist(ls_school, main = \"Bootstrap distribution for school C137_9\") mutual_expected(schools00, \"race\", \"school\", weight = \"n\", n_bootstrap = 500) #>         stat         est           se #> 1: M under 0 0.004808867 7.623260e-05 #> 2: H under 0 0.004732807 7.502684e-05"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"decomposing-differences-in-indices","dir":"Articles","previous_headings":"","what":"Decomposing differences in indices","title":"A walkthrough of the segregation package","text":"command mutual_difference() can used decompose differences segregation, described Elbers (2021). default, recommended method, use method = shapley (method = shapley_detailed). 
methods (mrc, km) exist mostly testing purposes, recommended. Details procedure interpret terms decomposition found Elbers (2021). method also supports inference setting se = TRUE.","code":"mutual_difference(schools00, schools05, \"race\", \"school\", weight = \"n\") #>              stat          est #> 1:             M1  0.425538976 #> 2:             M2  0.413385092 #> 3:           diff -0.012153884 #> 4:      additions -0.003412776 #> 5:       removals -0.011405093 #> 6: group_marginal  0.017865684 #> 7:  unit_marginal -0.011707361 #> 8:     structural -0.003494338"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"references","dir":"Articles","previous_headings":"","what":"References","title":"A walkthrough of the segregation package","text":"Elbers, B. (2021). Method Studying Differences Segregation Across Time Space. Sociological Methods & Research. https://doi.org/10.1177/0049124121986204 Mora, R., & Ruiz-Castillo, J. (2011). Entropy-based Segregation Indices. Sociological Methodology, 41(1), 159–194. https://doi.org/10.1111/j.1467-9531.2011.01237.x Theil, H. (1971). Principles Econometrics. New York: Wiley","code":""},{"path":"https://elbersb.com/segregation/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Benjamin Elbers. Author, maintainer.","code":""},{"path":"https://elbersb.com/segregation/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Benjamin Elbers. 2021. Method Studying Differences Segregation Across Time Space Sociological Methods & Research 52(1): 5-42. 
doi: 10.1177/0049124121986204","code":"@Article{,   title = {A Method for Studying Differences in Segregation Across Time and Space},   author = {Benjamin Elbers},   journal = {Sociological Methods & Research},   year = {2021},   volume = {52},   number = {1},   pages = {5-42},   doi = {10.1177/0049124121986204}, }"},{"path":"https://elbersb.com/segregation/index.html","id":"segregation","dir":"","previous_headings":"","what":"Entropy-Based Segregation Indices","title":"Entropy-Based Segregation Indices","text":"R package calculate, visualize, decompose various segregation indices. package currently supports Mutual Information Index (M), Theil’s Information Index (H), index Dissimilarity (D), isolation exposure index. Find information vignette(\"segregation\") documentation. package also supports standard error confidence intervals estimation via bootstrapping, also corrects small sample bias decomposition M H indices (within/, local segregation) decomposing differences total segregation time (Elbers 2020) segregation visualizations (segregation curves ‘segplots’) methods return tidy data.tables easy post-processing plotting. speed, package uses data.table package internally, implements functions C++. procedures implemented package described detail SMR paper (Preprint) working paper.","code":""},{"path":"https://elbersb.com/segregation/index.html","id":"usage","dir":"","previous_headings":"","what":"Usage","title":"Entropy-Based Segregation Indices","text":"package provides easy way calculate segregation measures, based Mutual Information Index (M) Theil’s Entropy Index (H). Standard errors functions can estimated via bootstrapping. also apply bias-correction estimates: Decompose segregation -state within-state term (sum equals total segregation): Local segregation (ls) decomposition units groups (racial groups). function also support standard error CI estimation. 
sum proportion-weighted local segregation scores equals M: Decompose difference M 2000 2005, using iterative proportional fitting (IPF) Shapley decomposition (see Elbers 2021 details): Show segplot:  Find information documentation.","code":"library(segregation)  # example dataset with fake data provided by the package mutual_total(schools00, \"race\", \"school\", weight = \"n\") #>      stat   est #>    <char> <num> #> 1:      M 0.426 #> 2:      H 0.419 mutual_total(schools00, \"race\", \"school\",     weight = \"n\",     se = TRUE, CI = 0.90, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations #>      stat   est       se          CI    bias #>    <char> <num>    <num>      <list>   <num> #> 1:      M 0.422 0.000775 0.421,0.423 0.00361 #> 2:      H 0.415 0.000712 0.414,0.416 0.00356 # between states mutual_total(schools00, \"race\", \"state\", weight = \"n\") #>      stat    est #>    <char>  <num> #> 1:      M 0.0992 #> 2:      H 0.0977  # within states mutual_total(schools00, \"race\", \"school\", within = \"state\", weight = \"n\") #>      stat   est #>    <char> <num> #> 1:      M 0.326 #> 2:      H 0.321 local <- mutual_local(schools00,     group = \"school\", unit = \"race\", weight = \"n\",     se = TRUE, CI = 0.90, n_bootstrap = 500, wide = TRUE ) #> 500 bootstrap iterations on 877739 observations local[, c(\"race\", \"ls\", \"p\", \"ls_CI\")] #>      race    ls       p       ls_CI #>    <fctr> <num>   <num>      <list> #> 1:  asian 0.591 0.02255 0.582,0.601 #> 2:  black 0.876 0.19017 0.873,0.879 #> 3:   hisp 0.771 0.15167 0.767,0.775 #> 4:  white 0.183 0.62810 0.182,0.184 #> 5: native 1.352 0.00751   1.32,1.38 sum(local$p * local$ls) #> [1] 0.422 mutual_difference(schools00, schools05,     group = \"race\", unit = \"school\",     weight = \"n\", method = \"shapley\" ) #>              stat      est #>            <char>    <num> #> 1:             M1  0.42554 #> 2:             M2  0.41339 #> 3:           diff -0.01215 #> 4:      additions 
-0.00341 #> 5:       removals -0.01141 #> 6: group_marginal  0.01787 #> 7:  unit_marginal -0.01171 #> 8:     structural -0.00349 segplot(schools00, group = \"race\", unit = \"school\", weight = \"n\")"},{"path":"https://elbersb.com/segregation/index.html","id":"how-to-install","dir":"","previous_headings":"","what":"How to install","title":"Entropy-Based Segregation Indices","text":"install package CRAN, use install development version, use","code":"install.packages(\"segregation\") devtools::install_github(\"elbersb/segregation\")"},{"path":"https://elbersb.com/segregation/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Entropy-Based Segregation Indices","text":"use package research, please cite: Elbers, B. (2021). Method Studying Differences Segregation Across Time Space. Sociological Methods & Research. https://doi.org/10.1177/0049124121986204","code":""},{"path":"https://elbersb.com/segregation/index.html","id":"some-additional-resources","dir":"","previous_headings":"","what":"Some additional resources","title":"Entropy-Based Segregation Indices","text":"book Analyzing US Census Data: Methods, Maps, Models R Kyle E. Walker contains discussion package, great resource anyone working spatial data, especially U.S. Census data. paper makes use package: Residential Racial Segregation U.S. Really Increase? Analysis Accounting Changes Racial Diversity (Code Data) analyses article Belgian newspaper De Tijd used package. analyses article Wall Street Journal produced using package.","code":""},{"path":"https://elbersb.com/segregation/index.html","id":"references-on-entropy-based-segregation-indices","dir":"","previous_headings":"","what":"References on entropy-based segregation indices","title":"Entropy-Based Segregation Indices","text":"Deutsch, J., Flückiger, Y. & Silber, J. (2009). Analyzing Changes Occupational Segregation: Case Switzerland (1970–2000), : Yves Flückiger, Sean F. Reardon, Jacques Silber (eds.) 
Occupational Residential Segregation (Research Economic Inequality, Volume 17), 171–202. DiPrete, T. ., Eller, C. C., Bol, T., & van de Werfhorst, H. G. (2017). School--Work Linkages United States, Germany, France. American Journal Sociology, 122(6), 1869-1938. https://doi.org/10.1086/691327 Elbers, B. (2021). Method Studying Differences Segregation Across Time Space. Sociological Methods & Research. https://doi.org/10.1177/0049124121986204 Forster, . G., & Bol, T. (2017). Vocational education employment life course using new measure occupational specificity. Social Science Research, 70, 176-197. https://doi.org/10.1016/j.ssresearch.2017.11.004 Theil, H. (1971). Principles Econometrics. New York: Wiley. Frankel, D. M., & Volij, O. (2011). Measuring school segregation. Journal Economic Theory, 146(1), 1-38. https://doi.org/10.1016/j.jet.2010.10.008 Mora, R., & Ruiz-Castillo, J. (2003). Additively decomposable segregation indexes. case gender segregation occupations human capital levels Spain. Journal Economic Inequality, 1(2), 147-179. https://doi.org/10.1023/:1026198429377 Mora, R., & Ruiz-Castillo, J. (2009). Invariance Properties Mutual Information Index Multigroup Segregation, : Yves Flückiger, Sean F. Reardon, Jacques Silber (eds.) Occupational Residential Segregation (Research Economic Inequality, Volume 17), 33-53. Mora, R., & Ruiz-Castillo, J. (2011). Entropy-based Segregation Indices. Sociological Methodology, 41(1), 159–194. https://doi.org/10.1111/j.1467-9531.2011.01237.x Van Puyenbroeck, T., De Bruyne, K., & Sels, L. (2012). ‘Mutual Information’: Educational sectoral gender segregation interaction Flemish labor market. Labour Economics, 19(1), 1-8. https://doi.org/10.1016/j.labeco.2011.05.002 Watts, M. Use Abuse Entropy Based Segregation Indices. Working Paper. 
URL: http://www.ecineq.org/ecineq_lux15/FILESx2015/CR2/p217.pdf","code":""},{"path":"https://elbersb.com/segregation/reference/compress.html","id":null,"dir":"Reference","previous_headings":"","what":"Compresses a data matrix based on mutual information (segregation) — compress","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"Given data set identifies suitable neighbors merging, function merge units iteratively, iteration neighbors smallest reduction terms total M merged.","code":""},{"path":"https://elbersb.com/segregation/reference/compress.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"","code":"compress(   data,   group,   unit,   weight = NULL,   neighbors = \"local\",   n_neighbors = 50,   max_iter = Inf )"},{"path":"https://elbersb.com/segregation/reference/compress.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"data data frame. group categorical variable contained data. Defines first dimension segregation computed. unit categorical variable contained data. Defines second dimension segregation computed. weight Numeric. frequency weights allowed. (Default NULL) neighbors Either data frame character. data frame, needs exactly two columns, row identifies set \"neighbors\" may merged. \"local\", considers n_neighbors closest neighbors terms local segregation. \"\", units considered possible neighbors. may time-consuming. n_neighbors relevant neighbors \"local\". 
max_iter Maximum number iterations (Default Inf)","code":""},{"path":"https://elbersb.com/segregation/reference/compress.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"Returns data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates Index of Dissimilarity — dissimilarity","title":"Calculates Index of Dissimilarity — dissimilarity","text":"Returns total segregation group unit using Index Dissimilarity.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates Index of Dissimilarity — dissimilarity","text":"","code":"dissimilarity(   data,   group,   unit,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100 )"},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates Index of Dissimilarity — dissimilarity","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. D index allows two distinct groups. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. 
(Default 100)","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates Index of Dissimilarity — dissimilarity","text":"Returns data.table one row. column est contains   Index Dissimilarity.   se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates Index of Dissimilarity — dissimilarity","text":"Otis Dudley Duncan Beverly Duncan. 1955. \"Methodological Analysis Segregation Indexes,\"      American Sociological Review 20(2): 210-217.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates Index of Dissimilarity — dissimilarity","text":"","code":"# Example where D and H deviate m1 <- matrix_to_long(matrix(c(100, 60, 40, 0, 0, 40, 60, 100), ncol = 2)) m2 <- matrix_to_long(matrix(c(80, 80, 20, 20, 20, 20, 80, 80), ncol = 2)) dissimilarity(m1, \"group\", \"unit\", weight = \"n\") #>    stat est #> 1:    D 0.6 dissimilarity(m2, \"group\", \"unit\", weight = \"n\") #>    stat est #> 1:    D 0.6"},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates expected values when true segregation is zero — dissimilarity_expected","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"sample sizes small, one group small proportion, many units, segregation indices typically upwardly biased, even true segregation zero. 
function simulates tables zero segregation, given marginals dataset, calculates segregation. expected values large, interpretation index scores might adjusted.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"","code":"dissimilarity_expected(   data,   group,   unit,   weight = NULL,   fixed_margins = TRUE,   n_bootstrap = 100 )"},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) fixed_margins margins fixed simulated? (Default TRUE) n_bootstrap Number bootstrap iterations. 
(Default 100)","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"data.table one row, corresponding expected value    D index true segregation zero.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"","code":"# build a smaller table, with 100 students distributed across # 10 schools, where one racial group has 10% of the students small <- data.frame(     school = c(1:10, 1:10),     race = c(rep(\"r1\", 10), rep(\"r2\", 10)),     n = c(rep(1, 10), rep(9, 10)) ) dissimilarity_expected(small, \"race\", \"school\", weight = \"n\") #>         stat       est         se #> 1: D under 0 0.3788889 0.09618616 # with an increase in sample size (n=1000), the values improve small$n <- small$n * 10 dissimilarity_expected(small, \"race\", \"school\", weight = \"n\") #>         stat       est         se #> 1: D under 0 0.1232222 0.02899553"},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates the entropy of a distribution — entropy","title":"Calculates the entropy of a distribution — entropy","text":"Returns entropy distribution defined group.","code":""},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates the entropy of a distribution — entropy","text":"","code":"entropy(data, group, weight = NULL, base = 
exp(1))"},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates the entropy of a distribution — entropy","text":"data data frame. group categorical variable vector variables contained data. weight Numeric. (Default NULL) base Base logarithm used entropy calculation. Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates the entropy of a distribution — entropy","text":"single number, entropy.","code":""},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates the entropy of a distribution — entropy","text":"","code":"d <- data.frame(cat = c(\"A\", \"B\"), n = c(25, 75)) entropy(d, \"cat\", weight = \"n\") # => .56 #> [1] 0.5623351 # this is equivalent to -.25*log(.25)-.75*log(.75)  d <- data.frame(cat = c(\"A\", \"B\"), n = c(50, 50)) # use base 2 for the logarithm, then entropy is maximized at 1 entropy(d, \"cat\", weight = \"n\", base = 2) # => 1 #> [1] 1"},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates pairwise exposure indices — exposure","title":"Calculates pairwise exposure indices — exposure","text":"Returns pairwise exposure indices groups","code":""},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates pairwise exposure indices — exposure","text":"","code":"exposure(data, group, unit, weight = NULL)"},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates pairwise exposure indices — exposure","text":"data data 
frame. group categorical variable contained data. Defines first dimension segregation computed. unit vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL)","code":""},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates pairwise exposure indices — exposure","text":"Returns data.table columns \"\", \"\",  \"exposure\". Read results \"exposure group x group y\".","code":""},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":null,"dir":"Reference","previous_headings":"","what":"Create crosswalk after compression — get_crosswalk","title":"Create crosswalk after compression — get_crosswalk","text":"running compress, function creates crosswalk table. Usually preferred call merge_units directly.","code":""},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create crosswalk after compression — get_crosswalk","text":"","code":"get_crosswalk(compression, n_units = NULL, percent = NULL, parts = FALSE)"},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create crosswalk after compression — get_crosswalk","text":"compression \"segcompression\" object returned compress. n_units Determines number merges specifying number units remain compressed dataset. n_units percent must given. (default: NULL) percent Determines number merges specifying percentage total segregation information retained compressed dataset. n_units percent must given. 
(default: NULL) parts (default: FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create crosswalk after compression — get_crosswalk","text":"Returns data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":null,"dir":"Reference","previous_headings":"","what":"Adjustment of marginal distributions using iterative proportional fitting — ipf","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"Adjusts marginal distributions group unit source respective marginal distributions target, using iterative proportional fitting algorithm (IPF).","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"","code":"ipf(   source,   target,   group,   unit,   weight = NULL,   max_iterations = 100,   precision = 1e-04 )"},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"source \"source\" data frame. marginals dataset adjusted marginals target. target \"target\" data frame. function returns dataset marginal distributions group unit categories approximated target. group categorical variable vector variables contained source target. Defines first distribution adjustment. unit categorical variable vector variables contained source target. Defines second distribution adjustment. weight Numeric. (Default NULL) max_iterations Maximum number iterations used IPF algorithm. precision Convergence criterion IPF algorithm. every iteration, ratio source target marginals calculated every category group unit. 
algorithm converges ratios smaller 1 + precision.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"Returns data frame retains   association structure source approximating   marginal distributions group unit target.   dataset identifies combination group unit,   categories occur either source target dropped.   adjusted frequency combination given column n,   n_target n_source contain zero-adjusted frequencies   target source dataset, respectively.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"algorithm works scaling marginal distribution group source data frame towards marginal distribution target; repeating process unit. algorithm keeps alternating group unit marginals adjusted data frame within allowed precision. results dataset retains association structure source approximating marginal distribution target. number unit group categories different source target, data frame returns combination unit group categories occur datasets. Zero values replaced small, non-zero number (1e-4). Note values returned sum observations source data frame, target data frame. different IPF implementations, ensures IPF change number observations.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"W. E. Deming F. F. Stephan. 1940.   \"Least Squares Adjustment Sampled Frequency Table   Expected Marginal Totals Known\".   Annals Mathematical Statistics. 11 (4): 427–444. T. Karmel M. Maclachlan. 1988.   
\"Occupational Sex Segregation — Increasing Decreasing?\" Economic Record 64: 187-195.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"","code":"if (FALSE) { # adjusts the marginals of group and unit categories so that # schools00 has similar marginals as schools05 adj <- ipf(schools00, schools05, \"race\", \"school\", weight = \"n\")  # check that the new \"race\" marginals are similar to the target marginals # (the same could be done for schools) aggregate(adj$n, list(adj$race), sum) aggregate(adj$n_target, list(adj$race), sum)  # note that the adjusted dataset contains fewer # schools than either the source or the target dataset, # because the marginals are only defined for the overlap # of schools length(unique(schools00$school)) length(unique(schools05$school)) length(unique(adj$school)) }"},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates isolation indices — isolation","title":"Calculates isolation indices — isolation","text":"Returns isolation index group","code":""},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates isolation indices — isolation","text":"","code":"isolation(data, group, unit, weight = NULL)"},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates isolation indices — isolation","text":"data data frame. group categorical variable contained data. Defines first dimension segregation computed. unit vector variables contained data. Defines second dimension segregation computed. weight Numeric. 
(Default NULL)","code":""},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates isolation indices — isolation","text":"Returns data.table group column isolation index.","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":null,"dir":"Reference","previous_headings":"","what":"Turns a contingency table into long format — matrix_to_long","title":"Turns a contingency table into long format — matrix_to_long","text":"Returns data.table long form, suitable use mutual_total, etc. Colnames rownames matrix respected.","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Turns a contingency table into long format — matrix_to_long","text":"","code":"matrix_to_long(   matrix,   group = \"group\",   unit = \"unit\",   weight = \"n\",   drop_zero = TRUE )"},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Turns a contingency table into long format — matrix_to_long","text":"matrix matrix, rows represent units, column represent groups. group Variable name group. (Default group) unit Variable name unit. (Default unit) weight Variable name frequency weight. (Default n) drop_zero Drop unit-group combinations zero weight. 
(Default TRUE)","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Turns a contingency table into long format — matrix_to_long","text":"data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Turns a contingency table into long format — matrix_to_long","text":"","code":"m <- matrix(c(10, 20, 30, 30, 20, 10), nrow = 3) colnames(m) <- c(\"Black\", \"White\") long <- matrix_to_long(m, group = \"race\", unit = \"school\") mutual_total(long, \"race\", \"school\", weight = \"n\") #>    stat        est #> 1:    M 0.08720802 #> 2:    H 0.12581458"},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a compressed dataset — merge_units","title":"Creates a compressed dataset — merge_units","text":"running compress, function creates dataset units merged.","code":""},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a compressed dataset — merge_units","text":"","code":"merge_units(compression, n_units = NULL, percent = NULL, parts = FALSE)"},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a compressed dataset — merge_units","text":"compression \"segcompression\" object returned compress. n_units Determines number merges specifying number units remain compressed dataset. n_units percent must given. (default: NULL) percent Determines number merges specifying percentage total segregation information retained compressed dataset. n_units percent must given. 
(default: NULL) parts (default: FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Creates a compressed dataset — merge_units","text":"Returns data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":null,"dir":"Reference","previous_headings":"","what":"Decomposes the difference between two M indices — mutual_difference","title":"Decomposes the difference between two M indices — mutual_difference","text":"Uses one three methods decompose difference two M indices: (1) \"shapley\" / \"shapley_detailed\": method based Shapley decomposition advantages Karmel-Maclachlan method (recommended default, Deutsch et al. 2006), (2) \"km\": method based Karmel-Maclachlan (1988), (3) \"mrc\": method developed Mora Ruiz-Castillo (2009). methods extended account missing units/groups either data input.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Decomposes the difference between two M indices — mutual_difference","text":"","code":"mutual_difference(   data1,   data2,   group,   unit,   weight = NULL,   method = \"shapley\",   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1),   ... )"},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Decomposes the difference between two M indices — mutual_difference","text":"data1 data frame structure data2. data2 data frame structure data1. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. 
(Default NULL) method Either \"shapley\" (default), \"km\" (Karmel Maclachlan method), \"mrc\" (Mora Ruiz-Castillo method). se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm. ... used additional arguments method set shapley km. See ipf details.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Decomposes the difference between two M indices — mutual_difference","text":"Returns data.table columns stat est. data frame contains   following rows defined stat:  M1 contains M data1.  M2 contains M data2.  diff difference M2 M1.   sum five rows following diff equal diff.  additions contains change M induces unit group categories   present data2 data1, removals reverse. methods return following three terms:  unit_marginal contribution unit composition differences.  group_marginal contribution group composition differences.  structural contribution unexplained marginal changes, .e. structural     difference. Note interpretation terms depend exact method used. using \"km\", one additional row returned:  interaction contribution differences joint marginal distribution      unit group. \"shapley_detailed\" used, additional column \"unit\" returned, along     six additional rows unit present data1 data2.     five rows following meaning:  p1 (p2) proportion unit data1 (data2)     non-intersecting units/groups removed. changes local linkage     given ls_diff1 ls_diff2, average given  ls_diff_mean. 
row named total summarizes contribution     unit towards structural change     using formula .5 * p1 * ls_diff1 + .5 * p2 * ls_diff2.     sum \"total\" components equals structural change. se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Decomposes the difference between two M indices — mutual_difference","text":"Shapley method improvement Karmel-Maclachlan method (Deutsch et al. 2006). based several margins-adjusted data inputs yields symmetrical results (.e. data1 data2 can switched). \"shapley_detailed\" used, structural component decomposed contributions individuals units. Karmel-Maclachlan method (Karmel Maclachlan 1988) adjusts margins data1 similar margins data2. process symmetrical. Shapley Karmel-Maclachlan methods based iterative proportional fitting (IPF), first introduced Deming Stephan (1940). Depending size dataset, may take seconds (see ipf details). method developed Mora Ruiz-Castillo (2009) uses algebraic approach estimate size components. often yield substantively different results Shapley Karmel-Maclachlan methods. Note method symmetric terms defined group unit categories, may yield contradictory results. problem arises group /unit categories data1 present data2 (vice versa). 
methods estimate difference categories present datasets, report additionally change M induced cases additions (present data2, data1) removals (present data1, data2).","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Decomposes the difference between two M indices — mutual_difference","text":"W. E. Deming, F. F. Stephan. 1940. \"Least Squares Adjustment Sampled Frequency Table    Expected Marginal Totals Known.\"    Annals Mathematical Statistics 11(4): 427-444. T. Karmel M. Maclachlan. 1988.   \"Occupational Sex Segregation — Increasing Decreasing?\" Economic Record 64: 187-195. R. Mora J. Ruiz-Castillo. 2009. \"Invariance Properties   Mutual Information Index Multigroup Segregation.\" Research Economic Inequality 17: 33-53. J. Deutsch, Y. Flückiger, J. Silber. 2009.       \"Analyzing Changes Occupational Segregation: Case Switzerland (1970–2000).\"        Research Economic Inequality 17: 171–202.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Decomposes the difference between two M indices — mutual_difference","text":"","code":"if (FALSE) { # decompose the difference in school segregation between 2000 and 2005, # using the Shapley method mutual_difference(schools00, schools05,     group = \"race\", unit = \"school\",     weight = \"n\", method = \"shapley\", precision = .1 ) # => the structural component is close to zero, thus most change is in the marginals. # This method gives identical results when we switch the unit and group definitions, # and when we switch the data inputs.  # the Karmel-Maclachlan method is similar, but only adjust the data in the forward direction... 
mutual_difference(schools00, schools05,     group = \"school\", unit = \"race\",     weight = \"n\", method = \"km\", precision = .1 )  # ...this means that the results won't be identical when we switch the data inputs mutual_difference(schools05, schools00,     group = \"school\", unit = \"race\",     weight = \"n\", method = \"km\", precision = .1 )  # the MRC method indicates a much higher structural change... mutual_difference(schools00, schools05,     group = \"race\", unit = \"school\",     weight = \"n\", method = \"mrc\" )  # ...and is not symmetric mutual_difference(schools00, schools05,     group = \"school\", unit = \"race\",     weight = \"n\", method = \"mrc\" ) }"},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates expected values when true segregation is zero — mutual_expected","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"sample sizes small, one group small proportion, many units, segregation indices typically upwardly biased, even true segregation zero. function simulates tables zero segregation, given marginals dataset, calculates segregation. expected values large, interpretation index scores might adjusted.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"","code":"mutual_expected(   data,   group,   unit,   weight = NULL,   within = NULL,   fixed_margins = TRUE,   n_bootstrap = 100,   base = exp(1) )"},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"data data frame. 
group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) within Apply algorithm within group defined variable, report weighted average. (Default NULL) fixed_margins margins fixed simulated? (Default TRUE) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"data.table two rows, corresponding expected values    segregation true segregation zero.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"","code":"if (FALSE) { # the schools00 dataset has a large sample size, so expected segregation is close to zero mutual_expected(schools00, \"race\", \"school\", weight = \"n\")  # but we can build a smaller table, with 100 students distributed across # 10 schools, where one racial group has 10% of the students small <- data.frame(     school = c(1:10, 1:10),     race = c(rep(\"r1\", 10), rep(\"r2\", 10)),     n = c(rep(1, 10), rep(9, 10)) ) mutual_expected(small, \"race\", \"school\", weight = \"n\") # with an increase in sample size (n=1000), the values improve small$n <- small$n * 10 mutual_expected(small, \"race\", \"school\", weight = \"n\") }"},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates local segregation scores based on M — mutual_local","title":"Calculates local segregation 
scores based on M — mutual_local","text":"Returns local segregation indices category defined unit.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates local segregation scores based on M — mutual_local","text":"","code":"mutual_local(   data,   group,   unit,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1),   wide = FALSE )"},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates local segregation scores based on M — mutual_local","text":"data data frame. group categorical variable vector variables contained data. Defines dimension segregation computed. unit categorical variable vector variables contained data. Defines group local segregation indices calculated. weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm. wide Returns wide dataframe instead long dataframe. (Default FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates local segregation scores based on M — mutual_local","text":"Returns data.table two rows category defined unit,   total 2*(number units) rows.   column est contains two statistics   provided unit: ls, local segregation score,  p, proportion unit total number cases.   
se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.   wide set TRUE, returns instead wide dataframe, one   row unit, associated statistics separate columns.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates local segregation scores based on M — mutual_local","text":"Henri Theil. 1971. Principles Econometrics. New York: Wiley. Ricardo Mora Javier Ruiz-Castillo. 2011.   \"Entropy-based Segregation Indices\". Sociological Methodology 41(1): 159–194.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates local segregation scores based on M — mutual_local","text":"","code":"# which schools are most segregated? 
(localseg <- mutual_local(schools00, \"race\", \"school\",     weight = \"n\", wide = TRUE )) #>       school        ls            p #>    1:   A1_1 0.1826710 0.0004522985 #>    2:   A1_2 0.1825592 0.0004978701 #>    3:   A1_3 0.2756157 0.0006642066 #>    4:   A1_4 0.1368034 0.0005685061 #>    5:   A2_1 0.3585546 0.0004260948 #>   ---                               #> 2041: C165_1 0.3174930 0.0004568556 #> 2042: C165_2 0.3835477 0.0005297702 #> 2043: C165_3 0.2972550 0.0005650883 #> 2044: C166_1 0.3072281 0.0011586588 #> 2045: C167_1 0.3166498 0.0005354667  sum(localseg$p) # => 1 #> [1] 1  # the sum of the weighted local segregation scores equals # total segregation sum(localseg$ls * localseg$p) # => .425 #> [1] 0.425539 mutual_total(schools00, \"school\", \"race\", weight = \"n\") # M => .425 #>    stat        est #> 1:    M 0.42553898 #> 2:    H 0.05642991"},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"Returns total segregation group unit. within given, calculates segregation within within category separately, takes weighted average. 
Also see mutual_within detailed within calculations.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"","code":"mutual_total(   data,   group,   unit,   within = NULL,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1) )"},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. within categorical variable vector variables contained data. variable(s) superset either unit group calculation meaningful. provided, segregation computed within groups defined variable, averaged. (Default NULL) weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"Returns data.table two rows. 
column est contains   Mutual Information Index, M, Theil's Entropy Index, H. H   M divided group entropy. within given,   M H weighted averages within-category segregation scores.   se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"Henri Theil. 1971. Principles Econometrics. New York: Wiley. Ricardo Mora Javier Ruiz-Castillo. 2011.      \"Entropy-based Segregation Indices\". Sociological Methodology 41(1): 159–194.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"","code":"# calculate school racial segregation mutual_total(schools00, \"school\", \"race\", weight = \"n\") # M => .425 #>    stat        est #> 1:    M 0.42553898 #> 2:    H 0.05642991  # note that the definition of groups and units is arbitrary mutual_total(schools00, \"race\", \"school\", weight = \"n\") # M => .425 #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083  # if groups or units are defined by a combination of variables, # vectors of variable names can be provided - # here there is no difference, because schools # are nested within districts mutual_total(schools00, \"race\", c(\"district\", \"school\"),     weight = \"n\" ) # M => .425 #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083  # estimate standard errors and 95% CI for M and H if (FALSE) { mutual_total(schools00, \"race\", \"school\",     
weight = \"n\",     se = TRUE, n_bootstrap = 1000 )  # estimate segregation within school districts mutual_total(schools00, \"race\", \"school\",     within = \"district\", weight = \"n\" ) # M => .087  # estimate between-district racial segregation mutual_total(schools00, \"race\", \"district\", weight = \"n\") # M => .338 # note that the sum of within-district and between-district # segregation equals total school-race segregation; # here, most segregation is between school districts }"},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"Returns -within decomposition defined sequence variables unit.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"","code":"mutual_total_nested(data, group, unit, weight = NULL, base = exp(1))"},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit vector variables contained data. Defines levels decomposition computed. weight Numeric. (Default NULL) base Base logarithm used calculation. 
Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"Returns data.table similar mutual_total,   column within define   levels nesting.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"","code":"mutual_total_nested(schools00, \"race\", c(\"state\", \"district\", \"school\"),     weight = \"n\" ) #>     between          within stat        est #> 1:    state                    M 0.09924370 #> 2:    state                    H 0.09767398 #> 3: district           state    M 0.23870880 #> 4: district           state    H 0.23493319 #> 5:   school state, district    M 0.08758648 #> 6:   school state, district    H 0.08620114 # This is a simpler way to run the following manually: # mutual_total(schools00, \"race\", \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"district\", within = \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"school\", within = c(\"state\", \"district\"), weight = \"n\")"},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates detailed within-category segregation scores for M and H — mutual_within","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"Calculates segregation group unit within category defined within.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates detailed within-category segregation scores for M and H 
— mutual_within","text":"","code":"mutual_within(   data,   group,   unit,   within,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1),   wide = FALSE )"},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. within categorical variable vector variables contained data defines within-segregation categories. weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm. wide Returns wide dataframe instead long dataframe. (Default FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"Returns data.table four rows category defined within.   column est contains four statistics   provided unit:  M within-category M, p proportion category.   Multiplying M p gives contribution within-category   towards total M.  H within-category H, ent_ratio provides entropy ratio,   defined EW/E, EW within-category entropy,   E overall entropy.   
Multiplying H, p, ent_ratio gives contribution within-category   towards total H.   se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.   wide set TRUE, returns instead wide dataframe, one   row within category, associated statistics separate columns.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"Henri Theil. 1971. Principles Econometrics. New York: Wiley. Ricardo Mora Javier Ruiz-Castillo. 2011.      \"Entropy-based Segregation Indices\". Sociological Methodology 41(1): 159–194.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"","code":"(within <- mutual_within(schools00, \"race\", \"school\",     within = \"state\",     weight = \"n\", wide = TRUE )) #>    state         M         p         H ent_ratio #> 1:     A 0.4085965 0.2768819 0.4969216 0.8092501 #> 2:     B 0.2549959 0.4035425 0.2680884 0.9361190 #> 3:     C 0.3450221 0.3195756 0.3611257 0.9402955 # the M for state \"A\" is .409 # manual calculation schools_A <- schools00[schools00$state == \"A\", ] mutual_total(schools_A, \"race\", \"school\", weight = \"n\") # M => .409 #>    stat       est #> 1:    M 0.4085965 #> 2:    H 0.4969216  # to recover the within M and H from the output, multiply # p * M and p * ent_ratio * H, respectively sum(within$p * within$M) # => .326 #> [1] 0.3262953 sum(within$p * within$ent_ratio * within$H) # => .321 #> [1] 0.3211343 # compare with: 
mutual_total(schools00, \"race\", \"school\", within = \"state\", weight = \"n\") #>    stat       est #> 1:    M 0.3262953 #> 2:    H 0.3211343"},{"path":"https://elbersb.com/segregation/reference/school_ses.html","id":null,"dir":"Reference","previous_headings":"","what":"Student-level data including SES status — school_ses","title":"Student-level data including SES status — school_ses","text":"Fake dataset used examples. individual-level dataset students schools.","code":""},{"path":"https://elbersb.com/segregation/reference/school_ses.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Student-level data including SES status — school_ses","text":"","code":"school_ses"},{"path":"https://elbersb.com/segregation/reference/school_ses.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Student-level data including SES status — school_ses","text":"data frame 5,153 rows 3 variables: school_id school ID ethnic_group one , B, C ses_quintile SES student (1 = lowest, 5 = highest)","code":""},{"path":"https://elbersb.com/segregation/reference/schools00.html","id":null,"dir":"Reference","previous_headings":"","what":"Ethnic/racial composition of schools for 2000/2001 — schools00","title":"Ethnic/racial composition of schools for 2000/2001 — schools00","text":"Fake dataset used examples. Loosely based data provided National Center Education Statistics, Common Core Data, information U.S. primary schools three U.S. states. 
original data can downloaded https://nces.ed.gov/ccd/.","code":""},{"path":"https://elbersb.com/segregation/reference/schools00.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Ethnic/racial composition of schools for 2000/2001 — schools00","text":"","code":"schools00"},{"path":"https://elbersb.com/segregation/reference/schools00.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Ethnic/racial composition of schools for 2000/2001 — schools00","text":"data frame 8,142 rows 5 variables: state either A, B, C district school agency/district ID school school ID race either native, asian, hispanic, black, white n n students school race","code":""},{"path":"https://elbersb.com/segregation/reference/schools05.html","id":null,"dir":"Reference","previous_headings":"","what":"Ethnic/racial composition of schools for 2005/2006 — schools05","title":"Ethnic/racial composition of schools for 2005/2006 — schools05","text":"Fake dataset used examples. Loosely based data provided National Center Education Statistics, Common Core Data, information U.S. primary schools three U.S. states. 
original data can downloaded https://nces.ed.gov/ccd/.","code":""},{"path":"https://elbersb.com/segregation/reference/schools05.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Ethnic/racial composition of schools for 2005/2006 — schools05","text":"","code":"schools05"},{"path":"https://elbersb.com/segregation/reference/schools05.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Ethnic/racial composition of schools for 2005/2006 — schools05","text":"data frame 8,013 rows 5 variables: state either A, B, C district school agency/district ID school school ID race either native, asian, hispanic, black, white n n students school race","code":""},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":null,"dir":"Reference","previous_headings":"","what":"Scree plot for segregation compression — scree_plot","title":"Scree plot for segregation compression — scree_plot","text":"plot allows visually see effect compression mutual information.","code":""},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Scree plot for segregation compression — scree_plot","text":"","code":"scree_plot(compression, tail = Inf)"},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Scree plot for segregation compression — scree_plot","text":"compression \"segcompression\" object returned compress. 
tail Return last tail units (default: Inf)","code":""},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Scree plot for segregation compression — scree_plot","text":"Returns ggplot2 plot.","code":""},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":null,"dir":"Reference","previous_headings":"","what":"A visual representation of two-group segregation — segcurve","title":"A visual representation of two-group segregation — segcurve","text":"Produces segregation curve, defined Duncan Duncan (1955)","code":""},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A visual representation of two-group segregation — segcurve","text":"","code":"segcurve(data, group, unit, weight)"},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A visual representation of two-group segregation — segcurve","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. 
(Default NULL)","code":""},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A visual representation of two-group segregation — segcurve","text":"Returns ggplot2 object.","code":""},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":null,"dir":"Reference","previous_headings":"","what":"A visual representation of segregation — segplot","title":"A visual representation of segregation — segplot","text":"Produces segregation plot.","code":""},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A visual representation of segregation — segplot","text":"","code":"segplot(   data,   group,   unit,   weight,   order = \"segregation\",   reference_distribution = NULL,   bar_space = 0,   title = NULL,   axis_labels = \"left\" )"},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A visual representation of segregation — segplot","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) order character, either \"segregation\", \"entropy\", \"majority\", \"majority_fixed\". Affects ordering units. horizontal ordering groups can changed using factor variable group. difference \"majority\" \"majority_fixed\" former reorder groups way majority group actually comes first. want control ordering , use \"majority_fixed\" specify group variable factor variable. reference_distribution Specifies reference distribution, given two-column data frame, plotted right. order segregation, reference distribution also used compute local segregation scores. 
bar_space Specifies space single units. title Adds plot title appends value H index. axis_labels One \"left\", \"right\", \"\". Determines y axis labels placed.","code":""},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A visual representation of segregation — segplot","text":"Returns ggplot2 object.","code":""},{"path":"https://elbersb.com/segregation/reference/segregation.html","id":null,"dir":"Reference","previous_headings":"","what":"segregation: Entropy-based segregation indices — segregation","title":"segregation: Entropy-based segregation indices — segregation","text":"Calculate decompose entropy-based, multigroup segregation indices, focus Mutual Information Index (M) Theil's Information Index (H). Provides tools decompose measures groups units, within terms. Includes standard error estimation bootstrapping.","code":""},{"path":[]},{"path":"https://elbersb.com/segregation/reference/segregation.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"segregation: Entropy-based segregation indices — segregation","text":"Maintainer: Benjamin Elbers be2239@columbia.edu (ORCID)","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-development-version","dir":"Changelog","previous_headings":"","what":"segregation (development version)","title":"segregation (development version)","text":"various improvements compression algorithm add dendrogram visualization","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-100","dir":"Changelog","previous_headings":"","what":"segregation 1.0.0","title":"segregation 1.0.0","text":"CRAN release: 2023-08-24 add mutual_total_nested add within argument mutual_expected add dissimilarity_expected add suite compression-related functions (C++) add segplot function add functions exposure isolation fix roxygen2 
problem","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-060","dir":"Changelog","previous_headings":"","what":"segregation 0.6.0","title":"segregation 0.6.0","text":"CRAN release: 2021-09-02 faster mutual_total(…, within) updated docs minor bug fixes improved error messages","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-050","dir":"Changelog","previous_headings":"","what":"segregation 0.5.0","title":"segregation 0.5.0","text":"CRAN release: 2021-02-08 dissimilarity: support index dissimilarity add CI argument confidence intervals mutual_within: report ent_ratio instead h_weight matrix_to_long: convert contingency tables long form add introductory vignette","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-040","dir":"Changelog","previous_headings":"","what":"segregation 0.4.0","title":"segregation 0.4.0","text":"CRAN release: 2021-01-08 faster bootstrap return bootstrap estimates attr add mutual_expected apply bias-correction via bootstrap default se=TRUE","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-030","dir":"Changelog","previous_headings":"","what":"segregation 0.3.0","title":"segregation 0.3.0","text":"CRAN release: 2019-09-20 always return data.table ipf function, warn groups/units dropped return sample size source dataset IPF don’t allow bootstrap sample size integer, allow non-integer sample weights (unproblematic) simplify precision parameter ipf procedure increase default bootstrap 100 fix data.table issue (#3)","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-020","dir":"Changelog","previous_headings":"","what":"segregation 0.2.0","title":"segregation 0.2.0","text":"CRAN release: 2019-01-14 add “shapley” decomposition method, revisit difference decomposition methods better logging bootstrap/IPF several small fixes add lintr package add warning attempting 
bootstrap non-integer weights","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-010","dir":"Changelog","previous_headings":"","what":"segregation 0.1.0","title":"segregation 0.1.0","text":"CRAN release: 2018-06-15 switch group unit definitions, consistent literature add Theil’s Information Index (H) add entropy function add mutual_within function decompose weighted within indices add “wide” option mutual_local mutual_within add “ipf” (iterative proportional fitting) function difference decomposition based IPF “mrc_adjusted” difference decomposition defined overlap sample units groups internal refactoring","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-001","dir":"Changelog","previous_headings":"","what":"segregation 0.0.1","title":"segregation 0.0.1","text":"CRAN release: 2018-04-17 Initial release.","code":""}]
+[{"path":"https://elbersb.com/segregation/articles/faq.html","id":"can-index-x-be-added-to-the-package","dir":"Articles","previous_headings":"","what":"Can index X be added to the package?","title":"FAQ","text":"Adding new segregation indices big trouble. Please open issue GitHub request index added.","code":""},{"path":"https://elbersb.com/segregation/articles/faq.html","id":"how-can-i-compute-indices-for-different-areas-at-once","dir":"Articles","previous_headings":"","what":"How can I compute indices for different areas at once?","title":"FAQ","text":"use dplyr package, one pattern works well use group_modify. , compute pairwise Black-White dissimilarity index state separately: similar pattern works also well data.table: compute many decompositions , ’s easiest combine data two time points. instance, ’s dplyr solution decompose state-specific M indices 2000 2005: , ’s also data.table solution:","code":"library(\"segregation\") library(\"dplyr\")  schools00 %>%   filter(race %in% c(\"black\", \"white\")) %>%   group_by(state) %>%   group_modify(~ dissimilarity(     data = .x,     group = \"race\",     unit = \"school\",     weight = \"n\"   )) #> # A tibble: 3 × 3 #> # Groups:   state [3] #>   state stat    est #>   <fct> <chr> <dbl> #> 1 A     D     0.706 #> 2 B     D     0.655 #> 3 C     D     0.704 library(\"data.table\")  schools00 <- as.data.table(schools00) schools00[   race %in% c(\"black\", \"white\"),   dissimilarity(data = .SD, group = \"race\", unit = \"school\", weight = \"n\"),   by = .(state) ] #>    state stat       est #> 1:     A    D 0.7063595 #> 2:     B    D 0.6548485 #> 3:     C    D 0.7042057 # helper function for decomposition diff <- function(df, group) {   data1 <- filter(df, year == 2000)   data2 <- filter(df, year == 2005)   mutual_difference(data1, data2, group = \"race\", unit = \"school\", weight = \"n\") }  # add year indicators schools00$year <- 2000 schools05$year <- 2005 combine <- bind_rows(schools00, schools05)  combine %>%   
group_by(state) %>%   group_modify(diff) %>%   head(5) #> # A tibble: 5 × 3 #> # Groups:   state [1] #>   state stat          est #>   <fct> <chr>       <dbl> #> 1 A     M1         0.409  #> 2 A     M2         0.445  #> 3 A     diff       0.0359 #> 4 A     additions -0.0159 #> 5 A     removals   0.0390 setDT(combine) combine[, diff(.SD), by = .(state)] %>% head(5) #>    state      stat         est #> 1:     A        M1  0.40859652 #> 2:     A        M2  0.44454379 #> 3:     A      diff  0.03594727 #> 4:     A additions -0.01585879 #> 5:     A  removals  0.03903106"},{"path":"https://elbersb.com/segregation/articles/faq.html","id":"how-can-i-use-census-data-from-tidycensus-to-compute-segregation-indices","dir":"Articles","previous_headings":"","what":"How can I use Census data from tidycensus to compute segregation indices?","title":"FAQ","text":"examples thanks Kyle Walker, author tidycensus package. First, download data: data “long” format, ’s easy compute segregation indices: Producing map local segregation scores also hard:","code":"library(\"tidycensus\")  cook_data <- get_acs(   geography = \"tract\",   variables = c(     white = \"B03002_003\",     black = \"B03002_004\",     asian = \"B03002_006\",     hispanic = \"B03002_012\"   ),   state = \"IL\",   county = \"Cook\" ) #> Getting data from the 2017-2021 5-year ACS # compute index of dissimilarity cook_data %>%   filter(variable %in% c(\"black\", \"white\")) %>%   dissimilarity(     group = \"variable\",     unit = \"GEOID\",     weight = \"estimate\"   ) #>    stat       est #> 1:    D 0.7855711  # compute multigroup M/H indices cook_data %>%   mutual_total(     group = \"variable\",     unit = \"GEOID\",     weight = \"estimate\"   ) #>    stat       est #> 1:    M 0.5114435 #> 2:    H 0.4089561 library(\"tigris\") library(\"ggplot2\")  local_seg <- mutual_local(cook_data,   group = \"variable\",   unit = \"GEOID\",   weight = \"estimate\",   wide = TRUE )  # download shapefile seg_geom <- tracts(\"IL\", 
\"Cook\", cb = TRUE, progress_bar = FALSE) %>%   left_join(local_seg, by = \"GEOID\") #> Retrieving data for the year 2021  ggplot(seg_geom, aes(fill = ls)) +   geom_sf(color = NA) +   coord_sf(crs = 3435) +   scale_fill_viridis_c() +   theme_void() +   labs(     title = \"Local segregation scores for Cook County, IL\",     fill = NULL   )"},{"path":"https://elbersb.com/segregation/articles/faq.html","id":"how-can-i-compute-margins-adjusted-local-segregation-scores","dir":"Articles","previous_headings":"","what":"How can I compute margins-adjusted local segregation scores?","title":"FAQ","text":"using mutual_difference, supply method = \"shapley_detailed\" get two different local segregation scores margins-adjusted (one coming adjusting forward, adjusting backwards). averaging can create single margins-adjusted local segregation score:","code":"diff <- mutual_difference(schools00, schools05, \"race\", \"school\",   weight = \"n\", method = \"shapley_detailed\" )  diff[stat %in% c(\"ls_diff1\", \"ls_diff2\"),   .(ls_diff_adjusted = mean(est)),   by = .(school) ] #>       school ls_diff_adjusted #>    1:   A1_3     -0.088983164 #>    2:   A2_2     -0.044338042 #>    3:   A2_3     -0.101696519 #>    4:   A2_4     -0.020134162 #>    5:   A2_6     -0.138567163 #>   ---                         #> 1706: C164_2     -0.031329845 #> 1707: C165_1     -0.023978101 #> 1708: C165_3      0.003781632 #> 1709: C166_1      0.010270713 #> 1710: C167_1     -0.002663687"},{"path":"https://elbersb.com/segregation/articles/plotting.html","id":"segregation-curve","dir":"Articles","previous_headings":"","what":"Segregation curve","title":"Visualizing and compressing segregation","text":"segregation curve first introduced Duncan Duncan (1955). function segcurve() provides simple way plotting one several segregation curves:  case, state segregated, state B C similarly segregated, lower level. 
Segregation curves closely related index dissimilarity, corresponds following index values:","code":"segcurve(subset(schools00, race %in% c(\"white\", \"asian\")),   \"race\", \"school\",   weight = \"n\",   segment = \"state\" # leave this out to produce a single curve ) # converting to data.table makes this easier data.table::as.data.table(schools00)[   race %in% c(\"white\", \"asian\"),   dissimilarity(.SD, \"race\", \"school\", weight = \"n\"),   by = .(state) ] #>    state stat       est #> 1:     A    D 0.6558592 #> 2:     B    D 0.4002980 #> 3:     C    D 0.3886178"},{"path":"https://elbersb.com/segregation/articles/plotting.html","id":"segplot","dir":"Articles","previous_headings":"","what":"Segplot","title":"Visualizing and compressing segregation","text":"function segplot() provided generate segplots. Segplots described detail working paper. function requires dataset, group, unit variables, , required, variable identifies weight (n case). options customize look segplot given argument order. default, units segplot ordered local segregation score, also possible order entropy (.e., diversity) share majority population. last option can useful two-group case. argument bar_space can used increase space units default zero space bars. plotting subset dataset, reference distribution shown right segplot can changed supplying two-column data frame reference_distribution argument. One column frame contain group identifiers, include reference proportion group. show axis labels either left side , right side , sides, use argument axis_labels. 
Examples use arguments given :","code":"sch <- subset(schools00, state == \"A\")  # basic segplot segplot(sch, \"race\", \"school\", weight = \"n\", axis_labels = \"both\") # order by majority group (white in this case) segplot(sch, \"race\", \"school\", weight = \"n\", order = \"majority\") # increase the space between bars # (has to be very low here because there are many schools in this dataset) segplot(sch, \"race\", \"school\", weight = \"n\", bar_space = 0.0005) # change the reference distribution # (here, we just use an equalized distribution across the five groups) (ref <- data.frame(race = unique(schools00$race), p = rep(0.2, 5))) #>     race   p #> 1  asian 0.2 #> 2  black 0.2 #> 3   hisp 0.2 #> 4  white 0.2 #> 5 native 0.2 segplot(sch, \"race\", \"school\",   weight = \"n\",   reference_distribution = ref )"},{"path":"https://elbersb.com/segregation/articles/plotting.html","id":"compressing-segregation-information","dir":"Articles","previous_headings":"","what":"Compressing segregation information","title":"Visualizing and compressing segregation","text":"compression algorithm requires three steps taken. First, important decide units permitted merge: residential segregation, may want allow neighboring units (tracts) mergeable. case, first step consists compiling data frame exactly two columns, row identifies pair neighboring units. cases, may want allow units mergeable, principle. However, can time-consuming requires unit compared others every step merging operation. speed compression, therefore implement option allows units merged within window “neighboring” units, definition window based similarities local segregation. Hence, given unit, n_neighbors considered every step, neighbors based similarities local segregation. Smaller n_neighbors values result faster run times, increase probability non-optimal merges. method merging can specified compress() function supplying argument neighbors. second step run actual compression algorithm using compress(). 
example, choose compress based relatively small window: running compress()—can take time depending many neighbors need considered—output summarizes compression can achieved: results indicate 99% segregation information can retained 98 units (560 original dataset), 95% 24 units, 90% 10 units. percentage information retained iteration can accessed via data frame available comp$iterations. data frame can also used generate plot shows relationship number merges loss segregation information:  Another way learn compression visualize information dendrogram:  third step create new dataset based desired level compression. can achieved using function merge_units(), either n_units percent can specified indicate desired level compression. compressed dataset format original dataset can now used produce another segplot, e.g.","code":"# compression based on window of 20 'neighboring' units # in terms of local segregation (alternatively, neighbors can be a data frame) comp <- compress(sch, \"race\", \"school\",   weight = \"n\", neighbors = \"local\", n_neighbors = 20 ) comp #> Compression of dataset with 560 units #> Original M: 0.4085965; Final M: 0 #> - Threshold 99%: M = 0.4045242; Units = 92 #> - Threshold 95%: M = 0.388871; Units = 22 #> - Threshold 90%: M = 0.3695035; Units = 9 scree_plot(comp) dend <- as.dendrogram(comp) dendextend::labels(dend) <- NULL # remove the labels #> Warning in `labels<-.dendrogram`(`*tmp*`, value = NULL): The lengths of the new #> labels is shorter than the number of leaves in the dendrogram - labels are #> recycled. 
#> Warning in rep(new_labels, length.out = leaves_length): 'x' is NULL so the #> result will be NULL plot(dend) sch_compressed <- merge_units(comp, n_units = 15) # or, for instance: merge_units(comp, percent = 0.80) head(sch_compressed) #>    school  race    n #> 1:    M12 asian  143 #> 2:    M12 black  445 #> 3:    M12  hisp  472 #> 4:    M12 white 6174 #> 5:     M2 black   67 #> 6:     M2  hisp  642 segplot(sch_compressed, \"race\", \"school\", weight = \"n\")"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"the-basic-mathematics","dir":"Articles","previous_headings":"","what":"The basic mathematics","title":"A walkthrough of the segregation package","text":"idea segregation index summarize contingency table single number. instance, may table \\(U\\) units, say schools occupations, \\(G\\) groups, say gender racial groups. combination unit group count, \\(t_{ug}\\). Arranged \\(U\\times G\\) matrix \\(\\mathbf{T}\\), structure data looks like: matrix, can define \\(t=\\sum_{u=1}^U\\sum_{g=1}^G t_{ug}\\), total population size. joint probability unit \\(u\\) racial group \\(g\\) \\(p_{ug}=t_{ug}/t\\). Also define \\(p_{u \\cdot}=\\sum_{g=1}^{G}t_{ug}/t\\) \\(p_{\\cdot g}=\\sum_{u=1}^{U}t_{ug}/t\\) marginal probabilities units groups, respectively. Mutual Information Index defined \\[ M(\\mathbf{T})=\\sum_{u=1}^U\\sum_{g=1}^Gp_{ug}\\log\\frac{p_{ug}}{p_{u \\cdot}p_{\\cdot g}}. \\] Theil Index closely related M index, just normalized version Mutual Information Index: \\[ H(\\mathbf{T})=\\frac{M(\\mathbf{T})}{E(\\mathbf{T})}, \\] \\(E(\\mathbf{T})\\) denotes entropy group marginal distribution \\(\\mathbf{T}\\), .e. \\(E(\\mathbf{T})=-\\sum_{g=1}^{G}p_{\\cdot g}\\log p_{\\cdot g}\\). 
Dividing group entropy effect constraining H 0 1.","code":""},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"data-format","dir":"Articles","previous_headings":"","what":"Data format","title":"A walkthrough of the segregation package","text":"examples, use dataset built segregation package, schools00. dataset contains data 2,045 schools across 429 school districts three U.S. states. school, dataset records number Asian, Black, Hispanic, White, Native American students. segregation package requires data long form (segregation data comes form), form contingency tables. Hence, row schools00 dataset unique combination given school racial group, column n records number students combination: Note first school, A1_1, Native American students. Hence, row missing. data form contingency tables, can use matrix_to_long() convert long format required package. example: group unit arguments optional.","code":"library(\"segregation\") head(schools00[, c(\"school\", \"race\", \"n\")]) #>   school  race   n #> 1   A1_1 asian   2 #> 2   A1_1 black  14 #> 3   A1_1  hisp  30 #> 4   A1_1 white 351 #> 5   A1_2 black   9 #> 6   A1_2  hisp 101 (m <- matrix(c(10, 20, 30, 30, 20, 10), nrow = 3)) #>      [,1] [,2] #> [1,]   10   30 #> [2,]   20   20 #> [3,]   30   10 colnames(m) <- c(\"Black\", \"White\") matrix_to_long(m, group = \"race\", unit = \"school\") #>    school  race  n #> 1:      1 Black 10 #> 2:      2 Black 20 #> 3:      3 Black 30 #> 4:      1 White 30 #> 5:      2 White 20 #> 6:      3 White 10"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"computing-the-m-and-h-indices","dir":"Articles","previous_headings":"","what":"Computing the M and H indices","title":"A walkthrough of the segregation package","text":"Compute M H indices using mutual_total(): Interpreting M easy, normalized. However, H can range 0 1, value 0.419 indicate moderate segregation. second argument mutual_total() refers groups, third argument refers units. 
Switching groups units affect M index, change H index: segregation package always divides marginal group entropy, hence divide entropy school distribution, expect much larger (many schools racial groups). check, can use entropy() function: Therefore, H index used, important specify groups units correctly. inference (discussed detail ), can use bootstrapping obtain standard errors confidence intervals: large number observations, standard errors small.","code":"mutual_total(schools00, \"race\", \"school\", weight = \"n\") #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083 mutual_total(schools00, \"school\", \"race\", weight = \"n\") #>    stat        est #> 1:    M 0.42553898 #> 2:    H 0.05642991 (entropy(schools00, \"race\", weight = \"n\")) #> [1] 1.016071 (entropy(schools00, \"school\", weight = \"n\")) #> [1] 7.541018 mutual_total(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, CI = .95, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations #>    stat       est           se                  CI        bias #> 1:    M 0.4219383 0.0008178582 0.4203089,0.4236068 0.003600699 #> 2:    H 0.4152563 0.0007530694 0.4137359,0.4167181 0.003552063"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"between-within-decomposition","dir":"Articles","previous_headings":"","what":"Between-Within decomposition","title":"A walkthrough of the segregation package","text":"might wonder whether segregation different across three different states. can compute segregation indices manually (just showing M simplicity): Clearly, state segregated state C, turn shows higher school segregation B. One advantages entropy-based segregation indices three state-specific indices simple relationship overall index. called /within decomposition: Total segregation can decomposed term measures much distribution racial groups differs states, term measures segregation within states. 
\\(S\\) states (“super-units” generally), school belongs exactly one state, M index can decomposed follows: \\[ M(\\mathbf{T})=M(\\mathbf{S}) + \\sum_{s=1}^S p_s M(\\mathbf{T}_s), \\] \\(\\mathbf{T}\\) full \\(U \\times G\\) contingency table, \\(\\mathbf{S}\\) aggregated contingency table dimension \\(S\\times G\\), \\(p_s\\) population proportion state \\(s\\) (\\(\\sum_{s=1}^S p_s=1\\)), \\(\\mathbf{T}_s\\) subset rows \\(\\mathbf{T}\\) belonging state \\(s\\). Put simple terms, M index can decomposed -state segregation index, plus weighted average within-state M indices. H index, dividing formula \\(E(\\mathbf{T})\\). makes formula bit complicated, normalization offset decomposition: \\[ H(\\mathbf{T})=H(\\mathbf{S}) + \\sum_{s=1}^S \\frac{E(\\mathbf{T}_s)}{E(\\mathbf{T})} p_s H(\\mathbf{T}_s), \\] \\(E(\\cdot)\\) entropy marginal group distribution. Note \\(E(\\mathbf{T})=E(\\mathbf{S})\\), group marginal distributions identical. compute decomposition using segregation package, use: Note \\(0.426 = 0.0992 + 0.326\\) \\(0.419 = 0.0977 + 0.321\\). results indicate 75% segregation within states. words, differences racial composition three different states account less 25% segregation. using mutual_total() within argument, can obtain overall within component, obtain decomposition state. , can use mutual_within(): much simpler way obtain state-specific segregation scores compared subsetting manually, shown beginning section. addition M H indices, also obtain p, population proportion state (\\(p_s\\) ), ent_ratio, \\(E(\\mathbf{T}_s)/E(\\mathbf{T})\\) . Hence, can recover total within-component using exactly . quantity \\(p_s M(\\mathbf{T}_s)\\) interest, shows much states contribute segregation total, taking account size. adding component, can calculate contribution four components: four components contributes quarter total segregation 0.426. Note state smallest state (27.7% population), contributes largest percentage (26.6%) total segregation. 
Hence, decomposition shows important look \\(p_s\\), state sizes, well \\(M(\\mathbf{T}_s)\\), within-state segregation. -within decomposition can also applied repeatedly hierarchical setting. instance, schools00 dataset, schools nested within districts, districts nested within states. Therefore, can ask: much segregation due segregation states, much segregation due -district segregation within states, much segregation due -school segregation within districts? package provides convenience function use case:","code":"split_schools <- split(schools00, schools00$state) mutual_total(split_schools$A, \"race\", \"school\", weight = \"n\")[1, ] #>    stat       est #> 1:    M 0.4085965 mutual_total(split_schools$B, \"race\", \"school\", weight = \"n\")[1, ] #>    stat       est #> 1:    M 0.2549959 mutual_total(split_schools$C, \"race\", \"school\", weight = \"n\")[1, ] #>    stat       est #> 1:    M 0.3450221 # total segregation (total <- mutual_total(schools00, \"race\", \"school\", weight = \"n\")) #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083 # between-state segregation: #     how much does the racial distributions differ across states? (between <- mutual_total(schools00, \"race\", \"state\", weight = \"n\")) #>    stat        est #> 1:    M 0.09924370 #> 2:    H 0.09767398 # within-state segregation: #     how much segregation exist within states? 
(mutual_total(schools00, \"race\", \"school\", within = \"state\", weight = \"n\")) #>    stat       est #> 1:    M 0.3262953 #> 2:    H 0.3211343 (within <- mutual_within(schools00, \"race\", \"school\",   within = \"state\", weight = \"n\", wide = TRUE )) #>    state         M         p         H ent_ratio #> 1:     A 0.4085965 0.2768819 0.4969216 0.8092501 #> 2:     B 0.2549959 0.4035425 0.2680884 0.9361190 #> 3:     C 0.3450221 0.3195756 0.3611257 0.9402955 with(within, sum(M * p)) #> [1] 0.3262953 with(within, sum(H * p * ent_ratio)) #> [1] 0.3211343 # merge into a vector components <- c(between$est[1], within$M * within$p) names(components) <- c(\"Between\", \"A\", \"B\", \"C\") signif(100 * components / total$est[1], 3) #> Between       A       B       C  #>    23.3    26.6    24.2    25.9 mutual_total_nested(schools00, \"race\", c(\"state\", \"district\", \"school\"),   weight = \"n\" ) #>     between          within stat        est #> 1:    state                    M 0.09924370 #> 2:    state                    H 0.09767398 #> 3: district           state    M 0.23870880 #> 4: district           state    H 0.23493319 #> 5:   school state, district    M 0.08758648 #> 6:   school state, district    H 0.08620114 # This is a simpler way of running the following three decompositions manually: # mutual_total(schools00, \"race\", \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"district\", within = \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"school\", within = c(\"state\", \"district\"), weight = \"n\")"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"local-segregation","dir":"Articles","previous_headings":"","what":"Local segregation","title":"A walkthrough of the segregation package","text":"M index (H index) allows another decomposition, local segregation scores. define decomposition, let \\(p_{g|u} = t_{ug} / t_{u \\cdot}\\) conditional probability group \\(g\\), given one unit \\(u\\). 
can define local segregation score unit \\(u\\) \\[L_u = \\sum_{g=1}^G p_{g|u}\\log\\frac{p_{g|u}}{p_{\\cdot g}}\\] weighted average \\(L_u\\) \\(M(\\mathbf{T})\\), .e. \\(M(\\mathbf{T}) = \\sum_{u=1}^U p_{u\\cdot}L_u\\). obtain local segregation scores \\(L_u\\), along marginal weights \\(p_{u\\cdot}\\), use mutual_local(): Local segregation scores based much less data full M index, often makes sense obtain confidence intervals. following code plots length 95% confidence interval relation size school:  Although relationship deterministic, larger schools shorter confidence intervals. M symmetric, local segregation scores can also obtained groups. equivalent definition local segregation score group \\(g\\) \\[L_g = \\sum_{u=1}^U p_{u|g}\\log\\frac{p_{u|g}}{p_{u \\cdot}},\\] , expected, \\(M(\\mathbf{T}) = \\sum_{g=1}^G p_{\\cdot g}L_g\\). obtain scores, switch group unit arguments mutual_local: results show racial groups experience different levels segregation: White students less segregated Asian, Black, Hispanic, , especially, Native American students.","code":"mutual_local(schools00, \"race\", \"school\", weight = \"n\", wide = TRUE) #>       school        ls            p #>    1:   A1_1 0.1826710 0.0004522985 #>    2:   A1_2 0.1825592 0.0004978701 #>    3:   A1_3 0.2756157 0.0006642066 #>    4:   A1_4 0.1368034 0.0005685061 #>    5:   A2_1 0.3585546 0.0004260948 #>   ---                               #> 2041: C165_1 0.3174930 0.0004568556 #> 2042: C165_2 0.3835477 0.0005297702 #> 2043: C165_3 0.2972550 0.0005650883 #> 2044: C166_1 0.3072281 0.0011586588 #> 2045: C167_1 0.3166498 0.0005354667 localse <- mutual_local(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, wide = TRUE, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations localse$lengthCI <- sapply(localse$ls_CI, base::diff) with(localse, plot(x = p, y = lengthCI, pch = 16, cex = 0.3)) (localg <- mutual_local(schools00, \"school\", \"race\", weight = \"n\", wide = TRUE)) #> 
     race        ls           p #> 1:  asian 0.6287673 0.022553401 #> 2:  black 0.8805413 0.190149919 #> 3:   hisp 0.7766327 0.151696575 #> 4:  white 0.1836393 0.628092178 #> 5: native 1.4342644 0.007507927"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"inference","dir":"Articles","previous_headings":"","what":"Inference","title":"A walkthrough of the segregation package","text":"four main functions packages, mutual_total(), mutual_within(), mutual_local(), mutual_difference() support inference bootstrapping. Inference segregation indices tricky, standard error estimates confidence intervals trusted much little data, especially segregation index close either 0 maximum segregation. estimate standard errors confidence intervals, use se = TRUE. coverage confidence interval can specified CI argument. number bootstrap iterations can specified well: confidence intervals based percentiles bootstrap distribution, hence require large number bootstrap iterations valid interpretation. estimate est reported results already “debiased”, .e. bias estimated bootstrap distribution (reported bias) subtracted usual maximum-likelihood estimate obtain mutual_total se = FALSE. confidence interval centered around debiased estimate. balance, confidence intervals preferred standard error bootstrap distribution can skewed, especially segregation low high. example, can see standard errors provide almost identical coverage confidence intervals, provide effectively coverage confidence intervals obtained percentile bootstrap. Whenever bootstrap used, bootstrap distributions parameter reported attribute bootstrap returned object. can used, instance, check whether bootstrap distribution skewed. following code computes local segregation scores schools, shows histogram bootstrap distribution school C137_9, low local segregation score:  school, bootstrap distribution skewed. 
precise inference specific school needed, standard error interpreted, confidence interval interpreted number bootstrap iterations large. concerned contingency table small provide reliable segregation estimates, package also provides function mutual_expected() simulates random cell counts independence marginal distributions table. schools00 dataset: , concern bias due small sample size.","code":"(se <- mutual_total(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, CI = .95, n_bootstrap = 500 )) #> 500 bootstrap iterations on 877739 observations #>    stat       est           se                  CI        bias #> 1:    M 0.4218735 0.0007364381 0.4205171,0.4232621 0.003665477 #> 2:    H 0.4152207 0.0006890576 0.4139019,0.4165112 0.003587629 # M with(se, c(est[1] - 1.96 * se[1], est[1] + 1.96 * se[1])) #> [1] 0.4204301 0.4233169 # H with(se, c(est[2] - 1.96 * se[2], est[2] + 1.96 * se[2])) #> [1] 0.4138701 0.4165712 local <- mutual_local(schools00, \"race\", \"school\",   weight = \"n\",   se = TRUE, CI = .95, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations # pick bootstrap distribution of local segregation scores for school C137_9 ls_school <- attr(local, \"bootstrap\")[school == \"C137_9\" & stat == \"ls\", boot_est] hist(ls_school, main = \"Bootstrap distribution for school C137_9\") mutual_expected(schools00, \"race\", \"school\", weight = \"n\", n_bootstrap = 500) #>         stat         est           se #> 1: M under 0 0.004806118 7.679837e-05 #> 2: H under 0 0.004730100 7.558367e-05"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"decomposing-differences-in-indices","dir":"Articles","previous_headings":"","what":"Decomposing differences in indices","title":"A walkthrough of the segregation package","text":"command mutual_difference() can used decompose differences segregation, described Elbers (2021). default, recommended method, use method = shapley (method = shapley_detailed). 
methods (mrc, km) exist mostly testing purposes, recommended. Details procedure interpret terms decomposition found Elbers (2021). method also supports inference setting se = TRUE.","code":"mutual_difference(schools00, schools05, \"race\", \"school\", weight = \"n\") #>              stat          est #> 1:             M1  0.425538976 #> 2:             M2  0.413385092 #> 3:           diff -0.012153884 #> 4:      additions -0.003412776 #> 5:       removals -0.011405093 #> 6: group_marginal  0.018550238 #> 7:  unit_marginal -0.012391915 #> 8:     structural -0.003494338"},{"path":"https://elbersb.com/segregation/articles/segregation.html","id":"references","dir":"Articles","previous_headings":"","what":"References","title":"A walkthrough of the segregation package","text":"Elbers, B. (2021). Method Studying Differences Segregation Across Time Space. Sociological Methods & Research. https://doi.org/10.1177/0049124121986204 Mora, R., & Ruiz-Castillo, J. (2011). Entropy-based Segregation Indices. Sociological Methodology, 41(1), 159–194. https://doi.org/10.1111/j.1467-9531.2011.01237.x Theil, H. (1971). Principles Econometrics. New York: Wiley","code":""},{"path":"https://elbersb.com/segregation/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Benjamin Elbers. Author, maintainer.","code":""},{"path":"https://elbersb.com/segregation/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Benjamin Elbers. 2021. Method Studying Differences Segregation Across Time Space Sociological Methods & Research 52(1): 5-42. 
doi: 10.1177/0049124121986204","code":"@Article{,   title = {A Method for Studying Differences in Segregation Across Time and Space},   author = {Benjamin Elbers},   journal = {Sociological Methods & Research},   year = {2021},   volume = {52},   number = {1},   pages = {5-42},   doi = {10.1177/0049124121986204}, }"},{"path":"https://elbersb.com/segregation/index.html","id":"segregation","dir":"","previous_headings":"","what":"Entropy-Based Segregation Indices","title":"Entropy-Based Segregation Indices","text":"R package calculate, visualize, decompose various segregation indices. package currently supports Mutual Information Index (M), Theil’s Information Index (H), index Dissimilarity (D), isolation exposure index. Find information vignette(\"segregation\") documentation. package also supports standard error confidence intervals estimation via bootstrapping, also corrects small sample bias decomposition M H indices (within/, local segregation) decomposing differences total segregation time (Elbers 2020) segregation visualizations (segregation curves ‘segplots’) methods return tidy data.tables easy post-processing plotting. speed, package uses data.table package internally, implements functions C++. procedures implemented package described detail SMR paper (Preprint) working paper.","code":""},{"path":"https://elbersb.com/segregation/index.html","id":"usage","dir":"","previous_headings":"","what":"Usage","title":"Entropy-Based Segregation Indices","text":"package provides easy way calculate segregation measures, based Mutual Information Index (M) Theil’s Entropy Index (H). Standard errors functions can estimated via bootstrapping. also apply bias-correction estimates: Decompose segregation -state within-state term (sum equals total segregation): Local segregation (ls) decomposition units groups (racial groups). function also supports standard error CI estimation. 
sum proportion-weighted local segregation scores equals M: Decompose difference M 2000 2005, using iterative proportional fitting (IPF) Shapley decomposition (see Elbers 2021 details): Show segplot:  Find information documentation.","code":"library(segregation)  # example dataset with fake data provided by the package mutual_total(schools00, \"race\", \"school\", weight = \"n\") #>      stat   est #>    <char> <num> #> 1:      M 0.426 #> 2:      H 0.419 mutual_total(schools00, \"race\", \"school\",     weight = \"n\",     se = TRUE, CI = 0.90, n_bootstrap = 500 ) #> 500 bootstrap iterations on 877739 observations #>      stat   est       se          CI    bias #>    <char> <num>    <num>      <list>   <num> #> 1:      M 0.422 0.000775 0.421,0.423 0.00361 #> 2:      H 0.415 0.000712 0.414,0.416 0.00356 # between states mutual_total(schools00, \"race\", \"state\", weight = \"n\") #>      stat    est #>    <char>  <num> #> 1:      M 0.0992 #> 2:      H 0.0977  # within states mutual_total(schools00, \"race\", \"school\", within = \"state\", weight = \"n\") #>      stat   est #>    <char> <num> #> 1:      M 0.326 #> 2:      H 0.321 local <- mutual_local(schools00,     group = \"school\", unit = \"race\", weight = \"n\",     se = TRUE, CI = 0.90, n_bootstrap = 500, wide = TRUE ) #> 500 bootstrap iterations on 877739 observations local[, c(\"race\", \"ls\", \"p\", \"ls_CI\")] #>      race    ls       p       ls_CI #>    <fctr> <num>   <num>      <list> #> 1:  asian 0.591 0.02255 0.582,0.601 #> 2:  black 0.876 0.19017 0.873,0.879 #> 3:   hisp 0.771 0.15167 0.767,0.775 #> 4:  white 0.183 0.62810 0.182,0.184 #> 5: native 1.352 0.00751   1.32,1.38 sum(local$p * local$ls) #> [1] 0.422 mutual_difference(schools00, schools05,     group = \"race\", unit = \"school\",     weight = \"n\", method = \"shapley\" ) #>              stat      est #>            <char>    <num> #> 1:             M1  0.42554 #> 2:             M2  0.41339 #> 3:           diff -0.01215 #> 4:      additions 
-0.00341 #> 5:       removals -0.01141 #> 6: group_marginal  0.01787 #> 7:  unit_marginal -0.01171 #> 8:     structural -0.00349 segplot(schools00, group = \"race\", unit = \"school\", weight = \"n\")"},{"path":"https://elbersb.com/segregation/index.html","id":"how-to-install","dir":"","previous_headings":"","what":"How to install","title":"Entropy-Based Segregation Indices","text":"install package CRAN, use install development version, use","code":"install.packages(\"segregation\") devtools::install_github(\"elbersb/segregation\")"},{"path":"https://elbersb.com/segregation/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Entropy-Based Segregation Indices","text":"use package research, please cite: Elbers, B. (2021). Method Studying Differences Segregation Across Time Space. Sociological Methods & Research. https://doi.org/10.1177/0049124121986204","code":""},{"path":"https://elbersb.com/segregation/index.html","id":"some-additional-resources","dir":"","previous_headings":"","what":"Some additional resources","title":"Entropy-Based Segregation Indices","text":"book Analyzing US Census Data: Methods, Maps, Models R Kyle E. Walker contains discussion package, great resource anyone working spatial data, especially U.S. Census data. paper makes use package: Residential Racial Segregation U.S. Really Increase? Analysis Accounting Changes Racial Diversity (Code Data) analyses article Belgian newspaper De Tijd used package. analyses article Wall Street Journal produced using package.","code":""},{"path":"https://elbersb.com/segregation/index.html","id":"references-on-entropy-based-segregation-indices","dir":"","previous_headings":"","what":"References on entropy-based segregation indices","title":"Entropy-Based Segregation Indices","text":"Deutsch, J., Flückiger, Y. & Silber, J. (2009). Analyzing Changes Occupational Segregation: Case Switzerland (1970–2000), : Yves Flückiger, Sean F. Reardon, Jacques Silber (eds.) 
Occupational Residential Segregation (Research Economic Inequality, Volume 17), 171–202. DiPrete, T. ., Eller, C. C., Bol, T., & van de Werfhorst, H. G. (2017). School--Work Linkages United States, Germany, France. American Journal Sociology, 122(6), 1869-1938. https://doi.org/10.1086/691327 Elbers, B. (2021). Method Studying Differences Segregation Across Time Space. Sociological Methods & Research. https://doi.org/10.1177/0049124121986204 Forster, . G., & Bol, T. (2017). Vocational education employment life course using new measure occupational specificity. Social Science Research, 70, 176-197. https://doi.org/10.1016/j.ssresearch.2017.11.004 Theil, H. (1971). Principles Econometrics. New York: Wiley. Frankel, D. M., & Volij, O. (2011). Measuring school segregation. Journal Economic Theory, 146(1), 1-38. https://doi.org/10.1016/j.jet.2010.10.008 Mora, R., & Ruiz-Castillo, J. (2003). Additively decomposable segregation indexes. case gender segregation occupations human capital levels Spain. Journal Economic Inequality, 1(2), 147-179. https://doi.org/10.1023/:1026198429377 Mora, R., & Ruiz-Castillo, J. (2009). Invariance Properties Mutual Information Index Multigroup Segregation, : Yves Flückiger, Sean F. Reardon, Jacques Silber (eds.) Occupational Residential Segregation (Research Economic Inequality, Volume 17), 33-53. Mora, R., & Ruiz-Castillo, J. (2011). Entropy-based Segregation Indices. Sociological Methodology, 41(1), 159–194. https://doi.org/10.1111/j.1467-9531.2011.01237.x Van Puyenbroeck, T., De Bruyne, K., & Sels, L. (2012). ‘Mutual Information’: Educational sectoral gender segregation interaction Flemish labor market. Labour Economics, 19(1), 1-8. https://doi.org/10.1016/j.labeco.2011.05.002 Watts, M. Use Abuse Entropy Based Segregation Indices. Working Paper. 
URL: http://www.ecineq.org/ecineq_lux15/FILESx2015/CR2/p217.pdf","code":""},{"path":"https://elbersb.com/segregation/reference/compress.html","id":null,"dir":"Reference","previous_headings":"","what":"Compresses a data matrix based on mutual information (segregation) — compress","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"Given data set identifies suitable neighbors merging, function merge units iteratively, iteration neighbors smallest reduction terms total M merged.","code":""},{"path":"https://elbersb.com/segregation/reference/compress.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"","code":"compress(   data,   group,   unit,   weight = NULL,   neighbors = \"local\",   n_neighbors = 50,   max_iter = Inf )"},{"path":"https://elbersb.com/segregation/reference/compress.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"data data frame. group categorical variable contained data. Defines first dimension segregation computed. unit categorical variable contained data. Defines second dimension segregation computed. weight Numeric. frequency weights allowed. (Default NULL) neighbors Either data frame character. data frame, needs exactly two columns, row identifies set \"neighbors\" may merged. \"local\", considers n_neighbors closest neighbors terms local segregation. \"\", units considered possible neighbors. may time-consuming. n_neighbors relevant neighbors \"local\". 
max_iter Maximum number iterations (Default Inf)","code":""},{"path":"https://elbersb.com/segregation/reference/compress.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Compresses a data matrix based on mutual information (segregation) — compress","text":"Returns data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates Index of Dissimilarity — dissimilarity","title":"Calculates Index of Dissimilarity — dissimilarity","text":"Returns total segregation group unit using Index Dissimilarity.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates Index of Dissimilarity — dissimilarity","text":"","code":"dissimilarity(   data,   group,   unit,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100 )"},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates Index of Dissimilarity — dissimilarity","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. D index allows two distinct groups. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. 
(Default 100)","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates Index of Dissimilarity — dissimilarity","text":"Returns data.table one row. column est contains   Index Dissimilarity.   se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates Index of Dissimilarity — dissimilarity","text":"Otis Dudley Duncan Beverly Duncan. 1955. \"Methodological Analysis Segregation Indexes,\"      American Sociological Review 20(2): 210-217.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates Index of Dissimilarity — dissimilarity","text":"","code":"# Example where D and H deviate m1 <- matrix_to_long(matrix(c(100, 60, 40, 0, 0, 40, 60, 100), ncol = 2)) m2 <- matrix_to_long(matrix(c(80, 80, 20, 20, 20, 20, 80, 80), ncol = 2)) dissimilarity(m1, \"group\", \"unit\", weight = \"n\") #>    stat est #> 1:    D 0.6 dissimilarity(m2, \"group\", \"unit\", weight = \"n\") #>    stat est #> 1:    D 0.6"},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates expected values when true segregation is zero — dissimilarity_expected","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"sample sizes small, one group small proportion, many units, segregation indices typically upwardly biased, even true segregation zero. 
function simulates tables zero segregation, given marginals dataset, calculates segregation. expected values large, interpretation index scores might adjusted.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"","code":"dissimilarity_expected(   data,   group,   unit,   weight = NULL,   fixed_margins = TRUE,   n_bootstrap = 100 )"},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) fixed_margins margins fixed simulated? (Default TRUE) n_bootstrap Number bootstrap iterations. 
(Default 100)","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"data.table one row, corresponding expected value    D index true segregation zero.","code":""},{"path":"https://elbersb.com/segregation/reference/dissimilarity_expected.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates expected values when true segregation is zero — dissimilarity_expected","text":"","code":"# build a smaller table, with 100 students distributed across # 10 schools, where one racial group has 10% of the students small <- data.frame(     school = c(1:10, 1:10),     race = c(rep(\"r1\", 10), rep(\"r2\", 10)),     n = c(rep(1, 10), rep(9, 10)) ) dissimilarity_expected(small, \"race\", \"school\", weight = \"n\") #>         stat       est       se #> 1: D under 0 0.3755556 0.117949 # with an increase in sample size (n=1000), the values improve small$n <- small$n * 10 dissimilarity_expected(small, \"race\", \"school\", weight = \"n\") #>         stat   est         se #> 1: D under 0 0.121 0.02762111"},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates the entropy of a distribution — entropy","title":"Calculates the entropy of a distribution — entropy","text":"Returns entropy distribution defined group.","code":""},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates the entropy of a distribution — entropy","text":"","code":"entropy(data, group, weight = NULL, base = exp(1))"},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates the 
entropy of a distribution — entropy","text":"data data frame. group categorical variable vector variables contained data. weight Numeric. (Default NULL) base Base logarithm used entropy calculation. Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates the entropy of a distribution — entropy","text":"single number, entropy.","code":""},{"path":"https://elbersb.com/segregation/reference/entropy.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates the entropy of a distribution — entropy","text":"","code":"d <- data.frame(cat = c(\"A\", \"B\"), n = c(25, 75)) entropy(d, \"cat\", weight = \"n\") # => .56 #> [1] 0.5623351 # this is equivalent to -.25*log(.25)-.75*log(.75)  d <- data.frame(cat = c(\"A\", \"B\"), n = c(50, 50)) # use base 2 for the logarithm, then entropy is maximized at 1 entropy(d, \"cat\", weight = \"n\", base = 2) # => 1 #> [1] 1"},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates pairwise exposure indices — exposure","title":"Calculates pairwise exposure indices — exposure","text":"Returns pairwise exposure indices groups","code":""},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates pairwise exposure indices — exposure","text":"","code":"exposure(data, group, unit, weight = NULL)"},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates pairwise exposure indices — exposure","text":"data data frame. group categorical variable contained data. Defines first dimension segregation computed. unit vector variables contained data. 
Defines second dimension segregation computed. weight Numeric. (Default NULL)","code":""},{"path":"https://elbersb.com/segregation/reference/exposure.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates pairwise exposure indices — exposure","text":"Returns data.table columns \"of\", \"to\", \"exposure\". Read results \"exposure of group x to group y\".","code":""},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":null,"dir":"Reference","previous_headings":"","what":"Create crosswalk after compression — get_crosswalk","title":"Create crosswalk after compression — get_crosswalk","text":"running compress, function creates crosswalk table. Usually preferred call merge_units directly.","code":""},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Create crosswalk after compression — get_crosswalk","text":"","code":"get_crosswalk(compression, n_units = NULL, percent = NULL, parts = FALSE)"},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Create crosswalk after compression — get_crosswalk","text":"compression \"segcompression\" object returned compress. n_units Determines number merges specifying number units remain compressed dataset. n_units percent must given. (default: NULL) percent Determines number merges specifying percentage total segregation information retained compressed dataset. n_units percent must given. (default: NULL) parts (default: FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/get_crosswalk.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Create crosswalk after compression — get_crosswalk","text":"
Returns data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":null,"dir":"Reference","previous_headings":"","what":"Adjustment of marginal distributions using iterative proportional fitting — ipf","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"Adjusts marginal distributions group unit source respective marginal distributions target, using iterative proportional fitting algorithm (IPF).","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"","code":"ipf(   source,   target,   group,   unit,   weight = NULL,   max_iterations = 100,   precision = 1e-04 )"},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"source \"source\" data frame. marginals dataset adjusted marginals target. target \"target\" data frame. function returns dataset marginal distributions group unit categories approximated target. group categorical variable vector variables contained source target. Defines first distribution adjustment. unit categorical variable vector variables contained source target. Defines second distribution adjustment. weight Numeric. (Default NULL) max_iterations Maximum number iterations used IPF algorithm. precision Convergence criterion IPF algorithm. every iteration, ratio source target marginals calculated every category group unit. 
algorithm converges ratios smaller 1 + precision.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"Returns data frame retains   association structure source approximating   marginal distributions group unit target.   dataset identifies combination group unit,   categories occur either source target dropped.   adjusted frequency combination given column n,   n_target n_source contain zero-adjusted frequencies   target source dataset, respectively.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"algorithm works scaling marginal distribution group source data frame towards marginal distribution target; repeating process unit. algorithm keeps alternating group unit marginals adjusted data frame within allowed precision. results dataset retains association structure source approximating marginal distribution target. number unit group categories different source target, data frame returns combination unit group categories occur datasets. Zero values replaced small, non-zero number (1e-4). Note values returned sum observations source data frame, target data frame. different IPF implementations, ensures IPF change number observations.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"W. E. Deming F. F. Stephan. 1940.   \"Least Squares Adjustment Sampled Frequency Table   Expected Marginal Totals Known\".   Annals Mathematical Statistics. 11 (4): 427–444. T. Karmel M. Maclachlan. 1988.   
\"Occupational Sex Segregation — Increasing Decreasing?\" Economic Record 64: 187-195.","code":""},{"path":"https://elbersb.com/segregation/reference/ipf.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Adjustment of marginal distributions using iterative proportional fitting — ipf","text":"","code":"if (FALSE) { # adjusts the marginals of group and unit categories so that # schools00 has similar marginals as schools05 adj <- ipf(schools00, schools05, \"race\", \"school\", weight = \"n\")  # check that the new \"race\" marginals are similar to the target marginals # (the same could be done for schools) aggregate(adj$n, list(adj$race), sum) aggregate(adj$n_target, list(adj$race), sum)  # note that the adjusted dataset contains fewer # schools than either the source or the target dataset, # because the marginals are only defined for the overlap # of schools length(unique(schools00$school)) length(unique(schools05$school)) length(unique(adj$school)) }"},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates isolation indices — isolation","title":"Calculates isolation indices — isolation","text":"Returns isolation index group","code":""},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates isolation indices — isolation","text":"","code":"isolation(data, group, unit, weight = NULL)"},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates isolation indices — isolation","text":"data data frame. group categorical variable contained data. Defines first dimension segregation computed. unit vector variables contained data. Defines second dimension segregation computed. weight Numeric. 
(Default NULL)","code":""},{"path":"https://elbersb.com/segregation/reference/isolation.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates isolation indices — isolation","text":"Returns data.table group column isolation index.","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":null,"dir":"Reference","previous_headings":"","what":"Turns a contingency table into long format — matrix_to_long","title":"Turns a contingency table into long format — matrix_to_long","text":"Returns data.table long form, suitable use mutual_total, etc. Colnames rownames matrix respected.","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Turns a contingency table into long format — matrix_to_long","text":"","code":"matrix_to_long(   matrix,   group = \"group\",   unit = \"unit\",   weight = \"n\",   drop_zero = TRUE )"},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Turns a contingency table into long format — matrix_to_long","text":"matrix matrix, rows represent units, columns represent groups. group Variable name group. (Default group) unit Variable name unit. (Default unit) weight Variable name frequency weight. (Default n) drop_zero Drop unit-group combinations zero weight. 
(Default TRUE)","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Turns a contingency table into long format — matrix_to_long","text":"data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/matrix_to_long.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Turns a contingency table into long format — matrix_to_long","text":"","code":"m <- matrix(c(10, 20, 30, 30, 20, 10), nrow = 3) colnames(m) <- c(\"Black\", \"White\") long <- matrix_to_long(m, group = \"race\", unit = \"school\") mutual_total(long, \"race\", \"school\", weight = \"n\") #>    stat        est #> 1:    M 0.08720802 #> 2:    H 0.12581458"},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a compressed dataset — merge_units","title":"Creates a compressed dataset — merge_units","text":"running compress, function creates dataset units merged.","code":""},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a compressed dataset — merge_units","text":"","code":"merge_units(compression, n_units = NULL, percent = NULL, parts = FALSE)"},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a compressed dataset — merge_units","text":"compression \"segcompression\" object returned compress. n_units Determines number merges specifying number units remain compressed dataset. n_units percent must given. (default: NULL) percent Determines number merges specifying percentage total segregation information retained compressed dataset. n_units percent must given. 
(default: NULL) parts (default: FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/merge_units.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Creates a compressed dataset — merge_units","text":"Returns data.table.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":null,"dir":"Reference","previous_headings":"","what":"Decomposes the difference between two M indices — mutual_difference","title":"Decomposes the difference between two M indices — mutual_difference","text":"Uses one three methods decompose difference two M indices: (1) \"shapley\" / \"shapley_detailed\": method based Shapley decomposition advantages Karmel-Maclachlan method (recommended default, Deutsch et al. 2006), (2) \"km\": method based Karmel-Maclachlan (1988), (3) \"mrc\": method developed Mora Ruiz-Castillo (2009). methods extended account missing units/groups either data input.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Decomposes the difference between two M indices — mutual_difference","text":"","code":"mutual_difference(   data1,   data2,   group,   unit,   weight = NULL,   method = \"shapley\",   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1),   ... )"},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Decomposes the difference between two M indices — mutual_difference","text":"data1 data frame structure data2. data2 data frame structure data1. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. 
(Default NULL) method Either \"shapley\" (default), \"km\" (Karmel Maclachlan method), \"mrc\" (Mora Ruiz-Castillo method). se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm. ... used additional arguments method set shapley km. See ipf details.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Decomposes the difference between two M indices — mutual_difference","text":"Returns data.table columns stat est. data frame contains   following rows defined stat:  M1 contains M data1.  M2 contains M data2.  diff difference M2 M1.   sum five rows following diff equal diff.  additions contains change M induces unit group categories   present data2 data1, removals reverse. methods return following three terms:  unit_marginal contribution unit composition differences.  group_marginal contribution group composition differences.  structural contribution unexplained marginal changes, .e. structural     difference. Note interpretation terms depend exact method used. using \"km\", one additional row returned:  interaction contribution differences joint marginal distribution      unit group. \"shapley_detailed\" used, additional column \"unit\" returned, along     six additional rows unit present data1 data2.     five rows following meaning:  p1 (p2) proportion unit data1 (data2)     non-intersecting units/groups removed. changes local linkage     given ls_diff1 ls_diff2, average given  ls_diff_mean. 
row named total summarizes contribution     unit towards structural change     using formula .5 * p1 * ls_diff1 + .5 * p2 * ls_diff2.     sum \"total\" components equals structural change. se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Decomposes the difference between two M indices — mutual_difference","text":"Shapley method improvement Karmel-Maclachlan method (Deutsch et al. 2006). based several margins-adjusted data inputs yields symmetrical results (.e. data1 data2 can switched). \"shapley_detailed\" used, structural component decomposed contributions individuals units. Karmel-Maclachlan method (Karmel Maclachlan 1988) adjusts margins data1 similar margins data2. process symmetrical. Shapley Karmel-Maclachlan methods based iterative proportional fitting (IPF), first introduced Deming Stephan (1940). Depending size dataset, may take seconds (see ipf details). method developed Mora Ruiz-Castillo (2009) uses algebraic approach estimate size components. often yield substantively different results Shapley Karmel-Maclachlan methods. Note method symmetric terms defined group unit categories, may yield contradictory results. problem arises group /unit categories data1 present data2 (vice versa). 
methods estimate difference categories present datasets, report additionally change M induced cases additions (present data2, data1) removals (present data1, data2).","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Decomposes the difference between two M indices — mutual_difference","text":"W. E. Deming, F. F. Stephan. 1940. \"Least Squares Adjustment Sampled Frequency Table    Expected Marginal Totals Known.\"    Annals Mathematical Statistics 11(4): 427-444. T. Karmel M. Maclachlan. 1988.   \"Occupational Sex Segregation — Increasing Decreasing?\" Economic Record 64: 187-195. R. Mora J. Ruiz-Castillo. 2009. \"Invariance Properties   Mutual Information Index Multigroup Segregation.\" Research Economic Inequality 17: 33-53. J. Deutsch, Y. Flückiger, J. Silber. 2009.       \"Analyzing Changes Occupational Segregation: Case Switzerland (1970–2000).\"        Research Economic Inequality 17: 171–202.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_difference.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Decomposes the difference between two M indices — mutual_difference","text":"","code":"if (FALSE) { # decompose the difference in school segregation between 2000 and 2005, # using the Shapley method mutual_difference(schools00, schools05,     group = \"race\", unit = \"school\",     weight = \"n\", method = \"shapley\", precision = .1 ) # => the structural component is close to zero, thus most change is in the marginals. # This method gives identical results when we switch the unit and group definitions, # and when we switch the data inputs.  # the Karmel-Maclachlan method is similar, but only adjusts the data in the forward direction... 
mutual_difference(schools00, schools05,     group = \"school\", unit = \"race\",     weight = \"n\", method = \"km\", precision = .1 )  # ...this means that the results won't be identical when we switch the data inputs mutual_difference(schools05, schools00,     group = \"school\", unit = \"race\",     weight = \"n\", method = \"km\", precision = .1 )  # the MRC method indicates a much higher structural change... mutual_difference(schools00, schools05,     group = \"race\", unit = \"school\",     weight = \"n\", method = \"mrc\" )  # ...and is not symmetric mutual_difference(schools00, schools05,     group = \"school\", unit = \"race\",     weight = \"n\", method = \"mrc\" ) }"},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates expected values when true segregation is zero — mutual_expected","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"sample sizes small, one group small proportion, many units, segregation indices typically upwardly biased, even true segregation zero. function simulates tables zero segregation, given marginals dataset, calculates segregation. expected values large, interpretation index scores might adjusted.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"","code":"mutual_expected(   data,   group,   unit,   weight = NULL,   within = NULL,   fixed_margins = TRUE,   n_bootstrap = 100,   base = exp(1) )"},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"data data frame. 
group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) within Apply algorithm within group defined variable, report weighted average. (Default NULL) fixed_margins margins fixed simulated? (Default TRUE) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"data.table two rows, corresponding expected values    segregation true segregation zero.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_expected.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates expected values when true segregation is zero — mutual_expected","text":"","code":"if (FALSE) { # the schools00 dataset has a large sample size, so expected segregation is close to zero mutual_expected(schools00, \"race\", \"school\", weight = \"n\")  # but we can build a smaller table, with 100 students distributed across # 10 schools, where one racial group has 10% of the students small <- data.frame(     school = c(1:10, 1:10),     race = c(rep(\"r1\", 10), rep(\"r2\", 10)),     n = c(rep(1, 10), rep(9, 10)) ) mutual_expected(small, \"race\", \"school\", weight = \"n\") # with an increase in sample size (n=1000), the values improve small$n <- small$n * 10 mutual_expected(small, \"race\", \"school\", weight = \"n\") }"},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates local segregation scores based on M — mutual_local","title":"Calculates local segregation 
scores based on M — mutual_local","text":"Returns local segregation indices category defined unit.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates local segregation scores based on M — mutual_local","text":"","code":"mutual_local(   data,   group,   unit,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1),   wide = FALSE )"},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates local segregation scores based on M — mutual_local","text":"data data frame. group categorical variable vector variables contained data. Defines dimension segregation computed. unit categorical variable vector variables contained data. Defines group local segregation indices calculated. weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm. wide Returns wide dataframe instead long dataframe. (Default FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates local segregation scores based on M — mutual_local","text":"Returns data.table two rows category defined unit,   total 2*(number units) rows.   column est contains two statistics   provided unit: ls, local segregation score,  p, proportion unit total number cases.   
se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.   wide set TRUE, returns instead wide dataframe, one   row unit, associated statistics separate columns.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates local segregation scores based on M — mutual_local","text":"Henri Theil. 1971. Principles Econometrics. New York: Wiley. Ricardo Mora Javier Ruiz-Castillo. 2011.   \"Entropy-based Segregation Indices\". Sociological Methodology 41(1): 159–194.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_local.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates local segregation scores based on M — mutual_local","text":"","code":"# which schools are most segregated? 
(localseg <- mutual_local(schools00, \"race\", \"school\",     weight = \"n\", wide = TRUE )) #>       school        ls            p #>    1:   A1_1 0.1826710 0.0004522985 #>    2:   A1_2 0.1825592 0.0004978701 #>    3:   A1_3 0.2756157 0.0006642066 #>    4:   A1_4 0.1368034 0.0005685061 #>    5:   A2_1 0.3585546 0.0004260948 #>   ---                               #> 2041: C165_1 0.3174930 0.0004568556 #> 2042: C165_2 0.3835477 0.0005297702 #> 2043: C165_3 0.2972550 0.0005650883 #> 2044: C166_1 0.3072281 0.0011586588 #> 2045: C167_1 0.3166498 0.0005354667  sum(localseg$p) # => 1 #> [1] 1  # the sum of the weighted local segregation scores equals # total segregation sum(localseg$ls * localseg$p) # => .425 #> [1] 0.425539 mutual_total(schools00, \"school\", \"race\", weight = \"n\") # M => .425 #>    stat        est #> 1:    M 0.42553898 #> 2:    H 0.05642991"},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"Returns total segregation group unit. within given, calculates segregation within within category separately, takes weighted average. 
Also see mutual_within detailed within calculations.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"","code":"mutual_total(   data,   group,   unit,   within = NULL,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1) )"},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. within categorical variable vector variables contained data. variable(s) superset either unit group calculation meaningful. provided, segregation computed within groups defined variable, averaged. (Default NULL) weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"Returns data.table two rows. 
column est contains   Mutual Information Index, M, Theil's Entropy Index, H. H   M divided group entropy. within given,   M H weighted averages within-category segregation scores.   se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"Henri Theil. 1971. Principles Econometrics. New York: Wiley. Ricardo Mora Javier Ruiz-Castillo. 2011.      \"Entropy-based Segregation Indices\". Sociological Methodology 41(1): 159–194.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates the Mutual Information Index M and Theil's Entropy Index H — mutual_total","text":"","code":"# calculate school racial segregation mutual_total(schools00, \"school\", \"race\", weight = \"n\") # M => .425 #>    stat        est #> 1:    M 0.42553898 #> 2:    H 0.05642991  # note that the definition of groups and units is arbitrary mutual_total(schools00, \"race\", \"school\", weight = \"n\") # M => .425 #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083  # if groups or units are defined by a combination of variables, # vectors of variable names can be provided - # here there is no difference, because schools # are nested within districts mutual_total(schools00, \"race\", c(\"district\", \"school\"),     weight = \"n\" ) # M => .425 #>    stat       est #> 1:    M 0.4255390 #> 2:    H 0.4188083  # estimate standard errors and 95% CI for M and H if (FALSE) { mutual_total(schools00, \"race\", \"school\",     
weight = \"n\",     se = TRUE, n_bootstrap = 1000 )  # estimate segregation within school districts mutual_total(schools00, \"race\", \"school\",     within = \"district\", weight = \"n\" ) # M => .087  # estimate between-district racial segregation mutual_total(schools00, \"race\", \"district\", weight = \"n\") # M => .338 # note that the sum of within-district and between-district # segregation equals total school-race segregation; # here, most segregation is between school districts }"},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"Returns -within decomposition defined sequence variables unit.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"","code":"mutual_total_nested(data, group, unit, weight = NULL, base = exp(1))"},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit vector variables contained data. Defines levels decomposition computed. weight Numeric. (Default NULL) base Base logarithm used calculation. 
Defaults natural logarithm.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"Returns data.table similar mutual_total,   column within define   levels nesting.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_total_nested.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates a nested decomposition of segregation for M and H — mutual_total_nested","text":"","code":"mutual_total_nested(schools00, \"race\", c(\"state\", \"district\", \"school\"),     weight = \"n\" ) #>     between          within stat        est #> 1:    state                    M 0.09924370 #> 2:    state                    H 0.09767398 #> 3: district           state    M 0.23870880 #> 4: district           state    H 0.23493319 #> 5:   school state, district    M 0.08758648 #> 6:   school state, district    H 0.08620114 # This is a simpler way to run the following manually: # mutual_total(schools00, \"race\", \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"district\", within = \"state\", weight = \"n\") # mutual_total(schools00, \"race\", \"school\", within = c(\"state\", \"district\"), weight = \"n\")"},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculates detailed within-category segregation scores for M and H — mutual_within","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"Calculates segregation group unit within category defined within.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculates detailed within-category segregation scores for M and H 
— mutual_within","text":"","code":"mutual_within(   data,   group,   unit,   within,   weight = NULL,   se = FALSE,   CI = 0.95,   n_bootstrap = 100,   base = exp(1),   wide = FALSE )"},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. within categorical variable vector variables contained data defines within-segregation categories. weight Numeric. (Default NULL) se TRUE, segregation estimates bootstrapped provide standard errors apply bias correction. bias reported already applied estimates (.e. reported estimates \"debiased\") (Default FALSE) CI se = TRUE, compute confidence (CI*100) addition bootstrap standard error. based percentiles bootstrap distribution, valid interpretation relies larger number bootstrap iterations. (Default 0.95) n_bootstrap Number bootstrap iterations. (Default 100) base Base logarithm used calculation. Defaults natural logarithm. wide Returns wide dataframe instead long dataframe. (Default FALSE)","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"Returns data.table four rows category defined within.   column est contains four statistics   provided unit:  M within-category M, p proportion category.   Multiplying M p gives contribution within-category   towards total M.  H within-category H, ent_ratio provides entropy ratio,   defined EW/E, EW within-category entropy,   E overall entropy.   
Multiplying H, p, ent_ratio gives contribution within-category   towards total H.   se set TRUE, additional column se contains   associated bootstrapped standard errors, additional column CI contains   estimate confidence interval list column, additional column bias contains   estimated bias, column est contains bias-corrected estimates.   wide set TRUE, returns instead wide dataframe, one   row within category, associated statistics separate columns.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"references","dir":"Reference","previous_headings":"","what":"References","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"Henri Theil. 1971. Principles Econometrics. New York: Wiley. Ricardo Mora Javier Ruiz-Castillo. 2011.      \"Entropy-based Segregation Indices\". Sociological Methodology 41(1): 159–194.","code":""},{"path":"https://elbersb.com/segregation/reference/mutual_within.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculates detailed within-category segregation scores for M and H — mutual_within","text":"","code":"(within <- mutual_within(schools00, \"race\", \"school\",     within = \"state\",     weight = \"n\", wide = TRUE )) #>    state         M         p         H ent_ratio #> 1:     A 0.4085965 0.2768819 0.4969216 0.8092501 #> 2:     B 0.2549959 0.4035425 0.2680884 0.9361190 #> 3:     C 0.3450221 0.3195756 0.3611257 0.9402955 # the M for state \"A\" is .409 # manual calculation schools_A <- schools00[schools00$state == \"A\", ] mutual_total(schools_A, \"race\", \"school\", weight = \"n\") # M => .409 #>    stat       est #> 1:    M 0.4085965 #> 2:    H 0.4969216  # to recover the within M and H from the output, multiply # p * M and p * ent_ratio * H, respectively sum(within$p * within$M) # => .326 #> [1] 0.3262953 sum(within$p * within$ent_ratio * within$H) # => .321 #> [1] 0.3211343 # compare with: 
mutual_total(schools00, \"race\", \"school\", within = \"state\", weight = \"n\") #>    stat       est #> 1:    M 0.3262953 #> 2:    H 0.3211343"},{"path":"https://elbersb.com/segregation/reference/school_ses.html","id":null,"dir":"Reference","previous_headings":"","what":"Student-level data including SES status — school_ses","title":"Student-level data including SES status — school_ses","text":"Fake dataset used examples. individual-level dataset students schools.","code":""},{"path":"https://elbersb.com/segregation/reference/school_ses.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Student-level data including SES status — school_ses","text":"","code":"school_ses"},{"path":"https://elbersb.com/segregation/reference/school_ses.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Student-level data including SES status — school_ses","text":"data frame 5,153 rows 3 variables: school_id school ID ethnic_group one , B, C ses_quintile SES student (1 = lowest, 5 = highest)","code":""},{"path":"https://elbersb.com/segregation/reference/schools00.html","id":null,"dir":"Reference","previous_headings":"","what":"Ethnic/racial composition of schools for 2000/2001 — schools00","title":"Ethnic/racial composition of schools for 2000/2001 — schools00","text":"Fake dataset used examples. Loosely based data provided National Center Education Statistics, Common Core Data, information U.S. primary schools three U.S. states. 
original data can downloaded https://nces.ed.gov/ccd/.","code":""},{"path":"https://elbersb.com/segregation/reference/schools00.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Ethnic/racial composition of schools for 2000/2001 — schools00","text":"","code":"schools00"},{"path":"https://elbersb.com/segregation/reference/schools00.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Ethnic/racial composition of schools for 2000/2001 — schools00","text":"data frame 8,142 rows 5 variables: state either , B, C district school agency/district ID school school ID race either native, asian, hispanic, black, white n n students school race","code":""},{"path":"https://elbersb.com/segregation/reference/schools05.html","id":null,"dir":"Reference","previous_headings":"","what":"Ethnic/racial composition of schools for 2005/2006 — schools05","title":"Ethnic/racial composition of schools for 2005/2006 — schools05","text":"Fake dataset used examples. Loosely based data provided National Center Education Statistics, Common Core Data, information U.S. primary schools three U.S. states. 
original data can downloaded https://nces.ed.gov/ccd/.","code":""},{"path":"https://elbersb.com/segregation/reference/schools05.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Ethnic/racial composition of schools for 2005/2006 — schools05","text":"","code":"schools05"},{"path":"https://elbersb.com/segregation/reference/schools05.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Ethnic/racial composition of schools for 2005/2006 — schools05","text":"data frame 8,013 rows 5 variables: state either , B, C district school agency/district ID school school ID race either native, asian, hispanic, black, white n n students school race","code":""},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":null,"dir":"Reference","previous_headings":"","what":"Scree plot for segregation compression — scree_plot","title":"Scree plot for segregation compression — scree_plot","text":"plot allows visually see effect compression mutual information.","code":""},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Scree plot for segregation compression — scree_plot","text":"","code":"scree_plot(compression, tail = Inf)"},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Scree plot for segregation compression — scree_plot","text":"compression \"segcompression\" object returned compress. 
tail Return last tail units (default: Inf)","code":""},{"path":"https://elbersb.com/segregation/reference/scree_plot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Scree plot for segregation compression — scree_plot","text":"Returns ggplot2 plot.","code":""},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":null,"dir":"Reference","previous_headings":"","what":"A visual representation of two-group segregation — segcurve","title":"A visual representation of two-group segregation — segcurve","text":"Produces one several segregation curves, defined Duncan Duncan (1955)","code":""},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A visual representation of two-group segregation — segcurve","text":"","code":"segcurve(data, group, unit, weight = NULL, segment = NULL)"},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A visual representation of two-group segregation — segcurve","text":"data data frame. group categorical variable contained data. Defines first dimension segregation computed. unit categorical variable contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) segment categorical variable contained data. 
(Default NULL) given, several segregation curves shown, one segment.","code":""},{"path":"https://elbersb.com/segregation/reference/segcurve.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A visual representation of two-group segregation — segcurve","text":"Returns ggplot2 object.","code":""},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":null,"dir":"Reference","previous_headings":"","what":"A visual representation of segregation — segplot","title":"A visual representation of segregation — segplot","text":"Produces segregation plot.","code":""},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"A visual representation of segregation — segplot","text":"","code":"segplot(   data,   group,   unit,   weight,   order = \"segregation\",   reference_distribution = NULL,   bar_space = 0,   title = NULL,   axis_labels = \"left\" )"},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"A visual representation of segregation — segplot","text":"data data frame. group categorical variable vector variables contained data. Defines first dimension segregation computed. unit categorical variable vector variables contained data. Defines second dimension segregation computed. weight Numeric. (Default NULL) order character, either \"segregation\", \"entropy\", \"majority\", \"majority_fixed\". Affects ordering units. horizontal ordering groups can changed using factor variable group. difference \"majority\" \"majority_fixed\" former reorder groups way majority group actually comes first. want control ordering , use \"majority_fixed\" specify group variable factor variable. reference_distribution Specifies reference distribution, given two-column data frame, plotted right. 
order segregation, reference distribution also used compute local segregation scores. bar_space Specifies space single units. title Adds plot title appends value H index. axis_labels One \"left\", \"right\", \"\". Determines y axis labels placed.","code":""},{"path":"https://elbersb.com/segregation/reference/segplot.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"A visual representation of segregation — segplot","text":"Returns ggplot2 object.","code":""},{"path":"https://elbersb.com/segregation/reference/segregation.html","id":null,"dir":"Reference","previous_headings":"","what":"segregation: Entropy-based segregation indices — segregation","title":"segregation: Entropy-based segregation indices — segregation","text":"Calculate decompose entropy-based, multigroup segregation indices, focus Mutual Information Index (M) Theil's Information Index (H). Provides tools decompose measures groups units, within terms. Includes standard error estimation bootstrapping.","code":""},{"path":[]},{"path":"https://elbersb.com/segregation/reference/segregation.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"segregation: Entropy-based segregation indices — segregation","text":"Maintainer: Benjamin Elbers be2239@columbia.edu (ORCID)","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-development-version","dir":"Changelog","previous_headings":"","what":"segregation (development version)","title":"segregation (development version)","text":"various improvements compression algorithm add dendrogram visualization allow multiple curves segcurve function","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-100","dir":"Changelog","previous_headings":"","what":"segregation 1.0.0","title":"segregation 1.0.0","text":"CRAN release: 2023-08-24 add mutual_total_nested add within argument mutual_expected add dissimilarity_expected add suite 
compression-related functions (C++) add segplot function add functions exposure isolation fix roxygen2 problem","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-060","dir":"Changelog","previous_headings":"","what":"segregation 0.6.0","title":"segregation 0.6.0","text":"CRAN release: 2021-09-02 faster mutual_total(…, within) updated docs minor bug fixes improved error messages","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-050","dir":"Changelog","previous_headings":"","what":"segregation 0.5.0","title":"segregation 0.5.0","text":"CRAN release: 2021-02-08 dissimilarity: support index dissimilarity add CI argument confidence intervals mutual_within: report ent_ratio instead h_weight matrix_to_long: convert contingency tables long form add introductory vignette","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-040","dir":"Changelog","previous_headings":"","what":"segregation 0.4.0","title":"segregation 0.4.0","text":"CRAN release: 2021-01-08 faster bootstrap return bootstrap estimates attr add mutual_expected apply bias-correction via bootstrap default se=TRUE","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-030","dir":"Changelog","previous_headings":"","what":"segregation 0.3.0","title":"segregation 0.3.0","text":"CRAN release: 2019-09-20 always return data.table ipf function, warn groups/units dropped return sample size source dataset IPF don’t allow bootstrap sample size integer, allow non-integer sample weights (unproblematic) simplify precision parameter ipf procedure increase default bootstrap 100 fix data.table issue (#3)","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-020","dir":"Changelog","previous_headings":"","what":"segregation 0.2.0","title":"segregation 0.2.0","text":"CRAN release: 2019-01-14 add “shapley” decomposition method, revisit difference decomposition 
methods better logging bootstrap/IPF several small fixes add lintr package add warning attempting bootstrap non-integer weights","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-010","dir":"Changelog","previous_headings":"","what":"segregation 0.1.0","title":"segregation 0.1.0","text":"CRAN release: 2018-06-15 switch group unit definitions, consistent literature add Theil’s Information Index (H) add entropy function add mutual_within function decompose weighted within indices add “wide” option mutual_local mutual_within add “ipf” (iterative proportional fitting) function difference decomposition based IPF “mrc_adjusted” difference decomposition defined overlap sample units groups internal refactoring","code":""},{"path":"https://elbersb.com/segregation/news/index.html","id":"segregation-001","dir":"Changelog","previous_headings":"","what":"segregation 0.0.1","title":"segregation 0.0.1","text":"CRAN release: 2018-04-17 Initial release.","code":""}]