Commit

Merge pull request #24 from nrosed/main
reordered sections to match elife, small changes in supp tables with …
cgreene authored Mar 15, 2024
2 parents 1764d96 + c6ed1df commit e0b1ead
Showing 4 changed files with 32 additions and 32 deletions.
12 changes: 6 additions & 6 deletions content/04.results.md → content/03.results.md
@@ -87,7 +87,7 @@ Figure @fig:fig2 shows an overview of the process and example input data for thi
These analyses relied upon accurate gender prediction of both authors and speakers.
To predict the gender of the speaker or author, we used the package genderizeR [@doi:10.32614/rj-2016-002], an R package wrapper to access the genderize.io API [@https://genderize.io] to get binary gender predictions for each identified first name.
We unfortunately cannot identify non-binary gender expression with the tools we used.
-Performance of binary prediction was evaluated on a benchmark data set of thirty randomly selected news articles, ten from each of the following years: 2005, 2010, 2015 (Figure {@fig:suppfig1}a).
+Performance of binary prediction was evaluated on a benchmark data set of thirty randomly selected news articles, ten from each of the following years: 2005, 2010, 2015 (Figure {@fig:suppfig1}).
In addition, genderize.io has been found by independent researchers to have an error rate comparable to other published gender prediction methods, with a error-rate on predicted names below 6% [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252].
However, it should be noted that the error rate varies by name origin with the largest decrease in performance on names with an Asian origin [@doi:10.7717/peerj-cs.156;@doi:10.5195/jmla.2021.1252].
In our analysis, we did not observe a large difference in names predicted to come from a man or woman between predicted East Asian and other name origins (Table{@tbl:tableGenderNameOrigin}).
@@ -167,9 +167,9 @@ Minimum and median per data type over all years: _Nature_ papers, (568, 684); _S

In comparing the citation rate of first and last author name origins in news articles, we decided to additionally analyze scientist-written articles.
Though fewer in number, scientist-written news articles have many citations, making the set sufficient for analysis and providing an opportunity to measure differences in citation patterns between journalists and scientists.
-In both journalist- and scientist-written articles, we found that most cited name origins were predicted Celtic/English or European, both with a bootstrapped estimated citation rate between 19.2-42.8% (Figure {@fig:suppfig3}b,c).
-East Asian predicted name origins are the third highest proportion of cited names, with a bootstrapped estimated citation rate between 6.4-28.8%.
-All other predicted name origins individually account for less than 7.9% of total cited authors.
+In both journalist- and scientist-written articles, we found that most cited name origins were predicted Celtic/English or European, both with a bootstrapped estimated citation rate between 17.9-39.6% (Figure {@fig:suppfig3}b,c).
+East Asian predicted name origins are the third highest proportion of cited names, with a bootstrapped estimated citation rate between 7.3-28.1%.
+All other predicted name origins individually account for less than 8.1% of total cited authors.

We analyzed how these distributions compare to the composition of the first and last authors in _Nature_ (Figure {@fig:supfig_nameorigin_bg}), by examining the top three most frequent predicted name origins (Figure {@fig:fig3}b,c, Table {@tbl:tableFCNature}).
When considering only first authors, we found a slight over-representation for predicted Celtic/English name origins and a small under-representation for predicted East Asian name origins in scientist-written and journalist-written news articles when compared to the composition of first authors in _Nature_ (Figure {@fig:fig3}b, c).
@@ -183,8 +183,8 @@ Additionally, we see a difference in predicted Arabic/Turkish/Persian name origi

We then sought to determine whether or not the quoted speaker demographic replicated the cited authors’ over- and under-representation patterns.
We found a much stronger Celtic/English over-representation in comparison to citation patterns, with quotes from those with Celtic/English name origins at a much higher frequency than quotes from those with European name origins (Figure {@fig:suppfig3}d, Table {@tbl:tableFCNature}).
-We also found a much stronger reduction of quotes from people with predicted East Asian name origins (Figure {@fig:suppfig3}b), with never more than 7.7% of quotes (Figure {@fig:fig3}d, Table {@tbl:tableFCNature}).
-This reveals a large disparity when considering that people with a predicted East Asian name origin constitute between 14.3-33.6% of last authors cited in either journalist- or scientist-written news articles (Figure {@fig:fig3}b,c, Table {@tbl:tableFCNature}).
+We also found a much stronger reduction of quotes from people with predicted East Asian name origins (Figure {@fig:suppfig3}b), with never more than 8.2% of quotes (Figure {@fig:fig3}d, Table {@tbl:tableFCNature}).
+This reveals a large disparity when considering that people with a predicted East Asian name origin constitute between 7.3-24.6% of last authors cited in either journalist- or scientist-written news articles (Figure {@fig:fig3}b,c, Table {@tbl:tableFCNature}).
When we compare against first and last authorship in _Nature_ across all predicted name origins, we find that for all other name origins except for East Asian and Celtic/English, the quote rates closely matches the predicted name origin rate of first and last authors in _Nature_ (Figure {@fig:suppfig4}c, dark grey and light blue lines compare to the purple lines).

To further understand the source of Celtic/English over-representation and East Asian under-representation, we selected a subset of quotes from people whose works were also cited in the news article.
File renamed without changes.
File renamed without changes.
52 changes: 26 additions & 26 deletions content/07.supplement.md
@@ -3,15 +3,15 @@
![
**Benchmark Data **
The performance of gender prediction for pipeline-identified quoted speakers.
-](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/supp_fig1_tmp/supp_fig1.png "Supplementary Figure 1"){#fig:suppfig1 tag="Supplemental 1" width=6in}
+](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/supp_fig1_tmp/supp_fig1.png "Figure 1 – figure supplemental 1"){#fig:suppfig1 tag="1 – figure supplemental 1" width=6in}

![
**Speakers predicted to be men are overrepresented in news quotes regardless of predicted journalist gender**
Panel a depicts two trend lines: Yellow: Proportion of _Nature_ news articles written by a predicted women journalist; Blue: Proportion of _Nature_ news articles written by a predicted men journalist.
We observe a moderate gender difference in the number of articles written by men and women journalists.
Panel b depicts two trend lines: Yellow: Proportion of quotes predicted to be from men in an article written by a journalist predicted to be a woman; Blue: Proportion of quotes predicted to be from men in an article written by a journalist predicted to be a man.
In all plots, the colored bands represent the 5th and 95th bootstrap quantiles and the point is the mean calculated from 1,000 bootstrap samples.
-](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/supp_journalist_contingency_tab_tmp/supp_fig.png "Supplementary Figure 2"){#fig:suppfig_j_gender tag="Supplemental 2" width=6in}
+](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/supp_journalist_contingency_tab_tmp/supp_fig.png "Figure 2 - figure supplemental 1"){#fig:suppfig_j_gender tag="2 - figure supplemental 1" width=6in}


![
@@ -20,22 +20,22 @@ Panel a depicts three trend lines: Purple: Proportion of _Nature_ quotes for a s
We observe a larger gender difference between first and last authors in _Springer Nature_ articles, however the proportion of speakers estimated to be men is less than observed in _Nature_ research articles.
Panel b depicts the proportion of quotes from predicted men broken down by article type.
In all plots the colored bands represent the 5th and 95th bootstrap quantiles and the point is the mean calculated from 1,000 bootstrap samples.
-](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig2_tmp/fig2_supp.png "Supplementary Figure 3"){#fig:suppfig2 tag="Supplemental 3" width=6in}
+](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig2_tmp/fig2_supp.png "Figure 2 - figure supplemental 2"){#fig:suppfig2 tag="2 - figure supplemental 2" width=6in}

![
**Predicted Celtic/English, and European name origins are the highest cited, quoted, and mentioned**
Panel a, depicts the number of quotes, mentions, citations, or research articles considered in the name origin analysis.
Panels b-g depicts the proportion of a name origin in a given dataset, citations in articles written by journalists or writers, quoted speakers or mentions.
In all plots the colored bands represent the 5th and 95th bootstrap quantiles and the point is the mean calculated from 1,000 bootstrap samples.
-](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig3_tmp/fig3_supp.png "Supplementary Figure 4"){#fig:suppfig3 tag="Supplemental 4" width=6in}
+](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig3_tmp/fig3_supp.png "Figure 3 – figure supplemental 1"){#fig:suppfig3 tag="3 – figure supplemental 1" width=6in}


![
**Distribution of name origins _Nature_ and _Springer Nature_ articles**
Panels a-d depicts the predicted name origins of first and last authors in our background sets.
Panel a and b show the predicted name origins of _Nature_ first and last authors, respectively.
Panel c and d show the predicted name origins of _Springer Nature_ first and last authors, respectively.
-](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig3_tmp/fig3_supp3.png "Supplementary Figure 5"){#fig:supfig_nameorigin_bg tag="Supplemental 5" width=6in}
+](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig3_tmp/fig3_supp3.png "Figure 3 – figure supplemental 2"){#fig:supfig_nameorigin_bg tag="3 – figure supplemental 2" width=6in}


![
@@ -45,7 +45,7 @@ Panel a, c, and e compare the citation (a), quote (c), or mention (e) rate again
Panel b, d, and f compare the citation (a), quote (c), or mention (e) rate against _Springer Nature_ first and last author name origins.
Panels a and b additionally partition the citation rates by journalist-written articles and scientist-written articles, each further divided into first or last author position.
For c-f, only journalist written articles are considered.
-](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig3_tmp/fig3_supp2.png "Supplementary Figure 6"){#fig:suppfig4 tag="Supplemental 6" width=6in}
+](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/fig3_tmp/fig3_supp2.png "Figure 3 – figure supplemental 3"){#fig:suppfig4 tag="3 – figure supplemental 3" width=6in}


![
@@ -54,7 +54,7 @@ Panels a-d depicts twelve plots, each for a possible name origin comparison agai
Panels a and b compare name origin proportions of quotes from people that were also cited in the same article.
Panels c and d compare name origin proportions from mentions of people that were also cited in the same article.
In all plots the colored bands represent the 5th and 95th bootstrap quantiles and the point is the mean calculated from 1,000 bootstrap samples.
-](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/supp_country_specific_analysis_tmp/supp_fig.png "Supplementary Figure 7"){#fig:suppfig_quote_cite tag="Supplemental 7" width=6in}
+](https://github.com/nrosed/nature_news_disparities/raw/main/figure_notebooks/manuscript_figs/supp_country_specific_analysis_tmp/supp_fig.png "Figure 3 - figure supplemental 4"){#fig:suppfig_quote_cite tag="3 - figure supplemental 4" width=6in}

<!--
![
@@ -122,28 +122,28 @@ Table: Quoted speaker gender by name origin {#tbl:tableGenderNameOrigin}

| |CelticEnglish |EastAsian |European |
|:------------------------------------------|:-----------------|:-----------------|:-----------------|
-|citation_journalist_first vs. nature_first |1.37 (0.93, 1.82) |0.68 (0.44, 0.91) |1.01 (0.77, 1.28) |
-|citation_journalist_last vs. nature_last |1.18 (0.91, 1.58) |0.81 (0.4, 1.27) |0.92 (0.68, 1.17) |
-|citation_scientist_first vs. nature_first |1.28 (1.04, 1.56) |0.8 (0.64, 1.02) |1.06 (0.88, 1.25) |
-|citation_scientist_last vs. nature_last |1.11 (0.93, 1.34) |0.76 (0.56, 1) |1.07 (0.91, 1.22) |
-|quote vs. nature_first |2.2 (1.8, 2.64) |0.23 (0.18, 0.29) |1.02 (0.79, 1.23) |
-|quote vs. nature_last |1.54 (1.33, 1.85) |0.36 (0.28, 0.47) |0.89 (0.78, 1.01) |
-|mention vs. nature_first |2.1 (1.69, 2.53) |0.27 (0.21, 0.33) |1.02 (0.8, 1.24) |
-|mention vs. nature_last |1.47 (1.26, 1.76) |0.42 (0.32, 0.52) |0.89 (0.78, 1) |
+|citation_journalist_first vs. nature_first |1.36 (0.96, 1.74) |0.7 (0.46, 0.91) |1.01 (0.8, 1.25) |
+|citation_journalist_last vs. nature_last |1.18 (0.93, 1.54) |0.82 (0.42, 1.27) |0.93 (0.71, 1.19) |
+|citation_scientist_first vs. nature_first |1.26 (1.05, 1.5) |0.81 (0.66, 1.02) |1.05 (0.88, 1.22) |
+|citation_scientist_last vs. nature_last |1.11 (0.95, 1.31) |0.77 (0.58, 0.99) |1.06 (0.93, 1.19) |
+|quote vs. nature_first |2.12 (1.77, 2.51) |0.25 (0.2, 0.32) |1.01 (0.81, 1.22) |
+|quote vs. nature_last |1.52 (1.32, 1.75) |0.39 (0.3, 0.49) |0.89 (0.79, 1.01) |
+|mention vs. nature_first |2.03 (1.67, 2.39) |0.29 (0.23, 0.36) |1.02 (0.81, 1.22) |
+|mention vs. nature_last |1.44 (1.26, 1.67) |0.45 (0.35, 0.54) |0.89 (0.79, 1) |
Table: Mean fold change comparison with Nature from bootstrap samples with 95% CI {#tbl:tableFCNature}



| |CelticEnglish |EastAsian |European |
|:--------------------------------------------|:-----------------|:-----------------|:-----------------|
-|citation_journalist_first vs. springer_first |2.08 (1.42, 2.9) |0.68 (0.45, 0.94) |1.15 (0.87, 1.53) |
-|citation_journalist_last vs. springer_last |2.08 (1.31, 3.26) |0.56 (0.28, 0.81) |1.13 (0.87, 1.43) |
-|citation_scientist_first vs. springer_last |1.59 (0.95, 2.29) |0.9 (0.61, 1.67) |1.15 (0.91, 1.37) |
-|citation_scientist_last vs. nature_last |1.11 (0.93, 1.34) |0.76 (0.56, 1) |1.07 (0.91, 1.22) |
-|quote vs. springer_last |2.71 (1.78, 3.91) |0.26 (0.18, 0.5) |1.1 (0.84, 1.37) |
-|quote vs. nature_last |1.54 (1.33, 1.85) |0.36 (0.28, 0.47) |0.89 (0.78, 1.01) |
-|mention vs. springer_last |2.59 (1.68, 3.75) |0.3 (0.21, 0.56) |1.1 (0.86, 1.34) |
-|mention vs. nature_last |1.47 (1.26, 1.76) |0.42 (0.32, 0.52) |0.89 (0.78, 1) |
+|citation_journalist_first vs. springer_first |1.99 (1.42, 2.64) |0.69 (0.47, 0.96) |1.14 (0.89, 1.47) |
+|citation_journalist_last vs. springer_last |2.01 (1.31, 3.08) |0.56 (0.3, 0.82) |1.12 (0.91, 1.37) |
+|citation_scientist_first vs. springer_last |1.54 (0.95, 2.17) |0.91 (0.62, 1.64) |1.13 (0.91, 1.33) |
+|citation_scientist_last vs. nature_last |1.11 (0.95, 1.31) |0.77 (0.58, 0.99) |1.06 (0.93, 1.19) |
+|quote vs. springer_last |2.58 (1.74, 3.6) |0.28 (0.2, 0.54) |1.08 (0.84, 1.35) |
+|quote vs. nature_last |1.52 (1.32, 1.75) |0.39 (0.3, 0.49) |0.89 (0.79, 1.0) |
+|mention vs. springer_last |2.45 (1.65, 3.42) |0.32 (0.23, 0.59) |1.08 (0.85, 1.32) |
+|mention vs. nature_last |1.44 (1.26, 1.67) |0.45 (0.35, 0.54) |0.89 (0.79, 1) |
Table: Mean fold change comparison with Springer Nature from bootstrap samples with 95% CI {#tbl:tableFCSpringer}


@@ -158,15 +158,15 @@ Table: Quoted speaker name origin, by journalist name origin {#tbl:table5}
| Journalist Name Origin | African| Arab Turk Pers| Celtic English| East Asian| European| Greek| Hebrew| Hispanic| Nordic| South Asian|
|:-------------|---------:|------------:|-------------:|---------:|---------:|---------:|---------:|---------:|---------:|----------:|
|CelticEnglish | 0.016| 0.027| 0.368| 0.070| 0.363| 0.008| 0.017| 0.023| 0.083| 0.025|
-|EastAsian | 0.002| 0.077| 0.377| 0.142| 0.167| 0.000| 0.012| 0.133| 0.019| 0.080|
+|EastAsian | 0.002| 0.077| 0.377| 0.143| 0.167| 0.000| 0.012| 0.133| 0.019| 0.080|
|European | 0.014| 0.028| 0.363| 0.116| 0.352| 0.006| 0.030| 0.026| 0.035| 0.030|
Table: Quoted + cited speaker name origin, by journalist name origin {#tbl:table6}


| Journalist Name Origin | African| Arab Turk Pers| Celtic English| East Asian| European| Greek| Hebrew| Hispanic| Nordic| South Asian|
|:-------------|---------:|------------:|-------------:|---------:|---------:|---------:|---------:|---------:|---------:|----------:|
-|CelticEnglish | 0.010| 0.023| 0.378| 0.087| 0.361| 0.010| 0.021| 0.029| 0.056| 0.024|
+|CelticEnglish | 0.011| 0.023| 0.378| 0.086| 0.361| 0.010| 0.021| 0.029| 0.056| 0.025|
|EastAsian | 0.000| 0.066| 0.340| 0.148| 0.209| 0.000| 0.005| 0.148| 0.033| 0.049|
-|European | 0.020| 0.030| 0.410| 0.111| 0.300| 0.012| 0.023| 0.019| 0.030| 0.046|
+|European | 0.021| 0.030| 0.410| 0.111| 0.300| 0.012| 0.023| 0.019| 0.030| 0.046|
Table: Quoted speakers (with US affiliated citation) name origin, by journalist name origin {#tbl:table7}
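
The fold-change tables in this diff report a mean and interval "from bootstrap samples", and most of the changed values shift only slightly, in the second decimal place or by a point or two in the percentage ranges. Small shifts like these are what one might see if the bootstrap were rerun with different random draws, though the commit itself does not say why the numbers changed. The sketch below shows one way such a fold change could be estimated. It is not the authors' pipeline: the function name `bootstrap_fold_change`, the 0/1 encoding of name origins, the percentile interval, and the synthetic inputs are all assumptions made for illustration.

```python
import random


def bootstrap_fold_change(news_flags, background_flags,
                          n_boot=1000, lower=0.025, upper=0.975, seed=0):
    """Bootstrap the ratio of two proportions (hypothetical helper).

    news_flags / background_flags are lists of 0/1 values marking whether
    each name in the news set or the background set was predicted to have
    the name origin of interest.
    Returns (mean fold change, lower percentile, upper percentile).
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        # Resample each set with replacement, keeping its original size.
        news = rng.choices(news_flags, k=len(news_flags))
        background = rng.choices(background_flags, k=len(background_flags))
        p_background = sum(background) / len(background)
        if p_background == 0:
            continue  # ratio undefined for this resample; skip it
        ratios.append((sum(news) / len(news)) / p_background)
    ratios.sort()
    mean = sum(ratios) / len(ratios)
    lo = ratios[int(lower * (len(ratios) - 1))]
    hi = ratios[int(upper * (len(ratios) - 1))]
    return mean, lo, hi


# Synthetic example (not data from the manuscript): 40% of quoted names
# versus 20% of background authors carry the name origin of interest.
news = [1] * 40 + [0] * 60
background = [1] * 200 + [0] * 800
mean_fc, ci_low, ci_high = bootstrap_fold_change(news, background)
print(f"{mean_fc:.2f} ({ci_low:.2f}, {ci_high:.2f})")
```

Changing `seed` moves the printed estimate slightly from run to run, which is the kind of variation a fixed seed would remove.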
