From 084aa987338d038f11579641fa4b9e30e6e74826 Mon Sep 17 00:00:00 2001
From: Daniel Himmelstein
Date: Fri, 12 Jan 2024 14:50:42 -0500
Subject: [PATCH] apply some Gigascience copyeditor changes (#75)

merges https://github.com/greenelab/xswap-manuscript/pull/75 from https://github.com/greenelab/xswap-manuscript/issues/69#issuecomment-1889458335

Co-authored-by: Michael Zietz
---
 content/01.abstract.md |  2 +-
 content/02.body.md     | 98 +++++++++++++++++++++---------------------
 2 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/content/01.abstract.md b/content/01.abstract.md
index 2423b5e..7d1c452 100644
--- a/content/01.abstract.md
+++ b/content/01.abstract.md
@@ -1,6 +1,6 @@
 ## Abstract {.page_break_before}
 
-Important tasks in biomedical discovery such as predicting gene functions, gene-disease associations, and drug repurposing opportunities are often framed as network edge prediction.
+Important tasks in biomedical discovery such as predicting gene functions, gene–disease associations, and drug repurposing opportunities are often framed as network edge prediction.
 The number of edges connecting to a node, termed degree, can vary greatly across nodes in real biomedical networks, and the distribution of degrees varies between networks.
 If degree strongly influences edge prediction, then imbalance or bias in the distribution of degrees could lead to nonspecific or misleading predictions.
 We introduce a network permutation framework to quantify the effects of node degree on edge prediction.
diff --git a/content/02.body.md b/content/02.body.md
index 1cc00f4..a390e3e 100644
--- a/content/02.body.md
+++ b/content/02.body.md
@@ -4,15 +4,15 @@ Networks contain information about relationships between entities (referred to h
 A node's degree is the number of edges it has in the network.
 Networks contain many nodes, whose degrees can be aggregated to form the network's degree distribution.
Because different nodes can have very different degrees, real networks have a variety of degree distributions (Figure {@fig:hetionet}), and they commonly exhibit degree imbalance [@doi:10.1371/journal.pone.0017645; @doi:10.1007/978-1-61779-361-5_13; @doi:10.1038/s41467-019-08746-5; @doi:10.1126/science.286.5439.509]. -This is especially true for networks encoding biomedical knowledge or assays, where natural forces such as preferential attachment inherent to the problem domain combine with observation-based influences such as study methodology to create non-uniform degree distributions (Figure {@fig:hetionet}). +This is especially true for networks encoding biomedical knowledge or assays, where natural forces such as preferential attachment inherent to the problem domain combine with observation-based influences such as study methodology to create nonuniform degree distributions (Figure {@fig:hetionet}). ![ -**Biomedical networks are characterized by non-uniform degree distributions.** +**Biomedical networks are characterized by nonuniform degree distributions.** Eight degree distributions are plotted for six edge types Hetionet v1.0 [@doi:10.7554/eLife.26726]. Hetionet integrates subnetworks for 24 different edge types, the degree distributions of which are analyzed separately. Furthermore, bipartite (e.g. Anatomy→expresses→Gene) and directed (e.g. Gene→regulates→Gene) graphs (Hetionet edge types) have both source and target degrees that must be assessed separately. -Undirected edge types (e.g Compound–resembles–Compound) have only a single degree distribution. -Degree distributions are non-uniform and vary greatly between different networks. +Undirected edge types (e.g. Compound–resembles–Compound) have only a single degree distribution. +Degree distributions are nonuniform and vary greatly between different networks. The y-axis is log~10~-scaled to accommodate the common occurrence where most nodes have low degree while a small portion of nodes have high degree. 
Several distributions have nodes that reach the maximum degree, corresponding to a node being connected to all other possible nodes. Zero-degree nodes are not displayed, since methodological limitations often result in edge data only existing for a subset of nodes. @@ -36,24 +36,24 @@ How well a network represents the true relationships it attempts to represent de We define "degree bias" as the type of misrepresentation that occurs when the fraction of incorrectly existent/nonexistent relationships depends on a node's degree. Depending on the type of data being represented, degree biases can arise due to experimental methods, inspection bias, or other factors [@doi:10.1016/j.jprot.2014.01.020]. -Inspection bias indicates that entities are not uniformly studied [@doi:10.1038/nature04209], and it is likely to cause degree bias when networks are constructed using hypothesis-driven findings extracted from the literature, as newly-discovered relationships are not randomly sampled from the set of all true relationships. -Though there is a high correlation between the number of publications mentioning a gene and its degree in low-throughput interaction networks, the number of publications mentioning a gene has little correlation with its degree in a systematically-derived protein interaction network [@doi:10.1016/j.cell.2014.10.050, Figure 6A]. -This suggests that many poorly connected genes in non-systematic protein interaction networks are due to inspection bias, i.e. a lack of study, rather than a lack of biological function. -For networks with a large inspection bias, reliance on degree can lead to predictions that have good metrics when assessed by cross validation but little ability to generalize. 
+Inspection bias indicates that entities are not uniformly studied [@doi:10.1038/nature04209], and it is likely to cause degree bias when networks are constructed using hypothesis-driven findings extracted from the literature, as newly discovered relationships are not randomly sampled from the set of all true relationships. +Though there is a high correlation between the number of publications mentioning a gene and its degree in low-throughput interaction networks, the number of publications mentioning a gene has little correlation with its degree in a systematically derived protein interaction network [@doi:10.1016/j.cell.2014.10.050, Figure 6A]. +This suggests that many poorly connected genes in nonsystematic protein interaction networks are due to inspection bias (i.e. a lack of study) rather than a lack of biological function. +For networks with a large inspection bias, reliance on degree can lead to predictions that have good metrics when assessed by cross-validation but little ability to generalize. Another reason why a reliance on degree can be unfavorable is that degree imbalance can lead to prediction nonspecificity. Nonspecific predictions are not made on the basis of the specific connectivity information contained in a network. -For example, Gillis et al. examined the concept of prediction specificity in the context of gene function prediction and found that many predictions appear to rely primarily on multifunctionality and could be "potentially misleading with respect to causality" [@doi:10.1371/journal.pone.0017258]. +For example, Gillis and Pavlidis examined the concept of prediction specificity in the context of gene function prediction and found that many predictions appear to rely primarily on multifunctionality and could be "potentially misleading with respect to causality" [@doi:10.1371/journal.pone.0017258]. 
Degree imbalance leads high-degree nodes to dominate in the predictions made by degree-associated methods [@doi:10.1093/bioinformatics/btv215], which are effective predictors of connections in some biological networks [@doi:10.1186/1752-0509-2-11]. Consequently, degree-based predictions are more likely nonspecific, meaning the same set of predictions performs well for different tasks. -Depending on the prediction task, edge predictions involving very high degree nodes may be undesired, uninsightful, or nonspecific. +Depending on the prediction task, edge predictions involving very high-degree nodes may be undesired, uninsightful, or nonspecific. While predictions based primarily on degree may be acceptable for some tasks, generating less obvious insights from networks requires drawing inferences from the specific connections and network structure between nodes. Model evaluation is challenging in this context: nonspecific or trivial predictions can dominate performance evaluations and may actually be correct, even if they are not the desired outputs of the predictive model. For example, predicting that the highest degree node in a network shares edges with the remaining nodes to which it is not connected will often lead to many correct predictions, despite this prediction being generic to all other nodes in the network. Degree is important in edge prediction, but it can cause undesired effects. -Degree-based features should often be included in the interpretation of predictions to disentangle desired from non-desired effects and to effectively evaluate and compare predictive models. +Degree-based features should often be included in the interpretation of predictions to disentangle desired from undesired effects and to effectively evaluate and compare predictive models. We sought to directly measure the effect of node degree on edge prediction methods. 
To do so, we developed a network permutation approach that allows any edge prediction method to be compared to an empirical baseline distribution. This method allows edge predictions to be evaluated in the context of degree and its effects on the prediction task. @@ -86,15 +86,15 @@ IndeCut proposed a method to characterize these strategies by their ability to u ### XSwap algorithm -Hanhijärvi, et al. presented XSwap [@doi:10.1137/1.9781611972795.67], an algorithm for the randomization ("permutation") of unweighted networks (Figure {@fig:algo}A). +Hanhijärvi et al. presented XSwap [@doi:10.1137/1.9781611972795.67], an algorithm for the randomization ("permutation") of unweighted networks (Figure {@fig:algo}A). The algorithm picks two existing edges at random ({ab, cd}) and---if the edges constitute a valid swap---exchanges the targets between the edges ({ad, cb}; Supplemental Table {@tbl:xswap}). This process is repeated a user-specified number of times. In general, the number of exchanges should be chosen to be sufficiently large that the fraction of original edges retained in the permuted network is near its asymptotic value as the number of exchanges increases to infinity. The asymptotic fraction of original edges retained in permutation depends on network density, and higher density networks require more swap attempts per edge to reach their asymptotic fraction (Figure {@fig:swap-percent}). -We modified the original XSwap algorithm by adding two parameters, `allow_loops` (a-a), and `allow_antiparallel` (a-b and b-a) that allow a greater variety of network types to be permuted (Figure {@fig:algo}B and Supplemental Table {@tbl:xswap}). -The motivation for these generalizations is to make the permutation method applicable both to directed and undirected graphs, as well as to networks with different types of nodes, variously called multipartite, heterogeneous, or multimodal networks. 
-Specifically, in the modified algorithm two chosen edges constitute a valid swap if they preserve degree for all four involved nodes and do not violate the user-specified parameters. +We modified the original XSwap algorithm by adding two parameters, `allow_loops` (a-a) and `allow_antiparallel` (a-b and b-a), that allow a greater variety of network types to be permuted (Figure {@fig:algo}B and Supplemental Table {@tbl:xswap}). +The motivation for these generalizations is to make the permutation method applicable to both directed and undirected graphs, as well as to networks with different types of nodes, variously called multipartite, heterogeneous, or multimodal networks. +Specifically, in the modified algorithm, two chosen edges constitute a valid swap if they preserve degree for all four involved nodes and do not violate the user-specified parameters. When permuting bipartite networks, our method ensures that each node's class membership and within-class degree is preserved. Similarly, heterogeneous networks should be permuted by considering each edge type as a separate network [@doi:10.1371/journal.pcbi.1004259; @doi:10.15363/thinklab.d136]. @@ -116,7 +116,7 @@ The original algorithm and our proposed modification are given in Figures {@fig: We introduce the edge prior to quantify the probability that two nodes are connected based only on their degree. The edge prior can be estimated using the fraction of permuted networks in which a given edge exists. -In short, for a given node pair (a, b), given $N$ permutations of the network, and given that $m$ of these permutation contain (a, b), the prior for (a, b) is $m \mathbin{/} N$, which is also the maximum likelihood estimate for the binomial distribution success probability. 
+In short, for a given node pair (a, b), given $N$ permutations of the network, and given that $m$ of these permutations contain (a, b), the prior for (a, b) is $m \mathbin{/} N$, which is also the maximum likelihood estimate for the binomial distribution success probability. Based only on permuted networks, the edge prior does not contain any information about the true edges in the (unpermuted) network. The edge prior is a numerical value that can be computed for every pair of nodes that could potentially share an edge; we compared its ability to predict edges in three tasks, discussed in [prediction tasks](#tasks). @@ -152,28 +152,28 @@ We performed three prediction tasks to assess the performance of the edge prior. We compared the permutation-based prior with two additional predictors: our analytical approximation of the edge prior and the product of source and target degree, scaled to the range [0, 1] so that we could assess its calibration as well as its discrimination. We used 20 biomedical networks from the Hetionet heterogeneous network [@doi:10.7554/eLife.26726] that had at least 2000 edges for the first two tasks ([Supplemental table](#networks)). -In the first task, we computed the degree-based predictors (edge prior, scaled degree product, and analytical prior approximation), and predicted the original edges in the network by rank-ordering node pair edge predictions by the node pairs' predictor values. +In the first task, we computed the degree-based predictors (edge prior, scaled degree product, and analytical prior approximation) and predicted the original edges in the network by rank-ordering node pair edge predictions by the node pairs' predictor values. We used node pairs that lacked an edge in the original network as negative examples and those with an edge as positive examples. To assess the methods' predictive performances, we computed the area under the receiver operating characteristic (AUROC) curve for all three predictors. 
-In the second task, we sampled 70% of edges from each of the networks, computed predictors on the sampled network, then predicted held-out edges. +In the second task, we sampled 70% of edges from each of the networks, computed predictors on the sampled network, and then predicted held-out edges. For this task, negative examples were node pairs in which an edge did not exist in either original or sampled network, while positive samples were those node pairs without an edge in the sampled network but with an edge in the original network. The third task evaluated the ability of the edge prior to generalize to new degree distributions. We used two domains where networks were available which shared nodes but had different degree distributions. -Protein-protein interactions (PPI) and transcription factor-target gene (TF-TG) relationships had networks created both by literature curation of low-throughput, hypothesis-driven research and by high-throughput, systematic, hypothesis-free experimentation. +Protein–protein interactions (PPIs) and transcription factor–target gene (TF-TG) relationships had networks created both by literature curation of low-throughput, hypothesis-driven research and by high-throughput, systematic, hypothesis-free experimentation. For the PPI networks, we used the STRING network, which incorporates literature-mining to find relationships [@doi:10.1093/nar/gky1131] and a combination of the high-throughput, proteome-scale interaction networks from Rual et al. [@doi:10.1038/nature04209] and Rolland et al. [@doi:10.1016/j.cell.2014.10.050]. We used a transcription factor-target gene (TF-TG) literature-derived network from Han et al. [@doi:10.1093/nar/gkx1013] and a high-throughput network from Lachmann et al. [@doi:10.1093/bioinformatics/btq466]. The pairs of networks for PPI and TF-TG data sources are ideal because in one we expect inspection bias and in the other we do not. 
-As a further basis of comparison, we added a time-resolved co-authorship network, which we partitioned by time to create two separate networks. -We created the co-authorship network of bioRxiv bioinformatics preprints using the Rxivist [@doi:10.7554/eLife.45133; @doi:10.5281/zenodo.2566421] database, which was generated by crawling the bioRxiv server. -Unlike the other two networks, co-authorship does not have degree bias, as the network faithfully represents all true co-author relationships. +As a further basis of comparison, we added a time-resolved coauthorship network, which we partitioned by time to create two separate networks. +We created the coauthorship network of bioRxiv bioinformatics preprints using the Rxivist [@doi:10.7554/eLife.45133; @doi:10.5281/zenodo.2566421] database, which was generated by crawling the bioRxiv server. +Unlike the other two networks, coauthorship does not have degree bias, as the network faithfully represents all true coauthor relationships. We include this network to offer a comparative prediction task in which the degree distributions between training (posted before 2018) and testing (posted during or after 2018) are not dramatically different (Figure {@fig:degree-bias}A). The goal of the third prediction task is to determine predictor generalizability for network reconstruction between different degree distributions, especially predicting a network without degree bias using predictors from a degree-biased network. Further information about the networks used can be found in [the supplement](#networks). -### Degree-grouping +### Degree grouping Our method for degree-preserving permutation produces randomized networks that share few of their edges with the original network. As permutation preserves only node degree, node pairs with equal degree are equivalent in permutations. 
@@ -190,7 +190,7 @@ Additionally, we include the analytical approximation of the edge prior and func The Python package is [available](https://pypi.org/project/xswap/) on the Python Packaging Index under the name "xswap". The full source code is freely available under the BSD 2-Clause License (). -The edge swap mechanism---implemented in C++ for greater speed---uses a bitset to avoid producing edges which violate the conditions for a valid swap. +The edge swap mechanism---implemented in C++ for greater speed---uses a bitset to avoid producing edges that violate the conditions for a valid swap. While the full bitset implementation is faster for smaller networks, our package uses a compressed bitset [@arxiv:1709.07821] when a network would occupy memory above a user-adjustable threshold. In addition to the validity conditions already described, our package allows specific edges to be excluded from permutation, and every network permutation returns both a permuted network and summary information about the numbers of swaps attempted, performed, and the reasons why invalid swaps were rejected. @@ -207,19 +207,19 @@ We found examples of node degree bias in the PPI and TF-TG networks we investiga Figure {@fig:degree-bias} shows node degree in separate networks for the same type of data. For the PPI networks, the literature-derived network has a larger mean degree and a longer tail than the systematic network, while in the TF-TG networks this relationship is reversed. Because the TF-TG network contained far more transcription factors than target genes (144 and 1406, respectively), the distributions of target degrees were far more compact than those of source degrees. -Unlike the PPI and TF-TG networks, the co-authorship networks were split by date of first co-authorship and did not exhibit a great difference in their degree distributions. -All three types of networks (PPI, TF-TG, and co-authorship) exhibit degree imbalance to varying extents. 
+Unlike the PPI and TF-TG networks, the coauthorship networks were split by date of first coauthorship and did not exhibit a great difference in their degree distributions. +All three types of networks (PPI, TF-TG, and coauthorship) exhibit degree imbalance to varying extents. These results indicate that, depending on the methods by which the represented data were generated, networks of the same type of data may have overall degree distributions that differ greatly (Figure {@fig:degree-bias}A), and they may even assign very different degree to the same nodes (Figure {@fig:degree-bias}B). ![ **A.** Degree distributions of networks with and without degree bias can be very different. - Data on PPI and TF-TG were split between literature-derived and systematically-derived networks. + Data on PPI and TF-TG were split between literature-derived and systematically derived networks. In both cases, the networks exhibit large differences in degree distribution. - Co-authorship relationship networks split by date of first co-authorship roughly share their degree distributions. + Coauthorship relationship networks split by date of first coauthorship roughly share their degree distributions. **B.** Comparison of individual node degrees between different networks. Not only are the overall degree distributions different, but individual nodes can have systematically different degrees between two networks. - Uniform random sampling produces linearly-correlated node degree, while non-random sampling produces non-correlated degree. - Systematically-derived networks are not uniformly sampled from literature-derived networks or vice versa. + Uniform random sampling produces linearly correlated node degree, while nonrandom sampling produces non-correlated degree. + Systematically derived networks are not uniformly sampled from literature-derived networks or vice versa. 70% of literature edges were sampled with uniform probability for the "Subsampled holdout" network. 
](https://github.com/greenelab/xswap-analysis/raw/4f06bdaf1f034af9136e25c03f9891a145b9bf91/img/degree_bias.png){#fig:degree-bias width="100%"} @@ -239,9 +239,9 @@ Meanwhile, the lowest self-reconstruction performance (AUROC = 0.7697) occurred ![ **Degree can predict edges within a given network but does not generalize to networks with different degree distributions** - The edge prior is able to reconstruct the networks on which it was computed (Task 1, "unsampled", 20 different networks) with high performance. - When computed on a sampled network, the edge prior can reconstruct the unsampled network with slightly lower performance (Task 2, "sampled", 20 different networks). - However, when computed on a completely different network (having a different degree distribution) of the same type of data, the edge prior's performance is greatly reduced (Task 3, "separate", 3 different networks). + The edge prior is able to reconstruct the networks on which it was computed (task 1, "unsampled", 20 different networks) with high performance. + When computed on a sampled network, the edge prior can reconstruct the unsampled network with slightly lower performance (task 2, "sampled", 20 different networks). + However, when computed on a completely different network (having a different degree distribution) of the same type of data, the edge prior's performance is greatly reduced (task 3, "separate", 3 different networks). The performance reduction from computing predictors on sampled networks is real but far smaller compared to a new degree distribution. This indicates that while degree can be effective for network reconstruction, it is far less effective in predicting edges from a different degree distribution. 
](https://github.com/greenelab/xswap-analysis/raw/4f06bdaf1f034af9136e25c03f9891a145b9bf91/img/auroc_dists.png){#fig:discrimination width="60%"} @@ -252,7 +252,7 @@ https://github.com/greenelab/xswap-analysis/blob/4f06bdaf1f034af9136e25c03f9891a --> The three predictors that we compared were highly correlated (Spearman rank correlation over 0.984 for all 20 networks). -The three predictors also had very similar AUROC reconstruction performance values for the first, second, and third prediction tasks (max difference < 0.027) because AUROC is rank-based. +The three predictors also had very similar AUROC reconstruction performance values for the first, second, and third prediction tasks (max difference < 0.027) because AUROC is rank based. The edge prior was slightly better than the approximations in 12 of 20 networks. However, while the AUROC results were similar, the predictors were very different in their levels of calibration---the ability of the model to correctly estimate edge existence probabilities. The edge prior was very well calibrated for all networks in the first and second tasks, and it provided the best calibration of the three predictors for each of the prediction tasks (Figure {@fig:calibration}A). @@ -280,27 +280,27 @@ Unlike in the first task, edges that were present in the sampled network were no The results of the second prediction task further demonstrate a high level of performance for degree-sequence-based node pair predictors (Figure {@fig:discrimination}). The edge prior was able to reconstruct the unsampled network with an AUROC of greater than 0.9 in 14 of 20 networks. As was observed in the first task, node pair predictors computed in the second task were highly rank-correlated, meaning the AUROC values for different predictors were similar. -While performance was slightly lower in the second task than the first, many networks were still well-reconstructed. 
+While performance was slightly lower in the second task than the first, many networks were still well reconstructed. The edge prior was the best calibrated predictor for both tasks. -In the third prediction task, we computed the three edge predictors for paired networks representing data from PPI, TF-TG, and bioRxiv bioinformatics pre-print co-authorship. +In the third prediction task, we computed the three edge predictors for paired networks representing data from PPI, TF-TG, and bioRxiv bioinformatics preprint coauthorship. The goal of the task was to compare predictive performance across different degree distributions for the same type of data. -We find that the task of predicting systematically-derived edges using a network with degree bias is significantly more challenging than network reconstruction, and we find consistently lower performance compared to the other tasks (Figure {@fig:discrimination}). +We find that the task of predicting systematically derived edges using a network with degree bias is significantly more challenging than network reconstruction, and we find consistently lower performance compared to the other tasks (Figure {@fig:discrimination}). The edge prior was not able to predict the separate PPI network better than by random guessing (AUROC of roughly 0.5). Only slightly better was its performance in predicting the separate TF-TG network, at an AUROC of 0.59. -We find superior performance in predicting the co-authorship relationships (AUROC 0.75), which was expected as the network being predicted shared roughly the same degree distribution as the network on which the edge prior was computed. +We find superior performance in predicting the coauthorship relationships (AUROC 0.75), which was expected as the network being predicted shared roughly the same degree distribution as the network on which the edge prior was computed. 
The results of the third prediction task show that a difference in degree distribution between the network on which predictors are computed and the network to be predicted can make prediction significantly more challenging. The edge prior can be considered a baseline edge predictor that accurately captures degree's contribution to the probability of an edge existing. The edge prior's low performance in the third task indicates that degree is less helpful for edge prediction tasks in which training and testing networks do not share their degree distributions. Many biomedical prediction tasks can be framed as edge prediction tasks between different degree distributions. -In drug repurposing, for example, existing compound-disease treatment relationships are unlikely to be randomly sampled from all true treatment relationships. +In drug repurposing, for example, existing compound–disease treatment relationships are unlikely to be randomly sampled from all true treatment relationships. However, all treatment relationships between existing compounds and diseases are desirable outputs in prediction. Edge predictions can be based on both underlying biological properties and network degree distributions. However, predictions based on biological properties may be more consistent and generalizable than those based on degree. Degree's influence on edge prediction accuracy measures can reveal the relative contributions of these two factors. -### Degree can underly a large fraction of performance +### Degree can underlie a large fraction of performance We evaluated the extent to which edge prediction performance is due to degree. To begin, we chose the STRING PPI network for the comparison and computed five edge prediction features (Supplemental table {@tbl:edge-prediction}). 
@@ -309,9 +309,9 @@ All five features were correlated with degree (Figure {@fig:feature-degree}), wh We expected features based on degree to show strong performance for a network reconstruction task without holdout, as found in the first prediction task. ![ - **Common edge-prediction metrics correlate with node degree.** - Five common edge-prediction features (Supplemental table {@tbl:edge-prediction}) are correlated with node degree on the STRING PPI network [@doi:10.1093/nar/gky1131]. - All five features show a positive relationship with degree, though the magnitude of this correlation is highly variable. + **Common edge prediction metrics correlate with node degree.** + Five common edge prediction features (Supplemental table {@tbl:edge-prediction}) are correlated with node degree on the STRING PPI network [@doi:10.1093/nar/gky1131]. + All five features show a positive relationship with degree, although the magnitude of this correlation is highly variable. The preferential attachment index is understandably perfectly correlated because it is equal to the product of source and target degree. Each panel indicates the Pearson correlation ("r") between feature and degree in the lower right corner. ](https://github.com/greenelab/xswap-analysis/raw/4f06bdaf1f034af9136e25c03f9891a145b9bf91/img/feature-degree.png){#fig:feature-degree width="100%"} @@ -341,14 +341,14 @@ https://github.com/greenelab/xswap-analysis/blob/4f06bdaf1f034af9136e25c03f9891a --> The edge prior encapsulates nonspecific predictions due to degree, and it reconstructed the PPI network with an AUROC of 0.797 (dotted red line in Figure {@fig:feature-auroc}). -In the second comparison, edge prediction features computed on permuted networks had performance equal or lower to their performances on the unpermuted networks. +In the second comparison, edge prediction features computed on permuted networks had performance equal to or lower than their performances on the unpermuted networks. 
This indicated that four out of five edge prediction features discern more than node degree for the prediction task. The preferential attachment index is the product of source and target degree, and its performance did not differ from the edge prior or the feature's performance when computed on permuted networks. This comparison quantified the performance of degree toward the prediction task and assessed degree's effect on five edge prediction features. The edge prior provided the baseline level of performance attributable to degree alone. Comparing the performances on permuted networks to the performance of the edge prior reveals the extent to which a feature measures degree. -Features whose performances on permuted networks were below that of the edge prior only imperfectly measured degree (eg: Jaccard index), whereas features whose performances equaled the edge prior completely captured degree (eg: preferential attachment index). +Features whose performances on permuted networks were below that of the edge prior only imperfectly measured degree (e.g. Jaccard index), whereas features whose performances equaled the edge prior completely captured degree (e.g. preferential attachment index). Features can also capture information beyond degree, and our method can quantify this performance. For example, the superior performance on unpermuted networks relative to permuted networks indicated that RWR, resource allocation, Jaccard, and Adamic/Adar indices captured more than degree in this prediction task. These results aligned with the definitions of each feature and validated that our permutation framework accurately assessed reliance on degree. @@ -358,14 +358,14 @@ These results aligned with the definitions of each feature and validated that ou We focus on edge prediction in biomedical networks. Our overall goal is to predict new edges with specificity, so that predictions reflect particular connectivity rather than generic node characteristics. 
Our permutation framework measures the predictive performance attributable to degree to provide a baseline expectation for edge pairs. -We expect that non-specificity due to degree is not a unique property of biomedical networks. +We expect that nonspecificity due to degree is not a unique property of biomedical networks. For example, if node A connects to nearly all other nodes in a network, predicting that all remaining nodes share an edge with node A will likely result in many correct---though nonspecific---predictions, regardless of the type of data contained in the network. Node degree should be accounted for to make correct predictions while being able to distinguish specific from nonspecific predictions. Prediction without reliance on node degree is challenging because many effective methods for edge prediction are correlated with degree (Figure {@fig:feature-degree}). The effects of node degree are obvious when edge prediction features are functions of degree. -For example, the resource allocation index is the sum of inverse degree of common neighbors between source and target nodes (in the symmetric case), while preferential attachment is the product of source and target degree [@doi:10.1140/epjb/e2009-00335-8; @doi:10.1145/1065385.1065415]. -However, because many other edge prediction methods are not explicitly degree-based, it is important to have a general method for comparing the effects of node degree on edge prediction methods. +For example, the resource allocation index is the sum of the inverse degree of common neighbors between source and target nodes (in the symmetric case), while preferential attachment is the product of source and target degree [@doi:10.1140/epjb/e2009-00335-8; @doi:10.1145/1065385.1065415]. +However, because many other edge prediction methods are not explicitly degree based, it is important to have a general method for comparing the effects of node degree on edge prediction methods. 
We developed a permutation framework to quantify the edge probability due to degree. We term this probability the "edge prior", and we have identified two applications. @@ -390,7 +390,7 @@ This analysis, enabled by network permutation, measured the extent to which feat ## Conclusion -We developed a network permutation framework and open source software implementation that quantifies the probability of edge existence due to degree and can assess the fraction of feature performance attributable to degree. +We developed a network permutation framework and open-source software implementation that quantifies the probability of edge existence due to degree and can assess the fraction of feature performance attributable to degree. We demonstrated the superiority of the edge prior over other degree-based features for quantifying the effect of degree on the probability of edge existence. The XSwap methods and software provide a context for evaluating edge prediction methods and specific predictions for reliance on degree and, therefore, nonspecificity. Network edge prediction is a common task in biological and biomedical research, and it can be greatly influenced by degree.
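
As an illustration of the feature definitions quoted in the hunks above, the following is a minimal Python sketch of two of the degree-based features and of a degree-preserving edge swap on an undirected toy network. It is not the XSwap implementation: the function names (`neighbors`, `preferential_attachment`, `resource_allocation`, `degree_preserving_swaps`), the toy edge list, and the rejection rules are assumptions made for this example. The sketch only shows why preferential attachment, being the product of source and target degree, is unchanged by a permutation that preserves degree, whereas a feature built on common neighbors can change.

```python
import random
from collections import defaultdict


def neighbors(edges):
    """Map each node to its set of neighbors in an undirected simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def preferential_attachment(adj, source, target):
    """Product of source degree and target degree."""
    return len(adj[source]) * len(adj[target])


def resource_allocation(adj, source, target):
    """Sum of inverse degree over the common neighbors of source and target."""
    return sum(1 / len(adj[z]) for z in adj[source] & adj[target])


def degree_preserving_swaps(edges, n_attempts, seed=0):
    """Permute an undirected edge list while keeping every node's degree fixed.

    Each attempt picks two edges (a, b) and (c, d) and proposes (a, d) and
    (c, b). Proposals that would create a self-loop or a duplicate edge are
    rejected (an assumption of this sketch).
    """
    rng = random.Random(seed)
    edges = [tuple(sorted(e)) for e in edges]
    present = set(edges)
    for _ in range(n_attempts):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:  # consider both orientations of the second edge
            c, d = d, c
        new_1 = tuple(sorted((a, d)))
        new_2 = tuple(sorted((c, b)))
        if a == d or c == b or new_1 == new_2:
            continue  # would create a self-loop or a doubled edge
        if new_1 in present or new_2 in present:
            continue  # would duplicate an existing edge
        present -= {edges[i], edges[j]}
        present |= {new_1, new_2}
        edges[i], edges[j] = new_1, new_2
    return edges


# Hypothetical toy network, used only for illustration.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5), (0, 5)]
original = neighbors(toy_edges)
permuted = neighbors(degree_preserving_swaps(toy_edges, n_attempts=100, seed=1))

for source, target in [(0, 1), (2, 5), (3, 4)]:
    print(
        (source, target),
        "PA original/permuted:",
        preferential_attachment(original, source, target),
        preferential_attachment(permuted, source, target),
        "RA original/permuted:",
        resource_allocation(original, source, target),
        resource_allocation(permuted, source, target),
    )
```

Because every swap leaves each node's degree untouched, the two preferential attachment values printed for a pair always agree, while the resource allocation values are free to differ. Averaging edge presence over many such permutations would give an empirical estimate in the spirit of the edge prior, although the number of swap attempts needed for adequate mixing is not addressed by this sketch.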