Prepare for CRAN

cleanzr · Nov 20, 2020 · 5f35226
1 parent c229044

Showing 5 changed files with 192 additions and 0 deletions.
diff --git a/.Rbuildignore b/.Rbuildignore
@@ -1,2 +1,4 @@
 ^.*\.Rproj$
 ^\.Rproj\.user$
+^README\.Rmd$
+^cran-comments\.md$
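Each `.Rbuildignore` line is a Perl-style regular expression matched case-insensitively against file paths, so an unescaped `.` matches any character rather than only a literal dot. The following is a minimal sketch of that difference using base R's `grepl`; the file names in `files` are illustrative assumptions, and the call only approximates what `R CMD build` does internally.

```r
# Illustrative file names (assumptions, not taken from the repository)
files <- c("cran-comments.md", "cran-commentsXmd", "README.Rmd")

unescaped <- "^cran-comments.md$"    # "." matches any single character
escaped   <- "^cran-comments\\.md$"  # "\\." matches a literal dot only

# The unescaped pattern also matches the unintended second name
grepl(unescaped, files, perl = TRUE, ignore.case = TRUE)
#> [1]  TRUE  TRUE FALSE

# The escaped pattern matches only the intended file
grepl(escaped, files, perl = TRUE, ignore.case = TRUE)
#> [1]  TRUE FALSE FALSE
```

In practice both patterns exclude `cran-comments.md`; escaping the dot only guards against accidentally excluding a similarly named file.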
diff --git a/NEWS.md b/NEWS.md
@@ -0,0 +1,2 @@
+# clevr 0.1.0
+* Initial release
diff --git a/README.Rmd b/README.Rmd
@@ -0,0 +1,90 @@
+---
+output: github_document
+---
+
+<!-- README.md is generated from README.Rmd. Please edit that file -->
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>",
+  fig.path = "man/figures/README-",
+  out.width = "100%"
+)
+```
+
+# clevr: Clustering and Link Prediction Evaluation in R
+
+<!-- badges: start -->
+<!-- badges: end -->
+
+clevr implements functions for evaluating link prediction and clustering algorithms
+in R. It includes efficient implementations of common performance measures, such as:
+
+* pairwise precision, recall, F-measure;
+* homogeneity, completeness and V-measure;
+* (adjusted) Rand index;
+* variation of information; and
+* mutual information.
+
+While the current focus is on supervised (a.k.a. external) performance measures,
+unsupervised (internal) measures are also in scope for future releases.
+
+## Installation
+
+You can install the latest release from [CRAN](https://CRAN.R-project.org) 
+by entering:
+
+``` r
+install.packages("clevr")
+```
+
+The development version can be installed from GitHub using `devtools`:
+
+``` r
+# install.packages("devtools")
+devtools::install_github("cleanzr/clevr")
+```
+
+## Example
+
+Several functions are included which transform between different clustering 
+representations.
+
+```{r example}
+library(clevr)
+# A clustering of four records represented as a membership vector
+pred_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 1, "Record4" = 2)
+
+# Represent as a set of record pairs that appear in the same cluster
+pred_pairs <- membership_to_pairs(pred_membership)
+print(pred_pairs)
+
+# Represent as a list of record clusters
+pred_clusters <- membership_to_clusters(pred_membership)
+print(pred_clusters)
+```
+
+Performance measures are available for evaluating linked pairs:
+
+```{r pair-measures}
+true_pairs <- rbind(c("Record1", "Record2"), c("Record3", "Record4"))
+
+pr <- precision_pairs(true_pairs, pred_pairs)
+print(pr)
+
+re <- recall_pairs(true_pairs, pred_pairs)
+print(re)
+```
+
+and for evaluating clusterings:
+
+```{r clust-measures}
+true_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 2, "Record4" = 2)
+
+ari <- adj_rand_index(true_membership, pred_membership)
+print(ari)
+
+vi <- variation_info(true_membership, pred_membership)
+print(vi)
+```
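To make the pairwise measures in the README example concrete, here is a minimal base R sketch that computes pairwise precision and recall by hand for the same four-record setup. The helper `pairs_from_membership` is hypothetical (it is not part of clevr and does not reflect how clevr implements these measures), and the record names are illustrative.

```r
# Hypothetical helper (not part of clevr): list every within-cluster pair
# as an "a|b" key, with the two record names in sorted order.
pairs_from_membership <- function(membership) {
  clusters <- split(names(membership), membership)
  keys <- lapply(clusters, function(ids) {
    if (length(ids) < 2) return(character(0))  # singletons contribute no pairs
    apply(combn(sort(ids), 2), 2, paste, collapse = "|")
  })
  unlist(keys, use.names = FALSE)
}

pred <- c(Record1 = 1, Record2 = 1, Record3 = 1, Record4 = 2)
true <- c(Record1 = 1, Record2 = 1, Record3 = 2, Record4 = 2)

pred_keys <- pairs_from_membership(pred)  # 3 predicted pairs
true_keys <- pairs_from_membership(true)  # 2 true pairs

# True positives are pairs linked in both clusterings
tp <- length(intersect(pred_keys, true_keys))

precision <- tp / length(pred_keys)  # 1 of 3 predicted pairs is correct
recall    <- tp / length(true_keys)  # 1 of 2 true pairs is recovered
```

Only the pair (Record1, Record2) is shared, so precision is 1/3 and recall is 1/2.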
diff --git a/README.md b/README.md
@@ -0,0 +1,94 @@
+
+<!-- README.md is generated from README.Rmd. Please edit that file -->
+
+# clevr: Clustering and Link Prediction Evaluation in R
+
+<!-- badges: start -->
+
+<!-- badges: end -->
+
+clevr implements functions for evaluating link prediction and clustering
+algorithms in R. It includes efficient implementations of common
+performance measures, such as:
+
+* pairwise precision, recall, F-measure;
+* homogeneity, completeness and V-measure;
+* (adjusted) Rand index;
+* variation of information; and
+* mutual information.
+
+While the current focus is on supervised (a.k.a. external) performance
+measures, unsupervised (internal) measures are also in scope for future
+releases.
+
+## Installation
+
+You can install the latest release from
+[CRAN](https://CRAN.R-project.org) by entering:
+
+``` r
+install.packages("clevr")
+```
+
+The development version can be installed from GitHub using `devtools`:
+
+``` r
+# install.packages("devtools")
+devtools::install_github("cleanzr/clevr")
+```
+
+## Example
+
+Several functions are included which transform between different
+clustering representations.
+
+``` r
+library(clevr)
+# A clustering of four records represented as a membership vector
+pred_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 1, "Record4" = 2)
+
+# Represent as a set of record pairs that appear in the same cluster
+pred_pairs <- membership_to_pairs(pred_membership)
+print(pred_pairs)
+#>      [,1]      [,2]     
+#> [1,] "Record1" "Record2"
+#> [2,] "Record1" "Record3"
+#> [3,] "Record2" "Record3"
+
+# Represent as a list of record clusters
+pred_clusters <- membership_to_clusters(pred_membership)
+print(pred_clusters)
+#> $`1`
+#> [1] "Record1" "Record2" "Record3"
+#> 
+#> $`2`
+#> [1] "Record4"
+```
+
+Performance measures are available for evaluating linked pairs:
+
+``` r
+true_pairs <- rbind(c("Record1", "Record2"), c("Record3", "Record4"))
+
+pr <- precision_pairs(true_pairs, pred_pairs)
+print(pr)
+#> [1] 0.3333333
+
+re <- recall_pairs(true_pairs, pred_pairs)
+print(re)
+#> [1] 0.5
+```
+
+and for evaluating clusterings:
+
+``` r
+true_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 2, "Record4" = 2)
+
+ari <- adj_rand_index(true_membership, pred_membership)
+print(ari)
+#> [1] 0
+
+vi <- variation_info(true_membership, pred_membership)
+print(vi)
+#> [1] 0.8239592
+```
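The clustering measures in the last README example can be checked by hand from the contingency table of the two membership vectors. The sketch below uses only base R and the textbook definitions of the adjusted Rand index and variation of information (with natural logarithms); it is an illustration under those assumptions, not clevr's implementation, and `entropy` is a hypothetical helper.

```r
# Same clusterings as in the README example, as plain positional vectors
true <- c(1, 1, 2, 2)
pred <- c(1, 1, 1, 2)

tab <- table(true, pred)  # contingency table of cluster overlaps
n <- sum(tab)

# Adjusted Rand index: (index - expected) / (max index - expected)
sum_ij <- sum(choose(tab, 2))           # pairs together in both clusterings
sum_a  <- sum(choose(rowSums(tab), 2))  # pairs together in `true`
sum_b  <- sum(choose(colSums(tab), 2))  # pairs together in `pred`
expected <- sum_a * sum_b / choose(n, 2)
ari <- (sum_ij - expected) / (0.5 * (sum_a + sum_b) - expected)

# Hypothetical helper: Shannon entropy (in nats) of a vector of counts
entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Variation of information: VI = 2 H(U, V) - H(U) - H(V)
vi <- 2 * entropy(tab) - entropy(rowSums(tab)) - entropy(colSums(tab))

print(ari)
#> [1] 0
print(vi)
#> [1] 0.8239592
```

Here the observed pair agreement (1) equals its expected value under chance (2 × 3 / 6 = 1), which is why the adjusted Rand index is exactly 0.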
diff --git a/cran-comments.md b/cran-comments.md
@@ -0,0 +1,10 @@
+Initial release of clevr (0.1.0). No reverse dependencies.
+
+## Test environments
+* macOS, R 4.0.1
+* Fedora 33, R 4.0.3
+* Windows 10, R 4.0.3
+
+## R CMD check results
+
+0 errors | 0 warnings | 0 notes