Prepare for CRAN
ngmarchant committed Nov 20, 2020
1 parent c229044 commit 5f35226
Showing 5 changed files with 192 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -1,2 +1,4 @@
^.*\.Rproj$
^\.Rproj\.user$
^README\.Rmd$
^cran-comments\.md$
2 changes: 2 additions & 0 deletions NEWS.md
@@ -0,0 +1,2 @@
# clevr 0.1.0
* Initial release
90 changes: 90 additions & 0 deletions README.Rmd
@@ -0,0 +1,90 @@
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# clevr: Clustering and Link Prediction Evaluation in R

<!-- badges: start -->
<!-- badges: end -->

clevr implements functions for evaluating link prediction and clustering
algorithms in R. It includes efficient implementations of common performance
measures, such as:

* pairwise precision, recall, F-measure;
* homogeneity, completeness and V-measure;
* (adjusted) Rand index;
* variation of information; and
* mutual information.

While the current focus is on supervised (a.k.a. external) performance
measures, unsupervised (internal) measures are also in scope for future
releases.

## Installation

You can install the latest release from [CRAN](https://CRAN.R-project.org)
by entering:

``` r
install.packages("clevr")
```

The development version can be installed from GitHub using `devtools`:

``` r
# install.packages("devtools")
devtools::install_github("cleanzr/clevr")
```

## Example

Several functions are included that convert between different clustering
representations.

```{r example}
library(clevr)
# A clustering of four records represented as a membership vector
pred_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 1, "Record4" = 2)
# Represent as a set of record pairs that appear in the same cluster
pred_pairs <- membership_to_pairs(pred_membership)
print(pred_pairs)
# Represent as a list of record clusters
pred_clusters <- membership_to_clusters(pred_membership)
print(pred_clusters)
```

Performance measures are available for evaluating linked pairs:

```{r pair-measures}
true_pairs <- rbind(c("Record1", "Record2"), c("Record3", "Record4"))
pr <- precision_pairs(true_pairs, pred_pairs)
print(pr)
re <- recall_pairs(true_pairs, pred_pairs)
print(re)
```

and for evaluating clusterings:

```{r clust-measures}
true_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 2, "Record4" = 2)
ari <- adj_rand_index(true_membership, pred_membership)
print(ari)
vi <- variation_info(true_membership, pred_membership)
print(vi)
```
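
To make the pairwise measures concrete, the sketch below recomputes precision
and recall by hand in base R. It is not how clevr implements these measures:
`pair_keys` is a hypothetical helper introduced only for this illustration,
and the predicted pairs are written out explicitly so that the block runs
without clevr installed.

``` r
# Hypothetical helper (not part of clevr): turn each unordered pair into a
# canonical string key so that ("a", "b") and ("b", "a") compare equal.
pair_keys <- function(pairs) {
  apply(pairs, 1, function(p) paste(sort(p), collapse = " | "))
}

# Pairs implied by the predicted clustering {Record1, Record2, Record3}, {Record4}
pred_pairs <- rbind(c("Record1", "Record2"),
                    c("Record1", "Record3"),
                    c("Record2", "Record3"))
true_pairs <- rbind(c("Record1", "Record2"), c("Record3", "Record4"))

pred_keys <- unique(pair_keys(pred_pairs))
true_keys <- unique(pair_keys(true_pairs))

# Pairs that are both predicted and true
n_correct <- length(intersect(pred_keys, true_keys))

# Precision: fraction of predicted pairs that are true (1 of 3)
precision <- n_correct / length(pred_keys)
# Recall: fraction of true pairs that are predicted (1 of 2)
recall <- n_correct / length(true_keys)

print(precision)
print(recall)
```

Only the pair (Record1, Record2) is both predicted and true, so precision is
1/3 and recall is 1/2.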
88 changes: 88 additions & 0 deletions README.md
@@ -0,0 +1,88 @@

<!-- README.md is generated from README.Rmd. Please edit that file -->

# clevr: Clustering and Link Prediction Evaluation in R

<!-- badges: start -->

<!-- badges: end -->

clevr implements functions for evaluating link prediction and clustering
algorithms in R. It includes efficient implementations of common
performance measures, such as:

  - pairwise precision, recall, F-measure;
  - homogeneity, completeness and V-measure;
  - (adjusted) Rand index;
  - variation of information; and
  - mutual information.

While the current focus is on supervised (a.k.a. external) performance
measures, unsupervised (internal) measures are also in scope for future
releases.

## Installation

You can install the latest release from
[CRAN](https://CRAN.R-project.org) by entering:

``` r
install.packages("clevr")
```

The development version can be installed from GitHub using `devtools`:

``` r
# install.packages("devtools")
devtools::install_github("cleanzr/clevr")
```

## Example

Several functions are included that convert between different
clustering representations.

``` r
library(clevr)
# A clustering of four records represented as a membership vector
pred_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 1, "Record4" = 2)

# Represent as a set of record pairs that appear in the same cluster
pred_pairs <- membership_to_pairs(pred_membership)
print(pred_pairs)
#>      [,1]      [,2]
#> [1,] "Record1" "Record2"
#> [2,] "Record1" "Record3"
#> [3,] "Record2" "Record3"

# Represent as a list of record clusters
pred_clusters <- membership_to_clusters(pred_membership)
print(pred_clusters)
#> $`1`
#> [1] "Record1" "Record2" "Record3"
#>
#> $`2`
#> [1] "Record4"
```

Performance measures are available for evaluating linked pairs:

``` r
true_pairs <- rbind(c("Record1", "Record2"), c("Record3", "Record4"))

pr <- precision_pairs(true_pairs, pred_pairs)
print(pr)
#> [1] 0.3333333

re <- recall_pairs(true_pairs, pred_pairs)
print(re)
#> [1] 0.5
```

and for evaluating clusterings:

``` r
true_membership <- c("Record1" = 1, "Record2" = 1, "Record3" = 2, "Record4" = 2)

ari <- adj_rand_index(true_membership, pred_membership)
print(ari)
#> [1] 0

vi <- variation_info(true_membership, pred_membership)
print(vi)
#> [1] 0.8239592
```
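
The two clustering measures printed above can be reproduced by hand from the
contingency table of true versus predicted labels. The sketch below uses only
base R and the textbook definitions; it is not clevr's implementation, and
`entropy` is a hypothetical helper introduced for this illustration. Natural
logarithms are assumed because they reproduce the printed value.

``` r
# Same clusterings as above, without record names (only the labels matter here)
true_membership <- c(1, 1, 2, 2)
pred_membership <- c(1, 1, 1, 2)

# Contingency table: rows are true clusters, columns are predicted clusters
ct <- table(true_membership, pred_membership)
n <- sum(ct)

# Adjusted Rand index: count co-clustered pairs and correct for chance
sum_cells <- sum(choose(ct, 2))           # pairs together in both clusterings
sum_rows <- sum(choose(rowSums(ct), 2))   # pairs together in the true clustering
sum_cols <- sum(choose(colSums(ct), 2))   # pairs together in the predicted clustering
expected <- sum_rows * sum_cols / choose(n, 2)
ari <- (sum_cells - expected) / ((sum_rows + sum_cols) / 2 - expected)

# Hypothetical helper: Shannon entropy (in nats) of a vector of counts
entropy <- function(counts) {
  counts <- as.numeric(counts)
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Variation of information: 2 * H(true, pred) - H(true) - H(pred)
vi <- 2 * entropy(ct) - entropy(rowSums(ct)) - entropy(colSums(ct))

print(ari)
#> [1] 0
print(vi)
#> [1] 0.8239592
```

The adjusted Rand index is 0 because the single pair that is co-clustered in
both clusterings is exactly what would be expected by chance.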
10 changes: 10 additions & 0 deletions cran-comments.md
@@ -0,0 +1,10 @@
This is the initial release of clevr (version 0.1.0). There are no reverse dependencies.

## Test environments
* macOS, R 4.0.1
* Fedora 33, R 4.0.3
* Windows 10, R 4.0.3

## R CMD check results

0 errors | 0 warnings | 0 notes
