
Adds badges and code coverage
edgararuiz committed Sep 12, 2024
1 parent 493525b commit f6aa8ac
Showing 4 changed files with 26 additions and 100 deletions.
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -7,3 +7,4 @@ utils/
^docs$
^pkgdown$
^\.github$
^codecov\.yml$
5 changes: 4 additions & 1 deletion README.Rmd
@@ -10,7 +10,7 @@ knitr::opts_chunk$set(
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%",
eval = TRUE
eval = FALSE
)
library(dplyr)
library(dbplyr)
@@ -23,6 +23,9 @@ mall::llm_use("ollama", "llama3.1", seed = 100)
# mall

<!-- badges: start -->
[![Codecov test coverage](https://codecov.io/gh/edgararuiz/mall/branch/main/graph/badge.svg)](https://app.codecov.io/gh/edgararuiz/mall?branch=main)
[![R-CMD-check](https://github.com/edgararuiz/mall/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/edgararuiz/mall/actions/workflows/R-CMD-check.yaml)
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->

```{r, eval = FALSE, echo = FALSE}
106 changes: 7 additions & 99 deletions README.md
@@ -4,7 +4,14 @@
# mall

<!-- badges: start -->

[![Codecov test
coverage](https://codecov.io/gh/edgararuiz/mall/branch/main/graph/badge.svg)](https://app.codecov.io/gh/edgararuiz/mall?branch=main)
[![R-CMD-check](https://github.com/edgararuiz/mall/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/edgararuiz/mall/actions/workflows/R-CMD-check.yaml)
[![Lifecycle:
experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
<!-- badges: end -->

<!-- toc: start -->

- [Motivation](#motivation)
@@ -84,25 +91,13 @@ library(mall)

reviews |>
llm_sentiment(review)
#> # A tibble: 3 × 2
#> review .sentiment
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… positive
#> 2 I regret buying this laptop. It is too … negative
#> 3 Not sure how to feel about my new washi… neutral
```

The function lets us modify the options to choose from:

``` r
reviews |>
llm_sentiment(review, options = c("positive", "negative"))
#> # A tibble: 3 × 2
#> review .sentiment
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… positive
#> 2 I regret buying this laptop. It is too … negative
#> 3 Not sure how to feel about my new washi… negative
```

As mentioned before, because these functions are pipe friendly, the results of the LLM
@@ -112,11 +107,6 @@ prediction can be used in further transformations:
reviews |>
llm_sentiment(review, options = c("positive", "negative")) |>
filter(.sentiment == "negative")
#> # A tibble: 2 × 2
#> review .sentiment
#> <chr> <chr>
#> 1 I regret buying this laptop. It is too … negative
#> 2 Not sure how to feel about my new washi… negative
```

### Summarize
@@ -129,12 +119,6 @@ number of words to output (`max_words`):
``` r
reviews |>
llm_summarize(review, max_words = 5)
#> # A tibble: 3 × 2
#> review .summary
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… very good tv experience overall
#> 2 I regret buying this laptop. It is too … slow and noisy laptop purchase
#> 3 Not sure how to feel about my new washi… mixed feelings about new washer
```

To control the name of the prediction field, use the `pred_name`
@@ -143,12 +127,6 @@ argument. This works with the other `llm_` functions as well.
``` r
reviews |>
llm_summarize(review, max_words = 5, pred_name = "review_summary")
#> # A tibble: 3 × 2
#> review review_summary
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… very good tv experience overall
#> 2 I regret buying this laptop. It is too … slow and noisy laptop purchase
#> 3 Not sure how to feel about my new washi… mixed feelings about new washer
```

### Classify
Expand All @@ -158,12 +136,6 @@ Use the LLM to categorize the text into one of the options you provide:
``` r
reviews |>
llm_classify(review, c("appliance", "computer"))
#> # A tibble: 3 × 2
#> review .classify
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… appliance
#> 2 I regret buying this laptop. It is too … computer
#> 3 Not sure how to feel about my new washi… appliance
```

### Extract
@@ -177,12 +149,6 @@ We do this by simply saying “product”. The LLM understands what we
``` r
reviews |>
llm_extract(review, "product")
#> # A tibble: 3 × 2
#> review .extract
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… tv
#> 2 I regret buying this laptop. It is too … laptop
#> 3 Not sure how to feel about my new washi… washing machine
```

### Translate
@@ -195,12 +161,6 @@ to be defined. The translation accuracy will depend on the LLM
``` r
reviews |>
llm_translate(review, "spanish")
#> # A tibble: 3 × 2
#> review .translation
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… Este ha sido el mejor televisor que …
#> 2 I regret buying this laptop. It is too … Lamento haber comprado esta laptop. …
#> 3 Not sure how to feel about my new washi… No estoy seguro de cómo sentirme sob…
```

### Custom prompt
@@ -219,12 +179,6 @@ my_prompt <- paste(

reviews |>
llm_custom(review, my_prompt)
#> # A tibble: 3 × 2
#> review .pred
#> <chr> <chr>
#> 1 This has been the best TV I've ever use… Yes
#> 2 I regret buying this laptop. It is too … No
#> 3 Not sure how to feel about my new washi… No
```

## Initialize session
@@ -243,8 +197,6 @@ Ollama, that function is

``` r
llm_use("ollama", "llama3.1", seed = 100, temperature = 0.2)
#> Provider: ollama
#> Model: llama3.1
```

## Key considerations
@@ -290,18 +242,13 @@ book_reviews <- data_bookReviews |>
as_tibble()

glimpse(book_reviews)
#> Rows: 100
#> Columns: 2
#> $ review <chr> "i got this as both a book and an audio file. i had waited t…
#> $ sentiment <fct> 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 1, …
```

As per the docs, `sentiment` is a factor indicating the sentiment of the
review: negative (1) or positive (2).

``` r
length(strsplit(paste(book_reviews, collapse = " "), " ")[[1]])
#> [1] 20571
```

Just to get an idea of how much data we’re processing, I’m using a very,
@@ -317,12 +264,7 @@ reviews_llm <- book_reviews |>
options = c("positive", "negative"),
pred_name = "predicted"
)
#> ! There were 1 predictions with invalid output, they were coerced to NA
```

``` r
toc()
#> 169.546 sec elapsed
```

As far as **time**, on my Apple M3 machine, it took about 3 minutes to
@@ -345,20 +287,6 @@ This is what the new table looks like:

``` r
reviews_llm
#> # A tibble: 100 × 3
#> review sentiment predicted
#> <chr> <fct> <chr>
#> 1 "i got this as both a book and an audio file. i had wait… 1 negative
#> 2 "this book places too much emphasis on spending money in… 1 negative
#> 3 "remember the hollywood blacklist? the hollywood ten? i'… 2 negative
#> 4 "while i appreciate what tipler was attempting to accomp… 1 negative
#> 5 "the others in the series were great, and i really looke… 1 negative
#> 6 "a few good things, but she's lost her edge and i find i… 1 negative
#> 7 "words cannot describe how ripped off and disappointed i… 1 negative
#> 8 "1. the persective of most writers is shaped by their ow… 1 negative
#> 9 "i have been a huge fan of michael crichton for about 25… 1 negative
#> 10 "i saw dr. polk on c-span a month or two ago. he was add… 2 positive
#> # ℹ 90 more rows
```

I used `yardstick` to see how well the model performed. Of course, the
@@ -371,10 +299,6 @@ library(forcats)
reviews_llm |>
mutate(fct_pred = as.factor(ifelse(predicted == "positive", 2, 1))) |>
yardstick::accuracy(sentiment, fct_pred)
#> # A tibble: 1 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 accuracy binary 0.939
```
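Beyond overall accuracy, a confusion matrix shows *where* the model errs (false positives vs. false negatives). A sketch, assuming the same `reviews_llm` table and recoding as above (this follow-up is not part of the original commit):

``` r
library(dplyr)
library(yardstick)

# Same recoding as the accuracy example; conf_mat() cross-tabulates
# the true sentiment against the LLM's prediction
reviews_llm |>
  mutate(fct_pred = as.factor(ifelse(predicted == "positive", 2, 1))) |>
  conf_mat(sentiment, fct_pred)
```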

## Vector functions
@@ -386,12 +310,10 @@ corresponding `llm_vec_` function:

``` r
llm_vec_sentiment("I am happy")
#> [1] "positive"
```

``` r
llm_vec_translate("Este es el mejor dia!", "english")
#> [1] "This is the best day!"
```
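Since the `llm_vec_` functions operate on plain character vectors, they should also compose with `dplyr::mutate()`. A hypothetical sketch, not from the original README (assumes the same `reviews` data frame and an initialized session):

``` r
library(dplyr)
library(mall)

# llm_vec_sentiment() takes a character vector, so it can be
# called column-wise inside mutate() as well
reviews |>
  mutate(sent = llm_vec_sentiment(review))
```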

## Databricks
@@ -416,13 +338,6 @@ vendor’s SQL AI function directly:
``` r
tbl_reviews |>
llm_sentiment(review)
#> # Source: SQL [3 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review .sentiment
#> <chr> <chr>
#> 1 This has been the best TV Ive ever used. Great screen, and sound. positive
#> 2 I regret buying this laptop. It is too slow and the keyboard is to… negative
#> 3 Not sure how to feel about my new washing machine. Great color, bu… mixed
```

There are some differences in the arguments and output of the LLMs.
@@ -435,11 +350,4 @@ the same argument in the AI Summarize function:
``` r
tbl_reviews |>
llm_summarize(review, max_words = 5)
#> # Source: SQL [3 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#> review .summary
#> <chr> <chr>
#> 1 This has been the best TV Ive ever used. Great screen, and sound. Superio…
#> 2 I regret buying this laptop. It is too slow and the keyboard is too … Slow, n…
#> 3 Not sure how to feel about my new washing machine. Great color, but … Initial…
```
14 changes: 14 additions & 0 deletions codecov.yml
@@ -0,0 +1,14 @@
comment: false

coverage:
status:
project:
default:
target: auto
threshold: 1%
informational: true
patch:
default:
target: auto
threshold: 1%
informational: true
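
The coverage numbers behind the new badge are typically produced with the `covr` package. A local sketch (an assumption — the CI workflow itself is not shown in this commit):

``` r
library(covr)

cov <- package_coverage()  # runs the package's tests and measures line coverage
report(cov)                # opens an interactive local coverage report
# codecov(coverage = cov)  # uploads results to Codecov, usually done from CI
```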
