mlverse · edgararuiz · Sep 20, 2024 · Sep 19, 2024 · Sep 20, 2024 · Sep 20, 2024
diff --git a/.Rbuildignore b/.Rbuildignore
@@ -11,3 +11,10 @@ utils/
 ^_mall_cache/
 ^_readme_cache/
 ^_prompt_cache/
+^_freeze/
+^_quarto\.yml$
+^articles/
+^index\.qmd$
+^reference/
+^.*\.scss$
+^\.quarto$
diff --git a/.gitignore b/.gitignore
@@ -39,12 +39,13 @@ vignettes/*.pdf
 # R Environment Variables
 .Renviron
 
-# pkgdown site
-docs/
 
 # translation temp files
 po/*~
 
-# RStudio Connect folder
 rsconnect/
-docs
+
+/.quarto/
+
+_freeze/
+
diff --git a/README.Rmd b/README.Rmd
@@ -9,8 +9,7 @@ knitr::opts_chunk$set(
   collapse = TRUE,
   comment = "#>",
   fig.path = "man/figures/README-",
-  out.width = "100%",
-  eval = TRUE
+  out.width = "100%"
 )
 library(dplyr)
 library(dbplyr)
@@ -28,29 +27,6 @@ mall::llm_use("ollama", "llama3.1", seed = 100, .cache = "_readme_cache")
 [![Codecov test coverage](https://codecov.io/gh/edgararuiz/mall/branch/main/graph/badge.svg)](https://app.codecov.io/gh/edgararuiz/mall?branch=main)
 <!-- badges: end -->
 
-```{r, eval = FALSE, echo = FALSE}
-source("utils/table_of_contents.R")
-table_of_contents()
-```
-
-
-<!-- toc: start -->
-- [Motivation](#motivation)
-- [LLM functions](#llm-functions)
-    - [Sentiment](#sentiment)
-    - [Summarize](#summarize)
-    - [Classify](#classify)
-    - [Extract ](#extract)
-    - [Translate](#translate)
-    - [Custom prompt](#custom-prompt)
-- [Initialize session](#initialize-session)
-- [Key considerations](#key-considerations)
-- [Performance](#performance)
-- [Vector functions](#vector-functions)
-- [Databricks](#databricks)
-
-<!-- toc: end -->
-
 ## Intro
 
 Run multiple LLM predictions against a table. The predictions run row-wise
@@ -347,38 +323,3 @@ llm_vec_sentiment("I am happy")
 ```{r}
 llm_vec_translate("Este es el mejor dia!", "english")
 ```
-
-## Databricks
-
-This brief example shows how seamless it is to use the same `llm_` functions,
-but against a remote connection:
-
-```{r}
-library(DBI)
-
-con <- dbConnect(
-  odbc::databricks(),
-  HTTPPath = Sys.getenv("DATABRICKS_PATH")
-)
-
-tbl_reviews <- copy_to(con, reviews)
-```
-As mentioned above, using `llm_sentiment()` in Databricks will call that vendor's
-SQL AI function directly:
-
-```{r}
-tbl_reviews |>
-  llm_sentiment(review)
-```
-
-There are some differences in the arguments, and output of the LLM's. Notice
-that instead of "neutral", the prediction is "mixed".  The AI Sentiment function
-does not allow to change the possible options.
-
-Next, we will try `llm_summarize()`. The `max_words` argument maps to the same
-argument in the AI Summarize function:
-
-```{r}
-tbl_reviews |>
-  llm_summarize(review, max_words = 5)
-```
diff --git a/README.md b/README.md
@@ -12,24 +12,6 @@ experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](h
 coverage](https://codecov.io/gh/edgararuiz/mall/branch/main/graph/badge.svg)](https://app.codecov.io/gh/edgararuiz/mall?branch=main)
 <!-- badges: end -->
 
-<!-- toc: start -->
-
-- [Motivation](#motivation)
-- [LLM functions](#llm-functions)
-  - [Sentiment](#sentiment)
-  - [Summarize](#summarize)
-  - [Classify](#classify)
-  - [Extract](#extract)
-  - [Translate](#translate)
-  - [Custom prompt](#custom-prompt)
-- [Initialize session](#initialize-session)
-- [Key considerations](#key-considerations)
-- [Performance](#performance)
-- [Vector functions](#vector-functions)
-- [Databricks](#databricks)
-
-<!-- toc: end -->
-
 ## Intro
 
 Run multiple LLM predictions against a table. The predictions run
@@ -390,53 +372,3 @@ llm_vec_sentiment("I am happy")
 llm_vec_translate("Este es el mejor dia!", "english")
 #> [1] "This is the best day!"
 ```
-
-## Databricks
-
-This brief example shows how seamless it is to use the same `llm_`
-functions, but against a remote connection:
-
-``` r
-library(DBI)
-
-con <- dbConnect(
-  odbc::databricks(),
-  HTTPPath = Sys.getenv("DATABRICKS_PATH")
-)
-
-tbl_reviews <- copy_to(con, reviews)
-```
-
-As mentioned above, using `llm_sentiment()` in Databricks will call that
-vendor’s SQL AI function directly:
-
-``` r
-tbl_reviews |>
-  llm_sentiment(review)
-#> # Source:   SQL [3 x 2]
-#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
-#>   review                                                              .sentiment
-#>   <chr>                                                               <chr>     
-#> 1 This has been the best TV Ive ever used. Great screen, and sound.   positive  
-#> 2 I regret buying this laptop. It is too slow and the keyboard is to… negative  
-#> 3 Not sure how to feel about my new washing machine. Great color, bu… mixed
-```
-
-There are some differences in the arguments, and output of the LLM’s.
-Notice that instead of “neutral”, the prediction is “mixed”. The AI
-Sentiment function does not allow to change the possible options.
-
-Next, we will try `llm_summarize()`. The `max_words` argument maps to
-the same argument in the AI Summarize function:
-
-``` r
-tbl_reviews |>
-  llm_summarize(review, max_words = 5)
-#> # Source:   SQL [3 x 2]
-#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
-#>   review                                                                .summary
-#>   <chr>                                                                 <chr>   
-#> 1 This has been the best TV Ive ever used. Great screen, and sound.     Superio…
-#> 2 I regret buying this laptop. It is too slow and the keyboard is too … Slow, n…
-#> 3 Not sure how to feel about my new washing machine. Great color, but … Initial…
-```
diff --git a/_quarto.yml b/_quarto.yml
@@ -0,0 +1,39 @@
+project:
+  type: website
+  output-dir: docs
+
+execute: 
+  freeze: true  
+
+website:
+  title: mall
+  navbar:
+    left:
+      - sidebar:articles    
+      - href: reference/index.qmd
+        text: Reference
+  sidebar:
+    - id: articles
+      title: "Articles"
+      style: "docked"
+      background: light
+      collapse-level: 2
+      contents: 
+      - text: "Databricks"
+        href: articles/databricks.qmd
+
+format:
+  html:
+    toc: true
+    code-copy: true
+    code-overflow: wrap
+    code-tools: true
+    eval: false
+    theme:
+      light: [cosmo, theme.scss]
+      dark: [cosmo, theme-dark.scss]
+
+knitr:
+  opts_chunk: 
+    collapse: true
+    comment: "#>"    
diff --git a/articles/databricks.qmd b/articles/databricks.qmd
@@ -0,0 +1,64 @@
+---
+title: "Databricks"
+---
+
+
+```{r, include = FALSE}
+suppressPackageStartupMessages(library(dplyr))
+```
+
+This brief example shows how to use the same `llm_` functions against a remote
+database connection. Currently, the following functions are supported in
+Databricks:
+
+- `llm_sentiment()` / `llm_vec_sentiment()`
+- `llm_summarize()` / `llm_vec_summarize()`
+
+## Examples
+
+We will start by connecting to the Databricks Warehouse:
+
+```{r}
+library(mall)
+library(DBI)
+
+con <- dbConnect(
+  odbc::databricks(),
+  HTTPPath = Sys.getenv("DATABRICKS_PATH")
+)
+```
+
+Next, we will create a small reviews table:
+
+```{r}
+library(dplyr)
+
+reviews <- tribble(
+  ~review,
+  "This has been the best TV I've ever used. Great screen, and sound.",
+  "I regret buying this laptop. It is too slow and the keyboard is too noisy",
+  "Not sure how to feel about my new washing machine. Great color, but hard to figure"
+)
+
+tbl_reviews <- copy_to(con, reviews)
+```
+
+Using `llm_sentiment()` in Databricks will call that vendor's SQL AI function
+directly:
+
+```{r}
+tbl_reviews |>
+  llm_sentiment(review)
+```
+
+There are some differences in the arguments and output of the LLMs. Notice
+that instead of "neutral", the prediction is "mixed". The AI Sentiment function
+does not allow changing the possible options.
+
+Next, we will try `llm_summarize()`. The `max_words` argument maps to the same
+argument in the AI Summarize function:
+
+```{r}
+tbl_reviews |>
+  llm_summarize(review, max_words = 5)
+```
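
The article says that `llm_sentiment()` and `llm_summarize()` call the vendor's SQL AI functions directly. The sketch below is one way to picture that hand-off without a Databricks connection. It does not show how `mall` is implemented. It only relies on the fact that `dbplyr` passes functions it does not recognise through to the generated SQL unchanged. The SQL function names `ai_analyze_sentiment()` and `ai_summarize()` are assumed to be the Databricks AI functions in question, and `lazy_reviews`, `sentiment_query`, and `summary_query` are hypothetical names used only for illustration.

```r
library(dplyr)
library(dbplyr)

# A lazy table on a simulated connection, so no Databricks access is needed.
# Assumption: a generic simulated backend is enough to illustrate the idea.
lazy_reviews <- lazy_frame(review = character(), con = simulate_dbi())

# dbplyr does not know ai_analyze_sentiment(), so the call is passed
# to the generated SQL as-is (assumed Databricks SQL AI function).
sentiment_query <- lazy_reviews |>
  mutate(.sentiment = ai_analyze_sentiment(review))

# Same idea for summaries; the second argument stands in for max_words.
summary_query <- lazy_reviews |>
  mutate(.summary = ai_summarize(review, 5L))

# Print the SQL that would be sent to the database.
show_query(sentiment_query)
show_query(summary_query)
```

Against the live connection from the article, piping `llm_sentiment(review)` into `show_query()` should reveal the SQL that `mall` actually generates, which may differ from this sketch.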