Just to get an idea of how much data we’re processing, I’m using a very simple word count. We’re analyzing a bit over 20 thousand words:
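length(strsplit(paste(book_reviews, collapse = " "), " ")[[1]])
#> [1] 20571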
library(tictoc)

tic()
reviews_llm <- book_reviews |>
  llm_sentiment(
    col = review,
    options = c("positive", "negative"),
    pred_name = "predicted"
  )
#> ! There were 1 predictions with invalid output, they were coerced to NA
toc()
#> 171.074 sec elapsed
As far as time, on my Apple M3 machine, it took about 3 minutes to process the 100 rows containing roughly 20 thousand words. Setting temperature to 0.2 in llm_use() made the model run a bit faster.
The package uses purrr to send each prompt individually to the LLM. I did try a few different ways to speed up the process, unsuccessfully:

- I used furrr to send multiple requests at a time. That did not work, because either the LLM or Ollama processed the requests serially, so there was no improvement.
- I also tried sending more than one row’s text at a time. That caused instability in the number of results; for example, sending 5 rows at a time sometimes returned 7 or 8 results. Even sending 2 was not stable.

The new table looks like this:
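reviews_llm
#> # A tibble: 100 × 3
#>    review                                                    sentiment predicted
#>  1 "i got this as both a book and an audio file. i had wait… 1         negative
#>  2 "this book places too much emphasis on spending money in… 1         negative
#>  3 "remember the hollywood blacklist? the hollywood ten? i'… 2         negative
#>  4 "while i appreciate what tipler was attempting to accomp… 1         negative
#>  5 "the others in the series were great, and i really looke… 1         negative
#>  6 "a few good things, but she's lost her edge and i find i… 1         negative
#>  7 "words cannot describe how ripped off and disappointed i… 1         negative
#>  8 "1. the persective of most writers is shaped by their ow… 1         negative
#>  9 "i have been a huge fan of michael crichton for about 25… 1         negative
#> 10 "i saw dr. polk on c-span a month or two ago. he was add… 2         positive
#> # ℹ 90 more rows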
I used yardstick to see how well the model performed. Of course, the accuracy here is not measured against an absolute “truth”, but against the labels recorded in the sentiment column:
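library(forcats)

reviews_llm |>
  mutate(fct_pred = as.factor(ifelse(predicted == "positive", 2, 1))) |>
  yardstick::accuracy(sentiment, fct_pred)
#> # A tibble: 1 × 3
#>   .metric  .estimator .estimate
#> 1 accuracy binary         0.939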
mall includes functions that expect a vector, instead of a table, to run the predictions. This should make it easier to test things, such as custom prompts or results for specific text. Each llm_ function has a corresponding llm_vec_ function:
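llm_vec_sentiment("I am happy")
#> [1] "positive"

llm_vec_translate("Este es el mejor dia!", "english")
#> [1] "This is the best day!"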
This brief example shows how seamlessly the same llm_ functions run against a remote Databricks connection. As mentioned above, using llm_sentiment() with Databricks calls that vendor’s SQL AI function directly. There are some differences in the arguments and in the output of the LLMs: notice in the example below that instead of “neutral”, the prediction is “mixed”. The AI Sentiment function does not allow changing the possible options:
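library(DBI)

con <- dbConnect(
  odbc::databricks(),
  HTTPPath = Sys.getenv("DATABRICKS_PATH")
)

tbl_reviews <- copy_to(con, reviews)

tbl_reviews |>
  llm_sentiment(review)
#> # Source:   SQL [3 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#>   review                                                              .sentiment
#> 1 This has been the best TV Ive ever used. Great screen, and sound.   positive
#> 2 I regret buying this laptop. It is too slow and the keyboard is to… negative
#> 3 Not sure how to feel about my new washing machine. Great color, bu… mixed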
Next, we will try llm_summarize(). The max_words argument maps to the same argument in the AI Summarize function:
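tbl_reviews |>
  llm_summarize(review, max_words = 5)
#> # Source:   SQL [3 x 2]
#> # Database: Spark SQL 3.1.1[token@Spark SQL/hive_metastore]
#>   review                                                                .summary
#> 1 This has been the best TV Ive ever used. Great screen, and sound.     Superio…
#> 2 I regret buying this laptop. It is too slow and the keyboard is too … Slow, n…
#> 3 Not sure how to feel about my new washing machine. Great color, but … Initial…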