"markdown": "---\ntitle: \"Caching results\"\nexecute:\n eval: true\n freeze: true\n---\n\n\n\n\n\n\nData preparation, and model preparation, is usually a iterative process. Because\nmodels in R are normally rather fast, it is not a problem to re-run the\nentire code to confirm that all of the results are reproducible. But in\nthe case of LLM's, re-running things may be a problem. Locally, running the \nLLM will be processor intensive, and typically long. If running against a remote\nLLM, the issue would the cost per token. \n\nTo ameliorate this, `mall` is able to cache existing results in a folder. That way, \nrunning the same analysis over and over, will be much quicker. Because instead of\ncalling the LLM again, `mall` will return the previously recorded result. \n\nBy default, this functionality is turned on. The results will be saved to a folder\nnamed \"_mall_cache\" . The name of the folder can be easily changed, simply set\nthe `.cache` argument in `llm_use()`. To **disable** this functionality, set\nthe argument to an empty character, meaning `.cache = \"\"`.\n\n## How it works\n\n`mall` uses all of the values used to make the LLM query as the \"finger print\"\nto confidently identify when the same query is being done again. This includes:\n\n- The value in the particular row\n- The additional prompting built by the `llm_` function,\n- Any other arguments/options used, set in `llm_use()`\n- The name of the back end used for the call\n\nA file is created that contains the request and response. The key to the process\nis the name of the file itself. The name is the hashed value of the combined\nvalue of the items listed above. This becomes the \"finger print\" that allows \n`mall` to know if there is an existing cache. \n\n## Walk-through \n\nWe will initialize the LLM session specifying a seed\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(mall)\n\nllm_use(\"ollama\", \"llama3.1\", seed = 100)\n#> \n#> ── mall session object\n#> Backend: ollama\n#> LLM session:\n#> model:llama3.1\n#> \n#> seed:100\n#> \n#> R session: cache_folder:_mall_cache\n```\n:::\n\n\n\n\nUsing the `tictoc` package, we will measure how long it takes to make a simple\nsentiment call. \n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tictoc)\n\ntic()\nllm_vec_sentiment(\"I am happy\")\n#> [1] \"positive\"\ntoc()\n#> 1.266 sec elapsed\n```\n:::\n\n\n\n\nThis creates a the \"_mall_cache\" folder, and inside a sub-folder, it creates a \nfile with the cache. The name of the file is the resulting hash value of the\ncombination mentioned in the previous section. \n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ndir_ls(\"_mall_cache\", recurse = TRUE, type = \"file\")\n#> _mall_cache/08/086214f2638f60496fd0468d7de37c59.json\n```\n:::\n\n\n\n\nThe cache is a JSON file, that contains both the request, and the response. As\nmentioned in the previous section, the named of the file is derived from the\ncombining the values in the request (`$request`).\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\njsonlite::read_json(\n \"_mall_cache/08/086214f2638f60496fd0468d7de37c59.json\", \n simplifyVector = TRUE, \n flatten = TRUE\n )\n#> $request\n#> $request$messages\n#> role\n#> 1 user\n#> content\n#> 1 You are a helpful sentiment engine. Return only one of the following answers: positive, negative, neutral. No capitalization. No explanations. 
## How it works

`mall` uses all of the values used to make the LLM query as the "fingerprint"
to confidently identify when the same query is being done again. This includes:

- The value in the particular row
- The additional prompting built by the `llm_` function
- Any other arguments/options used, set in `llm_use()`
- The name of the back end used for the call

A file is created that contains the request and response. The key to the process
is the name of the file itself. The name is the hashed value of the combined
values of the items listed above. This becomes the "fingerprint" that allows
`mall` to know if there is an existing cache.
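The exact hashing scheme is internal to `mall`, but the idea can be sketched
with the `digest` package. This is an illustration of the concept, not `mall`'s
actual code; the field names are made up, and the sub-folder layout mirrors
what we will see on disk in the walk-through below:

```r
# A sketch of the fingerprinting idea -- not mall's actual implementation
library(digest)

# Everything that defines the request goes into the fingerprint
request <- list(
  row_value = "I am happy",                 # the value in the particular row
  prompt    = "You are a helpful sentiment engine. ...",  # built by the llm_ function
  options   = list(seed = 100),             # arguments set in llm_use()
  backend   = "ollama"                      # the back end used for the call
)

# Hash the combined values; the hash becomes the cache file name,
# with its first two characters used as a sub-folder
hash <- digest(request)
file.path(substr(hash, 1, 2), paste0(hash, ".json"))
```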
## Walk-through

We will initialize the LLM session specifying a seed

::: {.cell}

```{.r .cell-code}
library(mall)

llm_use("ollama", "llama3.1", seed = 100)
#> 
#> ── mall session object
#> Backend: ollama
#> LLM session:
#> model:llama3.1
#> 
#> seed:100
#> 
#> R session: cache_folder:_mall_cache
```
:::

Using the `tictoc` package, we will measure how long it takes to make a simple
sentiment call.

::: {.cell}

```{.r .cell-code}
library(tictoc)

tic()
llm_vec_sentiment("I am happy")
#> [1] "positive"
toc()
#> 1.266 sec elapsed
```
:::

This creates the "_mall_cache" folder and, inside a sub-folder, a file
with the cache. The name of the file is the resulting hash value of the
combination mentioned in the previous section.

::: {.cell}

```{.r .cell-code}
fs::dir_ls("_mall_cache", recurse = TRUE, type = "file")
#> _mall_cache/08/086214f2638f60496fd0468d7de37c59.json
```
:::

The cache is a JSON file that contains both the request and the response. As
mentioned in the previous section, the name of the file is derived from
combining the values in the request (`$request`).

::: {.cell}

```{.r .cell-code}
jsonlite::read_json(
  "_mall_cache/08/086214f2638f60496fd0468d7de37c59.json", 
  simplifyVector = TRUE, 
  flatten = TRUE
  )
#> $request
#> $request$messages
#>   role
#> 1 user
#>   content
#> 1 You are a helpful sentiment engine. Return only one of the following answers: positive, negative, neutral. No capitalization. No explanations. The answer is based on the following text:\nI am happy
#> 
#> $request$output
#> [1] "text"
#> 
#> $request$model
#> [1] "llama3.1"
#> 
#> $request$seed
#> [1] 100
#> 
#> 
#> $response
#> [1] "positive"
```
:::

Re-running the same `mall` call will complete significantly faster

::: {.cell}

```{.r .cell-code}
tic()
llm_vec_sentiment("I am happy")
#> [1] "positive"
toc()
#> 0.001 sec elapsed
```
:::

If a slightly different query is made, `mall` will recognize that this is a
different call, and it will send it to the LLM. The results are then saved in a
new JSON file.

::: {.cell}

```{.r .cell-code}
llm_vec_sentiment("I am very happy")
#> [1] "positive"

fs::dir_ls("_mall_cache", recurse = TRUE, type = "file")
#> _mall_cache/08/086214f2638f60496fd0468d7de37c59.json
#> _mall_cache/7c/7c7cfcfddc43a90b4deb9d7e60e88291.json
```
:::

During the same R session, if we change something in `llm_use()` that impacts
the request to the LLM, a new cache file will be created

::: {.cell}

```{.r .cell-code}
llm_use(seed = 101)
#> 
#> ── mall session object
#> Backend: ollama
#> LLM session:
#> model:llama3.1
#> 
#> seed:101
#> 
#> R session: cache_folder:_mall_cache

llm_vec_sentiment("I am very happy")
#> [1] "positive"

fs::dir_ls("_mall_cache", recurse = TRUE, type = "file")
#> _mall_cache/08/086214f2638f60496fd0468d7de37c59.json
#> _mall_cache/7c/7c7cfcfddc43a90b4deb9d7e60e88291.json
#> _mall_cache/f1/f1c72c2bf22e22074cef9c859d6344a6.json
```
:::

The only argument that does not trigger a new cache file is `.silent`

::: {.cell}

```{.r .cell-code}
llm_use(seed = 101, .silent = TRUE)

llm_vec_sentiment("I am very happy")
#> [1] "positive"

fs::dir_ls("_mall_cache", recurse = TRUE, type = "file")
#> _mall_cache/08/086214f2638f60496fd0468d7de37c59.json
#> _mall_cache/7c/7c7cfcfddc43a90b4deb9d7e60e88291.json
#> _mall_cache/f1/f1c72c2bf22e22074cef9c859d6344a6.json
```
:::

## Performance improvements

To drive home the usefulness of this feature, we will use the same data set
we used for the README. To start, we will change the cache folder to make it
easy to track the new files

::: {.cell}

```{.r .cell-code}
llm_use(.cache = "_performance_cache", .silent = TRUE)
```
:::

As mentioned, we will use the `data_bookReviews` data frame from the `classmap`
package

::: {.cell}

```{.r .cell-code}
library(classmap)

data(data_bookReviews)
```
:::

The individual reviews in this data set are really long, so they take a while
to process. To run this test, we will use the first 5 rows:

::: {.cell}

```{.r .cell-code}
tic()

data_bookReviews |>
  head(5) |> 
  llm_sentiment(review)
#> # A tibble: 5 × 3
#>   review                                        sentiment .sentiment
#>   <chr>                                         <fct>     <chr>     
#> 1 "i got this as both a book and an audio file… 1         negative  
#> 2 "this book places too much emphasis on spend… 1         negative  
#> 3 "remember the hollywood blacklist? the holly… 2         negative  
#> 4 "while i appreciate what tipler was attempti… 1         negative  
#> 5 "the others in the series were great, and i … 1         negative

toc()
#> 10.223 sec elapsed
```
:::

The analysis took about 10 seconds on my laptop, so around 2 seconds per record.
That may not seem like much, but during model or workflow development, having
to wait this long every time will take its toll on our time and patience.

The new cache folder now has the 5 records cached in their corresponding
JSON files

::: {.cell}

```{.r .cell-code}
fs::dir_ls("_performance_cache", recurse = TRUE, type = "file")
#> _performance_cache/23/23ea4fff55a6058db3b4feefe447ddeb.json
#> _performance_cache/60/60a0dbb7d3b8133d40e2f74deccdbf47.json
#> _performance_cache/76/76f1b84b70328b1b3533436403914217.json
#> _performance_cache/c7/c7cf6e0f9683ae29eba72b0a4dd4b189.json
#> _performance_cache/e3/e375559b424833d17c7bcb067fe6b0f8.json
```
:::

Re-running the exact same call will now take only a fraction of a fraction
of the original time!

::: {.cell}

```{.r .cell-code}
tic()

data_bookReviews |>
  head(5) |> 
  llm_sentiment(review)
#> # A tibble: 5 × 3
#>   review                                        sentiment .sentiment
#>   <chr>                                         <fct>     <chr>     
#> 1 "i got this as both a book and an audio file… 1         negative  
#> 2 "this book places too much emphasis on spend… 1         negative  
#> 3 "remember the hollywood blacklist? the holly… 2         negative  
#> 4 "while i appreciate what tipler was attempti… 1         negative  
#> 5 "the others in the series were great, and i … 1         negative

toc()
#> 0.01 sec elapsed
```
:::

Running an additional record will only cost the time it takes to process it.
The other 5 will still be scored using their cached results

::: {.cell}

```{.r .cell-code}
tic()

data_bookReviews |>
  head(6) |> 
  llm_sentiment(review)
#> # A tibble: 6 × 3
#>   review                                        sentiment .sentiment
#>   <chr>                                         <fct>     <chr>     
#> 1 "i got this as both a book and an audio file… 1         negative  
#> 2 "this book places too much emphasis on spend… 1         negative  
#> 3 "remember the hollywood blacklist? the holly… 2         negative  
#> 4 "while i appreciate what tipler was attempti… 1         negative  
#> 5 "the others in the series were great, and i … 1         negative  
#> 6 "a few good things, but she's lost her edge … 1         negative

toc()
#> 0.624 sec elapsed
```
:::

## Set the seed!

If, at the end of your analysis, you plan to re-run all of the code and you
want to take advantage of the caching functionality, then set the model seed.
This will allow the exact same results to be returned by the LLM.

If no seed is set during development, the results will always come back the
same because the cache is being read. But once the cache is removed, in order
to run everything from scratch, you will get different results. This is
because the invariability of the cached results masks the fact that the
model's output has variability.

::: {.cell}

```{.r .cell-code}
llm_use("ollama", "llama3.1", seed = 999)
#> 
#> ── mall session object
#> Backend: ollama
#> LLM session:
#> model:llama3.1
#> 
#> seed:999
#> 
#> R session: cache_folder:_performance_cache
```
:::
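One way to check this end to end is to clear the cache and re-run. The
following is a minimal sketch, assuming a seed was set as above and that the
backend honors it; note that `fs::dir_delete()` permanently removes the folder,
so use it with care:

```r
# Run once to populate the cache, and keep the result
with_cache <- llm_vec_sentiment("I am happy")

# Remove the cache folder so nothing is read from disk
fs::dir_delete("_performance_cache")

# Re-run from scratch; with a seed set, the LLM should return the same answer
from_scratch <- llm_vec_sentiment("I am happy")

identical(with_cache, from_scratch)
```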
"supporting": [], | ||
"filters": [ | ||
"rmarkdown/pagebreak.lua" | ||
], | ||
"includes": {}, | ||
"engineDependencies": {}, | ||
"preserve": {}, | ||
"postProcess": true | ||
} | ||
} |