--- title: "Using `cookiemonster`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Using `cookiemonster`} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ## Reading and storing cookies Welcome to the `cookiemonster` package, a one-stop solution to help you navigate the delicious world of browser cookies! In this vignette, we will explain what browser cookies are, how to read and store them using the `cookiemonster` package, and interact with them using the modern `httr2` package, the legacy `httr` package, and the powerful `curl` package. ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, eval = TRUE ) # write to a temporary jar options(cookie_dir = tempdir()) is_available <- function(p) requireNamespace(p, quietly = TRUE) curl_available <- is_available("curl") httr_available <- is_available("httr") httr2_available <- is_available("httr2") ``` ## What are Browser Cookies? This package helps you manage browser cookies in R, making it easy to work with cookies when sending HTTP requests. Before we dive into the functions and features of the package, let's briefly discuss what browser cookies are. Browser cookies are small pieces of data that websites send to your browser to store and later retrieve. They help websites remember your preferences, login information, or even the items in your shopping cart. Cookies play a crucial role in making your browsing experience smooth and personalized. In `R` browser cookies come in handy when working with tasks that involve web interactions, like web scraping, browsing automation, website testing, or API calls that require authentication. They allow your scripts to efficiently mimic browser behaviour and maintain sessions as well as user-specific data. ## Reading and Storing Cookies Using the `cookiemonster` package, we can easily read and store cookies for further use. First, let's load the package: ```{r setup} library(cookiemonster) ``` To use cookies with the `cookiemonster`, you will need to export the necessary cookies from your browser after visiting or logging into a website. To do this, you can use browser extensions like ["Get cookies.txt"](https://chrome.google.com/webstore/detail/get-cookiestxt-locally/cclelndahbckbenkjhflpdbgdldlbecc) for Chromium-based browsers or ["cookies.txt"](https://addons.mozilla.org/en-US/firefox/addon/cookies-txt/) for Firefox. Let's import an example cookie file provided by the `cookiemonster` package: ```{r eval=FALSE} file.copy( from = system.file("extdata", "cookies.txt", package = "cookiemonster"), to = "." ) ``` This is how the file looks like: ```{r eval=FALSE} readLines("cookies.txt") |> cat(sep = "\n") ``` ```{r echo=FALSE} system.file("extdata", "cookies.txt", package = "cookiemonster") |> readLines() |> cat(sep = "\n") ``` Now, let's add the cookies from this file to our cookie jar: ```{r eval=FALSE} add_cookies(cookiefile = "cookies.txt") ``` ```{r echo=FALSE} add_cookies( cookiefile = system.file("extdata", "cookies.txt", package = "cookiemonster"), confirm = TRUE ) ``` You can also import cookies directly from your browser (only works with Firefox at the moment): ```{r eval=FALSE} cookies <- get_cookies_from_browser(browser = "Firefox") store_cookies(cookies) ``` If you are working with `rvest` version 1.0.4 or above, you might also know about live browser sessions: ```{r eval=FALSE} sess <- rvest::read_html_live("https://vu.nl") ``` You can actually open a browser using `sess$view()`. To save the cookies from this session, e.g., after logging in on a website, simply run: ```{r eval=FALSE} add_cookies(session = sess) #> ✔ Cookies for vu.nl put in the jar! ``` ## Default Cookie Storage The `cookiemonster` package stores cookies in a default location. To see this location, you can use: ```{r echo=FALSE} options(cookie_dir = NULL) ``` ```{r} default_jar() ``` If you want to change the default cookie storage location, you can set the `cookie_dir` option: ```{r} options(cookie_dir = tempdir()) default_jar() ``` To revert back to the original cookie storage location: ```{r} options(cookie_dir = NULL) default_jar() ``` ```{r echo=FALSE} options(cookie_dir = tempdir()) ``` To retrieve cookies for a specific domain: ```{r} get_cookies("hb.cran.dev") ``` Note that his function uses regular expressions to match the domain by default. ## Using `cookiemonster` with other packages ### Using Stored Cookies with `httr2` Now let's see how to use stored cookies with the `httr2` package: ```{r eval=httr2_available} library(httr2) resp <- request("https://hb.cran.dev/cookies/set") |> # start a request req_options(cookie = get_cookies("hb.cran.dev", as = "string")) |> # add cookies to be sent with it req_perform() # perform the request resp |> resp_body_json() ``` As you can see, the individual cookie values we see above are returned correctly. This is because the server at https://hb.cran.dev is configured to echo requests send to it. It shows us that the correct cookies were send (hooray!). ### Using Stored Cookies with `httr` To use stored cookies with the legacy `httr` package: ```{r eval=httr_available} library(httr) GET("https://hb.cran.dev/cookies/set", set_cookies(get_cookies("hb.cran.dev", as = "vector"))) ``` This code uses the 'httr' library to set cookies from the 'https://hb.cran.dev' website (a test website for development). The `GET` function is used to set the cookies, and the `set_cookies` function add cookies to the request. ### Using Stored Cookies with `curl` `curl` is the backbone of both `httr` and `httr2`, which provide a more straightforward interface for it. You can also use `curl` directly though (which is only recommended for advanced users though). To make the same request as above, we can use this code: ```{r eval=curl_available} library(curl) h <- new_handle() handle_setopt(h, cookie = get_cookies("hb.cran.dev", as = "string")) resp <- curl_fetch_memory("https://hb.cran.dev/cookies/set", handle = h) jsonlite::fromJSON(rawToChar(resp$content)) ``` If the `curl` response contains new cookies: ```{r eval=curl_available} h2 <- new_handle() resp <- curl_fetch_memory("https://hb.cran.dev/cookies/set?new_cookies=moo", handle = h2) handle_cookies(h2) ``` Use `store_cookies` to store them in your jar: ```{r eval=curl_available} new_cookies <- handle_cookies(h2) store_cookies(new_cookies) get_cookies("hb.cran.dev") ``` Keep in mind that adding cookies for a domain will replace all previously stored cookies for that domain by default. Now that you have an understanding of how the `cookiemonster` package can be used with `httr2`, `httr`, and `curl`, you're ready to take control of browser cookies in your R projects! Happy coding and stay sharp, cookie monsters!