| Title: | Download and Cache Files Safely |
|---|---|
| Description: | The goal of dlr is to provide a friendly wrapper around the common pattern of downloading a file if that file does not already exist locally. |
| Authors: | Jonathan Bratt [aut] (ORCID: <https://orcid.org/0000-0003-2859-0076>), Jon Harmon [aut, cre] (ORCID: <https://orcid.org/0000-0003-4781-4346>), Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph, fnd] |
| Maintainer: | Jon Harmon <[email protected]> |
| License: | Apache License (>= 2) |
| Version: | 1.0.1.9001 |
| Built: | 2026-05-15 09:14:52 UTC |
| Source: | https://github.com/macmillancontentscience/dlr |
App cache directories can depend on the user's operating system and an
overall R_USER_CACHE_DIR environment variable. We also respect a per-app
option (appname.dir), and a per-app environment variable
(APPNAME_CACHE_DIR). This function returns the path that will be used for a
given app's cache.
app_cache_dir(appname, verbose = interactive())
appname |
Character; the name of the application that will "own" the cache, such as the name of a package. |
The full path to the app's cache directory.
app_cache_dir("myApp")
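The lookup described above can be sketched in base R. This is an illustration only, not dlr's implementation: `resolve_cache_dir()` is a hypothetical helper, the precedence order (option, then environment variable, then the system default) is an assumption, and `tools::R_user_dir()` stands in for whatever dlr uses to find the operating-system default.

```r
# Hypothetical helper, not part of dlr: resolve a cache directory for an app.
# Assumed precedence: per-app option, then per-app environment variable,
# then R's standard user cache directory.
resolve_cache_dir <- function(appname) {
  # 1. Per-app option, e.g. options(myApp.dir = "/some/path").
  from_option <- getOption(paste0(appname, ".dir"))
  if (!is.null(from_option)) {
    return(from_option)
  }

  # 2. Per-app environment variable, e.g. MYAPP_CACHE_DIR.
  from_env <- Sys.getenv(paste0(toupper(appname), "_CACHE_DIR"), unset = "")
  if (nzchar(from_env)) {
    return(from_env)
  }

  # 3. Fall back to R's user cache directory, which depends on the operating
  #    system and honors R_USER_CACHE_DIR (requires R >= 4.0.0).
  tools::R_user_dir(appname, which = "cache")
}

resolve_cache_dir("myApp")
```

Setting `options(myApp.dir = ...)` or the `MYAPP_CACHE_DIR` environment variable before the call changes the result of this sketch without touching the system-wide setting.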
Construct the full path to the cached version of a file within a particular app's cache, using the source path of the file to make sure the cache filename is unique.
construct_cached_file_path(source_path, appname, extension = "")
source_path |
Character scalar; the full path to the source file. |
appname |
Character; the name of the application that will "own" the cache, such as the name of a package. |
extension |
Character scalar; an optional filename extension. |
The full path to the processed version of source_path in the app's cache directory.
construct_cached_file_path(
  source_path = "my/file.txt",
  appname = "dlr",
  extension = "rds"
)
Given the path to a file, construct a unique filename using the hash of the path.
construct_processed_filename(source_path, extension = "")
source_path |
Character scalar; the full path to the source file. |
extension |
Character scalar; an optional filename extension. |
A unique filename for a processed version of the file.
construct_processed_filename(
  source_path = "my/file.txt",
  extension = "rds"
)
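The idea of deriving a filename from a hash of the path can be sketched with base R alone. `hashed_filename()` is a hypothetical helper, and the choice of MD5 is an assumption; the hash algorithm dlr actually uses is not stated here, so the names produced below will not necessarily match those from `construct_processed_filename()`.

```r
# Hypothetical helper, not dlr's implementation: build a filename from an MD5
# hash of the source path. The hash algorithm used by dlr is an assumption.
hashed_filename <- function(source_path, extension = "") {
  # tools::md5sum() hashes files, so write the path string to a temp file.
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)
  cat(source_path, file = tmp)
  hash <- unname(tools::md5sum(tmp))

  # Append the extension only when one is supplied.
  if (nzchar(extension)) {
    paste(hash, extension, sep = ".")
  } else {
    hash
  }
}

# The same path always maps to the same name; in practice, different paths
# map to different names.
hashed_filename("my/file.txt", extension = "rds")
hashed_filename("my/other_file.txt", extension = "rds")
```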
Create the default path expected by app_cache_dir().
create_app_cache_dir(appname)
appname |
Character; the name of the application that will "own" the cache, such as the name of a package. |
A normalized path to a cache directory. The directory is created if the user has write access and the directory does not exist.
# Executing this function creates a cache directory.
create_app_cache_dir("dlr")
This function wraps maybe_process(), specifying the app's cache directory.
maybe_cache(
  source_path,
  appname,
  filename = construct_processed_filename(source_path),
  process_f = readRDS,
  process_args = NULL,
  write_f = saveRDS,
  write_args = NULL,
  force_process = FALSE
)
source_path |
Character scalar; the path to the raw file. Paths starting
with |
appname |
Character; the name of the application that will "own" the cache, such as the name of a package. |
filename |
Character; an optional filename for the cached version of the
file. By default, a filename is constructed using
|
process_f |
A function or one-sided formula to use to process the source
file. |
process_args |
An optional list of additional arguments to |
write_f |
A function or one-sided formula to use to save the processed
file. The processed object will be passed as the first argument to this
function, and |
write_args |
An optional list of additional arguments to |
force_process |
A logical scalar indicating whether we should process the source file even if the target already exists. This can be particularly useful if you wish to redownload a file. |
The normalized target_path.
if (interactive()) {
  target_path <- maybe_cache(
    "https://query.data.world/s/owqxojjiphaypjmlxldsp566lck7co",
    appname = "dlr",
    process_f = read.csv
  )
  target_path
  unlink(target_path)
}
Sometimes you just need to get a processed file to a particular location, without reading the data. For example, you might need to download a lookup table used by various functions in a package, independent of a particular function call that needs the data. This function does the processing if it hasn't already been done.
maybe_process(
  source_path,
  target_path,
  process_f = readRDS,
  process_args = NULL,
  write_f = saveRDS,
  write_args = NULL,
  force_process = FALSE
)
source_path |
Character scalar; the path to the raw file. Paths starting
with |
target_path |
Character scalar; the path where the processed version of the file should be stored. |
process_f |
A function or one-sided formula to use to process the source
file. |
process_args |
An optional list of additional arguments to |
write_f |
A function or one-sided formula to use to save the processed
file. The processed object will be passed as the first argument to this
function, and |
write_args |
An optional list of additional arguments to |
force_process |
A logical scalar indicating whether we should process the source file even if the target already exists. This can be particularly useful if you wish to redownload a file. |
The normalized target_path.
if (interactive()) {
  temp_filename <- tempfile()
  maybe_process(
    "https://query.data.world/s/owqxojjiphaypjmlxldsp566lck7co",
    target_path = temp_filename,
    process_f = read.csv
  )
  unlink(temp_filename)
}
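The "process only if the target is missing" pattern, including the effect of `force_process`, can be sketched in base R without network access. `process_if_missing()` is a hypothetical stand-in, not dlr's implementation: it omits downloading, the `*_args` parameters, and formula support, and it assumes the target path is passed as the second argument to `write_f`.

```r
# Hypothetical helper, not dlr's implementation: process a source file into
# target_path only when needed, and return the normalized target path.
process_if_missing <- function(source_path,
                               target_path,
                               process_f = readRDS,
                               write_f = saveRDS,
                               force_process = FALSE) {
  if (force_process || !file.exists(target_path)) {
    processed <- process_f(source_path)
    # The processed object is the first argument to write_f; passing the
    # target path as the second argument is an assumption of this sketch.
    write_f(processed, target_path)
  }
  normalizePath(target_path)
}

# Use a local CSV file so the example needs no network access.
source_file <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), source_file, row.names = FALSE)
target_file <- tempfile(fileext = ".rds")

# Count how often the source is actually processed.
n_processed <- 0
counting_read <- function(path) {
  n_processed <<- n_processed + 1
  read.csv(path)
}

process_if_missing(source_file, target_file, process_f = counting_read)
process_if_missing(source_file, target_file, process_f = counting_read)
n_processed  # 1: the second call found the target and skipped processing

process_if_missing(
  source_file, target_file,
  process_f = counting_read, force_process = TRUE
)
n_processed  # 2: force_process = TRUE reprocessed the source

unlink(c(source_file, target_file))
```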
This function wraps read_or_process(), specifying an app's cache directory
as the target directory.
read_or_cache(
  source_path,
  appname,
  filename = construct_processed_filename(source_path),
  process_f = readRDS,
  process_args = NULL,
  read_f = readRDS,
  read_args = NULL,
  write_f = saveRDS,
  write_args = NULL,
  force_process = FALSE
)
source_path |
Character scalar; the path to the raw file. Paths starting
with |
appname |
Character; the name of the application that will "own" the cache, such as the name of a package. |
filename |
Character; an optional filename for the cached version of the
file. By default, a filename is constructed using
|
process_f |
A function or one-sided formula to use to process the source
file. |
process_args |
An optional list of additional arguments to |
read_f |
A function or one-sided formula to use to read the processed
file. |
read_args |
An optional list of additional arguments to |
write_f |
A function or one-sided formula to use to save the processed
file. The processed object will be passed as the first argument to this
function, and |
write_args |
An optional list of additional arguments to |
force_process |
A logical scalar indicating whether we should process the source file even if the target already exists. This can be particularly useful if you wish to redownload a file. |
The processed object.
if (interactive()) {
  austin_smoke_free <- read_or_cache(
    "https://query.data.world/s/owqxojjiphaypjmlxldsp566lck7co",
    appname = "dlr",
    process_f = read.csv
  )
  head(austin_smoke_free)
}

if (interactive()) {
  # Calling the function a second time gives the result instantly.
  austin_smoke_free <- read_or_cache(
    "https://query.data.world/s/owqxojjiphaypjmlxldsp566lck7co",
    appname = "dlr",
    process_f = read.csv
  )
  head(austin_smoke_free)
}

if (interactive()) {
  # Remove the generated file. appname is required to locate the cache.
  unlink(
    construct_cached_file_path(
      "https://query.data.world/s/owqxojjiphaypjmlxldsp566lck7co",
      appname = "dlr"
    )
  )
}
Often, a file must be processed before being usable in R. It can be useful to save the processed contents of that file in a standard format, such as RDS, so that the file does not need to be processed the next time it is loaded.
read_or_process(
  source_path,
  target_path,
  process_f = readRDS,
  process_args = NULL,
  read_f = readRDS,
  read_args = NULL,
  write_f = saveRDS,
  write_args = NULL,
  force_process = FALSE
)
source_path |
Character scalar; the path to the raw file. Paths starting
with |
target_path |
Character scalar; the path where the processed version of the file should be stored. |
process_f |
A function or one-sided formula to use to process the source
file. |
process_args |
An optional list of additional arguments to |
read_f |
A function or one-sided formula to use to read the processed
file. |
read_args |
An optional list of additional arguments to |
write_f |
A function or one-sided formula to use to save the processed
file. The processed object will be passed as the first argument to this
function, and |
write_args |
An optional list of additional arguments to |
force_process |
A logical scalar indicating whether we should process the source file even if the target already exists. This can be particularly useful if you wish to redownload a file. |
The processed object.
if (interactive()) {
  temp_filename <- tempfile()
  austin_smoke_free <- read_or_process(
    "https://query.data.world/s/owqxojjiphaypjmlxldsp566lck7co",
    target_path = temp_filename,
    process_f = read.csv
  )
  head(austin_smoke_free)
}

# Calling the function a second time gives the result instantly.
if (interactive()) {
  austin_smoke_free <- read_or_process(
    "https://query.data.world/s/owqxojjiphaypjmlxldsp566lck7co",
    target_path = temp_filename,
    process_f = read.csv
  )
  head(austin_smoke_free)
}

if (interactive()) {
  # Remove the generated file.
  unlink(temp_filename)
}
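The read-or-process idea can be illustrated offline with a local file. The helper below, `read_or_process_sketch()`, is hypothetical and simplified (no downloading, no `*_args`, no formula support); it only shows that once a processed copy exists at `target_path`, the source is no longer needed.

```r
# Hypothetical helper, not dlr's implementation: return the processed object,
# reading it from target_path when a processed copy already exists.
read_or_process_sketch <- function(source_path,
                                   target_path,
                                   process_f = readRDS,
                                   read_f = readRDS,
                                   write_f = saveRDS,
                                   force_process = FALSE) {
  if (!force_process && file.exists(target_path)) {
    return(read_f(target_path))
  }
  processed <- process_f(source_path)
  write_f(processed, target_path)
  processed
}

# A small local CSV stands in for a remote file.
source_file <- tempfile(fileext = ".csv")
write.csv(
  data.frame(id = 1:3, value = c("a", "b", "c")),
  source_file,
  row.names = FALSE
)
target_file <- tempfile(fileext = ".rds")

first <- read_or_process_sketch(source_file, target_file, process_f = read.csv)

# Delete the source to show that the second call relies only on the
# processed copy.
unlink(source_file)
second <- read_or_process_sketch(source_file, target_file, process_f = read.csv)

all.equal(first, second)
unlink(target_file)
```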
Override the default paths used by app_cache_dir().
set_app_cache_dir(appname, cache_dir = NULL)
appname |
Character; the name of the application that will "own" the cache, such as the name of a package. |
cache_dir |
Character scalar; a path to a cache directory. |
A normalized path to a cache directory. The directory is created if
the user has write access and the directory does not exist. An option is
also set so future calls to app_cache_dir() will respect the
change.
# Executing this function creates a cache directory.
set_app_cache_dir(appname = "dlr", cache_dir = "/my/cache/path")
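The behavior described for the return value (create the directory if needed, normalize the path, and record it in an option) might look roughly like the following in base R. `set_cache_dir_sketch()` is a hypothetical helper; the option name follows the `appname.dir` convention mentioned for `app_cache_dir()`, and the rest is an assumption rather than dlr's actual code.

```r
# Hypothetical helper, not dlr's implementation: create a cache directory if
# needed and record it in the per-app option "<appname>.dir".
set_cache_dir_sketch <- function(appname, cache_dir) {
  if (!dir.exists(cache_dir)) {
    dir.create(cache_dir, recursive = TRUE)
  }
  cache_dir <- normalizePath(cache_dir)

  # options() accepts a named list, which lets us build the option name.
  new_option <- list(cache_dir)
  names(new_option) <- paste0(appname, ".dir")
  options(new_option)

  cache_dir
}

demo_dir <- file.path(tempdir(), "demo-cache")
set_cache_dir_sketch("myApp", demo_dir)
getOption("myApp.dir")

# Clean up the option and the directory.
options(myApp.dir = NULL)
unlink(demo_dir, recursive = TRUE)
```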
The default timeout for downloads is 60 seconds. This is not long enough for
many of the files that are downloaded using this package. We therefore supply
a convenience function to easily change this setting. You can permanently
change this default by setting R_DEFAULT_INTERNET_TIMEOUT in your
.Renviron.
set_timeout(seconds = 600L)
seconds |
The number of seconds to set as the timeout (default 600 seconds). |
A list with the old timeout setting (invisibly).
getOption("timeout")
old_setting <- set_timeout()
getOption("timeout")
options(old_setting)
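Because the old setting is returned as a list, it can be restored automatically once a long download finishes. The sketch below uses only base R's `options()`, which also returns the previous values of the options it changes; `with_longer_timeout()` is a hypothetical helper and not part of dlr.

```r
# Hypothetical helper, not part of dlr: evaluate code with a longer timeout,
# then restore the previous setting even if the code fails.
with_longer_timeout <- function(code, seconds = 600L) {
  old_setting <- options(timeout = seconds)
  on.exit(options(old_setting), add = TRUE)
  # code is evaluated lazily, so it runs after the timeout has been raised.
  force(code)
}

timeout_before <- getOption("timeout")
timeout_inside <- with_longer_timeout(getOption("timeout"))

timeout_inside                                   # 600
identical(getOption("timeout"), timeout_before)  # TRUE: the setting is restored
```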