# How caching works in tidywikidatar

library(tidywikidatar)

In order to reduce load on Wikidata’s server and to speed up the processing of data, tidywikidatar makes extensive use of local caching.

## What data are cached locally

There are four main types of data that are cached locally:

• searches run with tw_search()
• data about an item, typically retrieved with tw_get() or tw_get_property()
• labels or descriptions of properties, typically retrieved with tw_get_property_label() and tw_get_property_description()
• qualifiers of properties, typically retrieved with tw_get_qualifiers()

To reduce space used for local caching and speed up processing time, it is possible to store only labels and information available in a given language when relevant.
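As a minimal sketch of working in a single language, the snippet below sets a session-wide default with tw_set_language() and reads it back with tw_get_language(). It assumes that retrieval functions fall back to this default when no language argument is passed, so check the documentation of each function before relying on it.

```r
library(tidywikidatar)

# Set the default language for the current session.
# Assumption: functions with a `language` argument use this value
# when the argument is not given explicitly.
tw_set_language(language = "en")

# Read the setting back to confirm which language is in use
current_language <- tw_get_language()
current_language
```

Later calls such as tw_get(id = "Q180099") would then be expected to work with English-language data unless another language is requested.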

## Caching with SQLite

In tidywikidatar, it is possible to enable caching with:

tw_enable_cache()

If you do not include further parameters, by default tidywikidatar will use a local SQLite database for caching.

You can choose the folder where the SQLite database will be stored with tw_set_cache_folder(); if that folder does not exist yet, you can create it with tw_create_cache_folder().

tw_set_cache_folder(
  path = fs::path(fs::path_home_r(), "R", "tw_data")
)
tw_create_cache_folder()
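For scripts that run non-interactively, a variation on the steps above may be convenient. The sketch below uses a throwaway folder under tempdir() (an arbitrary choice for illustration) and assumes that tw_create_cache_folder() accepts ask = FALSE to skip the confirmation prompt.

```r
library(tidywikidatar)

# Illustrative location only: a temporary folder that disappears
# when the R session ends
cache_dir <- fs::path(tempdir(), "tw_data")

tw_enable_cache()
tw_set_cache_folder(path = cache_dir)

# Assumption: `ask = FALSE` creates the folder without prompting
tw_create_cache_folder(ask = FALSE)

# Check that the folder registered for caching now exists
cache_folder_exists <- fs::dir_exists(tw_get_cache_folder())
cache_folder_exists
```

For real projects, a persistent folder is preferable so that the cache survives across sessions.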

## Caching with other database backends

Early support for other database backends is now available and will be implemented more consistently across all functions in a forthcoming version. In the current version, connections are closed by default at the end of each function call, so you need to provide a fresh connection with each call. If you see the error “external pointer is not valid”, it probably means that the connection has already been closed.

tw_enable_cache()

item_df <- tw_get(
  id = c("Q1", "Q2", "Q3", "Q4", "Q5"),
  cache_connection = DBI::dbConnect(
    odbc::odbc(),
    Host = "localhost",
    database = "Zu5oobei9heloquoa6Shahwu",
    UID = "Xeit6dieSho7eongamuiyieW",
    PWD = "Oot7moo4einguJahgahwi8oh"
  )
)

tw_get_cached_item(
  id = c("Q1", "Q2"),
  cache_connection = DBI::dbConnect(
    odbc::odbc(),
    Host = "localhost",
    database = "Zu5oobei9heloquoa6Shahwu",
    UID = "Xeit6dieSho7eongamuiyieW",
    PWD = "Oot7moo4einguJahgahwi8oh"
  )
)
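The reason a fresh connection is needed for each call can be illustrated with DBI alone. In the sketch below, fresh_connection() is a hypothetical helper, and an in-memory SQLite database stands in for the real backend. It only shows that a connection cannot be reused once it has been closed; it does not reproduce tidywikidatar's internals.

```r
library(DBI)

# Hypothetical helper: returns a brand-new connection on every call.
# An in-memory SQLite database is used here purely as a stand-in.
fresh_connection <- function() {
  DBI::dbConnect(RSQLite::SQLite(), ":memory:")
}

con <- fresh_connection()
valid_before <- DBI::dbIsValid(con) # the connection is open

# Mimic what happens by default at the end of a function call
DBI::dbDisconnect(con)
valid_after <- DBI::dbIsValid(con) # the connection is no longer usable

# The next call therefore needs a new connection object
con_2 <- fresh_connection()
valid_new <- DBI::dbIsValid(con_2)
DBI::dbDisconnect(con_2)
```

Wrapping the connection details in one helper like this also avoids repeating the credentials in every call.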

## Name of tables in cached databases

Each database has a table for each language and type of content. For example, item information retrieved with tw_get(id = "Q180099", language = "en") will be stored in a table called tw_item_en.

When caching with a local SQLite database, each type of content (e.g. “item”, “property”, or “search”) and each language is stored in a separate file.

When caching with another database backend (e.g. MySQL), all tables are stored inside a single database. In either case, the name of the table is unique and is generated by tw_get_cache_table_name(). For example:

tw_get_cache_table_name(type = "item", language = "en")
#> [1] "tw_item_en"
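To see how table names vary with language, the same function can be called once per language. This is a small sketch; the language codes are arbitrary examples, and only the "en" result is confirmed by the output shown above.

```r
library(tidywikidatar)

# Example language codes, chosen only for illustration
languages <- c("en", "it", "de")

# Build the cache table name for item data in each language
table_names <- vapply(
  languages,
  function(current_language) {
    tw_get_cache_table_name(type = "item", language = current_language)
  },
  character(1)
)

table_names
```

Because the language is part of the table name, data cached in one language never overwrites data cached in another.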