calpassapi Tutorial

The calpassapi R package contains functions that help a user query data using CalPASS’s API.

Install Package

# From CRAN (Official)
## install.packages('calpassapi')

# From github (Development)
## devtools::install_github('vinhdizzo/calpassapi')

Load Packages

library(calpassapi)
library(dplyr) # Ease in manipulations with data frames

Setup

If the user does not want to expose their CalPASS API username and password in their R script, then it is recommended that the user specify their credentials in their .Renviron file in their home directory (execute Sys.getenv('HOME') in R to determine the R home directory) as follows:

cp_api_uid='my_username'
cp_api_pwd='my_password'

R will automatically load these environment variables at start up and the user will not have to specify username and password in calpass_get_token.

Obtain access token

First we need to authenticate with CalPASS using our credentials in order to obtain a token that will allow us to query data from the API.

cp_token <- calpass_get_token(username='my_cp_api_uid', password='my_cp_api_pwd', client_id='my_client_id', scope='my_scope')
# cp_token <- calpass_get_token(client_id='my_client_id', scope='my_scope') ## if cp_api_uid and cp_api_pwd are set in .Renviron

This token will be used in calpass_query and calpass_query_many in the token argument.

Create interSegmentKey’s for each student

To obtain information for a particular student, we need to convert the student’s first name, last name, gender, and birthdate into an interSegmentKey, a key that allows the API to look up a student.

# single
isk <- calpass_create_isk(first_name='Jane', last_name='Doe'
                 , gender='F', birthdate=20001231)
isk
## [1] "D04ADB733A5E2F09744F6BB581A712D634702A181A020ED59B083DCE412E788101EDBA6CAEE8F7A90DF20B56D7519CF1699D167891D182076E34A27F7166F672"
# multiple
firstname <- c('Tom', 'Jane', 'Jo')
lastname <- c('Ng', 'Doe', 'Smith')
gender <- c('Male', 'Female', 'X')
birthdate <- c(20001231, 19990101, 19981111)
df <- data.frame(firstname, lastname
               , gender, birthdate, stringsAsFactors=FALSE)
df <- df %>%
  mutate(isk=calpass_create_isk(first_name=firstname
                              , last_name=lastname
                              , gender=gender
                              , birthdate
                                ))
df$isk
## [1] "5DFEBA29144043D4B504BDAC0F4831DC7AE48C61F67164535E89D577FA46156089A96A34F8D82A7281CB035E339999AE38C3C159738D7FAA53E708F63E2C55A3"
## [2] "B99037E0B8AEBBE4C9F5AA0A7C67813D5628D940972D4BBC0980D1BFF587C8D0BC716784004693E087E9E6D9889B610D840EAA72285B64779A73C916CD1934FC"
## [3] "858C66FA1B48020F8CB8734A7596666DF607C91DFA6BA7325C86BD029BB44AAD0A3D0D64283D1F1C0AA14D68B4820B39889848D47C5B3A85A177426F80349824"

Query data from CalPASS

After we have the interSegmentKey’s, we can now query data from CalPASS.

## single
calpass_query(interSegmentKey=isk
            , token=cp_token, endpoint='transcript')
## multiple
dfResults <- calpass_query_many(interSegmentKey=df$isk
                         , token=cp_token
                         , endpoint='transcript'
                           )
## can specify credentials
dfResults <- calpass_query_many(interSegmentKey=df$isk
                         , endpoint='transcript'
                         , token_username='my_username'
                         , token_password='my_password'
                         , token_client_id='my_client_id'
                         , token_scope='my_scope'
                           )

Multiple batches (when there are many interSegmentKey’s)

The CALPASS API currently has a limit of 3200 calls per hour (3600 seconds). These are specified by default in the api_call_limit and limit_per_n_sec arguments in calpass_query_many. If the user has a need beyond these limits, or if the user would like to break the calls into batches, then the user should specify wait=TRUE, and calpass_query_many will break the calls into batches of api_call_limit calls.

## batches
dfResults <- calpass_query_many(interSegmentKey=df$isk
                         , token=cp_token
                         , endpoint='transcript'
                         , api_call_limit=2 ## batches of 2
                         , limit_per_n_sec=10 ## every 10 seconds
                         , wait=TRUE
                           )