Notes

Authenticating via DataOne API

Matt Jones provides a nice dataone package to handle accessing and posting data to DataOne nodes such as the KNB from R. Uploading files through the API requires authentication, which in turn requires a certificate from CILogin. As this process proved a bit tricky to me, I add some documentation here. I am using a Ubuntu (13.04) system, so the procedure might differ slightly for other systems.

  1. Go to https://cilogon.org/?skin=DataONE

  2. Select authentication credentials. I used Google as an OpenID provider.

  3. This asked me to download a file named shibCILaunchGSCA.jnlp. This is a java webstart application and is suppsed to launch in the browser; a feature many operating systems apparently disabled due to security concerns. One could reconfigure the browser settings to execute this file type (using javaws), or simply download and run javaws manually in the command line:

$ javaws shibCILaunchGSCA.jnlp

My process was slightly more troublesome as I had java-6-openjdk javaws installed as the default but needed the java-7 one to correspond to the rest of my setup… Also more complicated in that I downloaded shibCILaunchGSCA.jnlp twice and first tried javaws on the earlier download, which had become invalid when I created the second download. Words to the wise…

This should basically finish the authentication by installing a certificate in /tmp/x509up_u0, where the R package will look for it. If you already have a file of that name in /tmp it will use a different name and the R package may be confused. You can work around this (see #3) but better to use the default for simplicity.

At this point we can successfully upload data to DataOne nodes such as the KNB as illustrated in Matt’s tutorial.

Notes

This presupposes a working rJava package, which can be tricky on Linux systems given the plethora of different flavors of Java. Though rJava is provided as a Ubuntu binary (apt-get install r-cran-rjava) the version is rather behind. Stackoverflow has some useful hints.

ToDo

  • Should now be able to wrap the dataone API in reml in a similar way as we have done for figshare.
  • Might consider implementing the REST-based DataONE API calls directly through RCurl or httr to avoid the java issues. Will need to handle the certificate issues above regardless, though some steps can perhaps be wrapped into an R function…