# Notes

• writing manuscript
• slides
• data request to Glaser.

### gpdd

• Global Population Dynamics Database: initiated package rgpdd
• data extraction: No API, but the site does provide a direct dump as a Microsoft Access mdb database. For the moment I’m using mdbtools (mdb-tables and mdb-export) to convert each table to CSV and then reading those into R. The RODBC package installs but cannot read the database directly for me. Since the data will fit into working memory, is there any real reason to support database calls from R? Skipping them avoids extra dependencies. data.table should speed up internal computations.
• data compression. Looks like the raw data exceeds CRAN’s 5 MB limit, but as xz-compressed binaries it comes to under 0.4 MB. Compressing those from within R: `save(table_name, file="filename.rda", compress="xz")`. Looks like the database is essentially static, with updates every few years; see Updates page. Together this makes a good case for distributing the data directly in the package.
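
A minimal sketch of the extraction and compression steps above, not the actual rgpdd code. It assumes the mdbtools command-line programs are installed and on the PATH (the export utility ships as `mdb-export`). The function names `read_mdb_tables` and `save_tables_xz`, and the file name `gpdd.mdb` in the usage note, are hypothetical.

```r
# Sketch only: assumes mdbtools (mdb-tables, mdb-export) is on the PATH.

# Export every table in an Access .mdb file to CSV, then read each CSV into R.
# Returns a named list of data frames, one per table.
read_mdb_tables <- function(mdb_path, csv_dir = tempdir()) {
  # `mdb-tables -1` prints one table name per line
  tables <- system2("mdb-tables", c("-1", shQuote(mdb_path)), stdout = TRUE)
  tables <- tables[nzchar(tables)]
  out <- lapply(tables, function(tbl) {
    csv <- file.path(csv_dir, paste0(tbl, ".csv"))
    # `mdb-export <file> <table>` writes the table as CSV to stdout,
    # which is redirected to a file here
    system2("mdb-export", c(shQuote(mdb_path), shQuote(tbl)), stdout = csv)
    read.csv(csv, stringsAsFactors = FALSE)
  })
  names(out) <- tables
  out
}

# Save each data frame in a named list to its own xz-compressed .rda file.
# Returns the file paths invisibly.
save_tables_xz <- function(tables, rda_dir) {
  dir.create(rda_dir, showWarnings = FALSE, recursive = TRUE)
  env <- list2env(tables, envir = new.env())
  paths <- file.path(rda_dir, paste0(names(tables), ".rda"))
  for (i in seq_along(tables)) {
    save(list = names(tables)[i], envir = env,
         file = paths[i], compress = "xz")
  }
  invisible(paths)
}
```

Usage would look something like `tables <- read_mdb_tables("gpdd.mdb")` followed by `save_tables_xz(tables, "data")`. Comparing `file.size()` on the CSV and .rda files is a quick way to check the result against the CRAN limit.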

On distributing the data, note:

“Not all of the data in GPDD is in the public domain; some are provided under licence from the BTO. We require permission before we can distribute some of the data sets. The data in these datasets have been removed from the DATA table in this online version of GPDD, but the metadata in the other tables has been left in place.” https://www3.imperial.ac.uk/cpb/databases/gpdd/structure

Emailed database providers to confirm.

### Notebook infrastructure

• ugh, should implement a script to transform DOIs into links plus metadata. Probably knitcitations-based, but without needing the verbose `r citep("` notation. Pity markdown doesn’t support the rel attribute on links. Maybe possible already? https://drupal.org/node/421832 No luck there, though pandoc can avoid choking on this style of link attribute (https://github.com/jgm/pandoc/wiki/Pandoc-vs-Multimarkdown#image-and-link-attributes) when compiling with -f markdown+link_attributes. Unfortunately this doesn’t actually render the attributes; it just silently drops them.
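
A rough sketch of the link half of that idea, assuming bare DOIs appear inline in the markdown source. It does not fetch metadata and does not use knitcitations. The function name `doi_to_link` and the regular expression are my own assumptions, and the pattern is a heuristic rather than a full DOI grammar.

```r
# Sketch only: wrap bare DOIs in markdown links to the doi.org resolver.
doi_to_link <- function(text) {
  # "10.", a 4-9 digit registrant prefix, "/", then a suffix that must not
  # end in trailing punctuation. The lookbehind skips DOIs that already sit
  # inside a URL such as https://doi.org/10....
  pattern <- "(?<![/\\w.])(10\\.[0-9]{4,9}/[^[:space:]\"<>]*[^[:space:]\"<>.,;)])"
  gsub(pattern, "[doi:\\1](https://doi.org/\\1)", text, perl = TRUE)
}

doi_to_link("see 10.1000/xyz123, and more")
```

Running it twice over the same text would re-wrap the `doi:` label, so this is best applied once, to the raw source, before knitting.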