Exploring XML Schema in R
I danced around the room after reading these lines in the excellent XML and Web Technologies for Data Sciences with R book:
We can generate S4 class definitions corresponding to the data structures that are represented in the XML document. We can programmatically generate R code that will map an XML document adhering to this schema to corresponding R data structures. Similarly, we can automate the creation of code to serialize, or write, an R object of a particular type to a suitable XML format corresponding to the schema
Wow. Automatically turn EML objects into R objects and vice versa! Generate a NeXML parser for R automatically. The world is your oyster now!
Trouble so far.
As I suspected from just the description of the problem, there is a recursive circularity in the schema files - eml-documentation.xsd imports eml-text.xsd which imports eml-documentation.xsd. I’ll try to find some time to handle this.
This is a schema for phylogenetic trees. Being able to use this schema would be a big improvement over the current approach of our treebase package, which queries a vertically integrated database that is far less expressive then the NeXML schema.
I tried this schema, but cannot download the file for some reason:
nex <- readSchema("https://www.nexml.org/2009/nexml.xsd") failed to load HTTP resource Error: 1: failed to load HTTP resource >
I download a copy from here and set its xsd/ directory as my working directory:
> nex <- readSchema("nexml.xsd") Error: XML content does not seem to be XML: './continuous.xsd'
because the path should be
characters/continuous.xsd. Not sure why I’m not managing to navigate the paths correctly here.
KML example from book
I cannot load the KML schema directly from the website:
> kx = "https://developers.google.com/kml/schema/kml21.xsd" > kml = readSchema(kx) Error: XML content does not seem to be XML: 'https://developers.google.com/kml/schema/kml21.xsd'
kml = readSchema( "https://schemas.opengis.net/kml/2.2.0/ogckml22.xsd") failed to load HTTP resource Error: 1: failed to load HTTP resource
fails, though the other example works
pmml = readSchema("https://www.dmg.org/v4-1/pmml-4-1.xsd")
If I wget the xsd and save it locally first, then I can readSchema on the local file. If I have
options(error = recover) set, then
gets me deep into the debugger. If I run it with
options(error=NULL), then I get warnings:
> defineClasses(kml) character(0) Warning messages: 1: In (function (className, baseType, where = globalenv()) : https://earth.google.com/kml/2.1.Object is an existing class 2: In (function (className, baseType, where = globalenv()) : https://earth.google.com/kml/2.1.Feature is an existing class 3: In (function (className, baseType, where = globalenv()) : https://earth.google.com/kml/2.1.LookAt is an existing class 4: In (function (className, baseType, where = globalenv()) : https://earth.google.com/kml/2.1.Geometry is an existing class 5: In (function (className, baseType, where = globalenv()) : https://earth.google.com/kml/2.1.StyleSelector is an existing class 6: In (function (className, baseType, where = globalenv()) : https://earth.google.com/kml/2.1.TimePrimitive is an existing class >
I load an minimal example file and try fromXML, but hit an error:
> doc <- xmlParse("kmlexample.xml") > fromXML(doc) Error in UseMethod("xmlAttrs", node) : no applicable method for 'xmlAttrs' applied to an object of class "c('XMLInternalDocument', 'XMLAbstractDocument')"
Stay tuned for more from Duncan on getting this rolling..
Early warning signals in multivariate dynamics:
r citet("https://arxiv.org/abs/1306.3465", "discusses"). Population size grows as approaches collapse, multidemensional data needed to detect slowing down (eigenvalue estimates from multidimensional autoregressive model)
Little grants are more bang for the buck than big grants:
r citet("10.1371/journal.pone.0065263", "discusses")
From Egan, on getting a useful version of rvm on Ubuntu 12.04 (lets me install
bundle install on the lab-notebook gemfile so I can deploy from
RVM has to be installed locally because the packaged version is broken. Before installation, the following packages need to be installed globally (unless you have root access in which case the installer will take care of them):
bison libyaml-dev libgdbm-dev gawk sqlite
To install RVM, run:
\curl -L https://get.rvm.io | bash -s stable --ruby --autolibs=enable --auto-dotfiles
which takes a while because it downloads and builds a lot. The installer takes care of the configuration automatically and all you need to do after installation is log into a fresh shell to source the required functions and environment.