I’d like to define a class that acts just like a
data.frame, just like the
data.table class does, but contains some additional metadata (e.g. the units associated with the columns) and has some additional methods associated with it (e.g. that might do something with those units) while also working with any function that simply knows how to handle
How might this be done?
I’m not really sure where to start on this, so below is a summary of my attempt so far and some further puzzles I have stumbled across (e.g. is a
data.frame and S4 object?). This seems to work, but might not be the best way to go about it.
If I understood what the data structure underlying
data.frame was, I might have a better idea how to extend it. But I don’t know what makes a
data.frame, as the exploration below will illustrate:
For instance, it’s not clear if I should try writing my extended object as an S3 or an S4 object that inherits
data.frame. When is a
data.frame an S4 object and when is it an S3 object?
A <- new("data.frame") isS4(A)
TRUE, suggesting that data.frames are S4 objects and that we might start with:
setClass("myclass", contains="data.frame"). However,
B <- data.frame(x = rnorm(10)) isS4(B)
FALSE, suggesting that
data.frames are S3 classes, and perhaps I should consider starting with
attr(B, “class”) <- c(“myclass”, “data.frame”)
It appears that a data.frame is built upon a list, since
is.list(data.frame(x = rnorm(10)))
FALSE, we find that the S4 method
> slotNames(B)  ".Data" "names" "row.names" ".S3Class"
data.frame is an S4 object, with a (unnamed?) list in
.Data and nonempty values in
names, along with the slot “.S3Class” containing the string “data.frame”.
B@namesreturns an error, while
Bis S3, with some
B@.Datareturns an unnamed length-1 list containing a numeric vector with our 10 random values. (and illustrating a working
From this exploration, I conclude that I can promote a
list to a
data.frame by providing an unnamed list, editing the “class” attribute, and then assigning values to the slots
obj <- list(rnorm(10)) attr(obj, "class") <- "data.frame" slot(obj, "names") <- "x" slot(obj, "row.names") <- 1:10
obj acts like a proper
data.frame (at least it prints like one).
This suggests that I could extend
data.frame by adding a new slot for my units, like so:
setClass(“myclass”, representation(units = “character”), contains=“data.frame”)
But this does not inherit the slotNames
.S3Class. I try assigning these explicitly,
setClass("myclass", representation(units = "character", names = "character", row.names = "integer", .S3Class = "character"), contains = "list") obj <- new("myclass", names="x", row.names=1:10, units = "degrees Celsius") obj@.Data <- list(rnorm(10)) obj@.S3Class <- "data.frame"
but this new object is not a
data.frame. Finally we add
attr(obj, "class") <- "data.frame"
and we have a successful extension that acts like a data frame. Some weird things of note: It appears that .S3Class needs to be assigned as above with
@, and not as a named argument in
new or as part of the
This last step is almost certainly not what we really want to do, since it destroys our new class assigment (
myclass). We really want the object to have two classes,
data.frame. However, if we do
attr(obj, "class") <- c("myclass", "data.frame"), then the object does not behave as a
data.frame at all.
One might thing that the solution would be to then define all the S3 methods as copies of those avialable for data.frames, e.g. define
print.myclass <- print.data.frame. Unfortunately this doesn’t quite work (appears not to get
row.names and hence prints column headings but not data. Defining the S4 method for
myclass signature doesn’t fix this either…).