The 'gtfs' class #12
Hi @dhersz! Thanks again so much for your brilliant work on this package. I am finally able to ditch all former (Kinda technical reasons why, and they're not even directly important for now, because R can only implement method dispatch on the first listed class entry, but that will hopefully change sometime relatively soon, after which time things like this will matter. So you can also see it as future-proofing, if nothing else.)
Hello all.
In this thread I'll discuss the `gtfs` class. I don't have extensive experience with S3 classes and methods, so I tried to stick with the recommendations made by Hadley in the S3 chapter of his Advanced R book. I'll provide some notes on where I diverged from them, and why.

According to Hadley, it's good practice to create a constructor, a validator and a helper for an S3 class. The constructor function is meant to be used by people with more experience with the objects in question, so it doesn't include that many checks (thorough checks are included in the validator function, which I'll show later). The constructor for the `gtfs` class is `new_gtfs()`. It receives a list and assigns the `gtfs` class to it, as well as any other attributes you may also want to include.

It also receives the `subclass` argument, which is used to specify a "more specific" class, while still inheriting from `gtfs`.

Why is that useful? Because now we can start differentiating the objects created in our packages. For example, it might be very useful to differentiate objects created by
`{gtfstools}` and `{tidytransit}`, because data.table syntax doesn't work on tibbles. Then we can do something like:

As you can see, a `gtfs` object may be composed of either `data.table`s or `tibble`s; it really doesn't matter. `new_gtfs()` will only raise an error when:

- the input is not a `typeof()` list;
- `subclass` is not a character vector.

As you can see, these are not very thorough checks. That's because, again, this function is meant to be used by us, package maintainers, and not end users. A basic use case, for example, is `import_gtfs()`, which does a lot of checks itself, so a situation like any of the above is highly unlikely to occur. Another good use, for example, would be inside our own custom GTFS reading functions. See an example workflow below:

More complicated checks are made in the validator function, called
`assert_gtfs()` (here is where I first diverge from Hadley's suggestions, because he suggests that these functions should be named `validate_...()`, but as you can see in the code above, `{gtfstools}` (and `{tidytransit}` as well) already includes a `validate_gtfs()` function, so I thought it would be wiser to prevent yet another name clash).

So what does this `assert_gtfs()` check?

- that the elements inherit from `data.frame`;
- whether a `.` element is present. If it is, then it must be a list;
- that the elements of the `.` sub-list are also named and inherit from `data.frame`.

The content of these `data.frame`s is never checked. If all these checks are successful, then `assert_gtfs(x)` invisibly returns `x`.

OK, so where should this validator function be used? Hadley recommends using it inside helper functions. Helper functions are functions that end users should use to create these objects. So, for example, `tibble::tibble()` is a helper, while `tibble::new_tibble()` is a constructor. So it feels very natural to create a helper called `gtfs()`.

But, in fact, I didn't lol. This is because I feel like the `gtfs()` function would not be that useful. Instead, I think that each of our own packages should include helper functions that are specific to our needs. So for example, `{gtfstools}` could very well benefit from a `dt_gtfs()` function that might resemble something like:

And the user would use it somewhat like this: `dt_gtfs(shapes = data.table(.....), trips = data.table(.....), *andsoon*)`. What do you think of this? Should `{gtfsio}` include a helper function? Or do you see it as a better fit for our packages?

Methods
The `gtfs` class has a few, though not many, custom methods. These can be found in the `gtfs_*.R` files.

`print.gtfs()` is very similar to `print.default()`, but it doesn't print the `class` attribute.

But don't worry. `print.gtfs()` preserves `print.default()`'s property of invisibly returning the very same object passed to it.

Please note that any other attributes will not be stripped from the print. For example:
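
As a rough illustration of that behaviour, here is a minimal sketch. It assumes a `print.gtfs()` that simply drops the class before delegating to the default printer; the `feed` object and the value of its `validation_result` attribute are made up for the example, and the real method may differ in detail.

```r
# Hypothetical print method: hide the class attribute, keep everything else.
print.gtfs <- function(x, ...) {
  print(unclass(x), ...)  # unclass() removes only the "class" attribute
  invisible(x)            # hand back the untouched object, invisibly
}

# Illustrative object with one table and one custom attribute.
feed <- structure(
  list(stops = data.frame(stop_id = c("s1", "s2"))),
  class = "gtfs",
  validation_result = "placeholder validation summary"  # made-up value
)

# Prints the stops table and the validation_result attribute,
# but no class attribute.
returned <- print(feed)

# The returned value is the very same object, class included.
identical(returned, feed)
```

With this sketch, the custom attribute shows up at the end of the printed output, while the class itself is still intact on `returned`.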
This preserves the default behaviour of `{tidytransit}` and `{gtfstools}` today of always printing the `validation_result` alongside the gtfs object itself. I'm not sure if `{gtfsrouter}` and `{gtfs2gps}` use any other custom attributes as well.

To be honest I'm not sure if I like the current `{gtfstools}` behaviour of always printing the validation result object, and I'd very happily change the printing method to hide any custom attributes. If you agree with this please let me know and I'll change it. Otherwise I'll probably change it directly on `{gtfstools}`.

`{gtfsio}` also includes custom methods for subsetting GTFS objects. The first, which was actually shown above, is that subsetting it with `[` preserves the `gtfs` class (`[.default` would strip the class from the object).

The `$` and `[[` operators were also tweaked to raise warnings when attempting to subset missing elements (this warning aside, they behave exactly like the default operators).

Perhaps an error would be more useful here? I tried sticking to default behaviour as much as possible, but maybe an error is more telling here than a warning.
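
To make the subsetting behaviour described above concrete, here is a minimal sketch of what such methods could look like. It is not the package's actual source: the warning text is invented, and unlike the default `$` this version does no partial name matching.

```r
# Hypothetical `[` method: subset like a plain list, then restore the class.
`[.gtfs` <- function(x, i) {
  out <- NextMethod()
  class(out) <- class(x)
  out
}

# Hypothetical `[[` method: warn when the requested element is missing.
`[[.gtfs` <- function(x, i) {
  value <- .subset2(x, i)  # internal `[[` that skips S3 dispatch
  if (is.null(value)) {
    warning("This GTFS object doesn't contain a '", i, "' element.", call. = FALSE)
  }
  value
}

# Hypothetical `$` method: same check, reusing the `[[` sketch above.
`$.gtfs` <- function(x, name) `[[.gtfs`(x, name)

feed <- structure(
  list(stops = data.frame(stop_id = "s1"), trips = data.frame(trip_id = "t1")),
  class = c("dt_gtfs", "gtfs")  # "dt_gtfs" is an illustrative subclass name
)

class(feed["stops"])  # still c("dt_gtfs", "gtfs")
feed$stops            # no warning
```

Switching from a warning to an error would only require replacing `warning()` with `stop()` in the `[[` sketch.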
I also think it might be useful to include some custom methods for `[<-`, `$<-` and `[[<-` to prevent non-`data.frame`-like objects from being assigned to the GTFS object (exception: `.` sub-lists), but I haven't done it. What do you think?

Sorry for the long post, but I wanted to be very thorough here. Cheers!
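
For the assignment-method idea, one possible shape is sketched below. Nothing here exists in the package: `check_gtfs_value()` is a hypothetical helper, the error message is invented, and only `[[<-` and `$<-` are covered (`[<-` would need the same check applied to each assigned element).

```r
# Hypothetical helper: accept data.frame-like values, NULL (which removes an
# element), or a list when the target is the `.` sub-list.
check_gtfs_value <- function(name, value) {
  ok <- is.null(value) ||
    inherits(value, "data.frame") ||
    (identical(name, ".") && is.list(value))
  if (!ok) {
    stop("Only data.frame-like objects can be assigned to '", name, "'.", call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical `[[<-` method: validate, assign on the bare list, restore class.
`[[<-.gtfs` <- function(x, i, value) {
  check_gtfs_value(i, value)
  cls <- class(x)
  x <- unclass(x)
  x[[i]] <- value
  class(x) <- cls
  x
}

# Hypothetical `$<-` method: same behaviour as the `[[<-` sketch.
`$<-.gtfs` <- function(x, name, value) `[[<-.gtfs`(x, name, value)

feed <- structure(list(stops = data.frame(stop_id = "s1")), class = "gtfs")
feed$trips <- data.frame(trip_id = "t1")             # accepted
feed[["."]] <- list(extra = data.frame(note = "x"))  # accepted: `.` sub-list
# feed$shapes <- 1:3                                 # would raise an error
```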