Guide and materials on teaching R through R-Instat #19

Open
rdstern opened this issue Jul 16, 2018 · 4 comments

@rdstern
Contributor

rdstern commented Jul 16, 2018

With the forthcoming course for AIMS students in Cameroon, one aspect is that these students will later proceed to using R with RStudio. We wonder whether R-Instat can also be used to introduce R itself.

There are different aspects. Here I introduce one small possible item.
The Climatic > Prepare > Transform dialogue makes it easy to get running totals or means.

Here is the code in the script window:

# Code generated by the dialog, Transform

grouping <- instat_calculation$new(type="by", calculated_from=list("Guinee2"="year","Guinee2"="Station"))
transform_calculation <- instat_calculation$new(type="calculation", function_exp="zoo::rollapply(Rain, width=3, FUN=mean, fill=NA, align='right')", result_name="moving_mean", manipulations=list(grouping), save=2)
InstatDataObject$run_instat_calculation(calc=transform_calculation, display=FALSE)
rm(list=c("grouping", "transform_calculation"))

Currently this is OK for the mean, but not for the median. Editing the code by hand to use the median gives the following:

# Code generated by the dialog, Transform

grouping <- instat_calculation$new(type="by", calculated_from=list("Guinee2"="year","Guinee2"="Station"))
transform_calculation <- instat_calculation$new(type="calculation", function_exp="zoo::rollapply(Rain, width=3, FUN=median, fill=NA, align='right')", result_name="moving_med", manipulations=list(grouping), save=2)
InstatDataObject$run_instat_calculation(calc=transform_calculation, display=FALSE)
rm(list=c("grouping", "transform_calculation"))

More ambitiously, we define our own function and use that:
# Code generated by the dialog, Transform

grouping <- instat_calculation$new(type="by", calculated_from=list("Guinee2"="year","Guinee2"="Station"))
# Named mean_minus_median rather than diff so that it does not mask base::diff()
mean_minus_median <- function(x) {mean(x)-median(x)}
transform_calculation <- instat_calculation$new(type="calculation", function_exp="zoo::rollapply(Rain, width=3, FUN=mean_minus_median, fill=NA, align='right')", result_name="moving_diff", manipulations=list(grouping), save=2)
InstatDataObject$run_instat_calculation(calc=transform_calculation, display=FALSE)
rm(list=c("grouping", "transform_calculation", "mean_minus_median"))

I wonder about a more ambitious change and also how this might be done using RStudio. I am assuming this simple sort of edit might be useful in the script window and we would do other more ambitious analyses using RStudio.
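
For comparison, here is a minimal sketch of how the same three moving statistics might be written as ordinary R in RStudio, without the instat object. It assumes the dplyr and zoo packages are installed. The `guinee2` data frame is a small invented stand-in for the Guinee2 data, and `roll3` and `mean_minus_median` are hypothetical helper names, not part of R-Instat.

```r
# Plain-R sketch of the grouped moving statistics (assumes dplyr and zoo are installed).
library(dplyr)
library(zoo)

# Invented stand-in for the Guinee2 data: two stations, two years, four rows each.
guinee2 <- data.frame(
  Station = rep(c("A", "B"), each = 8),
  year    = rep(rep(c(2001, 2002), each = 4), times = 2),
  Rain    = c(0, 3, 6, 0, 1, 1, 10, 1, 2, 2, 2, 8, 0, 0, 9, 3)
)

# User-defined summary, named so that it does not mask base::diff().
mean_minus_median <- function(x) mean(x) - median(x)

# Hypothetical helper: right-aligned moving statistic over 3 rows,
# with NA wherever the window is incomplete.
roll3 <- function(x, f) {
  zoo::rollapply(x, width = 3, FUN = f, fill = NA, align = "right")
}

# Group by year and Station, as the "by" calculation does, then add the new columns.
result <- guinee2 %>%
  group_by(year, Station) %>%
  mutate(
    moving_mean = roll3(Rain, mean),
    moving_med  = roll3(Rain, median),
    moving_diff = roll3(Rain, mean_minus_median)
  ) %>%
  ungroup()

print(result)
```

Unlike the R-Instat calculation, this version returns the result as a data frame that can be printed or viewed directly, and the windows restart within each year and station because of the `group_by()` step.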

One aspect is that it is easy (I think) to move from R-Instat to RStudio, but it is not so obvious to me that this is useful when the edit is small and we then want to continue with R-Instat. So Prepare-type tasks might often stay in R-Instat, while Describe and Model tasks would be helped by migration.

@rdstern
Contributor Author

rdstern commented Jul 17, 2018

This follow-up is different. I found this blog very useful and it complements what we will try to do through R-Instat. In parallel with using R-Instat it would be good if students learned to use R directly. Here are 5 ways that are suggested. I propose to investigate these further, with the idea that students could (to some extent) choose for themselves.

These 5 ways are:
a) YouTube videos
b) Blogs
c) Online courses
d) Books
e) Experiment

Within each approach is a small list of suggestions.

Different people might use a different mix of these approaches, depending on their context. And, of course, they are not mutually exclusive.

@dannyparsons
Contributor

Making small edits in the script window is something I think the AIMS students can cope with if needed. I wouldn't suggest doing anything more with that example, like moving to RStudio, because it's not general R code, so it won't help them learn R and may confuse them about R code. There's no output from running a calculation either, so you then need to know how the whole instat object works to even see the result in the data.

If it was a graph that's different because the code will be almost standard ggplot2 code and there I can see that moving to RStudio is useful as there's lots you can do with the script and it's all useful to know in general.
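
As an illustration of the kind of script meant here, this is a minimal sketch of a standalone ggplot2 script. It is not actual R-Instat output: the `rain` data frame and its column names are invented for the example, and it assumes only that ggplot2 is installed.

```r
# Standalone ggplot2 sketch (assumes ggplot2 is installed).
library(ggplot2)

# Invented data standing in for an imported data set.
rain <- data.frame(
  year    = rep(2001:2005, times = 2),
  Station = rep(c("A", "B"), each = 5),
  total   = c(900, 1100, 950, 1200, 1000, 700, 650, 800, 720, 760)
)

# One line and one set of points per station.
p <- ggplot(rain, aes(x = year, y = total, colour = Station)) +
  geom_line() +
  geom_point() +
  labs(x = "Year", y = "Annual rainfall (mm)", title = "Annual rainfall by station")

print(p)
```

Each `+` adds a layer or a setting, so students can change the geom, the labels or the colours one piece at a time and rerun the script to see the effect.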

In workshops I have imported data, done a graph and then moved the log to RStudio. After removing the instat object lines, which aren't needed, you end up with a nice R script very quickly, and people could understand that R code and see how easy it was to generate. The data manipulation is more tricky because we have our own functions wrapped around dplyr code, so there's not much you can really learn from R-Instat that would be useful in writing your own R.

@rdstern
Copy link
Contributor Author

rdstern commented Jul 17, 2018

Thanks. That's very useful. I think I am now getting there with ideas of the materials to produce.

This is currently pretty badly written, but I think has some of the key ideas.

David phrased it roughly like this: a statistician-type (or someone who wants to solve statistical problems) could usefully have mastered three tools, namely a spreadsheet, a statistical package and a statistical language. This is the supposition that we make here.

Not everyone needs all three components. Some problems can be solved with just a spreadsheet and some users are just familiar with Excel for their statistical work.

Many people use a statistical package, but not the language. This is not specific to R. For example, any of SPSS, SAS or Stata can be used simply as a menu-driven package. If more is needed, each can also be used in command mode, i.e. through its language.

In the past AIMS students have used R - the language - for their statistical work. This year we are adding R-Instat. This is a package that uses R commands behind the scenes. Later in the year we expect students to migrate to using the R language directly. We expect that some will then find they no longer need R-Instat, while others may wish to continue with both tools: R-Instat as the package, and RStudio (which makes the R language pretty easy) as the environment for the R language.

To help you to migrate, we will also introduce some use of R commands while using R-Instat. This is partly through this guide, partly through a series of videos, and also through your own practice.

More to come

@dannyparsons
Contributor

On your list of resources to learn R, I like it in principle, but you can imagine students might then spend forever watching YouTube tutorials and reading blogs about R code and never actually get to point e), or, if they did, not really know how they should "experiment" if they're less used to self-guided work.
I think I would present one or two concrete starting points for them: one or two books or online courses where they will start using R code straight away, and then suggest other resources to complement this, like the tutorials and blogs.
I would also ask Sam for his book recommendations. I remember he has a few favourites, and it would be good to know how he compares them, e.g. to the R for Data Science book.
