Strange behaviour with attach() in preExec #65

Open
ShimantoRahman opened this issue Apr 13, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@ShimantoRahman
Collaborator

In the exercises for Statistical Modeling and Data Mining, when data is loaded we also tend to attach the dataframe. Students can then choose between the traditional syntax for referring to columns (df$col) and the attached names directly (col). Today, while testing several exercises, we came across an issue we had not encountered before. When a dataframe is attached in the preExec argument, and the submitted code refers to a column with the traditional syntax while the context uses the attached syntax, the Dodona environment throws an error that the column cannot be found.

An example of this issue can be found in the exercise Logistic Regression 9 (code repository).
In this exercise the Smarket dataframe is loaded from the ISLR package and attached. One of the columns in this dataframe is Year, which contains, as an integer, the year in which each observation was recorded. One part of the exercise is creating a boolean vector "train" whose values are true if the year was before 2005.

If this code is run:

train = Year < 2005

no issues are found, but if we attempt to use the traditional syntax:

train = Smarket$Year < 2005

Dodona throws the compilation error: "Error while evaluating context: object 'Year' not found".

This is the code in the preExec argument:

library(ISLR)
data(Smarket)
attach(Smarket)

This is the code that is run in the context, before any tests are evaluated:

train = (Year < 2005)

glm.fit = glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
                      data = Smarket, family = binomial, subset = train)

This issue also occurs in other exercises with different dataframes, loaded in from different packages.

Other examples:

I am unsure what exactly causes this behaviour, since both syntaxes should behave the same way. The last time I worked with the Dodona platform and the R judge (first week of February), this issue did not occur.
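For reference, a minimal sketch of the expected behaviour in a plain R session, outside Dodona. It uses a small hypothetical stand-in data frame (smarket_demo) instead of Smarket so that it runs without the ISLR package. It only shows that the two syntaxes agree in ordinary R and does not reproduce or explain the judge's behaviour.

```r
# Hypothetical stand-in for Smarket, so the sketch runs without the ISLR package.
smarket_demo <- data.frame(Year = c(2003L, 2004L, 2005L, 2006L))

attach(smarket_demo)  # puts the columns on the search path

train_attached <- Year < 2005               # attached syntax
train_dollar   <- smarket_demo$Year < 2005  # traditional syntax

# In a plain R session both syntaxes give the same logical vector.
same_result <- identical(train_attached, train_dollar)

# attach() adds an entry named after the data frame to the search path.
on_search_path <- "smarket_demo" %in% search()

detach(smarket_demo)  # clean up the search path again
```

If same_result is TRUE locally but the error still appears on Dodona, the difference presumably lies in how the judge evaluates the submission and the context rather than in the exercise code itself.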

@ShimantoRahman ShimantoRahman added the bug Something isn't working label Apr 13, 2021
@ShimantoRahman
Collaborator Author

I have tried to isolate the behaviour in the test exercise Test Attach (repo). In this exercise the Auto data set (from the ISLR package) is loaded and attached in the preExec argument. The exercise performs an operation on a column of that data set, in this case:

a <- year < 75

year is a column in the Auto data set.

In the context, the same code is run as in the exercise.

context({

  a <- year < 75

  testcase('Is the value of a correct:', {
    testEqual("", function(env) {env$a}, a)
  })

}, preExec = {
  library(ISLR)
  data(Auto)
  attach(Auto)
})

In this example, both the code submitted on Dodona and the code within the context use the attach notation, where the column is referred to directly. This leads to no issues and the behaviour is normal. However, when the submitted code on Dodona is changed from a <- year < 75 to a <- Auto$year < 75, the error "Error while evaluating context: object 'year' not found" is thrown when attempting to submit the code.

So far I have found two ways to circumvent this issue.

  • First, avoid using attach notation within the context.

If the context is changed to:

context({

  a <- Auto$year < 75

  testcase('Is the value of a correct:', {
    testEqual("", function(env) {env$a}, a)
  })

}, preExec = {
  library(ISLR)
  data(Auto)
  attach(Auto)
})

No errors are thrown when the code a <- year < 75 or a <- Auto$year < 75 is submitted.

  • Second, add a second attach statement before any columns are referred to.

If the context is changed to:

context({

  attach(Auto)
  a <- year < 75

  testcase('Is the value of a correct:', {
    testEqual("", function(env) {env$a}, a)
  })

}, preExec = {
  library(ISLR)
  data(Auto)
  attach(Auto)
})

No errors are thrown when the code a <- year < 75 or a <- Auto$year < 75 is submitted.
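In a plain R session, calling attach() a second time on the same data frame leaves two copies on the search path. A possible variant of the second workaround is a guarded attach that only attaches when the data frame is not already listed in search(). The sketch below is an assumption-laden illustration: attach_once and Auto_demo are hypothetical names (Auto_demo stands in for Auto so the code runs without ISLR), and it has not been verified against the R judge.

```r
# Hypothetical stand-in for Auto, so the sketch runs without the ISLR package.
Auto_demo <- data.frame(year = c(70L, 74L, 76L, 82L))

# Hypothetical helper: attach the data frame called `name` only if no entry
# with that name is on the search path yet. Returns TRUE if it attached.
attach_once <- function(name, envir = parent.frame()) {
  if (!(name %in% search())) {
    attach(get(name, envir = envir), name = name)
    return(TRUE)
  }
  FALSE
}

first_call  <- attach_once("Auto_demo")  # attaches the data frame
second_call <- attach_once("Auto_demo")  # does nothing, already attached

# Simple diagnostics that could also be run inside a context while debugging.
n_copies     <- sum(search() == "Auto_demo")
year_visible <- exists("year")

a <- year < 75

detach("Auto_demo", character.only = TRUE)  # clean up
```

The same two diagnostics (search() and exists("year")) could help show whether the attached data frame is still on the search path at the point where the context fails.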

@driesbenoit
Collaborator

We are still experiencing problems with this issue. It is not clear (to me) why we are experiencing problems. Would be great if someone (Charlotte?) could have a closer look at this with us. Thx
