LAB - 06 #11

Open
sunaynagoel opened this issue Apr 22, 2020 · 18 comments

@sunaynagoel

Just noticed that the link for Lab 06 is not working.
~Nina

@lecy
Contributor

lecy commented Apr 23, 2020

It's live! And some code has been added with an example.

@sunaynagoel
Author

@lecy The lecture video and code script were very helpful. Thanks!

@lepp12

lepp12 commented Apr 25, 2020

data <- read.csv("logit-lab.csv", stringsAsFactors=F)

This data source doesn't seem to be working. Is there somewhere else I should be looking for the data for Lab 6?

@katiegentry07

katiegentry07 commented Apr 25, 2020

@lecy @lepp12 here's the link to the raw data:
https://github.com/DS4PS/pe4ps-textbook/blob/master/labs/DATA/logit-lab.csv

To load it into your RMD file, use this as the URL: "https://raw.githubusercontent.com/DS4PS/pe4ps-textbook/master/labs/DATA/logit-lab.csv"

@lepp12

lepp12 commented Apr 25, 2020

@katiegentry07 Thank you! Where did I miss this?

@katiegentry07

I went and found the raw data on Jesse's website because I was also working on the lab.
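
The two URLs in this exchange point at the same file, but only the raw.githubusercontent.com one returns plain CSV; the /blob/ page is HTML. A minimal sketch of the difference follows. The helper `to_raw_url()` is hypothetical (it is not part of the lab), and the `read.csv()` call is commented out because it needs an internet connection.

```r
# The "blob" address is a web page; read.csv() needs the raw file instead.
blob_url <- "https://github.com/DS4PS/pe4ps-textbook/blob/master/labs/DATA/logit-lab.csv"

# Hypothetical helper (an assumption, not from the lab): rewrite a GitHub
# blob URL into the matching raw URL.
to_raw_url <- function(url) {
  url <- sub("^https://github.com/", "https://raw.githubusercontent.com/", url)
  sub("/blob/", "/", url, fixed = TRUE)
}

raw_url <- to_raw_url(blob_url)
print(raw_url)

# Requires network access, so it is left commented out here:
# data <- read.csv(raw_url, stringsAsFactors = FALSE)
```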

@lecy
Contributor

lecy commented Apr 25, 2020

Thanks for catching that, @katiegentry07! The lab is updated.

I also added some code to demonstrate how to create the probability plots, if you are interested:

[Image: probability plots]

@katiegentry07

@lecy I'm struggling with 3c given the error.

Error in `$<-.data.frame`(`*tmp*`, Adoption, value = c(`1` = 0.356634996972544, : replacement has 49 rows, data has 1

Here's my code:

data2 <- with( data, 
               data.frame( Democrats = mean(Democrats), 
                           Evangelics = mean(Evangelics), 
                           Catholics = mean(Catholics),
                           Media = mean(Media),
                           Merck = mean(Merck)))

data2$Adoption <- predict( log, newdata = data2, type = "response" )
data2$Adoption

@lecy
Contributor

lecy commented Apr 25, 2020

Try it again. The problem was the row numbers were stored in a variable called "X", so when you create data2 it imputes a new dataset with the means of the study variables repeated 50 times, one for each row.

The predict() function uses only the model variables, so it creates one value, but the dataset has 50 rows, so you are getting a dimension incongruity error.

If you drop X from data2 it will work. I removed it from the dataset on GitHub so you can also just reload and try again.

Note that for a single value you don't need to assign the solution to a new dataset. This would also work:

predict( log, newdata = data2, type = "response" )
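
For reference, a minimal sketch of the "predict at the means" step on simulated data. The data, the coefficients used to simulate it, and the object names `dat`, `m`, `means`, and `p_hat` are all assumptions made for illustration; only two of the lab's predictors are included to keep it short.

```r
set.seed(123)
n <- 49

# Simulated stand-in for the lab data, including a row-number column X
dat <- data.frame(
  X         = 1:n,
  Democrats = runif(n, 30, 70),
  Media     = runif(n, 0, 10)
)
dat$Adoption <- rbinom(n, 1, plogis(-3 + 0.04 * dat$Democrats + 0.2 * dat$Media))

# Variable names go directly in the formula; the data frame goes in data =
m <- glm(Adoption ~ Democrats + Media, data = dat, family = "binomial")

# One row holding the mean of each predictor; X is left out on purpose
means <- data.frame(Democrats = mean(dat$Democrats), Media = mean(dat$Media))

# A one-row newdata should give back exactly one predicted probability
p_hat <- predict(m, newdata = means, type = "response")
print(length(p_hat))
```

If the length printed here is the number of rows in the original data rather than 1, the model is not picking up `newdata`, and the assignment back into a one-row data frame will fail.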

@katiegentry07

@lecy it is still producing 49 rows of data when it should only produce 1, correct?

@lecy
Contributor

lecy commented Apr 26, 2020

Re-load the dataset, or drop the column X from your data2

@lecy
Contributor

lecy commented Apr 26, 2020

@katiegentry07 Did that work?

@katiegentry07

@lecy I ended up having to recode the whole lab, but then it worked. Thanks!

@lepp12

lepp12 commented Apr 29, 2020

@lecy In question 4, I keep getting the error: Error in model[["call"]] : object of type 'special' is not subsettable

Here is my code: margins(log, at = list(Media = mean(Media)), variable = "Media")

@lecy
Contributor

lecy commented Apr 29, 2020

@lepp12 Can you give me a little more context with your code? Which question specifically, and what is your model?

You can reference variables directly without the dat$ reference in the with() function. I'm not sure you can in the margins() function. You might try:

mean.media <- mean(Media)
margins( log, at = list( Media = mean.media), variable = "Media" )

Without the question and the model I can't test to see if that's the problem, though.

@lepp12

lepp12 commented Apr 29, 2020

@lecy This is for question 4a attempting to run the following model:

logit <- glm(data$Adoption ~ data$Democrats + data$Evangelics + data$Catholics + data$Media + data$Merck, data = data, family = "binomial")

I saw I was referencing the wrong model in the margins() function and have since changed it to:

margins(logit, at = list(Media = mean(Media)), variable = "Media")

but now I get the error:

Error in names(classes) <- clean_terms(names(classes)) : 'names' attribute [10] must be the same length as the vector [5]

I had tried to use data$Margins, but that led to other errors as well.

@lecy
Contributor

lecy commented Apr 29, 2020

I am wondering if you are confusing margins() with the names in your model.

Note that some functions allow you to reference a data frame and then call variable names directly. Pretty much all dplyr functions work this way (the first argument is the data frame object, and everything after it is a variable name with no quotes and no dat$).

Others, like plot(), always require you to reference vectors by their data.frame + column.name convention (unless you have atomized vectors).

If you aren't paying attention to which one it is, you can get muddled object names like this:

> x <- 1:10
> y <- 1:10
> 
> dat <- data.frame( x, y )
> names( dat )
[1] "x" "y"
> 
> d2 <- data.frame( dat$x, dat$y )
> names( d2 )
[1] "dat.x" "dat.y"

Note that the lm() and glm() functions are like dplyr functions - you reference the data frame (at the end) and can reference variable names directly inside the model formula.

logit <- glm( Adoption ~ Democrats + Evangelics + Catholics + 
              Media + Merck, data = data, family = "binomial" )
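
A small sketch of what the two formula styles do to a fitted model, using simulated data. The objects `dat`, `bad`, `good`, and `new_row` are made up for illustration. The coefficient estimates are the same either way, but the term names differ, and `predict()` can no longer match the terms to columns in `newdata`.

```r
set.seed(1)
dat <- data.frame(x = rnorm(30))
dat$y <- rbinom(30, 1, plogis(dat$x))

# Same model, two ways of naming the variables
bad  <- glm(dat$y ~ dat$x, data = dat, family = "binomial")
good <- glm(y ~ x, data = dat, family = "binomial")

print(names(coef(bad)))   # the term is literally called "dat$x"
print(names(coef(good)))  # the term is called "x"

new_row <- data.frame(x = 0)

# 'good' finds x in new_row and returns one prediction
n_good <- length(predict(good, newdata = new_row, type = "response"))

# 'bad' looks for dat$x, ignores new_row, and returns one value per
# original row (R warns about this; the warning is silenced here)
n_bad <- length(suppressWarnings(predict(bad, newdata = new_row, type = "response")))

print(c(n_good, n_bad))
```

This is one plausible route to a "replacement has N rows, data has 1" mismatch, though the thread does not show enough to say it was the cause in any particular case.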

@lecy
Contributor

lecy commented Apr 29, 2020

I had the general problem correct, but the specifics wrong.

The margins() function melds both conventions, which makes it hard to remember which goes where.

# your syntax:
# margins( logit, at = list( Media = mean(Media) ), variable = "Media" )

# mean(Media) should be mean( data$Media ):
margins( logit, at = list( Media = mean(data$Media) ), variable = "Media" )
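
For intuition about what a marginal effect "at the mean of Media" measures, here is a hand-rolled sketch in base R on simulated data. It is an illustration under stated assumptions, not the margins package's implementation: the data, the two-predictor model, and the names `dat`, `at_mean`, and `ame_media` are invented for the example. It uses the logit identity dP/dx = beta * p * (1 - p).

```r
set.seed(42)
n <- 200

# Simulated stand-in for the lab data (two predictors only)
dat <- data.frame(Media = rnorm(n), Merck = rnorm(n))
dat$Adoption <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$Media + 0.3 * dat$Merck))

logit <- glm(Adoption ~ Media + Merck, data = dat, family = "binomial")

# Hold Media at its sample mean in every row; Merck keeps its observed values
at_mean <- dat
at_mean$Media <- mean(dat$Media)

# Predicted probability for each row with Media fixed at its mean
p <- predict(logit, newdata = at_mean, type = "response")

# Logit marginal effect of Media, averaged over rows
ame_media <- mean(coef(logit)[["Media"]] * p * (1 - p))
print(ame_media)
```

Note that `mean(dat$Media)` is written with the data frame prefix, for the same reason as in the corrected `margins()` call above: outside a formula, `Media` on its own is not a known object.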
