Orange Dataset is missing #98

Open
timothyslau opened this issue Jun 24, 2020 · 3 comments

Comments

@timothyslau

When I run:
    RDatasets.dataset("datasets", "Orange")
I get:
    [image: screenshot of the output]

@tlienart
Collaborator

Would you be interested in adding it? If you get the raw CSV from somewhere and a description, I can help you do the rest.

@timothyslau
Author

timothyslau commented Nov 19, 2020

I'd be happy to.
Code:
write.csv(x = Orange, file = "Orange.csv", row.names = FALSE) # export the R dataset
zip(zipfile = "Orange.zip", files = "Orange.csv") # zip the csv file
?Orange # print the documentation

Documentation
Orange {datasets} R Documentation
Growth of Orange Trees
Description
The Orange data frame has 35 rows and 3 columns of records of the growth of orange trees.

Usage
Orange
Format
An object of class c("nfnGroupedData", "nfGroupedData", "groupedData", "data.frame") containing the following columns:

Tree
an ordered factor indicating the tree on which the measurement is made. The ordering is according to increasing maximum diameter.

age
a numeric vector giving the age of the tree (days since 1968/12/31)

circumference
a numeric vector of trunk circumferences (mm). This is probably “circumference at breast height”, a standard measurement in forestry.

Details
This dataset was originally part of package nlme, and that has methods (including for [, as.data.frame, plot and print) for its grouped-data classes.

Source
Draper, N. R. and Smith, H. (1998), Applied Regression Analysis (3rd ed), Wiley (exercise 24.N).

Pinheiro, J. C. and Bates, D. M. (2000) Mixed-effects Models in S and S-PLUS, Springer.

Examples
require(stats); require(graphics)
coplot(circumference ~ age | Tree, data = Orange, show.given = FALSE)
fm1 <- nls(circumference ~ SSlogis(age, Asym, xmid, scal),
           data = Orange, subset = Tree == 3)
plot(circumference ~ age, data = Orange, subset = Tree == 3,
     xlab = "Tree age (days since 1968/12/31)",
     ylab = "Tree circumference (mm)", las = 1,
     main = "Orange tree data and fitted model (Tree 3 only)")
age <- seq(0, 1600, length.out = 101)
lines(age, predict(fm1, list(age = age)))
[Package datasets version 4.0.2 Index]

See attachment.
Orange.zip

@tlienart
Collaborator

tlienart commented Nov 19, 2020

Great, thanks! Your PR should follow these steps:

  1. Gzip the file (.csv.gz extension) and place it at data/datasets/Orange.csv.gz (an RDA file is also fine if that's easier).
  2. In doc/datasets/, add an Orange.html following the template of the ones already in there; this is just a short description that can be copy-pasted from the original source.
  3. Add a line before this one to indicate the dataset and its dimensions:
    "datasets","occupationalStatus","Occupational Status of Fathers and their Sons",8,8

thanks!
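
For anyone following along, the first and third steps above could be prepared from an R session roughly as below. This is only a sketch, not part of the maintainer's instructions. It assumes the index line uses the same field layout as the occupationalStatus example (package, dataset, title, rows, columns) and takes the title from the R documentation quoted earlier. The variable names `out` and `index_line` are made up for illustration.

```r
# Sketch only: prepare the gzipped CSV (step 1) and the index line (step 3).

# Step 1: write the CSV through a gzip connection, so no separate
# compression step is needed. The file is created in the working
# directory and would then be copied to data/datasets/ in the repository.
out <- "Orange.csv.gz"  # hypothetical local output path
write.csv(Orange, file = gzfile(out), row.names = FALSE)

# Step 3: build the index line, assuming the layout
# "package","dataset","title",rows,columns from the example above.
index_line <- sprintf('"datasets","Orange","Growth of Orange Trees",%d,%d',
                      nrow(Orange), ncol(Orange))
cat(index_line, "\n")
```

Reading the file back with `read.csv(gzfile("Orange.csv.gz"))` is a quick way to check that it still has 35 rows and 3 columns before opening the PR.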
