hannesdatta committed Mar 20, 2024
2 parents 3c728cd + 1a37ce6 commit 0f78d03
Showing 4 changed files with 152 additions and 141 deletions.
112 changes: 40 additions & 72 deletions content/docs/exam/examplequestions.md
@@ -8,89 +8,40 @@ draft: false

# Example questions

Questions will be asked along the course's learning goals, and complexity levels (e.g., knowledge, application, evaluation). For details, see [here](../exam#content).
Questions will be asked along the course's learning goals, and complexity levels (e.g., comprehension, application, synthesis, evaluation). For details, see [here](../exam#content).

Below, you can find a few example questions, which will be discussed with students in the final live stream of this course.


## Theoretical part

{{< hint warning >}}

This part of the exam consists of __personalized open and closed (multiple-choice) questions__, shown in __random order__. You can freely go back and forth between these questions.
This exam consists of __open and closed (multiple-choice) questions__. You can freely go back and forth between these questions.

{{< /hint >}}
![](../dprep_part1.png)

*Note: the number of questions depends on the points awarded to each question. The instructions during the final exam may vary slightly, so make sure to still read them carefully.*

1. Please name a tool that can be used to automate workflows. (*knowledge*)

2. Please name three ways to deploy one's research findings. (*knowledge*)

3. What are the main benefits of exploring data using RMarkdown documents, compared to “point-and-click” interfaces (e.g., SPSS), or manually investigating data by issuing commands in the R terminal? (*comprehension*)

4. What are the benefits from automating pipelines, compared to manually executing source code files? (*comprehension*)

5. Please view the code snippet below.

```
library(dplyr)
df <- read.csv('data.csv')
df <- df %>% filter(age >= 18)
```
Please assess the completeness of the script with regard to the ITO components of a source code file. Can you identify any missing piece in the code? (*analysis*)

6. Please assess whether the makefile below will run when you type "make". (*analysis*)

__Directory Structure__
```
\readme.md
\code\makefile
\code\load.R
\data\dataset.csv
```
__Makefile__
```
data/dataset.csv: load.R
R --vanilla < load.R
```

![](../dprep_overview.png)

## Practical part

{{< hint warning >}}

This part of the exam consists of __personalized open questions__, shown in __random order__. You can freely go back and forth between these questions.

{{< /hint >}}

![](../dprep_part2.png)

*Note: the instructions during the final exam may vary slightly, so make sure to still read them carefully.*

### Question 1

Imagine you have just enrolled as a thesis student, and you receive the following email from your advisor:
1. Please download the `datasets.RData` workspace file from the exam cover page and open it in RStudio. Please answer the following questions using the objects in this R workspace.
1. Please use the dataset stored in `data1`. Using `dplyr`, reshape this dataset from wide to long. Paste the code snippet with the solution below.
2. Please use the dataset stored in `data2`. Using `dplyr`, please create an aggregated dataset, taking an average of `variable1` and `variable2` for all users in the data (i.e., you obtain a dataset with the number of rows equal to the number of users in the data).
3. Please take a look at `data3`. Please propose which data preparation steps are necessary to clean this data.
2. Imagine you have just enrolled as a thesis student, and you receive the following email from your advisor. Submit your PDF document, and provide a conclusion on the suitability of the explored data for the research question.

{{< hint >}}

Dear (name of student),

I really look forward to working with you on this exciting dataset, capturing the consumption of music on Spotify. I scraped it from spotifycharts.com a while ago.
I really look forward to working with you on this exciting dataset, capturing the consumption of music on Spotify. I scraped it from spotifycharts.com a while ago. Please download this `data.zip`, which contains a stripped-down version of an RMarkdown file and the data.

As a starting point, please explore the data set using RMarkdown. I’d love to learn more about the data myself (haven’t looked into it yet) - maybe you can figure out a way to shed some light on how the start of the global pandemic (let’s assume that was March 2020) affected music consumption?

Please render your RMarkdown as a PDF document. Please keep any code that you’re writing (e.g., to load the data, or to explore and do some minor data preparations) visible so I can learn from it!

{{< /hint >}}

Submit your PDF document for question 1, and provide a conclusion on the suitability of the explored data for the research question. (*analysis*)

### Question 2

Imagine you are a research assistant at Tilburg University, and you receive the following email from your project supervisor:
3. Please download the `github_repository.zip` file from the exam cover page and unzip it to a folder on this computer. Open this folder using Git Bash. Imagine you are a research assistant at Tilburg University, and you receive the following email from your project supervisor. Please submit your Git repository, by zipping the folder and uploading it here.

{{< hint >}}

@@ -105,21 +56,38 @@ Starting from `run.R`, can you apply your learnings from dPrep, and submit a lin
- Have a proper readme at the repository (in an `.md` file),
- Ignore files that should not be versioned using .gitignore, and
- remove `run.R` and replace it by a proper makefile for this project.
- throughout, make use of frequent commits and commit messages.

I really look forward to seeing your work. Your deliverable is just a link to a (private!) GitHub repository, provided in the answer box below.
I really look forward to seeing your work. Your deliverable is the zipped Git repository, which you can upload in the answer box below.

{{< /hint >}}

a) Please submit your GitHub link with your end-to-end GitHub workflow using make (*application*)

b) How could you determine whether the GitHub workflow runs well, beyond merely executing it yourself? (*evaluation*)


<!--
{{< hint info >}}
__This section is still work-in-progress (i.e., we are still adding examples and add code/data where needed).__
{{< /hint >}}
-->
4. Other example questions.
1. Please name three ways to deploy one's research findings. (*knowledge*)
2. What are the main benefits of exploring data using RMarkdown documents, compared to “point-and-click” interfaces (e.g., SPSS), or manually investigating data by issuing commands in the R terminal? (*comprehension*)
3. What are the benefits from automating pipelines, compared to manually executing source code files? (*comprehension*)
4. Please view the code snippet below and assess the completeness of the script with regard to the ITO components of a source code file. Can you identify any missing piece in the code? (*analysis*)

```
library(dplyr)
df <- read.csv('data.csv')
df <- df %>% filter(age >= 18)
```
5. Please assess whether the makefile below will run when you type "make". (*analysis*)
{{< hint >}}
Directory Structure:
\readme.md
\code\makefile
\code\load.R
\data\dataset.csv
Makefile:
data/dataset.csv: load.R
R --vanilla < load.R
{{< /hint >}}
2 changes: 1 addition & 1 deletion content/docs/modules/week3/_index.md
@@ -39,7 +39,7 @@ __Tips & tricks__
- Curious how to use Git with a graphical user interface?
- Use Git directly from within R - [find out how!](https://swcarpentry.github.io/git-novice/14-supplemental-rstudio/)
- Another fantastic Git client is [Sourcetreeapp](https://www.sourcetreeapp.com), which works on Windows, Mac and Linux!
- [Git & Github cheatsheet](https://github.com/tilburgsciencehub/website/raw/master/content/building-blocks/collaborate-and-share-your-work/use-github/github_cheatsheet_tsh.pdf)
- [Git & Github cheatsheet](https://tilburgsciencehub.com/topics/automation/version-control/start-git/images/github_cheatsheet_tsh.pdf)
- Optional activity: [Contributing to an open source web site](activity)

{{< /hint >}}
84 changes: 51 additions & 33 deletions content/docs/modules/week7/slides.Rpres
@@ -22,7 +22,9 @@ Welcome to the final lecture in dPrep!

If you haven't done so, please **explore the exam page & example questions** at [https://dprep.hannesdatta.com/docs/exam](https://dprep.hannesdatta.com/docs/exam).

<!--
The __course evaluation__ is live at https://app.evalytics.nl. Please voice your opinions!
-->

<!--
- Team project:
@@ -84,7 +86,7 @@ incremental: true
- delete stuff that isn't needed, roll back when you want
- Oh no, not authenticated!
- Oh no! Can't push! (pull first!)
- GUIs (e.g., in R) are available, too!
- GUIs (e.g., in R or VS Code) are available, too!

Lessons learnt #1: Versioning and Project Management with Git(Hub) (II)
========================================================
@@ -93,7 +95,7 @@ incremental: true
- Ready to use Git/versioning in a business
- Know what to version, and what not
- Purpose of `.gitignore`
- How to work together (issues, feature branches, PRs)
- How to work together (issues, feature branches, pull requests/PRs)
- Collaborate on open source projects
- You know what forks are!
- You may even have actually contributed to public projects!
@@ -105,7 +107,7 @@ incremental: true
- RMarkdown! (mixing code with reporting)
- ability to quickly produce clean docs to share!
- uh... but how to run it w/ make?
- __verify you can render markdown documents on your computer!__
- __verify you can render markdown documents!__
- Doing data quality checks?
- getting back to your data supplier if needed!
- reporting summary stats
@@ -123,7 +125,7 @@ incremental: true
- Make ITO blocks
- setup to load libraries, __then__ input, transformation, output!
- super crucial for `make`, too!
- Modularize code, iterate, and use functions
- Modularize code, loop, and use functions

Lessons learnt #4: Pipeline building and automation
========================================================
@@ -171,6 +173,7 @@ incremental: true

- Gradually implement across classes or projects
- e.g., some projects just benefit from better directory structure, while others may need "more"
- risk of "losing" the skill (!)
- Realize it takes time to learn
- it took me years to become proficient

@@ -191,8 +194,8 @@ incremental: true

- Become part of the open science community at Tilburg
- contribute to [Tilburg Science Hub](https://tilburgsciencehub.com)
- improve [music-to-scrape](https://music-to-scrape.org)
- follow us on [LinkedIn](https://www.linkedin.com/company/tilburgsciencehub/) and [Twitter](https://twitter.com/tilburgscience)
- improve [music-to-scrape.org](https://music-to-scrape.org)
- follow us on [LinkedIn](https://www.linkedin.com/company/tilburgsciencehub/) and [X](https://twitter.com/tilburgscience)
- Invest in your research skills (e.g., Docker, coding)

Looking ahead: After your studies
@@ -207,7 +210,7 @@ incremental: true
Next steps: Submissions and preparing for the exam
========================================================

- Deadline for project coming up!
- Deadline for project coming up next week!
- remember: __make it work__ on different computers and operating systems
- __do not use__ absolute paths!
- verify data can be downloaded smoothly
@@ -217,68 +220,83 @@ Next steps: Submissions and preparing for the exam
- https://dprep.hannesdatta.com/docs/exam/examplequestions/


Exam overview: Theoretical part
Exam planning
=============

- Organization
- When: 16 October, 1-2pm (1 hour)
- How?
- closed book, no software like R, __just TestVision__
- 4 April (time tba; 3 hours)
- On campus, using __TestVision__
- Software & materials
- access to R/RStudio, Git, make
- access to github.com/course-dprep and classroom.github.com; no access to ChatGPT or other AI tools
- I'm making selected resources available on the instruction page - [check them out here](https://github.com/hannesdatta/course-dprep/raw/master/content/docs/exam/cheatsheets-exam.zip)
- What?
- How to prepare?
- familiarize yourselves with [how TestVision works with a practice test](https://oefentoetsen.testvision.nl/online/fe/login_ot.htm?campagne=tlb_demo_eng&taal=2)
- let's look at [some questions now](https://dprep.hannesdatta.com/docs/exam/examplequestions/#theoretical-part)
- let's look at [some questions now](https://dprep.hannesdatta.com/docs/exam/examplequestions/)


Exam overview: Practical part
=============

- Organization
- When: 17 October, 10am - 11.59am + 1 minute, take home
- Work max. 2 hours on this part
- How?
- open book, on your computer
- ChatGPT and other tools ONLY when explicitly asked for.
- What?
- let's look at [some questions now](https://dprep.hannesdatta.com/docs/exam/examplequestions/#practical-part)
- prep well - expect new datasets that are big (too big maybe even) -- aggregation, selection of number of rows, etc.

Some tips for your exam
=======
incremental: true

- Expect an unexpected data set & data wrangling
- Know common data operations in `dplyr` & become fast!
- When handing in documents, check what I require (for `.Rmd`, I sometimes ask for rendered `.pdf` documents - does it work on your computer?)
- Be prepared to make commits to GitHub repositories -- know how to clone, fork, write issues, do PRs, roll back to previous versions, etc.
- Be prepared to run, correct and develop new `make` workflows
- Be prepared to work with Git Bash
- know how to make commits with commit messages, create branches, switch branches, etc.
- download git repository, unzip, do your commits, zip again and submit as a zipfile
- Be prepared to work with GitHub
- know how to clone, fork, write issues, do PRs
- know how to roll back to previous version
- Be prepared to use `make`
- run, correct and develop new `make` workflows
- Be prepared to download data sets from TestVision (`.RData`), either on the cover page of the exam or in a specific question on the exam.
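
The Git Bash tips above (commits with messages, creating and switching branches) can be practised with a short sketch like the one below. Everything in it is an assumption made for illustration: the scratch folder, the file contents, the branch name `update-readme`, and the example identity are hypothetical, and the real exam repository will look different. Zipping the folder for submission is not shown.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch folder standing in for the unzipped exam repository.
repo="$(mktemp -d)/exam-repo"
mkdir -p "$repo"
cd "$repo"

git init --quiet
# Identity set for this repository only, so commits also work on a fresh machine.
git config user.name "Example Student"
git config user.email "student@example.com"

# First commit: a readme plus a .gitignore that keeps generated files out.
echo "# Example project" > readme.md
printf 'gen/\n*.RData\n' > .gitignore
git add readme.md .gitignore
git commit --quiet -m "Add readme and .gitignore"

# Remember the default branch name (main or master, depending on the setup).
base="$(git rev-parse --abbrev-ref HEAD)"

# Create a feature branch, switch to it, and commit a change there.
git checkout --quiet -b update-readme
echo "Run the workflow with make." >> readme.md
git commit --quiet -am "Describe how to run the workflow"

# Switch back to the default branch and merge the feature branch.
git checkout --quiet "$base"
git merge --quiet update-readme

# Show the resulting history, one line per commit.
git log --oneline
```

Working on a separate branch and merging afterwards mirrors the feature-branch workflow from the course; for a very small change, committing directly on the default branch would work as well.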


Learning goals + distribution of points
==========

- 100 points in total, about 25 questions
- Mix of open and closed questions
- Learning goals & question weights
1. Use R to clean and transform data for analysis (e.g., aggregation, merging, de-duplication, reshaping, data conversions, regular expressions) [synthesis; __20% of points__]
2. Use GitHub for managing empirical research projects (e.g., GitHub Issues and Project Boards) [evaluation; __10% of points__]
3. Use Git/GitHub for versioning files and collaborating on privately-shared and publicly-available (open science) GitHub repositories [application; __30% of points__]
4. Use R for generating automatic reports (e.g., to assess data quality, to report research findings in a paper) and deploying research findings in novel ways (e.g., apps) [comprehension; __15% of points__]
5. Use Workflow Management Tools to create and run portable, automated, and reproducible research pipelines [application; __25% of points__]


Next steps: Official course evaluation
========================================================

- Course evaluation has been immensely important to this course
- this semester: new data set, coaching sessions with breakout groups on campus and online
- last semester: developed new on-campus tutorials, released example projects
- this semester: switched order of sessions, revised GitHub tutorial/onboarding
- last semester: new data set, coaching sessions with breakout groups on campus and online

- Course evaluation has been critical to my career
- Without my past evaluations, I wouldn't be teaching you today
- I will look at all comments.
- __Scores are most important to show importance of this course__

- You will be invited via [Evalytics](https://app.evalytics.nl/#/login)
- You will be invited via [Evalytics](https://app.evalytics.nl/#/login) at the end of the week

Next steps: Self- and peer assessment
========================================================

- You will be invited via e-mail
- To be filled in using Google Forms



Informal feedback
========================================================
incremental: true

- Coaching sessions?
- Coaching sessions? (Online, offline)
- How was it for beginners?
- What are three things you'd like me to change?



Stay in touch!
========================================================
