This document describes the form that your data must be in, in order to be imported and used with Glance.
Each type of data is contained within a single .csv
file, with columns as described below. The information to be
contained within each file is kept purposefully simple in order to be as agnostic as possible to the institution
producing the data.
It is recommended that you use some sort of ETL (Export, Transform, Load) tool, such as Talend to export the necessary records from your institution's data-warehouse in the desired format.
Alternatively, if your institution operates some sort of "Data Integration" team, they should be able to use this document to provide you with the files.
All .csv
files below should conform to the widely accepted RFC 4180 spec except
where otherwise stated.
Note: Many files are required by only some of the visualisations that it is possible to include in a Glance survey. If the set of visualisations you wish to include in your surveys do not require certain files, then you needn't prepare them. You can see which visualisations require which files here. Likewise, the files listed below are unlikely to be an exhaustive description of all the useful student data available to each institution. Instead they are simply the data which is needed to power Glance's various visualisations.
If you have any questions about the files specified in this document, please create an issue here
This file simply contains a list of the printers in an institution.
Id | Location |
---|---|
Integer | String |
An example row would look something like the following:
2,Library - First floor
This file contains a list of events corresponding to students printing documents.
Timestamp | Student Id | Study Id | Stage Id | Printer Id | Num Sides |
---|---|---|---|---|---|
ISO 8601 String | Integer | Integer | Integer | Integer | Integer |
Note: On the subject of timestamps: these should be expressed with ISO 8601 standards compliant strings including
timezone offset. The strings will look something like this: 1999-01-08T04:05:00+01:00
(where date portion is
yyyy-mm-dd). You can read about the ISO 8601 specification here.
An example row would look something like the following:
2015-09-07T07:58:10000Z,26567,249,23,104,1
This file contains a list of the computing clusters in an institution. I.e. The distinct locations or labs where a student may use a computer.
Id | PC Name | Cluster Name | Building Name |
---|---|---|---|
Integer | String | String | String |
Note: The PC Name column is expected to correspond to a prefix common to the names of all computers in a cluster. If this does information does not make sense in the context of your institution's data, it can be safely discarded.
An example row would look something like the following:
6,COOKSONB,Cookson - Cluster,Medical School
Without the PC Name field, this would be:
6,,Cookson - Cluster,Medical School
Note: Remember to include the blank commas 6,,...
This file contains a list of events corresponding to a student session using a computer in an institution's cluster.
Session Start Timestamp | Session End Timestamp | Student Id | Study Id | Stage Id | Machine Name |
---|---|---|---|---|---|
ISO 8601 String | ISO 8601 String | Integer | Integer | Integer | String |
An example row would look something like the following:
2015-08-30T08:04:40000Z,2015-08-30T08:36:10000Z,22186,1074,23,COOKSONB282
Note: If the PC Name field is present in the Cluster.csv
file then we can match that prefix against the
Machine Name field of ClusterSession.csv
rows in order to determine in which cluster a session physically took place.
This file contains instances of lecture recordings which students may (re)watch at their leisure. This file is called
Recap.csv
as Recap is the lecture recording and playback software used at Newcastle, provided by Panopto.
Any conceptually similar lecture replay system will be fine as long as it can report the necessary information.
Id | Session Name | Module Code |
---|---|---|
Integer | String | String |
An example row would look something like the following:
139,TCP3099/L03/01,TCP3099
Note: the Module Code field is an identifier for the course, module or workshop from which the recording was
produced. Also note that whilst the Session Name field in the example contains the module code, this is not required.
Session Name simply identifies the actual thing being recorded and has no explicit structure. For instance,
TCP3099/L03/01
likely corresponds to the third lecture from the course with code TCP3099, but this could just as easily
be FooBar Lecture 124
This file contains a list of events corresponding to a student watching a lecture recording.
Session Start Timestamp | Recap Id | Student Id | Study Id | Stage Id | Seconds Listened |
---|---|---|---|---|---|
ISO 8601 String | Integer | Integer | Integer | Integer | Integer/Decimal |
An example row would look something like the following:
2015-01-01T00:06:48000Z,52837,23349,996,23,1140.135
Note: the Recap Id field should correspond to entries in the Id field of Recap.csv
.
This file simply contains a list of the stages a student may occupy throughout their time at an institution. These normally correspond to academic years.
Id | Stage Number | Study Stage |
---|---|---|
Integer | Integer | String |
An example row would look something like the following:
2,1,Undergraduate Stage 1
This file simply contains a list of the programmes of study students may be enrolled on.
Id | Programme Code | Program Title | Programme Type | School Code | School | Faculty Code |
---|---|---|---|---|---|---|
Integer | String | String | String | String | String | String |
An example row would look something like the following:
5,09MD,Doctor of Medicine,Postgraduate Research,D-SCMS,CLINICAL MEDICAL SCIENCES,F-FMED
This file contains a list of events corresponding to student sessions in an institution's Virtual Learning Environment (VLE). A common example of a VLE is the software Blackboard.
Id | Start Timestamp | End Timestamp | Student Id | Study Id | Stage Id | Num Clicks |
---|---|---|---|---|---|---|
Integer | ISO 8601 String | ISO 8601 String | Integer | Integer | Integer | Integer |
An example row would look something like the following:
331440600,2017-07-04T18:01:55000Z,2017-07-04T18:02:00000Z,50873,249,23,4
Within each VLE Session, a student may engage with multiple pieces of content. This file contains a list of events corresponding to a student engaging with (viewing, downloading etc...) a specific piece of content.
Session Id | Student Id | Module Code | Monitored Content |
---|---|---|---|
Integer | Integer | String | String |
An example row would look something like the following:
300296073,12488,ECO3001,Teaching Material/Exercise Sets/1
Note: the Monitored Content field's structure has no significance. Teaching Material/Exercise Sets/1
could just
as easily be Economics Exercises Set 1
etc... Also note that the Module Code field corresponds to the course to which
the content being accessed belongs (it is assumed that all content will be owned by some module).
Note: the Session Id field is expected to correspond to entries in the Id field of VLESession.csv
.
This file contains a list of events corresponding to recorded student meetings, likely with a tutor, supervisor or lecturer.
Meeting Date | Student Id | Study Id | Stage Id | Meeting Type |
---|---|---|---|---|
ISO 8601 String | Integer | Integer | Integer | String |
An example row would look something like the following:
2016-08-31T09:00:00000Z,9982,1585,38,Tutor
Note: the type of the Meeting Date field is listed as an ISO 8601 String. Meetings are often manually recorded and may only include a date, rather than a date and start time. In this case it is acceptable to provide the row in the following form:
2016-08-31,9982,1585,38,Tutor
The format yyyy-mm-dd
is still technically valid according to ISO 8601 and will be parsed by Glance without issue.
This file contains a list of marks for student work (both exams and coursework).
Student Id | Study Id | Stage Id | Academic Year | Module Code | Component Text | Component Attempt | Component Mark | Weighting |
---|---|---|---|---|---|---|---|---|
Integer | Integer | Integer | String | String | String | Integer | Integer | Decimal |
An example row would look something like the following:
37802,5802,23,2015,GNM8002,Written Examination,1,84,100.0
Note: when a module/course has more than 1 component, students will achieve marks for both which will be recorded in separate rows. An example of this would look like the following:
34814,1089,23,2015,SPG8027,Oral Presentation 1,1,70,20.0
34814,1089,23,2015,SPG8027,Individual Project Plan,1,66,80.0
Together, these marks constitute student 34814's performance in the module SPG8027.
Note: In the background, Glance will group these rows together and combine component marks, multiplied by their
respective weights, to compute a total module/course mark. Therefore, if a given student completed a module, the Weighting
fields of all associated rows in Marks.csv
should sum to 100.
This file contains a list of the modules/courses that a student may take, across an institution.
Id | Module Code | Title | Description | Keywords |
---|---|---|---|---|
Integer | String | String | String | String|String| ... |String |
An example row would look something like the following:
217,CSC2021,Software Engineering,"Teaches students the principles, tools and techniques of software engineering.",Programming|Commercial Accumen|Exam heavy
Note: the Description field is enclosed in double quotes as such a long field is likely to include commas. If left outside double quotes, commas are treated as field separators. Columns which contain these will then have too many rows may cause Glance to omit entries or even break. This is infact true of all files listed here, but mentioned specifically here as the most likely place to encounter the problem.
Note: unlike any other fields in the other files listed here, the structure of the Keywords field is important. The csv format is not naturally good at representing fields which which may be lists of arbitrary length. We choose here to represent multiple keywords in a single Keyword field by separating them by pipe symbols (|). If you have a row with 0 or 1 keywords, then no pipe symbols are needed. Equally, pipe symbols are only needed to separate keywords; you do not need a trailing pipe symbol. Glance will parse such correctly formatted Keyword fields as lists of distinct keywords. As with the Description field incorrect formatting may cause Glance to omit entries or even break.
Throughout this guide, fields such as Student Id are referenced many times. Exactly how identifying such information is will vary from institution to institution. For instance, can student ids be used to lookup contact info on some widely accessible service? If so, student ids will often need to be anonymised before use with Glance in order to preserve student privacy.
This anonymisation will often involve simply replacing a student id with some other integer which has no meaning; i.e. 10378592 becomes 17.
If this anonymisation takes place, it must be consistent accross all files!. If the id 17 refers to student number
10378592 in Printed.csv
, then it must also refer to student number 10378592 in ClusterSession.csv
. If the anonymisation
of student data is inconsistent then Glance will link unrelated records to one another and present incorrect information.