Data frame representing information about a collection of emails
Description:
These data represent incoming emails for the first three months of
2012 for an email account (see Source).
Usage:
data(email)
data(email_test)
Format:
An ‘email’ (‘email_test’) data frame has 3921 (1252) observations
on the following 21 variables.
‘spam’ Indicator for whether the email was spam.
‘to_multiple’ Indicator for whether the email was addressed to
more than one recipient.
‘from’ Whether the message was listed as from anyone (this is
usually set by default for regular outgoing email).
‘cc’ Indicator for whether anyone was CCed.
‘sent_email’ Indicator for whether the sender had been sent an
email in the last 30 days.
‘time’ Time at which email was sent.
‘image’ The number of images attached.
‘attach’ The number of attached files.
‘dollar’ The number of times a dollar sign or the word “dollar”
appeared in the email.
‘winner’ Indicates whether “winner” appeared in the email.
‘inherit’ The number of times “inherit” (or an extension, such as
“inheritance”) appeared in the email.
‘viagra’ The number of times “viagra” appeared in the email.
‘password’ The number of times “password” appeared in the email.
‘num_char’ The number of characters in the email, in thousands.
‘line_breaks’ The number of line breaks in the email (does not
count text wrapping).
‘format’ Indicates whether the email was written using HTML (e.g.
may have included bolding or active links).
‘re_subj’ Whether the subject started with “Re:”, “RE:”, “re:”, or
“rE:”.
‘exclaim_subj’ Whether there was an exclamation point in the
subject.
‘urgent_subj’ Whether the word “urgent” was in the email subject.
‘exclaim_mess’ The number of exclamation points in the email
message.
‘number’ Factor variable saying whether there was no number, a
small number (under 1 million), or a big number.
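Several of the text-derived variables above can be illustrated with a
short base R sketch. The helper names count_matches and
derive_features, the raw subject and body inputs, and the 0/1 coding of
the indicators are assumptions made for illustration; this is not
necessarily how the data were actually constructed.

```r
# Hypothetical helper (not part of the data set or any package):
# counts non-overlapping matches of a regular expression in one string.
count_matches <- function(pattern, text, ignore_case = FALSE) {
  hits <- gregexpr(pattern, text, ignore.case = ignore_case)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

# Hypothetical helper: derives a few of the documented variables from a
# raw subject line and message body. Indicators are coded 0/1 (assumed).
derive_features <- function(subject, body) {
  list(
    # "Re:", "RE:", "re:" or "rE:" at the start of the subject
    re_subj      = as.integer(grepl("^re:", subject, ignore.case = TRUE)),
    # any exclamation point in the subject
    exclaim_subj = as.integer(grepl("!", subject, fixed = TRUE)),
    # the word "urgent" in the subject, any letter case (assumed)
    urgent_subj  = as.integer(grepl("urgent", subject, ignore.case = TRUE)),
    # number of exclamation points in the message
    exclaim_mess = count_matches("!", body),
    # dollar signs plus occurrences of the word "dollar"
    dollar       = count_matches("\\$|dollar", body, ignore_case = TRUE),
    # message length in thousands of characters
    num_char     = nchar(body) / 1000
  )
}
```

For example, str(derive_features("RE: Urgent!", "You won $100! Send 5
dollars now!")) lists the derived values for one made-up message.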
Source:
David Diez's Gmail account, early months of 2012. All personally
identifiable information has been removed.
References:
OpenIntro Statistics, openintro.org
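Examples:
A minimal sketch of loading both data frames and checking their
dimensions against the Format section. It assumes the data ship with
the openintro package, which this page does not state explicitly; the
wrapper name load_email is hypothetical.

```r
# Hypothetical wrapper: loads 'email' and 'email_test' into a private
# environment and returns them in a list, or NULL when the (assumed)
# 'openintro' package is not installed.
load_email <- function() {
  if (!requireNamespace("openintro", quietly = TRUE)) {
    return(NULL)
  }
  env <- new.env()
  data(list = c("email", "email_test"), package = "openintro", envir = env)
  list(email = env$email, email_test = env$email_test)
}

sets <- load_email()
if (!is.null(sets)) {
  print(dim(sets$email))       # rows and columns of the main data frame
  print(dim(sets$email_test))  # rows and columns of the second data frame
  print(table(sets$email$spam))  # how many messages are flagged as spam
}
```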