This database, from the UCI repository of machine learning datasets, contains 14 demographic features for 45,222 rows (32,561 for training and 12,661 for testing). The task is to predict whether a person's yearly income is more or less than $50,000, so the problem is formulated as a binary classification task.
Data Source: https://archive.ics.uci.edu/ml/machine-learning-databases/adult/
Reference: Dua, Dheeru, and Efi Karra Taniskidou. “UCI Machine Learning Repository”. Irvine, CA: University of California, School of Information and Computer Science (2017).
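As a minimal sketch of the binary formulation described above, the income label can be mapped to 0 or 1. The function name `encode_income` is hypothetical, and the sketch assumes the raw labels are the strings `>50K` and `<=50K`, possibly with surrounding whitespace or a trailing period; check the raw files before relying on it.

```python
def encode_income(raw_label: str) -> int:
    """Map a raw income label to 1 (more than $50,000) or 0 (otherwise)."""
    # Assumption: labels look like ">50K" or "<=50K", possibly padded with
    # whitespace or followed by a period.
    label = raw_label.strip().rstrip(".")
    if label == ">50K":
        return 1
    if label == "<=50K":
        return 0
    # Fail loudly rather than silently mislabel an unexpected value.
    raise ValueError(f"unexpected income label: {raw_label!r}")


# Illustrative inputs, not taken from the dataset.
labels = [encode_income(s) for s in [" >50K", " <=50K", "<=50K."]]
print(labels)  # [1, 0, 0]
```

Raising on unknown labels is a deliberate choice here: a typo in the label column would otherwise end up as a silent class-0 example.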
Here are the features and their possible values:
- Age: continuous.
- Workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
- Fnlwgt: continuous (the number of people the census takers believe that observation represents).
- Education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
- Education-num: continuous.
- Marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
- Occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
- Relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
- Ethnic group: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
- Sex: Female, Male.
  - Note: this data is extracted from the 1994 Census and enforces a binary option on Sex.
- Capital-gain: continuous.
- Capital-loss: continuous.
- Hours-per-week: continuous.
- Native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
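The feature list above can be turned into a small schema for parsing the raw comma-separated records. The following is a sketch under stated assumptions, not the dataset's official loader: the column names, their order (the 14 features as listed, followed by the income label), the helper `parse_rows`, and the use of `?` as the missing-value marker are all assumptions to verify against the files at the data source. The sample record is invented for illustration.

```python
import csv
import io

# Assumption: columns appear in the order of the feature list, followed by
# the income label. The names themselves are illustrative.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education-num",
    "marital-status", "occupation", "relationship", "ethnic-group", "sex",
    "capital-gain", "capital-loss", "hours-per-week", "native-country",
    "income",
]

# The six features described as continuous; the rest are categorical.
CONTINUOUS = {
    "age", "fnlwgt", "education-num",
    "capital-gain", "capital-loss", "hours-per-week",
}


def parse_rows(text):
    """Parse comma-separated records into dicts keyed by column name."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = {}
        for name, value in zip(COLUMNS, fields):
            value = value.strip()
            if value == "?":
                row[name] = None  # assumed missing-value marker
            elif name in CONTINUOUS:
                row[name] = int(value)
            else:
                row[name] = value
        rows.append(row)
    return rows


# Invented record for illustration only; not a row from the dataset.
sample = (
    "41, Private, 120000, HS-grad, 9, Divorced, Sales, Unmarried, "
    "Black, Female, 0, 0, 38, ?, <=50K\n"
)
rows = parse_rows(sample)
print(rows[0]["age"], rows[0]["native-country"], rows[0]["income"])
```

Keeping continuous and categorical columns in separate sets makes it straightforward to apply different preprocessing later, for example scaling the former and one-hot encoding the latter.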