MapReduce for k-means
- Quiz: MapReduce for k-means
tuanavu committed Jul 8, 2016
1 parent b897bf2 commit ca1d76f
Showing 9 changed files with 1,885 additions and 12 deletions.


@@ -36,7 +36,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {
"collapsed": false
},
@@ -68,7 +68,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {
"collapsed": false
},
@@ -79,7 +79,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 6,
"metadata": {
"collapsed": false
},
@@ -99,7 +99,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 7,
"metadata": {
"collapsed": false
},
@@ -149,7 +149,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"metadata": {
"collapsed": false
},
@@ -161,11 +161,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"<59071x547979 sparse matrix of type '<type 'numpy.float64'>'\n",
"\twith 10379283 stored elements in Compressed Sparse Row format>"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tf_idf"
]
@@ -210,7 +222,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 10,
"metadata": {
"collapsed": true
},
@@ -238,7 +250,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 11,
"metadata": {
"collapsed": true
},
@@ -305,11 +317,25 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 1.41000789 1.36894636]\n",
" [ 1.40935215 1.41023886]\n",
" [ 1.39855967 1.40890299]\n",
" ..., \n",
" [ 1.41108296 1.39123646]\n",
" [ 1.41022804 1.31468652]\n",
" [ 1.39899784 1.41072448]]\n"
]
}
],
"source": [
"from sklearn.metrics import pairwise_distances\n",
"\n",
@@ -1569,7 +1595,13 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
"version": "2.7.12"
},
"toc": {
"toc_cell": false,
"toc_number_sections": false,
"toc_threshold": "8",
"toc_window_display": false
}
},
"nbformat": 4,
@@ -0,0 +1,117 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# MapReduce for k-means"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Question 1\n",
"\n",
"<img src=\"images/Screen Shot 2016-07-08 at 1.58.04 PM.png\">\n",
"\n",
"*Screenshot taken from [Coursera](https://www.coursera.org/learn/ml-clustering-and-retrieval/exam/dlBVz/mapreduce-for-k-means)*\n",
"\n",
"<!--TEASER_END-->"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Question 2\n",
"\n",
"<img src=\"images/Screen Shot 2016-07-08 at 1.56.48 PM.png\">\n",
"\n",
"*Screenshot taken from [Coursera](https://www.coursera.org/learn/ml-clustering-and-retrieval/exam/dlBVz/mapreduce-for-k-means)*\n",
"\n",
"<!--TEASER_END-->"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Question 3\n",
"\n",
"<img src=\"images/Screen Shot 2016-07-08 at 1.51.16 PM.png\">\n",
"\n",
"*Screenshot taken from [Coursera](https://www.coursera.org/learn/ml-clustering-and-retrieval/exam/dlBVz/mapreduce-for-k-means)*\n",
"\n",
"<!--TEASER_END-->"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Question 4\n",
"\n",
"<img src=\"images/Screen Shot 2016-07-08 at 1.51.20 PM.png\">\n",
"\n",
"*Screenshot taken from [Coursera](https://www.coursera.org/learn/ml-clustering-and-retrieval/exam/dlBVz/mapreduce-for-k-means)*\n",
"\n",
"<!--TEASER_END-->"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Question 5\n",
"\n",
"<img src=\"images/Screen Shot 2016-07-08 at 1.51.24 PM.png\">\n",
"\n",
"*Screenshot taken from [Coursera](https://www.coursera.org/learn/ml-clustering-and-retrieval/exam/dlBVz/mapreduce-for-k-means)*\n",
"\n",
"<!--TEASER_END-->"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Answer**\n",
"- https://www.coursera.org/learn/ml-clustering-and-retrieval/discussions/weeks/3/threads/5ls_g0IIEeaJZA6Ew5-W7Q\n",
"- For each of the operations, check to see which ones would fail the commutative-associative test.\n",
"- https://www.mathsisfun.com/associative-commutative-distributive.html\n",
"- For example, say $OP(x_1, x_2) = 2 x_1 + 3 x_2$\n",
" - So $OP(2,3) = 4 + 9 = 13$\n",
" - $OP(3,2) = 6 + 6 = 12$\n",
" - Since $OP(2,3) \\neq OP(3,2)$, this OP is not commutative, so it fails the commutative-associative test."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.11"
},
"toc": {
"toc_cell": false,
"toc_number_sections": false,
"toc_threshold": "8",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 0
}
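
The commutative-associative check described in the Answer cell can be sketched in Python. This is a minimal illustration and not part of the commit: the helper names `is_commutative`, `is_associative`, and `weighted`, as well as the sample values, are assumptions introduced here. Passing on a finite sample only means no counterexample was found, while a single failing pair or triple is enough to reject an operation.

```python
from itertools import product


def is_commutative(op, samples):
    """Return True if op(a, b) == op(b, a) for every pair drawn from samples."""
    return all(op(a, b) == op(b, a) for a, b in product(samples, repeat=2))


def is_associative(op, samples):
    """Return True if op(op(a, b), c) == op(a, op(b, c)) for every triple."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(samples, repeat=3)
    )


def weighted(x1, x2):
    # The example OP from the Answer cell: 2 * x1 + 3 * x2
    return 2 * x1 + 3 * x2


# Small integer sample, chosen arbitrarily (integers avoid float rounding noise).
samples = [1, 2, 3, 4]

candidates = [
    ("sum", lambda a, b: a + b),
    ("max", max),
    ("weighted", weighted),
]

for name, op in candidates:
    print("{}: commutative={}, associative={}".format(
        name, is_commutative(op, samples), is_associative(op, samples)))
```

On this sample, `sum` and `max` pass both checks, while `weighted` fails both, which matches the hand computation of $OP(2,3)$ versus $OP(3,2)$ in the notebook.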
