Inverted Index

Team members:

Zúñiga Coayla, Jerson
Chullunquia Rosas, Sharon

Requirements

Hadoop Distributed File System (HDFS)

Start HDFS by running the following script on the main node:

❯ start-dfs.sh

To monitor the HDFS cluster, open http://host:9870 in a browser, replacing host with the hostname of the main node.

Yet Another Resource Negotiator (YARN)

YARN runs and schedules tasks. Start YARN with the script:

❯ start-yarn.sh

Start

Compile

Compile the source and package the resulting classes into a JAR file:

❯ hadoop com.sun.tools.javac.Main InvertedIndex.java
❯ jar cf inverted-index.jar InvertedIndex*.class

HDFS

Copy the input data to HDFS. Relative paths such as input resolve against your HDFS home directory (/user/<username>).

❯ hdfs dfs -mkdir input
❯ hdfs dfs -put input/* input

List the contents of the input directory:

❯ hdfs dfs -ls input

YARN

Submit the job, passing the HDFS input and output directories as arguments:

❯ yarn jar inverted-index.jar InvertedIndex input output
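The source of InvertedIndex.java is not reproduced in this README, so the following is only a minimal sketch of the idea behind an inverted-index job, written in plain Java without any Hadoop classes. The class name InvertedIndexSketch, its methods, the tokenization rule, and the two sample documents are all assumptions made for illustration, not the repository's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the map and reduce phases; no Hadoop classes involved.
class InvertedIndexSketch {

    // "Map" phase: emit one (word, docId) pair per token in the document.
    // Lowercasing and splitting on non-word characters is an assumption.
    static List<String[]> map(String docId, String text) {
        List<String[]> pairs = new ArrayList<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new String[] {token, docId});
            }
        }
        return pairs;
    }

    // "Reduce" phase: group pairs by word and count occurrences per document.
    static Map<String, Map<String, Integer>> reduce(List<String[]> pairs) {
        Map<String, Map<String, Integer>> index = new TreeMap<>();
        for (String[] pair : pairs) {
            index.computeIfAbsent(pair[0], w -> new TreeMap<>())
                 .merge(pair[1], 1, Integer::sum);
        }
        return index;
    }

    // Format one index entry as "word<TAB>doc:count doc:count".
    static String format(String word, Map<String, Integer> postings) {
        StringBuilder sb = new StringBuilder(word).append('\t');
        String sep = "";
        for (Map.Entry<String, Integer> e : postings.entrySet()) {
            sb.append(sep).append(e.getKey()).append(':').append(e.getValue());
            sep = " ";
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical documents, chosen only for illustration.
        List<String[]> pairs = new ArrayList<>();
        pairs.addAll(map("doc01", "Hello World Bye World"));
        pairs.addAll(map("doc02", "Hello Hadoop Goodbye Hadoop"));
        reduce(pairs).forEach((word, postings) ->
                System.out.println(format(word, postings)));
    }
}
```

In a real MapReduce job the grouping by word happens in the shuffle between the two phases, across machines. Here a single in-memory TreeMap stands in for that step.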

Results

Check the result by listing the output directory with hdfs dfs -ls output. If the job succeeded, the listing will resemble:

Found 2 items
-rw-r--r--   1 hadoop supergroup          0 2022-06-16 14:16 output/_SUCCESS
-rw-r--r--   1 hadoop supergroup         84 2022-06-16 14:16 output/part-r-00000

Print the results with:

❯ hdfs dfs -cat output/part-r-00000 | less
bye     doc01:1 
goodbye doc02:1 
hadoop  doc02:2 
hello   doc01:1 doc02:1 
world   doc01:2
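To look up a single term in the printed index, each line can be split into the word and its doc:count postings. The sketch below assumes the line layout shown above (a word, whitespace, then space-separated doc:count entries). IndexLookup and its methods are hypothetical names and are not part of the repository.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Reads index lines from standard input and prints the postings of one word.
class IndexLookup {

    // Return the word at the start of an index line.
    static String wordOf(String line) {
        return line.trim().split("\\s+")[0];
    }

    // Parse "word doc:count doc:count ..." into a doc -> count map,
    // preserving the order in which documents appear on the line.
    static Map<String, Integer> parsePostings(String line) {
        Map<String, Integer> postings = new LinkedHashMap<>();
        String[] parts = line.trim().split("\\s+");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the word itself
            int colon = parts[i].lastIndexOf(':');
            String doc = parts[i].substring(0, colon);
            int count = Integer.parseInt(parts[i].substring(colon + 1));
            postings.put(doc, count);
        }
        return postings;
    }

    public static void main(String[] args) throws IOException {
        // The query word defaults to "hello" when no argument is given.
        String query = args.length > 0 ? args[0] : "hello";
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            if (!line.trim().isEmpty() && wordOf(line).equals(query)) {
                System.out.println(query + " -> " + parsePostings(line));
            }
        }
    }
}
```

Assuming Java 11 or later and the sketch saved as IndexLookup.java, one way to use it is to pipe the job output into it:

❯ hdfs dfs -cat output/part-r-00000 | java IndexLookup.java hello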

Notes

If the output directory already exists, delete it before resubmitting the job:

❯ hdfs dfs -rm -r output
