Explain the WordCount implementation via the Hadoop framework

We will count the words across all the input files. The flow is as below.
Input
Assume there are two files, each containing one sentence:
Hello World Hello World (In file 1)
Hello World Hello World (In file 2)
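Mapper: There would be one mapper for each file (each small input file forms its own input split). Every mapper emits a <word, 1> pair for each word it reads.
For the given sample input, the first map output is: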
< Hello, 1>
< World, 1>
< Hello, 1>
< World, 1>
The second map output:
< Hello, 1>
< World, 1>
< Hello, 1>
< World, 1>
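The map step above can be sketched in plain Java. This is only a simulation using the standard library, not the Hadoop API; the class name MapPhaseSketch and the method map are hypothetical names chosen for illustration. In a real job the same logic would sit inside a Hadoop Mapper and the pairs would be written to the framework's context instead of a list.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical stand-in for a mapper: plain Java, no Hadoop classes involved.
class MapPhaseSketch {

    // Emits one <word, 1> pair per whitespace-separated token, in input order.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The sentence from file 1 in the example above.
        for (Map.Entry<String, Integer> pair : map("Hello World Hello World")) {
            System.out.println("< " + pair.getKey() + ", " + pair.getValue() + ">");
        }
    }
}
```

Running main on the sentence from file 1 prints the four pairs listed for the first map.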
Combiner/Sorting (this is done locally for each individual map, before the data is sent to the reducer)
After combining, the output looks like this:
The output of the first map:
< Hello, 2>
< World, 2>
The output of the second map:
< Hello, 2>
< World, 2>
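A minimal sketch of the combine step, again as a plain Java simulation rather than Hadoop code. The class name CombinerSketch and the method combine are assumptions made for this example. It takes the keys of one mapper's <word, 1> pairs and adds them up locally; a TreeMap is used so the keys come out sorted, mimicking the sort that happens on the map side.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a combiner running on a single mapper's output.
class CombinerSketch {

    // Each element of mapOutputKeys represents one <word, 1> pair from the mapper.
    static Map<String, Integer> combine(List<String> mapOutputKeys) {
        Map<String, Integer> combined = new TreeMap<>(); // keeps keys sorted
        for (String word : mapOutputKeys) {
            combined.merge(word, 1, Integer::sum); // add 1 to the running count
        }
        return combined;
    }

    public static void main(String[] args) {
        // Keys emitted by the first map in the example above.
        List<String> firstMapKeys = Arrays.asList("Hello", "World", "Hello", "World");
        System.out.println(combine(firstMapKeys)); // prints {Hello=2, World=2}
    }
}
```

Because the combiner only sees one mapper's pairs, each file yields 2 for each word here, not the final total.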
Reducer:
It receives the combined outputs of all the maps, sums the counts for each word, and generates the output as below
< Hello, 4>
< World, 4>
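The reduce step can be sketched the same way. This is a plain Java simulation under the assumption that a single reducer receives the combined output of both maps; ReducerSketch and reduce are hypothetical names, not part of Hadoop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a reducer that sees the partial counts of every map.
class ReducerSketch {

    // Sums the partial counts per word across all combined map outputs.
    static Map<String, Integer> reduce(List<Map<String, Integer>> combinedOutputs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, Integer> partial : combinedOutputs) {
            for (Map.Entry<String, Integer> entry : partial.entrySet()) {
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Combined output of the first map: <Hello, 2>, <World, 2>.
        Map<String, Integer> firstMap = new TreeMap<>();
        firstMap.put("Hello", 2);
        firstMap.put("World", 2);
        // The second map produced the same counts in this example.
        Map<String, Integer> secondMap = new TreeMap<>(firstMap);
        System.out.println(reduce(Arrays.asList(firstMap, secondMap))); // prints {Hello=4, World=4}
    }
}
```

Summing is associative and commutative, which is why the same addition can be applied both in the combiner and in the reducer without changing the result.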
Output
The final output would look like:
Hello 4 times
World 4 times
