Inverted Index: MapReduce Implementation

3/3/2017

Map phase
<0,"this is google">
....
context.write("google->a.txt",1);
context.write("google->a.txt",1);
context.write("google->a.txt",1);
context.write("google->a.txt",1);
context.write("google->a.txt",1);
context.write("google->b.txt",1);
context.write("google->b.txt",1);
context.write("google->b.txt",1);
--------------------------------------------------------
Combiner phase
<"google->a.txt",1>
<"google->a.txt",1>
<"google->a.txt",1>
<"google->a.txt",1>
<"google->a.txt",1>
<"google->b.txt",1>
<"google->b.txt",1>
<"google->b.txt",1>
context.write("google","a.txt->5");
context.write("google","b.txt->3");
--------------------------------------------------------
Reducer phase
<"google",{"a.txt->5","b.txt->3"}>
context.write("google","a.txt->5 b.txt->3");
-------------------------------------------------------
google    "a.txt->5 b.txt->3"
tom       "a.txt->2 b.txt->1"
kitty     "a.txt->1"
.......
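The three phases traced above can be sketched in plain Java without a Hadoop dependency, which makes the data flow easy to follow and run locally. This is only an illustration under stated assumptions: the class name `InvertedIndexSketch`, its methods, and the sample file contents are hypothetical. It also assumes that words never contain `->` and that the combiner runs exactly once over all map output, whereas a real Hadoop job may run a combiner zero or more times per map task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free simulation of the map / combiner / reducer flow.
public class InvertedIndexSketch {

    // Map phase: emit ("word->file", "1") for every word occurrence in a line.
    static List<String[]> map(String fileName, String line) {
        List<String[]> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new String[] {word + "->" + fileName, "1"});
            }
        }
        return out;
    }

    // Combiner phase: sum the counts per "word->file" key, then re-key the
    // record by word so that the value becomes "file->count".
    static List<String[]> combine(List<String[]> mapped) {
        Map<String, Integer> sums = new TreeMap<>();
        for (String[] kv : mapped) {
            sums.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
        }
        List<String[]> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : sums.entrySet()) {
            String[] parts = e.getKey().split("->", 2); // [word, file]
            out.add(new String[] {parts[0], parts[1] + "->" + e.getValue()});
        }
        return out;
    }

    // Reducer phase: join all "file->count" values of a word with spaces.
    static Map<String, String> reduce(List<String[]> combined) {
        Map<String, String> index = new TreeMap<>();
        for (String[] kv : combined) {
            index.merge(kv[0], kv[1], (a, b) -> a + " " + b);
        }
        return index;
    }

    // Run the three phases in order over an in-memory map of file -> lines.
    static Map<String, String> buildIndex(Map<String, List<String>> files) {
        List<String[]> mapped = new ArrayList<>();
        for (Map.Entry<String, List<String>> file : files.entrySet()) {
            for (String line : file.getValue()) {
                mapped.addAll(map(file.getKey(), line));
            }
        }
        return reduce(combine(mapped));
    }

    public static void main(String[] args) {
        // Made-up sample input, not the data behind the counts shown above.
        Map<String, List<String>> files = new TreeMap<>();
        files.put("a.txt", Arrays.asList("hello tom", "hello kitty"));
        files.put("b.txt", Arrays.asList("hello tom"));
        buildIndex(files).forEach((word, postings) ->
                System.out.println(word + "\t" + postings));
    }
}
```

In a real job the file name would typically come from the input split rather than a method argument. Because Hadoop treats the combiner as an optional optimization, re-keying inside it as done here is a shortcut that a production job would need to handle more carefully.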