
How to solve a Hadoop programming exercise

Implement the following logic with a custom MapReduce (MR) job.

product_no lac_id moment start_time user_id county_id staytime city_id
13429100031 22554 8 2013-03-11 08:55:19.151754088 571 571 282 571
13429100082 22540 8 2013-03-11 08:58:20.152622488 571 571 270 571
13429100082 22691 8 2013-03-11 08:56:37.149593624 571 571 103 571
13429100087 22705 8 2013-03-11 08:56:51.139539816 571 571 220 571
13429100087 22540 8 2013-03-11 08:55:45.150276800 571 571 66 571
13429100082 22540 8 2013-03-11 08:55:38.140225200 571 571 133 571
13429100140 26642 9 2013-03-11 09:02:19.151754088 571 571 18 571
13429100082 22691 8 2013-03-11 08:57:32.151754088 571 571 287 571
13429100189 22558 8 2013-03-11 08:56:24.139539816 571 571 48 571
13429100349 22503 8 2013-03-11 08:54:30.152622440 571 571 211 571

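Field descriptions:

product_no: the user's mobile phone number;

lac_id: the base station the user is attached to;

start_time: the time at which the user's stay at this base station began;

staytime: how long the user stayed at this base station.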

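Requirement:

lac_id and start_time give the user's location at a given moment, and staytime gives how long the user stayed at each base station. Following each user's trajectory, merge the staytime of consecutive records at the same base station. The final result lists, for every user, the time spent at each base station, ordered by time.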
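The merge step described above can be sketched in plain Java, outside Hadoop. This is only an illustration under stated assumptions: the class StayMergeSketch, the record type Stay and the method mergeConsecutive are hypothetical names, "consecutive" is taken to mean adjacent in time order for one user, and the input is assumed to be already sorted by start_time in descending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the original job. */
final class StayMergeSketch {

    /** One record of a single user, reduced to the fields needed for merging. */
    static final class Stay {
        final String lacId;
        final String startTime;
        final int stayTime;

        Stay(String lacId, String startTime, int stayTime) {
            this.lacId = lacId;
            this.startTime = startTime;
            this.stayTime = stayTime;
        }
    }

    /**
     * Merges runs of adjacent records that share the same lac_id.
     * Assumes the list holds one user's records sorted by start_time descending.
     */
    static List<Stay> mergeConsecutive(List<Stay> sortedDesc) {
        List<Stay> merged = new ArrayList<>();
        for (Stay s : sortedDesc) {
            int last = merged.size() - 1;
            if (last >= 0 && merged.get(last).lacId.equals(s.lacId)) {
                Stay prev = merged.get(last);
                // s is the earlier record, so keep its start time and add the durations.
                merged.set(last, new Stay(s.lacId, s.startTime, prev.stayTime + s.stayTime));
            } else {
                merged.add(s);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Records of user 13429100082 from the sample input, newest first.
        List<Stay> user = Arrays.asList(
            new Stay("22540", "2013-03-11 08:58:20.152622488", 270),
            new Stay("22691", "2013-03-11 08:57:32.151754088", 287),
            new Stay("22691", "2013-03-11 08:56:37.149593624", 103),
            new Stay("22540", "2013-03-11 08:55:38.140225200", 133));
        for (Stay s : mergeConsecutive(user)) {
            System.out.println(s.lacId + " " + s.startTime + " " + s.stayTime);
        }
    }
}
```

In a MapReduce job, logic of this kind would run in the reducer after the records of one user have been brought together in sorted order; how to group them is a separate design decision that this sketch does not cover.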

Expected output (example):

13429100082 22540 8 2013-03-11 08:58:20.152622488 571 571 270 571
13429100082 22691 8 2013-03-11 08:56:37.149593624 571 571 390 571
13429100082 22540 8 2013-03-11 08:55:38.140225200 571 571 133 571
13429100087 22705 8 2013-03-11 08:56:51.139539816 571 571 220 571
13429100087 22540 8 2013-03-11 08:55:45.150276800 571 571 66 571

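Analysis of the result above:

The first column (product_no) is in ascending order and the fourth column (start_time) is in descending order. So the first step is to extract these two columns and then sort on them with a custom ordering.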
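To make the ordering concrete, here is a small stand-alone sketch in plain Java that extracts the two columns and sorts lines by them. The names ExtractAndSortSketch, extractKey, compareKeys and sortLines are hypothetical, and the regular expression is one possible choice. It also assumes that both fields are fixed-width, so that plain string comparison matches numeric and chronological order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-alone sketch; not part of the original job. */
final class ExtractAndSortSketch {

    // An 11-digit phone number at the start of the line, then the first
    // timestamp with nine fractional digits that follows it.
    private static final Pattern KEY = Pattern.compile(
        "^(\\d{11})\\s.*?(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{9})");

    /** Returns {product_no, start_time}, or null when the line has no such fields. */
    static String[] extractKey(String line) {
        Matcher m = KEY.matcher(line);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    /** product_no ascending, then start_time descending (operands swapped). */
    static int compareKeys(String[] a, String[] b) {
        int byPhone = a[0].compareTo(b[0]);
        return byPhone != 0 ? byPhone : b[1].compareTo(a[1]);
    }

    /** Drops lines without a key (such as the header) and sorts the rest. */
    static List<String> sortLines(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (extractKey(line) != null) {
                kept.add(line);
            }
        }
        kept.sort((x, y) -> compareKeys(extractKey(x), extractKey(y)));
        return kept;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "product_no lac_id moment start_time user_id county_id staytime city_id",
            "13429100082 22691 8 2013-03-11 08:56:37.149593624 571 571 103 571",
            "13429100087 22540 8 2013-03-11 08:55:45.150276800 571 571 66 571",
            "13429100082 22540 8 2013-03-11 08:58:20.152622488 571 571 270 571");
        for (String line : sortLines(sample)) {
            System.out.println(line);
        }
    }
}
```

In the MapReduce version the same comparison lives in the compareTo method of the composite key, so the framework's shuffle performs the sort instead of an explicit call.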

The sorting step is implemented as follows:

package FindFriend;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class StringComp2 {

    final static String INPUT_PATH = "hdfs://master:8020/liguodong/test2";
    final static String OUT_PATH = "hdfs://master:8020/liguodong/test2out";

    public static void main(String[] args) throws IOException,
            URISyntaxException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        // Delete a previous output directory so that the job can be rerun.
        final FileSystem fs = FileSystem.get(new URI(INPUT_PATH), conf);
        if (fs.exists(new Path(OUT_PATH))) {
            fs.delete(new Path(OUT_PATH), true);
        }

        Job job = Job.getInstance(conf, "date sort");
        job.setJarByClass(StringComp2.class);

        job.setMapperClass(MyMapper.class);
        job.setMapOutputKeyClass(NewK2.class);
        job.setMapOutputValueClass(Text.class);

        //job.setCombinerClass(MyReducer.class);
        job.setReducerClass(MyReducer.class);
        job.setOutputKeyClass(Text.class);
        // Matches the value type declared by MyReducer.
        job.setOutputValueClass(NullWritable.class);

        FileInputFormat.addInputPath(job, new Path(INPUT_PATH));
        FileOutputFormat.setOutputPath(job, new Path(OUT_PATH));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    static class MyMapper extends Mapper<LongWritable, Text, NewK2, Text> {
        // A regular expression extracts the product_no and start_time columns.
        // The dot before the fractional seconds is escaped so that it only matches a literal dot.
        private static final Pattern PATTERN = Pattern.compile(
            "([\\d]{11})|([\\d]{4}-[\\d]{2}-[\\d]{2} [\\d]{2}:[\\d]{2}:[\\d]{2}\\.[\\d]{9})");

        @Override
        protected void map(LongWritable k1, Text v1, Context context)
                throws IOException, InterruptedException {
            Matcher matcher = PATTERN.matcher(v1.toString());
            // Skip lines without both fields, such as the header line or blank lines.
            if (!matcher.find()) {
                return;
            }
            String str1 = matcher.group();
            if (!matcher.find()) {
                return;
            }
            String str2 = matcher.group();
            final NewK2 k2 = new NewK2(str1, str2);
            context.write(k2, v1);
        }
    }

    static class MyReducer extends Reducer<NewK2, Text, Text, NullWritable> {
        @Override
        protected void reduce(NewK2 k2, Iterable<Text> v2s, Context context)
                throws IOException, InterruptedException {
            // Keys arrive already sorted, so each line is written out unchanged.
            for (Text v2 : v2s) {
                context.write(v2, NullWritable.get());
            }
        }
    }

    static class NewK2 implements WritableComparable<NewK2> {
        String first;
        String second;

        public NewK2() {}

        public NewK2(String first, String second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            this.first = in.readUTF();
            this.second = in.readUTF();
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(first);
            out.writeUTF(second);
        }

        /**
         * Called when k2 keys are sorted.
         * Keys with different first columns sort ascending by the first column;
         * keys with the same first column sort descending by the second column.
         */
        @Override
        public int compareTo(NewK2 o) {
            final int minus = compTo(this.first, o.first);
            if (minus != 0) {
                return minus;
            }
            return -compTo(this.second, o.second);
        }

        // Character-by-character comparison modelled on String.compareTo in the JDK source.
        // The loop is bounded by the shorter string; a length difference breaks the tie.
        public int compTo(String one, String another) {
            int len = Math.min(one.length(), another.length());
            for (int k = 0; k < len; k++) {
                char c1 = one.charAt(k);
                char c2 = another.charAt(k);
                if (c1 != c2) {
                    return c1 - c2;
                }
            }
            return one.length() - another.length();
        }

        @Override
        public int hashCode() {
            return this.first.hashCode() + this.second.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof NewK2)) {
                return false;
            }
            NewK2 oK2 = (NewK2) obj;
            // Compare string contents, not references.
            return this.first.equals(oK2.first) && this.second.equals(oK2.second);
        }
    }
}
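Hadoop serializes a key by calling its write method and rebuilds it with readFields, so the two methods have to handle the fields in the same order. The following plain-Java sketch mimics that round trip without any Hadoop dependency. KeyRoundTripSketch is a hypothetical stand-in for NewK2 and does not implement WritableComparable; it only uses java.io streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical plain-Java stand-in for NewK2; for illustration only. */
final class KeyRoundTripSketch {
    String first;
    String second;

    KeyRoundTripSketch() {}

    KeyRoundTripSketch(String first, String second) {
        this.first = first;
        this.second = second;
    }

    /** Writes the fields in a fixed order: first, then second. */
    void write(DataOutput out) throws IOException {
        out.writeUTF(first);
        out.writeUTF(second);
    }

    /** Reads the fields back in exactly the order used by write. */
    void readFields(DataInput in) throws IOException {
        first = in.readUTF();
        second = in.readUTF();
    }

    /** Serializes the key to a byte array and rebuilds a fresh copy from it. */
    static KeyRoundTripSketch roundTrip(KeyRoundTripSketch key) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                key.write(out);
            }
            KeyRoundTripSketch copy = new KeyRoundTripSketch();
            try (DataInputStream in =
                    new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        KeyRoundTripSketch key =
            new KeyRoundTripSketch("13429100082", "2013-03-11 08:58:20.152622488");
        KeyRoundTripSketch copy = roundTrip(key);
        System.out.println(copy.first + " | " + copy.second);
    }
}
```

Because the rebuilt key holds newly created String objects, any equality check on such keys needs to compare contents with equals rather than references with ==.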

Run result:
