Hadoop File Operations and a Test Job Run

1. List HDFS files or directories
[linuxidc@Hadoop02 ~]$ cd hadoop-1.1.2
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -ls
[linuxidc@hadoop02 hadoop-1.1.2]$ echo $?
0
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -lsr
[linuxidc@hadoop02 hadoop-1.1.2]$ echo $?
0
[linuxidc@hadoop02 hadoop-1.1.2]$
2. Create an HDFS directory

[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -mkdir TEST
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -lsr
drwxr-xr-x - linuxidc supergroup 0 2013-09-10 14:08 /user/linuxidc/TEST
3. Upload a file to an HDFS directory
Create a local file with the following content:
[linuxidc@hadoop02 hadoop-1.1.2]$ touch test.txt
[linuxidc@hadoop02 hadoop-1.1.2]$ vim test.txt
[linuxidc@hadoop02 hadoop-1.1.2]$ cat test.txt
Hello, Hadoop !
你好, Hadoop !
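To upload it to the TEST directory in HDFS, use either -copyFromLocal or -moveFromLocal. The former copies the local file into HDFS and leaves the original in place; the latter moves it, so the local copy is removed after the upload.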
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -moveFromLocal test.txt TEST
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -ls
Found 1 items
drwxr-xr-x - linuxidc supergroup 0 2013-09-10 14:14 /user/linuxidc/TEST
[linuxidc@hadoop02 hadoop-1.1.2]$
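The transcript above uses -moveFromLocal. As a rough sketch of the -copyFromLocal variant, the helper below uploads a file, checks with `fs -test -e` that it arrived, and confirms the local copy is still there. The function name upload_keep_local and the HADOOP variable are made up for this illustration; the default path simply mirrors the ./bin/hadoop launcher used in this article.

```shell
#!/bin/sh
# Assumption: HADOOP points at the hadoop launcher script.
# The default matches the relative path used in the transcripts above.
HADOOP=${HADOOP:-./bin/hadoop}

# upload_keep_local SRC DST_DIR   (hypothetical helper, not part of Hadoop)
# Copies SRC into the HDFS directory DST_DIR and keeps the local file.
upload_keep_local() {
    src=$1
    dst=$2
    # Copy the local file into HDFS; stop if the upload fails.
    "$HADOOP" fs -copyFromLocal "$src" "$dst" || return 1
    # "fs -test -e" exits with 0 when the path exists in HDFS.
    "$HADOOP" fs -test -e "$dst/$(basename "$src")" || return 1
    # Unlike -moveFromLocal, the local source should still be present.
    [ -f "$src" ]
}

# Example call (needs a running cluster):
#   upload_keep_local test.txt TEST && echo "uploaded, local copy kept"
```

With -moveFromLocal the final check would fail by design, because the local file is deleted once the upload completes.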
4. View the contents of an HDFS file
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -cat TEST/test.txt
Hello, Hadoop !
你好, Hadoop !
[linuxidc@hadoop02 hadoop-1.1.2]$
5. Run the bundled example jar to count word frequencies in the text
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop jar hadoop-examples-1.1.2.jar wordcount TEST out
13/09/10 14:20:50 INFO input.FileInputFormat: Total input paths to process : 1
13/09/10 14:20:50 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/09/10 14:20:50 WARN snappy.LoadSnappy: Snappy native library not loaded
13/09/10 14:20:51 INFO mapred.JobClient: Running job: job_201309082325_0001
13/09/10 14:20:52 INFO mapred.JobClient: map 0% reduce 0%
13/09/10 14:21:03 INFO mapred.JobClient: map 100% reduce 0%
13/09/10 14:21:11 INFO mapred.JobClient: map 100% reduce 33%
13/09/10 14:21:13 INFO mapred.JobClient: map 100% reduce 100%
13/09/10 14:21:14 INFO mapred.JobClient: Job complete: job_201309082325_0001
13/09/10 14:21:14 INFO mapred.JobClient: Counters: 29
13/09/10 14:21:14 INFO mapred.JobClient: Job Counters
13/09/10 14:21:14 INFO mapred.JobClient: Launched reduce tasks=1
13/09/10 14:21:14 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=10619
13/09/10 14:21:14 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/09/10 14:21:14 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/09/10 14:21:14 INFO mapred.JobClient: Launched map tasks=1
13/09/10 14:21:14 INFO mapred.JobClient: Data-local map tasks=1
13/09/10 14:21:14 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=9864
13/09/10 14:21:14 INFO mapred.JobClient: File Output Format Counters
13/09/10 14:21:14 INFO mapred.JobClient: Bytes Written=38
13/09/10 14:21:14 INFO mapred.JobClient: FileSystemCounters
13/09/10 14:21:14 INFO mapred.JobClient: FILE_BYTES_READ=64
13/09/10 14:21:14 INFO mapred.JobClient: HDFS_BYTES_READ=146
13/09/10 14:21:14 INFO mapred.JobClient: FILE_BYTES_WRITTEN=109756
13/09/10 14:21:14 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=38
13/09/10 14:21:14 INFO mapred.JobClient: File Input Format Counters
13/09/10 14:21:14 INFO mapred.JobClient: Bytes Read=35
13/09/10 14:21:14 INFO mapred.JobClient: Map-Reduce Framework
13/09/10 14:21:14 INFO mapred.JobClient: Map output materialized bytes=64
13/09/10 14:21:14 INFO mapred.JobClient: Map input records=2
13/09/10 14:21:14 INFO mapred.JobClient: Reduce shuffle bytes=64
13/09/10 14:21:14 INFO mapred.JobClient: Spilled Records=10
13/09/10 14:21:14 INFO mapred.JobClient: Map output bytes=59
13/09/10 14:21:14 INFO mapred.JobClient: Total committed heap usage (bytes)=189464576
13/09/10 14:21:14 INFO mapred.JobClient: CPU time spent (ms)=4420
13/09/10 14:21:14 INFO mapred.JobClient: Combine input records=6
13/09/10 14:21:14 INFO mapred.JobClient: SPLIT_RAW_BYTES=111
13/09/10 14:21:14 INFO mapred.JobClient: Reduce input records=5
13/09/10 14:21:14 INFO mapred.JobClient: Reduce input groups=5
13/09/10 14:21:14 INFO mapred.JobClient: Combine output records=5
13/09/10 14:21:14 INFO mapred.JobClient: Physical memory (bytes) snapshot=281489408
13/09/10 14:21:14 INFO mapred.JobClient: Reduce output records=5
13/09/10 14:21:14 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1542262784
13/09/10 14:21:14 INFO mapred.JobClient: Map output records=6
6. Check the result of step 5
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -ls out
Found 3 items
-rw-r--r-- 1 linuxidc supergroup 0 2013-09-10 14:20 /user/linuxidc/out/_SUCCESS
drwxr-xr-x - linuxidc supergroup 0 2013-09-10 14:19 /user/linuxidc/out/_logs
-rw-r--r-- 1 linuxidc supergroup 38 2013-09-10 14:20 /user/linuxidc/out/part-r-00000
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -cat out/part-r-00000
! 1
Hadoop 2
Hello, 1
你好, 1
! 1
[linuxidc@hadoop02 hadoop-1.1.2]$
7. Delete the HDFS test files and directories
[linuxidc@hadoop02 hadoop-1.1.2]$ ./bin/hadoop fs -rmr TEST out
Deleted hdfs://hadoop01:9000/user/linuxidc/TEST
Deleted hdfs://hadoop01:9000/user/linuxidc/out
[linuxidc@hadoop02 hadoop-1.1.2]$
This completes the test.

For more Hadoop example programs, see the official API documentation.
