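1: Basic Hadoop 2.6.4 shell commands
Type hadoop to see the basic usage of the command.
Hadoop's environment variables must be configured before doing this; for how to configure them, see this article: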
http://blog.csdn.net/mastethuang/article/details/51867115
huang@ubuntu:~$ hadoop
The usage of the hadoop command is then printed:
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
  credential           interact with credential providers
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
  trace                view and modify Hadoop tracing settings
 or
  CLASSNAME            run the class named CLASSNAME

Most commands print help when invoked w/o parameters.
View the HDFS commands:
huang@ubuntu:~$ hadoop fs
The usage of the fs command:
Usage: hadoop fs [generic options]
    [-appendToFile <localsrc> ... <dst>]
    [-cat [-ignoreCrc] <src> ...]
    [-checksum <src> ...]
    [-chgrp [-R] GROUP PATH...]
    [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
    [-chown [-R] [OWNER][:[GROUP]] PATH...]
    [-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
    [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-count [-q] [-h] <path> ...]
    [-cp [-f] [-p | -p[topax]] <src> ... <dst>]
    [-createSnapshot <snapshotDir> [<snapshotName>]]
    [-deleteSnapshot <snapshotDir> <snapshotName>]
    [-df [-h] [<path> ...]]
    [-du [-s] [-h] <path> ...]
    [-expunge]
    [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
    [-getfacl [-R] <path>]
    [-getfattr [-R] {-n name | -d} [-e en] <path>]
    [-getmerge [-nl] <src> <localdst>]
    [-help [cmd ...]]
    [-ls [-d] [-h] [-R] [<path> ...]]
    [-mkdir [-p] <path> ...]
    [-moveFromLocal <localsrc> ... <dst>]
    [-moveToLocal <src> <localdst>]
    [-mv <src> ... <dst>]
    [-put [-f] [-p] [-l] <localsrc> ... <dst>]
    [-renameSnapshot <snapshotDir> <oldName> <newName>]
    [-rm [-f] [-r|-R] [-skipTrash] <src> ...]
    [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
    [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
    [-setfattr {-n name [-v value] | -x name} <path>]
    [-setrep [-R] [-w] <rep> <path> ...]
    [-stat [format] <path> ...]
    [-tail [-f] <file>]
    [-test -[defsz] <path>]
    [-text [-ignoreCrc] <src> ...]
    [-touchz <path> ...]
    [-usage [cmd ...]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|resourcemanager:port>    specify a ResourceManager
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
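2: Running the wordcount program
Before running wordcount, we need to create an input directory for the data in the HDFS file system.
Using the fs commands listed above:
Create the wc directory: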
huang@ubuntu:~$ hadoop fs -mkdir /wc/
Create the input directory under wc:
huang@ubuntu:~$ hadoop fs -mkdir /wc/input/
Check whether the directories were created:
huang@ubuntu:~$ hadoop fs -ls -R /
On success, the output is:
drwxr-xr-x   - huang supergroup          0 2016-07-09 20:36 /wc
drwxr-xr-x   - huang supergroup          0 2016-07-09 20:36 /wc/input
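The two mkdir calls can also be collapsed into one, since the fs usage text lists a -p option for -mkdir. What follows is a minimal sketch, assuming -p creates missing parent directories and does not fail when the directory already exists; the function name make_hdfs_dir is made up for illustration and is not part of Hadoop.

```shell
#!/bin/sh
# make_hdfs_dir is a hypothetical helper, not a Hadoop command.
# It creates an HDFS directory together with any missing parents,
# then lists the directory entry so the result can be checked by eye.
make_hdfs_dir() {
    dir="$1"
    # -p is listed in the fs usage text: [-mkdir [-p] <path> ...]
    hadoop fs -mkdir -p "$dir" || return 1
    # -d lists the directory itself rather than its contents
    hadoop fs -ls -d "$dir"
}
```

It would be called as make_hdfs_dir /wc/input.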
Add the files to be word-counted to the input directory (here, all the XML files under /usr/local/hadoop-2.6.4/etc/hadoop/ are added to input):
huang@ubuntu:~$ hadoop fs -put /usr/local/hadoop-2.6.4/etc/hadoop/*.xml /wc/input/
Check whether the XML files were added to input:
huang@ubuntu:~$ hadoop fs -ls -R /
Output like the following means the files were added successfully:
drwxr-xr-x   - huang supergroup          0 2016-07-09 20:36 /wc
drwxr-xr-x   - huang supergroup          0 2016-07-09 20:39 /wc/input
-rw-r--r--   1 huang supergroup       4436 2016-07-09 20:39 /wc/input/capacity-scheduler.xml
-rw-r--r--   1 huang supergroup       1122 2016-07-09 20:39 /wc/input/core-site.xml
-rw-r--r--   1 huang supergroup       9683 2016-07-09 20:39 /wc/input/hadoop-policy.xml
-rw-r--r--   1 huang supergroup       1199 2016-07-09 20:39 /wc/input/hdfs-site.xml
-rw-r--r--   1 huang supergroup        620 2016-07-09 20:39 /wc/input/httpfs-site.xml
-rw-r--r--   1 huang supergroup       3523 2016-07-09 20:39 /wc/input/kms-acls.xml
-rw-r--r--   1 huang supergroup       5511 2016-07-09 20:39 /wc/input/kms-site.xml
-rw-r--r--   1 huang supergroup        690 2016-07-09 20:39 /wc/input/yarn-site.xml
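Instead of reading the listing by eye, the upload can be checked by comparing file counts. This is only a sketch: upload_and_check is a made-up name, and it assumes the destination directory was empty before the upload and that hadoop fs -count prints its columns in the order DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME.

```shell
#!/bin/sh
# upload_and_check is a hypothetical helper, not a Hadoop command.
# Usage: upload_and_check <hdfs-dest-dir> <local-file>...
# Assumes <hdfs-dest-dir> holds no files before the call.
upload_and_check() {
    dest="$1"
    shift
    # After the shift, "$@" holds only the local files to upload
    hadoop fs -put "$@" "$dest" || return 1
    # Second column of -count output is taken to be the file count (assumption)
    uploaded="$(hadoop fs -count "$dest" | awk '{ print $2 }')"
    # Succeeds only when HDFS holds as many files as were passed in
    [ "$uploaded" -eq "$#" ]
}
```

For the files in this walkthrough it would be called as upload_and_check /wc/input/ /usr/local/hadoop-2.6.4/etc/hadoop/*.xml.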
Now run the wordcount program. Type hadoop jar to see its usage:
huang@ubuntu:~$ hadoop jar
RunJar jarFile [mainClass] args...
Hadoop 2.6.4 ships with a wordcount example. Go to the directory where you installed Hadoop; the jar is in its share/hadoop/mapreduce/ subdirectory:
huang@ubuntu:/usr/local/hadoop-2.6.4/share/hadoop/mapreduce$ ls
The directory contains the following; hadoop-mapreduce-examples-2.6.4.jar is the one we need:
hadoop-mapreduce-client-app-2.6.4.jar
hadoop-mapreduce-client-hs-2.6.4.jar
hadoop-mapreduce-client-jobclient-2.6.4-tests.jar
lib
hadoop-mapreduce-client-common-2.6.4.jar
hadoop-mapreduce-client-hs-plugins-2.6.4.jar
hadoop-mapreduce-client-shuffle-2.6.4.jar
lib-examples
hadoop-mapreduce-client-core-2.6.4.jar
hadoop-mapreduce-client-jobclient-2.6.4.jar
hadoop-mapreduce-examples-2.6.4.jar
sources
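If you would rather not hard-code the version number in the jar name, the jar can be located with a glob. A minimal sketch, assuming the jar always lives under share/hadoop/mapreduce/ of the install directory as it does here; find_examples_jar is a made-up helper name.

```shell
#!/bin/sh
# find_examples_jar is a hypothetical helper, not part of Hadoop.
# Given a Hadoop install directory, it prints the path of the bundled
# examples jar, or fails if none is found.
find_examples_jar() {
    hadoop_dir="$1"
    for jar in "$hadoop_dir"/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar; do
        # An unmatched glob stays as the literal pattern, so test for a real file
        if [ -f "$jar" ]; then
            echo "$jar"
            return 0
        fi
    done
    echo "no examples jar found under $hadoop_dir" >&2
    return 1
}
```

With the install path used in this article, find_examples_jar /usr/local/hadoop-2.6.4 should print the full path of hadoop-mapreduce-examples-2.6.4.jar.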
Now we can run this jar to count the words (note that the output directory must not already exist):
huang@ubuntu:/usr/local/hadoop-2.6.4/share/hadoop/mapreduce$ hadoop jar hadoop-mapreduce-examples-2.6.4.jar wordcount /wc/input/ /wc/output/
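Because the job refuses to start when the output directory already exists, re-running it means deleting the old output first. The following is a sketch of one way to do that with the -test and -rm options listed in the fs usage text; run_wordcount is a made-up wrapper name, and note that it deletes the previous results without asking.

```shell
#!/bin/sh
# run_wordcount is a hypothetical wrapper, not part of Hadoop.
# Usage: run_wordcount <examples-jar> <hdfs-input-dir> <hdfs-output-dir>
run_wordcount() {
    jar="$1"
    input="$2"
    output="$3"
    # -test -d exits with status 0 when the path exists and is a directory
    if hadoop fs -test -d "$output"; then
        # Remove the previous results; -r deletes the directory recursively
        hadoop fs -rm -r "$output" || return 1
    fi
    hadoop jar "$jar" wordcount "$input" "$output"
}
```

From the mapreduce directory it would be called as run_wordcount hadoop-mapreduce-examples-2.6.4.jar /wc/input/ /wc/output/.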
View the results:
huang@ubuntu:~$ hadoop fs -ls -R /wc/output/
The output directory contains two files:
-rw-r--r--   1 huang supergroup          0 2016-07-09 20:50 /wc/output/_SUCCESS
-rw-r--r--   1 huang supergroup      10431 2016-07-09 20:50 /wc/output/part-r-00000
Open part-r-00000:
huang@ubuntu:~$ hadoop fs -text /wc/output/part-r-00000
The word counts are then printed (only a small part of the result is shown below):
via 1
when 4
where 1
which 5
while 1
who 2
will 7
window 1
window, 1
with 27
within 1
without 1
work 1
writing, 8
you 9
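For a small input like this one, the result can be cross-checked locally with standard Unix tools. This is a rough sketch, not the MapReduce implementation: it assumes the bundled example splits on whitespace only, which the separate counts for "window" and "window," above suggest. The function name local_wordcount is made up.

```shell
#!/bin/sh
# local_wordcount is a hypothetical helper for cross-checking small inputs.
# It reads text on stdin and prints one "word<TAB>count" line per distinct token.
# Steps: put each whitespace-separated token on its own line, drop empty
# lines, sort in byte order so equal tokens are adjacent, count them with
# uniq -c, then swap uniq's "count word" columns.
local_wordcount() {
    tr -s '[:space:]' '\n' |
        grep -v '^$' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}
```

For example, cat /usr/local/hadoop-2.6.4/etc/hadoop/*.xml | local_wordcount produces counts that can be compared against part-r-00000.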




