This is a version of the code that runs in China Mobile Research Institute's clusters and is powered by Apache Hadoop.
This code is based on Cloudera's Hadoop Distribution (CDH3U4) and Facebook's AvatarNode.
CMRI-CHANGES.txt contains the additional patches that have been committed to the original code base.
For instructions on starting a Hadoop cluster, see below.
System Requirements
- BCH-1.0.0
- ZooKeeper 3.4.3 or another stable version
- At least 3 nodes: Node1 is the primary node, Node2 is the standby node, and Node3 is the NFS server and ZooKeeper server
0. Compile bc-hadoop
- download the cdh3u4 source code from Cloudera's website: http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u4.tar.gz
- copy the files/dirs from this repository into the cdh3u4 directory, overwriting any that already exist
- ant package (a consolidated sketch of these steps follows)
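A consolidated sketch of step 0; the checkout directory name 'bc-hadoop' is illustrative:
----------------------------------
# Download CDH3U4, overlay this repository's files, and build.
wget http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u4.tar.gz
tar -xzf hadoop-0.20.2-cdh3u4.tar.gz
cp -r bc-hadoop/* hadoop-0.20.2-cdh3u4/   # overwrite existing files
cd hadoop-0.20.2-cdh3u4
ant package
----------------------------------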
1. Install and Config Zookeeper
- Please follow the instructions at http://zookeeper.apache.org/
2. Install and Config Hadoop
- Please follow the instructions at http://hadoop.apache.org/ to configure HDFS and MapReduce
3. Config Hadoop NameNode HA
- config the NFS server on Node3 (a consolidated sketch follows this list)
. mkdir /nfsSvr
. edit /etc/exports and add '/nfsSvr *(rw,sync,no_root_squash)'
. exportfs -rv
. start the nfs service: service nfs start
. on Node1 and Node2, mount the export: mount -t nfs Node3:/nfsSvr /mnt/nfs
. on Node1 and Node2, change the permissions of /mnt/nfs: chmod 777 /mnt/nfs/
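The same NFS setup, consolidated (run the first block on Node3, the second on Node1 and Node2; creating the /mnt/nfs mount point first is assumed):
----------------------------------
# On Node3 (NFS server):
mkdir /nfsSvr
echo '/nfsSvr *(rw,sync,no_root_squash)' >> /etc/exports
exportfs -rv
service nfs start
# On Node1 and Node2 (NFS clients):
mkdir -p /mnt/nfs
mount -t nfs Node3:/nfsSvr /mnt/nfs
chmod 777 /mnt/nfs/
----------------------------------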
- add zookeeper and HA jars to hadoop lib dir:
. cp contrib/highavailability/hadoop-highavailability-0.20.2-cdh3u4.jar lib/
. cp zookeeper-3.4.3.jar lib/
- modify the conf files
. add the hostnames of Node1 and Node2 to conf/masters
. vim conf/core-site.xml, add the following contents:
----------------------------------
<property>
  <name>fs.default.name0</name>
  <value>hdfs://HOSTNAME_Node1:9000</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class. The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
<property>
  <name>fs.default.name1</name>
  <value>hdfs://HOSTNAME_Node2:9000</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class. The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
<property>
  <name>fs.ha.zookeeper.quorum</name>
  <value>IP_OF_Node3</value>
</property>
<property>
  <name>fs.ha.zookeeper.prefix</name>
  <value>/hdfs</value>
</property>
<property>
  <name>fs.hdfs.impl</name>
  <value>org.apache.hadoop.hdfs.DistributedAvatarFileSystem</value>
  <description>The FileSystem for hdfs: uris.</description>
</property>
----------------------------------
. vim conf/hdfs-site.xml, add the following contents:
----------------------------------
<property>
  <name>dfs.http.address0</name>
  <value>HOSTNAME_Node1:50070</value>
  <description>The address and the base port where the dfs namenode
  web ui will listen on. If the port is 0 then the server will start
  on a free port.</description>
</property>
<property>
  <name>dfs.http.address1</name>
  <value>HOSTNAME_Node2:50070</value>
  <description>The address and the base port where the dfs namenode
  web ui will listen on. If the port is 0 then the server will start
  on a free port.</description>
</property>
<property>
  <name>dfs.name.dir.shared0</name>
  <value>/mnt/nfs/dfs/shared0</value>
  <description>Determines where on the shared filesystem the DFS name
  node should store the name table (fsimage). If this is a
  comma-delimited list of directories then the name table is
  replicated in all of the directories, for redundancy.</description>
</property>
<property>
  <name>dfs.name.dir.shared1</name>
  <value>/mnt/nfs/dfs/shared1</value>
  <description>Determines where on the shared filesystem the DFS name
  node should store the name table (fsimage). If this is a
  comma-delimited list of directories then the name table is
  replicated in all of the directories, for redundancy.</description>
</property>
<property>
  <name>dfs.name.edits.dir.shared0</name>
  <value>/mnt/nfs/dfs/edits0</value>
  <description>Determines where on the shared filesystem the DFS name
  node should store the transaction (edits) log. If this is a
  comma-delimited list of directories then the edits log is
  replicated in all of the directories, for redundancy.</description>
</property>
<property>
  <name>dfs.name.edits.dir.shared1</name>
  <value>/mnt/nfs/dfs/edits1</value>
  <description>Determines where on the shared filesystem the DFS name
  node should store the transaction (edits) log. If this is a
  comma-delimited list of directories then the edits log is
  replicated in all of the directories, for redundancy.</description>
</property>
----------------------------------
4. Start/Stop HDFS with NameNode HA
- Start the HDFS
. start zookeeper: bin/zkServer.sh start
. If this is the first time HDFS is started, run the following commands; otherwise skip this step.
> mkdir -p /mnt/nfs/dfs/edits0 /mnt/nfs/dfs/edits1 /mnt/nfs/dfs/shared0 /mnt/nfs/dfs/shared1
> bin/hadoop namenode -format
. bin/start-avatar.sh
- Stop HDFS
. bin/stop-avatar.sh
- Monitor the status of HDFS
. Web on primary avatarnode: http://Node1:50070/
. Web on standby avatarnode: http://Node2:50070/
. Open the ZooKeeper console with bin/zkCli.sh, then check the znodes '/activeNameNode' and '/standbyNameNode' (example below)
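For example, from the ZooKeeper CLI (assuming the default client port 2181; the znode names are those given above):
----------------------------------
bin/zkCli.sh -server IP_OF_Node3:2181
ls /                  # list znodes; the HA entries should be present
get /activeNameNode   # should show the current primary avatarnode
get /standbyNameNode  # should show the current standby avatarnode
----------------------------------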
5. Config Hadoop JobTracker HA
- grant 'sudo' to the user that runs Hadoop; in this example the user 'bigcloud' starts the Hadoop system (a non-interactive sketch follows)
. chmod u+w /etc/sudoers
. edit the file /etc/sudoers:
> comment out 'Defaults requiretty', i.e. change it to '#Defaults requiretty'
> grant bigcloud passwordless sudo: 'bigcloud ALL=NOPASSWD:ALL'
. chmod u-w /etc/sudoers
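A non-interactive sketch of the same sudoers edit (run as root; 'visudo -c' is used only to validate the syntax afterwards):
----------------------------------
chmod u+w /etc/sudoers
sed -i 's/^Defaults[[:space:]]*requiretty/#Defaults requiretty/' /etc/sudoers
echo 'bigcloud ALL=NOPASSWD:ALL' >> /etc/sudoers
visudo -c        # sanity-check the file before closing this shell
chmod u-w /etc/sudoers
----------------------------------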
- add zookeeper jar to hadoop lib: cp zookeeper-3.4.3.jar lib/
- vim conf/mapred-site.xml, add the following contents
----------------------------------
<property>
  <name>mapred.job.tracker</name>
  <value>SHARED_VIRTUAL_IP:9001</value>
</property>
<property>
  <name>mapred.zookeeper.quorum</name>
  <value>IP_OF_Node3</value>
</property>
<property>
  <name>mapred.jobtracker.restart.recover</name>
  <value>true</value>
  <description>"true" to enable (job) recovery upon restart, "false"
  to start afresh. The default is false; set it to true for
  JobTracker HA.</description>
</property>
<property>
  <name>mapred.jobtracker.job.history.block.size</name>
  <value>524288</value>
  <description>The block size of the job history file. Since job
  recovery uses the job history, it is important to dump the job
  history to disk as soon as possible. Note that this is an
  expert-level parameter. The default value is 3 MB; we set it to
  512 KB for JobTracker HA.</description>
</property>
----------------------------------
SHARED_VIRTUAL_IP is a virtual IP shared among the JobTrackers; the primary JobTracker takes over this virtual IP.
6. Start/Stop MapReduce with JobTracker HA
- make sure HDFS and ZooKeeper are running correctly.
- start mapreduce
. make sure the virtual IP is not assigned to any node. Run 'ip addr show | grep eth0' to check, or 'sudo -S /sbin/ip addr del SHARED_VIRTUAL_IP/24 dev eth0' to remove a stale VIP assignment.
. start mapreduce on the primary jobtracker: start-mapred.sh
. start a backup jobtracker on each of the other nodes: nohup bin/hadoop jobtracker >> logs/hadoop-bcmr-jobtracker-`hostname`.log &
- stop mapreduce
. kill jobtracker processes on backup nodes
. stop mapreduce on primary jobtracker: stop-mapred.sh
- Monitor the status of mapreduce
. Web on primary jobtracker: http://SHARED_VIRTUAL_IP:50030/
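A rough failover check for JobTracker HA (illustrative; 'jps' ships with the JDK, and the awk pattern assumes its default output):
----------------------------------
# On the node that currently holds the VIP:
ip addr show eth0 | grep SHARED_VIRTUAL_IP    # confirm the VIP is here
kill $(jps | awk '/JobTracker/ {print $1}')   # simulate a primary failure
# From another node, after the takeover delay:
ping -c 3 SHARED_VIRTUAL_IP                   # the standby should answer
# http://SHARED_VIRTUAL_IP:50030/ should now be served by the new primary.
----------------------------------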