An Ambari Stack for HDP IoT Demo

Ambari stack for easily installing and managing the HDP IoT Demo, which shows real-time monitoring, alerts, and predictions of driving violations generated by a fleet of trucks. Also includes optional steps to set up Ranger audits to Solr and a SILK dashboard to visualize the audits.

Videos on the demo itself are available here:

Pre-reqs:

  • The service currently requires that it be installed on the Ambari server node, and that Kafka and ZooKeeper are also running on that node.
  • HBase and Storm must be available on the cluster and started.
  • Falcon must be stopped before installing this service.

Previous versions:

  • For the HDP 2.2 version of these steps, see here
Setup steps
  • Download the HDP 2.3 sandbox VM image (Sandbox_HDP_2.3_VMWare.ova) from the Hortonworks website
  • Import Sandbox_HDP_2.3_VMWare.ova into VMWare and set the VM memory size to at least 8GB (preferably 10GB) RAM, with at least 4 CPUs allocated.
  • Now start the VM
  • After it boots up, find the IP address of the VM and add an entry to your machine's hosts file, e.g.
192.168.191.241 sandbox.hortonworks.com sandbox    
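The hosts entry above can be added by hand, or scripted so it is safe to re-run. A minimal sketch is below; it defaults to a temp file so it can be tried anywhere, and the IP is an example only — for real use, point HOSTS_FILE at /etc/hosts (needs root) and use your VM's actual IP:

```shell
# Append the sandbox entry only if it is not already present (idempotent)
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"
ENTRY="192.168.191.241 sandbox.hortonworks.com sandbox"
grep -q "sandbox.hortonworks.com" "$HOSTS_FILE" || echo "$ENTRY" >> "$HOSTS_FILE"
```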
  • Connect to the VM via SSH (password: hadoop) and restart the Ambari server
  • Make sure Storm, HBase, Kafka, and Hive are up, Falcon is down, and all of these services are out of maintenance mode. You can SSH into the Ambari server node and run the below as a shortcut.
    • Before proceeding, you may want to wait a few minutes to ensure the services stay up reliably, or the demo setup may fail. If they do not, you may need to increase the memory/CPUs allocated to the VM.
#Ambari password
export PASSWORD=admin
#Ambari host
export AMBARI_HOST=localhost

#detect name of cluster
output=`curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari'  http://$AMBARI_HOST:8080/api/v1/clusters`
CLUSTER=`echo $output | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p'`

#make sure kafka, storm, falcon are out of maintenance mode
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Remove Falcon from maintenance mode"}, "Body": {"ServiceInfo": {"maintenance_state": "OFF"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/FALCON
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Remove Kafka from maintenance mode"}, "Body": {"ServiceInfo": {"maintenance_state": "OFF"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/KAFKA
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Remove Storm from maintenance mode"}, "Body": {"ServiceInfo": {"maintenance_state": "OFF"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/STORM
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Remove Hbase from maintenance mode"}, "Body": {"ServiceInfo": {"maintenance_state": "OFF"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/HBASE
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Remove Hive from maintenance mode"}, "Body": {"ServiceInfo": {"maintenance_state": "OFF"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/HIVE


#Start Kafka, Storm, HBase, Hive
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start KAFKA via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/KAFKA
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start STORM via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/STORM
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start HBASE via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/HBASE
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start HIVE via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/HIVE


#stop Falcon
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop FALCON via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/FALCON
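The cluster-name detection at the top of the script relies on a sed capture over the JSON returned by the Ambari clusters endpoint. It can be sanity-checked offline against a sample response (the JSON below is a made-up example in the same shape):

```shell
# Canned response standing in for: curl .../api/v1/clusters
sample='{ "items" : [ { "Clusters" : { "cluster_name" : "Sandbox" } } ] }'

# Same sed capture as used in the script above
CLUSTER=$(echo "$sample" | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p')
echo "$CLUSTER"
```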

  • (Optional): Set up a YARN queue for Spark using the steps here

  • (Optional): Set up Solr, Banana, and a 'Ranger Audits' dashboard using HDP search (Solr 5.2)

cd
wget https://github.com/abajwa-hw/security-workshops/raw/master/scripts/setup_solr_banana.sh
chmod +x setup_solr_banana.sh

#change <arguments> below
./setup_solr_banana.sh <arguments>
- argument options:
  - if no arguments are passed, the FQDN will be used as the hostname to set up the dashboard/view (use this if you have created a local hosts entry for the host where Solr will run, e.g. sandbox.hortonworks.com)
  - if "publicip" is passed, the public IP address will be used as the hostname to set up the dashboard/view (use this on cloud environments)
  - otherwise, the passed-in value will be assumed to be the hostname to set up the dashboard/view

- The Solr UI should be available at http://(your hostname):6083/solr/#/ranger_audits e.g. http://sandbox.hortonworks.com:6083/solr/#/ranger_audits
- An empty Banana dashboard should be available at http://(your hostname):6083/banana e.g. http://sandbox.hortonworks.com:6083/banana.
  • (Optional): Set up Ranger using the steps here:

    • Set up the MySQL pre-requisites using the steps here
    • Set up Ranger using the steps here. You can ignore the LDAP-specific configs
  • (Optional): set up the HBase Ranger plugin to audit to Solr

    • On 2.3 Sandbox:
cd /usr/hdp/2.*/ranger-hbase-plugin/
vi /usr/hdp/2.*/ranger-hbase-plugin/install.properties
#set the below properties in install.properties:
XAAUDIT.SOLR.IS_ENABLED=true
XAAUDIT.SOLR.SOLR_URL=http://sandbox.hortonworks.com:6083/solr/ranger_audits

./enable-hbase-plugin.sh
  • (Optional): set up the Hive Ranger plugin to audit to Solr

    • On 2.3 Sandbox:
cd /usr/hdp/2.*/ranger-hive-plugin/
vi /usr/hdp/2.*/ranger-hive-plugin/install.properties
#set the below properties in install.properties:
XAAUDIT.SOLR.IS_ENABLED=true
XAAUDIT.SOLR.SOLR_URL=http://sandbox.hortonworks.com:6083/solr/ranger_audits

./enable-hive-plugin.sh
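The same two property edits apply to both the HBase and Hive plugin files, so a non-interactive alternative to vi is a pair of sed substitutions. A minimal sketch on a throwaway file is below; for real use, point FILE at the relevant install.properties under /usr/hdp/2.*/ranger-*-plugin/:

```shell
# Throwaway stand-in for install.properties, with the defaults to be changed
FILE=$(mktemp)
printf 'XAAUDIT.SOLR.IS_ENABLED=false\nXAAUDIT.SOLR.SOLR_URL=NONE\n' > "$FILE"

# Flip the audit flag and point the URL at the Solr ranger_audits collection
sed -i 's|^XAAUDIT.SOLR.IS_ENABLED=.*|XAAUDIT.SOLR.IS_ENABLED=true|' "$FILE"
sed -i 's|^XAAUDIT.SOLR.SOLR_URL=.*|XAAUDIT.SOLR.SOLR_URL=http://sandbox.hortonworks.com:6083/solr/ranger_audits|' "$FILE"
cat "$FILE"
```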
  • Now restart HBase and Hive to register the plugins.

  • Deploy the IoTDemo service, as well as the Apache Zeppelin service to visualize/analyze the violation events generated, via a prebuilt notebook

VERSION=`hdp-select status hadoop-client | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/'`
sudo git clone https://github.com/abajwa-hw/iotdemo-service.git   /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/IOTDEMO   
sudo git clone https://github.com/hortonworks-gallery/ambari-zeppelin-service.git   /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/ZEPPELIN   

#on sandbox
sudo service ambari restart

#on non sandbox
sudo service ambari-server restart
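The VERSION detection used for the git clone paths strips everything after the major.minor version from the hdp-select output. It can be checked against a sample line (the build number below is a made-up example):

```shell
# Canned output standing in for: hdp-select status hadoop-client
sample='hadoop-client - 2.3.0.0-2557'

# Same sed expression as used above; keeps only the major.minor version
VERSION=$(echo "$sample" | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/')
echo "$VERSION"
```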
  • Then you can click on 'Add Service' from the 'Actions' dropdown menu in the bottom left of the Ambari dashboard:

On bottom left -> Actions -> Add Service -> check both 'IoT Demo' and 'Zeppelin' -> Next -> Next -> Configure service -> Next -> Deploy

Things to remember while configuring the service

  • The service currently requires that it be installed on the Ambari server node, and that Kafka and ZooKeeper are also running on that node.

    • If Kafka is on a different node, the demo may still work (untested) if you manually create the topics ahead of time:
    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --zookeeper $ZK_HOST:2181 --replication-factor 1 --partitions 2 --topic truck_events
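If you do pre-create the topic, a guard against re-creating an existing one can be sketched as below. The existence check is shown against a canned topic list; in real use the list would come from kafka-topics.sh --list --zookeeper $ZK_HOST:2181:

```shell
# Canned output standing in for: kafka-topics.sh --list --zookeeper $ZK_HOST:2181
existing='truck_events
some_other_topic'

# Only create the topic when it is not already in the list (grep -qx = exact line match)
RESULT=$(echo "$existing" | grep -qx 'truck_events' && echo "exists" || echo "create")
echo "$RESULT"
```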
    
  • Under "Advanced demo-config"

    • enter your GitHub credentials to allow the service to access the IoT demo artifacts
    • enter the public name/IP of the IoTDemo node: this is used to set up the Ambari view. Set it to the public host/IP of the IoTDemo node (which must be reachable from your local machine). If installing on the sandbox (or a local VM), change this to the IP address of the VM. If installing on the cloud, set this to the public name/IP of the IoTDemo node. Alternatively, if you already have a local hosts file entry for the internal hostname of the IoTDemo node (e.g. sandbox.hortonworks.com), you can leave this empty; it will default to the internal hostname
      • you should use the same value for the publicname property in the Zeppelin config as well
  • Under "Advanced user-config":

    • enter your Ambari user/password/port configuration (if not using the defaults). These will be used to check that the required services are up
  • The IoT demo configs are available under "Advanced demo-env" but do not require updating, as all required configs will be auto-populated:

    • Ambari host
    • Name node host/port
    • Nimbus host
    • Hive metastore host/port
    • Supervisor host
    • HBase master host
    • Kafka host/port (also where ActiveMQ will be installed)


  • On successful deployment, you will see the IOTDEMO service as part of the Ambari stack and will be able to start/stop the service from here: Image

  • You can see the parameters you configured under 'Configs' tab Image

  • One benefit of wrapping the component in an Ambari service is that you can now monitor/manage the service remotely via the REST API

export SERVICE=IOTDEMO
export PASSWORD=admin
export AMBARI_HOST=localhost

#detect name of cluster
output=`curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari'  http://$AMBARI_HOST:8080/api/v1/clusters`
CLUSTER=`echo $output | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p'`

#get service status
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X GET http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#start service
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Start $SERVICE via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#stop service
curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop $SERVICE via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
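A small helper on top of the status call above is to extract the service state from the GET response and act on it. The parsing is shown here against a canned response so it can be checked without a cluster; in real use the input would be the body of the GET /services/$SERVICE call:

```shell
# Extract the "state" value from an Ambari service JSON response
get_state() {
  echo "$1" | sed -n 's/.*"state" : "\([A-Z_]*\)".*/\1/p'
}

# Canned response standing in for the GET /services/$SERVICE body
sample='{ "ServiceInfo" : { "state" : "STARTED" } }'
STATE=$(get_state "$sample")
echo "$STATE"
```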

Install the view

  • If the IoTDemo service was not deployed on the same node as the Ambari server:

    • Transfer the compiled view JAR from /root/iotdemo-view/target/*.jar on the IoTDemo node to /var/lib/ambari-server/resources/views on the Ambari server node
    • Follow the steps below to stop the IoTDemo service, restart Ambari, and start the IoTDemo service:
  • Otherwise, if the IoTDemo service was deployed on the same node as the Ambari server:

    • Stop the IoTDemo service from Ambari
    • Restart ambari
#sandbox
service ambari restart
#non sandbox
service ambari-server restart
  • Start the IoTDemo service

Access webapp


  • Check Storm view for metrics Image

  • Alternatively check Storm UI for metrics Image

Visualize events using Zeppelin

HDP Search

Authorization/Audit via Ranger

Access Ranger audits dashboard

  • If installed, open the Silk Ranger Audits dashboard via Ambari view or http://sandbox.hortonworks.com:6083/banana

  • By default you will see a visualization of HBase reads/gets: Image Image

  • Change the query filter to search for writes/puts: Image

  • Now open Hive view and query the truck_events_text_partition table: Image

  • On the Ranger audits dashboard, query for Hive audits: Image

  • Now disable the global allow policy on HBase and Hive and wait 30s: Image Image

  • Try running the same query in Hive view

  • At this point, you should see some HBase audit records with result=0 Image Image

  • Confirm the same by opening the Audit tab of Ranger: http://sandbox.hortonworks.com:6080


  • Re-enable the global allow policies.

Troubleshooting

  • Components going down? Increase the VM memory/cores and restart

  • Storm Nimbus is not coming up?

    • or you are getting the error: java.lang.RuntimeException: Could not find leader nimbus from seed hosts [sandbox.hortonworks.com]. Did you specify a valid list of nimbus hosts for config nimbus.seeds

    • Solution: stop Storm and run the below script to clean up old data before starting Storm back up

setup/bin/cleanupstormdirs.sh

/usr/hdp/current/zookeeper-client/bin/zkCli.sh
rmr /storm
quit
  • Other issues? Try resetting the demo and restarting
setup/bin/cleanup.sh

Remove the service

  • To remove the IOTDEMO service:
    • Stop the service via Ambari

    • Delete the service

export SERVICE=IOTDEMO
export PASSWORD=admin
export AMBARI_HOST=localhost

#detect name of cluster
output=`curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari'  http://$AMBARI_HOST:8080/api/v1/clusters`
CLUSTER=`echo $output | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p'`

curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X DELETE http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#if the above errors out, run the below first to fully stop the service
#curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context" :"Stop $SERVICE via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
  • Remove artifacts

rm -rf /root/sedev
rm -rf /root/iot*
rm -rf /root/scala
rm -rf /root/maven
rm -rf /var/log/iotdemo.log
rm -rf /var/lib/ambari-server/resources/views/iot*
VERSION=`hdp-select status hadoop-client | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/'`
rm -rf /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/IOTDEMO

service ambari-server restart