Manual

GRID COMPUTING LAB

Globus Toolkit

Introduction

  • The open source Globus Toolkit is a fundamental enabling technology for the "Grid," letting people share computing power, databases, and other tools securely online across corporate, institutional, and geographic boundaries without sacrificing local autonomy.
  • The toolkit includes software services and libraries for resource monitoring, discovery, and management, plus security and file management.
  • In addition to being a central part of science and engineering projects that total nearly a half-billion dollars internationally, the Globus Toolkit is a substrate on which leading IT companies are building significant commercial Grid products.

Installation Procedure for Globus Toolkit

Mandatory prerequisite:

Linux 64 bit Operating System

Download the following software:
1.      apache-tomcat-7.0.67.tar.gz
2.      apache-ant-1.9.6-bin.tar.gz
3.      junit3.8.1.zip
4.      jdk-8u60-linux-x64.gz

Copy downloads to /usr/local and type the following commands

  1. cp /home/stack/downloads/* /usr/local

  2. pwd

  3. tar zxvf jdk-8u60-linux-x64.gz
            cd jdk1.8.0_60/
            pwd
            export JAVA_HOME=/usr/local/grid/SOFTWARE/jdk1.8.0_60
            cd ..

  4. tar zxvf apache-ant-1.9.6-bin.tar.gz
            pwd
            export ANT_HOME=/usr/local/grid/SOFTWARE/apache-ant-1.9.6
            cd ..

  5. tar zxvf apache-tomcat-7.0.67.tar.gz
            cd apache-tomcat-7.0.67/
            pwd
            export CATALINA_HOME=/usr/local/grid/SOFTWARE/apache-tomcat-7.0.67
            cd ..

  6. unzip junit3.8.1.zip
            cd junit3.8.1
            pwd
            export JUNIT_HOME=/usr/local/grid/SOFTWARE/junit3.8.1
            cd ..
            pwd

  7. dpkg -i globus-toolkit-repo_latest_all.deb

  8. apt-get update

  9. apt-get install globus-data-management-client
            apt-get install globus-gridftp
            apt-get install globus-gram5
            apt-get install globus-gsi
            apt-get install myproxy
            apt-get install myproxy-server
            apt-get install myproxy-admin

  10. grid-cert-info -subject

  11. grid-mapfile-add-entry {-------output of grid-cert-info -subject-----} gtuser
            grid-proxy-init -verify -debug

  12. service globus-gridftp-server start
            service globus-gridftp-server status

  13. myproxy-logon -s {name}

  14. service globus-gatekeeper start
            service globus-gatekeeper status

  15. globus-job-run name /bin/hostname
Ex. No:1
Develop new web service for calculator

Aim:
      To develop a new web service for calculator using the Globus Toolkit
                 
Procedure :

When you start the Globus Toolkit container, a number of services start up. The service for this task is a simple Math service that can perform basic arithmetic for a client.
The Math service will access a resource with two properties:

1. An integer value that can be operated upon by the service
2. A string value that holds a description of the last operation

The service itself will have three remotely accessible operations that operate upon value:
(a) add, which adds a to the resource property value.
(b) subtract, which subtracts a from the resource property value.
(c) getValueRP, which returns the current value of value.

Usually, the best way to begin any programming task is with an overall description of what you want the code to do, which in this case is the service interface. The service interface describes what the service provides in terms of the names of operations, their arguments, and return values. A Java interface for our service is:

public interface Math {
    public void add(int a);
    public void subtract(int a);
    public int getValueRP();
}
It is possible to start with this interface and create the necessary WSDL file using the standard Web service tool called Java2WSDL. However, the WSDL file for GT 4 has to include details of resource properties that are not given explicitly in the interface above.

Hence, we will provide the WSDL file.

Step 1 – Getting the Files

All the required files are provided and come directly from [1]. The MathService source code files can be found at http://www.gt4book.com (http://www.gt4book.com/downloads/gt4book-examples.tar.gz).

A Windows zip compressed version can be found at http://www.cs.uncc.edu/~abw/ITCS4146S07/gt4book-examples.zip. Download and uncompress the file into a directory called GT4services. Everything is included (the Java source, WSDL and deployment files, etc.):

WSDL service interface description file -- The WSDL service interface description file is provided within the GT4services folder at GT4Services\schema\examples\MathService_instance\Math.wsdl. This file, and a discussion of its contents, can be found in Appendix A. Later on we will need to modify this file, but first we will use the existing contents that describe the Math service above.
Service code in Java -- For this assignment, both the code for the service operations and the code for the resource properties are put in the same class for convenience. More complex services and resources would be defined in separate classes.

The Java code for the service and its resource properties is located within the GT4services folder at:
GT4services\org\globus\examples\services\core\first\impl\MathService.java.
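
The broad shape of that file is sketched below for reference. This is an illustration only: the namespace URI and QNames are assumptions (take the real ones from Math.wsdl), and the operation signatures in the downloaded file use the request/response types generated from the WSDL rather than plain int arguments.

package org.globus.examples.services.core.first.impl;

import java.rmi.RemoteException;

import javax.xml.namespace.QName;

import org.globus.wsrf.Resource;
import org.globus.wsrf.ResourceProperties;
import org.globus.wsrf.ResourceProperty;
import org.globus.wsrf.ResourcePropertySet;
import org.globus.wsrf.impl.ReflectionResourceProperty;
import org.globus.wsrf.impl.SimpleResourcePropertySet;

public class MathService implements Resource, ResourceProperties {

    // Namespace of the resource properties -- an assumption; it must match Math.wsdl
    private static final String NS =
        "http://www.globus.org/namespaces/examples/core/MathService_instance";
    private static final QName RP_SET = new QName(NS, "MathResourceProperties");
    private static final QName RP_VALUE = new QName(NS, "Value");
    private static final QName RP_LASTOP = new QName(NS, "LastOp");

    private ResourcePropertySet propSet;

    // The two resource properties
    private int value;
    private String lastOp;

    public MathService() throws RemoteException {
        this.propSet = new SimpleResourcePropertySet(RP_SET);
        try {
            // Expose value and lastOp as WSRF resource properties via their getters/setters
            ResourceProperty valueRP = new ReflectionResourceProperty(RP_VALUE, "Value", this);
            this.propSet.add(valueRP);
            ResourceProperty lastOpRP = new ReflectionResourceProperty(RP_LASTOP, "LastOp", this);
            this.propSet.add(lastOpRP);
        } catch (Exception e) {
            throw new RemoteException(e.getMessage());
        }
    }

    // Getters and setters used by ReflectionResourceProperty
    public int getValue() { return value; }
    public void setValue(int value) { this.value = value; }
    public String getLastOp() { return lastOp; }
    public void setLastOp(String lastOp) { this.lastOp = lastOp; }

    // Remotely accessible operations
    public void add(int a) throws RemoteException {
        value += a;
        lastOp = "ADDITION";
    }

    public void subtract(int a) throws RemoteException {
        value -= a;
        lastOp = "SUBTRACTION";
    }

    public int getValueRP() throws RemoteException {
        return value;
    }

    // Required by the ResourceProperties interface
    public ResourcePropertySet getResourcePropertySet() {
        return this.propSet;
    }
}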

 Deployment Descriptor -- The deployment descriptor gives several different important sets of information about the service once it is deployed. It is located within the GT4services folder at: GT4services\org\globus\examples\services\core\first\deploy-server.wsdd.

Step 2 – Building the Math Service

It is now necessary to package all the required files into a GAR (Grid Archive) file.

The build tool Ant from the Apache Software Foundation is used to achieve this, as shown below:


Generating a GAR file with Ant (from http://gdp.globus.org/gt4-tutorial/multiplehtml/ch03s04.html): Ant is similar in concept to the Unix make tool, but it is a Java tool and XML based. Build scripts are provided by Globus 4 for use with the Ant build file. The Windows version of the build script for MathService is the Python file called globus-build-service.py, which is held in the GT4services directory. The build script takes one argument, the name of the service that you want to deploy. To keep with the naming convention in [1], this service will be called first.

In the Client Window, run the build script from the GT4services directory with:

globus-build-service.py first

The output should look similar to the following:

Buildfile: build.xml

BUILD SUCCESSFUL
Total time: 8 seconds

During the build process, a new directory is created in your GT4Services directory that is named build. All of your stubs and class files that were generated will be in that directory and its subdirectories.
More importantly, there is a GAR (Grid Archive) file called org_globus_examples_services_core_first.gar. The GAR file is the package that contains every file that is needed to successfully deploy your Math Service into the Globus container.
The files contained in the GAR file are the Java class files, WSDL, compiled stubs, and the deployment descriptor.

Step 3 – Deploying the Math Service

If the container is still running in the Container Window, then stop it using Control-C.

To deploy the Math Service, you will use a tool provided by the Globus Toolkit called globus-deploy-gar. In the Container Window, issue the command:

globus-deploy-gar org_globus_examples_services_core_first.gar

Successful output of the command is:

The service has now been deployed.

Check that the service is deployed by starting the container from the Container Window:


You should see the service called MathService.
Step 4 – Compiling the Client A client has already been provided to test the Math Service and is located in the GT4Services directory.

Step 5 – Start the Container for your Service

If the container is not running, restart the Globus container from the Container Window with:

globus-start-container -nosec

Step 6 – Run the Client

To start the client from your GT4Services directory, do the following in the Client Window, which passes the GSH of the service as an argument:

java -classpath build\classes\org\globus\examples\services\core\first\impl\;%CLASSPATH% org.globus.examples.clients.MathService_instance.Client http://localhost:8080/wsrf/services/examples/core/first/MathService

which should give the output:

Current value: 15
Current value: 10
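
For reference, the body of that client essentially turns the GSH passed on the command line into an endpoint reference, obtains a port type stub, and calls the operations of the Math interface shown earlier. The locator and stub class names below (MathServiceAddressingLocator, MathPortType) are assumptions based on the stubs generated from Math.wsdl during the build; check the build directory for the exact names used in your copy.

package org.globus.examples.clients.MathService_instance;

import org.apache.axis.message.addressing.Address;
import org.apache.axis.message.addressing.EndpointReferenceType;

import org.globus.examples.stubs.MathService_instance.MathPortType;
import org.globus.examples.stubs.MathService_instance.service.MathServiceAddressingLocator;

public class Client {

    public static void main(String[] args) throws Exception {
        // GSH of the service, passed as the command-line argument
        String serviceURI = args[0];

        // Build an endpoint reference for the service and obtain the port type stub
        MathServiceAddressingLocator locator = new MathServiceAddressingLocator();
        EndpointReferenceType endpoint = new EndpointReferenceType();
        endpoint.setAddress(new Address(serviceURI));
        MathPortType math = locator.getMathPortTypePort(endpoint);

        // 0 + 10 + 5 = 15, then 15 - 5 = 10, matching the expected output above
        math.add(10);
        math.add(5);
        System.out.println("Current value: " + math.getValueRP());
        math.subtract(5);
        System.out.println("Current value: " + math.getValueRP());
    }
}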

Step 7 – Undeploy the Math Service and Kill a Container

Before we can add functionality to the Math Service (Section 5), we must undeploy the service.
In the Container Window, kill the container with a Control-C.
Then, to undeploy the service, type in the following command:

globus-undeploy-gar org_globus_examples_services_core_first

which should result in the following output:

Undeploying gar...
Deleting /.

Undeploy successful


Ex. No:2
Developing New Grid Service

Aim:
      To develop a new Grid service
                 
Procedure :

  1. Setting up Eclipse, GT4, Tomcat, and the other necessary plug-ins and tools
  2. Creating and configuring the Eclipse project in preparation for the source files
  3. Adding the source files (and reviewing their major features)
  4. Creating the build/deploy Launch Configuration that orchestrates the automatic generation of the remaining artifacts, assembling the GAR, and deploying the grid service into the Web services container
  5. Using the Launch Configuration to generate and deploy the grid service
  6. Running and debugging the grid service in the Tomcat container    
  7. Executing the test client
  8. To test the client, simply right-click the Client.java file and select Run > Run... from the pop-up menu (see Figure 27).
  9. In the Run dialog that is displayed, select the Arguments tab and enter http://127.0.0.1:8080/wsrf/services/examples/ProvisionDirService in the Program Arguments: textbox.
  10. Run the client application by simply right-clicking the Client.java file and selecting Run > Java Application.

Output
Run Java Application

Ex. No:3
Develop applications using Java - Grid APIs

Aim :

      To develop Applications using Java – Grid APIs

Procedure:

1.      Build a server-side SOAP service using Tomcat and Axis
2.      Create connection stubs to support client-side use of the SOAP service
3.      Build a custom client-side ClassLoader (a minimal sketch is given after this list)
4.      Build the main client application
5.      Build a trivial compute task designed to exercise the client ClassLoader
6.      Test the grid computing framework
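
The framework's sources are not reproduced in this manual. As an illustration of step 3, the sketch below shows a client-side ClassLoader that defines classes from byte arrays obtained remotely; fetchClassBytes() is a hypothetical hook standing in for the call to the SOAP service stubs from step 2.

import java.util.HashMap;
import java.util.Map;

public class RemoteClassLoader extends ClassLoader {

    // Cache of already-defined classes so repeated lookups are cheap
    private final Map<String, Class<?>> loaded = new HashMap<String, Class<?>>();

    public RemoteClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        Class<?> cached = loaded.get(name);
        if (cached != null) {
            return cached;
        }
        byte[] bytes = fetchClassBytes(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        // Turn the raw class-file bytes into a Class object in this loader
        Class<?> clazz = defineClass(name, bytes, 0, bytes.length);
        loaded.put(name, clazz);
        return clazz;
    }

    // Hypothetical hook: in the framework this would call the SOAP service
    // (via the generated stubs) and return the class file contents.
    protected byte[] fetchClassBytes(String className) {
        return null; // replace with a call to the grid service stub
    }
}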

Output

Ex. No:4
Develop  secured applications using basic security in Globus
Aim :
     
      To develop secured applications using basic security in Globus

Procedure:

Mandatory prerequisite:

  • Tomcat v4.0.3
  • Axis beta 1
  • Commons Logging v1.0
  • Java CoG Kit v0.9.12
  • Xerces v2.0.1

1. Install Tomcat and deploy Axis on Tomcat

2. Install libraries to provide GSI support for Tomcat

  • Copy cog.jar, cryptix.jar, iaik_ssl.jar, iaik_jce_full.jar, iaik_javax_crypto.jar to Tomcat's common/lib directory.
  • Check that log4j-core.jar and xerces.jar (or other XML parser) are in Tomcat's common/lib directory.
  • Copy gsicatalina.jar to Tomcat's server/lib directory.

3. Deploy GSI support in Tomcat

  • Edit Tomcat's conf/server.xml
            Add GSI Connector in <service> section:
               <!-- Define a GSI HTTP/1.1 Connector on port 8443 
                        Supported parameters include:
                        proxy         // proxy file for server to use
                          or 
                        cert          // server certificate file in PEM format 
                        key           // server key file in PEM format
             
                        cacertdir     // directory location containing trusted CA certs
                        gridMap       // grid map file used for authorization of users
                        debug         // "0" is off and "1" and greater for more info
                -->
                <Connector className="org.apache.catalina.connector.http.HttpConnector"
                           port="8443" minProcessors="5" maxProcessors="75"
                           enableLookups="true" authenticate="true"
                           acceptCount="10" debug="1" scheme="httpg" secure="true">
                  <Factory className="org.globus.tomcat.catalina.net.GSIServerSocketFactory"
                           cert="/etc/grid-security/hostcert.pem"
                           key="/etc/grid-security/hostkey.pem"
                           cacertdir="/etc/grid-security/certificates"
                           gridmap="/etc/grid-security/gridmap-file"
                           debug="1"/> 
                </Connector>
If you are testing under a user account, make sure that the proxy or certificates and keys are readable by Tomcat. For testing purposes you can use user proxies or certificates instead of host certificates e.g.:
    <Connector className="org.apache.catalina.connector.http.HttpConnector"
               port="8443" minProcessors="5" maxProcessors="75"
               enableLookups="true" authenticate="true"
               acceptCount="10" debug="1" scheme="httpg" secure="true">
      <Factory className="org.globus.tomcat.catalina.net.GSIServerSocketFactory"
               proxy="/tmp/x509u_up_neilc"
               debug="1"/> 
    </Connector>
If you do test using user proxies, make sure the proxy has not expired!
            Add a GSI Valve in the <engine> section:
                <Valve className="org.globus.tomcat.catalina.valves.CertificatesValve"
                       debug="1" />
             

4. Install libraries to provide GSI support for Axis

  • Copy gsiaxis.jar to the WEB-INF/lib directory of your Axis installation under Tomcat.

5. Set your CLASSPATH correctly

  • You should ensure that the following jars from the axis/lib directory are in your classpath:
    • axis.jar
    • clutil.jar
    • commons-logging.jar
    • jaxrpc.jar
    • log4j-core.jar
    • tt-bytecode.jar
    • wsdl4j.jar
  • You should also have these jars in your classpath:
    • gsiaxis.jar
    • cog.jar
    • xerces.jar (or other XML parser)

6. Start the GSI enabled Tomcat/Axis server

  • Start up Tomcat as normal
Check the logs in Tomcat's logs/ directory to ensure the server started correctly. In particular check that:
  • apache_log.YYYY-MM-DD.txt does not contain any GSI related error messages
  • catalina.out contains messages saying "Welcome to the IAIK ... Library"
  • catalina_log.YYYY-MM-DD.txt contains messages saying "HttpConnector[8443] Starting background thread" and "HttpProcessor[8443][N] Starting background thread"
  • localhost_log.YYYY-MM-DD.txt contains a message saying "WebappLoader[/axis]: Deploy JAR /WEB-INF/lib/gsiaxis.jar"

7. Writing a GSI enabled Web Service

7.1. Implementing the service

The extensions made to Tomcat allow us to receive credentials through a transport-level security mechanism. Tomcat exposes these credentials, and Axis makes them available as part of the MessageContext.

Alpha 3 version

Let's assume we already have a web service called MyService with a single method, myMethod. When a SOAP message request comes in over the GSI httpg transport, the Axis RPC despatcher will look for the same method, but with an additional parameter: the MessageContext. So we can write a new myMethod which takes an additional argument, the MessageContext.
This can be illustrated in the following example:
package org.globus.example;
 
import org.apache.axis.MessageContext;
import org.globus.axis.util.Util;
 
public class MyService {
 
   // The "normal" method
   public String myMethod(String arg) {
      System.out.println("MyService: http request\n");
      System.out.println("MyService: you sent " + arg);
      return "Hello Web Services World!";
   }
 
   // Add a MessageContext argument to the normal method
   public String myMethod(MessageContext ctx, String arg) {
      System.out.println("MyService: httpg request\n");
      System.out.println("MyService: you sent " + arg);
      System.out.println("GOT PROXY: " + Util.getCredentials(ctx));
      return "Hello Web Services World!";
   }
 
}

Beta 1 version

In the Beta 1 version, you don't even need to write a different method. Instead, the MessageContext is put in thread-local store. It can be retrieved by calling MessageContext.getCurrentContext():
package org.globus.example;
 
import org.apache.axis.MessageContext;
import org.globus.axis.util.Util;
 
public class MyService {
 
   // Beta 1 version
   public String myMethod(String arg) {
      System.out.println("MyService: httpg request\n");
      System.out.println("MyService: you sent " + arg);
 
      // Retrieve the context from thread local
      MessageContext ctx = MessageContext.getCurrentContext();
      System.out.println("GOT PROXY: " + Util.getCredentials(ctx));
      return "Hello Web Services World!";
   }
 
}
Part of the code provided by ANL in gsiaxis.jar is a utility package which includes the getCredentials() method. This allows the service to extract the proxy credentials from the MessageContext.

7.2. Deploying the service

Before the service can be used it must be made available. This is done by deploying the service. This can be done in a number of ways:
  1. Use the Axis AdminClient to deploy the MyService classes.
  2. Add the following entry to the server-config.wsdd file in the WEB-INF directory of Axis on Tomcat:

        <service name="MyService" provider="java:RPC">
         <parameter name="methodName" value="*"/>
         <parameter name="className" value="org.globus.example.MyService"/>
        </service>

8. Writing a GSI enabled Web Service client

As in the previous example, this is very similar to writing a normal web services client. There are some additions required to use the new GSI over SSL transport:
  • Deploy a httpg transport chain
  • Use the Java CoG kit to load a Globus proxy
  • Use setProperty() to set GSI specifics in the Axis "Property Bag":
    • globus credentials (the proxy certificate)
    • authorisation type
    • GSI mode (SSL, no delegation, full delegation, limited delegation)
  • Continue with the normal Axis SOAP service invocation:
    • Set the target address for the service
    • Provide the name of the method to be invoked
    • Pass on any parameters required
    • Set the type of the returned value
    • Invoke the service
You can invoke this client by running:
java org.globus.example.Client -l httpg://127.0.0.1:8443/axis/servlet/AxisServlet "Hello!"
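
A minimal sketch of such a client is shown below. The Call/Service plumbing is the standard Apache Axis 1.x client API; the GSI-specific steps (registering the httpg transport, loading the proxy with the CoG kit and placing it in the property bag) are left as comments because the exact class and property names differ between gsiaxis/CoG versions, so they should be copied from the examples shipped with your gsiaxis.jar.

package org.globus.example;

import javax.xml.namespace.QName;
import javax.xml.rpc.ParameterMode;

import org.apache.axis.client.Call;
import org.apache.axis.client.Service;
import org.apache.axis.encoding.XMLType;

public class Client {

    public static void main(String[] args) throws Exception {
        String endpoint = args[0];   // e.g. httpg://127.0.0.1:8443/axis/servlet/AxisServlet
        String message  = args[1];   // e.g. "Hello!"

        // Deploy the httpg transport chain so Axis understands httpg:// URLs.
        // (Done through the gsiaxis utility class or a client-config.wsdd entry;
        //  the exact call depends on your gsiaxis version.)

        Service service = new Service();
        Call call = (Call) service.createCall();

        // Load the Globus proxy with the Java CoG kit and set the GSI specifics in
        // the Axis "Property Bag" -- property and class names are version-dependent,
        // roughly of the form:
        //   call.setProperty(<GSI_CREDENTIALS>, <proxy credential>);
        //   call.setProperty(<GSI_AUTHORIZATION>, <authorization instance>);
        //   call.setProperty(<GSI_MODE>, <SSL / no / limited / full delegation>);

        // Continue with the normal Axis SOAP service invocation
        call.setTargetEndpointAddress(endpoint);
        call.setOperationName(new QName("MyService", "myMethod"));
        call.addParameter("arg", XMLType.XSD_STRING, ParameterMode.IN);
        call.setReturnType(XMLType.XSD_STRING);

        String reply = (String) call.invoke(new Object[] { message });
        System.out.println("Service replied: " + reply);
    }
}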








Ex. No:5
Develop a Grid portal

Aim :
     
            To develop a Grid portal where a user can submit a job and get the result, and to implement it with and without the GRAM concept.

Procedure:

1)   Building the GridSphere distribution requires Java 1.5+. You will also need Ant 1.6+, available at http://jakarta.apache.org/ant.
2)   You will also need a Tomcat 5.5.x servlet container available at http://jakarta.apache.org/tomcat. In addition to providing a hosting environment for GridSphere, Tomcat provides some of the required XML (JAR) libraries that are needed for compilation.
3)   Compiling and Deploying
4)   The Ant build script, build.xml, uses the build.properties file to specify any compilation options. Edit build.properties appropriately for your needs.
5)   At this point, simply invoking "ant install" will deploy the GridSphere portlet container to Tomcat using the default database. Please see the User Guide for more details on configuring the database.
6)   The build.xml supports the following basic tasks:
     install -- builds and deploys GridSphere, makes the documentation and installs the database
     clean -- removes the build and dist directories including all the compiled classes
     update -- updates the existing source code from CVS
     compile -- compiles the GridSphere source code
     deploy -- deploys the GridSphere framework and all portlets to a Tomcat servlet container located at $CATALINA_HOME
     create-database - creates a new, fresh database with original GridSphere settings, this wipes out your current database
     docs -- builds the Javadoc documentation from the source code
     To see all the targets invoke "ant --projecthelp".
7)   Startup Tomcat and then go to http://127.0.0.1:8080/gridsphere/gridsphere to see the portal.


CLOUD COMPUTING LAB


Mandatory prerequisite:

Linux 64 bit Operating System

Installing KVM (Hypervisor for Virtualization)

1. Please check if the Virtualization flag is enabled in BIOS
                        Run the command in terminal
                                    egrep -c '(vmx|svm)' /proc/cpuinfo

                        If the result is any value higher than 0, then virtualization is enabled.
                        If the value is 0, then in BIOS enable Virtualization – Consult system administrator
                        for this step.

2. To check if your OS is 64 bit,
                        Run the command in terminal

                        uname -m

                        If the result is x86_64, it means that your Operating system is 64 bit Operating system.

3. A few KVM packages are available with the Linux installation.
                        To check this, run the command,
           
                        ls /lib/modules/{press tab}/kernel/arch/x86/kvm    
                       
                        The three files which are installed in your system will be displayed
                        kvm-amd.ko      kvm-intel.ko           kvm.ko

4. Install the KVM packages
           
             1. Switch to root (Administrator) user
           
                        sudo -i

              2. To install the packages, run the following commands,
             
                        apt-get update
                        apt-get install qemu-kvm
                        apt-get install libvirt-bin
                        apt-get install bridge-utils
                        apt-get install virt-manager
                        apt-get install qemu-system

5. To verify your installation, run the command
                        virsh -c qemu:///system list
it shows output

                        Id                Name                  State
                        -------------------------------------------

If VMs are running, then it shows the name of each VM. If no VM is running, the system shows blank output, which means your KVM installation is correct.

6. Run the command
                        virsh --connect qemu:///system list --all

7. Working with KVM

                        run the command
                        virsh
                        version  (this command displays version of software tools installed)
                        nodeinfo (this command displays your system information)
                        quit (come out of the system)
                       
8. To test the KVM installation we could create virtual machines, but these would have to be set up in manual mode. We skip this and directly install OpenStack.

Installation of Openstack

1. Add a new user named stack – this stack user is the administrator of the OpenStack services.

            To add the new user, run the following command as the root user.

                        adduser stack

2. run the command
                         apt-get install sudo -y || install -y sudo



3. Be careful when running this command – please check the syntax. If there is any error in the following command, the system will crash because of permission errors.
                                                
                        echo "stack ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

4. Logout the system and login as stack user

5. Run the command (this installs the git package).
Please ensure that you are logged in as the non-root user (stack) and not in the /root directory.

                        sudo apt-get install git
6. Run the command (this clones the latest version of devstack, which is the auto-installer package of OpenStack).
                        git clone https://git.openstack.org/openstack-dev/devstack
                        ls (this shows a folder named devstack)
                        cd devstack (enter into the folder)

7. Create a file called local.conf. To do this, run the command:
                        nano local.conf

8. In the file, make the following entries (contact your network administrator if you have doubts about these values):
            [[local|localrc]]
FLOATING_RANGE=192.168.1.224/27
FIXED_RANGE=10.11.11.0/24
FIXED_NETWORK_SIZE=256
FLAT_INTERFACE=eth0
ADMIN_PASSWORD=root
DATABASE_PASSWORD=root
RABBIT_PASSWORD=root
SERVICE_PASSWORD=root
SERVICE_TOKEN=root

9. Save this file
10. Run the command (this installs OpenStack)
            ./stack.sh
11. If any error occurs, then run the command for uninstallation
            ./unstack.sh
            1. Update the packages
                        apt-get update
            2. Then reinstall the package
                        ./stack.sh


12. Open the browser at http://<IP address of your machine>; you will get the OpenStack portal.

13. If you restart the machine, then to start OpenStack again:

            open a terminal,
                        su stack
                        cd devstack
                        run ./rejoin.sh

14. You can again access the OpenStack services in the browser at http://<IP address of your machine>.

Ex. No:1
Procedure to run the virtual machine

Aim :
     
            To find the procedure to run virtual machines of different configurations and to check how many virtual machines can be utilized at a particular time.

Procedure:

This experiment is to be performed through the portal. Log in to the OpenStack portal and, under Instances, create virtual machines.

TO RUN VM
Step 1: Under the Project tab, click Instances. In the right-side screen, click Launch Instance.
Step 2: In the details, give the instance name (e.g. Instance1).
Step 3: Click the Instance Boot Source list and choose 'Boot from image'.
Step 4: Click the Image Name list and choose the image currently uploaded.
Step 5: Click Launch.
Your VM will get created.

Ex. No:2
Procedure to attach virtual block to the virtual machine
Aim :
     
      To find the procedure to attach a virtual block to the virtual machine and check whether it holds the data even after the release of the virtual machine.

Procedure:

  • This experiment is to be performed through the portal. Log in to the OpenStack portal and, under Instances, create virtual machines.
  • In Volumes, create a storage block of the available capacity. Attach / mount the storage block volumes to virtual machines, unmount the volume and reattach it.
  • Volumes are block storage devices that you attach to instances to enable persistent storage. You can attach a volume to a running instance or detach a volume and attach it to another instance at any time. You can also create a snapshot from or delete a volume. Only administrative users can create volume types.

            Create a volume

1.      Log in to the dashboard.
2.      Select the appropriate project from the drop down menu at the top left.
3.      On the Project tab, open the Compute tab and click Volumes category.
4.      Click Create Volume.
In the dialog box that opens, enter or select the following values.
Volume Name: Specify a name for the volume.
Description: Optionally, provide a brief description for the volume.
Volume Source: Select one of the following options:
    • No source, empty volume: Creates an empty volume. An empty volume does not contain a file system or a partition table.
    • Image: If you choose this option, a new field for Use image as a source displays. You can select the image from the list.
    • Volume: If you choose this option, a new field for Use volume as a source displays. You can select the volume from the list. Options to use a snapshot or a volume as the source for a volume are displayed only if there are existing snapshots or volumes.
Type: Leave this field blank.
Size (GB): The size of the volume in gibibytes (GiB).
Availability Zone: Select the Availability Zone from the list. By default, this value is set to the availability zone given by the cloud provider (for example, us-west or apac-south). For some cases, it could be nova.
5.      Click Create Volume.
The dashboard shows the volume on the Volumes tab.

Attach a volume to an instance

After you create one or more volumes, you can attach them to instances. You can attach a volume to one instance at a time.
1.      Log in to the dashboard.
2.      Select the appropriate project from the drop down menu at the top left.
3.      On the Project tab, open the Compute tab and click Volumes category.
4.      Select the volume to add to an instance and click Manage Attachments.
5.      In the Manage Volume Attachments dialog box, select an instance.
6.      Enter the name of the device from which the volume is accessible by the instance.
7.      Click Attach Volume.
The dashboard shows the instance to which the volume is now attached and the device name.
You can view the status of a volume in the Volumes tab of the dashboard. The volume is either Available or In-Use.
Now you can log in to the instance and mount, format, and use the disk.

Detach a volume from an instance

  1. Log in to the dashboard.
  2. Select the appropriate project from the drop down menu at the top left.
  3. On the Project tab, open the Compute tab and click the Volumes category.
  4. Select the volume and click Manage Attachments.
  5. Click Detach Volume and confirm your changes.
            A message indicates whether the action was successful.

 

Create a snapshot from a volume

1.      Log in to the dashboard.
2.      Select the appropriate project from the drop down menu at the top left.
3.      On the Project tab, open the Compute tab and click Volumes category.
4.      Select a volume from which to create a snapshot.
5.      In the Actions column, click Create Snapshot.
6.      In the dialog box that opens, enter a snapshot name and a brief description.
7.      Confirm your changes.
The dashboard shows the new volume snapshot in Volume Snapshots tab.

Edit a volume

1.      Log in to the dashboard.
2.      Select the appropriate project from the drop down menu at the top left.
3.      On the Project tab, open the Compute tab and click Volumes category.
4.      Select the volume that you want to edit.
5.      In the Actions column, click Edit Volume.
6.      In the Edit Volume dialog box, update the name and description of the volume.
7.      Click Edit Volume.

Delete a volume

When you delete an instance, the data in its attached volumes is not deleted.
1.      Log in to the dashboard.
2.      Select the appropriate project from the drop down menu at the top left.
3.      On the Project tab, open the Compute tab and click Volumes category.
4.      Select the check boxes for the volumes that you want to delete.
5.      Click Delete Volumes and confirm your choice.
A message indicates whether the action was successful.

Ex. No:3
Install a C compiler in the virtual machine and execute a sample program.

Aim :
     
      To install a C compiler in the virtual machine and execute a sample program, and to show virtual machine migration based on a certain condition from one node to the other.

Procedure:

1.      Install a C compiler in the virtual machine and execute a sample program. 
Through the OpenStack portal, create a virtual machine. Through the portal, connect to the virtual machine. Log in to the VM and install the C compiler using commands.
Eg : apt-get install gcc

2.      Show the virtual machine migration based on the certain condition from one node to the other.

To demonstrate virtual machine migration, two machines must be configured in one cloud. Take snapshot of running virtual machine and copy the snapshot file to the other destination machine and restore the snapshot. On restoring the snapshot, VM running in source will be migrated to destination machine.
  1. List the VMs you want to migrate, run:
$ nova list
  2. After selecting a VM from the list, run this command where VM_ID is set to the ID in the list returned in the previous step:
$ nova show VM_ID
  3. Use the nova migrate command.
$ nova migrate VM_ID
  4. To migrate an instance and watch the status, use this example script:
#!/bin/bash
# Provide usage
usage() {
    echo "Usage: $0 VM_ID"
    exit 1
}
[[ $# -eq 0 ]] && usage

# Migrate the VM to an alternate hypervisor
echo -n "Migrating instance to alternate host"
VM_ID=$1
nova migrate $VM_ID
VM_OUTPUT=`nova show $VM_ID`
VM_STATUS=`echo "$VM_OUTPUT" | grep status | awk '{print $4}'`
while [[ "$VM_STATUS" != "VERIFY_RESIZE" ]]; do
    echo -n "."
    sleep 2
    VM_OUTPUT=`nova show $VM_ID`
    VM_STATUS=`echo "$VM_OUTPUT" | grep status | awk '{print $4}'`
done
nova resize-confirm $VM_ID
echo " instance migrated and resized."
echo;

# Show the details for the VM
echo "Updated instance details:"
nova show $VM_ID

# Pause to allow users to examine VM details
read -p "Pausing, press <enter> to exit."







Ex. No:4
Procedure to install storage controller and interact with it

Aim :
     
To find the procedure to install a storage controller and interact with it.
Procedure:

The storage controller is installed as the Swift and Cinder components when installing OpenStack. Interaction with the storage is done through the portal.

OpenStack Object Storage (swift) is used for redundant, scalable data storage using clusters of standardized servers to store petabytes of accessible data. It is a long-term storage system for large amounts of static data which can be retrieved and updated.
OpenStack Object Storage provides a distributed, API-accessible storage platform that can be integrated directly into an application or used to store any type of file, including VM images, backups, archives, or media files. In the OpenStack dashboard, you can only manage containers and objects.
In OpenStack Object Storage, containers provide storage for objects in a manner similar to a Windows folder or Linux file directory, though they cannot be nested. An object in OpenStack consists of the file to be stored in the container and any accompanying metadata.

Create a container

  1. Log in to the dashboard.
  2. Select the appropriate project from the drop down menu at the top left.
  3. On the Project tab, open the Object Store tab and click Containers category.
  4. Click Create Container.
  5. In the Create Container dialog box, enter a name for the container, and then click Create Container.
You have successfully created a container.

Upload an object

1.      Log in to the dashboard.
2.      Select the appropriate project from the drop down menu at the top left.
3.      On the Project tab, open the Object Store tab and click Containers category.
4.      Select the container in which you want to store your object.
5.      Click Upload Object.
The Upload Object To Container: <name> dialog box appears. <name> is the name of the container to which you are uploading the object.
6.      Enter a name for the object.
7.      Browse to and select the file that you want to upload.
8.      Click Upload Object.
You have successfully uploaded an object to the container.

Manage an object

To edit an object
1.      Log in to the dashboard.
2.      Select the appropriate project from the drop down menu at the top left.
3.      On the Project tab, open the Object Store tab and click Containers category.
4.      Select the container in which you want to store your object.
5.      Click the menu button and choose Edit from the dropdown list.
The Edit Object dialog box is displayed.
6.      Browse to and select the file that you want to upload.
7.      Click Update Object.

Ex. No:5
Procedure to set up the one node Hadoop cluster


Aim :
     
To find the procedure to set up a one node Hadoop cluster.
Procedure:


            Mandatory prerequisite:

  • Linux 64 bit Operating System
  • Installing Java v1.8

  • Configuring SSH access.

sudo apt-get install vim

1) Installing Java:

Hadoop is a framework written in Java for running applications on large clusters of commodity hardware. Hadoop needs Java 6 or above to work.

Step 1: Download the JDK tar.gz file for Linux 64-bit and extract it into "/opt"

boss@solaiv[]# cd /opt

boss@solaiv[]# sudo tar xvpzf /home/itadmin/Downloads/jdk-8u5-linux-x64.tar.gz
boss@solaiv[]# cd /opt/jdk1.8.0_05

Step 2:

Open the /etc/profile file and add the following lines, as per your version, to set an environment variable for Java.

Use the root user to save the /etc/profile, or use gedit instead of vi.
The 'profile' file contains commands that ought to be run for login shells.


boss@solaiv[]# sudo vi /etc/profile


#--insert JAVA_HOME
JAVA_HOME=/opt/jdk1.8.0_05

#--in PATH variable just append at the end of the line
PATH=$PATH:$JAVA_HOME/bin

#--Append JAVA_HOME at end of the export statement
export PATH JAVA_HOME

Save the file by pressing the "Esc" key followed by :wq!


Step 3: Source the /etc/profile

boss@solaiv[]# source /etc/profile

Step 4: Update the java alternatives

By default the OS will have OpenJDK. Check with "java -version"; you will see "openJDK" reported. If your system has more than one version of Java, configure which one your system uses by entering the following commands in a terminal window. After updating the alternatives, "java -version" should report "Java HotSpot(TM) 64-Bit Server".

boss@solaiv[]# update-alternatives --install "/usr/bin/java" java "/opt/jdk1.8.0_05/bin/java" 1

boss@solaiv[]# update-alternatives --config java
(enter the selection number of the Oracle JDK when prompted)

boss@solaiv[]# java -version

2)       Configure SSH
Hadoop requires SSH access to manage its nodes, i.e. remote machines plus your local machine if you want to use Hadoop on it (which is what we want to do in this short tutorial). For our single-node setup of Hadoop, we therefore need to configure SSH access to localhost.


Password-less, SSH key-based authentication is needed so that the master node can log in to the slave nodes (and the secondary node) to start/stop them easily without any delays for authentication.

If you skip this step, you will have to provide a password every time.

Generate an SSH key for the user, then enable password-less SSH access to your local machine.

sudo apt-get install openssh-server

--You will be asked to enter a password
root@solaiv[]# ssh localhost

root@solaiv[]# ssh-keygen
root@solaiv[]# ssh-copy-id -i localhost

--After the above 2 steps, you will be connected without a password
root@solaiv[]# ssh localhost

root@solaiv[]# exit


3)       Hadoop installation


Now download Hadoop from the official Apache site, preferably a stable release of Hadoop 2.7.x, and extract the contents of the Hadoop package to a location of your choice.

We chose the location "/opt/".

Step 1: Download the tar.gz file of the latest version of Hadoop (hadoop-2.7.x) from the official site.
Step 2: Extract (untar) the downloaded file with these commands to /opt

root@solaiv[]# cd /opt
root@solaiv[/opt]# sudo tar xvpzf /home/itadmin/Downloads/hadoop-2.7.0.tar.gz
root@solaiv[/opt]# cd hadoop-2.7.0/

Like Java, update the Hadoop environment variable in /etc/profile




boss@solaiv[]# sudo vi /etc/profile


#--insert HADOOP_PREFIX
HADOOP_PREFIX=/opt/hadoop-2.7.0

#--in PATH variable just append at the end of the line
PATH=$PATH:$HADOOP_PREFIX/bin

#--Append HADOOP_PREFIX at end of the export statement
export PATH JAVA_HOME HADOOP_PREFIX

Save the file by pressing the "Esc" key followed by :wq!

Step 3: Source the /etc/profile

boss@solaiv[]# source /etc/profile

Verify Hadoop installation

boss@solaiv[]# cd $HADOOP_PREFIX

boss@solaiv[]# bin/hadoop version



3.1) Modify the Hadoop Configuration Files
In this section, we will configure the directory where Hadoop stores its configuration files, the network ports it listens to, etc. Our setup will use the Hadoop Distributed File System (HDFS), even though we are using only a single local machine.

Add the following properties to the various Hadoop configuration files, which are available under $HADOOP_PREFIX/etc/hadoop/: core-site.xml, hdfs-site.xml, mapred-site.xml & yarn-site.xml

Update the Java and Hadoop paths in the Hadoop environment file

boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop

boss@solaiv[]# vi hadoop-env.sh




Paste the following lines at the beginning of the file:

export JAVA_HOME=/opt/jdk1.8.0_05
export HADOOP_PREFIX=/opt/hadoop-2.7.0



Modify the core-site.xml
boss@solaiv[]# cd $HADOOP_PREFIX/etc/hadoop
boss@solaiv[]# vi core-site.xml
Paste following between <configuration> tags

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://localhost:9000</value>

</property>

</configuration>



Modify the hdfs-site.xml
boss@solaiv[]# vi hdfs-site.xml



Paste following between <configuration> tags

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>


</configuration>



YARN configuration - Single Node modify the mapred-site.xml
boss@solaiv[]# cp mapred-site.xml.template mapred-site.xml

boss@solaiv[]# vi mapred-site.xml



Paste following between <configuration> tags
<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

</configuration>



Modify the yarn-site.xml
boss@solaiv[]# vi yarn-site.xml




Paste following between <configuration> tags


<configuration>

<property><name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value></property>

</configuration>


Formatting the HDFS file-system via the NameNode

The first step to starting up your Hadoop installation is formatting the Hadoop file system, which is implemented on top of the local file system of our "cluster" (which includes only our local machine). We need to do this the first time we set up a Hadoop cluster. Do not format a running Hadoop file system, as you will lose all the data currently in the cluster (in HDFS).

root@solaiv[]# cd $HADOOP_PREFIX
root@solaiv[]# bin/hadoop namenode -format

Start NameNode daemon and DataNode daemon: (port 50070)
root@solaiv[]# sbin/start-dfs.sh


To see the running daemons, just type jps (or /opt/jdk1.8.0_05/bin/jps).

Start ResourceManager daemon and NodeManager daemon: (port 8088)
root@solaiv[]# sbin/start-yarn.sh



To stop the running processes:
root@solaiv[]# sbin/stop-dfs.sh


To stop the ResourceManager daemon and NodeManager daemon:

root@solaiv[]# sbin/stop-yarn.sh





Ex. No:6
Mount the one node Hadoop cluster using FUSE

Introduction
FUSE (Filesystem in Userspace) enables you to write a normal user application as a bridge for a traditional filesystem interface.
The hadoop-hdfs-fuse package enables you to use your HDFS cluster as if it were a traditional filesystem on Linux. It is assumed that you have a working HDFS cluster and know the hostname and port that your NameNode exposes.
Aim :
           
To Mount the one node Hadoop cluster using FUSE.
Procedure:

    1. To install fuse-dfs on Ubuntu systems:
sudo apt-get install hadoop-hdfs-fuse
    2. To set up and test your mount point:
mkdir -p <mount_point>
hadoop-fuse-dfs dfs://<name_node_hostname>:<namenode_port> <mount_point>
You can now run operations as if they are on your mount point. Press Ctrl+C to end the fuse-dfs program, and umount the partition if it is still mounted.
  Note:
  • To find its configuration directory, hadoop-fuse-dfs uses the HADOOP_CONF_DIR configured at the time the mount command is invoked.
  • If you are using SLES 11 with the Oracle JDK 6u26 package, hadoop-fuse-dfs may exit immediately because ld.so can't find libjvm.so. To work around this issue, add /usr/java/latest/jre/lib/amd64/server to the LD_LIBRARY_PATH.
    3. To clean up your test:
$ umount <mount_point>
You can now add a permanent HDFS mount which persists through reboots.
    4. To add a system mount:
       a. Open /etc/fstab and add lines to the bottom similar to these:
hadoop-fuse-dfs#dfs://<name_node_hostname>:<namenode_port> <mount_point> fuse allow_other,usetrash,rw 2 0
For example:
hadoop-fuse-dfs#dfs://localhost:8020 /mnt/hdfs fuse allow_other,usetrash,rw 2 0
       b. Test to make sure everything is working properly:
$ mount <mount_point>
Your system is now configured to allow you to use the ls command and use that mount point as if it were a normal system disk.
Ex. No: 7
Program to use the APIs of Hadoop to interact with it.
Aim :
           
To write a program that processes the electrical consumption data of all the large-scale industries of a particular state, using the APIs of Hadoop to interact with it.

Procedure:

1.      Given below is the data regarding the electrical consumption of an organization. It contains the monthly electrical consumption and the annual average for various years.

 

        Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  Avg
1979     23   23    2   43   24   25   26   26   26   26   25   26   25
1980     26   27   28   28   28   30   31   31   31   30   30   30   29
1981     31   32   32   32   33   34   35   36   36   34   34   34   34
1984     39   38   39   39   39   41   42   43   40   39   38   38   40
1985     38   39   39   39   39   41   41   41   00   40   39   39   45
If the above data is given as input, we have to write applications to process it and produce results such as finding the year of maximum usage, the year of minimum usage, and so on. This is a walkover for programmers when the number of records is finite: they simply write the logic to produce the required output and pass the data to the application.
But think of the data representing the electrical consumption of all the large-scale industries of a particular state since its formation.
When we write applications to process such bulk data,
  • They will take a lot of time to execute.
  • There will be a heavy network traffic when we move data from source to network server and so on.
To solve these problems, we have the MapReduce framework.
2.      The above data is saved as sample.txt and given as input. The input file looks as shown below.
1979   23   23   2   43   24   25   26   26   26   26   25   26  25 
1980   26   27   28  28   28   30   31   31   31   30   30   30  29 
1981   31   32   32  32   33   34   35   36   36   34   34   34  34 
1984   39   38   39  39   39   41   42   43   40   39   38   38  40 
1985   38   39   39  39   39   41   41   41   00   40   39   39  45 
3.      Write a program that processes the sample data using the MapReduce framework and save the program as ProcessUnits.java (a minimal sketch is given below). The compilation and execution of the program is explained in the following steps.
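
The manual does not list ProcessUnits.java itself. The following is a minimal sketch, written against the classic org.apache.hadoop.mapred API, that is consistent with the expected output shown later: it emits the yearly average (the last column) per year and keeps only the years whose average exceeds 30 units. The actual program used in your lab may differ.

package hadoop;

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class ProcessUnits {

    // Mapper: emit (year, yearly average), i.e. the first and last columns of each line
    public static class E_EMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String[] fields = value.toString().trim().split("\\s+");
            String year = fields[0];
            int average = Integer.parseInt(fields[fields.length - 1]);
            output.collect(new Text(year), new IntWritable(average));
        }
    }

    // Reducer: keep only the years whose average consumption is above 30 units
    public static class E_EReduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {

        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            while (values.hasNext()) {
                int average = values.next().get();
                if (average > 30) {
                    output.collect(key, new IntWritable(average));
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(ProcessUnits.class);
        conf.setJobName("max_electricity_units");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(E_EMapper.class);
        conf.setReducerClass(E_EReduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}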

4.                                                Compilation and Execution of Process Units Program

Let us assume we are in the home directory of a Hadoop user (e.g. /home/hadoop).
Follow the steps given below to compile and execute the above program.

Step 1

The following command is to create a directory to store the compiled java classes.
$ mkdir units 

Step 2

Download hadoop-core-1.2.1.jar, which is used to compile and execute the MapReduce program. Visit the following link http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1 to download the jar. Let us assume the downloaded folder is /home/hadoop/.

Step 3

The following commands are used for compiling the ProcessUnits.java program and creating a jar for the program.
$ javac -classpath hadoop-core-1.2.1.jar -d units ProcessUnits.java 
$ jar -cvf units.jar -C units/ . 

Step 4

The following command is used to create an input directory in HDFS.
$HADOOP_HOME/bin/hadoop fs -mkdir input_dir 

Step 5

The following command is used to copy the input file named sample.txt into the input directory of HDFS.
$HADOOP_HOME/bin/hadoop fs -put /home/hadoop/sample.txt input_dir 

Step 6

The following command is used to verify the files in the input directory.
$HADOOP_HOME/bin/hadoop fs -ls input_dir/ 

Step 7

The following command is used to run the Eleunit_max application by taking the input files from the input directory.
$HADOOP_HOME/bin/hadoop jar units.jar hadoop.ProcessUnits input_dir output_dir 
Wait for a while until the file is executed. After execution, as shown below, the output will contain the number of input splits, the number of Map tasks, the number of reducer tasks, etc.
INFO mapreduce.Job: Job job_1414748220717_0002 
completed successfully 
14/10/31 06:02:52 
INFO mapreduce.Job: Counters: 49 
File System Counters 
FILE: Number of bytes read=61 
FILE: Number of bytes written=279400 
FILE: Number of read operations=0 
FILE: Number of large read operations=0   
FILE: Number of write operations=0 
HDFS: Number of bytes read=546 
HDFS: Number of bytes written=40 
HDFS: Number of read operations=9 
HDFS: Number of large read operations=0 
HDFS: Number of write operations=2 Job Counters 
   Launched map tasks=2  
   Launched reduce tasks=1 
   Data-local map tasks=2  
   Total time spent by all maps in occupied slots (ms)=146137 
   Total time spent by all reduces in occupied slots (ms)=441   
   Total time spent by all map tasks (ms)=14613 
   Total time spent by all reduce tasks (ms)=44120 
   Total vcore-seconds taken by all map tasks=146137 
   Total vcore-seconds taken by all reduce tasks=44120 
   Total megabyte-seconds taken by all map tasks=149644288 
   Total megabyte-seconds taken by all reduce tasks=45178880 
Map-Reduce Framework 
 
Map input records=5  
   Map output records=5   
   Map output bytes=45  
   Map output materialized bytes=67  
   Input split bytes=208 
   Combine input records=5  
   Combine output records=5 
   Reduce input groups=5  
   Reduce shuffle bytes=6  
   Reduce input records=5  
   Reduce output records=5  
   Spilled Records=10  
   Shuffled Maps =2  
   Failed Shuffles=0  
   Merged Map outputs=2  
   GC time elapsed (ms)=948  
   CPU time spent (ms)=5160  
   Physical memory (bytes) snapshot=47749120  
   Virtual memory (bytes) snapshot=2899349504  
   Total committed heap usage (bytes)=277684224
File Output Format Counters 
   Bytes Written=40 

Step 8

The following command is used to verify the resultant files in the output folder.
$HADOOP_HOME/bin/hadoop fs -ls output_dir/ 

Step 9

The following command is used to see the output in Part-00000 file. This file is generated by HDFS.
$HADOOP_HOME/bin/hadoop fs -cat output_dir/part-00000 
Below is the output generated by the MapReduce program.
1981    34 
1984    40 
1985    45 

Step 10

The following command is used to copy the output folder from HDFS to the local file system for analyzing.
$HADOOP_HOME/bin/hadoop fs -get output_dir /home/hadoop 

Important Commands

All Hadoop commands are invoked by the $HADOOP_HOME/bin/hadoop command. Running the Hadoop script without any arguments prints the description for all commands.
Usage : hadoop [--config confdir] COMMAND

5.                                        Interact with MapReduce Jobs

Usage: hadoop job [GENERIC_OPTIONS]
The following are the Generic Options available in a Hadoop job.
-submit <job-file>
      Submits the job.
-status <job-id>
      Prints the map and reduce completion percentage and all job counters.
-counter <job-id> <group-name> <countername>
      Prints the counter value.
-kill <job-id>
      Kills the job.
-events <job-id> <fromevent-#> <#-of-events>
      Prints the events' details received by jobtracker for the given range.
-history [all] <jobOutputDir>
      Prints job details, failed and killed tip details. More details about the job such as successful tasks and task attempts made for each task can be viewed by specifying the [all] option.
-list [all]
      Displays all jobs. -list displays only jobs which are yet to complete.
-kill-task <task-id>
      Kills the task. Killed tasks are NOT counted against failed attempts.
-fail-task <task-id>
      Fails the task. Failed tasks are counted against failed attempts.
-set-priority <job-id> <priority>
      Changes the priority of the job. Allowed priority values are VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW.

To see the status of job

$ $HADOOP_HOME/bin/hadoop job -status <JOB-ID> 
e.g. 
$ $HADOOP_HOME/bin/hadoop job -status job_201310191043_0004 

To see the history of job output-dir

$ $HADOOP_HOME/bin/hadoop job -history <DIR-NAME> 
e.g. 
$ $HADOOP_HOME/bin/hadoop job -history /user/expert/output 

To kill the job

$ $HADOOP_HOME/bin/hadoop job -kill <JOB-ID> 
e.g. 
$ $HADOOP_HOME/bin/hadoop job -kill job_201310191043_0004 


Output

1981      34
1984      40
1985      45


Ex. No: 8
Word count program to demonstrate the use of Map and Reduce tasks

Aim :
           
To write a word count program to demonstrate the use of Map and Reduce tasks

Procedure:
WordCount is a simple application that counts the number of occurrences of each word in a given input set.

  1. Write the WordCount source code in Java, which includes the word count logic (a minimal sketch is given below).
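
A minimal sketch of WordCount.java, mirroring the standard Hadoop MapReduce WordCount example that this exercise is based on, is:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: split each line into tokens and emit (word, 1)
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sum up the counts for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}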

  2. Assuming environment variables are set as follows:

export JAVA_HOME=/usr/java/default
export PATH=${JAVA_HOME}/bin:${PATH}
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar


  3. Compile WordCount.java and create a jar:
$ bin/hadoop com.sun.tools.javac.Main WordCount.java
$ jar cf wc.jar WordCount*.class

  4. Assuming that:
     • /user/joe/wordcount/input - input directory in HDFS
     • /user/joe/wordcount/output - output directory in HDFS

  5. Sample text-files as input:
$ bin/hadoop fs -ls /user/joe/wordcount/input/ /user/joe/wordcount/input/file01 /user/joe/wordcount/input/file02
 
$ bin/hadoop fs -cat /user/joe/wordcount/input/file01
Hello World Bye World
 
$ bin/hadoop fs -cat /user/joe/wordcount/input/file02
Hello Hadoop Goodbye Hadoop

  6. Run the application:
$ bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output
Output:

$ bin/hadoop fs -cat /user/joe/wordcount/output/part-r-00000
Bye 1
Goodbye 1
Hadoop 2
Hello 2

World 2