8. Chameleon Cloud

Jaime Cernuda edited this page Aug 28, 2023 · 1 revision

Chameleon Cloud

Chameleon Cloud is a research infrastructure funded by the National Science Foundation (NSF) that offers cloud resources tailored for the scientific community, especially those delving into computer science research areas like cloud computing, networking, and large-scale distributed systems. Unlike many commercial cloud providers, Chameleon provides direct bare metal access, allowing researchers to have unmediated access to physical hardware. This flexibility is pivotal for specific experimental needs. Located at the University of Chicago and the Texas Advanced Computing Center at the University of Texas at Austin, Chameleon emphasizes reproducibility, enabling researchers to share and replicate experimental setups. Moreover, its advanced networking capabilities and diverse hardware configurations cater to a wide range of research needs. As a collaborative platform, Chameleon fosters community engagement through workshops and tutorials, all while providing its resources free of charge to researchers.

Chameleon Cloud is the de facto cloud resource for all projects in the center.

Finding the Chameleon URL

  1. Go to Chameleon Cloud
  2. Log in.
  3. Use the menu at the top to select Experiment -> Hardware Discovery.
  4. Use this page to explore the hardware resources available within Chameleon. For instance, you can filter by CPU, RAM size, and architecture, and at times by Advanced filters. Once you have selected filters, click View to see all resources that have that combination of hardware.
  5. Note that not every site has all the hardware. For instance, IB (InfiniBand) networks are only available at TACC.
  6. Once you have identified the site, select it from the three buttons at the top, e.g., CHI@TACC.
  7. Optionally, you can use Experiment -> CHI@TACC. (This works much faster on my computer.)
  8. You can now use this link directly to access the cluster at that site and make reservations, allocate nodes, etc.

Set Up SSH

  1. Set up key pairs by adding your SSH keys.
    • Go to Compute -> Key Pairs
    • Here you can either create a new key-pair and use it to log in, or import an existing key-pair.
      • For a new Key-Pair (recommended):
        1. Click on Create Key-Pair
        2. Assign a unique name
        3. Select SSH Key as the key type
        4. Download the key-pair
        5. Store this key on your computer, usually under ~/.ssh
        6. Change the permissions of the key: chmod 400 ~/.ssh/name.pem
        7. Optional: Set up an entry in ~/.ssh/config. (Note that you don't have an IP yet; fill in the floating IP once it has been assigned.)
          Host name1
          HostName <floating-IP>
          User cc
          PubkeyAuthentication yes
          IdentityFile ~/.ssh/name.pem
          
        8. Once done, you can connect by using ssh name1
      • For importing:
        1. Generate a key with ssh-keygen
        2. Click on Import Public Key
        3. Assign a unique name
        4. Either load your .pub file or copy-paste the key.
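
The optional config step for a new key-pair can also be scripted. The following is a minimal sketch, not part of the original instructions: the helper name add_ssh_host is hypothetical, and the IP address in the usage line is a documentation placeholder that stands in for the floating IP you receive later. It appends a Host entry only if the alias is not already present.

```shell
# Hypothetical helper: append a Host entry to an SSH config file,
# skipping the write if an entry for the same alias already exists.
# Usage: add_ssh_host ALIAS IP KEY_FILE [CONFIG_FILE]
add_ssh_host() {
    local alias_name="$1" ip="$2" key_file="$3"
    local config_file="${4:-$HOME/.ssh/config}"

    # Leave the file untouched when the alias is already configured.
    if [ -f "$config_file" ] && grep -qxF "Host $alias_name" "$config_file"; then
        echo "Host $alias_name already present in $config_file" >&2
        return 0
    fi

    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<EOF
Host $alias_name
    HostName $ip
    User cc
    PubkeyAuthentication yes
    IdentityFile $key_file
EOF
    # SSH refuses config files that other users can write to.
    chmod 600 "$config_file"
}

# Example (placeholder IP; replace it with your floating IP):
#   add_ssh_host name1 203.0.113.10 ~/.ssh/name.pem
#   ssh name1
```

Keeping the entry idempotent means the helper can be rerun after every new lease without duplicating stanzas; if the floating IP changes, edit the HostName line by hand.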

Create a lease

  1. Navigate to Reservations -> Leases
  2. Click on Create lease.
    • Fill in all the information based on the hardware discovery you did on the website.
    • As a good practice, always reserve one node as a head node, which can host NFS, hold the public IP, etc.
    • Click on Create
  3. Wait for the lease to become ACTIVE (in the Status column)
  4. NOTE: a lease lasts at most 7 days. If you need the nodes for longer, go to the lease on the 5th day and extend it; this only works if no one else has reserved those nodes.
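
To keep track of the 7-day limit, the relevant dates can be computed from the lease start time. This is only a sketch under stated assumptions: the function name lease_dates is hypothetical, it relies on GNU date (the -d option, as found on most Linux distributions), it treats the start time as UTC, and it reads "the 5th day" as four full days after the start.

```shell
# Hypothetical helper: print when a 7-day lease ends and when the 5th day begins.
# Usage: lease_dates "YYYY-MM-DD HH:MM"   (start time, interpreted as UTC)
lease_dates() {
    local start_epoch day=86400
    # Convert the start time to seconds since the epoch (GNU date assumed).
    start_epoch=$(date -u -d "$1" +%s) || return 1
    echo "lease ends:  $(date -u -d "@$((start_epoch + 7 * day))" '+%Y-%m-%d %H:%M')"
    echo "extend from: $(date -u -d "@$((start_epoch + 4 * day))" '+%Y-%m-%d %H:%M')"
}

# Example:
#   lease_dates "2023-08-28 10:00"
```

Working in epoch seconds avoids the ambiguity of relative date strings such as "+7 days" placed after a time of day.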

Create an instance

  1. Go to Compute -> Instances
  2. Click on Launch Instance
    • Assign a unique name, a count, and your reservation (if the count is greater than 1, the count is appended to the name, starting from 0)
    • Then select the Source tab and select the image. Generally, start with a base image of an OS like Ubuntu, CentOS, etc., unless you have already created an image that you want to use.
    • Select the Flavor tab and choose Bare Metal.
    • Go to the Key Pair tab and ensure your key-pair is selected.
    • The default security group is sufficient. If you need any changes to this, consult your admin.
    • Once done, click on Launch Instance
    • It might take several minutes to launch the instance. When it's done, the Status becomes Active and the Power State is Running.

Assign floating IP to head node

  1. Go to Network -> Floating IPs
  2. Click Allocate IP to Project
    • Give it a name and create it.
    • Click Associate on your newly created IP.
      • Here, select your head node under Port to be associated.
      • Once Status changes to ACTIVE, your floating IP is ready to use.
  3. Ideally, associate a public IP only to your head node. Once inside, you can connect to other nodes using ssh internally.

Using the instances

  1. Connect to the head node with an SSH client, using its floating IP.
  2. You now have full access to a Linux machine, including sudo privileges.

Create a Ticket

  1. Go to Chameleon Cloud Hardware
  2. User Icon (top right) -> Dashboard
  3. Click on Open a Ticket
    • Provide complete information about the problem, including which instances are affected (their details are shown in each instance's Overview), etc.
    • Click Create

Snapshot

  1. If you install anything on the system (outside of the workspace), you need to update the image to preserve it.
    • To update the image:
      • Run the cc-snapshot utility, which is preinstalled on the system: sudo cc-snapshot <UNIQUE_IMAGE_NAME>. It requires sudo and asks for your Chameleon username and password.
      • Once created, a snapshot can be used to spawn new instances at the same point of development, either by going to Compute -> Images and clicking Launch, or by following the normal steps and selecting the snapshot as the source.
      • This is very useful once software has been installed; for example, it avoids installing MPI multiple times.

Passwordless SSH

  1. Additionally, you will need to establish passwordless SSH between all the nodes in the cluster (a requirement of MPI). Here are the steps:
    • Download the script: git clone https://github.com/JaimeCernuda/sshSyncScript.git
    • Edit /etc/hosts (requires sudo). In this file, add a line of the form <IP> <HOSTNAME> for each instance you want to sync, one instance per line. The local IPs of all the instances can be viewed on the website, and the names are up to you. Example:
      192.168.0.125 master
      192.168.0.241 slave1
      
    • Copy the key used to log into Chameleon to the master node, where cc2020 is the SSH alias of that node (like name1 in the config example above): scp ~/.ssh/name.pem cc2020:~/.ssh/id_rsa
    • Run ./sync.sh. The script should show all the hostnames you added. Once you press Enter, it synchronizes all instances so they can access each other without a password.
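
When the cluster grows, the /etc/hosts edit is easy to repeat by accident. The sketch below is an illustration only (the function name missing_host_entries is hypothetical and is not part of sshSyncScript): it reads <IP> <HOSTNAME> pairs and prints only those whose hostname is not yet in the hosts file, so the result can be appended with sudo.

```shell
# Hypothetical helper: read "IP HOSTNAME" pairs on stdin and print the ones
# whose hostname is not yet listed in the hosts file (default: /etc/hosts).
# Usage: missing_host_entries [HOSTS_FILE] < pairs
missing_host_entries() {
    local hosts_file="${1:-/etc/hosts}" ip name
    while read -r ip name; do
        # Ignore blank or incomplete input lines.
        [ -n "$ip" ] && [ -n "$name" ] || continue
        # awk exits 0 when the name appears as a hostname field on a
        # non-comment line of the hosts file, and non-zero otherwise.
        if ! awk -v n="$name" '
            $1 ~ /^#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file" 2>/dev/null; then
            printf '%s %s\n' "$ip" "$name"
        fi
    done
}

# Example: append only the entries that are still missing.
#   printf '%s\n' '192.168.0.125 master' '192.168.0.241 slave1' \
#       | missing_host_entries | sudo tee -a /etc/hosts
```

Printing the missing lines instead of writing them keeps the function free of root requirements; only the final tee needs sudo.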

Install NFS

Server

  1. Install the server: sudo apt install nfs-kernel-server
  2. When configuring an NFSv4 server, a good practice is to use a global NFS root directory and bind mount the actual directories to the share mount point.
    • Create the root directory: sudo mkdir -p /export/cc
    • Mount /home/cc into the root folder: sudo mount --bind /home/cc /export/cc
    • Make the bind mount persist across reboots by adding the following line to /etc/fstab (sudo vim /etc/fstab):
      /home/cc /export/cc none bind 0 0
      
    • Put the filesystem on the network by adding the following two lines to /etc/exports (sudo vim /etc/exports):
      /export 192.168.0.0/24(rw,fsid=0,insecure,no_subtree_check,async)
      /export/cc 192.168.0.0/24(rw,nohide,insecure,no_subtree_check,async)
      
    • Run: sudo exportfs -ra
    • Restart the service to apply changes: sudo service nfs-kernel-server restart
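
The two /etc/exports lines depend on the private subnet of your instances, which may differ from 192.168.0.0/24. As a hedged sketch (the function name nfs_export_lines is hypothetical), the lines can be generated for a given subnet and share name, using the same export options as above:

```shell
# Hypothetical helper: print the /etc/exports lines for an NFSv4 root (/export)
# and one share below it, restricted to the given client subnet.
# Usage: nfs_export_lines SUBNET [SHARE_NAME]
nfs_export_lines() {
    local subnet="$1" share="${2:-cc}"
    # fsid=0 marks /export as the NFSv4 root; nohide exposes the bind mount below it.
    printf '/export %s(rw,fsid=0,insecure,no_subtree_check,async)\n' "$subnet"
    printf '/export/%s %s(rw,nohide,insecure,no_subtree_check,async)\n' "$share" "$subnet"
}

# Example: append the lines and reload the export table.
#   nfs_export_lines 192.168.0.0/24 | sudo tee -a /etc/exports
#   sudo exportfs -ra
```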

Client

  1. Install the client: sudo apt-get install nfs-common
  2. Mount the network share onto your home directory: sudo mount -t nfs -o proto=tcp,port=2049 <nfs-server-IP>:/cc /home/cc
  3. Make the mount persist across reboots by adding the following line to /etc/fstab (sudo vim /etc/fstab):
    <nfs-server-IP>:/cc /home/cc nfs auto 0 0
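
After mounting, it can be worth confirming that /home/cc is really served over NFS before starting any jobs. The following is a minimal sketch, assuming a Linux client where /proc/mounts lists one mount per line as device, mount point, and filesystem type; the function name is_nfs_mounted is hypothetical.

```shell
# Hypothetical helper: succeed if MOUNT_POINT is listed as an NFS mount
# (type nfs or nfs4) in a /proc/mounts-style file.
# Usage: is_nfs_mounted MOUNT_POINT [MOUNTS_FILE]
is_nfs_mounted() {
    local mount_point="$1" mounts_file="${2:-/proc/mounts}"
    # Field 2 is the mount point and field 3 is the filesystem type.
    awk -v mp="$mount_point" '
        $2 == mp && $3 ~ /^nfs4?$/ { found = 1 }
        END { exit !found }' "$mounts_file"
}

# Example:
#   is_nfs_mounted /home/cc && echo "NFS home is mounted"
```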
    

Refer to this for differences between Ubuntu and CentOS7

Additional Resources and Notes

  1. Basics of what SSH is can be found here.
  2. Additional resources on SSH: Digital Ocean SSH Essentials and SSH Config.
  3. Resources on SSH permissions: Perforce Community.
  4. More information on snapshots: Chameleon Cloud Blog.
  5. Resource on understanding private and public IPs: IP Location.
  6. General flow: students work on Chameleon -> if you have issues, ask your supervisor -> if they recommend it, then and only then create a ticket.
  7. More about MPI: Wikipedia.
  8. Tarball explanation: CATB Jargon.
  9. Super simplified resource for installing software: Linux.com.
  10. Some popular commands for Linux: UbuntuPit.
  11. Basics of installation "from source": ItsFOSS.
  12. Resource on .bashrc file: Unix StackExchange.
  13. More info on PATH variable: LINFO.
  14. More info on passwordless ssh: Tecmint.
  15. About /etc/hosts: Vitux.
  16. Info on id_rsa: Bandlem.
  17. NFS tutorial: Ubuntu Community.