Cyberinfrastructure Technology Integration

Clemson Computing and Information Technology (CCIT) provides research cyberinfrastructure resources and advanced research computing capabilities through its Cyberinfrastructure Technology Integration (CITI) group.

Training

CITI partners with researchers across campus and across the country to offer a diverse catalog of advanced computing training opportunities for Clemson University students, researchers, faculty, and staff, as well as opportunities for our external partners at other universities and organizations. If you have problems with or questions about course registration, please contact ithelp [at] clemson.edu with the words "Palmetto training" in the subject line.

Registration for training workshops will be available on clemson.edu/clereg one week prior to the start dates listed below.

Fall 2017 Schedule of Workshops


Introduction to Linux from the command line:
Software Carpentry-based introduction to the Linux command line interface

- Thursday, September 7, 9:00AM - 12:00PM. Location: Barre Hall B105
- Thursday, October 5, 9:00AM - 12:00PM. Location: Barre Hall B105
- Monday, November 6, 9:00AM - 12:00PM. Location: Barre Hall B105
- Wednesday, November 29, 9:00AM - 12:00PM. Location: Barre Hall B105


Introduction to Version Control with Git:
Introduction to Git for version control and GitHub.com for collaboration

- Monday, September 4, 9:00AM - 12:00PM. Location: Barre Hall B105
- Monday, October 30, 9:00AM - 12:00PM. Location: Barre Hall B105


Introduction to research computing on Palmetto Cluster
Introduction to the Palmetto cluster platform: details of the infrastructure, the scheduler, and best practices

- Friday, September 8, 9:00AM - 12:00PM. Location: Barre Hall B105
- Friday, October 6, 9:00AM - 12:00PM. Location: Barre Hall B105
- Tuesday, November 7, 9:00AM - 12:00PM. Location: Barre Hall B105
- Thursday, November 30, 9:00AM - 12:00PM. Location: Barre Hall B105


Introduction to Python - Introduction to programming in Python
- Tuesday, October 31, 9:00AM - 12:00PM. Location: Barre Hall B105


Introduction to Data Science using R
Introduction to the R language for data analytics using RStudio on a PC and Jupyter notebooks on Palmetto. Workshop contents include a basic understanding of R, installation of additional R modules, an introduction to data manipulation, an introduction to visualization, and several best practices for using R. No prior knowledge of R or programming in general is required.

- Thursday, August 31, 3:30PM - 4:45PM. (full)
- Tuesday, September 5, 9:00AM - 12:00PM. Location: Barre Hall B106
- Tuesday, October 31, 9:00AM - 12:00PM. Location: Barre Hall B106


Data Mining using R
This workshop focuses on data mining techniques in R, with an emphasis on techniques for acquiring and curating data from online sources. For acquiring data, we will learn how to download from static links, crawl through entire websites, and stream data from real-time sources. For curating data, we will learn how to extract and expand information from acquired data, which is often stored in unstructured or semi-structured formats (XML, JSON, ...), into a structured format suitable for subsequent analysis. We will also learn about best practices in data management, including organizing data directories, working with databases, and automating the data-mining process on the Palmetto supercomputer.

- Tuesday, September 5, 3:30PM - 4:45PM (full)
- Thursday, September 7, 3:30PM - 4:45PM (full)
- Thursday, October 12, 9:00AM - 12:00PM. Location: Barre Hall B105
- Tuesday, November 5, 9:00AM - 12:00PM. Location: TBD
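
As a rough illustration of the curation step described for this workshop (turning semi-structured JSON into a structured table), here is a minimal sketch. The workshop itself uses R; Python is used here only to show the idea. The sample records, field names, and helper functions are hypothetical and not taken from the workshop materials.

```python
import csv
import io
import json

# Hypothetical semi-structured records; the field names are made up for illustration.
RAW = """
[
  {"id": 1, "user": {"name": "Ada", "city": "Clemson"}, "tags": ["r", "hpc"]},
  {"id": 2, "user": {"name": "Lin"}, "tags": []}
]
"""


def flatten(record, parent_key="", sep="."):
    """Turn nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        elif isinstance(value, list):
            # Store lists as one delimited string so each record stays on one row.
            flat[full_key] = ";".join(str(v) for v in value)
        else:
            flat[full_key] = value
    return flat


def to_rows(raw_json):
    """Return (columns, rows); fields missing from a record become empty strings."""
    rows = [flatten(record) for record in json.loads(raw_json)]
    columns = sorted({key for row in rows for key in row})
    return columns, [[row.get(col, "") for col in columns] for row in rows]


def to_csv(raw_json):
    """Render the flattened records as CSV text."""
    columns, rows = to_rows(raw_json)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(RAW))
```

Dotted column names such as `user.city` are one possible convention for nested fields; a real pipeline may prefer separate tables for nested or repeated data.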


Introduction to Hadoop - Introduction to Hadoop ecosystem and MapReduce
This workshop will teach how to use Hadoop MapReduce and Python to perform large-scale data analytics. Learning outcomes of this workshop include understanding the overall architecture of the Hadoop Distributed File System (HDFS) and understanding the concept of MapReduce. Throughout the workshop, participants will learn to develop and run MapReduce programs, examine system logs to debug MapReduce applications, and optimize MapReduce applications.

- Thursday, September 14, 9:00AM - 12:00PM. Location: Barre Hall B105
- Thursday, December 7, 9:00AM - 12:00PM. Location: TBD
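
To give a feel for the MapReduce concept mentioned in the Hadoop workshop description, the sketch below runs the three phases (map, shuffle, reduce) of a word count in a single Python process. It does not use Hadoop or HDFS; the function names are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data on Palmetto", "big clusters process big data"]))
```

In a real Hadoop job the map and reduce functions run on many nodes and the shuffle moves data across the network; only the overall data flow is shown here.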


Introduction to Spark for fast in-memory big data processing using Python

This workshop will teach how to use Apache Spark and Python to perform large-scale in-memory data analytics. Learning outcomes of this workshop include understanding the overall conceptual design of Spark and the advantages of using Spark over traditional Hadoop MapReduce. Participants will also learn to develop Spark programs using Python and to leverage Spark-specific capabilities such as SQLContext and DataFrame to assist with data analytics.

- Friday, September 15, 9:00AM - 12:00PM. Location: Barre Hall B106
- Friday, December 8, 9:00AM - 12:00PM. Location: Barre Hall B106


Introduction to Machine Learning
The first half of this workshop focuses on machine learning techniques in Python (Scikit-learn), which remains a leading choice of programming language for machine learning. The second half of the workshop focuses on deep learning techniques in DIGITS, the NVIDIA Deep Learning GPU Training System, which is easy to learn and use. We will also learn how to run this processing on the Palmetto supercomputer.

- Tuesday, September 29, 9:00AM - 12:00PM. Location: ASC 118 (Academic Success Center)
- Thursday, November 2, 9:00AM - 12:00PM. Location: TBD


High Performance Python I
This session will be a hands-on tutorial on different ways to measure, improve, and accelerate the performance of Python code, including using multiple cores, multiple nodes, and the GPU to speed up computations.

- Thursday, August 17 (morning), 9:00AM - 12:00PM. Location: Barre Hall B106
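
The "measure" part of the High Performance Python I session can be illustrated with the standard library's `timeit` module. This is only a minimal sketch under assumed names: the two example functions and the `time_call` helper are hypothetical and not taken from the session materials.

```python
import timeit


def sum_squares_loop(n):
    """Sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    """Same computation using the built-in sum over a generator."""
    return sum(i * i for i in range(n))


def time_call(func, n, repeat=3, number=10):
    """Return the best observed time per call in seconds.

    Taking the minimum over several repeats reduces noise from other processes.
    """
    timings = timeit.repeat(lambda: func(n), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    for func in (sum_squares_loop, sum_squares_builtin):
        print(f"{func.__name__}: {time_call(func, 100_000):.6f} s per call")
```

Timings vary between machines and Python versions, so it is worth measuring on the system where the code will actually run before deciding what to optimize.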


High Performance Python II
For this session, bring your own code or research questions, and we will help you get started with parallelizing and/or improving its performance.

- Thursday, August 17 (afternoon), 1:00PM - 4:00PM. Location: Barre Hall B106


Scientific Visualization with ParaView
This training session provides an introduction to scientific visualization using ParaView. The topics covered are:

- How to load different datasets and simulation results in ParaView.
- How to apply pre-defined filters to loaded datasets in order to extract information about simulation results and create animations.
- How to connect ParaView to the Palmetto cluster in order to load big data structures and work with them using parallel visualization.
- How to send scientific visualization datasets to HTC Vive headsets in order to interact with data structures in a virtual reality environment.
- How to use the Python programming language to create customized filters and work with complex data structures.
- Real cases of simulation results in ParaView that demonstrate the powerful tools of this scientific visualization software.

- Friday, September 22, 10:30AM - 11:30AM


Scientific Visualization with VisIt
This training session provides an introduction to scientific visualization using VisIt. The topics covered are:

- How to load different datasets and simulation results in VisIt.
- How to apply pre-defined filters to loaded datasets in order to extract information about simulation results and create animations.
- How to connect VisIt to the Palmetto cluster in order to load big data structures and work with them using parallel visualization.
- Real cases of simulation results in VisIt that demonstrate the powerful tools of this scientific visualization software.

- Friday, September 29, 10:30AM - 11:30AM


Scientific Visualization with VMD
This training session provides an introduction to molecular dynamics and biomolecular visualization using VMD. The topics covered are:

- How to open different molecular and biomolecular structures in VMD and extract regions of interest, such as the hydrophilic and hydrophobic parts of a molecule.
- How to display molecular structures using different visualization types, such as chains and ribbons, and assign computed fields, such as temperature or atom movement, as colors on the molecular structures.
- How to create animations from dynamic molecular simulations and extract positional information about atoms, such as RMSD.
- How to use the Palmetto cluster to handle big molecular structures, including installing and using VMD on the Palmetto cluster.

- Friday, October 6, 10:30AM - 11:30AM


Scientific Visualization with CUDA
This training session will show real cases of using CUDA/OpenGL in scientific visualization, and then explain how to use the Palmetto cluster to combine CUDA and high-performance computing in order to visualize big data structures.

- Friday, October 13, 10:30AM - 11:30AM