Cyberinfrastructure Technology Integration

Clemson Computing and Information Technology (CCIT) provides research cyberinfrastructure resources and advanced research computing capabilities through its Cyberinfrastructure Technology Integration (CITI) group.

Training Workshops

CITI partners with researchers across campus and across the country to offer a diverse catalog of advanced computing training for Clemson University students, researchers, faculty, and staff, as well as for our external partners at other universities and organizations. If you have problems with or questions about course registration, please email ithelp@clemson.edu with the words “Palmetto training” in the subject line.

All workshops listed below are free for Clemson University students, faculty and staff. Registration is required, and can be done at https://cucourse.app.clemson.edu/it-training/student-index.php one week prior to the listed start dates.

Spring 2019 Schedule of Workshops

Introduction to Linux

Introduction to the Linux Command Line Interface for researchers

  • Location: Main Campus (Clemson)
    • Tuesday, January 22, 9:00AM - 11:00AM, 1:00PM - 3:00PM. Building/Room: TBD
    • Wednesday, February 20, 9:00AM - 11:00AM, 1:00PM - 3:00PM. Building/Room: TBD
    • Tuesday, March 26, 9:00AM - 11:00AM, 1:00PM - 3:00PM. Building/Room: TBD
    • Tuesday, April 9, 9:00AM - 11:00AM, 1:00PM - 3:00PM. Building/Room: TBD
  • Location: Zucker Family Graduate Education Center (North Charleston)
    • Wednesday, January 23, 9:00AM - 12:00PM. Room: Zucker 104
    • Friday, March 8, 9:00AM - 12:00PM. Room: Zucker 106
    • Wednesday, April 10, 9:00AM - 12:00PM. Room: Zucker 104

Introduction to Version Control with Git/GitHub

This workshop will cover Git, a revision control tool that lets people effectively track changes to projects such as technical papers, theses, or programs. Git allows them to work more systematically by saving “snapshots” of different versions of the project and letting them easily view or undo changes between snapshots. This lets them make changes to their project without the fear of losing work or “breaking things”. The workshop will also cover using Git with GitHub, a website that allows people to share their projects with the world (or with collaborators) and lets groups collaborate on projects more systematically than by e-mailing files to each other or using cloud storage.

  • Location: Main Campus (Clemson)
    • Friday, February 8, 9:00AM - 12:00PM. Building/Room: TBD
  • Location: Zucker Family Graduate Education Center (North Charleston)
    • Wednesday, March 6, 9:00AM - 12:00PM. Room: Zucker 104

Introduction to Research Computing on Palmetto Cluster

This workshop introduces participants to the Palmetto Cluster, Clemson University’s largest high-performance computing resource. It covers the cluster’s structure, its basic usage, and how to submit computational tasks to it.

  • Location: Main Campus (Clemson)
    • Thursday, January 24, 9:00AM - 12:00PM. Building/Room: TBD
    • Friday, February 22, 9:00AM - 12:00PM. Building/Room: TBD
    • Thursday, March 28, 9:00AM - 12:00PM. Building/Room: TBD
    • Tuesday, April 9, 9:00AM - 12:00PM. Building/Room: TBD
  • Location: Zucker Family Graduate Education Center (North Charleston)
    • Friday, January 25, 9:00AM - 12:00PM. Room: Zucker 104
    • Monday, March 11, 9:00AM - 12:00PM. Room: Zucker 104
    • Friday, April 12, 9:00AM - 12:00PM. Room: Zucker 104

Introduction to Programming in Python

This workshop introduces participants to programming, using the Python programming language, and is built around common scientific tasks such as loading, analyzing and visualizing data. The intended audience is researchers or students with no prior programming experience.
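As a rough illustration of the kind of task described above (loading, analyzing, and visualizing data), here is a minimal sketch in Python. It is not taken from the workshop materials: the sample data, the column names, and the function names `load_readings`, `summarize`, and `text_bar_chart` are all hypothetical, and the "visualization" is a plain-text bar chart so that the example needs only the standard library.

```python
import csv
import io
import statistics

# Hypothetical sample data; a real analysis would read a file from disk.
SAMPLE_CSV = """day,temperature_c
Mon,18.5
Tue,21.0
Wed,19.5
Thu,23.0
"""


def load_readings(text):
    """Load: parse CSV text into a list of (day, temperature) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["day"], float(row["temperature_c"])) for row in reader]


def summarize(readings):
    """Analyze: compute simple summary statistics of the temperatures."""
    temps = [temp for _, temp in readings]
    return {
        "mean": statistics.mean(temps),
        "min": min(temps),
        "max": max(temps),
    }


def text_bar_chart(readings):
    """Visualize: draw one row of '#' characters per reading."""
    lines = []
    for day, temp in readings:
        bar = "#" * int(temp)  # one '#' per whole degree
        lines.append(f"{day} {bar} {temp}")
    return "\n".join(lines)


readings = load_readings(SAMPLE_CSV)
summary = summarize(readings)
print(summary)
print(text_bar_chart(readings))
```

In practice, scientific Python work usually relies on third-party plotting and data libraries for these steps. The standard-library version is used here only to keep the sketch self-contained.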

NOTE: This is an all-day training. There will be a one-hour lunch break from 12:00PM - 01:00PM. Lunch WILL NOT be provided.

  • Location: Main Campus (Clemson)
    • Friday, January 25, 9:00AM - 4:00PM. Building/Room: TBD
    • Friday, April 5, 9:00AM - 4:00PM. Building/Room: TBD
  • Location: Zucker Family Graduate Education Center (North Charleston)
    • (Part 1) Monday, February 4, 9:00AM - 12:00PM. Room: Zucker 104
    • (Part 2) Wednesday, February 6, 9:00AM - 12:00PM. Room: Zucker 104
    • (Part 1) Monday, April 1, 9:00AM - 12:00PM. Room: Zucker 104
    • (Part 2) Wednesday, April 3, 9:00AM - 12:00PM. Room: Zucker 104

Introduction to Hadoop on the Cypress Cluster

This workshop introduces participants to the Hadoop ecosystem and the Cypress Cluster, Clemson University’s largest Hadoop cluster. The Cypress Cluster is housed, networked, and integrated with Clemson’s Palmetto Cluster. This workshop will cover Hadoop’s architecture, the Cypress Cluster’s structure, import and export of big data, basic usage, and how to submit scalable data analysis jobs to the Cypress Cluster. This workshop will incorporate the use of JupyterHub and Jupyter “Notebooks”. An understanding of the Linux command line and some Python experience would be beneficial (see other CITI trainings).

  • Location: Main Campus (Clemson)
    • NOTE: These main-campus sessions may be broadcast from Charleston to Clemson with in-person facilitation at Clemson, or a recording of the Charleston session will be made available.
    • Monday, January 28, 9:00AM - 12:00PM. Room: TBD
    • Wednesday, March 13, 9:00AM - 12:00PM. Room: TBD
    • Wednesday, April 17, 9:00AM - 12:00PM. Room: TBD
  • Location: Zucker Family Graduate Education Center (North Charleston)
    • Monday, January 28, 9:00AM - 12:00PM. Room: Zucker 104
    • Wednesday, March 13, 9:00AM - 12:00PM. Room: Zucker 104
    • Wednesday, April 17, 9:00AM - 12:00PM. Room: Zucker 104

Introduction to Big Data Analytics in Python

This workshop will teach how to use Apache Spark and Python to perform large-scale, in-memory data analytics. Learning outcomes of this workshop include understanding the overall conceptual design of Spark and the advantages of using Spark over traditional Hadoop MapReduce. Participants will also learn to develop Spark programs using Python and to leverage Spark-specific capabilities such as SQLContext and DataFrame to assist with data analytics. Most importantly, we will teach you how to use Clemson University’s Cypress Cluster to run large-scale, in-memory data analytics.

NOTE: This is now an all-day training. There will be a one-hour lunch break from 12:00PM - 01:00PM. Lunch WILL NOT be provided.

  • Location: Main Campus (Clemson)
    • NOTE: These main-campus sessions will be broadcast from Charleston to Clemson with in-person facilitation.
    • Friday, February 15
      • NOTE: Rooms will change during the lunch hour, so plan to take your belongings with you.
      • 1st half: 09:00AM - 12:00PM. Room: Watt 313
      • 2nd half: 01:00PM - 04:00PM. Room: Watt 203
    • Friday, April 5, 9:00AM - 04:00PM. Room: TBD
  • Location: Zucker Family Graduate Education Center (North Charleston)
    • Friday, February 15, 9:00AM - 04:00PM. Room: Zucker 104
    • Friday, April 5, 9:00AM - 04:00PM. Room: Zucker 104

Introduction to Data Science using R

This workshop introduces the R language for data analytics, using RStudio on a PC and Jupyter notebooks on Palmetto. Workshop contents include a basic understanding of R, installation of additional R modules, an introduction to data manipulation, an introduction to visualization, and several best practices for using R. No prior knowledge of R or of programming in general is required.

  • Tuesday, February 12, 9:00AM - 4:00PM. Location TBD
  • Friday, April 12, 9:00AM - 4:00PM. Location TBD

Introduction to Machine Learning using R

Machine learning is the science of teaching computers to perform assigned tasks without being explicitly programmed. It has been used in many practical applications, such as self-driving cars, speech recognition, and email spam classification. It is widely used not only in engineering (hydroinformatics, bioinformatics, genomics, geosciences and remote sensing, mechatronics) but also in economics, the health sciences, and even the real estate industry. This workshop provides a general introduction to machine learning in the R programming language, drawing on R’s abundance of statistical packages. Topics include: (1) supervised learning (regression analysis, distance-based algorithms, regularization algorithms, tree-based algorithms, Bayes algorithms, support vector machines, artificial neural networks) and (2) unsupervised learning (clustering, dimensionality reduction). The course will also draw on numerous case studies and applications relevant to different engineering programs.

The prerequisite for this course is “Introduction to Data Science using R”, offered by the CITI team.

  • Location: Main Campus (Clemson)
    • Friday, March 1, 9:00AM - 12:00PM. Location TBD

GIS Training

For GIS training, please visit the Clemson Center for Geospatial Technologies.