These CI links have been crowd-sourced from the ConnectCI community and represent a “vetted” list of useful websites, training modules and tutorials. CI links show up in a tag search if they have the relevant tag attached. Affinity groups can include relevant CI links on their respective affinity group pages. Additional CI links are always welcome; click the “Add New CI Link” button to suggest one.
We teach foundational coding and data science skills to researchers worldwide.
Website
administering-hpc training
Beginner, Intermediate, Advanced
A comprehensive list of training resources from the HPC University. HPCU is a virtual organization whose primary goal is to provide a cohesive,… more
Learning
debugging hpc-operations professional-development
This comprehensive workshop is designed to guide participants through the world of cryptography, from foundational concepts to advanced… more
python data-security cybersecurity
Beginner, Intermediate
Open OnDemand is an easy-to-use web portal that lets students, researchers, and industry professionals use supercomputers from anywhere. It is… more
open-ondemand administering-hpc cluster-management
RELION (REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone software package developed by Sjors Scheres' group at the MRC… more
machine-learning data-analysis image-processing
This is a beginner-friendly tutorial on how to set up your web server using Rust!
Docs
Rust
Beginner
This is a great mentoring resource and has many articles related to mentoring. It is a one-stop shop for mentoring, and at the bottom, there are tags… more
mentorship
Monthly workshops sponsored by ACCESS on a variety of HPC topics organized by Pittsburgh Supercomputing Center (PSC). Each workshop will be telecast… more
deep-learning machine-learning neural-networks
The documentation provides an overview of using Pegasus, a workflow management system, on ACCESS resources for high throughput computing (HTC)… more
pegasus
This workshop focuses on developing an understanding of the fundamentals of attention and the transformer architecture so that you can understand how… more
ai deep-learning machine-learning
Intermediate
These are the templates for the launch and wrap presentations used by the Cyberinfrastructure Community-Wide Mentorship Network
community-outreach mentorship professional-development
This tutorial demonstrates how to create, manage, and deploy containerized Jupyter simulations for High-Performance Computing (HPC) environments,… more
cloud cloud-computing openstack
Cornell Virtual Workshop is a comprehensive training resource for high performance computing topics. The Cornell University Center for Advanced… more
jetstream stampede2 cloud-computing
DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze… more
darwin big-data
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It makes analyzing and presenting your… more
documentation python
DeapSECURE is a training program to infuse high-performance computational techniques into cybersecurity research and education. It is an NSF-funded… more
This beginner-friendly guide introduces Retrieval-Augmented Generation (RAG), a technique to enhance Large Language Models (LLMs) by integrating… more
ai llm NAIRR-pilot
This course from MIT OpenCourseWare (OCW) covers very basic information on how to get started with programming using Python. Lectures are available,… more
python
Geocoding is the process of taking a street address and converting it into coordinates that can be plotted on a map. This conversion typically… more
gis
An HPC-focused Carpentry community. Trainings include HPC fundamentals, Python, Chapel, LAMMPS, parallelization with Python, scaling studies, etc.
software-carpentry training
This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow from data… more
ai deep-learning image-processing
This documentation contains introductory material on Python Programming for Digital Humanities and Computational Research. This can be a go-to… more
ai big-data data-analysis
In this presentation, I will explore the recent advancements in AI-driven production of 3D-generative assets and environments, particularly focusing… more
Slides
ai llm generative-ai
This workshop will go into the different ways python packages can be managed in a cluster environment using conda and python virtual environments… more
documentation pytorch data-science
Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Other related topics… more
performance-tuning profiling parallelization
This is the main documentation repo for the Open OnDemand Portal which enables researchers to access HPC resources from a familiar web interface.
documentation open-ondemand
PyTorch is a Python library that supports accelerated GPU processing for Machine Learning and Deep Learning. In this tutorial, I will teach the… more
This tutorial shows how to set up an open-source customizable RAG chatbot to answer questions about documents you can choose. It uses Indiana's… more
Tool
This Udacity article lists the most frequently used R packages for data science and statistics. For each package, the article provides the link to… more
plotting visualization data-analysis
Learn how to use Linux commands in a Python script. Specifically, learn how to use the subprocess and os modules in Python to run shell commands (… more
cluster-management programming python
Understand the benefits of an automated version control system and the basics of how automated version control systems work. Configure git the first… more
version-control github git
A question and answer forum for neuroscience researchers, infrastructure providers and software developers.
documentation image-processing data-sharing
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant ([rtdatavis.github.io](… more
visualization
Intermediate, Advanced
pip stands for "pip installs packages". It's the go-to package manager for Python, allowing developers to install, update, and manage… more
pip software-installation
This website is an interactive introduction to Gaussian Belief Propagation (GBP), a probabilistic inference algorithm that operates by passing… more
ai machine-learning
This is a short video on how to exchange ACCESS credits and connect to Jetstream 2 (please note this was created for Duke users but applies to all).
Video
ACCESS-account ACCESS-credits exchange-request
ACCESS requests proposals to be written following NSF proposal guidelines. The link provides an example of an ACCESS proposal using an NSF LaTeX… more
allocations-proposal proposal-request research-facilitation
Listing of upcoming ACCESS related events and training activities.
professional-development training workforce-development
A step-by-step guide to getting your first allocation for ACCESS computing and storage resources.
ACCESS-account ACCESS-credits allocations-proposal
A guide for Duke OIT on how to advise Duke University members on using ACCESS and allocation credits for Jetstream 2. This can be used for… more
ACCESS-credits adding-users allocation-management
NCSA is the home of Delta, a computing and data resource that balances cutting-edge graphics processor and CPU architectures with a non-POSIX file… more
delta
Expanse at SDSC is a cluster designed by Dell and SDSC delivering 5.16 peak petaflops, and offers Composable Systems and Cloud Bursting. This… more
expanse composable-systems gpu
affinity-group pegasus ACCESS-website
A library of short videos about ACCESS allocations, resources and support.
training
This tutorial introduces the use of Containers using the Charliecloud software suite. This tutorial will provide participants with background and… more
ACES TAMU scratch
This textbook is the first comprehensive treatment of active inference, an integrative perspective on brain, cognition, and behavior used across… more
ai machine-learning neural-networks
This is a self-guided online course on compilers. The topics covered throughout the course include universal compiler topics like intermediate… more
optimization parallelization training
Advanced
Mathematical optimization deals with the problem of numerically finding minimums or maximums of a function. This tutorial provides the Python… more
optimization python
This link is the documentation website for using AHPCC.
login batch-jobs slurm
These slides provide an introduction on how Termius and Cursor, two new and freemium apps that use AI to perform more efficient work, can be used for… more
documentation ai machine-learning
Materials from the SAIL meeting (https://aiinstitutes.org/2023/06/21/sail-2023-summit-for-ai-leadership/). A space where AI researchers can learn… more
ACCESS-account ai data-analysis
**Cursor: The AI-Powered Code Editor** Cursor is a cutting-edge, AI-first code editor designed to revolutionize the way developers… more
ai machine-learning workflow
This technology lab contains a set of sessions to help a new user start an AI project on the ACES cluster, a composable accelerator testbed at Texas… more
ACES documentation TAMU
The Julia Programming Language is one of the fastest growing software languages for AI/ML development. It writes in a manner that's similar to… more
ai data-analysis machine-learning
Documentation for Anvil, a powerful supercomputer at Purdue University that provides advanced computing capabilities to support a wide range of… more
anvil
Purdue University is the home of Anvil, a powerful supercomputer that provides advanced computing capabilities to support a wide range of… more
The webpage provides guidelines for securing an Apache HTTP Server (version 2.4). It covers best practices such as minimizing module usage,… more
authentication data-security cybersecurity
The webpage lists known security vulnerabilities affecting Apache HTTP Server 2.4, including detailed descriptions, impact assessments, and… more
computer-science data-security cybersecurity
The provided text discusses various aspects of Android app development fundamentals. It covers key concepts related to app components, the… more
license api programming
Slides for a tutorial on Machine Learning applications in Engineering and parameter tuning given at the RMACC conference 2019.
data-analysis machine-learning python
resources programming-best-practices
Astropy is a community-driven package that offers core functionalities needed for astrophysical computations and data analysis. From coordinate… more
visualization image-processing astrophysics
The authoritative book on automated machine learning, which allows practitioners without ML expertise to develop and deploy state-of-the-art machine… more
ai data-analysis deep-learning
A curated list of awesome Jupyter widget packages and projects for building interactive visualizations for Python code
ai computer-graphics plotting
An AWS Tutorial for Beginners is a course that teaches the basics of Amazon Web Services (AWS), a cloud computing platform that offers a wide range… more
aws
Training materials for using the bash (and zsh) shell.
bash
This package lets you easily scrape websites and extract information based on html tags and various other metadata found in the page. It can be… more
documentation ai big-data
What is PyFR and how does it solve fluid flow problems? PyFR is an open-source Computational Fluid Dynamics (CFD) solver that is based on… more
finite-element-analysis benchmarking parallelization
The Better Scientific Software (BSSw) project provides a community to collaborate and learn about best practices in scientific software development.… more
community-outreach project-management research-facilitation
Background: Big data, defined as having high volume, complexity or velocity, have the potential to greatly accelerate research discovery. Such data… more
big-data
Nextflow is an open-source, domain-specific language and workflow manager designed for the execution and coordination of scientific and data-… more
cloud-computing parallelization data-management
The Biopython Tutorial and Cookbook website is a dedicated online resource for users in the field of computational biology and bioinformatics. It… more
bioinformatics genomics python
Landing Page for Bridges-2 information
bridges-2
This tutorial explains how to create an Anaconda Navigator Application (app) for JupyterLab. It is intended for users of Windows, macOS, and Linux… more
compiling conda programming
This article provides instructions for building AirSim, an open-source simulator for autonomous vehicles, on Linux. It outlines the steps to build… more
profiling data-transfer github
"These notes are part of the UW Experimental College course on Introductory C Programming. They are based on notes prepared (beginning in Spring… more
c c++ compiling
Campus Champions foster a dynamic environment for a diverse community of research computing and data professionals sharing knowledge and experience… more
community-outreach professional-development
CaRCC – the Campus Research Computing Consortium – is an organization of dedicated professionals developing, advocating for, and advancing campus… more
community-outreach professional-development research-facilitation
The Data-Facing Track of the People Network brings together people from research computing groups, libraries, research institutes, and other… more
data-analysis data-access-protocols data-lifecycle
Chameleon is an NSF-funded testbed system for Computer Science experimentation. It is designed to be deeply reconfigurable, with a wide variety of… more
data-sharing data-reproducibility
Announcements for users and developers of Charliecloud, which provides lightweight user-defined software stacks for high-performance computing.
Mailing List
containers
CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a widely distributed molecular simulation program with a broad array of applications.… more
charmm molecular-dynamics namd
Computing Module: Introduces fundamental concepts and skills of Cyberinfrastructure (CI) and High-Performance Computing (HPC) to lower the barrier to… more
ai computer-vision neural-networks
CMake is an open-source tool used to manage the build process in operating systems. This tutorial takes you through how to use CMake from the very… more
training compiling
Conda is a popular package management system. This tutorial introduces you to Conda and walks you through managing Python, your environment, and… more
anaconda conda python
Connect.Cyberinfrastructure is a family of portals, each representing a program that is serving a segment of the research computing and data community… more
community-outreach
Containerization is a software development method in which applications are packaged into standard units for development, shipment, and deployment.
Explains in detail how to build an application that can run on Android and iOS devices, using Qt Creator to develop Qt Quick applications.… more
github compiling programming
NVIDIA CUDA Toolkit Documentation: If you are working with GPUs in HPC, the NVIDIA CUDA Toolkit is essential. You can access the CUDA Toolkit… more
documentation c c++
Learning cybersecurity is crucial for personal protection, safeguarding digital assets, financial security, and national security. It is important… more
training data-security cybersecurity
The CyberAmbassadors project was funded through a workforce development grant from the National Science Foundation (Award #1730137). Starting in 2017… more
mentorship professional-development research-facilitation
Cybersecurity Guide is a comprehensive resource for students and early career professionals that provides users with a wide range of resources and up… more
resources training data-security
DAGMan (Directed Acyclic Graph Manager) is a meta-scheduler for HTCondor. It manages dependencies between jobs at a higher level than the HTCondor… more
open-science-grid
This webinar series is an orientation to R. We start with an overview of R’s history and place in the larger data science ecosystem. Next, we… more
data-analysis data-science psychology
These slides and videos introduce how to use the K-Nearest-Neighbors method to impute climate data and how to use Bayesian Spatio-Temporal models in R-… more
allocation-value documentation ai
Plots.jl is the most widely used plotting library for the Julia programming language. It's known for being especially powerful in its… more
plotting visualization julia
Data visualization is a critical aspect of data analysis. It allows for a clear and concise representation of data, making it easier for users to… more
plotting visualization
DeepChem is an open-source library built on TensorFlow and PyTorch. It is helpful in applying machine learning algorithms to molecular data.
pytorch tensorflow computational-chemistry
Introductory video about DELTA. Speaker Tim Boerner, Senior Assistant Director, NCSA
Video
delta gpu training
As developers, we get excited to think about challenging problems. When you ask us what we are working on, our eyes light up like children in a candy… more
community-outreach professional-development training
Discover Data Science is all about making connections between prospective students and educational opportunities in an exciting new, hot, and growing… more
data-analysis workforce-development
Tableau is a popular and capable software product for creating charts that present data and dashboards that allow you to explore data. It is… more
big-data data-analysis training
Docker allows for containerization of any task - basically a smaller, scalable version of a virtual machine. This is very useful when transferring… more
documentation cloud-computing deep-learning
The Docker container library, commonly known as Docker Hub, is a vast repository that hosts a multitude of pre-configured container images,… more
documentation cloud-computing cloud-open-source
A Docker tutorial for beginners is a course that teaches the basics of Docker, a containerization platform that allows you to package your… more
docker
EasyBuild is a software installation framework that allows administrators to easily build and install software on high-performance computing (HPC)… more
easybuild
The purpose of this group is to provide a forum to discuss NIST 800-171 compliance. Participants are encouraged to collaborate and share effective… more
cybersecurity
This code showcases how to work with the header-only nlohmann JSON library for C++. In order to compile, change the extensions from json_test.txt to… more
c++
Some examples for writing Thrust code. To compile, download the CUDA compiler from NVIDIA. This code was tested with CUDA 9.2 but is likely… more
parallelization gpu cuda
Expanse at SDSC is a cluster designed by Dell and SDSC delivering 5.16 peak petaflops, and offers Composable Systems and Cloud Bursting.
A tutorial paper that presents a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph. Following a single,… more
ACCESS-account ai machine-learning
The "Fairness and Machine Learning" book offers a rigorous exploration of fairness in ML and is suitable for researchers, practitioners,… more
Fastai offers many tools to people working with machine learning and artificial intelligence including tutorials on PyTorch in addition to their own… more
ai machine-learning pytorch
Feed-forward neural networks are a simple type of network that relies on data to be "fed-forward" through a series of layers that… more
Visual Studio Code, commonly known as VSCode, is a popular tool used by programmers worldwide. It serves as a text editor and an Integrated… more
faster file-limit scratch
As LLMs get larger, full fine-tuning can become difficult on consumer hardware. Storing and deploying these tuned models can… more
faster optimization performance-tuning
This framework will help in scaling Machine Learning/Deep Learning/Artificial Intelligence/Natural Language Processing Models to Web Application… more
The official MGH / Harvard tutorial page for FreeSurfer. The FreeSurfer group has provided and designed a series of tutorials for using FreeSurfer… more
data-analysis image-processing psychology
This is the official University of Oxford FSL group lecture page. This includes information on upcoming and past courses (online and in-person), as… more
An introduction to Cloud Computing
cloud-computing
This course is an introduction to the R programming language and covers the fundamental concepts needed to operate in the R environment. This course… more
ACES TAMU plotting
Gaussian 16 is a computational chemistry package that is used in predicting molecular properties and understanding molecular behavior at a quantum… more
gaussian computational-chemistry
Multi-threading guidance when using GDAL.
parallelization gis
Below is a link for a book that focuses on how to use "sf" and "terra" packages for GIS computations. As of 5/1/2023, this book… more
r
MediaPipe is Google's open-source framework for building multimodal (e.g., video, audio, etc.) machine learning pipelines. It is highly… more
ai computer-vision visualization
In GIS, projections are helpful to take something plotted on a globe and convert it to a flat map that we can print or show on a screen.… more
Often when working with GIS, or spatial data, one encounters the word "datum" and it may require that you choose a "datum" when… more
arcgis gis
A couple of resources that: 1.) Presents and defends a git branching workflow for stable collaborative git based projects. ("A… more
github git
Globus is a data transfer, sharing, automation, and discovery service used by hundreds of thousands of researchers to manage "big data" at… more
cloud-storage data-sharing data-management
This tutorial explains how to use Python for GPU acceleration with libraries like CuPy, PyOpenCL, and PyCUDA. It shows how these libraries can speed… more
machine-learning big-data data-analysis
GPU training series for scientists, software engineers, and students, with emphasis on Earth science applications. The content of this… more
optimization performance-tuning profiling
This article provides step-by-step instructions on how to build AirSim, a simulator for autonomous vehicles, on Linux. It includes both Docker and… more
documentation github github-pages
This tutorial is essentially the "hello world" of image recognition and feed-forward neural network (using PyTorch). Using the MNIST… more
ai visualization deep-learning
Documentation and presentation on how to use machine learning and deep learning frameworks such as TensorFlow, Keras, and scikit-learn for Climate and… more
machine-learning
JSON is a lightweight format for storing and transporting data, for example in a config file. This library is header-only, and has easy-to-read… more
resources c++
High Performance Computing (HPC) Cluster
hpc-cluster-build
An introductory guide to High Performance Computing.
administering-hpc
Horovod is a distributed deep learning training framework. Using horovod, a single-GPU training script can be scaled to train across many GPUs in… more
deep-learning distributed-computing gpu
Hour of Cyberinfrastructure (Hour of CI) is a nationwide campaign to introduce undergraduate and graduate students to cyberinfrastructure and… more
arcgis gis administering-hpc
A tutorial entitled "How the Little Jupyter Notebook Became a Web App: Managing Increasing Complexity with nbdev" presented at SciPy 2023… more
data-sharing data-management-software data-reproducibility
Emphasizes benefits of being mentored. Describes how to identify and choose a mentor. Suggests a path forward. Not mentor or two-way focused.
mentorship professional-development workforce-development
Backed by collegiate white papers, top industry professionals, and researchers, The Plank Center’s Mentorship Guide offers basic tips and tricks on… more
mentorship professional-development training
Learn how to use Rclone to transfer data, specifically from your local drive to the Open Storage Network and vice versa.
data-transfer
This video will walk you through the process of efficiently utilizing and managing your ACCESS project(s). Here, you’ll find instructions on how to… more
ACCESS-account ACCESS-allocations allocation-management
This repository offers accessible resources and workshops on AI and high-performance computing (HPC), designed for both STEM and non-STEM majors. The… more
HPCwire is a prominent news and information source for the HPC community. Their website offers articles, analysis, and reports on HPC technologies,… more
The following link provides an easy method of implementing Markov Decision Processes (MDP) in the Julia computing language. MDPs are a class of… more
ai machine-learning julia
The R GIS packages "rgdal", "rgeos", and "maptools" are set to be archived and will no longer be supported by the end of 2023… more
InsideHPC is an informational site that offers videos, research papers, articles, and other resources focused on machine learning and quantum computing… more
ai machine-learning community-outreach
Rocky Linux is an open-source enterprise operating system. It is compatible with Red Hat Enterprise Linux (RHEL). It is a community-driven project… more
unix-environment software-installation
An introductory tutorial on making an AI chat assistant using the GenAI API.
ai generative-ai
This tutorial introduces machine learning on high performance computing (HPC) clusters. While it focuses on the HPC clusters at The University of… more
ai supervised-learning unsupervised-learning
The Stan language is used to specify a (Bayesian) statistical model with an imperative program calculating the log probability density function. Here… more
data-analysis machine-learning monte-carlo
Introduction to the basics of OpenACC.
gpu c c++
The goal of this video is to give researchers and students recently given allocations to High Performance Compute resources a basic introduction to… more
bash ssh research-facilitation
Open Multi-Processing (OpenMP) is an API designed to simplify the integration of parallelism in software development, particularly for applications running… more
expanse faster c
The tutorial is intended to provide a brief overview of the extensive and broad topic of Parallel Computing. It covers the basics of parallel… more
parallelization
This tutorial provides a comprehensive introduction to CUDA programming, focusing on essential concepts such as CUDA thread hierarchy, data parallel… more
gpu nvidia c
This website summarizes the notes of Stanford's introductory course on probabilistic graphical models. It starts from the very basics and… more
This workshop has an introduction to the concepts of visualization followed by hands-on exercises. The concepts section has Speaker Notes, and the… more
visualization documentation training
A lecture and notes with the goal of teaching introductory Python. Starting by understanding how to download and start using Python, then expanding… more
documentation programming python
In this tutorial, I present an overview with many examples of the use of Numpy and Pandas for data analysis. Beginners in the field of data analysis… more
This tutorial will teach step-by-step how to create an image classification model using Core ML in Xcode and integrate it into an iOS app that will… more
Jetstream2 makes cutting-edge high-performance computing and software easy to use for your research regardless of your project’s scale—even if you… more
jetstream
Documentation and research based on the latest NLP text generation detection methods for 2023.
natural-language-processing
Learn Python online with these distance learning courses.
professional-development training python
The following pages are intended to give you a solid foundation in how to use the terminal, to get the computer to do useful work for you. You won… more
file-systems bash unix-environment
A series of interviews with women in the HPC community
science-gateway community-outreach professional-development
Machine learning is becoming increasingly important in fields with large data such as astrophysics. AstroML is a Python module for machine learning… more
plotting big-data image-processing
The free online book for the mlr3 machine learning framework for R. Gives a comprehensive overview of the package and ecosystem, suitable from… more
data-analysis machine-learning r
In the realm of Python-based machine learning, Scikit-Learn stands out as one of the most powerful and versatile tools available. This introductory… more
ai big-data machine-learning
An overview of tools and methods to manage and optimize jobs and HPC workflows
memory optimization batch-jobs
Introductory seminar on marimo, a new reactive Python notebook, given by a marimo ambassador.
A master’s degree in data science helps prepare professionals to take the next career step. This article will focus primarily on data science, a… more
big-data data-analysis data-science
Offers comprehensive information on various master's degree options in cybersecurity, including program details, admission requirements, and… more
resources professional-development cybersecurity
Bioinformatics Toolbox provides algorithms and apps for Next Generation Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology.… more
visualization data-analysis bioinformatics
MATLAB is a really useful tool for data analysis among other computational work. This tutorial takes you through using MATLAB with other programming… more
c c++ fortran
MDAnalysis is a Python-based library of tools for the analysis of molecular dynamics simulations. It is able to read and write many popular… more
computational-chemistry materials-science python
Provides a detailed explanation of the communication routines and management methods of different MPI libraries, as well as several exercises designed… more
compiling mpi
Metadata is a vital topic in libraries and librarianship, encompassing structured information used for accessing digital resources. The definition of… more
metadata
This tutorial will give you an introduction to neural networks using the ever-famous MNIST handwritten digits database! Presented by Robin… more
Links to MD tutorials for beginners across various simulation platforms.
cloud-computing amber charmm
MOPAC (Molecular Orbital PACkage) is a semi-empirical quantum chemistry package used to compute molecular properties and structures by using… more
computational-chemistry
The listed repository contains code written in C++ to model the flow inside a cavity with a lid moving above from left to right by discretizing… more
fluid-dynamics
Workshop for beginners and intermediate students in MPI which includes helpful exercises. Open MPI documentation.
parallelization mpi
Pluses and challenges of mentor selection. Offers tips for acquiring a mentor (finding, asking). And how to be a good mentee. SMART framework… more
mentorship professional-development
CS224N is a renowned natural language processing course offered by Stanford University and taught by Christopher Manning. It covers a wide range of… more
natural-language-processing training workforce-development
The MOOSE Navier-Stokes Cahn-Hilliard (NSCH) application is a library for implementing simulation tools that solve the Navier-Stokes Cahn-Hilliard… more
ACCESS c++ python
Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Some of the tutorials… more
training workforce-development
Neocortex is a new supercomputing cluster at the Pittsburgh Supercomputing Center (PSC) that features groundbreaking AI hardware from Cerebras… more
documentation ai deep-learning
A comprehensive collection of NERSC-developed training and tutorial events, offered on regular schedules. All sessions are archived, including slide… more
Making a neural network has never been easier! The following link directs users to the Flux.jl package, the easiest way of programming a neural… more
Neurodesk provides a containerised data analysis environment to facilitate reproducible analysis of neuroimaging data. Analysis pipelines for… more
psychology containers software-installation
The Neuroimaging Tools and Resources Collaboratory (NITRC) is a neuroimaging informatics knowledge environment for MR, PET/SPECT, CT, EEG/MEG,… more
data-analysis image-processing data-sharing
Numba is a Python compiler designed for accelerating numerical and array operations, enabling users to enhance their application's performance… more
vectorization optimization performance-tuning
NumPy is a Python package that leverages types and compiled C code to make many math operations in Python efficient. It is especially useful for… more
documentation big-data data-analysis
Upcoming training events and archives of training materials detailing general HPC best practices as well as how to use OLCF resources and services.
The official documentation for PyTorch, a machine learning tensor-based framework, and NumPy, which allows for support for ndarrays which is useful… more
deep-learning neural-networks pytorch
VisIt is a prominent open-source, interactive parallel visualization and graphical analysis tool predominantly used for viewing scientific data. Its… more
visIt novel-accelerators particle-physics
The official documentation for Python 3.11.5. Python comes with a lot of features built into the language, so it is worth taking a look as you code.
An article from Science Adviser. An easy read on mentoring tips in the science community.
The realm of data science is one that onlookers regard with curiosity and respect. There are a lot of unknowns in this area of study that only… more
A degree in business analytics looks different in today’s world than it did a decade ago. In its most current application, business analytics uses… more
This contains documentation for getting started with using OnShape for CAD. OnShape is cloud-hosted CAD software that lets you work with others like on… more
documentation faster
OnShape FeatureScripts allow users to create their own features via OnShape's programming language. The user can make these as simple or complex… more
documentation materials-science particle-physics
The webpage provides an overview of the architecture of Open OnDemand, a web-based interface for high-performance computing (HPC) resources. It… more
authentication open-ondemand data-security
The Open Storage Network, a national resource available through the XSEDE resource allocation system, is high quality, sustainable, distributed… more
open-storage-network data-management data-retention
Proxmox Virtual Environment is a hyper-converged infrastructure open-source software. It is a hosted hypervisor that can run operating systems… more
software-installation
Materials for the "OpenHPC: Beyond the Install Guide" half-day tutorial, first offered at PEARC24. The goal of this repository is to let… more
jetstream administering-hpc cluster-management
Techniques and support for multithreaded geospatial data processing in GRASS.
parallelization gis openmp
OpenStack Tutorial For Beginners
openstack
Snakemake is a powerful and versatile workflow management system that simplifies the creation, execution, and management of data analysis pipelines.… more
documentation data-analysis data-reproducibility
pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language… more
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant (rtdatavis.github.io). The… more
A class from MIT OpenCourseWare that gives a hands-on approach to building scalable and high-performance software systems. Topics include performance… more
This documentation provides an overview of the PetIGA framework, an open source code for solving multiphysics problems with isogeometric analysis.… more
finite-element-analysis documentation fluid-dynamics
phenoACCESS-24: Workshop on Research Computing and Plant Phenotyping High-throughput plant phenotyping is computationally intensive… more
big-data data-management metadata
This video series provides a holistic understanding of machine learning, covering theory, application, and inner workings of supervised, unsupervised… more
machine-learning programming python
Humans cannot always be treated as oracles for collaborative sensing. Robots thus need to maintain beliefs over unknown world states when receiving… more
Python course offered by Texas A&M HPRC
5 Days of recordings of Python data analysis and visualization training.
data-science python
Python has become a very popular programming language and software ecosystem for work in Data Science, integrating support for data access, data… more
ai machine-learning big-data
This is a very barebones introduction to the PyTorch framework used to implement machine learning. This tutorial implements a feed-forward neural… more
Running QGIS tools from the command line
Data augmentation is a crucial step in the pipeline for image classification with deep learning. Albumentations is an extremely versatile Python… more
deep-learning python
R for Data Science is a comprehensive resource for individuals looking to harness the power of the R programming language for data analysis,… more
visualization data-analysis data-science
A book for researchers who contribute code to R projects: This booklet is the result of my work with the Social Cognition for Social Justice lab. It… more
software-carpentry workforce-development r
Raftlib is an open-source C++ Library that provides a framework for implementing parallel and concurrent data processing pipelines. It is designed… more
parallelization pthreads openmp
This repository contains information about Jupyter Widgets and how they can be used to develop interactive workflows, data dashboards, and web… more
Regular expressions (sometimes referred to as RegEx) is an incredibly powerful tool that is used to define string patterns for "find" or… more
perl programming python
The daily news clearly shows the increasing threat to safety and privacy of data, personal as well as intellectual property. While the requirements… more
community-outreach cybersecurity
This course takes you through the fundamentals required to get started with reinforcement learning with Python, OpenAI Gym and Stable Baselines. You… more
deep-learning machine-learning tensorflow
Representation learning is a fundamental concept in machine learning and artificial intelligence, particularly in the field of deep learning. At its… more
deep-learning image-processing machine-learning
The NSF-funded ResearchSOC helps make scientific computing resilient to cyberattacks and capable of supporting trustworthy, productive research… more
Iterative Programming takes place when you can explore your code and play with your objects and functions without needing to save, recompile, or… more
ai visualization big-data
An ongoing collection of RSE training material, workshops, and resources. We are compiling this list as a starting point for future activities. We… more
astrophysics data-science novel-accelerators
Active inference is an emerging study field in machine learning and computational neuroscience. This website in particular introduces "active… more
ai
A compilation of the slides from this year's RMACC Sys Admin Workshop. RMACC Sys Admin Workshop Schedule: Tuesday 12… more
administering-hpc hpc-tools cluster-support
Rocky Mountain Advanced Computing Consortium Website
Resources and User Guide available at Rockfish
rockfish
WarpX is an advanced particle-in-cell code used to model particle accelerators, which needs to be run on HPC. This website contains the tutorial on… more
github github-pages novel-accelerators
Samtools is a suite of programs for interacting with high-throughput sequencing data, especially in the SAM/BAM format. It offers various utilities… more
documentation data-analysis bioinformatics
Use this template to turn any science gateway workflow into a web application!
data-analysis github astrophysics
Scikit-learn is a free software machine learning library for Python. It has a variety of features you can use on data, from linear regression… more
documentation ai plotting
Comprehensive tutorials and lecture notes covering various aspects of scientific computing using Python and Scipy.
visualization data-analysis machine-learning
VSCode is a popular IDE that runs on Windows, macOS, and Linux. This tutorial will explain how to get set up with VSCode to code in Python. It will… more
git python
These instructions were executed on the FASTER and Grace cluster computing facilities at Texas A&M University. However, the process can be… more
faster fluid-dynamics c++
Singularity/Apptainer is a free and open-source container platform that allows users to build and run containers on high performance computing… more
containers singularity
Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Slurm… more
cluster-management cluster-support slurm
Introduction to the Slurm Workload Manager for users and system administrators, plus some material for Slurm programmers.
administering-hpc cluster-management hpc-cluster-architecture
slurm schedulers
Differential equations, the backbone of countless physical phenomena, have traditionally been solved using numerical methods or analytical techniques… more
neural-networks
Spack is a package manager for supercomputers that can help administrators install scientific software and libraries for multiple complex software… more
spack
Spatial Data Science is a growing field across a wide range of industries and disciplines. The open-source programming language Python has many… more
cloud big-data data-analysis
The webpage SSL Configuration Generator by Mozilla provides recommended SSL/TLS configurations for various server platforms, helping administrators… more
optimization computer-science data-security
TensorFlow is a powerful framework for deep learning, developed by Google. This specifically is their Python package, which is easy to use and can be… more
documentation faster tensorflow
Termius: The Modern SSH Client for 2023. Termius is the future-facing SSH client that's redefining remote server access in… more
cloud-computing data-sharing data-transfer
Training Resources and Courses offered by Texas A&M's Research Computing Group
ACES TAMU
Pandas is one of the most essential Python libraries for data analysis and manipulation. It provides high-performance, easy-to-use data structures,… more
This video by the YouTube channel 3Blue1Brown provides a very simplified introduction to the theory behind neural networks. This tutorial is perfect… more
This presentation gives a detailed breakdown of the outcome of my master's thesis which was focused on making HPC Clusters accessible across all… more
ACCESS login documentation
Thrust is a CUDA library that optimizes parallelization on the GPU for you. The Thrust tutorial is great for beginners. The documentation is helpful… more
parallelization gpu resources
A walkthrough (with a Google Colab link) on how to implement your own LSTM to observe time-dependent behavior.
This Google Colab notebook tutorial demonstrates how to create and train an LSTM model in PyTorch to predict time series data. An airline… more
ai supervised-learning machine-learning
Trinity is one of the most popular tools for assembling transcripts from RNA-Seq short reads. In this tutorial, we will cover the basic usage of Trinity… more
biology
The mission of Trusted CI is to lead in the development of an NSF Cybersecurity Ecosystem with the workforce, knowledge, processes, and… more
cybersecurity training
Very helpful list of external resources from Trusted CI
Open OnDemand has been audited by Trusted CI, the NSF Cybersecurity Center of Excellence, to enhance its security and maintain its status as a… more
The following link explains the usage of the OpenMP API and its related syntax. There are also several exercises available for learners to help them… more
openmp
Unix is incredibly common and useful. This website provides all the common commands and explanations for one to get started with a Unix system.
With the recent rise of LLMs, many businesses are looking at ways to adopt these LLMs and fine-tune these models on specific data sets to… more
big-data training
The United Nations (UN) is an international organization comprising 193 Member States, including the United States. As a global organization, the UN… more
Introductory training materials for working on the UNIX command line.
Windows Subsystem for Linux (WSL) provides a Linux environment for Windows users to access HPC resources quickly and efficiently.
workflow ssh
A tutorial on the effective use of Dask on HPC resources. The four-hour tutorial will be split into two sections, with early topics focused on novice… more
training jupyterhub python
It's not uncommon to see beautiful visualizations in HPC center galleries, but the majority of these are either rendered off the HPC or created… more
anvil bridges-2 darwin
big-data computer-graphics workflow
Warewulf is an operating system provisioning platform for Linux that is designed to produce secure, scalable, turnkey cluster deployments that… more
documentation administering-hpc distributed-computing
Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression,… more
big-data data-analysis machine-learning
This reading will explain what a long short-term memory neural network is. LSTMs are a type of neural network that rely on both past and present… more
This article discusses the importance of fairness in machine learning and provides insights into how Google approaches fairness in their ML models.… more
ai visualization data-analysis
A VPN, or Virtual Private Network, is a technology that creates a secure tunnel between your device and a VPN server. This tunnel encrypts all of… more
vpn
The Why & How seminar series is designed to introduce research assistants, graduate students, and postdoctoral and clinical fellows – really,… more
image-processing
Describes effective mentorship (both ways).
This is a resource for researchers and students looking to on-board onto the c3ddb cluster at MGHPCC. In the code section, there are example job… more
cluster-support
Through collaboration and networking, WHPC strives to bring together women in HPC and technical computing while encouraging women to engage in… more
This tutorial series and documentation cover topics on using Python on HPC clusters. The specific steps are based on the HOPPER cluster at George… more
pytorch batch-jobs job-submission
CAC summer student employee Jeff Lantz describes his experiences in running the WRF weather forecasting application in the public cloud. He compares… more
aws azure cloud-commercial