
COMPLECS: Linux Tools for Text Processing

09/04/25 - 2:00 PM - 3:30 PM EDT

Location

Remote via Zoom

Summary

Many computational and data processing workloads require pre-processing input files to get the data into a format compatible with the user’s application, post-processing output files to extract key results for further analysis, or both. While these operations can be done by hand, doing so tends to be time-consuming, tedious, and, worst of all, error-prone. In this session, we cover the Linux tools awk, sed, grep, sort, head, tail, cut, paste, cat, and split, which help users automate repetitive tasks. We conclude by showing how large language models (LLMs) such as ChatGPT can be used to write commands using these tools.

Instructor

Robert Sinkovits, Ph.D., is a senior computational scientist at the San Diego Supercomputer Center. He is currently the co-PI and project manager for Expanse, SDSC’s nationally allocated supercomputer. He has collaborated with researchers spanning many fields, including physics, chemistry, astronomy, structural biology, ecology, finance, immunology, and the social sciences, always with an emphasis on parallel scalability and maximizing application performance. Robert previously served as Director of Education and Training and launched SDSC’s COMPLECS project, which emphasizes the skills beyond programming that are needed to make effective use of advanced cyberinfrastructure.


--- 
COMPLECS (COMPrehensive Learning for end-users to Effectively utilize CyberinfraStructure) is a new SDSC training program covering the non-programming skills needed to use supercomputers effectively. Topics include parallel computing concepts, Linux tools and bash scripting, security, batch computing, how to get help, data management, and interactive computing. Each session offers 1 hour of instruction followed by a 30-minute Q&A. COMPLECS is supported by NSF award 2320934.