Stanford Data Users

Lab Meeting Workshops

Interested in having a workshop given directly to your lab? Contact Bryce at bdgrier@stanford.edu.

Scheduled Workshops

100 - Bash and the Unix Shell: June 6, 2025

200 - Data Transfer and Storage: June 13, 2025

220 - LLMs as Tools in Research: June 20, 2025

300 - Containerized Analysis Notebooks: June 27, 2025

301 - Containerizing Analyses and Workflows: July 2025

310 - Parallel Processing I: July 10, 2025

210 - Efficient Data Storage and Retrieval with HDF5: Late July 2025

110 - Git and Version Control: August 2025

Current Offerings

100 - Bash and the Unix Shell (periodically offered in collaboration with Christina Gancayco from Stanford Research Computing)

  • introduction to Unix shells

  • essential Bash commands

  • developing and executing a Bash script

110 - Git and Version Control

  • coming soon!

200 - Data Transfer and Storage

  • Stanford-specific and external resources for data management

  • command line data transfer tools:

      • rsync (needed for FarmShare/Sherlock/Oak)

      • rclone (useful for cloud services)

210 - Efficient Data Storage and Retrieval with HDF5

  • introduction to HDF5 file format

  • guided exploration of sample files and tools

  • guidelines for storing different neuroscience data types in HDF5

220 - LLMs as Tools in Research

  • history, development, and current state of large language models (LLMs)

  • building a retrieval-augmented model that allows for management and querying of scientific knowledge

300 - Containerized Analysis Notebooks

  • promoting reproducibility through containerized analysis

  • interactive, containerized, browser-based analysis using HPC resources:

      • FarmShare/Sherlock OnDemand

      • hosting Python kernels on FarmShare/Sherlock

301 - Containerizing Analyses and Workflows

  • modifying existing containers

  • building custom containers from scratch

  • making custom containers available to your lab or to the public

310 - Parallel Processing I

  • introduction to embarrassingly parallel problems

  • methods of parallelization at large scales:

      • GNU Parallel (powerful, shell-based tool that allows for easy parallelization of tasks on a local machine)

      • Pub/Sub (cloud-based service that allows for scalable, asynchronous analysis pipelines)

Future Offerings

311 - Parallel Processing II

320 - Efficient Analysis of Time Series Data

400 - CI/CD Pipelines for Scientific Computing

401 - Scientific Computing Clusters in the Cloud

500 - Integrating HDF5 with Parallel Processing
