Date: 3-4 March 2025

Genomic studies produce vast amounts of data, usually in the form of very large text files. Linux is particularly suited to working with such files, and is therefore arguably one of the most important tools in a bioinformatician’s toolkit. The Linux command line lets you view, filter and manipulate large text files that are difficult or impossible to handle with applications like Word or Excel, write pipelines to automate repetitive tasks, and run bioinformatics software for which no web interface is available. In this workshop we will first cover the most commonly used Linux commands, followed by a short introduction to several popular command-line tools developed specifically for genomics, as well as file formats commonly used in genomics (BED, FASTA, FASTQ, GFF/GTF, SAM/BAM, VCF).

Keywords: Command line, Genomics, Linux, Software

Prerequisites:

  • A general understanding of molecular biology and genomics, and elementary computer skills
  • A computer with a stable internet connection and a VNC viewer (download instructions included)

Learning objectives:

  • Accessing files
  • Command-line tools for genomics (seqtk, bioawk, samtools, bedtools, tabix)
  • Downloading remote files
  • File management
  • Files and directories
  • Filtering / manipulating file content
  • Getting help
  • Navigating the file system
  • Permissions
  • Pipes and redirects
  • Process management
  • Shell scripts
  • The shell and commands
  • Zipping and unzipping files

Organizer: Edinburgh Genomics

Target audience: Graduates, postgraduates, and PIs without any previous command-line experience who want to learn to use the Linux command line to work with large data files.

Event types:

  • Workshops and courses

