27–30 Jan 2020
Dr. Holms Hotel
Europe/Copenhagen timezone

Tutorial descriptions

Below you can find session descriptions for the four tutorials you can choose between at AHM20.

Radovan Bast: Share your tools

This will be a collaborative tutorial in which all or most participants will be asked to briefly present two of their favorite tools, solutions or hacks that worked and improved their life or work situation, as well as one or two things they tried that failed. You can, for instance, show what you do to automate mundane tasks or how you tackle distributed work and fragmented work weeks. Share the tools and apps that you would take to a desert island. Together we will learn many new tricks in a very short time. Hopefully we will also share ideas that did not work out, since these are no less valuable.

Signing up for this workshop requires that you submit, before the new year, a short description of what you plan to share during your 10 minutes. This helps the tutorial organizer prepare the session.

Jon Ander Novella: Distributing workloads in modern infrastructure

Possible topics for discussion in the talk

  • Brief introduction to the current situation in academic infrastructure

  • HPC, cloud and serverless: will they converge, and if so, how?

  • Containers as fundamental objects in workload distribution

  • API-based tools for asynchronous tasks: state of the art, with examples as user stories in Sweden. Discussion of the pros and cons of different approaches.

    • Apache Spark

    • Kubernetes-focused executors: Pachyderm, Argo, kube-openmpi, Kubeflow

  • Security management with distributed workloads: automated provisioning of credentials to executors, e.g. by Vault (Vault gives executors S3 credentials). Executors should not know about your storage backend.

  • Serverless workload example using Terraform and AWS Lambda. Terraform allows developers to define infrastructure as code (IaC).
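To give a feel for the serverless item above, here is a minimal sketch of a Python function written in the style of an AWS Lambda handler. It is not taken from the tutorial material: the function name `handler`, the `numbers` field in the event, and the response shape are assumptions made for illustration. The Terraform configuration that would deploy such a function is not shown.

```python
import json


def handler(event, context):
    """Entry point in the style of an AWS Lambda Python handler.

    `event` carries the request payload as a dict; `context` carries
    runtime metadata and is unused in this sketch.
    """
    # Hypothetical payload field: a list of numbers to summarise.
    numbers = event.get("numbers", [])
    result = {"count": len(numbers), "total": sum(numbers)}
    # Hypothetical response shape: a status code plus a JSON-encoded body.
    return {"statusCode": 200, "body": json.dumps(result)}


if __name__ == "__main__":
    # Local smoke test; no cloud account is needed to run this.
    print(handler({"numbers": [1, 2, 3]}, None))
```

Because the handler is an ordinary function, it can be exercised locally before any infrastructure exists.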

João da Silva: Kubernetes Essentials, the TL;DR

Have you been interested in getting started with Kubernetes? In this tutorial, we'll explain the main topics and work through them together by kickstarting a simple Kubernetes cluster and deploying containerised apps.

Prerequisites: General knowledge about containers (e.g. Docker), GNU/Linux or Mac laptop with admin rights

Keywords: Kubernetes master, Kubernetes nodes, Pod, Volume, Service, Namespace, Deployment, DaemonSet, StatefulSet, ReplicaSet, Job, control plane
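As a rough illustration of how several of these keywords fit together, the sketch below builds a Deployment manifest as a plain Python dictionary and prints it as JSON. It is not part of the tutorial material: the helper `make_deployment`, the app name `hello-web`, and the image tag are hypothetical. The field names follow the `apps/v1` Deployment schema, where a Deployment manages replicated Pods that it selects by label.

```python
import json


def make_deployment(name, image, replicas=2):
    """Build a minimal apps/v1 Deployment manifest as a dict (hypothetical helper)."""
    # The selector must match the labels on the Pod template,
    # otherwise the API server rejects the Deployment.
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical app name and image, chosen only for illustration.
manifest = make_deployment("hello-web", "nginx:1.17")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, which accepts JSON as well as YAML manifests.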

Oskar Vidarsson: An overview of different common workflow languages

Reproducible data analysis is an important concept that has a whole host of technologies developed for it, and workflow languages are a central component of this. If you know that you are going to run an analysis repeatedly over an extended period of time, you will want to do it well. So why reinvent the wheel if you can get a perfectly fine wheel at the wheel store?

I will take you to the wheel store and give you my completely biased opinion of the strong and weak points of Nextflow, Snakemake and WDL.

We will:

  • Run a dockerized Snakemake workflow

  • Look at how the different languages are structured

  • Compare ways to visualize a workflow graphically

  • Point and laugh at memes

  • And more...
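For readers new to workflow languages, the core idea they share can be sketched in a few lines: steps declare what they depend on, and the engine works out a valid execution order from the resulting graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The step names are hypothetical, and this is not how Snakemake, Nextflow or WDL are implemented; it only illustrates the dependency-ordering idea.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
steps = {
    "trim": set(),
    "align": {"trim"},
    "call_variants": {"align"},
    "report": {"call_variants", "trim"},
}


def execution_order(steps):
    """Return the steps in an order where every dependency runs first."""
    return list(TopologicalSorter(steps).static_order())


order = execution_order(steps)
print(" -> ".join(order))
```

A real workflow engine adds much more on top of this ordering, such as tracking input and output files, re-running only what is out of date, and dispatching jobs to containers or clusters.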

Requirements:

  • Basic command line skills

  • Git and Docker to test Snakemake (optional)

  • A sense of humor

  • A laptop

Installation instructions for Git and Docker:

Git: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

Docker: https://docs.docker.com/v17.09/engine/installation/

Run this to clone the tutorial repository:

git clone https://github.com/oskarvid/snakemake-intro

Then follow the instructions in the readme.