Below you can find session descriptions for the four tutorials you can choose between at AHM20.
Radovan Bast: Share your tools
This will be a collaborative tutorial where all or most participants will be asked to briefly present two of their favorite tools, solutions, or hacks that worked and improved their life or work situation, as well as one or two things they tried that failed. You can, for instance, show what you do to automate mundane tasks or how you tackle distributed work and fragmented work weeks. Share the tools and apps that you would take to a desert island. Together we will learn many new tricks in a very short time. Hopefully we will also share ideas that did not work out, since these are no less valuable.
To sign up for this tutorial, you must submit a short description before the new year of what you plan to share during your 10 minutes. This helps the tutorial organizer prepare the session.
Jon Ander Novella: Distributing workloads in modern infrastructure
Possible topics for discussion in the talk:
- Brief introduction to the current situation in academic infrastructure
- HPC, cloud, and serverless: how will they converge, if they do?
- Containers as fundamental objects in workload distribution
- API-based tools for asynchronous tasks: state of the art, with examples as user stories in Sweden. Discussion of the pros and cons of different approaches.
- Apache Spark
- Kubernetes-focused executors: Pachyderm, Argo, kube-openmpi, Kubeflow
- Security management with distributed workloads: automated provisioning of credentials to executors, e.g. by Vault (Vault gives executors S3 credentials). Executors should not know about your storage backend.
- Serverless workload example using Terraform and AWS Lambda. Terraform allows developers to define infrastructure as code (IaC).
João da Silva: Kubernetes Essentials, the TL;DR
Have you been interested in getting started with Kubernetes? In this tutorial, we'll explain the main topics and go through them together by kickstarting a simple Kubernetes cluster and deploying containerised apps.
Prerequisites: General knowledge about containers (e.g. Docker), GNU/Linux or Mac laptop with admin rights
Keywords: Kubernetes master, Kubernetes nodes, Pod, Volume, Service, Namespace, Deployment, DaemonSet, StatefulSet, ReplicaSet, Job, control plane
Oskar Vidarsson: An overview of different common workflow languages
Reproducible data analysis is an important concept that has a whole host of technologies developed for it, and workflow languages are a central component of this. If you know that you are going to run an analysis repeatedly over an extended period of time, you will be interested in doing it well, so why reinvent the wheel if you can get a perfectly fine wheel at the wheel store?
I will take you to the wheel store and give you my completely biased opinion of what I think are the strong and weak points of Nextflow, Snakemake and WDL.
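As a rough illustration of what a workflow language takes care of for you, the sketch below uses Python's standard `graphlib` module (Python 3.9+) to order the steps of an analysis by their dependencies. The step names (`trim`, `align`, `call_variants`) are hypothetical and not taken from the tutorial material, and this is only a toy model of the idea, not how Nextflow, Snakemake or WDL are implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical three-step analysis (names are illustrative only).
# Each key is a step; each value is the set of steps it depends on.
steps = {
    "align": {"trim"},
    "call_variants": {"align"},
    "trim": set(),
}


def execution_order(graph):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
print(order)
```

A workflow language lets you declare the dependencies once and leaves it to the tool to work out the order of execution, which is the part this sketch imitates.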
We will:
- Run a dockerized Snakemake workflow
- Look at how the different languages are structured
- Compare ways to visualize a workflow graphically
- Point and laugh at memes
- And more...
Requirements:
- Basic command line skills
- Git and Docker to test Snakemake (optional)
- A sense of humor
- A laptop
Installation instructions for Git and Docker:
Git: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
Docker: https://docs.docker.com/v17.09/engine/installation/
Run this to clone the tutorial repository:
git clone https://github.com/oskarvid/snakemake-intro
Then follow the instructions in the readme.