Link back to main course page

Part 1: Working with a server remotely.

Introduction

A server is a computer or device on a network that manages network resources. Essentially a server is a collection of linked computers that have processors (CPUs or GPUs), memory (RAM) and disk storage space, like hard drives.

Servers are used for bioinformatics because data sets can be very large, and can take a long time and/or large amounts of RAM to process.

Servers generally run a Linux operating system, rather than Microsoft Windows or macOS. This is because Linux is more stable, more secure and has methods to process and distribute large numbers of tasks to different processors.

We will use a server based in York, called York Advanced Research Computing Cluster, or YARCC for short.


Using the York VPN

To work remotely on YARCC, you need to log in.

Regardless of whether you are logging in from a PC or a Mac, you will need to use the York VPN, Pulse Secure. The download and installation instructions are provided here: https://www.york.ac.uk/it-services/services/vpn/

When you open Pulse Secure, you should have no connections added, as shown below.


If you click on the plus button, it will bring up a window. To connect to York, you will need to fill in the VPN settings for the Name and Server URL so that it looks as follows.


If you click connect, it will then ask you for your username and password. Each of you will have a temporary account to log in with, and an associated password that you will need to use, as shown below.


If you connect and the information is correct, you should have the following screen, with a green tick, and a disconnect option available.


Logging into the server

Log in from a PC

If you are using a Windows computer, you will then need to install login and configure it to connect using the Pulse Secure VPN you just set up. You should already have login installed. If you open it, you should see the following screen.

If you click on open, this will bring up the following screen. You will need to fill out the host name and make sure it is connecting through SSH, as shown beneath.

This will be the first time connecting to the server using your login set up, so it will give you an authentication screen. You will need to select yes, to authenticate.

Once this has connected you should have a terminal screen open which will then ask you to again fill in your login details. If you log in successfully you should see the following two windows


To log in from a Mac

If you are logging in from a Mac, this will be much easier and will not require the login software. You will still need to connect using the VPN. Once this is connected, you will need to log in using the ssh command as follows

ssh username@login.yarcc.york.ac.uk 

When you press enter this will ask for your password. If both your username and password are entered successfully, you will be logged in!


Part 2: Linux

If you are familiar with Linux and merely want a reminder, get a Linux cheat sheet here.

Directories and your working directory

  • In the same way that Windows PCs have folders where you store your files, Linux systems have directories. They are essentially the same thing: a place to organise files and programs.

  • Linux often uses a command line (text-based) interface rather than a graphical interface like Windows, so directories are referred to with text.

When you first log into a linux server you are directed to your home directory. On YARCC your home directory will be:

/home/userfs/t/username

Note that username is replaced with your username (like tmpq0001, etc). So everyone has their own unique home directory.


In Linux you are always ‘working from’ a specific directory. You can think of this as where you are in the file system. You are always somewhere!

To find out what your working directory is at any point, use this command:

pwd

Directories can contain files and other directories, just like houses contain rooms, rooms contain items and boxes, and boxes contain tins and jars.

Directories are nested.


Remember where you are

As you work in Linux it is important to keep track of where you are in relation to your directories. The image below shows how directories might be organised. The name of the something directory isn’t too useful, but the name of the data directory tells you what is in there. (You’ll find out about bam files later.)

Directories and files have a path, which is the list of directories that you need to specify to get to that location. The path of your home directory on the YARCC Linux system is something like: /home/userfs/t/tmpq1234.


To change which directory you are ‘in’ use this command:

cd data

This will take you to the data directory (if it exists inside your current working directory).
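The difference between a relative path (like data) and an absolute path (one that starts with /) can be sketched as follows. This is only an illustration: it works inside a temporary directory created with mktemp, and the sandbox variable and the data directory are made up for the example rather than being part of YARCC.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
# "sandbox" and "data" are hypothetical names used only for this sketch.
sandbox=$(mktemp -d)
mkdir "$sandbox/data"

# Relative path: "data" is looked up inside the current working directory.
cd "$sandbox"
cd data
relative_result=$(pwd)

# Absolute path: starts with / and works no matter where you currently are.
cd /
cd "$sandbox/data"
absolute_result=$(pwd)

# Both routes end in the same place.
echo "$relative_result"
echo "$absolute_result"
```

A relative path only works if you are in the right place to begin with, which is why checking pwd first is a good habit.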


It is important to understand the concepts of directories and paths.

Discuss with your group or a tutor before you move on.


Trying out linux commands

This set of Linux commands will get you started. We suggest that you type each of these commands into your Linux system in order. Do not copy and paste the text from this web page.

  • First, choose a building name, a room name, and a word for a box or container in any language. Note these down somewhere.

  • Then log into the server (if you haven’t already).

  • Then check where you are, with:

pwd


  • Now make a new directory with your building name (using your own word in place of building):
mkdir building


  • Now change your location to the new building directory, using the cd (change directory) command:
cd building


  • Now make another new directory with your room name (using your own word in place of room). Then move into this new directory using cd again.
mkdir room
cd room


  • Then check where you are again with:
pwd


  • Then make three files within the room directory, called box.1, box.2 and smallbox.1 (using your own word for box), using the touch command (which makes an empty file).
touch box.1
touch box.2
touch smallbox.1


  • Now list what files are in your current directory using the ls (list) command:
ls



Most commands in linux have optional ‘flags’, that are added with extra letters after the command. Flags allow you to run the command with different variations. Some of these flags can be very useful.

  • To find out about a Linux command and its flags, you can call up the ‘manual’ for that command with:

    man command

    (replacing command with something like ls, touch, cd etc)

  • For example, you can list the files in your working directory in long format (-l), sorted by time (-t), in reverse (-r). This will show the most recent files at the end.

    ls -lrt


    Give this a try.
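    To see what each flag contributes, here is a small sketch that runs in a temporary directory. The file names are invented for the example, and touch -t is used only to give the files known timestamps so that the ordering is predictable.

```shell
# Throwaway directory; the file names below are only illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"

# touch -t sets an explicit timestamp (format: YYYYMMDDhhmm),
# so the order below does not depend on when the commands were run.
touch -t 202001010900 oldest.txt
touch -t 202101010900 middle.txt
touch -t 202201010900 newest.txt

# -t sorts by modification time, newest first.
ls -t

# Adding -r reverses the order, so the newest file comes last.
ls -rt

# Flags can be written together or separately: -lrt means the same as -l -r -t.
ls -l -r -t
```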


  • Now you have created your nested directories with ~/building/room/ and box files. To move ‘up’ one level, use this command:
cd ../



This is how cd ../ changes your working directory.
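As a rough sketch of the same idea, the commands below rebuild the building/room layout inside a temporary directory (so your real directories are left alone) and then climb back up. The sandbox variable is an assumption made for the example.

```shell
# Recreate the nested layout in a throwaway location.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/building/room"
cd "$sandbox/building/room"

# .. always means "the directory above this one".
cd ..
one_up=$(pwd)

# You can chain .. to climb more than one level in a single command.
cd room
cd ../..
two_up=$(pwd)

echo "$one_up"
echo "$two_up"
```

Running pwd after each cd, as in the exercise above, is the easiest way to confirm where you have ended up.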


NOTE

Files can have any name in linux.
File names and commands are case sensitive.
So the command to change directory cd will not work if you type Cd.
Be careful with dots .. and spaces in linux - they matter!
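
The points about case and spaces can be illustrated with a short sketch. It assumes a typical Linux file system and uses made-up file names in a temporary directory.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Case matters: on a typical Linux file system these are two different files.
touch notes.txt Notes.txt

# A space separates arguments, so this creates TWO files: "my" and "file".
touch my file

# Quotes keep the space inside ONE file name.
touch "my file"

# -1 prints one name per line, which makes names with spaces easier to spot.
ls -1
```

Avoiding spaces in file names altogether (for example using my_file or my-file) saves a lot of quoting.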


More linux commands.

Make a copy of a file called myfile. The new copy is called myfile2.

cp myfile myfile2

If your working directory is kitchroom (~/) and it contains a directory called fridge, you can copy a file called myfile from your working directory into fridge like so:

cp myfile fridge/

Remove a file called this.

rm this

Show (or print out to the screen) all of a file called this.

cat this

Warning!: some of the files we will work with are very large. Using cat on them can take a long time. To stop a command that is running, type Ctrl+C. (Ctrl+Z only pauses the command and leaves it suspended in the background.)

Show the first ten lines of a file called this.

head this

Show the last ten lines of a file called this.

tail this
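
Both head and tail accept the -n flag to choose how many lines to show. The sketch below tries this on a small made-up file in a temporary directory; numbers.txt and numbers-copy.txt are hypothetical names, and the > symbol used to create the file is explained in the pipes section further down.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Make a 12-line example file, one word per line.
printf '%s\n' one two three four five six seven eight nine ten eleven twelve > numbers.txt

# -n sets the number of lines to show, instead of the default ten.
head -n 3 numbers.txt
tail -n 2 numbers.txt

# cp makes an independent copy, so removing the original leaves the copy intact.
cp numbers.txt numbers-copy.txt
rm numbers.txt
ls
```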

Wild cards

One of the most powerful parts of file handling in Linux is its use of wild cards. These allow you to specify groups of files to move or copy, and are useful in many other situations.

There are three main wildcards in Linux:

  • An asterisk (*) – matches zero or more occurrences of any character (so it also matches no character at all).

  • Question mark (?) – represents or matches a single occurrence of any character.

  • Bracketed characters ([ ]) – matches a single occurrence of any one of the characters enclosed in the square brackets.

For example, to list only the files in your room directory that start with box, do this:

Go back to your home directory:

cd ~/

List files that start with box:

ls ~/building/room/box.*

To list files that end with 1:

ls ~/building/room/*.1

To list files that start with b and end with a dot followed by a single character:

ls ~/building/room/b*.?

To list files that start with anything, end with anything but contain the word box:

ls ~/building/room/*box*
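
The examples above do not use the bracket wildcard, so here is a small sketch comparing all three. It recreates the box files in a temporary directory and adds one extra file, box.10, purely to show the difference between * and ?; that extra file is an assumption of this example, not part of the exercise.

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/building/room"
cd "$sandbox/building/room"
touch box.1 box.2 box.10 smallbox.1

# * matches any number of characters: box.1, box.10 and box.2 all match.
ls box.*

# ? matches exactly one character, so box.10 is left out.
ls box.?

# [12] matches one character that is either 1 or 2.
ls box.[12]

# Wildcards can be combined: anything, then "box.", then 1 or 2.
ls *box.[12]
```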

Pipes and input/output redirection

Another powerful part of linux systems is the ability to ‘pipe’ or redirect the output of one program directly into another program (using the | symbol), or into a file (using the > symbol). Pipes work like an assembly line.

  • Here is how you pipe the output of a list command into the sort command:
ls -latr ~/building/room/*box* | sort

Note that you finish the ls command, add a pipe symbol ( | ) then use the sort command.


  • Here is how you redirect the output of a list command into a file called list-output. You can call the file anything you like - it will be created by the redirect.
ls -latr ~/building/room/*box* > list-output

Note that you finish the ls command, add a redirect symbol ( > ) then specify a file name.
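
A short sketch of both ideas, run in a temporary directory with the same box file names as before. It also shows >>, which appends to a file instead of overwriting it; wc -l (count lines) is used here only as a simple second command to pipe into.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
touch box.1 box.2 smallbox.1

# Pipe: the names printed by ls become the input of wc -l, which counts lines.
ls | wc -l

# > creates the file, or OVERWRITES it if it already exists.
ls box.* > list-output

# >> adds to the end of the file instead of overwriting it.
ls smallbox.* >> list-output

# Look at what ended up in the file.
cat list-output
```

Because > overwrites without asking, it is worth checking with ls that you are not about to replace a file you want to keep.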


DISCUSS WITH YOUR GROUP

Quiz each other about what these commands mean:

cp this here/
rm *.vcf
mkdir something
rm ~/building/*/*.?
cp  ~/building/room/*ox* ~/building/

And one command you should not do!

rm *.*

Why not?
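
One cautious habit, sketched here with made-up files in a temporary directory, is to preview what a wildcard will match before handing it to rm. Putting echo in front of a command prints the command after the wildcards have been filled in, without running it.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
touch results.vcf notes.txt README

# echo shows what the shell turns the wildcard into; nothing is deleted.
echo rm *.vcf
echo rm *.*

# ls with the same pattern is another safe preview.
ls *.vcf
```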


End

This should be all you need for linux at the moment.

Examples of all the commands you will need (and more) are in this cheat sheet.

The File Commands and Shortcuts will be the most useful for you now.

