Linux command line introduction

Super fast introduction to UNIX/Linux...command line

UNIX/Linux/command line

The exclamation mark (!) or percent sign (%) at the beginning of a line is specific to Jupyter notebooks. Don't type it in a normal command line!

On purpose, we will use some commands we will not comment on. It's up to you to search and figure out what they do.

Example of system commands you can try

Here are some commands you can try to see what happens in the command line and what the output looks like.

In [1]:
!# name of the user
!whoami
guest03
In [2]:
!#  name of the computer
!hostname
wolf03
In [3]:
!# manual for the command; press q to quit
!man hostname
HOSTNAME(1)                Linux Programmer's Manual               HOSTNAME(1)

NAME
       hostname - show or set the system's host name
       domainname - show or set the system's NIS/YP domain name
       ypdomainname - show or set the system's NIS/YP domain name
       nisdomainname - show or set the system's NIS/YP domain name
       dnsdomainname - show the system's DNS domain name

SYNOPSIS
       hostname [-a|--alias] [-d|--domain] [-f|--fqdn|--long] [-A|--all-fqdns]
       [-i|--ip-address] [-I|--all-ip-addresses] [-s|--short] [-y|--yp|--nis]
       hostname [-b|--boot] [-F|--file filename] [hostname]
       hostname [-h|--help] [-V|--version]

       domainname [nisdomain] [-F file]
       ypdomainname [nisdomain] [-F file]
       nisdomainname [nisdomain] [-F file]

       dnsdomainname

DESCRIPTION
       Hostname is used to display the system's DNS name, and  to  display  or
       set its hostname or NIS domain name.

   GET NAME
       When  called  without  any  arguments, the program displays the current
       names:

       hostname will print the name of the system as returned by the  gethost‐
       name(2) function.

       domainname  will  print  the  NIS domainname of the system.  domainname
       uses the gethostname(2) function, while ypdomainname and  nisdomainname
       use the yp_get_default_domain(3).

       dnsdomainname  will  print the domain part of the FQDN (Fully Qualified
       Domain Name). The complete FQDN of the system is returned with hostname
       --fqdn (but see the warnings in section THE FQDN below).

   SET NAME
       When  called  with one argument or with the --file option, the commands
       set the host name  or  the  NIS/YP  domain  name.   hostname  uses  the
       sethostname(2)  function,  while all of the three domainname, ypdomain‐
       name and nisdomainname use setdomainname(2).  Note, that this is effec‐
       tive  only  until  the  next  reboot.  Edit /etc/hostname for permanent
       change.

       Note, that only the super-user can change the names.

       It is not possible to set the FQDN or the DNS domain name with the dns‐
       domainname command (see THE FQDN below).

       The   host   name   is   usually   set   once   at  system  startup  in
       /etc/init.d/hostname.sh (normally by reading the  contents  of  a  file
       which contains the host name, e.g.  /etc/hostname).

   THE FQDN
       The  FQDN  (Fully Qualified Domain Name) of the system is the name that
       the resolver(3) returns for the host name, such as, ursula.example.com.
       It  is  usually  the hostname followed by the DNS domain name (the part
       after the first dot).  You can check the FQDN using hostname --fqdn  or
       the domain name using dnsdomainname.

       You cannot change the FQDN with hostname or dnsdomainname.

       The  recommended  method of setting the FQDN is to make the hostname be
       an alias for the fully qualified name using /etc/hosts,  DNS,  or  NIS.
       For  example,  if  the  hostname was "ursula", one might have a line in
       /etc/hosts which reads

              127.0.1.1    ursula.example.com ursula

       Technically: The FQDN is the name getaddrinfo(3) returns for  the  host
       name returned by gethostname(2).  The DNS domain name is the part after
       the first dot.

       Therefore it depends on the configuration of the resolver  (usually  in
       /etc/host.conf) how you can change it. Usually the hosts file is parsed
       before DNS or NIS,  so  it  is  most  common  to  change  the  FQDN  in
       /etc/hosts.

       If  a machine has multiple network interfaces/addresses or is used in a
       mobile environment, then it may either have multiple FQDNs/domain names
       or  none  at  all.  Therefore  avoid  using  hostname  --fqdn, hostname
       --domain and dnsdomainname.  hostname --ip-address is  subject  to  the
       same limitations so it should be avoided as well.

OPTIONS
       -a, --alias
              Display  the  alias  name  of the host (if used). This option is
              deprecated and should not be used anymore.

       -A, --all-fqdns
              Displays all FQDNs of the machine. This  option  enumerates  all
              configured  network  addresses  on all configured network inter‐
              faces, and translates them to DNS domain names.  Addresses  that
              cannot be translated (i.e. because they do not have an appropri‐
              ate reverse IP entry) are skipped. Note that different addresses
              may  resolve  to the same name, therefore the output may contain
              duplicate entries. Do not make any assumptions about  the  order
              of the output.

       -b, --boot
              Always  set  a hostname; this allows the file specified by -F to
              be non-existant or empty, in which  case  the  default  hostname
              localhost will be used if none is yet set.

       -d, --domain
              Display  the  name  of  the  DNS  domain.  Don't use the command
              domainname to get the DNS domain name because it will  show  the
              NIS  domain  name and not the DNS domain name. Use dnsdomainname
              instead. See the warnings in section THE FQDN above,  and  avoid
              using this option.

       -f, --fqdn, --long
              Display  the FQDN (Fully Qualified Domain Name). A FQDN consists
              of a short host name and the DNS domain  name.  Unless  you  are
              using  bind  or NIS for host lookups you can change the FQDN and
              the DNS  domain  name  (which  is  part  of  the  FQDN)  in  the
              /etc/hosts  file. See the warnings in section THE FQDN above und
              use hostname --all-fqdns instead wherever possible.

       -F, --file filename
              Read the host name from  the  specified  file.  Comments  (lines
              starting with a `#') are ignored.

       -i, --ip-address
              Display the network address(es) of the host name. Note that this
              works only if the host name can be resolved.  Avoid  using  this
              option; use hostname --all-ip-addresses instead.

       -I, --all-ip-addresses
              Display  all  network addresses of the host. This option enumer‐
              ates all configured addresses on  all  network  interfaces.  The
              loopback  interface  and  IPv6 link-local addresses are omitted.
              Contrary to option -i, this option does not depend on name reso‐
              lution.  Do not make any assumptions about the order of the out‐
              put.

       -s, --short
              Display the short host name. This is the host name  cut  at  the
              first dot.

       -V, --version
              Print  version  information on standard output and exit success‐
              fully.

       -y, --yp, --nis
              Display the NIS domain name. If a parameter is given (or  --file
              name ) then root can also set a new NIS domain.

       -h, --help
              Print a usage message and exit.

NOTES
       The  address  families hostname tries when looking up the FQDN, aliases
       and network addresses of the host are determined by  the  configuration
       of  your resolver.  For instance, on GNU Libc systems, the resolver can
       be instructed to try IPv6 lookups first by using the  inet6  option  in
       /etc/resolv.conf.

FILES
       /etc/hostname  Historically  this file was supposed to only contain the
       hostname and not the full canonical FQDN.  Nowadays  most  software  is
       able  to  cope with a full FQDN here. This file is read at boot time by
       the system initialization scripts to set the hostname.

       /etc/hosts Usually, this is where one sets the domain name by  aliasing
       the host name to the FQDN.

AUTHORS
       Peter Tobias, <tobias@et-inf.fho-emden.de>
       Bernd Eckenfels, <net-tools@lina.inka.de> (NIS and manpage).
       Michael Meskes, <meskes@debian.org>

net-tools                         2009-09-16                       HOSTNAME(1)

Paths and directories

Every time you work in the command line you will have to work with paths and directories (folders in Windows). You can move from one directory to another without opening the graphical file browser, which is much faster! You can also copy files and directories.

In [4]:
!# what is the path to the directory where I am now; pwd stands for Print Working Directory
!pwd
/home/guest03/Desktop

Be careful, commands are case-sensitive!

In [5]:
!# upper-case PWD is not a command, so the shell reports an error (unless a different command with exactly that name happens to exist)
!PWD
/bin/sh: 1: PWD: not found

We can try using the absolute path and relative path

We would like to go one directory up from our current 'location'.

In [6]:
!# where we are now
!pwd
!# what is the content of the current directory
!ls
/home/guest03/Desktop
linux_intro.ipynb
In [7]:
!# change directory using an absolute path - it starts at the root of the file system (/), the very 'start'; here we name the directory we are already in, so nothing changes
%cd /home/guest03/Desktop
!pwd
!ls
/home/guest03/Desktop
/home/guest03/Desktop
linux_intro.ipynb
In [8]:
!# now go one directory up using a relative path - it starts from the current directory (.. means the parent directory)
%cd ../
!pwd
!ls
/home/guest03
/home/guest03
Desktop  Documents  Downloads  Music  Pictures	Public	Templates  Videos
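
The difference between the two kinds of path can be tried safely in a throwaway directory. This is only a minimal sketch, assuming a POSIX shell with mktemp available; the directory names project and data are made up for illustration and are not part of the lesson.

```shell
# Work in a temporary directory so nothing in your home is touched.
base=$(mktemp -d)
mkdir -p "$base/project/data"

# Absolute path: starts with / and means the same place from anywhere.
cd "$base/project/data"
abs_result=$(pwd)

# Relative path: .. means "one directory up" from where we are now.
cd ..
rel_result=$(pwd)

echo "$abs_result"
echo "$rel_result"
```

The first line printed ends in project/data and the second in project, because the relative step was taken from wherever the previous cd had left us.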

We can also list a directory without even going there.

In [9]:
!ls /home/guest03/Desktop
linux_intro.ipynb

There is also a special sign ~ (tilde) that stands for your home directory. Home is basically your default directory.

In [10]:
!# using a special sign to go to your home directory
%cd ~
!pwd
/home/guest03
/home/guest03

Operations with directories and files - create, copy, move, etc.

Of course, using the command line you can copy files from one directory to another, create new ones, delete, rename, and everything else you can imagine.

In [11]:
!# first, create an empty, new directory
!mkdir newdir
!# do we really have it there?
!ls
Desktop    Downloads  newdir	Public	   Videos
Documents  Music      Pictures	Templates
In [12]:
!# go to the new directory (relative path) and create a new empty file there
%cd newdir/
!touch newfile
!ls
/home/guest03/newdir
newfile

Now we can move the file, again using both relative and absolute paths.

In [13]:
!# move the file to a different directory using the absolute path
!mv newfile /home/guest03/Downloads
!ls /home/guest03/Downloads
!ls
Bi5444_Analysis_of_sequencing_data_lesson_02_MMraz.pdf	newfile
In [14]:
!# and now we can move it back here; the target "." is a relative path that means the current directory
!mv /home/guest03/Downloads/newfile .
!ls
!ls /home/guest03/Downloads
newfile
Bi5444_Analysis_of_sequencing_data_lesson_02_MMraz.pdf

We can also copy and rename the file.

In [16]:
!# copy (not move) the file
!cp /home/guest03/newdir/newfile ../Pictures/
!ls 
!ls ../Pictures/
newfile
anotherfile  newfile
In [17]:
!# copy the file and rename it
!cp ../newdir/newfile ../Pictures/anotherfile
!ls ../Pictures/
anotherfile  newfile
In [18]:
!# in a similar manner we can move the file and give it a new name at the same time
!mv newfile ../Pictures/movedfile
!ls
!ls ../Pictures/
anotherfile  movedfile	newfile
In [19]:
!# renaming a file in the command line is the same as moving the file
%cd ../Pictures/
!ls
!mv anotherfile renamedfile
!ls
/home/guest03/Pictures
anotherfile  movedfile	newfile
movedfile  newfile  renamedfile
In [20]:
!# and we can remove a file
!rm renamedfile
!ls
movedfile  newfile
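
The file operations from the cells above can be repeated in one go inside a throwaway directory. This is a sketch only, assuming a POSIX shell with mktemp; the names src, dest and notes.txt are invented for the example.

```shell
work=$(mktemp -d)
cd "$work"
mkdir src dest
touch src/notes.txt

cp src/notes.txt dest/              # copy: the original stays in src
cp src/notes.txt dest/copy.txt      # copy under a new name
mv src/notes.txt dest/moved.txt     # move and rename in one step
mv dest/copy.txt dest/renamed.txt   # renaming is just moving within a directory
rm dest/renamed.txt                 # remove (no confirmation is asked)

ls dest
```

After these steps src is empty and dest holds moved.txt and notes.txt.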

You can do the same things with directories as with files, but some commands need minor changes.

In [21]:
!# make new directory
!mkdir newdir
!ls
movedfile  newdir  newfile
In [22]:
!# copy a directory - we have to use RECURSIVE copying (-r); the first attempt without -r fails
!cp newdir ../
!cp -r newdir ../
!ls ../
cp: omitting directory 'newdir'
Desktop    Downloads  newdir	Public	   Videos
Documents  Music      Pictures	Templates
In [23]:
!# move the directory
!mv newdir /home/guest03/Downloads/
!ls
!ls /home/guest03/Downloads/
movedfile  newfile
Bi5444_Analysis_of_sequencing_data_lesson_02_MMraz.pdf	newdir

Please note the "/" character at the end of the target path. If you want to move something INTO a directory, it is better to put "/" at the end of the target. Without it, if the target directory does not exist, the command line may RENAME the moved item to the target name instead of moving it into a directory.
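
The effect of the trailing "/" can be seen with a single file. The sketch below assumes the mv from GNU coreutils (other implementations may word the error differently) and uses the invented names report.txt and archive; the directory archive deliberately does not exist.

```shell
work=$(mktemp -d)
cd "$work"
touch report.txt

# With the trailing slash, mv refuses: archive/ is not an existing directory.
if mv report.txt archive/ 2>/dev/null; then
    slash_result="moved"
else
    slash_result="refused"
fi

# Without the slash, mv quietly renames report.txt to a FILE called archive.
plain_result="not renamed"
mv report.txt archive
if [ -f archive ]; then
    plain_result="renamed to a file"
fi

echo "$slash_result / $plain_result"
```

The trailing slash turns a silent rename into a visible error, which is usually what you want when the target was meant to be a directory.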

In [24]:
!# we can remove an empty directory
!rmdir /home/guest03/Downloads/newdir
!ls /home/guest03/Downloads
Bi5444_Analysis_of_sequencing_data_lesson_02_MMraz.pdf

But if the directory is not empty, a simple rmdir won't work. We have to use a RECURSIVE delete. Be careful: the command line doesn't ask "Are you sure?", it deletes right away.

In [25]:
!# remove non-empty directory
!mkdir bagr
!touch bagr/file
!ls
!rmdir bagr
!rm -r bagr
!ls
bagr  movedfile  newfile
rmdir: failed to remove 'bagr': Directory not empty
movedfile  newfile

Creating nested directories, more than one "level" deep. We can also create a whole chain of directories nested one inside the other with a single command. For this, all the parent directories of the last listed directory (= all directories which are "above" the last one) have to be created as well, which is what the -p option does.

In [26]:
!# make all the directories even if some of the parents don't exist yet (plain mkdir fails, mkdir -p works)
!mkdir /home/guest03/Desktop/testdir1/testdir2
!mkdir -p /home/guest03/Desktop/testdir1/testdir2
!ls /home/guest03/Desktop/testdir1
mkdir: cannot create directory ‘/home/guest03/Desktop/testdir1/testdir2’: No such file or directory
testdir2
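
The same behaviour can be checked without touching the Desktop. This is a minimal sketch, assuming a POSIX shell with mktemp; level1 and level2 are placeholder names.

```shell
work=$(mktemp -d)

# Plain mkdir fails because the parent level1 does not exist yet.
if mkdir "$work/level1/level2" 2>/dev/null; then
    plain="created"
else
    plain="failed"
fi

# -p creates every missing parent on the way.
mkdir -p "$work/level1/level2"

# -p also stays quiet when the directory already exists.
mkdir -p "$work/level1/level2"

echo "plain mkdir: $plain"
ls "$work/level1"
```

Because mkdir -p does not complain about existing directories, it is safe to run repeatedly, for example at the top of a script.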

Reading and writing to a file

Now we will see how to download a file, how to read it, and how to save it under a different file name (in a different way than before).

First, we have to download a file. There are numerous ways to do it, but this is one of the easiest.

In [31]:
!# download a file from the command line (the URL is quoted so that the shell does not treat ? as a special character)
!wget "https://www.dropbox.com/s/7ja4d2kifo3cbqx/textfile.txt?dl=0"
--2018-10-02 11:06:05--  https://www.dropbox.com/s/7ja4d2kifo3cbqx/textfile.txt?dl=0
Resolving www.dropbox.com (www.dropbox.com)... 162.125.66.1, 2620:100:6022:1::a27d:4201
Connecting to www.dropbox.com (www.dropbox.com)|162.125.66.1|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/7ja4d2kifo3cbqx/textfile.txt [following]
--2018-10-02 11:06:05--  https://www.dropbox.com/s/raw/7ja4d2kifo3cbqx/textfile.txt
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com/cd/0/inline/AR8LJ2h3BWAupBGIBvHXaN7jK8hXQyeLjDKjiLGbCy-6zahar3bD90qVb2axdHYlCIrnmKU7o4Q_T74OeHuB3Mz0Bgk9Fy94rt2luJH-jrDXwLQKeriidDZ0kqgjVF1E5Hd2MquPSKUVBuCTzxJwoO6kxBWN75MNJet2p2t5gp-QbyMvFY4WRXKq52jTV_WqapenTz2_lcyib9kSMy-qcbSy/file [following]
--2018-10-02 11:06:05--  https://uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com/cd/0/inline/AR8LJ2h3BWAupBGIBvHXaN7jK8hXQyeLjDKjiLGbCy-6zahar3bD90qVb2axdHYlCIrnmKU7o4Q_T74OeHuB3Mz0Bgk9Fy94rt2luJH-jrDXwLQKeriidDZ0kqgjVF1E5Hd2MquPSKUVBuCTzxJwoO6kxBWN75MNJet2p2t5gp-QbyMvFY4WRXKq52jTV_WqapenTz2_lcyib9kSMy-qcbSy/file
Resolving uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com (uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com)... 162.125.66.6, 2620:100:6022:6::a27d:4206
Connecting to uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com (uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com)|162.125.66.6|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1640 (1.6K) [text/plain]
Saving to: ‘textfile.txt?dl=0’

textfile.txt?dl=0   100%[===================>]   1.60K  --.-KB/s    in 0s      

2018-10-02 11:06:05 (137 MB/s) - ‘textfile.txt?dl=0’ saved [1640/1640]

In [32]:
!# see if we have it
!ls
movedfile  newfile  textfile.txt  textfile.txt?dl=0
In [34]:
!# see the content of the file
!cat textfile.txt\?dl\=0
1. Introduction to NGS technologies: a brief introduction to biology, sequencing, history, NGS technologies and their applications, sample extraction, library preparation, basic glossary.
2. The basic scheme of data analysis: how the data look like, definition of general steps in NGS data analysis, differences in dependence on the application (eg. variant calling vs RNA-Seq …), projects introduction.
3. Student project assignment and introduction to software for data analysis: a brief introduction to work with Linux, Bash and R, data formats and the differences between them, on-line courses, discussion about projects.
4. Quality control, data processing, specifications and start of work on projects: tools for quality control, Phred score, data pre-processing, examples on sample data.
5. Alignment and post-processing: reference genome databases, annotations, the differences between them and application, explanations of alignment algorithms, differences between spliced/non-spliced ​​tools and their application, alignment quality control, alignment visualization.
6. Theory to specifics parts of the analysis of the projects 1. (based on the student selections)
7. Theory to specifics parts of the analysis of the projects 2. (based on the student selections)
8. Theory to specifics parts of the analysis of the projects 3. (based on the student selections)
9. Theory to specifics parts of the analysis of the projects 4. (based on the student selections)
10. Projects processing/analysis, consultations.
11. Projects processing/analysis and projects finalization, consultations.
12. Presentation of the project results.
In [35]:
!# fix the ugly name: download again and choose the output file name with -O
!wget "https://www.dropbox.com/s/7ja4d2kifo3cbqx/textfile.txt?dl=0" -O textfile.txt
--2018-10-02 11:07:43--  https://www.dropbox.com/s/7ja4d2kifo3cbqx/textfile.txt?dl=0
Resolving www.dropbox.com (www.dropbox.com)... 162.125.66.1, 2620:100:6022:1::a27d:4201
Connecting to www.dropbox.com (www.dropbox.com)|162.125.66.1|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/7ja4d2kifo3cbqx/textfile.txt [following]
--2018-10-02 11:07:43--  https://www.dropbox.com/s/raw/7ja4d2kifo3cbqx/textfile.txt
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com/cd/0/inline/AR8LJ2h3BWAupBGIBvHXaN7jK8hXQyeLjDKjiLGbCy-6zahar3bD90qVb2axdHYlCIrnmKU7o4Q_T74OeHuB3Mz0Bgk9Fy94rt2luJH-jrDXwLQKeriidDZ0kqgjVF1E5Hd2MquPSKUVBuCTzxJwoO6kxBWN75MNJet2p2t5gp-QbyMvFY4WRXKq52jTV_WqapenTz2_lcyib9kSMy-qcbSy/file [following]
--2018-10-02 11:07:43--  https://uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com/cd/0/inline/AR8LJ2h3BWAupBGIBvHXaN7jK8hXQyeLjDKjiLGbCy-6zahar3bD90qVb2axdHYlCIrnmKU7o4Q_T74OeHuB3Mz0Bgk9Fy94rt2luJH-jrDXwLQKeriidDZ0kqgjVF1E5Hd2MquPSKUVBuCTzxJwoO6kxBWN75MNJet2p2t5gp-QbyMvFY4WRXKq52jTV_WqapenTz2_lcyib9kSMy-qcbSy/file
Resolving uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com (uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com)... 162.125.66.6, 2620:100:6022:6::a27d:4206
Connecting to uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com (uc1016a6073b86c8ad789bb93f74.dl.dropboxusercontent.com)|162.125.66.6|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1640 (1.6K) [text/plain]
Saving to: ‘textfile.txt’

textfile.txt        100%[===================>]   1.60K  --.-KB/s    in 0s      

2018-10-02 11:07:44 (150 MB/s) - ‘textfile.txt’ saved [1640/1640]

In [37]:
!# see if it's there and delete the old one
!ls
!rm textfile.txt\?dl\=0
movedfile  newfile  textfile.txt  textfile.txt?dl=0
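
Instead of downloading a second time, an awkwardly named file could also simply be renamed. The sketch below is an alternative under stated assumptions, not what the notebook does: it works in a temporary directory and creates an empty stand-in with the same odd name rather than using the real download.

```shell
work=$(mktemp -d)
cd "$work"

# Empty stand-in for the file wget saved under the awkward name.
touch 'textfile.txt?dl=0'

# Single quotes switch off the special meaning of ? for the shell,
# so no backslashes are needed.
mv 'textfile.txt?dl=0' textfile.txt

ls
```

Quoting the whole name is often easier to read than escaping each special character with a backslash.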
In [38]:
!# how many lines do we have
!wc -l textfile.txt
12 textfile.txt
In [40]:
!# what is the beginning of the file (top 10 lines)
!echo "Top 10 lines."
!head textfile.txt
!# what is the end of the file (last 10 lines)
!echo "Last 10 lines"
!tail textfile.txt
Top 10 lines.
1. Introduction to NGS technologies: a brief introduction to biology, sequencing, history, NGS technologies and their applications, sample extraction, library preparation, basic glossary.
2. The basic scheme of data analysis: how the data look like, definition of general steps in NGS data analysis, differences in dependence on the application (eg. variant calling vs RNA-Seq …), projects introduction.
3. Student project assignment and introduction to software for data analysis: a brief introduction to work with Linux, Bash and R, data formats and the differences between them, on-line courses, discussion about projects.
4. Quality control, data processing, specifications and start of work on projects: tools for quality control, Phred score, data pre-processing, examples on sample data.
5. Alignment and post-processing: reference genome databases, annotations, the differences between them and application, explanations of alignment algorithms, differences between spliced/non-spliced ​​tools and their application, alignment quality control, alignment visualization.
6. Theory to specifics parts of the analysis of the projects 1. (based on the student selections)
7. Theory to specifics parts of the analysis of the projects 2. (based on the student selections)
8. Theory to specifics parts of the analysis of the projects 3. (based on the student selections)
9. Theory to specifics parts of the analysis of the projects 4. (based on the student selections)
10. Projects processing/analysis, consultations.
Last 10 lines
3. Student project assignment and introduction to software for data analysis: a brief introduction to work with Linux, Bash and R, data formats and the differences between them, on-line courses, discussion about projects.
4. Quality control, data processing, specifications and start of work on projects: tools for quality control, Phred score, data pre-processing, examples on sample data.
5. Alignment and post-processing: reference genome databases, annotations, the differences between them and application, explanations of alignment algorithms, differences between spliced/non-spliced ​​tools and their application, alignment quality control, alignment visualization.
6. Theory to specifics parts of the analysis of the projects 1. (based on the student selections)
7. Theory to specifics parts of the analysis of the projects 2. (based on the student selections)
8. Theory to specifics parts of the analysis of the projects 3. (based on the student selections)
9. Theory to specifics parts of the analysis of the projects 4. (based on the student selections)
10. Projects processing/analysis, consultations.
11. Projects processing/analysis and projects finalization, consultations.
12. Presentation of the project results.
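
Both head and tail print 10 lines by default, which is why the two outputs above overlap for a 12-line file. The number of lines can be chosen with -n. A small sketch, assuming a POSIX shell; sample.txt is a generated stand-in file, not the downloaded one.

```shell
work=$(mktemp -d)
cd "$work"

# Build a 12-line stand-in file: "line 1" ... "line 12".
i=1
while [ "$i" -le 12 ]; do
    echo "line $i"
    i=$((i + 1))
done > sample.txt

line_count=$(wc -l < sample.txt)      # reading from stdin prints only the number
first_three=$(head -n 3 sample.txt)   # first 3 lines instead of the default 10
last_two=$(tail -n 2 sample.txt)      # last 2 lines

echo "$line_count"
echo "$first_three"
echo "$last_two"
```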

Editing a text file

We can also edit files right from the command line. There are a number of tools for this purpose, but two of the easiest are nano and pico. Unfortunately, this kind of editing doesn't work in Jupyter, so we have to open a command line for it. Please do this now.

Edit our text file using nano by running the command nano textfile.txt in the command line. Now you can do whatever you want: change text, delete it, etc. You can also see a 'help' bar at the bottom.

Once you are done, you can save the file by pressing Ctrl+O and then either typing a new file name or confirming the same file name with Enter. Press Ctrl+X to exit nano.

nano can do a lot of other stuff. Just see the manual.

It might happen that the computer has neither nano nor pico installed. In that case you have to use the 'hardcore' tools such as vi or vim, but this I will leave to you.

Loop commands

Loops are very useful (= you will use them a lot). There are several kinds of them, and one very nice tutorial which you can go through is here.

We will mainly use the for loop.

In [ ]:
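%%bash
# this loop goes through all files that have a ".txt" suffix and prints their name and first 10 lines
# (%%bash tells Jupyter to run the whole cell in bash; leave that line out in a normal command line)
for file in *.txt
do
    echo "$file"
    head "$file"
done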

Note the * character. This is a wildcard (a shell "glob" pattern, which is similar to, but not the same as, a regular expression). The * stands for any number of any characters, including none. This means the file name can be anything of any length; it just has to end with ".txt".
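
To see which files the * pattern in the loop above picks up, the loop can be run in a directory with a known content. This is a sketch assuming a POSIX shell with mktemp; the file names a.txt, b.txt and notes.md are invented, and the extra check inside the loop covers the case where nothing matches.

```shell
work=$(mktemp -d)
cd "$work"
echo "alpha" > a.txt
echo "beta" > b.txt
touch notes.md          # does not end in .txt, so the loop skips it

matched=""
for file in *.txt
do
    # If nothing matches, the shell leaves the pattern *.txt unchanged,
    # so skip anything that is not an existing file.
    [ -e "$file" ] || continue
    matched="$matched$file "
    echo "$file"
    head -n 1 "$file"
done
```

Putting "$file" in double quotes keeps the loop working for file names that contain spaces.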