Unofficial, informal - but a bit of a pilot for a future more systematic Comp Biol training at INoŚ
Goal: teach you something useful. I assume you want to learn.
Contents and pace flexible, I'll adjust to your needs and interests.
You don't need to attend individual classes. Drop any time. No grades or evaluations.
When teaching new stuff I assume you are comfortable with the previously taught stuff. Make sure you are!
1.2. Bioinformatics and command line
Bioinformatics vs. computational biology
Either way, we want to learn tools that are applicable to many data types, simplify time-intensive tasks, and make it possible to tackle projects on an otherwise implausible scale
The aim is to master a set of relatively simple tools that you can combine and mix-and-match to do exactly what you want with your data.
Learning it takes effort - but once you have it, you will have a set of scalable, highly replicable tools that you will be able to reuse and modify for an ever-increasing set of tasks.
Understanding your data and knowing what you want is going to be the key, but once we have that, we should be able to do whatever :)
Why use command line?
Less intuitive than pointing and clicking (Graphical User Interface - GUI), takes dedicated effort to learn
But more versatile, less platform-specific
Makes it easy to save and replicate sequences of operations
Makes it easy to keep the log of operations and repeat them, perhaps on new files
Makes it easy to run operations on many files at once
Many useful tools only exist in command-line mode
It's much easier to write your own scripts with a command-line-only interface
Computing clusters usually use command line... you need to know it if you work with big data!
Getting set up!
Software is platform-specific: Windows vs Mac vs Linux
Installing useful tools (Appendix 1 in the Book!): powerful text editor (Notepad++, BBEdit, or some of the alternatives!) and access to Unix/Linux command line
Setting up accounts on the cluster, connecting
1.3. Unix/Linux - intro to the environment
Basic vocab:
Shell - the command interpreter in your Unix/Linux operating system. Allows you to interact with the computer.
Kernel - the central part of a Unix operating system, the core interface between a computer's hardware and the processes
Bash - a Unix shell and command language, default login shell for most Linux distributions
Terminal - that's where shell runs!
Shell session is the current state/environment in the shell/terminal, delimited by login/logout.
Prompt - the active line where you type commands. Often includes useful info - user name, current directory
pwd --- where are you?
The Linux file system - more vocab:
root - the superuser account, but also the top-level directory (/) from which the whole file structure branches out
Directories, files
Path
Current/working directory
Home directory (~)
ls --- listing directory contents
pointing at directories that you want listed!
options (flags): -l, -h, -a ...
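A minimal sketch of the ls options listed above, run against a throwaway directory so that it is safe to try anywhere. The directory comes from mktemp, and the file names reads.txt and .hidden_notes are made up for illustration.

```shell
# Create a scratch directory with one visible and one hidden file.
# (The file names are hypothetical, chosen only for this demo.)
demo=$(mktemp -d)
printf 'ACGT\n' > "$demo/reads.txt"
touch "$demo/.hidden_notes"

ls "$demo"        # point ls at a directory: names only, hidden files skipped
ls -a "$demo"     # -a: also show entries whose names start with a dot
ls -l "$demo"     # -l: long format (permissions, owner, size, date)
ls -lh "$demo"    # -h together with -l: sizes in human-readable units

# Keep two of the listings in variables so they can be inspected later.
plain=$(ls "$demo")
all=$(ls -a "$demo")

# Clean up the scratch directory.
rm -r "$demo"
```

Single-letter options can usually be combined, so ls -lha is the same as ls -l -h -a.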
getting help about commands!
man
--help, -h
Google :)
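A short sketch of asking a command for help, assuming a typical Linux system with GNU tools. On other systems (for example macOS) ls may not accept --help and will print a brief usage message instead, which is why error output is captured as well.

```shell
# Ask ls for its built-in help and keep the first line that mentions usage.
# 2>&1 also captures error output, because not every ls understands --help.
usage_line=$(ls --help 2>&1 | grep -i -m 1 'usage')
echo "$usage_line"

# Show just the first few lines of the help text instead of the whole thing.
ls --help 2>&1 | head -n 5

# The full manual page is opened interactively; press q to leave the pager:
#   man ls
```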
cd --- changing directories
absolute vs relative path!
cd Project1
cd ..
cd ./Project1/Dataset1
cd /home/username/Project1/Dataset1
cd ../Dataset2
cd /
cd ~
cd -
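The cd examples above can be tried out in a throwaway directory tree. This is a sketch only: the scratch location comes from mktemp, and Project1, Dataset1 and Dataset2 are simply the example names used above, created here so that the commands have somewhere to go.

```shell
# Remember where we started, then build a small practice tree.
start_dir=$(pwd)
sandbox=$(mktemp -d)
mkdir -p "$sandbox/Project1/Dataset1" "$sandbox/Project1/Dataset2"

cd "$sandbox"        # absolute path: starts with / and works from anywhere
cd Project1          # relative path: resolved from the current directory
cd ./Dataset1        # ./ means "starting here", so this is the same as cd Dataset1
cd ../Dataset2       # .. means "one level up": up to Project1, then down into Dataset2
in_dataset2=$(pwd)
echo "$in_dataset2"

cd - > /dev/null     # jump back to the previous directory (cd - also prints its name)
after_dash=$(pwd)
echo "$after_dash"

cd /                 # the root directory
at_root=$(pwd)
echo "$at_root"
# cd ~ (or plain cd) jumps to your home directory in the same way.

# Return to where we started and remove the practice tree.
cd "$start_dir"
rm -r "$sandbox"
```

Running pwd after each cd is a good habit while learning: it confirms where a relative path actually took you.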
Effective navigation
up and down arrows move you to previously applied commands that you can now edit
the history command lists previously run commands
Ctrl+A, Ctrl+E - jump to the beginning/end of the edited command