2018-2019 / INFO0953-1

Scripting interfaces for biological software and databases

Duration

20h Th, 50h Mon. WS

Number of credits

Master in bio-informatics and modelling (120 ECTS): 8 credits

Lecturer

Denis Baurain, Pierre Tocquin

Coordinator

Denis Baurain

Language(s) of instruction

English language

Organisation and examination

Teaching in the first semester, review in January

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

This course teaches the Linux operating system, Perl programming and key database concepts in the context of bioinformatics applications.
1. Linux

  • Linux (installation, configuration and customization, tour of the GUI)
  • Command line (terminal, shell)
  • Interactive shell: commands for diagnostics, browsing, file search, file processing, file input/output, string processing and permission management
  • Advanced shell: process management, variables, loops, one-liners, scripts
  • How to keep a tidy bioinformatics log-book
2. Modern Perl
  • Variables (Scalars, Arrays, Hashes)
  • Operators, Boolean expressions and Control flow
  • Input/output
  • Regular expressions
  • One-liners
  • Functions
  • References and Nested data structures
  • Modules and Unit tests
  • Best of CPAN
  • Idiomatic Perl - TIMTOWTDI
3. Databases
  • Excel sheets vs Databases (non-redundancy and data integrity)
  • Database design: Conceptual design (entities-relationships-attributes, cardinalities, naming conventions); Logical design (primary keys, foreign keys, candidate keys and composite keys, normalization); Physical design (db-main, mysql-wb, indexing); Advanced concepts (object orientation, denormalization)
  • Database use: SQL language (DDL, anatomy of SELECT), SQLite
  • SQLite integration with the Shell, R and Perl
  • Biological databases
  • Full example: the MIDI-CHIP database

Learning outcomes of the learning unit

This course is the main programming course of the Master BIM. Along with the other courses of this curriculum, it aims to ensure that students are able to use computers as scientific instruments. More specifically, they will have been trained for the following purposes:
1. Experimental design

  • how to choose appropriate controls
  • how to think in a statistical framework
2. Conducting experiments
  • how to run large series of analyses
  • how to harness the power of grid computing
3. Interpretation of results
  • how to automate the analysis of output files
  • how to generate informative but nice-looking graphs
  • how to draw statistically sound conclusions
4. Documentation and archiving
  • how to document experimental protocols
  • how to reorganize a series of past analyses
  • how to manage multiple versions of the data sets, of the required programs and of the generated results

Prerequisite knowledge and skills

This course requires no prior knowledge of computer programming, but it builds on the Genomics [GENE0003-1] and Bioinformatics [BIOL0008-1] courses of the Master BBMC or BIM.

Planned learning activities and teaching methods

  • brief theoretical lectures
  • challenges to solve
  • computer practicals
  • self-learning (textbooks and online tutorials)

Mode of delivery (face-to-face ; distance-learning)

This course is mostly face-to-face, but as a problem-oriented course it also requires students to work outside the classroom.

Recommended or required readings

Hard copies of course materials will be distributed in class. Recommended reference books will be suggested in these course materials.

Assessment methods and criteria

The evaluation of this course will be based on three components: the work done during the academic year (homework: 15%), an open-book exam in which an integrative problem must be solved using shell commands and a Perl program (60%), and a personal essay presenting the design process and possible uses of a biological database of the student's choice (25%).

Work placement(s)

Organizational remarks

Taking notes on a laptop or tablet is allowed. However, students are expected not to surf or chat in the classroom.

Contacts

Prof. Denis Baurain Institut de Botanique B22 (P70) denis.baurain@uliege.be
Dr. Pierre Tocquin Institut de Botanique B22 (P70) ptocquin@uliege.be
Assistant: Dr. Damien Sirjacobs Institut de Botanique B22 (P70) 04/366.38.54 D.Sirjacobs@uliege.be