Colin Clarke

Principal Investigator



Dr. Clarke graduated with a PhD in Bioinformatics from Cranfield University, UK, and specialises in the application of multivariate statistics and machine learning algorithms to high-dimensional data. Upon completion of his doctoral training, he took a position at Dublin Institute of Technology (DIT), applying chemometric approaches to analyse Raman and infrared spectroscopy data from cells and tissues.

He moved to the National Institute for Cellular Biotechnology at Dublin City University in 2009 to work in Martin Clynes’ group investigating the biology of CHO cells during biopharmaceutical production.

A major component of his research during this period centred on the application of statistical methods to study CHO transcriptomic and proteomic expression datasets. Examples of his work in the area include the use of partial least squares (PLS) to predict cell-specific productivity from gene expression data and the elucidation of mRNA coexpression networks from a large-scale CHO mRNA dataset.

An area of particular interest was the integration of miRNA, mRNA, proteomic and genomic data to understand the biological processes regulating the growth rate of CHO cells.

In 2014 he moved to NIBRT following the award of SFI’s prestigious Starting Investigator Research Grant. Dr. Clarke’s bioinformatics group is currently focussed on advancing the understanding of the CHO cell biological system using next-generation sequencing and advanced computational techniques.


Martin Sinacore Outstanding Young Researcher Award 2014
Best Oral Presentation at the 2014 DCU School of Biotechnology Research Day
Cranfield University Bioinformatics MSc course team prize, 2003/2004 academic year
Editorial Board Member:

Biotechnology Letters

Reviewer for:

Biotechnology and Bioengineering, Journal of Biotechnology, Pharmaceutical Bioprocessing, Journal of Industrial Microbiology & Biotechnology, Cancer Biomarkers

Research Areas


Biopharmaceutical production


Multivariate statistics

Next-generation sequencing

Cancer research