-omics data analysis
high-throughput cell imaging data analysis
by
Andreas Hadjiprocopis
(contact details: 'andreashad2' then the funny snail symbol then 'gmail.com'
without the quotes)
Here I present four problems I tried to tackle during my work at the Institute of Cancer Research, London, between 2009 and 2013. There are others, but they are not shown here.
The main idea is to analyse large datasets comprising various morphological features of cells (e.g. cell area) belonging to different cell lines (e.g. HeLa) and treated under different conditions (e.g. TNF-alpha) for different durations. Examples of cell lines, conditions, etc. are here.
The features have been extracted from high-throughput images of cells processed by third-party software.
The work presented here is an exploration of how to deal with high-throughput cell imaging data; all the software and pipelines are in prototype form (mainly written in Perl, Bash, C++ and R).
Here they are:
Models predicting cell line or NFkB activation given cell morphological features, here.
Calculating differences in the value of a cell's morphological feature (from high-throughput imaging, e.g. cell area) for two different cell-lines/cell-treatments. Here.
For a given cell-line/treatment-condition/treatment-duration, are there pairs of features which are likely to occur together? (For example, baldness and sex=male in humans). Here.
A method to measure how different (separable) pairs of cell-line/treatment/duration combinations are with respect to cell shape and texture. Basically, whether cells belonging to one of two cell-line/treatment/duration combinations can be distinguished based on their shape and texture. Here.
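As a rough illustration of the second problem above (the difference in one morphological feature between two cell lines or treatments), here is a minimal Python sketch. It is not the method used in the linked pages: it assumes a standardised difference of means (Cohen's d with a pooled standard deviation) as the measure, and the function name `cohens_d` and the cell-area values are made up for the example.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardised difference of means between two samples,
    using the pooled sample variance (an assumed, illustrative measure)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Made-up cell-area values for two hypothetical cell lines (not real data).
area_line_a = [410.0, 395.0, 430.0, 420.0, 405.0]
area_line_b = [350.0, 365.0, 340.0, 360.0, 355.0]

d = cohens_d(area_line_a, area_line_b)
print(f"standardised difference in cell area: {d:.2f}")
```

A large absolute value suggests the feature differs clearly between the two groups; a value near zero suggests it does not. With real high-throughput data a distribution-free measure may be more appropriate, since morphological features are often far from normally distributed.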
My CV, past research work and interests, and a list of publications are here as a single PDF.
Book chapter in Systems Genetics: Linking Genotypes and Phenotypes: "Phenotype State Spaces and Strategies for Exploring Them" (Florian Markowetz and Michael Boutros, eds., Cambridge Series in Systems Genetics, Cambridge University Press, 2015).
My PhD thesis ("Feed Forward Neural Network Entities") on how to break a large-and-cumbersome Feed Forward Neural Network into a number of smaller-and-easier-to-handle neural networks is here.
Public domain software I have written over the years (biological data analysis, some programs used to create the data described in other pages on this site, neural networks, genetic algorithms, general machine learning and AI) can be found at my GitHub repository here: https://github.com/hadjiprocopis
Project protein-hops integrates STRING-DB data and gene expression data and builds a graph of functional associations. Questions such as "list all 2-hop associations between two proteins" can be answered efficiently.
Talk on 29/11/2017 about integrating omics data from various sources and public databases. Also, analysis of cell morphological and texture features extracted from high-throughput images of live cells.
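To make the 2-hop question concrete, here is a minimal Python sketch. It is not the protein-hops implementation and does not read STRING-DB data: it assumes an undirected association graph held in memory as adjacency sets, and the names `build_graph`, `two_hop_paths` and the toy protein identifiers are hypothetical. A 2-hop association between two proteins is then any intermediate protein adjacent to both, found with a set intersection.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as a dict of adjacency sets."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def two_hop_paths(graph, source, target):
    """Return all paths source -> mid -> target of exactly two hops."""
    # Intermediates are the neighbours shared by both end points.
    shared = graph.get(source, set()) & graph.get(target, set())
    return [(source, mid, target) for mid in sorted(shared)]


# Toy association edges between placeholder proteins (not real data).
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P4"), ("P4", "P3"), ("P3", "P5")]
graph = build_graph(toy_edges)

paths = two_hop_paths(graph, "P1", "P3")
for path in paths:
    print(" - ".join(path))
```

The intersection costs time proportional to the smaller neighbour set, which is one reason adjacency sets suit this kind of query; a database-backed version would express the same idea as a self-join on the edge table.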