Goksel Misirli

Today, I am in London at the Cloud and Big Data computing conference, organised by Dr Huseyin Seker.

John Gordon is talking about consulting in the area of Cloud Computing. He trains graduates to work in this area.

Docker, AWS, Ansible are some of the topics from the syllabus.

Big data training involves Hadoop, Apache Spark, Pentaho and so on.

Prof. Yike Guo presented a framework for the co-analysis of Big Data and Smart Data.
Issues involve:
- Big data lacks context
- It is unimodal
- It can be biased

Smart data is about purposeful observations for building models. It can be representative but is not always comprehensive. Big data can be the opposite. Data are often assimilated and it is difficult to use the two together. Smart data are especially useful in the context of telecommunication applications. Creating representative and comprehensive models by combining big and smart data is crucial.

Utilising big data to enrich smart data in order to create better data (Data enrichment)
- Here, smart data is used to improve the quality of big data by first creating a model. This process is called transduction (transductive inference).

Transfer learning for data enrichment
Smart Data -> Learning System -> Enrichment + Big Data -> Learning System

Transfer Learning is part of assimilation learning.

Enriching Big Data with Context/Semantics

Feature Improvement:


Second day of ICBO 2017 starts with a keynote talk from Alan Rector.

Ontologies are a subset of knowledge representation, not the full representation. They need to be supplemented with other tools.
Logic programming: MYCIN, Prolog, OPS5, JESS, Arden Syntax, BPEL
Classification: ICD

- Closed world: what you see is what you get. All statements are facts.
- Open world: Satisfiability and negation are the differences. Statements are not facts. Statements are axioms. If a class is empty it is unsatisfiable.
- Aggregation
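
The difference between the two readings can be contrasted with a minimal Python sketch. The toy statements and the `ask_closed` / `ask_open` helpers are hypothetical names made up for illustration and are not part of any OWL tooling; a real open-world query relies on a reasoner checking entailment, which this sketch only imitates with explicit negative statements.

```python
# Toy knowledge base: statements asserted to hold and statements asserted NOT to hold.
# All names here are illustrative assumptions, not an OWL API.
KNOWN_TRUE = {("aspirin", "treats", "headache")}
KNOWN_FALSE = {("aspirin", "treats", "fracture")}


def ask_closed(triple):
    """Closed world: anything that is not stated is taken to be false."""
    return triple in KNOWN_TRUE


def ask_open(triple):
    """Open world: a statement is unknown unless it is asserted or explicitly negated."""
    if triple in KNOWN_TRUE:
        return "true"
    if triple in KNOWN_FALSE:
        return "false"
    return "unknown"


query = ("aspirin", "treats", "fever")
closed_answer = ask_closed(query)  # the statement is absent, so it counts as false
open_answer = ask_open(query)      # absence is not negation, so the answer stays open
```

The same missing statement gives a definite "no" under the closed-world reading and an undecided answer under the open-world one.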

Querying OWL- KB as a closed world:
SPARQL, SNOMED's Expression Constraint Language, OPP:, OWL API

It was a great talk!

Different pathway databases:

Pathway Ontology has been developed to organise pathways by function. Most databases use their own ontology.

Gedevo: Useful for aligning graphs

Ontology Building:
Refining or partitioning concepts

e.g. an amino acid ontology, refining classes using Shape (Large, Small) classes.

ICBO 2017, Newcastle upon Tyne.
Marcos talks about defining metadata in biomedicine. It is quite an important subject. Metadata are unfortunately not standardised. Even for a single concept, there are several similar terms. Finding experimental data, and understanding and replicating the experiments, is difficult. There is no easy way to find suitable ontology terms. CEDAR is a semantic ecosystem to annotate documents. Template designers are used to create templates, which curators then use to add metadata.


We are in Dublin for the 2nd Workshop on Molecular Communications.

—— Prof. Tom Lenaerts is talking about "information theory and the regulation of protein activity". ——

Cells can communicate with cells of the same type or of different types, through signalling molecules or some sort of channels.

Proteins often function by forming complexes, and changing one of the proteins may change the function of these complexes. In this respect, proteins are like Lego blocks. The activation and deactivation of proteins in those complexes are a bit more sophisticated. Sometimes the order of domains is important for the function of these proteins.

It is important to identify the design patterns when rewiring these relationships.

—— Gianluca Reali is talking about the MolComML toolkit ——

A number of molecular communication simulators have already emerged. There is a need for a standard to facilitate communication between these tools. MolComML is a markup language that has been developed to address this issue. There are already XML-based markup languages such as SBML, CellML, NeuroML and MathML. SBML and MathML can be used as inputs to MolComML, and different simulation outputs can be generated.

The main entity is the "Network topology", which contains "Network elements" that can transmit or receive "signals". These signals interact with the network elements through "communication interfaces". A "Protocol Stack" is used to indicate how entities interact and how binary signals are represented using MolComML entities. It may have layers composed of rules. Transportation of signals is represented by "Carriers". All these entities may be in different "Compartments".
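
To make the relationships between these entities easier to follow, here is a rough Python sketch. It is only an illustration of the containment described in the talk: the class names, fields and the `send` method are my own assumptions and do not reflect the actual MolComML schema (the protocol stack is left out entirely).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Signal:
    name: str
    carrier: str  # what transports the signal, e.g. a molecule type (assumed field)


@dataclass
class NetworkElement:
    name: str
    compartment: str  # where the element is located
    interfaces: List[str] = field(default_factory=list)  # communication interfaces
    inbox: List[Signal] = field(default_factory=list)    # signals received so far

    def receive(self, signal: Signal) -> None:
        self.inbox.append(signal)


@dataclass
class NetworkTopology:
    elements: List[NetworkElement] = field(default_factory=list)

    def send(self, signal: Signal, receiver_name: str) -> bool:
        """Deliver a signal to the named element; report whether it was found."""
        for element in self.elements:
            if element.name == receiver_name:
                element.receive(signal)
                return True
        return False


tx = NetworkElement("transmitter", compartment="cell_A", interfaces=["membrane"])
rx = NetworkElement("receiver", compartment="cell_B", interfaces=["receptor"])
topology = NetworkTopology([tx, rx])
delivered = topology.send(Signal("bit_1", carrier="calcium"), "receiver")
```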

BiNS2 is a simulation framework. It is customisable and can be used in the simulation of applications regarding amino systems, drug delivery, interactions between tissue cells and so on.

—— From Microfluidic Communications to Microfluidic Networking: but I still haven’t found what I’m looking for ——

Microfluidic laws are similar to Ohm's law. There are already simulation tools such as COMSOL and OpenFOAM. Prototypes can be created using 3D printers. The devices do not cost much.

The idea is then to create a network of microfluidic chips. How to encode information: one solution is that a droplet in a slot means 1 and an empty slot means 0.
- Distance based
- Composition based

Then a switch is required to redirect droplets. A packet is formed of header and payload droplets. The distance between these droplets can be used to direct the payloads.
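
A small Python sketch of the two ideas above: presence/absence encoding of bits, and reading the header-to-payload distance as a routing decision. The slot symbols (`D` for a droplet, `.` for an empty slot), the function names and the rule "distance equals output port" are assumptions made for illustration, not details given in the talk.

```python
def encode_presence(bits):
    """Map each bit to a slot: 'D' for a droplet (1), '.' for an empty slot (0)."""
    return "".join("D" if b else "." for b in bits)


def decode_presence(slots):
    """Recover the bits by checking which slots hold a droplet."""
    return [1 if s == "D" else 0 for s in slots]


def make_packet(port):
    """Header droplet, `port` empty slots, then one payload droplet."""
    return "D" + "." * port + "D"


def route_from_packet(slots):
    """Read the number of empty slots between header and payload as the output port."""
    header = slots.index("D")
    payload = slots.index("D", header + 1)
    return payload - header - 1


encoded = encode_presence([1, 0, 1, 1])
port = route_from_packet(make_packet(2))
```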

—— A Microfluidic Communication Link: Definition, Analysis and Experimentation, Andrea Zanella ——

Droplets of different lengths can represent the digits 0, 1, 2 and 3.

These droplets are then used to implement communication platforms based on microfluidics. Using image analysis, these droplets can be counted computationally. The droplets can be optimised further to increase the data rate.
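
With four droplet lengths, each droplet carries one base-4 digit, i.e. two bits. The sketch below shows how a byte could be split into four such digits and how a measured length could be classified back into a digit. The nominal lengths in `LENGTHS` and the nearest-length rule are hypothetical choices for illustration; the talk did not give concrete values.

```python
# Hypothetical nominal droplet lengths (arbitrary units) for the four digits.
LENGTHS = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}


def byte_to_symbols(value):
    """Split one byte into four base-4 digits, most significant first."""
    return [(value >> shift) & 0b11 for shift in (6, 4, 2, 0)]


def symbols_to_byte(symbols):
    """Rebuild the byte from its four base-4 digits."""
    value = 0
    for s in symbols:
        value = (value << 2) | s
    return value


def nearest_symbol(measured_length):
    """Classify a measured length as the digit with the closest nominal length."""
    return min(LENGTHS, key=lambda s: abs(LENGTHS[s] - measured_length))


symbols = byte_to_symbols(ord("A"))
# Lengths as they might come out of image analysis, with some measurement noise.
measured = [2.1, 0.9, 1.2, 1.8]
decoded = symbols_to_byte([nearest_symbol(m) for m in measured])
```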

—— Towards Push-Button Solutions: Design Automation for Microfluidic Chips, Werner Haselmayr ——

Werner talks about introducing multiple header payloads when redirecting the payload, and about automating the design of microfluidic-based systems. Droplet-sequence-based architectures are not ideal.

—— MakerFluidics: Microfluidics for the Masses, Ryan Silva ——

The idea is to open up the use of microfluidics to the masses.

Tools such as Cello and Phoenix can be used to specify designs at a high level. These specifications can then be turned into device designs. Microfluidic fabrication allows building physical devices. Neptune is a tool that can be used by labs for this process. The cost comes down to $3000 for creating microfluidic devices using MakerFluidics. The approach has already been used to build complex chips which can also be used in industry. This technology is now being used in collaboration with Draper to identify antibiotic susceptibilities of patients. Blood cells are pushed to the sides and bacteria are predicted through channels prior to tests. Neptune is available at

Neo4j seminar by Dr Jim Webber, chief scientist at Neo4j, a few highlights:

It comes with a query language called Cypher.
Queries are optimised, and the operators chosen in a query are sometimes replaced. Queries are also compiled. Cypher is an open source project.

In Neo, transactions are supported and transaction objects are in-memory only.

Writing is through a single thread!

Graphs are indexed.

It supports clusters of database instances, using a scheme of masters and slaves to support transactions.

Graphs can be split between different machines. Transactions are coordinated when requests are for the same nodes.

Today I learned that glioma is a form of brain tumor. The risk of cancer increases exponentially with age. Roman is interested in the mechanistic details of this cancer with different inputs/parameters.

Whilst cell division increases linearly, the number of neural stem cells decreases exponentially with age.

Some of my notes from COMBINE 2016, Day 5:

----- PharmML - exchange format for models used in quantitative system pharmacology and pharmacometrics -----

PharmML and SO (Standard Output) are used to exchange models used in QSP

SBML, design, estimation and simulation tools are used in PharmML related workflows. SO captures the recording of results from various tasks.

Single subject and population level data can be handled.

SBML is directly incorporated to represent structured models within PharmML. Converters have already been developed to convert between the two. Whilst SBML covers preclinical models, PharmML is ideal for representing late clinical models.

ProbOnto ontology and a KB have also been developed. This ontology is used directly in PharmML models.

----- Biological Pathway Exchange (BioPAX) Format and Pathway Commons Update -----

PaxTools is a Java API to deal with BioPAX documents. It can export to SBGN and can utilise Cytoscape. ChiBE is a Cytoscape plugin that can be utilised. There is also a Web console. The editor takes a human readable input and produces a BioPAX ontology based data output.

Pathways Commons Database integrates a wide range of biological pathways data. Common formats used are BioPAX and PSI-MI.

The Chisio SBGN editor can be used to visualise pathways. SBGNViz.js is a standalone JS library for visualisation. Pathway Commons can be queried directly by this library. SBML files can be visualised too.

----- Introducing FSK-ML - an SBML derivate for description and exchange of executable script based models -----

FSK-ML is a food markup language. It is based on SBML Level 3.

Simulation settings are stored in SED-ML and simulation results are stored in NuML format.

FSKX files include the model metadata using PMFML format.

KNIME: an open-source alternative to Matlab, and it is free.

----- Interconversion Between BioPAX and SBML, Tramy Nguyen, Utah -----

The conversion uses the SBML's comp, qual and groups packages.

Conversion is two way between SBML and BioPAX models.

--- The Visual Protein Design Language ---
Icons are presented for representing functional domains and sites of protein for different biochemical activities such as binding, induction and dimerisation.

--- Comprehensive representation of disease mechanisms on multiple layers of granularity in SBGN PD and AF ---

Disease maps are used for interpretation and hypothesis generation.

A Systems Biology Format Converter platform can be used to convert data between different COMBINE standards.

--- Visualising differences in SBML models using SBGN and BiVeS ---

SBGN is used to visualise differences between versions of SBML models.
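
As a rough idea of what a model diff involves before it is visualised, here is a minimal Python sketch that compares the species of two SBML documents. This is not how BiVeS works internally: it is a simple set comparison on species ids, and the toy documents built by the hypothetical `toy_model` helper are trimmed down and leave out attributes that a valid SBML Level 3 file would need.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"


def toy_model(species):
    """Build a stripped-down SBML string with the given species ids (illustration only)."""
    items = "".join('<species id="%s" compartment="c"/>' % s for s in species)
    return ('<sbml xmlns="%s" level="3" version="1"><model id="m">'
            '<listOfSpecies>%s</listOfSpecies></model></sbml>' % (SBML_NS, items))


def species_ids(sbml_text):
    """Collect the ids of all species elements in an SBML document."""
    root = ET.fromstring(sbml_text)
    return {s.get("id") for s in root.iter("{%s}species" % SBML_NS)}


def diff_species(old_text, new_text):
    """Report which species ids were added or removed between two versions."""
    old, new = species_ids(old_text), species_ids(new_text)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


changes = diff_species(toy_model(["A", "B"]), toy_model(["A", "C"]))
```

The resulting lists of added and removed species are the kind of information that could then be highlighted on a diagram.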

Some notes from COMBINE2016, Day 4:
----- SBML Multi - From Specification to Application -----
Simmune, BioNetGen and Kappa are rule-based modelling approaches.
SpeciesFeature and SpeciesType are new entities defined in the SBML multi package.

libsbml-multi supports the proposal.

Species and reactions are extended to represent agents and rules.

Simmune and BioNetGen can import and export the sbml multi format.
----- MultiCellDS: a community-developed standard for curating -----
PhysiCell, Chaste and TiSim are examples of tools supporting the MultiCellDS standard.

Using this language, it is now possible to describe how populations of cells behave in particular environments.

Cells -> Cell Variants -> Measurements using these variants -> Versions of measurements

Main types of information:
- Microenvironments
- Cellular information

Metadata includes ORCID iDs and provenance.

Variables are defined using units and ontology ids

Phenotypes are divided into several groups

Target and current phenotypes can be defined.

Cells can be described using individual cell definitions or using aggregation of cells

Microscopy images are being annotated currently. There are also some early visualisation efforts.

----- Standard openEHR to Manage Experimental Data in Computational Physiology -----
It is an ISO standard, with open source specs and tools.

Online model

Another emerging standard is HL7 FHIR. Its scope is smaller than that of openEHR and it supports simpler use cases.

----- Approaches for Developing Computational Models of Immune System Formation and Function -----

Bioimaging, bioengineering, hybrid models (incorporating PDEs, RBMs and so on)

LeishSim SBML Petri-Net model. Transitions represent reactions.

Sensitivity analysis & Spartan

----- Tellurium: A Python Based Modeling and Reproducibility Platform for Systems Biology ---

Provides support for different standards.
roadrunner, pandas, matplotlib, SymPy, NumPy and WinPython are useful Python packages.

The tool supports parameter scans and bifurcation analysis.
The editor uses phraSED-ML.

SBGN support will be added soon.

To access the conda packages:

--- Modeling and Simulating Hybrid SBML Models ---
FBA and reaction based models are combined through iBioSim. FBA part is simulated using small intervals and the dynamic part of the models is then updated.

It is a proposal to encode hybrid models in SBML
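
The alternation between an FBA step and a dynamic update can be sketched in a few lines of Python. This is a toy under strong assumptions and not the iBioSim implementation: the "FBA" problem has a single uptake flux, so the optimum of its one-variable linear program is simply the upper bound on uptake, and the kinetic parameters (`vmax`, `km`, `YIELD`) are made-up values.

```python
YIELD = 0.1  # biomass produced per unit of substrate consumed (assumed value)


def fba_step(substrate, vmax=10.0, km=0.5):
    """Toy 'FBA': maximise growth subject to an uptake bound.

    Growth is YIELD * uptake, so maximising growth just means taking up
    as much substrate as the concentration-dependent bound allows.
    """
    uptake = vmax * substrate / (km + substrate)
    growth = YIELD * uptake
    return uptake, growth


def simulate(substrate=10.0, biomass=0.01, dt=0.01, steps=1000):
    """Alternate between the FBA step and an Euler update of the dynamic part."""
    for _ in range(steps):
        uptake, growth = fba_step(substrate)
        # Fluxes are assumed constant over the small interval dt.
        substrate = max(0.0, substrate - uptake * biomass * dt)
        biomass += growth * biomass * dt
    return substrate, biomass


final_substrate, final_biomass = simulate()
```

A real hybrid model would solve a full linear program over the whole reaction network at each interval; only the overall loop structure is meant to carry over.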

--- High-performance Model Simulation with libRoadRunner ---
A whole-cell model required 11 hours on a 128 node Linux cluster.

In libRoadRunner, high performance is a priority. It comes with an easy-to-use API.

It uses Antimony

scipy can be used to fit parameters using differential evolution techniques.

It can be accessed as part of the Tellurium framework.
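
To show what fitting parameters with differential evolution looks like, here is a small self-contained Python sketch. It is a hand-rolled DE/rand/1/bin loop using only the standard library, not the scipy routine mentioned in the talk, and it fits a made-up two-parameter decay curve rather than a libRoadRunner model. The synthetic data, the bounds and the control settings (`f`, `cr`, population size) are all assumptions for illustration.

```python
import math
import random


def differential_evolution(cost, bounds, pop_size=20, generations=200,
                           f=0.8, cr=0.9, seed=0):
    """Minimal DE/rand/1/bin minimiser; returns (best_vector, best_cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than i for the mutation.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is always mutated
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)  # clip to the search bounds
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:  # greedy selection
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]


# Synthetic data from y = a * exp(-k * t) with a = 2.0 and k = 0.5 (illustrative values).
times = [0.5 * n for n in range(10)]
data = [2.0 * math.exp(-0.5 * t) for t in times]


def sum_of_squares(params):
    """Squared error between the model curve and the synthetic data."""
    a, k = params
    return sum((a * math.exp(-k * t) - y) ** 2 for t, y in zip(times, data))


(best_a, best_k), best_cost = differential_evolution(
    sum_of_squares, bounds=[(0.0, 5.0), (0.0, 2.0)])
```

In practice the cost function would run a simulation of the model for each candidate parameter set and compare the result with experimental data.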

Some notes from COMBINE2016, Day 3:
----- Modelling ageing to enhance a healthy lifespan, Daryl Shanley --

The risk of chronic diseases increases exponentially with age: cancer, CVD, protein aggregation and so on.

Ageing is the result of the accumulation of molecular damage over the years. Although some cells try to repair themselves, some of them will end up with permanent phenotypes, without the ability to repair.

Interventions can now be made to reduce the rate of ageing. Eating less can increase lifespan dramatically: experiments in mice show a 100% increase. Genetic interventions are also possible. Having less food available causes less molecular damage.

Computational models can be used to propose the dose and methods of interventions. One investigation is to reduce the positive feedback of damage, which then increases exponentially. Models can be used to address this positive feedback.
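
The effect of a positive feedback on damage can be illustrated with a very small model. The sketch below assumes that damage D follows dD/dt = base_rate + feedback * D, so existing damage causes further damage and growth becomes exponential. The equation, the parameter values and the idea of an intervention halving the feedback are my own illustrative assumptions, not the model presented in the talk.

```python
def simulate_damage(feedback, base_rate=0.01, years=80, dt=0.1):
    """Euler integration of dD/dt = base_rate + feedback * D, starting from D = 0."""
    damage = 0.0
    for _ in range(int(round(years / dt))):
        damage += (base_rate + feedback * damage) * dt
    return damage


untreated = simulate_damage(feedback=0.08)
treated = simulate_damage(feedback=0.04)   # hypothetical intervention halves the feedback
no_feedback = simulate_damage(feedback=0.0)  # damage then only grows linearly
```

Comparing the three values shows how strongly the accumulated damage depends on the feedback term, which is why it is an attractive target for interventions.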

Sensitivity analysis and the validation of model simulations with selected perturbations are useful. Tools used include Cytoscape, CellDesigner, COPASI and Matlab (PottersWheel, Data2Dynamics).

PyCoTools (available through Git and can be installed using pip).

----- SigNetSim : A web-based framework for designing kinetic models of molecular signaling networks -----

libSigNetSim, a Python based library
Support hierarchical SBML models
Compatible with Jupyter

----- The ZBIT Systems Biology Software and Web Service Collection -----
KEGGTranslator: Pulls data from the KEGG database about reactions and creates SBML models.

BIOPAX2SBML: Converter from BioPAX to SBML models.

BIGG: A model database.

ModelPolisher adds annotations to FBA models.

SBMLSqueezer: A SABIO-RK parser has also been developed to extract kinetic information.

SBMLsimulator: Uses an existing systems biology library for simulations.

SBML2Latex: To produce human readable reports

--- SED-ML support in JWS Online ---
JWS Online has been dockerised and extended with additional libraries.

--- The NormSys registry for modeling standards in systems and synthetic biology ---

A platform to give an overview of modelling standards. It integrates information about modelling standards and their use in different biological applications.
