Schneider (Reinhard) Group

Data integration and knowledge management

Schneider (Reinhard) Team

Examples of the graphical features of Arena3D. Heterogeneous data types can be visualised in a 3D environment and a range of layout and cluster algorithms can be applied.

Previous and current research

It is now widely recognised that comprehensive data integration can be one of the key factors in improving the productivity and efficiency of biological research. Successful data integration helps researchers discover relationships that enable better and faster decisions, saving considerable time and money.

Over the last 20 years, biological research has seen a strong proliferation of data sources. Each research group and each new experimental technique generates valuable data, and the resulting challenges of storage, indexing, retrieval and system scalability across disparate data types are central to large-scale efforts to understand biological systems.

OnTheFly and Reflect server. Panels A to C show an annotated table (A) from a PDF full-text article, the generated popup window with information about the protein YGL227W (B), and an automatically generated protein-protein interaction network (C) of associated entities for the proteins shown in panel (A). Panel (D) shows the architecture and functionality.

Current systems biology approaches are generating data sets of rapidly growing complexity and dynamics. One major challenge is to provide mechanisms for accessing these heterogeneous data and for detecting the important information in them. We develop interactive visual data analysis techniques that use automatic data analysis pipelines. This combination of techniques allows us to analyse otherwise unmanageable amounts of complex data.

The principal aim of the group is to capture and centralise the knowledge generated by scientists in the various divisions, and to organise that knowledge so that it can be easily mined, browsed and navigated. Giving all scientists in the organisation access to this resource will foster collaborations between researchers in different cross-functional groups.

The group is involved in the following areas: 

  • data schema design and technical implementation;
  • metadata annotation with respect to experimental data;
  • design and implementation of scientific data portals;
  • providing access to, and developing further, data-mining tools (e.g. text-mining);
  • visualisation environment for systems biology data.

Future projects and goals

Our goal is to develop a comprehensive knowledge platform for the life sciences. We will first focus on biology-driven research areas, but will extend into chemistry-related fields, initially by collaborating with groups inside EMBL. Other research areas will include advanced data-mining and visualisation techniques.