Bioinformatics & Computational Biology
At Norwich Research Park we work collaboratively to understand the mechanisms by which plants and microbes produce new materials and molecules with bioactive qualities. This knowledge can then be used to increase yields and create products of commercial and societal interest, such as personal care products, functional foods, new antibiotics and alternatives to current pesticides.
Technological advances in the last 20 years have allowed researchers to conduct large-scale experiments and to collect and analyse enormous amounts of data more easily. Life scientists are generating large data sets, for example on gene sequence, gene expression and genome modification, from experimental populations of plants or animals and from microbial communities. Research groups at Norwich Research Park use computers in biological research to help construct digital models, systems and algorithms and to analyse big data.
Bioinformatics is a field that combines biological knowledge with computer programming and big data. It is particularly useful when dealing with large amounts of data, such as those produced by genome sequencing. Computational biology uses computer science, statistics and mathematics to help solve problems, and can also include the development of algorithms, theoretical models, computational simulations and mathematical models. Put simply, the key difference between the two is that bioinformatics focuses on analysing big data, whereas computational biology is normally more focused on smaller data sets.
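As a small illustration of the kind of analysis bioinformatics involves, the sketch below computes two common summary statistics for DNA sequences: GC content and k-mer counts. The function names and the example reads are hypothetical and chosen only for illustration; they are not taken from any tool used at Norwich Research Park.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical short reads, invented for this example.
reads = ["ATGCGC", "ATATAT", "GGGCCC"]
for read in reads:
    most_common_2mer = kmer_counts(read, 2).most_common(1)
    print(read, round(gc_content(read), 2), most_common_2mer)
```

Real sequencing runs produce millions of reads rather than three, which is why efficient algorithms and substantial computing infrastructure matter in practice.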
The Leggett group at Earlham Institute uses computational methods to enhance sequencing technologies, with a particular focus on being able to analyse samples quickly and easily while out in the field. They work with tiny handheld sequencing devices, such as the MinION from Oxford Nanopore Technologies, and build digital tools that analyse the massive data output of these devices. These small devices are particularly useful for metagenomics approaches, which enable researchers to analyse genetic material recovered from environmental samples.
The group has used the MinION sequencer to analyse samples from polar oceans. Polar oceans are some of the most underexplored ecosystems in the world, and scientists have relatively little information about the microbial species that live there. Microbes that live in polar oceans, such as phytoplankton, are important in carbon capture, food webs and the biogeochemical cycles of elements such as nitrogen, iron and silica. To help with work in the field, the group has developed NanoOK, a tool for analysing real-time data from the MinION.
Researchers from the Norwich Research Park, in collaboration with Durham University, are performing large-scale molecular dynamics simulations of model lipid membranes. The image shows an atomic representation of a phospholipid bilayer composed of dipalmitoylphosphatidylcholine (orange and red), dioleoylphosphatidylcholine (cyan and sky blue) and cholesterol (magenta and yellow) molecules. Computer models help to explore the nature of lipid rafts, which play a pivotal role in the functioning of biological membranes.
ELIXIR is a Europe-wide data infrastructure for the life sciences, and Earlham Institute is the lead institute in the UK. ELIXIR aims to facilitate the sharing of life science data by integrating data sources and tools under a unified set of standards, and by providing the computing infrastructure and training needed to make these resources accessible to bioinformaticians and life scientists. This will help solve the challenge of combining a massive range of data so that new knowledge can be inferred from data sets that would otherwise not be comparable. Work on this project aims to create an integrated platform upon which rich new analyses will be built.
Dr Richard Smith's group at the John Innes Centre uses mathematical and computer simulation techniques to investigate questions in plant development. They develop simulation models that show different aspects of how plants grow, from genes, proteins and hormone signalling to cells forming tissue.
The Korcsmaros group works on digital biology research for the microbiome. Many microbes in the gut still have an unknown role, and the datasets acquired in microbiome studies can be enormous and require complex analytical methods. One of their projects aims to develop a machine learning-based systems biology workflow that can be applied to gut microbiome data to study gene expression in healthy ageing and age-related disorders. The approach is based on data from metagenomics and metatranscriptomics (the study of microbial gene expression in natural environments) and provides a robust and scalable method for processing big datasets.
The Darwin Tree of Life Project (DTOL) is part of the global Earth BioGenome Project, which aims to document and understand life on Earth. As part of DTOL, the Earlham Institute is applying expertise in single-cell genomics, bioinformatics and data management to develop robust pipelines both for sequencing novel organisms and for enabling the DTOL community to share genomic data in a findable, accessible and reproducible way.
This project will generate an invaluable open-source catalogue of data for research into how organisms develop and respond to pathogens, parasites, environmental change, and intra-species interactions. Revealing the evolutionary underpinnings of the human genome, our food sources and parasites will help to unearth processes that generate genomic diversity.
By partnering closely with leaders in high-performance computing technology, the Earlham Institute has world-class compute and storage infrastructure that allows its scientists to undertake some of the most challenging data-intensive research in genomics and the biosciences. This enables them to store, categorise and analyse more genomic data in less time, helping to decode living systems and answer crucial biological questions.