The BioResource has launched a project using Long Read Sequencing (LRS), a new technology that may enable diagnoses where earlier sequencing methods could not. The initial focus is on rare disease patients, and the project will also include participants from our Eating Disorders cohort (EDGI) and the Genes & Cognition Study, a large-scale study that aims to investigate brain function and cognitive decline.
Between 2013 and 2018, the Rare Diseases BioResource carried out a short-read Whole Genome Sequencing (srWGS) study involving around 8,000 participants and family members with rare conditions. This work provided a genetic diagnosis for an average of 20% of the participating patients. For some, it led to changes in treatment; for others, receiving a diagnosis brought clarity and reassurance. Through LRS, we aim to give researchers new tools to deliver quicker, more accurate diagnoses and increase the number of effective treatment options.
DNA sequencing technologies have advanced significantly in recent decades. Earlier methods generate many small DNA fragments, which are then assembled to identify variations. However, that is like doing a very big, complicated jigsaw puzzle with billions of pieces.
For this project, we are using Long Read Sequencing, the latest technology. This method reads much longer sections of DNA, so that very similar-looking sections can be put together in the right place.
This makes piecing together the enormous jigsaw much simpler. We hope that this will enable us to identify variations in the DNA that might have been missed when analysing sequences produced using the older sequencing methods.
It is hoped that this study will lead to:
The project aims to recruit up to 3,000 patients with approved rare disease conditions by March 2026, with a possible extension. Participation requires a clinician to apply on behalf of the patient, as self‑referral is not allowed.
The NIHR BioResource also plans to expand recruitment to additional rare diseases and continues to add new approved projects.
“The NIHR Rare Diseases BioResource is aiming to build upon the experience from the Whole Genome Sequencing Project by undertaking the Long Read Sequencing Project. This is a cutting-edge new technology, and we are very excited to be involved in testing its utility for diagnosis or discovery of new causes across our Rare Disease studies.
“Many people and their families are affected by rare genetic diseases and we are committed to enhancing the knowledge base available for researchers and reducing the diagnostic odyssey for the patients.”
- Dr Kathy Stirrups – NIHR BioResource Samples Team Lead
What happens if my clinician asks me to participate?
Your clinical team will approach you if you meet the inclusion criteria for the project. You would be asked to consent and join the NIHR BioResource Rare Diseases study and donate a 15 ml (~1 tablespoon) blood sample.
From this blood sample, we will isolate, analyse and store your DNA and other components from the donation for use in medical research.
We will also store your personal data on secure servers. Staff can access this information only if their job role requires it. More details can be found on our privacy information pages.
DNA (deoxyribonucleic acid) is your genetic code; it is present in almost every cell in your body and contains the instructions needed to make every part of you. However, although it is present in almost all cells, most of it does not need to be ‘read’.
Your brain cells, for example, will just need to ‘read’ and express the parts of the DNA that are important for the brain cells to function. They will not ‘read’ or use (or express) parts of the DNA that are unique to, for example, skin cells.
Genetics: the study of single genes and how they are inherited. Genetic tests can identify a specific gene or variation in a family.
Genomics: the study of all the genes (also called the genome) at once, as well as the interaction of those genes with each other and any environmental influence on these genes. A genomic test can compare more than one gene at a time and in some cases all the DNA (Whole Genome Sequencing or Long Read Sequencing).
Whole Genome Sequencing gives a ‘read out’ of all your DNA – your genome, which for humans is 3 billion bases (or letters).
Your DNA sequence is unique to you; however, some of it is shared with your relatives. By comparing your genome and understanding how it is similar or different from others, it is possible to learn more about why diseases may occur and how they may affect you.
There have been vast improvements in sequencing technology over the past decades that now enable whole human genomes to be sequenced in days.
The original next generation technology (now referred to as short read Whole Genome Sequencing) produced lots of chunks of sequence, each around 150 bases long. These could then be aligned to the reference genome (a representative human sequence of around 3 billion bases that is used to compare data from new samples). Essentially it is like doing a jigsaw with millions of tiny pieces and using the picture on the box to help put them in the right places.
The most recent technology, Long Read Sequencing (LRS), can generate much longer chunks of sequence, often averaging 10,000 to 20,000 bases in length. This makes the jigsaw much simpler and will help align the pieces to the right location.
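As a rough illustration of why longer reads make the jigsaw simpler, the sketch below estimates how many reads are needed to cover a genome of around 3 billion bases. The 30-fold coverage depth and the 15,000-base long read length are illustrative assumptions, not figures from this project.

```python
GENOME_SIZE = 3_000_000_000  # approximate human genome length in bases


def reads_needed(read_length: int, coverage: int = 30) -> int:
    """Approximate number of reads needed to cover the genome `coverage` times.

    `coverage` (how many times each base is read on average) is an
    assumed value chosen only for illustration.
    """
    total_bases = GENOME_SIZE * coverage
    # Ceiling division, so a partial read still counts as one read.
    return -(-total_bases // read_length)


short_reads = reads_needed(150)      # short read sequencing
long_reads = reads_needed(15_000)    # assumed typical long read length

print(f"Short reads needed: {short_reads:,}")
print(f"Long reads needed:  {long_reads:,}")
print(f"Roughly {short_reads // long_reads}x fewer pieces with long reads")
```

Under these assumptions the long read jigsaw has about one hundredth as many pieces, and each piece carries far more of the surrounding picture.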
Some Rare Diseases may be due to variations in highly repetitive regions of the genome, or in places where the DNA has been rearranged compared to the reference. Improving the accuracy with which the pieces are mapped and variants are identified may therefore enable a diagnosis.
Long Read Sequencing can also look at methylation of the bases within the genome. This is a method cells use to turn expression of genes on or off, regulating how much a gene is used in certain situations. Methylation could therefore be a mechanism causing the rare disease.
So, it is possible that these technological improvements may enable a diagnosis in some patients who have previously had short read sequencing. They may also add more detail to the data available and allow the discovery of new genes or mechanisms causing the rare disease, which can then be used to improve treatments.
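The difficulty with repetitive regions can be shown with a toy example. In the sketch below, a short read that lies entirely inside a repeat matches the reference in several places, while a long read that spans the repeat and its unique flanking sequence matches in only one. The sequences are invented, and exact string matching is a deliberate simplification: real alignment software tolerates mismatches and uses far more sophisticated methods.

```python
def find_all(reference: str, read: str) -> list[int]:
    """Return every start position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


# Invented toy reference: unique flanks around a motif repeated six times.
reference = "ACGTTGCA" + "CAG" * 6 + "TTAGGCAT"

short_read = "CAGCAG"                    # lies entirely inside the repeat
long_read = "TGCA" + "CAG" * 6 + "TTAG"  # spans the repeat plus both flanks

print("Short read matches at:", find_all(reference, short_read))
print("Long read matches at: ", find_all(reference, long_read))
```

The short read has several equally good placements, so its true origin is ambiguous; the long read has exactly one.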
If you decide to join and participate in the study, you will be asked to attend an appointment to sign the consent forms and donate your blood. This technology requires DNA extracted from blood, as this gives the most accurate data; DNA extracted from saliva may contain DNA from bacteria or food.
The clinic team will go through the consent process with you and take your sample.
The sample will be assigned a unique identifier so that you cannot be identified. The sample will be sent to the laboratory for processing.
Once the sample has been received in the laboratory, it will undergo processing to extract the DNA. We will also bank other components, such as plasma and serum, from the sample.
The extracted DNA is then passed to the specialised sequencing laboratory, where we carry out very careful quality control processes that fragment and profile the DNA so that we can decide on the best protocol and obtain the best fragment sizes for the sequencing.
We don’t want to over-fragment the DNA, as this will reduce the power of the Long Read technology. However, having too much very long DNA is also detrimental: it can reduce the amount of data obtained, so we might not get full genome coverage.
After the quality checks have been completed, the DNA is prepared for sequencing, using a process called library preparation. The resultant libraries are then loaded onto the sequencing machines (PromethIONs), and for the next 72 hours the machines continuously sequence the fragments and collect data.
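The link between the amount of data obtained and genome coverage can be sketched with simple arithmetic: average coverage is the total number of bases sequenced divided by the genome size. The yields used below are hypothetical values chosen for illustration, not measurements from this project.

```python
GENOME_SIZE = 3_000_000_000  # approximate human genome length in bases


def mean_coverage(total_bases_sequenced: int,
                  genome_size: int = GENOME_SIZE) -> float:
    """Average number of times each base of the genome is read."""
    return total_bases_sequenced / genome_size


# Hypothetical yields from two sequencing runs (illustrative only).
good_run = 90_000_000_000   # 90 billion bases
poor_run = 45_000_000_000   # 45 billion bases, e.g. after a lower data yield

print(f"Good run: {mean_coverage(good_run):.0f}x average coverage")
print(f"Poor run: {mean_coverage(poor_run):.0f}x average coverage")
```

Halving the yield halves the average coverage, which makes it more likely that some regions of the genome are read too few times, or not at all.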
This data is transferred to our high-performance computing facility, where the sequence fragments are aligned against the human reference genome. Then specialists will analyse the data to look for potentially causative variations.
Long read sequencing generates a lot of data: the raw data generated can be over 1 terabyte. Once it has been aligned against the reference genome and the variations have been annotated, the result is a much smaller file. This data is then analysed by bioinformaticians (computer scientists who specialise in looking at biological data), including detailed checks on the quality of the data.
When we have data from many individuals with the same Rare Disease, we can look to see if there are similarities or patterns appearing in the sequences which may be associated with the Rare Disease.
For some Rare Diseases there are known genes associated with the disease. The bioinformaticians can look at the variation within these genes and determine whether it could be responsible for causing the disease.
We work closely with the clinical teams to report and explain any research findings, so that these can be confirmed by the NHS Genomic Medicine Service where appropriate. This enables the results to be used in a patient’s clinical care, for example to provide a genetic diagnosis, support treatment options, or help inform a patient’s reproductive choices.