Predicting How Splicing Errors Impact Disease Risk
Cells make proteins based on blueprints encoded in our genes. These blueprints are copied into a raw RNA message, which must be edited, or spliced, to form a mature message that can direct the cellular machinery that synthesizes proteins. Georgian Technical University scientists have rigorously analyzed how mutations can alter RNA messages at the splice site that marks the start of an intron (the 5-prime splice site). The labels 1 and 2 here indicate those positions in a hypothetical raw RNA message. The aim is to be able to predict how errors at these sites will affect protein synthesis. Some errors lead to serious illnesses.
No one knows how many times in a day, or even an hour, the trillions of cells in our body need to make proteins. But we do know that it’s going on all the time, on a massive scale. We also know that every time this happens, an editing process takes place in the cell nucleus. Called RNA splicing, it makes sure that the RNA “instructions” sent to cellular protein factories correspond precisely with the blueprint encoded in our genes.
Researchers led by Professor X and Assistant Professor Y are teasing out the rules that guide how cells process these RNA messages, enabling better predictions about the impact of specific genetic mutations that affect this process. This, in turn, will help assess how certain mutations affect a person’s risk for disease.
Splicing removes interrupting segments called introns from the raw, unedited RNA copy of a gene, leaving only the exons, or protein-coding regions. There are over 200,000 introns in the human genome, and if they are spliced out imprecisely, cells will generate faulty proteins. The results can be life-threatening: about 14% of the single-letter mutations that have been linked to human diseases are thought to occur within the DNA sequences that flag intron positions in the genome.
The cell’s splicing machinery seeks “splice sites” to correctly remove introns from a raw RNA message. Splice sites throughout the genome are similar but not identical, and small changes don’t always impair splicing efficiency. For the splice site at the beginning of an intron, known as its 5′ (“five-prime”) splice site, X says, “we know that at the first and second [DNA-letter] position, mutations have a very strong impact. Mutations elsewhere in the intron can have dramatic effects or no effect or something in between.”
That has made it hard to predict how mutations at splice sites within disease-linked genes will affect patients. For example, mutations in the genes BRCA1 or BRCA2 can increase a woman’s risk of breast and ovarian cancer, but not every mutation is harmful.
In experiments led by Z, a postdoc in X’s lab, the team created 5′ splice sites with every possible combination of DNA letters, then measured how well the associated introns were removed from a larger piece of RNA. For their experiments they used introns from three disease-associated genes: BRCA2 and two genes in which mutations cause neurodegenerative diseases, IKBKAP and SMN1.
In one intron of each of the three genes, the team tested over 32,000 5′ splice sites. They found that specific DNA sequences corresponded with similar splicing efficiency or inefficiency in different introns. This is a step toward making general predictions. But they also found that other features of each gene, the larger context, tended to modify the impact in each specific case. In other words, how a mutation within a given 5′ splice site will affect splicing is somewhat predictable, but it is also influenced by context beyond the splice site itself.
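To give a sense of where a figure like "over 32,000" can come from, the sketch below enumerates one hypothetical library design. The article does not describe the layout of the tested sites, so everything here is an assumption: a 9-letter site made of 3 exon letters followed by 6 intron letters, with intron position 1 fixed as G, position 2 restricted to T or C, and every other position free to be any of the four DNA letters. The names `BASES` and `enumerate_splice_sites` are illustrative only.

```python
from itertools import product

BASES = "ACGT"  # the four DNA letters


def enumerate_splice_sites():
    """Yield every 9-letter site under the assumed layout NNN/G[T or C]NNNN.

    Assumption (not stated in the article): 3 free exon letters, then a
    fixed G at intron position 1, T or C at intron position 2, and
    4 free letters at intron positions 3 to 6.
    """
    for exon in product(BASES, repeat=3):
        for second in "TC":
            for tail in product(BASES, repeat=4):
                yield "".join(exon) + "G" + second + "".join(tail)


sites = list(enumerate_splice_sites())

# 4**3 exon choices * 2 choices at position 2 * 4**4 tail choices
print(len(sites))
```

Under these assumptions the count is 4^3 × 2 × 4^4 = 32,768, which is consistent with "over 32,000" but should not be read as confirmation of the team's actual design.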
X says this knowledge will help better predict the impact of splice-site mutations, but a deeper investigation is needed.