Within each and every one of your body’s nucleated cells lies more than two metres’ worth of genetic information in the form of deoxyribonucleic acid (DNA). Through highly specialised biological processes, these molecules can be condensed by a factor of more than 10 000, allowing all of your genetic information to fit in a space roughly one hundredth the size of the full stop at the end of this sentence.
What might surprise you even more is the fact that these bundles of DNA within you are divided into approximately 3bn pieces. Each ‘piece’ is known as a base. With just four types of bases (adenine, cytosine, thymine and guanine), it would take around 16 000 paper copies of Nouse, from front to back, just to write out the entirety of your genetic code. The human genetic code (also known as the genome) was first sequenced between 1988 and 2003; titled the Human Genome Project, the huge feat was an international effort and had a hefty price tag of $3.8bn (USD). Within the last 14 years, however, technological advancements have brought down the cost of the process to around $1000 per individual. One might ask why this collection of genetic information is important, or how it is stored and handled. The answers to all of these questions lie within a subsection of biology known as bioinformatics.
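To give a feel for the storage side of that question, here is a back-of-the-envelope sketch in Python. It assumes the simplest possible encoding, in which each of the four bases is written as a fixed-length binary code, and it ignores real file formats, quality information and compression. The function name `genome_size_bytes` is made up for this illustration.

```python
import math

BASES = "ACGT"                  # adenine, cytosine, guanine, thymine
GENOME_LENGTH = 3_000_000_000   # approximate number of bases quoted above


def genome_size_bytes(n_bases: int, alphabet_size: int = len(BASES)) -> int:
    """Bytes needed to store n_bases with a fixed-length binary code per base."""
    # Four possible symbols need log2(4) = 2 bits each.
    bits_per_base = math.ceil(math.log2(alphabet_size))
    total_bits = n_bases * bits_per_base
    # Round up to a whole number of bytes (8 bits per byte).
    return (total_bits + 7) // 8


size = genome_size_bytes(GENOME_LENGTH)
print(f"{size / 1_000_000:.0f} MB")  # prints "750 MB"
```

Under these assumptions, the bare sequence of a genome would fit on a single CD with room to spare; what takes up the space in practice is everything stored alongside it.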
Upon first glance, ‘ensembl.org’ seems to be nothing more than yet another website of scientific importance that requires years of prior knowledge to understand and appreciate. However, with just a few clicks of your mouse, you can gain access to decades of scientific information. The fruits of the Human Genome Project are stored alongside the genetic information of 113 other species ranging from the giant panda (Ailuropoda melanoleuca) to multiple prokaryotes.
A team based in Cambridgeshire, UK, manages the site, while scientists from around the world can contribute their findings as they see fit. Ensembl and other open-source databases with similar goals are at the forefront of the bioinformatic data rush. However, it is not how we keep this information that excites scientists most, but what we are doing with it.
Put simply, the goal of bioinformatics is to analyse biological data through statistical and computational methods in order to learn more from that data than we knew before. A great example of this would be sequencing the genomes of multiple organisms in the hope of determining how they are related evolutionarily. By sequencing and comparing the genomes of chimpanzees (Pan troglodytes) and humans (Homo sapiens) in 2005, scientists were able to confirm that chimpanzees are our closest living relatives, based on the fact that they share nearly 99 per cent of our genetic information.
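The idea behind such a comparison can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the 2005 study: it assumes the two sequences have already been aligned to the same length, and the two short fragments are invented for the example rather than taken from any real genome.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two aligned sequences carry the same base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare the sequences position by position, ignoring letter case.
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Two made-up 20-base fragments that differ at a single position.
fragment_one = "ATGGCGTACCTGAACGTTAG"
fragment_two = "ATGGCGTACCTGAACGTCAG"

print(f"{percent_identity(fragment_one, fragment_two):.1f}% identical")  # prints "95.0% identical"
```

Real genome comparisons must first work out which stretches of one genome correspond to which stretches of the other, and must account for inserted and deleted bases, which is where most of the computational effort goes.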
Along with being able to compare genomes for the purposes of evolutionary classification, bioinformatics is also a useful tool for identifying specific regional mutations in the genome which can be linked to disease. An example of a disease where a genetic mutation is the main cause is Huntington’s disease. Characterised as a progressive brain condition, Huntington’s disease alters the nerve cells of the affected individual, often causing uncontrolled movement, changes in emotion and altered cognitive capacity.
Throughout the late 1970s and early 1980s, Dr. Nancy Wexler worked with a team of geneticists to identify the genetic mutation which caused Huntington’s disease. This was done by hand, using pen and paper, and took years. Using bioinformatic techniques, similar studies can now be completed in a matter of days. These are just a few examples of what can be done with this increasingly important subsection of biology. Others include searching for new target genes for antibiotics, predicting the functions of specific proteins and editing genomes to create more resistant crops. This is just the tip of the iceberg for bioinformatics and, as with the other fields of science, only time will tell what the future will bring.