
The first human pangenome reveals 120 million more letters in DNA

On March 23, 1997, a man opened a newspaper and changed the world forever. An advertisement in The Buffalo News caught his attention: “Wanted: 20 volunteers to participate in the Human Genome Project, a very large international scientific research initiative. […] The outcome of the project will have a tremendous impact on the future advancement of medical science.” Heeding the call, he donated a few milliliters of blood and joined a $3 billion project that in 2003 produced the so-called human reference genome, about 70% of which came from this man’s DNA, with the rest contributed by some two dozen other people. While this genetic information changed human history, it was not enough, because it failed to capture the diversity of the human species. On Wednesday, an international consortium published a more sophisticated alternative, built from the genetic sequences of 47 people from different regions of the world. It is the first draft of the so-called human pangenome.

A person’s genome – their DNA – is the instruction manual that resides in each of their cells. It is a text of over 3 billion letters (ATGGCGAGT…), where each letter is simply the initial of a chemical compound made up of varying amounts of carbon, hydrogen, oxygen and nitrogen. For example, G is guanine: C₅H₅N₅O. The genomes of any two people are 99.9% identical, but the remaining 0.1% is made up of millions of letters that make a person unique and can hide the key to their diseases. If the 2003 reference genome is a linear sequence, the new human pangenome can be thought of as a road map in which a single genome is just one route, says Benedict Paten, a computational biologist at the University of California, Santa Cruz (USA) and one of the principal investigators of the study.
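The road-map picture can be sketched in a few lines of Python. This is only a toy illustration, not the consortium's data model: the segments, the sample names and the helper functions `spell` and `letters_missing_from` are all invented here. Each node holds a short stretch of DNA, and each genome is one route through the nodes.

```python
# Toy pangenome graph: nodes hold short DNA segments, and each genome
# is a path (an ordered list of node ids) through the graph.
# All sequences and sample names are made up for illustration.
segments = {
    1: "ATGG",
    2: "C",      # variant carried by some genomes
    3: "T",      # alternative variant at the same position
    4: "GAGT",
    5: "TTAC",   # extra segment absent from the linear reference
}

paths = {
    "reference": [1, 2, 4],
    "sample_a": [1, 3, 4],
    "sample_b": [1, 2, 5, 4],
}


def spell(path):
    """Return the DNA sequence spelled out by walking a path."""
    return "".join(segments[node] for node in path)


def letters_missing_from(path_name):
    """Count letters in segments that the named path never visits."""
    visited = set(paths[path_name])
    return sum(len(seq) for node, seq in segments.items() if node not in visited)


for name, path in paths.items():
    print(name, spell(path))
print("letters missing from the reference:", letters_missing_from("reference"))
```

In this sketch the reference route spells ATGGCGAGT, while the other two routes pass through segments the reference never visits. Those unvisited segments are the toy counterpart of the extra letters a pangenome adds to a single linear reference.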

The new study adds 119 million letters to the previously used model. The study’s authors, who are grouped together in the Human Reference Pangenome Consortium, explain that the low diversity of the current reference genome has caused a “street lamp effect,” a type of observational bias that occurs when people look for something only where it is easiest to look. The classic example is a drunk man searching for his house keys on the ground under a street lamp at night. A police officer tries to help him and, after a few minutes of unsuccessful searching, asks the man if he is sure that he lost his keys there. “No, I dropped them in the park, but here’s the light,” the man replies. Scientists spent two decades looking for possible genetic variants where they were easiest to find: within the confines of the reference genome, which not only ignored human diversity but was also riddled with holes due to the technology’s lack of precision.

Benedict Paten and his colleagues have been working for years to develop new tools that can read DNA with unprecedented accuracy, with just one error every 200,000 letters. Several members of the team were also involved in the T2T consortium, which achieved the first truly complete sequencing of a human genome a year ago. Until then, only 92% had been sequenced. The remaining 8% resembled the blue sky pieces in a large jigsaw puzzle: too monotonous to easily put where they belong.

Medicine that is “fairer”.

Geneticist Karen Miga of the University of California, Santa Cruz said at a news conference on Tuesday that the diversity of the pangenome heralds a new, “fairer” era in medicine. The 47 genomes recorded so far are mainly from Africa (24) and the Americas (16), including four Peruvians from Lima, four Colombians from Medellín and eight Puerto Ricans. Six genomes are from Asia and only one is from Europe, a continent already overrepresented in genetic databases. The team’s goal is to reach 350 complete genomes in a single pangenome, to be published by mid-2024. The first draft was published in the journal Nature this Wednesday.

Computational biologist Benedict Paten from the University of California, Santa Cruz. (UCSC)

The Spanish scientist Santiago Marco, who developed algorithms and software tools for the pangenome, explained the extent of the technical challenge: today’s machines cannot read a genome all at once. Instead, they read billions of tiny fragments, randomly and repeatedly. “Putting together a person’s genome is like reconstructing a big book of 3 billion letters, putting together paragraphs and unordered pages, like it’s a big jigsaw puzzle,” said Marco, who works at the National Supercomputing Center in Barcelona, Spain. “Construction of a reference pangenome may require processing 100 times more information,” he warned.
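The jigsaw analogy can be made concrete with a deliberately simple sketch. The code below is not the method used by Marco or the consortium, and real assemblers are far more sophisticated. It is a toy greedy approach, with the function names `overlap` and `greedy_assemble` and the example fragments invented for illustration: it repeatedly glues together the two fragments whose ends overlap the most.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no remaining overlaps: leave the pieces unjoined
        merged = frags[i] + frags[j][size:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags


# Made-up fragments, read out of order and with one repeat.
reads = ["AGTTACCA", "ATGGCGA", "GCGAGTT", "GCGAGTT"]
print(greedy_assemble(reads))
```

With these four short reads the puzzle has a single solution. With billions of fragments, and with long monotonous stretches where many fragments look alike, a greedy strategy like this one can easily join the wrong pieces, which hints at why the real problem demands specialized algorithms and supercomputers.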

Francisco Martínez Jiménez, an expert in computational immunogenomics, uses the reference genome as a model in his daily work to look for specific changes in patients’ tumors at the Vall d’Hebron Institute of Oncology, also in Barcelona. If the patient’s ancestors come from South America, Africa or Southeast Asia, for example, it is “much more difficult” to detect these changes, the specialist explained, since the current reference genome consists mainly of DNA from people of European origin. “The genetic diversity in the pangenome is very relevant, especially in cancer,” he said.

Martínez Jiménez analyzed the complete genomes of more than 7,000 primary and metastatic tumors from 71 types of cancer. The results of his study, also published in the journal Nature on Wednesday, show that in certain types of tumors – such as prostate, thyroid and some breast cancers – the genetic differences between primary cancer and metastasis are “very important”, while in other tumors, such as pancreatic cancer, they are subtle. “Metastasis per se does not generally seem to be explained by a specific genomic change, but rather possibly by changes in the tumor microenvironment, such as a weakening of the immune system at certain sites or increased blood flow through blood vessels, with more nutrients,” said Martínez Jiménez, who conducted his study at Utrecht University in the Netherlands.

Biologist Benedict Paten insists the human pangenome is still only a draft and asks for patience until real medical implications are seen. “There are assembly errors – not too many, but some – that we knew we were going to make and want to correct,” he admitted. Another study co-author, Erik Garrison of the University of Tennessee, shared his enthusiasm in a statement. In his opinion, the first draft of the human pangenome is as “exciting and unexpected” as the first observations of unknown regions of our own planet or solar system, adding that in this case, however, it is something that “could define our physical structure and nature.”
