Campus marks 20th anniversary of posting the assembled genome to the Internet
By Alexis Morgan
Communications Manager, UC Santa Cruz Genomics Institute
July 7, 2020 — Santa Cruz, CA
(Photo above: A copy of the first draft of the human genome sequence burned onto a CD-ROM at UCSC is presented to President Clinton and deposited in the Smithsonian in 2001. Contributed)
UC Santa Cruz researchers played a critical role in ensuring the human genome would be free for everyone, forever
The human genome sequence — spelled out in 3.2 billion units of DNA strung together on chromosomes — represents the complete genetic instructions for human life. Decades ago, interest in deciphering our genome sparked a revolution in biomedical research, setting the stage for genomics-based improvements in the diagnosis and treatment of disease.
Researchers at UC Santa Cruz played a crucial role in early planning of the project and in assembling the genome sequence. They continue to have a major role in the ongoing analysis of the human genome.
“UC Santa Cruz scientists launched consideration of this vast project in 1985 and then, 15 years later, played a key role in the integration of the massive data banks into a coherent whole,” UC Santa Cruz Chancellor Robert L. Sinsheimer said in 2001. “The campus can be proud of its role in this historic endeavor,” he concluded.
At the time, President Bill Clinton called the Human Genome Project a historic achievement, one that would revolutionize the diagnosis, prevention, and treatment of disease.
The first serious push toward sequencing the human genome actually began in 1984 when Sinsheimer proposed to UC President David Gardner that an Institute to Sequence the Human Genome be established on the UCSC campus. While the proposal was not funded, Sinsheimer held onto the basic idea. He discussed it with other molecular biologists at UCSC — including Harry Noller, Robert Edgar, and Robert Ludwig — and they decided to convene a workshop to explore the idea.
The 1985 UCSC workshop planted the idea of sequencing the human genome within the core group of scientists who attended. At around the same time, other scientists made independent proposals to sequence the human genome.
Ultimately, the Human Genome Project was launched in 1990 as an international scientific collaboration, jointly run by the Department of Energy and the National Institutes of Health.
In December 1999, UC Santa Cruz re-entered the stage with a dramatic, starring role as the Human Genome Project neared its goal of completing the first draft of the sequence. Project leaders at the international consortium asked David Haussler, then a professor of computer science at UCSC, to help with the analysis of the human genome by finding the locations of genes along the chromosomes.
However, before that work could get started, the underlying genome sequence had to be assembled from fragmented data obtained by the project’s sequencing laboratories.
This proved to be more difficult than anticipated. And by May 2000, Haussler was worried. A private company, Celera Genomics, was beating a path to the prize with a big budget and what was reported to be the most powerful computer cluster in civilian use. Haussler knew the public project had a formidable competitor in Celera, whose goals, as a commercial endeavor, may not have been aligned with those of the public project. Celera’s assembly project was led by Gene Myers, a talented friend from Haussler’s graduate student days.
“It became evident the consortium was really in a bad position,” Haussler said. “They hadn’t found any way to do the assembly in time and our skunkworks project wasn’t producing a way to do it quickly enough, either.”
Jim Kent, then a UC Santa Cruz biology graduate student, was similarly worried. If a private corporation sequenced the human genome first, Kent feared that thousands of genes might be patented, hamstringing the free dissemination of information to anyone who needed it.
“It was game on,” remembered Kent. Accepting the challenge before him, Kent began secretly working, writing and testing a program he thought might beat Celera to the punch. On May 22, Kent emailed Haussler saying he thought he had found the answer.
“It was a classic scene,” recalled Haussler. Kent, huddled in the garage office of his Seabright home for the next month, wrote 10,000 lines of code, working so furiously and for so many hours that he had to ice his wrists in order to go on.
On June 22, four days before the White House announcement and three days before Celera finished its computer assembly, Kent ran the code to complete the first successful draft sequence of the human genome. The code ran on 100 off-the-shelf Dell computers, purchased with chancellor’s funds, that the team had connected in parallel.
The announcement of the successful sequencing of the human genome was hailed around the globe. Haussler and Kent had the honor of posting the first human genome on the Internet on July 7.
“That was truly a great moment in UC Santa Cruz history. It was an emotional day,” said Haussler. “Humanity got its first glimpse of its genetic heritage, an ocean of As, Cs, Gs and Ts representing the cumulative successes and failures of innumerable generations that went before us.”
And thanks to UC Santa Cruz, it was there for everyone to see, free and unrestricted.
And see they did.
That first day, half a trillion bytes of information flowed out from UC Santa Cruz’s servers. Three months later, Kent debuted the UCSC Genome Browser, a web-based “microscope” designed for exploring the human genome sequence. Of course, it was available free to anyone who wanted it.
“It all started here,” said Francis Collins, then director of the National Human Genome Research Institute, and now director of the National Institutes of Health, speaking before a capacity crowd at UC Santa Cruz’s Human Genome Symposium in 2001. In his keynote address that day, Collins recognized the “absolutely critical role” of UC Santa Cruz researchers in assembling the genome sequence, as well as their ongoing contributions to the Human Genome Project. He noted, “Without the computational work of David Haussler and graduate student Jim Kent, we would not have seen the genome emerge in the way I can tell you about here.”
Today, more than 100,000 researchers worldwide use the UCSC Genome Browser each month in their work to uncover the causes of disease and develop new treatments. For Kent, the Browser “was one of the joys of the human genome project.”
“We were conscious we were doing the right thing,” Haussler said of their work on the genome and the free and open Browser. “We knew there would be enormous medical benefit and scientific understanding that would come from this.”
Both Haussler and Kent also say what happened at UC Santa Cruz might not have happened anywhere else.
“I appreciate Santa Cruz for its extraordinary boldness, its willingness to do things differently, unrestricted by the traditional approaches,” said Haussler, sitting in his cluttered campus office on Science Hill. “Even though we’re a small university, we ask big questions.”
The landmark achievement of the Human Genome Project and the publication of the first human genome marked the transition to what has been called the post-genomic era. Today, revolutionary technologies elucidate the functions of human genes and the variations of those genes that make each of us unique. The post-genomic era has seen the arrival of precision medicine, based on deep understanding of key molecular mechanisms and capable of delivering treatments individually tailored to the genetic makeup of each patient.
And UC Santa Cruz researchers remain at the forefront of these applications, proud to be part of a community of scientists using computers to solve the mystery of what it means to be human.
The Genomics Institute is grateful to UC Santa Cruz authors Peggy Townsend and Tim Stephens, whose reporting contributed to this story.