Open Source Databases

Revision as of 13:32, 6 June 2010

A number of very large databases are available free of charge to students, faculty, and others. Over the years, the U.S. National Science Foundation has encouraged and sometimes funded such projects. This document lists some of these databases.

Encyclopedia of Life

The goal of the Encyclopedia of Life project is to develop one Web page for each species of life on Earth. The initial goals are based on an estimate that the project will cover 10 million species. Quoting from the Website:

Welcome to the first release of the Encyclopedia of Life portal. This is the very beginning of our exciting journey to document all species of life on Earth.
Comprehensive, collaborative, ever-growing, and personalized, the Encyclopedia of Life is an ecosystem of websites that makes all key information about all life on Earth accessible to anyone, anywhere in the world.
Our goals are to:
  • Create a constantly evolving encyclopedia that lives on the Internet, with contributions from scientists and amateurs alike.
  • Transform the science of biology, and inspire a new generation of scientists, by aggregating virtually all known data about every living species.
  • Engage a wide audience of schoolchildren, educators, citizen scientists, academics and those who are just curious about Earth's species.
  • Increase our collective understanding of life on Earth, and safeguard the richest possible spectrum of biodiversity.


Quoting from a 3/24/09 announcement by NASA and Microsoft:

WASHINGTON -- NASA and Microsoft Corp. announced Tuesday plans to make planetary images and data available via the Internet under a Space Act Agreement. Through this project, NASA and Microsoft jointly will develop the technology and infrastructure necessary to make the most interesting NASA content -- including high-resolution scientific images and data from Mars and the moon -- explorable on WorldWide Telescope, Microsoft's online virtual telescope for exploring the universe.

"Making NASA's scientific and astronomical data more accessible to the public is a high priority for NASA, especially given the new administration's recent emphasis on open government and transparency," said Ed Weiler, associate administrator for NASA's Science Mission Directorate in Washington.

Under the joint agreement, NASA's Ames Research Center in Moffett Field, Calif., will process and host more than 100 terabytes of data, enough to fill 20,000 DVDs. WorldWide Telescope will incorporate the data later in 2009 and feature imagery from NASA's Mars Reconnaissance Orbiter, known as MRO. Launched in August 2005, MRO has been examining Mars with a high-resolution camera and five other instruments since 2006 and has returned more data than all other Mars missions combined.
"This collaboration between Microsoft and NASA will enable people around the world to explore new images of the moon and Mars in a rich, interactive environment through the WorldWide Telescope," said Tony Hey, corporate vice president of Microsoft External Research in Redmond, Wash. "WorldWide Telescope serves as a powerful tool for computer science researchers, educators and students to explore space and experience the excitement of computer science."

Database to Help Development of New Drugs

Duncan, David Ewing (4/29/2010). Too much data, too few drugs. Fortune. Retrieved 5/10/2010. Quoting from the article:

Like sages of old, they came to San Francisco last weekend, a group of biologists and computer scientists setting out to one-up every library ever conceived, from the great one in ancient Alexandria to Wikipedia today.
This library, however, will not consist of vellum scrolls or e-page entries. It aims to compile and make sense of genetic sequences and other raw biological data that are proliferating so fast that biology is about to move from petabytes to exabytes of data -- from quadrillions to quintillions. Just ten years ago, in 2000, all of digitized biology equaled only about 10 gigabytes (giga=billion).…
Trying to make sense of all this data is what brought two hundred scientists here to the first-ever Sage Congress. Organized by Sage Bionetworks, a new nonprofit based in Seattle, the meeting's attendees have proposed a novel solution: to create a new, open-source model to standardize and link together thousands of databases around the world -- in universities, institutes, governments, and businesses.


Chronicle of Higher Education. See the following two closely related articles:

(5/28/2010). Crowd science reaches new heights.
(6/3/2010). The growth of 'citizen science.'

Cohn, David (1/17/05). Open-Source biology evolves. Retrieved 12/13/07. Quoting from the article:

To push research forward, scientists need to draw from the best data and innovations in their field. Much of the work, however, is patented, leaving many academic and nonprofit researchers hamstrung. But an Australian organization advocating an open-source approach to biology hopes to free up biological data without violating intellectual property rights.
The battle lies between biotech companies like multinational Monsanto, who can grant or deny the legal use of biological information, and independent organizations like The Biological Innovation for Open Society, or BIOS, and Science Commons. The indies want to give scientists free access to the latest methods in biotechnology through the web.