‘Superarchives’ Could Hold All Scholarly Output

Online collections by institutions may challenge the role of
journal publishers

By JEFFREY R. YOUNG, The Chronicle of Higher Education

Professors’ office computers hold a wealth of original content:
research articles, data sets, field notes, images, and the like. Some
of the material will be published in journals months or years after it
is created, but even then it will probably be available only to the
journals’ subscribers. The rest will never see the light of day.

Several colleges are now looking to share more of that work by
building "institutional repositories" online and inviting their professors
to upload copies of their research papers, data sets, and other work.
The idea is to gather as much of the intellectual output of an
institution as possible in an easy-to-search online collection. One
college has called its proposed repository a "super digital archive."

Proponents say such superarchives could increase communication
among scholars and spark greater levels of innovation, especially in
the sciences. Some imagine a day when every research university
gives its research away through the Web, allowing scholars and
nonacademics to mine it for ideas and information.

"The whole power of science is the power of shared ideas, not the
power of hidden ideas," says Paul Jones, associate professor of
information and library science at the University of North Carolina at
Chapel Hill. "Science advances when there’s a free exchange of ideas.
We move faster by being open. We know this, but we have
disincentives right now to openness."

One of those disincentives, many scholars believe, is the
scholarly-journal system, which, critics argue, has a monopoly on
scholarly output that leads to ever-soaring subscription prices.
Institutional repositories could create an alternative to journals, fans of
the archives say.

Journal publishers, meanwhile, say that such repositories are unlikely
to supplant their publications. Journals, they argue, are still the best
means of distributing and preserving research.

And even some of those supporting the new archives recognize the
difficulty of getting professors to change their habits. To make the
archives work, professors would have to take the initiative to submit
their materials, and, in some cases, persuade journals they work with
to allow them to place their articles in a repository.

Establishing a Model

The most ambitious and most closely watched superarchive is being
developed at the Massachusetts Institute of Technology. It is called
DSpace, and its goal is to collect research material from nearly every
professor at the institute — though participation will be voluntary.

"We want to give faculty the infrastructure that supports alternative
forms of publishing," says MacKenzie Smith, associate director of
technology for MIT’s libraries. Over the past two years, officials at MIT
have been building a set of software tools to support the repository,
and to make it easy for professors to submit material. Those tools are
nearly ready, and four departments and programs at MIT will be
testing them this summer.

Beginning this fall, MIT plans to open the archive to all of its
professors. "We don’t know how quickly it’s going to catch on," says
Ms. Smith, though she adds that professors have been enthusiastic
about the concept.

The biggest obstacle may be inertia. Professors are busy, and they
may not use the repository if they perceive it as more work, even if
they like it in principle, says Ms. Smith.

"We’ve gone to a lot of trouble to make the submission process very
simple for faculty," she adds.

Librarians don’t plan to actively police what goes into the repository,
though they do offer rules for what kind of work should be included.
Among those rules: Work must be "scholarly or research oriented,"
and it must be "complete and ready for ‘publication.’" Some
departments might choose to have someone serve as editor of their
department’s DSpace contributions, to read over them before they are
placed in the archive.

To make sure the new repositories don’t lead to information overload,
librarians are making sure that the materials are tagged with
"metadata" codes to help search engines navigate the sea of data.
Such tags might include keywords, the language an article is written
in, and, if applicable, publishing information about the article. The
DSpace software will add the tags using information supplied by
users, and some departments may have graduate students or staff
members handle the virtual paperwork for professors.

"The time involved for preparing the metadata is a small price to pay
for having these documents available for the long term," says Nicholas
M. Patrikalakis, a professor of ocean engineering at MIT. Mr.
Patrikalakis’s department is participating in the pilot project and plans
to upload its technical reports.

But new search tools would need to be developed to make full use of
the metadata tags. So far, traditional search engines like Google
aren’t equipped to do that — though librarians say such tools would be
relatively easy to create.
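
As a rough illustration of what such a tool might do, here is a
minimal sketch in Python. It is not DSpace code: the Record class, its
field names, the search function, and the sample records are all
hypothetical, and a real repository would use a standard metadata
scheme rather than these ad hoc tags.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class Record:
    """One archived item plus the descriptive tags supplied at submission."""
    title: str
    language: str                                   # e.g. "en" or "fr"
    keywords: List[str] = field(default_factory=list)
    published_in: Optional[str] = None              # journal citation, if any


def search(records: Iterable[Record], keyword: Optional[str] = None,
           language: Optional[str] = None) -> List[Record]:
    """Return the records that match every criterion that was supplied."""
    matches = []
    for record in records:
        tags = [k.lower() for k in record.keywords]
        # Keyword matching is case-insensitive and checks whole tags only.
        if keyword is not None and keyword.lower() not in tags:
            continue
        if language is not None and record.language != language:
            continue
        matches.append(record)
    return matches


# Made-up sample records, for illustration only.
ARCHIVE = [
    Record("Sample technical report A", "en", ["ocean engineering", "waves"]),
    Record("Sample technical report B", "fr", ["ocean engineering"]),
    Record("Sample article C", "en", ["linguistics"],
           published_in="Hypothetical Journal, vol. 1"),
]

for hit in search(ARCHIVE, keyword="Ocean Engineering", language="en"):
    print(hit.title)
```

Run as written, the sketch prints only "Sample technical report A,"
the one record tagged with both the requested keyword and the
requested language. A full-text search engine that ignores the tags
could not narrow results this way.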

Professors who use the repository won’t have to make all of their
materials public. Researchers will be allowed to select access levels
for each item they contribute. Some research may be made available
only to those within MIT, while other materials may be free to anyone.
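
A per-item access setting of that kind could be modeled along the
following lines. This is only a sketch under assumptions: the two
levels, the names Access, Item, and can_view, and the simple
membership flag are invented for illustration and do not describe how
DSpace actually handles permissions.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    PUBLIC = "public"            # free to anyone
    INSTITUTION = "institution"  # only people within the institution


@dataclass
class Item:
    title: str
    access: Access               # chosen by the contributor for each item


def can_view(item: Item, viewer_is_member: bool) -> bool:
    """Anyone may view public items; restricted items require membership."""
    return item.access is Access.PUBLIC or viewer_is_member


# Hypothetical items with different access levels.
report = Item("Sample technical report", Access.PUBLIC)
draft = Item("Sample working draft", Access.INSTITUTION)

print(can_view(report, viewer_is_member=False))
print(can_view(draft, viewer_is_member=False))
```

An outside visitor can see the public report but not the restricted
draft, so the two calls print True and False.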

Why Share?

An incentive for making material available is that sharing research
helps professors build their reputations, some experts say. Studies
suggest that the more professors open up their work, the more likely
they are to be cited by their peers.

In computer science, for instance, articles that appear online are
significantly more likely to be cited by other researchers than those
that do not appear online, according to a study of computer-science
research literature done by Steve Lawrence, a research scientist for
NEC Research Institute Inc. "The mean number of citations to offline
articles is 2.74, and the mean number of citations to online articles is
7.03, or 2.6 times greater than the number for offline articles," Mr.
Lawrence wrote last year in Nature.

Different disciplines have different attitudes about how much sharing is
appropriate, says Ms. Smith. Scientists often seek to get their
research out as soon as possible, while scholars in the humanities
might worry about someone stealing their ideas, she adds.

MIT officials say they hope institutional repositories will catch on
across academe, and they plan to make the DSpace software
available free to other colleges that want to use it. In fact, MIT plans to
lead a "federation" of libraries that want to use the software, helping
them with whatever policy issues arise, says Ms. Smith.

"We’ve had pretty serious interest in the system from about 30 major
institutions," Ms. Smith adds. The DSpace project is supported by a
$1.8-million grant from the Hewlett-Packard Company. Officials aren’t
sure exactly how much the archive will cost to maintain, though
universities already have much of the equipment in place to run digital
archives. Still, Ms. Smith estimates that DSpace could cost up to
$250,000 per year, if all of the costs were added up. The hope is that
free software tools will allow even small colleges to run repositories
using their existing resources.

Early Adopters

Meanwhile, a few other universities have begun building their own
superarchives, often at the urging of provosts or other administrators
who want to showcase their professors’ work and increase its impact.

One example is the California Institute of Technology, which has
already built an institutional repository
(http://library.caltech.edu/digital) with material from several
departments. Much of the drive for Caltech’s repository came from its
provost, Steven E. Koonin, who is also a professor of theoretical
physics.

"We do outreach and public education in so many different
dimensions," says Mr. Koonin. "Why aren’t we doing the same with
the scholarly information we produce, which is the core of what the
research institution does, most of which is funded by the public?"

Setting up the framework for an archive was the easy part, however.
Getting professors to contribute is proving more difficult. "It’s a slow
process," says Eric F. Van de Velde, director of library information
technology at Caltech. "We talk to people all the time" to try to get
them to include material, he adds. "This is not foremost on the mind
of any faculty member, and changing the work flow kind of takes
time." So far, about 600 papers are in the archive, which has been in
place for the past few years.

Another superarchive was recently created to serve the University of
California system. The archive, called the Scholarship Repository, is
run by the system’s California Digital Library
(http://escholarship.cdlib.org).

Colleges setting up repositories also have to set clear guidelines for
who owns the copyright to the materials. At Caltech, professors retain
copyright to anything placed in the archive, but they must sign waivers
allowing the university nonexclusive rights to keep copies in its
collection.

But professors don’t always have the right to place their published
papers in archives, or even on their own Web pages. Many journals
require scholars to sign over all rights to works that are accepted for
publication.

Several journals have recently changed their copyright policies,
however, to allow authors to place copies of their papers in personal or
institutional archives. But some publishers that have made such
policy changes, such as the American Physical Society, don’t make
it easy for professors. Scholars must make their own Web versions of
their articles by revising their own drafts to reflect editors’ changes.

Librarians discourage professors from ever removing work from their
repositories, so that once a paper is archived, it’s there for good. "We
don’t want this to become like a bulletin board," says Mr. Van de
Velde. "We want this to be a serious form of dissemination."

The repositories also encourage dissemination of materials that once
remained hidden, including photographs and other multimedia. "The
more you go out and investigate what’s going on in the faculty, the
more you discover the rich, rich assets that are there," says Joseph J.
Branin, director of libraries at Ohio State University, which is setting
up the framework for an institutional repository called the OSU
Knowledge Bank
(http://www.lib.ohio-state.edu/Lib_Info/scholarcom/KBproposal.html).

Some professors have expressed concerns that universities might try
to profit from the new repositories. Planning materials for
superarchives at both MIT and OSU contain suggestions for how the
university could charge a fee for access to selected materials.

But proponents of the repositories say that universities have an
incentive to make the archives free to all. "The special literature that is
at issue here … is worth incomparably more to researchers and their
institutions through its research impact than through any pennies that
could be made from charging pay-per-view tolls," says Stevan Harnad,
a professor of cognitive science at the University of Southampton, in
Britain.

Changing Role for Journals?

The do-it-yourself, or self-archiving, approach by colleges establishes
a new front in the struggle between colleges and journal publishers
over how much research should be made available free online. College
administrators have long been frustrated by the current
academic-publishing system, in which colleges pay the overhead
costs for the research and then must pay again to get access to the
research results.

Since last year, more than 30,000 scientists have pledged to boycott
journals that do not make their content free online no later than six
months after initial publication. But despite the pledge drive, led by
a group called the Public Library of Science, few scientists have
actually withheld their articles, and few publishers have changed their
ways.

Many of those who are active in the Public Library of Science boycott
are now working to help spark alternative outlets for scientific
publishing, such as institutional repositories. Even so, many who are
working to build institutional repositories say they aren’t trying to put
publishers out of business. Instead, they say their efforts may change
the role publishers play.

"Obviously, the information revolution is causing us to rethink how we
do scholarly communication and dissemination," says Mr. Koonin,
Caltech’s provost. If colleges can handle distribution on their own,
journals may focus on managing peer review and lending their seal of
approval to the best scholarship, and they may charge authors rather
than subscribers for their services.

"The print journals bundle together [several activities] — refereeing,
editorial standards, dissemination and marketing," Mr. Koonin says.
"What the technology starts to let you do is to unbundle those. You
could have dissemination done by one organization or mechanism,
but peer review done by another one."

Nice Idea, in Theory

"That will not work," says Arie Jongejan, chief executive officer of
Elsevier Science and Technology, a division of Reed Elsevier, one of
the largest commercial academic publishers. "You need publishers to
organize that process."

"If I was a researcher, I would be scared to death to make myself
dependent on that solution [institutional repositories]," adds Mr.
Jongejan. Journals, he says, "do things very efficiently and very
smoothly."

Elsevier does allow its authors to publish their papers in institutional
repositories or other noncommercial archives, provided that the
authors ask permission first. Mr. Jongejan says that fewer than 5
percent of authors ask.

Other attempts at widespread reform of academic publishing have
fallen short. For instance, physicists have built a successful online
archive of pre-prints — articles that are distributed before being
reviewed by journals. That effort began more than 10 years ago, and
some scholars predicted that other disciplines would soon build their
own online pre-print archives. But few disciplines have even tried.

The reason is that disciplines are not the right agent for change, says
Mr. Harnad, of the University of Southampton, who is an outspoken
proponent of nontraditional academic publishing. "The right entity for
all of this is the university," he says. "There is no entity behind a
discipline," he adds, but universities have an economic incentive to
try to reduce the cost of scientific publishing.

"We are in this confusing stage where it’s very difficult to say what it’s
going to be like 10 years out," says Lorcan Dempsey, vice president
for research at OCLC Online Computer Library Center, a nonprofit
library group. "The patterns of research and learning and
communication are really shifting."

Most institutions are waiting to see how DSpace and other
repositories develop before they join in, says Richard K. Johnson,
enterprise director of the Scholarly Publishing and Academic
Resources Coalition, an alliance of research institutions, libraries, and
organizations that encourages competition in scholarly
communications.

"A lot of institutions are thinking about this right now," Mr. Johnson
says. "Over the course of the next year or so, we’ll see quite a few of
them beginning to deploy."

One way or another, colleges seem interested in collecting and
showing off more of their scholars’ work online. As Mr. Dempsey puts
it: "I think there’s greater attention being paid to the whole range of
informational assets on campuses."

TOOLS TO BUILD A ‘SUPERARCHIVE’

Several new free tools are available or under development to help
colleges create "institutional repositories," superarchives of all
research generated by their faculty members.

DSPACE

What: Massachusetts Institute of Technology’s project to develop a
superarchive, as well as software tools for creating and maintaining
the repository. The tools will be offered to other colleges that want to
use them.

When: DSpace has been under development for two years. The
university is testing it this summer, and plans to make the software
available free to anyone in the fall, when the university will invite all
professors at MIT to contribute to its archive.

Where: http://web.mit.edu/dspace

EPRINTS.ORG

What: Free software developed at the University of Southampton, in
Britain, to help individual scholars, departments, or universities create
archives of research papers online.

When: Available since 2000. An updated version was released this
year.

Where: http://www.eprints.org

OPEN ARCHIVES INITIATIVE

What: A series of "metadata" codes that librarians or others can
attach to research papers to help search engines pull out desired
information.

When: Available since 1999. An updated version was released last
month.

Where: http://www.openarchives.org

http://chronicle.com/free/v48/i43/43a02901.htm
