Blues and Yellows

Tuesday, January 27, 2009

Using the Scopus API - 1. Getting started.

Scopus is one of the most important abstract and indexing databases available to people in my University. Usually, people are happy to log in and search within the Scopus interface, but there are possible uses for including content from Scopus within a web site. Scopus announced that an API was available back in March 2008. There was some interest expressed at the time, but there is not much in the way of examples and developer notes on how to use it. I am trying to work out for myself some of the possibilities of how it can be used.

Examples

Clinical Gastroenterology and Hepatology: list of the articles with the most citations published in the journal.
Science Direct: Count of citations for an individual article.

How to get started.

Before you can do anything with the Scopus API, you need to register for an API key. The form asks for basic information about who you are (name, email address), the web site on which you are developing and a password. You will be sent a Developer Key by email. You are also asked to agree that you have read and understand the Scopus Registered User Agreement and agree to be bound by all of its terms and conditions. Rather than just tick the box, you should read the agreement to make sure you are not trying to build something that breaks the agreement.
Once registered you can access 'My Scopus API' where you can get reminders of lost developer keys. If you are developing on one machine before making your code live on another one, you will need to register for each site. Once you have registered, it is possible to register for other sites where you want to use the API.

Placing the code on your site

There is a code example on the Scopus API documentation page.
Stepping through this example we find:
1. Files on www.scopus.com

2. Two sections of javascript

There is also a form to add to the body of your web page.

<body>
<h2>Search Form:</h2>
<form name="scapiForm" onsubmit="return false">
<input type="text" name="searchString"/>
<button onClick="runSearch()" name="searchButton">SEARCH</button>
</form>
<h2>Returned SCAPI Content:</h2>
<div id="scapi">
None.
</div>
</body>

The form should look and act like this, at least to start with:

Returned SCAPI Content:

None.
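
Until the two Scopus script sections are in place, the button has nothing to call. For checking the page layout only, a stand-in along the following lines could be used. This is a sketch under stated assumptions, not the Scopus code: `escapeHtml` and `renderPlaceholder` are hypothetical helper names, and this version of `runSearch` does not contact Scopus at all. The real `runSearch` comes from the Scopus example.

```javascript
// Stand-in for testing the form only. It does NOT call the Scopus API.
// escapeHtml and renderPlaceholder are hypothetical helpers, not part of Scopus.

// Escape characters that would otherwise be read as HTML markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the text shown in the "scapi" div for a given search string.
function renderPlaceholder(query) {
  var trimmed = String(query).trim();
  if (trimmed === "") {
    return "None.";
  }
  return "Search submitted for: " + escapeHtml(trimmed);
}

// Called by the button's onClick; reads the form and updates the results div.
function runSearch() {
  var form = document.forms["scapiForm"];
  var query = form.elements["searchString"].value;
  document.getElementById("scapi").innerHTML = renderPlaceholder(query);
}
```

Placed in a script element on the same page as the form, this only echoes the search string back into the `scapi` div, which is enough to confirm that the form name, field name and div id are wired up as expected.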

In the next post on this topic, I will look at how to configure the search form and the results that are returned.

Wednesday, February 27, 2008

Technorati Profile

Technorati Profile. This post sets up a profile for me.

Thursday, January 17, 2008

DOI Cookie makes finding full text easier

Journal articles are frequently cited with just the DOI (Digital Object Identifier) as a link. This would normally take you to the copy held on the publisher's site. As a member of an institution, you may have access to the article elsewhere. Your Institution's OpenURL Resolver can tell you where, if we can get it to resolve the link for you. Setting a cookie asking the DOI site to do the redirecting for you is a simple way of achieving this.

Test this button and the following links.

DOI Cookie set

http://dx.doi.org/10.1108/00330330610681312

http://dx.doi.org/10.1108/13685200810844451

http://dx.doi.org/10.1007/s00442-002-1047-9
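
To show what "resolving the link" amounts to, here is a minimal sketch of turning one of the DOI links above into a request to an OpenURL resolver. The post does not show the exact query the DOI site sends once the cookie is set, so this assumes OpenURL 1.0 key/value syntax with the DOI passed as `rft_id`. The function name `doiToOpenUrl` and the resolver address are hypothetical; substitute your own institution's resolver.

```javascript
// Sketch only: builds an OpenURL 1.0 style link from a DOI link.
// The resolver base address passed in is an assumption, not a real service.
function doiToOpenUrl(resolverBase, doiLink) {
  // Strip the DOI proxy prefix, leaving the bare DOI (e.g. 10.1108/00330330610681312).
  var doi = doiLink.replace(/^https?:\/\/(dx\.)?doi\.org\//, "");
  // Pass the DOI to the resolver as an info:doi/ identifier.
  return resolverBase +
    "?url_ver=Z39.88-2004" +
    "&rft_id=" + encodeURIComponent("info:doi/" + doi);
}

// Example with a made-up resolver address:
var link = doiToOpenUrl(
  "https://resolver.example.edu/openurl",
  "http://dx.doi.org/10.1108/00330330610681312"
);
console.log(link);
```

The resolver then decides, from the institution's holdings, which copy of the article to send the reader to.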

Thursday, January 10, 2008

Archiving digital copies of electronic journals

There was a time, back in the 'Middle Ages', when monks would cross the continent of Europe to study and copy precious manuscripts. Umberto Eco captures this world in his novel 'The Name of the Rose'. In making an accurate copy and taking it back home with them, the monks were both distributing and preserving the work. Having a copy of a work in its archive made an institution more prestigious, and helped to preserve the work from a variety of medieval threats. But not everything was copied everywhere, and the use of manuscripts is itself a cause of them wearing out and becoming unusable for future readers.

Later scholars may puzzle over the complex social model, the reciprocal relationships, the transmission history of any individual manuscript and its relation to the original work. Which manuscripts have integrity as authentic copies? What 'trust' should be placed in a Saxon Manuscript found in the collection of a monastery in Lombardy? Where two or more copies exist, detailed comparisons would need to take place to remove doubt about integrity (all copies agree on every word of chapter 1, but there are variant readings of sentences in chapter 3).

Fast forward to the present and similar problems of integrity arise when it comes to preserving electronic material, such as the articles published in electronic journals. There are various schemes which libraries might look to or participate in, and some are based on very different models. Locking files away in a safe place for ever, or until they are the only remaining copies, is the strategy of schemes like Portico. Libraries might support this model as insurance against future loss of access.

Another model is the foundation of the LOCKSS strategy. Here the emphasis is on maintaining the integrity of a widespread network of copies. There is not just the safety feature of having multiple copies stored as a defence against the many threats of the modern age. It also enables an ongoing work of comparison to take place, ensuring that variant readings of texts do not get a chance to develop.
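
The idea of comparing copies can be illustrated with a small sketch. This is not the LOCKSS protocol, which is considerably more elaborate; it is a simplified illustration that assumes each site's copy is available as a string, and the function names are hypothetical. Each copy is reduced to a SHA-256 digest, and any copy whose digest differs from the most common one is reported as a variant.

```javascript
// Simplified illustration of comparing copies; NOT the LOCKSS protocol.
// digestOf and findVariantCopies are hypothetical names.
var nodeCrypto = require("crypto");

// Reduce a copy's content to a fixed-length fingerprint.
function digestOf(content) {
  return nodeCrypto.createHash("sha256").update(content, "utf8").digest("hex");
}

// copies maps a site name to the text of the copy held there.
// Returns the names of sites whose copy disagrees with the most common version.
function findVariantCopies(copies) {
  var sites = Object.keys(copies);
  var digests = {};
  var counts = {};
  sites.forEach(function (site) {
    var d = digestOf(copies[site]);
    digests[site] = d;
    counts[d] = (counts[d] || 0) + 1;
  });
  // The digest held by the most sites is treated as the agreed reading.
  // On a tie, the digest seen first is kept.
  var agreed = null;
  Object.keys(counts).forEach(function (d) {
    if (agreed === null || counts[d] > counts[agreed]) {
      agreed = d;
    }
  });
  return sites.filter(function (site) {
    return digests[site] !== agreed;
  });
}
```

With three sites holding a chapter, two of which agree, the odd one out is the copy to investigate or repair, much as a scholar would treat a lone variant reading in a manuscript.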

Comparing electronic objects to medieval manuscripts will only get you so far in thinking about digital preservation, but considering the social model behind any scheme is an important way of deciding between their relative claims. There is more to digital preservation than an effective backup and disaster recovery program. The threats faced by digital materials include format obsolescence (no viewers for certain file types), temporary or permanent lack of access to original versions (the publishers and their archives have 'gone away'), degradation of media (no one can play 12 inch laser discs anymore), digital degradation (the saved bitstream is 'corrupt' and will not load) and lack of context (the metadata describing the object is no longer attached).

LOCKSS may not have all the problems of digital preservation solved, but it does have some advantages for libraries. There is a sense of taking up the responsibility for preservation and keeping it within your control; it relies on open source software with a track record; and it allows librarians to decide for themselves on their collection development policy for preservation.

Some UK Libraries, including De Montfort University, have been involved in testing LOCKSS. As this Pilot comes to a close in February 2008, libraries will be deciding on whether and how to take their participation forward. It will be interesting to see which preservation technology, and which social model, lies behind the strategies that do get adopted.

Seadle, M. (2006). A Social Model for Archiving Digital Serials: LOCKSS. Serials Review, 32(2), 73-77. DOI: 10.1016/j.serrev.2006.03.007

Thursday, January 03, 2008

Institutional Repositories and Digital Preservation

Hockx-Yu, H. (2006). Digital preservation in the context of institutional repositories. Program: electronic library and information systems, 40(3), 232-243. DOI: 10.1108/00330330610681312

This article looks at the digital preservation dimension of developing institutional repositories. Helen Hockx-Yu was a programme manager working in this area for the JISC (Joint Information Systems Committee), so she is also able to outline the JISC view on Institutional Repositories and some of their initiatives to assist people developing them.

I am interested because De Montfort University Library is responsible for developing DORA: the De Montfort University Open Research Archive. I also administer the Library's LOCKSS Archive, which attempts to digitally preserve content from the electronic journal services to which the Library subscribes. There is not necessarily an overlap between these two activities: we are not using LOCKSS to preserve content from the Institutional Repository, though in theory, that may be possible.

Institutional repositories
The question 'What is an institutional repository?' is answered as 'digital collections that capture and preserve the intellectual output of a single or multi-university community'. DORA counts as at least claiming to be an attempt to capture De Montfort's intellectual output. DORA also falls within the pattern of most currently established institutional repositories in collecting 'e-prints' of scholarly publications. An IR could aim to include images, video, sound and other file formats. The wider the intake of file formats, the more problems are created from the digital preservation point of view.

Institutional Repositories are still new arrivals in the information landscape. Hockx-Yu can point to potential benefits for:

Authors: gain visibility.

Users: find information more easily (usually without charge).

Institutions: increase research profiles.

Funders: see wider dissemination of research outputs.

The push to develop IRs is partly from Institutions and those Research Councils who have mandated that research outputs be made freely available. Whether these motivating factors reach and mobilise the people doing the research has yet to be seen. The JISC has set up a number of projects, like SHERPA and ROMEO, to help overcome some of the obstacles.

Digital preservation
Digital preservation refers to the series of managed activities necessary to ensure continued access to digital materials for as long as necessary. Backing up the files may be part of this activity, but is not the whole process. For example, you can save a file, but unless you know what application you need to open it, it will not be of much use to anyone. There may be an argument for not needing to bother with digital preservation within institutional repositories. Much of the material held is a poor version of the polished articles published in commercial journals, and such journals should take on the burden of preservation as part of their service. Publishers vary in how willing they are to take on this responsibility. In my view this is one of the lessons of the LOCKSS Pilot project.
There may be other materials held in a repository, such as images or research databases, where the repository holds the primary preservation responsibility. Other JISC-sponsored projects can help to assess the costs and practices used to preserve such material: LIFE and eSPIDA.

One question that we as institutional repository managers need to address is that of format migration. Should we accept any file format submitted, or migrate those deemed to be 'non-standard' to a format based on an open standard? Hockx-Yu does not say whether PDF or the archive version of PDF is appropriate for this, though the James study¹ of file formats may offer some guidance here.

LOCKSS
The one JISC-sponsored preservation project that we have been involved with does not get any detailed treatment in Hockx-Yu's article. The LOCKSS Pilot attempts to preserve material published in commercial or open access journals. Some LOCKSS members have set up private networks to share the preservation work for material they hold. That would not be covered by the Pilot sponsored by JISC, but LOCKSS may still touch upon material held in institutional repositories.

There is a debate about who has the responsibility for digital preservation: institutions or commercial publishers. One way of gathering evidence on this problem would be to look at a set of an institution's intellectual output and check on how much was being actively preserved in a way appropriate for that institution. This could be to check against the titles preserved within Portico, if the institution is a Portico member. A similar test could be done against the 'archival units' held by a LOCKSS member. You could also compare for overlaps with the content held in an institutional repository. If only the IR is making an effort to preserve material, then by default it may have found itself with the primary preservation responsibility.
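
The check described above is a set comparison, and a minimal sketch of it might look like this. It assumes that the institution's outputs and the holdings of each scheme can be reduced to comparable identifiers (DOIs, for instance), which in practice is the hard part. The function name and the shape of the report are hypothetical.

```javascript
// Sketch of the overlap check. All names here are hypothetical.
// outputs: array of identifiers for the institution's publications.
// portico, lockss, repository: Sets of identifiers each scheme preserves.
function preservationReport(outputs, portico, lockss, repository) {
  var report = { externallyPreserved: [], repositoryOnly: [], unpreserved: [] };
  outputs.forEach(function (id) {
    if (portico.has(id) || lockss.has(id)) {
      // Preserved by Portico or LOCKSS (possibly also held in the repository).
      report.externallyPreserved.push(id);
    } else if (repository.has(id)) {
      // Only the institutional repository holds a copy.
      report.repositoryOnly.push(id);
    } else {
      // Nobody in the comparison is preserving this item.
      report.unpreserved.push(id);
    }
  });
  return report;
}
```

The `repositoryOnly` list is the interesting one: those are the items for which the IR has, by default, ended up with the primary preservation responsibility.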

¹ James, H., Ruusalepp, R., Anderson, S., Pinfield, S. (2003), Feasibility and Requirements Study on Preservation of E-Print,

Monday, April 24, 2006

Untitled

This is a test using Flock to blog.

Thursday, March 02, 2006

How busy is the Library catalogue?

February was a busier month for OPAC than January was, again with a day or more out because of the power cut which caused some downtime. The OPAC figures are harder to use to get a feel for how the service is operating. It might be interesting to compare the 7245 requests for the '/TalisPrism/placeReservation.do' script, which OPAC uses to record reservation requests, with the total number of reservations placed that month. There are other questions that we might like answering, like how many OPAC pages are viewed because they are mentioned in Reading Lists Online.
March is usually a busy month before April begins a slowdown in OPAC use which seems to last until the new academic year begins.
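
The comparison between script requests and reservations could be sketched as follows. This assumes the web server log is available as an array of text lines with the requested path somewhere in each line; the log format, the function names and the reservation total are all assumptions, since the reservation figure would have to come from the library system.

```javascript
// Sketch only: the log format and both function names are assumptions.

// Count the log lines that mention a given script path.
function countRequests(logLines, path) {
  return logLines.filter(function (line) {
    return line.indexOf(path) !== -1;
  }).length;
}

// How many requests to the script were made for each reservation recorded.
// Returns null when no reservations were placed, to avoid dividing by zero.
function requestsPerReservation(requestCount, reservationCount) {
  if (reservationCount === 0) {
    return null;
  }
  return requestCount / reservationCount;
}
```

A ratio well above 1 would suggest that many visits to the reservation script do not end in a reservation actually being placed.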

OPAC usage statistics.

Wednesday, March 01, 2006

What is popular on the Library web site?

At the start of a new month it is worth checking over the usage stats for the Library web site to see if there are any clues there on how the site is being used. There was more activity on the web site in February than in the previous month, despite it being a shorter month and having more than one day out because of power cut-offs in the Library building.
The figures show some clues about the patterns of usage. One surprising feature is the popularity of the Email directory (which has links to staff and student email as well as to Blackboard and Netstorage). This area is followed in popularity by the Databases pages. E-Journals and ExamNet are also highly used areas of the site.
You can view the February usage figures on the Library web site.
One area underreported would be the number of people who use the site to start Google searches. They would tend to use the Google searchbox on the side of the screen, and the results from the search appear on the Google web site, not the library one, despite the common look and feel.

Tuesday, February 28, 2006

Librarians as models of learning

I had the chance to think about how librarians interact with the environment in which people learn. I guess a library is a 'learning environment' in lots of ways. For example, we ordered the fixtures and fittings, the seats, tables and display boards that make a physical environment. We can help set the social environment by creating a friendly or forbidding atmosphere. It is all too easy to achieve one or other of these entirely unconsciously. We also provide the tools in the way of OPACs, web sites and authentication systems that variously assist or hinder learning.

It strikes me that another, possibly unconscious, way in which we influence the learning environment is the way we model learning behaviour to others.

This might be seen in the way we welcome or reject new ideas, or share bookmarks through services like del.icio.us. Often, though, the way librarians learn is invisible as we do not choose to share the learning process with others. What if we did share the way we learn with others? Would they be able to learn from our experience as learners?

Blues dovetailed in yellow

Blues dovetailed in yellow is a print made by Patrick Heron in 1975. It was hanging in my office for a while, but it is currently part of an art exhibition in another building.

I am not sure if it is supposed to 'be' anything, but I am interested in the way the colours butt against each other and your eye is drawn from one to the other without finally settling anywhere.

blues dovetailed in yellow