Principal Investigators’ opinions on Open Data

January 29, 2010

A goal of the CLARION project is to make it easier for scientists to release their experimental results into the public domain as Open Data. We’ve been talking to some Principal Investigators (PIs) in the Chemistry department to hear their attitudes towards releasing data.

For all the PIs interviewed, the need to release data is not at the forefront of their minds. During the introductory preamble, they tend to look at you with a “Why are you asking me?” expression. The trick to making them think about open data is to find an angle that concerns them. Three things that are close to a PI’s heart are money, publications, and visibility. Good questions to ask are:

• Do any of your public-funding agencies require you to make your research results public?
• Have you needed to provide supporting experimental data for any of your papers?
• Would you like to increase the visibility and citations of your work?

Questions such as these help the researcher to realise that making their research data open could be advantageous to them – and that IT solutions could help them do it.

Almost without exception, the PIs approve of the concept of Open Data. Several of them actively post data into public databases such as the Protein Data Bank. However, we find a range of opinions as to the timing of release. Some are happy to release data almost as soon as it’s collected; others only after a paper has been published; and others only after any intellectual property has been patented. As might be expected, the desire to protect intellectual property seems to correlate inversely with the “pureness” of the work. The more applied the science, the more patentable the work, and hence the greater the need to be sensitive to protecting IP.

A common concern from a PI is whether their group’s data would be useful to anyone else. It’s difficult to answer this with anything beyond “Well, you never know until you try it”. But again, a good way to help them think is to ask what data they’d like to see from other researchers in their area, or from papers that they’ve read. They will almost always say that there’s something they’d like to see – commonly, the supporting data used for a graph.

A diversity of opinion is good; diversity is the seed from which the fittest will grow. However, it does tend to complicate any IT solution…