New web-based model for sharing research datasets could have huge benefits
October 15, 2012
A group of researchers has proposed creating a new web-based data network to help researchers and policymakers worldwide turn existing knowledge into real-world applications and technologies and improve science and innovation policy.
Researchers around the world have created datasets that, if interlinked with other datasets and made more broadly available, could provide the needed foundation for policy and decision makers. But these datasets are spread across countries, scientific disciplines and data providers, and appear in a variety of inconsistent forms.
Writing in the new issue of the journal Science, seven researchers propose a new data network that can help bring this knowledge together and make it available to all.
The benefits to society from such a network are clear, said Bruce Weinberg, co-author of the paper and professor of economics at Ohio State University.
“Such a network could help scientists, policymakers and business people take the knowledge that is now locked in scientific publications and create new technologies and applications,” Weinberg said. “This is a key to economic growth.”
The purpose of this new model is to make data accessible, said Laurel Haak, co-author of the paper and executive director of ORCID, an international, interdisciplinary, open, and not-for-profit organization formed to provide a registry of unique identifiers for researchers.
“Researchers lament the lack of data sharing. But a new data infrastructure has the potential to overcome that problem and potentially transform research practice itself,” Haak said.
In the Science article, the authors say that one key to making this proposed project work is to have a unified set of standards between databases and platforms. One simple example is that databases often have different ways of identifying authors. In one database, an author may be listed as “David A. Smith” while another would list the same person as “D.A. Smith.” Other researchers would have no way of knowing if these two records referred to the same author.
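The author-name problem can be illustrated with a minimal Python sketch. It is not taken from the Science paper: the database names, the third author "Diane A. Smith" and the identifier values are all made up for illustration, and the `researcher_id` field simply assumes that some shared registry of unique identifiers exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorRecord:
    source: str         # hypothetical database name
    display_name: str   # author name as that database prints it
    researcher_id: str  # hypothetical shared identifier (made-up value)


def name_key(display_name: str) -> tuple[str, str]:
    """Reduce a name to (surname, initials) for a rough comparison."""
    parts = display_name.replace(".", " ").split()
    surname = parts[-1].lower()
    initials = "".join(p[0].lower() for p in parts[:-1])
    return surname, initials


def same_by_name(a: AuthorRecord, b: AuthorRecord) -> bool:
    """Guess whether two records match using only the printed names."""
    return name_key(a.display_name) == name_key(b.display_name)


def same_by_id(a: AuthorRecord, b: AuthorRecord) -> bool:
    """Decide whether two records match using the shared identifier."""
    return a.researcher_id == b.researcher_id


# Illustrative records; every value here is invented.
a = AuthorRecord("database_a", "David A. Smith", "id-0001")
b = AuthorRecord("database_b", "D.A. Smith", "id-0001")
c = AuthorRecord("database_c", "Diane A. Smith", "id-0002")

# The raw strings differ even though a and b are the same person.
print(a.display_name == b.display_name)

# Matching on initials links a and b, but also wrongly links a and c.
print(same_by_name(a, b), same_by_name(a, c))

# A shared identifier separates the two cases.
print(same_by_id(a, b), same_by_id(a, c))
```

In this sketch, comparing names either misses the match (exact strings) or merges two different people (initials), while a common identifier field avoids both errors. That is the kind of shared standard the authors argue for.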
“We need a coordination of data exchange standards to make this effort work,” said David Baker, co-author and executive director of Consortia Advancing Standards in Research Administration Information (CASRAI), a non-profit standards development organization.
This new data infrastructure must be only a “thin layer” on top of the database structures that already exist, Baker added. “It needs to work seamlessly with the databases and platforms we already have in place. It shouldn’t add another layer of complexity.”
One major issue is achieving broad-based participation in this effort, said co-author Gregg Gordon, president and CEO of the Social Science Research Network.
“We need to have participation from researchers in all fields, whether they work in multinational corporations, non-profits, government agencies or universities,” Gordon said. “We need all the different players to work together to make this effort successful.”
Users of the infrastructure would use the public data and tools at no charge, pay for access to private areas and tools, and apply for access to security-sensitive parts of the system.
The authors of the Science paper emphasize that no single organization can manage this infrastructure alone. Governments, non-profits, and for-profits must all collaborate.
They envision a steering committee comprising members of the major data providers, including government agencies, standards organizations and private data vendors, as well as the research community.
While a lot of work needs to be done, the researchers say the effort will be worth it.
“The model we propose provides tremendous benefits from combining and mining the vast quantities of data that are already available,” the authors conclude.

Comments (12)
by GatorALLin
A cool TED talk about sharing medical data.
http://www.ted.com/talks/john_wilbanks_let_s_pool_our_medical_data.html
by Peter Kinnon
Bri perceptively remarks:
“I think this is an important step towards the world wide web becoming sentient”
He is among the groundswell of folk finally waking up to the realization that all such inevitable developments represent part of the gestation of a new predominant cognitive entity on this planet.
The construction of a “brain” that will soon equal and then surpass that typical of our species has for long been a work in progress. Not as a result of any deliberate human “design” but rather as the result of an autonomous evolutionary process that can be seen to have run its exponential course since humankind acquired the ability to share imagination, a function which we know as language.
Very real evidence indicates the rather imminent implementation of the next (non-biological) phase of the on-going evolutionary “life” process from what we at present call the Internet. It is effectively evolving by a process of self-assembly. You may have noticed that we are increasingly, in a sense, “enslaved” by our PCs, mobile phones, their apps and many other trappings of the net.
We are already largely dependent upon it for our commerce and industry and there is no turning back. What we perceive as a tool is well on its way to becoming an agent.
Consider this:
There are at present an estimated 2 Billion Internet users. There are an estimated 13 Billion neurons in the human brain. On this basis for approximation the Internet is even now only one order of magnitude below the human brain and its growth is exponential.
That is a simplification, of course. For example: Not all users have their own computer. So perhaps we could reduce that, say, tenfold. The number of switching units, transistors, if you wish, contained by all the computers connecting to the Internet and which are more analogous to individual neurons is many orders of magnitude greater than 2 Billion. Then again, this is compensated for to some extent by the fact that neurons do not appear to be binary switching devices but can adopt multiple states.
Without even crunching the numbers, we see that we must take seriously the possibility that even the present Internet may well be comparable to a human brain in processing power.
And, of course, the degree of interconnection and cross-linking of networks within networks is also growing rapidly. The culmination of this exponential growth corresponds to the event that transhumanists inappropriately call “The Singularity” but is more properly regarded as a phase transition of the on-going “life” process.
An evolutionary continuum that can be traced back at least as far as the formation of the chemical elements in stars.
The broad evolutionary model that supports this contention is outlined very informally in “The Goldilocks Effect: What Has Serendipity Ever Done For Us?”, a free download in e-book formats from the “Unusual Perspectives” website.
by GatorALLin
Years ago I was invited to be part of a group that was trying to set up some rules so that all Dental data could be shared within massive databases (mostly for electronic insurance claims, but other uses also). Year after year I went to participate and the problem often was that the experts could not agree what format should be universal and what data to include. Each University wanted to have it done their way so that their name could be attached or so they were included and part of the process and each year a few new things would come up or change. It also turned into a project where it was focused more on how to make the system “perfect” the first try than to make it useful and allow for changes as you go. The term “perfect” was more about everyone pushing to add their change or their part to it (ego based) vs. just trying to keep it simple and up and going to see how it would actually perform. I finally gave up and offered that when/if they can get a system they could agree on, then as a manufacturer the very next day we will write the code to link to it, but for now here is a system proposed from industry where thousands of actual users are doing it now, and have been for years already, with great success. The University did a great job at coming up with their idea of a “perfect” system, but it was so cluttered with stuff that in the end it was too huge or complicated to use and 10 years from that start date their system was still being discussed at a meeting to have another meeting. Most of the self appointed experts were not actively doing anything close to what I understood the goals were on a daily basis, so their expertise was mostly in theory vs. active situations.
I hope this group can focus on the data and end goals and keep the egos and other ulterior motives or agendas out of the process. I found it hypocritical that 50 different Universities who value science and doing great work had 50 different systems in place and working toward a common theme/goal was so hijacked by egos, names, grants, tenure and how waiting for a perfect system tomorrow kept them from having a functional system now.
I hope you can create a system where good ideas win and keep the focus on the end goals. Keep it transparent so there is no “black box” to how it works or how to improve it over time. Just maybe a system that is allowed to fail and then fix it and improve is part of being “perfect” the first time.
by Bri
I think this is an important step towards the world wide web becoming sentient. Reflective algorithms can be written, similar to the self aware robot NICKO. These could mine the vast data sets that we upload, as a type of sensory system, into analogous brain centers. From there a pulvinar analog would direct a conscious action algorithm from governmental centers. Preferably in a United Nations platform.
by gaoptimize
That would be great, but we all now know it isn’t the data that matters anymore, but rather the “assumptions” and “seasonal adjustments” (re: today’s retail spending numbers and last week’s headline unemployment %).
by GatorALLin
…..an interesting idea…what if there are no facts anymore….only perceptions about what those “facts” or data mean… I think the raw data should be available so any findings from it can be viewed by more than just those with connections…or those doing a study that has an agenda…. making this data open might help reduce some abuse of what it is supposed to mean. So many times I see scientific data published and the media latches on to only one tiny part that is the most sensational, but misses the main focus or message of what the data was really saying.
by Gorden Russell
Too many policymakers are already science-deniers. This will give them more learning to ignore.
by de Broglie
Politicians that don’t acknowledge that certain schools chronically underperform and certain schools chronically overperform as a result of human biodiversity are science deniers. Of course, most people are only in favor of “science” when it fits with what they want to believe about the world.
by Nancy
Exactly! How does a (rather stupid) creature create something smarter than itself on purpose? By accident sure. Through evolution, if the forcing conditions are favorable. But intentionally … no way.
by Peter Kinnon
Sure, Nancy.
The evolution of technology is an autonomous process that occurs within the collective imagination of our species. The development of its most significant current manifestation, the Internet, proceeds not as the result of deliberate design but by the myriad “selfish” desires of users. Whether for research, telemetering, gaming, tweeting, porn, artistry, commerce, terrorism or a host of other human requirements, good or bad.
An analogy can be drawn with the role of the “selfish” gene in biological evolution.
Contrary to the claims of many, nobody designed the Internet.
The anthropocentric world-view that we inherit both genetically and from our culture leads to the concept of “designers”. We intuitively assume that individuals of our race “design” things. But only in a very trivial everyday sense is this seemingly obvious notion valid.
It can be argued, with strong evidential support, that we do not invent or create artifacts or systems but that, rather, these are more properly viewed as having evolved within the collective imagination of our species.
To quickly put this counter-intuitive view into focus, would you not agree that the following statement has a sound basis?
We would have geometry without Euclid, calculus without Newton or Leibniz, the camera without Johann Zahn, the cathode ray tube without JJ Thomson, relativity (and quantum mechanics) without Einstein, the digital computer without Turing, the Internet without Vinton Cerf.
The list can, of course, be extended indefinitely.
by Gabriel
October 17, 2012 by Peter Kinnon “We would have geometry without Euclid, calculus without Newton or Leibniz, the camera without Johann Zahn, the cathode ray tube without JJ Thomson, relativity (and quantum mechanics) without Einstein, the digital computer without Turing, the Internet without Vinton Cerf.”
You know, Bri made a comment about that last bit Peter….that maybe, things like Math…or really, just things in general…we don’t and didn’t “invent” things but perhaps simply “discovered” them….that even if the chain-of-events happened differently, the creation of many things would have inevitably happened and wasn’t purely the result of the driven creator(s).
If the universe is set up in such a way that it was inevitable for a certain paradigm to take place, can we really take credit for it? I’ve heard the argument that this makes us come off as “arrogant and anthropocentric”….My argument back, was that it’s not arrogant to claim “creation-rights” because it’s still nevertheless an accomplishment that we did and we have every right to celebrate it as such.
Retro-actively, you could look at everything as an inevitability and that things were going to end up the way they did regardless of choice, options or whatever and that we simply don’t see it that way in the present….I feel however that, while this could have truth to it, that it’s still wrong and robs too much rightful joy and celebration that we should have simply because hindsight is 20/20….that progress was actually autonomous, everything is already set up for us and we didn’t actually “do” anything, or at least as much as we think we did.
I feel a middle-road is the right way of looking at things with this sort of issue….if you choose to look at life in the sense that everything is prepackaged for you, waiting to be unlocked and that we don’t actually create so much as we discover, then that’s up to you….however, even so, it’s wrong to say we still shouldn’t look at such discoveries as anthropocentric because, in the end, we are humans….we are human beings trying to master as much as we possibly could, and even if something (the end-result) was intended to happen regardless of time or “how” or whatever….it still doesn’t hide the point that it has and did happen…even if Einstein didn’t create/discover relativity and someone else did, it doesn’t dampen or make it any less wonderful to us…you could argue that it was going to happen somewhere down the line anyway, but even if that is true, it doesn’t make such creations/discoveries any less special….they were driven by our needs, our anthropocentric needs, and we always benefit from such wonders, whether or not they came out of inevitable accident or deliberate intention.
You can say it’s arrogant and anthropocentric for people to actually take credit for anything that actually happens, but I could argue that it’s equally masochistic and self-defeating to argue otherwise…if you want to have this mentality on things, I say a middle-road is the right choice. I don’t believe in looking at things in the sense that humanity must “beware” and “remember its proper place”….that’s just stupid, and yet people seem to unconsciously have that adamant way of thinking as if we weren’t all the same….we may not have centrality in the grand scheme of things, but we certainly do to ourselves and in our own lives, and who knows….
who’s to say we really won’t get the universal centrality we once thought we were already born with….call it accidental or intentional, but it seems to be what we’re aiming for, and I personally wouldn’t have it any other way.
by Peter Kinnon
Yes, Gabriel, there is no doubt that, even if they are just “picking the low hanging fruits” those individuals who fill the heroic roles have to be the right kind of folk in the right time and the right place with, perhaps most importantly, sufficient motivation. And as such they are deserving of respect within the context of our everyday world.
At the more fundamental level, though, we find that idea systems, particularly those related to technology, evolve autonomously and are subject to principles of natural selection akin to those of biology.
This is not to say that all is predetermined. If we were able to “rewind the tape of history” (à la Stephen Jay Gould) and run it again, we would not expect, say, the Eiffel Tower to re-emerge. Contingency is clearly a major player in our universe. But we could be very confident that the engineering principles which underlie its construction would be with us.
I suspect your deeper concerns lie with the question of “free will”.
Naturally we balk at the suggestion that we don’t have any, but this quite natural fear is unfounded.
Free will can now be clearly seen to be an evolutionary necessity. Simply that feature of an organism which allows optimization of interaction with its external environment. Which, of course, bears directly on its chances of survival and hence of positive natural selection. This is the point which is so often missed by many who debate the issue. Certainly there is the deterministic component of decisions provided by internal molecular mechanisms. But there is also the equally deterministic component that is input from the external environment. As well as, as it seems with all natural processes, the element of chance.
The early philosophers, of course, just did not have the information from chemistry and biology that now makes such an analysis rather trivial.
An additional point that we should always bear in mind is that although the free will of our own species is remarkably high because of the extreme level of interaction of our kind with the external world, it is in absolute terms still very limited.
So, to allay the fears engendered by the hard realization that we are not, and never will be, masters of the universe, but rather a tiny cog in the vast machinery of nature, the metaphor of the surfer is helpful.
The surfer does not pretend to be able to control the waves. Or even to fully understand their nature. But he knows enough about their properties to catch a wonderful ride. So hit the big tubes, my friend, and enjoy.
Check out the free e-books at my website for further expansions on this topic.