I love these lines from Rex Sanders:
If the data you need still exists;
If you found the data you need;
If you understand the data you found;
If you trust the data you understand;
If you can use the data you trust;
Someone did a good job of data management.
They encapsulate the goal as well as anything I've seen.
I used it to lead into the first of what I intend to be more or less monthly informal discussion sessions with the folks I'm somewhat tongue-in-cheek referring to as Data Wranglers. We gathered in the Café at the Edge of Chaos (conveniently just a few steps from my office). I scheduled it for 4:00 with beer in the fridge and wine on the counter, gave a five-minute intro to some of the issues (essentially, what does the institution need to do to facilitate good data management?) and opened it up for discussion. These folks are not shy.
Included among the dozen who came were an Institute of Medicine member who is a staunch OA advocate and leads several biostatistics groups, the PI of a very large multi-institutional longitudinal study of stroke risk factors, a computer scientist who runs a multidisciplinary team engaged in brain mapping, the director of the clinical data warehouse, an expert in decision support systems, and a woman working with NASA to link satellite, EPA and public health data. The others were equally diverse and distinguished. A fascinating group, all of whom have a keen interest in how we manage research data.
We touched on a number of key themes:
- Concerns about data sharing contrasted with the value of data sharing
- The limitations of metadata in supplying sufficient context for data re-use
- The dangers of one-size-fits-all policies
- The need to provide good information support to investigators in response to imminent federal funder requirements for open data
- Information sharing vs data sharing
- Role of commercial interests
I have an ever-expanding list (currently about 40 people) from across the campus that I'm inviting to these sessions. My overarching goal is to build a community of interest, make connections among people who have similar concerns but may not know each other, and use these discussions to drive priorities and strategy. It's a Wicked Problem, which is what the Edge of Chaos is all about.
After 15 years of working on these issues around the demands of my day job as LHL director, for the past nine months or so I've been able to dig in full time. It's become clearer than ever that good data management requires strong collaborative efforts that cross institutional boundaries. That is very tough to do, given the way that research institutions are organized and the siloed culture of those institutions.
In most places, it's the librarians who have taken the lead, usually in developing services around DMP requirements and, increasingly, tracking the new federal funder requirements for public access to publications and data. But this is much more than a library problem.
I have been quite struck by how much my perspective has been shifted by the fact that I am doing this out of the Provost's Office rather than out of the library. My focus is on engaging intensively with researchers across disciplines, the folks in IT and OSP and compliance, and using a very organic approach to surface issues and needs. Out of that, we'll try to identify the things that the various components of the university can do to help us all do a better job managing research data. My monthly Data Wranglers discussions are a key component of that approach.
I've come to appreciate that the challenges in achieving Rex Sanders' vision across the entire institution are practically insurmountable. I've always had a deep empathy for Don Quixote's battle with the windmills. That must be why I'm having such a good time.