What’s next for Mozilla Science Lab

Cross-posted on the Mozilla Science blog. Image courtesy of Mozilla Festival, under CC-BY 2.0

We’ve had one hell of a year at the Mozilla Science Lab. Thank you. And now, I’m excited to announce some new job titles and what’s next for our team. But first, a quick look back.

In the past year, the Science Lab has been in startup mode – we brought on three new staff members and welcomed four Mozilla Fellows, ran a number of in-person workshops and meetings to refine our learning strategy, shipped two new curricula, and brought hundreds of people together around the world through sprints and the Mozilla Festival. In that time, we’ve learned a lot about what it means to work meaningfully and thoughtfully alongside members of the research community to further open practice, and how to build and sustain momentum. We’ve also sharpened our focus as an organization, building on what Mozilla does best: connecting communities through open curricula, global campaigns, training, policy, and events around issues like access to information, privacy, and digital inclusion.

The Science Lab is one of a number of programs at Mozilla that work to enable, empower and activate communities of developers, educators, advocates and more. We do this primarily through fellowships, mentorship and project-based learning (specifically, through sprints and open source projects). Other programs focus on learning and education, advocacy, the internet of things, and women and web literacy. Together they make up what we’re calling the Mozilla Leadership Network, and they represent a new way of working at Mozilla: thinking of our communities as part of something more – a network, rather than individual groups on their own. Across those programs there are a number of shared challenges, from closed paradigms and lack of access to knowledge to privacy and lack of training. These challenges are not specific to science; each program carries this work in its own way to meet the needs of its community. There are also shared sources of inspiration for our work, from civic tech and open government models to the open source movement and OER.

Here’s how we’re making our commitment to the Mozilla Network and our work with communities even stronger.

First, we’ll continue to invest in the network of open science leaders through the fellowships, study groups, Working Open workshops and mentorship. We’re also working on a mini-grant scheme (more this fall!) to further invest in our mission, and help catalyze prototyping efforts and support local organizers and mentors.

Second, the program leads for the Science Lab, Learning programs, Open Internet of Things, Women and Web Literacy and Advocacy Network will operate as one unified team, rather than in isolation.

Third, we’re restructuring programs: for the Science Lab, Stephanie Wright, Zannah Marsh and Aurelia Moser will lead our work to make research open and accessible through fellowships, mentorship, and learning through open source projects and prototyping. Stephanie Wright, our Open Data Training Lead, is taking over leadership of the Science Lab, and our Fellows (current and alumni), Study Group leads, and mentors will take a more active role in supporting the community, helping us run workshops and provide expertise on what it means to work openly.

Abby Cabunoc Mayes, the Science Lab’s lead developer, is graduating into a new role leading developer engagement and contributorship (much like she’s done for sprints, Collaborate and “working open” in science) for the Mozilla Leadership Network. Arliss Collins is now serving as the Data and Metrics Analyst for the network, extending the data-driven approach to understanding community engagement that she developed for the Science Lab’s learning and project-based initiatives to the other programs as well. Both Abby’s and Arliss’s roles still involve supporting the Science Lab’s work while modelling that work for the broader organization – a huge step forward.

My role is also evolving. I will now oversee the four programs of the Mozilla Leadership Network (science, learning, women and web literacy and internet of things) while working closely with the advocacy network to ensure that we can work together to make transformative change globally. It’s been an honor to be the first director of the Science Lab. Now, we will take what we’ve learned this year in science and apply that knowledge across programs, as each program lead has an even bigger year next year.

We want to thank you for carrying this work forward and helping us grow the community of open science leaders (over 60 trainers and mentors worldwide this past year alone!). We can’t wait to continue supporting the open research community and working alongside you all to continue to further openness on the web. We hope you’ll join us.

[On the move] Open Science at USC

I’ll be speaking this Sunday at the Science and Tech Forum at USC in Los Angeles. The event kicks off today, led by Llewellyn Cox from USC’s School of Pharmacy.

The event also features Pete Binfield (PeerJ), Elizabeth Iorns (Science Exchange), Barry Bunin (Collaborative Drug Discovery), Mark Hahnel (figshare), Brian Nosek (Center for Open Science / Open Science Framework) and more. Check out the agenda here.

You can tune in to the livestream or follow the conversation online at #scitechLA.

[Event] HackYourPhD US wrap-up event on G+

The folks behind HackYourPhD (see last post) are wrapping up their US tour with an online event highlighting their work over the summer, interviewing key players in open science.

I’ll be joining the Hangout, along with William Gunn (Mendeley), Brian Glanz (Open Science Federation), Jeff Spies (Center for Open Science / Open Science Framework), Charles Fracchia and Adam Marblestone (Biobright), and Ann Lam and Elan L. Ohayon (Green Neuroscience Lab).

Do tune in (starts at 1pm EST).

The life sciences would seem, on the surface, ideal for open source. It’s a world built on disclosure – whether publication or patent – it doesn’t count until you tell the world. It’s a world where the knowledge itself snaps together in a fashion that looks eerily like a wiki, where one person only makes a small set of edits in an experiment that establishes a new fact. And it’s a world where the penalty for redundancy is high – no one in their right mind wants to spend scarce research dollars on a problem that has been solved already, a lead that is a dead end, a target guaranteed to lead to side effects.

John Wilbanks in his recent Xconomy piece, “Understanding Open Science”.

treating code as a first-class research object/citizen

This is the first in a series of posts over the coming months about treating code as a fundamental component – a first-class citizen – of modern-day research. Research is becoming increasingly reliant on code and analysis, and we’ve come a long way in getting data recognized as a “research object”. But what about the software needed to regenerate analyses? How do we shift the conversation to also recognize the code used to conduct the experiment as a critical piece of research?
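To make that concrete, here’s a minimal sketch – the function name and record fields are my own, not any standard – of one small step toward regenerable analyses: saving a machine-readable snapshot of the software environment alongside the results.

```python
import json
import platform

def environment_snapshot():
    """Return a minimal provenance record of the software environment,
    so an analysis can later be re-run and checked against it."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
    }

# Serialize the record so it can be archived next to the results.
record = json.dumps(environment_snapshot(), indent=2, sort_keys=True)
print(record)
```

In practice a fuller record would also pin library versions and the revision of the analysis code itself, but even this much makes a result easier to interrogate later.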

The Software Sustainability Institute in the UK has some excellent posts outlining the ideal world many of us are striving for in open science. Here’s an excerpt from their longer post on this entitled “Publish or be damned” that I found especially interesting. (Do give the full post a read when you have a chance, too. It outlines a number of the key issues we need to be cognizant of.)

And, as always, I welcome your thoughts (keep them constructive, please. 🙂 )

***

The Research Software Impact Manifesto

As those involved in the use and development of software used in research, we believe that:

  1. Open science is a fundamental requirement for the overall improvement and achievement of scientific research.
  2. Open science is built on the tenets of reuse, repurposing, reproducibility and reward.
  3. Software has become the third pillar of research, supporting theory and experiment.
  4. Current mechanisms for measuring impact do not allow the impact of software to be properly tracked in the research community.
  5. We must establish a framework for understanding the impact of software that both recognises and rewards software producers, software users and software contributors; and encourages the sharing and reuse of software to achieve maximum research impact.

To enable this, we subscribe to the following principles:

  • Communality: software is considered as the collective creation of all who have contributed
  • Openness: the ability of others to reuse, extend and repurpose our software should be rewarded
  • One of Many: we recognise that software is an intrinsic part of research, and should not be divorced from other research outputs
  • Pride: we shouldn’t be embarrassed by publishing code which is imperfect, nor should other people embarrass us
  • Explanation: we will provide sufficient associated data and metadata to allow the significant characteristics of the software to be defined
  • Recognition: if we use a piece of software for our research we will acknowledge its use and let its authors know
  • Availability: when a version of software is “released” we commit to making it available for an extended length of time
  • Tools: the methods of identification and description of software objects must lend themselves to the simple use of multiple tools for tracking impact
  • Equality: credit is due to both the producer and consumer in equal measure, and due to all who have contributed, whether they are academics or not

This does not rescind the values of the current credit system, but reinforces them by acknowledging that there are many forms of output that can lead to indicator events.
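The “Recognition” principle above can start very small: software that ships with its own citation metadata makes acknowledgement nearly free. A hedged sketch – every field name and value here is illustrative, not a standard:

```python
def software_citation(meta):
    """Format a human-readable citation line from a small metadata dict."""
    authors = ", ".join(meta["authors"])
    return (f'{authors} ({meta["year"]}). {meta["title"]} '
            f'(version {meta["version"]}). {meta["url"]}')

# Hypothetical metadata a project might ship alongside its code.
example = {
    "authors": ["A. Researcher", "B. Developer"],
    "year": 2013,
    "title": "analysis-toolkit",
    "version": "0.3.1",
    "url": "https://example.org/analysis-toolkit",
}
print(software_citation(example))
```

The point is less the formatting than the habit: if the metadata travels with a released version of the software, tools for tracking impact have something stable to latch onto.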

announcing the mozilla science lab

I’m thrilled to announce that I’ve joined Mozilla to build and direct their new open science initiative – the Mozilla Science Lab. The project is supported by the Alfred P. Sloan Foundation.

I’m excited to be returning to my open roots, as well as continuing to push the boundaries of what “digital research” can and should look like, and further explore how we can make the web work for science.

Why Mozilla?

Openness, empowerment and disruption are baked into Mozilla’s DNA. Their belief in the power of the open web, and their drive to explore the ways technology can transform it, is inspiring. They truly believe that we all should be able to innovate in the digital world, regardless of our level of technical proficiency – that we should be able to be more than passive consumers. This is incredibly important for science, especially as we grapple with a daunting skills gap at the university level that is, in many cases, discouraging researchers from participating, innovating, or even continuing to do science.

Mozilla cares deeply about “digital literacy”, and it’s time we explore what that means for science, especially given discussion about the “skills gap” in funding circles and at the policy level. I started to unpack this a bit back in January in a piece on Radar – teasing out some of the core competencies I think we’re neglecting in basic science education. We’ll be discussing that more here on the blog in the coming weeks, as well.

The first member of my team is Greg Wilson, founder of Software Carpentry, a program that teaches basic computational literacy to researchers to help them be more productive. I’ve long admired Greg’s work in this space, providing an entry point for students to learn things like version control, data management, and basic scripting. In the last year alone, Software Carpentry has run over 70 events for more than 2,200 attendees – all led by volunteers – and is on track to double both numbers in the coming twelve months. More importantly, Software Carpentry is our first step in exploring what “digital literacy” ought to be for researchers and what they need to know to put it into practice.

We also want to find ways of supporting and innovating with the research community – building bridges between projects, running experiments of our own, and building community. We have an initial idea of where to start, but want to start an open dialogue to figure out together how to best do that, and where we can be of most value.

I’ll be writing more here on the blog in the coming months as we ramp up development of the program (hint: we have some cool stuff planned. 😉 ). Stay tuned for more in the coming weeks about how you can get involved. You can also check out our wireframe here at wiki.mozilla.org/ScienceLab or follow us @MozillaScience.

making research more efficient – a preview of my #idcc13 talk

I’m off to Amsterdam tomorrow for the Digital Curation Centre’s annual conference, IDCC ’13. The program is a diverse mix of some of the top thinkers when it comes to issues of digital curation, data sharing, standards and information management. I’m delighted to be joining such an all-star group, speaking Tuesday in the Innovation/Applications track on our work at Digital Science, and in general, making research more efficient. 

At the tail end of last year, the organisers asked if I’d be interested in engaging in an email interview leading up to the event. Below is an excerpted version of the interview. For the full post, visit their website. You can also find the program here.

—–

Your presentation will focus on Infrastructure. Are there any specific messages you would like people to take away from your talk?

It’s easy to think that we’ve worked out most of the kinks in research when we look at some of the latest advances in astronomy, genomics, and high-energy physics in the news, from the work at the LHC to the ENCODE project. But there are still a number of baseline assumptions in research that need rethinking – and in many cases, fixing. That’s what Digital Science was created to address: some of the oft-overlooked roadblocks in areas like search in the sciences, information management, and the dated incentive system that is keeping us from fully updating our practices in the lab.

We address three areas in our call this year – Infrastructure, Intelligence and Innovation. What do you see as the most pressing challenges across these?

Having worked on infrastructure issues in research for the last six years, I’d say one of the main challenges remains making the right design decisions. Whether that’s an open platform that operates on the backbone of the web or a lightweight software application for use in a research setting, design decisions are key, and in my experience are often not thought through to the extent warranted.

There’s a reason why inefficiency still exists in modern research labs, and it’s not a shortage of tools. Part of it lies in how systems are crafted for the individual user, but also in how they speak to other systems.

Also, the age-old incentive problem is still keeping us from reaching our full potential, as we continue to measure impact largely by papers produced. Not only does that skew researchers’ incentives to better manage and make available, say, the data accompanying their research or the code needed to execute the experiment, but it also presents issues for other specialists whose main output may be software, not scholarly papers.

We need to rethink how we measure and reward research so that it better reflects a researcher’s contribution to their community – and give the system a hard refresh.

And in terms of opportunities, do you see potential in data science as a new discipline?

Absolutely … though it’s not necessarily a “new” discipline. There is an increasing understanding of the power of bringing together skillsets such as mathematics, machine learning, statistics, computer science and (though not always necessary) domain expertise, which is helping us redefine hypothesis-driven research as it becomes more data-driven.

What I find particularly fascinating is the spotlight it’s putting on how we teach science undergraduates – making sure they not only have the practical skills for working in a lab or conducting an experiment, but also the statistical literacy and analytical reasoning to understand the information they’re producing and collecting.

The conference theme recognises that the term ‘data’ can be applied to all manner of content. Do you also apply such a broad definition or are you less convinced that all data are equal?

I’m an equal opportunity data fan (and open purist, carried over from my time at Creative Commons). Too often, I feel, we get caught up in debates about the “worthiness” or “value” of particular data sets – a legacy from the publication world, where only the most polished, interesting data counts. That mindset is pervasive, and it’s keeping us from doing more robust, reproducible work. I am a strong proponent of not cutting oneself off from yet-unknown opportunities, and unfortunately classifications such as “junk data” are not only increasingly silly in the digital age, but borderline harmful.