Diversity and the intersectionality problem

Katy Huff (one of our Software Carpentry instructors) has posted a great birds-of-a-feather talk on attracting diversity in the scientific Python community. She frames this as an “intersectionality problem”: when you face two or more diversity problems at once (say, being a woman in science, or a woman in science who codes), they don’t simply add up – they compound.

Huff explains further: “The feature by which you’re both a rainbow and a unicorn, have all the features of social discrimination or social privilege because of your rainbow-ness. And also the same feature because of unicorn-ness but adding these two in a linear combination isn’t the same as having the experience of rainbow unicorn.”

Wikipedia describes intersectionality as “the study of intersections between forms or systems of oppression, domination or discrimination. An example is black feminism, which argues that the experience of being a black female cannot be understood in terms of being black, and of being female, considered independently, but must include the interactions, which frequently reinforce each other.”

Take science, for example, where roughly 20% of the workforce is women (though certain disciplines run higher or lower). Computing in industry sits at something like 20% women. SciPy, Huff says, doesn’t even reach 20%. These problems compound one another to ill effect. And it’s not just women – the same holds for other underrepresented groups in scientific computing.

I really like Huff’s take-home recommendation to the audience: lower the barrier to entry, but not the standards, because as Huff aptly puts it, “that’s offensive.” And as one of the panelists suggests, be careful about micro-aggressions. But that’s a separate post.

Have a watch.

Jumping in the deep end

This past week my colleague Greg Wilson kicked off another round of Software Carpentry’s online instructor training (the 10th cohort, I believe). Software Carpentry is the leading educational program of the Mozilla Science Lab (which I head up), and a core piece of the puzzle in changing the way researchers do science on the web.

In the past year, over 3,600 researchers, librarians and other members of the scientific community have participated in a bootcamp, learning how to use the shell, introductory git and version control, some data analysis and testing. The training is designed to serve as a jumping-off point, helping researchers introduce efficiency into their work (and, with any luck, leading them to do more open and collaborative research). We are also fortunate to have over 130 volunteer instructors from a wide variety of backgrounds, and a series of online (and soon-to-be in-person) trainings to help further increase that number. And we have big plans for the program moving forward, exploring ways to move from short two- to three-day trainings to longer-term engagement and learning.

I’ve long been a supporter of the program – since before joining Mozilla to lead the Science Lab – and have even participated in a bootcamp as a learner myself.

This week, I took another leap and attended the first session of Greg’s online training (notes from that class can be found here), and started reading “How Learning Works” – the assigned course reading material. The training runs over the course of 12 weeks, with the class meeting every two weeks to discuss homework and the readings. Greg has crafted the course to focus heavily on teaching others how to teach rather than how to teach specific components of the bootcamp material (such as python or SQL). It’s rooted in educational psychology, looking at how students learn, and how to craft effective material to maximise learning and impact.

What am I hoping to achieve? Well, there are a few desired outcomes on my end, outside of practicing what we preach and learning more about the inner workings of part of the program. First off, I’m eager to get a better understanding of the process of graduating from a learner to an instructor (and outside of the instructor training, I have a lot of technical proficiency to build 🙂 ), and to experience Greg’s training for myself. Also, as someone who’s worked close to code but not often been the one programming, this exercise – with the end goal of eventually being able to help out with bootcamps – will give me a reason to embed these practices in my day-to-day work. It also gives me a way to keep learning and to engage with our outstanding instructor base in a new way (more on that later, and many thanks to those who’ve reached out to help so far). And beyond that, it gives me a better idea of where we need to focus our attention following a bootcamp to provide pathways for others to gain confidence and fluency in the skills taught, continue learning, and eventually, one day, be able to give back and teach.

So, what’s next?

I posted a call to the instructor list for tips, tutorials and recommendations, so that in parallel to Greg’s training, I can also actively work to get up to speed with the bootcamp materials. I’ll post some of those recommendations in a subsequent post, and please keep them coming. Over the course of the next few months, I’ll be working to break away from some of my cheat sheets and gain more confidence in my technical skills, particularly around bash, git, python and SQL. (For a taste of the sort of small, practical task those materials build towards, see the sketch below.)
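Here’s my own toy sketch of that kind of task – illustrative only, not actual Software Carpentry lesson material, and “measurements.csv” is a hypothetical data file – where a few lines of Python stand in for a repetitive spreadsheet chore:

```python
# Illustrative only: summarise a (hypothetical) CSV of numeric observations,
# one row per sample. The sort of chore worth scripting rather than
# repeating by hand in a spreadsheet.
import csv

with open("measurements.csv") as f:
    rows = [[float(value) for value in row] for row in csv.reader(f)]

# zip(*rows) transposes rows into columns; average each column.
column_means = [sum(col) / len(col) for col in zip(*rows)]
print("column means:", column_means)
```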

I’ll be blogging about the process, as well – as I’m sure I’m not the only dabbler interested in brushing up their skills so they can help with the program.

And with that, I have some homework to see to.

 

Thinking globally

Early on, as we were setting up the Mozilla Science Lab, I had a chat with a former neuroscientist about the challenges facing modern-day research – soundboarding our initial ideas, hearing about his experiences in academia, discussing gaps in training and awareness. What struck me in particular was a comment about where “most of the good science” came from (90%+ in his estimation), challenging the idea that a program such as ours was needed: in his opinion, if you were at a top-notch university, needing to know something or access research was a non-issue.

And that’s where 90% of the “good science” was done, he went on to say – at top-level universities in the US (with a few exceptions).

Now, it’s no secret that the research that gains the most citations, ranks highest in indices such as the ISI, or gets shared the most is not fully representative of all the world’s researchers, and that what’s available is skewed largely towards Western cultures. That’s slowly changing, but in reply to the neuroscientist’s point above, what’s reflected in the literature is not the whole picture, nor is it indicative of the broader research community – the folks we aim to help through our work at the Science Lab.

Which brings me to my conversations over lunch today. I’m currently in Nairobi for a workshop that kicks off tomorrow focusing on the discoverability and openness of African scholarship, with many of the participants being East African agricultural scientists. The two-day workshop is sponsored by the Carnegie Foundation and organised by OpenUCT in Cape Town – the first of what I hope are many knowledge-sharing discussions with those here on the ground trying to stay on top of their research, build out their university’s and their own footprint in their fields, and, well, communicate that out to the world without their voices being lost.

Here, Open Access is a touchy issue – most of the researchers and librarians I’ve spoken to are in support of the premise and of what it unlocks for them in terms of the world’s literature. But in the push to turn that spigot all the way to OA, there’s also a tangible fear of losing what competitive advantage they may have built up over their careers, a worry that their work will not reach the same audience as others’ in the West.

And then there are the technical and cultural issues here on the ground, of which I’m just scratching the surface. From a UNESCO / eIFL workshop I participated in back in 2009 around sharing content and data on the web, to the perspective heard today at lunch from a librarian in Ghana, we’re still working across varying levels of awareness and just sheer resources. Many students at these universities, even up to the postgrad level, still rely on the central library on campus for Internet access (smartphone adoption is helping, but personal laptop ownership is still not the norm). Some universities, formed after African independence, only recently celebrated their 20th anniversary, as opposed to their 150th (or 918th, if you’re Oxford). (Have a read of Eve Gray’s fantastic post about these issues for more.)

So why am I here, and not at, say, SXSW? Because science is global. Perspective is important. The means to understand the world – the ability to process, share, and generate new knowledge – is not just for the elite, but for all (and luckily, my colleagues at Mozilla firmly believe that too). And the chance to hear from researchers on the ground who are curious about “science on the web”, from discoverability and dissemination to capacity-building and skills training, was an opportunity I couldn’t pass up.

This is one of our first steps in building bridges with researchers in other parts of the world, so that we can truly work together to make research more efficient. As stated in our plan for 2014, we’ll also be looking at running events and hearing from others in South America, Australia and in Asia about their challenges in doing open research on the web. Our goal is to continue to see how we can help the broader research community and join up the various efforts and threads of conversation to move forward together. Have an idea? Get in touch. We’re here to help.

And many thanks to the Carnegie Foundation and the Open UCT team for inviting me to join them this week in Nairobi. You’ve already given me much to think about, and I look forward to learning more over the next few days.

A look back on the last few months …

[Image: MozillaEU, under a CC-BY-2.0 license]

Today marks the start of the Mozilla Summit (#mozsummit), a three-day meetup (split between three cities) of Mozilla staff and key contributors. It’s a celebration of and a chance to discuss the amazing work that’s being done across the organisation, from Firefox OS to Open Badges and the Science Lab. This is our chance to come together as a community, learn from one another, and more specifically for our work at the Science Lab – our chance to not only tell our story, but invite our colleagues in to help test and shape it.

Earlier this week, in the lead-up to the event, two of my Foundation colleagues posted stellar pieces that got me thinking about where we’ve come in the last four-plus months of shaping the Science Lab. If you have the time, do check out Matt’s post on working in the open – core to how we operate at Mozilla – and Brett’s reflections on the summer of Maker Parties on the Webmaker blog. They’re a great lens not only into the activity going on at the Foundation, but into our process as a whole.

On to some reflections of our own …

We launched this past June with an idea of how Mozilla could best help the research community. We had a project up and running to build on (Software Carpentry), a sketch of some of the areas making the most progress in advancing science on the web, and an even longer list of areas needing attention.

So where have we come since June 14? Here’s a look at our progress to date, what we’re excited about and what we’re still exploring (and could use your help with).

We spoke with over 3,000 people.

In the last four months, we’ve engaged with (not just talked at) over 3,000 people – astonishingly, largely face to face – to ask them where they see the research system breaking down, where attention is needed, and to start discussing how an organization like Mozilla can help. I’ve spoken with researchers, educators, developers (or “research software engineers”, bridging both worlds), scientific startups, publishers across the spectrum, and institutions around the world. We’ve spoken with researchers of all shapes and sizes working on problems in the US, Australia / New Zealand, South America, and Africa – to see how we can best work together to achieve this vision of more web-enabled research that helps us connect, helps us learn and innovate, and helps us interoperate.

We honed a model for the Mozilla Science Lab.

A few common threads emerged from those conversations.

There’s a tremendous amount of work being done to move science to the web, but not in a coordinated fashion. And for all of that development, it’s still difficult to discover what work has already been done. So we’re duplicating efforts or, even worse, continuing on with business as usual.

You can read more about the model for the Science Lab in our post here (feedback always welcome). We strongly believe that what Mozilla can best contribute to this space is the expertise, values and leadership needed to fill in the missing gaps (digital skills education, examples of what’s technically possible if systems interoperate), to be the support beams needed to truly change the way science is done.

We started to map activity to those pillars.

  • code and data literacy: Software Carpentry, currently at 135 volunteer instructors, 30 more in training, with more in the works. Care to help us shape digital training for researchers? Here are a few ways we could use your help.
  • technical prototyping and interoperability: We’re wrapping up our first pilot on code review in science with PLOS Computational Biology. You can read more here and here. And stay tuned for more soon on badges for contributorship, as well as more exploring “code as a research object”.
  • building communities of practice: We’re planning for the first “science and the web” track at MozFest (not too late to join us!), as well as building out resources to help take the ambiguity out of “open science”. Also, see the bottom of this post for more thinking on community building. We’d love to hear your thoughts.

What have we learned (and what do we need to work on)? 

Terminology matters.

… and we need a tighter way of articulating our project aim to a wider audience.

One of the interesting points that’s bubbled up in the last few weeks is a bit of dissonance between the phrasings “Open Science” and “science on the web” – two characterisations that I believe are dependent upon one another, and often used interchangeably in these circles. But there are other situations where “open” is used as a catch-all, and we need to do a better job of unpicking why terminology matters, what it means in this context, and the relationship between those two phrasings.

In open science circles, working on the web is the condition for moving work forward – crafting our systems to interoperate, designing our communities to operate in a networked fashion, making sure the components of our research (data, code, content, materials) are as reusable and interoperable as possible. Working on the web without having those components or that process be open doesn’t scale. But we need to better articulate what we mean by those two phrasings, and show researchers, by doing, what science on the web means.

We need a better, more explicit call to action.

In the short term (for those of you who have been asking 😉 ), we are working towards launching our Science Lab website. That’s not quite the type of engagement I’m shooting for, though it will help us create a core focal point for developments and resources, not only for the Science Lab but also for the community. Watch this space.

We’re currently exploring an action-based community-building effort that we’d love your feedback on – the aim being to do more science on the web, and to build communities of practice around that action. Much of this is a hat tip to the brilliant work the Webmaker team has done. For those of you who’ve seen me talk recently, I also take inspiration from the International Geophysical / Polar Years – international efforts spanning 60+ nations, involving 50,000 participants across the social, life, natural and theoretical sciences, pushing towards one common goal.

I think we have a real opportunity to do the same for science on the web, showing that working with open-access content, open data, open-source / interoperable tools and code, or running a training is the future of 21st century science, and invite you all to join us.

This is where we could use your help. As we work to craft resources to help others learn more about how to work on the web – the implications for data, content, code, new tools and training programs – how can we best structure that for maximum engagement?

I’d love to hear your thoughts.

But we’ll save further discussion on that for another blog post … I’ve got to get ready for the Summit. 🙂

[On the move] Open Science at USC

I’ll be speaking this Sunday at the Science and Tech Forum at USC in Los Angeles. The event kicks off today, led by Llewellyn Cox from USC’s School of Pharmacy.

The event also features Pete Binfield (PeerJ), Elizabeth Iorns (Science Exchange), Barry Bunin (Collaborative Drug Discovery), Mark Hahnel (figshare), Brian Nosek (Center for Open Science / Open Science Framework) and more. Check out the agenda here.

You can tune in to the livestream or follow the conversation online at #scitechLA.

[Interview] Code review in social science: CoMSES Net

Following the announcement of our code review pilot, we heard about a group in the social and ecological sciences who were also exploring methods of code review in their discipline, through a project called “CoMSES Net”.

The issues we hope to address through our pilot with PLOS Computational Biology are not life science specific. Ecologists, earth scientists, social scientists and more are grappling with many of the same problems, as their work, too, becomes more computationally dependent and driven.

Part of our work at the Science Lab is helping to increase awareness of initiatives around the world that are pushing the boundaries of sharing scientific knowledge, building tools for better research, and offering means to get involved. With that, I’m delighted to welcome Michael Barton to the blog to tell us more about CoMSES Net, OpenABM and their approach to code review for the social sciences.

Kay Thaney: Michael, thanks for joining us. To start, could you tell us a bit more about the project – where the idea came from, what you hope to achieve, how long you’ve been up and running?

Michael Barton: CoMSES Net (or the Network for Computational Modeling in the Social and Ecological Sciences) is a “Research Coordination Network” funded by the National Science Foundation. It originated in an NSF proposal review panel in 2004 on which one of the project’s co-Directors (Lilian Alessa) and I served, which made clear the growing importance of computational modeling to the future of socio-ecological sciences.

With NSF support, we convened a series of workshops and initiated a pilot project to better understand the challenges of making model-based research a part of normal science in these fields. Importantly, we learned that a key issue was the lack of ways for researchers to share knowledge and build on each other’s work – a hallmark of scientific practice. This led to the formation of the Open Agent Based Modeling Consortium (OpenABM), whose main aim was to develop a roadmap for improving the exchange of scientific knowledge about computational modeling. This pilot program had a number of useful outcomes – including improved interoperability between different modeling platforms, support for emerging metadata standards, a community web site, and recommendations for ways to share model code and educational materials. It also led to a subsequent NSF grant that created CoMSES Net in 2009.

CoMSES Net has been active for four years. In that time, we’ve implemented many of the recommendations of the pilot program, and gone beyond those recommendations in some areas. We’ve had a workshop on best practices for code sharing and promoting quality code, and another on modeling in education.

Particularly relevant to the code review that the Mozilla Foundation is exploring: we created the Computational Model Library (CML), where authors can publish scientific code, and have negotiated an agreement with the Arizona State University Libraries to provide “handles” (open-source DOI equivalents from the same organization that provides DOIs) for peer-reviewed models. We have established agreements with journals that publish model-based science to require or recommend that authors publish code in the CML, and created a protocol to enable code review simultaneously with the review of a submitted manuscript. We have also established an internal code “certification” program for peer review of code not associated with journal articles.

Kay Thaney: CoMSES Net focuses on ecological and social science research. For those readers unfamiliar with those disciplines, what are some examples of code and software use in that sort of work? Is the use of computational modeling and digital analysis a recent development in the discipline?

Michael Barton: Modeling and computation are becoming pervasive across many fields of science, engineering, and technology. Our focus is on the social and ecological sciences. These domains are increasingly ‘fuzzy’ around the edges, as human activities affect biophysical earth systems in more and more profound ways, and as we become increasingly concerned about the social dimensions of science, engineering, and technology. As a community, we are working to better delineate the scope of CoMSES Net in this changing social and intellectual environment. Still, at the core, this encompasses social organizations and institutions that emerge from the interactions of decision-making human actors who share cumulative cultural knowledge to varying degrees, and their evolution over time. It equally encompasses the evolution, behavior, and interactions of other living organisms – with each other, with the human world, and with the earth’s physical systems.

There is a long history of statistical analysis in the social and ecological sciences, of the application of equation-based models in a few fields (e.g., economics and behavioral ecology), and of some limited experimentation with computer simulations. But the spread of computational modeling – especially models that take a bottom-up approach, representing many discrete, decision-making agents – is a comparatively recent phenomenon of the past decade. Often these also employ complex-systems conceptual approaches to explore how simple rules expressed by multiple interacting agents can lead to the emergence of complex phenomena at more inclusive scales. Examples include models of farmers managing a shared irrigation system, the foraging behavior of ants, and flocking in birds.
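[Editor’s note: to make the “simple rules, emergent patterns” idea concrete, here is a minimal, illustrative agent-based model – our own toy sketch in the spirit of Schelling’s segregation model, not CoMSES or OpenABM code. Agents of two types sit on a ring; any agent with no like-typed neighbour relocates at random, and that purely local rule produces global clustering.]

```python
import random

# Toy agent-based model (illustrative, not CoMSES/OpenABM code): two types
# of agents on a ring. An agent is "unhappy" when neither neighbour shares
# its type; one unhappy agent relocates each step. The purely local rule
# yields global clustering, i.e. emergence from the bottom up.

SIZE, STEPS = 60, 500
agents = [random.choice("AB") for _ in range(SIZE)]

def is_unhappy(i):
    left = agents[(i - 1) % SIZE]
    right = agents[(i + 1) % SIZE]
    return left != agents[i] and right != agents[i]

print("before:", "".join(agents))
for _ in range(STEPS):
    unhappy = [i for i in range(SIZE) if is_unhappy(i)]
    if not unhappy:
        break  # everyone is content: the pattern has settled
    i, j = random.choice(unhappy), random.randrange(SIZE)
    agents[i], agents[j] = agents[j], agents[i]  # relocate by swapping
print("after: ", "".join(agents))
```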

Kay Thaney: You had mentioned in a comment to our initial post on our pilot that your team were endeavoring to work on new forms of review for code and software. Tell us more about your approach.

Michael Barton: Our approach to software and code review is to embed it in forms of scientific practice and reward systems that are familiar to most practitioners. We have created two pathways for code review. In one pathway, scientists who publish a journal paper that involves computational modeling can submit their code to the CML but leave it unpublished. They can then provide the URL of the code to a journal editor and/or reviewers for evaluation along with a submitted manuscript. Editors and reviewers can access the code, but it remains hidden from others. Authors can revise the unpublished code in response to reviewer comments, in the same way as they can revise a submitted manuscript. When the paper is published, the author can then click a button to publish the code in the CoMSES Net library, making it accessible to the scientific community.

The second pathway is for code that is not directly associated with a journal publication. An author can request peer review and “certification” for a model submitted to the CML, whether published or unpublished. Certification follows a protocol similar to manuscript review for a journal – practices familiar to most scientists. A set of peer experts is selected to review the code, following a set of reviewer guidelines. Reviewers can request revision of the code or documentation. When the editor is satisfied that a code author has satisfactorily responded to peer review, the model is “certified”. This adds a certification badge to the model entry in the CML. The CoMSES Net certification review guidelines emphasize model functionality and documentation. The model should be sufficiently documented that another researcher can understand its operation and replicate its functionality. It should also function as described – including running in the environment specified. For evaluating the scientific value of the code, we are following something of a PLOS model, providing space for commentary and rating by the larger community of practice.

Code that has passed peer review via either pathway is assigned a “handle” from the Handle System (http://handle.net), which manages “persistent identifiers for internet resources”, through our agreement with the ASU Libraries. A DOI is a well-known, commercial implementation of Handle System resource identifiers. The CML provides a formatted citation for all published models, making them potentially citable in other papers and on professional CVs or resumes. A handle in a model citation, as a permanent resource identifier, is an added indicator of quality for code that has undergone peer review. Moreover, all individuals who join CoMSES Net as affiliate or full members (required for publishing code) must agree to a set of ethical practices that include proper citation of model authors.

Kay Thaney: What have you found thus far in your efforts? Any interesting outliers or behaviors?

Michael Barton: All scientists engaged in model-based research that we have talked with have been strongly in favor of our efforts. However, this community has been slow to publish models. This is not unexpected, of course, as it is a change in established practice. We hope that as the CML grows and becomes an increasingly valuable resource, scientists’ reputations will be enhanced by publishing code, as they are by publishing manuscripts. If this happens, it can initiate a positive feedback loop of increasing code publication and enhanced reputation for publishing code. That is what we’d like to see ultimately.

Kay Thaney: How much of a technological problem do you think this is? What would you identify as some of the blocking points in your discipline that are keeping these practices from becoming the norm?

Michael Barton: As I mentioned above, we were initially surprised to learn that the spread of computation in the social and ecological sciences has a more important social dimension than technological one. There are certainly technological aspects. But peer review and evaluation of scientific code is a social process that requires voluntary participation by reviewers and by those who submit their code for review. All participants need to perceive potential benefits. We are working to establish a community-recognized system of mutual benefits that will encourage scientists to share their knowledge and improve scientific computation. While enforcement by traditional publishing venues and granting agencies will be important, this system of rewards is perhaps even more important.

The technological issue is providing the environment where all this can happen – including a place where code can be accessed by the scientific community and where peer review can take place. The CoMSES Net community is working to provide such an environment. We hope it can serve as an exemplar for other scientific communities of practice who recognize the importance of scientific computation.

Finally, it is increasingly important to provide new opportunities for training in computational thinking across all scientific domains. This is not computer science per se, but the ability for scientists to express dynamic systems in algorithmic terms – and sufficient knowledge of computing platforms to instantiate those algorithmic expressions in environments where they can help to solve wicked problems.

Kay Thaney: Fascinating – and very much in line with our initiatives. Anything else to mention about CoMSES Net?

Michael Barton: CoMSES Net has also created discussion forums, including a very active jobs forum for expertise in computational modeling. The web portal is developing as a locale for sharing educational materials, training materials for modeling, and links to modeling software sites, and we recently started a YouTube channel for educational videos on computational modeling.

In addition to providing a community and framework for knowledge exchange, we hope to establish protocols that help scientists receive professional recognition for creating and publishing scientific code, in ways that parallel recognition for carrying out and publishing other forms of research. The rapidly growing importance of computation across all domains of science makes it imperative that it be embedded in the kind of knowledge scaffolding that has been a key driver behind the scientific transformation in how we understand our world.

Kay Thaney: Thank you again for joining us and sharing information about your work.

 

[On the move] ISEES and the role of software in environmental science

I just landed in Oakland, where I’ll be participating in a two-day workshop exploring the role of software and training for earth and environmental science. The meeting is convened by the Institute for Sustainable Earth and Environmental Software (ISEES), and supported by the National Science Foundation.

Over the course of the meeting we’ll discuss the needs of the community and the evolving role of software as an enabler in advancing the field, with the aim of honing a vision for a “software institute” for environmental science. Whether that’ll be modeled after the Software Sustainability Institute in Edinburgh, I’m not sure – but looking forward to seeing how the conversation unfolds. Also keen to hear if/how training may fit into their vision as a cornerstone for supporting better practice in the discipline.

This is the last in a series of workshops convened by ISEES (with an impressive group behind it, including Trisha Cruse, Peter Fox and Bruce Caron). Stay tuned for more about the event, and for information on ISEES, visit their website.

Addressing the black box in research

In the last post, I laid out some of the key pillars we hope to hang the Science Lab’s activities on moving forward. There’s been some great feedback sent via various channels so far (thank you! and keep it coming), with one message raising the issue of the “black box” in research.

The point made me realise that I may have assumed the broader vision of the project was clearly defined – the three areas outlined in the previous post are better seen as the means by which we hope to execute on that vision.

Let me explain.

(Also, this allows me to include one of my favorite depictions of said problem … )

[Image: a depiction of the “black box” problem in research]

The vision we’re working towards at the Science Lab is one where openness is the norm for research – where there’s unfettered access to knowledge and the components associated (articles, data, software, materials, methods), where work can be built upon without asking permission (setting our eyes on reproducibility), and where our modus operandi is more rooted in open collaboration (and we’re rewarded for such behavior).

We’ll be building out communities of practice and use cases around these points, from showing what’s possible on the tooling / technology front to addressing some of the gaps in education (gaps that, in many cases, keep researchers from using open tools to achieve this vision). And on top of that, we’ll be amplifying the current work in this space and building out resources so that researchers can easily find information on developments here, creating a focal point for the community.

It’s a big problem we’re aiming to tackle, but with the help of the community, I think there’s no better time to set out to help truly make the web work for science.

Our plan for the Science Lab (version 1.0)

It’s been just over two months since we announced the start of the Mozilla Science Lab, and I wanted to share with you our thoughts on how to structure our activities moving forward.

Our aim from the beginning has been to see how we could best support and extend the existing work going on in “open science” – in some cases bridging existing technologies to address new problems, in others providing the educational resources to help close the gap in skills and awareness. And on top of that, to map some of Mozilla’s core values and areas of expertise to science (e.g., openness, digital literacy, open-source ethos, community), like we did at Creative Commons years ago with the science project.

To make sure we were not just operating off of assumptions about what the community needed (especially a community as diverse and dynamic as this one, with as many stakeholders), we hit the road (literally, and somewhat figuratively – thank you, Skype). Over the course of the last two-plus months, I’ve spoken to rooms of 20 / 300 / 700, asking for their feedback, and had a series of 1-on-1 calls and meetings with researchers from a diverse sample of disciplines (earth science, biology, ecology, social science, astronomy, physics), policymakers, publishers, tool developers, and educators.

What we learned

  1. There’s a *ton* of activity (tool development, community efforts, policy work) pushing things forward, but we’re not nearly there yet.
  2. It’s difficult to know what all is out there – what software exists, who’s tried what (and with what success), how others can get involved. This is keeping many projects from gaining the traction they need. We need a better means of communication (about and among efforts) so we can reduce duplication, foster collaboration and gain broader use. (GiveWell’s recent post on the current open research landscape is a great start, but there’s still more we can do.)
  3. The system is complex and reliant upon a diverse network of stakeholders and technologies. The vernacular, behavior and notion of what “making the web work for science” means also vary greatly among these groups, and need to be taken into consideration as we craft solutions.
  4. We’re facing an ever-widening skills gap. Science is becoming increasingly computationally-dependent, web-enabled and data-driven, yet our training is still based around older methods of doing research rooted in the analog. For there to be a real sea change in the behavior of researchers … for us to make research more open, collaborative and reproducible, we need better training and education tailored towards these aims.


How we can help

It was clear that there is a pressing need for coordination, interoperability, and better communication in this space, whether you’re building digital infrastructure for high performance computing, building open source tools for visualisation, or looking for new (open) means of doing your research in the lab. 

Taking everything we heard into consideration, we distilled that activity down into three core areas, each dependent upon the others.

[Image: diagram of the Science Lab model’s three core areas, with arrows showing the flow between them]

Let me go into a bit more detail about each of these areas, to help you better understand our programmatic focus moving forward. We’ll be elaborating on each of these areas in a series of subsequent posts, as well.



Code

Through this work, we’ll be engaging with external groups (scientific startups, publishers, researchers, etc.) to help bridge existing technologies where possible, build out prototypes to explore new problem sets, and support existing development in the community. An example could be taking an existing tool that is discipline-specific and seeing whether it can be applied to a different field. Or it could be taking existing infrastructure (say, the badges work) and working with external groups to test out different implementations.

Our aim here is to bring the community together around best practice, and also to support and extend existing work through small prototyping bursts of activity, collaborations and internal development.

We currently have one pilot running with the help of PLOS Computational Biology, exploring what code review for science could look like. See our recent post on this for more information.

Code and Data Literacy

Running in parallel to all of our other efforts – building community, communities of practice and open tools – is a skills-training layer exploring what “digital literacy” for science means. Research is becoming increasingly digital in nature – data-driven and, in many cases, computationally dependent – yet digital skills such as version control, visualization, analysis and online collaboration are not often taught at the university level. There’s a widening gap between what researchers are expected to know and the training they have access to. And practices fundamental to doing open, reproducible science remain outside most formal university training, despite the increased availability of free, open tools for data sharing, collaboration and electronic lab notebooks, and despite external pressure from funders.

Our work will help bridge that gap – in part by making such training accessible and attainable to all, so that scientists can do better, more digitally-enabled research.

Software Carpentry is our main activity in this space to date, a project founded by Greg Wilson to help teach basic computing skills to researchers. This is done via short two-day bootcamps, taught by a volunteer instructor base all around the world. We hope in the future to be able to build out additional components and work with others in the community exploring different approaches to heightening digital literacy for research.

Community

Last but not least is our focus on building and supporting community around the work mentioned above. We hope to be that connective tissue between various bursts of activity, understanding and practice, providing a focal point for information on developments in this space and a means for others to plug in. Mozilla has a longstanding history in building community, and we want to use that know-how to amplify the work currently going on, build communities of practice around openness, and serve as the glue to bring interested parties together. Engagement is also high on our priority list, exploring how best to use the expertise and enthusiasm of the community to help push these ideas forward.

We will be announcing later this year an international effort that hopes to do just that – so do stay tuned. 🙂

And in the shorter term, we’ll soon be announcing our first community call, which will give you a chance to interact directly with us and help us continue to shape the Science Lab, hear about new tools and projects in this space, and learn more about what we’ve got cooking for 2014.

To wrap up

We view these three areas as the support beams for the open research community. You’ll notice arrows showing flow between each of those core areas as well – that’s intentional. From a technical partnership starting a broader conversation with the community about best practice, to a gap raised in our training efforts turning into a technical project, we view these areas as interdependent and mutually reliant upon one another. Moving forward, our hope is to also use this model as a means to assess new opportunities, build out new programs and measure our successes.

I’ll be going into more detail about our activity in these three areas over the coming weeks. In the meantime, we’d love to hear your thoughts. Feel free to chime in here in the comments, send us mail, or find me on IRC @ kaythaney.