Hackathons, sometimes also called hackfests or codefests, are time-bounded events in which groups of participants collaborate intensively on software or data projects, typically organized around a shared theme or technology. Since the early 2000s, hackathons have become increasingly popular in both commercial and academic settings. They are most often deployed as instruments to foster community and to stimulate creative uses of technology to address unsolved problems. Academic sponsors tend to design hackathons around community engagement and outreach goals, whereas company-run hackathons often focus on product improvement or innovation. Hackathon events can and do vary widely in important characteristics, such as length, size, participant recruitment, composition of the participant group, amount of structure, competition between teams, sponsorship of participant costs, and pre-event and post-event engagement and expectations. Each of these choices has implications, not only for what happens (or does not) at the event itself, but also for outcomes, be they social, technical, or community-related. Most of these implications have not been scientifically studied, and data that could inform decision making are therefore scant. So, unfortunately, are data or studies on the assessment of hackathon outcomes.
This is particularly problematic for hackathons that try to strike a balance between community outreach and creating tangible products, because decisions made early in organizing an event can shift that balance later in ways the sponsors neither intended nor desired. Further complicating the problem, the scope, objectives, and priorities of a hackathon are not always determined before the first decisions that influence them are made, and many types of outcomes lack easy-to-measure metrics and are thus difficult to assess. Hence, whether and to what degree a hackathon achieved its intended goals is typically a matter of subjective judgement, often rendered by those with obvious conflicts of interest, such as organizers and sponsors. A hackathon may be judged highly successful even when it failed to meet most of its intended goals, for example because it had unanticipated outcomes with equal or greater impact than the intended ones would have had.
Scientific studies relating hackathon processes to outcomes are beginning to appear, but their results will take years to become conclusive. What can be done now, however, is to thoroughly document the hackathon models and processes that have been used repeatedly by the same or related sponsors, on the assumption that these are the ones deemed successful at achieving objectives the sponsors care about. Such documentation should therefore include what the objectives were, which were more important than others, and, to the extent possible, the degree to which they were achieved. Rigorous documentation of this kind is currently scarce and nearly absent from the scholarly record. Although many reports on hackathons, including tips for others and some lessons learned, can be found online, they take the form of informal, often intentionally subjective blog posts, and typically reflect on a single event rather than on a model and process applied repeatedly.
In this paper, we present a thorough description of a hackathon model and process that was used to organize and run nine hackathons over a 10-year period. Although the target areas of these hackathons differed widely, they share a number of important properties that make them a cohesive series for study:
- They had the same primary sponsor, NESCent, an academic research center. (A few also had co-sponsors from other academic research entities.)
- Their target areas were all focused on the same domain science, evolutionary biology.
- Intangible outcomes, in particular building and nurturing a scientific community of practice, mattered to the sponsor as much as tangible products.
At least one co-organizer of each of these hackathons is among the authors, and for six of the nine events several are. We note that we intentionally restrict ourselves to reporting and discussing the experience and evidence we have. This paper therefore does not attempt a comprehensive review of existing hackathon models and processes, nor do we try to compare the effectiveness of the model and process we describe with that of others.
By design, hackathons generally aim to bring together people who would otherwise not meet, to collaboratively tackle goals or problems they would otherwise have little opportunity to work on, or would find much harder to address on their own. The outcomes that can emerge from this span a wide spectrum from intangible to tangible, and the impacts those outcomes can have on a sponsor's larger objectives also vary widely, including in how well they can be measured.
To better understand the values and objectives that motivated the event and process model we present here, Table X gives a sample of potential hackathon outcomes, divided into tangible and intangible ones, to illustrate the spectrum.
Tangible Outcome | Possible Measures | Challenges |
---|---|---|
Source code | Draft versus working quality; sustained post-hackathon development activity; community interest gathered; adoption by non-participants (forks, downloads, citations) | Metrics for code-writing productivity (lines of code, number of commits) are often confounded and not useful. The specific impact of contributions to larger projects is nearly impossible to track. The academic attribution system falls short for unpublished software. |
Publications | Number of publications and their impact metrics (citations, altmetrics) | The impact and value of non-scholarly publications can be significant yet difficult to quantify. |
Fundraising, grant funding | Number of proposals; amount funded; increased funding rate | Measuring a change in funding rate requires long time windows and disentangling the many confounding factors. The value of unfunded proposals is hard to quantify but likely non-zero. |
Documentation | Amount of text written; number of tools or methods documented; access and citation statistics for online documentation | Quality, comprehensiveness, and currency are difficult to measure. Metrics for offline documentation are few and inadequate (e.g., downloads). |
Data products (data, ontologies, benchmarks, etc.) | Size, number, and impact metrics of data products (citations, downloads, altmetrics) | The specific impact of contributions to larger datasets is nearly impossible to track. Tracking of scholarly attribution and impact is still in its infancy, and nearly impossible for unpublished products. |
Community standards and best practices | As for publications if published, and as for documentation otherwise | Tracking of scholarly attribution and impact is still in its infancy, and nearly impossible for unpublished products. |
Intangible Outcome | Possible Measures | Challenges |
---|---|---|
Growing or building community | Size of community; existence and degree of adoption and active utilization of community interaction resources; number of novel collaborations (software, publications) among participants post-hackathon | Technologies that "work" are |
Culture change towards openness and collaboration | Number of novel collaborations (software, publications) among participants post-hackathon (see the sketch following this table) | |
Broadening communities and networks | Increased participation in, and friends/followers of, mailing lists, social networks, etc. | |
Community awareness of, and training in, technologies, standards, or best practices | Number of new users of a technology | |
Increased diversity | Demographic (gender, ethnicity, experience) and disciplinary diversity of participants | |
Publicity and brand awareness | Conference presentations and posters; press releases (and where they are picked up); social media impressions | |
Broadening perspectives | Ideas inspired for projects or work unrelated to the hackathon | |
Connecting cultures | Bridging the culture gap between "coders" and "users" | |
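One recurring measure above, the number of novel collaborations among participants after the event, can at least be operationalized in a straightforward way once collaboration records (e.g., co-authorship lists or commit logs) are in hand. The following is a minimal sketch in Python; the input data and names are hypothetical, and real records would of course require identity disambiguation and cleaning:

```python
from itertools import combinations

def collaboration_pairs(projects):
    """Collect the unordered pairs of people who co-occur on any project."""
    pairs = set()
    for members in projects:
        pairs.update(combinations(sorted(members), 2))
    return pairs

# Hypothetical records: each set lists the people on one paper or repository.
before = [{"alice", "bob"}, {"carol", "dave"}]
after = [{"alice", "carol"}, {"alice", "bob"}, {"bob", "dave", "erin"}]

# Pairs who collaborated after the hackathon but never before.
novel = collaboration_pairs(after) - collaboration_pairs(before)
print(f"{len(novel)} novel collaborator pairs: {sorted(novel)}")
```

The counting is the easy part; deciding which records to include, and attributing the new collaborations to the hackathon rather than to other causes, remains the hard part.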
A hackathon can thus have many possible outcomes and impacts (Table X): some are tangible (e.g., code or publications) and some intangible (e.g., increased community diversity, or community awareness of technologies or best practices). The procedures for holding a hackathon that we have developed were shaped specifically by the value we place on these outcomes. Decisions on participant pools, for example, may be heavily driven by the relative weight one gives to tangible versus intangible impacts. Organizers need to make decisions that maximize the outcomes they value most.
Measuring the impact of these outcomes can be very difficult. Consider the question "How much code is generated by a hackathon?" Measures such as the number of lines of code or the number of commits can be meaningless, since these counts depend heavily on the style in which individuals write code. The number of programs or scripts produced is barely better: is the code a rough draft or a polished product? How bug-free does it need to be? Does it need to work at all, or can it be conceptual? Does what happens to the code after the hackathon matter? Does code that is widely used by the community (by download counts? citations? GitHub forks?) count the same as code that is completely ignored? There are no simple answers to these questions, and the matter is even worse for intangible outcomes, where direct measurement may be more difficult.
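Even the comparatively tractable tangible measures require operational choices. Counting "sustained post-hackathon development activity," for instance, means deciding which repository, which time window, and whose commits count. A minimal sketch in Python, assuming a local clone of a hypothetical hackathon repository and using only standard `git rev-list` options:

```python
import subprocess

def commit_count(repo_path, author, since, until):
    """Count commits by one author in a date window using `git rev-list`."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--count",
         f"--author={author}", f"--since={since}", f"--until={until}", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())

# Hypothetical repository and participants; suppose the event ended 2014-01-17.
repo = "path/to/hackathon-repo"
for author in ["alice@example.org", "bob@example.org"]:
    n = commit_count(repo, author, "2014-01-18", "2014-07-18")
    print(f"{author}: {n} commits in the six months after the event")
```

Note that even this sketch inherits the confounds listed in Table X: commit granularity varies with individual style, so such counts indicate continued engagement at best, not productivity.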
One clear lesson learned is that data and metadata about a hackathon need to be collected more deliberately by the leadership team, before, during, and after the event. This is discussed in more detail below.
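As an illustration of what more deliberate collection might look like, the sketch below shows a minimal per-event record in Python. The field names are our suggestion only, not an established standard, and a real instrument would be shaped by the questions one intends to answer:

```python
# A minimal, illustrative per-event record; field names are suggestions only.
event_record = {
    "name": "Example Hackathon",                  # hypothetical event
    "dates": {"start": "2014-01-13", "end": "2014-01-17"},
    "sponsors": ["NESCent"],
    "objectives_ranked": [                        # most important first
        "build community of practice",
        "produce working prototypes",
    ],
    "participants": [
        {"id": "p01", "role": "organizer", "first_time": False},
        {"id": "p02", "role": "participant", "first_time": True},
    ],
    "artifacts": {"repositories": [], "publications": [], "documentation": []},
    "surveys": {"pre_event": None, "post_event": None},  # links to instruments
}
```

Collecting even this much consistently across a series of events would make retrospective analyses far less dependent on organizers' memory.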