‘Hack weeks’ teach about big data through teamwork

Participants work on their projects at the 2018 Neurohackademy. (Credit: Alex Alspaugh/U. Washington)

A new interactive workshop teaches researchers at multiple stages of their careers about data science through collaboration.

Each night, high-definition cameras mounted to telescopes collect terabytes of data about objects in the sky. Each day, scientists sequence the genomes of people, animals, plants, and microbes for biomedical and evolutionary research. Each year, the Large Hadron Collider produces 30 petabytes of data on particle collisions.

Science has become a big-data endeavor. But scientists are not universally adept in “data science”—the computing and statistical skillsets needed to handle, sort, analyze, and draw conclusions from big data. The shortage of know-how in data science can hamper research, medicine, and even private industry.

The new course format, called “hack week,” blends elements of traditional lecture-style pedagogy with participant-driven projects. The most recent was a neuroscience-themed event held in July.

Teaming up to learn together

As the team reports in a paper published in the Proceedings of the National Academy of Sciences, participants rated the hack weeks as opportunities to learn about new concepts, foster new connections, share data openly, and develop skills and work on problems that will positively affect their day-to-day research lives.

“The idea behind hack week was to bring together people who were interested in data science and give them a place to meet, talk, and exchange ideas,” says lead and corresponding author Daniela Huppenkothen, associate director of the astronomy-focused DIRAC Institute at the University of Washington. “But instead of a traditional format with experts lecturing nonexperts, this would allow participants to mingle more and teach one another.”

Participants collaborating on chosen projects at the 2018 Neurohackademy. (Credit: Alex Alspaugh/U. Washington)

Huppenkothen was involved in the inaugural hack week event, “Astro Data Hack Week,” held in 2014. That event brought together big-data researchers in astrophysics and cosmology. Since then, the team has held four additional Astro Hack Week events, three “Neuro Hack Week” events for neuroscience, and two “Geo Hack Week” events for the geosciences.

All hack week events have the same basic design and organizing principles. They usually commence with some structured periods for instruction, and then shift toward time for participant-driven, open-ended projects, as well as peer networking and free discussion.

Hack the world

The projects can resemble a hackathon, but with greater emphasis on collaboration and learning than on specific outcomes. Hack week participants tackle their projects in smaller groups, with organizers circulating to observe and provide feedback or encouragement.

The projects range from experiments that the participants brought from their home institutions to ideas that come up during the course. One project from the inaugural Astro Hack Week, for example, eventually became Stingray, a software project to provide algorithms to analyze time-series data in astronomy.

At last month’s Neurohackademy, a new two-week version of Neuro Hack Week, one team worked on developing common ways to analyze different types of MRI scans.

The events’ open-ended structure places greater responsibility on the organizers of each hack week.

“A hack week takes a different kind of preparation, because you don’t have the security of ‘falling back’ on the structure of traditional talks and lectures,” says coauthor Anthony Arendt, a research scientist with the Applied Physics Laboratory who organized Geo Hack Week. “You have to set up ways to encourage participants at all levels of ability and comfort—creating a welcoming space for everyone to pitch ideas.”

Most hack weeks the team organizes cap the number of participants at 60. Organizers also strive to select participants to maximize diversity, including scientists with different abilities and backgrounds and at different stages of their careers. Participants also agree to abide by a code of conduct that emphasizes respect and positive interactions.

Spreading the word

In surveys conducted after eight hack weeks, participants rated the events positively as spaces to learn, teach, network, and foster relationships. More than three-quarters rated the hack weeks as successful learning experiences, while two-thirds reported teaching skills to someone else. This feedback was consistent across different backgrounds, showing that the unique format of hack weeks helps all participants feel included, says Huppenkothen.

“Now we want other scientific communities to learn about our experiences and see how they might start organizing their own events,” says Huppenkothen. “We also want feedback from other communities—both good and bad—and to widen the dialogue about data science and skill development.”

Their paper includes supplementary materials detailing the hack week experiences and advice for other groups interested in starting their own workshops.

Participants gave hack weeks high scores for promoting open-science principles, in which researchers publicly post and share their datasets, code, and methods. Open-science principles are critical to addressing the challenges that researchers face in making their research more reproducible, says coauthor Ariel Rokem, a data scientist with the eScience Institute and co-organizer of the recent Neurohackademy.

“One of our goals with the hack week format is to elevate the quality of science being done,” says Rokem. “The best way to do that is to try out ideas and share what you’ve learned.”

Additional coauthors are from the University of Washington; New York University; the University of California, Berkeley; and the University of Texas at Austin. The National Institutes of Health; the University of Washington; New York University; the University of California, Berkeley; the Charles and Lisa Simonyi Fund for Arts and Sciences; and the Washington Research Foundation funded the research.

Source: University of Washington