Book Expansion Sprint report #4

197 changes: 197 additions & 0 deletions turing-way-expansion-sprint.md
@@ -0,0 +1,197 @@
# Report: *Turing Way* Expansion Sprint 10-11 October 2019

## Part 1

### Participants

- [Kirstie Whitaker](https://www.turing.ac.uk/people/researchers/kirstie-whitaker)
- [Malvika Sharan](https://twitter.com/malvikasharan)
- [Evelina Gabasova](https://www.turing.ac.uk/people/researchers/evelina-gabasova)
- [Martin O'Reilly](https://www.turing.ac.uk/people/researchers/martin-oreilly)
- [Rachael Ainsworth (online)](https://www.research.manchester.ac.uk/portal/rachael.ainsworth.html)
- [Sarah Gibson](https://www.turing.ac.uk/people/researchers/sarah-gibson)
- [Louise Bowler](https://www.turing.ac.uk/people/researchers/louise-bowler)

### Report

#### Icebreaker

Participants thanked *Turing Way* members who were not in the room and who had made a personal impact on them.

#### Main theme

In this session, we aimed to reflect on how the project has been managed so far, and to discuss how that management needs to change as the project expands its scope.

#### Topics for consideration (post-it exercise)

In this section, attendees collected their topics of interest and compiled their ideas under the following categories:

- **Content and structure of 'the book':** how to keep this project’s theme of “this could be a chapter” while maintaining a consistent direction, how to create an overview of chapters that are developed independently, what the management plan is for the six volumes of the project, what the gaps are in the current book, and what updates are on our wishlist.
- **Policies and processes:** the existing workflow for contributors and its challenges, and processes that will make our workflow more time-effective by lowering the barrier to involvement. Furthermore, we want to identify the best way to manage contributions, issue lists, pull requests and merge processes on GitHub.
- **Leadership and roles:** identifying roles and teams in the community, reviewing their responsibilities, creating task lists, and defining the level of autonomy of contributors. This requires us to identify core influencers in the community and support them in a way that’s mutually beneficial for their careers and the development of *Turing Way* as a project.
- **Building communities online:** review our engagement plans and policies, establish standard means for online interactions, understand barriers and create a more welcoming space. Since this project is moving from a “reproducible community” to a “data science how-to community”, we need to establish a measure to assess our progress, check value alignment, and invite contributions from different areas of data science.
- **Involving desirable key contributors:** invite more contributors who can influence the cultural movement in data science widely and positively. Groups of such contributors who have not been involved so far include PIs, tech companies, and case study developers, all of whom will be crucial for this project going forward.
- **Event and local teams:** engage with people in person while maintaining a sense of a large community within small teams. Training will become an important part of this project when engaging with new members, and we all agreed that lessons in the style of The Carpentries will be really useful.

#### Group discussion:

For the group discussion, we went on to discuss **the content and structure of 'the book'**, which is essentially an expansion plan for the *Turing Way*.

The project will be written in six volumes covering the following data science topics:

- Reproducibility
- Design
- Communication & Outreach
- How to collaborate
- Ethics
- Meta-Turing-Way

A few main recommendations from this discussion are listed below:

- The project will be maintained through a GitHub repo, but we need to decide whether it will be a single repository or six different ones, and what the process for adding a new volume will be.
- It was suggested that we create a website with an interactive landing page that has six buttons, one for each volume. We can later think about having a downloadable version of the book for e-readers.
- We discussed possible reasons why contributors might not engage. One reason we identified is that people feel they need time and brain-space to contribute well. Mentored contributions are useful for those who find GitHub challenging. There should be different methods of capturing contributions.
- We need to plan how we announce our future plans to the community and invite their feedback. To motivate engagement from others, it will be important to share the raw process of creating chapters, provide a skeleton of the book (table of contents, headings, etc.), collect examples of short chapters (issue templates, labels, etc.) and define the expected timeline of the book.


#### Session end

The session ended with people sharing their concerns related to the project and aspects they are excited about.

## Part 2

### Participants

- [Kirstie Whitaker](https://www.turing.ac.uk/people/researchers/kirstie-whitaker)
- [Malvika Sharan](https://twitter.com/malvikasharan)
- [Martin O'Reilly](https://www.turing.ac.uk/people/researchers/martin-oreilly)
- [Sarah Gibson](https://www.turing.ac.uk/people/researchers/sarah-gibson)
- [Louise Bowler](https://www.turing.ac.uk/people/researchers/louise-bowler)
- [Eric Daub](https://www.turing.ac.uk/people/researchers/eric-daub)
- [Mishka Nemes](https://twitter.com/mishkanemes)
- [James Robinson](https://www.turing.ac.uk/people/researchers/james-robinson)
- [Rosie Higman (online)](https://www.sheffield.ac.uk/is/pgr/students/r-higman)
- [Patricia Herterich (online)](https://www.software.ac.uk/about/fellows/patricia-herterich)

### Report

#### Icebreaker

In this session, attendees shared their responses to two questions: 1) What does training mean for *Turing Way*? 2) What is your memorable training moment? Several attendees stated that learning is part of *Turing Way*: it shows people how to do data science, makes them more confident, allows them to demonstrate their work, and helps them teach each other and build a culture of helping.

#### Main theme

In this session, we discussed training, events and engagement plans for *Turing Way* and how we can link to other initiatives such as The Carpentries and Mozilla.

#### Topics for consideration (post-it exercise)

- **Aim of the training**: supporting people to use the *Turing Way* project effectively and ensuring that people in the *Turing Way* community are actually benefitting from the materials.
- **People involved**: Training is required on topics ranging from the basics of data science, GitHub and git, to community aspects such as collaboration and working openly in *Turing Way*, and technical skills such as continuous integration, Binder and containerization. We will have to create materials for different target audiences with different skills, expertise, and perspectives. To this end, we will also link our work with other communities to avoid reinventing the wheel. For example, train-the-trainer materials and training from The Carpentries and ELIXIR will ensure that people from within the community feel confident teaching. Other places to look at are [INCF training material](https://training.incf.org/), [Mozilla Science Lab](https://science.mozilla.org/), [The Carpentries Incubators](https://github.com/carpentries-incubator/proposals#readme) and [The Carpentries Lab](https://github.com/carpentrieslab).
- **Methods for training**: we will aim to adopt engaging techniques instead of classical slides or long lectures, for example bite-size videos with screen grabs. This could also be a way for people to make new contributions to the project.
- **Scope and scaling capacity**: training is essential and therefore well within the scope of this project. When developing materials, we need to think about the overlap of the contents with existing materials from other communities. We should focus on developing only those contents of The *Turing Way* that do not already exist as training materials. A few questions we need to think about are whether we have the capacity to train; if yes, where (online and in person) and how we are going to run this; and if not, who else can provide this kind of training.
- **Structure**: A few questions that we will need to address are where these resources should be located (within the project or outside it), and how to link appropriate training resources to different chapters by defining the prerequisites of each chapter as well as further learning and self-assessment materials to follow it.
- **Development**: these materials will be developed through lesson hackathons and other in-person events. We will need to identify experts who can help us get started, or review the materials. This is where we will also require external consultants and advisors.

#### Group discussion:

In this section, we discussed the scope and scaling capacity of The *Turing Way* in detail.
Integrating *Turing Way*’s training into The Carpentries as a partner will be very useful to avoid any potential conflict of interest and to complement other communities.

The Carpentries offer a vast amount of information online, which may be quite difficult for a new contributor to understand. After the training workshop, however, it becomes easier for new contributors to make sense of it. In a similar way, *Turing Way* will be able to benefit its members by training them.

We need to define how we aim to cooperate with other communities and what our level of independence will be. We will also have to clarify which policies from which organization apply, and when they do not.
When discussing the course content itself, we discussed what a coherent course looks like for The *Turing Way* and the Turing Institute. It is important to work with the training team of the Turing Institute while maintaining our independence and the financial support needed to run these events for free.

We further discussed the scope of the training in detail. A few important points from the discussion are the following:


- Our target audience is large and not limited to the UK or The Turing Institute. We as a team don’t have the capacity at the moment; therefore, we need to figure out the modularity and recipes/workflow/process.
- Training will become one of the main aspects of the *Turing Way* to bring people together as a community and help them grow, while books will be useful when they go back to work on their own.
- The scope for training is large, but we will have to identify our priorities based on who is in the community and what they need. Most of the training will be aimed at online participants to create a self-directed learning experience.
- Content will be potentially developed and edited by people who use these skills every day in their work or shared by those who have already developed them before.
- Not every chapter will have training attached to it as it may slow down the progress or distract people from going further. The focus of the training will be on related topics such as data management skills, coding skills, Machine Learning, etc. They will be referenced before or after the chapter when training is required to understand the chapter completely.
- We can use the interactivity of Jupyter Notebook and potentially integrate a few training materials directly in the book. However, the materials which require in-person meetings can be hosted in The Carpentries Lab.
- We have to figure out pathways that will allow others to mix and match contents from both the book and the training as required, for example based on the length of a workshop or the requirements of their projects. Creating modular content will therefore be useful. Examples of use cases will be a great addition to the project.
- The training aspect of the project is highly important. However, the *Turing Way* team needs to develop a project plan to identify the resources available and required for doing this properly.


#### Session end

At the end of this session, we shared what we want to do in the *Turing Way* project before the end of this year. A few responses were to develop a contributing pathway to taking training, to identify external resources for training, and to identify priorities within the training plan.

## Part 3

### Participants

- [Kirstie Whitaker](https://www.turing.ac.uk/people/researchers/kirstie-whitaker)
- [Malvika Sharan](https://twitter.com/malvikasharan)
- [Evelina Gabasova](https://www.turing.ac.uk/people/researchers/evelina-gabasova)
- [Martin O'Reilly](https://www.turing.ac.uk/people/researchers/martin-oreilly)
- [Sarah Gibson](https://www.turing.ac.uk/people/researchers/sarah-gibson)
- [Louise Bowler](https://www.turing.ac.uk/people/researchers/louise-bowler)
- [Tania Allard (online)](https://www.software.ac.uk/about/fellows/tania-allard)
- [Amber Raza](https://www.turing.ac.uk/people/research-engineering/amber-raza)
- [Shakeel Liaqat](https://www.linkedin.com/in/shakeel-liaqat-01771112/)
- [Emily Neilson](https://www.turing.ac.uk/people/business-team/emily-neilson)

### Report

#### Icebreaker

The session started by asking people to describe a time that they created an idea. Most of the attendees shared an idea from outside their work, mentioning how they used creative skills such as photography, traveling or crafting to create something special or communicate their ideas to others.

#### Main theme

One of the volumes in the *Turing Way* will be on “designing your research”, which will include conversations with collaborators aimed at setting expectations appropriately. In this session, the meeting attendees brainstormed and participated in discussions that helped generate a list of desired content for this volume of the book.

#### Topics for consideration (post-it exercise)

- **Research questions**: We discussed that before starting a project, one needs to think about what makes their question a “good question”, how they can make their questions more useful for enhancing their project, how they can identify all the possible challenges that need to be addressed, and which resources they can reuse while seeking and describing novelty in their work.
- **Data and resources**: The second topic in designing a project is to evaluate the status of the project’s data, reproducible techniques, outputs, archiving process, and communication.
- **Scoping and project planning**: Before starting a project, one needs to understand the research question, the goals of that question based on the user or target audience, the resources available and the constraints. One then needs to plan the scope of the project in terms of the ethics and usability of its outcome, the expected minimum viable product, synergies with other projects, similarities or differences compared to other projects, a measure of success, and the overall impact of the project.
- **Pathways of working**: We discussed the different roadmaps one can take within a single project. The structure of a project might look quite different in formal settings, aimed at archiving and improving the findability of resources for others, and in informal settings, aimed at agile development and sharing resources internally with the team. Before starting a project, one can choose from the pathways for an open project or for reproducibility, and understand the practices that should be avoided in a certain project.
- **Ethics**: In data science, when dealing with different types of data, we need to learn when and how we can introduce ethics to avoid harm to others. It will be useful to explore how one can manage different components of the research by handling openness vs confidentiality of sensitive digital and non-digital raw materials and products.
- **Case studies**: One of the most important resources of the project will be a compilation of case studies for small, mid-size, and large projects, spanning both short-term and long-term plans. A few examples of failed projects will also be quite valuable for teaching the aspects that may create risk for a project.
- **People involved**: This section aims to identify the different stakeholders and the mode of recruitment based on their skills, background, and diversity. It also raised questions about what is planned for up-skilling, supporting and improving accessibility for these stakeholders. When designing a team, it is important to think about all the skills required for the project and the resources available to access those skills.
- **Project Management**: This section includes management aspects such as time, budget, risks, expectations, people, resources and timeline.


#### Group discussion:

In the group discussion, we explored the topic of the “research question” in great detail, breaking down each aspect of the project design and research goals of a data science project. This involved everyone asking a set of questions that can help researchers design a project that uses best practices in a manner that is valuable for them and others.

#### Session end

We ended the session by acknowledging that the questions discussed in this session can be converted into a checklist to help establish standard practice for project management in data science.

## Part 4

### Participants

- [Kirstie Whitaker](https://www.turing.ac.uk/people/researchers/kirstie-whitaker)
- [Malvika Sharan](https://twitter.com/malvikasharan)
- [Evelina Gabasova](https://www.turing.ac.uk/people/researchers/evelina-gabasova)
- [Martin O'Reilly](https://www.turing.ac.uk/people/researchers/martin-oreilly)
- [Sarah Gibson](https://www.turing.ac.uk/people/researchers/sarah-gibson)
- [Louise Bowler](https://www.turing.ac.uk/people/researchers/louise-bowler)
- [Patricia Herterich (online)](https://www.software.ac.uk/about/fellows/patricia-herterich)
- [Tania Allard (online)](https://www.software.ac.uk/about/fellows/tania-allard)
- [Amber Raza](https://www.turing.ac.uk/people/research-engineering/amber-raza)
- [Catherine Lawrence](https://www.turing.ac.uk/people/business-team/catherine-lawrence)

### Report

#### Icebreaker

The icebreaker question was intended to capture the participants’ final thoughts on their participation in this sprint.

#### Main theme

This was a catch-all session to make sure that we finished the two long days by capturing key points that weren’t covered in the previous three sessions. Topics included engagement within the Turing and beyond, next steps after the meeting, and the future involvement of people in the room.

#### Topics for consideration and group discussion:

In this section, we shared our thoughts on potential challenges and reflections on the topics that had not been discussed in the previous three sessions. This included identifying people we should reach out to, identifying priorities, highlighting the next immediate actions, clarifying the importance of this project for The Turing research culture and other data science communities, and the need to create true value for the contributors to the *Turing Way*.

#### Session end

The meeting ended with participants sharing final thoughts on the need for and frequency of these meetings in future, creating official associations with the *Turing Way* for people who are not formally affiliated with this project, the need to create roles and identify teams and task forces, and ways to reward them fairly for their work.