-
Notifications
You must be signed in to change notification settings - Fork 4
/
05-Education_training.Rmd
379 lines (308 loc) · 24.9 KB
/
05-Education_training.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
# Education & Training {#Ch-Edu}
Researchers often view software development skills as separate from, but complementary to
their disciplinary training. This perspective introduces a dilemma for researchers, particularly
for early-career researchers, who need to master their discipline and simultaneously acquire
sufficiently-good software engineering skills. When forced to choose, they typically make sacrifices
or take shortcuts in software engineering, as they view their disciplinary skills and knowledge as
more important to their careers, and more rewarded by current incentive systems. Further, many
researchers do not aspire to become software engineering experts, but approach the acquisition of
software skills as a means to a research end. This approach is implicitly encouraged by the fact
that the educational content of most disciplines that require software skills does not actually
include sufficient discussion of software development or engineering. As a result, there are valuable
skills such as packaging, testing, releasing, archiving, and even designing research software that
are dramatically underdeveloped in the overall research ecosystem.
Indeed, previous studies demonstrate that researchers are rarely purposely trained to develop software.
A 2017 survey of US National Postdoctoral Association members regarding postdocs' use of software in
research and their training regarding software development found that 95% of respondents use research
software. However, 54% (n = 104) of the respondents to this survey reported that they have not
received any training in software development [@nangia_katz_2017]. A similar survey of software
use and training from the UK repeats this finding: in 2014, Hettrick et al. asked researchers at
randomly selected researchers at 15 Russell Group universities about their software use and training.
Of the 417 responses, 45% reported having no training in software development. Of the 55% of respondents
that reported having received training in software development, 73% had received some form of
formal training in a software development course, the remaining 27% were self-taught. In further analyzing
these results, Hettrick et al. report that 21% of respondents who develop their own software had
no training in software development [@simon_hettrick_2018_1183562; @Hettrick-blog]. A 2016 survey
of 704 PIs from the Biological Sciences Directorate of the US NSF found the most pressing unmet
needs are training in data integration, data management, and scaling analyses for HPC [@barone2017unmet].
And a 2018 survey of almost 1600 scientist-developers found that 82% of respondents felt that
they were spending “more time” or “much more time” developing software than they did 10 years ago [@Pinto2018].
Not only is there a gap in software development training in general, there's also an additional
gap across gender. When analyzed by gender (self reported binary of men and women) the first
two surveys show that only 39% of female respondents in the UK and 32% in the USA report
have received some form of software development training, compared to 63% of male respondents
in the UK and 64% in the USA.
This lack of training is not new, nor is it newly discovered. Greg Wilson has written
[@wilsong2016; @softwarecarpentryhistory] about the history of Software Carpentry:
> In 1995–96, [Wilson] organized a series of articles in IEEE Computational Science &
> Engineering titled, “What Should Computer Scientists Teach to Physical Scientists and
> Engineers?”. These grew out of the frustration he had working with scientists who wanted
> to run before they could walk, i.e., to parallelize complex programs that were not broken
> down into self-contained functions, that did not have any automated tests, and that were
> not under version control.
> In response, John Reynders (then director of the Advanced Computing Laboratory at
> Los Alamos National Laboratory) invited [Wilson] and Brent Gorda (now at Intel) to
> teach a week-long course to LANL staff. This course ran for the first time in July
> 1998, and was repeated nine times over the next four years. It eventually wound down
> as Gorda and [Wilson] moved on to other projects, but two valuable lessons were learned:
>
> 1. There is tremendous pent-up demand for training in basic skills.
>
> 2. Textbook software engineering is not the right thing to teach most scientists.
This need led to the Software Carpentry organization, which built a successful train-the-trainer
model for collaborative lesson development and delivery, and which was used as a model
to build Data Carpentry, which then merged with Software Carpentry into The Carpentries,
and now also includes Library Carpentry. The Carpentries has developed highly impactful
training content and events that meet an intermediate need for training researchers to
better develop software. Many computing centers, including those funded by NSF and DOE,
have also long had user training in some types of development (similar to that of Los
Alamos), for example, joint activities run by
[TeraGrid](https://www.xsede.org/wwwteragrid/archive/web/eot/workshops.html) and now
[XSEDE](https://www.xsede.org/for-users/training), and Argonne's
[ATPESC](https://extremecomputingtraining.anl.gov). Given the success of the Carpentries,
these computing center training events often either include or contribute to Carpentries
lessons, or complement them. In addition, other NSF-funded software institutes (e.g.,
MolSSI, SGCI, and IRIS-HEP) are also developing training resources for their target
communities, similarly making use of, contributing to, and complementing the Carpentries.
While many researchers may learn how to program on their own, the opportunities for them
to learn software engineering and software design are less available. Therefore, we believe
that there is an important role that a US-based research software institute
can play to expand the breadth and depth of the training available, and to further
coordinate and facilitate the acquisition of software engineering expertise by researchers
engaged in various development activities. Rather than reinventing content, we anticipate
collaborating with and pointing to relevant resources developed through other related efforts.
The Education & Training thrust of URSSI complements and builds upon existing efforts through
the following planned activities (more detail appears below):
1. Develop and annually run a summer school on practices for sustainable software development
for research software.
2. Develop a Research Project Carpentry lesson program that teaches people how to turn code
into a formal project.
3. Plan a review service for software project plans (and note that the implementation of this
plan is likely out of scope of URSSI by itself).
4. Evaluate and document successes and failures of industrial software development practices
in research software.
5. Compile and maintain a body of knowledge of best practices for research software development.
(This URSSI activity will seek to aggregate both existing practices, and serve as an outlet
for emerging practices from other domain-specific software institute, e.g., MolSSI, SGCI, IRIS-HEP)
## Education & Training resources
<center>
![FigName](images/education_structure.png)
</center>
This area requires the following staff:
* Training Lead: coordinates all training activities and assessment of those
activities (1 FTE).
* Training Officer: assists Training Lead in coordination of training activities,
develops and delivers training activities (1 FTE).
* Assessment Coordinator: leads the assessment efforts, in conjunction with the Training
Lead (this position could be shared with other parts of the project) (0.5 FTE).
* Research Coordinator: leads the effort to gather experiences from research
software projects (0.25-0.5 FTE).
## Education & Training methods
This section provides more details on how we plan to design and execute each of these
activities within URSSI.
### Develop and annually run a summer school on practices for sustainable software development for research software
*Overview*: During the conceptualization phase, we identified a need for additional,
focused training opportunities on sustainable software development practices for researchers
who were not in a domain covered by an existing institute. One of the primary reasons for
this need is that it is generally difficult for degree programs to make room for this type of
content. We plan to meet this need with a summer school, which is a proven method for
providing hands-on training to transfer these skills to students, in this case on sustainable
software development practices for research software development. We conducted a Pilot
Winter School as part of the conceptualization process. The details of that activity appear
in [Chapter 2](#chapter2-WinterSchool). The plan described here builds on the key lessons
learned, including: (1) the school needs to be longer to allow for both discussion time
and focused time for students to work on their own code, applying the lessons from the
lectures, and (2) the presence of additional helpers outside of the primary instructors
was quite beneficial to help answer specific questions from the students.
*Content*: Based on the needs identified during our survey, workshops, and other discussions,
we identified the following topics as candidates for inclusion in such a school:
(1) software design (including modularity), (2) software testing,
(3) packaging and distributing code,
(4) collaborative software development with modern version control (i.e., git/GitHub),
(5) peer code review, (6) open science principles, and (7) software citation.
*Format*: The summer school will be a 1-week in-person event. We plan to follow the
successful PNAS hackweek model [@Huppenkothen8872] where we allocate 50% of the time
to lectures and 50% of the time to putting the concepts into practice. We will
invite instructors with expertise in the different topic areas (see above) to participate
and deliver the lectures, and will take diversity into consideration when selecting instructors.
One of the criteria used to select participants will be a diversity statement that potential participants
will be required to submit, and at least 25% of the selected participants will be members of underrepresented groups.
To facilitate the “hacking” time,
we will ask participants to bring a code project they wish to work on during the
school. The “hacking sessions” will provide ample opportunities for the participants to
have hands-on time applying the concepts from the lectures to their own code projects. These sessions
also provide time for code reviews and other ad hoc topics that arise during the school.
The instructors will be available during the hands-on time to help participants as
needed. By the end of the school, participants should have substantial intellectual
knowledge and experiential knowledge. Because researchers often do not have sufficient
time for training or to attend events like this one, as an alternative to the face-to-face
summer school events, we also plan to package the content into Carpentries-style lessons
that can be used either in shorter training sessions, webinars, or asynchronously.
The URSSI summer school is primarily aimed at the people and careers portion of URSSI's impact goals, and secondarily at the research software sustainability and research software ecosystem portions.
### Develop a Research Project Carpentry lesson program that teaches people how to turn code into a formal project
*Overview*: One challenge faced by many research software developers is how to convert
a small (individual) coding effort into a more formal project. This situation occurs
frequently as researchers develop small units of code to accomplish tasks important in
their own work. When other researchers are exposed to these code units, they often want
either to use the code, add to the code, or both. This is the point when the researcher
needs to convert that code into a more formal project, both to ensure quality as others
use it and to allow the community to contribute to it. Therefore, we propose to develop
a Research Project Carpentry lesson plan, along similar lines as Software Carpentry, that
will help researchers make this transition from an individual coding effort to a
community-focused project.
*Content*: The goal of this effort will be to help projects develop a good project plan.
As such, the Research Project Carpentries lesson program will focus on helping researchers
understand what content a good plan needs to contain. It will also provide information
researchers need for making choices about which of the various practices are most relevant
to their particular project. The topics to be covered in Research Project Carpentry include:
(1) requirements elicitation, (2) issue tracking, (3) source and version control, (4) testing,
(5) documentation, (6) packaging and distribution, (7) pair programming, (8) code review, and
(9) metrics. Note that there is some overlap between the focus of Research Project Carpentry
and the Summer School described above. We anticipate reusing relevant content between these efforts.
*Format*: We will develop modules on the topics above as carpentries-style modules. This approach
will allow the Research Projects Carpentries materials to be used either in a face-to-face training
course, like the Summer School, or as stand-alone units. This flexibility will allow researchers
to access the content in a mechanism that is most relevant to them. When this material is tested,
we will use use a diversity statement submitted by potential participants as an input to the decision
process, and at least 25% of the selected participants will be members of underrepresented groups.
*Note*: While this activity is relevant to the URSSI effort, we believe that it will have appeal
beyond URSSI. Therefore, even though we include it here in the plan, we will also seek funding to
support it from organizations outside of NSF. We also plan to solicit other organizations and
individuals to contribute to this activity. URSSI will lead this work, organizing a community
that will include external participants.
The Research Project Carpentry lesson program is primarily aimed at the research software sustainability portion of URSSI's impact goals, and secondarily at the research software ecosystem portion.
### Plan a review service for software project plans
Building on the work done for Research Project Carpentries, URSSI has identified the need for
a service where experts review and comment on proposed project plans. Researchers who are
interested in this service would submit a project plan in a specified format. Then, experts
would review the plan to ensure both that it is (1) complete with respect to the content necessary
for the type of project and its current stage, as well as (2) the choices made for each aspect of
the project plan are consistent with best practices. Once the review is completed, the reviewer
would work with the project team to help resolve any deficiencies identified in the original
project plan. This model is akin to that used by many "incubators" for start-up companies.
This activity is likely beyond the scope of URSSI, so we will create an initial plan for it,
then seek to partner with other organizations who are pursuing similar efforts and explore
external funding sources. In addition to the operating the service, URSSI and partners will need
to develop and make available the following resources: (1) templates for project plans,
(2) checklists for researchers to follow when developing the plan, and (3) guidance on how
to choose from various alternatives for each section of the plan.
The softare project plan review service is aimed at the research software sustainability portion of URSSI's impact goals.
### Evaluate and document successes and failures of the use of industrial software development practices in research software
Industrial software engineering has developed and refined a number of best practices for
software engineering and software development, resulting in a large body of related literature.
Conversely, it is not always clear which of these practices are relevant for use in developing
research software, most notably because of the different incentive structures for developers
in the research arena. It is also not always clear which aspects
of the research software domain motivate the need to develop new software engineering or
software development practices to handle the unique contexts within which research software
exists. To accomplish this activity, URSSI will need to perform the following steps.
First, URSSI will gather experience reports describing the successful and unsuccessful use
of industrial software engineering and software development practices in research software
projects. Gathering these experiences will involve conducting surveys and interviews of
developers of research software to document important information. In addition to the
successes and failures with specific practices, these experience reports will also document
cases where developers of research software encountered a situation or a problem for which
they were unable to find software engineering or software development practices that were
relevant. Cases where there is a lack of relevant practices suggest opportunities for
creation of new practices.
Second, URSSI will analyze the experience reports to systematize the successes and failures
and develop a series of evidence-based lessons learned. Because this information will be
based upon real experiences from research projects, they will have great value to other
research software projects.
The results of these efforts will feed into the next activity by providing concrete experiences
that we can document into best practices. These experiences will be beneficial because we
will draw them from experiences on real projects, with knowledge about the context within
which the practice works. This context will help provide a description of the limitations
on the practice, which is equally important to document.
This activity is primarily aimed at the research software sustainability portion of URSSI's impact goals, and secondarily at the research software ecosystem portion.
### Compile and maintain a body of knowledge of best practices for research software development
Based on the outcomes of the above activities, URSSI will develop a portal to provide living
information about best practices for research software development. In addition to including
content developed from our own activities, URSSI will also actively find, through various
means that include workshops and surveys, and curate information from other sites related
to the development of research or scientific software. We will specifically ensure that
information produced by the Carpentries is included, as relevant.
We will curate the following types of information:
1. Checklists for sustainable software - much of the content here will overlap with
ideas discussed in earlier activities.
2. Templates for use in developing software, including templates for project plans,
templates for documenting requirements, templates for test planning, and templates
for community guidelines and policies.
3. Badging efforts that provide recognition for various qualities of research software
and research software projects.
In addition to best practices, the URSSI clearinghouse will also curate information
about training resources. In addition to providing links to the URSSI-developed training
(described above), we will also link to other relevant training resources developed
through other efforts like MoLSSI, SGCI, and IRIS-HEP. To help individual researchers
and teams better understand how to make use of the various training options, we will
develop roadmaps that guide people to the most relevant training for their current situation.
This activity is also primarily aimed at the research software sustainability portion of URSSI's impact goals, and secondarily at the research software ecosystem portion.
## Cross-cutting activities
To support the five activities described above, URSSI will also need to perform a number
of cross-cutting activities as follows:
1. _Curriculum Development and Coordination_
To achieve the goals stated above, URSSI will need to support the development of new
training materials. We anticipate providing support to experts in the research software
community who can take their existing knowledge and package it in a Carpentries-style
format that is most amenable to be used in the various types of training activities URSSI
will support. Importantly, the training proposed above is meant to build upon, and
provide a complement to existing educational outreach efforts. This training will provide
follow on, more advanced content than the Carpentries, and will be longer in duration.
We can provide this advanced content because our target audience is a subset of the audience
focused on by the Carpentries.
2. _Instructor Training_
To help deliver all of the developed content to a wide range of audiences in various
locations, URSSI will need to provide support for training instructors. These instructors
will be integral to the success of the Summer School and the Research Projects Carpentry.
3. _Piloting Workshops_
As we develop content, we will conduct smaller workshops to pilot the content to ensure
its sufficiency before deploying it to a larger audience. Because we will be asking people
to participate in content development by providing feedback, we will need to have resources
to support the travel for instructors and participants.
4. _Gathering Experiences on Successes and Failures of Software Practices_
URSSI will support a team of researchers, who are experienced in survey and interview methods,
to gather experiences, summarize the results, and document the findings. In addition, URSSI will
need to provide support for travel for URSSI personnel to meet with members of research projects.
5. _Conducting Workshops_
URSSI will provide support for the Summer School and Research Project Carpentries instructors
to travel to the workshops. URSSI will also provide small fellowships for the instructors
to compensate for the time required to participate. In addition, to increase participation
from a diverse community, URSSI will provide travel support for a small number of workshop
participants, determined based on need in individual cases.
6. _Packaging and Sharing_
As URSSI develops content related to the educational activities described in this chapter, we
will package that content in a shareable format. We will make the content freely
available on GitHub. We will also archive each release of content on Zenodo using a proper
license that allows anyone to use and/or modify the content as needed. In addition, we will
strive to work with the Carpentries to enable the content to reach more people.
## Education & Training metrics/milestones
To ensure that the training activities are providing sufficient content, URSSI will conduct
numerous assessment activities. These activities will occur both concurrently with the
training activities, i.e., as entry and exit surveys, as well as longitudinally after
training. The benefit of conducting assessment longitudinally is that it provides insight
into whether the content learned during training is viewed as useful over time as real
projects evolve.
Some of the specific measures that we will gather include:
* Post-event evaluations: For workshops and training courses, we will conduct follow-up surveys
to evaluate the value of the content. We will conduct evaluations immediately at the end of
each event while the information is fresh in the attendees’ minds. We will then conduct a
follow-up survey 6 months, 1 year, and 3 years later to evaluate the applicability of the
content to the attendees’ work. In conducting these evaluations, we will build on lessons
learned and guidance from [The Carpentries Assessment Network](https://carpentries.org/assessment-network/).
* Demand: We will measure demand for the content by the number of people who apply to attend
various training events and workshops.
* Use of material: We will measure downloads and use of the information that we produce.
**Impacts:**
This table maps URSSI activities from this chapter to the three portions of URSSI's intended impact.
(A complete table of impacts from all URSSI activities can be found in [Chapter 10 (Metrics and Evaluation)](Ch-Metrics).)
In the impact cells, X indicates a designed primary impact on an activity, and y indicates
a designed secondary impact.
```{r echo=FALSE, message=FALSE, warning=FALSE}
library(kableExtra, warn.conflicts = TRUE)
options(knitr.kable.NA = '')
df <- readr::read_csv("edu_intended_impact.csv")
knitr::kable(df, align = "lccc", caption = 'Impact table',
col.names = c("Activity", "Impact on Research Software", "Impact on people and careers", "Impact on research software ecosystem"),
booktabs = TRUE)
detach("package:kableExtra", unload = TRUE)
```