A Socio-technical System for Collaborative Writing and Publishing

Introduction

In recent years, a number of platforms and online editors, large and small, have come onto the market to simplify collaborative writing and publishing. Authorea and Overleaf are perhaps the best known. They are very powerful and offer many convenience features that simplify the scientific writing and publishing process. As Christian Heise found in his open doctoral thesis, these approaches are heading in the right direction, but for various reasons they do not deliver what a scientific community can expect from such tools (cf. Heise, 2017, 2018). Dependencies on business models play just as much a role here as data protection aspects and a lack of incentives.

“However, the necessary (further) development of the platforms will only take place if demand for such solutions increases. Here, too, the scientific community is called upon to generate this demand (for example through experiments with open scholarly communication) and to take an active, formative role in the development of such solutions.” (Heise, 2018, p. 261f.)

Experiments are in demand

Further research and development needs are therefore indicated. In the “Modern Publishing” project, we are addressing the question of needs-based and contemporary tools and workflows from the perspective of socio-technical systems. But pulling together digital tools for collaborative writing and publication processes means not only keeping an eye on technology. Equally important to us are the actors involved with their highly diverse writing habits, computer experiences, reputation cultures and publication strategies in their subjects. From this double perspective, a system to be developed can only be understood as a socio-technical system that, in the sense of Herrmann (2003), considers and develops the network of social, cultural, organizational and technical aspects and subsystems in equal parts.

We are guided by the central values of the Open Science discourse: we want to analyse and implement transparency, accountability and reproducibility as well as access and the FAIR principles (cf. Wilkinson et al., 2016) with regard to contemporary scientific publication processes.

The Architectural Design

In the following we want to develop and explain the decisions for our initial architectural design. The individual technical components will be presented and their interaction elaborated.

Architectural design of a socio-technical system for collaborative writing and publishing of scientific journal articles. Source: Axel Dürkop

The figure is roughly divided into three parts:

  • In the Writing Stage, the writing process is represented by one or more persons who alone or together, synchronously or asynchronously work up to a first draft of a scientific journal article.
  • The Pre-submission Stage in the middle is a phase of the writing and publishing process that can be repeated before the actual submission. In this phase, various tools and formats are used, which are presented in detail below. The green figure represents the new role of the Pre-submission Facilitator.
  • The Submission Stage on the right shows the transfer of generated formats from the Pre-submission Stage to a journal. In our project the target system is OJS at the Staats- und Universitätsbibliothek Hamburg (SUB).

Premises to the Writing Stage

We assume that the writing process of a journal article does not begin with one person, but rather with a team. We base this assumption on initial discussions with colleagues from different disciplines about their writing habits and publication strategies. In the further course of the project, we will conduct systematic research here. The authors negotiate their preferences for writing tools and the organisation of the writing process with each other and agree on the division of parts to be written, the mode of writing and the attribution of authorship.

In this Writing Stage we want to make recommendations and encourage the authors to try something new. So instead of sending Office documents to each other by mail, as may have been done in the past, we encourage the authors to use collaborative writing tools like Etherpad, CryptPad or HackMD. It is important to us that the authors feel comfortable in their writing environment. Since writing is a very individual and intimate process, the quality of the scientific contribution should not suffer because the authors have to learn new tools at the same time.

Pre-submission Stage: the Phase before Submission

If the authors have written a first draft of their article, we offer a transition to the *Pre-submission Stage*. The *Pre-submission Stage* is the phase in the publication process in which a paper is technically prepared *before submission* and can go through initial quality assurance cycles.

The Role of the Pre-submission Facilitator(s)

For the Pre-submission Stage, we have created a new role, the Pre-submission Facilitator, which is shown in green in the illustration of the architectural design. As part of the “Modern Publishing” project, we will take on this role together with team members and find out which social and technical competencies it requires. At this stage of the project, in the role of Pre-submission Facilitator, we are already talking to the authors at the beginning of the Writing Stage in order to win them over as test subjects.

The new role of the *Pre-submission Facilitator* communicates functions and values of the socio-technical system to the authors and other actors involved. It provides concrete assistance in the establishment and application of the system on the way to publication. Source: Axel Dürkop

We assume that the role of *Pre-submission Facilitator* will become superfluous with the increasing competence of the authors, because they can take on the tasks and work steps themselves.

Technical Preparation of the Draft

At this point, the Pre-submission Facilitators have the chance to draw the authors’ attention to the advantages of Markdown and pandoc/pandoc-scholar, above all that different open formats (HTML, PDF, EPUB, MOBI etc.) can be generated from one Markdown source at any time. Krewinkel & Winkler (2017) describe these advantages in detail and, with pandoc-scholar, extend the functionality of pandoc to meet the requirements of modern scientific publication strategies. We therefore want to make pandoc-scholar a central component of our system. Grandesso (2018) has already done preparatory work in this context and outlined a workflow from pandoc to OJS.
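The "one source, many formats" idea can be sketched in a few lines. This is a minimal illustration that assumes plain pandoc as a stand-in for pandoc-scholar; the file name `article.md` and the output directory `build` are hypothetical. MOBI is left out because, as far as we know, pandoc has no MOBI writer and a separate converter would be needed.

```python
from pathlib import Path

# Formats that pandoc infers from the extension of the output file.
# PDF output additionally requires a PDF engine (e.g. a LaTeX installation).
FORMATS = ("html", "pdf", "epub")


def build_commands(source: str, out_dir: str = "build") -> list[list[str]]:
    """Return one pandoc argument list per output format."""
    stem = Path(source).stem
    commands = []
    for ext in FORMATS:
        target = (Path(out_dir) / f"{stem}.{ext}").as_posix()
        commands.append(["pandoc", source, "--standalone", "--output", target])
    return commands


if __name__ == "__main__":
    # Only print the commands; pass each list to subprocess.run(cmd, check=True)
    # on a machine where pandoc is installed.
    for cmd in build_commands("article.md"):
        print(" ".join(cmd))
```

Because every format is derived from the same Markdown file, a correction made once in the source reaches all outputs on the next run.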

If the authors do not submit their contribution in Markdown, the first task of the Pre-submission Facilitator is to convert the original format accordingly. Only if the text is available in Markdown can pandoc-scholar make use of the many advantages of automated format generation, which are described further below. The conversion is done with pandoc and additional manual work, e.g. to capture the metadata of the article correctly. Source formats can be e.g. Word documents with correctly applied style templates or LaTeX documents.
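The automated part of this conversion might look roughly like the following sketch. It assumes that the reader is chosen by file extension and that embedded images should be extracted into a `media` folder; the file names are hypothetical, and the manual clean-up of metadata mentioned above is not covered.

```python
from pathlib import Path

# Map source file extensions to pandoc reader names (assumed selection).
READERS = {".docx": "docx", ".tex": "latex"}


def to_markdown_command(source: str) -> list[str]:
    """Build the pandoc call that converts a draft to Markdown."""
    suffix = Path(source).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"no reader configured for {suffix!r}")
    target = Path(source).with_suffix(".md").as_posix()
    return [
        "pandoc", source,
        "--from", READERS[suffix],
        "--to", "markdown",
        "--wrap=none",            # keep each paragraph on one line for cleaner diffs
        "--extract-media=media",  # save embedded images next to the Markdown file
        "--output", target,
    ]


if __name__ == "__main__":
    print(" ".join(to_markdown_command("draft.docx")))
```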

Interaction of Actors and Technology in the Pre-submission Stage

After the conversion of the draft, the Pre-submission Facilitator will set up the contribution for the technology stack of the Pre-submission Stage. This stack mainly consists of GitLab, Docker and pandoc-scholar. How these three components work together can be outlined as follows (see also the illustration of the architectural design above):

A scientific journal article is stored in Markdown format in GitLab. GitLab is configured to generate an HTML page of the article with pandoc-scholar each time the Markdown file is changed. pandoc-scholar is executed in a temporary Docker container. The generated HTML page is then made available (access protected) by a web server.

The public or an invited community can now use Hypothesis to comment on and annotate the protected website, encouraging the authors to revise the article before actually submitting it. The authors can participate in this discussion.

Through the concept of Review Apps in GitLab, which is based on branches, it is possible to publish as many versions of the article as desired on the web, together with comments, and to improve it iteratively together.

Using pandoc-scholar, different output formats of the article (HTML, PDF, EPUB, MOBI etc.) can be generated during each run and transferred directly to a journal hosted in OJS once the authors are ready to submit.

This short summary of the system will now be examined in more detail.

GitLab and Docker Together: A Universal CMS

With GitLab, Docker, pandoc and static site generators (Jekyll, GitBook, Hugo and others), we have already gathered extensive experience in the development of Open Educational Resources (OER) in recent years. We are now transferring this experience to the context of scientific publications in the “Modern Publishing” project.

Central to our understanding is that we see GitLab as a universal content management system: if possible, users work in the browser. Changes to “source code” trigger the preconfigured pipeline described above, which starts a Docker container (see the contents of the circle in the figure above). In our project, the Docker image for this container contains pandoc-scholar, so that we can generate different output formats online and use them in different ways. We publish the generated HTML artifacts on the web for commenting with Hypothesis; other formats will be submitted to the journal in OJS in due course.
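To get a feel for what the pipeline does on each change, the same step could be reproduced locally in a throwaway container. The sketch below only assembles the `docker run` call. The image name `example/pandoc-scholar`, the mount point `/data`, the project path and the plain pandoc build command are assumptions for illustration, not the project's actual configuration.

```python
def docker_build_command(project_dir: str, image: str, build_cmd: list[str]) -> list[str]:
    """Assemble a `docker run` call that builds the article in a temporary container.

    project_dir should be an absolute path to the checked-out repository.
    """
    return [
        "docker", "run",
        "--rm",                               # discard the container after the run
        "--volume", f"{project_dir}:/data",   # make the sources visible inside
        "--workdir", "/data",
        image,
        *build_cmd,
    ]


if __name__ == "__main__":
    cmd = docker_build_command(
        "/srv/article",                       # hypothetical checkout location
        "example/pandoc-scholar:latest",      # hypothetical image name
        ["pandoc", "article.md", "--standalone", "--output", "article.html"],
    )
    print(" ".join(cmd))
```

In GitLab itself the runner starts the container, so no explicit `docker run` appears in the pipeline configuration; the sketch merely mirrors the effect.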

Review Apps in GitLab

GitLab offers a very powerful feature that we also use in this project: Review Apps. With the appropriate configuration, digital artifacts of the project are generated from each branch. This means that all desired formats such as HTML, PDF etc. can be generated separately for each version of the article.

The idea comes from software development. It is difficult for software clients to imagine what a desired feature will look like once it has been programmed. Only the code written for it makes this concrete. Understanding and learning become possible when all participants engage with the concrete implementation.

Review Apps are previews of software features that are developed in independent branches. A review app would be, for example, the (usually protected) view of a new subpage or a new function in a webshop that still requires coordination between developers and clients. If the feature is accepted, the development branch is merged into the main branch.

Use of Review Apps in software development. We translate this way of working for the creation of alternative versions of text contributions in the Pre-submission Stage. Source: GitLab Docs

We take advantage of this feature in our system by creating branches for different iterations of a text contribution, from which review apps are generated. Each review app is an HTML version of the post and has a unique URL.

Different versions of the post are mapped by GitLab in branches. With the appropriate configuration, these branches can be used to create Review Apps in the form of HTML pages that can be commented on and annotated by the public or community with Hypothesis. Source: Axel Dürkop
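How a branch name could turn into a unique review URL is sketched below. The slug function approximates the rules GitLab documents for its predefined variable `CI_COMMIT_REF_SLUG` (lowercase, at most 63 characters, everything except `a-z` and `0-9` replaced by `-`, no leading or trailing `-`). The base domain `review.example.org` and the branch name are hypothetical.

```python
import re


def ref_slug(branch: str) -> str:
    """Approximate GitLab's CI_COMMIT_REF_SLUG for an ASCII branch name."""
    slug = re.sub(r"[^a-z0-9]", "-", branch.lower())
    return slug[:63].strip("-")


def review_url(branch: str, base_domain: str = "review.example.org") -> str:
    """Derive one URL per branch, i.e. per iteration of the article."""
    return f"https://{ref_slug(branch)}.{base_domain}/"


if __name__ == "__main__":
    print(review_url("revision/Round-2_feedback"))
    # prints: https://revision-round-2-feedback.review.example.org/
```

Since each branch maps to its own address, an earlier version can stay online for commenting while a revised one is published next to it.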

Hypothesis

Hypothesis is an application that allows users to annotate internet sources. This functionality of Hypothesis behaves like an additional layer, based on open standards and placed over the actual content on the internet. Websites, documents, images, videos, data: In this “layer” all contents can be provided with individual thoughts, ideas and views without changing the original material.

Features of Hypothesis are:

  • Selected contents such as texts can be annotated without working directly in the original text. These annotations can also be tagged. They can then be made accessible to public or closed circles.
  • The annotation can be done alone or collaboratively in groups.
  • Other users can react to annotations via a “Reply” button. In addition, entire pages or individual annotations can be shared directly with other people via links.

Annotate any web content with Hypothesis. Source: Axel Dürkop

Annotating and commenting with Hypothesis

After the authors have submitted the text to the Pre-submission Facilitator and it has been prepared for initial feedback, it can be made available to selected groups of people in the form of a Review App (HTML document). These groups of people can then comment on and annotate the document. The necessary functions can be activated by entering the URL in “Paste a Link”. It is also possible to force a panel to appear on the right side of the website (cf. animated figure above).
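Both ways of activating Hypothesis can be illustrated with a short sketch. It assumes that the panel is forced by adding the Hypothesis embed script to the generated HTML, and that “Paste a Link” corresponds to prefixing the page address with the Hypothesis proxy `via.hypothes.is`. The proxy route presumably only works for pages it can reach, so it may not suit access-protected Review Apps. The sample page is made up.

```python
import re

EMBED_TAG = '<script src="https://hypothes.is/embed.js" async></script>'
BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def add_hypothesis(html: str) -> str:
    """Insert the Hypothesis embed script before the last </body>, at most once."""
    if "hypothes.is/embed.js" in html:
        return html
    matches = list(BODY_CLOSE.finditer(html))
    if not matches:
        # No closing body tag found: append the script at the end instead.
        return html + "\n" + EMBED_TAG + "\n"
    pos = matches[-1].start()
    return html[:pos] + EMBED_TAG + "\n" + html[pos:]


def via_link(page_url: str) -> str:
    """Address that opens a page through the Hypothesis proxy."""
    return "https://via.hypothes.is/" + page_url


if __name__ == "__main__":
    page = "<html><body><p>Draft</p></body></html>"
    print(add_hypothesis(page))
    print(via_link("https://example.org/article.html"))
```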

If a text is to be annotated, the corresponding section must be marked with the mouse. A click on the Annotate button shows the panel on the right where comments and tags can be stored. In the future, a filter system could be used to classify comments in aspects such as “source reference” or “spelling & grammar”.
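Such a filter could start as a simple grouping of annotations by tag. The sketch assumes that annotations are available as dictionaries with `text` and `tags` fields, loosely modelled on what Hypothesis returns for an annotation; the sample entries are invented for illustration.

```python
from collections import defaultdict


def group_by_tag(annotations: list[dict]) -> dict[str, list[str]]:
    """Collect annotation texts under each of their (lowercased) tags."""
    groups: dict[str, list[str]] = defaultdict(list)
    for note in annotations:
        tags = note.get("tags") or ["untagged"]
        for tag in tags:
            groups[tag.lower()].append(note.get("text", ""))
    return dict(groups)


# Invented sample data, not real feedback.
SAMPLE = [
    {"text": "Citation missing for this claim.", "tags": ["source reference"]},
    {"text": "Typo in the second sentence.", "tags": ["Spelling & Grammar"]},
    {"text": "Nice framing.", "tags": []},
]

if __name__ == "__main__":
    for tag, texts in sorted(group_by_tag(SAMPLE).items()):
        print(f"{tag}: {len(texts)} comment(s)")
```

Authors could then work through one aspect at a time, for example all source references first.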

It must also be determined whether only the writers, also the readers, or the general public can view the existing comments on the web. After the annotation has been formulated, it must be saved using the Post to… button, where closed groups can also be selected. Afterwards, comments can be edited by selecting the pen icon or deleted using the trash can icon. Using the Reply function, annotations can in turn be commented on, for example by the team of authors.

It makes sense to set deadlines for feedback phases. During and after this time, authors have the opportunity to incorporate the annotations in a new GitLab branch. At the same time, the community can comment on a previous version. Should the need arise, the revised version can be commented on again with Hypothesis by the same or another community.

Open Journal Systems (OJS)

We assume that the draft at the *Pre-submission Stage* will eventually reach a quality that will make it ready for the actual *Submission*. This time is determined by the authors.

By using pandoc-scholar different export formats of the paper can be generated, commented and submitted at any time. Source: Axel Dürkop

In our project we are working on the transfer of various export formats from GitLab to the OJS of the Hamburg State and University Library (SUB). We are evaluating both a technical approach that interacts with the still rather incomplete REST API of the software and scenarios that include manual upload of the export formats to OJS.

In OJS traditional methods of review with selected volunteers of the respective journal are used. To what extent review processes in Pre-submission Stage and Submission Stage can correspond with each other, we do not yet know. Our aim this year is to experiment with different test subjects from different disciplines.

Testing of Architectural Design and Workflow

Currently, the first test phases with the presented socio-technical system are taking place thanks to the support of scientific staff at the TUHH. Within the framework of these experiments, the comprehensibility and practicability of the system will be tested in the future. We are interested in communication in feedback workflows as well as the perception of the different technical subsystems and in particular the perception of Hypothesis as a central feedback system.

We will report on the results.

Axel Dürkop
Team leader, system architect and developer