Article

Automatic XML Extraction from Word and Formatting of E-Book Formats: Insight into the Open Source Academic Publishing Suite (OS-APS)

1
SciFlow GmbH, Altensteinstraße 40, 14195 Berlin, Germany
2
University and State Library Saxony-Anhalt, Martin-Luther University, Halle-Wittenberg, 06098 Halle, Germany
3
Friedrich-Alexander-Universität Erlangen-Nürnberg, Universitätsbibliothek Erlangen-Nürnberg, Universitätstr. 4, 91054 Erlangen, Germany
*
Author to whom correspondence should be addressed.
Publications 2023, 11(1), 1; https://doi.org/10.3390/publications11010001
Submission received: 14 September 2022 / Revised: 6 December 2022 / Accepted: 21 December 2022 / Published: 29 December 2022

Abstract

Due to resource constraints, most Diamond Open Access journals publish fewer than 25 articles per year, and 75% of journals are not able to provide their content in XML and HTML, primarily providing only PDFs. In order to keep up with larger commercial publishers, a high degree of automation and streamlining of processes is necessary. The Open Source Academic Publishing Suite (OS-APS) project, funded by the German Federal Ministry of Education and Research, aims to achieve this. OS-APS automatically extracts the underlying XML from Word manuscripts and offers optimization and export options in various formats (PDF, HTML, EPUB). The professional corporate design, e.g., of the PDFs, is applied automatically using ready-made templates or custom templates created with a Template Development Kit. OS-APS will also connect to scholarly-led and community-driven publishing platforms such as Open Journal Systems (OJS), Open Monograph Press (OMP), and DSpace. The software can thus be integrated into a wide range of publication processes, whether at small, low-resource commercial Open Access Publishers, or institutional and Diamond Open Access Publishers.

1. Introduction

The 2021 OA Diamond Journals Study [1] has compiled a representative overview of Diamond Open Access journal operators in its “Part 1: Findings”. For example, 53% of journals are operated by fewer than one full-time equivalent (FTE), and 60% of journals rely heavily on volunteers. Due to these resource constraints, most Diamond Open Access journals publish fewer than 25 articles per year, and 75% of journals are not able to provide their content in XML and HTML, primarily providing only PDFs.
In order to keep up with larger commercial publishers and their professionalized content offerings, a high degree of automation and streamlining of processes is necessary. The Open Source Academic Publishing Suite (OS-APS, https://os-aps.de/en/, accessed on 6 December 2022) project, funded by the German Federal Ministry of Education and Research, aims to achieve this. To this end, the project combines research (especially requirements analysis) with development to produce open source software with which
  • XML is automatically extracted from (e.g., *.docx Word) author manuscripts;
  • The XML can be processed and, e.g., supplemented with semantic information;
  • Various e-journal or e-book output formats (e.g., XML, HTML, EPUB, and PDFs) can be generated in the respective publisher’s preferred design by selecting a template (ready-made or customized) before exporting the file.
Machine learning (ML)/artificial intelligence (AI) is not used here, although it has been considered, especially for the second point (e.g., recognizing headings that are not tagged as headings but only set in bold). It could play a role in the future of the software once the basic development is completed.
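To illustrate what recognizing untagged headings can look like without ML, the following is a minimal rule-based sketch. It is not the OS-APS implementation: the `Run` and `Paragraph` classes, the length threshold, and the rules themselves are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """A stretch of text with uniform formatting (hypothetical model)."""
    text: str
    bold: bool = False


@dataclass
class Paragraph:
    """A paragraph as a list of runs plus a style name (hypothetical model)."""
    runs: List[Run]
    style: str = "Normal"

    @property
    def text(self) -> str:
        return "".join(run.text for run in self.runs)


# Assumed threshold: headings are rarely longer than this.
MAX_HEADING_CHARS = 120


def looks_like_heading(paragraph: Paragraph) -> bool:
    """Return True if the paragraph is tagged as a heading or merely looks like one."""
    text = paragraph.text.strip()
    if not text:
        return False
    # Properly tagged headings are accepted directly.
    if paragraph.style.lower().startswith("heading"):
        return True
    # Otherwise: every visible run is bold, the text is short,
    # and it does not end like a sentence.
    visible_runs = [run for run in paragraph.runs if run.text.strip()]
    all_bold = all(run.bold for run in visible_runs)
    return all_bold and len(text) <= MAX_HEADING_CHARS and not text.endswith(".")
```

A heuristic of this kind will misclassify some paragraphs (for example, a short bold warning), which is one reason why an editor interface for manual correction remains useful.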
In addition to the three aforementioned points, OS-APS will also connect to widely used, scholarly-led and community-driven publishing platforms such as Open Journal Systems (OJS), Open Monograph Press (OMP), and Open Access repositories (e.g., DSpace). The software can thus be integrated into a wide range of publication processes, whether at small, low-resource commercial Open Access Publishers, or institutional and Diamond Open Access Publishers.
To understand the requirements of these heterogeneous publishers, a practical advisory board and scientific advisory board with representatives from the different publication sectors accompany the OS-APS project. In addition, an extensive survey [2] was conducted across various publishing houses and demo days with corresponding feedback opportunities are held regularly (https://os-aps.de/demo/, accessed on 6 December 2022).
The project is also in line with the recommendations of the OA Diamond Study and its urgent call for cOAlition S Funders and Infrastructures: “Support the development of generic tools to generate structured content in XML and HTML” [3]. This will also be a prerequisite for creating new, dynamic and machine-processable media formats, for example in terms of accessibility and screen readers.
The Open Source software could thus be a significant improvement for smaller, independent Open Access Publishers. It offers the possibility to increase the effectiveness and efficiency of their processes to create, for example, new e-journal article or e-book formats such as HTML and EPUB. These developments contribute to greater bibliodiversity and may help independent OA publishers to become more viable and sustainable in the long term.

2. Materials and Methods

2.1. Materials

In terms of materials, the OS-APS project team has thus far produced insights into the project’s progress via presentations, posters, and articles [2,4,5,6,7,8], various software development sprints documented on GitLab [9], and a demo [10] to provide hands-on testing and feedback on the developments to date.

2.2. Methods

Methodologically, the project work is divided into four milestones. In the first, the requirements for the software were analyzed. In the second, all technology components, interfaces and intended workflows for connecting, e.g., OJS, OMP and DSpace were developed on this basis. In the third, existing journals and book series at the publishing services of the project partners will be iteratively converted to the Open Source Academic Publishing Suite production workflow for the purpose of practical testing and proof of implementation. In the fourth, a release of all open source software development results will take place; the OS-APS software can be downloaded free of charge and installed on the publishers’ own servers (all components are browser-based).
The following sections describe the methods used within the milestones in more detail.
The entire OS-APS project is accompanied by two advisory boards, which consist of publishers as well as institutions and libraries active in publishing. The Scientific Advisory Board is responsible for strategic and methodological advice and the User Advisory Board for discussions on practical procedures and requirements.

2.2.1. Software Requirements Analysis

Interviews were conducted with the publishers and publishing services of both the OS-APS project partners and advisory boards. The various publishing operations were mapped graphically via Miro (https://miro.com, accessed on 6 December 2022). Miro provides a range of different options and templates for software development purposes (https://miro.com/templates/developers/, accessed on 6 December 2022). Subsequently, the workflows were reviewed, statistically evaluated, and classified according to whether they are, e.g., Word-, InDesign- or LaTeX-based, and which export formats they generate.
Through these interviews, a structured overview of various publishing processes could be obtained. Subsequently, a broader online survey based on the interviews was designed in order to reach more experts from the publishing community. This online survey was built with Typeform (https://www.typeform.com, accessed on 6 December 2022), an online tool specialized in creating dynamic surveys with logic flows.
The survey was sent to the mailing lists of the German AG Universitätsverlage (https://ag-univerlage.de, accessed on 6 December 2022), The Association of European University Presses (https://www.aeup.eu, accessed on 6 December 2022), the Enable! community (https://enable-oa.org, accessed on 6 December 2022), Peergroup Produktion of IG Digital (https://www.boersenverein.de/interessengruppen/ig-digital/die-peergroups-der-ig/#accordion-23919, accessed on 6 December 2022), GeSIG (https://gesig.org, accessed on 6 December 2022), Library Publishing Coalition (https://librarypublishing.org, accessed on 6 December 2022), Association of University Presses (https://aupresses.org, accessed on 6 December 2022), Open Access Scholarly Publishing Association (https://oaspa.org, accessed on 6 December 2022), ACUP/APUC—Association of Canadian University Presses (http://acup-apuc.ca, accessed on 6 December 2022), The Association of Japanese University Presses (https://www.ajup-net.com, accessed on 6 December 2022) as well as to cooperation partners of OA-STRUKTKOMM (https://oa-struktkomm.htwk-leipzig.de, accessed on 6 December 2022), DEval Communication and Publications Office (https://www.deval.org/de/publikationen, accessed on 6 December 2022), Center for Digital Systems Berlin (https://www.cedis.fu-berlin.de, accessed on 6 December 2022) and in forums such as the Open Access Books Network (https://openaccessbooksnetwork.hcommons.org, accessed on 6 December 2022) and the German PKP Community Forum (https://forum.pkp.sfu.ca/c/regional-networks/german-topics/13, accessed on 6 December 2022).
The results were evaluated, processed in a structured manner [2] and had a significant impact on some project decisions. For example, it was decided to initially focus on Word manuscripts for XML extraction, since submissions in LaTeX or other manuscript formats were rather rare (Figure 1).

2.2.2. Open Source Development of the Technology Components

The development team aimed to build on already existing open source software wherever possible. In several cases it was also possible to build on existing code of the project partner SciFlow, which offers an online platform for collaborative scientific writing and automatic formatting according to the format specifications of renowned academic publishers (cf. https://www.sciflow.net/en/sciflow-free-researchers, accessed on 6 December 2022). SciFlow has extracted the relevant components from its platform and made them available as open source. Additional development work within the project ensured that
  • Word *.docx manuscripts can be fed into the OS-APS browser-based importer; it does not matter which version of MS Word the author uses as long as the file is in *.docx format (hence, all MS Word versions released after the year 2003 should be compatible);
  • The XML is automatically extracted from them and is displayed in an editable browser interface; for this, many components were reused from SciFlow’s collaborative writing and editing platform;
  • Options for optimizing and semantically enriching the XML can be provided;
  • Corporate design templates depending on the publisher or its content, for example, for different book or journal series, can be used;
  • The publishing user can control this corporate design “look” themselves using the planned Template Development Kit.
The Open Source software is currently based on Pandoc, Docker, paged.js, and components extracted by SciFlow from their own platform: https://gitlab.com/sciflow/development/-/milestones (accessed on 6 December 2022).
Pandoc (https://pandoc.org, accessed on 6 December 2022) is a free, GPL-licensed (https://www.gnu.org/licenses/gpl-3.0.html, accessed on 6 December 2022) converter and parser software. It is used to convert one document-based markup and file format to another.
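As a rough illustration of the kind of conversion Pandoc performs, the following sketch calls the Pandoc command-line tool from Python to turn a *.docx manuscript into JATS XML. It only shows plain Pandoc usage and is not the OS-APS importer. The function names and file names are hypothetical, and choosing JATS as the target format is an assumption; the article does not state which XML vocabulary OS-APS uses.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_pandoc_command(source: Path, target: Path, to_format: str = "jats") -> List[str]:
    """Assemble a Pandoc call that converts a .docx file into the target format."""
    return [
        "pandoc",
        str(source),
        "--from", "docx",        # input is a Word manuscript
        "--to", to_format,       # "jats" is one of Pandoc's XML writers
        "--standalone",          # emit a complete document, not a fragment
        "--output", str(target),
    ]


def convert(source: Path, target: Path, to_format: str = "jats") -> None:
    """Run the conversion; requires a local Pandoc installation."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on the PATH")
    subprocess.run(build_pandoc_command(source, target, to_format), check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown without running the conversion.
    print(" ".join(build_pandoc_command(Path("manuscript.docx"), Path("manuscript.xml"))))
```

Separating the command construction from its execution keeps the sketch testable on machines where Pandoc is not installed.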
Docker (https://www.docker.com, accessed on 6 December 2022) is an open platform for developing, shipping, and running applications in containers. In this project, it is used to streamline the development for our OJS, OMP and DSpace platforms and to ease the deployment of ready-to-go code from our test environments onto our production systems.
Paged.js (https://pagedjs.org, accessed on 6 December 2022) is an open source library for displaying paginated content in the browser and then creating PDFs and their designs using, e.g., HTML and CSS.

2.2.3. Proof of Implementation and Application of the Open Source Academic Publishing Suite

At the University and State Library in Sachsen-Anhalt (ULB-SA) a testbed was created for implementing a number of the software tools developed in the course of the project. The library team supports several publishers’ teams in their efforts to publish a wide range of journals spanning topics such as social geography, transnational economic law, ecology, and geosciences. Out of this selection of journals, monographs and series, it has been possible to choose specific examples that have allowed us to test not only specific modules of the OS-APS tools but also the connection and integration of our OJS and OMP publication systems with a DSpace-based publications repository.
In the first case, one journal, the “Hallesches Jahrbuch für Geowissenschaften” (the Yearbook of Geosciences in Halle), and the ULB-SA’s own series “Schriften zum Bibliotheks- und Büchereiwesen in Sachsen-Anhalt” (series on librarianship studies in Saxony-Anhalt) have been selected to be enhanced and given new layouts using OS-APS. In this particular case, Word *.docx templates are uploaded into the OS-APS environment and specific output formats can be generated for importing into the OJS of the ULB-SA. This process streamlines the template generation process of editorial teams, increases its level of automation, and is expected to contribute to higher visibility and citation rates. These actions are in line with specific Open Science principles which aim at improving the accessibility and reusability of research outputs in fields where these issues may still need attention, such as in some areas of the digital humanities. Scholars in these fields have recognized these endeavors as key components that can promote new research opportunities and can have great societal value [11].
Regarding our connection to our publication tools, a number of journals and series (see for example MLU Human Geography Working Paper Series and the Policy Papers on Transnational Economic Law) are now fully integrated into our OJS/OMP publication cluster and have been exported to our DSpace repository. In this process, all articles have been issued with persistent identifiers (DOIs) and have thus gained a higher visibility and findability given the high data discoverability advantages that the DSpace platform offers. An ongoing migration is taking place so that a total of 13 journals will be migrated in the scope of this project.
As for the technical connection, it has been implemented as modular scripts that are independently available to suit the different needs of our prospective end users. This means that the developed scripts can be deployed as a full set or individually, depending on the specifications of the environment where the tools are to be deployed. This modular approach also means that our developments do not alter the native code and functionality of the publication tools, so that future system upgrades or updates are not compromised.

2.2.4. Release of the Open Source Software Development Results

The open source software will be downloadable from https://os-aps.de (accessed on 6 December 2022) and a suitable repository, presumably GitLab, after the end of the project (31 December 2022; if the project is extended cost-neutrally, the release may follow later, in spring 2023). Additionally, SciFlow will offer an optional, commercial hosting and support service. Accompanying documentation of the software will also be provided. The OJS and OMP to DSpace connection scripts and a series of quality control and validation scripts, as well as documentation on how these publishing tools have been set up under Docker, will be fully available as open source code as part of the project’s integral code materials. As part of our project commitments towards open science, transparency and reproducibility, we have already published some of the scripts (in a non-finalized version that is openly available for scrutiny and feedback) in the GitHub repository of the University and State Library in Sachsen-Anhalt (explore, for instance, our OJS/OMP2DSpace connecting script and our scripts for the dockerisation of OJS and OMP).

3. Results

The workflow extracted from the perceptions and requirements of the surveyed publishing group is shown in Figure 2 (see also Section 2.2.2).

3.1. OS-APS Importer and Editor

Manuscripts can be imported into the programmed OS-APS editor (Figure 3). By extracting XML structures, elements such as column titles, page breaks, tables, etc. are recognized. In the editor, it is possible to change the text as well as the formatting, if elements were not recognized correctly. If necessary, more metadata (e.g., with regard to accessibility) and semantic references can be added.

3.2. Template Development Kit and Re-Usable Templates

During export, the corporate design of the respective publisher is mapped via templates. Various standard templates are provided and can be reused.
Further templates and exports can be developed using the Template Development Kit. This is particularly interesting for publishers who have very clear format specifications and do not want to deviate from them.
With the help of the Template Development Kit, individual parameters in ready-made templates can be easily changed. It is also possible to create completely new templates, although this requires prior technical knowledge (esp. web programming). New exports can also be programmed in this way. The Template Development Kit is based on the open source software Pandoc and on SciFlow’s own development.
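The idea of changing individual parameters in a ready-made template can be sketched as follows. This is a hypothetical illustration and not the Template Development Kit’s actual interface: the parameter names, default values, and the `customize` and `render_css` functions are assumptions. The generated CSS uses the standard `@page` rule of CSS paged media, which paginating tools such as paged.js interpret.

```python
from typing import Dict

# Hypothetical ready-made template with a handful of adjustable parameters.
DEFAULT_TEMPLATE: Dict[str, str] = {
    "page_size": "A4",
    "margin": "25mm",
    "font_family": "Georgia, serif",
    "font_size": "10pt",
}


def customize(base: Dict[str, str], **overrides: str) -> Dict[str, str]:
    """Return a copy of the template with selected parameters changed."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown template parameters: {sorted(unknown)}")
    return {**base, **overrides}


def render_css(params: Dict[str, str]) -> str:
    """Render the parameters as a small print stylesheet."""
    return (
        "@page {\n"
        f"  size: {params['page_size']};\n"
        f"  margin: {params['margin']};\n"
        "}\n"
        "body {\n"
        f"  font-family: {params['font_family']};\n"
        f"  font-size: {params['font_size']};\n"
        "}\n"
    )


if __name__ == "__main__":
    # A publisher keeps the standard template but narrows the margins.
    print(render_css(customize(DEFAULT_TEMPLATE, margin="20mm")))
```

Rejecting unknown parameter names reflects the distinction drawn above: adjusting existing parameters is simple, whereas introducing new ones amounts to developing a new template.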
Commercial, non-open-source tools such as Prince XML (https://www.princexml.com, accessed on 6 December 2022) can also be integrated during export for typesetting optimization.

3.3. Connection to OJS, OMP and Repositories Such as DSpace

The OJS and OMP applications are deployed via a Docker environment; the OJS and OMP systems are connected to a DSpace repository (specifically DSpace version 6.3, in the case of our project partners). As part of the intended workflow, OJS and OMP data will be exported to DSpace with subsequent return of DOI information. The corresponding publications are displayed in OJS and OMP as well as in DSpace. In general, OJS and OMP are intended as presentation platforms and DSpace for long-term archiving.
The connection scripts as well as documentation on how these publishing tools have been set up will be fully available as open source code as part of the project’s integral code (see Section 2.2.4).
Figure 4 gives an overview of the interfaces and data paths.
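The export to the repository with subsequent return of DOI information can be pictured with the following minimal sketch. It does not call any OJS, OMP or DSpace API and does not reproduce the project’s connection scripts: the `Publication` and `RepositoryStub` classes are hypothetical stand-ins, and the DOI prefix 10.5555 is used only as a placeholder.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Publication:
    """A record as it might exist on the presentation platform (hypothetical)."""
    local_id: str
    title: str
    doi: Optional[str] = None


class RepositoryStub:
    """In-memory stand-in for the archiving repository; no real API is called."""

    def __init__(self, doi_prefix: str) -> None:
        self.doi_prefix = doi_prefix
        self.items: Dict[str, Publication] = {}

    def deposit(self, publication: Publication) -> str:
        """Archive the record and return the identifier assigned to it."""
        doi = f"{self.doi_prefix}/{publication.local_id}"
        self.items[doi] = publication
        return doi


def export_with_doi_return(publications: List[Publication], repository: RepositoryStub) -> List[Publication]:
    """Deposit records that have no DOI yet and write the returned DOI back."""
    for publication in publications:
        if publication.doi is None:  # skip records that were already exported
            publication.doi = repository.deposit(publication)
    return publications


if __name__ == "__main__":
    repo = RepositoryStub(doi_prefix="10.5555")  # placeholder prefix
    records = [Publication("wp-001", "Example working paper")]
    export_with_doi_return(records, repo)
    print(records[0].doi)
```

Skipping records that already carry a DOI makes the export safe to repeat, which matters when journals are migrated in several batches.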

3.4. Test Possibility of the Current Results

Every first Wednesday of the month, a public “Demo Day” takes place in the form of a video conference (for more information see https://os-aps.de/demo/, accessed on 6 December 2022), where interested parties are invited to test the current state of the software and give feedback. The input from the “Demo Day” participants is taken into account in the development of OS-APS.
For the final release of the software and documentation, see the methodological announcements in Section 2.2.4.

4. Discussion

4.1. Possible Necessary Exceptions to the OS-APS Workflow

The OS-APS software development project is currently on schedule with its planned milestones. The basic objectives described in the introduction are being achieved. However, the tests conducted so far show that not all special cases that might occur in manuscripts can be implemented graphically in, e.g., PDFs in an ideal way.
This applies primarily but not exclusively to art volumes in which various figures must have exactly the same arrangement as in the original manuscript, grouped figures (e.g., as a block of four or six) with one caption, large, rotated tables, nested tables with multiple content types (e.g., images in different cells of the table), Word text fields or images originally drawn in Word itself with multiple image elements, and much more.
In addition, there may be quality requirements from both publishers and authors that necessitate very thoughtful, small-scale, manual typesetting in InDesign, for example. Examples could include art and exhibition volumes. Here, too, the fully automated approach may not meet these individual, high quality requirements.

4.2. Discussion about Embedding the New Output Formats

What publishers or platforms do with the new output formats remains deliberately open and up for discussion. Those who previously only distributed PDFs via OJS, OMP or repositories (e.g., the university repository, in the case of university presses), must think about how and where they integrate the HTML or EPUB files when using OS-APS, e.g., whether they provide viewers or corresponding plug-ins and whether they also archive them over the long term (or continue to only archive PDF/A). Making publications available as HTML on publisher’s or journal websites can be very useful: HTML is a mobile-compatible, easily accessible, indexable, and human- and machine-readable format. These features are also valuable for bibliometric analysis.
In addition, they have to think about URL, DOI and, for e-books, ISBN registration with regard to the new output formats. In the case of repositories that use a single landing page under which all formats are offered, a single DOI could still be used, for example. However, according to the German “Verzeichnis lieferbarer Bücher” (https://vlb.de, accessed on 6 December 2022) as ISBN agency, each different e-book format needs its own ISBN.

4.3. Possible OS-APS Platform Extensions in the Future

The OS-APS platform was developed as open source software. In addition, however, the project partner SciFlow will offer paid hosting for those who do not want to set up their own server to run the software or do not want to worry about support.
In the context of this support, further extensions are conceivable, for example with regard to special viewers, such as for EPUB or HTML, depending on the existing information infrastructure, or in terms of accessibility support. The project team is happy to enter into discussions.

5. Practical Application of OS-APS

Regarding the application of OS-APS, there is not yet any finished use case because the software is still being developed. The tests with manuscripts from the ULB-SA do not map the entire workflow, but refer to individual components as well as the connections with OJS, OMP and DSpace.
However, when comparing OS-APS at its current state with other XML conversion tools, a few major differences can already be detected. Firstly, OS-APS is supposed to be a lean tool for the browser, so that it can be used without large initial efforts. Secondly, it does not aim to establish new standards, but instead relies on open standards that are already established in publishing houses (e.g., connection to OJS and OMP as widely used open source tools). This should make it easier for publishers to switch to a workflow that includes the use of OS-APS.
Thirdly, a unique feature of the software is that it aims to require as little IT know-how as possible. This way, user groups with little or no knowledge of XML or programming (this includes, e.g., most authors and editors) should be able to work comprehensively with OS-APS. The XML stays in the background behind an easy-to-use interface. The only case in which IT knowledge, in particular web programming skills, would be needed is the programming of entirely new templates or export formats.
The combination of these three aspects distinguishes it from other tools, e.g., those that focus purely on individual aspects such as XML editing (e.g., XML Copy Editor, https://xml-copy-editor.sourceforge.io, accessed on 6 December 2022, or Oxygen XML Editor, https://www.oxygenxml.com/xml_editor.html, accessed on 6 December 2022 as a more powerful tool) or those that support more holistic media-neutral publication processes with an XML extraction, editing and typesetting system, but rely on strong embedding in local processes and require in-depth technical know-how; e.g., in Germany the Heidelberg Monograph Publishing Tool (https://github.com/withanage/heimpt, accessed on 6 December 2022) or the XML-first typesetting system of OA-STRUKTKOMM (https://oa-struktkomm.htwk-leipzig.de/forschungsprojekt/publikationsserver/, accessed on 6 December 2022). Internationally, Kotahi (https://kotahi.community, accessed on 6 December 2022) could be functionally comparable to some extent, but it is less streamlined and offers more functionalities, e.g., regarding peer review, while OS-APS can be used flexibly and as needed within or apart from existing publishing processes (e.g., use of OS-APS for normal, text-heavy manuscripts, while art volumes are set with InDesign in the classical way).

6. Conclusions

Preparing manuscripts for various formats such as HTML or EPUB can pose challenges for small- and medium-sized, as well as non-commercial (e.g., university) academic publishers. A high level of professionalism often requires extensive technical expertise as well as the use of cost-intensive XML content management systems.
The third-party funded project “Open Source Academic Publishing Suite (OS-APS)” provides relief in this area. It is intended to enable academic publishers to publish in a media-neutral way using XML-based workflows. The XML is automatically extracted from Word manuscripts and the corporate design of the exported PDFs can be controlled via templates. Institutions or publishers using OJS or OMP can also reuse the workflows and connections documented in the project. OS-APS is thus closely integrated into the open science landscape.

Author Contributions

Conceptualization, C.B., F.E., A.H. and M.P.; methodology, C.B., F.E., A.H. and M.P.; software, C.B. and F.E., developers at the ULB-SA; validation, C.B., F.E. and M.P.; formal analysis, C.B., F.E., A.H. and M.P.; investigation, C.B., F.E., A.H. and M.P.; resources, C.B., F.E. and M.P.; data curation, C.B., R.C. and F.E.; writing—original draft preparation, C.B., R.C., F.E., A.H. and M.P.; writing—review and editing, C.B., R.C., F.E., A.H. and M.P.; visualization, C.B., R.C., F.E., A.H. and M.P.; supervision, C.B., R.C., F.E., A.H. and M.P.; project administration, C.B., R.C., F.E., A.H. and M.P.; funding acquisition, C.B., F.E. and M.P. All authors have read and agreed to the published version of the manuscript.

Funding

The project on which this publication is based was funded by the German Federal Ministry of Education and Research (BMBF) under grant numbers 16TOA017A (SciFlow), 16TOA017B (FAU), and 16TOA017C (ULB Sachsen-Anhalt). The responsibility for the content of this publication lies with the authors.

Data Availability Statement

The data and software presented in this study are or will be made available on https://os-aps.de (accessed on 6 December 2022) within the specified project deliverable times.

Acknowledgments

The authors acknowledge the support provided by the Members of the Scientific Advisory Board and the Members of the User Advisory Board (https://os-aps.de/en/participate/, accessed on 6 December 2022) on the OS-APS project and software development.

Conflicts of Interest

The authors declare no conflict of interest. The funder had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Bosman, J.; Frantsvåg, J.E.; Kramer, B.; Langlais, P.-C.; Proudman, V. OA Diamond Journals Study. Part 1: Findings. Zenodo 2021.
  2. Borchert, C.; Eichler, F.; Söllner, K.; Putnings, M.; Hoffmann, A.; Berghaus-Sprengel, A.; Brenn, D. Open Source Academic Publishing Suite (OS-APS): OA-Publikationen medienneutral mit automatisiertem Corporate Design erstellen. BIB-OPUS. 2022. Available online: https://nbn-resolving.org/urn:nbn:de:0290-opus4-178739 (accessed on 6 December 2022).
  3. Becerril, A.; Bosman, J.; Bjørnshauge, L.; Frantsvåg, J.E.; Kramer, B.; Langlais, P.-C.; Mounier, P.; Proudman, V. OA Diamond Journals Study. Part 2: Recommendations. Zenodo 2021.
  4. Putnings, M.; Borchert, C.; Eichler, F. Project Open Source Academic Publishing Suite (OS-APS). 2021. Available online: https://librarypublishing.org/lpf21-lightning-talks/ (accessed on 6 December 2022).
  5. Söllner, K.; Putnings, M.; Hoffmann, A.; Berghaus-Sprengel, A.; Brenn, D.; Borchert, C.; Eichler, F. Open Source Academic Publishing Suite (OS-APS): Medienneutrales OA-Publizieren Im Eigenen Corporate Design. Zenodo 2021.
  6. Söllner, K.; Putnings, M.; Hoffmann, A.; Berghaus-Sprengel, A.; Cozatl, R.; Brenn, D.; Borchert, C.; Eichler, F. Publikationen medienneutral und automatisiert gemäß den eigenen Stilrichtlinien erstellen mit der Open-Source-Software “OS-APS”. 2021. Available online: https://dini.de/fileadmin/jahrestagungen/Jahrestagung_2019/DINI_2021_Poster_OS-APS.pdf (accessed on 6 December 2022).
  7. Söllner, K.; Putnings, M.; Hoffmann, A.; Berghaus-Sprengel, A.; Borchert, C.; Eichler, F. Open Source Academic Publishing Suite (OS-APS): Simple, Media-Neutral OA Publishing with Automatic Typesetting. SCS 2021, 4, 1–8.
  8. Putnings, M.; Borchert, C.; Cozatl, R. Ein Einblick in Das BMBF-Projekt Open Source Academic Publishing Suite (OS-APS). ABI Tech. 2022, 42, 166–173.
  9. SciFlow. Milestones. 2022. Available online: https://gitlab.com/sciflow/development/-/milestones (accessed on 6 December 2022).
  10. OS-APS Demo. Available online: https://os-aps.de/demo/ (accessed on 6 December 2022).
  11. Führ, F.; Bisset Álvarez, E. Digital Humanities and Open Science: Initial Aspects. In Proceedings of the International Conference on Data and Information in Online, Florianópolis, Brazil, 10–12 March 2021; Bisset Álvarez, E., Ed.; Springer International Publishing: Cham, Switzerland, 2021; pp. 154–173.
Figure 1. Manuscript acceptance preferences of the surveyed publishers (own representation, translated from [2]).
Figure 2. Overview of the Open Source Academic Publishing Suite functionalities.
Figure 3. View of the OS-APS editor.
Figure 4. Schematic representation of the connection of OJS, OMP and DSpace to OS-APS.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Borchert, C.; Cozatl, R.; Eichler, F.; Hoffmann, A.; Putnings, M. Automatic XML Extraction from Word and Formatting of E-Book Formats: Insight into the Open Source Academic Publishing Suite (OS-APS). Publications 2023, 11, 1. https://doi.org/10.3390/publications11010001