Author-Friendly Electronic Submission to SGML-based Academic Journal

Hidehiro Ishizuka
Univ. Library & Information Science
Kasuga, Tsukuba, Ibaraki 305, Japan
E-mail: ishizuka@ulis.ac.jp

Abstract

My co-workers and I developed an author-friendly method for electronic submission to an academic journal that is published with an SGML (Standard Generalized Markup Language)-based system. The method combines the style function of a word processor with RTF (Rich Text Format), and it can be used with popular word processing software such as Microsoft Word, WordPerfect and PageMaker. Since April 1994 the method has been used by the Bulletin of the Chemical Society of Japan (BCSJ), the monthly English-language journal of the Chemical Society of Japan (CSJ), which has been published since 1937 and has been produced with an SGML-based system since January 1993. Our electronic submission method will be included in SIST (Standards for Information of Science and Technology) No. 14 (draft), "Guideline for electronic submission", which is under consideration by the SIST Committee in Japan and will be published in the near future.

Keywords:

full-text database, electronic manuscript, electronic submission, SGML

1. Introduction

A structured document based on SGML (Standard Generalized Markup Language) [1] is a useful resource for digital libraries and electronic publishing. Several learned societies and publishers have adopted SGML-based publishing (SGML-EP) for their academic journals. Elsevier announced a plan to extend SGML-EP to all of its academic journals, and started CAPCAS (Computer Aided Production, Current Awareness Service) and TULIP (The University Licensing Programme) in 1993. My collaborators and I [2] have also studied and developed SGML-EP systems for several academic journals in Japan: the Journal of Japan Society of Information and Knowledge (JJSIK) in 1990, the SGML Experimental Journal by NACSIS (National Center for Science Information Systems) in 1991, and the monthly English-language journal of the Chemical Society of Japan (Bulletin of the Chemical Society of Japan: BCSJ) since January 1993. A WWW version of BCSJ [3] was offered experimentally from November 1994 to January 1995. The IEEE (Institute of Electrical and Electronics Engineers) Computer Society [4] has been producing electronic versions, i.e., SGML-based CD-ROMs, of its magazines and transactions since January 1995. The ACM (Association for Computing Machinery) announced its electronic publishing plan [5] in 1995.

SGML-based publishing, and especially the tagging of data elements, is time-consuming work. Elsevier, which publishes more than 1000 journals, is extending the tagging of title-page data to all of its journals, but it is strictly selective about which journals receive full-text tagging. An electronic submission method for academic articles that is both SGML-oriented and author-friendly would therefore make the publication of an academic journal more efficient. The Massachusetts Medical Society [6] adopted electronic submission with WordPerfect for publishing an SGML-based textbook, "The AIDS Knowledge Base". Its electronic manuscripts use a tagged format in which, for example, a title is marked with "@T=". Y. Tanaka and I adopted a similar tagged format for JJSIK in 1990. However, such a format is not easy for an author to use.

One solution to this problem is RTF-to-SGML conversion. RTF is an exchange format for word processing text data and is supported by popular word processing software such as Microsoft Word, WordPerfect and PageMaker. Although several RTF-to-SGML converter tools [7] exist, Y. Tanaka, T. Ito and I [8] developed our own RTF-to-SGML converter program in AWK, because a self-developed program is more flexible than a ready-made converter tool and that flexibility suits the varied writing styles of academic articles. We also adopted the style function available in Microsoft Word, WordPerfect and PageMaker, and successfully applied the combination to electronic submission to BCSJ in 1993. This paper reports the electronic submission method in detail.

I also report that this method will be included in a Japanese standard, SIST (Standards for Information of Science and Technology) [9] No. 14 (draft), "Guideline for electronic submission", which is now under consideration by the SIST Committee and will be published in the near future.

2. Author-friendly and SGML-oriented electronic submission

If a contributor's electronic manuscript can be converted automatically to an SGML-based format, SGML-EP can be carried out efficiently. The requirements to be considered are as follows: (1) a contributor can write the manuscript without any knowledge of SGML; (2) the contributor can use his/her familiar word processing software; (3) the contributor can display and print the manuscript in (almost) the same layout as the published articles; (4) the manuscript data can be converted automatically to an SGML-based format; and (5) the constraints on the writing style of an academic article are looser than those on a manual.

T. Ito, Y. Tanaka and I adopted a combination of the style function (STYLE) and RTF on Windows PCs and Macintosh for BCSJ. STYLE assigns a style name to the corresponding data element, such as a section title, together with its layout parameters, such as centering, font and font size. RTF can carry the STYLE information. RTF uses only ASCII characters and character strings to express every control command, such as font and font size, as well as the style name and the start and end of each styled text.
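As an illustration of how RTF carries the STYLE information, the following minimal Python sketch reads style numbers and style names from an RTF style sheet group. It is not the AWK converter described in this paper. It assumes a simplified style sheet in which each entry has the form "{\sN<control words> name;}"; the function name parse_stylesheet and the style number 2 in the example are hypothetical, and real style sheets contain further control words that this pattern does not handle.

```python
import re

# One style sheet entry, simplified: {\s<number><control words> <style name>;}
STYLE_ENTRY = re.compile(r"\{\\s(\d+)(?:\\[a-z]+-?\d*)* ([^;{}]+);\}")


def parse_stylesheet(rtf):
    """Return a dict that maps each RTF style number to its style name."""
    return {int(number): name for number, name in STYLE_ENTRY.findall(rtf)}


# Hypothetical style sheet fragment; only the use of \s3 for the
# subtitle style is taken from the example in this paper.
sample = r"{\stylesheet{\s2\qc\b\fs28 Title;}{\s3\qc\b\fs24 Sub title;}}"
styles = parse_stylesheet(sample)
```

A converter could use such a table to look up the data element for each styled paragraph by style name rather than by a fixed style number.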

The reasons for adopting STYLE and RTF are as follows: (1) STYLE and RTF are available in popular word processing software, such as Microsoft Word on Windows/Macintosh, WordPerfect and PageMaker; (2) STYLE lets a contributor assign data elements easily and also supports layout functions such as centering and character size; (3) while STYLE does not show any control characters on the display or the printed page, RTF that includes STYLE carries data corresponding to the SGML data element name, start-tag and end-tag; and (4) RTF text data that includes STYLE can be converted automatically to SGML text data.

Table 1 lists the styles and the display format of an electronic manuscript for submission to BCSJ. Figure 1 shows a sample of an electronic manuscript written with this method; in Fig. 1 the style 'Synopsis' is selected on the pull-down style menu. A contributor may use bold, italic, superscript and subscript. Greek letters are typed with the 'symbol font'. A special character is typed using the corresponding entity name defined in the DTD (Document Type Definition) for BCSJ; for example, 'right double arrow' is typed as '&rArr;'. The DTD includes the entity name sets for special characters defined in the ISO 8879 Annex and ISO/IEC TR 9573-13. A citation of a bibliographic reference is indicated by a superscript number with a parenthesis, such as 1). A reference to a table or a figure is indicated by an underlined character string, as in "... Table 1 shows ...". The bodies of figures, tables and complicated equations are not yet included in this electronic submission because of incompatibilities among word processing software.

style name        data element                                display format
Category          category of article                         12 p (point)
Running title     running title                               14 p, centering
Title             article title                               14 p, centering, bold
Sub title         sub title                                   12 p, centering, bold
Author            author's name                               12 p
Address           affiliation and address                     12 p, centering
Received          received date                               12 p, centering
Synopsis          synopsis                                    12 p, decreased line width
Normal            paragraph                                   12 p, indent of 1st line
Section title     section title                               12 p, centering, bold
Subsection title  subsection title                            12 p, bold
References        references                                  12 p
Table             table captions                              12 p
Scheme            scheme captions                             12 p
Figure            figure captions                             12 p
Chart             chart captions                              12 p
CI-Title          title for CI (contents with illustration)   12 p
CI-Author         author's name for CI                        12 p
CI-Summary        summary for CI                              12 p
Profile           author's profile (only for accounts)        12 p

Table 1 Styles used in electronic submission to Bulletin of the Chemical Society of Japan

Figure 1 Writing an electronic manuscript with style file on MS Word

CSJ [10] recruited monitors to test this method in March 1994. The style file and the manual for electronic submission to BCSJ have been available free of charge from CSJ's WWW site since March 1996 [11].

The conversion program from RTF to SGML text is written in AWK and works as follows.

(1) Data element
In RTF, a style and its data are described as follows: "...\s3...Synthesis and Reactions .... Acids\par".
Here, '\s3' and '\par' mark the start and the end of the text in the 'subtitle' style, respectively.
The corresponding SGML text is as follows: "<SBT>Synthesis and Reactions .... Acids</SBT>".
Here, '<SBT>' and '</SBT>' are the start-tag and end-tag of the data element 'subtitle', respectively.
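The mapping in (1) can be illustrated with a minimal Python sketch. It is not the AWK program itself: it assumes that each styled paragraph runs from "\sN" to the next "\par", that any other control words inside the paragraph can simply be dropped, and that a table such as STYLE_TO_ELEMENT (a hypothetical name) relates style numbers to element names. Only the pair '\s3' and 'SBT' comes from the example above.

```python
import re

# Style number -> SGML element name. Only \s3 -> SBT is taken from the text;
# a full table would hold one entry per style of the journal.
STYLE_TO_ELEMENT = {3: "SBT"}

# A styled paragraph: \s<number> ... \par (but not \pard).
STYLED_PARAGRAPH = re.compile(r"\\s(\d+)(.*?)\\par(?![a-z])", re.DOTALL)
# Any other RTF control word, with its optional numeric parameter and space.
CONTROL_WORD = re.compile(r"\\[a-z]+-?\d* ?")


def convert_paragraphs(rtf):
    """Return the SGML text for every styled paragraph found in rtf."""
    converted = []
    for match in STYLED_PARAGRAPH.finditer(rtf):
        element = STYLE_TO_ELEMENT.get(int(match.group(1)))
        text = CONTROL_WORD.sub("", match.group(2)).strip()
        if element is None:
            converted.append(text)  # unknown style: keep the bare text
        else:
            converted.append(f"<{element}>{text}</{element}>")
    return converted


sample = r"\pard\s3\qc\b Synthesis and Reactions of Acids\par"
result = convert_paragraphs(sample)
```

For the sample line, result holds the single string "<SBT>Synthesis and Reactions of Acids</SBT>".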

(2) Greek alphabet
An italic Greek letter such as 'η' is described as "{\i\f3\fs22 h}" in RTF and as "<it>&eta;</it>" in SGML text. Here, '\i', '\f3', '\fs22' and 'h' indicate 'italic', 'symbol font', 'font size 11 point' and the letter 'η', respectively.
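A minimal Python sketch of this step is given below. It assumes that font number 3 is the symbol font, as in the example (in a real RTF file the font table decides this), that Greek letters are written out as ISO entity names such as '&eta;', and that only a few letters need mapping; the names SYMBOL_TO_ENTITY and convert_greek are hypothetical.

```python
import re

# Symbol-font keystroke -> ISO entity name (partial table for illustration).
SYMBOL_TO_ENTITY = {
    "a": "alpha", "b": "beta", "g": "gamma", "d": "delta",
    "h": "eta", "l": "lambda", "m": "mu", "p": "pi",
}

# A simple RTF group: {<control words> <text>} without nested groups.
GROUP = re.compile(r"\{((?:\\[a-z]+\d*)+) ([^{}]*)\}")


def convert_group(match, symbol_font=3):
    """Convert one RTF group to SGML text."""
    controls = re.findall(r"\\([a-z]+)(\d*)", match.group(1))
    text = match.group(2)
    if ("f", str(symbol_font)) in controls:
        # Replace each symbol-font character by its entity reference.
        text = "".join(
            f"&{SYMBOL_TO_ENTITY[c]};" if c in SYMBOL_TO_ENTITY else c
            for c in text
        )
    if ("i", "") in controls:
        text = f"<it>{text}</it>"
    return text


def convert_greek(rtf):
    """Convert every simple RTF group in rtf."""
    return GROUP.sub(convert_group, rtf)


result = convert_greek(r"The {\i\f3\fs22 h} value")
```

Here result is "The <it>&eta;</it> value"; the font size '\fs22' is read but not used, since it carries no structural information.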

(3) Referring to a bibliographic reference, to a table or to a figure
The conversion program looks for a superscript number followed by a parenthesis and converts it to a bibliographic reference; a superscript number without a parenthesis is not a bibliographic reference but an ordinary superscript. An underlined character string that includes "Table" or "Fig." indicates a reference to a table or a figure, respectively, and the program converts it to the corresponding SGML expression.
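The rules in (3) might be sketched in Python as follows. The sketch assumes that superscript and underlined text appear as the simple RTF groups "{\super ...}" and "{\ul ...}", and the element names 'bibref', 'tblref', 'figref' and 'sup' are hypothetical placeholders, since the element names of the BCSJ DTD are not given here.

```python
import re

SUPERSCRIPT = re.compile(r"\{\\super ([^{}]*)\}")
UNDERLINE = re.compile(r"\{\\ul ([^{}]*)\}")


def _superscript(match):
    text = match.group(1)
    # A number with a closing parenthesis, such as "1)", is a citation.
    if re.fullmatch(r"\d+\)", text):
        return f"<bibref>{text[:-1]}</bibref>"
    # Anything else is an ordinary superscript.
    return f"<sup>{text}</sup>"


def _underline(match):
    text = match.group(1)
    if "Table" in text:
        return f"<tblref>{text}</tblref>"
    if "Fig." in text:
        return f"<figref>{text}</figref>"
    return text


def convert_references(rtf):
    """Convert superscript and underlined groups to reference elements."""
    return UNDERLINE.sub(_underline, SUPERSCRIPT.sub(_superscript, rtf))


result = convert_references(r"as reported{\super 1)} in {\ul Table 1}")
```

For the sample string, result is "as reported<bibref>1</bibref> in <tblref>Table 1</tblref>".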

(4) Others
'Italic' or 'bold' is either converted to the corresponding data element or ignored, depending on the context. An italic character string in the bibliographic references is converted to the data element 'journal title', whereas one in a sentence is converted to "<it> ... </it>". Bold text in a paragraph is converted to the SGML expression "<bo> ... </bo>", whereas bold in a section title is ignored. 'Centering' is ignored, since the style (title, author or section title) already identifies the data element and the centering carries no further information.
Author names are written as follows: 'first name' 'blank' ['middle name'] 'blank' 'last name' [comma 'first name' ....]. The conversion program therefore separates the authors at each comma and picks out the first, middle and last names of each author at the blanks.
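The author-name rule can be illustrated with a short Python sketch. It assumes the format given above (authors separated by commas, name parts separated by blanks, the last part being the last name); the function name parse_authors and the second author in the example are hypothetical.

```python
def parse_authors(line):
    """Split an author line into first, middle and last names."""
    authors = []
    for chunk in line.split(","):
        parts = chunk.split()
        if not parts:
            continue  # skip empty pieces such as a trailing comma
        authors.append({
            "first": parts[0] if len(parts) > 1 else "",
            "middle": " ".join(parts[1:-1]),
            "last": parts[-1],
        })
    return authors


authors = parse_authors("Hidehiro Ishizuka, John A. Smith")
```

The sample line yields two authors; the first has an empty middle name, and the second has the middle name "A.".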

3. Discussion

A converter program that I develop myself is easier to tune than a ready-made converter, especially for handling references to tables, figures and bibliographic references. For example, "See Table 1, 2" is interpreted as "Table 1 and Table 2", and "Smith1-3)" as "references 1, 2 and 3".
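The kind of tuning mentioned here can be sketched in Python as follows. The sketch assumes that lists use commas and ranges use a hyphen, as in the two examples; the function names are hypothetical.

```python
def expand_numbers(spec):
    """Expand a list such as "1, 2" or a range such as "1-3" into numbers."""
    numbers = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            numbers.extend(range(first, last + 1))
        elif part:
            numbers.append(int(part))
    return numbers


def expand_table_reference(text):
    """Turn "Table 1, 2" into one reference per table."""
    label, _, spec = text.partition(" ")
    return [f"{label} {n}" for n in expand_numbers(spec)]


tables = expand_table_reference("Table 1, 2")
citations = expand_numbers("1-3")
```

Here tables is ["Table 1", "Table 2"] and citations is [1, 2, 3], so each item can be written out as a separate reference in the SGML text.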

Four methods can be used to write an electronic manuscript: (A) the STYLE and RTF combination, (B) the macro function of word processing software, (C) special character sequences, as in "The AIDS Knowledge Base" published by the Massachusetts Medical Society, and (D) LaTeX. Methods A and B are easier for a contributor than methods C and D. A style is easier to maintain with method A than with method B, because method A uses a GUI whereas method B requires programming command macros for the word processing software. However, method B can be tuned more closely to each type of electronic manuscript than method A. Method D is suitable for describing complicated mathematical equations.

Method A is the most suitable for the members of CSJ, since most of them use Microsoft Word. CSJ also plans to support method D, because some members, who study physical chemistry, prefer LaTeX.

4. SIST No. 14 (draft): Guideline for electronic submission

I am the chairperson of the drafting committee for SIST No. 14, "Guideline for electronic submission". SIST 14 covers articles in academic journals. The aims of SIST 14 are as follows: (1) to be oriented toward SGML-based systems; and (2) to be useful for a learned society or a publisher in drafting its own rules for electronic submission. SIST 14 will include methods A, B, C and D described above. It will be drafted through discussions in the SIST Committee until the end of 1997.

5. How to spread SGML-oriented electronic submission

Announcing a discount on the submission charge or faster publication may help to spread SGML-oriented electronic submission. The Institute of Electronics, Information and Communication Engineers in Japan (IEICEJ) announced that the submission charge is discounted for an electronic submission, including table data, prepared with LaTeX. However, IEICEJ has not yet adopted SGML-EP. The American Chemical Society [12] announced that submissions not in electronic form may face a delay in publication; however, the accepted form is not SGML-based but Microsoft Word, WordPerfect, etc.

Acknowledgment

I am greatly indebted to Prof. T. Ito of Yokohama National University and Y. Tanaka, Director of Toppan Printing Co., Ltd., for their valuable suggestions and encouragement. I would also like to express my best wishes to the members of the SIST Committee.

References

[1] ISO 8879-1986, Information processing - Text and office systems - Standard Generalized Markup Language (SGML), Oct. 15, 1986.

[2] H.Ishizuka, "The reception of SGML based electronic publishing by Japanese scientific community," Proc. of 47th FID (International Federation for Information and Documentation) Conference and Congress, pp.505-508 (in Omiya Japan, October 1994).

[3] H. Ishizuka, "Multimedia and publishing", Journal of Information Processing and Management (Joho Kanri), Vol.38, No.4, pp.353-368 (1995) (in Japanese); H. Ishizuka, "Electronic publishing: its concept and technology", Journal of The Institute of Electronics, Information and Communication Engineers (DenshiJohoTsushinGakkaiShi), Vol.78, No.9, pp.891-898 (1995) (in Japanese).

[4] <URL:http://www.computer.org/epub/>

[5] Peter J. Denning; Bernard Rous, "The ACM Electronic Publishing Plan", Communications of the ACM, Vol.38, No.4, pp.97-103 (1995).

[6] EPSIG news, Vol.1, No.1, (Sept. 1987).

[7] <URL:http://www.nicetech.com/venprod.htm>

[8] H.Ishizuka, T.Ito, et al., "Generation of full-text database based on SGML through an electronic contribution by an author -- An experiment in Chemical Society of Japan," IPSJ (Information Processing Society of Japan) SIG-FI Note, No.35, pp.1-8 (1994) (in Japanese).

[9] Science and Technology Agency, "SIST hand-book," 3rd ed., Japan Information Center for Science and Technology, 1992, 441p. (in Japanese)

[10] T.Shida, T.Ito, "Recruit monitors to examine a style file to submit a paper for Bull. Chem. Soc. Jpn.", Chemistry & Chemical Industry (Kagaku to Kougyou), Vol.47, No.3, p.270 (1994) (in Japanese).

[11] <URL:http://wwwsoc.nacsis.ac.jp/csj/journals/BCSJ-style_file/manual.html>

[12] <URL:http://pubs.acs.org/instruct/jacsat.html>