Showing papers by "Shigeo Sugimoto" published in 1998


Journal Article
TL;DR: The authors developed a technology called MHTML to browse multilingual documents on an off-the-shelf Web browser, and applied it to a multilingual gateway service for browsing foreign documents and to a multilingual electronic text collection of Japanese folk tales.
Abstract: … national network, and a multilingual browser is an essential tool for international access to and sharing of global information. The Web has expanded very rapidly worldwide, and a document from a foreign site is easy to access using an off-the-shelf browser. However, such a browser can usually display only documents written in English and a local language, not those written in other languages. The principal problem preventing users from browsing documents written in a foreign language is the lack of both a font for the language and a display function to process multiple character codes. We developed a technology called MHTML to browse multilingual documents on an off-the-shelf Web browser, and applied it to a multilingual gateway service for browsing foreign documents and to a multilingual electronic text collection of Japanese folk tales [2, 3, 4]. MHTML technology is composed of three elements: the MHTML document object, the MHTML viewer, and the MHTML server. The MHTML document object is a package containing a source text string and the minimum set of glyphs required to display the text. (A glyph of a character is a graphical entity used to display or print the character. The minimum set of glyphs for a text is the set of glyphs for all of the distinct characters that appear in the text.) An HTML text specified by a URL in the applet tag that invokes the MHTML viewer is converted into an MHTML document object on the fly by an MHTML server and sent to the viewer. The MHTML viewer, implemented as a Java applet, displays an MHTML document object on a Web browser using only the glyphs enclosed in the object. The MHTML server, with the viewer applet and a font bank, converts an HTML document into MHTML using the glyphs defined in the font bank. The first of the four major advantages of MHTML technology is its simple user environment: the user installs only a Java-enabled browser.
Secondly, the number of distinct characters used in a source text is much smaller than the total number of characters in the text. We found that the ratio between the length of an MHTML document object and that of its source text was approximately 2:1 to 3:1 for scientific articles written in Japanese. (Each Japanese character in the source is encoded in two bytes. The font glyph is the bitmap data of a character, and its size …
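
The packaging idea described in the abstract (a source text string bundled with one glyph per distinct character) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `MhtmlObject`, `build_object`, `dummy_font_bank`, and `size_ratio` are hypothetical, the font bank is filled with placeholder bitmaps, and the 32-byte glyph size is an assumed value chosen only for illustration, since the abstract is truncated before it states the actual size.

```python
from dataclasses import dataclass


@dataclass
class MhtmlObject:
    """Hypothetical stand-in for an MHTML document object."""
    text: str                  # source text string
    glyphs: dict[str, bytes]   # one bitmap per distinct character


def dummy_font_bank(chars: str, glyph_bytes: int = 32) -> dict[str, bytes]:
    # Placeholder font bank: every glyph is an all-zero bitmap.
    # glyph_bytes is an assumed size, not a figure from the paper.
    return {ch: bytes(glyph_bytes) for ch in set(chars)}


def build_object(text: str, font_bank: dict[str, bytes]) -> MhtmlObject:
    # Minimum glyph set: one glyph for each distinct character in the text.
    distinct = sorted(set(text))
    return MhtmlObject(text=text, glyphs={ch: font_bank[ch] for ch in distinct})


def size_ratio(obj: MhtmlObject, bytes_per_char: int = 2) -> float:
    # Size of the object (text plus glyph bitmaps) relative to the text alone,
    # with each source character counted as bytes_per_char bytes.
    source_size = len(obj.text) * bytes_per_char
    glyph_size = sum(len(bitmap) for bitmap in obj.glyphs.values())
    return (source_size + glyph_size) / source_size
```

Because glyphs are stored once per distinct character, the glyph overhead in this sketch grows with the size of the character repertoire rather than with the length of the text, which is the property the second advantage relies on.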

27 citations


Book Chapter
21 Sep 1998
TL;DR: The authors developed a display function for multilingual documents based on MHTML technology and extended it to text inputs in multiple languages for off-the-shelf browsers and sample applications to create an environment for digital library end-users.
Abstract: The World Wide Web (WWW) covers the globe. However, the browsing functions for documents in multiple languages are not easily accessed by occasional users. Functions to display and input multilingual texts in digital libraries are clearly crucial. Multilingual HTML (MHTML) is a document browser technology for multilingual documents on the WWW. The authors developed a display function for multilingual documents based on MHTML technology and extended it to text input in multiple languages for off-the-shelf browsers and sample applications. This extension creates an environment for digital library end-users in which they can view and search multilingual documents using any off-the-shelf browser. This paper also discusses the lessons learned from the MHTML project.

3 citations