Chapter 1: Introduction

1. Introduction

The aim of this thesis is to draw attention to the inherent potential of the Internet as a research tool in the field of English literature, since a great deal of material is now close at hand. A click of the mouse is enough to obtain information from various kinds of resources, for example academic discussion groups, electronic journals, full-text editions of literary works, or research documents including secondary sources and criticism. This supply of material attracts the scholar's attention. The ease of publishing and accessing material on the Internet makes publications available to everybody. Literary scholars, however, are rather reluctant and sceptical with regard to the new medium.¹ Web pages change continuously, and information may be altered or may no longer be found at its former location. In this regard, websites are unreliable and do not represent a sufficient source of reference. Thomas Rommel addresses this problem when he states that "footnotes have to refer to stable textual entities to be useful".²
The thesis cannot offer a comprehensive account of the material available in the new medium. A selection was made in order to explore some of the most accurate sites and services, as well as some of the unsuitable ones.
Chapter 1 provides a technical introduction which defines terms, describes procedures of data transfer, and introduces the most common search engines, which help to locate resources. Chapter 2 deals with the history and theory of hypertext, which constitutes the basis of the World Wide Web, and compares these ideas with contemporary criticism. In chapter 3, metapages and text archives, including New English literatures, are investigated and evaluated. These sources provide suitable material for literary scholars. Chapter 4 draws attention to online discussion taking place in electronic magazines and mailing lists, which extend the exchange of ideas from traditional academic discourse in journals and at annual conferences to an ongoing, world-wide debate on the Internet.
Familiarity with the history and the principles of the new medium is necessary to understand the Internet's function as a research tool. In the 1960s, the Department of Defense in the USA considered plans for a computer system that would still work after a nuclear war. Even if single parts of the network were destroyed, the information would still arrive at the defined destination. Some North American universities, for example the Massachusetts Institute of Technology (MIT) and the University of California, Los Angeles (UCLA), were commissioned to develop such a system. The first test network, however, was installed in 1968 by the National Physical Laboratory in Great Britain. In 1969, the Pentagon's Advanced Research Projects Agency (ARPA) realised a sophisticated version consisting of four computers situated at American universities. This net was called ARPANET and was restricted to military usage, but scientists employed the network to communicate and to share knowledge with each other. The first mailing lists, which enabled people to send emails to several recipients at once, thus originated in 1973.
In the same year, several universities developed a different network, the Computer Science Network (CSNET), because the restrictions of ARPANET did not allow equal access for all academic institutions. During the 1970s and 1980s, the net became more and more attractive for the business world, educational institutions, and the general public, so that the MILNET (Military Network) was separated from the ARPANET. In 1986, the National Science Foundation Network (NSFNET) was initiated, and it replaced the ARPANET by the end of the decade.³ The final breakthrough came in 1991 when Tim Berners-Lee and his assistants at CERN (the European Centre for Nuclear Physics Research in Geneva, Switzerland) invented the World Wide Web (WWW),⁴ which was originally intended for CERN members who required a universal, standard system of communication at their institution. In 1992, however, WWW software was released on the Internet, and in 1993 the Web was presented to the general public. Mosaic became the most popular browser but was replaced by Netscape by the end of 1994.⁵ The Internet and the WWW have been growing ever since at a rate of about 10 percent a month.⁶
The expressions Internet and WWW are nowadays often used synonymously even though "the World Wide Web is not the Internet".⁷ December and Randall believe that the World Wide Web is an interface that "does not require the Internet (...) [because] a distributed information system based on the Web can be constructed on any local-area or wide-area network".⁸ The WWW is therefore regarded as a meta-application that makes all the data on the Internet available in a user-friendly and easily accessible form.⁹
The path of the information is regulated by two protocols. The Internet Protocol (IP) "works just like an envelope",¹⁰ i.e. the data are put in an electronic envelope which is provided with the address of the destination. The Transmission Control Protocol (TCP) is the electronic post office which organises the distribution of the envelopes by dividing the data into single pieces and putting them together again at the end of the journey. TCP and IP collaborate and represent one unit: TCP/IP. An IP address consists of four numbers separated by dots.¹¹ The University of Bayreuth, for example, has the following address: 132.180.8.29. Since these numbers are hard to remember and not very user-friendly, the Domain Name System (DNS) was introduced in 1984. The DNS maps numeric addresses to names which tell the user where a site is located.¹² These names contain sub-domains and main domains, for instance the (abbreviated) name of a university or a service, and the country or organisation where a site is situated. The USA has six main domains: '.com' represents commercial organisations, '.edu' symbolises educational institutions, '.gov' is used for the government, '.mil' suggests military origin, '.org' indicates other organisations, and '.net' stands for network sources.¹³ Voice of the Shuttle, a network for resources in the humanities, has for instance the following DNS entry: vos.ucsb.edu. The main domain '.edu' shows that this service is maintained by an educational institution, in this case the University of California, Santa Barbara, which appears in the sub-domain '.ucsb'. In Great Britain, addresses also contain a sub-domain and a main domain, e.g. 'ac.uk', which symbolises the United Kingdom ('.uk') and an academic institution ('.ac'), or 'co.uk', with '.co' representing a commercial provider. No such distinction is made in Germany, where the entries consist of one sub-domain plus the main domain, for example uni-bayreuth.de.
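The envelope and post office analogy, together with the addressing scheme, can be illustrated with a short Python sketch. It is only a simplified model and not an implementation of TCP/IP: the function names (split_into_segments, reassemble, domain_labels) and the segment size of eight bytes are assumptions made for this example, while the address and the domain name are the ones quoted above.

```python
import ipaddress
import random


def split_into_segments(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Cut the data into numbered pieces; the offset acts as a sequence number."""
    return [(offset, data[offset:offset + size])
            for offset in range(0, len(data), size)]


def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Put the pieces back together in offset order, whatever order they arrived in."""
    return b"".join(chunk for _, chunk in sorted(segments))


def domain_labels(name: str) -> list[str]:
    """Split a domain name into its parts, main domain first."""
    return list(reversed(name.lower().split(".")))


message = b"the falcon cannot hear the falconer"
segments = split_into_segments(message, 8)   # hypothetical segment size
random.shuffle(segments)                     # simulate out-of-order arrival
restored = reassemble(segments)

# The dotted notation is a readable form of a single 32-bit number.
address = ipaddress.IPv4Address("132.180.8.29")
as_integer = int(address)

# Reading a name from right to left goes from the main domain to the sub-domains.
labels = domain_labels("vos.ucsb.edu")
```

The sketch leaves out everything that makes real TCP reliable (acknowledgements, retransmission, checksums); it only shows why numbered pieces can be restored to the original message regardless of the order in which they arrive.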
The Uniform Resource Locator (URL) contains "the specific instructions for your Web browser to find and retrieve a file"¹⁴ and constitutes a complete Internet address. The path part of a URL is generally case-sensitive (host names are not), i.e. one has to pay attention to upper and lower case characters when typing an address. In practice, navigation on the Web is easier than this sounds, since it is accomplished by following links that include the destination's full name. The URL is similar to the shelfmark of a book in a library and may refer to FTP (ftp://...), gopher (gopher://...), HTTP (http://...), or telnet sites (telnet://...).¹⁵ The File Transfer Protocol (FTP) is both a protocol and a program: as a protocol, it ensures a common standard for the transmission of files between computers; as a program, it is responsible for the exchange of data.¹⁶ FTP is also commonly used to install a website on the Internet. Normally, FTP is not very user-friendly and requires some experience because one has to use commands for all transactions, for instance file retrieval ('get'), sending ('put'), or accessing the remote system ('open'), but there are applications (WWW, for example) which make the handling more convenient. The name gopher is a pun which derives from the following definitions:

1. Any of various short tailed, burrowing animals of the family Geomyidae, of North America. 2. (Amer.colloq.) Native or inhabitant of Minnesota: the Gopher State. 3. (Amer.colloq.) One who runs errands, does odd-jobs, fetches or delivers documents for office staff. 4. (computer tech.) Software following a simple protocol for tunneling through a TCP/IP internet.¹⁷

Gopher was developed in 1991 at the University of Minnesota and was intended as a kind of central interface that offered texts and preserved their original form¹⁸ without using hypertext. Gopher helps the user to dig tunnels through the Internet and becomes the office boy who has to "Go fer it!"¹⁹ in order to obtain information.
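The structure of the URLs mentioned above can be made visible with Python's standard urllib.parse module. The following lines are a minimal sketch: apart from the AltaVista address, the hosts under example.org are hypothetical and merely stand in for an FTP, a gopher, and a telnet site, and describe is a helper name chosen for this example.

```python
from urllib.parse import urlsplit

urls = [
    "http://www.altavista.com",
    "ftp://ftp.example.org/pub/texts/poem.txt",  # hypothetical FTP site
    "gopher://gopher.example.org/",              # hypothetical gopher site
    "telnet://opac.example.org",                 # hypothetical library catalogue
]


def describe(url: str) -> tuple[str, str, str]:
    """Return the access method (scheme), the host name, and the path of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


described = [describe(url) for url in urls]
for scheme, host, path in described:
    print(f"{scheme:7} {host:22} {path or '(no path)'}")
```

The part before '://' names the service used to reach the resource, the host name is the DNS entry of the computer, and the path identifies a file on that computer.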
HTTP sites are the most common web documents at the moment. The HyperText Transfer Protocol is the standard communication protocol which is responsible for the transmission of data between WWW client and server.²⁰ The transactions take place in four basic phases: connection to a server, request from the client to the server, response from the server to the client, and close, i.e. the connection is terminated once the transfer has been completed or interrupted.²¹ December and Randall claim that "HTTP performs the requests and retrieve functions necessary to display documents stored on remote computers."²² The standard language of HTTP documents is HTML (HyperText Markup Language), which derives from the Standard Generalized Markup Language (SGML), a text-based code system for making documents readable for computers. Software such as Netscape translates the code into specific formats to be displayed on the screen.²³ SGML became an international standard in 1986; the Text Encoding Initiative (TEI) provides an SGML tag set.²⁴ This set includes for instance tags to define abbreviations, additions, corrections, cross-references, deletions, expansions, font size and type, nonstandard characters, or substitutions, and was developed "for use with scholarly texts in the humanities".²⁵ Telnet is one of the first services that were installed on the Internet (1969) and is used to log into remote computers which offer public services such as archives, bulletin boards, databases, and electronic library catalogues (OPACs). Comparable to FTP, telnet is operated by commands but can only be used for accessing data without transferring them.²⁶
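The four phases of an HTTP transaction described above can be followed step by step in a small Python sketch. To remain self-contained, it starts a tiny web server on the local machine instead of contacting a real site; the class name HelloHandler, the loopback address, and the content of the page are assumptions made for this illustration. The request uses HTTP/1.0, in which the server closes the connection after each response.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET request with one small HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Port 0 lets the operating system pick any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Phase 1: connection to the server.
client = socket.create_connection(("127.0.0.1", port))

# Phase 2: request from the client to the server.
client.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")

# Phase 3: response from the server to the client (read until the server is done).
chunks = []
while True:
    data = client.recv(4096)
    if not data:
        break
    chunks.append(data)
response = b"".join(chunks)

# Phase 4: close.
client.close()
server.shutdown()
server.server_close()

status_line = response.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```

A browser performs the same steps for every document it retrieves and then interprets the HTML code contained in the response.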
Despite a myriad of possibilities for receiving data, the Internet does not replace books or libraries and should be regarded as an additional research tool. The new medium requires skills similar to those needed in a print environment: one has to know how to use the research facilities, needs to find the most comprehensive collections of material, and must be familiar with the names of scholars whose works and expertise serve as a resource for high-quality information. One way to obtain the URLs of websites dealing with English literature is to consult search engines such as AltaVista²⁷ or Yahoo!,²⁸ which allow the user to enter keywords (in this case 'English literature') and which supply a number of results. Both engines also offer the opportunity to follow categories. With regard to these categories, AltaVista is business-oriented whereas Yahoo! places more emphasis on educational content.
Following the categories is more convenient than browsing through all the results displayed after a keyword search. If the user is interested in a more specific area or topic, research will be more efficient when s/he enters a search term. This also works best if one has, for instance, a particular poem in mind and can only remember keywords or key phrases rather than the exact title. Entering "the falcon cannot hear the falconer", for example, yields matches leading to several full-text electronic versions of William Butler Yeats's The Second Coming. Owing to its rapid expansion, the net demands attentive and intelligent users who can choose from a far greater variety of knowledge than is available in a library.²⁹ Thus, research behaviour and experience change in a hypertext environment.
The new supply of material might lead to problems such as the inability to construct a meaningful context out of various pieces of information, disorientation in a system, or distraction from a particular topic. On the other hand, the user may find useful material for which s/he was not looking. This process is called serendipity.³⁰ These situations arise because hypertext can be assumed to resemble human memory in its associative structure. Kuhlen calls this phenomenon 'cognitive plausibility' and believes that knowledge is organised in a nonlinear way in the brain. As a consequence, the acquisition of knowledge is more efficient if information is presented in a nonlinear order, that is to say, in hypertext form.³¹

Notes

¹ George P. Landow, Hypertext. The Convergence of Contemporary Critical Theory and Technology (Baltimore, 1992), cf. pp. 164-168.
² Thomas Rommel, "Internet Survey for English Studies." In: Doris Feldmann, Anglistik im Internet (Heidelberg, 1997), p. 109.
³ Peter Klau, Das Internet. Der größte Informationshighway der Welt (Bonn, 1995), cf. pp. 31-35.
⁴ Jakob Nielsen, Multimedia and Hypertext. The Internet and Beyond (Boston, 1995), cf. pp. 34, 65.
⁵ Martin Scheller et al., Internet. Werkzeuge und Dienste (Berlin, 1994), cf. pp. 259-261.
⁶ Klau, cf. p. 35. See also John December and Neil Randall, The World Wide Web Unleashed (Indianapolis, 1995), pp. 1102-1109.
⁷ December and Randall, p. 7.
⁸ Ibid., p. 6.
⁹ Ibid., cf. p. 7.
¹⁰ Ed Krol, The Whole Internet. User's Guide & Catalog (Sebastopol CA, 1994), p. 25.
¹¹ Ibid., cf. pp. 27.
¹² Ibid., cf. pp. 30.
¹³ Ibid., cf. p. 32.
¹⁴ December and Randall, p. 17.
¹⁵ Scheller, cf. pp. 263-265.
¹⁶ December and Randall, cf. p. 19.
¹⁷ Klau, p. 221.
¹⁸ Scheller, cf. pp. 205.
¹⁹ Ibid., p. 205.
²⁰ December and Randall, cf. pp. 50.
²¹ Scheller, cf. pp. 295.
²² December and Randall, p. 1299.
²³ Ibid., cf. p. 51.
²⁴ For the guidelines of the Text Encoding Initiative, see C.M. Sperberg-McQueen and Lou Burnard (eds.), Guidelines for the Encoding and Interchange of Electronic Texts (Chicago, 1994).
²⁵ Susan Hockey, "Creating and Using Electronic Editions." In: Richard J. Finneran (ed.), The Literary Text in the Digital Age (Ann Arbor, 1996), p. 7.
²⁶ Klau, cf. pp. 172.
²⁷ AltaVista http://www.altavista.com.
²⁸ Yahoo! http://www.yahoo.com.
²⁹ Reinhard Kaiser, Literarische Spaziergänge im Internet. Bücher und Bibliotheken online (Frankfurt/M., 1996), cf. p. 15.
³⁰ Detlev Leutner, "Psychological Aspects of Information Retrieval and Learning in Hypermedia Environments." In: Feldmann, cf. p. 31 f.
³¹ Rainer Kuhlen, Hypertext. Ein nicht-lineares Medium zwischen Buch und Wissensbank (Berlin, 1991), cf. pp. 55 ff.