[Home] [Teaching] [Projects] [Research] [Publications] [Curriculum Vitae]

HUMANITIES COMPUTING: Electronic Text - 2001-2002
Last Revision: 06/08/2002
some links have been disabled

B33080 - 30 contact hours - 4 credits



[Week 1] [Week 2] [Week 3] [Week 4] [Week 5] [Week 6] [Week 7] [Week 8] [Week 9] [Week 10]

Important Note

The group assignment and the documentation of the markup guidelines should be handed in no later than 31 May 2002, both on CD-ROM and by email <evanhoutte@kantl.be>. The CD-ROM should be sent to:

Edward Vanhoutte
CTB - Centrum voor Teksteditie en Bronnenstudie
Koningstraat 18
b-9000 Gent
Lecturer: Edward Vanhoutte
CTB - Centrum voor Teksteditie en Bronnenstudie
Koningstraat 18 / b-9000 Gent
tel: +32 (0)9 265.93.51 / fax: +32 (0)9 265.93.49
evanhout@uia.ua.ac.be
Time: Monday 9.00-12.30, 2nd semester 2001-2002
Contents: The use of electronic texts is increasing enormously in all areas of society and in all disciplines of both the humanities and the hard sciences. With this trend, the problems attached to the use and interchange of electronic texts become more prominent: software and platform incompatibility, loss of data when converting files, and problems of archiving, creation and use. This course addresses these problems and focuses on the problematic position of electronic texts in the humanities. Students can also expect an introduction to the history and evolution of electronic publication media such as the Internet. In formal lectures and seminars, we turn our attention to the creation and publication of electronic texts, and gain hands-on experience in using internationally accepted standards for text encoding and markup: SGML, HTML, XHTML, CSS, XML, XSL, TEI and OEB (Open eBook). The course introduces tools and techniques that the students will use to produce an electronic publication. This year, the publication will be a web edition of De Leeuw van Vlaenderen by Hendrik Conscience.
Prerequisites: Elementary computer literacy is required (knowing how to work with multiple windows, use the mouse, create folders and files, and download files from the internet), but an introductory session may be organised for students who do not yet meet this standard.
This course is taught in English.
Teaching format: Formal lectures and seminars with preparation.
Examination: Continuous assessment and group assignment. Only students who take part in all of the evaluation moments will be eligible to receive marks for this course.
Required reading:
  • Hockey, Susan (2000). Electronic Texts in the Humanities. Oxford: Oxford University Press.
  • The journals Literary & Linguistic Computing, Computers and the Humanities, Markup Languages: Theory and Practice and Human IT.
  • Further required and advised readings are posted on the course website.
Credits: This course counts for 4 ECTS credits, which corresponds to a workload of 120 hours, organised as follows:
  • Lectures: 24h.
  • Weekly preparation: 16h.
  • Group assignment: 60h.
  • Hypertext report: 20h.

Programme

Week 1 (11 February): Introduction Humanities Computing - History of the Internet - Hypertext - Introduction to Markup Languages.

Format: Formal lecture
Preparation
  • Know how to browse the internet and how to search for and find information.
  • Know how to email.
Required reading
  • Susan Hockey (2000). Electronic Texts in the Humanities. Oxford: Oxford University Press. Chapter 1: "Why Electronic Texts?". p. 1-10.
Further reading

Week 2 (18 February): Document Analysis - Markup - Text Encoding - SGML/XML - DTD.

Format: Seminar
Preparation
  • Analyse a document.
Required reading
  • Susan Hockey (2000). Electronic Texts in the Humanities. Oxford: Oxford University Press. Chapter 3: "Text Encoding". p. 24-48.
Course material
Further reading

Week 3 (25 February): No Class

Assignment: Organise a survey among literature staff and students and ask about their envisioned use of electronic texts. Try to single out those characteristics which could translate into markup. Give an oral report (with figures) in Week 4. This survey and report will be translated into markup guidelines for the group assignment.
Required reading
  • Susan Hockey (2000). Electronic Texts in the Humanities. Oxford: Oxford University Press. Chapter 5: "Literary Analysis". p. 66-84.
  • P4 TEI Guidelines for Electronic Text Encoding and Interchange. A Gentle Introduction to XML.
  • TEILite. "TEI U5: Encoding for Interchange: an introduction to the TEI." [html] [sgml] [xml] [pdf]

Week 4 (4 March): XML - TEI Hands-on

Format: Seminar
Required reading: Come well prepared and read:
Downloads
Installation

NoteTab Light is a very complete plain text editor which allows you to create SGML, XML, (X)HTML, CSS etc. documents.

Download the software to your computer and unzip the file with an unzip program (e.g. WinZip). Double-click the Setup.exe file and follow the InstallShield instructions. Once installed, run the program and select View > Options > File Filters. Select "New", and add the following details:

  • Description: "xml"
  • Wildcards: "*.xml"
  • Click the OK button. Now you can save XML instances with the extension ".xml".

Repeat this operation for each file format you want to add to the software, e.g. CSS.

Download teixlite.clb and save it (with the .clb extension!) in NoteTab Light/Libraries. The tab "teixlite" will now appear in the tab bar at the bottom of the programme window. Click it to activate the library, which will appear in the left margin.

Further reading

Week 5 (11 March): XML - TEI Hands-on

Format: Seminar
Assignment: Make a document analysis of the first chapter of De Leeuw van Vlaenderen, and combine this with your own insights into markup usability and the results of your survey. Before Saturday night, 9 March, email me a list of the teixlite elements and attributes which, in your view, should be used in encoding the novel. Non-Dutch-speaking students should make a document analysis of any piece of fictional prose in their own language. These reports will serve as a blueprint for the project encoding guidelines we need to write in week 5.
Required reading
Downloads
Installation

Download the binaries for Windows 95 and Windows NT, then unzip and extract them into an SP folder which you create. The setup creates three folders: bin, doc and pubtext. You can find the parser (nsgmls) in the bin folder.

Next, download the Runsp2 Windows interface for nsgmls. Unzip the file in the bin folder of SP. When you run runsp2.exe from that folder, runsp2 will find nsgmls. Read runsp.txt carefully.

Copy the following files into the same bin folder:

Specify where nsgmls can find the catalog file under Options in the toolbar of runsp2.

Specify where nsgmls can find xml.dcl under Options in the toolbar of runsp2.

Download teixlite.clb and save it (with the .clb extension!) in NoteTab Light/Libraries. The tab "teixlite" will now appear in the tab bar at the bottom of the programme window. Click it to activate the library, which will appear in the left margin.


Week 6 (18 March): XML - TEI Hands-on

Format: Seminar
Assignment: Check, correct and validate the file error.xml using nsgmls, either with Emacs or with the Runsp Windows interface. When parsing the file with the latter tool, click OK on the message "invalid line or column" and press the "Next error" button, which will take you to the first error.
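As a rough illustration of what "the first error" means, the sketch below uses Python's standard XML parser to report the line and column of the first well-formedness error in a document. It is not a substitute for nsgmls: it does not read a DTD, so it cannot validate against teixlite. The function name first_error and the SAMPLE text are hypothetical; the contents of error.xml are not reproduced here.

```python
import xml.etree.ElementTree as ET


def first_error(xml_text):
    """Return (line, column, message) for the first well-formedness
    error in xml_text, or None if the text parses cleanly."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # position reported by the parser
        return line, column, str(err)
    return None


# Hypothetical sample: the <p> element is never closed, so the parser
# stops when it meets the end tag of <body>.
SAMPLE = "<text>\n<body>\n<p>Unclosed paragraph\n</body>\n</text>"

if __name__ == "__main__":
    print(first_error(SAMPLE))
```

As with the "Next error" button, the workflow is to fix the reported error and parse again, until no error is left.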
Group project
  • De Leeuw van Vlaenderen. First part.
    • [.wpd] (779kb)
    • [.rtf] (603kb)
    • [.txt] (304kb) (no footnotes, sorry)
    • leeuw1.zip (421kb) the three previous files in a compressed format.
    • List of corrections in the first part.
    • Glossary list
  • List of teilite tags.
TEI U5 and teixlite: Please note that TEI U5 defines teilite in terms of SGML. Teixlite is XML, which means that you will have to adapt your encoding practice to the following rules of thumb:
  1. Element names: use the exact element names. XML is case sensitive!
  2. Empty elements: don't tag empty elements as e.g. <pb>, but as <pb />
  3. Attributes: lower case.
  4. Attribute values: always quoted. e.g. <pb id="1.01.002" n="2" />
  5. Tag minimization: not allowed in XML. All tags which are opened must be closed and nested correctly.
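The rules of thumb above can be tried out with a small sketch, assuming Python's standard XML parser as a stand-in for a well-formedness check. The <div> and <p> wrappers, the sample text and the function name is_well_formed are illustrative assumptions; only the <pb /> element is taken from the list. Rule 3 (lower-case attribute names) is a matter of the DTD and cannot be detected by a well-formedness check, so it is not tested here.

```python
import xml.etree.ElementTree as ET

# A fragment that follows the rules: matching case in start and end tags,
# an explicitly closed empty element, quoted attribute values, no
# omitted end tags.
XML_STYLE = '<div><p>Sample text.<pb id="1.01.002" n="2" /></p></div>'

# SGML-style habits that XML does not accept, one per fragment.
SGML_STYLE_VARIANTS = {
    "end tag in a different case": "<div><p>text</P></div>",
    "empty element not closed": "<div><p>text<pb></p></div>",
    "unquoted attribute value": "<div><p>text<pb n=2 /></p></div>",
    "omitted end tag": "<div><p>one<p>two</div>",
}


def is_well_formed(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("XML style:", is_well_formed(XML_STYLE))
    for label, fragment in SGML_STYLE_VARIANTS.items():
        print(label + ":", is_well_formed(fragment))
```

A fragment that passes this check is only well-formed; whether it is valid teixlite still has to be established by parsing it against the DTD.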

Week 7 (25 March): Stylesheet languages - CSS - XSL

Format: Seminar
Tools
Downloads
Further reading

Week 8 (15 April): No Class - workweek.

Assignment: Group assignment

Week 9 (22 April): Digitization. What, why, how, OCR & imaging.

Format: Formal lecture and seminar
Required reading
  • Susan Hockey (2000). Electronic Texts in the Humanities. Oxford: Oxford University Press. Chapter 2: "Creating and Acquiring Electronic Texts". p. 11-23.
Further reading

Week 10 (29 April): HTML 4.01 / XHTML 1.0 - hands-on & Documentary "Into the Future"

Format: Seminar
Required reading
Tools: W3C HTML Validation Service
Downloads
Further reading


XHTML author: Edward Vanhoutte

