B33080 Humanities Computing: Electronic Text

University of Antwerp, Campus Drie Eiken

Second Term 2004

Edward Vanhoutte

edward.vanhoutte@kantl.be

Week 1: Introduction to this course - Humanities Computing.

Monday 16 February


I. Monday 16 February Introductions

Congratulations

  • Unique in Belgium
  • Of direct relevance to the job market
  • You managed to find this course

Monday 16 February

  1. Introduction to this course
    1. Objectives of this course
    2. (Non-)Assumptions
    3. Me & You
    4. Housekeeping Rules
    5. Overview of the Course
    6. Test Elementary Computer Skills
  2. Introduction to Humanities Computing
    1. Humanities Computing: definitions
    2. Humanities Computing: a field, a discipline
      1. Associations involved
      2. Journals and mailing lists
      3. Publications
      4. Institutions
    3. Humanities Computing: short history
  3. Computing
    1. Hardware
    2. Graphical Interface

1.a. Objectives of this Course

  • Digitization
  • XML
  • XSL
  • Project management

1.a. Digitization of Texts and Images

After this course, you should be aware of:
  • what a digital image is
  • what digitization is
  • why you would digitize an object
  • ways of obtaining texts
    • archival
    • keyboarding
    • optical character recognition
  • how digitization works
  • tools, techniques, and standards
  • files and formats

1.a. XML: eXtensible Markup Language

XML is a metalanguage with which one can create separate markup languages for separate purposes. After this course you should know how to:
  • analyse a document
  • create a well formed XML document
  • create a valid XML document
  • parse an XML document for validation
  • read and interpret an XML document
  • create and read small DTDs
  • use and understand TEI-Lite
  • use and understand DALF
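
A minimal sketch of the well-formedness check mentioned above, in Python with the standard library only. The element names (letter, salute) and the helper name is_well_formed are made up for illustration and are not taken from TEI-Lite or DALF. Note that this parser only checks well-formedness; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents: the second closes its tags in the wrong order.
good = "<letter><salute>Dear Daphne</salute></letter>"
bad = "<letter><salute>Dear Daphne</letter></salute>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

The first document is accepted and the second is rejected because its elements overlap instead of nesting properly.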

1.a. XSL: eXtensible Stylesheet Language

XSLT is a tool for processing XML documents. After this course, you should know how to:
  • transform XML from one datatype to another
  • style XML for display in off-the-shelf Web browsers
  • use XPath to denote specific parts of a document
  • use XSLT as a search engine
  • segment your data
  • retag your data
  • express recursive algorithms in XSLT
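
A rough illustration of using path expressions to denote specific parts of a document. It uses Python's standard library, which supports only a limited subset of XPath and no XSLT, so it is a stand-in for the idea rather than the tooling used in this course. The element and attribute names (correspondence, letter, sender, n) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the markup is invented for illustration.
doc = ET.fromstring(
    "<correspondence>"
    "<letter n='1'><sender>Lynne Bryer</sender></letter>"
    "<letter n='2'><sender>Daphne Rooke</sender></letter>"
    "</correspondence>"
)

# Select every sender element that is a child of a letter element.
senders = [s.text for s in doc.findall("./letter/sender")]

# Select the sender of the letter whose attribute n equals '2'.
second = doc.find("./letter[@n='2']/sender").text

print(senders)
print(second)
```

The same path expressions would also work inside an XSLT stylesheet, where full XPath is available.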

1.a. Project Management

After this course, you should be aware of the issues involved in:
  • planning and instigating a project
  • costing a project
  • running a project
  • maintaining a project
  • evaluating a project

1.a. Not covered in the course

  • (X)HTML, CSS
  • ECDL
  • Web design, page design, typography
  • XSL formatting objects
  • Tricks of the trade (production use)
  • Digital audio and video
  • Three dimensional digitization

1.b. (Non-)Assumptions

I assume that:
  • You have elementary computer skills:
    • know how to work with multiple windows
    • work with the mouse
    • create folders and files
    • download files from the internet
  • You are interested in literary and historical texts, databases, and images.
  • You expect to work with XML data, and need to:
    • markup texts
    • display information
    • extract information
    • reformat information
  • You expect to work with digitized texts and images
  • You understand English
  • (and a little bit of Scottish and Dutch)

1.b. (Non-)Assumptions

I do not assume that:
  • You know something about programming
  • You know something about markup, transformations, and digitization
  • You know XML and can edit XML documents
  • You have used XSLT
  • You are a programmer

1.c. The Lecturer: Edward Vanhoutte

  • Co-ordinator, Centre for Scholarly Editing and Document Studies (Centrum voor Teksteditie en Bronnenstudie - CTB) of the Royal Academy of Dutch Language and Literature (KANTL)
  • Associate University Teacher in Humanities Computing, University of Antwerp
  • Degrees in Germanic Languages and Literature, Mediaeval Studies, Education
  • email: edward.vanhoutte@kantl.be
  • http://www.kantl.be/ctb/vanhoutte/
  • http://www.kantl.be/ctb/staff/edward.htm

1.c. You

I want to know:
  • Who you are (name)
  • What you do
  • What you study
  • Why you're here
in 30 seconds

1.d. Housekeeping Rules

  • Exam:
    1. Permanent Evaluation: presence and participation in the weekly lectures & exercises
    2. Group Assignment: electronic edition of the correspondence between Lynne Bryer & Daphne Rooke
    3. Viva Report
  • Lectures:
    • Academic 15 min for checking email
    • Always bring a disk
    • Preparation & exercises
    • Required readings
    • Suggested readings
  • Course website:
    • http://www.kantl.be/ctb/vanhoutte/teach/hc2004.htm
    • Check it regularly
    • No "I-didn't-know-it"s
    • Updated every Wednesday noon
  • A copy of your student card
  • A mail with your email address to edward.vanhoutte@kantl.be

1.e. Overview of the course

  • Monday 16 February: 1. Introduction to this course - Humanities Computing
  • Monday 23 February: 2. Digitization of Images and Textual Resources: Dr. Melissa Terras
  • Monday 1 March: 3. XML theory and practice: History of the Internet - Hypertext - Text Encoding & Markup - Document Analysis
  • Monday 8 March: 4. XML theory and practice: SGML/XML - TEI - DTD - well formed XML
  • Monday 15 March: 5. XML theory and practice: valid XML - validating
  • Monday 22 March: 6. TEI: TeiXlite
  • Monday 29 March: 7. TEI: DALF
  • Monday 19 April: 8. XSL theory and practice: basics, XPath, function
  • Monday 26 April: 9. XSL theory and practice: Real XSLT
  • Monday 3 May: 10. Group Project - Documentary "Into the Future"

1.f. Test Elementary Computer Skills

  • Make three folders on your C drive, name them "Man", "Woman", "Baby".
  • Download the following files from http://www.kantl.be/ctb/vanhoutte/teach/hc2004.htm
    • Week 1 > As We May Think (bushf.htm): put bushf.htm in folder "Man"
    • Week 4 > teixlite.dtd: put teixlite.dtd in folder "Woman"
    • Bottom > XHTML 1.0 logo (valid-xhtml10.png): put valid-xhtml10.png in folder "Baby"
    • Put the contents of "Baby" in "Woman" and delete "Baby"
    • Copy the contents of "Woman" to "Man"
    • Save the remaining folders to a disk
    • Pass the disk on to your right-hand neighbour
    • Explore the contents of the disk and show me
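
For checking your result, the folder steps above can be sketched in Python. This is only an illustration under stated assumptions: it works in a temporary directory instead of the C drive, uses empty placeholder files instead of the real downloads, and stops before the "save to a disk" step. The function name run_exercise is invented for this sketch.

```python
import shutil
import tempfile
from pathlib import Path


def run_exercise(root):
    """Replay the folder steps under root and return the final layout."""
    folders = {name: root / name for name in ("Man", "Woman", "Baby")}
    for folder in folders.values():
        folder.mkdir()

    # Assumption: empty placeholder files stand in for the downloaded files.
    (folders["Man"] / "bushf.htm").touch()
    (folders["Woman"] / "teixlite.dtd").touch()
    (folders["Baby"] / "valid-xhtml10.png").touch()

    # Put the contents of "Baby" in "Woman" and delete "Baby".
    for item in list(folders["Baby"].iterdir()):
        shutil.move(str(item), str(folders["Woman"] / item.name))
    folders["Baby"].rmdir()

    # Copy the contents of "Woman" to "Man".
    for item in list(folders["Woman"].iterdir()):
        shutil.copy2(item, folders["Man"] / item.name)

    # Map each remaining folder name to its sorted file names.
    return {
        folder.name: sorted(f.name for f in folder.iterdir())
        for folder in sorted(root.iterdir())
    }


with tempfile.TemporaryDirectory() as tmp:
    result = run_exercise(Path(tmp))

print(result)
```

At the end "Man" should hold all three files, "Woman" should hold two, and "Baby" should no longer exist.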

2. Introduction to Humanities Computing

  • a. Humanities Computing: definitions
  • b. Humanities Computing: a field, a discipline
    • a. Associations
    • b. Journals
    • c. Mailing lists
    • d. Publications
    • e. Institutions
  • c. Humanities Computing: short history

2.a. Humanities Computing: definitions

Humanities Computing is an academic field concerned with the application of computing tools to arts and humanities data or their use in the creation of these data.
Willard McCarty
To apply Computing solutions to valid, and generally complex, Humanities based problems to provide access and answers that otherwise would have been impossible.
Melissa Terras

2.a. Humanities Computing: definitions

Required reading:
  • Willard McCarty (2002). Humanities Computing (Preliminary draft entry for The Encyclopedia of Library and Information Science, New York: Dekker, 2003.)
    → http://www.kcl.ac.uk/humanities/cch/wlm/essays/encyc/

2.b. Humanities Computing: a field, a discipline

Required reading
  • Marilyn Deegan (2000). "Introduction." Frances Condron, Michael Fraser & Stuart Sutherland (eds.), Guide to Digital Resources for the Humanities 2000. Oxford: CTI.

2.b.a. Associations: A Selection

  • Association for Literary and Linguistic Computing (ALLC)
    <http://www.allc.org>
  • Association for Computers and the Humanities (ACH)
    <http://www.ach.org>
  • Text Encoding Initiative Consortium (TEI)
    <http://www.tei-c.org>

2.b.b. Journals (Print): A Selection

  • Literary & Linguistic Computing
  • Computers and the Humanities
  • Human IT
  • Markup Languages: Theory and Practice (discontinued)
  • Computers and Texts
  • Revue. Informatique et Statistiques dans les Sciences Humaines.
  • Text Technology. The Journal of Computer Text Processing.
  • Cultural & Heritage Informatics Quarterly
→ Institutional models for humanities computing
<http://www.kcl.ac.uk/humanities/cch/allc/imhc/>

2.b.b. Journals (on-line): A Selection

  • DigiNews (RLG)
    <http://www.thames.rlg.org/preserv/diginews/>
  • D-Lib Magazine
    <http://www.dlib.org/dlib/>
  • Journal of Digital Information
    <http://jodi.ecs.soton.ac.uk/>
  • Journal of Electronic Publishing
    <http://www.press.umich.edu/jep/>
  • Ariadne
    <http://www.ariadne.ac.uk/>
→ Institutional models for humanities computing
<http://www.kcl.ac.uk/humanities/cch/allc/imhc/>

2.b.c. Mailing Lists: A Selection

  • Humanist
    an international electronic seminar on the application of computers to the humanities whose primary aim is to provide a forum for discussion of intellectual, scholarly, pedagogical, and social issues and for exchange of information among members.
    <http://www.kcl.ac.uk/humanities/cch/humanist/>
  • TEI-L
    a discussion list for newbies and advanced users of the TEI.
  • E-DOCS
    a discussion list and website for professionals involved in the production, distribution, and organization of historical documents on the Internet. Offers an index of "Best Practices and Exemplary Sites."

2.b.d. Publications: A Selection

  • Susan Hockey (2000). Electronic Texts in the Humanities. Oxford: Oxford University Press.
  • Office for Humanities Communication Publications (King's College London, currently 16 titles in print)
  • Jahrbuch für Computerphilologie - print (München, Germany) & online <http://computerphilologie.uni-muenchen.de/ejournal.html>

2.b.e. Institutions

→ Institutional models for humanities computing
<http://www.kcl.ac.uk/humanities/cch/allc/imhc/>

2.c. Humanities Computing: short history

  1. The Beginnings of Humanities Computing
    • 1949-1980: Roberto Busa, Index Thomisticus: 8 million words, 60 volumes.
    • 1950's: New Testament concordancing and stylistic analysis.
  2. Developments in Concordancing
    • Quantitative basis for stylistic analysis, authorship attribution, vocabulary studies of collocations, scholarly editing, ...
  3. Text Archives
  4. Developments in Databases and Statistics
    • 1970's: use of statistical techniques in historical analysis
    • → differences between historical source material
    • → links between important names in a variety of different documents and document types such as ephemera, baptism records and images
  5. Associations and consortia: ALLC, ACH, TEI

3. Computing

Development of
  • a. Hardware
  • b. PC & Graphical Interface
Required reading:
  • Michael Fraser (1996). A Hypertextual History of Humanities Computing
    → http://info.ox.ac.uk/ctitext/history/intro.html

3.a. Hardware

First generation: tubes
  • 1942: ABC (Atanasoff-Berry Computer):
    • more than 300 vacuum tubes
    • weight: 250 kg
    • more than 1.5 km wire
    • speed: 1 calculation every 15 seconds.
  • 1946: ENIAC (Electronic Numerical Integrator and Computer):
    • 18,000 vacuum tubes
    • 70,000 resistors, 10,000 capacitors, 6,000 manual switches
    • height: 2.5m
    • length: 24m (167 sq m)
    • weight: 30 tons
    • power consumption: 160 kilowatts
→ Dr. John von Neumann (1903-1957): computer memory & BIT (1 or 0)

3.a. Hardware

EDSAC

3.a. Hardware

Titan

3.a. Hardware

Second generation: transistors
John Bardeen, Walter Brattain & William Shockley
  • transmitting
  • resistor
  • 1960: IBM 7090
  • 650 EDPM computer: 60% discount for universities

3.a. Hardware

Third generation: integrated circuits
→ transistors on semiconductors
  • Jack Kilby: germanium
  • Robert Noyce: silicon → CHIP
  • 1970: Intel 1103 chip: DRAM (Dynamic Random Access Memory)
  • → HP 9800
  • 1970: Fairchild Corporation: 256-bit SRAM chip (Static Random Access Memory)

3.a. Hardware

Fourth generation: microprocessors
  • 1971: Intel 4004: 2,300 transistors on a 4 x 3 mm chip

3.a. Hardware

Storage media
  • Keypunch Machine
  • 1971: IBM 8" Memory Disk (Floppy)
  • 1978: Wang Laboratories 5 1/4" Floppy
  • 1981: Sony 3 1/2" Disk

3.b. PC & Graphical Interface

1. Personal Computer
Scelbi (1974)

3.b. PC & Graphical Interface

Altair (1975): DIY kit with switches for binary programming

3.b. PC & Graphical Interface

Basic: Bill Gates & Paul Allen wrote BASIC in 6 weeks

3.b. PC & Graphical Interface

Apple: Steve Wozniak & Steve Jobs
  • Apple I (1976)
  • Apple II (1977)

3.b. PC & Graphical Interface

Commodore PET (1977): Personal Electronic Transactor

3.b. PC & Graphical Interface

Radio Shack (Tandy) TRS-80 (1977)

3.b. PC & Graphical Interface

2. User Software
  • Disk drive → distribution of software
  • VisiCalc spreadsheet (1979) → Lotus 1-2-3 (1983)
  • WordStar (1979)
→ American Software Patent Law in 1981

3.b. PC & Graphical Interface

3. IBM PC (1981)
  • First PC based on open architecture (off the shelf parts)
  • 4.77 MHz Intel 8088 microprocessor
  • one or two disk drives
  • 16 KB memory → 256 KB
  • 16 bit Microsoft OS: MS-DOS 1.0
  • price: from 1,565 USD onwards (ca. 4,300 EUR now)

3.b. PC & Graphical Interface

4. Graphical User Interface
  • First User interfaces (MS DOS): command line

3.b. PC & Graphical Interface

  • 1970: Xerox Corporation founds the Palo Alto Research Center (PARC), home of the Xerox Alto: the architecture of information
  • 1983: Apple Lisa (Local Integrated Software Architecture)
    • drop-down menu bars
    • windows
    • multitasking
    • hierarchical file system
    • copy & paste
    • icons
    • folders
    • mouse (Doug Engelbart)

3.b. PC & Graphical Interface

More Graphical User Interfaces
  • Microsoft Windows 1.0 (1985)
  • IBM: TopView
  • VisiOn (1983)
  • GEM (Graphics Environment Manager)

3.b. PC & Graphical Interface

5. Cost factor
Sinclair ZX80 (1980)
  • 22x17 cm
  • 99.95 GBP or 79 GBP in kit form
  • Screen: TV
  • Storage medium: tape recorder
  • 20,000 computers sold in 9 months

3.b. PC & Graphical Interface

5. Cost factor
Sinclair ZX81 (1981)
  • 69.95 GBP or 49.95 GBP in kit form
  • Sold by mail order & in supermarkets
  • Magazines, hobby clubs
  • 300,000 computers sold in 10 months

3.b. PC & Graphical Interface

5. Cost factor
Sinclair ZX Spectrum (1982)
  • Low price → 'Home Computer'
  • 15,000 units sold a week