Technologies of Information and Communication Applied to English
Studies
[ Unit 9 in syllabus for 35334 Practical Criticism Applied to
English Studies 2012-2017 ]
An Introduction
Click here to start Concepts: Technology
| Information | Communication | Markup language | TEI Text Encoding
Initiative
computational tools -- "We shape our tools. And then our
tools shape us." (Marshall McLuhan)
Robert Harris, "The Personal
Computer as a Tool for Student Literary Analysis"
Reference management
"Reference
management software" Wikipedia
Software : Zotero
: Explanation in "Quick
Start Guide"
Mendeley:
(also an academic social network) Explanation in "Mendeley" Wikipedia
The University of València uses the proprietary software RefWorks
See icon on the right hand side of any catalogue card
in the UV library catalogue.
"RefWorks"
Wikipedia
Bibliographic records can also be organized in a
database "Bibliographic
database" in Wikipedia
Databases
What is a database ? "Database" Wikipedia
Database software : MySQL |
Base in LibreOffice
Dictionaries are databases : OED
(accessible from a UV terminal)
Database as a linguistic corpus:
These databases use concordancing and other text analysis techniques
Text Analysis Tools
With electronic texts, it is easier to find and study patterns in
texts, see words and phrases in context, examine their frequencies,
analyse word clusters, obtain concordances, etc.
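Seeing "words and phrases in context" is what a concordancer does. As a minimal sketch (not how AntConc or any listed tool is implemented), a keyword-in-context listing could be produced in Python as follows. The function name kwic, the width parameter and the sample sentence are invented for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of keyword."""
    # Collapse line breaks so that each context fits on one line.
    text = " ".join(text.split())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Invented sample text, not a quotation from any novel.
sample = "The night was dark. A dark shape crossed the dark room."
for line in kwic(sample, "dark", width=20):
    print(line)
```

Each output line shows one occurrence of the keyword in brackets with its left and right context aligned, which is the layout a printed concordance uses.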
Text analysis software listed in DiRT
(Digital Research Tools) and in "Text
mining" Project Bamboo
Concordances
Use of concordances in the study of language and literature: "Concordance
(publishing)" Wikipedia
For instance, in an analysis of Frankenstein, Robert Harris
suggests running the novel through a concordance program in order to
"determine the frequencies of word occurrence. After the common
"glue words" (such as the, of, is, and, etc.), what words are used
most often, and of what kind are they? Approach the meaning of
"kind" from several aspects. That is, are they nouns, verbs, or
adjectives? Are they tonal words such as dark or gloom?
Are they words conveying excess of a quality rather than
moderateness (such as terror rather than fear, elated
rather than happy)? And of course, so what? " ("Ideas for
Analyzing Frankenstein")
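Harris's first step, counting word frequencies after setting aside the "glue words", can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the GLUE_WORDS list is a tiny hand-made sample (real stop-word lists are much longer), the function name is hypothetical, and the sample sentence is invented rather than taken from the novel.

```python
import re
from collections import Counter

# A deliberately short, hand-made list of "glue words" (an assumption).
GLUE_WORDS = {"the", "of", "is", "and", "a", "an", "in", "to", "it", "was"}

def content_word_frequencies(text, top=5):
    """Count words other than glue words and return the most frequent."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in GLUE_WORDS)
    return counts.most_common(top)

# Invented sample sentence for demonstration only.
sample = "The terror of the night was the terror of the mind, and the gloom of it."
print(content_word_frequencies(sample))
```

The list that comes back is only the raw material: deciding whether the frequent words are nouns or adjectives, tonal or excessive, and "so what?", remains the reader's job.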
Software:
AntConc
(freeware concordance programme) | Concordance |
see "Concordancer"
entry in Wikipedia
Open
Source Shakespeare
WordCruncher : an
electronic text viewer with several tools with which one can look
for specific references, search for words or phrases, perform simple
word frequency distribution studies, study multiple languages in
synchronized windows, and see various reports (e.g., collocation,
vocabulary dispersion, vocabulary usage)
WordSmith
Tools (concordance, key words, alphabetical and frequency word lists)
Word Count Tools
(identifies unique words, difficult words; average sentence
length, readability, keyword density, estimated reading time)
Wordcounter (shows top 10 keywords)
Visualization tools
Voyant-Tools
TokenX | Tagline Generator | ManyEyes
TagCrowd | Wordle | ManiWordle | WordItOut (word cloud generator)
TextArc
TextArc Hamlet
Visualization techniques: Word Tree | Phrase Nets | Tag Clouds |
Theme River | Parallel Tag Clouds | Spark Clouds
Audiovisual presentation
Slide shows: CAUTION : Wailgum "PowerPoint
Hell" | Bumiller "We
Have Met the Enemy and He is PowerPoint"
Software: Impress in LibreOffice |
Video presentations:
Software: Avidemux
Epilogue
==============================================================================
CONCEPTS
"Information and Communication Technology (ICT)"
Let's analyze the constituents of this phrase
Technology =
"the making, modification, usage, and knowledge of tools,
machines,
techniques, crafts, systems,
and methods of organization, in order to solve a problem, [. . . ]
handle an applied input/output relation or perform a specific
function" ("Technology" Wikipedia
1 March 2013)
What sort of PROBLEMS can I have in the study of language and
literature? How can specific tools, such as computers, assist me in
solving these problems, in answering questions about texts, works
and their meanings?
In the interaction between humans and computers, pay
attention to Weizenbaum's warning in Computer Power and Human
Reason: From Judgment to Calculation
The book "displays his ambivalence towards computer technology and
lays out his case: while artificial intelligence may
be possible, we should never allow computers to make important
decisions because computers will always lack human qualities such
as compassion and wisdom.
Weizenbaum makes the crucial distinction between deciding and
choosing. Deciding is a computational activity, something that can
ultimately be programmed. It is the capacity to choose that
ultimately makes us human. Choice, however, is the product of
judgment, not calculation. Comprehensive human judgment is able to
include non-mathematical factors such as emotions. Judgment can
compare apples and oranges, and can do so without quantifying each
fruit type and then reductively quantifying each to factors
necessary for mathematical comparison." ("Computer
Power and Human Reason" in Wikipedia 1 March 2013)
Information =/= Communication
Communication = conveying
or sharing information
"a process by which information is
exchanged between individuals through a common system of symbols, signs, or behavior" ("communication"
Merriam-Webster Online).
Do we communicate with computers?
Compare the previous definition with "the exchange of meanings between individuals
through a common system of symbols" ("communication."
Encyclopædia Britannica Online Academic Edition. 01
Mar. 2013.)
"The communication process is complete once the receiver has
understood the message of the sender." ("Communication"
Wikipedia 1 March 2013)
Do computers understand?
Information =
" in its most restricted technical sense, is a sequence
of symbols that can be interpreted as a message.
Information can be recorded as signs, or transmitted as signals. [
. . . ] Conceptually, information is the message
(utterance or expression) being conveyed. " ("Information",
Wikipedia, 1 March 2013)
The text of a novel, poem, play, etc. is a sequence of symbols, an
arrangement of words (made up of letters) and punctuation marks,
recorded in a given document (paper, electronic). These symbols are
inscribed instructions for a reader or receiver to process that
information, to decode that text in order to understand the message.
While we can read the letter "I" or "i", a sign in the roman
alphabet code, computers don't read the letter "i" (to put it
crudely). Almost all modern computers use the binary numeral system
to encode information, and so they need to have the letter "i"
encoded as a given sequence of 0 and 1, such as "110 1001", in ASCII
code.
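The encoding just mentioned can be checked directly. The short Python sketch below (the function name to_ascii_bits is made up for this example) converts a character into its 7-bit ASCII pattern using the built-in ord and format functions.

```python
def to_ascii_bits(char):
    """Return the 7-bit ASCII pattern of a single character as a string."""
    code = ord(char)  # numeric code of the character, e.g. 105 for "i"
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")  # binary, padded to 7 digits

print(to_ascii_bits("i"))  # 1101001
print(to_ascii_bits("I"))  # 1001001
```

Note that lower-case "i" and capital "I" are two different sequences for the computer, even though a human reader treats them as the same letter.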
Computers' processing of information is called "computation" or
"calculation"
Calculation = "deliberate process for transforming one or
more inputs into one or more results, with variable change"
("calculation" Wikipedia)
In computing, this process involves a step-by-step procedure, that
is, an algorithm = " systematic procedure that
produces—in a finite number of steps—the answer to a
question or the solution of a problem. " ("algorithm." Encyclopædia
Britannica Online Academic Edition. Web. 01 Mar. 2013)
What sort of PROBLEMS can I have in the study of language and
literature?
For instance, in an analysis of Frankenstein we may be
interested in the contrast between positive terms related to
"happiness" and "beauty" and negative terms related to "darkness".
Robert Harris suggests the following question: "How do the passages
relating to happiness and beauty fall within the novel, and how do
they occur in relation to those of darkness?"
To answer that we need to "Look for such terms as lovely,
beautiful, beauty, pleasure, delight, joy, delighted, enchanted,
gay, happy and contrast them with terms such as ugly,
disconsolate, melancholy. " ("Ideas")
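One possible way of approaching this question with a computer is to divide the text into equal parts and count both groups of terms in each part. The Python sketch below uses the word lists quoted from Harris. The function name, the number of segments and the sample string are assumptions made for illustration, and simple counting of this kind ignores context such as negation ("not happy").

```python
import re

POSITIVE = {"lovely", "beautiful", "beauty", "pleasure", "delight",
            "joy", "delighted", "enchanted", "gay", "happy"}
NEGATIVE = {"ugly", "disconsolate", "melancholy"}

def term_distribution(text, segments=4):
    """Split the text into roughly equal parts and return, for each
    part, a pair (positive term count, negative term count)."""
    words = re.findall(r"[a-z']+", text.lower())
    size = max(1, -(-len(words) // segments))  # ceiling division
    rows = []
    for start in range(0, len(words), size):
        chunk = words[start:start + size]
        positive = sum(w in POSITIVE for w in chunk)
        negative = sum(w in NEGATIVE for w in chunk)
        rows.append((positive, negative))
    return rows

# Invented word sequence, only to show the shape of the result.
sample = "happy joy beauty walk talk ugly melancholy disconsolate"
print(term_distribution(sample, segments=2))  # [(3, 0), (0, 3)]
```

Run on the full text of the novel (for instance the Project Gutenberg edition), the pairs would show where in the book each group of terms clusters. Interpreting that pattern is still a critical task, not a computational one.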
How can specific tools, such as computers, assist me in solving
these problems?
Computers can perform certain tasks more efficiently than humans
can: for instance, search and find text or patterns
In order to have computers perform a given task, humans need to give
them instructions
Readers understand a text because they share the code
(the language, e.g. English) with the sender or writer, as well as
its writing conventions (instructions). Thus humans can read the
letter "i", a sign in the Roman alphabet code, and understand "I"
as the first-person pronoun because they have learned that. Computers
"learn" as long as they are given instructions.
Instructions can be communicated to computers by means of a
programming language. A sequence or collection of such instructions
is a computer program.
Basic computer programs to work with texts are text editors
and word processors.
Text editor : Bluefish
(explanation
in Wikipedia) | Word processor : Writer in LibreOffice
Robert Harris's 1994 article
explains ways of using word processors in literary analysis:
- finding occurrences and themes
- testing theories or claims
- discovering thematic shapes
- performing specialized searches
----
Before computers can perform a given task on a literary text, they
obviously need the text of the novel, poem, play, etc., that is, an
"input" in electronic form.
Where to find electronic texts? Project
Gutenberg | Internet Archive
| The University of Oxford Text Archive (OTA) | click here for
further resources.
-----
When we produce or read a text, we not only encode or decode the
signs such as words and punctuation marks, but also apply cultural
codes (mostly typographical conventions) that tell how the text is
structured.
When we read the following text:
〈ACT 1〉
〈Scene 1〉
Enter Barnardo and Francisco, two sentinels.
BARNARDO Who’s there?
FRANCISCO
Nay, answer me. Stand and unfold
yourself.
we recognize it as a dramatic text, we interpret its typographical
features and content as belonging to the text genre of "play", as we
observe that this text is organized into a number of structural
elements: the header of a text division that we call "act",
followed by the header of a subdivision that we culturally know as
"scene", followed by text in italics and centered indicating that it
is a stage direction, then the name of one of the roles or speakers
("Barnardo") followed by the text of the dialogue he speaks "Who's
there?", etc.
Now, computers can only read a text as a sequence of signs, a string
of symbols. If the "input" were just the text, the computer would
only read something like
"ACT1-Scene1-EnterBarnardoandFrancisco,twosentinels.-BARNARDOWho’s
there?-FRANCISCO-Nay,answerme.Standandunfoldyourself."
The input needs to be encoded in such a way that the computer
distinguishes sections of the text and their structure.
For this, computers need a markup language.
Markup language =
"A markup language is a modern system for annotating
a document
in a way that is syntactically distinguishable from the
text. [ . . . ] Examples are typesetting instructions such as
those found in troff, TeX and LaTeX,
or structural markers such as XML tags.
Markup instructs the software displaying the text to carry out
appropriate actions, but is omitted from the version of the text
that is displayed to users. Some markup languages, such as HTML,
have pre-defined presentation semantics, meaning
that their specification prescribes how the structured data are to
be presented; others, such as XML, do not." ("Markup
language" Wikipedia )
SGML (Standard Generalized Markup Language) (entry
in Wikipedia)
SGML
encoding of Literature Online
XML (entry in Wikipedia
)
HTML (entry in Wikipedia)
The above-quoted text with HTML tags, copied from Folger Digital
Texts Hamlet:
<div id="copyPaste"
style="top:0px;left:0px;width:1px;height:1px;overflow:hidden;">
<div class="sceneHeader"><span><img
src="http://www.folgerdigitaltexts.org/html/fdt-texta-l.png"
class="imgTextX" alt="text from the Folio not found in the
Second Quarto" title="text from the Folio not found in the
Second Quarto">Scene</span> <span>1<img
src="http://www.folgerdigitaltexts.org/html/fdt-texta-r.png"
class="imgTextX" alt="text from the Folio not found in the
Second Quarto" title="text from the Folio not found in the
Second Quarto"></span>
</div>
<span class="stage centered"><span id="line-SD 1.1.0"
title="SD 1.1.0">Enter Barnardo and Francisco, two sentinels.</span></span><br>
<br>
<span class="speaker">BARNARDO</span><span
class="indentInline"> </span><span
id="line-1.1.1" title="1.1.1">Who’s there?</span><br>
<span class="speaker">FRANCISCO</span><span
class="indentInline"> </span><br>
<span id="line-1.1.2" title="1.1.2">Nay, answer me. Stand and
unfold yourself.</span></div>
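Because the speakers are marked with class="speaker", a program can pick them out without "understanding" the play. The sketch below uses Python's standard html.parser module on a shortened fragment of the markup above. The class name SpeakerCollector is invented for this example, and the approach assumes that speaker spans contain no nested spans.

```python
from html.parser import HTMLParser

class SpeakerCollector(HTMLParser):
    """Collect the text of every <span class="speaker"> element."""

    def __init__(self):
        super().__init__()
        self.speakers = []
        self._in_speaker = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "speaker" in classes:
            self._in_speaker = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_speaker = False

    def handle_data(self, data):
        if self._in_speaker:
            self.speakers.append(data.strip())

# Shortened fragment of the Folger markup quoted above.
fragment = (
    '<span class="speaker">BARNARDO</span>'
    '<span class="indentInline"> </span>'
    '<span id="line-1.1.1" title="1.1.1">Who’s there?</span><br>'
    '<span class="speaker">FRANCISCO</span>'
)
collector = SpeakerCollector()
collector.feed(fragment)
print(collector.speakers)  # ['BARNARDO', 'FRANCISCO']
```

The program relies on a class name chosen by the designers of one website. A different edition in HTML could mark speakers in another way, which is one motivation for shared encoding conventions.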
LaTeX (explanation in Wikipedia)
What if we try to use the same encoding conventions to make texts
machine-readable?
TEI Text
Encoding Initiative: Guidelines
for encoding electronic texts
TEI by
Example
The above-quoted Hamlet text with TEI-conformant tags:
<div type="act" n="1">
  <head>〈ACT 1〉</head>
  <div type="scene" n="1">
    <head>〈Scene 1〉</head>
    <stage type="entrance">Enter Barnardo and Francisco, two sentinels.</stage>
    <sp>
      <speaker>BARNARDO</speaker>
      <p>Who’s there?</p>
    </sp>
    <sp>
      <speaker>FRANCISCO</speaker>
      <l>Nay, answer me. Stand and unfold yourself.</l>
    </sp>
    ...
  </div>
</div>
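Once the structure is marked with TEI elements, a program can ask structural questions such as "who speaks, and what do they say?". The following sketch uses Python's standard xml.etree.ElementTree module on a small sample modelled on the passage above. It assumes that each speech is wrapped in an sp element and that no XML namespace is declared (full TEI documents normally declare the TEI namespace, which would have to be included in the element names).

```python
import xml.etree.ElementTree as ET

# Small sample modelled on the Hamlet passage; no namespace (assumption).
tei = """
<div type="act" n="1">
  <div type="scene" n="1">
    <stage type="entrance">Enter Barnardo and Francisco, two sentinels.</stage>
    <sp><speaker>BARNARDO</speaker><p>Who's there?</p></sp>
    <sp><speaker>FRANCISCO</speaker><l>Nay, answer me. Stand and unfold yourself.</l></sp>
  </div>
</div>
""".strip()

root = ET.fromstring(tei)

# Pair each speaker with the prose (<p>) or verse (<l>) they speak.
speeches = []
for sp in root.iter("sp"):
    speaker = sp.findtext("speaker")
    words = " ".join(el.text.strip() for el in sp
                     if el.tag in ("p", "l") and el.text)
    speeches.append((speaker, words))

stage_directions = [stage.text for stage in root.iter("stage")]

for speaker, words in speeches:
    print(f"{speaker}: {words}")
print("Stage directions:", stage_directions)
```

Unlike the HTML version, the element names here (stage, sp, speaker, l) describe what each part of the text is, not how it should look on screen.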
So far we have seen ICT as a tool to assist the study of language
and literature.
Let's consider ICT as an administrative or management tool, for
instance, in the case of documenting a paper
BACK to section "Reference
management"
===================================================
Journals
References
"Computer Power and Human Reason." Wikipedia. Wikimedia
Foundation. Web. 1 Mar 2013.
http://en.wikipedia.org/wiki/Computer_Power_and_Human_Reason
Harris, Robert. "Ideas for
Analyzing Frankenstein." VirtualSalt. 18 Jun
2000. Web. 1 Mar 2013. http://www.virtualsalt.com/lit/franidea.htm
Harris, Robert. "The
Personal Computer as a Tool for Student Literary Analysis." VirtualSalt.
30 Dec 1994. Web. 1 Mar 2013.
http://www.virtualsalt.com/comptool.htm
Jones, Sarah. "When
Computers Read: Literary Analysis and Digital Technology." ASIS&T
Bulletin (April/May 2012)
Literary and Linguistic
Computing. Association for Literary and Linguistic
Computing. Oxford University Press. http://llc.oxfordjournals.org/
[UV subscribed]
McLuhan, Marshall. Understanding Media: The Extensions of
Man. Cambridge, MA: The MIT Press, 1994.
Ong, Walter. Orality and Literacy: The Technologizing of
the Word. London, UK: Routledge, 1988.
Weizenbaum, Joseph. Computer Power and Human Reason: From
Judgment To Calculation. San Francisco: W. H. Freeman, 1976.