Universitat de València  Departament
                Filologia Anglesa i Alemanya
Jesús Tronch Pérez

Technologies of Information and Communication Applied to English Studies

[ Unit 9 in syllabus for 35334 Practical Criticism Applied to English Studies 2012-2017 ]

An Introduction

Concepts:  Technology  |  Information  |  Communication  |  Markup language  |  TEI Text Encoding Initiative

computational tools  --  "We shape our tools. And then our tools shape us." (Marshall McLuhan)
Robert Harris, "The Personal Computer as a Tool for Student Literary Analysis"
 

Reference management

"Reference management software" Wikipedia

Software :  Zotero :  Explanation in "Quick Start Guide"
Mendeley:  (also an academic social network) Explanation in "Mendeley" Wikipedia
The University of València uses the proprietary software RefWorks
See icon on the right hand side of any catalogue card in the UV library catalogue.   "RefWorks" Wikipedia

Bibliographic records can also be organized in a database   "Bibliographic database" in Wikipedia 

Databases

What is a database ?    "Database" Wikipedia

Database software  :      MySQL  |    Base in  LibreOffice  
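To make the idea of bibliographic records organized in a database more concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module only because it needs no server; it is not one of the programs listed above. The table name `records`, its columns and the function names are assumptions made for illustration, and the three records are taken from the reference list of this unit.

```python
import sqlite3

# Illustrative records taken from this unit's reference list.
RECORDS = [
    ("Weizenbaum, Joseph", "Computer Power and Human Reason: From Judgment to Calculation", 1976),
    ("Ong, Walter", "Orality and Literacy: The Technologizing of the Word", 1988),
    ("McLuhan, Marshall", "Understanding Media: The Extensions of Man", 1994),
]


def build_database():
    """Create an in-memory database with one hypothetical table, `records`."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE records (author TEXT, title TEXT, year INTEGER)"
    )
    connection.executemany("INSERT INTO records VALUES (?, ?, ?)", RECORDS)
    connection.commit()
    return connection


def titles_published_before(connection, year):
    """Return (author, title, year) rows published before `year`, oldest first."""
    cursor = connection.execute(
        "SELECT author, title, year FROM records WHERE year < ? ORDER BY year",
        (year,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    db = build_database()
    for author, title, year in titles_published_before(db, 1990):
        print(f"{author}. {title}. {year}.")
```

The point of the sketch is that once records are stored as rows with named fields, a question such as "which items were published before 1990?" becomes a query rather than a manual search.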

Dictionaries are databases :  OED (accessible from a UV terminal)

Database as a linguistic corpus:

These databases use concordancing and other text analysis techniques

Text Analysis Tools

With electronic texts, it is easier to find and study patterns in texts, see words and phrases in context, examine their frequencies, analyse word clusters, obtain concordances, etc.
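As a rough illustration of what "seeing words in context" means in practice, the following Python sketch builds a very simple keyword-in-context (KWIC) listing. The function name `concordance` and the `width` parameter are assumptions made for this example; dedicated concordancers such as those listed in this unit do far more.

```python
import re


def concordance(text, keyword, width=30):
    """Return keyword-in-context lines: `width` characters either side of each hit."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    # Collapse line breaks so the context reads as one running line.
    flat = " ".join(text.split())
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


if __name__ == "__main__":
    # Sample text: lines from the opening scene of Hamlet.
    sample = (
        "Who's there? Nay, answer me. Stand and unfold yourself. "
        "Stand, ho! Who is there?"
    )
    for line in concordance(sample, "stand", width=20):
        print(line)
```

Each output line shows one occurrence of the keyword with a little text on either side, which is the basic layout of a printed or electronic concordance.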

Text analysis software listed in DiRT (Digital Research Tools) and in "Text mining" Project Bamboo

For other digital research tools, see Project Bamboo

Concordances

Use of concordances in the study of language and literature: "Concordance (publishing)" Wikipedia

For instance, in an analysis of Frankenstein, Robert Harris suggests running the novel through a concordance program in order to "determine the frequencies of word occurrence. After the common "glue words" (such as the, of, is, and, etc.), what words are used most often, and of what kind are they? Approach the meaning of "kind" from several aspects. That is, are they nouns, verbs, or adjectives? Are they tonal words such as dark or gloom? Are they words conveying excess of a quality rather than moderateness (such as terror rather than fear, elated rather than happy)? And of course, so what?" ("Ideas for Analyzing Frankenstein")
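Harris's first step, counting word occurrences once the "glue words" are set aside, can be sketched in a few lines of Python. The glue-word list below is a short, hypothetical one chosen for the example, and the function name `word_frequencies` is likewise an assumption; real tools use much longer lists. The sample sentence is the opening of chapter 5 of Frankenstein.

```python
import re
from collections import Counter

# A deliberately short, hypothetical list of "glue words"; real tools use longer lists.
GLUE_WORDS = {"the", "of", "is", "and", "a", "an", "in", "on", "to", "it",
              "that", "was", "i", "my"}


def word_frequencies(text, glue_words=GLUE_WORDS):
    """Count lower-cased words, leaving out the glue words."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(word for word in words if word not in glue_words)


if __name__ == "__main__":
    sample = (
        "It was on a dreary night of November that I beheld "
        "the accomplishment of my toils."
    )
    for word, count in word_frequencies(sample).most_common(5):
        print(word, count)
```

The counting is the mechanical part. Deciding what "kind" the most frequent words are, and answering Harris's "so what?", remains the reader's task.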

Software: 
AntConc (freeware concordance programme)  |  Concordance |

see "Concordancer" entry in Wikipedia

Open Source Shakespeare


WordCruncher : an electronic text viewer with several tools with which one can look for specific references, search for words or phrases, perform simple word frequency distribution studies, study multiple languages in synchronized windows, and see various reports (e.g., collocation, vocabulary dispersion, vocabulary usage)

WordSmith Tools (concordance, key words, word lists in alphabetical and frequency order)

Word Count Tools (identifies unique words, difficult words; average sentence length, readability, keyword density, estimated reading time)

Wordcounter (shows top 10 keywords)  | 
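Some of the measures these tools report, such as the number of unique words and the average sentence length, are simple to compute. The sketch below is only an illustration and not the method used by any of the tools listed here; the function name `basic_statistics` is an assumption, and it naively treats every full stop, exclamation mark or question mark as the end of a sentence.

```python
import re


def basic_statistics(text):
    """Return word count, unique-word count and average sentence length in words."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    # Naive assumption: a sentence ends at ., ! or ? (abbreviations will mislead it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    unique_words = {word.lower() for word in words}
    average = len(words) / len(sentences) if sentences else 0.0
    return {
        "words": len(words),
        "unique_words": len(unique_words),
        "average_sentence_length": average,
    }


if __name__ == "__main__":
    sample = "Who is there? Nay, answer me. Stand and unfold yourself."
    print(basic_statistics(sample))
```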

Visualization tools

Voyant-Tools

 TokenX    |  Tagline Generator  |  ManyEyes  |

 TagCrowd |   Wordle  | ManiWordle | WordItOut (word cloud generator) |

 TextArc    TextArc Hamlet

 Visualization techniques:  Word Tree |  Phrase Nets | Tag Clouds  | Theme River  | Parallel Tag Clouds | Spark Clouds  |



Audiovisual presentation

Slide shows:  CAUTION :  Wailgum "PowerPoint Hell" | Bumiller "We Have Met the Enemy and He is PowerPoint"
Software:  Impress in LibreOffice  |
Video presentations:
Software:  Avidemux 

Epilogue





==============================================================================


CONCEPTS

"Information and Communication Technology (ICT)"

Let's analyze the constituents of this phrase

Technology =
"the making, modification, usage, and knowledge of tools, machines, techniques, crafts, systems, and methods of organization, in order to solve a problem, [. . . ] handle an applied input/output relation or perform a specific function"  ("Technology" Wikipedia  1 March 2013)

What sort of PROBLEMS can I have in the study of language and literature? How can specific tools, such as computers, assist me in solving these problems, in answering questions about texts, works and their meanings?
In the interaction between humans and computers, pay attention to Weizenbaum's warning in Computer Power and Human Reason: From Judgment to Calculation

The book "displays his ambivalence towards computer technology and lays out his case: while artificial intelligence may be possible, we should never allow computers to make important decisions because computers will always lack human qualities such as compassion and wisdom. Weizenbaum makes the crucial distinction between deciding and choosing. Deciding is a computational activity, something that can ultimately be programmed. It is the capacity to choose that ultimately makes us human. Choice, however, is the product of judgment, not calculation. Comprehensive human judgment is able to include non-mathematical factors such as emotions. Judgment can compare apples and oranges, and can do so without quantifying each fruit type and then reductively quantifying each to factors necessary for mathematical comparison."  ("Computer Power and Human Reason" in Wikipedia 1 March 2013)

Information =/= Communication

Communication =  conveying or sharing information
 "a process by which information is exchanged between individuals through a common system of symbols, signs, or behavior" ("communication" Merriam-Webster Online).

Do we communicate with computers?

Compare the previous definition with "the exchange of meanings between individuals through a common system of symbols" ("communication."  Encyclopædia Britannica Online Academic Edition.  01 Mar. 2013.)

"The communication process is complete once the receiver has understood the message of the sender." ("Communication" Wikipedia 1 March 2013)

Do computers understand?

Information =
 " in its most restricted technical sense, is a sequence of symbols that can be interpreted as a message. Information can be recorded as signs, or transmitted as signals. [ . . . ] Conceptually, information is the message (utterance or expression) being conveyed. " ("Information", Wikipedia, 1 March 2013)

The text of a novel, poem, play, etc. is a sequence of symbols, an arrangement of words (made up of letters) and punctuation marks, recorded in a given document (paper, electronic). These symbols are inscribed instructions for a reader or receiver to process that information, to decode that text in order to understand the message.
While we can read the letter "I" or "i", a sign in the Roman alphabet code, computers don't read the letter "i" (to put it crudely). Almost all modern computers use the binary numeral system to encode information, so they need the letter "i" encoded as a given sequence of 0s and 1s, such as "110 1001" in ASCII code.
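The encoding of "i" can be checked with a short Python sketch. The function name `to_ascii_bits` is an assumption made for this example; `ord` and `format` are standard Python built-ins.

```python
def to_ascii_bits(character):
    """Return the 7-bit ASCII code of a single character as a string of 0s and 1s."""
    code = ord(character)  # the number that stands for the character
    if code > 127:
        raise ValueError(f"{character!r} is not an ASCII character")
    return format(code, "07b")  # that number written in binary, padded to 7 digits


if __name__ == "__main__":
    for letter in "iI":
        print(letter, ord(letter), to_ascii_bits(letter))
```

Lower-case "i" is stored as the number 105, which is 1101001 in binary. Capital "I" is a different number (73, or 1001001), so to the computer the two letters are simply two different sequences.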

Computers' processing of information is called "computation" or "calculation".
Calculation = "deliberate process for transforming one or more inputs into one or more results, with variable change" ("calculation" Wikipedia)
In computing, this process involves a step-by-step procedure, that is, an algorithm  = " systematic procedure that produces—in a finite number of steps—the answer to a question or the solution of a problem. " ("algorithm." Encyclopædia Britannica Online Academic Edition.  Web. 01 Mar. 2013)

What sort of PROBLEMS can I have in the study of language and literature?

For instance, in an analysis of Frankenstein we may be interested in the contrast between positive terms related to "happiness" and "beauty" and negative terms related to "darkness". Robert Harris suggests the following question: "How do the passages relating to happiness and beauty fall within the novel, and how do they occur in relation to those of darkness?"
To answer that, we need to "Look for such terms as lovely, beautiful, beauty, pleasure, delight, joy, delighted, enchanted, gay, happy and contrast them with terms such as ugly, disconsolate, melancholy." ("Ideas")
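One possible way to approach Harris's question with a computer is to divide the text into consecutive slices and count both groups of terms in each slice. The Python sketch below is an illustration under stated assumptions, not Harris's own procedure: the function name `term_distribution`, the default of ten slices, and the use of exact word matches (so "joys" would not count as "joy") are all choices made for this example. The term lists follow Harris's suggestion.

```python
import re

# Term lists following Harris's suggestion quoted above.
POSITIVE = {"lovely", "beautiful", "beauty", "pleasure", "delight", "joy",
            "delighted", "enchanted", "gay", "happy"}
NEGATIVE = {"ugly", "disconsolate", "melancholy"}


def term_distribution(text, parts=10):
    """Split the words into consecutive slices and count both term groups in each."""
    words = re.findall(r"[a-z]+", text.lower())
    # Ceiling division: slice size that yields at most `parts` slices (minimum 1 word).
    size = max(1, -(-len(words) // parts))
    table = []
    for start in range(0, len(words), size):
        chunk = words[start:start + size]
        positive = sum(word in POSITIVE for word in chunk)
        negative = sum(word in NEGATIVE for word in chunk)
        table.append((positive, negative))
    return table


if __name__ == "__main__":
    # Invented sample text, used only to show the shape of the result.
    sample = "happy joy lovely day ugly melancholy night gloom"
    for number, (positive, negative) in enumerate(term_distribution(sample, parts=2), 1):
        print(f"slice {number}: positive={positive} negative={negative}")
```

Applied to the full text of the novel, the resulting table would show where in the book each group of terms clusters. Interpreting that pattern is still a matter of judgment.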

How can specific tools, such as computers, assist me in solving these problems?

Computers can perform certain tasks more efficiently than humans can:  for instance, search and find text or patterns
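Searching for a pattern, rather than for one fixed word, is the kind of task meant here. A minimal Python sketch, assuming the hypothetical function name `find_pattern` and using the standard-library `re` module for regular expressions:

```python
import re


def find_pattern(text, pattern):
    """Return (position, matched text) for every match of a regular expression."""
    return [(match.start(), match.group()) for match in re.finditer(pattern, text)]


if __name__ == "__main__":
    sample = "Nay, answer me. Stand and unfold yourself."
    # Every word that begins with "un": a pattern, not a single fixed word.
    print(find_pattern(sample, r"\bun\w+"))
```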

In order to have computers perform a given task, humans need to give them instructions.
Readers understand a text because they share the code (the language, e.g. English) with the sender or writer, as well as its writing conventions (instructions). Thus humans can read the letter "i", a sign in the Roman alphabet code, and understand "I" as the first-person pronoun because they have learned to do so. Computers "learn" only insofar as they are given instructions.
Instructions can be communicated to computers by means of a programming language. A sequence or collection of such instructions is a computer program.


Basic computer programs to work with texts are text editors and word processors.
Text editor :  Bluefish  (explanation in Wikipedia)  |  Word processor : Writer in LibreOffice

Robert Harris's 1994 article  explains ways of using word processors in literary analysis:
- finding occurrences and themes
- testing theories or claims
- discovering thematic shapes
- performing specialized searches


----
Before computers can perform a given task on a literary text, they obviously need the text of the novel, poem, play, etc., that is, an "input" in electronic form.

Where to find electronic texts?    Project  Gutenberg | Internet Archive  | The University of Oxford Text Archive (OTA)  |  click here for further resources.

-----

When we produce or read a text, we not only encode or decode the signs such as words and punctuation marks, but also apply cultural codes (mostly typographical conventions) that tell how the text is structured.
When we read the following text:

                〈ACT 1〉

                〈Scene 1〉

Enter Barnardo and Francisco, two sentinels.

BARNARDO    Who’s there?
FRANCISCO
      Nay, answer me. Stand and unfold yourself.

we recognize it as a dramatic text: we interpret its typographical features and content as belonging to the genre of the "play", because the text is organized into a number of structural elements. These are the header of a text division that we call an "act", followed by the header of a subdivision that we know culturally as a "scene", followed by centered text in italics that indicates a stage direction, then the name of one of the roles or speakers ("Barnardo"), followed by the dialogue he speaks ("Who's there?"), etc.

Now, computers can only read a text as a sequence of signs, a string of symbols. If the "input" were just the text, the computer would only read something like "ACT1-Scene1-EnterBarnardoandFrancisco,twosentinels.-BARNARDOWho’sthere?-FRANCISCO-Nay,answerme.Standandunfoldyourself."

The input needs to be encoded in such a way that the computer distinguishes sections of the text and their structure.
For this, computers need a markup language.


Markup language =
"A markup language is a modern system for annotating a document in a way that is syntactically distinguishable from the text.  [ . . . ] Examples are typesetting instructions such as those found in troff, TeX and LaTeX, or structural markers such as XML tags. Markup instructs the software displaying the text to carry out appropriate actions, but is omitted from the version of the text that is displayed to users. Some markup languages, such as HTML, have pre-defined presentation semantics, meaning that their specification prescribes how the structured data are to be presented; others, such as XML, do not." ("Markup language" Wikipedia)



SGML  (Standard Generalized Markup Language) (entry in Wikipedia)
SGML encoding of Literature Online

XML  (entry in Wikipedia )

HTML     (entry in Wikipedia)

The above-quoted text with HTML tags, copied from Folger Digital Texts Hamlet :
<div id="copyPaste"
style="top:0px;left:0px;width:1px;height:1px;overflow:hidden;">
<div class="sceneHeader"><span><img
src="http://www.folgerdigitaltexts.org/html/fdt-texta-l.png"
class="imgTextX" alt="text from the Folio not found in the
Second Quarto" title="text from the Folio not found in the
Second Quarto">Scene</span>&nbsp;<span>1<img
src="http://www.folgerdigitaltexts.org/html/fdt-texta-r.png"
class="imgTextX" alt="text from the Folio not found in the
Second Quarto" title="text from the Folio not found in the
Second Quarto"></span>
</div>
<span class="stage centered"><span id="line-SD 1.1.0" title="SD
1.1.0">Enter Barnardo and Francisco, two sentinels.</span></span><br>
<br>
<span class="speaker">BARNARDO</span><span class="indentInline">&nbsp;</span><span
id="line-1.1.1" title="1.1.1">Who’s there?</span><br>
<span class="speaker">FRANCISCO</span><span class="indentInline">&nbsp;</span><br>
<span id="line-1.1.2" title="1.1.2">Nay, answer me. Stand and
unfold yourself.</span></div>
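Because the HTML tags label the parts of the text, a program can now pick out, for example, the speakers. The Python sketch below uses the standard-library `html.parser` module; the class name `SpeakerCollector` and the function `speakers_in` are assumptions made for this example, and the sketch relies on the `class="speaker"` attribute seen in the HTML above.

```python
from html.parser import HTMLParser


class SpeakerCollector(HTMLParser):
    """Collect the text of every <span class="speaker"> element."""

    def __init__(self):
        super().__init__()
        self.speakers = []
        self._inside_speaker = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and ("class", "speaker") in attrs:
            self._inside_speaker = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._inside_speaker = False

    def handle_data(self, data):
        if self._inside_speaker and data.strip():
            self.speakers.append(data.strip())


def speakers_in(html):
    """Return the speaker names found in a piece of HTML, in order."""
    parser = SpeakerCollector()
    parser.feed(html)
    parser.close()
    return parser.speakers


if __name__ == "__main__":
    sample = (
        '<span class="speaker">BARNARDO</span>'
        '<span id="line-1.1.1" title="1.1.1">Who is there?</span><br>'
        '<span class="speaker">FRANCISCO</span>'
    )
    print(speakers_in(sample))
```

Note that this only works because the page happens to label speakers with a class name. HTML itself describes presentation and has no built-in notion of "speaker" or "stage direction".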


LaTeX  (explanation in Wikipedia)



What if we try to use the same encoding conventions to make texts machine-readable?
TEI Text Encoding Initiative: Guidelines for encoding electronic texts
TEI by Example

The above-quoted Hamlet text with TEI-conformant tags
<div type="act">
  <head>〈ACT 1〉</head>
  <div type="scene">
    <head>〈Scene 1〉</head>
    <stage type="entrance">Enter Barnardo and Francisco, two sentinels.</stage>
    <sp>
      <speaker>BARNARDO</speaker>
      <p>Who’s there?</p>
    </sp>
    <sp>
      <speaker>FRANCISCO</speaker>
      <l>Nay, answer me. Stand and unfold yourself.</l>
    </sp>
  </div>
</div>
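Once a text carries structural tags of this kind, a program can ask structural questions, such as "who says what?". The Python sketch below uses the standard-library `xml.etree.ElementTree` module on its own small TEI-style fragment, in which each speech is wrapped in an `<sp>` element as the TEI Guidelines recommend for drama. The function name `speeches` is an assumption, and the fragment leaves out the TEI namespace and header that a complete TEI document would have.

```python
import xml.etree.ElementTree as ET

# A small TEI-style fragment; a full TEI file would also declare the TEI namespace.
TEI_FRAGMENT = """
<div type="scene">
  <stage type="entrance">Enter Barnardo and Francisco, two sentinels.</stage>
  <sp><speaker>BARNARDO</speaker><l>Who’s there?</l></sp>
  <sp><speaker>FRANCISCO</speaker><l>Nay, answer me. Stand and unfold yourself.</l></sp>
</div>
"""


def speeches(xml_text):
    """Return (speaker, words spoken) pairs for every <sp> element."""
    root = ET.fromstring(xml_text)
    result = []
    for sp in root.iter("sp"):
        speaker = sp.findtext("speaker", default="").strip()
        # A speech may contain verse lines (<l>) or prose paragraphs (<p>).
        lines = [
            "".join(child.itertext()).strip()
            for child in sp
            if child.tag in ("l", "p")
        ]
        result.append((speaker, " ".join(lines)))
    return result


if __name__ == "__main__":
    for speaker, words in speeches(TEI_FRAGMENT):
        print(f"{speaker}: {words}")
```

Unlike the HTML version, the tags here name what each part of the text is (a speech, a speaker, a stage direction) rather than how it should look on screen.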



...

So far we have seen ICT as a tool to assist the study of language and literature.

Let's consider ICT as an administrative or management tool, for instance, in the case of documenting a paper

BACK to section "Reference management"

===================================================

Journals




References


"Computer Power and Human Reason." Wikipedia. Wikimedia Foundation. Web. 1 Mar 2013. http://en.wikipedia.org/wiki/Computer_Power_and_Human_Reason

Harris, Robert. "Ideas for Analyzing Frankenstein." VirtualSalt. 18 Jun 2000. Web. 1 Mar 2013. http://www.virtualsalt.com/lit/franidea.htm

Harris, Robert. "The Personal Computer as a Tool for Student Literary Analysis." VirtualSalt. 30 Dec 1994. Web. 1 Mar 2013. http://www.virtualsalt.com/comptool.htm

Jones, Sarah. "When Computers Read: Literary Analysis and Digital Technology." ASIS&T Bulletin (April/May 2012).

Literary and Linguistic Computing. Association for Literary and Linguistic Computing. Oxford University Press. http://llc.oxfordjournals.org/ [UV subscribed]

McLuhan, Marshall. Understanding Media: The Extensions of Man. Cambridge, MA: The MIT Press, 1994.

Ong, Walter. Orality and Literacy: The Technologizing of the Word. London, UK: Routledge, 1988.

Weizenbaum, Joseph. Computer Power and Human Reason: From Judgment To Calculation. San Francisco: W. H. Freeman, 1976.