Veliza
Veliza is a (very) experimental tool for having a (limited) natural language exchange (in English) based on your corpus. It is inspired and enabled in large part by the famous Eliza program written by Joseph Weizenbaum in the 1960s.
Use it with a Jane Austen corpus or with your own corpus.
Overview
You can type in your own sentence and hit enter (or click the send button). Veliza then generates a generic reply that isn't influenced by your texts (that's the current behaviour, though we're considering ways of having your texts influence the replies). Keep in mind that the original Eliza was designed as a (parody of a) Rogerian psychotherapist, so the more your input sounds like something you might say to a psychologist, the more satisfying your results are likely to seem.
You can click on the from text button to get Voyant to fetch a random sentence from your text as input, to which Veliza will generate a reply. You can hover over the fetched sentence to see from which text the sentence is extracted.
Please note that Veliza is stubbornly monolingual: input sentences (including those from your texts) are expected to be in English (other languages may some day be supported).
Editing the Script
You can view and edit the script that Veliza uses by clicking on the collapsed panel along the right side of the tool. The editor is resizable: if you want to make it bigger relative to the chat window, just slide the divider between the two.
There's a more detailed explanation of the syntax available, but here are the most relevant parts:
pre: [source] [target] before analysis, substitute the single word in [source] with the word or words in [target]
post: [source] [target] after analysis, substitute the single word in [source] with the word or words in [target]
synon: [word] [word] … during decomposition, treat any of the words as being the same as the first (when the first is written as @word)
key: [word] [weight] define a keyword to be matched during decomposition with an optional weight to define precedence of the keyword
decomp: [pattern] a pattern for matching that can include asterisks (wildcards) and synonyms, both of which are captured, as well as other words; the captures are then used for reassembly
reasmb: [pattern] a pattern for reassembly where you can reference matched wildcards and synonyms with parenthetical numbers
For example, here's a rule that matches a sentence containing the keyword "dreamed": if the sentence contains "I dreamed", Veliza replies "Really, " followed by whatever was captured after "I dreamed", in other words by the second (2) asterisk.
key: dreamed 4
decomp: * i dreamed *
reasmb: Really, (2) ?
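To make the matching step more concrete, here is a minimal Python sketch of how a decomp pattern and a reasmb pattern could work together. It is not Veliza's actual code (Veliza is built on a Java implementation); the function names decomp_to_regex, reassemble and respond are hypothetical, and the sketch ignores keywords, weights, synonyms and pre/post substitutions.

```python
import re
from typing import List, Optional


def decomp_to_regex(pattern: str) -> "re.Pattern[str]":
    """Turn a decomp pattern such as '* i dreamed *' into a regex.

    Assumption for this sketch: each asterisk becomes a capture group
    and every other token must match as a whole word.
    """
    parts = []
    for token in pattern.split():
        if token == "*":
            parts.append(r"(.*)")
        else:
            parts.append(r"\b" + re.escape(token) + r"\b")
    return re.compile(r"^\s*" + r"\s*".join(parts) + r"\s*$", re.IGNORECASE)


def reassemble(template: str, captures: List[str]) -> str:
    """Replace each parenthetical number (1), (2), ... with that capture."""
    return re.sub(r"\((\d+)\)", lambda m: captures[int(m.group(1)) - 1], template)


def respond(sentence: str, decomp: str, reasmb: str) -> Optional[str]:
    """Return the reassembled reply, or None if the pattern does not match."""
    # Lowercase and drop trailing sentence punctuation before matching.
    text = sentence.lower().strip().rstrip(".?!").strip()
    match = decomp_to_regex(decomp).match(text)
    if match is None:
        return None
    captures = [group.strip() for group in match.groups()]
    return reassemble(reasmb, captures)


print(respond("I dreamed of the sea.", "* i dreamed *", "Really, (2) ?"))
# prints: Really, of the sea ?
```

In this sketch the first asterisk captures nothing (the sentence starts with "I dreamed") and the second captures "of the sea", which is what (2) refers to in the reassembly pattern.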
More information
The original Eliza program was written by Joseph Weizenbaum and described in an article entitled "ELIZA—a computer program for the study of natural language communication between man and machine" (Communications of the ACM, 1966). It's important to emphasize that Weizenbaum intended to demonstrate the superficiality of automated communication even though many people were amazed by exchanges with Eliza and found it to be a promising early example of artificial intelligence (some five decades before Siri and its ilk).
Weizenbaum's description of Eliza has been implemented countless times in different languages. The version used here, by Veliza, is a lightly adapted variant of Chad Hayden's Java version which he characterizes as "a complete and faithful implementation of the program described by Weizenbaum". We chose to base Veliza on Hayden's version mostly because it was easy to find and to integrate into Voyant. The iMessage-like styling of the discussion is adapted from code by adobewordpress.
Sentences typed by the user are processed directly by Veliza. When the user clicks the from text button, the following process happens:
- one of the documents from the corpus is randomly selected
- if that document hasn't yet been parsed for sentences, it's parsed
- each run of whitespace (spaces, tabs, new lines, etc.) is converted to a single space
- any sequence that starts with a letter and runs up to an end-of-sentence punctuation mark (.?!) followed by a space or the end of the document is identified
- if the sequence seems to have at least two words it's added to a list of sentences
- a random sentence is selected from the document and a response from Veliza is generated
- if the reply seems to be very generic (i.e. from a finite list of sentences that indicate that Veliza was not able to find any useful keywords) the previous step is repeated until about one second has elapsed (at which point the generic reply is kept)
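The steps above could be sketched roughly as follows in Python. This is an illustration under stated assumptions rather than Voyant's actual implementation: the regular expression, the function names extract_sentences and pick_sentence, and the reply_fn and generic_replies parameters are all hypothetical, and the caching of already parsed documents is left out.

```python
import random
import re
import time

# Assumed pattern: a letter, then as little text as possible, up to one of
# . ? ! that is followed by whitespace or by the end of the document.
SENTENCE = re.compile(r"[^\W\d_].*?[.?!](?=\s|$)")


def extract_sentences(document):
    """Normalize whitespace, then keep candidates with at least two words."""
    text = re.sub(r"\s+", " ", document)
    sentences = []
    for match in SENTENCE.finditer(text):
        candidate = match.group(0)
        if len(candidate.split()) >= 2:
            sentences.append(candidate)
    return sentences


def pick_sentence(documents, reply_fn, generic_replies, time_budget=1.0):
    """Pick a random sentence from a random document and get a reply.

    reply_fn stands in for Veliza's reply generation. If the reply is in
    generic_replies, another sentence is tried until time_budget seconds
    have elapsed, at which point the generic reply is kept.
    """
    document = random.choice(documents)
    sentences = extract_sentences(document)
    if not sentences:
        return None
    deadline = time.monotonic() + time_budget
    while True:
        sentence = random.choice(sentences)
        reply = reply_fn(sentence)
        if reply not in generic_replies or time.monotonic() >= deadline:
            return sentence, reply


print(extract_sentences("Hello   there.\nYes! It is 3.5 metres long."))
# prints: ['Hello there.', 'It is 3.5 metres long.']
```

Note that "Yes!" is dropped because it has only one word, and the full stop in "3.5" does not end a sentence because it isn't followed by a space.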
Veliza was born from a conversation Stéfan and Geoffrey (we) were having (as part of a larger book project) on significant moments, algorithms and people in the history of digital humanities (and computing in general). In particular we were talking about early examples of text generation and what's sometimes called robotic poetics (see Winder 2004). We were musing about what it might be like to converse naturally (in English) with your texts, or even to have your texts converse with something like Eliza. After a few hours' work we had a working prototype (thanks in large part to an existing open-source version of Eliza and the flexible and extensible architecture of Voyant).
Veliza is a playful tool and an ongoing experiment.