Corpus Collocates

Corpus Collocates is a table view of which terms appear more frequently in proximity to keywords across the entire corpus.

Use it with a Jane Austen corpus or with your own corpus.

Overview

The table view shows the following three columns by default:

  • Term: this is the keyword (or keywords) being searched
  • Collocate: these are the words found in proximity to each keyword
  • Count (context): this is the frequency of the collocate occurring in proximity to the keyword

An additional column, Count, can be shown; it gives the frequency of the keyword term in the corpus (see the Grids guide for more information).

By default, the most frequent collocates are shown for the 10 most frequent keywords in the corpus.

Options

You can specify which keyword to use by typing a query into the search box and hitting enter (see Search for more advanced searching capabilities).

There is also a slider that determines how much context to consider when looking for collocates. The value specifies the number of words to consider on each side of the keyword, so the total window is twice that value. By default the context is set to 5 words per side, and the slider ranges from a minimum of 1 to a maximum of 30.
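
To make the context window concrete, here is a minimal sketch of window-based collocate counting in plain JavaScript. It is not Voyant's implementation: countCollocates is a hypothetical helper, the whitespace tokenization is a simplifying assumption, and stopwords are not handled.

```javascript
// Hypothetical helper, not part of Voyant: counts the words that occur within
// `context` tokens on each side of every occurrence of `keyword`.
function countCollocates(tokens, keyword, context = 5) {
  const counts = new Map();
  tokens.forEach((token, i) => {
    if (token !== keyword) return;
    // The window spans `context` tokens on each side, clipped at the text edges.
    const start = Math.max(0, i - context);
    const end = Math.min(tokens.length - 1, i + context);
    for (let j = start; j <= end; j++) {
      if (j === i) continue; // skip the keyword occurrence itself
      counts.set(tokens[j], (counts.get(tokens[j]) || 0) + 1);
    }
  });
  // Most frequent collocates first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Assumption: naive whitespace tokenization of a toy sentence.
const tokens = "she felt love and he felt love too".split(" ");
const result = countCollocates(tokens, "love", 2);
console.log(result);
```

With a context of 2, each occurrence of "love" contributes up to four neighbouring words. Here "felt" and "he" fall inside both windows, so each is counted twice.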

Clicking on the options icon also allows you to define a set of stopwords to exclude – see the Stopwords guide for more information.

Spyral

To use Corpus Collocates in Spyral you can use the following code as a starting point. Change the config object to customize the tool.

let config = {
    context: 6, // number of words to consider on each side of the keyword
    query: ["love"], // keyword(s) to find collocates for
};

loadCorpus("austen").tool("CorpusCollocates", config);

Please see Tools.CorpusCollocates for more information about configuration.
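
As a further sketch, the config object can be built by a small helper that keeps the context value within the slider's range of 1 to 30 described under Options. buildCollocatesConfig is a hypothetical function, not part of Spyral, and this page does not say whether the tool itself enforces these limits on config values.

```javascript
// Hypothetical helper (not a Spyral API): builds a config object and keeps
// `context` within the 1..30 range that the slider allows, defaulting to 5.
function buildCollocatesConfig(query, context = 5) {
  const clamped = Math.min(30, Math.max(1, Math.round(context)));
  return {
    context: clamped,
    // Accept either a single keyword or an array of keywords.
    query: Array.isArray(query) ? query : [query],
  };
}

const collocatesConfig = buildCollocatesConfig("love", 45); // context becomes 30

// loadCorpus is only defined inside Spyral, so guard the call elsewhere.
if (typeof loadCorpus === "function") {
  loadCorpus("austen").tool("CorpusCollocates", collocatesConfig);
}
```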

Additional Information

For a graphical view of corpus collocates, try the Links tool.

See Also