
Timeseries


Timeseries Searching

Timeseries searches allow you to view changes in the rate of usage of a word over a period of time. Here's an example: http://philologic.uchicago.edu/philologic3/bibliopolis.timeseries.html

Timeseries searching is not included by default when building a database; it must be set up afterwards (see Loading a database).


Pieces


makefrequencies

This script should be installed in your new database here:

 /var/lib/philologic/databases/DBNAME/frequencies/makefrequencies

along with various other helper scripts. Execute it by saying:

./makefrequencies

This generates the timeseries and worddocfreq statistics and tries to load them into MySQL. If the tables were not loaded (look for DBNAMEyearfreqs and DBNAMEfreqs), try loading them manually:

mysql -uphilologic -p philologic < load.timeseries.sql 
mysql -uphilologic -p philologic < load.docfreq.sql

dbname.timeseries.html

Running the makefrequencies command should create the timeseries form for your database and place it in the web directory.

The form can be edited to use different date ranges, but the basic form looks something like:

<FORM ACTION="/cgi-bin/philologic/timeseries.pl">
<input type="hidden" name="dbname" value="newfrantext2">
<b>1500-2000 by century</b> (default)
<INPUT TYPE="radio" NAME="DATERANGE" VALUE="1" CHECKED>
<p>
<b>Selected Centuries by twenty-five year period</b>
<ul>
<li>1500-1599 <INPUT TYPE="radio" NAME="DATERANGE" VALUE="2">
<li>1600-1699 <INPUT TYPE="radio" NAME="DATERANGE" VALUE="3">
<li>1700-1799 <INPUT TYPE="radio" NAME="DATERANGE" VALUE="4">
<li>1800-1899 <INPUT TYPE="radio" NAME="DATERANGE" VALUE="5">
<li>1900-1999 <INPUT TYPE="radio" NAME="DATERANGE" VALUE="6">
</ul>
<p>
<HR>
Search Period for: <INPUT NAME="word" SIZE=30>
(ex. <tt>wom.n*</tt>)<P>
Do not display row totals less than
<input name="minrowtotal" size="3" value="2">
<HR>
To submit the query, press <inPUT TYPE="submit"
VALUE="Submit Query"> or
<inPUT TYPE="reset" VALUE="Clear Entries">. <P>
</FORM>

timeseries.pl

This is the CGI that executes the timeseries run and writes out the report.


DATERANGES

If you want a date range other than the ones available by default, you need to add it to timeseries.pl. Assign it an index higher than the current highest index and fill in each of the arrays shown below for that index. Here's the basic data structure:

#################################################
# DATE RANGE 1: 1500 - 1999, by century #
#################################################

# The labels array contains the human friendly names for this daterange run
$labels[1] = "Time Series: 1500-1999";

# The freqreduct array contains the periods that the run will be broken down by
# I think we only use century or quarter (quarter century)
$freqreduct[1] = "century";

# The sqldaterange array contains the SQL to be used to limit the query
$sqldaterange[1] = "(theyear >= \"1500\" AND theyear <= \"1999\")";

# Set up the ALLwordfreq array for this daterange run
if ($DATERANGE eq "1") {
$ALLwordfreq{"C1500-1599"} = 0;
$ALLwordfreq{"C1600-1699"} = 0;
$ALLwordfreq{"C1700-1799"} = 0;
$ALLwordfreq{"C1800-1899"} = 0;
$ALLwordfreq{"C1900-1999"} = 0;
}

# The period total for this daterange run
# For daterange 1, it's a combination of the sub-centuries
# For other single centuries, it's just that century's totals
$periodtotal[1] = $totaldocfreq{"C1500-1599"} + $totaldocfreq{"C1600-1699"} + $totaldocfreq{"C1700-1799"} + $totaldocfreq{"C1800-1899"} + $totaldocfreq{"C1900-1999"};

# The label rows are what get printed out as the header row of the report table
$labelrow[1] = "<tr><td>Word</td>\n";
$labelrow[1] .= "<td>1500-99</td><td>Rate</td><td>1600-99</td><td>Rate</td>";
$labelrow[1] .= "<td>1700-99</td><td>Rate</td><td>1800-99</td><td>Rate</td>";
$labelrow[1] .= "<td>1900-99</td><td>Rate</td><td>Total</td><td>Rate</td></tr>";
#################################################

Remember that editing timeseries.pl probably affects all of the databases loaded in this Philologic install, so if you need a new date range, it's safest to add a new one rather than edit an existing one, unless you know that no other timeseries forms are using that DATERANGE value.
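
As a rough sketch of adding a new range instead of editing an existing one, the fragment below defines a hypothetical DATERANGE 7 covering 1600-1799 by century. The index 7 (assuming 6 is the current highest, as the sample form suggests), the label text, and the choice of years are illustrative assumptions and not part of the stock timeseries.pl. It also assumes that the century keys follow the same "C1600-1699" pattern used for DATERANGE 1, and that $DATERANGE and %totaldocfreq are already set up earlier in timeseries.pl.

```perl
#################################################
# DATE RANGE 7: 1600 - 1799, by century         #
# Hypothetical example: the index, label and    #
# years are illustrative, not from the original #
#################################################

# Human friendly name for this daterange run
$labels[7] = "Time Series: 1600-1799";

# Break the run down by century, as DATERANGE 1 does
$freqreduct[7] = "century";

# SQL used to limit the query to the chosen years
$sqldaterange[7] = "(theyear >= \"1600\" AND theyear <= \"1799\")";

# Initialize one counter per century in the range
if ($DATERANGE eq "7") {
    $ALLwordfreq{"C1600-1699"} = 0;
    $ALLwordfreq{"C1700-1799"} = 0;
}

# Period total: the sum of the two centuries covered
# (%totaldocfreq is assumed to be populated by timeseries.pl)
$periodtotal[7] = $totaldocfreq{"C1600-1699"} + $totaldocfreq{"C1700-1799"};

# Header row of the report table, one count/rate pair per century
$labelrow[7] = "<tr><td>Word</td>\n";
$labelrow[7] .= "<td>1600-99</td><td>Rate</td><td>1700-99</td><td>Rate</td>";
$labelrow[7] .= "<td>Total</td><td>Rate</td></tr>";
#################################################
```

For the new range to be selectable, the form would also need a radio button named DATERANGE with VALUE="7".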

You can also copy timeseries.pl, edit the copy, and update your form's ACTION to point to the new version.
