Using the Open4st database

Patrick N. McGovern 2025

TBD - Update for 2025

Description

The 2025 Open4st Cloud Drive provides view access to Openst database data for family, clones, field trials, nursery and research data. It leverages Google Drive, Postgresql database and the Insync dropbox to synchronize McGovern's local Open4st files to his Google drive account. The previous (pre-2022) online r4st database application used the open source Postgresql database and the online DBKiss database application for view access to database tables and views. Both systems were designed to be used as a central repository by importing/exporting data via csv files, creating SQL queries for specific reports and views for "big picture" summaries for R Programming and further analysis. While the online database application was very robust, it required login access and regular security maintenance updates on local and remote servers. The simpler Cloud Drive system can be maintained on the local system and provides user view access to Google Drive for the same CSV files.

An open source development copy of the Open4st database (aka “r4st”) is available via the pmcgover/24dev-demo GitHub public repository. It is a prototyping add-on process for the OSGeo Live DVD which allows the user to review, modify, and execute the open source code using the MIT license. To access this material, review the 24dev-demo documentation, download the latest 24dev-demo release and install it on a OSGeoLive system. The r4st database documentation provides a high level description of the 24dev-demo process. The r4st/csv folder contains the CSV files that are loaded into the database with scripts from the r4st/bin folder to create the tables and views. This is essentially a build process that drops and recreates all of the existing data each time the scripts are activated and allows for easy modifications that could be deactivated and used long term without the build process. TBD - Update 24dev-demo with the r4st-database-Template.

Accessing the Open4st Cloud Files

Users need a Google Drive account to view files under: Open4st-Database.
View the Open4st DatabaseTable Diagram and the Open4st View Diagram describing database table/view relationships.
The main Open4st-Database folder contains Open4st Table and View database named folders. Simply sort the files in this directory to view the alphabetical sorted priority.
The view file names are prefixed with letters and an underscore for easier sorting of field and nursery files. The recent year dated files are nursery related.
The table file names are not prefixed.
Users can click on any CSV file to view the general contents and download or share to their Google Drive as needed.
- Users can download all files in zipped format by right clicking on the folder name then "Download".
- Users can access each file/folder URL link by right clicking on the file/folder and select "Get Link". File links retain the same URL despite file content changes.
The r4t database folder contains IPC Salix cultivar related files related to the published manuscript, “ShortCommunication: IPC Salix Cultivar Database Proof-of-Concept”. It also references the supplementary technical document: Technical Information: IPC Salix Cultivar Database Proof-of-Concept.

Significant Open4st Tables

Database tables: Represent structured, related data in a row and column format.

Plant, Family, and Taxa tables: All Open4st families, even with germination are included in the "family" table to allow reporting of all family and related parents. New plants are typically entered into the "Plant" table when they have met the program selection criteria. Both tables should map to the "taxa" species description table.
Test Tables: Routine activities such as annual nursery or field trials are considered cumulative testing events to promote the concept of, "Everything is a test". The "test_spec" table contains high level test specifications with event (e.g. nursery) or trial (e.g field trial) details that map to the "test_detail" or "field_trial" tables describing the test detail events. These tables should also map to the "site" table.
Journal Table: Serves as a blog for general observations. Can be linked to other tables.
Pedigree Table: Contains pedigree details for each family table entry by associating them with plant table records starting at the plant root level and listing each possible combination, sometimes requiring multiple lines. The "Path" column lists each filial generation with the parent names in parentheses. The "@" character denotes the associated female or male parent for each filial family. See the complete Open4st Pedigree listing. It was originally created in 12/09/2012 with assistance from this StackOverflow topic: Postgresql Recursive via 2 Parent Child Tables but was incomplete. Then on 12/1/2025, McGovern updated the resolution notes to this StackOverflow topic: Update Postgres Pedigree Process to Prefix Child Decendents with Character.

Database views: Represent a custom, virtual data set of one or more tables.

Open4st database Column Descriptions and Naming Conventions.

Family Naming Scheme (pre-2011)

Family names are delimited by alpha-numeric characters that denote the Family ID, an "x" which denotes "cross", the Taxa abbreviations and the year the cross was made (now deprecated). A sample family name is: 1xGAA91 with the following syntax: [Family number]x[parent taxa][year of cross]. The parent letters are general taxa description of the cross. The family, “gaa” might indicate ga x a with the specific details in the online r4st database. See "Using and Accessing the r4st Database". Several pre-2004 crosses denoted ploidy by including the ploidy number of the non-diploid parent between the taxa letters (eg. 1xT4E04) This convention is deprecated in favor of not including the expected ploidy in the name.

Family Naming Scheme - Post 2011

Newer family names do not include the year the cross was made for conciseness. Most new hybrid parent names use a single character to describe them (e.g. R instead of AG and B instead of CAG). A sample family name is: 2xb with the following syntax: [Family ID number]x[parent taxa] which has the taxa parents P. canescens x (P. alba x P. grandidentata).

Open4st Clone Naming Scheme

Internal (Open4st) syntax: [Family ID number][Family Taxa][Clone ID] (e.g. 41aa102, 2b25, )

External cooperator Clone syntax: [Family ID number][Family Taxa][Clone ID][CooperatorSite] (e.g. 15AG4MF)

Taxa table: common aspen taxa_key single character abbreviations:

T = tremulodies
E = tremula
A = alba
C = canescens (P. alba x P. tremula)
G = grandidentata
R = alba x P. grandidentata (P. x rouleauiana Boivin)
S = P. grandidentata x P. tremuloides P. x smithii Boivin (smithii)
B = CAG - ((A x E) x (A x G))
W = The male is wind or open pollinated

Plant table, "alba_class" column:

In 2018 eight high level "hybrid name" classifications were developed to categorize alba/aspen materials from a higher level P. alba family percentage context. Below is a listing of these classes which are are associated to each clone in the Plant table in the "alba_class" column. It was not added to the Taxa table to ease manual data updates and improve accuracy, since there are many taxa and male/female reciprocal combinations. Note that the Family and Plant tables have foreign key relationships to the Taxa table. Below are the "alba_class" data descriptions:

'A' = Alba, 100% P. alba'
'H' = Hybrid, with 50% P. alba'
'AH' = 'Alba Hybrid, with > 50% P. alba
'ASH' = 'Aspen Hybrid, with < 50% P. alba'
'ASA' = 'Aspen American, native American aspen
'ASO' = 'Aspen Other, with 0% P. alba, (eg, P. tremula)
'C' = 'Control, with 0% P. alba
'U' = Unknown

Test_Detail table, "planted_order"column:

Records in the test_detail table are typically ordered by the test_detail_key (variety) for easier selection processing. The "planted_order" column simply contains the original planted order of the materials to verify records over time.

Test_Detail table, "selection_type" column:

Nursery testing involves the selection, promotion, demotion and removal of materials over time. The Test Detail table contains the "selection_type" column to track selections in a given year. These values are typically associated with ranked data to assist with selection decisions. Below are the definitions for each selection type:

'P' = Primary, an elite high priority selection
'S' = Secondary, a medium priority selection
'T' = Tertiary, a tertiary or parent selection
'F' = Family, a family population worthy of retesting
'R' = Retest, a clone worthy of retesting.
'D' = Discard
'U' = Unknown

Test_Detail table, "stock_type" column:

Nursery stock types typically describe the size, age and stage of nursery grown trees. The Test Detail table contains the "stock_type" column to track the stock types in a given year. These values are typically associated with ranked data to assist with selection decisions. Below are the definitions for each stock type:

'ASP' = Active Shoot Propagation, non-lignified greenwood shoots with actively growing meristematic tips.
'SASP' = Seedling Active Shoots, seedling shoot tips grown via ASP process.
'WASP' = Winter Active Shoots, 3"-4" dormant cuttings propagated under humid greenhouse ASP conditions.
'DC' = Dormant hardwood Cuttings
'ODC' = Ortet Dormant Cuttings, first year ortet whips representing a clone and made into dormant hardwood cuttings.
'FDC' = Family Dormant Cuttings, one lower 8" dormant cutting planted per ortet and retaining only the family name.
'MS' = Ministool, typically a 1-0 stock plant with roots cut to 1", stem to 3" and total length 10" to 12" long.
'RTL' = RootLings, typically 3 mm thick by 55 mm long juvenile roots propagated distal end up via ASP process.
'SEL' = SEedLings, typically aspen seedlings.
'RS' = Root Shoots, typically succulent aspen shoots sprouting from horizontally grown dormant aspen root sections.
'1-0' = 1-0 stock grown one year in a nursery and lifted in the Spring or Fall.
'1-1' = 1-1 stock grown one year in a nursery, lifted and replanted another year, then lifted in the Spring or Fall.
'U' = Unknown, typically reserved for the first 3 database "dummy" records.

Five point Issue scoring system: leaf_score, bld_leaf_score, disease_score

Updated in 2017. Based on an intuitive grading score where a low score is bad and highest is best. The 5 point issue scoring system uses 3 as a true middle metric and allows room for 5 to represent a "perfect" plus score. The -1 is the default r4st value for "Not Scored". This system can be adapted for scoring other attributes.

-1 = Not scored.
1= Over 50% of crown affected by leaf issues, major defoliation,
2= About 25% of crown affected by leaf issues and defoliation.
3= Leaves moderately affected with minor defoliation,
4= A few leaf issues, no defoliation,
5= No leaf issues.

Five point quality scoring system describing timber value potential: quality_score

1 = Poor timber value
2 = Below average timber value
3 = Average timber value
4 = Good timber value
5 = Excellent timber value

Three point selection scoring system describing a trees selection potential: selectable

The sum of selection values or “Sum Selectable'' contribute to the total selection value of a tree across different sites.

-1 = Undesireable
0 = N/A OR Unknown
1 = Selectable

Significant va1_master_test_detail view columns:

This view contains all McGovern nursery data based on the test_detail table with additional averaged detail columns listed below. These column terms are repeated in many test_detail nursery views. Results can vary annually per variety as different mix of stock types may be used.

collar_median_dia_mm: Each replicated nursery variety was measured for its median collar diameter in millimeters from dormant rooted cuttings stock types.
stool_collar_median_dia_mm: Each replicated nursery variety was measured for its median collar diameter in millimeters from rooted mini-stool stock types.
field_cuttings_ft: The number of field grade 12" cuttings with the smallest end diameter > 10mm in a given replicated nursery variety.
survival_rate: The replicated nursery variety survival rate with the formula: (end_quantity / start_quantity).
vigor_survival: Typically intended for dormant cutting stock types (DC). The survival rate multiplied by the related median cutting collar diameter with the formula: (end_quantity / start_quantity) * collar_median_dia_mm.
stool_vigor_survival: Typically intended for rooted mini-stool stock types (MS). The survival rate multiplied by the related median mini-stool collar diameter with the formula: (end_quantity / start_quantity) * collar_median_dia_mm.
is_plus_ynu: A subjective, contextual appraisal of a variety on a given site and time. Data values are the characters Y, N, U (Yes, No, Unknown).
is_plus_tree: A mathematical representation of the Yes values from the "is_plus_ynu" column. Often used to sum the count of Yes values for summaries.

Open4st Copyright: open4st Disclaimer and Copyright.

Page updated

Google Sites

Report abuse