Contents

This table of contents lists all chapter, section and sub-section titles. Click on a chapter to show that chapter's code examples. Not every chapter contains code examples.

1. Introduction

2. Relational Database Fundamentals

2.1 Introduction

2.2 Tables, Rows and Columns

2.3 External and Internal Representations Of Data

2.4 Advantages Over Spreadsheets

2.4.1 Size and Speed

2.4.2 Multiple Users

2.5 Relationships Among Tables

2.5.1 One-to-many Relationships

2.5.2 One-to-one Relationships

2.5.3 Many-to-many Relationships

2.6 Entity Relationship Diagrams

2.7 Uniqueness

2.8 Sequences

2.9 Keys

2.9.1 Primary Keys

2.9.2 Foreign Keys

2.10 Constraints

2.11 Indexes

2.12 Joining Tables

2.13 Normal Forms

2.13.1 First Normal Form

2.13.2 Second Normal Form

2.13.3 Third Normal Form

2.13.4 Summary Of Normal Forms

3. Structured Query Language (SQL)

3.1 Introduction

3.2 Databases, Schemas, Tables, Rows and Columns

3.3 Create

3.4 Insert

3.5 Select

3.6 Update and Delete

3.7 SQL Functions

3.7.1 Regular Functions

3.7.2 Aggregate Functions

3.8 Domains, Triggers and Views

3.9 Unions, Intersections and Differences

4. Relational Database Management Systems

4.1 Introduction

4.2 Standard SQL

4.3 A Sampling Of Differences

4.4 Server and Client

4.5 Compatibility

5. Client and Web Applications

5.1 Introduction

5.2 Command Line Programs

5.3 Web-Based Applications

5.4 Client Applications

5.5 SQL Interfaces In Various Languages

5.5.1 Perl

5.5.2 Python

5.5.3 PHP

5.5.4 Java

6. Data Storage, Searching and Manipulation

6.1 Introduction

6.2 General Schema Design Decisions

6.3 Sample Schema For Tracking Chemical Samples

6.4 Schemas For PubChem Data

6.4.1 BioAssay Data

6.4.2 Substances

6.4.3 Compounds

6.5 Data Constraints and Data Integrity

6.6 Developing Complex SQL

6.7 Sub-Select Statements

6.8 Views

7. Computer Representations Of Molecular Structures

7.1 Introduction

7.2 SMILES Representation Of Molecular Structure

7.3 Extensions To SQL For Chemical Structures

7.4 SMARTS Representation Of Molecular Searches

7.5 SMILES and SMARTS Quirks

7.5.1 Hydrogen Atoms

7.5.2 Aromaticity

7.5.3 Tautomers

7.5.4 Valence

7.5.5 Chirality

7.5.6 Isotopes

7.5.7 Salts and Mixtures

7.5.8 InChI and Canonical SMILES

7.6 SMILES and Inorganic Structures

7.7 Other SMILES Extensions

7.8 Input and Output Of Molecular Structures

7.9 Useful SQL Extensions

7.10 SMILES As A SQL Data Type

7.10.1 Domains

7.10.2 Triggers

7.11 Summary

8. Molecular Fragments and Fingerprints

8.1 Introduction

8.2 Fragments

8.2.1 Fragment Keys

8.2.2 MACCS Keys and Other Fragment Keys

8.3 Fingerprints

8.4 Similarity Measures

8.5 Computing Fragment-Based Properties

9. Reactions and Transformations

9.1 Introduction

9.2 Reaction SMILES

9.3 Transformations

9.3.1 Unimolecular Transformations

9.3.2 Multi-component Transformations

9.4 Canonical Reaction SMILES

10. PostgreSQL Extensions

10.1 Introduction

10.2 Composite Data Types

10.3 Composite Data Type For Experimental Values

10.4 Array Data Types For 2- and 3-Dimensional Coordinates

10.5 Functions In Other Languages

10.5.1 Plpgsql

10.5.2 Plperl, Plpython, Pltcl

10.5.3 Core Chemical Functions

10.5.4 C Language Functions

10.6 Object RDBMS

11. 3-Dimensional Molecular Structure Tables

11.1 Introduction

11.2 Using Tables Instead Of Files

11.3 Molfile and Other Common File Formats

11.4 Processing SDF Files

11.5 Using Tables Instead Of Files In Client Programs

11.6 File Import, Export and Conversions

11.7 Functions Using 3-Dimensional Atomic Coordinates

11.8 Conformations

11.9 Other Representations Of 3-Dimensional Molecular Structure

12. More On Client and Web Interfaces To RDBMS

12.1 Introduction

12.2 Store All Possible Data In The RDBMS

12.3 Advanced SQL Techniques

12.3.1 Placeholders In SQL Statements

12.3.2 Bind Values In SQL Statements

12.4 Web Applications

12.5 R Programs

12.5.1 Hierarchical Clustering

12.5.2 Linear Models

13. Applications

13.1 Introduction

13.2 Compound Registration

13.3 Experimental Chemical and Biological Data Integration

13.4 Data From External Sources

13.5 Utilities

13.5.1 molgrep

13.5.2 molcat

13.5.3 molview

13.5.5 molrandom

13.5.6 molnear

13.5.7 molsame

Appendix

A.1 Introduction

A.2 Symbols and Bonds From SMILES

A.3 Normalizing Data

A.4 SQL Functions

A.4.1 Public166keys

A.4.2 Orsum

A.4.3 Tanimoto

A.4.4 Euclid

A.4.5 Hamming

A.4.6 Nbits_set

A.4.7 Amw

A.4.8 Tpsa

A.5 Tables Used In Functions

A.5.1 Amw

A.5.2 Tpsa

A.5.3 Public166keys

A.6 Core Function Implementation For PostgreSQL

A.6.1 PerlMol/plperlu

A.6.2 FROWNS/plpythonu

A.6.3 OpenBabel/python

A.7 C Language PostgreSQL Functions

A.8 Database Utilities Dbutils

A.9 Loading Files Into Simple Tables

A.9.1 Smiloader

A.9.2 Sdfloader