Object Oriented Access to Database Records - jcobban/Genealogy GitHub Wiki

Object-Oriented Access to Database Records

Up: PHP Programming Guide

Table of Contents

Table Overview

The family tree database functionality uses the following tables:

Table Formerly Description Class
Addresses tblAR Address/Repository List Address
PictureBases tblBP Picture Location Path List part of Picture
Pictures tblBR Picture/Video/Sound List Picture
tblCP Child to Parent Relationship Does not support i18n
Children tblCR Child Table, links children to parents Child
tblCS Child Statuses Does not support i18n
DontMergeEntries tblDM Don't Merge List, Ids of two individuals DontMergeEntry
Events tblER Events Event
tblET Event Type Names Does not support i18n
Bookmarks tblHB Bookmarks
HistoryEntries tblHL History List
FamilyTree tblHR Database Global Services FamilyTree
Persons tblIR Individuals in the Family Tree Person
Locations tblLR Locations Location
Families tblMR Families Family
tblMS Marriage Statuses Does not support i18n
Surnames tblNR Surnames Surname
Names tblNX Alternate Names Name
tblST Source Types Does not support i18n
Sources tblSR Sources Source
Citations tblSX Source Detail/Citations Citation
tblTC To Do Item Categories Does not support i18n
ToDos tblTD To Do / Correspondence List ToDo
tblTL To Do Localities, merged with Locations
Temples tblTR LDS Temple Names Temple
DeletedPersons tblXI Deleted Individual Identifier List managed by Person
DeletedFamilies tblXM Deleted Marriage Identifier List managed by Family

The design of these tables is ultimately derived from the database maintained by the Legacy Family Tree© software package from Millenium. The Microsoft Windows® version of that database is implemented using a Microsoft Access© (JET) file, a format which supports personal access using SQL commands by one user at a time and uses 16-bit little-endian characters (UCS-2LE) which will be misinterpreted on a computer such as a Motorola microprocessor or an IBM System Z processor that uses big-endian integers, and which are inefficient for representing Western European text. Most newer software products, for example Microsoft Office, have switched to using the more efficient and portable 8-bit Unicode Transformation Format (UTF-8). A number of the smaller tables are loaded entirely into memory by the Legacy Family Tree software package, which is not an option available to a web server. In any event these tables do not support internationalization, and do not require the ability of the application to modify them, so their contents are moved to template files. For the web server implementation this database structure is mapped to tables implemented on a SQL server using UTF-8 text to support multiple languages with the addition of a number of additional indexes to improve performance with large tables and to avoid the need to load tables into memory. A web server cannot gain efficiency by loading tables into memory because the functionality may be spread across multiple independent processors for performance. However the database server caches frequently issued queries which achieves the same performance increase.

The table design used by Legacy Family Tree has a number of issues which have required extensive modification.

  1. The design is not compliant to 3rd Normal Form. In particular the same item of information is present in multiple tables, which must all be synchronized by application logic. For example the names of individuals were maintained in the Person, Family, and Name records.
  2. Many tables use auto-increment primary keys even where there was a field in the table which has a unique value for each row. This requires application logic to prevent duplicate records from being created. To discourage this the concept of an auto-increment or sequence field is excluded from the SQL standard and is not implemented by all commercial SQL server implementations. Specific examples of this are the Surnames (tblNR), Sources (tblSR), Locations (tblLR), and Temples (tblTR) tables.
  3. There are columns in several tables whose name duplicates a SQL reserved word, for example order and desc.
  4. There are columns whose name does not properly describe the contents of the column.
  5. The table names do not describe the contents. For example without examining the implementation logic one could not know that the table named tblIR contained the core information about a person in the family tree. It is recommended design practice for the name of a table to be a plural noun, in this case Persons, and the encapsulating class to be the singular noun, in this case Person.
  6. Many events are represented not by records in the Events (tblER) table but rather by groups of fields within other records. For example the Birth event is represented by fields in the Person (tblIR) record. Despite this the Legacy definition of the Events table includes support for almost all of these events, although a "kind" field has to be added to support sacraments of the Church of Latter Day Saints which are performed within a temple.
  7. Some events are associated with the wrong record. For example the Adoption event is associated with a Person record rather than with the Child record, even though it is topologically similar to the LDS Sealed to Parents event, which is associated with the Child record, in that the event connects a child to a set of parents.
  8. Legacy Family Tree is designed to be used by a single person on a single computer. Therefore it supports only a single interface language at a time. A server-based implementation cannot use that paradigm. All tables which contain presentation language text are therefore deprecated, and the internationalization (i18n) support is moved to the browser.

The class implementation permits the application code to access the database using either the logical structure conformant to good design or the interface implied by the original Legacy Family Tree design to permit migration.

To these are added several tables for managing access by multiple users to a shared resource.

Table Description Class
Users Users registered with the site User
RecOwners Track users who have the authority to update specific records in the database. RecOwner
Blogs Record blog entries applying to any table record Blog
SqlLog Log SQL updates to trace and permit backing out. $record->log($cmd, $parms)
Advertisers Information about advertisers on the site Advertiser

Another group of tables are added to manage transcriptions of original sources:

Table Description Class
Births Transcription of Birth registrations Birth
CemeteryData Cemetery transcription data
Census1831 1831 Census of Lower Canada (Quebec) CensusLine
Census1851 1851 Imperial Census transcription CensusLine
Census1861 1861 Imperial Census transcription CensusLine
Census1871 1871 Census of Canada transcription CensusLine
Census1881 1881 Census of Canada transcription CensusLine
Census1891 1891 Census of Canada transcription CensusLine
Census1901 1901 Census of Canada transcription CensusLine
Census1906 1906 Census of Canada Prairie Provinces transcription CensusLine
Census1911 1911 Census of Canada transcription CensusLine
Census1916 1916 Census of Canada Prairie Provinces transcription CensusLine
Census1921 1921 Census of Canada transcription CensusLine
Censuses Information on censuses Census
Counties Information on Counties for vital statistics County
Countries Information on Countries Country
CountyMarriageReports County Marriage Reports CountyMarriageReport
CountyMarriages County Marriage transcriptions CountyMarriage
Deaths Death registration transcriptions Death
Districts Information on Census Districts District
Domains Information on sub-national administrative domains for vital statistics Domain
FieldComments Field Comments FieldComment
Languages Information on Languages Language
Marriages Marriage transcriptions Marriage
MarriageIndi Information on participants in a marriage including the groom, bride, minister, and witnesses Marriage
MethodistBaptisms Wesleyan Methodist Baptism transcriptions MethodistBaptism
Ocfa Ontario Cemetery Finding Aid copy Ocfa
OcfaTownships Townships found in OCFA
Pages Census page information Page
SubDistricts Census Sub-District information SubDistrict
Townships Township information for vital statistics Township

Access to these tables by PHP web page generation scripts is through class definitions residing in a portion of the file system on the web server that is not accessible to client browsers. The base class for all of these table implementations is class Record.

the class RecordSet is defined to represent sets of Records for the purpose of generating reports, updating multiple records, and deleting multiple records, .

Lessons Learned

Multiple Parameters

If a function has multiple parameters then there is a possibility that the invoking code will invoke the function with the parameters in the wrong order. This is a corollary to Murphy's Law which in its oldest attested form is "If an aircraft part can be installed incorrectly, someone will install it that way".

One way to reduce the number of parameters to a function is where a parameter is a reference to an object. An object combines information with behavior and is generally anything more complicated than a simple number; for example an array, a string, or a structure combining multiple attributes.

Where implementing an object-oriented method would still leave multiple parameters another way to reduce those multiple parameters to a single parameter is to use keyword parameters. That is to replace the multiple parameters with an associative array parameter. In particular some of the older class definitions in this application originally had multiple parameters to the constructor because they implemented objects whose unique key consisted of multiple values, for example a hierarchy of identifiers. All such classes have been changed to accept a single associative array identifying the parameters by name, but for backwards compatibility the old multiple parameter format is still accepted but is converted internally to an associative array.

Constructors are Blunt Instruments

One of the lessons I learned while developing this project was that constructors introduce a number of risks.

Firstly if the only way for a script to obtain class instances is through a constructor and the script invokes other scripts you will end up with multiple object instances which all represent the same table record. This is both inefficient, since the table record is read multiple times, and introduces potential problems since changes made to one copy of the instance are not reflected in the other copies.

I learned to never invoke a constructor for a related class from within a constructor as this almost always led to an infinite recursion loop. The issue is that many classes contain references to instances of other classes. For example the class Person contains references to instances of Name, Family, Child, and Event. But in turn each of those classes contain references back to instances of Person.

For these reasons this implementation has two features:

  1. Whenever possible use a factory method to obtain an instance of a class. The factory uses a constructor only if that is the only way to obtain the instance. If there is already an instance which matches the signature then that existing instance is returned. The class FamilyTree has a singleton member which provides factory methods for the most commonly used objects in the family tree.
  2. Wherever the unique key of a class instance is passed as a parameter to a constructor and the invoking script already has an instance for that key it passes that instance in place of the key. The invoked constructor then can simply save the reference and return it from a factory method.

Next: class LegacyDate