Object Oriented Access to Database Records - jcobban/Genealogy GitHub Wiki
Object-Oriented Access to Database Records
Table of Contents
-
This page
Table Overview
The family tree database functionality uses the following tables:
Table | Formerly | Description | Class |
---|---|---|---|
Addresses | tblAR | Address/Repository List | Address |
PictureBases | tblBP | Picture Location Path List | part of Picture |
Pictures | tblBR | Picture/Video/Sound List | Picture |
tblCP | Child to Parent Relationship | Does not support i18n | |
Children | tblCR | Child Table, links children to parents | Child |
tblCS | Child Statuses | Does not support i18n | |
DontMergeEntries | tblDM | Don't Merge List, Ids of two individuals | DontMergeEntry |
Events | tblER | Events | Event |
tblET | Event Type Names | Does not support i18n | |
Bookmarks | tblHB | Bookmarks | |
HistoryEntries | tblHL | History List | |
FamilyTree | tblHR | Database Global Services | FamilyTree |
Persons | tblIR | Individuals in the Family Tree | Person |
Locations | tblLR | Locations | Location |
Families | tblMR | Families | Family |
tblMS | Marriage Statuses | Does not support i18n | |
Surnames | tblNR | Surnames | Surname |
Names | tblNX | Alternate Names | Name |
tblST | Source Types | Does not support i18n | |
Sources | tblSR | Sources | Source |
Citations | tblSX | Source Detail/Citations | Citation |
tblTC | To Do Item Categories | Does not support i18n | |
ToDos | tblTD | To Do / Correspondence List | ToDo |
tblTL | To Do Localities, merged with Locations | ||
Temples | tblTR | LDS Temple Names | Temple |
DeletedPersons | tblXI | Deleted Individual Identifier List | managed by Person |
DeletedFamilies | tblXM | Deleted Marriage Identifier List | managed by Family |
The design of these tables is ultimately derived from the database maintained by the Legacy Family Tree© software package from Millenium. The Microsoft Windows® version of that database is implemented using a Microsoft Access© (JET) file, a format which supports personal access using SQL commands by one user at a time and uses 16-bit little-endian characters (UCS-2LE) which will be misinterpreted on a computer such as a Motorola microprocessor or an IBM System Z processor that uses big-endian integers, and which are inefficient for representing Western European text. Most newer software products, for example Microsoft Office, have switched to using the more efficient and portable 8-bit Unicode Transformation Format (UTF-8). A number of the smaller tables are loaded entirely into memory by the Legacy Family Tree software package, which is not an option available to a web server. In any event these tables do not support internationalization, and do not require the ability of the application to modify them, so their contents are moved to template files. For the web server implementation this database structure is mapped to tables implemented on a SQL server using UTF-8 text to support multiple languages with the addition of a number of additional indexes to improve performance with large tables and to avoid the need to load tables into memory. A web server cannot gain efficiency by loading tables into memory because the functionality may be spread across multiple independent processors for performance. However the database server caches frequently issued queries which achieves the same performance increase.
The table design used by Legacy Family Tree has a number of issues which have required extensive modification.
- The design is not compliant to 3rd Normal Form. In particular the same item of information is present in multiple tables, which must all be synchronized by application logic. For example the names of individuals were maintained in the
Person
,Family
, andName
records. - Many tables use auto-increment primary keys even where there was a field in the table which has a unique value for each row. This requires application logic to prevent duplicate records from being created. To discourage this the concept of an auto-increment or sequence field is excluded from the SQL standard and is not implemented by all commercial SQL server implementations. Specific examples of this are the Surnames (tblNR), Sources (tblSR), Locations (tblLR), and Temples (tblTR) tables.
- There are columns in several tables whose name duplicates a SQL reserved word, for example
order
anddesc
. - There are columns whose name does not properly describe the contents of the column.
- The table names do not describe the contents. For example without examining the implementation logic one could not know that the table named
tblIR
contained the core information about a person in the family tree. It is recommended design practice for the name of a table to be a plural noun, in this casePersons
, and the encapsulating class to be the singular noun, in this casePerson
. - Many events are represented not by records in the Events (tblER) table but rather by groups of fields within other records. For example the Birth event is represented by fields in the Person (tblIR) record. Despite this the Legacy definition of the Events table includes support for almost all of these events, although a "kind" field has to be added to support sacraments of the Church of Latter Day Saints which are performed within a temple.
- Some events are associated with the wrong record. For example the Adoption event is associated with a Person record rather than with the Child record, even though it is topologically similar to the LDS Sealed to Parents event, which is associated with the Child record, in that the event connects a child to a set of parents.
- Legacy Family Tree is designed to be used by a single person on a single computer. Therefore it supports only a single interface language at a time. A server-based implementation cannot use that paradigm. All tables which contain presentation language text are therefore deprecated, and the internationalization (i18n) support is moved to the browser.
The class implementation permits the application code to access the database using either the logical structure conformant to good design or the interface implied by the original Legacy Family Tree design to permit migration.
To these are added several tables for managing access by multiple users to a shared resource.
Table | Description | Class |
---|---|---|
Users | Users registered with the site | User |
RecOwners | Track users who have the authority to update specific records in the database. | RecOwner |
Blogs | Record blog entries applying to any table record | Blog |
SqlLog | Log SQL updates to trace and permit backing out. | $record->log($cmd, $parms) |
Advertisers | Information about advertisers on the site | Advertiser |
Another group of tables are added to manage transcriptions of original sources:
Table | Description | Class |
---|---|---|
Births | Transcription of Birth registrations | Birth |
CemeteryData | Cemetery transcription data | |
Census1831 | 1831 Census of Lower Canada (Quebec) | CensusLine |
Census1851 | 1851 Imperial Census transcription | CensusLine |
Census1861 | 1861 Imperial Census transcription | CensusLine |
Census1871 | 1871 Census of Canada transcription | CensusLine |
Census1881 | 1881 Census of Canada transcription | CensusLine |
Census1891 | 1891 Census of Canada transcription | CensusLine |
Census1901 | 1901 Census of Canada transcription | CensusLine |
Census1906 | 1906 Census of Canada Prairie Provinces transcription | CensusLine |
Census1911 | 1911 Census of Canada transcription | CensusLine |
Census1916 | 1916 Census of Canada Prairie Provinces transcription | CensusLine |
Census1921 | 1921 Census of Canada transcription | CensusLine |
Censuses | Information on censuses | Census |
Counties | Information on Counties for vital statistics | County |
Countries | Information on Countries | Country |
CountyMarriageReports | County Marriage Reports | CountyMarriageReport |
CountyMarriages | County Marriage transcriptions | CountyMarriage |
Deaths | Death registration transcriptions | Death |
Districts | Information on Census Districts | District |
Domains | Information on sub-national administrative domains for vital statistics | Domain |
FieldComments | Field Comments | FieldComment |
Languages | Information on Languages | Language |
Marriages | Marriage transcriptions | Marriage |
MarriageIndi | Information on participants in a marriage including the groom, bride, minister, and witnesses | Marriage |
MethodistBaptisms | Wesleyan Methodist Baptism transcriptions | MethodistBaptism |
Ocfa | Ontario Cemetery Finding Aid copy | Ocfa |
OcfaTownships | Townships found in OCFA | |
Pages | Census page information | Page |
SubDistricts | Census Sub-District information | SubDistrict |
Townships | Township information for vital statistics | Township |
Access to these tables by PHP web page generation scripts is through class definitions residing in a portion of the file system on the web server that is not accessible to client browsers. The base class for all of these table implementations is class Record.
the class RecordSet is defined to represent sets of Records for the purpose of generating reports, updating multiple records, and deleting multiple records, .
Lessons Learned
Multiple Parameters
If a function has multiple parameters then there is a possibility that the invoking code will invoke the function with the parameters in the wrong order. This is a corollary to Murphy's Law which in its oldest attested form is "If an aircraft part can be installed incorrectly, someone will install it that way".
One way to reduce the number of parameters to a function is where a parameter is a reference to an object. An object combines information with behavior and is generally anything more complicated than a simple number; for example an array, a string, or a structure combining multiple attributes.
Where implementing an object-oriented method would still leave multiple parameters another way to reduce those multiple parameters to a single parameter is to use keyword parameters. That is to replace the multiple parameters with an associative array parameter. In particular some of the older class definitions in this application originally had multiple parameters to the constructor because they implemented objects whose unique key consisted of multiple values, for example a hierarchy of identifiers. All such classes have been changed to accept a single associative array identifying the parameters by name, but for backwards compatibility the old multiple parameter format is still accepted but is converted internally to an associative array.
Constructors are Blunt Instruments
One of the lessons I learned while developing this project was that constructors introduce a number of risks.
Firstly if the only way for a script to obtain class instances is through a constructor and the script invokes other scripts you will end up with multiple object instances which all represent the same table record. This is both inefficient, since the table record is read multiple times, and introduces potential problems since changes made to one copy of the instance are not reflected in the other copies.
I learned to never invoke a constructor for a related class from within a constructor as this almost always led to an infinite recursion loop. The issue is that many classes contain references to instances of other classes. For example the class Person
contains references to instances of Name
, Family
, Child
, and Event
. But in turn each of those classes contain references back to instances of Person
.
For these reasons this implementation has two features:
- Whenever possible use a factory method to obtain an instance of a class. The factory uses a constructor only if that is the only way to obtain the instance. If there is already an instance which matches the signature then that existing instance is returned. The class FamilyTree has a singleton member which provides factory methods for the most commonly used objects in the family tree.
- Wherever the unique key of a class instance is passed as a parameter to a constructor and the invoking script already has an instance for that key it passes that instance in place of the key. The invoked constructor then can simply save the reference and return it from a factory method.
Next: class LegacyDate