Filtering keys - AtlasOfLivingAustralia/ala-keys-ui Wiki

The ability to create filters and retrieve keys that in combination will key out a subset of taxa is what most people like most about KeyBase.

In KeyBase we've got two types of filters, local filters, which act on a single key and enable the user to create a key to a subset of taxa, and global filters, which act on one or more projects and filter out the keys that are needed to key out a subset of taxa and select the filtered sets of items that need to be passed to the individual keys (as local filters). The local filters have been implemented in the ALA version of KeyBase already and are dealt with entirely in the front end. The global filters have to come from the web service.

So far, in KeyBase, global filters have been created the same way as local filters, using nested sets of keys. The global filters also used to be stored in the database, as they took a long time to create. However, in the new version of KeyBase, we have been moving much of the heavy lifting, including the creation of the nested sets (for the leads within the keys), to the front end, so the nested sets need not be stored in the database any more, so won't be available for the creation of filters. Moreover, as the filters can contain many taxa and the hierarchy of keys will never be very deep, the recursive querying that the nested sets are meant to avoid is probably going to be much more efficient than the nested sets.

I have set up a quick example web service that creates filters on the fly based on a set of item IDs at http://keybase.rbg.vic.gov.au/dev/ws/globalFilter?filter_id=555f5b4bd9412. This service is meant to be used in combination with the project web service (http://keybase.rbg.vic.gov.au/dev/ws/projects?project=10 for this particular filter), so the globalFilter service only needs to retrieve the IDs for the keys and the items in each key.

The PHP code that creates the globalFilter web service is below:

public function globalFilter($filter) {
  $this->filterKeys = array();
  $this->filterKeyIDs = array();
  $this->db->select('FilterItems, Filter');
  $this->db->from('globalfilter');
  $this->db->where('FilterID', $filter);
  $query = $this->db->get();
  if ($query->num_rows()) {
    $row = $query->row();
    $filterItems = unserialize($row->FilterItems);
    $projects = array_keys(unserialize($row->Filter));
    $this->getGlobalFilterKeys($filterItems, $projects);
    return $this->filterKeys;
  }
  else {
    return FALSE;
  }
}

private function getGlobalFilterKeys($items, $projects) {
  $newItems = array();
  $this->db->select('k.ProjectsID, k.KeysID, k.TaxonomicScopeID, k.Name AS KeyName, 
          group_concat(DISTINCT cast(l.ItemsID as char)) AS Items', FALSE);
  $this->db->from('keys k');
  $this->db->join('leads l', 'k.KeysID=l.KeysID');
  $this->db->join('groupitem g0', 'l.ItemsID=g0.GroupID AND g0.OrderNumber=0', 
          'left', FALSE);
  $this->db->join('groupitem g1', 'l.ItemsID=g1.GroupID AND g1.OrderNumber=1', 
          'left', FALSE);
  $this->db->join('items i', 'coalesce(g0.MemberID, g1.MemberID, 
          l.ItemsID)=i.ItemsID', 'inner', FALSE);
  $this->db->where_in('k.ProjectsID', $projects);
  $this->db->where_in('i.itemsID', $items);
  $this->db->group_by('k.KeysID');
  $query = $this->db->get();
  if ($query->num_rows()) {
    foreach ($query->result() as $row) {
      $key = array();
      $key['key_id'] = $row->KeysID;
      $key['key_name'] = $row->KeyName;
      $key['filter_items'] = explode(',', $row->Items);
      
      if (in_array($key['key_id'], $this->filterKeyIDs)) {
        $k = array_search($key['key_id'], $this->filterKeyIDs);
        $this->filterKeys[$k]->filter_items = array_unique(
                array_merge($this->filterKeys[$k]->filter_items, 
                  $key['filter_items']));
      }
      else {
        $this->filterKeys[] = (object) $key;
        $newItems[] = $row->TaxonomicScopeID;
      }
    }
    
    if ($newItems) {
      $this->getGlobalFilterKeys($newItems, $projects);
    }
  }
}

The globalFilter function initialises the getGlobalFilterKeys function, which creates the array of keys that will be converted to JSON later. The function takes as its argument the ID of a resource that contains a set of item IDs and project IDs. Currently, the resource is a record in the globalfilter table in the database, where the set of filtered items is stored in serialised form.

The getGlobalFilterKeys function is recursive. It starts with an empty filterKeys array. The function runs the following SQL:

SELECT k.ProjectsID, k.KeysID, k.TaxonomicScopeID, k.Name AS KeyName, 
  group_concat(DISTINCT cast(l.ItemsID as char)) AS Items
FROM `keys` k
JOIN leads l ON k.KeysID=l.KeysID
LEFT JOIN groupitem g0 ON l.ItemsID=g0.GroupID AND g0.OrderNumber=0
LEFT JOIN groupitem g1 ON l.ItemsID=g1.GroupID AND g1.OrderNumber=1
JOIN items i ON coalesce(g0.MemberID, g1.MemberID, l.ItemsID)=i.ItemsID
WHERE k.ProjectsID IN («project IDs»)
AND i.ItemsID IN («item IDs»)
GROUP BY k.KeysID;

For each record in the query result, it checks if the key is already in filterKeys. If it isn't, the key is added to the filterKeys array and the TaxonomicScopeID is added to the newItems array. If the key is already in the filterKeys array, the array of Items from the record is merged with that from the key already in the set. If, at the end of the loop, there are any items in the newItems array, the newItems array is passed to the getGlobalFilterKeys function. This goes on until the query doesn't return records any more (this should never take more than five or six iterations).