Estimate a cross table from a series of row and column totals - ObjectVision/GeoDMS GitHub Wiki
Consider the following problem:
Known:
- $G^r_g$: number of objects of g-type $g$ in zone $r$, aka row totals.
- $H^r_h$: number of objects of h-type $h$ in zone $r$, aka column totals.
Requested:
- $Q^r_{gh}$, an estimated number of objects with g-type $g$ and h-type $h$ in zone $r$.
- the estimated number of objects per h-type as a function (preferably a linear combination) of the object numbers per g-type, for estimating an h-type distribution given a g-type distribution of new objects.
We assume that:
- For each zone $r$, $G$ and $H$ count all objects, thus: $\forall r: \sum\limits_g G^r_g = \sum\limits_h H^r_h$
- $P_{hi}$ (the probability of h-type $h$ for object $i$) is equal for all $i$ in the same zone $r$ and with the same g-type $g$, and thus depends only on $P(h|rg)$.
- $Q^r_{gh}$ follows, per row $g$ and zone $r$, the distribution of $G^r_g$ repeated categorical draws (a multinomial distribution); thus $E[Q^r_{gh}] = P(h|rg) \cdot G^r_g$
- $Q^r_{gh} := f_g \cdot f_h \cdot P_{gh}$ such that $\sum\limits_{h} Q^r_{gh} = G^r_g$ and $\sum\limits_{g} Q^r_{gh} = H^r_h$, to be determined by iterative proportional fitting.
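
The fitting step for a single zone can be sketched as follows. This is a minimal illustration, not the GeoDMS implementation: the function name `ipf`, its parameters, and the example numbers are assumptions, and the sketch presumes a strictly positive seed matrix and row and column totals that add up to the same grand total.

```python
import numpy as np

def ipf(P, g_tot, h_tot, max_iter=1000, tol=1e-10):
    """Scale seed matrix P (g-types x h-types) for one zone so that
    rows sum to g_tot and columns sum to h_tot.
    Returns Q with Q[g, h] = f_g[g] * f_h[h] * P[g, h]."""
    f_g = np.ones(P.shape[0])
    f_h = np.ones(P.shape[1])
    for _ in range(max_iter):
        f_g = g_tot / (P @ f_h)   # f_g = G_g / sum_h f_h * P_gh
        f_h = h_tot / (f_g @ P)   # f_h = H_h / sum_g f_g * P_gh
        Q = f_g[:, None] * P * f_h[None, :]
        # Column sums are exact after the f_h update; stop once the
        # row sums match as well.
        if np.abs(Q.sum(axis=1) - g_tot).max() < tol:
            break
    return Q

# Hypothetical zone with 2 g-types and 3 h-types (both totals sum to 30).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3]])
g_tot = np.array([10.0, 20.0])
h_tot = np.array([8.0, 14.0, 8.0])

Q = ipf(P, g_tot, h_tot)
print(Q.round(3))
print(Q.sum(axis=1), Q.sum(axis=0))
```

Zero entries in the seed matrix stay zero under this scaling, so a seed with empty rows or columns would need separate handling.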
Thus:
- $\sum\limits_{h} P(h|rg) = 1$, $f_g = G^r_g / \sum\limits_{h} f_h \cdot P_{gh}$, and $f_h = H^r_h / \sum\limits_{g} f_g \cdot P_{gh}$.
- $P_{gh}$ is to be determined by regression of $H^r_h$ on $G^r_g$; written in matrix notation: $H = G \times P + \epsilon$, with $\epsilon$ an $r \times h$ matrix of independent random variables with zero expectation.
- It follows that: $P := (G^T \times G)^{-1} \times (G^T \times H)$
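
The least-squares estimate above can be sketched with NumPy. This is an illustrative sketch under stated assumptions: the function name `estimate_p` and all data are hypothetical, $G$ is stored with one row per zone and one column per g-type, $H$ with one row per zone and one column per h-type, and $G$ is assumed to have full column rank.

```python
import numpy as np

def estimate_p(G, H):
    """Least-squares estimate of P in H = G @ P + eps.
    G: (zones, g-types), H: (zones, h-types).
    Returns P with shape (g-types, h-types)."""
    # lstsq minimizes the same squared error as (G^T G)^-1 (G^T H),
    # without forming the inverse explicitly.
    P, *_ = np.linalg.lstsq(G, H, rcond=None)
    return P

# Hypothetical data: 4 zones, 2 g-types, 3 h-types.
P_true = np.array([[0.5, 0.3, 0.2],
                   [0.1, 0.6, 0.3]])
G = np.array([[10.0, 20.0],
              [30.0, 5.0],
              [12.0, 8.0],
              [7.0, 25.0]])
H = G @ P_true  # noise-free, so the estimate should reproduce P_true

P_hat = estimate_p(G, H)

# The explicit normal-equation form from the text, for comparison.
P_normal = np.linalg.inv(G.T @ G) @ (G.T @ H)

print(P_hat.round(3))
print(P_hat.sum(axis=1))
```

When the row totals of $G$ and $H$ agree in every zone, the rows of the estimated $P$ sum to 1, because $(G^T G)^{-1} G^T H \mathbf{1} = (G^T G)^{-1} G^T G \mathbf{1} = \mathbf{1}$. Individual entries are not guaranteed to be non-negative once noise is present.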
ToDo:
- Consider alternative: $Q^r_{gh}$ is determined by discrete allocation with the given constraints and suitability $P_{gh}$.
- Consider alternative: $P := (G^T \times H) \times (H^T \times H)^{-1}$, which follows from regression of $G^r_g$ on $H^r_h$.
- Consider alternative: $P := (G^T \times H)$
- Consider the effect of possible heteroscedasticity of $\epsilon$ with $H$; i.e. assume $\mathrm{var}(\epsilon) \sim H$, as each element of $H$ is assumed to be the sum over $g$ of the results of $G^r_g$ trials of categorical distributions with conditional probabilities $P_{g|h}$.