X11: London prototype: Testing automated generation of age data and use of age data in procedural models

Introduction

A number of automated data capture methods were experimented with between 2017 and 2019 as part of Colouring London's early development at CASA, UCL. Details of preliminary investigations (unpublished) are recorded here along with first steps in Computer Vision collaboration and include notes on:

Use of historical network data to infer geolocation of non-domestic land use;
Use of historical network data to infer building age;
Use of footprint configuration to infer high street building locations;
Use of building age to infer 3D/4D attributes.

Use of historical network data to infer geolocation of non-domestic land use

In this section semi-automated methods of generating age and land-use data for London are described.

The close association of retail buildings with high accessibility routes in London has been recorded in a number of studies (Hillier, 1996; Smith and Crooks, 2010; Stanilov and Batty, 2011; Masucci et al., 2013). The geolocation of non-domestic and mixed-use buildings along high accessibility routes has been illustrated by the London Building Stock Model (add ref). Kiril Stanilov and Mike Batty have also identified links between high accessibility routes and preurban networks, describing these, in their work with Masucci, as forming the ‘backbone’ of London (Masucci et al., 2013, p.3). Brenda Case Scheer describes high accessibility routes as characterised by ‘elastic’ tissue, a heterogeneous, dynamic tissue type along which retail clusters. Torma et al. have visualised, at building level, the ongoing cycle of incremental demolition and construction operating on London’s suburban high streets, and along this type of route. The association of retail with mixed use, and the spatial distribution of retail/residential buildings along high accessibility/preurban routes in the London Borough of Camden, are illustrated in Appendix 4, using manually captured data.

Below, preurban routes are visualised for Greater London: the Roman period in red, the medieval period in blue, and the post-medieval period up to 1786 in yellow. (The choice of 1786 as a cut-off point for the preurban period is based on Stanilov and Batty’s 2011 definition, which itself is based on the availability of London-wide maps from around the turn of the 18th century.) Saxon sites, from which many of London’s high streets were to spring, are shown as black dots. Roman, Saxon and medieval data were provided by Peter Rauxloh at the Museum of London Archaeology (MOLA), and 1786 data by Kiril Stanilov. These networks are compared with the distribution of high streets in London (Ordnance Survey, 2019a).

Roman, Medieval and 1786 street networks in London. Historical data provided by MOLA, Historic England and Kiril Stanilov (Image, Polly Hudson).

High street location (Source, Ordnance Survey, 2019a)

In this first experiment, preurban networks were assessed for their potential as generators of draft open datasets relating to retail and mixed land use. Mixed use buildings from the Camden/Westminster sample, observed during collection to often contain retail use, were first mapped in ArcGIS alongside 1786 network data. In the image below, mixed use, shown in red, can be seen clustering along the preurban network, shown in turquoise (much of which overlaps with much earlier medieval routes). In contrast, little mixed use appears to cluster along 1830 streets, shown in yellow (the next year for which Stanilov’s street network data are available). An anomaly was, however, found in Park Street, a small, diverse shopping street forking west off Camden High Street, shown boxed in Figure 8.3. Here mixed retail/residential buildings were found to cluster along the 1830 rather than the 1786 route. It was therefore hypothesised that the route might in fact be much older and had perhaps been omitted from the map from which Stanilov’s 1786 network data were generated. To test this, MOLA’s medieval archaeology map for Greater London (Museum of London Archaeology, 2015) was checked. As shown in the inset image in Figure 8.3, Park Street does in fact represent the shortest path between two Saxon settlements, and is likely to have developed over time into a well-used route. MOLA’s maps were also used to check a second anomaly slightly to the north of Park Street. This was also found to run along the shortest path between two Saxon sites.

Chapter 8 1

Owing to the observed spatial relationship between retail/residential use and the preurban network, the hypothesis was tested that preurban networks could be used to rapidly generate open retail and mixed use data for Greater London. As mixed use data for Greater London were not available to the study for verification, the hypothesis could only be tested in relation to retail. This was done by using OS’s AddressBase Plus product (accessed 7 December 2017) to measure the percentage of retail buildings in Greater London clustering along the preurban route. The AddressBase Plus dataset for London comprised (in 2017) 240,451 OSMM ‘commercial’ footprints (CR code: 102,357). The commercial class is in fact very broad and includes many types of non-domestic land use (Ordnance Survey, 2021f). From the commercial dataset, 102,357 OSMM polygons were extracted for the secondary description of ‘retail’. Buffers 50m and 75m wide were then generated in ArcGIS from the centreline of Stanilov’s 1786 network, for the whole of London. The 50m buffer was designed to capture buildings facing directly onto commercial roads, and the 75m buffer to also capture buildings set immediately to their rear. A spatial join was then run between the preurban network buffers of both sizes and the OSMM polygons in the AddressBase Plus ‘commercial’ dataset, and then repeated for the AddressBase Plus ‘retail’ dataset. The method was also repeated for the 1830 and 1880 networks to allow clustering to be compared with later routes. The number of ‘commercial’ and ‘retail’ polygons falling within each size of buffer, for each network age, was then tabulated, as below. The spatial join required access to the OS AddressBase Plus product, and to downloaded OSMM polygons for the whole of Greater London.
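
The buffer-and-count step can be illustrated with a minimal Python sketch. This is not the ArcGIS workflow used in the study: it assumes, for illustration only, that each footprint is reduced to a centroid point, that the network is a list of straight centreline segments in planar metre coordinates, and that the buffer size is a distance measured from the centreline. The function names and the toy coordinates are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance (metres) from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_network(p, network):
    """Distance from p to the nearest centreline segment of the network."""
    return min(point_segment_distance(p, a, b) for a, b in network)

def share_within(points, network, buffer_m):
    """Percentage of points lying within buffer_m of the network."""
    if not points:
        return 0.0
    hits = sum(1 for p in points if distance_to_network(p, network) <= buffer_m)
    return 100.0 * hits / len(points)

# Hypothetical toy data: one straight road and four footprint centroids.
network_1786 = [((0.0, 0.0), (1000.0, 0.0))]
retail_centroids = [(100.0, 10.0), (500.0, 40.0), (700.0, 60.0), (300.0, 200.0)]

for buffer_m in (50.0, 75.0):
    print(buffer_m, share_within(retail_centroids, network_1786, buffer_m))
```

Repeating the same call for each network year and each class of footprint would give the kind of table described above. A production version would work on full polygons with a spatial index rather than on centroids.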

Chapter 8 2

Nearly 81% of all polygons classified in AddressBase Plus as ‘retail’ were found to fall within the seventy-five-metre buffer of the 1786 route, and 77% within the fifty-metre buffer. For ‘commercial’ buildings as a whole, this was slightly lower, with 71% found within the seventy-five-metre buffer and 65% within the fifty-metre buffer. Only around 6% more ‘commercial’ and ‘retail’ OSMM polygons were found on new road segments built between 1786 and 1830, for either of the buffer sizes, and only around 12% on segments built between 1786 and 1880. The results confirmed that vectorised data for preurban routes have potential value in supporting the generation of open retail datasets. The method is considered sufficiently robust to warrant testing in other cities, and is easily reproducible where late-18th-century maps (from which street networks can be georeferenced and vectorised) and OSMM footprints are available. It is also considered useful for the rapid geolocation of ‘elastic’ tissue, and the capture and release of data on dynamic properties of buildings built along these routes (Chapter 4). Suggested second stage work includes more detailed analysis of the number of pure retail buildings, compared to mixed use retail buildings, falling along preurban routes. To support this, the author is currently working with Stephen Law at the Alan Turing Institute to design an algorithm using the 1786 network data, and ‘rules’ in relation to footprint size, shape and relation to plot. Verification is possible using OS’s high street dataset for London, available to universities under a public mapping agreement licence.

Tracking fluctuations in the proportion of retail, and the size of retail buildings, falling along preurban routes, and the socio-economic impact of this, will also be important in future, particularly in the light of recent planning policies which will encourage the replacement of high street retail in London with residential stock (Gardiner and Hopkirk, 2020). An obvious question arising from the findings above is: where are new shops, which have a natural affinity to the preurban routes along which retail uses have clustered for centuries, meant to go in future?

Using vectorised historical network data to generate draft building age datasets

In this section, a method is described allowing for the geolocation of residential tissue in London and the estimation of building date of construction. Residential tissue, described as ‘static’ tissue by Brenda Case Scheer, makes up most of Greater London; Smith and Crooks describe the city as being ‘dominated by residential functions, with small-scale local retail and service centres’ (Smith and Crooks, 2010, p.39). Masucci, Stanilov and Batty point out how spaces between the preurban/‘elastic’ network in London, described above, have been gradually filled by residential tissue, mainly from the centre outwards until constrained by the Green Belt (Masucci et al., 2013). These residential streets function primarily as access routes to and from homes, rather than as connectors between commercial locations (as in the case of ‘elastic’ tissue).

Below, the growth of London between 1786 and 2010 is shown by visualising Kiril Stanilov’s vectorised historical street network data chronologically. A detailed web of routes can already be seen by 1786. Between 1786 and 1830, densification can be seen concentrated around the periphery of the ancient core (though expansion of medieval villages is not visible at this scale). Between 1830 and 1900, development can be seen occurring across Inner London, reflecting growth in population, and house building, during the Industrial Revolution. In the interwar period, rapid suburban development is visible in Outer London. From the post-war period, clear patterns of development can no longer be seen.

In the figure below, the location of new roads constructed during each survey interval is made clearer by erasing one network from the next, in chronological order. This is done using the ArcGIS ‘Erase’ tool: for example, the 1786 network is first erased from the 1830 network to leave only roads built between 1786 and 1829, the 1830 network is then erased from the 1880 network, and so on. The process took only a few hours to complete and generated a dataset showing the position of all new roads built in Greater London within eight temporal intervals, from 1786 to 2010.
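
The chronological erase can be sketched in Python as a set difference between successive networks. This is a simplification rather than the geometric Erase operation: it assumes, hypothetically, that a road segment surviving from one survey to the next has exactly the same end coordinates in both datasets, which real vectorised maps rarely guarantee. The function names and toy segments are illustrative only.

```python
def normalise(segment):
    """Order a segment's end points so that direction does not matter."""
    a, b = segment
    return (a, b) if a <= b else (b, a)

def new_roads_by_interval(networks):
    """Return {(start_year, end_year): segments first seen in that interval}.

    networks maps a survey year to an iterable of segments, each segment
    being a pair of (x, y) end points.
    """
    years = sorted(networks)
    result = {}
    for earlier, later in zip(years, years[1:]):
        old = {normalise(s) for s in networks[earlier]}
        new = {normalise(s) for s in networks[later]}
        # "Erase" the earlier network from the later one.
        result[(earlier, later - 1)] = new - old
    return result

# Hypothetical toy networks: each survey adds one segment.
road_a = ((0, 0), (100, 0))
road_b = ((100, 0), (100, 80))
road_c = ((100, 80), (0, 80))
networks = {
    1786: [road_a],
    1830: [road_a, road_b],
    1880: [(road_a[1], road_a[0]), road_b, road_c],  # road_a digitised in reverse
}

for interval, segments in new_roads_by_interval(networks).items():
    print(interval, sorted(segments))
```

With georeferenced historical maps, a tolerance-based geometric comparison would be needed in place of exact matching.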

Based on Masucci et al.’s finding that residential tissue fills spaces between older, high accessibility routes, it was felt that all routes other than pre-1786 routes could reasonably be assumed to represent residential streets, and furthermore that the majority of dwellings built along these could reasonably be expected to have been constructed at approximately the same time. Based on these assumptions, it was hypothesised that if a 50m buffer was used to capture OSMM polygons/buildings falling along these streets, then buildings could be approximately dated simply by assigning the street network date interval to each polygon/building. For example, a construction date interval of 1900–1919 would be assigned to buildings built along Stanilov’s 1920 network. This was also viewed as a useful and quick method of geolocating ‘static’ tissue. The method was kindly tested for the whole of London by Flora Roumpani, using Stanilov’s street network data for 1900 (representing the 1880–1899 interval), for 1920 (the 1900–1919 interval) and 1940 (the 1920–1939 interval). This generated around 750,000 building age dates. These were later uploaded onto the Colouring London platform, as described in Chapter 8, and a sample is shown in Figure 8.6. On the platform, estimated age is derived from the beginning of the interval, e.g. 1920, with the earliest and latest possible construction dates given as 1920 and 1939 respectively.
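
A minimal Python sketch of the dating rule follows. The network-year-to-interval mapping and the 50m buffer come from the description above; everything else is an assumption made for illustration: footprints are reduced to centroids, roads are straight segments in metre coordinates, and a building within 50m of roads from more than one interval is given the interval of the nearest road (the source does not say how such ties were resolved).

```python
import math

# Network year -> (earliest, latest) construction year, as described above.
INTERVALS = {1900: (1880, 1899), 1920: (1900, 1919), 1940: (1920, 1939)}

def point_segment_distance(p, a, b):
    """Shortest planar distance (metres) from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def estimate_age(centroid, new_roads, buffer_m=50.0):
    """Return estimated, earliest and latest construction years, or None.

    new_roads maps a network year to the segments first appearing in it.
    Assumption: the nearest qualifying road decides the interval.
    """
    best = None  # (distance, network_year)
    for year, segments in new_roads.items():
        for a, b in segments:
            d = point_segment_distance(centroid, a, b)
            if d <= buffer_m and (best is None or d < best[0]):
                best = (d, year)
    if best is None:
        return None  # outside every buffer: leave undated
    earliest, latest = INTERVALS[best[1]]
    # Estimated age is taken from the beginning of the interval.
    return {"estimated": earliest, "earliest": earliest, "latest": latest}

# Hypothetical toy data: two parallel streets from different surveys.
new_roads = {
    1900: [((0.0, 0.0), (100.0, 0.0))],
    1920: [((0.0, 200.0), (100.0, 200.0))],
}
for centroid in [(50.0, 20.0), (50.0, 190.0), (50.0, 100.0)]:
    print(centroid, estimate_age(centroid, new_roads))
```

Buildings returned as `None` fall outside every buffer and would need to be dated by other means.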

Accuracy was checked in two limited ways owing to time constraints. Firstly, a rapid visual assessment was made using EDINA Digimap Ancient Roam, to see whether ages generated for the selected time intervals roughly corresponded with blocks of new development arising on historical maps within the corresponding interval dates. Areas in the north, south, west and east of London were spot checked for each of the three time intervals. Secondly, a number of individual road segments in Outer London were analysed to gain some idea of the scale of error that might be occurring owing to the amount of infill and redevelopment since the main construction phase. Analysis of Birkbeck Road, below, gives an idea of the time-consuming process of manually checking and correcting data. Here, pale blue represents buildings for which age has been auto-generated, using the above method, for the period 1880–1899. Buildings in other colours, which were initially also pale blue, have all required correction. Those in purple fall into the next age interval but also represent new build on greenfield sites. Those in orange and yellow represent later replacement buildings built between 1960 and 1999. Dates were checked by first comparing the earliest and latest EDINA OS Historical Roam maps, this being considered the quickest method available (as identified in Chapter 5) of estimating approximate date.

Facade images from Google Street View, and knowledge of map footprint shape, were also used to assess whether buildings had been replaced. For Inner London areas, London County Council (LCC) bomb damage maps were also considered useful. The dating method was not as detailed as that described in Chapter 6. From these cursory assessments, it was observed that small-scale, piecemeal rebuilding and infilling was relatively common in Outer London, between semi-detached and small detached buildings. In Inner London (based on observation of the Camden/Westminster sample only), patterns of redevelopment were found to be different. Demolition of nineteenth- and early-twentieth-century streets tended to have occurred in groups, often as a result of bomb damage or social housing estate insertion. Where older terraces had managed to survive, they appeared also to have often retained their integrity as a unit, with redevelopment and infilling seemingly inhibited by the tight row structure and sharing of party walls. Where small-scale demolition did occur, end-of-terrace buildings appeared most vulnerable. Mid-terrace demolition in the sample, when checked against LCC bomb damage maps, was also found to often have been a result of bomb damage.

The test concluded that relatively reliable, baseline volume age data for ‘static’ residential tissue in London could be generated extremely quickly using the semi-automated method. Roumpani estimates that it took around half a day to fully process data for each age interval, ready for upload to Colouring London. However, it was concluded that to develop the most accurate datasets possible, all data may have to be checked to pick up later small infill and rebuilding. This process was found to be extremely time consuming and to require expert knowledge, with the correction of the Birkbeck Road sample alone taking around half a day, i.e. the same time it took Roumpani to auto-generate approximate data for an entire age interval. Finding methods of encouraging historians and building specialists to collaboratively check data at local level, for the city as a whole, was therefore confirmed as critical. Once auto-generated data have been checked by a network of historic environment specialists working at the local level, it will be possible to produce open age data of unprecedented accuracy and quality for London relatively quickly. Furthermore, open training data, able to be used in computational methods designed to reduce reliance on expert checking of age in other UK cities, will also be made available. Provided that Krenz’ method of spatially tracking demolition, or alternative approaches to demolition tracking, can be applied, the age dataset’s shelf life will be unlimited, with historians also able to fine-tune entries as and when new information becomes available.

Three future areas of work arose. The first, in relation to tapping into the historic environment ecosystem, to encourage checking of age entries at the earliest date; the second, to assess auto-generated data against the Camden/Westminster sample, for which the age of buildings has already been checked manually; the third, to explore ways in which feedback loops between historians, and data scientists could be supported and extended to include algorithm design. As Flora Roumpani has suggested, algorithms able to identify and classify buildings in the ‘static’ tissue, based on footprint shape and position, which do not match their neighbours would be very useful. However, though algorithms could potentially limit inaccuracies and narrow down the number of buildings needing to be manually verified, expert checking and enrichment of data categories, and the adding of links to sources of age information, will still be required.
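
One very simple form such an algorithm could take is sketched below in Python. It is a hypothetical illustration, not a method from the study: it uses footprint area as the only shape measure, assumes buildings are already ordered along a street, and flags any building whose area differs from the median of its immediate neighbours by more than a chosen tolerance. The window size and the 30% tolerance are arbitrary placeholder values.

```python
from statistics import median

def flag_mismatches(areas, window=2, tolerance=0.3):
    """Return indices of buildings whose footprint area differs from the
    median area of up to `window` neighbours on each side by more than
    `tolerance` (expressed as a fraction of that median)."""
    flagged = []
    for i, area in enumerate(areas):
        neighbours = areas[max(0, i - window):i] + areas[i + 1:i + 1 + window]
        if not neighbours:
            continue  # a single building has nothing to be compared with
        typical = median(neighbours)
        if typical > 0 and abs(area - typical) / typical > tolerance:
            flagged.append(i)
    return flagged

# Hypothetical footprint areas (square metres) along one terraced street;
# the fourth building is a much larger later replacement.
street_areas = [60, 62, 61, 140, 59, 60, 63]
print(flag_mismatches(street_areas))
```

Flagged buildings would be candidates for manual verification rather than automatic re-dating, in line with the point above that expert checking will still be required.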

Procedural/rule-based city models

As part of CCRP development, the procedural generation of building typologies using age data was investigated resulting in the following paper: The Use of Historical Data in Rule-Based Modelling for Scenarios to Improve Resilience within the Building Stock.

Procedural models are of particular interest to open building attribute mapping platforms in that they offer a relatively low-cost way of providing open 3D information on stocks (critical as buildings are not 2D objects like roads and pavements) and of creating dynamic simulations across time (see conceptual age model). They can also be used to rapidly generate interactive, 3D ‘What if?’ scenarios that visualise and simulate sustainable and unsustainable planning strategies (at local area and city level), and to adjust these scenarios at the click of a button.

However, to build such models, detailed rules of construction are needed for each building typology (and for all other relevant urban components), as well as access to relevant attribute data. To generate much more complex 4D ‘procedural’ models, which anticipate change to typologies over time, information on rules of mutation is also required.

Procedural models of buildings and cities were first developed in the 1990s by the film and computer-gaming industries, to enable large numbers of generic building types to be quickly and cheaply generated (Roumpani, 2018). In 2001, Yoav Parish and Pascal Müller, at the Swiss Federal Institute of Technology (ETH Zurich), published their idea for a software product called CityEngine, designed to rapidly generate 3D city models for development and planning purposes (Parish and Müller, 2001). The software has since become an Environmental Systems Research Institute (ESRI) product, integrated with ESRI’s ArcGIS software and widely used in urban masterplanning (ESRI, 2020).

Within ‘procedural’ models, rules of generation are applied to different types of urban component, e.g. types of street layout, land parcel/plot, and building. This process results in the parameterisation and abstraction of urban form (Parish and Müller, 2001). Procedural rules define the basic ‘grammar’ of the built type in question and encode how its parts relate geometrically, and the constraints placed on it (for example the maximum and minimum height a terraced house could ever be, the steepness or flatness of pitch its roof could ever have, or the possible ratio of floorspace to solid wall). This approach allows actual dimensions for each parameter, e.g. height or roof slope, to be added separately, and to be adjusted for a single example of the typology or, if required, for every example across the entire city, at the click of a mouse.
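
The separation between a typology’s constraints and the dimensions of individual examples can be sketched in Python as follows. This is an illustrative data structure, not CityEngine’s rule language, and the numeric limits for the terraced house are placeholder values chosen for the example rather than figures from the study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Range:
    """Minimum and maximum value a parameter may ever take."""
    low: float
    high: float

    def clamp(self, value):
        return max(self.low, min(self.high, value))

@dataclass(frozen=True)
class TypologyRules:
    """The 'grammar' of a typology: parameter names and their limits."""
    name: str
    height_m: Range
    roof_pitch_deg: Range

def instantiate(rules, height_m, roof_pitch_deg):
    """Create one building, forcing its dimensions inside the typology limits."""
    return {
        "typology": rules.name,
        "height_m": rules.height_m.clamp(height_m),
        "roof_pitch_deg": rules.roof_pitch_deg.clamp(roof_pitch_deg),
    }

# Placeholder limits, for illustration only.
TERRACE = TypologyRules("terraced house", Range(6.0, 15.0), Range(20.0, 50.0))

# Adjust a single example...
one_house = instantiate(TERRACE, height_m=9.0, roof_pitch_deg=35.0)

# ...or every example at once, e.g. raising all houses by one storey (3 m).
street = [instantiate(TERRACE, h, 35.0) for h in (8.0, 9.0, 14.0)]
raised = [instantiate(TERRACE, b["height_m"] + 3.0, b["roof_pitch_deg"]) for b in street]

print(one_house)
print([b["height_m"] for b in raised])
```

Because the limits live in the rules rather than in each building, a city-wide adjustment cannot push any example outside what the typology allows.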

3D procedural city models are, in this way, constructed, and operate, completely differently from most 3D digital models of cities, where form is extruded from footprints; where surfaces are individually detailed and rendered, and where focus is mainly on visualisation of the existing city, and of new schemes, and not on urban analytics (VU.CITY, 2021). Procedural city models are designed to be able to simulate how the city behaves not just how it looks. Built-in analytical tools can be used to assess energy consumption, density, floor space, plot-to-building ratio, etc. and models used to simulate scenarios of value in related policy design and implementation (Roumpani et al. 2018). It is hypothesised that 4D procedural models, if able to be built, will be able to be used to also simulate, analyse and forecast urban metabolism, and building survival and mortality rates.

Despite being described as ‘dynamic’ in operation, procedural models, to a large extent, treat cities as static systems, though, as noted below, certain rules regarding relative rates of change are applied. To mimic cities as systems in motion, dialogue between computer scientists developing procedural models and urban morphologists and historians (with knowledge of rules of change) will be necessary.

In their 2001 paper, Parish and Müller describe CityEngine as a ‘dynamic’ system in which all components interlink, and where global, hierarchical rules are applied to generate the abstract, elemental forms of buildings, plots and streets. Parameters are easily controlled by sliders (ibid.), allowing, as shown in Figure 5.1A, the dimensions of a typology to be quickly adjusted while its underlying form remains the same. Here, step height and number of columns are shown adjusted in CityEngine for a Roman temple typology. In Figure 5.2A, the ability to adjust façade features using sliders within an ESRI CityEngine demo is illustrated.

Parish and Müller describe the speed with which 3D digital models of cities can be built using this method, if data and knowledge of typology rules are available. For a procedural model of Manhattan containing approximately 13,000 buildings (built c.2001), ‘the creation of the street graph took less than ten seconds, the division into lots and the creation of buildings approximately 10 minutes’ (Parish and Müller, 2001, p.7). They also explain how procedural approaches allow cities to be broken down into small, simple generic parts, represented by typologies, which when combined can demonstrate and depict in 3D the diversity and complexity of urban systems (ibid.). As discussed in Chapters 3 and 4, this modular, step-by-step approach has also been applied to the study of urban complexity within a variety of studies. Importantly, Parish and Müller also refer to the employment of a hierarchical set of rules within procedural models, relating to perceived relative rates of change of components, with networks, land use and housing needing to be set down first as these are the slowest-changing elements in the urban environment (ibid.). The authors note, however, that rules relating to the evolution/development of cities still need to be addressed.

Research into typology classification, including use of computational approaches to abstract and parameterise form, has been carried out for more than half a century, within building science, in order to better understand the relationship of building form to building performance. From the late 1960s, work on the classification of building typologies began to be undertaken at the Centre for Land Use and Built Form Studies (LUBFS), at the University of Cambridge, from the early 1970s with the aid of computers (Steadman, 2016). A key aim was to move towards a better understanding of the city as a complex system and towards a science of form, able to underpin the design process with scientific understanding and tools (ibid.).

At the time of LUBFS’ foundation, many areas of academia had begun to test models as tools for research, to explore abstract concepts (Hawkes, 2017). In 1967, the biologist Peter Medawar, in his book The Art of the Soluble, observed how ‘in all sciences we are being progressively relieved of the burden of singular instances, the tyranny of the particular. We need no longer record the fall of every apple’ (ibid. p. 149). In the same year, Jane Jacobs, in her lecture to the Royal Institute of British Architects, was proposing an alternative approach, in which cities were studied as complex systems, for which information on the particular was essential (Jacobs, 1967). In 1987, Steadman and Frank Brown published A Computerised Database of Contemporary House Plans, in which the concept of all architectural plans lying within a ‘world of technical and functional possibility’ was described (Brown and Steadman, 1987). This built on March’s earlier work into the binary coding of internal plans of rectangular buildings, and Steadman’s research with W.J. Mitchell and Robin Liggett into the complete range of possibilities for the layout of small rectangular building plans. Steadman describes the way in which, as in Caniggia and Maffei’s work and in procedural models, ‘underlying forms are kept the same while their dimensions are varied’, though here ‘in order to study the relation of form to performance’ (Steadman, 2016, p.301). Brown and Steadman described the development of a ‘design grammar’ for the British dwelling stock, using three common dwelling typologies as examples: the post-First World War local authority ‘cottage’, the 19th-century Victorian terraced house, and the private, interwar semi-detached house, the latter two estimated to comprise (at the time) over two-thirds of the housing stock (Brown and Steadman, 1987).
Individual ‘grammars’ were produced for each, drawing on historical information including housing literature, building regulations and byelaw information, and building plans. Difficulty in accessing relevant historical data owing to the ‘nature of the historical evidence.. [being] dispersed and often not easily accessible’ was noted (ibid. p.410); ‘historical reconstruction is not easy’ (ibid., p.437). Permutations for the specific building types were then generated computationally based on rules for internal room arrangements and constraints imposed on each dwelling typology, to build up the fullest picture possible of the relationship between age and form and a building’s capacity to improve energy performance.

In Steadman et al.’s paper A Classification of Built Forms (Steadman et al, 2000), steps required to abstract and parameterise 3D typologies are also described. The importance of access to large-scale building attribute databases to support this process is noted (with data from 3350 non-domestic addresses from four UK towns used within the study) (ibid.). Geometrical form and land use activity were first separated out, owing to the range of uses that could be accommodated in base building types. Typology classification involved the generation of axonometric drawings for 400 buildings, with minor details removed. Buildings were then separated into primary and ‘parasite’ forms, the latter including basements, large balconies, small single storey extensions and occupied pitched roofs. Form was then represented without dimensions (for example the basic form of a shed is broken down into four parameters: length, height, width and roof slope) (ibid.). Steadman et al. note that through the process of studying attribute relationships, floor space, size, activities, construction materials, service systems, and roof geometry were all found to be able to be inferred. The study is also interesting in its discussion of parasitic forms, with which some overlaps with Kostourou’s dynamic mechanisms can be seen, and in the authors’ comments on the importance of quantitative studies, ‘devoted not to theoretical forms but to a mass of empirical data on existing buildings’ (ibid., p.90). For these, they argued, databases are required, ‘in which the geometry and construction of very large numbers of actual buildings are fully and consistently represented along with the patterns of activity which they accommodate’ (ibid.).
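
The idea of a dimensionless form from which further attributes can be inferred once dimensions are supplied can be illustrated with the shed example. The Python sketch below takes the four parameters named above and derives floor area and enclosed volume. The geometric reading is an assumption made for illustration: height is treated as eaves height, the roof as a symmetric gable with its ridge running along the length, and roof slope as an angle in degrees. The class and method names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Shed:
    """A shed described by the four parameters length, width, height, roof slope."""
    length_m: float
    width_m: float
    eaves_height_m: float   # assumption: 'height' means height to the eaves
    roof_slope_deg: float   # assumption: symmetric gable roof, ridge along the length

    def floor_area(self):
        return self.length_m * self.width_m

    def ridge_rise(self):
        """Height of the ridge above the eaves."""
        return (self.width_m / 2) * math.tan(math.radians(self.roof_slope_deg))

    def volume(self):
        """Box below the eaves plus the triangular prism under the roof."""
        return self.floor_area() * (self.eaves_height_m + self.ridge_rise() / 2)

# The same underlying form with two different sets of dimensions.
small = Shed(length_m=20.0, width_m=10.0, eaves_height_m=4.0, roof_slope_deg=45.0)
large = Shed(length_m=60.0, width_m=30.0, eaves_height_m=8.0, roof_slope_deg=10.0)

for shed in (small, large):
    print(round(shed.floor_area(), 1), round(shed.volume(), 1))
```

Attributes such as floor space are derived from the form and its dimensions rather than stored separately, which is the sense in which they can be inferred.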

In 2010, Steadman and Linda Mitchell published their work on ‘Morphospace’, in which binary coding was used to map ‘distinct “dimensionless” configurations for built forms’ (Steadman and Mitchell, 2010, p.197). This tool was shown to be able to group typologies within a 2D space to reveal underlying mathematical relationships capable of explaining similarities in 3D form, and is also of significance in large-scale open data generation. Here forms are mapped which both can, and cannot, be found in real life. From the 1990s, knowledge gathered from these studies began to be applied, at UCL Energy Institute, to the development of 3DStock, a 3D digital model of the UK’s building stock. This covers large areas of England and Wales, with models of cities automatically assembled (using, primarily, OS and VOA data), in order to provide information on the relationship between built form, density and energy performance within stocks (Steadman et al., 2020; Evans et al., 2017). Out of this, the 3D London Building Stock Model, built by Steve Evans, was produced at UCL Energy Institute. Though the model is restricted, a considerable amount of consultation and collaborative work was undertaken with the UCL team, including the donation of inferred datasets produced through automatic methods, such as building adjacency, which was included in Colouring London's typology category.

Between 2012 and 2018, as part of her doctoral thesis, Flora Roumpani began work on Procedural London, using CityEngine software and following procedural methods set out by Parish and Müller (Roumpani, 2018). Her interest lay in the fact that procedural models do not attempt to represent something in real life but instead create an abstracted, approximate representation of an object’s form, which has the potential to both simulate and anticipate behaviour (ibid.). She was particularly interested in the way in which multiple variables for cities could be layered and analysed simultaneously, at low-cost, and the way the impact of planning policies could be rapidly analysed and visualised in 3D through ‘What if?’ planning scenarios at street and city scale.

In 2016, Roumpani and the author started to work together to explore ways in which an open building attribute platform built for London could possibly assist the development of Procedural London, and link closely with it, to advance UK and NUA sustainability and resilience goals. Roumpani began to look at the value of building age, and original and current land use data, manually captured by the author for over 20,000 buildings, (described in Chapter 6), in the generation of 3D procedural typologies, and the geolocation of these typologies in London.

To test the value of these specific types of data, and to identify other data types that might need to be collected within open data platforms to support procedural model development, a single typology, the mid-19th-century Victorian terraced house, was first selected. This was chosen by the author for being typical of buildings found in ‘static’ tissue not only in London but also in other UK cities. The same typology was also chosen for Brown and Steadman’s 1987 study. It was also described by William Howell, in his comments in 1967 on Jacobs’ RIBA lecture, as a long-lasting and highly adaptable typology; indeed he criticised fellow architects for being ‘totally incapable’ of designing an equivalent (Howell, 1967). Similar kinds of historical information to those in Brown and Steadman’s study, including regulatory frameworks and byelaws, were accessed by the author to help Roumpani build a picture of the unique combination of characteristics making up the mid-Victorian terraced typology, i.e. its ‘grammar’, necessary to generate its procedural rules (Roumpani et al., 2018).

Owing to the availability (from the author) of a relatively large sample of building-level data for the London Borough of Camden, providing information on original land use and building age, Roumpani was able to rapidly geolocate all domestic buildings built in the mid-Victorian period within this area of London. 3D form was then generated by studying the typical geometrical relationship between the footprints of these buildings and their plot size, distance from the street, etc. Through a step-by-step process, shown in Figure 5.3A, rules for the creation of the generic 3D typology form were gradually developed.

To check the accuracy of these rules, all possible permutations of the typology were generated computationally from them, examples of which are shown in Figure 5.4A. Expert knowledge of the typology was then applied, by architectural historians at the Survey of London, to eliminate variations that could not have existed, or were never known to exist. Rules were adjusted accordingly and a new set of permutations generated until the closest match with actual examples was achieved. (This also required checks against Google Street View images of actual mid-Victorian buildings.) Roumpani then integrated the typology rules into the Procedural London model along with procedurally generated streets and plots, which were then checked against actual block configurations.
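The generate-and-filter loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the CityEngine rules used in Procedural London: the attribute names and values in `OPTIONS` and the exclusion rules in `EXCLUSIONS` are hypothetical placeholders for the typology ‘grammar’ and for the experts' review, and carry no historical claim.

```python
from itertools import product

# Hypothetical attribute options for a terraced-house typology.
# Names and values are illustrative only, not taken from the study.
OPTIONS = {
    "storeys": [2, 3, 4],
    "bay_window": [False, True],
    "roof": ["butterfly", "pitched"],
    "basement": [False, True],
}


def generate_permutations(options):
    """Return every combination of attribute values as a list of dicts."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*options.values())]


# Hypothetical exclusion rules standing in for expert review. Each rule
# returns True for a variant that reviewers would strike out.
EXCLUSIONS = [
    lambda v: v["storeys"] == 2 and v["basement"],
    lambda v: v["roof"] == "butterfly" and v["bay_window"],
]


def filter_plausible(variants, exclusions):
    """Keep only the variants that no exclusion rule rejects."""
    return [v for v in variants if not any(rule(v) for rule in exclusions)]


variants = generate_permutations(OPTIONS)
plausible = filter_plausible(variants, EXCLUSIONS)
print(f"{len(variants)} generated, {len(plausible)} kept after review")
```

In this sketch, adjusting the rules and regenerating corresponds to editing `OPTIONS` or `EXCLUSIONS` and rerunning the two functions until the surviving variants match observed examples.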

Through this process, Roumpani and the author concluded that roof shape, plot shape, window arrangement, floor height, internal layouts and structural system could all potentially be generated procedurally, quickly and cost-effectively, for many building typologies, using a limited number of attributes, including age and original land use. This, however, relied on expert advice from historians and/or urban morphologists. Procedural models were also found, in turn, to be able to generate many types of open building attribute data at city scale, which could be released within open platforms. The potential was also seen to use procedural models to generate age simulations (as discussed in Chapter 3), and to help target appropriate retrofit measures for specific typologies with greater precision. ‘Static’ tissue was identified through the study as the ideal tissue type to model first, owing to the dominance of residential buildings in cities, the stability of the tissue type, its repetitiveness, the fact that extensive knowledge of it already existed, and the fact that 3D dynamic mechanisms suited to procedural modelling had already been developed.

The step-by-step process of procedurally modelling the mid-Victorian house was recorded in a joint paper, written by Roumpani and the author, entitled ‘The Use of Historical Data in Rule-Based Modelling for Scenarios to Improve Resilience within the Building Stock’ (Roumpani et al., 2018). This was published in 2018 in a special edition of Historic Environment: Policy and Practice (ibid.), along with a second paper, by the author, which summarised problems and opportunities relating to building attribute capture, and introduced the concept of open data building attribute platforms (Hudson, 2018).

In June 2018, a workshop was run at the Centre for Advanced Spatial Analysis (CASA), UCL. It was designed to bring together academics involved or interested in co-developing new approaches, tools and datasets to support the analysis and visualisation of cities as dynamic systems, of use to resilience and sustainability research, and in exploring the value of 3D and 4D dynamic procedural models.

The workshop was attended by Robert Hecht, Hendrik Herold and Matthias Kalla from the Leibniz Institute, IOER, Dresden; Philip Steadman, Steve Evans and Rob Liddiard from UCL Energy Institute/3D Stock; Amy Smith from the Survey of London; and Flora Roumpani and Elsa Arcaute (working on urban complexity) from CASA. It brought together expertise in Computer Vision and machine learning, historical research, sustainability science, building classification, 3D energy modelling, complexity science and procedural modelling.

As part of the workshop, Matthias Kalla, a master’s student at IOER seconded to work with the author on a nine-week placement, presented findings from a short study exploring the feasibility of generating rules of mutation for typologies to create a 4D procedural model for London. The Georgian terraced typology (1714–1836), an earlier variant of the terraced house, was selected. The purpose of Kalla’s study was to identify what types of data would be required to generate rules for mutations to this typology. The Georgian terraced house was chosen by the author because it had been noted during data sample collection (discussed in Chapters 6 and 7) to be subject to a pattern of mutation in which pure domestic use converts to mixed use along the ‘elastic’ tissue, with domestic use retained above and retail below. This had commonly been found to occur in high streets and also appeared to follow burgage cycle rules, except that designation commonly prevented the demolition stage from occurring. The attraction between residential and retail land use, as mentioned above, had also been noted by Stanilov and Batty. The author was particularly interested in the way that properties of ‘elastic’ routes appeared to cause the Georgian terraced typology to ‘react’ and mutate in a specific way, and in whether particular characteristics of routes alone (as defined using Scheer’s dynamic classifications) could be used to auto-generate data on specific typologies relating to their attributes and rules of mutation. This issue is discussed further in Chapter 7.

The brief assessment carried out by Kalla was overseen by the author and undertaken in close consultation with the Survey of London, Roumpani and IOER. It involved the analysis of eight hundred Georgian terraced houses in the London Borough of Camden, built between 1714 and 1839, geolocated using the author’s age and land-use datasets. Historical maps and current OS data were accessed from EDINA Digimap, and Google Street View was also used. In Figure 5.1 (2A), the Georgian typology and its typical mutation along the ‘elastic’ tissue are illustrated. Front gardens can be seen replaced by single-storey retail extensions, with plots shown densified incrementally. This process allowed procedural rules for the mutation of Georgian houses along the ‘elastic’ tissue to begin to be formulated in collaboration with Roumpani. Tables, shown in Figure 5.2 (2A), were then generated to identify the types of 2D and 3D attribute data necessary to create the base form for the Georgian terraced typology.
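The mutation pattern described above (domestic ground floor converting to retail along ‘elastic’ tissue, with the front garden replaced by a single-storey extension and domestic use retained above) can be expressed as a simple rule. The following Python sketch is illustrative only: the `Building` record, its field names and the `mutate` function are hypothetical, and the rules formulated with Roumpani may differ substantially.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Building:
    """Hypothetical building record; field names are assumptions."""
    typology: str
    tissue: str                 # 'elastic' or 'static'
    ground_floor_use: str       # 'domestic' or 'retail'
    upper_floor_use: str
    has_front_garden: bool
    has_retail_extension: bool = False


def mutate(building):
    """Apply one illustrative mutation step to a Georgian terraced house.

    On 'elastic' tissue a purely domestic house becomes mixed use: the
    ground floor turns to retail, and any front garden is replaced by a
    single-storey retail extension. Upper floors are left unchanged.
    Buildings that do not match the rule are returned as they are.
    """
    if (
        building.typology == "georgian_terrace"
        and building.tissue == "elastic"
        and building.ground_floor_use == "domestic"
    ):
        return replace(
            building,
            ground_floor_use="retail",
            has_retail_extension=building.has_front_garden,
            has_front_garden=False,
        )
    return building


house = Building("georgian_terrace", "elastic", "domestic", "domestic", True)
mutated = mutate(house)
print(mutated)
```

A fuller rule set would need a time dimension and further stages of the cycle; this sketch covers only the single conversion step described in the text.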

This was also compared with the attribute data used in Victorian typology generation, and the likelihood of these attributes changing over time was considered. Data identified as being required included current and historical building footprints, plot outlines, street centrelines and current street-view images, as well as historical information on building history and regulatory frameworks/domestic building rates. Challenges in deriving accurate height data from LiDAR, and in dealing with the extent of variation in roof design, were noted. Kalla, working with Hecht and Herold, also identified correlations between attributes, with the number of storeys (modal value 4) found to relate to frontage width, extent of plot coverage, and footprint area. An agglomerative hierarchical clustering method was applied to group buildings from the bottom up by their properties alone, to identify the most common variations, as shown in Figure 5.3 (2A).

During the workshop it was concluded that many advantages could be gained by sharing knowledge and expertise within the group on the form of buildings and change to them over time, and on computational approaches to attribute generation. This would serve not only to advance the development of 3D and 4D procedural models, and to support content development for open building attribute data platforms, but also to advance transdisciplinary research in many areas of common interest relating to stock longevity, sustainability and resilience.

Kalla’s classification of Georgian terraced house footprints using an agglomerative hierarchical clustering method (Courtesy Matthias Kalla and IOER/TU
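For readers unfamiliar with agglomerative hierarchical clustering, the bottom-up grouping can be sketched in plain Python. This is not Kalla’s implementation: the linkage criterion (average linkage here), the standardisation step, the choice of two clusters and the sample values are all assumptions made for illustration. The three attributes used (frontage width, plot coverage, footprint area) are those named in the text.

```python
from math import dist


def standardise(rows):
    """Scale each column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = []
    for col, mean in zip(columns, means):
        std = (sum((x - mean) ** 2 for x in col) / len(col)) ** 0.5
        stds.append(std if std > 0 else 1.0)  # guard against constant columns
    return [
        tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows
    ]


def average_linkage(a, b, points):
    """Mean pairwise Euclidean distance between two clusters of indices."""
    total = sum(dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Start with one cluster per point; merge the closest pair repeatedly."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(clusters[x], clusters[y], points)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Invented sample: (frontage width m, plot coverage, footprint area m2).
footprints = [
    (5.0, 0.40, 60.0), (5.2, 0.42, 62.0), (5.1, 0.41, 61.0),
    (7.5, 0.60, 110.0), (7.8, 0.62, 115.0), (7.6, 0.61, 112.0),
]
groups = agglomerate(standardise(footprints), n_clusters=2)
print(sorted(sorted(g) for g in groups))
```

The pairwise search is quadratic in the number of clusters at every merge, so for a sample of several hundred buildings a library implementation would be the more practical choice.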

2023 update: Work on automated vectorisation of historical maps is currently advancing through a collaboration between the Alan Turing Institute, IOER Dresden, École Polytechnique Fédérale de Lausanne and the National Library of Scotland, coordinated via the Alan Turing Institute's Computer Vision for Digital Heritage Special Interest Group. The integration of procedural models and typology into CCRP platforms is currently being discussed with Dr Flora Roumpani and Concordia University, Montreal.