Grid geometry files - acroucher/PyTOUGH GitHub Wiki

Model geometry

TOUGH2 uses a flexible 'finite volume' formulation, in which the model grid is described only as a set of blocks and the connections between them. The blocks are given volumes, and the connections are assigned areas and distances, but nothing is referenced to any particular coordinate system.

For grid generation, and also for post-processing of model results, a coordinate system is usually needed, so that we know where the model grid blocks would be in the real world. For this it is often convenient to have a separate geometry file which defines the grid geometrically in space. This can then be used to generate the grid block volumes and connection data that TOUGH2 needs, and also to plot the model results spatially.

MULgraph geometry files

AUTOUGH2 (the University of Auckland version of TOUGH2) is usually used in conjunction with a MULgraph geometry file containing the geometric description of the model grid. This is named after MULgraph, the graphical interface for TOUGH2 developed at the University of Auckland in the 1990s. However, the MULgraph geometry file format can be used independently of MULgraph itself.

The MULgraph geometry file format assumes a layered grid structure in the vertical direction, but allows arbitrary unstructured polygonal columns in the horizontal. It can represent surface topography by allowing blocks in the top layer to have different heights, or blocks in the uppermost layers to be completely absent. A simple representation of well tracks can also be included in the MULgraph geometry file.

MULgraph geometry files can also be used in conjunction with TOUGH2, as well as AUTOUGH2. Conversely, most of PyTOUGH can still be used without using MULgraph geometry files, if preferred.

The PyTOUGH `mulgrids` module

PyTOUGH includes a mulgrids module for handling MULgraph geometry files in Python scripts. You can import it (like any other Python library) using the import command, e.g.:

from mulgrids import *

The `mulgrid` class

This module defines a mulgrid class, which represents the entire contents of a MULgraph geometry file. The definition of this class specifies what properties and methods a mulgrid object will have.

Its properties include things like the geometry's nodes (vertices), columns and layers, connections between the columns, surface elevations, and wells- in fact, everything present in the geometry file.

It has methods for carrying out tasks like reading from and writing to disk, generating simple irregular rectangular 3-D grid geometries, checking for errors, as well as manipulations such as translation, rotation, local grid refinement and fitting surface elevations from topographic data.

As for any Python class, an empty mulgrid object can be created by calling the name of the class itself as a method. For example:

geo = mulgrid()

creates an empty mulgrid object called geo. Often, however, we want to create a mulgrid object that is not empty, but has its contents read from a MULgraph geometry file on disk. This can be done using a command like:

geo = mulgrid('geom.dat')

In this case, the contents of the geometry file 'geom.dat' will be read into the object geo.

Printing a mulgrid object (by typing, for example, print geo, or just geo) will print a quick summary of the vital statistics of the grid geometry- including how many nodes, columns, layers etc. it contains.

The properties of a `mulgrid` object

Header information

At the top of a MULgraph geometry file there is a collection of 'header' information with general properties of the grid geometry. This information is represented in the mulgrid object by the following properties: type, convention, atmosphere_type, atmosphere_volume, atmosphere_connection, permeability_angle and unit_type. Details of these properties can be found in the PyTOUGH User's Guide.

As an example, if we have a geometry object called geo, as above, then we can find its atmosphere type by typing geo.atmosphere_type. This will return an integer, corresponding to the type of atmosphere blocks present in the model: zero for a single atmosphere block, 1 for an atmosphere block over each column, or 2 for no atmosphere blocks.

Nodes, columns, layers, wells and connections

The properties of a mulgrid object that are usually of most interest are those corresponding to its collections of nodes (vertices), columns, connections, layers and wells. Although these are all quite different types of objects, they are stored in a mulgrid object, and hence accessed, in a similar way.

The `column` and `columnlist` properties

Take columns, for example. The column property of a mulgrid object is a dictionary containing all the grid columns, accessed by column name. So if we have a geometry object geo, the column with the name 'abc' is accessed by

geo.column['abc']

This is useful if we want to access columns by name. Sometimes, we want to access them by index instead. For this, there is the columnlist property, which again provides access to the columns, but in the form of a list instead of a dictionary, enabling access by index. So for example the 100th column, which in Python will have the index 99 (as indices start from zero), can be accessed by

geo.columnlist[99]

The most common use of this is when we wish to loop over all the columns in the grid, in order:

for col in geo.columnlist:
    # do something with col

What is a column?

What sort of thing are we getting when we type geo.column['abc']? It is in fact another kind of Python object, an instance of the column class defined in the mulgrids module. This class represents columns in a MULgraph geometry file, so it has properties like name, centre, surface, and node, which is a list of the nodes at the corners of the column. So, for example, we can find the surface elevation of our column named 'abc' simply using

geo.column['abc'].surface

which will return a floating point number representing the surface elevation at that column.

The column class also contains other properties which don't correspond to anything present in the MULgraph geometry file, but are derived from the data there. For example, the area property returns the cross-sectional area of the column. So the following line:

[col.name for col in geo.columnlist if col.area < 1000 and col.surface > 500]

returns a list of the names of all the columns in our grid with areas less than 1000 square metres, and surface elevations above 500 m.

The column class also has its own methods. For example, you can find out if a point with horizontal grid coordinates p (a 2-D array) is inside a particular column using the contains_point() method:

if geo.column['abc'].contains_point(p):
    # do something

Nodes, layers and wells

There are also node, layer and well properties in a mulgrid object which are dictionaries of these objects, accessed by name. Just as for columns, there are corresponding lists (nodelist, layerlist and welllist) for accessing them by index.

In each case, these properties return objects of type node, layer or well, which are all classes defined in the mulgrids module.

A node object is relatively simple, as it just represents a named vertex in the grid. It has name and pos properties which are its label (a string) and horizontal position (a 2-D array). It also has a column property which is a set containing all the columns the node belongs to.

layer objects also have a name property, and also bottom, top and centre properties (all floats) representing their vertical location. (The name properties of nodes and layers are either 2 or 3 characters long, depending on the naming convention of the MULgraph grid- i.e. the value of its convention property.) For convenience, layer objects also have a thickness property which is just the difference between the top and bottom of the layer.

A well object can represent a deviated well track in the model. Besides its name, its main property is pos which is a list of 3-D points containing the nodes of the track. Even a vertical well needs two points to define its top and bottom. For any well, a well object has some convenience properties such as head which is the first element of pos, bottom which is the last element and num_deviations (the number of line segments in the well track) which is just one less than the num_pos property . A well object also has methods depth_elevation() and elevation_depth() for converting between downhole depths (along the well track) and elevations, and elevation_pos() and depth_pos() for finding the 3-D positions of points at a specified elevation or depth.

Connections

As for columns, nodes etc., a mulgrid object also has connection and connectionlist properties which identify which columns are connected horizontally. Each connection is an instance of the connection class defined in the mulgrids module. Its main property is column which is a two-element list of column objects. It also has a node property which is a list of the nodes in the face connecting the two columns.

The main difference between connections and columns etc. is that the connection dictionary property of a mulgrid is indexed not by single names (like column names) but by pairs of names. So if we have a mulgrid object geo which contains two connected columns named '991' and '983', we can access the connection between them using:

geo.connection['991','983']

This will return the connection between the specified columns, provided that such a connection actually exists in the grid. Note also that the column names in the pair must be in the right order (the same way they are defined in the 'CONNECTIONS' section MULgraph geometry file). If the pair is given in reverse order (e.g. geo.connection['983','991'] above) then an error will result, because no such connection exists.

Block name properties

A mulgrid object also has some properties related to the blocks it will contain when used to create a TOUGH2 grid.

The block_name_list property is a list of all the block names in the geometry's corresponding TOUGH2 grid. This is not necessarily a simple product of the column and layer names, because varying surface topography can cause some of the upper blocks to be omitted. The block names in this list are in the same order as they will appear in a TOUGH2 data file.

There is also a num_blocks property which is just the length of this list. In addition, there are num_underground_blocks and num_atmosphere_blocks which return the numbers of non-atmosphere and atmosphere blocks respectively (the sum of which will add up to num_blocks). The number of atmosphere blocks will depend on the atmosphere_type property- it could be equal to zero, one or the number of columns.

The block_name_index property is a dictionary, indexed by block name. It returns the index of the block name in the block_name_list list.

Boundary properties

Sometimes it is useful to be able to identify the horizontal boundary of the grid geometry- for assigning boundary conditions, for example. A mulgrid object has some properties for this.

If we just need to know the horizontal extent of the grid geometry, the bounds property returns a two-element list containing the coordinates of the bottom-left and top-right corners of the 'bounding box' of the grid (each represented as a 2-D array). The boundary_polygon property returns the simplest possible polygon (a list of 2-D arrays) representing the outer boundary of the grid geometry. This is formed from the coordinates of the grid nodes on the boundary, but intermediate points along straight sides of the boundary are removed.

There are also two properties for identifying grid nodes and columns along the outer boundary. The boundary_nodes property returns an ordered list of nodes along the boundary, while the boundary_columns property returns a set containing the columns along it.

What can a `mulgrid` object do?

Some of the main methods of a mulgrid object are outlined below.

Reading and writing

The contents of a MULgraph geometry file can be read into a mulgrid object using its read() method. If we have created a mulgrid called geo and we have geometry file named 'geom.dat', we can read it in using:

geo.read('geom.dat')

However, usually we just want to create the mulgrid object and read its contents from file at the same time. As we have seen above, this can be done simply by:

geo = mulgrid('geom.dat')

If we have edited a mulgrid object- for example, rotated it or fitted some surface topography- we will want to be able to write the altered mulgrid object to disk. For this, we can use the write() method. For example:

geo.write('altered_geom.dat')

This will create a new geometry file 'altered_geom.dat' and write geo to it. If we have read a geometry file in, and want to write it back out to the same file (having altered it in the meantime), we can omit the file name:

geo.write()

Creating rectangular grids

The rectangular() method of a mulgrid object can be used to create general regular or irregular rectangular grids. The sizes of the grid blocks in each of the x, y and z directions are specified as lists or arrays. For example, the command

geo = mulgrid().rectangular([100] * 20, [150] * 25, [10] * 15)

creates a geometry geo with 20 blocks in the x direction, each 100 m in size, 25 blocks in the y direction, each 150 m in size, and 15 layers, each 10 m thick. The syntax here makes use of the multiplication operator * for Python lists, which appends copies of a list onto the end of itself, so you don't have to write out the whole list.

Irregular rectangular grids can be constructed just by passing to the rectangular() method lists of block sizes that are not uniform. For example, passing as the first argument [100] * 5 + [50] * 20 + [100] * 5 gives a grid where the block sizes in the x direction are 100 m in size at the edges of the grid and 50 m in the middle.

You can also create more sophisticated sequences of block sizes using the mathematical functions in the Numerical Python library. For example, you can use np.logspace(1, 3, 100) to create an array of 10 sizes that increase logarithmically from 10¹ to 10³ m.

Besides the block sizes in each direction, the rectangular() method has other optional parameters for controlling things like the naming convention, the atmosphere type and the grid origin (grids are by default created with bottom left corner at (0, 0) and the top surface at elevation 0).

Fitting surface topography

The fit_surface() method can fit surface topographic data to a geometry object. Each column has a surface property, which represents its surface elevation. The fit_surface() method takes an arbitrary scattered data set of (x,y,z) points and uses it to calculate the best fit surface value for each column. This can be applied to any structured (e.g. rectangular) or unstructured grid, as long as it does not contain columns with more than four sides. It uses a least-squares bilinear finite element fitting procedure with a smoothing term, which is needed if the data set does not contain any points in some parts of the grid.

The main input parameter to the fit_surface() method is the topographic data, which is passed in as an array of 3-D points, one data point (x,y,z) per row. If you have data in a text file in that format it can be read straight into an array using the Numerical Python loadtxt() function, like this:

topo = np.loadtxt('topo.txt')
geo.fit_surface(topo)

If you have areas of the grid where there are no data (or possibly if your data set is very noisy), you will need to specify smoothing. There are two smoothing parameters, alpha and beta. The first controls the gradients in the fitted surface, so specifying a higher value will result in a flatter surface. The second controls curvature in the fitted surface (the fitted bilinear surface on a quadrilateral column can have curvature), so specifying a higher value will result in surface with less curvature. The values you need will depend on various factors including the grid column sizes, so in general it's a good idea to try a few values until you get a good compromise between following the data closely and avoiding oscillations. Normally the gradient parameter alpha is more important than the curvature parameter beta.

You can also fit only a selected part of the grid (leaving the rest alone) using the columns parameter. For example we could fit only columns within a certain radius r of a selected point p like this:

fit_cols =  [col for col in geo.columnlist if np.linalg.norm(col.centre - p) <= radius]
geo.fit_surface(topo, columns = fit_cols)

Depending on the data and the layer structure of the grid, the fitted surface may at times be very close to the boundary between two grid layers. This can result in very thin surface blocks on top of the model, which TOUGH2 may not like. To avoid this, use the layer_snap parameter, which specifies a minimum surface block thickness. If the difference between a column surface and a layer boundary is less than this value, the column surface will be 'snapped' to the elevation of the bottom of the layer, eliminating any very thin surface blocks. For example:

geo.fit_surface(topo, layer_snap = 1.0)

ensures that no surface blocks will have thicknesses less than 1 m.

Translation and rotation

mulgrid objects can be translated and rotated using the translate() and rotate() methods.

The translate() method takes a single parameter, a three-element list or array with the required shift values in each direction (x,y,z). So for example we can shift a geometry object geo 1 km west using:

geo.translate([-1.e3, 0., 0.])

or 2 km north and 500 m up using:

geo.translate([0., 2.e3, 500.])

The rotate() method rotates grids in the horizontal plane and has two parameters, angle and centre, so we can rotate the grid any desired angle about any centre point. If the centre (a two-element list or array) is not specified, it is taken as the horizontal centre of the grid. The angle parameter is in degrees clockwise, so:

geo.rotate(45)

rotates the grid 45 degrees clockwise about its centre.

For both the translate() and rotate() methods, wells in the geometry are by default not rotated with the grid. For either method you can set the wells parameter to True if you want the wells to be rotated also.

Searching

A mulgrid object has several methods available for searching for particular nodes, columns or blocks in the grid geometry.

Some of the other mulgrid methods (e.g. fit_surface(), refine()) take a particular area of interest in the grid geometry, and carry out some operation on the columns in that area. Often it is convenient to define such an area using a polygon (a list of points), so a mulgrid object has a columns_in_polygon() method which takes a polygon and returns a list of all the columns with centres inside it. The nodes_in_polygon() method does a similar thing for nodes.

In some situations, we want to know which column, layer or block in the model grid contains a particular point. There are several mulgrid methods for doing this. The column_containing_point() method takes a horizontal position (a two-element array) and returns the column that contains it (if the point is inside the grid). Similarly, the layer_containing_elevation() method takes an elevation and find the layer that contains it, and the block_name_containing_point() combines these two, taking a 3-D point and returning the name of the grid block containing it.

It is also possible to search for nodes. The node_nearest_to() method takes a horizontal position and returns the grid node nearest to that point.

A note on speed: for an arbitrary, possibly irregular unstructured grid, the column_containing_point() method has a reasonable amount of work to do, to identify the correct column. If you are calling it a large number of times (e.g. looping over a data set with millions of points and identifying which column each one is in), its default behaviour may be too slow (particularly if the grid is large). There are several extra parameters for speeding it up. In many cases the most effective one is the qtree parameter, which specifies a 'quadtree', a special data structure for searching the grid. The mulgrids module contains a Python class for quadtrees. You can construct a quadtree for your grid, before doing any searching, using the column_quadtree() method of a mulgrid object, as in the following example:

cols = []
qt = geo.column_quadtree()
for p in data:
    col =  geo.column_containing_point(p, qtree = qt)
    cols.append(col)

The column_quadtree() method can take an optional parameter specifying a list of columns to search in, if you know your points are clustered in one part of the grid.

Local grid refinement

Sometimes it is useful to include more detail in the model grid around an area of interest, for example a borefield. The refine() method of a mulgrid object can carry out local grid refinement of the columns inside a specified area. By default, each of the columns is divided into four, and triangular columns are added as needed in the 'transition' region around the edge of the refined region. (If the bisect parameter is set to True, columns are not divided into four but bisected instead, between their longest sides.)

The main input for this method is the columns parameter, which is a list of the columns (or column names) inside the region to be refined. (If this is not specified, the entire grid is refined.) So for example if we have a polygon p (a list of 2-D points) representing the boundary of the region to be refined, we can refine the columns inside it using:

refine_cols = geo.columns_in_polygon(p)
geo.refine(refine_cols)

Some care should be taken in refining grids this way. TOUGH2 generally performs less well if the grid is of lower 'quality', e.g. if the columns have large aspect ratios, or are very skewed, or the lines connecting block centres are far from perpendicular to the block faces. If the original unrefined grid is of low quality to begin with, the refined one (particularly in the transition zone) will often be worse. Grid quality in the transition zone can often be improved by using the bisect_edge_columns parameter.

Checking for errors

A mulgrid object can check itself for errors (and optionally fix them) using the check() method. The errors the check() method can find are:

missing connections between adjacent columns
duplicate connections
'orphaned' nodes (that don't belong to any column)
columns with centres specified outside themselves
layers with centres specified outside themselves

This method returns a value, either True if no errors were found or False if the geometry has problems. If there are problems it will also print out their details.

The fix parameter of this method can be used to fix any problems that are found, so for example:

geo.check(fix=True)

checks the object geo and fixes any errors found. The fix parameter is set to False by default.

Visualization

There are several mulgrid methods for producing graphical output from a grid geometry, both visualizing the grid itself and data sets defined on the grid. These fall into two kinds: 2-D and 3-D visualization.

2-D visualization is carried out using another Python library called Matplotlib, a library for general two-dimensional plotting. You will need to have Matplotlib installed on your computer before you can use the mulgrid 2-D visualization methods. The two main methods, layer_plot() and slice_plot(), are used to produce plots at a particular grid layer, or along a vertical slice through the geometry respectively. It is possible to shade each block according to an arbitrary data array (one value for each block). These could be, for example, rock properties, results from a TOUGH2 simulation, or some other quantity derived from the results. It's also possible to display well tracks on these plots, or shade the blocks according to rock type. For either method, a separate Matplotlib window will appear containing the plot.

3-D visualization is carried out by exporting the grid geometry to VTK format, for visualizing using Paraview, Mayavi or similar software. This can be done using the write_vtk() method. For this you will need a Python-enabled version of the VTK library installed on your computer (for more information on this, consult the PyTOUGH user guide). Once you have your data in Paraview or similar software you will be able to carry out sophisticated 3-D visualization, produce animations of time-dependent data etc.

Back to Contents page