Grid geometry files - acroucher/PyTOUGH GitHub Wiki
TOUGH2 uses a flexible 'finite volume' formulation, in which the model grid is described only as a set of blocks and the connections between them. The blocks are given volumes, and the connections are assigned areas and distances, but nothing is referenced to any particular coordinate system.
For grid generation, and also for post-processing of model results, a coordinate system is usually needed, so that we know where the model grid blocks would be in the real world. For this it is often convenient to have a separate geometry file which defines the grid geometrically in space. This can then be used to generate the grid block volumes and connection data that TOUGH2 needs, and also to plot the model results spatially.
AUTOUGH2 (the University of Auckland version of TOUGH2) is usually used in conjunction with a MULgraph geometry file containing the geometric description of the model grid. This is named after MULgraph, the graphical interface for TOUGH2 developed at the University of Auckland in the 1990s. However, the MULgraph geometry file format can be used independently of MULgraph itself.
The MULgraph geometry file format assumes a layered grid structure in the vertical direction, but allows arbitrary unstructured polygonal columns in the horizontal. It can represent surface topography by allowing blocks in the top layer to have different heights, or blocks in the uppermost layers to be completely absent. A simple representation of well tracks can also be included in the MULgraph geometry file.
MULgraph geometry files can also be used in conjunction with TOUGH2, as well as AUTOUGH2. Conversely, most of PyTOUGH can still be used without using MULgraph geometry files, if preferred.
PyTOUGH includes a mulgrids
module for handling MULgraph geometry files in Python scripts. You can import it (like any other Python library) using the import
command, e.g.:
from mulgrids import *
This module defines a mulgrid
class, which represents the entire contents of a MULgraph geometry file. The definition of this class specifies what properties and methods a mulgrid
object will have.
Its properties include things like the geometry's nodes (vertices), columns and layers, connections between the columns, surface elevations, and wells- in fact, everything present in the geometry file.
It has methods for carrying out tasks like reading from and writing to disk, generating simple irregular rectangular 3-D grid geometries, checking for errors, as well as manipulations such as translation, rotation, local grid refinement and fitting surface elevations from topographic data.
As for any Python class, an empty mulgrid
object can be created by calling the name of the class itself as a method. For example:
geo = mulgrid()
creates an empty mulgrid
object called geo
. Often, however, we want to create a mulgrid
object that is not empty, but has its contents read from a MULgraph geometry file on disk. This can be done using a command like:
geo = mulgrid('geom.dat')
In this case, the contents of the geometry file 'geom.dat' will be read into the object geo
.
Printing a mulgrid
object (by typing, for example, print geo
, or just geo
) will print a quick summary of the vital statistics of the grid geometry- including how many nodes, columns, layers etc. it contains.
At the top of a MULgraph geometry file there is a collection of 'header' information with general properties of the grid geometry. This information is represented in the mulgrid
object by the following properties: type
, convention
, atmosphere_type
, atmosphere_volume
, atmosphere_connection
, permeability_angle
and unit_type
. Details of these properties can be found in the PyTOUGH User's Guide.
As an example, if we have a geometry object called geo
, as above, then we can find its atmosphere type by typing geo.atmosphere_type
. This will return an integer, corresponding to the type of atmosphere blocks present in the model: zero for a single atmosphere block, 1 for an atmosphere block over each column, or 2 for no atmosphere blocks.
The properties of a mulgrid
object that are usually of most interest are those corresponding to its collections of nodes (vertices), columns, connections, layers and wells. Although these are all quite different types of objects, they are stored in a mulgrid
object, and hence accessed, in a similar way.
Take columns, for example. The column
property of a mulgrid
object is a dictionary containing all the grid columns, accessed by column name. So if we have a geometry object geo
, the column with the name 'abc'
is accessed by
geo.column['abc']
This is useful if we want to access columns by name. Sometimes, we want to access them by index instead. For this, there is the columnlist
property, which again provides access to the columns, but in the form of a list instead of a dictionary, enabling access by index. So for example the 100th column, which in Python will have the index 99 (as indices start from zero), can be accessed by
geo.columnlist[99]
The most common use of this is when we wish to loop over all the columns in the grid, in order:
for col in geo.columnlist:
# do something with col
What sort of thing are we getting when we type geo.column['abc']
? It is in fact another kind of Python object, an instance of the column
class defined in the mulgrids
module. This class represents columns in a MULgraph geometry file, so it has properties like name
, centre
, surface
, and node
, which is a list of the nodes at the corners of the column. So, for example, we can find the surface elevation of our column named 'abc' simply using
geo.column['abc'].surface
which will return a floating point number representing the surface elevation at that column.
The column
class also contains other properties which don't correspond to anything present in the MULgraph geometry file, but are derived from the data there. For example, the area
property returns the cross-sectional area of the column. So the following line:
[col.name for col in geo.columnlist if col.area < 1000 and col.surface > 500]
returns a list of the names of all the columns in our grid with areas less than 1000 square metres, and surface elevations above 500 m.
The column
class also has its own methods. For example, you can find out if a point with horizontal grid coordinates p
(a 2-D array) is inside a particular column using the contains_point()
method:
if geo.column['abc'].contains_point(p):
# do something
There are also node
, layer
and well
properties in a mulgrid
object which are dictionaries of these objects, accessed by name. Just as for columns, there are corresponding lists (nodelist
, layerlist
and welllist
) for accessing them by index.
In each case, these properties return objects of type node
, layer
or well
, which are all classes defined in the mulgrids
module.
A node
object is relatively simple, as it just represents a named vertex in the grid. It has name
and pos
properties which are its label (a string) and horizontal position (a 2-D array). It also has a column
property which is a set containing all the columns the node belongs to.
layer
objects also have a name
property, and also bottom
, top
and centre
properties (all floats) representing their vertical location. (The name
properties of nodes and layers are either 2 or 3 characters long, depending on the naming convention of the MULgraph grid- i.e. the value of its convention
property.) For convenience, layer
objects also have a thickness
property which is just the difference between the top
and bottom
of the layer.
A well
object can represent a deviated well track in the model. Besides its name
, its main property is pos
which is a list of 3-D points containing the nodes of the track. Even a vertical well needs two points to define its top and bottom. For any well, a well
object has some convenience properties such as head
which is the first element of pos
, bottom
which is the last element and num_deviations
(the number of line segments in the well track) which is just one less than the num_pos
property . A well
object also has methods depth_elevation()
and elevation_depth()
for converting between downhole depths (along the well track) and elevations, and elevation_pos()
and depth_pos()
for finding the 3-D positions of points at a specified elevation or depth.
As for columns, nodes etc., a mulgrid
object also has connection
and connectionlist
properties which identify which columns are connected horizontally. Each connection is an instance of the connection
class defined in the mulgrids
module. Its main property is column
which is a two-element list of column
objects. It also has a node
property which is a list of the nodes in the face connecting the two columns.
The main difference between connections and columns etc. is that the connection
dictionary property of a mulgrid
is indexed not by single names (like column names) but by pairs of names. So if we have a mulgrid
object geo
which contains two connected columns named '991' and '983', we can access the connection between them using:
geo.connection['991','983']
This will return the connection between the specified columns, provided that such a connection actually exists in the grid. Note also that the column names in the pair must be in the right order (the same way they are defined in the 'CONNECTIONS' section MULgraph geometry file). If the pair is given in reverse order (e.g. geo.connection['983','991']
above) then an error will result, because no such connection exists.
A mulgrid
object also has some properties related to the blocks it will contain when used to create a TOUGH2 grid.
The block_name_list
property is a list of all the block names in the geometry's corresponding TOUGH2 grid. This is not necessarily a simple product of the column and layer names, because varying surface topography can cause some of the upper blocks to be omitted. The block names in this list are in the same order as they will appear in a TOUGH2 data file.
There is also a num_blocks
property which is just the length of this list. In addition, there are num_underground_blocks
and num_atmosphere_blocks
which return the numbers of non-atmosphere and atmosphere blocks respectively (the sum of which will add up to num_blocks
). The number of atmosphere blocks will depend on the atmosphere_type
property- it could be equal to zero, one or the number of columns.
The block_name_index
property is a dictionary, indexed by block name. It returns the index of the block name in the block_name_list
list.
Sometimes it is useful to be able to identify the horizontal boundary of the grid geometry- for assigning boundary conditions, for example. A mulgrid
object has some properties for this.
If we just need to know the horizontal extent of the grid geometry, the bounds
property returns a two-element list containing the coordinates of the bottom-left and top-right corners of the 'bounding box' of the grid (each represented as a 2-D array). The boundary_polygon
property returns the simplest possible polygon (a list of 2-D arrays) representing the outer boundary of the grid geometry. This is formed from the coordinates of the grid nodes on the boundary, but intermediate points along straight sides of the boundary are removed.
There are also two properties for identifying grid nodes and columns along the outer boundary. The boundary_nodes
property returns an ordered list of nodes along the boundary, while the boundary_columns
property returns a set containing the columns along it.
Some of the main methods of a mulgrid
object are outlined below.
The contents of a MULgraph geometry file can be read into a mulgrid
object using its read()
method. If we have created a mulgrid
called geo
and we have geometry file named 'geom.dat', we can read it in using:
geo.read('geom.dat')
However, usually we just want to create the mulgrid
object and read its contents from file at the same time. As we have seen above, this can be done simply by:
geo = mulgrid('geom.dat')
If we have edited a mulgrid
object- for example, rotated it or fitted some surface topography- we will want to be able to write the altered mulgrid
object to disk. For this, we can use the write()
method. For example:
geo.write('altered_geom.dat')
This will create a new geometry file 'altered_geom.dat' and write geo
to it. If we have read a geometry file in, and want to write it back out to the same file (having altered it in the meantime), we can omit the file name:
geo.write()
The rectangular()
method of a mulgrid
object can be used to create general regular or irregular rectangular grids. The sizes of the grid blocks in each of the x, y and z directions are specified as lists or arrays. For example, the command
geo = mulgrid().rectangular([100] * 20, [150] * 25, [10] * 15)
creates a geometry geo
with 20 blocks in the x direction, each 100 m in size, 25 blocks in the y direction, each 150 m in size, and 15 layers, each 10 m thick. The syntax here makes use of the multiplication operator *
for Python lists, which appends copies of a list onto the end of itself, so you don't have to write out the whole list.
Irregular rectangular grids can be constructed just by passing to the rectangular()
method lists of block sizes that are not uniform. For example, passing as the first argument [100] * 5 + [50] * 20 + [100] * 5
gives a grid where the block sizes in the x direction are 100 m in size at the edges of the grid and 50 m in the middle.
You can also create more sophisticated sequences of block sizes using the mathematical functions in the Numerical Python library. For example, you can use np.logspace(1, 3, 100)
to create an array of 10 sizes that increase logarithmically from 101 to 103 m.
Besides the block sizes in each direction, the rectangular()
method has other optional parameters for controlling things like the naming convention, the atmosphere type and the grid origin (grids are by default created with bottom left corner at (0, 0) and the top surface at elevation 0).
The fit_surface()
method can fit surface topographic data to a geometry object. Each column has a surface
property, which represents its surface elevation. The fit_surface()
method takes an arbitrary scattered data set of (x,y,z) points and uses it to calculate the best fit surface
value for each column. This can be applied to any structured (e.g. rectangular) or unstructured grid, as long as it does not contain columns with more than four sides. It uses a least-squares bilinear finite element fitting procedure with a smoothing term, which is needed if the data set does not contain any points in some parts of the grid.
The main input parameter to the fit_surface()
method is the topographic data, which is passed in as an array of 3-D points, one data point (x,y,z) per row. If you have data in a text file in that format it can be read straight into an array using the Numerical Python loadtxt()
function, like this:
topo = np.loadtxt('topo.txt')
geo.fit_surface(topo)
If you have areas of the grid where there are no data (or possibly if your data set is very noisy), you will need to specify smoothing. There are two smoothing parameters, alpha
and beta
. The first controls the gradients in the fitted surface, so specifying a higher value will result in a flatter surface. The second controls curvature in the fitted surface (the fitted bilinear surface on a quadrilateral column can have curvature), so specifying a higher value will result in surface with less curvature. The values you need will depend on various factors including the grid column sizes, so in general it's a good idea to try a few values until you get a good compromise between following the data closely and avoiding oscillations. Normally the gradient parameter alpha
is more important than the curvature parameter beta
.
You can also fit only a selected part of the grid (leaving the rest alone) using the columns
parameter. For example we could fit only columns within a certain radius r
of a selected point p
like this:
fit_cols = [col for col in geo.columnlist if np.linalg.norm(col.centre - p) <= radius]
geo.fit_surface(topo, columns = fit_cols)
Depending on the data and the layer structure of the grid, the fitted surface may at times be very close to the boundary between two grid layers. This can result in very thin surface blocks on top of the model, which TOUGH2 may not like. To avoid this, use the layer_snap
parameter, which specifies a minimum surface block thickness. If the difference between a column surface and a layer boundary is less than this value, the column surface will be 'snapped' to the elevation of the bottom of the layer, eliminating any very thin surface blocks. For example:
geo.fit_surface(topo, layer_snap = 1.0)
ensures that no surface blocks will have thicknesses less than 1 m.
mulgrid
objects can be translated and rotated using the translate()
and rotate()
methods.
The translate()
method takes a single parameter, a three-element list or array with the required shift values in each direction (x,y,z). So for example we can shift a geometry object geo
1 km west using:
geo.translate([-1.e3, 0., 0.])
or 2 km north and 500 m up using:
geo.translate([0., 2.e3, 500.])
The rotate()
method rotates grids in the horizontal plane and has two parameters, angle
and centre
, so we can rotate the grid any desired angle about any centre point. If the centre
(a two-element list or array) is not specified, it is taken as the horizontal centre of the grid. The angle
parameter is in degrees clockwise, so:
geo.rotate(45)
rotates the grid 45 degrees clockwise about its centre.
For both the translate()
and rotate()
methods, wells in the geometry are by default not rotated with the grid. For either method you can set the wells
parameter to True
if you want the wells to be rotated also.
A mulgrid
object has several methods available for searching for particular nodes, columns or blocks in the grid geometry.
Some of the other mulgrid
methods (e.g. fit_surface()
, refine()
) take a particular area of interest in the grid geometry, and carry out some operation on the columns in that area. Often it is convenient to define such an area using a polygon (a list of points), so a mulgrid
object has a columns_in_polygon()
method which takes a polygon and returns a list of all the columns with centres inside it. The nodes_in_polygon()
method does a similar thing for nodes.
In some situations, we want to know which column, layer or block in the model grid contains a particular point. There are several mulgrid
methods for doing this. The column_containing_point()
method takes a horizontal position (a two-element array) and returns the column that contains it (if the point is inside the grid). Similarly, the layer_containing_elevation()
method takes an elevation and find the layer that contains it, and the block_name_containing_point()
combines these two, taking a 3-D point and returning the name of the grid block containing it.
It is also possible to search for nodes. The node_nearest_to()
method takes a horizontal position and returns the grid node nearest to that point.
A note on speed: for an arbitrary, possibly irregular unstructured grid, the column_containing_point()
method has a reasonable amount of work to do, to identify the correct column. If you are calling it a large number of times (e.g. looping over a data set with millions of points and identifying which column each one is in), its default behaviour may be too slow (particularly if the grid is large). There are several extra parameters for speeding it up. In many cases the most effective one is the qtree
parameter, which specifies a 'quadtree', a special data structure for searching the grid. The mulgrids
module contains a Python class for quadtrees. You can construct a quadtree for your grid, before doing any searching, using the column_quadtree()
method of a mulgrid
object, as in the following example:
cols = []
qt = geo.column_quadtree()
for p in data:
col = geo.column_containing_point(p, qtree = qt)
cols.append(col)
The column_quadtree()
method can take an optional parameter specifying a list of columns to search in, if you know your points are clustered in one part of the grid.
Sometimes it is useful to include more detail in the model grid around an area of interest, for example a borefield. The refine()
method of a mulgrid
object can carry out local grid refinement of the columns inside a specified area. By default, each of the columns is divided into four, and triangular columns are added as needed in the 'transition' region around the edge of the refined region. (If the bisect
parameter is set to True
, columns are not divided into four but bisected instead, between their longest sides.)
The main input for this method is the columns
parameter, which is a list of the columns (or column names) inside the region to be refined. (If this is not specified, the entire grid is refined.) So for example if we have a polygon p
(a list of 2-D points) representing the boundary of the region to be refined, we can refine the columns inside it using:
refine_cols = geo.columns_in_polygon(p)
geo.refine(refine_cols)
Some care should be taken in refining grids this way. TOUGH2 generally performs less well if the grid is of lower 'quality', e.g. if the columns have large aspect ratios, or are very skewed, or the lines connecting block centres are far from perpendicular to the block faces. If the original unrefined grid is of low quality to begin with, the refined one (particularly in the transition zone) will often be worse. Grid quality in the transition zone can often be improved by using the bisect_edge_columns
parameter.
A mulgrid
object can check itself for errors (and optionally fix them) using the check()
method. The errors the check()
method can find are:
- missing connections between adjacent columns
- duplicate connections
- 'orphaned' nodes (that don't belong to any column)
- columns with centres specified outside themselves
- layers with centres specified outside themselves
This method returns a value, either True
if no errors were found or False
if the geometry has problems. If there are problems it will also print out their details.
The fix
parameter of this method can be used to fix any problems that are found, so for example:
geo.check(fix=True)
checks the object geo
and fixes any errors found. The fix
parameter is set to False
by default.
There are several mulgrid
methods for producing graphical output from a grid geometry, both visualizing the grid itself and data sets defined on the grid. These fall into two kinds: 2-D and 3-D visualization.
2-D visualization is carried out using another Python library called Matplotlib, a library for general two-dimensional plotting. You will need to have Matplotlib installed on your computer before you can use the mulgrid
2-D visualization methods. The two main methods, layer_plot()
and slice_plot()
, are used to produce plots at a particular grid layer, or along a vertical slice through the geometry respectively. It is possible to shade each block according to an arbitrary data array (one value for each block). These could be, for example, rock properties, results from a TOUGH2 simulation, or some other quantity derived from the results. It's also possible to display well tracks on these plots, or shade the blocks according to rock type. For either method, a separate Matplotlib window will appear containing the plot.
3-D visualization is carried out by exporting the grid geometry to VTK format, for visualizing using Paraview, Mayavi or similar software. This can be done using the write_vtk()
method. For this you will need a Python-enabled version of the VTK library installed on your computer (for more information on this, consult the PyTOUGH user guide). Once you have your data in Paraview or similar software you will be able to carry out sophisticated 3-D visualization, produce animations of time-dependent data etc.