Masking and cleaning - eakadams/ihow-hi GitHub Wiki

Now that we have continuum-subtracted data, it's time to properly image the data. In order to do so, we need to clean the real emission. Generally, we want to clean deeply where there is emission, as this is important for getting the flux right (making sure everything is in clean beam units, rather than a mix of clean beam and dirty beam). But we don't want to clean where there isn't source emission, as that impacts the noise properties (and can be computationally intensive).

Another important note is that with a bright source, like we have here, we initially see emission that is also in the sidelobes, so we have to be careful and iteratively build a mask, only cleaning where we are sure there is real emission. Strong line channel from continuum subtracted dirty cube

First mask

As we start with building the mask and final cleaned cube, we want to first start with a dirty cube with our final imaging specifications. Choosing the weighting that will be used for creating final data products depends on the science case. The default is natural weighting which nominally has the best sensitivity but the worst spatial resolution. In addition, the PSF has strong lobes, meaning there is a large difference between the clean and dirty beams, which is best avoided. We will use a Briggs weighting with robust=0.2. This is generally a reasonable compromise between the extremes of uniform and natural weighting.

We will also only use a subset of channels to make our life easier - fewer channels to image makes everything faster. But we do want to include a few channels that are only noise as these are useful for determining the noise and setting limits. We had noted channels 85-150 during continuum subtraction, so we will only image channels 80~155.

tclean(vis='sexb.ms.contsub',imsize=300,cell='10arcsec',specmode='cube',niter=0,restfreq='1420.405752MHz',imagename='dirty_rob',
weighting='briggs',robust=0.2,outframe='bary',spw='0:80~155')

To build the first stage mask, find a clip that contains only real emission, and not sidelobes. Do this by plotting contours in the CASA viewer. What level do you find that contains only real emission? (An aside, setting base=0, unit=1 for contours will let you specify levels in physical image units.)

A clip of 0.075 will only include real emission (located at the center of image, not in the sidelobes on the side). There will also be channels that are not yet cleaned with this clip, but we will get to them. Create a mask with this clip:

immath(imagename='dirty_rob.image',mode='evalexpr',expr='iif(IM0>0.075,1,0)',outfile='mask1.im')

We are basically creating a new image that is 1 above the clip level and 0 below it. This is because this is the format CASA expects for an image mask in the tclean task: 1 where the data should be cleaned, and 0 where it shouldn't.

You can load the mask into the viewer to confirm the clip is good, e.g., there is no emission outside the center of the image.

Create a clean image using this mask. Within that mask, clean to 10% the level of the clip. By setting a threshold for cleaning, we can set the niter value (number of iterations) very high, knowing that we will stop at a reasonable threshold.

tclean(vis='sexb.ms.contsub',imsize=300,cell='10arcsec',specmode='cube',restfreq='1420.405752MHz',imagename='clean1',
weighting='briggs',robust=0.2,outframe='bary',spw='0:80~155',niter=100000,threshold='7.5mJy',mask='mask1.im')

Second mask

Now we want to expand our clean mask, by using this new, partially-cleaned image, to find a deeper clip that will include only real emission. Again, use the viewer to find a clip. What level do you find?

A clip of 0.06 will only include real emission and extend our mask, including channels that weren't previously cleaned. We still aren't getting quite all the channels, but we are getting there. Create a mask with this clip:

immath(imagename='clean1.image',mode='evalexpr',expr='iif(IM0>0.06,1,0)',outfile='mask2.im')

Create a clean image using this mask. Within that mask, clean to 10% the level of the clip.

tclean(vis='sexb.ms.contsub',imsize=300,cell='10arcsec',specmode='cube',restfreq='1420.405752MHz',imagename='clean2',
weighting='briggs',robust=0.2,outframe='bary',spw='0:80~155',niter=100000,threshold='6.0mJy',mask='mask2.im')

Third mask

Use the newly cleaned image to again expand the clean mask. What clip level do you find when exploring clean2.image?

A clip of 0.045 will only include real emission and extend our mask, including a few channels that weren't previously in our mask but still missing some. For this reason we will move on to the next stage in the masking, which is smoothing. We always want our final masking to be on a smoothed image so that we can include low level emission which is comparable to the noise.

First, get the beam size for the image using imhead:

imhead(imagename='clean2.image')

Looking at the output, the beam size is ~46"x43". We want to smooth to ~2x the beam size, so we will aim for 90".

imsmooth(imagename='clean2.image',kernel='gauss',targetres=True,major='90arcsec',minor='90arcsec',pa='0deg',outfile='smooth_clean2.image')

Look at this smoothed image to see if you can find a clip that includes only real emission. What level is it at? Is this a more expansive mask than if we clipped on the original resolution image?

We can clip the smoothed image at 0.1, and have a mask that covers a few more channels plus larger spatial extents within channels.

immath(imagename='smooth_clean2.image',mode='evalexpr',expr='iif(IM0>0.1,1,0)',outfile='mask3.im')

Now when we clean, we want to start setting our threshold by the image noise, cleaning to 0.5sigma. To get the image noise we can use imstat over line-free channels:

imstat(imagename='clean2.image',chans='1~6;68~75')

The noise is 6.0 mJy/beam, so clean to half that level

tclean(vis='sexb.ms.contsub',imsize=300,cell='10arcsec',specmode='cube',restfreq='1420.405752MHz',imagename='clean3',
weighting='briggs',robust=0.2,niter=100000,outframe='bary',spw='0:80~155',threshold='3.0mJy',mask='mask3.im')

Final mask

We could continue these iterations, slowly growing our mask. In the interest of time, we will skip to the final step, which is to implement a quantitative clip at a given multiple of the noise and then manually select emission that is part of the source (as any reasonable clip will also include some noise). We will again do this on a smoothed image, as it allows us to identify/include emission at the edge of the galaxy.

Smooth the last cleaned image:

imsmooth(imagename='clean3.image',kernel='gauss',targetres=True,major='90arcsec',minor='90arcsec',pa='0deg',outfile='smooth_clean3.image')

Find the noise in line-free channels:

imstat(imagename='smooth_clean3.image',chans='1~6;68~75')

The noise is 8.3mJy/beam. Plot the 3 and 4 sigma contours and explore the cube. Note that they're generally pretty similar but the 3-sigma sometimes extends to noise, so we will clip at 4sigma, 33.2 mJy/bm. Also identify the channels that have emission as 7~66.

Clip to get the mask, but this also includes noise, where we don't want to clean:

immath(imagename='smooth_clean3.image',mode='evalexpr',expr='iif(IM0>0.0332,1,0)',outfile='tmp_mask4.im')

Now we need to select only the real emisison. This is a time-consuming task and can require the use of different tools depending on the CASA version. For that reason, we will skip it in the tutorial, and the instructor provides a manually edited mask. You can find a description of how this mask was created, in addition to brief information on other approaches on the separate page Interactive masking.

Final clean cube

First, make sure you have retrieved the cleanmask from the tutorial data. This mask was saved as a tar file to avoid inconsistencies with the ordering of Stokes and frequency axes when exporting and reimporting fits images compared to CASA image structure. Thus we first to need untar it, then we can use the image as our input mask when cleaning.

Outside of CASA, untar the file:

tar -xvf sexb_mask.image.tar

Back within CASA, clean the image, using this mask.

tclean(vis='sexb.ms.contsub',imsize=300,cell='10arcsec',specmode='cube',restfreq='1420.405752MHz',imagename='sexb_cube',
weighting='briggs',robust=0.2,niter=100000,outframe='bary',spw='0:80~155',threshold='3.0mJy',mask='sexb_mask.image')

Write data products to fits

It is useful to write out our key data products to fits files, the "flexible image transport system", which is standard image format used in astronomy and thus allows the data to be used in other programs. The saved fits files are also available as part of the tutorial data and thus can be used to progress:

exportfits(imagename='sexb_cube.image',fitsimage='sexb_cube.fits',velocity=True)

Note that I generally write a cube out using velocity, as that is more physical and make analysis easier. I also tend to optical velocity (optical=True), as that enables the best comparison to multiwavelength data. But the radio frame is natural and keeps equal channel spacing - if you have high velocity resolution or large bandwidths, the non-linearity in the conversion from frequency to optical velocity is important.

Make a common resolution cube

Something you may notice is that your data cube does not have a single beam, but rather a beam that varies (very slightly) from channel to channel. CASA handles this internally without a problem, but depending on the analysis you will later do, you may also want a common beam across you cube. This is something that we could have (probably) set in tclean, to require a common beam. But sometimes you don't want to force the common beam at that stage. Instead, you can smooth the cube after cleaning to a common beam. Looking at imhead of our image cube, we see:

2023-09-05 12:04:43 INFO ImageMetaData	Restoring Beams 
2023-09-05 12:04:43 INFO ImageMetaData	Pol   Type Chan         Freq     Vel
2023-09-05 12:04:43 INFO ImageMetaData	  I    Max    0 1.418767e+09  345.83   45.8585 arcsec x   42.9986 arcsec pa= -7.5535 deg
2023-09-05 12:04:43 INFO ImageMetaData	  I    Min   10 1.418828e+09  332.95   45.8451 arcsec x   42.9504 arcsec pa= -7.2480 deg
2023-09-05 12:04:43 INFO ImageMetaData	  I Median   38 1.418999e+09  296.88   45.8597 arcsec x   42.9675 arcsec pa= -7.7054 deg

We need to go slightly larger than the larger beam in order for the convolution to work. Thus, we will take 49"x47". We can also go smaller, e.g., 46"x43", but we will receive errors about the kernel for convolution being smaller than the beamsize. Generally, you can ignore these, but you can also follow the advice and use a larger convolution kernel (go to a worse angular resolution) or regrid to a smaller pixel size.

imsmooth(imagename='sexb_cube.image',kernel='gauss',targetres=True,major='49arcsec',minor='47arcsec',pa='0deg',outfile='sexb_commonres_cube.image')

And we will write this one out also:

exportfits(imagename='sexb_commonres_cube.image',fitsimage='sexb_commonres_cube.fits',velocity=True)

Clean up

In order to manage dataspace, especially when working with large datasets, it can be very important to clean up all our intermediate files. Before cleaning up, it's always good to verify what you are going to clean up:

 ls clean?.*

We can see all the temporary clean cubes we made - go ahead and delete them.

 rm -rf clean?.*

There are also are smoothed images and dirty cube that we don't need anymore either:

ls smooth_clean?.image 
ls dirty*

rm -rf smooth_clean?.image 
rm -rf dirty*