Improvements in 'Dark Frame' Subtraction and Results Graph Corrections

As mentioned previously, 'dark frame subtraction' greatly assists in extracting weak HI signals - especially extra-galactic HI signals.

Some further details about the method used for this project at HawkRAO can be found on the 'Data Analysis' page ("Selection of the 'Dark Frame' Method").

Note that the 'subtraction' term is not used in the mathematical sense (the process is actually a mathematical division), but refers to the notion of the removal of undesired responses.

A comment was also made that the weak HI line signals are like pimples on the back of an elephant, further hidden by the gross amplitude response ripple of the RTL across the passband as seen on the right. It can be observed that the HI signals in this raw data plot are so small as to be undetectable.

The raw data is the combination of system noise and sky noise, shaped by the system frequency and amplitude response.  We want to cancel out the system responses and effects of system noise - to reveal the obscured HI line signals.

It is only by dividing this raw data by a separate set of raw data taken from another part of the sky with a low level of hydrogen signals (the 'dark frame' data set) that the signals from the first raw data file can be extracted. If the two raw data files were exactly the same shape then the division would produce a flat horizontal line across the graph. If, however, the first raw data set has signals which are not present in the second 'dark frame' raw data set, they would emerge and appear in the results graph.
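The division described above can be sketched in a few lines of Python. This is only an illustration, not the HawkRAO analysis software: the function name, the 8-bin passband shape and the 0.5% excess are all hypothetical values chosen to show the principle.

```python
def dark_frame_correct(target, dark):
    """Divide each target bin by the matching dark-frame bin."""
    if len(target) != len(dark):
        raise ValueError("spectra must have the same number of bins")
    return [t / d for t, d in zip(target, dark)]


# Hypothetical 8-bin passband shape shared by both recordings.
passband = [0.6, 0.9, 1.1, 1.0, 0.95, 1.05, 0.9, 0.6]

# The 'dark frame' is just the large system response.
dark = [1000.0 * g for g in passband]

# The target has the same shape plus a small (0.5%) excess in bin 3.
target = list(dark)
target[3] *= 1.005

ratio = dark_frame_correct(target, dark)
for i, r in enumerate(ratio):
    print(i, round(r, 4))
```

Every bin where the two files match comes out at exactly 1 (the flat horizontal line), and only bin 3 rises above it, even though the excess is tiny compared with the raw levels.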

It can readily be seen that the HI signals appearing in the 'dark frame' corrected data are the result of very small variations between two raw data files with very large signals.  Therefore, any perturbations in either file can degrade the quality of the 'dark frame' subtraction process.  In an ideal world the two files should be identical except for the presence of the target signals in the target raw data file.  Unfortunately, this is not the case in practice and possible errors creep in from a range of sources...

Different combinations of these effects cause a 'scoliosis' effect on the graph and some means to 'straighten' out the graph is useful...

Cropping the Left and Right Extremities

The first step to straighten (linearise) the graph is to crop off the lowest and highest 12.5% of the spectrum (the left and right extremities of the graph), as these areas are well down on the anti-aliasing filter skirt and are next to impossible to rehabilitate.
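As a minimal sketch of this cropping step, assuming the spectrum is held as a simple list of frequency bins (the function name and the 1024-bin length are hypothetical):

```python
def crop_edges(spectrum, fraction=0.125):
    """Drop the given fraction of bins from each end of the spectrum."""
    n = len(spectrum)
    k = int(n * fraction)  # number of bins to discard at each end
    return spectrum[k:n - k]


# Hypothetical 1024-bin spectrum; the values are just the bin indices.
bins = list(range(1024))
kept = crop_edges(bins)
print(len(kept), kept[0], kept[-1])
```

With 12.5% removed from each end, the central 75% of the passband remains.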

Getting a Good Dark Frame

The next step is to obtain a good 'dark frame'. As mentioned previously, if the target were HI line signals from within our own galaxy, the best 'dark frame' would be obtained by collecting data from an area of the sky which has a low level of HI signals - otherwise the processing would cancel out a large part of the sought-after HI line signals. This process is complicated for 'local' (intra-galactic) HI line signals because they are present at every possible pointing - it is only possible to find a position in the sky where they are at a minimum. Generally speaking this is roughly in the direction of the Galactic Poles.

For this project, where extra-galactic signals from the Magellanic Clouds are the target, quite conveniently both Clouds are moving away from us at such a rate as to doppler-shift their band of HI line signals well away from the signals produced by the 'local' HI line radiation within our own galaxy.  This simplifies the selection of the area of sky, as we actually welcome cancellation of the 'local' intra-galactic HI line signals.  Even so, my first choice was not a particularly good one as discussed in the next paragraph...
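The size of this doppler shift can be estimated with the usual low-velocity approximation. The sketch below is illustrative only: the recession velocity of 270 km/s is an assumed round figure for the LMC, not a value taken from this page, and the function name is hypothetical.

```python
HI_REST_MHZ = 1420.40575   # rest frequency of the HI line in MHz
C_KM_S = 299792.458        # speed of light in km/s


def doppler_shifted_mhz(velocity_km_s):
    """Observed HI frequency for a source receding at velocity_km_s.

    Uses the non-relativistic approximation f = f0 * (1 - v/c).
    """
    return HI_REST_MHZ * (1.0 - velocity_km_s / C_KM_S)


# ASSUMPTION: illustrative recession velocity, not from the original text.
ASSUMED_LMC_VELOCITY_KM_S = 270.0

observed = doppler_shifted_mhz(ASSUMED_LMC_VELOCITY_KM_S)
offset_mhz = HI_REST_MHZ - observed
print(round(observed, 3), round(offset_mhz, 3))
```

Under that assumption the Cloud's HI line sits a little under 1.3 MHz below the rest frequency, which is why it separates from the 'local' signals clustered near 1420.4 MHz.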

In order to maximise the probability of having the same RFI in both the target data and the dark frame data (thereby maximising the cancellation of those unwanted signals), both sets of data were taken with the same fixed antenna azimuth and elevation settings - i.e., the dish was not moved between data acquisition sessions. I used RadioEyes to locate a position in the sky at the same declination (-69 degrees) as the LMC (my principal target cloud) and identified RA = 22H as a 'quiet' spot in HI line signals at that declination. This 'dark frame' worked well for the first several days of data collection (see 'Results'), but using it for subsequent data runs a week or so later produced problems. What followed was a week or so of trying to track down the discrepancy between the two periods - I had made some changes to the analysis software in the interim, so I thought I had introduced a 'bug'. I eventually realised that the first runs were done on overcast days, when the temperature during the 'dark file' and target data runs was essentially the same. On subsequent days the weather was sunny, with a large difference in heat energy impinging on the RF front end mounted on the antenna (early morning shade versus late afternoon full sun).

The dark files, therefore, needed to be acquired as close in time as possible to the target data to minimise temperature variations. Because of the extended nature of the Clouds, this meant that about 2 hours after the passage of the LMC through the antenna beam was as close as it was possible to get in time. An RA of 7:20:00 was chosen (DEC -69 degrees), which had the added benefit of having a similar level of intra-galactic ('local') HI signals as in the direction of the LMC and so these undesirable signals would tend to cancel out.
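With a fixed dish, the delay between two positions at the same declination is simply their difference in RA, converted from sidereal to clock time. The sketch below checks the 'about 2 hours' figure; the LMC RA of roughly 5h 23m is an assumed approximate value that does not appear in the text above, and the function names are hypothetical.

```python
SIDEREAL_TO_SOLAR = 0.9972696  # one sidereal hour expressed in solar hours (approx.)


def ra_hours(h, m=0, s=0):
    """Convert an RA given as hours, minutes, seconds to decimal hours."""
    return h + m / 60.0 + s / 3600.0


def drift_delay_solar_hours(ra_first, ra_second):
    """Clock time between two RAs crossing the beam of a fixed antenna."""
    delta_sidereal = (ra_second - ra_first) % 24.0
    return delta_sidereal * SIDEREAL_TO_SOLAR


# ASSUMPTION: approximate RA of the LMC centre, not from the original text.
ASSUMED_LMC_RA = ra_hours(5, 23)
dark_ra = ra_hours(7, 20)   # 'dark frame' RA given in the text

delay = drift_delay_solar_hours(ASSUMED_LMC_RA, dark_ra)
print(round(delay, 2), "hours")
```

Under that assumption the 'dark frame' position drifts through the beam just under 2 hours after the centre of the LMC.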

It was thought initially that every target LMC data run would have to be accompanied by a trailing 'dark frame' data run, but (at least for the 3 days so far analysed) a single 'dark frame' data run was found to be useful for a number of different runs on different days.

This convenience may not hold for the future.

Flattening the Result Graph

A number of steps have been taken in order to correct the results graph, which is distorted by the above effects. One effect noted was that different signal levels into the RTL dongle produced both a variable 'tilt' on the result, and a different 'ripple' across the passband. This, I presume, is due to differing combinations of noise energy from sources subjected to different passband shapes, dependent on where in the RF chain they insert themselves. To minimise variations between observations it was decided to set the dongle gain at maximum (49.6 dB) and enable the 'digital AGC' option. A number of tests indicated that observation-to-observation variations were much reduced without any discernible loss in sensitivity.

Note: I observed some unexplained gain instability (in 'Auto', presenting itself as a cyclic variation in level) when the dongle was fed high levels of signal in order to get the maximum 'swing of bits' in the recorded data. For manual gain settings this presented as a sudden drop in gain of some 20 dB or so after 5 or 10 minutes of operation. Running the dongle at 49.6 dB with digital AGC enabled and adjusting the signal into the dongle such that only about 5 bits were being exercised in the recorded data seemed to eliminate this problem. This is just an observation and so is not a statement of fact and the effect may be entirely local.

The result graphs for this gain configuration were found to have minimal 'tilt', but a residual near-parabolic 'scoliosis'.

Controls were introduced to correct for this in the analysis software as shown on the right...
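The kind of correction such controls could apply can be sketched as follows. This is an assumption about the general approach, not a description of the HawkRAO analysis software: the result is divided by a baseline of the form 1 + tilt*x + curve*x^2, where x runs from -1 to +1 across the graph and 'tilt' and 'curve' are hypothetical names for two manually adjusted settings.

```python
def flatten(ratio, tilt=0.0, curve=0.0):
    """Divide out a baseline of 1 + tilt*x + curve*x**2 (x from -1 to +1)."""
    n = len(ratio)
    if n < 2:
        raise ValueError("need at least two bins")
    out = []
    for i, r in enumerate(ratio):
        x = 2.0 * i / (n - 1) - 1.0   # bin position scaled to -1..+1
        out.append(r / (1.0 + tilt * x + curve * x * x))
    return out


# Hypothetical distorted result: a flat line bent by a slight tilt
# and a near-parabolic bow.
n = 101
xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
bent = [1.0 + 0.002 * x + 0.01 * x * x for x in xs]

# Setting the two controls to match the distortion straightens the line.
flat = flatten(bent, tilt=0.002, curve=0.01)
print(round(min(flat), 6), round(max(flat), 6))
```

Because the same smooth curve is divided out of every bin, narrow features such as an HI line are rescaled only slightly rather than created or removed. That is one reason the checks discussed next still matter.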

In any situation where data has been 'processed' careful checks must be made to try and ensure results haven't been 'manufactured'.

A valid way of attempting this check is to compare results with the known science (checking against results from professional radio astronomers) combined with checking for repeatability.
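One simple way to put a number on repeatability is to correlate the corrected spectra from two days. The sketch below uses a plain Pearson correlation coefficient; the two short 'spectra' are made-up values for illustration only and are not measured data.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length spectra."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of bins")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical corrected spectra from two different days (made-up values).
day1 = [1.000, 1.001, 1.004, 1.006, 1.003, 1.000]
day2 = [1.000, 1.002, 1.004, 1.005, 1.003, 1.001]

r = pearson(day1, day2)
print(round(r, 2))
```

A value close to 1 means the two days show the same shape. It says nothing about whether that shape is real, which is why the comparison with professional survey data is still needed.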

The LMC results should look similar to this result obtained from the LAB survey on-line database (using a beamwidth of 6 degrees for the HawkRAO 3m TVRO dish @ 21 cm)...

LMC Result from LAB Survey Using Beamwidth = 6 Deg

Three days' results were analysed and processed using the display linearisation controls.

Results from Three Days for the LMC - 27th, 29th, 30th April 2016

These results demonstrate a similarity with the LAB survey results (especially when the different aspect ratios of the two sets of graphs are taken into account - see below) and a high degree of repeatability between the three separate days' results.

LMC Result for 30th April with LAB Survey Results Superimposed

I am fairly happy with these results - especially the day-to-day repeatability.