Model Building

Introduction

At some point after you collect your data, you're going to need to convert raw spectrum files to peak areas. You can do this in Excel, you can use graph paper, or you can do what most people do and use a piece of spectrum processing software. In spectrum processing, you're basically trying to remove the background and fit different characteristic x-ray peaks to the residual.

We provide visitors with software and a default set of models to use in this lab (see data processing). However, we strongly encourage everyone to learn how to make their own models, either during their visit or when they get back home. Processing is an interpretative act. The raw spectrum files are measurements, but peak areas are calculations based on sets of assumptions. Remember, you are accountable for your interpretations of the data that come out of this lab!

If you're ready to start building your own models, then you should first get very familiar with your spectra in their raw form (Figure 1). Where are the peaks? Are there many large ones or just a few? What does the background look like? Before you run the spectra through any processing models, can you identify some of the major peaks? Do the spectra change significantly from sample to sample? All of these observations will help guide you as you construct your model, and most importantly, they will help you spot any errors that arise later on.

Screenshot showing a raw spectrum displayed in bAxil

Figure 1: Example of a raw spectrum collected at 30 kVp with a thick palladium filter

Modeling the Background

The first step when building a model is to choose how to fit the background. Spectrum processing packages typically treat background fitting and peak fitting as separate calculations, and although you can fit both with one click, I recommend that you start with the background by itself. After all, peaks are features that rise above the background, so if you don't model the background correctly, then you can't model the peaks correctly!

There are a number of different algorithms, each with adjustable parameters, which allow you to fit the background. Spectrum processing packages differ in their default algorithms, and some packages (such as PyMCA) give the user a great deal of control whereas others don't. In bAxil, you can adjust the background model parameters by clicking on "Fitting Model" in the top right corner and selecting the "Continuum" tab (Figure 2). You'll also want to choose a region of interest (ROI) for your model. You can do this in the "Fitting Model > ROI" tab, or you can right click on the spectrum and click "Set Start ROI / Set End ROI" accordingly. The software will ignore data outside this ROI, so make sure to set it large enough to include all visible peaks and notable background features. To see how your model performs, click on the gears icon in the upper right ("Fits current spectrum").

Screenshot showing the background fitted to a spectrum in bAxil, along with the background fit control panel

Figure 2: Screenshot showing the example spectrum with the background fitted. Also shown is a screenshot of the controls used to adjust the background model in bAxil.

The important thing to remember when fitting the background is that you can rarely get a good fit in every part of the spectrum at once. Most algorithms allow you to choose how smooth the background should be. If you choose a very smooth background, it will fail to fit low-relief features and step changes in the continuum. If those regions have peaks in them, the peaks will appear larger than they really are. If, however, you allow the background to have more variation, it will tend to fill in parts of your peaks, making the peaks appear smaller than they really are. You will have to decide which source of error is the bigger concern, and you'll want to optimize your fit in the regions that contain the peaks that matter most to you. The default model in bAxil tends to be overly smooth, and it fails to follow regions where the background steps up or down due to absorption edges (Figure 2). This may or may not be a problem, depending on your peaks of interest.
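The smoothness trade-off can be seen on a toy example. The sketch below is not bAxil's algorithm; it assumes a simplified iterative peak-clipping filter (in the spirit of SNIP-style methods) applied to a noise-free synthetic spectrum with one peak and one step in the continuum. The function names, window sizes, and spectrum values are all made up for illustration.

```python
import math

def synthetic_spectrum(n_channels=256):
    """Noise-free toy spectrum: a stepped continuum plus one Gaussian peak."""
    counts = []
    for ch in range(n_channels):
        continuum = 100.0 if ch < 150 else 60.0   # step mimics an absorption edge
        peak = 500.0 * math.exp(-0.5 * ((ch - 60) / 4.0) ** 2)
        counts.append(continuum + peak)
    return counts

def clip_background(counts, max_width):
    """Simplified peak clipping (illustrative, not a production algorithm).

    For window widths max_width, max_width - 1, ..., 1, each channel is
    replaced by the smaller of its current value and the mean of the two
    channels one window width away. Larger max_width = smoother background.
    """
    bg = list(counts)
    for w in range(max_width, 0, -1):
        prev = bg[:]
        for i in range(w, len(bg) - w):
            bg[i] = min(prev[i], 0.5 * (prev[i - w] + prev[i + w]))
    return bg

def net_counts(counts, background, start, stop):
    """Sum of (counts - background) over channels start..stop-1."""
    return sum(c - b for c, b in zip(counts[start:stop], background[start:stop]))

spectrum = synthetic_spectrum()
smooth_bg = clip_background(spectrum, max_width=40)    # very smooth background
flexible_bg = clip_background(spectrum, max_width=3)   # follows fine structure

# Net area left for the peak (channels 45-75) under each background
peak_smooth = net_counts(spectrum, smooth_bg, 45, 76)
peak_flexible = net_counts(spectrum, flexible_bg, 45, 76)

# Counts left unassigned just below the step (channels 130-149, no peak there)
edge_smooth = net_counts(spectrum, smooth_bg, 130, 150)
edge_flexible = net_counts(spectrum, flexible_bg, 130, 150)

print(f"peak area, smooth bg:   {peak_smooth:.0f}")
print(f"peak area, flexible bg: {peak_flexible:.0f}")
print(f"edge misfit, smooth bg:   {edge_smooth:.0f}")
print(f"edge misfit, flexible bg: {edge_flexible:.0f}")
```

In this toy case the flexible background eats most of the peak, while the smooth background leaves a large block of unassigned counts next to the step. Any peak sitting in that region would absorb those counts and look too large.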

Modeling the Major Peaks

Once you're happy with your background model, then you can start fitting the peaks. When you fit the background, you don't have to care about the sample composition, the peak identities, or the relationship between channel and x-ray energy. You're simply applying numerical filters to raw data, only using interpretation insofar as you're deciding what you think is really signal and therefore worth keeping. Once you start considering peaks, then you're really entering the realm of spectrum interpretation!

You'll want to start this next phase by identifying the largest peaks in your spectrum. Ignore the medium and small peaks for now. In most cases, you'll only want to consider K-lines from major elements and K/L-lines from the source anode during this step. Other large peaks can come from irradiation of system components, such as Ag in the 50 kVp runs. If you're using bAxil, click on the periodic table symbol in the upper right to open up the peak selection window (Figure 3). Click on an element and then click on "K" and/or "L" at the bottom to add those lines to the model. When you click on an element, it will show red lines on the spectrum highlighting where the software expects the peaks to be. If you have trouble identifying certain peaks, right click on one in the spectrum and click on "Identify Peak". Remember, K peaks are larger than L peaks, so when in doubt, go with the element associated with a K peak. If you're not sure about a peak at this step, skip it and come back to it later. Once you've added your peaks, click on the gears icon in the upper right ("Fits current spectrum") to see how the model performs. This can be an iterative process, so start with elements that you know should be there and work your way through the rest.

Screenshot showing a spectrum fit with a background and major peaks, along with the peak selection window used to alter the model

Figure 3: Screenshot showing the example spectrum with major element and source peaks fitted (bAxil)

The goal of this step is twofold: (1) establish a robust calibration between the channel and the x-ray energy, and (2) model the most significant peaks and their associated artifacts before trying to identify anything else.

The first goal (energy - channel calibration) might seem trivial, but we dedicate an entire section to it later because it is the source of many problems! We measure x-ray counts by channel, not by photon energy. We calculate photon energy for each channel by starting with a default relationship (given by the manufacturer) and adjusting the calibration until selected peaks line up with their expected energy values. To get a good calibration, we need strong, unambiguous peaks in multiple locations. All large peaks should be included in your model, even if you're not interested in the values for those elements!
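As a rough illustration of what "adjusting the calibration" means, the sketch below fits a straight line, energy = offset + gain × channel, through a few strong peaks. The line energies are standard Kα values, but the channel positions are hypothetical and the fit is ordinary least squares, which is only an assumption about what a processing package does internally.

```python
def fit_calibration(channels, energies):
    """Least-squares straight line: energy = offset + gain * channel."""
    n = len(channels)
    mean_c = sum(channels) / n
    mean_e = sum(energies) / n
    sxx = sum((c - mean_c) ** 2 for c in channels)
    sxy = sum((c - mean_c) * (e - mean_e) for c, e in zip(channels, energies))
    gain = sxy / sxx
    offset = mean_e - gain * mean_c
    return offset, gain

# Expected line energies in keV (Ca Ka, Fe Ka, Sr Ka)
reference_energies = [3.692, 6.404, 14.165]
# Hypothetical channel positions where those peaks were found
observed_channels = [187.1, 322.7, 710.75]

offset, gain = fit_calibration(observed_channels, reference_energies)
print(f"offset = {offset:.4f} keV, gain = {gain:.5f} keV/channel")

def channel_to_energy(channel):
    """Convert a channel number to keV with the fitted calibration."""
    return offset + gain * channel

print(f"channel 500 -> {channel_to_energy(500):.3f} keV")
```

The three reference peaks are spread across the spectrum on purpose. With peaks bunched together at one end, small errors in their positions translate into large errors in the gain, and therefore in every energy assigned far from them.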

The second goal (modeling major peaks and artifacts first) is necessary because of the nature of our detectors. In Energy Dispersive Spectroscopy (EDS), some of the peaks that we observe are formed in the detectors themselves or in the electronic signals that are read by the computers. These peaks, known as artifacts, can be larger than some of the real peaks coming off your sample, so you must account for them.

The primary peak artifact you'll want to consider is the sum peak. Sum peaks are formed when two x-rays reach the detector at nearly the same moment and are counted as a single "event" by the electronics, so the event is recorded at the sum of the two energies. A sum peak is possible for every combination of peaks reaching your detector (Si Kα + Ca Kα, Ca Kα + Ca Kα, etc.), but you'll only observe the ones for combinations of high-intensity peaks. The second type of peak artifact that you'll want to consider is the escape peak. Escape peaks are formed when an incoming x-ray ionizes the silicon in the detector. The silicon then emits its own characteristic x-ray, and if that x-ray escapes the detector, its energy is lost from the measured event. The measured event therefore has 1.74 keV less energy than the original x-ray, which is the energy of the lost Si Kα x-ray.
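The arithmetic for both artifacts is simple enough to check by hand. The sketch below lists where sum and escape peaks would fall for three strong lines; the choice of lines and the helper names are only illustrative, and the Si K-edge cutoff (about 1.84 keV) is added here as an assumption about which lines can produce an escape peak at all.

```python
from itertools import combinations_with_replacement

SI_KA_KEV = 1.740      # energy carried away by an escaping Si Ka x-ray
SI_K_EDGE_KEV = 1.839  # x-rays below this cannot ionize the Si K shell

# Strong lines assumed to be present in the spectrum (keV)
strong_lines = {"Si Ka": 1.740, "Ca Ka": 3.692, "Fe Ka": 6.404}

def sum_peaks(lines):
    """Energy of every pairwise sum peak, including a line with itself."""
    return {
        f"{a} + {b}": round(lines[a] + lines[b], 3)
        for a, b in combinations_with_replacement(sorted(lines), 2)
    }

def escape_peaks(lines):
    """Energy of the Si escape peak for each line above the Si K edge."""
    return {
        f"{name} escape": round(energy - SI_KA_KEV, 3)
        for name, energy in lines.items()
        if energy > SI_K_EDGE_KEV
    }

sums = sum_peaks(strong_lines)
escapes = escape_peaks(strong_lines)

for label, energy in sorted(sums.items(), key=lambda item: item[1]):
    print(f"{label:>15}: {energy:.3f} keV")
for label, energy in escapes.items():
    print(f"{label:>15}: {energy:.3f} keV")
```

A small peak that lands on one of these energies is a candidate artifact and deserves a second look before it is assigned to a trace element.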

If you start out by modeling the large peaks in your spectrum, and you make sure to include sum peaks (checkbox at the bottom of the periodic table, see Figure 3) and escape peaks (included by default), then you'll know which small peaks are artifacts and you can avoid over-interpreting them as trace elements. In bAxil, make sure you highlight the artifacts by clicking on the "Spectrum" menu at the top and selecting "Show sum peaks" and "Show escape peaks".

Modeling the Minor Peaks

Once you fit the background, identify the large peaks, and establish a robust energy - channel calibration, then you'll probably want to start identifying the remaining peaks. If you're only interested in the major elements, you might not need or want to do this step. You don't need to identify and model every peak in your spectrum, particularly if the peaks are small. If you do want these small peaks, then you're going to need to build up your interpretation piece by piece.

At this point, you should have a fairly robust energy - channel calibration, so it should be straightforward to identify the energies of the unknown peaks. You can right click on a peak to see suggestions, but the identities are not always so clear. This is where you'll need to lean on your knowledge of your samples and a bit of trial-and-error.

Start by identifying the largest remaining peaks. You probably know from previous work which elements are enriched in the materials that you're looking at, even if you don't know the relative concentrations. Start there. If you know that Mn, Ti, Sr, Rb, and Zr should be present and detectable, add them and run the model. See how many peaks remain unfilled. Using this approach, you should be able to assign the majority of the peaks, but there will almost certainly be a few that remain.

You might be wondering at this point why you can't just add all of the elements to the model to see what happens. You'll just get an answer of "0" if there's nothing there, right? Unfortunately, it's not that simple in practice. If peaks never overlapped in energy space, that approach would work quite well. Each position on the spectrum could only be occupied by one peak, and it would be there or not. In reality, however, some peaks can be attributed to more than one element, and the software has no reliable way to distinguish which is the "correct" answer. If you tell the software there are two peaks in a position, it will give you two values. If you tell the software there is one peak, you'll get one value. You can see an example of this in Figure 4.

Screenshot showing a spectrum with major and minor peaks modeled, highlighting the differences in output if different peak assignments are made

Figure 4: Screenshot showing most of the peaks fit in the example spectrum. Also shown are the model results for the small highlighted peak, depending on whether the peak is attributed to arsenic, lead, or both (bAxil).

There is a small but noticeable peak that the software suggests could be an arsenic Kα peak or a lead Lα peak (Figure 4). In both cases, if there's an alpha peak then there should be a beta peak, but the beta peak will probably be too small to provide much help. So, which element created that peak? We know that K peaks are much larger than L peaks, so that might lead us to believe the peak is from arsenic. However, lead is a very common and reasonably abundant element. We might have high enough lead concentrations to allow us to see an L peak. Or, maybe both elements contribute to the signal. You, the interpreter, have to choose, and you'll get different model results. You can see the differences in As and Pb values from the various models at the bottom of Figure 4.
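To see why the software cannot separate these two lines on its own, it helps to compare their shapes as the detector sees them. The sketch below assumes Gaussian peaks with a detector resolution of 150 eV FWHM (a hypothetical value, since the real resolution depends on the detector and the energy) and computes how similar two peak shapes are, where 1.0 means indistinguishable and 0.0 means no overlap.

```python
import math

FWHM_KEV = 0.150  # assumed detector resolution, for illustration only
SIGMA_KEV = FWHM_KEV / (2.0 * math.sqrt(2.0 * math.log(2.0)))

AS_KA = 10.544  # As Ka1, keV
PB_LA = 10.551  # Pb La1, keV
PB_LB = 12.614  # Pb Lb1, keV

def gaussian(energy, centre):
    """Unnormalized Gaussian peak shape at the assumed resolution."""
    return math.exp(-0.5 * ((energy - centre) / SIGMA_KEV) ** 2)

def shape_similarity(centre_a, centre_b, lo=9.0, hi=14.0, step=0.005):
    """Cosine similarity of two peak shapes sampled on an energy grid."""
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    a = [gaussian(e, centre_a) for e in grid]
    b = [gaussian(e, centre_b) for e in grid]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm

overlap_alpha = shape_similarity(AS_KA, PB_LA)
overlap_beta = shape_similarity(AS_KA, PB_LB)

print(f"As Ka vs Pb La similarity: {overlap_alpha:.4f}")
print(f"As Ka vs Pb Lb similarity: {overlap_beta:.4f}")
```

Under these assumptions the As Kα and Pb Lα shapes are almost identical, so a fit that includes both is free to split the counts between them in nearly any proportion. Only lines that do not overlap, such as Pb Lβ, can constrain the split, and only if they are large enough to measure.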

One approach that you may consider when building your models is to have two or more versions. You might want to have a model where you include all elements as a screening tool (to find elements you might not have considered), and then create a second model optimized to the results of the first model to better quantify peak areas. There is no "right" answer to how to build a spectrum processing model, and we caution all visitors to our lab that our default models might not be the best models for your data.

Fixed vs Variable Peak Ratios

One of the choices that you will need to make for each element is whether to use a fixed Kβ / Kα ratio or a variable one. Spectrum processing packages such as bAxil have default Kβ / Kα ratios in their libraries, but the values carry some uncertainties and the defaults don't always provide the best fits. You can see an example of that in Figure 5. So, if that's the case, why wouldn't you always model the peaks separately? The answer is "peak overlap".

Screenshot showing the differences between fitting iron using fixed and variable alpha to beta ratios

Figure 5: Example showing the difference in Fe Kβ fit when using fixed and variable ratios. The peak has a worse fit when using the fixed ratio (bAxil).

You can see in Figure 6 that the Rb Kβ peak overlaps with a peak from another element (yttrium, which is not included in the model). If the Kβ peak is allowed to vary on its own, then it will fill the entire area and will be way too large. If a fixed Kβ / Kα ratio is used, then the remaining spectrum area is simply unassigned. In areas where there are many overlapping peaks, such as the one shown in Figures 6 and 7, you'll be safer using fixed peak ratios. In fact, even though we can see that some peaks have a worse fit, using fixed ratios for all elements is a conservative and "safe" bet for your spectrum. It may not provide the best fit, but you can be sure that it won't provide a terrible fit if you forget to include some elements.
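The same effect can be reproduced with a small synthetic fit. The sketch below builds a spectrum from Rb Kα, Rb Kβ, and Y Kα peaks, then fits it with a model that leaves yttrium out, once with a fixed Kβ / Kα ratio and once with the two Rb peaks free. The peak areas, peak width, channel width, and the 0.17 ratio are all hypothetical values chosen for illustration, and plain linear least squares stands in for whatever the processing software actually does.

```python
import math

SIGMA = 0.07   # assumed peak width (keV)
STEP = 0.01    # assumed channel width (keV)
GRID = [12.5 + i * STEP for i in range(351)]   # 12.5 to 16.0 keV

RB_KA, RB_KB, Y_KA = 13.395, 14.961, 14.958    # line energies (keV)
LIBRARY_RATIO = 0.17   # hypothetical Rb Kb/Ka ratio, not a library value

def unit_peak(centre):
    """Gaussian whose per-channel counts sum to roughly 1."""
    norm = STEP / (SIGMA * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-0.5 * ((e - centre) / SIGMA) ** 2) for e in GRID]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

ka, kb, y = unit_peak(RB_KA), unit_peak(RB_KB), unit_peak(Y_KA)

# "True" spectrum: Rb Ka area 1000, Rb Kb area 170, Y Ka area 600
spectrum = [1000.0 * a + 1000.0 * LIBRARY_RATIO * b + 600.0 * c
            for a, b, c in zip(ka, kb, y)]

# Fixed ratio: one free amplitude scales Ka and Kb together
template = [a + LIBRARY_RATIO * b for a, b in zip(ka, kb)]
fixed_ka = dot(template, spectrum) / dot(template, template)
fixed_kb = LIBRARY_RATIO * fixed_ka

# Variable ratio: Ka and Kb fitted independently (2x2 normal equations)
aa, ab, bb = dot(ka, ka), dot(ka, kb), dot(kb, kb)
ay, by = dot(ka, spectrum), dot(kb, spectrum)
det = aa * bb - ab * ab
free_ka = (ay * bb - ab * by) / det
free_kb = (aa * by - ab * ay) / det

print(f"true Rb Kb area:            170")
print(f"fixed-ratio Rb Kb area:     {fixed_kb:.0f}")
print(f"variable-ratio Rb Kb area:  {free_kb:.0f}")
```

With the ratio fixed, the Rb Kβ result stays close to the true value and the yttrium counts are left unassigned. With the ratio free, the Rb Kβ peak swallows almost all of the yttrium signal.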

Screenshot showing how the rubidium K-beta peak is way too large when allowed to vary independently, if yttrium is present

Figure 6: Example showing the difference in Kβ fit for Rb when using fixed and variable ratios. The presence of another peak at the same location causes the Kβ to be far too large when allowed to vary independently (bAxil).

Screenshot showing many different elements with overlapping K-alpha and K-beta peaks in the same region

Figure 7: Example spectrum with peak labels showing the strong Kα and Kβ overlaps due to the presence of multiple trace elements (bAxil)

Other Parameters

Some of the other parameters that you can adjust in your model are peak shape, peak symmetry, and peak noise. The specific options depend on the processing software, and in many cases the default settings work quite well, but it can be good practice to experiment with different values. In bAxil, you can adjust the settings for shape and symmetry on a peak-by-peak basis (click "Details" after you select an element in the periodic table) and you can adjust the peak noise in the "Calibration" tab under the fitting model.

Importance of Energy Calibration

We don't measure x-ray counts by energy; we measure them by channel. We then convert each channel to an equivalent energy during processing so that we can identify and fit peaks. There's a default relationship between energy and channel that we get from the manufacturer, but due to drift in the electronics, the relationship can change a bit over time.

There are two ways that we can account for this drift: (1) we can measure pure element standards and use the peaks to fix our calibrations, and (2) we can allow the processing software to guess at the calibration while processing each file. The first approach is the most reliable and robust method, but it's time consuming. You have to analyze a large number of standards, and you have to run them frequently enough to capture drift throughout the day. In this lab, we use the second approach with some caution.

All spectrum processing packages will ask you to provide a default energy - channel relationship and will allow you to put some error bars on the offset and gain. You can decide how much the values will be allowed to vary during processing. When the peaks are being fit, the offset and gain will be adjusted within those ranges until the best fit is achieved. If you're having problems getting a good energy calibration in your model, check the default values and adjust the tolerances (Figure 8). The offset and gain are just two of the many, many parameters being fit simultaneously, though, and the model can converge on a wrong solution if you're not careful!

Screenshot showing how to adjust the energy - channel calibration parameters in bAxil

Figure 8: Example of how to set the default energy - channel relationship, and allowable variance due to drift, in bAxil

A cautionary example of this failure is shown in Figures 9-11. This example is from a real model used in production, and I have seen this type of problem occur in multiple XRF labs. First, you can see a raw spectrum that has been clipped to exclude all large peaks and only includes half of a medium-sized peak (Figure 9). The default energy - channel relationship is displayed, and you can clearly see that the peak labels are sitting above actual peaks. When the spectrum is fit, however, the energy - channel relationship is adjusted and the model goes off the rails (Figure 10). The spectrum doesn't move, but the energy assigned to each peak changes, and therefore the peak identities all shift. The model peaks are no longer centered on actual spectrum peaks, and the results are nonsense. All it takes to fix this model, however, is including the Fe Kα and Kβ peaks (Figure 11).

These sorts of model failures won't happen if you follow the approach described earlier: (1) use a wide ROI, and (2) include all large peaks in your model, even if you don't care about the results for those elements.
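The mechanism behind this failure can be sketched in a few lines. In the example below the peak stays at the same channel, but a change in gain moves the energy assigned to that channel far enough that the nearest library line belongs to a different element. The default calibration, the 5% gain change (deliberately large), and the short line table are all hypothetical.

```python
# Small table of K-alpha line energies in keV
LINES_KEV = {"Rb Ka": 13.395, "Sr Ka": 14.165, "Y Ka": 14.958, "Zr Ka": 15.775}

def channel_to_energy(channel, offset, gain):
    """Linear energy - channel relationship."""
    return offset + gain * channel

def nearest_line(energy, lines=LINES_KEV):
    """Name of the library line closest to the given energy."""
    return min(lines, key=lambda name: abs(lines[name] - energy))

# Hypothetical default calibration
default_offset = 0.0    # keV
default_gain = 0.0200   # keV per channel

# A Sr Ka peak sits at this channel under the default calibration
peak_channel = 14.165 / default_gain

label_default = nearest_line(
    channel_to_energy(peak_channel, default_offset, default_gain))

# The fit wanders to a gain 5% higher; the peak's channel is unchanged
drifted_gain = default_gain * 1.05
label_drifted = nearest_line(
    channel_to_energy(peak_channel, default_offset, drifted_gain))

print(f"default calibration: channel {peak_channel:.2f} -> {label_default}")
print(f"drifted calibration: channel {peak_channel:.2f} -> {label_drifted}")
```

A strong, unambiguous peak inside the ROI makes a gain this wrong very costly for the fit, which is why adding the Fe Kα and Kβ peaks is enough to pull the calibration back into place.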

Screenshot showing a raw spectrum with a default energy - channel relationship used for peak assignment

Figure 9: Unprocessed spectrum showing the default energy-channel relationship and initial peak positions (WinAxil)

Screenshot showing a processed spectrum with a bad energy - channel relationship, highlighting mis-fit peaks and incorrect peak assignments

Figure 10: Processed spectrum showing an incorrect energy - channel calibration and a terrible model fit (WinAxil)

Screenshot showing the previous spectrum with a correct energy - channel calibration after the addition of the iron K-alpha and beta peaks

Figure 11: Processed spectrum showing a correct energy - channel calibration and model fit once at least one large peak is included (WinAxil)
