This is an extension of the traditional concept of wavelets to text processing. The goal is to apply a similar approach to obtain graded approximations and details. Unlike the traditional case, the result is not a vector but a combination of metrics and automatically generated metadata. The following table summarizes the main characteristics that unite or distinguish the two approaches:
Table 2. Characteristics of traditional wavelets and MLW
Evidence that wavelets offer the best description of this morphosyntactic decomposition comes from comparing the details of the traditional and morphosyntactic analyses.
Table 1. Traditional wavelets versus MLW
NOTES:
Detectable physical quantity or impulse by which information may be sent.
Although this theory is explained in general terms, it has been tested only on Spanish.
This is an advantage over the FFT alternative.
This is true within the MLW context, given the statements in rows 4 and 5.
The knowledge derived from the filtering process is called Ece in the MLW context.
Figure 1 shows a graphical comparison between a signal and its FFT. Figure 2 shows the linguistic counterpart: Eci and ER. The graphs in Figure 1 represent the original signal (time domain) and the resulting FFT decomposition (Lahm, 2002). The images in Figure 2 represent a translated Spanish text (content from wikipedia.org, topic Topacio) transformed into an Eci (López De Luise, 2007) that models dialog knowledge (Hisgen, 2010). Statistical modeling of knowledge is beyond the scope of this chapter, but additional information is available in (López De Luise, 2005, 2007b, 2007c, 2008, 2008b, 2008c).
Fig. 1. Signal and frequency decomposition
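The time-domain and frequency-domain views compared in Figure 1 can be reproduced numerically. The following is a minimal sketch, not the procedure used to produce the figure: it assumes a synthetic sine wave as input and uses a naive discrete Fourier transform in place of an FFT library. The names `dft` and `dominant_frequency` are hypothetical and chosen only for illustration.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform of a sequence of numbers."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples))
        for k in range(n)
    ]


def dominant_frequency(samples):
    """Return the strongest non-DC bin, in cycles per analysis window."""
    spectrum = dft(samples)
    half = len(samples) // 2
    # Only bins 1..n/2 are inspected; the upper half mirrors them for real input.
    magnitudes = [abs(c) for c in spectrum[1:half + 1]]
    return 1 + magnitudes.index(max(magnitudes))


# Assumed test signal: exactly 3 cycles of a sine over 32 samples.
signal = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
peak = dominant_frequency(signal)
```

The spectrum reports which frequencies are present in the whole window, but not where in the signal they occur.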
Figure 3 shows a sample wavelet decomposition: a signature decomposed using a Daubechies wavelet, which is especially suited to this type of image. Figure 4 shows an MLW decomposition of a generic text. There, Ci and Cj,k stand for abstract knowledge and Fm represents filters. This figure will be described further in the final section.
Fig. 2. Original text and knowledge structure model
Fig. 3. Traditional wavelet decomposition
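The notion of graded approximations and details in a traditional wavelet decomposition can be illustrated with a short numerical sketch. It rests on several assumptions: it uses the Haar wavelet (the simplest member of the Daubechies family) rather than the wavelet used for the signature image, it works on a one-dimensional list of made-up sample values, and the function names are hypothetical. It does not implement MLW.

```python
import math


def haar_step(values):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(values) % 2:
        raise ValueError("length must be even")
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_decompose(values, levels):
    """Split the approximation repeatedly; details are stored finest first."""
    approx, details = list(values), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose by merging details back in, coarsest first."""
    s = math.sqrt(2.0)
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.extend([(a + d) / s, (a - d) / s])
        current = rebuilt
    return current


# Assumed sample data, for illustration only.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(signal, 2)
restored = haar_reconstruct(approx, details)
```

Each level halves the approximation and sets aside the detail needed to undo that step, so the original values can be recovered exactly from the coarsest approximation plus all detail levels.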
López De Luise, M.D., 2011. Morphosyntactic Linguistic Wavelets for Knowledge Management. In-Tech Open book “Intelligent Systems”, ISBN 979-953-307-593-7.
López De Luise, D., Hisgen, D., Cabrera, A. and Morales Rins, M., 2012. Modeling Dialogs with Linguistic Wavelets. IADIS International Conferences. TPMC – IAR.
Author: Daniela López De Luise