Note: The data used in the analysis below can be found in this Excel spreadsheet, also linked at the bottom of this page. In particular, the correlations are located between rows 37 and 150, and columns HI to LR.
One of the basic assumptions underlying this method of analysis is that different authors (aA, aB, and aC) have natural frequency profiles pA, pB, and pC that are sufficiently different that one can be distinguished from the other. Therefore, it is reasonable to test this assumption before continuing with the rest of the analysis. Because we are testing what copying took place from one synoptic gospel to another, we cannot compare the profiles of the whole of each gospel against each other. Instead, we need to isolate the text unique to each author and compare the profiles of each.
One way of doing this is to compare each author’s Sondergut material (HHBC categories 200, 020, and 002 respectively). Alternatively, we can combine all the categories containing words written by just one of the three authors, e.g. c200 + c210 + c211 + c201 (=c2AA), and compare them:
However, the values do suggest that the differences between the profiles of Mark and Luke are somehow of a different character from those between the other pairs of synoptics. This suggests that the Greek used in Mark is different to that used in Luke, in a way that does not apply to the other synoptic pairings. It may be, as has been suggested, that this is due to aMark not having Greek as his first language. Whatever the reason, Luke contains many systematic language differences from Mark. For example, there are around 150 places where Mark has the narrative present while Luke has the past tense. Systematic differences such as this would tend to create a negative correlation between Mark and Luke, such as we see here.
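The effect described above can be illustrated with a small sketch. It assumes that a profile is a vector of per-word deviations from an average relative frequency, and that profiles are compared using Pearson's correlation coefficient; this page does not state either explicitly. The word labels, counts, and baseline values are invented for illustration and are not taken from the spreadsheet.

```python
from math import sqrt

def relative_frequencies(counts):
    """Convert raw word counts into relative frequencies."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def profile(counts, baseline):
    """Deviation of each word's relative frequency from an assumed baseline."""
    rel = relative_frequencies(counts)
    return [rel.get(word, 0.0) - baseline[word] for word in sorted(baseline)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# Hypothetical baseline (average) relative frequencies for five word types.
baseline = {"narrative_present": 0.15, "past_tense": 0.25,
            "word_c": 0.20, "word_d": 0.20, "word_e": 0.20}

# Invented counts: text A favours the narrative present, text B the past tense.
text_a = {"narrative_present": 30, "past_tense": 10,
          "word_c": 20, "word_d": 20, "word_e": 20}
text_b = {"narrative_present": 5, "past_tense": 35,
          "word_c": 21, "word_d": 19, "word_e": 20}

profile_a = profile(text_a, baseline)
profile_b = profile(text_b, baseline)
r = pearson(profile_a, profile_b)
print(f"correlation between the two profiles: {r:.2f}")
```

Because the two texts deviate from the baseline in opposite directions on the same two word types, the resulting coefficient is strongly negative, even though the remaining words are used at almost identical rates.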
Then, if aMatthew copied or edited some text from Mark, Matthew would contain text from two different authors, and would therefore be likely to be less homogeneous than Mark. Finally, if aLuke copied or edited text from both Mark and Matthew, then Luke would be likely to be less homogeneous still. Therefore, by comparing the homogeneity of Matthew, Mark, and Luke in turn we may be able to determine which of the authors copied or edited from which others.
The HHBC data representing each of Matthew, Mark, and Luke is spread across 9 categories, depending on the interaction with the other two synoptics. For example, we can compare those parts of Matthew that have no parallels in Mark (c200 + c201 + c202 = c20X) with those parts that do (c210 + c211 + c212 + c220 + c221 + c222 = c2NX), but we can also compare those parts of Matthew that have no parallels in Luke (c2X0) with those that do (c2XN). As a result, we can group the categories in various ways to determine how the relationships between the synoptics affect which parts of any one synoptic show signs of homogeneity and which do not.
The tests in the group below all compare the profiles of categories representing passages in Matthew that do not have parallels in Mark (c20X, with ‘sub-divisions’ c20N, etc.) with those that do have parallels in Mark (c2NX, c2NN, etc.). We can perform comparisons using six different combinations of categories in Matthew, depending on the existence and content of parallel passages in Luke.
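The grouped category names used above can be expanded mechanically. The following is a minimal sketch in which the meanings of the letters are inferred from the examples on this page rather than from a formal definition: X stands for any of 0, 1, or 2, N for 1 or 2, and A for 0 or 1. The rule that every real category contains at least one 2 is likewise an inference, from cNN0 being given as c120 + c220 + c210 with no c110. The function names and the sample counts are hypothetical.

```python
from collections import Counter
from itertools import product

# Assumed meanings of the grouping letters, inferred from examples such as
# c20X = c200 + c201 + c202 and c2AA = c200 + c210 + c211 + c201.
WILDCARDS = {"X": "012", "N": "12", "A": "01"}

def expand(pattern):
    """Expand a grouped category pattern such as '2NX' into HHBC codes."""
    choices = [WILDCARDS.get(ch, ch) for ch in pattern]
    codes = ("".join(digits) for digits in product(*choices))
    # Assumption: a category only exists if the word occurs in at least
    # one gospel, i.e. the code contains at least one '2'.
    return [code for code in codes if "2" in code]

def combine(pattern, counts_by_category):
    """Sum the word counts of every category matched by the pattern."""
    total = Counter()
    for code in expand(pattern):
        total.update(counts_by_category.get(code, {}))
    return total

# Invented word counts for three categories, for illustration only.
sample_counts = {
    "200": {"alpha": 3},
    "201": {"alpha": 2, "beta": 1},
    "210": {"beta": 4},
}

print(expand("20X"))
print(expand("NN0"))
print(combine("20X", sample_counts))
```

Combined counts produced in this way could then be turned into a profile for the grouped category, in whatever way the profiles of the individual categories are calculated.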
A similar group of tests can then be used to compare the profiles of the same categories representing passages in Matthew that do not have parallels in Mark, with those representing just those words in Matthew not used in the parallels in Mark (c21X, c21N, etc.).
We can also compare the profiles of those categories that denote passages in Matthew that either have no parallels in Mark, or where the parallels have different words (c2AX, c2AN, etc.), with those where the parallels contain the same words (c22X, c22N, etc.).
The above results show that there is no strong correlation in any of these three groups of tests. However, p201 and p211 (respectively, the profiles of the Double and the Triple Tradition words only in Matthew) do appear to be sufficiently similar (0.46) to be worth investigating further. An examination of a scatter plot of the two profiles shows that several words in these categories have similar below average frequencies in each. As both categories contain words only in Matthew, in passages that have parallels in Luke, this suggests that the frequency with which these words appear in both categories has been affected by copying/editing between Matthew and Luke.
Overall, there is no evidence here of homogeneity between those passages in Matthew with parallels in Mark and those without, i.e. there is no evidence that the passages in Matthew that have parallels in Mark came from the same source as the passages in Matthew that have no parallels in Mark. More specifically, where Matthew and Mark have identical words in parallel passages, there is no evidence that the words originated in Matthew.
The following groups of tests can be considered to be the ‘reverse’ of the previous groups, this time testing for homogeneity in Mark instead of in Matthew. The tests in the first two groups below compare the profiles of categories representing passages in Mark that do not have parallels in Matthew (c02X, c02N, etc.) with those representing:
These results show that the passages in Mark that have no parallels in Matthew have very similar profiles to those passages in Mark that do.
These results are very similar to those in the previous group, showing that the passages in Mark that have no parallels in Matthew have very similar profiles to just those words in the Matthew-Mark parallels that are in Mark but not Matthew. These two groups of results are a strong indication that the words common to Matthew and Mark originated in Mark.
The only area where categories in Mark show few signs of similarity is where Mark and Luke share identical words (c022 + c122 + c222 = cX22). Further investigation shows that this is primarily because the profile of the Mark-Luke agreements against Matthew (p122) is not similar to any other profile, while p022 and p222 are both similar to p220. This could indicate that the words in c122 came from a source outside the synoptics. However, p122 does have negative correlations with a number of other categories, the most significant being with p2AX (all of Matthew except for words shared with Mark). This is due in no small measure to AUTON appearing more frequently in c122 than in any other category, while Matthew uses AUTON much less.
We can also compare the profiles of categories that denote passages in Mark with either no parallels in Matthew, or where the parallels have different words (cA2X, cA2N, etc.), with those that contain the same words (c22X, c22N, etc.).
Unlike the two previous groups of tests, here there is much less indication of homogeneity. However, this is in large part due to the fact that cA2X, cA2N, and cA22 all contain c122, and c122 is not similar to any other category, as reported above:
Overall, the above results show a great deal of homogeneity in Mark, but a lack of it in Matthew. This is a strong indicator that the passages shared between Matthew and Mark originated in Mark. The lack of similarity between p22X and p12X does not affect this indication, but instead just provides information about the choices made by aMatthew when copying/editing from aMark. The lack of similarity between p122 and the profiles of any other categories (in any of the synoptics) has a similar cause, but in this case it suggests that the sharing of words between Mark and Matthew was later affected by the sharing of words between Mark and Luke.
As previously noted, one of the key indicators of directionality is the possible correlation between the profiles of categories containing words common to two of the synoptics and words unique to one or the other. There are four basic tests that can be used to look for ‘authorship’ of words common to any pair of the synoptics (e.g. c2XX in the case of Matthew-Mark). However, two of these tests have already been used when testing for homogeneity, leaving just two to test here. Both look for similarity between an author’s Sondergut material, and the words he has in common with another author:
There are then six variations on each of the above, depending on the existence and content of any parallels in Luke (X, N, A, 0, 1, 2), giving the following tests:
As previously noted, if whatever copying/editing took place included selectively choosing or replacing many individual words (rather than complete sentences), then the profile of the words common to any two of the synoptics may not have a strong correlation with the profile of the words in passages unique to one or the other, and that may be the case here.
Although the differences between the results of these two groups of tests are not great, what differences there are suggest that it is more likely that Mark was first, i.e. that the words common to both Matthew and Mark came from Mark. The differences between these two groups of results are greatest for c221, i.e. Triple Tradition words common to Matthew and Mark but not Luke (difference = 0.37 – 0.06 = 0.31).
The following tests can only provide limited (if any) information on directionality. However, using the directionality information from the previous tests, they may provide additional information on how the source material of the Matthew-Mark parallels was copied/edited.
The previous results have indicated that the source material came from Mark, so we can use that as an assumption in the following tests. As with the homogeneity tests, there are six variations on each of the above two tests, as follows:
There are no significant positive correlations here, and thus nothing to refute the assumption that the Matthew-Mark parallels originated in Mark.
The results of these groups of tests are not conclusive as to the mix of individual words vs. complete sentences that were copied or replaced. However, such indications as do exist support the conclusions of the tests for homogeneity, which are that the passages shared between Matthew and Mark came from Mark, and also that aMatthew mainly selected or rejected complete sentences from Mark, but did change or add many individual words as well.
The greatest indication of Matthew changing individual words comes from the 'Matthew-Mark double tradition' (c120 + c220 + c210 = cNN0), where we see:
This material (that is not in Luke) includes what is known as ‘The Great Omission’ from (approximately) Mark 6:47a - 8:27b, as well as the death of John the Baptist and some other items. In this material the relative frequency of use of various words varies greatly between Mark and Matthew, in particular the use of IHSOUS:
Here we see that the material that aMatthew chooses not to use (c120) contains the word IHSOUS a relatively small number of times, whereas in the material he adds (c210) he uses IHSOUS frequently.
The following tests compare the profiles of categories representing passages in Matthew that do not have parallels in Luke (c2X0, c2N0, etc.) with those representing:
Here we see quite different results from those in the Matthew–Mark comparisons, with some significant correlations in Matthew between the profiles of passages with parallels in Luke, and those without. The strongest correlation occurs when comparing Sondergut Matthew (c200) with the Double Tradition passages in Matthew (c20N), i.e. where there are no parallels in Mark. This in itself is a strong indication that the Double Tradition did not originate in Luke.
The correlation is nearly as strong when comparing Sondergut Matthew with just those words from the Double Tradition that are in Matthew but not Luke (c201), suggesting that c201 consists mainly of complete sentences, rather than just a selection of individual words, i.e. that the Double Tradition was created largely by selecting and copying complete sentences, and changing relatively few individual words.
However, when looking at the categories corresponding to the Triple Tradition passages in Matthew (c211 + c221 + c222 + c212 = c2NN), the correlations indicate that the copying/editing between Matthew and Luke involved changing a much greater number of individual words. This is particularly so for the words also shared with Mark (c221 + c222 = c22N):
As with the previous two groups of tests, the strongest correlation again relates to the Double Tradition. We have:
This evidence supports the view that the Double Tradition material originated in Matthew, and that words common to Matthew and Luke were mainly re-used in Luke in the form of complete sentences, with only a small percentage of the words from Matthew being removed or replaced by aLuke in the process.
With regard to passages in Matthew that are part of the Triple Tradition (c211 + c221 + c222 + c212), we have:
Although these two groups of tests appear to indicate that all the passages in Luke that do not have parallels in Matthew have similar profiles to all those that do (p0X2 – pNX2 = 0.52), the correlation is actually only significant where Luke does not share words with either Matthew or Mark (p0A2 – p1A2 = 0.70). This indicates that the only parts of Luke that are homogeneous are those categories containing words unique to Luke, which in turn suggests that the words in Luke common to either Matthew or Mark did not come from the same source as the words unique to Luke.
The most interesting result here is the negative correlation between pAN2 and p2N2 (= -0.48). Examining the scatter plots of these and other categories shows that this is mainly due to there being a number of words with above average frequencies in c1N2 (which is part of cAN2) that have below average frequencies in c2N2:
Unlike the equivalent tests for the ‘ownership’ of the identical parallels in Matthew and Mark, there are significant differences between the results of these two groups of tests, indicating that many of the words shared with Luke originated in Matthew. The evidence is strongest for the words in the Double Tradition:
These results show that p220 (words common to Matthew and Mark only) and p022 (words common to Mark and Luke only) are similar, indicating that both categories contain words mainly originating in the same source, i.e. Mark. This is the reason that p222 is similar to both p220 and p022, indicating in turn that c222 also contains words mainly originating in Mark.
The previous results indicate that the Double Tradition (c202) words common to both Matthew and Luke most likely originated in Matthew, while the origin of the words in Matthew-Luke agreements against Mark (c212) is uncertain. The following tests may help to clarify this.
There is little evidence of editing choices here, except in the case of the Double Tradition (c202), where the results support the previous evidence that indicates that the Double Tradition consists mainly of sentences originating in Matthew.
The Double Tradition result in this group of tests appears to contradict previous results, since p202 is similar to both p201 and p102, suggesting that c202 originated in both Matthew and Luke. The key indicators here are:
As p102 is not similar to p002, but is similar to p202, and p202 is similar to p200, it is reasonable to suppose that p102 might also be similar to p200. However, this is not actually the case. Instead:
More specifically, some of the Double Tradition text may have originated in S (in addition to Matthew, as previously suggested). However, because p200 (Sondergut Matthew) is very similar to both p201 and p202, any text originating in S must have been edited by aMatthew before any of it was used within Luke. Then, after aMatthew had made his mark on the text, aLuke added his own changes, leaving p102 still similar to p202, but not similar to Sondergut Luke.
The following tests all compare the profiles of categories representing passages in Mark that do not have parallels in Luke (cX20, cN20, etc.) with those representing passages in Mark that do have parallels in Luke (cX2N, cN2N, etc.).
On the assumption that aLuke copied/edited text from Mark (as suggested above), we would expect to see significant correlations in these tests. However, although some do exist, they do not form a clear pattern. In particular, there is some evidence that the ‘Mark-Luke double tradition’ (i.e. c021 and c022, where there is no sharing with Matthew) may have originated in Mark, and some evidence that the Triple Tradition words common to all three synoptics (c222) also originated in Mark, but little else. Overall, there are enough correlations greater than 0.4 to suggest that Mark was the source of the words common to Mark and Luke (with Luke changing many individual words), and nothing to suggest otherwise.
The following tests all compare the profiles of categories representing passages in Luke that do not have parallels in Mark (cX02, cN02, etc.) with those representing passages in Luke that do have parallels in Mark (cXN2, cNN2, etc.).
There is little evidence here of any significant degree of correlation between categories in Luke. The exception is cA02 (Sondergut Luke + Double Tradition words only in Luke), where we have:
However, because p012 does not have such a strong correlation with p002, again a little more investigation is needed:
From these results we can see that c012 is most likely to come from the same source as c002 and c112. However, the correlations indicate that p012 contains a higher percentage of individual words, which is consistent with the view that aLuke significantly altered the Greek of the passages he took from Mark that are not in Matthew.
The lack of correlation between p002 and both p022 and p122 suggests that the words shared between Mark and Luke did not come from Luke, but from Mark instead.
The following tests check for correlations between the profiles of cX22 (and variations cN22, cA22, etc.) and the equivalent Markan and Lukan categories.
These tests support previous results suggesting that the main source of the words common to all three synoptics was Mark. However, they do not provide any strong indication of the sources of the words common to Mark and Luke but not Matthew, i.e. those in c022 and c122.
The lack of any significant correlations in the first group of tests above and the negative correlations in the second group suggest that the words common to both Mark and Luke could have originated in Luke. However, this appears to contradict previous results, and others (below), which suggest that the words in Mark that are common to either Matthew or Luke originated in Mark:
The key to understanding this problem is knowledge of the differences between the Greek used in Mark and that used in Luke, in particular that Luke ‘corrects’ or ‘improves’ the Greek used in Mark. For example, cX22 contains above average use of EIPON, with below average use in cX12, while the converse is true of EIS. As previously mentioned, this causes a negative correlation between the profiles of many of the categories in Mark when compared with categories in Luke.
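One way to see which words drive a negative correlation of this kind is to split the coefficient into per-word contributions. The sketch below assumes that the correlations are Pearson coefficients (not stated on this page) and uses invented profile values; only the direction of the EIPON and EIS deviations follows the description above. The function name and the other word labels are hypothetical.

```python
from math import sqrt

def correlation_contributions(profile_a, profile_b):
    """Split Pearson's r into one additive term per word.

    Each profile maps a word to its value in that profile. The returned
    terms sum to the correlation coefficient, so a strongly negative term
    marks a word that pulls the two profiles apart.
    """
    words = sorted(set(profile_a) & set(profile_b))
    n = len(words)
    mean_a = sum(profile_a[w] for w in words) / n
    mean_b = sum(profile_b[w] for w in words) / n
    dev_a = {w: profile_a[w] - mean_a for w in words}
    dev_b = {w: profile_b[w] - mean_b for w in words}
    norm = (sqrt(sum(d * d for d in dev_a.values()))
            * sqrt(sum(d * d for d in dev_b.values())))
    return {w: dev_a[w] * dev_b[w] / norm for w in words}

# Invented values: EIPON above average in the first profile and below
# average in the second, with the converse for EIS.
p_x22 = {"EIPON": 0.8, "EIS": -0.6, "word_c": 0.1, "word_d": -0.1}
p_x12 = {"EIPON": -0.5, "EIS": 0.7, "word_c": 0.1, "word_d": -0.2}

contributions = correlation_contributions(p_x22, p_x12)
for word in sorted(contributions, key=contributions.get):
    print(f"{word:8s} {contributions[word]:+.3f}")
print(f"r = {sum(contributions.values()):+.3f}")
```

Listing the terms from most negative upwards puts words such as EIPON and EIS at the top in this invented example, which is the same information that a scatter plot of the two profiles would show visually.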