dianoigo: biblical studies, theology, church history and more: The Statistics of Satan

Two Christadelphian friends with whom I have discussed the biblical doctrine of the devil and Satan (Jonathan Burke and Kenneth Gilmore) recently made a surprising claim to me in correspondence. They hold that within the New Testament one observes the marginalization of the terms ‘devil’ (Greek: diabolos) and ‘Satan’ (Greek: satanas). More specifically, they maintain that these terms are prominent in books written for preaching purposes but virtually disappear in the rest of the New Testament, which was written for mature Christians. (The same assertion was made regarding demons, but we will not discuss that here.) Burke listed Matthew, Mark, Luke and Acts as the books written for preaching purposes, while Gilmore included all four Gospels and Acts.

Gilmore produced the bar graph below in support of his claim:

It was pointed out to great effect that nine [sic] New Testament books contain no reference to the devil or Satan.

Now, as a professional statistician I was immediately skeptical of the conclusions being drawn from these figures. My suspicion was that the variations seen in the graph above could be largely explained on the basis of variations in word count. After all, the Gospels and Acts are long books, comprising over 60% of the New Testament by word count. Thus, if references to the devil/Satan were distributed uniformly across the New Testament, we would expect about 60% of these references to be in the Gospels and Acts. If the devil/Satan receives relatively more attention in these books than in other books, we would expect them to contain well over 60% of the occurrences of the words diabolos and satanas. In fact, they contain just under 50%.
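The expectation argument can be made concrete with a little arithmetic. The sketch below assumes only figures quoted in this post: a word share of roughly 60% for the Gospels and Acts, and a combined total of 65 occurrences (the 32 + 33 reported in the Data section). Since "over 60%" and "just under 50%" are rounded, the output is approximate.

```python
# Expected occurrences in the Gospels and Acts if references to the
# devil/Satan were spread evenly by word across the New Testament.
# ASSUMPTION: 0.60 and 0.50 are the rounded shares quoted in the post.
WORD_SHARE_GOSPELS_ACTS = 0.60      # "over 60%" of New Testament words
TOTAL_OCCURRENCES = 32 + 33         # diabolos + satanas totals


def expected_occurrences(word_share: float, total: int) -> float:
    """Occurrences expected in a subset under a uniform-by-word distribution."""
    return word_share * total


expected = expected_occurrences(WORD_SHARE_GOSPELS_ACTS, TOTAL_OCCURRENCES)
observed_upper_bound = 0.50 * TOTAL_OCCURRENCES   # "just under 50%"

print(f"expected under uniformity: about {expected:.0f}")
print(f"observed: fewer than {observed_upper_bound:.1f}")
```

On these rounded figures the Gospels and Acts fall short of the uniform expectation rather than exceeding it.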

I drew Kenneth Gilmore's attention to these figures, but he held steadfastly to his theory, so I decided that a more thorough analysis was in order, the results of which follow.

Data

I first counted the number of occurrences of the Greek nouns diabolos (devil) and satanas (Satan) in each New Testament book. The New Testament totals are 32 and 33 respectively. These counts omit three plural uses of diabolos in the pastoral epistles (1 Tim. 3:11; 2 Tim. 3:3; Titus 2:3), where the word describes slanderous people and functions as an adjective (Daniel B. Wallace, Greek Grammar Beyond the Basics, p. 224).

I also counted 15 references to the devil and Satan by other titles: "the god of this world" (2 Cor. 4:4); "the ruler of this world" (John 12:31; 14:30; 16:11); Beliar (2 Cor. 6:15); "the ruler of the power of the air" (Eph. 2:2); and "the evil one" (Matt. 13:19; 13:38; John 17:15; 2 Thess. 3:3; 1 John 2:13; 1 John 2:14; 1 John 3:12; 1 John 5:18; 1 John 5:19). Some translations render "the evil one" in other passages (including the Lord's Prayer!), but I have limited myself to these nine instances, which are unanimously so rendered in the six translations I consulted (NRSV, ESV, NASB, NIV, NKJV and NLT).

Titles for the devil/Satan which are in the immediate context of an explicit reference to the devil/Satan were not counted. By the method of counting described above, the total number of references to the devil/Satan in the New Testament is 81. However, only the words satanas and diabolos were included in my initial analysis in case anyone might contest that the other terms refer to the devil.

I considered other variables to use in a statistical model to assess the claims made by Burke and Gilmore. The first of these is the word count per New Testament book. The counts were taken from the Nestle-Aland Greek text, and range from a low of 219 (3 John) to a high of 19,482 (Luke). Obviously these figures would vary slightly if a different text were used. The second is a categorical variable, 'purpose', given a value of '1' if the book was written for preaching purposes (according to Burke's classification) and '0' if written for mature Christians. This will allow us to check for differences between these two groups of books. I also considered a second, more objective way of classifying the books: genre. In this case, books were classified as ‘narrative’ (Matthew-Acts), ‘epistles’ (Romans-Jude) or ‘apocalyptic’ (Revelation). In practice the classification is nearly the same as Burke’s; only John and Revelation would need reclassification.
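The classification variables described above can be sketched in a few lines. This is only an illustration of the coding scheme as the text describes it; the function and variable names are my own, and the mapping "narrative corresponds to preaching, epistles to mature Christians" is an assumption used to check which books change category.

```python
# The 27 New Testament books in canonical order.
NT_BOOKS = [
    "Matthew", "Mark", "Luke", "John", "Acts", "Romans",
    "1 Corinthians", "2 Corinthians", "Galatians", "Ephesians",
    "Philippians", "Colossians", "1 Thessalonians", "2 Thessalonians",
    "1 Timothy", "2 Timothy", "Titus", "Philemon", "Hebrews", "James",
    "1 Peter", "2 Peter", "1 John", "2 John", "3 John", "Jude",
    "Revelation",
]

BURKE_PREACHING = {"Matthew", "Mark", "Luke", "Acts"}
GILMORE_PREACHING = BURKE_PREACHING | {"John"}
NARRATIVE = {"Matthew", "Mark", "Luke", "John", "Acts"}


def purpose(book, preaching_set):
    """1 if the book is classed as written for preaching, else 0."""
    return 1 if book in preaching_set else 0


def genre(book):
    """Three-way genre classification: narrative, epistles or apocalyptic."""
    if book in NARRATIVE:
        return "narrative"
    if book == "Revelation":
        return "apocalyptic"
    return "epistles"


def reclassified(books):
    """Books whose genre does not line up with Burke's binary split."""
    # ASSUMPTION: preaching (1) is matched with narrative, 0 with epistles.
    expected = {1: "narrative", 0: "epistles"}
    return [b for b in books
            if genre(b) != expected[purpose(b, BURKE_PREACHING)]]


print(reclassified(NT_BOOKS))
```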

The final variable is the most likely date of composition for each book. These dates were taken from this resource, which provides published sources for its estimates. Where the most likely date provided was a range of years, I used the midpoint of the range.

Model

(Note: if you’re not mathematically inclined you may wish to skip down to the ‘Rate of Occurrence Graphs’ section).

The occurrences of words within a text are count data which would typically be modeled using the Poisson probability distribution. A Poisson regression model allows us to model the relationship between this dependent variable (the diabolos+satanas count) and certain predictor variables. The advantage to using such a statistical model is that it enables us to measure the effects of several different factors on the diabolos+satanas count simultaneously. This will help us determine whether the variations in counts in different New Testament books are a result of differing ‘purpose’ (as classified by Burke and Gilmore), or differing word counts, or both.

The general equation of the model is given below, where y_i is the number of occurrences of the words diabolos and satanas in the i^th book and the x variables are the independent variables (predictors).
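For readers following the algebra, a Poisson regression with the usual log link is conventionally written as shown here. This is the textbook form, offered as an assumption about the notation: the symbols \mu_i (the expected count for book i), \beta_j (the coefficients) and k (the number of predictors) are introduced for illustration.

```latex
y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\ln \mu_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}
```

Because the model is linear on the log scale, exponentiating a coefficient gives the multiplicative change in the expected count for a one-unit change in that predictor.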

Results

I first considered a simplistic model where Burke’s ‘purpose’ classification is the only predictor. Estimating the model in SAS produced the following output:

Analysis Of Maximum Likelihood Parameter Estimates
Parameter          DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept           1    0.4754          0.1644          0.1532          0.7976             8.36      0.0038
preachingpurposes   1    1.4705          0.2505          0.9796          1.9614            34.46      <.0001

To interpret the output of such a model the two key quantities to look at are the sign of the number in the ‘Estimate’ column and the value in the ‘Pr > ChiSq’ column, known as the p-value. If the ‘Estimate’ for a particular variable is positive, this indicates a positive relationship with the dependent variable (satanas+diabolos count). If the ‘Estimate’ is negative, this indicates a negative relationship. The p-value tells us whether the relationship is statistically significant. The most widely used ‘rule of thumb’ states that if the p-value is less than 0.05, the relationship is statistically significant. If the p-value is greater than 0.05, statistically speaking we cannot affirm that such a relationship exists as it is not strong enough relative to the standard error of the estimate.
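The columns of the SAS table are linked by simple formulas, which the following sketch reproduces from an estimate and its standard error. It assumes the standard Wald construction (normal approximation, one degree of freedom); the function name is my own. Because the table values are rounded, the results agree with the printed output only to about three decimal places.

```python
import math


def wald_summary(estimate, std_error, z_crit=1.959964):
    """Wald chi-square, two-sided p-value and 95% limits for one coefficient."""
    z = estimate / std_error
    chi_square = z * z
    # Upper tail of a chi-square with 1 df equals the two-sided normal tail.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    lower = estimate - z_crit * std_error
    upper = estimate + z_crit * std_error
    return {"chi_square": chi_square, "p_value": p_value, "ci": (lower, upper)}


# Intercept row of the first model, using the rounded values in the table.
intercept = wald_summary(0.4754, 0.1644)
print(intercept)
```

A p-value below 0.05 corresponds to a Wald chi-square above roughly 3.84, or equivalently to a 95% interval that excludes zero.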

Applying this to the table above, the coefficient of the ‘purpose’ variable is statistically significant, since its p-value (< .0001) is below 0.05. Because the estimate is positive, the rate of occurrence of satanas+diabolos per book is higher in Matthew, Mark, Luke and Acts than in the rest of the New Testament. Taking e^1.4705 ≈ 4.35, we estimate that the per-book rate in these four books is more than four times that in the rest of the New Testament. This is no surprise; it is basically the same information that the bar graph above communicates visually.
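With a single yes/no predictor, a Poisson regression has a closed-form solution, so the first table can be checked by hand. The sketch below assumes group totals of 37 occurrences in the 23 'mature Christian' books and 28 in the four 'preaching' books. These totals are not stated in the post; I have back-calculated them from the reported intercept and standard errors, and they sum to the 65 occurrences given in the Data section.

```python
import math


def fit_binary_poisson(total0, n0, total1, n1):
    """Closed-form Poisson regression fit with one 0/1 predictor.

    total0, n0: occurrences and number of books in the reference group (0).
    total1, n1: occurrences and number of books in the indicator group (1).
    """
    mean0 = total0 / n0
    mean1 = total1 / n1
    intercept = math.log(mean0)                 # log of mean count, group 0
    coef = math.log(mean1 / mean0)              # log rate ratio, group 1 vs 0
    se_intercept = math.sqrt(1.0 / total0)
    se_coef = math.sqrt(1.0 / total0 + 1.0 / total1)
    return intercept, se_intercept, coef, se_coef


# ASSUMPTION: 37 and 28 are inferred from the SAS output, not quoted from it.
intercept, se_intercept, coef, se_coef = fit_binary_poisson(37, 23, 28, 4)
rate_ratio = math.exp(coef)

print(f"intercept = {intercept:.4f} (SE {se_intercept:.4f})")
print(f"purpose   = {coef:.4f} (SE {se_coef:.4f})")
print(f"rate ratio = {rate_ratio:.2f}")
```

On these assumed totals the average is about 1.6 occurrences per book outside the four 'preaching' books and 7 per book within them, which is where the factor of roughly 4.35 comes from.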

However, what happens in the model when we control for the differing lengths of New Testament books? To find out we add a second independent variable, log(word count). Because the total word counts per book are large numbers with a lot of variation, a better fit in the model is obtained by taking the natural logarithm.

Analysis Of Maximum Likelihood Parameter Estimates
Parameter          DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept           1   -6.4054          1.3975         -9.1444         -3.6663            21.01      <.0001
ln(word count)      1    0.8650          0.1646          0.5423          1.1877            27.60      <.0001
preachingpurposes   1   -0.0673          0.3317         -0.7174          0.5828             0.04      0.8392

In the output above, log(word count) is significant (p-value < .0001), with a positive coefficient estimate. This indicates that as the length of a book increases, the number of occurrences of satanas and diabolos tends to increase (no surprise here!). Even more importantly, the ‘purpose’ variable is no longer statistically significant (p-value = 0.8392). This means that once we control for word count, there is no longer any statistically detectable difference in the rate of occurrence of satanas and diabolos between these two categories of books. Indeed, if we add an interaction term to the model, it is also not significant (p-value = 0.9353). This is consistent with the rate at which satanas + diabolos occurrences increase with word count being the same in the Synoptic Gospels and Acts as in the rest of the New Testament.
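To see what the fitted coefficients imply, one can plug a word count into the model and exponentiate. The sketch below uses the rounded estimates from the table and the two word counts quoted in the Data section (Luke and 3 John); the function name is my own, and the outputs are model expectations, not observed counts.

```python
import math

# Rounded estimates from the model with ln(word count) and purpose.
INTERCEPT = -6.4054
B_LOG_WORDS = 0.8650
B_PREACHING = -0.0673


def expected_count(words, preaching):
    """Expected satanas+diabolos count for a book of the given length."""
    linear = INTERCEPT + B_LOG_WORDS * math.log(words) + B_PREACHING * preaching
    return math.exp(linear)


luke = expected_count(19482, preaching=1)        # longest book
third_john = expected_count(219, preaching=0)    # shortest book
purpose_ratio = math.exp(B_PREACHING)            # preaching vs. other, same length

print(f"Luke:   {luke:.2f} expected occurrences")
print(f"3 John: {third_john:.2f} expected occurrences")
print(f"rate ratio for purpose: {purpose_ratio:.2f}")
```

The model expects roughly eight occurrences in a book the length of Luke and well under one in a book the length of 3 John, while the estimated effect of 'purpose' at a fixed length is a ratio close to 1.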

Furthermore, by adding the log(word count) variable to the model, the AIC (a measure of goodness of fit which is smaller in a better model) reduces from 122.0 to 92.7, implying that our new model has much greater explanatory power.
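For reference, the AIC is conventionally defined as below, where k is the number of estimated parameters and \hat{L} is the maximized likelihood. These symbols are the textbook notation rather than anything taken from the SAS output.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\Delta\mathrm{AIC} = 122.0 - 92.7 = 29.3
```

The extra parameter adds 2 to the penalty term, so a drop of 29.3 reflects a large gain in log-likelihood.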

I also ran the same model using Gilmore's way of classifying the books (which includes John in the 'preaching' group), and using my own way of classifying the books (according to three genres: narrative, epistles and apocalyptic). The conclusions are the same, except that the apocalyptic genre has a statistically significant positive effect on rate of occurrence of diabolos+satanas.

All of this draws us to the inevitable conclusion that the alleged marginalization of the devil/Satan in the non-preaching books of the New Testament is statistically unsustainable.

Date of Composition as a Predictor

What if we consider ‘Date of composition’ as a predictor variable in the model? If we consider a model with log(word count) and most likely date of composition as independent variables, the output is as follows:

Analysis Of Maximum Likelihood Parameter Estimates
Parameter          DF  Estimate  Standard Error  Wald 95% Lower  Wald 95% Upper  Wald Chi-Square  Pr > ChiSq
Intercept           1   -6.9092          1.1667         -9.1958         -4.6226            35.07      <.0001
ln(word count)      1    0.8131          0.1229          0.5722          1.0541            43.75      <.0001
likely_date         1    0.0138          0.0075         -0.0010          0.0285             3.33      0.0679

Here, the coefficient for date of composition is positive but falls just short of statistical significance (p-value = 0.07). This suggests that there is no significant change in the rate of occurrence of satanas and diabolos as we move forward in time according to the composition of the books. If anything, the rate increases slightly with time. This militates against any claim that the devil and Satan disappeared from the church’s vocabulary as time went on. (Burke and Gilmore have not made such a claim to my knowledge, but prevention is better than cure.)
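The size of the date effect is easier to judge as a rate ratio. The sketch below exponentiates the rounded estimate and confidence limits from the table; the per-decade figure is simply the same coefficient scaled to ten years, and the names are my own.

```python
import math

# Rounded values from the likely_date row of the table.
B_DATE = 0.0138
CI_DATE = (-0.0010, 0.0285)


def rate_ratio(coef, years=1.0):
    """Multiplicative change in the expected rate over the given span of years."""
    return math.exp(coef * years)


per_year = rate_ratio(B_DATE)
per_decade = rate_ratio(B_DATE, years=10)
ci_per_year = tuple(rate_ratio(bound) for bound in CI_DATE)

print(f"per year:   {per_year:.4f}")
print(f"per decade: {per_decade:.3f}")
print(f"95% limits per year: {ci_per_year[0]:.4f} to {ci_per_year[1]:.4f}")
```

The point estimate is an increase of about 1.4% per year of composition date, with an interval that just includes a ratio of 1 (no change).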

Including other titles of Satan

Up to this point we excluded the other titles of Satan (the ruler of this world, the evil one, etc.) in case Christadelphians might object to these being called titles of Satan. However, since a strong case can be made that these are in fact titles of Satan, it is worth considering what effect their inclusion would have on our models.

In short, there are no major changes to the results except in the model which includes date of composition, where this variable’s coefficient is now statistically significant (p-value = 0.0009) and positive, suggesting that the rate of occurrence of references to the devil/Satan actually increases over the period of composition of the New Testament (assuming the dates accepted by scholarly consensus are accurate). This result is probably due to the inclusion of five references to ‘the evil one’ in 1 John, one of the latest books in the New Testament.

Rate of Occurrence Graphs

Graphs are often easier to understand than the output of sophisticated statistical models. Thus, having established statistically that the effect of book purpose or genre falls away once word count is taken into account, it would be useful to show this graphically. The bar graphs below are similar to Gilmore’s, but instead of showing only the counts of satanas + diabolos, they show the rates, calculated by dividing the satanas + diabolos count by the total word count of each book.
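The rate calculation behind these graphs is a single division, sketched here per 1,000 words for readability. The word counts are the two quoted in the Data section; the occurrence count used for Luke is a hypothetical placeholder for illustration, not a figure from my data, while the zero for 3 John follows from its absence of any reference.

```python
def rate_per_thousand(occurrences, word_count):
    """Occurrences of satanas+diabolos per 1,000 words of text."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000.0 * occurrences / word_count


# (occurrences, word count). ASSUMPTION: the count of 10 for Luke is a
# hypothetical placeholder; only the word counts come from the post.
examples = {"Luke": (10, 19482), "3 John": (0, 219)}

rates = {book: rate_per_thousand(n, words) for book, (n, words) in examples.items()}
for book, rate in rates.items():
    print(f"{book}: {rate:.3f} per 1,000 words")
```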

The main observation to be made about this graph is that there is no clear pattern as we look at different portions of the New Testament. It would be an oversimplification to conclude that the different rates of occurrence across different books and writers reflect different emphases on the doctrine of Satan. The frequency of references to Satan is governed by the broader purpose and themes of each book. Nevertheless, we see that Satan is mentioned fairly consistently across the New Testament. 1 John (written c. 95 AD for mature Christians) has the highest rate in the whole New Testament, and 1 Timothy (written to a Christian leader) has the highest rate when terms other than satanas and diabolos are excluded.

This rules out the idea that Satan/the devil are marginalized either as we move forward in time or as we move from ‘preaching’ books to books for ‘mature Christians’ as defined by Burke and Gilmore.

We can note further that every single New Testament writer makes mention of the devil/Satan at least once with the possible exception of the writer of 2 Peter (if we accept the critical consensus that it was not written by Peter). If Colossians was not written by Paul or by the writer of Ephesians or 2 Thessalonians, then this writer would be another exception. Those writers who do refer to the devil or Satan are Matthew, Mark, Luke (the presumed writer of Luke and Acts), John the Evangelist (writer of the Fourth Gospel and Johannine epistles), Paul, the writer(s) of the deutero-Pauline epistles if different from Paul (namely, Ephesians, Colossians and 2 Thessalonians), the Pastor if different from Paul, the writer of Hebrews, James, Peter, Jude, and John the Seer (again, if we follow critical scholarship in attributing Revelation to a different author than the Gospel and epistles of John). A graph showing the rate of occurrences by author appears below:

Again, the main thing to be noted is the absence of any clear trend. Apart from the anomalies of Jude and the writer of 2 Peter (which mean little in view of their small volumes of writings), the rate of occurrence is remarkably consistent across all writers.

Books which do not mention Satan

As mentioned at the beginning, Burke and Gilmore would draw our attention to the fact that nine New Testament books omit any mention of the devil or Satan.[1] In fact, the number is eight as they have wrongly included 2 Thessalonians in the list (2 Thess. 2:9; cf. also 2 Thess. 3:3). Is this problematic for the importance of the devil in first century Christian theology?

By way of comparison, a quick search shows there are ten New Testament books in which the word basileia (kingdom) does not occur.[2] There are also nine New Testament books in which neither the word anastasis (resurrection) nor the verbs anistemi or egeiro (rise; raise up) occurs.[3] I doubt that Burke or Gilmore would claim that this proves the kingdom of God and the resurrection are marginalized within the New Testament.

Among the books in all three lists are four of the five shortest books of the New Testament (Titus, Philemon, 2 John and 3 John), which have word counts of 659, 335, 245 and 219. This is again the word count effect: a shorter book has less content in which a reference to Satan might arise. The other four books are all epistles which fall under ‘task theology’, addressing specific situations faced by the original audience. Indeed, most of the New Testament is like this; it was not written as a purely theological endeavour. Thus, the fact that 18 of the 22 New Testament books which are longer than 700 words mention Satan demonstrates that Satan was a highly relevant topic throughout the apostolic age.
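The counting in this section can be checked with a few set operations on the three footnoted lists. The sketch assumes that the fifth of the five shortest books is Jude and that these five are exactly the books of 700 words or fewer; neither point is stated explicitly above, and the variable names are my own.

```python
NT_BOOKS = [
    "Matthew", "Mark", "Luke", "John", "Acts", "Romans",
    "1 Corinthians", "2 Corinthians", "Galatians", "Ephesians",
    "Philippians", "Colossians", "1 Thessalonians", "2 Thessalonians",
    "1 Timothy", "2 Timothy", "Titus", "Philemon", "Hebrews", "James",
    "1 Peter", "2 Peter", "1 John", "2 John", "3 John", "Jude",
    "Revelation",
]

# Footnotes [1], [2] and [3] respectively.
NO_SATAN = {"Galatians", "Philippians", "Colossians", "Titus", "Philemon",
            "2 Peter", "2 John", "3 John"}
NO_KINGDOM = {"2 Corinthians", "Philippians", "1 Timothy", "Titus", "Philemon",
              "1 Peter", "1 John", "2 John", "3 John", "Jude"}
NO_RESURRECTION = {"2 Thessalonians", "1 Timothy", "Titus", "Philemon",
                   "2 Peter", "1 John", "2 John", "3 John", "Jude"}

# ASSUMPTION: Jude is the fifth-shortest book, and only these five
# books have 700 words or fewer.
FIVE_SHORTEST = {"Titus", "Philemon", "2 John", "3 John", "Jude"}

in_all_three = NO_SATAN & NO_KINGDOM & NO_RESURRECTION
longer_books = [b for b in NT_BOOKS if b not in FIVE_SHORTEST]
longer_mentioning = [b for b in longer_books if b not in NO_SATAN]

print(sorted(in_all_three))
print(f"{len(longer_mentioning)} of {len(longer_books)} longer books mention Satan")
```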

Conclusion

We can stress again that 1 John and Revelation, two of the final books to be written according to most scholars, have among the highest rates of reference to the devil/Satan. Furthermore, two books stress the importance of the devil to the redemptive work of Christ, both of which were written to mature Christians (Heb. 2:14; 1 John 3:8).

In conclusion, the evidence does not support the claim that the devil/Satan is marginalized within any subset of New Testament books. While far from the most important doctrine of the early church, ‘satanology’ played a consistent supporting role as the New Testament writers sought to proclaim the gospel and teach and encourage Christian believers in the face of moral and doctrinal challenges within and persecution without. Satan is referred to repeatedly in narrative, the teachings of Jesus, pastoral advice in the epistles, and the Apocalypse.

[1] Galatians, Philippians, Colossians, Titus, Philemon, 2 Peter, 2 John, 3 John

[2] 2 Corinthians, Philippians, 1 Timothy, Titus, Philemon, 1 Peter, 1 John, 2 John, 3 John, Jude

[3] 2 Thessalonians, 1 Timothy, Titus, Philemon, 2 Peter, 1 John, 2 John, 3 John, Jude

Tuesday, 5 November 2013
