Since my blog post a few days ago, a few people have asked about the “Top Ten Mistakes Made When Evaluating Anti-Malware Software” that Kevin Townsend quoted here. Kevin was actually quoting a press release, but it’s something I’ve used in quite a few contexts: a sort of mini-update to “A Reader’s Guide to Reviews”, originally credited to Sarah Tanner in Virus News International but actually written by Dr. Alan Solomon.
So here it is again (slightly expanded): perhaps more up-to-date but considerably less detailed than Alan’s article.
1. Using samples received via email or on a honeypot machine without checking that they really are malicious software. Some tests we’ve come across have included well over 10% false positives, corrupted samples and so on, and have used them uncritically (i.e. without validation). (See the sketch after this list for the sort of basic triage that at least weeds out corrupted and duplicated files before validation proper.)
2. Using one of the programs to be tested to validate the samples. Or rather to pseudo-validate them, since this takes no account of the possibility of false positives.
3. Assuming that any sample flagged as malicious by two or more scanners must be valid. This may bias the test in favour of products that flag everything meeting very broad criteria as suspicious, and against products that are more discriminating and fastidious about false positives. “It’s executable! It’s suspicious!”
4. Using VirusTotal or a similar service to check the samples and assuming that any product which doesn’t report them as malicious can’t detect them. This will once again give the advantage to scanners that flag everything as “suspicious”, and will also disadvantage scanners that use some form of dynamic or behavioural analysis. It’s certainly not a real test, and it’s a form of pseudo-testing that VirusTotal itself discourages.
5. Using the default settings for detection testing, without trying to configure each product to the same level of paranoia. This isn’t a test of detection, but a test of design philosophy. Which is fine as long as you and your readers understand that.
6. Using default settings for scanning speed. This may introduce a bias in favour of products that get their speed advantage by cutting corners on detection, which may not be what the tester (or his audience) had in mind.
7. Asking vendors to supply samples. This may allow a vendor to bias the results in their own favour by including samples that other companies are unlikely to have access to, to the disadvantage of companies who consider it unethical to share samples outside their web of trust. Some companies won’t cooperate with this sort of testing at all, but that puts them at an unfair disadvantage, because it looks as if they’re scared to compete.
8. Categorising samples incorrectly, leading to possible errors in configuration. For instance, not all products flag certain kinds of “greyware” (described by some vendors as “possibly unwanted applications” or similar) as malware by default. That can be particularly misleading in combination with error 5.
9. Too much self-belief. If two products that use the same version of the same engine score completely differently in a test, it is unsafe to assume that there must be something wrong with the lower-scoring product: it is just as likely to be a problem with the setup or methodology. But some testers will not discuss the possibility that they may have tested incorrectly, and will not allow vendors to validate their sample set or methodology in any way. Of course, this may not be overconfidence, but a fear that their test will be found to be invalid.
10. Not including a contact point or allowing any right to reply. Be open about the methodology used and the objective of the evaluation, so that others have the opportunity to verify the validity of the test.
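Since mistakes 1 to 4 all come back to sample validation, here is a minimal sketch of the kind of pre-validation triage I have in mind. It’s purely illustrative Python of my own, not any tester’s actual tooling; the samples directory and the output format are assumptions. It deduplicates candidate files by SHA-256 and rejects anything that isn’t even a structurally plausible Windows executable. It says nothing about whether a sample is genuinely malicious, which still requires proper validation (replication, analysis, or both).

```python
# Illustrative pre-validation triage sketch: deduplicate candidate samples and
# discard files that aren't even structurally plausible PE executables.
# This is NOT validation; it only removes the obvious junk described in mistake 1.
import hashlib
from pathlib import Path

SAMPLE_DIR = Path("samples")  # hypothetical directory of candidate samples


def sha256(path: Path) -> str:
    """Hash the file so exact duplicates can be collapsed before testing."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def looks_like_pe(path: Path) -> bool:
    """Crude structural check: an 'MZ' header plus a PE signature at e_lfanew.
    Weeds out truncated or corrupted files, nothing more."""
    data = path.read_bytes()
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    pe_offset = int.from_bytes(data[0x3C:0x40], "little")
    return data[pe_offset:pe_offset + 4] == b"PE\0\0"


seen = {}
for sample in sorted(SAMPLE_DIR.iterdir()):
    if not sample.is_file():
        continue
    digest = sha256(sample)
    if digest in seen:
        print(f"DUPLICATE  {sample.name} (same hash as {seen[digest]})")
    elif not looks_like_pe(sample):
        print(f"REJECT     {sample.name} (corrupted or not a PE executable)")
    else:
        seen[digest] = sample.name
        print(f"CANDIDATE  {sample.name}  sha256={digest}")
```

Even the files this keeps are only candidates: as mistakes 2 to 4 point out, “validating” them with one of the products under test, or with a multi-scanner vote, isn’t validation at all.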
Mapping these points against the nine principles is left as an exercise for the reader. Or maybe I’ll come back to that. 😉
David Harley CITP FBCS CISSP
I like numbers (5) and (7).
(5) At one organization (not to be named), we changed our default out-of-the-box settings to be more in line with competitors because we were getting burned in tests.
(7) More reviewers probably do this than would care to admit it. Stating where the samples are obtained from would be useful.
By: craig kensek on June 21, 2010
at 3:44 pm
[…] to an AMTSO blog post, I've returned to it (and slightly tweaked it) as another AMTSO blog post. I'll probably return to it in more detail here, […]
By: Triflex Enterprise | Testing and how not to do it on June 21, 2010
at 5:55 pm
[…] software testing results, keep the methodology used in mind. See if you can identify any of the top ten testing mistakes frequently made by testers and prepare to question the conclusions in the […]
By: What the New AMTSO Guidelines Mean to Users | Malware Blog | Trend Micro on June 24, 2010
at 11:51 am
[…] Commentary without comment spam… By David Harley I should also have pointed out in my previous post that Alice Decker, a Trend Micro researcher who is very active in AMTSO, posted an interesting blog providing commentary on the latest guidelines documents approved at Helsinki and published here, the Kevin Townsend blog considered at some length here, and the top ten testing screw-ups blog here. […]
By: Commentary without comment spam… « amtso on June 27, 2010
at 6:25 pm
[…] one of David Harley’s ‘common mistakes’ in How to Screw Up Testing is “Using VirusTotal or a similar service to check the samples and assume that any product […]
By: Anti Malware Testing Standards Organization: a dissenting view « Kevin Townsend on June 28, 2010
at 10:42 am
[…] attention from sites pushing fake AV from re-posts of blogs that reference ours (especially this one on “how to screw up testing”). This blog offers a way for people who aren’t […]
By: Comment Spam and Worse « amtso on July 3, 2010
at 5:09 pm