Posted by: David Harley | January 3, 2013

Going beyond Imperva and VirusTotal

Old Mac Bloggit, my pseudonymous colleague from Mac Virus, launched his first salvo for this blog yesterday (a belated welcome to you, Mac). It consisted largely of a scathing criticism of the way in which a second wave of media comment on Imperva’s report on the shortcomings of anti-virus has mostly ignored the deficiencies of that report, as highlighted here and elsewhere.

Journalist Kevin Townsend rightly points out that:

Actually, I discussed it in Infosecurity Magazine on 28 November.

Indeed he did, and I suspect that the only reason Mac didn’t include a link to it was that his criticism was aimed at the second wave of commentary, largely led by the New York Times, that was not only late onto Imperva’s bandwagon, but somehow missed the fact that (1) there had been much criticism of the report and (2) Imperva has actually modified its position somewhat since those criticisms. Otherwise, I’m sure he would have included Kevin’s fair and balanced article. After all, it did quote me. 🙂

In his blog about Mac’s blog, Kevin summarizes the why-you-shouldn’t-use-VirusTotal-reports-as-a-detection-metric position succinctly and accurately, but he also makes other points that deserve further discussion.

But here’s the rub: the AV industry isn’t innocent of its own sleights of hand.

Hard to argue with that. AV marketing departments have made some pretty crass claims from time to time.

The one that gets me personally rather hot under the collar is the ‘destroys all known bacteria dead’. Well, that’s the clear message. The actual terminology is ‘stops 100% of viruses in the Wild’. What it is really saying is that Stoppem Anti Virus detects every virus in the Wild List. And the Wild List is very different to ‘in the wild’. In fact, the Wild List is effectively compiled by the AV industry; so in reality, any AV company that doesn’t score at least 99.99% success against viruses in the Wild is largely incompetent.

Well, sort of.

Once upon a time, the difference between the number of known viruses and the number of viruses literally in the wild (i.e. posing a real threat to the everyday user) was pretty small, so there was quite a lot of merit to the idea of the WildList, essentially a catalogue of virus names corresponding to a collection of verified samples known to be in the wild. (We tended to use the capitalization In the Wild or the abbreviation ItW to indicate that we meant malware qualified to be on the current WildList, rather than the entire population of known and unknown viruses that were out there and posing a threat.)

However, the general usefulness of the WildList to the world at large has declined as the number of samples on the WildList at any one time has become a tiny fraction of the total population of malicious programs that pose a potential threat to users, even though the range of malware that makes the WildList has widened. At the same time, many malicious samples have a lifetime of minutes, whereas old-time viruses could sometimes survive on the WildList for months, even years. As a result, the concept of ‘Wildness’ has become practically useless to people outside the AV industry, while the sheer volume of known malicious samples is unmanageable in terms of defining which samples are or are not ItW in some technical sense. So is the WildList (or the collection of samples it represents) of any use at all in testing?

On the negative side:

  • It’s a pretty small sample set.
  • It still doesn’t represent the whole range of malware that security software can or tries to detect.
  • It doesn’t really represent the dynamic state of the threat landscape. You might say it lacks the element of surprise.

On the positive side:

  • It represents a set of samples agreed to be truly malicious: that is, the verification process is better than that applied to most collections.
  • It represents a baseline of samples that all products should make a fair fist of detecting: if you like, a minimum standard that all products should be capable of meeting. It’s not literally compiled by the AV industry, as Kevin suggests, but it is verified by vendors. So it has some value for certification purposes (e.g. VB100) but very little for comparative testing.

What about Kevin’s point about 100% of all bacteria? Well, he’s right. There are (at least) three levels of aspiration here.

  • Detection of everything on the WildList. More difficult than you might think, and some mainstream companies (and testers) have simply decided not to bother with it any more in any case. When a product does achieve such certification, it’s certainly worth something, but it’s not a measure of absolute protection in the real world.
  • Detection of everything known to be in the wild (note the lack of capitalization: I mean literally in the wild, not just on the WildList). Theoretically achievable given enough resources, but I wouldn’t care to promise that any product can achieve it. Or that any testing organization could realistically assess its ability to achieve it. Blocking without specific detection is a little more achievable in principle (in the form of whitelisting, for example), but there’s a hidden cost there (false positives, convenience trade-off, and so on).
  • Detection of all malware, known and unknown. Yeah, right. I believe that. I also believe in Santa Claus.

Ability to pass WildList-based certification is a good sign, but it’s not at all the same as catching everything that poses a threat.
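
To put rough numbers on that distinction, here is a minimal sketch in Python. Everything in it is invented for illustration: the sample identifiers, the set sizes and the imaginary product are hypothetical and say nothing about any real product or test. It simply shows how the same product can score 100% against a WildList-style set while scoring much lower against a wider population of threats.

```python
def detection_rate(detected, sample_set):
    """Fraction of sample_set that appears in detected (0.0 for an empty set)."""
    if not sample_set:
        return 0.0
    return len(detected & sample_set) / len(sample_set)


# Hypothetical sample identifiers, invented purely for illustration.
wildlist = {"wl-01", "wl-02", "wl-03", "wl-04"}

# Everything literally in the wild: the WildList plus samples that never made the list.
in_the_wild = wildlist | {"new-01", "new-02", "new-03", "new-04", "new-05", "new-06"}

# What an imaginary product detects: all of the WildList plus two of the other samples.
detected = wildlist | {"new-01", "new-02"}

wildlist_rate = detection_rate(detected, wildlist)        # 4 of 4
in_the_wild_rate = detection_rate(detected, in_the_wild)  # 6 of 10

print(f"WildList: {wildlist_rate:.0%}, in the wild: {in_the_wild_rate:.0%}")
# prints "WildList: 100%, in the wild: 60%"
```

The point is only that the denominator matters: a claim of ‘100%’ means very little until you know which sample set it was measured against.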

So I would say this. Imperva, you have been a bit naughty in your report. AV industry, you can be a bit naughty yourself. So stoppit, both of you. Anti-virus is good, not perfect, but essential. Just tell us the truth.

That’s always my intention. And that of every competent, ethical researcher I know.

Small Blue-Green World/Mac Virus
ESET Senior Research Fellow


