
Re: Level of Abstraction Issue: Similar Applications, "Same" Vulnerability

Here's my 2 cents' worth. Sorry for the length of 
this. I don't have the time to reduce this down to
a few pithy comments. ;)


When we first introduced the idea of a CVE, we noted that
the word "vulnerability" was commonly used at different
levels of abstraction. Moreover, we noted that the
attempt to construct a CVE would necessitate that
we fix the term "vulnerability" to refer to a specific
level and that we choose other words to refer to
different levels of abstraction that are useful.

In short, we are struggling with a poorly defined
vocabulary.  While this discussion is hard, it
is very necessary, imo. I might suggest that the
ultimate choice as to which is correct *may* come down
to a coin toss.  We might do well to begin thinking
about what other terms we would like to use for a
higher or lower level of abstraction (depending on
which side of the fence you are sitting on).

I see potential problems with both sides. Like
Bill Hill, I come at this with a tool interoperability
bias and, from what I've seen (ad hoc), I prefer
the "same attack" approach as a knee-jerk reaction.
However, I do agree completely with Northcutt, who
reminded us to keep attacks/exploits and vulnerabilities
separate in our minds. I believe this is a fundamental
distinction and the "same attack" language fuzzies
this line.  Maybe "same opening" or "same chink in the
armor" is better?  Also, Spaf and Shostack have 
raised valid arguments against the same attack approach that
I have no good counter-arguments for. This gives me

At the same time, I have even deeper reservations about
the same code base approach.  Aleph's point that this
commits us to needing to know things that we may not 
have access to (an ontological gap) is dead-on correct.
But I see an even more fundamental problem with this

Like the Russian dolls game, casting the problem of
enumerating vulnerabilities in terms of "same codebase"
seems to reveal a similar sub-problem. Namely, how does
one enumerate all codebases?  Or putting it another way,
when are 2 codebases different? Or the same?  Is a dll
a part of (one of) the application(s) it supports or is
it a separate codebase?  Bishop's question concerning
changes in the OS affecting a vulnerability in an application
begs the question of drawing the line between an OS
and an application. I won't make a federal case out
of this question because there is already one working
its way through the courts!! ;) Yet another question
we have struggled with internally is in drawing the
line between an OS and a network service. Russ has 
raised questions that relate to the distinction between
versions of software.  So, even if we had omniscient
knowledge of a software's codebase (and we won't), I
see the problem of determining, or more precisely, of
defining when 2 codebases are different as being
no less problematic than our original question of
determining when 2 vulnerabilities are the same.


Two quick quotes from recent posts...

Russ Cooper writes:
> My vote is that we work on the assumption that "Same Attack/Same
> Results" should be the starting point for any CVE item, and, assuming
> details are forthcoming from a vendor, the CVE item gets revised
> according to the details as they unfold. 

Bill Hill writes:
>  So even if tool A and B detect the same number of CVE
> vulnerabilities, but tool B detects 100 more cases or instances of some
> vulnerabilities, then in some sense tool B is more capable than A, and I
> think people will understand that.

Note that in both cases, both Russ and Bill are assuming
or implying the need to manage vulnerability information at
(at least) 2 levels of abstraction.  Hold onto this thought
while I go on a minor rant...

As some of you know, I have some regrets about appealing 
to the examples of the biological and chemical enumeration/
taxonomy problems in our original paper.  Both of these 
examples are intriguing to anyone who loves order and structure 
because the fields have defined what can only be described as the 
canonical enumerations and taxonomies for their respective fields.
The hierarchical tree structure of the zoological taxonomy
is reflective of the history of the origin of species which
is ultimately encoded in DNA strands (NOTE: DNA testing
is the new arbiter for distinguishing species). And the
periodic table of elements is reflective of the atomic 
structure of the elements.

If we are to base our enumerative and classification
hopes for vulnerabilities on these models, I think we
must first ask whether vulnerabilities encode a rich and universal
structure that we can base our decisions on.  I am very
doubtful on this subject.

I think a better model for thinking about our problem
lies with what I call the Library Sciences or Library
of Congress model.  And even this is with a substantial
caveat or two.

Let me begin with the caveats. First, I think the enumeration
of codebases or softwares is much more closely tied to
the Library of Congress problem. In both cases, the 
problem is to catalog published works of intellectual
property. The first caveat here is that I think that
the problem is harder for softwares due to the problem
of the slippery boundaries between codebases that may not
be as prevalent for books, periodicals and other
printed materials. I say "may" because it may be that,
if you push deep into the study of Library Sciences,
they deal with similar issues. (I need to do more
research on this...)

The second caveat is more substantial.  Regardless
of how one defines vulnerability (an issue we may not
be able to agree on), it is clear that the idea of
a vulnerability is meta-data with respect to our
knowledge of softwares. In more casual language, with
software or codebases, I can point to a physical artifact
like a file.  With vulnerabilities, I ultimately am
referencing a derived fact (dare I say, opinion) about
the physical artifact. Even more casually, I would suggest
that vulnerabilities are nothing more than moral opinions
about the features of software. (I understand that
this comment is flame bait... But vulnerability
implies some notion of good and bad, at least relative
to some policy [aka moral objective]).

Let me summarize. The problem of enumerating and 
classifying books is different and less deterministic than
enumerating species and elements.  The problem of
enumerating softwares is arguably even less deterministic.
And the problem of enumerating the abstract notion
of vulnerabilities is less deterministic still.

I hope I haven't destroyed all hope here. One, I could
be way, way off base. Two, I do think that, even with these
caveats, the Library of Congress problem is still
instructive. Before I launch into how I see this
analogy fitting our situation, let me remind you
that Russ and Bill both referred to vulnerabilities
at 2 levels of abstraction (or at least implied it).  

When are 2 books the same or different?  Let's say
that Andre, Adam and I all go to our local library
and check out Herman Melville's "Moby Dick". I think
that everyone would agree that we would be reading
the same book.  Or would we?

Let's suppose that Andre was reading a copy printed
in 1976, that Adam was reading a 1997 edition and
that I had a rare 1923 edition.  At a different level,
these books are all different.  

If we consider the Title card catalog, we can see what
is happening.  The card catalog may have 5 or 6 or more
different cards all with the top line (or top 2 lines)
being something like

    Moby Dick
    Melville, Herman

At this level of abstraction, all the cards in the card
catalog are referring to the same (enhanced) title. But if 
we look deeper on each card we will see more information
that discriminates between each entry. This information
would include publisher and date and all of that stuff.
For the sake of discussion let's call these levels of
abstraction the title and reference levels.
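
As a rough illustration of the two levels (a sketch only; the Card
type and its field names are made up for this example, and the years
are the ones from the Moby Dick story above), the same three cards
count as one entry at the title level and three at the reference
level:

```python
from collections import defaultdict
from typing import NamedTuple


class Card(NamedTuple):
    """One catalog card (hypothetical layout for illustration)."""
    title: str
    author: str
    year: int  # reference-level detail that tells editions apart


cards = [
    Card("Moby Dick", "Melville, Herman", 1976),
    Card("Moby Dick", "Melville, Herman", 1997),
    Card("Moby Dick", "Melville, Herman", 1923),
]


def title_key(card):
    """Title level: only the top two lines of the card matter."""
    return (card.title, card.author)


def group_by_title(all_cards):
    """Group reference-level cards under their title-level key."""
    groups = defaultdict(list)
    for card in all_cards:
        groups[title_key(card)].append(card)
    return dict(groups)


catalog = group_by_title(cards)
print(len(catalog))     # 1 entry at the title level
print(len(set(cards)))  # 3 entries at the reference level
```

Which fields belong in title_key is exactly the kind of decision
rule that is up for debate; the code only shows that both counts
can coexist over the same set of cards.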

My primary point is this (and you have been patient to
wait so long for it).  In the attempt to enumerate
books, libraries refuse to be limited to a single
level of abstraction.  Instead, they rely on a
2-tiered system of title/reference. 

Could it be that vulnerabilities, like books, are simply
better handled with a 2 tiered approach?  As much as 
I have argued for keeping the CVE as simple as possible,
perhaps we need to consider this sort of approach.
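
To make the idea a little more concrete, here is a minimal sketch
of what a 2-tiered record might look like, and how a tool could be
scored at both tiers in the spirit of Bill's remark quoted earlier.
Everything in it is hypothetical: the class names, the "TITLE-0001"
identifier, and the product names are placeholders, not a proposed
CVE format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VulnReference:
    """Reference tier: one affected piece of software (placeholder fields)."""
    product: str
    version: str


@dataclass
class VulnTitle:
    """Title tier: things considered essentially the same at this level."""
    title_id: str
    summary: str
    references: list = field(default_factory=list)


entry = VulnTitle("TITLE-0001", "placeholder description")
entry.references.append(VulnReference("product-a", "1.0"))
entry.references.append(VulnReference("product-b", "2.3"))


def coverage(detected, entries):
    """Return (titles touched, references detected) for one tool."""
    titles = sum(
        1 for e in entries if any(r in detected for r in e.references)
    )
    refs = sum(1 for e in entries for r in e.references if r in detected)
    return titles, refs


# Two hypothetical tools: same title-level count, different reference-level count.
tool_a = {VulnReference("product-a", "1.0")}
tool_b = {VulnReference("product-a", "1.0"), VulnReference("product-b", "2.3")}

result_a = coverage(tool_a, [entry])
result_b = coverage(tool_b, [entry])
print(result_a)  # (1, 1)
print(result_b)  # (1, 2)
```

The sketch says nothing about how to decide which references
belong under which title; it only shows that the two counts can
be kept side by side.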

I have only followed the Anti-Virus (AV) community in
a peripheral manner but I believe that they too have
settled on a 2-tiered approach, especially in light
of the new classes of closely related viruses.  Spaf
and others who are closer to this community, is this
true or am I way off base?

In suggesting that we consider a 2-tiered enumeration
(and hence, naming scheme) I do NOT mean to suggest
that the decision rules for each level of abstraction
are necessarily clear.  As I have argued (ad nauseam, I
am sure), I think the problem of vulnerabilities is
more slippery than the problem of published books.
Perhaps it is enough that we recognize that at some 
level, this set of things should be considered as
being essentially the same and at some other level
they should be considered different.

FWIW, I think this is exactly what we have seen out
of the CERT Advisories.  Most often, an advisory
will describe a single vulnerability (CERT's use
of the term) while at the same time the advisory
will include vendor-specific information for 
various pieces of affected software. This is essentially
a 2-tiered approach with no stated, formalized rules
for distinguishing the tiers. 

All, sorry for the obscene length.  I hope something
in this is helpful and I hope that I don't send us
off on a wild goose chase. Actually, I fear the
latter a lot.  Please, speak up loudly if you think
I'm off base.



David Mann                     ||  phone: (781) 271 - 2252
INFOSEC Engineer/Scientist, Sr || 
Enterprise Security Solutions  ||    fax: (781) 271 - 3957
The MITRE Corporation          ||
Bedford, Mass 01730            || e-mail: damann@mitre.org

Page Last Updated: May 22, 2007