
The CVE-10K Problem




All,

Well, it's that time.  For 2006 so far, we've assigned nearly 7000 CVE
identifiers.  We don't have 100% completeness, but I'd say that for
the usual sources (major vuln DBs, vendor advisories, Bugtraq etc.)
there might be another 100 to 1000 CVE's for 2006.

Given the continued vulnerability growth trends, there's a real
possibility that in 2007 we will assign all 9,999 CVE's available for
the year.  What to do with the 10,000th entry is the CVE-10K Problem.

Here are some possible solutions.  Feedback appreciated.  We can cover
this topic in an upcoming telecon, too.

1) Keeping the year and moving to hex-based... CVE-2007-9999 would go
   to CVE-2007-A000, etc.  Problem: would probably break many apps
   that assume digits only.  Benefit: we could handle up to about
   65,000 ID's (16^4 = 65,536) in a single year.

2) Completely randomize the year portion.  We've considered this for a
   number of reasons, chiefly that too many people already make
   assumptions based on the year portion of the ID - sometimes it
   reflects the date of disclosure, sometimes the date of assignment,
   and sometimes a typo from an authoritative source.  Randomization
   would help in some other ways, too.  This is the most radical
   approach but has some strengths.  Problem: any crude usability is
   lost.  Benefit: the possible space of 100 million identifiers
   allows us to pass the problem onto the next generation :) but also
   might allow for less tightly controlled allocation of CVE's
   (although reduced control has serious negative consequences on
   CVE-based quantitative analyses and maintenance costs, so this is
   only a possibility).

3) Adding 1000 to the year.  Benefit: introduces predictability, and
   it's one of the least radical approaches.  It buys us some time.
   Problem: only increases to 20,000 identifiers in a year.  Bigger
   problem: the identifier is likely to be thought of as a typo by
   many readers, and automatically "corrected" to the current year,
   which would be an identifier for the wrong issue.

4) Keeping the year, and extending the numeric portion to 5 digits.
   Benefit: this preserves the CRUDE utility of the year portion and
   doesn't introduce any alphabetic characters.  Problem: some
   tools/products/databases might assume only 8 total digits instead
   of 9, so one digit could get lopped off.  Maintenance costs would
   be greater than #2 and #3.  It also might affect sorting, but in
   the grand scheme of things, I'm less concerned than I used to be.
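
To make the trade-offs above concrete, here is a rough Python sketch.
It is illustrative only: the LEGACY_ID pattern is a hypothetical
stand-in for the kind of digits-only, fixed-width check that existing
tools might use, not something taken from any real product.  The
capacity figures are upper bounds derived from the ID formats alone,
and the option 2 figure assumes all 8 digits become usable once the
year portion carries no meaning.

```python
import re

# Hypothetical legacy check: "CVE-", 4-digit year, exactly 4 decimal digits.
LEGACY_ID = re.compile(r"CVE-\d{4}-\d{4}")


def legacy_accepts(cve_id):
    """True if a strict digits-only, fixed-width parser accepts the whole ID."""
    return LEGACY_ID.fullmatch(cve_id) is not None


def legacy_extract(text):
    """IDs that a legacy scanner would pull out of free text."""
    return LEGACY_ID.findall(text)


# Upper bounds on identifiers per year (option 2: in total) for each scheme.
CAPACITY = {
    "current: 4 decimal digits": 10**4,
    "option 1: full 4-hex-digit space": 16**4,
    "option 1: 0000-9999, then A000-FFFF": 10**4 + (0xFFFF - 0xA000 + 1),
    "option 2: 8 digits, no meaningful year": 10**8,
    "option 3: year plus year+1000": 2 * 10**4,
    "option 4: 5 decimal digits": 10**5,
}

if __name__ == "__main__":
    for scheme, count in CAPACITY.items():
        print(f"{scheme:40s} {count:>11,}")

    # Option 1: the hex ID is rejected outright by a digits-only parser.
    print(legacy_accepts("CVE-2007-A000"))    # False
    # Option 3: syntactically fine, so nothing flags the odd-looking year.
    print(legacy_accepts("CVE-3007-0001"))    # True
    # Option 4: the last digit is silently lopped off when scanning text.
    print(legacy_extract("see CVE-2007-10000 for details"))  # ['CVE-2007-1000']
```

Note that if the hex scheme only continues from A000 after 9999, as
in the example under option 1, the usable space is closer to 34,576
than to the full 65,536.  The last line shows the truncation risk
from option 4: the extracted ID is a valid-looking identifier for a
different issue.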


Handling over, say, 20K issues in a year would likely require a
paradigm shift within the entire vulnerability information management
industry.  As Dave Mann has pointed out to me numerous times, the
growth in the number of vulns is outpacing the growth in CVE funding,
which has been mostly flat with respect to content generation itself,
with increasing risks of our funding actually being reduced (I don't
think most people understand why good vulnerability information isn't
cheap).  Anyway, I suspect that this growth problem is hurting other
vuln databases/products, too.  We're already seeing some of that
paradigm shift: the Board gave up voting a while ago due to the amount
of effort; you're seeing more generic vulnerability database entries
with more mistakes (probably made by less experienced analysts with
less editorial oversight); the percentage of verified issues is
probably smaller; etc.


Thoughts?

- Steve

P.S.  Thanks to Pascal Meunier for asking about this privately, which
prompted me to mention it here.

Page Last Updated or Reviewed: May 22, 2007