
What is the future of CVE - Scope, Volume & Quality?

The CVE Team

In June 2011, a paper from China was circulated that calls for the establishment of an International Vulnerability Database Alliance ("IVDA: International Vulnerability Database Alliance," Chen Zheng et al).  The MITRE CVE team has been giving this paper careful consideration and while the paper raises several issues with CVE, we believe it raises two important, high priority questions that the CVE Editorial Board should consider and respond to:

Q1: Can CVE effectively keep up in the face of an increasing volume of English-based disclosures?

Q2: What relationship should CVE have with any international effort (such as IVDA) to identify vulnerabilities disclosed in non-English based markets?

To be sure, there are other questions to be asked and answered, but we feel these two are the most pressing.  And while these questions are related, we believe that the Editorial Board should consider them in the order above.  Regardless of what happens internationally, CVE is confronted with real issues of scope, volume and response time.  How the CVE Editorial Board decides to deal with the issues of scope, volume and response time will likely inform our position relative to efforts to further develop and operate CVE.  

If you are confident that you understand how CVE is currently operating, please feel free to jump ahead to the questions and respond to them.  Otherwise, we'll begin with a brief overview of our understanding of current CVE operations and scope, to ensure that everyone is starting with the same set of basic assumptions.

Please note, the CVE Team anticipates a day when CVE won't be able to stay even remotely abreast of "all publicly disclosed" vulnerabilities.  As an example, consider how the tracking of malware samples has changed in the face of a changing malware threat.  The basic thrust of the two questions is to help us focus our resources on the most important issues.

CVE aspires to assign identifiers to all publicly known vulnerabilities (where "publicly known" has traditionally been taken to mean "disclosed in English-language disclosures"). CVEs are based on three primary sources of information. First, the largest number of CVEs is produced based on information pulled from web sites, vulnerability databases and mailing lists.  The vast majority of this information is gathered using web spidering capabilities and is generally done in coordination with the producers of the information.  The second source of information is the CVE Candidate Naming Authorities (CNAs), who are trained in how to assign CVE IDs and how to include them in their advisories.  It is important to note that in nearly all cases, the CVE team learns of CNA-issued CVE IDs in the same way that the rest of the world does -- we pull the information from the CNA's web site.  In this way, CNA advisories are treated just like all of the other information sources we monitor. Lastly, there are occasions in which CVEs are assigned in a pre-disclosure context.  While the CVE team is not an emergency response coordination center, it is sometimes the case that communities involved in pre-disclosure coordination benefit from using pre-disclosure CVE IDs.

In nearly all cases, new CVE information begins with the gathering of one or more related disclosures, which we call references.  The first analytical question the CVE team asks is whether or not any of the references relate to a CVE that already exists in the CVE corpus.   This analysis is based primarily on keyword searching, which means that the words that are chosen when writing CVE descriptions are vitally important.  If the reference relates to an existing CVE, we add the new reference to the existing references associated with that CVE and, if needed, update the language of the description based on the new information.  If we conclude that the reference relates to a new issue or issues, we first determine how many vulnerabilities are being discussed and then create new CVE entries for each.  For each newly created CVE, careful consideration is given to writing the description to help ensure that the new CVE can be found when tossed into a haystack with approximately 50,000 other entries. 
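
The keyword-based matching step described above could be sketched roughly as follows. This is only an illustration, not the CVE team's actual tooling: the stop-word list, the `min_shared` threshold, the function names and the sample CVE IDs and descriptions are all invented for the example.

```python
import re

# Hypothetical stop words; a real list would be tuned to vulnerability text.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to", "via", "allows"}


def keywords(text):
    """Lowercase the text and return its set of non-stop-word tokens."""
    tokens = (t.strip(".") for t in re.findall(r"[a-z0-9_.]+", text.lower()))
    return {t for t in tokens if t and t not in STOP_WORDS}


def candidate_matches(reference_text, corpus, min_shared=3):
    """Return (cve_id, shared_keyword_count) pairs, best match first.

    `corpus` maps CVE IDs to description strings. `min_shared` is an
    assumed cut-off, not a documented CVE rule.
    """
    ref_words = keywords(reference_text)
    hits = []
    for cve_id, description in corpus.items():
        shared = len(ref_words & keywords(description))
        if shared >= min_shared:
            hits.append((cve_id, shared))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Invented entries, for illustration only.
corpus = {
    "CVE-0000-0001": "SQL injection in login.php in ExampleCMS 1.2 allows "
                     "remote attackers to execute arbitrary SQL commands "
                     "via the user parameter.",
    "CVE-0000-0002": "Buffer overflow in the FTP server in ExampleOS allows "
                     "remote attackers to execute arbitrary code via a long "
                     "USER command.",
}
reference = "ExampleCMS 1.2 login.php SQL injection through user field"
matches = candidate_matches(reference, corpus)
```

Because a match depends entirely on shared words, a description that omits the product name, version or affected file would not be found by a later search, which is why the wording of each description matters so much.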

When we launched CVE many years ago, Stephen Northcutt endorsed the effort saying something along the lines that it was a good step forward and would be really useful when it had 1000 entries.  Stephen was right!   CVE wasn't keeping up and we've been trying to catch up ever since that first day.  While we aspire to cover all publicly available English-based disclosures, our best estimates are that, as of this writing, we end up covering about 35% of all references we monitor and between 65% and 85% of all "high priority" reference sources.  This has varied over the years due to a number of factors, primarily the increasing volume and complexity of vulnerabilities and having to manage more "raw" information than in the past. To date, we have responded by instituting processes that attempt to prioritize the processing of disclosures deemed to be the most important.

The CVE team maintains a mostly automated "rolling to-do list" of disclosures to process, whereby the "most important" bubble to the top of the list and the "less important" bubble down.  Priority is based on several factors, and some disclosure sources are given higher priority than others. For example, the rolling to-do list gives higher prioritization to references with reserved CVEs and references from major high-priority sources. Also, we work hard to achieve "reference completeness" for an exclusive set of providers, such as US-CERT Bulletins, but not for arbitrary posts to Bugtraq.  In addition, disclosures about some software vendors (such as Microsoft) are given higher priority than a disclosure about a php.golf application written by an undergraduate student as a part of his programming class and posted to a blog. ("php.golf" has become something of an internal catchphrase for "stuff we don't care about.")
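
One way to picture the rolling to-do list is as a priority queue. The sketch below is a guess at the general shape, not a description of the real system: the weights, the field names and the `RollingTodoList` class are hypothetical, since the post names the factors but not their values.

```python
import heapq
import itertools

# Hypothetical weights for illustration only.
SOURCE_WEIGHT = {"us-cert": 3, "bugtraq": 1, "blog": 0}
VENDOR_WEIGHT = {"microsoft": 3, "php.golf": 0}
RESERVED_CVE_BONUS = 2
DEFAULT_WEIGHT = 1  # assumed weight for sources or vendors not listed


def score(disclosure):
    """Combine source, vendor and reserved-CVE factors into one number."""
    total = SOURCE_WEIGHT.get(disclosure["source"], DEFAULT_WEIGHT)
    total += VENDOR_WEIGHT.get(disclosure["vendor"], DEFAULT_WEIGHT)
    if disclosure.get("has_reserved_cve"):
        total += RESERVED_CVE_BONUS
    return total


class RollingTodoList:
    """Disclosures with the highest score come off the list first."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties in insertion order and keeps heapq
        # from ever comparing two disclosure dicts.
        self._counter = itertools.count()

    def add(self, disclosure):
        # heapq is a min-heap, so the score is negated.
        entry = (-score(disclosure), next(self._counter), disclosure)
        heapq.heappush(self._heap, entry)

    def next_item(self):
        return heapq.heappop(self._heap)[2]
```

With these assumed weights, a US-CERT bulletin about a Microsoft product with a reserved CVE would be processed before an unremarkable Bugtraq post, which in turn would come before a blog post about a php.golf application.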

This basic 2-dimensional prioritization grid (source priority on one axis, vendor or product priority on the other) then gives us a basic framework for internal performance goals that breaks out into 4 basic response levels:
1. High priority issues: 2 to 3 days
2. Moderate priority issues: within 2 weeks
3. Low priority issues: as we can get to them
4. Lower than low priority issues - these roll off the list, but we keep 
   them for possible future use or reference
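
As a rough illustration, the grid can be thought of as a lookup table from two priority ratings to one of the four response levels. The sketch assumes the two dimensions are the source priority and the vendor or product priority discussed earlier; the rating labels and every cell assignment are invented here, and only the four response levels come from the list above.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "2 to 3 days"
    MODERATE = "within 2 weeks"
    LOW = "as we can get to them"
    ROLL_OFF = "rolls off the list, kept for possible future reference"


# Keys are (source priority, vendor/product priority).
# The cell assignments are illustrative, not taken from the post.
GRID = {
    ("must-have", "must-have"): Tier.HIGH,
    ("must-have", "nice-to-have"): Tier.MODERATE,
    ("must-have", "ignorable"): Tier.LOW,
    ("nice-to-have", "must-have"): Tier.MODERATE,
    ("nice-to-have", "nice-to-have"): Tier.LOW,
    ("nice-to-have", "ignorable"): Tier.ROLL_OFF,
    ("ignorable", "must-have"): Tier.LOW,
    ("ignorable", "nice-to-have"): Tier.ROLL_OFF,
    ("ignorable", "ignorable"): Tier.ROLL_OFF,
}


def response_tier(source_priority, product_priority):
    """Look up the response level for one disclosure."""
    return GRID[(source_priority, product_priority)]
```

For example, `response_tier("must-have", "must-have")` returns `Tier.HIGH`, whose value is the "2 to 3 days" goal.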

As noted at the beginning, there are two questions that we would like the CVE Editorial Board to consider and respond to:

1. Can CVE effectively keep up in the face of an increasing volume of English-based disclosures?

2. What relationship should CVE have with any international effort (such as IVDA) to identify vulnerabilities disclosed in non-English based markets?

The first question is really about CVE's prioritization of issues and our response time. There are several questions we would like to discuss that are central to the "keep up" question.

1. Sources
  a. Which sources of vulnerability disclosures should be considered 
    "must haves" for which we provide "reference complete" coverage?  
  b. Which sources should be considered "nice-to-haves"?
  c. Which sources should be considered "can be safely ignored" 
    (e.g. they just cause noise)?

2. Coverage
  a. Which vendors and software products should we consider "must haves" 
     in that we will provide coverage for all reliable vulnerability 
     reports for them?  
  b. Which products or vendors should be considered "nice-to-haves"?
  c. Which ones should be considered "can be safely ignored" (e.g. php.golf)?

3. Response Time
  a. How should the answers to the Sources and Coverage questions be 
     combined to create a tiered priority list for response time?  
  b. For each tier, what is a reasonable response time?

4. Quality 
  a. What rate of duplicate CVE entries can be tolerated?  
  b. How consistent does CVE "counting" need to be relative to past 
     counting practices and content decisions? ("Counting" here means 
     the relationship between a given vulnerability and the number of 
     CVEs needed to correctly describe it and vice-versa. These may be 
     one-to-one, one-to-many, many-to-one, or many-to-many.)
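
To make the four "counting" relationships in question 4b concrete, the sketch below classifies a group of (vulnerability, CVE) pairings. The function name and the sample labels are hypothetical; it only illustrates the terminology and says nothing about how content decisions are actually made.

```python
from collections import defaultdict


def counting_relationship(pairs):
    """Classify related (vulnerability, cve_id) pairs.

    The first word describes the vulnerability side, the second the CVE
    side, so "one-to-many" means one vulnerability described by several
    CVEs.
    """
    cves_per_vuln = defaultdict(set)
    vulns_per_cve = defaultdict(set)
    for vuln, cve in pairs:
        cves_per_vuln[vuln].add(cve)
        vulns_per_cve[cve].add(vuln)
    # "many" vulnerabilities if any single CVE covers more than one.
    vuln_side = "many" if any(len(v) > 1 for v in vulns_per_cve.values()) else "one"
    # "many" CVEs if any single vulnerability needs more than one.
    cve_side = "many" if any(len(c) > 1 for c in cves_per_vuln.values()) else "one"
    return f"{vuln_side}-to-{cve_side}"
```

For instance, `counting_relationship([("vuln-A", "cve-1"), ("vuln-A", "cve-2")])` returns `"one-to-many"`, while two vulnerabilities that share a single CVE give `"many-to-one"`.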

We believe that questions 1 and 2 form the basis of any rational prioritization for new CVEs.   We also believe that questions 3 and 4 need to be considered in tandem. The biggest delays in doing CVE analysis are in "getting things right," both in terms of counting CVEs and in terms of creating descriptions that give future queries a reasonable chance of finding the correct CVE entry and avoiding duplicate entries.   We can produce CVE entries faster than the current rate, but one effect could be that we will assign CVE IDs in a less consistent manner and produce more duplicates.

The MITRE CVE team has formulated our own internal answers to these questions but we really need and solicit your input. We appreciate the time you may take to think about and respond to these questions, and thank you for your consideration of the above.

- The CVE Team (Steve Boyle, Steve Christey, Dave Mann)

Page Last Updated or Reviewed: November 06, 2012