How We Build the CVE List
The process begins with the discovery of a potential security vulnerability. The information is assigned a CVE Identifier (also called a "CVE-ID", "CVE Entry", "CVE Name" or a "CVE") by a CVE Numbering Authority (CNA), and then posted for the public on the CVE Web site by the CVE Editor. As part of its management of CVE, MITRE functions as CVE Editor and Primary CNA.
For the CVE Content Team, the process of building the CVE List is divided into three stages: the processing stage, the assignment stage, and the publication stage. The CVE Editorial Board oversees this process.
NOTE: See CVE Numbering Authorities for a description of how CVE-ID reservation and CNAs help build the CVE List, and for instructions on obtaining a CVE-ID.
For the CVE project, MITRE has a CVE Content Team whose primary task is to analyze, research, and process incoming vulnerability submissions from CVE’s data sources, transforming the submissions into CVE Entries. The team is led by the CVE Editor, who is ultimately responsible for all CVE content.
During the processing stage, MITRE’s CVE Content Team, which consists of MITRE security analysts and researchers, collects raw information from various sources, such as independent researchers, software vendors, Board members who have provided MITRE with their databases, and publishers of weekly vulnerability summaries. Each separate item in the data source (typically a record of a single vulnerability) is then converted into a standardized format that facilitates processing by automated programs. Each issue includes the unique identifier that is used by the original data source.
After this conversion phase, each target issue is automatically matched against all other issues and published CVE Identifiers using information retrieval techniques. The matching is based primarily on keywords that are extracted from the target issue’s description, references, and short title. The keywords are weighted according to how frequently they appear, which generally gives preference to infrequently seen terms such as product and vendor names and specific vulnerability details. Keyword matching is not completely accurate, as there may be variations in spelling of important terms such as product names, or an anomalous term may be given a larger weight than a human would use. The closest matches for the target submission (typically 10) are then presented to a content team member, who identifies which submissions are describing the same issue (or the same set of closely related issues) as the target submission.
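The weighting idea can be sketched in a few lines of Python. This is only an illustration, not MITRE's actual matcher: the function names, the tokenizer, and the log inverse-frequency weight are all assumptions, since the text above says only that rarely seen keywords count for more.

```python
import math
import re
from collections import Counter


def keywords(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_matches(target, candidates, limit=10):
    """Return up to `limit` candidate ids, closest match first.

    `candidates` maps an id to its combined title, description, and
    reference text. Each keyword shared with the target contributes
    log(N / df), so a term seen in few issues (a product name, say)
    outweighs a term seen in most of them. This formula is an assumption.
    """
    terms = {cid: keywords(text) for cid, text in candidates.items()}
    doc_freq = Counter(t for ts in terms.values() for t in ts)
    total = len(candidates)
    target_terms = keywords(target)

    scores = {}
    for cid, ts in terms.items():
        score = sum(math.log(total / doc_freq[t]) for t in target_terms & ts)
        if score > 0:
            scores[cid] = score
    # Highest score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))[:limit]
```

A keyword that appears in every candidate receives a weight of zero here, and a misspelled product name simply fails to match, which is one reason the ranked list still goes to a human reviewer.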
Once matching is complete, all related submissions are combined into submission groups, which may include any entries that were found during matching. Each group identifies a single vulnerability or a group of closely related vulnerabilities. These groups are then processed in the next phase, called "refinement."
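One way to picture the grouping step is as merging every pair of items that an analyst has confirmed as related. The Python sketch below uses a union-find structure for this; the function name, the pair-based input, and the placeholder ids are assumptions made for illustration, not a description of the content team's tooling.

```python
def build_submission_groups(ids, confirmed_matches):
    """Combine related items into groups.

    `ids` lists every submission id and existing entry id under
    consideration; `confirmed_matches` is a list of (id, id) pairs that
    an analyst judged to describe the same or closely related issues.
    Every id that appears in a pair must also appear in `ids`.
    """
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the group representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in confirmed_matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(group) for group in groups.values())
```

Because merging is transitive, a submission matched to a second submission that was in turn matched to an existing entry ends up in the same group as that entry.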
Typically, a content team member is assigned a batch of 20 or more submission groups, which usually includes both duplicate submissions and new issues.
During refinement, the team member analyzes a submission group and determines whether one or more of the submissions identify an existing CVE item. If so, then the analyst notes any additional references that are in the new submission, but not the existing CVE item, so that the existing CVE item’s references can be extended.
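The reference comparison described above amounts to an ordered set difference. A minimal Python sketch follows, assuming that references are plain strings; the function name is hypothetical.

```python
def additional_references(existing_refs, submission_refs):
    """Return references found in the submission but not in the CVE item.

    The submission's order is preserved and repeated references are
    reported once.
    """
    seen = set(existing_refs)
    extra = []
    for ref in submission_refs:
        if ref not in seen:
            seen.add(ref)
            extra.append(ref)
    return extra
```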
If there are submissions from the group that do not describe an existing CVE item, then a team member makes the following assessment:
For each CVE Identifier to be created, the analyst does the following:
In some cases, an analyst may choose to delay analysis of a submission group (or a portion of the group) when an issue is unusually complex or if other individuals need to be consulted.
Submission refinement is a bottleneck because deep analysis is sometimes required to understand the reported problem, apply the content decisions, find vendor acknowledgement, research the references, and write the descriptions. Refinement is especially difficult for new analysts because there is a large amount of detail and background knowledge that is required before the analyst can be comfortable and productive in doing refinement.
For each action that the content team member undertakes — whether identifying a duplicate, rejecting a submission, or suggesting the creation of a new CVE Identifier — a "refinement group" is produced. One or more refinement groups are produced from the original submission group, depending on how many separate issues were in the original submission group.
After refinement, the CVE Editor reviews the work of the analysts, occasionally making modifications to follow CVE style, to ensure that CVE content decisions are being followed, or to do advanced research. Occasionally, submissions may be added to or removed from the refinement groups. The Editor provides feedback to the analyst for the purposes of training or to raise certain issues. Since the submission matching may not always find all related submissions, typically due to spelling inconsistencies across submissions, the Editor may merge multiple refinement groups that were produced by different analysts.
The Editor then processes the resulting refinement groups. New CVE Identifiers are assigned to the groups that identify new issues (Stage 2: Assignment, below).
After CVE Identifier assignment, each data source is provided with a backmap from their submission to the associated CVE items (whether newly created or existing entries). The backmap can reduce the amount of effort that the data source needs to perform to maintain a mapping to CVE. After the backmaps for the CVE Identifiers are generated, the associated submissions are removed from the submission pool. In addition to backmaps, a "gapmap" is also provided to the information source. The gapmap identifies the newly created CVE Identifiers that were not found when processing the data source’s original set of submissions, which may make the source aware of additional security problems that they had not seen previously.
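The relationship between the two maps can be shown with a short Python sketch. The data shapes, function names, and placeholder ids are assumptions: a backmap is modeled as a dictionary from a source's submission ids to CVE ids, and a gapmap as the list of newly created CVE ids that none of that source's submissions mapped to.

```python
def build_backmap(source_submissions, submission_to_cve):
    """Map each of one source's submissions to its associated CVE ids.

    `submission_to_cve` covers submissions from all sources; only the
    given source's submissions that received a mapping are returned.
    """
    return {
        sid: sorted(submission_to_cve[sid])
        for sid in source_submissions
        if sid in submission_to_cve
    }


def build_gapmap(new_cves, backmap):
    """List newly created CVE ids that the source's backmap does not cover."""
    covered = {cve for cves in backmap.values() for cve in cves}
    return sorted(set(new_cves) - covered)
```

In this model, a source whose submissions mapped to every new identifier receives an empty gapmap.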
In some cases, the processing stage may be entirely bypassed. This usually happens when an individual or organization reserves a CVE Identifier in order to include it in the initial public announcement of a vulnerability. See CVE Numbering Authorities for a description of this aspect of building the CVE List.
CVE Identifiers are normally created in one of three ways: (1) they are refined by the content team using issues submitted by CVE’s data sources; (2) they are reserved by an organization or individual who uses the identifier when first announcing a new issue; or (3) they are created "out-of-band" by the CVE Editor, typically to quickly create a CVE Identifier for a new, critical issue that is being widely reported.
Once the CVE Identifier is officially assigned and the new CVE Entry is created, the new entry is published for the public on the CVE Web site in Stage 3: Publication.
The new entry is then added to the CVE List and published on the public CVE Web site.
The entry may need to be modified in simple ways, e.g., to clarify the description or add more references.
Most CVE Identifiers are modified by adding more references (such as additional vendor advisories), or through small changes to descriptions (such as fixing typos and clarifying the issue). For CVE users who want to track modifications to the CVE List, CERIAS/Purdue University offers a change monitoring report called "CVE Change Logs" that allows you to obtain daily or monthly changes to CVE Identifiers. In addition, the U.S. National Institute of Standards and Technology’s (NIST) National Vulnerability Database (NVD) provides (1) an RSS feed of all recently assigned CVE Identifiers, and (2) an RSS feed of all fully analyzed CVE Identifiers, which includes the names of the vulnerable products in the headers.
Some modifications may be substantial. For example, a CVE Identifier may need to be split into multiple items, or multiple CVE Identifiers may need to be merged into a single item (i.e., "recast"). This will happen if a content decision was not applied properly when the CVE Identifier was first created, or if new information forces such a change. The procedure for recasting CVE Identifiers has not been completely defined, because most of these changes are due to content decisions that have not been finalized yet. However, the procedure will certainly include forward pointers from any recast item to the correct items.
In other cases, a description and/or the set of references may be vague enough that the item could appear to describe more than one vulnerability. This happened more frequently in the early days of CVE, when the utility of references in deconflicting similar issues and the importance of having the necessary details in the descriptions were not fully understood. Vague descriptions and missing references can lead to mapping errors in CVE-Compatible Products and Services. Vendor security advisories with vague information present a special challenge; the issue is likely to be real (otherwise the vendor would not have reported it), but the issue could already be identified in a different CVE item. Consultation with the vendor may clear up any ambiguity, but it is not always possible or feasible.
There may be several reasons why a CVE Identifier should be "Deleted" from its associated list:
Since any number of CVE-Compatible Products and Services could be using older CVE identifiers, it is important to keep a record of what happens to each item that must be "deleted." A CVE Identifier is deleted by deprecating it. The process includes the following:
The references and descriptions are removed so that (1) it is clear to everyone that the item is no longer identifying the original vulnerability, and (2) the item is not returned as a result of keyword searches.
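The effect of deprecation on keyword searches can be illustrated with a small Python sketch. The record layout, the `deprecated` flag, and the `see_also` pointer list are assumptions made for this example; the text above states only that the description and references are removed while a record of the item is kept.

```python
from dataclasses import dataclass, field


@dataclass
class CveEntry:
    cve_id: str
    description: str = ""
    references: list = field(default_factory=list)
    deprecated: bool = False                       # hypothetical status flag
    see_also: list = field(default_factory=list)   # hypothetical forward pointers


def deprecate(entry, replaced_by=()):
    """Deprecate an entry in place: keep the id, clear the searchable content."""
    entry.description = ""
    entry.references = []
    entry.deprecated = True
    entry.see_also = list(replaced_by)
    return entry


def keyword_search(entries, term):
    """Return the ids of entries whose description contains `term`."""
    term = term.lower()
    return [e.cve_id for e in entries if term in e.description.lower()]
```

The identifier itself stays in the list, so a product that still uses the old identifier can look it up and see that it no longer identifies the original vulnerability.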