Long-term Archive And Notary                            A. Jerman Blazic
Services (LTANS)                                                  SETCCE
Internet Draft                                                 S. Saljic
Intended status: Standards Track                                  SETCCE
Expires: July 26, 2009                                        T. Gondrom
                                                        January 26, 2009


          Extensible Markup Language Evidence Record Syntax
                    draft-ietf-ltans-xmlers-03.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

Jerman Blazic, et. al.   Expires July 26, 2009                  [Page 1]

Internet-Draft                  XMLERS                      January 2009

Abstract

   In many scenarios, users must be able to demonstrate the (time of) existence, integrity and validity of data, including signed data, for long or undetermined periods of time. This document specifies XML syntax and processing rules for creating evidence for long-term non-repudiation of existence of data. ERS-XML provides syntax and processing rules that are an alternative to the ASN.1 ERS syntax, expressed in XML.

Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Table of Contents

   1. Introduction
      1.1. Motivation
      1.2. General Overview and Requirements
      1.3. Terminology
      1.4. Conventions Used in This Document
   2. Evidence Record
      2.1. Structure
      2.2. Generation
      2.3. Verification
   3. Archive Time Stamp
      3.1. Structure
         3.1.1. Hash Tree
         3.1.2. Time-Stamp
         3.1.3. Cryptographic Information
      3.2. Generation
         3.2.1. Generation of Hash-Tree
         3.2.2. Reduction of Hash-Tree
      3.3. Verification
   4. Archive Time-Stamp Sequence and Archive Time-Stamp Chain
      4.1. Structure
         4.1.1. Digest Method
         4.1.2. Canonicalization Method
      4.2. Generation
         4.2.1. Time Stamp Renewal
         4.2.2. Hash Tree Renewal
      4.3. Verification
   5. Encryption
   6. XSD Schema for the Evidence Record
   7. Security Considerations
   APPENDIX A: Detailed verification process of an Evidence Record
   8. References
      8.1. Normative References
      8.2. Informative References
   Author's Addresses
   Intellectual Property Statement
   Disclaimer of Validity

1. Introduction

   The purpose of this document is to define an XML Schema and processing rules for Evidence Record Syntax in XML format. The document is related to the initial ASN.1 syntax for Evidence Record Syntax as defined in [RFC4998].

1.1. Motivation

   The evolution of electronic commerce and electronic data exchange in general requires the introduction of non-repudiable proof of data existence as well as data integrity and authenticity. Such data and non-repudiable proof of existence must endure for long periods of time, even when the information needed to prove data existence and integrity weakens or ceases to exist. Mechanisms such as digital signatures do not provide absolute reliability on a long term basis. Algorithms and cryptographic material used to create a signature can become weak in the course of time, and information needed to validate digital signatures may become compromised or simply cease to exist, for example because a certificate service provider is disbanded.

   Providing a stable environment for electronic data on a long term basis requires the introduction of additional means to continually provide an appropriate level of trust in evidence on data existence, integrity and authenticity. All integrity and authenticity related techniques used today suffer from the same problem of time related reliability degradation, including techniques for time stamping, which are generally recognized as data existence and integrity proof mechanisms. Over long periods of time the cryptographic algorithms used may become weak or encryption keys may be compromised. Some of the problems might not even be technical, such as the disbanding of a time stamping authority. To create a stable environment where proof of existence and integrity can endure well into the future, a new technical approach must be used.

   Long term non-repudiation of data existence and demonstration of data integrity techniques have already been introduced, for example by long term signature syntaxes like [RFC3126]. Long term signature syntaxes and processing rules address mostly the long term endurance of digital signatures, while evidence record syntax broadens this approach to data of any type or format, including digital signatures.

   The XMLERS syntax is based on Evidence Record Syntax as defined in [RFC4998] and addresses the same problem of long term non-repudiable proof of data existence and demonstration of data integrity on a long term basis. XMLERS does not supplement the [RFC4998] specification. It introduces the same approach but with a different format and different processing rules.

   The eXtensible Markup Language (XML) format is already recognized by a wide range of applications and services and is being selected as the de-facto standard for many applications based on data exchange.
   The introduction of evidence record syntax in XML format broadens the horizon of XML use and presents a syntax harmonized with a growing community of XML based standards, including those related to security services (e.g. XMLDSig or XAdES).

   Due to the differences in XML processing rules and other characteristics of the XML language, XMLERS does not present a direct transformation of ERS in ASN.1 syntax. The XMLERS syntax is based on processing rules different from those defined in [RFC4998] and does not support, for example, import of ASN.1 values in XML tags. Creating evidence records in XML syntax must follow the steps as defined in this draft. XMLERS is a standalone draft and is based on [RFC4998] conceptually only.

   Evidence Record Syntax in XML format is based on long term archive service requirements as defined in [RFC4810]. XMLERS syntax delivers the same (level of) non-repudiable proof of data existence as ASN.1 ERS. The XML syntax supports archive data grouping (and de-grouping) together with a simple or complex time-stamp renewal process. Evidence records can be embedded in the data itself or stored separately as a standalone XML file.

1.2. General Overview and Requirements

   The XMLERS draft (draft-ietf-ltans-xmlers-03) specifies XML syntax and processing rules for creating evidence for long-term non-repudiation of existence of data in a unit called "Evidence Record". The XMLERS syntax is defined to meet the requirements for data structures as set out in [RFC4810]. This document also refers to the ASN.1 ERS specification as defined in [RFC4998].

   An Evidence Record may be generated and maintained for a single data object or a group of data objects that form an archive object. A data object (binary chunk or a file) may represent any kind of document or part of it.
   Dependencies among data objects, their validation, or any relationship other than "a data object is a part of a particular archived object" are out of the scope of this draft.

   The Evidence Record maintains a close relationship to time stamping techniques. However, time-stamps as defined in [RFC3161] can cover only a single unit of data and do not provide processing rules for maintaining the long term stability of time-stamps applied over a data object. Evidence for an archive object is created by acquiring a time-stamp from a trustworthy authority for a specific value that is unambiguously related to one or more data objects. The Evidence Record syntax enables processing of several archive objects within a single processing pass using a hash-treeing technique and acquiring only one time-stamp to protect all archive objects. Besides a time-stamp, other artifacts are also preserved in the Evidence Record: data necessary to verify the relationship between a time-stamped value and a specific data object, packed into a structure called a "hash-tree"; and long term proofs for the formal verification of the included time-stamp(s).

   Because digest algorithms or cryptographic methods used may become weak, and because certificates used within a time-stamp (and signed data) may be revoked or may expire, the collected evidence data must be monitored and renewed before such events occur. This document introduces XML based syntax and processing rules for the creation and continuous renewal of evidence data.

1.3. Terminology

   Archive data object: Data unit that is archived and has to be preserved for a long time by the Long-term Archive Service.

   Archive data object group: A multitude of (archive) data objects, which for some reason (logically) belong together, e.g. a group of document files or a document file and a signature file could represent an archive data object group.

   Archive Time-Stamp (ATS): An Archive Time-Stamp contains a time-stamp token, useful data for validation and a single hash value or a list of hash values. An Archive Time-Stamp relates to a data object if the hash value of this data object is part of the first hash value list of the Archive Time-Stamp. An Archive Time-Stamp relates to a data object group if it relates to every data object of the group and no other data object.

   Archive Time-Stamp Chain (ATSC): Holds a sequence of Archive Time-Stamps generated during the preservation period.

   Archive Time-Stamp Sequence (ATSSeq): A sequence of Archive Time-Stamp Chains.

   Canonicalization: Processing rules for transforming an XML document into its canonical form. Two XML documents may have different physical representations, but they may have the same canonical form. For example, the sort order of attributes does not change the meaning of the document as defined in [XMLC14N].

   Cryptographic Information: Data or part of data related to the validation process of signed data, e.g. digital certificates, digital certificate chains, certificate revocation lists, etc.

   Digest Method: An identifier for a digest algorithm, which is a strong one-way function for which it is computationally infeasible to find an input that corresponds to a given output or to find two different input values that correspond to the same output. A digest algorithm transforms input data into a short value of fixed length. The output is called digest value, hash value or data fingerprint.

   Evidence: Information that may be used to resolve a dispute about various aspects of authenticity, validity and existence of archived data objects.

   Evidence record: Collection of evidence compiled for one or more given archived data objects over time. An evidence record includes an ordered collection of ATSs, which are grouped into ATSCs and ATSSeqs.

   Long-term Archive Service (LTA): A service responsible for generation, collection and maintenance (renewal) of evidence data. An LTA service may also preserve data for long periods of time, e.g. storage of archive data and associated evidence.

   Hash Tree: Collection of significant values of protected objects (input objects and evidence generated within the archival period). For that purpose a Merkle Hash Tree [MER1980] may be constructed and reduced for each archive object.

   Reduced hash-tree: The result of reducing a Merkle hash-tree [MER1980] to a list of lists of hash values. This is the basis for storing the evidence for a single data object.

   Time-Stamp (TS): A cryptographically secure confirmation generated by a Time Stamping Authority (TSA), e.g. as in [RFC3161], which specifies a structure for time-stamps and a protocol for communicating with a Time-Stamp Authority. Besides this, other data structures and protocols may also be appropriate, such as those defined in [ISO-18014-1.2002], [ISO-18014-2.2002], [ISO-18014-3.2004], and [ANSI.X9-95.2005].

1.4. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2. Evidence Record

   An Evidence Record is a unit of data which is used to prove the existence of an archive object (a single archive data object or a group of archive data objects) at a certain time. Through the lifetime of an archive object, an Evidence Record also demonstrates the integrity and non-repudiability of its data objects. To achieve this, cryptographic means are used, i.e. time-stamp tokens obtained from the Time-Stamping Authority (TSA). It is possible to store the Evidence Record separately from the archive object or to integrate it into the data itself.

   As cryptographic means are used to support evidence records, such records may lose their value through time. Time-stamps obtained from Time-Stamping Authorities may become invalid for a number of reasons, usually due to time constraints on time-stamp validity. Before the time-stamp tokens used become unstable, the evidence record has to be renewed. This may result in a series of time-stamp tokens, which are linked between themselves according to the cryptographic methods and algorithms used.

   An evidence record can be supported with additional information, which can be used to ease the processes of evidence record validation and renewal. Information such as digital certificates, certificate revocation lists, etc. can be collected, enclosed and processed together with archive object data (i.e. time stamped).

2.1. Structure

   The Evidence Record contains one or several Archive Time-Stamps (ATS). An ATS contains a time stamp token and optionally other useful data for time stamp validation, e.g. certificates, CRLs or OCSP responses, and also specific attributes such as service policies.

   Initially, an ATS is acquired. Later, before it expires or becomes invalid, a new ATS is acquired, which prolongs the validity of the archived object (its data objects together with all previously generated archive time stamps). This process must continue during the desired archiving period of the archive data object. A series of successive Archive Time-Stamps is collected in Archive Time-Stamp Chains and a series of chains in an Archive Time-Stamp Sequence.

   In XML syntax the Evidence Record is represented by the root element, which has the following structure (where "+" denotes one or more occurrences and "*" denotes zero or more occurrences):

   [Structure listing of the Evidence Record element and its child elements]

   The XML tags have the following meanings:

   The version tag indicates the syntax version, for compatibility with future revisions of this specification and to distinguish it from earlier non-conformant or proprietary versions of the XMLERS. The current version of the XMLERS syntax is 03.

   The archive data reference tag is optional and is used to refer to archive data. A reference can only be used for a single data object within a single evidence record (for that particular data object).

   The encryption information tag is optional and holds information on the cryptographic algorithms and cryptographic material used to encrypt archive data (in case archive data is encrypted, e.g. for privacy purposes). This optional information is needed to unambiguously re-encrypt data objects when processing evidence records. When it is omitted, data objects are not encrypted or non-repudiation proof is not needed for the unencrypted data. Details on how to process encrypted archive data and generate evidence record(s) are described in Section 5.

   The Archive Time-Stamp Sequence is a sequence of Archive Time-Stamp Chains.

   The Archive Time-Stamp Chain holds a sequence of Archive Time-Stamps generated during the preservation period. Details on Archive Time-Stamp Chains and Archive Time-Stamp Sequences are described in section 4. The sequences of Archive Time-Stamp Chains and Archive Time-Stamps are ordered, and the order must be indicated with the "Order" attribute of the Archive Time-Stamp Chain and Archive Time-Stamp elements.

   The digest method tag identifies the digest algorithms used to calculate digest values from archive data object(s), archive time-stamps, archive time-stamp sequence(s) and within a time-stamp token.

   The canonicalization method is a required element that specifies the canonicalization algorithm applied to the archive data or to evidence record elements prior to performing digest value calculations.

   The hash tree tag holds a value or a structure of reduced hash tree(s) described in section 3.1.1.

   The time-stamp tag holds a time-stamp token provided by the Time-Stamping Authority.

   The cryptographic information tag allows the storage of data needed in the process of archive time-stamp token validation when such data is not provided by the time-stamp token itself. This could include possible trust anchors, certificates, revocation information or the current definition of the suitability of cryptographic algorithms, past and present. These items may be added based on the policy used. This data is protected by successive time-stamps in the sequence of the time-stamps.

   The attributes tag contains additional information that may be provided by an LTA and used for the renewal process. An example of additional information is processing (renewal) policies, which are relevant for document preservation and evidence validation at a later stage.

2.2. Generation

   The generation of an Evidence Record element can be described as follows:

   1. Select an archive object (a data object or a data object group) to archive.

   2. Create the initial Archive Time-Stamp. This is the first ATS within the initial Archive Time-Stamp Chain element of the Archive Time-Stamp Sequence element.

   3. Refresh the ATS when necessary, by Time-Stamp Renewal or Hash-Tree Renewal (see section 4.).

   For a large number of archived objects, the time-stamping service may be expensive and time-demanding, so the LTA may profit from acquiring one time-stamp token for many archived objects which are not otherwise related to each other. It is possible to collect many data objects, build hash trees, store them, and reduce them later. For performance reasons or in case of local time-stamp generation, hash-treeing (the hash tree element) can be omitted. In this case the ATS element covers a single data object. It is also possible to convert existing time-stamps into ATSs for renewal.

   If only essential parts of documents or objects are to be protected, the application (which is not defined in this draft) must ensure that the binary data is extracted correctly for the generation of the evidence record.

   Example: an application may also provide evidence such as certificates, revocation lists etc., needed to verify and validate signed data objects or a data object group. This evidence may be added to the archived object data group and will be protected within the initial (and successive) time-stamp(s).

   Note that the cryptographic information element of the Evidence Record is not to be used to store and protect cryptographic material related to signed archive data. The use of this element is limited to cryptographic material related to ATS(s).

2.3. Verification

   The overall verification of an Evidence Record can be described as follows:

   1. Select an archive object (a data object or a data object group).

   2. Re-encrypt the data object or data object group, if the encryption field is used (for details, see Section 5.).

   3. Verify the Archive Time-Stamp Sequence (details in Section 3.3. and Section 4.3.).

3. Archive Time Stamp

   An Archive Time-Stamp (an ATS element) is a time-stamp and a set of hash values needed to unambiguously relate archived data with evidence data. The list of hash values can be reduced using a Merkle Hash Tree [MER1980] with the aim of using a single time-stamp token to protect many data objects.

   The process of construction of an ATS must support evidence on a long term basis and prove that the archive object existed and was identical, at the time of the time-stamp, to the currently present archive object (at the time of verification). To achieve this, the ATS must be renewed before it becomes invalid (which may happen for several reasons, such as an invalid digital certificate or a disbanded TSA).

3.1. Structure

   An ATS is a collection of the time-stamp token, an optional structure (a hash tree) for digest values of objects that were protected with that time-stamp token, and optional structures (cryptographic information) that store additional data needed for formal verification of the time-stamp token, such as a certificate chain or certificate revocation list.

3.1.1. Hash Tree

   The hash tree structure is a container for significant values needed to unambiguously relate a time-stamped value to protected data objects, and is represented by the hash tree element. The lists of hash values are generated by reduction of an ordered Merkle hash tree [MER1980]. The leaves of this hash tree are the hash values of the data objects to be time-stamped. Each inner node of the tree contains one hash value, which is generated by hashing the concatenation of its children nodes. The root hash value, which unambiguously represents all data objects, is then time-stamped. For a detailed specification of how to generate a hash tree see section 3.2.

3.1.2. Time-Stamp

   A Time-Stamp Token is an attestation generated by a TSA that a data item existed at a certain time. For example, [RFC3161] specifies a structure for signed time-stamp tokens in ASN.1 format. The following structure example (referring to the Entrust XML Schema for time-stamp) is a digital signature compliant with the [XMLDsig] specification containing time-stamp specific data, such as the time-stamped value and time, within an element of the signature.

   The time-stamp element of the ATS holds the complete structure of the time-stamp token as provided by the TSA. The time-stamp token may come in XML or ASN.1 format.

3.1.3. Cryptographic Information

   Digital certificates, CRLs or OCSP-Responses needed to verify the time-stamp token should be stored in the time-stamp token itself. When this is not possible, such data may be stored in the cryptographic information element (as a node value of its child element).
   The attribute Type is optional and is used to store processing information about the type of stored cryptographic information.

3.2. Generation

   The initial ATS relates to a data object or a data object group that represents an archive object. The generation of the initial ATS element can be done in a single process pass for one or for many archived objects, described as follows:

   1. Collect one or more archived objects to be time-stamped and optionally include their references in the archive data reference element.

   2. Select the canonicalization method C to be used for obtaining the binary representation of archive data and for the Archive Time-Stamp at a later stage in the renewing process (see section 4). Note that the canonicalization method is to be used for archive data only when the data is represented in XML format.

   3. Select a valid digest algorithm H. The selected secure hash algorithm MUST be the same as the hash algorithm in the time-stamp request (time-stamp token), or the digest method element of the Archive Time-Stamp Chain MUST be present and specify the hash algorithm of the hash tree.

   4. Create an input list of digest values of archive objects. Those digest values are the leaves of the hash tree for the whole group of archived objects. Generate a hash tree until the root hash value is calculated and optionally reduce the generated hash tree (see section 3.2.1. and 3.2.2. for details). The hash tree may be omitted in the initial ATS when an archive object has a single data object; then the time-stamped value must match the digest value of that single data object.

   5. Acquire a time-stamp from the TSA for the root hash value of the hash tree. If the time-stamp is valid, the initial archive time-stamp may be generated.

3.2.1. Generation of Hash-Tree

   The Merkle Hash-Tree for a group of archive objects is built from the bottom to the root. First the leaves of the tree are collected. The leaves represent digest values of archive objects:

   1. Collect archive objects and for each archive object its corresponding data objects.

   2. Choose a secure hash algorithm H, calculate digest values for the data objects and put them into the input list as follows: the digest value of an archive object is the digest value of its data object, if there is only one data object; for more than one data object it is the digest value of the binary sorted, concatenated digest values of all the data objects it contains. Note that for such entries on the input list (archive objects having more than one data object) the lists of their sub-digest values are stored as well.

   3. Group together items in the input list by N (e.g. in pairs, to make a binary tree) and for each group: sort in binary ascending order, concatenate and calculate the hash value. The result is a new input list.

   4. Repeat step 3 until only one digest value is left; this is the root value of the hash tree, which is time-stamped.

   Note that the selected secure hash algorithm MUST be the same as the hash algorithm in the time-stamp request, or the digest method element of the ATS MUST be present and specify the hash algorithm of the hash tree.

   Example: An input list built from 18 data objects, where h'1 is generated for a group of data objects (d4, d5, d6 and d7), has been grouped by 3. The group could be of any size (2, 3...). It is also possible to extend the tree with "dummy" values, to make every node have the same number of children.

                 ----------
                 d1  -> h1 \
                            \
            G1   d2  -> h2   |-> h''1
   +--------+               /        \
   |d4 -> h4|\   d3  -> h3 /          \
   |d5 -> h5| \  ----------            |
   |        |  -> h'1     \            |
   |d6 -> h6| /            \           |
   |d7 -> h7|/   d8  -> h8  |-> h''2   |-> h'''1
   +--------+              /           |        \
                 d9  -> h9/            |         \
                 ----------            |          |
                 d10 -> h10\          /           |
                            \        /            |
                 d11 -> h11  |-> h''3             |
                            /                     |
                 d12 -> h12/                      |-> root hash value
                 ----------                       |
                 d13 -> h13\                      |
                            \                     |
                 d14 -> h14  |-> h''4             |
                            /        \           /
                 d15 -> h15/          \         /
                 ----------            |-> h'''2
                 d16 -> h16\          /
                            \        /
                 d17 -> h17  |-> h''5
                            /
                 d18 -> h18/
                 ----------

          Figure 1 Generation of the Reduced Hash Tree.

   Note that there are no restrictions on the number of hash value lists or on their length. Also note that it is beneficial but not required to build and reduce hash-trees. An Archive Time-Stamp may consist of only one list of hash values and a time-stamp or, in the extreme case, only a time-stamp with no hash value lists.

3.2.2. Reduction of Hash-Tree

   The generated hash tree can be reduced to the lists of hash values necessary as a proof of existence for a single data object, as follows:

   1. For a selected data object, generate its hash value h using the secure hash algorithm H of the hash tree.

   2. Select all hash values which have the same parent node as hash value h. Place these hash values as partial hashes within a separate element with Order attribute 1. Generate the first list of hash values by arranging these hashes in binary ascending order.

   3. Repeat step 2 for the parent node of all hashes until the root hash value is reached. Place partial hashes that belong to the same node within a separate element, with the Order attribute increased by one. Note that node values and hash value h are not saved in the list as they are computable.

   4. Add the time-stamp (and cryptographic information) to get the Archive Time-Stamp.
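
   The generation and reduction steps above can be illustrated with a short Python sketch. It is not part of the specification. It assumes SHA-256 as the digest algorithm H, a group size N passed as a parameter, and hypothetical helper names (combine, object_digest, build_levels, reduce_tree, recompute_root). It follows the worked example for d4, in which the first list contains the hash of the data object itself and later lists contain only the siblings of the computed node. The last function recomputes the root hash value from a reduced hash tree, which is the comparison that verification relies on.

```python
import hashlib

def H(data: bytes) -> bytes:
    # SHA-256 stands in for the selected digest algorithm H (assumption).
    return hashlib.sha256(data).digest()

def combine(hashes):
    # Binary ascending sort, concatenate, then hash.
    return H(b"".join(sorted(hashes)))

def object_digest(data_objects):
    # One data object: its digest. Several: digest of the sorted,
    # concatenated digests of all contained data objects.
    digests = [H(d) for d in data_objects]
    return digests[0] if len(digests) == 1 else combine(digests)

def build_levels(leaves, n):
    # levels[0] is the input list; the last level holds only the root.
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([combine(cur[i:i + n]) for i in range(0, len(cur), n)])
    return levels

def reduce_tree(archive_objects, index, n):
    # Returns (lists of partial hashes, root hash) for one archive object.
    levels = build_levels([object_digest(o) for o in archive_objects], n)
    lists = []
    if len(archive_objects[index]) > 1:
        # First list: digests of the data objects inside the group.
        lists.append(sorted(H(d) for d in archive_objects[index]))
    for level in levels[:-1]:
        start = (index // n) * n
        siblings = level[start:start + n]
        if lists:
            # The node on the path is computable, so it is left out.
            del siblings[index - start]
        lists.append(sorted(siblings))
        index //= n
    return lists, levels[-1][0]

def recompute_root(data_object, lists):
    h = H(data_object)
    if not lists:
        return h  # no hash tree: the time-stamped value is h itself
    if h not in lists[0]:
        raise ValueError("data object not covered by this reduced hash tree")
    node = combine(lists[0])
    for partial in lists[1:]:
        node = combine(partial + [node])
    return node

# The 18 data objects of Figure 1; d4..d7 form one archive object.
data = [b"d%d" % i for i in range(1, 19)]
objects = [[d] for d in data[:3]] + [data[3:7]] + [[d] for d in data[7:]]
lists, root = reduce_tree(objects, 3, 3)
print([len(l) for l in lists])                 # [4, 2, 2, 1]
print(recompute_root(b"d4", lists) == root)    # True
```

   The four list lengths correspond to the lists (h4 h5 h6 h7), (h8 h9), (h''1 h''3) and (h'''2) for data object d4. How a trailing group with fewer than N members is handled (here it is simply hashed as a smaller group) is a choice of this sketch.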

   Reduced hash tree for data object d4 (from the previous example, presented in Figure 1), as lists of partial hashes in increasing Order:

      Order 1:  h4 h5 h6 h7
      Order 2:  h8 h9
      Order 3:  h''1 h''3
      Order 4:  h'''2

3.3. Verification

   An Archive Time-Stamp shall prove that a data object existed at a certain time, indicated by its time-stamp. The verification procedure is as follows:

   1. Identify the hash algorithm H (from the time-stamp token or the digest method element) and calculate the hash value h of the data object.

   2. Search for hash value h in the first element that contains partial hashes. If it is not present, terminate the verification process with a negative result.

   3. Concatenate the hash values of the current list of hash values in binary ascending order and calculate the hash value h' with algorithm H. This hash value h' MUST become a member of the next higher list of hash values (from the next element). Continue step 3 until a root hash value is calculated.

   4. Check the time-stamp. In case of a time-stamp according to [RFC3161], the root hash value must correspond to hashedMessage in the messageImprint field of timeStampToken, when the digest method corresponds to the hashAlgorithm field of the time-stamp token. In case of other time-stamp formats, the hash value and digestAlgorithm must also correspond to their equivalent fields if they exist.

   If a proof is necessary for more than one data object, steps 1 and 2 have to be performed for all data objects to be proved. If an additional proof is necessary that the Archive Time-Stamp relates to a data object group (e.g., a document and all its digital signatures), it can also be verified that only the hash values of the given data objects are in the first hash-value list.

4. Archive Time-Stamp Sequence and Archive Time-Stamp Chain

   An Archive Time-Stamp proves the existence of a single data object or a data object group at a certain time.
However, the initial evidence record can lose its validity as the time-stamp token degrades, for a number of reasons: hash algorithms or public key algorithms used in its hash tree or in the time-stamp may become weak, or the certificate of the time-stamp authority may expire or be revoked.

To preserve the validity of an evidence record before such events occur, the evidence record has to be renewed. This can be done by creating a new ATS. Depending on the reason for renewing the evidence record (the time-stamp becomes invalid or the hash algorithm of the hash tree becomes weak), two types of renewal processes are possible:

   o Time-stamp renewal: For this process a new Archive Time-Stamp is generated, which is applied over the last time-stamp created. The process results in a series of Archive Time-Stamps, which are contained within a single Archive Time-Stamp Chain (ATSC).

   o Hash-tree renewal: For this process a new Archive Time-Stamp is generated, which is applied over all existing time-stamps and data objects. The newly generated Archive Time-Stamp is placed in a new Archive Time-Stamp Chain. The process results in a series of Archive Time-Stamp Chains, which are contained within a single Archive Time-Stamp Sequence (ATSS).

After the renewal process, only the most recent (i.e. the last generated) Archive Time-Stamp has to be monitored for expiration or for loss of validity due to weakening of the algorithms used.

4.1. Structure

Archive Time-Stamp Chain and Archive Time-Stamp Sequence are containers for sequences of Archive Time-Stamps that are generated through renewal processes. The renewal process results in a series of evidence record elements: the Archive Time-Stamp Sequence element contains an ordered sequence of Archive Time-Stamp Chain elements, and each Archive Time-Stamp Chain element contains an ordered sequence of Archive Time-Stamp elements. Both sequences MUST be sorted by the time of the time-stamp in ascending order. The order is indicated by the Order attribute.
When an Archive Time-Stamp must be renewed, a new Archive Time-Stamp element is generated and, depending on the generation process, it is placed either:

   o as the last child element in the sequence of the last Archive Time-Stamp Chain element in case of time-stamp renewal, or

   o as the first child element in the sequence of the newly created Archive Time-Stamp Chain element in case of hash-tree renewal.

The ATS with the largest Order attribute value within the ATSC with the largest Order attribute value is the latest ATS and must be valid at the present time.

4.1.1. Digest Method

Digest method is a required element that identifies the digest algorithm used to calculate hash values of archive data (and node values of the hash tree). The digest method is specified within the Archive Time-Stamp Chain by a digest method element and indicates the digest algorithm that MUST be used for all hash value calculations related to the Archive Time-Stamps within that Archive Time-Stamp Chain.

The digest algorithm used for the evidence record must be the same as the algorithm used for the time-stamp token(s) within a single ATSC. When the algorithms used by the TSA are changed (e.g. upgraded), a new ATSC must be started using an equal or stronger digest algorithm.

4.1.2. Canonicalization Method

Prior to hash value calculation of an XML element, a proper binary representation must be extracted from its (abstract) XML data presentation. The binary representation is determined by UTF-8 encoding and canonicalization of the XML element. The XML element includes the entire text of the start and end tags as well as all descendant markup and character data (i.e., the text and sub-elements) between those tags.

Canonicalization method is a required element that identifies the canonicalization algorithm used to obtain the binary representation of one or more XML elements. Canonicalization MAY be applied over archive data and MUST be applied over elements of the evidence record (namely ATS and ATSC in the renewal process).
The canonicalization method is specified within the Archive Time-Stamp Chain by a canonicalization method element and indicates the canonicalization method that MUST be used for all binary representations of the Archive Time-Stamps within that Archive Time-Stamp Chain. In case of a succeeding ATSC, the canonicalization method indicated within that ATSC must also be used for calculating the digest value of the preceding ATSC. Note that the canonicalization method is unlikely to change over time, as it is not subject to the same constraints as the digest method. In theory, the same canonicalization method can be used for an entire Archive Time-Stamp Sequence.

4.2. Generation

Before the cryptographic algorithms used within the most recent Archive Time-Stamp become weak or the time-stamp certificates are invalidated, Archive Time-Stamps have to be renewed by generating a new Archive Time-Stamp.

4.2.1. Time Stamp Renewal

In case of time-stamp renewal, i.e. if the digest algorithm (H) to be used in the renewal process is the same as the digest algorithm (H') used in the last Archive Time-Stamp, the complete content of the last ATS MUST be time-stamped and a new Archive Time-Stamp element created as follows:

   1. If the current Archive Time-Stamp element does not contain the proof needed for long-term formal validation of its time-stamp token within the time-stamp token, collect the needed data, such as root certificates, certificate revocation lists, etc., and include them in the cryptographic information element of the last Archive Time-Stamp (each data object in a separate element).

   2. Select the canonicalization method from the canonicalization method element and the digest algorithm from the digest method element. Calculate the hash value from the binary representation of the last Archive Time-Stamp element, including the added cryptographic information. Acquire the time-stamp for the calculated hash value. If the time-stamp is valid, the new Archive Time-Stamp may be generated.

   3.
      Increase the Order value of the new ATS by one and place the new ATS into the last Archive Time-Stamp Chain element.

The new ATS and its hash tree MUST use the same digest algorithm as the preceding one, which is specified in the digest method element within the Archive Time-Stamp Chain or, when not indicated there, is the one defined in the time-stamp token.

4.2.2. Hash Tree Renewal

Hash tree renewal occurs when the new digest algorithm is different from the one used in the last Archive Time-Stamp (H <> H'). In this case the Archive Time-Stamp and the archive data objects covered by the existing Archive Time-Stamp must be time-stamped again as follows:

   1. If the current Archive Time-Stamp element does not contain the proof needed for long-term formal validation of its time-stamp token within the time-stamp token, collect the needed data, such as root certificates, certificate revocation lists, etc., and include them in the cryptographic information element of the last Archive Time-Stamp (each data object in a separate element).

   2. Select a canonicalization method and a new secure hash algorithm H.

   3. Select the data objects d(i) referred to by the initial Archive Time-Stamp (objects that are still present and not deleted). Generate hash values h(i) = H(d(i)). In case the initial Archive Time-Stamp is applied to more than one data object (of archive data), more than one hash value is generated, i.e., h(i_a), h(i_b), ..., h(i_n).

   4. Calculate the hash value hatsc(i) = H(ATSC(i)) from the binary representation of the previously generated and ordered Archive Time-Stamp Chain elements within the Archive Time-Stamp Sequence element, corresponding to data object d(i). Note that Archive Time-Stamp Chains and Archive Time-Stamps MUST be chronologically ordered, each according to its Order attribute.

   5. Concatenate, sorted in binary ascending order, each h(i) and the corresponding hatsc(i), and generate a new digest value h(i)' = H(h(i) + hatsc(i)).
      In case of several data objects of archive data, concatenate, sorted in binary ascending order, the hash values for each of them: h(i_a)' = H(h(i_a) + hatsc(i)), h(i_b)' = H(h(i_b) + hatsc(i)), etc.

   6. Build a new Archive Time-Stamp for each h(i)' (hash tree generation and reduction are defined in Sections 3.2.1 and 3.2.2). Note that each h(i)' is treated as the document hash in Section 3.2.2. The first hash value list in the reduced hash tree should contain only h(i)'. For a data object group, the first hash value list contains the new hashes for all the documents in this group in binary ascending order, i.e. h(i_a)', h(i_b)', etc.

   7. Create a new Archive Time-Stamp Chain element containing the new Archive Time-Stamp element (with Order number 1), and place it into the existing Archive Time-Stamp Sequence element as the last child, with the Order number increased by one.

4.3. Verification

An Evidence Record shall prove that an archive object existed and has not been changed since the time of the initial time-stamp token within the first ATS. In order to complete the non-repudiation proof for the data objects, the last ATS has to be valid, and the ATSCs and their relations to each other have to be proved:

   1. Select the data object and re-encrypt the data object or data object group, if the encryption information field is used. Select the initial digest algorithm specified within the Archive Time-Stamp Sequence and calculate the hash value of the data object. Verify that the first Archive Time-Stamp of the first Archive Time-Stamp Chain contains the (identical) hash value of the data object.

   2. Verify each Archive Time-Stamp Chain and each Archive Time-Stamp within it. The first hash value list of each Archive Time-Stamp MUST contain the hash value of the preceding Archive Time-Stamp. Each Archive Time-Stamp must be valid relative to the time of the succeeding Archive Time-Stamp.
      All Archive Time-Stamps within an Archive Time-Stamp Chain MUST use the same algorithm, which was secure at the time of the first Archive Time-Stamp of the succeeding Archive Time-Stamp Chain.

   3. Verify that the first hash value list of the first Archive Time-Stamp of each succeeding Archive Time-Stamp Chain contains the hash values of the data object and the hash value of all preceding Archive Time-Stamp Chains. Verify that this Archive Time-Stamp was created when the last Archive Time-Stamp of the preceding Archive Time-Stamp Chain was valid.

   4. Repeat steps 1 to 3 for all data objects of the data object group.

To prove that the Archive Time-Stamp Sequence relates to a data object group, verify that the first Archive Time-Stamp of the first Archive Time-Stamp Chain of each data object does not contain other hash values in its first hash value list (than the hash values of the other data objects).

For a non-repudiation proof for the data object, the last Archive Time-Stamp MUST be valid at the time of the verification process.

5. Encryption

In some archive service scenarios it may be required that clients send encrypted data only, preventing information disclosure to third parties, such as archive service providers. In such scenarios it must be clear that the evidence records generated refer to encrypted data objects. Evidence records in general protect the bit-stream (or binary representation of XML data), which freezes the bit structure at the time of archiving. Encryption schemes in such scenarios cannot be changed afterwards without losing the integrity proof. Therefore, an ERS record must hold and preserve encryption information in a consistent manner.

Encryption is a two-way process whose result depends on the cryptographic material used, e.g. encryption keys and encryption algorithms.
Encryption and decryption keys as well as algorithms must match in order to reconstruct the original message or data that was encrypted. When different cryptographic material is used, the results may not be the same, i.e. the decrypted data does not match the original (unencrypted) data. In cases when evidence was generated to prove the existence of encrypted data, the corresponding algorithm and decryption keys used for encryption must become a part of the evidence record and are used to unambiguously represent the original (unencrypted) data that was encrypted.

Cryptographic material may also be used in scenarios when a local copy of encrypted data submitted to the archive service provider for preservation is kept in an unencrypted form by a client. In such scenarios cryptographic material is used to re-encrypt the unencrypted data kept by the client for the purpose of validating the evidence record, which is related to the encrypted form of the client's data.

The attribute Type within the encryption information element is optional and is used to store processing information about the type of stored encryption information, e.g. encryption algorithm or encryption key.

The use of encryption elements heavily depends on the cryptographic mechanism and has to be defined by another specification.

6. XSD Schema for the Evidence Record

7. Security Considerations

Secure Algorithms

Cryptographic algorithms and parameters that are used within Archive Time-Stamps must always be secure at the time of generation.
This concerns the hash algorithm used in the hash lists of the Archive Time-Stamp as well as the hash algorithms and public key algorithms of the time-stamps. Publications regarding the security suitability of cryptographic algorithms ([NIST.800-57-Part1.2006] and [ETSI-TS102176-1-2005]) have to be considered by verifying components. A generic solution for automatic interpretation of security suitability policies in electronic form is not the subject of this specification.

Redundancy

Evidence records may become affected by weakening cryptographic algorithms even before this is publicly known. Retrospectively this has an impact on Archive Time-Stamps generated and renewed during the archival period. In this case the validity of the evidence records created may end without any option for retroactive action.

Many TSAs use the same cryptographic algorithms. While compromise of the private key of a TSA may compromise the security of only one TSA (and, for example, of only one Archive Time-Stamp), weakening of the cryptographic algorithms used to generate time-stamp tokens would affect many TSAs at the same time.

To manage such risks and to avoid the loss of evidence record validity due to weakening of the cryptographic algorithms used, it is recommended to generate and manage at least two redundant Evidence Records for a single data object. In such scenarios redundant Evidence Records must use different hash algorithms within their Archive Time-Stamp Sequences and different TSAs using different cryptographic algorithms for time-stamp tokens.

Secure Time-Stamps

Archive Time-Stamps depend upon the security of the normal time stamping provided by the TSA and stated in security policies. Renewed Archive Time-Stamps should have the same or higher quality as the initial Archive Time-Stamp of archive data.
Archive Time-Stamps used for signed archive data should have the same or higher quality than the maximum quality of the signatures.

APPENDIX A: Detailed verification process of an Evidence Record

An Evidence Record shall prove that an archive object existed and has not been changed since the time of the time-stamp token within the first ATS. Every ATS but the last must be valid at the time of the next ATS. In order to complete the non-repudiation proof for the data objects, the last ATS has to be valid, and the ATSCs and their relations to each other have to be proved.

To verify the validity of an Evidence Record, start with the first ATS and proceed through the last ATS (ordered by the attribute Order), performing the verification for each ATS as follows:

   5. Select an archive data object or a group of data objects.

   6. Re-encrypt the data object or data object group, if the encryption information field is used (see Section 5 for more details).

   7. Get a canonicalization method C and a digest method H from the current ATSC.

   8. Make a list of the digest values of (the binary representation of) the data objects within the data object group that MUST be protected with this ATS, as follows:

      a. If this ATS is the first in the ATSC:

         i. If this is the first ATS of the first ATSC in the ATSS, calculate the digest values of the data objects with H and add each digest value to the list.

         ii. If this is the first ATS of an ATSC that is not the initial ATSC in the ATSS, calculate a single digest value with H of the ordered ATSCs. Add and sort in binary ascending order this digest value with the digest values of the protected data objects and generate a new hash value.

      b. If this ATS is not the first in the ATSC:

         i. Calculate the digest value with H of the previous ATS element.

   9. Get the first sequence of the hash tree for this ATS.
      If this ATS has no hash tree elements, then:

      a. If this ATS is not the first in the ATSS (first ATS of the first ATSC), exit with a negative result.

      b. If this ATS is the first in the ATSS, there must be only one protected data object. The digest value of that data object must be the same as its time-stamped value. If not, exit with a negative result.

   10. If there is a digest value in the list of digest values of protected objects that cannot be found in the first sequence of the hash tree, or if there is a hash value in the first sequence of the hash tree that is not in the list of digest values of protected objects, exit with a negative result. Get the hash tree from the current ATS and use H to calculate the root hash value (see Sections 3.2.1 and 3.2.2).

   11. Get the time-stamped value from the time-stamp token. If the root hash value calculated from the hash tree does not match the time-stamped value, exit with a negative result.

   12. Verify the time-stamp cryptographically and formally (validate the certificate used and its chain, which may be available within the time-stamp token itself or in the Cryptographic Information section of the ATS).

   13. If this ATS is the last ATS, check formal validity for the current time (now); otherwise get the "valid from" time of the next ATS and verify formal validity at that specific time.

   14. If the information needed to verify formal validity is not found within the time-stamp or within the Cryptographic Information section of the ATS, exit with a negative result.

8. References

   [I-D.ietf-ltans-ers] Brandner, R., "Evidence Record Syntax (ERS)", draft-ietf-ltans-ers-11 (work in progress), February 2007.

   [I-D.ietf-ltans-ltap] Jerman-Blazic, A., "Long-term Archive Protocol (LTAP)", draft-ietf-ltans-ltap-03 (work in progress), October 2006.
   [I-D.ietf-ltans-reqs] Wallace, C., "Long-Term Archive Service Requirements", draft-ietf-ltans-reqs-10 (work in progress), December 2006.

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3161] Adams, C., Cain, P., Pinkas, D., and R. Zuccherato, "Internet X.509 Public Key Infrastructure Time-Stamp Protocol (TSP)", RFC 3161, August 2001.

   [RFC3280] Housley, R., Polk, W., Ford, W., and D. Solo, "Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile", RFC 3280, April 2002.

   [RFC3852] Housley, R., "Cryptographic Message Syntax (CMS)", RFC 3852, July 2004.

   [XMLC14N] Boyer, J., "Canonical XML", W3C Recommendation, March 2001.

   [XMLDsig] Eastlake, D., "XML-Signature Syntax and Processing", XMLDsig, July 2006.

8.1. Normative References

   None.

8.2. Informative References

   [MER1980] Merkle, R., "Protocols for Public Key Cryptosystems", Proceedings of the 1980 IEEE Symposium on Security and Privacy (Oakland, CA, USA), pages 122-134, April 1980.

   [MIME] Freed, N., "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies", RFC 2045, November 1996.

   [RFC3470] Hollenbeck, S., "Guidelines for the Use of Extensible Markup Language (XML) within IETF Protocols", RFC 3470, January 2003.

Authors' Addresses

   Aleksej Jerman Blazic
   SETCCE
   Tehnoloski park 21
   1000 Ljubljana
   Slovenia
   Phone: +386 (0) 1 620 4500
   Fax: +386 (0) 1 620 4509
   Email: aljosa@setcce.si

   Svetlana Saljic
   SETCCE
   Tehnoloski park 21
   1000 Ljubljana
   Slovenia
   Phone: +386 (0) 1 620 4506
   Fax: +386 (0) 1 620 4509
   Email: svetlana.saljic@setcce.si

   Tobias Gondrom
   Waisenhausstr.
   67C
   80637 Munich
   Germany
   Phone: +49 (0) 89 3205 330
   Fax: /
   Email: tobias.gondrom@gondrom.org

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.