Boris Doubrov edited this page Dec 16, 2016 · 2 revisions

Define institutional policies for PDF documents

Overview

Custom policies are defined using the following two ingredients:

  • Features report generated by veraPDF
  • Policy profiles defined using the Schematron technology

Features Report

The features report extracts various properties from the PDF document and transforms them into an XML-based format. It is stored in the <featuresReport> element of each job in the veraPDF XML report.

The features report contains information about the following objects inside the PDF document:

  • Annotations
  • Color Spaces
  • Document Security
  • Embedded Files
  • Graphics States (or ExtGState dictionaries)
  • Fonts
  • Forms (or XObject forms)
  • ICC Profile
  • Images
  • Information Dictionary
  • Low Level Info
  • Metadata (or XMP Metadata)
  • Outlines (or Bookmarks)
  • Output Intents
  • Pages
  • Patterns (or Tiling Patterns)
  • PostScript
  • Properties Dictionaries
  • Shadings (or Gradients)
  • Digital Signatures

More details on the individual elements of the features report can be found in the veraPDF Technical Specification, Section TS4.
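As a rough illustration of how the features report can be consumed, the sketch below lists the stream filters recorded for each job. The sample XML is a hypothetical, heavily trimmed report written for this example, not real veraPDF output. The element path (report/jobs/job/featuresReport/lowLevelInfo/filters/filter) is the one used by the sample policy profile on this page, and the function name filters_per_job is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down report used only for illustration;
# a real veraPDF report contains many more elements and attributes.
SAMPLE_REPORT = """\
<report>
  <jobs>
    <job>
      <featuresReport>
        <lowLevelInfo>
          <filters>
            <filter name="FlateDecode"/>
            <filter name="DCTDecode"/>
          </filters>
        </lowLevelInfo>
      </featuresReport>
    </job>
  </jobs>
</report>
"""


def filters_per_job(report_xml):
    """Return one list of filter names per <job> element in the report."""
    root = ET.fromstring(report_xml)  # root is the <report> element
    result = []
    for job in root.findall("./jobs/job"):
        filters = job.findall("./featuresReport/lowLevelInfo/filters/filter")
        result.append([f.get("name") for f in filters])
    return result


print(filters_per_job(SAMPLE_REPORT))
```

Note that the lowLevelInfo element only appears in the report when the corresponding feature is enabled in the features configuration.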

Configuring features report

By default, most of these features are disabled in order to minimize the XML report size. The set of extracted features is controlled by the config file <install dir>/config/features.xml, or via the GUI File/Features Config setting (which updates the same config file). Initially the file reads:

<featuresConfig>
  <enabledFeatures>
    <feature>INFORMATION_DICTIONARY</feature>
  </enabledFeatures>
</featuresConfig>

Here is a config with all features enabled:

<featuresConfig>
    <enabledFeatures>
        <feature>ANNOTATION</feature>
        <feature>COLORSPACE</feature>
        <feature>DOCUMENT_SECURITY</feature>
        <feature>EMBEDDED_FILE</feature>
        <feature>EXT_G_STATE</feature>
        <feature>FONT</feature>
        <feature>FORM_XOBJECT</feature>
        <feature>ICCPROFILE</feature>
        <feature>IMAGE_XOBJECT</feature>
        <feature>INFORMATION_DICTIONARY</feature>
        <feature>LOW_LEVEL_INFO</feature>
        <feature>METADATA</feature>
        <feature>OUTLINES</feature>
        <feature>OUTPUTINTENT</feature>
        <feature>PAGE</feature>
        <feature>PATTERN</feature>
        <feature>POSTSCRIPT_XOBJECT</feature>
        <feature>PROPERTIES</feature>
        <feature>SHADING</feature>
        <feature>SIGNATURE</feature>
        <feature>FAILED_XOBJECT</feature>
        <feature>ERROR</feature>
    </enabledFeatures>
</featuresConfig>
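When only a few features are needed, the config can also be generated rather than edited by hand. The following is a minimal sketch, assuming the file layout shown above. The helper name build_features_config is hypothetical, and writing the result to <install dir>/config/features.xml is left to the reader.

```python
import xml.etree.ElementTree as ET


def build_features_config(features):
    """Build a <featuresConfig> document that enables the given feature names."""
    root = ET.Element("featuresConfig")
    enabled = ET.SubElement(root, "enabledFeatures")
    for name in features:
        ET.SubElement(enabled, "feature").text = name
    ET.indent(root, space="  ")  # pretty-printing; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


# Feature names are taken from the "all features" list above.
config_xml = build_features_config(
    ["INFORMATION_DICTIONARY", "LOW_LEVEL_INFO", "FONT"]
)
print(config_xml)
```

Enabling only the features that the policy profile actually inspects keeps the XML report small.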

veraPDF Policy Profiles

Policy profiles are defined using the standard Schematron syntax. A sample policy profile looks like:

<?xml version="1.0" encoding="UTF-8"?>
<iso:schema  xmlns:iso="http://purl.oclc.org/dsdl/schematron" >
   <iso:pattern name="Check compressions used in the document">
      <iso:rule context="/report/jobs/job/featuresReport">
         <iso:report test="lowLevelInfo/filters/filter/@name = 'CCITTDecode'">CCITT compression is OK</iso:report>
         <iso:assert test="lowLevelInfo/filters/filter/@name = 'DCTDecode'">JPEG compression is not OK</iso:assert>
      </iso:rule>
   </iso:pattern>
</iso:schema>

In this particular example the policy profile allows (and reports) the use of CCITT compression, and is intended to reject the use of JPEG (DCT) compression. An <iso:report> whose test evaluates to true generates a passed check message in the final Policy Report, and an <iso:assert> whose test evaluates to false adds a failed check message to the Policy Report.

Policy Report

The collection of all passed and failed checks is stored in the veraPDF report in a <policyReport> element for each <job> element:

<policyReport passedChecks="0" failedChecks="1">
  <passedChecks/>
  <failedChecks>
    <check status="failed" test="lowLevelInfo/filters/filter/@name = 'DCTDecode'"
           location="/report/jobs/job[7]/featuresReport/lowLevelInfo/filters/filter">
       <message>JPEG compression is not OK</message>
    </check>
  </failedChecks>
</policyReport>
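A policy report like the one above can be post-processed with any XML tooling. The sketch below, using only the Python standard library, extracts the location and message of every failed check. The function name failed_checks is an assumption of this example, and the input is the sample fragment above rather than a complete veraPDF report.

```python
import xml.etree.ElementTree as ET

# The sample policy report from this page, as a standalone fragment.
POLICY_REPORT = """\
<policyReport passedChecks="0" failedChecks="1">
  <passedChecks/>
  <failedChecks>
    <check status="failed" test="lowLevelInfo/filters/filter/@name = 'DCTDecode'"
           location="/report/jobs/job[7]/featuresReport/lowLevelInfo/filters/filter">
      <message>JPEG compression is not OK</message>
    </check>
  </failedChecks>
</policyReport>
"""


def failed_checks(policy_report_xml):
    """Return (location, message) pairs for each failed check."""
    root = ET.fromstring(policy_report_xml)  # root is <policyReport>
    return [
        (check.get("location"), check.findtext("message"))
        for check in root.findall("./failedChecks/check")
    ]


for location, message in failed_checks(POLICY_REPORT):
    print(f"{message} at {location}")
```

The failedChecks attribute on <policyReport> gives the same count without walking the individual checks, which may be enough for a simple pass/fail gate.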