1.1.2 Syntax

Felix Schütt edited this page Jul 5, 2017 · 11 revisions

Like any file format, a PDF has its own syntax: a set of well-defined rules for how data is written into the file. Before we look closer at the contents of a PDF file, we have to learn these rules.

Everything in the PDF body is structured in so-called "objects". In the "Hello World" PDF, the document body starts after the %PDF-1.4 line and ends before the xref line (neither of these lines is part of the body). An "object" in the context of a PDF is just any kind of information; it has nothing to do with objects in programming languages.
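
To make the body boundaries concrete, here is a minimal Python sketch that cuts the body out of a file. The sample bytes and the `extract_body` helper are hypothetical and exist only for this illustration. The sample is heavily shortened and is not a complete, valid PDF. The sketch also assumes `\n` line endings and that the xref keyword appears at the start of a line, which real files do not guarantee.

```python
# Hypothetical, heavily shortened sample (not a complete, valid PDF).
SAMPLE_PDF = (
    b"%PDF-1.4\n"              # header line, not part of the body
    b"1 0 obj\n"
    b"<< /Type /Catalog >>\n"
    b"endobj\n"
    b"xref\n"                  # first line after the body
    b"0 2\n"
    b"trailer\n"
    b"<< /Root 1 0 R >>\n"
    b"startxref\n"
    b"45\n"
    b"%%EOF\n"
)


def extract_body(pdf_bytes: bytes) -> bytes:
    """Return the bytes between the header line and the xref line."""
    # The body starts right after the newline that ends the header line.
    header_end = pdf_bytes.index(b"\n") + 1
    # Search for "xref" at the start of a line, so that the later
    # "startxref" keyword is not matched by accident.
    xref_start = pdf_bytes.index(b"\nxref", header_end - 1) + 1
    return pdf_bytes[header_end:xref_start]


body = extract_body(SAMPLE_PDF)
print(body.decode("ascii"))
```

Everything the function returns is made up of objects in the sense described above. A real parser would not search for the keyword like this, because the same bytes could also occur inside the body.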

There are several types of objects. Each type can hold different kinds of information, and each is serialized differently. However, there is one rule that all objects have to adhere to: two objects have to be separated by one or more whitespace characters, unless the start of the next object is obvious from the context in which it is used.
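
The separation rule can be illustrated with a small Python sketch. The `tokenize` helper and its pattern are hypothetical and deliberately simplified. They only know about names (which start with `/`) and array brackets, and ignore strings, dictionaries and comments. The point is that `/` and the brackets make the start of the next object obvious, while two numbers need whitespace between them.

```python
import re

# Simplified token pattern (an assumption for this sketch, not a full
# PDF lexer): a name such as /Type, a single array bracket, or a run of
# characters that is neither whitespace nor one of those delimiters.
TOKEN = re.compile(rb"/[^\s/\[\]]*|[\[\]]|[^\s/\[\]]+")


def tokenize(data: bytes) -> list[bytes]:
    """Split a byte string into tokens under the simplified rules above."""
    return TOKEN.findall(data)


# Numbers have no delimiter of their own, so they need whitespace.
print(tokenize(b"1 0 R"))       # three tokens
print(tokenize(b"10R"))         # one token: the separation is lost

# A "/" starts a new name, so no whitespace is required here.
print(tokenize(b"/Type/Page"))  # two tokens

# Brackets are recognizable without surrounding whitespace as well.
print(tokenize(b"[1 2/Name]"))  # five tokens
```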

A general warning: Everything in PDF is case-sensitive. You always have to match the exact capitalization.

Whitespace