This document contains assorted notes about SGFC, how it operates, and the consequences this has for libsgfc++ and/or the library client.
If you simply want to use libsgfc++, you don't need to read this file; SGF notes is the document for you.
If you want to develop libsgfc++ itself, these notes can help you understand its implementation.
On reading SimpleText/Text property values, and values of Point/Move/Stone properties of game types != Go, SGFC removes all escape characters.
On writing these property values, SGFC adds escape characters back where they are needed to protect the SGF skeleton.
Also see the detailed comments in SgfcPropertyDecoder and SgfcDocumentEncoder.
Note: SgfcPropertyDecoder used to perform some escape processing, but this became unnecessary when SGFC V2.00 started to remove all escape characters. The code that performs the processing has been left in but made optional and disabled by default. Some comments may still mention the old behaviour.
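The escaping round trip described above can be illustrated with a minimal sketch. The function names EscapeSgfValue and UnescapeSgfValue are hypothetical and exist neither in SGFC nor in libsgfc++. The sketch assumes that the characters that need protection are the closing bracket, the backslash and, for composed values only, the colon.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of SGFC or libsgfc++.
// Adds an escape character in front of every character that would otherwise
// damage the SGF skeleton: ']' ends a property value, '\' is the escape
// character itself. ':' separates the two parts of a composed value, so it
// is escaped only when the caller asks for it.
std::string EscapeSgfValue(const std::string& raw, bool escapeColon = false)
{
  std::string escaped;
  escaped.reserve(raw.size());
  for (char c : raw)
  {
    if (c == ']' || c == '\\' || (escapeColon && c == ':'))
      escaped.push_back('\\');
    escaped.push_back(c);
  }
  return escaped;
}

// Hypothetical helper for the opposite direction: drops each escape
// character and keeps the character that follows it. A lone backslash at
// the very end of the input is kept as-is.
std::string UnescapeSgfValue(const std::string& escaped)
{
  std::string raw;
  for (std::size_t i = 0; i < escaped.size(); ++i)
  {
    if (escaped[i] == '\\' && i + 1 < escaped.size())
      ++i;
    raw.push_back(escaped[i]);
  }
  return raw;
}
```

With this sketch a value such as a]b is written as a\]b, and reading it back yields a]b again.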
SGFC detects any kind of line breaks when it reads/parses SGF content.
However, the kind of line break SGFC uses when writing SGF content is determined at compile time. By default SGFC uses a single LF character. This can be changed by redefining the pre-processor macro EOLCHAR to something else during compilation. The macro must expand to a single character, such as a LF character (already the default) or a CR character (used on classic MacOS systems). If you undefine EOLCHAR then SGFC will write two characters, a CR followed by a LF, which is the standard on Windows/MS-DOS systems.
When reading SimpleText or Text property values, SGFC follows the SGF standard rules for soft and hard line breaks.
- Text property values: SGFC preserves unescaped (= hard) line breaks. SGFC removes escaped (= soft) line breaks (including the escape character). libsgfc++ never gets to see escaped line breaks.
- SimpleText property values: SGFC converts unescaped line breaks into a single space character. SGFC removes escaped line breaks (including the escape character). libsgfc++ never gets to see escaped line breaks.
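The two rules in the list above can be sketched as follows. This is an illustration only, not SGFC's actual parsing code. The names ValueKind, LineBreakLength and ApplyLineBreakRules are hypothetical, and the sketch assumes that a line break is one of LF, CR, CR+LF or LF+CR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class ValueKind { Text, SimpleText };

// Returns the length (1 or 2) of the line break that starts at pos,
// or 0 if there is no line break at that position.
std::size_t LineBreakLength(const std::string& s, std::size_t pos)
{
  if (pos >= s.size() || (s[pos] != '\n' && s[pos] != '\r'))
    return 0;
  // A two-character line break is CR+LF or LF+CR, never the same character twice
  if (pos + 1 < s.size() && (s[pos + 1] == '\n' || s[pos + 1] == '\r') && s[pos + 1] != s[pos])
    return 2;
  return 1;
}

std::string ApplyLineBreakRules(const std::string& raw, ValueKind kind)
{
  std::string result;
  std::size_t i = 0;
  while (i < raw.size())
  {
    if (raw[i] == '\\')
    {
      std::size_t softLength = LineBreakLength(raw, i + 1);
      if (softLength > 0)
      {
        // Soft line break: remove the escape character and the line break
        i += 1 + softLength;
        continue;
      }
      // Any other escape: remove the escape character, keep what follows
      ++i;
      if (i < raw.size())
        result.push_back(raw[i++]);
      continue;
    }

    std::size_t hardLength = LineBreakLength(raw, i);
    if (hardLength > 0)
    {
      if (kind == ValueKind::Text)
        result.append(raw, i, hardLength);  // hard line break is preserved
      else
        result.push_back(' ');              // SimpleText: becomes one space
      i += hardLength;
      continue;
    }

    result.push_back(raw[i++]);
  }
  return result;
}
```

Either way, the result never contains an escaped line break, which matches what libsgfc++ gets to see.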
When writing SimpleText or Text property values, SGFC preserves unescaped line breaks and generates escaped line breaks as it sees fit (cf. -L and -t command line options).
For Go the SGF standard defines that black or white pass moves can have either value "" (an empty string) or "tt". The latter counts as a pass move only for board sizes <= 19; for larger boards "tt" is a normal move. The SGF standard also mentions that "tt" is kept only for compatibility with FF3.
The observed behaviour is that SGFC can deal with "tt" both on reading and writing, and it always performs the conversion to an empty string in an attempt to produce FF4 content. Consequently:
- When ISgfcDocumentReader reads SGF content with "tt" property values, the resulting ISgfcDocument will contain pass moves with empty string property values.
- When an ISgfcDocument is programmatically set up with black or white move properties that have value "tt", and the document is then passed to ISgfcDocumentWriter for writing, the resulting SGF content will contain pass moves with empty string property values.
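The pass move rule can be sketched as follows. The helpers IsPassMove and NormalizeMoveValue are hypothetical. They follow the rule as the SGF standard states it; whether SGFC applies the board size check in exactly this way has not been verified here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: decides whether a Go move value denotes a pass.
// An empty value is always a pass, "tt" only on boards up to 19x19.
bool IsPassMove(const std::string& moveValue, int boardSize)
{
  assert(boardSize >= 1);
  if (moveValue.empty())
    return true;
  return moveValue == "tt" && boardSize <= 19;
}

// Hypothetical helper: converts every pass move to the FF4 form (an empty
// string) and leaves all other move values untouched.
std::string NormalizeMoveValue(const std::string& moveValue, int boardSize)
{
  return IsPassMove(moveValue, boardSize) ? std::string() : moveValue;
}
```

On a 19x19 board NormalizeMoveValue("tt", 19) yields an empty string, whereas on a 25x25 board the value "tt" is returned unchanged because it denotes a regular move there.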
SGFC automatically expands compressed point lists during parsing when the game type is Go (GM[1]). See Check_Pos(). It compresses them again during saving unless the -e option is specified. See WriteNode().
It's impossible to preserve the original format in all cases, because SGFC normalizes the input during parsing, but it does not record any traces of what it did.
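To illustrate what expanding means, here is a minimal sketch that turns one compressed value (a rectangle given by its upper-left and lower-right corner, e.g. aa:bb) into individual points. The function ExpandPointRectangle is hypothetical and is not how Check_Pos() is implemented. The sketch assumes lowercase coordinates only, which covers boards up to 26x26.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: expands "xy" (single point) or "xy:xy" (rectangle,
// upper-left corner first) into the list of points it covers, row by row.
std::vector<std::string> ExpandPointRectangle(const std::string& value)
{
  if (value.size() == 2)
    return { value };  // not compressed, nothing to expand

  assert(value.size() == 5 && value[2] == ':');
  const char left = value[0];
  const char top = value[1];
  const char right = value[3];
  const char bottom = value[4];
  assert(left <= right && top <= bottom);

  std::vector<std::string> points;
  for (char y = top; y <= bottom; ++y)
  {
    for (char x = left; x <= right; ++x)
      points.push_back(std::string{x, y});
  }
  return points;
}
```

Once a rectangle has been expanded like this, nothing in the resulting list reveals whether the input was compressed, which is why the original format cannot always be restored on save.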
The original idea was that the SgfcDocumentEncoder class creates data structures using the structs that SGFC defines (e.g. Node, Property, PropValue). In theory it should have been as easy as invoking the appropriate SGFC helper functions, such as NewNode() or NewProperty(), to create the data structures. In practice this scheme turned out at first to be problematic, and then doomed.
The first minor problem is accessibility of the involved functions: NewNode() is declared extern, so it can be used by SgfcDocumentEncoder easily. NewProperty() and other functions, on the other hand, are declared static inside load.c, so it would have been necessary to patch SGFC in order to be able to use those functions.
The next problem, which turned out to be the real bummer, is that each Property structure has a member named buffer, which is a pointer that SGFC expects to point deep into the SGFInfo file buffer. Although NewProperty() sets buffer up for us, it expects the buffer start as a parameter, i.e. the position within the file buffer from where property parsing should begin. We can't provide NewProperty() with a pointer into an std::string buffer because that buffer goes away when the std::string object is destroyed. This means we would have to create a copy of the std::string buffer on the heap. But then the next problem would be: who frees that memory when the SGFC operation is done? FreeSGFInfo() is not the one, because it assumes that the Property buffer is part of the file buffer, so it merely frees the file buffer. In addition to the memory management issue, there are doubts whether SGFC's parsing functions can handle an abrupt end of the file buffer, which would occur because our copy of the std::string buffer would naturally be terminated by a zero byte.
In the end the best (because simplest and safest) idea seemed to be to just let SgfcDocumentEncoder generate an SGF content stream that simulates an entire file buffer.
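The idea of generating an SGF content stream instead of SGFC structs can be sketched like this. SketchProperty, SketchNode and EncodeGameTree are hypothetical, heavily simplified stand-ins and not the real SgfcDocumentEncoder. The sketch assumes a linear sequence of nodes without variations and property values that have already been escaped.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the libsgfc++ object model
struct SketchProperty
{
  std::string name;
  std::vector<std::string> values;  // assumed to be escaped already
};

using SketchNode = std::vector<SketchProperty>;

// Encodes a linear sequence of nodes as the text of one SGF game tree.
// The resulting string plays the role of the simulated file buffer.
std::string EncodeGameTree(const std::vector<SketchNode>& nodes)
{
  std::string sgf = "(";
  for (const SketchNode& node : nodes)
  {
    sgf += ";";
    for (const SketchProperty& property : node)
    {
      sgf += property.name;
      // A property without values is written with one empty value
      if (property.values.empty())
        sgf += "[]";
      for (const std::string& value : property.values)
        sgf += "[" + value + "]";
    }
  }
  sgf += ")";
  return sgf;
}
```

The string produced this way owns its memory and is a complete SGF document, so it can be handed to an in-memory load step without any pointers into foreign buffers.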
This section can be seen as a very high-level introduction to an unofficial SGFC API (there is no official API). You may find this interesting if you're new to SGFC and want to learn how you can reuse its code in a software project of your own.
At the highest level the SGFC codebase can be divided into two parts:
- The main() function
- Everything else
A software library must not contain a main() function, so the first thing that needs to be done to enable reusing the "everything else" part is to remove the main() function part. This can be achieved in one of two ways:
- Don't include the file main.c in the build
- Include the file main.c in the build but define the preprocessor macro VERSION_NO_MAIN when main.c is compiled.
libsgfc++ uses the second approach because it allows main.c to be included in IDE projects that are generated by CMake.
The SGFC codebase offers a number of high-level global functions that are of interest to a library. The functions can be placed into four broad categories:
- Managing data structures.
  - The function SetupSGFInfo() creates an SGFInfo data structure and initializes it with useful default values. An SGFInfo data structure must be passed as a parameter to all global functions that perform some sort of data processing. The SGFInfo data structure can be parameterized in a number of ways, but this is beyond the scope of this document.
  - The function FreeSGFInfo() frees all memory that is currently occupied by an SGFInfo structure.
- Working with arguments.
  - The function ParseArgs() processes command line arguments in the usual argc/argv format known from the main() function and stores the result in an SGFCOptions data structure.
- Loading SGF data.
  - The function LoadSGF() loads the SGF data from the filesystem into intermediate in-memory data structures.
  - The function LoadSGFFromFileBuffer() loads the SGF data from an in-memory buffer into intermediate in-memory data structures.
  - The function ParseSGF() processes the intermediate data structures generated by either one of the two load functions and generates the final data structures. These are what a software library such as libsgfc++ wants to work with.
- Saving SGF data.
  - The function SaveSGF() processes the final data structures generated by ParseSGF() into SGF data and saves that data, either to the filesystem or to an in-memory buffer.
SGFC lets a software library tap into parts of its data processing by way of hooks/callbacks that can be installed in strategic places.
- Diagnostic output. These are the messages that SGFC prints to standard output when it is run as a command line tool. Messages are generated as a transient by-product of the processing done by most of the global functions mentioned in the previous section. In other words, the message data is not available to the caller in the data structures generated/returned by the global functions! To get hold of the message data, a software library can install a hook/callback function by setting the global variable print_error_output_hook. The hook/callback function will then receive the message data in structured form for further processing. See the SgfcMessageStream class for an example of how libsgfc++ implements the hook/callback.
- Redirecting SGF data on save. The default for SaveSGF() is to write the SGF data to the filesystem. A library can redirect the output to be written somewhere else so that the library can perform further processing. SGFC provides three hook/callback opportunities to tap into the save procedure: open and close for when an SGF file would normally be opened/created and closed, and putc for when character data is actually output to the save stream. See the SgfcSaveStream class for an example of how libsgfc++ implements redirection to an in-memory buffer.
- Preventing a call to exit(). SGFC has an internal central memory-allocation function (actually a preprocessor macro). When that function fails to allocate memory it invokes a "panic" function to report the out-of-memory problem. When SGFC is run as a command line tool the default "panic" function prints a message to standard output and then invokes exit(). A software library can install its own version of a "panic" function by setting the global variable oom_panic_hook. See the SgfcBackendController class for an example of how libsgfc++ attempts to handle memory allocation failures.
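The redirection mechanism can be illustrated with a generic sketch of the callback pattern. Everything in it is hypothetical: SketchSaveHandler, its members onOpen, onPutChar and onClose, and the function SketchSave are stand-ins invented for this illustration. The real SGFC struct, its member names and its function signatures are likely to differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a set of save callbacks
struct SketchSaveHandler
{
  bool (*onOpen)(SketchSaveHandler* self, const std::string& path);
  bool (*onPutChar)(SketchSaveHandler* self, char c);
  bool (*onClose)(SketchSaveHandler* self);
  std::string* buffer;  // target used by the in-memory callbacks below
};

// In-memory variants of the three callbacks: nothing touches the filesystem
bool MemoryOpen(SketchSaveHandler* self, const std::string&)
{
  self->buffer->clear();
  return true;
}

bool MemoryPutChar(SketchSaveHandler* self, char c)
{
  self->buffer->push_back(c);
  return true;
}

bool MemoryClose(SketchSaveHandler*)
{
  return true;
}

SketchSaveHandler MakeMemoryHandler(std::string& buffer)
{
  return SketchSaveHandler{ MemoryOpen, MemoryPutChar, MemoryClose, &buffer };
}

// Stand-in for a save routine. It only talks to the callbacks, so it does
// not know, or care, where the character data ends up.
bool SketchSave(SketchSaveHandler& handler, const std::string& path, const std::string& sgf)
{
  assert(handler.onOpen && handler.onPutChar && handler.onClose);
  if (!handler.onOpen(&handler, path))
    return false;
  for (char c : sgf)
  {
    if (!handler.onPutChar(&handler, c))
    {
      handler.onClose(&handler);
      return false;
    }
  }
  return handler.onClose(&handler);
}
```

A caller creates a std::string, wraps it with MakeMemoryHandler() and passes the handler to SketchSave(). After the call the string holds the complete SGF text. The same pattern, a replaceable function pointer with a default behaviour, also underlies the message and out-of-memory hooks.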
According to the SGFC readme document the "KI" property is a private property of the "Smart Game Board" application (SGB). The property name means "integer komi".
SGFC converts "KI" to the Go-specific "KM" property, dividing the original "KI" numeric value by 2 to obtain the new "KM" value. SGFC performs this conversion in all cases, even if the game tree's game type is not Go.
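The arithmetic of the conversion can be sketched as follows. ConvertKiToKm is a hypothetical helper. In particular the textual format of the result (a ".5" suffix for odd values, no fractional part for even values) is an assumption and may not match what SGFC writes.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: divides the integer komi by 2 and renders the result
// as a KM property value. Works on the magnitude so that negative values
// are halved the same way as positive ones.
std::string ConvertKiToKm(int ki)
{
  const bool negative = ki < 0;
  const int magnitude = negative ? -ki : ki;

  std::string km = negative ? "-" : "";
  km += std::to_string(magnitude / 2);
  if (magnitude % 2 != 0)
    km += ".5";
  return km;
}
```

For example, under these assumptions KI[11] becomes KM[5.5] and KI[12] becomes KM[6].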