
Parsing common data types

gershnik edited this page Apr 3, 2022 · 4 revisions

Often a command line argument or option value needs to be converted into a specific data type inside your application code. Examples include converting an argument into an integer, an enum or a floating point number. You can, of course, perform such a conversion yourself using whatever facilities you prefer. However, doing it safely and correctly in C++ is often unnecessarily hard. To help with this, Argum provides a few type parsers - utility functions or classes that simplify such common tasks.

If you are not using the single-header (or module) distribution, these parsers are available in the <argum/type-parsers.h> header.

Parsing Integers

The function template parseIntegral<Type> converts a string_view/wstring_view into any integral type. The format it accepts is as follows:

  • Any number of spaces, followed by
  • A number recognized by the strtoX family of standard functions (strtol, strtoul and their relatives), followed by
  • Any number of spaces

The number must fit into the destination type.

The second argument to parseIntegral is "base" which operates exactly like the "base" argument of strtoX family of functions. By default it is 0 - meaning "guess based on prefix".
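The accepted format and the role of "base" can be illustrated with a minimal sketch built directly on std::strtoll. Note that parseSignedSketch is a hypothetical helper written for this page, not Argum's implementation: it handles signed types only and reports failure with std::nullopt rather than throwing or returning Expected.

```cpp
#include <cerrno>
#include <cstdlib>
#include <limits>
#include <optional>
#include <string>
#include <string_view>
#include <type_traits>

// Hypothetical helper (not part of Argum): accepts "spaces, number, spaces"
// and returns std::nullopt on any failure.
template<class T>
std::optional<T> parseSignedSketch(std::string_view text, int base = 0) {
    static_assert(std::is_integral_v<T> && std::is_signed_v<T>);
    std::string buffer(text);               // strtoll needs a null-terminated string
    const char * begin = buffer.c_str();
    const char * last = begin + buffer.size();
    char * end = nullptr;
    errno = 0;
    // strtoll skips leading whitespace itself; base 0 means "guess from prefix"
    long long value = std::strtoll(begin, &end, base);
    if (end == begin || errno == ERANGE)
        return std::nullopt;                // nothing parsed, or out of long long range
    while (end != last && *end == ' ')
        ++end;                              // allow trailing spaces
    if (end != last)
        return std::nullopt;                // trailing characters that are not spaces
    if (value < std::numeric_limits<T>::min() || value > std::numeric_limits<T>::max())
        return std::nullopt;                // does not fit the destination type
    return static_cast<T>(value);
}
```

Under these assumptions, parseSignedSketch<int>("0x1F") yields 31 and parseSignedSketch<int>("010") yields 8 with the default base of 0, while parseSignedSketch<int>("010", 10) yields 10.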

When using exceptions, the function throws Parser::ValidationError if the input cannot be parsed; otherwise it returns the parsed value. When using expected values, the function returns Expected<Type>. Here is an example using exceptions:

unsigned short aNumber;
...
parser.add(
    Positional("number").
    help("a number"). 
    handler([&](const std::string_view & value) { 
        aNumber = parseIntegral<unsigned short>(value);
}));

and using expected values:

unsigned short aNumber;
...
parser.add(
    Positional("number").
    help("a number"). 
    handler([&](const std::string_view & value) -> Expected<void> { 
        auto result = parseIntegral<unsigned short>(value);
        if (!result)
            return result;
        aNumber = *result;
        return {}; // assumption: a default-constructed Expected<void> signals success
}));

Parsing Floating Point Numbers

Similarly, parseFloatingPoint<Type> parses float, double or long double. It operates almost identically to parseIntegral, except that the accepted number format is that of the strtof family of functions (strtof, strtod, strtold).
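To show what "the format of the strtof family" means in practice, here is a minimal sketch for double built on std::strtod. As with any illustration here, parseDoubleSketch is a hypothetical helper and not Argum's implementation; it returns std::nullopt on failure and treats any ERANGE result as an error, which is an assumption about the desired behavior.

```cpp
#include <cerrno>
#include <cstdlib>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Argum): accepts "spaces, number, spaces".
std::optional<double> parseDoubleSketch(std::string_view text) {
    std::string buffer(text);               // strtod needs a null-terminated string
    const char * begin = buffer.c_str();
    const char * last = begin + buffer.size();
    char * end = nullptr;
    errno = 0;
    double value = std::strtod(begin, &end); // skips leading whitespace itself
    if (end == begin || errno == ERANGE)
        return std::nullopt;                // nothing parsed, or out of range
    while (end != last && *end == ' ')
        ++end;                              // allow trailing spaces
    if (end != last)
        return std::nullopt;                // trailing characters that are not spaces
    return value;
}
```

This sketch accepts decimal, exponent and hexadecimal floating point notation, because std::strtod does.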

Parsing Enumerations or Choices

Very often an argument or option value must be a choice between different fixed options. The choices might map to some actual enum in application code or just be some fixed allowed values. To complicate matters, sometimes multiple command line choices (or alternative spellings) must map to a single entry in program code.

In addition, when using choices you usually want them to be displayed in usage and help messages, ideally without any code duplication.

To deal with all this complexity Argum provides ChoiceParser (and WChoiceParser) class.

You use it as follows:

  • Construct an instance of the ChoiceParser class, specifying its overall settings. The currently available settings are:
    • Whether the parser is case sensitive (default: no)
    • Whether the parser should treat any unrecognized input as an "else clause" - an additional enumeration entry (default: no)
  • Add possible choices. Each choice can have multiple alternative names.
  • Parse the input. The result is an index of the choice you added. If "else clause" is allowed any unmatched input will be returned as index of last choice + 1. Otherwise, unmatched input will throw Parser::ValidationError exception.
  • You can also ask the parser for a syntax description that can be passed to the argName method of an Option or used as the name of a Positional. This lists the choices in the generated usage and help.
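The steps above can be pictured with a small stand-alone sketch of the same idea. ChoiceSketch is a hypothetical class written for illustration, not Argum's ChoiceParser: it takes alternative names as an initializer list, returns std::nullopt for unmatched input instead of throwing, and stores names lowercased when case-insensitive.

```cpp
#include <cctype>
#include <cstddef>
#include <initializer_list>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical illustration (not Argum's ChoiceParser).
class ChoiceSketch {
public:
    explicit ChoiceSketch(bool caseSensitive = false, bool allowElse = false)
        : m_caseSensitive(caseSensitive), m_allowElse(allowElse) {}

    // All names passed in one call map to the same choice index.
    void addChoice(std::initializer_list<std::string_view> names) {
        std::vector<std::string> normalized;
        for (std::string_view name : names)
            normalized.push_back(normalize(name));
        m_choices.push_back(std::move(normalized));
    }

    // Returns the index of the matching choice. Unmatched input yields
    // "last index + 1" when the else clause is enabled, std::nullopt otherwise.
    std::optional<std::size_t> parse(std::string_view value) const {
        const std::string key = normalize(value);
        for (std::size_t i = 0; i < m_choices.size(); ++i)
            for (const std::string & name : m_choices[i])
                if (name == key)
                    return i;
        if (m_allowElse)
            return m_choices.size();
        return std::nullopt;
    }

    // Builds a "{a, b, c}" style list of every accepted name.
    std::string description() const {
        std::string out = "{";
        for (const auto & names : m_choices)
            for (const std::string & name : names) {
                if (out.size() > 1)
                    out += ", ";
                out += name;
            }
        out += "}";
        return out;
    }

private:
    // Lowercases the text unless the parser is case sensitive.
    std::string normalize(std::string_view text) const {
        std::string out(text);
        if (!m_caseSensitive)
            for (char & c : out)
                c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        return out;
    }

    bool m_caseSensitive;
    bool m_allowElse;
    std::vector<std::vector<std::string>> m_choices;
};
```

Because the result is simply the position at which a choice was added, adding choices in the same order as the entries of an enum lets the index be converted to that enum directly.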

Here is an example:

enum Encoding { defaultEncoding, Base64, Hex };
Encoding encoding;

ChoiceParser encodingChoices;
//this is equivalent to
//ChoiceParser encodingChoices({.caseSensitive = false, .allowElse = false});

encodingChoices.addChoice("default");
encodingChoices.addChoice("base64");
encodingChoices.addChoice("hex", "binhex"); //alternative spellings
parser.add(
    Option("--format").
    help("output file format"). 
    argName(encodingChoices.description()).
    handler([&](string_view value) {
        encoding = Encoding(encodingChoices.parse(value));
}));

This produces the following help and error output:

$ ./prog --help                                         
Usage: ./prog [--format {default, base64, hex, binhex}] [--help]

options:
  --format {default, base64, hex, binhex}
                        output file format
  --help, -h            show this help message and exit

$ ./prog --format foo
invalid arguments: value "foo" is not one of the valid choices {default, base64, hex, binhex}
Usage: ./prog [--format {default, base64, hex, binhex}] [--help]