Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow preservation of precision for JSON numbers in json_read #130

Open
thetrime opened this issue Apr 26, 2019 · 5 comments
Open

Allow preservation of precision for JSON numbers in json_read #130

thetrime opened this issue Apr 26, 2019 · 5 comments

Comments

@thetrime
Copy link
Contributor

After issue #128, I realised that there's a problem going the other way as well. When we receive a value that we know is a number, we use number_codes/2 to turn it into a float. I would like to instead either allow a hook that I can use to create exact representations of the values as rdiv/2 terms, or provide an option to json_read/3 so that json_read_number/4 (which will become json_read_number/5, I guess) can create an rdiv/2 term for the input.

For example, if we receive a term like
{"foo": 123456789012345678901234567890}
this already turns up (for me, anyway) as
_{foo: 123456789012345678901234567890}
and not
_{foo: 1.2345678901234568e+29}.

However, a term like
{"foo": 0.123456789012345678901234567890}
could be preserved as
_{foo: 12345678901234567890123456789 rdiv 100000000000000000000000000000}
but instead I end up with
_{foo: 0.12345678901234568}

This would require the SWI installation to have GMP support (obviously) and to pass an option to json_read/3 to indicate that we should use rationals instead of floats for representing non-integer numbers.

In short:

?- open_string('{"foo": 123456789012345678901234567890123456789012345678901234567890}', X), json_read(X, Term, []).
X = <stream>(0x7ff31846afd0),
Term = json([foo=123456789012345678901234567890123456789012345678901234567890]).

?- open_string('{"foo": 0.123456789012345678901234567890123456789012345678901234567890}', X), json_read(X, Term, []).
X = <stream>(0x7ff31a175780),
Term = json([foo=0.12345678901234568]).

?- open_string('{"foo": 0.123456789012345678901234567890123456789012345678901234567890}', X), json_read(X, Term, [decimals(rdiv)]).
X = <stream>(0x7ff31a175780),
Term = json([foo=12345678901234567890123456789012345678901234567890123456789 rdiv 100000000000000000000000000000000000000000000000000000000000]).

Hopefully I've done a better job of explaining my proposal this time!

@JanWielemaker
Copy link
Member

That is not a hook, but some extra functionality scoped by an option. I can live with that. Go ahead. Oh, and please add some documentation :)

@thetrime
Copy link
Contributor Author

Yikes. Since you doubled the performance by moving json_read_number to C, we may have a problem. Everything is fine up until we use PL_chars_to_term to convert the string buffer into a number. Since this leads us quite deep into the parser, we have a couple of options:

  1. Check the (new) option list to json_read_number, and if we have supplied the decimal(rdiv) flag, then change the way the JSON unifies the parsed number and the output term (for example, do not put the decimal point into the buffer, but instead count the number of digits after it. Then when we parse the buffer it will necessarily be an integer. We can divide it by 10**(number of digits) using an invocation of is/2 (or is there a foreign interface to the arithmetic setup?) This assumes that we are reading a base-10 number, which possibly we might not be - but how could we tell except by partially re-parsing the input to look for 0x or 0b or 0o or something?
  2. Change PL_chars_to_term so that we read floats as rdivs. This seems like a real rabbit hole.

Option 1 seems pretty hacky. Option 2 seems a bit extreme. Is there another option I might have missed?

@JanWielemaker
Copy link
Member

I think (2) is a bit too far stretched. At some point we may support decimals and provide an option to read_term to read x.y as decimal which could then end up as an option to PL_chars_to_term().

(1) is probably feasible. I'd add a boolean extra argument to json_read_number. I do not see the re-reading as you have everything there, no? JSON doesn't support non-decimal numbers (seems there is a JSON5 proposal that does. The beauty of JSON was being simple 😢)

The math is a bit of a problem. There is no foreign API for it. Not really sure I want this. I'd first call Prolog and to a performance test. Not sure how bad it will be. If to bad we need a plan B.

@thetrime
Copy link
Contributor Author

You mean just call is/2 directly from C in a new frame? That's what I was thinking of as first attempt, but it seemed pretty rough. Performance is really not my concern (at least not at this stage) - I just want it to be correct :)

If JSON only supports decimals that makes it pretty easy. Even more so if I'm allowed to just have a boolean argument rather than having to parse options. I'll put together a rough version that works, then you can direct where to apply the polish later

@JanWielemaker
Copy link
Member

Its internal, so need for an option list. If you follow the option encoding scheme of the JSON library it will be a simple direct lookup to get this boolean.

I'd add a predicate json:mk_rational(+Int, +DigitsOfterDot, -Rational) and call that using PL_call_predicate(). Do not forget to create a foreign frame or you will run out of stack! If this proves too slow we can always add an additional API. Possibly you can bypass the general rule for numbers that fit easily in int64_t.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants