Safe number conversion? #1317

Closed
sbernard31 opened this issue Sep 30, 2022 · 4 comments
Labels
discussion Discussion about anything

Comments

sbernard31 commented Sep 30, 2022

I will use this issue as a kind of bookmark to store all information about number handling issues/questions in Leshan.

In a general way, our encoders/decoders have to deal with different kinds of number encoding.

By default, we should make sure those conversions are done without any information loss.
We began to do that in the NumberUtil class, but it is not yet fully done.
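
For illustration, a lossless narrowing check could look roughly like the following minimal sketch (plain Java, hypothetical helper name, not the actual NumberUtil API):

```java
import java.math.BigInteger;

public class SafeNumberConversion {

    /** Sketch: convert an arbitrary Number to long, refusing any lossy conversion. */
    public static long toLongExact(Number value) {
        if (value instanceof Long || value instanceof Integer
                || value instanceof Short || value instanceof Byte) {
            return value.longValue();
        }
        if (value instanceof BigInteger) {
            // longValueExact() throws ArithmeticException on overflow
            return ((BigInteger) value).longValueExact();
        }
        if (value instanceof Double || value instanceof Float) {
            double d = value.doubleValue();
            // a double holds an exact long value only if it is integral and within [-2^63, 2^63)
            if (Double.isNaN(d) || d < -0x1p63 || d >= 0x1p63 || d != Math.floor(d)) {
                throw new ArithmeticException(value + " cannot be converted to long without loss");
            }
            return (long) d;
        }
        throw new IllegalArgumentException("Unsupported number type: " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(toLongExact(42.0d)); // 42
        System.out.println(toLongExact(0.5d));  // throws ArithmeticException
    }
}
```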

Maximize JSON/CBOR interoperability?
Should we limit the range and precision of numbers to IEEE 754 binary64 (double precision) to maximize interoperability by default?

This would affect the old JSON / SenML-JSON / SenML-CBOR encodings:

It seems this is a real-world question in the JSON world: FasterXML/jackson-databind#911

Some links about how to handle this kind of conversion safely:

A non-exhaustive list of Leshan issues related to that:

sbernard31 commented Feb 8, 2024

Another safe number conversion question, with number attributes: #1583

sbernard31 commented Jan 22, 2025

This issue aims to centralize very different problems related to number conversion.

In this comment, I talk only about: maximizing JSON/SenML-JSON interoperability by following the RFCs.

Should we limit the range and precision of numbers to IEEE 754 binary64 (double precision) to maximize interoperability by default? (See #916 for more details.)

(SenML-CBOR is probably not concerned by this.)

The two relevant points from the RFCs are:

In the interest of avoiding unnecessary verbosity and speeding up processing, the mantissa SHOULD be less than 19 characters long, and the exponent SHOULD be less than 5 characters long.

This specification allows implementations to set limits on the range and precision of numbers accepted. Since software that implements IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is generally available and widely used, good interoperability can be achieved by implementations that expect no more precision or range than these provide, in the sense that implementations will approximate JSON numbers within the expected precision. A JSON number such as 1E400 or 3.141592653589793238462643383279 may indicate potential interoperability problems, since it suggests that the software that created it expects receiving software to have greater capabilities for numeric magnitude and precision than is widely available.

Note that a double uses 52 bits for the mantissa, which translates to about 15-17 decimal digits of precision.
And a double can safely store integers in the range [-(2^53)+1, (2^53)-1].
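
For illustration, a small standalone Java example showing where this limit bites (not Leshan code):

```java
public class DoublePrecisionDemo {
    public static void main(String[] args) {
        long safe = (1L << 53) - 1;   // 9007199254740991, the largest "safe" integer
        long unsafe = (1L << 53) + 1; // 9007199254740993, not representable as a double

        System.out.println((long) (double) safe);   // 9007199254740991 (round-trips exactly)
        System.out.println((long) (double) unsafe); // 9007199254740992 (silently rounded)

        // A simple way to detect the loss: round-trip through double and compare.
        System.out.println((long) (double) unsafe == unsafe); // false
    }
}
```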

If we want to strictly follow the RFC recommendations, I understand that we should avoid sending numbers with too big a mantissa or exponent, so as to ensure they fit in a double, and probably also not accept such numbers.

The obvious benefit of that would be increased interoperability.
Why could not doing it cause interoperability issues?

  • We can easily imagine that several JSON libraries handle numbers as doubles.
  • We can imagine that some LWM2M implementations don't really consider this question and delegate it to the underlying JSON library (and so handle numbers as doubles too).
  • We can imagine that some JSON libraries just follow the RFC recommendation.

The major drawbacks of following those recommendations:

  • We would largely lose the benefit of having different kinds of numbers (float, integer, unsigned_integer) for different purposes.
  • We would lose information/precision and limit the range of integer and unsigned_integer.
  • This would happen silently, so users would not even be aware of it.

Of course, this would only happen for very large numbers (> 2^53 or < -2^53), so maybe only for a few use cases.
But when it happens, if the user is not aware of it, this loss of precision can lead to serious issues. 🤷

So I see 3 possible modes:

  1. we convert numbers to double automatically (an easy way to limit range and precision), with silent precision loss.
  2. we convert numbers to double automatically, but raise an exception if the conversion cannot be done without loss (see the sketch after this list).
  3. we just encode numbers without precision loss (this is more or less the currently implemented mode).
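
To make mode 2 a bit more concrete, here is a minimal sketch of what a "convert or fail" helper could look like (hypothetical names, not an existing Leshan API):

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class LimitToDouble {

    /** Mode 2 sketch: return the value as a double, or throw if a double cannot represent it exactly. */
    public static double toDoubleExact(Number value) {
        if (value instanceof Double || value instanceof Float) {
            return value.doubleValue(); // float -> double widening is always exact
        }
        if (value instanceof BigInteger || value instanceof BigDecimal
                || value instanceof Long || value instanceof Integer
                || value instanceof Short || value instanceof Byte) {
            BigDecimal exact = new BigDecimal(value.toString());
            double d = exact.doubleValue();
            // new BigDecimal(d) is the exact decimal expansion of the double,
            // so any rounding during the conversion is detected here
            if (Double.isInfinite(d) || new BigDecimal(d).compareTo(exact) != 0) {
                throw new ArithmeticException(value + " cannot be encoded as a double without precision loss");
            }
            return d;
        }
        throw new IllegalArgumentException("Unsupported number type: " + value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(toDoubleExact(1L << 53));       // 9.007199254740992E15
        System.out.println(toDoubleExact((1L << 53) + 1)); // throws ArithmeticException
    }
}
```

(Comparing through BigDecimal rather than a simple cast-and-compare round trip avoids false positives for values like Long.MAX_VALUE, where the cast saturates back to the original value.)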

Which of these modes should we implement?
Which one should be the default?

My personal opinion: any of these options seems OK to me:

  • we only implement 3.
  • we implement 2 and 3, with 2 as the default mode.
  • we implement 1, 2 and 3, with 2 as the default mode.

Any opinion?

@sbernard31

(Not directly linked to the comment above: #1692 aims to detect Number-to-double precision loss during conversion.)

@sbernard31

After some discussion with @jvermillard:

It is not so obvious that modes 1 and 2 are needed.

Mode 3 is already implemented, plus we have some number conversions which try to raise on suspicious conversions.
Let's see if users report conversion issues later, and reconsider the question if needed.
