chore: add section of README to demonstrate low-level API
hansl committed Feb 15, 2020
1 parent d012830 commit cbb5fb5
Showing 1 changed file with 99 additions and 1 deletion.
100 changes: 99 additions & 1 deletion README.md
@@ -1,5 +1,103 @@
# Serde CBOR
[![Build Status](https://travis-ci.org/pyfisch/cbor.svg?branch=master)](https://travis-ci.org/pyfisch/cbor)

This repository manages two packages: `ll_cbor` and `serde_cbor`.

@pyfisch

pyfisch Feb 16, 2020

Maybe you could just use cbor as the name instead of ll_cbor. The owner, BurntSushi, indicated that he might be willing to transfer the name, since the current cbor crate depends on rustc-serialize.

@hansl

hansl Feb 16, 2020

Author Owner

That would be great.


# LL CBOR
[![Crates.io](https://img.shields.io/crates/v/serde_cbor.svg)](https://crates.io/crates/ll_cbor)
[![Documentation](https://docs.rs/serde_cbor/badge.svg)](https://docs.rs/ll_cbor)

LL CBOR is a low-level CBOR serialization and deserialization API that lets you serialize
and deserialize CBOR without any framework, using whatever schema you want.

This crate comes from the need to support CBOR values that aren't natively supported by
other serialization frameworks (such as Serde). Examples are large integers and dates: Serde
does not support big integer types or special object types, but CBOR has them, and an
application might want to serialize or deserialize using those (or the full CBOR spectrum,
or custom-made tags).
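
To make the motivation concrete, here is a minimal, dependency-free sketch of what "big integers and dates" look like on the wire. It is not the `ll_cbor` API: the function names `encode_head`, `encode_unsigned_bignum` and `encode_epoch_seconds` are hypothetical and exist only for illustration. The encoding rules themselves (major types, tag 1 for epoch-based dates, tag 2 for unsigned bignums) come from the CBOR specification.

```rust
/// Encode a CBOR "head": a 3-bit major type plus an unsigned argument,
/// always using the shortest of the five possible widths.
/// (Hypothetical helper, not part of ll_cbor.)
fn encode_head(major: u8, arg: u64, out: &mut Vec<u8>) {
    let m = major << 5;
    if arg < 24 {
        // The argument fits directly into the low 5 bits of the initial byte.
        out.push(m | arg as u8);
    } else if arg <= 0xff {
        out.push(m | 24);
        out.push(arg as u8);
    } else if arg <= 0xffff {
        out.push(m | 25);
        out.extend_from_slice(&(arg as u16).to_be_bytes());
    } else if arg <= 0xffff_ffff {
        out.push(m | 26);
        out.extend_from_slice(&(arg as u32).to_be_bytes());
    } else {
        out.push(m | 27);
        out.extend_from_slice(&arg.to_be_bytes());
    }
}

/// Encode an unsigned bignum: tag 2 wrapping a byte string that holds the
/// magnitude in big-endian order.
fn encode_unsigned_bignum(magnitude_be: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    encode_head(6, 2, &mut out); // major type 6 = tag, tag number 2
    encode_head(2, magnitude_be.len() as u64, &mut out); // major type 2 = byte string
    out.extend_from_slice(magnitude_be);
    out
}

/// Encode an epoch-based date: tag 1 wrapping an unsigned integer of seconds.
fn encode_epoch_seconds(secs: u64) -> Vec<u8> {
    let mut out = Vec::new();
    encode_head(6, 1, &mut out); // major type 6 = tag, tag number 1
    encode_head(0, secs, &mut out); // major type 0 = unsigned integer
    out
}
```

For example, passing the nine bytes `01 00 00 00 00 00 00 00 00` (the value 2⁶⁴, which does not fit into a `u64`) to `encode_unsigned_bignum` yields a tag head, a byte-string head, and then the magnitude unchanged.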

## Usage
LL CBOR supports Rust 1.38 and up. To install it, add this to your `Cargo.toml`:

@pyfisch

pyfisch Feb 16, 2020

Why version 1.38 in particular?

@hansl

hansl Feb 16, 2020

Author Owner

That’s just what I’ve been working with, and there are no newer features I need.

```toml
[dependencies]
ll_cbor = "0.1.0"
```

Then, for serializing values:

```rust
use ll_cbor::serialize::values as cbor;
use ll_cbor::serialize::builders;

fn main() -> Result<(), Box<dyn std::error::Error>> {

@cbeck88

cbeck88 Feb 16, 2020

Hi, I'd like to make the following request / suggestion:

  • If you want ll_cbor to be friendly to no_std environments, try not to build it around std::error::Error, which has no no_std equivalent. A good no_std-friendly alternative is the failure crate. Or just don't implement any of these Error traits in ll_cbor, and let downstream crates wrap the error type and implement the traits if they need them.
  • In the serde_cbor crate itself, there has to be some reference to StdError, because serde requires that a serializer's error type implements it. But it eases portability if you use the re-export from serde, as in my patch here: pyfisch#180.

If the implementations of std::error::Error live in ll_cbor, then you either have to drop the changes in pyfisch#180 or create a conditional dependency of ll_cbor on serde just for this error trait, which would work but seems like unnecessary complexity.

The issue is that in Cargo, features are assumed to be strictly additive: turning on more features just means exposing additional API, making more functions visible, implementing additional traits from optional dependencies, and so on, without breaking existing API. This simplifies things from Cargo's point of view, because Cargo performs a "global feature unification" across all crates; when you, for example, build once and then build -p a different target, Cargo may not even rebuild crates that were already built in the first build with features that aren't needed for the second build.

From the ll_cbor crate's point of view, using std::error::Error gated on an std flag is strictly additive, and that's all fine.

The problem is that serde itself uses features in a way that is unfortunately not strictly additive, and that has created a lot of pain in our projects.

In serde, serializers (provided by external crates like serde_cbor) are expected to satisfy a std::error::Error trait bound on their error types, but this requirement only exists when the serde/std feature is enabled; the bound is dropped otherwise. This is not an additive use of features: when the feature is turned on, the API becomes stricter, not more permissive, and the build breaks if serde_cbor/std is not turned on at the same time. This creates global feature coordination problems that are really hard to solve in large projects. Lots of third-party crates depend on serde, and it's hard to prevent them from turning on serde/std.

I think the reason this happened is that serde is older than no_std mode, and serde had a dependency on std::error::Error before no_std was even a thing. The real answer is that the Rust std developers should eventually make std::error::Error part of core, but they haven't done that yet. In the meantime @dtolnay did this: https://github.com/serde-rs/serde/releases/tag/v1.0.100. It would be super helpful (for me) if we could use that.

    // Serialize a single u64.
    let some_value = cbor::u64(0);

    // Serialize a vector of u32.
    let some_vector = cbor::vector(vec![cbor::u32(1), cbor::u32(2), cbor::u32(3)]);

@pyfisch

pyfisch Feb 16, 2020

CBOR is often used in no-std environments. Although alloc is sometimes available, there should definitely be a way to serialize an array without allocating.

See also https://github.com/pyfisch/minicbor for an example of no-std deserialization.


    // Serialize a map of variable values.
    // This is a HashMap<ll_cbor::Value, ll_cbor::Value>.
    let mut hash = std::collections::HashMap::new();
    hash.insert(cbor::string("hello"), cbor::i8(-100));
    // It is legal in CBOR to have different types of keys in a map, but this is impossible
    // to represent natively with Serde.
    hash.insert(cbor::u32(1), cbor::string("World"));
    let some_map = cbor::dictionary(&hash);

    // We can also just pass in bytes and get an untrusted ll_cbor::Value from them:
    let value = ll_cbor::Value::from_untrusted_slice(&[1, 2, 3]);

    // When we don't know the number of objects in advance, we can use a builder.
    let mut some_map_builder = builders::dictionary();
    for i in 0..1000 {
        // A well-formed CBOR map can also contain several entries with the same key
        // (although the CBOR specification does not treat such a map as valid).
        some_map_builder.insert(cbor::string("key"), cbor::u32(i));
    }

    // Adding a CBOR tag to it.
    let some_map2 = cbor::tag(55799, some_map_builder.build());

    // Getting the bytes for the second map.
    println!("{}", hex::encode(&some_map2));

    Ok(())
}
```
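
The review comments on this example ask for two things: an error type that does not depend on `std::error::Error`, and a way to serialize an array without allocating. The following is a rough sketch of how both could look, assuming the caller supplies a fixed-size output buffer. `SliceWriter`, `EncodeError` and `encode_u32_array` are hypothetical names invented for this illustration; they are not part of `ll_cbor` or `serde_cbor`.

```rust
/// Error type that only needs `core`. It deliberately does not implement
/// `std::error::Error`, so downstream crates can wrap it as they see fit.
/// (Hypothetical type, not part of ll_cbor.)
#[derive(Debug, PartialEq)]
enum EncodeError {
    BufferFull,
}

/// Cursor over a caller-provided buffer; it never allocates.
struct SliceWriter<'a> {
    buf: &'a mut [u8],
    pos: usize,
}

impl<'a> SliceWriter<'a> {
    fn new(buf: &'a mut [u8]) -> Self {
        SliceWriter { buf, pos: 0 }
    }

    /// Copy `bytes` into the buffer, failing if they do not fit.
    fn write(&mut self, bytes: &[u8]) -> Result<(), EncodeError> {
        let end = self.pos + bytes.len();
        if end > self.buf.len() {
            return Err(EncodeError::BufferFull);
        }
        self.buf[self.pos..end].copy_from_slice(bytes);
        self.pos = end;
        Ok(())
    }

    /// Write a major type and its argument in the shortest possible form.
    fn head(&mut self, major: u8, arg: u64) -> Result<(), EncodeError> {
        let m = major << 5;
        if arg < 24 {
            self.write(&[m | arg as u8])
        } else if arg <= 0xff {
            self.write(&[m | 24, arg as u8])
        } else if arg <= 0xffff {
            self.write(&[m | 25])?;
            self.write(&(arg as u16).to_be_bytes())
        } else if arg <= 0xffff_ffff {
            self.write(&[m | 26])?;
            self.write(&(arg as u32).to_be_bytes())
        } else {
            self.write(&[m | 27])?;
            self.write(&arg.to_be_bytes())
        }
    }

    /// The bytes written so far.
    fn written(&self) -> &[u8] {
        &self.buf[..self.pos]
    }
}

/// Encode a definite-length array (major type 4) of unsigned integers
/// (major type 0) straight into the writer, one element at a time.
fn encode_u32_array(items: &[u32], w: &mut SliceWriter<'_>) -> Result<(), EncodeError> {
    w.head(4, items.len() as u64)?;
    for &item in items {
        w.head(0, u64::from(item))?;
    }
    Ok(())
}
```

Because the array head only needs the element count, the elements can be streamed from any slice or iterator of known length without first collecting them into a `Vec`. A too-small buffer surfaces as `EncodeError::BufferFull` rather than a panic.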

There are several ways to deserialize. The main one is to use the various `try_from`
functions to check whether a byte stream holds a value of the expected type. You can also
build a schema and validate the input against it.

```rust
use ll_cbor::deserialize::values as cbor_de;
use ll_cbor::schema;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let bytes: Vec<u8> = vec![1, 2, 3];

    // These will be of type Result<u64, ll_cbor::deserialize::Error>.
    let maybe_u64 = cbor_de::u64::try_from(&bytes);

@pyfisch

pyfisch Feb 16, 2020

Will u64::try_from succeed if an 8-bit-wide number is found? Some people work with different "canonical" formats and need to check either that a 64-bit number was really found or that the number uses the shortest possible serialization.

@hansl

hansl Feb 16, 2020

Author Owner

Yeah, we struggled with that on my project. I think it’s fine to have some construct that allows that, but the primitives should not. Maybe a separate trait from TryFrom.

@pyfisch

pyfisch Feb 16, 2020

Another interesting quirk of CBOR is its negative integers. The most negative number representable is -2⁶⁴, which does not fit into an i64.

    let maybe_string = cbor_de::string::try_from(&bytes);

    // A vector can contain any elements.
    let maybe_vec = cbor_de::vec::try_from(&bytes);

@pyfisch

pyfisch Feb 16, 2020

Again, there needs to be an API that does not allocate.


    // So we have to map and test all items.
    // TODO: correct the unwrap() calls with results.
    let maybe_vec_of_u32 =
        maybe_vec.map(|v| v.iter().map(|i| cbor_de::u32::try_from(i)).collect::<Result<Vec<_>, _>>());

    // This will create a schema for a dictionary of string -> tag + i8.
    let s = schema::dictionary(schema::string, schema::tag(schema::i8));

@pyfisch

pyfisch Feb 16, 2020

A note about naming:

  • [1, 2, 3] is called an array in CBOR, while it is referred to as a vector in Rust
  • {"a": 1, "b": 2} is called a dictionary in Python, a (hash)map in Rust, an object in JS and a map in CBOR

The crate should be self-consistent in its naming.

@hansl

hansl Feb 16, 2020

Author Owner

Yeah, it should follow the CBOR names.

    // Validate that the bytes match the schema.
    // This returns a `Result<BTreeMap<String, i8>, ll_cbor::schema::Error>`.
    let maybe_v = s.validate(&bytes);

    Ok(())
}
```
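
Two questions raised in the review of this example are whether decoding should accept integers that are not in their shortest form, and how to handle -2⁶⁴, which does not fit into an `i64`. One possible answer is sketched below, under the assumption that returning an `i128` and taking a `require_shortest` flag is acceptable. `decode_integer` and `DecodeError` are hypothetical names for this illustration only and are not the `ll_cbor` API.

```rust
/// Hypothetical error type for this sketch; not part of ll_cbor.
#[derive(Debug, PartialEq)]
enum DecodeError {
    UnexpectedEof,
    NotAnInteger,
    NotShortestForm,
}

/// Decode one CBOR integer (major type 0 or 1) from the front of `input`.
/// Returns the value and the number of bytes consumed. The value is an
/// `i128` so that the whole CBOR range, from -2^64 to 2^64 - 1, fits.
fn decode_integer(input: &[u8], require_shortest: bool) -> Result<(i128, usize), DecodeError> {
    let first = *input.first().ok_or(DecodeError::UnexpectedEof)?;
    let major = first >> 5;
    let info = first & 0x1f;
    if major > 1 {
        return Err(DecodeError::NotAnInteger);
    }

    // Number of argument bytes that follow the initial byte.
    let extra = match info {
        0..=23 => 0,
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        _ => return Err(DecodeError::NotAnInteger),
    };
    let tail = input.get(1..1 + extra).ok_or(DecodeError::UnexpectedEof)?;

    // Small arguments live in the initial byte; larger ones are big-endian.
    let arg = if extra == 0 {
        u64::from(info)
    } else {
        tail.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b))
    };

    // Smallest argument that genuinely needs this many bytes.
    let min_for_width = match extra {
        0 => 0,
        1 => 24,
        2 => 0x100,
        4 => 0x1_0000,
        _ => 0x1_0000_0000u64,
    };
    if require_shortest && arg < min_for_width {
        return Err(DecodeError::NotShortestForm);
    }

    // Major type 1 encodes the value -1 - arg.
    let value = if major == 0 {
        i128::from(arg)
    } else {
        -1 - i128::from(arg)
    };
    Ok((value, 1 + extra))
}
```

With `require_shortest` set to `false`, the two-byte encoding `18 0a` decodes to 10; with it set to `true`, the same input is rejected, which is the check that a "canonical" format needs. Narrowing the `i128` to `u64`, `i64` or a smaller type is then a separate, explicit step for the caller.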


# Serde CBOR
[![Crates.io](https://img.shields.io/crates/v/serde_cbor.svg)](https://crates.io/crates/serde_cbor)
[![Documentation](https://docs.rs/serde_cbor/badge.svg)](https://docs.rs/serde_cbor)
