Converting R data.frame objects to/from JavaScript objects #364

Merged
merged 12 commits into from
Feb 27, 2024
6 changes: 6 additions & 0 deletions NEWS.md
@@ -18,6 +18,10 @@

* JavaScript objects of type `TypedArray`, `ArrayBuffer`, and `ArrayBufferView` (e.g. `Uint8Array`) may now be used with the `RRaw` R object constructor. The generic `RObject` constructor now converts objects of this type to R raw atomic vectors by default.

* Constructing new R objects using `await new RObject(...)` now supports input objects of the form: `{a: [...], b: [...]}` or D3-style data arrays of the form: `[{a: ..., b: ...}, {a: ..., b: ...}, {a: ..., b: ...}, ... ]`. Where possible, lists are constructed of class `data.frame`. Direct construction with `RList()` does not create a `data.frame`.

* R `data.frame` objects may be converted into D3-style data arrays using the new R list object method `.toD3()`.

## Breaking changes

* The `captureR()` method now captures plots generated by the canvas graphics device by default. Captured plots are returned as an array of `ImageBitmap` objects in the property `images`. The previous behaviour may be restored either by manually starting a non-capturing `webr::canvas()` device during execution, or by including `captureGraphics: false` as part of the `options` argument. The default options for `evalR()` are set so that plotting is not captured, retaining the current behaviour.
@@ -26,6 +30,8 @@

* The LLVM flang build scripts are now sourced using a git submodule, to simplify management of CI builds. The build scripts are available at https://github.com/r-wasm/flang-wasm and the patched LLVM source at https://github.com/r-wasm/llvm-project. This allows for an independent build of the patched LLVM flang for WebAssembly, including as a separate Nix package.

* The `RObject.toObject()` methods have been refined for R lists and `data.frame` objects. The `.toObject()` method no longer recurses by default when converting R lists and environments, because nested R objects may not be convertible. However, for symbols, atomic vectors, and R `data.frame` objects, `.toObject()` will convert the object to JavaScript in its entirety. For a type-stable conversion, serialise the object with the `.toJs()` method instead.

## Bug Fixes

* Fix showing content of lazy loaded files in webR demo app editor component (#320).
17 changes: 13 additions & 4 deletions src/docs/convert-js-to-r.qmd
@@ -1,5 +1,5 @@
---
title: "Create R Objects from JavaScript"
title: "Creating New R Objects"
format: html
toc: true
---
@@ -59,11 +59,20 @@ The resulting R object type is chosen based on the contents of the JavaScript ar
| `{ re: 1, im: 2 }` | Complex atomic vector |
| `string` | Character atomic vector |
| `TypedArray`, `ArrayBuffer`, `ArrayBufferView` | Raw atomic vector |
| `Array` | An atomic vector of type following the coercion rules of R's `c()` function |
| [Object of type `WebRDataJs`](convert-r-to-js.qmd#serialising-r-objects) | Given by the `type` property in the provided object |
| `Array` | A vector or list of type following the coercion rules of R's `c()` function |
| `RObject` | Given by the type of the referenced R object |
| `{a: [...], b: [...], ...}` | R list object, possibly in the form of a `data.frame` |
| `[{a: 0, b: 'x'}, {a: 1, b: 'y'}, ...]` | R list object in the form of a `data.frame` |
| [`WebRDataJs`](convert-r-to-js.qmd#serialising-r-objects) | Given by the `type` property in the provided object |
| Other JavaScript object | Reserved for future use |
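
To make the relationship between the two `data.frame`-style inputs in the table concrete, here is a minimal sketch in plain JavaScript that pivots a D3-style array of rows into the column-oriented form. The helper `rowsToColumns` is hypothetical and is not part of webR, and filling absent properties with `null` is an assumption made for illustration only.

```javascript
// Hypothetical helper, not part of the webR API: pivot D3-style rows
// of the form [{a: ..., b: ...}, ...] into columns {a: [...], b: [...]}.
function rowsToColumns(rows) {
  // Collect every property name that appears in any row, in first-seen order.
  const keys = [...new Set(rows.flatMap((row) => Object.keys(row)))];
  const columns = {};
  for (const key of keys) {
    // Assumption: a property absent from a row is treated as missing (null).
    columns[key] = rows.map((row) =>
      Object.prototype.hasOwnProperty.call(row, key) ? row[key] : null
    );
  }
  return columns;
}

const columns = rowsToColumns([
  { a: 0, b: 'x' },
  { a: 1, b: 'y' },
]);
console.log(columns); // → { a: [0, 1], b: ['x', 'y'] }
```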

In the case of R lists provided as an [`WebRDataJs`](api/js/modules/RObject.md#webrdatajs), the above rules are applied recursively to also construct the objects provided within the list.
#### Further details

For JavaScript objects with a collection of properties, the above rules will be applied recursively to construct an R list with named components corresponding to each property.

If each property of the JavaScript object is an `Array`, all of equal length, all containing values compatible with R atomic vectors (or `null`, indicating a missing value), the resulting R object will be automatically^[This coercion may be avoided by constructing an `RList` object directly.] coerced into an R [`data.frame`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/data.frame.html).
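
The conditions above can be sketched as a predicate in plain JavaScript. This is an illustration of the stated rule only, not webR's implementation: the function names are hypothetical, and treating an object with no properties as ineligible is an assumption.

```javascript
// Hypothetical check, not webR's implementation: is a value compatible
// with an R atomic vector element, or null (a missing value)?
function isAtomicCompatible(value) {
  if (value === null) return true;
  const t = typeof value;
  if (t === 'number' || t === 'boolean' || t === 'string') return true;
  // Complex values are written as { re: ..., im: ... } in the table above.
  return t === 'object' && typeof value.re === 'number' && typeof value.im === 'number';
}

// True when every property is an Array, all arrays have equal length,
// and every element is atomic-compatible or null.
function looksLikeDataFrame(obj) {
  const cols = Object.values(obj);
  // Assumption: an object with no properties is not treated as a data.frame.
  if (cols.length === 0) return false;
  if (!cols.every((col) => Array.isArray(col))) return false;
  const length = cols[0].length;
  return cols.every((col) => col.length === length && col.every(isAtomicCompatible));
}

console.log(looksLikeDataFrame({ a: [1, 2], b: ['x', null] })); // → true
console.log(looksLikeDataFrame({ a: [1, 2], b: [3] })); // → false
```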

When `RObject` references are used for constructing new R objects, no underlying copy is made. The resulting R object reference will point to the same memory location.

### Creating an R object with specific type

218 changes: 208 additions & 10 deletions src/docs/convert-r-to-js.qmd
@@ -1,20 +1,20 @@
---
title: "Convert R Objects to JavaScript"
title: "Converting to JavaScript"
format: html
toc: true
---

Once webR has been loaded into a web page, objects can be converted into JavaScript from the R environment. For example, it is possible to perform some computation within R and then convert the resulting R object into a JavaScript object for use.

## Converting R objects to JavaScript objects

::: callout-warning
At the moment, not all R objects can be converted to JavaScript objects. Attempting to convert an unsupported R object will throw a JavaScript exception.
:::

Explicitly converting an [`RObject`](api/js/classes/RWorker.RObject.md) to JavaScript can be done by invoking the [`RObject.toJs()`](api/js/classes/RWorker.RObject.md#tojs) method, which returns a JavaScript representation of the associated R object. In most cases, JavaScript conversion has been implemented by serialising the R object to a [`WebRDataJs`](api/js/modules/RObject.md#webrdatajs).

### Serialising R objects
[Subclasses of `RObject`](api/js/modules/RWorker.md#classes) provide additional methods to convert objects into a JavaScript representation.

## Serialising R objects

Invoking [`RObject.toJs()`](api/js/classes/RWorker.RObject.md#tojs) on an R object serialises the object to a JavaScript object of type [`WebRDataJs`](api/js/modules/RObject.md#webrdatajs). This type is designed to form a tree structure, supporting an unambiguous JavaScript representation for potentially nested R objects.

@@ -34,7 +34,7 @@ An R `NULL` object is serialised to a JavaScript object of type [`WebRDataJsNull
The structure of serialised R objects may be updated in future versions of webR, expanding to include more R object attributes. As such, compatibility of serialised R objects between versions of webR is not guaranteed.
:::

#### Serialisation options {#serialisation-options}
### Serialisation options {#serialisation-options}

An `options` argument of type [`ToJsOptions`](api/js/interfaces/RWorker.ToJsOptions.md) can be provided to the [`RObject.toJs()`](api/js/classes/RWorker.RObject.md#tojs) method for fine-grained control over how objects are serialised.

@@ -57,9 +57,113 @@ await primes.toJs()
values: [2, 3, 5, 7, 11, 13]
}

### Other R object conversion methods
## Converting to JavaScript `Object`

R environments, lists, and atomic vectors provide a [`toObject()`](api/js/classes/RWorker.RDouble.md#toobject) method that converts the R object into a JavaScript object. The result of this conversion differs from the serialisation described above in that the resulting JavaScript object properties will be directly named by the components of the R object.

``` javascript
webR.objs.globalEnv.bind('foo', [1, 10, 28, 44, 26, 52]);
webR.objs.globalEnv.bind('bar', [1, 2, 3, 401, 113, 22]);
await webR.objs.globalEnv.toObject();
```

{ bar: Proxy(Object), foo: Proxy(Object) }

By default, `toObject()` will not recurse into objects and will return R object references as JavaScript values. As an override, the `depth` option may be set to recurse into the object using the `.toJs()` serialisation method.

``` javascript
webR.objs.globalEnv.bind('foo', [1, 10, 28, 44, 26, 52]);
webR.objs.globalEnv.bind('bar', [1, 2, 3, 401, 113, 22]);
await webR.objs.globalEnv.toObject({ depth: 0 });
```

    {
      bar: {
        type: "double",
        names: null,
        values: [1, 2, 3, 401, 113, 22],
      },
      foo: {
        type: "double",
        names: null,
        values: [1, 10, 28, 44, 26, 52],
      },
    }

#### Object conversion options

The `options` argument may be used to control how R objects with empty or duplicated names are converted. When R component names are duplicated, the first occurrence wins.

::: {.panel-tabset}
## JavaScript

``` javascript
const obj = await webR.evalR(`
list(foo = c(1, 2, 3), foo = c(4, 5, 6), c("x", "y", "z"))
`);

await obj.toObject({ allowEmptyKey: true, allowDuplicateKey: true});
```

## TypeScript

``` typescript
import type { RList } from 'webr';

const obj = await webR.evalR(`
list(foo = c(1, 2, 3), foo = c(4, 5, 6), c("x", "y", "z"))
`) as RList;

await obj.toObject({ allowEmptyKey: true, allowDuplicateKey: true});
```

:::

    {
      "": {
        type: "character",
        names: null,
        values: ['x', 'y', 'z'],
      },
      foo: {
        type: "double",
        names: null,
        values: [1, 2, 3],
      },
    }

The following options are available:

| Property | Description |
|----------|-----------------------------------------|
| `depth` | How deep should nested R objects be serialised? A value of 0 indicates infinite depth. |
| `allowEmptyKey` | Allow an empty or null key when converting the object. |
| `allowDuplicateKey` | Allow duplicate keys when converting the object. |

When `allowEmptyKey` or `allowDuplicateKey` is `false`, an error is thrown if the corresponding empty or duplicated R component name is encountered.
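
The key-handling rules described in this section can be sketched in plain JavaScript as follows. This is not webR's implementation: the function `namedComponentsToObject` is hypothetical, the error messages are invented for illustration, and both options must be passed explicitly because the defaults are not stated here.

```javascript
// Hypothetical sketch of the key rules, not webR's implementation.
// `names` holds R component names (null or '' for unnamed components),
// `values` holds the corresponding converted values.
function namedComponentsToObject(names, values, { allowEmptyKey, allowDuplicateKey }) {
  const result = {};
  names.forEach((name, i) => {
    const key = name ?? '';
    if (key === '' && !allowEmptyKey) {
      throw new Error('Empty or null key found when converting object');
    }
    if (Object.prototype.hasOwnProperty.call(result, key)) {
      if (!allowDuplicateKey) {
        throw new Error(`Duplicate key found when converting object: ${key}`);
      }
      return; // First wins: keep the value already stored under this key.
    }
    result[key] = values[i];
  });
  return result;
}

const converted = namedComponentsToObject(
  ['foo', 'foo', null],
  [[1, 2, 3], [4, 5, 6], ['x', 'y', 'z']],
  { allowEmptyKey: true, allowDuplicateKey: true }
);
console.log(converted); // → { foo: [1, 2, 3], '': ['x', 'y', 'z'] }
```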

### Converting to JavaScript `Array`

[Subclasses of `RObject`](api/js/modules/RWorker.md#classes) provide additional methods to convert objects into a JavaScript representation. For example, the [`toTypedArray()`](api/js/classes/RWorker.RDouble.md#totypedarray) method can be invoked on atomic vectors, such as an [`RDouble`](api/js/classes/RWorker.RDouble.md), to access a copy of the raw buffer as it exists in WebAssembly memory.
R lists and atomic vectors may be converted into a JavaScript `Array` value using the method [`toArray()`](api/js/classes/RWorker.RDouble.md#toarray).

``` javascript
const recurrence = await webR.evalR('c(1.1, 2.2, 3.3, 5.5, 8.8)');
await recurrence.toArray();
```

[1.1, 2.2, 3.3, 5.5, 8.8]

::: callout-note
When converting atomic vectors to JavaScript values, missing values of `NA` are represented as values of `null` in the resulting JavaScript representation. This conversion process may have a performance cost for very large vectors.
:::

#### Accessing raw WebAssembly memory

For atomic vectors, the [`toTypedArray()`](api/js/classes/RWorker.RDouble.md#totypedarray) method may be invoked to access a copy of the object data as it exists in WebAssembly memory.

::: callout-warning
The underlying raw memory buffer as managed by R will be returned as-is, including raw pointers for R character strings and sentinel values for missing values.
:::

::: {.panel-tabset}
## JavaScript
@@ -82,12 +186,106 @@ await primes.toTypedArray();

Float64Array(6) [2, 3, 5, 7, 11, 13, buffer: ArrayBuffer(48), ... ]
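
To illustrate why the sentinel values matter, the following plain JavaScript sketch post-processes a raw integer buffer by hand. It relies on R storing `NA_integer_` as the minimum 32-bit signed integer; the helper `integerBufferToArray` is hypothetical, not a webR method, and the buffer is constructed by hand rather than obtained from webR.

```javascript
// R represents NA_integer_ as the minimum 32-bit signed integer.
const NA_INTEGER = -2147483648;

// Hypothetical helper, not part of webR: map a raw integer buffer to a
// JavaScript Array, replacing the NA sentinel with null.
function integerBufferToArray(buffer) {
  return Array.from(buffer, (x) => (x === NA_INTEGER ? null : x));
}

// A hand-built stand-in for the raw memory of the R vector c(1L, NA, 3L).
const raw = new Int32Array([1, NA_INTEGER, 3]);
console.log(raw[1]); // → -2147483648 (the sentinel is visible in the raw copy)
console.log(integerBufferToArray(raw)); // → [1, null, 3]
```

Other atomic types use different sentinels, so this sketch applies to integer buffers only.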

::: callout-note
When converting atomic vectors to JavaScript values, missing values of `NA` are represented as values of `null` in the resulting JavaScript representation. This conversion process may have a performance cost for very large vectors.
### Converting to primitive values

Scalar R values^[A scalar value in R is an atomic vector of length 1.] may be converted into JavaScript values using various subclass methods.

::: {.panel-tabset}
## JavaScript

``` javascript
const double = await webR.evalR('20');
await double.toNumber();
await webR.objs.true.toBoolean();
```

## TypeScript

``` typescript
import type { RDouble } from 'webr';

const double = await webR.evalR('20') as RDouble;
await double.toNumber();
await webR.objs.true.toBoolean();
```

:::

20
true


Each type of atomic scalar is converted into a particular JavaScript type, and so the method names are specialised.

| R object type | Method | JavaScript type |
|-------------------|-----------------|------------------------|
| `RLogical` | `.toBoolean()` | Boolean |
| `RInteger` | `.toNumber()` | Number |
| `RDouble` | `.toNumber()` | Number |
| `RComplex` | `.toComplex()` | `{ re: ..., im: ... }` |
| `RCharacter` | `.toString()` | String |
| `RRaw` | `.toNumber()` | Number |

### Converting from an R `data.frame`

R `data.frame` objects are list objects with an additional class attribute. As such, they may be converted into JavaScript objects using the [`toObject()`](api/js/classes/RWorker.RDouble.md#toobject) method. When the R list object is a `data.frame`, webR will automatically convert the inner atomic columns into `Array` format.

::: {.panel-tabset}
## JavaScript

``` javascript
const mtcars = await webR.evalR('mtcars');
await mtcars.toObject();
```

## TypeScript

``` typescript
import type { RList } from 'webr';

const mtcars = await webR.evalR('mtcars') as RList;
await mtcars.toObject();
```

:::

    {
      am: [1, 1, 1, ..., 1],
      carb: [4, 4, 1, ..., 2],
      cyl: [6, 6, 4, ..., 4],
      ...,
      wt: [2.62, 2.875, 2.32, ..., 2.78],
    }

R `data.frame` objects may also be converted into a [D3](https://d3js.org)-style data array format using the [`toD3()`](api/js/classes/RWorker.RDouble.md#tod3) method. This method is only available for R objects of class `data.frame`.

::: {.panel-tabset}
## JavaScript

``` javascript
const mtcars = await webR.evalR('mtcars');
await mtcars.toD3();
```

## TypeScript

``` typescript
import type { RList } from 'webr';

const mtcars = await webR.evalR('mtcars') as RList;
await mtcars.toD3();
```

When using `toTypedArray()`, however, a copy of the raw memory buffer is returned. In this case the raw sentinel values will be preserved in the case of missing values.
:::

    [
      { mpg: 21, cyl: 6, disp: 160, ... },
      { mpg: 21, cyl: 6, disp: 160, ... },
      { mpg: 22.8, cyl: 4, disp: 108, ... },
      ...,
      { mpg: 21.4, cyl: 4, disp: 121, ... },
    ]
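
The relationship between the column-oriented result of `toObject()` and the row-oriented D3-style result can be sketched in plain JavaScript. The helper `columnsToRows` is hypothetical and only illustrates the shape of the two formats; it is not how webR implements `toD3()`. The sample data matches the `data.frame` used in the tests accompanying this change.

```javascript
// Hypothetical helper, not webR's implementation: pivot a column-oriented
// object {x: [...], y: [...]} into D3-style rows [{x: ..., y: ...}, ...].
function columnsToRows(columns) {
  const keys = Object.keys(columns);
  // Assumes all columns have equal length, as in an R data.frame.
  const nrow = keys.length === 0 ? 0 : columns[keys[0]].length;
  return Array.from({ length: nrow }, (_, i) => {
    const row = {};
    for (const key of keys) row[key] = columns[key][i];
    return row;
  });
}

const rows = columnsToRows({ x: [1, 2, 3], y: [4, 5, 6], z: [7, 8, 9] });
console.log(rows[0]); // → { x: 1, y: 4, z: 7 }
```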

## Cached R objects

[`WebR.objs`](api/js/classes/WebR.WebR.md#objs) contains named references to long-living R objects in the form of [`RObject`](api/js/modules/RMain.md#robject) proxies. `WebR.objs` is automatically populated at initialisation time, and its properties may be safely accessed once the promise returned by [`WebR.init()`](api/js/classes/WebR.WebR.md#init) resolves.
22 changes: 20 additions & 2 deletions src/tests/webR/robj.test.ts
@@ -56,7 +56,7 @@ test('Get RObject type as a string', async () => {
describe('Working with R lists and vectors', () => {
test('Get R object attributes', async () => {
const vector = await webR.evalR('c(a=1, b=2, c=3)');
const value = (await vector.attrs()) as RList;
const value = (await vector.attrs()) as RPairlist;
const attrs = await value.toObject({ depth: 0 });
expect(attrs.names).toEqual(expect.objectContaining({ names: null, values: ['a', 'b', 'c'] }));
});
@@ -178,6 +178,24 @@ describe('Working with R lists and vectors', () => {
expect(resJs.values[2]).toEqual(expect.objectContaining({ names: null, values: ['c'] }));
});

test('Convert an R data.frame to JS', async () => {
const result = await webR.evalR(`
data.frame(x = c(1,2,3), y = c(4,5,6), z = c(7,8,9))
`) as RList;
const obj = await result.toObject();
expect(obj).toEqual(expect.objectContaining({ x: [1, 2, 3], y: [4, 5, 6], z: [7, 8, 9] }));
});

test('Convert an R data.frame to D3 format', async () => {
const result = await webR.evalR(`
data.frame(x = c(1,2,3), y = c(4,5,6), z = c(7,8,9))
`) as RList;
const d3Obj = await result.toD3();
expect(d3Obj[0]).toEqual(expect.objectContaining({ x: 1, y: 4, z: 7 }));
expect(d3Obj[1]).toEqual(expect.objectContaining({ x: 2, y: 5, z: 8 }));
expect(d3Obj[2]).toEqual(expect.objectContaining({ x: 3, y: 6, z: 9 }));
});

test('Fully undefined names attribute', async () => {
const list = (await webR.evalR('list("a", "b", "c")')) as RList;
const pairlist = (await webR.evalR('pairlist("a", "b", "c")')) as RPairlist;
@@ -415,7 +433,7 @@ describe('Working with R environments', () => {
expect(envJs.values[0]).toEqual(expect.objectContaining({ names: null, values: [null] }));
expect(envJs.values[1]).toEqual(expect.objectContaining({ names: null, values: [true] }));
expect(envJs.values[2]).toEqual(expect.objectContaining({ names: null, values: [false] }));
const envObj = await env.toObject();
const envObj = await env.toObject({ depth: 0 });
expect(envObj['.c']).toEqual(expect.objectContaining({ names: null, values: [null] }));
expect(envObj['a']).toEqual(expect.objectContaining({ names: null, values: [true] }));
expect(envObj['b']).toEqual(expect.objectContaining({ names: null, values: [false] }));