Below I summarise the findings from early January to 21/02/2021.
- Researching frameworks for implementation
- Solid Auth Work
  - Authentication Panel (Sketch Architecture to enable scalable authentication and access control)
  - Authorization Panel (Work on Use Cases)
  - Data Interoperability Panel
- Prototype Implementation
  - Reactive Solid
  - Architecture of Server
This is work done in the process of writing up a detailed work plan and producing a tentative software bill of materials.
As work on previous versions of this project was last done 4-5 years ago, with rww-play for the server and experiments on the client with rww-scala-js and akka-http-signature, we started off by considering whether other languages would bring greater benefits. We considered Rust, Haskell and Scala.
Haskell would have the major advantage of allowing proofs to be written in Agda that could be compiled to Haskell. The disadvantages would be that it would be difficult to find people to work with, and that there is a lot less enterprise integration. It was not quite clear how good the RDF libraries in Haskell were, and I could not get feedback from people on the subject.
Rust will certainly come in at some point as reasoning engines get compiled to WASM. The efficiency advantages are clearly shown in a recent report by Pierre-Antoine Champin, "Sophia: A Linked Data and Semantic Web toolkit for Rust". If the SWI-Prolog port to WebAssembly could be made to work with the Eulersharp (now EYE) N3 reasoning engine, bringing it to the browser, this would be a major milestone in allowing full-featured semantic web applications to be deployed on the client.
Note: this also involves improving N3, following on from the 2019 thesis "Notation3 as the unifying logic for the semantic web", work that is being continued in the Notation 3 (N3) Community Group at the W3C.
There are enough Semantic Web libraries in Rust to be able to build a server, even if they seem to be very new and clearly have not had as much work put into them as the Java SemWeb libraries. There are also many good server libraries, so building a server in Rust is in many ways very attractive. There are also a few QUIC (HTTP/3) libraries, but these have not been integrated into the web server libraries yet, and there are no clear timelines for that. (See the discussion on the Tokio H3 Discord channel.)
The problem for me is that
- it would require learning Rust, which adds a big unknown at this point.
- the abstractions of the libraries seem to be much lower than those found in Scala or Haskell. The actor frameworks in Rust don't have the backpressured streaming library abstraction of Akka, or the work on typed actors.
The main advantage of Scala is that I know the ecosystem well, and the sometimes heated meeting of functional programmers from Haskell and OO programmers from Java has helped push up the abstraction layer. Some of the findings here are:
- Sadly there is no QUIC implementation for Akka on the horizon yet (see the Akka GitHub issue on QUIC), but I helped stir some interest in it.
- The more functional libraries such as ZIO don't support HTTP/2, and adding it would be a lot of work.
- Akka supports HTTP/2 now
- Akka has moved to Typed Actors and is working with academia on Session Types (see Tweet thread).
- Akka has some very high level libraries to allow for streaming parsers. This could make for a nice self contained student project.
- banana-rdf allows one to write an RDF server and switch between Jena or Sesame in one line of code without wrapping objects. But it needs some work to bring it up to Scala 3.
- Scala-JS allows one to reuse code from the server on the client.
- Akka is used by very large companies (LinkedIn, Tesla, PayPal), which means the components have been very heavily tested in production (see PayPal's report on 1 billion transactions per day on 8 JVMs). Indeed there are open source component libraries built for them to make such deployments easy. This could foster adoption by large enterprises.
- Akka can also be run on small devices such as Raspberry Pi which could allow deployment on home boxes. (Perhaps not as efficiently as Rust based servers though. It would be really nice to know more.)
Given the time available for this project, Scala still looks like a very good choice, potentially with WASM-compiled Rust libraries on the client.
The initial Software Bill of Materials would then be:
- Scala and Scala-JS (Apache licences)
- Banana-rdf (MIT Licenced). This uses
  - Jena (Apache Project)
  - Sesame, now RDF4J (Eclipse Licence)
- Scala-JS wrapped libraries (depending MIT, Apache, ...)
- Akka (Apache 2.0 Licenced) for the server with potentially
  - Akka-JS for integration with JS ecosystem
  - Akka-osgi for possible integration with Trellis LDP Server
Here we collect preparatory work on Solid Auth panels, to help steer it towards protocols that are scalable.
This is where work takes place on sketching a scalable protocol for authentication that can integrate with the Self Sovereign Identity project, which includes HTTP-Signature, Web Credentials, Universal Wallets, and more.
The Authentication Panel, which has not met as often recently, has meeting minutes that are stored in the meetings folder. There is one PR for Meeting Notes for 2021-01-25 where I went over how authentication with the HTTP-Signature IETF specification could work with Web Credentials.
I have been developing the initial Http-Signatures authentication proposal to integrate WebIDs, Credentials, Wallets, etc. in PR 125: Http Sig. The text for this can be more easily viewed on the bblfish branch of the PR.
The Solid Authorization panel has been focusing on finishing the Use Cases and Requirements for Authorization in Solid, published on Wed 17 Feb at a new URL.
The weekly meeting notes are available in the meetings folder, and cover some of the discussion that went into that release.
I summarised an important development for Solid-Control in the meeting on 27 January 2021, where I sketched how Access Control, HTTP Signatures and Capability models can work together, following ideas from Abadi's logic of Saying-that.
We are now looking at which use cases are not covered by the current Web Access Control Spec. This is important as it helps move the discussion towards how WAC may need to be extended.
- Issue 174: UCR limitation inaccuracy: Application determining access priviledges
- Issue 176: Limitations Only Trust Certain Issuers of Identity
- Issue 177: Limitations: Limiting access to trusted applications
The Data Interoperability panel seems to be doing some work on Access Control, and I am trying to work out what that is. (It seems to be OIDC related.)
In my first intervention on Feb 2, 2021 I mainly discussed Confinement of JS in the browser as a way to get security.
The Solid Panels are meeting on March 10th with the W3C Credentials Community Group (see mail). I am making sure I am fully up to date on the work on Self Sovereign Identity, on how Solid can work with wallets, and on the potential for DID integration with Solid.
I am starting nearly from scratch with Reactive SoLiD, written in the just-about-to-be-finalised Scala 3 using the Akka libraries. Here we want to build a fully Reactive Web Server that can store files and configuration to disk, in much the same way the Apache Web Server does, but with the inbuilt read-write capability enabled by LDP.
This framework will then allow us to implement the ideas discussed above on the Auth panels, such as HttpSig Authentication for Solid. Initial proofs of concept for auth can be found in:
- akka-http-signatures which implemented a library of HTTP-Sig Auth for the server following an older version of the HTTP Signatures spec, now taken over by the HTTPbis WG.
- rww-scala-js a client that could produce keys using the JS Crypto API and use those to authenticate to the server using HTTP-Sig.
Reactive-Solid is built on Akka, an Actor system following the principles introduced by Carl Hewitt in 1973.
A Web Server is started up by pointing it to a root directory. This directory can contain configuration information, content and directories. The first HTTP Request will be routed to the root container Actor, which will register itself in a Container Registry visible to all actors. That actor will serve any file in its container.
For `GET` requests with paths such as `/blog/2021-02-02/` the root container will search for a directory `blog` and, if it exists, spawn a new actor for it, which will also register itself.
The root actor will then forward the HTTP Request message to the `blog` actor, which will read it when it is ready.
This `blog` actor will be able to check the config information for the directory, from which it will be able to decide what Behavior to use to map requests from the web.
Requests could be mapped:
- to files on the file system. This will be the default implementation.
- to files inside a tar
- to files in CVS or Git, allowing a history to be kept
- to databases
- ...
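The choice of mapping could be modelled roughly as follows. This is only a sketch in plain Scala, under the assumption that a directory's config can be read as simple key/value pairs; the `Backend` type, its cases and the `backend`, `archive` and `url` keys are hypothetical names, not part of Reactive-Solid.

```scala
// Hypothetical names throughout: none of these types come from Reactive-Solid itself.
sealed trait Backend

object Backend {
  case object FileSystem extends Backend                           // default: plain files on disk
  final case class TarArchive(archive: String) extends Backend     // files inside a tar
  final case class VersionControl(system: String) extends Backend  // "git" or "cvs", keeps a history
  final case class Database(url: String) extends Backend

  // Pick a backend from a directory's config, given here as key/value pairs.
  // Missing or unrecognised settings fall back to the file system default.
  def fromConfig(config: Map[String, String]): Backend =
    config.get("backend") match {
      case Some("tar") => config.get("archive").fold[Backend](FileSystem)(TarArchive(_))
      case Some("git") => VersionControl("git")
      case Some("cvs") => VersionControl("cvs")
      case Some("db")  => config.get("url").fold[Backend](FileSystem)(Database(_))
      case _           => FileSystem
    }
}
```

With a sealed trait the compiler can check that every backend is handled wherever a Behavior is selected, which should make it easier to add further mappings later.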
The `blog` actor, on inspecting the request for `/blog/2021-02-02/`, will for the FileSystem Behavior be able to create a new child actor `2021-02-02`, which will also register itself, consult config files if needed and await messages.
When it receives the request for `/blog/2021-02-02/` it will know to show an index for that resource, i.e. the contents of the ldp:Container as specified by the LDP Spec.
Another request for `/blog/2021-02-02/entry1` will be directly routed to the `/blog/2021-02-02/` actor, which will be able to spawn a new actor to deal with serving that file in an asynchronous, non-blocking manner.
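The routing described above can be sketched in plain Scala without Akka. This is an illustrative model only: each "actor" is an ordinary object, message passing is replaced by method calls, and the set `dirs` stands in for the file system. `ContainerRegistry`, `Container` and `Router` are hypothetical names.

```scala
import scala.collection.mutable

// Maps container paths (always ending in "/") to the container handling them.
final class ContainerRegistry {
  private val containers = mutable.Map.empty[String, Container]
  def register(c: Container): Unit = containers.update(c.path, c)
  def lookup(path: String): Option[Container] = containers.get(path)
  def paths: Set[String] = containers.keySet.toSet
}

// `dirs` is an assumption standing in for the directories that exist on disk.
final class Container(val path: String, dirs: Set[String], registry: ContainerRegistry) {
  registry.register(this) // every container registers itself on creation

  // Returns the path of the container that ends up serving the request,
  // or None if an intermediate directory does not exist.
  def route(requestPath: String): Option[String] = {
    val rest = requestPath.stripPrefix(path)
    rest.indexOf('/') match {
      case -1 => Some(path) // no further segment: serve the file or the index here
      case i =>
        val childPath = path + rest.substring(0, i + 1)
        registry.lookup(childPath)
          .orElse(if (dirs(childPath)) Some(new Container(childPath, dirs, registry)) else None)
          .flatMap(_.route(requestPath))
    }
  }
}

object Router {
  // Route directly to the deepest container already registered, else start at the root.
  def dispatch(root: Container, registry: ContainerRegistry, requestPath: String): Option[String] = {
    val start = registry.paths
      .filter(p => requestPath.startsWith(p))
      .maxByOption(_.length)
      .flatMap(registry.lookup)
      .getOrElse(root)
    start.route(requestPath)
  }
}
```

Given a root container created with `dirs = Set("/blog/", "/blog/2021-02-02/")`, a first call to `route("/blog/2021-02-02/")` creates and registers the `/blog/` and `/blog/2021-02-02/` containers on the way down. A later `Router.dispatch` for `/blog/2021-02-02/entry1` then finds the `/blog/2021-02-02/` container in the registry and starts there, without passing through the root.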
Stateful behaviors using `PUT`, `POST`, `PATCH` and `DELETE` will be handled by such specialised actors individually, in a way that minimises any delay.
E.g. a `POST` to a container will create a new resource and only when done inform the container, so that requests on incomplete resources get rejected.
A `PUT` or `PATCH` that changes a resource will create a new versioned resource to which later requests will be directed when the change is completed.
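The intent behind these two rules can be sketched as follows, again in plain Scala rather than Akka and with hypothetical names (`VersionedResource`, `ContainerIndex`). The sketch assumes that each object is owned by a single actor, which processes one message at a time, so plain `var`s are enough here.

```scala
// A resource whose readers always see the last completed version.
final class VersionedResource(initial: String) {
  private var versions: Vector[String] = Vector(initial) // completed versions, oldest first
  private var pending: Option[StringBuilder] = None      // write in progress, invisible to readers

  def read: String = versions.last
  def version: Int = versions.length

  // Start a PUT: content accumulates in a new version that readers cannot see yet.
  def beginPut(): Unit = { pending = Some(new StringBuilder) }
  def write(chunk: String): Unit = pending.foreach(_.append(chunk))

  // Complete the PUT: later reads are directed to the new version.
  def commit(): Unit = {
    pending.foreach(b => versions = versions :+ b.toString)
    pending = None
  }
}

// A container only learns about a POSTed resource once it is complete,
// so requests for resources still being written are rejected as unknown.
final class ContainerIndex {
  private var members = Map.empty[String, VersionedResource]
  def get(name: String): Option[VersionedResource] = members.get(name)
  def created(name: String, resource: VersionedResource): Unit = {
    members = members.updated(name, resource)
  }
}
```

A reader that calls `read` between `beginPut()` and `commit()` still gets the previous version, and `ContainerIndex.get` returns `None` for a resource until `created` has been called for it.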
On top of this framework, Auth can be built cleanly, i.e., without leaking the underlying implementations. For example, Authorization agents will be able to fetch graphs by making requests on URLs that can be resolved locally or remotely. When fetched remotely they will go through a caching service that also needs to be written. This caching service will need to keep track of open resources per host so as to abide by web politeness rules.
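The per-host bookkeeping that such a caching service needs could start out as something like the following sketch. `PolitenessTracker` and `maxPerHost` are hypothetical names, the limit is an arbitrary parameter, and the class is assumed to be owned by a single actor, so it is not thread-safe on its own.

```scala
import java.net.URI

// Counts open requests per host and refuses new ones above a limit.
final class PolitenessTracker(maxPerHost: Int) {
  private var open = Map.empty[String, Int]

  // URLs without a host component are grouped under the empty string.
  private def hostOf(url: String): String =
    Option(new URI(url).getHost).getOrElse("")

  // Returns true and records the request if the host is below its limit.
  def tryAcquire(url: String): Boolean = {
    val host = hostOf(url)
    val current = open.getOrElse(host, 0)
    if (current < maxPerHost) {
      open = open.updated(host, current + 1)
      true
    } else false
  }

  // To be called when a request to that URL has completed.
  def release(url: String): Unit = {
    val host = hostOf(url)
    val current = open.getOrElse(host, 0)
    if (current > 0) open = open.updated(host, current - 1)
  }
}
```

A caller that gets `false` from `tryAcquire` would queue or delay the fetch rather than open another connection to the same host.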
Later implementations will be able to add features to garbage collect actors that are not being used.