This short document aims to be a quick guide for anyone implementing IPFS -- it is modelled after go-ipfs, and serves as a template for js-ipfs and py-ipfs.
Sections:
- IPFS Types
- API Transports
- API Commands
- Implementing bindings for the HTTP API
A number of things that are not ipfs-specific have been built for ipfs, and ipfs depends on them. Implement these first.
There are a number of self-describing protocols/formats in use all over ipfs.
All the complex peer-to-peer protocols for IPFS have been abstracted out into a separate library called libp2p. libp2p is a thin veneer over a wealth of modules that interface well with each other.
Implementations:
libp2p may in fact be the bulk of an ipfs implementation. The rest is very simple.
IPLD is the format for IPFS objects, but it can be used outside of ipfs (hence a module). It's layered on top of multihash and multicodec, and provides the heart of ipfs: the merkledag.
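To make the layering concrete, here is a minimal sketch in Go of a content-addressed node whose links are multihash-encoded digests. The names `Node`, `Link`, and `multihashSHA256` are hypothetical, and the byte serialization is an ad hoc stand-in rather than the actual IPLD format. Only the sha2-256 multihash prefix (function code 0x12, digest length 0x20) follows the multihash convention.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Link points at another node by the multihash of its serialized bytes.
type Link struct {
	Name string
	Hash []byte
}

// Node is a hypothetical merkledag node: opaque data plus named links.
type Node struct {
	Data  []byte
	Links []Link
}

// multihashSHA256 prefixes a sha2-256 digest with its multihash header:
// 0x12 (sha2-256 function code) and 0x20 (digest length, 32 bytes).
func multihashSHA256(b []byte) []byte {
	sum := sha256.Sum256(b)
	return append([]byte{0x12, 0x20}, sum[:]...)
}

// serialize is an illustrative, non-standard encoding used only for hashing.
func (n Node) serialize() []byte {
	out := append([]byte{}, n.Data...)
	for _, l := range n.Links {
		out = append(out, []byte(l.Name)...)
		out = append(out, l.Hash...)
	}
	return out
}

// Key returns the node's content address.
func (n Node) Key() []byte { return multihashSHA256(n.serialize()) }

func main() {
	leaf := Node{Data: []byte("hello")}
	root := Node{Links: []Link{{Name: "child", Hash: leaf.Key()}}}
	fmt.Println("leaf:", hex.EncodeToString(leaf.Key()))
	fmt.Println("root:", hex.EncodeToString(root.Key()))
}
```

Because the root's key is computed over its children's keys, changing any byte in a leaf changes every key on the path up to the root. That property is what makes the graph a merkledag.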
Implementations:
IPRS is the record system for IPFS, but it can be used outside of ipfs (hence a module). This deals with p2p system records -- it is also used by libp2p.
Implementations:
- go-iprs
- js-iprs (forthcoming)
IPNS provides name resolution on top of IPRS -- and a choice of record routing system.
The IPFS-Repo is an IPFS Node's "local storage" or "database", though the storage may not be in a database nor local at all (e.g. s3-repo). There are common formats so that multiple implementations can read and write to the same repos. Though today we only have one repo format, more are easy to add so that we can create IPFS nodes on top of other storage solutions.
Implementations:
The Core of IPFS is an interface of functions layered over all the other pieces.
The IPFS Node is an entity that bundles all the other pieces together, and implements the interface (described below). In its most basic sense, an IPFS node is really just:
type ipfs.Node struct {
    Config      // has a configuration
    repo.Repo   // has a Repo for storing all the local data
    libp2p.Node // has an embedded libp2p.Node, and thus a peer.ID, and keys
    dag.Store   // has a DAG Store (over the repo + network)
}
IPFS itself is very, very simple. The complexity lies within libp2p.Node and how the different IPFS commands should run depending on the libp2p.Node configuration.
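The node definition above is pseudocode. A compilable approximation in Go might look like the following sketch, where every type (`Config`, `Repo`, `memRepo`, `Libp2pNode`) is a hypothetical placeholder rather than an actual go-ipfs or libp2p API, and the DAG store is omitted for brevity.

```go
package main

import "fmt"

// All types below are hypothetical placeholders, not go-ipfs APIs.

// Config holds node settings.
type Config struct{ Name string }

// Repo is the node's storage, local or otherwise.
type Repo interface {
	Get(key string) ([]byte, bool)
	Put(key string, val []byte)
}

// memRepo is an in-memory Repo used only for this sketch.
type memRepo map[string][]byte

func (m memRepo) Get(k string) ([]byte, bool) { v, ok := m[k]; return v, ok }
func (m memRepo) Put(k string, v []byte)      { m[k] = v }

// Libp2pNode stands in for the embedded libp2p node and its identity.
type Libp2pNode struct{ PeerID string }

// Node bundles configuration, storage, and networking by embedding them.
type Node struct {
	Config
	Repo
	Libp2pNode
}

func main() {
	n := Node{Config{Name: "demo"}, memRepo{}, Libp2pNode{PeerID: "QmExamplePeer"}}
	n.Put("greeting", []byte("hello")) // method promoted from the embedded Repo
	v, _ := n.Get("greeting")
	fmt.Println(n.Name, n.PeerID, string(v))
}
```

Embedding keeps the node itself thin: swapping `memRepo` for another `Repo` implementation changes where data lives without touching the rest of the node.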
IPFS Nodes can be configured. The basic configuration format is a JSON file, and so naturally converters to other formats can be made. Eventually, the configuration will be an ipfs object itself.
The config is stored in the IPFS Repo, but is separate because some implementations may give it knowledge of other packages (like routing, http, etc).
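As an illustration of the JSON configuration format, the sketch below decodes a small config document in Go. The field names (`Identity`, `PeerID`, `Bootstrap`) and the sample values are assumptions chosen for the example; the actual schema is not specified in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config uses hypothetical field names; the real schema is not specified here.
type Config struct {
	Identity struct {
		PeerID string `json:"PeerID"`
	} `json:"Identity"`
	Bootstrap []string `json:"Bootstrap"`
}

// ParseConfig decodes a JSON document into a Config.
func ParseConfig(raw []byte) (Config, error) {
	var c Config
	err := json.Unmarshal(raw, &c)
	return c, err
}

func main() {
	// Sample values are placeholders, not real peers or addresses.
	raw := []byte(`{"Identity":{"PeerID":"QmExamplePeer"},"Bootstrap":["/ip4/127.0.0.1/tcp/4001"]}`)
	c, err := ParseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Identity.PeerID, len(c.Bootstrap))
}
```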
The IPFS Interface or API (not to be confused with the IPFS HTTP API) is the set of functions that IPFS Nodes must support. These are classified into sections, like node, network, data, util etc.
The IPFS Interface can be implemented:
- as a library - first and foremost
- as a commandline toolchain, so users can use it directly
- as an RPC API, so that other programs could use it
- over HTTP (the IPFS HTTP API)
- over unix domain sockets
- over IPC
One goal for the core interface libraries is to produce an interface that could operate on a local or a remote node. This means that, for example:
func Cat(n ipfs.Node, p ipfs.Path) io.Reader { ... }
should be able to work whether n represents a local node (in-process, local storage), or a remote node (over an RPC API, say HTTP).
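One way to get this local-or-remote behaviour is to write `Cat` against an interface and supply two implementations. The sketch below assumes a hypothetical `Node` interface with a single `Open` method, and models the RPC transport as a plain function so that it runs without a network. None of these names come from an existing IPFS library.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// Path and Node are hypothetical stand-ins for ipfs.Path and ipfs.Node.
type Path string

type Node interface {
	// Open returns a reader over the bytes stored at p.
	Open(p Path) (io.Reader, error)
}

// localNode serves data from in-process storage.
type localNode struct{ blocks map[Path][]byte }

func (l localNode) Open(p Path) (io.Reader, error) {
	b, ok := l.blocks[p]
	if !ok {
		return nil, fmt.Errorf("not found: %s", p)
	}
	return bytes.NewReader(b), nil
}

// remoteNode forwards requests through call, which stands in for an RPC
// transport such as HTTP.
type remoteNode struct {
	call func(method, arg string) (string, error)
}

func (r remoteNode) Open(p Path) (io.Reader, error) {
	out, err := r.call("cat", string(p))
	if err != nil {
		return nil, err
	}
	return strings.NewReader(out), nil
}

// Cat is written once, against the interface only.
func Cat(n Node, p Path) (io.Reader, error) { return n.Open(p) }

// CatString is a convenience wrapper that reads the whole result.
func CatString(n Node, p Path) (string, error) {
	r, err := Cat(n, p)
	if err != nil {
		return "", err
	}
	b, err := io.ReadAll(r)
	return string(b), err
}

func main() {
	local := localNode{blocks: map[Path][]byte{"/ipfs/example": []byte("hello")}}
	// The fake transport answers from the local node instead of a network.
	remote := remoteNode{call: func(method, arg string) (string, error) {
		return CatString(local, Path(arg))
	}}
	for _, n := range []Node{local, remote} {
		s, err := CatString(n, "/ipfs/example")
		fmt.Println(s, err)
	}
}
```

Unlike the one-line signature above, this version also returns an `error`, since a remote call can fail in ways an in-process lookup cannot.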
For now, I list these from the commandline, but the goal is to produce a proper typed function interface/API that we can all agree on.
These are for the node itself.
- ipfs init
- ipfs config
- ipfs repo
- ipfs repo gc
- ipfs stats
- ipfs diag
- ipfs block
- ipfs object
- ipfs {cat, ls, refs}
- ipfs pin
- ipfs files
- ipfs tar
- ipfs resolve
These are carried over from libp2p, so ideally the libp2p implementations do the heavy lifting here.
- ipfs id
- ipfs ping
- ipfs swarm
- ipfs exchange
- ipfs routing
- ipfs bitswap
- ipfs bootstrap
These are carried over from IPNS (can make that its own tool/lib).
- ipfs dns
- ipfs name
- ipfs log
- ipfs update
- ipfs version
- ipfs tour
- ipfs daemon
There are many useful datastructures on top of IPFS: things like unixfs, tar, keychain, etc. And there are a number of ways of importing data -- whether posix files or not.
Importing data into IPFS can be done in a variety of ways. These are use-case specific, produce different datastructures, produce different graph topologies, and so on. These are not strictly needed in an IPFS implementation, but definitely make it more useful. They are really tools on top of IPLD though, so these can be generic and separate from IPFS itself.
- graph topologies - shape of the graphs
- balanced - dumb, dead simple
- trickledag - optimized for seeking
- live stream
- database indices
- file chunking - how to split a continuous stream/file
- fixed size
- rabin fingerprinting
- format chunking (use knowledge of formats, e.g. audio, video, etc)
- special format datastructures
- tar
- document formats - pdf, doc, etc
- audio and video formats - ogg, mpeg, etc
- container and vm images
- and many more
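Of the chunking strategies listed above, fixed-size splitting is the simplest. A minimal sketch in Go follows; the function names `ChunkFixed` and `chunkString` are made up for this example and are not taken from an existing importer.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// ChunkFixed splits r into chunks of exactly size bytes, except for the
// last chunk, which may be shorter.
func ChunkFixed(r io.Reader, size int) ([][]byte, error) {
	if size <= 0 {
		return nil, fmt.Errorf("chunk size must be positive, got %d", size)
	}
	var chunks [][]byte
	for {
		buf := make([]byte, size)
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			chunks = append(chunks, buf[:n])
		}
		// io.EOF: nothing left; io.ErrUnexpectedEOF: a short final chunk.
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return chunks, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

// chunkString is a convenience wrapper for in-memory input.
func chunkString(s string, size int) ([]string, error) {
	chunks, err := ChunkFixed(strings.NewReader(s), size)
	out := make([]string, len(chunks))
	for i, c := range chunks {
		out[i] = string(c)
	}
	return out, err
}

func main() {
	chunks, err := chunkString("abcdefgh", 3)
	fmt.Printf("%q %v\n", chunks, err)
}
```

Fixed-size chunks are cheap to compute, but inserting a single byte near the start of a file shifts every later boundary. Content-defined schemes such as rabin fingerprinting are meant to avoid that.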
It's worth mentioning the unixfs datastructure, as it provides support for representing unix (posix) files in ipfs. It's simple, but powerful. And it is first class, in that several basic commands make use of it.
interfacing with a variety of data structures
We are still figuring out good ways to make all the different data structures play well with various commands -- there is some complexity when it comes to implementing things like ipfs cat -- it currently outputs the data of a unixfs.File, but it could do something for other graph objects too. Ideally, we could figure out common ways of making this work. If you have ideas, please discuss.
graph mapping
Sometimes one graph maps to another. For example, a unixfs graph shards big files and big directories into smaller units and transparently presents them to the user for commands such as ipfs cat and ipfs ls.
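The mapping from a sharded graph back to a flat file can be sketched as a depth-first walk that concatenates leaf data in order. The `FileNode` type below is a hypothetical simplification for illustration, not the actual unixfs format.

```go
package main

import (
	"fmt"
	"strings"
)

// FileNode is a hypothetical stand-in for a sharded file: leaves carry
// data, interior nodes carry only an ordered list of children.
type FileNode struct {
	Data     []byte
	Children []*FileNode
}

// Cat walks the graph depth-first, left to right, appending each node's
// data so the caller sees one contiguous file.
func Cat(n *FileNode, sb *strings.Builder) {
	if n == nil {
		return
	}
	sb.Write(n.Data)
	for _, c := range n.Children {
		Cat(c, sb)
	}
}

// CatString returns the reassembled file contents as a string.
func CatString(n *FileNode) string {
	var sb strings.Builder
	Cat(n, &sb)
	return sb.String()
}

func main() {
	file := &FileNode{Children: []*FileNode{
		{Data: []byte("hello ")},
		{Children: []*FileNode{
			{Data: []byte("wor")},
			{Data: []byte("ld")},
		}},
	}}
	fmt.Println(CatString(file))
}
```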
mixing data structures
Some data structures are meant to be interspersed with others, meaning that they provide meaning to arbitrary things. One example is a keychain.Signature, which provides a cryptographic signature on any other object. Another example is a versioning.Commit which represents a specific revision in a version history over any other object. It is still not entirely clear how to build nice tooling that handles these transparently.
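As a rough illustration of a signature object that can wrap anything, the sketch below signs the sha2-256 hash of an arbitrary object's bytes using Go's standard `crypto/ed25519` package. The `Signature` struct and its fields are assumptions made for this example and do not describe the actual keychain.Signature layout. The choice of Ed25519 is likewise an assumption.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"fmt"
)

// Signature is a hypothetical stand-in for keychain.Signature. It refers
// to the signed object by hash, so it can be attached to any object.
type Signature struct {
	Target [32]byte          // sha2-256 of the signed object's bytes
	Signer ed25519.PublicKey // key that produced Sig
	Sig    []byte
}

// Sign hashes the object and signs the hash.
func Sign(priv ed25519.PrivateKey, object []byte) Signature {
	h := sha256.Sum256(object)
	return Signature{
		Target: h,
		Signer: priv.Public().(ed25519.PublicKey),
		Sig:    ed25519.Sign(priv, h[:]),
	}
}

// Verify checks that the signature covers exactly this object.
func (s Signature) Verify(object []byte) bool {
	h := sha256.Sum256(object)
	return h == s.Target && ed25519.Verify(s.Signer, h[:], s.Sig)
}

// newKey generates a fresh key pair using crypto/rand.
func newKey() ed25519.PrivateKey {
	_, priv, err := ed25519.GenerateKey(nil)
	if err != nil {
		panic(err)
	}
	return priv
}

func main() {
	priv := newKey()
	sig := Sign(priv, []byte("any object"))
	fmt.Println(sig.Verify([]byte("any object")), sig.Verify([]byte("tampered")))
}
```

Because the signature only references its target by hash, the signed object does not need to know it has been signed. That is what lets this kind of structure be mixed in with any other.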