The distributed content storage and distribution platform Swarm is a fundamental component of Ethereum's vision of a decentralised internet.
A recent series of orange papers outlines a crypto-economic system that allows participants to efficiently pool their storage and bandwidth resources. The goal is a peer-to-peer, serverless hosting, storage and serving solution that is DDoS-resistant, fault-tolerant, censorship-resistant and has zero downtime.
In the very early days of the web, users had to manage their own servers, and if their content suddenly became popular, those servers could be incapacitated. This gave way to Web 2.0 and a "Faustian bargain" in which users traded control of content, access and their own data for scalable hosting infrastructure, which is kept cheap or free by selling users' profile information on to third parties for targeted advertising.
This period, perhaps unsurprisingly, also saw the first peer-to-peer services gathering momentum. Notably, these P2P solutions offered higher levels of fault tolerance and resilience at a fraction of the cost of data centres.
Because data is stored redundantly across many people's computers and servers, such a system is censorship-resistant: a government is unable to prevent a whistleblower like WikiLeaks from publishing content using a decentralised system.
Swarm lead developer Viktor Trón told IBTimes UK: "It occurred to the founders of Ethereum quite early on that this could be a very good skeleton for a new generation of the web, a vision they termed Web 3.
"It's important to realise that currently the web is missing a lot of the underlying infrastructure that allows people to transact with each other and have direct protocols to allow them to pay each other, or refer to each other's reputation in a transparent and relatively standardised fashion.
"It was quite obvious that they needed a decentralised file sharing and content storage system to complement this full web infrastructure," he said.
Trón highlighted three use cases for Swarm in relation to Ethereum: first, as a content distribution system for Dapps; second, for application logic that is not consensus-critical and that you don't want replicated on every node (think of the database component of a web application); and third, as decentralised storage for the blockchain itself.
Storing consensus-sensitive data on the Ethereum blockchain, where it must be processed by every node, is expensive; the cost of that redundancy is the trade-off for eliminating trusted third parties. Dapps therefore have an incentive to keep as little information as possible on the blockchain itself.
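The split between bulky content and consensus-critical records can be sketched roughly as follows. This is an illustration only, not Swarm's or Ethereum's API: `ContentStore`, `Ledger` and `record_payment` are hypothetical names, and referring to content by its SHA-256 hash is an assumption made for the example.

```python
import hashlib


class ContentStore:
    """Hypothetical off-chain store: holds bulky data, keyed by its hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        ref = hashlib.sha256(data).hexdigest()
        self._blobs[ref] = data
        return ref

    def get(self, ref: str) -> bytes:
        data = self._blobs[ref]
        # Anyone can re-hash the data to check it matches the reference.
        if hashlib.sha256(data).hexdigest() != ref:
            raise ValueError("content does not match its reference")
        return data


class Ledger:
    """Hypothetical stand-in for the blockchain: small records only."""

    def __init__(self):
        self.records = []

    def append(self, record: dict) -> int:
        self.records.append(record)
        return len(self.records) - 1


def record_payment(store, ledger, content: bytes, payer: str, amount: int) -> int:
    # The bulky content goes off-chain; the ledger keeps only a short
    # reference to it alongside the payment details.
    ref = store.put(content)
    return ledger.append({"content": ref, "payer": payer, "amount": amount})
```

However large the content is, the ledger record in this sketch stays the same small size, which is the economy the paragraph above describes.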
Trón provided some examples to illustrate this point: "In a decentralised Facebook you would not want to register every 'like' on the blockchain, or the posts on a decentralised Reddit, or the actual product descriptions of a decentralised eBay.
"And similarly if you had a decentralised Airbnb, then you would have all the accommodation offerings with the photos of everything on Swarm. But those transactions that are critical for the reputation and the payment and for the escrow - like when a business deal is made - that information might go on the blockchain and that would verify certain data for payment and escrow and then reputation."
Trón explained that a Swarm node, when serving content to other nodes, also caches content that may be requested again, and in the long run becomes saturated with it. "So we have a system that is always running to maximum capacity. But what happens when new content is coming in and I'm saturated? Obviously what I would do is delete some stuff.
"I can't delete content that is very popular. I would be stupid to do so, because if this content is often asked for then I get paid for it if I have it. That is one part of how the incentivisation works: you are compensated for serving up chunks to others.
"As a result of caching, and having an incentive for doing the caching and then serving, you end up having a system which automatically scales to popularity."
In this way the system will give prominence to very popular Dapps and the blockchain itself, and can be thought of as compensating nodes for their bandwidth use, he said.
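The behaviour Trón describes can be sketched as a toy cache that earns a fee for each chunk it serves and, once full, deletes whichever chunk has been requested least. This is a simplified illustration and not Swarm's implementation: the `CachingNode` class, the flat `fee_per_chunk` and the "least requested" eviction rule are all assumptions made for the example.

```python
class CachingNode:
    """Toy node with limited storage that is paid per chunk served."""

    def __init__(self, capacity: int, fee_per_chunk: int = 1):
        self.capacity = capacity
        self.fee_per_chunk = fee_per_chunk  # assumed flat fee
        self.chunks = {}     # chunk id -> data
        self.requests = {}   # chunk id -> number of times served
        self.earned = 0

    def store(self, chunk_id: str, data: bytes) -> None:
        # When saturated, delete the least requested chunk: it is the
        # one least likely to earn the node anything in future.
        if chunk_id not in self.chunks and len(self.chunks) >= self.capacity:
            least = min(self.chunks, key=lambda c: self.requests[c])
            del self.chunks[least]
            del self.requests[least]
        self.chunks[chunk_id] = data
        self.requests.setdefault(chunk_id, 0)

    def serve(self, chunk_id: str):
        data = self.chunks.get(chunk_id)
        if data is None:
            return None  # not cached here, so nothing is earned
        self.requests[chunk_id] += 1
        self.earned += self.fee_per_chunk
        return data
```

In this sketch, a node with room for two chunks that has served chunk "a" three times and chunk "b" once will drop "b" when a third chunk arrives, so popular content survives simply because it pays.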
Swarm can also deal with content that may not be accessed often but must be preserved, such as a birth certificate. "This problem is basically the reliability of storage in the face of unpopularity. That's a harder nut to crack. What we need to do is find a solution where both parties have a stake in behaving honestly.
"It's basically an insurance-based system. I can insure availability of that content for a particular amount of time, and offer a particular amount of money for it and ask for a degree of reliability."
The solution proposed in the orange paper uses cryptographically verifiable proof of custody combined with erasure coding (a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different locations or storage media).
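To give a flavour of the two ingredients, the sketch below uses the simplest possible erasure code (k data fragments plus a single XOR parity fragment, which survives the loss of any one fragment) and a basic hash challenge as a stand-in for proof of custody. Neither is the scheme from the orange paper, whose details are not given here; the function names and the SHA-256 challenge are assumptions made for illustration.

```python
import hashlib
from typing import List, Optional


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments and append one XOR parity fragment."""
    size = -(-len(data) // k)                # ceiling division
    padded = data.ljust(size * k, b"\0")     # zero-pad to a multiple of k
    frags = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = bytes(size)
    for frag in frags:
        parity = _xor(parity, frag)
    return frags + [parity]


def recover(frags: List[Optional[bytes]], missing: int) -> bytes:
    """Rebuild the fragment at index `missing` by XOR-ing all the others."""
    others = [f for i, f in enumerate(frags) if i != missing]
    out = bytes(len(others[0]))
    for frag in others:
        out = _xor(out, frag)
    return out


def custody_proof(fragment: bytes, nonce: bytes) -> str:
    """Answer a challenge; only someone still holding the fragment can."""
    return hashlib.sha256(nonce + fragment).hexdigest()
```

In this toy version the owner would compute `custody_proof(fragment, nonce)` for a secret nonce before handing the fragment over, then later reveal the nonce and compare the storer's answer with the saved value. A storer who has discarded the fragment cannot produce the right hash, and a fragment that does go missing can be rebuilt from the rest with `recover`.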
"It's a relatively novel approach we have taken with our proof of custody system, and just a nice way how it interacts with erasure code."
When asked about synergies between Swarm and related decentralised projects, Trón mentioned the InterPlanetary File System (IPFS): "IPFS enables us to redefine Swarm as a modular system where standard interfaces allow mutually interchangeable components.
"So we could, for example, have an operational mode of Swarm, where the retrieval end of the protocol uses IPFS distributed hash table (DHT) or the default IPFS stack uses Swarm's erasure coding scheme or incentive system," he said.