How to build a Web3 social graph of one billion users?

This article presents two proposals for layering a social graph onto the billion users table: On-Chain Graph (OCG) and Chain-Linked Graph (CLG).

Original title: "The Billion User Social Graph"

Written by: Jon Stokes

Compiled by: Dan, W3.Hitchhiker

With Elon Musk's recent takeover of Twitter, there has been more talk of migrating from the large social networks to independent or open alternatives. But everyone who is just starting to fantasize about joining a thriving community of ex-Twitter residents will soon run into a problem the right has grappled with since the post-J6 cross-platform social media purge: network lock-in is real.

You can do theoretical and strategic analysis of coordination problems, preference cascades, signaling and other game-theoretic concepts, and I don't deny that these are useful ways to understand the problem. But to understand the powerful hold Twitter and Facebook have on hundreds of millions of us, all you really need to know is a simple heuristic from the early days of the Internet.

Metcalfe's law states that the value of a telecommunications network is proportional to the square of the number of connected users of the system (n²). The law was formulated in this form by George Gilder in 1993 and attributed to Robert Metcalfe's work on Ethernet. When Metcalfe originally presented it, circa 1980, it was expressed not in terms of users but in terms of "compatible communicating devices" (e.g., fax machines, telephones). Only later, with the globalization of the Internet, did the law carry over to users and networks; its original intent was to describe Ethernet connections.

It is almost impossible to get people to give up a large, dense network graph in favor of a small, sparse one, for the simple reason that the former has value and the latter does not.
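To put rough numbers on that gap, here is a minimal sketch that takes Metcalfe's law literally, with value equal to n squared times an arbitrary constant of 1. The function names and the two network sizes are illustrative assumptions, not figures for any real platform.

```python
def metcalfe_value(n_users: int, k: float = 1.0) -> float:
    """Network value under Metcalfe's law: proportional to n squared."""
    return k * n_users ** 2


def possible_links(n_users: int) -> int:
    """Number of distinct user-to-user pairs: n * (n - 1) / 2."""
    return n_users * (n_users - 1) // 2


small, large = 10_000, 10_000_000  # hypothetical network sizes
ratio = metcalfe_value(large) / metcalfe_value(small)
print(f"{large // small}x the users -> {ratio:,.0f}x the value")
```

Under this simple model, a network with a thousand times the users is worth a million times as much, which is why a new platform with an empty graph has such a hard time competing.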

Strangely enough, web3 solves this problem. Or at least it can, if we use a few simple smart contracts to turn the blockchain from a giant table of users into a giant social graph.

Theoretical basis and previous work

Blockchains can and do function as one vast, shared table of users, open and public, not controlled by any one entity. As I wrote in The Billion Users Table:

The public blockchain is equivalent to a single, large-scale user table for the entire Internet, upon which the next wave of distributed applications will be built.

In place of a decentralized network of user data silos connected by APIs, there is a single decentralized store of user data, accessed through an open protocol and a decentralized network of storage nodes. As such, an identity-hosting blockchain represents a decentralization of the data storage implementation layer and a re-centralization of the data storage access layer.

Imagine LinkedIn, Reddit, and Github all porting their user tables (and a lot of their proprietary data like endorsements, points, and activity history) to BitClout. Right off the bat, here's what happens: Every Github user is also a Reddit user, a LinkedIn user, and a BitClout user. Likewise, every Reddit user is also a Github user, a LinkedIn user, and a BitClout user. I could go on and on, but you'll get the point.

Every company built on the same virtual users table gets instant access to the network effects of every other startup on that table. Every time another on-chain company signs up a new user, your service gains a new user too. (In a sense: they may not be actively using your service yet, but they are effectively potential users of it.)

That previous post used BitClout (the project's chain is now called DeSo) as a prime example of a blockchain that could support this use case. But as excited as I was about the whole DeSo thing, it didn't turn out that great.

This is not the place for a BitClout/DeSo postmortem, but it makes sense to flag one aspect of this blockchain that matters for the present discussion. BitClout tried to put the entire social network on-chain, with each post written on-chain as an object that can accrue income (via BitClout Diamonds). That's clever, but any blockchain that tries to host actual content will see its data needs grow with the number of users and connections.

The BitClout team is very familiar with this unbounded data growth problem and has spent a lot of real engineering effort trying to solve it. But in hindsight, I think they were trying to do too many things at once. They should have focused only on the portability of the social graph.

In the database terms of my previous post, BitClout attempts to put all of the following tables on-chain (plus some that are BitClout-specific):

  • users
  • user_follows_user
  • posts
  • user_likes_post

The last two tables are where the data explosion always happens, and they become difficult to operate under rapid user growth.

So I think a better approach is to take an existing blockchain, which is basically already that first table (i.e., users), and add a user_follows_user join table to it. (We could also add join tables for other types of relations, like user_mutes_user, but for now we'll keep it simple.)

This user-to-user join table will also grow with the number of users, but at a slower rate. More importantly, the amount of additional data needed to represent it (that is, the additional block space it consumes) will be much lower than for the posts table.
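The two-table layout can be sketched with an in-memory SQLite database. This is only an illustration of the relational shape, assuming ENS-style names as primary keys; it says nothing about how the data would actually be encoded on-chain.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (address TEXT PRIMARY KEY);
    CREATE TABLE user_follows_user (
        follower TEXT NOT NULL REFERENCES users(address),
        followed TEXT NOT NULL REFERENCES users(address),
        PRIMARY KEY (follower, followed)
    );
""")

# Hypothetical users and follow relationships.
db.executemany("INSERT INTO users VALUES (?)",
               [("alice.eth",), ("bob.eth",), ("carol.eth",)])
db.executemany("INSERT INTO user_follows_user VALUES (?, ?)",
               [("alice.eth", "bob.eth"), ("carol.eth", "bob.eth")])


def followers_of(address: str) -> list[str]:
    """Return everyone who follows `address`, in alphabetical order."""
    rows = db.execute(
        "SELECT follower FROM user_follows_user "
        "WHERE followed = ? ORDER BY follower", (address,))
    return [row[0] for row in rows]


print(followers_of("bob.eth"))
```

Each row of the join table is just a pair of keys, whereas a row of a posts table would carry arbitrary content, which is where the difference in block space comes from.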

I suggest this because follow relationships between users are a major source of lock-in for every large social networking platform. If your entire Twitter or Facebook social graph were open and readily available to other social platforms that want to host posts and other, more data-intensive social networking experiences, those platforms would have essentially zero lock-in.

What an on-chain social graph might look like

Imagine my entire Twitter graph embodied on-chain, including the actual accounts and follower relationships. To view the Twitter posts in this graph (and the associated likes, retweets, quote tweets, etc.), I need to connect to Twitter.com with my wallet. But let's say I want to jump to tribel.com, or gab.com, or some other social platform with its own particular leanings and moderation policies. If they can read my social graph from the blockchain, then I can connect my wallet there, see the same connections, and see any posts those connections have made on that other site.

That might not sound all that appealing, but consider that if I follow a new person on Tribel, I'm now also following that person on Twitter and Gab, and on every other social platform that uses the same on-chain graph of users and relationships. Unfollowing and blocking work the same way: do it once in one place, and the changes to your graph are instantly reflected everywhere.
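The "follow once, follow everywhere" behavior can be sketched as several platforms holding a reference to the same graph object. This is a toy model under stated assumptions: SharedGraph stands in for the on-chain data, and the platform names and posts are made up.

```python
from dataclasses import dataclass, field


@dataclass
class SharedGraph:
    """Stand-in for the on-chain follow graph: (follower, followed) pairs."""
    follows: set[tuple[str, str]] = field(default_factory=set)

    def follow(self, follower: str, followed: str) -> None:
        self.follows.add((follower, followed))

    def unfollow(self, follower: str, followed: str) -> None:
        self.follows.discard((follower, followed))


@dataclass
class Platform:
    """A social site that hosts its own posts but reads the shared graph."""
    name: str
    graph: SharedGraph
    posts: list[tuple[str, str]] = field(default_factory=list)  # (author, text)

    def feed(self, viewer: str) -> list[str]:
        return [text for author, text in self.posts
                if (viewer, author) in self.graph.follows]


graph = SharedGraph()
site_a = Platform("site-a", graph, [("bob.eth", "hello from A")])
site_b = Platform("site-b", graph, [("bob.eth", "hello from B")])

graph.follow("alice.eth", "bob.eth")    # done once, through either site
print(site_a.feed("alice.eth"), site_b.feed("alice.eth"))
graph.unfollow("alice.eth", "bob.eth")  # also reflected on both sites
print(site_a.feed("alice.eth"), site_b.feed("alice.eth"))
```

Each platform keeps its own posts, so the content layer stays separate while the relationship layer is shared.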

Now, those of you who are thinking ahead as you read have already realized that in a world like the one above, someone will inevitably build a catch-all client that lets you read from and post to any or all of these networks through a single interface. Then there's no point in having separate services, and they'll all go out of business... or will they?

A preview of things to come: phone numbers + contacts + messaging apps

The world I'm describing already exists in prototype form, in the competing messaging protocols that are all tied to your phone number and populate themselves from your contacts database. The telephone number system is the prototype of the billion users table, and the distributed ecosystem of contacts apps, which read and write the standard vCard format, forms a relationship graph on top of that table.

There are many messaging protocols that rely on this phone number + contacts combination, and the result is a bit like the social network I'm describing here. For example, when you log into Telegram for the first time, it scans your contacts, and you immediately have your existing network in the new app.

As a result, you can choose to exchange messages with the same phone number via Signal, Telegram, WhatsApp, iMessage or traditional SMS – it all depends on which messaging protocol you and others in your network want to use.

There is also an eternal cycle of decentralization and re-centralization of messaging applications, which began in the ICQ era and continues in the WhatsApp/Signal/Telegram/Facebook/etc. era. You can find any number of all-in-one messaging clients that support many of these platforms in one window.

None of these messaging apps is undermined by the fact that they all draw identities from the same open phone number system and the same interoperable ecosystem of contacts apps and services. They all coexist and bring something different, and many of us switch between them, talking to different subgraphs of our contacts with different needs and preferences. If we move the social graph on-chain, I expect this dynamic to continue.

On Composability and Social Relations

Different platforms offer different types of social connections between users. Facebook has friends, follows and blocks. Twitter has follow, mute and block. Those work well for these platforms, but we can improve on them for the blockchain by making them more composable.

Composability is a computer science term that roughly means you can mix and match small, discrete, well-defined tools to achieve different effects and functions.

Consider Facebook "friends" - this is its own type of connection, but it also means "following" because when you add someone as a friend, you automatically follow them. On Twitter, "block" means "mute," because when you block someone, you're basically muting them while also preventing them from seeing your posts.

For my own two social graph proposals, below, I would like to suggest the following cleaner, composable set of social graph relations:

  • Follow: You can read posts from people you follow.
  • Mute: You can't read posts from people you've muted.
  • Block: People you block cannot read your posts.

Under this scheme, a Twitter-style block is a "mute" plus a "block" in the sense defined above, so it is composed of two operations on the same target address (for example, if I want to Twitter-block an ETH address, I mute that address and then block it).

If I want to see someone's posts but don't want them to see mine, I can follow them and also block them. Or I can follow and mute, if I want to keep the connection but only read their content by navigating to it or by periodically unmuting them.
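A minimal sketch of how these three primitives compose, assuming the graph is held as a set of (actor, relation, target) triples. The names in_feed and twitter_style_block are hypothetical helpers invented for this illustration.

```python
from enum import Enum


class Relation(Enum):
    FOLLOW = "follow"
    MUTE = "mute"
    BLOCK = "block"


# (actor, relation, target) triples; a hypothetical in-memory graph.
Graph = set[tuple[str, Relation, str]]


def in_feed(graph: Graph, viewer: str, author: str) -> bool:
    """Should `author`'s posts appear in `viewer`'s feed?"""
    return ((viewer, Relation.FOLLOW, author) in graph
            and (viewer, Relation.MUTE, author) not in graph
            and (author, Relation.BLOCK, viewer) not in graph)


def twitter_style_block(graph: Graph, actor: str, target: str) -> None:
    """Compose the two primitives: mute the target, then block it."""
    graph.add((actor, Relation.MUTE, target))
    graph.add((actor, Relation.BLOCK, target))


graph: Graph = {
    ("alice.eth", Relation.FOLLOW, "bob.eth"),
    ("bob.eth", Relation.FOLLOW, "alice.eth"),
}
twitter_style_block(graph, "alice.eth", "bob.eth")
print(in_feed(graph, "alice.eth", "bob.eth"))  # alice muted bob
print(in_feed(graph, "bob.eth", "alice.eth"))  # alice blocked bob
```

Because each relation is a separate fact about a pair of addresses, a platform can combine them however its own rules require without any new relation types.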

I define the relationships this way because it makes it easier to reason about the contracts and relationships in the sections that follow.

Some background on my two proposals

In the remainder of this article, I present two proposals for layering a social graph onto the billion users table.

  • The first, On-Chain Graph (OCG), is more open and simpler, but also more expensive in terms of fees, so some people will like it and some won't.
  • The second, Chain-Linked Graph (CLG), is more complex but cheaper, and offers more control and privacy, so I predict most people will prefer it. However, platforms can support both approaches at the same time.

To really understand both proposals, you need some basic familiarity with the following concepts:

  • Non-fungible tokens (NFTs) and non-transferable non-fungible tokens (NTFTs, also called soulbound tokens)
  • The Ethereum Name Service (ENS)
  • Smart contracts

Knowing a bit of Solidity (Ethereum's smart contract programming language) will also help. If you're hazy on any or all of the above, I've tried to write this so that you should still be able to grasp the basics.

For both proposals, I'm assuming we use ENS as the root of identity and add new address records to it. These records contain the addresses of some fairly standard ERC721 NFT contracts representing the three types of social relationships (follow, mute, block). The role of these three contracts differs considerably between the two proposals, but the basic idea of putting their addresses into three special ENS address records stays the same.

Example of an ENS record, in this case my own ENS name

I would also like to propose an additional ENS record for a social profile data URI, so you can update your social profile without spending gas. A proposed profileURI record would link to a JSON object hosted on some third-party platform, looking something like this:

curl https://jonstokes.com/jons-profile.json \
  -H "Accept: application/json"

{
  "name": "jonstokes.(eth|com)",
  "bio": "Writer. Coder. Doomer Techno-Optimist. Cryptography Brother.",
  "website": "https://jonstokes.com/",
  "location": "Austin, TX"
}

Some of the content in the profile JSON is redundant with existing ENS fields, but that's OK; the point is to give social platforms something to display and to let users change their social profiles without having to spend gas updating ENS records.
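On the platform side, consuming such a profile could look like the sketch below. The Profile class, the parse_profile helper, and the fallback to the ENS name when the JSON omits a name are all assumptions made for illustration; the proposal itself only specifies the JSON shape.

```python
import json
from dataclasses import dataclass


@dataclass
class Profile:
    name: str
    bio: str = ""
    website: str = ""
    location: str = ""


def parse_profile(raw: str, ens_name: str) -> Profile:
    """Build a Profile from profile JSON, falling back to the ENS name."""
    data = json.loads(raw)
    return Profile(
        name=data.get("name", ens_name),
        bio=data.get("bio", ""),
        website=data.get("website", ""),
        location=data.get("location", ""),
    )


# A partial profile document, as a platform might receive it.
raw = '{"bio": "Writer. Coder.", "location": "Austin, TX"}'
profile = parse_profile(raw, ens_name="jonstokes.eth")
print(profile.name, "|", profile.location)
```

Since the document lives off-chain, editing any of these fields is an ordinary file update and costs no gas.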

Proposal 1: On-Chain Graph

The On-Chain Graph idea uses NTFTs to represent the three relationships above. The same wallet that holds the ENS NFT should also own the following three social contracts, and the three corresponding ENS address records should point to them:

  • OCG Follower: When you mint an NTFT from my OCG Follower contract into your wallet, you are following me. Either of us can burn this NTFT to make you unfollow me.
  • OCG Blocked: When I airdrop an NTFT from my OCG Blocked contract into your wallet, I have blocked you. Only I can burn this NTFT to unblock you.
  • OCG Muted: When I airdrop an NTFT from my OCG Muted contract into your wallet, I have muted you. Only I can burn this NTFT to unmute you.

The semantics of these three cases are basically: "relative to me, the contract owner, you are X", where "X" is one of follower, blocked, or muted.
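From a client's point of view, those semantics reduce to asking which of an owner's three contracts have issued a token to a given address. The sketch below assumes the holder sets have already been read from the chain into a plain dictionary; the snapshot data and the relation_to helper are hypothetical.

```python
# Hypothetical snapshot: contract owner -> relationship -> token holders.
holdings = {
    "jon.eth": {
        "follower": {"alice.eth", "bob.eth"},
        "muted": {"bob.eth"},
        "blocked": {"mallory.eth"},
    },
}


def relation_to(owner: str, other: str) -> list[str]:
    """List every X such that, relative to `owner`, `other` is X."""
    contracts = holdings.get(owner, {})
    return sorted(kind for kind, holders in contracts.items()
                  if other in holders)


print(relation_to("jon.eth", "bob.eth"))  # bob follows jon, and jon muted bob
```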

Here is a sample follower contract:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Import paths assume OpenZeppelin Contracts 4.x (4.7 or earlier).
import "@openzeppelin/contracts/token/ERC721/ERC721.sol";
import "@openzeppelin/contracts/token/ERC721/extensions/ERC721Enumerable.sol";
import "@openzeppelin/contracts/security/Pausable.sol";
import "@openzeppelin/contracts/access/Ownable.sol";
import "@openzeppelin/contracts/token/ERC721/extensions/ERC721Burnable.sol";
import "@openzeppelin/contracts/utils/Counters.sol";

contract OCGFollower is ERC721, ERC721Enumerable, Pausable, Ownable, ERC721Burnable {
    using Counters for Counters.Counter;

    Counters.Counter private _tokenIdCounter;

    constructor() ERC721("OCGFollower", "OCGF") {}

    function _baseURI() internal pure override returns (string memory) {
        return "https://jonstokes.com/ocg/follower";
    }

    // Self-reports which relationship this contract's tokens represent.
    function relationship() public pure returns (string memory) {
        return "ocg follower";
    }

    function pause() public onlyOwner {
        _pause();
    }

    function unpause() public onlyOwner {
        _unpause();
    }

    // whenNotPaused lets the owner stop accepting new followers.
    function safeMint(address to) public whenNotPaused {
        // Prevent anyone but the owner from minting
        // a token to an address they don't own.
        require(owner() == _msgSender() || _msgSender() == to, "Unable to mint to this address");
        uint256 tokenId = _tokenIdCounter.current();
        _tokenIdCounter.increment();
        _safeMint(to, tokenId);
    }

    function _beforeTokenTransfer(address from, address to, uint256 tokenId)
        internal
        override(ERC721, ERC721Enumerable)
    {
        // Disable token transfers: only mints (from == 0) and burns (to == 0) pass.
        require(from == address(0) || to == address(0), "Cannot be transferred.");
        super._beforeTokenTransfer(from, to, tokenId);
    }

    // The following functions are overrides required by Solidity.
    function supportsInterface(bytes4 interfaceId)
        public
        view
        override(ERC721, ERC721Enumerable)
        returns (bool)
    {
        return super.supportsInterface(interfaceId);
    }
}

If you're familiar with Solidity, you can see what this very simple (and untested!) contract is trying to do.

First, the extensions:

  • ERC721Enumerable is included so that social network clients can list token holders without having to scan the entire chain.
  • I use Pausable because you should be able to pause minting to essentially lock your account for a period of time, i.e., stop accepting new followers.
  • Ownable is essential because there are some things that only the contract owner should do. I don't think there is a need for the more powerful roles functionality.
  • ERC721Burnable is here because you need to be able to burn tokens in order to delete a follow relationship. The standard burn() function lets the token holder (or an address the holder has approved) burn a token; if the contract owner should also be able to remove a follower, the contract would need an additional owner-only burn function.
  • I included Counters so that token IDs are auto-incremented, which is handy.

Now, my modifications to the output of the OpenZeppelin wizard:

  • safeMint() is modified so that only the owner of the contract can mint tokens to someone else's address. Everyone else can only mint to the address they are calling the contract from.
  • _beforeTokenTransfer() is overridden so that it essentially disables the ability to transfer tokens, creating a simple soulbound token.
  • relationship() function is a convenience method that ensures there is an easy way to query the contract and confirm what relationship the NFT represents. I'm not a fan of including this, but it seems useful.

It's all really simple. For the OCG Blocked and OCG Muted variants, you make the following small changes:

  1. Change contract name and symbol
  2. Change the return values of relationship() and possibly _baseURI() to reflect the relationship you represent (i.e., "muted" or "blocked").
  3. Turn both safeMint() and burn() into onlyOwner functions, so that only the contract owner can call these two functions.
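The permission rules for the three variants can be summarized in a small off-chain model. This is a toy sketch for reasoning about who may do what, not a substitute for the contracts: the SoulboundModel class and its owner_only flag are invented here, and the follower burn rule follows the stated intent that either party can end a follow.

```python
class SoulboundModel:
    """Toy model of the OCG token rules."""

    def __init__(self, owner: str, owner_only: bool = False):
        self.owner = owner
        self.owner_only = owner_only  # True for the muted/blocked variants
        self.holder_of: dict[int, str] = {}
        self._next_id = 0

    def safe_mint(self, caller: str, to: str) -> int:
        if self.owner_only and caller != self.owner:
            raise PermissionError("only the owner can mint")
        if caller != self.owner and caller != to:
            raise PermissionError("unable to mint to this address")
        token_id = self._next_id
        self._next_id += 1
        self.holder_of[token_id] = to
        return token_id

    def burn(self, caller: str, token_id: int) -> None:
        holder = self.holder_of[token_id]
        allowed = {self.owner} if self.owner_only else {self.owner, holder}
        if caller not in allowed:
            raise PermissionError("not allowed to burn")
        del self.holder_of[token_id]

    def transfer(self, caller: str, token_id: int, to: str) -> None:
        # Soulbound: tokens can be minted and burned, never moved.
        raise PermissionError("cannot be transferred")


followers = SoulboundModel("jon.eth")
token = followers.safe_mint("alice.eth", "alice.eth")  # alice follows jon
followers.burn("alice.eth", token)                     # alice unfollows

muted = SoulboundModel("jon.eth", owner_only=True)
muted.safe_mint("jon.eth", "bob.eth")                  # jon mutes bob
```

Writing the rules out this way makes the asymmetry visible: a follow is created by the follower, while a mute or block is created and removed only by the contract owner.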

Obviously, this depends on platforms honoring these contracts (i.e., follow, block, mute) correctly. That is less threatening and destabilizing than it sounds, though, because if a particular social platform doesn't honor the contracts you care about, you simply don't use it.

Adding paid follows

You can make safeMint payable, then use setMintRate to set the price people have to pay to follow you. Something like this:

uint256 public mintRate = 0.01 ether;

function setMintRate(uint256 mintRate_) public onlyOwner {
    mintRate = mintRate_;
}

function safeMint(address to) public payable whenNotPaused {
    // Require pay-to-follow.
    require(msg.value >= mintRate, "Not enough ether to mint");
    // Prevent anyone but the owner from minting
    // a token to an address they don't own.
    require(owner() == _msgSender() || _msgSender() == to, "Unable to mint to this address");
    uint256 tokenId = _tokenIdCounter.current();
    _tokenIdCounter.increment();
    _safeMint(to, tokenId);
}

I'm sure I can think of many other tweaks and features to add to this suggestion, but it's best to start with something simple and easy to understand.

Proposal 2: Chain-Linked Graph

The OCG contract described above is simple enough, but the scheme has some properties that many people may find divisive:

  • Everything is public and on-chain, including blocks and mutes. You can't have a locked (private) account under this scheme, though a possible workaround is to use an alt account.
  • Every action costs gas, which means you have to make real choices about who you follow, block, and mute. But if the gas costs are high enough, then this could render the network unusable.
  • Paid Follow may or may not be a desirable feature for a network or a particular account, but you will have the option.

Given that not everyone will like these qualities of this proposal, I would like to propose an alternative set of social contracts that give users and platforms more fine-grained control over who sees what information, and are less expensive to use.

The basic idea of Chain-Linked Graph (CLG) is this: we don't express social relationships (follow, block, mute) directly on-chain through NFTs. Instead, we store these relationships off-chain and use on-chain tokens to discover and access them.

  • Discovery: The contract provides a listURI() function that returns a link to a JSON list of the ENS names with which you declare a social relationship (i.e., I follow them, I mute them, or I block them).
  • Access: If the link returned by listURI() is token-gated, then the contract's token grants its holder read access to that link.

The social relationships are then not directly on-chain, but linked to the chain through a set of contracts and URLs.

Like OCG, each of the three social relationships is governed by smart contracts, but CLG has different semantics:

  • Follows: Links to a JSON list of the ENS names you are following; the tokens it issues grant read access to that follow list.
  • Mute: Links to a JSON list of the ENS names you are muting; the tokens it issues grant read access to that mute list.
  • Block: Links to a JSON list of the ENS names you are blocking; the tokens it issues grant read access to that block list.

Therefore, the semantics of the CLG token are: "This is read access to my list of X accounts", where "X" is "follow", "mute", or "block".
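The discovery and access steps can be sketched together. The example below is a toy model under stated assumptions: CLGContract stands in for the on-chain contract, HOSTED_LISTS for whatever off-chain host serves the JSON, and the example.com URL and ENS names are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class CLGContract:
    """Toy stand-in for one CLG contract (follows, mute, or block)."""
    relationship: str
    list_uri: str
    token_holders: set[str] = field(default_factory=set)


# Hypothetical off-chain host: URI -> list of ENS names.
HOSTED_LISTS = {
    "https://example.com/clgfollows/list": ["alice.eth", "bob.eth"],
}


def read_list(contract: CLGContract, reader: str) -> list[str]:
    """Discover the list via list_uri; serve it only to token holders."""
    if reader not in contract.token_holders:
        raise PermissionError(f"{reader} holds no {contract.relationship} token")
    return HOSTED_LISTS[contract.list_uri]


follows = CLGContract("clg follows", "https://example.com/clgfollows/list",
                      token_holders={"platform.eth"})
print(read_list(follows, "platform.eth"))
```

Revoking a platform's access then amounts to removing it from the holder set, which on-chain would mean burning its token.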

You can think of my proposal in this section as an approximation of the phone number + address book combination I describe for messaging applications. Your phone number is (quasi-)public, and when you connect a new messaging app to it, you can grant or deny that app read access to your contacts.

In my CLG social token scheme, your ENS name is public like your phone number, and you issue and revoke tokens to grant and deny access to lists of people you are related to in some way. You can grant these tokens to random users if you want, but mostly you're granting them to social platforms so that those platforms know whose posts to show you and whose posts to hide (or who shouldn't see your posts).

(Write access to the lists that make up your social graph could be controlled by your normal ENS NFT: if your ENS name is in your wallet, you can write, update, and delete the lists. One possible alternative is a fourth social contract whose NTFT grants its holder write access to the lists, so you can outsource list management to a third party.)

Hosting these lists off-chain, while pointing to them from on-chain, has several benefits:

  • You can lock down your relationships from public view by using authentication on the endpoint hosting the list. Or you can make the list public so anyone can read it.
  • Updating an off-chain list costs no gas.
  • This approach enables a marketplace of social graph hosting services that interoperate with social platforms.
  • Any person or service can easily discover your lists.

Token Access Control and Read Access

A key innovation in implementing CLG contracts is token access control, also known as token gating. The idea is that you cannot access specific data on a host unless you connect to that host with a wallet containing a specific access token.

For example, you could have token-gated access to content on IPFS, so that only readers who connect to the endpoint with a specific NFT in their wallet can view specific files.

CLG uses token gating to add a level of indirection to our social contracts: instead of representing a specific type of relationship (follow, mute, or block), a social NFT represents read access to a portion of your social graph.

Clearly, for token gating to work, the platform must respect it. Presumably, if a platform doesn't respect token access controls, you'll move your relationship lists elsewhere and change your contracts, reissuing NFTs if necessary.

Also, to be clear, some people's lists will leak at some point. We live in a world of personal data breaches, so if data is hosted somewhere, some of it will be compromised. I'll discuss some possible mitigations in a later section.

Contract template: CLG Follows

The contract below will be a standard ERC721 NTFT contract, very close to the OCG contract above:

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Import paths assume OpenZeppelin Contracts 4.x (4.7 or earlier).
import "@openzeppelin/contracts/token/ERC721/ERC721.sol";
import "@openzeppelin/contracts/security/Pausable.sol";
import "@openzeppelin/contracts/access/Ownable.sol";
import "@openzeppelin/contracts/token/ERC721/extensions/ERC721Burnable.sol";
import "@openzeppelin/contracts/utils/Counters.sol";

contract CLGFollows is ERC721, Pausable, Ownable, ERC721Burnable {
    using Counters for Counters.Counter;

    Counters.Counter private _tokenIdCounter;

    constructor() ERC721("CLGFollows", "CLGF") {}

    function _baseURI() internal pure override returns (string memory) {
        return "https://jonstokes.com/clgfollows/";
    }

    // Link to the off-chain JSON list of ENS names this account follows.
    function listURI() public pure returns (string memory) {
        return "https://jonstokes.com/clgfollows/list";
    }

    // Self-reports which relationship this contract's tokens represent.
    function relationship() public pure returns (string memory) {
        return "clg follows";
    }

    function pause() public onlyOwner {
        _pause();
    }

    function unpause() public onlyOwner {
        _unpause();
    }

    function safeMint(address to) public onlyOwner {
        uint256 tokenId = _tokenIdCounter.current();
        _tokenIdCounter.increment();
        _safeMint(to, tokenId);
    }

    function _beforeTokenTransfer(address from, address to, uint256) internal pure override {
        // Disable token transfers: only mints (from == 0) and burns (to == 0) pass.
        require(from == address(0) || to == address(0), "Cannot be transferred.");
    }
}

All the extensions are the same as in OCG, except that I didn't include ERC721Enumerable, because it's not clear that anyone wants the holders of their CLG Follows tokens to be enumerable (plus it raises the gas cost of minting).

As for the function, I made the following modifications to the output of the OpenZeppelin wizard:

  • relationship(): As in OCG, it returns the type of social contract. Again, this is probably not necessary for Solidity contracts, and I haven't seen it done elsewhere, but I still feel like I want the contract to self-report its type. So I don't know; please ignore it if this offends you.
  • listURI() returns a link to a JSON object that is a list of the ENS names you are following (or muting or blocking, depending on the contract type). Ideally this URI is token-gated, but it doesn't have to be.

Most of the time you will mint a CLG Follows NTFT and send it to an address owned by the social platform. That way, the platform can read your follow list and show you the right posts.

But you can also send these NTFTs to your followers so they can discover the other accounts you follow. You can do this by airdropping to followers, or by opening up minting so that anyone can mint.

All other contracts work exactly as above, but have different names and symbols, and return different values from relationship() and listURI() .

Possible variations

If you're worried about your lists leaking from different services, it's pretty straightforward to change listURI() into something more like tokenURI(uint256 tokenId), i.e., with the signature listURI(uint256 tokenId), which appends the token ID to a base URI so that each token holder gets its own list URL. This feature, combined with some logic on the list host, lets you segregate lists so that different token holders get different subgraphs of the main graph. That way, if a platform gets compromised, only the part of my graph shared with it is exposed.
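Here is a minimal sketch of that per-token variation from the list host's side. The base URL, the token-to-subgraph policy table, and the function names are assumptions made for illustration; a real host would also need to verify that the caller actually holds the token in question.

```python
BASE_URI = "https://example.com/clgfollows/list/"  # hypothetical host


def list_uri(token_id: int) -> str:
    """Per-token list URL: the base URI with the token ID appended."""
    return f"{BASE_URI}{token_id}"


FULL_GRAPH = ["alice.eth", "bob.eth", "carol.eth", "dave.eth"]

# Host-side policy (an assumption): which slice each token may read.
SUBGRAPH_FOR_TOKEN = {
    0: {"alice.eth", "bob.eth"},
    1: {"carol.eth"},
}


def serve_list(uri: str) -> list[str]:
    """Return only the part of the graph assigned to the token in `uri`."""
    token_id = int(uri.rsplit("/", 1)[1])
    allowed = SUBGRAPH_FOR_TOKEN.get(token_id, set())
    return [name for name in FULL_GRAPH if name in allowed]


print(serve_list(list_uri(0)))
print(serve_list(list_uri(1)))
```

If the holder of token 1 leaks its list, only that slice is exposed, and the distinct URL also identifies which holder the leak came from.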

As with OCG, you can turn safeMint into a payable function and charge for access to your lists. See the code in the OCG section for what this looks like.

You may want to be able to update the URLs returned by tokenURI() and/or listURI() , in which case you'll need to store these URLs in variables, initialize them in the constructor, and provide onlyOwner setter functions for updating them. This will increase your minting costs, but if you're only going to give them to services and not individuals, this probably doesn't matter.

Services

Both proposals outlined here leave some room for centralized hosting services, even if only as a stopgap until the ecosystem transitions to a distributed system like IPFS.

The most obvious type of service is hosting whatever is returned by one of the URI functions: profile data, NTFT metadata, and token-gated JSON lists (in the case of CLG).

Another useful service is a specialized version of Infura that exposes on-chain social data via an API. Alternatively, Infura can provide a dedicated API for social data.

Finally, there can be third-party services to verify accounts to suit the needs of users and organizations.

Summary

I don't know if I expect my on-chain social graph proposal to be adopted in the form I describe here. I bring up these ideas more to spark conversations about how we can effectively transition from our current state of fully locked-down platforms to a more portable state where you own your graph and can easily take it with you wherever you go.

Part of the above looks a bit like a web5 proposal, but the key difference is that both of my ideas are designed to be simpler and to leverage smart contracts and existing on-chain identity providers (ENS, but also other, similar on-chain providers).

If you get nothing else from this post, I hope I have at least made it clear that in a world of distributed ledger technology and smart contracts, there is no need for any of us to be locked into a social network in 2022. The tools to solve this lock-in problem are widely available; we just have to pick them up and use them.

Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.