Public Goods Research: How has Wikipedia continued to grow to this day?

Author: Bai Ding & Wuyue, Geek web3

When it comes to public goods in the Internet era, Wikipedia may be the most representative case. This globally renowned online encyclopedia was co-founded by Jimmy Wales and Larry Sanger in 2001, with the aim of providing a freely accessible knowledge platform over the Internet, "to allow everyone to freely access the knowledge of all humanity."

To this day, Wikipedia has undisputedly achieved this goal. Through the open editing model of "user-uploaded content", anyone can contribute content online, and Wikipedia has been able to gather the "think tanks" of the world.

As of now, the platform has content in over 300 languages, with over 62 million entries and over 14 million edits per month. The English Wikipedia alone holds more than 20TB of data, and monthly visits exceed 6 billion, placing the site among the 10 most popular websites globally. By these measures, Wikipedia is undoubtedly the benchmark knowledge base of the Web2 era.

And in the face of the rapid development of AI, the value of Wikipedia is even more incalculable. Computer scientist Jesse Dodge has stated that Wikipedia is the single largest information source for the underlying large language model of ChatGPT, accounting for 3% to 5% of the data it has captured. Nicholas Vincent, a faculty member at Simon Fraser University, even said, "Without Wikipedia, generative artificial intelligence would not exist."

What is most surprising is that the massive and highly successful Wikipedia is not a commercial private organization, but rather "the largest non-commercial website in the world." This sounds almost incredible, since most Internet platforms of a similar scale rely on advertising revenue or on burning through investor money to keep operating. Web2 public goods are generally short-lived and struggle for revenue, so running one on a non-commercial model while maintaining a massive scale is extraordinarily difficult.

Wikimedia Foundation CEO Katherine Maher even stated in 2021: "If Wikipedia had not been founded in the early 21st century, it would not have been able to exist in today's fragmented, commercialized Internet world."

How has the non-profit Wikipedia achieved such influence? The mystery behind it is worth exploring. Driven by our interest in public goods research, we conducted a simple investigation of Wikipedia. Because this case is highly instructive for public goods operators, especially content platforms, we recommend that everyone read this article. Below, we elaborate on Wikipedia's content production model, its sources of cash flow and expenditure allocation, and the controversies over power and finance.

UGC: A Groundbreaking Content Generation Model

The open editing model of Wikipedia can be traced back to its founding. Its predecessor was Nupedia, which aimed to build a high-quality online encyclopedia. However, Nupedia's editing process was very slow, because submitted content had to pass multiple levels of review and expert approval, which severely limited the speed of content generation. In the first year after its establishment in 2000, it had collected only a handful of articles.

To improve the efficiency of content production, Nupedia's editor-in-chief Larry Sanger proposed a new idea: adopting a "wiki", a knowledge network system first developed by Ward Cunningham in the mid-1990s, which allowed users to freely upload content and let anyone participate in editing entries. This became the later Wikipedia.

From a product perspective, a wiki is a knowledge network system in which the cost for users to create, modify, and publish wiki text on the web is much lower than for HTML text. A wiki system also supports community-oriented collaborative writing and provides simple tools for community communication, which helps to share knowledge in a given field.

In the book "The World is Flat", the author directly refers to the above model as "community-uploaded content", and in more of the literature, the content editing model that Larry Sanger introduced to the project is called UGC (User-Generated Content), which is often driven by interest rather than by significant material incentives.

UGC quickly broke the traditional form of encyclopedias being dominated by experts and publishers, allowing for the flexible inclusion of non-academic but high-profile events, and thus quickly captured the attention of a wide range of users. This bottom-up "crowdsourcing" model allowed Wikipedia's information to quickly extend to all aspects, and after its launch in January 2001, Wikipedia quickly surpassed Nupedia, which was shut down in 2003, while the Encyclopædia Britannica also announced the cessation of print publication in 2012 under the impact of Wikipedia.

Currently, there are still millions of volunteers around the world participating in editing and maintaining the content on the Wikipedia platform, with about 120,000 active editors (editing at least once a month), and about 300 editing events occurring on the website per minute.

While UGC created the conditions for Wikipedia's rise, its side effects are also obvious. Under an open and free editing model, ensuring the accuracy of content is an unavoidable pain point. Wikipedia has experienced countless incidents of falsified entries or destructive editing. The most common forms are the insertion of false information, advertising copy, or politically slanted content, and the most famous case is the "John Seigenthaler Sr. article falsification incident".

Wikipedia's current solution is a function that reverts an entry to a previous version. Each entry keeps a history of its revisions, so anyone who discovers that an entry has been maliciously changed can restore the earlier version.
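The revert mechanism described above can be illustrated with a toy model. This is only a minimal sketch under stated assumptions: the `Entry` class and its method names are hypothetical and are not MediaWiki's actual data model or API. It shows the one idea from the text, that every entry keeps its full revision history, so a bad edit can be undone by restoring an earlier version.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A toy encyclopedia entry that keeps every revision (hypothetical model)."""
    title: str
    revisions: list[str] = field(default_factory=list)

    def edit(self, new_text: str) -> None:
        # Every edit appends a new revision; nothing is overwritten.
        self.revisions.append(new_text)

    @property
    def current(self) -> str:
        # The live text is simply the most recent revision.
        return self.revisions[-1]

    def revert_to(self, index: int) -> None:
        # In this sketch a revert is itself a new revision that copies an
        # older one, so the bad edit remains visible in the history.
        self.revisions.append(self.revisions[index])


entry = Entry("Example topic")
entry.edit("Accurate, sourced text.")
entry.edit("Vandalized text.")
entry.revert_to(0)  # anyone who spots the bad edit can restore revision 0
print(entry.current)
print(len(entry.revisions))
```

Modeling the revert as an appended revision, rather than deleting the bad one, is a design choice of this sketch: it keeps the history complete, which is what lets later readers see what was changed and by which edit.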

Statistics show that obvious malicious editing is easily detected and removed, and according to experimental tests, such correction actions can be triggered in just a few minutes on average. Wikipedia now widely uses bots to correct simple writing errors or vulgar content, but it is still difficult to quickly detect and address less obvious destructive behavior, which requires human intervention.

For problems that require manual intervention, Wikipedia has developed a three-tier safeguard system that operates in as decentralized a manner as possible. The first tier, and the most common response to a disputed or malicious edit, is "edit, revert, discuss": if user A edits an entry and user B has doubts, B can revert the entry to the previous version, and the two then discuss their differences on the discussion page to seek consensus.

Sometimes the two sides become deadlocked in a repeated cycle of "edit, revert, edit, revert". This calls for the second tier, the intervention of roles with higher authority, commonly administrators and patrollers.

Administrators have higher powers such as deleting entries, protecting pages, preventing editing conflicts, and handling complaints, while patrollers' main task is to quickly review and mark the latest published content, and they can mark problematic content as "pending review" and report it to administrators or higher-level volunteers.

Administrators can also set partially or fully protected status for entries that are easily subject to malicious editing (such as public figure entries), limiting editing rights to maintain the stability of the entry. Administrators also have the power to block users who maliciously edit entries.

For more complex situations, Wikipedia also has an Arbitration Committee composed of senior volunteers, as a last resort. The committee members are all seasoned volunteers, and their decisions are based on Wikipedia's editing guidelines and community norms, ensuring that the content meets the standards of neutrality and verifiability.

In terms of open content licensing, Wikipedia uses several Creative Commons licenses, the most important of which is the CC BY-SA 4.0 license. It allows users to freely share or adapt the content, provided they meet two conditions:

1. The original author's name, source, and link must be indicated.

2. If the work is adapted, the adapted work must also be released under the CC BY-SA 4.0 license, to facilitate further user-generated content.

In addition to CC BY-SA 4.0, some earlier content and images are still subject to the GNU Free Documentation License (GFDL).

Cash Flow Analysis: Can Donations Alone Support the Tower of Babel?

Sources of Cash Flow

For large-scale internet platforms with a large user base, obtaining stable cash flow is the biggest headache. Wikipedia is committed to being non-commercial, free to read, and value-neutral, so it can hardly monetize through advertisements or membership systems the way commercial platforms such as Twitter and YouTube do. In addition, Wikipedia has no strong private institution behind it to provide substantial subsidies, so how it obtains the cash flow to keep operating is a question that many people are curious about.

As a comparison, we can first look at Baidu Baike. Taking the search term "medical insurance" as an example, it is easy to see that Baidu Baike depends heavily on advertising revenue. This commercial monetization model often leads to bias or misinformation. In the Wei Zexi incident of 2016, Wei Zexi fell victim to this model, and the relevant internet platforms were eventually forced to reduce the proportion of commercial promotion under an order from the Cyberspace Administration of China.

According to Vitalik Buterin's "revenue - evil curve" metric, the Wei Zexi incident can be considered a typical case of the negative externalities caused by the over-monetization of public goods. In contrast, Wikipedia's non-commercial policy makes it more neutral and lets it retain more positive externalities, but can this model really be sustained?

Comparison Table of Wikipedia and Other "Encyclopedia-like" Products

To assess the sustainability of Wikipedia, we have to look at the organization behind it, the Wikimedia Foundation. The foundation was established in 2003, is headquartered in San Francisco, and currently has a staff of over 500. Its main sources of funding are donations and grants, and according to its public disclosures, its revenue sources include the following:

First, user donations. Each year, the Wikimedia Foundation launches fundraising campaigns, appealing to global users to donate to support platform operations. These donations are mostly small in amount, but the number of donors is large, accounting for a large proportion of the foundation's revenue. Most users see a banner on the screen asking for donations to maintain the platform when browsing Wikipedia.

According to the Wikimedia Foundation's data for the 2022-2023 fiscal year, the foundation's total revenue reached $180 million, of which small user donations account for over 90% of the funding sources. On average, each donor contributes about $11, with about 7.5 million people supporting Wikipedia in this way globally.
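As a quick cross-check of these figures, the sketch below does the arithmetic using only the numbers quoted above (the variable names are ours, and "over 90%" is treated as exactly 90% for simplicity). It suggests that the average gift multiplied by the number of donors does not by itself add up to 90% of total revenue. The article does not explain the difference; repeat donations, rounding, or differing definitions of "small donation" are possible explanations, but that is an assumption.

```python
# All inputs are the figures quoted in the text; names are illustrative only.
total_revenue = 180_000_000   # USD, fiscal year 2022-2023
small_donation_pct = 90       # "over 90%", taken as exactly 90 here
donors = 7_500_000            # "about 7.5 million people"
average_gift = 11             # USD, "about $11" per donor

# What 90% of total revenue amounts to.
implied_small_donations = total_revenue * small_donation_pct // 100

# What the donor count and average gift imply if each donor gave once.
donors_times_average = donors * average_gift

gap = implied_small_donations - donors_times_average

print(f"90% of revenue:          ${implied_small_donations:,}")
print(f"donors x average gift:   ${donors_times_average:,}")
print(f"difference:              ${gap:,}")
```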

In addition to individual donations, the Wikimedia Foundation also receives funding from some large companies and foundations, such as Google, Microsoft, and the Bill & Melinda Gates Foundation. Google and the Alfred P. Sloan Foundation alone have each donated over $3 million to Wikipedia.

Furthermore, the Wikimedia Foundation also actively applies for grants from charitable projects, a typical example being the "Reading Wikipedia in the Classroom" project, which aims to help teachers and students around the world better utilize Wikipedia for teaching. Initially piloted in Nigeria, Bolivia, and the Philippines, the project has now expanded to over 40 countries, helping people in those regions effectively use Wikipedia in the classroom. Through this project, the Wikimedia Foundation has successfully received sponsorship from multiple parties.

To achieve sustainable development, the Wikimedia Foundation is also actively exploring self-sustaining economic sources beyond donations. The foundation launched the "Wikimedia Enterprise" service in October 2021, primarily targeting large tech companies like Google and Amazon with specialized paid APIs, which has brought additional revenue to the foundation. In the 2022-2023 fiscal year, Wikimedia Enterprise generated revenue of several million dollars, with Google alone paying over $2 million to Wikipedia, and the paid API business is expected to become an important driver of future revenue growth for Wikipedia.

The foundation also operates a Wikimedia online store (store.wikimedia.org), selling merchandise with the Wikipedia logo, such as T-shirts, mugs, and stickers. Although this part of the revenue is relatively small, it is also a supplementary source of income for the foundation, generating around several hundred thousand dollars in additional revenue each year.

In addition to the officially mentioned stable sources of funding, by checking the balance sheet, we can also see that the Wikimedia Foundation participates in some investment activities. In 2023, the Wikimedia Foundation's investment gains were about $6.5 million, but in 2022, its investment activities resulted in a loss of more than $11 million.

Expenditure Allocation

The Wikimedia Foundation has detailed budget planning and financial auditing for all fund usage, and each major expenditure is subject to multiple approvals to ensure reasonableness and transparency. The foundation's financial reports are also regularly made public, allowing donors and the public to understand the specific use of funds.

The Wikimedia Foundation's statements show its specific expenditures, which reached $169 million in the 2022 fiscal year alone. Employee salaries and benefits account for 60% of that total. This funding mainly pays the salaries and related benefits of the technical team and community staff who handle server maintenance, software updates, data security, and other work.

As the world's largest online encyclopedia, Wikipedia needs to handle massive data and traffic, and the maintenance and upgrade of servers, data centers, and other technical resources alone is a huge expense. As of 2024, Wikipedia has 6 data centers globally, distributed in the United States, the Netherlands, France, and Singapore, to ensure the stable operation of Wikipedia and other Wikimedia projects.

At the same time, Wikipedia relies on the support of the global volunteer community, and the Wikimedia Foundation provides various awards and funding activities around the world to promote community building, accounting for about 14% of the expenditures. For example, the Wikimedia Foundation has organized "edit-a-thons" in some regions, encouraging volunteers to focus on editing entries on specific topics to expand the breadth and depth of content. Typical cases include the "Fashion Edit-a-thons" held mainly in France and other countries, as well as the "Wiki4Climate" event focused on climate subjects in 2020.

In addition, the Wikimedia Foundation has also invested a large amount of resources in professional services, including legal consulting, external technical support, and accounting audits, to ensure the compliance and operational security of Wikipedia globally.

Meanwhile, the foundation's administrative expenses also include the rental of office facilities and daily management expenses to maintain internal operations, as well as the regular hosting of technical seminars and international editing conferences to promote collaboration and exchange within the global volunteer community, all of which require financial support.

Together, professional services and administrative expenses account for 15% of the total expenditures. In addition, the Wikimedia Foundation's spending on advertising and payment channels for its public fundraising campaigns accounts for 4% of the total expenditures.
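To see how the percentages quoted in this section translate into dollars, the following sketch applies them to the $169 million total. The category labels are shorthand chosen here, not the foundation's own line items, and the "unattributed" remainder is simply whatever the quoted shares leave over; the article does not say where it goes.

```python
total_expenditure = 169_000_000  # USD, as quoted in the text

# Percentage shares quoted in this section (labels are our shorthand).
shares_pct = {
    "salaries and benefits": 60,
    "community awards and grants": 14,
    "professional services and administration": 15,
    "fundraising costs": 4,
}

# Convert each share into a dollar amount using integer arithmetic.
amounts = {name: total_expenditure * pct // 100 for name, pct in shares_pct.items()}

# Whatever the quoted shares do not cover.
remainder_pct = 100 - sum(shares_pct.values())
remainder_amount = total_expenditure * remainder_pct // 100

for name, amount in amounts.items():
    print(f"{name:<42} {shares_pct[name]:>3}%  ${amount:,}")
print(f"{'unattributed in the text':<42} {remainder_pct:>3}%  ${remainder_amount:,}")
```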

Challenges for Wikipedia: Donation Fraud, Corruption, and Political Correctness

No public good can ignore the question of its own sustainable development. Wikipedia has undeniably done very well in this regard so far, but it still faces hidden dangers and challenges. First, Wikipedia's operating funds depend mainly on user donations. Although this model has sustained the platform's development, income that the platform does not generate itself is inherently unstable, and under the impact of large language models, users' willingness to donate to Wikipedia is even more easily affected.

Secondly, if the foundation, as a non-profit organization, tries to increase revenue through typical commercial means such as paid APIs, it may raise external controversy over the nature and neutrality of the platform. The tension between unstable income and neutrality has thus become an intractable problem, and there is a further issue that must be mentioned.

As the saying goes, "The bigger the tree, the more it attracts the wind." Wikipedia has obtained a huge income while relying mainly on donations, which has drawn widespread dissatisfaction from outside. The use of its funds has been highly controversial, and rumors of "over-fundraising" and "fraudulent donations" have never ceased. On the one hand, Wikipedia's fundraising copy sometimes exaggerates the urgency of its financial needs, even giving the impression that Wikipedia is "on the verge of collapse," which leads users to misunderstand the platform's financial situation.

On the other hand, some insiders have provided specific data which, they argue, show that Wikipedia's operations do not require so much funding, raising strong suspicions of "embezzlement."

Kolbe, a former co-editor-in-chief of the Wikipedia community newspaper, says he is very familiar with Wikipedia's internal operations. According to him, the donation fund plan that the Wikimedia Foundation launched in 2016 was originally meant to raise 100 million dollars over 10 years, yet fundraising campaigns and fundraising advertisements have clearly become more frequent recently, so the target can be met at least 5 years early and the total raised would be several times that figure. By contrast, he says, the normal operation of Wikipedia requires only 10 million dollars per year.

Previously, a Brazilian editor, Felipe da Fonseca, also said: "To ask for money using the achievements of others, this beggar-like posture, is really too ugly and unethical."

Jimmy Wales, the co-founder of Wikipedia, has also frequently faced accusations from the community. Many believe that the Wikimedia Foundation's cost-benefit ratio is dismal, since the foundation has spent millions of dollars on software development over the years without producing anything effective. In 2014, Wales acknowledged that he was frustrated by the endless controversies, in which critics accused the foundation of wasting funds on software of no practical value, without sufficient community consultation and without a proper incremental rollout that would have allowed mistakes to be corrected.

In February 2017, The Signpost published an opinion column titled "Wikipedia has cancer," in which the author criticized the Wikimedia Foundation's annual spending for rising year after year without corresponding output.

Musk is also a staunch critic of Wikipedia. In 2023, when Musk's renaming of Twitter to "X" sparked much discussion, he joked in a post that he would immediately donate 1 billion dollars to the Wikimedia Foundation if Wikipedia changed its name to "Dickipedia" for a year. The jab expressed his dissatisfaction with Wikipedia's fundraising appeals and the rumors of over-fundraising. Musk has also posted statements such as "Wikipedia is broken" and "Wikipedia is losing its objectivity," which we will not list one by one in this article.

Musk's remarks may contain some political factors (many Wikipedia entries have a clear anti-Trump bias), and we will not discuss this, but this does represent the negative attitude of many well-known figures towards Wikipedia.

In response to such rumors, the Wikimedia Foundation explained that fundraising proceeds are used not only for daily operations but also to ensure that Wikipedia has sufficient reserves to cope with potential crises. On the premise of remaining ad-free, free to read, and uninfluenced by commercial interests, it argues, this financial strategy improves its resilience and helps Wikipedia maintain the independence and stability of a non-profit public good.

In addition to the above issues, Wikipedia's development also faces multiple problems.

First, as an open-editing platform, Wikipedia's content relies on global volunteers to create and maintain, and while this model encourages widespread participation, it also leads to misleading, inaccurate, and even malicious edits. Although the platform has strict editing rules and review mechanisms, in the era of AI, how to ensure the reliability and neutrality of content and correct errors in a timely manner will be an unavoidable challenge in its development.

At the same time, through some third-party data, we can find that although the number of Wikipedia users is increasing year by year, the number of active editors on the platform has decreased significantly in recent years. The main reasons for this phenomenon are:

  1. The review mechanism of Wikipedia has become increasingly strict, dampening the enthusiasm of new editors.

  2. Administrator privileges have become increasingly broad. Administrators can block the accounts and IP addresses of editors, which has led to abuse of power.

In addition, the management team is not a monolith, and there are many differences between the Wikipedia community and the Wikimedia Foundation, which have even been brought to the surface, involving issues such as management corruption and abuse of power.

In 2014, the Wikimedia Foundation tried to install new software for viewing multimedia content on the German Wikipedia, but the German Wikipedia editors refused the user interface update, and the two sides reached an impasse. Eventually, the Wikimedia Foundation forcibly installed the new software and set high-level permissions to prevent editors from rolling back to the old version.

On September 13, 2021, the Wikimedia Foundation also took action against the Chinese Wikipedia, banning 7 users and removing the privileges of 12 administrators. Three of the users were among the ten most active on the Chinese Wikipedia. Because the Wikimedia Foundation did not provide systematic, detailed evidence or an explanation afterwards, the Chinese mainland Wikipedia community and Chinese media saw this incident as suspected excessive interference with the community's autonomy, as suppression of those whose views oppose Western ideology, and as a failure of procedural justice.

In addition, in terms of resource allocation, such as the distribution of funds between different language versions, the cost of software development and infrastructure maintenance, and investment in different regions, the Wikipedia community and the Foundation have actually been constantly competing for dominance.

As a public good, Wikipedia relies on its credibility to obtain the donations that sustain its operations, and this credibility rests on the authority and comprehensiveness of its content, as well as on the decentralized distribution of power between the community and the Foundation. The public infighting described above damages that credibility. Combined with the impact of AI such as large language models, it means the quality of Wikipedia's entries and the scale of its user base may suffer an irreversible decline, further undermining its credibility.

At the same time, Wikipedia also faces the problem of insufficient diversity of volunteers. For example, content about women, minorities, and non-English cultures is often overlooked. How to attract more volunteers and encourage people from different backgrounds and regions to participate is another key to the platform's future development.

Summary

The success of Wikipedia lies not only in its outstanding achievements as a knowledge-sharing platform, but also in its provision of valuable insights for the sustainable development of public goods. As the world's largest open-source encyclopedia, Wikipedia has maintained content neutrality without commercial means, successfully addressing the challenges of the Internet era, which has profound implications for the management of other public goods.

The history of Wikipedia shows that only through a stable source of income, efficient use of funds, transparent financial management, and deep community participation can public goods move steadily forward in the long-term development. At the same time, we must also see that Wikipedia's operations, whether financially, organizationally, or in terms of public opinion, are not perfect and have generated undeniable controversies. The lessons learned from this serve as a warning for the builders of other public goods.

In the future, the sustainable development of public goods will face more complex environmental changes, including the fragmentation of self-media leading to user attention dispersion and soaring operating costs, as well as adjustments to laws and regulations globally, and even the constant evolution of user needs. This means that public goods not only need to continue to attract user participation, but also need to actively explore more sources of income to open up a stable and sustainable development path.
